
SoftWare Hash IDentifier

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by Boud (talk | contribs) at 16:26, 26 May 2025 (Creation and history: more accurate dates per non-SH source). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.
  • Comment: In accordance with Wikipedia's Conflict of interest policy, I disclose that I have a conflict of interest regarding the subject of this article. AbcSxyZ (talk) 16:33, 25 May 2025 (UTC)



Full name: SoftWare Hash IDentifier
Acronym: SWHID
Example: swh:1:dir:df32c75242bf8d797ccd43af8ce8e294f35cd8fd
Website: https://www.swhid.org/

The SoftWare Hash IDentifier (SWHID) is a persistent identifier used to uniquely identify a particular piece of software source code and its version. SWHID is a standard similar to the DOI, but is tailored specifically for software source code[1] and is compatible with version control software such as Git.

An SWHID can be used to point to different components or versions of the source code of a software package.[1]

Creation and history

The SoftWare Hash IDentifier was developed by Software Heritage. Software Heritage's archives, identified by their SWHIDs, were publicly released starting in 2018.[2]

As of 2020, SWHIDs were in use for about nine billion versions of pieces of software,[2] termed "artefacts".[3] SWHIDs are integrated with research repositories including HAL, Zenodo and the French catalog of Academic Research Free Software.[citation needed]

The acronym SWHID originally stood for "Software Heritage Identifiers", which were used to catalog software artifacts in the early days of the Software Heritage archive.[4] It later evolved into an open standard through a dedicated working group[5] and was standardized by ISO in April 2025 under the more general name "Software Hash Identifier".[6]

Télécom Paris welcomed the ISO standardization, arguing that it is a significant step for global digital infrastructure because it provides traceability of software affected by vulnerabilities.[7] UNESCO stated that SWHID is useful for the reproducibility and long-term accessibility of software.[8]

Standards

SWHID is an open standard licensed under the Community Specification License.[9]

SWHID was formalized as the ISO/IEC 18670 standard in April 2025.[10]

Format

The SWHID allows identifying different components of software source code. Object types relating to the software version are labelled as "snapshot", "release" or "revision"; a "directory" of files and possibly subdirectories can be identified; and a specific piece of a specific version of source code can be labelled as "content".[1] The identifier has the following syntax:[3]

swh:<scheme_version>:<object_type>:<object_id>[;qualifiers]
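
The syntax above can be illustrated with a minimal Python sketch that splits an identifier into its parts. The names parse_swhid and Swhid are hypothetical and not part of any official library. The short type tags other than "dir" (snp, rel, rev, cnt) and the 40-character hexadecimal object identifier are assumptions taken from the SWHID specification rather than from this article, and qualifiers are treated generically as key=value pairs.

```python
import re
from typing import NamedTuple

# Short tags for the object types named in the text. Only "dir" appears in
# this article; the other tags are assumed from the SWHID specification.
OBJECT_TYPES = {
    "snp": "snapshot",
    "rel": "release",
    "rev": "revision",
    "dir": "directory",
    "cnt": "content",
}

# swh:<scheme_version>:<object_type>:<object_id>[;qualifiers]
SWHID_RE = re.compile(
    r"swh:(?P<version>\d+):(?P<type>[a-z]+):(?P<id>[0-9a-f]{40})"
    r"(?P<qualifiers>(?:;[^;]+)*)"
)


class Swhid(NamedTuple):
    scheme_version: int
    object_type: str
    object_id: str
    qualifiers: dict


def parse_swhid(text: str) -> Swhid:
    """Split a SWHID string into its components (illustrative only)."""
    match = SWHID_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a well-formed SWHID: {text!r}")
    if match["type"] not in OBJECT_TYPES:
        raise ValueError(f"unknown object type: {match['type']!r}")
    qualifiers = {}
    # The qualifier string starts with ';', so the first split item is empty.
    for item in match["qualifiers"].split(";")[1:]:
        key, sep, value = item.partition("=")
        if not sep:
            raise ValueError(f"malformed qualifier: {item!r}")
        qualifiers[key] = value
    return Swhid(int(match["version"]), OBJECT_TYPES[match["type"]],
                 match["id"], qualifiers)


parsed = parse_swhid("swh:1:dir:df32c75242bf8d797ccd43af8ce8e294f35cd8fd")
print(parsed.object_type, parsed.object_id)
```

The sketch only checks the shape of the string; it does not verify that the identifier refers to an archived object.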

Example

The source code directory of version 3.0 of the Linux kernel, released in July 2011, has the following SWHID:[11]

swh:1:dir:df32c75242bf8d797ccd43af8ce8e294f35cd8fd
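
Because the identifier is derived from the object itself, it can be recomputed from the data. The following Python sketch does this for the simplest object type, "content", on the assumption that version 1 content identifiers use the same SHA-1 computation as Git blob objects (a "blob <length>" header, a NUL byte, then the file bytes). This detail comes from the SWHID specification and Git's object format rather than from this article, and the function name content_swhid is hypothetical. Directory identifiers such as the one above are computed from a serialized listing of directory entries and are not covered here.

```python
import hashlib


def content_swhid(data: bytes) -> str:
    """Return a version 1 'cnt' SWHID for raw file bytes (illustrative only).

    Assumes the Git blob convention: SHA-1 over b"blob <len>\\0" + data.
    """
    header = b"blob %d\x00" % len(data)
    digest = hashlib.sha1(header + data).hexdigest()
    return f"swh:1:cnt:{digest}"


# An empty file hashes to Git's well-known empty-blob identifier.
print(content_swhid(b""))
```

Under this assumption, the hexadecimal part of the result for a file matches what `git hash-object` reports for the same file.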

References

  1. ^ a b c Sabrina Granger; Baptiste Mélès; Frédéric Santos (15 November 2024), Préserver et rendre identifiables les logiciels de recherche avec Software Heritage [Preserving and identifying research software with Software Heritage] (in French), doi:10.46430/PHFR0034, Wikidata Q134581061, archived from the original on 26 May 2025
  2. ^ a b Le CNRS apporte son soutien à Software Heritage [The CNRS supports Software Heritage] (in French), French National Centre for Scientific Research, 25 November 2020, Wikidata Q134581205, archived from the original on 26 May 2025
  3. ^ a b Axel Thévenet (26 September 2023), SWHID: Tracking past software for future humans, Wikidata Q134580517, archived from the original on 26 May 2025
  4. ^ "SoftWare Hash IDentifier (SWHID)". Software Heritage. Retrieved 2025-05-24.
  5. ^ "SWHID working group". Retrieved 2025-05-24.
  6. ^ "ISO/IEC 18670:2025". ISO. Retrieved 2025-05-24.
  7. ^ Une avancée significative pour l'infrastructure numérique mondiale : La norme ISO/IEC 18670 est désormais officielle [A significant advance for global digital infrastructure: the ISO/IEC 18670 standard is now official] (in French), Télécom Paris, 20 May 2025, Wikidata Q134580605, archived from the original on 26 May 2025
  8. ^ "Archiving open software as human heritage". UNESCO. Retrieved 2025-05-24.
  9. ^ "Copyright Section of SWHID Specification v1.2". Retrieved 2025-05-24.
  10. ^ "ISO/IEC 18670:2025". ISO. Retrieved 2025-05-24.
  11. ^ "Release v3.0 of torvalds/linux repository". Software Heritage. Retrieved 2025-05-24.