
SoftWare Hash IDentifier

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by Boud at 16:02, 26 May 2025.



Full name: SoftWare Hash IDentifier
Acronym: SWHID
Introduced: April 2025
Example: swh:1:dir:df32c75242bf8d797ccd43af8ce8e294f35cd8fd
Website: https://www.swhid.org/

The SoftWare Hash IDentifier (SWHID) is a persistent identifier that uniquely identifies a particular piece of software and its version. It is a standard similar to the DOI but tailored specifically to software, and it is compatible with version control systems such as Git.

A SWHID can point to different components of a software project or to different stages of its history: a snapshot, release, revision, directory or content.[1]

Creation and history

The SoftWare Hash IDentifier was developed by Software Heritage. As of 2023, it had been officially released by Software Heritage and was in use for billions of versions of pieces of software, termed "artefacts".[2] In research, it is integrated with repositories such as HAL and Zenodo, and with the French catalog of Academic Research Free Software.

The acronym SWHID originally stood for "Software Heritage Identifiers", which were used to catalog software artifacts in the early days of the Software Heritage archive.[3] The identifier later evolved into an open standard through a dedicated working group[4] and was standardized as ISO/IEC 18670 in April 2025 under the more general name "Software Hash Identifier".[5]

Télécom Paris welcomed the ISO standardization, arguing that it is a significant step for global digital infrastructure because it provides traceability of software affected by vulnerabilities.[6] UNESCO stated that SWHID is useful for the reproducibility and long-term accessibility of software.[7]

Standards

SWHID is an open standard licensed under the Community Specification License[8].

SWHID was formalized as the ISO/IEC 18670 standard in April 2025.[9]

Format

The SWHID allows identifying different components of software source code. Object types relating to the software version are labelled as "snapshot", "release" or "revision"; a "directory" of files and possibly subdirectories can be identified; and a specific piece of a specific version of source code can be labelled as "content".[10] The identifier has the following syntax:[2]

swh:<scheme_version>:<object_type>:<object_id>[;qualifiers]
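
The syntax above can be illustrated with a small parser. This is a minimal sketch, not a reference implementation: the function name parse_swhid and the Swhid tuple are hypothetical, the three-letter type tags other than "dir" are taken from the SWHID specification rather than from this article, the 40-character hexadecimal identifier length is assumed from the example, and qualifiers are assumed to be semicolon-separated key=value pairs that are collected without interpreting their meaning.

```python
import re
from typing import Dict, NamedTuple

# Three-letter tags for the object types described in the text.
# Only "dir" appears in this article; the others are an assumption
# based on the SWHID specification.
OBJECT_TYPES = {
    "snp": "snapshot",
    "rel": "release",
    "rev": "revision",
    "dir": "directory",
    "cnt": "content",
}

# Core part: swh:<scheme_version>:<object_type>:<object_id>
# The object_id is assumed to be 40 lowercase hexadecimal digits.
_CORE = re.compile(r"swh:(?P<version>\d+):(?P<type>[a-z]+):(?P<id>[0-9a-f]{40})")


class Swhid(NamedTuple):
    scheme_version: int
    object_type: str
    object_id: str
    qualifiers: Dict[str, str]


def parse_swhid(text: str) -> Swhid:
    """Split a SWHID string into its fields, raising ValueError if malformed."""
    core, *rest = text.split(";")
    match = _CORE.fullmatch(core)
    if match is None:
        raise ValueError(f"not a well-formed SWHID core: {core!r}")
    if match["type"] not in OBJECT_TYPES:
        raise ValueError(f"unknown object type: {match['type']!r}")

    # Optional qualifiers are kept as raw key/value strings.
    qualifiers = {}
    for item in rest:
        key, sep, value = item.partition("=")
        if not key or not sep:
            raise ValueError(f"malformed qualifier: {item!r}")
        qualifiers[key] = value

    return Swhid(int(match["version"]), match["type"], match["id"], qualifiers)


if __name__ == "__main__":
    parsed = parse_swhid("swh:1:dir:df32c75242bf8d797ccd43af8ce8e294f35cd8fd")
    print(parsed.scheme_version, OBJECT_TYPES[parsed.object_type], parsed.object_id)
```

Keeping the qualifiers as an uninterpreted dictionary is a deliberate simplification; a stricter parser would check each qualifier name against the specification.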

Example

Version 3.0 of the Linux kernel, released in July 2011, has the following SWHID[11]:

swh:1:dir:df32c75242bf8d797ccd43af8ce8e294f35cd8fd
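
The identifier above names a directory, whose hash covers a whole tree of files. For the simplest object type, "content", the computation can be sketched in a few lines. The sketch assumes that, in version 1 of the scheme, a content identifier coincides with the hash Git assigns to a blob (the SHA-1 of a short header followed by the file's bytes); this detail comes from the SWHID specification rather than from this article, and the function name content_swhid is hypothetical.

```python
import hashlib


def content_swhid(data: bytes) -> str:
    """Return a version 1 content SWHID for the given bytes.

    Assumption: the object id equals the Git blob hash, i.e. the SHA-1
    of b"blob <length>\\0" followed by the raw content.
    """
    header = b"blob %d\x00" % len(data)
    digest = hashlib.sha1(header + data).hexdigest()
    return f"swh:1:cnt:{digest}"


if __name__ == "__main__":
    # The same bytes always yield the same identifier, wherever they are stored.
    print(content_swhid(b"hello\n"))
```

Because the identifier is derived only from the bytes themselves, anyone holding a copy of the file can recompute it and check it against a published SWHID without consulting a registry.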

References

  1. ^ "Préserver et rendre identifiables les logiciels de recherche avec Software Heritage". Programming Historian (in French). Retrieved 2025-05-24.
  2. ^ a b Axel Thévenet (26 September 2023), SWHID: Tracking past software for future humans, Wikidata Q134580517, archived from the original on 26 May 2025
  3. ^ "SoftWare Hash IDentifier (SWHID)". Software Heritage. Retrieved 2025-05-24.
  4. ^ "SWHID working group". Retrieved 2025-05-24.
  5. ^ "ISO/IEC 18670:2025". ISO. Retrieved 2025-05-24.
  6. ^ Une avancée significative pour l'infrastructure numérique mondiale : La norme ISO/IEC 18670 est désormais officielle [A significant advance for global digital infrastructure: the ISO/IEC 18670 standard is now official] (in French), Télécom Paris, 20 May 2025, Wikidata Q134580605, archived from the original on 26 May 2025
  7. ^ "Archiving open software as human heritage". UNESCO. Retrieved 2025-05-24.
  8. ^ "Copyright Section of SWHID Specification v1.2". Retrieved 2025-05-24.
  9. ^ "ISO/IEC 18670:2025". ISO. Retrieved 2025-05-24.
  10. ^ Sabrina Granger; Baptiste Mélès; Frédéric Santos (15 November 2024), Préserver et rendre identifiables les logiciels de recherche avec Software Heritage [Preserving and identifying research software with Software Heritage] (in French), doi:10.46430/PHFR0034, Wikidata Q134581061, archived from the original on 26 May 2025
  11. ^ "Release v3.0 of torvalds/linux repository". Software Heritage. Retrieved 2025-05-24.