wiki/notebook/data.archives.software-heritage.org

28 lines
2.2 KiB
Org Mode
Raw Normal View History

2021-07-27 15:13:13 +00:00
:PROPERTIES:
:ID: 9c85ffb2-fc90-4b38-abce-f0425a2b79de
:END:
#+title: Software Heritage
#+date: 2021-07-25 21:01:45 +08:00
#+date_modified: 2021-07-28 00:02:38 +08:00
2021-07-27 15:13:13 +00:00
#+language: en
- project link is at https://www.softwareheritage.org/
- the infrastructure and tools they used is also open source;
primarily happening at [[https://forge.softwareheritage.org/][their software forge]]
- an ambitious project archiving all of humanity's publicly available source code
- primarily made for researchers to easily refer to software;
a centralized database for referring software similar to digital object identifiers (DOI) in research materials and ISBN for books
- the archive itself is more of a gigantic merkle tree with the ability to interact with the [[https://docs.softwareheritage.org/devel/swh-model/data-model.html][individual objects]] such as commits, revisions, snapshots, and even the very source code files of an archived repo
- it is version control software-agnostic;
archived software from several sources (e.g., Git, Mercurial)
- each object is given an identifer referred to as [[https://docs.softwareheritage.org/devel/swh-model/persistent-identifiers.html][Software Heritage persistent identifiers]] (SWHIDs)
2021-07-27 15:13:13 +00:00
- funded from donations including big companies and several not-for-profit foundations
- a big component for [[id:6eeb7a24-b662-46d6-9ece-00a5028ff4d8][Reproducible research]] for other projects such as [[id:3b3fdcbf-eb40-4c89-81f3-9d937a0be53c][Nix package manager]] and [[id:be917383-84c4-4bf5-9ca0-b04bfb778f4f][Guix package manager]] used as a fallback when upstream vanished;
soon enough, it will develop tools to integrate them further such as archiving the code used to build the binary cache
- there is a [[https://archive.softwareheritage.org/][public interface for browsing the archive]]
- they have dedicated resources into creating an infrastructure for creating a centralized reference for software such the following list
+ [[id:cf5cadae-e01b-4099-86fe-b0e05bda4d39][swh.fuse]], a tool that integrates the archive into a user-local filesystem integrating the archive for development workflow
+ roam:swh.search adding the search functionality in the archive
+ roam:swh.lister lists from several forges (e.g., GitHub, GitLab)