Software Heritage
project link is at https://www.softwareheritage.org/
the infrastructure and tools they use are also open source; development primarily happens at their own software forge
an ambitious project archiving all of humanity's publicly available source code
primarily made for researchers to easily refer to software; a centralized database for referencing software, similar to digital object identifiers (DOI) for research materials and ISBN for books
the archive itself is essentially a gigantic Merkle DAG (a Merkle tree generalized so that nodes can be shared), with the ability to interact with individual objects such as revisions (commits), snapshots, and even the very source code files of an archived repo
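a minimal sketch of the Merkle idea behind the archive: every node's identifier is a hash of its own content plus the identifiers of its children, so identical files are stored once and any change propagates up to the root. The hashing scheme and the names `node_id`, `file_id`, and `dir_id` are assumptions made for illustration; this is not the archive's actual serialization format.

```python
import hashlib


def node_id(kind: str, payload: bytes) -> str:
    """Hash a node from its kind and payload (toy scheme, not the real format)."""
    return hashlib.sha1(kind.encode() + b"\0" + payload).hexdigest()


def file_id(data: bytes) -> str:
    """A leaf node is identified by the hash of its bytes."""
    return node_id("file", data)


def dir_id(entries: dict) -> str:
    """A directory is identified by the hash of its sorted (name, child id) pairs."""
    payload = "".join(
        f"{name}:{child}\n" for name, child in sorted(entries.items())
    ).encode()
    return node_id("dir", payload)


# Two versions of a tiny repo that differ in a single file.
readme = file_id(b"hello\n")
main_v1 = file_id(b"print(1)\n")
main_v2 = file_id(b"print(2)\n")

root_v1 = dir_id({"README": readme, "main.py": main_v1})
root_v2 = dir_id({"README": readme, "main.py": main_v2})

# The unchanged README keeps the same id in both versions (deduplicated),
# while the root id changes because one child changed.
print(root_v1 != root_v2)
```

sorting the entries before hashing makes the directory id independent of insertion order, which is what lets two archives of the same tree agree on one identifier.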
it is version control software-agnostic; it archives software from several kinds of sources (e.g., Git, Mercurial)
each object is given an identifier referred to as a Software Heritage persistent identifier (SWHID)
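a minimal sketch of how a SWHID for a single file (a content object) can be computed locally. It relies on the content identifier being the same hash Git uses for a blob, i.e. SHA-1 over a `blob <size>\0` header followed by the bytes, prefixed with `swh:1:cnt:`. The function name `content_swhid` is hypothetical; the other object types (directories, revisions, releases, snapshots) use different serializations that are not covered here.

```python
import hashlib


def content_swhid(data: bytes) -> str:
    """Return the SWHID of a file's content, computed like a Git blob hash."""
    # Git-style header: object type, size in bytes, then a NUL separator.
    header = b"blob %d\0" % len(data)
    digest = hashlib.sha1(header + data).hexdigest()
    # Scheme "swh", version 1, object type "cnt" (content), then the hex digest.
    return f"swh:1:cnt:{digest}"


# The empty file has a well-known Git blob hash.
print(content_swhid(b""))
# swh:1:cnt:e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

because the identifier is derived only from the bytes, anyone can recompute it from a copy of the file and check that it matches the reference.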
funded by donations, including from big companies and several not-for-profit foundations
a big component of reproducible research for other projects such as the Nix package manager and the Guix package manager, which use it as a fallback when upstream sources vanish; it plans to develop tools to integrate with them further, such as archiving the code used to build the binary cache
there is a public interface for browsing the archive
they have dedicated resources to building the infrastructure for a centralized reference for software, such as the following
swh.fuse, a tool that mounts the archive as a user-local filesystem so it can be used in a development workflow
roam:swh.search adds search functionality to the archive
roam:swh.lister lists software origins from several forges (e.g., GitHub, GitLab)