From af97e0542bb2df3d6c226386b134279cdfb49b59 Mon Sep 17 00:00:00 2001
From: "Antoine R. Dumont (@ardumont)" <ardumont@softwareheritage.org>
Date: Mon, 26 Feb 2024 14:34:03 +0100
Subject: [PATCH] software-origins/nixguix: Document origins and link to loaders

Refs. swh/devel/swh-loader-core#4749
---
 docs/user/software-origins/nixguix.rst | 82 +++++++++++++++++++++++---
 1 file changed, 73 insertions(+), 9 deletions(-)

diff --git a/docs/user/software-origins/nixguix.rst b/docs/user/software-origins/nixguix.rst
index 6acfc6ff..cfd7a50f 100644
--- a/docs/user/software-origins/nixguix.rst
+++ b/docs/user/software-origins/nixguix.rst
@@ -8,12 +8,76 @@ Nix and Guix
 
 .. include:: dynamic/nixguix_status.inc
 
-TODO:
-
-* description of the software origin
-* summary of the lister's algorithm
-* summary of the loader's algorithm
-* URL pattern
-* collect extrinsic metadata?
-* index extrinsic metadata?
-* index intrinsic metadata?
+This page documents how |swh| archives source packages from the
+`GNU Guix <https://guix.gnu.org/>`_ and `Nix <https://nixos.org/>`_
+distributions.
+
+Both distributions provide functional package managers (GNU Guix and Nix, respectively)
+with similar properties (e.g. transactional, declarative up to the operating system,
+reproducible, ...). Packages are defined in each distribution's own DSL. As those DSLs
+are not easy to parse and no listing API existed, a community effort provides a regular
+extraction of the origins listing (as a JSON manifest).
+
+|swh|'s NixGuix lister queries each of those origin listings. As they contain
+various types of origins, |swh| has a lister and various loaders to deal with those
+origins.
+
+Various kinds of origins can be listed and ingested:
+
+- URL targeting a simple file. The :ref:`Content Loader
+  <ContentLoader>` ingests origins with type 'content'.
+- URL targeting a tarball.
+  The :ref:`Tarball Directory Loader
+  <TarballDirectoryLoader>` ingests origins with type
+  'tarball-directory'.
+- :ref:`Svn <user-software-origins-svn>` repository. The :ref:`Svn loader
+  <swh.loader.svn.loader.SvnLoader>` ingests origins with type 'svn'.
+- :ref:`Svn <user-software-origins-svn>` repository at a specific revision. The
+  :ref:`Svn export loader <SvnExportLoader>` ingests origins
+  with type 'svn-export'.
+- :ref:`Git <user-software-origins-git>` repository. The :ref:`Git loader
+  <GitLoader>` ingests origins with type 'git'.
+- :ref:`Git <user-software-origins-git>` repository at a specific git commit. The
+  :ref:`Git checkout loader <GitCheckoutLoader>` ingests origins
+  with type 'git-checkout'.
+- :ref:`Mercurial <user-software-origins-mercurial>` repository. The :ref:`Mercurial
+  loader <HgLoader>` ingests origins with type 'hg'.
+- :ref:`Mercurial <user-software-origins-mercurial>` repository at a specific revision.
+  The :ref:`Mercurial checkout loader <HgCheckoutLoader>` ingests origins with type
+  'hg-checkout'.
+
+Origin URLs match the main URLs provided in the manifest.
+
+In some cases, there can be mirror URLs, which are used as a fallback to retrieve the
+artifact when the main URL is no longer available.
+
+Neither extrinsic nor intrinsic metadata is collected on the lister's side.
+
+For some kinds of origins ('content', 'tarball-directory', 'svn-export', 'hg-checkout',
+'git-checkout'), extra intrinsic information about the checksums used ('standard',
+e.g. sha256, or :ref:`Nar <Nar>`, the specific intrinsic identifier used by
+Guix and Nix) is transmitted to their respective loaders. During ingestion, each
+artifact is checked against those checksums. If a checksum does not match, the
+artifact is rejected and the visit fails. Otherwise, the artifact is ingested.
+Note also that a new entry is recorded in the ExtID table to map the SWHID of the
+ingested content (for a content file) or directory (for the other kinds) to its
+original checksum.
+
+Sample:
+
++-----------------+---------------+--------------------------------------------------------------------+-------------+--------------------------------------------+
+| extid_type      | extid_version | extid                                                              | target_type | target                                     |
++-----------------+---------------+--------------------------------------------------------------------+-------------+--------------------------------------------+
+| checksum-sha256 | 1             | \x00001a5b5be28bde9bc8c353afe546d8fe84e49b269a70393c1616957b0e1cce | directory   | \xbe186100480d766ebdf0cfaeac0c90198f4b42e7 |
+| nar-sha256      | 1             | \x00002584a56a9bce85793515604298f8b3b1e9497e00fc6361a0c2e731c063f3 | directory   | \x1e1ace5b0ef56e188d3cf99059070cc5448d7454 |
++-----------------+---------------+--------------------------------------------------------------------+-------------+--------------------------------------------+
+
+Resources:
+
+* `GNU Guix git repository <https://git.savannah.gnu.org/cgit/guix.git>`__
+
+* `Nixpkgs git repository <https://github.com/nixos/nixpkgs>`__
+
+* `Remote nixpkgs json manifest <https://nix-community.github.io/nixpkgs-swh/sources-unstable-full.json>`__
+
+* `Remote guix json manifest <https://nix-community.github.io/nixpkgs-swh/>`__
-- 
GitLab