From af97e0542bb2df3d6c226386b134279cdfb49b59 Mon Sep 17 00:00:00 2001
From: "Antoine R. Dumont (@ardumont)" <ardumont@softwareheritage.org>
Date: Mon, 26 Feb 2024 14:34:03 +0100
Subject: [PATCH] software-origins/nixguix: Document origins and link to
 loaders

Refs. swh/devel/swh-loader-core#4749
---
 docs/user/software-origins/nixguix.rst | 82 +++++++++++++++++++++++---
 1 file changed, 73 insertions(+), 9 deletions(-)

diff --git a/docs/user/software-origins/nixguix.rst b/docs/user/software-origins/nixguix.rst
index 6acfc6ff..cfd7a50f 100644
--- a/docs/user/software-origins/nixguix.rst
+++ b/docs/user/software-origins/nixguix.rst
@@ -8,12 +8,76 @@ Nix and Guix
 
 .. include:: dynamic/nixguix_status.inc
 
-TODO:
-
-* description of the software origin
-* summary of the lister's algorithm
-* summary of the loader's algorithm
-* URL pattern
-* collect extrinsic metadata?
-* index extrinsic metadata?
-* index intrinsic metadata?
+This page documents how |swh| archives source packages from the
+`GNU Guix <https://guix.gnu.org/>`_ and `Nix <https://nixos.org/>`_
+distributions.
+
+Both distributions provide a functional package manager (GNU Guix and Nix, respectively)
+with similar properties (e.g. transactional, declarative up to the operating system,
+reproducible, ...). Packages are defined in each distribution's own DSL. As those
+definitions are not easy to parse and no listing API existed, a community effort
+provides regular extractions of the origin listings (as JSON manifests).
+
+|swh|'s NixGuix lister queries each of those origin listings. As they contain
+various types of origins, |swh| has a lister and several loaders to deal with those
+origins.
+
+Various kinds of origins can be listed and ingested:
+
+- URL targeting a simple file. The :ref:`Content Loader
+  <ContentLoader>` ingests origins with type 'content'.
+- URL targeting a tarball. The :ref:`Tarball Directory Loader
+  <TarballDirectoryLoader>` ingests origins with type
+  'tarball-directory'.
+- :ref:`Svn <user-software-origins-svn>` repository. The :ref:`Svn loader
+  <swh.loader.svn.loader.SvnLoader>` ingests origins with type 'svn'.
+- :ref:`Svn <user-software-origins-svn>` repository at a specific revision. The
+  :ref:`Svn export loader <SvnExportLoader>` ingests origins
+  with type 'svn-export'.
+- :ref:`Git <user-software-origins-git>` repository. The :ref:`Git loader
+  <GitLoader>` ingests origins with type
+  'git'.
+- :ref:`Git <user-software-origins-git>` repository at a specific git commit. The
+  :ref:`Git checkout loader <GitCheckoutLoader>` ingests origins
+  with type 'git-checkout'.
+- :ref:`Mercurial <user-software-origins-mercurial>` repository. The :ref:`Mercurial
+  loader <HgLoader>` ingests origins with type 'hg'.
+- :ref:`Mercurial <user-software-origins-mercurial>` repository at a specific
+  changeset. The :ref:`Mercurial checkout loader <HgCheckoutLoader>` ingests
+  origins with type 'hg-checkout'.
+
+Origin URLs match the main URLs provided in the manifest.
+
+In some cases, there are also mirror URLs, which are used as a fallback to retrieve the
+artifact when the main URL is no longer available.
+
+Neither extrinsic nor intrinsic metadata is collected on the lister's side.
+
+For some kinds of origins ('content', 'tarball-directory', 'svn-export', 'hg-checkout',
+'git-checkout'), extra intrinsic information about the checksums used ('standard',
+e.g. sha256, or :ref:`Nar <Nar>`, a specific intrinsic identifier used by
+Guix and Nix) is transmitted to their respective loaders. During ingestion, the
+retrieved artifact is checked against those checksums. If a checksum does not match, the
+artifact is rejected and the visit fails. Otherwise, the artifact is ingested. Note
+also that a new entry is recorded in the ExtID table to map the content SWHID (for a
+content file) or the directory SWHID (for the other kinds) of the ingested artifact to
+its original checksum.
+
+Sample:
+
++-----------------+---------------+--------------------------------------------------------------------+-------------+--------------------------------------------+
+| extid_type      | extid_version | extid                                                              | target_type | target                                     |
++-----------------+---------------+--------------------------------------------------------------------+-------------+--------------------------------------------+
+| checksum-sha256 |             1 | \x00001a5b5be28bde9bc8c353afe546d8fe84e49b269a70393c1616957b0e1cce | directory   | \xbe186100480d766ebdf0cfaeac0c90198f4b42e7 |
+| nar-sha256      |             1 | \x00002584a56a9bce85793515604298f8b3b1e9497e00fc6361a0c2e731c063f3 | directory   | \x1e1ace5b0ef56e188d3cf99059070cc5448d7454 |
++-----------------+---------------+--------------------------------------------------------------------+-------------+--------------------------------------------+
+
+Resources:
+
+* `GNU Guix git repository <https://git.savannah.gnu.org/cgit/guix.git>`__
+
+* `Nixpkgs git repository <https://github.com/nixos/nixpkgs>`__
+
+* `Remote nixpkgs JSON manifest <https://nix-community.github.io/nixpkgs-swh/sources-unstable-full.json>`__
+
+* `Remote guix JSON manifest <https://nix-community.github.io/nixpkgs-swh/>`__
-- 
GitLab