
software-origins/nixguix: Document the origins

Merged Antoine R. Dumont requested to merge ardumont/swh-docs:update-doc-nixguix into master
@@ -8,12 +8,82 @@ Nix and Guix
.. include:: dynamic/nixguix_status.inc
TODO:
* description of the software origin
* summary of the lister's algorithm
* summary of the loader's algorithm
* URL pattern
* collect extrinsic metadata?
* index extrinsic metadata?
* index intrinsic metadata?
This page documents how |swh| archives source packages from the `GNU Guix`_ and Nix_
distributions.
Both distributions are built around a functional package manager (`GNU Guix`_ and
Nix_, respectively) with similar properties (e.g. transactional, declarative up to the
operating system, reproducible). Package definitions are written in each project's own
DSL. Because those definitions are not easy to parse and no listing API existed, the
communities set up a regular extraction of the list of origins as a JSON manifest.
|swh|'s :py:class:`swh.lister.nixguix.lister.NixGuix` lister queries each of those
manifests. As they contain various types of origins, |swh| uses different loaders to
ingest them:
- URL targeting a single file. The :py:class:`swh.loader.core.loader.ContentLoader`
  ingests origins with type ``content``.
- URL targeting a tarball. The :py:class:`swh.loader.core.loader.TarballDirectoryLoader`
  ingests origins with type ``tarball-directory``.
- :ref:`Svn <user-software-origins-svn>` repository. The
  :py:class:`swh.loader.svn.loader.SvnLoader` ingests origins with type ``svn``.
- :ref:`Svn <user-software-origins-svn>` repository at a specific revision. The
  :py:class:`swh.loader.svn.directory.SvnExportLoader` ingests origins
  with type ``svn-export``.
- :ref:`Git <user-software-origins-git>` repository. The
  :py:class:`swh.loader.git.loader.GitLoader` ingests origins with type ``git``.
- :ref:`Git <user-software-origins-git>` repository at a specific git commit. The
  :py:class:`swh.loader.git.directory.GitCheckoutLoader` ingests origins with type
  ``git-checkout``.
- :ref:`Mercurial <user-software-origins-mercurial>` repository. The
  :py:class:`swh.loader.mercurial.loader.HgLoader` ingests origins with type ``hg``.
- :ref:`Mercurial <user-software-origins-mercurial>` repository at a specific
  changeset. The :py:class:`swh.loader.mercurial.directory.HgCheckoutLoader` ingests
  origins with type ``hg-checkout``.
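
The correspondence between visit types and loaders listed above can be summarized as a
lookup table. This is only an illustrative sketch: the names ``LOADER_BY_VISIT_TYPE``
and ``loader_for`` are hypothetical and are not part of |swh|; only the visit types and
class paths come from the list above.

```python
# Visit type -> dotted path of the loader class, as listed in this page.
LOADER_BY_VISIT_TYPE = {
    "content": "swh.loader.core.loader.ContentLoader",
    "tarball-directory": "swh.loader.core.loader.TarballDirectoryLoader",
    "svn": "swh.loader.svn.loader.SvnLoader",
    "svn-export": "swh.loader.svn.directory.SvnExportLoader",
    "git": "swh.loader.git.loader.GitLoader",
    "git-checkout": "swh.loader.git.directory.GitCheckoutLoader",
    "hg": "swh.loader.mercurial.loader.HgLoader",
    "hg-checkout": "swh.loader.mercurial.directory.HgCheckoutLoader",
}


def loader_for(visit_type: str) -> str:
    """Return the loader class path for a visit type (hypothetical helper)."""
    try:
        return LOADER_BY_VISIT_TYPE[visit_type]
    except KeyError:
        raise ValueError(f"unsupported visit type: {visit_type!r}") from None
```
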
Each origin URL is the main URL provided in the manifest.
For some origins, such as files or tarballs, the manifest can also provide mirror URLs.
They are used as fallbacks to retrieve the artifact when the main URL is no longer
available.
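
The fallback behaviour can be pictured with the following minimal sketch. It is not the
loaders' actual code: the function ``fetch_with_fallback`` and its ``fetch`` parameter
are assumptions made for illustration, and the sketch only assumes that a failed
retrieval raises ``OSError`` (as ``urllib`` errors do).

```python
from typing import Callable, Iterable


def fetch_with_fallback(
    url: str,
    mirrors: Iterable[str],
    fetch: Callable[[str], bytes],
) -> bytes:
    """Try the main URL first, then each mirror in order (hypothetical helper)."""
    errors = []
    for candidate in [url, *mirrors]:
        try:
            return fetch(candidate)
        except OSError as exc:
            # Remember the failure and move on to the next candidate URL.
            errors.append(f"{candidate}: {exc}")
    raise OSError("all URLs failed: " + "; ".join(errors))
```

A caller could pass, for instance, ``lambda u: urllib.request.urlopen(u).read()`` as
``fetch``; any callable that returns the artifact bytes or raises ``OSError`` works.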
The lister collects neither extrinsic nor intrinsic metadata.
For some kinds of origins (``content``, ``tarball-directory``, ``svn-export``,
``hg-checkout``, ``git-checkout``), intrinsic information, namely their checksums
(``standard``, e.g. sha256, or :py:class:`swh.loader.core.nar.Nar`, the intrinsic
identifier specific to `GNU Guix`_ and Nix_), is transmitted to the loaders.
Those checksums are verified during ingestion. If a checksum does not match, the
artifact is rejected and the visit fails. Otherwise, the artifact is ingested.
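
For the ``standard`` case, the check amounts to hashing the retrieved bytes and
comparing the result with the digest from the manifest. The sketch below illustrates
that idea with Python's ``hashlib``; the function name ``verify_checksums`` is
hypothetical and this is not the loaders' implementation. The NAR case is not covered
here, since it hashes a Nix archive serialization of the file tree rather than the raw
bytes.

```python
import hashlib
from typing import Mapping


def verify_checksums(data: bytes, expected: Mapping[str, str]) -> None:
    """Raise ValueError if any expected hex digest does not match ``data``.

    ``expected`` maps a hashlib algorithm name (e.g. "sha256") to a hex digest.
    """
    for algo, expected_hex in expected.items():
        actual_hex = hashlib.new(algo, data).hexdigest()
        if actual_hex != expected_hex.lower():
            # Mismatch: the artifact must be rejected.
            raise ValueError(
                f"{algo} mismatch: expected {expected_hex}, got {actual_hex}"
            )
```
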
The snapshot resulting from the ingestion targets either a content, when loading a file
(visit type ``content``), or a directory, when loading a tarball (type
``tarball-directory``) or a VCS repository at a specific commit (``git-checkout``,
``svn-export``, ``hg-checkout``). Ingesting a full VCS repository (``git``, ``svn``,
``hg``) produces the usual snapshot.
Note also that a new entry is recorded in the ExtID table to map the SWHID of the
ingested object (a content for a file, a directory for the other kinds) to its
original checksum.
Sample:

+-----------------+---------------+--------------------------------------------------------------------+-------------+--------------------------------------------+
| extid_type      | extid_version | extid                                                              | target_type | target                                     |
+-----------------+---------------+--------------------------------------------------------------------+-------------+--------------------------------------------+
| checksum-sha256 | 1             | \x00001a5b5be28bde9bc8c353afe546d8fe84e49b269a70393c1616957b0e1cce | directory   | \xbe186100480d766ebdf0cfaeac0c90198f4b42e7 |
| nar-sha256      | 1             | \x00002584a56a9bce85793515604298f8b3b1e9497e00fc6361a0c2e731c063f3 | directory   | \x1e1ace5b0ef56e188d3cf99059070cc5448d7454 |
+-----------------+---------------+--------------------------------------------------------------------+-------------+--------------------------------------------+

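
To make the sample rows easier to read, the sketch below decodes the second one into
Python values and derives the SWHID of the targeted directory. It assumes that the
``\x`` prefix is a hexadecimal byte-string notation; ``ExtIDRow`` and ``parse_hex``
are hypothetical names used only for this illustration, not |swh| classes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExtIDRow:
    """One row of the sample table (illustrative structure only)."""

    extid_type: str
    extid_version: int
    extid: bytes
    target_type: str
    target: bytes


def parse_hex(text: str) -> bytes:
    """Decode a hex string carrying a leading backslash-x prefix."""
    prefix = "\\x"
    if not text.startswith(prefix):
        raise ValueError(f"missing hex prefix in {text!r}")
    return bytes.fromhex(text[len(prefix):])


row = ExtIDRow(
    extid_type="nar-sha256",
    extid_version=1,
    extid=parse_hex(
        r"\x00002584a56a9bce85793515604298f8b3b1e9497e00fc6361a0c2e731c063f3"
    ),
    target_type="directory",
    target=parse_hex(r"\x1e1ace5b0ef56e188d3cf99059070cc5448d7454"),
)

# A directory SWHID is "swh:1:dir:" followed by the hex identifier.
swhid = f"swh:1:dir:{row.target.hex()}"
```
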
Resources:
* `GNU Guix git repository <https://git.savannah.gnu.org/cgit/guix.git>`__
* `Nixpkgs git repository <https://github.com/nixos/nixpkgs>`__
* `Remote nixpkgs JSON manifest <https://nix-community.github.io/nixpkgs-swh/sources-unstable-full.json>`__
* `Remote guix JSON manifest <https://nix-community.github.io/nixpkgs-swh/>`__
.. _`GNU Guix`: https://guix.gnu.org/
.. _Nix: https://nixos.org/