Add nixguix lister

Implementation follows the notes in the linked task (up to the HedgeDoc linked in the task).

The downstream loader adaptations required to ingest the lister's output are tracked in the following diffs:

  • D8581: Create ContentLoader(BaseLoader) to deal with ListedOrigins with "file" visit_type

  • D8584: Create DirectoryLoader(BaseLoader) to deal with "integrity" field (with or without version)

  • run through docker to lift papercuts [1] [2] [3]

  • [1] guix

swh-lister_1                         | [2022-10-03 08:05:49,405: INFO/ForkPoolWorker-1] Task swh.lister.nixguix.tasks.NixGuixListerTask[f58096ad-af9f-42fa-bc29-e4791f1a24e3] succeeded in 557.3408025280223s: {'pages': 21483, 'origins': 18936}
  • [2] swh/meta$1455

  • [3] nixpkgs

swh-lister_1                         | [2022-10-03 15:36:38,225: INFO/ForkPoolWorker-1] Task swh.lister.nixguix.tasks.NixGuixListerTask[b442f750-797d-4df8-af0e-a5426a669462] succeeded in 177.8664992809645s: {'pages': 31285, 'origins': 31218}
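
To compare the two docker runs above, the Celery "succeeded" lines can be reduced to a few numbers. The following is a minimal sketch, not part of swh-lister: the names `SUCCEEDED_RE` and `parse_task_summary` are hypothetical helpers, and the regular expression assumes the log tail always has the form `succeeded in <seconds>s: {<result dict>}` as in the two lines quoted here.

```python
import ast
import re

# Hypothetical helper: matches the tail of a Celery "succeeded" log line,
# i.e. the duration in seconds followed by the task's result dict.
SUCCEEDED_RE = re.compile(
    r"succeeded in (?P<seconds>[0-9.]+)s: (?P<result>\{.*\})\s*$"
)


def parse_task_summary(line: str) -> dict:
    """Extract duration, page count and origin count from one log line."""
    match = SUCCEEDED_RE.search(line)
    if match is None:
        raise ValueError("not a 'succeeded' log line")
    seconds = float(match.group("seconds"))
    # The result is printed as a Python dict literal, so literal_eval is enough.
    stats = ast.literal_eval(match.group("result"))
    return {
        "seconds": seconds,
        "pages": stats["pages"],
        "origins": stats["origins"],
        "origins_per_second": stats["origins"] / seconds,
    }


# Log tails copied from the guix [1] and nixpkgs [3] runs.
guix_line = (
    "Task swh.lister.nixguix.tasks.NixGuixListerTask"
    "[f58096ad-af9f-42fa-bc29-e4791f1a24e3] succeeded in "
    "557.3408025280223s: {'pages': 21483, 'origins': 18936}"
)
nixpkgs_line = (
    "Task swh.lister.nixguix.tasks.NixGuixListerTask"
    "[b442f750-797d-4df8-af0e-a5426a669462] succeeded in "
    "177.8664992809645s: {'pages': 31285, 'origins': 31218}"
)

for name, line in (("guix", guix_line), ("nixpkgs", nixpkgs_line)):
    summary = parse_task_summary(line)
    print(name, summary["pages"], summary["origins"],
          round(summary["origins_per_second"], 1))
```

The derived `origins_per_second` figure is only a rough indicator, since both runs went through a local docker setup.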

Related to T3781


Migrated from D8341 (view on Phabricator)
