Add Directory Loader to allow tarball ingestion as Directory
In some marginal listing cases (Nix or Guix for now), we can receive a raw tarball to ingest. This commit adds a loader to ingest those. The output of the ingestion is a snapshot with a single branch, HEAD, targeting the ingested directory (contained within the tarball).
The loader expects a mandatory 'integrity' field, which is used to check the tarball retrieved from the origin.
It can also optionally receive a list of mirror URLs in case the main origin URL is no longer available. Those mirror URLs are used solely as a fallback to retrieve the tarball.
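The integrity check and mirror fallback described above can be sketched roughly as follows. This is an illustration only, not the code added to swh/loader/core/loader.py: the function names `check_integrity` and `fetch_with_fallback`, the injected `download` callable, and the SRI-style `<algo>-<base64 digest>` format of the 'integrity' value are all assumptions made for the example. It also assumes that a payload failing the check is treated like an unavailable URL, so the next mirror is tried.

```python
import base64
import hashlib
import urllib.request
from typing import Callable, Sequence, Tuple


def check_integrity(data: bytes, integrity: str) -> bool:
    """Compare data against an '<algo>-<base64 digest>' string (assumed format)."""
    algo, _, expected_b64 = integrity.partition("-")
    if algo not in ("sha256", "sha384", "sha512") or not expected_b64:
        raise ValueError(f"unsupported integrity value: {integrity!r}")
    expected = base64.b64decode(expected_b64, validate=True)
    return hashlib.new(algo, data).digest() == expected


def _http_get(url: str) -> bytes:
    """Default downloader; network failures surface as OSError subclasses."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def fetch_with_fallback(
    url: str,
    mirrors: Sequence[str],
    integrity: str,
    download: Callable[[str], bytes] = _http_get,
) -> Tuple[str, bytes]:
    """Try the origin URL first, then each mirror, in order.

    Returns the URL that worked and the verified tarball bytes.
    """
    failures = []
    for candidate in [url, *mirrors]:
        try:
            data = download(candidate)
        except OSError as exc:
            # Origin or mirror unavailable: remember why and move on.
            failures.append((candidate, str(exc)))
            continue
        if check_integrity(data, integrity):
            return candidate, data
        # Assumption: a mismatching payload is skipped, not fatal.
        failures.append((candidate, "integrity mismatch"))
    raise LookupError(f"no valid tarball could be retrieved: {failures}")
```

Passing the downloader as a parameter keeps the sketch testable without network access; a real loader would then unpack the verified tarball and build the directory and snapshot from it.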
Related to T3781
Depends on D8581
Migrated from D8584 (view on Phabricator)
Activity
Build has FAILED
Patch application report for D8584 (id=30988)
Could not rebase; Attempt merge onto 6299c091...
Merge made by the 'recursive' strategy.
 .pre-commit-config.yaml | 2 +-
 swh/loader/core/loader.py | 241 ++++++++++++++++++++-
 .../project_asdf_archives_asdf-3.3.5.lisp | 1 +
 .../https_example.org/archives_dummy-hello.tar.gz | Bin 0 -> 221 bytes
 swh/loader/core/tests/test_loader.py | 204 ++++++++++++++++-
 5 files changed, 440 insertions(+), 8 deletions(-)
 create mode 100644 swh/loader/core/tests/data/https_common-lisp.net/project_asdf_archives_asdf-3.3.5.lisp
 create mode 100644 swh/loader/core/tests/data/https_example.org/archives_dummy-hello.tar.gz
Changes applied before test
commit 10076b690145ee3c306e923eea29b5ede907da57
Merge: 6299c09 12da8df
Author: Jenkins user <jenkins@localhost>
Date: Fri Sep 30 09:56:36 2022 +0000

    Merge branch 'diff-target' into HEAD

commit 12da8df8ee7277b9c208fdd282be92c87cb70a2e
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date: Fri Sep 30 11:54:13 2022 +0200

    Add Directory Loader to ingest raw tarball

    In some marginal listing cases (Nix or Guix for now), we can receive raw tarball to ingest. This commit adds a loader to ingest those. The output of the ingestion is a snapshot with 1 branch, one HEAD branch targetting the ingested directory (contained within the tarball).

    This expects to receive a mandatory 'integrity' field. It is used to check the tarball received out of the origin.

    This can also optionally receive a list of mirror urls in case the main origin url is no longer available. Those mirror urls are solely used as fallback to retrieve the tarball.

    Related to [T3781](https://forge.softwareheritage.org/T3781 'view original for T3781 on Phabricator')

commit c5fcf4025bb878df9541bee1e8c55006ba1874df
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date: Thu Sep 29 16:14:43 2022 +0200

    Add Content Loader to ingest raw content file

    In some marginal listing cases (Nix or Guix for now), we can receive raw file to ingest. This commit adds a loader to ingest those. The output of the ingestion is a snapshot with 1 branch, one HEAD branch targetting the file content ingested.

    This expects to receive a mandatory 'integrity' field. It is used to check the content match the declaration.

    This can also optionally receive a list of mirror urls in case the main origin url is no longer available. Those mirror urls are solely used as fallback to retrieve the content.

    Related to [T3781](https://forge.softwareheritage.org/T3781 'view original for T3781 on Phabricator')
Link to build: https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/922/
See console output for more information: https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/922/console
Build has FAILED
Patch application report for D8584 (id=30989)
Could not rebase; Attempt merge onto 6299c091...
Merge made by the 'recursive' strategy.
 .pre-commit-config.yaml | 2 +-
 swh/loader/core/loader.py | 240 ++++++++++++++++++++-
 .../project_asdf_archives_asdf-3.3.5.lisp | 1 +
 .../https_example.org/archives_dummy-hello.tar.gz | Bin 0 -> 221 bytes
 swh/loader/core/tests/test_loader.py | 204 +++++++++++++++++-
 5 files changed, 439 insertions(+), 8 deletions(-)
 create mode 100644 swh/loader/core/tests/data/https_common-lisp.net/project_asdf_archives_asdf-3.3.5.lisp
 create mode 100644 swh/loader/core/tests/data/https_example.org/archives_dummy-hello.tar.gz
Changes applied before test
commit 3de188cb01ed3e21492491bb207da019b20b5742
Merge: 6299c09 628efbf
Author: Jenkins user <jenkins@localhost>
Date: Fri Sep 30 09:59:58 2022 +0000

    Merge branch 'diff-target' into HEAD

commit 628efbf0d9502a45acd55c49a69f1251ac093c06
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date: Fri Sep 30 11:54:13 2022 +0200

    Add Directory Loader to allow tarball ingestion as Directory

    In some marginal listing cases (Nix or Guix for now), we can receive raw tarball to ingest. This commit adds a loader to ingest those. The output of the ingestion is a snapshot with 1 branch, one HEAD branch targetting the ingested directory (contained within the tarball).

    This expects to receive a mandatory 'integrity' field. It is used to check the tarball received out of the origin.

    This can also optionally receive a list of mirror urls in case the main origin url is no longer available. Those mirror urls are solely used as fallback to retrieve the tarball.

    Related to [T3781](https://forge.softwareheritage.org/T3781 'view original for T3781 on Phabricator')

commit c5fcf4025bb878df9541bee1e8c55006ba1874df
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date: Thu Sep 29 16:14:43 2022 +0200

    Add Content Loader to ingest raw content file

    In some marginal listing cases (Nix or Guix for now), we can receive raw file to ingest. This commit adds a loader to ingest those. The output of the ingestion is a snapshot with 1 branch, one HEAD branch targetting the file content ingested.

    This expects to receive a mandatory 'integrity' field. It is used to check the content match the declaration.

    This can also optionally receive a list of mirror urls in case the main origin url is no longer available. Those mirror urls are solely used as fallback to retrieve the content.

    Related to [T3781](https://forge.softwareheritage.org/T3781 'view original for T3781 on Phabricator')
Link to build: https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/923/
See console output for more information: https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/923/console
Build is green
Patch application report for D8584 (id=30991)
Could not rebase; Attempt merge onto 6299c091...
Updating 6299c09..4eaa99e
Fast-forward
 .pre-commit-config.yaml | 2 +-
 swh/loader/core/loader.py | 236 ++++++++++++++++++++-
 .../project_asdf_archives_asdf-3.3.5.lisp | 1 +
 .../https_example.org/archives_dummy-hello.tar.gz | Bin 0 -> 221 bytes
 swh/loader/core/tests/test_loader.py | 204 +++++++++++++++++-
 5 files changed, 435 insertions(+), 8 deletions(-)
 create mode 100644 swh/loader/core/tests/data/https_common-lisp.net/project_asdf_archives_asdf-3.3.5.lisp
 create mode 100644 swh/loader/core/tests/data/https_example.org/archives_dummy-hello.tar.gz
Changes applied before test
commit 4eaa99ea751f49d5453dbb51e2361f9d070d3dd8
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date: Fri Sep 30 11:54:13 2022 +0200

    Add Directory Loader to allow tarball ingestion as Directory

    In some marginal listing cases (Nix or Guix for now), we can receive raw tarball to ingest. This commit adds a loader to ingest those. The output of the ingestion is a snapshot with 1 branch, one HEAD branch targetting the ingested directory (contained within the tarball).

    This expects to receive a mandatory 'integrity' field. It is used to check the tarball received out of the origin.

    This can also optionally receive a list of mirror urls in case the main origin url is no longer available. Those mirror urls are solely used as fallback to retrieve the tarball.

    Related to [T3781](https://forge.softwareheritage.org/T3781 'view original for T3781 on Phabricator')

commit f774aba59e65bd3e5dd0ba9364840d8903d5706c
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date: Thu Sep 29 16:14:43 2022 +0200

    Add Content Loader to ingest raw content file

    In some marginal listing cases (Nix or Guix for now), we can receive raw file to ingest. This commit adds a loader to ingest those. The output of the ingestion is a snapshot with 1 branch, one HEAD branch targetting the file content ingested.

    This expects to receive a mandatory 'integrity' field. It is used to check the content match the declaration.

    This can also optionally receive a list of mirror urls in case the main origin url is no longer available. Those mirror urls are solely used as fallback to retrieve the content.

    Related to [T3781](https://forge.softwareheritage.org/T3781 'view original for T3781 on Phabricator')
See https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/924/ for more details.
Build is green
Patch application report for D8584 (id=30998)
Rebasing onto f774aba5...
Current branch diff-target is up to date.
Changes applied before test
commit 497f74f3225e4ccf11adce0d6a2bb50b2a471fab
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date: Fri Sep 30 11:54:13 2022 +0200

    Add Directory Loader to allow tarball ingestion as Directory

    In some marginal listing cases (Nix or Guix for now), we can receive raw tarball to ingest. This commit adds a loader to ingest those. The output of the ingestion is a snapshot with 1 branch, one HEAD branch targetting the ingested directory (contained within the tarball).

    This expects to receive a mandatory 'integrity' field. It is used to check the tarball received out of the origin.

    This can also optionally receive a list of mirror urls in case the main origin url is no longer available. Those mirror urls are solely used as fallback to retrieve the tarball.

    Related to [T3781](https://forge.softwareheritage.org/T3781 'view original for T3781 on Phabricator')
See https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/926/ for more details.
mentioned in merge request !437 (closed)
Some references in the commit message have been migrated:
- T3781 is now swh/meta#3781 (closed)
Build is green
Patch application report for D8584 (id=31056)
Rebasing onto 5482a48e...
Current branch diff-target is up to date.
Changes applied before test
commit dbf7f3dca0c8c2b9c364bdcdf19481ecf8421b77
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date: Fri Sep 30 11:54:13 2022 +0200

    Add Directory Loader to allow tarball ingestion as Directory

    In some marginal listing cases (Nix or Guix for now), we can receive raw tarball to ingest. This commit adds a loader to ingest those. The output of the ingestion is a snapshot with 1 branch, one HEAD branch targetting the ingested directory (contained within the tarball).

    This expects to receive a mandatory 'integrity' field. It is used to check the tarball received out of the origin.

    This can also optionally receive a list of mirror urls in case the main origin url is no longer available. Those mirror urls are solely used as fallback to retrieve the tarball.

    Related to [T3781](https://forge.softwareheritage.org/T3781 'view original for T3781 on Phabricator')
See https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/941/ for more details.
mentioned in issue swh/meta#3781 (closed)