Add Content Loader to ingest raw content file
In some marginal listing cases (Nix or Guix for now), we can receive raw files to ingest. This commit adds a loader to ingest those. The output of the ingestion is a snapshot with a single branch, HEAD, targeting the ingested file content.
This expects to receive a mandatory 'integrity' field, which is used to check that the content matches the declaration.
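As an illustration of the integrity check described above, here is a minimal sketch. It assumes the field has the form `<algorithm>-<base64 digest>` (the `hash-expression` shape mentioned in the review below) and that only sha256, sha384 and sha512 are accepted. The function names are hypothetical and are not the loader's actual API.

```python
import base64
import hashlib
from typing import Tuple

# Assumption: only these algorithms may appear in the integrity field.
SUPPORTED_ALGOS = ("sha256", "sha384", "sha512")


def parse_hash_expression(integrity: str) -> Tuple[str, bytes]:
    """Split '<algo>-<base64 digest>' into (algo, raw digest bytes)."""
    algo, sep, b64_digest = integrity.partition("-")
    if not sep or algo not in SUPPORTED_ALGOS:
        raise ValueError(f"unsupported integrity value: {integrity!r}")
    # validate=True rejects characters outside the base64 alphabet.
    return algo, base64.b64decode(b64_digest, validate=True)


def check_integrity(data: bytes, integrity: str) -> bool:
    """Return True when the digest of `data` matches the declared one."""
    algo, expected = parse_hash_expression(integrity)
    return hashlib.new(algo, data).digest() == expected
```

A loader built this way would refuse to archive any content for which `check_integrity` returns False.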
This can also optionally receive a list of mirror URLs in case the main origin URL is no longer available. Those mirror URLs are used solely as fallbacks to retrieve the content.
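The fallback behaviour can be sketched as follows. This is only an illustration: `fetch_with_fallback` and its `fetch` callable are hypothetical names, and the real loader may order or retry URLs differently.

```python
from typing import Callable, Iterable, Optional


def fetch_with_fallback(
    origin_url: str,
    mirror_urls: Iterable[str],
    fetch: Callable[[str], bytes],
) -> bytes:
    """Try the origin URL first, then each mirror, returning the first success."""
    last_error: Optional[Exception] = None
    for url in [origin_url, *mirror_urls]:
        try:
            return fetch(url)
        except Exception as exc:  # any retrieval failure moves on to the next URL
            last_error = exc
    raise RuntimeError(f"could not retrieve content for {origin_url}") from last_error
```

Passing the download function in as a parameter keeps the sketch independent of any HTTP library and easy to exercise with a fake.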
Note: For the integrity field, some future adaptations will be needed in that code. They are kept out of the scope of this diff to avoid depending on a new release of the model [1].
Related to T3781. Supersedes !446 (closed).
Migrated from D8581 (view on Phabricator)
Activity
Build is green
Patch application report for D8581 (id=30956)
Rebasing onto 6299c091...
First, rewinding head to replay your work on top of it... Applying: Add Content Loader to ingest raw content file
Changes applied before test
commit 75e8a22f220083d9d4a3c1341ed5d882849f7b86
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Thu Sep 29 16:14:43 2022 +0200

    Add Content Loader to ingest raw content file

    In some marginal listing cases (Nix or Guix for now), we can receive files to ingest. This creates a loader to ingest those. The output of the ingestion is a snapshot with 2 branches, one targetting the file ingested whose branch name is the filename. The other is an alias branch (matching what's done in other package loader).

    This expects to receive a mandatory 'integrity' field. It is used to check the content match the declaration.

    This can also receive a list of mirror urls in case the main origin url is no longer available.

    Related to [T3781](https://forge.softwareheritage.org/T3781 'view original for T3781 on Phabricator')
See https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/914/ for more details.
Please document which terminal of the grammar you are aiming for (the current implementation is hash-expression, but I don't know if that's intentional).
Could you use a smaller test file? That one is really big...
> Please document which terminal of the grammar you are aiming for (the current implementation is hash-expression, but I don't know if that's intentional).
It is intentional, yes. I'll add a link to the grammar.
> Could you use a smaller test file? That one is really big...
Well, that's the sole one I found.
> In !447 (closed), @ardumont wrote:
> > Could you use a smaller test file? That one is really big...
> Well, that's the sole one I found.
Any file would do; you don't need to get one from Guix/Nix.
Build has FAILED
Patch application report for D8581 (id=30978)
Rebasing onto 6299c091...
First, rewinding head to replay your work on top of it... Applying: Add Content Loader to ingest raw content file
Changes applied before test
commit 26d3ad52aa8c6e1223d4d0b0e3609c198bf46c7b
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Thu Sep 29 16:14:43 2022 +0200

    Add Content Loader to ingest raw content file

    In some marginal listing cases (Nix or Guix for now), we can receive raw file to ingest. This commit adds a loader to ingest those. The output of the ingestion is a snapshot with 1 branch, one HEAD branch targetting the file content ingested.

    This expects to receive a mandatory 'integrity' field. It is used to check the content match the declaration.

    This can also optionally receive a list of mirror urls in case the main origin url is no longer available. Those mirror urls are solely used as fallback to retrieve the content.

    Related to [T3781](https://forge.softwareheritage.org/T3781 'view original for T3781 on Phabricator')
Link to build: https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/916/
See console output for more information: https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/916/console
Build is green
Patch application report for D8581 (id=30979)
Rebasing onto 6299c091...
First, rewinding head to replay your work on top of it... Applying: Add Content Loader to ingest raw content file
Changes applied before test
commit 2aca780a73de24ecf7ff9227e43513acb0fb0357
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Thu Sep 29 16:14:43 2022 +0200

    Add Content Loader to ingest raw content file

    In some marginal listing cases (Nix or Guix for now), we can receive raw file to ingest. This commit adds a loader to ingest those. The output of the ingestion is a snapshot with 1 branch, one HEAD branch targetting the file content ingested.

    This expects to receive a mandatory 'integrity' field. It is used to check the content match the declaration.

    This can also optionally receive a list of mirror urls in case the main origin url is no longer available. Those mirror urls are solely used as fallback to retrieve the content.

    Related to [T3781](https://forge.softwareheritage.org/T3781 'view original for T3781 on Phabricator')
See https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/917/ for more details.
@vlorentz I should have started with this... from the nixguix manifest, the integrity is for now only sha256... [1] So not sure we need to touch the model after all [2], especially since that diff got a tad bigger since you reviewed it...
$ jq . /var/tmp/sources.json | grep -c sha256
13629
$ jq . /var/tmp/sources.json | grep -c sha384
0
$ jq . /var/tmp/sources.json | grep -c sha512
0
Although sha512 is used in the nixpkgs manifest...
$ jq . /var/tmp/sources-unstable.json | grep -c sha256
58036
$ jq . /var/tmp/sources-unstable.json | grep -c sha384
0
$ jq . /var/tmp/sources-unstable.json | grep -c sha512
8162
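For a per-algorithm breakdown that does not depend on line-based grep matching, a small Python sketch could be used instead. It assumes the manifest is a JSON object with a top-level "sources" list whose entries carry an "integrity" string of the form `<algorithm>-<digest>`. That layout and the function name are assumptions made for illustration, not a description of the actual manifest schema.

```python
from collections import Counter
from typing import Any, Dict


def count_integrity_algos(manifest: Dict[str, Any]) -> Counter:
    """Count sources per hash algorithm named in their 'integrity' field."""
    counts: Counter = Counter()
    for source in manifest.get("sources", []):
        integrity = source.get("integrity", "")
        algo, sep, _ = integrity.partition("-")
        if sep:  # skip entries without an '<algo>-<digest>' value
            counts[algo] += 1
    return counts
```

The manifest would be loaded with `json.load` before being passed in. Since `grep -c` counts matching lines rather than parsed fields, the figures may differ slightly from the ones above.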
Build is green
Patch application report for D8581 (id=30983)
Rebasing onto 6299c091...
First, rewinding head to replay your work on top of it... Applying: Add Content Loader to ingest raw content file
Changes applied before test
commit 6436e2304d37812839870562f447895768d4c4a5
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Thu Sep 29 16:14:43 2022 +0200

    Add Content Loader to ingest raw content file

    In some marginal listing cases (Nix or Guix for now), we can receive raw file to ingest. This commit adds a loader to ingest those. The output of the ingestion is a snapshot with 1 branch, one HEAD branch targetting the file content ingested.

    This expects to receive a mandatory 'integrity' field. It is used to check the content match the declaration.

    This can also optionally receive a list of mirror urls in case the main origin url is no longer available. Those mirror urls are solely used as fallback to retrieve the content.

    Related to [T3781](https://forge.softwareheritage.org/T3781 'view original for T3781 on Phabricator')
See https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/919/ for more details.
Build is green
Patch application report for D8581 (id=30984)
Rebasing onto 6299c091...
First, rewinding head to replay your work on top of it... Applying: Add Content Loader to ingest raw content file
Changes applied before test
commit 32524ef0c03e677dbd60ec9d7aec7626c4a5322d
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Thu Sep 29 16:14:43 2022 +0200

    Add Content Loader to ingest raw content file

    In some marginal listing cases (Nix or Guix for now), we can receive raw file to ingest. This commit adds a loader to ingest those. The output of the ingestion is a snapshot with 1 branch, one HEAD branch targetting the file content ingested.

    This expects to receive a mandatory 'integrity' field. It is used to check the content match the declaration.

    This can also optionally receive a list of mirror urls in case the main origin url is no longer available. Those mirror urls are solely used as fallback to retrieve the content.

    Related to [T3781](https://forge.softwareheritage.org/T3781 'view original for T3781 on Phabricator')
See https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/920/ for more details.
Some references in the commit message have been migrated:
- T3781 is now swh/meta#3781 (closed)
Build is green
Patch application report for D8581 (id=30992)
Rebasing onto 6299c091...
Current branch diff-target is up to date.
Changes applied before test
commit f774aba59e65bd3e5dd0ba9364840d8903d5706c
Author: Antoine R. Dumont (@ardumont) <ardumont@softwareheritage.org>
Date:   Thu Sep 29 16:14:43 2022 +0200

    Add Content Loader to ingest raw content file

    In some marginal listing cases (Nix or Guix for now), we can receive raw file to ingest. This commit adds a loader to ingest those. The output of the ingestion is a snapshot with 1 branch, one HEAD branch targetting the file content ingested.

    This expects to receive a mandatory 'integrity' field. It is used to check the content match the declaration.

    This can also optionally receive a list of mirror urls in case the main origin url is no longer available. Those mirror urls are solely used as fallback to retrieve the content.

    Related to [T3781](https://forge.softwareheritage.org/T3781 'view original for T3781 on Phabricator')
See https://jenkins.softwareheritage.org/job/DLDBASE/job/tests-on-diff/925/ for more details.