nixguix: Deal with manifest entries without an integrity field
In that case, the lister falls back to the "outputHash" field, which is the equivalent of the integrity field for the "recursive" outputHashMode. This also adds the necessary assertions around this case so that correct data is sent to the loaders.
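As an illustration only, the fallback can be sketched as follows. This is not the code from this diff: `checksum_from_entry` is a hypothetical helper, and it assumes that both "integrity" and "outputHash" hold an SRI-style `<algo>-<base64>` string. Entries whose "outputHash" uses another encoding are not handled here.

```python
import base64
from typing import Any, Dict, Tuple


def checksum_from_entry(entry: Dict[str, Any]) -> Tuple[str, str]:
    """Return (algorithm, hex digest) for a manifest entry.

    Hypothetical helper: prefers "integrity" and falls back to "outputHash",
    assuming both hold an SRI-style "<algo>-<base64>" string.
    """
    sri = entry.get("integrity") or entry.get("outputHash")
    if not sri:
        raise ValueError("entry has neither 'integrity' nor 'outputHash'")

    algo, sep, b64_digest = sri.partition("-")
    if not sep:
        raise ValueError(f"not an SRI-style value: {sri!r}")

    # Raises binascii.Error when the base64 payload is malformed.
    return algo, base64.decodebytes(b64_digest.encode()).hex()


if __name__ == "__main__":
    # Made-up entry without "integrity": the digest is 32 zero bytes.
    entry = {
        "outputHash": "sha256-" + base64.b64encode(bytes(32)).decode(),
        "outputHashMode": "recursive",
        "type": "url",
    }
    print(checksum_from_entry(entry))
```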
This was detected again by runs in docker [1].
Related to T3781
- [1] Related to swh/meta$1472
Test Plan
tests + docker runs as usual.
Far fewer skipped artifacts with this change [1] than before [2]:
- [1]
swh-lister_1 | [2022-10-05 14:14:03,880: WARNING/ForkPoolWorker-1] Cannot detect extension for <https://sources.debian.net/data/main/g/gpsbabel/1.5.3-2/debian/patches/use_minizip>. Fallback to http head query
swh-lister_1 | [2022-10-05 14:14:03,984: WARNING/ForkPoolWorker-1] Still cannot detect extension through location <https://sources.debian.net/data/main/g/gpsbabel/1.5.3-2/debian/patches/use_minizip>...
swh-lister_1 | [2022-10-05 14:14:20,777: WARNING/ForkPoolWorker-1] Cannot detect extension for <https://github.com/mpeterv/luazip>. Fallback to http head query
swh-lister_1 | [2022-10-05 14:14:21,669: WARNING/ForkPoolWorker-1] url <https://github.com/mpeterv/luazip>: detected as 'file' with 'recursive' outputHashMode <{'outputHash': '1jlqzqlds3aa3hnp737fm2awcx0hzmwyd87klv0cv13ny5v9f2x4', 'outputHashAlgo': 'sha256', 'outputHashMode': 'recursive', 'type': 'url', 'urls': ['https://github.com/mpeterv/luazip'], 'integrity': 'sha256-pAuXdvF2hM3ApvOg5nn9EHTGlajujHMtHEoN3Sj+mMo=', 'inferredFetcher': 'unclassified'}>
swh-lister_1 | [2022-10-05 14:14:21,942: ERROR/ForkPoolWorker-1] Skipping url: <https://github.com/stedolan/jq/archive/jq-1.6.tar.gz>: integrity computation failure for <{'outputHash': 'sha256-CIE8vumQPGK+TFAncmpBijANpFALLTadOvkob0gVzro', 'outputHashAlgo': None, 'outputHashMode': 'recursive', 'type': 'url', 'urls': ['https://github.com/stedolan/jq/archive/jq-1.6.tar.gz'], 'inferredFetcher': 'fetchzip'}>
swh-lister_1 | Traceback (most recent call last):
swh-lister_1 | File "/tmp/tmp.9bwmNMLi9T/swh-lister/swh/lister/nixguix/lister.py", line 412, in get_pages
swh-lister_1 | chksum_algo: base64.decodebytes(chksum_b64.encode()).hex()
swh-lister_1 | File "/usr/local/lib/python3.7/base64.py", line 546, in decodebytes
swh-lister_1 | return binascii.a2b_base64(s)
swh-lister_1 | binascii.Error: Incorrect padding
swh-lister_1 | [2022-10-05 14:14:56,954: ERROR/ForkPoolWorker-1] Skipping url: <https://github.com/figiel/hosts/archive/v1.0.0.tar.gz>: integrity computation failure for <{'outputHash': 'sha256-9uF0fYl4Zz/Ia2UKx7CBi8ZU8jfWoBfy2QSgTSwXo5A', 'outputHashAlgo': None, 'outputHashMode': 'recursive', 'type': 'url', 'urls': ['https://github.com/figiel/hosts/archive/v1.0.0.tar.gz'], 'inferredFetcher': 'fetchzip'}>
swh-lister_1 | Traceback (most recent call last):
swh-lister_1 | File "/tmp/tmp.9bwmNMLi9T/swh-lister/swh/lister/nixguix/lister.py", line 412, in get_pages
swh-lister_1 | chksum_algo: base64.decodebytes(chksum_b64.encode()).hex()
swh-lister_1 | File "/usr/local/lib/python3.7/base64.py", line 546, in decodebytes
swh-lister_1 | return binascii.a2b_base64(s)
swh-lister_1 | binascii.Error: Incorrect padding
swh-lister_1 | [2022-10-05 14:15:09,152: WARNING/ForkPoolWorker-1] Cannot detect extension for <http://git.marmaro.de/?p=mmh;a=snapshot;h=431604647f89d5aac7b199a7883e98e56e4ccf9e;sf=tgz>. Fallback to http head query
swh-lister_1 | [2022-10-05 14:16:05,260: INFO/ForkPoolWorker-1] Task swh.lister.nixguix.tasks.NixGuixListerTask[49b5c683-a124-4e6b-8ac5-3d27b9831f02] succeeded in 165.84545156091917s: {'pages': 31563, 'origins': 31494}
- [2] swh/meta$1473
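The two remaining `binascii.Error: Incorrect padding` failures in the log under [1] involve "outputHash" values whose base64 part is 43 characters long, i.e. missing the trailing `=`. A minimal sketch of one way such values could be tolerated, not something this diff does; `decode_b64_lenient` is a hypothetical name:

```python
import base64


def decode_b64_lenient(value: str) -> bytes:
    """Decode base64 text, restoring any missing '=' padding first.

    Hypothetical helper: pads the input up to a multiple of 4 characters.
    Truly invalid input still raises binascii.Error.
    """
    padded = value + "=" * (-len(value) % 4)
    return base64.b64decode(padded)


if __name__ == "__main__":
    # base64 part of the jq "outputHash" from the log above (no padding).
    digest = decode_b64_lenient("CIE8vumQPGK+TFAncmpBijANpFALLTadOvkob0gVzro")
    print(len(digest), digest.hex())
```

Whether to pad silently or keep skipping such entries is a design choice; padding accepts slightly malformed manifests at the cost of hiding upstream data issues.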
Migrated from D8631 (view on Phabricator)