blob_datasets: Add task BlobScancode

vlorentz requested to merge vlorentz/swh-graph:scancode into master

This task wasn't implemented yet because it is only relevant to the license dataset, not the citation dataset. The code is mostly based on https://annex.softwareheritage.org/public/dataset/license-blobs/2022-04-25/replication-package.tar.gz, adapted to work with the Luigi workflow.

Scancode pulls in a bunch of dependencies, so I'd rather not add it to requirements.txt.
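
For illustration only, here is one way to keep Scancode optional: import it when the task runs rather than at module load, and fail with an actionable message if it is missing. This is a sketch, not the code in this merge request. `import_optional` and `scan_blob` are hypothetical names, and the `scancode.api.get_licenses` call and the `scancode-toolkit` package name are assumptions about the installed Scancode version.

```python
import importlib
from types import ModuleType


def import_optional(module_name: str, install_hint: str) -> ModuleType:
    """Import a module at call time, with an actionable error if it is missing."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise RuntimeError(
            f"{module_name} is required for this task but is not installed; "
            f"try: {install_hint}"
        ) from exc


def scan_blob(path: str):
    # Hypothetical task body: Scancode is imported only when the task runs,
    # so the rest of the package keeps working without it installed.
    # Assumption: scancode.api exposes get_licenses(location).
    scancode_api = import_optional("scancode.api", "pip install scancode-toolkit")
    return scancode_api.get_licenses(path)
```

With this approach, users who only build the citation dataset never need Scancode installed; only running the license-scanning task triggers the import.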

Resolves #4751 (closed)

Edited by Antoine R. Dumont
