Add task ListProvenanceNodes to dump nodes and their ids in .parquet files

vlorentz requested to merge ListProvenanceNodes into master

This will allow changing other tasks' output format to refer to nodes by their id instead of their SWHID. Ids take a lot less space (~5 bytes of entropy instead of ~21) and less time to process (no need to look up through the MPH+permutation when reading, or through node2swhid when writing).
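
For reference, a rough sketch of where the size figures come from. The node count below is an illustrative assumption, not a number from this MR, and `id_bytes` is a hypothetical helper; the SWHID size assumes a 1-byte object type plus a 20-byte SHA-1 digest.

```python
# Back-of-the-envelope size comparison: node id vs. SWHID.
# ASSUMED_NODE_COUNT is an illustrative assumption, not a measured value.
ASSUMED_NODE_COUNT = 40_000_000_000

# A SWHID carries an object type (assumed 1 byte) and a SHA-1 digest (20 bytes).
SWHID_BYTES = 1 + 20


def id_bytes(num_nodes: int) -> int:
    """Whole bytes needed to give each of num_nodes nodes a distinct id."""
    # Bits needed to represent the largest id, num_nodes - 1 (at least 1 bit).
    bits = max(1, (num_nodes - 1).bit_length())
    # Round up to whole bytes.
    return (bits + 7) // 8


if __name__ == "__main__":
    print(f"node id: ~{id_bytes(ASSUMED_NODE_COUNT)} bytes")
    print(f"SWHID:   ~{SWHID_BYTES} bytes")
```

Under these assumptions an id fits in 5 bytes against 21 for a SWHID, which matches the estimate above; the actual on-disk savings depend on the Parquet encoding and compression used.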
