misleading 100% known summary in sunburst rendering
Consider the following scenario:
  $ swh scanner scan -x '.git' -f ndjson scikit-learn | head -n 1
  {".": {"swhid": "swh:1:dir:1de41371de86ff66c85271ac410097531372b6d1", "known": true}}
  $ echo foo > scikit-learn/foo.txt
  $ swh scanner scan -x '.git' -f ndjson scikit-learn | head -n 1
  {".": {"swhid": "swh:1:dir:2699c6331bc22d604e524184ce2dd4340e3a1107", "known": false}}
  $ swh scanner scan -x '*.git' -f sunburst scikit-learn | head -n 1
The checkout of scikit-learn we initially scan is an archived commit, completely known to the archive. We then add one file (foo.txt) whose content is also known to the archive, but adding it makes the top-level //directory// unknown to the archive: the directory contains only known objects, but is itself, as a directory, unknown. Scanning works correctly, as the ndjson output shows, but the sunburst rendering shows "100.0%" as the percentage of known content for the root directory, which is misleading.
I believe this is because the 100% is computed only in terms of the number of //files// known/unknown.
One possible fix is counting in terms of nodes, which would include both files and directories, making the total lower than 100% in cases like this.
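To illustrate the difference between the two ways of counting, here is a minimal sketch. It does not use the swh-scanner data model: the `Node` class, the `known_percentage` helper, and the tiny example tree (including the file names under `sklearn`) are all hypothetical and only mirror the scenario above, where every file is known but the root directory is not.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Node:
    """Hypothetical scan-result node; NOT the swh-scanner data model."""
    name: str
    known: bool
    is_dir: bool = False
    children: List["Node"] = field(default_factory=list)


def walk(node: Node) -> Iterator[Node]:
    """Yield the node itself, then all of its descendants."""
    yield node
    for child in node.children:
        yield from walk(child)


def known_percentage(root: Node, count_dirs: bool) -> float:
    """Percentage of known nodes under root (root included).

    With count_dirs=False only files are counted (the suspected current
    behaviour); with count_dirs=True directories are counted as well.
    """
    nodes = [n for n in walk(root) if count_dirs or not n.is_dir]
    if not nodes:
        return 0.0
    return 100.0 * sum(n.known for n in nodes) / len(nodes)


# Illustrative tree: all files known, one sub-directory known,
# but the root directory itself unknown (as after adding foo.txt).
root = Node(".", known=False, is_dir=True, children=[
    Node("foo.txt", known=True),
    Node("sklearn", known=True, is_dir=True, children=[
        Node("__init__.py", known=True),
        Node("base.py", known=True),
    ]),
])

print(known_percentage(root, count_dirs=False))  # 100.0 (3 of 3 files)
print(known_percentage(root, count_dirs=True))   # 80.0 (4 of 5 nodes)
```

Counting only files reports 100% here, while counting all nodes reports 80%, because the unknown root directory lowers the total.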
Migrated from T3755 (view on Phabricator)