
Add random directory sampling policy

Raphaël Gomès requested to merge generated-differential-D8539-source into master

This makes use of the new discovery algorithm introduced in swh-loader-core, which should help speed up large scans (think the Linux kernel, or far larger).

Most of the time is spent walking the on-disk directory and hashing, which is where the new optimizations in swh-model==6.5.0 should come in handy. Python is close to its limit in that regard; a future effort should look into setting up SWH for native extensions.
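To make the cost described above concrete, here is a minimal sketch of the walk-and-hash work a scan has to do. It is not the swh-model or swh-scanner implementation: the helper names `git_blob_sha1` and `hash_tree` are hypothetical, and it only assumes that file contents are identified by a git-style blob SHA-1.

```python
import hashlib
import os
from typing import Dict


def git_blob_sha1(data: bytes) -> str:
    """Return the git-style blob SHA-1 of some file content.

    Hypothetical helper for illustration, not the swh-model API.
    """
    header = b"blob %d\0" % len(data)
    return hashlib.sha1(header + data).hexdigest()


def hash_tree(root: str) -> Dict[str, str]:
    """Walk `root` and map each regular file's relative path to its hash.

    Every file is read in full, which is why walking and hashing
    dominate the runtime on large trees.
    """
    digests: Dict[str, str] = {}
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # deterministic traversal order
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # symlinks are skipped in this sketch
            with open(path, "rb") as f:
                digests[os.path.relpath(path, root)] = git_blob_sha1(f.read())
    return digests
```

Calling `hash_tree(".")` on a checkout returns one digest per file. Any policy that avoids hashing or querying parts of the tree cuts directly into this loop's cost.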


Migrated from D8539 (view on Phabricator)
