Add random directory sampling policy
This makes use of the new discovery algorithm introduced in
swh-loader-core, which should help speed up large scans (think Linux
kernel or way larger).
Most of the time is spent walking the on-disk directory and hashing,
which is where the new optimizations in swh-model==6.5.0 should come
in handy. Python is close to its limit in that regard; some future
endeavor should look into setting up SWH for native extensions.
Migrated from D8539 (view on Phabricator)