to_disk: Use a BFS to recursively list a directory instead of a DFS

This makes it possible to queue new files for asynchronous download while sub-directories are still being fetched, which improves overall cooking performance.

It should also reduce the memory consumption of the cooking process.
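
The idea can be illustrated with a minimal sketch. This is not the actual `to_disk` code: the in-memory `TREE`, the helpers `list_directory` and `fetch_content`, and the use of a thread pool are all hypothetical stand-ins for the archive requests. It only shows how a queue-based BFS lets file downloads be submitted as soon as each directory is listed, while the remaining sub-directories wait in the queue.

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory tree standing in for the archive:
# a dict is a directory, bytes are a file's content.
TREE = {
    "README": b"readme",
    "src": {"main.c": b"int main;", "lib": {"util.c": b"util"}},
    "docs": {"index.rst": b"docs"},
}


def list_directory(node):
    """Hypothetical stand-in for one directory listing request."""
    return sorted(node.items(), key=lambda item: item[0])


def fetch_content(path, data):
    """Hypothetical stand-in for downloading one file."""
    return path, data


def bfs_cook(root, max_workers=4):
    """List a tree breadth-first, submitting file downloads as they are found."""
    queue = deque([("", root)])  # directories still to be listed
    visit_order = []             # order in which directories were listed
    futures = []                 # pending file downloads
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while queue:
            prefix, directory = queue.popleft()
            visit_order.append(prefix or ".")
            for name, entry in list_directory(directory):
                path = f"{prefix}/{name}" if prefix else name
                if isinstance(entry, dict):
                    # Sub-directory: defer its listing to a later iteration.
                    queue.append((path, entry))
                else:
                    # File: start the download now, without waiting for
                    # the sub-directories to be traversed.
                    futures.append(pool.submit(fetch_content, path, entry))
        files = dict(future.result() for future in futures)
    return visit_order, files


visit_order, files = bfs_cook(TREE)
```

In this sketch, every file of a directory is handed to the pool before any of its sub-directories is listed, which is the overlap between listing and downloading described above.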

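Below are the timings obtained when cooking the Linux kernel source tree. Wall-clock time drops by roughly a quarter (from about 14m20s to 10m34s), while user and system CPU time are essentially unchanged: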

  • When using a DFS:

    $ time swh vault cook -C /tmp/vault.yml swh:1:dir:44dde92e4dbd16f25c7ce50240bf53a7b753e7ad /tmp/dir.tar.gz

    real    14m19,757s
    user    6m14,742s
    sys     0m45,033s

  • When using a BFS:

    $ time swh vault cook -C /tmp/vault.yml swh:1:dir:44dde92e4dbd16f25c7ce50240bf53a7b753e7ad /tmp/dir.tar.gz

    real    10m34,473s
    user    6m11,210s
    sys     0m45,816s
