Redesign list-directory-with-max-leaf-timestamp to work in parallel

vlorentz requested to merge list-directory-with-max-leaf-timestamp into master

and stop writing timestamps for directories not contained by any head revision.

The previous algorithm loaded timestamps from contents and propagated them along a topological sort. This minimized writes to memory, but took half a day to run.

The new algorithm brute-forces instead: it traverses from every reachable content and atomically applies max_directory_ts = max(max_directory_ts, content_ts) to every parent directory that is also contained by a head revision. This takes 5 minutes.
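
To make the new approach concrete, here is a minimal Python sketch of the idea, not the code from this merge request. The function name `max_leaf_timestamps`, the `parents` / `content_ts` / `in_head` inputs, and the `NO_TIMESTAMP` sentinel are all hypothetical. A lock stands in for a hardware atomic max, and the sketch assumes that a directory outside every head revision has no ancestor inside one, so the upward walk can stop there.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

NO_TIMESTAMP = -1  # hypothetical sentinel: "no content seen yet"


def max_leaf_timestamps(parents, content_ts, in_head, num_workers=4):
    """Compute, for each directory in `in_head`, the max timestamp of the
    contents below it.

    parents:    dict mapping a node to the list of directories containing it
    content_ts: dict mapping each reachable content to its timestamp
    in_head:    set of directories contained by at least one head revision
    """
    max_directory_ts = {d: NO_TIMESTAMP for d in in_head}
    lock = threading.Lock()  # stand-in for an atomic fetch-max

    def visit(content):
        ts = content_ts[content]
        stack = [content]
        seen = {content}
        while stack:
            node = stack.pop()
            for parent in parents.get(node, ()):
                # Assumption: ancestors of a directory outside every head
                # revision are outside too, so the walk is pruned here.
                if parent in seen or parent not in in_head:
                    continue
                seen.add(parent)
                with lock:
                    # max_directory_ts = max(max_directory_ts, content_ts)
                    if ts > max_directory_ts[parent]:
                        max_directory_ts[parent] = ts
                stack.append(parent)

    # Each content is an independent traversal, so they can run in any order.
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        list(pool.map(visit, content_ts))
    return max_directory_ts
```

Because max is commutative and idempotent, the result does not depend on the order in which traversals finish, which is what lets the work be split across workers without a topological sort. The cost is that directories shared by many contents are visited many times.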