
common/archive: Avoid db timeouts in lookup_snapshot_sizes

When querying all branch aliases in a snapshot, the underlying database query can time out because it cannot make proper use of an index. This diff is intended to mitigate that issue.

A first commit moves the filtering of branches by target type to the client side, which avoids sending the costly request that might time out. Note that even though it may look as if all branches of a snapshot are iterated, most swh snapshots contain a single branch alias named HEAD, which is iterated first. Iteration stops once all aliases have been processed, so in that case it ends quickly.
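The early-stop idea can be sketched as follows. This is only an illustration, not the code in this diff: the `(name, target_type)` tuple shape, the `iter_alias_branches` helper and the assumption that the number of aliases is known beforehand are all hypothetical.

```python
from typing import Iterable, Iterator, Tuple

# Hypothetical branch shape: (branch name, target type).
Branch = Tuple[bytes, str]


def iter_alias_branches(
    branches: Iterable[Branch], alias_count: int
) -> Iterator[Branch]:
    """Yield alias branches, filtering client side.

    Stops consuming `branches` as soon as `alias_count` aliases have
    been seen, so a snapshot whose only alias is HEAD (iterated first)
    costs a single step.
    """
    if alias_count <= 0:
        return
    seen = 0
    for name, target_type in branches:
        if target_type == "alias":
            yield name, target_type
            seen += 1
            if seen == alias_count:
                # All aliases processed: no need to fetch more branches.
                return
```

Because the source is consumed lazily, any paginated branch listing behind `branches` would not be asked for further pages once the last alias is found.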

A second commit removes alias resolution from the lookup_snapshot_sizes function. From now on, branch aliases are resolved only when required for display.

A third commit caches the computed snapshot sizes to avoid sending the same set of database queries each time a page is loaded when browsing the archive in the context of a specific snapshot.
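A minimal sketch of the caching step, assuming a plain in-process dictionary stands in for whatever cache backend the web application actually uses. The `cached_snapshot_sizes` helper, the cache key format and the `compute` callback are hypothetical names chosen for illustration.

```python
from typing import Callable, Dict

# Hypothetical in-process cache; a real deployment would more likely
# rely on the web framework's cache backend.
_sizes_cache: Dict[str, Dict[str, int]] = {}


def cached_snapshot_sizes(
    snapshot_id: str,
    compute: Callable[[str], Dict[str, int]],
) -> Dict[str, int]:
    """Return branch counts per target type, computing them only once.

    `compute` stands in for the function that actually queries the
    database (assumption: it maps a snapshot id to a dict of counts).
    """
    cache_key = f"swh_snapshot_sizes_{snapshot_id}"  # hypothetical key format
    sizes = _sizes_cache.get(cache_key)
    if sizes is None:
        sizes = compute(snapshot_id)
        _sizes_cache[cache_key] = sizes
    # Return a copy so callers cannot mutate the cached value.
    return dict(sizes)
```

With this shape, repeated page loads for the same snapshot trigger the database queries only on the first call.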

Closes T2734

Depends on swh-storage!998 (closed)


Migrated from D4356 (view on Phabricator)
