09:16:44 +olasd ╡ in the storage backend, there's a provision to filter snapshot branches by target type; these queries aren't indexable properly because the table containing which branches have which type is completely separate from the join table between snapshots and branches09:17:11 ⤷ ╡ so really this filter should be removed altogether, or rewritten to force reading the full list of branches of the snapshot first (indexed query) then filtering out the results09:17:43 ⤷ ╡ (I'm not sure when this was introduced or why; it's much safer to do this filtering client-side)09:19:49 ⤷ ╡ what happened starting friday is that postgresql decided to use the deduplication index on snapshot branches to try to do the filtering of branches by type; so it's basically pulling all branches of type aliases, and matching that to the list of branches of the snapshot we're trying tolook at09:20:13 ⤷ ╡ I guess the amount of alias branches passed a threshold that made the planner change its course09:22:45 ⤷ ╡ (or the amount of queries for snapshots was low enough that the table got kicked off the cache somehow)
I have generated a flame graph for the django view rendering a directory in a snapshot context and we can clearly see that the lookup_snapshot_sizes function
puts quite a burden on the overall process time.
The performance issue comes from that piece of code that intends to match a branch alias to its target type (revision or release).
The purpose is to update branches and releases counters to get a consistent display.
As @olasd explained, the instruction that is slow to execute is the one that gets all branch aliases in a snapshot
if "alias" in snapshot_sizes: aliases = lookup_snapshot( snapshot_id, branches_count=snapshot_sizes["alias"], target_types=["alias"] )
For instance, for the latest snapshot of the linux kernel, if we look at the underlying PostgreSQL query analysis, we can see
that the execution time is not optimal at all:
softwareheritage=> explain analyze SELECT snapshot_id, name, target, target_type FROM swh_snapshot_get_by_id('\xc7beb2432b7e93c4cf6ab09cd194c7c1998df2f9'::bytea, '\x'::bytea, 2, ARRAY['alias'] :: snapshot_target[]); QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=134458.68..134458.71 rows=2 width=75) (actual time=413.101..413.105 rows=1 loops=1) -> Sort (cost=134458.68..134458.77 rows=34 width=75) (actual time=397.577..397.581 rows=1 loops=1) Sort Key: snapshot_branch.name Sort Method: quicksort Memory: 25kB InitPlan 1 (returns $0) -> Index Scan using snapshot_id_idx on snapshot (cost=0.57..2.59 rows=1 width=8) (actual time=0.019..0.021 rows=1 loops=1) Index Cond: ((id)::bytea = ('\xc7beb2432b7e93c4cf6ab09cd194c7c1998df2f9'::bytea)::bytea) -> Hash Join (cost=6541.89..134455.23 rows=34 width=75) (actual time=178.624..397.565 rows=1 loops=1) Hash Cond: (snapshot_branch.object_id = snapshot_branches.branch_id) -> Index Scan using snapshot_branch_target_type_target_name_idx on snapshot_branch (cost=0.70..127317.96 rows=227043 width=51) (actual time=0.024..355.077 rows=322338 loops=1) Index Cond: ((target_type = ANY ('{alias}'::snapshot_target[])) AND (name >= '\x'::bytea)) -> Hash (cost=5899.47..5899.47 rows=51338 width=8) (actual time=0.614..0.615 rows=1448 loops=1) Buckets: 65536 Batches: 1 Memory Usage: 569kB -> Index Only Scan using snapshot_branches_pkey on snapshot_branches (cost=0.58..5899.47 rows=51338 width=8) (actual time=0.040..0.420 rows=1448 loops=1) Index Cond: (snapshot_id = $0) Heap Fetches: 1448 Planning Time: 1.200 ms JIT: Functions: 19 Options: Inlining false, Optimization false, Expressions true, Deforming true Timing: Generation 4.928 ms, Inlining 0.000 ms, Optimization 1.257 ms, Emission 14.028 ms, Total 20.214 ms Execution Time: 418.197 ms(22 rows)
If we compare to the execution time of the query getting all branches without any filtering on the target type, this is night and day:
softwareheritage=> explain analyze SELECT snapshot_id, name, target, target_type FROM swh_snapshot_get_by_id('\xc7beb2432b7e93c4cf6ab09cd194c7c1998df2f9'::bytea, '\x'::bytea, 10000, NULL :: snapshot_target[]); QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=141412.79..141537.79 rows=10000 width=75) (actual time=18.965..19.115 rows=1448 loops=1) -> Sort (cost=141412.79..141541.14 rows=51338 width=75) (actual time=5.487..5.554 rows=1448 loops=1) Sort Key: snapshot_branch.name Sort Method: quicksort Memory: 252kB InitPlan 1 (returns $0) -> Index Scan using snapshot_id_idx on snapshot (cost=0.57..2.59 rows=1 width=8) (actual time=0.026..0.028 rows=1 loops=1) Index Cond: ((id)::bytea = ('\xc7beb2432b7e93c4cf6ab09cd194c7c1998df2f9'::bytea)::bytea) -> Nested Loop (cost=1.15..137393.58 rows=51338 width=75) (actual time=0.074..4.782 rows=1448 loops=1) -> Index Only Scan using snapshot_branches_pkey on snapshot_branches (cost=0.58..5899.47 rows=51338 width=8) (actual time=0.057..0.428 rows=1448 loops=1) Index Cond: (snapshot_id = $0) Heap Fetches: 1448 -> Index Scan using snapshot_branch_pkey on snapshot_branch (cost=0.57..2.56 rows=1 width=51) (actual time=0.003..0.003 rows=1 loops=1448) Index Cond: (object_id = snapshot_branches.branch_id) Filter: (name >= '\x'::bytea) Planning Time: 1.276 ms JIT: Functions: 15 Options: Inlining false, Optimization false, Expressions true, Deforming true Timing: Generation 4.082 ms, Inlining 0.000 ms, Optimization 0.978 ms, Emission 12.285 ms, Total 17.345 ms Execution Time: 23.378 ms(20 rows)
So getting all branches and performing filtering on target types client side seems a better approach here to avoid
a costly request that might timeout.