Storage: Enable to paginate, filter and count snapshot content

Rebase to master and fix a couple of typos

Thanks!

I have made a few comments inline.

I also have a more general concern about the interface: I don't like multiplying the arguments to all functions that access snapshots. I'd rather we limit the results to a sensible value by default (e.g. the first 100 branches), and give callers the information that more data is available.

The current functions would be unchanged: they would return a snapshot with its id and a list of branches. We would just add one field to the return value, next_branch, defaulting to None, that the caller would have to check to see whether the list of branches was complete.

The full scrolling list of branches would then be available through a single (new) endpoint referencing the snapshot id and the first branch to fetch:

def snapshot_get_branches(id, first_branch=None, count=None, target_types=None):
   ...

This keeps the API more regular and avoids us doing more joins than necessary on the backend when a client wants the full list of branches for a given snapshot.
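To make the proposal concrete, here is a minimal sketch of how a caller could scroll through a snapshot by following next_branch. Everything except the snapshot_get_branches signature and the next_branch field is an assumption for illustration: the in-memory SNAPSHOTS dict, the DEFAULT_COUNT constant, the iter_all_branches helper and the use of plain strings for branch names are all hypothetical, not the actual storage code.

```python
# Hypothetical default page size, taken from the "first 100 branches" example.
DEFAULT_COUNT = 100

# Hypothetical in-memory store: snapshot id -> {branch name: target type}.
SNAPSHOTS = {
    "snp1": {
        f"refs/heads/b{i:03d}": ("revision" if i % 2 else "release")
        for i in range(250)
    },
}


def snapshot_get_branches(id, first_branch=None, count=None, target_types=None):
    """Return one page of branches, plus the name of the next branch if any."""
    branches = SNAPSHOTS[id]
    names = sorted(
        name
        for name, target_type in branches.items()
        if (first_branch is None or name >= first_branch)
        and (target_types is None or target_type in target_types)
    )
    limit = DEFAULT_COUNT if count is None else count
    page, rest = names[:limit], names[limit:]
    return {
        "id": id,
        "branches": {name: branches[name] for name in page},
        # None signals that the list of branches is complete.
        "next_branch": rest[0] if rest else None,
    }


def iter_all_branches(id, target_types=None):
    """Hypothetical caller-side helper: follow next_branch until it is None."""
    first_branch = None
    while True:
        result = snapshot_get_branches(id, first_branch, target_types=target_types)
        yield from result["branches"].items()
        first_branch = result["next_branch"]
        if first_branch is None:
            break
```

A caller that only needs the first page ignores next_branch; one that needs everything loops as in iter_all_branches, so the extra arguments live on a single endpoint.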

Merge request was returned for changes

Some references in the commit message have been migrated:

T1207 is now swh-web#1207 (closed)

First diff update addressing @olasd comments:

- Remove the use of offset in the SQL query used to return snapshot branches.
- Do not modify the parameter lists of the methods snapshot_get, snapshot_get_by_origin_visit and snapshot_get_latest.
- The snapshot content returned by these methods now only contains the first 1000 branches. This seems a good default value imho, as I did not notice any performance issues and it should return the whole set of branches for the majority of snapshots. A new field called next_branch is now present in the returned dict, which may contain the name of the first branch not returned.
- Add method snapshot_get_branches with the following optional parameters:
  - branches_from: skip branches whose name is less than this value
  - branches_count: maximum number of branches to return
  - target_types: list of target types to return, filtering out the others
- Rename method snapshot_branches_count to snapshot_count_branches.
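As an illustration of paginating without offset, the sketch below filters on the branch name and fetches one extra row to find next_branch. It is not the actual storage implementation: the snapshot_branch table layout, the use of SQLite (via the standard sqlite3 module) rather than the real backend, the conn argument and the string-typed branch names are all assumptions made to keep the example self-contained.

```python
import sqlite3


def snapshot_get_branches(conn, snapshot_id, branches_from="",
                          branches_count=1000, target_types=None):
    """Return up to branches_count branches whose name is >= branches_from."""
    query = (
        "SELECT name, target_type FROM snapshot_branch "
        "WHERE snapshot_id = ? AND name >= ?"
    )
    params = [snapshot_id, branches_from]
    if target_types:
        # One placeholder per requested target type.
        placeholders = ", ".join("?" for _ in target_types)
        query += f" AND target_type IN ({placeholders})"
        params.extend(target_types)
    # Fetch one row more than requested: if it exists, it is the next branch.
    query += " ORDER BY name LIMIT ?"
    params.append(branches_count + 1)

    rows = conn.execute(query, params).fetchall()
    page, extra = rows[:branches_count], rows[branches_count:]
    return {
        "id": snapshot_id,
        "branches": dict(page),
        "next_branch": extra[0][0] if extra else None,
    }


# Hypothetical schema and sample data, for demonstration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE snapshot_branch (snapshot_id TEXT, name TEXT, target_type TEXT)"
)
conn.executemany(
    "INSERT INTO snapshot_branch VALUES (?, ?, ?)",
    [("snp1", f"branch-{i:02d}", "revision" if i % 3 else "release")
     for i in range(10)],
)
```

Because the lower bound is a branch name rather than a row offset, the database can seek directly to the start of the page on an index over (snapshot_id, name) instead of scanning and discarding the skipped rows.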

Merge request was accepted

approved this merge request

Merge request was merged

closed
