Implement an object storage scrubber
In order to check the contents stored in object storages can be found and are valid, we need to do the following:
- Get the full list of content hashes referenced in SWH storages using kafka or partitions of contents from storage API
- For each content:
- check its existence in all object storages (S3, azure, disk based, winery)
- download its bytes, recompute the hashes and check we obtain the same
Related to https://gitlab.softwareheritage.org/product-management/core-platform/-/work_items/23
Edited by Antoine Lambert