Update existing contents with new hash blake2s256
Leveraging azure infrastructure, trigger the blake2s256 update on the existing contents.
This means:
- Provisioning azure vms (sizing -> DS2_V2: 7GB ram, 14GB ssd disk, 2 cores; 85.33E/month) -> for now 2 vms
- code: configuration composability on storage read/write and objstorage readings adaptation
- puppet: swh_indexer_rehash puppetization
- Deploying the swh.indexer.rehash module (+ fix bits and pieces along the way)
- Compute list of sha1s to rehash from swh.content table (IN PROGRESS in uffizi:/srv/storage/space/lists/contents-sha1-to-rehash.txt.gz).
- Send all contents to the swh_indexer_rehash queue
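As a rough illustration of the last two steps, here is a minimal sketch of computing a blake2s256 digest and grouping sha1s into batches before they are sent to a queue. It only uses Python's standard `hashlib`; the helper names `blake2s256` and `batches` are hypothetical and are not taken from swh.indexer.rehash, and the actual queue submission is left out.

```python
import hashlib
from itertools import islice


def blake2s256(data: bytes) -> str:
    """Return the hex BLAKE2s digest of data with a 256-bit (32-byte) output."""
    return hashlib.blake2s(data, digest_size=32).hexdigest()


def batches(iterable, size):
    """Yield successive lists of at most `size` items (hypothetical helper)."""
    it = iter(iterable)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


if __name__ == "__main__":
    # Placeholder identifiers; the real list would come from the
    # contents-sha1-to-rehash file mentioned above.
    sha1s = ["sha1-%d" % i for i in range(5)]
    for batch in batches(sha1s, 2):
        # One queue message per batch would be sent here.
        print(batch)
    print(blake2s256(b"some content"))
```

Batching keeps the number of queue messages manageable; the batch size used here is arbitrary.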
Note: Regarding the storage stack to use, we can:
- either use azure's objstorage (the copy is 'complete', as in the snapshot copy). This will be the starting point.
- or, if the cost projection is too high, use uffizi's objstorage (or banco), as azure's in-transit cost is null.
- or use a multiplexer objstorage with azure as the initial objstorage, falling back to banco if the object is not found there, then to uffizi if it is still not found (solution used)
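The fallback behaviour of the multiplexer option can be sketched as follows. This is only an illustration under assumptions: the classes `ObjNotFound`, `DictObjStorage` and `FallbackObjStorage` are hypothetical in-memory stand-ins, not the swh.objstorage API, and real backends would read from azure, banco and uffizi instead of dictionaries.

```python
class ObjNotFound(Exception):
    """Raised when an object id is absent from a storage (hypothetical)."""


class DictObjStorage:
    """In-memory stand-in for one objstorage backend (hypothetical)."""

    def __init__(self, name, objects):
        self.name = name
        self.objects = dict(objects)

    def get(self, obj_id):
        try:
            return self.objects[obj_id]
        except KeyError:
            raise ObjNotFound(obj_id) from None


class FallbackObjStorage:
    """Try each backend in order and return the first hit."""

    def __init__(self, storages):
        self.storages = list(storages)

    def get(self, obj_id):
        for storage in self.storages:
            try:
                return storage.get(obj_id)
            except ObjNotFound:
                continue  # fall back to the next backend
        raise ObjNotFound(obj_id)


# Order matters: azure first, then banco, then uffizi.
azure = DictObjStorage("azure", {"a": b"from-azure"})
banco = DictObjStorage("banco", {"a": b"stale", "b": b"from-banco"})
uffizi = DictObjStorage("uffizi", {"c": b"from-uffizi"})
multiplexer = FallbackObjStorage([azure, banco, uffizi])
```

With this ordering, an object present in azure is never looked up in banco or uffizi, so the slower backends are only touched for objects missing from the azure copy.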
Migrated from T712 (view on Phabricator)