Cassandra as a storage backend
Meta task to get my thoughts in order about adding a Cassandra backend to swh-storage (started in April).
- have a draft implementation https://forge.softwareheritage.org/source/swh-storage-cassandra/
- benchmark to check that performance is not catastrophic https://forge.softwareheritage.org/source/storage-benchmark-deployment/
- increase test coverage of all behaviors of swh-storage (D1534 to D1552)
- numeric origin ids
  - define a replacement T1731
  - get rid of numeric origin ids in all storage clients #1816 (closed)
    - non-swh-web clients
    - swh-web
      - queries by origin-id swh-web!182 (closed)
      - paginated queries T1912
      - public API v2 swh-web#1805 (postponed)
- Add the draft Cassandra backend to the docker env
- Run the draft Cassandra backend with production data
- Rewrite the Cassandra backend using the experience gained from working on the draft
- Add it to the docker env
- Write a storage proxy component that queries the two backends (postgres and cassandra) and compares their results to check they are the same, and run it in the docker env. This will make sure migrating to Cassandra does not introduce regressions
- Run it with production data
- Deploy in production (possibly with the proxy at first)
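
To make the storage proxy step more concrete, here is a minimal sketch of what such a component could look like. It is only an illustration under stated assumptions, not the actual swh-storage code: the class name `ComparingStorageProxy`, the `strict` flag, and the toy `DictStorage` backend with its `content_add`/`content_get` methods are all hypothetical. The proxy forwards every method call to both backends, records any call whose results differ, and returns the result of the primary backend.

```python
import logging

logger = logging.getLogger(__name__)


class ComparingStorageProxy:
    """Hypothetical proxy: forwards each call to two backends and compares results."""

    def __init__(self, primary, secondary, strict=False):
        self.primary = primary      # e.g. the postgres-backed storage
        self.secondary = secondary  # e.g. the cassandra-backed storage
        self.strict = strict        # if True, raise on the first mismatch
        self.mismatches = []        # (method name, args, kwargs) of differing calls

    def __getattr__(self, name):
        # Only invoked for attributes not found on the proxy itself.
        if name.startswith("_"):
            raise AttributeError(name)
        primary_method = getattr(self.primary, name)
        secondary_method = getattr(self.secondary, name)

        def call_both(*args, **kwargs):
            expected = primary_method(*args, **kwargs)
            actual = secondary_method(*args, **kwargs)
            if expected != actual:
                self.mismatches.append((name, args, kwargs))
                logger.warning("backends disagree on %s%r", name, args)
                if self.strict:
                    raise AssertionError(f"backends disagree on {name}")
            # The primary backend stays authoritative for the caller.
            return expected

        return call_both


class DictStorage:
    """Toy in-memory backend, used only to exercise the proxy."""

    def __init__(self):
        self._contents = {}

    def content_add(self, key, data):
        self._contents[key] = data
        return len(self._contents)

    def content_get(self, key):
        return self._contents.get(key)


if __name__ == "__main__":
    proxy = ComparingStorageProxy(DictStorage(), DictStorage())
    proxy.content_add("some-id", b"hello")
    print(proxy.content_get("some-id"), proxy.mismatches)
```

This sketch compares results with `!=`, so it assumes backend methods return plain comparable values such as lists or dicts. Methods that return generators, or results whose ordering is not guaranteed, would need to be normalized (for example materialized and sorted) before comparison.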
Migrated from T1892 (view on Phabricator)