[cassandra] POC the cross replication pg<->cassandra infrastructure in staging
To test the feasibility of the pg<->cassandra cross replication in a near-real environment, we should deploy the POC in staging.
Target:
- Kafka -> cassandra replayers (already done to populate the cassandra backend)
- Kafka -> postgresql replayers (to insert in postgresql the objects loaded by the workers using cassandra as a backend)
TODO:
- Extend the replayers helm configuration to support a postgresql configuration and a filtered configuration with a retry step and a filter step
- Create a new privileged kafka user (https://docs.softwareheritage.org/sysadm/deployment/howto-add-journal-user-credential.html#deployment-howto-add-journal-user-credential)
- Bootstrap the consumer groups to start at the end of the topics, as postgresql is already up to date
- Deploy the postgresql replayers
- Possibly deploy new nodes or resize the current ones to handle the new load (should not be necessary at the beginning)
- Monitor the behavior of the workers and replayers (not too many errors in sentry regarding HashCollisions, exceptions, or slowness)
- For staging, test with more workers to check the behavior under load
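
The "bootstrap the consumer groups" step could be sketched as below. This is only a minimal illustration, not the procedure actually used: it assumes the `confluent_kafka` Python client, and the broker address, group id, and topic name in `main()` are hypothetical placeholders (authentication settings for the privileged user are omitted). The idea is to commit each partition's current end offset (high watermark) for the new group before any replayer joins it, so the postgresql replayers only see objects produced from that point on.

```python
"""Sketch: make a new consumer group start at the end of its topics."""


def bootstrap_group_at_end(consumer, topic_partition_cls, topics, timeout=10.0):
    """Commit the current end offset of every partition for the consumer's group.

    `consumer` is expected to behave like confluent_kafka.Consumer and
    `topic_partition_cls` like confluent_kafka.TopicPartition.
    Returns a dict {(topic, partition): committed_offset}.
    """
    committed = {}
    for topic in topics:
        metadata = consumer.list_topics(topic, timeout=timeout)
        for partition in sorted(metadata.topics[topic].partitions):
            # The high watermark is the offset the next produced message will get.
            _low, high = consumer.get_watermark_offsets(
                topic_partition_cls(topic, partition), timeout=timeout
            )
            # Synchronous commit, so a failure is raised here and not lost.
            consumer.commit(
                offsets=[topic_partition_cls(topic, partition, high)],
                asynchronous=False,
            )
            committed[(topic, partition)] = high
    return committed


def main():
    # Imported here so the helper above can be exercised without the client.
    from confluent_kafka import Consumer, TopicPartition

    consumer = Consumer(
        {
            # Hypothetical values, to adapt to the staging journal.
            "bootstrap.servers": "kafka.staging.example:9092",
            "group.id": "example-postgresql-replayer",
            "enable.auto.commit": False,
        }
    )
    try:
        offsets = bootstrap_group_at_end(
            consumer, TopicPartition, ["example.objects.content"]
        )
        for (topic, partition), offset in sorted(offsets.items()):
            print(f"{topic}[{partition}] -> {offset}")
    finally:
        consumer.close()
```

Calling `main()` runs the sketch against a real broker. It should be run while the group has no active members (that is, before the replayers are deployed), since Kafka generally rejects such commits for a group that is currently consuming.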
Edited by Vincent Sellier