diff --git a/docs/sysadm/data-silos/cassandra/installation.rst b/docs/sysadm/data-silos/cassandra/installation.rst
index 231aaec377c9d37c3878835c2ac00df63f1f49e0..598e07ea1dc2bbc47bab1eb36274c6fe04e9c9ff 100644
--- a/docs/sysadm/data-silos/cassandra/installation.rst
+++ b/docs/sysadm/data-silos/cassandra/installation.rst
@@ -57,12 +57,133 @@ System installation
 
 - Check the configuration looks correct and start the instance(s) with `systemctl start cassandra@<instance>`
 
+Cassandra configuration
+-----------------------
+
+This section explains how to configure the keyspaces and roles needed by swh.
+
+Cassandra needs to be configured with authentication and authorization enabled. The following options
+must be present in the `cassandra.yaml` file:
+
+::
+
+   authenticator: PasswordAuthenticator
+   authorizer: CassandraAuthorizer
+
+Several users are used:
+
+- `swh-rw`: The main user used by swh-storage to manage the content in the database
+- `swh-ro`: A read-only user used for read-only storages (webapp, ...) or humans
+- `reaper`: A read-write user on the `reaper_db` keyspace. `Reaper <http://cassandra-reaper.io/>`_ is the tool in charge of managing the repairs
+
+The commands below use the staging environment as an example. The configuration is for a medium
+data volume, with a replication factor (RF) of 3. Adapt it to your own needs.
+
+
+1. Create the keyspaces so that access rights can be configured
+
+
+::
+
+   CREATE KEYSPACE swh WITH replication = {'class': 'NetworkTopologyStrategy', 'sesi_rocquencourt_staging': '3'} AND durable_writes = true;
+   -- If needed
+   CREATE KEYSPACE reaper_db WITH replication = {'class': 'NetworkTopologyStrategy', 'sesi_rocquencourt_staging': '3'} AND durable_writes = true;
+
+
+2. Alter the `system_auth` keyspace replication to prepare for authenticated access
+
+Run the `ALTER KEYSPACE` statement in cqlsh and the other commands in a shell (from https://cassandra.apache.org/doc/latest/cassandra/operating/security.html#password-authentication)
+
+::
+
+   export PASS=<your jmx password>
+   ALTER KEYSPACE system_auth WITH replication = {'class': 'NetworkTopologyStrategy', 'sesi_rocquencourt_staging': 3};
+   seq 1 3 | xargs -t -I{} /opt/cassandra/bin/nodetool -h cassandra{} -u cassandra --password "$PASS" repair -j4 system_auth
+
+
+3. Create a new `admin` superuser
+
+In cqlsh (the default admin user is `cassandra`/`cassandra`):
+
+::
+
+   CREATE ROLE admin WITH SUPERUSER = true AND LOGIN = true AND PASSWORD = 'changeme';
+
+4. Disable the default superuser
+
+Connect to cqlsh with the new `admin` user:
+
+::
+
+   ALTER ROLE cassandra WITH SUPERUSER = false AND LOGIN = false;
+
+
+5. Create the `swh-rw` user
+
+::
+
+   CREATE ROLE 'swh-rw' WITH LOGIN = true AND PASSWORD = 'changeme';
+   GRANT CREATE ON ALL KEYSPACES TO 'swh-rw';
+   GRANT CREATE ON ALL FUNCTIONS TO 'swh-rw';
+   GRANT ALTER ON ALL FUNCTIONS TO 'swh-rw';
+   GRANT SELECT ON KEYSPACE swh TO 'swh-rw';
+   GRANT MODIFY ON KEYSPACE swh TO 'swh-rw';
+
+6. Create the `swh-ro` user
+
+::
+
+   CREATE ROLE 'swh-ro' WITH LOGIN = true AND PASSWORD = 'changeme';
+   GRANT SELECT ON KEYSPACE swh TO 'swh-ro';
+
+7. Create the `reaper` user
+
+::
+
+   CREATE ROLE 'reaper' WITH LOGIN = true AND PASSWORD = 'changeme';
+   GRANT CREATE ON ALL KEYSPACES TO 'reaper';
+   GRANT SELECT ON KEYSPACE reaper_db TO 'reaper';
+   GRANT MODIFY ON KEYSPACE reaper_db TO 'reaper';
+
+8. Specific table configurations
+
+The table compaction and compression strategies depend on the hardware topology Cassandra is deployed on.
+For the high density servers used by swh, these specific configurations are used:
+
+- LCS (`LeveledCompactionStrategy`) compaction on big tables to reduce the free disk space needed by compactions
+- ZSTD compression on big tables to optimize the disk space
+
+.. warning:: These configurations can only be applied once the swh-storage schema has been created by the storage
+
+- In staging
+
+::
+
+   ALTER TABLE content WITH
+     compaction = {'class' : 'LeveledCompactionStrategy', 'sstable_size_in_mb':'160'}
+     AND compression = {'class': 'ZstdCompressor', 'compression_level':'1'};
+   ALTER TABLE directory_entry WITH
+     compaction = {'class' : 'LeveledCompactionStrategy', 'sstable_size_in_mb':'4096'}
+     AND compression = {'class': 'ZstdCompressor', 'compression_level':'1'};
+
+- In production
+
+::
+
+   ALTER TABLE content WITH
+     compaction = {'class' : 'LeveledCompactionStrategy', 'sstable_size_in_mb':'2000'}
+     AND compression = {'class': 'ZstdCompressor', 'compression_level':'1'};
+   ALTER TABLE directory_entry WITH
+     compaction = {'class' : 'LeveledCompactionStrategy', 'sstable_size_in_mb':'20480'}
+     AND compression = {'class': 'ZstdCompressor', 'compression_level':'1'};
+
+
 Monitoring
-^^^^^^^^^^
+----------
 
 TODO
 
 Metric
-^^^^^^
+------
 
 TODO