diff --git a/docs/sysadm/data-silos/cassandra/index.rst b/docs/sysadm/data-silos/cassandra/index.rst
index 6b0513731f567e7a6966a4a6115372a956e8f940..4265658d76d9fab631a44237d4fa967be2ddb160 100644
--- a/docs/sysadm/data-silos/cassandra/index.rst
+++ b/docs/sysadm/data-silos/cassandra/index.rst
@@ -8,3 +8,4 @@ Cassandra
 .. toctree::
 
    installation
+   upgrade
diff --git a/docs/sysadm/data-silos/cassandra/upgrade.rst b/docs/sysadm/data-silos/cassandra/upgrade.rst
new file mode 100644
index 0000000000000000000000000000000000000000..a414e4e2e334f0f608bbbd7778810697b49c4707
--- /dev/null
+++ b/docs/sysadm/data-silos/cassandra/upgrade.rst
@@ -0,0 +1,110 @@
+.. _cassandra_upgrade_cluster:
+
+How to upgrade a cassandra cluster
+==================================
+
+.. admonition:: Intended audience
+   :class: important
+
+   sysadm staff members
+
+
+This page documents the steps to upgrade a cassandra cluster. The overall
+plan is to upgrade each node of the cluster one at a time.
+
+.. - Prepare the puppet configuration
+
+Puppet configuration
+--------------------
+
+Our cassandra clusters are managed through puppet.
+
+As we need to perform a rolling upgrade of the nodes, we want to stop the puppet
+agent from running.
+
+So first, connect to pergamon and trigger a puppet agent test (just in case
+some pending actions need to be applied), then disable the puppet agent.
+
+
+.. code-block:: shell
+
+   # 'puppet agent --test' exits with 2 when it applied changes: chain with ';'
+   $ clush @staging-nodes 'puppet agent --test; \
+       puppet agent --disable "Upgrade to cassandra"'
+
+Then identify the desired new version and retrieve its sha512 hash.
+
+https://archive.apache.org/dist/cassandra/4.0.15/apache-cassandra-4.0.15-bin.tar.gz
+https://archive.apache.org/dist/cassandra/4.0.15/apache-cassandra-4.0.15-bin.tar.gz.sha512
+
+In the swh-site repository, adapt the environment's common.yaml file with
+those values:
+
+.. code-block:: shell
+
+   $ echo $environment
+   staging
+   $ grep "cassandra::" .../swh-site/data/deployments/$environment/common.yaml
+   cassandra::version: 4.0.15
+   cassandra::version_checksum: 9368639fe07613995fec2d50de13ba5b4a2d02e3da628daa1a3165aa009e356295d7f7aefde0dedaab385e9752755af8385679dd5f919902454df29114a3fcc0
+
+Commit and push the changes.
+
+Connect to pergamon and deploy those changes.
+
+Then connect to each machine of the cluster in turn, in any order (lexicographic
+order is fine).
+
+We need nodetool access, so here is a simple alias to simplify the
+commands (used in the rest of this page).
+
+.. code-block:: shell
+
+   $ JMX_USER=$(awk '{print $1}' /etc/cassandra/jmxremote.password)
+   $ JMX_PASS=$(awk '{print $2}' /etc/cassandra/jmxremote.password)
+   $ alias nodetool="/opt/cassandra/bin/nodetool --username $JMX_USER --password $JMX_PASS"
+
+
+From another node in the cluster, connect and check that the cluster status
+stays fine during the migration.
+
+.. code-block:: shell
+
+   $ period=10; while true; do \
+       date; nodetool status -r; echo; nodetool netstats; sleep $period; \
+     done
+
+
+Now we can drain the node. This flushes its memtables to disk and makes it stop
+accepting writes, so we can stop it without losing data.
+
+.. code-block:: shell
+
+   $ nodetool drain
+
+
+Then stop the cassandra service.
+
+.. code-block:: shell
+
+   $ systemctl stop cassandra@instance1
+
+
+Finally, upgrade the cassandra version on the node (through puppet):
+
+.. code-block:: shell
+
+   $ puppet agent --enable && puppet agent --test
+
+Start the cassandra service again:
+
+.. code-block:: shell
+
+   $ systemctl start cassandra@instance1
+
+Check that the node now runs the new version:
+
+.. code-block:: shell
+
+   $ nodetool version
+   ReleaseVersion: $version
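
As a follow-up to the final ``nodetool version`` check, a small helper could compare the reported release with the version set in ``common.yaml``. This is only a sketch: the function name ``check_release_version`` is hypothetical and not part of the original procedure, and it assumes the output contains a line of the form ``ReleaseVersion: <version>`` as shown above.

```shell
# Hypothetical helper (an assumption, not part of the documented procedure):
# reads `nodetool version` output on stdin and compares the reported
# release with the expected version passed as first argument.
check_release_version() {
    expected="$1"
    # Keep only the second field of the "ReleaseVersion: X" line.
    actual=$(awk '/^ReleaseVersion:/ {print $2}')
    if [ "$actual" = "$expected" ]; then
        echo "OK: node runs cassandra $actual"
    else
        echo "MISMATCH: expected $expected, got ${actual:-nothing}" >&2
        return 1
    fi
}
```

On an upgraded node, with the nodetool alias defined earlier, this could be used as ``nodetool version | check_release_version 4.0.15``. A non-zero return code signals that the node still reports another version.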