False alerts related to unreachable Cassandra nodes
<swhprombot> Alert WARNING resolved - production/archive-production-rke2 - Cassandra_Degraded_Service_In_Production - The cassandra01.internal.softwareheritage.org:7070 node is unreachable for more than 15 minutes. This node seems down.
For the past few hours (days?), monitoring has been triggering this alert for a couple of Cassandra nodes.
The nodes are up, and the Cassandra failure detector sees them as healthy:
cassandra01 ~ % /opt/cassandra/bin/nodetool -u cassandra -pwf ~/.cassandra/password status
Datacenter: sesi_rocquencourt
=============================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address          Load       Tokens  Owns (effective)  Host ID                               Rack
UN  192.168.100.184  10.21 TiB  16      22.8%             9c618479-7898-4d89-a8e0-dc1a23fce04e  rack1
UN  192.168.100.181  10.13 TiB  16      22.7%             cb0695ee-b7f1-4b31-ba5e-9ed7a068d993  rack1
UN  192.168.100.186  10.34 TiB  16      23.1%             557341c9-dc0c-4a37-99b3-bc71fb46b29c  rack1
UN  192.168.100.188  10.27 TiB  16      23.0%             247cd9e3-a70c-465c-bca1-ea9d3af9609a  rack1
UN  192.168.100.183  10.39 TiB  16      23.3%             4cc44367-67dc-41ea-accf-4ef8335eabad  rack1
UN  192.168.100.191  10.47 TiB  16      23.6%             1199974f-9f03-4cc8-8d63-36676d00d53f  rack1
UN  192.168.100.190  10.28 TiB  16      22.8%             f39713c4-d78e-4306-91dd-25a8b276b868  rack1
UN  192.168.100.185  10.13 TiB  16      22.8%             ac5e4446-9b26-43e4-8203-b05cb34f2c35  rack1
UN  192.168.100.193  10.3 TiB   16      23.3%             3681f79c-8d09-4e70-94f5-cfe1fbdb155d  rack1
UN  192.168.100.189  10.42 TiB  16      23.3%             e635af9a-3707-4084-b310-8cde61647a6e  rack1
UN  192.168.100.192  10.31 TiB  16      23.3%             563d9f83-7ab4-41a2-95ff-d6f2bfb3d8ba  rack1
UN  192.168.100.182  10.32 TiB  16      23.1%             a3c89490-ee69-449a-acb1-c2aa6b3d6c71  rack1
UN  192.168.100.187  10.16 TiB  16      22.8%             0b7b2a1f-1403-48a8-abe1-65734cc02622  rack1
cassandra01 ~ % /opt/cassandra/bin/nodetool -u cassandra -pwf ~/.cassandra/password failuredetector
Endpoint, Phi
/192.168.100.182, 0.21285505
/192.168.100.183, 0.26325178
/192.168.100.192, 0.25650386
/192.168.100.193, 0.21311049
/192.168.100.184, 0.68138563
/192.168.100.185, 0.34601119
/192.168.100.186, 0.35046757
/192.168.100.187, 0.21111659
/192.168.100.188, 0.26417840
/192.168.100.189, 0.26448508
/192.168.100.190, 0.21111687
/192.168.100.191, 0.21411123
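
To cross-check an alert against what Cassandra itself reports, the manual failuredetector check above could be scripted. This is only a minimal sketch: it assumes the `Endpoint, Phi` output format shown above and compares against a threshold of 8, which is the default `phi_convict_threshold` in Cassandra (the value actually configured on these nodes is not shown in this issue). The function names are hypothetical, and comparing the reported phi directly to the threshold is a simplification of what the failure detector does internally.

```python
# Assumption: Cassandra's default phi_convict_threshold; not read from this cluster's config.
PHI_CONVICT_THRESHOLD = 8.0


def parse_failuredetector(output):
    """Parse `nodetool failuredetector` output into {endpoint: phi}."""
    phis = {}
    for line in output.splitlines():
        line = line.strip()
        if not line or line.startswith("Endpoint"):
            continue  # skip blank lines and the "Endpoint, Phi" header
        endpoint, _, phi = line.partition(",")
        # Endpoints are printed with a leading "/", e.g. "/192.168.100.182"
        phis[endpoint.strip().lstrip("/")] = float(phi)
    return phis


def suspicious_endpoints(phis, threshold=PHI_CONVICT_THRESHOLD):
    """Return the endpoints whose phi exceeds the threshold, sorted."""
    return sorted(ep for ep, phi in phis.items() if phi > threshold)


# A few lines taken from the output pasted above.
SAMPLE = """\
Endpoint, Phi
/192.168.100.182, 0.21285505
/192.168.100.184, 0.68138563
/192.168.100.185, 0.34601119
"""

if __name__ == "__main__":
    phis = parse_failuredetector(SAMPLE)
    print("max phi:", max(phis.values()))
    print("suspicious endpoints:", suspicious_endpoints(phis))
```

With the values pasted above, every phi is below 1, so no endpoint would be flagged under this assumption, which is consistent with the alerts being false positives on the monitoring side.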