[cassandra] Evaluate changing the compression of the directory_entry table from lz4 to zstd
The page dedicated to data compression explains that zstd should have a better compression ratio than lz4 at the price of a little overhead: https://cassandra.apache.org/doc/latest/cassandra/operating/compression.html
The current ratio is ~0.6 in production and ~0.7 in staging:
cassandra01 ~ % /opt/cassandra/bin/nodetool -h cassandra01 -u cassandra --password [redacted] tablestats swh.directory_entry | grep -i -e table: -e "compression ratio"
Table: directory_entry
SSTable Compression Ratio: 0.6141515791511662
cassandra1.staging ~ % /opt/cassandra/bin/nodetool -u cassandra --password [redacted] tablestats swh.directory | grep -i -e table: -e "compression ratio"
Table: directory
SSTable Compression Ratio: 0.7396605961670907
Staging could be used to evaluate the gain or loss from changing the compression. The compression level can also be evaluated. The default level is 3 on a range of 0 to 22.
According to the documentation, zstd level 1 is the equivalent of lz4.
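As a starting point for the staging test, here is a minimal sketch of how the compression change could be applied. The helper name `zstd_alter_statement` is hypothetical, and the keyspace and table names are assumed from the nodetool output above. The `ZstdCompressor` class and its `compression_level` option are the ones described on the compression page.

```shell
#!/bin/sh
# Hypothetical helper: print the CQL statement that switches a table to zstd.
# Assumption: the table is addressed as <keyspace>.<table>, e.g. swh.directory_entry.
zstd_alter_statement() {
    table="$1"   # fully qualified table name
    level="$2"   # zstd compression level to evaluate, e.g. 1 or 3
    printf "ALTER TABLE %s WITH compression = {'class': 'ZstdCompressor', 'compression_level': '%s'};\n" "$table" "$level"
}

# Usage on a staging node (not run here; cqlsh connection options omitted):
#   cqlsh -e "$(zstd_alter_statement swh.directory_entry 3)"
```

Changing the table option only affects newly written SSTables. Existing ones keep their lz4 compression until they are compacted or explicitly rewritten, so the ratio reported by tablestats will only move gradually unless a rewrite is forced.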
Benchmarks should be performed for read performance, as the write impact will not be visible to clients, only to the compactor.
The main question is whether it will be possible to recompact the entire directory_entry table in production with the limited remaining space.
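To help reason about the space question, here is a rough sketch of a headroom check to run before forcing a rewrite. The helper `has_rewrite_headroom` is hypothetical. It rests on the assumption that the rewrite proceeds one SSTable at a time per job, and that each rewrite temporarily needs about as much space as the SSTable it replaces. That is a conservative estimate if zstd compresses better than lz4, and it should be verified on staging first.

```shell
#!/bin/sh
# Hypothetical helper: succeed if free space covers the SSTables rewritten at once.
# Assumption: each rewrite temporarily needs about as much space as the SSTable
# it replaces, and at most <jobs> SSTables are rewritten concurrently.
has_rewrite_headroom() {
    free_bytes="$1"        # free space on the data volume, in bytes
    largest_sstable="$2"   # size of the largest SSTable of the table, in bytes
    jobs="$3"              # number of concurrent rewrite jobs (nodetool -j)
    [ "$free_bytes" -gt $((largest_sstable * jobs)) ]
}

# Usage on a node (not run here; authentication options omitted):
#   nodetool upgradesstables -a -j 1 swh directory_entry
```

The `-a` flag asks nodetool to rewrite all SSTables, including those already on the current format version, which is what makes them pick up the new compression settings. Keeping `-j 1` limits the temporary space used at any one time, at the cost of a longer run.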