Tune index parameters
Given the goal of keeping swh_workers indexes forever, we will have more than 1000 indexes after 3 years. With the default number of shards per index, this means more than 5400 shards. Each shard consumes Elasticsearch resources, so having this many is unreasonable.
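The estimate above can be reproduced with a few lines of shell arithmetic. This is only a sketch: it assumes one index is created per day and that the default is 5 primary shards per index (the Elasticsearch default before version 7.0). Neither figure is stated explicitly in this task.

```shell
# Assumptions (not stated in the task): one index per day,
# 5 primary shards per index by default (Elasticsearch < 7.0).
indexes_per_year=365
years=3
default_shards=5

total_indexes=$((indexes_per_year * years))
total_shards=$((total_indexes * default_shards))

echo "indexes after ${years} years: ${total_indexes}"
echo "primary shards with the default setting: ${total_shards}"
```

Under these assumptions the script reports 1095 indexes and 5475 primary shards, in line with the figures quoted above.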
After collecting a few days of statistics with the new index patterns, we can see that:
- Each swh_workers index uses at most 6-7GB, well below the recommended maximum shard size of 40-50GB.
- Some swh_workers indexes contain more than 6 million documents, a number which could still grow in the future. It is recommended not to have more than 3-5 million documents per shard.
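A quick way to see how the shard count affects per-shard load is to divide the observed figures by a few candidate shard counts. The sketch below rounds the observations to 6 million documents and 7000 MB per index. These are illustrative values, not exact measurements, and documents are assumed to spread evenly across shards.

```shell
# Rounded figures from the observations above (illustrative only).
docs_per_index=6000000
size_mb_per_index=7000

# Per-shard load for a few candidate shard counts,
# assuming documents are spread evenly across shards.
for shards in 1 2 3; do
  docs_per_shard=$((docs_per_index / shards))
  size_mb_per_shard=$((size_mb_per_index / shards))
  echo "${shards} shard(s): ${docs_per_shard} docs, ${size_mb_per_shard} MB per shard"
done
```

With a single shard the document count exceeds the 3-5 million guideline. With two shards it drops to 3 million documents and about 3500 MB per shard.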
Given the above, two shards per swh_workers index is reasonable: the largest indexes come down to about 3 million documents per shard, and each shard stays at a few GB. The following shell code adds a template to the Elasticsearch cluster and sets the number of shards per swh_workers index to 2:
{
"template" : "swh_workers-*",
"settings" : { "number_of_shards" : 2 }
}'
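For reference, a template body like the one above could be sent to the cluster with curl along the following lines. This is a hedged sketch, not the exact command used for this task: the cluster URL (`http://localhost:9200`) and the template name (`swh_workers`) are hypothetical placeholders. The `"template"` key is the older index template syntax, and Elasticsearch 6.0 and later expect `"index_patterns"` instead.

```shell
# Hypothetical values: adjust to the actual cluster URL and template name.
ES_URL="${ES_URL:-http://localhost:9200}"
TEMPLATE_NAME="${TEMPLATE_NAME:-swh_workers}"

# Template body as given in this task (legacy "template" key).
TEMPLATE_BODY='{
  "template" : "swh_workers-*",
  "settings" : { "number_of_shards" : 2 }
}'

# Send the template to the legacy _template endpoint.
# The function is only defined here; call put_template to apply it.
put_template() {
  curl -s -XPUT "${ES_URL}/_template/${TEMPLATE_NAME}" \
       -H 'Content-Type: application/json' \
       -d "${TEMPLATE_BODY}"
}
```

Templates only apply to indexes created after the template is added, so existing swh_workers indexes keep their current shard count unless they are reindexed.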
Migrated from T990 (view on Phabricator)