Reindex old data on banco to put it into swh_workers indexes
Historical data on banco has been stored in generic logstash-${date} indexes. These indexes now contain deleted documents and are not compressed as well as they could be, wasting precious storage space.
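To see how much space is at stake before starting, the _cat/indices API can report per-index document counts, deleted documents, and on-disk size (a quick check against the same Elasticsearch endpoint used by the commands below):
curl 'http://192.168.101.58:9200/_cat/indices/logstash-*?v&h=index,docs.count,docs.deleted,store.size'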
This is the proposed reindexing process:
1. Change index template to improve reindexing speed
----------------------------------------------------
curl -i -H'Content-Type: application/json' -XPUT http://192.168.101.58:9200/_template/template_swh_workers -d '
{
  "template" : "swh_workers-*",
  "settings" : {
    "number_of_shards" : 2,
    "number_of_replicas" : 0,
    "refresh_interval" : -1,
    "codec" : "best_compression"
  }
}'
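To double-check that the template is in place before reindexing, it can be read back through the standard template GET endpoint:
curl http://192.168.101.58:9200/_template/template_swh_workers?pretty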
2. Reindex
----------
time curl -i -H'Content-Type: application/json' -XPOST http://192.168.101.58:9200/_reindex -d '
{
  "source": { "index": "logstash-2017.03.08" },
  "dest": { "index": "swh_workers-2017.03.08" }
}'
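For big daily indexes this synchronous call can run long enough for the HTTP connection to time out. An alternative sketch uses the standard wait_for_completion=false parameter, so _reindex returns a task id immediately, which can then be polled through the tasks API:
curl -i -H'Content-Type: application/json' -XPOST 'http://192.168.101.58:9200/_reindex?wait_for_completion=false' -d '
{
  "source": { "index": "logstash-2017.03.08" },
  "dest": { "index": "swh_workers-2017.03.08" }
}'
# poll running reindex tasks until completion
curl 'http://192.168.101.58:9200/_tasks?detailed=true&actions=*reindex'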
3. Add back replicas to index shards
------------------------------------
curl -i -H'Content-Type: application/json' -XPUT http://192.168.101.58:9200/swh_workers-2017.03.08/_settings -d '
{
  "index" : { "number_of_replicas" : 1 }
}'
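Note that the reindexing template above also set refresh_interval to -1, and template settings are baked into the index at creation time; re-adding replicas does not undo that. Presumably the refresh interval should be restored on the new index as well, in the same way:
curl -i -H'Content-Type: application/json' -XPUT http://192.168.101.58:9200/swh_workers-2017.03.08/_settings -d '
{
  "index" : { "refresh_interval" : "30s" }
}'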
4. Change index template back to sane defaults
----------------------------------------------
curl -i -H'Content-Type: application/json' -XPUT http://192.168.101.58:9200/_template/template_swh_workers -d '
{
  "template" : "swh_workers-*",
  "settings" : {
    "number_of_shards" : 2,
    "number_of_replicas" : 1,
    "refresh_interval" : "30s",
    "codec" : "best_compression"
  }
}'
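Before reclaiming the space, it seems prudent to refresh the new index and compare document counts between source and destination with the _count API; only if they match should the old logstash index be dropped (the deletion itself is implied by the goal of this task, not spelled out above):
curl -XPOST http://192.168.101.58:9200/swh_workers-2017.03.08/_refresh
curl http://192.168.101.58:9200/logstash-2017.03.08/_count
curl http://192.168.101.58:9200/swh_workers-2017.03.08/_count
# once the counts match, dropping the old index actually frees the storage
curl -XDELETE http://192.168.101.58:9200/logstash-2017.03.08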
Migrated from T1000