So the swh-search-journal-client@objects service (on search1.internal.softwareheritage.org) was
stopped so that elasticsearch had a chance to clean up whatever it needed to do.
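For the record, the stop presumably amounted to something like this (unit name taken from the log above; the commands are only echoed here, as a dry-run sketch rather than executed):

```shell
# Dry-run sketch: stop the journal client so elasticsearch can catch up
# on its merges/deletions. Unit name comes from the log above.
UNIT="swh-search-journal-client@objects"
echo "systemctl stop $UNIT"
echo "systemctl status $UNIT --no-pager"
```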
It worked, the load subsided.
And the docs.deleted count is actually decreasing (and disk space got freed).
The initial goal was reached (free disk space ;)
root@search-esnode1:~# date; curl http://$ES_SERVER/_cat/indices\?v
Tue 23 Feb 2021 08:57:09 AM UTC
health status index  uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   origin yFaqPPCnRFCnc5AA6Ah8lw  90   1  152756605     40806220    273.1gb        137.4gb
It's still in progress.
Further discussion ensued on #sysadm about the force merge action:
09:39 <+olasd> Force merge should only be called against an index after you have finished writing to it. Force merge can cause very large (>5GB) segments to be produced, and if you continue to write to such an index then the automatic merge policy will never consider these segments for future merges until they mostly consist of deleted documents.
...
09:40 <+olasd> force merge should only be called against an index after you're *completely done* writing to it
09:41 <+olasd> if you're going to update the documents ever again, the merged segments won't be automatically cleaned, until most documents have been updated
09:41 <+olasd> because they're gigantic
...
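One way to verify that on our side is to list the per-segment sizes of the index with the `_cat/segments` API. A dry-run sketch ($ES_SERVER assumed to point at the cluster, as elsewhere in this log; the command is echoed, not executed):

```shell
# Dry-run sketch: list segment sizes and deleted-doc counts for "origin",
# to spot the oversized (>5GB) segments left behind by the force merge.
ES_SERVER="${ES_SERVER:-search-esnode1.internal.softwareheritage.org:9200}"  # assumed host:port
CMD="curl -s 'http://$ES_SERVER/_cat/segments/origin?v&h=segment,size,docs.count,docs.deleted'"
echo "$CMD"
```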
So next action is actually to discontinue the usage of that index which is tainted by
our use of the force_merge action.
I'll create a new one as soon as disk space usage allows it (more than
50% free disk space).
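To watch for that 50%-free threshold from the cluster's own point of view (rather than icinga's), `_cat/allocation` reports per-node disk usage. A dry-run sketch, same assumed $ES_SERVER:

```shell
# Dry-run sketch: per-node disk.used / disk.avail / disk.percent, to decide
# when there is enough headroom to create and fill the new index.
ES_SERVER="${ES_SERVER:-search-esnode1.internal.softwareheritage.org:9200}"  # assumed host:port
ALLOC_CMD="curl -s 'http://$ES_SERVER/_cat/allocation?v'"
echo "$ALLOC_CMD"
```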
[1]
06:32 <swhbot`> icinga PROBLEM: service disk /srv/elasticsearch/nodes on search-esnode1.internal.softwareheritage.org is WARNING: DISK WARNING - free space: /srv/elasticsearch/nodes 41310 MB (20% inode=99%);
06:46 <swhbot`> icinga RECOVERY: service disk /srv/elasticsearch/nodes on search-esnode1.internal.softwareheritage.org is OK: DISK OK - free space: /srv/elasticsearch/nodes 41590 MB (21% inode=99%);
06:49 <swhbot`> icinga PROBLEM: service disk /srv/elasticsearch/nodes on search-esnode1.internal.softwareheritage.org is WARNING: DISK WARNING - free space: /srv/elasticsearch/nodes 40615 MB (20% inode=99%);
06:53 <swhbot`> icinga RECOVERY: service disk /srv/elasticsearch/nodes on search-esnode1.internal.softwareheritage.org is OK: DISK OK - free space: /srv/elasticsearch/nodes 41737 MB (21% inode=99%);
07:05 <swhbot`> icinga PROBLEM: service disk /srv/elasticsearch/nodes on search-esnode1.internal.softwareheritage.org is WARNING: DISK WARNING - free space: /srv/elasticsearch/nodes 41305 MB (20% inode=99%);
08:06 <swhbot`> icinga RECOVERY: service disk /srv/elasticsearch/nodes on search-esnode1.internal.softwareheritage.org is OK: DISK OK - free space: /srv/elasticsearch/nodes 41527 MB (21% inode=99%);
08:11 <swhbot`> icinga PROBLEM: service disk /srv/elasticsearch/nodes on search-esnode1.internal.softwareheritage.org is WARNING: DISK WARNING - free space: /srv/elasticsearch/nodes 41135 MB (20% inode=99%);
08:12 <swhbot`> icinga PROBLEM: service disk /srv/elasticsearch/nodes on search-esnode2.internal.softwareheritage.org is WARNING: DISK WARNING - free space: /srv/elasticsearch/nodes 41403 MB (20% inode=99%);
08:28 <swhbot`> icinga RECOVERY: service disk /srv/elasticsearch/nodes on search-esnode1.internal.softwareheritage.org is OK: DISK OK - free space: /srv/elasticsearch/nodes 41620 MB (21% inode=99%);
08:33 <swhbot`> icinga PROBLEM: service disk /srv/elasticsearch/nodes on search-esnode1.internal.softwareheritage.org is WARNING: DISK WARNING - free space: /srv/elasticsearch/nodes 41248 MB (20% inode=99%);
08:44 <swhbot`> icinga RECOVERY: service disk /srv/elasticsearch/nodes on search-esnode1.internal.softwareheritage.org is OK: DISK OK - free space: /srv/elasticsearch/nodes 42165 MB (21% inode=99%);
08:48 <swhbot`> icinga RECOVERY: service disk /srv/elasticsearch/nodes on search-esnode2.internal.softwareheritage.org is OK: DISK OK - free space: /srv/elasticsearch/nodes 56738 MB (28% inode=99%);
puppet agent got disabled on search1.internal.softwareheritage.org so puppet does not
restart the search-journal-client@objects service (which writes to the index) [1]
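The disable itself is a one-liner; something along these lines (the message wording is illustrative, and the commands are echoed as a dry run):

```shell
# Dry-run sketch: disable puppet with a reason so other admins see why.
REASON="keep swh-search-journal-client@objects stopped during the ES index switch"  # hypothetical wording
echo "puppet agent --disable \"$REASON\""
echo "puppet agent --enable  # once the new index is in place"
```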
health status index             uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   origin            yFaqPPCnRFCnc5AA6Ah8lw  90   1  152756759     41962187      261gb        130.7gb
green  open   origin-production hZfuv0lVRImjOjO_rYgDzg  90   1    1021300            0      1.1gb        917.9mb

elasticsearch-data  193G   88G  106G  46% /srv/elasticsearch/nodes
[1] I'll install an alias "origin" on the new "origin-production" index, keeping the
name "origin" since the index name is currently hard-coded in the swh-search code.
[2] We don't have that much space so cleaning up the wrong copy:
Install alias "origin" on "origin-production" index:
As expected, we can't add an alias with the same name as an existing index:
ardumont@search-esnode1:~% curl -s -XPOST $ES_SERVER/origin-production/_alias/origin
{"error":{"root_cause":[{"type":"invalid_alias_name_exception","reason":"Invalid alias name [origin], an index exists with the same name as the alias","index_uuid":"yFaqPPCnRFCnc5AA6Ah8lw","index":"origin"}],"type":"invalid_alias_name_exce
So first close and delete the old index to reclaim space:
# Close the index
ardumont@search-esnode1:~% curl -s -XPOST $ES_SERVER/origin/_close
{"acknowledged":true,"shards_acknowledged":true,"indices":{"origin":{"closed":true}}}
# Delete the index
ardumont@search-esnode1:~% curl -s -XDELETE $ES_SERVER/origin
{"acknowledged":true}
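With the old index closed and deleted, re-running the alias creation attempted above should now succeed. A dry-run sketch of that remaining step (same endpoint as before, $ES_SERVER assumed; the command is echoed, not executed):

```shell
# Dry-run sketch: point the hard-coded name "origin" at origin-production,
# now that no index of that name exists anymore.
ES_SERVER="${ES_SERVER:-search-esnode1.internal.softwareheritage.org:9200}"  # assumed host:port
ALIAS_CMD="curl -s -XPOST http://$ES_SERVER/origin-production/_alias/origin"
echo "$ALIAS_CMD"
```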