Delete old system log data from the Elasticsearch cluster

The Elasticsearch cluster on banco.internal.softwareheritage.org contains:

  • old system log data
  • test data injected by mistake from my laptop

It would be nice to delete unneeded documents at some point.

Proposed request to clean up test data:

curl -i -H'Content-Type: application/json' -XPOST "http://localhost:9200/_all/_delete_by_query/?pretty=true" -d '
{
    "query" : {
        "match" : { "hostname" : "hplaptopft0" }
    }
}'

Proposed request to clean up old system log data:

curl -i -H'Content-Type: application/json' -XPOST "http://localhost:9200/_all/_delete_by_query/?pretty=true" -d '
{
    "query" : {
	"bool": {
	    "must_not": [{ "match" : { "systemd_unit" : "swh-worker@" } }],
	    "must": { "range" : { "@timestamp" : { "lt" : "now-3M" }}}
	}
    }
}'
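
Before running either delete request, it may be worth checking how many documents a query matches. The sketch below sends a query body to the `_count` API, which accepts the same `query` object as `_delete_by_query` but deletes nothing. The names `ES_URL`, `hostname_query` and `count_matches` are illustrative and not part of the original proposal.

```shell
#!/bin/sh
# Assumption: the cluster is reachable at the same address as in the requests above.
ES_URL="${ES_URL:-http://localhost:9200}"

# Build the query body that matches the test data for a given hostname.
hostname_query() {
    printf '{ "query": { "match": { "hostname": "%s" } } }' "$1"
}

# Ask Elasticsearch how many documents match a query body; nothing is deleted.
# $1 is the full JSON body, for example the output of hostname_query.
count_matches() {
    curl -s -H 'Content-Type: application/json' \
        -XPOST "$ES_URL/_all/_count?pretty=true" \
        -d "$1"
}

# Usage (requires a running cluster):
#   count_matches "$(hostname_query hplaptopft0)"
```

The body of the old system log request can be passed to `count_matches` in the same way, as a single quoted argument.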

Remark: closed Elasticsearch indices are not processed. In order to delete documents from closed indices, we have to reopen them first.
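
A possible way to reopen the closed indices is sketched below. It assumes that `_cat/indices?h=index,status` lists closed indices with the status `close`, which may depend on the Elasticsearch version. The names `ES_URL`, `closed_indices` and `reopen_closed_indices` are illustrative.

```shell
#!/bin/sh
# Assumption: the cluster is reachable at the same address as in the requests above.
ES_URL="${ES_URL:-http://localhost:9200}"

# Read "index status" pairs on stdin and print the names of closed indices.
closed_indices() {
    awk '$2 == "close" { print $1 }'
}

# Reopen every closed index so that _delete_by_query can process it.
reopen_closed_indices() {
    curl -s "$ES_URL/_cat/indices?h=index,status" | closed_indices |
    while read -r index; do
        curl -s -XPOST "$ES_URL/$index/_open"
        echo
    done
}

# Usage (requires a running cluster):
#   reopen_closed_indices
```

Once the deletion is done, the reopened indices can be closed again with a `POST` to `/<index>/_close` if they should not stay open. Saving the output of `closed_indices` beforehand keeps track of which ones to close.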


Migrated from T977 (view on Phabricator)
