Upgrade the ELK stack
The ELK stack needs a refresh from 7.8.0 to 7.15.1.
The following components need to be updated:
- elasticsearch
- logstash
- filebeat
- swh/infra/sysadm-environment#3545 (closed): journalbeat
Migrated from T3705 (view on Phabricator)
Activity
- Vincent Sellier added Component upgrades priority:Normal labels
- Vincent Sellier changed title from Upgrade the LK stack to Upgrade the ELK stack
- Maintainer
FWIW the main blocker for upgrading journalbeat is a change in the target mapping, which will need some adaptations in our log routing (between systemlogs and swh_workers), as well as, well, an updated mapping on the target indexes!
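A rough way to scope that mapping change is to export the default template shipped by each journalbeat version and diff them; this is only a sketch reusing the `journalbeat export template` command used later in this thread (the file names are illustrative):

# sketch: export the default template before and after the package upgrade, then diff
journalbeat export template > /tmp/journalbeat-template-7.8.0.json    # run while 7.8.0 is installed
# ... upgrade the journalbeat package ...
journalbeat export template > /tmp/journalbeat-template-7.15.1.json   # run once 7.15.1 is installed
diff -u /tmp/journalbeat-template-7.8.0.json /tmp/journalbeat-template-7.15.1.json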
- Author Maintainer
Thanks for the info. For the record, the entry point of the upgrade process is: https://www.elastic.co/guide/en/elastic-stack/current/upgrading-elastic-stack.html
Elasticsearch supports rolling upgrades between minor versions, from Elasticsearch 5.6 to 6.8, and from 6.8 to 7.15.1.
Upgrade the components of your Elastic Stack in the following order:
- Elasticsearch Hadoop: install instructions
- Elasticsearch: upgrade instructions
- Kibana: upgrade instructions
- Java High Level REST Client: dependency configuration
- Logstash: upgrade instructions
- Beats: upgrade instructions
- APM Server: upgrade instructions
- Elastic Agent: upgrade instructions
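Before starting, a quick way to confirm the version currently running on every node (a sketch using the standard cat nodes API; the host is the one used elsewhere in this thread):

curl -s 'http://esnode1.internal.softwareheritage.org:9200/_cat/nodes?v&h=name,version,node.role,master'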
- Author Maintainer
The preparation of the migration through the vagrant environment is in progress.
- Vincent Sellier assigned to @vsellier
- Vincent Sellier added state:wip label
- Author Maintainer
In order to validate the kibana upgrade, the kibana configuration can be copied locally with these commands:
- Export:
docker run --rm -ti \
    -v /tmp/kibana_export:/tmp \
    elasticdump/elasticsearch-dump \
    --input=http://esnode1.internal.softwareheritage.org:9200/.kibana_2 \
    --output=/tmp/kibana_2.json \
    --type=data

docker run --rm -ti \
    -v /tmp/kibana_export:/tmp \
    elasticdump/elasticsearch-dump \
    --input=http://esnode1.internal.softwareheritage.org:9200/.kibana_2 \
    --output=/tmp/kibana_2_mapping.json \
    --type=mapping
- Import:

# create the index
curl -XPUT http://10.168.100.61:9200/.kibana_2

# Import the mapping
docker run --net=host --rm -ti \
    -v /tmp/:/tmp \
    elasticdump/elasticsearch-dump \
    --input=/tmp/kibana_2_mapping.json \
    --output=http://10.168.100.61:9200/ \
    --type=mapping

# Import the data (from the data dump, not the mapping file)
docker run --net=host --rm -ti \
    -v /tmp/:/tmp \
    elasticdump/elasticsearch-dump \
    --input=/tmp/kibana_2.json \
    --output=http://10.168.100.61:9200/ \
    --type=data
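To double-check the import, the document counts of the source and destination indexes can be compared (a sketch using the standard `_count` API):

# source (production) vs destination (vagrant) document counts should match
curl -s http://esnode1.internal.softwareheritage.org:9200/.kibana_2/_count
curl -s http://10.168.100.61:9200/.kibana_2/_count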
- Update the kibana index alias:
cat > /tmp/alias.json <<EOF
{
  "actions": [
    { "remove": { "index": ".kibana_1", "alias": ".kibana" } },
    { "add": { "index": ".kibana_2", "alias": ".kibana" } }
  ]
}
EOF
curl -H'content-type:application/json' -XPOST http://10.168.100.61:9200/_aliases -d @/tmp/alias.json
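The alias switch can then be verified with the cat aliases API (a quick sanity check, not part of the original procedure):

curl -s 'http://10.168.100.61:9200/_cat/aliases/.kibana?v'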
- Author Maintainer
The ES migration can be performed following the rolling upgrade procedure: https://www.elastic.co/guide/en/elasticsearch/reference/7.15/rolling-upgrades.html
Disable shard allocation
cat > /tmp/shard_allocation.json <<EOF
{
  "persistent": {
    "cluster.routing.allocation.enable": "primaries"
  }
}
EOF
curl -H'content-type: application/json' -XPUT http://10.168.100.61:9200/_cluster/settings -d @/tmp/shard_allocation.json
=> result:
{"acknowledged":true,"persistent":{"cluster":{"routing":{"allocation":{"enable":"primaries"}}}},"transient":{}}
Flush indexes
curl -XPOST http://10.168.100.61:9200/_flush/_synced
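Before launching the upgrade, it is worth confirming that the allocation setting is in place and the cluster is still green (a sketch using standard cluster APIs):

curl -s 'http://10.168.100.61:9200/_cluster/settings?flat_settings=true'
curl -s 'http://10.168.100.61:9200/_cat/health?v'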
Launch the upgrade
- Add the following configuration per node in the esnodeX.i.s.o.yaml file in swh-site:

elastic::elk_version: '7.15.1'
elasticsearch::config::extras:
  xpack.security.enabled: false
The xpack configuration is needed to avoid the display of a warning popup each time a Kibana search is made in the new version.
We should think later about activating authentication (it will also impact the webapp, which retrieves the scn status).
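After the new configuration is deployed (see the apply step below), the rendered file can be sanity-checked on a node; this assumes `elasticsearch::config::extras` ends up in the standard /etc/elasticsearch/elasticsearch.yml:

grep xpack.security.enabled /etc/elasticsearch/elasticsearch.yml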
- remove the prometheus exporter plugin to force its upgrade
rm -rf /usr/share/elasticsearch/plugins/prometheus-exporter
- apply the new configuration
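After the puppet run, the plugin reinstallation can be confirmed on each node (the `elasticsearch-plugin` CLI ships with the Debian package; the prometheus-exporter plugin should be listed again, now built for 7.15.1):

/usr/share/elasticsearch/bin/elasticsearch-plugin list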
Re-enable the shard allocation
cat > /tmp/shard_allocation.json <<EOF
{
  "persistent": {
    "cluster.routing.allocation.enable": null
  }
}
EOF
curl -H'content-type: application/json' -XPUT http://10.168.100.61:9200/_cluster/settings -d @/tmp/shard_allocation.json
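While waiting for the cluster to go back to green, the shard recovery can be followed with the standard cat APIs:

curl -s 'http://10.168.100.61:9200/_cat/health?v'
curl -s 'http://10.168.100.61:9200/_cat/recovery?v&active_only=true'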
It seems everything is still running well after the upgrade (logstash, filebeat, journalbeat).
- Author Maintainer
To upgrade Kibana, bumping the version looks sufficient. The migration is done automatically and all the configured elements are still available:
root@esnode1:~# curl -s http://10.168.100.61:9200/_cat/indices\?v=true\&s=index | grep kibana
health status index                            uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   .kibana-event-log-7.15.1-000001  24Wb0rfUQuqab3Iody3Hrg   1   1          1            0     12.1kb            6kb  <-------- new index
green  open   .kibana-event-log-7.8.0-000001   6IjHICQVS2uX8qBekJLWsw   1   1          2            0     21.4kb         10.7kb
green  open   .kibana_2                        Oh9O6uB1R0-oNPbnhTM8kw   1   1       1928            3      1.5mb        788.4kb
green  open   .kibana_7.15.1_001               5fyk6NMUSE-3P6uhx-HSeg   1   1       1110           35      5.3mb          2.6mb  <-------- new index (automatically migrated from kibana_2)
green  open   .kibana_task_manager_1           vINZFVqCSJiDHHFMdYGwTA   1   1          5            0       32kb           16kb
green  open   .kibana_task_manager_7.15.1_001  pYeR_zFdTZO_jqxYS1DB9g   1   1         16          369      527kb        277.5kb  <-------- new index
root@esnode1:~# curl -s http://10.168.100.61:9200/_cat/aliases\?v=true\&s=index | grep kibana
alias                        index                            filter routing.index routing.search is_write_index
.kibana-event-log-7.15.1     .kibana-event-log-7.15.1-000001  -      -             -              true
.kibana-event-log-7.8.0      .kibana-event-log-7.8.0-000001   -      -             -              true
.kibana                      .kibana_7.15.1_001               -      -             -              -
.kibana_7.15.1               .kibana_7.15.1_001               -      -             -              -
.kibana_task_manager         .kibana_task_manager_7.15.1_001  -      -             -              -
.kibana_task_manager_7.15.1  .kibana_task_manager_7.15.1_001  -      -             -              -
- Author Maintainer
Everything looks good with logstash 1:7.15.1. The monitoring of logstash errors is still working as before:

root@logstash0:/usr/lib/nagios/plugins/swh# ./check_logstash_errors.sh
OK - No errors detected
After closing the current system index (to trigger errors):

root@logstash0:/usr/lib/nagios/plugins/swh# ./check_logstash_errors.sh
CRITICAL - Logstash has detected some errors in outputs errors=9 non_retryable_errors=13
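For context, this kind of data is exposed by the Logstash monitoring API on port 9600; the exact fields inspected by `check_logstash_errors.sh` are not shown here, so the jq filter below is only an assumption of where output errors surface:

# sketch: list per-output failure counters from the node stats API (field names are assumptions)
curl -s http://localhost:9600/_node/stats/pipelines | \
  jq '.pipelines[].plugins.outputs[]? | {id, failures: .bulk_requests.failures?, non_retryable: .documents.non_retryable_failures?}'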
- Antoine R. Dumont changed the description
- Phabricator Migration user mentioned in commit swh/infra/puppet/puppet-environment@a02e2078
- Phabricator Migration user mentioned in commit swh/infra/puppet/puppet-swh-site@6ef2a729
- Phabricator Migration user mentioned in commit swh/infra/puppet/puppet-swh-site@0ec2231d
- Author Maintainer
The diffs to prepare the migration of filebeat and journalbeat are ready. If everything is good after the review, the upgrade will be performed at the beginning of W46.
To create the new mappings:
root@logstash0:/etc/journalbeat# journalbeat export template -E setup.ilm.enabled=false \
    -E setup.template.name=systemlogs-7.15.1 -E setup.template.pattern='systemlogs-7.15.1-*' \
    > /tmp/systemlogs-7.15.1.json
root@logstash0:/etc/journalbeat# curl -XPOST -H 'Content-Type: application/json' \
    http://10.168.100.61:9200/_template/systemlogs-7.15.1 -d@/tmp/systemlogs-7.15.1.json; echo
{"acknowledged":true}
root@logstash0:/etc/journalbeat# journalbeat export template -E setup.ilm.enabled=false \
    -E setup.template.name=swh_workers-7.15.1 -E setup.template.pattern='swh_workers-7.15.1-*' \
    > /tmp/swh_workers-7.15.1.json
root@logstash0:/etc/journalbeat# curl -XPOST -H 'Content-Type: application/json' \
    http://10.168.100.61:9200/_template/swh_workers-7.15.1 -d@/tmp/swh_workers-7.15.1.json; echo
{"acknowledged":true}
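The presence of the new templates can be double-checked afterwards with the standard cat templates API:

curl -s 'http://10.168.100.61:9200/_cat/templates/systemlogs-7.15.1?v'
curl -s 'http://10.168.100.61:9200/_cat/templates/swh_workers-7.15.1?v'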
The files are prepared in /root on logstash0 in production.
- Author Maintainer
For the record, the upgrade of esnode[1-3] to bullseye is ok (in vagrant). The upgrade completes without errors and puppet is green. A reinstall from scratch also works well, without warnings.
- Author Maintainer
- The 3 esnodes are updated to version 7.15.2.
First, on each node:
puppet agent --disable
Then, for each node:
apt update
apt dist-upgrade

cat > /tmp/shard_allocation.json <<EOF
{
  "persistent": {
    "cluster.routing.allocation.enable": "primaries"
  }
}
EOF
curl -H'content-type: application/json' -XPUT http://192.168.100.61:9200/_cluster/settings -d @/tmp/shard_allocation.json

systemctl disable elasticsearch
systemctl stop elasticsearch
# wait for the node to be removed from the cluster nodes
reboot

# The manually updated configuration (gc configuration) is not working with the new jvm 1.14 bundled with ES 7.15.2
mv /etc/elasticsearch/jvm.options /etc/elasticsearch/jvm.options-7.8.0

puppet agent --enable
puppet agent --test
systemctl enable elasticsearch

cat > /tmp/shard_allocation.json <<EOF
{
  "persistent": {
    "cluster.routing.allocation.enable": null
  }
}
EOF
curl -H'content-type: application/json' -XPUT http://10.168.100.61:9200/_cluster/settings -d @/tmp/shard_allocation.json

# wait for the cluster to be green again before upgrading the next node
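Once the three nodes are done, the versions and the reinstalled plugin can be checked cluster-wide (standard cat APIs):

curl -s 'http://10.168.100.61:9200/_cat/nodes?v&h=name,version'
curl -s 'http://10.168.100.61:9200/_cat/plugins?v'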
- Kibana is also updated to version 7.15.2 (with a puppet apply and a restart of the kibana service)
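Kibana's own status endpoint can confirm the running version (a sketch; the host below is an assumption, the API is the standard /api/status):

curl -s http://kibana.internal.softwareheritage.org:5601/api/status | grep -o '"number":"[^"]*"' | head -1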
- Phabricator Migration user mentioned in commit swh/infra/puppet/puppet-swh-site@00e99135
- Phabricator Migration user mentioned in commit swh/infra/puppet/puppet-swh-site@111a9921
- Author Maintainer
- journalbeat and filebeat are migrated on all the nodes
- after the lag recovery and the fix of the closed indexes script, everything looks good
- Vincent Sellier removed state:wip label
- Vincent Sellier closed
- Phabricator Migration user mentioned in commit swh/infra/puppet/puppet-swh-site@53eef5fe
- Phabricator Migration user mentioned in commit swh/infra/puppet/puppet-swh-site@ae1e994e
- Phabricator Migration user mentioned in commit swh/infra/puppet/puppet-swh-site@1dce1067