[swh-search] Improve the index/mapping migration process
The current implementation uses the same index name for both indexing and searching. The major drawback is that maintenance operations on the index are complicated and cause downtime on the public search service. Now that the search is used in production, such downtime should be avoided as far as possible.
A proposal to make upgrades easier is to use two aliases, one for searching and one for indexing.
The current upgrade of the `origin` index from version X to version Y is:
- stop the indexing
- create a new index `originvX` and copy the `origin` mapping to it
- copy the `origin` content to `originvX` with a `reindex` operation
- delete the `origin` index; the public search is impacted from this moment
- recreate the `origin` index via the swh-search cli; the public search works again, but returns few or no results because the new index is empty
- copy the content of `originvX` to `origin` with a `reindex` operation; the public search returns more results as the reindex progresses (it can take several days)
- delete the `originvX` index
- restart the indexing
With aliases, the process could be:
Given a current index `origin-vX`:
- stop the indexing
- create a new index `origin-vY` and configure the write alias to use it
- copy the content of `origin-vX` to `origin-vY` with a reindex operation (can take several days)
- update the search alias to use `origin-vY`
- delete the `origin-vX` index

The public search is not impacted, as a fully populated index is always present. Only new updates are delayed, by the duration of the reindexing.
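The alias-based steps above could be sketched as an ordered list of REST requests. This is only an illustration under assumptions: it supposes the backend is Elasticsearch (create index, `_aliases` and `_reindex` endpoints), and the alias names `origin-read` and `origin-write` as well as the helper `migration_requests` are hypothetical, not taken from swh-search. The helper only builds the requests, it does not send anything, so the plan can be reviewed first. Stopping and restarting the indexing happens outside of it.

```python
import json


def migration_requests(old_index, new_index, read_alias, write_alias, mapping):
    """Return the ordered (method, path, body) requests of an alias-based migration.

    Hypothetical helper: nothing is sent, the caller executes each step.
    """
    return [
        # 1. Create the new index with the new mapping.
        ("PUT", f"/{new_index}", {"mappings": mapping}),
        # 2. Move the write alias to the new index in a single call.
        ("POST", "/_aliases", {"actions": [
            {"remove": {"index": old_index, "alias": write_alias}},
            {"add": {"index": new_index, "alias": write_alias}},
        ]}),
        # 3. Copy the documents; searches are still served by the old index.
        ("POST", "/_reindex", {
            "source": {"index": old_index},
            "dest": {"index": new_index},
        }),
        # 4. Once the reindex has completed, switch the search alias.
        ("POST", "/_aliases", {"actions": [
            {"remove": {"index": old_index, "alias": read_alias}},
            {"add": {"index": new_index, "alias": read_alias}},
        ]}),
        # 5. Drop the old index, which is no longer referenced by any alias.
        ("DELETE", f"/{old_index}", None),
    ]


# Example plan; the alias names and the empty mapping are placeholders.
plan = migration_requests(
    "origin-vX", "origin-vY", "origin-read", "origin-write", {"properties": {}}
)
for method, path, body in plan:
    print(method, path, json.dumps(body) if body is not None else "")
```

Because the search alias only moves in step 4, after the copy has finished, readers never see a partially populated index.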
The changes to implement in swh-search should be:
- explicitly name the index and the aliases to use in the configuration
- in the initialization function:
  - test if the search alias exists; if not, create it and configure it to use the index, otherwise do nothing
  - test if the write alias exists; if not, create it and configure it to use the index, otherwise do nothing
  - test if the index exists; if not, create it and apply the mapping
- always call the initialization function at startup to ensure the index exists and the mapping is applied. This avoids starting to index with an auto-generated (and divergent) mapping
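The initialization logic described above might look like the following minimal sketch. Everything named here is an assumption made for illustration: `FakeBackend` is an in-memory stand-in rather than a real client, and `IndexConfig`, `initialize` and the alias names are hypothetical. The sketch checks the index before the aliases, on the assumption that an alias can only point at an index that already exists.

```python
from dataclasses import dataclass, field


@dataclass
class FakeBackend:
    """In-memory stand-in for the search backend, used only for this sketch."""
    indexes: dict = field(default_factory=dict)  # index name -> mapping
    aliases: dict = field(default_factory=dict)  # alias name -> index name


@dataclass(frozen=True)
class IndexConfig:
    """Names that would be set explicitly in the configuration."""
    index: str
    read_alias: str
    write_alias: str


def initialize(backend, config, mapping):
    """Idempotent start-up initialization: create only what is missing."""
    # Create the index with the explicit mapping if it does not exist yet.
    if config.index not in backend.indexes:
        backend.indexes[config.index] = mapping
    # Create each missing alias; an existing alias is left untouched, since
    # it may point at another index in the middle of a migration.
    for alias in (config.read_alias, config.write_alias):
        backend.aliases.setdefault(alias, config.index)


backend = FakeBackend()
config = IndexConfig("origin-vX", "origin-read", "origin-write")
initialize(backend, config, {"properties": {}})
initialize(backend, config, {"properties": {}})  # second call changes nothing
```

Calling `initialize` on every startup is safe in this sketch because each step is a no-op when the index or alias is already present.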
Furthermore, this first step will allow an automatic migration to be implemented when needed.
Migrated from T3076 (view on Phabricator)
Edited by Vincent Sellier