
sysadmin/migrate-data: Script to move data between objstorages

Antoine R. Dumont requested to merge 5260-migrate-data into master

Script to move contents from the storage1.staging objstorage to the db1.staging one.

The gist of the algorithm is to copy objects (contents) from the source objstorage to the destination objstorage, given a list of object ids (sha1) read from stdin [1]. It runs several checks along the way (whether the object was already moved, presence in the source objstorage, corruption, existence in the destination objstorage, ...). If any of those checks fails, it logs the error and continues with the remaining objects. Once an object has been moved, an entry is logged in a manifest of moved objects.

The script is idempotent and can be called multiple times with the same set of inputs.
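
To make the flow above concrete, here is a minimal sketch of the loop, not the actual move.py. It rests on several assumptions: the two objstorages are modelled as plain dicts keyed by hex sha1, the manifest is modelled as an in-memory set, and the function name `move_objects` and its parameters are hypothetical. It also assumes that an object already present in the destination is still recorded in the manifest, and that a dry run writes nothing and records nothing. Any removal from the source objstorage is left out.

```python
import hashlib
import logging
from typing import Dict, Iterable, List, Set

logger = logging.getLogger("move-sketch")


def move_objects(
    obj_ids: Iterable[str],
    src: Dict[str, bytes],   # stand-in for the source objstorage (assumption)
    dst: Dict[str, bytes],   # stand-in for the destination objstorage (assumption)
    moved: Set[str],         # stand-in for the manifest of moved objects (assumption)
    dry_run: bool = False,
) -> List[str]:
    """Copy objects from src to dst; return the ids newly recorded as moved."""
    newly_moved: List[str] = []
    for raw_id in obj_ids:
        obj_id = raw_id.strip()
        if not obj_id:
            continue
        # Already-moved check: this is what makes repeated runs harmless.
        if obj_id in moved:
            logger.debug("Content <%s> already moved, skipping", obj_id)
            continue
        # Presence check in the source.
        data = src.get(obj_id)
        if data is None:
            logger.error("Content <%s> missing from source", obj_id)
            continue
        # Corruption check: the id is expected to be the sha1 of the bytes.
        if hashlib.sha1(data).hexdigest() != obj_id:
            logger.error("Content <%s> corrupted in source", obj_id)
            continue
        if dry_run:
            logger.info("** DRY-RUN ** would write <%s> in destination", obj_id)
            continue
        # Existence check in the destination: only write when absent.
        if obj_id not in dst:
            dst[obj_id] = data
        moved.add(obj_id)
        newly_moved.append(obj_id)
    return newly_moved
```

Calling `move_objects` a second time with the same ids and the same `moved` set returns an empty list and leaves `dst` unchanged, which is the idempotence property in this simplified model.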

[1] Sample run

(venv-5260) root@storage1:~/5260-move-data# head -5 /srv/kubernetes/volumes/5260-migrate-data/objstorage-listing-20240423 | ./move.py --random-number 2 --debug --dry-run --cleanup
[2024-04-23 16:14:48,006] Content <4645e6f6b362a51617fdea10fcac0ee5655b346d> to copy from src to dst
[2024-04-23 16:14:48,023] ** DRY-RUN** Write <4645e6f6b362a51617fdea10fcac0ee5655b346d> in destination objstorage.
[2024-04-23 16:14:48,024] Content <4645e8980afaddb1b38060fa7084bfee13992634> to copy from src to dst
[2024-04-23 16:14:48,034] ** DRY-RUN** Write <4645e8980afaddb1b38060fa7084bfee13992634> in destination objstorage.
[2024-04-23 16:14:48,034] Content <4645e2937b3fddda1bdf5a7e78533569576b00be> to copy from src to dst
[2024-04-23 16:14:48,045] ** DRY-RUN** Write <4645e2937b3fddda1bdf5a7e78533569576b00be> in destination objstorage.
(venv-5260) root@storage1:~/5260-move-data# tail -3 /var/tmp/content-moved
4645e6f6b362a51617fdea10fcac0ee5655b346d
4645e8980afaddb1b38060fa7084bfee13992634
4645e2937b3fddda1bdf5a7e78533569576b00be

Refs. swh/infra/sysadm-environment#5260

Edited by Antoine R. Dumont
