Add support for resuming from interrupted export
This replaces the monolithic CLI function with luigi tasks (now called by the CLI): one to write the offsets and one to export each topic. An export no longer has to be restarted from scratch after an error; only the topic that was interrupted needs to be re-run. Additionally, topic exporters can now run in parallel, which can save some hours of run time: work continues on other topics while one topic's exporter is not maximizing CPU usage because it has already consumed most of its partitions.
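The resume behaviour described above can be illustrated with a minimal standard-library sketch. This is not the code from this commit and it does not use luigi; it only mimics the idea that each task leaves an output behind and is skipped when that output already exists. The topic names, the `.done` marker files, and the functions `write_offsets`, `export_topic` and `run_export` are all hypothetical.

```python
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

# Hypothetical topic names, used only for illustration.
TOPICS = ["origin", "snapshot", "revision"]


def write_offsets(export_dir: Path) -> Path:
    """Record offsets once; later runs see the marker and do nothing."""
    marker = export_dir / "offsets.done"
    if not marker.exists():
        marker.write_text("offsets recorded\n")
    return marker


def export_topic(export_dir: Path, topic: str, ran: list) -> Path:
    """Export one topic unless its completion marker already exists."""
    marker = export_dir / f"{topic}.done"
    if marker.exists():
        return marker  # already exported by a previous run: skip
    ran.append(topic)  # remember which topics actually did work
    # A real exporter would consume the topic here (assumption).
    marker.write_text("ok\n")  # written last, so it signals completion
    return marker


def run_export(export_dir: Path, topics: list, workers: int = 2) -> list:
    """Write offsets first, then export topics in parallel.

    Returns the sorted list of topics that were actually exported.
    """
    ran: list = []
    write_offsets(export_dir)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() forces completion and re-raises any worker exception.
        list(pool.map(lambda t: export_topic(export_dir, t, ran), topics))
    return sorted(ran)


if __name__ == "__main__":
    directory = Path(tempfile.mkdtemp())
    print(run_export(directory, TOPICS))  # first run: every topic
    print(run_export(directory, TOPICS))  # second run: nothing left to do
```

Deleting a single marker file and calling `run_export` again re-exports only that topic, which is the property the commit message relies on. In luigi the same role is played by a task's output target, and parallelism comes from running several workers.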
parent a2c74d52
Showing 7 changed files:
- pyproject.toml: 0 additions, 1 deletion
- requirements-luigi.txt: 0 additions, 1 deletion
- requirements.txt: 1 addition, 0 deletions
- swh/export/cli.py: 43 additions, 110 deletions
- swh/export/journalprocessor.py: 9 additions, 15 deletions
- swh/export/luigi.py: 361 additions, 13 deletions
- swh/export/test/test_journal_processor.py: 21 additions, 42 deletions