Rework deposit checker implementation
This MR contains two commits:
api/private: Add endpoint to get download links of uploaded tarballs
Add a new private Web API endpoint to get a list of URLs for downloading
the tarballs uploaded with a deposit.
In development mode, the tarballs are stored on the local filesystem
and served by Django.
In production mode, the tarballs are stored in Azure Blob Storage,
and temporary download links with a shared access signature are
generated when the endpoint is requested.
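
To illustrate the general idea behind such temporary links, here is a minimal sketch of a time-limited signed URL built with the Python standard library only. It is not Azure's actual shared access signature format and not the code of this MR: the key, the `expiry` and `sig` query parameters, and the function names `make_temporary_link` and `is_link_valid` are all hypothetical.

```python
import hashlib
import hmac
import time
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical secret shared by the link issuer and the storage frontend.
SECRET_KEY = b"not-a-real-key"


def _signature(path: str, expiry: int) -> str:
    """Return an HMAC-SHA256 hex digest covering the path and the expiry time."""
    message = f"{path}\n{expiry}".encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def make_temporary_link(
    base_url: str, path: str, ttl_seconds: int, now: Optional[float] = None
) -> str:
    """Build a download URL that stops validating after ttl_seconds."""
    now = time.time() if now is None else now
    expiry = int(now) + ttl_seconds
    query = urlencode({"expiry": expiry, "sig": _signature(path, expiry)})
    return f"{base_url}{path}?{query}"


def is_link_valid(url: str, now: Optional[float] = None) -> bool:
    """Check that the link has not expired and that its signature matches."""
    now = time.time() if now is None else now
    parts = urlsplit(url)
    params = parse_qs(parts.query)
    try:
        expiry = int(params["expiry"][0])
        signature = params["sig"][0]
    except (KeyError, ValueError):
        return False
    if now > expiry:
        return False
    expected = _signature(parts.path, expiry)
    # Constant-time comparison on bytes avoids leaking how many characters match.
    return hmac.compare_digest(signature.encode(), expected.encode())
```

Because the signature covers both the path and the expiry time, a client cannot extend the lifetime of a link or reuse it for another tarball without invalidating it.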
This makes it possible to move costly operations related to downloading
and processing tarballs to Celery workers instead of having the deposit
server perform those tasks.
checker: Remove private API endpoint and do checks on celery worker
Checking deposit archives can be a costly operation as the checker
must download the archives to list their content.
It has been observed in production that if a large archive has been
uploaded with a deposit, requesting the check endpoint of the private
deposit API can result in the gunicorn worker being killed because the
time to download the archive exceeds the worker timeout.
So instead of having the private API endpoint perform the checks,
move these operations to the Celery worker executing the
check-deposit task.
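
The worker-side flow described above could look roughly like the following sketch, which uses only the Python standard library. It is an illustration under stated assumptions, not the checker implemented in this MR: the function names `list_archive_members` and `check_remote_archive` are hypothetical, and the "archive is not empty" rule stands in for the real validation rules, which are not described here.

```python
import os
import shutil
import tarfile
import tempfile
import urllib.request
import zipfile
from typing import List


def list_archive_members(path: str) -> List[str]:
    """Return the member names of a tar or zip archive stored at path."""
    if tarfile.is_tarfile(path):
        with tarfile.open(path) as tar:
            return tar.getnames()
    if zipfile.is_zipfile(path):
        with zipfile.ZipFile(path) as archive:
            return archive.namelist()
    raise ValueError(f"unsupported archive format: {path}")


def check_remote_archive(url: str) -> bool:
    """Download the archive at url to a temporary file and check its content.

    Hypothetical rule: the archive is accepted if it has at least one member.
    """
    with tempfile.TemporaryDirectory() as tmpdir:
        local_path = os.path.join(tmpdir, "archive")
        # Stream the response to disk so a large archive is never held in memory.
        with urllib.request.urlopen(url) as response, open(local_path, "wb") as out:
            shutil.copyfileobj(response, out)
        return len(list_archive_members(local_path)) > 0
```

Running this inside the task rather than in a web request means a slow download only delays the task and is no longer bound by the gunicorn worker timeout.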
Related to #4657 (closed).
Fixes #4658 (closed).
These changes have been plugged into:
- the updated deposit loader (swh-loader-core!542 (merged))
- the updated deposit tests on docker (docker!42 (closed)).
Activity
added 1 commit
- 2b72ed7a - api/private: Add endpoint to get download links of uploaded tarballs
Jenkins job DDEP/gitlab-builds #228 failed in 4 min 4 sec.
See Console Output, Blue Ocean and Coverage Report for more details.
Jenkins job DDEP/gitlab-builds #229 failed in 4 min 6 sec.
See Console Output, Blue Ocean and Coverage Report for more details.
added 1 commit
- 8f55fb8f - api/private: Add endpoint to get download links of uploaded tarballs
Jenkins job DDEP/gitlab-builds #230 succeeded in 2 min 43 sec.
See Console Output, Blue Ocean and Coverage Report for more details.
- Resolved by Antoine Lambert
added 1 commit
- 03dba0d8 - api/private: Add endpoint to get download links of uploaded tarballs
Jenkins job DDEP/gitlab-builds #231 succeeded in 2 min 47 sec.
See Console Output, Blue Ocean and Coverage Report for more details.
mentioned in merge request swh-loader-core!542 (merged)
Jenkins job DDEP/gitlab-builds #232 failed in 2 min 52 sec.
See Console Output, Blue Ocean and Coverage Report for more details.
Jenkins job DDEP/gitlab-builds #233 failed in 2 min 45 sec.
See Console Output, Blue Ocean and Coverage Report for more details.
added 1 commit
- 41f079a0 - checker: Remove private API endpoint and do checks on celery worker
Jenkins job DDEP/gitlab-builds #235 failed in 2 min 47 sec.
See Console Output, Blue Ocean and Coverage Report for more details.