Generate direct download links to efficiently fetch bundles
Currently, when a bundle cooked by the vault is downloaded, its raw bytes transit through all of the following components (quoting swh-objstorage!152 (comment 138383)):
- the objstorage backend
- the objstorage rpc server (wsgi app)
- the reverse proxy in front of the objstorage rpc server
- the objstorage rpc client
- the vault backend service
- the vault rpc server (wsgi app)
- the reverse proxy in front of the vault rpc server
- the vault rpc client
- the web frontend wsgi app
- the reverse proxy in front of the web frontend wsgi app
This is inefficient, and we are also seeing connection errors in production when clients attempt to download large bundles (swh-web#4744 (closed)).
Instead, we should take advantage of the built-in features of the vault's actual content storage backend (Azure Blob Storage) and redirect (not proxy) clients to a reasonably short-lived direct download URL generated with a shared access signature. That way, clients get the best available download speed and we stay out of the way.
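As a rough illustration of the redirect approach, here is a minimal sketch, assuming the `azure-storage-blob` Python SDK is available and that bundles are stored as blobs in a single container. The function names (`make_sas_token`, `direct_download_url`, `redirect_response`), the container and blob naming, and the choice of a 302 status are hypothetical and not taken from the vault code base.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import quote


def make_sas_token(account_name, account_key, container, blob_name, ttl):
    """Return a read-only SAS token that expires after `ttl` (a timedelta)."""
    # Imported lazily so the helpers below work without the Azure SDK installed.
    from azure.storage.blob import BlobSasPermissions, generate_blob_sas

    return generate_blob_sas(
        account_name=account_name,
        container_name=container,
        blob_name=blob_name,
        account_key=account_key,
        permission=BlobSasPermissions(read=True),  # download only
        expiry=datetime.now(timezone.utc) + ttl,   # short-lived link
    )


def direct_download_url(account_name, container, blob_name, sas_token):
    """Build the blob URL the client gets redirected to."""
    return (
        f"https://{account_name}.blob.core.windows.net/"
        f"{quote(container)}/{quote(blob_name)}?{sas_token}"
    )


def redirect_response(url):
    """Status line and headers a WSGI view could return instead of streaming bytes."""
    # no-store: the signed URL expires, so it should not be cached.
    return "302 Found", [("Location", url), ("Cache-Control", "no-store")]
```

With something like this, the web frontend would call `make_sas_token(..., ttl=timedelta(minutes=15))` (the 15-minute lifetime is only an example value), pass the result to `direct_download_url`, and answer the client with `redirect_response(url)`, so the bundle bytes flow from the blob storage to the client without passing through any of the hops listed above.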
Migrated from T885
Edited by Antoine Lambert