package/utils: Improve downloaded filename extraction
This diff improves filename extraction for a download URL.
Two specific cases are considered, each in a dedicated commit:
- `requests` follows URL redirections by default for GET requests, so the filename should be extracted from the target URL when a redirection has been performed.
This should fix that kind of Sentry-reported issue.
- Some download URLs do not contain any filename but provide it in the "content-disposition" response header instead. The filename is now extracted from that header when available, to avoid file processing issues afterwards.
This should fix the extraction of some tarballs downloaded by the opam loader, for instance:
anlambert@carnavalet:/tmp$ curl -i https://codeload.github.com/abella-prover/abella/tar.gz/v2.0.2
HTTP/2 200
access-control-allow-origin: https://render.githubusercontent.com
content-disposition: attachment; filename=abella-2.0.2.tar.gz
content-security-policy: default-src 'none'; style-src 'unsafe-inline'; sandbox
content-type: application/x-gzip
etag: "66393ca915087abb7e474f0d976918630ebb8d23250de2bd70bab0752c01708a"
strict-transport-security: max-age=31536000
vary: Authorization,Accept-Encoding,Origin
x-content-type-options: nosniff
x-frame-options: deny
x-xss-protection: 1; mode=block
date: Tue, 14 Sep 2021 11:51:16 GMT
x-github-request-id: CE86:E15A:45CCD8:573AE8:61408CB3
Warning: Binary output can mess up your terminal. Use "--output -" to tell
Warning: curl to output it to your terminal anyway, or consider "--output
Warning: <FILE>" to save to a file.
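
For illustration only, the two cases could be sketched as below. This is not the code from the diff: the function name `filename_from_response` and its signature are hypothetical, and it takes the final URL and the response headers as plain values (with `requests`, these would be `response.url`, which holds the URL after redirections, and `response.headers`).

```python
import posixpath
from email.message import Message
from urllib.parse import unquote, urlsplit


def filename_from_response(final_url, headers):
    """Hypothetical helper: guess a filename for a downloaded file.

    final_url: URL actually fetched, i.e. after any redirection.
    headers: mapping of response header names to values.
    """
    # Header names are case-insensitive, so normalize the keys first.
    lowered = {name.lower(): value for name, value in headers.items()}
    disposition = lowered.get("content-disposition")
    if disposition:
        # Let the stdlib email parser handle the header parameters.
        message = Message()
        message["content-disposition"] = disposition
        filename = message.get_filename()
        if filename:
            # Keep only the last path component of the advertised name.
            return posixpath.basename(filename)
    # Fallback: last path component of the final URL.
    return posixpath.basename(unquote(urlsplit(final_url).path))


url = "https://codeload.github.com/abella-prover/abella/tar.gz/v2.0.2"
headers = {"content-disposition": "attachment; filename=abella-2.0.2.tar.gz"}
print(filename_from_response(url, headers))
print(filename_from_response(url, {}))
```

With the header from the response above, the sketch yields `abella-2.0.2.tar.gz`; without it, the URL alone only gives `v2.0.2`, which carries no archive extension.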
Related to swh/infra/sysadm-environment#3468 (closed)
Migrated from D6252 (view on Phabricator)