Canonicalize GitLab URLs in the origin API

The origin API does not handle non-canonical GitLab URLs, i.e. URLs missing the ".git" suffix.

Example:

https://archive.softwareheritage.org/api/1/origin/https://gitlab.com/checkscale-gitlab/git-wtf.git/visit/latest/ >> found

https://archive.softwareheritage.org/api/1/origin/https://gitlab.com/checkscale-gitlab/git-wtf/visit/latest/ >> not found

It would be useful to canonicalize these URLs so that more lookups match.
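To make the idea concrete, here is a minimal sketch of what such a canonicalization could look like. It assumes that the ".git" form is the canonical one (as in the example above) and that only `gitlab.com` is concerned; the function names `canonicalize_gitlab_url` and `latest_visit_endpoint` are hypothetical and are not part of the actual origin API code.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: only this host is treated as GitLab; self-hosted instances are ignored.
GITLAB_HOSTS = {"gitlab.com"}


def canonicalize_gitlab_url(url: str) -> str:
    """Append ".git" to a GitLab repository URL that lacks it (hypothetical helper)."""
    parts = urlsplit(url)
    if parts.netloc.lower() not in GITLAB_HOSTS:
        # Leave URLs from other hosts untouched.
        return url
    path = parts.path.rstrip("/")
    if not path.endswith(".git"):
        path += ".git"
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))


def latest_visit_endpoint(origin_url: str) -> str:
    """Build the lookup URL in the same shape as the example in this issue."""
    canonical = canonicalize_gitlab_url(origin_url)
    return f"https://archive.softwareheritage.org/api/1/origin/{canonical}/visit/latest/"


print(latest_visit_endpoint("https://gitlab.com/checkscale-gitlab/git-wtf"))
```

With this sketch, both forms of the URL from the example resolve to the same lookup, the one that is currently found.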

For example, when using the UpdateSWH browser extension (https://www.softwareheritage.org/browser-extensions/), GitLab repositories are initially marked as not archived yet (gray tab). If you then use "Save Code Now" via the browser extension, the repository is archived under its non-canonical URL, and from then on it is recognized as archived (green tab).


Migrated from T4369 (view on Phabricator)
