loader: Add support for submodule discovery
The git loader can now discover submodules while loading a repository.
The process works as follows:
- Before sending a new directory to archive in the storage, check if it has a `.gitmodules` file in its entries and, if so, add the tuple (directory_id, content_sha1git) to a global set.
- During the post_load operation, process each discovered `.gitmodules` file the following way:
  - retrieve content metadata to get the sha1 checksum of the file
  - retrieve the `.gitmodules` content bytes in objstorage from the sha1
  - parse the `.gitmodules` file content
  - for each submodule definition:
    - get the git commit id associated to the submodule path
    - check if the git commit has been archived by SWH
    - if not, add the submodule repository URL to a set
  - for each submodule detected as not archived or partially archived, create a one-shot git loading task with high priority in the scheduler database
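
The parsing and filtering steps above can be sketched roughly as follows. This is a minimal illustration, not the code from this diff: it parses `.gitmodules` with the standard library `configparser` (the loader may well use a different parser), and the helpers `has_gitmodules`, `parse_gitmodules`, `unarchived_submodule_urls`, `commit_for_path` and `is_archived` are hypothetical names standing in for the storage and objstorage lookups. Scheduler task creation is left out.

```python
import configparser
from typing import Callable, Dict, Iterable, Optional, Set


def has_gitmodules(entry_names: Iterable[bytes]) -> bool:
    """Return True if a directory's entry names include a .gitmodules file."""
    return b".gitmodules" in set(entry_names)


def parse_gitmodules(content: bytes) -> Dict[str, str]:
    """Map each submodule path to its repository URL.

    Assumption: the file is UTF-8 and simple enough for configparser.
    Lines are stripped first so that git's tab-indented keys are never
    mistaken for continuation lines.
    """
    text = content.decode("utf-8")
    normalized = "\n".join(line.strip() for line in text.splitlines())
    # Disable interpolation so that '%' in URLs is kept verbatim.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(normalized)
    submodules: Dict[str, str] = {}
    for section in parser.sections():
        if not section.startswith("submodule "):
            continue
        if parser.has_option(section, "path") and parser.has_option(section, "url"):
            submodules[parser.get(section, "path")] = parser.get(section, "url")
    return submodules


def unarchived_submodule_urls(
    gitmodules: bytes,
    commit_for_path: Callable[[str], Optional[str]],  # hypothetical lookup
    is_archived: Callable[[str], bool],  # hypothetical archive check
) -> Set[str]:
    """Collect URLs of submodules whose pinned commit is not archived."""
    urls: Set[str] = set()
    for path, url in parse_gitmodules(gitmodules).items():
        commit_id = commit_for_path(path)
        # Assumption: a path with no associated commit is skipped.
        if commit_id is not None and not is_archived(commit_id):
            urls.add(url)
    return urls
```

Each URL in the returned set would then be a candidate for a one-shot, high-priority git loading task.
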
Related to #3311 Related to #3923
Migrated from D7332 (view on Phabricator)