loader: Add support for submodule discovery

Antoine Lambert requested to merge generated-differential-D7332-source into master

The git loader can now discover submodules while loading a repository.

The process works as follows:

  1. Before sending a new directory to the storage for archiving, check whether its entries include a .gitmodules file. If so, add the tuple (directory_id, content_sha1git) to a global set.

  2. During the post_load operation, process each discovered .gitmodules file as follows:

    • retrieve the content metadata to get the sha1 checksum of the file

    • retrieve the .gitmodules content bytes from the objstorage using that sha1

    • parse the .gitmodules file content

    • for each submodule definition:

      • get the git commit id associated with the submodule path

      • check whether that git commit has been archived by SWH

      • if not, add the submodule repository URL to a set

    • for each submodule detected as not archived or only partially archived, create a one-shot git loading task with high priority in the scheduler database
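
The steps above can be illustrated with a minimal, self-contained sketch. It is not the code from this merge request: `record_gitmodules`, `parse_gitmodules`, `unarchived_submodule_urls`, and the `commit_for_path` and `is_archived` callbacks are hypothetical names standing in for the loader, storage, and objstorage lookups. The hand-rolled parser only covers the common `[submodule "name"]` layout, and skipping submodules with no resolvable commit id is an assumption made for the example.

```python
import re
from typing import Callable, Dict, Iterable, Optional, Set, Tuple

# Matches section headers such as: [submodule "vendor/libfoo"]
SECTION_RE = re.compile(r'^\[submodule\s+"(?P<name>.+)"\]$')


def record_gitmodules(
    directory_id: bytes,
    entries: Iterable[Tuple[bytes, bytes]],
    discovered: Set[Tuple[bytes, bytes]],
) -> None:
    """Step 1 (hypothetical helper): remember directories holding a .gitmodules.

    `entries` yields (name, target_sha1git) pairs for one directory.
    """
    for name, target in entries:
        if name == b".gitmodules":
            discovered.add((directory_id, target))


def parse_gitmodules(data: bytes) -> Dict[str, Dict[str, str]]:
    """Parse .gitmodules bytes into {submodule name: {key: value}}."""
    submodules: Dict[str, Dict[str, str]] = {}
    current: Optional[Dict[str, str]] = None
    for raw in data.decode("utf-8", errors="replace").splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue  # blank line or comment
        match = SECTION_RE.match(line)
        if match:
            current = submodules.setdefault(match.group("name"), {})
        elif line.startswith("["):
            current = None  # some other section: ignore its keys
        elif current is not None and "=" in line:
            key, _, value = line.partition("=")
            current[key.strip().lower()] = value.strip()
    return submodules


def unarchived_submodule_urls(
    submodules: Dict[str, Dict[str, str]],
    commit_for_path: Callable[[str], Optional[str]],
    is_archived: Callable[[str], bool],
) -> Set[str]:
    """Step 2 (hypothetical helper): collect URLs whose pinned commit is missing.

    `commit_for_path` maps a submodule path to the commit id recorded in the
    directory; `is_archived` stands in for a lookup in the archive.
    """
    urls: Set[str] = set()
    for config in submodules.values():
        path, url = config.get("path"), config.get("url")
        if not path or not url:
            continue  # incomplete definition
        commit_id = commit_for_path(path)
        if commit_id is None:
            continue  # assumption: nothing to check without a commit id
        if not is_archived(commit_id):
            urls.add(url)
    return urls
```

In this sketch each URL in the returned set would then be handed to the scheduler as a one-shot, high-priority git loading task. Collecting URLs in a set means a submodule repository referenced from several directories is only scheduled once.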

Related to #3311 and #3923

Migrated from D7332 (view on Phabricator)
