
loader git: enable global deduplication of head branches before fetching them

This task tracks the effort to (re-)enable global deduplication of revisions in the git loader, so that less data is downloaded from upstreams (and needlessly converted by workers).

  • first enabling partial global deduplication through extid mappings for snapshot heads (for which we know that we have done a complete load of the history): #3635
  • then surveying the opportunity of "just" doing a global lookup for any object type: swh/devel/experiments/swh-db-audit#3656 (moved), and #3654 to avoid creating new "history holes"
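
As a rough illustration of the first step, the filtering of snapshot heads before a fetch could be sketched as follows. This is a minimal sketch, not the loader's actual code: `heads_to_fetch` and `known_heads` are hypothetical names, and the in-memory set stands in for whatever extid lookup the archive provides for heads whose history has been completely loaded.

```python
from typing import Dict, Set


def heads_to_fetch(
    remote_refs: Dict[bytes, bytes], known_heads: Set[bytes]
) -> Dict[bytes, bytes]:
    """Keep only the refs whose target object is not already known.

    remote_refs maps ref names to git object ids advertised by the upstream.
    known_heads is a hypothetical stand-in for the result of an extid lookup:
    ids of heads for which a complete load of the history was already done.
    """
    return {
        ref: sha for ref, sha in remote_refs.items() if sha not in known_heads
    }


# Example data (made up for illustration): two refs point at an already
# loaded head, one ref points at an unknown object.
loaded = bytes.fromhex("aa" * 20)
unknown = bytes.fromhex("bb" * 20)

remote_refs = {
    b"refs/heads/main": loaded,
    b"refs/heads/dev": unknown,
    b"refs/tags/v1.0": loaded,
}
known_heads = {loaded}

wanted = heads_to_fetch(remote_refs, known_heads)
# Only refs/heads/dev still needs to be requested from the upstream.
```

Restricting the lookup to heads with a complete history is what keeps this partial deduplication safe: skipping an object whose ancestors were never loaded is the kind of "history hole" that the second step has to guard against.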

Migrated from T3655 (view on Phabricator)
