Use "fork" relationships to speed up initial load of large repositories

(I'm writing this task just so that I don't forget the idea, but I don't expect it to be actionable in the short term)

To work incrementally, VCS loaders fetch the last snapshot of the origin. That snapshot gives them a set of "heads" which they can pass to the origin server, so that the server can detect which revisions it doesn't need to send.
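As a rough illustration of why known heads help, here is a toy model of that negotiation. It is only a sketch: the revision graph is a plain dict, and the names `ancestors` and `revisions_to_send` are made up for this example rather than taken from any loader or VCS protocol implementation.

```python
from typing import Dict, Iterable, List, Set

# Toy revision graph: each revision id maps to the list of its parent ids.
Graph = Dict[str, List[str]]


def ancestors(graph: Graph, tips: Iterable[str]) -> Set[str]:
    """Return the given tips plus everything reachable through parent links.

    Tips the server does not know about are ignored.
    """
    seen: Set[str] = set()
    stack = [tip for tip in tips if tip in graph]
    while stack:
        rev = stack.pop()
        if rev in seen:
            continue
        seen.add(rev)
        stack.extend(graph.get(rev, []))
    return seen


def revisions_to_send(graph: Graph, wants: Iterable[str], haves: Iterable[str]) -> Set[str]:
    """Revisions the server must send: reachable from `wants` but not from `haves`."""
    return ancestors(graph, wants) - ancestors(graph, haves)


# Linear history a <- b <- c <- d, for illustration only.
history: Graph = {"a": [], "b": ["a"], "c": ["b"], "d": ["c"]}
incremental = revisions_to_send(history, wants=["d"], haves=["b"])
first_visit = revisions_to_send(history, wants=["d"], haves=[])
```

With heads from a previous snapshot, only the revisions added since then are transferred; with no heads, the whole history is.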

Unfortunately, when someone forks a large repository (such as https://github.com/chromium/chromium) and we see the fork for the first time, we don't have that snapshot. The server therefore needs to send all revisions, and we then discard almost all of them because they are already in the archive.

However, if we could detect that new repositories are forks (from extrinsic metadata, from heuristics based on repository names, ...), we could fetch the snapshots of the original repositories and use them as the base to load the forks incrementally.
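A minimal sketch of what picking the base could look like, assuming we already have some mapping from a fork to its parent. Everything here is hypothetical: `base_heads_for`, the `snapshots` dict and the `fork_parents` dict stand in for whatever the archive and the fork-detection step would actually provide, and the URLs are placeholders.

```python
from typing import Dict, Optional, Set


def base_heads_for(
    origin: str,
    snapshots: Dict[str, Set[str]],
    fork_parents: Dict[str, str],
) -> Set[str]:
    """Return the heads to use as the base when loading `origin`.

    Uses the origin's own last snapshot if there is one; otherwise walks up
    the (hypothetical) fork chain until an origin with a snapshot is found.
    Returns an empty set when nothing usable is known (full load).
    """
    visited: Set[str] = set()
    current: Optional[str] = origin
    while current is not None and current not in visited:
        visited.add(current)  # guards against cycles in the fork metadata
        heads = snapshots.get(current)
        if heads:
            return set(heads)
        current = fork_parents.get(current)
    return set()


# Placeholder data for illustration only.
snapshots = {"https://example.org/upstream/big": {"rev1", "rev2"}}
fork_parents = {"https://example.org/alice/big": "https://example.org/upstream/big"}

fork_base = base_heads_for("https://example.org/alice/big", snapshots, fork_parents)
unknown_base = base_heads_for("https://example.org/bob/other", snapshots, fork_parents)
```

Falling back to an empty set keeps the current behaviour for origins that are not detected as forks, so a wrong or missing fork relationship would only cost the optimization, not correctness.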

Migrated from T3273 (view on Phabricator)
