git loader OOM when loading huge repository
Consistently [1] unable to ingest some repositories on staging:
swhworker@worker1:~$ time SWH_CONFIG_FILENAME=/etc/softwareheritage/loader_git.yml swh loader run git https://github.com/NixOS/nixpkgs.git
INFO:swh.core.config:Loading config file /etc/softwareheritage/global.ini
INFO:swh.core.config:Loading config file /etc/softwareheritage/loader_git.yml
Enumerating objects: 1151, done.
Counting objects: 100% (1151/1151), done.
Compressing objects: 100% (475/475), done.
Total 2367234 (delta 844), reused 697 (delta 671), pack-reused 2366083
INFO:swh.loader.git.BulkLoader:Listed 70404 refs for repo https://github.com/NixOS/nixpkgs.git
Killed
real 57m16.787s
user 50m33.560s
sys 0m40.689s
Note: This leaves a lingering origin visit with status ongoing (which makes swh-storage#2372 particularly relevant).
machine (worker1.internal.staging.swh.network):
- 4 cores
- 16 GiB RAM
- no swap (our prod node has some, though) [2]
Nothing else runs there (the other loader services are stopped).
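To put a number on how much memory the load needs before it gets killed, the command can be wrapped in a small helper that reports the child's peak resident set size. This is only a minimal sketch, assuming Linux (where `ru_maxrss` is reported in kibibytes); the helper name `run_and_report_peak_rss` is hypothetical and not part of swh.

```python
import resource
import subprocess


def run_and_report_peak_rss(cmd):
    """Run cmd to completion and return (exit_code, peak_rss_kib).

    Hypothetical helper, not part of swh. Assumes Linux, where
    ru_maxrss is expressed in kibibytes.
    """
    proc = subprocess.run(cmd)
    # RUSAGE_CHILDREN covers children that have been waited for;
    # ru_maxrss is the peak RSS of the largest one among them.
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    # A negative exit code means the child died from a signal;
    # -9 is SIGKILL, which is what the kernel OOM killer sends.
    return proc.returncode, usage.ru_maxrss


def format_report(exit_code, peak_rss_kib):
    """Render the measurement as a one-line summary in GiB."""
    peak_gib = peak_rss_kib / (1024 * 1024)
    return f"exit code: {exit_code}, peak RSS: {peak_gib:.2f} GiB"
```

For the case above, one would call `run_and_report_peak_rss(["swh", "loader", "run", "git", "https://github.com/NixOS/nixpkgs.git"])` with `SWH_CONFIG_FILENAME` exported in the environment, since the child process inherits it. An exit code of -9 together with a peak close to the 16 GiB of the machine would be consistent with an OOM kill.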
-
[1] https://grafana.softwareheritage.org/d/q6c3_H0iz/system-overview?orgId=1&var-instance=worker1.internal.staging.swh.network&from=1587553104919&to=1587561781178 (both peaks in memory usage are load attempts)
[2] I will add some swap to that node to check whether the load gets further with it.
Migrated from T2373 (view on Phabricator)