nixguix: Loader never finishes, stuck in a retry loop until it hits a memory error
As $637 shows, this loader does not finish. A priori, this problem should affect all loaders. It is just more apparent for this one because it loads many different sources in one go.
It's an issue with one of the proxy storages, the buffer one, which fails to flush its contents to the storage because of one of the real hash collisions referenced in $619.
It is then stuck in a loop: it retries content_add, fails, moves on to the next artifacts and adds more contents to the buffer.
The next content_add to the storage fails as well, because the problematic content is still in the buffer.
Memory usage keeps growing [3] until a fatal memory error occurs.
The process is then OOM-reaped [4].
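The failure mode described above can be reproduced with a toy model. This is only a sketch: `ToyBackend`, `ToyBuffer` and this local `HashCollision` are hypothetical stand-ins, not the actual storage classes, and the flush threshold is arbitrary. It assumes the buffer is only emptied after a successful write, which is what the symptoms suggest.

```python
class HashCollision(Exception):
    """Raised by the toy backend when a batch contains a colliding content."""


class ToyBackend:
    """Hypothetical stand-in for the real storage: rejects a whole batch
    as soon as it contains a colliding content."""

    def __init__(self, colliding):
        self.colliding = set(colliding)
        self.stored = []

    def content_add(self, contents):
        if any(c in self.colliding for c in contents):
            raise HashCollision("collision in batch")
        self.stored.extend(contents)


class ToyBuffer:
    """Hypothetical stand-in for the buffer proxy: queues contents and
    flushes them in batches once a threshold is reached."""

    def __init__(self, backend, threshold=2):
        self.backend = backend
        self.threshold = threshold
        self.pending = []

    def content_add(self, contents):
        self.pending.extend(contents)
        if len(self.pending) >= self.threshold:
            self.flush()

    def flush(self):
        # Assumption: the buffer is only emptied after a successful write,
        # so a failing batch stays queued and is resent on every later flush.
        self.backend.content_add(self.pending)
        self.pending = []


backend = ToyBackend(colliding={"bad"})
buffer = ToyBuffer(backend)
sizes = []
for artifact in (["a", "bad"], ["b", "c"], ["d", "e"]):
    try:
        buffer.content_add(artifact)
    except HashCollision:
        pass  # the loader moves on to the next artifact
    sizes.append(len(buffer.pending))

print(sizes)           # pending contents after each artifact
print(backend.stored)  # what actually reached the backend
```

In this model a single colliding content blocks every later flush, so the pending list grows with each artifact and nothing is ever written.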
Possible workarounds/fixes include:
- drop the buffer proxy storage from the configuration (that could be used as a test to ensure the loader does indeed finish)
- make the proxy storage (one of retry/buffer) exclude the colliding hash from the transaction (similar to what is currently implemented in the journal [1])
- deal properly with the hash collision in question
- exclude the sources including the hash collision
- allow the current buffer proxy storage to be cleared in between failed add operations.
Right now, heading for option 2 (excluding the colliding hash from the transaction), as the solution for option 3 (dealing properly with the collision) is still a pending question [2].
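As a rough illustration of the "exclude the colliding hash from the transaction" idea, here is a minimal sketch. All names (`toy_backend_add`, `add_excluding_collisions`, this local `HashCollision`) are hypothetical, and it assumes the collision error reports which contents collided. It is not the implementation referenced in [1].

```python
class HashCollision(Exception):
    """Toy exception carrying the keys of the colliding contents."""

    def __init__(self, colliding):
        super().__init__(f"collision on {sorted(colliding)}")
        self.colliding = set(colliding)


def toy_backend_add(batch, stored, known_collisions):
    """Hypothetical backend write: all-or-nothing, like a transaction."""
    clashing = known_collisions.intersection(batch)
    if clashing:
        raise HashCollision(clashing)
    stored.extend(batch)


def add_excluding_collisions(batch, stored, known_collisions):
    """Write the batch, dropping colliding contents instead of resending them.

    Returns the skipped contents so that the caller can log them.
    """
    skipped = []
    remaining = list(batch)
    while remaining:
        try:
            toy_backend_add(remaining, stored, known_collisions)
            break
        except HashCollision as exc:
            # Each failure removes at least one content, so the loop ends.
            skipped.extend(c for c in remaining if c in exc.colliding)
            remaining = [c for c in remaining if c not in exc.colliding]
    return skipped


stored = []
skipped = add_excluding_collisions(["a", "bad", "b"], stored, {"bad"})
print(stored, skipped)
```

The non-colliding contents are written and the batch is consumed, so the buffer can be emptied. The skipped contents still need to be handled separately, which is the open question in option 3.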
- [1] rDJNL3c0e491352934c67f1d92d1302760a32a333edee
- [2] #2332
- [4]
[Tue Apr 7 03:34:46 2020] Memory cgroup out of memory: Kill process 16402 (python3) score 996 or sacrifice child
[Tue Apr 7 03:34:46 2020] Killed process 16402 (python3) total-vm:14862676kB, anon-rss:14686960kB, file-rss:9920kB, shmem-rss:8kB
[Tue Apr 7 03:34:47 2020] oom_reaper: reaped process 16402 (python3), now anon-rss:0kB, file-rss:0kB, shmem-rss:8kB
Migrated from T2352 (view on Phabricator)