The following output has been filtered and manually edited to:
- remove the most prominent occurrence of 404 errors (origins that no longer exist)
- aggregate nearly identical output
The 'full' output is at $315 (it is derived from $314 for orchestration):
{
  "googlecode": {
    "total": 3933,
    "errors": {
      "...Reason: 404": 2034, // [1]
      "e 51, in author\n name = data\nKeyError: 'author'": 1409, // [2]
      "672//: timed out.\nTrying again in x... seconds...\n": 199, // aggregated manually
      " f.write(chunk)\nOSError: Cannot allocate memory": 31, // [3]
      "meout\nSSL connection has been closed unexpectedly\n": 22, // [4]
      "CONNRESET\\')\",)', OSError(\"(104, 'ECONNRESET')\",))": 16, // [4]
      "bb3891900f0f86c3c3bcf136fd8eb9a96b4e9a1f5e782287bb": 58, // [5] aggregated manually
      "eError: Error when checking size: 166344 != 166353": 1, // [5]
      "xxxxxx-x.x.x-x.xxx.xxx is not a supported archive.": 18, // [6] aggregated manually
      "server\nERROR: pgbouncer cannot connect to server\n": 15, // aggregated manually
      "te 0xd4 in position 219: invalid continuation byte": 17, // aggregated manually
      "code byte 0xa3 in position 236: invalid start byte": 13, // aggregated manually
      ") got an unexpected keyword argument 'back_compat'": 13,
      "r.gz blocked. Illegal path to directory metrique-.": 4, // aggregated manually
      ">nginx/1.10.3</center>\\r\\n</body>\\r\\n</html>\\r\\n')": 11,
      "\nTypeError: 'NoneType' object is not subscriptable": 9,
      ", commands ignored until end of transaction block\n": 7,
      "0.3</center>\\\\r\\\\n</body>\\\\r\\\\n</html>\\\\r\\\\n\\')',)": 7,
      "PermissionError(13, 'Permission denied')": 6,
      "SQL function swh_revision_add() line 3 at PERFORM\n": 6,
      "nectionResetError(104, 'Connection reset by peer')": 5,
      "del dict representing a person.\nKeyError: 'author'": 3,
      "OSError(timeout('timed out',),)": 3,
      "wError: timestamp out of range for platform time_t": 3,
      " dst.write(buf)\nOSError: Cannot allocate memory": 2,
      "elf._length+read)\nOSError: Cannot allocate memory": 2,
      "ection aborted.', OSError(\"(104, 'ECONNRESET')\",))": 2,
      "ocgtk-26460/j2/1.2.1/uncompress/j2-1.2.1/setup.py'": 1,
      "path to file /Users/temek/Downloads/._cronq-0.17.1": 1,
      "batch/0.1.5/uncompress/svgbatch-0.1.5/LICENSE.txt'": 1,
      "ang1_ku7-0.1.0/臺灣言語工具/資料佮語料匯入整合/教育部臺灣\\udce9\\udc96'": 1,
      "nection: Temporary failure in name resolution',))": 1,
      "725/pyjack/0.3.2/uncompress/pyjack-0.3.2/PKG-INFO'": 1,
      "96kw-8592/cclib/1.0/uncompress/cclib-1.0/ANNOUNCE'": 1,
      "2 for file 'yaxl-0.0.16/docs/dist/yaxl-0.0.16.zip'": 1,
      "tdir(dir_path)\nIndexError: list index out of range": 1,
      " No such file or directory: '/tmp/swh.loader.pypi'": 1,
      "colorama/0.1.8/uncompress/colorama-0.1.8/PKG-INFO'": 1,
      "'Worker exited prematurely: signal 9 (SIGKILL).',)": 1,
      "TimeoutError(110, 'Connection timed out')": 1,
    }
  }
}
[1] These are origins that were removed between the pypi listing and
the scheduling of the pypi loading.
[2] This is the first issue we hit early on, when we discovered that
some projects were missing author information (already fixed,
swh/devel/swh-loader-core#1206 (closed)). As these are the most
prominent occurrences, they will be scheduled back asap.
[3] Possibly caused by running too many services simultaneously on
the same vm. We should cross-check, for example, the loader-git
around the same time, as it possibly has the same errors.
[4] That's possibly an outage caused by updating the storage server.
It is marked on both lines as it could happen at different points in
time during the loading.
[5] These happen when an error is detected by the loader's client
(swh.loader.pypi.client) after the release artifact download. This is
a sign that this part should be improved to retry the download
multiple times.
[6] Visibly, the archive support needs to be improved to deal with
some more formats (rpm, etc.).
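Regarding [5], retrying the download could look roughly like the sketch below. This is not the actual swh.loader.pypi.client code: `download_with_retry`, `SizeMismatch` and the injected `fetch` callable are hypothetical names used for illustration, and only the size check is covered (a hash check would follow the same pattern).

```python
import time
from typing import Callable


class SizeMismatch(ValueError):
    """Raised when the downloaded artifact does not have the expected size."""


def download_with_retry(
    fetch: Callable[[str], bytes],  # hypothetical: returns the artifact bytes
    url: str,
    expected_size: int,
    attempts: int = 3,
    delay: float = 1.0,
) -> bytes:
    """Fetch `url`, retrying when the transfer fails or the size is wrong."""
    last_error: Exception = RuntimeError("no attempt was made")
    for attempt in range(1, attempts + 1):
        try:
            data = fetch(url)
            if len(data) != expected_size:
                raise SizeMismatch(
                    f"Error when checking size: {len(data)} != {expected_size}"
                )
            return data
        except (OSError, SizeMismatch) as exc:
            last_error = exc
            if attempt < attempts:
                # Linear backoff: wait a bit longer after each failed attempt.
                time.sleep(delay * attempt)
    # Every attempt failed: surface the last error to the caller.
    raise last_error
```

Injecting `fetch` keeps the sketch independent of any HTTP library; in the loader it would wrap whatever call currently downloads the release artifact.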
The remaining issues are most probably either:
- an error on our side (pgbouncer, worker lost error, etc.), for which
a simple rescheduling could be enough
- a current limitation in the loader that needs fixing
Either way, they will need further analysis and dedicated tasks.
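As a starting point for that analysis, the aggregated errors could be split into these two groups with a small script. The sketch below is only illustrative: `TRANSIENT_MARKERS` and `triage` are names made up for this example, and the markers are hand-picked from the output above on the assumption that they point to transient problems on our side.

```python
from typing import Dict, Tuple

# Hand-picked substrings hinting at an infrastructure problem on our side
# (assumption: these are transient, so a plain rescheduling may be enough).
TRANSIENT_MARKERS = (
    "pgbouncer",
    "Worker exited prematurely",
    "ECONNRESET",
    "Connection reset by peer",
    "timed out",
    "Cannot allocate memory",
    "Temporary failure in name resolution",
)


def triage(errors: Dict[str, int]) -> Tuple[Dict[str, int], Dict[str, int]]:
    """Split error counts into (reschedule, needs_analysis)."""
    reschedule: Dict[str, int] = {}
    needs_analysis: Dict[str, int] = {}
    for message, count in errors.items():
        if any(marker in message for marker in TRANSIENT_MARKERS):
            reschedule[message] = count
        else:
            needs_analysis[message] = count
    return reschedule, needs_analysis


# A few entries copied from the output above.
sample = {
    "server\nERROR: pgbouncer cannot connect to server\n": 15,
    "'Worker exited prematurely: signal 9 (SIGKILL).',)": 1,
    "xxxxxx-x.x.x-x.xxx.xxx is not a supported archive.": 18,
}
reschedule, needs_analysis = triage(sample)
print(sum(reschedule.values()), "occurrences look transient")
print(sum(needs_analysis.values()), "occurrences need a closer look")
```

Substring matching is crude, so anything landing in the second group still needs a manual look before a task is opened for it.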
In any case, for now, as I said in [2], we will first schedule back
those 1409 origins in error.