Dear support,today I saved my open-source project CSTNU(https://profs.scienze.univr.it/posenato/svn/sw/CSTNU) in SoftwareHeritage.The process was successful and ended at 10/20/2021, 12:08:10 PM.After few minutes, I need to save again the repository and, therefore, I madeanother 'Save' request.The request was scheduled at 10/20/2021, 12:12:28 PM but, then. failed.Log:2021-10-20 11:16:23,942: INFO/ForkPoolWorker-1] Taskswh.loader.svn.tasks.DumpMountAndLoadSvnRepository[709297c6-4de6-4a16-b7bb-a242c9ca8234]succeeded in 235.8857956090942s: {'status': 'failed'}Then, I verified that my SVN server was running correctly.I found no problem.Then, I tried again a save action, but I always received a 'failed' status.The last try was schedule at 10/20/2021, 12:12:28 PM and there is no log, justfailed status.Is it possible to know which is the error?
svn external properties on the origin. That will get detected by the loader and
raise.
history tampering happened on the origin after the first loading. The loader detects
it and raise such issue and stops.
is still #611, should be taken care of at some point by #3692
For this one, i had a thought about it recently and it should be enough to trigger a
run from scratch (loader has the flag for it which is tested but got never used).
So i've had a look in the end [1], it's indeed the history altered issue (2.).
Well, that and one other issue regarding no long disk space. Independently from the out of disk issue, subsequent visits fails on history alrered detection issue.
"@timestamp",message,priority,hostname"[""2021-10-20T11:16:23.942Z"",""2021-10-20T11:16:23.942Z""]","[2021-10-20 11:16:23,942: INFO/ForkPoolWorker-1] Task swh.loader.svn.tasks.DumpMountAndLoadSvnRepository[709297c6-4de6-4a16-b7bb-a242c9ca8234] succeeded in 235.8857956090942s: {'status': 'failed'}",6,worker15"[""2021-10-20T11:16:22.823Z"",""2021-10-20T11:16:22.823Z""]","[2021-10-20 11:16:22,818: ERROR/ForkPoolWorker-1] Loading failure, updating to `failed` statusTraceback (most recent call last): File ""/usr/lib/python3/dist-packages/swh/loader/core/loader.py"", line 339, in load self.store_data() File ""/usr/lib/python3/dist-packages/swh/loader/svn/loader.py"", line 489, in store_data revision=self._last_revision, snapshot=self._snapshot File ""/usr/lib/python3/dist-packages/swh/loader/svn/loader.py"", line 525, in generate_and_load_snapshot ""generate_and_load_snapshot called with null revision and snapshot!""ValueError: generate_and_load_snapshot called with null revision and snapshot!",3,worker15"[""2021-10-20T11:16:22.818Z"",""2021-10-20T11:16:22.818Z""]","[2021-10-20 11:16:22,817: ERROR/ForkPoolWorker-1] History of svn file:///tmp/tmpz9rde58m/swh.loader.svn.k4hkq8ka-1239536/tmpt4qtuquh@607 altered. Skipping...",3,worker15"[""2021-10-20T11:12:28.213Z"",""2021-10-20T11:12:28.213Z""]","[2021-10-20 11:12:28,213: INFO/ForkPoolWorker-1] Load origin 'https://profs.scienze.univr.it/posenato/svn/sw/CSTNU' with type 'svn'",6,worker15"[""2021-10-20T11:08:10.088Z"",""2021-10-20T11:08:10.088Z""]","[2021-10-20 11:08:10,087: INFO/ForkPoolWorker-1] Task swh.loader.svn.tasks.DumpMountAndLoadSvnRepository[39bd536d-fc0e-4d06-a229-46bba2a033d2] succeeded in 268.00870383530855s: {'status': 'eventful'}",6,worker06"[""2021-10-20T11:07:01.963Z"",""2021-10-20T11:07:01.963Z""]","[2021-10-20 11:07:01,962: INFO/ForkPoolWorker-1] Processing revisions [566-607] for {'swh-origin': 'https://profs.scienze.univr.it/posenato/svn/sw/CSTNU', 'remote_url': 'file:///tmp/tmpbdl0vo4y/swh.loader.svn.9l3j76ga-347873/tmpb3_qh849', 'local_url': b'/tmp/swh.loader.svn.53habc1i-347873/tmpb3_qh849', 'uuid': b'782453a1-1937-45d1-8845-2a6fcc2839b7'}",6,worker06"[""2021-10-20T11:03:43.090Z"",""2021-10-20T11:03:43.090Z""]","[2021-10-20 11:03:43,090: INFO/ForkPoolWorker-1] Load origin 'https://profs.scienze.univr.it/posenato/svn/sw/CSTNU' with type 'svn'",6,worker06
For this one, i had a thought about it recently and it should be enough to trigger a
run from scratch (loader has the flag for it which is tested but got never used).
I tried and it failed.
So a fix is probably in order first.
That'd be #3695.
@posenato, I did some deeper analysis on why your svn repository could not be loaded anymore into the archive
and it turned out @ardumont is right, the commit log of your repository got altered between two save code now requests.
The image below show the commit log of your repository as currently archived by SWH.
The image below show the commit log of your repository loaded in our SWH docker environment used to develop and test new features.
I have highlighted the commit that got altered between the two loads, as you can see the commit message was initially Revision 4.3
but it was later modified to Preparing Revision 4.3.
As a consequence, the revision hashes differ between the two loads as commit message is used to generate revision identifier.
Because the subversion loader works in an incremental way to avoid reprocessing the full commit log in each loading, it first checks
if svn repository state is consistent as it was during previous loadings. If a discrepancy is detected, the loading is aborted.
So the only way to make the loading works again in production archive is to do a full reload of the repository.
I guess we could trigger that from our side manually until we decided a policy about such edge cases (but we
need to wait for @ardumont being back from vacation first).
@anlambert, thank you for excellent analysis!
Yes, I made an adjustment of commit logs because there was an error during the release and I discover it very soon, so I preferred to make the release again adjusting the comments instead of to make a new release!
I never thought that such a log modification can have a such important impact.
Lesson learned.
@posenato Hello, i was gonna check this morning and ping you to let you know.
I never thought that such a log modification can have a such important impact. Lesson
learned.
On both side! :) The awesome outcome is now we have improved the loader svn on this
current limitation. So thank you as well for asking the question in the first place.