rlog: fix loading of CVS commits which have a commit ID
The CVS commit ID is an optional attribute which is only generated by relatively recent releases of CVS clients. Our rlog parser was skipping such commits because it failed to match on them due to an error in a regular expression. This resulted in an incomplete import of CVS revision history.
Here is a sample line from cvs rlog output which carries a commit ID and was not matched because the regex lacked the trailing semicolon:

    date: 2007-07-17 15:02:50 +0200; author: larsl; state: Exp; lines: +619 -285; commitid: oju0x8tTc9aUB7qs;
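For illustration, here is a minimal sketch of a pattern that accepts the revision info line both with and without a commit ID. It is not the regex used in the loader's rlog parser; the names `REVISION_INFO`, `parse_revision_info`, `WITH_COMMITID` and `WITHOUT_COMMITID` are hypothetical, and the variant without a commit ID is an assumed example of what a client that does not generate commit IDs might emit.

```python
import re

# Hypothetical pattern, written for this example only.
# The commitid field is optional, and when present it is followed
# by a semicolon, which the pattern has to consume explicitly.
REVISION_INFO = re.compile(
    r"^date: (?P<date>[^;]+);\s+"
    r"author: (?P<author>[^;]+);\s+"
    r"state: (?P<state>[^;]+);"
    r"(?:\s+lines: \+(?P<added>\d+) -(?P<removed>\d+);?)?"
    r"(?:\s+commitid: (?P<commitid>[^;]+);)?\s*$"
)


def parse_revision_info(line):
    """Return the named fields as a dict, or None if the line does not match."""
    match = REVISION_INFO.match(line)
    return match.groupdict() if match else None


# Sample line from the commit message above.
WITH_COMMITID = (
    "date: 2007-07-17 15:02:50 +0200; author: larsl; state: Exp; "
    "lines: +619 -285; commitid: oju0x8tTc9aUB7qs;"
)
# Assumed variant without a commit ID (not taken from real output).
WITHOUT_COMMITID = (
    "date: 2007-07-17 15:02:50 +0200; author: larsl; state: Exp; "
    "lines: +619 -285"
)

if __name__ == "__main__":
    print(parse_revision_info(WITH_COMMITID))
    print(parse_revision_info(WITHOUT_COMMITID))
```

Making the whole `commitid: ...;` group optional, rather than only its value, means that a line without the field still matches and a line with the field is not rejected because of the trailing semicolon.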
Found while testing ingestion of the GNU dino repository from cvs.sannah.gnu.org/sources/dino
Migrated from D6561 (view on Phabricator)
Build is green
Patch application report for D6561 (id=23835)
Rebasing onto 7f761b85...
Current branch diff-target is up to date.
Changes applied before test
commit 3c5e365fee4ae71c1a39111171ff1261d0c22eb6
Author: Stefan Sperling <stsp@stsp.name>
Date:   Wed Oct 27 12:20:05 2021 +0200

    rlog: fix loading of CVS commits which have a commit ID
See https://jenkins.softwareheritage.org/job/DLDCVS/job/tests-on-diff/36/ for more details.
Build is green
Patch application report for D6561 (id=23854)
Rebasing onto 0829dc33...
Current branch diff-target is up to date.
Changes applied before test
commit 509ac801df7440a95cdf9b4b3bc60af7cb5ac356
Author: Stefan Sperling <stsp@stsp.name>
Date:   Wed Oct 27 12:20:05 2021 +0200

    rlog: fix loading of CVS commits which have a commit ID
See https://jenkins.softwareheritage.org/job/DLDCVS/job/tests-on-diff/40/ for more details.