  1. Nov 23, 2021
  2. Nov 11, 2021
    • preserve empty lines in CVS log messages over pserver · 34f46486
      Stefan Sperling authored
      Empty lines sent by the CVS server in rlog output were being stripped
      by our custom cvs client implementation. Unfortunately, this resulted
      in empty lines being stripped from CVS log messages, which is fixed
      with this commit. The rsync access method already preserved log
      messages properly, and now the pserver access method does the same.
  3. Nov 09, 2021
    • add CVS commit ID support to rlog.py · f5b974a0
      Stefan Sperling authored
      Newer CVS clients tag commits with a commit ID which allows us to
      correctly convert commits which changed several RCS files at once.
      The rsync access method based on cvs2gitdump was already taking
      advantage of this. To ensure that conversions over the pserver
      protocol yield the same result as conversions over rsync we need
      to add commit ID support to rlog.py.
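
      As an illustration of why the commit ID matters, here is a minimal
      sketch of grouping per-file revisions into changesets. It is not the
      code from rlog.py or cvs2gitdump: the FileRevision type, the
      group_into_changesets() helper, and the (author, log message)
      fallback for revisions without a commit ID are all assumptions made
      for this example.

```python
from collections import defaultdict
from typing import List, NamedTuple, Optional


class FileRevision(NamedTuple):
    """One revision of one RCS file (hypothetical type for this sketch)."""
    path: str
    revision: str
    author: str
    log: str
    commitid: Optional[str]


def group_into_changesets(revisions: List[FileRevision]) -> List[List[FileRevision]]:
    """Group file revisions which belong to the same commit.

    Revisions sharing a commit ID always end up in the same changeset.
    Revisions without a commit ID fall back to grouping by author and
    log message, which is an assumed and simplified heuristic.
    """
    groups = defaultdict(list)
    for rev in revisions:
        if rev.commitid is not None:
            key = ("commitid", rev.commitid)
        else:
            key = ("fallback", rev.author, rev.log)
        groups[key].append(rev)
    # Dictionaries preserve insertion order, so changesets come back in
    # the order in which their first revision was seen.
    return list(groups.values())
```

      With a commit ID present, two files changed by one commit are grouped
      together even when nothing else ties them to each other.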
      
      Add two new test cases which convert the same repository over
      rsync and pserver respectively, and ensure that they yield the
      same result. Without commit ID support conversion over pserver
      produces a different result for this particular test repository.
      
      With feedback about coding style from vlorentz.
    • handle Attic-only RCS files over CVS pserver · d28a4b21
      Stefan Sperling authored
      CVS repositories may contain RCS history in file,v as well as
      a corresponding Attic/file,v where each file contains separate
      events that occurred in history. The Attic version of the file
      results from file deletion events.
      
      The rsync access method already uses history found in the Attic.
      However, a CVS server will only return RCS files from the Attic
      if we request them explicitly. If we do not request them then our
      converted history may end up missing deletion events for some files.
      Unfortunately, we cannot tell which RCS files have a corresponding
      file in the Attic, so we need to search all Attic directories by
      running the equivalent of 'cvs rlog' in each directory. This slows
      down pserver access considerably (and it was already quite slow
      compared to rsync). But we need to pay this price in order to
      obtain a valid conversion result.
      
      This patch contains related fixes to cvsroot path handling, which
      was broken for the pserver case. Without these fixes we cannot
      create the correct paths for Attic directories to search.
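
      The path construction involved can be sketched roughly as follows.
      This is only an illustration: the attic_dirs() helper, its
      parameters, and the example paths are hypothetical and do not
      reflect the actual code in this patch.

```python
import posixpath
from typing import Iterable, List


def attic_dirs(cvsroot_path: str, module: str, directories: Iterable[str]) -> List[str]:
    """Return the Attic directory path for each directory of a module.

    cvsroot_path is the path component of the CVSROOT on the server,
    module is the CVS module name, and directories are paths relative
    to the module ("" stands for the module's top-level directory).
    All three parameters are assumptions made for this sketch.
    """
    result = []
    for directory in directories:
        path = posixpath.join(cvsroot_path, module, directory, "Attic")
        result.append(posixpath.normpath(path))
    return result
```

      Each returned path would then be searched with the equivalent of
      'cvs rlog', which is where the additional cost described above
      comes from.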
      
      Problem found while comparing conversion results of rsync and
      pserver access methods for the GNU dino CVS repository at
      cvs.savannah.gnu.org/sources/dino
      Add two new test cases based on RCS files from this repository.
      
      Without this fix in place history would diverge at this commit:
        8891a63 | larsl | Removed the MIDIEvent class | 04 May 2006, 01:11 UTC
      Because the files midievent.cpp and midievent.hpp would not get deleted
      when converting this commit via the pserver protocol.
    • improve test coverage of file additions and deletions · d72f15f2
      Stefan Sperling authored
      Make an existing test case run over pserver as well.
      This access method uses a different way of detecting file
      additions and deletions and should be tested separately.
      
      Add new tests to cover the re-addition of a file after it
      was deleted.
    • ca23bc13
    • add support for RCS keyword expansion over pserver protocol · f52f0e45
      Stefan Sperling authored
      We can simply ask the CVS server to expand keywords for us, instead
      of forcing binary file mode with the -kb option. The CVS repository
      contains per-file keyword expansion defaults the server will use.
      Files checked out by cvsclient.py should now match what a regular
      CVS client would check out by default.
      
      Add test cases which verify that we create the same snapshot ID
      for a repository which uses the Id keyword in a file, regardless
      of whether this repository is accessed via rsync or pserver.
  4. Nov 05, 2021
  5. Nov 03, 2021
  6. Oct 27, 2021
    • test checkout of file lacking trailing \n over pserver protocol · beb7fc8a
      Stefan Sperling authored
      This test reproduces the bug fixed in commit d3b3344b, where our
      custom CVS client would fail to check out a file which lacks a
      trailing newline from a remote CVS server.
      
      The error triggered by the test without the fix in place is:
      
      CVSProtocolError: Overlong response from CVS server:
      b'delta with no trailing eolok\n'
    • rlog: fix loading of CVS commits which have a commit ID · 509ac801
      Stefan Sperling authored
      The CVS commit ID is an optional attribute which is only generated
      by relatively recent releases of CVS clients. Our rlog parser was
      skipping such commits because it failed to match on them due to an
      error in a regular expression.
      This resulted in an incomplete import of CVS revision history.
      
      Here is a sample line from cvs rlog output which carries a
      commit ID and was not matched because the regex lacked the
      trailing semicolon:
      date: 2007-07-17 15:02:50 +0200;  author: larsl;  state: Exp;  lines: +619 -285;  commitid: oju0x8tTc9aUB7qs;
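
      A pattern which accepts such lines could look like the sketch below.
      It is not the regular expression used in rlog.py; the names
      RLOG_REVISION_INFO and parse_revision_info() are made up for this
      example. The pattern treats the lines and commitid fields, and the
      semicolon after each of them, as optional.

```python
import re
from typing import Dict, Optional

# Matches the "date: ...;  author: ...;  state: ...;" line of rlog output.
# The "lines:" and "commitid:" fields are optional, and so is the
# semicolon which may follow either of them.
RLOG_REVISION_INFO = re.compile(
    r"^date:\s+(?P<date>[^;]+);"
    r"\s+author:\s+(?P<author>[^;]+);"
    r"\s+state:\s+(?P<state>[^;]+);"
    r"(?:\s+lines:\s+\+(?P<added>\d+)\s+-(?P<removed>\d+);?)?"
    r"(?:\s+commitid:\s+(?P<commitid>[^;\s]+);?)?"
    r"\s*$"
)


def parse_revision_info(line: str) -> Optional[Dict[str, Optional[str]]]:
    """Return the fields of a revision info line, or None if it does not match.

    Fields which are absent from the line map to None.
    """
    match = RLOG_REVISION_INFO.match(line)
    if match is None:
        return None
    return match.groupdict()
```

      Applied to the sample line above, parse_revision_info() returns a
      dictionary whose commitid entry is "oju0x8tTc9aUB7qs".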
      
      Found while testing ingestion of the GNU dino repository from
      cvs.savannah.gnu.org/sources/dino
    • rlog: fix parsing of multiple file revisions · 0829dc33
      Stefan Sperling authored
      The rlog parser was only fetching a single file revision because
      some lines of code had the wrong indentation. These lines were
      supposed to be part of a loop body but were only executed once.
      
      Also rename a function which had a misleading name and docstring.
      This function does in fact process the entire RCS revision history
      of a given file, as opposed to just one entry of RCS revision history.
      
      Found while testing ingestion of the GNU dino repository from
      cvs.savannah.gnu.org/sources/dino
    • cvsclient: handle additional responses sent by server · 3a2f06b3
      Stefan Sperling authored
      While checking out files the server sends messages to the CVS
      client which provide information about the state of file paths.
      
      Our custom CVS client implementation needs to recognize a few
      additional responses the server may send while checking out a
      different version of a file which was already checked out earlier.
      Otherwise our client will error out. We can simply ignore these
      messages (and their two path arguments, separated by \n) because
      we do not manage an actual CVS working copy.
      
      Found while testing ingestion of the GNU dino repository at
      cvs.savannah.gnu.org/sources/dino
    • cvsclient: handle files which lack a trailing newline · d3b3344b
      Stefan Sperling authored
      CVS uses \n as a protocol message separator, which forces us
      to read protocol messages line by line. File content sent by
      the server is transmitted as bytes of a known length.
      The server appends a final "ok\n" message (or perhaps an error
      message) when it is done sending file contents.
      
      Properly handle the case where this final message gets buffered
      along with file contents and is not delimited from file contents
      by \n because the file lacks a trailing newline. Previously, the
      final protocol message ended up being written out to file contents
      in this case.
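
      The general technique can be sketched as follows: read exactly as
      many bytes as the announced length, and only then go back to reading
      protocol messages line by line. This is not the code from
      cvsclient.py. The helper names and the simplified framing (a length
      line followed directly by the file data) are assumptions made for
      this example, and a real server response carries additional lines.

```python
import io
from typing import BinaryIO, Tuple


def read_exact(stream: BinaryIO, length: int) -> bytes:
    """Read exactly `length` bytes from the stream, or raise EOFError."""
    chunks = []
    remaining = length
    while remaining > 0:
        chunk = stream.read(remaining)
        if not chunk:
            raise EOFError("connection closed with %d bytes missing" % remaining)
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def read_file_and_status(stream: BinaryIO) -> Tuple[bytes, bytes]:
    """Read a length line, the file data, and the following status line.

    The file data is consumed by byte count rather than by line, so a
    status message such as b"ok\\n" is kept out of the file contents
    even when the file lacks a trailing newline.
    """
    length = int(stream.readline().strip())
    data = read_exact(stream, length)
    status = stream.readline()
    return data, status


# Example: the file content does not end in a newline, so "ok\n" follows
# it immediately in the byte stream.
example = io.BytesIO(b"26\ndelta with no trailing eolok\n")
contents, status = read_file_and_status(example)
```

      Here contents holds the 26 bytes of file data and status holds
      b"ok\n", instead of the status message ending up in the file.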
      
      Found while testing ingestion of the GNU dino CVS repository from
      cvs.savannah.gnu.org/sources/dino.
  7. Oct 04, 2021
  8. Oct 01, 2021
  9. Sep 22, 2021
    • eliminate code duplication of logic in process_cvs_changesets() · a7eaeb89
      Stefan Sperling authored
      Factor out code which is specific to rcsparse and cvsclient into
      separate functions and pass a parameter to process_cvs_changesets()
      so it can decide which of the two needs to be used.
      
      This supersedes the function process_cvs_rlog_changesets() which
      duplicated the looping code also contained in process_cvs_changesets().
  10. Sep 21, 2021
  11. Sep 17, 2021
  12. Sep 16, 2021
  13. Sep 15, 2021