
attempt to avoid content differences due to paths in keywords

1 file changed: +22 -1
    Some RCS keywords, such as "Header", contain absolute file paths
    derived from the on-disk filesystem path of the CVS repository.
    
    When we fetch files over the pserver protocol such keywords are
    expanded by the CVS server. But when using the rsync protocol we
    will first copy the CVS repository to local disk and the path to
    this local copy will correspond to some temporary directory.
    
    Try to avoid file content differences between pserver and rsync
    access methods by deriving a likely server-side path from path
    information found in the rsync:// origin URL.
    This will work as expected as long as the CVS server-side setup
    exposes the same path to the CVS repository over both access
    methods, which is the case for GNU Savannah, for example.
    
    In general, we should recommend treating pserver and rsync as distinct
    origins and not relying on them to be interchangeable or to always produce
    the same conversion result. But we can still try our best to avoid
    needless differences in content hashes.
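
To illustrate the problem described above, here is a minimal sketch, not the loader's actual keyword expansion code. The `expand_header` helper, the paths, and the revision metadata are all hypothetical; the helper only mimics the general shape of an expanded "Header" keyword. It shows that the same revision hashes differently when the RCS file path points at a temporary local copy, and that rewriting the path prefix restores identical content.

```python
import hashlib


def expand_header(rcs_path: str, rev: str, date: str, author: str, state: str) -> bytes:
    # Hypothetical, simplified stand-in for "Header" keyword expansion:
    # the expanded keyword embeds the absolute path of the RCS file.
    return f"$Header: {rcs_path} {rev} {date} {author} {state} $".encode("ascii")


def sha1_hex(data: bytes) -> str:
    return hashlib.sha1(data).hexdigest()


# Made-up revision metadata and paths, for illustration only.
meta = {"rev": "1.2", "date": "2001/01/01 12:00:00", "author": "alice", "state": "Exp"}
server_root = "/sources/proj"         # path as seen by the CVS server
local_root = "/tmp/cvs-copy-abc123"   # temporary directory holding the rsync copy
local_rcs_path = local_root + "/mod/main.c,v"

via_pserver = expand_header(server_root + "/mod/main.c,v", **meta)
via_rsync_naive = expand_header(local_rcs_path, **meta)
via_rsync_fixed = expand_header(local_rcs_path.replace(local_root, server_root), **meta)

print(sha1_hex(via_pserver) == sha1_hex(via_rsync_naive))  # False: paths differ
print(sha1_hex(via_pserver) == sha1_hex(via_rsync_fixed))  # True: paths match
```

The match in the last line only holds under the assumption stated in the commit message: the server exposes the same repository path over both access methods.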
@@ -137,6 +137,7 @@ class CvsLoader(BaseLoader):
         self, k: ChangeSetKey, f: FileRevision, rcsfile: rcsparse.rcsfile
     ) -> None:
         assert self.cvsroot_path
+        assert self.server_style_cvsroot
         path = file_path(self.cvsroot_path, f.path)
         wtpath = os.path.join(self.worktree_path, path)
         self.log.info("rev %s state %s file %s" % (f.rev, f.state, f.path))
@@ -151,7 +152,26 @@ class CvsLoader(BaseLoader):
             if not rcsfile:
                 rcsfile = rcsparse.rcsfile(f.path)
             rcs = RcsKeywords()
-            contents = rcs.expand_keyword(f.path, rcsfile, f.rev)
+            # We try our best to generate the same commit hashes over both pserver
+            # and rsync. To avoid differences in file content due to expansion of
+            # RCS keywords which contain absolute file paths (such as "Header"),
+            # attempt to expand such paths in the same way as a regular CVS server
+            # would expand them.
+            # Whether this will avoid content differences depends on pserver and
+            # rsync servers exposing the same server-side path to the CVS repository.
+            # However, this is the best we can do, and only matters if an origin can
+            # be fetched over both pserver and rsync. Each will still be treated as
+            # a distinct origin, but will hopefully point at the same SWH snapshot.
+            # In any case, an absolute path based on the origin URL looks nicer than
+            # an absolute path based on a temporary directory used by the CVS loader.
+            server_style_path = f.path.replace(
+                self.cvsroot_path, self.server_style_cvsroot
+            )
+            if server_style_path[0] != "/":
+                server_style_path = "/" + server_style_path
+            contents = rcs.expand_keyword(server_style_path, rcsfile, f.rev)
             os.makedirs(os.path.dirname(wtpath), exist_ok=True)
             outfile = open(wtpath, mode="wb")
             outfile.write(contents)
@@ -293,6 +313,7 @@ class CvsLoader(BaseLoader):
         if not url.path:
             raise NotFound("Invalid CVS origin URL '%s'" % self.origin_url)
         self.cvs_module_name = os.path.basename(url.path)
+        self.server_style_cvsroot = os.path.dirname(url.path)
         os.mkdir(os.path.join(self.worktree_path, self.cvs_module_name))
         if url.scheme == "file" or url.scheme == "rsync":
             # local CVS repository conversion
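
As a rough standalone sketch of the path derivation in this change, the snippet below splits an origin URL into a module name and a server-style CVS root, then rewrites a local RCS file path. It makes several assumptions: it uses `urllib.parse.urlparse` and `posixpath` instead of whatever URL parser the loader itself uses, the helper names `split_origin` and `to_server_style_path` are hypothetical, and the host, URL layout, and paths are invented.

```python
import posixpath
from urllib.parse import urlparse


def split_origin(origin_url: str) -> tuple[str, str]:
    """Return (server_style_cvsroot, module_name) for an origin URL."""
    url_path = urlparse(origin_url).path
    if not url_path:
        raise ValueError("Invalid CVS origin URL '%s'" % origin_url)
    # Last path component is the module; everything before it is the CVS root.
    return posixpath.dirname(url_path), posixpath.basename(url_path)


def to_server_style_path(local_path: str, local_cvsroot: str, server_cvsroot: str) -> str:
    """Rewrite a path under the local copy into the likely server-side path."""
    server_path = local_path.replace(local_cvsroot, server_cvsroot)
    if not server_path.startswith("/"):
        server_path = "/" + server_path
    return server_path


# Hypothetical origins that expose the same repository path over both methods.
rsync_root, rsync_module = split_origin("rsync://cvs.example.org/sources/proj/mod")
pserver_root, pserver_module = split_origin(
    "pserver://anonymous@cvs.example.org/sources/proj/mod"
)

print(rsync_root, rsync_module)
print(
    to_server_style_path(
        "/tmp/cvs-copy-abc123/mod/main.c,v", "/tmp/cvs-copy-abc123", rsync_root
    )
)
```

Note that `str.replace` substitutes every occurrence of the local root, not only a leading one, so this sketch relies on the temporary directory name not reappearing later in the path.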