Draft: Use rsvndump tool to improve incremental loading of svn repositories

Previously, the SvnLoaderFromRemoteDump class always dumped all revisions to a file when loading a subversion repository into the archive, regardless of whether the repository had already been visited.

This is inefficient for incremental loading of a large repository, as data for revisions already loaded into the archive is fetched again at every new visit. We were forced to proceed this way because the svnrdump tool we are using can generate dumps that svnadmin load fails to load when dumping a range [n:HEAD] of revisions where n > 1, as copyfrom operations can reference ancestor revisions not included in the dump (see example).
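To illustrate the failure mode, here is a minimal sketch that scans the text of a dump for copyfrom references pointing at revisions older than the first revision in the dump. The function name `dangling_copyfrom_refs` is hypothetical and not part of the loader. It assumes the standard `Revision-number:` and `Node-copyfrom-rev:` dump headers and that the first revision record marks the start of the dumped range. It is a naive line-based scan, so file contents that happen to contain such lines would be misread.

```python
import re
from typing import List, Optional, Tuple

# Headers from the subversion dump format.
_REVISION = re.compile(r"^Revision-number: (\d+)$")
_COPYFROM = re.compile(r"^Node-copyfrom-rev: (\d+)$")


def dangling_copyfrom_refs(dump_text: str) -> List[Tuple[int, int]]:
    """Return (revision, copyfrom_revision) pairs whose copy source
    precedes the first revision contained in the dump.

    Hypothetical helper for illustration only: naive line scan,
    not a full dump parser.
    """
    first_rev: Optional[int] = None
    current_rev: Optional[int] = None
    dangling: List[Tuple[int, int]] = []
    for line in dump_text.splitlines():
        match = _REVISION.match(line)
        if match:
            current_rev = int(match.group(1))
            if first_rev is None:
                # Assumption: the first revision record starts the range.
                first_rev = current_rev
            continue
        match = _COPYFROM.match(line)
        if match and first_rev is not None and current_rev is not None:
            source_rev = int(match.group(1))
            if source_rev < first_rev:
                dangling.append((current_rev, source_rev))
    return dangling
```

A dump starting at revision 5 that copies a path from revision 3 would be reported as `(5, 3)`, which is the kind of reference that cannot be resolved when the dump is loaded on its own.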

Fortunately, there is another subversion dump tool, rsvndump, designed to remove the svnrdump limitations regarding incremental dumps and their proper loading. The tool is slower and consumes much more memory than svnrdump, but it does the job well.

This MR introduces the use of the rsvndump tool for incremental loading of a subversion repository. The first loading still uses svnrdump, as it is faster and consumes less memory, while subsequent loadings use rsvndump.
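The tool selection can be sketched roughly as follows. This is not the code from this MR: `build_dump_command` and its `last_loaded_rev` parameter are hypothetical names, and the rsvndump invocation assumes its `-r START:END` revision option. The actual loader may pass additional options, so check the rsvndump man page before relying on it.

```python
from typing import List, Optional


def build_dump_command(url: str, last_loaded_rev: Optional[int]) -> List[str]:
    """Pick the dump tool depending on whether the repository was visited.

    Hypothetical helper for illustration only.
    last_loaded_rev is None on the first visit, otherwise the last
    revision already loaded into the archive.
    """
    if last_loaded_rev is None:
        # First loading: svnrdump is faster and uses less memory,
        # and a full dump has no dangling copyfrom references.
        return ["svnrdump", "dump", url]
    # Subsequent loadings: only dump the revisions not yet loaded.
    # Assumption: rsvndump accepts a START:END range through -r.
    start = last_loaded_rev + 1
    return ["rsvndump", "-r", f"{start}:HEAD", url]
```

The returned list can be passed to `subprocess.run` with the output redirected to a dump file. Keeping the command construction in a pure function makes the choice between the two tools easy to test without network access.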

Edited by Antoine Lambert