
replay: Ensure copyfrom operations are properly handled

A Subversion revision can contain new directories and files copied from ancestor revisions, but these copies were not fully handled by the commit editor used to reconstruct the repository filesystem when replaying revisions.

In particular, the previous implementation could not handle the case where a path copied from an ancestor revision is replaced in the same commit (for instance, replacing a directory with a file of the same name).

These changes ensure that the source path and source revision from which a path is copied are passed to the commit editor methods, so that they can handle the copies, and that replace operations are correctly replayed.
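As a rough illustration of why the copy source has to reach the editor, here is a minimal sketch of a toy in-memory editor. It is not the swh-loader-svn code: the `ToyCommitEditor` class, its `add_entry` and `delete_entry` methods, and the dict-based tree model are all hypothetical names chosen for this example. It shows a directory being replaced by a file copied from an ancestor revision within a single commit.

```python
from typing import Dict, Optional

# Hypothetical model: a tree maps each path to its content.
# A directory is stored as None, a file as its bytes.
Tree = Dict[str, Optional[bytes]]


class ToyCommitEditor:
    """Illustrative editor only; not the swh-loader-svn implementation."""

    def __init__(self, history: Dict[int, Tree], base: Tree) -> None:
        self.history = history      # revision number -> tree at that revision
        self.tree: Tree = dict(base)  # tree being rebuilt for the new revision

    def delete_entry(self, path: str) -> None:
        # Remove the path itself and everything below it.
        prefix = path + "/"
        for existing in list(self.tree):
            if existing == path or existing.startswith(prefix):
                del self.tree[existing]

    def add_entry(
        self,
        path: str,
        content: Optional[bytes] = None,
        copyfrom_path: Optional[str] = None,
        copyfrom_rev: int = -1,
    ) -> None:
        # A replace is treated as delete + add, so stale children
        # of a replaced directory cannot survive.
        self.delete_entry(path)
        if copyfrom_path is None:
            self.tree[path] = content
            return
        # Copy the source path (and its subtree, if any) from the ancestor.
        source = self.history[copyfrom_rev]
        prefix = copyfrom_path + "/"
        for src_path, data in source.items():
            if src_path == copyfrom_path:
                self.tree[path] = data
            elif src_path.startswith(prefix):
                self.tree[path + "/" + src_path[len(prefix):]] = data


history = {
    1: {
        "trunk": None,
        "trunk/doc": None,
        "trunk/doc/README": b"v1",
        "trunk/NEWS": b"news",
    }
}
editor = ToyCommitEditor(history, base=history[1])
# Revision 2 replaces the directory "trunk/doc" with a file copied from r1.
editor.add_entry("trunk/doc", copyfrom_path="trunk/NEWS", copyfrom_rev=1)
result = editor.tree
```

Without the `copyfrom_path` and `copyfrom_rev` arguments, the toy editor would have no way to know where the new content of `trunk/doc` comes from. Without the delete step, `trunk/doc/README` would linger under a path that is now a file.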

It also prevents the OS error "Too many open files" when a very large file tree is copied from an ancestor revision.
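This description does not detail how the fix avoids the error, so the following is only a generic sketch of the underlying idea, not the change made here. "Too many open files" is errno 24 (`EMFILE`), raised when a process exceeds its open file descriptor limit. One common way to stay under that limit when copying a large tree is to close each file before opening the next. The helper name `copy_tree_bounded_handles` is hypothetical.

```python
import errno
import os
import shutil
import tempfile


def copy_tree_bounded_handles(src_root: str, dst_root: str) -> int:
    """Copy src_root into dst_root with at most two regular files open at once.

    Returns the number of files copied.
    """
    copied = 0
    for dirpath, _dirnames, filenames in os.walk(src_root):
        rel = os.path.relpath(dirpath, src_root)
        target_dir = os.path.normpath(os.path.join(dst_root, rel))
        os.makedirs(target_dir, exist_ok=True)
        for name in filenames:
            # Both handles are closed when the with block exits, so the
            # number of open descriptors does not grow with the tree size.
            with open(os.path.join(dirpath, name), "rb") as src, open(
                os.path.join(target_dir, name), "wb"
            ) as dst:
                shutil.copyfileobj(src, dst)
            copied += 1
    return copied


# Small demonstration on a temporary tree of 50 files.
with tempfile.TemporaryDirectory() as tmp:
    src_dir = os.path.join(tmp, "src")
    os.makedirs(os.path.join(src_dir, "sub"))
    for i in range(50):
        with open(os.path.join(src_dir, "sub", f"f{i}.txt"), "w") as f:
            f.write(str(i))
    copied = copy_tree_bounded_handles(src_dir, os.path.join(tmp, "dst"))
    with open(os.path.join(tmp, "dst", "sub", "f7.txt")) as f:
        copied_sample = f.read()
```

On POSIX systems, the current limit can be inspected with `resource.getrlimit(resource.RLIMIT_NOFILE)` from the standard library.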

Depends on !154 (closed)

It fixes the loading of https://svn.code.sf.net/p/mp3splt/code reported in SWH-LOADER-SVN-2Y.

It also fixes the loading of svn://tug.org/texlive (the two previous diffs in that stack are also required to load this large repository). Before these changes, the loading encountered the following issue:

docker-swh-loader-1  | [2022-10-28 14:08:25,177: DEBUG/ForkPoolWorker-1] rev: 3972, swhrev: 192cedd037af713fb92fa12b09425d3a773c19e2, dir: da2338bdad222b133cc4e2378e18b33678999f49
docker-swh-loader-1  | [2022-10-28 14:08:36,398: ERROR/ForkPoolWorker-1] [Errno 24] Can't open file '/home/anlambert/tmp/texlive_repo/db/revs/0/90': Too many open files
docker-swh-loader-1  | Traceback (most recent call last):
docker-swh-loader-1  |   File "/tmp/tmp.Y8nDYSdruk/swh-loader-svn/swh/loader/svn/loader.py", line 446, in fetch_data
docker-swh-loader-1  |     data = next(self.swh_revision_gen)
docker-swh-loader-1  |   File "/tmp/tmp.Y8nDYSdruk/swh-loader-svn/swh/loader/svn/loader.py", line 342, in process_svn_revisions
docker-swh-loader-1  |     for rev, commit, new_objects, root_directory in gen_revs:
docker-swh-loader-1  |   File "/tmp/tmp.Y8nDYSdruk/swh-loader-svn/swh/loader/svn/svn.py", line 542, in swh_hash_data_per_revision
docker-swh-loader-1  |     objects = self.swhreplay.compute_objects(rev)
docker-swh-loader-1  |   File "/tmp/tmp.Y8nDYSdruk/swh-loader-svn/swh/loader/svn/replay.py", line 963, in compute_objects
docker-swh-loader-1  |     self.replay(rev)
docker-swh-loader-1  |   File "/tmp/tmp.Y8nDYSdruk/swh-loader-svn/swh/loader/svn/replay.py", line 945, in replay
docker-swh-loader-1  |     self.conn.replay(rev, rev + 1, self.editor)
docker-swh-loader-1  | OSError: [Errno 24] Can't open file '/home/anlambert/tmp/texlive_repo/db/revs/0/90': Too many open files

Migrated from D8793 (view on Phabricator)
