svn_repo: Optimize export of a remote subversion sub-path

Previously, when exporting a sub-path of a remote subversion repository over the network, the full repository was exported and the local path targeting the sub-path was returned. This is not optimal in terms of network bandwidth when the repository filesystem is large, but it was implemented that way to ensure all tests related to sub-path export passed regardless of the subversion loader class used: either SvnLoader or SvnLoaderFromRemoteDump.

After some analysis, it turned out that it is possible to export only the requested sub-path instead of the full repository when using the SvnLoader class. So modify the SvnRepo class to ensure that behavior and save network bandwidth when dealing with a large repository.
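A minimal sketch of the idea, not the actual SvnRepo code: the export command is built against the sub-path URL itself rather than the repository root. The function name `build_export_command` and its parameters are hypothetical; the `svn export` options are the ones visible in the logs of this merge request.

```python
import posixpath
from urllib.parse import urlsplit


def build_export_command(remote_url, revision, dest_dir):
    """Build an ``svn export`` argv that targets ``remote_url`` directly.

    Hypothetical helper for illustration only: when ``remote_url`` points
    at a sub-path (e.g. ``.../apl/trunk``), only that sub-tree is fetched
    instead of the whole repository.
    """
    # Name the local export directory after the last URL path component.
    name = posixpath.basename(urlsplit(remote_url).path.rstrip("/"))
    return [
        "svn", "export",
        "-r", str(revision),
        "--depth", "infinity",
        "--ignore-keywords",
        remote_url,
        posixpath.join(dest_dir, name),
    ]


cmd = build_export_command(
    "svn://svn.savannah.gnu.org/apl/trunk", 1550, "/tmp/export"
)
print(" ".join(cmd))
```

With the previous behavior, the URL passed to `svn export` would have been the repository root (`svn://svn.savannah.gnu.org/apl`) and the `trunk` sub-directory would have been picked from the local export afterwards.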

These changes in the SvnRepo class require some changes in the replay module to ensure all tests still pass, and they also make it possible to remove a no longer needed optional parameter from the class constructor.

This also improves the performance of the SvnDirectoryLoader class introduced in !221 (merged): in the example below, the total task duration drops from about 235 seconds to about 164 seconds.

  • before that optimization:
[2023-05-31 12:08:29,333: INFO/MainProcess] Task swh.loader.svn.tasks.LoadSvnDirectory[57730b45-5151-4217-bbd4-0d17314cc910] received
[2023-05-31 12:08:29,334: INFO/MainProcess] loader@f1b5a8eddc92 ready.
[2023-05-31 12:08:29,437] Loading config file /loader.yml
[2023-05-31 12:08:29,451] Loader checksums computation: nar
[2023-05-31 12:08:31,494] Load origin 'svn://svn.savannah.gnu.org/apl/trunk' with type 'svn-export'
[2023-05-31 12:08:31,495] lister_not provided, skipping extrinsic origin metadata
[2023-05-31 12:08:52,163] svn export -r 1550 --depth infinity --ignore-keywords svn://svn.savannah.gnu.org/apl /tmp/tmp6r328lle/check-revision-1550.ah6tp1ks/apl
[2023-05-31 12:11:45,023] Artifact <svn-export> with path /tmp/tmp6r328lle/check-revision-1550.ah6tp1ks/apl/trunk
[2023-05-31 12:11:45,023] Artifact <svn-export> to check nar hashes: /tmp/tmp6r328lle/check-revision-1550.ah6tp1ks/apl/trunk
[2023-05-31 12:11:49,591] Number of skipped contents: 0
[2023-05-31 12:11:49,591] Number of contents: 4877
[2023-05-31 12:11:49,601] Flushing 4877 objects of type content (162059838 bytes)
[2023-05-31 12:12:21,042] Number of directories: 46
[2023-05-31 12:12:21,052] Flushing 46 objects of type directory (4978 entries)
[2023-05-31 12:12:22,373] Flushing 1 objects of type snapshot
[2023-05-31 12:12:23,201] Flushing 1 objects of type extid
[2023-05-31 12:12:24,223] cleanup /tmp/tmp6r328lle
[2023-05-31 12:12:24,909] Task swh.loader.svn.tasks.LoadSvnDirectory[57730b45-5151-4217-bbd4-0d17314cc910] succeeded in 235.46256677999918s: {'status': 'eventful'}
  • after that optimization:
[2023-05-31 11:54:52,325] Task swh.loader.svn.tasks.LoadSvnDirectory[4efdfa16-59fe-498d-8db6-71e20b449076] received
[2023-05-31 11:54:52,328] Loading config file /loader.yml
[2023-05-31 11:54:52,344] Loader checksums computation: nar
[2023-05-31 11:54:55,388] Load origin 'svn://svn.savannah.gnu.org/apl/trunk' with type 'svn-export'
[2023-05-31 11:54:55,388] lister_not provided, skipping extrinsic origin metadata
[2023-05-31 11:55:07,639] svn export -r 1550 --depth infinity --ignore-keywords svn://svn.savannah.gnu.org/apl/trunk /tmp/tmp1hh2q8uz/check-revision-1550._nadhqvl/trunk
[2023-05-31 11:57:01,453] Artifact <svn-export> with path /tmp/tmp1hh2q8uz/check-revision-1550._nadhqvl/trunk
[2023-05-31 11:57:01,453] Artifact <svn-export> to check nar hashes: /tmp/tmp1hh2q8uz/check-revision-1550._nadhqvl/trunk
[2023-05-31 11:57:05,161] Number of skipped contents: 0
[2023-05-31 11:57:05,162] Number of contents: 4877
[2023-05-31 11:57:05,179] Flushing 4877 objects of type content (162059838 bytes)
[2023-05-31 11:57:33,333] Number of directories: 46
[2023-05-31 11:57:33,348] Flushing 46 objects of type directory (4978 entries)
[2023-05-31 11:57:34,596] Flushing 1 objects of type snapshot
[2023-05-31 11:57:34,749] Flushing 1 objects of type extid
[2023-05-31 11:57:35,768] cleanup /tmp/tmp1hh2q8uz
[2023-05-31 11:57:36,056] Task swh.loader.svn.tasks.LoadSvnDirectory[4efdfa16-59fe-498d-8db6-71e20b449076] succeeded in 163.7220381780062s: {'status': 'eventful'}
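The gain can be read directly from the two logs. The sketch below computes it, assuming the timestamp of the `svn export` line marks the start of the export and the timestamp of the following `Artifact` line marks its end; the helper name `elapsed_seconds` is introduced here for illustration.

```python
from datetime import datetime

# Timestamp layout used in the log lines above (milliseconds after the comma).
LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def elapsed_seconds(start, end):
    """Return the number of seconds between two log timestamps."""
    t0 = datetime.strptime(start, LOG_TIME_FORMAT)
    t1 = datetime.strptime(end, LOG_TIME_FORMAT)
    return (t1 - t0).total_seconds()


# Export step: from the "svn export" line to the next "Artifact" line.
export_before = elapsed_seconds("2023-05-31 12:08:52,163", "2023-05-31 12:11:45,023")
export_after = elapsed_seconds("2023-05-31 11:55:07,639", "2023-05-31 11:57:01,453")

# Total task durations, as reported on the final "succeeded in" lines.
task_before = 235.46256677999918
task_after = 163.7220381780062
task_saving_pct = 100 * (task_before - task_after) / task_before

print(f"export step: {export_before:.1f}s -> {export_after:.1f}s")
print(f"task duration: {task_before:.1f}s -> {task_after:.1f}s "
      f"({task_saving_pct:.1f}% less)")
```

This is a single run on one repository, so the figures are indicative rather than a benchmark; the number of contents and directories loaded is identical in both runs.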