- May 26, 2023
-
-
Jenkins for Software Heritage authored
-
Jenkins for Software Heritage authored
Update to upstream version '1.7.0' with Debian dir 890556da5124ec6551807fc627fe6faf907fa960
-
Antoine Lambert authored
-
Antoine Lambert authored
-
Antoine Lambert authored
Add Jenkins badge for master branch build status. Rephrase introduction sentence. Remove leftovers from when the file was written in reStructuredText. Add syntax highlighting to code blocks.
-
Antoine Lambert authored
The export_temporary method of the SvnRepo class exports the content of a subversion repository at a given revision to a temporary directory. As we also export the externals that might be associated with some paths in the repository, we first need to get all the svn:externals property values in order to determine whether there are recursive or relative externals and adjust some export parameters accordingly. While that operation is fast when the subversion repository is hosted locally, it is terribly slow when the repository is hosted on a remote server: a recursive propget operation on a remote server sends many network requests, which slows the process down considerably, especially with large repositories.
To improve performance, the previous implementation did a full checkout of the repository to the local filesystem and got the svn:externals property values from it. Nevertheless, that process is time-consuming for large repositories and can consume a lot of disk space.
In order to remove that bottleneck and improve overall performance when getting all property values, introduce a C++ extension module for Python that implements a fast way to crawl all paths of a repository and their associated properties. Unlike the "svn ls --depth infinity" or "svn propget -R" commands, it performs only one SVN request over the network, hence saving time, especially with large repositories. The code is freely inspired by the fast-svn-crawler project by Dmitry Pavlenko (https://sourceforge.net/projects/fastsvncrawler/).
The obtained speedup is quite impressive: on a large remote repository, listing all paths using "svn ls --depth infinity" or getting all svn:externals property values using "svn propget -R" takes around one hour, while it takes only a couple of minutes using the approach implemented in the C++ extension module. That approach also saves disk space, as we no longer need to perform a full checkout of the repository.
This change should greatly improve performance when reloading a svn repository already visited by Software Heritage. Indeed, before the possible archiving of new commits issued since the last visit, the loader checks that a repository has not been altered by calling the export_temporary method using the remote repository URL.
- May 23, 2023
-
-
Antoine Lambert authored
Some external definitions can have leading or trailing spaces/tabs so we need to strip them to avoid parsing errors. Fixes #4734
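The stripping step described above can be sketched as follows. This is a minimal illustration, not the loader's actual code: the function name `split_external_definitions` is hypothetical, and it assumes one external definition per line of the svn:externals property value.

```python
from typing import List


def split_external_definitions(prop_value: str) -> List[str]:
    """Split a svn:externals property value into cleaned definitions.

    Hypothetical helper: leading and trailing spaces/tabs are removed
    from each line and empty lines are skipped.
    """
    definitions = []
    for line in prop_value.splitlines():
        stripped = line.strip(" \t")
        if stripped:
            definitions.append(stripped)
    return definitions
```

Stripping before tokenizing means a downstream parser never sees an empty leading token or a trailing tab glued to the local path.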
-
- May 03, 2023
-
-
Antoine Lambert authored
Official subversion documentation only mentions that paths containing spaces must be surrounded by double quotes but we can find some external definitions in the wild whose paths are surrounded by single quotes. Those are properly handled by the official subversion client so we must do the same when parsing externals.
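One way to accept both quoting styles is to tokenize definitions with Python's standard `shlex` module, which treats single and double quotes alike. This is only a sketch under that assumption; the loader may tokenize differently, and `tokenize_external` is a hypothetical name.

```python
import shlex
from typing import List


def tokenize_external(definition: str) -> List[str]:
    """Split an external definition into tokens.

    shlex.split handles paths wrapped in either double or single quotes,
    so paths containing spaces stay in one token. Note that in its default
    POSIX mode it also interprets backslashes as escape characters.
    """
    return shlex.split(definition)
```

For instance, `tokenize_external("'^/trunk/my lib' 'local dir'")` and the same definition written with double quotes yield identical tokens.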
-
- Apr 17, 2023
-
-
Antoine Lambert authored
When a directory is copied from another one in a previous revision, externals must be copied only if they have been defined in a revision greater than or equal to the revision the directory is copied from. So store the revision number in which an external is defined and use it to filter externals when performing copyfrom operations.
-
- Apr 04, 2023
-
-
Antoine Lambert authored
As the swh.loader.svn.svn_repo.SvnRepo class quotes all URLs before calling the subversion API through subvertpy, we must unquote URLs extracted from external definitions, otherwise they will be double quoted and thus no longer valid.
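The double-quoting problem can be reproduced with the standard library alone. The sketch below is illustrative: `quote_url` stands in for whatever quoting SvnRepo applies (its exact set of safe characters is an assumption here), and `normalize_external_url` is a hypothetical name.

```python
from urllib.parse import quote, unquote

# Assumption: characters left untouched by the quoting step.
SAFE_CHARS = "/:@"


def quote_url(url: str) -> str:
    """Stand-in for the quoting applied before calling the subversion API."""
    return quote(url, safe=SAFE_CHARS)


def normalize_external_url(url: str) -> str:
    """Unquote a URL taken from an external definition so that a later
    quoting pass does not encode the '%' characters a second time."""
    return unquote(url)
```

Quoting an already-quoted URL turns `%20` into `%2520`, while unquoting first keeps the result stable.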
-
Antoine Lambert authored
When a directory is deleted by subversion, we must also remove its state holding externals info as the directory can be re-added later in another revision but without the svn:externals property set.
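A minimal sketch of that cleanup, assuming the externals state is kept in a dictionary keyed by directory path (the structure and the name `forget_externals` are hypothetical, as is the choice to also drop state for nested directories):

```python
from typing import Dict, List


def forget_externals(externals_by_dir: Dict[str, List[str]], deleted_dir: str) -> None:
    """Drop externals state for a deleted directory and anything below it."""
    deleted = deleted_dir.rstrip("/")
    prefix = deleted + "/"
    # Iterate over a copy of the keys since the dict is modified in the loop.
    for path in list(externals_by_dir):
        if path == deleted or path.startswith(prefix):
            del externals_by_dir[path]
```

If the directory is re-added later without the svn:externals property, no stale entry remains to be applied to it.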
-
- Mar 06, 2023
-
-
Jenkins for Software Heritage authored
-
Jenkins for Software Heritage authored
Update to upstream version '1.6.0' with Debian dir 81c69c428c60b14aa7343438029360504faea63e
- Mar 02, 2023
-
-
Antoine Lambert authored
Previously, the SvnLoaderFromRemoteDump class used the repository root URL to dump a sub-project.
-
- Mar 01, 2023
-
-
Antoine Lambert authored
"svnadmin load" has a --no-flush-to-disk option enabling faster load while being unsafe on power off. This drawback is not an issue for the subversion loader so use that option to significantly improve the performance for loading a repository from a dump file into a directory on the local filesystem.
-
Antoine Lambert authored
Those methods ensure URLs are properly quoted to avoid assertion failures when calling functions from the subversion C API.
-
Antoine Lambert authored
In order to ensure consistency between SvnLoader and SvnLoaderFromRemoteDump classes, run most of the tests with both of them. As a consequence, fix an invalid load status that was reported by the SvnLoader class when no new objects to archive have been found during a visit.
-
Antoine Lambert authored
That kind of error can be encountered when loading a repository hosted on SourceForge.
-
Antoine Lambert authored
A recursive propget operation is terribly slow over the network, so it is better to do it from a freshly checked out working copy, which is faster.
-
Antoine Lambert authored
-
- Feb 17, 2023
-
-
Antoine Lambert authored
Related to swh/meta#4960
-
- Feb 16, 2023
-
-
Jérémy Bobbio (Lunar) authored
Related to swh/meta#4959
-
- Feb 02, 2023
-
-
Antoine Lambert authored
Previously the parse_external_definition function returned a single revision number regardless of whether it was a revision specified with -rX or a peg revision specified with @X. However, the use and combination of these two parameters in the export command from subversion can lead to different results (see https://svnbook.red-bean.com/en/1.6/svn.advanced.pegrevs.html). So extract both the revision and the peg revision in order to avoid behaving differently from the official subversion client when the loader exports externals.
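The distinction between the two kinds of revision can be sketched as below. This is a simplified illustration, not the actual parse_external_definition code: the function name `split_revisions` is hypothetical, it only handles integer revisions, and a local path that happens to end in `@<digits>` would be misread as a peg revision.

```python
import re
from typing import List, Optional, Tuple


def split_revisions(definition: str) -> Tuple[List[str], Optional[int], Optional[int]]:
    """Return (remaining tokens, operative revision, peg revision)."""
    revision = None
    peg_revision = None
    remaining = []
    tokens = definition.split()
    i = 0
    while i < len(tokens):
        token = tokens[i]
        if token == "-r" and i + 1 < len(tokens):
            # "-r X" written as two tokens
            revision = int(tokens[i + 1])
            i += 2
            continue
        match = re.fullmatch(r"-r(\d+)", token)
        if match:
            # "-rX" written as a single token
            revision = int(match.group(1))
            i += 1
            continue
        match = re.fullmatch(r"(.+)@(\d+)", token)
        if match:
            # "URL@X": peg revision attached to the URL
            remaining.append(match.group(1))
            peg_revision = int(match.group(2))
            i += 1
            continue
        remaining.append(token)
        i += 1
    return remaining, revision, peg_revision
```

Keeping the two values separate lets the caller pass them independently to an export operation, as the official client does.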
-
Antoine Lambert authored
This fixes Python 3.7 support, which broke because poetry, a dependency of isort, removed support for that Python version in a recent release.
-
- Jan 20, 2023
-
-
Antoine Lambert authored
Subversion allows specifying the revision for an external definition as a date instead of an integer identifier. So when encountering such a case, get the HEAD revision number for the external at the specified date in order to export the correct version of the files targeted by the external. Related to #4727
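Subversion writes date revisions between curly braces, for example `-r {2020-01-01}`. A minimal sketch of telling the two forms apart follows; the function name `parse_revision_spec` is hypothetical and only ISO 8601 dates understood by `datetime.fromisoformat` are handled, whereas subversion accepts more date formats.

```python
import re
from datetime import datetime
from typing import Union


def parse_revision_spec(spec: str) -> Union[int, datetime]:
    """Return an int for a numeric revision or a datetime for a {DATE} one."""
    match = re.fullmatch(r"\{(.+)\}", spec)
    if match:
        return datetime.fromisoformat(match.group(1))
    return int(spec)
```

A caller can then branch on the returned type and resolve a datetime to a revision number before exporting.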
-
Antoine Lambert authored
It makes it possible to get the HEAD revision number for a repository at a specific date. Related to #4727
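The semantics of "HEAD revision at a date" (the youngest revision committed at or before that date) can be illustrated in memory with a binary search. This is only a model of the lookup, assuming a sorted list of commit dates is available; the real method presumably queries the repository instead, and `revision_at_date` is a hypothetical name.

```python
import bisect
from datetime import datetime
from typing import List


def revision_at_date(commit_dates: List[datetime], date: datetime) -> int:
    """Return the youngest revision committed at or before `date`.

    Assumption: commit_dates[i] is the commit date of revision i + 1 and
    the list is sorted in ascending order. Revision 0 (the empty
    repository) is returned when `date` precedes every commit.
    """
    return bisect.bisect_right(commit_dates, date)
```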
-
- Jan 18, 2023
-
-
Antoine Lambert authored
It prevents "Remote access object already in use" errors.
-
Antoine Lambert authored
Those are simply wrappers around functions from the converters module and are not used elsewhere.
-
Antoine Lambert authored
Add some default parameters to SvnRepo class constructor in order to simplify initialization of such object for standalone use. Make origin_url parameter of info method optional. Add some tests for the SvnRepo class.
-
- Jan 17, 2023
-
-
Antoine Lambert authored
This is a more meaningful name considering that the module only contains a single class named SvnRepo.
-
- Dec 19, 2022
-
-
Antoine Lambert authored
In order to remove warnings about /apidoc/*.rst files being included multiple times in toc when building full swh documentation, prefer to include module indices only when building standalone package documentation. Also include them the proper sphinx way. Related to T4496
-
- Dec 14, 2022
- Dec 12, 2022