- Dec 14, 2022
- Dec 12, 2022
-
-
Jenkins for Software Heritage authored
Update to upstream version '1.5.0' with Debian dir 56c3d92d929b551b72dd3a104a58846f87917bd2
- Dec 08, 2022
-
-
Antoine Lambert authored
Now that SvnRepo.propget supports a URL as target, we can remove the costly checkout operation and directly retrieve the whole set of svn:externals properties. This should greatly improve the performance of incremental loading for large repositories.
-
Antoine Lambert authored
subvertpy 0.11 has a buggy implementation of the propget bindings when the target is a URL (https://github.com/jelmer/subvertpy/issues/35), so as a workaround we implement propget for URLs using the non-buggy proplist bindings.
-
Antoine Lambert authored
Retrying three times is enough as we use exponential backoff. Previously the loader could be stuck for more than twenty minutes in a row when it encountered a dead external; now the delay is a couple of minutes.
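The retry policy described above can be illustrated with a minimal sketch. The helper name `retry_with_backoff`, its parameters, and the delay values are assumptions made for illustration; they are not taken from the loader's code.

```python
import time


def retry_with_backoff(operation, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call operation() up to max_attempts times, doubling the wait each time.

    Hypothetical helper: names and default values are illustrative only.
    """
    for attempt in range(max_attempts):
        try:
            return operation()
        except Exception:
            # Give up and re-raise once the last allowed attempt has failed.
            if attempt == max_attempts - 1:
                raise
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** attempt)
```

With three attempts, at most two waits occur before the error is propagated, which bounds the time spent on a dead external.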
-
- Dec 07, 2022
-
-
Antoine Lambert authored
Copied directories might have externals, so we also need to copy their states and update external paths in case the externals list is later modified.
-
Antoine Lambert authored
In order to detect all ASCII characters that must be percent-encoded in svn URLs, add a brute-force test and use urllib.parse.quote in the quote_svn_url function.
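As a rough illustration of the approach, the sketch below quotes a URL with urllib.parse.quote and enumerates the printable ASCII characters that end up percent-encoded. The function names and the set of characters kept unescaped are assumptions for this example, not the actual quote_svn_url implementation.

```python
from urllib.parse import quote

# Characters left unescaped in this sketch (an assumption, not the loader's list).
SAFE_CHARS = "/:@"


def quote_svn_url_sketch(url):
    """Percent-encode every character outside the unreserved and safe sets.

    Note: an already-encoded URL would be encoded a second time.
    """
    return quote(url, safe=SAFE_CHARS)


def chars_needing_encoding():
    """Brute-force over printable ASCII and list the characters that get encoded."""
    return [
        chr(code)
        for code in range(32, 127)
        if quote(chr(code), safe=SAFE_CHARS) != chr(code)
    ]
```

For example, `quote_svn_url_sketch("svn://example.org/my repo/trunk")` yields `svn://example.org/my%20repo/trunk`.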
-
Antoine Lambert authored
Such a case can happen when an external definition is malformed. Previously, the parsed malformed external was added to the directory state with an empty external URL, which could lead to unexpected side effects such as removing all previously exported valid externals.
-
Antoine Lambert authored
Instead of maintaining file state based on svn properties across revision replays and trying to reconstruct the same file as an svn export operation would after applying text deltas, simply export the file from the currently processed revision when closing the associated file editor. This greatly simplifies the replay module implementation while keeping approximately the same performance as before. Also add a test that would fail without these changes. Related to T4673
-
- Dec 05, 2022
-
-
Antoine Lambert authored
When copying a directory from an ancestor revision, do not ignore externals: properties are also copied by Subversion, so external paths must also be exported.
-
Antoine Lambert authored
In debug mode, when a hash tree computation divergence is detected after replaying a revision, compute and display the diff between contents to facilitate debugging of this type of issue.
-
- Nov 25, 2022
-
-
Antoine Lambert authored
Add more debug logs to the replay module to ease detection of issues. As those logs are quite verbose, only display them when the debug parameter of the loader is set to True.
-
- Nov 23, 2022
-
-
Antoine Lambert authored
-
- Nov 22, 2022
-
-
Antoine Lambert authored
When a tree computation divergence is detected after replaying a revision, add debug logs displaying the paths that differ or are missing between the reconstructed repository filesystem and the one exported at that specific revision. This should save time when debugging such issues.
-
- Oct 31, 2022
-
-
Jenkins for Software Heritage authored
-
Jenkins for Software Heritage authored
-
Jenkins for Software Heritage authored
Update to upstream version '1.4.0' with Debian dir ca407cdefdc6a68db9c0f10c5eae282da06f021f
-
Jenkins for Software Heritage authored
Update to upstream version '1.3.6' with Debian dir c6ddd2637275897be2234792239829093fc65dea
-
Antoine Lambert authored
A Subversion revision can contain new directories and files copied from ancestor revisions, but those were not perfectly handled in the commit editor used to reconstruct the repository filesystem when replaying revisions. In particular, the previous implementation could not handle the case where a path copied from an ancestor revision is replaced in the same commit (for instance, replacing a directory with a file of the same name). These changes ensure that the source path and source revision from which a path is copied are passed to the commit editor methods as parameters so that they can handle the copies, and also that replace operations are correctly replayed. They also prevent the OS error "Too many open files" when a really large file tree is copied from an ancestor revision.
-
Antoine Lambert authored
When dumping a Subversion repository to a file before loading it, compress that file using gzip while producing it. This saves significant disk space when dumping a large repository. Also rework how a truncated dump is handled now that the dump file is compressed, by providing the expected max revision number to be loaded by svnadmin. If the number of loaded revisions matches, we can safely continue the partial loading of the repository.
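Compressing a dump while it is being produced can be sketched as streaming bytes into a gzip file rather than writing an uncompressed file first. The helper name `write_compressed_dump` is hypothetical, and the way the stream is obtained is an assumption; the loader's actual code may differ.

```python
import gzip
import shutil


def write_compressed_dump(stream, dest_path):
    """Copy a binary stream into a gzip-compressed file, chunk by chunk.

    Hypothetical helper: `stream` is any binary file-like object, for
    instance the stdout pipe of a subprocess producing the dump.
    """
    with gzip.open(dest_path, "wb") as compressed:
        # copyfileobj reads in chunks, so the uncompressed dump is never
        # held entirely in memory or written to disk.
        shutil.copyfileobj(stream, compressed)
```

In a setup of this kind, `stream` could be the `stdout` attribute of a `subprocess.Popen` object running `svnadmin dump` with `stdout=subprocess.PIPE`.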
- Oct 28, 2022
-
-
Antoine Lambert authored
URLs provided as parameters to subvertpy.client.Client methods must be quoted when they contain space characters; otherwise libsvn raises an assertion.
-
- Oct 25, 2022
-
-
Antoine Lambert authored
There are some subtle cases in Subversion repositories where external paths defined on different directories can overlap, so update the replay module implementation to handle those and avoid erroneously removing paths when replaying revisions.
-
- Oct 19, 2022
-
-
Antoine Lambert authored
-
Antoine Lambert authored
-
Antoine Lambert authored
When the "svnadmin load" command exits with an error, report the svnadmin error in the ValueError exception raised by the function init_svn_repo_from_dump. This should help debugging this type of issue reported by Sentry.
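The general pattern of surfacing a command's error output in the raised exception can be sketched as follows. The helper `run_or_raise` is hypothetical and is not the body of init_svn_repo_from_dump; it only shows one way to carry stderr into a ValueError.

```python
import subprocess


def run_or_raise(cmd):
    """Run cmd and return its stdout; raise ValueError with stderr on failure.

    Hypothetical helper illustrating the error-reporting pattern.
    """
    completed = subprocess.run(cmd, capture_output=True, text=True)
    if completed.returncode != 0:
        # Include the command's own error message so it shows up in reports.
        raise ValueError(
            f"Command {cmd[0]!r} exited with code {completed.returncode}: "
            f"{completed.stderr.strip()}"
        )
    return completed.stdout
```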
-
- Oct 18, 2022
-
-
David Douard authored
Bump pre-commit from 4.1.0 to 4.3.0, codespell from 2.2.1 to 2.2.2, black from 22.3.0 to 22.10.0 and flake8 from 4.0.1 to 5.0.4. Also freeze flake8 dependencies and change flake8's repo config to GitHub (the GitLab mirror being outdated).
-
Antoine Lambert authored
Use the helper fixture loading_task_creation_for_listed_origin_test from swh-loader-core and remove redundant tests.
-
- Oct 17, 2022
-
-
Antoine Lambert authored
Instead of maintaining a set of modified paths for each replayed revision, use the swh.model.from_disk.Directory.collect method which performs the same task by returning added or modified contents and directories since the last collect operation.
-
- Oct 01, 2022