- Sep 28, 2018
-
-
Antoine Lambert authored
  - add Python iterator protocol support to the DirectoryIterator class, so that any directory stored in the archive can easily be visited recursively
  - add a convenience function, dir_iterator, wrapping the instantiation of the DirectoryIterator class
  - add tests
Related T1177
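As an illustration of what iterator protocol support on such a class can look like, here is a minimal sketch. It is not the swh.storage implementation: the `ls` callback, the entry layout (`name`, `type`, `target`) and the in-memory `TREE` are assumptions made for the example; only the names DirectoryIterator and dir_iterator come from the commit message.

```python
from typing import Any, Callable, Dict, Iterator, List, Tuple

Entry = Dict[str, Any]  # hypothetical entry layout: name, type, target


class DirectoryIterator:
    """Depth-first walk over a directory tree, yielding one entry per step."""

    def __init__(self, ls: Callable[[bytes], List[Entry]], root_id: bytes):
        self._ls = ls  # hypothetical callback listing the entries of a directory
        # Each frame is (path prefix, iterator over one directory's entries).
        self._frames: List[Tuple[bytes, Iterator[Entry]]] = [
            (b"", iter(ls(root_id)))
        ]

    def __iter__(self) -> "DirectoryIterator":
        return self

    def __next__(self) -> Tuple[bytes, Entry]:
        while self._frames:
            prefix, entries = self._frames[-1]
            entry = next(entries, None)
            if entry is None:
                self._frames.pop()  # directory exhausted, resume the parent
                continue
            path = prefix + entry["name"]
            if entry["type"] == "dir":
                # Descend into the sub-directory on the next step.
                children = iter(self._ls(entry["target"]))
                self._frames.append((path + b"/", children))
            return path, entry
        raise StopIteration


def dir_iterator(ls: Callable[[bytes], List[Entry]], root_id: bytes) -> DirectoryIterator:
    """Convenience wrapper around the DirectoryIterator constructor."""
    return DirectoryIterator(ls, root_id)


# Hypothetical in-memory archive: directory id -> list of entries.
TREE = {
    b"root": [
        {"name": b"README", "type": "file", "target": b"f1"},
        {"name": b"src", "type": "dir", "target": b"d1"},
        {"name": b"empty", "type": "dir", "target": b"d2"},
    ],
    b"d1": [{"name": b"main.py", "type": "file", "target": b"f2"}],
    b"d2": [],
}

paths = [path for path, _ in dir_iterator(TREE.__getitem__, b"root")]
```

Because the class implements `__iter__` and `__next__`, it can be used directly in a `for` loop or a comprehension, as in the last line.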
-
Antoine Lambert authored
Two issues were found in the way empty directories were handled:
  - the _empty_dir_hash variable had the wrong type (str instead of bytes), so the empty-directory test based on hash comparison always failed
  - in the step method of DirectoryIterator, there is no need to push a new frame for an empty directory, as doing so stops the iteration
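The type mismatch can be reproduced in isolation. The sketch below assumes that directory identifiers are Git-compatible tree hashes held as raw bytes; the function and constant names are hypothetical and do not come from swh.storage.

```python
import hashlib


def git_tree_hash(payload: bytes) -> bytes:
    """SHA1 of a Git tree object with the given serialized payload."""
    header = b"tree %d\0" % len(payload)
    return hashlib.sha1(header + payload).digest()


# Correct: raw bytes, the same type as the identifiers it is compared against.
EMPTY_DIR_HASH = git_tree_hash(b"")
# Buggy variant (hypothetical): the same value kept as a hexadecimal str.
EMPTY_DIR_HASH_STR = EMPTY_DIR_HASH.hex()


def is_empty_dir(dir_id: bytes) -> bool:
    return dir_id == EMPTY_DIR_HASH


def is_empty_dir_buggy(dir_id: bytes) -> bool:
    # In Python 3, bytes == str is always False, so this never matches.
    return dir_id == EMPTY_DIR_HASH_STR
```

Python raises no error for the mixed-type comparison by default, which is why a bug of this kind can go unnoticed until a test exercises an empty directory.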
-
- Sep 25, 2018
-
-
Antoine Pietri authored
-
Antoine Pietri authored
-
- Sep 19, 2018
-
-
Stefano Zacchiroli authored
-
Antoine Lambert authored
Prior to this fix, storage.snapshot_get_latest(origin) was returning the first snapshot instead of the last one.
-
- Sep 07, 2018
-
-
Stefano Zacchiroli authored
-
Stefano Zacchiroli authored
-
- Sep 06, 2018
-
-
Antoine R. Dumont authored
Related T1180
-
Stefano Zacchiroli authored
-
Stefano Zacchiroli authored
brought to you by codespell
-
- Sep 05, 2018
-
-
Antoine Pietri authored
-
- Aug 02, 2018
-
- Jul 27, 2018
-
-
Antoine R. Dumont authored
Prior to this commit, this returned only the list of new ids. This is currently not used anywhere in our stack. Had it been, it would have forced the client to try to be smart about dealing with ids.
-
- Jun 05, 2018
-
-
Nicolas Dandrimont authored
Summary: To do so, we import a function from a recent version of psycopg2, execute_values, which can execute queries efficiently with a list of values. We also scale the cursor back from having things in SQL functions towards having things inside the db.py database "backend". This will make it easier to iterate, as we won't have to deploy function changes to twenty different databases. After these changes, testing the web UI on a physical replica works.
Close T1073
Test Plan: Local integration tests are happy; Navigating the frontend backed by a physical replica seems to be okay now.
Reviewers: #reviewers!
Maniphest Tasks: T1073
Differential Revision: https://forge.softwareheritage.org/D340
-
- Jun 04, 2018
-
-
Nicolas Dandrimont authored
Summary: As of D337, directory_get has no users left; time to remove it.
Test Plan: make test in the toplevel environment still works
Reviewers: #reviewers!
Differential Revision: https://forge.softwareheritage.org/D339
-
- May 30, 2018
-
-
Nicolas Dandrimont authored
Although most visits should be sequential, and the date should be monotonic, re-importing an old snapshot as a new visit would break the pagination.
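The failure mode can be sketched with a few in-memory visits. The example assumes that pages are keyed on the visit id rather than on the visit date, which the message above does not state explicitly; the function name, its parameters and the record layout are hypothetical.

```python
from typing import Any, Dict, List, Optional

Visit = Dict[str, Any]  # hypothetical record layout: visit id and ISO date


def origin_visits_page(
    visits: List[Visit], last_visit: Optional[int] = None, limit: int = 2
) -> List[Visit]:
    """Return up to `limit` visits whose id is strictly greater than `last_visit`."""
    ordered = sorted(visits, key=lambda v: v["visit"])
    if last_visit is not None:
        ordered = [v for v in ordered if v["visit"] > last_visit]
    return ordered[:limit]


# Visit 3 re-imports an old snapshot, so its date is older than the others.
VISITS = [
    {"visit": 1, "date": "2018-01-01"},
    {"visit": 2, "date": "2018-03-01"},
    {"visit": 3, "date": "2017-06-01"},
]

page1 = origin_visits_page(VISITS)
page2 = origin_visits_page(VISITS, last_visit=page1[-1]["visit"])
```

Ordering the same data by date would put visit 3 first, so a client resuming after visit 2 would never see it. Keying on the id keeps the pages stable even when dates are not monotonic.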
-
- May 29, 2018
-
-
Nicolas Dandrimont authored
-
Nicolas Dandrimont authored
-
Nicolas Dandrimont authored
-
Nicolas Dandrimont authored
-
- May 28, 2018
-
-
Nicolas Dandrimont authored
-
Nicolas Dandrimont authored
Summary: Allow adding server-side statement timeouts for database operations
Test Plan: make test still works
Reviewers: #reviewers!
Differential Revision: https://forge.softwareheritage.org/D334
-
- May 24, 2018
-
-
Antoine R. Dumont authored
Related T1061
-
- May 12, 2018
-
- May 11, 2018
-
-
Nicolas Dandrimont authored
Summary: When mkdtemp is called, shutil.rmtree must be called as well
Test Plan: Look at /tmp before and after running tests, notice no new directories instead of 60.
Reviewers: #reviewers!
Differential Revision: https://forge.softwareheritage.org/D331
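One common way to pair the two calls in a unittest-based test is to register the cleanup right after creating the directory. This is a minimal sketch; the test class, its contents and the `run_example` helper are hypothetical and are not taken from the swh.storage test suite.

```python
import io
import os
import shutil
import tempfile
import unittest


class TempDirTest(unittest.TestCase):
    created_dirs = []  # recorded only so the example can check the cleanup

    def setUp(self):
        self.tmpdir = tempfile.mkdtemp(prefix="example-test-")
        # Registered immediately, so the directory is removed even if the
        # rest of setUp or the test itself fails.
        self.addCleanup(shutil.rmtree, self.tmpdir)
        TempDirTest.created_dirs.append(self.tmpdir)

    def test_write_file(self):
        path = os.path.join(self.tmpdir, "data.txt")
        with open(path, "w") as f:
            f.write("hello")
        self.assertTrue(os.path.exists(path))


def run_example() -> unittest.TestResult:
    """Run the test case quietly and return the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TempDirTest)
    return unittest.TextTestRunner(stream=io.StringIO()).run(suite)
```

Using `addCleanup` instead of `tearDown` is a design choice: `tearDown` is skipped when `setUp` raises, whereas registered cleanups still run.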
-
Nicolas Dandrimont authored
Summary: The behavior of storage when the underlying objstorage had an exception was never actually tested. This new test weeded out a bug in the threaded implementation for copy_to.
Test Plan: the new test passes
Reviewers: #reviewers!
Differential Revision: https://forge.softwareheritage.org/D330
-
- May 09, 2018
-
-
Nicolas Dandrimont authored
This allows connection reuse for postgresql and potential remote backends such as for the object storage, rather than reinitiating all connections on every request.
-
- May 07, 2018
-
-
Nicolas Dandrimont authored
Summary: This allows using swh.storage with a modicum of concurrency
Test Plan: clearly, make test should still pass
Reviewers: #reviewers!
Differential Revision: https://forge.softwareheritage.org/D325
-
Nicolas Dandrimont authored
Summary: Add a level of indirection to allow swapping out the implementation of the db attribute
Test Plan: once again, make test keeps on working
Reviewers: #reviewers!
Differential Revision: https://forge.softwareheritage.org/D324
-
Nicolas Dandrimont authored
Summary: This avoids reusing a potentially stale connection handle. It also allows testing potential connection pooling behavior. This forces us to do proper cursor sanitation as well: a bunch of "transactional" operations weren't actually transactional.
Test Plan: another round of make test still working
Reviewers: #reviewers!
Differential Revision: https://forge.softwareheritage.org/D323
-
Nicolas Dandrimont authored
Summary: Helps avoid lingering postgresql connections when a test fails
Test Plan: make test still works ;)
Reviewers: #reviewers!
Differential Revision: https://forge.softwareheritage.org/D322
-
Nicolas Dandrimont authored
-
- Apr 25, 2018
-
-
Antoine R. Dumont authored
Related T1036
-
- Mar 12, 2018
-
-
Stefano Zacchiroli authored
-
- Feb 27, 2018
-
-
Antoine Lambert authored
-
- Feb 20, 2018
-
-
Antoine Lambert authored
This commit adds the implementation of an efficient algorithm for comparing two directory trees in order to compute the list of introduced file changes in terms of addition / deletion / modification / renaming. It can be found in the diff module located in the new namespace swh.storage.algos.
That algorithm is used to extend the storage API with the following methods:
  - diff_directories: compute diff between two arbitrary directories
  - diff_revisions: compute diff between two arbitrary revisions
  - diff_revision: compute diff between a revision and its first parent
Related T921
Closes D295
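To make the four change kinds concrete, here is a deliberately naive sketch. It is not the algorithm from swh.storage.algos: it compares two flattened listings (path to content hash) instead of walking the trees, and it detects renames only when the content is unchanged. The function name, the change labels and the sample data are hypothetical.

```python
from typing import Dict, List, Tuple


def diff_flat(old: Dict[str, str], new: Dict[str, str]) -> List[Tuple[str, ...]]:
    """Compare two {path: content_hash} listings and list the changes."""
    changes: List[Tuple[str, ...]] = []
    deleted = {p: h for p, h in old.items() if p not in new}
    added = {p: h for p, h in new.items() if p not in old}

    # Same path, different content: modification.
    for path in sorted(old.keys() & new.keys()):
        if old[path] != new[path]:
            changes.append(("modify", path))

    # Index added paths by content hash to pair them with deleted paths.
    added_by_hash: Dict[str, List[str]] = {}
    for path in sorted(added):
        added_by_hash.setdefault(added[path], []).append(path)

    for path in sorted(deleted):
        candidates = added_by_hash.get(deleted[path])
        if candidates:
            # Same content under a new path: treat as a rename.
            target = candidates.pop(0)
            del added[target]
            changes.append(("rename", path, target))
        else:
            changes.append(("delete", path))

    # Whatever was not matched as a rename target is a plain addition.
    for path in sorted(added):
        changes.append(("insert", path))
    return changes


OLD = {"README": "h1", "src/a.py": "h2", "src/old.py": "h3", "LICENSE": "h4"}
NEW = {"README": "h1", "src/a.py": "h2b", "src/new.py": "h3", "docs/index.md": "h5"}
changes = diff_flat(OLD, NEW)
```

A tree-aware algorithm can skip whole sub-directories whose hashes are identical, which is where most of the efficiency of a real implementation would come from. The flat version above always touches every path.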
-
- Feb 19, 2018
-
-
Nicolas Dandrimont authored
This table allows counting objects by bucket, keeping the transactions for counting objects short (a few dozen seconds at most). Also add a "single_update" boolean field to the main object_counts table to be able to discriminate between tables that are counted via buckets and tables counted in one go. The main table is updated every 256 counted buckets to avoid too much churn on the table. Close T962.
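The update cadence described above can be illustrated with an in-memory analogy. This is only a sketch of the batching idea: the real logic lives in the database, and the class, attribute and method names below are hypothetical.

```python
from typing import Dict


class BucketCounter:
    """Count objects bucket by bucket, refreshing the total in batches."""

    def __init__(self, flush_every: int = 256):
        self.flush_every = flush_every
        self.bucket_counts: Dict[int, int] = {}  # stand-in for the bucket table
        self.total = 0  # stand-in for the row in the main counts table
        self.flushes = 0  # number of writes to the main table
        self._pending = 0  # buckets counted since the last flush

    def count_bucket(self, bucket_id: int, count: int) -> None:
        """Record the count of one bucket (one short transaction)."""
        self.bucket_counts[bucket_id] = count
        self._pending += 1
        if self._pending >= self.flush_every:
            self.flush()

    def flush(self) -> None:
        """Refresh the main total from all bucket counts."""
        self.total = sum(self.bucket_counts.values())
        self._pending = 0
        self.flushes += 1


counter = BucketCounter()
for bucket_id in range(512):
    counter.count_bucket(bucket_id, 3)  # pretend each bucket holds 3 objects
```

With 512 buckets and a flush every 256, the main total is written twice instead of 512 times, which is the churn reduction the message refers to.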
-
- Feb 09, 2018
-
-
Stefano Zacchiroli authored
-
- Feb 08, 2018
-
-
Antoine R. Dumont authored
-