- Nov 29, 2022
Jenkins for Software Heritage authored
Update to upstream version '2.9.0' with Debian dir a24e16563ac395dcd85fb32f74c624ea35d115ae
- Nov 28, 2022
vlorentz authored
This avoids having one transaction insert row A then B while another inserts row B then A, which (probably) leads to deadlocks like this:

```
DeadlockDetected: deadlock detected
DETAIL: Process 1842336 waits for ShareLock on transaction 1051957280; blocked by process 64261.
Process 64261 waits for ShareLock on transaction 1051957281; blocked by process 1842336.
HINT: See server log for query details.
CONTEXT: while inserting index tuple (1972253,5) in relation "origin_extrinsic_metadata"
SQL statement "insert into origin_extrinsic_metadata (id, metadata, indexer_configuration_id, from_remd_id, metadata_tsvector, mappings)
```

https://sentry.softwareheritage.org/share/issue/52b06caae89f4235a758887fd6817656/

This was already mitigated by sorting before inserting into temporary tables, then expecting PostgreSQL to read from the temporary tables in the same order the rows were inserted. This is often true, but not guaranteed.

No test for this, because I do not see a way to replicate this more than the existing deadlock tests do.
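The idea of copying staged rows in an explicit key order can be sketched as follows. This is only an illustration under assumptions: it uses the standard-library `sqlite3` module rather than PostgreSQL, and the table names (`metadata`, `tmp_metadata`) and the helper `insert_in_key_order` are hypothetical, not taken from the actual change. What it shows is the shape of the statement: the final `INSERT ... SELECT` carries its own `ORDER BY` instead of relying on the order in which the temporary table happens to be read.

```python
import sqlite3


def insert_in_key_order(conn, rows):
    """Stage rows in a temporary table, then copy them in a deterministic order.

    Hypothetical helper: table and column names are illustrative only.
    """
    conn.execute(
        "CREATE TEMP TABLE IF NOT EXISTS tmp_metadata (id TEXT, metadata TEXT)"
    )
    conn.execute("DELETE FROM tmp_metadata")
    conn.executemany(
        "INSERT INTO tmp_metadata (id, metadata) VALUES (?, ?)", rows
    )
    # The explicit ORDER BY is the point of the sketch: every writer copies
    # rows in the same key order, whatever order the staging table returns.
    conn.execute(
        "INSERT INTO metadata (id, metadata) "
        "SELECT id, metadata FROM tmp_metadata ORDER BY id"
    )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE metadata (id TEXT, metadata TEXT)")
insert_in_key_order(conn, [("b", "{}"), ("a", "{}"), ("c", "{}")])

# rowid reflects the order in which rows reached the target table.
inserted_ids = [
    row[0] for row in conn.execute("SELECT id FROM metadata ORDER BY rowid")
]
```

SQLite does not reproduce the row-lock deadlock itself; the sketch only demonstrates that the copy order is fixed by the query rather than by the staging table.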
- Nov 23, 2022
Jenkins for Software Heritage authored
Update to upstream version '2.8.0' with Debian dir 8b4ffcb03cd177c9149fcfd09e3a4cf499a1b201
- Nov 21, 2022
vlorentz authored
Some snapshots are really large. Rather than fetching them entirely only to discard most of the branches, this commit fetches only a first set of branches (to check that the snapshot exists and to use fewer queries on small snapshots), then requests specific branches as needed (usually only 2). This should improve performance and reduce timeout exceptions from the storage.
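A minimal sketch of that two-step lookup, assuming a storage object that can return either the first few branches of a snapshot or a set of branches requested by name. `FakeStorage`, `get_branches`, and `resolve_branches` are hypothetical stand-ins written for this illustration; they are not the real storage API.

```python
class FakeStorage:
    """In-memory stand-in for a snapshot storage (hypothetical interface)."""

    def __init__(self, branches):
        # Branches are kept sorted by name, as a paginated listing would be.
        self._branches = dict(sorted(branches.items()))
        self.calls = 0

    def get_branches(self, first=None, names=None):
        self.calls += 1
        if names is not None:
            return {n: self._branches[n] for n in names if n in self._branches}
        return dict(list(self._branches.items())[:first])


def resolve_branches(storage, wanted, page_size=100):
    """Return the wanted branches without downloading the whole snapshot."""
    first_page = storage.get_branches(first=page_size)
    if not first_page:
        return None  # no branches at all: treat the snapshot as missing/empty
    found = {n: first_page[n] for n in wanted if n in first_page}
    missing = [n for n in wanted if n not in found]
    # A full first page means there may be more branches, so ask for the
    # missing ones by name; a short page means we already saw everything.
    if missing and len(first_page) == page_size:
        found.update(storage.get_branches(names=missing))
    return found


small = FakeStorage({"HEAD": "main", "refs/heads/main": "rev1"})
small_result = resolve_branches(small, ["HEAD", "refs/heads/main"])

large_branches = {f"refs/heads/b{i:04d}": f"rev{i}" for i in range(1000)}
large_branches.update({"HEAD": "main", "refs/heads/main": "rev-main"})
large = FakeStorage(large_branches)
large_result = resolve_branches(large, ["HEAD", "refs/heads/main"])
```

In this sketch the small snapshot is resolved with a single query, while the large one needs one extra targeted query instead of a full listing.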
- Nov 03, 2022
Nicolas Dandrimont authored
This code was flushing Kafka messages and waiting for the brokers on every message, instead of doing it just once per batch.
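The difference can be sketched with a stand-in producer that only counts calls. `RecordingProducer` is a hypothetical test double exposing `produce()` and `flush()` methods, and `send_batch` and the topic name are illustrative; none of this is the project's actual code.

```python
class RecordingProducer:
    """Hypothetical stand-in for a Kafka producer; it only records calls."""

    def __init__(self):
        self.produced = []
        self.flush_count = 0

    def produce(self, topic, value):
        self.produced.append((topic, value))

    def flush(self):
        # A real flush blocks until the brokers acknowledge pending messages,
        # which is why calling it once per message is expensive.
        self.flush_count += 1


def send_batch(producer, topic, messages):
    """Queue every message, then wait for the brokers once for the batch."""
    for message in messages:
        producer.produce(topic, message)
    producer.flush()


producer = RecordingProducer()
send_batch(producer, "example-topic", [b"m1", b"m2", b"m3"])
```

With the flush moved out of the loop, the number of blocking round trips no longer grows with the batch size.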
-
- Nov 02, 2022
Antoine Lambert authored
-
Jenkins for Software Heritage authored
Update to upstream version '2.7.3' with Debian dir 7da9a21feb7239b589c6d53d33ca7baf0dc3504f
-
- Oct 27, 2022
Jenkins for Software Heritage authored
Update to upstream version '2.7.2' with Debian dir 0a3e9b68e3a9a078d1a53cb19a27f9c8e117938e
- Oct 26, 2022
vlorentz authored
Codemeta reexports schema:url, schema:dateCreated, ... with `"@type": "@id"` and `"@type": "schema:Date"` so that

```
{
    "@context": "https://doi.org/10.5063/schema/codemeta-2.0",
    "url": "http://example.org",
    "dateCreated": "2022-10-26"
}
```

expands to:

```
{
    "http://schema.org/url": {
        "@type": "@id",
        "@value": "http://example.org"
    },
    "http://schema.org/dateCreated": {
        "@type": "http://schema.org/Date",
        "@value": "2022-10-26"
    }
}
```

However, our translation tried to translate directly to a partially expanded form, like this:

```
{
    "@context": "https://doi.org/10.5063/schema/codemeta-2.0",
    "url": {
        "@value": "http://example.org"
    },
    "dateCreated": {
        "@value": "2022-10-26"
    }
}
```

which prevents the compaction and expansion algorithms from adding a type themselves, causing the document to be compacted to:

```
{
    "@context": "https://doi.org/10.5063/schema/codemeta-2.0",
    "schema:url": "http://example.org",
    "schema:dateCreated": "2022-10-26"
}
```

or expanded to:

```
{
    "http://schema.org/url": {
        "@value": "http://example.org"
    },
    "http://schema.org/dateCreated": {
        "@value": "2022-10-26"
    }
}
```

neither of which is what we want.

This commit replaces the hack for `@type` with the right solution that works for all properties.
- Oct 25, 2022
- Oct 24, 2022
vlorentz authored
Without this, some Sentry issues were tagged with the wrong object, which can be very confusing.
-
- Oct 18, 2022
David Douard authored
  - pre-commit from 4.1.0 to 4.3.0,
  - codespell from 2.2.1 to 2.2.2,
  - black from 22.3.0 to 22.10.0 and
  - flake8 from 4.0.1 to 5.0.4.

Also freeze flake8 dependencies.

Also change flake8's repo config to GitHub (the GitLab mirror being outdated).
-
- Oct 07, 2022
Jenkins for Software Heritage authored
Update to upstream version '2.7.1' with Debian dir 3e6c8e43699958132eba9d7c620d7871d989a731
- Sep 28, 2022
- Sep 27, 2022
vlorentz authored
-
vlorentz authored
It was only fixed as a side effect of other changes, but it's good to have a regression test.
-
vlorentz authored
They are closer semantically, as 'html_url' is the main page of the repository, so it is the best way to identify it; and 'clone_url' is the URL that should be given to 'git clone', as documented by https://schema.org/codeRepository

Additionally, that property was missing so far, but a future commit will need to use it to identify fork relationships (node ids are required to represent relationships between documents, as we cannot use blank nodes for that).
-
vlorentz authored
-
vlorentz authored
-
- Sep 12, 2022
Antoine Lambert authored
They have been moved into a swh-core pytest plugin to share them with other swh packages that might need them.
-
Jenkins for Software Heritage authored
Update to upstream version '2.6.0' with Debian dir 132f86a3595679ad6ca88ad2ca01b29bc4fc100b
-