- Apr 10, 2025
- Mar 26, 2025
-
-
Pierre-Yves David authored
This comes with PyPy 3.11 compatibility.
-
- Mar 14, 2025
-
-
vlorentz authored
-
- Mar 11, 2025
-
-
vlorentz authored
In practice this didn't seem to be an issue, thanks to the array being large enough that it is unlikely two threads write to the same memory page at the same time (up to cache coherency), but we should not count on that. This is probably slower, as SWHIDs are 22 bytes, and x86_64 can't lock regions over 16 bytes.
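As an illustration of the trade-off described above, here is a minimal sketch of guarding 22-byte records with a lock so that concurrent writers can never produce a torn value. The `SwhidTable` type, its methods, and `fill_concurrently` are hypothetical names made up for this example; this is not the actual swh-graph code, and the real change may use a different synchronization scheme.

```rust
use std::sync::Mutex;
use std::thread;

/// Size of a binary SWHID in bytes, as stated in the commit message.
pub const SWHID_SIZE: usize = 22;

/// Hypothetical table of fixed-size records guarded by a single lock.
/// A 22-byte record is too large for a single atomic store on x86_64,
/// so every write goes through the mutex.
pub struct SwhidTable {
    slots: Mutex<Vec<[u8; SWHID_SIZE]>>,
}

impl SwhidTable {
    pub fn new(len: usize) -> Self {
        SwhidTable {
            slots: Mutex::new(vec![[0u8; SWHID_SIZE]; len]),
        }
    }

    /// Writes one whole record while holding the lock.
    pub fn set(&self, index: usize, swhid: [u8; SWHID_SIZE]) {
        let mut slots = self.slots.lock().unwrap();
        slots[index] = swhid;
    }

    /// Reads one whole record while holding the lock.
    pub fn get(&self, index: usize) -> [u8; SWHID_SIZE] {
        self.slots.lock().unwrap()[index]
    }
}

/// Fills the table from several threads; thread `t` writes the indices
/// congruent to `t` modulo `num_threads`, so index sets are disjoint.
pub fn fill_concurrently(table: &SwhidTable, len: usize, num_threads: usize) {
    thread::scope(|scope| {
        for t in 0..num_threads {
            scope.spawn(move || {
                for i in (t..len).step_by(num_threads) {
                    table.set(i, [(i % 256) as u8; SWHID_SIZE]);
                }
            });
        }
    });
}
```

A single mutex is the simplest correct option and also the slowest under contention; striping the lock over ranges of indices would be one way to reduce that cost.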
-
- Mar 07, 2025
- Mar 05, 2025
-
-
vlorentz authored
mimalloc is already used in swh-provenance, provides performance on par with jemalloc (even better by some reports, though with more memory overhead), and compiles significantly faster than tikv-jemallocator.
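Swapping allocators in a Rust binary goes through the `#[global_allocator]` attribute. The sketch below shows that mechanism with a stand-in allocator (`CountingAlloc`, a hypothetical name) that forwards to the system allocator, so it compiles without external crates. With the `mimalloc` crate as a dependency, the static would instead be declared as `static GLOBAL: mimalloc::MiMalloc = mimalloc::MiMalloc;`.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Number of allocations served since program start.
pub static ALLOCATIONS: AtomicUsize = AtomicUsize::new(0);

/// Stand-in allocator (hypothetical): forwards every request to the
/// system allocator and counts allocations along the way.
pub struct CountingAlloc;

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOCATIONS.fetch_add(1, Ordering::Relaxed);
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

// Every heap allocation in the program now goes through `CountingAlloc`.
#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;

/// Returns how many allocations have been served so far.
pub fn allocation_count() -> usize {
    ALLOCATIONS.load(Ordering::Relaxed)
}
```

Because the allocator is selected by a single static, switching between jemalloc, mimalloc, or the system allocator does not require touching any other code.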
-
- Feb 19, 2025
-
-
Martin Kirchgessner authored
-
- Feb 11, 2025
-
-
vlorentz authored
Move aggregate-dataset, blobs-datasets, filenames, and origin contributors to swh-datasets, and provenance generation to its own repository.
-
- Jan 16, 2025
-
-
vlorentz authored
-
- Jan 09, 2025
- Dec 19, 2024
-
-
vlorentz authored
-
- Dec 17, 2024
-
- Dec 10, 2024
-
-
Aymeric Varasse authored
Include update to `pyo3-0.23.0`
-
- Oct 31, 2024
-
-
vlorentz authored
-
- Oct 29, 2024
-
-
vlorentz authored
-
vlorentz authored
This reverts commit 260d641a.

orc-rust v0.4.0 and v0.5.0 break ar_row:

* https://github.com/datafusion-contrib/orc-rust/pull/13
* https://github.com/datafusion-contrib/datafusion-orc/issues/137
-
vlorentz authored
-
vlorentz authored
-
vlorentz authored
This can be used to get a textual representation of small graphs, which can then be shared and used in tests or examples.
- Oct 24, 2024
-
-
vlorentz authored
-
- Oct 18, 2024
- Oct 15, 2024
-
-
vlorentz authored
-
- Sep 24, 2024
-
-
vlorentz authored
This allows linking errors to the log events that preceded them. For example, if I add this code to the Traverse method:

```
if request.get_ref().src == ["swh:1:rev:57012c57536f8814dec92e74197ee96c3498d24e"] {
    use tokio::time::{sleep, Duration};
    sleep(Duration::from_millis(10000)).await;
    tracing::error!("oh no traverse");
}
```

and send two requests to traverse, first from swh:1:rev:57012c57536f8814dec92e74197ee96c3498d24e, then from swh:1:rev:0000000000000000000000000000000000000003, the logs will show:

```
2024-09-17T09:28:33.157323Z INFO request{id=0}:traverse: swh_graph_grpc_server: TraversalRequest { src: ["swh:1:rev:57012c57536f8814dec92e74197ee96c3498d24e"], direction: Forward, edges: None, max_edges: Some(1000000), min_depth: None, max_depth: None, return_nodes: None, mask: Some(FieldMask { paths: ["swhid"] }), max_matching_nodes: None }
2024-09-17T09:28:35.022307Z INFO request{id=1}:traverse: swh_graph_grpc_server: TraversalRequest { src: ["swh:1:rev:0000000000000000000000000000000000000003"], direction: Forward, edges: None, max_edges: Some(1000000), min_depth: None, max_depth: None, return_nodes: None, mask: Some(FieldMask { paths: ["swhid"] }), max_matching_nodes: None }
2024-09-17T09:28:35.022778Z INFO request{id=1}: swh_graph_grpc_server: 200 OK - http://localhost:50091/swh.graph.TraversalService/Traverse - response: 333.458µs - streaming: 325.639µs
2024-09-17T09:28:43.158810Z ERROR request{id=0}:traverse: swh_graph_grpc_server: oh no traverse
2024-09-17T09:28:43.159228Z INFO request{id=0}:traverse: swh_graph_grpc_server: error=status: NotFound, message: "Unknown SWHID: swh:1:rev:57012c57536f8814dec92e74197ee96c3498d24e", details: [], metadata: MetadataMap { headers: {} }
2024-09-17T09:28:43.159698Z INFO request{id=0}: swh_graph_grpc_server: 200 OK - http://localhost:50091/swh.graph.TraversalService/Traverse - response: 10.002514102s - streaming: 5.486µs
```

and Sentry will correctly show only `TraversalRequest { src: ["swh:1:rev:57012c57536f8814dec92e74197ee96c3498d24e"], ... }` in the breadcrumbs of the error, not `TraversalRequest { src: ["swh:1:rev:0000000000000000000000000000000000000003"], ... }`.
-
- Sep 16, 2024
-
-
vlorentz authored
It notifies Sentry on ERROR-level logs and panics.
-
- Sep 13, 2024
-
-
vlorentz authored
-
- Sep 12, 2024
-
-
vlorentz authored
-
- Sep 11, 2024
- Sep 10, 2024
-
-
vlorentz authored
This is easier to install, as there is no need for users to remember --features=grpc-server. This also avoids declaring every dependency of the gRPC server as 'optional' and adding it to the list of dependencies of the 'grpc-server' feature, and I expect this set of dependencies to get larger now that the gRPC server is starting to be used in production (e.g. statsd metrics). Finally, this allows Cargo to take the gRPC server build off the critical path to building swh_graph_provenance when building all crates (e.g. in CI), which should make the overall build a little faster.
-
vlorentz authored
to avoid OOMs due to storing Bloom Filters in memory while writing, see https://github.com/apache/arrow-rs/pull/5860
-
- Sep 03, 2024
-
-
vlorentz authored
-