- Feb 11, 2025
-
-
vlorentz authored
Move aggregate-dataset, blobs-datasets, filenames, and origin contributors to swh-datasets, and provenance generation to its own repository
-
- Jan 28, 2025
-
-
Aymeric Varasse authored
-
- Jan 23, 2025
-
-
Some parameters had been renamed, so some examples no longer worked.
-
- Jan 22, 2025
-
-
vlorentz authored
-
- Jan 17, 2025
-
-
The softwareheritage S3 bucket is public, so that option is required to download graph dataset files with the aws CLI.
-
- Jan 14, 2025
-
-
It is required to run swh-graph Python tests.
-
- Nov 15, 2024
-
-
vlorentz authored
The --locked option forces Cargo to use the lockfile declared by swh-graph, which was carefully written to avoid conflicts between two Arrow versions that are used as transitive dependencies through different packages. When Cargo tries to update one, it ends up with incompatible versions, causing a compilation failure. Currently, this only happens when enabling the `orc` features (which commands in the documentation do not enable), but similar issues could appear in the future with the default versions.
-
- Oct 24, 2024
-
-
vlorentz authored
-
- Oct 15, 2024
-
- Sep 19, 2024
-
-
vlorentz authored
They used to be installed by everyone as part of installing the gRPC server, but no longer are since we moved the gRPC server to its own crate.
-
- Sep 13, 2024
-
-
vlorentz authored
-
- Sep 12, 2024
-
-
vlorentz authored
-
- Sep 11, 2024
-
- Sep 10, 2024
-
-
vlorentz authored
This is easier to install, as there is no need for users to remember --features=grpc-server. This also avoids declaring every dependency of the gRPC server as 'optional' and adding it to the list of dependencies of the 'grpc-server' feature; and I expect this set of dependencies to get larger now that the gRPC server is starting to be used in production (e.g. statsd metrics). Finally, this allows Cargo to take building the gRPC server off the critical path to building swh_graph_provenance when building all crates (e.g. in CI), which should make the build a little faster.
-
- Sep 09, 2024
-
-
vlorentz authored
It was a bug in the Java gRPC server to allow it without the "node." prefix, because FieldMasks should match the fully-qualified field name in the output, and these two endpoints return a Path (with a Node field) rather than a Node stream directly, as Traverse does.
-
- Sep 06, 2024
-
-
vlorentz authored
This makes the EDGE_LABELS run time go from 2.5 days to 1 day, and TRANSPOSE_EDGE_LABELS probably from 3.5 days to 1-1.5 days.
-
- Aug 30, 2024
-
-
Antoine Lambert authored
-
- Aug 22, 2024
-
-
vlorentz authored
1. seems to be the de facto standard
2. already used by webgraph
3. prints which module emitted each log line
4. can be configured to default to INFO level, so we don't need to pass -vv to every Rust executable to get progress reporting
5. env vars allow configuring even when commands are called through Luigi
-
vlorentz authored
-
vlorentz authored
-
- Aug 21, 2024
- Aug 16, 2024
-
-
vlorentz authored
Based on observing ardumont read it and try to follow it.
-
- Aug 13, 2024
-
-
vlorentz authored
The tutorial is adapted from the existing Java tutorial.
-
- Aug 10, 2024
-
-
vlorentz authored
-
- Aug 06, 2024
-
-
vlorentz authored
-
- Jul 02, 2024
-
-
Renaud Boyer authored
build-essentials -> build-essential
protobuf-compiler is required by cargo install swh-graph
-
- Jun 18, 2024
-
-
vlorentz authored
-
- Jun 17, 2024
-
-
vlorentz authored
-
- May 31, 2024
-
-
Théo Zimmermann authored
-
- Apr 17, 2024
-
-
vlorentz authored
-
- Apr 10, 2024
-
-
vlorentz authored
-
- Apr 08, 2024
-
-
vlorentz authored
-
- Feb 14, 2024
-
-
Stefano Zacchiroli authored
-
- Jan 08, 2024
-
-
vlorentz authored
When cgroups are available
-
- Dec 01, 2023
-
-
David Douard authored
-
- Nov 29, 2023
-
-
David Douard authored
-