provenance/Dockerfile: Evolve to compile the rust provenance crate

Merged Antoine R. Dumont requested to merge mr/deploy-new-provenance into master
1 unresolved thread

awscli [1] is the tool used to retrieve the necessary files (provenance, graph) so that the deployment is self-contained. The rest consists of the Rust compilation steps that produce the binaries needed to run the provenance index and grpc services [2] [3].

In detail, this adapts the current provenance deployment:

  • Adapt the Dockerfile to install the swh-provenance crate (adding the Rust tooling)
  • Adapt the entrypoint to run either the rpc or the grpc server [2] [3]
  • Adapt the entrypoint to add an option to build the database index (required for the grpc server to run) [2] [3]
  • Adapt the utils image to add awscli (to fetch the necessary files from s3) [1]
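
The entrypoint dispatch described in the list above could look roughly like the sketch below. It is not the merged entrypoint.sh: the mode names (`index`, `grpc`, `rpc`), the variables `PROVENANCE_DATABASE`, `PROVENANCE_GRAPH` and `PROVENANCE_RPC_COMMAND`, and the default paths are all assumptions made for illustration. Only the binary names and the `--graph` / `--database` flags come from the help output quoted in [2].

```shell
#!/bin/sh
# Hypothetical sketch of an entrypoint dispatch, NOT the merged entrypoint.sh.
# Variable names, mode names and default paths are assumptions.

# Assumed locations of the files fetched beforehand (e.g. with awscli)
PROVENANCE_DATABASE="${PROVENANCE_DATABASE:-/srv/provenance/database}"
PROVENANCE_GRAPH="${PROVENANCE_GRAPH:-/srv/graph/graph}"

# Print the command line matching the requested mode.
# Returns non-zero (and prints nothing on stdout) for an unknown mode.
build_command() {
    case "$1" in
        index)
            # Build the indexes the grpc server needs before it can run
            echo "swh-provenance-index --database $PROVENANCE_DATABASE"
            ;;
        grpc)
            echo "swh-provenance-grpc-serve --graph $PROVENANCE_GRAPH --database $PROVENANCE_DATABASE"
            ;;
        rpc)
            # The actual rpc server command is not shown in this merge
            # request, so it is read from a hypothetical variable here.
            echo "$PROVENANCE_RPC_COMMAND"
            ;;
        *)
            echo "unknown mode: $1" >&2
            return 1
            ;;
    esac
}
```

A real entrypoint would run the selected command rather than print it; printing keeps the sketch easy to check without the binaries installed.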

[1] https://docs.softwareheritage.org/devel/swh-provenance/grpc-api.html#getting-a-provenance-database

[3] https://crates.io/crates/swh-provenance

[2]

root@d79ab1e18df7:/opt/swh# swh-provenance-
swh-provenance-gen-test-database  swh-provenance-grpc-serve         swh-provenance-index
root@d79ab1e18df7:/opt/swh# swh-provenance-index --help
Builds .ef indexes for extra quick querying of the Software Heritage Provenance Index

Usage: swh-provenance-index [OPTIONS] --database <DATABASE>

Options:
      --database <DATABASE>        Path to the provenance database
      --indexes <INDEXES>          Path to the directory where to write paths to. Defaults to `--database` (when it is a file:// URL)
      --statsd-host <STATSD_HOST>  Defaults to `localhost:8125` (or whatever is configured by the `STATSD_HOST` and `STATSD_PORT` environment variables)
  -h, --help                       Print help
root@d79ab1e18df7:/opt/swh# swh-provenance-grpc-serve --help
gRPC server for the Software Heritage Provenance Index

Usage: swh-provenance-grpc-serve [OPTIONS] --graph <GRAPH> --database <DATABASE>

Options:
      --cache-parquet                Keep Parquet metadata in RAM between queries, instead of re-parsing them every time
      --graph-format <GRAPH_FORMAT>  [default: webgraph] [possible values: webgraph, json]
      --graph <GRAPH>                Path to the graph prefix
      --database <DATABASE>          Path to the provenance database
      --indexes <INDEXES>            Path to Elias-Fano indexes, default to `--database` (when it is a file:// URL)
      --bind <BIND>                  [default: [::]:50141]
      --statsd-host <STATSD_HOST>    Defaults to `localhost:8125` (or whatever is configured by the `STATSD_HOST` and `STATSD_PORT` environment variables)
  -h, --help                         Print help

Refs. sysadm-environment#5608 (closed)

Edited by Antoine R. Dumont


Activity

  swh.provenance
  python-json-logger
  gunicorn
+ awscli
  • Not entirely sure yet whether this one or the other is needed. Developing the template will tell, and I'll come back to that repository to drop the unneeded one.

  • added 1 commit

    • 9d71df1e - provenance/entrypoint.sh: Adapt to run either a grpc or rpc

  • Antoine R. Dumont merged manually
