staging/provenance: Deploy new grpc instance and expose its api through the webapp
In the staging environment, this:
- deploys the gRPC provenance server
- adapts the webapp to use that new gRPC server
This exposes 2 ingresses:
- one for other cluster-internal services (like the webapp): provenance-grpc-popular-ingress:80
- another for staff/VPN users: provenance.internal.staging.swh.network:80
Having two separate ingresses makes it possible to tell these two kinds of usage apart in the Grafana dashboard.
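As a quick way to confirm that both ingresses accept connections after deployment, a minimal sketch along the following lines could be used. The two host:port pairs come from the description above; the helper names `tcp_reachable` and `check_all` are hypothetical, and the check only opens a TCP connection (much like the `tcpSocket` probes in the deployment), so it says nothing about whether the gRPC service itself answers correctly.

```python
import socket

# Endpoints taken from the merge request description. Whether they resolve
# depends on where this runs (inside the cluster vs. on the staff VPN).
ENDPOINTS = [
    ("provenance-grpc-popular-ingress", 80),          # cluster-internal
    ("provenance.internal.staging.swh.network", 80),  # staff/VPN
]


def tcp_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections and timeouts.
        return False


def check_all(endpoints=ENDPOINTS) -> dict:
    """Map 'host:port' to a reachability flag for every endpoint."""
    return {f"{host}:{port}": tcp_reachable(host, port) for host, port in endpoints}


if __name__ == "__main__":
    for endpoint, ok in check_all().items():
        print(f"{endpoint}: {'reachable' if ok else 'unreachable'}")
```

The internal name is only expected to resolve from pods inside the cluster, so an "unreachable" result from a workstation is not necessarily a deployment problem.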
helm diff
[swh] Comparing changes between branches production and mr/staging-deploy-provenance-grpc (per environment)...
Your branch is up to date with 'origin/production'.
[swh] Generate config in production branch for environment staging, namespace swh...
[swh] Generate config in production branch for environment staging, namespace swh-cassandra...
[swh] Generate config in production branch for environment staging, namespace next-version...
[swh] Generate config in mr/staging-deploy-provenance-grpc branch for environment staging...
[swh] Generate config in mr/staging-deploy-provenance-grpc branch for environment staging...
[swh] Generate config in mr/staging-deploy-provenance-grpc branch for environment staging...
Your branch is up to date with 'origin/production'.
[swh] Generate config in production branch for environment production, namespace swh...
[swh] Generate config in production branch for environment production, namespace swh-cassandra...
[swh] Generate config in production branch for environment production, namespace next-version...
[swh] Generate config in mr/staging-deploy-provenance-grpc branch for environment production...
[swh] Generate config in mr/staging-deploy-provenance-grpc branch for environment production...
[swh] Generate config in mr/staging-deploy-provenance-grpc branch for environment production...
------------- diff for environment staging namespace swh -------------
--- /tmp/swh-chart.swh.ZBmUVqQg/staging-swh.before 2025-03-25 15:32:32.755491362 +0100
+++ /tmp/swh-chart.swh.ZBmUVqQg/staging-swh.after 2025-03-25 15:32:33.339469251 +0100
@@ -2837,20 +2837,30 @@
name: deposit-rpc-ingress
namespace: swh
spec:
type: ExternalName
externalName: archive-staging-rke2-ingress-nginx-controller.ingress-nginx.svc.cluster.local
---
# Source: swh/templates/external-services/cname.yaml
apiVersion: v1
kind: Service
metadata:
+ name: provenance-grpc-popular-ingress
+ namespace: swh
+spec:
+ type: ExternalName
+ externalName: archive-staging-rke2-ingress-nginx-controller.ingress-nginx.svc.cluster.local
+---
+# Source: swh/templates/external-services/cname.yaml
+apiVersion: v1
+kind: Service
+metadata:
name: graph-grpc-ingress
namespace: swh
spec:
type: ExternalName
externalName: archive-staging-rke2-ingress-nginx-controller.ingress-nginx.svc.cluster.local
---
# Source: swh/templates/external-services/cname.yaml
apiVersion: v1
kind: Service
metadata:
------------- diff for environment staging namespace swh-cassandra -------------
--- /tmp/swh-chart.swh.ZBmUVqQg/staging-swh-cassandra.before 2025-03-25 15:32:33.071479398 +0100
+++ /tmp/swh-chart.swh.ZBmUVqQg/staging-swh-cassandra.after 2025-03-25 15:32:33.679456379 +0100
@@ -9708,23 +9708,22 @@
json_path: 1.0/status/578e5eddcdc0cc7951000520
server_url: https://status.softwareheritage.org/
corner_ribbon_text: Staging
show_corner_ribbon: "true"
search:
cls: remote
enable_requests_retry: true
url: http://search-rpc-ingress
provenance:
- cls: remote
- enable_requests_retry: true
- url: http://webapp-provenance-ingress
+ cls: grpc
+ url: provenance-grpc-popular-ingress:80
scheduler:
cls: remote
url: http://scheduler.internal.staging.swh.network
vault:
cls: remote
enable_requests_retry: true
url: http://vault-rpc-ingress
graph:
max_edges:
anonymous: 1000
@@ -10174,20 +10173,37 @@
app: graph-grpc-python3k
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 1Gi
storageClassName: local-persistent
volumeMode: Filesystem
---
+# Source: swh/templates/volumes/persistent-volume-claims.yaml
+apiVersion: v1
+kind: PersistentVolumeClaim
+metadata:
+ name: provenance-popular-persistent-pvc
+ namespace: swh-cassandra
+ labels:
+ app: provenance-popular-grpc
+spec:
+ accessModes:
+ - ReadWriteOnce
+ resources:
+ requests:
+ storage: 1Gi
+ storageClassName: local-persistent
+ volumeMode: Filesystem
+---
# Source: swh/templates/counters/rpc-service.yaml
apiVersion: v1
kind: Service
metadata:
name: counters-rpc
namespace: swh-cassandra
labels:
app: counters-rpc
spec:
type: ClusterIP
@@ -10236,20 +10252,30 @@
name: deposit-rpc-ingress
namespace: swh-cassandra
spec:
type: ExternalName
externalName: archive-staging-rke2-ingress-nginx-controller.ingress-nginx.svc.cluster.local
---
# Source: swh/templates/external-services/cname.yaml
apiVersion: v1
kind: Service
metadata:
+ name: provenance-grpc-popular-ingress
+ namespace: swh-cassandra
+spec:
+ type: ExternalName
+ externalName: archive-staging-rke2-ingress-nginx-controller.ingress-nginx.svc.cluster.local
+---
+# Source: swh/templates/external-services/cname.yaml
+apiVersion: v1
+kind: Service
+metadata:
name: graph-grpc-ingress
namespace: swh-cassandra
spec:
type: ExternalName
externalName: archive-staging-rke2-ingress-nginx-controller.ingress-nginx.svc.cluster.local
---
# Source: swh/templates/external-services/cname.yaml
apiVersion: v1
kind: Service
metadata:
@@ -10603,20 +10629,37 @@
app: provenance-graph-granet
spec:
type: ClusterIP
selector:
app: provenance-graph-granet
ports:
- port: 5014
targetPort: 5014
name: rpc
---
+# Source: swh/templates/provenance/service.yaml
+apiVersion: v1
+kind: Service
+metadata:
+ name: provenance-popular-grpc
+ namespace: swh-cassandra
+ labels:
+ app: provenance-popular-grpc
+spec:
+ type: ClusterIP
+ selector:
+ app: provenance-popular-grpc
+ ports:
+ - port: 50141
+ targetPort: 50141
+ name: grpc
+---
# Source: swh/templates/scheduler/rpc-service.yaml
apiVersion: v1
kind: Service
metadata:
name: scheduler-rpc
namespace: swh-cassandra
labels:
app: scheduler-rpc
spec:
type: ClusterIP
@@ -20277,20 +20320,209 @@
- name: config-utils
configMap:
name: config-utils
defaultMode: 0555
- name: backend-utils
configMap:
name: backend-utils
defaultMode: 0555
---
+# Source: swh/templates/provenance/deployment.yaml
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+ namespace: swh-cassandra
+ name: provenance-popular-grpc
+ labels:
+ app: provenance-popular-grpc
+spec:
+ revisionHistoryLimit: 2
+ replicas: 2
+ selector:
+ matchLabels:
+ app: provenance-popular-grpc
+ strategy:
+ type: RollingUpdate
+ rollingUpdate:
+ maxSurge: 1
+ template:
+ metadata:
+ labels:
+ app: provenance-popular-grpc
+ annotations:
+ checksum/config: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
+ checksum/config-logging: 506cb71ff6d511b93438af8edc2cb510a5d5c0bdd22023e199de6740fca32b51
+ checksum/config-utils: 13a26f6add17e96ce01550153c77dcd48de60241a3f4db3c93d5467234be2a7f
+ checksum/backend-utils: 5ed55de12f3a82cd556464e232ae318f8523db72abd86b7410994ca05c3848ed
+ spec:
+ affinity:
+ nodeAffinity:
+ requiredDuringSchedulingIgnoredDuringExecution:
+ nodeSelectorTerms:
+ - matchExpressions:
+ - key: swh/rpc
+ operator: In
+ values:
+ - "true"
+ priorityClassName: swh-cassandra-frontend-rpc
+ initContainers:
+ - name: fetch-provenance-dataset
+ image: container-registry.softwareheritage.org/swh/infra/swh-apps/provenance:20250321.1
+ command:
+ - /entrypoints/provenance-fetch-datasets.sh
+ env:
+ - name: WITNESS_FETCH_FILE
+ value: /srv/dataset/provenance/.provenance-is-initialized
+ - name: WITNESS_DOWNLOADING_FILE
+ value: /srv/dataset/provenance/.provenance-is-downloading
+ - name: SWH_CONFIG_FILENAME
+ value: /etc/swh/config.yml
+ - name: PROVENANCE_PATH
+ value: /srv/dataset/provenance
+ - name: PROVENANCE_DATASET_FULL
+ value: "true"
+ - name: GRAPH_PATH
+ value: /srv/dataset/graph
+ - name: DATASET_VERSION
+ value: 2024-08-23-popular-500-python
+ volumeMounts:
+ - name: configuration
+ mountPath: /etc/swh
+ - name: backend-utils
+ mountPath: /entrypoints
+ - name: dataset-persistent
+ mountPath: /srv/dataset
+ readOnly: false
+
+ - name: index-provenance-dataset
+ image: container-registry.softwareheritage.org/swh/infra/swh-apps/provenance:20250321.1
+ imagePullPolicy: IfNotPresent
+ command:
+ - /entrypoints/provenance-index-dataset.sh
+ env:
+ - name: WITNESS_DATASETS_FILE
+ value: /srv/dataset/provenance/.provenance-is-initialized
+ - name: WITNESS_INDEX_FILE
+ value: /srv/dataset/provenance/.provenance-is-indexed
+ - name: WITNESS_INDEXING_FILE
+ value: /srv/dataset/provenance/.provenance-is-indexing
+ - name: PROVENANCE_PATH
+ value: /srv/dataset/provenance
+ - name: PERIOD
+ value: "3"
+ volumeMounts:
+ - name: backend-utils
+ mountPath: /entrypoints
+ readOnly: true
+ - name: dataset-persistent
+ mountPath: /srv/dataset
+ readOnly: false
+
+ - name: wait-for-dataset
+ image: container-registry.softwareheritage.org/swh/infra/swh-apps/utils:20250211.1
+ imagePullPolicy: IfNotPresent
+ command:
+ - /entrypoints/wait-for-dataset.sh
+ env:
+ - name: WITNESS_FILE
+ value: /srv/dataset/provenance/.provenance-is-initialized
+ - name: SERVICE_NAME
+ value: "provenance"
+ - name: PERIOD
+ value: "3"
+ volumeMounts:
+ - name: backend-utils
+ mountPath: /entrypoints
+ readOnly: true
+ - name: dataset-persistent
+ mountPath: /srv/dataset
+ readOnly: false
+
+ containers:
+ - name: provenance-popular-grpc
+ resources:
+ requests:
+ memory: 512Mi
+ cpu: 500m
+ image: container-registry.softwareheritage.org/swh/infra/swh-apps/provenance:20250321.1
+ imagePullPolicy: IfNotPresent
+ ports:
+ - containerPort: 50141
+ name: grpc
+ readinessProbe:
+ tcpSocket:
+ port: grpc
+ initialDelaySeconds: 15
+ failureThreshold: 30
+ periodSeconds: 5
+ livenessProbe:
+ tcpSocket:
+ port: grpc
+ initialDelaySeconds: 10
+ periodSeconds: 5
+ command:
+ - /bin/bash
+ args:
+ - -c
+ - /opt/swh/entrypoint.sh
+ env:
+ - name: PROVENANCE_TYPE
+ value: grpc
+ - name: PORT
+ value: "50141"
+ - name: PROVENANCE_PATH
+ value: /srv/dataset/provenance
+ - name: GRAPH_PATH
+ value: /srv/dataset/graph/graph
+ - name: STATSD_HOST
+ value: prometheus-statsd-exporter
+ - name: STATSD_PORT
+ value: "9125"
+ - name: STATSD_TAGS
+ value: deployment:provenance-popular-grpc
+ - name: SWH_LOG_LEVEL
+ value: INFO
+ - name: SWH_SENTRY_ENVIRONMENT
+ value: staging
+ - name: SWH_MAIN_PACKAGE
+ value: swh.provenance
+ - name: SWH_SENTRY_DSN
+ valueFrom:
+ secretKeyRef:
+ name: common-secrets
+ key: provenance-sentry-dsn
+ # 'name' secret should exist & include key
+ # if the setting doesn't exist, sentry pushes will be disabled
+ optional: true
+ - name: SWH_SENTRY_DISABLE_LOGGING_EVENTS
+ value: "true"
+ volumeMounts:
+ - name: dataset-persistent
+ mountPath: /srv/dataset
+ readOnly: false
+
+ volumes:
+ - name: configuration
+ emptyDir: {}
+ - name: config-utils
+ configMap:
+ name: config-utils
+ defaultMode: 0555
+ - name: backend-utils
+ configMap:
+ name: backend-utils
+ defaultMode: 0555
+ - name: dataset-persistent
+ persistentVolumeClaim:
+ claimName: provenance-popular-persistent-pvc
+---
# Source: swh/templates/scheduler/extra-services-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
namespace: swh-cassandra
name: scheduler-listener
labels:
app: scheduler-listener
spec:
revisionHistoryLimit: 2
@@ -24603,21 +24835,21 @@
app: web-cassandra
strategy:
type: RollingUpdate
rollingUpdate:
maxSurge: 1
template:
metadata:
labels:
app: web-cassandra
annotations:
- checksum/config: 618d363a889102f36ab478f2d271f8b1b05231ab6df401397e593dde30798953
+ checksum/config: 088d192724b78794ea6169aa8ea77e9fead4abcd61ef10ea54fe390eefd80676
checksum/config-logging: 21c90a039f27f4476045b8973a841bb2b3c0e4435be7fb9ab1d748372f8a96c8
checksum/config-utils: 13a26f6add17e96ce01550153c77dcd48de60241a3f4db3c93d5467234be2a7f
spec:
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: swh/web
operator: In
@@ -27236,31 +27468,92 @@
app: provenance-graph-granet
endpoint-definition: default
annotations:
nginx.ingress.kubernetes.io/client-body-buffer-size: 128K
nginx.ingress.kubernetes.io/proxy-body-size: 4G
nginx.ingress.kubernetes.io/proxy-buffering: "on"
nginx.ingress.kubernetes.io/service-upstream: "true"
nginx.ingress.kubernetes.io/whitelist-source-range: 10.42.0.0/16,10.43.0.0/16,192.168.100.29/32,192.168.101.0/24,192.168.130.0/24,192.168.50.0/24
spec:
rules:
- - host: provenance.internal.staging.swh.network
+ - host: provenance-rpc.internal.staging.swh.network
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: provenance-graph-granet
port:
number: 5014
---
+# Source: swh/templates/provenance/ingress.yaml
+apiVersion: networking.k8s.io/v1
+kind: Ingress
+metadata:
+ namespace: swh-cassandra
+ name: provenance-popular-grpc-ingress-default
+ labels:
+ app: provenance-popular-grpc
+ endpoint-definition: default
+ annotations:
+ nginx.ingress.kubernetes.io/backend-protocol: GRPC
+ nginx.ingress.kubernetes.io/client-body-buffer-size: 128K
+ nginx.ingress.kubernetes.io/proxy-body-size: 4G
+ nginx.ingress.kubernetes.io/proxy-buffering: "on"
+ nginx.ingress.kubernetes.io/service-upstream: "true"
+ nginx.ingress.kubernetes.io/ssl-redirect: "true"
+spec:
+ ingressClassName: nginx
+ rules:
+ - host: provenance-grpc-popular-ingress
+ http:
+ paths:
+ - path: /
+ pathType: Prefix
+ backend:
+ service:
+ name: provenance-popular-grpc
+ port:
+ number: 50141
+---
+# Source: swh/templates/provenance/ingress.yaml
+apiVersion: networking.k8s.io/v1
+kind: Ingress
+metadata:
+ namespace: swh-cassandra
+ name: provenance-popular-grpc-ingress-extra-1-default
+ labels:
+ app: provenance-popular-grpc
+ endpoint-definition: default
+ annotations:
+ nginx.ingress.kubernetes.io/backend-protocol: GRPC
+ nginx.ingress.kubernetes.io/client-body-buffer-size: 128K
+ nginx.ingress.kubernetes.io/proxy-body-size: 4G
+ nginx.ingress.kubernetes.io/proxy-buffering: "on"
+ nginx.ingress.kubernetes.io/service-upstream: "true"
+ nginx.ingress.kubernetes.io/ssl-redirect: "true"
+ nginx.ingress.kubernetes.io/whitelist-source-range: 10.42.0.0/16,10.43.0.0/16,192.168.100.29/32,192.168.101.0/24,192.168.130.0/24,192.168.50.0/24
+spec:
+ ingressClassName: nginx
+ rules:
+ - host: provenance.internal.staging.swh.network
+ http:
+ paths:
+ - path: /
+ pathType: Prefix
+ backend:
+ service:
+ name: provenance-popular-grpc
+ port:
+ number: 50141
+---
# Source: swh/templates/scheduler/rpc-ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
namespace: swh-cassandra
name: scheduler-rpc-ingress-default
labels:
app: scheduler-rpc
endpoint-definition: default
annotations:
------------- diff for environment staging namespace next-version -------------
--- /tmp/swh-chart.swh.ZBmUVqQg/staging-next-version.before 2025-03-25 15:32:33.211474098 +0100
+++ /tmp/swh-chart.swh.ZBmUVqQg/staging-next-version.after 2025-03-25 15:32:33.823450927 +0100
@@ -5095,20 +5095,30 @@
name: graph-rpc-next-version-ingress
namespace: swh-cassandra-next-version
spec:
type: ExternalName
externalName: archive-staging-rke2-ingress-nginx-controller.ingress-nginx.svc.cluster.local
---
# Source: swh/templates/external-services/cname.yaml
apiVersion: v1
kind: Service
metadata:
+ name: provenance-grpc-popular-ingress
+ namespace: swh-cassandra-next-version
+spec:
+ type: ExternalName
+ externalName: archive-staging-rke2-ingress-nginx-controller.ingress-nginx.svc.cluster.local
+---
+# Source: swh/templates/external-services/cname.yaml
+apiVersion: v1
+kind: Service
+metadata:
name: graph-grpc-ingress
namespace: swh-cassandra-next-version
spec:
type: ExternalName
externalName: archive-staging-rke2-ingress-nginx-controller.ingress-nginx.svc.cluster.local
---
# Source: swh/templates/external-services/cname.yaml
apiVersion: v1
kind: Service
metadata:
------------- diff for environment production namespace swh -------------
No differences
------------- diff for environment production namespace swh-cassandra -------------
No differences
Activity
mentioned in issue swh/infra/sysadm-environment#5608 (closed)
added 5 commits
- 15e7440e...1986b37d - 3 commits from branch production
- bb0ffbe4 - staging/provenance: Deploy new grpc instance
- f52c9001 - staging/web: Expose provenance api through webapp
- Resolved by Antoine R. Dumont
FWIW, I've deployed the first commit (because nothing depends on it yet).
The indexing step failed [1] (command used).
As a result, the provenance gRPC server did not start [2].
[1]
2025-03-25T17:52:41.429551448Z Datasets file installed, build provenance dataset indexes...
2025-03-25T17:52:41.429581125Z provenance path: /srv/dataset/provenance
2025-03-25T17:52:41.430413234Z + swh-provenance-index --database file:///srv/dataset/provenance
2025-03-25T17:52:41.508419978Z 2025-03-25T17:52:41.508209Z INFO swh_provenance_index: Loading database...
2025-03-25T17:52:41.797610829Z 2025-03-25T17:52:41.797396Z INFO swh_provenance_index: Database loaded.
2025-03-25T17:52:41.798052544Z 2025-03-25T17:52:41.797939Z INFO swh_provenance_index: Building and writing indexes...
2025-03-25T17:52:46.793307583Z /entrypoints/provenance-index-dataset.sh: line 42: 784 Illegal instruction (core dumped) swh-provenance-index --database "file://${PROVENANCE_PATH}"
2025-03-25T17:52:46.793353025Z + echo 'Provenance indexes failed!'
2025-03-25T17:52:46.793312070Z Provenance indexes failed!
2025-03-25T17:52:46.793425855Z + '[' -f /srv/dataset/provenance/.provenance-is-indexing ']'
2025-03-25T17:52:46.793452192Z + rm /srv/dataset/provenance/.provenance-is-indexing
[2]
2025-03-25T19:25:29.119907211Z Error: Could not mmap Elias-Fano indexes
2025-03-25T19:25:29.119936427Z
2025-03-25T19:25:29.119945207Z Caused by:
2025-03-25T19:25:29.119959137Z 0: Could not mmap index for 'contents_in_frontier_directories' table
2025-03-25T19:25:29.119976115Z 1: Could not map file-level Elias-Fano index
2025-03-25T19:25:29.119989841Z 2: Could not mmap Elias-Fano index from /srv/dataset/provenance/contents_in_frontier_directories/7.parquet.index=cnt.ef
2025-03-25T19:25:29.120160225Z 3: No such file or directory (os error 2)
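To see how far the failed indexing run got, a rough sketch like the one below could list the parquet files that have no index next to them. It rests on an assumption drawn only from the single path in [2]: that each `<name>.parquet` file is expected to have at least one sibling named `<name>.parquet.index=<column>.ef`. The real layout produced by `swh-provenance-index` may differ, and `parquet_files_missing_index` is a hypothetical helper name.

```python
from pathlib import Path


def parquet_files_missing_index(provenance_path: str) -> list:
    """List .parquet files that have no sibling '<file>.index=<column>.ef'.

    The naming pattern is an assumption inferred from one path in the
    error log, not a documented contract of swh-provenance-index.
    """
    missing = []
    for parquet in sorted(Path(provenance_path).rglob("*.parquet")):
        if not parquet.is_file():
            continue
        prefix = parquet.name + ".index="
        has_index = any(
            sibling.name.startswith(prefix) and sibling.name.endswith(".ef")
            for sibling in parquet.parent.iterdir()
        )
        if not has_index:
            missing.append(parquet)
    return missing


if __name__ == "__main__":
    # PROVENANCE_PATH value used by the deployment in this merge request.
    for path in parquet_files_missing_index("/srv/dataset/provenance"):
        print(f"no index found for {path}")
```

An empty result would only mean that every parquet file has some index file beside it, not that those indexes are complete or valid.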
Edited by Antoine R. Dumont
added 3 commits
- f52c9001...f4b8afa8 - 2 commits from branch production
- 9295e972 - staging/web: Expose provenance api through webapp