
Implement a local pull-through docker cache

Nicolas Dandrimont requested to merge mr/local-docker-registry into staging

The upstream docker registry image can only cache data from a single registry. We therefore define a set of upstream registries to cache data from, and configure a docker-cache StatefulSet for each of them. For each upstream registry, an httpPrefix is defined so that the ingress can route requests to the matching cache.

The pull-through cache can be configured in containerd as a mirror. containerd degrades gracefully if the mirror is unavailable, pulling images from the upstream registry instead.
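
As an illustration of the mirror setup described above, here is a minimal sketch of a containerd registry host file for docker.io. It assumes containerd's `hosts.toml` mechanism with a `config_path` directory such as `/etc/containerd/certs.d`; that path is an assumption about the node setup, not something this merge request configures. The host name and the `/docker.io/` prefix are taken from the rendered ingress and StatefulSet in the diff below. On RKE2 nodes the containerd configuration is generated from RKE2's own `registries.yaml`, so the equivalent settings would have to be expressed there instead.

```toml
# Hypothetical file: /etc/containerd/certs.d/docker.io/hosts.toml
# (assumes config_path = "/etc/containerd/certs.d" in the CRI registry config)

# Upstream registry, used as the fallback when the mirror cannot serve a request.
server = "https://registry-1.docker.io"

# Pull-through cache, tried first. The path matches the REGISTRY_HTTP_PREFIX of the
# dockercache-docker-io StatefulSet; containerd is expected to append /v2 itself.
[host."https://docker-cache.internal.admin.swh.network/docker.io"]
  capabilities = ["pull", "resolve"]
```

The other caches (ghcr.io, quay.io, registry.k8s.io) would presumably each get a similar file, with their own upstream `server` and prefix.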

make cc-helm-diff | colordiff
./helm-diff.sh cluster-components
[cluster-components] Comparing changes between branches production and mr/local-docker-registry...
Your branch is up to date with 'origin/production'.
[cluster-components] Generate config in production branch for cluster-components/values/admin-rke2.yaml...
[cluster-components] Generate config in production branch for cluster-components/values/archive-production-rke2.yaml...
[cluster-components] Generate config in production branch for cluster-components/values/archive-staging-rke2.yaml...
[cluster-components] Generate config in production branch for cluster-components/values/gitlab-production.yaml...
[cluster-components] Generate config in production branch for cluster-components/values/gitlab-staging.yaml...
[cluster-components] Generate config in production branch for cluster-components/values/minikube.yaml...
[cluster-components] Generate config in production branch for cluster-components/values/rancher.yaml...
[cluster-components] Generate config in production branch for cluster-components/values/test-staging-rke2.yaml...
Your branch is up to date with 'origin/mr/local-docker-registry'.
[cluster-components] Generate config in mr/local-docker-registry branch for cluster-components/values/admin-rke2.yaml...
[cluster-components] Generate config in mr/local-docker-registry branch for cluster-components/values/archive-production-rke2.yaml...
[cluster-components] Generate config in mr/local-docker-registry branch for cluster-components/values/archive-staging-rke2.yaml...
[cluster-components] Generate config in mr/local-docker-registry branch for cluster-components/values/gitlab-production.yaml...
[cluster-components] Generate config in mr/local-docker-registry branch for cluster-components/values/gitlab-staging.yaml...
[cluster-components] Generate config in mr/local-docker-registry branch for cluster-components/values/minikube.yaml...
[cluster-components] Generate config in mr/local-docker-registry branch for cluster-components/values/rancher.yaml...
[cluster-components] Generate config in mr/local-docker-registry branch for cluster-components/values/test-staging-rke2.yaml...


------------- diff for cluster-components/values/admin-rke2.yaml -------------

--- /tmp/swh-chart.cluster-components.LDf5DbxW/admin-rke2.yaml.before	2024-02-23 12:13:00.946480735 +0100
+++ /tmp/swh-chart.cluster-components.LDf5DbxW/admin-rke2.yaml.after	2024-02-23 12:13:01.330484729 +0100
@@ -1,11 +1,17 @@
 ---
+# Source: cluster-config/templates/docker-cache/namespace.yaml
+apiVersion: v1
+kind: Namespace
+metadata:
+  name: docker-cache
+---
 # Source: cluster-config/charts/blackboxExporter/templates/serviceaccount.yaml
 apiVersion: v1
 kind: ServiceAccount
 metadata:
   name: test-blackbox-exporter
   namespace: cattle-monitoring-system
   labels:
     helm.sh/chart: blackboxExporter-7.1.3
     app.kubernetes.io/name: blackbox-exporter
     app.kubernetes.io/instance: test
@@ -115,20 +121,100 @@
   name: alertmanager-irc-relay
   namespace: cattle-monitoring-system
 spec:
   selector:
     app: alertmanager-irc-relay
   ports:
     - port: 8000
       targetPort: 8000
       name: http
 ---
+# Source: cluster-config/templates/docker-cache/deployment.yaml
+kind: Service
+apiVersion: v1
+metadata:
+  name: dockercache-docker-io
+  namespace: docker-cache
+  labels:
+    app: dockercache-docker-io
+spec:
+  selector:
+    app: dockercache-docker-io
+  ports:
+    - name: http
+      port: 5000
+      targetPort: 5000
+---
+# Source: cluster-config/templates/docker-cache/deployment.yaml
+kind: Service
+apiVersion: v1
+metadata:
+  name: dockercache-ghcr-io
+  namespace: docker-cache
+  labels:
+    app: dockercache-ghcr-io
+spec:
+  selector:
+    app: dockercache-ghcr-io
+  ports:
+    - name: http
+      port: 5000
+      targetPort: 5000
+---
+# Source: cluster-config/templates/docker-cache/deployment.yaml
+kind: Service
+apiVersion: v1
+metadata:
+  name: dockercache-quay-io
+  namespace: docker-cache
+  labels:
+    app: dockercache-quay-io
+spec:
+  selector:
+    app: dockercache-quay-io
+  ports:
+    - name: http
+      port: 5000
+      targetPort: 5000
+---
+# Source: cluster-config/templates/docker-cache/deployment.yaml
+kind: Service
+apiVersion: v1
+metadata:
+  name: dockercache-registry-k-s-io
+  namespace: docker-cache
+  labels:
+    app: dockercache-registry-k-s-io
+spec:
+  selector:
+    app: dockercache-registry-k-s-io
+  ports:
+    - name: http
+      port: 5000
+      targetPort: 5000
+---
+# Source: cluster-config/templates/docker-cache/deployment.yaml
+kind: Service
+apiVersion: v1
+metadata:
+  name: dockercache-swh
+  namespace: docker-cache
+  labels:
+    app: dockercache-swh
+spec:
+  selector:
+    app: dockercache-swh
+  ports:
+    - name: http
+      port: 5000
+      targetPort: 5000
+---
 # Source: cluster-config/charts/blackboxExporter/templates/deployment.yaml
 apiVersion: apps/v1
 kind: Deployment
 metadata:
   name: test-blackbox-exporter
   namespace: cattle-monitoring-system
   labels:
     helm.sh/chart: blackboxExporter-7.1.3
     app.kubernetes.io/name: blackbox-exporter
     app.kubernetes.io/instance: test
@@ -255,20 +341,265 @@
               mountPath: /etc/ircrelay
       volumes:
         - name: configuration
           configMap:
             name: alertmanager-irc-relay
             defaultMode: 0660
             items:
               - key: "config"
                 path: "config.yml"
 ---
+# Source: cluster-config/templates/docker-cache/deployment.yaml
+apiVersion: apps/v1
+kind: StatefulSet
+metadata:
+  name: dockercache-docker-io
+  namespace: docker-cache
+spec:
+  replicas: 1
+  selector:
+    matchLabels:
+      app: dockercache-docker-io
+  template:
+    metadata:
+      labels:
+        app: dockercache-docker-io
+    spec:
+      priorityClassName: cluster-components-system
+      containers:
+        - name: docker-cache
+          image: registry:2.8.3
+          imagePullPolicy: IfNotPresent
+          env:
+            - name: REGISTRY_HTTP_ADDR
+              value: ":5000"
+            - name: REGISTRY_STORAGE_FILESYSTEM_ROOTDIRECTORY
+              value: "/var/lib/registry"
+            - name: REGISTRY_STORAGE_DELETE_ENABLED
+              value: "true"
+            - name: REGISTRY_HTTP_PREFIX
+              value: "/docker.io/"
+            - name: REGISTRY_PROXY_REMOTEURL
+              value: "https://registry-1.docker.io"
+          ports:
+            - name: http
+              containerPort: 5000
+          volumeMounts:
+            - name: image-store
+              mountPath: "/var/lib/registry"
+  volumeClaimTemplates:
+    - metadata:
+        name: image-store
+      spec:
+        accessModes:
+          - ReadWriteOnce
+        resources:
+          requests:
+            storage: 40Gi
+        storageClassName: ceph-rbd
+---
+# Source: cluster-config/templates/docker-cache/deployment.yaml
+apiVersion: apps/v1
+kind: StatefulSet
+metadata:
+  name: dockercache-ghcr-io
+  namespace: docker-cache
+spec:
+  replicas: 1
+  selector:
+    matchLabels:
+      app: dockercache-ghcr-io
+  template:
+    metadata:
+      labels:
+        app: dockercache-ghcr-io
+    spec:
+      priorityClassName: cluster-components-system
+      containers:
+        - name: docker-cache
+          image: registry:2.8.3
+          imagePullPolicy: IfNotPresent
+          env:
+            - name: REGISTRY_HTTP_ADDR
+              value: ":5000"
+            - name: REGISTRY_STORAGE_FILESYSTEM_ROOTDIRECTORY
+              value: "/var/lib/registry"
+            - name: REGISTRY_STORAGE_DELETE_ENABLED
+              value: "true"
+            - name: REGISTRY_HTTP_PREFIX
+              value: "/ghcr.io/"
+            - name: REGISTRY_PROXY_REMOTEURL
+              value: "https://ghcr.io"
+          ports:
+            - name: http
+              containerPort: 5000
+          volumeMounts:
+            - name: image-store
+              mountPath: "/var/lib/registry"
+  volumeClaimTemplates:
+    - metadata:
+        name: image-store
+      spec:
+        accessModes:
+          - ReadWriteOnce
+        resources:
+          requests:
+            storage: 10Gi
+        storageClassName: ceph-rbd
+---
+# Source: cluster-config/templates/docker-cache/deployment.yaml
+apiVersion: apps/v1
+kind: StatefulSet
+metadata:
+  name: dockercache-quay-io
+  namespace: docker-cache
+spec:
+  replicas: 1
+  selector:
+    matchLabels:
+      app: dockercache-quay-io
+  template:
+    metadata:
+      labels:
+        app: dockercache-quay-io
+    spec:
+      priorityClassName: cluster-components-system
+      containers:
+        - name: docker-cache
+          image: registry:2.8.3
+          imagePullPolicy: IfNotPresent
+          env:
+            - name: REGISTRY_HTTP_ADDR
+              value: ":5000"
+            - name: REGISTRY_STORAGE_FILESYSTEM_ROOTDIRECTORY
+              value: "/var/lib/registry"
+            - name: REGISTRY_STORAGE_DELETE_ENABLED
+              value: "true"
+            - name: REGISTRY_HTTP_PREFIX
+              value: "/quay.io/"
+            - name: REGISTRY_PROXY_REMOTEURL
+              value: "https://quay.io"
+          ports:
+            - name: http
+              containerPort: 5000
+          volumeMounts:
+            - name: image-store
+              mountPath: "/var/lib/registry"
+  volumeClaimTemplates:
+    - metadata:
+        name: image-store
+      spec:
+        accessModes:
+          - ReadWriteOnce
+        resources:
+          requests:
+            storage: 20Gi
+        storageClassName: ceph-rbd
+---
+# Source: cluster-config/templates/docker-cache/deployment.yaml
+apiVersion: apps/v1
+kind: StatefulSet
+metadata:
+  name: dockercache-registry-k-s-io
+  namespace: docker-cache
+spec:
+  replicas: 1
+  selector:
+    matchLabels:
+      app: dockercache-registry-k-s-io
+  template:
+    metadata:
+      labels:
+        app: dockercache-registry-k-s-io
+    spec:
+      priorityClassName: cluster-components-system
+      containers:
+        - name: docker-cache
+          image: registry:2.8.3
+          imagePullPolicy: IfNotPresent
+          env:
+            - name: REGISTRY_HTTP_ADDR
+              value: ":5000"
+            - name: REGISTRY_STORAGE_FILESYSTEM_ROOTDIRECTORY
+              value: "/var/lib/registry"
+            - name: REGISTRY_STORAGE_DELETE_ENABLED
+              value: "true"
+            - name: REGISTRY_HTTP_PREFIX
+              value: "/registry.k8s.io/"
+            - name: REGISTRY_PROXY_REMOTEURL
+              value: "https://registry.k8s.io"
+          ports:
+            - name: http
+              containerPort: 5000
+          volumeMounts:
+            - name: image-store
+              mountPath: "/var/lib/registry"
+  volumeClaimTemplates:
+    - metadata:
+        name: image-store
+      spec:
+        accessModes:
+          - ReadWriteOnce
+        resources:
+          requests:
+            storage: 10Gi
+        storageClassName: ceph-rbd
+---
+# Source: cluster-config/templates/docker-cache/deployment.yaml
+apiVersion: apps/v1
+kind: StatefulSet
+metadata:
+  name: dockercache-swh
+  namespace: docker-cache
+spec:
+  replicas: 1
+  selector:
+    matchLabels:
+      app: dockercache-swh
+  template:
+    metadata:
+      labels:
+        app: dockercache-swh
+    spec:
+      priorityClassName: cluster-components-system
+      containers:
+        - name: docker-cache
+          image: registry:2.8.3
+          imagePullPolicy: IfNotPresent
+          env:
+            - name: REGISTRY_HTTP_ADDR
+              value: ":5000"
+            - name: REGISTRY_STORAGE_FILESYSTEM_ROOTDIRECTORY
+              value: "/var/lib/registry"
+            - name: REGISTRY_STORAGE_DELETE_ENABLED
+              value: "true"
+            - name: REGISTRY_HTTP_PREFIX
+              value: "/swh/"
+            - name: REGISTRY_PROXY_REMOTEURL
+              value: "https://container-registry.softwareheritage.org"
+          ports:
+            - name: http
+              containerPort: 5000
+          volumeMounts:
+            - name: image-store
+              mountPath: "/var/lib/registry"
+  volumeClaimTemplates:
+    - metadata:
+        name: image-store
+      spec:
+        accessModes:
+          - ReadWriteOnce
+        resources:
+          requests:
+            storage: 40Gi
+        storageClassName: ceph-rbd
+---
 # Source: cluster-config/templates/alertmanager-irc-relay/ingress-status.yaml
 apiVersion: networking.k8s.io/v1
 kind: Ingress
 metadata:
   name: alertmanager-irc-relay-internal-ingress-status
   namespace: cattle-monitoring-system
   annotations:
     
     cert-manager.io/cluster-issuer: letsencrypt-production
     # see https://cert-manager.io/docs/usage/ingress/
@@ -349,20 +680,142 @@
           service:
             name: alertmanager-irc-relay
             port:
               name: http
   tls:
   - hosts:
     - alertmanager-irc-relay.admin.swh.network
     - alertmanager-irc-relay.internal.admin.swh.network
     secretName: alertmanager-irc-relay-crt
 ---
+# Source: cluster-config/templates/docker-cache/ingress.yaml
+apiVersion: networking.k8s.io/v1
+kind: Ingress
+metadata:
+  name: docker-cache-ingress
+  namespace: docker-cache
+  annotations:
+    nginx.ingress.kubernetes.io/proxy-body-size: "0"
+    nginx.ingress.kubernetes.io/proxy-read-timeout: "600"
+    nginx.ingress.kubernetes.io/proxy-send-timeout: "600"
+    kubernetes.io/tls-acme: "true"
+    nginx.ingress.kubernetes.io/force-ssl-redirect: "true"
+    cert-manager.io/cluster-issuer: letsencrypt-production
+spec:
+  ingressClassName: nginx
+  rules:
+    - host: docker-cache.admin.swh.network
+      http:
+        paths:
+          - path: "/docker.io/"
+            pathType: ImplementationSpecific
+            backend:
+              service:
+                name: dockercache-docker-io
+                port:
+                  name: http
+    - host: docker-cache.admin.swh.network
+      http:
+        paths:
+          - path: "/ghcr.io/"
+            pathType: ImplementationSpecific
+            backend:
+              service:
+                name: dockercache-ghcr-io
+                port:
+                  name: http
+    - host: docker-cache.admin.swh.network
+      http:
+        paths:
+          - path: "/quay.io/"
+            pathType: ImplementationSpecific
+            backend:
+              service:
+                name: dockercache-quay-io
+                port:
+                  name: http
+    - host: docker-cache.admin.swh.network
+      http:
+        paths:
+          - path: "/registry.k8s.io/"
+            pathType: ImplementationSpecific
+            backend:
+              service:
+                name: dockercache-registry-k-s-io
+                port:
+                  name: http
+    - host: docker-cache.admin.swh.network
+      http:
+        paths:
+          - path: "/swh/"
+            pathType: ImplementationSpecific
+            backend:
+              service:
+                name: dockercache-swh
+                port:
+                  name: http
+    - host: docker-cache.internal.admin.swh.network
+      http:
+        paths:
+          - path: "/docker.io/"
+            pathType: ImplementationSpecific
+            backend:
+              service:
+                name: dockercache-docker-io
+                port:
+                  name: http
+    - host: docker-cache.internal.admin.swh.network
+      http:
+        paths:
+          - path: "/ghcr.io/"
+            pathType: ImplementationSpecific
+            backend:
+              service:
+                name: dockercache-ghcr-io
+                port:
+                  name: http
+    - host: docker-cache.internal.admin.swh.network
+      http:
+        paths:
+          - path: "/quay.io/"
+            pathType: ImplementationSpecific
+            backend:
+              service:
+                name: dockercache-quay-io
+                port:
+                  name: http
+    - host: docker-cache.internal.admin.swh.network
+      http:
+        paths:
+          - path: "/registry.k8s.io/"
+            pathType: ImplementationSpecific
+            backend:
+              service:
+                name: dockercache-registry-k-s-io
+                port:
+                  name: http
+    - host: docker-cache.internal.admin.swh.network
+      http:
+        paths:
+          - path: "/swh/"
+            pathType: ImplementationSpecific
+            backend:
+              service:
+                name: dockercache-swh
+                port:
+                  name: http
+  tls:
+    - secretName: docker-cache-crt
+      hosts:
+      - docker-cache.admin.swh.network
+      - docker-cache.internal.admin.swh.network
+---
 # Source: cluster-config/templates/alertmanager-irc-relay/config.yaml
 # See https://gitlab.softwareheritage.org/swh/infra/ci-cd/3rdparty/alertmanager-irc-relay/-/tree/master
 # for more information
 ---
 # Source: cluster-config/templates/scrape-external-metrics/endpoints.yaml
 # This defines the external endpoints ips to connect to scrape metrics
 ---
 # Source: cluster-config/templates/scrape-external-metrics/service-monitor.yaml
 # This defines the service-monitor to monitor the service which scrapes external metrics
 # This may redefine some metrics, see the relabeling configuration dict key


------------- diff for cluster-components/values/archive-production-rke2.yaml -------------

No differences


------------- diff for cluster-components/values/archive-staging-rke2.yaml -------------

No differences


------------- diff for cluster-components/values/gitlab-production.yaml -------------

No differences


------------- diff for cluster-components/values/gitlab-staging.yaml -------------

No differences


------------- diff for cluster-components/values/minikube.yaml -------------

No differences


------------- diff for cluster-components/values/rancher.yaml -------------

No differences


------------- diff for cluster-components/values/test-staging-rke2.yaml -------------

No differences

Cache sizes guesstimated with:

olasd@pergamon:~$ sudo clush -qw @k8s-test-staging,@k8s-admin,@k8s-archive-production,@k8s-archive-staging /var/lib/rancher/rke2/bin/crictl -c /var/lib/rancher/rke2/agent/etc/crictl.yaml image list | awk -F': ' '{print $2}' | column -t | sort | uniq -c

Ref. swh/infra/sysadm-environment#5247 (closed)

TODO:

  • the default storageClassName is ceph-rbd; this needs to match the persistent storage configured on the admin-rke2 cluster (which doesn't seem to exist for now?)
Edited by Nicolas Dandrimont
