swh-apps: Bump swh-* images to use python:3.11-bookworm
This:
- updates the docker images to bookworm
- reworks the Dockerfile scaffolding so we can share a base image between our apps [0] [1]
- unifies the entrypoints again, using a more idiomatic shell construct for optional env variables
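The entrypoint change (flags passed only when the matching env variable is defined) can be sketched as follows. This is a minimal illustration, not the actual entrypoint code: the `build_args` helper, the `THREADS` and `TIMEOUT` variables and the `--threads` and `--timeout` flags are all hypothetical names.

```shell
#!/bin/sh
# Hypothetical helper: build a CLI argument list from optional env variables.
# THREADS/TIMEOUT and their flags are illustrative, not the real swh options.
build_args() {
    # ${VAR:+word} expands to word only when VAR is set and non-empty,
    # so an undefined variable contributes no flag at all.
    set -- ${THREADS:+--threads "$THREADS"} ${TIMEOUT:+--timeout "$TIMEOUT"}
    # Print the resulting arguments, separated by single spaces.
    printf '%s\n' "$*"
}
```

In a real entrypoint the same expansions would typically sit directly on the `exec` line of the service command, which avoids a chain of `if [ -n "$VAR" ]` blocks.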
From the review and pairing, some extra refactoring steps occurred too:
- Drop all the "deeply inspired" comments ;-)
- Reorder the content of the Dockerfiles to get better layer deduplication
- Drop librdkafka-dev
- Drop libcmph-dev (swh.perfecthash also bundles libcmph in its manylinux wheels)
- Drop the bzr package from swh-loader-bzr (we'll see whether it still works without it; at some point it was required for the loader to be functional)
- Promote sed and git to default dependencies in the base image
- Make the loader-savecodenow image inherit from the loader-package image
- Update swh-toolbox deps to use the postgresql 15 client
- Make swh-toolbox inherit from the loader-savecodenow image (adapted from suggestion)
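Since images now inherit from one another, a local build has to follow the inheritance order. The sketch below assumes that every image lives in `apps/swh-<name>` (as in the sample build [2]) and that parent images are resolved from the local image store; the `build_chain` function and the `DRY_RUN` switch are hypothetical helpers, not part of the repository.

```shell
#!/bin/sh
# Inheritance chain described above, parents first:
# swh-base -> loader-package -> loader-savecodenow -> toolbox.
# Assumption: each image is built from apps/swh-<name>.
CHAIN="base loader-package loader-savecodenow toolbox"

build_chain() {
    for app in $CHAIN; do
        # With DRY_RUN set, the command is printed instead of executed.
        ${DRY_RUN:+echo} env DOCKER_BUILDKIT=1 docker build \
            -t "swh-${app}:latest" "apps/swh-${app}" --build-arg REGISTRY= || return 1
    done
}
```

Running `DRY_RUN=1 build_chain` lists the four build commands in order without touching docker.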
Note that this now requires buildkit to be enabled (so we can use smarter commands like COPY --chmod=0644 ...).
Co-authored with @olasd and @vsellier
[0] There is currently a lot of duplication in the Dockerfile declarations.
[1] This should make the Dockerfiles easier to maintain and the overall build faster. It also paves the way for a proper separation between the docker build image and the docker run image, which would shrink the runtime images (currently a bit fat, around 1 GB) and in turn reduce the waiting time between deployments.
[2] Sample build (all images were tested repeatedly):
$ app=loader-svn; DOCKER_BUILDKIT=1 docker build -t "swh-${app}:latest" "apps/swh-${app}" --build-arg REGISTRY=
[+] Building 0.1s (13/13) FINISHED
=> [internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 964B 0.0s
=> [internal] load .dockerignore 0.0s
=> => transferring context: 2B 0.0s
=> [internal] load metadata for docker.io/library/swh-base:latest 0.0s
=> [internal] load metadata for docker.io/library/rsvndump-base:latest 0.0s
=> [rsvndump_image 1/1] FROM docker.io/library/rsvndump-base:latest 0.0s
=> [stage-1 1/6] FROM docker.io/library/swh-base:latest 0.0s
=> [internal] load build context 0.0s
=> => transferring context: 78B 0.0s
=> CACHED [stage-1 2/6] COPY --from=rsvndump_image 0.0s
=> CACHED [stage-1 3/6] RUN apt-get -y update && apt-get install -y subversion libsvn-dev && apt-get clean 0.0s
=> CACHED [stage-1 4/6] COPY --chmod=0644 requirements-frozen.txt /opt/swh 0.0s
=> CACHED [stage-1 5/6] RUN --mount=type=cache,target=/opt/swh/.cache uv pip sync requirements-frozen.txt 0.0s
=> CACHED [stage-1 6/6] COPY --chmod=0755 entrypoint.sh /opt/swh 0.0s
=> exporting to image 0.0s
=> => exporting layers 0.0s
=> => writing image sha256:be755312307ea94bf5db44a5d704997b789e60f479108707b7a1c360767b5616 0.0s
=> => naming to docker.io/library/swh-loader-svn:latest 0.0s
Activity
- Resolved by Antoine R. Dumont
It's probably a good opportunity to drop all the "deeply inspired" comments ;-)
This might be a good time to reorder the content of the Dockerfiles to get better layer deduplication as well. In practice, having a common base image that goes at least up to the pip install uv instruction would make sense? We can do another apt dance in the leaf images to install additional packages.
The following libraries should not be needed:
- librdkafka-dev (confluent-kafka-python has been shipping manylinux wheels embedding librdkafka for a long time: https://pypi.org/project/confluent-kafka/2.8.0/#files)
- libcmph-dev (swh.perfecthash also bundles libcmph in its manylinux wheels: https://pypi.org/project/swh.perfecthash/1.3.2/#files)
I think swh-loader-bzr doesn't need the bzr package (as we install breezy from pypi).
sed and git could be promoted to the common base image
loader-savecodenow should probably inherit from the loader-package image as they have many tools in common
swh-toolbox needs to be bumped to postgresql 15 client tools. It might benefit from inheriting from the loader-package image in terms of layer dedup?
- Resolved by Antoine R. Dumont
added 8 commits
- 11ee5b3a - swh-apps: Add a swh-base docker image
- 01f151ba - swh-web: Adapt image to reuse the swh-base docker image
- 5507ab2f - storage/entrypoint: Populate cli with flags only when env var is defined
- 1b1e5981 - storage: Adapt image to reuse the swh-base docker image
- 3f617d9c - objstorage/entrypoint: Populate cli with flags only when env var is defined
- daa2f971 - objstorage: Adapt image to reuse the swh-base docker image
- 7511a530 - web: Use base image
- e5f77baa - loader-svn: Adapt image to reuse the swh-base docker image
mentioned in merge request swh/infra/ci-cd/swh-jenkins-jobs!250 (merged)
added 9 commits
- 0096fc0f - swh-apps: Bump swh-* images to use python:3.11-bookworm
- c357730e - swh-apps: Add a swh-base docker image
- 7f2ae5cb - swh-web: Adapt image to reuse the swh-base docker image
- ee1aa6d6 - storage: Adapt image to reuse the swh-base docker image
- 87bdc851 - storage/entrypoint: Populate cli with flags only when env var is defined
- 464f3a4d - objstorage: Adapt image to reuse the swh-base docker image
- 0d19eaef - objstorage/entrypoint: Populate cli with flags only when env var is defined
- 37cd9f53 - loader-svn: Adapt image to reuse the swh-base docker image
- ded6c051 - cassandra-checks: Adapt image to reuse the swh-base docker image
added 9 commits
- ed394fd6 - loader-svn: Adapt image to reuse the swh-base docker image
- 8f6b2ade - cassandra-checks: Adapt image to reuse the swh-base docker image
- d11e6fa9 - add-forge-now: Adapt image to reuse the swh-base docker image
- a7422aa3 - alter: Adapt image to reuse the swh-base docker image
- 5e2099a6 - counters: Adapt image to reuse the swh-base docker image
- 2f3e642d - counters/entrypoint: Populate cli with flags only when env var is defined
- e7ee47c5 - deposit: Adapt image to reuse the swh-base docker image
- 923a28b3 - deposit/entrypoint: Populate cli with flags only when env var is defined
- cea86387 - deposit-checkers: Adapt image to reuse the swh-base docker image
added 5 commits
- 12f58cfb - graphql: Adapt image to reuse the swh-base docker image
- 97e2ae7e - graphql/entrypoint: Populate cli with flags only when env var is defined
- 558afde6 - indexer: Adapt image to reuse the swh-base docker image
- c90142ea - indexer-storage: Adapt image to reuse the swh-base docker image
- 5f89c931 - indexer-storage/entrypoint: Populate cli with flags only when env var is defined
added 35 commits
- 4c5436eb...8591a2c1 - 25 earlier commits
- 1a60da9f - loader-git: Adapt image to reuse the swh-base docker image
- 003a063f - loader-mercurial: Adapt image to reuse the swh-base docker image
- e507e923 - loader-metadata: Adapt image to reuse the swh-base docker image
- c584ce85 - loader-package: Adapt image to reuse the swh-base docker image
- 86aa54fb - loader-savecodenow: Adapt image to reuse the swh-loader-package image
- ead849a4 - objstorage-replayer: Adapt image to reuse the swh-base image
- 75758c5f - objstorage-winery: Adapt image to reuse the swh-base image
- 773a780f - objstorage-winery/entrypoint: Populate cli with flags only when env var is defined
- 64376eba - scheduler: Adapt image to reuse the swh-base image
- e9ec8c6a - scheduler/entrypoint: Populate cli with flags only when env var is defined
added 1 commit
- 9467daaf - wip (fail for now): graph: Adapt image to reuse the swh-base image
added 63 commits
- 9467daaf...d946377b - 17 commits from branch master
- d946377b...55f74def - 36 earlier commits
- 07ea1a23 - scrubber: Adapt image to reuse the swh-base image
- 3ce95fc4 - search: Adapt image to reuse the swh-base image
- dcfe3ede - search/entrypoint: Populate cli with flags only when env var is defined
- 3ec55a61 - storage-replayer: Adapt image to reuse the swh-base image
- 47775f1c - toolbox: Adapt image to reuse the swh-base image
- f52244c6 - graph: Adapt image to reuse the swh-base image
- e28dc221 - vault: Adapt image to reuse the swh-base image
- c6441be8 - vault/entrypoint: Populate cli with flags only when env var is defined
- 69d5b322 - vault-cookers: Adapt image to reuse the swh-base image
- 050e9985 - webhooks: Adapt image to reuse the swh-base image