Make clean docker image routine more efficient

This makes the cleanup more efficient with regard to how we currently build some specific images, which kept accumulating until we ran out of disk space on thyssen.

That's a simpler workaround than what's discussed in issues [1] [2]. We keep the images from the last 2 days so the next build still benefits from them, and we drop the rest so they no longer eat up disk space [3]. As this job is triggered daily, that should keep those images in check.

Thoughts?

[3] /usr/local/bin/clean-docker-images.sh is present on thyssen now:

root@thyssen:~# ls -lah /usr/local/bin/clean-docker-images.sh
-rwxr-xr-x 1 root root 994 May 10 08:15 /usr/local/bin/clean-docker-images.sh
root@thyssen:~# cat /usr/local/bin/clean-docker-images.sh
#!/usr/bin/env bash

##
# File managed by puppet (class profile::jenkins::server), changes will be lost.
##

set -x

# To avoid timezone shift shenanigans (when triggered around midnight)
today=$(date --date '13:00' +%Y%m%d)
yesterday=$(date --date 'yesterday 13:00' +%Y%m%d)

# Drop specific softwareheritage docker images (which accumulate over time)
# except for those of the last 2 days. Keeping only today's images could drop
# everything when triggered around midnight, so stay safe and keep 2 days.
# We also keep the latest tag.
images_to_drop=$(docker image ls \
    | grep -E "softwareheritage/(base|web|replayer)" \
    | grep -v $today \
    | grep -v $yesterday \
    | grep -v "latest")

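# `docker rmi` complains when called without arguments, so skip it when there
# is nothing to drop. The variable is quoted so that each image stays on its
# own line for awk (unquoted, all lines would be joined into a single one).
if [ -n "$images_to_drop" ]; then
    echo "$images_to_drop" \
    | awk '{print $1":"$2}' \
    | xargs docker rmi
fi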

# Finally prune dangling refs.
docker system prune --filter 'label!=keep' --volumes --force
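
To make the selection logic above easier to check without touching a real docker daemon, here is a minimal sketch of the same filter applied to a made-up listing. The function name `select_images_to_drop`, the sample listing and the dates in it are assumptions for illustration only; they are not part of the deployed script. The listing is streamed through the pipe line by line, so awk sees one image per line.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not in the deployed script): reads `docker image ls`-style
# lines on stdin and prints "repository:tag" for each image that would be dropped.
# $1 and $2 are the two date stamps (YYYYMMDD) whose images must be kept.
select_images_to_drop() {
    local today="$1" yesterday="$2"
    grep -E "softwareheritage/(base|web|replayer)" \
        | grep -v -e "$today" -e "$yesterday" -e "latest" \
        | awk '{print $1":"$2}'
}

# Made-up sample listing, for illustration only (not real output from thyssen).
sample_listing='REPOSITORY                TAG        IMAGE ID       CREATED        SIZE
softwareheritage/base     20240510   aaaaaaaaaaaa   2 hours ago    1.2GB
softwareheritage/base     20240509   bbbbbbbbbbbb   26 hours ago   1.2GB
softwareheritage/base     20240508   cccccccccccc   2 days ago     1.2GB
softwareheritage/web      20240507   dddddddddddd   3 days ago     900MB
softwareheritage/web      latest     eeeeeeeeeeee   2 hours ago    900MB
postgres                  16         ffffffffffff   3 weeks ago    430MB'

# Prints one "repository:tag" per line for the images older than the 2 kept days.
printf '%s\n' "$sample_listing" | select_images_to_drop 20240510 20240509
```

With this sample, only the `softwareheritage/base` and `softwareheritage/web` images tagged before the two kept days are selected; the `latest` tag and the unrelated `postgres` image are left alone. As in the script, the `grep -v` filters match anywhere in the line, not only in the TAG column.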

Refs. swh/infra/sysadm-environment#4846 (closed)

[1] swh/infra/sysadm-environment#4848

[2] swh/infra/sysadm-environment#4846 (closed)

Edited by Antoine R. Dumont