[proposal] apps: Make docker images install their dependencies faster
tl;dr
Use uv instead of pip to install python deps.
Here is a summary comparing build times for the current state, which installs through pip, against the same builds with uv. No intermediate docker cache is used between calls, so the comparison is valid.
|-------------------------+------+------+-------------+---------------------|
| App | pip | uv | uv gain (s) | so uv is... |
|-------------------------+------+------+-------------+---------------------|
| graphql | 33s | 26s | -7 | faster |
| graph | 6:16 | 6:12 | -4 | slightly faster |
| deposit | 49s | 28s | -21 | faster |
| loader-savecodenow | 1:38 | 55s | -43 | faster |
| loader-bzr | 44s | 34s | -10 | faster |
| loader-svn | 58s | 47s | -11 | faster |
| loader-package | 57s | 51s | -6 | faster |
| toolbox | 2:23 | 1:45 | -38 | faster |
...
Note: Measured with /usr/bin/time over multiple calls to docker build --no-cache,
before [1] and after [2] the changes.
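The "uv gain (s)" column can be rechecked with a small helper. This is a minimal sketch, assuming durations are written either as `33s` or as `6:16` (minutes:seconds) like in the table; the function names `to_seconds` and `gain` are hypothetical and not part of the repository.

```shell
#!/bin/sh
# Convert a duration such as "33s" or "6:16" (minutes:seconds) to seconds.
to_seconds() {
    case "$1" in
        *:*)
            min=${1%%:*}
            sec=${1##*:}
            # Strip one leading zero so "08" is not parsed as an invalid octal number.
            min=${min#0}
            sec=${sec#0}
            echo $(( ${min:-0} * 60 + ${sec:-0} ))
            ;;
        *)
            # Plain seconds: drop the trailing "s" if present.
            echo "${1%s}"
            ;;
    esac
}

# Print the uv time minus the pip time, in seconds (negative means uv is faster).
gain() {
    echo $(( $(to_seconds "$2") - $(to_seconds "$1") ))
}

gain 33s 26s    # graphql
gain 6:16 6:12  # graph
gain 1:38 55s   # loader-savecodenow
gain 2:23 1:45  # toolbox
```

Run against the rows above, this prints -7, -4, -43 and -38, matching the table.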
Description
In every build measured above, it's faster to use uv than pip for dependency resolution and installation from the requirements-frozen.txt.
We already use uv here to generate the requirements-frozen.txt [0].
The gain can be a bit smaller for docker images which rely less on python deps (e.g. graph), but the build is still slightly faster.
It's way faster for applications with lots of python dependencies (e.g. frontend or multi-loader images) [1] [2]
The main adaptation required compared to the previous version is that the uv cli needs a virtualenv to work with.
In any case, if we generalize this, we can probably gain some minutes (> 5min) on the overall image builds (especially from the bottom modules like swh-storage, swh-objstorage, swh-core, swh-scheduler, swh-model, etc.).
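To illustrate the virtualenv adaptation, here is a rough sketch of what an install step with uv could look like. The path `/opt/venv`, the `VENV`, `REQS` and `DRY_RUN` variables, and the `run` and `install_with_uv` helpers are all assumptions made for illustration; the actual Dockerfiles in this change may be organized differently. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Hypothetical locations, not taken from the actual Dockerfiles.
VENV=${VENV:-/opt/venv}
REQS=${REQS:-requirements-frozen.txt}

# Print the command when DRY_RUN=1 (the default here), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

install_with_uv() {
    # Create the virtualenv that uv installs into.
    run uv venv "$VENV"
    # Install the frozen requirements into that virtualenv.
    run uv pip install --python "$VENV/bin/python" -r "$REQS"
}

install_with_uv
```

Setting `DRY_RUN=0` executes the commands for real, which assumes uv is already available in the image.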
pros/cons
|------+--------------------+-----------------|
| What | Pros | Cons |
|------+--------------------+-----------------|
| pip | standard | slow |
| | no need for a venv | |
|------+--------------------+-----------------|
| uv | fast | requires a venv |
|------+--------------------+-----------------|
Review suggestion
It's best to review this commit by commit.
[1] with pip (multiple calls)
$ /usr/bin/time docker build --no-cache -t swh-graphql apps/swh-graphql
...
Successfully tagged swh-graphql:latest
0.02user 0.06system 0:33.23elapsed 0%CPU (0avgtext+0avgdata 41112maxresident)k
0inputs+0outputs (0major+2522minor)pagefaults 0swaps
Successfully tagged swh-graphql:latest
0.05user 0.02system 0:32.30elapsed 0%CPU (0avgtext+0avgdata 40768maxresident)k
0inputs+0outputs (0major+2583minor)pagefaults 0swaps
Successfully tagged swh-graphql:latest
0.03user 0.04system 0:32.61elapsed 0%CPU (0avgtext+0avgdata 38588maxresident)k
0inputs+0outputs (0major+2562minor)pagefaults 0swaps
[2] with uv (multiple calls)
$ /usr/bin/time docker build --no-cache -t swh-graphql-with-uv apps/swh-graphql
...
Successfully tagged swh-graphql-with-uv:latest
0.01user 0.03system 0:26.56elapsed 0%CPU (0avgtext+0avgdata 38856maxresident)k
0inputs+0outputs (0major+2570minor)pagefaults 0swaps
Successfully tagged swh-graphql-with-uv:latest
0.03user 0.01system 0:26.87elapsed 0%CPU (0avgtext+0avgdata 39852maxresident)k
0inputs+0outputs (0major+2471minor)pagefaults 0swaps
Successfully tagged swh-graphql-with-uv:latest
0.02user 0.02system 0:25.86elapsed 0%CPU (0avgtext+0avgdata 39524maxresident)k
0inputs+0outputs (0major+2268minor)pagefaults 0swaps
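The elapsed times in these logs can be averaged with a short awk wrapper. This is a minimal sketch, assuming the default GNU /usr/bin/time output format shown above; `mean_elapsed` is a hypothetical helper name.

```shell
#!/bin/sh
# Read /usr/bin/time output on stdin and print the mean elapsed time in seconds.
mean_elapsed() {
    LC_ALL=C awk '
        {
            for (i = 1; i <= NF; i++) {
                if ($i ~ /elapsed$/) {
                    t = $i
                    sub(/elapsed$/, "", t)
                    # Fields look like "0:26.56" (minutes:seconds).
                    n = split(t, parts, ":")
                    secs = 0
                    for (j = 1; j <= n; j++) secs = secs * 60 + parts[j]
                    total += secs
                    count++
                }
            }
        }
        END { if (count) printf "%.2f\n", total / count }
    '
}

# The three uv runs from [2].
mean_elapsed <<'EOF'
0.01user 0.03system 0:26.56elapsed 0%CPU (0avgtext+0avgdata 38856maxresident)k
0.03user 0.01system 0:26.87elapsed 0%CPU (0avgtext+0avgdata 39852maxresident)k
0.02user 0.02system 0:25.86elapsed 0%CPU (0avgtext+0avgdata 39524maxresident)k
EOF
```

For the uv runs this prints 26.43; the same helper applied to the pip runs in [1] gives 32.71.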
Related to [0] !46 (merged)