[draft] Replace setuptools
setuptools is the de facto standard build system for Python packages, but it has a few drawbacks. In particular, it prevents installing swh packages from sources stored on read-only storage, which is a common scenario when using Docker for testing and development.
Thanks to PEP 517 (and related PEPs), it is now much easier to develop build tools for Python packages, and a number of projects implementing these PEPs have been created:
- hatch
- PDM
- Poetry
- flit
See this page for a reasonably up-to-date picture of the Python packaging/build ecosystem.
The main issue with all these new packaging systems is that their build backends generally do not really support building extensions. There are many long discussions on the subject (see this one and its follow-up).
extensions
Most of our packages are pure Python, but a few do build extensions or other compiled artifacts:
- swh.loader.svn: builds a fast_crawler extension (single cpp file, with a pyi stub)
- swh.loader.cvs: builds a rcsparse extension (a few c/h files, plus a pyi stub)
- swh.graph: pretty complex Java + Rust build process, but these are NOT Python extensions (a jar is included in the Python package to make it possible to start the gRPC server from the Python package, and the Rust part consists of a series of binary tools: compress, gRPC server, etc.)
- swh-search uses tree_sitter to generate C files from JSON declarations, which are then compiled (either at install time or on the fly). Note that tree-sitter is a JS tool, installed via yarn by default.
- swh-perfecthash also ships an extension (cffi wrapper)
It seems PDM can build extensions (using setuptools as the backend), and hatch's main author started a plugin a while ago, but it seems to be more of a POC than an actual stable tool. It is also possible to use simple build hooks to build these simple extensions (as long as we do not expect them to build on any platform other than Debian/Linux/x86, probably).
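As a rough illustration of the "simple build hook" idea, the sketch below only assembles a compiler command line whose output lands in a separate build directory, so the source tree is never written to. It is not tied to any particular backend: the helper name `extension_build_command`, the `c++` compiler name and the Linux-style flags are assumptions made for the example, and a real hook would still have to run the command and hand the resulting file over to the backend.

```python
import sysconfig
from pathlib import Path


def extension_build_command(name, sources, build_dir, compiler="c++"):
    """Return (command, output_path) to compile `sources` into a CPython
    extension module called `name`, written under `build_dir`.

    Hypothetical helper: Linux-style flags only, no error handling.
    """
    # Filename suffix expected by the running interpreter for extension modules.
    suffix = sysconfig.get_config_var("EXT_SUFFIX")
    # Directory holding Python.h for the running interpreter.
    include_dir = sysconfig.get_paths()["include"]
    # The output goes to the build directory, never next to the sources.
    output = Path(build_dir) / f"{name}{suffix}"
    command = [
        compiler, "-shared", "-fPIC", "-O2",
        f"-I{include_dir}",
        *[str(source) for source in sources],
        "-o", str(output),
    ]
    return command, output
```

A hook could pass the returned command to `subprocess.run(command, check=True)` after creating `build_dir`, then include the resulting file in the wheel. The sources may then live on read-only storage, since only `build_dir` needs to be writable.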
There are also a few hatch plugins, especially hatch-cython, which could be used for the loader extensions (this requires "porting" them to Cython, which is not a big deal).
However, these setuptools-backed extension builders share one issue: they build the extension in the source tree, so they still need write access to it.
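One way to check whether a given backend really writes into the source tree is to snapshot the tree before and after a build and compare the two. The following is a minimal sketch using only the standard library; the function names `snapshot` and `changed_paths` are made up for this example, and the build step itself is left out.

```python
import os
from pathlib import Path


def snapshot(root):
    """Map each file under `root` (as a relative path) to (size, mtime_ns)."""
    root = Path(root)
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = Path(dirpath) / filename
            # lstat() so that broken symlinks do not raise.
            stat = path.lstat()
            state[str(path.relative_to(root))] = (stat.st_size, stat.st_mtime_ns)
    return state


def changed_paths(before, after):
    """Return the sorted relative paths that were added, removed or modified."""
    return sorted(
        path
        for path in before.keys() | after.keys()
        if before.get(path) != after.get(path)
    )
```

Taking `before = snapshot(src)`, running the build, then calling `changed_paths(before, snapshot(src))` lists every file the build touched in the source tree. An empty list suggests the backend would also work with read-only sources.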
documentation
The main issue with hatch is that its documentation is hard to follow and not very beginner-friendly (fortunately, this issue is getting attention).