
Draft: Add a learning/replaying proxy for loaders and listers

To make our test suite more reproducible, we configure a proxy for the loaders and listers. It can work in two modes: learning and replaying. In learning mode, HTTP(S) requests are forwarded to their corresponding servers and each exchange is recorded. In replaying mode, the recording is used to replay the same exchanges (while denying any other requests), without reaching out to the wider Internet.
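The two modes described above can be illustrated with a minimal in-memory sketch. It is not how mitmproxy implements recording; the names `RecordReplayProxy`, `UnrecordedRequest`, and the `upstream` callable are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple

Key = Tuple[str, str]  # (HTTP method, URL)


class UnrecordedRequest(LookupError):
    """Raised in replay mode when a request was never recorded."""


@dataclass
class RecordReplayProxy:
    # Hypothetical illustration only: "learn" forwards and records,
    # "replay" serves from the record and denies everything else.
    mode: str
    upstream: Callable[[str, str], bytes]
    record: Dict[Key, bytes] = field(default_factory=dict)

    def handle(self, method: str, url: str) -> bytes:
        key = (method, url)
        if self.mode == "learn":
            # Forward to the real server and remember the exchange.
            body = self.upstream(method, url)
            self.record[key] = body
            return body
        if self.mode == "replay":
            # Never touch the network: answer from the record or refuse.
            if key not in self.record:
                raise UnrecordedRequest(f"{method} {url}")
            return self.record[key]
        raise ValueError(f"unknown mode: {self.mode!r}")
```

In replay mode the `upstream` callable is never invoked, which is the property that keeps a test run independent of the network.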

The idea is that the test suite should be run on Jenkins only in replay mode, so it won’t be affected by network or service issues.

The implementation uses mitmproxy, run in a new proxy Docker service. mitmproxy can intercept HTTPS requests and generate new X.509 certificates on the fly. The certificate of the authority that signs these generated certificates is installed as trusted in our swh/stack image.

Network records are written to assets/mitmproxy/swh-tests. A placeholder file is added so that this otherwise empty directory exists in the repository.

Usage of the proxy is fully contained in docker-compose.proxy.yml, which is meant to be used as an override. The proxy mode is changed by switching the command stanza from learn to replay and vice versa.

Variables in env/proxy.env are set in the swh-loader and swh-lister containers. Internal hosts should be added to the no_proxy variable.
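A quick way to check that internal hosts are covered by no_proxy is sketched below. The matching rule is a simplified assumption (exact match or domain-suffix match, with `*` meaning bypass everything); real HTTP clients differ in the details. The function names and the host names in the usage example are hypothetical.

```python
from typing import Iterable, List


def bypasses_proxy(host: str, no_proxy: str) -> bool:
    """Return True if `host` matches an entry of a comma-separated no_proxy value.

    Simplified rule (an assumption, not any specific client's behaviour):
    an entry matches the host itself or any subdomain of it.
    """
    host = host.lower().rstrip(".")
    for entry in no_proxy.split(","):
        entry = entry.strip().lower().lstrip(".")
        if not entry:
            continue
        if entry == "*" or host == entry or host.endswith("." + entry):
            return True
    return False


def missing_from_no_proxy(internal_hosts: Iterable[str], no_proxy: str) -> List[str]:
    """List the internal hosts that would still be sent through the proxy."""
    return [h for h in internal_hosts if not bypasses_proxy(h, no_proxy)]
```

For example, `missing_from_no_proxy(["swh-storage", "swh-scheduler"], "localhost,swh-storage")` reports `swh-scheduler` as a host that still needs to be added.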

We also have to set PYTHONHASHSEED to a fixed value in order to (at least) have swh-loader-git send the same network request every time.
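The effect of a fixed PYTHONHASHSEED can be observed with the sketch below. It assumes, without having confirmed it in swh-loader-git, that the varying requests come from iterating over sets of strings, whose order depends on the per-process hash seed. The ref names and helper functions are hypothetical.

```python
import os
import subprocess
import sys

# Child program: print a set of strings in iteration order.
# The ref names are made up for this illustration.
SNIPPET = "print(list({'refs/heads/main', 'refs/heads/dev', 'refs/tags/v1.0'}))"


def set_order(seed: str) -> str:
    """Run SNIPPET in a fresh interpreter with the given PYTHONHASHSEED."""
    env = dict(os.environ, PYTHONHASHSEED=seed)
    result = subprocess.run(
        [sys.executable, "-c", SNIPPET],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def is_stable(seed: str, runs: int = 5) -> bool:
    """True if every run with this seed prints the same iteration order."""
    return len({set_order(seed) for _ in range(runs)}) == 1
```

With a fixed integer seed such as `"0"`, `is_stable` returns True. With `"random"`, the order may change from one run to the next, which would make recorded and replayed requests diverge.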
