- Jan 26, 2024
-
-
Antoine Lambert authored
Align with production settings and enable vault workers to fetch content bytes directly from the objstorage to speed up cooking.
-
- Jan 22, 2024
-
-
David Douard authored
-
David Douard authored
Even when the docker_compose fixture itself fails.
-
David Douard authored
Use a file of installed packages generated at image build time when no Python package is installed from source. This call to 'pip install' at every service startup can slow the startup time a bit.
-
David Douard authored
Having the swh-web-cron service exit early makes the test fail because the docker_compose fixture now waits for the services to be started.
-
- Jan 19, 2024
-
-
David Douard authored
Make it tell which origins are loaded on the fly, with a duration.
-
David Douard authored
Replace the retry proxy by a tenacious one, also add a buffer proxy in the pipeline (should help, maybe).
-
David Douard authored
Instead of scaling up the loaders if needed in the origins fixture. The issue with that approach was the time needed by the newly spawned loaders to be ready to accept tasks from Celery: there is a good chance the first loader takes (prefetches) them all, thus falling back to a one-loader-loads-all-origins situation.
-
David Douard authored
Replace the retry proxy by a tenacious one and move the filter one to the entry of the pipeline. It should mitigate spurious hash collision errors when loading several origins in parallel (which happen when there is more than one origin to load, now that the origins() fixture scales the swh-loader service up if needed).
-
- Jan 18, 2024
-
-
David Douard authored
-
David Douard authored
-
- Jan 16, 2024
-
-
David Douard authored
This makes it easy to query Elasticsearch, which can be useful for tests.
-
- Jan 15, 2024
-
-
Jérémy Bobbio (Lunar) authored
Removing an object from Kafka requires writing a new message with the same key as previously used and an empty value. These tombstones later get “compacted” depending on the topic settings `cleanup.policy`, `max.compaction.lag.ms` and `delete.retention.ms`. To test the presence or absence of objects in Kafka, we thus need to find which is the most recent: a tombstone or a value. To do so, we parse all messages into a single dict, associating each SWHID with the latest message timestamp and whether it should be considered present or absent. This is sadly a bit time- and memory-consuming, but at least we get accurate results. While not strictly necessary, we now use a topic configuration in Kafka that aggressively tries to remove “dead” messages. It should slightly reduce the time needed to inventory objects as described above. We use the match syntax introduced in Python 3.10 in `handle_message()`, so we bump black compatibility settings to Python 3.11. Depends on swh/devel/swh-alter!7 (and a new release thereafter).
-
David Douard authored
It prevents a 301 back and forth on each HTTP query.
-
David Douard authored
It still seems broken, although it replies to HEAD queries...
-
David Douard authored
Starting a service can be pretty slow, waiting for dependencies etc. Double the retries to 6 (the default being 3) in all services with a healthcheck defined with default retries.
-
David Douard authored
Otherwise the mimetype and fossology indexers are executed, making tests very long to execute.
-
David Douard authored
It's not used any more and is about to be removed from swh-indexer.
-
- Jan 11, 2024
-
-
Antoine Lambert authored
-
- Jan 10, 2024
-
-
David Douard authored
-
David Douard authored
-
David Douard authored
-
David Douard authored
This adds a new test scenario for the Cassandra backend. It includes: - add a journal_writer in the storage_cassandra.yml config file, - add a simple test in which the Cassandra backend is used in a simple git loading scenario.
-
David Douard authored
- use a dedicated group_id for the replayer (to prevent confusion), - use a dedicated Cassandra storage configuration to prevent alter tests from being subject to side effects of modifications in the main Cassandra storage configuration, - simplify the alter_host fixture a bit, - improve the way the verified_origins fixture checks for the replayer to be done (ask Kafka about consumer lag).
-
David Douard authored
Make it possible to declare one origin to be loaded with multiple URLs and choose the first accessible one as the URL for the test. This can prevent the tests from failing due to a transient downtime of an origin. Use this to make the mercurial origin used in test_mirror and test_graphql more robust (sr.ht being currently down).
-
David Douard authored
It can happen that the `docker kill` command fails because one of the listed containers is already dead. We don't want this situation to be an error.
-
David Douard authored
-
- Jan 09, 2024
-
-
David Douard authored
Instead of creating a dedicated container, use a shell on the running swh-scheduler or swh-web services. This should slightly speed up tests (it avoids running all the service initialization scaffolding for these).
-
Antoine Lambert authored
Use an exit handler to guarantee that the current compose session is torn down when a keyboard interruption or unhandled exception occurs while running tests.
-
Antoine Lambert authored
If multiple test suites fail, the code attempts to create the log directory multiple times. But as we use tmp_path_factory.mktemp with the numbered parameter set to False (to ensure all logs are dumped in the same directory), an exception is raised when a second test suite fails, as the log directory already exists. So check whether the log directory exists before creating it.
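The check could look roughly like the sketch below. The function name `get_log_dir()` and its `name` argument are hypothetical; `tmp_path_factory` is pytest's fixture, whose `getbasetemp()` and `mktemp(basename, numbered=...)` methods are used here.

```python
from pathlib import Path


def get_log_dir(tmp_path_factory, name: str) -> Path:
    """Return the shared log directory, creating it only on first use."""
    log_dir = tmp_path_factory.getbasetemp() / name
    if not log_dir.exists():
        # With numbered=False, mktemp() fails if the directory already exists,
        # hence the existence check above.
        log_dir = tmp_path_factory.mktemp(name, numbered=False)
    return log_dir
```

Calling it from several failing test suites then yields the same directory each time instead of raising.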
-
- Jan 03, 2024
-
-
Antoine Lambert authored
-
- Dec 31, 2023
-
-
vlorentz authored
-
- Dec 18, 2023
-
-
Franck Bret authored
-
Antoine Lambert authored
The WORKER_INSTANCES environment variable was not set for the swh-scheduler-schedule-recurrent service, which prevented its proper initialization.
-
- Dec 11, 2023
-
-
Jérémy Bobbio (Lunar) authored
With e293873, compose logs are dumped in case of errors during the test. While this is nice, the logs are currently written to the `docker/tests/logs` directory, in files with a random identifier. This means they require manual cleaning and it is hard to know which one is the most recent. This problem can be solved by using the [tmp_path_factory](https://docs.pytest.org/en/7.1.x/reference/reference.html#id47) fixture, which allows us to create a subdirectory in pytest's base temporary directory. The logs are now accessible through the managed `-current` symlink, as follows: `/tmp/pytest-of-lunar/pytest-current/docker/test_alter.logs`
-
- Nov 30, 2023
-
-
Jérémy Bobbio (Lunar) authored
To test the ability of swh-alter to remove objects from multiple storages, we need to set up a dedicated environment with, in addition to the common PostgreSQL storage, a Cassandra storage fed by a replayer. swh-alter also needs to get information from swh-graph, but the compression is currently too fragile for us to rely on the dedicated container. Instead, we implement in `alter_companion.py` a trivial mock HTTP server that always returns 404, making swh-alter fall back on information from storage. This server is run in a dedicated swh-alter image. This image is also used to run commands from the same `alter_companion.py` that query the PostgreSQL and Cassandra databases to check the absence (or the presence) of a set of SWHIDs. We can then implement a simple integration scenario where we load two origins, check their presence in the storages, remove one of them, check its absence in the storages, restore the recovery bundle, and check the renewed presence in the storages. This will later be expanded to check for removal from objstorage(s), Kafka, and Elasticsearch.
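A mock server of that kind can be sketched with the standard library alone. This is an illustration only, not the contents of `alter_companion.py`: the names `AlwaysNotFound` and `make_server()` are hypothetical, and the set of HTTP methods handled is an assumption.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


class AlwaysNotFound(BaseHTTPRequestHandler):
    """Answer every request with an empty 404 response."""

    def _not_found(self):
        self.send_response(404)
        self.send_header("Content-Length", "0")
        self.end_headers()

    # Assumed set of methods; every one of them gets the same 404 answer.
    do_GET = do_HEAD = do_POST = _not_found

    def log_message(self, format, *args):
        pass  # keep test output quiet


def make_server(host: str = "127.0.0.1", port: int = 0) -> HTTPServer:
    """Build the server; port 0 lets the OS pick a free port."""
    return HTTPServer((host, port), AlwaysNotFound)


if __name__ == "__main__":
    server = make_server(port=8080)  # arbitrary port chosen for the example
    server.serve_forever()
```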
-
Jérémy Bobbio (Lunar) authored
It used to crash if there was a `/src` but no directories starting with `swh-` in it.
-
- Nov 29, 2023
-
-
Antoine Lambert authored
Stricter checks have been introduced regarding input data validation in swh-deposit v2.0.
-
Antoine Lambert authored
Align with production settings, since hash collisions can otherwise happen when multiple replayer workers are executed in parallel.
-
- Nov 22, 2023
-
-
Antoine Lambert authored
As a first experiment using webhooks in Software Heritage, add optional services that update a Save Code Now request status by sending push notifications through webhook messages to the webapp. That feature can be used by initializing the docker environment with the following command: `docker compose -f docker-compose.yml -f docker-compose.webhooks.yml up -d`. In that configuration, the service pulling Save Code Now requests to update their status is disabled and a new journal client service is used to forward origin visit status events through webhook messages. Webhook management relies on Svix, an open source framework which offers webhook sending as a service, and on the new swh-webhooks package, which interacts with the Svix server through its REST API. Related to swh/meta#4836.
-