- Mar 11, 2025
-
-
Antoine R. Dumont authored
Messages are now sent to rabbitmq first, then to postgresql. In the nominal case where all writes succeed, this changes nothing compared to the previous implementation (postgresql first, then rabbitmq). In degraded conditions, though, that is supposedly better:
1. If we cannot write to rabbitmq, we won't write to postgresql either: the function raises and stops.
2. If we can write to rabbitmq, the messages will be consumed independently of this. If we then cannot write to postgresql (for some reason), we only lose the information that the task was already sent. The same task will therefore be rescheduled and we'll have another go at it.
As those kinds of tasks are supposed to be idempotent, that should not be a major issue for their upstream. Also, those tasks are mostly listers now and they have state management of their own, so the reruns should mostly be no-ops (if the ingestion from the previous run went fine). Edge-case scenarios such as a site being down will behave as before. Refs. swh/infra/sysadm-environment#5512
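The write ordering described in this commit message can be sketched as follows. This is only an illustration under stated assumptions, not the scheduler's actual code: `send_then_record`, `publish` and `record` are hypothetical names standing in for the rabbitmq and postgresql writes, and the batching (publish one by one, record once) is assumed.

```python
from typing import Callable, Iterable, List


def send_then_record(
    tasks: Iterable[dict],
    publish: Callable[[dict], None],
    record: Callable[[List[dict]], None],
) -> List[dict]:
    """Publish tasks to the broker first, then record them in the database.

    `publish` and `record` are hypothetical stand-ins for the rabbitmq and
    postgresql writes; they are not part of any real API.
    """
    sent: List[dict] = []
    for task in tasks:
        # Case 1: if publishing raises, we stop here and nothing is recorded.
        publish(task)
        sent.append(task)
    # Case 2: the messages are already queued. If recording raises, only the
    # "already sent" bookkeeping is lost, so the tasks get rescheduled later.
    record(sent)
    return sent
```

With this ordering, a database failure can lead to a task being sent twice, which is why the tasks need to be idempotent.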
-
Antoine R. Dumont authored
To make its current behavior explicit. Refs. swh/infra/sysadm-environment#5512
-
- Mar 05, 2025
-
-
vlorentz authored
-
- Feb 25, 2025
-
-
Antoine Lambert authored
-
- Feb 17, 2025
-
-
Antoine Lambert authored
-
Antoine Lambert authored
-
Antoine Lambert authored
Bump development tools: mypy, codespell, isort, ... Move all tools configuration in pyproject.toml. Remove no longer needed mypy overrides.
-
- Feb 10, 2025
-
-
In the CSV file consumed by the schedule command, allow using the celery backend name as the task type name, because the mapping between a backend name and its task type name can easily be retrieved from the scheduler API and only the celery backend name is available in sentry event data.
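A minimal sketch of the name resolution this describes, assuming task types are available as mappings with `type` and `backend_name` keys. The helper names and the sample values are illustrative only and are not taken from the commit.

```python
from typing import Dict, Iterable, Mapping


def build_backend_index(task_types: Iterable[Mapping[str, str]]) -> Dict[str, str]:
    """Map each celery backend name to its task type name."""
    return {tt["backend_name"]: tt["type"] for tt in task_types}


def resolve_task_type(name: str, backend_index: Mapping[str, str]) -> str:
    """Return the task type name for `name`.

    `name` may be either a task type name (returned unchanged) or a celery
    backend name (translated through the index).
    """
    return backend_index.get(name, name)


# Illustrative data only; real values would come from the scheduler API.
task_types = [
    {"type": "load-git", "backend_name": "example.tasks.LoadGit"},
]
index = build_backend_index(task_types)
```

Each CSV row's task type column would then go through `resolve_task_type` before scheduling, so both spellings are accepted.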
-
- Feb 05, 2025
-
-
Antoine Lambert authored
An origin can have been listed by the bulk-save lister but never scheduled, so we need to handle that case to avoid errors when attempting to schedule priority first visits.
-
- Dec 09, 2024
-
-
Antoine Lambert authored
A memory backend was recently introduced, so the temporary backend relying on a postgresql server is no longer needed.
-
- Nov 29, 2024
-
-
David Douard authored
-
- Nov 06, 2024
-
-
Antoine Lambert authored
From now on requests to the scheduler remote API will be retried when encountering connection errors and transient remote exceptions.
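A generic retry wrapper in the spirit of this change could look like the sketch below. It is not the scheduler client's implementation: `call_with_retry`, its parameters and the linear backoff are assumptions made for illustration.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retry(
    func: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError,),
    attempts: int = 3,
    delay: float = 0.0,
) -> T:
    """Call `func`, retrying when it raises one of the `retry_on` exceptions.

    Hypothetical helper: other exceptions propagate immediately, and the
    exception raised by the last attempt is never swallowed.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts):
        try:
            return func()
        except retry_on:
            # Transient failure: wait a little longer after each attempt.
            time.sleep(delay * attempt)
    # Last attempt: let any exception propagate to the caller.
    return func()
```

Only errors considered transient should be listed in `retry_on`; retrying on every exception would hide real bugs.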
-
- Oct 30, 2024
-
-
David Douard authored
These have been deprecated for ages now.
-
David Douard authored
-
- Oct 28, 2024
-
-
David Douard authored
The former has been deprecated for ages now.
-
- Oct 24, 2024
-
-
David Douard authored
Normalize the scheduler db for swh.core 3.6 with improved `swh db` handling capabilities. Remove test_init.py, it's now outdated.
-
Antoine R. Dumont authored
This log message serves as a crude health check, so we keep it but make it a bit more interesting.
-
- Oct 17, 2024
-
-
Antoine R. Dumont authored
Refs. swh/devel/swh-scheduler#4687
-
Antoine R. Dumont authored
Refs. swh/devel/swh-scheduler#4687
-
Antoine R. Dumont authored
Refs. swh/devel/swh-scheduler#4687
-
Antoine R. Dumont authored
Refs. swh/devel/swh-scheduler#4687
-
Antoine R. Dumont authored
This also makes the function return the number of scheduled origins. Refs. swh/devel/swh-scheduler#4687
-
Antoine R. Dumont authored
The existing cli was not looping. In effect, it did one round, scheduled origins and then crashed in production-like environments. There is no issue in the docker environment, as the loop is implemented outside the pre-existing cli. This keeps said cli to avoid breaking the docker environment. Refs. swh/devel/swh-scheduler#4687
-
- Oct 14, 2024
-
-
Antoine Lambert authored
This new command in the origin group enables scheduling first visits with high priority for origins registered by listers having the first_visits_priority_queue attribute set. The command ensures the visits of all origins registered by such listers will be scheduled with high priority after the first listing, regardless of whether some have already been scheduled before it. Subsequent executions of such listers will no longer trigger visits with high priority though; those will be scheduled by the recurrent visits runner. Related to #4687.
-
Antoine Lambert authored
It allows returning the set of visit types from the origins listed by a specific lister. Related to #4687.
-
- Oct 09, 2024
-
-
Antoine Lambert authored
This new optional parameter makes it possible to return only the listers whose first visits of listed origins must be scheduled with high priority after a first listing but have not been scheduled yet. Those types of listers have the first_visits_queue_prefix attribute set. Related to #4687.
-
Antoine Lambert authored
In order to implement a new scheduler runner that will schedule first visits of listed origins with high priority, add the following new columns to the Lister model: - last_listing_finished_at: Timestamp at which the last execution of the lister finished - first_visits_queue_prefix: Optional prefix of message queue names to schedule first visits with high priority - first_visits_scheduled_at: Timestamp at which all the first visits of listed origins with high priority were scheduled Related to #4687.
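The three columns listed above can be pictured with the sketch below, together with one possible reading of the "first visits not scheduled yet" filter. `ListerSketch` and `needs_first_visits_scheduling` are hypothetical names; the real Lister model has more fields, and the exact filtering condition is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class ListerSketch:
    """Illustrative subset of a lister record (not the real model)."""

    name: str
    # Timestamp at which the last execution of the lister finished.
    last_listing_finished_at: Optional[datetime] = None
    # Optional prefix of message queue names for high priority first visits.
    first_visits_queue_prefix: Optional[str] = None
    # Timestamp at which all high priority first visits were scheduled.
    first_visits_scheduled_at: Optional[datetime] = None


def needs_first_visits_scheduling(lister: ListerSketch) -> bool:
    """Assumed condition: prefix set, listing finished, nothing scheduled yet."""
    return (
        lister.first_visits_queue_prefix is not None
        and lister.last_listing_finished_at is not None
        and lister.first_visits_scheduled_at is None
    )


now = datetime.now(tz=timezone.utc)
pending = ListerSketch("example", now, "first_visits", None)
done = ListerSketch("example", now, "first_visits", now)
```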
-
- Sep 10, 2024
-
-
Antoine Lambert authored
There are cases (for instance when running tests on Jenkins) where more than one log record is captured during that test, making it flaky.
-
- Aug 30, 2024
-
-
Antoine Lambert authored
-
Antoine Lambert authored
-
David Douard authored
This should allow a failed task to return the reason for the failure and make it possible to display it to the end user, especially for Save-Code-Now. Related to swh/devel/swh-web#4805
-
- Aug 27, 2024
-
-
David Douard authored
-
- Jul 17, 2024
-
-
Nicolas Dandrimont authored
This expands the schema to add fork (boolean yes/no/unknown) and fork source (URL field) if available from the listers, which will be usable for scheduling heuristics (e.g. for GitHub).
-
Antoine Lambert authored
-
- Jul 16, 2024
-
-
It enables efficient filtering by URL of the origins recorded by a lister. Related to swh/devel/swh-web#4802.
-
Database table priority_ratio and function swh_scheduler_nb_priority_tasks are no longer used by the swh-scheduler interface, so remove them.
-
- Jul 11, 2024
-
-
Antoine Lambert authored
Notably, add the missing description of the ids argument. Related to swh/devel/swh-web#4802.
-
- May 23, 2024
-
-
Nicolas Dandrimont authored
Not all celery tasks are recorded in the task_run table, but the celery listener needs to be able to process all events, even for tasks it doesn't know. Such events used to be ignored only by coincidence, and the stricter type checking has broken that. Closes: #4688
-
- May 22, 2024
-
-
Antoine Lambert authored
Wrap calls to attr.asdict and attr.astuple in methods to_dict and to_tuple to avoid an explicit import of the attr package in client code.
-
Antoine Lambert authored
-