Commit b98f6f92 authored by Antoine R. Dumont

celery_backend/runner: Switch write order to rabbitmq then postgresql

Messages are now sent to rabbitmq first, then written to postgresql.

In the nominal case where all writes succeed, this changes nothing compared to
the previous implementation (postgresql first, then rabbitmq).

In degraded conditions, though, the new order should behave better (see the
sketch after these notes):

1. If we cannot write to rabbitmq, then we do not write to postgresql either:
this function raises and stops.

2. If we can write to rabbitmq first, then the messages will be consumed
independently of this function. If we then fail to write to postgresql (for
whatever reason), we merely lose the record that the task was already sent.
This means the same task will be rescheduled and sent again. As such tasks are
supposed to be idempotent, that should not be a major issue for their upstream.

Also, those tasks are mostly listers now, and they have state management of
their own, so a rescheduled run should mostly be a no-op (if the ingestion from
the previous run went fine). Edge-case scenarios such as a site being down
behave as before.
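
To make the new control flow concrete, here is a condensed sketch of what the
function does after this change. The tuple shape, the `app.send_task` call and
the `mass_schedule_task_runs` call are taken from the diff below; the exact
signature of `write_to_backends` is elided there, so the one used here is an
assumption:

    import logging

    logger = logging.getLogger(__name__)

    def write_to_backends(backend, app, backend_tasks, celery_tasks):
        # Send to rabbitmq first: if this raises, nothing has been written to
        # postgresql yet, so the tasks remain schedulable (point 1).
        for with_priority, backend_name, backend_id, args, kwargs in celery_tasks:
            app.send_task(backend_name, task_id=backend_id, args=args, kwargs=kwargs)
        logger.debug("Sent %s celery tasks", len(celery_tasks))
        # Only then record the runs in postgresql: if this raises, the same
        # tasks get rescheduled and re-sent later, which idempotent tasks
        # tolerate (point 2).
        backend.mass_schedule_task_runs(backend_tasks)
        logger.debug("Written %s celery tasks", len(backend_tasks))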

Refs. swh/infra/sysadm-environment#5512
parent 41f6156b
@@ -37,21 +37,27 @@ def write_to_backends(
     """Utility function to unify the writing to rabbitmq and the scheduler backends in a
     consistent way (towards transaction-like).

-    In the current state of affairs, the messages are written in postgresql first and
-    then sent to rabbitmq.
+    Messages are sent to rabbitmq first, then written to postgresql.

-    That could pose an issue in case we can write to postgresql and then fail to write
-    in the rabbitmq backend. As those tasks are then marked as "next_run_scheduled",
-    they won't be scheduled again. And as those messages never hit rabbitmq in the first
-    place, that leaves records in a pending state in the postgresql backend (history
-    log).
+    In the nominal case where all writes succeed, this changes nothing compared to the
+    previous implementation (postgresql first, then rabbitmq).

-    It's not an issue if we cannot write to postgresql, because then this code raises
-    and we do not send the messages to rabbitmq.
+    In degraded conditions, though, the new order should behave better:
+
+    1. If we cannot write to rabbitmq, then we do not write to postgresql either: this
+    function raises and stops.
+
+    2. If we can write to rabbitmq first, then the messages will be consumed
+    independently of this function. If we then fail to write to postgresql (for
+    whatever reason), we merely lose the record that the task was already sent. This
+    means the same task will be rescheduled and sent again. As such tasks are supposed
+    to be idempotent, that should not be a major issue for their upstream.
+
+    Also, those tasks are mostly listers now, and they have state management of their
+    own, so a rescheduled run should mostly be a no-op (if the ingestion from the
+    previous run went fine). Edge-case scenarios such as a site being down behave as
+    before.
     """
-    backend.mass_schedule_task_runs(backend_tasks)
-    logger.debug("Written %s celery tasks", len(backend_tasks))
     for with_priority, backend_name, backend_id, args, kwargs in celery_tasks:
         kw = dict(
             task_id=backend_id,
@@ -62,6 +68,8 @@ def write_to_backends(
             kw["queue"] = f"save_code_now:{backend_name}"
         app.send_task(backend_name, **kw)
     logger.debug("Sent %s celery tasks", len(backend_tasks))
+    backend.mass_schedule_task_runs(backend_tasks)
+    logger.debug("Written %s celery tasks", len(backend_tasks))


 def run_ready_tasks(
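
The guarantee described in point 1 of the commit message can be checked with a
small test sketch based on unittest.mock. The import path is inferred from
"celery_backend/runner" in the commit title and the keyword arguments from the
sketch above, so both are assumptions:

    from unittest import mock

    # Module path assumed from the commit title; adjust if it differs.
    from swh.scheduler.celery_backend.runner import write_to_backends

    def test_rabbitmq_failure_skips_postgresql():
        backend = mock.Mock()
        app = mock.Mock()
        # Simulate rabbitmq being unreachable.
        app.send_task.side_effect = RuntimeError("rabbitmq down")
        celery_tasks = [(False, "some.task.name", "task-id", (), {})]
        try:
            write_to_backends(backend, app, backend_tasks=[], celery_tasks=celery_tasks)
        except RuntimeError:
            pass
        # The failure propagated before any postgresql write happened.
        backend.mass_schedule_task_runs.assert_not_called()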