
Add throttling/backoff to origin visit scheduler respawn logic

No throttling or backoff is applied when the scheduling threads are respawned. As a result, the process respawns threads in a tight loop for as long as the DB is down.

https://sentry.softwareheritage.org/organizations/swh/issues/10223/?referrer=phabricator_plugin

OperationalError: connection to server at "db1.internal.staging.swh.network" (192.168.130.11), port 5432 failed: FATAL:  server login has been failing, try again later (server_login_retry)
connection to server at "db1.internal.staging.swh.network" (192.168.130.11), port 5432 failed: FATAL:  server login has been failing, try again later (server_login_retry)

(2 additional frame(s) were not displayed)
...
  File "swh/scheduler/backend.py", line 72, in __init__
    cursor_factory=psycopg2.extras.RealDictCursor,
  File "psycopg2/pool.py", line 162, in __init__
    self, minconn, maxconn, *args, **kwargs)
  File "psycopg2/pool.py", line 59, in __init__
    self._connect()
  File "psycopg2/pool.py", line 63, in _connect
    conn = psycopg2.connect(*self._args, **self._kwargs)
  File "__init__.py", line 127, in connect
    conn = _connect(dsn, connection_factory=connection_factory, **kwasync)

Thread directory died with exception; respawning
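
One possible shape for the fix, as a minimal sketch only: wrap each scheduling thread's body in a loop that waits an exponentially growing, capped delay between respawns, and resets the delay once a run has stayed up long enough. `RespawnBackoff`, `run_with_respawn` and all parameter values are hypothetical names and defaults chosen for illustration; they are not taken from the swh.scheduler code base.

```python
import logging
import random
import threading
import time

logger = logging.getLogger(__name__)


class RespawnBackoff:
    """Exponential backoff with a cap (hypothetical helper, not swh API)."""

    def __init__(self, base=1.0, factor=2.0, max_delay=300.0,
                 reset_after=600.0, jitter=0.1):
        self.base = base                # delay after the first failure (seconds)
        self.factor = factor            # multiplier applied after each failure
        self.max_delay = max_delay      # upper bound on the delay
        self.reset_after = reset_after  # uptime that counts as "healthy again"
        self.jitter = jitter            # fraction of the delay randomly shaved off
        self._next = base

    def next_delay(self, uptime):
        """Return how long to wait before respawning a thread that ran `uptime` seconds."""
        if uptime >= self.reset_after:
            self._next = self.base      # the thread ran long enough: start over
        delay = self._next
        self._next = min(self.max_delay, self._next * self.factor)
        return delay * random.uniform(1.0 - self.jitter, 1.0)


def run_with_respawn(name, target, backoff, stop):
    """Run `target` until it returns normally or `stop` is set, throttling respawns."""
    while not stop.is_set():
        started = time.monotonic()
        try:
            target()
            return
        except Exception:
            delay = backoff.next_delay(time.monotonic() - started)
            logger.exception(
                "Thread %s died with exception; respawning in %.1fs", name, delay
            )
            # Event.wait doubles as an interruptible sleep on shutdown.
            stop.wait(delay)
```

With the hypothetical defaults above, a DB outage would lead to respawn attempts after roughly 1 s, 2 s, 4 s and so on, up to one attempt every 5 minutes, instead of a continuous stream of connection attempts. Waiting on the stop event rather than calling `time.sleep` keeps shutdown responsive during long delays.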

Migrated from T4681 (view on Phabricator)