Errors from vault workers are not logged to Sentry
After reporting an error to the vault backend, workers call logging.exception
: https://forge.softwareheritage.org/source/swh-vault/browse/master/swh/vault/cookers/base.py$154
and this should be captured by Sentry, but it isn't.
I don't understand why, as vault workers use the same plumbling as loader workers (ie. https://forge.softwareheritage.org/source/swh-scheduler/browse/master/swh/scheduler/celery_backend/config.py$157-167)
Migrated from T3920 (view on Phabricator)
Activity
-
Newest first Oldest first
-
Show all activity Show comments only Show history only
- vlorentz added Sentry System administration Vault priority:High labels
added Sentry System administration Vault priority:High labels
- Maintainer
Looking at puppet configuration, my guess is that the
sentry_dsn
is not set for the vault cookers.All other components running in celery workers have
sentry_dsn
configuration set (see loader-git for instance). - Owner
Yes, i confirm (from #swh-sysadm discussion)
14:55 <+vlorentz> #3920 14:55 -- Notice(swhbot): #3920 (submitter: vlorentz, priority High, status: Open): Errors from vault workers are not logged to Sentry <https://forge.softwareheritage.org/#3920> 14:58 <+ardumont> #3920, they do not seem to have the sentry configuration set in puppet, i'll attend to that later
- Antoine R. Dumont added state:wip label
added state:wip label
- Owner
So i was wrong, it is correctly set [1]. And there are sentry issues about workers [2].
-
[1] infra/puppet/puppet-swh-site!488
-
[2] https://sentry.softwareheritage.org/share/issue/ec2b1220736b423aae13b1e8ef86785b/
-
- Owner
So i don't currently know what's wrong (if anything is).
I wonder if replacing
@worker_init.connect
with@worker_process_init.connect
at https://forge.softwareheritage.org/source/swh-scheduler/browse/master/swh/scheduler/celery_backend/config.py$157 would work.This seems to be recommended by Sentry https://docs.sentry.io/platforms/python/guides/celery/
- Owner
Something is bothering me, isn't there some catchall exceptions happening in the vault source code somewhere already?
Because that's currently working for other modules, so i don't get the difference here.
It might be worth double checking the vault code with this ^.
Anyway, nice finding. For your information, I've stopped the secondary cooker worker and adapted the other one with what you suggested:
#@worker_init.connect @worker_process_init.connect @_print_errors def on_worker_init(*args, **kwargs): try: from sentry_sdk.integrations.celery import CeleryIntegration except ImportError: integrations = [] else: integrations = [CeleryIntegration()] sentry_dsn = None # will be set in `init_sentry` function init_sentry(sentry_dsn, integrations=integrations, debug=True)
The process starts ok with the change [1] (with debug=True as proposed by the doc, dropped after the tryout).
I'll let you trigger some cooking and reports here your finding.
If that's satisfying enough for you, we can try and adapt this in the scheduler.
What do you think?
- [1]
Feb 09 08:48:17 worker11 python3[13674]: . swh.vault.cooking_tasks.SWHBatchCookingTask Feb 09 08:48:17 worker11 python3[13674]: . swh.vault.cooking_tasks.SWHCookingTask Feb 09 08:48:17 worker11 python3[13680]: [sentry] DEBUG: Setting up integrations (with default = True) Feb 09 08:48:17 worker11 python3[13680]: [2022-02-09 08:48:17,280: DEBUG/ForkPoolWorker-1] Setting up integrations (with default = True) Feb 09 08:48:17 worker11 python3[13680]: [sentry] DEBUG: Setting up previously not enabled integration celery Feb 09 08:48:17 worker11 python3[13680]: [2022-02-09 08:48:17,285: DEBUG/ForkPoolWorker-1] Setting up previously not enabled integration celery Feb 09 08:48:17 worker11 python3[13680]: [sentry] DEBUG: Setting up previously not enabled integration logging Feb 09 08:48:17 worker11 python3[13680]: [2022-02-09 08:48:17,286: DEBUG/ForkPoolWorker-1] Setting up previously not enabled integration logging Feb 09 08:48:17 worker11 python3[13680]: [sentry] DEBUG: Setting up previously not enabled integration stdlib Feb 09 08:48:17 worker11 python3[13680]: [2022-02-09 08:48:17,287: DEBUG/ForkPoolWorker-1] Setting up previously not enabled integration stdlib Feb 09 08:48:17 worker11 python3[13680]: [sentry] DEBUG: Setting up previously not enabled integration excepthook Feb 09 08:48:17 worker11 python3[13680]: [2022-02-09 08:48:17,288: DEBUG/ForkPoolWorker-1] Setting up previously not enabled integration excepthook Feb 09 08:48:17 worker11 python3[13680]: [sentry] DEBUG: Setting up previously not enabled integration dedupe Feb 09 08:48:17 worker11 python3[13680]: [2022-02-09 08:48:17,288: DEBUG/ForkPoolWorker-1] Setting up previously not enabled integration dedupe Feb 09 08:48:17 worker11 python3[13680]: [sentry] DEBUG: Setting up previously not enabled integration atexit Feb 09 08:48:17 worker11 python3[13680]: [2022-02-09 08:48:17,289: DEBUG/ForkPoolWorker-1] Setting up previously not enabled integration atexit Feb 09 08:48:17 worker11 python3[13680]: [sentry] DEBUG: Setting up previously not enabled integration modules Feb 09 08:48:17 worker11 python3[13680]: [2022-02-09 08:48:17,290: DEBUG/ForkPoolWorker-1] Setting up previously not enabled integration modules Feb 09 08:48:17 worker11 python3[13680]: [sentry] DEBUG: Setting up previously not enabled integration argv Feb 09 08:48:17 worker11 python3[13680]: [2022-02-09 08:48:17,291: DEBUG/ForkPoolWorker-1] Setting up previously not enabled integration argv Feb 09 08:48:17 worker11 python3[13680]: [sentry] DEBUG: Setting up previously not enabled integration threading Feb 09 08:48:17 worker11 python3[13680]: [2022-02-09 08:48:17,292: DEBUG/ForkPoolWorker-1] Setting up previously not enabled integration threading Feb 09 08:48:17 worker11 python3[13680]: [sentry] DEBUG: Enabling integration celery Feb 09 08:48:17 worker11 python3[13680]: [2022-02-09 08:48:17,292: DEBUG/ForkPoolWorker-1] Enabling integration celery Feb 09 08:48:17 worker11 python3[13680]: [sentry] DEBUG: Enabling integration logging Feb 09 08:48:17 worker11 python3[13680]: [sentry] DEBUG: Enabling integration stdlib Feb 09 08:48:17 worker11 python3[13680]: [sentry] DEBUG: Enabling integration excepthook Feb 09 08:48:17 worker11 python3[13680]: [sentry] DEBUG: Enabling integration dedupe Feb 09 08:48:17 worker11 python3[13680]: [sentry] DEBUG: Enabling integration atexit Feb 09 08:48:17 worker11 python3[13680]: [sentry] DEBUG: Enabling integration modules Feb 09 08:48:17 worker11 python3[13680]: [sentry] DEBUG: Enabling integration argv Feb 09 08:48:17 worker11 python3[13680]: [sentry] DEBUG: Enabling integration threading Feb 09 08:48:17 worker11 python3[13680]: [2022-02-09 08:48:17,293: DEBUG/ForkPoolWorker-1] Enabling integration logging Feb 09 08:48:17 worker11 python3[13680]: [2022-02-09 08:48:17,294: DEBUG/ForkPoolWorker-1] Enabling integration stdlib Feb 09 08:48:17 worker11 python3[13680]: [2022-02-09 08:48:17,294: DEBUG/ForkPoolWorker-1] Enabling integration excepthook Feb 09 08:48:17 worker11 python3[13680]: [2022-02-09 08:48:17,294: DEBUG/ForkPoolWorker-1] Enabling integration dedupe Feb 09 08:48:17 worker11 python3[13680]: [2022-02-09 08:48:17,294: DEBUG/ForkPoolWorker-1] Enabling integration atexit Feb 09 08:48:17 worker11 python3[13680]: [2022-02-09 08:48:17,294: DEBUG/ForkPoolWorker-1] Enabling integration modules Feb 09 08:48:17 worker11 python3[13680]: [2022-02-09 08:48:17,294: DEBUG/ForkPoolWorker-1] Enabling integration argv Feb 09 08:48:17 worker11 python3[13680]: [2022-02-09 08:48:17,294: DEBUG/ForkPoolWorker-1] Enabling integration threading ...
- Owner
Also that's a cooker worker issue from sentry from 18h ago or so (as of the moment of this comment).
So i really don't think there is something wrong?
- Owner
I don't know how to trigger an error in the vault, currently; you'd have to change the code to manually do that :|
I don't know how to trigger either. I'm a bit dubious for the modification part.
Can't you trigger some cookings to your heart's content? At some point, some error should happen?
- Owner
It seems everything is already ok (another cooking task issue reported [1]) in the end so closing this.
- Antoine R. Dumont added state:invalid label and removed state:wip label
added state:invalid label and removed state:wip label
- Antoine R. Dumont closed
closed
- Antoine R. Dumont removed state:invalid label
removed state:invalid label