Not that it has helped yet, but I reproduced the issue in Jenkins locally.
Prior to that, other issues with our moving cogs (swh.core, etc...) prevented it
(other unrelated failures arose).
Update the scheduler backend with the new task type:
# apt update; apt install -y python3-swh.lister
...
$ swhscheduler@scheduler0:~$ swh scheduler --config-file /etc/softwareheritage/scheduler/backend.yml task-type register
INFO:swh.scheduler.cli.task_type:Loading entrypoint for plugin loader.archive
INFO:swh.scheduler.cli.task_type:Loading entrypoint for plugin loader.cran
INFO:swh.scheduler.cli.task_type:Loading entrypoint for plugin loader.debian
INFO:swh.scheduler.cli.task_type:Loading entrypoint for plugin loader.deposit
INFO:swh.scheduler.cli.task_type:Loading entrypoint for plugin loader.nixguix
INFO:swh.scheduler.cli.task_type:Loading entrypoint for plugin loader.npm
INFO:swh.scheduler.cli.task_type:Loading entrypoint for plugin loader.pypi
INFO:swh.scheduler.cli.task_type:Loading entrypoint for plugin lister.bitbucket
INFO:swh.scheduler.cli.task_type:Loading entrypoint for plugin lister.cgit
INFO:swh.scheduler.cli.task_type:Loading entrypoint for plugin lister.cran
INFO:swh.scheduler.cli.task_type:Loading entrypoint for plugin lister.debian
INFO:swh.scheduler.cli.task_type:Loading entrypoint for plugin lister.gitea
INFO:swh.scheduler.cli.task_type:Loading entrypoint for plugin lister.github
INFO:swh.scheduler.cli.task_type:Loading entrypoint for plugin lister.gitlab
INFO:swh.scheduler.cli.task_type:Loading entrypoint for plugin lister.gnu
INFO:swh.scheduler.cli.task_type:Loading entrypoint for plugin lister.launchpad
INFO:swh.scheduler.cli.task_type:Loading entrypoint for plugin lister.npm
INFO:swh.scheduler.cli.task_type:Loading entrypoint for plugin lister.packagist
INFO:swh.scheduler.cli.task_type:Loading entrypoint for plugin lister.phabricator
INFO:swh.scheduler.cli.task_type:Loading entrypoint for plugin lister.pypi
INFO:swh.scheduler.cli.task_type:Loading entrypoint for plugin lister.sourceforge
INFO:swh.scheduler.cli.task_type:Create task type list-sourceforge-full in scheduler
psql service=admin-staging-swh-scheduler
psql (12.6)
SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, bits: 256, compression: off)
Type "help" for help.

swh-scheduler=> \conninfo
You are connected to database "swh-scheduler" as user "swh-scheduler" on host "db1.internal.staging.swh.network" (address "192.168.130.11") at port "5432".
SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, bits: 256, compression: off)
swh-scheduler=> \x
Expanded display is on.
swh-scheduler=> select * from task_type where type like 'list-source%';
-[ RECORD 1 ]----+---------------------------------------------------
type             | list-sourceforge-full
description      | Full update of a SourceForge instance
backend_name     | swh.lister.sourceforge.tasks.FullSourceForgeLister
default_interval | 90 days
min_interval     | 90 days
max_interval     | 90 days
backoff_factor   | 1
max_queue_length |
num_retries      |
retry_delay      |

-- make some more gentle default
swh-scheduler=> update task_type set max_queue_length=10, min_interval='30 days', max_interval='30 days', num_retries=3 where type='list-sourceforge-full';
UPDATE 1
swh-scheduler=> select * from task_type where type like 'list-source%';
-[ RECORD 1 ]----+---------------------------------------------------
type             | list-sourceforge-full
description      | Full update of a SourceForge instance
backend_name     | swh.lister.sourceforge.tasks.FullSourceForgeLister
default_interval | 90 days
min_interval     | 30 days
max_interval     | 30 days
backoff_factor   | 1
max_queue_length | 10
num_retries      | 3
retry_delay      |
(note: we may want to adapt those in the lister repository in the register function).
May 07 14:01:31 scheduler0 swh[824184]: INFO:swh.scheduler.celery_backend.runner:Grabbed 1 tasks list-sourceforge-full
That got picked up by a worker and failed:
May 07 14:01:32 worker2 python3[218671]: [2021-05-07 14:01:32,495: INFO/MainProcess] Received task: swh.lister.sourceforge.tasks.FullSourceForgeLister[1eb27c36-2f58-4a33-8c9d-10b15b98a294]
May 07 14:01:32 worker2 python3[218680]: [2021-05-07 14:01:32,541: ERROR/ForkPoolWorker-4] Task swh.lister.sourceforge.tasks.FullSourceForgeLister[1eb27c36-2f58-4a33-8c9d-10b15b98a294] raised unexpected: TypeError("__init__() got an unexpected keyword argument 'credentials'")
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/celery/app/trace.py", line 385, in trace_task
    R = retval = fun(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/swh/scheduler/task.py", line 55, in __call__
    result = super().__call__(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/celery/app/trace.py", line 650, in __protected_call__
    return self.run(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/sentry_sdk/integrations/celery.py", line 161, in _inner
    reraise(*exc_info)
  File "/usr/lib/python3/dist-packages/sentry_sdk/_compat.py", line 57, in reraise
    raise value
  File "/usr/lib/python3/dist-packages/sentry_sdk/integrations/celery.py", line 156, in _inner
    return f(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/swh/lister/sourceforge/tasks.py", line 15, in list_sourceforge_full
    return SourceForgeLister.from_configfile().run().dict()
  File "/usr/lib/python3/dist-packages/swh/lister/pattern.py", line 268, in from_configfile
    return cls.from_config(**config)
  File "/usr/lib/python3/dist-packages/swh/lister/pattern.py", line 255, in from_config
    return cls(scheduler=scheduler_instance, **config)
TypeError: __init__() got an unexpected keyword argument 'credentials'
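For readers following along, the failure in the traceback can be reproduced in isolation. The sketch below is not the actual swh.lister code: `BrokenLister` and `FixedLister` are hypothetical stand-ins, and the `url` parameter is made up for illustration. It only assumes what the traceback shows, namely that `from_config` forwards every configuration key to the constructor as a keyword argument, so a `credentials` entry in the config file breaks any constructor that does not accept it.

```python
from typing import Any, Dict, Optional


class BrokenLister:
    """Hypothetical lister whose constructor does not know 'credentials'."""

    def __init__(self, scheduler: Any, url: str = "https://sourceforge.net"):
        self.scheduler = scheduler
        self.url = url

    @classmethod
    def from_config(cls, scheduler: Any, **config: Any) -> "BrokenLister":
        # Every remaining config key is forwarded as a keyword argument,
        # mirroring the `cls(scheduler=..., **config)` call in the traceback.
        return cls(scheduler=scheduler, **config)


class FixedLister(BrokenLister):
    """Same hypothetical lister, with a constructor accepting 'credentials'."""

    def __init__(
        self,
        scheduler: Any,
        url: str = "https://sourceforge.net",
        credentials: Optional[Dict[str, Any]] = None,
    ):
        super().__init__(scheduler=scheduler, url=url)
        self.credentials = credentials or {}


# Config as it might be loaded from the worker's configuration file
# (assumed shape; only the 'credentials' key matters here).
config: Dict[str, Any] = {"credentials": {}}

error = ""
try:
    BrokenLister.from_config(scheduler=object(), **config)
except TypeError as exc:
    # Same kind of TypeError as in the worker log.
    error = str(exc)

# The constructor that accepts the key builds without error.
lister = FixedLister.from_config(scheduler=object(), **config)
```

Accepting the key in the constructor, even if it is unused for now, keeps the lister compatible with a shared worker configuration.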
I'll adapt (but afk).
There is something else I need to update there anyway: the incremental task.
Added the incremental sourceforge task as well (staging).
INFO:swh.scheduler.cli.task_type:Create task type list-sourceforge-incremental in scheduler
Rescheduled the full listing task, which got picked up by the scheduler:
May 07 15:35:02 scheduler0 swh[824184]: INFO:swh.scheduler.celery_backend.runner:Grabbed 1 tasks list-sourceforge-full
It's now running:
May 07 15:31:58 worker0 python3[230921]: [2021-05-07 15:31:58,779: INFO/MainProcess] lister@worker0.internal.staging.swh.network ready.
May 07 15:35:02 worker0 python3[230921]: [2021-05-07 15:35:02,091: INFO/MainProcess] Received task: swh.lister.sourceforge.tasks.FullSourceForgeLister[ec00c8cd-ff5b-47df-adfd-a8c1884b9831]
May 07 15:35:06 worker0 python3[230930]: [2021-05-07 15:35:06,698: WARNING/ForkPoolWorker-4] Project 'https://sourceforge.net/rest/adobe/wiki' does not have any tools
May 07 15:35:07 worker0 python3[230930]: [2021-05-07 15:35:07,402: WARNING/ForkPoolWorker-4] Project 'https://sourceforge.net/rest/adobe/blog' does not have any tools
And we can see the new lister appear in the scheduler backend:
swh-scheduler=> \conninfo
You are connected to database "swh-scheduler" as user "swh-scheduler" on host "db1.internal.staging.swh.network" (address "192.168.130.11") at port "5432".
SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, bits: 256, compression: off)
swh-scheduler=> select * from listers where name ='sourceforge';
                  id                  |    name     | instance_name |            created            | current_state |            updated
--------------------------------------+-------------+---------------+-------------------------------+---------------+-------------------------------
 4b19e941-5e25-4cb0-b55d-ae421d983e2f | sourceforge | main          | 2021-05-07 15:35:02.157958+00 | {}            | 2021-05-07 15:35:02.157958+00
(1 row)
It broke with the following error; Sentry should have more detail [1]:
May 07 15:57:03 worker0 python3[230930]: [2021-05-07 15:57:03,547: ERROR/ForkPoolWorker-4] Task swh.lister.sourceforge.tasks.FullSourceForgeLister[ec00c8cd-ff5b-47df-adfd-a8c1884b9831] raised unexpected: HTTPError('404 Client Error: Not Found for url: https://sourceforge.net/rest/p/fci-cu-library2/b396')
Sorry for the delayed response. I'm assuming we'd like it better if the lister continued anyway in case of a "fatal" connection error, with maybe some sort of retry?
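To make the "continue anyway, with some sort of retry" idea concrete, here is a minimal sketch. It is not the lister's code: `FetchError`, `fetch_with_retry` and `list_projects` are hypothetical names, and the policy (retry 5xx responses a few times, never retry 4xx responses such as the 404 above, then log and skip the project) is only one possible assumption about what we would want.

```python
import logging
import time
from typing import Any, Callable, Iterable, Iterator, Tuple

logger = logging.getLogger(__name__)


class FetchError(Exception):
    """Hypothetical error carrying an HTTP-like status code."""

    def __init__(self, url: str, status: int):
        super().__init__(f"{status} error for url: {url}")
        self.url = url
        self.status = status


def fetch_with_retry(
    fetch: Callable[[str], Any], url: str, retries: int = 3, delay: float = 0.0
) -> Any:
    """Call fetch(url), retrying server-side errors (5xx) up to `retries` times.

    Client errors (4xx) are re-raised immediately: retrying a 404 will not
    make the page appear. Assumes retries >= 1.
    """
    for attempt in range(1, retries + 1):
        try:
            return fetch(url)
        except FetchError as exc:
            if exc.status < 500 or attempt == retries:
                raise
            logger.warning("Attempt %d failed for %s, retrying", attempt, url)
            time.sleep(delay * attempt)  # simple linear backoff


def list_projects(
    fetch: Callable[[str], Any],
    urls: Iterable[str],
    retries: int = 3,
    delay: float = 0.0,
) -> Iterator[Tuple[str, Any]]:
    """Yield (url, payload) per project, skipping projects that keep failing."""
    for url in urls:
        try:
            yield url, fetch_with_retry(fetch, url, retries, delay)
        except FetchError as exc:
            # Log and move on so that one bad project does not abort the listing.
            logger.warning("Skipping %s: %s", url, exc)
```

With this shape, a single missing page costs one warning instead of the whole run. The skipped URLs could also be collected and reported at the end of the listing.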