Deploy maven stack in production
Plan:
- Deploy maven indexer exporter services (one per maven index: maven-central, ...)
- Deploy frontend to expose maven indexer exporter results (*.fld files)
- Deploy maven worker services: lister + loader
- scheduler (saatchi): Restart swh-scheduler-schedule-recurrent service (so maven tasks are scheduled)
- Add maven central so that the maven exporter scrapes its index and exports it
- Discover it failed (out of disk space) due to a forgotten step ¯_(ツ)_/¯
- #3746 (closed): Prepare maven-exporter node to have enough disk to receive indices
- Restart maven export for maven-central
- #4330 (closed): Wait for the end of it ^...
- #4330 (closed): Add maven central to the standard listing once ^ is done
- infra/puppet/puppet-swh-site!554: Make sure lister workers consume the maven queues
- #4330 (closed): Checks
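Since the first export attempt ran out of disk space, a pre-flight check along these lines could be run on the exporter node before restarting an export. This is only a minimal sketch: the function name, the safety margin and the idea of passing an expected index size are assumptions for illustration, not part of the deployed tooling.

```python
import shutil


def has_enough_space(path: str, required_bytes: int, margin: float = 2.0) -> bool:
    """Return True if the filesystem holding `path` has at least
    `required_bytes * margin` bytes free.

    The margin is an assumption: it leaves room for temporary files
    and for copying the exported file to its publication directory.
    """
    free = shutil.disk_usage(path).free
    return free >= required_bytes * margin


# Hypothetical usage: check the current directory against a ~20 GB export.
print(has_enough_space(".", 20_699_901_033))
```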
Migrated from T4330 (view on Phabricator)
Related issues
- swh/meta #1724 Extend archive coverage [Roadmap - Collect]
- swh/meta #4079
Activity
- Phabricator Migration user marked this issue as related to swh/meta#1724
- Phabricator Migration user marked this issue as related to swh/meta#4079
- Antoine R. Dumont changed the description
- Phabricator Migration user mentioned in commit swh-sysadmin-provisioning@cf93d7f7
- Antoine R. Dumont marked the checklist item Deploy maven indexer exporter services (one per maven indexes: maven-central, ...) as completed
- Antoine R. Dumont marked the checklist item Deploy frontend to expose maven indexer exporter results (*.fld files) as completed
- Antoine R. Dumont marked the checklist item Deploy maven worker services: lister + loader as completed
- Antoine R. Dumont changed the description
- Antoine R. Dumont changed the description
- Antoine R. Dumont marked the checklist item Restart maven export for maven-central as completed
- Antoine R. Dumont added state:wip label
- Author Owner
Finally, the export of maven central is done [1], the fld file is computed [2], and it is also exposed [3], hence reachable from the lister worker nodes.
- [1]
  Sep 08 11:24:15 maven-exporter run_maven_index_exporter.sh[146073]: * Make files modifiable by the end-user.
  Sep 08 11:24:15 maven-exporter run_maven_index_exporter.sh[146073]: Docker Script execution finished on 2022-09-08 11:24:15.
  Sep 08 11:24:15 maven-exporter run_maven_index_exporter.sh[146073]: INFO:__main__:Export directory has the following files:
  Sep 08 11:24:15 maven-exporter run_maven_index_exporter.sh[146073]: INFO:__main__: - _s.fld size 20699901033
  Sep 08 11:24:15 maven-exporter run_maven_index_exporter.sh[146073]: INFO:__main__:Found fld file: _s.fld
  Sep 08 11:24:15 maven-exporter run_maven_index_exporter.sh[146073]: INFO:__main__:Copying files to /publish/export.fld.
  Sep 08 11:25:39 maven-exporter run_maven_index_exporter.sh[146073]: INFO:__main__:Script finished on 2022-09-08 11:25:39
  Sep 08 11:25:42 maven-exporter run_maven_index_exporter.sh[146071]: + mv /var/www/maven_index_exporter/export.fld /var/www/maven_index_exporter/export-maven-central.fld
- [2]
  root@maven-exporter:~# ls -lah /var/www/maven_index_exporter/
  total 1.8G
  drwxr-xr-x 2 root root    6 Sep  8 11:25 .
  drwxr-xr-x 4 root root 4.0K Sep  8 08:16 ..
  -rwxrwxrwx 1 root root  442 Sep  8 08:36 export-atlassian-public.fld
  -rwxrwxrwx 1 root root  63M Sep  8 08:18 export-clojars.fld
  -rwxrwxrwx 1 root root  91M Sep  8 08:32 export-jboss.fld
  -rwxrwxrwx 1 root root  20G Sep  8 10:49 export-maven-central.fld
- [3]
  root@maven-exporter:~# curl -s https://maven-exporter.internal.softwareheritage.org/export-maven-central.fld | head
  doc 0
  field 0
  name u
  type string
  value org.pustefixframework|pustefix-archetype-basic|0.18.0|NA|jar
  field 1
  name m
  type string
  value 1318436946815
  field 2
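To illustrate how the exposed text dump is structured, here is a minimal Python sketch that groups the `doc` / `field` / `name` / `type` / `value` lines shown in [3] into records. It only handles the line kinds visible in that excerpt; the function name and the tolerance for leading whitespace are assumptions, and this is not the parser used by the maven lister.

```python
from typing import Dict, Iterable, Iterator, List

# First ten lines of the export, as shown in the `head` output above.
SAMPLE = """\
doc 0
field 0
name u
type string
value org.pustefixframework|pustefix-archetype-basic|0.18.0|NA|jar
field 1
name m
type string
value 1318436946815
field 2
"""


def iter_docs(lines: Iterable[str]) -> Iterator[List[Dict[str, str]]]:
    """Yield one list of field dicts per `doc` entry of the text dump."""
    doc = None    # fields of the document being read
    field = None  # attributes of the field being read
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        key, _, rest = line.partition(" ")
        if key == "doc":
            if doc is not None:
                yield doc
            doc, field = [], None
        elif key == "field" and doc is not None:
            field = {}
            doc.append(field)
        elif field is not None:
            # name / type / value attributes of the current field
            field[key] = rest
    if doc is not None:
        yield doc


docs = list(iter_docs(SAMPLE.splitlines()))
first_value = docs[0][0]["value"]
print(first_value.split("|"))
```

The last field of the sample is empty because `head` cut the output right after `field 2`.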
- Antoine R. Dumont changed the description
- Author Owner
Schedule maven-central listing:
  swhscheduler@saatchi:~$ curl -s https://repo1.maven.org/maven2/ | head -2
  <!DOCTYPE html>
  <html>
  swhscheduler@saatchi:~$ curl -s https://maven-exporter.internal.softwareheritage.org/export-maven-central.fld | head -2
  doc 0
  field 0
  swhscheduler@saatchi:~$ curl -s http://saatchi.internal.softwareheritage.org:5008/
  <html>
  <head><title>Software Heritage scheduler RPC server</title></head>
  <body>
  <p>You have reached the <a href="https://www.softwareheritage.org/">Software Heritage</a> scheduler RPC server.<br /> See its <a href="https://docs.softwareheritage.org/devel/swh-scheduler/">documentation and API</a> for more information</p>
  </body>
  </html>swhscheduler@saatchi:~$ swh scheduler --url http://saatchi.internal.softwareheritage.org:5008/ \
  > task add list-maven-full \
  > url=https://repo1.maven.org/maven2/ \
  > index_url=https://maven-exporter.internal.softwareheritage.org/export-maven-central.fld
  Created 1 tasks
  Task 415251304
  Next run: today (2022-09-08T12:03:54.630698+00:00)
  Interval: 90 days, 0:00:00
  Type: list-maven-full
  Policy: recurring
  Args:
  Keyword args:
  index_url: 'https://maven-exporter.internal.softwareheritage.org/export-maven-central.fld'
  url: 'https://repo1.maven.org/maven2/'
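If more indexes need to be scheduled the same way, the command line above can be assembled programmatically. The sketch below only builds the argument list, using the options that appear in the session above (`--url`, `task add`, `key=value` arguments); the helper name is hypothetical and nothing is executed.

```python
import shlex
from typing import List


def build_task_add_command(scheduler_url: str, task_type: str, **kwargs: str) -> List[str]:
    """Build the `swh scheduler ... task add` argument list for one task.

    Keyword arguments are rendered as `key=value`, in the order given.
    """
    cmd = ["swh", "scheduler", "--url", scheduler_url, "task", "add", task_type]
    cmd += [f"{key}={value}" for key, value in kwargs.items()]
    return cmd


cmd = build_task_add_command(
    "http://saatchi.internal.softwareheritage.org:5008/",
    "list-maven-full",
    url="https://repo1.maven.org/maven2/",
    index_url="https://maven-exporter.internal.softwareheritage.org/export-maven-central.fld",
)
# Print the command for review; running it (e.g. via subprocess.run)
# is left to the operator.
print(shlex.join(cmd))
```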
- Antoine R. Dumont changed the description
- Author Owner
Checks:
- The task has been scheduled by the scheduler runner process [1]
- The listing is being consumed by one worker [2]
- The number of 'maven' listed origins is steadily growing [3]
- New 'maven' listed origins are getting scheduled for ingestion [4]
- The maven loaders are ingesting those [5]
- [1]
  root@saatchi:~# journalctl -xe -u swh-scheduler-runner.service | grep -A1 maven
  Sep 08 12:04:00 saatchi swh[1210080]: INFO:swh.scheduler.celery_backend.runner:Grabbed 1 tasks list-maven-full
  Sep 08 12:04:01 saatchi swh[1210080]: INFO:swh.scheduler.cli.admin.runner:Scheduled 1 tasks
- [2]
  root@pergamon:~# clush -b -w @prod-listers 'systemctl status swh-worker@lister' | grep maven | grep Received
  Sep 08 12:14:51 worker10 python3[2300925]: [2022-09-08 12:14:51,477: INFO/MainProcess] Received task: swh.lister.maven.tasks.FullMavenLister[56d16483-d676-4b15-8a71-e4a8227e3157]
- [3]
  14:20:35 softwareheritage-scheduler@belvedere:5432=> select now(), visit_type, count(*) from listed_origins where lister_id='2b519d27-b0b0-442e-b340-b0d5017ea014' group by visit_type;
  +-------------------------------+------------+-------+
  |              now              | visit_type | count |
  +-------------------------------+------------+-------+
  | 2022-09-08 12:21:19.380732+00 | maven      |  1415 |
  +-------------------------------+------------+-------+
  (1 row)
  Time: 238.210 ms
  14:21:22 softwareheritage-scheduler@belvedere:5432=> select now(), visit_type, count(*) from listed_origins where lister_id='2b519d27-b0b0-442e-b340-b0d5017ea014' group by visit_type;
  +-------------------------------+------------+-------+
  |              now              | visit_type | count |
  +-------------------------------+------------+-------+
  | 2022-09-08 12:21:42.729403+00 | maven      |  1427 |
  +-------------------------------+------------+-------+
  (1 row)
  Time: 18.782 ms
- [4]
Sep 08 12:34:37 saatchi swh[1210191]: INFO:swh.scheduler.celery_backend.recurrent_visits:maven: 53 visits scheduled in queue swh.loader.package.maven.tasks.LoadMaven
- [5]
  root@pergamon:~# clush -b -w @prod-listers 'systemctl status swh-worker@loader_maven' | grep "Received\|succeeded" | head
  Sep 08 12:38:38 worker03 python3[2164719]: [2022-09-08 12:38:38,257: INFO/ForkPoolWorker-14] Task swh.loader.package.maven.tasks.LoadMaven[b7b05fc1-f673-4da1-9b3e-ba8cdf7bc7f0] succeeded in 20.646365012042224s: {'status': 'eventful', 'snapshot_id': '6cb2ba6d63a096dc66fe5c22677be53ca9b0e09d'}
  Sep 08 12:38:38 worker03 python3[2074432]: [2022-09-08 12:38:38,262: INFO/MainProcess] Received task: swh.loader.package.maven.tasks.LoadMaven[bdb9c538-0577-4df6-85d1-2b78942fa06b]
  Sep 08 12:39:34 worker03 python3[2164719]: [2022-09-08 12:39:34,655: INFO/ForkPoolWorker-14] Task swh.loader.package.maven.tasks.LoadMaven[60eaa929-d9b8-4ce1-8ffd-e7b70f74d61b] succeeded in 56.391186997061595s: {'status': 'eventful', 'snapshot_id': 'f23dfb7dd725d4f714ffdfcf38f2ea555df01d41'}
  Sep 08 12:39:34 worker03 python3[2074432]: [2022-09-08 12:39:34,671: INFO/MainProcess] Received task: swh.loader.package.maven.tasks.LoadMaven[0029c978-4285-49b4-ad3d-c91ed89998ee]
  Sep 08 12:39:41 worker03 python3[2164719]: [2022-09-08 12:39:41,880: INFO/ForkPoolWorker-14] Task swh.loader.package.maven.tasks.LoadMaven[bdb9c538-0577-4df6-85d1-2b78942fa06b] succeeded in 7.204680480062962s: {'status': 'eventful', 'snapshot_id': '445101e4715d8415d228ed6ff1c96f8d48a95229'}
  Sep 08 12:39:41 worker03 python3[2074432]: [2022-09-08 12:39:41,884: INFO/MainProcess] Received task: swh.loader.package.maven.tasks.LoadMaven[6cb88082-f19e-4a08-aae6-61247c9484fd]
  Sep 08 12:39:44 worker03 python3[2164719]: [2022-09-08 12:39:44,433: INFO/ForkPoolWorker-14] Task swh.loader.package.maven.tasks.LoadMaven[0029c978-4285-49b4-ad3d-c91ed89998ee] succeeded in 2.5496441069990396s: {'status': 'eventful', 'snapshot_id': '8bdc02042046d481b3448b7abf12e83339095dd1'}
  Sep 08 12:39:47 worker03 python3[2074432]: [2022-09-08 12:39:47,930: INFO/MainProcess] Received task: swh.loader.package.maven.tasks.LoadMaven[066ea81a-9c0a-4dbe-9d22-4a8929281876]
  Sep 08 12:39:54 worker03 python3[2164833]: [2022-09-08 12:39:54,471: INFO/ForkPoolWorker-15] Task swh.loader.package.maven.tasks.LoadMaven[6cb88082-f19e-4a08-aae6-61247c9484fd] succeeded in 6.516929866047576s: {'status': 'eventful', 'snapshot_id': '80e2b2758fa3133d2706ca26ceb8583204f01b0d'}
  Sep 08 12:39:54 worker03 python3[2074432]: [2022-09-08 12:39:54,477: INFO/MainProcess] Received task: swh.loader.package.maven.tasks.LoadMaven[2a8298d9-1938-43b8-8d2a-acd33aeffdc4]
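Check [5] can be summarised rather than eyeballed. The following Python sketch extracts the `succeeded in ...s` entries from worker log lines like the ones above and reports how many loads succeeded and their mean duration. The regular expression is written against the lines shown here only, and the helper name is an assumption; other Celery log formats may need adjustments.

```python
import ast
import re
from statistics import mean
from typing import Dict, Iterable, List

# Matches e.g. "Task pkg.Name[uuid] succeeded in 1.23s: {...}"
SUCCESS_RE = re.compile(
    r"Task (?P<task>[\w.]+)\[(?P<task_id>[0-9a-f-]+)\] "
    r"succeeded in (?P<seconds>[0-9.]+)s: (?P<result>\{.*\})"
)

# Three lines copied from the worker03 output above.
SAMPLE = [
    "Sep 08 12:38:38 worker03 python3[2164719]: [2022-09-08 12:38:38,257: INFO/ForkPoolWorker-14] Task swh.loader.package.maven.tasks.LoadMaven[b7b05fc1-f673-4da1-9b3e-ba8cdf7bc7f0] succeeded in 20.646365012042224s: {'status': 'eventful', 'snapshot_id': '6cb2ba6d63a096dc66fe5c22677be53ca9b0e09d'}",
    "Sep 08 12:38:38 worker03 python3[2074432]: [2022-09-08 12:38:38,262: INFO/MainProcess] Received task: swh.loader.package.maven.tasks.LoadMaven[bdb9c538-0577-4df6-85d1-2b78942fa06b]",
    "Sep 08 12:39:44 worker03 python3[2164719]: [2022-09-08 12:39:44,433: INFO/ForkPoolWorker-14] Task swh.loader.package.maven.tasks.LoadMaven[0029c978-4285-49b4-ad3d-c91ed89998ee] succeeded in 2.5496441069990396s: {'status': 'eventful', 'snapshot_id': '8bdc02042046d481b3448b7abf12e83339095dd1'}",
]


def parse_successes(lines: Iterable[str]) -> List[Dict[str, object]]:
    """Return one record per 'succeeded' line; other lines are skipped."""
    records = []
    for line in lines:
        match = SUCCESS_RE.search(line)
        if match is None:
            continue
        # The task result is logged as a Python dict literal.
        result = ast.literal_eval(match.group("result"))
        records.append(
            {
                "task_id": match.group("task_id"),
                "seconds": float(match.group("seconds")),
                "status": result.get("status"),
            }
        )
    return records


parsed = parse_successes(SAMPLE)
print(len(parsed), "succeeded, mean duration (s):", mean(r["seconds"] for r in parsed))
```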
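The "steadily growing" claim of check [3] can be turned into a number from the two `now()` / `count` samples returned by the queries. This is a small sketch under stated assumptions: the function names are hypothetical, and the timestamp handling assumes psql prints an hour-only UTC offset (`+00`), as it does in the output above.

```python
from datetime import datetime
from typing import Tuple

Sample = Tuple[str, int]  # (timestamp as printed by psql, origin count)


def parse_psql_timestamp(text: str) -> datetime:
    """Parse a timestamp such as '2022-09-08 12:21:19.380732+00'.

    Assumption: the offset is hour-only, so ':00' is appended to get
    the '+HH:MM' form accepted by datetime.fromisoformat.
    """
    return datetime.fromisoformat(text + ":00")


def growth_rate(first: Sample, second: Sample) -> float:
    """Return the number of new listed origins per second between two samples."""
    (t0, c0), (t1, c1) = first, second
    elapsed = (parse_psql_timestamp(t1) - parse_psql_timestamp(t0)).total_seconds()
    if elapsed <= 0:
        raise ValueError("samples must be in chronological order")
    return (c1 - c0) / elapsed


# The two samples from check [3].
rate = growth_rate(
    ("2022-09-08 12:21:19.380732+00", 1415),
    ("2022-09-08 12:21:42.729403+00", 1427),
)
print(f"{rate:.2f} origins/s")
```

With only two samples taken about 23 seconds apart, the figure is a rough indication rather than a measured throughput.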
- Antoine R. Dumont marked the checklist item infra/puppet/puppet-swh-site!554: Make sure lister workers consume the maven queues as completed
- Owner
\o/ great
- Antoine R. Dumont removed state:wip label
- Antoine R. Dumont closed