Add forge now - Process https://gitlab.isc.org/
Activity
- Guillaume Samson changed milestone to %Extend archive coverage [Roadmap - Collect]
- Guillaume Samson added AddForgeNow label
- Guillaume Samson assigned to @guillaume
- Author Owner
On the staging environment:
swhscheduler@scheduler0:~$ swh scheduler --url http://scheduler0.internal.staging.swh.network:5008/ \
>   add-forge-now --preset staging \
>   register-lister gitlab \
>   url=https://gitlab.isc.org/api/v4/
Created 1 tasks
Task 33421031
  Next run: today (2023-04-13T12:34:38.548248+00:00)
  Interval: 90 days, 0:00:00
  Type: list-gitlab-full
  Policy: oneshot
  Args:
  Keyword args:
    enable_origins: False
    max_origins_per_page: 10
    max_pages: 3
    url: 'https://gitlab.isc.org/api/v4/'

swhscheduler@scheduler0:~$ swh scheduler --url http://scheduler0.internal.staging.swh.network:5008/ \
>   add-forge-now --preset staging \
>   schedule-first-visits \
>   --type-name git \
>   --lister-name gitlab \
>   --lister-instance-name gitlab.isc.org
100 slots available in celery queue
20 visits to send to celery
- Guillaume Samson added 15m of time spent
- Owner
To avoid the pickle you hit on staging [1] [2], I triggered a Save Code Now on one of the bind9 forks (in production), which went smoothly [3]. So I think it would be fine to deploy in production now.
You should not get hit by it there. Most forks should be ingested: the loader retrieves a large packfile but only sends to the archive the revisions (and related DAG objects) that the fork built on top of the main repository (I expect not much).
[1] I think you hit the same issue I did for the "blender" fork repository, or some similar issue. The staging archive being less populated than production, the workers probably sent a large amount of duplicated data to the archive at around the same time, ending up in concurrent PostgreSQL transactions that stepped on each other...
[3] https://sentry.softwareheritage.org/share/issue/252628f2879f46039b70bc7127919dd0/
[2]
loaders [2023-04-14 10:22:26,178: INFO/ForkPoolWorker-1] Listed 8508 refs for repo https://gitlab.isc.org/isc-projects/bind9
loaders [2023-04-14 11:01:41,423: INFO/ForkPoolWorker-1] Fetched 729963 objects; 9691 are new
loaders [2023-04-14 11:01:41,517: INFO/ForkPoolWorker-1] Task swh.loader.git.tasks.UpdateGitRepository[3cdaabf4-9018-4d12-85ca-285eea02b92f] succeeded in 2381.3799842860317s: {'status': 'eventful'}
Edited by Antoine R. Dumont
- Author Owner
Great, thanks.
Indeed, there are a lot of errors on staging:
swh-scheduler=> select visit_type, url, last_visit_status from origin_visit_stats where visit_type='git' and url like 'https://gitlab.isc.org%';
 visit_type |                              url                               | last_visit_status
------------+----------------------------------------------------------------+-------------------
 git        | https://gitlab.isc.org/pspacek/zone-transfer-benchmarks.git    | successful
 git        | https://gitlab.isc.org/stepan/hypothesis-dns.git               | successful
 git        | https://gitlab.isc.org/isc-projects/DNS-Compliance-Testing.git | successful
 git        | https://gitlab.isc.org/fanf/hg64.git                           | successful
 git        | https://gitlab.isc.org/pspacek/zone-stats-experiment.git       | successful
 git        | https://gitlab.isc.org/isc-projects/keama-leases.git           | successful
 git        | https://gitlab.isc.org/wlodek/tcp-python-client-blq.git        | successful
 git        | https://gitlab.isc.org/tkrizek/gitlab-helpers.git              | successful
 git        | https://gitlab.isc.org/isc-projects/images.git                 | successful
 git        | https://gitlab.isc.org/pemensik/bind9.git                      | failed
 git        | https://gitlab.isc.org/isc-projects/bind9.git                  | failed
 git        | https://gitlab.isc.org/kchen/bind9.git                         | failed
 git        | https://gitlab.isc.org/fanf/bind9.git                          | failed
 git        | https://gitlab.isc.org/isc-projects/python-rndc.git            | successful
 git        | https://gitlab.isc.org/isc-projects/kea.git                    | failed
 git        | https://gitlab.isc.org/bshastry/bind9.git                      | failed
 git        | https://gitlab.isc.org/isc-projects/userspace-rcu.git          | failed
 git        | https://gitlab.isc.org/wpk/bind9.git                           | failed
 git        | https://gitlab.isc.org/uedvt359/kea.git                        | failed
 git        | https://gitlab.isc.org/oliverford/bind9.git                    | failed
(20 rows)
There are also many forks on the ISC forge (BIND, DHCP and Kea):
ᐅ curl -s "https://gitlab.isc.org/api/v4/projects?per_page=100&page=1" | jq '.[] | .name' | \
    sort | uniq -c | sort -k1 -n | grep -v "^[[:space:]]*[12] "
      6 "stork"
     10 "dhcp"
     13 "Kea"
     18 "BIND"
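For anyone who prefers to do the same count without the shell pipeline, here is a minimal Python sketch. It assumes the JSON has already been fetched and parsed into a list of dicts with a "name" key, as in the GitLab projects API response. The function name `repeated_project_names` and the sample data are made up for illustration. The `min_count=3` default mirrors the `grep -v` filter above, which drops names seen only once or twice.

```python
from collections import Counter


def repeated_project_names(projects, min_count=3):
    """Count project names and keep those seen at least `min_count` times.

    `projects` is a list of dicts shaped like the GitLab projects API
    response (only the "name" key is used here).
    """
    counts = Counter(project["name"] for project in projects)
    repeated = [(name, n) for name, n in counts.items() if n >= min_count]
    # Ascending by count, like `sort -k1 -n` in the shell pipeline.
    return sorted(repeated, key=lambda item: item[1])


# Made-up sample data, not the real API response.
sample = [{"name": "BIND"}] * 4 + [{"name": "Kea"}] * 3 + [{"name": "hg64"}]
print(repeated_project_names(sample))
```

Like the curl command, this only looks at a single page of results, so a name repeated across several pages would be undercounted.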
Edited by Guillaume Samson
- Guillaume Samson added 4h of time spent
- Author Owner
On the production environment:
swhscheduler@saatchi:~$ swh scheduler --url http://saatchi.internal.softwareheritage.org:5008/ \
>   add-forge-now --preset production \
>   register-lister gitlab \
>   url=https://gitlab.isc.org/api/v4/
Created 1 tasks
Task 415368906
  Next run: today (2023-04-17T09:35:42.072031+00:00)
  Interval: 90 days, 0:00:00
  Type: list-gitlab-full
  Policy: recurring
  Args:
  Keyword args:
    url: 'https://gitlab.isc.org/api/v4/'
Created 1 tasks
Task 415368907
  Next run: tomorrow (2023-04-18T09:35:42.145388+00:00)
  Interval: 1 day, 0:00:00
  Type: list-gitlab-incremental
  Policy: recurring
  Args:
  Keyword args:
    url: 'https://gitlab.isc.org/api/v4/'

swhscheduler@saatchi:~$ swh scheduler --url http://saatchi.internal.softwareheritage.org:5008/ \
>   add-forge-now --preset production \
>   schedule-first-visits \
>   --type-name git \
>   --lister-name gitlab \
>   --lister-instance-name gitlab.isc.org
10000 slots available in celery queue
110 visits to send to celery
- Guillaume Samson added 15m of time spent
- Author Owner
On the production environment, most first ingestions completed successfully:
softwareheritage-scheduler=> select last_visit_status, count(ovs.url) from origin_visit_stats ovs join listed_origins lo USING(url, visit_type) where lister_id = (select id from listers where name='gitlab' and instance_name='gitlab.isc.org') and visit_type='git' group by last_visit_status;
 last_visit_status | count
-------------------+-------
 successful        |   105
 failed            |     5
(2 rows)
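As a quick sanity check on these numbers, the two counts add up to the 110 visits that were sent to celery. The small Python sketch below does that arithmetic. The helper name `visit_summary` is hypothetical and not part of any swh tool; the counts are copied from the query result above.

```python
def visit_summary(counts):
    """Return (total visits, share of successful visits) from a status->count map."""
    total = sum(counts.values())
    rate = counts.get("successful", 0) / total if total else 0.0
    return total, rate


# Counts copied from the query result above.
counts = {"successful": 105, "failed": 5}
total, rate = visit_summary(counts)
print(f"{total} visits, {rate:.1%} successful")  # 110 visits, 95.5% successful
```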
- Guillaume Samson added 15m of time spent
- Guillaume Samson closed