Ingest SourceForge repositories (origins of type git, svn, hg)
The lister is deployed, and ingestion has started for the svn and git repositories [1].
The mercurial loader needs some more work before ingestion of the mercurial repositories can start.
This task tracks both the ingestion monitoring and the remaining actions needed to trigger the hg ingestion.
Once ingestion is reasonably well under way, for example when the git and svn repositories are done, we need to update the logo on the main archive page [2] and add an entry about it to the archive changelog [3].
- [1] Summary of origins with status done out of the total seen (computed from [4] and [5], for the curious):
|------------+-------------------------------+-------------+--------+------------------------------+-----------+---------|
| Visit type | Status done (now) | Status done | Total | Total (now) | % Done | Remains |
|------------+-------------------------------+-------------+--------+------------------------------+-----------+---------|
| git | 2021-07-30 08:06:15.578567+00 | 181658 | 181646 | 2021-07-30 08:06:26.37034+00 | 100.00661 | -12 |
| svn | 2021-08-03 08:16:40.75639+00 | 101940 | 101894 | 2021-07-30 08:06:26.37034+00 | 100.04514 | -46 |
| cvs | x | x | 28622 | 2021-07-30 08:06:26.37034+00 | x | x |
| hg | 2021-08-03 08:14:43.364316+00 | 27630 | 27660 | 2021-07-30 08:06:26.37034+00 | 99.891540 | 30 |
| bzr | x | x | 290 | 2021-07-30 08:06:26.37034+00 | x | x |
|------------+-------------------------------+-------------+--------+------------------------------+-----------+---------|
#+TBLFM: @2$6=(100.0 * @2$3) / @2$4::@3$6=(100.0 * @3$3) / @3$4::@2$7=@2$4 - @2$3::@3$7=@3$4 - @3$3::@5$6=(100.0 * @5$3) / @5$4::@5$7=@5$4 - @5$3
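
The table formulas above can be reproduced outside the org table with a minimal sketch like the one below. The counts are copied from the table; the `counts` dictionary and the `progress` helper are hypothetical names introduced only for illustration. Percentages slightly above 100 and negative remainders presumably appear because the "done" and "total" counts were sampled at different times, as the timestamps in the table suggest.

```python
# Counts copied from the summary table: (status done, total) per visit type.
# cvs and bzr are left out because the table has no "done" count for them.
counts = {
    "git": (181658, 181646),
    "svn": (101940, 101894),
    "hg": (27630, 27660),
}


def progress(done, total):
    """Return (percent done, remaining), mirroring the table formulas."""
    return 100.0 * done / total, total - done


summary = {visit_type: progress(done, total)
           for visit_type, (done, total) in counts.items()}

for visit_type, (percent, remains) in summary.items():
    print(f"{visit_type}: {percent:.5f}% done, {remains} remaining")
```

A negative remainder here only means that more origins were ingested than had been listed when the total was counted.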
- [2] https://archive.softwareheritage.org (integrated in D6004 already)
- [3] https://docs.softwareheritage.org/devel/archive-changelog.html (integrated in D5952)
- [4] count the listed origins:
softwareheritage-scheduler=> select now(), visit_type, count(*) from listed_origins lo inner join listers l on l.id=lo.lister_id where l.name='sourceforge' group by visit_type order by count(*) desc;
- [5] count the origins:
softwareheritage=> select now(), count(*) from origin where url like 'https://git.code.sf.net%'; -- for svn, replace `git` with `svn` in the URL prefix
Limited to those origin types, as we have no CVS or Bazaar loader implementations.
Migrated from T3374 (view on Phabricator)