
Use mercurial tags as named pointer (referenced in the snapshot)

As discussed on IRC:

12:04 <+olasd> ardumont: so, just to be clear, what do tags look like in the mercurial loader now?
12:17 <+ardumont> as before, they just now target the right associated revision (and no longer the wrong mercurial ones)
12:17 <+ardumont> (https://forge.softwareheritage.org/!2 is the fix)
13:07 <+olasd> what's the point of having release objects with no message, no author, no date? this information could live entirely in the snapshot
13:08 <+olasd> (same as git "lightweight" tags)
13:11 <+ardumont> i think that was developed prior to the snapshot
13:12 <+olasd> snapshots have been in the data model forever (they were called occurrences before but they were the same thing)
13:13 <+olasd> if the object is, functionally, a named pointer for a revision id, there's no need for a full fat release object
13:17 <+olasd> (this conversation is orthogonal to the removal of the "bogus" objects)
13:20 <+ardumont> yes, right (about occurrences)
13:20 <+ardumont> for the choice then, "the point" was probably done to stick to the mercurial model, 1 tag -> 1 release
13:21 <+ardumont> which as you point out is probably not that good
13:23 <+olasd> well, we've been dithering on whether to synthesize release objects for most loaders now, but I think we're pretty much consistently avoiding them in favor of putting the information in the snapshot
13:24 <+ardumont> indeed
13:24 <+olasd> unless there really is more to it than having a named pointer to a given revision
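
To make the "named pointer" idea from the discussion concrete, here is a minimal sketch of tags stored as snapshot branches that target revisions directly, with no intermediate release object. The helper name `tags_to_branches` and the exact dict layout (`target` / `target_type` keys) are assumptions for illustration, not necessarily what the loader does; the revision ids are taken from the test plan log.

```python
from typing import Dict


def tags_to_branches(tags: Dict[bytes, bytes]) -> Dict[bytes, dict]:
    """Turn a {tag name: revision id} mapping into snapshot branches.

    Hypothetical helper: each tag becomes a branch that points straight
    at its revision, instead of at a synthesized release object.
    """
    return {
        name: {"target": revision_id, "target_type": "revision"}
        for name, revision_id in tags.items()
    }


# Tag name -> target revision id, as raw bytes (values from the log output).
tags = {
    b"1.0a1": bytes.fromhex("2e8c4a95ae8d6855d083a4bf3888c0c2bd1ab7d7"),
    b"1.0a2": bytes.fromhex("a6c182120215be5357db6d538dceee57f8f76d58"),
}

# Assumed snapshot layout: a dict with a "branches" mapping.
snapshot = {"branches": tags_to_branches(tags)}
```

With this shape, a tag carries nothing beyond its name and its target, which matches the observation that these releases had no message, author, or date.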

Related #1189 (closed)

Test Plan

Manual (unfortunately cf. #954 (closed))

  • No data in swh-dev instance

  • Loaded the 'distutils2' origin (with tags)

  • Checked that releases are no longer created.

  • Checked that the tags are referenced as branches in the snapshot and that they still (as in the previous fix, !2 (closed)) target the right revisions.

$ python3
>>> project = 'distutils2' 
>>> # remote repository
... origin_url = 'https://www.mercurial-scm.org/repo/%s' % project
>>> # local clone
... directory = '/home/storage/hg/repo/%s' % project
>>>
>>> import logging
>>> logging.basicConfig(level=logging.DEBUG)
>>>
>>> from swh.loader.mercurial.tasks import LoadMercurialTsk
>>>
>>> t = LoadMercurialTsk()
>>> t.run(origin_url=origin_url, directory=directory, visit_date='2016-05-03T15:16:32+00:00')
DEBUG:swh.scheduler.task.LoadMercurialTsk:Creating hg origin for https://www.mercurial-scm.org/repo/distutils2
DEBUG:swh.scheduler.task.LoadMercurialTsk:Done creating hg origin for https://www.mercurial-scm.org/repo/distutils2
DEBUG:swh.scheduler.task.LoadMercurialTsk:Creating origin_visit for origin 1 at time 2016-05-03 15:16:32+00:00
DEBUG:swh.scheduler.task.LoadMercurialTsk:Done Creating origin_visit for origin 1 at time 2016-05-03 15:16:32+00:00
DEBUG:swh.scheduler.task.LoadMercurialTsk:Bundling at /home/storage/hg/repo/distutils2/HG20_none_bundle
INFO:swh.scheduler.task.LoadMercurialTsk:5847fb2b7549fb301882c1054eb8b3d0893e3570: b'1.0a1' -> 2e8c4a95ae8d6855d083a4bf3888c0c2bd1ab7d7
INFO:swh.scheduler.task.LoadMercurialTsk:61d1f457a279398c785f8ab729f30f526b6edd58: b'1.0a2' -> a6c182120215be5357db6d538dceee57f8f76d58
INFO:swh.scheduler.task.LoadMercurialTsk:e15c0aa57f52215fe8f184427b0207e6acfb65d6: b'1.0a3' -> c4dd5b9920c7bfefdf857f656c4f64939caf6fe4
INFO:swh.scheduler.task.LoadMercurialTsk:e15c0aa57f52215fe8f184427b0207e6acfb65d6: b'1.0a3' -> c4dd5b9920c7bfefdf857f656c4f64939caf6fe4
INFO:swh.scheduler.task.LoadMercurialTsk:7c8e61aa51f4748286964bc1405bd4169c270f46: b'1.0a3' -> d06ecb3a53cd93e3a9b90861a355e3f1f86b4e75
INFO:swh.scheduler.task.LoadMercurialTsk:7c8e61aa51f4748286964bc1405bd4169c270f46: b'1.0a3' -> d06ecb3a53cd93e3a9b90861a355e3f1f86b4e75
INFO:swh.scheduler.task.LoadMercurialTsk:d930ae6caab58bec92683235aa88d06bbc07ae36: b'1.0a3' -> b893f4bee89bb5dbc0cd8b1821fd37008b377a9c
INFO:swh.scheduler.task.LoadMercurialTsk:1e4d52d83e95c14e3f0bd2179a81ac1023ef32e9: b'1.0a4' -> ca984898a5f9218d0ab5496748002fdc1ec4c260
DEBUG:swh.scheduler.task.LoadMercurialTsk:Updating origin_visit for origin 1 with status full
DEBUG:swh.scheduler.task.LoadMercurialTsk:Done updating origin_visit for origin 1 with status full
DEBUG:swh.scheduler.task.LoadMercurialTsk:Cleanup up working bundle /home/storage/hg/repo/distutils2/HG20_none_bundle
{'status': 'eventful'}
>>>

Log format: "node id: tag name -> rev-target-id-as-hex". Then check that the target id is in the db.
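
The manual check could be partly scripted by extracting the target ids from the INFO lines before looking them up in the db. The sketch below is only an illustration: the regular expression assumes the log format shown in the session output, and the names `LOG_LINE` and `parse_tag_lines` are hypothetical.

```python
import re
from typing import Iterable, List, Tuple

# Assumed line shape: INFO:<logger>:<node id>: b'<tag>' -> <target id as hex>
LOG_LINE = re.compile(
    r"^INFO:[^:]+:(?P<node>[0-9a-f]{40}): b'(?P<tag>[^']+)' -> (?P<target>[0-9a-f]{40})$"
)


def parse_tag_lines(lines: Iterable[str]) -> List[Tuple[str, str, str]]:
    """Return (node id, tag name, target revision id) for each tag log line.

    Lines that do not match the assumed format (e.g. DEBUG lines) are skipped.
    """
    entries = []
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match:
            entries.append((match["node"], match["tag"], match["target"]))
    return entries


# Two lines copied from the session output.
sample = [
    "DEBUG:swh.scheduler.task.LoadMercurialTsk:Bundling at /home/storage/hg/repo/distutils2/HG20_none_bundle",
    "INFO:swh.scheduler.task.LoadMercurialTsk:5847fb2b7549fb301882c1054eb8b3d0893e3570: b'1.0a1' -> 2e8c4a95ae8d6855d083a4bf3888c0c2bd1ab7d7",
]
entries = parse_tag_lines(sample)
```

Each resulting target id can then be looked up in the db, as done by hand in this test plan.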


Migrated from D409 (view on Phabricator)
