Origins of type hg-checkout are ingested with their `.hg` folder.
The folder is excluded from the NAR hash computation, but it is still ingested
along with the rest of the tree, and it should not be.
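For illustration only, here is a minimal sketch of the intended behaviour: dropping the top-level `.hg` directory from a local checkout before the tree is ingested. The helper name `strip_hg_metadata` is hypothetical and this is not the loader's actual code; it only assumes the checkout is a plain directory on disk.

```python
import shutil
from pathlib import Path


def strip_hg_metadata(checkout_dir: Path) -> bool:
    """Remove the top-level .hg directory of a checkout, if present.

    Hypothetical helper (not part of swh.loader.mercurial).
    Returns True when a .hg directory was found and removed.
    """
    hg_dir = checkout_dir / ".hg"
    if hg_dir.is_dir():
        # Delete the Mercurial metadata so only the working tree remains.
        shutil.rmtree(hg_dir)
        return True
    return False
```

Calling it on the clone directory right before ingestion would leave only the working-tree files, which matches what the NAR hash is already computed over.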
Original issue:
Hello! The tarball at https://archive.softwareheritage.org/api/1/vault/flat/swh:1:dir:218d95849f10fc0691d7dfa80999ce5061e654ef/raw/ contains a `.hg` metadata directory, which I think is unintended, right?
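One way to check a downloaded tarball for leftover Mercurial metadata is sketched below. It assumes the tarball has already been saved locally; the function name `hg_members` is made up for this example.

```python
import tarfile
from pathlib import PurePosixPath


def hg_members(tarball_path: str) -> list[str]:
    """List archive members that live inside a `.hg` directory.

    Hypothetical helper: a path matches when one of its components is
    exactly ".hg", so files such as ".hgignore" are not reported.
    """
    # Mode "r" (the default) lets tarfile detect the compression itself.
    with tarfile.open(tarball_path) as tar:
        return [
            name
            for name in tar.getnames()
            if ".hg" in PurePosixPath(name).parts
        ]
```

An empty result means no `.hg` directory was found anywhere in the tarball.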
I've looked into the 16 origins of type hg-checkout in the archive [1].
Only one still has a .hg directory [2].
Note that this origin also has a visit of type hg.
Funnily enough, for that origin, I noticed that the tree listing rendered for the hg visit type does not show the .hg folder, while the one rendered for the hg-checkout visit type does...
Ah, but that seems to be the case in the tree UI for every hg visit type: the .hg is not listed.
@anlambert might be interested to know this either way ^
For that origin, its last visit ended up in a not_found [1]. The loader did not find anything.
That is why the API still displays the old '.hg' in its root folder: it is the snapshot directory of the previous successful visit.
(We have not been able to revisit the origin with the newly deployed version yet.)
The main site seems to have trouble too [2]. So nothing we can do about it currently.
swh/loader-hg-checkout-5cc5cb874-6fwns[loaders]: {"asctime": "2024-02-26 16:56:39,000", "threadName": "MainThread", "pathname": "/opt/swh/.local/lib/python3.10/site-packages/swh/loader/core/loader.py", "lineno": 414, "funcName": "load", "task_name": null, "task_id": null, "name": "swh.loader.mercurial.directory.HgCheckoutLoader", "levelname": "INFO", "message": "Load origin 'http://hg.openjdk.java.net/openjfx/8u-dev/rt' with type 'hg-checkout'"}swh/loader-hg-checkout-5cc5cb874-6fwns[loaders]: real URL is https://hg.openjdk.org/openjfx/8u-dev/rtswh/loader-hg-checkout-5cc5cb874-6fwns[loaders]:swh/loader-hg-checkout-5cc5cb874-6fwns[loaders]: worker: Warm shutdown (MainProcess)swh/loader-hg-checkout-5cc5cb874-6fwns[loaders]:swh/loader-hg-checkout-5cc5cb874-6fwns[loaders]: worker: Warm shutdown (MainProcess)swh/loader-hg-checkout-5cc5cb874-6fwns[loaders]: adding changesetsswh/loader-hg-checkout-5cc5cb874-6fwns[loaders]: transaction abort!swh/loader-hg-checkout-5cc5cb874-6fwns[loaders]: rollback completedswh/loader-hg-checkout-5cc5cb874-6fwns[loaders]: Process Process-1:1:swh/loader-hg-checkout-5cc5cb874-6fwns[loaders]: Traceback (most recent call last):swh/loader-hg-checkout-5cc5cb874-6fwns[loaders]: File "/opt/swh/.local/lib/python3.10/site-packages/billiard/process.py", line 323, in _bootstrapswh/loader-hg-checkout-5cc5cb874-6fwns[loaders]: self.run()swh/loader-hg-checkout-5cc5cb874-6fwns[loaders]: File "/opt/swh/.local/lib/python3.10/site-packages/billiard/process.py", line 110, in runswh/loader-hg-checkout-5cc5cb874-6fwns[loaders]: self._target(*self._args, **self._kwargs)swh/loader-hg-checkout-5cc5cb874-6fwns[loaders]: File "/opt/swh/.local/lib/python3.10/site-packages/swh/loader/core/utils.py", line 76, in _clone_taskswh/loader-hg-checkout-5cc5cb874-6fwns[loaders]: raise eswh/loader-hg-checkout-5cc5cb874-6fwns[loaders]: File "/opt/swh/.local/lib/python3.10/site-packages/swh/loader/core/utils.py", line 71, in 
_clone_taskswh/loader-hg-checkout-5cc5cb874-6fwns[loaders]: clone_func()swh/loader-hg-checkout-5cc5cb874-6fwns[loaders]: File "/opt/swh/.local/lib/python3.10/site-packages/mercurial/hg.py", line 1016, in cloneswh/loader-hg-checkout-5cc5cb874-6fwns[loaders]: exchange.pull(swh/loader-hg-checkout-5cc5cb874-6fwns[loaders]: File "/opt/swh/.local/lib/python3.10/site-packages/mercurial/exchange.py", line 1713, in pullswh/loader-hg-checkout-5cc5cb874-6fwns[loaders]: _fullpullbundle2(repo, pullop)swh/loader-hg-checkout-5cc5cb874-6fwns[loaders]: File "/opt/swh/.local/lib/python3.10/site-packages/mercurial/exchange.py", line 1564, in _fullpullbundle2swh/loader-hg-checkout-5cc5cb874-6fwns[loaders]: _pullbundle2(pullop)swh/loader-hg-checkout-5cc5cb874-6fwns[loaders]: File "/opt/swh/.local/lib/python3.10/site-packages/mercurial/exchange.py", line 1923, in _pullbundle2swh/loader-hg-checkout-5cc5cb874-6fwns[loaders]: bundle2.processbundle(swh/loader-hg-checkout-5cc5cb874-6fwns[loaders]: File "/opt/swh/.local/lib/python3.10/site-packages/mercurial/bundle2.py", line 508, in processbundleswh/loader-hg-checkout-5cc5cb874-6fwns[loaders]: processparts(repo, op, unbundler)swh/loader-hg-checkout-5cc5cb874-6fwns[loaders]: File "/opt/swh/.local/lib/python3.10/site-packages/mercurial/bundle2.py", line 514, in processpartsswh/loader-hg-checkout-5cc5cb874-6fwns[loaders]: with partiterator(repo, op, unbundler) as parts:swh/loader-hg-checkout-5cc5cb874-6fwns[loaders]: File "/opt/swh/.local/lib/python3.10/site-packages/mercurial/bundle2.py", line 457, in __exit__swh/loader-hg-checkout-5cc5cb874-6fwns[loaders]: raise excswh/loader-hg-checkout-5cc5cb874-6fwns[loaders]: File "/opt/swh/.local/lib/python3.10/site-packages/mercurial/bundle2.py", line 516, in processpartsswh/loader-hg-checkout-5cc5cb874-6fwns[loaders]: _processpart(op, part)swh/loader-hg-checkout-5cc5cb874-6fwns[loaders]: File "/opt/swh/.local/lib/python3.10/site-packages/mercurial/bundle2.py", line 594, in 
_processpartswh/loader-hg-checkout-5cc5cb874-6fwns[loaders]: handler(op, part)swh/loader-hg-checkout-5cc5cb874-6fwns[loaders]: File "/opt/swh/.local/lib/python3.10/site-packages/mercurial/bundle2.py", line 2092, in handlechangegroupswh/loader-hg-checkout-5cc5cb874-6fwns[loaders]: ret = _processchangegroup(swh/loader-hg-checkout-5cc5cb874-6fwns[loaders]: File "/opt/swh/.local/lib/python3.10/site-packages/mercurial/bundle2.py", line 524, in _processchangegroupswh/loader-hg-checkout-5cc5cb874-6fwns[loaders]: ret = cg.apply(op.repo, tr, source, url, **kwargs)swh/loader-hg-checkout-5cc5cb874-6fwns[loaders]: File "/opt/swh/.local/lib/python3.10/site-packages/mercurial/changegroup.py", line 547, in applyswh/loader-hg-checkout-5cc5cb874-6fwns[loaders]: if not cl.addgroup(swh/loader-hg-checkout-5cc5cb874-6fwns[loaders]: File "/opt/swh/.local/lib/python3.10/site-packages/mercurial/revlog.py", line 3561, in addgroupswh/loader-hg-checkout-5cc5cb874-6fwns[loaders]: for data in deltas:swh/loader-hg-checkout-5cc5cb874-6fwns[loaders]: File "/opt/swh/.local/lib/python3.10/site-packages/mercurial/changegroup.py", line 779, in deltaiterswh/loader-hg-checkout-5cc5cb874-6fwns[loaders]: for chunkdata in iter(lambda: self.deltachunk(chain), {}):swh/loader-hg-checkout-5cc5cb874-6fwns[loaders]: File "/opt/swh/.local/lib/python3.10/site-packages/mercurial/changegroup.py", line 779, in <lambda>swh/loader-hg-checkout-5cc5cb874-6fwns[loaders]: for chunkdata in iter(lambda: self.deltachunk(chain), {}):swh/loader-hg-checkout-5cc5cb874-6fwns[loaders]: File "/opt/swh/.local/lib/python3.10/site-packages/mercurial/changegroup.py", line 365, in deltachunkswh/loader-hg-checkout-5cc5cb874-6fwns[loaders]: delta = readexactly(self._stream, l - self.deltaheadersize)swh/loader-hg-checkout-5cc5cb874-6fwns[loaders]: File "/opt/swh/.local/lib/python3.10/site-packages/mercurial/util.py", line 3174, in readexactlyswh/loader-hg-checkout-5cc5cb874-6fwns[loaders]: s = 
stream.read(n)swh/loader-hg-checkout-5cc5cb874-6fwns[loaders]: File "/opt/swh/.local/lib/python3.10/site-packages/mercurial/bundle2.py", line 1481, in readswh/loader-hg-checkout-5cc5cb874-6fwns[loaders]: data = self._payloadstream.read(size)swh/loader-hg-checkout-5cc5cb874-6fwns[loaders]: File "/opt/swh/.local/lib/python3.10/site-packages/mercurial/util.py", line 2713, in readswh/loader-hg-checkout-5cc5cb874-6fwns[loaders]: for chunk in self.iter:swh/loader-hg-checkout-5cc5cb874-6fwns[loaders]: File "/opt/swh/.local/lib/python3.10/site-packages/mercurial/util.py", line 2684, in splitbigswh/loader-hg-checkout-5cc5cb874-6fwns[loaders]: for chunk in chunks:swh/loader-hg-checkout-5cc5cb874-6fwns[loaders]: File "/opt/swh/.local/lib/python3.10/site-packages/mercurial/bundle2.py", line 1342, in decodepayloadchunksswh/loader-hg-checkout-5cc5cb874-6fwns[loaders]: raise error.Abort(swh/loader-hg-checkout-5cc5cb874-6fwns[loaders]: mercurial.error.Abort: stream ended unexpectedly (got 29127 bytes, expected 32768)swh/loader-hg-checkout-5cc5cb874-6fwns[loaders]: {"asctime": "2024-02-26 17:04:53,240", "threadName": "MainThread", "pathname": "/opt/swh/.local/lib/python3.10/site-packages/swh/loader/core/loader.py", "lineno": 513, "funcName": "load", "task_name": null, "task_id": null, "name": "swh.loader.mercurial.directory.HgCheckoutLoader", "levelname": "ERROR", "message": "Loading failure, updating to `not_found` status", "exc_info": "Traceback (most recent call last):\n File \"/opt/swh/.local/lib/python3.10/site-packages/swh/loader/mercurial/utils.py\", line 40, in raise_not_found_repository\n yield\n File \"/opt/swh/.local/lib/python3.10/site-packages/swh/loader/mercurial/directory.py\", line 40, in clone_repository\n clone(repo_url, str(local_clone_dir), rev=hg_changeset)\n File \"/opt/swh/.local/lib/python3.10/site-packages/swh/loader/mercurial/hgutil.py\", line 132, in clone\n clone_with_timeout(src, dest, closure, timeout)\n File 
\"/opt/swh/.local/lib/python3.10/site-packages/swh/loader/core/utils.py\", line 112, in clone_with_timeout\n raise CloneFailure(src, dest, errors.get())\nswh.loader.core.utils.CloneFailure: ('http://hg.openjdk.java.net/openjfx/8u-dev/rt', '/tmp/tmpmo7w5w0_-2024-02-26T16:56:39.393668/rt', 'Traceback (most recent call last):\\n File \"/opt/swh/.local/lib/python3.10/site-packages/swh/loader/core/utils.py\", line 71, in _clone_task\\n clone_func()\\n File \"/opt/swh/.local/lib/python3.10/site-packages/mercurial/hg.py\", line 1016, in clone\\n exchange.pull(\\n File \"/opt/swh/.local/lib/python3.10/site-packages/mercurial/exchange.py\", line 1713, in pull\\n _fullpullbundle2(repo, pullop)\\n File \"/opt/swh/.local/lib/python3.10/site-packages/mercurial/exchange.py\", line 1564, in _fullpullbundle2\\n _pullbundle2(pullop)\\n File \"/opt/swh/.local/lib/python3.10/site-packages/mercurial/exchange.py\", line 1923, in _pullbundle2\\n bundle2.processbundle(\\n File \"/opt/swh/.local/lib/python3.10/site-packages/mercurial/bundle2.py\", line 508, in processbundle\\n processparts(repo, op, unbundler)\\n File \"/opt/swh/.local/lib/python3.10/site-packages/mercurial/bundle2.py\", line 514, in processparts\\n with partiterator(repo, op, unbundler) as parts:\\n File \"/opt/swh/.local/lib/python3.10/site-packages/mercurial/bundle2.py\", line 457, in __exit__\\n raise exc\\n File \"/opt/swh/.local/lib/python3.10/site-packages/mercurial/bundle2.py\", line 516, in processparts\\n _processpart(op, part)\\n File \"/opt/swh/.local/lib/python3.10/site-packages/mercurial/bundle2.py\", line 594, in _processpart\\n handler(op, part)\\n File \"/opt/swh/.local/lib/python3.10/site-packages/mercurial/bundle2.py\", line 2092, in handlechangegroup\\n ret = _processchangegroup(\\n File \"/opt/swh/.local/lib/python3.10/site-packages/mercurial/bundle2.py\", line 524, in _processchangegroup\\n ret = cg.apply(op.repo, tr, source, url, **kwargs)\\n File 
\"/opt/swh/.local/lib/python3.10/site-packages/mercurial/changegroup.py\", line 547, in apply\\n if not cl.addgroup(\\n File \"/opt/swh/.local/lib/python3.10/site-packages/mercurial/revlog.py\", line 3561, in addgroup\\n for data in deltas:\\n File \"/opt/swh/.local/lib/python3.10/site-packages/mercurial/changegroup.py\", line 779, in deltaiter\\n for chunkdata in iter(lambda: self.deltachunk(chain), {}):\\n File \"/opt/swh/.local/lib/python3.10/site-packages/mercurial/changegroup.py\", line 779, in <lambda>\\n for chunkdata in iter(lambda: self.deltachunk(chain), {}):\\n File \"/opt/swh/.local/lib/python3.10/site-packages/mercurial/changegroup.py\", line 365, in deltachunk\\n delta = readexactly(self._stream, l - self.deltaheadersize)\\n File \"/opt/swh/.local/lib/python3.10/site-packages/mercurial/util.py\", line 3174, in readexactly\\n s = stream.read(n)\\n File \"/opt/swh/.local/lib/python3.10/site-packages/mercurial/bundle2.py\", line 1481, in read\\n data = self._payloadstream.read(size)\\n File \"/opt/swh/.local/lib/python3.10/site-packages/mercurial/util.py\", line 2713, in read\\n for chunk in self.iter:\\n File \"/opt/swh/.local/lib/python3.10/site-packages/mercurial/util.py\", line 2684, in splitbig\\n for chunk in chunks:\\n File \"/opt/swh/.local/lib/python3.10/site-packages/mercurial/bundle2.py\", line 1342, in decodepayloadchunks\\n raise error.Abort(\\nmercurial.error.Abort: stream ended unexpectedly (got 29127 bytes, expected 32768)\\n')\n\nDuring handling of the above exception, another exception occurred:\n\nTraceback (most recent call last):\n File \"/opt/swh/.local/lib/python3.10/site-packages/swh/loader/core/loader.py\", line 451, in load\n more_data_to_fetch = self.fetch_data()\n File \"/opt/swh/.local/lib/python3.10/site-packages/swh/loader/core/loader.py\", line 799, in fetch_data\n for artifact_path in self.fetch_artifact():\n File \"/opt/swh/.local/lib/python3.10/site-packages/swh/loader/mercurial/directory.py\", line 81, in 
fetch_artifact\n repo = clone_repository(\n File \"/opt/swh/.local/lib/python3.10/site-packages/swh/loader/mercurial/directory.py\", line 39, in clone_repository\n with raise_not_found_repository():\n File \"/usr/local/lib/python3.10/contextlib.py\", line 153, in __exit__\n self.gen.throw(typ, value, traceback)\n File \"/opt/swh/.local/lib/python3.10/site-packages/swh/loader/mercurial/utils.py\", line 42, in raise_not_found_repository\n raise NotFound(e)\nswh.loader.exception.NotFound: ('http://hg.openjdk.java.net/openjfx/8u-dev/rt', '/tmp/tmpmo7w5w0_-2024-02-26T16:56:39.393668/rt', 'Traceback (most recent call last):\\n File \"/opt/swh/.local/lib/python3.10/site-packages/swh/loader/core/utils.py\", line 71, in _clone_task\\n clone_func()\\n File \"/opt/swh/.local/lib/python3.10/site-packages/mercurial/hg.py\", line 1016, in clone\\n exchange.pull(\\n File \"/opt/swh/.local/lib/python3.10/site-packages/mercurial/exchange.py\", line 1713, in pull\\n _fullpullbundle2(repo, pullop)\\n File \"/opt/swh/.local/lib/python3.10/site-packages/mercurial/exchange.py\", line 1564, in _fullpullbundle2\\n _pullbundle2(pullop)\\n File \"/opt/swh/.local/lib/python3.10/site-packages/mercurial/exchange.py\", line 1923, in _pullbundle2\\n bundle2.processbundle(\\n File \"/opt/swh/.local/lib/python3.10/site-packages/mercurial/bundle2.py\", line 508, in processbundle\\n processparts(repo, op, unbundler)\\n File \"/opt/swh/.local/lib/python3.10/site-packages/mercurial/bundle2.py\", line 514, in processparts\\n with partiterator(repo, op, unbundler) as parts:\\n File \"/opt/swh/.local/lib/python3.10/site-packages/mercurial/bundle2.py\", line 457, in __exit__\\n raise exc\\n File \"/opt/swh/.local/lib/python3.10/site-packages/mercurial/bundle2.py\", line 516, in processparts\\n _processpart(op, part)\\n File \"/opt/swh/.local/lib/python3.10/site-packages/mercurial/bundle2.py\", line 594, in _processpart\\n handler(op, part)\\n File 
\"/opt/swh/.local/lib/python3.10/site-packages/mercurial/bundle2.py\", line 2092, in handlechangegroup\\n ret = _processchangegroup(\\n File \"/opt/swh/.local/lib/python3.10/site-packages/mercurial/bundle2.py\", line 524, in _processchangegroup\\n ret = cg.apply(op.repo, tr, source, url, **kwargs)\\n File \"/opt/swh/.local/lib/python3.10/site-packages/mercurial/changegroup.py\", line 547, in apply\\n if not cl.addgroup(\\n File \"/opt/swh/.local/lib/python3.10/site-packages/mercurial/revlog.py\", line 3561, in addgroup\\n for data in deltas:\\n File \"/opt/swh/.local/lib/python3.10/site-packages/mercurial/changegroup.py\", line 779, in deltaiter\\n for chunkdata in iter(lambda: self.deltachunk(chain), {}):\\n File \"/opt/swh/.local/lib/python3.10/site-packages/mercurial/changegroup.py\", line 779, in <lambda>\\n for chunkdata in iter(lambda: self.deltachunk(chain), {}):\\n File \"/opt/swh/.local/lib/python3.10/site-packages/mercurial/changegroup.py\", line 365, in deltachunk\\n delta = readexactly(self._stream, l - self.deltaheadersize)\\n File \"/opt/swh/.local/lib/python3.10/site-packages/mercurial/util.py\", line 3174, in readexactly\\n s = stream.read(n)\\n File \"/opt/swh/.local/lib/python3.10/site-packages/mercurial/bundle2.py\", line 1481, in read\\n data = self._payloadstream.read(size)\\n File \"/opt/swh/.local/lib/python3.10/site-packages/mercurial/util.py\", line 2713, in read\\n for chunk in self.iter:\\n File \"/opt/swh/.local/lib/python3.10/site-packages/mercurial/util.py\", line 2684, in splitbig\\n for chunk in chunks:\\n File \"/opt/swh/.local/lib/python3.10/site-packages/mercurial/bundle2.py\", line 1342, in decodepayloadchunks\\n raise error.Abort(\\nmercurial.error.Abort: stream ended unexpectedly (got 29127 bytes, expected 32768)\\n')", "swh_task_args": [], "swh_task_kwargs": {"origin": "http://hg.openjdk.java.net/openjfx/8u-dev/rt", "lister_name": null, "lister_instance_name": null}}swh/loader-hg-checkout-5cc5cb874-6fwns[loaders]: {"asctime": 
"2024-02-26 17:04:53,310", "threadName": "MainThread", "pathname": "/opt/swh/.local/lib/python3.10/site-packages/celery/app/trace.py", "lineno": 131, "funcName": "info", "task_name": null, "task_id": null, "name": "celery.app.trace", "levelname": "INFO", "message": "Task swh.loader.mercurial.tasks.LoadMercurialCheckout[269fab60-7404-4c80-a3d0-04b937fdebd2] succeeded in 494.68754841946065s: {'status': 'uneventful'}", "data": {"id": "269fab60-7404-4c80-a3d0-04b937fdebd2", "name": "swh.loader.mercurial.tasks.LoadMercurialCheckout", "return_value": "{'status': 'uneventful'}", "runtime": 494.68754841946065, "args": "()", "kwargs": "{'ref': '8u202-ga', 'url': 'http://hg.openjdk.java.net/openjfx/8u-dev/rt', 'checksums': {'sha256': '2e15215de59feb86687a3da9d02f8902ed38ba01d3249b36635cef787945e379'}, 'checksum_layout': 'nar'}"}}
What we did is fix the tree in the archive (with the next visit, the snapshot now targets a proper representation of the tree).
That vault entry, since it targets a SWHID, must be invalidated and then cooked again.
I'll see what I can do.
Antoine R. Dumont changed title from Mercurial checkouts obtained from the Vault contain '.hg' directory to Mercurial checkouts origins contain '.hg' directory