
mercurial.loader: Make it run within docker

  • [1] Without this diff, the mercurial loader fails when run within docker, with several errors appearing one after another:
  • TypeError: can not serialize 'map' object
  • TypeError: can not serialize 'set' object

So this diff fixes those:

  • a map is not accepted when calling storage.content_missing
  • sets are not accepted when calling storage.{revision|release}_missing
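
To illustrate the failure mode, here is a minimal sketch. It assumes the storage client serializes its arguments before sending them over the wire; the standard-library json module stands in for the real serializer, and to_wire and the sample hashes are hypothetical names, not part of swh.storage. It only shows that lazy map objects and sets are rejected by such a serializer while plain lists go through.

```python
import json


def to_wire(payload):
    """Hypothetical stand-in for the serialization step of a remote storage call."""
    return json.dumps(payload)


hashes = ["aa", "bb", "aa"]  # made-up sample identifiers

lazy = map(str.upper, hashes)  # one-shot iterator, not a sequence
unique = set(hashes)           # unordered collection

rejected = []
for bad in (lazy, unique):
    try:
        to_wire(bad)
    except TypeError:
        # The serializer only knows basic containers such as list and dict.
        rejected.append(type(bad).__name__)

# Materializing into a list before the call makes the payload serializable.
wire_list = to_wire(list(map(str.upper, hashes)))
wire_set = to_wire(sorted(set(hashes)))

print(rejected)   # ['map', 'set']
print(wire_list)  # ["AA", "BB", "AA"]
print(wire_set)   # ["aa", "bb"]
```

A storage backend that never serializes its arguments would accept a map or a set without complaint, which may be why the tests pass, but that is a guess and not something verified here.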

No idea why the tests do not catch any of those issues, though. I'm just unblocking this so people can run it within docker.

  • [1] The initial problem was along those lines (exactly like D3258#79482):
- swh.core.api.RemoteException: <RemoteException 500 AttributeError: ["'dict' object has no attribute 'url'"]>

where the self.origin being written to storage was a dict instead of an Origin model object [1].

That error is now gone with the current loader-core (at least v0.2.0).

Test Plan

tox + run on docker:

docker-compose.override.yml:

version: '2'

services:
  swh-loader:
    volumes:
      # - "$SWH_ENVIRONMENT_HOME/swh-loader-core:/src/swh-loader-core"
      - "$SWH_ENVIRONMENT_HOME/swh-loader-mercurial:/src/swh-loader-mercurial"

$ doco up
$ doco exec swh-loader swh loader run mercurial https://www.mercurial-scm.org/repo/evolve/

Finally:

$ time doco exec swh-loader swh loader run mercurial https://www.mercurial-scm.org/repo/evolve/
WARNING:swh.core.cli:Could not load subcommand search: cannot import name 'get_journal_client' from 'swh.journal.cli' (/srv/softwareheritage/venv/lib/python3.7/site-packages/swh/journal/cli.py)
INFO:swh.core.config:Loading config file /loader.yml
WARNING:swh.loader.mercurial.Bundle20Loader:No matching revision for tag 5.6.1 (hg changeset: 70694b2621ba9d919bc38303f8901e84caf5da0f). Skipping
{'status': 'eventful'}
docker-compose exec swh-loader swh loader run mercurial   0.59s user 0.61s system 0% cpu 2:10.59 total
$ time doco exec swh-loader swh loader run mercurial https://www.mercurial-scm.org/repo/evolve/
WARNING:swh.core.cli:Could not load subcommand search: cannot import name 'get_journal_client' from 'swh.journal.cli' (/srv/softwareheritage/venv/lib/python3.7/site-packages/swh/journal/cli.py)
INFO:swh.core.config:Loading config file /loader.yml
WARNING:swh.loader.mercurial.Bundle20Loader:No matching revision for tag 5.6.1 (hg changeset: 70694b2621ba9d919bc38303f8901e84caf5da0f). Skipping
{'status': 'uneventful'}
docker-compose exec swh-loader swh loader run mercurial   0.59s user 0.53s system 2% cpu 40.954 total

Migrated from D3258 (view on Phabricator)
