
loader-pypi: Snapshots with null branches are badly handled by the loader

The PyPI loader still raises errors caused by None references.

Sample stacktrace for origin https://pypi.org/project/configpy/

[2018-11-29 08:38:27,300: ERROR/Worker-1903] Loading failure, updating to `partial` status
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/swh/loader/core/loader.py", line 886, in load
    self.prepare(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/swh/loader/pypi/loader.py", line 152, in prepare
    self._prepare_state()
  File "/usr/lib/python3/dist-packages/swh/loader/pypi/loader.py", line 161, in _prepare_state
    self.known_artifacts = self._known_artifacts(last_snapshot)
  File "/usr/lib/python3/dist-packages/swh/loader/pypi/loader.py", line 115, in _known_artifacts
    for rev in last_snapshot['branches'].values()
  File "/usr/lib/python3/dist-packages/swh/loader/pypi/loader.py", line 116, in <listcomp>
    if rev['target_type'] == 'revision']
TypeError: 'NoneType' object is not subscriptable

The Kibana dashboard is at [1].

Its snapshot [2] feels wrong:

{'branches': {b'HEAD': {'target': b'releases/0.5', 'target_type': 'alias'},
              b'releases/0.2': None,
              b'releases/0.3': None,
              b'releases/0.4': None,
              b'releases/0.5': {'target': b'\xa5\x978\x92;S{\xf6\xcfE\xa9r'
                                          b'\xf0\xdd\x85e\xe8Z\x1d\x80',
                                'target_type': 'revision'}},
 'id': b'\xee<\xdfq\xc76\xec\xa2Dn?\x8aE\xb3\xc3\xe0I\x87H\xb0',
 'next_branch': None}

The snapshot above was retrieved with:

from pprint import pprint

from swh.storage import get_storage

# Remote storage configuration
c = {'storage': {
  'cls': 'remote',
  'args': {'url': 'http://uffizi.internal.softwareheritage.org:5002/'}
  }
}
storage = get_storage(**c['storage'])

# Look up the origin, then fetch its latest snapshot
origin = storage.origin_get({'type': 'pypi', 'url': 'https://pypi.org/project/configpy/'})
origin_id = origin['id']

snap = storage.snapshot_get_latest(origin_id)
pprint(snap)

Note: swh-loader-pypi!19 (closed) fixed a bug in snapshot resolution; most probably, few snapshots were actually used before that fix. The fix surfaced new errors, which swh-loader-pypi!20 (closed) addressed by catching most errors raised while resolving the snapshot revisions. That in turn revealed that some snapshots are badly formed: I did not expect snapshot branches to target None.
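
For illustration, here is a minimal sketch of a None-safe version of the filter that fails in the traceback. It is not the actual fix applied to swh-loader-pypi: the function name `revision_targets` is hypothetical, the sample snapshot is a shortened copy of the one above, and the revision id is a truncated placeholder.

```python
def revision_targets(snapshot):
    """Return the targets of all revision branches in a snapshot.

    Branches whose value is None (dangling branches) are skipped
    instead of raising a TypeError. Hypothetical helper, for illustration.
    """
    if not snapshot:
        return []
    return [
        branch['target']
        for branch in snapshot.get('branches', {}).values()
        # Guard against branches that target None before subscripting
        if branch is not None and branch['target_type'] == 'revision'
    ]


# Shortened sample modelled on the snapshot above (placeholder revision id)
snapshot = {
    'branches': {
        b'HEAD': {'target': b'releases/0.5', 'target_type': 'alias'},
        b'releases/0.2': None,
        b'releases/0.5': {'target': b'\xa5\x97', 'target_type': 'revision'},
    },
}

print(revision_targets(snapshot))
```

Whether None branches should simply be skipped by the loader, or never written to the snapshot in the first place, remains an open question for this task.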


Migrated from T1396 (view on Phabricator)