npm.client: Ensure package.json parsing

Ensure the package.json file can be parsed even when its content cannot be properly decoded because its encoding was not correctly detected.

To do so, try to decode the content as utf-8 first, then fall back to the encoding detected by chardet, using the replace error handling to substitute characters that cannot be decoded.

Even if the package.json content cannot be loaded correctly, this is not critical, as this data is only added to the metadata of a swh revision. The original package.json file can still be obtained from the archive content.
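The decoding strategy described above can be sketched roughly as follows. This is a minimal illustration, not the code of this change: the helper name `parse_package_json` is hypothetical, the use of `utf-8-sig` (so that a leading BOM is dropped) is an assumption, and so is returning an empty dict when the content still cannot be parsed.

```python
import json

try:
    import chardet  # third-party encoding detector named in the description
except ImportError:  # keep the sketch runnable when chardet is not installed
    chardet = None


def parse_package_json(raw: bytes) -> dict:
    """Hypothetical helper: best-effort parsing of package.json bytes."""
    try:
        # Assumption: "utf-8-sig" decodes like utf-8 but also strips a BOM.
        text = raw.decode("utf-8-sig")
    except UnicodeDecodeError:
        # Fall back to the detected encoding; detection may yield None.
        guess = chardet.detect(raw)["encoding"] if chardet else None
        # errors="replace" substitutes U+FFFD for bytes that cannot be decoded.
        text = raw.decode(guess or "utf-8-sig", errors="replace")
    try:
        data = json.loads(text)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        # Assumption: a failed parse is not fatal, so return empty metadata.
        return {}
    return data if isinstance(data, dict) else {}
```

A caller could pass the bytes read from the extracted package.json, for example `parse_package_json(package_json_bytes)`, and store whatever dict comes back as revision metadata.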

This should fix the following kinds of reported errors:

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/swh/loader/core/loader.py", line 895, in load
    more_data_to_fetch = self.fetch_data()
  File "/usr/lib/python3/dist-packages/swh/loader/npm/loader.py", line 203, in fetch_data
    data = next(self.new_versions)
  File "/usr/lib/python3/dist-packages/swh/loader/npm/client.py", line 149, in prepare_package_versions
    version_data)
  File "/usr/lib/python3/dist-packages/swh/loader/npm/client.py", line 207, in _prepare_package_version
    package_json = json.loads(package_json_bytes.decode(file_encoding))
  File "/usr/lib/python3.5/encodings/cp1254.py", line 15, in decode
    return codecs.charmap_decode(input,errors,decoding_table)
UnicodeDecodeError: 'charmap' codec can't decode byte 0x8f in position 42: character maps to <undefined>

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/swh/loader/core/loader.py", line 895, in load
    more_data_to_fetch = self.fetch_data()
  File "/usr/lib/python3/dist-packages/swh/loader/npm/loader.py", line 203, in fetch_data
    data = next(self.new_versions)
  File "/usr/lib/python3/dist-packages/swh/loader/npm/client.py", line 145, in prepare_package_versions
    version_data)
  File "/usr/lib/python3/dist-packages/swh/loader/npm/client.py", line 197, in _prepare_package_version
    package_json = json.load(package_json_file)
  File "/usr/lib/python3.5/json/__init__.py", line 268, in load
    parse_constant=parse_constant, object_pairs_hook=object_pairs_hook, **kw)
  File "/usr/lib/python3.5/json/__init__.py", line 315, in loads
    s, 0)
json.decoder.JSONDecodeError: Unexpected UTF-8 BOM (decode using utf-8-sig): line 1 column 1 (char 0)

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/swh/loader/core/loader.py", line 895, in load
    more_data_to_fetch = self.fetch_data()
  File "/usr/lib/python3/dist-packages/swh/loader/npm/loader.py", line 203, in fetch_data
    data = next(self.new_versions)
  File "/usr/lib/python3/dist-packages/swh/loader/npm/client.py", line 149, in prepare_package_versions
    version_data)
  File "/usr/lib/python3/dist-packages/swh/loader/npm/client.py", line 204, in _prepare_package_version
    with open(package_json_path, 'rb') as package_json_file:
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/swh.loader.npm/swh.loader.npm.jrx67u3_-2344/@lpmraven/link-components/0.1.1/package/package.json'

Related swh-loader-core#1726


Migrated from D1498 (view on Phabricator)
