
packagist: Fix json parsing which is different depending on page

Merged Antoine R. Dumont requested to merge ardumont/swh-lister:fix-json-format-parsing into master
1 unresolved thread

Detected by a crash in the current run on the latest version [1] (and reproduced on docker).

The test dataset mixes the previous JSON format (returned by the /p/ and /packages/ URLs) with the new one (/p2/ URL), so the tests are consistent with what the packagist API can return.

This time, I checked with docker and it no longer crashes on that error (run still ongoing).

[1]

listers {"asctime": "2023-08-02 14:19:25,040", "threadName": "MainThread", "pathname": "/opt/swh/.local/lib/python3.10/site-packages/celery/app/trace.py", "lineno": 270, "funcName": "_log_error", "task_name": null, "task_id": null, "name":
 "celery.app.trace", "levelname": "ERROR", "message": "Task swh.lister.packagist.tasks.PackagistListerTask[906772b3-22d1-43e4-8248-7ff34cf1bfb2] raised unexpected: AttributeError(\"'list' object has no attribute 'values'\")", "exc_info": "
Traceback (most recent call last):\n  File \"/opt/swh/.local/lib/python3.10/site-packages/celery/app/trace.py\", line 477, in trace_task\n    R = retval = fun(*args, **kwargs)\n  File \"/opt/swh/.local/lib/python3.10/site-packages/swh/sche
duler/task.py\", line 61, in __call__\n    result = super().__call__(*args, **kwargs)\n  File \"/opt/swh/.local/lib/python3.10/site-packages/celery/app/trace.py\", line 760, in __protected_call__\n    return self.run(*args, **kwargs)\n  Fi
le \"/opt/swh/.local/lib/python3.10/site-packages/swh/lister/packagist/tasks.py\", line 13, in list_packagist\n    return PackagistLister.from_configfile(**lister_args).run().dict()\n  File \"/opt/swh/.local/lib/python3.10/site-packages/sw
h/lister/pattern.py\", line 222, in run\n    for origin in self.get_origins_from_page(page):\n  File \"/opt/swh/.local/lib/python3.10/site-packages/swh/lister/packagist/lister.py\", line 203, in get_origins_from_page\n    versions_info = s
elf._get_metadata_for_package(package_name)\n  File \"/opt/swh/.local/lib/python3.10/site-packages/swh/lister/packagist/lister.py\", line 177, in _get_metadata_for_package\n    meta_info = self._get_metadata_from_page(package_url_format, p
ackage_name)\n  File \"/opt/swh/.local/lib/python3.10/site-packages/swh/lister/packagist/lister.py\", line 153, in _get_metadata_from_page\n    return package_info.values()  # could be an empty response though -> []\nAttributeError: 'list'
 object has no attribute 'values'", "data": {"hostname": "lister@lister-packagist-547f8bc56d-5rbrc", "id": "906772b3-22d1-43e4-8248-7ff34cf1bfb2", "name": "swh.lister.packagist.tasks.PackagistListerTask", "exc": "AttributeError(\"'list' ob
ject has no attribute 'values'\")", "traceback": "Traceback (most recent call last):\n  File \"/opt/swh/.local/lib/python3.10/site-packages/celery/app/trace.py\", line 477, in trace_task\n    R = retval = fun(*args, **kwargs)\n  File \"/op
t/swh/.local/lib/python3.10/site-packages/swh/scheduler/task.py\", line 61, in __call__\n    result = super().__call__(*args, **kwargs)\n  File \"/opt/swh/.local/lib/python3.10/site-packages/celery/app/trace.py\", line 760, in __protected_
call__\n    return self.run(*args, **kwargs)\n  File \"/opt/swh/.local/lib/python3.10/site-packages/swh/lister/packagist/tasks.py\", line 13, in list_packagist\n    return PackagistLister.from_configfile(**lister_args).run().dict()\n  File
 \"/opt/swh/.local/lib/python3.10/site-packages/swh/lister/pattern.py\", line 222, in run\n    for origin in self.get_origins_from_page(page):\n  File \"/opt/swh/.local/lib/python3.10/site-packages/swh/lister/packagist/lister.py\", line 20
3, in get_origins_from_page\n    versions_info = self._get_metadata_for_package(package_name)\n  File \"/opt/swh/.local/lib/python3.10/site-packages/swh/lister/packagist/lister.py\", line 177, in _get_metadata_for_package\n    meta_info =
self._get_metadata_from_page(package_url_format, package_name)\n  File \"/opt/swh/.local/lib/python3.10/site-packages/swh/lister/packagist/lister.py\", line 153, in _get_metadata_from_page\n    return package_info.values()  # could be an e
mpty response though -> []\nAttributeError: 'list' object has no attribute 'values'\n", "args": "[]", "kwargs": "{}", "description": "raised unexpected", "internal": false}}
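The traceback above ends in `package_info.values()` being called on a list. As a minimal sketch of handling both response shapes (not the code from this merge request), the helper below assumes the older URLs return a mapping of version to metadata, while the /p2/ URL returns a plain list of metadata entries. The name `normalize_versions` and the sample payloads are hypothetical.

```python
from typing import Any, Dict, List, Union

# Assumed shapes: a dict keyed by version (old format) or a list (new format).
PackageInfo = Union[Dict[str, Dict[str, Any]], List[Dict[str, Any]]]


def normalize_versions(package_info: PackageInfo) -> List[Dict[str, Any]]:
    """Return the version entries as a list, whatever the response shape."""
    if isinstance(package_info, dict):
        # Assumed old format (/p/, /packages/): version -> metadata mapping.
        return list(package_info.values())
    if isinstance(package_info, list):
        # Assumed new format (/p2/): already a list, possibly empty.
        return list(package_info)
    raise TypeError(f"unexpected package info type: {type(package_info).__name__}")


# Hypothetical sample payloads, one per format.
old_style = {"1.0.0": {"version": "1.0.0"}, "dev-master": {"version": "dev-master"}}
new_style = [{"version": "1.0.0"}, {"version": "dev-master"}]

print(normalize_versions(old_style) == normalize_versions(new_style))
```

Checking the type explicitly, rather than catching `AttributeError`, keeps an unexpected payload from being silently treated as empty.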

Refs. swh/meta#5001 (closed)

Edited by Antoine R. Dumont

Activity

    • The two sides of that "and" seem redundant.

      Does the test data properly handle both response types?

    • Author Maintainer

      Yes, maybe. I tried to prevent mypy from annoying me. I did not actually check whether that was useful.

      "Does the test data properly handle both response types?"

      Heh, I was answering that in the description while you were reviewing ;) Yes, both the old format and the new one are present.

  • Nicolas Dandrimont approved this merge request

  • Antoine R. Dumont changed the description

  • Jenkins job DLS/gitlab-builds #171 succeeded.
    See Console Output and Coverage Report for more details.

  • added 1 commit

    • 903ff367 - packagist: Fix json parsing which is different depending on page

  • Jenkins job DLS/gitlab-builds #172 succeeded.
    See Console Output and Coverage Report for more details.

  • Antoine R. Dumont changed the description

  • Antoine R. Dumont changed the description
