
Refactor output of indexer storage's `get` methods.

In the Indexer Storage API, most get methods (e.g. content_ctags_get) yield items in this format:

{"id": sha1, "tool": TOOL, "ctags": ctags1}
{"id": sha1, "tool": TOOL, "ctags": ctags2}

Starting with T782/!414 (closed), content_fossology_license_get yields items in this format:

{sha1: {"tool": TOOL, "licenses": [license1, license2]}}

This task is twofold:

  • first, change content_fossology_license_get to return a single dictionary instead of yielding dictionaries that each hold a single key-value pair
  • second, refactor the other _get methods to use the same format.
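
To make the two steps concrete, here is a minimal sketch of the reshaping involved. The helper names merge_license_results and group_by_id are hypothetical and not part of swh-indexer. The first helper assumes each sha1 appears at most once. The second shows only one possible reading of "the same format" for methods that return several rows per id (a list of rows per sha1), which the task description does not spell out.

```python
def merge_license_results(items):
    """Collapse an iterable of single-key {sha1: data} dicts into one dict.

    Assumption: each sha1 appears at most once. A repeated key raises
    ValueError rather than being silently overwritten.
    """
    merged = {}
    for item in items:
        for sha1, data in item.items():
            if sha1 in merged:
                raise ValueError(f"duplicate id: {sha1!r}")
            merged[sha1] = data
    return merged


def group_by_id(items):
    """Regroup flat {"id": sha1, ...} rows into {sha1: [row_without_id, ...]}.

    Assumption: rows sharing an id are kept as a list, in input order.
    """
    grouped = {}
    for item in items:
        # Drop the "id" field, since it becomes the dictionary key.
        row = {key: value for key, value in item.items() if key != "id"}
        grouped.setdefault(item["id"], []).append(row)
    return grouped
```

With the ctags example above, group_by_id would turn the two rows for the same sha1 into one entry whose value is a list of two {"tool": ..., "ctags": ...} dictionaries.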

The files that should be edited are:

  • swh/indexer/tests/storage/test_storage.py: these are the test cases for both Indexer Storage implementations. They should be adapted to test for the new format.
  • swh/indexer/storage/in_memory.py: a fully in-memory implementation of the Indexer Storage. This is the easiest implementation to start with.
  • swh/indexer/storage/__init__.py and swh/indexer/storage/converters.py: an implementation of the Indexer Storage backed by PostgreSQL. See !414 (closed) for examples of how to do it.

Migrated from T1433 (view on Phabricator)
