Implement mimetype indexer using range of sha1s

This implements the new mimetype range indexer only. The previous indexers (base class ContentIndexer, and ContentMimetypeIndexer) are still present.

The base class stays because the other indexers that have not been migrated yet still need it (fossology license is next; language and ctags remain).

ContentMimetypeIndexer stays because I'm not 100% certain we really want to drop it entirely. It could be reused if we wanted a journal client dedicated to indexing new contents.
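For reviewers unfamiliar with the range approach, here is a minimal sketch of how the sha1 space could be split into contiguous ranges, each of which a range indexer would then process independently. The helper name `sha1_ranges` and the even split are assumptions for illustration only; they are not taken from this diff.

```python
def sha1_ranges(nb_ranges):
    """Yield (start, end) bounds, as 20-byte values, that together cover
    the whole sha1 space. Both bounds are inclusive.

    Hypothetical helper: the name and the even split are assumptions.
    """
    if nb_ranges < 1:
        raise ValueError("nb_ranges must be at least 1")
    space = 1 << 160  # a sha1 is 160 bits long
    step = space // nb_ranges
    for i in range(nb_ranges):
        start = i * step
        # The last range absorbs the remainder of the division.
        end = space - 1 if i == nb_ranges - 1 else (i + 1) * step - 1
        yield start.to_bytes(20, "big"), end.to_bytes(20, "big")


ranges = list(sha1_ranges(4))
```

Each `(start, end)` pair could then be handed to one indexing task, so the whole archive is covered without listing individual contents up front.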

Note:

  • The tests need to be rewritten to reuse in-memory storages (and ideally data generation). This is partially done, but it's not good enough yet. I did not want to start on it right now because I want to deploy! It needs to be done nonetheless at some point (I'll open a task).

  • I created a new type of task named swh.indexer.tasks.StatusTask, which returns the task's status. That's more consistent with all other swh tasks (outside swh-indexer). We need to align on this behavior.
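To make the status-returning behavior concrete, here is a rough sketch of what such a task could look like. It is deliberately framework-free: the method names `run_task` and `run`, the `{'status': ...}` dictionary shape, and the `eventful` / `uneventful` / `failed` values are assumptions for illustration, not the actual code of swh.indexer.tasks.StatusTask.

```python
class StatusTask:
    """Hypothetical base task that reports its outcome as a status dict."""

    def run_task(self, *args, **kwargs):
        """Do the actual work; return True if something was indexed."""
        raise NotImplementedError

    def run(self, *args, **kwargs):
        try:
            eventful = self.run_task(*args, **kwargs)
        except Exception:
            # Assumed convention: any error is reported as a failed status.
            return {"status": "failed"}
        return {"status": "eventful" if eventful else "uneventful"}


class DemoTask(StatusTask):
    """Toy subclass: 'eventful' when it receives at least one item."""

    def run_task(self, items):
        return len(items) > 0


class BrokenTask(StatusTask):
    """Toy subclass that always raises, to show the failure path."""

    def run_task(self):
        raise RuntimeError("boom")
```

With this shape, callers can inspect the returned dictionary instead of relying on side effects, which is the alignment point raised above.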

Related #991 (closed)

Test Plan

tox


Migrated from D660 (view on Phabricator)
