Simplify indexer design: move away from the pipeline approach
The pipeline approach does not scale well when it comes to scheduling. We cannot easily schedule the indexer: it currently runs through a fork of the main scheduler, but that setup is incomplete and the input still comes from a db extract.
By moving towards a range approach, we will be able to schedule finite ranges (for content at least). Adding a new indexer will then just be a matter of adding another task type and the same number of finite ranges.
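To make the range idea concrete, here is a minimal sketch of how the id space could be cut into a fixed number of ranges, with one task per (task type, range) pair. It assumes content ids are 20-byte hashes; `split_id_space`, `make_tasks`, the task dict layout and the task type names are all hypothetical and not taken from the scheduler.

```python
from typing import Dict, Iterator, List, Tuple

ID_BYTES = 20  # assumption: content ids are 20-byte hashes


def split_id_space(
    nb_ranges: int, id_bytes: int = ID_BYTES
) -> Iterator[Tuple[bytes, bytes]]:
    """Yield inclusive (start, end) bounds that together cover the whole id space."""
    if nb_ranges < 1:
        raise ValueError("nb_ranges must be >= 1")
    total = 1 << (8 * id_bytes)  # number of possible ids
    for i in range(nb_ranges):
        start = i * total // nb_ranges
        end = (i + 1) * total // nb_ranges - 1
        yield start.to_bytes(id_bytes, "big"), end.to_bytes(id_bytes, "big")


def make_tasks(task_types: List[str], nb_ranges: int) -> List[Dict]:
    """Build one (hypothetical) task description per task type and per range."""
    return [
        {"type": task_type, "arguments": {"start": start.hex(), "end": end.hex()}}
        for task_type in task_types
        for start, end in split_id_space(nb_ranges)
    ]


# Hypothetical task type names, for illustration only.
tasks = make_tasks(["index-mimetype-range", "index-license-range"], nb_ranges=4)
```

With this layout, adding an indexer only adds one more entry to `task_types`; the ranges themselves stay the same.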
That means, though:
- changing the indexer's input from an arbitrary list of ids to a range of ids (swh/devel/swh-indexer#991 (closed))
- removing the orchestrator approach
- moving some logic into the indexers themselves (for example, the language, ctags and license indexers will need to filter for textual content on their own).
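As a rough illustration of the points above, the sketch below shows an indexer that takes a range of ids as input and does its own filtering for textual content, with no orchestrator in front of it. Everything here is hypothetical: the `Content` shape, the in-memory `store`, the class names, and the assumption that a mimetype is already known for each content. The line counter only stands in for a real language, ctags or license indexer.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator, List


@dataclass
class Content:
    # Hypothetical shape: id, previously computed mimetype, raw data.
    id: bytes
    mimetype: str
    data: bytes


def contents_in_range(
    store: Iterable[Content], start: bytes, end: bytes
) -> Iterator[Content]:
    """Yield the contents whose id falls within [start, end] (inclusive)."""
    for content in store:
        if start <= content.id <= end:
            yield content


class RangeIndexer:
    """Base class: index every content of a range that passes `filter`."""

    def filter(self, content: Content) -> bool:
        return True  # by default, index everything in the range

    def index(self, content: Content) -> Dict:
        raise NotImplementedError

    def run(self, store: Iterable[Content], start: bytes, end: bytes) -> List[Dict]:
        return [
            self.index(content)
            for content in contents_in_range(store, start, end)
            if self.filter(content)
        ]


class TextRangeIndexer(RangeIndexer):
    """Indexers needing textual content filter on their own."""

    def filter(self, content: Content) -> bool:
        return content.mimetype.startswith("text/")


class LineCountIndexer(TextRangeIndexer):
    """Toy stand-in for a language/ctags/license indexer."""

    def index(self, content: Content) -> Dict:
        return {"id": content.id.hex(), "lines": content.data.count(b"\n")}


store = [
    Content(b"\x01", "text/plain", b"a\nb\n"),
    Content(b"\x02", "application/octet-stream", b"\x00\x01"),
    Content(b"\x90", "text/x-python", b"print(1)\n"),
]
results = LineCountIndexer().run(store, start=b"\x00", end=b"\x7f")
```

Here only the first content is indexed: the second is skipped by the indexer's own filter, and the third is outside the requested range.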
Migrated from T1310 (view on Phabricator)