indexer: orchestrator now provides the persistence policy to indexer tasks
The orchestrator owns a check_presence flag which determines whether data already present in the db is filtered out first. Turning this flag off (in the orchestrator's configuration file) skips this check. But then, duplicate data could reach the db, where it was ignored (python3-swh.storage <= 0.0.68). Those entries can now be updated as well:

- check_presence: True. Filter out data; if there are still duplicates (should not happen), they are ignored in any case.
- check_presence: False. Do not filter out data; if there are duplicates, they are updated according to the latest results.
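The two settings can be illustrated with a minimal sketch. It assumes a plain dict standing in for the db; the names conflict_policy and persist, and the "ignore"/"update" policy strings, are hypothetical and are not taken from the swh.indexer or swh.storage code.

```python
# Hypothetical sketch: these names are illustrative, not the swh.indexer API.

def conflict_policy(check_presence):
    """Map the orchestrator's check_presence flag to a persistence policy."""
    return "ignore" if check_presence else "update"


def persist(db, results, check_presence):
    """Store results (id -> value) into db, a dict standing in for storage."""
    policy = conflict_policy(check_presence)
    if check_presence:
        # Filter out ids already present before writing anything.
        results = {k: v for k, v in results.items() if k not in db}
    for key, value in results.items():
        if key in db and policy == "ignore":
            continue  # duplicate: keep the existing entry
        db[key] = value  # new entry, or duplicate updated with latest result
    return db
```

With check_presence set to True, an id that is already stored keeps its old value; with False, the same id is overwritten with the latest result.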