swh-indexer produces dates not supported by swh-search/Elasticsearch
e.g. 2022-12-2, where the day is not zero-padded
https://sentry.softwareheritage.org/organizations/swh/issues/104824/?referrer=phabricator_plugin
BulkIndexError: ('8 document(s) failed to index.', [{'update': {'_index': 'origin-v0.11', '_type': '_doc', '_id': '155291d5b9ada4570672510509f93fcfd9809882', 'status': 400, 'error': {'type': 'mapper_parsing_exception', 'reason': "failed to parse field [jsonld.http://schema.org/dateModified.@value] of type [date] in document with id '155291d5b9ada4570672510509f93fcfd9809882'. Preview of field's value: '2020-12-2'", 'caused_by': {'type': 'illegal_argument_exception', 'reason': 'failed to parse date field [2020-12-2] with ...
(5 additional frame(s) were not displayed)
...
File "swh/search/metrics.py", line 21, in d
return f(*a, **kw)
File "swh/search/elasticsearch.py", line 382, in origin_update
indexed_count, errors = helpers.bulk(self._backend, actions, index=write_index)
File "elasticsearch/helpers/actions.py", line 300, in bulk
for ok, item in streaming_bulk(client, actions, *args, **kwargs):
File "elasticsearch/helpers/actions.py", line 230, in streaming_bulk
**kwargs
File "elasticsearch/helpers/actions.py", line 158, in _process_bulk_chunk
raise BulkIndexError("%i document(s) failed to index." % len(errors), errors)
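One possible mitigation, sketched below, is to normalize loosely formatted dates to zero-padded ISO 8601 before the document reaches Elasticsearch. This is only a sketch under assumptions: the helper name `normalize_date` is hypothetical (it is not an existing swh-indexer or swh-search function), it assumes the `date` mapping accepts `YYYY-MM-DD`, and it only handles plain dates, so values with a time part or any other layout come back as `None` and would need separate handling (dropping the field, for example).

```python
import re
from datetime import date
from typing import Optional

# Accepts YYYY-M-D with one or two digits for month and day.
# Hypothetical helper: not part of swh-indexer or swh-search.
_LOOSE_DATE = re.compile(r"(\d{4})-(\d{1,2})-(\d{1,2})")


def normalize_date(value: str) -> Optional[str]:
    """Return `value` as a zero-padded YYYY-MM-DD string, or None if it
    is not a valid calendar date in that (loose) layout."""
    match = _LOOSE_DATE.fullmatch(value.strip())
    if match is None:
        return None
    year, month, day = (int(part) for part in match.groups())
    try:
        # date() rejects impossible values such as month 13 or day 32.
        return date(year, month, day).isoformat()
    except ValueError:
        return None


if __name__ == "__main__":
    print(normalize_date("2020-12-2"))
    print(normalize_date("2020-13-2"))
```

Whether such a normalization belongs in swh-indexer (where the metadata is produced) or in swh-search (just before `origin_update` builds the bulk actions) is left open here.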
Migrated from T4654 (view on Phabricator)