Modify deposit workflow to check duplicated POST requests
After the session with Bruno this week, we saw that multiple requests for the same deposit that are waiting for the workers create a corner case: each request is treated as a different deposit and each is loaded into the archive separately. For example, this deposit, https://archive.softwareheritage.org/browse/origin/https://hal.archives-ouvertes.fr/hal-01862659/visits/, has 9 visits that are not related through the parent history.
- if external id exists
  - if md5 identical
    - calculate metadata hash
    - if metadata hash identical
      - return 400 // we have already received this deposit
  - mark deposit with last identical external-id as parent-id
    - if parent is in 'rejected' status, iterate until last non-rejected parent
- return 201 with new deposit-id
Comment: when the parent is not in status 'done', the deposit can't be loaded.
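The proposed check could be sketched roughly as follows. This is only an illustration under stated assumptions, not the deposit server's actual code: the `Deposit` class, the `handle_post` and `metadata_hash` helpers, the in-memory list of deposits, and the choice of SHA-1 over sorted JSON for the metadata hash are all hypothetical. It also assumes the incoming request is compared against the most recent deposit with the same external id.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class Deposit:
    # Hypothetical in-memory stand-in for a deposit record.
    id: int
    external_id: str
    archive_md5: str
    metadata_hash: str
    status: str = "deposited"
    parent_id: Optional[int] = None


def metadata_hash(metadata: dict) -> str:
    # Assumption: metadata is hashed as JSON with sorted keys.
    canonical = json.dumps(metadata, sort_keys=True).encode("utf-8")
    return hashlib.sha1(canonical).hexdigest()


def handle_post(deposits: list, external_id: str, archive: bytes, metadata: dict):
    """Return (http_status, deposit_id) for an incoming POST."""
    md5 = hashlib.md5(archive).hexdigest()
    meta = metadata_hash(metadata)
    parent_id = None

    same_external_id = [d for d in deposits if d.external_id == external_id]
    if same_external_id:
        last = same_external_id[-1]
        # Same archive and same metadata: we have already received this deposit.
        if last.archive_md5 == md5 and last.metadata_hash == meta:
            return 400, None
        # Otherwise chain to the last deposit, skipping rejected parents.
        by_id = {d.id: d for d in deposits}
        parent = last
        while parent is not None and parent.status == "rejected":
            parent = by_id.get(parent.parent_id)
        parent_id = parent.id if parent is not None else None

    new = Deposit(len(deposits) + 1, external_id, md5, meta, parent_id=parent_id)
    deposits.append(new)
    return 201, new.id
```

With this sketch, two identical POSTs for the same external id yield one deposit and a 400 for the second request, while a POST with a changed archive or changed metadata yields a new deposit whose parent is the last non-rejected deposit with that external id.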
Migrated from T1171 (view on Phabricator)