- Feb 23, 2023
-
-
Antoine Lambert authored
It also makes it possible to remove the version restriction on psycopg2. Related to swh/infra/sysadm-environment#4772
-
- Feb 17, 2023
-
-
Antoine Lambert authored
Related to swh/meta#4960
-
- Oct 27, 2022
-
-
vlorentz authored
-
- May 25, 2022
-
- Apr 11, 2022
-
-
Antoine Lambert authored
Add a new private API endpoint, /1/private/deposits/datatables/, to list and filter deposits. Its responses are intended to be consumed by the DataTables JavaScript framework used in the deposits admin Web UI. That view was originally implemented in swh-web, but for performance reasons it was decided to move it to swh-deposit; swh-web will then simply forward the HTTP request to swh-deposit. Related to T3128
-
- Apr 08, 2022
-
-
Antoine Lambert authored
Related to T3922
-
- Apr 07, 2022
-
-
vlorentz authored
Resolves SWH-DEPOSIT-2P <https://sentry.softwareheritage.org/share/issue/b8509b67972f425a8f5c06805f8cf2fe/>
-
vlorentz authored
'metadata_raw' made sense, to discriminate from 'metadata_dict'; but no longer does, now that the latter was removed. Additionally, swh-web expects it to be named 'raw_metadata', so it could never actually get the metadata.
-
- Apr 06, 2022
-
-
vlorentz authored
-
- Mar 28, 2022
-
-
vlorentz authored
-
vlorentz authored
Manually validate <codemeta:affiliation>. Unfortunately, this cannot be validated by codemeta.xsd, because Codemeta has conflicting requirements:
  1. https://codemeta.github.io/terms/ requires it to be Text (represented by simple content), but
  2. https://doi.org/10.5063/SCHEMA/CODEMETA-2.0 requires it to be an Organization (represented by complex content).
See https://github.com/codemeta/codemeta/pull/239 for a discussion about this issue.
-
- Mar 21, 2022
-
-
vlorentz authored
-
- Mar 16, 2022
-
-
Antoine R. Dumont authored
Related to T4013#80910
-
- Mar 08, 2022
-
-
Antoine R. Dumont authored
Previously, this fetched all deposits and queried further information for each of them before returning the results and letting pagination happen. The queryset now stays lazy: pagination happens first, and when a page is requested, further information is fetched only for the deposits on that page. Related to T4020
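The idea of paginating first and enriching only the requested page can be sketched without Django. This is a minimal illustration, not the swh-deposit code: `paginate`, `list_page`, `fetch_details` and the sample ids are hypothetical names, and a plain `range` stands in for the lazy queryset.

```python
from itertools import islice
from typing import Callable, Dict, Iterable, Iterator, List

def paginate(items: Iterable[int], page: int, page_size: int) -> Iterator[int]:
    """Lazily yield only the items of the requested 1-based page."""
    start = (page - 1) * page_size
    return islice(items, start, start + page_size)

def list_page(
    deposit_ids: Iterable[int],
    page: int,
    page_size: int,
    fetch_details: Callable[[int], Dict[str, int]],
) -> List[Dict[str, int]]:
    """Fetch extra details only for the deposits on the requested page."""
    return [fetch_details(i) for i in paginate(deposit_ids, page, page_size)]

# Record which deposits trigger the (hypothetical) expensive lookup.
calls: List[int] = []

def fetch_details(deposit_id: int) -> Dict[str, int]:
    calls.append(deposit_id)
    return {"id": deposit_id}

page = list_page(range(1, 101), page=2, page_size=10, fetch_details=fetch_details)
```

With 100 deposits and a page size of 10, only the ten deposits on the requested page go through `fetch_details`; the other ninety are never enriched.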
-
- Mar 04, 2022
-
-
vlorentz authored
And make it non-empty in 'GET Cont-IRI'. Also remove the legacy <atom:deposit_date> from 'GET Cont-IRI': as it was always empty, there is no need to keep it for backward compatibility.
-
- Mar 02, 2022
-
-
This function is only used by server-side API checks. Having it defined in the main utils module makes the deposit client transitively depend on Django (via swh.deposit.errors), which does not seem necessary.
-
- Feb 28, 2022
-
-
Nicolas Dandrimont authored
This function is only used by server-side API checks. Having it defined in the main utils module makes the deposit client transitively depend on Django (via swh.deposit.errors), which does not seem necessary.
-
vlorentz authored
For now this increases code complexity, but it will make it easier to add other checks.
-
vlorentz authored
-
- Feb 24, 2022
-
-
Antoine R. Dumont authored
This refuses the metadata-only deposit if the metadata provenance does not match. A similar check is already done when a deposit's origin URL does not match that same (client) provider URL. Related to T3677
-
Antoine R. Dumont authored
This should ease deposit listing in whatever form (backend DB read or client consuming the deposit listing). Deposit types stand for:
  - meta: metadata-only deposit
  - code: content deposit
This commit includes a schema migration script which adds a new column 'type'. The script is also in charge of migrating existing data to the right type values. Related to T3677
-
vlorentz authored
xmltodict was already on the way out for the deposit, and the latest libexpat security update broke it entirely when dealing with namespaces, which means we cannot use it until this is addressed. https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1006317
Functional changes of this commit:
  1. No more writes to the 'metadata' jsonb column in the DB (as it strongly depends on xmltodict)
  2. ServiceDocumentDepositClient always outputs a list of collections, instead of None/dict/List[dict] depending on the number of collections (artefact of using xmltodict, which is replaced by proper parsing)
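As a rough illustration of why proper parsing yields a stable return type, the sketch below extracts collections from an AtomPub service document with the standard library's `xml.etree.ElementTree`. It is an assumption-laden example rather than the actual ServiceDocumentDepositClient code: `collection_hrefs`, the sample documents and the URLs are hypothetical; only the AtomPub namespace comes from RFC 5023.

```python
import xml.etree.ElementTree as ET
from typing import List

# AtomPub namespace from RFC 5023; the "app" prefix is chosen for this example.
NS = {"app": "http://www.w3.org/2007/app"}

def collection_hrefs(service_document: str) -> List[str]:
    """Return the href of every collection; always a list, possibly empty."""
    root = ET.fromstring(service_document)
    return [
        collection.attrib["href"]
        for collection in root.findall("app:workspace/app:collection", NS)
    ]

# Hypothetical service documents with zero, one and two collections.
TEMPLATE = '<service xmlns="http://www.w3.org/2007/app"><workspace>{}</workspace></service>'
none = collection_hrefs(TEMPLATE.format(""))
one = collection_hrefs(TEMPLATE.format('<collection href="https://example.org/1/a/"/>'))
two = collection_hrefs(
    TEMPLATE.format(
        '<collection href="https://example.org/1/a/"/>'
        '<collection href="https://example.org/1/b/"/>'
    )
)
```

Because `findall` returns a list whatever the number of matches, callers no longer need to distinguish between `None`, a single dict and a list of dicts.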
-
vlorentz authored
No one uses that, and it's redundant, as we provide the original XML
-
- Feb 23, 2022
-
-
Antoine R. Dumont authored
This now lists the deposits with their associated raw metadata, if any is present. This will allow adaptations in the moderation view [1] to display the metadata provenance URL (provided it's parsed out of the raw metadata). [1] The moderation view consumes this internal API. Related to T3677
-
vlorentz authored
-
- Feb 22, 2022
-
-
vlorentz authored
This commit does not touch the external API though; i.e. `metadata_dict` is still present in the JSON API, and the equivalent `jsonb` field remains in the database. They will probably be removed in a future commit because they are not very useful, though.
Rationale: I find xmltodict's approach of translating an XML tree to native structures to be intrinsically flawed for non-trivial handling of XML, because the data structure:
* is implementation-defined (by xmltodict, which is Python-only) and may change across versions
* does not intrinsically store namespaces, and relies on an internal prefix map (though it isn't much of an issue right now, as we do not need composability and all the changed APIs are private)
* is not stable; for example, `<a><b>foo</b></a>` and `<a><b>foo</b><b>bar</b></a>` are encoded completely differently (the former is a `Dict[str, str]`, the latter is a `Dict[str, list]`).
And every operation manipulating this data structure needs to check presence, number *and* type on every access. Consider this part of this commit for example:
```
-    swh_deposit = metadata.get("swh:deposit")
-    if not swh_deposit:
-        return None
-
-    swh_reference = swh_deposit.get("swh:reference")
-    if not swh_reference:
-        return None
-
-    swh_origin = swh_reference.get("swh:origin")
-    if swh_origin:
-        url = swh_origin.get("@url")
-        if url:
-            return url
+    ref_origin = metadata.find(
+        "swh:deposit/swh:reference/swh:origin[@url]", namespaces=NAMESPACES
+    )
+    if ref_origin is not None:
+        return ref_origin.attrib["url"]
```
The use of XPath makes it considerably shorter; and the original version did not even check number/type (i.e. it would crash if an element was duplicated).
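The XPath lookup from this commit message can be tried in isolation with the standard library's `xml.etree.ElementTree`. The following is a sketch under stated assumptions: the contents of `NAMESPACES`, the function name `get_reference_origin_url` and the sample entry are illustrative guesses, not copied from swh-deposit.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Assumed prefix map; the real NAMESPACES constant is defined in swh-deposit.
NAMESPACES = {
    "atom": "http://www.w3.org/2005/Atom",
    "swh": "https://www.softwareheritage.org/schema/2018/deposit",
}

def get_reference_origin_url(metadata: ET.Element) -> Optional[str]:
    """Return the url attribute of the referenced origin, or None if absent."""
    ref_origin = metadata.find(
        "swh:deposit/swh:reference/swh:origin[@url]", namespaces=NAMESPACES
    )
    if ref_origin is not None:
        return ref_origin.attrib["url"]
    return None

# Hypothetical Atom entry carrying a reference to an origin.
ENTRY = """<entry xmlns="http://www.w3.org/2005/Atom"
       xmlns:swh="https://www.softwareheritage.org/schema/2018/deposit">
  <swh:deposit>
    <swh:reference>
      <swh:origin url="https://example.org/repo"/>
    </swh:reference>
  </swh:deposit>
</entry>"""

url = get_reference_origin_url(ET.fromstring(ENTRY))
missing = get_reference_origin_url(
    ET.fromstring('<entry xmlns="http://www.w3.org/2005/Atom"/>')
)
```

A single `find` call covers the three nested presence checks of the dict-based version, and the `[@url]` predicate skips origins that lack the attribute.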
-
vlorentz authored
We don't use that feature at all as far as I am aware. I also find that it complicates any metadata handling (especially the validation I would like to add in the near future), and probably does not match semantics intended by SWORD (merging occurs on PUT requests, as we don't implement PATCH)
-
- Feb 21, 2022
-
-
Antoine R. Dumont authored
Prior to this commit, only rejected deposits stored problem details. Now that we can have warnings even for 'verified' deposits, we need to store those details for post-analysis. Note that this also fixes the docstring of the overall class, which had been out of date since the beginning (duplicated from another class). Related to T3677
-
Antoine R. Dumont authored
This introduces a new check about the metadata provenance. While it's a suggested field, it's definitely something that we want deposit clients to send us. So warn when it's not the case. That does not reject the deposit but it's worth keeping that detail in the backend. Related to T3677
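A check that warns without rejecting could look roughly like the sketch below. Everything specific in it is an assumption made for illustration: the element path `swh:deposit/swh:metadata-provenance/schema:url`, the namespace URIs, the function name `check_metadata_provenance` and the shape of the returned dict are not taken from the swh-deposit code.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Union

# Assumed prefix map and element layout, for illustration only.
NAMESPACES = {
    "swh": "https://www.softwareheritage.org/schema/2018/deposit",
    "schema": "http://schema.org/",
}

def check_metadata_provenance(metadata: ET.Element) -> Dict[str, Union[str, List[str]]]:
    """Keep the deposit 'verified' but record a warning if provenance is missing."""
    warnings: List[str] = []
    url = metadata.findtext(
        "swh:deposit/swh:metadata-provenance/schema:url", namespaces=NAMESPACES
    )
    if not url or not url.strip():
        warnings.append("missing suggested field: metadata-provenance url")
    # The status never changes here: a missing suggested field is not a rejection.
    return {"status": "verified", "warnings": warnings}

WITH_PROVENANCE = ET.fromstring(
    '<entry xmlns="http://www.w3.org/2005/Atom"'
    ' xmlns:swh="https://www.softwareheritage.org/schema/2018/deposit"'
    ' xmlns:schema="http://schema.org/">'
    "<swh:deposit><swh:metadata-provenance>"
    "<schema:url>https://example.org/provider</schema:url>"
    "</swh:metadata-provenance></swh:deposit></entry>"
)
WITHOUT_PROVENANCE = ET.fromstring('<entry xmlns="http://www.w3.org/2005/Atom"/>')

ok = check_metadata_provenance(WITH_PROVENANCE)
warned = check_metadata_provenance(WITHOUT_PROVENANCE)
```

Returning the warnings alongside the status is one way to keep that detail available to the backend for later analysis.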
-
- Jan 18, 2022
-
-
Antoine R. Dumont authored
instead of going undetected and crashing with an internal server error (500). Related to T3856
-
- Jan 10, 2022
-
- Nov 05, 2021
-
- Oct 21, 2021
-
-
vlorentz authored
-
- Oct 19, 2021
-
-
Antoine Lambert authored
Add a new username query parameter to the /private/deposits/ endpoint, to filter deposits by the client that created them. Related to T3174
-
- Oct 06, 2021
-
-
vlorentz authored
-
- Aug 12, 2021
-
- May 25, 2021
-
-
Antoine R. Dumont authored
Related to T2996
-
vlorentz authored
1. Follows the AtomPub <https://datatracker.ietf.org/doc/html/rfc5023#section-5.2> spec
2. Discoverable from the service document
3. Makes more sense semantically to have the list of items in a path at the root of that path.
-
- May 21, 2021
-
-
Antoine R. Dumont authored
This adds a paginated listing endpoint so that authenticated users can retrieve their deposit information in batches. It also touches part of an equivalent private listing API, to allow reuse of the pagination code. The new endpoint is not a SWORD endpoint, but it nonetheless lists deposits in relatively SWORD-like XML. Related to T2996
-