  1. Mar 21, 2022
  2. Mar 16, 2022
  3. Mar 15, 2022
  4. Mar 08, 2022
  5. Mar 04, 2022
  6. Feb 28, 2022
  7. Feb 24, 2022
  8. Feb 23, 2022
  9. Feb 22, 2022
    • server: Use xml.etree.ElementTree instead of nested dicts internally · 55ae87b1
      vlorentz authored
      This commit does not touch the external API, though; i.e. `metadata_dict`
      is still present in the JSON API, and the equivalent `jsonb` field remains
      in the database. Both will probably be removed in a future commit,
      as they are not very useful.
      
      Rationale:
      
      I find xmltodict's approach of translating an XML tree to native structures
      to be intrinsically flawed for non-trivial handling of XML, because the
      data structure:
      
      * is implementation-defined (by xmltodict, which is Python-only) and may
        change across versions
      * does not intrinsically store namespaces, and relies on an internal
        prefix map (though it isn't much of an issue right now, as we do not need
        composability and all the changed APIs are private)
      * is not stable; for example, `<a><b>foo</b></a>` and `<a><b>foo</b><b>bar</b></a>`
        are encoded completely differently (the former is a `Dict[str, str]`,
        the latter is a `Dict[str, list]`).
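
For contrast with the last point, here is a minimal sketch using only the standard library. It does not reproduce xmltodict's output; it only illustrates that `xml.etree.ElementTree` hands back the same kind of value (a list of elements) whether `<b>` appears zero, one, or two times. The helper name `b_texts` is hypothetical and not part of this commit.

```python
import xml.etree.ElementTree as ET


def b_texts(xml_source: str) -> list:
    """Return the text of every <b> child of the root element."""
    root = ET.fromstring(xml_source)
    # findall() always returns a list, whatever the number of matches,
    # so callers never have to branch on the type of the result.
    return [b.text for b in root.findall("b")]


one = b_texts("<a><b>foo</b></a>")
two = b_texts("<a><b>foo</b><b>bar</b></a>")
none = b_texts("<a/>")
```

The caller handles all three cases with the same code path, since the result is always a list.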
      
      And every operation manipulating this data structure needs to check
      presence, number *and* type on every access. Consider this part of this
      commit for example:
      
      ```
      -    swh_deposit = metadata.get("swh:deposit")
      -    if not swh_deposit:
      -        return None
      -
      -    swh_reference = swh_deposit.get("swh:reference")
      -    if not swh_reference:
      -        return None
      -
      -    swh_origin = swh_reference.get("swh:origin")
      -    if swh_origin:
      -        url = swh_origin.get("@url")
      -        if url:
      -            return url
      +    ref_origin = metadata.find(
      +        "swh:deposit/swh:reference/swh:origin[@url]", namespaces=NAMESPACES
      +    )
      +    if ref_origin is not None:
      +        return ref_origin.attrib["url"]
      ```
      
      The use of XPath makes it considerably shorter, and the original version
      did not even check number/type (i.e. it would crash if an element was
      duplicated).
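
A self-contained sketch of the XPath-based lookup, runnable outside the code base, might look like the following. The `NAMESPACES` mapping, the sample document, and the function name `origin_url` are assumptions made for illustration; the project's actual constant and call site may differ.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Assumed prefix map; the URIs are illustrative and may not match the
# project's actual NAMESPACES constant.
NAMESPACES = {
    "atom": "http://www.w3.org/2005/Atom",
    "swh": "https://www.softwareheritage.org/schema/2018/deposit",
}


def origin_url(metadata: ET.Element) -> Optional[str]:
    """Return the url attribute of swh:deposit/swh:reference/swh:origin, if any."""
    # The [@url] predicate only matches elements that carry a url attribute.
    ref_origin = metadata.find(
        "swh:deposit/swh:reference/swh:origin[@url]", namespaces=NAMESPACES
    )
    # Compare against None explicitly: testing an Element's truth value is
    # unreliable for elements without children.
    if ref_origin is not None:
        return ref_origin.attrib["url"]
    return None


DOC = """\
<entry xmlns="http://www.w3.org/2005/Atom"
       xmlns:swh="https://www.softwareheritage.org/schema/2018/deposit">
  <swh:deposit>
    <swh:reference>
      <swh:origin url="https://example.org/repo"/>
    </swh:reference>
  </swh:deposit>
</entry>
"""

url = origin_url(ET.fromstring(DOC))
missing = origin_url(ET.fromstring('<entry xmlns="http://www.w3.org/2005/Atom"/>'))
```

A missing intermediate element simply makes `find()` return `None`, so no per-level presence check is needed.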
    • deposit.cli.client: Allow user to define the metadata provenance url · b9f565aa
      Antoine R. Dumont authored
      If the user provides `--metadata-provenance-url`, the generated XML forwards
      that information to the deposit server. If the user provides the metadata file
      directly and it lacks a metadata provenance url, a warning is logged to notify
      the user.
      
      Related to T3677
      Verified
    • Fix URI of schema.org · a10ed57b
      vlorentz authored
      Either scheme (http:// or https://) is valid according to https://schema.org/docs/gs.html ;
      but we need to pick one, as namespace URIs are opaque identifiers.
      Codemeta chose http:// (because it was the only valid one
      back then), so we should stick to it.
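
The "opaque identifier" point can be shown with a short standard-library sketch. The exact URI strings below (`http://schema.org/` and `https://schema.org/`) and the sample document are assumptions for illustration, not values taken from this commit.

```python
import xml.etree.ElementTree as ET

# Two prefix maps that differ only in the scheme of the namespace URI.
HTTP_NS = {"schema": "http://schema.org/"}
HTTPS_NS = {"schema": "https://schema.org/"}

# A document that declares the http:// form of the namespace.
doc = ET.fromstring(
    '<root xmlns:schema="http://schema.org/">'
    "<schema:url>https://example.org/</schema:url>"
    "</root>"
)

# Namespace URIs are compared as plain strings, so only the exact form
# used in the document matches.
found_http = doc.find("schema:url", namespaces=HTTP_NS)
found_https = doc.find("schema:url", namespaces=HTTPS_NS)
```

Here `found_http` is the `<schema:url>` element while `found_https` is `None`, which is why producers and consumers have to agree on a single form.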
    • Remove metadata merging; use only the latest document · 7727a9c0
      vlorentz authored
      We don't use that feature at all as far as I am aware.
      
      I also find that it complicates any metadata handling (especially the validation
      I would like to add in the near future), and probably does not match the semantics
      intended by SWORD (merging occurs on PUT requests, as we don't implement PATCH).
  10. Feb 21, 2022
    • deposit_check: Actually store warning in deposit status detail · 770cc0f5
      Antoine R. Dumont authored
      Prior to this commit, only rejected deposits stored problem details. Now that we
      can have warnings even for 'verified' deposits, we need to store those details for
      post-analysis.
      
      Note that this also fixes the docstring of the overall class, which had been out of date
      since the beginning (duplicated from another class).
      
      Related to T3677
      Verified
    • api.checks: Warn when suggested fields are missing from metadata · 339f7dd3
      Antoine R. Dumont authored
      This introduces a new check on the metadata provenance. While it's a suggested field,
      it's definitely something that we want deposit clients to send us, so warn when it is
      missing. That does not reject the deposit, but it's worth keeping that detail in the
      backend.
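
As a rough sketch of a non-rejecting check, assuming the provenance url lives at `swh:deposit/swh:metadata-provenance/schema:url`: the element path, the namespace URIs, the function name `check_suggested_fields`, and the shape of the warning entries are all hypothetical and not taken from this commit.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Tuple

# Assumed namespaces and element path, for illustration only.
NAMESPACES = {
    "swh": "https://www.softwareheritage.org/schema/2018/deposit",
    "schema": "http://schema.org/",
}
PROVENANCE_PATH = "swh:deposit/swh:metadata-provenance/schema:url"


def check_suggested_fields(metadata: ET.Element) -> Tuple[bool, List[Dict[str, str]]]:
    """Return (ok, warnings); a missing suggested field never sets ok to False."""
    warnings: List[Dict[str, str]] = []
    if metadata.find(PROVENANCE_PATH, namespaces=NAMESPACES) is None:
        # Recorded as a warning so it can be stored for later analysis.
        warnings.append(
            {"summary": "Missing suggested field", "field": "metadata-provenance url"}
        )
    return True, warnings


WITH_PROVENANCE = ET.fromstring(
    '<entry xmlns:swh="https://www.softwareheritage.org/schema/2018/deposit"'
    ' xmlns:schema="http://schema.org/">'
    "<swh:deposit><swh:metadata-provenance>"
    "<schema:url>https://example.org/metadata</schema:url>"
    "</swh:metadata-provenance></swh:deposit>"
    "</entry>"
)
WITHOUT_PROVENANCE = ET.fromstring("<entry/>")

ok_with, warnings_with = check_suggested_fields(WITH_PROVENANCE)
ok_without, warnings_without = check_suggested_fields(WITHOUT_PROVENANCE)
```

Both calls report the deposit as acceptable; only the second one carries a warning entry.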
      
      Related to T3677
      Verified