The popular Zenodo platform is becoming a white-label open source application, InvenioRDM, built on top of the Invenio library: https://inveniosoftware.org/products/rdm/
Many partners are collaborating on this new version, so this is the right time to contribute a software deposit functionality similar to the one we have for HAL:
SWORD 3 support is planned for InvenioRDM
CodeMeta support/export is planned for InvenioRDM
our new BibTeX types would be welcome in InvenioRDM
I have surveyed the existing code to check that what I think happens in the
deposit is correct. TL;DR: it is!
Existing update metadata endpoints are concentrated in the
swh.deposit.api.deposit_update module [1].
Up to now, a restriction in the base class prevents updating a deposit whose
status is anything other than 'partial' [2]. That restriction should be relaxed
for the deposit metadata update case (when a SWHID is provided in some way). It
should stay in place for the other existing cases.
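A minimal sketch of the relaxed check described above. This is not the actual swh.deposit code: the function name, the exception class and the returned labels are hypothetical, and only the 'partial' status value comes from the notes.

```python
from typing import Optional

DEPOSIT_STATUS_PARTIAL = "partial"  # status value quoted in the notes above


class UpdateNotAllowed(Exception):
    """Hypothetical error raised when a deposit may not be updated."""


def check_update_allowed(status: str, swhid: Optional[str] = None) -> str:
    """Return the kind of update permitted for a deposit, or raise.

    Assumption: the presence of a SWHID is what marks a request as a
    metadata-only update, whatever way it is actually transmitted.
    """
    if swhid is not None:
        # Relaxed case: metadata update targeting an existing SWHID,
        # so the 'partial' restriction is skipped.
        return "metadata-only"
    if status != DEPOSIT_STATUS_PARTIAL:
        # Existing restriction, kept for every other update case.
        raise UpdateNotAllowed(
            f"deposit status is {status!r}, expected {DEPOSIT_STATUS_PARTIAL!r}"
        )
    return "full"
```

With this shape, the existing endpoints keep their behaviour when no SWHID is given, and only the new metadata update path bypasses the status check.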
See the Invenio-cli repository (the invenio-swh module needs to connect to it in order to be deployed on a live instance)
Notes from the Zenodo-CottageLabs-SWH meeting
1. the additional software properties we want to add for software records:
operatingSystem (proposed list)
runtimePlatform (proposed list)
programmingLanguage (proposed list)
codeRepository (url)
releaseNotes (text)
softwareVersion (text)
developmentStatus (proposed list)
issueTracker (url)
1.1. It is preferable to wait for the customized properties feature before adding the software properties into the InvenioRDM form.
1.2. The releaseNotes property can be added as a type of additionalDescription
1.3. The codeRepository property can be added as a type of relatedIdentifiers (I need to check this with Martin from DataCite)
1.4. I will check with HAL which proposed lists they use for the following properties: operatingSystem, runtimePlatform, programmingLanguage and developmentStatus.
1.5. we see a specific software properties category in blue with the customized properties feature - when it is ready, we will discuss exactly which properties should be added from CodeMeta
1.6. [update] the custom fields feature has been dropped from the roadmap for June
2. the GitHub integration
2.1. the url of a code repository is used in the "supplemented by" property in Zenodo (we need to keep this in mind when working on phase 3 of the integration)
2.2. [not discussed, but important to review] the origin in SWH should be the (concept) DOI of a deposit (for which the SWH property createOrigin should be used in the SWORD metadata)
the version DOI needs to be added in metadata as an identifier, but will not act as the origin
the code repository (in supplemented by) should be also added to the metadata with the CodeMeta term codeRepository
2.3. the content of the release notes on GitHub is used in the description property of a Zenodo record
it might be useful to review exactly what information from GitHub is used in Zenodo
if not changed in Zenodo, then send it to SWH metadata as is, i.e. as description
2.4. the GitHub guide dates from 2016: https://guides.github.com/activities/citable-code/ and does not explain how authors are collected from the repository
3. CodeMeta
3.1. with the CodeMeta task force we are working on the adoption of the CodeMeta properties in schema.org (2 of the properties in the list above are CodeMeta specific)
3.2. I will check with Matt and Carl (CodeMeta maintainers) when the v3 release of CodeMeta is planned
4. Pushing the InvenioRDM-SWH integration into Zenodo
4.1. Cottage Labs and Software Heritage will finish the implementation of the integration workflow with the following items:
updated metadata workflow
record and display suitable minimal information on deposit state
4.2. Cottage Labs and Software Heritage scheduled a sprint this Friday and we will see where we are at
4.3. Cottage Labs will do a PyPI package release with the complete workflow that can be used by other systems
4.4. we will demonstrate the workflow at a bi-weekly InvenioRDM meeting to get feedback and find a volunteer live system to put it in production
4.5. after a first push to production on a live system, we can consider pushing to Zenodo :-)
5. source code display in a software record
5.1. first we will implement the display of the SWHID with a browse button, as implemented on HAL
5.2. we did discuss the future (far future?) possibility of displaying the source code content in a widget on the record page, which will require the SWH team to implement the widget at some point; to be checked with Roberto whether this is part of the plan (discussion on swh-web#3351 (closed))
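To make items 1 and 2.2 more concrete, here is a rough sketch of how a software record could be mapped to a CodeMeta document. The property names are the CodeMeta terms listed in item 1; the input record layout (a flat dict with `description` and `version_doi` keys) and the function name are assumptions for illustration, not the InvenioRDM schema.

```python
import json

# JSON-LD context for CodeMeta 2.0.
CODEMETA_CONTEXT = "https://doi.org/10.5063/schema/codemeta-2.0"

# Software properties from item 1 of the notes.
SOFTWARE_PROPERTIES = (
    "operatingSystem",
    "runtimePlatform",
    "programmingLanguage",
    "codeRepository",
    "releaseNotes",
    "softwareVersion",
    "developmentStatus",
    "issueTracker",
)


def record_to_codemeta(record: dict) -> dict:
    """Build a CodeMeta document from a hypothetical flat record dict."""
    doc = {"@context": CODEMETA_CONTEXT, "@type": "SoftwareSourceCode"}
    for name in SOFTWARE_PROPERTIES:
        value = record.get(name)
        if value:  # skip properties the record does not provide
            doc[name] = value
    # Item 2.2: the version DOI travels as an identifier, not as the origin.
    if record.get("version_doi"):
        doc["identifier"] = record["version_doi"]
    # The description is sent as is.
    if record.get("description"):
        doc["description"] = record["description"]
    return doc


if __name__ == "__main__":
    example = {
        "codeRepository": "https://github.com/example/project",  # placeholder
        "softwareVersion": "1.2.0",
        "description": "Release notes text copied from the record.",
        "version_doi": "https://doi.org/10.1234/example.2",  # placeholder DOI
    }
    print(json.dumps(record_to_codemeta(example), indent=2))
```

Properties that are absent from the record are left out instead of being sent as empty values, which keeps the mapping usable before the custom properties feature exists.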
Development of the InvenioRDM SWH integration was paused until the 6.0 LTS release of InvenioRDM on the 5th August 2021.
WIP in stage 2:
Link new versions of previous deposits to those previous deposits in SWH by using add_to_origin
Send the correct HTTP header to enable us to send replacement metadata when the metadata of a record is updated in InvenioRDM.
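The origin linking in the first item could look roughly like the sketch below, which builds the Atom entry for a deposit. The namespace URI and the `create_origin` / `add_to_origin` element names reflect my understanding of the SWH deposit metadata format and should be checked against the swh-deposit documentation; the function name and the DOI are placeholders. The HTTP header from the second item is deliberately left out.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
# Assumed SWH deposit namespace; verify against the swh-deposit docs.
SWH_NS = "https://www.softwareheritage.org/schema/2018/deposit"

ET.register_namespace("atom", ATOM_NS)
ET.register_namespace("swh", SWH_NS)


def build_entry(origin_url: str, first_version: bool) -> str:
    """Return an Atom entry declaring the origin of a deposit.

    first_version=True  -> create a new origin (e.g. the concept DOI)
    first_version=False -> attach a new version to the existing origin
    """
    entry = ET.Element(f"{{{ATOM_NS}}}entry")
    deposit = ET.SubElement(entry, f"{{{SWH_NS}}}deposit")
    tag = "create_origin" if first_version else "add_to_origin"
    link = ET.SubElement(deposit, f"{{{SWH_NS}}}{tag}")
    ET.SubElement(link, f"{{{SWH_NS}}}origin", {"url": origin_url})
    return ET.tostring(entry, encoding="unicode")


if __name__ == "__main__":
    concept_doi = "https://doi.org/10.1234/example"  # placeholder concept DOI
    print(build_entry(concept_doi, first_version=True))
    print(build_entry(concept_doi, first_version=False))
```

Using the same origin URL for every version is what ties new versions of a record back to the previous deposits.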
InvenioRDM’s delivery timeline has slipped repeatedly, and has not been a stable integration target for much of the timeframe of this project.
Custom properties were descoped from the LTS release's metadata schema in June 2021. It had been recommended that we use custom properties for CodeMeta metadata such as operatingSystem and developmentStatus, but this is now difficult to achieve until custom property support is implemented.
Support for extension metadata was later removed from the LTS release's record metadata schema, even though we had already built our deposit-process-related metadata on top of it.
Going forward, Cottage Labs require clarity from the InvenioRDM project on a suitable and sustainable approach to take for these outstanding issues:
Storing SWH deposit-related metadata on or alongside the record
Exposing that metadata through to the record detail template
Extending the record detail template to display SWH deposit-related metadata
B. CERN's update:
Lars is keen to help get the SWH work finished, and accepts that, with features changing underneath Cottage Labs, the InvenioRDM team will shoulder some of that burden.
Work should be directed at the LTS release, as stability can't be guaranteed otherwise; the LTS should also be well supported, with bugs fixed quickly.