PREreview of A maturity model for catalogues of semantic artefacts

Published
DOI: 10.5281/zenodo.8138448
License: CC BY 4.0

Disclaimer: I am a member of the EOSC Metadata and data quality Task Force, but I had not read the paper prior to this review and was not involved in the discussions or the writing of the article. I was working at the Publications Office of the European Union, the publisher of the EU Vocabularies catalogue, which is assessed in the article.

The paper identifies 12 dimensions for measuring the maturity level of semantic artefact catalogues and applies them to 26 such catalogues. The dimensions cover a wide range of relevant aspects, from the use of persistent identifiers and metadata schemas to community engagement and governance. Additional sub-criteria are defined for each dimension, resulting in a comprehensive tool for assessing the catalogues.

Before presenting the evaluation criteria, the paper provides a literature review with a twofold aim. The first aim is to define semantic artefact catalogues as well as semantic artefacts. I found this section very useful because there is indeed a need for greater coherence and for shared definitions of the basic tools and concepts used in this field (e.g. "vocabulary" and "ontology" are both commonly used to describe owl:Ontology assets). The second aim of the literature review is to identify the above-mentioned dimensions.

The workflow for the assessment of the maturity level of the 26 selected catalogues is documented clearly and a link is provided for consulting the produced data.

Overall, the paper offers highly pertinent analysis, metrics and examples for the assessment it undertakes.

Major issues

The only point that I believe merits more consideration and is not fully resolved in the paper is the definition of semantic artefacts. Although this is not its core subject, the article rightly identifies the lack of clarity and common understanding around the formalisations of semantic artefacts, but it stops short of attempting to provide a definition. For example, when addressing the various possible formats (or rather, serialisations) that semantic artefacts may have, it simply cites D2.5 FAIR Semantics Recommendations Second Iteration (2020), which in turn only considers RDF or RDF-like assets (p. 4). However, many common semantic artefacts use non-RDF formats such as YAML, XSD, and XML, which, it may be argued, is one of the issues that makes interoperability more complex.

Minor issues

The assessment of the EU Vocabularies catalogue, of which I have privileged knowledge (see disclaimer above), misses some features of the catalogue. For example, the catalogue is assessed negatively on the Alignment sub-criterion, although alignments are available both as additional resources (cf. https://op.europa.eu/en/web/eu-vocabularies/dataset/-/resource?uri=http://publications.europa.eu/resource/dataset/eurovoc under the Alignments tab) and as internal mappings (cf. https://europa.eu/!jvPVbb). Other sub-criteria for which a positive score was not awarded include documented curation (diff files and version release notes are available for all assets, and a release schedule is available at https://op.europa.eu/en/web/eu-vocabularies/releases). The description of the artefacts (sub-criterion Me custom vocabulary, for which the CDM ontology is used) and the use of persistent identifiers to refer to assets described by the metadata record (sub-criterion Pi metadata record) are also assessed as missing although they are present. The omission of these features from the evaluation may indicate that another dimension is worth evaluating, namely how difficult it is to find these metadata.

Competing interests

The author declares that they have no competing interests.