The research question this paper aims at answering is the following: In an ontology-driven annotation system, can the information extracted from external resources (namely, Wikidata) provide users with useful suggestions in the characterization of entities used for the annotation of documents from historical archives? The context of the research is the PRiSMHA project, in which the main goal is the development of a proof-of-concept prototype ontology-driven system for semantic metadata generation. The assumption behind this effort is that an effective access to historical archives needs a rich semantic knowledge, relying on a domain ontology, that describes the content of archival resources. In the paper, we present a new feature of the annotation system: when characterizing a new entity (e.g., a person), some properties describing it are automatically pre-filled in, and more complex semantic representations (e.g., events the entity is involved in) are suggested; both kinds of suggestions are based on information retrieved from Wikidata. In the paper, we describe the automatic algorithm devised to support the definition of the mappings between the Wikidata semantic model and the PRiSMHA ontology, as well as the process used to extract information from Wikidata and to generate suggestions based on the defined mappings. Finally, we discuss the results of a qualitative evaluation of the suggestions, which provides a positive answer to the initial research question and indicates possible improvements.

Wikidata Support in the Creation of Rich Semantic Metadata for Historical Archives

Davide Colla;Annamaria Goy
;
Marco Leontino;Diego Magro
2021-01-01

Abstract

The research question this paper aims at answering is the following: In an ontology-driven annotation system, can the information extracted from external resources (namely, Wikidata) provide users with useful suggestions in the characterization of entities used for the annotation of documents from historical archives? The context of the research is the PRiSMHA project, in which the main goal is the development of a proof-of-concept prototype ontology-driven system for semantic metadata generation. The assumption behind this effort is that an effective access to historical archives needs a rich semantic knowledge, relying on a domain ontology, that describes the content of archival resources. In the paper, we present a new feature of the annotation system: when characterizing a new entity (e.g., a person), some properties describing it are automatically pre-filled in, and more complex semantic representations (e.g., events the entity is involved in) are suggested; both kinds of suggestions are based on information retrieved from Wikidata. In the paper, we describe the automatic algorithm devised to support the definition of the mappings between the Wikidata semantic model and the PRiSMHA ontology, as well as the process used to extract information from Wikidata and to generate suggestions based on the defined mappings. Finally, we discuss the results of a qualitative evaluation of the suggestions, which provides a positive answer to the initial research question and indicates possible improvements.
2021
11
10
1
34
https://www.mdpi.com/2076-3417/11/10/4378
semantic metadata generation, linked data, ontologies, digital curation, historical archives, Digital Cultural Heritage, information exploration
Davide Colla, Annamaria Goy, Marco Leontino, Diego Magro
File in questo prodotto:
File Dimensione Formato  
applsci-11-04378.pdf

Accesso aperto

Descrizione: articolo
Tipo di file: PDF EDITORIALE
Dimensione 4.36 MB
Formato Adobe PDF
4.36 MB Adobe PDF Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/2318/1788778
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 5
  • ???jsp.display-item.citation.isi??? 4
social impact