MultiAligNet: Cross-lingual Knowledge Bridges Between Words and Senses

Grasso, F; Lovera Rulfi, V; Di Caro, L
2022-01-01

Abstract

Numerous NLP applications rely on access to multilingual, diversified, context-sensitive, and broadly shared lexical-semantic information. Standard lexical resources tend to first encode monolithic, language-bound senses, which are only subsequently translated and linked across repositories and languages. In this paper, we propose a novel approach to representing lexical-semantic knowledge in multiple languages, and shared among them from the outset, based on the idea of the k-Multilingual Concept (MCk). MCks consist of multilingual alignments of semantically equivalent words in k different languages, which are generated through a defined linguistic context and linked via empirically determined semantic relations without any sense disambiguation process. The MCk model makes it possible to uncover novel layers of lexical knowledge in the form of multifaceted conceptual links between naturally disambiguated sets of words. We first present the conceptualization of MCks, along with the word alignment methodology that generates them. Second, we describe a large-scale automatic acquisition of MCks in English, Italian, and German based on corpora. Finally, we introduce MultiAligNet, an original lexical resource built from the data gathered in the extraction task. Qualitative and quantitative assessments of the generated knowledge demonstrate both the quality and the novelty of the proposed model.
Year: 2022
Conference: 23rd International Conference on Knowledge Engineering and Knowledge Management
Location: Bolzano, Italy
Dates: 26–29 September 2022
Book title: Knowledge Engineering and Knowledge Management. EKAW 2022
Publisher: Springer
Volume: 13514
Pages: 36–50
ISBN: 978-3-031-17104-8; 978-3-031-17105-5
URL: https://link.springer.com/chapter/10.1007/978-3-031-17105-5_3
Keywords: Lexical Semantics; Multilingual alignments
Files in this item:
MultiAligNet.pdf
Access: open access
Description: publisher's version
File type: publisher's PDF
Size: 569.01 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2318/1891705