CINECA IRIS Institutional Research Information System

MultiAligNet: Cross-lingual Knowledge Bridges Between Words and Senses

Grasso, F; Lovera Rulfi, V; Di Caro, L

2022-01-01

Abstract

Numerous NLP applications rely on access to multilingual, diversified, context-sensitive, and broadly shared lexical-semantic information. Standard lexical resources tend to first encode monolithic, language-bound senses, which are only later translated and linked across repositories and languages. In this paper, we propose a novel approach to representing lexical-semantic knowledge in multiple languages, and shared among them from the outset, based on the idea of the k-Multilingual Concept (MCk). MCks are multilingual alignments of semantically equivalent words in k different languages; they are generated through a defined linguistic context and linked via empirically determined semantic relations, without any sense disambiguation process. The MCk model makes it possible to uncover novel layers of lexical knowledge in the form of multifaceted conceptual links between naturally disambiguated sets of words. We first present the conceptualization of MCks, along with the word alignment methodology that generates them. Second, we describe a large-scale automatic acquisition of MCks in English, Italian, and German, based on corpora. Finally, we introduce MultiAligNet, an original lexical resource built from the data gathered in the extraction task. Qualitative and quantitative assessments of the generated knowledge demonstrate both the quality and the novelty of the proposed model.

Year: 2022
Publication language: English
Invited or contributed: contributed
Event type: 1 - Conference
Event title: 23rd International Conference on Knowledge Engineering and Knowledge Management
Event location: Bolzano, Italy
Event date: September 2022
Event scope: International
Volume title: International Conference on Knowledge Engineering and Knowledge Management (EKAW)
Referee: Scientific committee
Publisher name: SPRINGER INTERNATIONAL PUBLISHING AG
Publisher city: GEWERBESTRASSE 11, CHAM, CH-6330, SWITZERLAND
Publisher country: GERMANY
Volume no.: 13514
Pages (from): 36
Pages (to): 50
Number of pages: 15
Series title (if ISSN present): LECTURE NOTES IN COMPUTER SCIENCE
ISBN: 978-3-031-17104-8; 978-3-031-17105-5
ISI WoS code: WOS:000869768200003
Scopus code: 2-s2.0-85140444102
DOI: https://dx.doi.org/10.1007/978-3-031-17105-5_3
Other information: Awarded as Best paper
Keywords: Lexical Semantics; Multilingual alignments
Co-authors affiliated with foreign institutions: no
Product compliant with the University's open access regulations?: 1 – product with an Open Access file (I will attach the file at step 6 - Upload)
Number of authors: 3
Type: info:eu-repo/semantics/conferenceObject
Type: 04-CONTRIBUTION IN CONFERENCE PROCEEDINGS::04A-Conference paper in volume
All authors: Grasso, F; Lovera Rulfi, V; Di Caro, L
Faculty site type: 273
Fulltext: open
Appears in types: 04A-Conference paper in volume

Files in this item:

File	Size	Format
EKAW2022_MultiAlignet-2.pdf (Open Access since 27/09/2023; description: postprint; file type: POSTPRINT, author's final version)	419.32 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2318/1891705
