
Novel metrics for computing semantic similarity with sense embeddings

Davide Colla;Enrico Mensa;Daniele P. Radicioni
2020-01-01

Abstract

In recent years, considerable effort has gone into building word embeddings, a representational device in which word meanings are described by dense unit vectors of real numbers in a continuous, high-dimensional Euclidean space, where similarity can be interpreted as a metric. Subsequently, sense-level embeddings were proposed to describe the meaning of senses rather than terms. More recently, additional intermediate representations have been designed that provide a vector description for ⟨term, sense⟩ pairs and map both term and sense descriptions onto a shared semantic space. Surprisingly, however, this wealth of approaches and resources has not been matched by a parallel refinement of the metrics used to compute semantic similarity: to date, the semantic similarity between two input entities is mostly computed by maximizing some angular measure over vector pairs, typically cosine similarity. In this work we introduce two novel similarity metrics for comparing sense-level representations, and show that by exploiting the features of sense embeddings it is possible to substantially improve on existing strategies, obtaining higher correlation with human similarity ratings. Additionally, we argue that semantic similarity needs to be complemented by another task: identifying the senses underlying the similarity rating. We experimentally verified that the proposed metrics are beneficial for both the semantic similarity task and the sense identification task. The experimentation also provides a detailed how-to illustrating how six important sets of sense embeddings can be used to implement the proposed similarity metrics.
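To make the baseline mentioned in the abstract concrete, the following is a minimal sketch of the usual strategy: score two terms by the maximum cosine similarity over all pairs of their sense vectors, and report the pair that achieved it. It does not implement the two metrics proposed in the paper, which this record does not describe. The function names, the sense identifiers and the three-dimensional toy vectors are all hypothetical and chosen only for illustration.

```python
import math
from itertools import product


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def max_sense_similarity(senses_1, senses_2):
    """Return (score, sense_id_1, sense_id_2) for the most similar pair of senses.

    Each argument maps a sense identifier to its embedding vector.
    Returns None if either term has no senses.
    """
    best = None
    for (id_1, vec_1), (id_2, vec_2) in product(senses_1.items(), senses_2.items()):
        score = cosine(vec_1, vec_2)
        if best is None or score > best[0]:
            best = (score, id_1, id_2)
    return best


# Hypothetical toy sense vectors, not taken from any real embedding resource.
bank = {"bank#finance": [0.9, 0.1, 0.0], "bank#river": [0.0, 0.2, 0.9]}
shore = {"shore#land": [0.1, 0.3, 0.9]}

score, sense_1, sense_2 = max_sense_similarity(bank, shore)
print(sense_1, sense_2, round(score, 3))
```

Returning the winning sense pair alongside the score is a design choice that mirrors the sense identification task the abstract argues should accompany similarity rating; a real evaluation would load one of the sense embedding resources instead of the toy dictionaries used here.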
Year: 2020
Volume: 206
First page: 106346
Last page: 106360
https://www.sciencedirect.com/science/article/abs/pii/S0950705120305025
Semantic similarity, Semantic similarity metrics, Sense embeddings, Word embeddings, Sense identification, Lexical semantics, Cognitively plausible similarity metrics
Files in this record:
colla2020novel.pdf
Access: restricted
Description: file available on the publisher's website
File type: publisher's PDF
Size: 713.58 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2318/1762842
Citations
  • PubMed Central: not available
  • Scopus: 11
  • Web of Science (ISI): 6