Collaborative tagging systems have nowadays become important data sources for populating semantic web applications. For tasks like synonym detection and discovery of concept hierarchies, many researchers introduced measures of tag similarity. Even though most of these measures appear very natural, their design often seems to be rather ad hoc, and the underlying assumptions on the notion of similarity are not made explicit. A more systematic characterization and validation of tag similarity in terms of formal representations of knowledge is still lacking. Here we address this issue and analyze several measures of tag similarity: Each measure is computed on data from the social bookmarking system del.icio.us and a semantic grounding is provided by mapping pairs of similar tags in the folksonomy to pairs of synsets in Wordnet, where we use validated measures of semantic distance to characterize the semantic relation between the mapped tags. This exposes important features of the investigated similarity measures and indicates which ones are better suited in the context of a given semantic application.

Semantic Grounding of Tag Relatedness in Social Bookmarking Systems

CATTUTO C;
2008-01-01

Abstract

Collaborative tagging systems have nowadays become important data sources for populating semantic web applications. For tasks like synonym detection and discovery of concept hierarchies, many researchers introduced measures of tag similarity. Even though most of these measures appear very natural, their design often seems to be rather ad hoc, and the underlying assumptions on the notion of similarity are not made explicit. A more systematic characterization and validation of tag similarity in terms of formal representations of knowledge is still lacking. Here we address this issue and analyze several measures of tag similarity: Each measure is computed on data from the social bookmarking system del.icio.us and a semantic grounding is provided by mapping pairs of similar tags in the folksonomy to pairs of synsets in Wordnet, where we use validated measures of semantic distance to characterize the semantic relation between the mapped tags. This exposes important features of the investigated similarity measures and indicates which ones are better suited in the context of a given semantic application.
2008
ISWC 2008
Karlsruhe, Germania
26-28 ottobre 2008
The Semantic Web - ISWC 2008
Springer
LNCS 5318
615
631
978-3540885634
https://link.springer.com/chapter/10.1007/978-3-540-88564-1_39
CATTUTO C; D. Benz; A. Hotho; G. Stumme
File in questo prodotto:
File Dimensione Formato  
10.1007%2F978-3-540-88564-1.pdf

Accesso riservato

Dimensione 1.41 MB
Formato Adobe PDF
1.41 MB Adobe PDF   Visualizza/Apri   Richiedi una copia

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/2318/1730958
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 149
  • ???jsp.display-item.citation.isi??? 78
social impact