
Mining Annotator Perspectives from Hate Speech Corpora

Fell M.; Akhtar S.; Basile V.
2021-01-01

Abstract

Disagreement in annotation, traditionally treated mostly as noise, is increasingly considered a source of valuable information. We investigate a particular form of disagreement that occurs when the focus of an annotated dataset is a subjective and controversial phenomenon, inducing a certain degree of polarization among the annotators' judgments. We argue that this polarization is indicative of the conflicting perspectives held by different annotator groups, and propose a quantitative method to model the phenomenon. Moreover, we introduce a method to automatically identify shared perspectives stemming from a common background. We test our method on several corpora in English and Italian, manually annotated for hate speech content, validating prior knowledge about the groups of annotators when available, and discovering characteristic traits among annotators with unknown background. We found several precisely defined perspectives, described in terms of increased sensitivity towards textual content expressing attitudes such as xenophobia, islamophobia, and homophobia.
Year: 2021
Event: 5th Workshop on Natural Language for Artificial Intelligence, NL4AI 2021 (Online)
Published in: Proceedings of the Fifth Workshop on Natural Language for Artificial Intelligence (NL4AI 2021)
Editors: Elena Cabrio, Danilo Croce, Lucia C. Passaro, Rachele Sprugnoli
Volume: 3015
Pages: 1-15
Annotator bias; Hate speech; Linguistic annotation; Perspective identification; Polarization of opinions
Files in this item:
  • paper136.pdf — Open access, publisher's PDF, 454.08 kB

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2318/2121984
Citations:
  • Scopus: 1