
Mining Annotator Perspectives from Hate Speech Corpora

Fell M.; Akhtar S.; Basile V.
2021-01-01

Abstract

Disagreement in annotation, traditionally treated mostly as noise, is increasingly considered a source of valuable information. We investigate a particular form of disagreement that occurs when the focus of an annotated dataset is a subjective and controversial phenomenon, inducing a certain degree of polarization among the annotators' judgments. We argue that this polarization is indicative of the conflicting perspectives held by different annotator groups, and propose a quantitative method to model the phenomenon. Moreover, we introduce a method to automatically identify shared perspectives stemming from a common background. We test our method on several corpora in English and Italian, manually annotated for hate speech content, validating prior knowledge about the groups of annotators when available, and discovering characteristic traits among annotators with unknown background. We found several precisely defined perspectives, described in terms of increased sensitivity towards textual content expressing attitudes such as xenophobia, islamophobia, and homophobia.
Year: 2021
Event: 5th Workshop on Natural Language for Artificial Intelligence, NL4AI 2021 (Online)
Published in: Proceedings of the Fifth Workshop on Natural Language for Artificial Intelligence (NL4AI 2021)
Editors: Elena Cabrio, Danilo Croce, Lucia C. Passaro, Rachele Sprugnoli
Volume: 3015
Pages: 1-15
Annotator bias; Hate speech; Linguistic annotation; Perspective identification; Polarization of opinions
Files in this item:
  • paper136.pdf — Open access, publisher's PDF, 454.08 kB

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2318/2121984
Citations:
  • Scopus: 1