CINECA IRIS Institutional Research Information System

Modeling Annotator Perspective and Polarized Opinions to Improve Hate Speech Detection

Sohail Akhtar;Valerio Basile;Viviana Patti

2020-01-01

Abstract

In this paper we propose an approach to exploit the fine-grained knowledge expressed by individual human annotators during a hate speech (HS) detection task, before the aggregation of single judgments into a gold standard dataset eliminates non-majority perspectives. We automatically divide the annotators into groups, aiming to group them by similar personal characteristics (ethnicity, social background, culture, etc.). To serve a multilingual perspective, we performed classification experiments on three different Twitter datasets in English and Italian. We created different gold standards, one for each group, and trained a state-of-the-art deep learning model on them, showing that supervised models informed by different perspectives on the target phenomena outperform a baseline represented by models trained on fully aggregated data. Finally, we implemented an ensemble approach that combines the single perspective-aware classifiers into an inclusive model. The results show that this strategy further improves classification performance, with a particularly significant boost in the recall of HS prediction.
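The pipeline described in the abstract (per-group gold standards, then an ensemble of perspective-aware classifiers) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy annotations, the hand-assigned groups (the paper divides annotators automatically), the majority-vote aggregation with ties resolved toward HS, and the "flag as HS if any group says HS" combination rule. The abstract does not specify these details, and the group gold labels stand in for the outputs of trained classifiers.

```python
from collections import Counter

# Hypothetical toy data: annotations[item][annotator] = 1 (HS) or 0 (not HS).
annotations = {
    "t1": {"a1": 1, "a2": 1, "a3": 0, "a4": 0, "a5": 0},
    "t2": {"a1": 0, "a2": 0, "a3": 0, "a4": 0, "a5": 0},
    "t3": {"a1": 1, "a2": 1, "a3": 1, "a4": 1, "a5": 0},
}

# Hypothetical grouping, assigned by hand here for simplicity.
groups = {"g1": ["a1", "a2"], "g2": ["a3", "a4", "a5"]}


def majority_label(votes):
    """Return the majority label; ties resolve to 1 (HS), an arbitrary choice."""
    counts = Counter(votes)
    return 1 if counts[1] >= counts[0] else 0


def gold_standard(annotations, annotators):
    """Aggregate only the votes cast by the given annotators, item by item."""
    return {
        item: majority_label([votes[a] for a in annotators if a in votes])
        for item, votes in annotations.items()
    }


def ensemble_any(predictions):
    """Assumed combination rule: HS if any per-group prediction is HS."""
    return 1 if any(p == 1 for p in predictions) else 0


# Baseline: one gold standard built from all annotators together.
all_annotators = [a for members in groups.values() for a in members]
aggregated_gold = gold_standard(annotations, all_annotators)

# One gold standard per group of annotators.
group_gold = {g: gold_standard(annotations, m) for g, m in groups.items()}

# Stand-in for classifier outputs: the group gold labels act as predictions.
inclusive = {
    item: ensemble_any([group_gold[g][item] for g in groups])
    for item in annotations
}

print(aggregated_gold)
print(group_gold)
print(inclusive)
```

In this toy data, item `t1` is judged as HS only by the smaller group, so full aggregation discards that judgment while the per-group gold standard for `g1` retains it. Under the assumed "any" rule, the combined label can only gain HS predictions, which is one simple way an ensemble could raise recall.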

Year: 2020
Event title: AAAI Conference on Human Computation and Crowdsourcing
Event location: Hilversum
Event date: 25-29 October 2020
Volume title: Proceedings of the AAAI Conference on Human Computation and Crowdsourcing
Publisher: Association for the Advancement of Artificial Intelligence
Volume no.: 8
Pages (from): 151
Pages (to): 154
ISBN: 978-1-57735-848-0
Product URL (open-access archives, full text on publisher's site, etc.): https://ojs.aaai.org/index.php/HCOMP/article/view/7473
All authors: Sohail Akhtar, Valerio Basile, Viviana Patti
Appears in types: 04A-Conference paper in volume

Files in this item:

File	Access	Description	File type	Size	Format
7473-Article Text-10855-1-10-20200925.pdf	Open access	Main article	Publisher's PDF	442.6 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2318/1759758
