HurtBERT: Incorporating Lexical Features with BERT for the Detection of Abusive Language

Pamungkas, Endang Wahyu; Basile, Valerio; Patti, Viviana
2020-01-01

Abstract

The detection of abusive or offensive remarks in social texts has received significant attention in research. In several related shared tasks, BERT has been shown to be the state-of-the-art. In this paper, we propose to utilize lexical features derived from a hate lexicon towards improving the performance of BERT in such tasks. We explore different ways to utilize the lexical features in the form of lexicon-based encodings at the sentence level or embeddings at the word level. We provide an extensive dataset evaluation that addresses in-domain as well as cross-domain detection of abusive content to render a complete picture. Our results indicate that our proposed models combining BERT with lexical features help improve over a baseline BERT model in many of our in-domain and cross-domain experiments.
Year: 2020
Conference: Fourth Workshop on Online Abuse and Harms
Location: Online
Date: November 2020
Published in: Proceedings of the Fourth Workshop on Online Abuse and Harms
Publisher: Association for Computational Linguistics
Pages: 34-43
URL: https://www.aclweb.org/anthology/2020.alw-1.5
Keywords: abusive language detection, linguistically informed deep learning, social media
Authors: Koufakou, Anna; Pamungkas, Endang Wahyu; Basile, Valerio; Patti, Viviana
Files in this item:
2020.alw-1.5.pdf

Access: open access
File type: publisher's PDF
Format: Adobe PDF
Size: 444.21 kB

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2318/1769037