Body-Shaming Detection and Classification in Italian Social Media
Grasso F.; Valese A.; Micheli M.
2024-01-01
Abstract
In recent decades, the Natural Language Processing (NLP) community has shown a sustained commitment to addressing societal challenges, particularly in the realm of hate-speech detection. Despite these advancements, hate speech persists, especially online, where users of social network platforms often find themselves in unsafe and possibly harmful environments. Among the various manifestations of hate speech and offensive language, one aspect that has been overlooked by the NLP community is body-shaming. Despite its prevalence among hateful users and its potential to harm a diverse group of individuals, from women to people with disabilities, efforts to counteract this damaging phenomenon remain limited. In this work, we first introduce a novel taxonomy designed to distinguish and classify instances of body-shaming by the targeted group. Following this, we present a dataset of Instagram comments for body-shaming detection and classification in the Italian language, which has been manually annotated according to the taxonomy. After detailing the data-gathering and annotation process, we present a classification benchmark using three BERT-based models to showcase our dataset’s classification potential. The results show good performance in detecting body-shaming instances across several categories of our proposed taxonomy.

File | Size | Format |
---|---|---|
73.pdf (open access; description: paper; file type: publisher's PDF) | 285.19 kB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.