Body-Shaming Detection and Classification in Italian Social Media

Grasso F.; Valese A.; Micheli M.
2024-01-01

Abstract

In recent decades, the Natural Language Processing (NLP) community has shown a sustained commitment to addressing societal challenges, particularly hate-speech detection. Despite these advances, hate speech and offensive language persist, especially online, where users of social network platforms often find themselves in unsafe and potentially harmful environments. Among the various manifestations of hate speech and offensive language, one that the NLP community has overlooked is body-shaming. Despite its prevalence among hateful users and its potential to harm a diverse range of individuals, from women to people with disabilities, efforts to counteract this damaging phenomenon remain limited. In this work, we first introduce a novel taxonomy designed to distinguish and classify instances of body-shaming by targeted group. We then present a dataset of Instagram comments for body-shaming detection and classification in Italian, manually annotated according to the taxonomy. After detailing the data-gathering and annotation process, we present a classification benchmark using three BERT-based models to showcase our dataset’s classification potential. The results show good performance in detecting body-shaming instances across several categories of the proposed taxonomy.
2024
29th International Conference on Natural Language and Information Systems (NLDB 2024)
Torino, Italy
19–21 June 2024
Natural Language Processing and Information Systems
Springer Nature Switzerland AG
Volume 14762
Pages 256-270
ISBN 9783031702389; 9783031702396
https://link.springer.com/chapter/10.1007/978-3-031-70239-6_18
Body-Shaming; Hate Speech; Natural Language Processing
Files in this item:
File: Grasso_et_al_bodyshaming_2025.pdf
Access: open access
Description: final post-print version
File type: POSTPRINT (AUTHOR'S FINAL VERSION)
Size: 285.19 kB
Format: Adobe PDF

Use this identifier to cite or link to this document: https://hdl.handle.net/2318/2037964