Stance Detection is the task of automatically determining whether the author of a text is in favor, against, or neutral towards a given target. In this paper we investigate the portability of tools performing this task across different languages, by analyzing the results achieved by a Stance Detection system (i.e. MultiTACOS) trained and tested in a multilingual setting. First of all, a set of resources on topics related to politics for English, French, Italian, Spanish and Catalan is provided which includes: novel corpora collected for the purpose of this study, and benchmark corpora exploited in Stance Detection tasks and evaluation exercises known in literature. We focus in particular on the novel corpora by describing their development and by comparing them with the benchmarks. Second, MultiTACOS is applied with different sets of features especially designed for Stance Detection, with a specific focus to exploring and combining both features based on the textual content of the tweet (e.g., style and affective load) and features based on contextual information that do not emerge directly from the text. Finally, for better highlighting the contribution of the features that most positively affect system performance in the multilingual setting, a features analysis is provided, together with a qualitative analysis of the misclassified tweets for each of the observed languages, devoted to reflect on the open challenges.

Multilingual stance detection in social media political debates

Lai M.;Cignarella A. T.;Bosco C.;Patti V.;
2020-01-01

Abstract

Stance Detection is the task of automatically determining whether the author of a text is in favor, against, or neutral towards a given target. In this paper we investigate the portability of tools performing this task across different languages, by analyzing the results achieved by a Stance Detection system (i.e. MultiTACOS) trained and tested in a multilingual setting. First of all, a set of resources on topics related to politics for English, French, Italian, Spanish and Catalan is provided which includes: novel corpora collected for the purpose of this study, and benchmark corpora exploited in Stance Detection tasks and evaluation exercises known in literature. We focus in particular on the novel corpora by describing their development and by comparing them with the benchmarks. Second, MultiTACOS is applied with different sets of features especially designed for Stance Detection, with a specific focus to exploring and combining both features based on the textual content of the tweet (e.g., style and affective load) and features based on contextual information that do not emerge directly from the text. Finally, for better highlighting the contribution of the features that most positively affect system performance in the multilingual setting, a features analysis is provided, together with a qualitative analysis of the misclassified tweets for each of the observed languages, devoted to reflect on the open challenges.
2020
63
1
27
https://www.sciencedirect.com/science/article/pii/S0885230820300085
Contextual features, Multilingual, Political debates, Stance detection, Twitter
Lai M.; Cignarella A.T.; Hernandez Farias D.I.; Bosco C.; Patti V.; Rosso P.
File in questo prodotto:
File Dimensione Formato  
Multilingual-stance-detection-in-social-media-poli_2020_Computer-Speech---La.pdf

Accesso riservato

Descrizione: articolo principale
Tipo di file: PDF EDITORIALE
Dimensione 847.45 kB
Formato Adobe PDF
847.45 kB Adobe PDF   Visualizza/Apri   Richiedi una copia

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/2318/1739632
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 67
  • ???jsp.display-item.citation.isi??? 40
social impact