CINECA IRIS Institutional Research Information System

AlBERTo: Italian BERT language understanding model for NLP challenging tasks based on tweets

Polignano M.; Basile P.; de Gemmis M.; Semeraro G.; Basile V.

2019-01-01

Abstract

Recent studies in natural language processing (NLP) report that context-dependent, task-free language understanding models such as ELMo, GPT, and BERT are highly effective. In particular, they achieve state-of-the-art performance on numerous complex NLP tasks, such as question answering and sentiment analysis, in English. Given the growing popularity and effectiveness of these models in the scientific community, we trained a BERT language understanding model for Italian (AlBERTo). AlBERTo focuses on the language used in social networks, specifically on Twitter. To demonstrate its robustness, we evaluated AlBERTo on the EVALITA 2016 SENTIPOLC (SENTIment POLarity Classification) task, obtaining state-of-the-art results in subjectivity, polarity, and irony detection on Italian tweets. The pre-trained AlBERTo model will be publicly distributed on GitHub (https://github.com/marcopoli/AlBERTo-it) to facilitate future research.

Year: 2019
Publication language: English
By invitation: contribution
Event type: 1 - Conference
Event title: 6th Italian Conference on Computational Linguistics, CLiC-it 2019
Event location: Bari
Event date: 2019
Volume title: Proceedings of the Sixth Italian Conference on Computational Linguistics (CLiC-it 2019)
Referees: Anonymous experts
Publisher: CEUR
Publisher city: Aachen
Publisher country: GERMANY
Volume number: 2481
First page: 1
Last page: 6
Number of pages: 6
Series title (if ISSN present): CEUR WORKSHOP PROCEEDINGS
Scopus ID: 2-s2.0-85074851349
Co-authors affiliated with foreign institutions: no
Compliant with the University open-access regulation?: 1 – product with an Open Access file (I will attach the file in step 6 - Upload)
Number of authors: 5
Type: info:eu-repo/semantics/conferenceObject
Type: 04-CONTRIBUTION IN CONFERENCE PROCEEDINGS::04A-Conference paper in volume
All authors: Polignano M.; Basile P.; de Gemmis M.; Semeraro G.; Basile V.
Faculty site type: 273
Fulltext: open
Appears in types: 04A-Conference paper in volume

Files in this product:

File	Size	Format
paper57.pdf (open access; description: main article; file type: publisher's PDF)	513.87 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2318/1759767
