GATTINA - GenerAtion of TiTles for Italian News Articles: A CALAMITA Challenge

Francis M.; Rinaldi M.; Gili J.; De Cosmo L.; Iannaccone S.; Nissim M.; Patti V.

2024-01-01

Abstract

We introduce a new benchmark designed to evaluate the ability of Large Language Models (LLMs) to generate Italian-language headlines for science news articles. The benchmark is based on a large dataset of science news articles obtained from Ansa Scienza and Galileo, two important Italian media outlets. Effective headline generation requires more than summarizing article content; headlines must also be informative, engaging, and suitable for the topic and target audience, making automatic evaluation particularly challenging. To address this, we propose two novel transformer-based metrics to assess headline quality. We aim for this benchmark to support the evaluation of Italian LLMs and to foster the development of tools to assist in editorial workflows.


Year: 2024
Event title: 10th Italian Conference on Computational Linguistics, CLiC-it 2024
Event location: Pisa, Italy
Event date: 2024
Volume title: Proceedings of the Tenth Italian Conference on Computational Linguistics (CLiC-it 2024), Pisa, Italy, December 4-6, 2024
Publisher: CEUR-WS
Volume number: 3878
Pages: 1-12
Product URL (open-access archives, full text on the publisher's site, etc.): https://ceur-ws.org/Vol-3878/121_calamita_long.pdf
Keywords: Benchmarking; CALAMITA Challenge; Headline generation; Italian; LLMs; Summarisation
All authors: Francis M.; Rinaldi M.; Gili J.; De Cosmo L.; Iannaccone S.; Nissim M.; Patti V.
Appears in types: 04A-Conference paper in volume

Files in this item:

File	Access	File type	Size	Format
121_calamita_long.pdf	Open access	Publisher's PDF	1.24 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2318/2059281
