A Modular LLM-based Dialog System for Accessible Exploration of Finite State Automata
Stefano Vittorio Porta; Pier Felice Balestrucci; Michael Oliverio; Luca Anselma; Alessandro Mazzei
2025-01-01
Abstract
In the field of assistive technologies, making complex visual content, such as graphs or conceptual maps, accessible to visually impaired users remains a significant challenge. This work proposes a modular dialog system that leverages a combination of neural Natural Language Understanding (NLU) and Retrieval-Augmented Generation (RAG) to translate graphical structures into meaningful text-based interactions. The NLU module combines a fine-tuned BERT classifier for intent recognition with a spaCy-based Named Entity Recognition (NER) model to extract user intents and parameters. The RAG pipeline retrieves relevant subgraphs and contextual information from a knowledge base, reranking and summarizing them via a language model. We evaluate the system across multiple specific tasks, achieving over 92% F1 in both intent classification and NER, and demonstrate that even open-weight models, such as DeepSeek-r1 or LLaMA-3.1, can offer competitive performance compared to GPT-4o in specific domains. Our approach enhances accessibility while maintaining modularity, interpretability, and performance on par with modern LLM architectures.
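The abstract describes a two-stage architecture: a neural NLU module (a fine-tuned BERT intent classifier plus a spaCy NER model) followed by a RAG pipeline that retrieves, reranks, and summarizes subgraph context with a language model. As a rough illustration only, the sketch below shows how such a pipeline could be wired together in Python; the model names are publicly available stand-ins and the knowledge-base format is assumed, since the paper's actual implementation is not part of this record.

```python
# A minimal, illustrative sketch of the pipeline described in the abstract:
# (1) NLU: intent classification with a fine-tuned BERT model plus spaCy NER;
# (2) RAG: retrieve candidate subgraph descriptions from a knowledge base,
#     rank them by similarity, and let a language model summarize the result.
# The model names below are public stand-ins, NOT the authors' fine-tuned
# checkpoints; the knowledge-base format is likewise an assumption.

import spacy
from transformers import pipeline
from sentence_transformers import SentenceTransformer, util

# --- NLU module: intent classifier + NER ------------------------------------
intent_clf = pipeline(
    "text-classification",
    model="distilbert-base-uncased-finetuned-sst-2-english",  # stand-in for the fine-tuned BERT intent classifier
)
ner = spacy.load("en_core_web_sm")  # stand-in for the custom spaCy NER model

def understand(utterance: str) -> dict:
    """Return the predicted intent and the extracted parameters (entities)."""
    intent = intent_clf(utterance)[0]["label"]
    entities = [(ent.text, ent.label_) for ent in ner(utterance).ents]
    return {"intent": intent, "entities": entities}

# --- RAG module: retrieve, rerank, summarize ---------------------------------
embedder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed dense retriever

def retrieve(query: str, kb: list[str], k: int = 3) -> list[str]:
    """Rank textual subgraph descriptions in the knowledge base by cosine similarity."""
    scores = util.cos_sim(embedder.encode(query), embedder.encode(kb))[0]
    ranked = sorted(zip(kb, scores.tolist()), key=lambda pair: pair[1], reverse=True)
    return [text for text, _ in ranked[:k]]

def answer(utterance: str, kb: list[str], llm) -> str:
    """Combine NLU output and retrieved context into a prompt for an LLM summarizer."""
    nlu = understand(utterance)
    context = "\n".join(retrieve(utterance, kb))
    prompt = (
        f"Intent: {nlu['intent']}\nEntities: {nlu['entities']}\n"
        f"Context:\n{context}\n\nAnswer the user's question: {utterance}"
    )
    return llm(prompt)  # llm: any callable wrapping GPT-4o, LLaMA-3.1, DeepSeek-r1, ...
```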
| File | Access | Size | Format |
|---|---|---|---|
| 85_main_long.pdf | Open access | 1.43 MB | Adobe PDF |



