PeRAG: Multi-Modal Perspective-Oriented Verbalization with RAG for Inclusive Decision Making

Muhammad Saad Amin; Horacio Jesus Jarquin Vasquez; Franco Sansonetti; Simona Lo Giudice; Valerio Basile; Viviana Patti
2025-01-01

Abstract

Urban policymakers require comprehensive insights into transportation issues and demographic distributions to design equitable and efficient infrastructure. However, analyzing multi-modal data (numeric and visual) while accounting for diverse perspectives remains challenging. To address this, we propose PeRAG, a novel pipeline combining multi-modal perspective-oriented verbalization with Retrieval-Augmented Generation (RAG). Our approach first converts numeric transportation and demographic data, together with population heatmaps, into natural language descriptions using LLaMA, incorporating multiple policy-relevant perspectives. These verbalizations are then fed into the RAG system to generate context-aware, perspective-driven responses for urban planners. We demonstrate the effectiveness of PeRAG in generating actionable insights for transportation policy, bridging the gap between raw data and decision-making. Our experiments highlight the pipeline’s ability to handle heterogeneous data modalities while adapting to diverse stakeholder viewpoints, offering a scalable solution for smart city analytics.
Year: 2025
Conference: Eleventh Italian Conference on Computational Linguistics (CLiC-it 2025)
Location: Cagliari, Italy
Conference year: 2025
Published in: Proceedings of the Eleventh Italian Conference on Computational Linguistics (CLiC-it 2025)
Series: CEUR Workshop Proceedings
Volume: 4112
Pages: 31-44
ISBN: 979-12-243-0587-3
URL: https://aclanthology.org/2025.clicit-1.5/
Keywords: Multi-modal Verbalization, Retrieval-Augmented Generation (RAG), Perspective-Aware NLP, Large Language Models (LLMs), Urban Transportation Analytics
Muhammad Saad Amin, Horacio Jesus Jarquin Vasquez, Franco Sansonetti, Simona Lo Giudice, Valerio Basile, Viviana Patti
Files in this item:
File: 2025.clicit-1.5.pdf
Access: Open access
File type: Publisher's PDF
Size: 1.27 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2318/2121309