Evaluating Discourse Representation Structure (DRS)-based systems for semantic parsing (Text-to-DRS) and generation (DRS-to-Text) poses unique challenges, particularly in low-resource languages like Urdu. Traditional metrics often fall short, focusing either on structural accuracy or linguistic quality, but rarely capturing both. To address this limitation, we introduce two complementary evaluation methodologies{---}Parse-Generate (PARS-GEN) and Generate-Parse (GEN-PARS){---}designed for a more comprehensive assessment of DRS-based systems. PARS-GEN evaluates the parsing process by converting DRS outputs back to the text, revealing linguistic nuances often missed by structure-focused metrics like SMATCH. Conversely, GEN-PARS assesses text generation by converting generated text into DRS, providing a semantic perspective that complements surface-level metrics such as BLEU, METEOR, and BERTScore. Using the Parallel Meaning Bank (PMB) dataset, we demonstrate our methodology across Urdu, uncovering unique insights into Urdu`s structural and linguistic interplay. Findings show that traditional metrics frequently overlook the complexity of linguistic and semantic fidelity, especially in low-resource languages. Our dual approach offers a robust framework for evaluating DRS-based systems, enhancing semantic parsing and text generation quality.

Evaluating Structural and Linguistic Quality in {U}rdu {DRS} Parsing and Generation through Bidirectional Evaluation

Muhammad Saad Amin;Luca Anselma;Alessandro Mazzei
2025-01-01

Abstract

Evaluating Discourse Representation Structure (DRS)-based systems for semantic parsing (Text-to-DRS) and generation (DRS-to-Text) poses unique challenges, particularly in low-resource languages like Urdu. Traditional metrics often fall short, focusing either on structural accuracy or linguistic quality, but rarely capturing both. To address this limitation, we introduce two complementary evaluation methodologies{---}Parse-Generate (PARS-GEN) and Generate-Parse (GEN-PARS){---}designed for a more comprehensive assessment of DRS-based systems. PARS-GEN evaluates the parsing process by converting DRS outputs back to the text, revealing linguistic nuances often missed by structure-focused metrics like SMATCH. Conversely, GEN-PARS assesses text generation by converting generated text into DRS, providing a semantic perspective that complements surface-level metrics such as BLEU, METEOR, and BERTScore. Using the Parallel Meaning Bank (PMB) dataset, we demonstrate our methodology across Urdu, uncovering unique insights into Urdu`s structural and linguistic interplay. Findings show that traditional metrics frequently overlook the complexity of linguistic and semantic fidelity, especially in low-resource languages. Our dual approach offers a robust framework for evaluating DRS-based systems, enhancing semantic parsing and text generation quality.
2025
Proceedings of the First Workshop on Natural Language Processing for Indo-Aryan and Dravidian Languages
Association for Computational Linguistics
33
43
979-8-89176-214-5
https://aclanthology.org/2025.indonlp-1.4/
semantic parsing, text generation, Discourse Representation Structure, Parallel Meaning Bank
Muhammad Saad Amin, Luca Anselma, Alessandro Mazzei
File in questo prodotto:
File Dimensione Formato  
2025.indonlp-1.4.pdf

Accesso aperto

Dimensione 538.66 kB
Formato Adobe PDF
538.66 kB Adobe PDF Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/2318/2052694
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 1
  • ???jsp.display-item.citation.isi??? ND
social impact