CINECA IRIS Institutional Research Information System

Towards Data Augmentation for DRS-to-Text Generation

Amin M. S.; Mazzei A.; Anselma L.

2022-01-01

Abstract

Data augmentation is becoming increasingly popular in Natural Language Generation (NLG). Various approaches have been used in NLP and NLG to augment data and increase the number of training examples for neural models. Yet no studies have performed augmentation on logical input, i.e., Discourse Representation Structures (DRS). We present data augmentation for DRS taken from the Parallel Meaning Bank (PMB) corpus, for the DRS-to-Text generation task. We conducted our experiments on a standard bi-LSTM-based sequence-to-sequence model, thus creating an end-to-end neural approach for generating English sentences from DRS. We evaluated the output generated by word-level and character-level decoders using the reference-based evaluation metrics BLEU, ROUGE, METEOR, NIST, and CIDEr. Using augmented DRS achieved better results than using DRS without augmentation. To assess the significance of this improvement, we conducted statistical tests, namely the Shapiro-Wilk test (to check data normality) and the Wilcoxon test (to test model significance). The Wilcoxon results indicate that our model is significantly better, with p-value = 2.37e-05 for the character-level model and p-value = 7.78e-07 for the word-level model.
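
As an illustration of the significance testing mentioned in the abstract, the following is a minimal Python sketch of a two-sided Wilcoxon signed-rank test on paired scores. It is not the authors' code: the abstract does not say which Wilcoxon variant or software was used, so the paired signed-rank form, the normal approximation (without continuity or tie correction), the function name `wilcoxon_signed_rank`, and the example scores are all assumptions made for illustration.

```python
import math
from statistics import NormalDist


def wilcoxon_signed_rank(x, y):
    """Two-sided Wilcoxon signed-rank test (normal approximation).

    Returns (w, p), where w is the smaller of the positive and negative
    rank sums. Zero differences are dropped and tied absolute differences
    receive average ranks. No continuity or tie correction is applied.
    """
    if len(x) != len(y):
        raise ValueError("paired samples must have equal length")
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    # Rank the absolute differences, averaging ranks over ties.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    w = min(w_plus, w_minus)

    # Under the null hypothesis, w is approximately normal for larger n.
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w - mean) / sd  # z <= 0 because w is the smaller rank sum
    p = min(1.0, 2 * NormalDist().cdf(z))
    return w, p


# Hypothetical per-sentence scores (NOT taken from the paper).
baseline_scores = [0.41, 0.38, 0.52, 0.47, 0.35, 0.44, 0.50, 0.39, 0.46, 0.43]
augmented_scores = [0.45, 0.40, 0.58, 0.49, 0.41, 0.43, 0.57, 0.47, 0.51, 0.46]

w, p = wilcoxon_signed_rank(baseline_scores, augmented_scores)
print(f"W = {w}, p = {p:.4g}")
```

The normal approximation is only reasonable for larger samples; for small samples or a real analysis, an exact test from an established statistics package would be preferable.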

Year: 2022
Event title: Workshop on Natural Language for Artificial Intelligence
Event location: Udine
Event date: November 30, 2022
Volume title: Proceedings of the Sixth Workshop on Natural Language for Artificial Intelligence (NL4AI 2022) co-located with 21th International Conference of the Italian Association for Artificial Intelligence (AI*IA 2022)
Publisher: CEUR-WS
Volume number: 3287
Pages (from): 141
Pages (to): 152
Product URL (open-access archives, full text on publisher's site, etc.): https://ceur-ws.org/Vol-3287/paper14.pdf
Keywords: Bi-LSTM; Data Augmentation; DRS-to-Text Generation; Neural Network; Parallel Meaning Bank (PMB); Shapiro-Wilk Test; Statistical Significance Test; Wilcoxon Test
All authors: Amin M.S.; Mazzei A.; Anselma L.
Appears in types: 04A-Conference paper in volume

Files in this product:

File	Size	Format
paper14.pdf (open access)	851.69 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2318/1887628
