CINECA IRIS Institutional Research Information System

Unsupervised and supervised text similarity systems for automated identification of national implementing measures of European directives

Nanda, Rohan; Siragusa, Giovanni; Di Caro, L.; Boella, G.; Grossio, L.; Gerbaudo, Marco; Costamagna, F.

2019-01-01

Abstract

The automated identification of national implementing measures (NIMs) of European directives by text similarity techniques has shown promising preliminary results. Previous work has proposed and used unsupervised lexical and semantic similarity techniques based on vector space models, latent semantic analysis and topic models. However, these techniques were evaluated only on a small multilingual corpus of directives and NIMs. In this paper, we use word and paragraph embedding models, learned by shallow neural networks from a multilingual legal corpus of European directives and national legislation (from Ireland, Luxembourg and Italy), to develop unsupervised semantic similarity systems that identify transpositions. We evaluate these models and compare their results with the previous unsupervised methods on a multilingual test corpus of 43 directives and their corresponding NIMs. We also develop supervised machine learning models to identify transpositions and compare their performance across different feature sets.
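To make the vector space baseline mentioned in the abstract concrete, the following is a minimal sketch of ranking candidate national provisions against a directive by TF-IDF cosine similarity. It is not the authors' implementation: the function names (`tokenize`, `tfidf_vectors`, `cosine`, `rank_candidates`), the smoothed IDF formula, and the toy English texts are all assumptions made for illustration, and the embedding-based and supervised systems described in the paper are not reproduced here.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Hypothetical tokenizer: lowercase alphabetic tokens only.
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    # Build sparse TF-IDF vectors (dicts) with a smoothed IDF (an assumption).
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    df = Counter(term for toks in tokenized for term in set(toks))
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    return [{t: tf * idf[t] for t, tf in Counter(toks).items()} for toks in tokenized]


def cosine(a, b):
    # Cosine similarity between two sparse vectors; 0.0 if either is empty.
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_candidates(directive, candidates):
    # Return (candidate_index, score) pairs, most similar first.
    vecs = tfidf_vectors([directive] + candidates)
    scores = [(i, cosine(vecs[0], v)) for i, v in enumerate(vecs[1:])]
    return sorted(scores, key=lambda s: s[1], reverse=True)


# Toy, made-up texts (not taken from the paper's corpus).
directive = "protection of consumers in distance contracts"
candidates = [
    "regulation on the protection of consumers in distance contracts and sales",
    "road traffic speed limits for heavy vehicles",
]
ranked = rank_candidates(directive, candidates)
```

In this sketch the highest-scoring candidate would be treated as the likely transposition. A real system would also need a decision threshold and language-specific preprocessing, neither of which is shown.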

Year: 2019
Journal title: Artificial Intelligence and Law
Volume: 27
Issue: 2
Pages (from): 199
Pages (to): 225
DOI: https://dx.doi.org/10.1007/s10506-018-9236-y
Product URL (open-access archives, full text on the publisher's site, etc.): www.kluweronline.com/issn/0924-8463/
Keywords: Machine learning; Text similarity; Transposition
All authors: Nanda R.; Siragusa G.; Di Caro L.; Boella G.; Grossio L.; Gerbaudo M.; Costamagna F.
Appears in types: 03A-Journal article

Files in this product:

File	Access	File type	Size	Format
Nanda2019_Article_UnsupervisedAndSupervisedTextS.pdf	Restricted access	Publisher's PDF	1.47 MB	Adobe PDF
preprint_AILaw_Nanda.pdf	Open access	Preprint (first draft)	857 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2318/1710660
