

The Retrieval of False Anglicisms in Newspaper Texts

Furiassi, Cristiano Gino; Hofland, K.

2007-01-01

Abstract

This article describes a project aimed at building a specialized corpus of Italian newspaper texts and at developing a computational technique to retrieve new false Anglicisms from it. Texts were collected over a ten-month span from three Italian newspapers: La Stampa, La Repubblica, and Corriere della Sera. The corpus contains about 20 million tokens and approximately 230,000 types. The system was updated automatically on a daily basis, and a list of words was obtained at the end of the collection period. This procedure produced a refined word list, which was then searched for false Anglicisms. Alongside the computational techniques, careful manual scanning proved indispensable for extracting new false Anglicisms. The corpus is available for future work and may be used not only to find false Anglicisms but also to retrieve Anglicisms and neologisms and to analyse lexical features of Italian newspaper language.
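The abstract does not say how the word list was built or refined, so the following is only a minimal sketch of one way such a step could look, not the authors' method. It assumes a simple letter-based tokenizer and a hypothetical reference set of known Italian word forms (`known_italian`); the two sample sentences are invented for illustration. It counts tokens and types, then keeps the types missing from the reference set as candidates for manual scanning.

```python
import re
from collections import Counter

# Assumption: a token is a run of letters (accented letters included), lowercased.
TOKEN_RE = re.compile(r"[^\W\d_]+")

def tokenize(text):
    """Split text into lowercase word tokens."""
    return [t.lower() for t in TOKEN_RE.findall(text)]

def build_word_list(articles):
    """Count how often each word type occurs across all articles."""
    counts = Counter()
    for article in articles:
        counts.update(tokenize(article))
    return counts

def refine(counts, known_words):
    """Keep only types absent from the reference lexicon (hypothetical filter)."""
    return sorted(w for w in counts if w not in known_words)

# Invented sample input, for illustration only.
articles = [
    "Il nuovo recordman italiano ha vinto la gara.",
    "La gara ha un nuovo recordman.",
]
# Hypothetical reference lexicon of known Italian word forms.
known_italian = {"il", "la", "ha", "un", "nuovo", "italiano", "vinto", "gara"}

counts = build_word_list(articles)
tokens = sum(counts.values())   # running words in the sample
types = len(counts)             # distinct word forms in the sample
candidates = refine(counts, known_italian)  # list left for manual scanning
```

In this sketch the filter only narrows the list. Deciding whether a surviving candidate is really a false Anglicism remains a manual step, in line with the abstract's remark that manual scanning proved indispensable.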


Year: 2007
Volume title: Corpus Linguistics 25 Years On
Publisher: Rodopi
Series title: Language and Computers: Studies in Practical Linguistics
Volume no.: 62
Pages (from): 347
Pages (to): 363
ISBN: 9789042021952
Product URL (open-access archives, full text on the publisher's site, etc.): http://www.rodopi.nl
Keywords: false Anglicisms; corpus linguistics; newspaper corpora; language contact; lexicology
All authors: Furiassi, Cristiano Gino; Hofland, K.
Appears in types: 02A-Contributo in volume (contribution in a volume)

Files in this item:

File	Size	Format
2007_Furiassi_Hofland_The Retrieval of False Anglicisms in Newspaper Texts_Rodopi.pdf (restricted access; description: article; file type: publisher's PDF)	321.81 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2318/99723
