The Retrieval of False Anglicisms in Newspaper Texts

FURIASSI, Cristiano Gino;
2007-01-01

Abstract

This article describes a project aimed at building a specialized corpus of Italian newspaper texts and at developing a computational technique to retrieve new false Anglicisms from it. Texts were collected over a ten-month span from three Italian newspapers: La Stampa, La Repubblica, and Corriere della Sera. The corpus comprises about 20 million tokens and approximately 230,000 types. The system was updated automatically on a daily basis, and a list of words was obtained at the end of the collection period. This procedure produced a refined word list, which was then searched for false Anglicisms. Along with computational techniques, careful manual scanning proved indispensable for extracting new false Anglicisms. The corpus is available for future work and may be exploited not only to find false Anglicisms but also to retrieve Anglicisms and neologisms and to analyse lexical features of Italian newspaper language.
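The token and type counts and the word-list refinement mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' system: the tokenizer, the function names `word_frequencies` and `refine`, the two sample sentences, and the tiny `italian_lexicon` set are all assumptions made for illustration only.

```python
import re
from collections import Counter

# Assumed tokenizer: runs of letters, optionally joined by a hyphen or apostrophe.
TOKEN_RE = re.compile(r"[^\W\d_]+(?:[-'][^\W\d_]+)*")


def word_frequencies(texts):
    """Count lower-cased word forms across a collection of texts."""
    counts = Counter()
    for text in texts:
        counts.update(tok.lower() for tok in TOKEN_RE.findall(text))
    return counts


def refine(counts, known_words):
    """Keep only the types that are absent from a reference lexicon."""
    return sorted(w for w in counts if w not in known_words)


# Hypothetical sample data, standing in for newspaper articles and a lexicon.
articles = [
    "Il mister ha scelto il nuovo look della squadra.",
    "La squadra ha un nuovo mister.",
]
italian_lexicon = {"il", "ha", "scelto", "nuovo", "della", "squadra", "la", "un"}

counts = word_frequencies(articles)
n_tokens = sum(counts.values())  # running words
n_types = len(counts)            # distinct word forms
candidates = refine(counts, italian_lexicon)

print(n_tokens, n_types, candidates)
```

In this toy run the candidate list contains both "look", a genuine Anglicism, and "mister", which in Italian means a football coach. A lexicon filter of this kind cannot tell the two apart, which is consistent with the abstract's point that manual scanning remains necessary.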
Year: 2007
Book title: Corpus Linguistics 25 Years On
Publisher: Rodopi
Series: Language and Computers: Studies in Practical Linguistics
Volume: 62
Pages: 347-363
ISBN: 9789042021952
Publisher URL: http://www.rodopi.nl
Keywords: false Anglicisms; corpus linguistics; newspaper corpora; language contact; lexicology
Authors: Furiassi, Cristiano Gino; Hofland, K.
Files in this item:
File: 2007_Furiassi_Hofland_The Retrieval of False Anglicisms in Newspaper Texts_Rodopi.pdf
Access: restricted
Description: article
File type: publisher's PDF
Size: 321.81 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2318/99723