The Retrieval of False Anglicisms in Newspaper Texts
FURIASSI, Cristiano Gino;
2007-01-01
Abstract
This article describes a project aimed at building a specialized corpus of Italian newspaper texts and at developing a computational technique for retrieving new false Anglicisms from it. Texts were collected over a ten-month period from three Italian newspapers: La Stampa, La Repubblica, and Corriere della Sera. The corpus comprises about 20 million tokens and approximately 230,000 types. The system was updated automatically on a daily basis, and a word list was obtained at the end of the collection period. This procedure yielded a refined word list, which was then searched for false Anglicisms. Alongside the computational techniques, careful manual scanning proved indispensable for extracting new false Anglicisms. The corpus is available for future work and may be used not only to find false Anglicisms but also to retrieve Anglicisms and neologisms and to analyse lexical features of Italian newspaper language.

File | Access | Description | File type | Size | Format
---|---|---|---|---|---
2007_Furiassi_Hofland_The Retrieval of False Anglicisms in Newspaper Texts_Rodopi.pdf | Restricted | Article | Publisher's PDF | 321.81 kB | Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.