Spoken and Written Learner English: A Quantitative Analysis of ICLE-IT and LINDSEI-IT

Furiassi, Cristiano Gino

The study presents a quantitative comparison of the vocabulary features of spoken and written English produced by advanced Italian learners. The analysis is based on the Italian sub-corpora of the International Corpus of Learner English (ICLE-IT) and of the Louvain International Database of Spoken English Interlanguage (LINDSEI-IT). It considers statistical data on overall vocabulary features such as word length, type/token ratio, lexical density, and keyness. The aim is to provide objective evidence for the well-known differences between spoken and written language. The computational analysis of the two learner corpora confirms that the differences found between spoken and written modes of communication in native-speaker production also exist in learner corpora. The author argues that quantitative calculations are useful starting points for more qualitative analyses. However, he also warns against the uncritical acceptance of raw numerical information: researchers should understand the main statistical concepts behind the numbers and percentages that corpus-analysis tools provide so easily and rapidly. Moreover, some of these measures should be handled with caution when applied to learner corpora.
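As an illustration only, three of the measures named above (mean word length, type/token ratio, and lexical density) could be computed along the following lines. This is a minimal sketch and not the procedure used in the study: the tokenizer, the function name `vocabulary_profile`, and the very short `FUNCTION_WORDS` list are all assumptions introduced here, and lexical density is approximated as the share of tokens that are not in that list.

```python
import re

# Hypothetical, deliberately tiny function-word list (an assumption for this
# sketch); a real analysis would rely on a POS tagger or a fuller word list.
FUNCTION_WORDS = frozenset({
    "a", "an", "the", "and", "or", "but", "of", "in", "on", "at", "to",
    "for", "with", "is", "are", "was", "were", "be", "it", "i", "you",
    "he", "she", "we", "they", "that", "this",
})


def tokenize(text):
    """Lowercase the text and return alphabetic tokens (apostrophes kept)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())


def vocabulary_profile(text):
    """Return basic vocabulary statistics for a single text."""
    tokens = tokenize(text)
    if not tokens:
        raise ValueError("text contains no word tokens")
    types = set(tokens)
    # Tokens not in the function-word list are treated as lexical words.
    lexical = [t for t in tokens if t not in FUNCTION_WORDS]
    return {
        "tokens": len(tokens),
        "types": len(types),
        "mean_word_length": sum(len(t) for t in tokens) / len(tokens),
        "type_token_ratio": len(types) / len(tokens),
        "lexical_density": len(lexical) / len(tokens),
    }


if __name__ == "__main__":
    print(vocabulary_profile("the cat saw the dog"))
```

The raw type/token ratio tends to fall as a text gets longer, so values for two corpora of different sizes are not directly comparable without some form of standardisation. This is one example of why such figures call for careful interpretation.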