Spoken and Written Learner English: A Quantitative Analysis of ICLE-IT and LINDSEI-IT

FURIASSI, Cristiano Gino
2004-01-01

Abstract

This study compares, in quantitative terms, the vocabulary features of spoken and written English produced by advanced Italian learners. The analysis is based on the Italian sub-corpora of the International Corpus of Learner English (ICLE-IT) and the Louvain International Database of Spoken English Interlanguage (LINDSEI-IT). It considers statistical data on overall vocabulary features such as word length, type/token ratio, lexical density, and keyness. The aim is to provide objective evidence for the well-known differences between spoken and written language. The computational analysis of the two learner corpora confirms that the differences between spoken and written modes of communication found in native production also exist in learner production. The author argues that quantitative calculations are useful starting points for more qualitative analyses, but he also warns against the uncritical acceptance of raw numerical information. Researchers should understand the main statistical concepts behind the numbers and percentages that corpus-analysis tools provide so quickly and easily. Moreover, some of these measures should be handled with caution when applied to learner corpora.
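As an illustration only, three of the measures named in the abstract (mean word length, type/token ratio, and lexical density) can be sketched in a few lines of Python. This is not the procedure or the tool used in the study: the tokenizer, the tiny `FUNCTION_WORDS` list, and the `vocabulary_profile` function are hypothetical names chosen for this sketch, and lexical density is approximated here as the share of tokens that are not in that list. Keyness is left out because it requires a reference corpus.

```python
import re

# Hypothetical, deliberately tiny function-word list, for illustration only.
# It is NOT the list used in the study.
FUNCTION_WORDS = {"the", "a", "an", "of", "to", "in", "and", "is", "it", "on"}


def tokenize(text):
    """Lowercase the text and extract word tokens (letters, optional apostrophe part)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def vocabulary_profile(text):
    """Return basic vocabulary statistics for a text (assumed definitions, see lead-in)."""
    tokens = tokenize(text)
    if not tokens:
        raise ValueError("text contains no word tokens")
    types = set(tokens)
    # Approximation: every token outside FUNCTION_WORDS counts as a lexical word.
    lexical = [t for t in tokens if t not in FUNCTION_WORDS]
    return {
        "tokens": len(tokens),
        "types": len(types),
        "mean_word_length": sum(len(t) for t in tokens) / len(tokens),
        "type_token_ratio": len(types) / len(tokens),
        "lexical_density": len(lexical) / len(tokens),
    }


if __name__ == "__main__":
    sample = "The cat sat on the mat"
    print(vocabulary_profile(sample))
    # Repeating the text adds tokens but no new types, so the ratio falls.
    print(vocabulary_profile(sample + " " + sample)["type_token_ratio"])
```

Repeating a text doubles the token count without adding any types, so the type/token ratio drops. This is one reason why the raw figure is hard to compare across corpora of different sizes, in line with the abstract's warning about accepting raw numbers uncritically.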
Year: 2004
Book title: Computer Learner Corpora. Theoretical Issues and Empirical Case Studies of Italian Advanced EFL Learner’s Interlanguage
Publisher: Edizioni dell'Orso
Pages: 193-208
ISBN: 9788876947674
URL: http://www.ediorso.it
Keywords: EFL; Learner Corpora; Spoken and Written English
Author: C. FURIASSI
Files in this item:
Furiassi_Spoken and Written Learner English.pdf
Access: restricted
File type: POSTPRINT (author's final version)
Size: 1.1 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2318/130793