Corpora e varietà minoritarie: le isole walser in Italia (Corpora and minority varieties: the Walser islands in Italy)
Raffaele Cioffi; Marco Bellante; Livio Gaeta
2020-01-01
Abstract
This paper examines the problems of building a corpus of a low-density variety in the light of two projects, DiWaC and ArchiWals, created to preserve the linguistic and cultural heritage of the Walser German communities of Piedmont and the Aosta Valley. It argues that work on spoken and written data of low-density varieties faces similar problems. On the one hand, low-density varieties are defined by the absence or scarcity of ready-to-use language resources for automatic processing. On the other hand, both written and spoken data of such varieties are characterised by a high degree of granularity at different levels. The solutions proposed for DiWaC and ArchiWals attempt to reconcile computability and granularity by stratifying the information retrieved from the original texts that make up the corpora.

File | Description | File type | Size | Format |
---|---|---|---|---|
2020_RID.pdf (open access) | Main article | Preprint (first draft) | 300.23 kB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.