Corpora e varietà minoritarie: le isole walser in Italia (Corpora and minority varieties: the Walser islands in Italy)

Raffaele Cioffi; Marco Bellante; Livio Gaeta
2020-01-01

Abstract

This paper considers the problems of building a corpus of a low-density variety in the light of two projects, DiWaC and ArchiWals, built to preserve the linguistic and cultural heritage of the Walser German communities of Piedmont and Aosta Valley. It argues that similar problems affect work on both spoken and written data of low-density varieties. On the one hand, low-density varieties are defined by the absence or scarcity of ready-to-use language resources for automatic processing. On the other hand, their written and spoken data are both characterised by a high degree of granularity at different levels. The solutions proposed for DiWaC and ArchiWals attempt to reconcile computability and granularity by stratifying the information retrieved from the original texts that make up the corpora.
Year: 2020
Volume: 44
Pages: 107-125
Corpus linguistics, Minority languages, Cultural heritage
Marco Angster; Raffaele Cioffi; Marco Bellante; Livio Gaeta
Files in this record:
File: 2020_RID.pdf
Access: open access
Description: Main article
File type: preprint (first draft)
Size: 300.23 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2318/1788745