A Resource for Detecting Misspellings and Denoising Medical Text Data

Enrico Mensa; Gian Manuel Marino; Davide Colla; Matteo Delsanto; Daniele P. Radicioni
2021-01-01

Abstract

In this paper we propose a method for building a dictionary to handle noisy medical text documents. The quality of Italian Emergency Room Reports is so poor that in most cases they can hardly be processed automatically; this also holds for other languages (e.g., English), with the notable difference that no Italian dictionary has yet been proposed to deal with this jargon. In this work we introduce and evaluate a resource designed to fill this gap.
Year: 2021
Conference: CLiC-it 2020, Italian Conference on Computational Linguistics 2020
Venue: Held online (planned for Bologna, Italy)
Dates: March 1-3, 2021
Proceedings: Proceedings of the Seventh Italian Conference on Computational Linguistics
Publisher: CEUR
Volume: Vol-2769
Pages: 1-7
URL: http://ceur-ws.org/Vol-2769/paper_48.pdf
Files in this item:
mensa2021resource.pdf (open access; file type: publisher's PDF; format: Adobe PDF; size: 391.57 kB)

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2318/1775465