Unstructured Data in Predictive Process Monitoring: Lexicographic and Semantic Mapping to ICD-9-CM Codes for the Home Hospitalization Service

Ronzani, M; Ferrod, R; Di Francescomarino, C; Sulis, E; Aringhieri, R; Boella, G; Brunetti, E; Di Caro, L; Marinello, R
2022-01-01

Abstract

The wide availability of hospital administrative and clinical data has encouraged the application of Process Mining techniques to the healthcare domain. Predictive Process Monitoring techniques can learn from data on past executions and predict the future of incomplete cases. However, some of these data, possibly the most informative ones, are often available only as natural language text, whereas structured information extracted from them would be more beneficial for training predictive models. In this paper we focus on the scenario of the Home Hospitalization Service, supporting the team in deciding on the home hospitalization of a patient by predicting whether a new patient is likely to undergo home hospitalization successfully. We investigate whether, in this scenario, mapping the unstructured textual diagnoses reported by the doctor in the Emergency Department into structured information, namely standardized ICD-9-CM disease codes, yields more accurate predictions. To this end, we devise two approaches, based respectively on lexicographic and semantic distance, for mapping textual diagnoses to ICD-9-CM codes, and we leverage the resulting structured information for making predictions.
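The lexicographic mapping mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state which distance the authors use, so the code below assumes plain Levenshtein edit distance; the lookup table `ICD9_DESCRIPTIONS` and the function `map_to_icd9` are hypothetical names introduced only for illustration, and the table is a tiny hand-picked excerpt of ICD-9-CM descriptions rather than the data used in the paper.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (dynamic programming, row by row)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


# Assumption: a small illustrative excerpt, not the full ICD-9-CM table.
ICD9_DESCRIPTIONS = {
    "428.0": "congestive heart failure, unspecified",
    "486": "pneumonia, organism unspecified",
    "599.0": "urinary tract infection, site not specified",
    "401.9": "unspecified essential hypertension",
}


def map_to_icd9(diagnosis: str) -> str:
    """Return the code whose description is closest to the free-text diagnosis."""
    text = diagnosis.lower().strip()
    return min(ICD9_DESCRIPTIONS,
               key=lambda code: levenshtein(text, ICD9_DESCRIPTIONS[code]))
```

Under the same assumptions, a semantic variant would keep the nearest-neighbour structure of `map_to_icd9` and replace `levenshtein` with a distance between text embeddings.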
Year: 2022
Conference: 20th International Conference of the Italian Association for Artificial Intelligence, AIxIA 2021
Location: Italy
Dates: 1-3 December 2021
Publisher: Springer International Publishing AG
Volume: 13196
Pages: 700-715
ISBN: 978-3-031-08420-1; 978-3-031-08421-8
Healthcare processes; Predictive process monitoring; Natural language processing; Home hospitalization service
Ronzani, M; Ferrod, R; Di Francescomarino, C; Sulis, E; Aringhieri, R; Boella, G; Brunetti, E; Di Caro, L; Dragoni, M; Ghidini, C; Marinello, R
Files in this item:
Unstructured data in predictive.pdf (open access, 457.59 kB, Adobe PDF)

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2318/1889767