Random forest (RF) is a well-known data-driven algorithm applied in several fields, thanks to its flexibility in modeling the relationship between the response variable and the predictors, also in case of strong non-linearities. In environmental applications, it often occurs that the phenomenon of interest may present spatial and/or temporal dependence that is not taken explicitly into account by RF in its standard version. In this work, we propose a taxonomy to classify strategies according to when (Pre-, In-, and/or Post-processing) they try to include the spatial information into regression RF. Moreover, we provide a systematic review and classify the most recent strategies adopted to “adjust” regression RF to spatially dependent data, based on the criteria provided by the Preferred Reporting Items for Systematic reviews and Meta-Analysis (PRISMA). The latter consists of a reproducible methodology for collecting and processing existing literature on a specified topic from different sources. PRISMA starts with a query and ends with a set of scientific documents to review: we performed an online query on the 25th of October 2022, and in the end, 32 documents were considered for review. The employed methodological strategies and the application fields considered in the 32 scientific documents are described and discussed.
A path in regression Random Forest looking for spatial dependence: a taxonomy and a systematic review
Natalia Golini;Rosaria Ignaccolo
2024-01-01
Abstract
Random forest (RF) is a well-known data-driven algorithm applied in several fields, thanks to its flexibility in modeling the relationship between the response variable and the predictors, also in case of strong non-linearities. In environmental applications, it often occurs that the phenomenon of interest may present spatial and/or temporal dependence that is not taken explicitly into account by RF in its standard version. In this work, we propose a taxonomy to classify strategies according to when (Pre-, In-, and/or Post-processing) they try to include the spatial information into regression RF. Moreover, we provide a systematic review and classify the most recent strategies adopted to “adjust” regression RF to spatially dependent data, based on the criteria provided by the Preferred Reporting Items for Systematic reviews and Meta-Analysis (PRISMA). The latter consists of a reproducible methodology for collecting and processing existing literature on a specified topic from different sources. PRISMA starts with a query and ends with a set of scientific documents to review: we performed an online query on the 25th of October 2022, and in the end, 32 documents were considered for review. The employed methodological strategies and the application fields considered in the 32 scientific documents are described and discussed.| File | Dimensione | Formato | |
|---|---|---|---|
|
2024_Patelli et al_A path in regression Random Forest looking for spatial dependence- a taxonomy and a systematic review.pdf
Accesso riservato
Descrizione: Manoscritto
Tipo di file:
POSTPRINT (VERSIONE FINALE DELL’AUTORE)
Dimensione
1.29 MB
Formato
Adobe PDF
|
1.29 MB | Adobe PDF | Visualizza/Apri Richiedi una copia |
|
2024_A_path_in_RF_SPRINGER_final.pdf
Accesso riservato
Descrizione: pdf editoriale
Tipo di file:
PDF EDITORIALE
Dimensione
21.81 MB
Formato
Adobe PDF
|
21.81 MB | Adobe PDF | Visualizza/Apri Richiedi una copia |
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.



