Bayesian logistic regression for presence-only data
Golini N. (co-first author)
2015-01-01
Abstract
Presence-only data arise when a censoring mechanism acts on a binary response, so that the response is only partially observed, and only with respect to one outcome, usually the presence of an attribute of interest. A typical example is the recording of species presence in ecological surveys. This work presents a Bayesian approach to the analysis of presence-only data based on a two-level scheme. A probability law and a case-control design are combined to handle the two sources of uncertainty: one due to censoring and the other due to sampling. Using a stratified sampling design with non-overlapping strata, the paper proposes a new formulation of the logistic model for presence-only data; in particular, logistic regression with a linear predictor is considered. Estimation is carried out with a new Markov chain Monte Carlo algorithm with data augmentation, which does not require a priori knowledge of the population prevalence. The performance of the new algorithm is validated through extensive simulation experiments under three scenarios and by comparison with optimal benchmarks. An application to data from the literature is reported to discuss the model's behaviour in real-world situations, together with the results of an original study on termite occurrence data.
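To make the censoring idea in the abstract concrete, the following is a minimal simulation sketch of how a presence-only sample can come about. It is not the paper's model or its MCMC algorithm: the function name `simulate_presence_only`, the single standard-normal covariate, the coefficient values and the sample sizes are all assumptions chosen for illustration. The true binary response follows a logistic model with a linear predictor; only a subset of the true presences is recorded, and a background sample with unknown status is drawn from the whole population, as in a case-control design.

```python
import numpy as np


def simulate_presence_only(n_pop=20000, beta=(-1.0, 1.5),
                           n_presence=300, n_background=1000, seed=0):
    """Simulate a presence-only sample (illustrative assumptions only)."""
    rng = np.random.default_rng(seed)

    # Hypothetical population: one covariate, logistic presence probability.
    x = rng.normal(size=n_pop)
    p = 1.0 / (1.0 + np.exp(-(beta[0] + beta[1] * x)))
    y = rng.binomial(1, p)  # true presence/absence, never fully observed

    # Censoring: only some of the true presences are recorded.
    presence_idx = rng.choice(np.flatnonzero(y == 1),
                              size=n_presence, replace=False)

    # Background sample from the whole population; its status is unknown
    # to the analyst, so it may contain unrecorded presences.
    background_idx = rng.choice(n_pop, size=n_background, replace=False)

    # z is what the analyst sees: 1 = recorded presence, 0 = background.
    z = np.concatenate([np.ones(n_presence, dtype=int),
                        np.zeros(n_background, dtype=int)])
    x_obs = np.concatenate([x[presence_idx], x[background_idx]])

    # y_hidden is the true status, kept here only to inspect the simulation.
    y_hidden = np.concatenate([y[presence_idx], y[background_idx]])
    return x_obs, z, y_hidden, y.mean()


if __name__ == "__main__":
    x_obs, z, y_hidden, prevalence = simulate_presence_only()
    hidden = y_hidden[z == 0].mean()
    print(f"population prevalence (unknown in practice): {prevalence:.3f}")
    print(f"share of background units that are true presences: {hidden:.3f}")
```

In this sketch the observed label `z` is not the true response: the background group is contaminated by unrecorded presences, so treating `z` as presence/absence in an ordinary logistic regression would be misleading. This is the kind of double uncertainty, from censoring and from sampling, that the abstract refers to.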
File: SERRA 2016 Divino Golini JonaLasinio Penttinen.pdf (main article; publisher's PDF; Adobe PDF, 546.2 kB; restricted access)