Bayesian nonparametric inference for discovery probabilities: credible intervals and large sample asymptotics
FAVARO, STEFANO; NIPOTI, BERNARDO
2017-01-01
Abstract
Given a sample of size n from a population of individuals belonging to different species with unknown proportions, a common problem of practical interest is to make inference on the probability Dn(l) that the (n+1)-th draw coincides with a species with frequency l in the sample, for any l=0,1,…,n. This paper contributes to the methodology of Bayesian nonparametric inference for Dn(l). Specifically, under the general framework of Gibbs-type priors, we show how to derive credible intervals for a Bayesian nonparametric estimator of Dn(l), and we investigate the large-n asymptotic behaviour of such an estimator. Of particular interest are special cases of our results obtained under the two-parameter Poisson-Dirichlet prior and the normalized generalized Gamma prior, which are two of the most commonly used Gibbs-type priors. With respect to these two prior specifications, the proposed results are illustrated through a simulation study and a benchmark Expressed Sequence Tags dataset. To the best of our knowledge, this illustration provides the first comparative study between the two-parameter Poisson-Dirichlet prior and the normalized generalized Gamma prior in the context of Bayesian nonparametric inference for Dn(l).
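As an illustration of the quantity Dn(l), the following minimal sketch computes point estimates under the two-parameter Poisson-Dirichlet prior. It is not taken from the paper: it uses the standard posterior-mean formulas for this prior, (theta + sigma k)/(theta + n) for l=0 and (l - sigma) m_l/(theta + n) for l≥1, where k is the number of distinct species observed and m_l is the number of species with frequency l. The function name and the parameter names `sigma` and `theta` are assumptions made for the example, and the sketch covers neither the credible intervals nor the asymptotic results that the paper develops.

```python
from collections import Counter


def py_discovery_estimates(sample, sigma, theta):
    """Posterior-mean estimates of Dn(l) under a two-parameter
    Poisson-Dirichlet prior (illustrative sketch, hypothetical helper).

    sample: list of observed species labels (length n >= 1)
    sigma:  discount parameter, assumed to satisfy 0 <= sigma < 1
    theta:  strength parameter, assumed to satisfy theta > -sigma
    Returns a dict mapping each frequency l to the estimate of Dn(l);
    frequencies l >= 1 not present in the sample have estimate 0 and
    are omitted.
    """
    if not sample:
        raise ValueError("sample must contain at least one observation")
    if not (0 <= sigma < 1) or theta <= -sigma:
        raise ValueError("require 0 <= sigma < 1 and theta > -sigma")

    n = len(sample)
    species_freq = Counter(sample)            # species label -> frequency
    k = len(species_freq)                     # number of distinct species
    m = Counter(species_freq.values())        # l -> number of species with frequency l

    # l = 0: probability that draw n+1 is a species not yet observed
    estimates = {0: (theta + sigma * k) / (theta + n)}
    # l >= 1: probability that draw n+1 is a species seen exactly l times
    for l, m_l in sorted(m.items()):
        estimates[l] = (l - sigma) * m_l / (theta + n)
    return estimates


if __name__ == "__main__":
    toy_sample = ["a", "a", "b", "c", "c", "c", "d"]
    for l, value in py_discovery_estimates(toy_sample, sigma=0.5, theta=1.0).items():
        print(l, value)
```

The estimates over l=0,1,…,n sum to one, which provides a quick sanity check on any implementation.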
Attached file: sinica_final.pdf (open access; postprint, author's final version; 2.46 MB; Adobe PDF)