Bayesian Kernel Mixtures for Counts

CANALE, Antonio;
2011-01-01

Abstract

Although Bayesian nonparametric mixture models for continuous data are well developed, the literature on related approaches for count data is limited. A common strategy is to use a mixture of Poissons, which unfortunately is quite restrictive because it cannot account for distributions with variance less than the mean. Other approaches include mixing multinomials, which requires finite support, and using a Dirichlet process prior with a Poisson base measure, which does not allow for smooth deviations from the Poisson. We propose a broad class of alternative models, nonparametric mixtures of rounded continuous kernels. We develop an efficient Gibbs sampler for posterior computation and perform a simulation study to assess performance. Focusing on the rounded Gaussian case, we generalize the modeling framework to account for multivariate count data, joint modeling with continuous and categorical variables, and other complications. We illustrate our methods through applications to a developmental toxicity study and marketing data. Supplemental material is available online.
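As a rough illustration of the rounded Gaussian kernel named in the abstract, the sketch below computes the probability mass function of a count obtained by rounding a latent Gaussian variable, then checks that its variance can fall below its mean, which no mixture of Poissons can reproduce. The thresholds used here (a count of 0 for latent values at or below 0, and a count of j for latent values in (j-1, j]) are an assumption made for this example, and the function names are hypothetical; this is not the authors' code and it does not implement the nonparametric mixture or the Gibbs sampler.

```python
import math


def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of a Gaussian N(mu, sigma^2)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def rounded_gaussian_pmf(j, mu, sigma):
    """P(y = j) when y is obtained by rounding a latent N(mu, sigma^2) draw.

    Assumed thresholds (illustrative): y = 0 if the latent value is <= 0,
    and y = j if the latent value lies in (j - 1, j] for j >= 1.
    """
    if j < 0:
        return 0.0
    upper = normal_cdf(j, mu, sigma)
    lower = normal_cdf(j - 1, mu, sigma) if j > 0 else 0.0
    return upper - lower


def mean_and_variance(mu, sigma, max_count=200):
    """Mean and variance of the rounded Gaussian, summing over 0..max_count."""
    probs = [rounded_gaussian_pmf(j, mu, sigma) for j in range(max_count + 1)]
    mean = sum(j * p for j, p in enumerate(probs))
    var = sum((j - mean) ** 2 * p for j, p in enumerate(probs))
    return mean, var


if __name__ == "__main__":
    mean, var = mean_and_variance(mu=10.0, sigma=1.0)
    print(f"mean = {mean:.3f}, variance = {var:.3f}")
    print("underdispersed:", var < mean)
```

With a small latent standard deviation relative to the latent mean, the resulting count distribution is underdispersed, which is the case the abstract says a Poisson mixture cannot handle.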
Year: 2011
Volume: 106
Issue: 496
Pages: 1528-1539
http://amstat.tandfonline.com/doi/pdf/10.1198/jasa.2011.ap11592
Bayesian nonparametrics; Dirichlet process mixtures; Kullback-Leibler condition; Large support; Multivariate count data; Posterior consistency; Rounded Gaussian distribution
Antonio Canale; David B. Dunson
Files in this item:
File: jasa2011.pdf
Access: Open Access since 31/07/2014
File type: POSTPRINT (author's final version)
Size: 621.98 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2318/128126
Citations
  • Scopus 57
  • ISI Web of Science 57