Graph-Aligned Random Partition Model (GARP)

Rebaudo, Giovanni (first author)
2025-01-01

Abstract

Bayesian nonparametric mixtures and random partition models are powerful tools for probabilistic clustering. However, standard independent mixture models can be restrictive in some applications, such as inference on cell lineage, because of the biological relations among the clusters. The increasing availability of large genomic data requires new statistical tools to perform model-based clustering and infer the relationships between homogeneous subgroups of units. Motivated by single-cell RNA data, we develop a novel dependent mixture model to jointly perform cluster analysis and align the clusters on a graph. Our flexible graph-aligned random partition model (GARP) exploits Gibbs-type priors as building blocks, allowing us to derive analytical results for the probability mass function (pmf) of the graph-aligned random partition. From the pmf we derive a generalization of the Chinese restaurant process and a related efficient MCMC algorithm to implement Bayesian inference. We illustrate posterior inference under the GARP using single-cell RNA-seq data from mouse stem cells. We further investigate, by means of simulation studies, the performance of the model in recovering both the underlying clustering structure and the underlying graph. Supplementary materials for this article are available online, including a standardized description of the materials available for reproducing the work.
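
For readers unfamiliar with the Chinese restaurant process mentioned in the abstract, the following is a minimal sketch of the standard (Dirichlet-process) version only. It is not the GARP generalization derived in the paper, and the function name `sample_crp` and the parameter `alpha` are illustrative assumptions rather than the authors' notation or code.

```python
import random
from typing import List, Optional


def sample_crp(n: int, alpha: float, seed: Optional[int] = None) -> List[int]:
    """Draw cluster labels for n units from a standard Chinese restaurant process.

    alpha is the concentration parameter (hypothetical name): larger values
    make it more likely that a unit opens a new cluster.
    """
    if n < 0 or alpha <= 0:
        raise ValueError("n must be non-negative and alpha must be positive")
    rng = random.Random(seed)
    labels: List[int] = []  # cluster label of each unit, in order of arrival
    sizes: List[int] = []   # current size of each cluster
    for _ in range(n):
        # An existing cluster k is chosen with weight sizes[k];
        # a new cluster is opened with weight alpha.
        weights = sizes + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(sizes):
            sizes.append(1)   # open a new cluster
        else:
            sizes[k] += 1     # join an existing cluster
        labels.append(k)
    return labels
```

Calling `sample_crp(50, alpha=1.0, seed=0)` returns one label per unit, with clusters numbered in order of first appearance. Under the assumptions above, the "rich get richer" behaviour comes entirely from weighting existing clusters by their current size.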
Year: 2025
Volume: 120
Issue: 549
Pages: 486-497
DOI: 10.1080/01621459.2024.2353943
Keywords: Bayesian nonparametrics, Dependent mixture model, Exchangeability, Gibbs-Type prior, Random partition model, Single-cell RNA
Authors: Rebaudo, Giovanni; Müller, Peter
Files in this record:

2024RebaudoMueller.pdf
Open Access from 30/05/2025
File type: POSTPRINT (author's final version)
Size: 1.11 MB
Format: Adobe PDF

Graph-Aligned Random Partition Model GARP .pdf
Restricted access
File type: publisher's PDF
Size: 2.29 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2318/1975850