The problem of maximizing cell type discovery under budget constraints is a fundamental challenge for the collection and analysis of single-cell RNAsequencing (scRNA-seq) data. In this paper we introduce a simple, computationally efficient and scalable Bayesian nonparametric sequential approach to optimize the budget allocation when designing a large-scale experiment for the collection of scRNA-seq data for the purpose of, but not limited to, creating cell atlases. Our approach relies on the following tools: (i) a hierarchical Pitman–Yor prior that recapitulates biological assumptions regarding cellular differentiation, and (ii) a Thompson sampling multiarmed bandit strategy that balances exploitation and exploration to prioritize experiments across a sequence of trials. Posterior inference is performed by using a sequential Monte Carlo approach which allows us to fully exploit the sequential nature of our species sampling problem. We empirically show that our approach outperforms state-of-the-art methods and achieves near-Oracle performance on simulated and scRNA-seq data alike.

Nonparametric bayesian multiarmed bandits for single-cell experiment design

Favaro S.
2020-01-01

Abstract

The problem of maximizing cell type discovery under budget constraints is a fundamental challenge for the collection and analysis of single-cell RNAsequencing (scRNA-seq) data. In this paper we introduce a simple, computationally efficient and scalable Bayesian nonparametric sequential approach to optimize the budget allocation when designing a large-scale experiment for the collection of scRNA-seq data for the purpose of, but not limited to, creating cell atlases. Our approach relies on the following tools: (i) a hierarchical Pitman–Yor prior that recapitulates biological assumptions regarding cellular differentiation, and (ii) a Thompson sampling multiarmed bandit strategy that balances exploitation and exploration to prioritize experiments across a sequence of trials. Posterior inference is performed by using a sequential Monte Carlo approach which allows us to fully exploit the sequential nature of our species sampling problem. We empirically show that our approach outperforms state-of-the-art methods and achieves near-Oracle performance on simulated and scRNA-seq data alike.
2020
14
4
2003
2019
Cell type discovery; Experimental sampling design; Hierarchical Pitman–Yor model; Multiarmed bandits; ScRNA-seq; Sequential Monte Carlo; Thompson sampling
Camerlenghi F.; Dumitrascu B.; Ferrari F.; Engelhardt B.E.; Favaro S.
File in questo prodotto:
File Dimensione Formato  
CDFEF_bandits.pdf

Accesso aperto

Tipo di file: POSTPRINT (VERSIONE FINALE DELL’AUTORE)
Dimensione 2.07 MB
Formato Adobe PDF
2.07 MB Adobe PDF Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/2318/1810645
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 5
  • ???jsp.display-item.citation.isi??? 4
social impact