CINECA IRIS Institutional Research Information System

NuChart-II: The road to a fast and scalable tool for Hi-C data analysis

Tordini, Fabio; Drocco, Maurizio; Misale, Claudia; Milanesi, L.; Lio, P.; Merelli, I.; Torquati, M.; Aldinucci, Marco

2017-01-01

Abstract

Recent advances in molecular biology and bioinformatics have brought about an explosion of information about the spatial organisation of DNA in the cell nucleus. High-throughput molecular biology techniques capture the spatial organisation of chromosomes genome-wide at unprecedented scales, making it possible to identify physical interactions between genetic elements located throughout a genome. Exploiting this information is, however, hampered by the lack of biologist-friendly analysis and visualisation software: these disciplines are caught in a flood of data and now face many of the scale-out issues that high-performance computing has been addressing for years. Data must be managed, analysed and integrated, with substantial requirements in terms of speed (execution time), application scalability and data representation. In this work, we present NuChart-II, an efficient and highly optimised tool for genomic data analysis that provides a gene-centric, graph-based representation of genomic information and proposes an ex-post normalisation technique for Hi-C data. While designing NuChart-II, we addressed several common issues in the parallelisation of memory-bound algorithms for shared-memory systems.

Year: 2017
Journal title: International Journal of High Performance Computing Applications
First page: 1
Last page: 16
DOI: https://dx.doi.org/10.1177/1094342016668567
Item URL (open-access archives, full text on publisher site, etc.): http://dx.doi.org/10.1177/1094342016668567
Keywords: High-performance computing, Bioinformatics, Hi-C data analysis, parallel computing, memory-bound algorithms
All authors: Tordini, F.; Drocco, M.; Misale, C.; Milanesi, L.; Lio, P.; Merelli, I.; Torquati, M.; Aldinucci, M.
Appears in categories: 03A - Journal Article

Files in this item:

File	Access	Description	File type	Size	Format
main.pdf	Open access	Author's postprint	Postprint (author's final version)	5.47 MB	Adobe PDF
IJHPCA-2016-Tordini-1094342016668567.pdf	Restricted access	Published version	Publisher's PDF	1.92 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2318/1607126
