CINECA IRIS Institutional Research Information System

Discriminant functional gene groups identification with machine learning and prior knowledge

Zycinski, Grzegorz; Squillario, Margherita; Barla, Annalisa; Sanavia, Tiziana; Verri, Alessandro; Di Camillo, Barbara

2012-01-01

Abstract

High-throughput technologies allow the rapid production of huge amounts of gene expression data, useful for characterizing a wide variety of phenotypes. However, choosing the best methods and approaches to analyze data from a high-throughput study is not trivial, and it makes the biological interpretation of the results a challenging task. The current analysis pipeline first selects a set of genes that are in some way significant for a specific aspect of the available data and then provides a functional characterization of the results using the many public sources of prior biological knowledge. This approach, while successful, may not obtain the best possible results because, for example, it could discard some functions or processes known to be relevant for the specific biological question under analysis, simply because few of their genes appear in the identified list. We present a new analysis framework, Knowledge Driven Variable Selection (KDVS), that integrates prior knowledge into data analysis. The expression data matrix is partitioned, according to prior knowledge, into smaller matrices that are easier to analyze and to interpret from both computational and biological viewpoints. Therefore KDVS, unlike the current analysis pipeline, does not exclude a priori any function or process potentially relevant for the biological question under investigation. Three case studies are presented to demonstrate the performance of the method.
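
The partitioning step described in the abstract can be illustrated with a minimal sketch. This is not the KDVS implementation: the function name `partition_by_prior_knowledge`, the genes-by-samples layout, and the `term_to_genes` mapping (standing in for whatever prior-knowledge source is used) are all assumptions made for illustration, and the data are toy values.

```python
import numpy as np


def partition_by_prior_knowledge(expression, gene_ids, term_to_genes):
    """Split a genes-by-samples matrix into one submatrix per term.

    Hypothetical helper: `term_to_genes` maps each prior-knowledge term
    (for example a functional category) to the genes annotated to it.
    """
    row_of = {gene: i for i, gene in enumerate(gene_ids)}
    submatrices = {}
    for term, genes in term_to_genes.items():
        # Keep only genes that were actually measured in this data set.
        rows = [row_of[g] for g in genes if g in row_of]
        if rows:  # a term with no measured genes yields no submatrix
            submatrices[term] = expression[rows, :]
    return submatrices


# Toy data (assumed): 4 genes measured on 3 samples.
expression = np.arange(12, dtype=float).reshape(4, 3)
gene_ids = ["g1", "g2", "g3", "g4"]
term_to_genes = {
    "term_A": ["g1", "g3"],
    "term_B": ["g2", "g3", "g4"],
    "term_C": ["g9"],  # annotated gene not present in the data
}

parts = partition_by_prior_knowledge(expression, gene_ids, term_to_genes)
for term, sub in parts.items():
    print(term, sub.shape)
```

A gene may belong to several terms, so the submatrices can overlap. Each one can then be analyzed on its own, which is the sense in which no function or process is excluded up front.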

Year: 2012
Event title: 20th European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning, ESANN 2012
Event location: Bruges, Belgium
Event date: Belgium, 25 - 27 April 2012
Volume title: ESANN 2012 proceedings, European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning.
Publisher: i6doc
Pages (from): 221
Pages (to): 226
Product URL (open-access archives, full text on publisher's site, etc.): https://www.elen.ucl.ac.be/Proceedings/esann/esannpdf/es2012-167.pdf
Keywords: Information Systems; Artificial Intelligence
All authors: Zycinski, Grzegorz; Squillario, Margherita; Barla, Annalisa; Sanavia, Tiziana; Verri, Alessandro; Di Camillo, Barbara
Appears in types: 04A-Conference paper in volume

Files in this product:

File	Size	Format
Tiziana_Sanavia_ESANN_2012.pdf (open access; file type: publisher's PDF)	818.23 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2318/1804108
