Discriminant functional gene groups identification with machine learning and prior knowledge

Sanavia, Tiziana; Di Camillo, Barbara (last author)
2012-01-01

Abstract

High-throughput technologies make it possible to rapidly produce huge amounts of gene expression data, useful for characterizing a wide variety of phenotypes. However, choosing the best methods and approaches to analyze data from a high-throughput study is not trivial, and it makes the biological interpretation of the results a challenging task. The current analysis pipeline first selects a set of genes that are in some way significant for a specific aspect of the available data and then provides a functional characterization of the results using the many public sources of prior biological knowledge. This approach, while successful, may not obtain the best possible results because, for example, it could discard some functions or processes known to be relevant for the specific biological question under analysis, based solely on the low number of genes in the identified list. We present a new analysis framework, Knowledge Driven Variable Selection (KDVS), that integrates prior knowledge into data analysis. The expression data matrix is partitioned according to prior knowledge into smaller matrices, which are easier to analyze and to interpret from both computational and biological viewpoints. Therefore KDVS, unlike the current analysis pipeline, does not exclude a priori any function or process potentially relevant for the biological question under investigation. Three case studies are presented to demonstrate the performance of the method.
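The partitioning step described in the abstract can be illustrated with a minimal sketch. This is not the KDVS implementation: the function name `partition_by_annotation`, the toy gene identifiers, and the term labels are all hypothetical, and the annotation source (for example a gene-to-function mapping from a public database) is assumed to be available as a plain dictionary.

```python
def partition_by_annotation(expr, gene_ids, annotation):
    """Split a samples-by-genes matrix into one submatrix per functional term.

    expr       : list of rows, one row per sample (hypothetical toy layout)
    gene_ids   : column labels of expr, in order
    annotation : dict mapping a functional term to the genes annotated to it
    """
    column_of = {gene: j for j, gene in enumerate(gene_ids)}
    submatrices = {}
    for term, genes in annotation.items():
        # Keep only genes that were actually measured in this experiment.
        idx = [column_of[g] for g in genes if g in column_of]
        if idx:
            # No minimum-size filter: small terms are kept as well.
            submatrices[term] = [[row[j] for j in idx] for row in expr]
    return submatrices


# Toy data: 3 samples, 5 genes (values are made up for illustration).
gene_ids = ["g1", "g2", "g3", "g4", "g5"]
expr = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [6.0, 7.0, 8.0, 9.0, 10.0],
    [11.0, 12.0, 13.0, 14.0, 15.0],
]
annotation = {
    "term_A": ["g1", "g2"],
    "term_B": ["g2", "g3", "g4"],
    "term_C": ["g9"],  # no measured gene, so no submatrix is produced
}

parts = partition_by_annotation(expr, gene_ids, annotation)
for term, sub in parts.items():
    print(term, len(sub), "samples x", len(sub[0]), "genes")
```

Each submatrix could then be passed separately to a classification or variable selection method. A gene annotated to several terms (here `g2`) appears in several submatrices, which is why no function is excluded on the basis of a prior gene-level filter.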
2012
20th European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning, ESANN 2012
Bruges, Belgium
Belgium, 25 - 27 April 2012
ESANN 2012 proceedings, European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning.
i6doc
Pages 221-226
https://www.elen.ucl.ac.be/Proceedings/esann/esannpdf/es2012-167.pdf
Information Systems; Artificial Intelligence
Zycinski, Grzegorz; Squillario, Margherita; Barla, Annalisa; Sanavia, Tiziana; Verri, Alessandro; Di Camillo, Barbara
Files in this item:
Tiziana_Sanavia_ESANN_2012.pdf

Open access

File type: publisher's PDF
Size: 818.23 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2318/1804108
Citations
  • PMC: n/a
  • Scopus: 3
  • ISI: n/a