Background: High-throughput gene expression data can predict gene function through the “guilt by association†principle: coexpressed genes are likely to be functionally associated. Methodology/Principal Findings: We analyzed publicly available expression data on normal human tissues. The analysis is based on the integration of data obtained with two experimental platforms (microarrays and SAGE) and of various measures of dissimilarity between expression profiles. The building blocks of the procedure are the Ranked Coexpression Groups (RCG), small sets of tightly coexpressed genes which are analyzed in terms of functional annotation. Functionally characterized RCGs are selected by means of the majority rule and used to predict new functional annotations. Functionally characterized RCGs are enriched in groups of genes associated to similar phenotypes. We exploit this fact to find new candidate disease genes for many OMIM phenotypes of unknown molecular origin. Conclusions/Significance: We predict new functional annotations for many human genes, showing that the integration of different data sets and coexpression measures significantly improves the scope of the results. Combining gene expression data, functional annotation and known phenotype-gene associations we provide candidate genes for several genetic diseases of unknown molecular basis.

Functional Annotation and Identification of Candidate Disease Genes by Computational Analysis of Normal Tissue Gene Expression Data

PIRO, Rosario Michael;ROSA, FABIO;ALA, UGO;SILENGO, Lorenzo;DI CUNTO, Ferdinando;PROVERO, Paolo
2008-01-01

Abstract

Background: High-throughput gene expression data can predict gene function through the “guilt by association†principle: coexpressed genes are likely to be functionally associated. Methodology/Principal Findings: We analyzed publicly available expression data on normal human tissues. The analysis is based on the integration of data obtained with two experimental platforms (microarrays and SAGE) and of various measures of dissimilarity between expression profiles. The building blocks of the procedure are the Ranked Coexpression Groups (RCG), small sets of tightly coexpressed genes which are analyzed in terms of functional annotation. Functionally characterized RCGs are selected by means of the majority rule and used to predict new functional annotations. Functionally characterized RCGs are enriched in groups of genes associated to similar phenotypes. We exploit this fact to find new candidate disease genes for many OMIM phenotypes of unknown molecular origin. Conclusions/Significance: We predict new functional annotations for many human genes, showing that the integration of different data sets and coexpression measures significantly improves the scope of the results. Combining gene expression data, functional annotation and known phenotype-gene associations we provide candidate genes for several genetic diseases of unknown molecular basis.
2008
3
6
e2439
e2439
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2409962/pdf/pone.0002439.pdf
Miozzi L; Piro RM; Rosa F; Ala U; Silengo L; Di Cunto F; Provero P
File in questo prodotto:
Non ci sono file associati a questo prodotto.

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/2318/46592
Citazioni
  • ???jsp.display-item.citation.pmc??? 9
  • Scopus 21
  • ???jsp.display-item.citation.isi??? 19
social impact