Given the exponentially increasing amount of available data, electronic annotation procedures for protein sequences are a core topic in bioinformatics. In this paper we present the refinement of an already published procedure that allows a fine grained level of detail in the annotation results. This enhancement is based on a graph representation of the similarity relationship between sequences within a cluster, followed by the application of community detection algorithms. These algorithms identify groups of highly connected nodes inside a bigger graph. The core idea is that sequences belonging to the same community share more features in respect to all the other sequences in the same graph.
Community detection within clusters helps large scale protein annotation: Preliminary results of modularity maximization for the bar+ database
Fariselli P;
2013-01-01
Abstract
Given the exponentially increasing amount of available data, electronic annotation procedures for protein sequences are a core topic in bioinformatics. In this paper we present the refinement of an already published procedure that allows a fine grained level of detail in the annotation results. This enhancement is based on a graph representation of the similarity relationship between sequences within a cluster, followed by the application of community detection algorithms. These algorithms identify groups of highly connected nodes inside a bigger graph. The core idea is that sequences belonging to the same community share more features in respect to all the other sequences in the same graph.I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.