CINECA IRIS Institutional Research Information System

The s2D Method: Simultaneous Sequence-Based Prediction of the Statistical Populations of Ordered and Disordered Regions in Proteins

Sormanni, P.; Camilloni, C.; Fariselli, Piero; Vendruscolo, M.

2015-01-01

Abstract

Extensive amounts of information about protein sequences are becoming available, as demonstrated by the over 79 million entries in the UniProt database. Yet, it is still challenging to obtain proteome-wide experimental information on the structural properties associated with these sequences. Fast computational predictors of secondary structure and of intrinsic disorder of proteins have been developed in order to bridge this gap. These two types of predictions, however, have remained largely separated, often preventing a clear characterization of the structure and dynamics of proteins. Here, we introduce a computational method to predict secondary-structure populations from amino acid sequences, which simultaneously characterizes structure and disorder in a unified statistical mechanics framework. To develop this method, called s2D, we exploited recent advances made in the analysis of NMR chemical shifts that provide quantitative information about the probability distributions of secondary-structure elements in disordered states. The results that we discuss show that the s2D method predicts secondary-structure populations with an average error of about 14%. A validation on three datasets of mostly disordered, mostly structured and partly structured proteins, respectively, shows that its performance is comparable to or better than that of existing predictors of intrinsic disorder and of secondary structure. These results indicate that it is possible to perform rapid and quantitative sequence-based characterizations of the structure and dynamics of proteins through the predictions of the statistical distributions of their ordered and disordered regions.

Year: 2015
Publication language: English
ISI WoS code: WOS:000350076400022
PubMed code: 25534081
Scopus code: 2-s2.0-84922341947
Referee: Anonymous experts
Journal title: JOURNAL OF MOLECULAR BIOLOGY
Volume: 427
Issue: 4
Pages (from): 982
Pages (to): 996
Total number of pages: 15
DOI: https://dx.doi.org/10.1016/j.jmb.2014.12.007
Keywords: alpha-helix; beta-sheet; intrinsically disordered proteins; random coil; Algorithms; Amino Acid Sequence; Computational Biology; Databases; Protein; Models; Molecular; Nuclear Magnetic Resonance; Biomolecular; Protein Folding; Protein Structure; Tertiary; Sequence Analysis; Secondary
Co-authors affiliated with foreign institutions: no
Faculty site type: 262
Number of authors: 4
All authors: Sormanni, P.; Camilloni, C.; Fariselli, Piero; Vendruscolo, M.
Type: info:eu-repo/semantics/article
Full text: reserved
Type: 03-JOURNAL CONTRIBUTION::03A-Journal Article
Appears in types: 03A-Journal Article

Files in this item:

File	Access	File type	Size	Format
s2D_Sormannietal_2015_JMB.pdf	Restricted access	Publisher's PDF	1.4 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2318/1687640
