TY - GEN
T1 - Statistical analysis of protein structural features
T2 - 11th International Meeting on Computational Intelligence Methods for Bioinformatics and Biostatistics, CIBB 2014
AU - Del Prete, E.
AU - Dotolo, S.
AU - Marabotti, A.
AU - Facchiano, A.
PY - 2015
Y1 - 2015
N2 - Subtle structural differences among homologous proteins may be responsible for the modulation of their functional properties. Therefore, we are exploring novel and strengthened methods to investigate protein structure in depth, and to analyze conformational features, in order to highlight relationships to functional properties. We selected some protein families based on their different structural class from the CATH database, and studied in detail many structural parameters for these proteins. Some valuable results from Pearson’s correlation matrix have been validated with a Student’s t‐distribution test at a significance level of 5% (p‐value). We investigated the best relationships among parameters in detail by using partial correlation. Moreover, PCA has been used for both a single family and all families, in order to demonstrate how to find outliers for a family and extract new combined features. The correctness of this approach was borne out by the agreement of our results with known or expected geometric and structural properties. In addition, we found unknown relationships, which will be the object of further studies, in order to consider them as putative markers related to the peculiar structure‐function relationships for each family.
AB - Subtle structural differences among homologous proteins may be responsible for the modulation of their functional properties. Therefore, we are exploring novel and strengthened methods to investigate protein structure in depth, and to analyze conformational features, in order to highlight relationships to functional properties. We selected some protein families based on their different structural class from the CATH database, and studied in detail many structural parameters for these proteins. Some valuable results from Pearson’s correlation matrix have been validated with a Student’s t‐distribution test at a significance level of 5% (p‐value). We investigated the best relationships among parameters in detail by using partial correlation. Moreover, PCA has been used for both a single family and all families, in order to demonstrate how to find outliers for a family and extract new combined features. The correctness of this approach was borne out by the agreement of our results with known or expected geometric and structural properties. In addition, we found unknown relationships, which will be the object of further studies, in order to consider them as putative markers related to the peculiar structure‐function relationships for each family.
KW - Correlation
KW - Global features
KW - PCA
KW - Protein structure
UR - http://www.scopus.com/inward/record.url?scp=84949995110&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84949995110&partnerID=8YFLogxK
U2 - 10.1007/978-3-319-24462-4_3
DO - 10.1007/978-3-319-24462-4_3
M3 - Conference contribution
AN - SCOPUS:84949995110
SN - 9783319244617
VL - 8623
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 33
EP - 43
BT - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
PB - Springer Verlag
Y2 - 26 June 2014 through 28 June 2014
ER -