Statistical analysis of protein structural features

Relationships and PCA grouping

E. Del Prete, S. Dotolo, A. Marabotti, A. Facchiano

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Subtle structural differences among homologous proteins may be responsible for the modulation of their functional properties. We are therefore exploring novel and strengthened methods to investigate protein structure in depth and to analyze conformational features, in order to highlight their relationships to functional properties. We selected several protein families from the CATH database on the basis of their different structural classes, and studied in detail many structural parameters for these proteins. Some valuable results from the Pearson’s correlation matrix were validated with a Student’s t‐distribution test at a significance level of 5% (p‐value). We investigated in detail the best relationships among parameters by using partial correlation. Moreover, PCA was applied both to single families and to all families together, in order to demonstrate how to find outliers within a family and how to extract new combined features. The correctness of this approach was borne out by the agreement of our results with known or expected geometric and structural properties. In addition, we found previously unknown relationships, which will be the object of further studies, in order to evaluate them as putative markers of the peculiar structure‐function relationships of each family.
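The correlation steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the parameter names (`length`, `surface`, `gyration`) and all data values are hypothetical, and the critical value 2.306 is the standard two-sided 5% threshold of Student's t for 8 degrees of freedom, which matches only the toy sample size used here.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom (|r| < 1)."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


def partial_r(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y, controlling for z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical toy values for three structural parameters of ten proteins
# (illustrative only, not data from the paper).
length = [100, 120, 140, 160, 180, 200, 220, 240, 260, 280]
surface = [5400, 5900, 7100, 7600, 8900, 9500, 10800, 11200, 12500, 12900]
gyration = [13.2, 13.9, 15.1, 15.0, 16.4, 16.9, 17.1, 18.3, 18.2, 19.5]

T_CRIT = 2.306  # two-sided 5% critical value of Student's t, 8 degrees of freedom

n = len(length)
r_sg = pearson_r(surface, gyration)
t_sg = t_statistic(r_sg, n)
print("r(surface, gyration) =", round(r_sg, 3))
print("significant at 5%:", abs(t_sg) > T_CRIT)

# Does the surface/gyration relationship survive once chain length is held fixed?
r_sl = pearson_r(surface, length)
r_gl = pearson_r(gyration, length)
print("partial r given length =", round(partial_r(r_sg, r_sl, r_gl), 3))
```

Comparing the raw and the partial coefficient shows how much of an apparent relationship between two parameters is explained by a third one, here chain length. With a different sample size the critical value has to be looked up for the corresponding degrees of freedom.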

Original language: English
Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Publisher: Springer Verlag
Pages: 33-43
Number of pages: 11
Volume: 8623
ISBN (Print): 9783319244617
DOIs: https://doi.org/10.1007/978-3-319-24462-4_3
Publication status: Published - 2015
Event: 11th International Meeting on Computational Intelligence Methods for Bioinformatics and Biostatistics, CIBB 2014 - Cambridge, United Kingdom
Duration: Jun 26 2014 to Jun 28 2014

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 8623
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 11th International Meeting on Computational Intelligence Methods for Bioinformatics and Biostatistics, CIBB 2014
Country: United Kingdom
City: Cambridge
Period: 6/26/14 to 6/28/14

Keywords

  • Correlation
  • Global features
  • PCA
  • Protein structure

ASJC Scopus subject areas

  • Computer Science (all)
  • Theoretical Computer Science

Cite this

Del Prete, E., Dotolo, S., Marabotti, A., & Facchiano, A. (2015). Statistical analysis of protein structural features: Relationships and PCA grouping. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8623, pp. 33-43). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 8623). Springer Verlag. https://doi.org/10.1007/978-3-319-24462-4_3
