In this paper we focus on predicting Crohn's disease (CD) susceptibility by analyzing SNP profiles for a number of defined or suggested gene polymorphisms. We assess the correlation between genetic markers and the phenotype using well-founded methods and procedures developed in the field of statistical learning theory. To this end, we use a sample from a case-control study comprising 178 CD patients and 127 healthy controls. The genetic profile of each subject consists of 16 genetic variants distributed over 11 genes. We find that regularized least squares (RLS) classifiers predict Crohn's disease with a statistically significant accuracy of 62% (p = 0.018), significantly increasing the diagnostic accuracy by at least 10% compared with that obtained using CARD15, the most widely confirmed gene involved in CD predisposition. This also demonstrates that our sample size is adequate for accurate and significant prediction estimates. The strength of this methodology, in contrast to classical statistical methods, is that it accounts simultaneously for the effects of several genetic markers and their possible interactions. The findings of this study show that the RLS methodology can increase the diagnostic accuracy of CD prediction through the simultaneous evaluation of a large number of gene polymorphisms. This approach may be particularly useful in large-scale population screening programs and when evaluating large datasets of gene polymorphisms (e.g. chips, microarrays). Moreover, it could shed more light on candidate genes with a weak genetic contribution and help evaluate gene-gene and gene-phenotype interactions in populations with a reasonably small sample size.
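As an illustration of the kind of analysis summarized above, the following is a minimal sketch of a linear RLS classifier evaluated by cross-validation, with a permutation test for significance. The abstract does not specify the kernel, the genotype encoding, the regularization parameter, the validation scheme, or how the p-value was obtained, so everything here is an assumption: genotypes are coded as minor-allele counts (0, 1, 2), the kernel is linear, `lam = 0.1`, validation is 5-fold, and the p-value comes from label permutation. The data are synthetic (only the sample sizes and the number of variants match the study), and all function names are hypothetical.

```python
import numpy as np

def make_synthetic(n_cases=178, n_controls=127, n_snps=16, seed=0):
    """Synthetic genotypes (NOT the study data): minor-allele counts in {0, 1, 2}."""
    rng = np.random.default_rng(seed)
    y = np.concatenate([np.ones(n_cases), -np.ones(n_controls)])
    freq = np.full((len(y), n_snps), 0.30)
    freq[:n_cases, :3] = 0.45  # assumed: three risk alleles enriched in cases
    X = rng.binomial(2, freq).astype(float)
    return X, y

def add_bias(X):
    """Append a constant column so the classifier can learn an offset."""
    return np.hstack([X, np.ones((X.shape[0], 1))])

def fit_rls(X, y, lam):
    """Linear RLS: minimise (1/n)||Xw - y||^2 + lam * ||w||^2 in closed form."""
    n, d = X.shape
    A = X.T @ X + lam * n * np.eye(d)
    return np.linalg.solve(A, X.T @ y)

def predict_rls(X, w):
    """Classify by the sign of the real-valued RLS output."""
    return np.where(X @ w >= 0, 1, -1)

def cv_accuracy(X, y, lam=0.1, k=5, seed=0):
    """k-fold cross-validated accuracy (the fold scheme is an assumption)."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    correct = 0
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        w = fit_rls(X[train], y[train], lam)
        correct += np.sum(predict_rls(X[test], w) == y[test])
    return correct / len(y)

def permutation_p_value(X, y, n_perm=200, seed=0, **kw):
    """Fraction of label shufflings whose CV accuracy reaches the observed one."""
    rng = np.random.default_rng(seed)
    observed = cv_accuracy(X, y, **kw)
    null = np.array([cv_accuracy(X, rng.permutation(y), **kw)
                     for _ in range(n_perm)])
    p = (1 + np.sum(null >= observed)) / (1 + n_perm)
    return observed, p

if __name__ == "__main__":
    X, y = make_synthetic()
    acc, p = permutation_p_value(add_bias(X), y)
    print(f"CV accuracy: {acc:.3f}, permutation p-value: {p:.3f}")
```

Because the weight vector is fitted on all variants at once, the classifier weighs several markers jointly rather than testing each one in isolation; capturing interactions between markers would additionally require a nonlinear kernel or explicit interaction features, which this sketch omits.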
- Crohn's disease
- Single nucleotide polymorphisms