Introduction: Obstructive sleep apnea syndrome has become an important public health concern. Polysomnography is traditionally considered an established and effective diagnostic tool, providing information on the severity of obstructive sleep apnea syndrome and the degree of sleep fragmentation. However, the numerous steps involved in a polysomnography test make it costly and time-consuming. This study aimed to test the efficacy and clinical applicability of different machine learning methods for predicting obstructive sleep apnea syndrome severity from demographic information and questionnaire data.
Materials and methods: We collected data on the demographic characteristics, spirometry values, gas exchange (PaO2, PaCO2) and symptoms (Epworth Sleepiness Scale, snoring, etc.) of 313 patients with a previous diagnosis of obstructive sleep apnea syndrome. After principal component analysis, we selected 19 variables, which were further preprocessed and then used to train seven types of classification models and five types of regression models. The models were evaluated on their ability to predict obstructive sleep apnea syndrome severity, represented either by class or by apnea–hypopnea index. All models were trained with an increasing number of features, and the results were validated through stratified 10-fold cross-validation.
Results: Comparative results show the superiority of support vector machine and random forest models for classification, while support vector machine and linear regression are better suited to predicting the apnea–hypopnea index. Moreover, a limited number of features is enough to achieve the maximum predictive accuracy. The best average classification accuracy on test sets is 44.7 percent, with the same average sensitivity (recall). In only 5.7 percent of cases is severe obstructive sleep apnea syndrome (class 4) misclassified as mild (class 2). Regression results show a minimum root mean squared error of 22.17.
Conclusion: Predicting the apnea–hypopnea index or severity class for obstructive sleep apnea syndrome is very difficult when using only data collected prior to the polysomnography test. The results achieved with the available data suggest that machine learning methods can be used as tools for assigning patients a priority level for a polysomnography test, but they still cannot be used for automated diagnosis.
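The validation scheme named in the methods (stratified 10-fold cross-validation, scored by accuracy for classes and by root mean squared error for the apnea–hypopnea index) can be illustrated with a minimal standard-library sketch. This is not the authors' code: the function names `stratified_kfold`, `accuracy` and `rmse` are hypothetical, the class counts are invented so that they total 313, and a majority-class baseline stands in for the trained models.

```python
import math
import random
from collections import Counter, defaultdict


def stratified_kfold(labels, k=10, seed=0):
    """Return k (train_idx, test_idx) pairs that keep class proportions similar."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    folds = [[] for _ in range(k)]
    pos = 0  # running position so that overall fold sizes also stay balanced
    for y in sorted(by_class):
        idxs = by_class[y]
        rng.shuffle(idxs)
        for i in idxs:
            folds[pos % k].append(i)  # deal each class round-robin across folds
            pos += 1
    splits = []
    for f in range(k):
        test = sorted(folds[f])
        train = sorted(i for g in range(k) if g != f for i in folds[g])
        splits.append((train, test))
    return splits


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error between true and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


# Hypothetical class counts (NOT from the study), chosen only to total 313.
labels = [1] * 40 + [2] * 90 + [3] * 83 + [4] * 100
splits = stratified_kfold(labels, k=10)

# Majority-class baseline: predict the most frequent training class on each test fold.
fold_acc = []
for train, test in splits:
    majority = Counter(labels[i] for i in train).most_common(1)[0][0]
    y_true = [labels[i] for i in test]
    fold_acc.append(accuracy(y_true, [majority] * len(test)))
mean_acc = sum(fold_acc) / len(fold_acc)
print(f"majority-class baseline accuracy: {mean_acc:.3f}")
```

A baseline of this kind gives a reference point for reading a multi-class accuracy figure; the actual class distribution of the study is not given in the abstract, so the number printed here says nothing about the reported results.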
- machine learning
- obstructive sleep apnea syndrome
ASJC Scopus subject areas
- Health Informatics