Predictive Modeling For Depression With Co-Morbidities – Results From Korea National Health Insurance Services Data
Depression, despite its high prevalence, remains severely under-diagnosed across the healthcare system. This calls for data-driven approaches that can help screen patients who are at high risk of depression. In this work, depression risk prediction models that incorporate disease co-morbidities were built on data from the one-million-person, twelve-year longitudinal cohort of the Korea National Health Insurance Services (KNHIS), using multiple supervised machine-learning approaches, including decision tree, boosted trees, random forest, and support vector machine. A traditional logistic regression model and an Elastic Net regression model were then employed to balance predictive performance and interpretability. Among the supervised machine-learning approaches, boosted trees, random forest, and support vector machine achieved an Area Under the Receiver Operating Characteristic Curve (AUROC) of 0.793, 0.739, and 0.660, respectively. The Elastic Net regression model achieved an AUROC of 0.7818, compared with 0.6992 for a traditional logistic regression model without co-morbidity analysis. In addition, the Elastic Net regression model yielded co-morbidity-adjusted Odds Ratios (ORs), which may provide a more accurate independent estimate of the effect of each predictor variable. In conclusion, including co-morbidity analysis in the Elastic Net regression model gave depression risk prediction performance comparable to that of the supervised machine-learning methods, while providing better interpretability.
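The combination described in the abstract (an Elastic Net penalized logistic regression, evaluated by AUROC and read through adjusted odds ratios) can be illustrated with a minimal sketch. It is not the thesis's implementation, which the abstract does not describe: the data are synthetic, the three binary predictors stand in for hypothetical co-morbidity flags, and the function names, penalty strength `lam`, mixing weight `alpha`, and proximal-gradient solver are all assumptions made for illustration.

```python
import math
import random

def auroc(labels, scores):
    """AUROC as the chance a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def soft_threshold(w, t):
    """Proximal operator of the L1 penalty: shrink w toward zero by t."""
    return math.copysign(max(abs(w) - t, 0.0), w)

def fit_elastic_net_logistic(X, y, lam=0.01, alpha=0.5, lr=0.5, epochs=300):
    """Proximal gradient descent on mean log-loss + lam * (alpha * L1 + (1 - alpha) / 2 * L2).

    The intercept b is left unpenalized. All hyperparameter values are illustrative.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err
            for j in range(d):
                grad_w[j] += err * xi[j]
        b -= lr * grad_b / n
        for j in range(d):
            # Gradient step on the smooth part (log-loss + L2), then L1 shrinkage.
            step = w[j] - lr * (grad_w[j] / n + lam * (1 - alpha) * w[j])
            w[j] = soft_threshold(step, lr * lam * alpha)
    return w, b

# Synthetic cohort: three hypothetical binary co-morbidity flags; the third has no true effect.
rng = random.Random(0)
true_w, true_b = [1.2, 0.8, 0.0], -1.5
X = [[float(rng.random() < 0.3) for _ in range(3)] for _ in range(500)]
y = [int(rng.random() < sigmoid(true_b + sum(t * x for t, x in zip(true_w, xi)))) for xi in X]

w, b = fit_elastic_net_logistic(X, y)
scores = [sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) for xi in X]
auc = auroc(y, scores)                   # in-sample AUROC, for illustration only
odds_ratios = [math.exp(wj) for wj in w]  # adjusted OR per predictor = exp(coefficient)
print(f"AUROC = {auc:.3f}")
print("Adjusted odds ratios:", [round(o, 2) for o in odds_ratios])
```

Because every predictor enters the same penalized model, each exponentiated coefficient is an odds ratio adjusted for the other predictors, which is the sense in which the abstract calls the ORs co-morbidity adjusted. A real analysis would evaluate AUROC on held-out data and tune `lam` and `alpha` by cross-validation.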
Co-morbidity; Depression; Elastic Net; Korea National Health Insurance Services; Machine Learning; Risk Prediction Model
Master of Science
Attribution-NonCommercial-NoDerivatives 4.0 International
dissertation or thesis