Development and Validation of Predictive Models for Depression Using PHQ-9 Data
Abstract: Depression, the leading cause of suicide worldwide, is a serious, widespread, and growing mental
health disorder that has now been labeled a global health epidemic. The Patient Health Questionnaire-9
(PHQ-9), a depression-screener questionnaire, has emerged as an effective diagnostic tool globally. Using U.S.
PHQ-9 patient response data and corresponding demographic data from 2013-2014 and 2015-2016, this study
conducts a comprehensive big data analysis of the response data to develop and validate predictive models
for depression probability. Age at screening, gender, race/ethnicity, education level, and body weight were
proposed as factors correlated with depression. Two models were constructed using RStudio to explore these
correlations: a logistic regression model, and an artificial neural network. The logistic regression predictive
model performed better than the artificial neural network on an unfamiliar dataset, whereas the opposite was
true on a familiar dataset. Both models indicated that the proposed factors are indeed significantly correlated
with depression. The logistic regression model indicated that females and those with weight problems are
more likely to have depression, and that the likelihood of depression increases with age, decreases with higher
education levels, and varies by race. The artificial neural network indicated that age, the Asian race, some
college education, and weight problems are the most significant factors affecting depression probability, in that
order. Based on these results, the populations most at risk for depression are identified, and appropriate measures should be taken to combat depression.
Citation: Jonathan C. Huang. “Development and Validation of Predictive Models for Depression Using PHQ-9
Data”. American Research Journal of Psychiatry, 2018; 1(1): 1-14.
Copyright © 2018 Jonathan C. Huang. This is an open access article distributed under the Creative Commons
Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.