Artificial neural networks (ANNs) have been shown to be promising tools for learning the complex interplay of factors that determine a particular outcome. We performed this study to compare the predictive power of ANNs and conventional methods in predicting bone mineral density (BMD) in Iranian post-menopausal women.
A database of 10 variables (age, weight, age at menopause, corticosteroid use, estrogen use, number of pregnancies, age at menarche, tea consumption, activity, and smoking) on 2158 participants who underwent screening for osteoporosis in an endocrinology research center was randomly divided into training (1400), validation (150), and test (608) groups. Robust multivariate linear regression and ANN models were developed and validated on the training and validation sets, and the outcomes (femoral neck and lumbar T scores) were then predicted and compared on the test group using different numbers of input variables. Results were evaluated by comparing the mean square of differences between predicted and reference values (non-central chi-square test) and by measuring the area under the receiver operating characteristic curve (AUROC) around a cutoff value of -2.5.
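The partitioning and the two evaluation measures described above can be sketched as follows. This is a minimal illustration and not the study's code: the function names, the seed, and the convention that a reference T score at or below -2.5 counts as a positive case are assumptions, and the AUROC is computed with the generic pairwise (Mann-Whitney) formulation rather than whatever software the authors used.

```python
import random


def split_indices(n_total, n_train, n_val, seed=0):
    """Shuffle 0..n_total-1 and cut it into train / validation / test lists."""
    idx = list(range(n_total))
    random.Random(seed).shuffle(idx)  # seed is an arbitrary illustrative choice
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]


def mean_square_difference(reference_t, predicted_t):
    """Mean of squared differences between reference and predicted T scores."""
    pairs = list(zip(reference_t, predicted_t))
    return sum((r - p) ** 2 for r, p in pairs) / len(pairs)


def auroc(reference_t, predicted_t, cutoff=-2.5):
    """AUROC for detecting reference T score <= cutoff (assumed convention).

    The predicted T score is the ranking score: a lower prediction means
    the case is ranked as more likely to be positive.
    """
    pos = [p for r, p in zip(reference_t, predicted_t) if r <= cutoff]
    neg = [p for r, p in zip(reference_t, predicted_t) if r > cutoff]
    if not pos or not neg:
        raise ValueError("need at least one case on each side of the cutoff")
    # Fraction of (positive, negative) pairs ranked correctly; ties count half.
    wins = 0.0
    for p in pos:
        for q in neg:
            if p < q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Group sizes reported in the text: 1400 + 150 + 608 = 2158.
train, val, test = split_indices(2158, 1400, 150)
```

The pairwise AUROC is quadratic in the number of cases, which is acceptable for a test group of 608 but would call for a rank-based implementation on much larger data.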
For models with fewer than 3 input variables for the femoral neck and fewer than 4 for the lumbar area, the linear and ANN models performed almost identically. As more variables were entered into the models, ANN outperformed linear regression. Across the 2- to 10-variable models, AUROC ranged as follows: ANN, lumbar area: 0.709 to 0.774; linear models, lumbar area: 0.709 to 0.744; ANN, femoral neck: 0.801 to 0.867; linear models, femoral neck: 0.799 to 0.834. All models predicted femoral neck BMD better than lumbar BMD. At its best, the ANN model yielded a sensitivity of 85.6% and 77.7% and a specificity of 74.6% and 65.5% for the femoral neck and lumbar area, respectively, whereas the best regression model had a sensitivity of 72.5% and 75.2% and a specificity of 75.1% and 66.7% for the same two sites.
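Sensitivity and specificity figures of this kind follow from dichotomizing both the reference and the predicted T scores. The sketch below is illustrative only: the function name is hypothetical, the "at or below the cutoff" convention is an assumption, and the decision threshold applied to the predicted score is left as a parameter because the text does not state how the operating point was chosen.

```python
def sensitivity_specificity(reference_t, predicted_t, cutoff=-2.5, threshold=-2.5):
    """Return (sensitivity, specificity) for flagging low bone density.

    cutoff    : reference T score at or below this is a true positive case
                (assumed convention).
    threshold : predicted T score at or below this is flagged by the model
                (hypothetical operating point, need not equal the cutoff).
    """
    tp = fn = tn = fp = 0
    for r, p in zip(reference_t, predicted_t):
        actual = r <= cutoff
        flagged = p <= threshold
        if actual and flagged:
            tp += 1
        elif actual:
            fn += 1
        elif flagged:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return tp / (tp + fn), tn / (tn + fp)
```

Raising the threshold flags more participants, which trades specificity for sensitivity; sweeping it over all values traces out the ROC curve summarized by the AUROC.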
The superior performance of neural networks over linear models can be used to achieve more accurate predictions from easily obtainable, questionnaire-based variables. These models show their advantage especially in mass screening applications, where even a slight improvement in performance results in a substantial decrease in the number of misclassifications.
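The arithmetic behind the screening argument can be made concrete with a small sketch. The sensitivity and specificity values are the best femoral neck figures reported above; the cohort size and prevalence are hypothetical values chosen only for illustration and do not come from the study.

```python
def expected_misclassified(n_screened, prevalence, sensitivity, specificity):
    """Expected false negatives plus false positives in a screened cohort."""
    n_pos = n_screened * prevalence
    n_neg = n_screened - n_pos
    false_negatives = n_pos * (1.0 - sensitivity)
    false_positives = n_neg * (1.0 - specificity)
    return false_negatives + false_positives


N_SCREENED = 10_000  # hypothetical cohort size, not from the study
PREVALENCE = 0.20    # hypothetical prevalence of T score <= -2.5, not from the study

# Best femoral neck operating points reported in the text.
ann_errors = expected_misclassified(N_SCREENED, PREVALENCE, 0.856, 0.746)
reg_errors = expected_misclassified(N_SCREENED, PREVALENCE, 0.725, 0.751)

print(f"ANN: {ann_errors:.0f} expected misclassifications")
print(f"Linear regression: {reg_errors:.0f} expected misclassifications")
```

The comparison is sensitive to the assumed prevalence and to how false negatives are weighted against false positives: at a sufficiently low prevalence, the regression model's slightly higher specificity can offset its lower sensitivity.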