Abstract
Background: Previous studies of IgA nephropathy (IgAN) have focused on evaluating risk factors. We aimed to develop individual outcome prediction models for IgAN patients using machine learning methods.
Methods: We reviewed the clinical and pathologic characteristics of adult IgAN patients from Seoul National University Hospital (SNUH, n=1,540) and Asan Medical Center (AMC, n=1,044) at the time of renal biopsy. Each cohort was divided into a development set (followed up ≥10 years) and a prediction set (followed up <10 years). The outcome was the probability of 10-year renal survival (10YRS). We developed prediction models in the SNUH development set using logistic regression (LR) with the Lasso method, a classification and regression tree (CART), and a neural network (NN), based on 16 clinico-pathologic variables. We also applied bagging, random forest (RF), and boosting for ensemble learning. Finally, the models were validated internally in the SNUH prediction set and externally in the AMC development and prediction sets.
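The modeling workflow described above can be sketched with scikit-learn on synthetic data. This is an illustrative assumption only: the cohort data, variable definitions, and hyperparameters used in the study are not reproduced here, and all names in the snippet are hypothetical.

```python
# Hedged sketch of the described pipeline: six learners fitted on a
# "development" split and evaluated on a held-out split. Synthetic data
# stand in for the 16 clinico-pathologic variables; the binary target
# plays the role of 10-year renal survival. All settings are assumptions.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.ensemble import (BaggingClassifier, RandomForestClassifier,
                              GradientBoostingClassifier)
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=1000, n_features=16, n_informative=8,
                           random_state=0)
X_dev, X_val, y_dev, y_val = train_test_split(X, y, test_size=0.3,
                                              random_state=0)

models = {
    "LR (Lasso)": LogisticRegression(penalty="l1", solver="liblinear"),
    "CART": DecisionTreeClassifier(max_depth=5, random_state=0),
    "NN": MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000,
                        random_state=0),
    "Bagging": BaggingClassifier(random_state=0),
    "RF": RandomForestClassifier(random_state=0),
    "Boosting": GradientBoostingClassifier(random_state=0),
}

accuracies = {}
for name, model in models.items():
    model.fit(X_dev, y_dev)              # develop on one split
    pred = model.predict(X_val)          # evaluate on the held-out split
    accuracies[name] = accuracy_score(y_val, pred)

for name, acc in sorted(accuracies.items()):
    print(f"{name}: {acc:.3f}")
```

In the study, the held-out evaluation corresponds to the internal (SNUH prediction set) and external (AMC) validation steps rather than a random split.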
Results: After excluding patients with missing data, 1,514 and 847 patients were included from the SNUH and AMC cohorts, respectively. In the LR model, estimated glomerular filtration rate (eGFR), hemoglobin, the proportions of global sclerosis (GS) and segmental sclerosis (SS), and interstitial fibrosis were selected as predictors of end-stage renal disease (ESRD). In the CART model, eGFR (cutoff, 53.3 mL/min/1.73 m²) was the primary split for 10YRS, followed sequentially by the proportion of GS (≥39.2%), proportion of SS (≥28.5%), tubular atrophy (≥moderate), hemoglobin (<11.9 g/dL), and proteinuria (≥1.39 g/g) (accuracy, 0.849). In the NN model, the selected variables were, in order, eGFR, proportion of SS, hemoglobin, proportion of GS, albumin, and interstitial fibrosis (accuracy, 0.862). The ensemble learners also performed well (accuracies: bagging, 0.868; RF, 0.874; boosting, 0.862). The individual learners were validated internally with good performance (sensitivities: LR, 0.855; CART, 0.921; NN, 0.952; bagging, 0.857; RF, 0.921; boosting, 0.921). Finally, external validation demonstrated the robustness of the models, with good performance in both the development set (sensitivities: LR, 0.847; CART, 0.867; NN, 0.855; bagging, 0.852; RF, 0.872; boosting, 0.851) and the prediction set (sensitivities: LR, 0.980; CART, 0.902; NN, 0.941; bagging, 0.882; RF, 0.902; boosting, 0.922).
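The validation results are reported as sensitivities, i.e., the recall for the positive class: TP / (TP + FN). A minimal example of this computation, using made-up labels rather than study data:

```python
# Sensitivity = recall of the positive class = TP / (TP + FN).
# The labels below are illustrative only, not study data.
from sklearn.metrics import recall_score

y_true = [1, 1, 1, 0, 0, 1, 0, 1]   # actual 10YRS outcomes (hypothetical)
y_pred = [1, 1, 0, 0, 1, 1, 0, 1]   # model predictions (hypothetical)

sensitivity = recall_score(y_true, y_pred)  # TP=4, FN=1 -> 4/5
print(sensitivity)  # 0.8
```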
Conclusions: We developed robust machine learning models to predict an individual's likelihood of 10YRS in IgAN, with both internal and external validation.