Korean Society of Nephrology


Category: Spring Conference Abstract Book
Title: Privacy-Preserving Synthetic Data Enhances PO-AKI Prediction in Data-Scarce Scenarios
Author: Soie Kwon
Publication info: 2025; 2025(1):
Keywords: Synthetic data, Postoperative AKI, Deep learning, Generative AI, Non-cardiac major surgery
Abstract: Despite the growing use of artificial intelligence, limited data availability and privacy concerns restrict its clinical application. This study aimed to develop a synthetic data generation model as a promising solution to both problems, enabling prediction of postoperative acute kidney injury (PO-AKI) even with a relatively small real-world dataset. We developed a model that generates virtual patient data, incorporating comorbidities, laboratory results, medication history, surgical details, and PO-AKI occurrence in patients who underwent non-cardiac major surgery. The model was built on the BERT architecture and trained using real-world data from data-rich hospitals. Privacy risks were evaluated through Membership and Attribute Inference Attacks (MIA and AIA). The similarity between synthetic and real-world data was assessed statistically, and clinical utility was evaluated by examining whether augmenting data-scarce scenarios with exactly matched synthetic data improved PO-AKI prediction using CatBoost. Real-world data from a total of 335,687 patients were collected from six tertiary hospitals, including 275,727 from three data-rich hospitals and 59,960 from three data-scarce hospitals. The similarity between the real-world data from the data-rich hospitals, which served as the training set for the synthetic generation model, and the synthetic data from each hospital was analyzed (Table 1). At SNUH, 90.4% of variables showed no statistically significant difference between real-world and synthetic data, compared to 89.0% at SNUBH and 94.4% at AMC. The MIA and AIA results confirmed the privacy protection of the synthesized data. The clinical utility of synthetic data in PO-AKI prediction was evaluated by augmenting real-world data-scarce cohorts (250–2,000 patients) with synthetic data. The benefit was most pronounced in smaller cohorts, peaking at 2,000–4,000 synthetic patients and plateauing beyond 16,000 (Figure 1). This is the first study to apply generative AI to PO-AKI prediction. We comprehensively demonstrate its clinical utility in data-scarce scenarios by enhancing prediction performance through synthetic data augmentation.
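
The abstract does not describe how the membership inference attack was carried out. As a rough illustration of the idea only, the sketch below assumes a simple distance-based attack: an attacker guesses that a real record was in the generator's training set if it lies unusually close to some synthetic record. The function names (`toy_records`, `nearest_distance`, `mia_auc`), the Euclidean distance, and the Gaussian toy data are all hypothetical choices made for this example and are not taken from the study.

```python
import math
import random


def toy_records(n, rng, dims=3):
    """Draw n toy numeric records (hypothetical stand-ins for patient features)."""
    return [tuple(rng.gauss(0.0, 1.0) for _ in range(dims)) for _ in range(n)]


def nearest_distance(record, pool):
    """Euclidean distance from one record to its closest neighbour in pool."""
    return min(math.dist(record, other) for other in pool)


def mia_auc(members, non_members, synthetic):
    """AUC of a distance-based membership inference attack (assumed variant).

    Each real record is scored by the negative distance to its nearest
    synthetic record, so a closer match means "more likely a training member".
    The AUC is the probability that a random member outscores a random
    non-member, with ties counted as one half.
    """
    member_scores = [-nearest_distance(r, synthetic) for r in members]
    outsider_scores = [-nearest_distance(r, synthetic) for r in non_members]
    wins = 0.0
    for m in member_scores:
        for o in outsider_scores:
            if m > o:
                wins += 1.0
            elif m == o:
                wins += 0.5
    return wins / (len(member_scores) * len(outsider_scores))
```

Under this assumed setup, an AUC close to 0.5 means the attacker cannot distinguish training members from held-out patients, while an AUC close to 1.0 indicates that the synthetic data reproduce training records too closely. A generator that simply copied its training set would score at the upper extreme.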