2390
Augmented ensemble learning is effective strategy for imbalanced small dataset: improve differentiation of low from high grade prostate cancer
Yuta Akamine1, Yoshiko Ueno2, Keitaro Sofue2, Takamichi Murakami2, Yu Ueda1, Ahsan Budrul1, Masami Yoneyama1, Makoto Obara1, and Marc Van Cauteren3
1Philips Japan, Tokyo, Japan, 2Department of Radiology, Kobe University Graduate School of Medicine, Hyogo, Japan, 3Asia Pacific, Philips Healthcare, Tokyo, Japan
SMOTE-based data augmentation was combined with ensemble learning, and the model was trained on multiparametric MR (mp-MR) data including IVIM, DKI, and permeability. The proposed method achieved an F1 score of 0.831 and an AUC of 0.762, suggesting that it is an effective strategy for differentiating low-grade from high-grade prostate cancer in a small, imbalanced dataset.
Fig. 1. Schema of the synthetic minority over-sampling technique (SMOTE) algorithm. (A) The SMOTE algorithm generates synthetic examples by linear interpolation between two existing minority examples. (B) For each minority sample, neighbors are randomly chosen from its k nearest neighbors, with the number chosen depending on how many synthetic examples are required. In this study, we used five nearest neighbors.
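The interpolation step described in Fig. 1 can be sketched in a few lines of NumPy. This is a minimal illustration, not the implementation used in the study: the function name `smote_sketch`, its parameters, and the toy feature values are hypothetical. It also simplifies the procedure by drawing base samples at random instead of looping over every minority sample. Only the default of five nearest neighbors is taken from the caption.

```python
import numpy as np


def smote_sketch(minority, n_synthetic, k=5, seed=0):
    """Create synthetic minority samples by linear interpolation.

    Hypothetical helper for illustration; assumes Euclidean distance
    and at least two minority samples.
    """
    rng = np.random.default_rng(seed)
    minority = np.asarray(minority, dtype=float)
    n = len(minority)
    k = min(k, n - 1)  # cannot have more neighbors than other samples

    # Pairwise Euclidean distances between minority samples.
    diff = minority[:, None, :] - minority[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    np.fill_diagonal(dist, np.inf)  # a sample is not its own neighbor

    # Indices of the k nearest neighbors of every minority sample.
    neighbors = np.argsort(dist, axis=1)[:, :k]

    # Pick a base sample and one of its k neighbors for each new point.
    base = rng.integers(0, n, size=n_synthetic)
    chosen = neighbors[base, rng.integers(0, k, size=n_synthetic)]

    # Place the new point at a random position on the connecting segment.
    gap = rng.random((n_synthetic, 1))
    return minority[base] + gap * (minority[chosen] - minority[base])


# Toy minority class with two made-up features per lesion.
toy_minority = np.array(
    [[1.0, 2.0], [1.2, 2.1], [0.9, 1.8], [1.5, 2.4], [1.1, 2.2], [1.3, 1.9]]
)
synthetic = smote_sketch(toy_minority, n_synthetic=10)
```

Because every synthetic point lies on a segment between two real minority samples, the augmented data stay inside the region already occupied by the minority class.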
Table 4. Performance of the proposed method combining SMOTE and the ensemble model, compared with the base models. Accuracy, sensitivity, specificity, F1 score, and AUC were evaluated. The P value for the comparison between the ensemble model without SMOTE and with SMOTE is shown. A paired t-test was used, and a P value less than 0.05 was considered significant.
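For readers less familiar with the threshold-based metrics named in the caption, the sketch below shows their standard definitions from a binary confusion matrix. The function name and the counts are hypothetical and are not results from the study. Which grade is treated as the positive class is an assumption that the caption does not specify.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary metrics from confusion-matrix counts.

    tp/fp/tn/fn are true/false positives/negatives; the choice of
    positive class is left to the caller (an assumption here).
    """
    sensitivity = tp / (tp + fn)   # recall on the positive class
    specificity = tn / (tn + fp)   # recall on the negative class
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    # F1 is the harmonic mean of precision and sensitivity.
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


# Made-up counts purely for illustration.
example = classification_metrics(tp=8, fp=2, tn=6, fn=4)
```

On an imbalanced dataset, accuracy can look high even when the minority class is poorly detected, which is why F1 and AUC are reported alongside it.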