Open-radiomics: A Collection of Standardized Datasets and a Technical Protocol for Reproducible Radiomics Machine Learning Pipelines

29 Jul 2022 · Khashayar Namdar, Matthias W. Wagner, Birgit B. Ertl-Wagner, Farzad Khalvati ·

Purpose: As an important branch of machine learning pipelines in medical imaging, radiomics faces two major challenges namely reproducibility and accessibility. In this work, we introduce open-radiomics, a set of radiomics datasets along with a comprehensive radiomics pipeline based on our proposed technical protocol to investigate the effects of radiomics feature extraction on the reproducibility of the results. Materials and Methods: Experiments are conducted on BraTS 2020 open-source Magnetic Resonance Imaging (MRI) dataset that includes 369 adult patients with brain tumors (76 low-grade glioma (LGG), and 293 high-grade glioma (HGG)). Using PyRadiomics library for LGG vs. HGG classification, 288 radiomics datasets are formed; the combinations of 4 MRI sequences, 3 binWidths, 6 image normalization methods, and 4 tumor subregions. Random Forest classifiers were used, and for each radiomics dataset the training-validation-test (60%/20%/20%) experiment with different data splits and model random states was repeated 100 times (28,800 test results) and Area Under Receiver Operating Characteristic Curve (AUC) was calculated. Results: Unlike binWidth and image normalization, tumor subregion and imaging sequence significantly affected performance of the models. T1 contrast-enhanced sequence and the union of necrotic and the non-enhancing tumor core subregions resulted in the highest AUCs (average test AUC 0.951, 95% confidence interval of (0.949, 0.952)). Although 28 settings and data splits yielded test AUC of 1, they were irreproducible. Conclusion: Our experiments demonstrate the sources of variability in radiomics pipelines (e.g., tumor subregion) can have a significant impact on the results, which may lead to superficial perfect performances that are irreproducible.

PDF Abstract