Standardization of CT radiomics features for multi-center analysis: Impact of software settings and parameters

Defeudis, A.; De Mattia, C.; Rizzetto, F.; Calderoni, F.; Mazzetti, S.; Torresin, A.; Vanzulli, A.; Regge, D.; Giannini, V.

doi:10.1088/1361-6560/ab9f61

The aim of this multicenter study was to perform an inter-center benchmark, assessing how different software tools and settings applied to the same radiomics workflow affect the values of radiomic features (RFs). This topic is of key importance for starting collaborations between centers and for bringing radiomic studies from benchmark to bedside. A per-lesion analysis was performed on 56 metastases (mts) selected from 14 patients. A single radiologist segmented all mts, and RFs were extracted from the same segmentation of each metastasis using two different software packages and file formats. Potential sources of discrepancy were evaluated. The intraclass correlation coefficient was used to describe how closely the same radiomic measurements calculated at the two centers agreed with each other. In addition, the mean relative change of each RF between centers was calculated, compared, and progressively reduced. We showed that, after matching all formulas, discrepancies in RF calculation between the two centers ranged from 1% to 277%. We therefore evaluated other sources of variability using a stepwise approach, which reduced the inter-center discrepancies to 0% for 22 of 25 RFs and to below 2% for the remaining 3 RFs. This study demonstrates that different radiomic applications and mask formats can strongly affect the computation of some RFs. Therefore, in multi-center studies it is mandatory to adopt all strategies that can help limit these differences, while keeping in mind the feasibility of these strategies in large cohort studies.
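The two agreement measures named above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state which ICC form was used or which center served as the reference for the relative change, so the sketch assumes ICC(2,1) (two-way random effects, absolute agreement, single measurement) and an absolute relative change with center A as reference. The function names and the toy values are hypothetical.

```python
import numpy as np


def mean_relative_change(center_a, center_b):
    """Mean absolute relative change (%) of one RF across lesions.

    Assumption: center A is the reference and its values are non-zero.
    """
    a = np.asarray(center_a, dtype=float)
    b = np.asarray(center_b, dtype=float)
    return float(np.mean(np.abs(a - b) / np.abs(a)) * 100.0)


def icc_2_1(measurements):
    """ICC(2,1): two-way random effects, absolute agreement, single measurement.

    `measurements` has shape (n_lesions, n_centers). The choice of ICC(2,1)
    is an assumption; the abstract does not specify the ICC form.
    """
    x = np.asarray(measurements, dtype=float)
    n, k = x.shape
    grand_mean = x.mean()

    # Sums of squares for lesions (rows), centers (columns), and residual error.
    ss_rows = k * np.sum((x.mean(axis=1) - grand_mean) ** 2)
    ss_cols = n * np.sum((x.mean(axis=0) - grand_mean) ** 2)
    ss_error = np.sum((x - grand_mean) ** 2) - ss_rows - ss_cols

    # Corresponding mean squares.
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return float(
        (ms_rows - ms_error)
        / (ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n)
    )


if __name__ == "__main__":
    # Synthetic values for one feature on three lesions (not study data).
    feature_center_a = [100.0, 200.0, 300.0]
    feature_center_b = [110.0, 180.0, 300.0]

    print(mean_relative_change(feature_center_a, feature_center_b))
    print(icc_2_1(np.column_stack([feature_center_a, feature_center_b])))
```

An absolute-agreement ICC is used here because a systematic offset between the two software packages lowers it, which fits the goal of detecting inter-center discrepancies. A consistency-type ICC would ignore such an offset.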