Summary
Objectives: To develop a deep learning-based harmonization framework and assess whether it can improve the performance of radiomics models across different reconstruction kernels in different clinical tasks, and whether it generalizes to mitigate the effects of new, unobserved kernels on radiomics features.

Methods: Patient data with 2 reconstruction kernels and phantom data with 22 reconstruction kernels were included. Eighty-five patients were studied for lymph node metastasis (LNM) prediction, and 164 patients for differential diagnosis between lung cancer (LC) and pulmonary tuberculosis (TB). Two convolutional neural network (CNN) models were developed to convert images (i) from B70f to B30f (CNNa) and (ii) from B30f to B70f (CNNb). Model performance between the two kernels was evaluated using the area under the receiver operating characteristic curve (AUC) and compared with other well-known harmonization methods. Patient-normalized feature difference (PNFD) was used to identify kernels incompatible with the baseline (B30f/B70f), i.e., kernels with a median PNFD > 1, and to measure the ability of the CNN models to convert these non-comparable kernels.

Results: For LC versus pulmonary TB diagnosis, AUCs of CNNa vs. others were 0.85 vs. 0.54-0.74 (p = 0.0001-0.0003), and AUCs of CNNb vs. others were 0.87 vs. 0.54-0.86 (p = 0.0001-0.55). For LNM prediction, AUCs of CNNa vs. others were 0.68 vs. 0.56-0.61 (p = 0.10-0.39), and AUCs of CNNb vs. others were 0.78 vs. 0.70-0.73 (p = 0.07-0.40). After CNN harmonization, 17 of 20 (85%) investigated unknown kernels produced radiomics feature values comparable to baseline (median PNFD reduced from 1.10-2.31 to 0.23-1.13).

Conclusion: CNN harmonization effectively improved the performance of radiomics models across reconstruction kernels in different clinical tasks, and reduced feature differences between unknown kernels and baseline.
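The incompatibility criterion stated in the Methods (a kernel is non-comparable with baseline when its median PNFD exceeds 1) can be illustrated with a minimal sketch. The abstract does not give the PNFD formula, so the sketch assumes per-feature PNFD values have already been computed; the function name, kernel labels, and numbers below are hypothetical and are not taken from the study.

```python
from statistics import median

# Threshold taken from the abstract: median PNFD > 1 means "incompatible".
PNFD_THRESHOLD = 1.0


def flag_incompatible(pnfd_by_kernel, threshold=PNFD_THRESHOLD):
    """Map each kernel to (median PNFD, is_incompatible).

    pnfd_by_kernel: dict of kernel label -> list of per-feature PNFD values,
    assumed to be precomputed against the baseline kernel.
    """
    flags = {}
    for kernel, values in pnfd_by_kernel.items():
        med = median(values)
        flags[kernel] = (med, med > threshold)
    return flags


# Hypothetical PNFD values, for illustration only (not study data).
example = {
    "kernel_A": [0.2, 0.4, 0.9],  # median 0.4 -> comparable
    "kernel_B": [0.8, 1.3, 2.0],  # median 1.3 -> incompatible
}

flags = flag_incompatible(example)
comparable = [k for k, (_, incompatible) in flags.items() if not incompatible]
share_comparable = len(comparable) / len(flags)
```

Applied once before and once after harmonization, the same check would give the kind of count reported in the Results (the share of unknown kernels whose median PNFD falls at or below the threshold).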
Institution: Southern Medical University (南方医科大学); 5; 1