An Artificial Intelligence Model Based on ACR TI-RADS Characteristics for US Diagnosis of Thyroid Nodules

Authors: Chen, Yufan; Gao, Zixiong; He, Yanni; Mai, Wuping; Li, Jinhua; Zhou, Meijun; Li, Sushu; Yi, Wenhong; Wu, Shuyu; Bai, Tong; Zhang, Ning; Zeng, Weibo; Lu, Yao; Liu, Hongmei*
Source: RADIOLOGY, 2022, 303(3): 613-619.
DOI: 10.1148/radiol.211455

Abstract

Background: US-based diagnosis of thyroid nodules is subjective and influenced by radiologists' experience levels.

Purpose: To develop an artificial intelligence model based on American College of Radiology Thyroid Imaging Reporting and Data System characteristics for diagnosing thyroid nodules and identifying nodule characteristics (hereafter, MTI-RADS) and to compare the performance of MTI-RADS, radiologists, and a model trained on benign and malignant status based on surgical histopathologic analysis (hereafter, M-Diag).

Materials and Methods: In this retrospective study, 1588 surgically proven nodules from 636 consecutive patients (mean age, 49 years ± 14 [SD]; 485 women) were included. MTI-RADS and M-Diag were trained on US images of 1345 nodules (January 2018 to December 2019). The performance of MTI-RADS was compared with that of M-Diag and radiologists with different experience levels on the test data set (243 nodules, January 2019 to December 2019) with the DeLong method and McNemar test.

Results: The area under the receiver operating characteristic curve (AUC) and sensitivity of MTI-RADS were 0.91 and 83% (55 of 66 nodules), respectively, which were not significantly different from those of experienced radiologists (0.93 [P = .45] and 92% [61 of 66 nodules; P = .07]) and exceeded those of junior radiologists (0.78 [P < .001] and 70% [46 of 66 nodules; P = .04]). The specificity of MTI-RADS (87% [154 of 177 nodules]) was higher than that of both experienced and junior radiologists (80% [141 of 177 nodules; P = .02] and 75% [133 of 177 nodules; P = .001], respectively). The AUC of MTI-RADS was higher than that of M-Diag (0.91 vs 0.84, respectively; P = .001). In the test set of 243 nodules, the consistency rates between MTI-RADS and the experienced group were higher than those between MTI-RADS and the junior group for composition (79% [n = 193] vs 73% [n = 178], respectively; P = .02), echogenicity (75% [n = 183] vs 68% [n = 166]; P = .04), shape (93% [n = 227] vs 88% [n = 215]; P = .04), and smooth or ill-defined margin (72% [n = 174] vs 63% [n = 152]; P = .002).

Conclusion: The area under the receiver operating characteristic curve (AUC) of an artificial intelligence model based on the American College of Radiology Thyroid Imaging Reporting and Data System (TI-RADS) was higher than that of a model trained on benign and malignant status based on surgical histopathologic analysis. The AUC and sensitivity of the model based on TI-RADS exceeded those of junior radiologists; the specificity of the model was higher than that of both experienced and junior radiologists.
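
The sensitivity and specificity percentages reported in the Results follow directly from the stated counts (66 malignant and 177 benign nodules in the 243-nodule test set). The short Python sketch below is not from the paper; it only recomputes those percentages as a check on the arithmetic. The function name `rate` and the dictionary layout are illustrative assumptions, and the P values (McNemar test, DeLong method) are not reproduced because they require per-nodule data that the abstract does not give.

```python
# Illustrative check of the reported percentages; names here are hypothetical.

def rate(numerator: int, denominator: int) -> int:
    """Return numerator/denominator as a percentage rounded to an integer."""
    return round(100 * numerator / denominator)

MALIGNANT = 66   # denominator for sensitivity (from the abstract)
BENIGN = 177     # denominator for specificity (from the abstract)

# Correctly classified malignant nodules per reader (from the abstract).
true_positives = {"MTI-RADS": 55, "experienced": 61, "junior": 46}
# Correctly classified benign nodules per reader (from the abstract).
true_negatives = {"MTI-RADS": 154, "experienced": 141, "junior": 133}

sensitivity_pct = {k: rate(v, MALIGNANT) for k, v in true_positives.items()}
specificity_pct = {k: rate(v, BENIGN) for k, v in true_negatives.items()}

for reader in true_positives:
    print(f"{reader}: sensitivity {sensitivity_pct[reader]}%, "
          f"specificity {specificity_pct[reader]}%")
```

The two denominators sum to 243, which matches the size of the test data set stated in Materials and Methods.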

  • Affiliations
    Sun Yat-sen University; Southern Medical University