Applying machine-learning models to differentiate benign and malignant thyroid nodules classified as C-TIRADS 4 based on 2D-ultrasound combined with five contrast-enhanced ultrasound key frames

Frontiers
Frontiers in Endocrinology
Objectives: To apply machine learning to extract radiomics features from thyroid two-dimensional ultrasound (2D-US) combined with contrast-enhanced ultrasound (CEUS) images to classify and predict benign and malignant thyroid nodules classified according to the Chinese version of the thyroid imaging reporting and data system (C-TIRADS) as category 4.
Materials and methods: This retrospective study included 313 pathologically diagnosed thyroid nodules (203 malignant and 110 benign). Two 2D-US images and five CEUS key frames (“2nd second after the arrival time” frame, “time to peak” frame, “2nd second after peak” frame, “first-flash” frame, and “second-flash” frame) were selected, and the region of interest was manually labeled using the “Labelme” tool. The 7 images of each nodule and their annotations were imported into the Darwin Research Platform for radiomics analysis. The datasets were randomly split into training and test cohorts in a 9:1 ratio. Six classifiers, namely, support vector machine, logistic regression, decision tree, random forest (RF), gradient boosting decision tree, and extreme gradient boosting, were used to construct and test the models. Performance was evaluated using receiver operating characteristic curve analysis. The area under the curve (AUC), sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), accuracy (ACC), and F1-score were calculated. One junior radiologist and one senior radiologist reviewed the 2D-US images and CEUS videos of each nodule and made a diagnosis. We then compared their AUC and ACC with those of our best model.
Results: The AUCs for diagnosis by US, CEUS, and US combined with CEUS were 0.755, 0.750, and 0.784 for the junior radiologist and 0.800, 0.873, and 0.890 for the senior radiologist, respectively. The RF classifier performed better than the other five, with an AUC of 1 for the training cohort and 0.94 (95% confidence interval 0.88–1) for the test cohort. The sensitivity, specificity, accuracy, PPV, NPV, and F1-score of the RF model in the test cohort were 0.82, 0.93, 0.90, 0.85, 0.92, and 0.84, respectively. The RF model with 2D-US combined with CEUS key frames achieved performance equivalent to that of the senior radiologist (AUC: 0.94 vs. 0.92, P = 0.798; ACC: 0.90 vs. 0.92) and outperformed the junior radiologist (AUC: 0.94 vs. 0.80, P = 0.039; ACC: 0.90 vs. 0.81) in the test cohort.
Conclusions: Our model, based on radiomics features from 2D-US and CEUS key frames, had good diagnostic efficacy for thyroid nodules classified as C-TIRADS 4. It shows promising potential in assisting less experienced junior radiologists.
Jia-hui Chen, Yu-Qing Zhang, Tian-tong Zhu, Qian Zhang,
Ao-xue Zhao and Ying Huang*
Department of Ultrasound, Shengjing Hospital of China Medical University, Shenyang, China
OPEN ACCESS
EDITED BY
Horatiu Silaghi, University of Medicine and Pharmacy Iuliu Hatieganu, Romania
REVIEWED BY
Jeehee Yoon, Chonnam National University Bitgoeul Hospital, Republic of Korea
Aixia Sun, Michigan State University, United States
*CORRESPONDENCE
Ying Huang, huangying712@163.com
RECEIVED 22 September 2023
ACCEPTED 21 March 2024
PUBLISHED 03 April 2024
TYPE Original Research
CITATION
Chen J-h, Zhang Y-Q, Zhu T-t, Zhang Q, Zhao A-x and Huang Y (2024) Applying machine-learning models to differentiate benign and malignant thyroid nodules classified as C-TIRADS 4 based on 2D-ultrasound combined with five contrast-enhanced ultrasound key frames. Front. Endocrinol. 15:1299686. doi: 10.3389/fendo.2024.1299686
COPYRIGHT
© 2024 Chen, Zhang, Zhu, Zhang, Zhao and Huang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
KEYWORDS
thyroid nodules, ultrasound, contrast-enhanced ultrasound, machine learning,
radiomics features, key frames, radiologists
1 Introduction
Thyroid nodules are a common clinical condition. In recent decades, the use of high-resolution ultrasound has rapidly increased worldwide (1, 2). The detection rate of thyroid nodules can reach 67%; however, only 5–15% of them are malignant (3, 4). In clinical practice, many patients suffer complications after surgical thyroidectomy (5, 6). Moreover, overdiagnosis and overtreatment add unnecessary burdens to patients. In 2020, Chinese experts developed the Chinese version of the thyroid imaging reporting and data system (C-TIRADS) to evaluate the characteristics of thyroid nodules, providing a more practical and concise tool for daily clinical practice (7). Most nodules classified as C-TIRADS 3 or 5 can be distinguished quickly and accurately using two-dimensional ultrasound (2D-US) alone; however, malignancy rates among thyroid nodules classified as C-TIRADS 4 vary widely (2–90%). Moreover, some hypoechoic Hashimoto nodules with blurred margins can be classified as C-TIRADS 4 (8), and mummified nodules with internal necrotic components may also exhibit marked hypoechogenicity (9). Distinguishing these from malignant nodules is challenging, which leads to the low specificity of 2D-US and warrants fine needle aspiration (FNA), an invasive procedure (2). Thus, new methods are needed for a more precise diagnosis of thyroid nodules classified as C-TIRADS 4.
Contrast-enhanced ultrasound (CEUS), which describes focal microcirculation perfusion status by distinguishing acoustic features of tissue backgrounds, plays an essential role in the diagnosis of thyroid nodules and in the differentiation of necrotic benign nodules from malignant ones to avoid FNA procedures (10). Additionally, CEUS is used in interventional ultrasonography, including assisting biopsy and FNA procedures and assessing therapeutic conditions after ablation (11, 12). Although CEUS is not recommended in the guidelines for diagnosing thyroid nodules, numerous studies have demonstrated that its sensitivity and specificity for discriminating malignant from benign nodules can reach 0.87 and 0.83, respectively (13, 14). The consensus on the qualitative and quantitative analysis of CEUS recommends that malignant characteristics include later wash-in, heterogeneous hypoenhancement, earlier wash-out, and centripetal perfusion (15–17).
Machine learning (ML) refers to algorithms based on representation learning from data; in addition to computer vision, natural language processing, and speech recognition, it has played a prominent role in the medical field (18–21). ML can significantly limit interobserver variations (22). With the rapid development of artificial intelligence (AI), radiomics has recently attracted the attention of researchers. Radiomics transforms pixels in medical images into high-dimensional, quantitative features that can be computed and that can reveal intratumor heterogeneity and texture (23, 24). ML algorithms can be used to develop predictive models and evaluate their performance. In the field of thyroid nodules, ML is mostly based on 2D-US images, with an accuracy (ACC) of approximately 0.88–0.92 (25, 26). To our knowledge, only two studies have used CEUS images to build AI models for diagnosing thyroid nodules (27, 28). Wan et al. used deep learning (DL) to build a diagnostic model based on dynamic CEUS video and obtained relatively high performance (27). Guo et al. used logistic regression to build ML models based on US and CEUS features but included only a single frame of CEUS images (28). Our study aimed to explore the useful information in CEUS images for diagnosing C-TIRADS 4 thyroid nodules. Herein, we combined 2D-US with five CEUS key frames as input for radiomics feature extraction and ML model development, aiming to examine the value of an ML model based on 2D-US and CEUS key frames in the differential diagnosis of benign and malignant nodules classified as C-TIRADS 4.
2 Materials and methods
2.1 Patients
This retrospective study was conducted between September 2019 and February 2023. Data from 313 thyroid nodules in 300 patients who underwent FNA or thyroid surgery at our hospital were included in this study. The inclusion criteria were: (1) patients aged ≥18 years; (2) nodules classified as C-TIRADS 4 (with at least one malignant sign); (3) suspicious malignant nodules that needed CEUS examination to exclude mummified nodules before FNA, and cystic-solid nodules that were classified as C-TIRADS 3 but were predominantly solid with an eccentric distribution; (4) CEUS examination procedures that contained a “double-flash” at 40 s and 60 s, respectively; and (5) patients who signed an informed consent form and obtained pathological results of thyroid nodules after the CEUS examination. The exclusion criteria were: (1) allergy to any of the components in the ultrasound contrast agent; (2) nodules with macrocalcification during B-mode ultrasound examination; (3) FNA pathological results incomplete or categorized as Bethesda I, III, and IV; and (4) CEUS videos with severe motion. The patients were split into training and test cohorts at a ratio of 9:1. Our study was approved by the Medical Ethics Committee of Shengjing Hospital of China Medical University (2023PS967K).
The PASS 15 software (NCSS LLC, Kaysville, UT, USA) was used to calculate the sample size, with the power set to 0.90 and a two-sided α level of 0.05. Based on our expected results, the area under the receiver operating characteristic (ROC) curve was set to 0.90. The false-positive rate was limited from 0 to 1. The group allocation was set at 2. The number of nodules required in the training cohort was 144 in the malignant group and 72 in the benign group (total = 216), with an additional 10% for dropouts. Hence, the final result was 158 and 80 nodules in the malignant and benign groups, respectively (total = 238).
2.2 US, CEUS examinations and images selection
An L14-3U transducer (frequency: 3–9 MHz) from the Resona 9 device (Mindray, Shenzhen, China) and an L12-5 transducer (frequency: 5–12 MHz) from the iU22 device (Philips, Amsterdam, The Netherlands) were used. 2D-US was performed by two radiologists, one with 3 years and the other with >10 years of experience in thyroid ultrasound. We measured the thyroid size, nodule number, nodule size, nodule location, composition, echogenicity, shape, margin, and the presence or absence of a Hashimoto's background and microcalcification, and recorded the findings following the C-TIRADS guidelines. In patients with multiple nodules, the ones most suspicious for malignancy were selected for observation and subsequent CEUS examination. The C-TIRADS classification was recorded, and nodules with inconsistent C-TIRADS results were re-evaluated and a final classification was decided. Subsequently, CEUS was performed by an experienced radiologist, who selected the largest section of the nodule, including the surrounding normal thyroid tissue. The mechanical index was set to 0.06–0.08, and the gain, depth, acoustic window, and focal zone were adjusted. The probe was stabilized, and the CEUS mode was initiated. For this procedure, 59 mg of contrast agent (SonoVue; Bracco, Milan, Italy) was mixed with 5 mL of saline to prepare a suspension. The suspension (1.5 mL) was injected rapidly through the superficial vein of the elbow, followed by a 5 mL saline flush. The timer was started at the time of injection. The term “flash” means that the microbubbles are burst; the remaining microbubbles then reperfuse after the “flash” without the influence of the bolus, which makes the reperfusion status easier to observe. The radiologist pressed the contrast agent click-button at the 40th and 60th seconds, defined as the “first-flash” and “second-flash,” respectively. The entire dynamic recording lasted 80 seconds and was saved in “AVI” format. Two experienced radiologists made a diagnosis immediately. CEUS observation parameters, including wash-in pattern (earlier, synchronous, and later), enhanced intensity (hypo-, iso-, and hyperintensity), enhanced homogeneity (homo- and heterogeneous), enhanced method (centripetal and centrifugal), and wash-out pattern (earlier, synchronous, and later), were recorded. The nodules with inconsistent results were examined and discussed. According to previous studies (17, 29–32), “later wash-in,” “heterogeneous hypointensity,” “centripetal enhancement,” and “earlier wash-out” are malignant parameters for thyroid nodules. In our study, we defined nodules with at least two of these parameters as malignant; the others were defined as benign.
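The CEUS decision rule described above (at least two of the four malignant parameters) can be sketched as a small function. This is a minimal illustration only: the function name, the dictionary keys, and the `threshold` argument are hypothetical and do not come from the study's software.

```python
# Hypothetical key names for the four malignant CEUS parameters listed in the text.
MALIGNANT_SIGNS = (
    "later_wash_in",
    "heterogeneous_hypointensity",
    "centripetal_enhancement",
    "earlier_wash_out",
)


def ceus_label(findings, threshold=2):
    """Label a nodule from CEUS findings.

    findings: dict mapping a sign name to True/False; missing signs count as absent.
    A nodule is called malignant when at least `threshold` signs are present.
    """
    count = sum(bool(findings.get(sign, False)) for sign in MALIGNANT_SIGNS)
    return "malignant" if count >= threshold else "benign"


example = {"later_wash_in": True, "earlier_wash_out": True}
label = ceus_label(example)  # two signs present
```

A nodule with only one of the four signs would be labeled benign under this rule.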
Furthermore, the nodule's largest transverse and longitudinal sections were selected in 2D-US after rotating the probe 90° clockwise. Regarding CEUS, the perfusion of the contrast agent changes gradually, with corresponding changes in brightness during the examination, which can reveal the blood supply of the nodule. Many previous studies have also suggested that the wash-in and wash-out patterns of contrast agents and the enhanced intensity in the nodule area compared with the surrounding normal thyroid tissue are the most helpful parameters for diagnosing malignant nodules (11, 31, 33, 34). The “double-flash” was identified as a new CEUS quantitative parameter in our previous study, which indicated that its diagnostic accuracy in distinguishing malignant and benign thyroid nodules could reach 88.4% (24). Therefore, based on these principles and results, five CEUS key frames were finally selected: the “2nd second after the arrival time” frame, “time to peak” frame, “2nd second after peak” frame, “first-flash” frame, and “second-flash” frame.
2.3 Nodule segmentation
The 80-second CEUS video of each patient was converted to 1120 images (14 images every second) using Python code. One radiologist (with 3 years of CEUS experience) browsed the images and identified the five key CEUS frames. The radiologist manually delineated the boundary of the region of interest (ROI) on seven images (two from 2D-US and five from CEUS key frames) using “Labelme” in an Anaconda (http://anaconda.org) environment. A second radiologist (with 8 years of CEUS experience) checked the segmentations. Any inconsistencies were jointly discussed, and further modifications were made until a consensus was reached. Finally, the patient images and labels were imported into the Darwin Research Platform (https://arxiv.org/abs/2009.00908) for feature extraction and model establishment. The workflow scheme is illustrated in Figure 1. The nodule segmentation process is described in the Supplementary Materials.
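As a rough illustration of how timestamps relate to the 1120 extracted images, the sketch below maps a time in seconds to a frame index at 14 images per second. In the study the radiologist browsed the images to find the key frames; the arrival and peak times passed in here are hypothetical inputs, and the function names are assumptions made for this example.

```python
FPS = 14          # images per second, as stated in the text
DURATION_S = 80   # length of the CEUS recording in seconds


def frame_index(t_seconds, fps=FPS):
    """0-based index of the image closest to time t (seconds after injection)."""
    return int(round(t_seconds * fps))


def key_frame_indices(arrival_s, peak_s, first_flash_s=40, second_flash_s=60, fps=FPS):
    """Map the five key-frame definitions to image indices.

    arrival_s and peak_s are nodule-specific and assumed to be known already.
    """
    times = {
        "2nd second after the arrival time": arrival_s + 2,
        "time to peak": peak_s,
        "2nd second after peak": peak_s + 2,
        "first-flash": first_flash_s,
        "second-flash": second_flash_s,
    }
    last = DURATION_S * fps - 1  # clip to the final available image
    return {name: min(frame_index(t, fps), last) for name, t in times.items()}


# Hypothetical nodule: contrast arrives at 10 s and peaks at 20 s.
indices = key_frame_indices(arrival_s=10, peak_s=20)
```

With these assumed times, the two flash frames fall at indices 560 and 840 of the 1120 images.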
2.4 Feature extraction and selection
After nodule segmentation, feature extraction was performed using the “PyRadiomics” package for Python (Python Software Foundation, Beaverton, OR, USA). Radiomics features include first-order, shape, and texture features. First-order features describe the distribution of voxel intensities using simple metrics, such as mean, range, variance, and kurtosis. Texture features describe the heterogeneity of the lesion and include the gray-level co-occurrence matrix (GLCM), gray-level run length matrix (GLRLM), gray-level dependence matrix (GLDM), neighboring gray-tone difference matrix (NGTDM), and gray-level size zone matrix (GLSZM). Eight kinds of filters were applied in our study to transform the original images: exponential, gradient, local binary pattern two-dimensional (LBP-2D), logarithm, square, square root, wavelet, and Laplacian of Gaussian (LoG). First-order, shape, and texture features were extracted from the derived images. Because a single image yielded 1125 features, the seven images from one patient produced 7875 features in total. We extracted all features and subsequently selected among them. Feature selection is an important ML procedure because it reduces computational complexity and allows classifiers to be trained more accurately. Maximum absolute normalization was used to scale the numerical values to within a range of −1 to 1. A variance threshold was used to remove all low-variance features. To reduce overfitting and find definitive correlation features, only features with F values equal to 0 were excluded from this study. The classifiers also contain algorithms that iteratively calculate the importance of the features. Finally, the decision tree (DT) classifier was used to determine the most relevant feature rankings (Figure 2).
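The first two selection steps, maximum absolute normalization followed by a variance threshold, can be sketched in plain Python as follows. This is a simplified illustration rather than the Darwin Research Platform's implementation; the function names, the toy feature values, and the threshold of 0 are assumptions.

```python
from statistics import pvariance

FEATURES_PER_IMAGE = 1125
IMAGES_PER_NODULE = 7
TOTAL_FEATURES = FEATURES_PER_IMAGE * IMAGES_PER_NODULE  # 7875, as stated in the text


def max_abs_scale(column):
    """Divide every value by the largest absolute value so results lie in [-1, 1]."""
    peak = max(abs(v) for v in column)
    return [v / peak for v in column] if peak else list(column)


def select_by_variance(features, threshold=0.0):
    """Keep features whose scaled population variance exceeds the threshold.

    features: dict mapping a feature name to its values across nodules.
    """
    kept = {}
    for name, column in features.items():
        scaled = max_abs_scale(column)
        if pvariance(scaled) > threshold:
            kept[name] = scaled
    return kept


# Toy data: feature "b" is constant across nodules and is therefore dropped.
demo = {"a": [2.0, -4.0, 1.0], "b": [3.0, 3.0, 3.0]}
kept = select_by_variance(demo)
```

Constant features carry no information for separating benign from malignant nodules, which is why they are removed before any classifier is trained.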
2.5 Model development
Six ML models, namely support vector machine (SVM), logistic regression (LR), DT, random forest (RF), gradient boosting decision tree (GBDT), and extreme gradient boosting (XGBOOST), were compared to determine the best diagnostic performance. The radial basis function kernel was used in the SVM classifier, and the penalty coefficient C was used to set the tolerance for misclassified samples (from 0.0001 to 1,000). LR was based on an elastic net, and the L1 ratio was set to 0.5. For RF, DT, GBDT, and XGBOOST, the maximum depth of the tree was set at 5 to avoid overfitting. Missing values were imputed with the mean value. Ten-fold cross-validation was used to assess the accuracy of the models. The ROC curve and area under the curve (AUC) were used to compare the performance of the six ML models, and the sensitivity, specificity, accuracy, F1-score, positive predictive value (PPV), and negative predictive value (NPV) were calculated.
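The evaluation metrics listed above all derive from the four cells of a confusion matrix, as the sketch below shows. The counts in the example are an assumption: the paper does not report a confusion matrix, and these values were chosen only because their ratios reproduce the RF test-cohort figures given in Table 3.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Compute standard diagnostic metrics from confusion-matrix counts.

    tp/fn: malignant cases called malignant/benign;
    tn/fp: benign cases called benign/malignant.
    Assumes no denominator is zero.
    """
    sen = tp / (tp + fn)                   # sensitivity (recall)
    spe = tn / (tn + fp)                   # specificity
    ppv = tp / (tp + fp)                   # positive predictive value (precision)
    npv = tn / (tn + fn)                   # negative predictive value
    acc = (tp + tn) / (tp + fn + tn + fp)  # accuracy
    f1 = 2 * ppv * sen / (ppv + sen)       # harmonic mean of PPV and sensitivity
    return {"SEN": sen, "SPE": spe, "PPV": ppv, "NPV": npv, "ACC": acc, "F1": f1}


# Illustrative counts only (not reported by the authors).
rf_like = {k: round(v, 3) for k, v in diagnostic_metrics(tp=23, fn=5, tn=56, fp=4).items()}
```

The F1-score weighs sensitivity and PPV equally, so it is informative when the two classes are imbalanced, as they are in this cohort.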
2.6 Statistical analysis
Statistical analysis was performed using SPSS software (version 26.0; IBM Corp., Armonk, NY, USA). Count data were recorded as frequencies and rates. Measurement data that conformed to a normal distribution were recorded as mean ± standard deviation, while data that were not normally distributed were recorded as the median (interquartile range). Measurement data between groups were compared using the independent t-test or the Mann–Whitney U test. Count data (clinical data, 2D-US and CEUS data) were analyzed using chi-square or Fisher's exact tests. Radiomics analyses were performed using Python (version 3.6). DeLong's test was used to test for differences in AUC among the six ML models and between the ML model and human readers. A calibration curve was used to assess the consistency between the prediction model and the actual outcomes. Decision curve analysis (DCA) was used to determine whether the model had net clinical benefits. Statistical significance was set at P<0.05.
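For readers less familiar with the AUC, it equals the probability that a randomly chosen malignant nodule receives a higher score than a randomly chosen benign one, which is the Mann–Whitney formulation that DeLong's test builds on. The sketch below computes only the AUC; it does not implement DeLong's variance estimate or the test itself, and the function name and toy data are assumptions.

```python
def auc_from_scores(labels, scores):
    """AUC as the fraction of (malignant, benign) pairs ranked correctly.

    labels: 1 for malignant, 0 for benign; scores: predicted malignancy scores.
    Ties between a malignant and a benign score count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example: three of the four malignant/benign pairs are ordered correctly.
toy_auc = auc_from_scores([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

This pairwise loop is quadratic in the number of cases, which is acceptable for a cohort of a few hundred nodules.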
3 Results
3.1 Clinical and sonographic data
A total of 313 nodules were enrolled in our study, with 282 in the training cohort and 31 in the test cohort. The training cohort included 100 benign and 182 malignant nodules, while the test cohort included 10 benign and 21 malignant nodules. In our data, 89 nodules were classified as C-TIRADS 4a, 128 as C-TIRADS 4b, and 96 as C-TIRADS 4c. The malignancy rates of C-TIRADS 4a, 4b, and 4c were 34.8% (31/89), 70.3% (90/128), and 85.4% (82/96), respectively. The characteristics of the nodules are listed in Table 1, and the patient inclusion flowchart is shown in Figure 3. In the training cohort, the malignant and benign groups differed significantly in age, number, size, solid composition, microcalcification, shape, margin, enhanced intensity, homogeneity, and wash-in patterns (all P<0.05). However, no significant difference was found in sex, location, Hashimoto's background, echogenicity, centripetal enhancement, or wash-out patterns (all P>0.05). There was no statistically significant difference in the distribution of patient characteristics between the training and test cohorts (P>0.05).
FIGURE 1
Workflow of image acquisition.
FIGURE 2
Feature selection.
FIGURE 3
Retrospective workflow. CEUS, contrast-enhanced ultrasound.
TABLE 1 Clinical and sonographic characteristics.
| Characteristics | Training cohort, total (n = 282) | Training cohort, benign (n = 100) | Training cohort, malignant (n = 182) | P (benign vs. malignant) | Test cohort (n = 31) | P (training vs. test) |
|---|---|---|---|---|---|---|
| Age (years) | 44.57 ± 12.31 | 48.12 ± 12.5 | 42.62 ± 11.79 | 0.000* | 45.52 ± 11.47 | 0.683 |
| Sex: Female | 224 (79.4%) | 82 (82.0%) | 142 (78.0%) | 0.429 | 23 (74.2%) | 0.497 |
| Sex: Male | 58 (20.6%) | 18 (18.0%) | 40 (22.0%) | | 8 (25.8%) | |
| Number: Single | 99 (35.1%) | 19 (19.0%) | 80 (44.0%) | 0.000* | 12 (38.7%) | 0.691 |
| Number: Multiple | 183 (64.9%) | 81 (81.0%) | 102 (56.0%) | | 19 (61.3%) | |
| Size (mm): Maximum diameter | 10.67 ± 9.07 | 15.1 ± 12.14 | 8.24 ± 5.52 | 0.000* | 10.2 ± 7.83 | 0.779 |
| Location: Upper pole | 64 (22.7%) | 23 (23.0%) | 41 (22.5%) | 0.133 | 10 (32.3%) | 0.595 |
| Location: Middle | 112 (39.7%) | 37 (37.0%) | 75 (41.3%) | | 10 (32.2%) | |
| Location: Subthyroid pole | 77 (27.3%) | 34 (34.0%) | 43 (23.6%) | | 7 (22.6%) | |
| Location: Isthmus | 29 (10.3%) | 6 (6.0%) | 23 (12.6%) | | 4 (12.9%) | |
| Hashimoto background: Yes | 46 (16.3%) | 17 (17.0%) | 29 (15.9%) | 0.917 | 7 (22.6%) | 0.377 |
| Hashimoto background: No | 236 (83.7%) | 83 (83.0%) | 153 (84.1%) | | 24 (77.4%) | |
| Solid composition: Yes | 278 (98.6%) | 96 (96.0%) | 182 (100%) | 0.007* | 31 (100%) | 1.000 |
| Solid composition: No | 4 (1.4%) | 4 (4.0%) | 0 | | 0 | |
| Very low echogenicity: Yes | 16 (5.7%) | 3 (3.0%) | 13 (7.1%) | 0.150 | 2 (6.5%) | 0.860 |
| Very low echogenicity: No | 266 (94.3%) | 97 (97.0%) | 169 (92.9%) | | 29 (93.5%) | |
| Microcalcification: Yes | 89 (31.6%) | 24 (24.0%) | 65 (35.7%) | 0.043* | 5 (16.1%) | 0.075 |
| Microcalcification: No | 193 (68.4%) | 76 (76.0%) | 117 (64.3%) | | 26 (83.9%) | |
| Shape (aspect ratio): >1 | 81 (28.7%) | 12 (12.0%) | 69 (37.9%) | 0.000* | 9 (29.0%) | 0.971 |
| Shape (aspect ratio): <1 | 201 (71.3%) | 88 (88.0%) | 113 (62.1%) | | 22 (71.0%) | |
| Margin: Regular | 144 (51.1%) | 72 (72.0%) | 72 (39.6%) | 0.000* | 19 (61.3%) | 0.279 |
| Margin: Irregular | 138 (48.9%) | 28 (28.0%) | 110 (60.4%) | | 12 (38.7%) | |
| Enhanced intensity: Hyperenhancement | 54 (19.2%) | 36 (36.0%) | 18 (9.9%) | 0.000* | 10 (32.3%) | 0.148 |
| Enhanced intensity: Iso-enhancement | 149 (52.8%) | 32 (32.0%) | 47 (25.8%) | | 16 (51.6%) | |
| Enhanced intensity: Hypoenhancement | 79 (28.0%) | 32 (32.0%) | 117 (64.3%) | | 5 (16.1%) | |
| Homogeneity: Homogeneous | 108 (38.3%) | 56 (56.0%) | 52 (28.6%) | 0.000* | 16 (51.6%) | 0.150 |
| Homogeneity: Heterogeneous | 174 (61.7%) | 44 (44.0%) | 130 (71.4%) | | 15 (48.4%) | |
| Centripetal enhancement: Yes | 28 (9.9%) | 9 (9.0%) | 19 (10.4%) | 0.699 | 3 (10%) | 1.000 |
| Centripetal enhancement: No | 254 (90.1%) | 91 (91.0%) | 163 (89.6%) | | 28 (90%) | |
| Wash-in: Synchronous | 133 (47.2%) | 51 (51.0%) | 82 (45.1%) | 0.001* | 20 (64.5%) | 0.180 |
| Wash-in: Later | 116 (41.1%) | 29 (29.0%) | 87 (47.8%) | | 9 (29.0%) | |
| Wash-in: Earlier | 33 (11.7%) | 20 (20.0%) | 13 (7.1%) | | 2 (6.5%) | |
| Wash-out: Synchronous | 180 (63.8%) | 64 (64.0%) | 116 (63.7%) | 0.337 | 22 (71.0%) | 0.731 |
| Wash-out: Later | 44 (15.6%) | 12 (12.0%) | 32 (17.6%) | | 4 (12.9%) | |
| Wash-out: Earlier | 58 (20.6%) | 24 (24.0%) | 34 (18.7%) | | 5 (16.1%) | |
*Represents P<0.05. Numerical data are presented as mean ± standard deviation. Categorical data are presented as numbers (%).
3.2 The US and CEUS analysis by human reader
Each nodule was evaluated simultaneously by a junior radiologist (3 years of CEUS experience) and a senior radiologist (8 years of CEUS experience). The parallel method was used for the combined diagnosis of C-TIRADS and CEUS: if both C-TIRADS and CEUS were benign, the final diagnosis was recorded as benign, whereas if either C-TIRADS or CEUS was malignant, the final diagnosis was recorded as malignant. As shown in Table 2 and Figure 4, the AUCs of the junior radiologist for C-TIRADS classification on US, CEUS videos, and the combined diagnosis of the two methods were 0.755, 0.750, and 0.784, respectively. Apart from the specificity and the PPV, the sensitivity, NPV, and accuracy of combining US and CEUS by the junior radiologist (0.941, 0.852, and 0.831, respectively) were higher than those of US or CEUS alone. The AUCs of the senior radiologist for C-TIRADS classification on US, CEUS video, and the combined diagnosis of the two methods were 0.800, 0.873, and 0.890, respectively. Apart from the specificity and the PPV, the sensitivity, NPV, and accuracy of combining US and CEUS by the senior radiologist (0.970, 0.937, and 0.914, respectively) were higher than those of US or CEUS alone.
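The parallel method described in this section amounts to a logical OR of the two individual calls, which can be sketched as below. The function name and boolean inputs are hypothetical and serve only to illustrate the rule.

```python
def parallel_diagnosis(ctirads_malignant, ceus_malignant):
    """Combine two readings: malignant if either reading is malignant."""
    return "malignant" if (ctirads_malignant or ceus_malignant) else "benign"


# All four combinations of the two readings.
combos = {
    (c, e): parallel_diagnosis(c, e)
    for c in (False, True)
    for e in (False, True)
}
```

Because an OR rule can only add malignant calls, the combined sensitivity cannot fall below that of either method alone and the combined specificity cannot exceed it, which agrees with the pattern reported for both readers in Table 2.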
3.3 Prediction performance of ML models based on 2D-US combined with CEUS key frames
The six classifiers (SVM, LR, DT, RF, GBDT, and XGBOOST) and their performance are listed in Table 3. The AUCs for SVM, LR, DT, RF, GBDT, and XGBOOST in the training cohort were 0.75, 0.87, 1.00, 1.00, 1.00, and 0.92, respectively. In the test cohort, the AUCs of SVM, LR, DT, RF, GBDT, and XGBOOST were 0.74, 0.81, 0.84, 0.94, 0.92, and 0.92, respectively. The ROC curves of the six ML models are shown in Figure 5. DeLong's test showed that, in the test cohort, the differences in AUC among SVM, LR, and DT were not statistically significant (P>0.05). Similarly, the differences in AUC among RF, XGBOOST, and GBDT were not statistically significant (P>0.05); RF, GBDT, and XGBOOST had comparable predictive effectiveness. The differences in AUC among GBDT, LR, and DT were not statistically significant (P>0.05); however, the AUCs of RF and XGBOOST differed significantly from those of SVM, LR, and DT (all P<0.05). Notably, the AUC of RF was the highest in the test cohort (0.94). Additionally, the calibration and DCA curves of RF showed favorable consistency with reality (Figure 6). Representative cases from the test cohort are presented in Figures 7 and 8.
3.4 Comparison with human readers
A senior radiologist (8 years of CEUS experience) and a junior radiologist (3 years of CEUS experience) independently reviewed the transverse and longitudinal 2D-US sections and the CEUS videos of each nodule in the test cohort. Both readers were blinded to clinical characteristics and pathological results, and each provided a definitive diagnosis of whether each nodule was benign or malignant. The diagnostic performances of the best-performing RF model and human readers are summarized in Table 4 and Figure 9. As shown, the RF model achieved performance equivalent to that of the senior radiologist (P = 0.799) and had higher specificity. The RF model outperformed the junior radiologist (P = 0.039) and showed greater sensitivity, specificity, and NPV.
TABLE 2 The US and CEUS analysis by human readers.
| Models | SEN | SPE | PPV | NPV | Accuracy | AUC |
|---|---|---|---|---|---|---|
| Junior radiologist, C-TIRADS | 0.720 (0.651, 0.779) | 0.791 (0.701, 0.860) | 0.864 (0.801, 0.910) | 0.604 (0.519, 0.683) | 0.744 | 0.755 (0.698, 0.812) |
| Junior radiologist, CEUS | 0.764 (0.698, 0.819) | 0.736 (0.642, 0.814) | 0.842 (0.780, 0.890) | 0.628 (0.538, 0.710) | 0.754 | 0.750 (0.692, 0.808) |
| Junior radiologist, C-TIRADS+CEUS | 0.941 (0.897, 0.968) | 0.627 (0.529, 0.716) | 0.823 (0.767, 0.869) | 0.852 (0.752, 0.918) | 0.831 | 0.784 (0.698, 0.812) |
| Senior radiologist, C-TIRADS | 0.768 (0.703, 0.823) | 0.836 (0.751, 0.898) | 0.897 (0.839, 0.936) | 0.662 (0.576, 0.739) | 0.792 | 0.800 (0.747, 0.853) |
| Senior radiologist, CEUS | 0.882 (0.827, 0.921) | 0.864 (0.782, 0.920) | 0.923 (0.873, 0.955) | 0.800 (0.713, 0.864) | 0.875 | 0.873 (0.828, 0.918) |
| Senior radiologist, C-TIRADS+CEUS | 0.970 (0.934, 0.988) | 0.809 (0.721, 0.875) | 0.904 (0.855, 0.938) | 0.937 (0.862, 0.974) | 0.914 | 0.890 (0.844, 0.936) |
C-TIRADS, Chinese version of thyroid imaging reporting and data system; CEUS, contrast-enhanced ultrasound; SEN, sensitivity; SPE, specificity; PPV, positive predictive value; NPV, negative predictive value; AUC, area under the receiver operating characteristic curve.
FIGURE 4
ROC curves of TIRADS, CEUS and TIRADS combined with CEUS of
junior radiologist and senior radiologist, respectively.
TABLE 3 Predictive performance of six machine learning models based on 2D-US and CEUS key frames.
Model | Cohort | AUC | ACC | SEN | SPE | PPV | NPV | F1-Score
SVM | Training | 0.746 (0.707–0.786) | 0.741 | 0.671 (0.61–0.726) | 0.773 (0.736–0.807) | 0.581 (0.523–0.636) | 0.834 (0.798–0.864) | 0.741
SVM | Test | 0.735 (0.615–0.854) | 0.75 | 0.643 (0.458–0.793) | 0.8 (0.682–0.882) | 0.6 (0.423–0.754) | 0.828 (0.711–0.904) | 0.621
LR | Training | 0.867 (0.839–0.895) | 0.791 | 0.813 (0.761–0.857) | 0.781 (0.744–0.814) | 0.635 (0.581–0.685) | 0.899 (0.869–0.923) | 0.713
LR | Test | 0.808 (0.709–0.907) | 0.807 | 0.679 (0.493–0.821) | 0.867 (0.758–0.931) | 0.704 (0.515–0.841) | 0.852 (0.743–0.92) | 0.691
DT | Training | 1 | 1 | 1 (0.985–1) | 1 (0.993–1) | 1 (0.985–1) | 1 (0.993–1) | 1
DT | Test | 0.843 (0.757–0.929) | 0.864 | 0.786 (0.605–0.898) | 0.9 (0.799–0.953) | 0.786 (0.605–0.898) | 0.9 (0.799–0.953) | 0.786
RF | Training | 1 | 1 | 1 (0.985–1) | 1 (0.993–1) | 1 (0.985–1) | 1 (0.993–1) | 1
RF | Test | 0.936 (0.884–0.988) | 0.898 | 0.821 (0.644–0.921) | 0.933 (0.841–0.974) | 0.852 (0.675–0.941) | 0.918 (0.822–0.964) | 0.836
GBDT | Training | 0.999 (0.998–1) | 0.99 | 0.988 (0.966–0.996) | 0.991 (0.978–0.996) | 0.98 (0.955–0.992) | 0.994 (0.984–0.998) | 0.984
GBDT | Test | 0.916 (0.854–0.978) | 0.864 | 0.857 (0.685–0.943) | 0.867 (0.758–0.931) | 0.75 (0.579–0.867) | 0.929 (0.83–0.972) | 0.8
XGBOOST | Training | 1 | 1 | 1 (0.985–1) | 1 (0.993–1) | 1 (0.985–1) | 1 (0.993–1) | 1
XGBOOST | Test | 0.923 (0.864–0.984) | 0.841 | 0.893 (0.728–0.963) | 0.817 (0.701–0.894) | 0.694 (0.531–0.82) | 0.942 (0.844–0.98) | 0.781
SVM, support vector machine; LR, logistic regression; DT, decision tree; RF, random forest; GBDT, gradient boosting decision tree; XGBOOST, extreme gradient boosting; AUC, area under curve; ACC, accuracy; SEN, sensitivity; SPE, specificity; PPV, positive predictive value; NPV, negative predictive value.
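As an illustration of how the metrics in Table 3 relate to one another, the following minimal sketch derives them from a 2×2 confusion matrix. The counts used (23 true positives, 5 false negatives, 56 true negatives, 4 false positives) are not reported in the article; they are back-calculated assumptions that happen to reproduce the RF test-cohort proportions. The Wilson score interval is likewise an assumption about how the confidence intervals may have been obtained, and the function names are hypothetical.

```python
from math import sqrt


def wilson_ci(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (assumed CI method)."""
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return centre - half, centre + half


def diagnostic_metrics(tp, fn, tn, fp):
    """Point estimates of the diagnostic metrics from confusion-matrix counts."""
    return {
        "SEN": tp / (tp + fn),                    # sensitivity
        "SPE": tn / (tn + fp),                    # specificity
        "PPV": tp / (tp + fp),                    # positive predictive value
        "NPV": tn / (tn + fn),                    # negative predictive value
        "ACC": (tp + tn) / (tp + fn + tn + fp),   # accuracy
        "F1": 2 * tp / (2 * tp + fp + fn),        # F1-score
    }


# Hypothetical counts, back-calculated from the RF test-cohort row of Table 3.
metrics = diagnostic_metrics(tp=23, fn=5, tn=56, fp=4)
sen_low, sen_high = wilson_ci(23, 28)
print({name: round(value, 3) for name, value in metrics.items()})
print(round(sen_low, 3), round(sen_high, 3))
```

With these assumed counts, the sensitivity and its interval round to 0.821 (0.644–0.921), in line with the RF test-cohort row; the agreement suggests, but does not confirm, that Wilson intervals were used.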
4 Discussion
In this study, we constructed six ML models using 2D-US
images combined with five CEUS key frames. The ROC curves
showed that the diagnostic performance of our models was
good, with all AUC values >0.80 in the test cohort (except
SVM [0.74]). Moreover, we compared our best model with human
readers (senior and junior radiologists) and found that the best ML
model achieved performance equivalent to that of the senior
radiologist and outperformed the junior radiologist.
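For readers less familiar with the AUC values quoted here, the sketch below shows its rank-based interpretation: the probability that a randomly chosen malignant nodule receives a higher model score than a randomly chosen benign one, with ties counted as one half. The labels and scores are hypothetical toy values, and the function name is an assumption; this is not the evaluation code used in the study.

```python
def auc_from_scores(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    # A correctly ordered pair counts 1, a tied pair counts 0.5.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical example: 1 = malignant, 0 = benign.
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.1]
print(auc_from_scores(toy_labels, toy_scores))
```

In this toy example three of the four malignant-benign pairs are ordered correctly, so the AUC is 0.75.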
Traditional diagnostic methods for thyroid nodules, such as 2D-US,
color Doppler flow imaging (CDFI), elastography, and FNA,
have many disadvantages (35–37); the main ones are severe
overdiagnosis and overtreatment (38,39). For example, some
Hashimoto's nodules may show hypoechogenicity with blurred
margins on 2D-US, which may be classified as TIRADS >4 and
require unnecessary FNA according to the guidelines (7,40,41).
CEUS, as a novel noninvasive microangiography technology, can
reveal microvasculature with a smaller diameter (>40 µm) than that
by CDFI (>100 µm) and is helpful in the detection of malignant
thyroid nodules (42,43). Recent studies have indicated that CEUS
could modify the current TIRADS to create a new risk stratification
that may reduce unnecessary biopsies (42–46). Our team has
published a CEUS-TIRADS model to differentiate thyroid
nodules (C-TIRADS 4) by combining CEUS with C-TIRADS
(44), which showed high clinical practicability. Additionally,
CEUS images may contain valuable information that has not
received sufficient attention in daily clinical practice. In recent
years, AI, especially radiomics, has demonstrated
promising potential for evaluating the characteristics of thyroid
nodules (47,48). Radiomics has also been used to diagnose
cytologically uncertain nodules (49–51), lymph node metastases
(52,53), and extrathyroidal extension (54). Most studies employing
AI for evaluating the thyroid are based on 2D-US images
(48,55,56). In 2015, LeCun et al. reviewed the principles of deep
learning and convolutional neural networks (CNNs) (18), attracting
the interest of many researchers. In a typical deep learning
workflow, CNNs are trained using a large number of 2D-US
images with known corresponding pathological results. A specific
algorithm is used to segment the US images. After many training
iterations, the CNNs can capture and analyze thyroid nodules and
suggest a risk stratification. ML studies based on 2D-US have
reported diagnostic accuracies of approximately 90% in
distinguishing malignant from benign thyroid nodules. Peng et al.
developed a deep learning AI model based on 2D-US to diagnose
thyroid nodules that outperformed 12 radiologists (AUC: 0.922 vs.
0.839, P<0.05) (48). Conversely, a study conducted by Sun et al.,
also based on 2D-US, indicated that the experts achieved better
performance (AUC: 0.881 vs. 0.819) (57). Gong et al. reported that
an AI-assisted diagnostic system combined with CEUS could
significantly improve the diagnostic sensitivity and NPV in
diagnosing thyroid nodules classified as American College of
Radiology Thyroid Imaging Reporting and Data System (ACR-TIRADS) 4 (58). However, to
our knowledge, few researchers have developed AI models based on
CEUS images. To date, only two studies have proposed AI
diagnostic models based on CEUS image information (27,28).
Wan et al. used DL to build a diagnostic model based on dynamic
CEUS video and obtained an AUC of 0.92 (27), which was lower than
ours (AUC: 0.94); the ACC in their study was also lower than
that in ours (0.80 vs. 0.90). Guo et al. used logistic regression to
build ML models based on US and CEUS features, but only a single
CEUS frame was used for the CEUS features (28).
Our study extracted radiomics features from five key CEUS frames, and
its sample size was larger (313 vs. 123). Moreover, our study
focused on thyroid nodules classified as C-TIRADS 4,
which are relatively difficult to differentiate in clinical practice.
Therefore, to our knowledge, this was the first study to make fuller
use of radiomics information from CEUS images in the evaluation of
thyroid nodules (C-TIRADS 4), offering a promising, noninvasive,
fast, feasible, and reliable method.
FIGURE 5
ROC curves of the SVM, LR, DT, RF, GBDT, and XGBOOST classifiers
in the test cohort. ROC, receiver operating characteristic; SVM,
support vector machine; LR, logistic regression; DT, decision tree;
RF, random forest; GBDT, gradient boosting decision tree;
XGBOOST, extreme gradient boosting.
FIGURE 6
The calibration curves and decision curve analysis of RF models. RF, random forest.
FIGURE 7
A thyroid nodule in the left lobe in a 46-year-old woman in the test cohort. (A) 2D-US image; (B) the mask image corresponding to the 2D-US image; (C) CEUS
image at peak time; (D) the mask image of the CEUS image at peak time. The nodule was solid and hypoechoic, with a blurred margin, an aspect ratio less than 1,
and microcalcification, and was categorized as C-TIRADS 4c. CEUS showed later wash-in, heterogeneous enhancement, and later wash-out, and the nodule
was diagnosed as malignant. The RF model classified it as malignant. Histologic analysis revealed papillary thyroid microcarcinoma (PTMC).
FIGURE 8
A thyroid nodule in the right lobe in a 56-year-old man in the test cohort. (A) 2D-US image; (B) the mask image corresponding to the 2D-US image; (C) CEUS
image at peak time; (D) the mask image of the CEUS image at peak time. The nodule was solid with a blurred margin and was categorized as C-TIRADS
4b. CEUS showed later wash-in with hypointensity, and the nodule was diagnosed as malignant. The RF model classified it as benign. Histologic analysis
revealed nodular goiter with granuloma formation.
In our study, none of the patients experienced complications
during CEUS or FNA. By comparing the FNA and surgical
pathological results from January 2016 to June 2021 in our
hospital, we found that the success rate and diagnostic accuracy
of FNA were 96.6% and 93.3%, respectively (59). The accuracy of
FNA was higher than that in most previous studies,
indicating that the pathological results from FNA at our
institution were reliable. Our study also demonstrated that
malignant thyroid nodules more commonly occurred in younger people
(P<0.05). The differences between malignant and benign
nodules in the training cohort were also statistically significant for nodule
number, nodule size, nodule composition, the presence of
microcalcifications, shape, margin, enhanced intensity of CEUS,
homogeneity of CEUS, and wash-in patterns of CEUS (all P<0.05).
Regarding CEUS patterns, the malignant nodules in our data most commonly
showed hypoenhancement (117/182; 64.3%), heterogeneous
enhancement (130/182; 71.4%), and later wash-in (87/182;
47.8%), which is consistent with previous studies (12,33,34).
This may be attributed to the peripheral blood vessels of
malignant nodules being damaged by malignant growth,
hindering contrast agent entry. When the nodule is small, the
number of new blood vessels, branches, and arteriovenous fistulas
is relatively small, so the enhancement inside the nodule is closely
related to the poor blood supply and uneven distribution of blood
vessels within the malignant nodule. In the present study, the mean
maximal diameter of malignant nodules was smaller than that of
benign nodules (P<0.05), which may have made the direction of
contrast agent perfusion difficult to observe and may explain
the lack of statistical significance in the enhancement methods and
wash-out patterns. In our data, the diagnostic AUC and
accuracy of both the junior and senior radiologists using US
combined with CEUS were higher than those using US or CEUS alone.
In this study, we first extracted nearly all radiomics features
described in the current literature. Subsequently, we adopted
maximum absolute (max abs) normalization to preprocess the data. Many data
normalization methods are used in ML, such as Z-score
standardization, max abs normalization, min-max normalization,
robust scaling, and median absolute deviation scaling. The advantage of
max abs normalization lies in its ability to scale the data
without centering it, preserving the sparsity of large-scale data
such as ours. We then used a variance threshold to eliminate
features with little variance. DT is a nonparametric method; it does
not make any assumptions regarding the spatial distribution or
categorical structure of the data, making it suitable for our study.
The final feature selection was based on the DT classifier. Wavelet
features accounted for the largest proportion of the selected radiomics features
(6/18). High-dimensional wavelet features are texture features that
show lesion heterogeneity (60). Fan et al. used ML to predict the
aggressiveness of prostate cancer, and wavelet features accounted
for the largest proportion of their models (61). Meng et al. and Aerts
et al. reached similar conclusions (60,62). Additionally, CEUS
frames played a substantial role in feature selection (12/18),
illustrating the importance of CEUS images. Moreover, among
the selected features, the top-ranked one came from the time-to-peak
frame. This may be because the image is brightest at the peak time,
when the number of microbubbles in the nodule area is highest,
which probably provides more information.
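The two preprocessing steps described above, max abs normalization followed by a variance threshold, can be sketched in plain Python as follows. This is an illustrative sketch only and not the pipeline of the Darwin Research Platform used in the study; the function names, the toy feature matrix, and the threshold value of 0 are all assumptions.

```python
from statistics import pvariance


def max_abs_scale(rows):
    """Divide each feature column by its maximum absolute value.

    Values are scaled into [-1, 1] without shifting, so zeros stay zero
    and sparsity is preserved. All-zero columns are left unchanged.
    """
    n_features = len(rows[0])
    scales = [max(abs(r[j]) for r in rows) or 1.0 for j in range(n_features)]
    return [[r[j] / scales[j] for j in range(n_features)] for r in rows]


def variance_threshold(rows, threshold=0.0):
    """Return the indices of feature columns whose variance exceeds threshold."""
    n_features = len(rows[0])
    return [
        j for j in range(n_features)
        if pvariance([r[j] for r in rows]) > threshold
    ]


# Hypothetical feature matrix: 3 nodules x 3 radiomics features.
features = [
    [2.0, 0.0, 5.0],
    [-4.0, 0.0, 5.0],
    [1.0, 0.0, 5.0],
]
scaled = max_abs_scale(features)
kept = variance_threshold(scaled)
print(scaled)
print(kept)
```

In this toy matrix only the first feature varies across nodules, so it is the only one retained; the two constant columns carry no discriminative information and are dropped.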
TABLE 4 Diagnostic performance of the RF model compared to human readers in the test cohort.
Models | SEN | SPE | PPV | NPV | Accuracy | AUC | P
RF model | 0.821 (0.644–0.921) | 0.933 (0.841–0.974) | 0.852 (0.675–0.941) | 0.918 (0.822–0.964) | 0.898 | 0.936 (0.884–0.988) |
Senior radiologist | 0.965 (0.868–0.994) | 0.839 (0.655–0.939) | 0.917 (0.808–0.968) | 0.929 (0.750–0.988) | 0.920 | 0.923 (0.854–0.991) | 0.799
Junior radiologist | 0.817 (0.691–0.901) | 0.786 (0.585–0.910) | 0.891 (0.771–0.955) | 0.667 (0.481–0.814) | 0.807 | 0.801 (0.696–0.906) | 0.039*
RF, random forest; SEN, sensitivity; SPE, specificity; PPV, positive predictive value; NPV, negative predictive value; AUC, area under the curve.
FIGURE 9
ROC curves of the RF model, senior radiologist, and junior
radiologist in the test cohort. ROC, receiver operating characteristic;
RF, random forest.
Classifiers play a crucial role in ML procedures. Our study used six
classifiers for model development (SVM, LR, DT, RF, GBDT, and
XGBOOST). Support vector machine (SVM) is a generalized
linear classifier for binary classification under supervised
learning and is more suitable than logistic regression (LR) for
complex nonlinear problems. Compared with SVM, LR can be
used for multivariate classification and is more suitable for small data
volumes. Decision tree (DT) is a basic classification and regression
method, defined as a conditional probability distribution on the
feature space and class space. Both random forest (RF) and gradient
boosting decision tree (GBDT) are based on DT. RF is an extension of
a parallel ensemble learning method, and “random” refers to the
randomness of the selected partition attributes. GBDT is a decision
tree model trained with a gradient boosting strategy, which performs
well in screening features (63). XGBOOST is an implementation of GBDT
that, compared with basic GBDT, supports custom loss functions and
adds more regularization terms, handling of missing values, and column
sampling. Among the four models based on DT, RF can converge
to a lower generalization error than the traditional DT. Furthermore,
DT selects the optimal partition attribute from the full attribute set, whereas
RF selects the partition attribute only from a subset of the attribute set, so
its training efficiency is higher. Each tree in RF uses only part
of the samples and features, which mitigates the “overfitting” defect of
DT. Compared with GBDT and XGBOOST, the performance of RF is
more stable, parameter tuning is less complicated, the
computation time is short, and its generality is stronger. Compared
with SVM and LR, RF randomly selects samples and features for each
tree, which reduces the influence of noise variables, increases noise
resistance, and provides more stable performance. Moreover, as the number of
samples and features increases, SVM requires considerable
time to find a suitable kernel function, whereas RF
has no such weakness. The results of our study were also consistent with RF
being the optimal classifier for our model. In our data, the RF, GBDT,
and XGBOOST classifiers generally performed better than the SVM,
LR, and DT classifiers. The RF model performed the best (AUC: 0.94,
95% CI: 0.884–0.988; ACC: 0.90). In the test cohort, our RF model
obtained performance equivalent to that of the senior radiologist
(AUC: 0.94 vs. 0.92, P = 0.798; ACC: 0.90 vs. 0.92) and had
considerably higher specificity than both the senior (0.93 vs. 0.84)
and junior (0.93 vs. 0.79) radiologists. The good performance of our
model also suggests that, during the CEUS process, radiologists
could pay more attention to five time points: the “2nd second after
the arrival time” frame, “time to peak” frame, “2nd second after peak” frame,
“first-flash” frame, and “second-flash” frame, especially the peak time.
This not only achieves comparable performance in diagnosing thyroid
nodules classified as C-TIRADS 4 but also saves
radiologists' time compared with watching the entire CEUS video.
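The two sources of randomness attributed to RF in the discussion of classifiers, random sampling of the training cases for each tree and a random subset of candidate attributes, can be sketched as follows. This is an illustration of the general technique rather than the configuration used in the study; the function names, the square-root rule for the subset size, and the seed are assumptions, while the figure of 18 features is taken from the feature selection reported above.

```python
import random
from math import isqrt


def bootstrap_indices(n_samples, rng):
    """Draw n_samples case indices with replacement (one bootstrap per tree)."""
    return [rng.randrange(n_samples) for _ in range(n_samples)]


def candidate_features(n_features, rng):
    """Pick a random subset of features to consider for a split.

    The subset size floor(sqrt(n_features)) is a common default and an
    assumption here, not a setting reported in the article.
    """
    k = max(1, isqrt(n_features))
    return sorted(rng.sample(range(n_features), k))


rng = random.Random(0)  # fixed seed so the sketch is repeatable
tree_cases = bootstrap_indices(10, rng)
split_features = candidate_features(18, rng)
print(tree_cases)
print(split_features)
```

Because each tree sees a different resampled set of cases and only a few candidate attributes per split, the trees are decorrelated, and averaging their votes reduces the variance that makes a single DT prone to overfitting.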
This study had some limitations. First, this was a single-center
retrospective study; our institution is a referral center, and the
malignancy risk of thyroid nodules is relatively high, which may
have led to selection bias in our samples. Second, this study lacked
external validation; a multi-center, multi-region study is required
to strengthen the robustness and generalizability
of our results. Third, the ROIs of the nodules were all manually
delineated, and the key frames were also selected
by radiologists. Although we obtained rather good performance,
these two procedures are time-consuming and prone to
errors, and their efficiency and accuracy could potentially be
improved with the implementation of a mature automated
artificial intelligence system.
5 Conclusion
Our study established six ML models based on two 2D-US
images and five CEUS key frames to distinguish malignant from
benign thyroid nodules classified as C-TIRADS 4. Our
study highlighted information in CEUS images, extracted by ML,
that cannot be seen by the human eye, indicating that CEUS may
have great potential in the evaluation of thyroid nodules. The RF model, as
the optimal ML algorithm, may provide a noninvasive, convenient,
feasible, and highly accurate method that could complement invasive FNA and assist
junior radiologists in diagnosis or in building preoperative prediction models.
Further studies will address the limitations described above, making it possible to
improve clinical diagnostic and therapeutic strategies.
Data availability statement
The original contributions presented in the study are included
in the article/Supplementary Material. Further inquiries can be
directed to the corresponding author.
Ethics statement
The studies involving humans were approved by the Medical Ethics
Committee of Shengjing Hospital, China Medical University. The
studies were conducted in accordance with the local legislation and
institutional requirements. The participants provided their written
informed consent to participate in this study.
Author contributions
J-hC: Conceptualization, Data curation, Formal analysis,
Investigation, Methodology, Project administration, Software,
Supervision, Validation, Visualization, Writing – original draft,
Writing – review & editing. Y-QZ: Conceptualization, Methodology,
Software, Validation, Visualization, Writing – review & editing. T-tZ:
Conceptualization, Data curation, Formal analysis, Methodology,
Resources, Writing – review & editing. QZ: Conceptualization,
Formal analysis, Investigation, Methodology, Validation,
Visualization, Writing – review & editing. A-xZ: Data curation,
Formal analysis, Writing – review & editing. YH: Conceptualization,
Data curation, Funding acquisition, Methodology, Resources,
Supervision, Writing – review & editing.
Funding
The author(s) declare financial support was received for the
research, authorship, and/or publication of this article. This study
was supported by grants from the 345 Talent Project of Shengjing
Hospital of China Medical University; Liaoning Province Bai Qian
Wan Talents Program; Liaoning Province "Xingliao Talent Plan"
Medical Master Project (YXMJ-LJ-10) and Liaoning Provincial
Science and Technology Program Combined Program (Key R&D
Program Projects).
Acknowledgments
We would like to thank Yizhun Medical AI Technology Co.,
Ltd., who kindly provided the Darwin research platform and
technical support.
Conflict of interest
The authors declare that the research was conducted in the
absence of any commercial or financial relationships that could be
construed as a potential conflict of interest.
Publisher's note
All claims expressed in this article are solely those of the authors
and do not necessarily represent those of their affiliated
organizations, or those of the publisher, the editors and the
reviewers. Any product that may be evaluated in this article, or
claim that may be made by its manufacturer, is not guaranteed or
endorsed by the publisher.
Supplementary material
The Supplementary Material for this article can be found online
at: https://www.frontiersin.org/articles/10.3389/fendo.2024.1299686/full#supplementary-material
References
1. Vaccarella S, Franceschi S, Bray F, Wild CP, Plummer M, Dal Maso L. Worldwide thyroid-cancer epidemic? The increasing impact of overdiagnosis. N Engl J Med. (2016) 375:614–7. doi: 10.1056/NEJMp1604412
2. Haugen BR, Alexander EK, Bible KC, Doherty GM, Mandel SJ, Nikiforov YE, et al. 2015 American Thyroid Association management guidelines for adult patients with thyroid nodules and differentiated thyroid cancer: the American Thyroid Association guidelines task force on thyroid nodules and differentiated thyroid cancer. Thyroid. (2016) 26:1–133. doi: 10.1089/thy.2015.0020
3. Batawil N, Alkordy T. Ultrasonographic features associated with malignancy in cytologically indeterminate thyroid nodules. Eur J Surg Oncol. (2014) 40:182–6. doi: 10.1016/j.ejso.2013.11.015
4. Kwak JY, Han KH, Yoon JH, Moon HJ, Son EJ, Park SH, et al. Thyroid imaging reporting and data system for US features of nodules: a step in establishing better stratification of cancer risk. Radiology. (2011) 260:892–9. doi: 10.1148/radiol.11110206
5. Ma YH, Yue T, He QQ. Tracheal injury following robotic thyroidectomy: A literature review of epidemiology, etiology, diagnosis, and treatment and 3 case reports. Asian J Surg. (2023) 10:039. doi: 10.1016/j.asjsur
6. Haddou N, Idrissi N, Ben Jebara S. Analysis of voice quality after thyroid surgery. J Voice. (2023) S0892-1997(23)00208-4. doi: 10.1016/j.jvoice.2023.06.027
7. Zhou J, Yin L, Wei X, Zhang S, Song Y, Luo B, et al. 2020 Chinese guidelines for ultrasound malignancy risk stratification of thyroid nodules: the C-TIRADS. Endocrine. (2020) 70:256–79. doi: 10.1007/s12020-020-02441-y
8. Zhu TT, Zhuang LT, Ma XF, Zhao AX, Huang Y. Differential diagnosis of malignant and Hashimoto thyroid nodules by conventional ultrasound combined with contrast-enhanced ultrasound. Chin J Med Imaging Technol. (2021) 37:1789–93. doi: 10.13929/j.issn.10033289.2021.12.007
9. Chen S, Tang K, Gong Y, Ye F, Liao L, Li X, et al. Value of contrast-enhanced ultrasound in mummified thyroid nodules. Front Endocrinol (Lausanne). (2022) 13:2022.850698. doi: 10.3389/fendo.2022.850698
10. Yin T, Zheng B, Lian Y, Li H, Tan L, Xu S, et al. Contrast-enhanced ultrasound improves the potency of fine-needle aspiration in thyroid nodules with high inadequate risk. BMC Med Imaging. (2022) 22:83. doi: 10.1186/s12880-022-00805-6
11. Zhang M, Luo Y, Zhang Y, Tang J. Efficacy and safety of ultrasound-guided radiofrequency ablation for treating low-risk papillary thyroid microcarcinoma: A prospective study. Thyroid. (2016) 26:1581–7. doi: 10.1089/thy.2015.0471
12. Wang Y, Dong T, Nie F, Wang G, Liu T, Niu Q. Contrast-enhanced ultrasound in the differential diagnosis and risk stratification of ACR TI-RADS category 4 and 5 thyroid nodules with non-hypovascular. Front Oncol. (2021) 11:2021.662273. doi: 10.3389/fonc.2021.662273
13. Zhang J, Zhang X, Meng Y, Chen Y. Contrast-enhanced ultrasound for the differential diagnosis of thyroid nodules: An updated meta-analysis with comprehensive heterogeneity analysis. PloS One. (2020) 15:e0231775. doi: 10.1371/journal.pone.0231775
14. Wan Q, Cao P, Liu J. Meta-analysis of contrast enhanced ultrasound in judging benign and malignant thyroid tumors. Comput Math Methods Med. (2021) 2021:2577113. doi: 10.1155/2021/2577113
15. Wu Q, Wang Y, Li Y, Hu B, He ZY. Diagnostic value of contrast-enhanced ultrasound in solid thyroid nodules with and without enhancement. Endocrine. (2016) 53:480–8. doi: 10.1007/s12020-015-0850-0
16. Zhang Y, Luo YK, Zhang MB, Li J, Li J, Tang J. Diagnostic accuracy of contrast-enhanced ultrasound enhancement patterns for thyroid nodules. Med Sci Monit. (2016) 22:4755–64. doi: 10.12659/msm.899834
17. Sorrenti S, Dolcetti V, Fresilli D, Del Gaudio G, Pacini P, Huang P, et al. The role of CEUS in the evaluation of thyroid cancer: from diagnosis to local staging. J Clin Med. (2021) 10(19):4559. doi: 10.3390/jcm10194559
18. LeCun Y, Bengio Y, Hinton G. Deep learning. Nature. (2015) 521:436–44. doi: 10.1038/nature14539
19. Wongkoblap A, Vadillo MA, Curcin V. Modeling depression symptoms from social network data through multiple instance learning. AMIA Jt Summits Transl Sci Proc. (2019) 2019:44–53.
20. Jing X, Wielema M, Cornelissen LJ, van Gent M, Iwema WM, Zheng S, et al. Using deep learning to safely exclude lesions with only ultrafast breast MRI to shorten acquisition and reading time. Eur Radiol. (2022) 32(12):8706–15. doi: 10.1007/s00330-022-08863-8
21. Zheng Y, Zhou D, Liu H, Wen M. CT-based radiomics analysis of different machine learning models for differentiating benign and malignant parotid tumors. Eur Radiol. (2022) 32(10):6953–64. doi: 10.1007/s00330-022-08830-3
22. Almberg SS, Lervåg C, Frengen J, Eidem M, Abramova TM, Nordstrand CS, et al. Training, validation, and clinical implementation of a deep-learning segmentation model for radiotherapy of loco-regional breast cancer. Radiother Oncol. (2022) 173:62–8. doi: 10.1016/j.radonc.2022.05.018
23. McKinney SM, Sieniek M, Godbole V, Godwin J, Antropova N, Ashrafian H, et al. International evaluation of an AI system for breast cancer screening. Nature. (2020) 577:89–94. doi: 10.1038/s41586-019-1799-6
24. Mao N, Yin P, Wang Q, Liu M, Dong J, Zhang X, et al. Added value of radiomics on mammography for breast cancer diagnosis: A feasibility study. J Am Coll Radiol. (2019) 16:485–91. doi: 10.1016/j.jacr.2018.09.041
25. Bai Z, Chang L, Yu R, Li X, Wei X, Yu M, et al. Thyroid nodules risk stratification through deep learning based on ultrasound images. Med Phys. (2020) 47:6355–65. doi: 10.1002/mp.14543
26. Nguyen DT, Kang JK, Pham TD, Batchuluun G, Park KR. Ultrasound image-based diagnosis of malignant thyroid nodule using artificial intelligence. Sensors (Basel). (2020) 20(7):1822. doi: 10.3390/s20071822
27. Wan P, Chen F, Liu C, Kong W, Zhang D. Hierarchical temporal attention network for thyroid nodule recognition using dynamic CEUS imaging. IEEE Trans Med Imaging. (2021) 40:1646–60. doi: 10.1109/tmi.2021.3063421
28. Guo SY, Zhou P, Zhang Y, Jiang LQ, Zhao YF. Exploring the value of radiomics features based on B-mode and contrast-enhanced ultrasound in discriminating the nature of thyroid nodules. Front Oncol. (2021) 11:738909. doi: 10.3389/fonc.2021.738909
29. Jin ZQ, Yu HZ, Mo CJ, Su RQ. Clinical study of the prediction of malignancy in thyroid nodules: modified score versus 2017 American College of Radiology's thyroid imaging reporting and data system ultrasound lexicon. Ultrasound Med Biol. (2019) 45:1627–37. doi: 10.1016/j.ultrasmedbio.2019.03.014
30. Sidhu PS, Cantisani V, Dietrich CF, Gilja OH, Saftoiu A, Bartels E, et al. The EFSUMB guidelines and recommendations for the clinical practice of contrast-enhanced ultrasound (CEUS) in non-hepatic applications: update 2017 (Long version). Ultraschall Med. (2018) 39:e2–e44. doi: 10.1055/a-0586-1107
31. Radzina M, Ratniece M, Putrins DS, Saule L, Cantisani V. Performance of contrast-enhanced ultrasound in thyroid nodules: review of current state and future perspectives. Cancers (Basel). (2021) 13(21):5469. doi: 10.3390/cancers13215469
32. He Y, Wang XY, Hu Q, Chen XX, Ling B, Wei HM. Value of contrast-enhanced ultrasound and acoustic radiation force impulse imaging for the differential diagnosis of benign and malignant thyroid nodules. Front Pharmacol. (2018) 9:1363. doi: 10.3389/fphar.2018.01363
33. Pang T, Huang L, Deng Y, Wang T, Chen S, Gong X, et al. Logistic regression analysis of conventional ultrasonography, strain elastosonography, and contrast-enhanced ultrasound characteristics for the differentiation of benign and malignant thyroid nodules. PloS One. (2017) 12:e0188987. doi: 10.1371/journal.pone.0188987
34. Yu D, Han Y, Chen T. Contrast-enhanced ultrasound for differentiation of benign and malignant thyroid lesions: meta-analysis. Otolaryngol Head Neck Surg. (2014) 151:909–15. doi: 10.1177/0194599814555838
35. Torigian DA, Li G, Alavi A. The role of CT, MR imaging, and ultrasonography in endocrinology. PET Clin. (2007) 2:395–408. doi: 10.1016/j.cpet.2008.05.002
36. Jiang L, Zhang D, Chen YN, Yu XJ, Pan MF, Lian L. The value of conventional ultrasound combined with superb microvascular imaging and color Doppler flow imaging in the diagnosis of thyroid malignant nodules: a systematic review and meta-analysis. Front Endocrinol (Lausanne). (2023) 14:2023.1182259. doi: 10.3389/fendo.2023.1182259
37. Chambara N, Lo X, Chow TCM, Lai CMS, Liu SYW, Ying M. Combined shear wave elastography and EU TIRADS in differentiating malignant and benign thyroid nodules. Cancers (Basel). (2022) 14(22):5521. doi: 10.3390/cancers14225521
38. Takano T. Overdiagnosis of juvenile thyroid cancer: time to consider self-limiting cancer. J Adolesc Young Adult Oncol. (2020) 9:286–8. doi: 10.1089/jayao.2019.0098
39. Acosta GJ, Singh Ospina N, Brito JP. Overuse of thyroid ultrasound. Curr Opin Endocrinol Diabetes Obes. (2023) 30:225–30. doi: 10.1097/med.0000000000000814
40. Zhao T, Xu S, Zhang X, Xu C. Comparison of various ultrasound-based malignant risk stratification systems on an occasion for assessing thyroid nodules in Hashimoto's thyroiditis. Int J Gen Med. (2023) 16:599–608. doi: 10.2147/ijgm.S398601
41. Shin JH, Baek JH, Chung J, Ha EJ, Kim JH, Lee YH, et al. Ultrasonography diagnosis and imaging-based management of thyroid nodules: revised Korean Society of Thyroid Radiology consensus statement and recommendations. Korean J Radiol. (2016) 17:370–95. doi: 10.3348/kjr.2016.17.3.370
42. Xiao F, Li JM, Han ZY, Liu FY, Yu J, Xie MX, et al. Multimodality US versus thyroid imaging reporting and data system criteria in recommending fine-needle aspiration of thyroid nodules. Radiology. (2023) 307:e221408. doi: 10.1148/radiol.221408
43. Zhou P, Chen F, Zhou P, Xu L, Wang L, Wang Z, et al. The use of modified TI-RADS using contrast-enhanced ultrasound features for classification purposes in the differential diagnosis of benign and malignant thyroid nodules: A prospective and multi-center study. Front Endocrinol (Lausanne). (2023) 14:2023.1080908. doi: 10.3389/fendo.2023.1080908
44. Zhu T, Chen J, Zhou Z, Ma X, Huang Y. Differentiation of thyroid nodules (C-TIRADS 4) by combining contrast-enhanced ultrasound diagnosis model with Chinese thyroid imaging reporting and data system. Front Oncol. (2022) 12:2022.840819. doi: 10.3389/fonc.2022.840819
45. Cheng H, Zhuo SS, Rong X, Qi TY, Sun HG, Xiao X, et al. Value of contrast-enhanced ultrasound in adjusting the classification of Chinese-TIRADS 4 nodules. Int J Endocrinol. (2022) 2022:5623919. doi: 10.1155/2022/5623919
46. Ruan J, Xu X, Cai Y, Zeng H, Luo M, Zhang W, et al. A practical CEUS thyroid reporting system for thyroid nodules. Radiology. (2022) 305:149–59. doi: 10.1148/radiol.212319
47. Zhu YC, Jin PF, Bao J, Jiang Q, Wang X. Thyroid ultrasound image classification using a convolutional neural network. Ann Transl Med. (2021) 9:1526. doi: 10.21037/atm-21-4328
48. Peng S, Liu Y, Lv W, Liu L, Zhou Q, Yang H, et al. Deep learning-based artificial intelligence model to assist thyroid nodule diagnosis and management: a multicentre diagnostic study. Lancet Digit Health. (2021) 3:e250–e9. doi: 10.1016/s2589-7500(21)00041-8
49. Alabrak MMA, Megahed M, Alkhouly AA, Mohammed A, Elfandy H, Tahoun N, et al. Artificial intelligence role in subclassifying cytology of thyroid follicular neoplasm. Asian Pac J Cancer Prev. (2023) 24:1379–87. doi: 10.31557/apjcp.2023.24.4.1379
50. Hirokawa M, Niioka H, Suzuki A, Abe M, Arai Y, Nagahara H, et al. Application of deep learning as an ancillary diagnostic tool for thyroid FNA cytology. Cancer Cytopathol. (2023) 131:217–25. doi: 10.1002/cncy.22669
51. Cui Y, Fu C, Si C, Li J, Kang Y, Huang Y, et al. Analysis and comparison of the malignant thyroid nodules not recommended for biopsy in ACR TIRADS and AI TIRADS with a large sample of surgical series. J Ultrasound Med. (2023) 42:1225–33. doi: 10.1002/jum.16132
52. Wang Z, Qu L, Chen Q, Zhou Y, Duan H, Li B, et al. Deep learning-based
multifeature integration robustly predicts central lymph node metastasis in papillary
thyroid cancer. BMC Cancer. (2023) 23:128. doi: 10.1186/s12885-023-10598-8
53. Abbasian Ardakani A, Mohammadi A, Mirza-Aghazadeh-Attari M, Faeghi F,
Vogl TJ, Acharya UR. Diagnosis of metastatic lymph nodes in patients with papillary
thyroid cancer: A comparative multi-center study of semantic features and deep
learning-based models. J Ultrasound Med. (2023) 42:121121. doi: 10.1002/jum.16131
54. Lu WJ, Mao L, Li J, OuYang LY, Chen JY, Chen SY, et al. Three-dimensional
ultrasound based radiomics nomogram for the prediction of extrathyroidal extension
features in papillary thyroid cancer. Front Oncol. (2023) 13:2023.1046951. doi: 10.3389/
fonc.2023.1046951
55. Liu Z, Zhong S, Liu Q, Xie C, Dai Y, Peng C, et al. Thyroid nodule recognition
using a joint convolutional neural network with information fusion of ultrasound
images and radiofrequency data. Eur Radiol. (2021) 31:500111. doi: 10.1007/s00330-
020-07585-z
56. Gomes Ataide EJ, Ponugoti N, Illanes A, Schenke S, Kreissl M, Friebe M. Thyroid
nodule classication for physician decision support using machine learning-evaluated
geometric and morphological features. Sensors (Basel). (2020) 20(21):6110.
doi: 10.3390/s20216110
57. Sun C, Zhang Y, Chang Q, Liu T, Zhang S, Wang X, et al. Evaluation of a deep
learning based computer-aided diagnosis system for distinguishing benign from
Malignant thyroid nodules in ultrasound images. Med Phys. (2020) 47:395260.
doi: 10.1002/mp.14301
58. Gong ZJ, Xin J, Yin J, Wang B, Li X, Yang HX, et al. Diagnostic value of artificial
intelligence-assistant diagnostic system combined with contrast-enhanced ultrasound
in thyroid TI-RADS 4 nodules. J Ultrasound Med. (2023) 42:1527–35.
doi: 10.1002/jum.16170
59. Ma XF ZT, Zhuang LT, Huang Y. Retrospective study and interrupted time
series analysis of ultrasound guided thyroid fine needle aspiration. J China Clinic Med
Imaging. (2022) 33:837–41. doi: 10.12117/jccmi.2022.12.001
60. Bhattacharjee S, Kim CH, Park HG, Prakash D, Madusanka N, Cho NH, et al.
Multi-features classification of prostate carcinoma observed in histological sections:
analysis of wavelet-based texture and colour features. Cancers (Basel). (2019) 11
(12):1937. doi: 10.3390/cancers11121937
61. Fan X, Xie N, Chen J, Li T, Cao R, Yu H, et al. Multiparametric MRI and
machine learning based radiomic models for preoperative prediction of multiple
biological characteristics in prostate cancer. Front Oncol. (2022) 12:2022.839621.
doi: 10.3389/fonc.2022.839621
62. Meng X, Xia W, Xie P, Zhang R, Li W, Wang M, et al. Preoperative radiomic
signature based on multiparametric magnetic resonance imaging for noninvasive
evaluation of biological characteristics in rectal cancer. Eur Radiol. (2019) 29:3200–9.
doi: 10.1007/s00330-018-5763-x
63. Zhang Z, Jung C. GBDT-MO: gradient-boosted decision trees for multiple
outputs. IEEE Trans Neural Netw Learn Syst. (2021) 32:3156–67.
doi: 10.1109/TNNLS.2020.3009776