Received: 12 December 2021 Revised: 7 April 2022 Accepted: 10 June 2022
DOI: 10.1002/mp.15842
RESEARCH ARTICLE
A novel end-to-end deep learning solution for coronary
artery segmentation from CCTA
Caixia Dong, Songhua Xu, Zongfang Li
Institute of Medical Artificial Intelligence, The Second Affiliated Hospital of Xi’an Jiaotong University, Xi’an, Shaanxi, China
Correspondence
Songhua Xu, Institute of Medical Artificial
Intelligence, The Second Affiliated Hospital of
Xi’an Jiaotong University, Xi’an, Shaanxi,
710004, China.
Email: songhua_xu1@163.com
Funding information
National Natural Science Foundation of
China, Grant/Award Number: 12026609;
Ministry of Science and Technology of China,
Grant/Award Number: 2020AAA0106302;
Shaanxi University Joint Project, Grant/Award
Number: 2020GXLH-Z-002; Research Fund of
the Second Affiliated Hospital of Xi’an
Jiaotong University, Grant/Award Number:
YJ(QN)202017
Abstract
Purpose: Coronary computed tomographic angiography (CCTA) plays a vital
role in the diagnosis of cardiovascular diseases, among which automatic coro-
nary artery segmentation (CAS) serves as one of the most challenging tasks.
To computationally assist the task, this paper proposes a novel end-to-end deep
learning-based (DL) solution for automatic CAS.
Methods: Inspired by the Di-Vnet network, a fully automatic multistage DL solution is proposed. The new solution aims to preserve the integrity of blood vessels
in terms of both their shape details and continuity. The solution is developed
using 338 CCTA cases, among which 133 cases (33865 axial images) have
their ground-truth cardiac masks pre-annotated and 205 cases (53365 axial
images) have their ground-truth coronary artery (CA) masks pre-annotated.
The solution’s accuracy is measured using dice similarity coefficient (DSC), 95th
percentile Hausdorff Distance (95% HD), Recall, and Precision scores for CAS.
Results: The proposed solution attains 90.29% in DSC, 2.11 mm in 95% HD, 97.02% in Recall, and 92.17% in Precision, while consuming 0.112 s per image and 30 s per case on average. Such performance is superior to that of other state-of-the-art segmentation methods.
Conclusions: The novel DL solution is able to automatically learn to per-
form CAS in an end-to-end fashion, attaining a high accuracy, efficiency and
robustness simultaneously.
KEYWORDS
blood vessel integrity, coronary artery segmentation, deep learning
1 INTRODUCTION
Cardiovascular disease is the leading cause of death around the world,1 among which Coronary Artery Dis-
ease (CAD) is a major one. To evaluate the severity
of CAD, coronary computed tomographic angiography
(CCTA) is popularly adopted due to its non-invasive
nature, with the aid of which the heart and coronary
artery structures of a patient can be reconstructed
and assessed. To analyze CCTA for a range of tasks
such as stenosis calculation, centerline extraction, and
Abbreviations: 3D Cardiac ROI, 3D cardiac Region of Interest; 95% HD, the 95th percentile Hausdorff distance; CAD, coronary artery disease; CAS, coronary artery segmentation; CCTA, coronary computed tomographic angiography; CS, cardiac segmentation; DL, deep learning; DSC, dice similarity coefficient; FCN, Fully Convolutional Networks; GCAS, global coronary artery segmentation; HU, Hounsfield unit; LCAS, local coronary artery segmentation.
plaque analysis, Coronary Artery Segmentation (CAS)
is a crucial step. Since coronary arteries may have richly
varying diameters and complicated trajectories, manual
CAS is highly laborious and skill demanding, leading to
the rising demand for automatic CAS.
Existing CAS methods have exploited classical
machine learning and modern deep learning meth-
ods to segment coronary arteries from CCTA. The
former category of methods includes region-based,2 edge-based,3 tracking-based,4 graph-cut-based,5 and level-set-based methods.6 These algorithms have shown promising results on limited datasets but often perform poorly on newly encountered datasets beyond the original training data. These techniques rely heavily on hand-crafted engineering features, requiring intensive domain knowledge and expert input. Moreover,
hand-crafted features exhibit limited representational ability under challenging conditions, such as severe noise, varying brightness, and ambiguous boundaries. Therefore, these methods share a common weakness: hand-crafted features are difficult to design to the level of robustness that CAS demands.
Encouraged by the remarkable successes of deep
learning in computer vision, deep convolutional neural
networks (CNNs) have recently emerged as promis-
ing alternatives for medical image segmentation.7–24
Compared with traditional methods using hand-crafted
features, deep learning methods can automatically
learn hierarchical feature representations from the input
image to attain end-to-end prediction. Most state-of-the-art medical image segmentation approaches are based on the encoder-decoder network architecture, among which the most representative methods are the fully convolutional network (FCN),8 U-Net,9 and a series of U-Net variants. In particular, Çiçek et al.10 extended the U-Net architecture9 by replacing all 2D operations involved with their 3D counterparts, yielding U-Net3D, which can use a few training samples and partial annotations to solve volumetric medical segmentation tasks in an end-to-end fashion. Milletari et al.11 proposed the V-Net, which can be considered a 3D variant of the U-Net framework, including both contracting and expanding stages, and introduced residual blocks and the dice coefficient as an objective function for 3D convolutional layers. That is, V-Net is essentially a variant of the U-Net structure.
Benefiting from the powerful discriminative ability of the above DL networks, U-Net and its variants have achieved remarkable success in segmenting various structures in liver lesions,12 brain lesions,13 pancreas lesions,14,24 and breast lesions23 from medical volumes. Specifically, Giddwani et al.24 proposed a deep dilation network (DDN)-based V-Net architecture for 3D volumetric segmentation of the pancreas in CT images. However, the DDN is only integrated at the bottom layer of the V-Net architecture. Hu et al.23 proposed a fully convolutional network with dilated convolution to segment tumors in breast ultrasound images. It used dilated convolution layers to replace the max-pooling layers in stages 4 and 5 of the encoder. Also, numerous
works have made efforts on CAS due to its significance for CAD diagnosis and preoperative planning. For example, 2D FCNs have been employed to segment the coronary artery in CCTA.15 Mirunalini et al.16 proposed a two-stage approach to segment the coronary artery: the first stage adopted a 2D Recurrent Convolutional Neural Network to detect the artery in each slice, and a 2D residual U-Net was then used to segment the coronary artery. Gu et al.17 proposed a global feature embedded network containing semantic information and detailed features for coronary artery segmentation. However, most 2D methods omit 3D spatial context and cannot straightforwardly process 3D images, whose content is spatially continuous along the z-axis.
To overcome this limitation, Huang et al.18 presented 3D U-Net-based approaches for CCTA data both with and without centerlines; Lei et al.19 developed a 3D Attention Fully Convolutional Network (Atten-FCN3D) model to automatically segment the coronary artery from CCTA.
These approaches have greatly advanced the development of automatic segmentation of the coronary artery (CA); nevertheless, the task of CAS remains highly challenging due to 1) the relatively small volume of the coronary arteries scattered across a much larger volume of surrounding tissues, 2) the high perceptual similarity and close spatial adjacency between the coronary arteries and their neighboring vessel structures, such as coronary veins and pulmonary blood vessels, and 3) limited training data annotated with quality gold-standard segmentation results.
robust CAS with high efficiency, in this work, we 1) pro-
pose a novel end-to-end DL solution that is able to
preserve the integrity of coronary arteries in terms of
both shape details and continuity; 2) construct a large
scale CCTA dataset with fine-grained pixel-level annota-
tions for CAS. Specifically, we utilize the collected CCTA
dataset consisting of thousands of CT images from
hundreds of real patient cases to train the proposed
solution to attain a satisfactory level of segmentation
performance. As illustrated in Figure 1, building upon the newly proposed Di-Vnet model, a novel multistage DL solution for CAS is further proposed. It is noted that Di-Vnet is essentially a variant of the U-Net structure, which produces a larger receptive field to incorporate more extensive context, without increasing the number of parameters or the amount of computation, while preserving the full spatial resolution of the input images.
(CS); the second stage derives and independently ana-
lyzes global and local features by global coronary artery
segmentation (GCAS) and local coronary artery seg-
mentation (LCAS) modules, respectively, based on the
3D cardiac region of interest (3D Cardiac ROI) from CS
for more accurate CAS results.
In summary, the main contributions of this work are
three-fold:
1. We propose a novel multi-stage end-to-end DL solu-
tion for automated CAS. Building upon the newly
introduced Di-Vnet model, a two-stage procedure for
automatic segmentation of the coronary artery is
introduced. Benefiting from the global and local CAS
independently performed by the two modules of
GCAS and LCAS, the proposed solution is able to
preserve the integrity of blood vessels in terms of
both their shape details and continuity in its end CAS
output.
2. We construct a new large-scale CCTA dataset, which contains 87,230 fine-grained pixel-level labeled CT images from 338 cases, among which 133 cases, corresponding to 33865 axial images, were annotated in terms of their underlying cardiac masks, while the remaining 205 cases, corresponding to 53365 axial images, were annotated in terms of their underlying coronary artery masks.
FIGURE 1 Key processing workflow of the proposed multistage deep learning (DL) solution for coronary artery segmentation (CAS). The solution consists of a cardiac segmentation (CS) module, a GCAS module, an LCAS module, and a fusion module
3. Extensive experimental results indicate that the proposed approach outperforms multiple state-of-the-art algorithms tackling the CAS task. To explore the generalization of the proposed method, we additionally validate the model trained on our own CCTA dataset on a public dataset from the MICCAI 2008 Coronary Artery Tracking Challenge (CAT08).
2 METHODS AND MATERIALS
Building upon the newly introduced Di-Vnet model, a novel multistage DL solution for CAS is further proposed, whose processing workflow is shown in Figure 1 with its key operations listed in Table 1. The new DL solution first leverages its CS module to extract the cardiac region and a 3D cardiac Region of Interest (3D Cardiac ROI) from the raw CT scan images of a case. From the 3D Cardiac ROI, global and local features are independently derived and independently analyzed by a global coronary artery segmentation (GCAS) and a local coronary artery segmentation (LCAS) module, respectively, to generate two sets of CAS results, which are subsequently fused to synthesize the end CAS result.
2.1 Datasets
This study leverages two datasets: an in-house CCTA dataset and the CAT08 dataset.25 To the best of our knowledge, the in-house dataset is the first large-scale voxel-wise annotated dataset for CAS. The in-house CCTA dataset is used to train the proposed solution
TABLE 1 Key operations in the proposed solution, as illustrated in Figure 1, along with the dimensionalities of their respective input and output (Dim_In → Dim_Out). Steps I1-I4 are operations of the cardiac segmentation (CS) module, I5-I8 of the GCAS module, I9-I12 of the local coronary artery segmentation (LCAS) module, and I13-I16 of the fusion module. S1 denotes the three-dimensional size (512 × 512 × N, where N, the number of slices, is between 200 and 500) of the original coronary computed tomographic angiography (CCTA) data matrix; S2: 128 × 128 × 128; S3: 256 × 256 × 128; SROI denotes the three-dimensional size of a concerned 3D Cardiac ROI

Step | Operation | Dim_In → Dim_Out
I1 | Grayscale mapping and resizing | S1 → S2
I2 | Resizing | S2 → S1
I3 | 3D ROI detection | S1 → S1
I4 | 3D cropping | S1 → SROI
I5 | Resizing | SROI → S3
I6 | Resizing and registering | S3 → SROI → S1
I7, I8 | Skeletonization | S1 → S1
I9, I10 | 3D slicing | SROI → many S2
I11, I12 | 3D assembling and registering | many S2 → SROI → S1
I13-I16 | Fusion | S1 → S1
and demonstrate the performance of our method. The CAT08 dataset is used to verify the generalization ability of the new method across different data distributions.
2.1.1 Our CCTA dataset
In this study, we retrospectively collected CCTA data
from 338 patient cases. The data were acquired using
multiple CT scanners manufactured by GE Healthcare
(Revolution CT) and Siemens Healthcare (SOMATOM
Definition Flash). All images were scanned at the car-
diovascular medicine department of a top-grade (grade
III level A) hospital in its country by well-certified
technicians in radiology. No data from patients with
serious adverse reactions to iodine contrast agents,
inability to cooperate with scanning operations, hemo-
dynamic instability, decompensated heart failure, and
acute myocardial infarction were admitted into the
dataset. Each case in the dataset attains a good visual
quality in all its scanning images,as manually examined
and ensured by the operating technician at the acqui-
sition time. The study received hospital review and was
approved by its ethics board.
The CCTA data used in this study were acquired
in the DICOM format (Digital Imaging and Communi-
cations in Medicine). 338 cases were scanned with a
varying slice thickness between 0.5 and 1 mm. Three
experienced radiologists participated in the annotation
of these data, all using 3D-slicer,26 among which 133
cases (33865 axial images) were annotated in terms of
their underlying cardiac masks while the remaining 205
cases (53365 axial images) were annotated in terms
of their underlying coronary artery masks. In terms of the experimental setup, a five-fold cross-validation strategy is carried out on the dataset to obtain an adequate and reliable assessment of the different methods. For each fold, the dataset was randomly divided into a training set (70%: 93 cases annotated with cardiac masks and 155 cases annotated with coronary artery masks), a validation set (15%: 20 cases annotated with cardiac masks and 30 cases annotated with coronary artery masks), and a test set (15%: 20 cases annotated with cardiac masks and 30 cases annotated with coronary artery masks).
2.1.2 CAT08 dataset
We include the 18 CCTA images of the training set from a publicly available evaluation framework for coronary centerline extraction,25 the MICCAI 2008 Coronary Artery Tracking Challenge (CAT08). Images in CAT08
were acquired on a 64-slice CT scanner (Sensation 64,
Siemens Medical Solutions, Forchheim, Germany) or a
dual-source CT scanner (Somatom Definition, Siemens
Medical Solutions, Forchheim, Germany). Images were
reconstructed to a mean voxel size of 0.32 × 0.32 × 0.4 mm³. The centerlines of four major coronary arteries were manually annotated by three experts in a consensus reading for each scan. These arteries were the left anterior descending (LAD), left circumflex (LCX), right coronary artery (RCA), and a fourth vessel selected as a side branch of the LAD, LCX, or RCA. A detailed description of scan acquisition and reconstruction and the centerline annotation protocol is provided in ref. 25.
2.2 Data preprocessing
The data preprocessing of this study performs a
grayscale mapping operation.
2.2.1 Grayscale mapping
FIGURE 2 Architecture of the Di-Vnet. In the network, Skip Connections are implemented through concatenations; Conv means a convolutional layer, possibly dilated; ReLu represents a rectified linear unit layer; BN represents a batch normalization layer; ConvTrans represents a transposed convolutional layer, also possibly dilated

For the aforesaid CCTA data, the relevant range of Hounsfield units (HU) is [−250, 450]. We thus leverage Equation (1) to map original CT values in the CCTA data into the range of [0, 255.0]:

$$y = 255.0 \times \frac{x - \min}{\max - \min} \quad (1)$$

where $x$ and $y$ represent the CT value before and after the mapping, and min and max are set to −250 and 450 HU, respectively.
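For concreteness, this windowing-and-mapping step can be sketched as follows (Python, the paper's stated implementation language). Clipping values outside the window before rescaling is our assumption, since Equation (1) alone would map out-of-window CT values outside [0, 255]:

```python
import numpy as np

def map_hu_to_grayscale(volume_hu, hu_min=-250.0, hu_max=450.0):
    """Window a CT volume to [hu_min, hu_max] and linearly rescale it
    to [0, 255], following Equation (1)."""
    clipped = np.clip(volume_hu.astype(np.float32), hu_min, hu_max)
    return 255.0 * (clipped - hu_min) / (hu_max - hu_min)
```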
2.3 A novel DL model, Di-Vnet, for CS
and CAS
A novel DL model, titled Di-Vnet, is proposed in this study for automatic CS and CAS. Di-Vnet is constructed by incorporating dilated convolutions27 with multiple rates into multiple layers in the encoding and decoding stages of the original Vnet,11 hence its name. Figure 2 shows the network architecture of Di-Vnet. In its encoding stage, a convolutional layer with a receptive field of 3 × 3 × 3 voxels downsamples the resolution of its feature maps by half, followed by a dilated convolution layer, which preserves the resolution of its incoming feature maps yet achieves a receptive field of 7 × 7 × 7 voxels using only 3 × 3 × 3 parameters. Another dilated convolution layer further processes the resulting feature maps with a receptive field of 15 × 15 × 15 voxels using 3 × 3 × 3 parameters while also preserving the resolution of these maps. The end embedding vector output by the encoding part of the network is transformed by a ResConv block. The transformed embedding vector is then fed into the decoding part of the network, whose construction mirrors that of the encoding part, yet in reverse order (see Figure 2 for details). At last, a fully convolutional layer is applied, which uses sigmoid as its activation function to derive the final network output.
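As an illustration, one encoding step under our reading of Figure 2 might look as follows in Keras. The dilation rates of 3 and 7 are assumptions inferred from the stated receptive fields (the effective kernel of a dilated 3 × 3 × 3 convolution is dilation × 2 + 1, giving 7 and 15 voxels per axis), and the filter count is a placeholder:

```python
from tensorflow.keras import layers

def encoder_block(x, filters):
    """One Di-Vnet encoding step as we read Figure 2: a strided 3x3x3
    convolution halves the feature-map resolution, then two dilated
    3x3x3 convolutions widen the receptive field at that resolution."""
    x = layers.Conv3D(filters, 3, strides=2, padding="same")(x)
    x = layers.BatchNormalization()(x)
    x = layers.ReLU()(x)
    for rate in (3, 7):  # effective kernels of 7 and 15 voxels per axis
        x = layers.Conv3D(filters, 3, dilation_rate=rate, padding="same")(x)
        x = layers.BatchNormalization()(x)
        x = layers.ReLU()(x)
    return x
```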
A key advantage of the proposed Di-Vnet model is its capability of directly processing 3D data, making it sensitive to spatial features and constraints that remain invisible to 2D processing models. Thanks to the various dilated convolution layers built into Di-Vnet, a relatively small set of network parameters suffices to attain receptive fields of adequate size without compromising the resolution of feature maps, in contrast to the conventional Vnet design, which employs traditional convolution layers and therefore requires many more network parameters. The performance advantage of Di-Vnet with respect to Vnet is comprehensively revealed through our experimental results.
2.4 Cardiac segmentation
A capable CS model can provide two important clinical benefits. First, quality CS results facilitate heart reconstruction, whose results enable physicians to diagnose CAD more intuitively and accurately; second, the 3D Cardiac ROIs generated by CS can effectively eliminate ribs and pulmonary blood vessels, leading to more accurate CAS results.
In the proposed solution, a CS model is implemented using the Di-Vnet, trained through supervised learning and coupled with a Dice_loss function. The essence of Dice_loss is to measure the overlap between the ground-truth area and the segmentation area predicted by the network, which helps alleviate category imbalance. The loss function is defined in Equations (2) and (3) according to the dice similarity coefficient (DSC):11,28

$$\mathrm{Dice} = \frac{2 \times |y_{true} \cap y_{pred}|}{|y_{true}| + |y_{pred}|} \quad (2)$$

$$\mathrm{Dice\_loss} = 1 - \mathrm{Dice} \quad (3)$$

where $y_{true}$ and $y_{pred}$ respectively represent the ground-truth mask and the mask generated by CS.
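A minimal Keras-style sketch of this loss follows; the small smoothing constant, not stated in the paper, is our assumption to avoid division by zero:

```python
import tensorflow as tf

def dice_loss(y_true, y_pred, smooth=1e-6):
    """Soft Dice loss per Equations (2)-(3) on flattened volumes."""
    y_true_f = tf.reshape(y_true, [-1])
    y_pred_f = tf.reshape(y_pred, [-1])
    intersection = tf.reduce_sum(y_true_f * y_pred_f)
    dice = (2.0 * intersection + smooth) / (
        tf.reduce_sum(y_true_f) + tf.reduce_sum(y_pred_f) + smooth)
    return 1.0 - dice
```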
A 3D Cardiac ROI detection procedure is subse-
quently executed as follows: First, the original CCTA
data are filtered using the CS mask generated in the
first stage to obtain the cardiac region, for which a cor-
responding bounding box is derived that has an equal
length in its coronal and sagittal direction. A 3D crop-
ping operation further extracts the 3D Cardiac ROI from
the detected cardiac region.
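This bounding-box step can be sketched as follows; the (z, y, x) axis order and the centering of the square in-plane box are our assumptions:

```python
import numpy as np

def cardiac_roi_slices(cs_mask):
    """Derive a 3D bounding box around the predicted cardiac mask with
    equal edge lengths in the coronal and sagittal (in-plane) axes,
    returning slice objects usable for 3D cropping."""
    zs, ys, xs = np.nonzero(cs_mask)
    z0, z1 = zs.min(), zs.max() + 1
    side = max(ys.max() - ys.min(), xs.max() - xs.min()) + 1
    yc, xc = (ys.min() + ys.max()) // 2, (xs.min() + xs.max()) // 2
    y0 = max(yc - side // 2, 0)
    x0 = max(xc - side // 2, 0)
    return slice(z0, z1), slice(y0, y0 + side), slice(x0, x0 + side)
```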
2.5 Coronary artery segmentation
The coronary arteries often exhibit richly varying diameters and complicated trajectories, making manual CAS a highly laborious and skill-demanding procedure. Therefore, an automatic CAS procedure is introduced in this study. Given a 3D Cardiac ROI, the CAS procedure first derives its global and local features, which are subsequently analyzed by a GCAS and an LCAS module independently to generate two sets of CAS results. The two results are fused together to synthesize the end CAS result. Benefiting from the global and local CAS independently performed by the above two modules, the proposed solution is able to preserve the integrity of blood vessels in terms of both their shape details and continuity in its end CAS output.
2.5.1 GCAS
The GCAS module first resizes the input 3D Cardiac ROI to a 256 × 256 × 128 matrix. This matrix is then fed into a Di-Vnet, trained through supervised learning and coupled with a clDice_loss29 function, to generate a 3D mask of dimension 256 × 256 × 128 that covers all coronary arteries detected within the 3D Cardiac ROI. This mask is then resized to the dimension of the input 3D Cardiac ROI, and the result is finally registered with the original CCTA to produce the end 3D GCAS mask for the whole case.
The clDice_loss, defined in Equation (4), seeks to preserve the continuity of GCAS results:

$$\mathrm{clDice\_loss} = 1 - \frac{2 \times T_{prec}(S_p, V_t) \times T_{sens}(S_t, V_p)}{T_{prec}(S_p, V_t) + T_{sens}(S_t, V_p)} \quad (4)$$

where $V_t$ and $V_p$ respectively represent the ground-truth mask and the mask generated by GCAS, whose skeletons are denoted as $S_t$ and $S_p$, respectively; $T_{prec}(S_p, V_t)$ and $T_{sens}(S_t, V_p)$ represent the proportions of $S_p$ inside $V_t$ and of $S_t$ inside $V_p$, also known as the Topology Precision and Topology Sensitivity measures, respectively.
2.5.2 LCAS
The LCAS module first applies a 3D sliding window to the input 3D Cardiac ROI, with a stride of 64 voxels in all three dimensions, to generate a series of 3D patches, each of dimension 128 × 128 × 128 voxels. Each patch is individually fed into the Di-Vnet, trained through supervised learning and coupled with a Suos_Dice_loss function, to generate a 3D mask for the coronary arteries lying within the 3D space represented by the patch. All resulting 3D masks are then assembled into a full-sized 3D mask of the same dimension as the input 3D Cardiac ROI. Finally, the full-sized 3D mask is registered with the original CCTA to produce the end 3D LCAS mask for the whole case.
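The tiling-and-reassembly logic can be sketched as follows; zero-padding the borders and merging overlaps with a voxel-wise maximum are our assumptions, as the paper does not state how patch edges and overlaps are handled:

```python
import numpy as np

def extract_patches(volume, patch=128, stride=64):
    """Tile a 3D Cardiac ROI into overlapping patch-sized blocks with
    the given stride along all three axes; borders are zero-padded so
    every axis admits a whole number of windows."""
    shape = [patch + int(np.ceil(max(s - patch, 0) / stride)) * stride
             for s in volume.shape]
    padded = np.zeros(shape, dtype=volume.dtype)
    padded[:volume.shape[0], :volume.shape[1], :volume.shape[2]] = volume
    patches, origins = [], []
    for z in range(0, shape[0] - patch + 1, stride):
        for y in range(0, shape[1] - patch + 1, stride):
            for x in range(0, shape[2] - patch + 1, stride):
                patches.append(padded[z:z + patch, y:y + patch, x:x + patch])
                origins.append((z, y, x))
    return patches, origins

def assemble_masks(mask_patches, origins, out_shape, patch=128):
    """Reassemble per-patch masks into a full-sized mask, merging
    overlapping regions with a voxel-wise maximum."""
    out = np.zeros([s + patch for s in out_shape], dtype=np.float32)
    for m, (z, y, x) in zip(mask_patches, origins):
        region = out[z:z + patch, y:y + patch, x:x + patch]
        np.maximum(region, m, out=region)
    return out[:out_shape[0], :out_shape[1], :out_shape[2]]
```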
The Suos_Dice_loss, defined in Equation (6), seeks to suppress over-segmentation while ensuring the smoothness of LCAS results:

$$\mathrm{Suos} = \frac{|y_{true} \cap y_{pred}|}{|(y_{true} \cup y_{pred}) \setminus (y_{true} \cap y_{pred})| + |y_{pred}|} \quad (5)$$

$$\mathrm{Suos\_Dice\_loss} = 1 - (\alpha \times \mathrm{Suos} + (1 - \alpha) \times \mathrm{Dice}) \quad (6)$$

where the parameter $\alpha$ was empirically tuned to 0.3 in all our experiments.
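A soft, differentiable sketch of this loss under our reading of the reconstructed Equation (5), in which the denominator combines the symmetric difference of the two masks with the prediction volume (so false positives are counted twice, penalizing over-segmentation); the smoothing constant is an assumption:

```python
import tensorflow as tf

def suos_dice_loss(y_true, y_pred, alpha=0.3, smooth=1e-6):
    """Suos_Dice loss per Equations (5)-(6); alpha = 0.3 as in the paper."""
    yt = tf.reshape(y_true, [-1])
    yp = tf.reshape(y_pred, [-1])
    inter = tf.reduce_sum(yt * yp)            # |intersection|
    total = tf.reduce_sum(yt) + tf.reduce_sum(yp)
    sym_diff = total - 2.0 * inter            # |union minus intersection|
    dice = (2.0 * inter + smooth) / (total + smooth)
    suos = (inter + smooth) / (sym_diff + tf.reduce_sum(yp) + smooth)
    return 1.0 - (alpha * suos + (1.0 - alpha) * dice)
```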
2.5.3 Fusion
Lastly, a fusion network is employed to synthesize the end CAS result for a given case. The network is built upon the backbone of the U-Net3D10 network. It consumes three types of inputs: the CAS results respectively generated by the GCAS and LCAS modules, as well as the skeletons of the GCAS results extracted via a thinning algorithm.30 It utilizes dice_loss as its loss term during training.
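Preparing the fusion network's inputs might look as follows; stacking the three volumes as channels is our assumption about how they are presented to the U-Net3D backbone:

```python
import numpy as np
from skimage.morphology import skeletonize  # Lee 3D thinning for volumes

def fusion_inputs(gcas_mask, lcas_mask):
    """Stack the GCAS mask, the LCAS mask, and the skeleton of the
    GCAS mask (extracted by 3D thinning) as fusion-network channels."""
    skeleton = skeletonize(gcas_mask.astype(bool)).astype(np.float32)
    return np.stack([gcas_mask.astype(np.float32),
                     lcas_mask.astype(np.float32),
                     skeleton], axis=-1)
```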
2.6 Training strategy and model
evaluation
2.6.1 Settings
The proposed solution was implemented in Python using the deep learning library Keras and run on an Ubuntu 16.04.4 LTS system equipped with NVIDIA GeForce RTX 3090 Ti GPUs. Key parameters of the solution were empirically optimized as follows: batch_size = 2, epochs = 90, initial learning rate = 0.001, monitor = val_loss, patience = 10, factor = 0.1, min_lr = 1e-8. A five-fold cross-validation strategy is carried out on our CCTA dataset to obtain an adequate and reliable assessment of the different methods. For each fold, the dataset was randomly divided into a training set (70%: 93 cases annotated with cardiac masks and 155 cases annotated with coronary artery masks), a validation set (15%: 20 cases annotated with cardiac masks and 30 cases annotated with coronary artery masks), and a test set (15%: 20 cases annotated with cardiac masks and 30 cases annotated with coronary artery masks). The average performance over all evaluation criteria is reported.
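The monitor/patience/factor/min_lr values map directly onto a Keras learning-rate scheduler; pairing them with ReduceLROnPlateau, as sketched below, is our assumption:

```python
from tensorflow.keras.callbacks import ReduceLROnPlateau

# Decay the learning rate by 10x when val_loss plateaus for 10 epochs,
# down to a floor of 1e-8, matching the stated hyperparameters.
lr_schedule = ReduceLROnPlateau(monitor="val_loss", factor=0.1,
                                patience=10, min_lr=1e-8)
# model.fit(x_train, y_train, batch_size=2, epochs=90,
#           validation_data=(x_val, y_val), callbacks=[lr_schedule])
```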
2.6.2 Evaluation metrics
In this study, four performance metrics are used to evaluate the quality of CS and CAS results. First, we employ three classical performance metrics to quantitatively evaluate the performance of our proposed solution: dice similarity coefficient (DSC), Recall, and Precision. The DSC score represents the overlap ratio between the actual vessels and those identified as vessels. Recall (also known as Sensitivity) is the proportion of actual vessel voxels correctly identified as vessels, while Precision is the proportion of voxels identified as vessels that are actually vessels. All three values range between 0 and 1; the higher the values, the better the segmentation quality. They are calculated as follows:

$$\mathrm{DSC} = \frac{2TP}{2TP + FN + FP} \quad (7)$$

$$\mathrm{Recall} = \frac{TP}{TP + FN} \quad (8)$$

$$\mathrm{Precision} = \frac{TP}{TP + FP} \quad (9)$$

where TP and FP denote true positives and false positives, that is, the number of vessel voxels correctly segmented and the number of background voxels incorrectly segmented as vessels, respectively; FN denotes false negatives, that is, the number of vessel voxels incorrectly marked as background.
Then, given the critical importance of blood vessel boundaries in CAS results, the Hausdorff distance (HD)31 metric is additionally employed in the quantitative evaluation. The HD is the maximum distance between the boundary of a reference object and that of the automatically segmented one, defined as:

$$d_H(X, Y) = \max\{d_{XY}, d_{YX}\} = \max\left\{\max_{x \in X}\min_{y \in Y} d(x, y),\ \max_{y \in Y}\min_{x \in X} d(x, y)\right\} \quad (10)$$

where X and Y denote the boundaries of the segmented object and the reference one, respectively. Acknowledging the high susceptibility of the maximum HD to noise and outliers, this study adopts the 95th percentile Hausdorff distance (95% HD) as its metric to evaluate the quality of CAS in terms of the boundaries of the segmented coronary arteries. The smaller the 95% HD, the better the quality of a CAS result.
Among these metrics, DSC mainly reflects the overlap between the estimated and ground-truth masks and is the most important evaluation metric for segmentation tasks. In our work, we rank performance primarily by the DSC result, followed by 95% HD, Recall, and Precision.
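The voxel-wise metrics follow directly from Equations (7)-(9); for the 95% HD we sketch the standard percentile recipe over nearest-boundary distances, since the paper does not spell out its implementation:

```python
import numpy as np
from scipy.spatial import cKDTree

def voxel_metrics(pred, truth):
    """DSC, Recall, and Precision from voxel counts (Equations 7-9)."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    tp = np.count_nonzero(pred & truth)
    fp = np.count_nonzero(pred & ~truth)
    fn = np.count_nonzero(~pred & truth)
    return (2 * tp / (2 * tp + fn + fp),  # DSC
            tp / (tp + fn),               # Recall
            tp / (tp + fp))               # Precision

def hd95(boundary_x, boundary_y):
    """95th-percentile symmetric Hausdorff distance between two (N, 3)
    boundary point sets, in voxel units unless scaled by spacing."""
    d_xy = cKDTree(boundary_y).query(boundary_x)[0]
    d_yx = cKDTree(boundary_x).query(boundary_y)[0]
    return max(np.percentile(d_xy, 95), np.percentile(d_yx, 95))
```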
2.7 Statistical analysis
The statistical analysis was performed using the SciPy package of Python 3.7 on a Linux server. The diagnostic performance of the DL algorithm was evaluated using DSC, Recall, Precision, and 95% HD through comparison with the ground truth, with results expressed as mean ± standard deviation. Additionally, to test the effectiveness of our method, a two-tailed t-test is conducted on the DSC scores, and the p-values between our method and the others are reported as well. Subgroup analyses of the proposed solution's performance were performed for DSC and 95% HD by gender, age, slice number, and coronary artery volume, and two-tailed t-tests were performed on both the DSC and 95% HD scores. Smaller p-values indicate more significant differences between the tested methods; p < 0.05 is considered statistically significant.
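Such a comparison can be sketched with SciPy as follows; the per-case DSC arrays below are illustrative placeholders, not the study's data:

```python
from scipy import stats

# Two-tailed independent-samples t-test on per-case DSC scores.
dsc_ours = [0.905, 0.898, 0.911, 0.902, 0.907]
dsc_baseline = [0.882, 0.874, 0.889, 0.879, 0.885]
t_stat, p_value = stats.ttest_ind(dsc_ours, dsc_baseline)
print(f"t = {t_stat:.3f}, p = {p_value:.4g}")  # p < 0.05 -> significant
```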
3 RESULTS
In this section, we perform comparative experiments to demonstrate the advantages of our proposed solution for CAS. The comparison methods include the classical segmentation methods U-Net3D10 and HighResNet3D,32 the state-of-the-art methods for vessel segmentation DenseVoxelNet33 and CS2-Net,28 and Atten-FCN3D,19 which is designed explicitly for CAS.
TABLE 2 Cardiac segmentation (CS) and coronary artery segmentation (CAS) performance of the proposed solution compared to state-of-the-art methods (mean ± standard deviation). The best results are in bold

CS performance
Method | DSC (%) | Recall (%) | Precision (%) | 95% HD (mm) | p-Value (DSC)
Ours | 94.14 ± 1.85 | 97.83 ± 0.53 | 93.65 ± 0.71 | 4.53 ± 0.63 | –

CAS performance
Method | DSC (%) | Recall (%) | Precision (%) | 95% HD (mm) | p-Value (DSC)
U-Net3D10 | 87.75 ± 1.42 | 95.49 ± 0.98 | 91.48 ± 0.28 | 2.66 ± 0.43 | 6.75 × 10^-6
HighResNet3D32 | 88.23 ± 1.23 | 95.65 ± 0.53 | 87.17 ± 0.52 | 2.32 ± 0.37 | 4.69 × 10^-5
Atten-FCN3D19 | 85.45 ± 1.97 | 94.31 ± 1.21 | 89.53 ± 0.74 | 2.97 ± 0.45 | 9.87 × 10^-7
DenseVoxelNet33 | 88.49 ± 1.21 | 95.28 ± 0.31 | 90.94 ± 0.44 | 2.25 ± 0.41 | 1.14 × 10^-5
CS2Net28 | 89.45 ± 1.02 | 96.69 ± 0.90 | 91.46 ± 0.51 | 2.18 ± 0.36 | 8.87 × 10^-3
Ours | 90.29 ± 0.67 | 97.02 ± 0.64 | 92.17 ± 0.30 | 2.11 ± 0.24 | –
3.1 Quantitative study
Table 2 quantitatively shows the performance of the CS and CAS results generated by the proposed solution in terms of DSC and 95% HD. For the CAS task, we compare the proposed solution with five comparison methods on our CCTA dataset, including two general segmentation networks and three networks designed explicitly for vessel segmentation. All evaluation criteria in Table 2 are obtained by averaging over the five folds.
3.1.1 CS analysis
The performance of the proposed solution in its first-stage CS task was tested on 20 cases randomly selected from our CCTA dataset, where each case is annotated with its ground-truth cardiac masks. As shown in Table 2, the solution attains 94.14% in DSC, 4.53 mm in 95% HD, 97.83% in Recall, and 93.65% in Precision. In this study, cardiac segmentation is the first stage, and its vital role is to support the downstream CAS task; the current accuracy is sufficient for the second-stage CAS. In future research, we will design a network with stronger discriminative ability to further improve the performance of heart segmentation.
3.1.2 CAS analysis
The performance of the proposed solution in its second-stage CAS task was tested on 30 cases randomly selected from the CCTA dataset, where each case is annotated with its ground-truth coronary artery masks. As shown in Table 2, the solution achieves the best segmentation results compared with the other methods, with a DSC of 90.29%, 95% HD of 2.11 mm, Recall of 97.02%, and Precision of 92.17%. Compared with the classical segmentation network U-Net3D,10 our method improves DSC by 2.54% and 95% HD by 0.55 mm, respectively. Meanwhile, Recall and Precision increase from 95.49%/91.48% to 97.02%/92.17%, respectively, demonstrating the proposed method's effectiveness. Compared to Atten-FCN3D,19 designed explicitly for CAS, our DSC and 95% HD improve from 85.45%/2.97 mm to 90.29%/2.11 mm, respectively. Moreover, against all selected methods, the proposed method attains p-values for the DSC metric below 0.05 (p ≤ 8.87 × 10^-3), which shows that the proposed method is significantly better than the other methods in segmentation performance. Overall, the DSC and 95% HD scores and the p-values consistently demonstrate that the proposed solution is superior to the comparison methods with a statistically significant advantage.
3.2 Qualitative study
In this section, we qualitatively analyze the effectiveness of our proposed method from four angles: first, the visual quality of the CS results; second, the visual quality of the CAS results; third, the robustness of CAS; and finally, the learning curves.
3.2.1 CS Analysis
We randomly selected the CS results of nine cases from the 20 testing cases for 3D reconstruction using the software package RadiAnt.34 Figure 3 shows these 3D reconstruction results, demonstrating that the CS results generated by the proposed solution satisfactorily ensure that: 1) the coronary arteries segmented for each case indeed reside inside their corresponding heart area, and 2) ribs and other soft tissues are successfully eliminated from the areas occupied by the segmented coronary arteries for each case. These two properties favorably support the downstream CAS task.
FIGURE 3 3D heart reconstruction based on cardiac segmentation (CS) results by the proposed solution. Results for nine cases randomly selected from 20 testing cases are shown
3.2.2 CAS Analysis
Figure 4 compares the 2D ground-truth CAS and the corresponding 2D projections of the 3D CAS generated by the proposed model for different slices of a case randomly selected from the testing set. Figure 5 compares the 3D ground-truth CAS and the 3D CAS generated by the proposed model for eight randomly selected testing cases. Both figures are created using the visualization package ITK-SNAP.35 As shown by the two figures, the proposed model is able to generate CAS results closely resembling the ground-truth masks. Figure 6 shows the CAS results of nine challenging cases selected from the 30 testing cases, visualized using the 3D reconstruction package RadiAnt. These results demonstrate that the proposed model is indeed capable of performing CAS even in complex situations such as images with poor contrast, plaques, and left or right coronary dominance.
The qualitative results of our method and the other competitors on typical cases are illustrated in Figure 7, which shows the boundaries segmented by all 3D segmentation methods on 2D CCTA slices. Our method produces the segmented boundaries most similar to the ground truths in most slices (Figure 7(f)). However, it still suffers from inaccurate CA boundaries (see the boundary of the ascending aorta in the second row of Figure 7(f)). There are two main reasons: First, the task of CAS remains highly challenging due to the large-scale variations exhibited by coronary arteries, their complicated anatomical structures and morphologies, as well as the low contrast between vessels and their background. Second, like most region-based dense pixel classification methods, consecutive pooling and convolution operations result in limited contextual information and inadequate discriminative feature maps, leading to unsatisfactory segmentation results.
3.2.3 CAS robustness analysis
Figure 8 shows two cases with cardiac over-segmentation (see A and C), which may be caused by non-cardiac tissues or blood vessels lying outside the heart region, as indicated by the yellow arrows. Nevertheless, the proposed model is still able to produce satisfactory CAS results (shown in B and D, respectively) despite the intermediary over-segmentation results. Such resilience in coping with over-segmented CS results, presumably acquired by the CAS module autonomously during its end-to-end training phase, further improves the model's overall accuracy and robustness.
3.2.4 Learning curve analysis
Figure 9 shows the loss curves of the proposed solution during training. The GCAS module converges after 70 epochs of training to a small training loss of 0.0397 and validation loss of 0.0457, as shown in (A). The LCAS module converges after 60 epochs of training to a small training loss of 0.0442 and validation loss of 0.0573, as shown in (B). No overfitting is found in the training of either module, indicating that the proposed Di-Vnet is easy to train.
4 DISCUSSION
4.1 Study of model performance from
different subgroups
Table 3 lists the DSC and 95% HD of the CAS results generated by the proposed solution for the full testing set of 30 cases, compared to the counterpart performance statistics for various subcohorts of the testing set assembled by age, gender, or the number of CT slices in a case. No statistically significant differences can be found among the performance statistics for these subcohorts, which demonstrates the robustness of the proposed solution. Table 3 also reports the DSC and 95% HD of the CAS results generated by the proposed solution for subgroups defined by coronary artery volume.
FIGURE 4 Comparing ground-truth and model output for different slices of a case. Images in the leftmost column present original slices of
a case, followed by the ground-truth coronary artery segmentation (CAS) (second column), its zoomed-in view (third column) and the CAS
generated by the proposed model (fourth column) and its zoomed-in view (fifth column)
TABLE 3 Model performance from different subgroups

Cohort | Number of cases | DSC (%) | 95% HD (mm) | p-Value (DSC) | p-Value (95% HD)
Full testing cohort | 30 | 90.29 ± 0.67 | 2.11 ± 0.24 | – | –
Age > 65 years | 17 | 90.51 ± 1.51 | 2.14 ± 0.32 | 0.173 | 0.422
Age ≤ 65 years | 13 | 90.00 ± 1.03 | 2.09 ± 0.51 | |
Female | 5 | 90.36 ± 1.53 | 2.02 ± 0.28 | 0.445 | 0.893
Male | 25 | 90.28 ± 1.33 | 2.13 ± 0.43 | |
Slice number > 290 | 11 | 90.45 ± 1.64 | 1.97 ± 0.37 | 0.428 | 0.721
Slice number ≤ 290 | 19 | 90.19 ± 1.04 | 2.19 ± 0.45 | |
Coronary artery volume > 80 | 14 | 90.98 ± 1.19 | 1.90 ± 0.26 | 0.047274 (p < 0.05) | 0.029425 (p < 0.05)
Coronary artery volume ≤ 80 | 16 | 89.87 ± 1.38 | 2.29 ± 0.56 | |
FIGURE 5 Comparing 3D ground-truth coronary artery segmentation (CAS) and 3D model output for eight cases randomly selected from 30 cases
Performance differs somewhat between the volume subgroups, but remains high in both. Following the method in ref. 20, the correlation between coronary artery volume and both DSC and 95% HD is shown in Figure 10.
4.2 Effectiveness of end-to-end cardiac
segmentation and coronary artery
segmentation
Since the proposed solution operates in two stages, CS and CAS, it is necessary to evaluate cases that jointly go through both the cardiac segmentation network and the coronary artery segmentation network to verify the end-to-end performance of the proposed solution. As described in Section 2.1, 338 cases were collected, among which 133 cases were annotated in terms of their underlying cardiac masks while the remaining 205 cases were annotated in terms of their underlying coronary artery masks. Therefore, no case has both a cardiac mask and a coronary artery mask.
FIGURE 6 Coronary artery reconstruction based on coronary artery segmentation (CAS) results by the proposed solution. Nine challenging cases are selected from 30 testing cases
To this end, we first randomly selected six cases from the 30 CAS testing cases, and a radiologist supplemented annotations of their underlying cardiac masks. Then, our proposed solution was run on the six cases to verify its end-to-end performance.
FIGURE 7 Boundaries of 2D visual segmentation results compared across different methods: (a) U-Net3D, (b) HighResNet3D, (c) Atten-FCN3D, (d) DenseVoxelNet, (e) CS2Net, and (f) Ours. Green lines represent the ground truth, and red lines represent the segmentation results of the various methods
FIGURE 8 Satisfactory coronary artery segmentation (CAS) results despite cardiac over-segmentation for two cases. The proposed model is able to generate satisfactory CAS results (b and d) despite its input of over-segmented cardiac regions (yellow arrows in a and c)
Figure 11 shows the CS and CAS results of two of the six cases, visualized using the 3D-slicer software. The first and second rows of Figure 11 compare the 2D ground truth and the corresponding 2D projections of the 3D results generated by the proposed solution for different slices. We can see that the proposed solution generates CS and CAS results closely resembling the ground-truth masks. The 3D surface rendering of the proposed solution's results is shown in the third row of Figure 11, from which we can see that the proposed solution preserves reasonable integrity of the blood vessels in terms of continuity and can be performed in an end-to-end manner. Therefore, the proposed solution has the potential to be used in a routine clinical workflow.
4.3 Generalization on the CAT08 dataset
Generalization ability is highly desirable and especially critical for a CAD system, playing a decisive role in the system's success in real applications. However, in clinical practice, medical images often exhibit different appearances for various reasons, such as different scanner vendors and image quality. These distribution discrepancies can lead deep networks to over-fit the training datasets and generalize poorly to unseen datasets. To study the generalization of the proposed network, we test the model trained on our CCTA dataset on the CAT08 dataset.
FIGURE 9 Loss curves of the global coronary artery segmentation (GCAS) and the local coronary artery segmentation (LCAS) modules during their respective training processes
FIGURE 10 Scatter plot showing the correlation between the coronary artery volume and both dice similarity coefficient (DSC) and 95% HD
The CAT08 dataset only has centerline annotations for the four major coronary arteries and no voxel annotations. Therefore, we test the performance of the proposed method by calculating the ratio of the centerline (CL) located within the predicted coronary artery (VP). The higher the ratio, the better the performance; the ratio is defined as:

$$\mathrm{Ratio} = \frac{|CL \cap VP|}{|CL|}$$
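This metric can be computed as follows once the CAT08 centerline points have been rasterized onto the CCTA voxel grid (the rasterization step is assumed here):

```python
import numpy as np

def centerline_ratio(centerline_mask, pred_mask):
    """Fraction of annotated centerline voxels lying inside the
    predicted coronary artery mask (the Ratio metric above)."""
    cl = centerline_mask.astype(bool)
    return np.count_nonzero(cl & pred_mask.astype(bool)) / cl.sum()
```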
As shown in Table 4, our method achieves the highest ratio when tested on the CAT08 dataset. Moreover, as illustrated in Figure 12, the centerline is primarily contained in the vessels segmented by our proposed model, and our model produces less over-segmentation (e.g., case 2 in Figure 12). It can be inferred that our proposed model has a stronger generalization ability for CAS.

TABLE 4 Quantitative comparison of our method with other methods on the CAT08 dataset; the best results are in bold

Method | Ratio
U-Net3D10 | 0.7658
HighResNet3D32 | 0.8270
DenseVoxelNet33 | 0.8669
CS2Net28 | 0.8783
Ours | 0.9197
4.4 Computational complexity
All models were trained using the open-source Keras and TensorFlow software packages. Since there are three Di-Vnet models and one U-Net model in the pipeline, the pipeline has a relatively high computational complexity during training. Our experiments took about 48 h to train the proposed solution on a server with 4 NVIDIA 2080Ti GPUs and 512 GB of memory. Applying the proposed solution for CAS consumes an average of only 0.112 s per image and 30 s per case; in comparison, manual CAS by an experienced doctor takes at least 10 min per case. Although time-consuming, training the model can be done offline. The fast online inference suggests that the proposed solution has the potential to be used in a routine clinical workflow.
FIGURE 11 Comparing ground truth and model output for two cases randomly selected from six cases. For each case, the left and right columns show the ground truth and the model output, respectively; images in the first and second rows present different slices of the 2D ground truth and the corresponding 2D projections of the 3D results generated by the proposed solution. The third row is the 3D surface rendering, with the visual transparency of the cardiac segmentation (CS) adjusted to 0.6 via the 3D-slicer software
FIGURE 12 The model trained on our coronary computed tomographic angiography (CCTA) dataset is validated on the CAT08 dataset. The green line represents the centerline annotation, which lies in the segmented coronary artery. For Case 1, it can be seen from the UNet-3D segmentation results that the centerline representing the left anterior descending (LAD) is not included in the coronary arteries; that is, there is under-segmentation in the LAD part. For Case 2, although all centerlines are included in the coronary arteries, there is more over-segmentation in UNet-3D, as indicated by the cyan arrows
4.5 Limitations and future works
This study proposes a novel end-to-end deep learning-based solution for automatic CAS, which is able to attain high accuracy and robustness according to quantitative and visual comparisons with ground-truth segmentation results. However, like most existing CNN-based methods, it may produce unsatisfactory segmentation masks without accurate coronary artery boundaries. This problem is caused by the limited context information and inadequate discriminative feature maps after consecutive pooling and convolution operations. Meanwhile, the task of CAS remains highly challenging due to the large-scale variations exhibited by coronary arteries, their complicated anatomical structures and morphologies, as well as the low contrast between vessels and their background. In the future, we will formulate a boundary-aware context neural network for CAS, built on an encoder-decoder architecture, to capture richer context and preserve adequate spatial information. Moreover, the proposed model can be improved to cope with highly challenging situations by exploiting next-generation small-sample learning36 and zero-shot learning37 techniques. Collecting, with the aid of our clinical partners, a dataset that covers a larger cohort of CAD patients with diverse conditions is another immediate future effort.
5 CONCLUSIONS
A novel end-to-end multistage deep learning-based
solution is proposed for automatic CS and CAS. The
solution is able to attain a high accuracy and robustness,
according to quantitative and visual comparisons with
ground-truth segmentation results. It can efficiently and
effectively help eliminate intra-observer variations, leading to more efficient and accurate diagnoses as well as better quality of care.
ACKNOWLEDGEMENT
This research is supported by the National Nat-
ural Science Foundation of China (grant number:
12026609), the Ministry of Science and Technology
of China (grant number: 2020AAA0106302), Shaanxi
University Joint Project (grant number: 2020GXLH-Z-
002), and Research Fund of the Second Affiliated
Hospital of Xi’an Jiaotong University (grant number:
YJ(QN)202017).
CONFLICT OF INTEREST
The authors declare that there is no conflict of interest
that could be perceived as prejudicing the impartiality of
the research reported.
REFERENCES
1. GBD 2013 Mortality and Causes of Death Collaborators. Global, regional, and national
age-sex specific all-cause and cause-specific mortality for 240
causes of death, 1990–2013: a systematic analysis for the
Global Burden of Disease Study 2013. Lancet. 2015;385:117-
171.
2. Hennemuth A, Boskamp T, Fritz D, et al. One-click coronary tree
segmentation in CT angiographic images. International Congress
Series. 2005;1281:317-321.
3. Wang C, Moreno R, Smedby Ö. Vessel segmentation using implicit model-guided level sets. MICCAI Workshop "3D Cardiovascular Imaging: A MICCAI Segmentation Challenge", Nice, France; October 1, 2012.
4. Schaap M, Neefjes L, Metz C, et al. Coronary lumen segmentation
using graph cuts and robust kernel regression. International Con-
ference on Information Processing in Medical Imaging.Springer.
2009;528-539.
5. Brieva J, Gonzalez E, Gonzalez F, et al. A level set method for
vessel segmentation in coronary angiography. Conf Proc IEEE
Eng Med Biol Soc. 2005;6:6348-6351.
6. Carneiro G, Zheng YF, Xing FY, Yang L. Review of deep learn-
ing methods in mammography, cardiovascular, and microscopy
image analysis. Deep Learning and Convolutional Neural Net-
works for Medical Image Computing. Springer; 2017;11-32.
7. Dai D, Dong C, Xu S, et al. Ms RED: a novel multi-scale resid-
ual encoding and decoding network for skin lesion segmentation.
Med Image Anal. 2022;75:102293.
8. Long J, Shelhamer E, Darrell T. Fully Convolutional Networks for
Semantic Segmentation. IEEE; 2015.
9. Ronneberger O, Fischer P, Brox T. U-Net: convolutional networks for biomedical image segmentation. International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer; 2015:234-241.
10. Çiçek Ö, Abdulkadir A, Lienkamp SS, et al. 3D U-Net: learn-
ing dense volumetric segmentation from sparse annotation.
International Conference on Medical Image Computing and
Computer-Assisted Intervention. Springer. 2016;424-432.
11. Milletari F, Navab N, Ahmadi SA. V-Net: fully convolutional neural networks for volumetric medical image segmentation. IEEE; 2016.
12. Li X, Chen H, Qi X, et al. H-DenseUNet: hybrid densely connected UNet for liver and tumor segmentation from CT volumes. IEEE Trans Med Imaging. 2018;37(12):2663-2674.
13. Wang G, Li W, Ourselin S, et al. Automatic brain tumor segmen-
tation using cascaded anisotropic convolutional neural networks.
International MICCAI brainlesion workshop. Springer. 2017;178-
190.
14. Yu Q, Xie L, Wang Y, et al. Recurrent saliency transformation network: incorporating multi-stage visual cues for small organ segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. IEEE. 2018;8280-8289.
15. Moeskops P, Wolterink JM, van der Velden BHM, et al. Deep
learning for multi-task medical image segmentation in multiple
modalities. International Conference on Medical Image Comput-
ing and Computer-Assisted Intervention.Springer. 2016;478-486.
16. Mirunalini P, Aravindan C, Nambi AT, et al. Segmentation of
coronary arteries from CTA axial slices using deep learning
techniques. TENCON 2019-2019 IEEE Region 10 Conference
(TENCON). IEEE. 2019;2074-2080.
17. Gu J, Fang Z, Gao Y, et al. Segmentation of coronary arter-
ies images using global feature embedded network with active
contour loss. Comput Med Imaging Graph. 2020;86:101799.
18. Huang W, Huang L, Lin Z, et al. Coronary artery segmenta-
tion by deep learning neural networks on computed tomographic
coronary angiographic images. 2018 40th Annual International
Conference of the IEEE Engineering in Medicine and Biology
Society (EMBC). IEEE. 2018;608-611.
19. Lei Y, Guo BJ, Fu YB, et al. Automated coronary artery
segmentation in coronary computed tomography angiography
(CCTA) using deep learning neural networks. Paper presented
at: SPIE Medical Imaging; March 2, 2020; Houston, Texas, USA.
20. Zabihollahy F, Viswanathan AN, Schmidt EJ, et al. Fully auto-
mated multiorgan segmentation of female pelvic magnetic res-
onance images with coarse-to-fine convolutional neural network.
Med Phys. 2021;48(11):7028-7042.
21. Morris ED, Ghanem AI, Dong M, et al. Cardiac substructure seg-
mentation with deep learning for improved cardiac sparing. Med
Phys. 2020;47(2):576-586.
22. Sun X, Garg P, Plein S, et al. SAUN: stack attention U-Net for
left ventricle segmentation from cardiac cine magnetic resonance
imaging. Med Phys. 2021;48(4):1750-1763.
23. Hu Y, Guo Y, Wang Y, et al. Automatic tumor segmentation
in breast ultrasound images using a dilated fully convolutional
network combined with an active contour model. Med Phys.
2019;46(1):215-228.
24. Giddwani B, Tekchandani H, Verma S. Deep dilated V-Net for 3D
volume segmentation of pancreas in ct images. 2020 7th Interna-
tional Conference on Signal Processing and Integrated Networks
(SPIN). IEEE; 2020:591-596.
25. Schaap M, Metz CT, van Walsum T, et al. Standardized eval-
uation methodology and reference database for evaluating
coronary artery centerline extraction algorithms.Med Image Anal.
2009;13(5):701-714.
26. Kikinis R, Pieper SD, Vosburgh KG. 3D Slicer: a platform
for subject-specific image analysis, visualization, and clinical
support. Intraoperative Imaging and Image-Guided Therapy.
Springer; 2014.
27. Yu F, Koltun V. Multi-scale context aggregation by dilated con-
volutions.International Conference on Learning Representations.
Puerto Rico; May 2-4, 2016.
28. Taha AA, Hanbury A. Metrics for evaluating 3D medical image segmentation: analysis, selection, and tool. BMC Med Imaging. 2015;15(1):29.
29. Shit S, Paetzold JC, Sekuboyina A, et al. clDice-a novel topology-
preserving loss function for tubular structure segmentation.
Proceedings of the IEEE/CVF Conference on Computer Vision
and Pattern Recognition. 2021:16560-16569.
30. Lee TC. Building skeleton models via 3-D medial surface/axis
thinning algorithms. Graphical Models and Image Processing.
1994;56(6):462-478.
31. Huttenlocher DP, Klanderman GA, Rucklidge WJ. Comparing
images using the Hausdorff distance. IEEE Transactions on
Pattern Analysis & Machine Intelligence. 1993;15:850-863.
32. Li W, Wang G, Fidon L, et al. On the compactness, efficiency,
and representation of 3D convolutional networks: brain parcel-
lation as a pretext task. International Conference on Information
Processing in Medical Imaging. Springer; 2017:348-360.
33. Yu L, Cheng JZ, Dou Q, et al. Automatic 3D cardiovascular
MR segmentation with densely-connected volumetric convnets.
International Conference on Medical Image Computing and
Computer-Assisted Intervention. Springer; 2017:287-295.
34. RadiAnt DICOM Viewer for Windows. Accessed July 19, 2020. https://www.radiantviewer.com/
35. ITK-SNAP: a software application for the visualization and segmentation of 3D medical images. Accessed December 6, 2019. http://www.itksnap.org/pmwiki/pmwiki.php
36. Liu DB, He ZN, Chen DD, et al. A network framework for small-
sample learning. IEEE Transactions on Neural Networks and
Learning Systems. 2019;31:4049-4062.
37. Bucher M, Vu TH, Cord M, et al. ZS3Net: zero-shot semantic segmentation. arXiv. Accessed November 18, 2019. https://arxiv.org/abs/1906.00817
How to cite this article: Dong C, Xu S, Li Z. A novel end-to-end deep learning solution for coronary artery segmentation from CCTA. Med Phys. 2022;1-15. https://doi.org/10.1002/mp.15842
... Moreover, software 2 showed the bias closest to zero for plaque area (mean difference = -0.90 mm 2 , SD = 4.40 mm 2 , 95% CI = -1.17 to [14][15][16]. Deep learning approaches are emerging as promising analysis tools for CCTA imaging, and DL-LATM was developed using deep learning approaches combined with numerical algorithms. To determine the potential of DL-LATM in aiding lumen or plaque in stenotic regions is essential. ...
Article
Full-text available
Automatic segmentation of the coronary artery using coronary computed tomography angiography (CCTA) images can facilitate several analyses related to coronary artery disease (CAD). Accurate segmentation of the lumen or plaque region is one of the most important factors. This study aimed to analyze the performance of the coronary artery segmentation of a software platform with a deep learning-based location-adaptive threshold method (DL-LATM) against commercially available software platforms using CCTA. The dataset from intravascular ultrasound (IVUS) of 26 vessel segments from 19 patients was used as the gold standard to evaluate the performance of each software platform. Statistical analyses (Pearson correlation coefficient [PCC], intraclass correlation coefficient [ICC], and Bland-Altman plot) were conducted for the lumen or plaque parameters by comparing the dataset of each software platform with IVUS. The software platform with DL-LATM showed the bias closest to zero for detecting lumen volume (mean difference = -9.1 mm³, 95% confidence interval [CI] = -18.6 to 0.4 mm³) or area (mean difference = -0.72 mm², 95% CI = -0.80 to -0.64 mm²) with the highest PCC and ICC. Moreover, lumen or plaque area in the stenotic region was analyzed. The software platform with DL-LATM showed the bias closest to zero for detecting lumen (mean difference = -0.07 mm², 95% CI = -0.16 to 0.02 mm²) or plaque area (mean difference = 1.70 mm², 95% CI = 1.37 to 2.03 mm²) in the stenotic region with significantly higher correlation coefficient than other commercially available software platforms (p < 0.001). The result shows that the software platform with DL-LATM has the potential to serve as an aiding system for CAD evaluation.
... The S-UNET architecture was chosen and evaluated on a dataset of 33 images. The model achieved a median Dice coefficient of 88%, indicating high accuracy in segmenting the cardiovascular system [17]. ...
Preprint
Full-text available
Cardiovascular disease is at present the most common disease in humans. According to the World Health Organization's 2022 reports, 70% of human deaths result from heart attack, and most Indian people suffering from heart disease are in the 30–60 year age group. X-ray coronary angiography imaging is a primary procedure for the diagnosis of heart disease. Manual segmentation of heart vessels by cardiologists is a tedious and time-consuming process, and its results vary with the experience and expertise of the medical professional. Segmentation of coronary vessel angiograms provides important information for the expert and for patients suffering from cardiovascular disease. Therefore, different types of computer-aided tools have been designed and developed for the automatic segmentation of coronary vessel angiography images, and automatic segmentation of coronary arteries can be improved by computer vision and artificial intelligence approaches. In this paper, an automatic segmentation method for coronary angiography images is designed and implemented using edge-based features and artificial intelligence approaches. For this purpose, dominant and prominent edges of the cardiovascular artery system are detected using traditional edge detection algorithms such as Sobel, Prewitt, Roberts, and Canny. The strong edges from the above-mentioned algorithms are then selected using an artificial intelligence (random forest) algorithm. Experimental results show that the proposed model attains an accuracy, positive predictive value, sensitivity, and Dice coefficient of 99%, 96%, 94%, and 95%, respectively.
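The pipeline described above, per-pixel edge responses fed to a random forest, can be sketched as follows. This is a hedged re-creation under assumed names and toy data, not the authors' code:

```python
# Sketch of the edge-feature + random-forest idea: classic edge
# detectors provide per-pixel features, and a random forest labels
# each pixel as vessel/background. All names are illustrative.
import numpy as np
from skimage import filters, feature
from sklearn.ensemble import RandomForestClassifier

def edge_features(img):
    """Stack per-pixel responses of several edge detectors."""
    return np.stack([
        filters.sobel(img),
        filters.prewitt(img),
        filters.roberts(img),
        feature.canny(img, sigma=2.0).astype(float),
    ], axis=-1)

rng = np.random.default_rng(0)
angio = rng.random((128, 128))                     # placeholder angiogram
mask = filters.sobel(angio) > filters.sobel(angio).mean()  # placeholder labels

X = edge_features(angio).reshape(-1, 4)
y = mask.ravel()
clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)
pred = clf.predict(X).reshape(angio.shape)         # binary vessel map
```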
... High dataset dimensionality is a crucial issue for ML approaches [33]. The algorithm's performance can be improved by weighting features, which reduces redundant data and prevents overfitting [33][34][35][36][37]. ...
Article
Full-text available
Coronary artery disease (CAD) is one of the major causes of fatalities across the globe. Recent developments in convolutional neural networks (CNN) allow researchers to detect CAD from computed tomography (CT) images. A CAD detection model assists physicians in identifying cardiac disease at earlier stages. Recent CAD detection models, however, demand a high computational cost and a large number of images. Therefore, this study develops a CNN-based CAD detection model. The researchers apply an image enhancement technique to improve CT image quality. The authors employ You Only Look Once (YOLO) v7 for extracting the features, and Aquila optimization is used for tuning the hyperparameters of the UNet++ model to predict CAD. The proposed feature extraction technique and hyperparameter tuning approach reduce the computational costs and improve the performance of the UNet++ model. Two datasets are utilized for evaluating the performance of the proposed CAD detection model. The experimental outcomes suggest that the proposed method achieves an accuracy, recall, precision, F1-score, Matthews correlation coefficient, and Kappa of 99.4, 98.5, 98.65, 98.6, 95.35, and 95, and of 99.5, 98.95, 98.95, 98.95, 96.35, and 96.25, for datasets 1 and 2, respectively. In addition, the proposed model outperforms recent techniques, obtaining areas under the receiver operating characteristic and precision-recall curves of 0.97 and 0.95, and 0.96 and 0.94, for datasets 1 and 2, respectively. Moreover, the proposed model obtains better confidence intervals and standard deviations of [98.64–98.72] and 0.0014, and [97.41–97.49] and 0.0019, for datasets 1 and 2, respectively. The study's findings suggest that the proposed model can support physicians in identifying CAD with limited resources.
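The metrics reported above are all standard classification scores; for reference, a minimal scikit-learn sketch with toy labels (not the study's data) is:

```python
# Standard classification metrics as quoted above, computed with
# scikit-learn; the label vectors are toy values for illustration.
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, matthews_corrcoef, cohen_kappa_score)

y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1, 1, 1]

print("accuracy :", accuracy_score(y_true, y_pred))
print("recall   :", recall_score(y_true, y_pred))
print("precision:", precision_score(y_true, y_pred))
print("F1-score :", f1_score(y_true, y_pred))
print("MCC      :", matthews_corrcoef(y_true, y_pred))
print("Kappa    :", cohen_kappa_score(y_true, y_pred))
```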
... Deep learning-based vessel segmentation methods (Chen et al., 2018a; Dong et al., 2021, 2022; Mou et al., 2021) have achieved better segmentation results than ever. Nevertheless, the trained deep learning models, specifically U-Net3D (Çiçek et al., 2016), are still vulnerable to local disturbances owing to the extremely complicated anatomical structure and the large range of vessel scales from root to terminal branches. ...
Article
Automatic segmentation of coronary arteries provides vital assistance to enable accurate and efficient diagnosis and evaluation of coronary artery disease (CAD). However, the task of coronary artery segmentation (CAS) remains highly challenging due to the large-scale variations exhibited by coronary arteries, their complicated anatomical structures and morphologies, as well as the low contrast between vessels and their background. To comprehensively tackle these challenges, we propose a novel multi-attention, multi-scale 3D deep network for CAS, which we call CAS-Net. Specifically, we first propose an attention-guided feature fusion (AGFF) module to efficiently fuse adjacent hierarchical features in the encoding and decoding stages and thus capture latent semantic information more effectively. Then, we propose a scale-aware feature enhancement (SAFE) module that dynamically adjusts the receptive fields to extract more expressive features, thereby enhancing the feature representation capability of the network. Furthermore, we employ a multi-scale feature aggregation (MSFA) module to learn a more distinctive semantic representation for refining the vessel maps. In addition, considering that limited training data annotated with a high-quality gold standard is also a significant factor restricting the development of CAS, we construct a new dataset containing 119 cases consisting of coronary computed tomographic angiography (CCTA) volumes and annotated coronary arteries. Extensive experiments on our self-collected dataset and three publicly available datasets demonstrate that the proposed method has good segmentation performance and generalization ability, outperforming multiple state-of-the-art algorithms on various metrics. Compared with U-Net3D, the proposed method significantly improves the Dice similarity coefficient (DSC) by at least 4% on each dataset, owing to the synergistic effect among the three core modules, AGFF, SAFE, and MSFA. Our implementation is released at https://github.com/Cassie-CV/CAS-Net.
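Since DSC is the headline metric here and throughout this page, a minimal NumPy sketch of its computation on binary 3D masks may be useful; the arrays below are toy data, illustrative only:

```python
# Dice similarity coefficient (DSC) for two binary 3-D masks.
import numpy as np

def dice(pred, gt, eps=1e-7):
    """DSC = 2|P ∩ G| / (|P| + |G|) for boolean arrays."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    return (2.0 * inter + eps) / (pred.sum() + gt.sum() + eps)

rng = np.random.default_rng(0)
pred = rng.random((32, 64, 64)) > 0.7   # toy prediction mask
gt = rng.random((32, 64, 64)) > 0.7     # toy ground-truth mask
print(f"DSC = {dice(pred, gt):.4f}")
```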
Article
Full-text available
Coronary artery segmentation is an essential procedure in the computer-aided diagnosis of coronary artery disease. It aims to identify and segment the regions of interest in the coronary circulation for further processing and diagnosis. Currently, automatic segmentation of coronary arteries is often unreliable because of their small size and the poor distribution of contrast medium, problems that lead to over-segmentation or omission. To improve the performance of convolutional-neural-network (CNN) based coronary artery segmentation, we propose a novel automatic method, DR-LCT-UNet, with two innovative components: the Dense Residual (DR) module and the Local Contextual Transformer (LCT) module. The DR module aims to preserve unobtrusive features through dense residual connections, while the LCT module is an improved Transformer that focuses on local contextual information, so that coronary artery-related information can be better exploited. The LCT and DR modules are effectively integrated into the skip connections and the encoder-decoder of the 3D segmentation network, respectively. Experiments on our CorArtTS2020 dataset show that the Dice similarity coefficient (DSC), Recall, and Precision of the proposed method reach 85.8%, 86.3%, and 85.8%, respectively, outperforming 3D-UNet (taken as the reference among the six other chosen comparison methods) by 2.1%, 1.9%, and 2.1%.
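The DR module is described above only at a high level; the following is a hedged PyTorch re-interpretation of a generic 3D dense-residual block (dense concatenation plus a residual shortcut), not the authors' implementation:

```python
# Generic 3-D dense-residual block: each conv layer sees the
# concatenation of all earlier feature maps (dense connections),
# and a residual shortcut wraps the whole block. Hedged sketch.
import torch
import torch.nn as nn

class DenseResidualBlock3D(nn.Module):
    def __init__(self, channels, growth=16, n_layers=3):
        super().__init__()
        self.layers = nn.ModuleList()
        in_ch = channels
        for _ in range(n_layers):
            self.layers.append(nn.Sequential(
                nn.Conv3d(in_ch, growth, kernel_size=3, padding=1),
                nn.InstanceNorm3d(growth),
                nn.ReLU(inplace=True)))
            in_ch += growth                          # dense concatenation
        self.fuse = nn.Conv3d(in_ch, channels, kernel_size=1)

    def forward(self, x):
        feats = [x]
        for layer in self.layers:
            feats.append(layer(torch.cat(feats, dim=1)))
        return x + self.fuse(torch.cat(feats, dim=1))  # residual shortcut

y = DenseResidualBlock3D(32)(torch.randn(1, 32, 16, 32, 32))
print(y.shape)  # torch.Size([1, 32, 16, 32, 32])
```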
Article
Full-text available
Purpose Brachytherapy combined with external beam radiotherapy (EBRT) is the standard treatment for cervical cancer and has been shown to improve overall survival rates compared to EBRT only. Magnetic resonance (MR) imaging is used for radiotherapy (RT) planning and image guidance due to its excellent soft tissue image contrast. Rapid and accurate segmentation of organs at risk (OAR) is a crucial step in MR image‐guided RT. In this paper, we propose a fully automated two‐step convolutional neural network (CNN) approach to delineate multiple OARs from T2‐weighted (T2W) MR images. Methods We employ a coarse‐to‐fine segmentation strategy. The coarse segmentation step first identifies the approximate boundary of each organ of interest and crops the MR volume around the centroid of organ‐specific region of interest (ROI). The cropped ROI volumes are then fed to organ‐specific fine segmentation networks to produce detailed segmentation of each organ. A three‐dimensional (3‐D) U‐Net is trained to perform the coarse segmentation. For the fine segmentation, a 3‐D Dense U‐Net is employed in which a modified 3‐D dense block is incorporated into the 3‐D U‐Net‐like network to acquire inter and intra‐slice features and improve information flow while reducing computational complexity. Two sets of T2W MR images (221 cases for MR1 and 62 for MR2) were taken with slightly different imaging parameters and used for our network training and test. The network was first trained on MR1 which was a larger sample set. The trained model was then transferred to the MR2 domain via a fine‐tuning approach. Active learning strategy was utilized for selecting the most valuable data from MR2 to be included in the adaptation via transfer learning. Results The proposed method was tested on 20 MR1 and 32 MR2 test sets. Mean ± SD dice similarity coefficients are 0.93 ± 0.04, 0.87 ± 0.03, and 0.80 ± 0.10 on MR1 and 0.94 ± 0.05, 0.88 ± 0.04, and 0.80 ± 0.05 on MR2 for bladder, rectum, and sigmoid, respectively. Hausdorff distances (95th percentile) are 4.18 ± 0.52, 2.54 ± 0.41, and 5.03 ± 1.31 mm on MR1 and 2.89 ± 0.33, 2.24 ± 0.40, and 3.28 ± 1.08 mm on MR2, respectively. The performance of our method is superior to other state‐of‐the‐art segmentation methods. Conclusions We proposed a two‐step CNN approach for fully automated segmentation of female pelvic MR bladder, rectum, and sigmoid from T2W MR volume. Our experimental results demonstrate that the developed method is accurate, fast, and reproducible, and outperforms alternative state‐of‐the‐art methods for OAR segmentation significantly (p < 0.05).
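The coarse-to-fine step above crops the MR volume around the centroid of each organ's coarse ROI before fine segmentation; a minimal sketch of that cropping logic, with assumed shapes and names, might look like this:

```python
# Crop a fixed-size ROI around the centroid of a coarse organ mask,
# clamped to the volume bounds. Shapes and names are assumptions.
import numpy as np
from scipy import ndimage

def crop_around_centroid(volume, coarse_mask, roi=(64, 96, 96)):
    """Crop `volume` to an ROI centred on the coarse mask's centroid."""
    centre = (int(round(c)) for c in ndimage.center_of_mass(coarse_mask))
    starts = []
    for c, size, dim in zip(centre, roi, volume.shape):
        starts.append(min(max(c - size // 2, 0), max(dim - size, 0)))
    z, y, x = starts
    return volume[z:z + roi[0], y:y + roi[1], x:x + roi[2]]

vol = np.random.rand(128, 256, 256)          # toy T2W volume
mask = np.zeros_like(vol)
mask[40:70, 100:160, 90:150] = 1             # toy coarse organ mask
print(crop_around_centroid(vol, mask).shape)  # (64, 96, 96)
```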
Article
Full-text available
Purpose Quantification of left ventricular (LV) volume, ejection fraction and myocardial mass from multi‐slice multi‐phase cine MRI requires accurate segmentation of the LV in many images. We propose a stack attention‐based convolutional neural network (CNN) approach for fully automatic segmentation from short‐axis cine MR images. Methods To extract the relevant spatiotemporal image features, we introduce two kinds of stack methods, spatial stack model and temporal stack model, combining the target image with its neighboring images as the input of a CNN. A stack attention mechanism is proposed to weigh neighboring image slices in order to extract the relevant features using the target image as a guide. Based on stack attention and standard U‐Net, a novel Stack Attention U‐Net (SAUN) is proposed and trained to perform the semantic segmentation task. A loss function combining cross‐entropy and Dice is used to train SAUN. The performance of the proposed method was evaluated on an internal and a public dataset using technical metrics including Dice, Hausdorff distance (HD), and mean contour distance (MCD), as well as clinical parameters, including left ventricular ejection fraction (LVEF) and myocardial mass (LVM). In addition, the results of SAUN were compared to previously presented CNN methods, including U‐Net and SegNet. Results The spatial stack attention model resulted in better segmentation results than the temporal stack model. On the internal dataset comprising of 167 post‐myocardial infarction patients and 57 healthy volunteers, our method achieved a mean Dice of 0.91, HD of 3.37 mm, and MCD of 1.08 mm. Evaluation on the publicly available ACDC dataset demonstrated good generalization performance, yielding a Dice of 0.92, HD of 9.4 mm, and MCD of 0.74 mm on end‐diastolic images, and a Dice of 0.89, HD of 7.1 mm and MCD of 1.03 mm on end‐systolic images. The Pearson correlation coefficient of LVEF and LVM between automatically and manually derived results were higher than 0.98 in both datasets. Conclusion We developed a CNN with a stack attention mechanism to automatically segment the LV chamber and myocardium from the multi‐slice short‐axis cine MRI. The experimental results demonstrate that the proposed approach exceeds existing state‐of‐the‐art segmentation methods and verify its potential clinical applicability.
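The loss combining cross-entropy and Dice used to train SAUN is a common pattern in segmentation; a hedged binary-segmentation sketch in PyTorch follows, with the mixing weight being an assumption:

```python
# Combined cross-entropy + soft-Dice loss for binary segmentation.
# The mixing weight `alpha` is an assumed hyperparameter.
import torch
import torch.nn.functional as F

def ce_dice_loss(logits, target, alpha=0.5, eps=1e-6):
    """alpha * BCE + (1 - alpha) * (1 - soft-Dice); `target` is a float mask."""
    bce = F.binary_cross_entropy_with_logits(logits, target)
    prob = torch.sigmoid(logits)
    inter = (prob * target).sum()
    dice = (2 * inter + eps) / (prob.sum() + target.sum() + eps)
    return alpha * bce + (1 - alpha) * (1 - dice)

logits = torch.randn(2, 1, 64, 64, requires_grad=True)
target = (torch.rand(2, 1, 64, 64) > 0.5).float()
loss = ce_dice_loss(logits, target)
loss.backward()
print(loss.item())
```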
Article
Full-text available
Purpose Radiation dose to cardiac substructures is related to radiation‐induced heart disease. However, substructures are not considered in radiation therapy planning (RTP) due to poor visualization on CT. Therefore, we developed a novel deep learning (DL) pipeline leveraging MRI’s soft tissue contrast coupled with CT for state‐of‐the‐art cardiac substructure segmentation requiring a single, non‐contrast CT input. Materials/methods Thirty‐two left‐sided whole‐breast cancer patients underwent cardiac T2 MRI and CT‐simulation. A rigid cardiac‐confined MR/CT registration enabled ground truth delineations of 12 substructures (chambers, great vessels (GVs), coronary arteries (CAs), etc.). Paired MRI/CT data (25 patients) were placed into separate image channels to train a three‐dimensional (3D) neural network using the entire 3D image. Deep supervision and a Dice‐weighted multi‐class loss function were applied. Results were assessed pre/post augmentation and post‐processing (3D conditional random field (CRF)). Results for 11 test CTs (seven unique patients) were compared to ground truth and a multi‐atlas method (MA) via Dice similarity coefficient (DSC), mean distance to agreement (MDA), and Wilcoxon signed‐ranks tests. Three physicians evaluated clinical acceptance via consensus scoring (5‐point scale). Results The model stabilized in ~19 h (200 epochs, training error <0.001). Augmentation and CRF increased DSC 5.0 ± 7.9% and 1.2 ± 2.5%, across substructures, respectively. DL provided accurate segmentations for chambers (DSC = 0.88 ± 0.03), GVs (DSC = 0.85 ± 0.03), and pulmonary veins (DSC = 0.77 ± 0.04). Combined DSC for CAs was 0.50 ± 0.14. MDA across substructures was <2.0 mm (GV MDA = 1.24 ± 0.31 mm). No substructures had statistical volume differences (P > 0.05) to ground truth. In four cases, DL yielded left main CA contours, whereas MA segmentation failed, and provided improved consensus scores in 44/60 comparisons to MA. DL provided clinically acceptable segmentations for all graded patients for 3/4 chambers. DL contour generation took ~14 s per patient. Conclusions These promising results suggest DL poses major efficiency and accuracy gains for cardiac substructure segmentation offering high potential for rapid implementation into RTP for improved cardiac sparing.
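The Dice-weighted multi-class loss mentioned above can be realized as per-class soft-Dice terms combined under class weights, which lets small structures such as the coronary arteries contribute more; the weights and shapes below are assumptions, not the authors' values:

```python
# Dice-weighted multi-class loss: per-class soft-Dice, combined with
# normalized class weights. Hedged sketch with toy shapes and weights.
import torch
import torch.nn.functional as F

def weighted_dice_loss(logits, target_onehot, class_weights, eps=1e-6):
    """logits: (B, C, D, H, W); target_onehot: same shape, values in {0, 1}."""
    prob = torch.softmax(logits, dim=1)
    dims = (0, 2, 3, 4)                          # sum over batch + voxels
    inter = (prob * target_onehot).sum(dims)
    denom = prob.sum(dims) + target_onehot.sum(dims)
    dice_per_class = (2 * inter + eps) / (denom + eps)
    w = class_weights / class_weights.sum()
    return 1 - (w * dice_per_class).sum()

C = 4
logits = torch.randn(1, C, 8, 32, 32, requires_grad=True)
labels = torch.randint(0, C, (1, 8, 32, 32))
onehot = F.one_hot(labels, C).permute(0, 4, 1, 2, 3).float()
w = torch.tensor([0.1, 0.2, 0.2, 0.5])          # heavier weight, small class
weighted_dice_loss(logits, onehot, w).backward()
```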
Article
Computer-Aided Diagnosis (CAD) for dermatological diseases offers one of the most notable showcases where deep learning technologies display their impressive performance in acquiring and surpassing human experts. In such the CAD process, a critical step is concerned with segmenting skin lesions from dermoscopic images. Despite remarkable successes attained by recent deep learning efforts, much improvement is still anticipated to tackle challenging cases, e.g., segmenting lesions that are irregularly shaped, bearing low contrast, or possessing blurry boundaries. To address such inadequacies, this study proposes a novel Multi-scale Residual Encoding and Decoding network (Ms RED) for skin lesion segmentation, which is able to accurately and reliably segment a variety of lesions with efficiency. Specifically, a multi-scale residual encoding fusion module (MsR-EFM) is employed in an encoder, and a multi-scale residual decoding fusion module (MsR-DFM) is applied in a decoder to fuse multi-scale features adaptively. In addition, to enhance the representation learning capability of the newly proposed pipeline, we propose a novel multi-resolution, multi-channel feature fusion module (M2F2), which replaces conventional convolutional layers in encoder and decoder networks. Furthermore, we introduce a novel pooling module (Soft-pool) to medical image segmentation for the first time, retaining more helpful information when down-sampling and getting better segmentation performance. To validate the effectiveness and advantages of the proposed network, we compare it with several state-of-the-art methods on ISIC 2016, 2017, 2018, and PH2. Experimental results consistently demonstrate that the proposed Ms RED attains significantly superior segmentation performance across five popularly used evaluation criteria. Last but not least, the new model utilizes much fewer model parameters than its peer approaches, leading to a greatly reduced number of labeled samples required for model training, which in turn produces a substantially faster converging training process than its peers. The source code is available at https://github.com/duweidai/Ms-RED.
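Soft-pool, introduced above as a pooling module, replaces max/average pooling with an exponentially weighted average inside each window; it can be expressed with two average-pooling calls, as in this hedged 2D sketch of the published idea:

```python
# Soft-pool: sum(exp(x) * x) / sum(exp(x)) over each pooling window,
# written via two avg_pool2d calls (the window size cancels out).
import torch
import torch.nn.functional as F

def soft_pool2d(x, kernel_size=2, stride=2):
    w = torch.exp(x)
    num = F.avg_pool2d(w * x, kernel_size, stride)
    den = F.avg_pool2d(w, kernel_size, stride)
    return num / (den + 1e-12)

x = torch.randn(1, 3, 8, 8)
print(soft_pool2d(x).shape)   # torch.Size([1, 3, 4, 4])
```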
Article
Coronary heart disease (CHD) is a serious disease that endangers human health and life. In recent years, the morbidity and mortality of CHD have been increasing significantly. Because of the particularity and complexity of medical images, it is challenging to segment coronary arteries accurately and efficiently. This paper proposes a novel global-feature-embedded network for better coronary artery segmentation in 3D coronary computed tomography angiography (CTA) data. The global feature combines multi-level layers from various stages of the network, containing both semantic information and detailed features, with the aim of accurately segmenting the target with precise boundaries. In addition, we integrate a group of improved, parameterized noisy activation functions into our network to eliminate the impact of noise in CTA data. We also improve the learning active contour model, which obtains a refined segmentation result with smooth boundaries based on the high-quality score map produced by the networks. The experimental results show that the proposed framework achieves state-of-the-art performance both qualitatively and quantitatively.
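One common realization of a global feature that "combines multi-level layers from various stages" is to upsample each stage's feature map to a shared resolution and concatenate channel-wise; the sketch below illustrates that general pattern under assumed shapes, not the paper's exact design:

```python
# Multi-level feature fusion: upsample every stage to one resolution
# and concatenate along channels to form a "global" feature. Sketch.
import torch
import torch.nn.functional as F

def fuse_multilevel(features, out_size):
    """Upsample every stage's 3-D feature map to `out_size`, then concat."""
    up = [F.interpolate(f, size=out_size, mode='trilinear',
                        align_corners=False) for f in features]
    return torch.cat(up, dim=1)   # channel-wise global feature

stages = [torch.randn(1, c, 32 // s, 32 // s, 32 // s)
          for c, s in [(16, 1), (32, 2), (64, 4)]]
print(fuse_multilevel(stages, (32, 32, 32)).shape)  # (1, 112, 32, 32, 32)
```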
Conference Paper
Pancreas segmentation is essential in the medical diagnosis of cancer, pancreatitis, and pancreatic surgeries. Computed tomography (CT) abdominal scans help in the detection of tumors, infections, and other injuries in the pancreas. The treatment of pancreatic tumors begins with surgery, followed by neoadjuvant therapy. However, it is challenging to detect the pancreas's boundaries in abdominal CT scans due to its variable shape, small anatomical structures, and low image contrast. Recently, deep learning models based on Convolutional Neural Networks (CNNs) have shown significant performance on medical imaging tasks. Since most of the data available for the evaluation of diseases consists of 3D CT scan volumes, learning from volumetric data is essential in biomedical applications, and V-Net achieves extraordinary performance on various medical datasets consisting of 3D scans. In this paper, we propose a multi-rate Deep-Dilation Network (DDN) in V-Net for the segmentation of the pancreas in the CT-82 abdominal dataset. To overcome the data imbalance between bright and dark pixels, we propose a Weighted Fusion Loss (WFL) combining a Balanced Binary Cross-Entropy (BBCE) loss and a Smooth Dice Coefficient (SDC) loss. The proposed model attains state-of-the-art performance for pancreas segmentation, achieving a Dice score, sensitivity, and precision of 83.31%, 87.70%, and 97.07%, respectively.
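The Weighted Fusion Loss pairs a balanced binary cross-entropy with a smooth Dice term; a hedged sketch follows, in which the balancing scheme (foreground weighted by the background fraction) and the fusion weight are assumptions rather than the paper's exact formulation:

```python
# Weighted fusion of balanced BCE (to counter bright/dark pixel
# imbalance) and a smooth Dice term. Balancing scheme is an assumption.
import torch

def weighted_fusion_loss(logits, target, lam=0.5, eps=1e-6):
    prob = torch.sigmoid(logits)
    beta = 1.0 - target.mean()                  # fraction of background
    bbce = -(beta * target * torch.log(prob + eps)
             + (1 - beta) * (1 - target) * torch.log(1 - prob + eps)).mean()
    inter = (prob * target).sum()
    sdc = (2 * inter + eps) / (prob.sum() + target.sum() + eps)
    return lam * bbce + (1 - lam) * (1 - sdc)

logits = torch.randn(1, 1, 16, 64, 64, requires_grad=True)
target = (torch.rand(1, 1, 16, 64, 64) > 0.9).float()  # sparse foreground
weighted_fusion_loss(logits, target).backward()
```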
Article
Small-sample learning involves training a neural network on a small-sample data set. An expansion of the training set is a common way to improve the performance of neural networks in small-sample learning tasks. However, improper constraints in expanding training data will reduce the performance of the neural networks. In this article, we present certain conditions for incorporation of additional training data. According to these conditions, we propose a neural network framework for self-training using self-generated data called small-sample learning network (SSLN). The SSLN consists of two parts: the expression learning network and the sample recall generative network, both of which are constructed based on restricted Boltzmann machine (RBM). We show that this SSLN can converge as well as the RBM. Moreover, the experiment results on MNIST Digit, SVHN, CIFAR10, and STL-10 data sets reveal the superiority of the SSLN over other models.
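Both parts of the SSLN are built on restricted Boltzmann machines. As a point of reference for that building block (not the SSLN itself), scikit-learn ships a Bernoulli RBM trained by stochastic maximum likelihood (persistent contrastive divergence); a toy usage sketch:

```python
# Minimal RBM usage as a building block: fit on binary data and read
# out the hidden representation P(h=1 | v). Toy data, illustrative only.
import numpy as np
from sklearn.neural_network import BernoulliRBM

rng = np.random.default_rng(0)
X = (rng.random((200, 64)) > 0.5).astype(float)   # toy binary "images"

rbm = BernoulliRBM(n_components=32, learning_rate=0.05,
                   n_iter=10, random_state=0).fit(X)
hidden = rbm.transform(X)          # learned latent representation
print(hidden.shape)                # (200, 32)
```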