A framework for Early Detection of Acute
Lymphoblastic Leukemia and its Subtypes from
Peripheral Blood Smear Images Using Deep
Ensemble Learning Technique
Sajida Perveen1, Abdullah Alourani* 2, Muhammad Shahbaz3, Usman Ashraf 4, Isma Hamid 5
1,5 Department of Computer Science, National Textile University, Faisalabad, Pakistan
2 Department of Management Information Systems and Production Management, College of Business and Economics, Qassim
University, P.O. Box 6640, Buraidah 51452, Saudi Arabia
3 Department of Computer Engineering, University of Engineering & Technology, Lahore, Pakistan.
4 Department of Computer Science, Government College Women University, Sialkot, Pakistan.
Corresponding author: Abdullah Alourani (ab.alourani@qu.edu.sa).
This work was funded by Scientific Research, Qassim University, KSA.
ABSTRACT Acute lymphoblastic leukemia (ALL), one of the prevalent types of carcinogenic disease, has proven to be a deadly illness, threatening the lives of numerous patients across the world. It affects both adults and children, and the chances of a cure are narrow if it is diagnosed at a later stage. A definitive diagnosis often demands highly invasive diagnostic procedures, proving time-consuming and expensive. Peripheral Blood Smear (PBS) images play a crucial role in the initial screening of ALL in suspected individuals. However, the nonspecific nature of ALL poses a substantial challenge in the analysis of these images, leaving room for misdiagnosis. Aiming to contribute to the early diagnosis of this life-threatening disease, we put forward an automated platform for screening the presence of ALL and its specific subtypes (benign, Early Pre-B, Pro-B and Pre-B) using PBS images. The proposed web-based platform follows a weighted ensemble learning technique using a Residual Convolutional Neural Network (ResNet-152) as the base learner to identify ALL from hematogone cases and then determine ALL subtypes. This is likely to save both diagnosis time and the efforts of clinicians and patients. Experimental results are reported, and a comparative analysis among 7 well-known CNN architectures (AlexNet, VGGNet, Inception, ResNet-18, ResNet-50, DenseNet-121 and ResNet-152) demonstrates that the proposed platform achieved comparatively high accuracy (99.95%), precision (99.92%), recall (99.92%), F1-score (99.90%), sensitivity (99.92%) and specificity (99.97%). The promising results demonstrate that the proposed platform has the potential to be used as a reliable tool for the early diagnosis of ALL and its subtypes. Furthermore, it provides a reference for pathologists and healthcare providers, aiding them in producing specific guidelines and making more informed choices about patient and disease management.
INDEX TERMS Acute Lymphoblastic Leukemia (ALL), Deep Ensemble Learning, Ensemble
convolutional neural networks, Lymphocytic leukemia, Peripheral Blood Smear Images, ResNet-152,
Weighted deep ensemble learning.
I. INTRODUCTION
Acute lymphoblastic leukemia (ALL), or lymphocytic
leukemia, is a potentially life-threatening disease [1]. It is
a malignancy of lymphoid blood cells characterized by
hyper-proliferation and immature growth of lymphocytes
(white blood cell/leukocyte) by the bone marrow in the
human body [2]. These immature and impaired
lymphocytes pose a substantial threat to the overall immune
system. This anomaly also inhibits the bone marrow’s
ability to produce platelets and red blood cells. In addition,
mutated erythrocytes in the bloodstream can lead to serious
health concerns for other vital body organs [3].
The prevalence of ALL is increasing dramatically each year
[64], making it one of the more common types of cancer
This article has been accepted for publication in IEEE Access. This is the author's version which has not been fully edited and
content may change prior to final publication. Citation information: DOI 10.1109/ACCESS.2024.3368031
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License. For more information, see https://creativecommons.org/licenses/by-nc-nd/4.0/
prevalent in adults and children; however, it has a fair
chance of being cured [4]. In 2017, the global incidence of ALL was estimated to have risen from 49.1 thousand to 64.2 thousand cases [5]. According to a recent report published
in 2022 by the International Agency for Research on
Cancer (IARC) of WHO, there have been 437,033 reported
cases of leukemia and 303,006 leukemia related deaths
worldwide [6]. It is also noteworthy that neglecting early
interventions or diagnosing ALL at a later stage might also
result in premature mortality [7].
Furthermore, preliminary and rapid diagnosis of ALL has
always been a challenge to haematologists, oncologists and
researchers [8]. Enlargement of the liver and spleen, uncertain
bleeding, pallor, fever, skeletal pain, common infections,
easy bruising and abrupt weight loss are the common
symptoms often associated with leukemia, but these
symptoms can also be indicative of other medical
conditions. Consequently, leukemia diagnosis is
challenging in its early stages due to the covert and mild
nature of these symptoms [9].
Diagnosing acute leukemia involves a diverse array of
diagnostic procedures and equipment. The conventional
method used for initial screening for ALL in a suspected
individual at the preliminary stage is the microscopic
evaluation of Peripheral Blood Smear (PBS) samples.
However, the gold standard methods for leukemia
diagnosis involve bone marrow aspiration, cytogenetic
analysis, immune phenotyping and lumbar puncture to
name a few [10]. Such invasive techniques have serious
complications, particularly in children. The preliminary
complications associated with these procedures are pain,
bleeding and bruising [9]. Moreover, the bone marrow test is expensive and imposes extra healthcare expenditure on patients who need multiple samples [11]. According to a report published in 2020 by the WHO, the equipment and resources required for such tests are still largely unavailable in developing countries, particularly in rural areas [12].
Furthermore, manual analyses that have been used for a
long time are often labor intensive and time-consuming in
general [13]. Manual analysis might also yield less accurate results or diagnostic errors due to the time-consuming nature of PBS analysis under the microscope, which again renders intervention measures ineffective and impractical. Moreover, hematologists must diagnose the presence of leukemia along with its specific subtypes (benign, Early Pre-B, Pro-B and Pre-B) to prevent healthcare complications and identify the optimal treatment of leukemia.
Over the last three decades, extensive research articles have
adopted machine learning (ML) and computer-aided
diagnostic approaches to analyze laboratory images. The
primary objectives of such analyses are to overcome the
challenges associated with late leukemia diagnoses and to
effectively determine its subtypes [14]. Numerous studies
have also analyzed blood smear images for differentiating,
diagnosing and counting the cells in various types of
leukemia [15, 16].
According to the existing literature, extensive studies
regarding ALL prediction from microscopic images have
been conducted. These studies employed various ML
algorithms for early diagnosis of ALL using microscopic
images [10, 17, 18]. However, when these techniques are
applied to imagery datasets, some obstructions exist,
including handling multi-dimensional data, handcrafted
feature engineering, and dimensionality reduction [19, 20].
Recently, to overcome these limitations, deep learning is
frequently used for diagnosing ALL, especially when
dealing with such data. Within deep learning, the convolutional neural network (CNN) is one of the most frequently used models, particularly for image-related healthcare datasets, as it has strong self-learning, adaptive and generalization capabilities.
Unlike classical machine learning techniques, a CNN alleviates the requirement of handcrafted feature extraction; it only requires the data as input, and its self-learning capability completes the intended task unaided [16]. Various other DL-based techniques have
already proven their immense worth across diverse
domains of automatic segmentation, recognition or
detection, such as skin lesions [21, 22], brain tumor [23]
breast cancer [24], diabetic retinopathy [25], COVID-19
pandemic [26], minor invasive surgery [27] and others [28].
Although numerous articles have already been published on diagnosing ALL using CNNs, there is still room for enhancing the performance and learning process and extending the model's generalizability, mainly when dealing with data scarcity, as depicted in numerous articles [29, 30, 31, 32, 33]. In this regard, ensemble learning is a widely adopted and effective machine learning technique that combines multiple models to enhance overall performance and accuracy [29, 30, 31, 32, 33].
Keeping the above-mentioned aspects in mind, the salient
features of our contributions are depicted below: (1) our
research pioneers the development of a fully automated
online platform to diagnose and classify ALL into its
subtypes with a deep ensemble learning based model using
blood smear images. Making such web platforms accessible
to the general population may also contribute significantly
to patients’ education, disease management and reduced
healthcare cost, as reportedly approximately 6 billion
individuals worldwide rely on the Internet to access
disease-related information before seeking medical care
[65]. (2) To the best of our knowledge, this is the first study to incorporate the ResNet-152 network architecture as the key component of a weighted ensemble learning method to classify acute lymphoblastic leukemia and its subtypes, with almost 99.95% accuracy. (3) The proposed research leverages transfer learning by incorporating pre-trained Convolutional Neural Network (CNN) architectures to reduce the need for expensive hardware and computational cost. We experimented with 7 well-known pre-trained CNN architectures (AlexNet, VGGNet, Inception, ResNet-18, ResNet-50, DenseNet-121 and ResNet-152) and performed a comparative analysis among them. This analysis demonstrated that the proposed method outperforms these networks in the early diagnosis of ALL and its subtypes, achieving comparatively high accuracy. (4) The proposed fully automated, real-time tele-diagnosis system does not necessitate extensively trained medical personnel for ALL diagnosis, particularly filling the gap when direct doctor contact is not possible or available, while reducing the cost and effort of prevention and treatment for those at an early stage.
These developments collectively improve the landscape of ALL diagnosis, fusing efficiency, accuracy, and accessibility.
II. LITERATURE REVIEW
Sampathila et al. [63] developed a custom ALLNET network architecture, which was trained and tested using a publicly available microscopic image dataset. The imagery data incorporated in this research belongs to the ALL Challenge dataset of ISBI 2019. Results demonstrated that ALLNET achieved an accuracy of 98%, indicating that the proposed model has the potential to diagnose ALL effectively. However, the proposed research does not support classifying leukemia cells into subtypes, which is crucial for comprehensive disease management.
Jha and Dutta [67] demonstrated a framework for leukemia detection based on a hybrid technique. In this research, blood smear images were obtained from the ALL-IDB2 database, and novel segmentation and optimization methods were applied to them. The segmentation phase utilizes a hybrid model based primarily on Mutual Information (MI), combining Active Contour and Fuzzy C-Means techniques. Subsequently, potentially significant features were extracted, comprising color histogram features and the Local Difference Pattern (LDP). These extracted features serve as input for the Chronological Self-Adaptive (SCA)-based CNN architecture, whose weights are optimized by the Chronological SCA algorithm to increase the accuracy of leukemia identification. Although this research proposed a holistic technique and detected leukemia from blood smear images, it requires domain knowledge to be utilized effectively for detecting ALL from single-cell blood smear images. Furthermore, this research does not address the identification of ALL subtypes.
In the existing literature, there is a lack of research investigating the significance of using various optimization algorithms for hyper-parameter tuning of deep learning models developed particularly to diagnose ALL. Atteia et al. [68] proposed a Bayesian-optimized Convolutional Neural Network (CNN) to detect ALL from microscopic smear images. The network architecture of the CNN and its hyper-parameters are tailored to the input data through Bayesian optimization, a technique that iteratively refines the hyper-parameter space to minimize an objective error function. A hybrid dataset was created by combining two publicly available datasets to train and test the proposed Bayesian-optimized CNN. Data augmentation was applied to this hybrid dataset, contributing to improved performance. Experimental results demonstrated that the Bayesian-optimized CNN model delivers superior performance in classifying ALL from the blood smear image test set, surpassing other optimized deep learning ALL classification models. While this research demonstrated the effectiveness of the Bayesian-optimized CNN model in elevating the accuracy of ALL detection from microscopic PBS images, it does not extend its assistance to categorizing specific ALL subtypes. It is widely accepted that optimal and effective treatment depends on the accurate detection of the disease type and how far the disease has spread in the body.
Khandekar et al. [69] proposed an automated technique to detect ALL blast cells in PBS images using YOLOv4. Two publicly available datasets (ALL-IDB1 and C_NMC_2019, consisting of a total of 10,769 single-celled, preprocessed images) were incorporated to train and evaluate the proposed method. The proposed method achieved a Mean Average Precision (MAP) of 98.7% on the C_NMC_2019 dataset and 96.06% on the ALL-IDB1 data. Experimental results demonstrated that by the end of 6,000 iterations the loss had decreased exponentially to approximately 0.57664. However, no interactive automated screening or diagnostic tele-health service was proposed in this research.
Ghaderzadeh et al. [34] proposed a deep learning-based method to identify ALL and its subtypes using PBS images. The dataset utilized in the study consists of 3256 PBS images from 89 individuals suspected of ALL. The method incorporates a low-cost leukemia cell segmentation approach and utilizes pairs of segmented and original images. The model consists of a DenseNet-201-based feature extraction block and a classification block. The features extracted from the segmented and original images were concatenated to train the DenseNet-201 model for classification into benign/malignant and the subtypes pre-B, early pre-B, and pro-B ALL. Although the proposed multi-step DL architecture enhances the comprehensive analysis of PBS images for ALL detection and subtype classification, it incurs high computational cost and requires domain knowledge. Furthermore, it does not provide any support for a tele-diagnosis facility.
In [70], a multi-step deep learning approach was proposed to automatically segment leukemia cells from bone marrow images. The objectives of this research were to identify Acute Myeloid Leukemia (AML) and to predict the most common mutation status in AML. The study involved the analysis of 1487 newly diagnosed individuals, of whom 236 were healthy. Feature extraction was performed manually by hematologists. An extensive dataset (comprising 5202 and 5428 augmented bone marrow smear images from AML and healthy individuals, respectively) was used to train the Faster Region-based Convolutional Neural Network (FRCNN) [71] model. Experimental results demonstrated that the proposed binary model achieved high accuracy in predicting NPM1 mutation status (86%) and 91% accuracy in classifying between AML and healthy bone marrow samples. Although the study achieved optimal results, the manual feature extraction method may lead to errors and inefficiency, particularly when dealing with large datasets.
The study [61] focuses on the application of Deep Learning (DL) techniques along with an ensemble method to predict ALL and identify its subtypes using PBS images. The C-NMC-2019 dataset was incorporated to build the deep ensemble model. An oversampling technique was used to tackle the class imbalance problem, resulting in a training set of 11644 images. Pre-trained networks, namely VGG-16, Xception, MobileNetV2, InceptionResNet-V2, and DenseNet-121, were adopted for transfer learning. The ensemble model exhibits comparatively high performance in identifying ALL. Experimental results demonstrated that the proposed method achieved an accuracy of 89.72% and an AUROC of 94.8%. The overall findings of this research indicate that ensemble learning, combining the capabilities of diverse networks, enhances the overall effectiveness of the model for identifying ALL in medical images.
Barrera et al. [85] proposed an effective approach using GAN1 and GAN2 modules for preserving structural morphological traits and aligning color staining with a reference center (RC) when combining images from various centers. The normalization process aimed to enhance the objectivity, accuracy, and speed of morphological analysis in peripheral blood cell images, contributing to more reliable diagnoses of hematological and non-hematological disorders.
III. METHODOLOGY
In this section, the workflow, image processing methods and network architecture developed to classify ALL and its subtypes are detailed. Once the dataset was obtained, augmentation was employed to increase its size and enhance its robustness. Subsequently, significant features were extracted from the imagery dataset. The network then categorizes the imagery test data as either ALL (blast cells) or hematogone (healthy cells).
A. DATASET
The PBS imagery data used in this research was obtained from the Kaggle repository. The same dataset was also used by Ghaderzadeh et al. [34] for ALL diagnosis and subtype classification. It consists of approximately 3256 images collected from 89 individuals suspected of ALL: 25 healthy individuals with a benign diagnosis (hematogone) and 64 patients diagnosed with various subtypes of ALL. This imagery data was prepared in the bone marrow laboratory of Taleqani Hospital (Tehran, Iran).
The slides of blood smear samples were prepared and stained by skilful laboratory staff. The dataset is primarily divided into two main categories: benign and malignant. The benign category contains hematogones, normal B-lymphocyte precursors, which naturally exist in the bone marrow of healthy individuals and closely resemble acute lymphoblastic leukemia cases. These benign hematopoietic precursor cells usually do not require chemotherapy and resolve on their own without any medical intervention. The malignant category is further divided into three subtypes of malignant lymphoblasts, namely Early pre-B, pre-B, and pro-B ALL. Randomly selected sample image frames from each category are illustrated in Figure 1.
FIGURE 1. Randomly selected images from each category of Acute Lymphoblastic Leukemia: (A) Benign, (B) Early Pre-B, (C) Pre-B, (D) Pro-B.
All the images obtained from these slides were captured through a microscope equipped with a Zeiss camera at a magnification of ×100 and saved in JPG format. Subsequently, the definitive classification of these PBS images into specific types and subtypes was made by an expert using a flow cytometry tool. Table I presents an abstract overview of the dataset used in the proposed study.
A. PREPROCESSING PHASE
Preprocessing is a commonly used approach in computer vision applications for preparing input data, and it directly influences the predictive performance of deep learning models [34, 35]. In this research, segmentation, decoding and resizing, normalization and augmentation were used to preprocess the microscopic imagery data obtained from peripheral blood smear images. A graphical demonstration of the preprocessing and feature extraction phases is depicted in Figure 2.

TABLE I
AN ABSTRACT OVERVIEW OF THE DATASET USED IN THE PROPOSED RESEARCH.

Categories/Subtypes           No. of Samples   No. of Individuals   Microscopic Image Size
Benign: Hematogones (Hem)     504              25                   1024 × 768
Malignant: Early Pre-B ALL    985              20                   1024 × 768
Malignant: Pre-B ALL          963              21                   1024 × 768
Malignant: Pro-B ALL          804              23                   1024 × 768
Total                         3256             89

FIGURE 2. Overview of the proposed feature extraction process.
B. SEGMENTATION
In microscopic peripheral blood smear images, an array of diverse colors can be observed, representing a range of cellular components [17]. However, our primary focus lies solely on the detection of lymphoblasts, which are immature lymphocytes, with the objective of effectively identifying ALL cases and their respective subtypes. To address this challenge, segmentation is widely used [17, 34, 76, 78]; it defines the boundaries of a lymphoblastic cell and thus separates the unnecessary components [75] (erythrocytes, platelets and plasma) from the main substance (lymphoblast cells) [33, 34, 77, 78, 79].
In the existing literature, various segmentation techniques have been employed to extract the Region of Interest (ROI), including the Otsu method followed by morphological operations [36], k-means clustering [37], and watershed transformation [38]. Furthermore, Parthasarathy and Chitra [80] proposed a segmentation method to effectively segregate the object under consideration from the rest of the image using a color thresholding technique. Experimental results showed that the selected threshold values can effectively extract the ROI. It also demonstrated applicability in scenarios where precise segmentation of the intended object is paramount, aligning with the objectives of our study in detecting lymphoblasts.
Nevertheless, to segment the ROI (lymphoblast cells) from microscopic PBS images, we incorporated a cost-effective technique relying on color interval separation and a simple threshold method. The method reduces the image to only two levels of intensity, thereby making it suitable for identifying and separating the lymphoblasts.
The microscopic imagery data is captured in RGB space,
encompassing three distinct channels: red, green, and blue.
Differentiating among colors in RGB space is a challenging task. Therefore, we transformed the images into the HSV color space (hue, saturation, and value), which provides improved color separation for further analysis.
Figure 3 represents the HSV color space distribution of the
randomly selected sample images shown in Figure 1.
FIGURE 3. (A) 3D HSV color space distribution of a PBS image selected from the benign (hematogones) category, (B) HSV color space distribution from the Early Pre-B subtype, (C) HSV color space distribution from the Pre-B subtype, (D) HSV color space distribution from the Pro-B subtype, as illustrated in Figure 1.
Subsequently, to obtain the two distinct lower and upper
thresholds for the purple color, representing the dominant hue
of blast cells, a 3D scatter plot is generated based on the image
pixels in the HSV color space. Binary masks were then created
using these lower and upper thresholds for each image,
capturing the region of interest (Lymphoblast) from the input
image. Therefore, by applying these masks to the input
images, the Lymphoblast cells were accurately segmented.
Finally, the segmented cells were converted back to the RGB
color space, resulting in a segmented image with Lymphoblast
cells displayed, as shown in Figure 4.
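To make the masking step concrete, the color-interval thresholding described above can be sketched in plain Python. The HSV bounds below are illustrative placeholders for the purple interval of stained blast cells, not the thresholds actually derived from the paper's 3D scatter plots, and a real pipeline would operate on arrays rather than nested lists.

```python
import colorsys

def segment_lymphoblasts(rgb_pixels, lower=(0.7, 0.2, 0.2), upper=(0.9, 1.0, 1.0)):
    """Build a binary mask keeping only pixels whose HSV values fall
    inside a purple interval (illustrative bounds, not the paper's).
    `rgb_pixels` is a 2-D grid (list of rows) of (r, g, b) tuples in 0..255."""
    mask = []
    for row in rgb_pixels:
        mask_row = []
        for (r, g, b) in row:
            h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
            inside = (lower[0] <= h <= upper[0] and
                      lower[1] <= s <= upper[1] and
                      lower[2] <= v <= upper[2])
            mask_row.append(1 if inside else 0)
        mask.append(mask_row)
    return mask

def apply_mask(rgb_pixels, mask, background=(0, 0, 0)):
    """Zero out everything outside the mask, leaving the segmented
    cells displayed in RGB, as in Figure 4."""
    return [[px if m else background for px, m in zip(row, mrow)]
            for row, mrow in zip(rgb_pixels, mask)]
```

For example, a purplish pixel such as (180, 60, 200) falls inside these bounds, while a red pixel such as (200, 30, 30) is masked out.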
C. DECODE AND RESIZE
Before feeding input images to a CNN, they are transformed
from a standard format into numeric arrays, capturing the pixel
magnitude values of the image. To ensure network
adaptability, reduce computational demands, and enhance
training efficiency, the input images are subsequently resized
[17, 34]. The chosen image size of 224x224 pixels strikes a
balance between model performance and computational
efficiency.
While higher dimensions could be used, maintaining
uniformity with existing research and widely used
architectures led us to standardize the input image dimensions
at 224x224 pixels. Subsequently, the proposed study
incorporated a pre-trained ResNet-152 model for feature
extraction.
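The resize step can be illustrated with a minimal nearest-neighbour sketch; an actual pipeline would use a library routine (e.g. an image library's resize with interpolation), so this only shows the idea of mapping every image to the fixed 224x224 input grid.

```python
def resize_nearest(pixels, out_w=224, out_h=224):
    """Resize a 2-D pixel grid (list of rows) to out_w x out_h using
    nearest-neighbour sampling -- a minimal stand-in for the library
    resize step that precedes feeding images to the CNN."""
    in_h, in_w = len(pixels), len(pixels[0])
    return [[pixels[(y * in_h) // out_h][(x * in_w) // out_w]
             for x in range(out_w)]
            for y in range(out_h)]
```

Applied to a 1024 × 768 input, this yields a 224 × 224 grid whose corners sample the corresponding corners of the original image.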
FIGURE 4. Original RGB input image samples from each subtype and
their corresponding HSV color space distribution diagram, binary
masked image and segmented image.
D. DATA NORMALIZATION
Normalization of the pixel intensity of an image is a
commonly adopted technique in image processing to enhance
model convergence at the training phase. It mainly ensures that
all the pixel values fall within a specified range thus providing
a standardized representation. It also helps optimization
algorithms to adjust the weights more effectively. To
normalize the data, the global mean and global standard
deviation (SD) were computed from all the pixel values across
the entire image dataset, representing the overall distribution
of pixel values. Subsequently, the data underwent
normalization using Equation (1):

Zi = (Xi − µ) / (σ + ε)                                      (1)

where Zi is the normalized value of pixel Xi, µ represents the global mean computed over the entire image set X, with i ∈ {1, 2, 3, ..., 3256}, σ represents the global standard deviation, and ε is a small constant added to avoid division by zero.
This normalization is applied to every pixel in each image in the dataset, centering pixel intensities on the global mean and scaling them by the global standard deviation, making them more suitable for training deep learning models.
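Equation (1) can be sketched in a few lines of Python; the example below treats each image as a grid of grayscale intensities for brevity, whereas the actual data has three channels.

```python
from statistics import fmean, pstdev

def global_zscore_normalize(images, eps=1e-8):
    """Normalize every pixel of every image using the global mean and
    global (population) standard deviation of the whole dataset,
    as in Equation (1): Z_i = (X_i - mu) / (sigma + eps)."""
    all_pixels = [p for img in images for row in img for p in row]
    mu = fmean(all_pixels)
    sigma = pstdev(all_pixels)
    return [[[(p - mu) / (sigma + eps) for p in row] for row in img]
            for img in images]
```

For a dataset containing only all-black and all-white images, the pixels map to approximately −1 and +1 respectively.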
E. DATA AUGMENTATION
In the context of image processing, data augmentation
involves introducing additional diversity in the training data
through various transformation techniques. While avoiding
distortion of the vital and meaningful information of the
images, these transformations enhance the model's resilience, enabling it to adapt to changes in real-world data and perform well on new, unseen data. Therefore, we incorporated various transformation techniques, including flipping the images vertically and horizontally, varying contrast, adjusting brightness and introducing JPEG noise. The standard color augmentation used in [73] is also incorporated in this research. Figure 2 presents random samples of original, unsegmented images across all the classes (Benign, Early Pre-B, Pre-B, and Pro-B ALL) considered in this research, followed by preprocessing.
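A simplified version of such an augmentation pipeline can be sketched as follows; the contrast and JPEG-noise steps are omitted here, the probability and brightness ranges are illustrative assumptions, and the images are treated as grayscale grids for brevity.

```python
import random

def hflip(img):
    """Horizontal flip: reverse each row of pixels."""
    return [row[::-1] for row in img]

def vflip(img):
    """Vertical flip: reverse the order of the rows."""
    return img[::-1]

def adjust_brightness(img, factor):
    """Scale pixel intensities, clipping to the 0..255 range."""
    return [[min(255, max(0, round(p * factor))) for p in row] for row in img]

def augment(img, rng=random):
    """Apply a random combination of the transformations above -- a
    simplified stand-in for the augmentation pipeline described in
    the text (flip probabilities and brightness range are assumed)."""
    if rng.random() < 0.5:
        img = hflip(img)
    if rng.random() < 0.5:
        img = vflip(img)
    return adjust_brightness(img, rng.uniform(0.8, 1.2))
```

Each call produces a slightly different variant of the input while keeping its dimensions and content intact.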
F. BALANCING IMBALANCED CLASSES
Class imbalance problems frequently exist in the medical
domain in particular, stemming from the intricacies of
obtaining manually annotated images. In medical imaging,
certain classes of interest, such as rare diseases or specific
abnormalities, may have significantly fewer instances
compared to more common cases [20, 49, 55]. If the model is
trained on the imbalanced data, it might become biased toward the majority class, leading to skewed model predictions and hindering the performance of the learning model.
Furthermore, the scarcity of data in the minority classes can
result in inadequate representation, making it difficult for the
model to learn and distinguish these less prevalent patterns
[55]. To address this issue effectively random oversampling is
widely used in existing literature [58, 83]. In random
This article has been accepted for publication in IEEE Access. This is the author's version which has not been fully edited and
content may change prior to final publication. Citation information: DOI 10.1109/ACCESS.2024.3368031
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License. For more information, see https://creativecommons.org/licenses/by-nc-nd/4.0/
oversampling, instances from the minority class are randomly replicated and added to the training data to balance the imbalanced class [58, 83]. It can be observed that our dataset also suffers from the class imbalance problem (see Table I); the number of Hematogones (Hem) samples is comparatively lower. Therefore, in this study we incorporated the random oversampling technique to build a generalized model exposed to a more equitable distribution of instances of each class.
FIGURE 5. An abstract overview of the ResNet-152 based architecture for feature extraction and classification.
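A minimal sketch of random oversampling, assuming a simple list-based dataset (the class labels and counts below are illustrative only):

```python
import random
from collections import Counter

def random_oversample(samples, labels, seed=42):
    """Balance classes by randomly duplicating minority-class
    samples until every class matches the majority count."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_s, out_l = list(samples), list(labels)
    for cls, n in counts.items():
        idx = [i for i, y in enumerate(labels) if y == cls]
        for _ in range(target - n):
            i = rng.choice(idx)           # replicate a random minority sample
            out_s.append(samples[i])
            out_l.append(labels[i])
    return out_s, out_l

# toy example: 'Hem' is the minority class, as in Table I
X = ['img%d' % i for i in range(10)]
y = ['ALL'] * 7 + ['Hem'] * 3
Xb, yb = random_oversample(X, y)
```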
G. CLASSIFIER
In this study, we examined seven of the most commonly used pre-trained network architectures: AlexNet [39], VGGNet [40], Inception [41], DenseNet-121 [42], ResNet-18 [43], ResNet-50 [44] and ResNet-152 [45]. Among the architectures considered, DenseNet-121 and ResNet-152 demonstrated superior performance; after careful evaluation, ResNet-152 was chosen as it exhibited the best performance. Its deeper architecture and skip connections enable it to capture intricate patterns and features in the data effectively [84]. Therefore, ResNet-152 was selected for the proposed method and subsequently fine-tuned on the ALL dataset. It is worth noting that there is a trade-off between performance and computational complexity, which was taken into consideration during the selection process.
H. RESNET-152
ResNet-152 is a cutting-edge deep neural network with 152 layers that enables the extraction of implicit and complex relationships from the input data. It is also well known for its multiple residual blocks, which learn residual representation functions. These functions enable the training of exceptionally deep neural networks while dealing with the vanishing gradient problem. Therefore, this model excels in accuracy and performance across various computer vision tasks, making it a popular choice for transfer learning [34]. Various pre-trained ResNet models, such as ResNet-18, ResNet-34, and ResNet-50, are available in open source through high-level APIs including Keras and fastai. ResNet-152 consists of blocks, and each block typically contains several layers. Furthermore, the layer configuration typically consists of these main building blocks: convolution, batch normalization (BN), activation function, pooling, fully connected, dropout and output block.
These blocks are stacked on top of each other to form the complete model. The first three blocks (convolution, BN and activation) are usually combined with an optional pooling layer to extract useful features from the input data, such as an image. Batch normalization and dropout are optional layers designed primarily to reduce training time and over-fitting. The pooling layer performs spatial pooling, reducing the spatial dimensions of the feature maps, and its output is subsequently flattened into a single vector that can be fed into the input of the next layer. The fully connected layer of the CNN estimates the optimal weights through the back-propagation process. The weights associated with each node are used to determine the corresponding class scores and labels for each input image. A classical network architecture may consist of repetitions of a stack of several convolution layers, as can be seen in Figure 5.
As discussed earlier, deep learning techniques have the
potential to scale and handle complicated problems and also
provide implicit capabilities to extract optimal features from
unstructured data [46, 47]. Therefore, deep learning
algorithms perform better when compared to classical
machine learning algorithms [48]. However, when it comes to
deep learning models, the training phase demands substantial
efforts to mitigate the risk of over-fitting. Tuning the optimal
hyper-parameters is also a significant challenge that
necessitates expertise and extensive experimentation.
Additionally, a single deep learning model might be inherently limited when dealing with highly variable and distinctive image datasets, especially if only a few samples are available.
In such cases, ensemble learning can be a valuable and
powerful tool for improving predictive accuracy and
generalization. Instead of relying on a single model, ensemble
learning constructs an ensemble of diverse models, each with
its strengths and weaknesses [49, 50]. By leveraging the diversity among these models, the ensemble can capture diverse patterns in the data and collectively make more reliable and robust predictions. Bagging, boosting, and
stacking are frequently used ensemble learning techniques,
each with its unique way of aggregating predictions. Ensemble
learning has proven to be highly effective in tackling complex
and challenging problems [48]. It is widely used in several
domains, including image classification, medical diagnosis
and natural language processing, where it has achieved
optimal results by leveraging the strengths of different models
and mitigating their weaknesses [50].
I. ENSEMBLE LEARNING
The key advantage of ensemble learning lies in its ability to enhance predictive accuracy and overall generalization by leveraging the strengths of diverse models while mitigating their individual weaknesses.
The literature discussed above on automatic ALL recognition from microscopic images demonstrates that various CNN-based deep learning techniques have been widely used recently (see details under Section 2 and in Table II). Although an extensive amount of literature has already been published, there is still room for performance improvement and for broader, more generalized models. Additionally, CNN-based methods struggle with data scarcity when trying to prevent model overfitting; however, an ensemble of different CNN network architectures alleviates these constraints [29, 30, 81, 82]. Although the existing literature indicates that various articles have used ensemble learning methods [49, 50], the weighted ensemble learning method has not been applied to a CNN-based ResNet-152 network architecture in the setting of acute lymphoblastic leukemia (ALL) and on this specific dataset.
In this study, we constructed a homogeneous ensemble model
using ResNet-152 as the base learner to categorize benign
(HEM, healthy cell) and malignant (ALL, blast cell) samples,
as well as to perform subtype detection. We adopted the
parallel ensemble building technique [51] where independent
data samples are fed into each base learner simultaneously,
thus exploiting the independence among base learners as can
be seen in Figure 6. Hence, as each base learner in the
ensemble model makes distinct errors, the ensemble model
can effectively average out these errors [52].
FIGURE 6. Weighted average voting ensemble learning method using
CNN based architecture.
J. FUSION METHOD
Output fusion is the process of aggregating the outputs of individual base learners into a unified and coherent output. According to recent literature, the most frequently used fusion method is average voting, despite the fact that it is biased towards weak learners and is not a suitable approach for integrating the outputs of base learners [53]. To overcome this problem, we incorporated the weighted average voting method as the fusion method. The weighted average approach gives each base model a certain weight for a given class, depending on how much the model contributes to that class. Rather than using traditional hyperparameter tuning, these weights were updated using a feedforward neural network.
In this study, the output layer of our weighted deep ensemble learning model consists of four nodes, one for each class (n = 4) considered relevant in this study. A convolutional neural network assigns probability values P_j ∈ ℝ, with P_j ∈ [0, 1], to a previously unseen test image, ∀ j ∈ {1, 2, ..., n}, such that Σ_{j=1}^{n} P′_j = 1 over all the classes. Given the number of classes n and the number of base classifiers K, with k ∈ {1, 2, 3}, let p(a_j^k) be the predicted accuracy of base classifier k for class j. We proceed to calculate the weight for each classifier k as denoted below:

W_k = (p(a_1^k) + p(a_2^k) + ... + p(a_n^k)) / n            (2)

Here W_k is the calculated weight value for each base classifier CNN_k, k ∈ {1, 2, 3}. The estimated weight value W_k is further utilized to calculate the weighted average, denoted as P′_j:

P′_j = Σ_{k=1}^{K} W_k · p_j^k / Σ_{k=1}^{K} W_k            (3)

In this method we used the area under the receiver operating characteristic curve (AROC) as the evaluation score, which was further used to calculate the average weight for each classifier. Under the AROC scheme, the probability value P′_j for each class j is derived as follows:

P′_j(AROC) = Σ_{k=1}^{K} W_AROC^k · p_j^k / Σ_{k=1}^{K} W_AROC^k            (4)
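The weighted average voting fusion described above can be sketched as follows (a NumPy illustration; the probability vectors and AROC-style weights are made-up values for three hypothetical base learners):

```python
import numpy as np

def weighted_average_vote(probs, scores):
    """Fuse base-learner outputs by weighted average voting.
    probs  : (K, n) array of softmax outputs from K base learners
             over n classes.
    scores : length-K evaluation scores per learner (per-class
             accuracies averaged as in Eq. 2, or AROC values as
             in Eq. 4).
    Returns the fused class probabilities P'_j (Eq. 3)."""
    w = np.asarray(scores, dtype=float)
    return (w[:, None] * np.asarray(probs)).sum(axis=0) / w.sum()

# three ResNet-152 base learners, four classes (illustrative values)
probs = np.array([[0.70, 0.10, 0.10, 0.10],
                  [0.60, 0.20, 0.10, 0.10],
                  [0.40, 0.40, 0.10, 0.10]])
weights = [0.99, 0.98, 0.90]          # e.g. AROC scores per learner
fused = weighted_average_vote(probs, weights)
pred = int(fused.argmax())
```

Because each learner's output is a probability distribution, the weighted average remains a valid distribution, and stronger learners pull the fused prediction toward their vote.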
K. EXPERIMENTAL SETUP
The proposed system is implemented in the Python programming language using various built-in libraries, including fast.ai, a simple yet advanced deep-learning library built on top of PyTorch that provides high-level abstractions to build and train the CNN-based pre-trained ResNet-152 network architecture. Implementation was done on an HP laptop with a SATA (Serial Advanced Technology Attachment) SSD drive, an Intel Core i7 10th Generation processor, and 16 GB of RAM. Training was carried out on Google Colaboratory using an NVIDIA Tesla P100 GPU.
L. TRAINING POLICY
In this study, our implementation and configuration of the ResNet-152 network architecture follow the practice in [73, 74]. The initial phase of model development involves establishing the architecture of the CNN. As mentioned earlier, a typical CNN architecture comprises a series of various types of blocks and layers, often tailored to the specific application and data characteristics. The foremost layer in this architecture is the input layer, which sets the dimensions of the input images for the network, including the height, width and number of color channels of the given image. To maintain uniformity with existing research [34, 73], we set the input image dimensions to 224x224x3 pixels.
Furthermore, each input was an image pair comprising both the original PBS sample and its corresponding segmented version; we therefore performed stacking, or concatenation, of both image types. This concatenation serves to retain a more comprehensive dataset and mitigate the potential loss of information that can occur during segmentation, since segmentation can sometimes result in the removal of pertinent details. This approach mainly aims to equip the model with a richer set of information, thus enhancing its capability to learn and generalize from the data [34]. Each image is resized and augmented as mentioned above, incorporating the standard color augmentation used in [73].
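One plausible reading of this stacking step is channel-wise concatenation of each image pair, sketched below (an assumption for illustration; the text does not specify the concatenation axis, and a six-channel input would require an adapted first layer):

```python
import numpy as np

def stack_pair(original, segmented):
    """Combine an original PBS image with its segmented version by
    concatenating along the channel axis (one plausible reading of
    the 'stacking or concatenation' step described above)."""
    assert original.shape == segmented.shape == (224, 224, 3)
    return np.concatenate([original, segmented], axis=-1)

orig = np.zeros((224, 224, 3), dtype=np.float32)   # placeholder PBS image
seg = np.ones((224, 224, 3), dtype=np.float32)     # placeholder segmentation
pair = stack_pair(orig, seg)
```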
FIGURE 7. Filters from layer 2 of ResNET-152 with one subplot per channel
The second layer, the convolutional layer, contains batch normalization and activation function layers along with small filters; Figure 7 represents the filters from ResNet-152 with one subplot per channel. Filters are typically followed by non-linear activation functions such as ReLU. Each filter is responsible for extracting specific patterns by sliding across the image, computing a dot product between the pixel values and its weights in the overlapping region [61]. Filters also allow the network to recognize patterns at different levels of abstraction. The convolution and pooling layers preserve the spatial hierarchy among feature maps while processing the input data. The batch normalization layer, however, overcomes internal dependencies among layers, allowing them to contribute more independently to the learning process by managing covariate shift. As the layers progress deeper into the network, they can detect increasingly complex and high-level features [17].
The third layer is an optional layer called the pooling layer, responsible for spatial pooling: it decreases the spatial dimensions of the feature maps, providing a form of down-sampling by focusing on the intended region of the input image. In this research we employed max pooling, as it is the most widely used type of pooling layer in CNNs, including the ResNet-152 network architecture. The ResNet-152 network consists of several residual blocks, and each block contains multiple convolution layers.
The intermediate outputs of each convolutional layer within a residual block are feature maps. Feature maps are crucial for capturing the distinct cellular patterns associated with different subtypes. In this study, we used 64 feature maps of size 8 x 8. The gradients of the loss with respect to the input image pixels are calculated and the output is converted to a grayscale image. The entire input data set is used during the feature extraction process; Figure 8(A) displays 64 grayscale feature maps extracted from an image of the Pro-B subtype depicted in Figure 1. These feature maps are derived from the second layer of the ResNet-152 model. A critical review of the feature maps depicted in Figure 8 shows that in layer 2 the network learned edges and textures. During feature map extraction it can also be observed that more abstract object detectors are learned at higher layers; this process gives visual insight into the activated feature maps at layer 3, block 5, as can be seen in Figure 8(B). The number of features at higher layers is so high that the images become non-interpretable. Further feature maps obtained from different higher levels are available in the supplementary file. These maps enable interpretable insights into the model's decision-making process, aiding clinicians in understanding and localizing abnormalities within blood cell images. Ultimately, feature maps contribute to enhanced diagnostics and more precise identification of leukemia subtypes.
On the other hand, the fully connected layer processes a 1-D feature vector. Therefore, the next layer is a flatten layer used to
transform the 2-D max-pooled matrix into a 1-D array; it computes the average value of each feature map, decreasing the spatial dimensions to 1x1, so that each element of this array can be used as an input to the fully connected layer. This layer is a simple fully connected feed-forward network whose input layer receives the output of the flatten layer and transforms the spatial features into a format appropriate for classification [29, 30]. It is nevertheless challenging to comprehend how these features are applied, and whether they are used to eliminate a class from consideration or to predict the class.
At the end we have an output layer, in which we added four output nodes (each node represents a particular ALL subtype considered relevant in this study). Furthermore, the softmax activation function is commonly used in the output layer to obtain class probabilities, particularly in multiclass problems.
For optimization we incorporated the Adaptive Moment (Adam) estimation optimizer, a frequently used dynamic second-moment optimizer [54] that enhances the performance of gradient descent. It handles rapidly decreasing learning rates by integrating the strengths of momentum and RMSprop (Root Mean Square Propagation), allowing faster convergence and effectively dealing with sparse data. During the training phase, it dynamically updates the learning rate of each individual parameter based on the recent magnitudes of the gradients. In this study, the initial learning rate, which controls the step size during convergence, was set to 0.01. The parameter values for β1 and β2 were set to 0.9 and 0.999, respectively, to achieve the desired behavior of the Adam optimizer. For numerical stability, the default value of ε (epsilon) was set to 1e-8.
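The Adam update with these hyper-parameters can be illustrated with a single hand-rolled step (a NumPy sketch of the standard Adam rule with the values reported above, not the actual fast.ai training loop):

```python
import numpy as np

def adam_step(w, g, m, v, t, lr=0.01, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update using the hyper-parameters described above
    (lr=0.01, beta1=0.9, beta2=0.999, epsilon=1e-8)."""
    m = b1 * m + (1 - b1) * g                  # first moment (momentum)
    v = b2 * v + (1 - b2) * g ** 2             # second moment (RMSprop-like)
    m_hat = m / (1 - b1 ** t)                  # bias correction
    v_hat = v / (1 - b2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v

w = np.array([1.0, -1.0])                      # toy parameters
m = np.zeros_like(w)
v = np.zeros_like(w)
g = np.array([0.5, -0.5])                      # toy gradient
w, m, v = adam_step(w, g, m, v, t=1)
```

On the first step the bias-corrected update is approximately lr times the sign of the gradient, so each parameter moves by about 0.01.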
The batch size is explicitly set to 8, implying that 8 images are processed in each iteration during the training of the model. Additionally, the number of epochs is set to 85, with early stopping criteria to minimize overfitting. Subsequently, categorical cross-entropy is used as the loss function to measure the discrepancy between the actual and predicted probability distributions of the model, while accuracy is chosen as the measure to evaluate the performance of the model. For C classes and a single data point, the cross-entropy loss is calculated using the following equation, where P_c represents the predicted probability for class c and Y_c represents the actual label for class c:

L(Y, P) = − Σ_{c=1}^{C} Y_c · log(P_c)            (5)
During the training process, if the validation accuracy
remains stagnant for ten consecutive epochs, the learning
rate will be reduced by 20%, continuing until no further
improvement is achieved. The training process also employs
the Cosine learning rate annealing scheduler. Additionally, if
the validation loss remains stable for 20 epochs, and the
learning rate reaches the minimum threshold value of 0.0001
without any further improvement, the training process is
halted. Throughout training, only the weights associated with the best performance on the validation set are saved. This ensures that the model captures the best state attained during training, leading to enhanced generalization on unseen data.
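The learning-rate reduction and early-stopping schedule described above can be sketched as a simple callback (a hedged illustration; the class name and epoch loop are placeholders, and the real implementation uses fast.ai callbacks together with cosine annealing):

```python
class TrainingPolicy:
    """Sketch of the schedule described above: if validation accuracy
    stalls for 10 epochs the learning rate drops by 20%; training
    halts once validation loss has been flat for 20 epochs and the
    learning rate has reached the 0.0001 floor."""
    def __init__(self, lr=0.01, min_lr=1e-4):
        self.lr, self.min_lr = lr, min_lr
        self.best_acc, self.acc_stall = 0.0, 0
        self.best_loss, self.loss_stall = float('inf'), 0

    def on_epoch_end(self, val_acc, val_loss):
        if val_acc > self.best_acc:
            self.best_acc, self.acc_stall = val_acc, 0
        else:
            self.acc_stall += 1
        if self.acc_stall >= 10:                       # accuracy stagnant
            self.lr = max(self.lr * 0.8, self.min_lr)  # reduce LR by 20%
            self.acc_stall = 0
        if val_loss < self.best_loss:
            self.best_loss, self.loss_stall = val_loss, 0
        else:
            self.loss_stall += 1
        # stop when loss is flat for 20 epochs at the minimum LR
        return self.loss_stall >= 20 and self.lr <= self.min_lr

policy = TrainingPolicy()
stop = False
for epoch in range(250):          # stagnant metrics eventually force a halt
    stop = policy.on_epoch_end(val_acc=0.99, val_loss=0.002)
    if stop:
        break
```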
FIGURE 8. (A) Visualization of the grayscale feature maps (B) Activated
colored feature maps extracted from an image of the Pro-b subtype, as
depicted in Figure 1.
M. EVALUATION METRICS
In this study, to comprehensively assess the capabilities of
the proposed model we incorporated clinically important
performance metrics such as precision, recall, specificity,
sensitivity, AROC, weighted accuracy and Matthews
Correlation Coefficient (MCC) [55]. We also incorporated the F-measure to evaluate the performance of our proposed model, as accuracy alone is not an adequate performance measure when dealing with imbalanced or asymmetrical datasets. Mathematically, these measures are described as follows:
Precision = Σ_{i=1}^{l} TP_i / Σ_{i=1}^{l} (TP_i + FP_i)            (6)

Recall = Σ_{i=1}^{l} TP_i / Σ_{i=1}^{l} (TP_i + FN_i)            (7)

F-Measure = 2 / (1/Precision + 1/Recall)            (8)

MCC = (TP·TN − FP·FN) / √((TP+FP)(TP+FN)(TN+FP)(TN+FN))            (9)
where TP denotes the number of true positives, TN the number of true negatives, FP the number of false positives, FN the number of false negatives, and l the number of classes. All the obtained results are expressed as percentages, as depicted in Table II.
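Given aggregate TP, TN, FP, and FN counts, Equations (6)-(9) can be computed directly (the counts below are illustrative only, not the paper's reported results):

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute precision, recall (sensitivity), F-measure,
    specificity, and MCC from aggregate confusion counts,
    following Eqs. (6)-(9)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)                      # equals sensitivity
    f_measure = 2 / (1 / precision + 1 / recall)
    specificity = tn / (tn + fp)
    mcc = (tp * tn - fp * fn) / (
        ((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)) ** 0.5)
    return precision, recall, f_measure, specificity, mcc

# illustrative counts for a binary benign-vs-malignant split
p, r, f, s, m = classification_metrics(tp=569, tn=211, fp=2, fn=1)
```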
IV. RESULTS AND DISCUSSION
Here, we introduce a weighted deep ensemble learning method for image classification and cell segmentation that provides a scalable and efficient way to identify ALL samples from PBS images. In this section, our initial focus is on presenting the evaluation results on the test data. To achieve a more reliable estimation of our model's performance, we incorporated K-fold cross-validation, which determines the degree to which the obtained results are independent of the training data. It is especially useful when dealing with a limited dataset, as across different iterations all the data points contribute to both the training and test data. In this study we used 10-fold cross-validation to evaluate the performance of our model. Table II presents the weighted average voting results obtained after 10-fold cross-validation on the oversampled dataset, comprising a total of 3940 samples; in this oversampled dataset, each class is represented by 985 sample pairs.
The results demonstrate that the deep ensemble learning model obtained optimal results on both the benign and malignant categories. The model achieved an F1-score of 99.85%, precision of 99.90%, and recall of 99.80% on the hematogones (Hem) class, which falls under the benign category. Furthermore, the high sensitivity (99.80%) and specificity (99.93%) values demonstrate the model's ability to correctly classify a significant portion of the actual benign Hem test images. Thus it not only minimizes the probability of overlooking positive instances but also demonstrates its competence in minimizing the false positive rate. An MCC of 0.703 further exhibits the robustness of the deep ensemble model in this category. The weighted accuracy, though slightly lower, is still impressive at 99.86%. Early Pro-B is one of the three subtypes within the malignant category, as mentioned earlier.
In the context of this subtype, the model demonstrated precision, recall, and F1-score values of 99.80%, 99.90%, and 99.86%, respectively. The model's recall, at 99.90%, indicates its strength in capturing an optimal true positive rate. This leads to a robust F1-score of 99.86%, reflecting the harmonic balance between recall and precision and confirming the deep ensemble model's ability to generalize in recognizing test samples from the Early Pro-B subtype. The proposed model achieved perfect performance on the Pre-B ALL and Pro-B ALL subtypes, reaching 100% in precision, recall, F1-score, sensitivity, and specificity; the MCC value of 1.0 indicates flawless predictions in these categories.
Furthermore, the proposed model achieved high overall performance when aggregating across all the classes included in this study: precision of 99.92%, recall of 99.92%, F1-score of 99.90%, sensitivity of 99.92%, and specificity of 99.97%. The balanced performance is also reflected in the MCC of 0.897 and weighted accuracy of 99.95%. High MCC values across all categories indicate robust model performance, considering both false positives and false negatives. A close observation of the results across all the classes confirms the supremacy of the weighted deep ensemble learning model over a single CNN architecture. Furthermore, it signifies the proposed model's ability to accurately diagnose and identify specific ALL subtypes. Hence, it is likely to save both diagnosis time and the efforts
TABLE II. PERFORMANCE EVALUATION OF THE PROPOSED DEEP ENSEMBLE LEARNING-BASED MODEL USING 10-FOLD CROSS VALIDATION METHOD ON BLOOD SMEAR IMAGES.

Category  | Class     | Recall | Sensitivity | Specificity | MCC   | Weighted Accuracy
Benign    | Hem       | 99.80  | 99.80       | 99.93       | 0.703 | 99.86
Malignant | Early     | 99.90  | 99.90       | 99.97       | 0.997 | 99.93
Malignant | Pre-b ALL | 100    | 100         | 100         | 1.00  | 100
Malignant | Pro-b ALL | 100    | 100         | 100         | 1.00  | 100
Total     |           | 99.92  | 99.92       | 99.97       | 0.897 | 99.95

Hem, Hematogones; ALL, Acute lymphoblastic leukemia; MCC, Matthews Correlation Coefficient.
of clinicians and patients. In this study we also incorporated the hold-out method to evaluate the performance of our weighted deep ensemble learning model; in this approach the blood smear image dataset is divided into training and test data at 80% and 20%, respectively. To provide a better view and
FIGURE 9. (A) The confusion matrix of the weighted deep ensemble learning
model evaluation on the test data with a total of 783 unseen test samples
(213-Benign (Hematogones), 206-Early Pre-b, 175-pre-b and 189-Pro-b
samples) with the resolutions of 224x224 pixels. (B) Corresponding
normalized confusion matrix.
deeper insight into the results obtained on the test data, we utilized a confusion matrix as the performance evaluation indicator. Each cell in this matrix shows the proportion of each subtype among the predicted images. The diagonal values correspond to correctly classified images, while the remaining entries represent misclassified instances in each subtype.
Figure 9 demonstrates that our fine-tuned model accurately identifies the majority of samples. It misclassifies only 2 images of the benign category (Hem) as Pre-b (false positives), a subtype of the malignant category, and 1 instance of the Pre-b subtype as benign (Hem), perhaps due to the substantial similarity between these two classes [34], while the true positive rate is 100% for the remaining two subtypes. The main diagonal of Figure 9 shows the number of correctly classified samples. It is also noticeable that the false positive rate is significantly low, minimizing the probability that benign instances are mistakenly classified as malignant or vice versa. The model's ability to reduce false positives is especially valuable in the context of patient care, as false positives can trigger unnecessary patient fears, stress, and medical procedures. By lowering the possibility of such false positives, the proposed model helps provide more accurate and trustworthy preliminary leukemia diagnoses. Hence, it signifies the proposed model's ability to enhance the accuracy of preliminary diagnoses and informed decision making. Experimental results demonstrated that the proposed method classifies well over 90% of the samples correctly, even when dealing with an unbalanced dataset.
TABLE III. SUMMARY OF DEEP LEARNING AND ML BASED METHODS USED FOR ALL CLASSIFICATION

Author(s), year | Features Type | Method | Test Data | Accuracy
Zhou et al. [56] | CNN features | RetinaNet, VGG, ResNext101, ResNext50, ResNet50 and the Feature Pyramid Network | 346 | 97%
Jha and Dutta [57] | Statistical and Local Directional Pattern (LDP) features | Chronological SCA-based Deep CNN | — | 98.7%
Almadhor et al. [58] | CNN features | KNN, Random Forest, SVM, and Naive Bayes | 4456 | 90%
Nizar Ahmed et al. [59] | CNN features | CNN, naive Bayes, support vector machine, KNN and decision tree | 245, 231 | 88.25% and 81.74%
Sanam Ansari et al. [60] | CNN features | CNN model with the Tversky loss function | 187 | 99.5%
Chayan Mondal et al. [61] | Texture, size and shape features | Ensemble model based on VGG-16, MobileNet, InceptionResNet-V2, and DenseNet-121 | 1867 | 94%
Maryam Bukhari et al. [62] | CNN features | CNN with squeeze and excitation learning | 22 | 97.06%
Niranjana Sampathila et al. [63] | CNN features | ALLNET | 2132 | 95.54%
Rezayi et al. [66] | CNN features | ResNet-50 and VGG-16 | 2506 | 84%
Proposed approach | CNN features | ResNet-152 based weighted ensemble (compared with VGGNet, Inception, DenseNet-121, AlexNet, ResNet-18, ResNet-50) | 783 | 99.97%
FIGURE 10. Weighted deep ensemble learning model‘s loss
function with respect to epochs number on training and validation sets.
Figure 11 represents the convergence trend of our model's accuracy during the training and testing phases, plotted against the number of epochs. In this research, to obtain optimal results we conducted experiments with different numbers of epochs. Although we initially increased the number of epochs to 100, this resulted in more running time without significant progress in accuracy. Therefore, we set the number of epochs to 85 to train the proposed model; however, it can be observed that the model converged to its saturation point after approximately 30 epochs, with training and validation accuracies of 99.97% and 99.86%, respectively. Figure 10 depicts the corresponding training and validation losses of 0.0016 and 0.0018, respectively.
FIGURE 11. Weighted ensemble learning model‘s accuracy with
respect to epochs number on training and validation sets.
The experimental results show that the proposed model consistently demonstrated high accuracy even with the learning rate set towards its lower bound and a limited number of epochs, implying that extending the number of epochs or increasing the learning rate could potentially yield further improvements. We also incorporated training and test accuracy, along with the loss function, as evaluation metrics to measure the performance of our proposed model. Accuracy demonstrates how well the model generalizes to new data; it is the ratio of correctly classified cases to the total number of cases, and can be categorized into training accuracy and test accuracy, obtained from the training and test data respectively. The loss function quantifies how well the model's output matches the actual output and is used to update the weights of each node in the neural network architecture. It is also used to calculate the training and validation losses, which verify that the trained model is converging and learning from the data.
FIGURE 12. Comparative analysis between weighted deep ensemble
learning model and the individual model (ResNet-152) in term of AROCs.
In this study, to further explore the performance of the proposed method, we also report a clinically important performance metric, the area under the receiver operating characteristic curve (AROC). In Figure 12, an AROC value of 0.999 illustrates that our weighted deep ensemble learning model outperforms the individual CNN-based model (ResNet-152). Notably, the AROC value for the individual ResNet-152 is 0.98, indicating relatively lower performance. This result demonstrates the effectiveness of our approach in both diagnosing and categorizing each class, and it emphasizes the potential for significant improvements from an ensemble approach that aggregates predictions from multiple models [48]. The experimental results also show that the ensemble model with ResNet-152 as the base learner yields enhanced performance, at the cost of longer training time and additional computational overhead.
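The aggregation step of a weighted ensemble can be sketched as a weighted average of each base learner's softmax outputs. The per-model class probabilities and the weights below are illustrative stand-ins, not values produced by the trained models:

```python
import numpy as np

# Hypothetical softmax outputs of two base learners for 3 samples x 4
# classes (benign, Early Pro-B, Pro-B, Pre-B) -- illustrative only.
preds_a = np.array([[0.7, 0.1, 0.1, 0.1],
                    [0.2, 0.5, 0.2, 0.1],
                    [0.1, 0.1, 0.7, 0.1]])
preds_b = np.array([[0.6, 0.2, 0.1, 0.1],
                    [0.1, 0.6, 0.2, 0.1],
                    [0.2, 0.1, 0.6, 0.1]])
weights = np.array([0.6, 0.4])   # assumed validation-derived weights

# Weighted average of probabilities, then argmax per sample
ensemble = weights[0] * preds_a + weights[1] * preds_b
labels = ensemble.argmax(axis=1)
print(labels)                    # [0 1 2]
```

In practice the weights would be chosen from validation performance, so stronger base learners contribute more to the final decision.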
To place our proposed model in a broader context, we comprehensively evaluated the performance of our proposed method by conducting a comparative analysis involving seven renowned network architectures: AlexNet [39], VGGNet [40],
This article has been accepted for publication in IEEE Access. This is the author's version which has not been fully edited and
content may change prior to final publication. Citation information: DOI 10.1109/ACCESS.2024.3368031
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License. For more information, see https://creativecommons.org/licenses/by-nc-nd/4.0/
Inception [41], DenseNet-121 [42], ResNet-18 [43], ResNet-50 [44], and ResNet-152 [45]. These architectures have been widely used in the existing literature for the classification of acute leukemia.
Subsequently, we analyzed the AROC values for each individual model to highlight the distinctions among them. The areas under the receiver operating characteristic curve (AROC) in Figure 13 also reveal the superiority of our weighted deep ensemble learning model, with an AROC value of 0.999, over all the individual CNN-based models (ResNet-152, VGGNet, Inception, DenseNet and AlexNet). This analysis contextualizes our results within the landscape of state-of-the-art classifiers for acute leukemia classification and illustrates the effectiveness of the proposed approach in both diagnosing and categorizing each class. By contrast, the AROC value for the individual ResNet-152 is 0.98, which is comparatively low, and the AROC values for ResNet-18 and Inception are similar. ResNet-50 and DenseNet-121 performed slightly better than VGGNet, with AROC values of 0.97, 0.96 and 0.95, respectively, as can be seen in Figure 13. Furthermore, the slightly higher AROC value of ResNet-50 (0.97) compared to DenseNet-121 (0.96) suggests that ResNet-50 excels at capturing the complex patterns present in the image dataset; its deeper architecture likely enables it to learn intricate patterns and characteristics that contribute to better performance. The lower AROC value of 0.93 attained by AlexNet provides an interesting perspective on its efficiency in this classification context; this outcome might be ascribed to various architectural factors inherent in AlexNet.
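For a binary split such as ALL versus hematogone cases, the AROC can be computed directly via the rank-sum formulation: it equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one. The scores below are toy values, and this sketch does not handle tied scores:

```python
import numpy as np

def auroc(y_true, scores):
    """Area under the ROC curve via the Mann-Whitney rank-sum identity."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores)
    order = np.argsort(scores)
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)   # 1-based ranks by score
    n_pos = int(y_true.sum())
    n_neg = len(y_true) - n_pos
    # Sum of positive-case ranks, corrected for the minimum possible sum
    return (ranks[y_true == 1].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

y = [0, 0, 1, 1]                 # 1 = ALL, 0 = hematogone (toy labels)
s = [0.10, 0.40, 0.35, 0.80]     # model scores for the positive class
print(auroc(y, s))               # 0.75
```

For the multiclass setting used in this study, the same statistic is typically averaged over one-vs-rest binary splits.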
The data pre-processing techniques used in this study also contribute to the proposed model's efficacy. These pre-processing procedures are essential for the model's ability to learn relevant features from the data; two crucial phases of the pipeline are segmentation and the extraction of relevant feature maps. Clinicians can visualize and understand which features or patterns in the PBS images contribute to the model's predictions, and this level of transparency is crucial for building trust in those predictions. According to the activated feature maps extracted from the data (provided in the supplementary file), the model is learning edges, colors, and textures. However, it is unclear how these features are applied, and whether they are used to eliminate a class from the list of possibilities or to predict the class directly. The observations also suggest that color is the primary feature used to discriminate between classes.
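The observation that early feature maps respond to edges can be illustrated with a hand-coded convolution: a Sobel-style kernel applied to a synthetic image with a single vertical boundary activates only at that boundary. This is a didactic sketch, not a layer taken from the trained network:

```python
import numpy as np

def conv2d(img, kernel):
    """Valid 2-D cross-correlation with a 3x3 kernel (no padding)."""
    h, w = img.shape
    out = np.zeros((h - 2, w - 2))
    for i in range(h - 2):
        for j in range(w - 2):
            out[i, j] = np.sum(img[i:i + 3, j:j + 3] * kernel)
    return out

# Sobel kernel for vertical edges (horizontal intensity gradient)
sobel_x = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]])

# Synthetic image: dark left half, bright right half -> one vertical edge
img = np.zeros((5, 5))
img[:, 3:] = 1.0
edge_map = conv2d(img, sobel_x)   # responds only near the boundary
```

A learned convolutional filter behaves analogously, except that its weights are fitted to the data rather than fixed by hand.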
Finally, a comparative analysis between the weighted deep ensemble learning model and recent studies related to ALL diagnosis is presented in Table 3. These recent studies consistently report classification accuracies exceeding 90%. The complexity of the networks employed in prior studies has necessitated the use of deep processing units; for example, the standard Inception network consists of over ten processing blocks, each comprising more than ten convolutional layers. In contrast, our model is composed of five convolutional layers in total. Overall, the experimental results obtained in this study demonstrate that the model's high precision, recall, F1-measure and accuracy can translate into improved patient outcomes.
FIGURE 13. A comparison among the AROCs of seven different pretrained CNN-based models and the proposed weighted ensemble learning model.
In this research, we have also included essential details that provide a more in-depth understanding of our proposed web-based platform, shown in Figure 14. The platform is developed using HTML, CSS, and JavaScript, with Flask on the backend. Flask allows for the creation of routes and effective management of HTTP requests. On the frontend, HTML templates are employed to render web pages, and JavaScript code enhances user-interface interaction. The platform runs on a Flask server.
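A minimal sketch of such a route structure is shown below, assuming Flask is installed. The route names, the plain-text response, and the placeholder prediction logic are illustrative and do not reproduce the platform's actual code:

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route("/")
def index():
    # The real platform would render an HTML template here.
    return "ALL screening platform"

@app.route("/predict", methods=["POST"])
def predict():
    # Placeholder: the deployed platform would run the weighted ensemble
    # on the uploaded PBS image and return the predicted ALL subtype.
    uploaded = request.files.get("image")
    label = "Pre-B" if uploaded is not None else "no image supplied"
    return jsonify({"prediction": label})

# app.run() would start the Flask development server.
```

In this pattern, the trained ensemble is loaded once at startup and shared across requests, so inference latency rather than model loading dominates each prediction call.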
We further emphasize that future plans involve integrating and hosting the web platform on either a private server or an online cloud server, depending on specific requirements. This approach ensures scalability and accessibility beyond local execution, enhancing the overall utility and reach of our web application. To access the web-based platform, users are required to complete a registration process by providing the necessary credentials, which are used for verification and identification within the platform. Upon successful registration, users can avail themselves of the services offered by the platform.
FIGURE 14. A screenshot of the web-based platform, primarily based on the weighted deep ensemble learning model.
The limitations of this study include the reliance on a single dataset obtained from the Kaggle repository [34], which might limit the generalizability of the learned model to diverse populations. Despite its size, the dataset may not fully capture the range of variability found in real-world situations, which could affect the model's effectiveness when applied to unseen data. Furthermore, since evaluating only one diagnostic modality at a time is insufficient for an accurate diagnosis, an integration of various diagnostic modalities, such as cytomorphology of both peripheral blood and bone marrow, flow cytometry, and genetic and clinical data, seems warranted to build ML models that may aid clinical decision making.
The proposed research focuses specifically on the identification of acute lymphoblastic leukemia (ALL) and the classification of its subtypes; the suggested methodology may therefore not be applicable to other medical fields or hematological disorders. Additionally, even though oversampling was used to address the class imbalance problem, the dataset's intrinsic qualities may restrict the applicability of the proposed methodology to other medical domains. Lastly, the robustness of the suggested ResNet-152 network architecture and ensemble learning approach is limited by the lack of external validation on other datasets. These drawbacks highlight areas that need further investigation and development in order to increase the model's applicability to a variety of clinical contexts.
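Random oversampling, as referenced above for the class-imbalance problem, can be sketched as follows; the function and variable names are illustrative, and the study's actual resampling procedure may differ:

```python
import numpy as np

def random_oversample(X, y, seed=0):
    """Duplicate minority-class samples until every class matches the majority count."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    n_max = counts.max()
    Xs, ys = [X], [y]
    for c, n in zip(classes, counts):
        if n < n_max:
            # Sample (with replacement) extra indices from class c
            idx = rng.choice(np.where(y == c)[0], size=n_max - n, replace=True)
            Xs.append(X[idx])
            ys.append(y[idx])
    return np.concatenate(Xs), np.concatenate(ys)

X = np.arange(10).reshape(5, 2)      # 5 toy samples, 2 features
y = np.array([0, 0, 0, 1, 1])        # imbalanced: 3 vs 2
Xr, yr = random_oversample(X, y)     # each class now has 3 samples
```

Because duplicated samples carry no new information, such resampling balances the loss contribution of each class but cannot substitute for genuinely more diverse data.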
CONCLUSION
ALL is a prevalent disease in both adults and children, and it often requires costly, time-consuming and invasive diagnostic tests. Peripheral blood smear (PBS) images play a vital role in the early screening of ALL. While PBS images provide a noninvasive means of early diagnosis of ALL in a suspected individual, manual analysis of such images is subject to inter-observer variability and human error. Therefore, in this research we developed a web-based platform using a weighted deep ensemble learning model to diagnose acute lymphoblastic leukemia and its subtypes (benign, Early Pro-B, Pro-B, and Pre-B). To develop the deep ensemble learning model, the CNN-based ResNet-152 architecture is incorporated as the base learner. Experimental results and a comparative analysis across seven renowned CNN architectures demonstrated that the proposed web-based platform has the potential to accurately classify PBS images into cancerous or healthy categories, thus helping oncologists tailor treatment plans to individual patient needs. Moreover, the consistently high recall scores indicate its capability in identifying true-positive cases, which could lead to timely initiation of personalized treatment and thereby reduce the risk of delayed or suboptimal interventions. Additionally, while the proposed method exhibits exceptional performance within the confines of the dataset, its real-world applicability warrants rigorous evaluation and validation across diverse patient populations. In future work, we intend to expand our experiments by incorporating a hybrid deep learning approach that combines convolutional neural networks with recurrent neural networks to further enhance performance. Furthermore, the proposed platform could be used to find other types of abnormalities in the blood.
ACKNOWLEDGMENT
The researchers would like to thank the Deanship of Scientific Research, Qassim University, for funding the publication of this project.
REFERENCES
[1] N. Jiwani, K. Gupta, G. Pau, M. Alibakhshikenari. “Pattern
Recognition of Acute Lymphoblastic Leukemia (ALL) Using
Computational Deep Learning”. IEEE Access. 2023 Mar
21,11:29541-53.
[2] N. Sampathila, K. Chadaga, N. Goswami, RP. Chadaga, M. Pandya,
S. Prabhu, MG. Bairy, SS. Katta, D. Bhat, SP. Upadya. “Customized
deep learning classifier for detection of acute lymphoblastic leukemia
using blood smear images”. InHealthcare 2022 Sep 20 (Vol. 10, No.
10, p. 1812). MDPI.
[3] KJ. Hiam-Galvez, BM. Allen, MH. Spitzer. “Systemic immunity in cancer”. Nature Reviews Cancer. 2021 Jun;21(6):345-59.
[4] M. Belson, B. Kingsley, A. Holmes. “Risk factors for acute leukemia
in children: a review”. Environmental health perspectives. 2007 Jan
11, 5(1):138-45.
[5] Y. Dong, O. Shi, Q. Zeng, X. Lu, W. Wang, Y. Li, Q. Wang.
“Leukemia incidence trends at the global, regional, and national level
between 1990 and 2017”. Experimental hematology & oncology. 2020
Dec, 9:1-1.
[6] D. Singh, J. Vignat, V. Lorenzoni, M. Eslahi, O. Ginsburg, B. Lauby-
Secretan, M. Arbyn, P. Basu, F. Bray, S. Vaccarella. “Global estimates
of incidence and mortality of cervical cancer in 2020: a baseline
analysis of the WHO Global Cervical Cancer Elimination Initiative”.
The Lancet Global Health. 2023 Feb 1, 11(2):e197-206.
[7] K. Stephens. “Every Month Delayed in Cancer Treatment Can Raise
Risk of Death by Around 10%”. AXIS Imaging News. 2020 Nov 6.
[8] B. Seruga, A. Sadikov, EL. Cazap, LB. Delgado, R. Digumarti, NB.
Leighl, MM. Meshref, H. Minami, E. Robinson, NH. Yamaguchi, D.
Pyle. “Barriers and challenges to global clinical cancer research”. The
Oncologist. 2014 Jan 1, 19(1):61-7.
[9] P. McGrath. “Beginning treatment for childhood acute lymphoblastic leukemia: Insights from the parents' perspective”. Oncology Nursing Forum. 2002 Jul 1;29(6):988-96.
[10] F. Kazemi, TA. Najafabadi, BN. Araabi. “Automatic recognition of
acute myelogenous leukemia in blood microscopic images using k-
means clustering and support vector machine”. Journal of medical
signals and sensors. 2016 Jul, 6(3):183.
[11] M. Ghaderzadeh, M. Aria, A. Hosseini, F. Asadi, D. Bashash, H.
Abolghasemi. “A fast and efficient CNN model for B‐ALL diagnosis
and its subtypes classification using peripheral blood smear images”.
International Journal of Intelligent Systems. 2022 Aug, 37(8):5113-
33.
[12] World Health Organization, 2020. Global cancer profile 2020 (2020)
https://tinyurl.com/3unsh9xa. [Accessed 10 February 2020]
[13] S. Gehlot, A. Gupta, R. Gupta. “SDCT-AuxNetθ: DCT augmented
stain deconvolutional CNN with auxiliary classifier for cancer
diagnosis”. Medical image analysis. 2020 Apr 1, 61:101661.
[14] S. Mohapatra , S.S. Samanta, D. Patra, S. Satpathi. “Fuzzy based blood
image segmentation for automated leukemia detection 2011
International conference on devices and Communications”, IEEE
(2011), pp. 1-5.
[15] C. Marzahl, M. Aubreville, J. Voigt, A. Maier. “Classification of
leukemic b-lymphoblast cells from blood smear microscopic images
with an attention-based deep learning method and advanced
augmentation techniques ISBI 2019 C-NMC challenge: classification
in cancer cell imaging”, Springer (2019), pp. 13-2
[16] S. Mishra, B. Majhi, P.K. Sa. “Texture feature based classification on
microscopic blood smear for acute lymphoblastic leukemia detection
Biomed Signal Process Control”, 47 (2019), pp. 303-311
[17] A. Mittal, S. Dhalla, S. Gupta, A.Gupta. “Automated analysis of blood
smear images for leukemia detection: a comprehensive review”. ACM
Computing Surveys (CSUR). 2022 Sep 10;54(11s):1-37.
[18] N. Patel, A. Mishra. “Automated leukaemia detection using
microscopic images Procedia Comput Sci”, 58 (2015), pp. 635-642.
[19] Lai, Yunfei. "A comparison of traditional machine learning and deep
learning in image recognition." In Journal of Physics: Conference
Series, vol. 1314, no. 1, p. 012148. IOP Publishing, 2019.
[20] S. Perveen, M. Shahbaz, K. Keshavjee, A. Guergachi. “Prognostic
modeling and prevention of diabetes using machine learning
technique. Scientific reports”. 2019 Sep 24, 9(1):13805.
[21] M.K. Hasan, L. Dahal, P.N. Samarakoon, F.I Tushar, R. Martí.
“DSNet: Automatic dermoscopic skin lesion segmentation Comput
Biol Med”, 120 (2020), Article 103738
[22] M.Hasan, S. Roy, C. Mondal, M. Alam, M. Elahi, E. Toufick, et al.
“Dermo-DOCTOR: A web application for detection and recognition
of the skin lesion using a deep convolutional neural network”, (2021)
[23] A. Işın, C. Direkoğlu, M. Şah. “Review of MRI-based brain tumor
image segmentation using deep learning methods Procedia Comput
Sci”, 102 (2016), pp. 317-324
[24] D.F. Steiner, R. MacDonald, Y. Liu, P. Truszkowsk, J.D Hipp, C.
Gammage, et al. “Impact of deep learning assistance on the
histopathologic review of lymph nodes for metastatic breast cancer
Am J Surg Pathol”, 42 (12) (2018), p. 1636
[25] M.K. Hasan, M.A Alam, M.T.E Elahi, S. Roy, R. Martí. “DRNet:
Segmentation and localization of optic disc and fovea from diabetic
retinopathy image Artif Intell Med”, 111 (2021), Article 102001
[26] Y. Oh, S. Park, J.C Ye. “Deep learning covid-19 features on cxr using
limited training data sets IEEE Trans Med Imaging”, 39 (8) (2020),
pp. 2688-2700.
[27] F. Chadebecq, LB. Lovat, D. Stoyanov. Artificial intelligence and
automation in endoscopy and surgery. Nature Reviews
Gastroenterology & Hepatology”. 2023 Mar, 20(3):171-82.
[28] M.S.H Sunny, A.N.R Ahmed, M.K. Hasan. “Design and simulation of
aximum power point tracking of photovoltaic system using ANN 2016
3rd International conference on electrical engineering and information
communication technology (2016)”, pp. 1-5,
10.1109/CEEICT.2016.7873105
[29] Y. Ding, Y. Yang, Y. Cui. “Deep learning for classifying of white
blood cancer ISBI 2019 C-NMC challenge: classification in cancer
cell imaging, Springer (2019)”, pp. 33-41
[30] T. Shi, L. Wu, C. Zhong, R. Wang, W. Zheng. “Ensemble
convolutional neural networks for cell classification in microscopic
images ISBI 2019 C-NMC challenge: classification in cancer cell
imaging”, Springer (2019), pp. 43-
[31] M.A Khan, J. Choo. “Classification of cancer microscopic images via
convolutional neural networks ISBI 2019 C-NMC Challenge:
Classification in Cancer Cell Imaging”, Springer (2019), pp. 141-147
[32] B. Harangi. “Skin lesion classification with ensembles of deep
convolutional neural networks J Biomed Inform”, 86 (2018), pp. 25-
32
[33] F. Xiao, R. Kuang, Z. Ou, B. Xiong. “DeepMEN: Multi-model
ensemble network for B-lymphoblast cell classification ISBI 2019 C-
NMC challenge: classification in cancer cell imaging”, Springer
(2019), pp. 83-93.
[34] M. Ghaderzadeh, M. Aria, A. Hosseini, F. Asadi, D. Bashash, H.
Abolghasemi. A fast and efficient CNN model for B‐ALL diagnosis
and its subtypes classification using peripheral blood smear images.
International Journal of Intelligent Systems. 2022 Aug; 37(8):5113-
33.
[35] KK. Pal, KS. Sudeep. “Preprocessing for image classification by
convolutional neural networks. In: 2016 IEEE International
Conference on Recent Trends in Electronics, Information &
Communication Technology (RTEICT)”, (pp. 1778-1781). IEEE;
2016.
[36] A. Makandar and B. Halalli, 2016. “Threshold based segmentation
technique for mass detection in mammography”. J Comput, 11(6),
pp.472-478.
[37] D. Goutam, & S. Sailaja, (2015). “Classification of acute myelogenous
leukemia in blood microscopic images using supervised classifier.
International Journal of Engineering Research & Technology
(IJERT)”, 4(1), 569-574.
[38] S. Jagadeesh, E. Nagabhooshanam & S. Venkatachalam, (2013).
“Image processing based approach to cancer cell prediction in blood
samples”. International Journal of Technology and Engineering
Sciences, 1(1), 110.
[39] A. Krizhevsky, I. Sutskever and G.E. Hinton, 2012. “ImageNet classification with deep convolutional neural networks”. Adv. Neural Inf. Process. Syst., pp. 1-9.
[40] K. Simonyan, A. Zisserman. “Very deep convolutional networks for
large-scale image recognition”. arXiv Prepr arXiv14091556. 2014.
[41] C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens. “Rethinking the
inception architecture for computer vision”, 2016.
[42] G. Huang, Z. Liu, L. van der Maaten and K.Q. Weinberger, 2017. “Densely Connected Convolutional Networks”. arXiv preprint arXiv:1608.06993.
[43] K. He, X. Zhang, S. Ren and J. Sun, 2016. “Deep residual learning for image recognition”. In IEEE Conference on Computer Vision & Pattern Recognition (pp. 770-778).
[44] Z. Liu, H. Mao, C.Y. Wu, C. Feichtenhofer, T. Darrell and S. Xie,
2022. “A convnet for the 2020s”. In Proceedings of the IEEE/CVF
conference on computer vision and pattern recognition (pp. 11976-
11986).
[45] K. He, X. Zhang, S. Ren and J. Sun, 2016. “Identity mappings in deep residual networks”. In Computer Vision - ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, October 11-14, 2016, Proceedings, Part IV (pp. 630-645). Springer International Publishing.
[46] A. Esteva, B. Kuprel, R.A. Novoa, J. Ko, S.M. Swetter, H.M. Blau,
et al. “dermatologist-level classification of skin cancer with deep
neural networks Nature”, 542 (7639) (2017), pp. 115-118.
[47] A. Kamilaris, F.X. Prenafeta-Boldú. “Deep learning in agriculture: A
survey Comput”. Electron. Agric., 147 (2018), pp. 70-90.
[48] C. Mondal, MK. Hasan, M. Ahmad, MA. Awal, MT Jawad, A. Dutta,
MR. Islam, MA. Moni. “Ensemble of convolutional neural networks
to diagnose acute lymphoblastic leukemia from microscopic images”.
Informatics in Medicine Unlocked. 2021 Jan 1, 27:100794.
[49] S. Perveen, M. Shahbaz, A. Guergachi, K. Keshavjee. “Performance
analysis of data mining classification techniques to predict diabetes”.
Procedia Computer Science. 2016 Jan 1, 82:115-21.
[50] PP. Shinde, S. Shah. “A review of machine learning and deep learning
applications”. In 2018 Fourth international conference on computing
communication control and automation (ICCUBEA) 2018 Aug 16 (pp.
1-6). IEEE.
[51] J. Tang, Q. Su, B. Su, S. Fong, W. Cao, X. Gong. “Parallel ensemble
learning of convolutional neural networks and local binary patterns for
face recognition Comput”. Methods Programs Biomed, 197 (2020), p.
105622.
[52] C. Valle, F. Saravia, H. Allende, R. Monge, C. Fernández. “Parallel
approach for ensemble learning with locally coupled neural networks
Neural Process”. Lett, 32 (3) (2010), pp. 277-291.
[53] A. Mohammed, R. Kora. “A comprehensive review on ensemble deep
learning: Opportunities and challenges”. Journal of King Saud
University-Computer and Information Sciences. 2023 Feb 1.
[54] SY. ŞEN, N. ÖZKURT. “Convolutional neural network
hyperparameter tuning with adam optimizer for ECG classification”.
In 2020 innovations in intelligent systems and applications conference
(ASYU) 2020 Oct 15 (pp. 1-6). IEEE.
[55] S. Perveen, M. Shahbaz, K. Keshavjee, A. Guergachi. “Metabolic
syndrome and development of diabetes mellitus: predictive modeling
based on machine learning techniques”. IEEE Access. 2018 Dec 21,
7:1365-75.
[56] M. Zhou, K. Wu, L. Yu, M. Xu, J. Yang, Q. Shen, B. Liu, L. Shi, S.
Wu, B. Dong, H. Wang. “Development and evaluation of a leukemia
diagnosis system using deep learning in real clinical scenarios”.
Frontiers in Pediatrics. 2021 Jun 24, 9:693676.
[57] KK. Jha, HS. Dutta. “Mutual information based hybrid model and
deep learning for acute lymphocytic leukemia detection in single cell
blood smear images”. Computer methods and programs in
biomedicine. 2019 Oct 1, 179:104987.
[58] A. Almadhor, U. Sattar, A. Al Hejaili, U. Ghulam Mohammad, U.
Tariq, H. Ben Chikha. “An efficient computer vision-based approach
for acute lymphoblastic leukemia prediction”. Frontiers in
Computational Neuroscience. 2022 Nov 24, 16:1083649.
[59] N. Ahmed, A. Yigit, Z. Isik, A. Alpkocak. “Identification of leukemia
subtypes from microscopic images using convolutional neural
network”. Diagnostics. 2019 Aug 25, 9(3):104.
[60] S. Ansari, AH. Navin, AB. Sangar, JV. Gharamaleki, S. Danishvar.
“A customized efficient deep learning model for the diagnosis of acute
leukemia cells based on lymphocyte and monocyte images”.
Electronics. 2023 Jan 8, 12(2):322.
[61] C. Mondal, MK. Hasan, MT. Jawad, A Dutta, MR Islam, MA Awal,
M. Ahmad. “Acute lymphoblastic leukemia detection from
microscopic images using weighted ensemble of convolutional neural
networks”. arXiv preprint arXiv:2105.03995. 2021 May 9.
[62] M. Bukhari, S. Yasmin, S. Sammad, A. El-Latif, A. Ahmed. “A deep
learning framework for leukemia cancer detection in microscopic
blood samples using squeeze and excitation learning”. Mathematical
Problems in Engineering. 2022 Jan 31, 2022.
[63] N. Sampathila, K. Chadaga, N. Goswami, RP. Chadaga, M. Pandya,
S. Prabhu, MG. Bairy, SS. Katta, D. Bhat, SP. Upadya. “Customized
deep learning classifier for detection of acute lymphoblastic leukemia
using blood smear images”. InHealthcare 2022 Sep 20 (Vol. 10, No.
10, p. 1812). MDPI.
[64] World Health Organization, 2020 B. Global cancer profile
2020(2020). https://tinyurl.com/3unsh9xa. [Accessed 10 February
2020]
[65] National Research Council, 2000. Networking health: prescriptions
for the internet.
[66] S. Rezayi, N. Mohammadzadeh, H. Bouraghi, S. Saeedi, A.
Mohammadpour. “Timely diagnosis of acute lymphoblastic leukemia
using artificial intelligence-oriented deep learning methods”.
Computational Intelligence and Neuroscience. 2021 Nov 11, 2021.
[67] K.K. Jha, & H.S. Dutta.” Mutual information based hybrid model and
deep learning for acute lymphocytic leukemia detection in single cell
blood smear images”. Computer methods and programs in
biomedicine. 2019. 179, 104987.
[68] G. Atteia, A.A. Alhussan, N. A. Samee. Bo-allcnn.”Bayesian-based
optimized cnn for acute lymphoblastic leukemia detection in
microscopic blood smear images”. Sensors. 2022 Jul 24;22(15):5520.
[69] R. Khandekar, P. Shastry, S. Jaishankar, O.Faust, N. Sampathila.
“Automated blast cell detection for Acute Lymphoblastic Leukemia
diagnosis”. Biomedical Signal Processing and Control. 2021 Jul
1;68:102690.
[70] Deep learning detects acute myeloid leukemia and predicts NPM1
mutation status from bone marrow smears. Leukemia. 2022
Jan;36(1):111-8.
[71] S. Ren, K. He, R. Girshick, J. Sun. “Faster R-CNN: towards real-time
object detection with region proposal networks”. IEEE Trans Pattern
Anal Mach Intell. 2017;39:1137-49.
[72] H. Miyoshi, K. Sato, Y. Kabeya, S. Yonezawa, H. Nakano, Y.
Takeuchi, I. Ozawa, S. Higo, E. Yanagida, K. Yamada, K. Kohno.
“Deep learning shows the capability of high-level computer-aided
diagnosis in malignant lymphoma”. Laboratory Investigation. 2020
Oct;100(10):1300-10.
[73] A. Krizhevsky, I. Sutskever, G. E. Hinton. “Imagenet classification
with deep convolutional neural networks”. Advances in neural
information processing systems. 2012;25.
[74] K. He, X. Zhang, S. Ren, J. Sun. “Deep residual learning for image
recognition”. In Proceedings of the IEEE conference on computer
vision and pattern recognition 2016 (pp. 770-778).
[75] H. D. Cheng, X. H. Jiang, Y. Sun, J. Wang. “Color image
segmentation: advances and prospects”. Pattern recognition. 2001 Dec
1;34(12):2259-81.
[76] H. D. Cheng, J. Shan, W. Ju, Y. Guo, L. Zhang.” Automated breast
cancer detection and classification using ultrasound images: A
survey”. Pattern recognition. 2010 Jan 1;43(1):299-317.
[77] S. Sharma, S. Gupta, D. Gupta, S. Juneja, P. Gupta, G. Dhiman, S.
Kautish. “Deep learning model for the automatic classification of
white blood cells”. Computational Intelligence and Neuroscience.
2022 Jan 12;2022.
[78] S. Mohapatra, D. Patra. “Automated cell nucleus segmentation and
acute leukemia detection in blood microscopic images”. In2010
International Conference on Systems in Medicine and Biology 2010
Dec 16 (pp. 49-54). IEEE.
[79] R. Baig, A. Rehman, A. Almuhaimeed, A. Alzahrani, H. T. Rauf.
“Detecting malignant leukemia cells using microscopic blood smear
images: a deep learning approach”. Applied Sciences. 2022 Jun
21;12(13):6317.
[80] G. Parthasarathy, D. Chitra. “Thresholding technique for color
image segmentation”. International Journal for Research in Applied
Science & Engineering Technology. 2015 Jun;3(6):437-45.
[81] Y. Liu, F. Long. “Acute lymphoblastic leukemia cells image analysis
with deep bagging ensemble learning”. In ISBI 2019 C-NMC
Challenge: Classification in Cancer Cell Imaging: Select Proceedings
2019 Nov 29 (pp. 113-121). Singapore: Springer Singapore.
[82] F. Xiao, R. Kuang, Z. Ou, B. Xiong. “DeepMEN: Multi-model
ensemble network for B-lymphoblast cell classification”. InISBI 2019
C-NMC Challenge: Classification in Cancer Cell Imaging: Select
Proceedings 2019 (pp. 83-93). Springer Singapore.
[83] D. S. Depto, M.M. Rizvee, A. Rahman, H. Zunair, M. S. Rahman,
M. R. Mahdy. “Quantifying imbalanced classification methods for
leukemia detection”. Computers in Biology and Medicine. 2023 Jan
1;152:106372.
[84] K. Barrera-Llanga, J. Burriel-Valencia, A. Sapena-Bañó, J. Martínez-
Román. “A Comparative Analysis of Deep Learning Convolutional
Neural Network Architectures for Fault Diagnosis of Broken Rotor
Bars in Induction Motors”. Sensors. 2023 Sep 30;23(19):8196.
[85] K. Barrera, J. Rodellar, S. Alférez, A. Merino. “Automatic normalized
digital color staining in the recognition of abnormal blood cells using
generative adversarial networks”. Computer methods and programs in
biomedicine. 2023 Oct 1;240:107629.
Dr. SAJIDA PERVEEN received the
Ph.D. degree in Computer Science from
the Department of Computer Science,
University of Engineering & Technology
(UET) Lahore, Pakistan, in 2021. She is
currently serving as an Assistant Professor
in the Department of Computer Science, National Textile
University, Faisalabad, Pakistan. Her research interests
include healthcare informatics, data science, and data
analytics.
Dr. ABDULLAH ALOURANI is
Assistant Professor at the Department of
Management Information Systems and
Production Management, Qassim
University, Saudi Arabia. He received his
Ph.D. in Computer Science from the
University of Illinois at Chicago, his
Master’s degree in Computer Science
from DePaul University in Chicago, and his Bachelor’s
degree in Computer Science from Qassim University, Saudi
Arabia. His current research interests are in the areas of
Cloud Computing, Software Engineering, Security, and
Artificial Intelligence. He is a member of ACM and IEEE.
Dr. MUHAMMAD SHAHBAZ received
the Ph.D. degree from Loughborough
University, U.K. He is currently a Full
Professor in the Department of Computer
Engineering, University of Engineering and
Technology. He has delivered several talks
in the industry at National and International
levels and at various conferences around the world. He has
wide experience in the field of data science and has published
more than 100 articles in the same domain. His research
interests include healthcare informatics, fog computing, data
science, and artificial intelligence.
Dr. M. USMAN ASHRAF received PhD
(Computer Science) degree in 2018 from
King Abdul-Aziz University, Saudi
Arabia. He is Associate Professor and
Head of department of Computer Science,
GC Women University, Sialkot, Pakistan.
His research on Exascale Computing
Systems, High Performance Computing (HPC) Systems,
Parallel Computing, HPC for Deep learning and Location
Based Services System has appeared in IEEE Access, IET
Software, International Journal of Advanced Research in
Computer Science, International Journal of Advanced
Computer Science and Applications, I.J. Information
Technology and Computer Science, International Journal of
Computer Science and Security and several International
IEEE/ACM/Springer conferences. He served as HPC
Scientist at HPC Centre King Abdul-Aziz University, Saudi
Arabia as well.
Dr. ISMA HAMID is currently an
assistant professor at National Textile
University, Pakistan. She has thirteen years
of teaching and research experience. Her
main research interests are visualization,
big data and computational intelligence.
... The objective of this study is to provide a complete overview of how Convolutional Neural Networks are used to diagnose medical pictures [15] [16]. They will investigate the fundamental principles of CNNs, explain their applications in various imaging modalities and clinical settings, assess current obstacles and limitations, and identify future research and development prospects in this rapidly expanding subject [17]. They can exploit AI's promise to alter medical imaging and improve patient care by learning more about CNN-based techniques. ...
Article
Full-text available
Medical image diagnosis using Convolutional Neural Networks (CNNs) has emerged as a viable way to improve the accuracy and efficiency of disease identification and categorization in clinical settings. In this study, the authors examine how CNNs can be used to diagnose lung nodules from chest X-ray images, providing insights into the technology's performance and future clinical applications. A dataset of 10,000 labeled chest X-ray images showing both benign and malignant lung nodules was obtained and preprocessed using standard methods. The dataset was used to construct and train a proprietary CNN architecture, which was then rigorously evaluated on distinct training, validation, and test sets. The CNN model showed good accuracy (94.8%), sensitivity (92.1%), specificity (96.5%), precision, recall, F1 score, and area under the ROC curve (AUC), indicating its robustness and generalization ability. These findings show that CNN-based diagnostic tools may help radiologists and physicians detect and diagnose lung cancer earlier, improving patient outcomes and optimizing healthcare delivery. However, challenges such as interpretability, data privacy, and regulatory approval must be addressed before CNNs can be fully utilized in medical imaging. This study emphasizes CNNs' transformative significance in diagnostic medicine and the necessity of additional research and development to realize their full potential in clinical practice.
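The abstract above reports accuracy, sensitivity, and specificity side by side. As a reminder of how such figures derive from confusion-matrix counts, here is a minimal Python sketch; the counts below are illustrative only, not the study's actual data.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute standard evaluation metrics from raw confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn)   # recall / true-positive rate
    specificity = tn / (tn + fp)   # true-negative rate
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {"accuracy": accuracy, "sensitivity": sensitivity,
            "specificity": specificity, "precision": precision, "f1": f1}

# Illustrative counts chosen to echo the reported sensitivity/specificity:
m = binary_metrics(tp=921, fp=35, tn=965, fn=79)
print({k: round(v, 3) for k, v in m.items()})
```

Checking a reported result against the underlying counts in this way is a quick sanity test when reading or reviewing such studies.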
... The images were resized to a standardized dimension of (227 × 227) pixels, facilitating consistent input dimensions for the subsequent stages. Normalization was applied to standardize pixel values across all images, ensuring uniformity in data representation [38]. A distinctive technique, LoGMH, was incorporated to accentuate relevant features within the images. ...
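The resize-and-normalise step described in the snippet above can be sketched with plain NumPy. This is a hedged illustration only (nearest-neighbour resizing and min-max scaling); the cited work's actual pipeline, including its LoGMH technique, is not reproduced here.

```python
import numpy as np

def preprocess(image, size=(227, 227)):
    """Nearest-neighbour resize to a fixed size, then min-max normalise to [0, 1].
    A minimal stand-in for the standardisation step described above."""
    h, w = image.shape[:2]
    rows = np.arange(size[0]) * h // size[0]   # source row for each output row
    cols = np.arange(size[1]) * w // size[1]   # source column for each output column
    resized = image[rows][:, cols].astype(np.float32)
    lo, hi = resized.min(), resized.max()
    return (resized - lo) / (hi - lo + 1e-8)   # uniform pixel-value range

np.random.seed(0)
img = np.random.randint(0, 256, (300, 400), dtype=np.uint8)  # synthetic grayscale image
out = preprocess(img)
print(out.shape)
```

Fixing the input dimensions this way is what lets every image feed the same network input layer, regardless of the original scan resolution.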
Article
Full-text available
Disease recognition has been revolutionized by autonomous systems in the rapidly developing field of medical technology. A crucial aspect of diagnosis involves the visual assessment and enumeration of white blood cells in microscopic peripheral blood smears. This practice yields invaluable insights into a patient’s health, enabling the identification of conditions of blood malignancies such as leukemia. Early identification of leukemia subtypes is paramount for tailoring appropriate therapeutic interventions and enhancing patient survival rates. However, traditional diagnostic techniques, which depend on visual assessment, are arbitrary, laborious, and prone to errors. The advent of ML technologies offers a promising avenue for more accurate and efficient leukemia classification. In this study, we introduced a novel approach to leukemia classification by integrating advanced image processing, diverse dataset utilization, and sophisticated feature extraction techniques, coupled with the development of TL models. Focused on improving accuracy of previous studies, our approach utilized Kaggle datasets for binary and multiclass classifications. Extensive image processing involved a novel LoGMH method, complemented by diverse augmentation techniques. Feature extraction employed DCNN, with subsequent utilization of extracted features to train various ML and TL models. Rigorous evaluation using traditional metrics revealed Inception-ResNet’s superior performance, surpassing other models with F1 scores of 96.07% and 95.89% for binary and multiclass classification, respectively. Our results notably surpass previous research, particularly in cases involving a higher number of classes. These findings promise to influence clinical decision support systems, guide future research, and potentially revolutionize cancer diagnostics beyond leukemia, impacting broader medical imaging and oncology domains.
Article
Full-text available
Android is the most popular operating system on the latest smart mobile devices, and many Android applications have been developed for it and have become an essential part of our daily lives. Unfortunately, different kinds of Android malware have also been generated alongside this endless stream of applications, installed through API calls, granted permissions, and extra package installations, and such malware violates system security rules and harms the system. It is therefore essential to detect and classify Android malware in order to protect user privacy and limit damage. Much research has already been conducted on techniques for Android malware detection and classification. In this work, we present AMDDLmodel, a deep learning technique based on a convolutional neural network. The model is tuned over different parameters, including filter sizes, number of epochs, learning rates, and layers, to detect and classify Android malware. The Drebin dataset, consisting of 215 features, was used for the model evaluation. The model achieves an accuracy of 99.92%; precision, recall, and F1-score are also reported. AMDDLmodel introduces innovative deep learning for Android malware detection, enhancing accuracy and practical user security through inventive feature engineering and comprehensive performance evaluation, and it shows the highest accuracy compared with existing techniques.
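A CNN applied to a flat feature vector, such as Drebin's 215 binary features, rests on 1-D convolution. The sketch below shows only that basic building block in NumPy; the kernel values and layer shape are hypothetical and do not reflect the AMDDLmodel architecture.

```python
import numpy as np

def conv1d(x, kernel, stride=1):
    """Valid-mode 1-D convolution followed by ReLU: the elementary
    operation of a CNN layer applied over a flat feature vector."""
    k = len(kernel)
    out = np.array([np.dot(x[i:i + k], kernel)
                    for i in range(0, len(x) - k + 1, stride)])
    return np.maximum(out, 0.0)  # ReLU keeps non-negative activations

np.random.seed(0)
features = np.random.randint(0, 2, 215).astype(float)  # one Drebin-style binary sample
activ = conv1d(features, kernel=np.array([0.5, -0.25, 0.5]))
print(activ.shape)  # valid convolution shortens 215 to 215 - 3 + 1 = 213
```

Stacking several such layers with learned kernels, then dense layers, yields the kind of classifier the abstract describes.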
Article
Full-text available
Induction machines (IMs) play a critical role in various industrial processes but are susceptible to degenerative failures, such as broken rotor bars. Effective diagnostic techniques are essential in addressing these issues. In this study, we propose the utilization of convolutional neural networks (CNNs) for detection of broken rotor bars. To accomplish this, we generated a dataset comprising current samples versus angular position using finite element method magnetics (FEMM) software for a squirrel-cage rotor with 28 bars, including scenarios with 0 to 6 broken bars at every possible relative position. The dataset consists of a total of 16,050 samples per motor. We evaluated the performance of six different CNN architectures, namely Inception V4, NasNETMobile, ResNET152, SeNET154, VGG16, and VGG19. Our automatic classification system demonstrated an impressive 99% accuracy in detecting broken rotor bars, with VGG19 performing exceptionally well. Specifically, VGG19 exhibited high accuracy, precision, recall, and F1-Score, with values approaching 0.994 and 0.998. Notably, VGG19 exhibited crucial activations in its feature maps, particularly after domain-specific training, highlighting its effectiveness in fault detection. Comparing CNN architectures assists in selecting the most suitable one for this application based on processing time, effectiveness, and training losses. This research suggests that deep learning can detect broken bars in induction machines with accuracy comparable to that of traditional methods by analyzing current signals using CNNs.
Article
Full-text available
Background and Objectives: Combining knowledge of clinical pathologists and deep learning models is a growing trend in morphological analysis of cells circulating in blood to add objectivity, accuracy, and speed in diagnosing hematological and non-hematological diseases. However, the variability in staining protocols across different laboratories can affect the color of images and performance of automatic recognition models. The objective of this work is to develop, train and evaluate a new system for the normalization of color staining of peripheral blood cell images, so that it transforms images from different centers to map the color staining of a reference center (RC) while preserving the structural morphological features. Methods: The system has two modules, GAN1 and GAN2. GAN1 uses the PIX2PIX technique to fade original color images to an adaptive gray, while GAN2 transforms them into RGB normalized images. Both GANs have a similar structure, where the generator is a U-NET convolutional neural network with ResNet and the discriminator is a classifier with ResNet34 structure. Digitally stained images were evaluated using GAN metrics and histograms to assess the ability to modify color without altering cell morphology. The system was also evaluated as a pre-processing tool before cells undergo a classification process. For this purpose, a CNN classifier was designed for three classes: abnormal lymphocytes, blasts and reactive lymphocytes. Results: Training of all GANs and the classifier was performed using RC images, while evaluations were conducted using images from four other centers. Classification tests were performed before and after applying the stain normalization system. The overall accuracy reached a similar value around 96% in both cases for the RC images, indicating the neutrality of the normalization model for the reference images. 
By contrast, there was a significant improvement in classification performance when applying stain normalization to the other centers. Reactive lymphocytes were the most sensitive to stain normalization, with true positive rates (TPR) increasing from 46.3% - 66% for the original images to 81.2% - 97.2% after digital staining. Abnormal lymphocyte TPR ranged from 31.9% - 95.7% with original images to 83% - 100% with digitally stained images. The blast class showed TPR ranges of 90.3% - 94.4% and 94.4% - 100% for original and stained images, respectively. Conclusions: The proposed GAN-based stain normalization approach improves the performance of classifiers on multicenter data sets by generating digitally stained images with a quality similar to the original images and adaptability to a reference staining standard. The system requires low computational cost and can help improve the performance of automatic recognition models in clinical settings.
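The GAN pipeline above is too heavy to reproduce here, but the underlying idea of mapping one centre's intensity distribution onto a reference centre's can be illustrated with classical histogram matching. This is a simpler stand-in, not the paper's PIX2PIX method, and the synthetic images below are assumptions.

```python
import numpy as np

def match_histogram(source, reference):
    """Map the source image's intensity distribution onto the reference's
    by aligning their cumulative distribution functions (CDFs)."""
    s_vals, s_idx, s_cnt = np.unique(source.ravel(),
                                     return_inverse=True, return_counts=True)
    r_vals, r_cnt = np.unique(reference.ravel(), return_counts=True)
    s_cdf = np.cumsum(s_cnt) / source.size
    r_cdf = np.cumsum(r_cnt) / reference.size
    # For each source CDF level, find the reference intensity at that level.
    mapped = np.interp(s_cdf, r_cdf, r_vals)
    return mapped[s_idx].reshape(source.shape)

np.random.seed(0)
src = np.random.randint(0, 128, (64, 64))   # under-stained image from another centre
ref = np.random.randint(96, 256, (64, 64))  # reference-centre staining
out = match_histogram(src, ref)
print(out.mean() > src.mean())
```

The GAN approach in the paper goes further: it preserves cell morphology while recolouring, which simple per-channel histogram matching cannot guarantee.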
Article
Full-text available
Leukemia is a cancer of blood-producing cells, including the bone marrow. Abnormal white blood cells travel through blood vessels and multiply rapidly. Healthy cells in the body become a minority, and the imbalance increases the chances of infection. Leukemia, or blood cancer, is the most common cancer in children aged 2 - 14, and most childhood leukemia is treatable. Acute lymphocytic leukemia (ALL) is a cancer of the blood and bone marrow. It progresses rapidly as immature white blood cells are formed instead of mature ones. Treatments for acute lymphocytic leukemia include drugs and blood transfusions administered directly into veins, chemotherapy, and transplantation, which involves transferring organs or tissues within the body or from one person to another. In this paper, pattern recognition of Acute Lymphoblastic Leukemia using computational deep learning is proposed. Pattern recognition technology uses mathematical algorithms to identify patterns in large datasets. By analyzing the data, the algorithms can identify patterns indicative of certain states or conditions. In the case of ALL, the algorithm looks for patterns in white blood cell count data that indicate the presence of ALL. These patterns may include changes in the number of white blood cells over time, changes in the composition of the white blood cells, or changes in the levels of certain proteins or gene expressions associated with ALL. The proposed ALLDM model achieved 81.53% (DDS) and 87.92% (SDS) for chemotherapy management, 79.16% (DDS) and 94.31% (SDS) for stem cell transplantation management, 63.77% (DDS) and 87.37% (SDS) for radiation therapy management, and 88.92% (DDS) and 85.86% (SDS) for targeted therapy drug management.
Article
Full-text available
The production of blood cells is affected by leukemia, a type of bone marrow or blood cancer. Deoxyribonucleic acid (DNA) is related to immature cells, particularly white cells, and is damaged in various ways in this disease. When a radiologist diagnoses acute leukemia cells, the diagnosis is time-consuming and its accuracy needs improvement. For this purpose, much research has been conducted on the automatic diagnosis of acute leukemia; however, these studies suffer from low detection speed and accuracy. Machine learning and artificial intelligence techniques now play an essential role in medical sciences, particularly in detecting and classifying leukemic cells. These methods assist doctors in detecting diseases earlier, reducing their workload and the possibility of errors. This research aims to design a deep learning model with a customized architecture for detecting acute leukemia using images of lymphocytes and monocytes. The study presents a novel dataset containing images of Acute Lymphoblastic Leukemia (ALL) and Acute Myeloid Leukemia (AML), created with the assistance of various experts to help the scientific community incorporate machine learning techniques into medical research. The scale of the dataset is increased with a Generative Adversarial Network (GAN). The proposed CNN model, based on the Tversky loss function, includes six convolution layers, four dense layers, and a Softmax activation function for the classification of acute leukemia images. The proposed model achieved a 99% accuracy rate in diagnosing acute leukemia types, including ALL and AML. Compared with previous research, the proposed network offers promising performance in terms of speed and accuracy; based on the results, it can assist doctors and specialists in practical applications.
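The Tversky loss mentioned above generalizes the Dice loss by weighting false negatives and false positives asymmetrically, which is useful when one class (e.g. leukemic cells) is rare. A minimal NumPy sketch follows; the alpha/beta values are illustrative defaults, not the cited paper's settings.

```python
import numpy as np

def tversky_loss(y_true, y_pred, alpha=0.7, beta=0.3, eps=1e-8):
    """Tversky loss over binary labels/probabilities. Here alpha weights
    false negatives and beta weights false positives, so alpha > beta
    penalises missed positives more heavily."""
    tp = np.sum(y_true * y_pred)
    fn = np.sum(y_true * (1 - y_pred))
    fp = np.sum((1 - y_true) * y_pred)
    tversky_index = (tp + eps) / (tp + alpha * fn + beta * fp + eps)
    return 1.0 - tversky_index

y_true = np.array([1, 1, 0, 0], dtype=float)
perfect = tversky_loss(y_true, y_true)      # identical prediction -> loss near 0
poor = tversky_loss(y_true, 1 - y_true)     # inverted prediction -> loss near 1
print(round(perfect, 4), round(poor, 4))
```

With alpha = beta = 0.5 this reduces to the familiar Dice loss, which is one reason the Tversky form is a popular drop-in replacement for imbalanced data.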
Article
Full-text available
Leukemia (blood cancer) arises when the number of white blood cells (WBCs) in the human body is imbalanced. When the bone marrow produces many immature WBCs that kill healthy cells, acute lymphocytic leukemia (ALL) impacts people of all ages. Predicting this disease in time can thus increase the chance of survival, as the patient can begin therapy early. Manual prediction is expensive and time-consuming, so automated prediction techniques are essential. In this research, we propose an ensemble automated prediction approach that uses four machine learning algorithms: K-Nearest Neighbor (KNN), Support Vector Machine (SVM), Random Forest (RF), and Naive Bayes (NB). The C-NMC leukemia dataset from the Kaggle repository is used to predict leukemia. The dataset is divided into two classes: cancer and healthy cells. We perform data preprocessing steps, such as cropping images using minimum and maximum points. Feature extraction is performed using pre-trained Convolutional Neural Network-based Deep Neural Network (DNN) architectures (VGG19, ResNet50, or ResNet101). Data scaling is performed using the MinMaxScaler normalization technique, with Analysis of Variance (ANOVA), Recursive Feature Elimination (RFE), and Random Forest (RF) as feature selection techniques. Classification machine learning algorithms and ensemble voting are applied to the selected features. Results reveal that SVM, with 90.0% accuracy, outperforms the other algorithms.
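The ensemble voting step above combines the four classifiers' outputs. Its two standard forms, hard (majority) and soft (probability-averaging) voting, can be sketched in a few lines of plain Python; the toy predictions below are hypothetical, not the study's results.

```python
from collections import Counter

def hard_vote(predictions):
    """Majority vote across classifiers for each sample."""
    return [Counter(sample).most_common(1)[0][0] for sample in zip(*predictions)]

def soft_vote(probabilities):
    """Average per-class probabilities across classifiers, then take the argmax."""
    n = len(probabilities)
    out = []
    for sample in zip(*probabilities):
        avg = [sum(p[c] for p in sample) / n for c in range(len(sample[0]))]
        out.append(max(range(len(avg)), key=avg.__getitem__))
    return out

# Toy label predictions from four classifiers (e.g. KNN, SVM, RF, NB) on three samples:
preds = [["cancer", "healthy", "cancer"],
         ["cancer", "healthy", "healthy"],
         ["cancer", "cancer",  "cancer"],
         ["healthy", "healthy", "cancer"]]
print(hard_vote(preds))
```

Soft voting generally makes better use of well-calibrated classifiers, while hard voting needs only discrete labels.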
Article
In machine learning, two approaches outperform traditional algorithms: ensemble learning and deep learning. The former refers to methods that integrate multiple base models in the same framework to obtain a stronger model that outperforms them all. The success of an ensemble method depends on several factors, including how the baseline models are trained and how they are combined. In the literature, there are common approaches to building an ensemble model that have been successfully applied in several domains. On the other hand, deep learning-based models have improved the predictive accuracy of machine learning across a wide range of domains. Despite the diversity of deep learning architectures, their ability to deal with complex problems, and their ability to extract features automatically, the main challenge in deep learning is that tuning the optimal hyper-parameters requires a great deal of expertise and experience, which makes it a tedious and time-consuming task. Numerous recent research efforts have sought to bring ensemble learning to deep learning to overcome this challenge. Most of these efforts focus on simple ensemble methods that have some limitations. Hence, this review paper provides a comprehensive review of the various strategies for ensemble learning, especially in the case of deep learning. It also explains in detail the various features and factors that influence the success of ensemble methods, and it presents and accurately categorizes several research efforts that have used ensemble learning in a wide range of domains.
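One common combination strategy surveyed in such reviews, and the one the main article builds on, is weighted averaging of the members' class-probability outputs. A minimal sketch, with hypothetical member outputs and weights (e.g. validation accuracies), is shown below.

```python
def weighted_ensemble(member_probs, weights):
    """Fuse member models' class-probability vectors by weighted averaging.
    Higher-weighted members contribute more to the final prediction."""
    total = sum(weights)
    n_classes = len(member_probs[0])
    return [sum(w * p[c] for w, p in zip(weights, member_probs)) / total
            for c in range(n_classes)]

# Three member models' probabilities for one sample over two classes,
# weighted by their (hypothetical) validation accuracies:
probs = [[0.6, 0.4], [0.7, 0.3], [0.2, 0.8]]
weights = [0.95, 0.97, 0.80]
fused = weighted_ensemble(probs, weights)
print([round(p, 3) for p in fused])
```

Because each member's probabilities sum to one, the fused vector also sums to one, so it can be read directly as an ensemble probability distribution.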
Article
Background Tracking progress and providing timely evidence is a fundamental step forward for countries to remain aligned with the targets set by WHO to eliminate cervical cancer as a public health problem (ie, to reduce the incidence of the disease below a threshold of 4 cases per 100 000 women-years). We aimed to assess the extent of global inequalities in cervical cancer incidence and mortality, based on The Global Cancer Observatory (GLOBOCAN) 2020 estimates, including geographical and socioeconomic development, and temporal aspects. Methods For this analysis, we used the GLOBOCAN 2020 database to estimate the age-specific and age-standardised incidence and mortality rates of cervical cancer per 100 000 women-years for 185 countries or territories aggregated across the 20 UN-defined world regions, and by four-tier levels of the Human Development Index (HDI). Time trends (1988–2017) in incidence were extracted from the Cancer Incidence in Five Continents (CI5) plus database. Mortality estimates were obtained using the most recent national vital registration data from WHO. Findings Globally in 2020, there were an estimated 604 127 cervical cancer cases and 341 831 deaths, with a corresponding age-standardised incidence of 13·3 cases per 100 000 women-years (95% CI 13·3–13·3) and mortality rate of 7·2 deaths per 100 000 women-years (95% CI 7·2–7·3). Cervical cancer incidence ranged from 2·2 (1·9–2·4) in Iraq to 84·6 (74·8–94·3) in Eswatini. Mortality rates ranged from 1·0 (0·8–1·2) in Switzerland to 55·7 (47·7–63·7) in Eswatini. Age-standardised incidence was highest in Malawi (67·9 [95% CI 65·7 –70·1]) and Zambia (65·5 [63·0–67·9]) in Africa, Bolivia (36·6 [35·0–38·2]) and Paraguay (34·1 [32·1–36·1]) in Latin America, Maldives (24·5 [17·0–32·0]) and Indonesia (24·4 [24·2–24·7]) in Asia, and Fiji (29·8 [24·7–35·0]) and Papua New Guinea (29·2 [27·3–31·0]) in Melanesia. A clear socioeconomic gradient exists in cervical cancer, with decreasing rates as HDI increased. 
Incidence was three times higher in countries with low HDI than countries with very high HDI, whereas mortality rates were six times higher in low HDI countries versus very high HDI countries. In 2020 estimates, a general decline in incidence was observed in most countries of the world with representative trend data, with incidence becoming stable at relatively low levels around 2005 in several high-income countries. By contrast, in the same period incidence increased in some countries in eastern Africa and eastern Europe. We observed different patterns of age-specific incidence between countries with well developed population-based screening and treatment services (eg, Sweden, Australia, and the UK) and countries with insufficient and opportunistic services (eg, Colombia, India, and Uganda). Interpretation The burden of cervical cancer remains high in many parts of the world, and in most countries, the incidence and mortality of the disease remain much higher than the threshold set by the WHO initiative on cervical cancer elimination. We identified substantial geographical and socioeconomic inequalities in cervical cancer globally, with a clear gradient of increasing rates for countries with lower levels of human development. Our study provides timely evidence and impetus for future strategies that prioritise and accelerate progress towards the WHO elimination targets and, in so doing, address the marked variations in the global cervical cancer landscape today. Funding French Institut National du Cancer, Horizon 2020 Framework Programme for Research and Innovation of the European Commission; and EU4Health Programme.
Article
Uncontrolled proliferation of B-lymphoblast cells is a common characteristic of Acute Lymphoblastic Leukemia (ALL). B-lymphoblasts are found in large numbers in peripheral blood in malignant cases. Early detection of the cell in bone marrow is essential, as the disease progresses rapidly if left untreated. However, automated classification of the cell is challenging owing to its fine-grained variability with B-lymphoid precursor cells and imbalanced data points. Deep learning algorithms demonstrate potential for such fine-grained classification but also suffer from the imbalanced-class problem. In this paper, we explore different deep learning-based State-Of-The-Art (SOTA) approaches to tackle imbalanced classification problems. Our experiments include input-based, GAN-based (Generative Adversarial Networks), and loss-based methods to mitigate the issue of class imbalance on the challenging C-NMC and ALLIDB-2 datasets for leukemia detection. We present empirical evidence that loss-based methods outperform GAN-based and input-based methods in imbalanced classification scenarios.
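A widely used example of the loss-based methods this abstract favours is the focal loss, which down-weights easy examples so training concentrates on hard, often minority-class, ones. The sketch below is one illustrative instance, not necessarily the exact loss used in the cited work; gamma and alpha are the commonly quoted defaults.

```python
import math

def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Focal loss for a single binary prediction p in (0, 1) with label y in {0, 1}.
    The (1 - pt)^gamma factor shrinks the loss on confident, correct predictions."""
    pt = p if y == 1 else 1 - p          # probability assigned to the true class
    a = alpha if y == 1 else 1 - alpha   # class-balancing weight
    return -a * (1 - pt) ** gamma * math.log(max(pt, 1e-12))

easy = focal_loss(0.95, 1)   # confident and correct -> tiny loss
hard = focal_loss(0.10, 1)   # confident and wrong   -> large loss
print(easy < hard)
```

Compared with plain cross-entropy, the gradient contribution of abundant easy negatives collapses, which is why such losses help on imbalanced leukemia datasets.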
Article
Modern endoscopy relies on digital technology, from high-resolution imaging sensors and displays to electronics connecting configurable illumination and actuation systems for robotic articulation. In addition to enabling more effective diagnostic and therapeutic interventions, the digitization of the procedural toolset enables video data capture of the internal human anatomy at unprecedented levels. Interventional video data encapsulate functional and structural information about a patient’s anatomy as well as events, activity and action logs about the surgical process. This detailed but difficult-to-interpret record from endoscopic procedures can be linked to preoperative and postoperative records or patient imaging information. Rapid advances in artificial intelligence, especially in supervised deep learning, can utilize data from endoscopic procedures to develop systems for assisting procedures leading to computer-assisted interventions that can enable better navigation during procedures, automation of image interpretation and robotically assisted tool manipulation. In this Perspective, we summarize state-of-the-art artificial intelligence for computer-assisted interventions in gastroenterology and surgery. Advances in artificial intelligence (AI) are changing endoscopy and gastrointestinal surgery, including computer-assisted detection and diagnosis, computer-aided navigation, robot-assisted intervention and automated reporting. This Perspective introduces the role of AI in computer-assisted interventions in gastroenterology with insights on regulatory aspects and the challenges ahead.