A framework for Early Detection of Acute
Lymphoblastic Leukemia and its Subtypes from
Peripheral Blood Smear Images Using Deep
Ensemble Learning Technique
Sajida Perveen1, Abdullah Alourani* 2, Muhammad Shahbaz3, Usman Ashraf 4, Isma Hamid 5
1,5 Department of Computer Science, National Textile University, Faisalabad, Pakistan
2 Department of Management Information Systems and Production Management, College of Business and Economics, Qassim
University, P.O. Box 6640, Buraidah 51452, Saudi Arabia
3 Department of Computer Engineering, University of Engineering & Technology, Lahore, Pakistan.
4 Department of Computer Science, Government College Women University, Sialkot, Pakistan.
Corresponding author: Abdullah Alourani (ab.alourani@qu.edu.sa).
This work was funded by Scientific Research, Qassim University, KSA.
ABSTRACT Acute lymphoblastic leukemia (ALL), one of the prevalent types of carcinogenic disease, has proven to be a deadly illness, threatening the lives of numerous patients across the world. It affects both adults and children, and the chances of a cure are narrow if it is diagnosed at a later stage. A definitive diagnosis often demands highly invasive diagnostic procedures, proving time-consuming and expensive. Peripheral Blood Smear (PBS) images play a crucial role in the initial screening of ALL in suspected individuals. However, the nonspecific nature of ALL poses a substantial challenge in the analysis of these images, leaving room for misdiagnosis. Aiming to contribute to the early diagnosis of this life-threatening disease, we put forward an automated platform for screening the presence of ALL and its specific subtypes (benign, Early Pre-B, Pro-B and Pre-B) using PBS images. The proposed web-based platform follows a weighted ensemble learning technique using a Residual Convolutional Neural Network (ResNet-152) as the base learner to identify ALL from hematogone cases and then determine ALL subtypes. This is likely to save both diagnosis time and the efforts of clinicians and patients. Experimental results are reported, and a comparative analysis among 7 well-known CNN architectures (AlexNet, VGGNet, Inception, ResNet-18, ResNet-50, DenseNet-121 and ResNet-152) demonstrates that the proposed platform achieved comparatively high accuracy (99.95%), precision (99.92%), recall (99.92%), F1-score (99.90%), sensitivity (99.92%) and specificity (99.97%). The promising results demonstrate that the proposed platform has the potential to be used as a reliable tool for the early diagnosis of ALL and its subtypes. Furthermore, it provides a reference for pathologists and healthcare providers, aiding them in producing specific guidelines and making more informed choices about patient and disease management.
INDEX TERMS Acute Lymphoblastic Leukemia (ALL), Deep Ensemble Learning, Ensemble
convolutional neural networks, Lymphocytic leukemia, Peripheral Blood Smear Images, ResNet-152,
Weighted deep ensemble learning.
I. INTRODUCTION
Acute lymphoblastic leukemia (ALL), or lymphocytic
leukemia, is a potentially life-threatening disease [1]. It is
a malignancy of lymphoid blood cells characterized by
hyper-proliferation and immature growth of lymphocytes
(white blood cell/leukocyte) by the bone marrow in the
human body [2]. These immature and impaired
lymphocytes pose a substantial threat to the overall immune
system. This anomaly also inhibits the bone marrow’s
ability to produce platelets and red blood cells. In addition,
mutated erythrocytes in the bloodstream can lead to serious
health concerns for other vital body organs [3].
The prevalence of ALL is increasing dramatically each year
[64], making it one of the more common types of cancer
This article has been accepted for publication in IEEE Access. This is the author's version which has not been fully edited and
content may change prior to final publication. Citation information: DOI 10.1109/ACCESS.2024.3368031
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License. For more information, see https://creativecommons.org/licenses/by-nc-nd/4.0/
prevalent in adults and children; however, it has a fair
chance of being cured [4]. In 2017, the global incidence of ALL was estimated to have risen from 49.1 thousand to 64.2 thousand cases [5]. According to a recent report published
in 2022 by the International Agency for Research on
Cancer (IARC) of WHO, there have been 437,033 reported
cases of leukemia and 303,006 leukemia related deaths
worldwide [6]. It is also noteworthy that neglecting early
interventions or diagnosing ALL at a later stage might also
result in premature mortality [7].
Furthermore, preliminary and rapid diagnosis of ALL has
always been a challenge to haematologists, oncologists and
researchers [8]. Enlargement of the liver and spleen, uncertain
bleeding, pallor, fever, skeletal pain, common infections,
easy bruising and abrupt weight loss are the common
symptoms often associated with leukemia, but these
symptoms can also be indicative of other medical
conditions. Consequently, leukemia diagnosis is
challenging in its early stages due to the covert and mild
nature of these symptoms [9].
Diagnosing acute leukemia involves a diverse array of
diagnostic procedures and equipment. The conventional
method used for initial screening for ALL in a suspected
individual at the preliminary stage is the microscopic
evaluation of Peripheral Blood Smear (PBS) samples.
However, the gold standard methods for leukemia
diagnosis involve bone marrow aspiration, cytogenetic
analysis, immune phenotyping and lumbar puncture to
name a few [10]. Such invasive techniques have serious
complications, particularly in children. The preliminary
complications associated with these procedures are pain,
bleeding and bruising [9]. Moreover, the bone marrow test is expensive and imposes extra healthcare expenditure on patients who need multiple samples [11]. According to a report published in 2020 by the WHO, the equipment and resources required for such tests are still largely unavailable in developing countries, particularly in rural areas [12].
Furthermore, manual analyses that have been used for a
long time are often labor intensive and time-consuming in
general [13]. Manual analysis might also yield less accurate results or diagnostic errors due to the time-consuming nature of PBS analysis under the microscope, which again renders intervention measures ineffective and impractical. Moreover, hematologists must diagnose the presence of leukemia along with its specific subtypes (benign, Early Pre-B, Pro-B and Pre-B) to prevent healthcare complications and identify the optimal treatment of leukemia.
Over the last three decades, extensive research articles have
adopted machine learning (ML) and computer-aided
diagnostic approaches to analyze laboratory images. The
primary objectives of such analyses are to overcome the
challenges associated with late leukemia diagnoses and to
effectively determine its subtypes [14]. Numerous studies
have also analyzed blood smear images for differentiating,
diagnosing and counting the cells in various types of
leukemia [15, 16].
According to the existing literature, extensive studies
regarding ALL prediction from microscopic images have
been conducted. These studies employed various ML
algorithms for early diagnosis of ALL using microscopic
images [10, 17, 18]. However, when these techniques are
applied to imagery datasets, some obstructions exist,
including handling multi-dimensional data, handcrafted
feature engineering, and dimensionality reduction [19, 20].
Recently, to overcome these limitations, deep learning is
frequently used for diagnosing ALL, especially when
dealing with such data. Within deep learning, the convolutional neural network (CNN) is one of the most frequently used models, particularly for image-related healthcare datasets, as it has strong self-learning, adaptive and generalization capabilities.
Unlike classical machine learning techniques, a CNN alleviates the requirement of handcrafted feature extraction; it only requires the data as input, and its self-learning capability completes the intended task unaided [16]. Various other DL-based techniques have
already proven their immense worth across diverse
domains of automatic segmentation, recognition or
detection, such as skin lesions [21, 22], brain tumor [23]
breast cancer [24], diabetic retinopathy [25], COVID-19
pandemic [26], minor invasive surgery [27] and others [28].
Although numerous articles have already been published on diagnosing ALL using CNNs, there is still room for enhancing the performance and learning process and extending the model's generalizability, mainly when dealing with data scarcity, as depicted in numerous articles [29, 30, 31, 32, 33]. In this regard, ensemble learning is a widely adopted and effective machine learning technique that combines multiple models to enhance overall performance and accuracy [29, 30, 31, 32, 33].
Keeping the above-mentioned aspects in mind, the salient
features of our contributions are depicted below: (1) our
research pioneers the development of a fully automated
online platform to diagnose and classify ALL into its
subtypes with a deep ensemble learning based model using
blood smear images. Making such web platforms accessible
to the general population may also contribute significantly
to patients’ education, disease management and reduced
healthcare cost, as reportedly approximately 6 billion
individuals worldwide rely on the Internet to access
disease-related information before seeking medical care
[65]. (2) To the best of our knowledge, this is the first study to incorporate the ResNet-152 network architecture as the key component of a weighted ensemble learning method to classify acute lymphoblastic leukemia and its subtypes, with almost 99.95% accuracy. (3) The proposed research leverages transfer learning by incorporating pre-trained Convolutional Neural Network (CNN) architectures to reduce the need for expensive hardware and computational cost. We experimented with 7 well-known pre-trained CNN architectures (AlexNet, VGGNet, Inception, ResNet-18, ResNet-50, DenseNet-121 and ResNet-152) and performed a comparative analysis among them. This analysis demonstrated that the proposed method outperforms these networks in the early diagnosis of ALL and its subtypes, achieving comparatively high accuracy. (4) The proposed fully automated, real-time tele-diagnosis system does not necessitate extensively trained medical personnel for ALL diagnosis, particularly filling the gap when direct doctor contact is not possible or available, while reducing the cost and effort of prevention and treatment for those at an early stage.
These developments collectively improve the landscape of ALL diagnosis, fusing efficiency, accuracy, and accessibility.
II. LITERATURE REVIEW
Sampathila et al. [63] developed a custom ALLNET network architecture, which was trained and tested using a publicly available microscopic image dataset. The imagery data incorporated in this research belongs to the ALL Challenge dataset of ISBI 2019. Results demonstrated that ALLNET achieved an accuracy of 98%, indicating that the proposed model has the potential to diagnose ALL effectively. However, the proposed research does not support classifying leukemia cells into subtypes, which is crucial for comprehensive disease management.
Jha and Dutta [67] demonstrated a framework for leukemia detection based on a hybrid technique. In this research, blood smear images were obtained from the ALL-IDB2 database, and novel segmentation and optimization methods were applied to them. The segmentation phase utilizes a hybrid model based primarily on Mutual Information (MI), combining Active Contour and Fuzzy C-Means techniques. Subsequently, potentially significant features were extracted, comprising color histogram features and the Local Difference Pattern (LDP). These extracted features serve as input for the Chronological Self-Adaptive (SCA)-based CNN architecture, whose weights are optimized by the Chronological SCA algorithm to increase the accuracy of leukemia identification. Although this research proposed a holistic technique and detected leukemia from blood smear images, it requires domain knowledge to be utilized effectively for detecting ALL from single-cell blood smear images. Furthermore, this research does not address the identification of ALL subtypes.
In the existing literature, there is a lack of research investigating the significance of using various optimization algorithms for hyper-parameter tuning of deep learning models developed particularly to diagnose ALL. Atteia et al. [68] proposed a Bayesian-optimized Convolutional Neural Network (CNN) to detect ALL from microscopic smear images. The network architecture of the CNN and its hyper-parameters are tailored to the input data through Bayesian optimization, a technique that iteratively refines the hyper-parameter space to minimize an objective error function. A hybrid dataset was created by combining two publicly available datasets to train and test the proposed Bayesian-optimized CNN. Data augmentation was applied to this hybrid dataset, contributing to improved performance. Experimental results demonstrated that the Bayesian-optimized CNN model delivers superior performance in classifying ALL from the blood smear image test set, surpassing other optimized deep learning ALL classification models. While this research demonstrated the effectiveness of the Bayesian-optimized CNN model in elevating the accuracy of ALL detection from microscopic PBS images, it does not extend its assistance to categorizing specific ALL subtypes. It is widely accepted that optimal and effective treatment depends on the accurate detection of the disease type and how far the disease has spread in the body.
Khandekar et al. [69] proposed an automated technique to detect ALL blast cells in PBS images using YOLOv4. Two publicly available datasets (ALL-IDB1 and C_NMC_2019, consisting of a total of 10,769 single-celled, preprocessed images) were incorporated to train and evaluate the proposed method. The proposed method achieved a Mean Average Precision (MAP) of 98.7% on the C_NMC_2019 dataset and 96.06% on the ALL-IDB1 data. Experimental results demonstrated that by the end of 6,000 iterations the loss had decreased exponentially to approximately 0.57664. However, no interactive automated screening or diagnostic tele-health service was proposed in this research.
Ghaderzadeh et al. [34] proposed a deep learning-based method to identify ALL and its subtypes using PBS images. The dataset utilized in the study consists of 3256 PBS images from 89 individuals suspected of ALL. The method incorporates a low-cost leukemia cell segmentation approach and utilizes pairs of segmented and original images. The model consists of a DenseNet-201-based feature extraction block and a classification block. The features extracted from the segmented and original images were concatenated to train the DenseNet-201 model for classification into benign/malignant and the subtypes pre-B, early pre-B, and pro-B ALL. Although the proposed multi-step DL architecture enhances the comprehensive analysis of PBS images for ALL detection and subtype classification, it incurs high computational cost and requires domain knowledge. Furthermore, it does not provide any support for a tele-diagnosis facility.
In [70], a multi-step deep learning approach was proposed to automatically segment leukemia cells from bone marrow images. The objectives of this research were to identify Acute Myeloid Leukemia (AML) and to predict the most common mutation status in AML. The study involved the analysis of 1487 newly diagnosed individuals, of whom 236 were healthy. Feature extraction was performed manually by hematologists. An extensive dataset (comprising 5202 and 5428 augmented bone marrow smear images from AML and healthy individuals, respectively) was used to train the Faster Region-based Convolutional Neural Network (FRCNN) [71] model. Experimental results demonstrated that the proposed binary model achieved high accuracy in predicting NPM1 mutation status (86%) and 91% accuracy in classifying between AML and healthy bone marrow samples. Although the study achieved optimal results, the manual feature extraction method may lead to errors and inefficiency, particularly when dealing with large datasets.
The study [61] focuses on the application of Deep Learning (DL) techniques along with an ensemble method to predict ALL and identify its subtypes using PBS images. The C-NMC-2019 dataset was incorporated to build the deep ensemble model. An oversampling technique was used to tackle the class imbalance problem, resulting in a training set of 11644 images. Pre-trained networks, namely VGG-16, Xception, MobileNetV2, InceptionResNet-V2, and DenseNet-121, were adopted for transfer learning. The ensemble model exhibits comparatively high performance in identifying ALL. Experimental results demonstrated that the proposed method achieved an accuracy of 89.72% and an AUROC of 94.8%. The overall findings of this research indicate that ensemble learning, combining the capabilities of diverse networks, enhances the overall effectiveness of the model for identifying ALL in medical images.
Barrera et al. [85] proposed an effective approach using GAN1 and GAN2 modules for preserving structural morphological traits and aligning color staining with a reference center (RC) when combining images from various centers. The normalization process aimed to enhance the objectivity, accuracy, and speed of morphological analysis in peripheral blood cell images, contributing to more reliable diagnoses of hematological and non-hematological disorders.
III. METHODOLOGY
In this section, the workflow, image processing methods and network architecture developed to classify ALL and its subtypes are detailed. Once the dataset was obtained, augmentation was employed to increase its size and enhance its robustness. Subsequently, significant features were extracted from the imagery dataset. The network then categorizes the imagery test data as either ALL (blast cells) or hematogone (healthy cells).
A. DATASET
The PBS imagery data used in this research was obtained from the Kaggle repository. The same dataset was also used by Ghaderzadeh et al. [34] for ALL diagnosis and subtype classification. It consists of approximately 3256 images collected from 89 individuals suspected of ALL: 25 healthy individuals with a benign diagnosis (hematogone) and 64 patients diagnosed with various subtypes of ALL. This imagery data was prepared in the bone marrow laboratory of Taleqani Hospital (Tehran, Iran).
The slides of blood smear samples were prepared and stained by skilful laboratory staff. The dataset is primarily divided into two main categories: benign and malignant. The benign category contains hematogones, normal B-lymphocyte precursors, which naturally exist in the bone marrow of healthy individuals and closely resemble acute lymphoblastic leukemia cases. These benign hematopoietic precursor cells usually do not require chemotherapy and resolve on their own without any medical intervention. The malignant category is further divided into three subtypes of malignant lymphoblasts, namely Early pre-B, pre-B, and pro-B ALL. Randomly selected sample image frames from each category are illustrated in Figure 1.
FIGURE 1. Randomly selected images from each category of Acute Lymphoblastic Leukemia: (A) Benign, (B) Early Pre-B, (C) Pre-B, (D) Pro-B.
All the images obtained from these slides were captured through a microscope equipped with a Zeiss camera at a magnification of ×100 and saved in JPG format. Subsequently, the definitive classification of these PBS images into specific types and subtypes was made by an expert using a flow cytometry tool. Table I presents an abstract overview of the dataset used in the proposed study.
A. PREPROCESSING PHASE
Preprocessing is a commonly used approach in computer vision applications for preparing input data, and it directly influences the predictive performance of deep learning models [34, 35]. In this research, segmentation, decoding and resizing, normalization and augmentation were used to preprocess the microscopic imagery data obtained from peripheral blood smear images. A graphical demonstration of the preprocessing and feature extraction phases is depicted in Figure 2.

TABLE I
AN ABSTRACT OVERVIEW OF THE DATASET USED IN THE PROPOSED RESEARCH.

Categories/Subtypes           No. of Samples   No. of Individuals   Microscopic Image Size
Benign: Hematogones (Hem)     504              25                   1024 × 768
Malignant: Early Pre-B ALL    985              20                   1024 × 768
Malignant: Pre-B ALL          963              21                   1024 × 768
Malignant: Pro-B ALL          804              23                   1024 × 768
Total                         3256             89

FIGURE 2. Overview of the proposed feature extraction process.
B. SEGMENTATION
In microscopic peripheral blood smear images, an array of diverse colors can be observed, representing a range of cellular components [17]. However, our primary focus lies solely on the detection of lymphoblasts, which are immature lymphocytes, with the objective of effectively identifying ALL cases and their respective subtypes. To address this challenge, segmentation is widely used [17, 34, 76, 78]; it defines the boundaries of a lymphoblastic cell and thus separates the unnecessary components [75] (erythrocytes, platelets and plasma) from the main substance (lymphoblast cells) [33, 34, 77, 78, 79].
In the existing literature, various segmentation techniques have been employed to extract the Region of Interest (ROI), including the Otsu method followed by morphological operations [36], k-means clustering [37], and watershed transformation [38]. Furthermore, Parthasarathy and Chitra [80] proposed a segmentation method to effectively segregate the object under consideration from the rest of the image using a color thresholding technique. Experimental results showed that the selected threshold values can effectively extract the ROI. It also demonstrated applicability in scenarios where precise segmentation of the intended object is paramount, aligning with the objectives of our study in detecting lymphoblasts.
Nevertheless, to segment the ROI (lymphoblast cells) from microscopic PBS images, we incorporated a cost-effective technique relying on color interval separation and a simple threshold method. The method reduces the image to only two levels of intensity, thereby making it suitable for identifying and separating the lymphoblasts.
The microscopic imagery data is captured in RGB space,
encompassing three distinct channels: red, green, and blue.
Differentiating among colors in RGB space is a challenging task. Therefore, we transformed the images into the HSV color space (hue, saturation, and value), which provides improved color separation for further analysis.
Figure 3 represents the HSV color space distribution of the
randomly selected sample images shown in Figure 1.
FIGURE 3. (A) 3D HSV color space distribution of a PBS image selected from the benign (hematogones) category, (B) HSV color space distribution from the Early Pre-B subtype, (C) HSV color space distribution from the Pre-B subtype, (D) HSV color space distribution from the Pro-B subtype, as illustrated in Figure 1.
Subsequently, to obtain the two distinct lower and upper
thresholds for the purple color, representing the dominant hue
of blast cells, a 3D scatter plot is generated based on the image
pixels in the HSV color space. Binary masks were then created
using these lower and upper thresholds for each image,
capturing the region of interest (Lymphoblast) from the input
image. Therefore, by applying these masks to the input
images, the Lymphoblast cells were accurately segmented.
Finally, the segmented cells were converted back to the RGB
color space, resulting in a segmented image with Lymphoblast
cells displayed, as shown in Figure 4.
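To make the masking step concrete, the color-interval thresholding described above can be sketched in plain Python. The HSV bounds below are illustrative placeholders for the purple interval of stained blast cells, not the thresholds actually derived from the paper's 3D scatter plots, and a real pipeline would operate on arrays rather than nested lists.

```python
import colorsys

def segment_lymphoblasts(rgb_pixels, lower=(0.7, 0.2, 0.2), upper=(0.9, 1.0, 1.0)):
    """Build a binary mask keeping only pixels whose HSV values fall
    inside a purple interval (illustrative bounds, not the paper's).
    `rgb_pixels` is a 2-D grid (list of rows) of (r, g, b) tuples in 0..255."""
    mask = []
    for row in rgb_pixels:
        mask_row = []
        for (r, g, b) in row:
            h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
            inside = (lower[0] <= h <= upper[0] and
                      lower[1] <= s <= upper[1] and
                      lower[2] <= v <= upper[2])
            mask_row.append(1 if inside else 0)
        mask.append(mask_row)
    return mask

def apply_mask(rgb_pixels, mask, background=(0, 0, 0)):
    """Zero out everything outside the mask, leaving the segmented
    cells displayed in RGB, as in Figure 4."""
    return [[px if m else background for px, m in zip(row, mrow)]
            for row, mrow in zip(rgb_pixels, mask)]
```

For example, a purplish pixel such as (180, 60, 200) falls inside these bounds, while a red pixel such as (200, 30, 30) is masked out.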
C. DECODE AND RESIZE
Before feeding input images to a CNN, they are transformed
from a standard format into numeric arrays, capturing the pixel
magnitude values of the image. To ensure network
adaptability, reduce computational demands, and enhance
training efficiency, the input images are subsequently resized
[17, 34]. The chosen image size of 224x224 pixels strikes a
balance between model performance and computational
efficiency.
While higher dimensions could be used, maintaining
uniformity with existing research and widely used
architectures led us to standardize the input image dimensions
at 224x224 pixels. Subsequently, the proposed study
incorporated a pre-trained ResNet-152 model for feature
extraction.
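The resize step can be illustrated with a minimal nearest-neighbour sketch; an actual pipeline would use a library routine (e.g. an image library's resize with interpolation), so this only shows the idea of mapping every image to the fixed 224x224 input grid.

```python
def resize_nearest(pixels, out_w=224, out_h=224):
    """Resize a 2-D pixel grid (list of rows) to out_w x out_h using
    nearest-neighbour sampling -- a minimal stand-in for the library
    resize step that precedes feeding images to the CNN."""
    in_h, in_w = len(pixels), len(pixels[0])
    return [[pixels[(y * in_h) // out_h][(x * in_w) // out_w]
             for x in range(out_w)]
            for y in range(out_h)]
```

Applied to a 1024 × 768 input, this yields a 224 × 224 grid whose corners sample the corresponding corners of the original image.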
FIGURE 4. Original RGB input image samples from each subtype and
their corresponding HSV color space distribution diagram, binary
masked image and segmented image.
D. DATA NORMALIZATION
Normalization of the pixel intensity of an image is a
commonly adopted technique in image processing to enhance
model convergence at the training phase. It mainly ensures that
all the pixel values fall within a specified range thus providing
a standardized representation. It also helps optimization
algorithms to adjust the weights more effectively. To
normalize the data, the global mean and global standard
deviation (SD) were computed from all the pixel values across
the entire image dataset, representing the overall distribution
of pixel values. Subsequently, the data underwent
normalization using Equation (1):

Zi = (Xi − µ) / (σ + ε)                                      (1)

where Zi is the normalized value of pixel Xi, µ represents the global mean computed over the entire image set X, with i ∈ {1, 2, 3, ..., 3256}, σ represents the global standard deviation, and ε is a small constant added to avoid division by zero.
This normalization is applied to every pixel in each image in the dataset, centering pixel intensities on the global mean and scaling them by the global standard deviation, making them more suitable for training deep learning models.
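Equation (1) can be sketched in a few lines of Python; the example below treats each image as a grid of grayscale intensities for brevity, whereas the actual data has three channels.

```python
from statistics import fmean, pstdev

def global_zscore_normalize(images, eps=1e-8):
    """Normalize every pixel of every image using the global mean and
    global (population) standard deviation of the whole dataset,
    as in Equation (1): Z_i = (X_i - mu) / (sigma + eps)."""
    all_pixels = [p for img in images for row in img for p in row]
    mu = fmean(all_pixels)
    sigma = pstdev(all_pixels)
    return [[[(p - mu) / (sigma + eps) for p in row] for row in img]
            for img in images]
```

For a dataset containing only all-black and all-white images, the pixels map to approximately −1 and +1 respectively.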
E. DATA AUGMENTATION
In the context of image processing, data augmentation
involves introducing additional diversity in the training data
through various transformation techniques. While avoiding
distortion of the vital and meaningful information of the
images, these transformations enhance the model's resilience, enabling it to adapt to changes in real-world data and perform well on new, unseen data. Therefore, we incorporated various transformation techniques, including flipping the images vertically and horizontally, varying contrast, adjusting brightness and introducing JPEG noise. The standard color augmentation used in [73] is also incorporated in this research. Figure 2 presents random samples of original, unsegmented images across all the classes (Benign, Early Pre-B, Pre-B, and Pro-B ALL) considered in this research, followed by preprocessing.
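A simplified version of such an augmentation pipeline can be sketched as follows; the contrast and JPEG-noise steps are omitted here, the probability and brightness ranges are illustrative assumptions, and the images are treated as grayscale grids for brevity.

```python
import random

def hflip(img):
    """Horizontal flip: reverse each row of pixels."""
    return [row[::-1] for row in img]

def vflip(img):
    """Vertical flip: reverse the order of the rows."""
    return img[::-1]

def adjust_brightness(img, factor):
    """Scale pixel intensities, clipping to the 0..255 range."""
    return [[min(255, max(0, round(p * factor))) for p in row] for row in img]

def augment(img, rng=random):
    """Apply a random combination of the transformations above -- a
    simplified stand-in for the augmentation pipeline described in
    the text (flip probabilities and brightness range are assumed)."""
    if rng.random() < 0.5:
        img = hflip(img)
    if rng.random() < 0.5:
        img = vflip(img)
    return adjust_brightness(img, rng.uniform(0.8, 1.2))
```

Each call produces a slightly different variant of the input while keeping its dimensions and content intact.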
F. BALANCING IMBALANCED CLASSES
Class imbalance problems frequently exist in the medical
domain in particular, stemming from the intricacies of
obtaining manually annotated images. In medical imaging,
certain classes of interest, such as rare diseases or specific
abnormalities, may have significantly fewer instances
compared to more common cases [20, 49, 55]. If the model is
trained on the imbalanced data, it might become biased toward the majority class, leading to skewed model predictions and hindering the performance of the learning model.
Furthermore, the scarcity of data in the minority classes can
result in inadequate representation, making it difficult for the
model to learn and distinguish these less prevalent patterns
[55]. To address this issue effectively random oversampling is
widely used in existing literature [58, 83]. In random
This article has been accepted for publication in IEEE Access. This is the author's version which has not been fully edited and
content may change prior to final publication. Citation information: DOI 10.1109/ACCESS.2024.3368031
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License. For more information, see https://creativecommons.org/licenses/by-nc-nd/4.0/
oversampling, instances from the minority class are randomly replicated and added to the training data to balance the imbalanced class [58, 83]. It can be observed that our dataset also suffers from the class imbalance problem (see Table I); the number of Hematogones (Hem) samples is comparatively lower. Therefore, in this study we incorporated the random oversampling technique to build a generalized model exposed to a more equitable distribution of instances of each class.
FIGURE 5. An abstract overview of the ResNet-152 based architecture for feature extraction and classification.
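A minimal sketch of random oversampling, assuming a simple list-based dataset (the class labels and counts below are illustrative only):

```python
import random
from collections import Counter

def random_oversample(samples, labels, seed=42):
    """Balance classes by randomly duplicating minority-class
    samples until every class matches the majority count."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_s, out_l = list(samples), list(labels)
    for cls, n in counts.items():
        idx = [i for i, y in enumerate(labels) if y == cls]
        for _ in range(target - n):
            i = rng.choice(idx)           # replicate a random minority sample
            out_s.append(samples[i])
            out_l.append(labels[i])
    return out_s, out_l

# toy example: 'Hem' is the minority class, as in Table I
X = ['img%d' % i for i in range(10)]
y = ['ALL'] * 7 + ['Hem'] * 3
Xb, yb = random_oversample(X, y)
```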
G. CLASSIFIER
In this study, we examined seven of the most commonly used pre-trained network architectures: AlexNet [39], VGGNet [40], Inception [41], DenseNet-121 [42], ResNet-18 [43], ResNet-50 [44] and ResNet-152 [45]. Among the architectures considered, DenseNet-121 and ResNet-152 demonstrated superior performance; after careful evaluation, ResNet-152 was chosen as it exhibited the best performance. Its deeper architecture and skip connections enable it to capture intricate patterns and features in the data effectively [84]. Therefore, ResNet-152 was selected for the proposed method and subsequently fine-tuned on the ALL dataset. It is worth noting that there is a trade-off between performance and computational complexity, which was taken into consideration during the selection process.
H. RESNET-152
ResNet-152 is a cutting-edge deep neural network with 152 layers that enables the extraction of implicit and complex relationships from the input data. It is also well known for its multiple residual blocks, which learn residual representation functions. These functions enable the training of exceptionally deep neural networks while dealing with the vanishing gradient problem. Therefore, this model excels in accuracy and performance across various computer vision tasks, making it a popular choice for transfer learning [34]. Various pre-trained ResNet models, such as ResNet-18, ResNet-34, and ResNet-50, are available in open source through high-level APIs including Keras and fastai. ResNet-152 consists of blocks, and each block typically contains several layers. Furthermore, the layer configuration typically consists of these main building blocks: convolution, batch normalization (BN), activation function, pooling, fully connected, dropout and output block.
These blocks are stacked on top of each other to form the complete model. The first three blocks (convolution, BN and activation) are usually combined with an optional pooling layer to extract useful features from the input data, such as an image. Batch normalization and dropout are optional layers designed primarily to reduce training time and over-fitting. The pooling layer performs spatial pooling, reducing the spatial dimensions of the feature maps, and its output is subsequently flattened into a single vector that can be fed into the input of the next layer. The fully connected layer of the CNN estimates the optimal weights through the back-propagation process. The weights associated with each node are used to determine the corresponding class scores and labels for each input image. A classical network architecture may consist of repetitions of a stack of several convolution layers, as can be seen in Figure 5.
As discussed earlier, deep learning techniques have the
potential to scale and handle complicated problems and also
provide implicit capabilities to extract optimal features from
unstructured data [46, 47]. Therefore, deep learning
algorithms perform better when compared to classical
machine learning algorithms [48]. However, when it comes to
deep learning models, the training phase demands substantial
efforts to mitigate the risk of over-fitting. Tuning the optimal
hyper-parameters is also a significant challenge that
necessitates expertise and extensive experimentation.
Additionally, a single deep learning model might be inherently limited when dealing with highly variable and distinctive image datasets, especially if only a few samples are available.
In such cases, ensemble learning can be a valuable and
powerful tool for improving predictive accuracy and
generalization. Instead of relying on a single model, ensemble
learning constructs an ensemble of diverse models, each with
its strengths and weaknesses [49, 50]. By leveraging the diversity among these models, the ensemble can capture diverse patterns in the data and collectively make more reliable and robust predictions. Bagging, boosting, and
stacking are frequently used ensemble learning techniques,
each with its unique way of aggregating predictions. Ensemble
learning has proven to be highly effective in tackling complex
and challenging problems [48]. It is widely used in several
domains, including image classification, medical diagnosis
and natural language processing, where it has achieved
optimal results by leveraging the strengths of different models
and mitigating their weaknesses [50].
I. ENSEMBLE LEARNING
The key advantage of ensemble learning lies in its ability to enhance predictive accuracy and overall generalization by leveraging the strengths of diverse models while mitigating their individual weaknesses.
The literature discussed above on automatic ALL recognition from microscopic images demonstrates that various CNN-based deep learning techniques have been widely used recently (see details under Section 2 and in Table II). Although an extensive amount of literature has already been published, there is still room for performance improvement and for broader, more generalized models. Additionally, CNN-based methods struggle with data scarcity when trying to prevent model overfitting; however, an ensemble of different CNN network architectures alleviates these constraints [29, 30, 81, 82]. Although the existing literature indicates that various articles have used ensemble learning methods [49, 50], the weighted ensemble learning method has not been applied to a CNN-based ResNet-152 network architecture in the setting of acute lymphoblastic leukemia (ALL) and on this specific dataset.
In this study, we constructed a homogeneous ensemble model
using ResNet-152 as the base learner to categorize benign
(HEM, healthy cell) and malignant (ALL, blast cell) samples,
as well as to perform subtype detection. We adopted the
parallel ensemble building technique [51] where independent
data samples are fed into each base learner simultaneously,
thus exploiting the independence among base learners as can
be seen in Figure 6. Hence, as each base learner in the
ensemble model makes distinct errors, the ensemble model
can effectively average out these errors [52].
FIGURE 6. Weighted average voting ensemble learning method using
CNN based architecture.
J. FUSION METHOD
Output fusion is the process of aggregating the outputs of individual base learners into a unified and coherent output. According to recent literature, the most frequently used fusion method is average voting, despite the fact that it is biased towards weak learners and is not a suitable approach for integrating the outputs of base learners [53]. To overcome this problem, we incorporated the weighted average voting method as the fusion method. The weighted average approach gives each base model a certain weight for a given class, depending on how much the model contributes to that class. Rather than using traditional hyperparameter tuning, these weights were updated using a feedforward neural network.
In this study, the output layer of our weighted deep ensemble learning model consists of four nodes, one for each class (n = 4) considered relevant in this study. A convolutional neural network assigns probability values P_j ∈ ℝ, with P_j ∈ [0, 1], to a previously unseen test image, ∀ j ∈ {1, 2, ..., n}, such that Σ_{j=1}^{n} P′_j = 1 over all the classes. Given the number of classes n and the number of base classifiers K, with k ∈ {1, 2, 3}, let p(a_j^k) be the predicted accuracy of base classifier k for class j. We proceed to calculate the weight for each classifier k as denoted below:

W_k = (p(a_1^k) + p(a_2^k) + ... + p(a_n^k)) / n            (2)

Here W_k is the calculated weight value for each base classifier CNN_k, k ∈ {1, 2, 3}. The estimated weight value W_k is further utilized to calculate the weighted average, denoted as P′_j:

P′_j = Σ_{k=1}^{K} W_k · p_j^k / Σ_{k=1}^{K} W_k            (3)

In this method we used the area under the receiver operating characteristic curve (AROC) as the evaluation score, which was further used to calculate the average weight for each classifier. Under the AROC scheme, the probability value P′_j for each class j is derived as follows:

P′_j(AROC) = Σ_{k=1}^{K} W_AROC^k · p_j^k / Σ_{k=1}^{K} W_AROC^k            (4)
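The weighted average voting fusion described above can be sketched as follows (a NumPy illustration; the probability vectors and AROC-style weights are made-up values for three hypothetical base learners):

```python
import numpy as np

def weighted_average_vote(probs, scores):
    """Fuse base-learner outputs by weighted average voting.
    probs  : (K, n) array of softmax outputs from K base learners
             over n classes.
    scores : length-K evaluation scores per learner (per-class
             accuracies averaged as in Eq. 2, or AROC values as
             in Eq. 4).
    Returns the fused class probabilities P'_j (Eq. 3)."""
    w = np.asarray(scores, dtype=float)
    return (w[:, None] * np.asarray(probs)).sum(axis=0) / w.sum()

# three ResNet-152 base learners, four classes (illustrative values)
probs = np.array([[0.70, 0.10, 0.10, 0.10],
                  [0.60, 0.20, 0.10, 0.10],
                  [0.40, 0.40, 0.10, 0.10]])
weights = [0.99, 0.98, 0.90]          # e.g. AROC scores per learner
fused = weighted_average_vote(probs, weights)
pred = int(fused.argmax())
```

Because each learner's output is a probability distribution, the weighted average remains a valid distribution, and stronger learners pull the fused prediction toward their vote.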
K. EXPERIMENTAL SETUP
The proposed system is implemented in the Python programming language using various built-in libraries, including fast.ai, a simple yet advanced deep-learning library built on top of PyTorch that provides high-level abstractions to build and train the CNN-based pre-trained ResNet-152 network architecture. Implementation was done on an HP laptop with a SATA (Serial Advanced Technology Attachment) SSD drive, an Intel Core i7 10th Generation processor, and 16 GB of RAM. Training was carried out on Google Colaboratory using an NVIDIA Tesla P100 GPU.
L. TRAINING POLICY
In this study, our implementation and configuration of the ResNet-152 network architecture follow the practice in [73, 74]. The initial phase of model development involves establishing the architecture of the CNN. As mentioned earlier, a typical CNN architecture comprises a series of various types of blocks and layers, often tailored to the specific application and data characteristics. The foremost layer in this architecture is the input layer, which sets the dimensions of the input images for the network, including the height, width and number of color channels of the given image. To maintain uniformity with existing research [34, 73], we set the input image dimensions to 224x224x3 pixels.
Furthermore, each input was an image pair comprising both the original PBS sample and its corresponding segmented version; we therefore performed stacking, or concatenation, of both image types. This concatenation serves to retain a more comprehensive dataset and mitigate the potential loss of information that can occur during segmentation, since segmentation can sometimes result in the removal of pertinent details. This approach mainly aims to equip the model with a richer set of information, thus enhancing its capability to learn and generalize from the data [34]. Each image is resized and augmented as mentioned above, incorporating the standard color augmentation used in [73].
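One plausible reading of this stacking step is channel-wise concatenation of each image pair, sketched below (an assumption for illustration; the text does not specify the concatenation axis, and a six-channel input would require an adapted first layer):

```python
import numpy as np

def stack_pair(original, segmented):
    """Combine an original PBS image with its segmented version by
    concatenating along the channel axis (one plausible reading of
    the 'stacking or concatenation' step described above)."""
    assert original.shape == segmented.shape == (224, 224, 3)
    return np.concatenate([original, segmented], axis=-1)

orig = np.zeros((224, 224, 3), dtype=np.float32)   # placeholder PBS image
seg = np.ones((224, 224, 3), dtype=np.float32)     # placeholder segmentation
pair = stack_pair(orig, seg)
```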
FIGURE 7. Filters from layer 2 of ResNET-152 with one subplot per channel
The second layer, the convolutional layer, contains batch normalization and activation function layers along with small filters; Figure 7 represents the filters from ResNet-152 with one subplot per channel. Filters are typically followed by non-linear activation functions such as ReLU. Each filter is responsible for extracting specific patterns by sliding across the image, computing a dot product between the pixel values and its weights in the overlapping region [61]. Filters also allow the network to recognize patterns at different levels of abstraction. The convolution and pooling layers preserve the spatial hierarchy among feature maps while processing the input data. The batch normalization layer, however, overcomes internal dependencies among layers, allowing them to contribute more independently to the learning process by managing covariate shift. As the layers progress deeper into the network, they can detect increasingly complex and high-level features [17].
The third layer is an optional layer called the pooling layer, responsible for spatial pooling: it decreases the spatial dimensions of the feature maps, providing a form of down-sampling by focusing on the intended region of the input image. In this research we employed max pooling, as it is the most widely used type of pooling layer in CNNs, including the ResNet-152 network architecture. The ResNet-152 network consists of several residual blocks, and each block contains multiple convolution layers.
The intermediate outputs of each convolutional layer within a residual block are feature maps. Feature maps are crucial for capturing the distinct cellular patterns associated with different subtypes. In this study, we used 64 feature maps of size 8 x 8. The gradients of the loss with respect to the input image pixels are calculated and the output is converted to a grayscale image. The entire input data set is used during the feature extraction process; Figure 8(A) displays 64 grayscale feature maps extracted from an image of the Pro-B subtype depicted in Figure 1. These feature maps are derived from the second layer of the ResNet-152 model. A critical review of the feature maps depicted in Figure 8 shows that in layer 2 the network learned edges and textures. During feature map extraction it can also be observed that more abstract object detectors are learned at higher layers; this process gives visual insight into the activated feature maps at layer 3, block 5, as can be seen in Figure 8(B). The number of features at higher layers is so high that the images become non-interpretable. Further feature maps obtained from different higher levels are available in the supplementary file. These maps enable interpretable insights into the model's decision-making process, aiding clinicians in understanding and localizing abnormalities within blood cell images. Ultimately, feature maps contribute to enhanced diagnostics and more precise identification of leukemia subtypes.
On the other hand, the fully connected layer processes a 1-D feature vector. Therefore, the next layer is a flatten layer used to
transform the 2-D max-pooled matrix into a 1-D array; it computes the average value of each feature map, decreasing the spatial dimensions to 1x1, so that each element of this array can be used as an input to the fully connected layer. This layer is a simple fully connected feed-forward network whose input layer receives the output of the flatten layer and transforms the spatial features into a format appropriate for classification [29, 30]. It is nevertheless challenging to comprehend how these features are applied, and whether they are used to eliminate a class from consideration or to predict the class.
At the end we have an output layer, in which we added four output nodes (each node represents a particular ALL subtype considered relevant in this study). Furthermore, the softmax activation function is commonly used in the output layer to obtain class probabilities, particularly in multiclass problems.
For optimization we incorporated the Adaptive Moment (Adam) estimation optimizer, a frequently used dynamic second-moment optimizer [54] that enhances the performance of gradient descent. It handles rapidly decreasing learning rates by integrating the strengths of momentum and RMSprop (Root Mean Square Propagation), allowing faster convergence and effectively dealing with sparse data. During the training phase, it dynamically updates the learning rate of each individual parameter based on the recent magnitudes of the gradients. In this study, the initial learning rate, which controls the step size during convergence, was set to 0.01. The parameter values for β1 and β2 were set to 0.9 and 0.999, respectively, to achieve the desired behavior of the Adam optimizer. For numerical stability, the default value of ε (epsilon) was set to 1e-8.
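The Adam update with these hyper-parameters can be illustrated with a single hand-rolled step (a NumPy sketch of the standard Adam rule with the values reported above, not the actual fast.ai training loop):

```python
import numpy as np

def adam_step(w, g, m, v, t, lr=0.01, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update using the hyper-parameters described above
    (lr=0.01, beta1=0.9, beta2=0.999, epsilon=1e-8)."""
    m = b1 * m + (1 - b1) * g                  # first moment (momentum)
    v = b2 * v + (1 - b2) * g ** 2             # second moment (RMSprop-like)
    m_hat = m / (1 - b1 ** t)                  # bias correction
    v_hat = v / (1 - b2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v

w = np.array([1.0, -1.0])                      # toy parameters
m = np.zeros_like(w)
v = np.zeros_like(w)
g = np.array([0.5, -0.5])                      # toy gradient
w, m, v = adam_step(w, g, m, v, t=1)
```

On the first step the bias-corrected update is approximately lr times the sign of the gradient, so each parameter moves by about 0.01.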
The batch size is explicitly set to 8, implying that 8 images are processed in each iteration during the training of the model. Additionally, the number of epochs is set to 85, with early stopping criteria to minimize overfitting. Subsequently, categorical cross-entropy is used as the loss function to measure the discrepancy between the actual and predicted probability distributions of the model, while accuracy is chosen as the measure to evaluate the performance of the model. For C classes and a single data point, the cross-entropy loss is calculated using the following equation, where P_c represents the predicted probability for class c and Y_c represents the actual label for class c:

L(Y, P) = − Σ_{c=1}^{C} Y_c · log(P_c)            (5)
During the training process, if the validation accuracy
remains stagnant for ten consecutive epochs, the learning
rate will be reduced by 20%, continuing until no further
improvement is achieved. The training process also employs
the Cosine learning rate annealing scheduler. Additionally, if
the validation loss remains stable for 20 epochs, and the
learning rate reaches the minimum threshold value of 0.0001
without any further improvement, the training process is
halted. Throughout training, only the weights associated with the best performance on the validation set are saved. This ensures that the model captures the best state attained during training, leading to enhanced generalization on unseen data.
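The learning-rate reduction and early-stopping schedule described above can be sketched as a simple callback (a hedged illustration; the class name and epoch loop are placeholders, and the real implementation uses fast.ai callbacks together with cosine annealing):

```python
class TrainingPolicy:
    """Sketch of the schedule described above: if validation accuracy
    stalls for 10 epochs the learning rate drops by 20%; training
    halts once validation loss has been flat for 20 epochs and the
    learning rate has reached the 0.0001 floor."""
    def __init__(self, lr=0.01, min_lr=1e-4):
        self.lr, self.min_lr = lr, min_lr
        self.best_acc, self.acc_stall = 0.0, 0
        self.best_loss, self.loss_stall = float('inf'), 0

    def on_epoch_end(self, val_acc, val_loss):
        if val_acc > self.best_acc:
            self.best_acc, self.acc_stall = val_acc, 0
        else:
            self.acc_stall += 1
        if self.acc_stall >= 10:                       # accuracy stagnant
            self.lr = max(self.lr * 0.8, self.min_lr)  # reduce LR by 20%
            self.acc_stall = 0
        if val_loss < self.best_loss:
            self.best_loss, self.loss_stall = val_loss, 0
        else:
            self.loss_stall += 1
        # stop when loss is flat for 20 epochs at the minimum LR
        return self.loss_stall >= 20 and self.lr <= self.min_lr

policy = TrainingPolicy()
stop = False
for epoch in range(250):          # stagnant metrics eventually force a halt
    stop = policy.on_epoch_end(val_acc=0.99, val_loss=0.002)
    if stop:
        break
```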
FIGURE 8. (A) Visualization of the grayscale feature maps (B) Activated
colored feature maps extracted from an image of the Pro-b subtype, as
depicted in Figure 1.
M. EVALUATION METRICS
In this study, to comprehensively assess the capabilities of
the proposed model we incorporated clinically important
performance metrics such as precision, recall, specificity,
sensitivity, AROC, weighted accuracy and Matthews
Correlation Coefficient (MCC) [55]. We also incorporated the F-measure to evaluate the performance of our proposed model, as accuracy alone is not an adequate performance measure when dealing with imbalanced or asymmetrical datasets. Mathematically, these measures are described as follows:
Precision = Σ_{i=1}^{l} TP_i / Σ_{i=1}^{l} (TP_i + FP_i)            (6)

Recall = Σ_{i=1}^{l} TP_i / Σ_{i=1}^{l} (TP_i + FN_i)            (7)

F-Measure = 2 / (1/Precision + 1/Recall)            (8)

MCC = (TP·TN − FP·FN) / √((TP+FP)(TP+FN)(TN+FP)(TN+FN))            (9)
where TP denotes the number of true positives, TN the number of true negatives, FP the number of false positives, FN the number of false negatives, and l the number of classes. All the obtained results are expressed as percentages, as depicted in Table II.
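Given aggregate TP, TN, FP, and FN counts, Equations (6)-(9) can be computed directly (the counts below are illustrative only, not the paper's reported results):

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute precision, recall (sensitivity), F-measure,
    specificity, and MCC from aggregate confusion counts,
    following Eqs. (6)-(9)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)                      # equals sensitivity
    f_measure = 2 / (1 / precision + 1 / recall)
    specificity = tn / (tn + fp)
    mcc = (tp * tn - fp * fn) / (
        ((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)) ** 0.5)
    return precision, recall, f_measure, specificity, mcc

# illustrative counts for a binary benign-vs-malignant split
p, r, f, s, m = classification_metrics(tp=569, tn=211, fp=2, fn=1)
```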
IV. RESULTS AND DISCUSSION
Here, we introduce a weighted deep ensemble learning method for image classification and cell segmentation that provides a scalable and efficient way to identify ALL samples from PBS images. In this section, our initial focus is on presenting the evaluation results on the test data. To achieve a more reliable estimation of our model's performance, we incorporated K-fold cross-validation, which determines the degree to which the obtained results are independent of the training data. It is especially useful when dealing with a limited dataset, as across different iterations all the data points contribute to both the training and test data. In this study we used 10-fold cross-validation to evaluate the performance of our model. Table II presents the weighted average voting results obtained after 10-fold cross-validation on the oversampled dataset, comprising a total of 3940 samples; in this oversampled dataset, each class is represented by 985 sample pairs.
The results demonstrate that the deep ensemble learning model obtained optimal results on both the benign and malignant categories. The model achieved an F1-score of 99.85%, precision of 99.90%, and recall of 99.80% on the hematogones (Hem) class, which falls under the benign category. Furthermore, the high sensitivity (99.80%) and specificity (99.93%) values demonstrate the model's ability to correctly classify a significant portion of the actual benign Hem test images. Thus it not only minimizes the probability of overlooking positive instances but also demonstrates its competence in minimizing the false positive rate. An MCC of 0.703 further exhibits the robustness of the deep ensemble model in this category. The weighted accuracy, though slightly lower, is still impressive at 99.86%. Early Pro-B is one of the three subtypes within the malignant category, as mentioned earlier.
In the context of this subtype, the model demonstrated precision, recall, and F1-score values of 99.80%, 99.90%, and 99.86%, respectively. The model's recall, at 99.90%, indicates its strength in capturing an optimal true positive rate. This leads to a robust F1-score of 99.86%, reflecting the harmonic balance between recall and precision and confirming the deep ensemble model's ability to generalize in recognizing test samples from the Early Pro-B subtype. The proposed model achieved perfect performance on the Pre-B ALL and Pro-B ALL subtypes, reaching 100% in precision, recall, F1-score, sensitivity, and specificity; the MCC value of 1.0 indicates flawless predictions in these categories.
Furthermore, the proposed model achieved high overall performance when aggregating across all the classes included in this study: precision of 99.92%, recall of 99.92%, F1-score of 99.90%, sensitivity of 99.92%, and specificity of 99.97%. The balanced performance is also reflected in the MCC of 0.897 and weighted accuracy of 99.95%. High MCC values across all categories indicate robust model performance, considering both false positives and false negatives. A close observation of the results across all the classes confirms the supremacy of the weighted deep ensemble learning model over a single CNN architecture. Furthermore, it signifies the proposed model's ability to accurately diagnose and identify specific ALL subtypes. Hence, it is likely to save both diagnosis time and the efforts
TABLE II. PERFORMANCE EVALUATION OF THE PROPOSED DEEP ENSEMBLE LEARNING-BASED MODEL USING 10-FOLD CROSS VALIDATION METHOD ON BLOOD SMEAR IMAGES.

Category  | Class     | Recall | Sensitivity | Specificity | MCC   | Weighted Accuracy
Benign    | Hem       | 99.80  | 99.80       | 99.93       | 0.703 | 99.86
Malignant | Early     | 99.90  | 99.90       | 99.97       | 0.997 | 99.93
Malignant | Pre-b ALL | 100    | 100         | 100         | 1.00  | 100
Malignant | Pro-b ALL | 100    | 100         | 100         | 1.00  | 100
Total     |           | 99.92  | 99.92       | 99.97       | 0.897 | 99.95

Hem, Hematogones; ALL, Acute lymphoblastic leukemia; MCC, Matthews Correlation Coefficient.
of clinicians and patients. In this study we also incorporated the hold-out method to evaluate the performance of our weighted deep ensemble learning model; in this approach the blood smear image dataset is divided into training and test data at 80% and 20%, respectively. To provide a better view and
FIGURE 9. (A) The confusion matrix of the weighted deep ensemble learning
model evaluation on the test data with a total of 783 unseen test samples
(213-Benign (Hematogones), 206-Early Pre-b, 175-pre-b and 189-Pro-b
samples) with the resolutions of 224x224 pixels. (B) Corresponding
normalized confusion matrix.
deeper insight into the results obtained on the test data, we utilized a confusion matrix as the performance evaluation indicator. Each cell in this matrix shows the proportion of each subtype among the predicted images. The diagonal values correspond to correctly classified images, while the remaining entries represent misclassified instances in each subtype.
Figure 9 demonstrates that our fine-tuned model accurately identifies the majority of samples. It misclassifies only 2 images of the benign category (Hem) as Pre-b (false positives), a subtype of the malignant category, and 1 instance of the Pre-b subtype as benign (Hem), perhaps due to the substantial similarity between these two classes [34], while the true positive rate is 100% for the remaining two subtypes. The main diagonal of Figure 9 shows the number of correctly classified samples. It is also noticeable that the false positive rate is significantly low, minimizing the probability that benign instances are mistakenly classified as malignant or vice versa. The model's ability to reduce false positives is especially valuable in the context of patient care, as false positives can trigger unnecessary patient fears, stress, and medical procedures. By lowering the possibility of such false positives, the proposed model helps provide more accurate and trustworthy preliminary leukemia diagnoses. Hence, it signifies the proposed model's ability to enhance the accuracy of preliminary diagnoses and informed decision making. Experimental results demonstrated that the proposed method classifies well over 90% of the samples correctly, even when dealing with an unbalanced dataset.
TABLE III. SUMMARY OF DEEP LEARNING AND ML BASED METHODS USED FOR ALL CLASSIFICATION

Author(s), year | Features Type | Method | Test Data | Accuracy
Zhou et al. [56] | CNN features | RetinaNet, VGG, ResNext101, ResNext50, ResNet50 and the Feature Pyramid Network | 346 | 97%
Jha and Dutta [57] | Statistical and Local Directional Pattern (LDP) features | Chronological SCA-based Deep CNN | — | 98.7%
Almadhor et al. [58] | CNN features | KNN, Random Forest, SVM, and Naive Bayes | 4456 | 90%
Nizar Ahmed et al. [59] | CNN features | CNN, naive Bayes, support vector machine, KNN and decision tree | 245, 231 | 88.25% and 81.74%
Sanam Ansari et al. [60] | CNN features | CNN model with the Tversky loss function | 187 | 99.5%
Chayan Mondal et al. [61] | Texture, size and shape features | Ensemble model based on VGG-16, MobileNet, InceptionResNet-V2, and DenseNet-121 | 1867 | 94%
Maryam Bukhari et al. [62] | CNN features | CNN with squeeze and excitation learning | 22 | 97.06%
Niranjana Sampathila et al. [63] | CNN features | ALLNET | 2132 | 95.54%
Rezayi et al. [66] | CNN features | ResNet-50 and VGG-16 | 2506 | 84%
Proposed approach | CNN features | ResNet-152 based weighted ensemble (compared with VGGNet, Inception, DenseNet-121, AlexNet, ResNet-18, ResNet-50) | 783 | 99.97%
FIGURE 10. Weighted deep ensemble learning model‘s loss
function with respect to epochs number on training and validation sets.
Figure 11 represents the convergence trend of our model's accuracy during the training and testing phases, plotted against the number of epochs. In this research, to obtain optimal results we conducted experiments with different numbers of epochs. Although we initially increased the number of epochs to 100, this resulted in more running time without significant progress in accuracy. Therefore, we set the number of epochs to 85 to train the proposed model; however, it can be observed that the model converged to its saturation point after approximately 30 epochs, with training and validation accuracies of 99.97% and 99.86%, respectively. Figure 10 depicts the corresponding training and validation losses of 0.0016 and 0.0018, respectively.
FIGURE 11. Weighted ensemble learning model‘s accuracy with
respect to epochs number on training and validation sets.
The experimental results show that the proposed model consistently demonstrated high accuracy even with the learning rate set towards its lower bound and a limited number of epochs, implying that extending the number of epochs or increasing the learning rate could potentially yield further improvements. We also incorporated training and test accuracy, along with the loss function, as evaluation metrics to measure the performance of our proposed model. Accuracy demonstrates how well the model generalizes to new data; it is the ratio of correctly classified cases to the total number of cases, and can be categorized into training accuracy and test accuracy, obtained from the training and test data respectively. The loss function quantifies how well the model's output matches the actual output and is used to update the weights of each node in the neural network architecture. It is also used to calculate the training and validation losses, which verify that the trained model is converging and learning from the data.
FIGURE 12. Comparative analysis between weighted deep ensemble
learning model and the individual model (ResNet-152) in term of AROCs.
In this study, to further explore the performance of the proposed method, we also report a clinically important performance metric, the area under the receiver operating characteristic curve (AROC). In Figure 12, an AROC value of 0.999 illustrates that our weighted deep ensemble learning model outperforms the individual CNN-based model (ResNet-152). Notably, the AROC value for the individual ResNet-152 is 0.98, indicating relatively lower performance. This result demonstrates the effectiveness of our approach in both diagnosing and categorizing each class, and it emphasizes the potential for significant improvements from an ensemble approach that aggregates predictions from multiple models [48]. The experimental results also show that the ensemble model with ResNet-152 as the base learner yields enhanced performance, at the cost of longer training time and additional computational overhead.
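The aggregation step of a weighted ensemble can be sketched as a weighted average of each base learner's softmax outputs. The per-model class probabilities and the weights below are illustrative stand-ins, not values produced by the trained models:

```python
import numpy as np

# Hypothetical softmax outputs of two base learners for 3 samples x 4
# classes (benign, Early Pro-B, Pro-B, Pre-B) -- illustrative only.
preds_a = np.array([[0.7, 0.1, 0.1, 0.1],
                    [0.2, 0.5, 0.2, 0.1],
                    [0.1, 0.1, 0.7, 0.1]])
preds_b = np.array([[0.6, 0.2, 0.1, 0.1],
                    [0.1, 0.6, 0.2, 0.1],
                    [0.2, 0.1, 0.6, 0.1]])
weights = np.array([0.6, 0.4])   # assumed validation-derived weights

# Weighted average of probabilities, then argmax per sample
ensemble = weights[0] * preds_a + weights[1] * preds_b
labels = ensemble.argmax(axis=1)
print(labels)                    # [0 1 2]
```

In practice the weights would be chosen from validation performance, so stronger base learners contribute more to the final decision.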
To place our proposed model in a broader context, we comprehensively evaluated the performance of our proposed method by conducting a comparative analysis involving seven renowned network architectures: AlexNet [39], VGGNet [40],
This article has been accepted for publication in IEEE Access. This is the author's version which has not been fully edited and
content may change prior to final publication. Citation information: DOI 10.1109/ACCESS.2024.3368031
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License. For more information, see https://creativecommons.org/licenses/by-nc-nd/4.0/
Inception [41], DenseNet-121 [42], ResNet-18 [43], ResNet-50 [44], and ResNet-152 [45]. These architectures have been widely used in the existing literature for the classification of acute leukemia.
Subsequently, we analyzed the AROC values for each individual model to highlight the distinctions among them. The areas under the receiver operating characteristic curve (AROC) in Figure 13 also reveal the superiority of our weighted deep ensemble learning model, with an AROC value of 0.999, over all the individual CNN-based models (ResNet-152, VGGNet, Inception, DenseNet and AlexNet). This analysis contextualizes our results within the landscape of state-of-the-art classifiers for acute leukemia classification and illustrates the effectiveness of the proposed approach in both diagnosing and categorizing each class. By contrast, the AROC value for the individual ResNet-152 is 0.98, which is comparatively low, and the AROC values for ResNet-18 and Inception are similar. ResNet-50 and DenseNet-121 performed slightly better than VGGNet, with AROC values of 0.97, 0.96 and 0.95, respectively, as can be seen in Figure 13. Furthermore, the slightly higher AROC value of ResNet-50 (0.97) compared to DenseNet-121 (0.96) suggests that ResNet-50 excels at capturing the complex patterns present in the image dataset; its deeper architecture likely enables it to learn intricate patterns and characteristics that contribute to better performance. The lower AROC value of 0.93 attained by AlexNet provides an interesting perspective on its efficiency in this classification context; this outcome might be ascribed to various architectural factors inherent in AlexNet.
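For a binary split such as ALL versus hematogone cases, the AROC can be computed directly via the rank-sum formulation: it equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one. The scores below are toy values, and this sketch does not handle tied scores:

```python
import numpy as np

def auroc(y_true, scores):
    """Area under the ROC curve via the Mann-Whitney rank-sum identity."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores)
    order = np.argsort(scores)
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)   # 1-based ranks by score
    n_pos = int(y_true.sum())
    n_neg = len(y_true) - n_pos
    # Sum of positive-case ranks, corrected for the minimum possible sum
    return (ranks[y_true == 1].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

y = [0, 0, 1, 1]                 # 1 = ALL, 0 = hematogone (toy labels)
s = [0.10, 0.40, 0.35, 0.80]     # model scores for the positive class
print(auroc(y, s))               # 0.75
```

For the multiclass setting used in this study, the same statistic is typically averaged over one-vs-rest binary splits.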
The data pre-processing techniques used in this study also contribute to the proposed model's efficacy. These pre-processing procedures are essential for the model's ability to learn relevant features from the data; two crucial phases of the pipeline are segmentation and the extraction of relevant feature maps. Clinicians can visualize and understand which features or patterns in the PBS images contribute to the model's predictions, and this level of transparency is crucial for building trust in those predictions. According to the activated feature maps extracted from the data (provided in the supplementary file), the model is learning edges, colors, and textures. However, it is unclear how these features are applied, and whether they are used to eliminate a class from the list of possibilities or to predict the class directly. The observations also suggest that color is the primary feature used to discriminate between classes.
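The observation that early feature maps respond to edges can be illustrated with a hand-coded convolution: a Sobel-style kernel applied to a synthetic image with a single vertical boundary activates only at that boundary. This is a didactic sketch, not a layer taken from the trained network:

```python
import numpy as np

def conv2d(img, kernel):
    """Valid 2-D cross-correlation with a 3x3 kernel (no padding)."""
    h, w = img.shape
    out = np.zeros((h - 2, w - 2))
    for i in range(h - 2):
        for j in range(w - 2):
            out[i, j] = np.sum(img[i:i + 3, j:j + 3] * kernel)
    return out

# Sobel kernel for vertical edges (horizontal intensity gradient)
sobel_x = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]])

# Synthetic image: dark left half, bright right half -> one vertical edge
img = np.zeros((5, 5))
img[:, 3:] = 1.0
edge_map = conv2d(img, sobel_x)   # responds only near the boundary
```

A learned convolutional filter behaves analogously, except that its weights are fitted to the data rather than fixed by hand.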
Finally, a comparative analysis between the weighted deep ensemble learning model and recent studies related to ALL diagnosis is presented in Table 3. These recent studies consistently report classification accuracies exceeding 90%. The complexity of the networks employed in prior studies has necessitated the use of deep processing units; for example, the standard Inception network consists of over ten processing blocks, each comprising more than ten convolutional layers. In contrast, our model is composed of five convolutional layers in total. Overall, the experimental results obtained in this study demonstrate that the model's high precision, recall, F1-measure and accuracy can translate into improved patient outcomes.
FIGURE 13. A comparison among the AROCs of seven different pretrained CNN-based models and the proposed weighted ensemble learning model.
In this research, we have also included essential details that provide a more in-depth understanding of our proposed web-based platform, shown in Figure 14. The platform is developed using HTML, CSS, and JavaScript, with Flask on the backend. Flask allows for the creation of routes and effective management of HTTP requests. On the frontend, HTML templates are employed to render web pages, and JavaScript code enhances user-interface interaction. The platform runs on a Flask server.
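A minimal sketch of such a route structure is shown below, assuming Flask is installed. The route names, the plain-text response, and the placeholder prediction logic are illustrative and do not reproduce the platform's actual code:

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route("/")
def index():
    # The real platform would render an HTML template here.
    return "ALL screening platform"

@app.route("/predict", methods=["POST"])
def predict():
    # Placeholder: the deployed platform would run the weighted ensemble
    # on the uploaded PBS image and return the predicted ALL subtype.
    uploaded = request.files.get("image")
    label = "Pre-B" if uploaded is not None else "no image supplied"
    return jsonify({"prediction": label})

# app.run() would start the Flask development server.
```

In this pattern, the trained ensemble is loaded once at startup and shared across requests, so inference latency rather than model loading dominates each prediction call.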
We further emphasize that future plans involve integrating and hosting the web platform on either a private server or an online cloud server, depending on specific requirements. This approach ensures scalability and accessibility beyond local execution, enhancing the overall utility and reach of our web application. To access the web-based platform, users are required to complete a registration process by providing the necessary credentials, which are used for verification and identification within the platform. Upon successful registration, users can avail themselves of the services offered by the platform.
FIGURE 14. A screenshot of the web-based platform, primarily based on the weighted deep ensemble learning model.
The limitations of this study include the reliance on a single dataset obtained from the Kaggle repository [34], which might limit the generalizability of the learned model to diverse populations. Despite its size, the dataset may not fully capture the range of variability found in real-world situations, which could affect the model's effectiveness when applied to unseen data. Furthermore, since evaluating only one diagnostic modality at a time is insufficient for an accurate diagnosis, an integration of various diagnostic modalities, such as cytomorphology of both peripheral blood and bone marrow, flow cytometry, and genetic and clinical data, seems warranted to build ML models that may aid clinical decision making.
The proposed research focuses specifically on the identification of acute lymphoblastic leukemia (ALL) and the classification of its subtypes; the suggested methodology may therefore not be applicable to other medical fields or hematological disorders. Additionally, even though oversampling was used to address the class imbalance problem, the dataset's intrinsic qualities may restrict the applicability of the proposed methodology to other medical domains. Lastly, the robustness of the suggested ResNet-152 network architecture and ensemble learning approach is limited by the lack of external validation on other datasets. These drawbacks highlight areas that need further investigation and development in order to increase the model's applicability to a variety of clinical contexts.
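Random oversampling, as referenced above for the class-imbalance problem, can be sketched as follows; the function and variable names are illustrative, and the study's actual resampling procedure may differ:

```python
import numpy as np

def random_oversample(X, y, seed=0):
    """Duplicate minority-class samples until every class matches the majority count."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    n_max = counts.max()
    Xs, ys = [X], [y]
    for c, n in zip(classes, counts):
        if n < n_max:
            # Sample (with replacement) extra indices from class c
            idx = rng.choice(np.where(y == c)[0], size=n_max - n, replace=True)
            Xs.append(X[idx])
            ys.append(y[idx])
    return np.concatenate(Xs), np.concatenate(ys)

X = np.arange(10).reshape(5, 2)      # 5 toy samples, 2 features
y = np.array([0, 0, 0, 1, 1])        # imbalanced: 3 vs 2
Xr, yr = random_oversample(X, y)     # each class now has 3 samples
```

Because duplicated samples carry no new information, such resampling balances the loss contribution of each class but cannot substitute for genuinely more diverse data.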
CONCLUSION
ALL is a prevalent disease in both adults and children, and it often requires costly, time-consuming and invasive diagnostic tests. Peripheral blood smear (PBS) images play a vital role in the early screening of ALL. While PBS images provide a noninvasive means of early diagnosis of ALL in a suspected individual, manual analysis of such images is subject to inter-observer variability and human error. Therefore, in this research we developed a web-based platform using a weighted deep ensemble learning model to diagnose acute lymphoblastic leukemia and its subtypes (benign, Early Pro-B, Pro-B, and Pre-B). To develop the deep ensemble learning model, the CNN-based ResNet-152 architecture is incorporated as the base learner. Experimental results and a comparative analysis across seven renowned CNN architectures demonstrated that the proposed web-based platform has the potential to accurately classify PBS images into cancerous or healthy categories, thus helping oncologists tailor treatment plans to individual patient needs. Moreover, the consistently high recall scores indicate its capability in identifying true-positive cases, which could lead to timely initiation of personalized treatment and thereby reduce the risk of delayed or suboptimal interventions. Additionally, while the proposed method exhibits exceptional performance within the confines of the dataset, its real-world applicability warrants rigorous evaluation and validation across diverse patient populations. In future work, we intend to expand our experiments by incorporating a hybrid deep learning approach that combines convolutional neural networks with recurrent neural networks to further enhance performance. Furthermore, the proposed platform could be used to find other types of abnormalities in the blood.
ACKNOWLEDGMENT
The researchers would like to thank the Deanship of Scientific Research, Qassim University, for funding the publication of this project.
REFERENCES
[1] N. Jiwani, K. Gupta, G. Pau, M. Alibakhshikenari. “Pattern
Recognition of Acute Lymphoblastic Leukemia (ALL) Using
Computational Deep Learning”. IEEE Access. 2023 Mar
21,11:29541-53.
[2] N. Sampathila, K. Chadaga, N. Goswami, RP. Chadaga, M. Pandya,
S. Prabhu, MG. Bairy, SS. Katta, D. Bhat, SP. Upadya. “Customized
deep learning classifier for detection of acute lymphoblastic leukemia
using blood smear images”. InHealthcare 2022 Sep 20 (Vol. 10, No.
10, p. 1812). MDPI.
[3] KJ. Hiam-Galvez, BM. Allen, MH. Spitzer. “Systemic immunity in cancer”. Nature Reviews Cancer. 2021 Jun;21(6):345-59.
[4] M. Belson, B. Kingsley, A. Holmes. “Risk factors for acute leukemia
in children: a review”. Environmental health perspectives. 2007 Jan
11, 5(1):138-45.
[5] Y. Dong, O. Shi, Q. Zeng, X. Lu, W. Wang, Y. Li, Q. Wang.
“Leukemia incidence trends at the global, regional, and national level
between 1990 and 2017”. Experimental hematology & oncology. 2020
Dec, 9:1-1.
[6] D. Singh, J. Vignat, V. Lorenzoni, M. Eslahi, O. Ginsburg, B. Lauby-
Secretan, M. Arbyn, P. Basu, F. Bray, S. Vaccarella. “Global estimates
of incidence and mortality of cervical cancer in 2020: a baseline
analysis of the WHO Global Cervical Cancer Elimination Initiative”.
The Lancet Global Health. 2023 Feb 1, 11(2):e197-206.
[7] K. Stephens. “Every Month Delayed in Cancer Treatment Can Raise
Risk of Death by Around 10%”. AXIS Imaging News. 2020 Nov 6.
[8] B. Seruga, A. Sadikov, EL. Cazap, LB. Delgado, R. Digumarti, NB.
Leighl, MM. Meshref, H. Minami, E. Robinson, NH. Yamaguchi, D.
Pyle. “Barriers and challenges to global clinical cancer research”. The
Oncologist. 2014 Jan 1, 19(1):61-7.
[9] P. McGrath. “Beginning treatment for childhood acute lymphoblastic leukemia: Insights from the parents' perspective”. Oncology Nursing Forum. 2002 Jul 1;29(6):988-96.
[10] F. Kazemi, TA. Najafabadi, BN. Araabi. “Automatic recognition of
acute myelogenous leukemia in blood microscopic images using k-
means clustering and support vector machine”. Journal of medical
signals and sensors. 2016 Jul, 6(3):183.
[11] M. Ghaderzadeh, M. Aria, A. Hosseini, F. Asadi, D. Bashash, H.
Abolghasemi. “A fast and efficient CNN model for B‐ALL diagnosis
and its subtypes classification using peripheral blood smear images”.
International Journal of Intelligent Systems. 2022 Aug, 37(8):5113-
33.
[12] World Health Organization, 2020. Global cancer profile 2020 (2020)
https://tinyurl.com/3unsh9xa. [Accessed 10 February 2020]
[13] S. Gehlot, A. Gupta, R. Gupta. “SDCT-AuxNetθ: DCT augmented
stain deconvolutional CNN with auxiliary classifier for cancer
diagnosis”. Medical image analysis. 2020 Apr 1, 61:101661.
[14] S. Mohapatra , S.S. Samanta, D. Patra, S. Satpathi. “Fuzzy based blood
image segmentation for automated leukemia detection 2011
International conference on devices and Communications”, IEEE
(2011), pp. 1-5.
[15] C. Marzahl, M. Aubreville, J. Voigt, A. Maier. “Classification of
leukemic b-lymphoblast cells from blood smear microscopic images
with an attention-based deep learning method and advanced
augmentation techniques ISBI 2019 C-NMC challenge: classification
in cancer cell imaging”, Springer (2019), pp. 13-2
[16] S. Mishra, B. Majhi, P.K. Sa. “Texture feature based classification on
microscopic blood smear for acute lymphoblastic leukemia detection
Biomed Signal Process Control”, 47 (2019), pp. 303-311
[17] A. Mittal, S. Dhalla, S. Gupta, A.Gupta. “Automated analysis of blood
smear images for leukemia detection: a comprehensive review”. ACM
Computing Surveys (CSUR). 2022 Sep 10;54(11s):1-37.
[18] N. Patel, A. Mishra. “Automated leukaemia detection using
microscopic images Procedia Comput Sci”, 58 (2015), pp. 635-642.
[19] Lai, Yunfei. "A comparison of traditional machine learning and deep
learning in image recognition." In Journal of Physics: Conference
Series, vol. 1314, no. 1, p. 012148. IOP Publishing, 2019.
[20] S. Perveen, M. Shahbaz, K. Keshavjee, A. Guergachi. “Prognostic
modeling and prevention of diabetes using machine learning
technique. Scientific reports”. 2019 Sep 24, 9(1):13805.
[21] M.K. Hasan, L. Dahal, P.N. Samarakoon, F.I Tushar, R. Martí.
“DSNet: Automatic dermoscopic skin lesion segmentation Comput
Biol Med”, 120 (2020), Article 103738
[22] M.Hasan, S. Roy, C. Mondal, M. Alam, M. Elahi, E. Toufick, et al.
“Dermo-DOCTOR: A web application for detection and recognition
of the skin lesion using a deep convolutional neural network”, (2021)
[23] A. Işın, C. Direkoğlu, M. Şah. “Review of MRI-based brain tumor
image segmentation using deep learning methods Procedia Comput
Sci”, 102 (2016), pp. 317-324
[24] D.F. Steiner, R. MacDonald, Y. Liu, P. Truszkowsk, J.D Hipp, C.
Gammage, et al. “Impact of deep learning assistance on the
histopathologic review of lymph nodes for metastatic breast cancer
Am J Surg Pathol”, 42 (12) (2018), p. 1636
[25] M.K. Hasan, M.A Alam, M.T.E Elahi, S. Roy, R. Martí. “DRNet:
Segmentation and localization of optic disc and fovea from diabetic
retinopathy image Artif Intell Med”, 111 (2021), Article 102001
[26] Y. Oh, S. Park, J.C Ye. “Deep learning covid-19 features on cxr using
limited training data sets IEEE Trans Med Imaging”, 39 (8) (2020),
pp. 2688-2700.
[27] F. Chadebecq, LB. Lovat, D. Stoyanov. Artificial intelligence and
automation in endoscopy and surgery. Nature Reviews
Gastroenterology & Hepatology”. 2023 Mar, 20(3):171-82.
[28] M.S.H Sunny, A.N.R Ahmed, M.K. Hasan. “Design and simulation of
aximum power point tracking of photovoltaic system using ANN 2016
3rd International conference on electrical engineering and information
communication technology (2016)”, pp. 1-5,
10.1109/CEEICT.2016.7873105
[29] Y. Ding, Y. Yang, Y. Cui. “Deep learning for classifying of white
blood cancer ISBI 2019 C-NMC challenge: classification in cancer
cell imaging, Springer (2019)”, pp. 33-41
[30] T. Shi, L. Wu, C. Zhong, R. Wang, W. Zheng. “Ensemble
convolutional neural networks for cell classification in microscopic
images ISBI 2019 C-NMC challenge: classification in cancer cell
imaging”, Springer (2019), pp. 43-
[31] M.A Khan, J. Choo. “Classification of cancer microscopic images via
convolutional neural networks ISBI 2019 C-NMC Challenge:
Classification in Cancer Cell Imaging”, Springer (2019), pp. 141-147
[32] B. Harangi. “Skin lesion classification with ensembles of deep
convolutional neural networks J Biomed Inform”, 86 (2018), pp. 25-
32
[33] F. Xiao, R. Kuang, Z. Ou, B. Xiong. “DeepMEN: Multi-model
ensemble network for B-lymphoblast cell classification ISBI 2019 C-
NMC challenge: classification in cancer cell imaging”, Springer
(2019), pp. 83-93.
[34] M. Ghaderzadeh, M. Aria, A. Hosseini, F. Asadi, D. Bashash, H.
Abolghasemi. A fast and efficient CNN model for B‐ALL diagnosis
and its subtypes classification using peripheral blood smear images.
International Journal of Intelligent Systems. 2022 Aug; 37(8):5113-
33.
[35] KK. Pal, KS. Sudeep. “Preprocessing for image classification by
convolutional neural networks. In: 2016 IEEE International
Conference on Recent Trends in Electronics, Information &
Communication Technology (RTEICT)”, (pp. 1778-1781). IEEE;
2016.
[36] A. Makandar and B. Halalli, 2016. “Threshold based segmentation
technique for mass detection in mammography”. J Comput, 11(6),
pp.472-478.
[37] D. Goutam, & S. Sailaja, (2015). “Classification of acute myelogenous
leukemia in blood microscopic images using supervised classifier.
International Journal of Engineering Research & Technology
(IJERT)”, 4(1), 569-574.
[38] S. Jagadeesh, E. Nagabhooshanam & S. Venkatachalam, (2013).
“Image processing based approach to cancer cell prediction in blood
samples”. International Journal of Technology and Engineering
Sciences, 1(1), 110.
[39] A. Krizhevsky, I. Sutskever and G.E. Hinton, 2012. “ImageNet classification with deep convolutional neural networks”. Adv. Neural Inf. Process. Syst., pp. 1-9.
[40] K. Simonyan, A. Zisserman. “Very deep convolutional networks for
large-scale image recognition”. arXiv Prepr arXiv14091556. 2014.
[41] C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens. “Rethinking the
inception architecture for computer vision”, 2016.
[42] G. Huang, Z. Liu, L. van der Maaten and K.Q. Weinberger, 2017. “Densely Connected Convolutional Networks”. arXiv preprint arXiv:1608.06993.
[43] K. He, X. Zhang, S. Ren and J. Sun, 2016. “Deep residual learning for image recognition”. In IEEE Conference on Computer Vision & Pattern Recognition (pp. 770-778).
[44] Z. Liu, H. Mao, C.Y. Wu, C. Feichtenhofer, T. Darrell and S. Xie,
2022. “A convnet for the 2020s”. In Proceedings of the IEEE/CVF
conference on computer vision and pattern recognition (pp. 11976-
11986).
[45] K. He, X. Zhang, S. Ren and J. Sun, 2016. “Identity mappings in deep residual networks”. In Computer Vision - ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, October 11-14, 2016, Proceedings, Part IV (pp. 630-645). Springer International Publishing.
[46] A. Esteva, B. Kuprel, R.A. Novoa, J. Ko, S.M. Swetter, H.M. Blau,
et al. “dermatologist-level classification of skin cancer with deep
neural networks Nature”, 542 (7639) (2017), pp. 115-118.
[47] A. Kamilaris, F.X. Prenafeta-Boldú. “Deep learning in agriculture: A
survey Comput”. Electron. Agric., 147 (2018), pp. 70-90.
[48] C. Mondal, MK. Hasan, M. Ahmad, MA. Awal, MT Jawad, A. Dutta,
MR. Islam, MA. Moni. “Ensemble of convolutional neural networks
to diagnose acute lymphoblastic leukemia from microscopic images”.
Informatics in Medicine Unlocked. 2021 Jan 1, 27:100794.
[49] S. Perveen, M. Shahbaz, A. Guergachi, K. Keshavjee. “Performance
analysis of data mining classification techniques to predict diabetes”.
Procedia Computer Science. 2016 Jan 1, 82:115-21.
[50] PP. Shinde, S. Shah. “A review of machine learning and deep learning
applications”. In 2018 Fourth international conference on computing
communication control and automation (ICCUBEA) 2018 Aug 16 (pp.
1-6). IEEE.
[51] J. Tang, Q. Su, B. Su, S. Fong, W. Cao, X. Gong. “Parallel ensemble
learning of convolutional neural networks and local binary patterns for
face recognition Comput”. Methods Programs Biomed, 197 (2020), p.
105622.
[52] C. Valle, F. Saravia, H. Allende, R. Monge, C. Fernández. “Parallel
approach for ensemble learning with locally coupled neural networks
Neural Process”. Lett, 32 (3) (2010), pp. 277-291.
[53] A. Mohammed, R. Kora. “A comprehensive review on ensemble deep
learning: Opportunities and challenges”. Journal of King Saud
University-Computer and Information Sciences. 2023 Feb 1.
[54] SY. ŞEN, N. ÖZKURT. “Convolutional neural network
hyperparameter tuning with adam optimizer for ECG classification”.
In 2020 innovations in intelligent systems and applications conference
(ASYU) 2020 Oct 15 (pp. 1-6). IEEE.
[55] S. Perveen, M. Shahbaz, K. Keshavjee, A. Guergachi. “Metabolic
syndrome and development of diabetes mellitus: predictive modeling
based on machine learning techniques”. IEEE Access. 2018 Dec 21,
7:1365-75.
[56] M. Zhou, K. Wu, L. Yu, M. Xu, J. Yang, Q. Shen, B. Liu, L. Shi, S.
Wu, B. Dong, H. Wang. “Development and evaluation of a leukemia
diagnosis system using deep learning in real clinical scenarios”.
Frontiers in Pediatrics. 2021 Jun 24, 9:693676.
[57] KK. Jha, HS. Dutta. “Mutual information based hybrid model and
deep learning for acute lymphocytic leukemia detection in single cell
blood smear images”. Computer methods and programs in
biomedicine. 2019 Oct 1, 179:104987.
[58] A. Almadhor, U. Sattar, A. Al Hejaili, U. Ghulam Mohammad, U.
Tariq, H. Ben Chikha. “An efficient computer vision-based approach
for acute lymphoblastic leukemia prediction”. Frontiers in
Computational Neuroscience. 2022 Nov 24, 16:1083649.
[59] N. Ahmed, A. Yigit, Z. Isik, A. Alpkocak. “Identification of leukemia
subtypes from microscopic images using convolutional neural
network”. Diagnostics. 2019 Aug 25, 9(3):104.
[60] S. Ansari, AH. Navin, AB. Sangar, JV. Gharamaleki, S. Danishvar.
“A customized efficient deep learning model for the diagnosis of acute
leukemia cells based on lymphocyte and monocyte images”.
Electronics. 2023 Jan 8, 12(2):322.
[61] C. Mondal, MK. Hasan, MT. Jawad, A Dutta, MR Islam, MA Awal,
M. Ahmad. “Acute lymphoblastic leukemia detection from
microscopic images using weighted ensemble of convolutional neural
networks”. arXiv preprint arXiv:2105.03995. 2021 May 9.
[62] M. Bukhari, S. Yasmin, S. Sammad, A. El-Latif, A. Ahmed. “A deep
learning framework for leukemia cancer detection in microscopic
blood samples using squeeze and excitation learning”. Mathematical
Problems in Engineering. 2022 Jan 31, 2022.
[63] N. Sampathila, K. Chadaga, N. Goswami, RP. Chadaga, M. Pandya,
S. Prabhu, MG. Bairy, SS. Katta, D. Bhat, SP. Upadya. “Customized
deep learning classifier for detection of acute lymphoblastic leukemia
using blood smear images”. InHealthcare 2022 Sep 20 (Vol. 10, No.
10, p. 1812). MDPI.
[64] World Health Organization, 2020 B. Global cancer profile
2020(2020). https://tinyurl.com/3unsh9xa. [Accessed 10 February
2020]
[65] National Research Council, 2000. Networking health: prescriptions
for the internet.
[66] S. Rezayi, N. Mohammadzadeh, H. Bouraghi, S. Saeedi, A.
Mohammadpour. “Timely diagnosis of acute lymphoblastic leukemia
using artificial intelligence-oriented deep learning methods”.
Computational Intelligence and Neuroscience. 2021 Nov 11, 2021.
[67] K.K. Jha, & H.S. Dutta.” Mutual information based hybrid model and
deep learning for acute lymphocytic leukemia detection in single cell
blood smear images”. Computer methods and programs in
biomedicine. 2019. 179, 104987.
[68] G. Atteia, A.A. Alhussan, N. A. Samee. Bo-allcnn.”Bayesian-based
optimized cnn for acute lymphoblastic leukemia detection in
microscopic blood smear images”. Sensors. 2022 Jul 24;22(15):5520.
[69] R. Khandekar, P. Shastry, S. Jaishankar, O.Faust, N. Sampathila.
“Automated blast cell detection for Acute Lymphoblastic Leukemia
diagnosis”. Biomedical Signal Processing and Control. 2021 Jul
1;68:102690.
[70] Deep learning detects acute myeloid leukemia and predicts NPM1
mutation status from bone marrow smears. Leukemia. 2022
Jan;36(1):111-8.
[71] S. Ren, K. He, R. Girshick, J. Sun. “Faster R-CNN: towards real-time
object detection with region proposal networks”. IEEE Trans Pattern
Anal Mach Intell. 2017;39:1137-49.
[72] H. Miyoshi, K. Sato, Y. Kabeya, S. Yonezawa, H. Nakano, Y.
Takeuchi, I. Ozawa, S. Higo, E. Yanagida, K. Yamada, K. Kohno.
“Deep learning shows the capability of high-level computer-aided
diagnosis in malignant lymphoma”. Laboratory Investigation. 2020
Oct;100(10):1300-10.
[73] A. Krizhevsky, I. Sutskever, G. E. Hinton. “Imagenet classification
with deep convolutional neural networks”. Advances in neural
information processing systems. 2012;25.
[74] K. He, X. Zhang, S. Ren, J. Sun. “Deep residual learning for image
recognition”. In Proceedings of the IEEE conference on computer
vision and pattern recognition 2016 (pp. 770-778).
[75] H. D. Cheng, X. H. Jiang, Y. Sun, J. Wang. “Color image
segmentation: advances and prospects”. Pattern recognition. 2001 Dec
1;34(12):2259-81.
[76] H. D. Cheng, J. Shan, W. Ju, Y. Guo, L. Zhang.” Automated breast
cancer detection and classification using ultrasound images: A
survey”. Pattern recognition. 2010 Jan 1;43(1):299-317.
[77] S. Sharma, S. Gupta, D. Gupta, S. Juneja, P. Gupta, G. Dhiman, S.
Kautish. “Deep learning model for the automatic classification of
white blood cells”. Computational Intelligence and Neuroscience.
2022 Jan 12;2022.
[78] S. Mohapatra, D. Patra. “Automated cell nucleus segmentation and
acute leukemia detection in blood microscopic images”. In2010
International Conference on Systems in Medicine and Biology 2010
Dec 16 (pp. 49-54). IEEE.
[79] R. Baig, A. Rehman, A. Almuhaimeed, A. Alzahrani, H. T. Rauf.
“Detecting malignant leukemia cells using microscopic blood smear
images: a deep learning approach”. Applied Sciences. 2022 Jun
21;12(13):6317.
[80] G. Parthasarathy, D. Chitra. “Thresholding technique for color
image segmentation”. International Journal for Research in Applied
Science & Engineering Technology. 2015 Jun;3(6):437-45.
[81] Y. Liu, F. Long. “Acute lymphoblastic leukemia cells image analysis
with deep bagging ensemble learning”. In ISBI 2019 C-NMC
Challenge: Classification in Cancer Cell Imaging: Select Proceedings
2019 Nov 29 (pp. 113-121). Singapore: Springer Singapore.
[82] F. Xiao, R. Kuang, Z. Ou, B. Xiong. “DeepMEN: Multi-model
ensemble network for B-lymphoblast cell classification”. InISBI 2019
C-NMC Challenge: Classification in Cancer Cell Imaging: Select
Proceedings 2019 (pp. 83-93). Springer Singapore.
[83] D. S. Depto, M.M. Rizvee, A. Rahman, H. Zunair, M. S. Rahman,
M. R. Mahdy. “Quantifying imbalanced classification methods for
leukemia detection”. Computers in Biology and Medicine. 2023 Jan
1;152:106372.
[84] K. Barrera-Llanga, J. Burriel-Valencia, A. Sapena-Bañó, J. Martínez-
Román. “A Comparative Analysis of Deep Learning Convolutional
Neural Network Architectures for Fault Diagnosis of Broken Rotor
Bars in Induction Motors”. Sensors. 2023 Sep 30;23(19):8196.
[85] K. Barrera, J. Rodellar, S. Alférez, A. Merino. “Automatic normalized
digital color staining in the recognition of abnormal blood cells using
generative adversarial networks”. Computer methods and programs in
biomedicine. 2023 Oct 1;240:107629.
Dr. SAJIDA PERVEEN received the
Ph.D. degree in Computer Science from
the Department of Computer Science,
University of Engineering & Technology
(UET) Lahore, Pakistan, in 2021. She is
currently serving as an Assistant Professor
in the Department of Computer Science, National Textile
University, Faisalabad, Pakistan. Her research interests
include healthcare informatics, data science, and data
analytics.
Dr. ABDULLAH ALOURANI is
Assistant Professor at the Department of
Management Information Systems and
Production Management, Qassim
University, Saudi Arabia. He received his
Ph.D. in Computer Science from the
University of Illinois at Chicago, his
Master’s degree in Computer Science
from DePaul University in Chicago, and his Bachelor’s
degree in Computer Science from Qassim University, Saudi
Arabia. His current research interests are in the areas of
Cloud Computing, Software Engineering, Security, and
Artificial Intelligence. He is a member of ACM and IEEE.
Dr. MUHAMMAD SHAHBAZ received
the Ph.D. degree from Loughborough
University, U.K. He is currently a Full
Professor in the Department of Computer
Engineering, University of Engineering and
Technology. He has delivered several talks
in the industry at National and International
levels and at various conferences around the world. He has
wide experience in the field of data science and has published
more than 100 articles in the same domain. His research
interests include healthcare informatics, fog computing, data
science, and artificial intelligence.
Dr. M. USMAN ASHRAF received PhD
(Computer Science) degree in 2018 from
King Abdul-Aziz University, Saudi
Arabia. He is Associate Professor and
Head of department of Computer Science,
GC Women University, Sialkot, Pakistan.
His research on Exascale Computing
Systems, High Performance Computing (HPC) Systems,
Parallel Computing, HPC for Deep learning and Location
Based Services System has appeared in IEEE Access, IET
Software, International Journal of Advanced Research in
Computer Science, International Journal of Advanced
Computer Science and Applications, I.J. Information
Technology and Computer Science, International Journal of
Computer Science and Security and several International
IEEE/ACM/Springer conferences. He served as HPC
Scientist at HPC Centre King Abdul-Aziz University, Saudi
Arabia as well.
Dr. ISMA HAMID is currently an
assistant professor at National Textile
University, Pakistan. She has thirteen years
of teaching and research experience. Her
main research interests are visualization,
big data and computational intelligence.
... The objective of this study is to provide a complete overview of how Convolutional Neural Networks are used to diagnose medical pictures [15] [16]. They will investigate the fundamental principles of CNNs, explain their applications in various imaging modalities and clinical settings, assess current obstacles and limitations, and identify future research and development prospects in this rapidly expanding subject [17]. They can exploit AI's promise to alter medical imaging and improve patient care by learning more about CNN-based techniques. ...
Article
Full-text available
Medical image diagnosis using Convolutional Neural Networks (CNNs) has emerged as a viable way to improve the accuracy and efficiency of disease identification and categorization in clinical settings. In this study, the authors examine how CNNs can be used to diagnose lung nodules from chest X-ray images, providing insights into the technology's performance and future clinical applications. A dataset of 10,000 labeled chest X-ray images showing both benign and malignant lung nodules was obtained and preprocessed using standard methods. The dataset was used to construct and train a proprietary CNN architecture, which was then rigorously evaluated on distinct training, validation, and test sets. The CNN model showed good accuracy (94.8%), sensitivity (92.1%), specificity (96.5%), precision, recall, F1 score, and area under the ROC curve (AUC), indicating its robustness and generalization ability. These findings show that CNN-based diagnostic tools may help radiologists and physicians detect and diagnose lung cancer earlier, improving patient outcomes and optimizing healthcare delivery. However, challenges such as interpretability, data privacy, and regulatory approval must be addressed before CNNs can be fully utilized in medical imaging. This study emphasizes CNNs' transformative significance in diagnostic medicine and the necessity of additional research and development to realize their full potential in clinical practice.
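The abstract above reports accuracy, sensitivity, and specificity side by side. As a reminder of how such figures derive from confusion-matrix counts, here is a minimal Python sketch; the counts below are illustrative only, not the study's actual data.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute standard evaluation metrics from raw confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn)   # recall / true-positive rate
    specificity = tn / (tn + fp)   # true-negative rate
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {"accuracy": accuracy, "sensitivity": sensitivity,
            "specificity": specificity, "precision": precision, "f1": f1}

# Illustrative counts chosen to echo the reported sensitivity/specificity:
m = binary_metrics(tp=921, fp=35, tn=965, fn=79)
print({k: round(v, 3) for k, v in m.items()})
```

Checking a reported result against the underlying counts in this way is a quick sanity test when reading or reviewing such studies.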
... The images were resized to a standardized dimension of (227 × 227) pixels, facilitating consistent input dimensions for the subsequent stages. Normalization was applied to standardize pixel values across all images, ensuring uniformity in data representation [38]. A distinctive technique, LoGMH, was incorporated to accentuate relevant features within the images. ...
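The resize-and-normalise step described in the snippet above can be sketched with plain NumPy. This is a hedged illustration only (nearest-neighbour resizing and min-max scaling); the cited work's actual pipeline, including its LoGMH technique, is not reproduced here.

```python
import numpy as np

def preprocess(image, size=(227, 227)):
    """Nearest-neighbour resize to a fixed size, then min-max normalise to [0, 1].
    A minimal stand-in for the standardisation step described above."""
    h, w = image.shape[:2]
    rows = np.arange(size[0]) * h // size[0]   # source row for each output row
    cols = np.arange(size[1]) * w // size[1]   # source column for each output column
    resized = image[rows][:, cols].astype(np.float32)
    lo, hi = resized.min(), resized.max()
    return (resized - lo) / (hi - lo + 1e-8)   # uniform pixel-value range

np.random.seed(0)
img = np.random.randint(0, 256, (300, 400), dtype=np.uint8)  # synthetic grayscale image
out = preprocess(img)
print(out.shape)
```

Fixing the input dimensions this way is what lets every image feed the same network input layer, regardless of the original scan resolution.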
Article
Full-text available
Disease recognition has been revolutionized by autonomous systems in the rapidly developing field of medical technology. A crucial aspect of diagnosis involves the visual assessment and enumeration of white blood cells in microscopic peripheral blood smears. This practice yields invaluable insights into a patient’s health, enabling the identification of conditions of blood malignancies such as leukemia. Early identification of leukemia subtypes is paramount for tailoring appropriate therapeutic interventions and enhancing patient survival rates. However, traditional diagnostic techniques, which depend on visual assessment, are arbitrary, laborious, and prone to errors. The advent of ML technologies offers a promising avenue for more accurate and efficient leukemia classification. In this study, we introduced a novel approach to leukemia classification by integrating advanced image processing, diverse dataset utilization, and sophisticated feature extraction techniques, coupled with the development of TL models. Focused on improving accuracy of previous studies, our approach utilized Kaggle datasets for binary and multiclass classifications. Extensive image processing involved a novel LoGMH method, complemented by diverse augmentation techniques. Feature extraction employed DCNN, with subsequent utilization of extracted features to train various ML and TL models. Rigorous evaluation using traditional metrics revealed Inception-ResNet’s superior performance, surpassing other models with F1 scores of 96.07% and 95.89% for binary and multiclass classification, respectively. Our results notably surpass previous research, particularly in cases involving a higher number of classes. These findings promise to influence clinical decision support systems, guide future research, and potentially revolutionize cancer diagnostics beyond leukemia, impacting broader medical imaging and oncology domains.
Article
Full-text available
Android is the most popular operating system on the latest smart mobile devices, and many Android applications have been developed for it and have become an essential part of our daily lives. Unfortunately, different kinds of Android malware have also been generated alongside this endless stream of applications, installed through API calls, granted permissions, and extra package installations, and such malware violates system security rules and harms the system. It is therefore essential to detect and classify Android malware in order to protect user privacy and limit damage. Much research has already been conducted on techniques for Android malware detection and classification. In this work, we present AMDDLmodel, a deep learning technique based on a convolutional neural network. The model is tuned over different parameters, including filter sizes, number of epochs, learning rates, and layers, to detect and classify Android malware. The Drebin dataset, consisting of 215 features, was used for the model evaluation. The model achieves an accuracy of 99.92%; precision, recall, and F1-score are also reported. AMDDLmodel introduces innovative deep learning for Android malware detection, enhancing accuracy and practical user security through inventive feature engineering and comprehensive performance evaluation, and it shows the highest accuracy compared with existing techniques.
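A CNN applied to a flat feature vector, such as Drebin's 215 binary features, rests on 1-D convolution. The sketch below shows only that basic building block in NumPy; the kernel values and layer shape are hypothetical and do not reflect the AMDDLmodel architecture.

```python
import numpy as np

def conv1d(x, kernel, stride=1):
    """Valid-mode 1-D convolution followed by ReLU: the elementary
    operation of a CNN layer applied over a flat feature vector."""
    k = len(kernel)
    out = np.array([np.dot(x[i:i + k], kernel)
                    for i in range(0, len(x) - k + 1, stride)])
    return np.maximum(out, 0.0)  # ReLU keeps non-negative activations

np.random.seed(0)
features = np.random.randint(0, 2, 215).astype(float)  # one Drebin-style binary sample
activ = conv1d(features, kernel=np.array([0.5, -0.25, 0.5]))
print(activ.shape)  # valid convolution shortens 215 to 215 - 3 + 1 = 213
```

Stacking several such layers with learned kernels, then dense layers, yields the kind of classifier the abstract describes.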
Article
Full-text available
Induction machines (IMs) play a critical role in various industrial processes but are susceptible to degenerative failures, such as broken rotor bars. Effective diagnostic techniques are essential in addressing these issues. In this study, we propose the utilization of convolutional neural networks (CNNs) for detection of broken rotor bars. To accomplish this, we generated a dataset comprising current samples versus angular position using finite element method magnetics (FEMM) software for a squirrel-cage rotor with 28 bars, including scenarios with 0 to 6 broken bars at every possible relative position. The dataset consists of a total of 16,050 samples per motor. We evaluated the performance of six different CNN architectures, namely Inception V4, NasNETMobile, ResNET152, SeNET154, VGG16, and VGG19. Our automatic classification system demonstrated an impressive 99% accuracy in detecting broken rotor bars, with VGG19 performing exceptionally well. Specifically, VGG19 exhibited high accuracy, precision, recall, and F1-Score, with values approaching 0.994 and 0.998. Notably, VGG19 exhibited crucial activations in its feature maps, particularly after domain-specific training, highlighting its effectiveness in fault detection. Comparing CNN architectures assists in selecting the most suitable one for this application based on processing time, effectiveness, and training losses. This research suggests that deep learning can detect broken bars in induction machines with accuracy comparable to that of traditional methods by analyzing current signals using CNNs.
Article
Full-text available
Background and Objectives: Combining knowledge of clinical pathologists and deep learning models is a growing trend in morphological analysis of cells circulating in blood to add objectivity, accuracy, and speed in diagnosing hematological and non-hematological diseases. However, the variability in staining protocols across different laboratories can affect the color of images and performance of automatic recognition models. The objective of this work is to develop, train and evaluate a new system for the normalization of color staining of peripheral blood cell images, so that it transforms images from different centers to map the color staining of a reference center (RC) while preserving the structural morphological features. Methods: The system has two modules, GAN1 and GAN2. GAN1 uses the PIX2PIX technique to fade original color images to an adaptive gray, while GAN2 transforms them into RGB normalized images. Both GANs have a similar structure, where the generator is a U-NET convolutional neural network with ResNet and the discriminator is a classifier with ResNet34 structure. Digitally stained images were evaluated using GAN metrics and histograms to assess the ability to modify color without altering cell morphology. The system was also evaluated as a pre-processing tool before cells undergo a classification process. For this purpose, a CNN classifier was designed for three classes: abnormal lymphocytes, blasts and reactive lymphocytes. Results: Training of all GANs and the classifier was performed using RC images, while evaluations were conducted using images from four other centers. Classification tests were performed before and after applying the stain normalization system. The overall accuracy reached a similar value around 96% in both cases for the RC images, indicating the neutrality of the normalization model for the reference images. 
By contrast, there was a significant improvement in classification performance when applying stain normalization to the other centers. Reactive lymphocytes were the most sensitive to stain normalization, with true positive rates (TPR) increasing from 46.3% - 66% for the original images to 81.2% - 97.2% after digital staining. Abnormal lymphocyte TPR ranged from 31.9% - 95.7% with original images to 83% - 100% with digitally stained images. The blast class showed TPR ranges of 90.3% - 94.4% and 94.4% - 100% for original and stained images, respectively. Conclusions: The proposed GAN-based stain normalization approach improves the performance of classifiers on multicenter data sets by generating digitally stained images with a quality similar to the original images and adaptability to a reference staining standard. The system requires low computational cost and can help improve the performance of automatic recognition models in clinical settings.
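The GAN pipeline above is too heavy to reproduce here, but the underlying idea of mapping one centre's intensity distribution onto a reference centre's can be illustrated with classical histogram matching. This is a simpler stand-in, not the paper's PIX2PIX method, and the synthetic images below are assumptions.

```python
import numpy as np

def match_histogram(source, reference):
    """Map the source image's intensity distribution onto the reference's
    by aligning their cumulative distribution functions (CDFs)."""
    s_vals, s_idx, s_cnt = np.unique(source.ravel(),
                                     return_inverse=True, return_counts=True)
    r_vals, r_cnt = np.unique(reference.ravel(), return_counts=True)
    s_cdf = np.cumsum(s_cnt) / source.size
    r_cdf = np.cumsum(r_cnt) / reference.size
    # For each source CDF level, find the reference intensity at that level.
    mapped = np.interp(s_cdf, r_cdf, r_vals)
    return mapped[s_idx].reshape(source.shape)

np.random.seed(0)
src = np.random.randint(0, 128, (64, 64))   # under-stained image from another centre
ref = np.random.randint(96, 256, (64, 64))  # reference-centre staining
out = match_histogram(src, ref)
print(out.mean() > src.mean())
```

The GAN approach in the paper goes further: it preserves cell morphology while recolouring, which simple per-channel histogram matching cannot guarantee.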
Article
Full-text available
Leukemia is a cancer of blood-producing cells, including the bone marrow. Abnormal white blood cells travel through blood vessels and multiply rapidly. Healthy cells in the body become a minority, and the imbalance increases the chances of infection. Leukemia, or blood cancer, is the most common cancer in children aged 2 - 14, and most childhood leukemia is treatable. Acute lymphocytic leukemia (ALL) is a cancer of the blood and bone marrow. It progresses rapidly as immature white blood cells are formed instead of mature ones. Treatments for acute lymphocytic leukemia include drugs and blood transfusions administered directly into veins, chemotherapy, and transplantation, which involves transferring organs or tissues within the body or from one person to another. In this paper, pattern recognition of Acute Lymphoblastic Leukemia using computational deep learning is proposed. Pattern recognition technology uses mathematical algorithms to identify patterns in large datasets. By analyzing the data, the algorithms can identify patterns indicative of certain states or conditions. In the case of ALL, the algorithm looks for patterns in white blood cell count data that indicate the presence of ALL. These patterns may include changes in the number of white blood cells over time, changes in the composition of the white blood cells, or changes in the levels of certain proteins or gene expressions associated with ALL. The proposed ALLDM model achieved 81.53% (DDS) and 87.92% (SDS) for chemotherapy management, 79.16% (DDS) and 94.31% (SDS) for stem cell transplantation management, 63.77% (DDS) and 87.37% (SDS) for radiation therapy management, and 88.92% (DDS) and 85.86% (SDS) for targeted therapy drug management.
Article
Full-text available
The production of blood cells is affected by leukemia, a type of bone marrow or blood cancer. Deoxyribonucleic acid (DNA) is related to immature cells, particularly white cells, and is damaged in various ways in this disease. When a radiologist diagnoses acute leukemia cells, the diagnosis is time-consuming and its accuracy needs improvement. For this purpose, much research has been conducted on the automatic diagnosis of acute leukemia; however, these studies suffer from low detection speed and accuracy. Machine learning and artificial intelligence techniques now play an essential role in medical sciences, particularly in detecting and classifying leukemic cells. These methods assist doctors in detecting diseases earlier, reducing their workload and the possibility of errors. This research aims to design a deep learning model with a customized architecture for detecting acute leukemia using images of lymphocytes and monocytes. The study presents a novel dataset containing images of Acute Lymphoblastic Leukemia (ALL) and Acute Myeloid Leukemia (AML), created with the assistance of various experts to help the scientific community incorporate machine learning techniques into medical research. The scale of the dataset is increased with a Generative Adversarial Network (GAN). The proposed CNN model, based on the Tversky loss function, includes six convolution layers, four dense layers, and a Softmax activation function for the classification of acute leukemia images. The proposed model achieved a 99% accuracy rate in diagnosing acute leukemia types, including ALL and AML. Compared with previous research, the proposed network offers promising performance in terms of speed and accuracy; based on the results, it can assist doctors and specialists in practical applications.
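The Tversky loss mentioned above generalizes the Dice loss by weighting false negatives and false positives asymmetrically, which is useful when one class (e.g. leukemic cells) is rare. A minimal NumPy sketch follows; the alpha/beta values are illustrative defaults, not the cited paper's settings.

```python
import numpy as np

def tversky_loss(y_true, y_pred, alpha=0.7, beta=0.3, eps=1e-8):
    """Tversky loss over binary labels/probabilities. Here alpha weights
    false negatives and beta weights false positives, so alpha > beta
    penalises missed positives more heavily."""
    tp = np.sum(y_true * y_pred)
    fn = np.sum(y_true * (1 - y_pred))
    fp = np.sum((1 - y_true) * y_pred)
    tversky_index = (tp + eps) / (tp + alpha * fn + beta * fp + eps)
    return 1.0 - tversky_index

y_true = np.array([1, 1, 0, 0], dtype=float)
perfect = tversky_loss(y_true, y_true)      # identical prediction -> loss near 0
poor = tversky_loss(y_true, 1 - y_true)     # inverted prediction -> loss near 1
print(round(perfect, 4), round(poor, 4))
```

With alpha = beta = 0.5 this reduces to the familiar Dice loss, which is one reason the Tversky form is a popular drop-in replacement for imbalanced data.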
Article
Full-text available
Leukemia (blood cancer) arises when the number of white blood cells (WBCs) in the human body is imbalanced. When the bone marrow produces many immature WBCs that kill healthy cells, acute lymphocytic leukemia (ALL) impacts people of all ages. Predicting this disease in time can thus increase the chance of survival, as the patient can begin therapy early. Manual prediction is expensive and time-consuming, so automated prediction techniques are essential. In this research, we propose an ensemble automated prediction approach that uses four machine learning algorithms: K-Nearest Neighbor (KNN), Support Vector Machine (SVM), Random Forest (RF), and Naive Bayes (NB). The C-NMC leukemia dataset from the Kaggle repository is used to predict leukemia. The dataset is divided into two classes: cancer and healthy cells. We perform data preprocessing steps, such as cropping images using minimum and maximum points. Feature extraction is performed using pre-trained Convolutional Neural Network-based Deep Neural Network (DNN) architectures (VGG19, ResNet50, or ResNet101). Data scaling is performed using the MinMaxScaler normalization technique, with Analysis of Variance (ANOVA), Recursive Feature Elimination (RFE), and Random Forest (RF) as feature selection techniques. Classification machine learning algorithms and ensemble voting are applied to the selected features. Results reveal that SVM, with 90.0% accuracy, outperforms the other algorithms.
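The ensemble voting step above combines the four classifiers' outputs. Its two standard forms, hard (majority) and soft (probability-averaging) voting, can be sketched in a few lines of plain Python; the toy predictions below are hypothetical, not the study's results.

```python
from collections import Counter

def hard_vote(predictions):
    """Majority vote across classifiers for each sample."""
    return [Counter(sample).most_common(1)[0][0] for sample in zip(*predictions)]

def soft_vote(probabilities):
    """Average per-class probabilities across classifiers, then take the argmax."""
    n = len(probabilities)
    out = []
    for sample in zip(*probabilities):
        avg = [sum(p[c] for p in sample) / n for c in range(len(sample[0]))]
        out.append(max(range(len(avg)), key=avg.__getitem__))
    return out

# Toy label predictions from four classifiers (e.g. KNN, SVM, RF, NB) on three samples:
preds = [["cancer", "healthy", "cancer"],
         ["cancer", "healthy", "healthy"],
         ["cancer", "cancer",  "cancer"],
         ["healthy", "healthy", "cancer"]]
print(hard_vote(preds))
```

Soft voting generally makes better use of well-calibrated classifiers, while hard voting needs only discrete labels.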
Article
In machine learning, two approaches outperform traditional algorithms: ensemble learning and deep learning. The former refers to methods that integrate multiple base models in the same framework to obtain a stronger model that outperforms them all. The success of an ensemble method depends on several factors, including how the baseline models are trained and how they are combined. In the literature, there are common approaches to building an ensemble model that have been successfully applied in several domains. On the other hand, deep learning-based models have improved the predictive accuracy of machine learning across a wide range of domains. Despite the diversity of deep learning architectures, their ability to deal with complex problems, and their ability to extract features automatically, the main challenge in deep learning is that tuning the optimal hyper-parameters requires a great deal of expertise and experience, which makes it a tedious and time-consuming task. Numerous recent research efforts have sought to bring ensemble learning to deep learning to overcome this challenge. Most of these efforts focus on simple ensemble methods that have some limitations. Hence, this review paper provides a comprehensive review of the various strategies for ensemble learning, especially in the case of deep learning. It also explains in detail the various features and factors that influence the success of ensemble methods, and it presents and accurately categorizes several research efforts that have used ensemble learning in a wide range of domains.
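One common combination strategy surveyed in such reviews, and the one the main article builds on, is weighted averaging of the members' class-probability outputs. A minimal sketch, with hypothetical member outputs and weights (e.g. validation accuracies), is shown below.

```python
def weighted_ensemble(member_probs, weights):
    """Fuse member models' class-probability vectors by weighted averaging.
    Higher-weighted members contribute more to the final prediction."""
    total = sum(weights)
    n_classes = len(member_probs[0])
    return [sum(w * p[c] for w, p in zip(weights, member_probs)) / total
            for c in range(n_classes)]

# Three member models' probabilities for one sample over two classes,
# weighted by their (hypothetical) validation accuracies:
probs = [[0.6, 0.4], [0.7, 0.3], [0.2, 0.8]]
weights = [0.95, 0.97, 0.80]
fused = weighted_ensemble(probs, weights)
print([round(p, 3) for p in fused])
```

Because each member's probabilities sum to one, the fused vector also sums to one, so it can be read directly as an ensemble probability distribution.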
Article
Background Tracking progress and providing timely evidence is a fundamental step forward for countries to remain aligned with the targets set by WHO to eliminate cervical cancer as a public health problem (ie, to reduce the incidence of the disease below a threshold of 4 cases per 100 000 women-years). We aimed to assess the extent of global inequalities in cervical cancer incidence and mortality, based on The Global Cancer Observatory (GLOBOCAN) 2020 estimates, including geographical and socioeconomic development, and temporal aspects. Methods For this analysis, we used the GLOBOCAN 2020 database to estimate the age-specific and age-standardised incidence and mortality rates of cervical cancer per 100 000 women-years for 185 countries or territories aggregated across the 20 UN-defined world regions, and by four-tier levels of the Human Development Index (HDI). Time trends (1988–2017) in incidence were extracted from the Cancer Incidence in Five Continents (CI5) plus database. Mortality estimates were obtained using the most recent national vital registration data from WHO. Findings Globally in 2020, there were an estimated 604 127 cervical cancer cases and 341 831 deaths, with a corresponding age-standardised incidence of 13·3 cases per 100 000 women-years (95% CI 13·3–13·3) and mortality rate of 7·2 deaths per 100 000 women-years (95% CI 7·2–7·3). Cervical cancer incidence ranged from 2·2 (1·9–2·4) in Iraq to 84·6 (74·8–94·3) in Eswatini. Mortality rates ranged from 1·0 (0·8–1·2) in Switzerland to 55·7 (47·7–63·7) in Eswatini. Age-standardised incidence was highest in Malawi (67·9 [95% CI 65·7 –70·1]) and Zambia (65·5 [63·0–67·9]) in Africa, Bolivia (36·6 [35·0–38·2]) and Paraguay (34·1 [32·1–36·1]) in Latin America, Maldives (24·5 [17·0–32·0]) and Indonesia (24·4 [24·2–24·7]) in Asia, and Fiji (29·8 [24·7–35·0]) and Papua New Guinea (29·2 [27·3–31·0]) in Melanesia. A clear socioeconomic gradient exists in cervical cancer, with decreasing rates as HDI increased. 
Incidence was three times higher in countries with low HDI than countries with very high HDI, whereas mortality rates were six times higher in low HDI countries versus very high HDI countries. In 2020 estimates, a general decline in incidence was observed in most countries of the world with representative trend data, with incidence becoming stable at relatively low levels around 2005 in several high-income countries. By contrast, in the same period incidence increased in some countries in eastern Africa and eastern Europe. We observed different patterns of age-specific incidence between countries with well developed population-based screening and treatment services (eg, Sweden, Australia, and the UK) and countries with insufficient and opportunistic services (eg, Colombia, India, and Uganda). Interpretation The burden of cervical cancer remains high in many parts of the world, and in most countries, the incidence and mortality of the disease remain much higher than the threshold set by the WHO initiative on cervical cancer elimination. We identified substantial geographical and socioeconomic inequalities in cervical cancer globally, with a clear gradient of increasing rates for countries with lower levels of human development. Our study provides timely evidence and impetus for future strategies that prioritise and accelerate progress towards the WHO elimination targets and, in so doing, address the marked variations in the global cervical cancer landscape today. Funding French Institut National du Cancer, Horizon 2020 Framework Programme for Research and Innovation of the European Commission; and EU4Health Programme.
Article
Uncontrolled proliferation of B-lymphoblast cells is a common characteristic of Acute Lymphoblastic Leukemia (ALL). B-lymphoblasts are found in large numbers in peripheral blood in malignant cases. Early detection of the cell in bone marrow is essential, as the disease progresses rapidly if left untreated. However, automated classification of the cell is challenging owing to its fine-grained variability with B-lymphoid precursor cells and imbalanced data points. Deep learning algorithms demonstrate potential for such fine-grained classification but also suffer from the imbalanced-class problem. In this paper, we explore different deep learning-based State-Of-The-Art (SOTA) approaches to tackle imbalanced classification problems. Our experiments include input-based, GAN-based (Generative Adversarial Networks), and loss-based methods to mitigate the issue of class imbalance on the challenging C-NMC and ALLIDB-2 datasets for leukemia detection. We present empirical evidence that loss-based methods outperform GAN-based and input-based methods in imbalanced classification scenarios.
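A widely used example of the loss-based methods this abstract favours is the focal loss, which down-weights easy examples so training concentrates on hard, often minority-class, ones. The sketch below is one illustrative instance, not necessarily the exact loss used in the cited work; gamma and alpha are the commonly quoted defaults.

```python
import math

def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Focal loss for a single binary prediction p in (0, 1) with label y in {0, 1}.
    The (1 - pt)^gamma factor shrinks the loss on confident, correct predictions."""
    pt = p if y == 1 else 1 - p          # probability assigned to the true class
    a = alpha if y == 1 else 1 - alpha   # class-balancing weight
    return -a * (1 - pt) ** gamma * math.log(max(pt, 1e-12))

easy = focal_loss(0.95, 1)   # confident and correct -> tiny loss
hard = focal_loss(0.10, 1)   # confident and wrong   -> large loss
print(easy < hard)
```

Compared with plain cross-entropy, the gradient contribution of abundant easy negatives collapses, which is why such losses help on imbalanced leukemia datasets.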
Article
Modern endoscopy relies on digital technology, from high-resolution imaging sensors and displays to electronics connecting configurable illumination and actuation systems for robotic articulation. In addition to enabling more effective diagnostic and therapeutic interventions, the digitization of the procedural toolset enables video data capture of the internal human anatomy at unprecedented levels. Interventional video data encapsulate functional and structural information about a patient’s anatomy as well as events, activity and action logs about the surgical process. This detailed but difficult-to-interpret record from endoscopic procedures can be linked to preoperative and postoperative records or patient imaging information. Rapid advances in artificial intelligence, especially in supervised deep learning, can utilize data from endoscopic procedures to develop systems for assisting procedures leading to computer-assisted interventions that can enable better navigation during procedures, automation of image interpretation and robotically assisted tool manipulation. In this Perspective, we summarize state-of-the-art artificial intelligence for computer-assisted interventions in gastroenterology and surgery. Advances in artificial intelligence (AI) are changing endoscopy and gastrointestinal surgery, including computer-assisted detection and diagnosis, computer-aided navigation, robot-assisted intervention and automated reporting. This Perspective introduces the role of AI in computer-assisted interventions in gastroenterology with insights on regulatory aspects and the challenges ahead.