A Comparative Analysis for Leukocyte
Classification Based on Various Deep Learning
Models Using Transfer Learning
Aruna Kumari Kakumani
Dept. of ECE
VNR Vignana Jyothi Institute of Engineering and Technology,
Hyderabad, India
arunakumari_k@vnrvjiet.in
Vinisha Rekhawar
Dept. of ECE
VNR Vignana Jyothi Institute of Engineering and Technology,
Hyderabad, India
rekhawarvinisha@gmail.com
Vikas Katla
Dept. of ECE
VNR Vignana Jyothi Institute of Engineering and Technology,
Hyderabad, India
katlavikas11@gmail.com
Anish Reddy Yellakonda
Dept. of ECE
VNR Vignana Jyothi Institute of Engineering and Technology,
Hyderabad, India
reddyanish211@gmail.com
Abstract— Leukocytes, sometimes referred to as white blood
cells (WBCs), are crucial to the healthy functioning of the human
body. The distribution of WBCs in the body is a biological marker
of the body's ability to fight infectious diseases. WBC detection
and classification therefore play an important role in medical
applications. However, manual microscopic evaluation is complicated
and time-consuming. To overcome the limitations of traditional
methods, deep learning (D.L) based methods have recently been
widely explored. In this paper, we implement several D.L models
for the automatic classification of WBCs. A comparative study of
three pretrained networks, namely Inceptionv3, MobileNetV3 and
VGG-19, was performed using transfer learning on publicly available
WBC images from Kaggle. The classification accuracies obtained with
Inceptionv3, MobileNetV3 and VGG-19 are 99.76%, 99.25% and 86.50%,
respectively. Inceptionv3 was further compared with existing works
in the literature and was found to be superior.
Keywords—White blood cells, Inception v3, Deep Learning,
Transfer Learning
I. INTRODUCTION
A blood smear examined under a microscope contains
information that is helpful in the diagnosis of numerous
disorders. Red blood cells (RBCs), white blood cells
(WBCs), and blood platelets make up the majority of human
blood. WBCs, commonly known as immune cells, are a
crucial component of the immune system. They are created
in the bone marrow and found in lymph tissue. They support
the body's defense mechanisms against harmful pathogens
and invaders. WBCs can be distinguished from the other
components of blood because WBCs have nuclei, whereas RBCs
and platelets do not. WBCs are broadly divided into two types
according to whether their cytoplasm contains visible granules:
granular cells (neutrophils, basophils and eosinophils) and
non-granular cells (monocytes and lymphocytes).
Neutrophils are the primary component of WBCs, comprising
40%-60% of them. They have a nucleus with three to five lobes
connected by slender strands of genetic material, and they are
responsible for fighting bacterial and fungal infections.
Eosinophils, which comprise 1%-4% of the WBCs in the body, have
a bi-lobed nucleus with large granules scattered throughout their
cytoplasm, and they are primarily responsible for destroying
parasites and responding to allergens. Lymphocytes, which
comprise 20%-40% of WBCs and are the smallest of them, have a
large, round nucleus with minimal surrounding cytoplasm, and
they produce antibodies and kill infected cells. Monocytes,
which make up about 2%-8% of WBCs and are the largest of them,
have a kidney-shaped nucleus surrounded by abundant cytoplasm.
Lastly, basophils are present in very small quantities, around
0.5%-1%, and are responsible for releasing enzymes and preventing
blood clotting. As basophils are so scarce, this study
concentrates on eosinophils, lymphocytes, monocytes and
neutrophils.
Fig. 1. Structure of Classification of WBCs
Currently, two different methods are used to determine
the WBC count and WBC differential. In the first method, a
pathologist spreads a drop of blood thinly on a glass slide,
lets it dry, stains the resulting smear, and then counts and
classifies the WBCs on the slide by hand under a microscope.
In the second method, the WBCs are suspended in solution for
an automated count, and a device known as a laser flow
cytometer shines a laser on them and measures the refracted
light to ascertain the WBC distribution and count. Both
approaches have shortcomings despite being precise and
trustworthy. The first requires manually counting the cells,
which is a cumbersome task, and the second uses a laser flow
cytometer, which is expensive. For these reasons, there is a
need to develop computer vision-based methods to accurately
classify WBCs.
2023 4th International Conference for Emerging Technology (INCET), Belgaum, India. May 26-28, 2023
979-8-3503-3575-0/23/$31.00 ©2023 IEEE | DOI: 10.1109/INCET57972.2023.10170443
II. RELATED WORK
A. Literature survey
In [1], a DenseNet121 deep learning (D.L) model is used to
classify WBCs into four classes. The authors of [2] demonstrated
a robust method for classifying cropped WBCs using features
retrieved from CNN layers. After data augmentation, the CNN was
trained with the Adaptive Moment Estimation (Adam) optimizer.
They found that replacing the standard CNN softmax classifier
with other classifiers, specifically SVM and RF classifiers,
boosts accuracy. In [3], YOLOv4 and Faster R-CNN based methods
were used for simultaneous recognition and classification of
WBCs in images.
In [4], the Maximal Information Coefficient and Ridge feature
selection methods were combined with CNNs to perform the
classification task efficiently. AlexNet, GoogLeNet and
ResNet-50 were used for feature extraction, the prominent
features were selected using the Ridge and Maximal Information
Coefficient methods, and quadratic discriminant analysis was
employed as the classifier on the resulting feature set. In [5],
the authors proposed a K-Means clustering-based image processing
approach to detect WBCs in microscopic images, followed by a
VGG-16 classifier. In [6], the authors proposed W-Net, a D.L
based WBC classification model, to precisely identify WBC types.
Three convolutional layers were used for feature extraction, and
two fully connected (FC) layers with a softmax activation
function classify the cells into five categories.
A D.L system for WBC type classification based on
MobileNetV3-ShuffleNetV2 is implemented in [7]. First, WBC
images are segmented by a Pyramid Scene Parsing Network
(PSPNet). Global and local features are then selected from the
segmented images using MobileNetV3 and a method based on
Artificial Gravitational Cuckoo Search (AGCS). Lastly, a
ShuffleNetV2 model classifies the WBC images into five classes.
In [8], a multi-attention framework was developed for accurate
WBC classification. Using attention in augmentation and
regularization techniques, this method gathers texture
information from upper layers and deep features from innermost
layers, which helps the model learn only selective features.
In [9], Gaussian and median filters are used to preprocess the
raw WBC images, and GoogLeNet, AlexNet, DenseNet201 and ResNet50
are used for classification. Performance after applying the two
filters was superior. The study in [10] presents a D.L
architecture based on canonical correlation analysis (CCA) that
combines a CNN and Long Short-Term Memory (LSTM) to classify
WBCs in images. CCA increases the accuracy by extracting a wide
range of overlapping features from the input image. This method
classifies WBCs without separate segmentation and feature
extraction stages. The work in [11] uses the VGG-16 model to
combine D.L and machine learning methods, extracting features
from the detected nucleus of WBCs.
III. METHODOLOGY
A. Dataset and data preprocessing
The dataset, containing 12,515 images, was downloaded from
Kaggle [12]. It originally comprises 9,957 images for training,
2,487 images for testing and 71 images for validation. The
dataset was reorganized so that 10,008 images are allocated for
training, 1,251 images for validation, and 1,256 images for
testing. A detailed description of the modified dataset is given
in Table I.
TABLE I. DESCRIPTION OF MODIFIED DATASET
Class        Train   Validation   Test
Eosinophil   2505    313          315
Lymphocyte   2486    311          312
Monocyte     2481    310          311
Neutrophil   2536    317          318
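The counts in Table I can be checked with a few lines of arithmetic. The sketch below sums each column and expresses the three splits as percentages of the whole dataset. The dictionary and function names are illustrative and are not part of the original work.

```python
# Per-class image counts copied from Table I: (train, validation, test).
TABLE_I = {
    "Eosinophil": (2505, 313, 315),
    "Lymphocyte": (2486, 311, 312),
    "Monocyte": (2481, 310, 311),
    "Neutrophil": (2536, 317, 318),
}


def split_totals(table):
    """Sum each split (train, validation, test) over all classes."""
    return tuple(sum(counts[i] for counts in table.values()) for i in range(3))


def split_percentages(totals):
    """Express each split as a percentage of all images, to one decimal."""
    grand_total = sum(totals)
    return tuple(round(100 * t / grand_total, 1) for t in totals)


totals = split_totals(TABLE_I)
print("train/validation/test:", totals, "total:", sum(totals))
print("percentages:", split_percentages(totals))
```

The column sums add up to the 12,515 images of the downloaded dataset, and the percentages are close to the 80:10:10 ratio.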
Fig. 2. (Counting rightward, clockwise) an image of a lymphocyte, an
image of a neutrophil, an image of an eosinophil, and an image of a
monocyte. The purple object is a white blood cell, and the reddish-
pink objects are red blood cells. The darker purple stain distinguishes
the nucleus from the lighter purple or pink cytoplasm. These are
sample images from the dataset [12].
The images in the dataset have dimensions of 320×240×3 and
are resized to 224×224×3 to reduce computation. The training
data were then augmented by rotation, zoom, horizontal
flipping, shearing, and height and width shifting, with newly
created pixels filled in nearest-neighbor mode, whereas the
test and validation data were only rescaled [13]. Augmentation
is done to improve the accuracy of WBC classification.
Afterwards, all of the data underwent normalization, a
preprocessing technique in which each pixel value of the input
image is divided by 255 so that the output pixel value lies
between 0 and 1. Normalizing the input images allows D.L models
to be trained more quickly.
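Two of the preprocessing steps named above, horizontal flipping and division by 255, are simple enough to show on a toy example. The sketch below uses plain Python lists instead of the TensorFlow pipeline used in this work. The 2×2 image and the function names are hypothetical.

```python
def normalize(image):
    """Divide every 8-bit channel value by 255 so that it lies in [0, 1]."""
    return [[[value / 255.0 for value in pixel] for pixel in row] for row in image]


def horizontal_flip(image):
    """Mirror the image left to right by reversing each row of pixels."""
    return [list(reversed(row)) for row in image]


# Hypothetical 2x2 RGB image (height x width x channels), values 0-255.
toy_image = [
    [[255, 0, 0], [0, 255, 0]],
    [[0, 0, 255], [128, 128, 128]],
]

augmented = horizontal_flip(toy_image)  # augmentation: training data only
scaled = normalize(augmented)           # rescaling: applied to every split
print(scaled[0][0])                     # top-left pixel after both steps
```

In the setup described above, flipping would apply to the training images only, while rescaling applies to all three splits.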
B. Transfer Learning Procedure
The workflow of the proposed methodology is shown in Fig. 3.
As described in Section III-A, the images are resized from
320×240×3 to 224×224×3, the training data are augmented, and
all pixel values are normalized to the range 0 to 1 by dividing
by 255 before being passed to the networks.
Fig. 3. Methodology
In this work, the pretrained models Inceptionv3, MobileNetV3
and VGG-19, trained on the ImageNet [14] dataset, were obtained
from TensorFlow Hub. Using the concept of transfer learning, a
fine-tuning approach was applied to the classification of WBCs.
We performed three separate experiments, one for each of the
three pretrained models.
In each experiment, the pretrained weights of the model are
loaded into the feature extraction layer and unfrozen. This
means that all layers in the feature extractor are trainable
and are fine-tuned during the training process. The feature
extraction layer outputs a tensor, which is then fed to a
dropout layer.
In the proposed method, two dropout layers with a dropout
rate of 0.5 were employed. The first dropout layer is placed
after the feature extractor layer to help regularize the
feature extraction process. The second dropout layer is placed
between the dense layer and the softmax layer to regularize the
classification process. The dense layer has 256 input neurons
and four output neurons, so the model performs a multi-class
classification task with 4 classes. The dense layer also has L1
and L2 bias regularizers with a coefficient of 0.01.
Regularization improves the model's capacity for generalization
by preventing overfitting.
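To illustrate what a dropout layer with rate 0.5 does during training, the sketch below implements inverted dropout on a list of activations. Each value is zeroed with probability equal to the rate, and the survivors are scaled by 1/(1 - rate) so that the expected sum is unchanged. The function name, the seed and the sample activations are assumptions for illustration, and the layer used in this work is the framework's own.

```python
import random


def dropout(activations, rate=0.5, rng=None):
    """Inverted dropout: zero each value with probability `rate` and
    scale the surviving values by 1 / (1 - rate). Requires rate < 1."""
    rng = rng or random.Random()
    keep_prob = 1.0 - rate
    return [a / keep_prob if rng.random() < keep_prob else 0.0
            for a in activations]


features = [0.2, 1.5, 0.7, 3.0]              # hypothetical activations
dropped = dropout(features, rate=0.5, rng=random.Random(0))
print(dropped)  # each entry is either 0.0 or twice the original value
```

At inference time dropout is disabled, so the activations pass through unchanged.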
A penalty term proportional to the absolute value of the
model weights is added to the loss function by L1,
sometimes referred to as Lasso regularization (1). A penalty
term proportional to the squared size of the model weights is
added to the loss function via L2 regularization, also known
as Ridge regularization (2).
$$L_1 = \lambda \sum_{i=1}^{n} \lvert W_i \rvert \tag{1}$$
$$L_2 = \lambda \sum_{i=1}^{n} W_i^{2} \tag{2}$$
where λ is the regularization strength, n is the number of
model weights, and W_i is the i-th weight.
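The two penalty terms can be evaluated directly from their definitions. The following sketch computes both for a small, hypothetical weight vector using the coefficient 0.01 reported above. The function names are illustrative.

```python
def l1_penalty(weights, lam):
    """L1 (Lasso) penalty: lam times the sum of absolute weight values."""
    return lam * sum(abs(w) for w in weights)


def l2_penalty(weights, lam):
    """L2 (Ridge) penalty: lam times the sum of squared weight values."""
    return lam * sum(w * w for w in weights)


weights = [0.5, -1.0, 2.0]  # hypothetical weights, for illustration only
lam = 0.01                  # regularization coefficient reported in the text

print(l1_penalty(weights, lam))  # 0.01 * (0.5 + 1.0 + 2.0)
print(l2_penalty(weights, lam))  # 0.01 * (0.25 + 1.0 + 4.0)
```

Either penalty is added to the classification loss, so large weights raise the total loss and are discouraged during training.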
Additionally, an early-stopping callback is used to prevent
overfitting by monitoring the validation loss during training.
If the validation loss does not improve for a certain number
of epochs (the patience), training is halted and the weights
with the best validation loss are restored.
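The early-stopping rule can be sketched as a small function that scans a sequence of per-epoch validation losses. The patience value and the loss values below are hypothetical, since the text does not report them, and the function only mimics the decision logic of a framework callback.

```python
def early_stopping(val_losses, patience):
    """Return (stop_epoch, best_epoch) for per-epoch validation losses.

    Training stops once the loss has failed to improve for `patience`
    consecutive epochs. best_epoch is the epoch whose weights would be
    restored. Epochs are numbered from 0.
    """
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch, best_epoch
    return len(val_losses) - 1, best_epoch


# Hypothetical validation losses, for illustration only.
losses = [0.90, 0.60, 0.45, 0.47, 0.46, 0.50, 0.44]
print(early_stopping(losses, patience=3))
```

With a patience of 3, training halts at epoch 5 and the weights of epoch 2 are restored. A larger patience would have let training reach the lower loss at epoch 6.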
In this model, the Adam optimizer is used with a learning
rate of 0.0010. The loss function is sparse categorical
cross-entropy. The output of the dense layer is passed through
the softmax activation function, which transforms the output
values into a set of probabilities that sum to 1.
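The relationship between the softmax output and the sparse categorical cross-entropy loss can be shown for a single sample. In the sketch below, the four logits are hypothetical and the true class is given as an integer index, which is what "sparse" refers to. The function names are illustrative.

```python
import math


def softmax(logits):
    """Map raw scores to probabilities that sum to 1."""
    shift = max(logits)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sparse_categorical_cross_entropy(probs, label):
    """Loss for one sample whose true class is the integer index `label`."""
    return -math.log(probs[label])


logits = [2.0, 0.5, 0.1, -1.0]  # hypothetical scores for the four classes
probs = softmax(logits)
loss = sparse_categorical_cross_entropy(probs, label=0)
print(probs, sum(probs), loss)
```

A model that assigns equal probability to all four classes would incur a loss of ln 4 per sample, which is a useful baseline when reading loss curves.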
C. Fine Tuning
Fine-tuning in D.L is a transfer learning technique in which
a pretrained model (usually trained on a massive dataset) is
further trained on a new, typically smaller, dataset to improve
the model's performance on a specific task.
D. Model architecture
In this paper, we analyzed three pretrained models:
Inceptionv3, MobileNetV3, and VGG-19.
Inceptionv3 [15] is a popular CNN architecture designed
to improve the accuracy and efficiency of image classification.
It consists of 42 layers organized into four major parts: the
stem, inception modules, reduction modules, and classifier.
Its inception modules combine convolutional layers with
different filter sizes in parallel to extract features from the
input image. Several notable features contribute to its high
accuracy on image classification benchmarks: batch
normalization to reduce overfitting, auxiliary classifiers to
improve training, and factorized convolutions to reduce
computational complexity.
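The saving from factorized convolutions can be seen by counting weights. As described in [15], a 5×5 convolution can be replaced by two stacked 3×3 convolutions, and an n×n convolution by a 1×n convolution followed by an n×1 convolution. The sketch below compares the weight counts for a hypothetical layer with 64 input and 64 output channels, with biases ignored.

```python
def conv_weights(kernel_h, kernel_w, channels_in, channels_out):
    """Number of weights in one convolution layer (biases ignored)."""
    return kernel_h * kernel_w * channels_in * channels_out


c = 64  # hypothetical channel count, for illustration only

single_5x5 = conv_weights(5, 5, c, c)
stacked_3x3 = 2 * conv_weights(3, 3, c, c)      # same 5x5 receptive field

single_7x7 = conv_weights(7, 7, c, c)
asymmetric = conv_weights(1, 7, c, c) + conv_weights(7, 1, c, c)

print(single_5x5, stacked_3x3, stacked_3x3 / single_5x5)
print(single_7x7, asymmetric, asymmetric / single_7x7)
```

The ratios depend only on the kernel sizes (18/25 and 14/49), so the relative saving is the same for any channel count.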
MobileNetV3, introduced in 2019, is a lightweight CNN
designed for mobile and embedded vision applications. It
combines depthwise and pointwise convolutions to minimize the
number of parameters and computations needed while preserving
high accuracy.
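The parameter saving from pairing a depthwise convolution with a pointwise (1×1) convolution can be worked out directly. The sketch below compares a standard 3×3 convolution with its depthwise separable counterpart for a hypothetical layer with 32 input and 64 output channels. Biases are ignored and the function names are illustrative.

```python
def standard_conv_params(kernel, channels_in, channels_out):
    """Weights in a standard k x k convolution."""
    return kernel * kernel * channels_in * channels_out


def depthwise_separable_params(kernel, channels_in, channels_out):
    """Weights in a depthwise k x k convolution plus a pointwise 1x1 convolution."""
    depthwise = kernel * kernel * channels_in  # one k x k filter per input channel
    pointwise = channels_in * channels_out     # 1x1 convolution mixing channels
    return depthwise + pointwise


standard = standard_conv_params(3, 32, 64)
separable = depthwise_separable_params(3, 32, 64)
print(standard, separable, standard / separable)
```

For this example the separable layer needs roughly one eighth of the weights of the standard layer.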
VGG-19, on the other hand, is a deep CNN introduced in
2014 as an extension of the VGG-16 architecture. VGG-19
consists of 19 layers and is used for image classification and
object recognition tasks. It is built from stacked convolutional
and max-pooling layers, making it easy to understand and
implement. VGG-19 has achieved high accuracy on several image
classification benchmarks and is commonly used as a baseline
architecture for new CNN designs.
IV. RESULTS
This section describes the results obtained with the
Inceptionv3, MobileNetV3 and VGG-19 models. The models were
evaluated on the 'Blood Cell Images' dataset available on
Kaggle.
A. Evaluation metrics
In order to analyze the proposed models, several
performance evaluation metrics are taken into account, namely
accuracy, recall, precision and F1 score, as shown in (3)-(6).
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{3}$$
$$\text{Recall} = \frac{TP}{TP + FN} \tag{4}$$
$$\text{Precision} = \frac{TP}{TP + FP} \tag{5}$$
$$\text{F1-Score} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \tag{6}$$
where TP, TN, FP and FN stand for true positive, true
negative, false positive and false negative, respectively.
A detailed summary of the models' performance evaluation
metrics is given in Table III.
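For a four-class problem, the quantities in (3)-(6) are obtained per class by treating that class as positive and all others as negative. The sketch below derives them from a matrix of true-versus-predicted counts. The counts are hypothetical, chosen only for illustration, and are not the results of this work.

```python
def per_class_metrics(matrix, class_index):
    """Precision, recall and F1 for one class of a square count matrix.

    matrix[i][j] is the number of samples of true class i predicted as class j.
    """
    tp = matrix[class_index][class_index]
    fn = sum(matrix[class_index]) - tp                 # missed samples of this class
    fp = sum(row[class_index] for row in matrix) - tp  # other classes predicted as it
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


def overall_accuracy(matrix):
    """Fraction of all samples that lie on the diagonal."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


# Hypothetical counts for four classes; NOT the results reported in this work.
toy_matrix = [
    [8, 0, 0, 2],
    [0, 10, 0, 0],
    [1, 0, 9, 0],
    [1, 0, 0, 9],
]

print(per_class_metrics(toy_matrix, 0))
print(overall_accuracy(toy_matrix))
```

In this toy matrix the first class has 8 true positives, 2 false negatives and 2 false positives, so its precision, recall and F1 are all 0.8, while the overall accuracy is 36/40 = 0.9.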
B. Explanation of Model Training and Testing
The three deep learning models were implemented using
TensorFlow [16], an open-source platform, and the Python
programming language in Visual Studio Code, on an NVIDIA
GeForce RTX 2060 GPU. The open-access dataset, containing
12,515 images, was taken from Kaggle. The pretrained
Inceptionv3, consisting of 48 layers and trained on more than
one million images from the ImageNet database, is loaded from
the Keras library. The last layer of the model has four neurons
to classify WBCs into 4 different types. Using the fine-tuning
technique, the model is retrained with a batch size of 32 for
50 epochs with the sparse categorical cross-entropy loss
function. L1 and L2 regularizers, two dropout layers and early
stopping were used to prevent overfitting of the model. The
hyperparameters of the model are listed in Table II.
TABLE II. HYPERPARAMETERS OF PROPOSED MODEL
Hyperparameter                      Value
Input size                          224x224x3
Number of epochs                    50
Batch size                          32
Learning rate                       0.00010
Loss function                       Sparse Categorical Cross entropy
Train-Validation-Test split ratio   80:10:10
Optimizer                           ADAM
Output activation function          softmax
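The batch size and epoch count in Table II, together with the 10,008 training images from Section III-A, determine how many weight updates a full training run performs. The sketch below works this out. The dictionary and function names are illustrative, and the result is an upper bound because early stopping may end training sooner.

```python
import math

# Values taken from Table II and Section III-A.
HYPERPARAMETERS = {"epochs": 50, "batch_size": 32}
TRAIN_IMAGES = 10008


def steps_per_epoch(num_images, batch_size):
    """Number of batches needed to see every image once (last batch may be partial)."""
    return math.ceil(num_images / batch_size)


steps = steps_per_epoch(TRAIN_IMAGES, HYPERPARAMETERS["batch_size"])
max_updates = steps * HYPERPARAMETERS["epochs"]
print(steps, max_updates)
```

Each epoch therefore consists of 313 batches, and a run that is not stopped early performs at most 15,650 weight updates.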
C. Analysis of the results
Training and validation curves for accuracy and loss
of the three models are shown in Fig. 4. It is observed
that Inceptionv3 outperforms the other two models.
Fig. 4. Accuracy and Loss graphs for Training and Validation (panels: Inceptionv3, MobileNetV3, VGG-19)
TABLE III. RESULTS FOR TEST DATA OF PROPOSED MODELS
Model         Class        Accuracy   Precision   F1-score   Recall
Inceptionv3   Overall      99.76%     -           -          -
              EOSINOPHIL   1.00       0.99        0.99       1.00
              LYMPHOCYTE   1.00       1.00        1.00       1.00
              MONOCYTE     1.00       1.00        1.00       1.00
              NEUTROPHIL   0.99       1.00        0.995      0.99
MobileNetV3   Overall      99.25%     -           -          -
              EOSINOPHIL   0.98       0.98        0.98       0.98
              LYMPHOCYTE   1.00       1.00        1.00       1.00
              MONOCYTE     1.00       1.00        1.00       1.00
              NEUTROPHIL   0.99       0.98        0.98       0.99
VGG-19        Overall      86.50%     -           -          -
              EOSINOPHIL   0.77       0.71        0.74       0.77
              LYMPHOCYTE   1.00       1.00        1.00       1.00
              MONOCYTE     0.75       1.00        0.85       0.75
              NEUTROPHIL   0.94       0.80        0.86       0.94
D. Confusion Matrix
A confusion matrix is a tool for evaluating a classifier's
output in detail. Each diagonal element represents correctly
classified samples, while misclassified samples appear in the
off-diagonal elements. The confusion matrices for all three
models are displayed in Fig. 5.
Fig. 5. Confusion matrix for proposed methods (panels: Inceptionv3, MobileNetV3, VGG-19)
V. COMPARISON WITH EXISTING METHODS FROM
LITERATURE
In our experiments, Inceptionv3 gave better results than
MobileNetV3 and VGG-19. We have therefore considered
Inceptionv3 for comparison with existing methods from the
literature. The comparison results are shown in Table IV.
TABLE IV. COMPARISON WITH EXISTING MODELS
Work        Deep Learning Model           Accuracy (%)
[4]         GoogleNet, Resnet-50          97.95
[8]         EfficientNet with attention   99.69
This work   Pre trained Inceptionv3       99.76
From these experiments, it is observed that Inceptionv3
has superior accuracy.
VI. CONCLUSION
In this work, we experimented with three D.L models, namely
Inceptionv3, MobileNetV3 and VGG-19, for the task of leukocyte
classification. The classification accuracies of Inceptionv3,
MobileNetV3 and VGG-19 are 99.76%, 99.25% and 86.5%,
respectively. These results indicate that D.L models can be
used effectively for leukocyte classification, which will aid
the automatic processing of WBCs. In future, the detection of
aberrant WBCs as part of the classification model, which could
aid early disease prediction, may be explored. Further, taking
RBCs and blood platelets into account during training and
validation would make the proposed model more generalizable.
These techniques could be used to develop an online application
that allows medical personnel to upload an image of a stained
blood smear, prepared in advance of a patient's visit to the
doctor, for preliminary analysis.
REFERENCES
[1] Sharma, S., Gupta, S., Gupta, D., Juneja, S., Gupta, P., Dhiman, G., &
Kautish, S. (2022). Deep Learning Model for the Automatic
Classification of White Blood Cells. Computational Intelligence and
Neuroscience, 2022, 7384131. https://doi.org/10.1155/2022/7384131
[2] Malkawi, A., Al-Assi, R., Salameh, T., Sheyab, B., Alquran, H., &
Alqudah, A. M. (2020). White Blood Cells Classification Using
Convolutional Neural Network Hybrid System. In 2020 IEEE 5th
Middle East and Africa Conference on Biomedical Engineering
(MECBME) (pp. 1-5). Amman, Jordan: IEEE.
https://doi.org/10.1109/MECBME47393.2020.9265154
[3] Yao, J., Huang, X., Wei, M., Han, W., Xu, X., Wang, R., Chen, J., &
Sun, L. (2021). High-Efficiency Classification of White Blood Cells
Based on Object Detection. Journal of Healthcare Engineering, 8, 1-
10. https://doi.org/10.1155/2021/1615192
[4] Toğaçar, M., Ergen, B., & Cömert, Z. (2020). Classification of White
Blood Cells Using Deep Features Obtained from Convolutional
Neural Network Models Based on the Combination of Feature
Selection Methods. Applied Soft Computing, 97(Part B), 106810.
https://doi.org/10.1016/j.asoc.2020.106810
[5] Wijesinghe, C. B., Wickramarachchi, D. N., Kalupahana, I. N., De
Seram, L. R., Silva, I. D., & Nanayakkara, N. D. (2020). Fully
Automated Detection and Classification of White Blood Cells. In
Proceedings of the 42nd Annual International Conference of the IEEE
Engineering in Medicine & Biology Society (EMBC) (pp. 1816-1819).
Montreal, QC, Canada: IEEE.
https://doi.org/10.1109/EMBC44109.2020.9175961
[6] Jung, C., Abuhamad, M., Alikhanov, J., Mohaisen, A., Han, K., &
Nyang, D. (2021). W-Net: A CNN-Based Architecture for White
Blood Cell Image Classification. Journal of Healthcare Engineering,
2021, 1-14. https://doi.org/10.1155/2021/8852761.
[7] Rao, B. S. S., & Rao, B. S. (2023). An Effective WBC Segmentation
and Classification Using MobileNetV3–ShuffleNetV2 Based Deep
Learning Framework. IEEE Access, 11, 27739-27748.
https://doi.org/10.1109/ACCESS.2023.3259100.
[8] Bayat, N., Davey, D. D., Coathup, M., & Park, J.-H. (2022). White
Blood Cell Classification Using Multi-Attention Data Augmentation
and Regularization. Big Data and Cognitive Computing, 6(4), 122.
https://doi.org/10.3390/bdcc6040122
[9] Yildirim, M., & Çinar, A. (2019). Classification of White Blood Cells
by Deep Learning Methods for Diagnosing Disease. Revue
d'Intelligence Artificielle, 33(5), 335-340.
https://doi.org/10.18280/ria.330502
[10] Patil, A. M., Patil, M. D., & Birajdar, G. K. (2021). White Blood
Cells Image Classification Using Deep Learning with Canonical
Correlation Analysis. IRBM, 42(5), 374-381.
https://doi.org/10.1016/j.irbm.2020.08.005
[11] Baby, D., Devaraj, S. J., & Raj M. M, A. (2021). Leukocyte
Classification Based on Transfer Learning of VGG16 Features by K-
Nearest Neighbor Classifier. In 2021 3rd International Conference on
Signal Processing and Communication (ICSPC) (pp. 252-256).
Coimbatore, India: IEEE.
https://doi.org/10.1109/ICSPC51351.2021.9451707
[12] Mooney, P. (2018). Blood Cell Images [Data set]. Kaggle. Retrieved
from https://www.kaggle.com/paultimothymooney/blood-cells
[13] TensorFlow Authors. "Image Data Augmentation." TensorFlow: An
Open Source Machine Learning Platform. Accessed April 14, 2023.
https://www.tensorflow.org/tutorials/images/data_augmentation.
[14] L. Fei-Fei, J. Deng, and K. Li, “ImageNet: Constructing a large-scale
image database,” J. Vis., vol. 9, no. 8, pp. 1037–1037, 2010, doi:
10.1167/9.8.1037
[15] C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens and Z. Wojna,
"Rethinking the Inception Architecture for Computer Vision," 2016
IEEE Conference on Computer Vision and Pattern Recognition
(CVPR), Las Vegas, NV, USA, 2016, pp. 2818-2826, doi:
10.1109/CVPR.2016.308.
[16] TensorFlow. "TensorFlow Hub." Last modified n.d. Accessed April
15, 2023. https://tfhub.dev/s?subtype=module,placeholder.