IJBPAS, November, Special Issue, 2021, 10(11): 804-827
ISSN: 2277–4998
A SYSTEMATIC REVIEW OF CAD SYSTEM BASED APPROACH IN
DIAGNOSING BREAST CANCER AND ANALYZE EFFECTIVENESS
OF MACHINE LEARNING AND DEEP LEARNING ALGORITHMS IN
EARLY DETECTION
SUSHOVAN CHAUDHURY1*, SHAMEEK MUKHOPADHYAY2, SADEM NABEEL
KBAH3, DR. KARTIK SAU4
1: Research Scholar, University of Engineering and Management, Kolkata, WB, India
2: Assistant Professor, The Heritage Academy, Kolkata, WB, India
3: Assistant Lecturer, Department of Biomedical Engineering, University of Baghdad,
Baghdad, Iraq
4: Professor, University of Engineering and Management, Kolkata, WB
*Corresponding Author: Sushovan Chaudhury; E Mail: sushovan.chaudhury@gmail.com
https://doi.org/10.31032/IJBPAS/2021/10.11.1069
Received 20th July 2021; Revised 22nd Aug. 2021; Accepted 30th Sept. 2021; Available online 1st Nov. 2021
ABSTRACT
This study intends to shed light on the different treatment gateways of breast cancer.
Women around the globe are badly affected by this life-threatening disease, yet everyone
should be aware that it can be tackled if it is detected at the initial stage. In India, a
very large number of women are affected by this fatal carcinoma, which results in a huge
death rate. MRI, biopsy, USG, mammography, histopathological images and many other
diagnostic tests can confirm the presence of breast cancer in women. This paper focuses on
predicting whether test samples are malignant or not by studying how machine learning based
computer aided systems work. By reviewing many important and promising papers in this area,
we found that there is an established system for the detection of carcinoma known as
Computer Aided Detection (CAD). This system consists of several stages: image
pre-processing, image segmentation, extraction of relevant features and image
classification. We also found from the review that the efficiency of CAD systems increases
when methodologies like CART, Decision Tree Classifier (DT), Logistic Regression (LR),
Naïve Bayes (NB), Ensemble, Random Forest Classifier (RF), and K-nearest neighbor
classifiers (KNN) are applied to the extracted features. We reviewed several research
papers and found a plethora of methodologies available for early detection of breast cancer
using CAD. When the WBCD dataset was evaluated using an Ensemble technique, it recorded
about 98% accuracy. Previously, radiologists could not diagnose breast cancer with as much
efficacy, because the many efficient techniques available nowadays were scarce. Although
the ultimate result of the tests depends on the diagnostic ability of the radiologists,
they get a significant amount of assistance from the latest methodologies.
Keywords: MRI, Biopsy, CAD, Histopathology, Invasive Ductal Carcinoma, Machine
Learning, Deep Learning
1. INTRODUCTION
Breast cancer is one of the most common
causes of cancer death after lung cancer.
Early identification and effective carcinoma
therapy may broaden the treatment options
and decrease the mortality rate. As reported
in [1], there were about 2 million new cases
of breast cancer worldwide, resulting in the
death of more than 620,000 people, in
2018. Breast cancer incidence is higher in
western countries such as the USA than in
African and Asian countries. The disease has
increased worldwide at a rate of 0.5%
annually, and the increase is larger in Asian
countries, at around 3-4% [2]. In
underdeveloped nations, breast cancer
mortality and morbidity are prevalent [3, 4].
It is observed that in Indian women breast
cancer is detected at a very young age;
thankfully, in most cases they are diagnosed
early, which helps oncologists treat them
better and save their lives. It is also seen
that cervical cancer is more prevalent in
rural areas, whereas breast cancer is more
common in urban women [5]. There are many
lifestyle-related reasons for this
difference. The rate of incidence of
breast cancer in cities like Delhi, Mumbai,
Bangalore, Chennai and Kolkata is 41.0,
33.6, 34.4, 37.9 and 25.5 cases per
100,000 women respectively. A
large number of breast cancer cases is
reported per year in India, as shown in
Table 1. For the period 2011 to 2014, the
National Cancer Registry Program (NCRP)
observed an annual change of 0.68% in cancer
cases overall, with an annual change of
about 2% for breast cancer. From 1998 to
2012, the number of reported cases of breast
cancer
has increased manifold. In [6], it is reported
that the Annual Percentage Change (APC)
rose from 0.91% to 5.31% in Delhi
during this period. The authors also
predicted that about 10% of all cancers
would be breast cancer, which is a
threat to women’s health in our country.
The nature of breast cancer in India is in no
way similar to that in western countries.
Many researchers [7] reported that Indian
women develop this disease at a
much younger age than women in
developed countries. Their tumors are
larger, and they more often have
hormone-receptor-negative disease, lower
ratings, more positive lymph nodes and
aggressive illness. In [8], it was observed
that tumors in stages 1, 2 and 3 are an
independent risk factor for premature
mortality in control-matched patients.
1.1 Breast Cancer Susceptibility
In the US, the median age of breast cancer
patients is reported to be 62 years, and the
age range is 60 to 69 years [9]. India is a
country with diversity in all aspects, such
as economic conditions, education, climatic
conditions and cultural heritage. The age
range for the urban population in India is
found to be 40 to 49 years, whereas in rural
areas it is between 65 and 69 years [10-12].
Indian women are affected early in life and
mostly present at advanced stages. Some
researchers found that Indian women are
diagnosed with breast cancer on the basis of
symptoms, and that mammographic detection
does not work in their case. It is also seen
that almost 60% of patients are detected in
stage 3 or 4, leading to a higher death rate
[1], [13]. In [14, 15], the authors reported
that in 62% of cases among women from
Northern India the disease was diagnosed at
TNM stage III. Unfortunately, only 1.4% are
diagnosed in stage 1. They also noted that
Tata Memorial Hospital, Mumbai, reported 54%
of patients presenting at an advanced stage,
and that women from urban areas tended to
report at an early stage of the disease
(OR = 0.64). The authors of [16] found that
the stage at diagnosis depends on the
patient's level of education, socioeconomic
background, area of residence and marital
status. A few studies found that mortality
depends on presentation of the disease at a
later stage; outcomes are better if the
diagnosis is made within three months.
1.2. Breast Cancer Assessment and
Motivation-In Context of Asian
demographics and Indian Subcontinent
Mammography is a specialized imaging
method for assessing the breast
using low-dose X-rays [18]. Mammography
is the best-known method for preliminary
screening, but it has certain limitations [19,
20]. Breast density can be a
misleading element that makes it difficult
to diagnose cancer in women with dense
breasts [19, 21-23]. Figure 1 demonstrates
the different breast densities in women
obtained through breast ultrasound [24-29].
Ultrasound has proved to be one of the
efficient tools for gauging breast issues.
Clinicians usually recommend it for scanning
the breasts, especially during lactation and
pregnancy. It can also be recommended for
biopsy guidance and for locating masses.
Figure 2 shows how mammograms can
detect the presence of lesions in the human
breast [23]. However, ultrasound is well
suited to detecting invasive ductal
carcinoma in dense breasts, as shown by
Costantini et al. [25], [30]. MRI is usually
recommended for screening women who
have a high risk of developing breast
cancer and is often used to investigate
suspicious areas found by the mammogram
and to help determine the dimensions of
the mass. The interpretation/prediction
procedure of MRI imaging, as shown in
Figure 3, is extremely time-consuming and
requires a considerable level of radiologist
expertise to distinguish between
benign and malignant lesions,
as shown by [31], [32]. Recent studies have
shown that computer systems developed to
facilitate MRI image analysis improve
treatment and diagnosis many fold, as
demonstrated in [31], [33] and [34].
1.3. Motivation behind CAD system
based diagnosis and performance
evaluation.
The contrast between the tumor and the
background of the image is particularly poor
in images of dense breasts, which might
alter the results of the diagnosis.
Non-cancerous lesions are commonly misread
as cancer in mammographic examination
(false positives), whereas malignancies are
frequently overlooked (false negatives).
Therefore, radiologists often fail to detect
breast cancers [20]. Several strategies have
been proposed to strengthen the sensitivity
and specificity of mammography and avoid
needless biopsies. Double reading is one of
the strategies that contribute significantly
towards achieving high sensitivity and
specificity. CAD systems may be regarded as
a supplementary mechanism that improves the
doctor's interpretation by acting as a
powerful second reader.
An autonomous CAD system for cancer cell detection
based on computer vision can help radiologists
distinguish cancer cells from non-cancer cells.
Breasts are also analyzed using histopathology
images in a few studies [35]. Bhardwaj et al. used
deep neural networks to classify breast cancer,
while Niwas et al. extracted wavelet features from
histopathological images [36], [37]. Figure 4
presents a sample of histopathological images
[35].
Table 1: State wise statistics
States/UT 2016 2017 2018
Jammu & Kashmir 1421 1516 1618
Himachal Pradesh 613 647 681
Punjab 3321 3503 3694
Chandigarh 196 207 219
Uttaranchal 1217 1298 1384
Haryana 3103 3308 3526
Delhi 3181 3351 3530
Rajasthan 7536 7996 8483
Uttar Pradesh 21376 22737 24181
Bihar 9958 10644 11378
Sikkim 30 30 31
Arunachal Pradesh 82 84 85
Nagaland 67 67 68
Manipur 273 281 289
Mizoram 97 99 101
Tripura 129 130 132
Meghalaya 104 106 108
Assam 2406 2437 2467
West Bengal 10902 11550 12234
Jharkhand 3716 3962 4225
Orissa 4205 4448 4705
Chhattisgarh 2944 3145 3359
Madhya Pradesh 8334 8858 9414
Gujarat 8001 8504 9039
Daman & Diu 42 47 52
Dadra & Nagar Haveli 54 61 68
Maharashtra 14726 15522 16358
Telangana 4633 4918 5220
Andhra Pradesh 5901 6251 6620
Karnataka 8029 8527 9055
Goa 233 247 262
Lakshadweep 14 15 17
Kerala 5682 6189 6748
Tamil Nadu 9486 9870 10269
Pondicherry 227 242 257
Andaman & Nicobar Islands 44 45 47
Total 142283 150842 159924
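As a quick arithmetic check on Table 1, the year-over-year change in the national totals can be computed directly. The snippet below is a minimal sketch: the only inputs are the values from the "Total" row of Table 1, and the helper name `percent_change` is illustrative and does not come from the paper.

```python
# Totals taken from the "Total" row of Table 1.
totals = {2016: 142283, 2017: 150842, 2018: 159924}


def percent_change(previous: int, current: int) -> float:
    """Relative change from `previous` to `current`, in percent."""
    return (current - previous) / previous * 100.0


# Pair each year with the following one and compute the change between them.
years = sorted(totals)
changes = {
    (y0, y1): percent_change(totals[y0], totals[y1])
    for y0, y1 in zip(years, years[1:])
}

for (y0, y1), pct in changes.items():
    print(f"{y0} -> {y1}: {pct:.2f}% change in reported cases")
```

Both intervals work out to roughly 6% growth per year in the reported totals.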
Figure 1: densities of breast in women obtained through breast ultrasound
Figure 2: Detection of presence of lesions in human breast through mammograms
Figure 3: The interpretation/prediction procedure of MRI imaging
Figure 4: Sample histopathological images
Any CAD system is primarily based on the
following 5 stages.
1) Image Preprocessing: Any biomedical
image preprocessing technique involves
removing noise from the acquired images. It
also involves resizing the image, enhancing
the image intensity as shown by [38],
adjusting brightness and contrast, or
converting the image to grayscale.
2) Image Segmentation: Image segmentation
is a key element of computer vision and
pattern recognition. Segmentation techniques
allow us to identify important areas, such
as the tumor or lesion, and to extract
various features for further analysis.
Based on the properties of images,
segmentation approaches can be classified as
follows:
Similarity-based
Discontinuity-based
Edge-based segmentation is an example of the
discontinuity-based approach. Lee et al.
further divided the similarity approach into
threshold, region-based and clustering
methods [39]. Each procedure has its own
benefits and limitations and is chosen
according to the individual application and
imaging method.
3) Feature Extraction: The characteristics
of the lesions in the images are captured as
distinct attributes. These features are
used to categorize benign and malignant
tumors in the next stage. One of the real
challenges of the feature extraction process
is the size of the feature set. Feature
extraction ordinarily means computing
feature descriptors from an image to reduce
the quantity of data. Features are
characteristics of the whole image or of the
ROI. Often an image descriptor can be
classified into three dimensions: shape,
pattern and spectra, and density, as claimed
in [40]. Feature matching techniques can
also be employed by comparing the key points
within the feature descriptor using
algorithms like SIFT, SURF, BRIEF and ORB.
4) Classification: It is essential that a
trustworthy classifier is applied to
differentiate between cancer and non-cancer
cells. Machine learning models such as
Linear Regression, Logistic Regression, DT,
RF, Ensemble techniques, SVM, KNN, NB and
CART have traditionally been used for
classification. In this paper we discuss
different such approaches and the accuracy
achieved in the most prominent works of the
last decade on the topic.
5) Performance Review or Evaluation: As
in most systems, a CAD system for the
detection of breast cancer demands high
accuracy and precision. We have considered
key measures such as Sensitivity, F1
score, True Positive and True Negative and
Overall Accuracy to support our claim that a
significant amount of research has been
done on cancer classification using CAD
systems. This paper thoroughly reviews the
performance of crucial works from the last
decade.
The flow of all five stages of the CAD
system is illustrated in Figure 5.
Figure 5: Stages of a CAD system
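The first three stages listed above can be sketched on a toy example. The code below is only an illustration under stated assumptions, not the method of any reviewed paper: the 6x6 image is synthetic, min-max contrast stretching stands in for preprocessing, Otsu's threshold stands in for similarity-based (threshold) segmentation, and the two features (area and mean intensity) are deliberately simple. All function names are hypothetical.

```python
from statistics import mean

# Synthetic 6x6 grayscale image (0-255): a bright 3x3 blob on a dark background.
IMAGE = [
    [52, 55, 50, 53, 51, 54],
    [50, 120, 130, 125, 52, 50],
    [53, 128, 140, 135, 55, 51],
    [51, 122, 133, 129, 50, 53],
    [54, 52, 51, 50, 53, 52],
    [50, 53, 52, 54, 51, 50],
]


def stretch_contrast(img):
    """Stage 1 (preprocessing): min-max rescale intensities to 0..255."""
    lo = min(min(row) for row in img)
    hi = max(max(row) for row in img)
    if hi == lo:  # flat image: nothing to stretch
        return [[0 for _ in row] for row in img]
    return [[round((p - lo) * 255 / (hi - lo)) for p in row] for row in img]


def otsu_threshold(img):
    """Stage 2 (segmentation): threshold maximizing between-class variance."""
    pixels = [p for row in img for p in row]
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w_bg, sum_bg = 0, 0
    for t in range(256):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        if w_bg == 0:
            continue  # no background pixels yet
        w_fg = total - w_bg
        if w_fg == 0:
            break  # every pixel is already background
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        var_between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def segment(img, threshold):
    """Binary mask: 1 where the pixel is brighter than the threshold."""
    return [[1 if p > threshold else 0 for p in row] for row in img]


def extract_features(img, mask):
    """Stage 3 (feature extraction): area and mean intensity of the region."""
    region = [p for row, mrow in zip(img, mask) for p, m in zip(row, mrow) if m]
    return {"area": len(region), "mean_intensity": mean(region) if region else 0.0}


enhanced = stretch_contrast(IMAGE)
mask = segment(enhanced, otsu_threshold(enhanced))
features = extract_features(enhanced, mask)
print(features)
```

On this toy image the mask isolates the 3x3 bright blob, so the extracted area is 9 pixels. In a real CAD system the resulting feature vector would be passed on to the classification and evaluation stages.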
The rest of this paper is organized as
follows. The literature review section
discusses previous studies and the different
techniques used in this field, along with a
comparison among them. The research gap
section discusses some of the unexplored
areas. Finally, the conclusion section
describes the applicability of the results.
2. Literature Review-Evaluation of
Existing Literature:
Relevant literature from multiple sources
is referred to for breast cancer detection
using different techniques. Various authors
have worked on different datasets over a
period of time, and conclusions are derived
on that basis. Machine learning algorithms
can be classified into the following three
types [41]:
Supervised learning;
Unsupervised learning; and
Reinforcement learning.
Supervised learning is the most common
machine learning approach used to predict
cancer; supervised learning algorithms
operate on the basis of some criteria and
conditions. Genetic algorithms, artificial
neural networks and decision trees are some
of the algorithms used in supervised
learning. Physical examination, imaging and
biopsy are some of the ways to diagnose
breast cancer [42]. The authors of [42] also
state that X-ray is used just to understand
the shape of the breast, whereas mammography
is used for imaging the internal parts of
the breasts. Some studies
[43] reported that when different kinds of
machine learning models are used for
predicting breast cancer, or cancer in
general, they outperform classical
statistical models and expert-based systems.
The authors of [44] tried to distinguish
mammograms of healthy tissue from those of
cancerous tissue by applying DT, SVM and a
Bayes approach. They used 10-fold cross
validation and evaluated statistical
parameters such as positive predictive
value, negative predictive value, sensitivity
and specificity. Some studies suggest a
methodology that computes the contourlet
coefficients of the decomposed image for the
purpose of classifying mammogram images
[45]. The analysts in [46] showed that, for
predicting breast cancer, a combination of
the Mixed Gravitational Search Algorithm
(MGSA) and Support Vector Machine (SVM)
improved on the performance of the
individual models, reaching 93.1%. They used
70% of the data for training and the rest
for testing. One study reported that a CAD
system gives 96% accuracy for the
classification of mammogram images [47].
Classification of mammogram images has been
studied by many researchers. In [48], the
authors studied it by employing KNN and
GLCM. One study found that the performance
of machine learning models varies with the
parameter selection and the dataset. Its
authors reported that SVM combined with a
Gaussian kernel gave the best result for
predicting both recurrent and non-recurrent
breast cancer [49]. The authors of [50]
trained SVM, DT, NB and k-NN on the WBC
dataset [51] and noticed that SVM
outperforms all the other classifiers. A
similar study was done by the researchers of
[52]. In [53], data mining techniques were
used to explore the risk factors for
predicting breast cancer, while in [54] two
machine learning methods (ANN and SVM) were
compared for breast cancer detection. A
nested ensemble approach for distinguishing
benign breast tumors from malignant cancers
was proposed by [55], using Stacking and
Voting as combined classification
techniques. The authors of [56] worked on
the WDBC dataset. In their study, the
dimensionality is first reduced using PCA
and then machine learning models are trained
to classify tumors. A similar study is made
by [57]. Its authors performed experiments
on two standard databases, i.e., Wisconsin
Prognostic Breast Cancer (WPBC) and WBC,
using various machine learning approaches
including decision tree, NN and SVM to
classify tumors. Another comparative
analysis is done by the researchers of [58].
When the WPBC dataset was used, it was
observed that the use of
PCA improved the results significantly. In
[59], the authors employed an ANN
classification model and extracted the
parameters using PCA. Based on the WPBC
dataset, some studies used machine learning
algorithms to predict recurrent cases of
breast cancer [60]. Some studies compared
different ML algorithms for predicting
recurrent or non-recurrent breast cancers
[61], [62]. A combination of a neural
network and a weighted Naïve Bayes
classifier was used in several studies,
which demonstrated the improved performance
of these models [63], [64]. The authors of
[65] developed models for the prediction of
breast cancer based on Radial Basis Function
Network, Naïve Bayes (NB) and Decision Tree
(DT). They found NB to be the most efficient
model, with 97.36% accuracy. One paper
combined a GA with a feed-forward neural
network and used this combination for
classification [66]. In [67], the
researchers studied the survival
probabilities of breast cancer patients
using different survival analysis models.
They used two different breast cancer
datasets to support their hypothesis. Some
researchers showed that ML algorithms have
improved the accuracy of classification and
prediction models manifold; they reached
this conclusion after reviewing many
articles on the application of different ML
algorithms for the classification and
prediction of breast cancer [68]. The
authors of [69] showed that machine learning
algorithms are capable of yielding almost
100% prediction accuracy when tested on the
Wisconsin Diagnostic Breast Cancer dataset.
The researchers in [70] studied the
performance of breast cancer classification
using KNN and NB classifiers. The model was
trained using 683 samples of the Breast
Cancer Dataset. They reported a maximum
accuracy of 97.51%, achieved by the KNN
classifier. A deep model based on a
two-stage architecture is trained in [71],
with ResNet used as a building block of the
proposed architecture. Results suggest that
the deep model successfully predicts the
presence of cancer in the breast with an AUC
of 89.50%. A similar study based on deep
learning and inspired by U-Net is done by
the authors of [72]. Extending the study of
deep learning for breast cancer
classification, in [73] automatic and robust
features are extracted using deep neural
networks and trained using deep ensemble
transfer learning. The authors reported 88%
classification accuracy with an area under
the curve of 0.88. In the most recent work,
a new CAD system is proposed which uses
multi-DCNN to classify breast cancer [74].
The CBIS-DDSM and MIAS datasets are used to
evaluate the performance of the deep models.
Experimental results show an improvement in
accuracy using
deep feature fusion compared to
state-of-the-art methods. In [75], a deep
neural network architecture is proposed to
study breast cancer classification using
histopathological images. Results suggest
that, among architectures with different
numbers of layers, a 19-layer CNN performed
well. A hybrid approach based on a
combination of a graph convolutional network
and a convolutional neural network is
proposed in [76]. A few more recent studies
of breast cancer classification based on
deep learning are available in [77-80].
Table 2 lists a few popular studies made in
the last decade on breast cancer detection
using deep learning and ML techniques.
Table 3 compares this study to other review
papers published in the last decade. Table 4
lists the different modalities taken as
benchmarks by researchers to study and
implement ML/DL techniques on the
appropriate dataset. In Table 5 we present
how different ML/DL techniques have been
used in the analysis and classification of
benign and malignant tumors of the breast.
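Several of the studies reviewed above evaluate classifiers such as KNN with 10-fold cross validation. The following is a minimal sketch of that evaluation scheme, assuming a plain k-nearest-neighbour classifier and synthetic two-feature data; it does not reproduce any reviewed experiment, and the helper names (`knn_predict`, `k_fold_indices`, `cross_val_accuracy`) are illustrative only.

```python
import math
import random
from collections import Counter


def knn_predict(train_X, train_y, x, k=3):
    """Majority label among the k training points closest to x."""
    nearest = sorted(
        zip(train_X, train_y), key=lambda pair: math.dist(pair[0], x)
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def k_fold_indices(n_samples, n_folds, seed=0):
    """Shuffle sample indices once and deal them into n_folds disjoint folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::n_folds] for i in range(n_folds)]


def cross_val_accuracy(X, y, n_folds=10, k=3):
    """Mean held-out accuracy of k-NN across the folds."""
    scores = []
    for test_idx in k_fold_indices(len(X), n_folds):
        held_out = set(test_idx)
        train_X = [X[i] for i in range(len(X)) if i not in held_out]
        train_y = [y[i] for i in range(len(X)) if i not in held_out]
        correct = sum(
            knn_predict(train_X, train_y, X[i], k) == y[i] for i in test_idx
        )
        scores.append(correct / len(test_idx))
    return sum(scores) / len(scores)


# Synthetic stand-in for extracted lesion features (not real patient data):
# label 0 ("benign") clusters near (2, 2), label 1 ("malignant") near (6, 6).
rng = random.Random(42)
X = [(rng.gauss(2, 0.5), rng.gauss(2, 0.5)) for _ in range(50)]
X += [(rng.gauss(6, 0.5), rng.gauss(6, 0.5)) for _ in range(50)]
y = [0] * 50 + [1] * 50

accuracy = cross_val_accuracy(X, y, n_folds=10, k=3)
print(f"10-fold cross-validated accuracy: {accuracy:.3f}")
```

Each sample is held out exactly once, so the averaged score reflects performance on data the classifier did not see during training, which is why the reviewed papers prefer it to a single train/test split.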
In the past few decades, breast cancer
classification has gained the attention of
many researchers, and many novel methods and
techniques have been proposed. A few
researchers have summarized the
methodologies and published them as surveys.
Table 3 presents a few important surveys
available in the literature (index values 2
to 7) and compares them with our study. From
Table 3, it can be noticed that our survey
presents an extensive study and includes
almost all machine learning techniques that
are being used for the classification of
breast cancer. Our study is distinguished
from the indexed surveys by its study of
Indian and Asian demographics.
Table 2: Comparison of different BCD techniques in literature using different ML and DL Algorithms
Reference | Techniques/Methods used | Area | Result
[83] | Applied different classification techniques on BCW dataset for detection of breast cancer. | MLP using back propagation NN (MLP BPN) and SVM to diagnose and analyze breast cancer; performance is evaluated by calculating statistical parameters. | SVM is found to produce the lowest average error compared to MLP BPN.
[42] | ANN, GA, DT, LDA, and KNN have been applied. | Diagnosis and detection of breast cancer using different modalities, including physical examination, biopsy and imaging. | Texture analysis is a tested methodology that may be efficiently employed for classification of noncancerous and cancerous lesions with Sensitivity: 94.28%, Specificity: 100%, Accuracy: 97.80%, AU-ROC: 0.9714.
[57] | Comparative study is done for different ML/DL techniques like DT, NB, NN and SVM. | Objective is to classify the labels in WPBC and WBC datasets. | NN: 98.09% in WBC dataset, and SVM-RBF: 98.32% in WPBC dataset, using 10-fold cross validation (cv=10).
[46] | MGSA and SVM | Goal is to classify breast cancer as per given labels using machine learning techniques. | Outcome: SVM with 24 features: 86%; MGSA-SVM with 12 features: 93.1%.
[47] | CAD | Normal and abnormal breast tissues differentiation for visual diagnostic aid of the radiologists. | Maximum accuracy of 96% is found using 3NN.
[62] | Breast cancer is detected using the “Relevance vector machine” (RVM). | LDA was used as a dimensional reduction method and the reduced features were fed into the classifier. | The “Relevance vector machine” outperforms other ML classifiers in classifying the labels appropriately.
[60] | Use cases of invasive ductal carcinoma in the subjects on the basis of vital features as predicted by ML techniques. | WPBC | SVM and DT (C 5.0): 81% (highest); FCM: 37% (lowest).
[44] | SVM, Bayes approach and DT | Distinguish cancer mammograms from normal samples. Dataset is broken down into train, test and validation sets and the model is subjected to training, taking cv=10. | NPR, FPR and AUC were measured. From the results it is observed that different feature extracting strategies and classifiers yield different and effective results to detect breast cancer in the given dataset.
[51] | Naïve Bayes, SVC classifier, RF, C4.5, k-NN and NN | Aim is to classify breast cancer, where different ML/DL techniques are compared for the Wisconsin dataset and reported. | SVM and RF produced highest classification accuracy.
[52] | SVM, GRU-SVM, LR, MLP, NN search and Softmax Regression | WDBC dataset is being used for experimentation. | From the result it is noticed that MLP reports significantly higher accuracy when compared to other models.
[70] | NB and KNN are used as classification technique for breast carcinoma detection. | Identification using ML techniques. A set of Breast Cancer Image Dataset is used which consists of a total of 683 samples. Dataset is broken down into training and test sets in a 60:40 ratio. | Highest accuracy is achieved by K-NN: 97.51%, while Naive Bayes classifier produced 96.19% accuracy.
[72] | As in U-Net, a DL framework is proposed for initial detection of breast carcinoma and performance is compared with architectures like AlexNet, VGGNet and GoogleNet. | CBIS-DDSM is used to train the deep model, which contains ‘Curated Breast Imaging Subsets’. | Classification accuracy: Micro calcification: 94.31%; Masses: 95.01%.
[55] | Two-layered nested ensemble technique is used along with SV-NaiveBayes-3-MetaClassifier and SV-BayesNet-3-MetaClassifier and compared with Bayesian Network, NB, SGD and Logistic model tree. | Invasive Ductal Carcinoma was detected using ensemble techniques. Dataset being used is WBCD. | Among all other classifiers, the proposed SV-NaiveBayes-3-MetaClassifier generated highest accuracy: 98.07%.
[71] | A DNN based on a two-stage framework is proposed where ResNet is used as a building block of the model. | Diagnostic aid for breast cancer detection: performance of the model is examined over two million exams having 10 million image samples, thereby a large validation set. | The result shows that the model is capable of predicting the presence of breast carcinoma with an AUC of 89.50%.
[73] | Deep ensemble transfer learning approach is used to distinguish cancerous and noncancerous lesions using features extracted by DNN. | Classification of cancerous and non-cancerous lesions. The CBIS-DDSM dataset is used for experiments. | The classification accuracy achieved is 88% with AUC value as 0.88.
[74] | A new CAD system is proposed which uses multi-DCNN to classify breast carcinoma. Deep feature fusion is also performed and SVM is used as a classifier. | Uses deep convolutional neural networks for classification. The CBIS-DDSM and MIAS dataset is used for evaluation of the performance of deep models. | Result suggests an improvement of accuracy using deep feature fusion compared to traditional methods.
[75] | A DNN architecture is proposed to study breast cancer classification using histopathological images. | ‘Histopathological biopsy images’ are used for breast cancer detection. AMIDA13 and MITOS-ATYPIA dataset is used to train deep models. | Results suggest that among different numbers of layered architectures, 19-layer CNN performed better.
[76] | A hybrid approach based on amalgamation of graph-based CNN (GCN) and conventional CNN is proposed. | The malignancy is classified using DL. The model is experimented on breast dataset mini-MIAS. | Statistical parameters are reported as follows: Sensitivity: 96.20%; Specificity: 96%; Accuracy: 96.10%.
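The statistical parameters reported in Table 2 (sensitivity, specificity, accuracy) all derive from the four cells of a binary confusion matrix. The sketch below shows the standard definitions, with malignant treated as the positive class; the class name `ConfusionMatrix` and the counts are hypothetical and are not taken from any of the reviewed studies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    """Counts for a binary benign/malignant classifier (malignant = positive)."""

    tp: int  # malignant, predicted malignant
    fn: int  # malignant, predicted benign
    tn: int  # benign, predicted benign
    fp: int  # benign, predicted malignant

    @property
    def sensitivity(self) -> float:
        """True positive rate: share of malignant cases that are caught."""
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        """True negative rate: share of benign cases correctly cleared."""
        return self.tn / (self.tn + self.fp)

    @property
    def accuracy(self) -> float:
        """Share of all cases classified correctly."""
        return (self.tp + self.tn) / (self.tp + self.fn + self.tn + self.fp)

    @property
    def precision(self) -> float:
        """Positive predictive value: share of malignant calls that are right."""
        return self.tp / (self.tp + self.fp)

    @property
    def f1(self) -> float:
        """Harmonic mean of precision and sensitivity."""
        return 2 * self.tp / (2 * self.tp + self.fp + self.fn)


# Hypothetical counts for 200 test samples (100 malignant, 100 benign).
cm = ConfusionMatrix(tp=90, fn=10, tn=95, fp=5)
for name in ("sensitivity", "specificity", "accuracy", "precision", "f1"):
    print(f"{name}: {getattr(cm, name):.4f}")
```

Reporting sensitivity and specificity alongside accuracy matters in this setting because a missed malignancy (false negative) and a needless biopsy (false positive) carry very different costs.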
Table 3: Comparison of our survey along with other popular surveys
Models / Sl. No: 1, 2, 3, 4, 5, 6, 7
Machine learning models: SVM, Decision Trees, K-NN, Logistic Regression, NB, Artificial Neural Network (ANN)
Methods / data set: Gaussian kernel, Wisconsin dataset, Indian and Asian demographics
Table 4: A table with reference of different testing modalities and their performance studied in different papers
Modalities References
Mammography [18], [19], [20], [21], [22], [23]
Ultrasonography [24], [25], [26], [27], [28], [29], [30]
MRI [31], [32], [33], [34]
Biopsy histopathological images [35], [36], [37]
Table 5: Popular Machine learning techniques used for Breast cancer diagnosis in various researches
Machine Learning Models References
Support vector machine (SVM) [44], [46], [50], [51], [52], [53], [54], [60], [61], [68]
Decision Trees [50], [53], [57], [60], [65], [68]
K Nearest Neighbors (KNN) [48], [49], [50], [58]
Logistic Regression [49]
Naïve Bayes [49], [51], [58], [63], [65]
Artificial Neural Network (ANN) [53], [54], [59], [67], [68]
3. Research Gap
Compared to traditional image processing methods, the application of machine learning and deep learning to breast cancer classification has drastically improved classification accuracy. A large body of research has been carried out over the last few decades [81, 82], and a few authors have summarized the recent trends in this domain [4], [40]. However, a few points still need to be investigated. In this study, we tried to bridge this gap by covering the most popular and recent work on breast cancer classification using machine learning and deep learning techniques. We also discussed the different stages of CAD systems in detail. In addition, we highlighted different testing modalities, viz. Mammography [19, 20], Ultrasonography [24-28], Biopsy histopathological images [35-37] and MRI [31-34], along with their performance and limitations. Finally, we focused on breast cancer trends in Indian and Asian demographics. This study discusses different types of machine learning techniques and also reports which algorithm works well with which database. We believe that this study will help beginners understand past research and recent trends in the field of breast cancer classification, and will help them choose appropriate algorithms for their research work. We also present a list of datasets in Table 6, along with their sources, the modality used in each dataset, and the research that can be undertaken on them. Some of the areas of research that can be explored are as follows:
1. Using transfer learning techniques on histopathological images.
2. Using knowledge distillation and semi-supervised techniques on available histopathological biopsy images, with results validated against images checked by medical experts.
3. Using active learning to train models on the available datasets obtained through different modalities such as mammogram, MRI, and biopsy.
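Of the directions listed above, active learning is the simplest to illustrate in a few lines. The following is a minimal sketch of the uncertainty-sampling query step, assuming a classifier that already outputs a malignancy probability for each unlabeled image. The function name `select_for_labeling`, the image identifiers, and the probabilities are all hypothetical.

```python
def select_for_labeling(probabilities, k):
    """Pick the k unlabeled samples whose predicted malignancy
    probability is closest to 0.5, i.e. where the model is least certain."""
    ranked = sorted(
        probabilities,
        key=lambda sample_id: abs(probabilities[sample_id] - 0.5),
    )
    return ranked[:k]


# Hypothetical model outputs for five unlabeled images
unlabeled = {"img_01": 0.97, "img_02": 0.52, "img_03": 0.10,
             "img_04": 0.45, "img_05": 0.80}
queries = select_for_labeling(unlabeled, k=2)
print(queries)  # the cases to send to a medical expert for labeling
```

In a full loop, the selected cases would be labeled by an expert, added to the training set, and the model retrained before the next query round, so that expert effort is spent on the images the model finds hardest.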
Table 6: A comprehensive overview of some publicly available datasets in the area of Breast Cancer Research and open areas of research
Available Data Set | Modality | Source | Scope of Research
WBCD | Numerical values of cell nuclei extracted from FNAB histopathological images of the breast | UCI Machine Learning Repository, Kaggle | Exploratory analysis, application of new ML/DL techniques, feature extraction techniques like PCA, LDA and Factor Analysis
Breast Histopathological Images | Histopathological biopsy to detect invasive ductal carcinoma | https://www.kaggle.com/paultimothymooney/breast-histopathology-images | Feature extraction, feature matching, classification, knowledge distillation, transfer learning, CNN, big data analysis of images
MIAS Mammography | Mammogram | https://www.kaggle.com/kmader/mias-mammography | Segmentation, finding ROI, object detection using Mask R-CNN and YOLO v4, feature extraction, classification
CBIS DDSM | Mammograms | http://www.eng.usf.edu/cvprg/Mammography/Database.html | Segmentation, finding ROI, object detection using Mask R-CNN and YOLO v4, feature extraction, classification, RNN, GAN, big data analysis
BACH 2018 | Histopathology biopsy | https://iciar2018-challenge.grand-challenge.org/Dataset/ | Semi-supervised KD, GAN, big data analysis, autoencoders, multi-label classification
SEER Breast Cancer Dataset | Numerical attributes extracted from patient EMR | IEEE DataPort | Exploratory data analysis, applying statistical methods to BCD for meaningful insights
Breast Ultrasound Images | USG of the breast | https://www.kaggle.com/aryashah2k/breast-ultrasound-images-dataset | Segmentation, detection and classification
CONCLUSION
Given the statistics, the emerging trends, and the rising incidence of breast cancer in India and other parts of the world, the study of breast cancer has become the need of the hour, though obtaining appropriate data for research remains a challenge. Socio-economic conditions vary across the world, and radiologists are not always accurate in diagnosing breast cancer. CAD systems can therefore be a valuable tool to assist radiologists and corroborate their predictions. The major aim of this study is to highlight the research conducted on ML and DL techniques for the prediction of breast cancer. This article will help beginners who wish to explore machine learning algorithms for classification problems and their performance on different breast cancer testing modalities. In this review, the performance of different ML/DL techniques is assessed and compared. The results indicate that the efficiency of a CAD system can be improved significantly by applying appropriate algorithms, which can in turn enhance radiologists’ performance. We have discussed the dataset options available and what kind of results each kind of dataset can yield. We observed that machine learning methods have demonstrated a strong capacity to classify and predict cancer cells, with significant improvement in accuracy, using computer-vision techniques.
REFERENCES
[1] F. Bray, J. Ferlay, I. Soerjomataram,
R. L. Siegel, L. A. Torre, A. Jemal,
“Global cancer statistics 2018:
GLOBOCAN estimates of incidence and
mortality worldwide for 36 cancers in 185
countries.” CA Cancer J Clin 68(6): 394–
424
[2] M. Green, V. Raina, “Epidemiology,
screening and diagnosis of breast cancer
in the Asia Pacific region: current
perspectives and important
considerations.” Asia Pac J Clin Oncol 4:S5–S13
[3] N. Li, Y. Deng, L. Zhou, T. Tian, S.
Yang, Y. Wu, Y. Zheng, Z. Zhai, Q. Hao,
D. Song, D. Zhang, H. Kang, Z. Dai,
“Global burden of breast cancer and
attributable risk factors in 195 countries and
territories, from 1990 to 2017: results from
the global burden of disease study 2017.” J
Hematol Oncol 12(1):140
[4] R. Dikshit, P. C. Gupta, C. R.
Sundarahettige, V. Gajalakshmi, L.
Aleksandrowicz, R. Badwe, R. Kumar, S.
Roy, W. Suraweera, F. Bray, M. Mallath,
P. Singh, D. N. Sinha, A. S. Shet, H.
Gelband, P. Jha, “Cancer mortality in India:
a nationally representative survey.” Lancet
379:1807–1816
[5] National Cancer Registry Programme
(2016) Three-year report of population-
based cancer registries:2012-2014,
Chapter-2 leading site of cancer. Indian
Council of Medical Research. New Delhi
[6] National Cancer Registry Programme
(2016) Three-year report of population-
based cancer registries:2012-2014,
Chapter-10 Trends over time for all sites
and on selected sites of cancer and
projection of burden of cancer. Indian
Council of Medical Research. New Delhi
[7] A. Mathew, M. Pandey, B. Rajan, “Do
younger women with non-metastatic &
non-inflammatory breast carcinoma have
poor prognosis?” World J Surg Oncol 2:2
[8] M. A. Maggard, J. B. O’Connell, K. E.
Lane, J. H. Liu, D. A. Etzioni, “Do young
breast cancer patients have worse
outcomes?” J Surg Res 113:109–113
[9] C. E. DeSantis, J. Ma, M. M. Gaudet, L.
A. Newman, K. D. Miller, A. Goding
Sauer, A. Jemal, R. L. Siegel, “Breast
cancer statistics.” CA Cancer J Clin 69:438–
451
[10] A. Chauhan, S. Subba, R. G. Menezes,
B. S. Shetty, V. Thakur, S. Chabra, R.
Warrier (2011), “Younger women are
affected by breast cancer in South India - a
hospital based descriptive study.” Asian Pac
J Cancer Prev 12:709–711
[11] V. Raina, M. Bhutani, R. Bedi, A.
Sharma, S. V. S. Deo, N. K. Shukla, N. K.
Mohanti, G. K. Rath (2005) “Clinical
features and prognostic factors of early
breast cancer at a major cancer center in
North India.” Indian J Cancer 42:40–45
[12] G. Agarwal, P. V. Pradeep, V.
Aggarwal, C. H. Yip, P. S. Cheung (2007),
“Spectrum of breast cancer in Asian
women”. World J Surg 31:1031–1040
[13] S. P. L. Leong, Z. Shen, T. Liu, G.
Agarwal, T. Tajima, N. S. Paik, K. Sandel,
A. Derossis, H. Cody, W. D. Foulkes,
(2010) “Is breast cancer the same disease
in Asian and Western women?” World J
Surg 34:2308–2324
[14] S. Saxena, B. Rekhi, A. Bansal, A.
Bagga, M. S. S. Chintamani (2005),
“Clinico-morphological patterns of breast
cancer including family history in a New
Delhi hospital, India a cross-sectional
study”. World J Surg Oncol 3:67
[15] J. A. Sathwara, G. Balasubramaniam,
S. C. Bobdey, A. Jain, S. Saoba, (2017)
“Socio demographic factors and late-stage
diagnosis of breast Cancer in India: a
hospital-based study.” Indian J Med
Paediatr Oncol 38(3):277–281
[16] F. Kaffashian, S. Godward, T. Davies,
L. Solomon, J. McCann, S. W. Duffy,
(2003) “Socioeconomic effects on breast
cancer survival: proportion attributable to
stage and morphology.” Br J Cancer
89:1693–1696
[17] M. A. Richards, A. M. Westcombe, S.
B. Love, P. Littlejohns, A. J. Ramirez,
(1999) “Influence of delay on survival in
patients with breast cancer: a systematic
review.” Lancet 353:1119–1126
[18] T. W. Freer, M. J. Ulissey, “Screening
mammography with computer-aided
detection: Prospective study of 12,860
patients in a community breast center”,
Radiology, 2001,220:781-6.
[19] M. G. Ertosun and D. L. Rubin,
"Probabilistic visual search for masses
within mammography images using deep
learning," 2015 IEEE International
Conference on Bioinformatics and
Biomedicine (BIBM), Washington, DC,
USA, 2015, pp. 1310-1315, doi:
10.1109/BIBM.2015.7359868.
[20] N. F. Boyd, H. Guo, L. J. Martin, L.
Sun, J. Stone Fishel, “Mammographic
density and the risk and detection of breast
cancer”, N Engl J Med, 2007, 356,227-236,
doi: 10.1056/NEJMoa062790
[21] J. K. Jesneck, J. Y. Lo, J. A. Baker,
“Breast mass lesions: computer-aided
diagnosis models with mammographic
and sonographic descriptors”, Radiology,
2007,244: 390-8.
[22] H. D. Nelson, K. Tyne, A. Naik, C.
Bougatsos, B. K. Chan, L. Humphrey,
“Screening for breast cancer: an update for
the US Preventive Services Task Force”,
Ann Intern Med, 2009,151:727-37; W237-
42.
[23] Skåne University Hospital in Malmö,
digital image, accessed 7 April 2021,
https://healthcare-in-europe.com/en/news/3d-mammography-detected-34-more-breast-cancers-in-screening.html
[24] W. A. Berg, J. D. Blume, J. B.
Cormack, E. B. Mendelson, D. Lehrer, M.
Böhm-Vélez, “Combined screening with
ultrasound and mammography vs
mammography alone in women at elevated
risk of breast cancer”, JAMA,
2008,299:2151-63.
[25] K. Drukker, M. Giger, K. Horsch, M.
A. Kupinski, C. J. Vyborny, E. B.
Mendelson, “Computerized lesion
detection on breast ultrasound”,
Med Phys, 2002, 29:1438-46.
[26] N. Ohuchi, A. Suzuki, T. Sobue, M.
Kawai, S. Yamamoto, Y. F. Zheng,
“Sensitivity and specificity of
mammography and adjunctive
ultrasonography to screen for breast cancer
in the Japan Strategic Anti-cancer
Randomized Trial (J-START): a
randomized controlled trial”, Lancet,
2016;387(10016):341-348.
[27] J. R. Scheel, J. M. Lee, B. L.
Sprague, C. I. Lee, C. D. Lehman,
“Screening ultrasound as an adjunct to
mammography in women with
mammographically dense breasts”, Am J
Obstetr Gynaecol, 2015,212:9-17.
[28] W. Svensson, “A review of the
current status of breast ultrasound”, Eur J
Ultrasound. 1997,6:77-101.
[29] How dense are you? digital image,
accessed 7 April 2021,
https://wispecialists.com/3d-automated-whole-breast-ultrasound/
[30] M. Costantini, P. Belli, R. Lombardi,
G. Franceschini, A. Mulè, L. Bonomo,
“Characterization of solid breast masses:
use of the sonographic breast imaging
reporting and data system lexicon”, J
Ultrasound Med, 2006,25: 649-59.
[31] C. Meeuwis, S. M. van de Ven, G.
Stapper, A. M. Fernandez Gallardo, M.
van den Bosch, W. Mali, “Computer-
aided detection (CAD) for breast MRI:
evaluation of efficacy at 3.0 T”, Eur
Radiol,2010,20: 522-8.
[32] S. Apostolos. Mitrousias 2020,
Mammography, Breast Ultrasound and
MRI: The Basic Imaging Tools for
Prevention and Management of Breast
Disease, digital image, accessed 7 April
2021,
https://www.linkedin.com/pulse/mammography-breast-ultrasound-mri-basic-imaging-tools-apostolos-s-/
[33] L. C. Wang, W. B. DeMartini, S. C.
Partridge, S. Peacock, C. D. Lehman,
“MRI-detected suspicious breast lesions:
predictive values of kinetic features
measured by computer-aided evaluation”,
Am J Roentgenol, 2009,193: 826-31.
[34] T. C. Williams, W. B. DeMartini, S.
C. Partridge, S. Peacock, C. D. Lehman,
“Breast MR imaging: computer-aided
evaluation program for discriminating
benign from malignant lesions”,
Radiology, 2007,244:94-103.
[35] B. Weigelt et al., “Histological types
of breast cancer: how special are they?”,
Molecular Oncology, vol. 4, no. 3, 2010, pp. 192-208,
doi: 10.1016/j.molonc.2010.04.004,
digital image, accessed 7 April 2021,
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5527938/
[36] A. Bhardwaj, A. Tiwari, “Breast
cancer diagnosis using genetically
optimized neural network model”, Expert
System Application, 2015,42:4611-20.
[37] S. Issac Niwas, P. Palanisamy, R.
Chibbar, W. J. Zhang, “An expert support
system for breast cancer diagnosis using
color wavelet features”, J Med System,
2012,36: 3091-102.
[38] M. M. Kyaw, “Pre-segmentation for
the computer aided diagnosis system”,
International Journal of Computer Science
and Information Technology, 2013,5(1):79.
[39] G. Kumar, P. P. Sarthi, P. Ranjan and
R. Rajesh, “Performance of k-means based
satellite image clustering in RGB and HSV
color space,” 2016 International
Conference on Recent Trends in
Information Technology (ICRTIT), 2016,
pp. 1-5, doi:
10.1109/ICRTIT.2016.7569523.
[40] G. Kumar, S. Bakshi, P. K. Sa, B.
Majhi, “Non-overlapped block wise
interpolated local binary pattern as
periocular feature.” Multimed Tools Appl
80, 16565–16597 (2021).
https://doi.org/10.1007/s11042-020-08708-w
[41] G. Kumar, D. P. Chowdhury, S.
Bakshi, P. K. Sa, “Person Authentication
Based on Biometric Traits Using Machine
Learning Techniques,” in S. Sharma et al.
(eds), IoT Security Paradigms and
Applications: Research and Practices (1st
ed.), 2020. CRC Press.
https://doi.org/10.1201/9781003054115.
[42] A. A. Ardakani, A. Gharbali, and A.
Mohammadi, “Classification of breast
tumors using sonographic texture
analysis,” J. Ultrasound Med., vol. 34, no.
2, pp. 225–231, 2015.
[43] A. J. Cruz and D. S. Wishart,
“Applications of machine learning in
cancer prediction and prognosis,” Cancer
Informatics, vol. 2, pp. 59–77, 2006.
[44] L. Hussain, W. Aziz, S. Saeed, S.
Rathore and M. Rafique, “Automated
Breast Cancer Detection Using Machine
Learning Techniques by Extracting
Different Feature Extracting Strategies,"
17th IEEE International Conference on
Trust, Security And Privacy In Computing
And Communications/ 12th IEEE
International Conference On Big Data
Science And Engineering
(TrustCom/BigDataSE), New York, NY,
USA, 2018, pp. 327-331, doi:
10.1109/TrustCom/BigDataSE.2018.00057.
[45] S. Deepa, V. Subbiah Bharathi,
“Textural feature extraction and
classification of mammogram images using
CCCM and PNN,” IOSR Journal of
Computer Engineering, vol. 10, Issue 6,
2013, pp. 07-13.
[46] F. Shirazi, E. Rashedi, “Detection of
cancer tumors in mammography images
using support vector machine and mixed
gravitational search algorithm,” In
proceedings of IEEE 1st Conference on
Swarm Intelligence and Evolutionary
Computation, Bam, March 2016, pp. 98-
101, doi:10.1109/CSIEC.2016.7482133.
[47] R. Biswas, A. Nath and S. Roy,
"Mammogram Classification Using Gray-
Level Co-occurrence Matrix for Diagnosis
of Breast Cancer," in 2016 International
Conference on Micro-Electronics and
Telecommunication Engineering
(ICMETE), Ghaziabad, 2016 pp. 161-166.
doi: 10.1109/ICMETE.2016.85.
url:
https://doi.ieeecomputersociety.org/10.1109/ICMETE.2016.85
[48] L. Puneeth, A. N. Krishna,
“Classification of mammograms using
texture features,” International Journal of
Innovative Research & Development, vol.
3, Issue 7, July 2014, pp. 373-377.
[49] M. Rana, P. Chandorkar, A. Dsouza,
and N. Kazi, “Breast cancer diagnosis and
recurrence prediction using machine
learning techniques,” Int. J. Res. Eng.
Technology., vol. 4, no. 4, pp. 1163_2319,
2015.
[50] H. Asri, H. Mousannif, H. Al
Moatassime and T. Noel, “Using
Machine Learning Algorithms for Breast
Cancer Risk Prediction and Diagnosis,” in
Proceedings of the 6th International
Symposium on Frontiers in Ambient and
Mobile Systems, pp. 1064 –1069, 2016.
[51] J. Ivancáková, F. Babic and P. Butka,
“Comparison of Different Machine
Learning Methods on Wisconsin Dataset,”
in Proceedings of the 16th IEEE World
Symposium on Applied Machine
Intelligence and Informatics, pp.173-178,
2018.
[52] A. F. M. Agarap, “On Breast
Cancer Detection: An Application of
Machine Learning Algorithms on the
Wisconsin Diagnostic Dataset,” in
Proceedings of the 2nd International
Conference on Machine Learning and Soft
Computing, 2018.
[53] L. G. Ahmad, A. T. Eshlaghy, A.
Poorebrahimi, M. Ebrahimi, and A. R.
Razavi, “Using three machine learning
techniques for predicting breast cancer
recurrence,” J. Health Med. Inform., vol. 4,
no. 124, p. 3, 2013.
[54] E. A. Bayrak, P. Kirci, and T. Ensari,
“Comparison of machine learning methods
for breast cancer diagnosis,” in Proc. Sci.
Meeting Elect. - Electron. Biomed. Eng.
Comput. Sci. (EBBT), Apr. 2019, pp. 1–3.
[55] M. Abdar, M. Zomorodi-Moghadam,
X. Zhou, R. Gururajan, X. Tao, P. D.
Barua, and R. Gururajan, “A new nested
ensemble technique for automated
diagnosis of breast cancer,” Pattern
Recognit. Lett., vol. 132, pp. 123–131, Apr.
2020.
[56] D. A. Omondiagbe, S. Veeramani, and
A. S. Sidhu, “Machine learning
classification techniques for breast cancer
diagnosis,” IOP Conf. Ser., Mater. Sci.
Eng., vol. 495, Jun. 2019, Art. no. 012033.
[57] Z. Nematzadeh, Roliana Ibrahim and
Ali Selamat, “Comparative studies on
breast cancer classifications with k-fold
cross validations using machine learning
techniques,” Proc. in 2015, 10th Asian
Control Conf. (ASCC), pp 1-6, IEEE, 2015.
[58] Z. Zain, M. Alshenaifi, A. Aljaloud, T.
Albednah, R. Alghanim, A. Alqifari & A.
Alqahtani, (2020). “Predicting breast
cancer recurrence using principal
component analysis as feature extraction:
an unbiased comparative analysis.”
International Journal of Advances in
Intelligent Informatics, 6(3), 313-327. doi:
https://doi.org/10.26555/ijain.v6i3.462.
[59] H. Hasan and N. M. Tahir, “Feature
selection of breast cancer based on
Principal Component Analysis,” in 2010
6th International Colloquium on Signal
Processing & its Applications, 2010, pp. 1–
4, doi: 10.1109/CSPA.2010.5545298.
[60] U. Ojha and S. Goel, “A study on
prediction of breast cancer recurrence using
data mining techniques,” 2017 7th Int.
Conf. on Cloud Computing, Data Science
& Engineering Confluence, pp 527-530,
IEEE, 2017.
[61] D. Bazazeh and R. Shubair.
“Comparative study of machine learning
algorithms for breast cancer detection and
diagnosis,” 2016 5th Int. Conf. on
Electronic Devices, Systems and
Applications (ICEDSA), 6-8 December
2016, Ras Al Khaimah, UAE.
[62] B. M. Gayathri and C. P. Sumathi,
“Comparative study of relevance vector
machines with various machine learning
techniques used for detecting breast
cancer,” 2016 IEEE Int. Conf. on
Computational Intelligence and Computing
Research (ICCIC), pp 1-5, IEEE, 2016.
[63] S. Kharya and S. Soni, “Weighted
Naïve Bayes classifier –Predictive model
for breast cancer detection”, International
Journal of computer applications, Vol.133,
No.9, pp.32-37, January 2016.
[64] F. Paulin, A. Santhakumaran,
“Classification of Breast cancer by
comparing Back propagation training
algorithms.” Int. J. Comput. Sci. Eng. 2011,
3, 327–332.
[65] V. Chaurasia, S. Pal & B. B. Tiwari,
(2018). “Prediction of benign and
malignant breast cancer using data mining
techniques.” Journal of Algorithms &
Computational Technology, 12(2), 119-
126.
[66] A. Adam, K. Omar, “Computerized
Breast Cancer Diagnosis with Genetic
Algorithms and Neural Network”.
[67] C. Chi, W. Nick Street, and W. H.
Wolberg, “Application of Artificial Neural
Network-Based Survival Analysis on Two
Breast Cancer Datasets,” AMIA Annu
Symp Proc. 2007; 2007: 130–134.
[68] W. Yue et al., “Machine Learning with
applications in Breast Cancer Diagnosis
and Prognosis”, Designs 2018,2,13;
doi:10.3390/designs2020013
[69] R. Patgiri, “Machine Learning: A Dark
Side of Cancer Computing” International
Conference. Bioinformatics and
Computational Biology BIOCOMP’ 18,
ISBN: 1-60132-471-5, CSREA Press.
[70] M. Amrane, S. Oukid, I. Gagaoua and
T. Ensarİ, “Breast cancer classification
using machine learning,” Electric
Electronics, Computer Science, Biomedical
Engineerings' Meeting (EBBT), IEEE
2018, pp. 1-4, doi:
10.1109/EBBT.2018.8391453.
[71] N. Wu et al., “Deep Neural Networks
Improve Radiologists’ Performance in
Breast Cancer Screening,” in IEEE
Transactions on Medical Imaging, vol. 39,
no. 4, pp. 1184-1194, April 2020, doi:
10.1109/TMI.2019.2945514.
[72] E. Rashed and M. Samir Abou El
Seoud. 2019. “Deep learning approach for
breast cancer diagnosis.” In Proceedings of
the 2019 8th International Conference on
Software and Information Engineering
(ICSIE '19). Association for Computing
Machinery, New York, NY, USA, 243–
247. DOI:
https://doi.org/10.1145/3328833.3328867
[73] R. Arora, P. K. Rai & B. Raman,
“Deep feature–based automatic
classification of mammograms.” Med Biol
Eng Comput 58, 1199–1211 (2020).
https://doi.org/10.1007/s11517-020-02150-8
[74] D. A. Ragab, O. Attallah, M. Sharkas,
J. Ren, S. Marshall, “A framework for
breast cancer classification using Multi-
DCNNs” Computers in Biology and
Medicine, Volume 131, 2021, 104245,
ISSN 0010-4825,
https://doi.org/10.1016/j.compbiomed.2021.104245
[75] Z. Zainudin, S. M. Shamsuddin, S.
Hasan (2021), “Deep Layer Convolutional
Neural Network (CNN) Architecture for
Breast Cancer Classification Using
Histopathological Images.” In: Hassanien
A.E., Darwish A. (eds) Machine Learning
and Big Data Analytics Paradigms:
Analysis, Applications and Challenges.
Studies in Big Data, vol 77. Springer,
Cham. https://doi.org/10.1007/978-3-030-59338-4_18
[76] Y. Zhang, S. C. Satapathy, D. S.
Guttery, J. M. Górriz, S. Wang, “Improved
Breast Cancer Classification Through
Combining Graph Convolutional Network
and Convolutional Neural Network,”
Information Processing & Management,
Volume 58, Issue 2, 2021, 102439, ISSN
0306-4573,
https://doi.org/10.1016/j.ipm.2020.102439.
[77] G. Murtaza, L. Shuib, A. W. A.
Wahab, “Ensembled deep convolution
neural network-based breast cancer
classification with misclassification
reduction algorithms.” Multimed Tools
Appl 79, 18447–18479 (2020).
https://doi.org/10.1007/s11042-020-08692-1
[78] S. Sharma, R. Mehra, “Conventional
Machine Learning and Deep Learning
Approach for Multi-Classification of Breast
Cancer Histopathology Images—a
Comparative Insight.” J Digit Imaging 33,
632–654 (2020).
https://doi.org/10.1007/s10278-019-00307-y
[79] R. Yan, F. Ren, Z. Wang, L. Wang, T.
Zhang, Y. Liu, X. Rao, C. Zheng, F. Zhang,
“Breast cancer histopathological image
classification using a hybrid deep neural
network,” Methods, Volume 173, 2020,
Pages 52-60, ISSN 1046-2023,
https://doi.org/10.1016/j.ymeth.2019.06.014.
[80] A. Kumar, S. K. Singh, S. Saxena, K.
Lakshmanan, A. K. Sangaiah, H. Chauhan,
S. Shrivastava, R. K. Singh, “Deep feature
learning for histopathological image
classification of canine mammary tumors
and human breast cancer,” Information
Sciences,
Volume 508, 2020, Pages 405-421, ISSN
0020-0255,
https://doi.org/10.1016/j.ins.2019.08.072.
[81] H.D. Cheng, J. Shan, W. Ju, Y. Guo,
L. Zhang, “Automated breast cancer
detection and classification using
ultrasound images: A survey”, Pattern
Recognition, Volume 43, Issue 1, 2010,
Pages 299-317, ISSN 0031-3203,
https://doi.org/10.1016/j.patcog.2009.05.012.
[82] G. Meenalochini, S. Ramkumar,
“Survey of machine learning algorithms for
breast cancer detection using mammogram
images”, Materials Today: Proceedings,
Volume 37, Part 2, 2021, Pages 2738-2743,
https://doi.org/10.1016/j.matpr.2020.08.543
[83] Ghosh, S., Mondal, S., & Ghosh, B.
(2014, February). A comparative study of
breast cancer detection based on SVM and
MLP BPN classifier. In 2014 First
International Conference on Automation,
Control, Energy and Systems (ACES) (pp.
1-4). IEEE.
... Also, many researchers reported that Indian women suffer from this disease at a very young age compared to other developed countries. It is claimed that Indian women are more likely to have a larger tumor size and be malignant [4]. ...
Article
Full-text available
Breast cancer is a disease that is becoming more and more common day by day, causing emotional and behavioral reactions and having fatal consequences if not detected early. At this point, traditional methods are insufficient, especially in early diagnosis. In this context, this study aimed to predict breast cancer by using machine learning (ML) algorithms on different datasets and to demonstrate the applicability of these algorithms. Algorithm performances were compared on balanced and unbalanced datasets, taking into account the performance metrics obtained in applications on different datasets. In addition, a model based on the Borda Voting method was developed by including the results obtained from four different algorithms (NB, KNN, DT, and RF) in the process. The prediction values obtained from each algorithm were written in different columns on the same excel file and the most repetitive value was accepted as the final result value. The developed model was tested on real data consisting of 60 records and the results were analyzed. When the results were examined, it was seen that higher performance was obtained with the proposed RF model compared to similar studies in the literature. Finally, the prediction results obtained with the developed model revealed the applicability of ML algorithms in the diagnosis of breast cancer.
Chapter
Full-text available
Internet of Things (IoT) is a concept of transferring data among interrelated physical devices, objects, or humans over a network without the interaction of humans or machine. Therefore, in IoT, one device collects and sends data while others receive and act on it. To communicate and to share information through interconnected devices, authentication of the sender and receiver is essential. When the goal is to improve security in IoT, traditional authentication techniques, such as knowledge-based authentication and token-based authentication, become a challenge. Therefore, researchers strongly recommend using biometrics whenever direct human access is required. However, a biometric system is also vulnerable to different types of attacks. Therefore, it is necessary to avoid and detect such attacks and secure IoT devices. Throughout this chapter, we discuss IoT and its applications, the types of security in IoT, the identification and verification process in biometrics, the vulnerability of components of biometric systems, methods to secure these components, and different types of attacks that can be made at different modules of a biometric system along with machine learning techniques to detect these attacks. We also investigate the various biometric traits used to authenticate end users using machine learning (ML) techniques, various ML algorithms, and methodologies for features extraction, matching, and classifications. Finally, a deep model is trained, and the performance of the model is evaluated on the Caltech face database and the UBIRIS.v1 iris dataset. The results of the deep model are compared with traditional ML techniques.
Article
Full-text available
Most of the prominent features of human face are present in the ocular area, referred as the periocular region. Complex and dense features in these regions makes it a candidate to be used as a biometric trait. This paper discusses an effective method for periocular recognition using non-overlapped blockwise interpolated local binary pattern (iLBP) features. For a given periocular image, an iLBP coded feature image is obtained and further divided into four equal non-overlapping sub-regions. From each sub-region having iLBP pattern, eight bin histogram features are calculated. A single feature vector is formed by concatenating blocked histograms of each non-overlapping region. Binned histogram based feature is also extracted using Phase Intensive Global Pattern (PIGP) features for comparison of results. Experiments are conducted on UBIRIS.v1 and UBIPr.v2 datasets. From the experiments, it is observed that selected histogram feature bins through the proposed approach provide a more compact representation of periocular image and size of the feature vector is also reduced with significant improvement in performance.
Article
Full-text available
We present a deep convolutional neural network for breast cancer screening exam classification, trained and evaluated on over 200,000 exams (over 1,000,000 images). Our network achieves an AUC of 0.895 in predicting the presence of cancer in the breast, when tested on the screening population. We attribute the high accuracy to a few technical advances. (i) Our network’s novel two-stage architecture and training procedure, which allows us to use a high-capacity patch-level network to learn from pixel-level labels alongside a network learning from macroscopic breast-level labels. (ii) A custom ResNet-based network used as a building block of our model, whose balance of depth and width is optimized for high-resolution medical images. (iii) Pretraining the network on screening BI-RADS classification, a related task with more noisy labels. (iv) Combining multiple input views in an optimal way among a number of possible choices. To validate our model, we conducted a reader study with 14 readers, each reading 720 screening mammogram exams, and show that our model is as accurate as experienced radiologists when presented with the same data. We also show that a hybrid model, averaging the probability of malignancy predicted by a radiologist with a prediction of our neural network, is more accurate than either of the two separately. To further understand our results, we conduct a thorough analysis of our network’s performance on different subpopulations of the screening population, the model’s design, training procedure, errors, and properties of its internal representations. Our best models are publicly available at https://github.com/nyukat/breastcancerclassifier.
Article
Background Deep learning (DL) is the fastest-growing field of machine learning (ML). Deep convolutional neural networks (DCNN) are currently the main tool used for image analysis and classification purposes. There are several DCNN architectures among them AlexNet, GoogleNet, and residual networks (ResNet). Method This paper presents a new computer-aided diagnosis (CAD) system based on feature extraction and classification using DL techniques to help radiologists to classify breast cancer lesions in mammograms. This is performed by four different experiments to determine the optimum approach. The first one consists of end-to-end pre-trained fine-tuned DCNN networks. In the second one, the deep features of the DCNNs are extracted and fed to a support vector machine (SVM) classifier with different kernel functions. The third experiment performs deep features fusion to demonstrate that combining deep features will enhance the accuracy of the SVM classifiers. Finally, in the fourth experiment, principal component analysis (PCA) is introduced to reduce the large feature vector produced in feature fusion and to decrease the computational cost. The experiments are performed on two datasets (1) the curated breast imaging subset of the digital database for screening mammography (CBIS-DDSM) and (2) the mammographic image analysis society digital mammogram database (MIAS). Results and Conclusions: The accuracy achieved using deep features fusion for both datasets proved to be the highest compared to the state-of-the-art CAD systems. Conversely, when applying the PCA on the feature fusion sets, the accuracy did not improve; however, the computational cost decreased as the execution time decreased.
Article
Aim: In a pilot study to improve detection of malignant lesions in breast mammograms, we aimed to develop a new method called BDR-CNN-GCN, combining two advanced neural networks: (i) graph convolutional network (GCN); and (ii) convolutional neural network (CNN). Method: We utilized a standard 8-layer CNN, then integrated two improvement techniques: (i) batch normalization (BN) and (ii) dropout (DO). Finally, we utilized rank-based stochastic pooling (RSP) to substitute the traditional max pooling. This resulted in BDR-CNN, which is a combination of CNN, BN, DO, and RSP. This BDR-CNN was hybridized with a two-layer GCN, and yielded our BDR-CNN-GCN model which was then utilized for analysis of breast mammograms as a 14-way data augmentation method. Results: As proof of concept, we ran our BDR-CNN-GCN algorithm 10 times on the breast mini-MIAS dataset (containing 322 mammographic images), achieving a sensitivity of 96.20±2.90%, a specificity of 96.00±2.31% and an accuracy of 96.10±1.60%. Conclusion: Our BDR-CNN-GCN showed improved performance compared to five proposed neural network models and 15 state-of-the-art breast cancer detection approaches, proving to be an effective method for data augmentation and improved detection of malignant breast masses.
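Results reported as "mean ± standard deviation" over repeated runs, as in the abstract above, can be reproduced from per-run confusion-matrix counts. The sketch below is only an illustration: the function names, the three hypothetical runs, and the use of the sample standard deviation are assumptions, since the abstract does not state which deviation estimator was used.

```python
from statistics import mean, stdev


def metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts.
    tp/fn count malignant cases; tn/fp count non-malignant cases."""
    sensitivity = tp / (tp + fn)             # share of malignant cases detected
    specificity = tn / (tn + fp)             # share of healthy cases cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


def summarize(values):
    """Mean and sample standard deviation of one metric across runs."""
    return mean(values), stdev(values)


# Hypothetical counts (tp, fn, tn, fp) for three runs; not data from the paper.
runs = [(48, 2, 47, 3), (49, 1, 48, 2), (47, 3, 49, 1)]
per_run = [metrics(*run) for run in runs]

for name, column in zip(("sensitivity", "specificity", "accuracy"),
                        zip(*per_run)):
    m, s = summarize(column)
    print(f"{name}: {100 * m:.2f} ± {100 * s:.2f}%")
```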
Article
Breast cancer is the primary cause of death among women affected by cancer. Mammography is one of the most dependable strategies for early detection and diagnosis of breast cancer and reduces the death rate. Mammograms are radiographic images of the breast which are utilized to identify the early symptoms of breast cancer. These radiographic images reduce human errors in detecting cysts, shorten diagnosis time, and increase diagnostic accuracy. This paper presents an overview of machine learning techniques for breast cancer detection and classification, which can be divided into three main stages: pre-processing, feature extraction, and classification. The effects of several machine learning techniques on the automation of mammogram image classification are investigated. The review gathers representative works that show how machine learning techniques are applied to problems arising in various diagnostic examinations. It describes the effect of pre-processing mammogram images before they enter the classifier, which results in more effective classification. The detection stage is followed by segmentation of the tumor region in the mammogram image. This study is an attempt to gather and compare the various screening techniques, classifiers, and their performance in terms of sensitivity, specificity, and accuracy for breast cancer diagnosis.
Article
Automatic multi-classification of breast cancer histopathological images has remained one of the top-priority research areas in the field of biomedical informatics, due to the great clinical significance of multi-classification in providing diagnosis and prognosis of breast cancer. In this work, two machine learning approaches are thoroughly explored and compared for the task of automatic magnification-dependent multi-classification on a balanced BreakHis dataset for the detection of breast cancer. The first approach is based on handcrafted features, which are extracted using Hu moments, color histograms, and Haralick textures and then used to train conventional classifiers. The second approach is based on transfer learning, where pre-existing networks (VGG16, VGG19, and ResNet50) are used both as feature extractors and as baseline models. The results reveal that using pre-trained networks as feature extractors gives superior performance compared with the baseline approach and the handcrafted approach at all magnifications. Moreover, it has been observed that augmentation plays a pivotal role in further enhancing the classification accuracy. In this context, the VGG16 network with linear SVM provides the highest accuracy, which is computed in two forms: (a) patch-based accuracies (93.97% for 40×, 92.92% for 100×, 91.23% for 200×, and 91.79% for 400×); (b) patient-based accuracies (93.25% for 40×, 91.87% for 100×, 91.5% for 200×, and 92.31% for 400×) for the classification of magnification-dependent histopathological images. Additionally, “Fibro-adenoma” (benign) and “Mucous Carcinoma” (malignant) classes have been found to be the most complex classes across all magnification factors.
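The abstract above reports both patch-based and patient-based accuracy without defining them. One common convention, assumed here and not confirmed by the abstract, is that patch-based accuracy pools all patches together, while patient-based accuracy first scores each patient separately and then averages those scores. The record layout and function names below are hypothetical.

```python
from collections import defaultdict


def patch_accuracy(records):
    """Fraction of all patches classified correctly, ignoring patient identity.
    Each record is (patient_id, true_label, predicted_label)."""
    correct = sum(1 for _, truth, pred in records if truth == pred)
    return correct / len(records)


def patient_accuracy(records):
    """Mean of per-patient accuracies, so every patient counts equally
    regardless of how many patches that patient contributes."""
    counts = defaultdict(lambda: [0, 0])     # patient_id -> [correct, total]
    for patient_id, truth, pred in records:
        counts[patient_id][1] += 1
        if truth == pred:
            counts[patient_id][0] += 1
    scores = [correct / total for correct, total in counts.values()]
    return sum(scores) / len(scores)


# Invented example: patient "A" has four patches, patient "B" has two.
records = [
    ("A", 1, 1), ("A", 1, 1), ("A", 1, 0), ("A", 1, 1),
    ("B", 0, 0), ("B", 0, 1),
]

print("patch-based accuracy:  ", round(patch_accuracy(records), 4))
print("patient-based accuracy:", round(patient_accuracy(records), 4))
```

The two numbers differ whenever patients contribute unequal numbers of patches, which is one reason a study may report both.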
Conference Paper
Breast cancer is one of the leading fatal diseases worldwide, but its risk can be well controlled if it is discovered early. The conventional method for breast screening is x-ray mammography, which is known to be challenging for early detection of cancer lesions. The dense breast structure produced by compression during imaging makes it difficult to recognize small abnormalities. Also, inter- and intra-variations of breast tissues make it very difficult to achieve high diagnostic accuracy using hand-crafted features. Deep learning is an emerging machine learning technology that requires relatively high computational power. Yet, it has proved to be very effective in several difficult tasks that require decision making at the level of human intelligence. In this paper, we develop a new network architecture inspired by the U-net structure that can be used for effective and early detection of breast cancer. Results indicate high sensitivity and specificity, suggesting that the proposed approach is potentially useful in clinical practice.