
A SYSTEMATIC REVIEW OF CAD SYSTEM BASED APPROACH IN DIAGNOSING BREAST CANCER AND ANALYZE EFFECTIVENESS OF MACHINE LEARNING AND DEEP LEARNING ALGORITHMS IN EARLY DETECTION

November 2021

10(11 (SPECIAL ISSUE))

DOI:10.31032/IJBPAS/2021/10.11.1069

Authors:

Sushovan Chaudhury

Shameek Mukhopadhyay

The Heritage Academy, Kolkata, India

Sadeem Kbah

University of Baghdad



IJBPAS, November, Special Issue, 2021, 10(11): 804-827

ISSN: 2277–4998



SUSHOVAN CHAUDHURY1*, SHAMEEK MUKHOPADHYAY2, SADEM NABEEL KBAH3, DR. KARTIK SAU4

1: Research Scholar, University of Engineering and Management, Kolkata, WB, India

2: Assistant Professor, The Heritage Academy, Kolkata, WB, India

3: Assistant Lecturer, Department of Biomedical Engineering, University of Baghdad, Baghdad, Iraq

4: Professor, University of Engineering and Management, Kolkata, WB

*Corresponding Author: Sushovan Chaudhury; E Mail: sushovan.chaudhury@gmail.com

https://doi.org/10.31032/IJBPAS/2021/10.11.1069

ABSTRACT

This study sheds light on the different pathways for detecting and treating breast cancer. Breast cancer is a life-threatening disease that affects women around the globe, but it can be tackled if it is detected at an early stage. In India, a very large number of women are affected by this carcinoma, resulting in a high death rate. MRI, biopsy, USG, mammography, histopathological images and many other diagnostic tests can confirm the presence of breast cancer. This paper focuses on predicting whether test samples are malignant by studying machine learning based computer-aided systems. A review of many important and promising papers in this area shows that there is an established system for detecting carcinoma, known as Computer Aided Detection (CAD). It consists of image pre-processing, image segmentation, extraction of relevant features and image classification. The review also shows that the efficiency of CAD systems increases when methods such as CART, Decision Tree Classifier (DT), Logistic Regression (LR), Naïve Bayes (NB), Ensemble, Random Forest Classifier (RF) and K-nearest neighbor classifiers (KNN) are applied to the extracted features. The reviewed papers offer a plethora of methodologies for early detection of breast cancer using CAD. When the WBCD dataset was evaluated using an Ensemble technique, an accuracy of about 98% was recorded. Previously, radiologists could not diagnose breast cancer as effectively, because the efficient techniques available today were scarce. Although the final result of the tests depends on the diagnostic ability of the radiologists, they get significant assistance from the latest methodologies.

Received 20 July 2021; Revised 22 Aug. 2021; Accepted 30 Sept. 2021; Available online 1 Nov. 2021

Sushovan Chaudhury et al. Research Article

Keywords: MRI, Biopsy, CAD, Histopathology, Invasive Ductal Carcinoma, Machine Learning, Deep Learning

1. INTRODUCTION

Breast cancer is one of the most common causes of cancer death, second only to lung cancer. Early identification and effective therapy can widen the treatment options and reduce the mortality rate. As reported in [1], there were about 2 million new cases of breast cancer worldwide in 2018 and more than 0.62 million deaths. The incidence of breast cancer is higher in western countries such as the USA than in African and Asian countries. Worldwide, incidence has been rising at about 0.5% annually, and the increase is steeper in Asian countries, at around 3-4% [2]. In underdeveloped nations, breast cancer morbidity and mortality are prevalent [3, 4]. Indian women develop breast cancer at a very young age; when the disease is diagnosed early, oncologists can treat it more effectively and save lives. Cervical cancer is more prevalent in rural areas, whereas breast cancer is more common among urban women [5], a difference attributed to many lifestyle-related factors. The incidence rates of breast cancer in cities such as Delhi, Mumbai, Bangalore, Chennai and Kolkata are 41.0, 33.6, 34.4, 37.9 and 25.5 cases per 100,000 women, respectively. A large number of breast cancer cases is reported in India every year, as shown in Table 1. The National Cancer Registry Program (NCRP) observed an annual change of 0.68% in cancer cases overall during 2011 to 2014, and an annual change of about 2% for breast cancer. From 1998 to 2012, the number of reported cases of breast cancer increased manifold. In [6], it is reported that the Annual Percentage Change (APC) in Delhi rose from 0.91% to 5.31% during this period. The same study predicted that breast cancer would account for about 10% of all cancers, which is a threat to women's health in India.

The nature of breast cancer in India differs from that in western countries. Many researchers [7] report that Indian women develop this disease at a much younger age than women in developed countries. Their tumors are larger, and they more often present with hormone receptor negative disease, lower ratings, more positive lymph nodes and aggressive illness. In [8], tumors in stages 1, 2 and 3 were observed to be an independent risk factor for premature mortality in control-matched patients.

1.1 Breast Cancer Susceptibility

In the US, the median age of breast cancer patients is reported to be 62 years, with most patients aged 60 to 69 years [9]. India is diverse in economic conditions, education, climate and cultural heritage. The peak age range for the urban population in India is 40 to 49 years, whereas in rural areas it is 65 to 69 years [10, 11, 12]. Indian women are affected early in life and mostly present at advanced stages. Some researchers found that Indian women are diagnosed only after symptoms appear and that mammographic detection does not work well for them. Almost 60% of patients are detected in stage 3 or 4, leading to a higher death rate [1], [13]. In [14, 15], the authors reported that 62% of cases among women from Northern India were diagnosed at TNM stage III, and that only 1.4% were diagnosed at stage 1. They also noted that Tata Memorial Hospital, Mumbai reported 54% of patients presenting at an advanced stage, and that women from urban areas tended to report at an early stage of the disease (OR = 0.64). The authors of [16] found that the stage at diagnosis depends on the patient's level of education, socioeconomic background, area of residence and marital status. A few studies found that mortality depends on presentation at a later stage; outcomes are better if the diagnosis is made within three months.

1.2. Breast Cancer Assessment and Motivation in the Context of Asian Demographics and the Indian Subcontinent

Mammography is an imaging method for assessing the breast using low-dose X-rays [18]. It is the best-known method for preliminary screening, but has certain limitations [19, 20]. Breast density can be misleading, making it difficult to diagnose cancer in women with dense breasts [19, 21-23]. Figure 1 demonstrates the different breast densities in women obtained through breast ultrasound [24-29]. Ultrasound has proved to be an efficient tool for assessing breast problems. Practitioners usually recommend it for scanning breasts, especially during lactation and pregnancy, and it can also be recommended for biopsy guidance and mass localization. Figure 2 shows how mammograms can detect the presence of lesions in the human breast [23]. However, ultrasound is particularly good at detecting invasive ductal carcinoma in dense breasts, as shown by Costantini et al. [25], [30]. MRI is usually recommended for screening women who have a high risk of developing breast cancer, and is often used to investigate suspicious areas found by the mammogram and to help determine the dimensions of the mass. The interpretation/prediction procedure of MRI imaging, shown in Figure 3, is extremely time-consuming and requires considerable radiologist expertise to distinguish benign from malignant lesions [31], [32]. Recent studies have shown that computer systems developed to facilitate MRI image analysis improve diagnosis and treatment many fold [31], [33], [34].

1.3. Motivation behind CAD system based diagnosis and performance evaluation

In dense breast images, the contrast between the tumor and the image background is particularly poor, which can alter the results of the diagnosis. In mammographic examination, non-cancerous lesions are commonly misread as cancer (false positives), whereas malignancies are frequently overlooked (false negatives). Therefore, radiologists often fail to detect breast cancers [20]. Several strategies have been proposed to improve the sensitivity and specificity of mammography and to avoid needless biopsies. Double reading is one strategy that contributes significantly to achieving high sensitivity and specificity. CAD systems can be regarded as a supplementary mechanism that improves the doctor's interpretation by acting as a powerful second reader. An autonomous CAD system for cancer cell detection based on computer vision can help radiologists distinguish cancerous from non-cancerous cells. Breast tissue is also analyzed using histopathology images in a few studies [35]. Bhardwaj et al. used deep neural networks to classify breast cancer, while Niwas et al. extracted wavelet features from histopathological images [36], [37]. Figure 4 presents a sample of histopathological images [35].


Table 1: State-wise number of breast cancer cases reported in India, 2016 to 2018

States/UT 2016 2017 2018

Jammu & Kashmir 1421 1516 1618

Himachal Pradesh 613 647 681

Punjab 3321 3503 3694

Chandigarh 196 207 219

Uttaranchal 1217 1298 1384

Haryana 3103 3308 3526

Delhi 3181 3351 3530

Rajasthan 7536 7996 8483

Uttar Pradesh 21376 22737 24181

Bihar 9958 10644 11378

Sikkim 30 30 31

Arunachal Pradesh 82 84 85

Nagaland 67 67 68

Manipur 273 281 289

Mizoram 97 99 101

Tripura 129 130 132

Meghalaya 104 106 108

Assam 2406 2437 2467

West Bengal 10902 11550 12234

Jharkhand 3716 3962 4225

Orissa 4205 4448 4705

Chhattisgarh 2944 3145 3359

Madhya Pradesh 8334 8858 9414

Gujarat 8001 8504 9039

Daman & Diu 42 47 52

Dadra & Nagar Haveli 54 61 68

Maharashtra 14726 15522 16358

Telangana 4633 4918 5220

Andhra Pradesh 5901 6251 6620

Karnataka 8029 8527 9055

Goa 233 247 262

Lakshadweep 14 15 17

Kerala 5682 6189 6748

Tamil Nadu 9486 9870 10269

Pondicherry 227 242 257

Andaman & Nicobar Islands 44 45 47

Total 142283 150842 159924

Figure 1: Densities of breast in women obtained through breast ultrasound


Figure 2: Detection of presence of lesions in human breast through mammograms

Figure 3: The interpretation/prediction procedure of MRI imaging

Figure 4: Sample histopathological images


Any CAD system is primarily based on the

following 5 stages.

1) Image Preprocessing: Biomedical image preprocessing involves removing noise from the acquired images. It also involves resizing the image, enhancing the image intensity as shown by [38], adjusting brightness and contrast, or converting the image to grayscale.

2) Image Segmentation: Image segmentation is a key element of computer vision and pattern recognition. Segmentation techniques allow us to identify important regions, such as the tumor or lesion, and to extract various features for further analysis. Based on the properties of images, segmentation approaches can be classified as follows:

● Similarity-based

● Discontinuity-based

Edge-based segmentation is an example of the discontinuity-based approach. Lee et al. further divided the similarity approach into threshold, region-based and clustering methods [39]. Each procedure has its own benefits and limitations and is chosen according to the individual application and imaging method.

3) Feature Extraction: The characteristics of the lesions in the images are captured as distinct attributes. These features are used to categorize tumors as benign or malignant in the next stage. One of the real challenges of the feature extraction process is the size of the feature set. Feature extraction ordinarily means computing feature descriptors from an image in order to reduce the amount of data. Features are characteristics of the whole image or of a region of interest (ROI). An image descriptor can often be classified along three dimensions: shape, pattern and spectra, and density, as claimed in [40]. Feature matching techniques can also be employed by comparing the key points within the feature descriptor using algorithms like SIFT, SURF, BRIEF and ORB.

4) Classification: It is essential to apply a trustworthy classifier to differentiate between cancerous and non-cancerous cells. Machine learning models such as Linear Regression, Logistic Regression, DT, RF, Ensemble techniques, SVM, KNN, NB and CART have traditionally been used for classification. This paper discusses such approaches and the accuracy achieved in the most prominent works of the last decade on the topic.

5) Performance Review or Evaluation: Like most systems, a CAD system for breast cancer detection demands high accuracy and precision. We have considered key measures such as Sensitivity, F1 score, True Positive and True Negative counts, and Overall Accuracy to justify our claim that a significant amount of research has been done on cancer classification using CAD systems. A performance review of crucial works of the last decade is presented in this paper.
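The evaluation measures named in stage 5 can all be derived from the four confusion-matrix counts. The following is a minimal sketch in Python, not taken from any of the reviewed papers; the function name and the example counts are assumptions chosen purely for illustration.

```python
def classification_metrics(tp, fp, tn, fn):
    """Derive common diagnostic measures from confusion-matrix counts.

    tp/fn: malignant cases flagged / missed by the classifier
    tn/fp: benign cases cleared / wrongly flagged by the classifier
    """
    total = tp + fp + tn + fn
    # Sensitivity (recall): share of malignant cases that were detected.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: share of benign cases that were correctly cleared.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    # Precision (positive predictive value): share of alarms that were real.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    accuracy = (tp + tn) / total if total else 0.0
    denom = precision + sensitivity
    # F1 score: harmonic mean of precision and sensitivity.
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "f1": f1,
    }


# Hypothetical counts for a test set of 200 samples (illustration only).
print(classification_metrics(tp=90, fp=5, tn=95, fn=10))
```

In a screening setting a missed malignancy (false negative) is usually costlier than a false alarm, which is why sensitivity is reported alongside overall accuracy rather than being replaced by it.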

The flow of all five stages of the CAD

system is illustrated in Figure 5.

Figure 5: Stages of a CAD system
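To make the flow of the stages concrete, the sketch below chains pre-processing, threshold-based (similarity-based) segmentation, feature extraction and a placeholder classification step on a toy image. It is an illustrative assumption rather than a method from the reviewed literature: the function names, the threshold of 128, the two features and the area rule are all hypothetical, and a real system would use a trained classifier on far richer features.

```python
def to_grayscale(rgb_image):
    """Stage 1a: convert rows of (r, g, b) pixels to luminance (BT.601 weights)."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in rgb_image]


def stretch_contrast(gray):
    """Stage 1b: min-max rescale intensities to the range 0..255."""
    flat = [v for row in gray for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:  # flat image: nothing to stretch
        return [[0.0 for _ in row] for row in gray]
    scale = 255.0 / (hi - lo)
    return [[(v - lo) * scale for v in row] for row in gray]


def threshold_segment(gray, threshold):
    """Stage 2: mark pixels whose intensity is at or above the threshold."""
    return [[1 if v >= threshold else 0 for v in row] for row in gray]


def region_features(gray, mask):
    """Stage 3: area and mean intensity of the segmented region."""
    values = [v for g_row, m_row in zip(gray, mask)
              for v, m in zip(g_row, m_row) if m]
    area = len(values)
    mean_intensity = sum(values) / area if area else 0.0
    return {"area": area, "mean_intensity": mean_intensity}


def classify_region(features, min_area=3):
    """Stage 4: hypothetical rule standing in for a trained classifier."""
    return "suspicious" if features["area"] >= min_area else "normal"


def run_cad_pipeline(rgb_image, threshold=128):
    """Run stages 1 to 4 in sequence and return mask, features and label."""
    gray = stretch_contrast(to_grayscale(rgb_image))
    mask = threshold_segment(gray, threshold)
    features = region_features(gray, mask)
    return mask, features, classify_region(features)


# Toy 4x4 image: a bright 2x2 "lesion" on a dark background.
dark, bright = (20, 20, 20), (200, 200, 200)
toy_image = [
    [dark, dark, dark, dark],
    [dark, bright, bright, dark],
    [dark, bright, bright, dark],
    [dark, dark, dark, dark],
]
mask, features, label = run_cad_pipeline(toy_image)
print(mask, features, label)
```

Stage 5 (evaluation) would then compare such labels against ground-truth diagnoses over a whole test set.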

The rest of this paper is organized as follows. The literature review section discusses previous studies, the techniques used in this field and a comparison among them. The research gap section discusses some of the unexplored areas. Finally, the conclusion section describes the applicability of the results.

2. Literature Review-Evaluation of Existing Literature:

For breast cancer detection using different techniques, relevant literature from multiple sources has been referred to. Various authors have worked on different datasets over a period of time, and conclusions are derived on that basis. Machine learning algorithms can be classified into the following three types [41]:

● Supervised learning

● Unsupervised learning

● Reinforcement learning

Supervised learning is the most common machine learning approach used to predict cancer; supervised learning algorithms operate on the basis of given criteria and conditions. Genetic algorithms, artificial neural networks and decision trees are some of the algorithms used in supervised learning. Physical examination, imaging and biopsy are some of the ways to diagnose breast cancer [42].

The authors of [42] also state that X-ray is used only to understand the shape of the breast, whereas mammography is used for imaging the internal parts of the breasts. Some studies [43] reported that machine learning models outperform classical statistical models and expert-based systems when used to predict breast cancer, or cancer in general. In [44], DT, SVM and a Bayes approach were applied to distinguish mammograms of healthy tissue from those of cancerous tissue. The authors used 10-fold cross-validation and evaluated statistical parameters such as positive predictive value, negative predictive value, sensitivity and specificity. Some studies propose a methodology that computes contourlet coefficients of the decomposed image for classifying mammogram images [45]. The analysts in [46] showed that, for predicting breast cancer, a combination of the Mixed Gravitational Search Algorithm (MGSA) and Support Vector Machine (SVM) improved on the performance of the individual models, reaching 93.1%. They used 70% of the data for training and the rest for testing. One study reported that a CAD system gives 96% accuracy for the classification of mammogram images [47].
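The two validation schemes mentioned above, a 70/30 hold-out split and 10-fold cross-validation, differ only in how sample indices are partitioned. A minimal sketch of both follows; the function names and the fixed random seed are assumptions for illustration and do not reproduce the exact protocol of [44] or [46].

```python
import random


def train_test_split_indices(n_samples, train_fraction=0.7, seed=0):
    """Shuffle sample indices and cut them into a train part and a test part."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    cut = int(round(n_samples * train_fraction))
    return indices[:cut], indices[cut:]


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, test) index lists; each sample is tested exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of (nearly) equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


train, test = train_test_split_indices(100)
print(len(train), len(test))

for fold_number, (train, test) in enumerate(k_fold_indices(100, k=10), start=1):
    print(fold_number, len(train), len(test))
```

With cross-validation, a metric such as sensitivity is computed on each held-out fold and then averaged, which gives a less optimistic estimate than a single split on small datasets.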

Classification of mammogram images has been studied by many researchers. In [48], the authors studied it by employing KNN and GLCM. One study found that the performance of machine learning models varies with parameter selection and dataset; it reported that SVM combined with a Gaussian kernel gave the best result in predicting both recurrence and non-recurrence of breast cancer [49]. The authors of [50] trained SVM, DT, NB and k-NN on the WBC dataset [51] and noticed that SVM outperformed all other classifiers. A similar study was done by the researchers of [52]. In [53], data mining techniques were used to explore the risk factors for predicting breast cancer, while [54] compared two machine learning methods (ANN and SVM) for breast cancer detection. A nested ensemble approach for distinguishing benign breast tumors from malignant cancers was proposed by [55], using Stacking and Voting as combinations of classifying techniques. The authors of [56] worked on the WDBC dataset: dimensionality is first reduced using PCA, and machine learning models are then trained to classify tumors. A similar study was made by [57], whose authors performed experiments on two standard databases, Wisconsin Prognostic Breast Cancer (WPBC) and WBC, using various machine learning approaches including decision tree, NN and SVM to classify tumors.
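As a simple illustration of the Voting idea mentioned above, the sketch below combines several base classifiers by hard majority vote. It is not the nested ensemble of [55]: the three threshold rules, the feature names (`radius`, `texture`, `concavity`) and the cut-off values are hypothetical stand-ins for trained models.

```python
from collections import Counter


def majority_vote(classifiers, sample):
    """Hard voting: return the label predicted by most base classifiers."""
    votes = [classify(sample) for classify in classifiers]
    return Counter(votes).most_common(1)[0][0]


# Hypothetical base classifiers: each is a one-feature threshold rule.
def radius_rule(sample):
    return "malignant" if sample["radius"] > 15.0 else "benign"


def texture_rule(sample):
    return "malignant" if sample["texture"] > 20.0 else "benign"


def concavity_rule(sample):
    return "malignant" if sample["concavity"] > 0.1 else "benign"


RULES = [radius_rule, texture_rule, concavity_rule]

# Two of the three rules flag this made-up sample, so the vote follows them.
sample = {"radius": 18.0, "texture": 15.0, "concavity": 0.2}
print(majority_vote(RULES, sample))
```

Using an odd number of base classifiers avoids ties in a two-class problem. Stacking differs in that a further model is trained on the base classifiers' outputs instead of counting votes.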

Another comparative analysis was done by the researchers of [58]. When the WPBC dataset was used, PCA was observed to improve the results significantly. In [59], the authors employed an ANN classification model and extracted the parameters using PCA. Based on the WPBC dataset, some studies used machine learning algorithms to predict recurrent cases of breast cancer [60]. Some studies compared different ML algorithms for predicting recurrent or non-recurrent breast cancers [61], [62]. A combination of a neural network and a weighted Naïve Bayes classifier was used in several studies, which demonstrated improved performance [63], [64]. The authors of [65] developed breast cancer prediction models based on Radial Basis Function Network, Naïve Bayes (NB) and Decision Tree (DT), and found NB to be the most accurate model, at 97.36%. One paper combined a GA with a feed-forward neural network and used the combination for classification [66]. In [67], the researchers studied the survival probabilities of breast cancer patients using different survival analysis models on two different breast cancer datasets. After reviewing many articles on the application of ML algorithms to the classification and prediction of breast cancer, some researchers concluded that ML algorithms have improved the accuracy of classification and prediction models manifold [68]. In [69], machine learning algorithms were shown to yield almost 100% prediction accuracy when tested on the Wisconsin Diagnostic Breast Cancer dataset. The researchers in [70] studied breast cancer classification using KNN and NB classifiers; the model was trained using 683 samples of the Breast Cancer Dataset, and the KNN classifier achieved the maximum reported accuracy of 97.51%. A deep model with a two-stage architecture, using ResNet as a building block, is trained in [71]; the results suggest that it predicts the presence of cancer in the breast with an AUC of 89.50%. A similar study based on deep learning and inspired by U-Net was done by the authors of [72]. Extending the study of deep learning for breast cancer classification, in [73] automatic and robust features are extracted using deep neural networks and trained using deep ensemble transfer learning; the authors reported 88% classification accuracy with an area under the curve of 0.88. In the most recent work, a new CAD system is proposed that uses multi-DCNN to classify breast cancer [74]. The CBIS-DDSM and MIAS datasets are used to evaluate the deep models, and the experimental results show improved accuracy from deep feature fusion compared to state-of-the-art methods. In [75], a deep neural network architecture is proposed to study breast cancer classification using histopathological images; the results suggest that, among architectures with different numbers of layers, a 19-layer CNN performed well. A hybrid approach combining a graph convolutional network and a convolutional neural network is proposed in [76]. A few more recent studies of breast cancer classification based on deep learning are available in [77-80].
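Since k-nearest neighbor classification recurs throughout the studies above, a minimal sketch of the idea may be useful. It is written from the general definition of k-NN, not from the implementation in [70]; the two-feature toy points and the choice k = 3 are assumptions for illustration.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Label a query point by majority vote among its k nearest neighbors."""
    # Pair every training point's Euclidean distance to the query with its label.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Toy two-feature training set (made-up values, e.g. scaled size and texture).
points = [(1.0, 1.0), (1.5, 0.8), (0.9, 1.3), (5.0, 5.2), (4.7, 5.5), (5.3, 4.8)]
labels = ["benign", "benign", "benign", "malignant", "malignant", "malignant"]

print(knn_predict(points, labels, (1.2, 0.9)))
print(knn_predict(points, labels, (4.8, 5.1)))
```

Because k-NN relies on raw distances, features measured on different scales are normally standardized first; otherwise the feature with the largest numeric range dominates the vote.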

Table 2 summarizes a few popular studies from the last decade on breast cancer detection using deep learning and ML techniques. Table 3 compares this study with other review papers published in the last decade. Table 4 lists the modalities taken as benchmarks by researchers to study and implement ML/DL techniques on the appropriate dataset. Table 5 presents how different ML/DL techniques have been used in the analysis and classification of benign and malignant tumors of the breast.

In the past few decades, breast cancer classification has gained the attention of many researchers, and many novel methods and techniques have been proposed. A few researchers have summarized the methodologies and published them as surveys. Table 3 presents a few important surveys available in the literature (index values 2 to 7) and compares them with our study. From Table 3, it can be seen that our survey is extensive and includes almost all machine learning techniques used for the classification of breast cancer. Our study is distinguished from the indexed surveys by its treatment of Indian and Asian demographics.

Table 2: Comparison of different BCD techniques in literature using different ML and DL Algorithms

Reference

Techniques/Methods used

Area

Result

[83] Applied different classification

techniques on BCW dataset for

detection of breast cancer.

MLP, using back propagation

, NN (MLP BPN) and SVM to

diagnose and analyze breast

cancer and performance is

evaluated by calculating

statistical parameters.

SVM is found to produce the

lowest average error compared

to MLPBPN.

[42] ANN, GA, DT, LDA, and KNN

have been applied.

Diagnosis and detection of

breast cancer using different

modalities include physical

examination, biopsy and

imaging

Texture analysis is a tested

methodology that may be

efficiently employed for

classification of noncancerous

and cancerous lesions with

Sensitivity-94.28%, Specificity-

100%, Accuracy-97.80%, AU-

ROC-0.9714.

[57] Comparative study is done for

different ML/DL techniques like

DT, NB, NN and SVM

Objective is to classify the

labels in WPBC and WBC

datasets

NN - 98.09% in WBC dataset,

and SVM-RBF - 98.32% in

WPBC dataset using 10 fold

cross validation.(cv=10)

Sushovan Chaudhury et al Research Article

815

IJBPAS, November, Special Issue, 2021, 10(11)

[46] MGSA and SVM Goal is to classify breast

cancer as per given labels

using machine learning

techniques.

Outcome: SVM with 24 features

- 86% MGSA – SVM with 12

features- 93.1%.

[47] CAD Normal and abnormal breast

tissues differentiation for

visual diagnostic aid of the

radiologists.

Maximum accuracy of 96% if

found using 3NN.

[62] Breast cancer is detected using

the “Relevance vector machine

“(RVM).

LDA was used as a

dimensional reduction method

and feed the reduced features

into the classifier

The “Relevance vector

machine” outperforms other

‘ML classifiers’ in classifying

the labels appropriately.

[60]

Use Cases of Invasive ductal

carcinoma in the subjects on the

basis of vital features as predicted

by ML techniques

WPBC SVM and DT (C 5.0) - 81%

(highest)

FCM 37% (lowest)

[44] SVM, Bayes approach and DT Distinguish cancer

mammograms from normal

samples.. Dataset is broken

down into a train, test and

validation sets and the model

is subjected to training, taking

cv=10.

NPR, FPR and AUC were

measured. From the results it is

observed that different feature

extracting strategies and

classifiers yield different and

effective results to detect breast

cancer in the given dataset.

[51] Naïve Bayes, SVC classifier, RF,

C4.5, k-NN and NN

Aim is to classify breast cancer

where different ML /DL

techniques are compared for

the Wisconsin dataset and

reported.

SVM and RF produced highest

classification accuracy

[52] SVM, GRU-SVM, LR, MLP, NN

search and Softmax Regression

WDBC dataset is being used

for experimentation

From the result it is noticed that

MLP reports significantly

higher accuracy when

compared to other models.

[70] NB and KNN are used as

classification technique for breast

carcinoma detection,

Identification using ML

techniques. A set of Breast

Cancer Image Dataset is used

which consists of a total of 683

samples. Dataset is broken

down into training and test

sets in a 60:40 ratio.

Highest accuracy is achieved by

K-NN - 97.51% while Naive

Bayes classifier produced

96.19% accuracy

[72] As in, U-Net, a DL framework is

proposed for initial detection of

breast carcinoma and

performance is compared with

architectures like AlexNet,

VGGNet and GoogleNet

CBIS-DDSM is used to train

the deep model which contains

‘Curated Breast Imaging

Subsets’.

Classification accuracy:

Micro calcification – 94.31%

Masses- 95.01%

[55] Two-layered nested ensemble

technique is used along with SV-

NaiveBayes-3-MetaClassifier and

SV-BayesNet-3- Meta Classifier

and compared with Bayesian

Network, NB, SGD and Logistic

model tree

Invasive Ductal Carcinoma

was detected using ensemble

techniques. Dataset being used

is WBCD.

Among all other classifiers, the

proposed SV-Naïve Bayes-3-

MetaClassifiers generated

highest accuracy – 98.07%

[71] A DNN based on a two stage

framework is proposed where

ResNet is used as a building block

of the model.

Diagnostic aid for breast

cancer detection:Performance

of the model is examined over

two million exams having 10

million image samples ,thereby

a large validation set.

The result shows that the model

is capable to predict the

presence of breast carcinoma

with an AUC of 89.50%

[73] Deep ensemble transfer learning

approach is used to distinguish

cancerous and noncancerous

lesions using features extracted by

DNN.

Classification of cancerous and

non-cancerous lesions. The

CBIS-DDSM dataset is used

for experiments.

The classification accuracy

achieved is 88% with AUC

value as 0.88

[74] | A new CAD system is proposed which uses multi-DCNN to classify breast carcinoma. Deep feature fusion is also performed and SVM is used as a classifier. | Uses deep convolutional neural networks for classification. The CBIS-DDSM and MIAS datasets are used to evaluate the performance of the deep models. | Results suggest an improvement in accuracy using deep feature fusion compared to traditional methods.

[75] | A DNN architecture is proposed to study breast cancer classification using histopathological images. | Histopathological biopsy images are used for breast cancer detection. The AMIDA13 and MITOS-ATYPIA datasets are used to train deep models. | Results suggest that, among architectures with different numbers of layers, the 19-layer CNN performed best.

[76] | A hybrid approach combining a graph convolutional network (GCN) with a conventional CNN is proposed. | Malignancy is classified using DL. The model is evaluated on the mini-MIAS breast dataset. | Sensitivity 96.20%, specificity 96%, accuracy 96.10%.
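The studies summarized above report accuracy, sensitivity, specificity and AUC. As a reading aid, the following minimal sketch shows how these figures are conventionally derived from binary predictions. It is not taken from any of the surveyed papers; the function names and the toy labels are assumptions made for illustration, and both classes are assumed to be present.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary task."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def sensitivity(tp, fn):
    # Fraction of malignant (positive) cases that were detected.
    return tp / (tp + fn)


def specificity(tn, fp):
    # Fraction of benign (negative) cases correctly left unflagged.
    return tn / (tn + fp)


def accuracy(tp, fp, tn, fn):
    return (tp + tn) / (tp + fp + tn + fn)


def auc(y_true, scores, positive=1):
    """AUC as the probability that a random positive outranks a random
    negative (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (made-up labels: 1 = malignant, 0 = benign).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.1, 0.7]
tp, fp, tn, fn = confusion_counts(y_true, y_pred)
```

Because accuracy alone can hide missed malignant cases on imbalanced data, it is usually read together with sensitivity and specificity.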

Table 3: Comparison of our survey along with other popular surveys

Models / Sl. No 1 2 3 4 5 6 7

Machine learning models

SVM

Decision Trees

K-NN

Logistic Regression

Artificial Neural Network (ANN)

✓

Methods / data set

Gaussian kernel

Wisconsin dataset

✓

Indian and Asian demographics

✓

Table 4: A table with reference of different testing modalities and their performance studied in different papers

Modalities References

Mammography [18], [19], [20], [21], [22], [23]

Ultrasonography [24], [25], [26], [27], [28], [29], [30]

MRI [31], [32], [33], [34]

Biopsy histopathological images [35], [36], [37]


Table 5: Popular Machine learning techniques used for Breast cancer diagnosis in various researches

Machine Learning Models References

Support vector machine (SVM) [44], [46], [50], [51], [52], [53], [54], [60], [61], [68]

Decision Trees [50], [53], [57], [60], [65], [68]

K Nearest Neighbors (KNN) [48], [49], [50], [58]

Logistic Regression [49]

Naïve Bayes [49], [51], [58], [63], [65]

Artificial Neural Network (ANN) [53], [54], [59], [67], [68]
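Several of the surveyed works evaluate a K-nearest neighbour classifier on a hold-out split, for example the 60:40 split reported for [70]. The sketch below illustrates that evaluation pattern only. The two-feature toy data, the helper names and the choice k = 3 are assumptions for illustration and do not reproduce any published experiment.

```python
import math
import random
from collections import Counter


def split_60_40(samples, labels, train_fraction=0.6, seed=0):
    """Shuffle indices reproducibly and cut them into train and test parts."""
    order = list(range(len(samples)))
    random.Random(seed).shuffle(order)
    cut = round(train_fraction * len(order))
    train_idx, test_idx = order[:cut], order[cut:]
    return ([samples[i] for i in train_idx], [labels[i] for i in train_idx],
            [samples[i] for i in test_idx], [labels[i] for i in test_idx])


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training points closest to the query."""
    nearest = sorted(zip(train_x, train_y),
                     key=lambda pair: math.dist(pair[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Hypothetical, well-separated toy features (not real patient data).
benign = [(1.0 + 0.1 * i, 1.0 + 0.05 * i) for i in range(10)]
malignant = [(8.0 + 0.1 * i, 8.0 - 0.05 * i) for i in range(10)]
samples = benign + malignant
labels = ["benign"] * 10 + ["malignant"] * 10

train_x, train_y, test_x, test_y = split_60_40(samples, labels)
predictions = [knn_predict(train_x, train_y, x) for x in test_x]
knn_accuracy = sum(p == t for p, t in zip(predictions, test_y)) / len(test_y)
```

On real data such as WBCD the features would normally be scaled first, and k would be chosen by cross-validation rather than fixed in advance.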

3. Research Gap

Compared to traditional image processing methods, the application of machine learning and deep learning to breast cancer classification has substantially improved classification accuracy. Over the last few decades, a considerable amount of research has been carried out [81, 82], and a few authors have summarized the recent trends in this domain [4], [40]. However, a few points still need to be investigated. In this study, we tried to bridge this gap by including the most popular and recent work on breast cancer classification using machine learning and deep learning techniques. We also discussed the different stages of CAD systems in detail. In addition, we highlighted different testing modalities, viz. mammography [19, 20], ultrasonography [24-28], biopsy histopathological images [35-37] and MRI [31-34], together with their performance and limitations. Finally, we focused on breast cancer trends in Indian and Asian demographics. This study discusses different types of machine learning techniques and reports which algorithms work well with which datasets. We believe that this study will help beginners to understand past research and recent trends in the field of breast cancer classification and to choose appropriate algorithms for their own work. We also present a list of datasets in Table 6, along with their sources, the modality used in each dataset, and the research that can be undertaken on them. Some areas of research that can be explored are as follows:

1. Using transfer learning techniques on histopathological images.

2. Using knowledge distillation and semi-supervised techniques on available histopathological biopsy images, with the results validated against images checked by medical experts.

3. Using active learning to train models on the available datasets obtained through different modalities such as mammography, MRI and biopsy.
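To make the active learning direction concrete, the following minimal sketch shows uncertainty sampling, one common query strategy: the images whose predicted malignancy probability is least certain are sent to an expert for labelling first. The function names, the labelling budget and the example probabilities are assumptions for illustration, not part of any surveyed method.

```python
import math


def binary_entropy(p):
    """Entropy (in bits) of a predicted malignancy probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def select_for_labeling(probabilities, budget):
    """Return indices of the `budget` most uncertain unlabeled samples."""
    ranked = sorted(range(len(probabilities)),
                    key=lambda i: binary_entropy(probabilities[i]),
                    reverse=True)
    return ranked[:budget]


# Hypothetical model outputs for five unlabeled images.
unlabeled_probs = [0.02, 0.48, 0.91, 0.55, 0.30]
to_annotate = select_for_labeling(unlabeled_probs, budget=2)
```

In a full loop, the selected images would be labelled by a radiologist or pathologist, added to the training set, and the model retrained before the next round of selection.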


Table 6: A comprehensive overview of some publicly available datasets in the area of breast cancer research and open areas of research

Available Data Set | Modality | Source | Scope of Research

WBCD | Numerical values of cell nuclei extracted from FNAB histopathological images of the breast | UCI Machine Learning Repository, Kaggle | Exploratory analysis, application of new ML/DL techniques, feature extraction techniques such as PCA, LDA and Factor Analysis

Breast Histopathological Images | Histopathological biopsy to detect invasive ductal carcinoma | https://www.kaggle.com/paultimothymooney/breast-histopathology-images | Feature extraction, feature matching, classification, knowledge distillation, transfer learning, CNN, Big Data analysis of images

MIAS Mammography | Mammogram | https://www.kaggle.com/kmader/mias-mammography | Segmentation, finding ROI, object detection using Mask R-CNN and YOLOv4, feature extraction, classification

CBIS DDSM | Mammograms | http://www.eng.usf.edu/cvprg/Mammography/Database.html | Segmentation, finding ROI, object detection using Mask R-CNN and YOLOv4, feature extraction, classification, RNN, GAN, Big Data analysis

BACH 2018 | Histopathology biopsy | https://iciar2018-challenge.grand-challenge.org/Dataset/ | Semi-supervised KD, GAN, Big Data analysis, autoencoders, multi-label classification

SEER Breast Cancer Dataset | Numerical attributes extracted from patient EMR | IEEE DataPort | Exploratory data analysis, applying statistical methods to BCD for meaningful insights

Breast Ultrasound Images | USG of the breast | https://www.kaggle.com/aryashah2k/breast-ultrasound-images-dataset | Segmentation, detection and classification
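Knowledge distillation (KD) appears in Table 6 as an open direction for the histopathology datasets. As a rough illustration of the idea, the sketch below computes one common form of the distillation loss: the KL divergence between temperature-softened teacher and student outputs, scaled by the squared temperature. The logits, the temperature value and the function names are assumptions, and practical systems usually combine this term with an ordinary supervised loss.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """T^2 * KL(teacher || student) on softened class distributions."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(t * math.log(t / s)
             for t, s in zip(p_teacher, p_student) if t > 0.0)
    return temperature ** 2 * kl


# Hypothetical logits for three tissue classes from a large teacher
# network and a small student network.
teacher = [4.0, 1.0, -2.0]
student = [2.5, 1.5, -1.0]
kd_loss = distillation_loss(student, teacher)
```

A higher temperature flattens both distributions, so the student also learns how the teacher ranks the less likely classes.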

CONCLUSION

Given the statistics, the emerging trends and the rising breast cancer rate in India and other parts of the world, the study of breast cancer has become the need of the hour, although obtaining appropriate data for research remains a challenge. Socio-economic conditions vary across the world, and radiologists are often not 100 percent accurate in diagnosing breast cancer. CAD systems can therefore be a valuable tool to assist radiologists and to corroborate their predictions. The major aim of this study is to highlight research conducted on ML and DL techniques for the prediction of breast cancer. This article will help beginners who wish to explore machine learning algorithms for classification problems and their performance on different breast cancer testing modalities. In this review, the performance of different ML/DL techniques is assessed and compared. The results indicate that the efficiency of a CAD system can be improved significantly by applying suitable algorithms, which can in turn enhance radiologists’ performance. We have discussed the available datasets and the kinds of results each can yield. We observed that machine learning methods have demonstrated an exceptional capacity to classify and predict cancer cells, with significant improvement in accuracy using computer vision techniques.

REFERENCES

[1] F. Bray, J. Ferlay, I. Soerjomataram, R. L. Siegel, L. A. Torre, A. Jemal, “Global cancer statistics 2018: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries.” CA Cancer J Clin 68(6):394–424

[2] M. Green, V. Raina, “Epidemiology, screening and diagnosis of breast cancer in the Asia Pacific region: current perspectives and important considerations.” Asia Pac J Clin Oncol 4:S5–S13

[3] N. Li, Y. Deng, L. Zhou, T. Tian, S. Yang, Y. Wu, Y. Zheng, Z. Zhai, Q. Hao, D. Song, D. Zhang, H. Kang, Z. Dai, “Global burden of breast cancer and attributable risk factors in 195 countries and territories, from 1990 to 2017: results from the global burden of disease study 2017.” J Hematol Oncol 12(1):140

[4] R. Dikshit, P. C. Gupta, C. R. Sundarahettige, V. Gajalakshmi, L. Aleksandrowicz, R. Badwe, R. Kumar, S. Roy, W. Suraweera, F. Bray, M. Mallath, P. Singh, D. N. Sinha, A. S. Shet, H. Gelband, P. Jha, “Cancer mortality in India: a nationally representative survey.” Lancet 379:1807–1816

[5] National Cancer Registry Programme

(2016) Three-year report of population-

based cancer registries:2012-2014,

Chapter-2 leading site of cancer. Indian

Council of Medical Research. New Delhi

[6] National Cancer Registry Programme

(2016) Three-year report of population-

based cancer registries:2012-2014,

Chapter-10 Trends over time for all sites

and on selected sites of cancer and

projection of burden of cancer. Indian

Council of Medical Research. New Delhi

[7] A. Mathew, M. Pandey, B. Rajan, “Do younger women with non-metastatic & non-inflammatory breast carcinoma have poor prognosis?” World J Surg Oncol 2:2

[8] M. A. Maggard, J. B. O’Connell, K. E. Lane, J. H. Liu, D. A. Etzioni, “Do young breast cancer patients have worse outcomes?” J Surg Res 113:109–113

[9] C. E. DeSantis, J. Ma, M. M. Gaudet, L. A. Newman, K. D. Miller, A. Goding Sauer, A. Jemal, R. L. Siegel, “Breast cancer statistics.” CA Cancer J Clin 69:438–451

[10] A. Chauhan, S. Subba, R. G. Menezes, B. S. Shetty, V. Thakur, S. Chabra, R. Warrier (2011), “Younger women are affected by breast cancer in South India - a hospital based descriptive study.” Asian Pac J Cancer Prev 12:709–711

[11] V. Raina, M. Bhutani, R. Bedi, A.

Sharma, S. V. S. Deo, N. K. Shukla, N. K.

Mohanti, G. K. Rath (2005) “Clinical

features and prognostic factors of early

breast cancer at a major cancer center in

North India.” Indian J Cancer 42:40–45

[12] G. Agarwal, P. V. Pradeep, V. Aggarwal, C. H. Yip, P. S. Cheung (2007), “Spectrum of breast cancer in Asian women”. World J Surg 31:1031–1040

[13] S. P. L. Leong, Z. Shen, T. Liu, G.

Agarwal, T. Tajima, N. S. Paik, K. Sandel,

A. Derossis, H. Cody, W. D. Foulkes,

(2010) “Is breast cancer the same disease

in Asian and Western women?” World J

Surg 34:2308–2324

[14] S. Saxena, B. Rekhi, A. Bansal, A. Bagga, M. S. S. Chintamani (2005), “Clinico-morphological patterns of breast cancer including family history in a New Delhi hospital, India – a cross-sectional study”. World J Surg Oncol 3:67

[15] J. A. Sathwara, G. Balasubramaniam, S. C. Bobdey, A. Jain, S. Saoba (2017), “Socio demographic factors and late-stage diagnosis of breast Cancer in India: a hospital-based study.” Indian J Med Paediatr Oncol 38(3):277–281

[16] F. Kaffashian, S. Godward, T. Davies, L. Solomon, J. McCann, S. W. Duffy (2003), “Socio economic effects on breast cancer survival: proportion attributable to stage and morphology.” Br J Cancer 89:1693–1696

[17] M. A. Richards, A. M. Westcombe, S. B. Love, P. Littlejohns, A. J. Ramirez (1999), “Influence of delay on survival in patients with breast cancer: a systematic review.” Lancet 353:1119–1126

[18] T. W. Freer, M. J. Ulissey, “Screening mammography with computer-aided detection: Prospective study of 12,860 patients in a community breast center”, Radiology, 2001, 220:781-6.

[19] M. G. Ertosun and D. L. Rubin,

"Probabilistic visual search for masses

within mammography images using deep

learning," 2015 IEEE International

Conference on Bioinformatics and

Biomedicine (BIBM), Washington, DC,

USA, 2015, pp. 1310-1315, doi:

10.1109/BIBM.2015.7359868.


[20] N. F. Boyd, H. Guo, L. J. Martin, L.

Sun, J. Stone Fishel, “Mammographic

density and the risk and detection of breast

cancer”, N Engl J Med, 2007, 356,227-236,

doi: 10.1056/NEJMoa062790

[21] J. K. Jesneck, J. Y. Lo, J. A. Baker,

“Breast mass lesions: computer-aided

diagnosis models with mammographic

and sonographic descriptors”, Radiology,

2007,244: 390-8.

[22] H. D. Nelson, K. Tyne, A. Naik, C.

Bougatsos, B. K. Chan, L. Humphrey,

“Screening for breast cancer: an update for

the US Preventive Services Task Force”,

Ann Intern Med, 2009,151:727-37; W237-

42.

[23] Skåne University Hospital in Malmö, digital image, accessed 7 April 2021, https://healthcare-in-europe.com/en/news/3d-mammography-detected-34-more-breast-cancers-in-screening.html

[24] W. A. Berg, J. D. Blume, J. B.

Cormack, E. B. Mendelson, D. Lehrer, M.

Böhm-Vélez, “Combined screening with

ultrasound and mammography vs

mammography alone in women at elevated

risk of breast cancer”, JAMA,

2008,299:2151-63.

[25] K. Drukker, M. Giger, K. Horsch, M. A. Kupinski, C. J. Vyborny, E. B. Mendelson, “Computerized lesion detection on breast ultrasound”, Med Phys, 2002, 29:1438-46.

[26] N. Ohuchi, A. Suzuki, T. Sobue, M. Kawai, S. Yamamoto, Y. F. Zheng, “Sensitivity and specificity of mammography and adjunctive ultrasonography to screen for breast cancer in the Japan Strategic Anti-cancer Randomized Trial (J-START): a randomized controlled trial”, Lancet, 2016;387(10016):341-348.

[27] J. R. Scheel, J. M. Lee, B. L.

Sprague, C. I. Lee, C. D. Lehman,

“Screening ultrasound as an adjunct to

mammography in women with

mammographically dense breasts”, Am J

Obstetr Gynaecol, 2015,212:9-17.

[28] W. Svensson , “A review of the

current status of breast ultrasound”, Eur J

Ultrasound. 1997,6:77-101.

[29] How dense are you? digital image, accessed 7 April 2021, https://wispecialists.com/3d-automated-whole-breast-ultrasound/

[30] M. Costantini, P. Belli, R. Lombardi, G. Franceschini, A. Mulè, L. Bonomo, “Characterization of solid breast masses: use of the sonographic breast imaging reporting and data system lexicon”, J Ultrasound Med, 2006, 25:649-59.

[31] C. Meeuwis, S. M. van de Ven, G. Stapper, A. M. Fernandez Gallardo, M. van den Bosch, W. Mali, “Computer-aided detection (CAD) for breast MRI: evaluation of efficacy at 3.0 T”, Eur Radiol, 2010, 20:522-8.

[32] S. Apostolos. Mitrousias 2020, Mammography, Breast Ultrasound and MRI: The Basic Imaging Tools for Prevention and Management of Breast Disease, digital image, accessed 7 April 2021, https://www.linkedin.com/pulse/mammography-breast-ultrasound-mri-basic-imaging-tools-apostolos-s-/

[33] L. C. Wang, W. B. DeMartini, S. C.

Partridge, S. Peacock, C. D. Lehman,

“MRI-detected suspicious breast lesions:

predictive values of kinetic features

measured by computer-aided evaluation”,

Am J Roentgenol, 2009,193: 826-31.

[34] T. C. Williams, W. B. DeMartini, S.

C. Partridge, S. Peacock, C. D. Lehman,

“Breast MR imaging: computer-aided

evaluation program for discriminating

benign from malignant lesions”,

Radiology, 2007,244:94-103.

[35] B. Weigelt et al., “Histological types of breast cancer: how special are they?”, Molecular Oncology, vol. 4, no. 3, 2010, pp. 192-208, doi: 10.1016/j.molonc.2010.04.004, digital image, accessed 7 April 2021, https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5527938/

[36] A. Bhardwaj, A. Tiwari, “Breast

cancer diagnosis using genetically

optimized neural network model”, Expert

System Application, 2015,42:4611-20.

[37] S. Issac Niwas, P. Palanisamy, R.

Chibbar, W. J. Zhang, “An expert support

system for breast cancer diagnosis using

color wavelet features”, J Med System,

2012,36: 3091-102.

[38] M. M. Kyaw, “Pre-segmentation for

the computer aided diagnosis system”,

International Journal of Computer Science

and Information Technology, 2013,5(1):79.

[39] G. Kumar, P. P. Sarthi, P. Ranjan and

R. Rajesh, “Performance of k-means based

satellite image clustering in RGB and HSV

color space,” 2016 International

Conference on Recent Trends in

Information Technology (ICRTIT), 2016,

pp. 1-5, doi:

10.1109/ICRTIT.2016.7569523.

[40] G. Kumar, S. Bakshi, P. K. Sa, B.

Majhi, “Non-overlapped block wise

interpolated local binary pattern as

periocular feature.” Multimed Tools Appl

80, 16565–16597 (2021).

https://doi.org/10.1007/s11042-020-08708-

[41] G. Kumar, D. P. Chowdhury, S. Bakshi, P. K. Sa, “Person Authentication Based on Biometric Traits Using Machine Learning Techniques,” in S. Sharma et al. (eds), IoT Security Paradigms and Applications: Research and Practices (1st ed.), 2020. CRC Press. https://doi.org/10.1201/9781003054115.

[42] A. A. Ardakani, A. Gharbali, and A.

Mohammadi, “Classification of breast

tumors using sonographic texture

analysis,” J. Ultrasound Med., vol. 34, no.

2, pp. 225–231, 2015.

[43] A. J. Cruz and D. S. Wishart,

“Applications of machine learning in

cancer prediction and prognosis,” Cancer

Informatics, vol. 2, pp. 59–77, 2006.

[44] L. Hussain, W. Aziz, S. Saeed, S.

Rathore and M. Rafique, “Automated

Breast Cancer Detection Using Machine

Learning Techniques by Extracting

Different Feature Extracting Strategies,"

17th IEEE International Conference on

Trust, Security And Privacy In Computing

And Communications/ 12th IEEE

International Conference On Big Data

Science And Engineering

(TrustCom/BigDataSE), New York, NY,

USA, 2018, pp. 327-331, doi:

10.1109/TrustCom/BigDataSE.2018.00057.

[45] S. Deepa, V. Subbiah Bharathi,

“Textural feature extraction and

classification of mammogram images using

CCCM and PNN,” IOSR Journal of

Computer Engineering, vol. 10, Issue 6,

2013, pp. 07-13.

[46] F. Shirazi, E. Rashedi, “Detection of

cancer tumors in mammography images

using support vector machine and mixed

gravitational search algorithm,” In

proceedings of IEEE 1st Conference on

Swarm Intelligence and Evolutionary

Computation, Bam, March 2016, pp. 98-

101, doi:10.1109/CSIEC.2016.7482133.

[47] R. Biswas, A. Nath and S. Roy, “Mammogram Classification Using Gray-Level Co-occurrence Matrix for Diagnosis of Breast Cancer,” in 2016 International Conference on Micro-Electronics and Telecommunication Engineering (ICMETE), Ghaziabad, 2016, pp. 161-166, doi: 10.1109/ICMETE.2016.85. url: https://doi.ieeecomputersociety.org/10.1109/ICMETE.2016.85

[48] L. Puneeth, A. N. Krishna,

“Classification of mammograms using

texture features,” International Journal of

Innovative Research & Development, vol.

3, Issue 7, July 2014, pp. 373-377.

[49] M. Rana, P. Chandorkar, A. Dsouza,

and N. Kazi, “Breast cancer diagnosis and

recurrence prediction using machine

learning techniques,'” Int. J. Res. Eng.

Technology., vol. 4, no. 4, pp. 1163_2319,

2015.

[50] H. Asri, H. Mousannif, H. Al Moatassime and T. Noel, “Using Machine Learning Algorithms for Breast Cancer Risk Prediction and Diagnosis,” in Proceedings of the 6th International Symposium on Frontiers in Ambient and Mobile Systems, pp. 1064–1069, 2016.

[51] J. Ivancáková, F. Babic and P. Butka,

“Comparison of Different Machine

Learning Methods on Wisconsin Dataset,”

in Proceedings of the 16th IEEE World

Symposium on Applied Machine

Intelligence and Informatics, pp.173-178,

2018.

[52] A. Fred and M. Agarap, “On Breast

Cancer Detection: An Application of

Machine Learning Algorithms on the

Wisconsin Diagnostic Dataset,” in

Proceedings of the 2nd International

Conference on Machine Learning and Soft

Computing, 2018.

[53] L. G. Ahmad, A. T. Eshlaghy, A. Poorebrahimi, M. Ebrahimi, and A. R. Razavi, “Using three machine learning techniques for predicting breast cancer recurrence,” J. Health Med. Inform., vol. 4, no. 124, p. 3, 2013.

[54] E. A. Bayrak, P. Kirci, and T. Ensari, “Comparison of machine learning methods for breast cancer diagnosis,” in Proc. Sci. Meeting Elect.-Electron. Biomed. Eng. Comput. Sci. (EBBT), Apr. 2019, pp. 1–3.

[55] M. Abdar, M. Zomorodi-Moghadam, X. Zhou, R. Gururajan, X. Tao, P. D. Barua, and R. Gururajan, “A new nested ensemble technique for automated diagnosis of breast cancer,” Pattern Recognit. Lett., vol. 132, pp. 123–131, Apr. 2020.

[56] D. A. Omondiagbe, S. Veeramani, and A. S. Sidhu, “Machine learning classification techniques for breast cancer diagnosis,” IOP Conf. Ser., Mater. Sci. Eng., vol. 495, Jun. 2019, Art. no. 012033.

[57] Z. Nematzadeh, Roliana Ibrahim and

Ali Selamat, “Comparative studies on

breast cancer classifications with k-fold

cross validations using machine learning

techniques,” Proc. in 2015, 10th Asian

Control Conf. (ASCC), pp 1-6, IEEE, 2015.

[58] Z. Zain, M. Alshenaifi, A.Aljaloud, T.

Albednah, R. Alghanim, A. Alqifari & A.

Alqahtani, (2020). “Predicting breast

cancer recurrence using principal

component analysis as feature extraction:

an unbiased comparative analysis.”

International Journal of Advances in

Intelligent Informatics, 6(3), 313-327. doi:

https://doi.org/10.26555/ijain.v6i3.462.

[59] H. Hasan and N. M. Tahir, “Feature

selection of breast cancer based on

Principal Component Analysis,” in 2010

6th International Colloquium on Signal

Processing & its Applications, 2010, pp. 1–

4, doi: 10.1109/CSPA.2010.5545298.

[60] U. Ojha and S. Goel, “A study on prediction of breast cancer recurrence using data mining techniques,” 2017 7th Int. Conf. on Cloud Computing, Data Science & Engineering – Confluence, pp 527-530, IEEE, 2017.

[61] D. Bazazeh and R. Shubair.

“Comparative study of machine learning

algorithms for breast cancer detection and

diagnosis,” 2016 5th Int. Conf. on

Electronic Devices, Systems and

Applications (ICEDSA), 6-8 December

2016, Ras Al Khaimah, UAE.

[62] B. M. Gayathri and C. P. Sumathi,

“Comparative study of relevance vector

machines with various machine learning

techniques used for detecting breast

cancer,” 2016 IEEE Int. Conf. on

Computational Intelligence and Computing

Research (ICCIC), pp 1-5, IEEE, 2016.

[63] S. Kharya and S. Soni, “Weighted

Naïve Bayes classifier –Predictive model

for breast cancer detection”, International

Journal of computer applications, Vol.133,

No.9, pp.32-37, January 2016.

[64] F. Paulin, A. Santhakumaran,

“Classification of Breast cancer by

comparing Back propagation training

algorithms.” Int. J. Comput. Sci. Eng. 2011,

3, 327–332.

[65] V. Chaurasia, S. Pal & B. B. Tiwari,

(2018). “Prediction of benign and

malignant breast cancer using data mining

techniques.” Journal of Algorithms &

Computational Technology, 12(2), 119-

126.

[66] A. Adam, K. Omar, “Computerized

Breast Cancer Diagnosis with Genetic

Algorithms and Neural Network”.

[67] C. Chi, W. Nick Street, and W. H.

Wolberg, “Application of Artificial Neural

Network-Based Survival Analysis on Two

Breast Cancer Datasets”-AMIA Annu

Symp Proc. 2007; 2007: 130–134.

[68] W. Yeu, “Machine Learning with applications in Breast Cancer Diagnosis and Prognosis”, Designs 2018, 2, 13; doi:10.3390/designs2020013

[69] R. Patgiri, “Machine Learning: A Dark

Side of Cancer Computing” International

Conference. Bioinformatics and

Computational Biology BIOCOMP’ 18,

ISBN: 1-60132-471-5, CSREA Press.

[70] M. Amrane, S. Oukid, I. Gagaoua and

T. Ensarİ, “Breast cancer classification

using machine learning,” Electric

Electronics, Computer Science, Biomedical

Engineerings' Meeting (EBBT), IEEE

2018, pp. 1-4, doi:

10.1109/EBBT.2018.8391453.

[71] N. Wu , “Deep Neural Networks

Improve Radiologists’ Performance in

Breast Cancer Screening,” in IEEE

Transactions on Medical Imaging, vol. 39,

no. 4, pp. 1184-1194, April 2020, doi:

10.1109/TMI.2019.2945514.

[72] E. Rashed and M. Samir Abou El Seoud. 2019. “Deep learning approach for breast cancer diagnosis.” In Proceedings of the 2019 8th International Conference on Software and Information Engineering (ICSIE '19). Association for Computing Machinery, New York, NY, USA, 243–247. DOI: https://doi.org/10.1145/3328833.3328867

[73] R. Arora, P. K. Rai & B. Raman,

“Deep feature–based automatic

classification of mammograms.” Med Biol

Eng Comput 58, 1199–1211 (2020).

https://doi.org/10.1007/s11517-020-02150-

[74] D. A. Ragab, O. Attallah, M. Sharkas, J. Ren, S. Marshall, “A framework for breast cancer classification using Multi-DCNNs”, Computers in Biology and Medicine, Volume 131, 2021, 104245, ISSN 0010-4825, https://doi.org/10.1016/j.compbiomed.2021.104245

[75] Z. Zainudin, S. M. Shamsuddin, S. Hasan (2021), “Deep Layer Convolutional Neural Network (CNN) Architecture for Breast Cancer Classification Using Histopathological Images.” In: Hassanien A.E., Darwish A. (eds) Machine Learning and Big Data Analytics Paradigms: Analysis, Applications and Challenges. Studies in Big Data, vol 77. Springer, Cham. https://doi.org/10.1007/978-3-030-59338-4_18

[76] Y. Zhang, S. C. Satapathy, D. S.

Guttery, J. M. Górriz, S. Wang, “Improved

Breast Cancer Classification Through

Combining Graph Convolutional Network

and Convolutional Neural Network,”

Information Processing & Management,

Volume 58, Issue 2, 2021, 102439, ISSN

0306-4573,

https://doi.org/10.1016/j.ipm.2020.102439.

[77] G. Murtaza, L. Shuib, A. W. A.

Wahab, “Ensembled deep convolution

neural network-based breast cancer

classification with misclassification

reduction algorithms.” Multimed Tools

Appl 79, 18447–18479 (2020).

https://doi.org/10.1007/s11042-020-08692-

[78] S. Sharma, R. Mehra, “Conventional

Machine Learning and Deep Learning

Approach for Multi-Classification of Breast

Cancer Histopathology Images—a

Comparative Insight.” J Digit Imaging 33,

632–654 (2020).

https://doi.org/10.1007/s10278-019-00307-y

[79] R. Yan, F. Ren, Z. Wang, L. Wang, T. Zhang, Y. Liu, X. Rao, C. Zheng, F. Zhang, “Breast cancer histopathological image classification using a hybrid deep neural network,” Methods, Volume 173, 2020, Pages 52-60, ISSN 1046-2023, https://doi.org/10.1016/j.ymeth.2019.06.014.

[80] A. Kumar, S. K. Singh, S. Saxena, K. Lakshmanan, A. K. Sangaiah, H. Chauhan, S. Shrivastava, R. K. Singh, “Deep feature learning for histopathological image classification of canine mammary tumors and human breast cancer,” Information Sciences, Volume 508, 2020, Pages 405-421, ISSN 0020-0255, https://doi.org/10.1016/j.ins.2019.08.072.

[81] H.D. Cheng, J. Shan, W. Ju, Y. Guo,

L. Zhang, “Automated breast cancer

detection and classification using

ultrasound images: A survey”, Pattern

Recognition, Volume 43, Issue 1, 2010,

Pages 299-317, ISSN 0031-3203,

https://doi.org/10.1016/j.patcog.2009.05.01

[82] G. Meenalochini, S. Ramkumar, “Survey of machine learning algorithms for breast cancer detection using mammogram images”, Materials Today: Proceedings, Volume 37, Part 2, 2021, Pages 2738-2743, https://doi.org/10.1016/j.matpr.2020.08.543

[83] Ghosh, S., Mondal, S., & Ghosh, B.

(2014, February). A comparative study of

breast cancer detection based on SVM and

MLP BPN classifier. In 2014 First

International Conference on Automation,

Control, Energy and Systems (ACES) (pp.

1-4). IEEE.

Prediction of breast cancer using machine learning algorithms on different datasets

Article

Full-text available

Jun 2023

Breast cancer is a disease that is becoming more and more common day by day, causing emotional and behavioral reactions and having fatal consequences if not detected early. At this point, traditional methods are insufficient, especially in early diagnosis. In this context, this study aimed to predict breast cancer by using machine learning (ML) algorithms on different datasets and to demonstrate the applicability of these algorithms. Algorithm performances were compared on balanced and unbalanced datasets, taking into account the performance metrics obtained in applications on different datasets. In addition, a model based on the Borda Voting method was developed by including the results obtained from four different algorithms (NB, KNN, DT, and RF) in the process. The prediction values obtained from each algorithm were written in different columns on the same excel file and the most repetitive value was accepted as the final result value. The developed model was tested on real data consisting of 60 records and the results were analyzed. When the results were examined, it was seen that higher performance was obtained with the proposed RF model compared to similar studies in the literature. Finally, the prediction results obtained with the developed model revealed the applicability of ML algorithms in the diagnosis of breast cancer.

Person Authentication Based on Biometric Traits Using Machine Learning Techniques

Chapter

Full-text available

Oct 2020

Internet of Things (IoT) is a concept of transferring data among interrelated physical devices, objects, or humans over a network without the interaction of humans or machine. Therefore, in IoT, one device collects and sends data while others receive and act on it. To communicate and to share information through interconnected devices, authentication of the sender and receiver is essential. When the goal is to improve security in IoT, traditional authentication techniques, such as knowledge-based authentication and token-based authentication, become a challenge. Therefore, researchers strongly recommend using biometrics whenever direct human access is required. However, a biometric system is also vulnerable to different types of attacks. Therefore, it is necessary to avoid and detect such attacks and secure IoT devices. Throughout this chapter, we discuss IoT and its applications, the types of security in IoT, the identification and verification process in biometrics, the vulnerability of components of biometric systems, methods to secure these components, and different types of attacks that can be made at different modules of a biometric system along with machine learning techniques to detect these attacks. We also investigate the various biometric traits used to authenticate end users using machine learning (ML) techniques, various ML algorithms, and methodologies for features extraction, matching, and classifications. Finally, a deep model is trained, and the performance of the model is evaluated on the Caltech face database and the UBIRIS.v1 iris dataset. The results of the deep model are compared with traditional ML techniques.

Non-overlapped blockwise interpolated local binary pattern as periocular feature

Article

Full-text available

May 2021
MULTIMED TOOLS APPL

Most of the prominent features of human face are present in the ocular area, referred as the periocular region. Complex and dense features in these regions makes it a candidate to be used as a biometric trait. This paper discusses an effective method for periocular recognition using non-overlapped blockwise interpolated local binary pattern (iLBP) features. For a given periocular image, an iLBP coded feature image is obtained and further divided into four equal non-overlapping sub-regions. From each sub-region having iLBP pattern, eight bin histogram features are calculated. A single feature vector is formed by concatenating blocked histograms of each non-overlapping region. Binned histogram based feature is also extracted using Phase Intensive Global Pattern (PIGP) features for comparison of results. Experiments are conducted on UBIRIS.v1 and UBIPr.v2 datasets. From the experiments, it is observed that selected histogram feature bins through the proposed approach provide a more compact representation of periocular image and size of the feature vector is also reduced with significant improvement in performance.

Deep Neural Networks Improve Radiologists’ Performance in Breast Cancer Screening

Article

Oct 2019

We present a deep convolutional neural network for breast cancer screening exam classification, trained and evaluated on over 200,000 exams (over 1,000,000 images). Our network achieves an AUC of 0.895 in predicting the presence of cancer in the breast, when tested on the screening population. We attribute the high accuracy to a few technical advances. (i) Our network’s novel two-stage architecture and training procedure, which allows us to use a high-capacity patch-level network to learn from pixel-level labels alongside a network learning from macroscopic breast-level labels. (ii) A custom ResNet-based network used as a building block of our model, whose balance of depth and width is optimized for high-resolution medical images. (iii) Pretraining the network on screening BI-RADS classification, a related task with more noisy labels. (iv) Combining multiple input views in an optimal way among a number of possible choices. To validate our model, we conducted a reader study with 14 readers, each reading 720 screening mammogram exams, and show that our model is as accurate as experienced radiologists when presented with the same data. We also show that a hybrid model, averaging the probability of malignancy predicted by a radiologist with a prediction of our neural network, is more accurate than either of the two separately. To further understand our results, we conduct a thorough analysis of our network’s performance on different subpopulations of the screening population, the model’s design, training procedure, errors, and properties of its internal representations. Our best models are publicly available at https://github.com/nyukat/breastcancerclassifier.
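
The hybrid model mentioned in this abstract averages a radiologist's probability of malignancy with the network's prediction. A minimal sketch of that idea, together with a pairwise AUC computation, might look like the following. The scores and labels are toy values made up purely for illustration and are not data from the study; the function names and the `weight` parameter are hypothetical.

```python
def hybrid_probability(p_radiologist, p_model, weight=0.5):
    """Weighted average of two malignancy probabilities.

    weight=0.5 gives the plain average; other values are an assumption
    added here for illustration.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return weight * p_radiologist + (1.0 - weight) * p_model


def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy example (invented values): 1 = malignant, 0 = not malignant.
labels = [1, 0, 1, 0]
radiologist = [0.6, 0.5, 0.4, 0.2]
model = [0.4, 0.1, 0.7, 0.5]
hybrid = [hybrid_probability(r, m) for r, m in zip(radiologist, model)]
```

In this toy case each reader alone misranks one positive/negative pair, while the averaged scores rank every pair correctly. That is the kind of complementarity the averaging is meant to exploit, though nothing here reproduces the reported results.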

A Framework for Breast Cancer Classification using Multi-DCNNs

Article

Jan 2021
COMPUT BIOL MED

Background: Deep learning (DL) is the fastest-growing field of machine learning (ML). Deep convolutional neural networks (DCNN) are currently the main tool used for image analysis and classification. Several DCNN architectures exist, among them AlexNet, GoogLeNet, and residual networks (ResNet). Method: This paper presents a new computer-aided diagnosis (CAD) system based on feature extraction and classification using DL techniques to help radiologists classify breast cancer lesions in mammograms. Four different experiments are performed to determine the optimum approach. The first consists of end-to-end pre-trained fine-tuned DCNN networks. In the second, the deep features of the DCNNs are extracted and fed to a support vector machine (SVM) classifier with different kernel functions. The third experiment performs deep feature fusion to demonstrate that combining deep features enhances the accuracy of the SVM classifiers. Finally, in the fourth experiment, principal component analysis (PCA) is introduced to reduce the large feature vector produced by feature fusion and to decrease the computational cost. The experiments are performed on two datasets: (1) the curated breast imaging subset of the digital database for screening mammography (CBIS-DDSM) and (2) the mammographic image analysis society digital mammogram database (MIAS). Results and Conclusions: The accuracy achieved using deep feature fusion on both datasets proved to be the highest compared to state-of-the-art CAD systems. When PCA was applied to the fused feature sets, the accuracy did not improve; however, the computational cost decreased, as reflected in a shorter execution time.
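
The fusion and PCA steps of the third and fourth experiments can be illustrated with a small sketch. It assumes that fusion means concatenating the per-sample feature vectors from two extractors, and it recovers only the leading principal component by power iteration. The SVM classification stage is omitted, and the feature values and function names are hypothetical.

```python
import math


def fuse(features_a, features_b):
    """Concatenate per-sample feature vectors from two extractors."""
    return [a + b for a, b in zip(features_a, features_b)]


def center(rows):
    """Subtract the per-column mean from every row."""
    n, d = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(d)]
    return [[row[j] - means[j] for j in range(d)] for row in rows]


def leading_component(centered, iterations=200):
    """Leading principal direction via power iteration on X^T X."""
    d = len(centered[0])
    v = [1.0 / math.sqrt(d)] * d  # simple non-random starting vector
    for _ in range(iterations):
        # Compute w = X^T (X v) without forming the covariance matrix.
        proj = [sum(x * y for x, y in zip(row, v)) for row in centered]
        w = [sum(p * row[j] for p, row in zip(proj, centered))
             for j in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v


def project(centered, direction):
    """Project every centered sample onto one direction (1-D score)."""
    return [sum(x * y for x, y in zip(row, direction)) for row in centered]


# Hypothetical deep features from two networks for four samples.
features_net_a = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
features_net_b = [[0.5], [1.0], [1.5], [2.0]]

fused = fuse(features_net_a, features_net_b)      # 3 values per sample
centered = center(fused)
component = leading_component(centered)
reduced = project(centered, component)            # 1 value per sample
```

The reduced representation is what would be passed to a classifier in place of the full fused vector, trading some information for a lower computational cost.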

Improved Breast Cancer Classification Through Combining Graph Convolutional Network and Convolutional Neural Network

Article

Jan 2021

Aim: In a pilot study to improve detection of malignant lesions in breast mammograms, we aimed to develop a new method called BDR-CNN-GCN, combining two advanced neural networks: (i) graph convolutional network (GCN); and (ii) convolutional neural network (CNN). Method: We utilized a standard 8-layer CNN, then integrated two improvement techniques: (i) batch normalization (BN) and (ii) dropout (DO). Finally, we utilized rank-based stochastic pooling (RSP) to substitute the traditional max pooling. This resulted in BDR-CNN, which is a combination of CNN, BN, DO, and RSP. This BDR-CNN was hybridized with a two-layer GCN, and yielded our BDR-CNN-GCN model which was then utilized for analysis of breast mammograms as a 14-way data augmentation method. Results: As proof of concept, we ran our BDR-CNN-GCN algorithm 10 times on the breast mini-MIAS dataset (containing 322 mammographic images), achieving a sensitivity of 96.20±2.90%, a specificity of 96.00±2.31% and an accuracy of 96.10±1.60%. Conclusion: Our BDR-CNN-GCN showed improved performance compared to five proposed neural network models and 15 state-of-the-art breast cancer detection approaches, proving to be an effective method for data augmentation and improved detection of malignant breast masses.
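
Sensitivity, specificity, and accuracy, and their mean ± standard deviation over repeated runs, can be computed from binary predictions as sketched below. The labels, predictions, and per-run values are toy numbers chosen for illustration, not results from the study, and the function names are hypothetical.

```python
import statistics


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels where 1 = malignant."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def diagnostic_metrics(y_true, y_pred):
    """Sensitivity, specificity, and accuracy as fractions in [0, 1]."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return {
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
    }


def mean_and_std(values):
    """Mean and sample standard deviation across repeated runs."""
    return statistics.mean(values), statistics.stdev(values)


# Toy single run (invented labels and predictions).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
run_metrics = diagnostic_metrics(y_true, y_pred)

# Toy accuracies from three hypothetical runs.
run_accuracies = [0.75, 0.80, 0.85]
acc_mean, acc_std = mean_and_std(run_accuracies)
```

Whether the reported ± values are sample or population standard deviations is not stated in the abstract; the sample form is used here as an assumption.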

Survey of machine learning algorithms for breast cancer detection using mammogram images

Article

Oct 2020

Ramkumar Sivasakthivel

Breast cancer is the primary cause of death in most cancer-affected women. Mammography is one of the most dependable strategies for early detection and diagnosis of breast cancer and reduces the death rate. Mammograms are radiographic images of the breast that are used to identify the early symptoms of breast cancer. These radiographic images reduce human error in detecting cysts, shorten the diagnosis time, and increase diagnostic accuracy. This paper presents an overview of machine learning techniques for breast cancer detection and classification, which can be divided into three main stages: pre-processing, feature extraction, and classification. It investigates the effects of several machine learning techniques for automating mammogram image classification. The investigation gathers representative works that show how machine learning is applied to solve various problems in diagnostic analysis. The study describes the effect of pre-processing mammogram images before they enter the classifier, which results in more effective classification. The detection stage is followed by segmentation of the tumor region in a mammogram image. This study is an attempt to gather and compare the various screening techniques, classifiers, and their performance in terms of sensitivity, specificity, and accuracy for breast cancer diagnosis.

Conventional Machine Learning and Deep Learning Approach for Multi-Classification of Breast Cancer Histopathology Images—a Comparative Insight

Article

Jan 2020

Automatic multi-classification of breast cancer histopathological images has remained one of the top-priority research areas in the field of biomedical informatics, due to the great clinical significance of multi-classification in providing diagnosis and prognosis of breast cancer. In this work, two machine learning approaches are thoroughly explored and compared for the task of automatic magnification-dependent multi-classification on a balanced BreakHis dataset for the detection of breast cancer. The first approach is based on handcrafted features, which are extracted using Hu moments, color histograms, and Haralick textures and then used to train conventional classifiers. The second approach is based on transfer learning, where pre-existing networks (VGG16, VGG19, and ResNet50) are used as feature extractors and as baseline models. The results reveal that using pre-trained networks as feature extractors gave superior performance compared with the baseline and handcrafted approaches at all magnifications. Moreover, augmentation was observed to play a pivotal role in further enhancing the classification accuracy. In this context, the VGG16 network with a linear SVM provides the highest accuracy, computed in two forms: (a) patch-based accuracies (93.97% for 40×, 92.92% for 100×, 91.23% for 200×, and 91.79% for 400×); (b) patient-based accuracies (93.25% for 40×, 91.87% for 100×, 91.5% for 200×, and 92.31% for 400×) for the classification of magnification-dependent histopathological images. Additionally, the “Fibro-adenoma” (benign) and “Mucous Carcinoma” (malignant) classes were found to be the most complex classes across all magnification factors.
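
The difference between patch-based and patient-based accuracy can be made concrete with a short sketch. The abstract does not define the two measures, so the definitions here are assumptions: patch-based accuracy is the fraction of all patches classified correctly, and patient-based accuracy is the mean of each patient's own fraction of correctly classified patches. The records and function names are invented for illustration.

```python
from collections import defaultdict


def patch_accuracy(records):
    """Fraction of patches whose predicted label equals the true label.

    Each record is (patient_id, true_label, predicted_label).
    """
    correct = sum(1 for _, truth, pred in records if truth == pred)
    return correct / len(records)


def patient_accuracy(records):
    """Mean over patients of the per-patient fraction of correct patches."""
    per_patient = defaultdict(lambda: [0, 0])  # id -> [correct, total]
    for patient_id, truth, pred in records:
        per_patient[patient_id][1] += 1
        if truth == pred:
            per_patient[patient_id][0] += 1
    scores = [correct / total for correct, total in per_patient.values()]
    return sum(scores) / len(scores)


# Invented example: patient A has four patches, patient B has two.
records = [
    ("A", "benign", "benign"),
    ("A", "benign", "benign"),
    ("A", "benign", "malignant"),
    ("A", "benign", "benign"),
    ("B", "malignant", "malignant"),
    ("B", "malignant", "benign"),
]

patch_acc = patch_accuracy(records)      # 4 correct out of 6 patches
patient_acc = patient_accuracy(records)  # mean of 3/4 and 1/2
```

Under these definitions, the two numbers differ whenever patients contribute unequal numbers of patches, which is one reason both forms are often reported.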

Deep learning approach for breast cancer diagnosis

Conference Paper

Apr 2019

Breast cancer is one of the leading fatal diseases worldwide, yet it can be well controlled if discovered early. The conventional method for breast screening is x-ray mammography, which is known to be challenging for early detection of cancer lesions. The dense breast structure produced by the compression process during imaging makes it difficult to recognize small abnormalities. Also, inter- and intra-variations of breast tissues make it difficult to achieve high diagnostic accuracy using hand-crafted features. Deep learning is an emerging machine learning technology that requires relatively high computational power. Yet it has proved very effective in several difficult tasks that require decision making at the level of human intelligence. In this paper, we develop a new network architecture inspired by the U-net structure that can be used for effective and early detection of breast cancer. Results indicate high sensitivity and specificity, suggesting the potential usefulness of the proposed approach in clinical use.

Breast cancer classification using machine learning

Conference Paper