Breast cancer detection and classification using metaheuristic optimized ensemble extreme learning machine

Int. j. inf. tecnol.
https://doi.org/10.1007/s41870-023-01533-y
ORIGINAL RESEARCH
RajKumarPattnaik1· MohammadSiddique1· SatyasisMishra2 ·
DemissieJ.Gelmecha2· RamSewakSingh2· SunitaSatapathy3
Received: 5 May 2023 / Accepted: 14 September 2023
© The Author(s), under exclusive licence to Bharati Vidyapeeth’s Institute of Computer Applications and Management 2023
Abstract Breast cancer deaths are increasing rapidly due to the abnormal growth of breast cells in women's milk ducts. Manual cancer diagnosis from mammogram images is also difficult for radiologists and medical practitioners. This paper proposes a novel metaheuristic algorithm-based machine learning model and a fuzzy c-means-based segmentation technique for the detection and classification of breast cancer from mammogram images. First, a fuzzy factor improved fast and robust fuzzy c-means (FFI-FRFCM) segmentation is proposed by modifying the membership partition matrix of the FRFCM technique. Second, a hybrid improved water cycle algorithm-accelerated particle swarm optimization (IWCA-APSO) algorithm is proposed for weight optimization of the ensemble extreme learning machine (EELM) model. Three benchmark functions are optimized to demonstrate the distinctiveness of the proposed hybrid IWCA-APSO algorithm. On the INbreast dataset, the IWCA-APSO-based EELM classifier achieved a sensitivity, specificity, accuracy, and computational time of 99.67%, 99.71%, 99.36%, and 23.8751 s, respectively. The proposed IWCA-APSO-based EELM model performs better than the traditional models at classifying breast cancer.
Keywords Extreme learning machine · Breast cancer · Fuzzy C means · Water cycle algorithm · Wavelet transform · Accelerated particle swarm optimization
1 Introduction
Breast cancer is a severe illness that affects women's breast cells. According to the global health challenge 2020, 2.26 million cases were predicted to occur globally in 2020. In 2020, underdeveloped countries saw more breast cancer-related fatalities [1]. Due to the lack of innovative detection tools, underdeveloped and developing nations have higher breast cancer-related mortality rates [2]. Breast cancer is the most frequently diagnosed cancer in the world and a leading cause of cancer-related death; it accounted for nearly 12.2% of all newly identified cases in 2020, according to the World Cancer Research Fund International (WCRFI) [3]. For instance, a study on the epidemiology of breast cancer at Hawassa University Comprehensive Specialized Hospital (HUCSH) found that African women, particularly those in Ethiopia, are affected by breast cancer at a young age due to the absence of treatment at an early stage [4]. Segmentation is essential for detection and aids in measuring the volume of breast tissue for scheduling diagnoses. To improve robustness to noise, researchers have proposed FCM-based image segmentation methods. For breast cancer detection, Priya et al. [5] proposed Hybrid Markov Penalized FCM, Kamil et al. [6] proposed fuzzy c-means with k-means, and Chowdhary et al. [7] proposed Intuitionist Possibilistic Fuzzy C-Mean. There is not enough literature on the segmentation of breast cancer with FCM-based algorithms. The FCM-based algorithms were developed in a broader sense and applied to brain tumors. Some of the FCM-based techniques
* Satyasis Mishra
satyasismishra@gmail.com
1 Department of Mathematics, Centurion University of Technology and Management, Bhubaneswar, Odisha, India
2 Department of ECE, Adama Science and Technology University, Adama, Ethiopia
3 Department of Zoology, Centurion University of Technology and Management, Bhubaneswar, Odisha, India
are presented as follows. Szilagyi et al. [8] developed an improved FCM algorithm (EnFCM) for brain images. EnFCM's configurable parameter enhances segmentation outcomes but cannot remove noise altogether. The fast generalized FCM algorithm (FGFCM) was proposed by Cai et al. [9] to reduce noise, but FGFCM requires additional components to segment the images. The fuzzy local information c-means clustering algorithm (FLICM), developed by Krinidis et al. [10], replaces the parameter in FGFCM with a fuzzy factor to limit the noise. FLICM speeds up segmentation but can only reduce Gaussian noise by 30% or less. FCM with local information and kernel metric (KWFLICM) was proposed by Gong et al. [11] to enhance the segmentation capability of FLICM and increase its robustness. Fast and robust FCM (FRFCM) was suggested by Tai et al. [12] for the segmentation of brain tumors to reduce Rician noise. All of the aforementioned segmentation methods were applied to strengthen the segmentation and noise reduction capability. Motivated by these FCM-based segmentation techniques, we have developed a fuzzy factor improved fast and robust FCM (FFI-FRFCM) segmentation for breast cancer detection by updating the fuzzy factor in the objective function. According to the literature review, some basic segmentation techniques have been used with breast cancer web data. However, none of these algorithms succeeds in sufficiently removing image noise and detecting cancer.
Controlling observation depends on the classification. Researchers have suggested various classification strategies based on the unpredictable nature of cancer and classification challenges. A support vector machine (SVM) classifier was proposed by Gorgel et al. [13] to categorize the segmented masses as cancerous or non-cancerous breast tumors. For mass classification, Lima et al. [14] suggested a new SVM-based feature selection method along with selected geometry and texture features. Juneja et al. [15] proposed a selective feature-based improved decision tree algorithm for classification, with a chi-square test to identify the features, and achieved 99% accuracy on the Wisconsin Breast Cancer Database. Pramod et al. [16] proposed a hybrid approach based on a differential evolution algorithm and cuckoo search for identifying the ROI in mammogram images and achieved 97.51% accuracy on the DDSM dataset. Sharma et al. [17] proposed feature selection approaches, namely correlation-based selection, information gain-based selection, and sequential feature selection, together with a max voting classifier and achieved 99.41% classification accuracy on the Wisconsin Breast Cancer (WDBC) dataset. Kate et al. [18] proposed a gravitational search algorithm with Kapur's entropy as the fitness function and the VGG16 and InceptionV3 models for classification on the Digital Database for Screening Mammography (DDSM) dataset, achieving an accuracy of 97.98% for the InceptionV3-based model and 91.92% for VGG16. A novel residual deep convolutional neural network (DCNN) with stochastic gradient descent (SGD) and AdaGrad-based optimizers was proposed by Mishra et al. [19] for breast ultrasound (BUS) images and achieved an AUC of 0.9906, an accuracy of 96.21%, and an F1-score of 0.9725. Kumari et al. [20] proposed a hybrid classifier that integrates eXtreme Gradient Boost (XGBoost) with Random Forest (XGBoost-RF) for the classification of mammograms from the Mammographic Image Analysis Society (MIAS) and DDSM datasets with K-fold cross-validation, achieving accuracies of 98.6% and 94.3% on the MIAS and DDSM datasets, respectively. Machine learning algorithms were able to recognize the wide size variations among the masses [21], but they were unable to offer a suitable scale for the variety of masses. Metaheuristic algorithms such as the water cycle algorithm (WCA) [22], sine cosine algorithm (SCA) [23], teaching and learning-based optimization (TLBO) [23], artificial bee colony (ABC) [24], particle swarm optimization (PSO) [25], the APSO algorithm [25], and the harmony search (HS) algorithm [26] have been proposed for optimizing the weights of machine learning models to improve classification performance. In this research, we propose a parametric improvement to the WCA algorithm and hybridize it with the APSO algorithm to improve the classification performance of the EELM model. The hybrid IWCA-APSO weight-optimized EELM model is proposed to classify mammogram images as cancerous or non-cancerous.
The contributions are as follows:
• We have developed a fuzzy factor improved fast and robust FCM (FFI-FRFCM) method by upgrading the fuzzy factor in the fuzzy partition matrix. To achieve results with more precision, we also combined the fuzzy partition matrix with the mean filter to detect breast cancer and reduce noise.
• To improve the performance of the EELM classifier, a hybrid improved WCA-APSO optimization algorithm is developed by incorporating parameter variations, and its mathematical analysis is also presented. The improved WCA-APSO optimization is employed to optimize the weights of the EELM model.
• Further, to show the distinctiveness of the improved WCA-APSO hybrid algorithm, we have considered three different benchmark functions for optimization. The results of the improved WCA-APSO algorithm are compared with those of metaheuristic algorithms such as the water cycle algorithm (WCA), sine cosine algorithm (SCA), and artificial bee colony (ABC) algorithm.
The remainder of the paper is organized as follows: Sect. 2 presents related work, Sect. 3 presents the materials and methods, including the research diagram and the proposed IWCA-APSO-based EELM model, Sect. 4 presents the results and discussion, and Sect. 5 presents the conclusion.
2 Related work
Using computer-aided methods, Singh et al. [27] developed segmentation based on k-means clustering. In addition, the clustering on the MIAS dataset was improved using the fuzzy intensification operator (INT). Velmurugan et al. [28] suggested using k-means and FCM techniques for segmentation, compared the two approaches, and selected the more effective one for analyzing breast images. By combining fuzzy c-means with the Chan-Vese model, Hmida et al. [29] created a fuzzy active contour model for segmenting masses in the regions of interest of mammographic images. The segmented masses are then used to extract shape and margin parameters that categorize them as benign or malignant. According to experimental findings on regions of interest (ROIs) taken from the MIAS database, the suggested strategy produces accurate mass segmentation and classification outcomes. A new graph cut-based segmentation algorithm was created by Zheng et al. [30] to refine coarse manual segmentation for the identification of tumor areas. Second, a spatio-temporal model of the segmented tumor is created to extract spatio-temporal enhancement patterns (STEPs) by treating successive contrast-enhanced images as a single spatio-temporal image. The framework's high accuracy was confirmed through experiments that produced an area of 0.97 under the ROC curve. The development of deep learning improves classification accuracy, but model simulation requires a significant amount of processing time. Table 1 presents a literature survey of some recent research.
3 Materials and methods
3.1 Proposed methodology
The proposed methodology in Fig. 1 focuses on classifying breast cancer by using machine learning and a soft computing hybrid model. In the first step, the mammogram images are segmented by the fuzzy factor improved FRFCM method and features are extracted by the wavelet transform. In the second step, the features are fed as input to the proposed IWCA-APSO-EELM model and to the WCA-EELM, WCA-PSO-EELM, and WCA-APSO-EELM models. In the third step, the classification comparison results are obtained and presented. The ensemble extreme learning machine (EELM) model is a combination of ELM models. The weights of each single ELM model are optimized by the proposed IWCA-APSO algorithm, and the outputs of all ELMs are combined by taking their mean. The detailed diagram of the EELM model is presented in Fig. 5.
Table 1 Literature survey of previous research

| Sl no | References | Year | Dataset used | Model | Accuracy in % |
|---|---|---|---|---|---|
| 4 | Hameed et al. [31] | 2022 | WSI | Xception | 97.33 |
| 5 | Maqsood et al. [32] | 2022 | DDSM | Transferable texture convolutional neural network (TTCNN), a deep learning model | 97.49 |
| 6 | Joseph et al. [33] | 2022 | BreakHis dataset | DNN | 97.87 |
| 11 | Jabeen et al. [34] | 2022 | Breast Ultrasound Images (BUSI) | Deep learning | 99.1 |
| 12 | Ramesh et al. [35] | 2022 | MIAS dataset | GoogLeNet | 99.12 |
| 13 | Khozama et al. [36] | 2022 | BCSC dataset | Ensemble Learning Model | 91.33 |
Fig. 1 Proposed methodology flow
3.1.1 Fuzzy factor improved fast and robust fuzzy C means (FFI-FRFCM) segmentation
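The FCM method can retain more image information; however, achieving accurate detection remains challenging. A fuzzy factor improved FRFCM method is therefore proposed for detecting breast cancer from mammogram images.

The objective function with local information [11] is specified by

$$\mathrm{FFI} = \sum_{l=1}^{n}\sum_{k=1}^{c}\left[u_{kl}^{m}\,\lVert x_l - v_k\rVert^{2} + F_{kl}\right] \tag{1}$$

where the fuzzy factor is given by

$$F_{kl} = \sum_{r \in N_v,\; l \neq r} \frac{1}{d_{lr}+1}\,\left(1-u_{kr}\right)^{m}\,\lVert x_r - v_k\rVert^{2} \tag{2}$$

where $u_{kl}$ is the fuzzy partition matrix, $N_v$ is the local neighborhood window, and $d_{lr}$ is the spatial distance between pixels $l$ and $r$. The fuzzy partition matrix is given by

$$u_{kl} = \frac{1}{\sum_{j=1}^{c}\left(\dfrac{\lVert x_l - v_k\rVert^{2} + F_{kl}}{\lVert x_l - v_j\rVert^{2} + F_{jl}}\right)^{\frac{1}{m-1}}} \tag{3}$$

and

$$v_k = \frac{\sum_{l=1}^{n} u_{kl}^{m}\, x_l}{\sum_{l=1}^{n} u_{kl}^{m}} \tag{4}$$

where $v_k$ is the cluster center. From Eq. (3), it is seen that the factor $F_{kl}$ reduces the noise and preserves the image details, but the computation time increases.

To improve the performance of the segmentation and reduce the computational complexity, the fuzzy factor in the membership partition matrix is updated as

$$F'_{kl} = \sum_{r \in N_v,\; l \neq r} \log\left(\rho\,\frac{\tau+1}{d_{lr}}\right) u_{kr}^{m}\,\lVert x_r - v_k\rVert^{2} \tag{5}$$

where $\rho$ is a constant, $\gamma$ is the gray value of the image, and $\tau$ is the smoothness factor. Further, applying dilation and erosion to the image through morphological reconstruction, the new image $\xi_p$ is obtained as

$$\xi_p = R_b^{C}(I) \tag{6}$$

where $I$ denotes the original image and $R_b^{C}$ is the morphological closing reconstruction. The membership partition matrix and the cluster centers are then given by

$$u_{kp} = \frac{1}{\sum_{j=1}^{c}\left(\dfrac{\lVert \xi_p - v_k\rVert^{2} + F_{kp}}{\lVert \xi_p - v_j\rVert^{2} + F_{jp}}\right)^{\frac{1}{t-1}}} \tag{7}$$

and

$$v_k = \frac{\sum_{p=1}^{s} u_{kp}^{t}\, \xi_p}{\sum_{p=1}^{s} u_{kp}^{t}} \tag{8}$$

where $t$ is the fuzzification exponent. Now, applying the mean filter to the membership partition matrix, the new membership partition matrix is given by

$$U = \operatorname{mean}\left[U_{kp}\right] \tag{9}$$

With the application of the mean filter, the segmentation process provides better detection of tumors in breast cancer images and improves the noise reduction capability.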
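The membership and center updates of Eqs. (3), (4), and (9) can be illustrated with a minimal sketch on a toy one-dimensional "image". This is not the authors' implementation: the function names, the toy data, the optional `factor` argument standing in for the local term $F_{kl}$, the small `eps` guard, and the renormalisation after the mean filter are all assumptions made for illustration.

```python
from typing import List, Optional

Matrix = List[List[float]]


def update_memberships(pixels: List[float], centers: List[float], m: float = 2.0,
                       factor: Optional[Matrix] = None) -> Matrix:
    """Eq. (3)-style update of u[k][l] for cluster k and pixel l.

    factor[k][l] is an optional local term F_kl; None gives plain FCM.
    """
    c, n = len(centers), len(pixels)
    eps = 1e-12  # assumed guard against division by zero
    cost = [[(pixels[l] - centers[k]) ** 2
             + (factor[k][l] if factor else 0.0) + eps
             for l in range(n)] for k in range(c)]
    u = [[0.0] * n for _ in range(c)]
    for l in range(n):
        for k in range(c):
            s = sum((cost[k][l] / cost[j][l]) ** (1.0 / (m - 1.0)) for j in range(c))
            u[k][l] = 1.0 / s
    return u


def update_centers(pixels: List[float], u: Matrix, m: float = 2.0) -> List[float]:
    """Eq. (4)-style update: membership-weighted mean of the pixel values."""
    return [sum(w ** m * x for w, x in zip(row, pixels)) / sum(w ** m for w in row)
            for row in u]


def mean_filter_rows(u: Matrix, radius: int = 1) -> Matrix:
    """Eq. (9)-style smoothing: moving average of each membership row.

    The final renormalisation (memberships of a pixel sum to 1) is an assumption.
    """
    c, n = len(u), len(u[0])
    smooth = []
    for row in u:
        out = []
        for l in range(n):
            lo, hi = max(0, l - radius), min(n, l + radius + 1)
            out.append(sum(row[lo:hi]) / (hi - lo))
        smooth.append(out)
    for l in range(n):
        total = sum(smooth[k][l] for k in range(c))
        for k in range(c):
            smooth[k][l] /= total
    return smooth


# Toy 1-D "image": dark background with a bright region (hypothetical values).
pixels = [0.10, 0.12, 0.11, 0.90, 0.88, 0.92]
centers = [0.0, 1.0]
for _ in range(10):
    u = mean_filter_rows(update_memberships(pixels, centers))
    centers = update_centers(pixels, u)
labels = [max(range(len(centers)), key=lambda k: u[k][l]) for l in range(len(pixels))]
```

In a full segmentation the same updates would run over two-dimensional pixel neighborhoods of the morphologically reconstructed image $\xi_p$; the one-dimensional window here only keeps the sketch short.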
3.1.2 Proposed hybrid IWCA‑APSO algorithm
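In the particle swarm optimization (PSO) [23, 24] algorithm, the velocity update equation is given by

$$v_i(l+1) = \kappa\, v_i(l) + \beta_1 C_1\left(p_i^{best} - x_i(l)\right) + \beta_2 C_2\left(p_i^{gbest} - x_i(l)\right) \tag{10}$$

and the position update equation is given by

$$x_i(l+1) = x_i(l) + v_i(l+1) \tag{11}$$

where the learning factors $\beta_1$ and $\beta_2$ are the local and global position weight coefficients, $C_1$ and $C_2$ are random values in [0, 1], and $\kappa$ is the inertia coefficient. The APSO algorithm [36] reduces the complexity of PSO by reducing the number of parameters. The velocity equation of PSO is modified as

$$v_i(l+1) = v_i(l) + \alpha n + \beta\left[g_b - x_i(l)\right] \tag{12}$$

and the position equation is given by

$$x_i(l+1) = (1-\beta)\, x_i(l) + \beta\, g_b + \alpha n \tag{13}$$

where $g_b$ is the global best position.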
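As a hedged illustration of the APSO position update in Eq. (13), the following sketch moves each particle toward the global best with a random perturbation. The function names, the zero-mean noise drawn from [-1, 1], the decay of $\alpha$, the clipping to the search bounds, and all default values are assumptions for illustration, not settings taken from the paper.

```python
import random
from typing import Callable, List, Tuple

Vector = List[float]


def apso_step(x: Vector, g_best: Vector, alpha: float, beta: float,
              noise: Vector) -> Vector:
    """Eq. (13): x(l+1) = (1 - beta) x(l) + beta g_b + alpha n."""
    return [(1.0 - beta) * xi + beta * gi + alpha * ni
            for xi, gi, ni in zip(x, g_best, noise)]


def apso_minimize(f: Callable[[Vector], float], dim: int,
                  bounds: Tuple[float, float], n_particles: int = 20,
                  iters: int = 100, alpha: float = 0.2, beta: float = 0.5,
                  seed: int = 0) -> Tuple[Vector, float]:
    rng = random.Random(seed)
    lo, hi = bounds
    swarm = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    g_best = min(swarm, key=f)[:]
    for _ in range(iters):
        for i, x in enumerate(swarm):
            noise = [rng.uniform(-1.0, 1.0) for _ in range(dim)]  # assumed noise model
            moved = apso_step(x, g_best, alpha, beta, noise)
            swarm[i] = [min(hi, max(lo, xi)) for xi in moved]     # stay inside bounds
            if f(swarm[i]) < f(g_best):
                g_best = swarm[i][:]
        alpha *= 0.95  # assumed decay so the random term fades over time
    return g_best, f(g_best)


def sphere(x: Vector) -> float:
    return sum(v * v for v in x)


best, value = apso_minimize(sphere, dim=2, bounds=(-5.12, 5.12))
```

Because no personal best is stored, each particle needs only its position and the shared global best, which is the simplification over Eqs. (10) and (11).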
The WCA algorithm, proposed by Sadollah et al. [22], is based on how streams and rivers flow toward the sea as part of the water cycle. Figure 2 depicts the water cycle algorithm, which begins with rainfall or precipitation events that form the streams, i.e., the population of design variables [22, 23]. A new stream is selected when the stream moves to a new location close to the sea, as seen in Fig. 3. The rivers are then chosen from the collection of streams with the best fitness values. Streams are assumed to change positions and flow velocities as they proceed toward rivers and the sea. Treating a one-dimensional 1 × d array as a stream, the population of solutions and the corresponding matrix are given by

$$\mathrm{RS}_{\mathrm{Totpop}} = \begin{bmatrix} \mathrm{Sea} \\ \mathrm{River}_1 \\ \mathrm{River}_2 \\ \vdots \\ \mathrm{Stream}_{S_{sr}+1} \\ \mathrm{Stream}_{S_{sr}+2} \\ \mathrm{Stream}_{S_{sr}+3} \\ \vdots \\ \mathrm{Stream}_{S_{pop}} \end{bmatrix} = \begin{bmatrix} x_1^{1} & x_2^{1} & \cdots & x_d^{1} \\ x_1^{2} & x_2^{2} & \cdots & x_d^{2} \\ \vdots & \vdots & \ddots & \vdots \\ x_1^{S_{pop}} & x_2^{S_{pop}} & \cdots & x_d^{S_{pop}} \end{bmatrix} \tag{14}$$

where the first $S_{sr}$ rows are the sea and the rivers and $d$ is the dimension of each solution. The total number of rivers and the sea is

$$S_{sr} = \text{No. of rivers} + 1\ (\text{sea}) \tag{15}$$

and the number of remaining streams is

$$S_{Str} = S_{pop} - S_{sr} \tag{16}$$

where $S_{pop}$ is the population of streams created at the initial rainfall stage. The $S_{sr}$ individuals with the minimum values are selected as the sea and rivers, and the stream with the minimum value is treated as the sea. Typically, water entering a river flows through streams to the sea. Adding new rivers and removing the surviving population as the stream flows into the rivers depends on the amount of water flow.

The improved water cycle algorithm and accelerated particle swarm optimization are combined by ignoring the evaporation criteria and mapping the position and velocity equations of the APSO algorithm onto the stream flow matrix of size $S_{pop} \times D$. The resulting velocity update equations of the flow are

$$v_i^{str}(l+1) = \chi\, v_i^{str}(l) + \alpha n + \lambda_1\left(x^{sea}(l) - x_i^{str}(l)\right) \tag{17}$$

$$v_i^{str}(l+1) = \chi\, v_i^{str}(l) + \alpha n + \lambda_2\left(x^{riv}(l) - x_i^{str}(l)\right) \tag{18}$$

$$v_i^{riv}(l+1) = \chi\, v_i^{str}(l) + \alpha n + \lambda_3\left(x^{sea}(l) - x_i^{riv}(l)\right) \tag{19}$$

where $\chi$, $\alpha$, $\lambda_1$, $\lambda_2$, and $\lambda_3$ are the controlling parameters of convergence, $v_i^{str}(l+1)$ is the new stream velocity, and $v_i^{riv}(l+1)$ is the new river velocity. Considering the velocities of the streams and rivers, the new position update equations for the stream and river are given by

$$x_i^{str}(l+1) = \begin{cases} \chi(1-\beta)\, x_i^{str}(l) + v_i^{str}(l+1), & \text{for sea} > \text{stream} \\ \chi(1-\beta)\, x_i^{str}(l) + v_i^{str}(l+1), & \text{for river} > \text{stream} \end{cases} \tag{20}$$

$$x_i^{riv}(l+1) = \chi(1-\beta)\, x_i^{str}(l) + v_i^{riv}(l+1) \tag{21}$$

The parameter $\chi$ is the controlling parameter for the optimization, $x_i^{str}(l+1)$ is the new stream position, $x_i^{riv}(l+1)$ is the new river position, and $\beta$ is the controlling coefficient of the stream position.

Fig. 2 Water cycle algorithm [22]
Fig. 3 New positions of stream, flows to sea [22]

3.2 Proposed IWCA-APSO weight optimization of EELM model

The extreme learning machine (ELM) [24] is a feed-forward network known for its fast convergence. As the dataset size grows, ELM suffers from overfitting. To improve the classification accuracy, we propose an IWCA-APSO hybrid model for optimizing the weights of the ensemble ELM model. The architecture of the proposed IWCA-APSO-based EELM model is presented in Fig. 4. First, the weights of a single ELM model, shown in Fig. 4, are optimized by the IWCA-APSO model and the error is calculated; the errors of all the other ELMs are then combined and the average error is selected. The mathematical analysis is also presented step-wise to explain the flow of the algorithm (Fig. 5).
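The stream and river updates of Eqs. (17) to (21) can be sketched as a single "flow toward a target" step inside a small minimization loop. This is an illustrative reading of the equations, not the authors' code: the names `flow_step` and `iwca_apso_minimize`, the rule that stream i flows toward row `i mod S_sr`, the use of each row's own previous velocity, the zero-mean noise, the clipping to the bounds, and the default `lam` and `beta` values are all assumptions.

```python
import random
from typing import Callable, List, Tuple

Vector = List[float]


def flow_step(x: Vector, v: Vector, target: Vector, lam: float,
              chi: float = 0.6, beta: float = 0.5,
              alpha: float = 0.0, noise: float = 0.0) -> Tuple[Vector, Vector]:
    """Move one stream or river toward its target (a river or the sea).

    The velocity has the form of Eqs. (17)-(19), the position that of
    Eqs. (20)-(21).
    """
    v_new = [chi * vi + alpha * noise + lam * (ti - xi)
             for xi, vi, ti in zip(x, v, target)]
    x_new = [chi * (1.0 - beta) * xi + vn for xi, vn in zip(x, v_new)]
    return x_new, v_new


def iwca_apso_minimize(f: Callable[[Vector], float], dim: int,
                       bounds: Tuple[float, float], s_pop: int = 20,
                       n_rivers: int = 3, iters: int = 100, lam: float = 0.5,
                       seed: int = 0) -> Tuple[Vector, float]:
    rng = random.Random(seed)
    lo, hi = bounds
    s_sr = n_rivers + 1  # Eq. (15): rivers plus the sea
    pop = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(s_pop)]
    vel = [[0.0] * dim for _ in range(s_pop)]
    order = sorted(range(s_pop), key=lambda i: f(pop[i]))
    pop = [pop[i] for i in order]  # row 0 = sea, rows 1..n_rivers = rivers
    best = pop[0][:]
    for _ in range(iters):
        for i in range(1, s_pop):
            # Assumed assignment: rivers flow to the sea, stream i to row i mod s_sr.
            target = pop[0] if i < s_sr else pop[i % s_sr]
            x, v = flow_step(pop[i], vel[i], target, lam,
                             alpha=rng.random(), noise=rng.uniform(-1.0, 1.0))
            pop[i] = [min(hi, max(lo, xi)) for xi in x]  # keep inside the bounds
            vel[i] = v
        # Re-rank so that the best individual becomes the sea again.
        order = sorted(range(s_pop), key=lambda i: f(pop[i]))
        pop = [pop[i] for i in order]
        vel = [vel[i] for i in order]
        if f(pop[0]) < f(best):
            best = pop[0][:]
    return best, f(best)


def sphere(x: Vector) -> float:
    return sum(t * t for t in x)


best, value = iwca_apso_minimize(sphere, dim=2, bounds=(-5.12, 5.12), seed=3)
```

For weight optimization of an ELM, `f` would be the classification error as a function of the weight vector; the sketch is untuned and is not expected to reproduce the reported convergence behavior.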
3.2.1 Step‑1
According to the ELM architecture [24], the output is given by

$$y = \sum_{k=0}^{L} \beta_k\, q_k\left(w_k; x\right) \tag{22}$$

where $q(w, x) = \left[1, q_1(w_1, x), \ldots, q_K(w_K, x)\right]$ is the hidden layer output, $\beta$ is the vector of weights from all hidden neurons to an output neuron, which is determined analytically, and $g_k(\cdot)$ is the activation function of the hidden layer. Equation (22) can be written as

$$Q\beta = y \tag{23}$$

where $Q$ is the hidden layer matrix and

$$q_L\left(w_K; x_K\right) = \left(w_1 x_1 + w_2 x_2 + \cdots + w_K x_K\right) e^{-\frac{\lVert x_K - c_i\rVert^{2}}{2\sigma_k^{2}}}$$

where $\lVert x_k - c_i\rVert$ is the Euclidean distance between the inputs and the function center. Equation (23) is a linear equation, which can be solved by

$$Q^{\dagger} = \left(Q^{T} Q\right)^{-1} Q^{T} \tag{24}$$

$$\beta = Q^{\dagger} d \tag{25}$$

where $Q^{\dagger}$ is the Moore–Penrose generalized inverse of the matrix $Q$ and $d$ is the desired vector.

Fig. 4 Architecture of the proposed IWCA-APSO-based EELM model
Fig. 5 IWCA-APSO based ELM model
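The analytic solution of Eqs. (23) to (25) can be sketched in a few lines. The example below is a simplified stand-in rather than the paper's network: it uses a bias column plus plain Gaussian hidden units (without the linear factor of the hidden-unit expression above), fixed centers instead of randomly drawn or IWCA-APSO-optimized parameters, and toy one-dimensional data; all names and values are hypothetical.

```python
import math
from typing import List

Matrix = List[List[float]]


def hidden_layer(xs: List[float], centers: List[float], sigma: float) -> Matrix:
    """Hidden-layer matrix Q: a bias column plus one Gaussian unit per center."""
    return [[1.0] + [math.exp(-((x - c) ** 2) / (2.0 * sigma ** 2)) for c in centers]
            for x in xs]


def solve(a: Matrix, b: List[float]) -> List[float]:
    """Solve a z = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [rv - factor * cv for rv, cv in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def elm_fit(q: Matrix, d: List[float]) -> List[float]:
    """beta = (Q^T Q)^{-1} Q^T d, i.e. Eqs. (24)-(25) for full-column-rank Q."""
    cols = list(zip(*q))
    qtq = [[sum(a * b for a, b in zip(ci, cj)) for cj in cols] for ci in cols]
    qtd = [sum(a * t for a, t in zip(ci, d)) for ci in cols]
    return solve(qtq, qtd)


def elm_predict(q: Matrix, beta: List[float]) -> List[float]:
    """Eq. (22): weighted sum of the hidden-layer outputs."""
    return [sum(w * h for w, h in zip(beta, row)) for row in q]


xs = [i / 10.0 for i in range(21)]      # toy inputs in [0, 2]
d = [math.sin(3.0 * x) for x in xs]     # toy desired outputs
centers = [0.0, 0.5, 1.0, 1.5, 2.0]     # fixed here for reproducibility
q = hidden_layer(xs, centers, sigma=0.3)
beta = elm_fit(q, d)
y = elm_predict(q, beta)
mse = sum((t - p) ** 2 for t, p in zip(d, y)) / len(d)
```

Solving the normal equations directly keeps the sketch dependency-free; for ill-conditioned hidden-layer matrices a pseudoinverse based on a singular value decomposition would be the more robust choice.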
3.2.2 Step‑2
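The optimization takes place according to $\beta$. The error for a single ELM is given by

$$e = d - y = d - Q\beta \tag{26}$$

3.2.3 Step‑3

With the proposed weights $w = \left[w_1, w_2, \ldots, w_{L-1}, w_L\right]$ of the ELM model, the velocity equations are reformulated as

$$w_i^{str}(l+1) = \chi\, w_i^{str}(l) + \alpha n + \lambda_1\left(x^{sea}(l) - x_i^{str}(l)\right) \tag{27}$$

$$w_i^{str}(l+1) = \chi\, w_i^{str}(l) + \alpha n + \lambda_2\left(x^{riv}(l) - x_i^{str}(l)\right) \tag{28}$$

The position equation is given by

$$x_i^{str}(l+1) = \chi(1-\beta)\, x_i^{str}(l) + w_i^{str}(l+1) \tag{29}$$

When a river merges with the sea, the weights are optimized by the equation

$$w_i^{riv}(l+1) = \chi\, w_i^{str}(l) + \alpha n + \lambda_3\left(x^{sea}(l) - x_i^{riv}(l)\right) \tag{30}$$

The position equation is given by

$$x_i^{riv}(l+1) = \chi(1-\beta)\, x_i^{str}(l) + w_i^{riv}(l+1) \tag{31}$$

3.2.4 Step‑4

Combining all $L$ ELMs, the error of the ensemble is given by

$$e = d - y_{ens} = d - Q\beta_{ens} \tag{32}$$

where $y_{ens} = \frac{1}{L}\sum_{i=1}^{L} y_i$. The expectation of the error is given by

$$E\left[\left(d_l - y\right)^{2}\right] = E\left[e_l^{2}\right] \tag{33}$$

Now, assuming the individual errors are uncorrelated with zero mean, the MSE of the ensemble model is

$$E_{ens} = E\left[\left(\frac{1}{L}\sum_{i=1}^{L} e_i\right)^{2}\right] = \frac{1}{L^{2}}\sum_{i=1}^{L} E\left[e_i^{2}\right] \tag{34}$$

Therefore

$$E_{ens} = \frac{1}{L}\, E_{avg} \tag{35}$$

where $E_{avg} = \frac{1}{L}\sum_{i=1}^{L} E\left[e_i^{2}\right]$ is the average MSE of the individual ELMs.

The pseudo code for IWCA-APSO-EELM is presented in Table 1. Parameter values used for simulation in this research are presented in Table 2.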
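The effect described by Eqs. (34) and (35) can be checked numerically with a small sketch. The "ELM outputs" below are simulated as the desired values plus independent Gaussian noise, which is an assumption standing in for trained models; the function names and all numbers are hypothetical.

```python
import random
from typing import List


def ensemble_output(outputs: List[List[float]]) -> List[float]:
    """y_ens: element-wise mean of the L individual model outputs."""
    count = len(outputs)
    return [sum(col) / count for col in zip(*outputs)]


def mse(target: List[float], pred: List[float]) -> float:
    return sum((t - p) ** 2 for t, p in zip(target, pred)) / len(target)


rng = random.Random(1)
d = [rng.uniform(-1.0, 1.0) for _ in range(2000)]   # desired outputs (toy data)
L = 5                                               # number of ELMs in the ensemble
# Simulated individual outputs: desired value plus independent zero-mean noise.
outputs = [[t + rng.gauss(0.0, 0.3) for t in d] for _ in range(L)]

e_avg = sum(mse(d, y) for y in outputs) / L         # average individual MSE
e_ens = mse(d, ensemble_output(outputs))            # MSE of the averaged output
ratio = e_ens / e_avg                               # close to 1 / L when errors are independent
```

With correlated errors, as is typical for models trained on the same data, the reduction is smaller than the factor 1/L, but the averaged output is never worse than the average individual MSE.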
3.2.5 Dataset
The INbreast database [37], which contains 410 mammograms from 115 people, has been used in the experiments. The mammograms have two views: craniocaudal (CC) and mediolateral oblique (MLO). Depending on the compression plate used for the acquisition, the mammogram size is either 4084 × 3328 or 3328 × 2560 pixels. The INbreast database is open to the public. A sample image of the INbreast dataset is presented in Fig. 6. We selected the Morlet wavelet [24] and extracted six features from each of the 410 images; hence 410 × 6 = 2460 data values are collected for each wavelet. For our experiment, we selected the Morlet wavelet features after visualizing the collected feature data. The Morlet wavelet function is given by

$$\psi(x) = \cos(1.75x)\,\exp\left(-\frac{x^{2}}{2}\right) \tag{36}$$

Table 2 Values of the parameters used and their ranges

| Parameters | Values | Bound range |
|---|---|---|
| $\chi$ | 0.6 | [0 1] |
| $C_1$ | 0.9 | [0 1] |
| $C_2$ | 0.9 | [0 1] |
| $\lambda_1, \lambda_2, \lambda_3$ | 2 | [0 2] |
| $\alpha$ | rand | [0 1] |
| $n$ | rand | [0 1] |

Fig. 6 The INbreast database sample images [37]
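The Morlet function of Eq. (36) is straightforward to evaluate. The sketch below also samples it into a small kernel and correlates it with a one-dimensional signal, purely to illustrate how wavelet coefficients arise; the `wavelet_coefficients` helper, the kernel width, and the scale are assumptions, and the six features used in the paper are not reproduced here.

```python
import math
from typing import List


def morlet(x: float) -> float:
    """Eq. (36): psi(x) = cos(1.75 x) * exp(-x^2 / 2)."""
    return math.cos(1.75 * x) * math.exp(-x * x / 2.0)


def wavelet_coefficients(signal: List[float], scale: float = 1.0,
                         half_width: int = 4) -> List[float]:
    """Correlate the signal with a sampled Morlet kernel (zero padding at the edges)."""
    kernel = [morlet(k / scale) for k in range(-half_width, half_width + 1)]
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = i + j - half_width
            if 0 <= idx < n:
                acc += w * signal[idx]
        out.append(acc)
    return out


# Toy example: one row of pixel intensities with a single bright pixel.
row = [0.0, 0.0, 1.0, 0.0, 0.0]
coeffs = wavelet_coefficients(row, half_width=1)
```

For images, the same idea would be applied along rows and columns (or with a two-dimensional transform) before summarizing the coefficients into features.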
4 Results and discussion
4.1 Validation results of the proposed IWCA-APSO algorithm
To demonstrate the distinctiveness of the proposed IWCA-APSO hybrid algorithm, it has been compared with the existing WCA and APSO metaheuristic algorithms on three benchmark functions: Griewank's function, the Sphere function, and the Quartic function [25]. Table 3 presents the benchmark functions with their bound ranges and dimensions. The comparison results of validation are shown in Figs. 7, 8, and 9. Each of the three functions was optimized using the APSO, WCA, WCA + PSO, WCA + APSO, and IWCA + APSO optimization algorithms.

The validation of function F1 using the APSO, WCA, WCA + PSO, WCA + APSO, and IWCA + APSO algorithms is shown in Fig. 7. Figure 7 shows that the proposed IWCA-APSO required only around 60 iterations to reach convergence, whereas APSO, WCA, WCA + PSO, and WCA + APSO required about 600, 550, 450, and 120 iterations, respectively. For all benchmark function optimizations, 1000 iterations are used for simulation. The optimal values of functions F1, F2, and F3 are presented in Table 4. The validation of function F2 is shown in Fig. 8, and the validation of function F3 is shown in Fig. 9.

According to Fig. 9, the proposed IWCA-APSO required fewer than 80 iterations to reach convergence, whereas APSO, WCA, WCA + PSO, and WCA + APSO required more than 870, 350, 280, and 100 iterations, respectively. Furthermore, APSO, WCA, WCA + PSO, and WCA + APSO achieved optimal values of 0.8572, 0.6123, 0.5249, and 0.4913 for function F3, compared with 0.3778 for the proposed IWCA-APSO.

Table 4 shows the optimal values for the APSO, WCA, WCA + PSO, WCA + APSO, and IWCA + APSO algorithms. For all benchmark functions F1, F2, and F3, the proposed IWCA + APSO algorithm achieved lower optimal values than the APSO, WCA, WCA + PSO, and WCA + APSO optimization algorithms. The comparison of optimal values is shown in Fig. 10.

Table 3 Benchmark functions for the validation of the proposed IWCA-APSO algorithm

| Function | Name of the function | Details | Dimension | Bound regions |
|---|---|---|---|---|
| F1 | Griewank's function | $\sum_{i=1}^{d}\frac{x_i^{2}}{4000} - \prod_{i=1}^{d}\cos\left(\frac{x_i}{\sqrt{i}}\right) + 1$ | 30 | [−600, 600] |
| F2 | Sphere function | $\sum_{i=1}^{d} x_i^{2}$ | 30 | [−5.12, 5.12] |
| F3 | Quartic function | $\sum_{i=1}^{d} i\,x_i^{4}$ | 30 | [−1.28, 1.28] |
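For reference, the three benchmark functions of Table 3 can be written as short Python functions. This is a plain transcription of the standard definitions; the function names and the `BENCHMARKS` dictionary are illustrative choices, and the sketch does not reproduce any of the reported optimal values.

```python
import math
from typing import Sequence


def griewank(x: Sequence[float]) -> float:
    """F1: sum(x_i^2 / 4000) - prod(cos(x_i / sqrt(i))) + 1, with i starting at 1."""
    total = sum(v * v for v in x) / 4000.0
    product = math.prod(math.cos(v / math.sqrt(i)) for i, v in enumerate(x, start=1))
    return total - product + 1.0


def sphere(x: Sequence[float]) -> float:
    """F2: sum(x_i^2)."""
    return sum(v * v for v in x)


def quartic(x: Sequence[float]) -> float:
    """F3: sum(i * x_i^4), with i starting at 1."""
    return sum(i * v ** 4 for i, v in enumerate(x, start=1))


# Search bounds and dimension as listed in Table 3.
BENCHMARKS = {
    "F1": (griewank, (-600.0, 600.0), 30),
    "F2": (sphere, (-5.12, 5.12), 30),
    "F3": (quartic, (-1.28, 1.28), 30),
}
```

All three functions attain their global minimum of zero at the origin, which gives a reference point when comparing the best-fit curves of different optimizers.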
Fig. 7 Validation of function F1 using APSO, WCA, WCA + PSO, WCA + APSO and IWCA + APSO algorithms (best fit versus number of iterations, 0 to 1000)
Fig. 8 Validation of function F2 using APSO, WCA, WCA + PSO, WCA + APSO and IWCA + APSO algorithms (best fit versus number of iterations, 0 to 1000)
Fig. 9 Validation of function F3 using APSO, WCA, WCA + PSO, WCA + APSO and IWCA + APSO algorithms (best fit versus number of iterations, 0 to 1000)
Table 4 Benchmark functions optimal value (objective function optimal values for F1, F2, and F3)

| Optimization methods | No. of iterations | F1 | F2 | F3 |
|---|---|---|---|---|
| APSO | 1000 | 0.8006 | 0.8039 | 0.8572 |
| WCA | 1000 | 0.4562 | 0.4782 | 0.6123 |
| WCA-PSO | 1000 | 0.4221 | 0.3435 | 0.5249 |
| WCA + APSO | 1000 | 0.3392 | 0.3279 | 0.4913 |
| IWCA + APSO | 1000 | 0.2851 | 0.2794 | 0.3778 |
Fig. 10 Optimal value comparison results of functions F1–F3 using APSO, WCA, WCA + PSO, WCA + APSO and IWCA + APSO algorithms
4.2 Segmentation results
The segmentation results were obtained using the EnFCM, FLICM, NDFCM, FRFCM, and proposed FFI-FRFCM techniques. The segmentation accuracies are presented in Table 5, and Figs. 11, 12, 13, and 14 show the segmentation outputs.

Figure 11 shows that the EnFCM technique does not segment the tumor properly. Similarly, the FLICM technique has a lower noise reduction capability, with a segmentation accuracy of 95.65%; its segmentation result is presented in Fig. 12. Figure 13 shows that FRFCM achieves a better segmentation of the breast tumor, but the detected tumor is not up to the requirement. The proposed FFI-FRFCM segmentation, shown in Fig. 14, achieves 99.12% accuracy and shows good detection capability in terms of segmentation accuracy. For the proposed FFI-FRFCM segmentation, the PSNR is 37.85 dB and the SSIM is 0.9253. The higher PSNR and SSIM values for FFI-FRFCM indicate a better signal-to-noise ratio and better noise reduction.

Table 5 Quality measures and segmentation accuracy

| Algorithm | Accuracy in % | SSIM | PSNR (dB) |
|---|---|---|---|
| EnFCM | 93.22 | 0.7985 | 27.31 |
| FLICM | 95.65 | 0.8524 | 29.55 |
| NDFCM | 97.17 | 0.8847 | 31.18 |
| FRFCM | 98.43 | 0.9013 | 35.29 |
| FFI-FRFCM | 99.12 | 0.9253 | 37.85 |

Bold value represents the comparison values of SSIM and PSNR. Higher values of SSIM and PSNR show the better performance of FFI-FRFCM

Fig. 11 Segmentation result of breast tumor using EnFCM technique (panels: input image, tumor outline, detected tumor)
Fig. 12 Segmentation result of breast tumor using NDFCM technique (panels: input image, tumor outline, detected tumor)
Fig. 13 Segmentation result of breast tumor using FRFCM technique (panels: input image, tumor outline, detected tumor)
Fig. 14 Segmentation result of affected breast tissues using FFI-FRFCM technique (panels: input image, tumor outline, detected tumor)

4.3 Classification performance results

Classification performance measures are crucial for assessing machine learning models. The true positive ratio (TPR), true negative ratio (TNR), and accuracy are used to verify the classifier's performance. Sensitivity is also referred to as the TPR, and specificity as the TNR:

$$\mathrm{Sensitivity} = \mathrm{TPR} = \frac{TP}{TP + FN}$$

$$\mathrm{Specificity} = \mathrm{TNR} = \frac{TN}{TN + FP}$$

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$

where TP, TN, FP, and FN are the numbers of true positives, true negatives, false positives, and false negatives. The K-fold cross-validation technique ensures that each subsample is used for both training and evaluation, which helps prevent overfitting and lowers generalization error. The performance measures are presented in Table 6.

The MSE during classification is shown in Fig. 15. The proposed IWCA-APSO-based EELM model required only 25 iterations to reach convergence, in contrast to the WCA-EELM, WCA-PSO-EELM, and WCA-APSO-EELM models, which required 95, 75, and 45 iterations, respectively. According to our analysis, the proposed IWCA-APSO-based EELM model outperforms the WCA-EELM, WCA-PSO-EELM, and WCA-APSO-EELM models, achieving a classification accuracy of 99.36%. The proposed IWCA-APSO-based EELM model took 23.8751 s, less than the WCA-EELM, WCA-PSO-EELM, and WCA-APSO-EELM models, which took 98.2462 s, 65.1405 s, and 34.0192 s, respectively.
5 Conclusion
A hybrid IWCA-APSO algorithm-based EELM model was proposed
to classify cancerous and non-cancerous tissues from
mammogram images. The proposed IWCA-APSO algorithm optimized
the weights of the EELM model to enhance classification
performance. To assess the robustness of the hybrid
IWCA-APSO method, three benchmark functions (the Griewank,
Sphere, and Quartic functions) were considered for
optimization. The benchmark functions were optimized by the
APSO, WCA, WCA + PSO, WCA + APSO, and hybrid IWCA-APSO
algorithms, and the comparison results were presented. A
fuzzy factor improved FRFCM (FFI-FRFCM) segmentation was
proposed for detecting breast cancer-affected tissues in
the mammogram images. The Morlet wavelet transform was
employed for feature extraction from the segmented images.
Six features were extracted and fed as input to the
IWCA-APSO-based EELM model for the classification of breast
cancer. Compared with the WCA-EELM, WCA-PSO-EELM, and
WCA-APSO-EELM models, the proposed IWCA-APSO-based EELM
model yields better classification results, attaining
higher classification accuracy and lower computational
time. Although its computational time is close to that of
the WCA-APSO-EELM model, its classification accuracy is
better. The proposed model has shown good capability for
classifying breast cancer mammogram images into cancerous
and non-cancerous groups, and it can also be applied to
liver tumor and brain tumor image datasets.
Deep learning models are showing better results in
classification, but implementing them on an embedded
platform is difficult and not cost-effective because of
their memory requirements and their need for high-end
processors and peripheral hardware accessories. The future
scope of the research is the hybridization of the EELM
model with Dove Swarm Optimization (DSO), Moth-flame
optimization (MFO), and other optimization techniques, and
the implementation of the proposed model on an embedded
platform with an NVIDIA processor and associated hardware
for the detection and classification of breast cancer on
real-time images in hospitals.
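For readers who want to reproduce the benchmark comparison, the three test functions named in the conclusion can be sketched in their commonly used textbook forms. This is an assumption about the exact variants: in particular, the Quartic function is written here without the random noise term that some versions add, and the function names are illustrative. All three forms have a global minimum of 0 at the origin.

```python
import math


def sphere(x):
    """Sphere function: sum of squared coordinates."""
    return sum(xi ** 2 for xi in x)


def griewank(x):
    """Griewank function: 1 + sum(x_i^2)/4000 - prod(cos(x_i / sqrt(i)))."""
    sum_term = sum(xi ** 2 for xi in x) / 4000.0
    prod_term = math.prod(
        math.cos(xi / math.sqrt(i)) for i, xi in enumerate(x, start=1)
    )
    return 1.0 + sum_term - prod_term


def quartic(x):
    """Quartic function without noise (assumed variant): sum(i * x_i^4)."""
    return sum(i * xi ** 4 for i, xi in enumerate(x, start=1))


# Each function evaluates to zero at the origin, its known global minimum.
origin = [0.0] * 5
print(sphere(origin), griewank(origin), quartic(origin))
```

An optimizer such as the hybrid IWCA-APSO would be scored on how closely and how quickly its best candidate approaches this minimum.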
Data availability The INbreast dataset is used in this research. The
dataset can be made available upon request to the corresponding author.
Table 6 Performance evaluation of classifiers
Model             No. of iterations   Computational time (s)   Sensitivity (%)   Specificity (%)   Accuracy (%)
WCA-EELM          100                 98.2462                  96.54             97.24             97.23
WCA-PSO-EELM      100                 65.1405                  98.02             98.18             98.17
WCA-APSO-EELM     100                 34.0192                  98.34             97.71             98.75
IWCA-APSO-EELM    100                 23.8751                  99.67             99.71             99.36
Bold value represents the higher accuracy of the proposed IWCA-APSO-EELM model
Fig. 15 Optimization methods mean square error results with INbreast dataset
Int. j. inf. tecnol.
1 3
estimates of incidence and mortality worldwide for 36 cancers in
185 countries. CA Cancer J Clin 71:209–249. https:// doi. org/ 10.
3322/ caac. 21660
2. Tolessa L, Sendo EG, Dinegde NG, Desalew A (2021) Risk fac-
tors associated with breast cancer among women in Addis Ababa,
Ethiopia: unmatched case-control study. Int J Womens Health
18(13):101–110. https:// doi. org/ 10. 2147/ IJWH. S2925 88
3. World Cancer Research Fund Internationals (2020) https:// www.
wcrf. org/ cancer- trends/ world wide- cancer- data/
4. Gebretsadik A, Bogale N, Negera DG (2021) Epidemiological
trends of breast cancer in southern Ethiopia: a seven-year retro-
spective review. Cancer Control. https:// doi. org/ 10. 1177/ 10732
74821 10552 62
5. Priya S, Ashok E (2014) HMPFIM-B: hybrid Markov penalized
FCM in mammograms for breast cancer. Int J Recent Innov
Trends Comput Commun 2(10):3033–3037
6. Kamil MY, Salih AM (2019) Mammography images segmenta-
tion via fuzzy C-mean and K-mean. Int J Intell Eng Syst. https://
doi. org/ 10. 22266/ ijies 2019. 0228. 03
7. Chowdhary CL, Mittal MPK, Pattanaik PA, Marszalek Z (2020)
An efficient segmentation and classification system in medical
images using intuitionist possibilistic fuzzy C-mean clustering
and fuzzy SVM algorithm. Sensors 20:3903. https:// doi. org/ 10.
3390/ s2014 3903
8. Szilagyi L, Benyo Z, Szilagyii SM, Adam HS (2003) MR brain
image segmentation using an enhanced fuzzy c-means algo-
rithm. In: Proceeding of the 25th annual international confer-
ence of the IEEE EMBS, pp 17–21
9. Cai W, Chen S, Zhang D (2007) Fast and robust fuzzy c-means
clustering algorithms incorporating local information for image
segmentation. Pattern Recognit 40(3):825–838. https:// doi. org/
10. 1016/j. patcog. 2006. 07. 011
10. Krinidis S, Chatzis V (2010) A robust fuzzy local informa-
tion cmeans clustering algorithm. IEEE Trans Image Process
19(5):1328–1337. https:// doi. org/ 10. 1109/ tip. 2010. 20407
63AQ7
11. Gong M, Liang Y, Shi S, Ma J (2013) Fuzzy c-means clustering
with local information and kernel metric for image segmenta-
tion. IEEE Trans Image Process 22(2):573–584. https:// doi. org/
10. 1109/ TIP. 2012. 22195 47
12. Lei T, Jia X, Zhang Y, He L, Meng H, Nandi AK (2018) Sig-
nificantly fast and robust fuzzy c-means clustering algorithm
based on morphological reconstruction and membership filter-
ing. IEEE Trans Fuzzy Syst 26(5):3027–3041. https:// doi. org/
10. 1109/ tfuzz. 2018. 27960 74
13. Gorgel P, Sertbas A, Ucan ON (2013) Mammographical mass”
detection and classification using local seed region growing–
spherical wavelet transform (lsrg–swt) hybrid scheme. Comput
Biol Med 43(6):765–774
14. de Lima SM, da Silva-Filho AG, dos Santos WP (2016) Detection
and classification of masses in mammographic images in a multi-
kernel approach. Comput Methods Programs Biomed 134:11–29
15. Juneja K, Rana C (2020) An improved weighted decision tree
approach for breast cancer prediction. Int J Inf Technol 12:797–
804. https:// doi. org/ 10. 1007/ s41870- 018- 0184-2
16. Bhalerao PB, Bonde SV (2021) Cuckoo search based multi-
objective algorithm with decomposition for detection of masses
in mammogram images. Int J Inf Technol 13:2215–2226. https://
doi. org/ 10. 1007/ s41870- 021- 00805-9
17. Sharma A, Mishra PK (2022) Performance analysis of machine
learning based optimized feature selection approaches for breast
cancer diagnosis. Int J Inf Technol 14:1949–1960. https:// doi.
org/ 10. 1007/ s41870- 021- 00671-5
18. Kate V, Shukla P (2022) Breast tissue density classification
based on gravitational search algorithm and deep learning: a
novel approach. Int J Inf Technol 14:3481–3493. https:// doi.
org/ 10. 1007/ s41870- 022- 00930-z
19. Mishra AK, Roy P, Bandyopadhyay S etal (2022) Achieving
highly efficient breast ultrasound tumor classification with deep
convolutional neural networks. Int J Inf Technol 14:3311–3320.
https:// doi. org/ 10. 1007/ s41870- 022- 00901-4
20. Kumari LK, Jagadesh BN (2022) Classification of mammo-
grams using adaptive binary TLBO with ensemble classifier for
early detection of breast cancer. Int J Inf Technol 14:3579–3590.
https:// doi. org/ 10. 1007/ s41870- 022- 00998-7
21. Michaelson J, Satija S, Moore R, Weber G, Halpern E, Garland
A etal (2003) Estimates of the sizes at which breast cancers
become detectable on mammographic and clinical grounds. J
Womens Health 5(1):3–10. https:// doi. org/ 10. 1097/ 00130 747-
20030 2000- 00002
22. Sadollah A, Eskandar H, Bahreininejad A, Kim JH (2015) Water
cycle algorithm with evaporation rate for solving constrained
and unconstrained optimization problems. Appl Soft Comput
30:58–71
23. Mishra S, Gelmecha Demissie J, Singh Ram S, Singh RD,
Gopikrishna T (2021) Hybrid WCA–SCA and modified FRFCM
technique for enhancement and segmentation of brain tumor
from magnetic resonance images. Biomed Eng Appl Basis Com-
mun 33(3):2150017. https:// doi. org/ 10. 4015/ S1016 23722 15001
74
24. Mishra S, Nayak PK, Dash PK, Bisoi R (2016) Comparison
of modified TLBO based optimization and extreme learning
machine for classification of multiple power signal disturbances.
Neural Comput Appl 27(7):2107–2122
25. Mishra S, Sahu P, Senapati MR (2019) MASCA-PSO based
LLRBFNN model and improved fast and robust FCM algorithm
for detection and classification of brain tumor from MR image.
Evolut Intell. https:// doi. org/ 10. 1007/ s12065- 019- 00266-x
26. Tawseef AS, Ali R (2020) An intelligent healthcare system for
optimized breast cancer diagnosis using harmony search and
simulated annealing (HS-SA) algorithm. Inform Med Unlocked
21:100408. https:// doi. org/ 10. 1016/j. imu. 2020. 100408
27. Singh N, Veenadhari S (2020) Segmentation of fuzzy enhanced
mammogram mass images by using K-mean clustering and
region growing. Int J Adv Comput Sci Appl IJACSA. https://
doi. org/ 10. 14569/ IJACSA. 2020. 01105 46
28. Velmurugan T, Venkatesan E (2019) A hybrid multifarious
clustering algorithm for the analysis of memmogram images. J
Comput Commun 7:136–151. https:// doi. org/ 10. 4236/ jcc. 2019.
712013
29. Marwa H, Hamrouni K, Solaiman B, Boussetta S (2017) An
efficient method for breast mass segmentation and classification
in mammographic images. Int J Adv Comput Sci Appl IJACSA.
https:// doi. org/ 10. 14569/ IJACSA. 2017. 081134
30. Zheng Y, Baloch S, Englander S, Schnall MD, Shen D (2007)
Segmentation and classification of breast tumor using dynamic
contrast-enhanced MR images. Med Image Comput Comput
Assist Interv 10(Pt 2):393–401. https:// doi. org/ 10. 1007/ 978-3-
540- 75759-7_ 48
31. Hameed Z, Garcia-Zapirain B, Aguirre JJ etal (2022) Multiclass
classification of breast cancer histopathology images using mul-
tilevel features of deep convolutional neural network. Sci Rep
12:15600. https:// doi. org/ 10. 1038/ s41598- 022- 19278-2
32. Maqsood S, Damaševičius R, Maskeliūnas R (2022) TTCNN:
a breast cancer detection and classification towards computer-
aided diagnosis using digital mammography in early stages.
Appl Sci 12:3273. https:// doi. org/ 10. 3390/ app12 073273
33. Joseph AA, Abdullahi M, Junaidu SB, Ibrahim HH, Chiroma
H (2022) Improved multi-classification of breast cancer histo-
pathological images using handcrafted features and deep neural
Int. j. inf. tecnol.
1 3
network (dense layer). Intell Syst Appl 14:200066. https:// doi.
org/ 10. 1016/j. iswa. 2022. 200066
34. Jabeen K, Khan MA, Alhaisoni M, Tariq U, Zhang Y-D, Hamza
A, Mickus A, Damaševičius R (2022) Breast cancer classifica-
tion from ultrasound images using probability-based optimal
deep learning feature fusion. Sensors 22:807. https:// doi. org/
10. 3390/ s2203 0807
35. Ramesh S, Sasikala S, Gomathi S etal (2022) Segmentation
and classification of breast cancer using novel deep learning
architecture. Neural Comput Appl 34:16533–16545. https:// doi.
org/ 10. 1007/ s00521- 022- 07230-4
36. Khozama S, Mayya AM (2022) A new range-based breast can-
cer prediction model using the Bayes’ theorem and ensemble
learning. Inf Technol Control 51(4):757–770. https:// doi. org/
10. 5755/ j01. itc. 51.4. 31347
37. Lian Z, Duan L, Qiao Y, Chen J, Miao J, Li M (2021) The
improved ELM algorithms optimized by bionic WOA for
EEG classification of brain computer interface. IEEE Access
9:67405–67416. https:// doi. org/ 10. 1109/ ACCESS. 2021. 30763
47
... Computers and digital technologies are especially apparent in the field of histopathology for microscopic analysis of tissue samples to provide diagnostic insights. Researchers have focused on delivering diagnostic services using stateof-the-art AI solutions [1][2][3]. But these technologies are notoriously data-hungry. ...
Article
Full-text available
This paper presents an autoencoder-based neural network architecture to compress histopathological images while retaining the denser and more meaningful representation of the original images. Current research into improving compression algorithms is focused on methods allowing lower compression rates for Regions of Interest (ROI-based approaches). Neural networks are great at extracting meaningful semantic representations from images and, therefore can select the regions to be considered of interest for the compression process. In this work, we focus on the compression of whole slide histopathology images. The objective is to build an ensemble of neural networks that enables a compressive autoencoder in a supervised fashion to retain a denser and more meaningful representation of the input histology images. Our proposed system is a simple and novel method to supervise compressive neural networks. We test the compressed images using transfer learning-based classifiers and show that they provide promising accuracy and classification performance.
... Moreover, the world is experiencing considerable technological advances in all sectors thanks to artificial intelligence (AI). Machine learning (ML), considered the primary means to achieve AI, provides modelling rules to a computer system to gain information from data without explicit human programming [4,5]. ML has been increasingly used for data analyses and for gaining additional knowledge from data (e.g., prediction of outcomes). ...
Article
Full-text available
Blood transfusion is a medical procedure that involves transfusing blood or one of its components from one or more donors into a patient. Digital technology and machine learning have played a crucial role in the blood field and have provided real prospects for the production and distribution of blood products. In this study, we propose supervised machine learning techniques for the multi-label classification of blood products in patients with hematologic diseases. We used three multi-label approaches from the problem transformation category to create a decision support system for blood products: Label Power Set (LP), Binary Relevance (BR), and Classifier Chain (CC). Multi-label classification using the Problem Transformation approach is a flexible approach. In this study, we used data from different hospitals in hematology departments and blood transfusion centers to explore the application of contemporary supervised learning algorithms in blood product prediction. The experiment was performed by calculating the Hamming loss and accuracy to facilitate the classification of blood products. As a result, the prediction model achieved an area under the ROC curve of 99.80%, a Hamming loss of 0.30, and an accuracy of 98.8%. The proposed model has been developed to provide accurate and fast results that can save patients’ lives.
Article
Identifying a small subset of informative genes from a gene expression dataset is vital in sample classification. In this process, there are two objectives: (i) to minimise the number of selected genes and (ii) to maximise the classification accuracy. This paper proposes a Greedy and Mutation based Archived Multi-Objective Simulated Annealing Algorithm (GMAMOSA) to solve this problem. The proposed GMAMOSA is obtained by incorporating two greedy-based and one mutation-based perturbation strategies in AMOSA. These strategies are used to maintain an appropriate balance between the exploitation and exploration of the search process. In preprocessing, the Fisher method filters out the noisy genes from the dataset. Then, the proposed GMAMOSA, K-Nearest Neighbour (KNN), and Leave-One-Out Cross-Validation (LOOCV) have been applied to find a small set of informative genes that maximise classification accuracy. To conduct a comprehensive performance study, GMAMOSA has been used in 11 benchmark gene expression datasets, where in 8 datasets, it has achieved 100% classification accuracy considering the best case. In 5 datasets among 8, 100% classification accuracy is achieved considering the average case. It is compared with the state-of-the-art methods in appropriate datasets, where it has outperformed most of them in classification accuracy.
Article
Social media platforms typically serve as generators of huge data sources as users express their sentiments directly or indirectly on these platforms. With increased sentiment data on social media platforms, there is an immediate requirement to design intelligent systems with early risk detection (ERD)capabilities. Early detection in case of mental health disorders, especially in case of depression detection could provide for better information, identification, and utilization of treatments, along with future risk reduction planning. Machine learning based early risk detection systems typically utilize the social media sentiments to correctly classify the potential depression cases to initiate an early diagnosis, detection, and recovery process. The authors, henceforth, present a unique ensemble-based machine learning classifier with a mix of logistic regression, decision tree, random forest, support vector machine, multi- layer perceptron along with adaptive and gradient boosting resulting in improved performance on evaluation metrices indicators than past research.
Article
Binary Imaging Reporting and Data System (BIRADS) categorizes the mammogram masses using their shape, size, and density for the earlier detection of breast malignancy to reduce the mortality rate. For the efficient earlier detection, the proposed work develops a novel CNN with customized filters including High Frequency Boost Filter (HFBF) for extracting the complex features including sharp edges, sharp shapes and sharp patterns. The CNN with the customized filters is applied on the Discrete Wavelet Transform (DWT) sub-bands of Mammographic Image Analysis Society (MIAS) and Digital Database for Screening Mammography (DDSM) databases images with the fine-tuned hyperparameters. As a result, High-High (HH) sub-band images provide the highest accuracy of 97.28% and 97.94% on MIAS and DDSM databases respectively with stable measures of precision 0.96, recall 0.97 and F1-Score 0.97 on MIAS and precision 0.97, recall 0.98 and F1-Score 0.98 on DDSM when compared to the other sub-bands and original images of the databases. Likewise, the results obtained by the proposed model compared with other existing pre-defined models. The novel contribution of the proposed work is CNN architecture with customized filters for the efficient and complex feature extraction on the sub-bands of DWT. In addition, HH sub-band is considered for high frequency components of microcalcification rather neglating as noise (unlike other works).
Article
Lung cancer is consistently ranked as the primary cause of cancer-related fatalities worldwide. The timely identification and effective treatment of lung cancer play a pivotal role in patient survival rates. Generally, higher rates of lung cancer mortality have been observed in men compared to women, largely attributable to smoking levels. This article proposes a new hybrid approach to lung cancer detection using the Computed Tomography (CT) scan images. Our objective is two folds: first, the development of a robust and accurate segmentation approach based on the Active Shape Model (ASM), and second, the implementation of a fully automatic lung cancer detection system employing the Deep Neural Networks (DNN). Given the diverse nature of cancer growth within the lung, it can appear in any location, showing a wide range of shapes, sizes, and contrasts. The proposed approach thus lays the foundation for precise segmentation, enabling a comprehensive understanding of the structural nuances. The experimental evaluation shows that the proposed approach achieves good precision and accuracy and can help practitioners as an enhanced tool for fast and reliable cancer detection.
Article
Full-text available
Breast cancer prediction is essential for preventing and treating cancer. In this research, a novel breast cancer prediction model is introduced. In addition, this research aims to provide a range-based cancer score instead of binary classification results (yes or no). The Breast Cancer Surveillance Consortium dataset (BCSC) dataset is used and modified by applying a proposed probabilistic model to achieve the range-based cancer score. The suggested model analyses a sub dataset of the whole BCSC dataset, including 67632 records and 13 risk factors. Three types of statistics are acquired (general cancer and non-cancer probabilities, previous medical knowledge, and the likelihood of each risk factor given all prediction classes). The model also uses the weighting methodology to achieve the best fusion of the BCSC's risk factors. The computation of the final prediction score is done using the post probability of the weighted combination of risk factors and the three statistics acquired from the probabilistic model. This final prediction is added to the BCSC dataset, and the new version of the BCSC dataset is used to train an ensemble model consisting of 30 learners. The experiments are applied using the sub and the whole datasets (including 317880 medical records). The results indicate that the new range-based model is accurate and robust with an accuracy of 91.33%, a false rejection rate of 1.12%, and an AUC of 0.9795. The new version of the BCSC dataset can be used for further research and analysis.
Article
Full-text available
Breast cancer is a common malignancy and a leading cause of cancer-related deaths in women worldwide. Its early diagnosis can significantly reduce the morbidity and mortality rates in women. To this end, histopathological diagnosis is usually followed as the gold standard approach. However, this process is tedious, labor-intensive, and may be subject to inter-reader variability. Accordingly, an automatic diagnostic system can assist to improve the quality of diagnosis. This paper presents a deep learning approach to automatically classify hematoxylin-eosin-stained breast cancer microscopy images into normal tissue, benign lesion, in situ carcinoma, and invasive carcinoma using our collected dataset. Our proposed model exploited six intermediate layers of the Xception (Extreme Inception) network to retrieve robust and abstract features from input images. First, we optimized the proposed model on the original (unnormalized) dataset using 5-fold cross-validation. Then, we investigated its performance on four normalized datasets resulting from Reinhard, Ruifrok, Macenko, and Vahadane stain normalization. For original images, our proposed framework yielded an accuracy of 98% along with a kappa score of 0.969. Also, it achieved an average AUC-ROC score of 0.998 as well as a mean AUC-PR value of 0.995. Specifically, for in situ carcinoma and invasive carcinoma, it offered sensitivity of 96% and 99%, respectively. For normalized images, the proposed architecture performed better for Makenko normalization compared to the other three techniques. In this case, the proposed model achieved an accuracy of 97.79% together with a kappa score of 0.965. Also, it attained an average AUC-ROC score of 0.997 and a mean AUC-PR value of 0.991. Especially, for in situ carcinoma and invasive carcinoma, it offered sensitivity of 96% and 99%, respectively. 
These results demonstrate that our proposed model outperformed the baseline AlexNet as well as state-of-the-art VGG16, VGG19, Inception-v3, and Xception models with their default settings. Furthermore, it can be inferred that although stain normalization techniques offered competitive performance, they could not surpass the results of the original dataset.
Article
Full-text available
Breast cancer is one of the most frequent cancers in women, and it has a higher mortality rate than other cancers. As a result, early detection is critical. In computer-assisted disease diagnosis, accurate segmentation of the region of interest is a vital concept. The segmentation techniques have been widely used by doctors and physicians to locate the pathology, identify the abnormality, compute the tissue volume, analyze the anatomical structures, and provide treatment. Cancer diagnostic efficiency is based on two aspects: The precision value associated with the segmentation and calculation of the tumor area and the accuracy of the features extracted from the images to categorize the benign or malignant tumors. A novel deep-learning architecture for tumor segmentation is therefore proposed in this study, and machine learning algorithms are used to categorize benign or malignant tumors. The segmentation results improve the decision-making capability of the physicians to identify whether a tumor is malignant or not and normally, the machine learning techniques need expert annotation and pathology reports to identify this. This challenge is overcome in this work with the help of the GoogLeNet architecture used for segmentation. The segmentation results are then offered to the Support Vector Mchine, Decision Tree, Random Forest, and Naïve Bayes classifier to improve their efficiency. Our work has provided better results in terms of accuracy, Jaccard and dice coefficient, sensitivity, and specificity compared to conventional architectures. The proposed model offers an accuracy score of 99.12% which is relatively higher than the other techniques. A 3.78% accuracy improvement is noticed by the proposed model against the AlexNet classifier and the actual increase is 4.61% on average when compared to the existing techniques.
Article
Full-text available
Breast cancer is a major research area in the medical image analysis field; it is a dangerous disease and a major cause of death among women. Early and accurate diagnosis of breast cancer based on digital mammograms can enhance disease detection accuracy. Medical imagery must be detected, segmented, and classified for computer-aided diagnosis (CAD) systems to help the radiologists for accurate diagnosis of breast lesions. Therefore, an accurate breast cancer detection and classification approach is proposed for screening of mammograms. In this paper, we present a deep learning system that can identify breast cancer in mammogram screening images using an “end-to-end” training strategy that efficiently uses mammography images for computer-aided breast cancer recognition in the early stages. First, the proposed approach implements the modified contrast enhancement method in order to refine the detail of edges from the source mammogram images. Next, the transferable texture convolutional neural network (TTCNN) is presented to enhance the performance of classification and the energy layer is integrated in this work to extract the texture features from the convolutional layer. The proposed approach consists of only three layers of convolution and one energy layer, rather than the pooling layer. In the third stage, we analyzed the performance of TTCNN based on deep features of convolutional neural network models (InceptionResNet-V2, Inception-V3, VGG-16, VGG-19, GoogLeNet, ResNet 18, ResNet-50, and ResNet-101). The deep features are extracted by determining the best layers which enhance the classification accuracy. In the fourth stage, by using the convolutional sparse image decomposition approach, all the extracted feature vectors are fused and, finally, the best features are selected by using the entropy controlled firefly method. The proposed approach employed on DDSM, INbreast, and MIAS datasets and attained the average accuracy of 97.49%. 
Our proposed transferable texture CNN-based method for classifying screening mammograms has outperformed prior methods. These findings demonstrate that automatic deep learning algorithms can be easily trained to achieve high accuracy in diverse mammography images, and can offer great potential to improve clinical tools to minimize false positive and false negative screening mammography results.
Article
Full-text available
Breast cancer (BC) classification has become a point of concern within the field of biomedical informatics in the health care sector in recent years. This is because it is the second-largest cause of cancer-related fatalities among women. The medical field has attracted the attention of researchers in applying machine learning techniques to the detection, and monitoring of life-threatening diseases such as breast cancer (BC). Proper detection and monitoring contribute immensely to the survival of BC patients, which is largely dependent on the analysis of pathological images. Automatic detection of BC based on pathological images and the use of a Computer-Aided Diagnosis (CAD) system allow doctors to make a more reliable decision. Recently, Deep Learning algorithms like Convolution Neural Network have been proven to be reliable in detecting BC targets from pathological images. Several research efforts have been undertaken in the binary classification of histopathological images. However, few approaches have been proposed for the multi-classification of histopathological images. The classification accuracy produced by these approaches are inefficient since they considered only texture-based extracted features and they used some techniques that cannot extract some of the main features from the images. Also, these techniques still suffered from the issue of overfitting. In this work, handcrafted feature extraction techniques (Hu moment, Haralick textures, and colour histogram) and Deep Neural Network (DNN) are employed for breast cancer multi-classification using histopathological images on the BreakHis dataset. The features extracted using the handcrafted techniques are used to train the DNN classifiers with four dense layers and Softmax. Further, the data augmentation method was employed to address the issue of overfitting. 
The results obtained reveal that the use of handcrafted approach as feature extractors and DNN classifiers had a better performance in breast cancer multi-classification than other approaches in the literature. Moreover, it was also noted that augmentation of data plays a key role in further improvement of classification accuracy. The proposed method achieved an accuracy score of 97.87% for 40x, 97.60% for 100x, 96.10% for 200x, and 96.84% for 400x for the magnification-dependent histopathological images classification. The results also showed that the proposed method for using the Handcrafted feature extraction method with DNN classifier had a better performance in multi-classification of breast cancer using histopathological images than most of the related works in the literature.
Article
Full-text available
After lung cancer, breast cancer is the second leading cause of death in women. If breast cancer is detected early, mortality rates in women can be reduced. Because manual breast cancer diagnosis takes a long time, an automated system is required for early cancer detection. This paper proposes a new framework for breast cancer classification from ultrasound images that employs deep learning and the fusion of the best selected features. The proposed framework is divided into five major steps: (i) data augmentation is performed to increase the size of the original dataset for better learning of Convolutional Neural Network (CNN) models; (ii) a pre-trained DarkNet-53 model is considered and the output layer is modified based on the augmented dataset classes; (iii) the modified model is trained using transfer learning and features are extracted from the global average pooling layer; (iv) the best features are selected using two improved optimization algorithms known as reformed differential evaluation (RDE) and reformed gray wolf (RGW); and (v) the best selected features are fused using a new probability-based serial approach and classified using machine learning algorithms. The experiment was conducted on an augmented Breast Ul-trasound Images (BUSI) dataset, and the best accuracy was 99.1%. When compared with recent techniques, the proposed framework outperforms them.
Article
Full-text available
Introduction African women are affected by cancer at an early age of their productivity. However, the exact prevalence and incidence of cancer, including breast cancer is not known in most sub-Saharan African countries, including Ethiopia because of lack of well-established cancer registry. This study aims to assess the epidemiology of breast cancer at Hawassa University Comprehensive Specialized Hospital (HUCSH), the biggest referral hospital with cancer treatment center serving the southern part of the country. Methods Retrospective review of charts of all patients with a diagnosis of breast cancer between 2013 and 2019 at HUCSH was conducted. A standardized questionnaire was used to collect relevant data that include sociodemographic, symptoms, type of diagnosis, treatment, and outcomes. Data were entered using epidata version 3.1 and analyzed using MS Excel and SPSS version 20. Results Five hundred fifty-nine (18.6%) breast cancer cases were retrieved in 7 years between 2013 and 2019. Of this, 548 (98%) were women. The median ages of the patents were 38 years. Invasive ductal carcinoma was the leading 309 (55.3%) histologic type followed by 185 (33.1%) lobular carcinoma. One hundred seventy-seven (31.7%) were moderately differentiated and 155 (27.7%) were poorly differentiated. Three hundred seventy-two (66.5%) were advanced breast cancer (Stages III and IV). Trends of breast cancer showed the case load is continuously increasing except with a slight reduction of cases in between 2015 and 2016. The majority were advanced breast cancer occurring at an early age by the time diagnosis made. Invasive ductal carcinomas were the predominant one. The trend also showed a continuous increment of cancer case load. Therefore, cancer registration center establishment, community awareness creation, and intensive early detection strategy are mandatory.
Article
Breast cancer is the second most common serious health problem identified across the world. Early detection of breast cancer is critical to saving lives. This can be achieved with mammogram imaging, which is competent, precise, and has fewer side effects. The main objective of our research is to develop a hybrid classifier that integrates eXtreme Gradient Boost (XGBoost) with Random Forest (XGBoost-RF) for the classification of mammograms. The experiments use two openly available datasets, namely Mammographic Image Analysis Society (MIAS) and Digital Database for Screening Mammography (DDSM), and performance is evaluated using k-fold (k = 10) cross validation. The combined XGBoost-RF technique is designed to increase classifier performance. The classification accuracies achieved for the MIAS and DDSM datasets are 98.6% and 94.3% respectively. The experimental outcome revealed that the proposed classifier performs better than the state-of-the-art methods, which assists physicians.
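The k-fold (k = 10) evaluation mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the function name `k_fold_indices`, the shuffling seed, and the interleaved fold assignment are assumptions made only for illustration, and no classifier is trained here.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, test) index lists for k-fold cross validation.

    Hypothetical helper: shuffles the sample indices once, deals them
    into k folds, and uses each fold exactly once as the test set.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index starting at position i.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

# Example: 25 samples split into 10 folds of size 2 or 3.
splits = list(k_fold_indices(25, k=10))
```

Each sample appears in exactly one test fold, so averaging the per-fold accuracy gives an estimate that uses every image for testing once.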
Article
This work presents the automatic classification of mammographic breast tissue density, which plays a crucial role in morphological analysis for abnormality detection. The proposed work consists of pre-processing, mammogram enhancement, and classification. Image pre-processing is used to extract the foreground image from the original image. Image extraction is a well-known optimization problem. To address it, the gravitational search algorithm is adopted with Kapur's entropy as the fitness function. Afterwards, mammogram enhancement and noise removal are performed using unsharp masking and anisotropic filtering. Finally, VGG16 and InceptionV3 are used to train the convolutional neural network (CNN) classification model using deep transfer learning. For a four-class classification of breast tissue density, the suggested method is evaluated on the Digital Database for Screening Mammography (DDSM) dataset. It obtains a classification accuracy of 97.98% for the InceptionV3-based model and 91.92% for the VGG16-based model on the DDSM dataset.
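As a rough illustration of Kapur's entropy used as a fitness function, the sketch below scores a single gray-level threshold by the sum of the entropies of the two classes it creates. It is an assumption-laden example, not the cited method: the abstract searches with a gravitational search algorithm, whereas this sketch swaps in a plain exhaustive search, and the names `kapur_entropy` and `best_threshold` are hypothetical.

```python
import math

def kapur_entropy(hist, t):
    """Kapur's entropy of a histogram split at threshold t.

    Class 0 holds gray levels [0, t) and class 1 holds [t, len(hist)).
    Returns -inf when either class has no pixels.
    """
    total = sum(hist)
    if total == 0:
        return float("-inf")
    probs = [h / total for h in hist]
    score = 0.0
    for part in (probs[:t], probs[t:]):
        w = sum(part)  # probability mass of this class
        if w == 0:
            return float("-inf")  # an empty class is not a valid split
        # Shannon entropy of the class-normalized distribution.
        score -= sum((p / w) * math.log(p / w) for p in part if p > 0)
    return score

def best_threshold(hist):
    """Exhaustive search (a stand-in for the metaheuristic optimizer)."""
    return max(range(1, len(hist)), key=lambda t: kapur_entropy(hist, t))

# Toy 6-level histogram with a dark cluster and a bright cluster.
toy_hist = [10, 10, 0, 0, 10, 10]
threshold = best_threshold(toy_hist)
```

For a 256-level image the exhaustive search is cheap; a metaheuristic becomes attractive mainly when several thresholds are optimized jointly.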
Article
Ultrasound imaging is one of the common modalities used nowadays during radiological screening of breast cancer. A novel residual deep convolutional neural network (DCNN) is proposed in this work to perform automatic benign vs. malignant classification of breast ultrasound (BUS) images. The key ideas presented in this work are larger residual blocks, utility assessment of tumor localization, and fine-tuning of optimizer settings. The primary focus of this work is on finding an optimal residual DCNN-based setup for BUS-based cancer detection. This is achieved by exploring various sizes of residual blocks and several different optimizer settings. From experimental analysis, larger residual blocks are found to be more suitable for the BUS classification-based cancer detection task. Moreover, strong classification performance is obtained with the proposed approach after fine-tuning the optimizer settings, achieving AUC, accuracy, and F1-scores of 0.9906, 0.9624, and 0.9725, respectively. Stochastic Gradient Descent (SGD) and AdaGrad-based optimizers achieve the best classification performance when paired with smaller learning rates. These results suggest the potential for real-time applicability of the proposed approach.