A Deep Learning Approach to Detect and Classification of Lung Cancer

Mst. Farhana Khatun
Dept. of CSE
Daffodil International University
1341, Dhaka, Bangladesh
farhana15-14304@diu.edu.bd
Moshfiqur Rahman Ajmain
Dept. of CSE
Daffodil International University
1341, Dhaka, Bangladesh
moshfiqur15-14090@diu.edu.bd
Md. Assaduzzaman
Dept. of CSE
Daffodil International University
1341, Dhaka, Bangladesh
assaduzzaman.cse@diu.edu.bd
Abstract—Cancer is feared throughout the world. Every year millions
of people worldwide die of cancer, including lung cancer. This
research classifies lung cancer from histopathology images.
Non-small cell lung cancer (NSCLC) is the most common of the two
main types of lung cancer. We classify images into two NSCLC
subtypes, adenocarcinoma and squamous cell carcinoma, and
non-cancerous benign tissue. Four CNN models are used for
classification: VGG19, ResNet50, EfficientNetB7, and MobileNetV2.
We used 15,000 images; the Augmentor package was used to expand the
dataset to 15,000 images from 250 benign lung tissue images, 250
lung adenocarcinoma images, and 250 lung squamous cell carcinoma
images. Among the evaluated models, ResNet50 achieved the best
accuracy, 98%. Putting this model into practice could help medical
experts build an accurate, automatic method for diagnosing
different forms of lung cancer.
Index Terms—Lung Cancer, ResNet50, Deep Learning
I. INTRODUCTION
The lungs are among the most important organs of the body because
they enable us to breathe. When cells in the body keep growing
uncontrollably because tumor-suppressor genes are inactive, lumps
appear in different parts of the body. Such a lump is called a
tumor, and it can be benign or malignant. Malignant tumors are
known as cancer: neoplastic cells with a high rate of
aggressiveness and the ability to metastasize, that is, to spread
elsewhere in the body. Symptoms of lung cancer include difficulty
breathing, coughing up blood, weight loss, chest pain, and vomiting
blood, while air pollution and genetic factors are among the risk
factors. Chemicals such as arsenic, nickel, and silica also
increase the risk. Lung cancer is classified into three
categories: non-small cell type, small cell type, and carcinoid.
Of these, the small cell type is the worst, growing and spreading
throughout the body very quickly. Small particles or fibers of
inorganic materials such as asbestos, nickel, and chromium, and
organic materials such as benzene and benzopyrene, enter the lungs
with the air and cause lung cancer through constant irritation of
the lungs. About 85% of lung cancer cases are attributed to
tobacco use; the remaining 10%-15% occur in people who have never
smoked or used tobacco products. Lung cancer is the greatest cause
of cancer illness and fatality in males around the world, and the
second biggest cause of cancer death in women. Every year, 1.3
million people worldwide die from lung cancer. According to
GLOBOCAN 2020 data, the risk of this cancer increases with age,
reaching 1 in 101 people in the population up to the age of 74
years: 1 in 68 in men and 1 in 201 in women. Apart from this, the
risk of lung cancer also increases with diseases such as
silicosis, interstitial lung disease, cystic fibrosis, and chronic
bronchitis. The presence of radon gas in the air and exposure to
unwanted radioactivity are also significant causes of lung cancer.
The purpose of this study was to diagnose lung cancer accurately
and efficiently. To that end, we implemented four deep learning
models: ResNet50, VGG19, EfficientNetB7, and MobileNetV2. The
highest accuracies were obtained with ResNet50 and VGG19.
The remainder of the paper is structured as follows. Section II
outlines earlier research on lung cancer detection. Section III
describes the dataset. Section IV presents the four models and the
measures used to evaluate them. Section V discusses the results.
Section VI concludes with directions for future work.
II. LITERATURE REVIEW
Bhatia et al. [1] paired deep learning models with tree-based
classifiers such as XGBoost and Random Forest. An ensemble of
UNet+RandomForest and ResNet+XGBoost reached an accuracy of 84%,
whereas the two pipelines individually reached 74% and 76%,
respectively.
Xu et al. [2] analyzed time-series CT images of patients with
advanced non-small cell lung cancer. They applied a CNN combined
with an RNN, using single seed-point localization, to
pre-treatment and post-treatment scans. They showed that deep
learning predicts survival and cancer-specific outcomes and that
model performance improved with each additional follow-up scan.
Riquelme et al. [3] aimed to detect malignant lung nodules in
computed tomography scans using computer-aided diagnosis (CAD)
systems. They divided deep learning-based CAD architectures into
two parts and analyzed the performance of each.
Ibrahim et al. [4] used multi-class classification to diagnose
pneumonia, COVID-19, and lung cancer from chest X-ray and CT
images. The VGG19+CNN model gave better results than the other
models, with an accuracy of 98.05%.
2023 International Conference for Advancement in Technology (ICONAT)
Goa, India. Jan 24-26, 2023
978-1-6654-7517-4/23/$31.00 ©2023 IEEE
Asuntha et al. [5] aimed to detect cancerous lung nodules and to
classify lung cancer and its stage. They used feature extraction
techniques such as Histogram of Oriented Gradients (HoG), Local
Binary Pattern (LBP), wavelet transform-based features, Zernike
moments, and Scale Invariant Feature Transform (SIFT). After
extracting geometric, volumetric, and intensity features, they
applied the Fuzzy Particle Swarm Optimization (FPSO) algorithm and
obtained good results with the resulting FPSOCNN.
Kriegsmann et al. [6] studied the potential and limitations of
different CNN models for image classification. They used CNNs to
detect four subtypes of cancer, and an optimized InceptionV3
architecture was the most accurate.
Cong et al. [7] noted that recognizing malignant cells and their
features is crucial for the diagnosis and treatment of primary or
metastatic malignancies. They discussed applications of deep
learning in lung cancer research, along with its progress, future
prospects, and open problems.
Machine learning techniques are used in biomedical applications to
predict and classify different kinds of signals and images. Deep
learning (DL) techniques have made it possible for machines to
handle large-scale datasets such as multidimensional anatomical
images and videos. Deep learning is a branch of machine learning
that builds artificial neural networks modelled after the
structure and operation of the human brain [8]. The bulk of
earlier studies used DL to classify images of colon and lung
cancer together. Others focused on lung cancer detection, while
some authors were more concerned with colon cancer.
Masud et al. [9] used a deep learning-based framework to classify
lung and colon histology images. They applied two types of domain
transformation to create four feature sets for image
classification and combined the features from both to reach the
final classification result. Their accuracy was 96.33%.
Mangal et al. [10] classified colon and lung tumors from histology
images using a shallow neural network design. They achieved
classification accuracies of 97% and 96% for lung and colon
cancers, respectively.
Hatuwal et al. [11] proposed a CNN-based deep learning approach
that uses only the lung tissue samples from the dataset. The
method distinguishes between one benign and two malignant lung
tissue classes, and no classification of colon cancer was
provided. Their model for lung tissue achieved 97.20% accuracy,
97.33% recall, and 97.33% precision.
Sarwinda et al. [12] proposed a K-Nearest Neighbor classifier
operating on features extracted from colon cancer tissue images
by a pretrained DenseNet121 network. Their method separates
benign from cancerous colon tissue, with a recall of 98.63% and
an accuracy of 98.53% for colon classification. However, their
model did not cover lung tissue, and they reported no results for
lung cancer classification.
According to Kumar et al. [13], DenseNet-121 captures more
meaningful features than other pretrained convolutional neural
networks, owing to the network's use of short connections between
layers to increase its accuracy and efficiency.
Wang et al. [14] developed a Python library to identify tumor
image tiles. Their approach combines a pretrained CNN model with a
one-class SVM. The overall accuracy of the Support Vector Machine
model was 94%.
Chehade et al. [15] distinguished between subtypes of lung and
colon cancer. In terms of accuracy, recall, and precision, the
XGBoost model had the highest classification rate for colon and
lung cancer, with an F1 score of 98.8% and an accuracy of 99%.
Hlavcheva et al. [16] applied deep learning methods, specifically
convolutional neural networks, to assess medical images. They used
their dataset to evaluate the classification accuracy of various
CNN designs. Using statistical techniques and neural network
theory, they attained an accuracy of 94.6%.
Sirinukunwattana et al. [17] suggested a spatially constrained
neural network for detecting nuclei in colon cancer histopathology
images. Cell nuclei were classified using a novel neighboring
ensemble predictor. The highest accuracy was 97.1%. Despite the
positive results, the model's computational efficiency fell short:
a single run often took fifty minutes.
Sun et al. [18] created a machine learning algorithm to determine
whether a lung nodule was malignant. Features of lung nodule CT
images, which have been used in numerous studies, were extracted
using max-pooling, and the feature maps were reduced with a
pooling method. The method is unusual in that it works on the CT
scan images directly, without any separate segmentation or
hand-crafted feature extraction. Three deep structured algorithms
were used to extract information automatically from CT images of
lung nodules: a stacked denoising autoencoder (SDAE), a deep
belief network (DBN), and a CNN. They reported an accuracy of
87.14 percent for identifying lung nodules, and the best result,
89%, was obtained with the CNN.
To identify lung cancer, Selvanambi et al. [19] used RNN
(Recurrent Neural Network) with the DLS (Damped Least-
Squares) method, and the GSO (Glowworm Swarm Optimiza-
tion) technique, with an accuracy of 98%.
Filho et al. [20] used image segmentation to preprocess CT images
of lung nodules. To distinguish between benign and malignant
tumors, they combined the index of basic taxic weights with the
index of standard taxic weights and classified the resulting
patterns with a CNN, reaching 92.6 percent accuracy.
Masood et al. [21] introduced the DFCNet model, built on a deep
CNN, which classifies CT scan images of pulmonary nodules into the
four stages of lung cancer with an accuracy of 84.5 percent. Using
multiple datasets with different scan parameters produced
false-positive results for malignant tumors, whereas using the
same scan parameters across all datasets yielded the best
classification results. Polyps, in turn, can be found in
colonoscopy videos.
Mo et al. [22] achieved 98.5% average accuracy using a Faster
R-CNN (Region-Based Convolutional Neural Network) evaluated on
four different datasets.
Urban et al. [23] built and deployed deep CNNs for the detection
of polyps in expertly annotated images collected from more than
2000 colonoscopies. Their model achieved 96.4% accuracy. The model
proposed in [24] classified colonoscopy frames with an accuracy of
90.28%; it applied binarized weights in convolutional neural
networks to minimize the network size.
In [25], wolf heuristic features with the least redundancy were
selected to reduce dimensionality and complexity. Diseased and
normal lung structures were distinguished with an accuracy of
98.42% using an ensemble learning generalized neural network
optimized with discrete AdaBoost.
Suresh and Mohan [26] suggested an eight-layer CNN architecture
for classifying a CT image of a lung lesion into one of three
categories. Segmentation, carried out in consultation with
specialists, was used to extract the nodule regions of interest
from the images. In addition, the dataset was expanded using
generative adversarial networks. Their model detected pulmonary
nodules with a classification accuracy of 93.9%.
Masud et al. [27] proposed a simple deep learning approach whose
CNN architecture has only four convolutional layers. Each
convolutional layer is made up of two consecutive convolutional
blocks, a connector convolutional block, non-linear activation
functions, and a pooling block. Because the model has fewer FLOPs
and parameters than state-of-the-art CNN architectures, the
authors concluded that it is suitable for real-time CT image
interpretation. The accuracy of the model was 97.9%.
Shakeel et al. [28] preprocessed lung cancer CT scans with a
method that preserves image brightness at multiple levels and
eliminates noise. An improved neural network was used for feature
extraction and segmentation of the affected regions, and an
ensemble classifier was applied to the extracted features. The
model's classification accuracy was 96.2%.
Bukhari et al. [29] used three pretrained CNNs (ResNet-18,
ResNet-34, and ResNet-50) to assess histopathological images of
colonic cancer drawn from two different databases, achieving
96.4% accuracy.
III. METHODOLOGY
A. Description of the dataset
Datasets are required for any machine learning or deep learning
approach. The quality of the available data determines how well
algorithms can be created, trained, and improved. In medical
imaging applications, the data must be validated and labeled by
professionals before it can be used in any development. The
dataset used in this research on deep learning for lung cancer
diagnosis is presented in this section.
TABLE I
DATASET OF LUNG CANCER
Disease                         Class      Total Images
Lung Adenocarcinoma             lung_aca   5000
Lung Squamous Cell Carcinoma    lung_scc   5000
Lung Benign                     lung_n     5000
The collection contains 15,000 histopathology images in three
classes. Each image is a JPEG file with a resolution of 768 by 768
pixels. The images were generated from 750 original lung tissue
images (250 of benign lung tissue, 250 of lung adenocarcinoma, and
250 of lung squamous cell carcinoma) obtained from HIPAA-compliant
and validated sources. The Augmentor package was used to expand
these 750 images to 15,000. The dataset consists of three
categories, each containing 5,000 images:
Lung Adenocarcinoma
Lung Squamous Cell Carcinoma
Lung Benign Tissue
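
Expanding 250 original images to 5,000 per class corresponds to 20 augmented variants per original. The sketch below illustrates that arithmetic with plain Python lists standing in for images. The flip and 90-degree rotation operations, the 0.5 probabilities, and the helper names `augment` and `expand` are illustrative assumptions; they do not reproduce the Augmentor pipeline used to build the dataset.

```python
import random


def flip_left_right(img):
    """Mirror each row of a 2-D image stored as a list of lists."""
    return [row[::-1] for row in img]


def flip_top_bottom(img):
    """Reverse the order of the rows."""
    return [row[:] for row in img[::-1]]


def rotate_90(img):
    """Rotate the image 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


# Hypothetical set of operations; each is applied with probability 0.5.
OPERATIONS = [flip_left_right, flip_top_bottom, rotate_90]


def augment(img, rng):
    """Return a randomly transformed copy of img."""
    out = [row[:] for row in img]
    for operation in OPERATIONS:
        if rng.random() < 0.5:
            out = operation(out)
    return out


def expand(originals, target, seed=0):
    """Create `target` augmented images, the same number from each original."""
    rng = random.Random(seed)
    copies, remainder = divmod(target, len(originals))
    if remainder:
        raise ValueError("target must be a multiple of the number of originals")
    return [augment(img, rng) for img in originals for _ in range(copies)]


# 250 tiny 2x2 placeholder "images" standing in for one class of the dataset.
originals = [[[i, i + 1], [i + 2, i + 3]] for i in range(250)]
augmented = expand(originals, target=5000)
print(len(augmented))  # 5000, i.e. 20 variants per original
```

Repeating this for the three classes gives the 15,000 images listed in Table I.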
Fig. 1. Proposed Diagram
IV. MODEL IMPLEMENTATION
Four supervised deep learning models are used in this study for
classification: ResNet50, VGG19, EfficientNetB7, and MobileNetV2.
The following subsections provide details on each of the four
models.
A. Proposed Model
ResNet50: ResNet is a convolutional neural network built from
residual blocks, and ResNet50 is its 50-layer variant. The 50
weighted layers comprise an initial convolutional layer, 48
convolutional layers inside the residual blocks, and a final
fully connected layer; the network also contains a max pooling
layer and an average pooling layer. ResNet50 is used here to
classify new images.
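
The figure of 50 layers can be checked with a short calculation. The sketch below assumes the standard ResNet50 configuration (bottleneck blocks arranged as 3, 4, 6, and 3 per stage) and the usual convention of counting only layers with weights on the main path, so pooling layers and shortcut projections are not counted. The function name is illustrative.

```python
# Bottleneck blocks per stage in the standard ResNet50 configuration (assumed).
BLOCKS_PER_STAGE = [3, 4, 6, 3]
# Each bottleneck block stacks a 1x1, a 3x3, and a 1x1 convolution.
CONVS_PER_BLOCK = 3


def resnet_weighted_layers(blocks_per_stage, convs_per_block=CONVS_PER_BLOCK):
    """Count weighted layers: stem convolution + block convolutions + classifier."""
    stem_conv = 1    # initial 7x7 convolution
    block_convs = sum(blocks_per_stage) * convs_per_block
    classifier = 1   # final fully connected layer
    return stem_conv + block_convs + classifier


print(resnet_weighted_layers(BLOCKS_PER_STAGE))  # 50
```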
Fig. 2. Lung Adenocarcinoma, Squamous Cell Carcinoma, and Benign Tissue
of the Lung
TABLE II
CLASSIFICATION REPORT OF RESNET50
Cancer type                Precision   Recall   F1 score
Lung Adenocarcinoma        0.98        0.96     0.97
Squamous Cell Carcinoma    0.96        0.99     0.97
Lung Benign                1.00        1.00     1.00
VGG19: VGG stands for Visual Geometry Group. VGG networks are
used for image recognition and classification. VGG19 is a
convolutional neural network with 19 weight layers that
performs strongly compared with other well-known models.
TABLE III
CLASSIFICATION REPORT OF VGG19
Cancer type                Precision   Recall   F1 score
Lung Adenocarcinoma        0.97        0.96     0.97
Squamous Cell Carcinoma    0.97        0.96     0.97
Lung Benign                1.00        1.00     1.00
EfficientNetB7: EfficientNetB7 is the largest model in the
original EfficientNet family of convolutional neural networks
(CNNs), which scale network depth, width, and input resolution
together.
TABLE IV
CLASSIFICATION REPORT OF EFFICIENTNETB7
Cancer type                Precision   Recall   F1 score
Lung Adenocarcinoma        0.91        0.99     0.95
Squamous Cell Carcinoma    0.99        0.91     0.94
Lung Benign                1.00        0.99     1.00
MobileNetV2: MobileNetV2 is a 53-layer deep convolutional
neural network. The network pretrained on ImageNet can
classify images into 1000 different object categories,
including laptops, tables, pens, and a variety of birds.
TABLE V
CLASSIFICATION REPORT OF MOBILENETV2
Cancer type                Precision   Recall   F1 score
Lung Adenocarcinoma        0.92        0.83     0.88
Squamous Cell Carcinoma    0.86        0.94     0.90
Lung Benign                0.99        0.99     0.99
B. Evaluation of Performance
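\[ \mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{1} \]
\[ \mathrm{Precision} = \frac{TP}{TP + FP} \tag{2} \]
\[ \mathrm{Recall} = \frac{TP}{TP + FN} \tag{3} \]
\[ \mathrm{F1\ Score} = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \tag{4} \]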
where TP denotes True Positive, TN denotes True Negative, FP
denotes False Positive, and FN denotes False Negative. The
accuracy, precision, and recall of ResNet50, VGG19,
EfficientNetB7, and MobileNetV2 are shown in Fig. 4. ResNet50 has
the highest accuracy, outperforming VGG19, EfficientNetB7, and
MobileNetV2.
Fig. 3. Confusion Matrix of ResNet50, VGG19, MobileNetV2, EfficientnetB7
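
The four measures in Eqs. (1)-(4) can be read directly off a confusion matrix by treating each class in turn as the positive class. The sketch below is a minimal illustration: the counts in `CONFUSION` are hypothetical and are not taken from the paper's confusion matrices, and the function name is an assumption.

```python
# Rows are true classes, columns are predicted classes.
# These counts are hypothetical and NOT taken from the paper.
CLASSES = ["aca", "n", "scc"]
CONFUSION = [
    [90, 0, 10],   # true aca
    [0, 100, 0],   # true n
    [5, 0, 95],    # true scc
]


def per_class_metrics(matrix, k):
    """Apply Eqs. (1)-(4) with class k as the positive class."""
    total = sum(sum(row) for row in matrix)
    tp = matrix[k][k]
    fn = sum(matrix[k]) - tp                  # true k, predicted as another class
    fp = sum(row[k] for row in matrix) - tp   # another class, predicted as k
    tn = total - tp - fn - fp
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


for index, name in enumerate(CLASSES):
    m = per_class_metrics(CONFUSION, index)
    print(name, {key: round(value, 3) for key, value in m.items()})
```

Note that Eq. (1) applied per class gives a one-vs-rest accuracy; the overall multi-class accuracy is the sum of the diagonal divided by the total number of images.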
V. RESULTS AND DISCUSSION
Our dataset contains 15,000 images split equally among three
classes. We divided the dataset so that 20% of the images in each
class were used for testing and the other 80% for training.
Precision, F1 score, recall, and accuracy were all taken into
account in the performance analysis. ResNet50 achieved the highest
accuracy, 98%, followed by VGG19 at 97%, EfficientNetB7 at 96%,
and MobileNetV2 at 92%.
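
A per-class 80/20 split of this kind can be sketched as follows. This is a minimal illustration using only the standard library; the file names, the seed, and the function name `stratified_split` are hypothetical, and the paper does not state which tool was used to produce the split.

```python
import random


def stratified_split(items_by_class, test_fraction=0.2, seed=0):
    """Shuffle each class separately and hold out test_fraction of it."""
    rng = random.Random(seed)
    train, test = [], []
    for label, items in items_by_class.items():
        shuffled = list(items)
        rng.shuffle(shuffled)
        n_test = round(len(shuffled) * test_fraction)
        test.extend((item, label) for item in shuffled[:n_test])
        train.extend((item, label) for item in shuffled[n_test:])
    return train, test


# Placeholder file names: 5,000 per class, as in Table I.
dataset = {
    label: [f"{label}_{i}.jpg" for i in range(5000)]
    for label in ("lung_aca", "lung_scc", "lung_n")
}
train, test = stratified_split(dataset)
print(len(train), len(test))  # 12000 3000
```

Splitting each class separately keeps the test set balanced at 1,000 images per class.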
TABLE VI
COMPARISON OF FOUR DEEP LEARNING ALGORITHMS
Algorithms       Class   Precision   Recall   F1-Score   Accuracy
ResNet50         aca     0.98        0.96     0.97       0.98
                 n       1.00        1.00     1.00
                 scc     0.96        0.99     0.97
VGG19            aca     0.97        0.96     0.97       0.97
                 n       1.00        1.00     1.00
                 scc     0.97        0.96     0.97
EfficientNetB7   aca     0.91        0.99     0.95       0.96
                 n       1.00        0.99     1.00
                 scc     0.99        0.91     0.94
MobileNetV2      aca     0.92        0.83     0.88       0.92
                 n       0.99        0.99     0.99
                 scc     0.86        0.94     0.90
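
Because the test set holds the same number of images for each class (20% of 5,000), the overall accuracy equals the mean of the per-class recalls. The sketch below applies that identity to the recall values in Table VI as a consistency check; the dictionary layout and function name are illustrative.

```python
# Per-class recall (aca, n, scc) and reported accuracy, copied from Table VI.
TABLE_VI = {
    "ResNet50":       {"recall": [0.96, 1.00, 0.99], "accuracy": 0.98},
    "VGG19":          {"recall": [0.96, 1.00, 0.96], "accuracy": 0.97},
    "EfficientNetB7": {"recall": [0.99, 0.99, 0.91], "accuracy": 0.96},
    "MobileNetV2":    {"recall": [0.83, 0.99, 0.94], "accuracy": 0.92},
}


def mean_recall(recalls):
    """With equal class sizes, overall accuracy is the mean per-class recall."""
    return sum(recalls) / len(recalls)


for name, row in TABLE_VI.items():
    estimate = mean_recall(row["recall"])
    print(f"{name}: mean recall {estimate:.2f}, reported {row['accuracy']:.2f}")
```

For all four models the mean recall, rounded to two decimals, agrees with the reported accuracy.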
VI. CONCLUSION AND FUTURE WORK
Lung cancer is a leading cause of death worldwide. Treatment
outcomes and survival rates for various malignancies can be
considerably improved by an early and appropriate diagnosis. We
believe that our approach can be used to accurately detect a
number of diseases. Lung cancer is a dangerous disease that must
be combated with all available resources. Many earlier methods and
studies did not achieve high-accuracy detection results. The
proposed ResNet50 model, by contrast, is useful for lung cancer
detection because of its high accuracy, and more accurate
identification of tumor cells may help reduce the death rate.
REFERENCES
[1] Bhatia, S., Sinha, Y., & Goel, L. (2019). Lung cancer detection: a deep
learning approach. In Soft Computing for Problem Solving (pp. 699-
705). Springer, Singapore.
[2] Xu, Y., Hosny, A., Zeleznik, R., Parmar, C., Coroller, T., Franco, I., ...
& Aerts, H. J. (2019). Deep Learning Predicts Lung Cancer Treatment
Response from Serial Medical Imaging. Clinical Cancer Research,
25(11), 3266-3275.
[3] Riquelme, D., & Akhloufi, M. A. (2020). Deep learning for lung cancer
nodules detection and classification in CT scans. AI, 1(1), 28-67.
[4] Ibrahim, D. M., Elshennawy, N. M., & Sarhan, A. M. (2021). Deep-
chest: Multi-classification deep learning model for diagnosing COVID-
19, pneumonia, and lung cancer chest diseases. Computers in biology
and medicine, 132, 104348.
[5] Asuntha, A., & Srinivasan, A. (2020). Deep learning for lung Cancer
detection and classification. Multimedia Tools and Applications, 79(11),
7731-7762.
[6] Kriegsmann, M., Haag, C., Weis, C. A., Steinbuss, G., Warth, A.,
Zgorzelski, C., ... & Kriegsmann, K. (2020). Deep learning for the
classification of small-cell and non-small-cell lung cancer. Cancers,
12(6), 1604.
[7] Cong, L., Feng, W., Yao, Z., Zhou, X., & Xiao, W. (2020). Deep learning
model as a new trend in computer-aided diagnosis of tumor pathology
for lung cancer. Journal of Cancer, 11(12), 3615.
[8] Schmidhuber, J. (2015). Deep learning in neural networks: An overview.
Neural networks, 61, 85-117.
[9] Masud, M., Sikder, N., Nahid, A. A., Bairagi, A. K., & AlZain, M.
A. (2021). A machine learning approach to diagnosing lung and colon
cancer using a deep learning-based classification framework. Sensors,
21(3), 748.
[10] Mangal, S., Chaurasia, A., & Khajanchi, A. (2020). Convolution neural
networks for diagnosing colon and lung cancer histopathological images.
arXiv preprint arXiv:2009.03878.
[11] Hatuwal, B. K., & Thapa, H. C. (2020). Lung cancer detection using
convolutional neural network on histopathological images. Int. J. Com-
put. Trends Technol, 68(10), 21-24.
[12] Sarwinda, D., Bustamam, A., Paradisa, R. H., Argyadiva, T., & Man-
gunwardoyo, W. (2020, November). Analysis of deep feature extraction
for colorectal cancer detection. In 2020 4th International Conference on
Informatics and Computational Sciences (ICICoS) (pp. 1-5). IEEE.
[13] Kumar, N., Sharma, M., Singh, V. P., Madan, C., & Mehandia, S. (2022).
An empirical study of handcrafted and dense feature extraction tech-
niques for lung and colon cancer classification from histopathological
images. Biomedical Signal Processing and Control, 75, 103596.
[14] Wang, Y., Yang, L., Webb, G. I., Ge, Z., & Song, J. (2021). OCTID:
a one-class learning-based Python package for tumor image detection.
Bioinformatics, 37(21), 3986-3988.
[15] Chehade, A. H., Abdallah, N., Marion, J. M., Oueidat, M., & Chauvet, P.
(2022). Lung and Colon Cancer Classification Using Medical Imaging:
A Feature Engineering Approach.
[16] Hlavcheva, D., Yaloveha, V., Podorozhniak, A., & Kuchuk, H. (2021,
August). Comparison of CNNs for Lung Biopsy Images Classification.
In 2021 IEEE 3rd Ukraine Conference on Electrical and Computer
Engineering (UKRCON) (pp. 1-5). IEEE.
[17] Sirinukunwattana, K., Raza, S.E.A., Tsang, Y.W., Snead, D.R., Cree, I.A.
and Rajpoot, N.M., 2016. Locality sensitive deep learning for detection
and classification of nuclei in routine colon cancer histology images.
IEEE transactions on medical imaging, 35(5), pp.1196-1206.
[18] Sun, W., Zheng, B. and Qian, W., 2017. Automatic feature learning using
multichannel ROI based on deep structured algorithms for computerized
lung cancer diagnosis. Computers in biology and medicine, 89, pp.530-
539.
[19] Selvanambi, R., Natarajan, J., Karuppiah, M., Islam, S.K., Hassan,
M.M. and Fortino, G., 2020. Lung cancer prediction using higher-
order recurrent neural network based on glowworm swarm optimization.
Neural Computing and Applications, 32(9), pp.4373-4386.
[20] de Carvalho Filho, A.O., Silva, A.C., de Paiva, A.C., Nunes, R.A. and
Gattass, M., 2018. Classification of patterns of benignity and malignancy
based on CT using topology-based phylogenetic diversity index and
convolutional neural network. Pattern Recognition, 81, pp.200-212.
[21] Masood, A., Sheng, B., Li, P., Hou, X., Wei, X., Qin, J. and Feng, D.,
2018. Computer-assisted decision support system in pulmonary cancer
detection and stage classification on CT images. Journal of biomedical
informatics, 79, pp.117-128.
[22] Mo, X., Tao, K., Wang, Q. and Wang, G., 2018, August. An efficient
approach for polyps detection in endoscopic videos based on faster
R-CNN. In 2018 24th international conference on pattern recognition
(ICPR) (pp. 3929-3934). IEEE.
[23] Urban, G., Tripathi, P., Alkayali, T., Mittal, M., Jalali, F., Karnes, W.
and Baldi, P., 2018. Deep learning localizes and identifies polyps in real
time with 96% accuracy in screening colonoscopy. Gastroenterology,
155(4), pp.1069-1078.
[24] Akbari, Mojtaba, Majid Mohrekesh, Shima Rafiei, SM Reza
Soroushmehr, Nader Karimi, Shadrokh Samavi, and Kayvan Najarian.
"Classification of informative frames in colonoscopy videos using
convolutional neural networks with binarized weights." In 2018 40th
annual international conference of the IEEE engineering in medicine
and biology society (EMBC), pp. 65-68. IEEE, 2018.
[25] Shakeel, P.M., Tolba, A., Al-Makhadmeh, Z. and Jaber, M.M., 2020.
Automatic detection of lung cancer from biomedical data set using
discrete AdaBoost optimized ensemble learning generalized neural net-
works. Neural Computing and Applications, 32(3), pp.777-790.
[26] Suresh, S. and Mohan, S., 2020. ROI-based feature learning for efficient
true positive prediction using convolutional neural network for lung can-
cer diagnosis. Neural Computing and Applications, 32(20), pp.15989-
16009.
[27] Masud, M., Muhammad, G., Hossain, M.S., Alhumyani, H., Alshamrani,
S.S., Cheikhrouhou, O. and Ibrahim, S., 2020. Light deep model for
pulmonary nodule detection from CT scan images for mobile devices.
Wireless Communications and Mobile Computing, 2020.
[28] Shakeel, P.M., Burhanuddin, M.A. and Desa, M.I., 2020. Automatic lung
cancer detection from CT image using improved deep neural network
and ensemble classifier. Neural Computing and Applications, pp.1-14.
[29] Bukhari, S.U.K., Syed, A., Bokhari, S.K.A., Hussain, S.S., Armaghan,
S.U. and Shah, S.S.H., 2020. The histological diagnosis of colonic
adenocarcinoma by applying partial self supervised learning. MedRxiv.
6
Article
Full-text available
Motivation: Tumor tile selection is a necessary prerequisite in patch-based cancer whole slide image analysis, which is labor-intensive and requires expertise. Whole slides are annotated as tumor or tumor free, but tiles within a tumor slide are not. As all tiles within a tumor free slide are tumor free, these can be used to capture tumor-free patterns using the one-class learning strategy. Results: We present a Python package, termed OCTID, which combines a pre-trained convolutional neural network (CNN) model, UMAP, and one-class SVM to achieve accurate tumor tile classification using a training set of tumor free tiles. Benchmarking experiments on four H&E image datasets achieved remarkable performance in terms of F1-Score (0.90 ± 0.06), Matthews correlation coefficient (0.93 ± 0.05), and Accuracy (0.94 ± 0.03). Availability: Detailed information can be found in the supplementary file.
Article
Full-text available
Corona Virus Disease (COVID-19) has been announced as a pandemic and is spreading rapidly throughout the world. Early detection of COVID-19 may protect many infected people. Unfortunately, COVID-19 can be mistakenly diagnosed as pneumonia or lung cancer, which with fast spread in the chest cells, can lead to patient death. The most commonly used diagnosis methods for these three diseases are chest X-ray and computed tomography (CT) images. In this paper, a multi-classification deep learning model for diagnosing COVID-19, pneumonia, and lung cancer from a combination of chest x-ray and CT images is proposed. This combination has been used because chest X-ray is less powerful in the early stages of the disease, while a CT scan of the chest is useful even before symptoms appear, and CT can precisely detect the abnormal features that are identified in images. In addition, using these two types of images will increase the dataset size, which will increase the classification accuracy. To the best of our knowledge, no other deep learning model choosing between these diseases is found in the literature. In the present work, the performance of four architectures are considered, namely: VGG19-CNN, ResNet152V2, ResNet152V2 + Gated Recurrent Unit (GRU), and ResNet152V2 + Bidirectional GRU (Bi-GRU). A comprehensive evaluation of different deep learning architectures is provided using public digital chest x-ray and CT datasets with four classes (i.e., Normal, COVID-19, Pneumonia, and Lung cancer). From the results of the experiments, it was found that the VGG19 +CNN model outperforms the three other proposed models. The VGG19+CNN model achieved 98.05% accuracy (ACC), 98.05% recall, 98.43% precision, 99.5% specificity (SPC), 99.3% negative predictive value (NPV), 98.24% F1 score, 97.7% Matthew’s correlation coefficient (MCC), and 99.66% area under the curve (AUC) based on X-ray and CT images.
Article
Full-text available
The field of Medicine and Healthcare has attained revolutionary advancements in the last forty years. Within this period, the actual reasons behind numerous diseases were unveiled, novel diagnostic methods were designed, and new medicines were developed. Even after all these achievements, diseases like cancer continue to haunt us since we are still vulnerable to them. Cancer is the second leading cause of death globally; about one in every six people die suffering from it. Among many types of cancers, the lung and colon variants are the most common and deadliest ones. Together, they account for more than 25% of all cancer cases. However, identifying the disease at an early stage significantly improves the chances of survival. Cancer diagnosis can be automated by using the potential of Artificial Intelligence (AI), which allows us to assess more cases in less time and cost. With the help of modern Deep Learning (DL) and Digital Image Processing (DIP) techniques , this paper inscribes a classification framework to differentiate among five types of lung and colon tissues (two benign and three malignant) by analyzing their histopathological images. The acquired results show that the proposed framework can identify cancer tissues with a maximum of 96.33% accuracy. Implementation of this model will help medical professionals to develop an automatic and reliable system capable of identifying various types of lung and colon cancers.
Article
Full-text available
Lung Cancer is one of the leading life taking cancer worldwide. Early detection and treatment are crucial for patient recovery. Medical professionals use histopathological images of biopsied tissue from potentially infected areas of lungs for diagnosis. Most of the time, the diagnosis regarding the types of lung cancer are error-prone and time-consuming. Convolutional Neural networks can identify and classify lung cancer types with greater accuracy in a shorter period, which is crucial for determining patients' right treatment procedure and their survival rate. Benign tissue, Adenocarcinoma, and squamous cell carcinoma are considered in this research work. The CNN model training and validation accuracy of 96.11 and 97.2 percentage are obtained.
Preprint
Background: Colon cancer is an important cause of morbidity and mortality in adults. For the management of colonic carcinoma, the definitive diagnosis depends on histological examination of biopsy specimens. With the development of whole slide imaging, convolutional neural networks are being applied to diagnose colonic carcinoma by digital image analysis. Aim: The main aim of the current study is to assess the application of deep learning to the histopathological diagnosis of colonic adenocarcinoma by analysing digitized pathology images. Materials & Methods: Images of colonic adenocarcinoma and non-neoplastic colonic tissue were acquired from two datasets. The first dataset contains ten thousand images, which were used to train and validate the convolutional neural network (CNN) architectures. From the second dataset (the Colorectal Adenocarcinoma Gland (CRAG) dataset), 40% of the images were used as a training set and 60% as a test set. Two histopathologists also evaluated these images. Three CNN variants (ResNet-18, ResNet-34 and ResNet-50) were employed to evaluate the images. Results: The three CNN architectures (ResNet-18, ResNet-34 and ResNet-50) were applied to the classification of digitized images of colonic tissue. ResNet-50 achieved the highest accuracy (93.91%), followed by ResNet-34 and ResNet-18 with an accuracy of 93.04% each. Conclusion: Based on the findings of the present study and an analysis of previously reported series, computer-aided technology for evaluating surgical specimens in the diagnosis of malignant tumors could provide significant assistance to pathologists.
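The 40%/60% train/test partition mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not say how the images were assigned to each set, so the shuffling, the fixed seed, the function name `split_train_test` and the image identifiers below are all assumptions made for illustration only.

```python
import random


def split_train_test(items, train_fraction, seed=0):
    """Shuffle items reproducibly and cut them into (train, test) lists.

    Hypothetical helper: the study's actual splitting procedure is not
    described in the abstract.
    """
    shuffled = list(items)                   # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)    # seeded shuffle for repeatability
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Ten made-up image identifiers standing in for the second dataset.
image_ids = [f"img_{i:03d}" for i in range(10)]
train, test = split_train_test(image_ids, 0.4)
```

With ten items and a fraction of 0.4, four identifiers land in `train` and six in `test`, and no identifier appears in both.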
Article
The emergence of cognitive computing and big data analytics is revolutionizing the healthcare domain, particularly cancer detection. Lung cancer is one of the major causes of death worldwide. Pulmonary nodules in the lung can become cancerous as they develop. Early detection of pulmonary nodules can lead to early treatment and a significant reduction in deaths. In this paper, we propose an end-to-end convolutional neural network (CNN) based automatic pulmonary nodule detection and classification system. The proposed CNN architecture has only four convolutional layers and is therefore lightweight. Each convolutional layer consists of two consecutive convolutional blocks, a connector convolutional block, nonlinear activation functions after each block, and a pooling block. The experiments are carried out using the Lung Image Database Consortium (LIDC) database. From the LIDC database, 1279 sample images are selected, of which 569 are noncancerous, 278 are benign, and the rest are malignant. The proposed system achieved 97.9% accuracy. Compared to other well-known CNN architectures, the proposed architecture has far fewer FLOPs and parameters and is therefore suitable for real-time medical image analysis.
Article
Lung and colon cancers account for a significant portion of deaths. Their simultaneous occurrence is uncommon; however, in the absence of early diagnosis, the rate of metastasis of cancer cells between these two organs is very high. Currently, histopathological diagnosis and appropriate treatment are the only way to improve the chances of survival and reduce cancer mortality. Using artificial intelligence in the histopathological diagnosis of colon and lung cancer can significantly help specialists identify cases with less effort, time and cost. The objective of this study is to set up a computer-aided diagnostic system that can accurately classify five types of colon and lung tissue (two classes for colon cancer and three classes for lung cancer) by analyzing their histopathological images. Using machine learning, feature engineering and image processing techniques, six models (XGBoost, SVM, RF, LDA, MLP and LightGBM) were used to classify histopathological images of lung and colon cancers acquired from the LC25000 dataset. The main advantage of machine learning models is that they are more interpretable, since they are based on feature engineering; deep learning models, by contrast, are black-box networks whose workings are very difficult to understand because of their complex design. The experimental results show that machine learning models give satisfactory results and are very precise in identifying lung and colon cancer subtypes. The XGBoost model gave the best performance, with an accuracy of 99% and an F1-score of 98.8%. The implementation and development of this model will help healthcare specialists identify types of colon and lung cancers. The code will be available upon request.
Article
According to a 2020 WHO report, cancer is one of the main causes of death worldwide. Among these deaths, lung and colon cancer are collectively responsible for nearly 2.735 million. Detection and classification of lung and colon cancer is therefore one of the highest-priority research areas in biomedical health informatics. In this article, a comparative analysis of two feature extraction methodologies is presented for lung and colon cancer classification. In the first approach, six handcrafted feature extraction techniques based on colour, texture, shape and structure are presented. Gradient Boosting (GB), SVM-RBF, Multilayer Perceptron (MLP) and Random Forest (RF) classifiers are trained and tested on the handcrafted features for lung and colon cancer classification. In the second approach, using transfer learning, seven deep learning frameworks are presented for deep feature extraction from lung and colon cancer histopathological images. The extracted deep features are fed as input attributes to the conventional GB, SVM-RBF, MLP and RF classifiers. Compared with handcrafted features, the features extracted by deep CNN networks yield a significant improvement in classifier performance. The proposed technique obtained excellent results in terms of accuracy, precision, recall, F1 score and ROC-AUC. The RF classifier with DenseNet-121 deep features can identify lung and colon cancer tissue with an accuracy and recall of 98.60%, a precision of 98.63%, an F1 score of 0.985 and a ROC-AUC of 1.
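The accuracy, precision, recall and F1 score named in the abstract above can all be derived from a confusion matrix. The sketch below shows the standard per-class definitions in plain Python; the function name `per_class_metrics` and the two-class toy matrix are made up for illustration and are not results from any of the papers listed here.

```python
def per_class_metrics(confusion):
    """Return accuracy plus per-class precision, recall and F1.

    confusion[i][j] counts samples whose true class is i and whose
    predicted class is j (rows are truth, columns are predictions).
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(n))
    accuracy = correct / total

    precision, recall, f1 = [], [], []
    for k in range(n):
        true_pos = confusion[k][k]
        predicted_k = sum(confusion[i][k] for i in range(n))  # column sum
        actual_k = sum(confusion[k])                          # row sum
        p = true_pos / predicted_k if predicted_k else 0.0
        r = true_pos / actual_k if actual_k else 0.0
        precision.append(p)
        recall.append(r)
        f1.append(2 * p * r / (p + r) if (p + r) else 0.0)
    return accuracy, precision, recall, f1


# Toy two-class matrix (illustrative numbers only).
toy = [[8, 2],
       [1, 9]]
accuracy, precision, recall, f1 = per_class_metrics(toy)
```

For the toy matrix, 17 of the 20 samples lie on the diagonal, so the accuracy is 0.85; class 0 has a recall of 8/10 and a precision of 8/9.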
Conference Paper
Deep learning approaches are widely used in the processing of medical images, including histopathological images for cancer diagnosis. The paper therefore considers the scientific and practical problem of automating biopsy image analysis using convolutional neural networks. The LC25000 dataset was used to compare the classification accuracy of different CNN architectures. To analyze the impact of image size, two more datasets were created from the initial dataset by slicing the images. The correlation between the complexity of the CNN structure, the size of the images, and the resulting accuracy on test data was obtained. Results were compared with related research on the LC25000 dataset. Deep learning theory and methods of mathematical statistics are used.
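Creating smaller-image datasets by slicing, as the abstract above describes, can be sketched as cutting each image into non-overlapping square tiles. The abstract does not give the tile sizes or say whether the tiles overlapped, so the non-overlapping layout, the function name `slice_into_tiles` and the 4x4 toy image are assumptions for illustration.

```python
def slice_into_tiles(image, tile):
    """Split a 2-D image (a list of equal-length rows) into tile x tile patches.

    Patches are taken left to right, top to bottom, without overlap.
    Rows or columns that do not fill a whole tile are dropped.
    """
    height, width = len(image), len(image[0])
    tiles = []
    for top in range(0, height - tile + 1, tile):
        for left in range(0, width - tile + 1, tile):
            patch = [row[left:left + tile] for row in image[top:top + tile]]
            tiles.append(patch)
    return tiles


# Toy 4x4 "image" whose pixel values are 0..15 in row-major order.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
tiles = slice_into_tiles(image, 2)
```

The 4x4 toy image yields four 2x2 tiles. Applied to a real dataset, each tile would inherit the class label of its parent image, which is itself an assumption about how the sliced datasets were labelled.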