Journal of Autonomous Intelligence (2024) Volume 7 Issue 5
doi: 10.32629/jai.v7i5.1605
Original Research Article
Hybrid approach for lung cancer detection based on deep
learning/machine learning
Sandeep Kumar Hegde1, Sujidha B.2, K. Vimala Devi3,*, K. Maheswari4, K. Leela Krishna5, Pallavi Singh6,
Varsha D. Jadhav7
1 Department of Computer Science and Engineering, NMAM Institute of Technology, NITTE (Deemed to be University),
Karnataka 574110, India
2 Department of Science & Humanities, Rathinam Technical Campus, Coimbatore 641021, India
3 School of Computer Science and Engineering, Vellore Institute of Technology, Vellore 632014, India
4 Department of CSE, CMR Technical Campus, Hyderabad 501401, India
5 Civil Engineering Department, RVR & JC College of Engineering, Guntur 522019, India
6 Department of Biotechnology, Graphic Era Deemed to be University, Dehradun 248002, India
7 Department of Artificial Intelligence and Data Science, Vishwakarma Institute of Information Technology, Pune
411048, India
* Corresponding author: K. Vimala Devi, vimaladevi.k@vit.ac.in
ABSTRACT
The incidence of Lung Cancer (LC) is rising in India. LC has been diagnosed and detected many times using various data-processing and identification strategies. Since the underlying cause of LC is still unknown and advanced disease is difficult to treat, early diagnosis of lung tumors is the only viable option. Therefore, a Machine Learning (ML) and Deep Learning (DL) based system is used to classify CT scans for the presence of LC. The Visual Geometry Group (VGG-16) and Multi-Class Support Vector Machine (VGG-16+MSVM) technique is proposed in this research. A Non-Local Means (NLM) filter and Bi-Histogram Equalization (Bi-HE) are used, respectively, to filter unwanted background noise from the raw data samples and to improve image quality. To isolate tumors in the raw data, the K-Means Clustering (KMC) technique is applied. The Gray Level Co-Occurrence Matrix (GLCM) is employed to generate features from the segmented data. The proposed approach is optimized with a Genetic Algorithm (GA) that selects optimal feature subsets to maximize performance. Combining ML and DL methods in medical image processing is an effective approach to detecting LC and its stages, with the hope of achieving more precise findings. When accuracy is assessed and compared with other procedures, the suggested methodology proves more accurate (95%).
Keywords: medical image processing; LC; ML; DL; VGG-16; multi-class support vector machine (VGG-16+MSVM)
1. Introduction
LC is a lethal cancer that is challenging to identify. It typically results in mortality for both males and females, and is strongly linked to smoking, whose prevalence is accelerating. Non-small cell LC (NSCLC) is more prevalent and develops more gradually. When both forms of cancer are present in a patient, the disease is labeled combined small cell/large cell carcinoma. According to current statistics, NSCLC accounts for over 85% of the estimated 234,030 new cases of LC predicted to be identified in 2018. The proliferation of LC without symptoms is the main aspect that makes this illness so dangerous. A proportion of those surveyed showed no indications of
ARTICLE INFO
Received: 23 February 2024
Accepted: 9 April 2024
Available online: 29 May 2024
COPYRIGHT
Copyright © 2024 by author(s).
Journal of Autonomous Intelligence is
published by Frontier Scientific Publishing.
This work is licensed under the Creative
Commons Attribution-NonCommercial 4.0
International License (CC BY-NC 4.0).
https://creativecommons.org/licenses/by-nc/4.0/
malignancy. LC may also be detected on X-rays of the lungs. The importance of prompt detection cannot be overstated, since LC spreads swiftly. LC may begin in the major airway, the windpipe, the lungs, or another location. It is brought on by the unchecked expansion and proliferation of certain cells in the lungs. LC is quite frequent in those with emphysema or other lung conditions. The two main categories of LC are "small-cell lung carcinoma (SCLC) and non-small-cell lung tumor (NSCLC)"[1]. SCLC, the more dangerous pathologic form of LC, makes up 25%–40% of lung tumor cases in China, making it one of the most deadly tumors. Advanced LC has a severe fatality ratio and few treatment choices. With a high growth rate, quick replication times, and the early emergence of extensive metastases, it has a distinctive biological profile. SCLC, formerly known as "oat cell carcinoma", first emerged in the research literature in 1936, in a report of a case of the disease in a person who had asbestosis. Various sampling techniques may affect the estimated frequency of C-SCLC, which varies from 2% to 28% of all SCLC patients across studies. Since there is minimal evidence of inter-tumor diversity in morphology or biomolecular profile in medical care, SCLC has so far been considered a "uniform" illness[2]. Figure 1 depicts SCLC and NSCLC.
Figure 1. SCLC and NSCLC lung cancer.
Among all types of LC, NSCLC spreads 40% more rapidly than many other kinds of LC. Because of this, it is thought that carcinoma in its initial phases has a strong probability of not spreading. NSCLC is staged in four phases: Stage 0 (occult), in which cancerous cells are discovered in sputum or bronchial washings but cannot be readily seen using imaging methods or bronchoscopy, although cancer may already have spread to other body areas; Stage I (development of tumor), in which no lymph nodes have been affected; Stage II, a disease that has progressed to the major bronchial tubes; Stage III, tumors that have grown into several body parts but have not yet shown evidence of metastasis; and Stage IV, cancer that has spread to many locations in one or more systems. Most cases that test positive for NSCLC have already proceeded to the advanced phases[3]. With 21 lakh new cases and 18 lakh fatalities in 2018, LC ranks as the most prevalent form of cancer globally. Contrary to other malignancies like breast and testicular tumors, which often appear with a single recognizable symptom (e.g., a painless lump), LC has a much wider symptom pattern. Shortness of breath, chest discomfort, a chronic cough, and other signs, including alterations to an existing cough, may all be caused by early LC. Severe illness is often accompanied by systemic signs such as unexplained weight loss and exhaustion. One of the best predictors of LC is hemoptysis (coughing up blood). Loss of appetite, a lump in the neck, weakness, abdominal pain, and blood clots are other major symptoms of LC[4]. Figure 2 displays the symptoms of LC.
Smoking is connected to certain genomic alterations that result in lung malignancies with unique pathological characteristics. A few oncogenes and carcinogenic pathways through which smoking affects the LC genome are well characterized. Smoking is the most significant risk factor, with solid evidence linking it to LC as a key cause of the disease. In comparison to non-smokers, smokers have a 30-fold higher chance of acquiring cancer. LC may be promoted by inflammation through several different routes; thus, pulmonary inflammation may contribute to the development or spread of cancer. The tobacco-induced pulmonary cell interaction provides a distinct setting in which inflammatory, functional, and stromal cells of the lung collaborate to promote cancer. The substantial modifications brought about by cigarette smoke, which includes well-known carcinogens as well as large quantities of reactive oxygen species, represent the first link between smoking and LC. After exposure to cigarette smoke, the rapid generation of reactive oxygen species causes inflammation as well as a deficiency in epithelial and endothelial cellular functions[5]. LC accounts for the greatest share of cancer mortality globally. The prognosis of patients with advanced LC is poor, since the early detection ratio of LC is only 15%, and 75% of individuals are identified at an advanced or locally advanced phase. When metastasis develops in individuals with advanced LC, chemotherapy, laser surgery, and chemotherapeutic drugs are often utilized as the first line of treatment for the initial malignancies. Despite the availability of several LC therapies, the recovery rate of individuals with LC continues to be poor. This is likely due to delayed detection and the ineffectiveness of the therapeutic interventions now available in the medical sector[6]. Finding early diagnostic methods for LC with high accuracy and precision is increasingly crucial. Therefore, a method relying on DL and ML is employed to classify CT images for the presence of LC. The article suggests the VGG-16 and Multi-Class Support Vector Machine (VGG-16+MSVM) approach.
Figure 2. Symptoms of LC.
2. Literature survey
A new CT-scan-based image-processing and artificial-intelligence-based LC diagnostic method was suggested in the study of Xu et al.[7]. In that investigation, AlexNet was used to distinguish between malignant and healthy instances following noise reduction based on Wiener filtering. Additionally, the system utilizes the best possible terms for each feature produced by the network's feature extraction component. The AlexNet framework and the feature extraction in that research are designed optimally using a new customized form of the Satin Bowerbird Optimization Algorithm; this technique has high operational complexity. Data from computed tomography scans of lung patients are utilized in that research to identify and categorize pulmonary nodules as well as to determine their degree of aggressiveness, and a U-Net architecture is employed to segment the CT scan information. The study of Dunke and Tarade[8] proposed a 3D multi-path VGG-like network, which is tested on 3D cubes taken from a collection of lung images. That study also presented a new wrapper-based feature selection approach built on a modified stochastic diffusion search (SDS) method; the SDS profits from agent-to-agent interactions to find the best-selected features. For categorization, a clustering algorithm, a neural network, and Naive Bayes have all been employed. Techniques using ML have been extensively employed to improve the efficacy of preclinical tumor diagnosis. The fast learning network (FLN) is a cutting-edge ML method that uses little computing power and is quick to operate. The FLN's initial parameters (weights and biases), though, are generated randomly, making the algorithm unpredictable[9]. One line of research therefore suggested a combined approach using FLN and K-nearest neighbors to classify the structure and picture elements of thorax CT images and diagnose LC, increasing effectiveness[10]. Because of the intricate structure and therapeutic
interconnections of computer-diagnosed scanning results, physicians have trouble diagnosing LC. Physicians
might benefit from Computer-Aided Detection (CAD) for rational judgment, earlier cancer detection, and
categorization of malignant anomalies. In this study of Alyami et al.[11], the stages of LC are distinguished
utilizing image analysis methods, and CAD has been used to improve the precision, sensitivities, and validity
of automatic identification. The identification and classification of anomalies in clinical imaging are crucial
for medical treatment, such as assessment, radiation, reaction analysis, and visual data study. Therefore, for
accuracy and early diagnosis, Desai et al.[12] developed a completely automatic process to identify and morphologically categorize "non-small cell LC". Improved "Low Dose Computed Tomography (LDCT)"
in conjunction with guaranteed-convergence particle swarm optimization is used to diagnose LC[13]. Because the suggested method is computerized, much less time is needed for both the evaluation and the preparation of the data. This methodical approach, together with the essential detectors and their integration, is required to obtain information from the improved LDCT scans that were performed; however, this method has high energy consumption. In the study of Tumuluru et al.[14], deep convolutional neural networks were presented as a method for predicting LC at an earlier phase. CT and MRI helped locate and diagnose the pulmonary illnesses that were present. Additionally, improved CNN-based CT and MRI examination to increase image quality has significant applicability potential for detecting LC; however, this method performs poorly at finding LC. In the study of Bai et al.[15], a lung tumor segmentation and identification technique is developed by making use of the suggested sine cosine SailFish (SCSF) optimization driven generative adversarial network (GAN). The normalized computed tomography (CT) scan is then passed on to the tumor identification and classification phase. During this step, the CT scan is divided into a variety of sub-images to precisely locate the aberrant area. Through the use of the discriminator element's error mechanism, it is possible to precisely locate the afflicted areas. This method has lower classification accuracy. In the
study of Selvapandian et al.[16], improved artificial neural network (ANN) methodologies for diagnosing lung
disorders have been developed. ANN is employed to train on the dataset that has been provided. The
methodology that has been suggested results in improved categorization reliability. The article of Manoharan et al.[17] concentrated on an expanded iteration of the KNN algorithm, which is utilized for the prognosis of LC based on CT-scanned data supplied as input. The data then go through a process called pattern recovery, followed by binarization, before being sent as performance information to the ML system. The system analyzes the screening data using the expanded KNN method and then makes forecasts based on the results of that analysis. The algorithm determines the tumor phase from the input CT image, and this information is then sent to the physician so that additional treatment may be administered.
In order to further confirm the Early CDT-Lung test's performance in detecting small cell lung cancer (SCLC) in a bigger patient population, samples from this group were run on it without matched controls. The validation data set included 73 SCLC samples[18]. The parental lung cancer's genetic heterogeneity was likewise preserved in the lung cancer organoids (LCOs). This research indicates that the LCO system will serve as a helpful platform for new clinical trials and drug screening. Normal bronchial organoids (NBOs) can also be utilized to estimate the toxicity of drugs on normal cells[19]. The requirement to offer quick, trustworthy, and affordable results from NSCLC specimens is crucial given the growing number of predictive biomarkers available for treating NSCLC patients.
Immunohistochemistry (IHC) is a commonly used and less technically complex assay than molecular testing
that may be successfully carried out on the majority of FFPE tissue[20]. For assessing small samples like
cytology specimens, hybrid capture-based NGS techniques that can identify gene mutations as well as copy
number changes and genomic structural changes may prove to be the most effective[21]. Study shows that
volumetric modulated arc therapy (VMAT) and Intensity modulated radiotherapy (IMRT) are thus advised for
the care of patients with stage III NSCLC due to the possible dosimetric benefits associated with these
modalities[22]. The local disease failures that were most likely to be the first sites of a recurrence were averted
by the radiation treatment. Further supporting the possible advantages of local therapy in restricted metastatic
settings, PFS for individuals with minimal metastatic disease appeared comparable to those of patients with a
greater metastatic burden[23]. The authors have outlined the key strategies for classifying nodules and predicting
lung cancer from CT imaging data. According to their observations, given enough training data, the state-of-
the-art is now obtained by CNNs trained with deep learning, with classification performance in the low 90s
AUC points[24]. Detection of lung cancer even at 0.11 mSv, a relatively low effective radiation dosage. New
uses for FDG-PET may result from further development of this technology, which may also increase the
specificity of lung cancer screening programmes[25]. The study of Li and Liang[26] randomly allocated participants in a 2:1 ratio to receive durvalumab (at a dose of 10 mg per kilogramme of body weight intravenously) or a placebo every two weeks for up to 12 months. The study medication was given one to 42 days after the patients had received chemoradiotherapy. Identification of patients with early-stage lung
cancer and the need for treatment interventions are made possible by screening and early diagnosis. Inferring
the relative risks of relapse through dynamic classification of patients is made possible by prognosis prediction
utilizing ctDNA. Personalizing treatment and facilitating interventions based on resistant mechanisms are
made possible by evaluating treatment response and resistance[27]. The U.S. Preventive Services Task Force
(USPSTF) advises lung cancer screening (LCS) with low-dose computed tomography (LDCT) in high-risk
individuals, although only a small percentage of those who are eligible are screened. It is unclear if PCPs’ use
of LDCT is impacted by their familiarity with USPSTF recommendations[28]. Visually guided, voluntarily performed deep-inspiration breath-hold (DIBH) was implemented using optical tracking. For the purpose of planning radiotherapy, patients underwent three consecutive DIBH CT scans. In order to calculate the PTV margins, the authors examined the intrafractional errors in the position of the peripheral tumor and lymph nodes, and the differential mobility between them[29]. They use a multitask deep neural network to process pre-therapy free-breathing (FB) computed tomography (CT) images from 849 patients receiving lung Stereotactic Body Radiation Therapy (SBRT) to create an image fingerprint signature (or DL score) that forecasts time-to-event local failure outcomes[30].
Existing methods struggle with several drawbacks, including improper categorization, erroneous detection, increased energy consumption, and increased processing time. The previously available procedures were not sufficient to accomplish early detection.
Problem statement
Globally, cancer is the non-communicable illness responsible for the second most fatalities. The pulmonary tumor is only one of the many forms of the disease, but it is the form that accounts for the greatest number of deaths worldwide. The danger of death from LC is higher than from any other kind of disease that may affect people of both genders. Through the unregulated expansion of abnormal cells, one lung, or both, will enlarge. The diagnosis of these illnesses at a preliminary phase is one of the most important steps that can be taken to protect humans from succumbing to them. Numerous academics are now investigating a variety of approaches to illness forecasting in the hope of improving reliability. However, the methods currently in use face several deficiencies when it comes to the categorization and detection of LC. Therefore, we recommend a VGG-16 and Multi-Class Support Vector Machine (VGG-16+MSVM) to boost the classification accuracy of LC prediction utilizing CT scans, so that more accurate diagnoses can be made.
3. Proposed method
The economy and global health are significantly impacted by chronic LC. Early diagnosis and prediction of LC may lead to effective management of the mortality rate and of significant public health issues. We thus suggest a VGG-16 and Multi-Class Support Vector Machine (VGG-16+MSVM) to enhance the classification accuracy of LC prediction using CT scans. The procedure of the suggested technique is depicted in Figure 3.
Figure 3. Procedure of the proposed method.
3.1. Contribution of the study
In this work, a method for classifying CT scans termed the VGG-16 and Multi-Class Support Vector Machine (VGG-16+MSVM) is presented. A Non-Local Means (NLM) filter and Bi-Histogram Equalization (Bi-HE) are utilized in image preprocessing to remove undesired noise from data samples. For segmentation, the K-Means Clustering (KMC) approach is used. To create attributes from the segmented data, feature extraction is done using the Gray Level Co-Occurrence Matrix (GLCM). A Genetic Algorithm (GA) is employed to optimize the suggested method, selecting the best feature subsets to maximize performance.
Data set: This database includes a sample of 311 BLCS individuals with initial NSCLC who were treated at Massachusetts General Hospital (MGH) between 1999 and 2011. The majority of individuals received initial treatment for their condition. All procedures were performed in compliance with the organizational policies and standards of the hospitals. Information from each individual's pre-resection computed tomography (CT) scan was gathered. Additionally, information about these patients' overall and progression-free survival, cancer stage, and clinical and pathological findings was recorded[31].
Image preprocessing: The effectiveness of the training process might be increased by using various pre-processing techniques. Because preprocessing reduces the amount of noise present in the source CT scans, the image quality may be improved, which in turn improves the efficacy of the system. Consequently, in this study, image clarity is increased by using pre-processing techniques such as noise reduction and image enhancement.
3.2. Non-Local means (NLM) filter used for noise removal
The Non-Local Means (NLM) filter was suggested as a way to denoise CT images corrupted by undesired noise. The resemblance between a pixel's neighborhood arrangement and that of all the other pixels in a search region is taken into account by the NLM when estimating individual pixels. In Equation (1), the estimated pixel $\hat{v}(i)$ may be calculated as the weighted average of all the pixels in the distorted CT scan $v$:

$$\hat{v}(i) = \sum_{j \in I} w(i,j)\, v(j) \tag{1}$$

where $i$ is the pixel being restored and each $j \in I$ indexes a pixel whose surrounding patch is compared with the patch centered on $i$; the candidate patches in $I$ are those sufficiently near to the present patch in Euclidean distance. The weight $w(i,j)$ of every candidate in Equation (1) may be determined as follows:

$$w(i,j) = \frac{1}{Z(i)} \exp\!\left(-\frac{\lVert v(N_i) - v(N_j) \rVert_{2,a}^{2}}{h^{2}}\right) \tag{2}$$

where $N_i$ and $N_j$ denote the local frames (patches) in the noisy image centered on pixels $i$ and $j$, respectively, and $Z(i)$ is a normalizing constant so that the weights sum to one. The smoothing parameter $h$ regulates how quickly the exponential function decays. The Euclidean distance, weighted by a Gaussian kernel with zero mean and standard deviation $a$, is the only similarity measure used in Equation (2). Since global structural similarity is accounted for throughout denoising, in addition to local intensity information, NLM successfully suppresses undesired noise.
3.3. Image quality enhancement by Bi-Histogram Equalization (Bi-HE)
Bi-Histogram Equalization is offered as a method to improve CT scans. In the suggested Bi-HE procedure, the incoming intensity distribution is split into two sub-histograms at the histogram median, a criterion chosen for mean intensity preservation. To regulate the pace of enhancement, histogram trimming is done. The improved image is then created by equalizing and recombining the sub-histograms of the trimmed distribution. Bi-HE allows for more adaptable trimming limits by autonomously finding the lowest quantity among the data points, means, and percentile frequencies, which preserves more of the image's content. When doing histogram equalization, the problem of over-enhancing high-frequency segments is addressed by automatically choosing the trimming threshold.
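A minimal sketch of the median-split equalization described above might look as follows; the 8-bit intensity range is an assumption, and the histogram-trimming step is omitted for brevity:

```python
import numpy as np

def bi_histogram_equalization(img):
    """Bi-HE sketch: split the 8-bit histogram at the median intensity,
    equalize each sub-histogram independently, then recombine.
    (The histogram-trimming step from the text is omitted.)"""
    m = int(np.median(img))

    def equalize(vals, lo, hi):
        # Map the sub-histogram's CDF onto the sub-range [lo, hi].
        hist, _ = np.histogram(vals, bins=hi - lo + 1, range=(lo, hi + 1))
        cdf = np.cumsum(hist) / max(vals.size, 1)
        return lo + cdf * (hi - lo)

    out = img.astype(float).copy()
    low_mask, high_mask = img <= m, img > m
    lut_lo = equalize(img[low_mask], 0, m)
    out[low_mask] = lut_lo[img[low_mask].astype(int)]
    if high_mask.any():
        lut_hi = equalize(img[high_mask], m + 1, 255)
        out[high_mask] = lut_hi[img[high_mask].astype(int) - (m + 1)]
    return out.astype(np.uint8)

ct_like = (np.arange(64).reshape(8, 8) * 4 % 256).astype(np.uint8)
enhanced = bi_histogram_equalization(ct_like)
```

Because each half is equalized only within its own sub-range, pixels below the median stay below it and pixels above stay above it, which is how the method preserves the image's mean brightness better than global equalization.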
Segmentation: Image segmentation is a technique that is extensively utilized in digital image processing and analysis. The goal of image segmentation is to divide CT scans into various portions or regions, and the division is often determined by the properties of the pixels contained in the scans.
3.4. K-Means clustering (KMC)
A technique for grouping together a collection of data is called clustering, and k-means is one of the most often used clustering techniques. In k-means clustering, a given set of data is divided into k distinct clusters. The k-means technique comprises two distinct stages. The k centroids are calculated in the initial stage; in the second stage, every pixel is assigned to the cluster whose centroid is closest to it. The most frequently used way of determining proximity to the centroids is the Euclidean distance. Once the clustering is complete, the algorithm recalculates the centroid of every cluster, computes an updated Euclidean distance between every centroid and every data element, and assigns each element to the cluster whose centroid is at the shortest Euclidean distance. Each cluster in the partition is characterized by its member objects and its centroid. The centroid of each cluster is the point at which the sum of distances from all the objects in that cluster is minimized. K-means, then, is an iterative method that minimizes the total distance between each object and its cluster centroid over all clusters.
Let us assume an image with a resolution of x × y, and the CT scan has to be clustered into k different clusters. Let $c_k$ be the cluster centers and $p(x, y)$ be the input pixels to be clustered. The k-means clustering method is described in Algorithm 1.
The random choice of the starting centroids affects how well the clustering turns out in the end. As a consequence, if the starting centroids are picked arbitrarily, the outcome will vary with the starting centers. So that the desired segmentation may be achieved, the starting centers should be appropriately selected, depending on the number of data points, clusters, and iterations. Segmentation of the CT scan is thus done using KMC.
Algorithm 1 K-Means clustering
1: Establish the cluster count k and the initial centroids.
2: Using the relation shown in Equation (3), determine the Euclidean distance d between each centroid $c_k$ and every pixel $p(x, y)$ of the CT scan:
$$d = \lVert p(x, y) - c_k \rVert \tag{3}$$
3: Depending on the distance d, allocate each pixel to the centroid that is closest to it.
4: When all of the pixels have been allocated, recompute each centroid's location using the relation shown in Equation (4):
$$c_k = \frac{1}{|S_k|} \sum_{(x, y) \in S_k} p(x, y) \tag{4}$$
where $S_k$ is the set of pixels assigned to cluster k.
5: Continue the procedure until the tolerance or error threshold is achieved.
6: Reshape the cluster labels back into image form.
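The steps of Algorithm 1 can be sketched directly in NumPy for a grayscale image. The quantile-based initialization is an assumption made here for determinism, since the text notes that randomly chosen starting centroids change the outcome:

```python
import numpy as np

def kmeans_segment(img, k=2, iters=20):
    """Algorithm 1 on pixel intensities: assign each pixel to its
    nearest centroid, then recompute centroids until convergence."""
    pixels = img.reshape(-1, 1).astype(float)
    # Step 1: deterministic initial centroids (intensity quantiles).
    centroids = np.quantile(pixels, np.linspace(0, 1, k)).reshape(-1, 1)
    for _ in range(iters):
        # Step 2: Euclidean distance from every pixel to every centroid.
        d = np.abs(pixels - centroids.T)
        # Step 3: assign each pixel to the closest centroid.
        labels = d.argmin(axis=1)
        # Step 4: recompute each centroid as the mean of its members.
        new = np.array([pixels[labels == c].mean(axis=0)
                        if np.any(labels == c) else centroids[c]
                        for c in range(k)])
        # Step 5: stop once the centroids no longer move.
        if np.allclose(new, centroids):
            break
        centroids = new
    # Step 6: reshape the labels back into image form.
    return labels.reshape(img.shape), centroids.ravel()

# Toy "scan": dark background, bright right half, one dark outlier pixel.
img = np.zeros((10, 10)); img[:, 5:] = 200.0; img[3, 3] = 10.0
labels, centers = kmeans_segment(img, k=2)
```

On a CT slice the same call would separate bright (tissue/tumor) from dark (air) intensity groups; real pipelines typically cluster on more than raw intensity alone.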
Feature extraction: Feature extraction is the procedure of converting unprocessed images into quantitative features from the source database. In comparison with feeding ML the original data, autonomous feature extraction utilizes specific techniques to automatically retrieve characteristics from CT scans. This technique is useful when raw image data must be quickly handed off to artificial intelligence algorithms. "Gray Level Co-Occurrence Matrix (GLCM)" feature extraction was carried out in this study.
3.5. Gray level co-occurrence matrix (GLCM)
The identification and categorization of LC are addressed with the "Gray Level Co-Occurrence Matrix (GLCM)", a feature extraction method that works with the CT images obtained from the database. Creating a robust CT scan collection is often the initial stage of LC screening. The output of the GLCM used to categorize the illness includes texture characteristics for training and testing. All CT scans, including light, moderate, high, flow, and excess scans, are available in the CT scan collection. The varying grades of CT scan are used to create a more reliable and usable training and evaluation dataset for classification validation. The GLCM offers a second-order approach to feature extraction that determines the correlation between different grayscale values in the CT scan for given properties such as distance d and direction q. Gray levels (i, j) oriented at q = 0° and q = 180°, respectively, constitute the GLCM matrices. As a result, entries are formed at (i, j) and (j, i), and every GLCM is quantized to the quantity G of gray levels.
GLCM statistics may be used to produce a variety of image characteristics. The measurements require probabilities rather than raw counts before texture characteristics can be estimated. In Equation (5), the probabilistic measure is derived:

$$C_{ij} = \frac{P_{ij}}{\sum_{i,j=1}^{G} P_{ij}} \tag{5}$$

where Equation (6) defines the co-occurrence count $P_{ij}$ between gray levels $i$ and $j$:

$$P_{ij} = \bigl|\{(p, q) : g(p) = i,\ g(q) = j\}\bigr| \tag{6}$$

where $P_{ij}$ stands for how many pixel pairs $(p, q)$, separated by distance d in direction q, have gray levels $i$ and $j$ for a given set of d, q, and G factors.
The contrast, energy, homogeneity, and correlation of the gray-level co-occurrence probabilities are the texture descriptors produced by GLCM, defined in Equations (7)–(10). While energy in GLCM is the sum of the squared elements, contrast refers to the degree of local variation contained in the image. Energy is also called the angular second moment. The homogeneity descriptor indicates how closely the distribution of GLCM elements concentrates along the diagonal of the GLCM. Correlation, the final descriptor, shows how a pixel is associated with its neighbors throughout the entire image.

$$\text{Contrast} = \sum_{i,j=1}^{N} C_{ij}\,(i - j)^2 \tag{7}$$

$$\text{Energy} = \sum_{i,j=1}^{N} C_{ij}^{2} \tag{8}$$

$$\text{Homogeneity} = \sum_{i,j=1}^{N} \frac{C_{ij}}{1 + (i - j)^2} \tag{9}$$

$$\text{Correlation} = \sum_{i,j=1}^{N} \frac{(i - \mu)(j - \mu)\,C_{ij}}{\sigma^{2}} \tag{10}$$

where $C_{ij}$ is element $(i, j)$ of the normalized symmetric GLCM; $N$ is the number of gray levels in the image as determined by the GLCM's quantization; $\mu$ is the GLCM mean; and $\sigma^{2}$ is the variance of the intensities of all reference pixels in the relationships that contributed to the GLCM.
Numerous studies have examined various elements of co-occurrence texture characteristics in connection with the parameters G, d, and q, as well as their applicability in this context. These studies were carried out to provide suggestions for appropriate empirical system criteria. Given that directions of 0°, 45°, 90°, and 135° are considered by several investigators to produce more accurate categorization, the roles of G and d are investigated in this work.
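Equations (5)–(10) can be computed directly. The sketch below builds a symmetric, normalized GLCM for a single (d, q) offset and derives the four descriptors; the quantization to 8 gray levels and the horizontal offset are illustrative assumptions:

```python
import numpy as np

def glcm(img, dx=1, dy=0, levels=8):
    """Symmetric, normalized GLCM for one (d, q) offset, as in
    Eqs. (5)-(6). Assumes a non-empty image with img.max() > 0."""
    q = (img.astype(float) / img.max() * (levels - 1)).astype(int)
    P = np.zeros((levels, levels))
    h, w = q.shape
    for i in range(h - dy):
        for j in range(w - dx):
            P[q[i, j], q[i + dy, j + dx]] += 1   # Eq. (6): count pixel pairs
    P += P.T                                     # make the matrix symmetric
    return P / P.sum()                           # Eq. (5): counts -> probabilities

def glcm_features(P):
    """Contrast, energy, homogeneity and correlation, Eqs. (7)-(10)."""
    n = P.shape[0]
    i, j = np.mgrid[0:n, 0:n]
    mu = np.sum(i * P)                           # GLCM mean
    var = np.sum(P * (i - mu) ** 2)              # GLCM variance
    return {
        "contrast": np.sum(P * (i - j) ** 2),             # Eq. (7)
        "energy": np.sum(P ** 2),                         # Eq. (8)
        "homogeneity": np.sum(P / (1.0 + (i - j) ** 2)),  # Eq. (9)
        "correlation": np.sum((i - mu) * (j - mu) * P) / var,  # Eq. (10)
    }

checker = np.indices((8, 8)).sum(axis=0) % 2 * 255   # high-contrast test pattern
features = glcm_features(glcm(checker))
```

A full feature vector would concatenate these four descriptors over the offsets mentioned in the text (0°, 45°, 90°, 135°) before passing them to the feature selection stage.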
Feature selection: Feature selection is a method of image quantization that includes selecting merely the most crucial traits from a complete collection and dismissing the rest. Such a method chooses features based on the general attributes of the training data. For feature selection, the Genetic Algorithm (GA) method is suggested here. The procedure of the genetic algorithm is shown in Figure 4.
Figure 4. Procedure of genetic algorithm.
3.6. Genetic Algorithm (GA)
When a genetic algorithm is used as a feature-selection strategy, all features are initially taken into
consideration. The primary objective of feature selection is to improve overall efficiency through this
refinement. The fundamental task when using a genetic algorithm for feature selection is to produce the
optimal feature subset according to the fitness function. GA begins by initializing the population at random.
In each generation, new individuals are chosen based on their fitness scores; rank-based fitness selection
was used in this investigation. After fitness assignment, the selection mechanism chooses the CT images that
recombine with the remaining individuals to create the next generation. CT images are selected stochastically
through roulette-wheel selection. The crossover operator, which is responsible for creating new, fitter
subsets (the offspring), is applied to two randomly selected parents. The mutation operator, in turn,
randomly flips a certain number of features in an individual to preserve diversity in the search for the
fittest subset.
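The GA loop just described (random initialization, fitness scoring, roulette-wheel selection, single-point crossover, and mutation) can be sketched as follows. The binary feature masks, the toy fitness function, and all parameter values are illustrative assumptions, not the paper's settings, and this sketch uses a simple score-proportional roulette wheel rather than the rank-based variant the paper mentions.

```python
import numpy as np

rng = np.random.default_rng(0)
N_FEATURES, POP, GENS = 20, 12, 30
informative = {2, 5, 11}                  # toy "useful" features (assumed)

def fitness(mask):
    # Toy surrogate: reward keeping informative features, penalize subset size.
    hits = sum(mask[i] for i in informative)
    return hits - 0.05 * mask.sum()

pop = rng.integers(0, 2, size=(POP, N_FEATURES))   # random initialization
for _ in range(GENS):
    scores = np.array([fitness(m) for m in pop])
    probs = scores - scores.min() + 1e-9
    probs /= probs.sum()                            # roulette-wheel weights
    parents = pop[rng.choice(POP, size=POP, p=probs)]
    cut = rng.integers(1, N_FEATURES)               # single-point crossover
    children = np.concatenate([parents[::2, :cut], parents[1::2, cut:]], axis=1)
    children = np.vstack([children, parents[:POP - len(children)]])
    flip = rng.random(children.shape) < 0.02        # mutation: flip a few bits
    pop = np.where(flip, 1 - children, children)

best = pop[np.argmax([fitness(m) for m in pop])]    # fittest feature subset
```

In the paper's pipeline the fitness function would be a classifier score on the GLCM features rather than this synthetic surrogate.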
LC detection utilizing VGG-16 and Multi-Class Support Vector Machine (VGG-16+MSVM):
The well-known pretrained DL model VGG-16 serves as the foundation of the suggested methodology. We
recommend the VGG-16 model for two reasons. First, compared with its counterpart, the VGG-19 framework, it
extracts features at a lower cost by employing small kernels and fewer layers, which is well suited to CT
images. Second, it offers strong feature-extraction capability for classifying CT scans for LC. We employ
fine-tuning, one of the transfer-learning strategies: during fine-tuning, the VGG-16 classifier is
initialized with the pre-trained ImageNet weights. Given that only a limited number of CT images are
available for training, this helps overcome the overfitting issue. The "Attention module, Convolution
module, FC-layers, and Softmax classifier" are the four key building blocks of the suggested approach (also
called "Attention-based VGG-16"). Figure 5 displays the comprehensive schematic diagram of the suggested
architecture.
Figure 5. Schematic diagram of the VGG-16 architecture.
3.7. Attention module
This component captures the relationships among the visual cues contained in lung CT scans; an attention
mechanism is utilized for this purpose. The source tensor, which corresponds to the fourth pooling layer of
the VGG-16 framework, is subjected to both max-pooling and mean-pooling operations. The two resulting 2D
tensors, the max-pooled map and the mean-pooled map, are then concatenated and convolved with a filter of
size f, and the result is passed through the sigmoid function σ(·). Equation (11) defines the resulting
attention map M_s(F):
M_s(F) = \sigma\left(f\left(\left[F^{s}_{avg}; F^{s}_{max}\right]\right)\right)                  (11)
where F^{s}_{avg} and F^{s}_{max} \in R^{1 \times H \times W} denote the 2D maps obtained by mean pooling
and max pooling the source tensor F, respectively. Here, the terms H and W stand for the map's height and
width, correspondingly.
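A minimal NumPy sketch of this attention step (channel-wise mean and max pooling, concatenation, a small convolution, then a sigmoid) is given below. The 3×3 kernel size, the random weights, and the pool-4-like tensor shape are illustrative assumptions, since the paper does not fix them here.

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def conv2d_same(x, k):
    """Naive 'same' 2D cross-correlation for a (C, H, W) input and (C, kh, kw) kernel."""
    C, H, W = x.shape
    kh, kw = k.shape[1:]
    ph, pw = kh // 2, kw // 2
    xp = np.pad(x, ((0, 0), (ph, ph), (pw, pw)))
    out = np.zeros((H, W))
    for i in range(H):
        for j in range(W):
            out[i, j] = np.sum(xp[:, i:i + kh, j:j + kw] * k)
    return out

def spatial_attention(F, kernel):
    """M_s(F) = sigmoid(f([mean-pool(F); max-pool(F)])), pooling over channels."""
    avg = F.mean(axis=0, keepdims=True)          # (1, H, W) mean-pooled map
    mx = F.max(axis=0, keepdims=True)            # (1, H, W) max-pooled map
    stacked = np.concatenate([avg, mx])          # (2, H, W) concatenation
    return sigmoid(conv2d_same(stacked, kernel)) # (H, W) attention map

F = rng.standard_normal((512, 14, 14))           # VGG-16 pool-4-style tensor (assumed shape)
k = rng.standard_normal((2, 3, 3)) * 0.1         # assumed 3x3 conv weights
M = spatial_attention(F, k)
attended = F * M                                 # broadcast attention over channels
```

In a real network the convolution weights would be learned; here they only demonstrate the data flow of Equation (11).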
3.8. Convolution module
In the approach, the convolution module operates on the fourth pooling layer of the VGG-16 architecture.
This size-independent convolutional component captures the important cues in the CT scans. The intriguing
cues are recovered from the mid-level fourth pooling layer, which is better suited to CT scans; because such
scans are neither very generic nor very specific, the features from other layers (higher or lower) are less
appropriate for CT imaging. As a result, we first feed the fourth pooling layer into the attention module,
and the output of that module is then combined with the original fourth pooling layer.
3.9. The fully connected (FC) layer
We employ fully connected layers to represent the concatenated features obtained from the convolution and
attention blocks as one-dimensional (1D) information. As shown in Figure 2, this block has three
components: flatten, dropout, and dense. In this technique, the dense layer is set to 256 units and the
dropout rate is fixed at 0.5.
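A minimal NumPy forward pass through this FC block (flatten, dropout at rate 0.5, dense layer of 256 units) is sketched below. The input tensor shape, the ReLU activation, and the random weights are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np

rng = np.random.default_rng(2)

def fc_head(x, W, b, drop_rate=0.5, training=True):
    """Flatten -> dropout -> dense(256), mirroring the FC block of the model."""
    flat = x.reshape(-1)                          # flatten to 1D
    if training:                                  # inverted dropout at rate 0.5
        mask = (rng.random(flat.shape) >= drop_rate) / (1.0 - drop_rate)
        flat = flat * mask
    return np.maximum(0.0, W @ flat + b)          # dense layer with ReLU (assumed)

x = rng.standard_normal((8, 4, 4))                # toy feature tensor (assumed shape)
W = rng.standard_normal((256, x.size)) * 0.01     # dense layer fixed at 256 units
b = np.zeros(256)
h = fc_head(x, W, b)                              # 256-dimensional representation
```

At inference time the dropout branch is skipped (`training=False`), which is why inverted dropout rescales by 1/(1 − rate) during training.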
3.10. Softmax classifier
We employ the softmax layer to classify the features collected from the FC layers. The number of units in
the softmax layer, the final dense layer, depends on the number of classes in the database (e.g., three for
datasets with three groups, four for datasets with four types, etc.). Depending on the classification
performed, the softmax layer produces a probability distribution over the classes:
P(b) = \frac{e^{z_{b}}}{\sum_{c} e^{z_{c}}}                  (12)
Equation (12) defines this probability, where b and c index, respectively, the class whose probability is
produced by the softmax layer and a particular class of the database utilized in our suggested technique.
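The softmax computation described above can be sketched in a few lines; the three-class logits below are illustrative values, and the max-subtraction is a standard trick for numerical stability.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax: subtract the max before exponentiating."""
    e = np.exp(z - np.max(z))
    return e / e.sum()

logits = np.array([2.0, 1.0, 0.1])   # final dense-layer outputs, one per class (illustrative)
probs = softmax(logits)              # probability distribution over the classes
```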
3.11. Multi-Class Support Vector Machine
A supervised ML technique known as the "Multi-Class Support Vector Machine (MSVM)" is useful for both
classification and regression problems. It searches for support vectors, i.e., optimal boundaries, between
several categories. The MSVM classifier is used to accurately classify CT lung scans once the features from
the prior step have been selected. A classification problem is considered using a collection of n examples:
the example set is denoted X = {x_1, x_2, ..., x_n} and the corresponding target-class vector is denoted
y = {y_1, y_2, ..., y_n} with y_e ∈ {−1, +1}. To discriminate between negative (−1) and positive (+1)
instances while exploring support vectors, training entails the optimization problem illustrated in
Equation (13).
\min_{w, b, \xi} \; \frac{1}{2}\|w\|^{2} + C \sum_{e=1}^{n} \xi_{e}                  (13)
Considering the restrictions:
y_{e}\left(w^{T}\phi(x_{e}) + b\right) \ge 1 - \xi_{e}, \quad \xi_{e} \ge 0, \quad e = 1, \ldots, n                  (14)
where C stands for a penalty (compensation) factor, \phi(x_{e}) denotes the mapping of the e-th source
vector to a higher-dimensional feature space, and \xi_{e} stands for the slack variable, whose value is ≥ 0,
used to quantify the classification error. In the higher-dimensional space the problem becomes simply and
linearly separable, and the transformation depends on the MSVM kernel operator:
K(x_{e}, x_{f}) = \phi(x_{e})^{T}\phi(x_{f})                  (15)
The kernel operation is essential: it maps the original data so that it becomes linearly separable, and
dimensionality concerns can thus be readily eliminated. From Equation (15), many different kinds of kernel
operators can be constructed, including the radial basis function (RBF), quadratic, and polynomial kernels:
K(x_{e}, x_{f}) = \exp\left(-\gamma \|x_{e} - x_{f}\|^{2}\right)                  (16)
K(x_{e}, x_{f}) = \left(x_{e}^{T} x_{f} + 1\right)^{2}                  (17)
K(x_{e}, x_{f}) = \left(x_{e}^{T} x_{f} + 1\right)^{d}                  (18)
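As a concrete sketch, scikit-learn's `SVC` trains the soft-margin objective above and handles multiple classes via one-vs-one voting; setting `coef0=1` on the polynomial kernels matches the (x_e^T x_f + 1)^d form of Equations (17) and (18). The synthetic data and hyperparameter values (C, gamma, degree) are illustrative assumptions, not the paper's settings.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Toy stand-in for the selected CT-image feature vectors (assumed data).
X, y = make_classification(n_samples=300, n_features=10, n_informative=6,
                           n_classes=3, n_clusters_per_class=1, random_state=0)
Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.3, random_state=0)

# RBF, quadratic, and polynomial kernels, mirroring Eqs. (16)-(18).
models = {
    "rbf": SVC(kernel="rbf", C=1.0, gamma="scale"),
    "quadratic": SVC(kernel="poly", degree=2, gamma=1.0, coef0=1.0, C=1.0),
    "polynomial": SVC(kernel="poly", degree=3, gamma=1.0, coef0=1.0, C=1.0),
}
scores = {name: m.fit(Xtr, ytr).score(Xte, yte) for name, m in models.items()}
```

Kernel choice and C would normally be tuned by cross-validation on the selected features rather than fixed as here.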
Thus, LC detection from CT scans was performed by a hybrid VGG-16 and Multi-Class Support Vector
Machine (VGG-16+MSVM) to enhance early-detection accuracy.
4. Result and discussion
A technique based on ML and DL is employed to classify CT scans for the presence of LC. This paper
suggests the use of the Visual Geometry Group (VGG-16) and Multi-Class Support Vector Machine (VGG-
16+MSVM) approach for the precise and early identification of LC. The efficacy of the suggested approach is
covered in this section. The recommended system’s capacity to achieve Accuracy, Precision, Recall, F1-Score,
Sensitivity, and Specificity justifies its adoption. The classic approaches used for comparison include artificial
intelligence (AI), “denoising first two-path convolutional neural networks (DFD-Net)”, “fusion-based
convolutional fuzzy neural networks (F-CFNN)” and “Wilcoxon Signed Generative DL (WS-GDL)”.
Accuracy: The model's accuracy is obtained by dividing the number of correct VGG-16+MSVM detections by the
total number of predictions. An accurate evaluation of VGG-16+MSVM must consider both positive and negative
outcomes. Accuracy therefore measures how closely the predicted results agree with the true results, and it
was calculated as the ratio of correct predictions to all predictions. When compared with other conventional
procedures, our recommended method gives a high degree of accuracy in detecting LC. Figure 6 presents an
evaluation of the accuracy.
Figure 6. Comparative analysis of Accuracy with existing methods.
Precision: Precision is the proportion of detections that correctly identify LC, i.e., the fraction of
positive detections that concentrate on genuine LC findings. Precision can be regarded as a criterion of
quality: the average probability of a correct detection. Consequently, the presently recommended procedure
is more precise than those used previously. The precision of the recommended method is shown in Figure 7.
Recall: Recall is one of the characteristics taken into account when assessing medical approaches. It is
the percentage of actual LC cases reliably diagnosed from CT scans and is often referred to as the
true-positive rate. Figure 8 demonstrates that the model has a high recall rate; as a result, the
recommended strategy is more effective than the existing techniques.
F1-Score: The F1 score is the harmonic mean of the precision and recall scores, so the two components have
an equal impact on the outcome. The method we suggest provides a high F1 score in detecting LC in contrast
to the other methods that have previously been employed. Figure 9 depicts the F1-scores for the proposed
methodology and the previously used methods.
Figure 7. Comparative analysis of precision with existing methods.
Figure 8. Comparative analysis of recall with the existing method.
Figure 9. Comparative analysis of F1-Score with existing methods.
Specificity: Specificity is the capacity of a test to return a negative result for a person who does not
have the disease. It demonstrates the efficiency of LC detection and the effectiveness of the suggested
approach. Figure 10 illustrates the specificity of the suggested technique; the recommended method is thus
more effective than the traditional ones.
Figure 10. Comparative analysis of specificity with existing methods.
Sensitivity: Sensitivity is the capacity of a diagnostic test to identify a diseased person as positive.
The proportion of segmented CT scans that yield a correct positive result when the assessment is employed,
the true-positive rate, is also called the sensitivity of LC detection. Figure 11 displays the sensitivity
of the suggested strategy; as a result, the recommended method is more effective than the existing systems.
Figure 11. Comparative analysis of sensitivity with existing methods.
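All six measures above can be derived from the entries of a confusion matrix; the sketch below uses illustrative counts, not the paper's reported results.

```python
def binary_metrics(tp, fp, fn, tn):
    """Compute the six evaluation measures from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)            # a.k.a. sensitivity / true-positive rate
    specificity = tn / (tn + fp)       # true-negative rate
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return dict(accuracy=accuracy, precision=precision, recall=recall,
                sensitivity=recall, specificity=specificity, f1=f1)

# Illustrative counts for a binary LC-vs-normal split (assumed, not from the paper).
m = binary_metrics(tp=90, fp=10, fn=5, tn=95)
```

For the multi-class setting these quantities are computed per class and then averaged (e.g., macro-averaging).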
Discussion: LC is one of the most fatal malignancies and one with a high prevalence in the community.
Chaunzwa et al.[32] therefore proposed an artificial-intelligence-based diagnostic method to determine when
lung nodular micro-calcification first occurs, which may help physicians and surgeons forecast it accurately
using CT-scan processing techniques. The morphology of nodules, such as contour and dimensions, and image
distortion have an indirect and complicated connection with cancer; for this reason, a detailed
investigation of every suspicious nodule and the integration of data from every lesion are necessary. To
tackle the difficulty of LC diagnosis, Sori et al.[34] developed the "denoising first two-path convolutional
neural network (DFD-Net)", a model made up entirely of denoising and detection components. Enhancing CT's
ability to accurately diagnose or identify lung disease is a difficult challenge; accordingly, the
"fusion-based convolutional fuzzy neural network (F-CFNN)" proposed by Lin and Yang[35] recognizes and
categorizes CT scans. The convolutional fuzzy neural network (CFNN) in the F-CFNN combines two convolutional
layers, max pooling, and a fuzzy neural network to extract information and produce reliable categorization
results. Another study[33] introduced the "Wilcoxon Signed Generative DL (WS-GDL)" approach for identifying
LC: first, information-gain testing and relevance analysis remove superfluous and unimportant attributes and
recover numerous essential and informative characteristics; the deep features are then learned by a
generator-function technique, which consumes more energy. Overall, these techniques achieve lower detection
rates and classification accuracy than the proposed approach.
5. Conclusion
The deadliest disease, LC, harms men and women alike. It is a dangerous kind of cancer that is difficult to
diagnose. LC may develop in the main airways, the windpipe, the lungs, or elsewhere. Uncontrolled growth and
proliferation of certain cells in the lungs cause LC. Early detection of these disorders is one of the most
crucial actions that can be taken to prevent human death. Thus, a
methodology based on ML and DL is used to classify CT images for the presence of LC. Therefore, we
presented a VGG-16 and Multi-Class Support Vector Machine (VGG16+MSVM) to improve the detection
accuracy of LC prediction using CT scans. In terms of Accuracy, Precision, Recall, F1-Score, Sensitivity, and
Specificity, the suggested approach performed better. These criteria were compared against traditional
methods such as the Wilcoxon Signed Generative DL (WS-GDL), the denoising-first two-path convolutional
neural network (DFD-Net), and the fusion-based convolutional fuzzy neural network (F-CFNN). The comparison
demonstrates that the suggested method is more
successful in detecting LC. The next potential focus of the proposed study may be the application of
optimization approaches to enhance performance indicators like computation speed and detection quality.
Author contributions
Conceptualization, SKH and SB; methodology, SKH; software, KVD; validation, KM, SB and KVD;
formal analysis, KLK; investigation, PS; resources, VDJ; data curation, VDJ; writing—original draft
preparation, KM; writing—review and editing, KLK; visualization, PS; supervision, KVD; project
administration, SKH; funding acquisition, SB. All authors have read and agreed to the published version of
the manuscript.
Conflict of interest
The authors declare no conflict of interest.
References
1. Abdullah DM, Ahmed NS. A review of most recent LC detection techniques using machine learning. International
Journal of Science and Business. 2021; 5(3): 159-173.
2. Wang X, Guo Y, Liu L, et al. YAP1 protein expression has variant prognostic significance in small cell lung
cancer (SCLC) stratified by histological subtypes. Lung Cancer. 2021; 160: 166-174. doi:
10.1016/j.lungcan.2021.06.026
3. Sardarabadi P, Kojabad AA, Jafari D, et al. Liquid Biopsy-Based Biosensors for MRD Detection and Treatment
Monitoring in Non-Small Cell Lung Cancer (NSCLC). Biosensors. 2021; 11(10): 394. doi: 10.3390/bios11100394
4. Saab MM, McCarthy M, O’Driscoll M, et al. A systematic review of interventions to recognise, refer and diagnose
patients with lung cancer symptoms. npj Primary Care Respiratory Medicine. 2022; 32(1). doi: 10.1038/s41533-
022-00312-9
5. Wang X, Ricciuti B, Nguyen T, et al. Association between Smoking History and Tumor Mutation Burden in
Advanced Non–Small Cell Lung Cancer. Cancer Research. 2021; 81(9): 2566-2573. doi: 10.1158/0008-5472.can-
20-3991
6. Xu K, Zhang C, Du T, et al. Progress of exosomes in the diagnosis and treatment of lung cancer. Biomedicine &
Pharmacotherapy. 2021; 134: 111111. doi: 10.1016/j.biopha.2020.111111
7. Xu Y, Wang Y, Razmjooy N. Lung cancer diagnosis in CT images based on Alexnet optimized by modified
Bowerbird optimization algorithm. Biomedical Signal Processing and Control. 2022; 77: 103791. doi:
10.1016/j.bspc.2022.103791
8. Dunke SR, Tarade SS. LC Detection Using Deep Learning. International Journal of Research Publication and
Reviews.
9. Shanthi S, Rajkumar N. Lung Cancer Prediction Using Stochastic Diffusion Search (SDS) Based Feature Selection
and Machine Learning Methods. Neural Processing Letters. 2020; 53(4): 2617-2630. doi: 10.1007/s11063-020-
10192-0
10. Lin CJ, Yang TY. A Fusion-Based Convolutional Fuzzy Neural Network for LC Classification. International
Journal of Fuzzy Systems. 2022; 1-17.
11. Alyami J, Khan AR, Bahaj SA, et al. Microscopic handcrafted features selection from computed tomography scans
for early stage lungs cancer diagnosis using hybrid classifiers. Microscopy Research and Technique. 2022; 85(6):
2181-2191. doi: 10.1002/jemt.24075
12. Desai U, Kamath S, Shetty AD, Prabhu MS. Computer-Aided Detection for Early Detection of LC Using CT
Images. In: Intelligent Sustainable Systems. Springer; 2022.
13. Primakov SP, Ibrahim A, van Timmeren JE, et al. Automated detection and segmentation of non-small cell lung
cancer computed tomography images. Nature Communications. 2022; 13(1). doi: 10.1038/s41467-022-30841-3
14. Tumuluru P, Hrushikesava Raju S, Santhi MVBT, et al. Smart LC Detector Using a Novel Hybrid for Early
Detection of LC. In: Inventive Communication and Computational Technologies. Springer; 2022.
15. Bai Y, Li D, Duan Q, et al. Analysis of high-resolution reconstruction of medical images based on deep
convolutional neural networks in lung cancer diagnostics. Computer Methods and Programs in Biomedicine. 2022;
217: 106592. doi: 10.1016/j.cmpb.2021.106592
16. Selvapandian A, Prabhu SN, Sivakumar P, et al. Lung Cancer Detection and Severity Level Classification Using
Sine Cosine Sail Fish Optimization Based Generative Adversarial Network with CT Images. The Computer
Journal. 2021; 65(6): 1611-1630. doi: 10.1093/comjnl/bxab141
17. Manoharan H, Rambola RK, Kshirsagar PR, et al. Aerial Separation and Receiver Arrangements on Identifying
Lung Syndromes Using the Artificial Neural Network. Computational Intelligence and Neuroscience. 2022; 2022:
1-8. doi: 10.1155/2022/7298903
18. Sutedja G. New techniques for early detection of lung cancer. European Respiratory Journal. 2003; 21(Supplement
39): 57S-66s. doi: 10.1183/09031936.03.00405303
19. Teixeira VH, Pipinikas CP, Pennycuick A, et al. Deciphering the genomic, epigenomic, and transcriptomic
landscapes of pre-invasive lung cancer lesions. Nature Medicine. 2019; 25(3): 517-525. doi: 10.1038/s41591-018-
0323-0
20. Jain D, Roy-Chowdhuri S. Molecular Pathology of Lung Cancer Cytology Specimens: A Concise Review.
Archives of Pathology & Laboratory Medicine. 2018; 142(9): 1127-1133. doi: 10.5858/arpa.2017-0444-ra
21. Dong Z, Li H, Zhou J, et al. The value of cell block based on fine needle aspiration for lung cancer diagnosis.
Journal of Thoracic Disease. 2017; 9(8): 2375-2382. doi: 10.21037/jtd.2017.07.91
22. Peng J, Pond G, Donovan E, et al. A Comparison of Radiation Techniques in Patients Treated With Concurrent
Chemoradiation for Stage III Non-Small Cell Lung Cancer. International Journal of Radiation
Oncology*Biology*Physics. 2020; 106(5): 985-992. doi: 10.1016/j.ijrobp.2019.12.027
23. Lindeman NI, Cagle PT, Aisner DL, et al. Updated Molecular Testing Guideline for the Selection of Lung Cancer
Patients for Treatment with Targeted Tyrosine Kinase Inhibitors. Journal of Thoracic Oncology. 2018; 13(3): 323-
358. doi: 10.1016/j.jtho.2017.12.001
24. Kadir T, Gleeson F. Lung cancer prediction using machine learning and advanced imaging techniques.
Translational Lung Cancer Research. 2018; 7(3): 304-312. doi: 10.21037/tlcr.2018.05.15
25. Schwyzer M, Ferraro DA, Muehlematter UJ, et al. Automated detection of lung cancer at ultralow dose PET/CT
by deep neural networks – Initial results. Lung Cancer. 2018; 126: 170-173. doi: 10.1016/j.lungcan.2018.11.001
26. Li RY, Liang ZY. Circulating tumor DNA in lung cancer: real-time monitoring of disease evolution and treatment
response. Chinese Medical Journal. 2020; 133(20): 2476-2485. doi: 10.1097/cm9.0000000000001097
27. Mazzone PJ, Silvestri GA, Patel S, et al. Screening for Lung Cancer. Chest. 2018; 153(4): 954-985. doi:
10.1016/j.chest.2018.01.016
28. Josipovic M, Aznar MC, Thomsen JB, et al. Deep inspiration breath hold in locally advanced lung cancer
radiotherapy: validation of intrafractional geometric uncertainties in the INHALE trial. The British Journal of
Radiology. 2019; 92(1104). doi: 10.1259/bjr.20190569
29. Pastorino U, Silva M, Sestini S, et al. Prolonged lung cancer screening reduced 10-year mortality in the MILD
trial: new confirmation of lung cancer screening efficacy. Annals of Oncology. 2019; 30(7): 1162-1169. doi:
10.1093/annonc/mdz117
30. Teo PT, Bajaj A, Randall J, et al. Deterministic small‐scale undulations of image‐based risk predictions from the
deep learning of lung tumors in motion. Medical Physics; 2022.
31. Ajitha E, Diwan B, Roshini M. LC Prediction using Extended KNN Algorithm. In: 2022 6th International
Conference on Computing Methodologies and Communication (ICCMC).
32. Chaunzwa TL, Hosny A, Xu Y, et al. Deep learning classification of lung cancer histology using CT images.
Scientific Reports. 2021; 11(1). doi: 10.1038/s41598-021-84630-x
33. Maurer A. An Early Prediction of LC using CT Scan Images. Journal of Computing and Natural Science. 2021;
39-44.
34. Sori WJ, Feng J, Godana AW, et al. DFD-Net: lung cancer detection from denoised CT scan image using deep
learning. Frontiers of Computer Science. 2020; 15(2). doi: 10.1007/s11704-020-9050-z
35. Lin CJ, Yang TY. A Fusion-Based Convolutional Fuzzy Neural Network for LC Classification. International
Journal of Fuzzy Systems. 2022.