Genetic Algorithm and Machine Learning

Abstract

A genetic algorithm is a search procedure modelled on natural evolution. It solves optimization problems using operators inspired by biology: inheritance, mutation, selection, and crossover. In essence, it is an efficient, parallel, and global search method that continuously acquires and accumulates knowledge about the search space and uses it to steer the search toward the best result. Traditional multilevel association rule mining techniques generate a large number of candidate itemsets and compare them against the whole database. Much of this mining work is wasted, and it incurs a significant computational cost. Genetic algorithms provide a novel technique for tackling these sorts of problems.
Copyright © 2023, IGI Global. Copying or distributing in print or electronic forms without written permission of IGI Global is prohibited.
Chapter 9
DOI: 10.4018/978-1-6684-5656-9.ch009
Radha Raman Chandan
https://orcid.org/0000-0001-7382-6143
School of Management Sciences, India
Sarita Soni
https://orcid.org/0000-0001-9193-1545
Government Polytechnic, Bighapur,
India
Atul Raj
Government Polytechnic, Sikandra,
India
Vivek Veeraiah
Adichunchanagiri University, India
Dharmesh Dhabliya
Vishwakarma Institute of Information
Technology, India
Sabyasachi Pramanik
https://orcid.org/0000-0002-9431-8751
Haldia Institute of Technology, India
Ankur Gupta
https://orcid.org/0000-0002-4651-5830
Vaish College of Engineering, India
INTRODUCTION
John Holland introduced the Genetic Algorithm (GA) in the 1970s (Katoch et al.,
2021). It is modelled on the hereditary mechanism of living organisms. According
to Charles Darwin's theory of natural selection and the "survival of the fittest,"
living things evolved over many generations. A GA is an adaptive procedure that is
used to tackle problems involving search and optimization (Hashim et al., 2021).
The outlined technique creates a succession of new generations; when, after a large
number of generations, no better solution can be found, the current best solution
is regarded as the final outcome.
A large body of data can be divided into small groups, each of which may be
treated as a population. Repeatedly applying the genetic operators to the
population is the most effective way to improve the current solution. Search is a
problem-solving process in which the sequence of actions that leads to a solution
cannot be predicted in advance; success depends on how well the search operators
are applied. A good search mechanism should be capable of searching both locally
and randomly. Local search exploits the neighbourhood of the current solution to
improve it as far as possible, while random search (Wu et al., 2021) explores
the whole solution space and is effective at avoiding local optima.
Figure 1. Genetic algorithm flow chart
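The generational loop described above can be sketched in a few lines of Python. This is a minimal illustration rather than the chapter's implementation: the OneMax objective (the number of 1-bits), tournament selection, single-point crossover, elitism, and every parameter value are assumptions chosen for brevity.

```python
import random

def one_max(bits):
    """Toy fitness (an assumption for illustration): the number of 1-bits."""
    return sum(bits)

def tournament(population, fitness, rng, k=3):
    """Selection: return the fittest of k randomly drawn individuals."""
    return max(rng.sample(population, k), key=fitness)

def crossover(a, b, rng):
    """Single-point crossover of two parents."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]

def mutate(bits, rng, rate=0.02):
    """Flip each bit independently with probability `rate`."""
    return [1 - b if rng.random() < rate else b for b in bits]

def genetic_algorithm(fitness, n_bits=20, pop_size=30, generations=40, seed=0):
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    history = []  # best fitness seen in each generation
    for _ in range(generations):
        best = max(population, key=fitness)
        history.append(fitness(best))
        children = [best]  # elitism: the current best survives unchanged
        while len(children) < pop_size:
            parent1 = tournament(population, fitness, rng)
            parent2 = tournament(population, fitness, rng)
            children.append(mutate(crossover(parent1, parent2, rng), rng))
        population = children
    best = max(population, key=fitness)
    history.append(fitness(best))
    return best, history

best, history = genetic_algorithm(one_max)
print(one_max(best), history[0], history[-1])
```

Because the best individual is always carried into the next generation, the recorded best fitness can never decrease, which mirrors the idea that the search stops once no better solution is found.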
Procedure
1. Use the Genetic K-Means Algorithm
In addition to parallel implementations of genetic algorithms, the search ability
of the genetic algorithm can be combined with the K-Means method for clustering.
The resulting clustering (Pramanik et al., 2020) is closer to optimal than that of
plain K-Means; however, it typically comes with certain drawbacks.
PRINCIPAL COMPONENT ANALYSIS (PCA)
In PCA, the dimensionality of the data is reduced in order to make computation
easier. Consider two-dimensional data to see how PCA (Pramanik et al., 2021)
works. When the data is plotted on a graph, it occupies two axes. PCA transforms
the data so that it can be represented in a single dimension. This is shown
in Figure 2, and PCA pseudocode is presented in Figure 3.
Figure 2. Shows data visualisation before and after PCA
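The two-dimensional case described above can be worked through directly. The sketch below is an illustration under stated assumptions, not the pseudocode of Figure 3: the function name `pca_2d_to_1d` and the sample points are hypothetical, and the closed-form angle is a shortcut that only applies to a 2x2 covariance matrix.

```python
import math

def pca_2d_to_1d(points):
    """Project 2-D points onto their first principal component."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    centred = [(x - mean_x, y - mean_y) for x, y in points]
    # Entries of the 2x2 sample covariance matrix.
    cxx = sum(x * x for x, _ in centred) / (n - 1)
    cyy = sum(y * y for _, y in centred) / (n - 1)
    cxy = sum(x * y for x, y in centred) / (n - 1)
    # Direction of maximum variance for a symmetric 2x2 matrix.
    theta = 0.5 * math.atan2(2 * cxy, cxx - cyy)
    axis = (math.cos(theta), math.sin(theta))
    # One coordinate per point: its position along the principal axis.
    scores = [x * axis[0] + y * axis[1] for x, y in centred]
    return axis, scores

# Hypothetical data lying close to the line y = 2x.
points = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9), (4.0, 8.0), (5.0, 10.1)]
axis, scores = pca_2d_to_1d(points)
print(axis, scores)
```

Each point is reduced from two coordinates to a single score, and the principal axis points along the direction in which the data varies most.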
Semi-Supervised Learning
Semi-supervised learning combines supervised and unsupervised learning (Dutta et
al., 2021) strategies. It is useful in areas of machine learning and data mining
(Samanta et al., 2021) where unlabeled data is already available but obtaining
labelled data is tedious. Semi-supervised learning is divided into several
categories. The following are some examples:
Generative Models
A generative model assumes a joint distribution of the form
p(x, y) = p(y) p(x | y), where p(x | y) is a mixture distribution such as a
Gaussian mixture model. The mixture components can be identified from the
unlabeled data. To determine the mixture distribution, just one labelled
instance per component is then required.
Self-Training
In self-training, a classifier is first trained on a portion of labelled data. The
classifier is then given unlabeled data to classify. After that, the unlabeled points
and their predicted labels are added to the training set. This procedure is then
repeated. Self-training gets its name from the fact that the classifier teaches itself.
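The self-training loop can be sketched as follows. This is a minimal illustration: the nearest-centroid base classifier, the confidence margin `min_margin`, and the one-dimensional toy data are all assumptions, since the text does not fix a particular classifier or stopping rule. The sketch needs at least two classes in the labelled data.

```python
def fit_centroids(xs, ys):
    """Train a nearest-centroid classifier on 1-D data: one mean per class."""
    centroids = {}
    for label in set(ys):
        members = [x for x, y in zip(xs, ys) if y == label]
        centroids[label] = sum(members) / len(members)
    return centroids

def predict(centroids, x):
    """Return (label, margin); margin is how much closer the best centroid is."""
    ranked = sorted(centroids, key=lambda c: abs(x - centroids[c]))
    margin = abs(x - centroids[ranked[1]]) - abs(x - centroids[ranked[0]])
    return ranked[0], margin

def self_train(xs, ys, unlabeled, min_margin=1.0, max_rounds=10):
    xs, ys, pool = list(xs), list(ys), list(unlabeled)
    for _ in range(max_rounds):
        model = fit_centroids(xs, ys)
        # Keep only the predictions the current classifier is confident about.
        confident = []
        for x in pool:
            label, margin = predict(model, x)
            if margin >= min_margin:
                confident.append((x, label))
        if not confident:
            break
        # Add the points and their predicted labels to the training set.
        for x, label in confident:
            xs.append(x)
            ys.append(label)
            pool.remove(x)
    return fit_centroids(xs, ys), pool

model, remaining = self_train([0.0, 10.0], ["a", "b"], [1.0, 2.0, 4.9, 8.0, 9.0])
print(model, remaining)
```

In this run the ambiguous point 4.9 never clears the margin, so it stays unlabeled while the other four points are absorbed into the training set.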
Transductive SVM
The Transductive Support Vector Machine (TSVM) (Dushyant et al., 2022) is an
extension of the support vector machine (SVM). TSVM takes both labelled and
unlabeled data into account. It assigns labels to the unlabeled data in such a way
that the margin between the labelled and unlabeled data is as large as possible.
Finding an exact solution with TSVM is an NP-hard problem.
Figure 3. Pseudocode for PCA
i) Multi-task Learning
The goal of multi-task learning is to help other learners improve their
performance. When the multi-task learning algorithm is applied to a task, it
remembers how it solved the problem or reached its conclusion. The algorithm then
uses these steps to address a similar problem or task. This help from one
algorithm to another is known as the inductive transfer mechanism. When learners
share their experience with one another, they can learn concurrently rather than
in isolation, and much more rapidly.
ii) Ensemble Learning
Ensemble learning is a kind of learning in which individual learners, such as
Naive Bayes (Kaushik et al., 2021), Decision Tree (Bhattacharya et al., 2021),
and Neural Network (Meslie et al., 2021) learners, are combined to form a single
learner. Ensemble learning has been an active field since the 1990s. A group of
learners has been observed to do a given job better than individual learners
almost every time. Two widely used ensemble learning approaches are as follows:
1) Boosting: Boosting is an ensemble learning strategy for reducing bias and
variance. Boosting takes a group of weak learners and converts them into a single
strong learner. A weak learner is a classifier that is only slightly correlated
with the true classification. A strong learner, on the other hand, is a classifier
that is highly correlated with the true classification. AdaBoost (Pramanik et al.,
2022) is the most well-known example of boosting, and the pseudocode of AdaBoost
is given in Figure 4.
Figure 4. AdaBoost pseudocode
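The way boosting turns weak learners into a strong one can be seen on a small example. The sketch below is an illustrative AdaBoost with one-dimensional threshold stumps as the weak learners. It is not a transcription of Figure 4, and the dataset and function names are assumptions. Labels are encoded as +1 and -1.

```python
import math

def best_stump(xs, ys, weights):
    """Weak learner: the threshold rule with the lowest weighted error."""
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            preds = [polarity if x <= threshold else -polarity for x in xs]
            error = sum(w for w, p, y in zip(weights, preds, ys) if p != y)
            if best is None or error < best[0]:
                best = (error, threshold, polarity)
    return best

def adaboost(xs, ys, rounds=3):
    n = len(xs)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        error, threshold, polarity = best_stump(xs, ys, weights)
        error = min(max(error, 1e-10), 1 - 1e-10)  # guard against log(0)
        alpha = 0.5 * math.log((1 - error) / error)  # vote of this stump
        ensemble.append((alpha, threshold, polarity))
        # Raise the weight of misclassified points, lower the rest.
        preds = [polarity if x <= threshold else -polarity for x in xs]
        weights = [w * math.exp(-alpha * y * p)
                   for w, y, p in zip(weights, ys, preds)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def predict(ensemble, x):
    """Strong learner: sign of the weighted vote of all stumps."""
    score = sum(alpha * (polarity if x <= threshold else -polarity)
                for alpha, threshold, polarity in ensemble)
    return 1 if score >= 0 else -1

xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]
ensemble = adaboost(xs, ys, rounds=3)
print([predict(ensemble, x) for x in xs])
```

No single threshold rule classifies this dataset correctly, yet the weighted vote of three such rules does.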
2) Bagging: Bagging, or bootstrap aggregating, is applied where the accuracy
and stability of a machine learning method need to be improved. It is applicable
to both classification and regression. Bagging is also useful for reducing variance
and helps to deal with overfitting. Figure 5 shows the pseudocode for bagging.
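Bootstrap aggregating can be sketched as below. The code is a minimal illustration, not the chapter's pseudocode: the threshold-rule base learner, the number of models, and the toy data are assumptions.

```python
import random
from collections import Counter

def fit_stump(xs, ys):
    """Base learner: the 1-D threshold rule with the fewest training errors."""
    best = None
    for t in sorted(set(xs)):
        for left, right in ((0, 1), (1, 0)):
            errors = sum((left if x <= t else right) != y
                         for x, y in zip(xs, ys))
            if best is None or errors < best[0]:
                best = (errors, t, left, right)
    _, t, left, right = best
    return lambda x: left if x <= t else right

def bagging(xs, ys, n_models=25, seed=0):
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        # Bootstrap sample: draw n training indices with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return models

def predict(models, x):
    """Aggregate the individual predictions by majority vote."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]

xs = [1, 2, 3, 4, 6, 7, 8, 9]
ys = [0, 0, 0, 0, 1, 1, 1, 1]
models = bagging(xs, ys)
print(predict(models, 0), predict(models, 10))
```

Each model sees a slightly different resampling of the training data, so the vote averages out the quirks of any single sample, which is where the reduction in variance comes from.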
iii) Neural Network Learning
The neural network, also known as an Artificial Neural Network (ANN) (Bacanin
et al., 2022), is derived from the biological concept of neurons. Inside the brain,
a neuron is a single cell-like structure. To comprehend a neural network, one must
first comprehend how a neuron works. Dendrites, nucleus, soma, and axon are the
four major components of a neuron, as depicted in Figure 6.
Electrical signals are received by the dendrites and processed by the soma. The
output of the soma travels via the axon to the dendrite terminals. These terminals
serve as an interface, allowing the output of this neuron to be transmitted to the
next neuron. The nucleus is the core of the neuron. The neural network found in
the brain, through which electrical impulses travel, is made up of interconnected
neurons.
An ANN functions in the same way as a biological neural network. It has three
layers. The input layer (similar to the dendrites) is in charge of receiving data,
while the hidden layer (Mandal et al., 2021) is in charge of processing it (like
the soma and axon).
Figure 5. Bagging Pseudocode
Finally, the output layer is in charge of transmitting the processed output (like
the dendrite terminals). There are three different types of Artificial Neural
Network: supervised, unsupervised, and reinforced.
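The three-layer structure can be made concrete with a forward pass. This is a minimal sketch under assumptions: the sigmoid activation, the layer sizes, and the weight values are hypothetical and chosen only to show how data moves from the input layer through the hidden layer to the output layer.

```python
import math

def sigmoid(z):
    """Squash a weighted sum into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def dense(inputs, weights, biases):
    """One layer: each row of `weights` feeds one neuron."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def forward(x, hidden_w, hidden_b, output_w, output_b):
    hidden = dense(x, hidden_w, hidden_b)      # hidden layer processes the input
    return dense(hidden, output_w, output_b)   # output layer emits the result

# Hypothetical network: 2 inputs, 3 hidden neurons, 1 output neuron.
hidden_w = [[0.5, -0.5], [1.0, 1.0], [-1.0, 0.5]]
hidden_b = [0.0, 0.0, 0.0]
output_w = [[1.0, -1.0, 0.5]]
output_b = [0.0]
y = forward([1.0, 2.0], hidden_w, hidden_b, output_w, output_b)
print(y)
```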
i) Supervised Neural Network
In a Supervised Neural Network, the correct output for a given input is already
known. The output predicted by the network is compared with the actual output. The
parameters (Ahmad et al., 2021) are adjusted based on the output error and then
fed back into the neural network. Figure 8 depicts the overall process.
Supervised neural networks are used in feed-forward neural networks.
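The compare-and-adjust cycle described above is easiest to see on a single neuron. The sketch below uses the classic perceptron learning rule on the logical AND function. Both the rule and the dataset are assumptions chosen for illustration, as the text does not name a specific training algorithm.

```python
def step(z):
    """Threshold activation: fire (1) only for a positive weighted sum."""
    return 1 if z > 0 else 0

def train_perceptron(samples, lr=1, max_epochs=50):
    n_inputs = len(samples[0][0])
    weights, bias = [0] * n_inputs, 0
    for _ in range(max_epochs):
        mistakes = 0
        for x, target in samples:
            predicted = step(sum(w * xi for w, xi in zip(weights, x)) + bias)
            error = target - predicted  # compare prediction with known output
            if error:
                mistakes += 1
                # Adjust the parameters in proportion to the error.
                weights = [w + lr * error * xi for w, xi in zip(weights, x)]
                bias += lr * error
        if mistakes == 0:  # every training example is now correct
            break
    return weights, bias

and_gate = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(and_gate)
outputs = [step(sum(w * xi for w, xi in zip(weights, x)) + bias)
           for x, _ in and_gate]
print(weights, bias, outputs)
```

Training stops as soon as a full pass over the data produces no errors, which is guaranteed to happen here because AND is linearly separable.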
Figure 6. A Neuron
Figure 7. Artificial Neural Network Structure
ii) Unsupervised Neural Network
In an Unsupervised Neural Network, the output for a given input is not known in
advance. The main job of this network is to categorise the input data according to
shared characteristics. The neural network checks the correlations among the
inputs and groups them accordingly. This network's block diagram is presented in
Figure 9.
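One simple way a network can group inputs without target outputs is winner-take-all competitive learning, sketched below. The choice of this scheme, the learning rate, the initial prototypes, and the data are all assumptions made for illustration. The text does not say which unsupervised architecture it has in mind.

```python
def nearest(prototypes, x):
    """Index of the prototype (output unit) closest to input x."""
    return min(range(len(prototypes)), key=lambda i: abs(x - prototypes[i]))

def competitive_learning(data, prototypes, lr=0.5, epochs=20):
    """Winner-take-all learning: only the closest unit moves toward each input."""
    prototypes = list(prototypes)
    for _ in range(epochs):
        for x in data:
            winner = nearest(prototypes, x)
            prototypes[winner] += lr * (x - prototypes[winner])
    return prototypes

# Hypothetical 1-D inputs forming two loose groups, with no labels attached.
data = [0.1, 0.9, 0.2, 0.85, 0.15, 0.95]
prototypes = competitive_learning(data, prototypes=[0.0, 1.0])
groups = [nearest(prototypes, x) for x in data]
print(prototypes, groups)
```

Each unit ends up representing one group of similar inputs, and new inputs are assigned to whichever unit lies closest.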
iii) Reinforced Neural Network
In a Reinforced Neural Network (Bassi and Attux, 2022), the network behaves as a
person does when interacting with their surroundings. The environment gives the
network feedback confirming whether the decision made by the network is right or
wrong. If the decision is right, the connections that point to that output are
strengthened; otherwise, they are weakened. The network has no previous
information about the output. Figure 10 depicts the reinforced neural network.
Figure 8. Supervised Neural Network
Figure 9. Unsupervised neural network
iv) Instance-Based Learning
In instance-based learning, the learner learns a particular type of pattern. It
tries to apply the same pattern to newly supplied data; hence the name
instance-based. It is a type of lazy learner that waits for the test data to
arrive and then acts on it together with the training data. The complexity of the
learning algorithm grows with the size of the data. K-nearest neighbour, kernel
machines, and RBF networks are well-known examples of instance-based learning.
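A K-nearest neighbour classifier shows the lazy behaviour described above: nothing is computed until a query arrives. The sketch below is a minimal illustration, and the training points and labels are hypothetical. It uses `math.dist`, which requires Python 3.8 or later.

```python
import math
from collections import Counter

def knn_predict(training, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    neighbours = sorted(training,
                        key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

# Hypothetical stored instances: (features, label).
training = [((1.0, 1.0), "A"), ((1.2, 0.8), "A"), ((0.9, 1.1), "A"),
            ((5.0, 5.0), "B"), ((5.2, 4.9), "B"), ((4.8, 5.1), "B")]

print(knn_predict(training, (1.1, 1.0)))
print(knn_predict(training, (5.1, 5.0)))
```

Every prediction sorts the whole training set, which is why the cost of this kind of learner grows with the amount of stored data.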
TOOLS FOR MACHINE LEARNING
Scikit-learn, PyTorch, WEKA (Sakshi et al., 2021), TensorFlow (Baeta et al., 2021),
RapidMiner, RStudio (Hu, 2021), and other machine learning tools are available.
Each of these tools is described below:
Scikit-learn is an Open-Source Learning Framework (Sklearn)
Scikit-learn (Agrawal, 2021) is a Python library that is utilised for many machine
learning development tasks. Python is a programming language that comes with a
wide collection of development tools. Scikit-learn's source code is written largely
in the Python programming language and is based on NumPy (Pramanik and
Bandyopadhyay, 2022), SciPy (Pramanik and Bandyopadhyay, 2022), and Matplotlib.
It has the following features:
1. It has data mining (Pramanik et al., 2022) and data analysis capabilities.
Figure 10. Reinforced Neural Network (RNN)
2. It covers classification, regression, clustering, dimension reduction, model
selection, and pre-processing, among other topics.
Pros:
3. The instructional material is simple to comprehend and put into practise.
4. Documentation is supplied that is simple to grasp.
5. The parameters of any algorithm may be changed as required when its object
is called.
PyTorch
Facebook's AI research department developed it. It is based on the Torch
library (Imambi et al., 2021). It's an open-source machine learning library that
may be used for deep learning, natural language processing, and computer vision
(Jayasingh et al., 2022), among other things.
It has the following features:
1. It assists in the creation of neural networks with the Autograd module.
2. It provides a range of optimization algorithms for building neural networks.
3. It may be used on a cloud-based platform.
4. It also supports the use of a variety of different tools, libraries, and distributed
training.
Pros:
5. It helps in the creation of computational graphs.
6. It has a variety of front ends that are simple to use.
WEKA
It is an open-source machine learning programme that has been tried and proven.
It’s written in Java and can run on any platform. It contains a collection of machine
learning algorithms with the goal of assisting in the resolution of real-world data
mining challenges.
Characteristics:
1. Data preparation
2. Classification
3. Regression
4. Clustering
5. Visualization
6. Association rule mining, etc.
Pros:
7. A variety of web-based courses are available to give training.
8. All algorithms are straightforward to understand and implement.
9. The application’s versatility expands the student’s options.
Cons:
10. There are a limited number of supporting materials and internet resources
accessible.
TensorFlow
It is a free, end-to-end, open-source machine learning platform created by the
Google Brain team. It provides a JavaScript library, TensorFlow.js, which aids in
machine learning. Its APIs are used to build and train models.
Description:
Step 1: Build and train the model.
Step 2: Run existing models with TensorFlow.js (via its model converter).
Pros:
1. It may be used via script tags or by installing it with the Node Package
Manager (NPM).
2. It can readily be applied to human pose estimation.
Cons:
3. Learning is tough due to the difficulties in obtaining study materials.
RapidMiner
It is used for research, study, and the creation of various applications. RapidMiner
provides a platform for machine learning, deep learning, data analysis, text mining,
and predictive analytics, among other things.
Features:
1. It aids in the design and implementation of analytical workflows.
2. It helps with data preparation.
3. It supports result visualisation.
4. Validation and optimization of models are possible.
Pros:
5. It can be extended through plug-ins.
6. The method of using it is simple.
7. You don’t need a lot of programming knowledge.
Cons:
8. Each tool is quite expensive to use.
9. Sharing RapidMiner Studio analysis is really challenging.
10. We can only deal with 10,000 rows in the trial version.
11. The corporate team’s reaction is adequate.
RStudio
RStudio is used with the R programming language. It provides an integrated
development environment (IDE) for R, which includes a console and a
syntax-highlighting editor. R is a programming language for statistical computing
and graphics. RStudio Desktop and RStudio Server are the two versions available.
RStudio Desktop is a form of desktop programme that runs on a workstation with
a local operating system. RStudio Server is a server-side programme that runs on a
remote server and allows access to RStudio through any web browser.
RStudio Desktop and RStudio Server are available in both commercial and
non-commercial editions. Operating system support depends on the format or
edition of the IDE. RStudio Desktop is available as a prepackaged version for
Windows, macOS, and Linux. Debian, Ubuntu, Red Hat Linux, CentOS, openSUSE,
and SLES are all supported by RStudio Server and Server Pro.
The advantages of R programming are as follows:
1. Open-source software.
2. Strong graphics support.
3. A very active community.
4. A large number of packages are available.
5. A comprehensive environment.
6. It can perform complex statistical calculations.
7. It supports distributed computing.
8. It runs code without a compiler.
Pros:
9. Communicative.
10. Debug with little effort.
11. Feature of auto-completion.
12. A friendly environment in which to build packages.
13. Codes are auto-saved, and work is organised as a project.
Cons:
14. It often causes viewers to have a crisis.
15. Inability to see all fields at the same time (due to HTML boundary).
CONCLUSION
The genetic algorithm is a search procedure modelled on natural evolution. In
essence, it is an efficient, parallel, and global search method that continually
acquires and accumulates knowledge about the search space and uses it to steer
the search toward the best result. The conventional methods for mining multilevel
association rules produce a huge number of candidate itemsets and evaluate them
against the whole database. Much of this mining work is wasted, and it incurs
significant computational costs. Genetic algorithms provide a fresh approach to
solving these kinds of issues. When searching for association rules, the GA-based
technique efficiently tests the candidate itemsets predicted to be frequent, which
controls the search space and leads to the best answer. Thanks to this advantage,
association rule mining can considerably restrict the search space while improving
mining performance.
REFERENCES
Agrawal, T. (2021). Hyperparameter Optimization Using Scikit-Learn. In
Hyperparameter Optimization in Machine Learning. Apress.
doi:10.1007/978-1-4842-6579-6_2
Ahmad, A., Garhwal, S., Ray, S. K., Kumar, G., Malebary, S. J., & Barukab, O. M.
(2021). The Number of Confirmed Cases of Covid-19 by using Machine Learning:
Methods and Challenges. Archives of Computational Methods in Engineering, 28(4),
2645–2653. doi:10.1007/s11831-020-09472-8 PMID:32837183
Bacanin, N., Vukobrat, N., Zivkovic, M., Bezdan, T., & Strumberger, I. (2022).
Improved Harris Hawks Optimization Adapted for Artificial Neural Network
Training. In C. Kahraman, S. Cebi, S. Cevik Onar, B. Oztaysi, A. C. Tolga, & I. U.
Sari (Eds.), Intelligent and Fuzzy Techniques for Emerging Conditions and Digital
Transformation. INFUS 2021. Lecture Notes in Networks and Systems (Vol. 308).
Springer. doi:10.1007/978-3-030-85577-2_33
Baeta, F., Correia, J., Martins, T., & Machado, P. (2021). TensorGP Genetic
Programming Engine in TensorFlow. In P. A. Castillo & J. L. Jiménez Laredo
(Eds.), Lecture Notes in Computer Science Vol. 12694. Springer.
doi:10.1007/978-3-030-72699-7_48
Bassi, P. R. A. S., & Attux, R. (2022). A deep convolutional neural network for
COVID-19 detection using chest X-rays. Research on Biomedical Engineering,
38(1), 139–148. doi:10.1007/s42600-021-00132-9
Bhattacharya, A., Ghosal, A., Obaid, A. J., Krit, S., Shukla, V. K., Mandal, K., &
Pramanik, S. (2021). Unsupervised Summarization Approach with Computational
Statistics of Microblog Data. In D. Samanta, R. R. Althar, S. Pramanik, & S. Dutta
(Eds.), Methodologies and Applications of Computational Statistics for Machine
Learning (pp. 23–37). IGI Global. doi:10.4018/978-1-7998-7701-1.ch002
Dushyant, K., Muskan, G., Gupta, A., & Pramanik, S. (2022). Utilizing Machine
Learning and Deep Learning in Cyber security: An Innovative Approach. In M.
M. Ghonge, S. Pramanik, R. Mangrulkar, & D. N. Le (Eds.), Cyber security and
Digital Forensics. Wiley. doi:10.1002/9781119795667.ch12
Dutta, S., Pramanik, S., & Bandyopadhyay, S. K. (2021). Prediction of Weight
Gain during COVID-19 for Avoiding Complication in Health. International Journal
of Medical Science and Current Research, 4(3), 1042–1052.
Hashim, F. A., Hussain, K., Houssein, E. H., Mabrouk, M. S., & Al-Atabany,
W. (2021). Archimedes optimization algorithm: A new metaheuristic algorithm
for solving optimization problems. Applied Intelligence, 51(3), 1531–1551.
doi:10.1007/s10489-020-01893-z
Hu, K. (2021). Become Competent in Generating RNA-Seq Heat Maps in One Day
for Novices Without Prior R Experience. In K. Hu (Ed.), Nuclear Reprogramming.
Methods in Molecular Biology (Vol. 2239). Humana.
doi:10.1007/978-1-0716-1084-8_17
Imambi, S., Prakash, K. B., & Kanagachidambaresan, G. R. (2021). PyTorch. In K. B.
Prakash & G. R. Kanagachidambaresan (Eds.), Programming with TensorFlow. EAI/
Springer Innovations in Communication and Computing. Springer.
doi:10.1007/978-3-030-57077-4_10
Jayasingh, R., Kumar, J., Telagathoti, B., Sagayam, K. M., & Pramanik, S. (2022).
Speckle noise removal by SORAMA segmentation in Digital Image Processing
to facilitate precise robotic surgery. International Journal of Reliable and Quality
E-Healthcare, 11(1), 2022. doi:10.4018/IJRQEH.295083
Katoch, S., Chauhan, S. S., & Kumar, V. (2021). A review on genetic algorithm:
Past, present, and future. Multimedia Tools and Applications, 80(5), 8091–8126.
doi:10.1007/s11042-020-10139-6 PMID:33162782
Kaushik, D., Garg, M., Annu, Gupta, A., & Pramanik, S. (2021). Application of
Machine Learning and Deep Learning in Cyber security: An Innovative Approach.
In M. Ghonge, S. Pramanik, R. Mangrulkar, & D. N. Le (Eds.), Cybersecurity and
Digital Forensics: Challenges and Future Trends. Wiley.
Mandal, A., Dutta, S., & Pramanik, S. (2021). Machine Intelligence of Pi from
Geometrical Figures with Variable Parameters using SCILab. In D. Samanta,
R. R. Althar, S. Pramanik, & S. Dutta (Eds.), Methodologies and Applications
of Computational Statistics for Machine Learning (pp. 38–63). IGI Global.
doi:10.4018/978-1-7998-7701-1.ch003
Meslie, Y., Enbeyle, W., Pandey, B. K., Pramanik, S., Pandey, D., Dadeech, P., Belay,
A., & Saini, A. (2021). Machine Intelligence-based Trend Analysis of COVID-19
for Total Daily Confirmed Cases in Asia and Africa. In D. Samanta, R. R. Althar,
S. Pramanik, & S. Dutta (Eds.), Methodologies and Applications of Computational
Statistics for Machine Learning (pp. 164–185). IGI Global.
doi:10.4018/978-1-7998-7701-1.ch009
Pramanik, S. (2022). Carpooling Solutions using Machine Learning Tools. In
K. K. Sarma, N. Saikia, & M. Sharma (Eds.), Handbook of Research on Evolving
Designs and Innovation in ICT and Intelligent Systems for Real-World
Applications. IGI Global. doi:10.4018/978-1-7998-9795-8.ch002
Pramanik, S., & Bandyopadhyay, S. (2022). Identifying Disease and Diagnosis in
Females using Machine Learning. In John Wang (Ed.), Encyclopedia of Data
Science and Machine Learning. IGI Global.
Pramanik, S., & Bandyopadhyay, S. (2022). Analysis of Big Data. In John
Wang (Ed.), Encyclopedia of Data Science and Machine Learning. IGI Global.
Pramanik, S., Galety, M. G., Samanta, D., & Joseph, N. P. (2022). Data Mining
Approaches for Decision Support Systems. 3rd International Conference on Emerging
Technologies in Data Mining and Information Security.
Pramanik, S., Sagayam, K. M., & Jena, O. P. (2021). Machine Learning Frameworks
in Cancer Detection. ICCSRE 2021, Morocco, North Africa.
Pramanik, S., Singh, R. P., & Ghosh, R. (2020). Application of Bi-orthogonal
Wavelet Transform and Genetic Algorithm in Image Steganography. Multimedia
Tools and Applications, 79(25-26), 17463–17482. doi:10.1007/s11042-020-08676-1
Sakshi, G. S., Sharma, C., & Kukreja, V. (2021). Handwritten Mathematical Symbols
Classification Using WEKA. In A. Choudhary, A. P. Agrawal, R. Logeswaran, &
B. Unhelkar (Eds.), Applications of Artificial Intelligence and Machine
Learning. Lecture Notes in Electrical Engineering (Vol. 778). Springer.
doi:10.1007/978-981-16-3067-5_4
Samanta, D., Dutta, S., Galety, M. G., & Pramanik, S. (2021). A Novel Approach
for Web Mining Taxonomy for High-Performance Computing, The 4th International
Conference of Computer Science and Renewable Energies (ICCSRE’2021).
doi:10.1051/e3sconf/202129701073
Wu, D., Liao, Y., Hu, C., Yu, S., & Tian, Q. (2021). An Enhanced Fuzzy Control
Strategy for Low-Level Thrusters in Marine Dynamic Positioning Systems Based
on Chaotic Random Distribution Harmony Search. International Journal of Fuzzy
Systems, 23(6), 1823–1839. doi:10.1007/s40815-020-00989-5
... 2. High dimensionality input space: Text classifier learning requires over 50,000 features. A large feature space might cause overfitting [24]. SVM is capable of handling vast characteristic spaces. ...
Conference Paper
Social media and e-commerce platforms have led online communities to utilize reviews as a means of exchanging opinions about products, services, and issues. Reviews can also help customers make better purchasing decisions by analyzing customer feedback, and they can help businesses improve the quality of their manufacture. However, the spread of fake reviews misleads individuals, which is a concerning issue. Online consumers either enhance or diminish the standing of rival brands. This paper suggests a strategy for detecting fake reviews of online textual content through assisted learning. The technique uses honest reviews and machine learning classifiers to split off fake data. Planned system performance is compared to the baseline, and experimental results are compared to assessment metrics.
... A heuristic optimisation technique based on the principles of natural evolution, GA is a stochastic function in natural genetics and evolution. Following a series of iterative computations using Darwin's "Survival of the Fittest" hypothesis, GA determines the optimal solution [42]. For feature selection, we employ GA for the following reasons: Conventional feature selection algorithms frequently perform worse than GAs. ...
Article
Full-text available
Distributed Denial of Service (DDoS) attacks continue to pose a significant threat to network infrastructures, exploiting vulnerabilities within existing security protocols and disrupting the seamless availability of online services. The intricate interconnections of nodes within computer networks contribute to the dynamic structure of this environment, complicating efforts to establish a secure and productive user experience. Effectively mitigating DDoS attacks in this complex networked setting remains a challenge. While current strategies primarily rely on anomaly detection and signature-based techniques, utilizing statistical analysis and predefined patterns to identify and thwart attacks, none have consistently demonstrated efficacy or reliability. Consequently, there is a compelling need for advancements in security mechanisms to address DDoS threats more effectively. This research introduces an innovative and highly efficient approach that incorporates various classification algorithms, including Random Forest, Decision Tree, Gradient Boosting, Linear SVM, Logistics, K-nearest neighbors (KNN), and AdaBoost, for DDoS attack detection. The performance of these machine learning classifiers is evaluated using key metrics such as accuracy, recall, F1-score, and precision. Remarkably, experimental results reveal outstanding accuracy rates, with Random Forest achieving the highest accuracy in detecting attacks. Additionally, a genetic algorithm is employed to select optimal features from the dataset, further enhancing the performance of the classifiers. This results in a notable 25% increase in accuracy, surpassing AdaBoost and Logistics, with K-nearest neighbors emerging as the top performer in terms of accuracy.
... It is worth noting that, explaining the method of training the MLP NN is out of the scope of this article, because these methods have been widely published. So, the readers to obtain more information can refer to these references [24][25][26][27][28][29]. ...
Article
Full-text available
Today, it can be said that in every field in which timely information is needed, we can use the applications of time-series prediction. In this paper, among so many chaotic systems, the Mackey-Glass and Loranz are chosen. To predict them, Multi-Layer Perceptron Neural Network (MLP NN) trained by a variety of heuristic methods are utilized such as genetic, particle swarm, ant colony, evolutionary strategy algorithms, and population-based incremental learning. Also, in addition to expressed methods, we propose two algorithms of Bio-geography-Based Optimization (BBO) and fuzzy system to predict these chaotic systems. Simulation results show that if the MLP NN is trained based on the proposed meta-heuristic algorithm of BBO, training and testing accuracy will be improved by 28.5% and 51%, respectively. Also, if the presented fuzzy system is utilized to predict the chaotic systems, it outperforms approximately by 98.5% and 91.3% in training and testing accuracy, respectively.
Article
Full-text available
Cancer has been described as a diverse illness with several distinct subtypes that may occur simultaneously. As a result, early detection and forecast of cancer types have graced essentially in cancer fact-finding methods since they may help to improve the clinical treatment of cancer survivors. The significance of categorizing cancer suffers into higher or lower-threat categories has prompted numerous fact-finding associates from the bioscience and genomics field to investigate the utilization of machine learning (ML) algorithms in cancer diagnosis and treatment. Because of this, these methods have been used with the goal of simulating the development and treatment of malignant diseases in humans. Furthermore, the capacity of machine learning techniques to identify important characteristics from complicated datasets demonstrates the significance of these technologies. These technologies include Bayesian networks and artificial neural networks, along with a number of other approaches. Decision Trees and Support Vector Machines which have already been extensively used in cancer research for the creation of predictive models, also lead to accurate decision making. The application of machine learning techniques may undoubtedly enhance our knowledge of cancer development; nevertheless, a sufficient degree of validation is required before these approaches can be considered for use in daily clinical practice. An overview of current machine learning approaches utilized in the simulation of cancer development is presented in this paper. All of the supervised machine learning approaches described here, along with a variety of input characteristics and data samples, are used to build the prediction models. In light of the increasing trend towards the use of machine learning methods in biomedical research, we offer the most current papers that have used these approaches to predict risk of cancer or patient outcomes in order to better understand cancer.
Article
Kidney stones are renal calculi formed by the accumulation of calcium and uric acid. The major symptom of these calculi is severe pain, especially when a stone travels down the ureter. Ultrasound images are preferred for detecting renal calculi, but they contain speckle noise, which makes detecting the stone challenging. To obtain better results, Semantic Object Region and Morphological Analysis (SORAMA) was found to be effective. First, the scanned image undergoes noise removal, and the image is then enhanced. Next, the region of interest (ROI) in the image is detected. The image then undergoes dilation and erosion, the morphological analysis steps, which have a smoothing effect on the image. The stone is detected from the smoothed image. If no stone is detected, the image undergoes noise removal again, and the whole process is repeated until a smoothed image containing the stone is obtained. This approach may help patients suffering from this disease to be diagnosed at a very early stage.
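The dilation and erosion step mentioned in this abstract can be illustrated with a minimal sketch. It assumes a binary image stored as a list of lists and a 3x3 square structuring element; the function names and the toy image are hypothetical and are not taken from the SORAMA paper. Dilation followed by erosion (morphological closing) fills small holes inside a region, which is one way the smoothing effect can arise.

```python
from typing import List

Image = List[List[int]]  # binary image: 1 = foreground, 0 = background


def _window(img: Image, r: int, c: int) -> List[int]:
    """Return the pixel values in the 3x3 window around (r, c), clipped at the borders."""
    rows, cols = len(img), len(img[0])
    return [
        img[i][j]
        for i in range(max(0, r - 1), min(rows, r + 2))
        for j in range(max(0, c - 1), min(cols, c + 2))
    ]


def dilate(img: Image) -> Image:
    """A pixel becomes 1 if any pixel in its 3x3 window is 1."""
    return [[max(_window(img, r, c)) for c in range(len(img[0]))] for r in range(len(img))]


def erode(img: Image) -> Image:
    """A pixel stays 1 only if every pixel in its 3x3 window is 1."""
    return [[min(_window(img, r, c)) for c in range(len(img[0]))] for r in range(len(img))]


def close_image(img: Image) -> Image:
    """Morphological closing: dilation followed by erosion."""
    return erode(dilate(img))


# Toy 7x7 image (hypothetical): a 3x3 blob with a one-pixel hole in the middle.
toy = [[0] * 7 for _ in range(7)]
for r in range(2, 5):
    for c in range(2, 5):
        toy[r][c] = 1
toy[3][3] = 0  # the hole

closed = close_image(toy)
```

In this toy case the closing fills the hole at the centre while the blob keeps its original 3x3 extent. A real pipeline would operate on grayscale ultrasound data and would typically rely on an image-processing library rather than nested lists.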
Chapter
In the modern business environment, the data produced in most service organizations is measured in gigabytes or terabytes. Searching for information in a few terabytes of data can seem much like looking for a needle in a haystack. With the widespread use of IT in business processes throughout the world, organizations already hold massive amounts of data but cannot relate most of it to the creation or review of corporate strategy. This shows the need for intelligent use of technology to stay a step ahead of others in a highly competitive corporate world where profit margins are thin and customers have ample choice.
Chapter
Here the authors prepare a model for identifying whether a patient is diabetic. The Pima Indian dataset is used in this case study. There are basically 2 types of diabetes. The chapter consists of 2 main sections: the first covers data pre-processing and the second covers classifier construction. After the pre-processing phase and once the classifier is finalized, the model predicts whether the patient is diabetic. The authors propose to use a Decision Tree classifier and a Random Forest classifier. After studying the dataset, the missing values were handled in an optimal manner. All of the proposed algorithms are described in the chapter.
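As a rough illustration of the building block behind both classifiers named in this abstract, the sketch below picks a split threshold on a single numeric feature by minimizing weighted Gini impurity, which is how a decision tree commonly chooses one node. The function names and the six toy readings are hypothetical; they are not drawn from the Pima Indian dataset or from the chapter's implementation.

```python
from typing import List, Optional, Tuple


def gini(labels: List[int]) -> float:
    """Gini impurity of a list of binary labels (0 or 1)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_threshold(values: List[float], labels: List[int]) -> Tuple[Optional[float], float]:
    """Return (threshold, weighted impurity) for the best split 'value <= threshold'."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best_thr: Optional[float] = None
    best_score = float("inf")
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold fits between equal values
        thr = (pairs[i - 1][0] + pairs[i][0]) / 2.0  # midpoint candidate
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best_score:
            best_thr, best_score = thr, score
    return best_thr, best_score


# Hypothetical toy readings of one feature with labels (1 = diabetic, 0 = not).
feature = [85.0, 90.0, 100.0, 140.0, 150.0, 170.0]
target = [0, 0, 0, 1, 1, 1]

threshold, impurity = best_threshold(feature, target)
```

A full decision tree applies this search recursively over all features, and a random forest averages many such trees trained on resampled data and random feature subsets.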
Chapter
Machine learning (ML) and deep learning (DL) have both attracted overwhelming and unparalleled community interest recently. With the growing convergence of online activities and digital life, the way people learn and work is evolving, but this also exposes them to significant security concerns. Protecting sensitive information, documents, networks, and machine-connected devices from unwanted cyber threats is a difficult task, so robust cybersecurity protection is necessary. As a solution, current innovations such as machine learning and deep learning are applied to cyber threats. This paper also highlights the problems and benefits of using ML and DL, and presents recommendations for research directions for machine learning and deep learning in cybersecurity.
Chapter
The learning process is one of the most difficult problems in artificial neural networks. The goal of this process is to find appropriate values for the connection weights and biases, and it has a direct influence on the classification and prediction accuracy of the neural network. Since the search space is huge, traditional optimization techniques are not suitable, as they are prone to slow convergence and to getting trapped in local optima. In this paper, an enhanced Harris hawks optimization algorithm is proposed to address the task of neural network training. The experiments use 2 well-known classification benchmark datasets to evaluate the performance of the proposed method. The results indicate that the devised algorithm has promising performance, as it achieves better overall results, in terms of classification accuracy and convergence speed, than the other state-of-the-art metaheuristics included in the comparative analysis.
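The idea of searching for network weights with a population-based metaheuristic can be sketched with a small genetic algorithm, in keeping with the topic of this chapter. This is not the enhanced Harris hawks optimization algorithm from the abstract. It is a plain genetic algorithm with elitism, uniform crossover, and Gaussian mutation, applied to a single sigmoid neuron on a toy logical AND task. The task, function names, and parameter values are all assumptions chosen for illustration.

```python
import math
import random
from typing import List, Tuple

# Toy task (assumption): learn logical AND with one sigmoid neuron.
DATA = [((0.0, 0.0), 0.0), ((0.0, 1.0), 0.0), ((1.0, 0.0), 0.0), ((1.0, 1.0), 1.0)]


def predict(w: List[float], x: Tuple[float, float]) -> float:
    """Sigmoid neuron: w[0] and w[1] are connection weights, w[2] is the bias."""
    z = w[0] * x[0] + w[1] * x[1] + w[2]
    z = max(-60.0, min(60.0, z))  # clamp to keep exp() within range
    return 1.0 / (1.0 + math.exp(-z))


def loss(w: List[float]) -> float:
    """Mean squared error over the toy dataset (lower is fitter)."""
    return sum((predict(w, x) - y) ** 2 for x, y in DATA) / len(DATA)


def evolve(pop_size: int = 30, generations: int = 60, seed: int = 0):
    """Evolve weight vectors; returns the best vector and the best loss per generation."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=loss)
        history.append(loss(pop[0]))
        elite = pop[: pop_size // 5]  # elitism: the fittest survive unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)                        # pick two parents
            child = [rng.choice(pair) for pair in zip(a, b)]   # uniform crossover
            child = [g + rng.gauss(0.0, 0.5) for g in child]   # Gaussian mutation
            children.append(child)
        pop = elite + children
    pop.sort(key=loss)
    history.append(loss(pop[0]))
    return pop[0], history


best_weights, best_history = evolve()
```

Because the elite individuals are copied unchanged into the next generation, the best loss recorded in `best_history` can never increase from one generation to the next. How close it gets to zero depends on the seed and on the mutation scale.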