Figure 4 - uploaded by Mengjie Zhang
2: An example program classification map for four classes.


Source publication
Conference Paper
Full-text available
Multi-class object classification is an important field of research in computer vision. In this paper basic linear genetic programming is modified to be more suitable for multi-class classification and its performance is then compared to tree-based genetic programming. The directed acyclic graph nature of linear genetic programming is exploited. The e...

Citations

... If t_i(x) ≥ 0 and t_j(x) < 0 for all j ≠ i, then x belongs to class i. Thus, multi-tree GP decomposes a k-class classification problem into k binary classification problems which can be solved simultaneously by k trees with a single threshold value of 0. Fig. 15 represents an example of multi-tree GP for multi-class classification and the representation of the evolved program [30]. ...
... Given an instance, an LGP program outputs values from the output registers. The class with the highest value is assigned to the instance [25,30]. An example of LGP for multi-class classification is given in Fig. 16. ...
... The example shows that LGP can co-evolve sub-programs (or registers) for classes. Furthermore, a register can be re-used to define other registers, which can reduce the number of original features in comparison with using the tree-based representation [30]. Besides defining the set of registers, users also define an instruction set for LGP, which consists of two main instruction types: operations (such as arithmetic/Boolean operations) and conditional branches (i.e., if statements). ...
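The decision rule in the first excerpt and the register-based execution scheme in the following two can be summarised in short sketches. Both are illustrative only: the helper names, the tuple encoding of instructions and the operator set are assumptions made here, not the representations used in the cited papers.

```python
# Minimal sketch of the multi-tree GP decision rule quoted above. The k evolved
# trees are assumed to be available as Python callables (`tree_fns`), each
# mapping a feature vector to a single real value.

def classify_multitree(x, tree_fns):
    """Assign x to class i if tree i outputs >= 0 and every other tree outputs < 0."""
    outputs = [t(x) for t in tree_fns]
    for i, out_i in enumerate(outputs):
        if out_i >= 0 and all(out_j < 0 for j, out_j in enumerate(outputs) if j != i):
            return i
    return None  # no class (or more than one) satisfied the rule
```

For LGP, a program can be interpreted as a list of instructions that write into a register file; the class is the index of the largest value among the designated output registers, and a register written early on can be re-used by later instructions, as the excerpt describes.

```python
import operator

OPS = {'+': operator.add, '-': operator.sub, '*': operator.mul}

def run_lgp(program, features, n_registers, output_registers):
    # Input features occupy the first registers; the remaining calculation
    # registers are initialised with a constant (1.0 here) on every execution.
    regs = list(features) + [1.0] * (n_registers - len(features))
    for dst, op, src1, src2 in program:          # e.g. (3, '*', 0, 1)
        regs[dst] = OPS[op](regs[src1], regs[src2])
    # Predicted class = index of the output register holding the largest value.
    return max(range(len(output_registers)), key=lambda i: regs[output_registers[i]])

# Toy 3-class example with 2 features and 6 registers; registers 3 and 4 are
# re-used to define register 5, illustrating the code reuse mentioned above.
program = [(3, '*', 0, 1), (4, '-', 1, 0), (5, '+', 3, 4)]
predicted = run_lgp(program, features=[0.3, 1.2], n_registers=6,
                    output_registers=[3, 4, 5])   # -> 2
```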
Chapter
Full-text available
Classification is a supervised machine learning process that categorizes an instance based on a number of features. The process of classification involves several stages, including data preprocessing (such as feature selection and feature construction), model training and evaluation. Evolutionary computation has been widely applied to all these stages to improve the performance and explainability of the built classification models; the term for this research area is Evolutionary Classification. This chapter introduces the fundamental concepts of evolutionary classification, followed by the key ideas of using evolutionary computation techniques to address existing classification challenges such as multi-class classification, unbalanced data, explainable/interpretable classifiers and transfer learning.
... For linear GP, features and operations form a many-to-many directed acyclic graph, in which each feature is loaded into predefined registers and a register's value can be used by multiple operators [101]. However, graph encoding becomes inefficient on high-dimensional feature sets since the complexity of graph traversal exacerbates the difficulty of feature construction (FC). ...
Article
Over recent years, there has been a rapid development of deep learning (DL) in both industry and academia. However, finding the optimal hyperparameters of a DL model often requires high computational cost and human expertise. To mitigate the above issue, evolutionary computation (EC), as a powerful heuristic search approach, has shown significant merits in the automated design of DL models, so-called evolutionary deep learning (EDL). This paper aims to analyze EDL from the perspective of automated machine learning (AutoML). Specifically, we first illuminate EDL from DL and EC and regard EDL as an optimization problem. According to the DL pipeline, we systematically introduce EDL methods ranging from data preparation and model generation to model deployment with a new taxonomy (i.e., what and how to evolve/optimize), and focus on the discussion of solution representation and search paradigm in handling the optimization problem by EC. Finally, key applications, open issues and potentially promising lines of future research are suggested. This survey reviews recent developments of EDL and offers insightful guidelines for the development of EDL.
... Teller et al. [197] applied an arbitrary directed graph to represent all features and operators, where each possible high-level feature can be represented as a subgraph of this directed graph. For linear GP, features and operations form a many-to-many directed acyclic graph, in which each feature is loaded into predefined registers and a register's value can be used by multiple operators [57]. However, graph encoding becomes inefficient on high-dimensional feature sets since the complexity of graph traversal exacerbates the difficulty of feature construction. ...
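Since both excerpts describe the DAG induced by register data flow in linear GP, a small sketch may help; the encoding follows the illustrative tuple format used in the earlier sketch, and all names are assumptions of this summary rather than of the cited works.

```python
# Illustrative: derive the directed acyclic graph implied by an LGP instruction
# list. Each instruction becomes a node; an edge runs from the most recent
# writer of an operand register (or from an input/constant source) to that
# node, so one intermediate result can feed several later instructions.

def lgp_to_dag(program, n_inputs):
    last_writer = {r: f'in{r}' for r in range(n_inputs)}  # inputs are source nodes
    edges = []
    for idx, (dst, op, src1, src2) in enumerate(program):
        node = f'op{idx}:{op}'
        for src in (src1, src2):
            edges.append((last_writer.get(src, f'const_r{src}'), node))
        last_writer[dst] = node
    return edges

edges = lgp_to_dag([(3, '*', 0, 1), (4, '-', 1, 0), (5, '+', 3, 4)], n_inputs=2)
# -> [('in0', 'op0:*'), ('in1', 'op0:*'), ('in1', 'op1:-'), ('in0', 'op1:-'),
#     ('op0:*', 'op2:+'), ('op1:-', 'op2:+')]
```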
Preprint
Full-text available
Over recent years, there has been a rapid development of deep learning (DL) in both industry and academia. However, finding the optimal hyperparameters of a DL model often requires high computational cost and human expertise. To mitigate the above issue, evolutionary computation (EC), as a powerful heuristic search approach, has shown significant merits in the automated design of DL models, so-called evolutionary deep learning (EDL). This paper aims to analyze EDL from the perspective of automated machine learning (AutoML). Specifically, we first illuminate EDL from machine learning and EC and regard EDL as an optimization problem. According to the DL pipeline, we systematically introduce EDL methods ranging from feature engineering and model generation to model deployment with a new taxonomy (i.e., what and how to evolve/optimize), and focus on the discussion of solution representation and search paradigm in handling the optimization problem by EC. Finally, key applications, open issues and potentially promising lines of future research are suggested. This survey reviews recent developments of EDL and offers insightful guidelines for the development of EDL.
... Nordin [46] already observed that his machine code system evolved more compact solutions, and Fogelberg and Zhang [16] show that LGP evolves shorter and simpler solutions. Controlling the degree of intermediate result reuse was critical for CGP and LGP on digital circuit tasks (Sotto et al. ...). ...
... Several works claim that graph representations provide inherent advantages over trees and demonstrate improved performance of graph GP methods over standard GP [4,6,10,23,27]. Given that graphs can naturally represent inactive code (code that is not connected to the main program outputs) which can be freely mutated without changing a given program's fitness, some publications show an improved performance of CGP due to neutral genetic drift [22,30,32,36]. Furthermore, graphs present automatic code reuse, that is, the result of a sub-expression can serve as an argument for more than one node, and this can make solutions more compact [6,10]. ...
... In [6], LGP was able to outperform GP on symbolic regression, digital circuits synthesis, and classification tasks. In [10], LGP produced better programs than GP for classification benchmarks, both in terms of performance and understandability. ...
Article
Full-text available
Graph representations promise several desirable properties for genetic programming (GP): multiple-output programs, natural representations of code reuse and, in many cases, an innate mechanism for neutral drift. Each graph GP technique provides a program representation, genetic operators and an overarching evolutionary algorithm. This makes it difficult to identify the individual causes of empirical differences, both between these methods and in comparison to traditional GP. In this work, we empirically study the behaviour of Cartesian genetic programming (CGP), linear genetic programming (LGP), evolving graphs by graph programming and traditional GP. By fixing some aspects of the configurations, we study the performance of each graph GP method and GP in combination with three different EAs: generational, steady-state and (1+λ). In general, we find that the best choice of representation, genetic operator and evolutionary algorithm depends on the problem domain. Further, we find that graph GP methods can increase search performance on complex real-world regression problems and, particularly in combination with the (1+λ) EA, are significantly better on digital circuit synthesis tasks. We further show that the reuse of intermediate results, by tuning LGP's number of registers and CGP's levels-back parameter, is of utmost importance and contributes significantly to better convergence of an optimization algorithm when solving complex problems that benefit from code reuse.
... In the second approach, defining a proper threshold interval for each class requires domain knowledge, which varies among different experts. A number of researchers have investigated different GP program representations for the third approach, such as Linear GP (LGP) [76], Modi-GP [214] and Probability-based GP [186], for the task of multi-class classification. Generally, it is difficult to design a good GP program structure which can effectively and efficiently perform multi-class classification. ...
... Although tree-based representation is the most common type of representation for GP individuals, it is not the only one. A number of researchers have investigated different types, such as Linear GP (LGP) [76], Cartesian GP (CGP) [127], Grammar-guided GP [165], and Modi-GP [214]. ...
Thesis
Skin image classification involves the development of computational methods for solving problems such as cancer detection in lesion images, and their use for biomedical research and clinical care. Such methods aim at extracting relevant information or knowledge from skin images that can significantly assist in the early detection of disease. Skin images are enormous, and come with various artifacts that hinder effective feature extraction, leading to inaccurate classification. Feature selection and feature construction can significantly reduce the amount of data while improving classification performance by selecting prominent features and constructing high-level features. Existing approaches mostly rely on expert intervention and follow multiple stages for pre-processing, feature extraction, and classification, which decreases the reliability and increases the computational complexity. Since good generalization accuracy is not always the primary objective, and clinicians are also interested in analyzing specific features such as pigment network, streaks, and blobs responsible for developing the disease, interpretable methods are favored. In Evolutionary Computation, Genetic Programming (GP) can automatically evolve an interpretable model and address the curse of dimensionality (through feature selection and construction). GP has been successfully applied to many areas, but its potential for feature selection, feature construction, and classification in skin images has not been thoroughly investigated. The overall goal of this thesis is to develop a new GP approach to skin image classification by utilizing GP to evolve programs that are capable of automatically selecting prominent image features, constructing new high-level features, and interpreting useful image features which can help dermatologists diagnose a type of cancer, and that are robust to processing skin images captured from specialized instruments and standard cameras. This thesis focuses on utilizing a wide range of texture, color, frequency-based, local, and global image properties at the terminal nodes of GP to classify skin cancer images from multiple modalities effectively. This thesis develops new two-stage GP methods using embedded and wrapper feature selection and construction approaches to automatically generate a feature vector of selected and constructed features for classification. The results show that the wrapper approach outperforms the embedded approach, the existing baseline GP and other machine learning methods, but the embedded approach is faster than the wrapper approach. Insights from the evolved programs reveal that GP selects highly significant features that can help dermatologists make a diagnosis. This thesis develops a multi-tree GP based embedded feature selection approach for melanoma detection using domain-specific and domain-independent features. It explores suitable crossover and mutation operators to evolve GP classifiers effectively and further extends this approach using a weighted fitness function. The results show that these multi-tree approaches outperformed single-tree GP and other classification methods. They identify that a specific feature extraction method extracts the most suitable features for particular images taken from a specific optical instrument. This thesis develops the first GP method utilizing frequency-based wavelet features, where the wrapper-based feature selection and construction methods automatically evolve useful constructed features to improve the classification performance. The results show evidence of successful feature construction by significantly outperforming existing GP approaches, state-of-the-art CNNs, and other classification methods. This thesis develops a GP approach to multiple feature construction for ensemble learning in classification. The results show that the ensemble method outperformed existing GP approaches, state-of-the-art skin image classification methods, and commonly used ensemble methods. Further analysis of the evolved constructed features identified important image features that can potentially help the dermatologist identify further medical procedures in real-world situations.
... Several works claim that graph representations provide inherent advantages over trees and demonstrate improved performance of graph GP methods over standard GP [4,6,8,18,20]. Given that graphs can naturally represent inactive code (code that is not connected to the main program outputs) which can be freely mutated without changing a given program's fitness, some publications show an improved performance of CGP due to neutral genetic drift [17,22,24,27]. ...
... Given that graphs can naturally represent inactive code (code that is not connected to the main program outputs) which can be freely mutated without changing a given program's fitness, some publications show an improved performance of CGP due to neutral genetic drift [17,22,24,27]. Furthermore, graphs present automatic code reuse, that is, the result of a sub-expression can serve as an argument for more than one node, and this can make solutions more compact [6,8]. ...
... In [6], LGP was able to outperform GP on symbolic regression, digital circuits synthesis, and classification tasks. In [8], LGP produced better programs than GP for classification benchmarks, both in terms of performance and understandability. ...
Conference Paper
Full-text available
A number of alternative representations have been proposed for Genetic Programming (GP). Linear Genetic Programming (LGP) and Cartesian Genetic Programming (CGP) use a linear genotype that can be interpreted as Directed Acyclic Graphs (DAGs), whereas Evolving Graphs by Graph Programming (EGGP) evolves graphs directly. Each of these variants uses its own set of genetic operators and evolutionary algorithms (EAs). Thus, it is difficult to pinpoint a direct cause for a difference in performance between these methods. The goal of the present work is to study how the representations of LGP, CGP, and EGGP differ from one another in terms of performance, and to assess the role of the genetic operators and EAs that are used. With this purpose, we test each technique, including GP, using three different EAs: generational, steady-state, and (1+λ), on symbolic regression and digital circuit benchmarks. We find that the generational EA generally works better for regression problems, and the (1+λ) EA presents a clear advantage on digital circuit problems. Moreover, the (1+λ) EA is able to perform much better in combination with graphs, thus making this alternative representation more suited to evolving digital circuits, whereas it presented no clear advantage on the regression problems.
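For readers unfamiliar with the (1+λ) EA referred to in this and the preceding abstract, the following is a minimal sketch; the mutation and fitness functions are placeholders and the toy usage is unrelated to the benchmarks of the cited work.

```python
import random

# Minimal (1 + lambda) EA sketch: one parent produces `lam` mutants per
# generation, and the best offspring replaces the parent whenever it is at
# least as fit, which permits neutral drift across equally fit genotypes.

def one_plus_lambda(init_genotype, fitness, mutate, lam=4, generations=1000):
    parent, parent_fit = init_genotype, fitness(init_genotype)
    for _ in range(generations):
        offspring = [mutate(parent) for _ in range(lam)]
        best = max(offspring, key=fitness)
        best_fit = fitness(best)
        if best_fit >= parent_fit:            # ">=" allows neutral moves
            parent, parent_fit = best, best_fit
    return parent, parent_fit

# Toy usage: maximise the number of 1-bits in a 16-bit string.
n = 16
best, score = one_plus_lambda(
    init_genotype=[0] * n,
    fitness=sum,
    mutate=lambda g: [b ^ (random.random() < 1 / n) for b in g],
)
```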
... Since its introduction, GP has been used to tackle many image-related problems such as feature extraction [22], [23], classification [24], [25], object detection [26], [27], image segmentation [28], [29], image registration [30], and image processing [31]. Ebner and Zell [32] employed GP to automatically evolve an interest point detector. ...
Article
The goodness of the features extracted from the instances and the number of training instances are two key components in machine learning, and building an effective model is largely affected by these two factors. Acquiring a large number of training instances is very expensive in some situations such as in the medical domain. Designing a good feature set, on the other hand, is very hard and often requires domain expertise. In computer vision, image descriptors have emerged to automate feature detection and extraction; however, domain-expert intervention is typically needed to develop these descriptors. The aim of this paper is to utilise Genetic Programming to automatically construct a rotation-invariant image descriptor by synthesising a set of formulae using simple arithmetic operators and first-order statistics, and determining the length of the feature vector simultaneously using only two instances per class. Using seven texture classification image datasets, the performance of the proposed method is evaluated and compared against eight domain-expert hand-crafted image descriptors. Quantitatively, the proposed method has significantly outperformed, or achieved comparable performance to, the competitor methods. Qualitatively, the analysis shows that the descriptors evolved by the proposed method can be interpreted.
... Structuring programs as sequences of instructions has many advantages over structuring them as trees. LGP has been shown to significantly outperform TGP on machine learning tasks such as multiclass classification [4, 8]. LGP performs well on multiclass classification problems because of the power of its flexible program structure. ...
Conference Paper
Full-text available
Motivated by biological inspiration and the issue of code disruption, we develop a new form of LGP called Parallel LGP (PLGP). PLGP programs consist of n lists of instructions. These lists are executed in parallel, after which the resulting vectors are combined to produce the program output. PLGP limits the disruptive effects of crossover and mutation, which allows PLGP to significantly outperform regular LGP.
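A minimal sketch of the PLGP execution model described in the abstract follows. It assumes a helper run_lgp_part that executes a single instruction list and returns its register vector (for instance, a variant of the interpreter sketched earlier that returns the registers); combining the part outputs by elementwise summation is one plausible choice, not necessarily the one used in the paper.

```python
# Illustrative Parallel LGP: a program is a set of independent instruction
# lists, each run on its own copy of the register file; the resulting register
# vectors are then combined elementwise to produce the program output.

def run_plgp(instruction_lists, features, n_registers, run_lgp_part):
    vectors = []
    for part in instruction_lists:
        # Each part executes in isolation, so crossover or mutation inside one
        # part cannot disrupt code in any of the other parts.
        vectors.append(run_lgp_part(part, features, n_registers))
    return [sum(values) for values in zip(*vectors)]  # elementwise combination
```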
... Normally these so-called calculation registers are initialized with a constant value each time a program is executed on a fitness case. The instruction set defines the particular programming language that is evolved [5]. Two-operand instructions may either possess two indexed variable (register) operands or one indexed variable and a constant. ...
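The instruction and register conventions in this excerpt can be captured in a short, assumption-laden sketch: the field names and the single constant used for initialisation are choices of this summary, not prescribed by the cited text.

```python
from dataclasses import dataclass
from typing import Union

# Illustrative encoding of the two-operand instructions described above: the
# destination and first operand are always register indices, while the second
# operand is either a register index or an embedded constant.

@dataclass
class Instruction:
    dest: int                    # destination register index
    op: str                      # e.g. '+', '-', '*', '/'
    src1: int                    # first operand: always a register index
    src2: Union[int, float]      # second operand: register index or constant
    src2_is_const: bool = False  # distinguishes the two operand forms

def reset_registers(features, n_calc_registers, const=1.0):
    # Calculation registers are re-initialised with a constant value each time
    # a program is executed on a fitness case; inputs fill the first registers.
    return list(features) + [const] * n_calc_registers
```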
Article
This paper presents a comparative analysis of linear genetic programming and artificial neural network methods for solving classification tasks. Classification tasks typically involve data sets containing a large number of attributes and records, and more than two classes, which may be processed using, for example, induced classification rules. As a result, classical methods applied to a large number of records can yield a high classification error. Artificial neural networks are often used to solve classification tasks, mostly obtaining good results. Linear genetic programming is a newer direction of evolutionary algorithms that is not widely researched and whose application areas are not well defined; however, one of its advantages is that its genetic operators do not require complicated calculations. In this work approximately 400 experiments were conducted with linear genetic programming and artificial neural network methods, using various data sets with different numbers of records, attributes and classes. Based on the results, conclusions were drawn on the applicability of linear genetic programming and artificial neural networks to classification problems, and suggestions for improving their performance were proposed.