A Comparative Analysis of Machine Learning
Techniques for Enhanced Resource Management in
Multi-access Edge Computing
Lucas Vinhal, Rodrigo Moreira and Flávio de Oliveira Silva
Faculty of Computing (FACOM), Federal University of Uberlândia, Brazil
Federal University of Viçosa (UFV), Brazil
Department of Informatics (DI), University of Minho (UMinho), Portugal
Email: {vinhal,flavio}@ufu.br, rodrigo@ufv.br, flavio@di.uminho.pt
Abstract—The Internet of Things (IoT) merges physical and
virtual realms, whereas Multi-access Edge Computing (MEC)
facilitates various IoT applications in this context. Machine
Learning (ML) can predict the optimal amount of resources,
optimizing resource usage in edge environments with limited
resources. However, selecting an appropriate approach is chal-
lenging because of the available techniques and algorithms. This
study offers an in-depth analysis of several algorithms applied
in diverse contexts for predictive autoscaling edge IoT applica-
tions. Using open datasets, we comprehensively compared these
algorithms’ performance across multiple scenarios commonly
encountered at the edge of the network. We assessed their efficacy
in univariate/multivariate, one-step/multistep forecasting, and
regression/classification tasks. Our findings indicate no one-size-
fits-all solution because different algorithms are more suitable
for distinct scenarios.
Index Terms—Machine Learning, Deep Learning, Multi-access
Edge Computing, IoT, Time Series Forecasting, Predictive Au-
toscaling
I. INTRODUCTION
Multi-access Edge Computing (MEC) has revolutionized the
delivery of computational solutions closer to the client, acting
as a technological enabler for 5G and already being explored
for 6G mobile users [1], Internet of Things (IoT), and opening
up business opportunities [2]. As a result, various connectivity
approaches and disruptive applications leverage the advantages
of computational resources near the requirement point, leading
to low latencies and an enhanced end-user experience. Nu-
merous organizations, such as European Telecommunications
Standards Institute (ETSI), 3rd Generation Partnership Project
(3GPP), and Groupe Speciale Mobile Association (GSMA),
have directed their efforts towards the definition, standardiza-
tion, and architectural framework of MEC [2]. Furthermore,
the IoT is an application that greatly benefits from the MEC
paradigm and has revolutionized the way we interact with the
world while generating vast amounts of data and requiring
computational resources.
It is estimated that the number of IoT connections in
2028 will reach 34.7 billion, nearly three times the current
number [3]. Therefore, as a technological enabler, MEC
needs to evolve to ensure the seamless functioning of IoT
applications [4]. In addition to the large volume of data
generated by the IoT, there is a need to develop computational
techniques for detailed and intelligent resource management.
The MEC paradigm alleviates many issues, such as sending
large data volumes to the core, efficient content distribution,
and processing near the client [5]. However, challenges related
to caching, positioning, mobility, smart management, and
orchestration remain unaddressed [6].
In addition, challenges have emerged, such as the limited
computational capacity of edge nodes, energy constraints, pressure
on network bandwidth, security concerns, and, most importantly,
management and orchestration tooling [14]. In
particular, the management and orchestration of IoT applica-
tions present opportunities for contributions to make seamless,
fine-grained, and smart management possible. There have
been proposals for IoT-VNF orchestration in the literature to
achieve low latency and enable Service Function Chaining
(SFC) and network slicing [6]. However, existing orchestration
approaches for IoT do not fully satisfy the requirements at
scale and lack intelligence throughout the entire lifecycle of
IoT-VNF, resulting in inefficient resource utilization and im-
pacting application experiences. Therefore, this study proposes
and evaluates machine-learning mechanisms for resource con-
sumption forecasting for IoT-MANO, natively integrated with
the management and orchestration architecture and VNF-IoT.
Among the main contributions of this paper, we highlight
an empirical evaluation of ML algorithms and Deep Neural
Networks (DNNs) for predicting the consumption of compu-
tational resources in MEC environments. Our evaluation involved
comparing algorithms and training paradigms using time-
series data from multiple datasets with diverse data patterns.
We implemented and assessed these approaches across numer-
ous variations commonly encountered when designing fore-
casting techniques for scaling VNFs and MEC applications.
Furthermore, we optimized each algorithm’s hyperparameters
to determine the optimal parameter values. To evaluate the
performance of the algorithms, we measured their training
and prediction times and assessed them using metrics such as
TABLE I
RELATED WORK COMPARISON.
Approach ML Strategy IoT MANO IoT MEC Hyperparameter Tuning Evaluation Metric
Alawe et al. [7] RNN, LSTM, and DNN Network Resources Forecasting
Subramanya & Riggio [8] DNN and LSTM IoT Network Load Forecasting
Toma et al. [9] DNN IoT Resources Forecasting
Liu et al. [10] RL IoT Resources Forecasting
Peng et al. [11] LSTM Network Traffic Forecasting
Brik & Ksentini [12] LSTM and RNN IoV Resources Forecasting
Abdellah et al. [13] LSTM IoT Network Resources Forecasting
Our Approach Classical and DNN IoT Network Resources Forecasting
Mean Absolute Error (MAE) and Mean Squared Error (MSE).
In addition, we thoroughly analyzed the results to determine
the optimal approaches tailored to each specific scenario and
variation. This thorough analysis aims to provide a definitive
guide for the resource forecasting techniques employed in
edge network environments, offering valuable insights and
recommendations.
The remainder of this paper is organized as follows. Section
2 discusses the relevant approaches. Section 3 describes the
experimental methodology and metrics. Section 4 presents
the experimental setup and a comparison. Section 5 presents
results and insights. Finally, Section 6 concludes the study with
remarks and suggestions for resource forecasting in network
edge environments.
II. RELATED WORK
Alawe et al. [7] proposed a novel mechanism to scale 5G
core network resources by anticipating traffic load changes
through forecasting using machine learning techniques. The
authors used and trained a neural network on a real dataset of
traffic arrivals in a mobile network and compared two tech-
niques: Recurrent Neural Network (RNN) with Long Short-
Term Memory (LSTM) and Deep Neural Network (DNN).
The results show that the forecast-based scalability mechanism
outperforms threshold-based solutions in terms of latency and
delay in reacting to traffic increases for an Access and Mobility
Management Function (AMF) core entity.
Subramanya & Riggio [8] proposed Deep Learning (DL) models
for resource orchestration in multi-domain networks using
Network Function Virtualization (NFV) and MEC. Through
AI-driven Kubernetes, they compared the centralized and
federated learning approaches for horizontal and vertical au-
toscaling of Virtual Network Functions (VNFs) based on traffic
demand prediction.
In [8], a machine learning approach based on DNN, LSTM,
and edge computing for optimal resource allocation in IoT
networks was proposed. It uses k-means clustering to group
IoT users based on their distance and priority. It executes high-
priority tasks at the edge level and low-priority tasks at the
local level. In addition, it uses a deep Q-network to learn
the best computational offloading strategy. The study claims
the proposed approach is economical, effective, and energy-
efficient, outperforming existing models.
Toma et al. [9] present the concept of Edge ML from a
variety of perspectives, describing different implementations
such as a tech-glove smart device for controlling teleoperated
robots and Unmanned Aerial Vehicles (UAVs) that process
data locally using machine learning techniques and DNNs
to make decisions without interrogating cloud platforms.
They implemented and evaluated an IoT embedded device
integrated into a technological glove in terms of accuracy,
latency, power consumption, and memory usage. In addition, the
Unmanned Aerial Vehicle (UAV) device can perform visual
computation using Machine Learning (ML) with low power
consumption and memory usage compared to other solutions
such as cloud computing or edge computing.
Liu et al. [10] proposed a digital twin for IoT Virtual
Network Function (VNF) migration using ML to predict the
deployment resource consumption and avoid the trial and
error risk. To this end, they formulated the solution as an
optimization problem combined with Reinforcement Learning
(RL) and basic ML algorithms. This study uses a Bi-GRU
algorithm to predict the resource demands of VNFs and a
Distributed Proximal Policy Optimization (DPPO) algorithm
to formulate a VNF migration plan in advance. The study
claims their method can improve network service performance,
reduce energy consumption, and balance the load.
In [11], an MEC network traffic prediction method based on LSTM
combined with Variable Sampling Rates (VSR) is proposed
for the IoT context. Similarly, [12] used LSTM combined
with RNN to forecast resource consumption for IoT demands
in MEC scenarios. They considered collision-detection and
avoidance for Internet of Vehicles (IoV) applications. Both
measured the proposed solutions using an estimation error
approach.
Abdellah et al. [13] present a novel approach to predicting
network traffic in 5G networks for IoT applications using deep
learning techniques. The main contribution of this study is
implementing an LSTM model capable of capturing temporal
patterns in IoT traffic, which results in high accuracy and low
error rates. Additionally, this study compares the performances
of different LSTM layer configurations. The findings of this
study demonstrate the suitability of LSTM for edge-based IoT
traffic prediction, as evaluated using metrics such as Root
Mean Square Error (RMSE) and Mean Absolute Percentage
Error (MAPE).
We outline the related works in Table I, where we position
our study by considering relevant characteristics represented
in the columns. The first column, “ML Strategy,” concerns the
ML technique used for resource prediction; related works
predominantly used Deep Neural Networks (DNNs) for
resource prediction in MEC. The “IoT MANO” column
indicates which approaches are compatible with IoT Management
and Orchestration (MANO); many approaches use VNFs on
high-performance computing resources instead of VNFs on MEC.
The “IoT MEC” column refers to whether the MEC solution is
oriented to instantiate IoT applications; some evaluated
approaches do not instantiate IoT applications over MEC. The
“Hyperparameter Tuning” column refers to whether the approach
fine-tunes hyperparameters before embedding the models in the
IoT MANO. Finally, the “Evaluation Metric” column describes
how the authors evaluated the proposed approach.
III. THE COMPARISON APPROACH
Our method consists of equipping MANO IoT with
machine-learning models capable of accurately predicting time
series. We considered two datasets related to the orchestration
of VNFs, specifically, those related to the CPU and RAM
resource consumption of these entities. Fig. 1 represents
the entire process of our method for validating the proposal;
in Step 1, we consider the Knowledge-Defined Networking
(KDN) [15] and Bono [16] datasets.
Fig. 1. Steps of the Proposed Method.
The KDN dataset includes 86 input features that capture
CPU usage and traffic characteristics such as port number,
source IP, and destination IP. For this study, we focused
exclusively on the firewall dataset, which consisted of 755
samples. For the Bono dataset, we used only the data corresponding
to the monitoring of the Bono VNFC. It consists of a total of
177,038 samples of captured metrics, such as CPU, memory,
network (number of packets and network usage in bytes), and
disk usage. The dataset used in this study is publicly available
and can be accessed from [17].
Later, in Step 2, we deal with the task of cleaning and
standardizing the data to feed the machine learning models.
Initially, we chose a set of ML algorithms for time series
forecasting to assess their behavior and generalization ability.
After this step, the data becomes available to feed the ML
algorithms.
For both datasets, we used the 70%:20%:10% approach,
allocating 70% of the data as training datasets, 20% as valida-
tion datasets, and the remaining 10% as testing datasets. This
decomposition ensured a balanced distribution of data for the
training, validation, and evaluation of the performance of our
models. In addition, we applied normalization, cleaning, and
standardization of the datasets, as well as scale adjustments,
to facilitate the model training process.
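For illustration, a minimal sketch of the chronological 70%:20%:10% split and scaling described above could look as follows; the file name and target column are assumptions and do not correspond to the exact preprocessing script used in this study.

```python
# Hypothetical sketch of the chronological 70/20/10 split and scaling;
# the CSV name and "cpu" column are illustrative assumptions.
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

df = pd.read_csv("bono_vnfc_metrics.csv")          # assumed file name
series = df[["cpu"]].to_numpy(dtype="float32")     # assumed target column

n = len(series)
train_end = int(n * 0.70)
val_end = int(n * 0.90)
train, val, test = series[:train_end], series[train_end:val_end], series[val_end:]

# Fit the scaler on the training portion only, to avoid leaking
# information from the validation/test splits into training.
scaler = MinMaxScaler()
train_s = scaler.fit_transform(train)
val_s = scaler.transform(val)
test_s = scaler.transform(test)
```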
In Step 3, each trained ML algorithm was subjected to
hyperparameter optimization to fine-tune the learning rate and
momentum parameters. The algorithms used to equip the IoT
MANO are listed in Table II. For this task, we used the
Hyperopt library, which searches the defined space to maximize
the accuracy response variable.
TABLE II
COMPARED MACHINE LEARNING ALGORITHMS.
Algorithm Category
Random Forest Classical
Decision Trees (DT) Classical
Support Vector Regression (SVR) Classical
Gated Recurrent Unit (GRU) Neural Network
Long Short-Term Memory (LSTM) Neural Network
Bidirectional LSTM (BI-LSTM) Neural Network
Convolutional LSTM (CNN-LSTM) Neural Network
Encoder-Decoder LSTM Enc-Dec Neural Network
Encoder-Decoder CNN-LSTM Enc-Dec Neural Network
To validate the performance of each algorithm as a time-
series forecasting model, they must be implemented and
compared for each of these variations. To this end, in this
study, each algorithm was implemented in four variations, as
presented in Table III.
TABLE III
COMPARED TIME SERIES FORECASTING VARIATIONS.
Variation Features Prediction Steps Task Type
1 univariate one-step regression
2 multivariate one-step regression
3 univariate multi-step regression
4 univariate one-step classification
In Step 3, the hyperparameter optimization of each model
considers the search space according to Table IV. Once the
models are trained and optimized, they are empirically com-
pared in Step 4, considering the accuracy and loss parameters
for defining the models that will compose the IoT MANO.
For model comparison, we use the RMSE and MAE functions,
which allow us to compare the experiment samples and
confidently decide which model performs best for the proposed
problem.
TABLE IV
SEARCH SPACE FOR HYPERPARAMETER VARIABLES.
Hyperparameter Value
# of Estimators x ∈ [200, 2000]
Max Features SQRT, log2, and Auto
Max Depth x ∈ [10, 1000]
Bootstrap True or False
Kernel RBF, Poly, and Sigmoid
Gamma {1, 0.1, 0.01, and 0.001}
C {0.1, 1, 10, and 100}
Epsilon {0.05 and 0.1}
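As an illustration, a minimal Hyperopt sketch over the SVR portion of this search space might look as follows; the validation arrays and the evaluation budget are assumptions rather than the exact tuning script used in this study.

```python
# Hedged sketch of the Step-3 search with Hyperopt over the SVR part of
# Table IV; X_train, y_train, X_val, y_val are assumed to come from Step 2.
from hyperopt import fmin, tpe, hp, Trials
from sklearn.svm import SVR
from sklearn.metrics import mean_squared_error

space = {
    "kernel": hp.choice("kernel", ["rbf", "poly", "sigmoid"]),
    "gamma": hp.choice("gamma", [1, 0.1, 0.01, 0.001]),
    "C": hp.choice("C", [0.1, 1, 10, 100]),
    "epsilon": hp.choice("epsilon", [0.05, 0.1]),
}

def objective(params):
    model = SVR(**params)
    model.fit(X_train, y_train.ravel())
    pred = model.predict(X_val)
    return mean_squared_error(y_val, pred)   # Hyperopt minimizes this loss

trials = Trials()
best = fmin(fn=objective, space=space, algo=tpe.suggest,
            max_evals=50, trials=trials)     # evaluation budget is illustrative
print(best)
```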
Finally, in Step 5, the model is incorporated into the
IoT MANO to support the intelligent orchestration of IoT
resources at the edge. The IoT MANO uses the management
interface to receive adjustments from machine-learning models
that are trained using our method.
IV. EXPERIMENTAL SETUP
After determining the optimal configurations for each algo-
rithm and scenario, we performed 10 model training processes
for each of the 72 variations, which consisted of a combina-
tion of nine algorithms, four forecasting variations, and two
datasets. Consequently, we trained 720 models, enabling us to
make informed decisions about the best-performing model for
the given problem.
To compare the models, we considered metrics such as MSE
and MAE for regression models and Accuracy and Cross-
Entropy for classification models. In addition, we calculated
and recorded the Time to Train and the Time to Prediction
for each iteration. This allowed us to analyze the average
time required to train a model and the average time it
takes for a trained model to make predictions when provided
with input data. The classical machine-learning algorithms in
this study were implemented using scikit-learn [18], whereas
the neural-network-based algorithms were implemented using
TensorFlow [19] and Keras [20].
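As an illustrative reference, a minimal Keras sketch of the univariate one-step LSTM regressor (variation I) is shown below; the window length and layer sizes are assumptions and do not correspond to the tuned configurations reported in the repository, and the scaled arrays train_s and val_s are assumed to come from the split sketched earlier.

```python
# Minimal sketch of a univariate one-step LSTM regressor (variation I);
# window length and layer sizes are illustrative, not the tuned values.
import numpy as np
import tensorflow as tf

def make_windows(series, window=10):
    """Turn a 1-D series into (samples, window, 1) inputs and next-step targets."""
    X, y = [], []
    for i in range(len(series) - window):
        X.append(series[i:i + window])
        y.append(series[i + window])
    return np.array(X).reshape(-1, window, 1), np.array(y)

X_train, y_train = make_windows(train_s.ravel())
X_val, y_val = make_windows(val_s.ravel())

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(X_train.shape[1], 1)),
    tf.keras.layers.LSTM(64),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse", metrics=["mae"])
model.fit(X_train, y_train, validation_data=(X_val, y_val),
          epochs=50, batch_size=32, verbose=0)
```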
All training and comparison processes were performed
using separate virtual machines hosted and managed on the
Google Cloud Platform (GCP). Each process was executed in
its own isolated instance, which consisted of two vCPUs and
8GB of memory. By utilizing individual virtual machines for
each process, we ensured independent and dedicated resources
for efficient and reliable computations.
V. RESULTS AND DISCUSSION
As a result of Step 3, we established the optimal hyperpa-
rameters for each algorithm for each variation. This outcome,
along with the code and algorithms utilized, is hosted and
openly available in the [21] repository. This repository contains
the actual code employed, identifies optimal hyperparameters,
and compares the architecture of each machine-learning al-
gorithm and neural network. Using this outcome as the basis
for the second phase of our work, we executed each algorithm
10 times for each variation, using the defined architecture and
hyperparameters.
A. Base Analysis
Table V illustrates a comparison of the algorithm perfor-
mance in base scenario variation I. This variation involves
organizing the data and algorithms within a univariate single-
step regression-forecasting architecture. A comparison was
conducted by performing ten simulation runs for each case,
and the resulting MSE and MAE values were recorded for
each item. The performance values are presented with a con-
fidence interval that indicates the range of errors expected
from the models with 95% likelihood. In addition,
the mean Time to Train was calculated and is presented in
Table V.
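A small helper of the following form could be used to report each metric as a mean with a 95% confidence half-width over the ten runs; this is a sketch of the reporting format, not the exact evaluation code used in this study.

```python
# Sketch: mean and 95% t-based confidence half-width over per-run metrics.
import numpy as np
from scipy import stats

def mean_with_ci(values, confidence=0.95):
    """Return (mean, half-width) for a list of per-run metric values."""
    values = np.asarray(values, dtype=float)
    half = stats.t.ppf((1 + confidence) / 2, df=len(values) - 1) * stats.sem(values)
    return values.mean(), half

# Usage (per_run_mse is an assumed list of ten MSE values):
# mean, hw = mean_with_ci(per_run_mse)
# print(f"MSE = {mean:.4f} (±{hw:.4f})")
```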
Using the KDN dataset, which had a smaller number of
rows but more features, the LSTM model demonstrated the
highest performance, achieving 0.01275 MSE and 0.06261
MAE (the results are plotted in Fig. 2). The Gated Recurrent
Unit (GRU) model came in second place with comparable
results, obtaining 0.01285 MSE and 0.05966 MAE. In the
case of the Bono dataset, which has a larger number of rows
but fewer features, the Bi-LSTM model achieved the highest
performance with 0.00283 MSE and 0.03928 MAE (the results
are plotted in the second chart in Fig. 2). Following this, the
LSTM model secured the second position with an MSE of
0.00284 and MAE of 0.04266. The random-forest algorithm
ranked third, exhibiting 0.00298 MSE and 0.03973 MAE.
It is worth noting that in both datasets, the neural network
models tended to outperform the classical and encoder-decoder
algorithms. This observation can be attributed to the fact that in
this particular variation, which involves a univariate structure
and less complex inputs, the encoder-decoder algorithms,
comprising both an encoder and decoder neural network, tend
to become overly complex, leading to underfitting scenarios.
On the other hand, the classical algorithms demonstrated
favorable performance in the KDN dataset, which had a
limited amount of data, but they struggled to handle the Bono
dataset effectively owing to its more complex data.
Moreover, in a deep analysis, the LSTM and Bi-LSTM
algorithms demonstrated a good performance across both
datasets. The predicted values of these algorithms for the test
portions of the datasets are presented in Fig. 2, showcasing
their strong ability to forecast based on unseen data. Although
both algorithms yielded similar results in terms of error,
the Bi-LSTM algorithm required less training time, further
highlighting its efficiency.
Finally, this particular base variation stood out as the most
notable in the overall analysis, surpassing the other variations.
This highlights the potential of this variation (univariate, one-
step, regression) in an IoT/edge scenario, which requires
solutions that can meet the high variability scenarios requested
by the network.
TABLE V
PERFORMANCE RESULTS IN VARIATION I (UNIVARIATE - ONE-STEP - REGRESSION).
KDN BONO
Algorithm Rank MSE MAE Rank MSE MAE Time To Train (mean)
BI-LSTM 0.0133 (±0.0003) 0.0611 (±0.0032) 0.0028 (±0.0) 0.0611 (±0.0032) 726.96 s
CNN-LSTM 0.0137 (±0.0003) 0.0643 (±0.0026) 0.0036 (±0.0006) 0.0643 (±0.0026) 1795.29 s
DT 0.013 (±0.0) 0.0629 (±0.0) 0.0148 (±0.0) 0.0629 (±0.0) 0.11 s
ENC-DEC-CNN-LSTM 0.0246 (±0.0022) 0.1045 (±0.0073) 0.0036 (±0.0004) 0.1045 (±0.0073) 599.95 s
ENC-DEC-LSTM 0.0135 (±0.0005) 0.0618 (±0.0017) 0.0031 (±0.0002) 0.0618 (±0.0017) 1127.93 s
GRU 0.0128 (±0.0002) 0.0597 (±0.0007) 0.0032 (±0.0002) 0.0597 (±0.0007) 3138.08 s
LSTM 0.0128 (±0.0002) 0.0626 (±0.0028) 0.0028 (±0.0001) 0.0626 (±0.0028) 1311.05 s
RANDOM-FOREST 0.0141 (±0.0) 0.0656 (±0.0) 0.003 (±0.0) 0.0656 (±0.0) 60.62 s
SVR 0.0141 (±0.0) 0.0645 (±0.0) 0.003 (±0.0) 0.0645 (±0.0) 0.56 s
Fig. 2. Comparison of the achieved results over variation I with the expected values in the test datasets.
B. Multivariate Analysis
Table VI presents a comparison of the algorithm perfor-
mance in variation II. The table follows a structure similar to
that in Table V, and the simulation tests were conducted in
accordance. In contrast to the base variation, this particular
variation employs a multivariate architecture that utilizes the
entire dataset with all features as inputs during both the
training and prediction stages.
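For clarity, a minimal sketch of how the multivariate inputs could be shaped is shown below; the window length and array names are assumptions rather than the exact implementation.

```python
# Sketch: build (samples, window, n_features) inputs from all features and
# one-step-ahead targets from the monitored metric (e.g., CPU usage).
import numpy as np

def make_multivariate_windows(features, target, window=10):
    X, y = [], []
    for i in range(len(features) - window):
        X.append(features[i:i + window, :])   # every feature in the window
        y.append(target[i + window])          # next value of the target metric
    return np.array(X), np.array(y)

# features_scaled: 2-D array (time, n_features); cpu_scaled: 1-D target series
# X, y = make_multivariate_windows(features_scaled, cpu_scaled)
```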
For the KDN dataset, in this variation, the random-forest al-
gorithm demonstrated superior performance, achieving 0.0158
MSE and 0.07282 MAE (the results are plotted in the first
chart in Fig. 3). The Bi-LSTM model closely followed with
comparable results, obtaining 0.01634 MSE and 0.08724
MAE. For the Bono dataset, which has a larger number of
rows but fewer features, the Bi-LSTM model achieved the
highest performance with an MSE of 0.00288 and MAE of
0.04129 (the results are plotted in the second chart in Fig. 3).
The Enc-Dec-LSTM model secured the second position with
an MSE of 0.00335 and MAE of 0.03984.
In this particular variation, the random-forest and Bi-LSTM
algorithms demonstrated the highest performance, with the
latter ranking among the top two in both datasets, which
emphasizes the versatility of this algorithm. However, when
comparing the overall results (as shown in Fig. 3) to those
obtained in the first variation, the algorithms designed to predict
using a multivariate structure yielded worse results. Additionally, a
comparison of the average training times in Table VI with
those in Table V shows that there was an overall increase for
nearly all the algorithms.
The inferior results observed can be directly attributed to the
utilization of multiple features as inputs, commonly known
as the curse of dimensionality [22], which states that the
inclusion of numerous features can result in overfitting and the
consequent decline in model performance. This underscores
the necessity of a feature selection mechanism when dealing
with a multivariate approach. Such mechanisms play a critical
role in reducing the dimensionality of the input data and can
be implemented using various techniques such as statistical
measures, genetic algorithms, and ant colony optimization, as
presented in [23] and [24]. However, these techniques require
additional time and hardware resources [25], which can be
limited, particularly in IoT/edge environments where resource
constraints are prevalent.
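As one hedged example of such a mechanism, a lightweight statistical filter could be applied before training; SelectKBest with mutual information is an illustrative choice, not the technique used in the cited works, and X_multi, y, and k are assumed names and values.

```python
# Illustrative statistical feature-selection filter for the multivariate case.
from sklearn.feature_selection import SelectKBest, mutual_info_regression

selector = SelectKBest(score_func=mutual_info_regression, k=10)  # k is illustrative
X_reduced = selector.fit_transform(X_multi, y)       # keep the k most informative features
selected_idx = selector.get_support(indices=True)    # indices of the retained features
```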
C. Multi-step Analysis
Table VII shows a comparison of the algorithm performance
for variation III. The table and simulation tests were conducted
following the same approach as for the previously mentioned
variants. In this variation, the models generate predictions for
multiple future steps, instead of just a single step.
In the case of the KDN dataset, in this particular variation,
the Enc-Dec-CNN-LSTM algorithm exhibited superior perfor-
mance, achieving 0.00452 MSE and 0.04644 MAE (the results
are plotted in the first chart in Fig. 4). The Enc-Dec-LSTM
algorithm closely followed with comparable results, with an MSE
of 0.00852 and an MAE of 0.05917. For the Bono dataset, which
contains a larger number of rows but fewer features, the Enc-
Dec-CNN-LSTM model achieved the highest performance
with an MSE of 0.00249 and MAE of 0.01567 (the results
are plotted in the second chart in Fig. 4). The Enc-Dec-LSTM
model secured the second position with an MSE of 0.00282
and an MAE of 0.01935.
As highlighted by the results, the encoder-decoder-based
algorithms performed significantly better than the other algo-
rithms in both datasets. These algorithms were the only ones
capable of achieving satisfactory results in this multi-step scenario.
TABLE VI
PERFORMANCE RESULTS IN VARIATION II (MULTIVARIATE - ONE-STEP - REGRESSION).
KDN BONO
Algorithm Rank MSE MAE Rank MSE MAE Time To Train (mean)
BI-LSTM 0.0163 (±0.001) 0.0872 (±0.0055) 0.0029 (±0.0002) 0.0872 (±0.0055) 2720.93 s
CNN-LSTM 0.0186 (±0.0003) 0.0853 (±0.0022) 0.0037 (±0.0007) 0.0853 (±0.0022) 1066.8 s
DT 0.0182 (±0.0002) 0.0821 (±0.0011) 0.0171 (±0.0) 0.0821 (±0.0011) 0.41 s
ENC-DEC-CNN-LSTM 0.0217 (±0.0027) 0.0969 (±0.0115) 0.0044 (±0.0002) 0.0969 (±0.0115) 643.38 s
ENC-DEC-LSTM 0.0196 (±0.0027) 0.0883 (±0.0089) 0.0034 (±0.0008) 0.0883 (±0.0089) 5299.87 s
GRU 0.0184 (±0.0011) 0.0836 (±0.0035) 0.0039 (±0.0014) 0.0836 (±0.0035) 1750.75 s
LSTM 0.0168 (±0.0008) 0.0802 (±0.0038) 0.0045 (±0.0019) 0.0802 (±0.0038) 763.24 s
RANDOM-FOREST 0.0158 (±0.0) 0.0728 (±0.0) 0.008 (±0.0) 0.0728 (±0.0) 1352.35 s
SVR 0.0181 (±0.0) 0.0848 (±0.0) 0.0046 (±0.0) 0.0848 (±0.0) 1.27 s
Fig. 3. Comparison of the achieved results over variation II with the expected values in the test datasets.
TABLE VII
PERFORMANCE RESULTS IN VARIATION III (UNIVARIATE - MULTI-STEP - REGRESSION).
KDN BONO
Algorithm Rank MSE MAE Rank MSE MAE Time To Train (mean)
BI-LSTM 0.0186 (±0.0038) 0.0976 (±0.0125) 0.0042 (±0.0002) 0.0976 (±0.0125) 2264.35 s
CNN-LSTM 0.0194 (±0.0026) 0.1019 (±0.0099) 0.0042 (±0.0003) 0.1019 (±0.0099) 2392.25 s
DT 0.0134 (±0.0) 0.0773 (±0.0) 0.01 (±0.0001) 0.0773 (±0.0) 0.07 s
ENC-DEC-CNN-LSTM 0.0045 (±0.001) 0.0464 (±0.0001) 0.0025 (±0.001) 0.0464 (±0.0001) 1074.94 s
ENC-DEC-LSTM 0.0085 (±0.0042) 0.0592 (±0.0166) 0.0028 (±0.0002) 0.0592 (±0.0166) 1293.86 s
GRU 0.0214 (±0.0022) 0.109 (±0.0068) 0.0039 (±0.0001) 0.109 (±0.0068) 1499.1 s
LSTM 0.0199 (±0.0028) 0.101 (±0.0041) 0.004 (±0.0001) 0.101 (±0.0041) 725.05 s
RANDOM-FOREST 0.017 (±0.0) 0.0954 (±0.0) 0.004 (±0.0) 0.0954 (±0.0) 607.83 s
SVR 0.0143 (±0.0) 0.0844 (±0.0) 0.0041 (±0.0) 0.0844 (±0.0) 6.16 s
Fig. 4. Comparison of the achieved results over variation III with the expected values in the test datasets.
A drawback of this approach could be the overall time
to predict, which presented slightly higher results compared
to the other variations; however, its average remained under
1 second per prediction, which is considered acceptable for
most applications.
Examining the predicted values (Fig. 4) and comparing
them with the results obtained for the base variation (illustrated
in Fig. 2), it becomes evident that the encoder-decoder algo-
rithms produced decent results close to the expected values,
but their performance was slightly lower than in the base
variation.
This discrepancy can be attributed to the increased com-
plexity associated with predicting the multiple steps [26]. The
dependencies and interactions between the predicted steps
introduce additional uncertainty to the forecasting task, leading
to error accumulation over the forecast horizon.
Because each predicted step relies on previous predictions,
any inaccuracies or errors in earlier steps can propagate and
amplify over time, potentially resulting in significant devia-
tions from actual future values [27].
In general, for IoT/edge environments, we understand
that predicting multiple future steps adds complexity that can-
not be justified solely by the time it takes to make predictions,
because most algorithms and variations have demonstrated
mean prediction times of less than 1 s.
TABLE VIII
PERFORMANCE RESULTS IN VARIATION IV (UNIVARIATE - ONE-STEP - CLASSIFICATION).
KDN BONO
Algorithm Rank ACC C-ENTROPY Rank ACC C-ENTROPY Time To Train (mean)
BI-LSTM 0.8571 (±0.0096) 0.4617 (±0.0512) 0.8065 (±0.0462) 0.4617 (±0.0512) 3313.37 s
CNN-LSTM 0.7943 (±0.0539) 0.6257 (±0.0834) 0.8157 (±0.0148) 0.6257 (±0.0834) 597.92 s
DT 0.8343 (±0.0182) 0.7088 (±0.0374) 0.01 s
ENC-DEC-CNN-LSTM 0.8107 (±0.0727) 0.8619 (±0.3722) 0.6617 (±0.0115) 0.8619 (±0.3722) 717.71 s
ENC-DEC-LSTM 0.8514 (±0.0115) 0.5139 (±0.0272) 0.7425 (±0.0446) 0.5139 (±0.0272) 5546.34 s
GRU 0.83 (±0.0212) 0.6271 (±0.0554) 0.7322 (±0.0551) 0.6271 (±0.0554) 1487.74 s
LSTM 0.83 (±0.039) 0.6593 (±0.1511) 0.7973 (±0.026) 0.6593 (±0.1511) 951.97 s
RANDOM-FOREST 0.8286 (±0.0) 0.7532 (±0.0) 8.53 s
SVR 0.8429 (±0.0) 0.8013 (±0.0) 2.06 s
Fig. 5. Comparison of the achieved results over variation IV with the expected values in the test datasets.
However, there are
scenarios in which obtaining the current input values for the
forecasting model is difficult or expensive, or when predictions
are made in the network’s core and scaling decisions are
delegated to edge elements [28]. In such cases, the use of
a multi-step forecasting approach can be justified despite the
added complexity and potentially less optimal results. For
these cases, the results highlight the benefits of the encoder-
decoder-based algorithms to address these tasks.
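For reference, a minimal Keras sketch of an encoder-decoder LSTM for multi-step forecasting (variation III) is given below; the window, horizon, and layer sizes are illustrative rather than the tuned configurations from the repository.

```python
# Sketch of an encoder-decoder LSTM for multi-step forecasting;
# WINDOW and HORIZON are assumed values, not the paper's exact settings.
import tensorflow as tf

WINDOW, HORIZON = 10, 5  # assumed input window and forecast horizon

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(WINDOW, 1)),
    tf.keras.layers.LSTM(64),                          # encoder: summarize the window
    tf.keras.layers.RepeatVector(HORIZON),             # repeat the context per output step
    tf.keras.layers.LSTM(64, return_sequences=True),   # decoder: one state per step
    tf.keras.layers.TimeDistributed(tf.keras.layers.Dense(1)),  # one value per step
])
model.compile(optimizer="adam", loss="mse", metrics=["mae"])
```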
D. Classification Analysis
Table VIII presents a comparison of the algorithm perfor-
mance for Variation IV. In this particular variation, forecasting
is performed through classification rather than regression.
The table and simulation tests were conducted using the
same methodology as for the previously mentioned variants;
however, as the forecasting is structured as a classification task,
the algorithms are compared based on accuracy and cross-entropy.
As shown in Table VIII, the cross-entropy results were not
computed for the classical algorithms due to constraints in the
framework with this metric.
In the KDN dataset, the Bi-LSTM algorithm exhibited
superior performance, achieving an accuracy of 0.85714 and
cross-entropy of 0.46172 (the results are plotted in the first
chart in Fig. 5). The Enc-Dec-LSTM model closely followed with
similar results, obtaining an accuracy of 0.85143 and a cross-
entropy of 0.51387. Regarding the Bono dataset, the CNN-
LSTM model achieved the highest performance, with an accuracy
of 0.81566 and a cross-entropy of 1.17626 (the results are plotted
in the second chart in Fig. 5). Moreover, the Bi-LSTM model
secured the second position, with an accuracy of 0.80647 and
a cross-entropy of 1.16248.
As depicted in Figure 5, the performance using the BONO
dataset exhibited better results for this particular variation,
benefiting from a larger pool of available training data.
However, Table VIII reveals that the KDN dataset yielded
higher overall accuracy than the Bono dataset. This highlights
the significance of the data characteristics, particularly when
forecasting is approached as a classification problem, in which
the quantity of data available during the training phase is
not the primary factor; instead, the relationship between the
features, scales, and quality (its generality and diversity) plays
a crucial role in achieving favorable outcomes [29].
However, in an IoT/edge scenario, the decision to approach
the solution as a classification problem instead of a regression
problem imposes limitations on the range of predicted values.
This can restrict scaling actions to horizontal scaling ap-
proaches where new instances are added or removed. Predict-
ing scalar values such as CPU and memory as discrete classes,
as in this approach, may not always be feasible or optimal [30].
This restricts precise optimization and limits the flexibility of
scaling decisions, potentially leading to suboptimal resource
utilization at the edge. This lack of flexibility is detrimental
to IoT, where efficient resource utilization is crucial.
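As an illustration of this design choice, a continuous utilization series could be discretized into classes before training; the number of bins and their edges below are assumptions, not the exact labeling scheme used in this study, and train_s refers to the scaled series from the earlier split sketch.

```python
# Sketch: bin normalized [0, 1] utilization values into discrete class labels
# for the classification variation; bin count and edges are illustrative.
import numpy as np

def to_classes(values, n_bins=4):
    edges = np.linspace(0.0, 1.0, n_bins + 1)[1:-1]   # e.g. 0.25, 0.5, 0.75
    return np.digitize(values, edges)                 # labels 0 .. n_bins-1

y_class = to_classes(train_s.ravel())
```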
VI. CONCLUDING REMARKS
This paper presents an evaluation and comparison of var-
ious machine learning-based forecasting algorithms in the
context of auto-scaling in an IoT/edge environment. We
assessed multiple algorithms to validate their performance,
aiming for an optimal solution for each forecasting cate-
gory: univariate/multivariate, one-step/multi-step, and regres-
sion/classification.
Our findings reveal that when confronted with diverse
scenarios presented in an IoT/edge environment, no single
algorithm can serve as a universal solution to achieve optimal
results for all demands. However, our empirical results offer
valuable guidance in choosing effective approaches for com-
mon challenges in IoT/MEC application auto-scaling. For in-
stance, in a baseline scenario, where the objective is to
predict the future values of a monitored metric using univariate
data and forecasting a single step, the results recommend
employing Long Short-Term Memory-based models. Among
these options, Bidirectional LSTM is preferable because it
demonstrates a good performance with fewer network layers,
resulting in a shorter training time.
When dealing with multivariate data, our results indicate
a preference for the random-forest and Bi-LSTM models;
however, they also highlight the importance of implementing
a feature-selection mechanism to enhance their effectiveness.
Furthermore, in scenarios requiring predictions of multiple
steps, which is valuable in situations where obtaining input
values for the forecasting model is challenging or expensive
or when scaling decisions are delegated to edge elements by
the network’s core, encoder-decoder-based algorithms, such as
the Enc-Dec-LSTM and Enc-Dec-CNN-LSTM, presented a clear
advantage and should be the main enablers.
Finally, if the task involves classification rather than regres-
sion, which can simplify the scaling process, the Convolutional
and the Bidirectional LSTM algorithms prove to be the right
choice. Nonetheless, it is worth noting that opting for a classi-
fication approach may limit scaling flexibility by allowing only
horizontal scaling actions, potentially compromising resource
utilization efficiency in IoT environments where optimal re-
source allocation is of utmost importance. In a future study,
we plan to measure the functional behavior of these algorithms
in embedded IoT-MANOs.
REFERENCES
[1] K. Oztoprak, Y. K. Tuncel, and I. Butun, “Technological Transformation
of Telco Operators towards Seamless IoT Edge-Cloud Continuum,”
Sensors, vol. 23, no. 2, 2023.
[2] D. Borsatti, L. Bassi, W. Cerroni, G. Davoli, and C. Raffaelli, “A Multi-
Protocol MEC-based Approach to Deploy and Consume IoT Services,”
in 2022 25th Conference on Innovation in Clouds, Internet and Networks
(ICIN), pp. 154–156, 2022.
E. M. Report, “IoT connections outlook,” 2023.
[4] B. D. Deebak, F. Al-Turjman, and L. Mostarda, “Seamless secure
anonymous authentication for cloud-based mobile edge computing,”
Computers & Electrical Engineering, vol. 87, p. 106782, 2020.
[5] C. Cicconetti, M. Conti, and A. Passarella, “Uncoordinated access to
serverless computing in MEC systems for IoT,” Computer Networks,
vol. 172, p. 107184, 2020.
[6] Y. Chiang, Y. Zhang, H. Luo, T.-Y. Chen, G.-H. Chen, H.-T. Chen, Y.-J.
Wang, H.-Y. Wei, and C.-T. Chou, “Management and Orchestration of
Edge Computing for IoT: A Comprehensive Survey,” IEEE Internet of
Things Journal, pp. 1–1, 2023.
[7] I. Alawe, A. Ksentini, Y. Hadjadj-Aoul, and P. Bertin, “Improving
traffic forecasting for 5g core network scalability: A machine learning
approach,” IEEE Network, vol. 32, no. 6, pp. 42–49, 2018.
[8] T. Subramanya and R. Riggio, “Centralized and federated learning for
predictive vnf autoscaling in multi-domain 5g networks and beyond,”
IEEE Transactions on Network and Service Management, vol. 18, no. 1,
pp. 63–78, 2021.
[9] C. Toma, M. Popa, B. Iancu, M. Doinea, A. Pascu, and F. Ioan-Dutescu,
“Edge machine learning for the automated decision and visual comput-
ing of the robots, iot embedded devices or uav-drones,” Electronics,
vol. 11, no. 21, 2022.
[10] Q. Liu, L. Tang, T. Wu, and Q. Chen, “Deep reinforcement learning for
resource demand prediction and virtual function network migration in
digital twin network,” IEEE Internet of Things Journal, pp. 1–1, 2023.
[11] R. Peng, X. Fu, and T. Ding, “Machine Learning with Variable Sampling
Rate for Traffic Prediction in 6G MEC IoT,” Discrete Dynamics in
Nature and Society, vol. 2022, p. 8190688, Nov 2022.
[12] B. Brik and A. Ksentini, “Toward Optimal MEC Resource Dimensioning
for a Vehicle Collision Avoidance System: A Deep Learning Approach,”
IEEE Network, vol. 35, no. 3, pp. 74–80, 2021.
[13] A. R. Abdellah, V. Artem, A. Muthanna, D. Gallyamov, and A. Kouch-
eryavy, “Deep Learning for IoT Traffic Prediction Based on Edge
Computing,” in Distributed Computer and Communication Networks:
Control, Computation, Communications (V. M. Vishnevskiy, K. E.
Samouylov, and D. V. Kozyrev, eds.), (Cham), pp. 18–29, Springer
International Publishing, 2020.
[14] L. A. Haibeh, M. C. E. Yagoub, and A. Jarray, “A Survey on Mobile
Edge Computing Infrastructure: Design, Resource Management, and
Optimization Approaches,” IEEE Access, vol. 10, pp. 27591–27610,
2022.
[15] A. Mestres, A. Rodriguez-Natal, J. Carner, P. Barlet-Ros, E. Alarcón,
M. Solé, V. Muntés-Mulero, D. Meyer, S. Barkai, M. J. Hibbett,
G. Estrada, K. Ma’ruf, F. Coras, V. Ermagan, H. Latapie, C. Cassar,
J. Evans, F. Maino, J. Walrand, and A. Cabellos, “Knowledge-defined
networking,” SIGCOMM Comput. Commun. Rev., vol. 47, pp. 2–10, Sep.
2017.
[16] J. Bendriss, Cognitive management of SLA in software-based networks.
PhD thesis, Evry Institut national des telecommunications, 06 2018.
[17] I. B. Yahia, “Vnfdataset: Virtual ip multimedia ip system,” Apr 2020.
[18] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion,
O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vander-
plas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duch-
esnay, “Scikit-learn: Machine learning in Python,” Software, 2011.
[19] M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro, G. S.
Corrado, A. Davis, J. Dean, M. Devin, S. Ghemawat, I. J. Goodfellow,
A. Harp, G. Irving, M. Isard, Y. Jia, R. Józefowicz, and L. Kaiser,
“Tensorflow: Large-scale machine learning on heterogeneous distributed
systems,” CoRR, vol. abs/1603.04467, 2016.
[20] F. Chollet et al., “Keras.” https://keras.io, 2015. Accessed on 2023-08-
16.
[21] L. Vinhal, “Codebase - algorithms and hyperparameters.” https://github.
com/vinhal/ai-timeseries- forecasting-compariosn, 2023. Accessed on
2023-08-20.
[22] T. Dokeroglu, A. Deniz, and H. E. Kiziloz, “A comprehensive survey on
recent metaheuristics for feature selection,” Neurocomputing, vol. 494,
pp. 269–296, 2022.
[23] J. Zhou and Z. Hua, “A correlation guided genetic algorithm and its
application to feature selection,” Applied Soft Computing, vol. 123,
p. 108964, 2022.
[24] F. Karimi, M. B. Dowlatshahi, and A. Hashemi, “Semiaco: A semi-
supervised feature selection based on ant colony optimization,” Expert
Systems with Applications, vol. 214, p. 119130, 2023.
[25] P. Dhal and C. Azad, “A comprehensive survey on feature selection in
the various fields of machine learning,” Applied Intelligence, vol. 52,
pp. 4543–4581, Mar 2022.
[26] J. Wang, Y. Song, F. Liu, and R. Hou, “Analysis and application
of forecasting models in wind power integration: A review of multi-
step-ahead wind speed forecasting models,” Renewable and Sustainable
Energy Reviews, vol. 60, pp. 960–981, 2016.
[27] A. Galicia, R. Talavera-Llames, A. Troncoso, I. Koprinska, and
F. Martínez-Álvarez, “Multi-step forecasting for big data time series
based on ensemble learning,” Knowledge-Based Systems, vol. 163,
pp. 830–841, 2019.
[28] M. Chantry, H. Christensen, P. Dueben, and T. Palmer, “Opportunities
and challenges for machine learning in weather and climate modelling:
hard, medium and soft AI,” Philosophical Transactions of the Royal
Society A: Mathematical, Physical and Engineering Sciences, vol. 379,
p. 20200083, Feb. 2021.
[29] A. Dogan and D. Birant, “Machine learning and data mining in manu-
facturing,” Expert Systems with Applications, vol. 166, p. 114060, 2021.
[30] T. P. da Silva, A. R. Neto, T. V. Batista, F. C. Delicato, P. F. Pires,
and F. Lopes, “Online machine learning for auto-scaling in the edge
computing,” Pervasive and Mobile Computing, vol. 87, p. 101722, 2022.