Stock index return forecasting: semantics-based
genetic programming with local search optimiser
Mauro Castelli* and Leonardo Vanneschi
NOVA IMS,
Universidade Nova de Lisboa,
1070-312, Lisbon, Portugal
Email: mcastelli@novaims.unl.pt
Email: lvanneschi@novaims.unl.pt
*Corresponding author
Leonardo Trujillo
Departamento de Ingeniería Eléctrica y Electrónica,
Tree-Lab, Posgrado en Ciencias de la Ingeniería,
Instituto Tecnológico de Tijuana,
Tijuana, BC, Mexico
Email: leonardo.trujillo@tectijuana.edu.mx
Aleš Popovič
Faculty of Economics,
University of Ljubljana,
Kardeljeva Ploščad 17, 1000 Ljubljana, Slovenia
Email: ales.popovic@ef.uni-lj.si
Abstract: Making accurate stock price predictions is the pillar of effective decisions in
high-velocity environments since the successful prediction of future prices could yield significant
profit and reduce operational costs. Generally, solutions for this task are based on trend
predictions and are driven by various factors. To add to the existing body of knowledge, we
propose a semantics-based genetic programming framework. The proposed framework blends a
recently developed version of genetic programming that uses semantic genetic operators with a
local search method. To analyse the appropriateness of the proposed computational method for
stock market price prediction, we analysed data related to the Dow Jones index and to the
Istanbul Stock Index. Experimental results confirm the suitability of the proposed method for
predicting stock market prices. In fact, the system produces lower errors than existing
state-of-the-art techniques, such as neural networks and support vector machines.
Keywords: forecasting; financial markets; genetic programming; semantics; local search.
Reference to this paper should be made as follows: Castelli, M., Vanneschi, L., Trujillo, L. and
Popovič, A. (xxxx) ‘Stock index return forecasting: semantics-based genetic programming with
local search optimiser’, Int. J. Bio-Inspired Computation, Vol. X, No. Y, pp.xxx–xxx.
Biographical notes: Mauro Castelli received his Masters degree in Computer Science from the
University of Milano Bicocca, Milan, Italy in 2008 (Summa Cum Laude), and PhD from the
University of Milano Bicocca in 2012. Since 2013, he has been an Assistant Professor with
NOVA IMS, Universidade Nova de Lisboa, Lisbon, Portugal.
Leonardo Vanneschi received his Masters degree in Computer Science from the University of
Pisa, Pisa, Italy in 1996 (Summa Cum Laude), and PhD from the University of Lausanne,
Switzerland in 2004. He is an Associate Professor with NOVA IMS, Universidade Nova de
Lisboa, Lisbon, Portugal.
Leonardo Trujillo is a Research Professor at the Instituto Tecnológico de Tijuana (ITT) in
Mexico. His primary research areas are evolutionary computation, genetic programming,
computer vision and pattern recognition. He has an Engineering degree in Electronics and
Masters in Computer Science from ITT, and Doctorate in Computer Science from the CICESE
Research Center in Mexico.
Aleš Popovič is an Associate Professor in the Academic Unit for Business Informatics and
Logistics at the Faculty of Economics University of Ljubljana. He received his PhD in
Information Management from University of Ljubljana.
1 Introduction
Financial markets are considered an important pillar of our
economy. The exchange and trade that occurs on these
markets are based on the price value of a variety of financial
instruments, such as bonds, stocks, commodities, and funds.
It is the ability to predict the direction and changes in prices
for profitability that provides for a multibillion-dollar
industry. Predicting stock prices has long been an intriguing
challenge that has been extensively studied by researchers
from different fields. Forecasting stock market prices is
regarded as a challenging task, mostly due to various
uncertainties surrounding the movement of the market. In
fact, many factors add to the complexity of stock market
forecasting, including, but not limited to, political events,
general economic conditions, psychological reasons, and
traders’ expectations. While all these factors interrelate with
each other, predicting stock market prices is a complex and
resource-consuming undertaking. Considering the difficulty
of this task, any insights regarding future price performance
could secure huge profits in this market (Tay and Shen,
2002). Thus, proper forecast of this market is an important
factor for investors, buyers, sellers, fund managers, and
other stakeholders, as well as for researchers from the field.
Typically, as reported in Banik et al. (2014), in a stock
market, techniques employed to inform investment choices
fall into two broad categories: fundamental analysis and
technical analysis. The former technique is a complete
method that involves real and reliable information of a
firm’s financial report, economic conditions, and
competitive strength. This technique considers that a present
price depends on its fundamental value, expected return on
investment, and new information about the firm that will
collectively add to the fluctuation of a firm’s share value.
On the other hand, technical analysis is concerned with
market indicators representing the trend of price indices and
individual stocks. The common idea of these indicators is
that once a trend is in motion, it will persist in that path.
While fundamental and technical analysis have been the two
main techniques used to predict market trend and stock
prices, in recent years machine learning (ML) techniques
have shown their suitability in addressing the financial
markets forecasting task (Tay and Shen, 2002; Dash et al.,
2015; Patel et al., 2015). As reported in Shen et al. (2012),
well-known ML algorithms, like support vector machines
and reinforcement learning, have been shown to be
relatively effective in tracing the stock market and in
helping to maximise the profit of stock option purchases while, at
the same time, keeping the associated risk low (Huang et al.,
2005; Moody and Saffell, 2001). Some of these studies,
however, showed that ML techniques have certain inherent
limitations in learning the patterns underlying the stock
market prices due to tremendous noise and complex
dimensionality of stock market data. This results in
inconsistent and unpredictable performance on noisy
data, a phenomenon regularly present in today’s business
environments. In this vein, our research has been motivated
by the challenge to predict, as accurately as possible, the
stock market prices using historical data of share prices and
considering a one-week-ahead forecasting horizon. The
methodology used in the paper is based on a recently
defined variant of genetic programming, one of the most
successful existing computational intelligence methods.
Recently, genetic programming has obtained excellent
results on a large number of complex real-life applications
(Koza, 2010) and it has lately made an important
breakthrough: the definition of geometric semantic
operators (GSOs), new genetic operators that induce a
unimodal error surface on any supervised learning problem
(including forecasting). By eliminating local optima, GSOs
have a stronger problem-solving ability. Thus, they are an
excellent step towards the development of an optimal
forecasting model. However, much work has still to be done
in order to use GSOs in a complex application like the one
taken into account in this work. In particular, GSOs
converge to optimal solution(s) very slowly and this
behaviour is an important limitation in all the applications
characterised by the presence of a large amount of data.
Hence, we propose the definition of a system that combines
GSOs with a local search algorithm. The main idea in
combining GSOs and a local searcher is to couple the
exploration ability of GSOs with the exploitation ability of
the local searcher. In this way we expect to achieve optimal
solutions faster and to obtain a final model that does not
overfit the training data. To analyse the suitability of the
proposed computational method for stock market prices
forecasting, the proposed method has been applied to data
from the Dow Jones index and the Istanbul stock index.
The remainder of the paper is organised as follows:
Section 2 presents an overview of standard genetic
programming and shows its suitability for addressing
symbolic regression problems. Section 3 introduces the
components of the proposed system: the geometric semantic
genetic operators for genetic programming, the motivation
for the use of a local searcher, and the approach used in this
work. Section 4 provides insights into the employed dataset,
the experimental settings, and provides a detailed discussion
about the results obtained. Moreover, a comparison between
the results obtained by the proposed system and the results
achieved on the same dataset by other state-of-the-art
methods is presented. Finally, the conclusions summarise
and highlight the work’s main contributions.
2 Genetic programming
Genetic programming (GP) is one of the techniques that
belong to a larger computational intelligence research area
called evolutionary computation. GP consists in the
automated learning of computer programs by means of a
process inspired by biological evolution (Koza, 1992).
Generation by generation, GP stochastically transforms
populations of programs into new, hopefully improved,
populations of programs. The quality of a solution is
expressed by using an objective function (also called fitness
function). The search process of GP is graphically depicted
in Figure 1.
Figure 1 The GP algorithm
Hence, the recipe for solving a problem with GP is the
following:
1 Choose a representation space in which candidate solutions can be specified. This
consists of choosing the primitives of the programming language that will be used to
construct programs. A program is built up from a terminal set (the input variables of
the problem and, optionally, a set of constant values) and a function set (the
primitive operators).
2 Design the fitness criteria for evaluating the quality of a solution. This involves the
execution of a candidate solution on a suite of test cases, also referred to as the test
set, reminiscent of the process of black-box testing. In case of supervised learning, a
distance-based function is employed to quantify the divergence of a candidate's
behaviour from the desired one.
3 Design a parent selection and replacement policy. Central to every EA is the concept
of fitness-driven selection in order to exert an evolutionary pressure towards
promising areas of the program space. The replacement policy determines the way in
which newly created offspring programs replace their parents in the population.
4 Design a variation mechanism for generating offspring from a parent or a set of
parents. Standard GP uses two main variation operators: crossover and mutation.
Crossover recombines parts of the structure of two individuals, whereas mutation
stochastically alters a portion of the structure of an individual.
After a random initialisation of a population of computer programs, an iterative
application of selection-variation-replacement is employed to improve the programs'
quality, which can be seen as a step-wise refinement.
In order to transform a population into a new population of
candidate solutions, GP makes use of particular search
operators called genetic operators. Considering the common
tree representation of GP individuals, the standard genetic
operators (crossover and mutation) act on the structure of
the trees that represent the candidate solutions. In other
terms, standard genetic operators act on the syntax of the
programs. In this paper we used genetic operators that,
differently from the standard ones, are able to act at the
semantic level. The definition of semantics used in this
work is the one also proposed in Moraglio et al. (2012) and
will be presented in the following section.
However, to understand the differences between the
genetic operators used in this work and the ones used in the
standard GP algorithm, the latter are also briefly recalled.
The standard crossover operator is traditionally used to
combine the genetic material of two parents by swapping a
part of one parent with a part of the other. More in detail,
after choosing two individuals based on their fitness, the
crossover operator performs the following operations
1 selects a random subtree in each parent
2 swaps the selected subtrees between the two parents:
the resulting individuals are referred to as the offspring.
The mutation operator introduces random changes in the
structures of the individuals in the population. The most
well-known mutation operator, called sub-tree mutation,
works as follows:
1 it randomly selects a node in a tree
2 it removes the node and the subtree for which it is the
root
3 it inserts a randomly generated tree there.
This operation is controlled by a parameter that specifies the
maximum size (usually measured in terms of tree depth) for
the newly created subtree that is to be inserted.
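As a concrete illustration of these two operators, the following minimal Python sketch (our own, not code from the paper) implements subtree crossover and subtree mutation over trees encoded as nested lists; the encoding, function set and all names are assumptions of the sketch.

```python
# Hypothetical tree encoding: ['+', 'x', ['sin', 'x']] stands for x + sin(x).
import random

FUNCTIONS = {'+': 2, '-': 2, '*': 2, 'sin': 1}   # arity of each primitive
TERMINALS = ['x', 1.0]

def random_tree(depth):
    """Grow a random tree of at most the given depth."""
    if depth == 0 or random.random() < 0.3:
        return random.choice(TERMINALS)
    op = random.choice(list(FUNCTIONS))
    return [op] + [random_tree(depth - 1) for _ in range(FUNCTIONS[op])]

def nodes(tree, path=()):
    """Enumerate (path, subtree) pairs; a path indexes into nested lists."""
    yield path, tree
    if isinstance(tree, list):
        for i, child in enumerate(tree[1:], start=1):
            yield from nodes(child, path + (i,))

def replace_subtree(tree, path, new):
    """Return a copy of tree with the subtree at path replaced by new."""
    if not path:
        return new
    copy = list(tree)
    copy[path[0]] = replace_subtree(copy[path[0]], path[1:], new)
    return copy

def subtree_crossover(parent1, parent2):
    """Swap a random subtree of parent2 into a random point of parent1."""
    point, _ = random.choice(list(nodes(parent1)))
    _, donor = random.choice(list(nodes(parent2)))
    return replace_subtree(parent1, point, donor)

def subtree_mutation(parent, max_depth=3):
    """Replace a random subtree with a newly generated random tree."""
    point, _ = random.choice(list(nodes(parent)))
    return replace_subtree(parent, point, random_tree(max_depth))
```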
2.1 Symbolic regression with genetic programming
In symbolic regression, the goal is to search for the
symbolic expression TO : Rp → R that best fits a particular
training set T = {(x1, t1), ..., (xn, tn)} of n input/output pairs
with xi ∈ Rp and ti ∈ R. The general symbolic regression
problem can then be defined as

$$(T_O, \theta_O) = \operatorname*{arg\,min}_{T \in G,\; \theta \in \mathbb{R}^m} f\big(T(\mathbf{x}_i, \theta),\, t_i\big), \quad \text{with } i = 1, \ldots, n \tag{1}$$
where G is the solution or syntactic space defined by the
primitive set P (functions and terminals), f is the fitness
function based on the distance or error between a program’s
output T(xi, θ) and the expected, or target, output ti, and θ is
a particular parameterisation of the symbolic expression T,
assuming m real-valued parameters.
In standard GP, parameter optimisation is usually not
performed explicitly, since GP search operators only focus
on syntax. Therefore, the parameters are only implicitly
considered. However, recent works have begun to address
this issue, such as Z-Flores et al. (2014) where a nonlinear
numerical optimiser is used to tune the parameterisation of
the evolved programs, achieving substantial improvements
in terms of convergence speed and solution quality.
Let us consider the following hypothetical example to
grasp the importance of such a process. Imagine a GP
individual K with a syntax T(x) = x + sin(x) and the
following parameterisation: θ = (α1, α2, α3), with
T(x) = α1·x + α2·sin(α3·x). In a traditional GP, these
parameters are usually set to 1, which does not necessarily
lead to the best possible performance for this particular
syntax. Indeed, if the optimal solution is, for instance,
TO(x) = 3.3x + 1.003sin(0.0001x), then individual K might
be easily discarded by the search. On the other hand, a local
search process that performs a numerical optimisation of
these implicit parameters might be able to tune them,
and produce a substantial improvement in a program’s
performance, potentially improving the fitness assigned to
the above syntax.
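The following sketch makes the example concrete: fixing the syntax T(x) = α1·x + α2·sin(α3·x) and fitting the parameters numerically recovers values close to the hypothetical optimum mentioned above. This is only an illustration of the idea; scipy's curve_fit is one of many optimisers that could play the role of the local searcher.

```python
import numpy as np
from scipy.optimize import curve_fit

def T(x, a1, a2, a3):
    # fixed syntax, free (implicit) parameters
    return a1 * x + a2 * np.sin(a3 * x)

x = np.linspace(0, 10, 200)
t = 3.3 * x + 1.003 * np.sin(0.0001 * x)   # hypothetical target from the text

# start from the implicit all-ones parameterisation of standard GP
theta, _ = curve_fit(T, x, t, p0=[1.0, 1.0, 1.0])
print(theta)   # a1 approaches 3.3; a2 and a3 are only weakly identifiable here
```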
This is the view taken in this work, and while previous
works have applied parameter optimisation to a standard GP
search (Z-Flores et al., 2014), this work applies it to a new
GP variant based on geometric semantic genetic operators.
These operators are described in the next section.
3 Methodology
This section describes the components of the proposed
computational intelligence system used for financial market
return forecasting. In particular Section 3.1 describes the
geometric semantic operators and their properties, while
Section 3.2 presents the local search strategy that we used
with the GSOs.
3.1 Geometric semantic operators
Despite the large number of human-competitive results
achieved with the use of GP (Koza, 2010), researchers still
continue to develop new methods that improve the ability of
GP to produce high-quality solutions. In recent years, one of
the emerging ideas is to include the concept of semantics in
the evolutionary process performed by GP. While several
studies exist (e.g., Vanneschi et al., 2014a), the definition of
semantics is not unique and this concept is interpreted in
different ways from different perspectives (Vanneschi et al.,
2014a). In this work we use the most common and widely
accepted definition of semantics in GP literature. The
semantics of a program Ti is defined as the vector of outputs
si = [Ti(x1), Ti(x2), ..., Ti(xn)] obtained after executing the
program (or candidate solution) on a set of training data T
(Moraglio et al., 2012); when Ti represents a real-valued
function, then si ∈ Rn.
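In code, this notion of semantics is simply the program's output vector over the training inputs, as in the following sketch (our own naming, assuming numpy arrays):

```python
import numpy as np

def semantics(program, X):
    """Output vector of the program on the n training cases (rows of X)."""
    return np.array([program(x) for x in X])

X = np.random.rand(5, 3)                         # 5 fitness cases, 3 features
s = semantics(lambda x: x[0] + np.sin(x[1]), X)  # s lies in R^5
t = np.zeros(5)                                  # target semantics
fitness = np.linalg.norm(s - t)                  # distance in semantic space
```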
In this section, we briefly recall the definition of the
geometric semantic operators proposed by Moraglio et al.
(2012). The objective of GSOs is to define modifications on
the syntax of GP individuals that have a precise effect on
their semantics. The idea is to define transformations of the
syntax of GP individuals that correspond to well known
operators of genetic algorithms (GAs). In this way, GP
could ‘inherit’ the known properties of those GA operators.
Furthermore, contrary to what typically happens in
real-valued GAs or other heuristics, in the GP semantic
space the target point is also known (it corresponds to the
vector of expected output values in supervised learning) and
the fitness of an individual is simply given by the distance
between the point si it represents in the semantic space and
the target point t. It was shown in Moraglio et al. (2012)
that when fitness is defined in this way it induces a
unimodal error surface. The real-valued GA operators that
we want to ‘map’ into the GP semantic space are geometric
crossover and ball mutation. In real-valued GAs, geometric
crossover produces an offspring that lies on the segment that
joins the parents. It was proven in Krawiec and Lichocki
(2009) that in cases where the fitness is a direct function of
the distance to the target (like the case we are interested in
here) this offspring cannot have a worse fitness than the
worst of its parents. Ball mutation consists of a random
perturbation of the coordinates of an individual. Figure 2
shows a graphical representation of the mapping between
the syntactic and semantic space produced by geometric
semantic operators. The definitions of the operators that
correspond to geometric crossover and ball mutation in the
GP semantic space, as given in Moraglio et al. (2012), are
the following:
Definition 3.1: Geometric semantic crossover (GSC).
Given two parent functions T1, T2 : Rn → R, the
geometric semantic crossover returns the real function
TXO = (T1 · TR) + ((1 − TR) · T2), where TR is a random real
function whose output values range in the interval [0, 1].
Definition 3.2: Geometric semantic mutation (GSM). Given
a parent function T : Rn → R, the geometric semantic
mutation with mutation step ms returns the real function
TM = T + ms · (TR1 − TR2), where TR1 and TR2 are random real
functions.
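A direct, if naive, way to realise these two definitions is to build offspring as compositions of the parent functions, as in the sketch below (our own illustration; wrapping TR in a sigmoid to constrain its outputs to [0, 1] is a common convention, not something prescribed by the definitions above):

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))   # bounds a random function's output to (0, 1)

def gsc(t1, t2, tr):
    """Geometric semantic crossover: TXO = (T1 * TR) + ((1 - TR) * T2)."""
    return lambda x: t1(x) * sigmoid(tr(x)) + (1 - sigmoid(tr(x))) * t2(x)

def gsm(t, tr1, tr2, ms=0.1):
    """Geometric semantic mutation: TM = T + ms * (TR1 - TR2)."""
    return lambda x: t(x) + ms * (tr1(x) - tr2(x))
```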
Figure 3 shows an example of application of GSC to two
arbitrary trees T1 and T2 [represented in plots 3(a) and 3(b)
respectively], using a random tree TR [represented in
plot 3(c)]. The offspring generated by this crossover is
shown in plot 3(d).
Hereafter, GP that uses geometric semantic operators
will be referred to as geometric semantic GP (GSGP). An
important drawback of GSGP, pointed out by Moraglio et
al., is that geometric semantic operators create much larger
offspring than their parents and that the fast growth of the
individuals in the population rapidly makes fitness
evaluation unbearably slow, making the system unusable.
Moreover, while this growth produces fitter solutions, it is
responsible for creating models that are too specialised on
training data, hence potentially generating overfitting. In
Castelli et al. (2015), a possible workaround to the problem
related to the slowness of the fitness evaluation process was
proposed, consisting in an implementation of these
operators that makes them not only usable in practice, but
also very efficient. Basically, this implementation is based
on the idea that, besides storing the initial trees, at every
generation it is enough to maintain in memory, for each
individual, its semantics and a reference to its parents. As
shown in Castelli et al. (2015), the computational cost of
evolving a population of n individuals for g generations is
O(ng), while the cost of evaluating a new, unseen, instance
is O(g). Hence, the system can be efficiently used to address
problems characterised by a large amount of data. This is
the implementation used in this work.
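A minimal sketch of that bookkeeping idea follows (our own simplification of the scheme in Castelli et al., 2015): each individual stores only its semantics and references to the structures it was built from, so offspring are created in constant time per training case and no offspring tree is ever materialised.

```python
import numpy as np

class Individual:
    """Holds a semantics vector plus references sufficient to rebuild the tree."""
    def __init__(self, semantics, parents=None, op=None, ms=None):
        self.semantics = semantics   # outputs on the training set
        self.parents = parents       # references to parent individuals
        self.op, self.ms = op, ms    # which operator produced it, and its step

def gsm_offspring(parent, r1, r2, ms):
    """GSM offspring: semantics computed directly, no tree is built."""
    child_sem = parent.semantics + ms * (r1.semantics - r2.semantics)
    return Individual(child_sem, parents=(parent, r1, r2), op='GSM', ms=ms)
```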
Figure 2 Geometric semantic crossover [plot (a)] and, respectively, geometric semantic mutation [plot (b)] perform a transformation
of the syntax of the individual that corresponds to geometric crossover (respectively, geometric mutation) in the semantic space
Figure 3 Two parents T1 and T2 [plots (a) and (b), respectively], one random tree TR [plot (c)], and the offspring of the crossover
between T1 and T2 using TR [plot (d)]
3.2 Local search in GP
This section will first discuss previous approaches to
applying a local search strategy during a GP run. All of
them were developed for standard GP, and most of them
focused on symbolic regression problems. Afterwards, the
main contribution of this paper is presented, i.e., the first
integration of a local searcher within GSGP.
3.2.1 Local search in standard GP
Many works have studied how to combine an evolutionary
algorithm with a local optimiser so far (also referred to as a
refinement process). In general, such approaches are
considered to be a simple type of memetic search (Chen et
al., 2011). The basic idea is straightforward: include within
the optimisation process an additional search operator that
takes an individual (or several) as an initial point and
searches for the local optima around it. Such a strategy can
help ensure that the local region around each individual is
fully exploited. However, there can be some negative
consequences to such an approach. The most evident is the
computational overhead: while the cost of a single LS might
be negligible, performing it on every individual might
become inefficient. Second, LS can produce overfitted
solutions, stagnating the search on local optima. These
issues aside, these techniques have produced impressive
results in a variety of scenarios, some of which are reviewed
by Chen et al. (2011).
A noteworthy aspect of this survey is an almost
complete lack of papers that deal with GP. Of the more than
two hundred papers covered by Chen et al., in fact, only a
couple deal with memetic GP. This indicates that the GP
community may have not addressed the topic adequately.
Some examples are Wang et al. (2011) and Eskridge and
Hougen (2004), which present domain-specific memetic
approaches that are not intended for LS in symbolic
regression with GP. In fact, LS in GP can be performed in
two ways, in the syntactic space or in the parameter space as
defined in equation (1). Regarding the former, in Azad and
Ryan (2014) the authors proposed a syntactic LS operator
that performs a greedy point mutation, with promising
results on several benchmarks. Regarding the latter
approach, the complete optimisation problem defined in
equation (1) has not received much attention.
In Topchy and Punch (2001), gradient descent is used to
optimise numerical constants within a GP tree, achieving
good results on five symbolic regression problems.
Similarly, in Zhang and Smart (2004) and Graff et al.
(2013) a LS algorithm is used to optimise the value of
constant terminal elements. In Zhang and Smart (2004)
gradient descent is used and tested on classification
problems, while Graff et al. (2013) uses resilient
backpropagation and evaluates the proposal on a real-world
problem, in both cases leading towards improved results. In
Smart and Zhang (2004), the authors include weight
parameters for each function node, which the authors call
inclusion factors; these weights modulate the importance
that each node has within the tree. Indeed, the authors
identify what we are here referring to as implicit program
parameters, and optimise these values by applying gradient
descent on all trees. The authors also propose a series of
new search operators that explicitly consider the
parameterisation of each GP tree.
In a recent work (Z-Flores et al., 2014), this problem
was addressed by implementing a very simple
parameterisation of the tree, by constraining the number of
internal parameters of each tree regardless of its size.
Several different strategies were compared to determine
when a local optimiser should be applied, showing that it is
often best to apply it on either all the population or a subset
of the best individuals. The LS used is called trust
region optimisation (Sorensen, 1982), and results showed
substantial improvements in performance compared with
standard GP search on several benchmark and real-world
problems. A similar approach was developed by Kommenda
et al. (2013), with two noteworthy differences. First,
parameters replace all constants present on a given tree, and
each GP tree is enhanced by adding an artificial root tree
that effectively adds a weight coefficient and a bias to the
entire tree, then the Levenberg-Marquardt optimiser is used
to find the optimal values for these parameters. Second, the
authors apply constant optimisation to the population using
different probabilities, as well as a strict offspring selection
variant for comparison.
3.2.2 Local search in geometric semantic operators
The goal of this work is to integrate a LS strategy within
GSGP. In particular, as an initial proposal, we include a
local searcher within the GSM mutation operator, since
previous works have shown that GSGP achieves its best
performance using only mutation during the search
(Vanneschi et al., 2014b). In particular, the GSM with LS
(GSM-LS) of a tree T generates an individual:
$$T_M = \alpha_0 + \alpha_1 \cdot T + \alpha_2 \cdot (T_{R1} - T_{R2}) \tag{2}$$

where αi ∈ R; notice that α2 replaces the mutation step
parameter ms of the geometric semantic mutation (GSM).
This in fact defines a basic multivariate linear regression
problem, which could be solved, for example, by ordinary
least squares (OLS) regression. However, in this case we
have n linear equations, the number of fitness cases, and
only three unknowns (the αi). This gives an over-
determined multivariate linear fitting problem, which can be
solved through SVD (in this work, the GNU Scientific
Library available at http://www.gnu.org/software/gsl/ is
used). We argue that this should be seen as a LS operator
that attempts to determine the best linear combination of the
parent tree and the random trees used to perturb it, which is
local in the sense of the linear problem posed by the GSM
operator. It should not be seen as a LS in the entire semantic
space, since in that case the LS would necessarily converge
to the optimum in this unimodal landscape.
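A sketch of that fit (our own, using numpy; np.linalg.lstsq solves the overdetermined system with an SVD-based routine, mirroring the GSL solver mentioned above):

```python
import numpy as np

def gsm_ls_coefficients(parent_sem, tr1_sem, tr2_sem, target):
    """Solve equation (2) for (a0, a1, a2) by least squares."""
    A = np.column_stack([np.ones_like(parent_sem),   # intercept a0
                         parent_sem,                 # a1 multiplies T
                         tr1_sem - tr2_sem])         # a2 multiplies (TR1 - TR2)
    alphas, *_ = np.linalg.lstsq(A, target, rcond=None)
    return alphas   # a2 plays the role of the mutation step ms
```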
Figure 4 A graphical representation of (a) GSM and (b) GSM-LS
To illustrate how GSM and GSM-LS differ, a graphical
description of each method is provided in Figure 4. First,
Figure 4(a) shows a contour plot of the semantic space, the
space of all possible program outputs, with the highest
fitness peak at the desired program output t. Also, the
semantics of a single GP tree is depicted as s, a circle
around s is the area in which the semantics s′ of the
offspring generated by GSM will lie, where the radius of the
circle is determined by the mutation step ms. Notice that
GSM can, in some cases, generate offspring with semantics
that are farther away from t than the parent, with lower
fitness. This can slow down the convergence speed of the
search.
Instead, GSM-LS will always produce offspring that
have a better fitness than the parent, by forcing the
geometric mutation to always move in the direction of the
known goal of the search, as depicted in Figure 4(b).
This approach is similar to two previously proposed
approaches. First, the linear fitting problem is reminiscent
of the linear scaling procedure proposed in Keijzer (2003),
which allows GP to fit the form of the desired output
without necessarily optimising the scale or bias. However,
in that case, the scaling process is only used to adjust the
fitness value of each individual, while the search operators
used are standard ones. Second, and more closely related to
this work, the non-isotropic Gaussian mutation proposed in
Moraglio and Mambrini (2013), that is used to perform a
run-time analysis of GSGP. However, the mutation
proposed in that work considers a fixed set of basis
functions instead of randomly generated GP trees, and
perturbs the linear combination with Gaussian-noise instead
of providing the best fit coefficients. Finally, the work
presented by Krawiec and O’Reilly (2014) also uses a
multivariate linear regression approach to optimise evolved
solutions, with several key differences. Particularly, the
search is conducted by standard GP, not GSGP, and each
tree is decomposed into a set of subtrees which are then
linearly combined. The method is much more explorative
than the one presented here.
Moreover, the approach we propose contrasts with
previous work (Z-Flores et al., 2014), which relied on a
nonlinear local optimiser, since the linear assumption is
mostly not satisfied by the expression evolved with standard
GP and the corresponding parameterisation. Instead, in this
new approach, it is simple to apply an optimiser based on a
linear regression, given that the GSM operator defines a
linear expression in parameter space.
The idea of including a LS method is based on a very
simple observation related to the properties of the geometric
semantic operators: while these operators are effective in
achieving good performance with respect to standard
syntax-based operators, they require many generations to
converge to optimal solutions. By including a local search
method, we expect to improve the convergence speed of the
search algorithm and to obtain better performance with
respect to the algorithm that only uses the geometric
semantic operators. Moreover, by speeding up the search
process, it will be possible to limit the construction of
over-specialised solutions that, in the end, would overfit the
data.
4 Experimental study
In Section 4.1 we present the data we have used. In
Section 4.2 we describe the experimental settings we
employed, including all the parameters of the systems we
have studied to allow the interested reader to fully replicate
our work. Finally, in Section 4.3 we discuss the obtained
results.
Table 1 Features in the considered datasets

  date: the last business day of the week (this is typically Friday); a number from 1 to 5
  open: the price of the stock at the beginning of the week
  high: the highest price of the stock during the week
  low: the lowest price of the stock during the week
  close: the price of the stock at the end of the week
  volume: the number of shares of stock that traded hands in the week
  percent_change_price: the percentage change in price throughout the week
  percent_change_volume_over_last_week: the percentage change in the number of shares of a stock that traded hands for this week compared to the previous week
  previous_weeks_volume: the number of shares of stock that traded hands in the previous week
  days_to_next_dividend: the number of days until the next dividend
  percent_return_next_dividend: the percentage of return on the next dividend
  next_weeks_price: the price of the stock in the following week
4.1 Data description
To assess the suitability of the proposed system, we
considered data related to the Dow Jones index and to the
Istanbul stock index. In particular, for the Dow Jones
dataset (also used in the study by Brown et al., 2013), the
training data came from the first quarter of 2011 and the test
data came from the second quarter of 2011. For the Istanbul
index dataset, the training data came from the first semester
of 2012 and the test data came from the second semester of
2012. Each record (instance) represents data from a single
week. The objective is to predict the stock price at the end
of week w + 1 considering a set of variables (reported in
Table 1) related to the stock in week w.
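A hedged sketch of how such a supervised dataset can be assembled (column names follow Table 1; the file name and CSV layout are assumptions of this example):

```python
import pandas as pd

df = pd.read_csv('dow_jones_weekly.csv')   # hypothetical file, one row per week
feature_cols = [c for c in df.columns if c != 'next_weeks_price']
X = df[feature_cols]            # variables describing week w (Table 1)
y = df['next_weeks_price']      # target: stock price at the end of week w + 1
```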
4.2 Experimental settings
Two different GP systems were compared: GSGP that only
uses the original GSM operator; and a HYBRID algorithm
that uses both the GSM operator and the proposed GSM-LS
operator.
All the runs used populations of 200 individuals evolved
for 300 generations. Tree initialisation was performed with
the Ramped Half-and-Half method (Koza, 1992) with a
maximum initial depth equal to 6. The function set
contained the arithmetic operators, including protected
division as in Koza (1992). The terminal set contained
12 variables, each one corresponding to a different feature
in the datasets (including the target), these are summarised
in Table 1. Mutation has been used with probability 1.
Survival from one generation to the following one was
always granted to the best individual of the population
(elitism). In GSM a random mutation step has been
considered in each mutation event as suggested in
Vanneschi et al. (2014b). Regarding the HYBRID system,
GSM-LS has been used in the first 20 generations, while in
the remaining generations we considered the standard GSM
operator. We decided to limit the number of generations
where the local search was used to avoid overfitting the
training data.
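For reference, the run configuration described above can be summarised as a plain parameter dictionary (a sketch; the cited GSGP implementation reads an analogous set of parameters from its own configuration format):

```python
config = {
    'population_size': 200,
    'generations': 300,
    'init_method': 'ramped_half_and_half',           # Koza (1992)
    'max_init_depth': 6,
    'functions': ['+', '-', '*', 'protected_div'],   # protected division as in Koza (1992)
    'mutation_probability': 1.0,
    'elitism': 1,                  # best individual always survives
    'mutation_step': 'random',     # resampled at each mutation event
    'gsm_ls_generations': 20,      # GSM-LS only in the first 20 generations
}
```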
For all the systems under consideration, we analysed the
performance obtained according to two different error
measures. In particular, these two measures are the mean
absolute error (MAE) and the mean square error (MSE).
The definitions of these error measures are the following:

$$\mathrm{MAE} = \frac{1}{N} \sum_{i \in Q} |t_i - y_i| \tag{3}$$

$$\mathrm{MSE} = \frac{1}{N} \sum_{i \in Q} (t_i - y_i)^2 \tag{4}$$
where yi = T(xi) is the output of the GP individual T on the
input data xi and ti is the target value for the instance xi. N
denotes the number of samples in the training or testing
subset, and Q contains the indices of that set.
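Both measures translate directly into code; the following implementations of equations (3) and (4) assume t and y are numpy arrays already restricted to the indices in Q:

```python
import numpy as np

def mae(t, y):
    """Mean absolute error, equation (3)."""
    return np.mean(np.abs(t - y))

def mse(t, y):
    """Mean square error, equation (4)."""
    return np.mean((t - y) ** 2)
```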
In the next section, the experimental results obtained are
reported using plots of the median error on the training and
test set. In particular, in each generation the best individual
in the population (i.e., the one that has the smallest training
error) has been chosen and the value of its error on the
training and test sets has been stored. The reported curves
finally contain the median of all these values collected at the
end of each generation. The median was preferred over the
mean in the reported plots because of its higher robustness
to outliers.
The results discussed in the next section have been
obtained using the GSGP implementation freely available at
http://gsgp.sourceforge.net and documented in Castelli et al.
(2015).
4.3 Experimental results
Figures 5 and 6 report, for the datasets taken into account,
training and test error (MAE and MSE) for the considered
GP systems against generations. For all the considered GP
systems 30 runs have been performed.
Figure 5 Dow Jones dataset
Notes: Training and test error: MAE and MSE. The plots show the median over 30 runs.
Figure 6 Istanbul dataset
Notes: Training and test error: MAE and MSE. The plots show the median over 30 runs.
We start the discussion of the results shown in the plots by
considering the performance on the Dow Jones dataset.
Figure 5 clearly shows that HYBRID outperforms GSGP on
both the training and the test set. In particular, it is possible
to note the fast convergence of the proposed algorithm as
well as the fact that the final model does not overfit the
training data.
Regarding the Istanbul dataset the situation is slightly
different: both GP systems under consideration are able to
converge to good quality solutions in a small number of
generations [Figures 6(a) and 6(c)]. Nevertheless, once again,
the HYBRID system is able to reach good quality solutions
in a smaller number of generations with respect to GSGP.
More interesting is the performance on unseen instances
[Figures 6(b) and 6(d)]. In this case, both GP systems overfit
the training data. When MAE is considered as an error
measure, it is possible to see that both GP systems reach
their lowest test error in less than 20 generations, while in the
remaining generations overfitting starts appearing. At the
end of the evolutionary search process, the two GP systems
present a comparable test error. When the MSE is used as
an error measure, overfitting is even more evident.
Nonetheless, note that the HYBRID method is able to
produce solutions with a test error that is lower than the one
achieved with the GSGP system.
To analyse the statistical significance of these results, a
set of tests has been performed on the median errors. In
particular, we wanted to assess whether the final results
(generation 300), produced by the considered GP systems,
were statistically significantly different. As a first step, the
Shapiro-Wilk test (with α = 0.1) has shown that the data are
not normally distributed and hence a rank-based statistic has
been used. Successively, the Wilcoxon rank-sum test for
pairwise data comparison has been used (with α = 0.1)
under the alternative hypothesis that the samples do not
have equal medians. The p-values obtained are reported in
Table 2.
Table 2 p-values obtained in the statistical validation procedure

              MAE                   MSE
              Training    Test      Training    Test
  Dow Jones   0.003       0.543     0.091       0.04
  Istanbul    0.2549      0.408     0.116       0.07

Note: Comparison between results produced by GSGP and HYBRID on training and test set for the two considered error measures.
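The validation procedure just described can be reproduced along these lines (a sketch with placeholder data standing in for the 30 final errors of each system):

```python
import numpy as np
from scipy.stats import shapiro, ranksums

gsgp_errors = np.random.rand(30)     # placeholder: final test errors, one per run
hybrid_errors = np.random.rand(30)   # placeholder

alpha = 0.1
normal = all(shapiro(e).pvalue > alpha for e in (gsgp_errors, hybrid_errors))
if not normal:   # as in the text, fall back on a rank-based statistic
    stat, p = ranksums(gsgp_errors, hybrid_errors)
    print('significant' if p < alpha else 'not significant', p)
```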
According to the p-values, for the Dow Jones dataset we can
clearly state that HYBRID produces solutions that are
significantly better (i.e., with lower error) than GSGP both
on training and test data when the MSE is used as a measure
of error. When the MAE is the considered error measure the
differences between the two techniques are statistically
significant only taking into account the results on the
training set. Regarding the Istanbul dataset, HYBRID
produces solutions that are significantly better than GSGP
on test data when the MSE is used as a measure of error. In
all the remaining cases, the differences between the quality
of the solutions produced by the two GP systems are not
statistically significant.
While this comparison is interesting to obtain
information about the behaviour of the two systems, it is
important to notice that the parameter settings used in the
experimental phase favour GSGP. More in detail, it is
possible to notice that HYBRID is able to achieve in a few
generations better fitness values than the ones achieved by
GSGP in 300 generations. This is an important aspect to
consider, especially in applications characterised by a large
amount of data, like the one considered here. This last result is
promising, because it suggests that it is possible to achieve
better results using the proposed GSM-LS operator with
respect to the ones achieved by GSGP, and to do so using
fewer iterations of the algorithm, thus producing smaller
functions.
4.4 Comparison with other ML techniques
Besides comparing GSGP against the proposed variant that
uses a local searcher in the mutation operator, we are also
interested in considering the performance of other
well-known state-of-the-art ML methods, to evaluate the
competitiveness of the results obtained.
Table 3 reports the values of the training and test errors
(MAE) of the solutions obtained by all the studied
techniques including, in the last lines of the table, GSGP
and HYBRID. From these results, it is possible to see that
GSGP and HYBRID perform better than all the other
methods we studied, particularly on unseen test data.
Table 3 Experimental comparison between different non-evolutionary techniques, GSGP and HYBRID

                                                   Dow Jones           Istanbul
  Method                                           Training  Test      Training  Test
  Linear regression (Weisberg, 2005)               1.79      2.17      1.05      1.07
  Least square regression (Seber and Wild, 2003)   1.77      2.17      1.08      1.06
  Radial basis function network (Haykin, 1999)     1.80      2.17      1.35      1.17
  Isotonic regression (Hoffmann, 2009)             1.72      2.39      1.09      1.12
  Neural network (Haykin, 1998)                    1.55      1.81      1.16      1.08
  SVM polynomial kernel (degree 2)                 1.31      1.68      1.04      1.07
  SVM polynomial kernel (degree 3)                 1.20      1.58      1.09      1.11
  SVM polynomial kernel (degree 4)                 1.10      1.59      1.11      1.13
  SVM polynomial kernel (degree 5)                 1.05      1.63      1.14      1.21
  GSGP                                             0.82      1.05      1.14      1.05
  HYBRID                                           0.73      0.95      1.14      1.05

Notes: For non-deterministic techniques we reported the median of the training error and test error (MAE) calculated over 30 independent runs. Italic values denote the best performer.
To assess the statistical significance of these results, the
same set of tests described in the previous section has been
performed. In this case, a Bonferroni correction for the
value of
α
has been considered, given that the number of
compared techniques is larger than two. All the obtained
p-values relative to the comparison between HYBRID and
the other methods are reported in Table 4.
Table 4 p-values obtained from the statistical validation procedure (HYBRID versus each method, MAE)

  Dow Jones   LIN     ISO     SQ      NN      RBF     SVM-2   SVM-3   SVM-4   SVM-5
  TRAIN       0.001   0.001   0.001   0.001   0.001   0.001   0.001   0.001   0.001
  TEST        0.001   0.001   0.001   0.001   0.001   0.001   0.001   0.001   0.001

  Istanbul    LIN     ISO     SQ      NN      RBF     SVM-2   SVM-3   SVM-4   SVM-5
  TRAIN       0.01    0.118   0.081   0.411   0.001   0.001   0.106   0.306   0.988
  TEST        0.863   0.093   1       0.271   0.001   0.745   0.162   0.028   0.001
In the table, LIN stands for linear regression, ISO stands for
isotonic regression, SQ stands for least square regression,
NN stands for neural networks, RBF stands for radial basis
function network, SVM-2 refers to the support vector
machines with polynomial kernel of second degree and
similarly for SVM-3, SVM-4 and SVM-5. According to the
results reported in the table, the differences in terms of
training and test fitness between HYBRID and all the other
considered techniques are statistically significant when the
Dow Jones dataset is considered.
Regarding the Istanbul dataset the HYBRID method is
the best performer on unseen instances, with a difference
that is statistically significant with respect to several of the
other non-evolutionary machine learning techniques
(RBF and SVM-5). On the other hand, on the training
instances, the proposed system performs poorly with respect
to simple linear regression models (according to the
p-values, the only techniques that significantly outperform
the HYBRID system on the training instances are LIN and
SVM-2). Moreover, HYBRID produces the same training
error obtained with SVM and NN and, for all of these
techniques, training error is greater than test error. While
performance on training instances, for this second dataset, is
not as good as the one achieved with other techniques, it is
important to highlight that HYBRID is the best performer
on the test instances. This is the most desirable behaviour
when a real-life problem must be addressed.
These results are a clear indication of the
appropriateness of the proposed method to generate
predictive models of stock prices, at least for the studied test
cases.
5 Conclusions
Stock market prices forecasting is one of the most
challenging tasks for financial investors across the globe.
This challenge is due to the uncertainty and volatility of the
stock prices in the market. Due to technology and
globalisation of business and financial markets it is
important to predict the stock prices more quickly and
accurately. In this study the stock market prices forecasting
task has been considered and, in order to address it, a
computational intelligence technique has been proposed.
The proposed system is based on a variant of the genetic
programming algorithm. In particular, the GP system makes
use of particular genetic operators that, differently from the
standard genetic operators used in GP, work on the
semantics of the solutions. While the use of semantic
methods in GP has been successfully investigated and
applied, several important problems that prevent the
efficient use of these methods are still open. In particular, the
GP system that uses the semantic operators (GSGP) requires
a large number of generations in order to converge towards
optimal solutions.
In this light, the contribution of this work consists of
integrating the GSGP framework with a local search
optimiser. The use of a local searcher is aimed at improving
the speed with which GSGP converges, in order to find
good-quality solutions. Moreover, by combining the
exploration ability of GSGP with the exploitation ability of
a local search method we expect to find good quality
solutions in a small number of generations, hence avoiding
or limiting the excessive specialisation of a model on the
training instances and, consequently, overfitting.
To validate the proposed system, called HYBRID, an
extensive experimental analysis has been performed,
considering stock prices related to the Dow Jones index and
to the Istanbul index. We tested the proposed system against
a standard semantic GP system. The reported results have
shown that semantic GP with the local searcher is able to
produce results that outperform the ones obtained by
GSGP on the Dow Jones index while, on the second dataset,
results achieved by the two systems are comparable. More
interesting is the fact that the proposed system is
able to produce better results in a significantly lower
number of generations with respect to GSGP, hence saving
computational effort. This is extremely important in a
domain where a large amount of data is available daily.
To summarise, the paper provides two contributions:
from the point of view of the stock market prices
forecasting, a system able to outperform the existing
state-of-the-art techniques has been defined; from the
machine learning perspective, this case study has shown that
including a local searcher in the geometric semantic GP
system can speed up the convergence of the search process.
We hope that this contribution will pave the way for further
research in these areas.
Acknowledgements
CONACYT Basic Science Research Project No. 178323,
TecNM (Mexico) Research Project 5621.15-P, and the
FP7-Marie Curie-IRSES 2013 European Commission
program through project ACoBSEC with Contract
No. 612689.
References
Azad, R.M.A. and Ryan, C. (2014) ‘A simple approach to lifetime
learning in genetic programming-based symbolic regression’,
Evol. Comput., Vol. 22, No. 2, pp.287–317.
Banik, S., Khan, A.F.M.K. and Anwer, M. (2014) ‘Hybrid
machine learning technique for forecasting Dhaka stock
market timing decisions’, Computational Intelligence and
Neuroscience, pp.1–6.
Brown, M., Pelosi, M. and Dirska, H. (2013) ‘Dynamic-radius
species-conserving genetic algorithm for the financial
forecasting of Dow Jones index stocks’, in P. Perner (Ed.):
Machine Learning and Data Mining in Pattern Recognition,
Lecture Notes in Computer Science, Vol. 7988, pp.27–41,
Springer, Berlin, Heidelberg.
Castelli, M., Silva, S. and Vanneschi, L. (2015) ‘A C++
framework for geometric semantic genetic programming’,
Genetic Programming and Evolvable Machines, Vol. 16,
No. 1, pp.73–81.
Chen, X., Ong, Y-S., Lim, M-H. and Tan, K.C. (2011)
‘A multifacet survey on memetic computation’, Trans. Evol.
Comp., Vol. 15, No. 5, pp.591–607.
Dash, R., Dash, P.K. and Bisoi, R. (2015) ‘A differential harmony
search based hybrid interval type2 fuzzy EGARCH model for
stock market volatility prediction’, International Journal of
Approximate Reasoning, Vol. 59, No. C, pp.81–104.
Eskridge, B. and Hougen, D. (2004) ‘Imitating success: a memetic
crossover operator for genetic programming’, in Proceedings
of the IEEE Congress on Evolutionary Computation, IEEE
Press, Portland, Oregon, pp.809–815.
Graff, M., Peña, R. and Medina, A. (2013) ‘Wind speed
forecasting using genetic programming’, in IEEE Congress
on Evolutionary Computation, IEEE, pp.408–415.
Haykin, S. (1998) Neural Networks: A Comprehensive
Foundation, 2nd ed., Prentice Hall PTR, Upper Saddle River,
NJ, USA.
Haykin, S. (1999) Neural Networks: A Comprehensive
Foundation, Prentice Hall, Upper Saddle River, NJ, USA.
Hoffmann, L. (2009) Multivariate Isotonic Regression and its
Algorithms, Wichita State University, College of Liberal Arts
and Sciences, Department of Mathematics and Statistics.
Huang, W., Nakamori, Y. and Wang, S-Y. (2005) ‘Forecasting
stock market movement direction with support vector
machine’, Computers and Operations Research, Applications
of Neural Networks, Vol. 32, No. 10, pp.2513–2522.
Keijzer, M. (2003) ‘Improving symbolic regression with interval
arithmetic and linear scaling’, in Proceedings of the 6th
European Conference on Genetic Programming EuroGP,
Springer-Verlag, Berlin, Heidelberg, pp.70–82.
Kommenda, M., Kronberger, G., Winkler, S., Affenzeller, M. and
Wagner, S. (2013) ‘Effects of constant optimization by
nonlinear least squares minimization in symbolic regression’,
Proceedings of the Fifteenth Annual Conference Companion
on Genetic and Evolutionary Computation (GECCO
Companion), pp.11–21.
Koza, J.R. (1992) Genetic Programming: On the Programming of
Computers by Means of Natural Selection, MIT Press,
Cambridge, MA, USA.
Koza, J.R. (2010) ‘Human-competitive results produced by
genetic programming’, Genetic Programming and Evolvable
Machines, Vol. 11, Nos. 3–4, pp.251–284.
Krawiec, K. and Lichocki, P. (2009) ‘Approximating geometric
crossover in semantic space’, in GECCO: Proceedings of the
11th Annual Conference on Genetic and Evolutionary
Computation, pp.987–994, ACM, Montreal.
Krawiec, K. and O’Reilly, U-M. (2014) ‘Behavioral programming:
a broader and more detailed take on semantic GP’,
in Proceedings of the 2014 Conference on Genetic and
Evolutionary Computation, GECCO, ACM, New York, NY,
USA, pp.935–942.
Moody, J. and Saffell, M. (2001) ‘Learning to trade via direct
reinforcement’, Neural Networks, IEEE Transactions on,
Vol. 12, No. 4, pp.875–889.
Moraglio, A. and Mambrini, A. (2013) ‘Runtime analysis of
mutation-based geometric semantic genetic programming for
basis functions regression’, in Proceedings of the 15th Annual
Conference on Genetic and Evolutionary Computation
GECCO, ACM, New York, NY, USA, pp.989–996.
Moraglio, A., Krawiec, K. and Johnson, C.G. (2012) ‘Geometric
semantic genetic programming’, in C.A. Coello Coello, V.
Cutello, K. Deb, S. Forrest, G. Nicosia and M. Pavone (Eds.):
Parallel Problem Solving from Nature, PPSN XII (Part 1),
Lecture Notes in Computer Science, Vol. 7491, pp.21–31,
Springer.
Patel, J., Shah, S., Thakkar, P. and Kotecha, K. (2015) ‘Predicting
stock market index using fusion of machine learning
techniques’, Expert Systems with Applications, Vol. 42, No. 4,
pp.2162–2172.
Seber, G. and Wild, C. (2003) Nonlinear Regression, Wiley Series
in Probability and Statistics, Wiley.
Shen, S., Jiang, H. and Zhang, T. (2012) Stock Market Forecasting
using Machine Learning Algorithms, Stanford University,
Santa Clara, California, USA.
Smart, W. and Zhang, M. (2004) ‘Continuously evolving programs
in genetic programming using gradient descent’, in R.I.
Mckay and S-B. Cho (Eds.): Proceedings of The Second
Asian-Pacific Workshop on Genetic Programming, p.16,
Cairns, Australia.
Sorensen, D.C. (1982) ‘Newton’s method with a model trust
region modification’, SIAM Journal on Numerical Analysis,
Vol. 19, No. 2, pp.409–426.
Tay, F.E. and Shen, L. (2002) ‘Economic and financial prediction
using rough sets model’, European Journal of Operational
Research, Vol. 141, No. 3, pp.641–659.
Topchy, A. and Punch, W.F. (2001) ‘Faster genetic programming
based on local gradient search of numeric leaf values’,
in L. Spector, E.D. Goodman, A. Wu, W.B. Langdon, H-M.
Voigt, M. Gen, S. Sen, M. Dorigo, S. Pezeshk, M.H. Garzon
and E. Burke (Eds.): Proceedings of the Genetic
and Evolutionary Computation Conference (GECCO-2001),
pp.155–162, Morgan Kaufmann.
Vanneschi, L., Castelli, M. and Silva, S. (2014a) ‘A survey of
semantic methods in genetic programming’, Genetic
Programming and Evolvable Machines, Vol. 15, No. 2,
pp.195–214.
Vanneschi, L., Silva, S., Castelli, M. and Manzoni, L. (2014b)
‘Geometric semantic genetic programming for real life
applications’, in Genetic Programming Theory and Practice
XI, Springer, New York, pp.191–209.
Wang, P., Tang, K., Tsang, E.P.K. and Yao, X. (2011) ‘A memetic
genetic programming with decision tree-based local search
for classification problems’, in IEEE Congress on
Evolutionary Computation, IEEE, pp.917–924.
Weisberg, S. (2005) Applied Linear Regression, Wiley Series in
Probability and Statistics, Wiley, Hoboken, New Jersey,
USA.
Z-Flores, E., Trujillo, L., Schuetze, O. and Legrand, P. (2014)
‘Evaluating the effects of local search in genetic
programming’, in A-A. Tantar et al. (Eds.): EVOLVE A
Bridge between Probability, Set Oriented Numerics, and
Evolutionary Computation, Advances in Intelligent Systems
and Computing, No. 288, pp.213–228, Springer.
Zhang, M. and Smart, W. (2004) ‘Genetic programming with
gradient descent search for multiclass object classification’,
in M. Keijzer, U-M. O’Reilly, S.M. Lucas, E. Costa and
T. Soule (Eds.): Genetic Programming 7th European
Conference, EuroGP, Proceedings, LNCS, Vol. 3003,
pp.399–408, Springer-Verlag, Coimbra, Portugal.
ResearchGate has not been able to resolve any citations for this publication.
Article
Full-text available
In a recent contribution we have introduced a new implementation of geometric semantic operators for Genetic Programming. Thanks to this implementation, we are now able to deeply investigate their usefulness and study their properties on complex real-life applications. Our experiments confirm that these operators are more effective than traditional ones in optimizing training data, due to the fact that they induce a unimodal fitness landscape. Furthermore, they automatically limit overfitting, something we had already noticed in our recent contribution, and that is further discussed here. Finally, we investigate the influence of some parameters on the effectiveness of these operators, and we show that tuning their values and setting them “a priori” may be wasted effort. Instead, if we randomly modify the values of those parameters several times during the evolution, we obtain a performance that is comparable with the one obtained with the best setting, both on training and test data for all the studied problems.
Article
Full-text available
Geometric semantic operators are new and promising genetic operators for genetic programming. They have the property of inducing a unimodal error surface for any supervised learning problem, i.e., any problem consisting in finding the match between a set of input data and known target values (like regression and classification). Thanks to an efficient implementation of these operators, it was possible to apply them to a set of real-life problems, obtaining very encouraging results. We have now made this implementation publicly available as open source software, and here we describe how to use it. We also reveal details of the implementation and perform an investigation of its efficiency in terms of running time and memory occupation, both theoretically and experimentally. The source code and documentation are available for download at http://gsgp.sourceforge.net.
Article
Full-text available
Several methods to incorporate semantic awareness in genetic programming have been proposed in the last few years. These methods cover fundamental parts of the evolutionary process: from the population initialization, through different ways of modifying or extending the existing genetic operators, to formal methods, until the definition of completely new genetic operators. The objectives are also distinct: from the maintenance of semantic diversity to the study of semantic locality; from the use of semantics for constructing solutions which obey certain constraints to the exploitation of the geometry of the semantic topological space aimed at defining easy-to-search fitness landscapes. All these approaches have shown, in different ways and amounts, that incorporating semantic awareness may help improving the power of genetic programming. This survey analyzes and discusses the state of the art in the field, organizing the existing methods into different categories. It restricts itself to studies where semantics is intended as the set of output values of a program on the training data, a definition that is common to a rather large set of recent contributions. It does not discuss methods for incorporating semantic information into grammar-based genetic programming or approaches based on formal methods. The objective is keeping the community updated on this interesting research track, hoping to motivate new and stimulating contributions.
Conference Paper
Geometric semantic genetic programming (GSGP) is a recently introduced form of genetic programming (GP) that searches the semantic space of functions/programs. The fitness landscape seen by GSGP is, by construction, unimodal with a linear slope for any domain and any problem. This makes the search for the optimum much easier than in traditional GP, and it opens the way to a straightforward theoretical analysis of the optimisation time of GSGP in a general setting. Very recent work proposed a runtime analysis of mutation-based GSGP on the class of all Boolean functions. We present a runtime analysis of mutation-based GSGP on the class of all regression problems with generic basis functions (encompassing, e.g., polynomial regression and trigonometric regression).
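In symbols, and under the usual assumptions of these analyses, the fitness seen by GSGP is a distance in semantic space:

f(P) = \lVert s(P) - t \rVert_2, \qquad s(P') = s(P) + ms \, \bigl( s(R_1) - s(R_2) \bigr)

where $s(P)$ is the semantics of program $P$ (its output vector on the training cases), $t$ is the target vector, and $P'$ is the offspring obtained by geometric semantic mutation of $P$ with random trees $R_1, R_2$ and mutation step $ms$. Since $f$ measures the distance to a single fixed point $t$, the induced error surface has no local optima other than the global one, which is the unimodality property the runtime analysis relies on.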
Article
Forecasting the stock market is a difficult task for applied researchers because the underlying data are very noisy and time-varying. Nevertheless, several empirical studies have addressed the problem, and a number of researchers have successfully applied machine learning techniques to stock market forecasting. This paper studies stock prediction from the investor's point of view: investors often incur losses because of unclear investment objectives and poor visibility into assets. The paper proposes a rough set model, a neural network model, and a hybrid neural network and rough set model to find optimal buy and sell points for a share on the Dhaka Stock Exchange. Experimental findings show that the proposed hybrid model is more accurate than either the rough set model or the neural network model alone. We believe these findings will help stock investors decide on optimal buy and/or sell times on the Dhaka Stock Exchange.
Article
Genetic programming (GP) coarsely models natural evolution to evolve computer programs. Unlike in nature, where individuals can often improve their fitness through lifetime experience, the fitness of GP individuals generally does not change during their lifetime, and there is usually no opportunity to pass on acquired knowledge. This paper introduces the Chameleon system to address this discrepancy and augments GP with lifetime learning by adding a simple local search that tunes the internal nodes of individuals. Although not the first attempt to combine local search with GP, its simplicity makes it easy to understand and cheap to implement. A simple cache leverages the local search to reduce the tuning cost to a small fraction of the expected cost, and we provide a theoretical upper limit on the maximum tuning expense given the average tree size of the population, showing that this limit grows very conservatively as the average tree size increases. We show that Chameleon uses available genetic material more efficiently by exploring more actively than standard GP, and we demonstrate that Chameleon not only outperforms standard GP (on both training and test data) over a number of symbolic regression problems, but does so while producing smaller individuals, and that it works harmoniously with two other well-known extensions to GP, namely linear scaling and a diversity-promoting tournament selection method.
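As a rough illustration of this kind of lifetime learning, the sketch below greedily retunes the operator at each internal node of a tree and keeps any swap that lowers training error. It assumes a simple dictionary-based tree encoding; it is only a schematic reading of the idea, not the Chameleon implementation, and it omits the caching described in the abstract.

import numpy as np

OPS = {'add': np.add, 'sub': np.subtract, 'mul': np.multiply}

def evaluate(node, X):
    # Recursively evaluate a tree encoded as nested dicts on a data matrix X.
    if node['op'] == 'var':
        return X[:, node['index']]
    if node['op'] == 'const':
        return np.full(X.shape[0], node['value'])
    left = evaluate(node['children'][0], X)
    right = evaluate(node['children'][1], X)
    return OPS[node['op']](left, right)

def rmse(root, X, y):
    return float(np.sqrt(np.mean((evaluate(root, X) - y) ** 2)))

def tune_nodes(root, X, y):
    # Greedy local search: at each internal node, try every alternative
    # operator and keep the swap whenever it lowers training error.
    best = rmse(root, X, y)
    stack = [root]
    while stack:
        node = stack.pop()
        if node['op'] in OPS:
            stack.extend(node['children'])
            kept = node['op']
            for alt in OPS:
                node['op'] = alt
                err = rmse(root, X, y)
                if err < best:
                    best, kept = err, alt
            node['op'] = kept
    return best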
Article
In this paper, a new hybrid model integrating an interval type-2 fuzzy logic system (IT2FLS) with a computationally efficient functional link artificial neural network (CEFLANN) and an exponential generalized autoregressive conditional heteroskedasticity (EGARCH) model is proposed for accurate forecasting and modeling of financial data whose variance changes over time. The proposed model, denoted IT2F-CE-EGARCH, enhances the EGARCH model by jointly estimating its important features, such as the leverage effect and the asymmetric response to shocks, with the secondary membership functions of an interval type-2 TSK FLS and the functional expansion and learning component of a CEFLANN. The secondary membership functions, with their upper and lower limits, provide a forecasting interval for handling the more complicated uncertainties involved in volatility forecasting compared with a type-1 FLS. The performance of the proposed model has been examined with two membership functions: a Gaussian with fixed mean and uncertain variance, and a Gaussian with fixed variance and uncertain mean. The proposed model has also been compared with several other fuzzy time series models and GARCH-family models on four performance metrics: MSFE, RMSFE, MAFE and Rel MAE. In addition, a differential harmony search (DHS) algorithm is suggested for optimizing the parameters of all the fuzzy time series models. The results indicate that the proposed IT2F-CE-EGARCH model offers significant improvements in volatility forecasting performance compared with all other specified models on the BSE Sensex and CNX Nifty datasets.
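For reference, the EGARCH recursion the hybrid builds on is the standard one due to Nelson (1991); in its (1,1) form:

\ln \sigma_t^2 = \omega + \beta \ln \sigma_{t-1}^2 + \alpha \bigl( |z_{t-1}| - \mathbb{E}|z_{t-1}| \bigr) + \gamma z_{t-1}, \qquad z_t = \varepsilon_t / \sigma_t

The $\gamma z_{t-1}$ term captures the leverage effect mentioned above: with $\gamma < 0$, negative shocks raise the log-variance more than positive shocks of the same magnitude, and modeling the logarithm keeps $\sigma_t^2$ positive without parameter constraints.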
Conference Paper
In this publication, a constant optimization approach for symbolic regression is introduced to separate the task of finding the correct model structure from the need to evolve correct numerical constants. A gradient-based nonlinear least squares optimization algorithm, the Levenberg-Marquardt (LM) algorithm, is used to adjust constant values in symbolic expression trees during their evolution. The LM algorithm depends on gradient information consisting of the partial derivatives of the trees, which are obtained by automatic differentiation. The presented constant optimization approach is tested on several benchmark problems and compared with a standard genetic programming algorithm to show its effectiveness. Although constant optimization adds execution-time overhead, it significantly increases both the achieved accuracy and the ability of genetic programming to learn from the provided data. As an example, the Pagie-1 problem could be solved in 37 out of 50 test runs, whereas without constant optimization it was solved in only 10 runs. Furthermore, different configurations of the constant optimization approach (number of iterations, probability of applying constant optimization) are evaluated, and their impact is detailed in the results section.
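A minimal sketch of the constant-tuning step, with illustrative names: the tree structure is frozen as f(x) = c0*sin(c1*x) + c2 and only its constants are refined. The cited work uses Levenberg-Marquardt with gradients from automatic differentiation inside the GP run; here SciPy's least_squares with method='lm' stands in for that machinery.

import numpy as np
from scipy.optimize import least_squares

def model(consts, x):
    # A fixed symbolic expression whose structure GP has already found;
    # only the numeric constants c0, c1, c2 remain to be tuned.
    c0, c1, c2 = consts
    return c0 * np.sin(c1 * x) + c2

def residuals(consts, x, y):
    return model(consts, x) - y

rng = np.random.default_rng(0)
x = np.linspace(0.0, 4.0, 50)
y = 2.0 * np.sin(1.5 * x) + 0.5 + rng.normal(0.0, 0.05, x.size)

# Levenberg-Marquardt refinement of the constants from a rough initial guess.
fit = least_squares(residuals, x0=[1.0, 1.0, 0.0], args=(x, y), method='lm')
print(fit.x)  # should land close to the true constants (2.0, 1.5, 0.5)

In a full system this refinement would be applied to some or all individuals in each generation, which is where the execution-time overhead discussed above comes from.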