Decision Making Process of Stock Trading Implementing DRQN and ARIMA

Monirul Islam Pavel
Center for Artificial Intelligence Technology, Faculty of Information Science & Technology, Universiti Kebangsaan Malaysia, Selangor, Malaysia
Email: P104619@siswa.ukm.edu.my
orcid.org/0000-0001-9470-7725

Dewan Ahmed Muhtasim
Center for Artificial Intelligence Technology, Faculty of Information Science & Technology, Universiti Kebangsaan Malaysia, Selangor, Malaysia
Email: dewanmuhtasim@gmail.com

Omar Faruk
Faculty of Information Science & Technology, Universiti Kebangsaan Malaysia, Selangor, Malaysia
Email: ga03843@siswa.ukm.edu.my
Abstract—The approach of collecting realistic trading signals throughout the transaction process to broaden advantages is a long-studied issue. The rapid expansion and dynamic character of stock markets is a major issue for the financial sector, as conventional trading tactics developed by experienced financial professionals do not generate sufficient performance under all market situations. Most previous studies have applied Machine Learning and Deep Learning to price estimation, yet few have addressed stock trading decisions. To solve this difficulty, adaptive stock trading strategies based on deep reinforcement learning are suggested. This study presents the implementation of Deep Recurrent Q-Learning (DRQN) and the Autoregressive Integrated Moving Average (ARIMA) model for stock trading: the predicted closing value of a stock supports strategic decision-making by acknowledging the risk of buying, holding, and selling, together with profit calculation. The method was applied to 15 Nasdaq stock datasets and overcomes limitations of recently developed reinforcement learning methods. The proposed fusion of the DRQN- and ARIMA-based strategy displays robust results that support better stock trading decisions, with visualized experimental outcomes.
Keywords— reinforcement learning, DRQN, stock data, stock
market, LSTM, ARIMA, stock trading.
I. INTRODUCTION
To maximise profit, stock should ideally be obtained at a
low cost and disposed of at a high cost. The greater the
difference between the price of sales and the price of
purchases, the greater the return from trading. As a result, the
goal of regular stock traders is to profit from short-term events
by purchasing stock at a lower price and selling it at a higher
value. Traders' primary task is to identify when to purchase
and sell shares [1].
The ever-expanding character of machine learning research is
becoming a sanctuary for exploration in increasingly
sophisticated applications such as stock trading. This is
referred to as a decision-making process, and it assists traders
in achieving certain return-on-investment performance
indicators such as profit, economic utility, and risk-adjusted return, among others. Nonetheless, research in this
field has focused on the advancement of machine learning and
artificial intelligence over the years to cope with the volatile
behaviour of stock markets. The accuracy of the anticipated price or trend determines the prediction's efficacy: higher accuracy corresponds to a smaller difference between the forecasted and actual values. Traditional statistical learning algorithms are
limited in their capacity to cope with the non-linearity inherent
in stock markets, therefore supervised and unsupervised
learning methods are employed [2]. However, these algorithms often rely specifically on accuracy in price prediction. The primary purpose of forecasting future prices is to determine whether or not it is a suitable price at which to buy or sell the stock. Nonetheless, the basic reinforcement learning methodology is incapable of capturing time series data, while stock history data is precisely a lengthy sequential time series. This challenge is resolved using DRQN with LSTM, which is also beneficial for tackling sequence prediction issues due to its capacity to capture foreknowledge, and which merges Deep Learning's observation with reinforcement learning's decision-making. Moreover, in the proposed work, stock closing values are forecast for the next 100 days and buy and sell signals are generated, which enhances the chance of profitable trading while reducing risks. Profit is also assessed in order to evaluate risk.
The remainder of the paper is organised as follows: Section II covers similar studies that have used deep learning, machine learning, and reinforcement learning techniques; Section III describes the proposed methodology and the models implemented; Section IV presents the findings and their analysis; and Section V summarises the research's overall objective.
II. RELATED WORKS
Multiple deep learning, reinforcement learning, and machine learning methods have been studied previously, mostly for stock price prediction.
A. Deep Learning Based Approaches
Nguyen, D.H.D. et al. implemented an LSTM model that used the average of the preceding five days' stock market information (opening price, closing price, high, low, volume, close) as its input value. This value was used to generate the initial estimate. To calculate a mean of the stock price data for the following five days, the ARIMA technique was utilised to incorporate the forecast into the average stock price information. Furthermore, the researchers used technical analysis indicators to determine whether to purchase, retain, or sell stocks [3].
Selvin, S. et al. proposed a combined LSTM, RNN, and CNN sliding-window approach for short-term stock price prediction [4]. They did not fit the data to a particular model, but rather detected the underlying dynamics of the data with deep learning frameworks.
However, the model uses the information provided at a specific moment for prediction. Although the other two models are utilised in many other time-dependent data analyses, the CNN architecture does not carry information across time, which suits the rapid fluctuations in stock markets.
2021 IEEE Madras Section Conference (MASCON) | 978-1-6654-0405-1/21/$31.00 ©2021 IEEE | DOI: 10.1109/MASCON51689.2021.9563476
Authorized licensed use limited to: Universiti Kebangsaan Malaysia. Downloaded on October 20,2021 at 12:38:20 UTC from IEEE Xplore. Restrictions apply.
For each series, Skehin, T. et al. suggested a linear Autoregressive Integrated Moving Average (ARIMA) model and an LSTM network to forecast the next day. Wavelet techniques decomposed each series to approximate and describe its behaviour in detail across time. These approaches were integrated in a novel ensemble model to enhance predictive accuracy [5].
Chatzis, S. P. et al. developed a DNN model using boosting methods to forecast bouts of stock market crises. According to their study, forecasting stock market crisis events was useful for price prediction, although the research was not tied to particular prediction techniques [6].
Nakagawa, K. et al. suggested a deep-factor model together with a shallow DNN model, implying that the link between stock returns on the financial market and factors is nonlinear rather than linear. Other machine learning techniques such as SVR and random forests were also compared with the deep learning model, and the shallow model obtained the highest precision among the machine learning approaches [7].
Ding, X. presented a deep learning technique for event-driven stock market prediction. First, events were retrieved from news texts and represented as dense vectors by a novel neural tensor network (NTN). Secondly, a CNN was utilised to estimate the effects on stock prices in both the near term and the long run [8].
Hu, G. et al. presented a convolutional autoencoder model that transforms 4-channel stock time series (daily lowest, highest, opening, and closing prices) into candlestick charts in order to acquire a stock representation using a synthesis approach [9]. This solution effectively avoids costly annotation. In terms of total return, the suggested model exceeded the FTSE 100 index and many well-known funds.
B. Reinforcement Learning Based Approaches
Carapuço, J. et al. developed a Q-learning based reinforcement learning (RL) network model; three hidden layers of ReLU neurons were trained as agents under the Q-learning algorithm in a novel market-based simulation environment. The approach reliably induced stable and generalised learning on out-of-sample data [10].
In order to resolve the portfolio management problem and create a standalone deep reinforcement learning model, Kang, Q. et al. implemented the state-of-the-art Asynchronous Advantage Actor-Critic (A3C) algorithm [11].
Si, W. et al. developed a reinforcement learning model with multiple objectives and LSTM agents. Feature learning was found to contribute to improved performance. The LSTM network made consistent choices and could change positions in time, reducing transaction costs and drawing good profit from the multi-objective structure within an acceptable risk [12].
Further, a deep reinforcement learning approach for suggesting cryptocurrency trading points was proposed by Sattarov, O. et al. [20] to avoid capital decrease. The approach was used to develop an application that analyses historical price movements and acts on real-time prices, tested on Bitcoin, Ethereum, and Litecoin data. Their method obtained 74%, 41%, and 14.4% profit on Litecoin, Ethereum, and Bitcoin, respectively.
C. Machine Learning Based Approaches
In [13], the authors developed a stock price prediction model utilising the ARIMA model. Datasets were collected from the NSE (Nigeria Stock Exchange) and the NYSE (New York Stock Exchange) for stock price forecasting, with the Eviews version 5 programme as the implementation tool. The criteria used to identify the optimal ARIMA model for each stock index included the Schwarz (Bayesian) information criterion, a very low standard error of regression, and a relatively high adjusted R². The two datasets utilised were the Nokia stock index and the Zenith Bank index, for which the best-fitting models were ARIMA(2,1,0) and ARIMA(1,0,1), respectively, each yielding a reduced Bayesian information criterion and a substantially smaller standard error of regression. The results therefore show that ARIMA models have the capacity to forecast short-term stock prices.
An experimental study was done using SVR analysis as a machine learning approach for predicting stock prices and stock market trends [14]. The authors employed several types of window operators, including the flatten window and the basic rectangular window, to feed more trustworthy entries into the regression models, transforming time series data into generic data. The test was carried out utilising the Dhaka Stock Exchange (DSE), which contained historical data from 2009 to 2014. The results demonstrate that good stock price prediction outcomes were obtained by SVR models developed using the flatten window and rectangular window operators.
In [15], a survey on stock price prediction using neural networks, Support Vector Machines (SVM), and Hidden Markov Models (HMM) was presented. In that article, the HMM was proposed as the prediction approach, and the existing methodologies were compared. Training was carried out using the Baum-Welch algorithm. The approach was tested on three distinct equities: SBI, IDBI, and ICICI. The results revealed that the suggested model was more precise than existing approaches.
III. METHODOLOGY
A. Nasdaq Dataset
To assess the DRL agents in a realistic situation, past daily Nasdaq stock trading data are utilized as training and testing sets, collected in CSV format from Nasdaq's website [16]. Since there are more than 7000 stocks in the US market, it would be time-consuming to use all of them for evaluation. Therefore, a few Nasdaq stocks are sampled as the evaluation dataset: BV (BrightView Holdings, Inc), CG (Carlyle Group Inc), CAAS (China Automotive Systems, Inc), CAC (Camden National Corporation), CAKE (Cheesecake Factory Inc), BEBE (bebe stores, inc.), BCS (Barclays PLC), AGIO (Agios Pharmaceuticals Inc), BANX (StoneCastle Financial Corp), BCRX (BioCryst Pharmaceuticals, Inc.), DGX (Quest Diagnostics Inc), GOOG (Alphabet Inc), DAL (Delta Air Lines, Inc), FB (Facebook, Inc), and ABBV (AbbVie Inc), containing values up to 2021-06-30.
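For illustration, the closing-price column of one of these CSV files can be loaded with a short helper. The column name `Close/Last`, the leading `$` on prices, and the newest-first row order are assumptions about Nasdaq's CSV export format rather than details stated in this paper; they should be adjusted to match the actual files.

```python
import csv

def load_close_prices(path, close_column="Close/Last"):
    """Read a Nasdaq-style CSV export and return closing prices in
    chronological order (oldest first)."""
    closes = []
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            # Nasdaq exports quote prices like "$135.20"; strip the sign.
            closes.append(float(row[close_column].strip().lstrip("$")))
    # Exports list the most recent trading day first; reverse the order.
    return closes[::-1]
```

The resulting list can then be split into training and testing segments before being fed to the agents.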
B. DRQN
Deep Q-Network (DQN) has two fundamental components: experience replay and a Q-target network. Deep Q-learning is Q-learning with the Q-table substituted by a deep neural network. This method solves the following issue: if the state space and action space are continuous or indefinitely discrete, the Q-values in a Q-table cannot be iterated with finite samples. Since it has been demonstrated that a 3-layer neural network can represent any function [17], the fundamental concept of deep Q-learning is to utilise a neural network to approximate the Q-function.
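Before approximating it with a network, the underlying Q-learning update can be stated in tabular form. The sketch below is a generic illustration (the learning-rate and discount values are arbitrary choices, not parameters reported in this paper); DQN and DRQN replace the table `Q` with a neural network approximator.

```python
def q_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.99):
    """One tabular Q-learning step:
    Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    target = r + gamma * max(Q[s_next])    # bootstrapped return estimate
    Q[s][a] += alpha * (target - Q[s][a])  # move Q(s,a) toward the target
    return Q
```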
In the DQN framework, it is essential to assume that the state is a complete observation of the environment at every step. However, variables that exist in the temporal dimension of a stock trading environment generally cannot be captured effectively under the DQN framework. To tackle this issue, the deep recurrent Q-network (DRQN) is applied [18]. In the DRQN architecture, the hidden state of the LSTM units carries information from the preceding steps. Through temporal interrelationships, the LSTM may uncover concealed data and maintain essential features of prior states [19]. Implementing LSTM in the proposed DQN framework gives the specified state a closer look at the trading environment.
The characteristics of DQN keep the agent stable during training, which is why DQN techniques are used instead of tabular Q-learning methods. More particularly, a recurrent layer is added at the input to collect sequential state data in DRQN [21, 22]. During DRQN implementation, the following policies are adopted: (a) the agent may only execute one action per step, so a stock cannot be purchased and sold simultaneously; (b) a sell action is only valid if the agent holds at least one unit; (c) the agent may only purchase or sell one unit at a time; (d) the agent is given starting capital of $10000 for trading stocks at the beginning of each episode.
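Policies (a)-(d) can be sketched as a minimal trading environment. The class layout and the convention that an invalid or unaffordable action falls back to a hold are illustrative assumptions, not the authors' implementation.

```python
class TradingEnv:
    """Single-stock environment enforcing the DRQN trading policies:
    one action per step, sell only when holding, one unit per trade,
    and a fixed starting capital."""

    def __init__(self, prices, starting_capital=10000.0):
        self.prices = prices             # chronological closing prices
        self.capital = starting_capital  # policy (d): $10000 to start
        self.units = 0                   # stock units currently held
        self.t = 0                       # current time step

    def step(self, action):
        """Apply one action (0=hold, 1=buy, 2=sell) and advance one day."""
        price = self.prices[self.t]
        if action == 1 and self.capital >= price:  # buy one unit, policy (c)
            self.units += 1
            self.capital -= price
        elif action == 2 and self.units > 0:       # sell needs a held unit, policy (b)
            self.units -= 1
            self.capital += price
        # policy (a): exactly one action per step; any other case
        # (e.g. selling with no units held) degenerates to a hold.
        self.t += 1
        return self.capital, self.t >= len(self.prices)
```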
Fig. 1. Architecture for proposed DRQN model.
C. Implementation of DRQN
To implement DRQN, the problem is divided into three parts: state, action, and reward, as shown in Fig. 1, treating it as a Markov Decision Process (MDP) [22]. The state space denotes the observation of the environment, which is the stock price at any timestamp. The action space, which is what an agent can do in each state, comprises three trading signals: the values 0, 1, and 2 represent hold, buy, and sell, respectively. Action-based observation of the environment results in the calculation of profit or loss. The starting capital for trading is set for every agent, and the profit produced by the agent depends on the balance held at the conclusion of the transaction. If a purchase is made, the balance is updated so that profit can be calculated as the difference between the current amount and the money invested; in the case of selling, the profit is realised. Considering $\Delta B$ as the current balance and $\Delta C$ as the starting capital, the equation is given as

$\text{Profit} = \Delta B - \Delta C$ (1)

The starting capital given to the agent at the beginning is assumed to be $10000; the current balance is the money held by the agent at the cut-off point.
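A minimal sketch of the profit bookkeeping around Eq. (1). Reporting profit as a percentage of the starting capital is an assumption inferred from the "Profit (%)" column of Table 1, not a convention stated explicitly in the paper.

```python
def profit(current_balance, starting_capital=10000.0):
    """Absolute profit: the balance held at the cut-off point minus the
    starting capital given to the agent."""
    return current_balance - starting_capital

def profit_percent(current_balance, starting_capital=10000.0):
    """Profit expressed relative to the starting capital."""
    return 100.0 * (current_balance - starting_capital) / starting_capital
```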
D. Forecasting Using ARIMA
Time series forecasting is an interdisciplinary technique used to address stock price estimation issues. It is adaptable, since just historical observations of the relevant variables are required. Stock price index data are typically gathered in time order. Although treated as time series data, they have strong nonlinear features and time differences. In time series analysis, ARIMA is highly versatile, combining the benefits of time series and regression methods [23, 24].
The nonseasonal algorithm can be categorized as ARIMA(p, d, q), where p refers to the number of autoregressive (AR) terms, d refers to the number of nonseasonal differences (the integrated, I, part), and q refers to the number of moving average (MA) terms [25, 26].
Now, to apply ARIMA, $y_1, y_2, \ldots, y_t$ is taken as the time series of the stock trading problem, and the model is constructed by the following steps.
First, the nonstationary time series is stabilized by differencing; the integrated (I) term of the ARIMA model removes the effects of nonstationarity. With $d = 1$, the differenced series $z_t$ is

$z_t = y_t - y_{t-1}$ (2)

In general, $z_t$ denotes the $d$-th difference of the series,

$z_t = (1 - B)^d y_t$ (3)

where $B$ is the backshift operator. Combining the AR and MA parts on the differenced series gives

$\hat{z}_t = \beta_0 + \beta_1 z_{t-1} + \cdots + \beta_p z_{t-p} + \varepsilon_t - \theta_1 \varepsilon_{t-1} - \cdots - \theta_q \varepsilon_{t-q}$ (4)

Here, (4) is the general equation of ARIMA, where $\beta$ denotes the slope coefficients, $\theta$ the moving average parameters, and $\varepsilon$ the error term.
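The integrated (I) step can be illustrated with plain first-order differencing and its inverse. This is a generic sketch of the differencing operation, not the authors' code, and the helper names are arbitrary.

```python
def difference(series, d=1):
    """Apply first-order differencing d times (the 'I' in ARIMA):
    z_t = y_t - y_{t-1}, repeated to remove nonstationary trends."""
    for _ in range(d):
        series = [series[i] - series[i - 1] for i in range(1, len(series))]
    return series

def undifference(diffed, first_value):
    """Invert one round of differencing given the first original value,
    recovering y_t = y_{t-1} + z_t (maps forecasts back to price level)."""
    out = [first_value]
    for z in diffed:
        out.append(out[-1] + z)
    return out
```

The AR and MA parts are then fitted on the differenced (stationary) series, and the forecasts are undifferenced back to the price scale.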
IV. RESULTS AND ANALYSIS
To evaluate the outcomes, ARIMA-based time series prediction first estimates the next 100 days of stock closing prices. In addition to every agent's ultimate profit, the capital acquired or lost over the test period is also evaluated. At first, the capital acquired is negative, since the agent may only spend capital on purchase or hold actions. Capital drops further on subsequent days if the agent executes purchase activities rather than sell actions. Near the end of the test period, the majority of the gain reversed to a positive value as the agent profited from units being sold at a higher closing price. Fig. 2 illustrates the agents' buy and sell signals over the forecasted set: the daily close price is plotted, with coloured markers denoting purchase and sale points.
A blue circle represents a purchase action taken by the agent at that location, whereas a red downward triangle indicates a sell action taken by the agent at that point. Moreover, Table 1 shows the calculated profit obtained from the DRQN rewards on each dataset, where GOOG and CAKE showed the highest profit outcomes of 7.11% and 6.02% based on selling, and DGX and DAL showed the lowest profit outcomes of -5.28% and -3.51%. The table also lists the rewards and costs obtained from DRQN with 200 iterations, a window size of 30, and a batch size of 64. Moreover, mean square error (MSE) and mean absolute error (MAE) are computed to validate the ARIMA forecasting results.
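The plotted purchase and sale markers follow directly from the agent's action sequence; a small helper of this kind (the names are assumed, not taken from the paper) extracts the marker positions.

```python
def signal_points(actions):
    """Split a sequence of agent actions (0=hold, 1=buy, 2=sell) into the
    index lists used to mark buy and sell points on the price curve."""
    buys = [t for t, a in enumerate(actions) if a == 1]
    sells = [t for t, a in enumerate(actions) if a == 2]
    return buys, sells
```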
Considering $y_i$ as the actual value and $\hat{y}_i$ as the predicted value over $n$ test points, the equations of MSE and MAE are

$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n} (y_i - \hat{y}_i)^2$ (5)

$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n} \lvert y_i - \hat{y}_i \rvert$ (6)
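Equations (5) and (6) translate directly into two small helpers; this is a generic sketch with assumed argument names.

```python
def mse(actual, predicted):
    """Mean squared error, Eq. (5): average of the squared residuals."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def mae(actual, predicted):
    """Mean absolute error, Eq. (6): average of the absolute residuals."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)
```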
For the prediction using ARIMA, the lowest MSE was 0.046 for BANX and the highest was 3.67 for DGX. In addition, the minimum MAE of 1.1% was achieved by CG, while BCRX scored 20.77%, the most unstable result compared with the other 14 datasets.
Fig. 2. Outcomes of DRQN showing buy and sell signals for (a) BV, (b) CG, (c) CAAS, (d) CAC, (e) CAKE, (f) BEBE, (g) BCS, (h) AGIO, (i) BANX, (j) BCRX, (k) DGX, (l) GOOG, (m) DAL, (n) FB, (o) ABBV.
Table 1. Outcomes and validation for 15 datasets.

Dataset | DRQN Profit (%) | DRQN Reward | DRQN Cost | ARIMA MSE | ARIMA MAE (%)
BV | -0.529 | 11.08 | 0.034 | 0.315 | 2.7
CG | 0.097 | 42.06 | 0.243 | 0.073 | 1.1
CAAS | -0.115 | -2.62 | 0.008 | 2.5 | 9.7
CAC | 1.25 | 49.6 | 5.003 | 0.071 | 1.4
CAKE | 6.02 | 28.56 | 5.87 | 0.194 | 2.47
BEBE | 0.022 | 0.048 | 0.033 | 0.229 | 2.6
BCS | -0.034 | 4.91 | 0.051 | 0.191 | 2.5
AGIO | 4.56 | 23.62 | 1.385 | 0.093 | 1.3
BANX | 0.029 | 5.34 | 0.0187 | 0.046 | 3.33
BCRX | 0.4 | 19.26 | 0.19 | 0.45 | 20.77
DGX | -5.28 | 35.12 | 2.75 | 3.67 | 5.07
GOOG | 7.11 | 1335 | 547 | 1.68 | 10.7
DAL | -3.51 | 51.41 | 1.2 | 1 | 8.71
FB | 0.32 | 91.51 | 33.16 | 3.2 | 9.68
ABBV | 0.16 | 49.47 | 2.38 | 2.23 | 4.11
V. CONCLUSION
From the perspective of stock trading, a valuable part is knowing when to buy, sell, or hold a stock, and knowing the estimated risk by calculating profit. Although there has been vast improvement in artificial intelligence based analysis of stock data, most approaches are limited to predicting future stock prices only, which is not as effective given the uncertain nature of stock records. In this work, these issues are overcome by combining a deep recurrent Q-network and ARIMA for predicting closing values and generating buy, sell, and hold signals based on profit calculation, applied to 15 Nasdaq datasets. However, as very few works have been done on this theme, it was hard to evaluate and compare the outcomes with other research works. In future, more focus will be given to improving performance by concatenating a deep learning agent with the DRQN model to boost Q-value prediction, and by building a more efficient ARIMA model with higher accuracy and stable forecasting ability to obtain higher profit in stock trading.
REFERENCES
[1] W. C. Chiang, D. Enke, T. Wu, and R. Wang, "An adaptive stock index
trading decision support system," Expert Systems with Applications,
vol. 59, pp. 195-207, 2016.
[2] D. W. Lu, "Agent inspired trading using recurrent reinforcement
learning and lstm neural networks," arXiv preprint arXiv:.07338, 2017.
[3] D. H. D. Nguyen, L. P. Tran, and V. Nguyen, "Predicting stock prices
using dynamic LSTM models," in International Conference on Applied
Informatics, 2019, pp. 199-212.
[4] S. Selvin, R. Vinayakumar, E. Gopalakrishnan, V. K. Menon, and K.
Soman, "Stock price prediction using LSTM, RNN and CNN-sliding
window model," in 2017 international conference on advances in
computing, communications and informatics (icacci), 2017, pp. 1643-
1647.
[5] T. Skehin, M. Crane, and M. Bezbradica, "Day ahead forecasting of
FAANG stocks using ARIMA, LSTM networks and wavelets," 2018:
CEUR Workshop Proceedings.
[6] S. P. Chatzis, V. Siakoulis, A. Petropoulos, E. Stavroulakis, and N.
Vlachogiannakis, "Forecasting stock market crisis events using deep
and statistical machine learning techniques," Expert systems with
applications, vol. 112, pp. 353-371, 2018.
[7] K. Nakagawa, T. Ito, M. Abe, and K. Izumi, "Deep recurrent factor
model: interpretable non-linear and time-varying multi-factor model,"
arXiv preprint arXiv:.11493, 2019.
[8] X. Ding, Y. Zhang, T. Liu, and J. Duan, "Deep learning for event-
driven stock prediction," in Twenty-fourth international joint
conference on artificial intelligence, 2015.
[9] G. Hu et al., "Deep stock representation learning: From candlestick
charts to investment decisions," in 2018 IEEE international conference
on acoustics, speech and signal processing (ICASSP), 2018, pp. 2706-
2710.
[10] J. Carapuço, R. Neves, and N. Horta, "Reinforcement learning applied
to Forex trading," Applied Soft Computing, vol. 73, pp. 783-794, 2018.
[11] Q. Kang, H. Zhou, and Y. Kang, "An asynchronous advantage actor-
critic reinforcement learning method for stock selection and portfolio
management," in Proceedings of the 2nd International Conference on
Big Data Research, 2018, pp. 141-145.
[12] W. Si, J. Li, P. Ding, and R. Rao, "A multi-objective deep
reinforcement learning approach for stock index future’s intraday
trading," in 2017 10th International symposium on computational
intelligence and design (ISCID), 2017, vol. 2, pp. 431-436.
[13] A. A. Ariyo, A. O. Adewumi, and C. K. Ayo, "Stock price prediction
using the ARIMA model," in 2014 UKSim-AMSS 16th International
Conference on Computer Modelling and Simulation, 2014, pp. 106-
112.
[14] P. Meesad and R. I. Rasel, "Predicting stock market price using support
vector regression," in 2013 International Conference on Informatics,
Electronics and Vision (ICIEV), 2013, pp. 1-6.
[15] P. Somani, S. Talele, and S. Sawant, "Stock market prediction using
hidden Markov model," in 2014 IEEE 7th joint international
information technology and artificial intelligence conference, 2014,
pp. 89-92.
[16] Nasdaq Stock Dataset. Retrieved June 4, 2021, from https://www.nasdaq.com/market-activity/stocks
[17] L. Chen and Q. Gao, "Application of deep reinforcement learning on
automated stock trading," in 2019 IEEE 10th International Conference
on Software Engineering and Service Science (ICSESS), 2019, pp. 29-
33.
[18] M. Hausknecht and P. Stone, "Deep recurrent q-learning for partially
observable mdps," in 2015 aaai fall symposium series, 2015.
[19] C. Ma, J. Zhang, J. Liu, L. Ji, and F. J. N. Gao, "A parallel multi-module
deep reinforcement learning algorithm for stock trading," vol. 449, pp.
290-302, 2021.
[20] O. Sattarov, A. Muminov, C. W. Lee, H. K. Kang, R. Oh, J. Ahn, H. J.
Oh, and H. S. Jeon, “Recommending cryptocurrency trading points
with deep reinforcement learning approach,” Applied Sciences, vol. 10,
no. 4, p. 1506, 2020.
[21] K. Chantona, R. Purba, and A. Halim, "News sentiment analysis in
forex trading using r-cnn on deep recurrent q-network," in 2020 Fifth
International Conference on Informatics and Computing (ICIC), 2020,
pp. 1-7.
[22] C. Y. Huang, "Financial trading as a game: A deep reinforcement
learning approach," arXiv preprint arXiv:.02787, 2018.
[23] G. P. Zhang, “Time series forecasting using a hybrid Arima and neural
network model,” Neurocomputing, vol. 50, pp. 159–175, 2003.
[24] D. Fan, H. Sun, J. Yao, K. Zhang, X. Yan, and Z. Sun, “Well
production forecasting based on ARIMA-LSTM model Considering
manual operations,” Energy, vol. 220, p. 119708, 2021.
[25] S. M. Kamruzzaman, M. I. Pavel, M. A. Hoque, and S. R. Sabuj,
“Promoting greenness with iot-based plant growth system,”
Computational Intelligence and Sustainable Systems, p. 235–253,
2018.
[26] M. S. I. Milon, M. I. Pavel, M. S. Ehsan, S. H. Said & S. R. Sabuj,
“Application of Smart Appliance Using Internet of Things,” In
Innovations in Electronics and Communication Engineering, p. 359-
368, 2020.
Article
Full-text available
In recent years, deep reinforcement learning (DRL) algorithm has been widely used in algorithmic trading. Many fully automated trading systems or strategies have been built using DRL agents, which integrate price prediction and trading signal generation in one system. However, the previous agents extract the current state from the market data without considering the long-term market historical trend when making decisions. Besides, plenty of related and useful information has not been considered. To address these two problems, we propose a novel model named Parallel multi-module deep reinforcement learning (PMMRL) algorithm. Here, two parallel modules are used to extract and encode the feature: one module employing Fully Connected (FC) layers is used to learn the current state from the market data of the traded stock and the fundamental data of the issuing company; another module using Long Short-Term Memory (LSTM) layers aims to detect the long-term historical trend of the market. The proposed model can extract features from the whole environment by the above two modules simultaneously, taking the advantages of both LSTM and FC layers. Extensive experiments on China stock market illustrate that the proposed PMMRL algorithm achieves a higher profit and a lower drawdown than several state-of-the-art algorithms.
Chapter
Full-text available
Nowadays, it is a growing trend for our electrical appliances to be much more automated with the use of sensors and Internet of things (IoT)-based remote control, one particular example being the home juice maker. In this paper, we design a system for home juice maker to have smart features with the use of numerous advanced sensors and Internet connectivity to enable IoT applications. For experimental setup, we propose a Raspberry Pi-3-based smart juice maker which through the use of IoT which is capable of taking commands remotely from a phone application via MySQL servers. In order to the quality, pH and temperature sensor are used to maintain the freshness. The prediction model of ARIMA is implemented to acknowledge the further pH values in different temperature where the best case shows 1.63% MSE, and in the worst case, it gets 12.72% error.
Article
Full-text available
The net profit of investors can rapidly increase if they correctly decide to take one of these three actions: buying, selling, or holding the stocks. The right action is related to massive stock market measurements. Therefore, defining the right action requires specific knowledge from investors. The economy scientists, following their research, have suggested several strategies and indicating factors that serve to find the best option for trading in a stock market. However, several investors’ capital decreased when they tried to trade the basis of the recommendation of these strategies. That means the stock market needs more satisfactory research, which can give more guarantee of success for investors. To address this challenge, we tried to apply one of the machine learning algorithms, which is called deep reinforcement learning (DRL) on the stock market. As a result, we developed an application that observes historical price movements and takes action on real-time prices. We tested our proposal algorithm with three—Bitcoin (BTC), Litecoin (LTC), and Ethereum (ETH)—crypto coins’ historical data. The experiment on Bitcoin via DRL application shows that the investor got 14.4% net profits within one month. Similarly, tests on Litecoin and Ethereum also finished with 74% and 41% profit, respectively.
Chapter
Full-text available
The use of the Internet of Things (IoT) for plant growth and environmental management is a promising new field of research, in which a network of seamlessly connected sensors feeds data aimed at supporting healthier plant growth and a better environment. In this chapter, we present a system in which eight types of sensors are used to measure air and soil quality. Our design utilizes cloud storage to keep the collected sensor data, which is then sorted online to create accurate forecasts on the environment and plants using an autoregressive integrated moving average algorithm. Additionally, the system has been designed with a web interface and data visualization, enabling people to obtain real-time environmental information and make better decisions for plant growth and environmental management. Finally, we highlight the accuracy of the prediction results, which is approximately 99.13%.
Article
Accurate and efficient prediction of well production is essential for extending a well's life cycle and improving reservoir recovery. Traditional models require expensive computational time and various types of formation and fluid data. Besides, frequent manual operations are usually ignored because they are cumbersome to process. In this paper, a novel hybrid model is established that considers the advantages of linearity and nonlinearity, as well as the impact of manual operations, by integrating the autoregressive integrated moving average (ARIMA) model and the long short-term memory (LSTM) model. The ARIMA model filters linear trends in the production time series data and passes the residual value on to the LSTM model. Since manual open-shut operations lead to nonlinear fluctuations, the residual and daily production time series together form the LSTM input. To compare the performance of the hybrid models ARIMA-LSTM and ARIMA-LSTM-DP (Daily Production time series) with the ARIMA, LSTM, and LSTM-DP models, production time series from three actual wells are analyzed. Four indexes, namely root mean squared error (RMSE), mean absolute error (MAE), mean absolute percentage error (MAPE), and similarity (Sim) values, are evaluated to calculate the prediction accuracy. The experimental results indicate that the single ARIMA model performs well on steadily declining production curves, whereas the LSTM model has obvious advantages over the ARIMA model on fluctuating nonlinear data. The coupled models (ARIMA-LSTM, ARIMA-LSTM-DP) exhibit better results than the individual ARIMA, LSTM, or LSTM-DP models, and the ARIMA-LSTM-DP model performs even better when the well production series are affected by frequent manual operations.
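The hybrid's division of labor can be sketched minimally: a fitted straight-line trend stands in for ARIMA's linear component, and the leftover residuals are what would be handed to the LSTM (the function and sample values are illustrative, not taken from the paper).

```python
def trend_and_residuals(series):
    """Split a series into a least-squares linear trend and residuals.
    The trend plays the role of ARIMA's linear component; in the
    hybrid, the residuals are passed on to the nonlinear LSTM stage."""
    n = len(series)
    mean_x = (n - 1) / 2
    mean_y = sum(series) / n
    slope = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(series)) \
        / sum((i - mean_x) ** 2 for i in range(n))
    trend = [mean_y + slope * (i - mean_x) for i in range(n)]
    residuals = [y - t for y, t in zip(series, trend)]
    return trend, residuals

# Hypothetical daily production figures in steady decline
trend, resid = trend_and_residuals([100.0, 98.0, 97.0, 94.0, 93.0])
```

By construction the residuals sum to zero, leaving only the nonlinear fluctuations for the second-stage model to learn.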
Chapter
Predicting stock prices accurately is a key goal of investors in the stock market. Unfortunately, stock prices are constantly changing and affected by many factors, making the process of predicting them a challenging task. This paper describes a method to build models for predicting stock prices using long short-term memory network (LSTM). The LSTM-based model, which we call dynamic LSTM, is initially built and continuously retrained using newly augmented data to predict future stock prices. We evaluate the proposed method using data sets of four stocks. The results show that the proposed method outperforms others in predicting stock prices based on different performance metrics.
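The "dynamic" retraining idea above reduces to a walk-forward loop: refit on the most recent window of data, predict the next point, slide forward. A minimal sketch, with `fit`/`predict` as hypothetical placeholders for the LSTM training and inference calls:

```python
def walk_forward(series, window, fit, predict):
    """Walk-forward evaluation: refit a model on the latest `window`
    points, then predict the next one. Mirrors continuous retraining
    on newly augmented data."""
    preds = []
    for t in range(window, len(series)):
        model = fit(series[t - window:t])
        preds.append(predict(model, series[t - window:t]))
    return preds

# Toy stand-ins: the "model" just memorises the last observed value.
naive_fit = lambda history: history[-1]
naive_predict = lambda model, history: model
out = walk_forward([10, 11, 12, 13], window=2, fit=naive_fit, predict=naive_predict)
# out == [11, 12]
```

Swapping the naive stand-ins for an actual LSTM train/forecast pair yields the retraining scheme the abstract describes.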
Conference Paper
Computational finance has been a classical field that uses computer techniques to handle financial challenges. The most popular domains include financial forecasting and portfolio management, which often involve large datasets with complex relations. Due to the special properties of computational finance problems, machine learning techniques, especially deep learning, are widely used as quantitative analysis tools. In this paper, we apply the state-of-the-art Asynchronous Advantage Actor-Critic algorithm to the portfolio management problem and design a standalone deep reinforcement learning model. In a simulated market environment with practical portfolio constraint settings, the asset value managed by the proposed machine learning model largely outperforms the S&P 500 stock index over the test period.
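One common way an actor-critic portfolio manager turns raw network outputs into valid allocations is a softmax over per-asset scores, which enforces long-only weights summing to 1. A minimal sketch (the scores are made up for illustration; the paper's exact architecture may differ):

```python
import math

def portfolio_weights(scores):
    """Map raw actor scores to long-only portfolio weights via softmax."""
    m = max(scores)                      # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

w = portfolio_weights([1.2, 0.3, -0.5])  # three hypothetical assets
```

Higher-scoring assets receive larger weights, and the simplex constraint (non-negative, sum to one) is satisfied by construction.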
Article
This paper describes a new system for short-term speculation in the foreign exchange market, based on recent reinforcement learning (RL) developments. Neural networks with three hidden layers of ReLU neurons are trained as RL agents under the Q-learning algorithm by a novel simulated market environment framework that consistently induces stable learning which generalizes to out-of-sample data. This framework includes new state and reward signals, as well as a method for more efficient use of available historical tick data that improves training quality and testing accuracy. In the EUR/USD market from 2010 to 2017, the system yielded, over 10 tests with varying initial conditions, an average total profit of 114.0 ± 19.6%, for a yearly average of 16.3 ± 2.8%.
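Reward signals for RL traders of this kind are commonly built from the log return of the held position minus a transaction-cost penalty. A generic sketch of that shape (the cost value and function are illustrative assumptions; the paper's actual reward design differs in its details):

```python
import math

def step_reward(position, price_prev, price_now, cost=1e-4):
    """Per-step reward for a position in {-1, 0, +1}: the position-signed
    log return of the price, minus a proportional transaction-cost term."""
    return position * math.log(price_now / price_prev) - cost * abs(position)
```

A flat position earns nothing, a long position profits from rising prices, and the cost term discourages churning in and out of the market.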