Decision Making Process of Stock Trading Implementing DRQN and ARIMA

Monirul Islam Pavel
Center for Artificial Intelligence Technology, Faculty of Information Science & Technology, Universiti Kebangsaan Malaysia, Selangor, Malaysia
Email: P104619@siswa.ukm.edu.my
orcid.org/0000-0001-9470-7725

Dewan Ahmed Muhtasim
Center for Artificial Intelligence Technology, Faculty of Information Science & Technology, Universiti Kebangsaan Malaysia, Selangor, Malaysia
Email: dewanmuhtasim@gmail.com

Omar Faruk
Faculty of Information Science & Technology, Universiti Kebangsaan Malaysia, Selangor, Malaysia
Email: ga03843@siswa.ukm.edu.my
Abstract—The approach of collecting realistic trading signals throughout the transaction process to broaden advantages is a long-studied issue. The rapid expansion and dynamic character of stock markets is a major issue for the financial sector, as conventional trading tactics developed by experienced financial professionals do not generate sufficient performance under all market situations. Most previous studies have applied Machine Learning and Deep Learning to price estimation, yet few have addressed stock trading decisions. To solve this difficulty, adaptive stock trading strategies based on deep reinforcement learning are suggested. This study presents the implementation of Deep Recurrent Q-Learning (DRQN) and the Autoregressive Integrated Moving Average (ARIMA) model for stock trading: the predicted closing value of a stock supports strategic decision-making by acknowledging the risk of buying, holding, and selling, together with profit calculation. The method was applied to 15 Nasdaq stock datasets and overcomes limitations of recently developed reinforcement learning methods. The proposed fusion of the DRQN- and ARIMA-based strategy displays robust results that support better stock trading decisions, with visualized experimental outcomes.
Keywords— reinforcement learning, DRQN, stock data, stock
market, LSTM, ARIMA, stock trading.
I. INTRODUCTION
To maximise profit, stock should ideally be obtained at a
low cost and disposed of at a high cost. The greater the
difference between the price of sales and the price of
purchases, the greater the return from trading. As a result, the
goal of regular stock traders is to profit from short-term events
by purchasing stock at a lower price and selling it at a higher
value. Traders' primary task is to identify when to purchase
and sell shares [1].
The ever-expanding character of machine learning research is
becoming a sanctuary for exploration in increasingly
sophisticated applications such as stock trading. This is
referred to as a decision-making process, and it assists traders
in achieving certain return-on-investment performance
indicators such as profit, economic utility, and risk-adjusted return, among others. Nonetheless, research in this
field has focused on the advancement of machine learning and
artificial intelligence over the years to cope with the volatile
behaviour of stock markets. The accuracy of the anticipated price or trend determines the prediction's efficacy: higher accuracy corresponds to a smaller difference between the forecasted and actual values. Traditional statistical learning algorithms are
limited in their capacity to cope with the non-linearity inherent
in stock markets, therefore supervised and unsupervised
learning methods are employed [2]. However, these algorithms often rely specifically on accuracy in price prediction. The primary purpose of forecasting future prices is to determine whether or not it is a suitable price at which to buy or sell the stock. Nonetheless, the basic reinforcement learning methodology is incapable of capturing time series data, while stock history data is precisely a lengthy sequential time series. This challenge is resolved using DRQN with LSTM, which is also beneficial for tackling sequence prediction issues due to its capacity to capture foreknowledge, and which merges Deep Learning's observation with reinforcement learning's decision-making. Moreover, in the proposed work, stock closing values are forecast for the next 100 days and buy and sell signals are generated, which enhances the chance of profitable trading while reducing risks. Profit is also assessed in order to evaluate risk.
The remainder of the paper is organised as follows: Section II covers similar studies that have used deep learning, machine learning, and reinforcement learning techniques; Section III describes the proposed methodology and the models implemented; Section IV presents the findings and their analysis; and Section V summarises the research's overall objective.
II. RELATED WORKS
Multiple deep learning, reinforcement learning, and machine learning methods have been studied previously, mostly for stock price prediction.
A. Deep Learning Based Approaches
Nguyen, D.H.D. et al. implemented an LSTM model that used the average of the preceding five days' stock market information (opening price, closing price, high, low, volume, close) as its input value. This value was used to generate the initial estimate. To calculate a mean of the stock price data for the following five days, the ARIMA technique was utilised to incorporate the forecast into the average stock price information. Furthermore, the researchers used technical analysis indicators to determine whether to purchase, retain, or sell stocks [3].
Selvin, S. et al. proposed a combined LSTM, RNN, and CNN sliding-window approach for short-term stock price prediction [4]. They did not fit the data to a particular model, but rather detected the underlying dynamics of the data with deep learning frameworks.
However, the model uses the information provided at a specific moment for prediction. Although the other two models are utilised in many other time-dependent data analyses, the CNN architecture does not carry information across time, which suits the rapid fluctuations in stock markets.
2021 IEEE Madras Section Conference (MASCON) | 978-1-6654-0405-1/21/$31.00 ©2021 IEEE | DOI: 10.1109/MASCON51689.2021.9563476
Authorized licensed use limited to: Universiti Kebangsaan Malaysia. Downloaded on October 20,2021 at 12:38:20 UTC from IEEE Xplore. Restrictions apply.
For each series, Skehin, T. et al. suggested a linear Autoregressive Integrated Moving Average (ARIMA) model and an LSTM network to forecast the next day. Wavelet techniques decomposed each series to approximate and describe its behaviour in detail across time. These approaches were integrated in a novel ensemble model to enhance predictive accuracy [5].
Chatzis, S. P. et al. developed a DNN model using boosting methods to forecast bouts of stock market crises. According to their study, forecasting stock market crisis events was useful for price prediction, although the research was not tied to particular prediction techniques [6].
Nakagawa, K. et al. suggested a deep-factor model together with a shallow DNN model, implying that the link between stock returns on the financial market and factors is nonlinear rather than linear. Other machine learning techniques such as SVR and random forests were also compared with the deep learning model, and the shallow model obtained the highest precision among the machine learning approaches [7].
Ding, X. presented a deep learning technique for event-driven stock market prediction. First, events were retrieved from news texts and represented as dense vectors by a novel neural tensor network (NTN). Secondly, a CNN was utilised to estimate the effects on stock prices in both the near term and the long run [8].
Hu, G. et al. presented a convolutional autoencoder model that transforms 4-channel stock time series (daily lowest, highest, opening, and closing prices) into candlestick charts in order to acquire a stock representation using a synthesis approach [9]. This solution effectively avoids costly annotation. In terms of total return, the suggested model exceeded the FTSE 100 index and many well-known funds.
B. Reinforcement Learning Based Approaches
Carapuço, J. et al. developed a Q-learning based reinforcement learning (RL) network model; three hidden layers of ReLU neurons were trained as agents under the Q-learning algorithm in a novel market-based simulation environment. The approach reliably induced stable and generalised learning on out-of-sample data [10].
In order to resolve the portfolio management problem and create a standalone deep reinforcement learning model, Kang, Q. et al. implemented the state-of-the-art Asynchronous Advantage Actor-Critic (A3C) algorithm [11].
Si, W. et al. developed a reinforcement learning model with multiple objectives and LSTM agents. Feature learning was found to contribute to improved performance. The LSTM network made consistent choices and could change positions in time, reducing transaction costs and drawing good profit from the multi-objective structure within an acceptable risk [12].
Further, a deep reinforcement learning approach for suggesting cryptocurrency trading points was proposed by Sattarov, O. et al. [20] to avoid capital decrease. The approach was used to develop an application that analyses historical price movements and acts on real-time prices, tested on Bitcoin, Ethereum, and Litecoin data. Their method obtained 74%, 41%, and 14.4% profit on Litecoin, Ethereum, and Bitcoin, respectively.
C. Machine Learning Based Approaches
In [13], the authors developed a stock price prediction model utilising the ARIMA model. Datasets were collected from the NSE (Nigeria Stock Exchange) and the NYSE (New York Stock Exchange) for stock price forecasting, with the Eviews version 5 programme as the implementation tool. The criteria used to identify the optimal ARIMA model for each stock index included the Schwarz (Bayesian) information criterion, a very low standard error of regression, and a relatively high adjusted R². The two datasets utilised were the Nokia stock index and the Zenith Bank index, for which the best-fitting models were ARIMA(2,1,0) and ARIMA(1,0,1), respectively, each yielding a reduced Bayesian information criterion and a substantially smaller standard error of regression. The results therefore show that ARIMA models have the capacity to forecast short-term stock prices.
An experimental study was done using SVR analysis as a machine learning approach for predicting stock prices and stock market trends [14]. The authors employed several types of window operators, including the flatten window and the basic rectangular window, to feed more trustworthy entries into the regression models, transforming time series data into generic data. The test was carried out utilising the Dhaka Stock Exchange (DSE), which contained historical data from 2009 to 2014. The results demonstrate that good stock price prediction outcomes were obtained by SVR models developed using the flatten window and rectangular window operators.
In [15], a survey on stock price prediction using neural networks, Support Vector Machines (SVM), and Hidden Markov Models (HMM) was presented. In that article, the HMM was proposed as the prediction approach, and the existing methodologies were compared. Training was carried out using the Baum-Welch algorithm. The approach was tested on three distinct equities: SBI, IDBI, and ICICI. The results revealed that the suggested model was more precise than existing approaches.
III. METHODOLOGY
A. Nasdaq Dataset
To assess the DRL agents in a realistic situation, past daily Nasdaq stock trading data are utilized as training and testing sets, collected in CSV format from Nasdaq's website [16]. Since there are more than 7000 stocks in the US market, it would be time-consuming to use all of them for evaluation. Therefore, a few Nasdaq stocks are sampled as the evaluation dataset: BV (BrightView Holdings, Inc), CG (Carlyle Group Inc), CAAS (China Automotive Systems, Inc), CAC (Camden National Corporation), CAKE (Cheesecake Factory Inc), BEBE (bebe stores, inc.), BCS (Barclays PLC), AGIO (Agios Pharmaceuticals Inc), BANX (StoneCastle Financial Corp), BCRX (BioCryst Pharmaceuticals, Inc.), DGX (Quest Diagnostics Inc), GOOG (Alphabet Inc), DAL (Delta Air Lines, Inc), FB (Facebook, Inc), and ABBV (AbbVie Inc), containing values up to 2021-06-30.
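For illustration, the closing-price column of one of these CSV files can be loaded with a short helper. The column name `Close/Last`, the leading `$` on prices, and the newest-first row order are assumptions about Nasdaq's CSV export format rather than details stated in this paper; they should be adjusted to match the actual files.

```python
import csv

def load_close_prices(path, close_column="Close/Last"):
    """Read a Nasdaq-style CSV export and return closing prices in
    chronological order (oldest first)."""
    closes = []
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            # Nasdaq exports quote prices like "$135.20"; strip the sign.
            closes.append(float(row[close_column].strip().lstrip("$")))
    # Exports list the most recent trading day first; reverse the order.
    return closes[::-1]
```

The resulting list can then be split into training and testing segments before being fed to the agents.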
B. DRQN
Deep Q-Network (DQN) has two fundamental components: experience replay and a Q-target network. Deep Q-learning is Q-learning with the Q-table substituted by a deep neural network. This method solves the following issue: if the state space and action space are continuous or indefinitely discrete, the Q-values in a Q-table cannot be iterated with finite samples. Since it has been demonstrated that a 3-layer neural network can represent any function [17], the fundamental concept of deep Q-learning is to utilise a neural network to approximate the Q-function.
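Before approximating it with a network, the underlying Q-learning update can be stated in tabular form. The sketch below is a generic illustration (the learning-rate and discount values are arbitrary choices, not parameters reported in this paper); DQN and DRQN replace the table `Q` with a neural network approximator.

```python
def q_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.99):
    """One tabular Q-learning step:
    Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    target = r + gamma * max(Q[s_next])    # bootstrapped return estimate
    Q[s][a] += alpha * (target - Q[s][a])  # move Q(s,a) toward the target
    return Q
```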
In the DQN framework, it is essential to assume that the state is a complete observation of the environment at every step. However, variables that exist in the temporal dimension of a stock trading environment generally cannot be captured effectively under the DQN framework. To tackle this issue, the deep recurrent Q-network (DRQN) is applied [18]. In the DRQN architecture, the hidden state of the LSTM units carries information from the preceding steps. Through temporal interrelationships, the LSTM may uncover concealed data and maintain essential features of prior states [19]. Implementing LSTM in the proposed DQN framework gives the specified state a closer look at the trading environment.
The characteristics of DQN keep the agent stable during training, which is why DQN techniques are used instead of tabular Q-learning methods. More particularly, a recurrent layer is added at the input to collect sequential state data in DRQN [21, 22]. During DRQN implementation, the following policies are adopted: (a) the agent may only execute one action per step, so a stock cannot be purchased and sold simultaneously; (b) a sell action is only valid if the agent holds at least one unit; (c) the agent may only purchase or sell one unit at a time; (d) the agent is given starting capital of $10000 for trading stocks at the beginning of each episode.
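Policies (a)-(d) can be sketched as a minimal trading environment. The class layout and the convention that an invalid or unaffordable action falls back to a hold are illustrative assumptions, not the authors' implementation.

```python
class TradingEnv:
    """Single-stock environment enforcing the DRQN trading policies:
    one action per step, sell only when holding, one unit per trade,
    and a fixed starting capital."""

    def __init__(self, prices, starting_capital=10000.0):
        self.prices = prices             # chronological closing prices
        self.capital = starting_capital  # policy (d): $10000 to start
        self.units = 0                   # stock units currently held
        self.t = 0                       # current time step

    def step(self, action):
        """Apply one action (0=hold, 1=buy, 2=sell) and advance one day."""
        price = self.prices[self.t]
        if action == 1 and self.capital >= price:  # buy one unit, policy (c)
            self.units += 1
            self.capital -= price
        elif action == 2 and self.units > 0:       # sell needs a held unit, policy (b)
            self.units -= 1
            self.capital += price
        # policy (a): exactly one action per step; any other case
        # (e.g. selling with no units held) degenerates to a hold.
        self.t += 1
        return self.capital, self.t >= len(self.prices)
```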
Fig. 1. Architecture for proposed DRQN model.
C. Implementation of DRQN
To implement DRQN, the problem is divided into three parts: state, action, and reward, as shown in Fig. 1, treating it as a Markov Decision Process (MDP) [22]. The state space denotes the observation of the environment, which is the stock price at any timestamp. The action space, which is what an agent can do in each state, comprises three trading signals: the values 0, 1, and 2 represent hold, buy, and sell, respectively. Action-based observation of the environment results in the calculation of profit or loss. The starting capital for trading is set for every agent, and the profit produced by the agent depends on the balance held at the conclusion of the transaction. If a purchase is made, the balance is updated so that profit can be calculated as the difference between the current amount and the money invested; in the case of selling, the profit is realised. Considering $\Delta B$ as the current balance and $\Delta C$ as the starting capital, the equation is given as

$\text{Profit} = \Delta B - \Delta C$ (1)

The starting capital given to the agent at the beginning is assumed to be $10000; the current balance is the money held by the agent at the cut-off point.
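A minimal sketch of the profit bookkeeping around Eq. (1). Reporting profit as a percentage of the starting capital is an assumption inferred from the "Profit (%)" column of Table 1, not a convention stated explicitly in the paper.

```python
def profit(current_balance, starting_capital=10000.0):
    """Absolute profit: the balance held at the cut-off point minus the
    starting capital given to the agent."""
    return current_balance - starting_capital

def profit_percent(current_balance, starting_capital=10000.0):
    """Profit expressed relative to the starting capital."""
    return 100.0 * (current_balance - starting_capital) / starting_capital
```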
D. Forecasting Using ARIMA
Time series forecasting is an interdisciplinary technique used to address stock price estimation issues. It is adaptable, since just historical observations of the relevant variables are required. Stock price index data are typically gathered in time order. Although treated as time series data, they have strong nonlinear features and time differences. In time series analysis, ARIMA is highly versatile, combining the benefits of time series and regression methods [23, 24].
The nonseasonal algorithm can be categorized as ARIMA(p, d, q), where p refers to the number of autoregressive (AR) terms, d refers to the number of nonseasonal differences (the integrated, I, part), and q refers to the number of moving average (MA) terms [25, 26].
Now, to apply ARIMA, $y_1, y_2, \ldots, y_t$ is taken as the time series of the stock trading problem, and the model is constructed by the following steps.
First, the nonstationary time series is stabilized by differencing; the integrated (I) term of the ARIMA model removes the effects of nonstationarity. With $d = 1$, the differenced series $z_t$ is

$z_t = y_t - y_{t-1}$ (2)

In general, $z_t$ denotes the $d$-th difference of the series,

$z_t = (1 - B)^d y_t$ (3)

where $B$ is the backshift operator. Combining the AR and MA parts on the differenced series gives

$\hat{z}_t = \beta_0 + \beta_1 z_{t-1} + \cdots + \beta_p z_{t-p} + \varepsilon_t - \theta_1 \varepsilon_{t-1} - \cdots - \theta_q \varepsilon_{t-q}$ (4)

Here, (4) is the general equation of ARIMA, where $\beta$ denotes the slope coefficients, $\theta$ the moving average parameters, and $\varepsilon$ the error term.
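The integrated (I) step can be illustrated with plain first-order differencing and its inverse. This is a generic sketch of the differencing operation, not the authors' code, and the helper names are arbitrary.

```python
def difference(series, d=1):
    """Apply first-order differencing d times (the 'I' in ARIMA):
    z_t = y_t - y_{t-1}, repeated to remove nonstationary trends."""
    for _ in range(d):
        series = [series[i] - series[i - 1] for i in range(1, len(series))]
    return series

def undifference(diffed, first_value):
    """Invert one round of differencing given the first original value,
    recovering y_t = y_{t-1} + z_t (maps forecasts back to price level)."""
    out = [first_value]
    for z in diffed:
        out.append(out[-1] + z)
    return out
```

The AR and MA parts are then fitted on the differenced (stationary) series, and the forecasts are undifferenced back to the price scale.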
IV. RESULTS AND ANALYSIS
To evaluate the outcomes, ARIMA-based time series prediction first estimates the next 100 days of stock closing prices. In addition to every agent's ultimate profit, the capital acquired or lost over the test period is also evaluated. At first, the capital acquired is negative, since the agent may only spend capital on purchase or hold actions. Capital drops further on subsequent days if the agent executes purchase activities rather than sell actions. Near the end of the test period, the majority of the gain reversed to a positive value as the agent profited from units being sold at a higher closing price. Fig. 2 illustrates the agents' buy and sell signals over the forecasted set: the daily close price is plotted, with coloured markers denoting purchase and sale points.
A blue circle represents a purchase action taken by the agent at that location, whereas a red downward triangle indicates a sell action taken by the agent at that point. Moreover, Table 1 shows the calculated profit obtained from the DRQN rewards on each dataset, where GOOG and CAKE showed the highest profit outcomes of 7.11% and 6.02% based on selling, and DGX and DAL showed the lowest profit outcomes of -5.28% and -3.51%. The table also lists the rewards and costs obtained from DRQN with 200 iterations, a window size of 30, and a batch size of 64. Moreover, mean square error (MSE) and mean absolute error (MAE) are computed to validate the ARIMA forecasting results.
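The plotted purchase and sale markers follow directly from the agent's action sequence; a small helper of this kind (the names are assumed, not taken from the paper) extracts the marker positions.

```python
def signal_points(actions):
    """Split a sequence of agent actions (0=hold, 1=buy, 2=sell) into the
    index lists used to mark buy and sell points on the price curve."""
    buys = [t for t, a in enumerate(actions) if a == 1]
    sells = [t for t, a in enumerate(actions) if a == 2]
    return buys, sells
```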
Considering $y_i$ as the actual value and $\hat{y}_i$ as the predicted value over $n$ test points, the equations of MSE and MAE are

$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n} (y_i - \hat{y}_i)^2$ (5)

$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n} \lvert y_i - \hat{y}_i \rvert$ (6)
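Equations (5) and (6) translate directly into two small helpers; this is a generic sketch with assumed argument names.

```python
def mse(actual, predicted):
    """Mean squared error, Eq. (5): average of the squared residuals."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def mae(actual, predicted):
    """Mean absolute error, Eq. (6): average of the absolute residuals."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)
```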
For the prediction using ARIMA, the lowest MSE was 0.046 for BANX and the highest was 3.67 for DGX. In addition, the minimum MAE of 1.1% was achieved by CG, while BCRX scored 20.77%, the most unstable result compared with the other 14 datasets.
Fig. 2. Outcomes of DRQN showing buy and sell signals for (a) BV, (b) CG, (c) CAAS, (d) CAC, (e) CAKE, (f) BEBE, (g) BCS, (h) AGIO, (i) BANX, (j) BCRX, (k) DGX, (l) GOOG, (m) DAL, (n) FB, (o) ABBV.
Table 1. Outcomes and validation for 15 datasets.

Dataset | DRQN Profit (%) | DRQN Reward | DRQN Cost | ARIMA MSE | ARIMA MAE (%)
BV | -0.529 | 11.08 | 0.034 | 0.315 | 2.7
CG | 0.097 | 42.06 | 0.243 | 0.073 | 1.1
CAAS | -0.115 | -2.62 | 0.008 | 2.5 | 9.7
CAC | 1.25 | 49.6 | 5.003 | 0.071 | 1.4
CAKE | 6.02 | 28.56 | 5.87 | 0.194 | 2.47
BEBE | 0.022 | 0.048 | 0.033 | 0.229 | 2.6
BCS | -0.034 | 4.91 | 0.051 | 0.191 | 2.5
AGIO | 4.56 | 23.62 | 1.385 | 0.093 | 1.3
BANX | 0.029 | 5.34 | 0.0187 | 0.046 | 3.33
BCRX | 0.4 | 19.26 | 0.19 | 0.45 | 20.77
DGX | -5.28 | 35.12 | 2.75 | 3.67 | 5.07
GOOG | 7.11 | 1335 | 547 | 1.68 | 10.7
DAL | -3.51 | 51.41 | 1.2 | 1 | 8.71
FB | 0.32 | 91.51 | 33.16 | 3.2 | 9.68
ABBV | 0.16 | 49.47 | 2.38 | 2.23 | 4.11
V. CONCLUSION
From the perspective of stock trading, a valuable part is knowing when to buy, sell, or hold a stock, and knowing the estimated risk by calculating profit. Although there has been vast improvement in artificial intelligence based analysis of stock data, most approaches are limited to predicting future stock prices only, which is not as effective given the uncertain nature of stock records. In this work, these issues are overcome by combining a deep recurrent Q-network and ARIMA for predicting closing values and generating buy, sell, and hold signals based on profit calculation, applied to 15 Nasdaq datasets. However, as very few works have been done on this theme, it was hard to evaluate and compare the outcomes with other research works. In future, more focus will be given to improving performance by concatenating a deep learning agent with the DRQN model to boost Q-value prediction, and by building a more efficient ARIMA model with higher accuracy and stable forecasting ability to obtain higher profit in stock trading.
REFERENCES
[1] W. C. Chiang, D. Enke, T. Wu, and R. Wang, "An adaptive stock index
trading decision support system," Expert Systems with Applications,
vol. 59, pp. 195-207, 2016.
[2] D. W. Lu, "Agent inspired trading using recurrent reinforcement
learning and lstm neural networks," arXiv preprint arXiv:.07338, 2017.
[3] D. H. D. Nguyen, L. P. Tran, and V. Nguyen, "Predicting stock prices
using dynamic LSTM models," in International Conference on Applied
Informatics, 2019, pp. 199-212.
[4] S. Selvin, R. Vinayakumar, E. Gopalakrishnan, V. K. Menon, and K.
Soman, "Stock price prediction using LSTM, RNN and CNN-sliding
window model," in 2017 international conference on advances in
computing, communications and informatics (icacci), 2017, pp. 1643-
1647.
[5] T. Skehin, M. Crane, and M. Bezbradica, "Day ahead forecasting of
FAANG stocks using ARIMA, LSTM networks and wavelets," 2018:
CEUR Workshop Proceedings.
[6] S. P. Chatzis, V. Siakoulis, A. Petropoulos, E. Stavroulakis, and N.
Vlachogiannakis, "Forecasting stock market crisis events using deep
and statistical machine learning techniques," Expert systems with
applications, vol. 112, pp. 353-371, 2018.
[7] K. Nakagawa, T. Ito, M. Abe, and K. Izumi, "Deep recurrent factor
model: interpretable non-linear and time-varying multi-factor model,"
arXiv preprint arXiv:.11493, 2019.
[8] X. Ding, Y. Zhang, T. Liu, and J. Duan, "Deep learning for event-
driven stock prediction," in Twenty-fourth international joint
conference on artificial intelligence, 2015.
[9] G. Hu et al., "Deep stock representation learning: From candlestick
charts to investment decisions," in 2018 IEEE international conference
on acoustics, speech and signal processing (ICASSP), 2018, pp. 2706-
2710.
[10] J. Carapuço, R. Neves, and N. Horta, "Reinforcement learning applied
to Forex trading," Applied Soft Computing, vol. 73, pp. 783-794, 2018.
[11] Q. Kang, H. Zhou, and Y. Kang, "An asynchronous advantage actor-
critic reinforcement learning method for stock selection and portfolio
management," in Proceedings of the 2nd International Conference on
Big Data Research, 2018, pp. 141-145.
[12] W. Si, J. Li, P. Ding, and R. Rao, "A multi-objective deep
reinforcement learning approach for stock index future’s intraday
trading," in 2017 10th International symposium on computational
intelligence and design (ISCID), 2017, vol. 2, pp. 431-436.
[13] A. A. Ariyo, A. O. Adewumi, and C. K. Ayo, "Stock price prediction
using the ARIMA model," in 2014 UKSim-AMSS 16th International
Conference on Computer Modelling and Simulation, 2014, pp. 106-
112.
[14] P. Meesad and R. I. Rasel, "Predicting stock market price using support
vector regression," in 2013 International Conference on Informatics,
Electronics and Vision (ICIEV), 2013, pp. 1-6.
[15] P. Somani, S. Talele, and S. Sawant, "Stock market prediction using
hidden Markov model," in 2014 IEEE 7th joint international
information technology and artificial intelligence conference, 2014,
pp. 89-92.
[16] Nasdaq Stock Dataset. Retrieved June 4, 2021, from https://www.nasdaq.com/market-activity/stocks
[17] L. Chen and Q. Gao, "Application of deep reinforcement learning on
automated stock trading," in 2019 IEEE 10th International Conference
on Software Engineering and Service Science (ICSESS), 2019, pp. 29-
33.
[18] M. Hausknecht and P. Stone, "Deep recurrent q-learning for partially
observable mdps," in 2015 aaai fall symposium series, 2015.
[19] C. Ma, J. Zhang, J. Liu, L. Ji, and F. J. N. Gao, "A parallel multi-module
deep reinforcement learning algorithm for stock trading," vol. 449, pp.
290-302, 2021.
[20] O. Sattarov, A. Muminov, C. W. Lee, H. K. Kang, R. Oh, J. Ahn, H. J.
Oh, and H. S. Jeon, “Recommending cryptocurrency trading points
with deep reinforcement learning approach,” Applied Sciences, vol. 10,
no. 4, p. 1506, 2020.
[21] K. Chantona, R. Purba, and A. Halim, "News sentiment analysis in
forex trading using r-cnn on deep recurrent q-network," in 2020 Fifth
International Conference on Informatics and Computing (ICIC), 2020,
pp. 1-7.
[22] C. Y. Huang, "Financial trading as a game: A deep reinforcement
learning approach," arXiv preprint arXiv:.02787, 2018.
[23] G. P. Zhang, “Time series forecasting using a hybrid Arima and neural
network model,” Neurocomputing, vol. 50, pp. 159–175, 2003.
[24] D. Fan, H. Sun, J. Yao, K. Zhang, X. Yan, and Z. Sun, “Well
production forecasting based on ARIMA-LSTM model Considering
manual operations,” Energy, vol. 220, p. 119708, 2021.
[25] S. M. Kamruzzaman, M. I. Pavel, M. A. Hoque, and S. R. Sabuj,
“Promoting greenness with iot-based plant growth system,”
Computational Intelligence and Sustainable Systems, p. 235–253,
2018.
[26] M. S. I. Milon, M. I. Pavel, M. S. Ehsan, S. H. Said & S. R. Sabuj,
“Application of Smart Appliance Using Internet of Things,” In
Innovations in Electronics and Communication Engineering, p. 359-
368, 2020.
Article
Full-text available
In recent years, deep reinforcement learning (DRL) algorithm has been widely used in algorithmic trading. Many fully automated trading systems or strategies have been built using DRL agents, which integrate price prediction and trading signal generation in one system. However, the previous agents extract the current state from the market data without considering the long-term market historical trend when making decisions. Besides, plenty of related and useful information has not been considered. To address these two problems, we propose a novel model named Parallel multi-module deep reinforcement learning (PMMRL) algorithm. Here, two parallel modules are used to extract and encode the feature: one module employing Fully Connected (FC) layers is used to learn the current state from the market data of the traded stock and the fundamental data of the issuing company; another module using Long Short-Term Memory (LSTM) layers aims to detect the long-term historical trend of the market. The proposed model can extract features from the whole environment by the above two modules simultaneously, taking the advantages of both LSTM and FC layers. Extensive experiments on China stock market illustrate that the proposed PMMRL algorithm achieves a higher profit and a lower drawdown than several state-of-the-art algorithms.
Chapter
Full-text available
Nowadays, it is a growing trend for our electrical appliances to be much more automated with the use of sensors and Internet of things (IoT)-based remote control, one particular example being the home juice maker. In this paper, we design a system for home juice maker to have smart features with the use of numerous advanced sensors and Internet connectivity to enable IoT applications. For experimental setup, we propose a Raspberry Pi-3-based smart juice maker which through the use of IoT which is capable of taking commands remotely from a phone application via MySQL servers. In order to the quality, pH and temperature sensor are used to maintain the freshness. The prediction model of ARIMA is implemented to acknowledge the further pH values in different temperature where the best case shows 1.63% MSE, and in the worst case, it gets 12.72% error.
Article
Full-text available
The net profit of investors can rapidly increase if they correctly decide to take one of these three actions: buying, selling, or holding the stocks. The right action is related to massive stock market measurements. Therefore, defining the right action requires specific knowledge from investors. The economy scientists, following their research, have suggested several strategies and indicating factors that serve to find the best option for trading in a stock market. However, several investors’ capital decreased when they tried to trade the basis of the recommendation of these strategies. That means the stock market needs more satisfactory research, which can give more guarantee of success for investors. To address this challenge, we tried to apply one of the machine learning algorithms, which is called deep reinforcement learning (DRL) on the stock market. As a result, we developed an application that observes historical price movements and takes action on real-time prices. We tested our proposal algorithm with three—Bitcoin (BTC), Litecoin (LTC), and Ethereum (ETH)—crypto coins’ historical data. The experiment on Bitcoin via DRL application shows that the investor got 14.4% net profits within one month. Similarly, tests on Litecoin and Ethereum also finished with 74% and 41% profit, respectively.
Chapter
Full-text available
The use of the Internet of Things (IoT) for plant growth and environmental management is a promising new field of research, in which a network of seamlessly connected sensors feeds data aimed at supporting healthier plant growth and a better environment. In this chapter, we present a system in which eight types of sensors are used to measure air and soil quality. Our design utilizes cloud storage to keep the collected sensor data, which is then sorted online to create accurate forecasts on the environment and plants using an autoregressive integrated moving average algorithm. Additionally, the system has been designed with a web interface and data visualization, enabling people to obtain real-time environmental information and make better decisions for plant growth and environmental management. Finally, we highlight the accuracy of the prediction results, which is approximately 99.13%.
Article
Accurate and efficient prediction of well production is essential for extending a well's life cycle and improving reservoir recovery. Traditional models require expensive computational time and various types of formation and fluid data. Besides, frequent manual operations are usually ignored because they are cumbersome to process. In this paper, a novel hybrid model is established that considers the advantages of linearity and nonlinearity, as well as the impact of manual operations, by integrating the autoregressive integrated moving average (ARIMA) model and the long short-term memory (LSTM) model. The ARIMA model filters linear trends in the production time series data and passes the residual value on to the LSTM model. Since manual open-shut operations lead to nonlinear fluctuations, the residual and daily production time series together form the LSTM input. To compare the performance of the hybrid models ARIMA-LSTM and ARIMA-LSTM-DP (Daily Production time series) with the ARIMA, LSTM, and LSTM-DP models, production time series from three actual wells are analyzed. Four indexes, namely root mean squared error (RMSE), mean absolute error (MAE), mean absolute percentage error (MAPE), and similarity (Sim) values, are evaluated to calculate the prediction accuracy. The experimental results indicate that the single ARIMA model performs well on steadily declining production curves, whereas the LSTM model has obvious advantages over the ARIMA model on fluctuating nonlinear data. The coupled models (ARIMA-LSTM, ARIMA-LSTM-DP) exhibit better results than the individual ARIMA, LSTM, or LSTM-DP models, and the ARIMA-LSTM-DP model performs even better when the well production series are affected by frequent manual operations.
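The hybrid's division of labor can be sketched minimally: a fitted straight-line trend stands in for ARIMA's linear component, and the leftover residuals are what would be handed to the LSTM (the function and sample values are illustrative, not taken from the paper).

```python
def trend_and_residuals(series):
    """Split a series into a least-squares linear trend and residuals.
    The trend plays the role of ARIMA's linear component; in the
    hybrid, the residuals are passed on to the nonlinear LSTM stage."""
    n = len(series)
    mean_x = (n - 1) / 2
    mean_y = sum(series) / n
    slope = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(series)) \
        / sum((i - mean_x) ** 2 for i in range(n))
    trend = [mean_y + slope * (i - mean_x) for i in range(n)]
    residuals = [y - t for y, t in zip(series, trend)]
    return trend, residuals

# Hypothetical daily production figures in steady decline
trend, resid = trend_and_residuals([100.0, 98.0, 97.0, 94.0, 93.0])
```

By construction the residuals sum to zero, leaving only the nonlinear fluctuations for the second-stage model to learn.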
Chapter
Predicting stock prices accurately is a key goal of investors in the stock market. Unfortunately, stock prices are constantly changing and affected by many factors, making the process of predicting them a challenging task. This paper describes a method to build models for predicting stock prices using long short-term memory network (LSTM). The LSTM-based model, which we call dynamic LSTM, is initially built and continuously retrained using newly augmented data to predict future stock prices. We evaluate the proposed method using data sets of four stocks. The results show that the proposed method outperforms others in predicting stock prices based on different performance metrics.
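The "dynamic" retraining idea above reduces to a walk-forward loop: refit on the most recent window of data, predict the next point, slide forward. A minimal sketch, with `fit`/`predict` as hypothetical placeholders for the LSTM training and inference calls:

```python
def walk_forward(series, window, fit, predict):
    """Walk-forward evaluation: refit a model on the latest `window`
    points, then predict the next one. Mirrors continuous retraining
    on newly augmented data."""
    preds = []
    for t in range(window, len(series)):
        model = fit(series[t - window:t])
        preds.append(predict(model, series[t - window:t]))
    return preds

# Toy stand-ins: the "model" just memorises the last observed value.
naive_fit = lambda history: history[-1]
naive_predict = lambda model, history: model
out = walk_forward([10, 11, 12, 13], window=2, fit=naive_fit, predict=naive_predict)
# out == [11, 12]
```

Swapping the naive stand-ins for an actual LSTM train/forecast pair yields the retraining scheme the abstract describes.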
Conference Paper
Computational finance has been a classical field that uses computer techniques to handle financial challenges. The most popular domains include financial forecasting and portfolio management, which often involve large datasets with complex relations. Due to the special properties of computational finance problems, machine learning techniques, especially deep learning, are widely used as quantitative analysis tools. In this paper, we apply the state-of-the-art Asynchronous Advantage Actor-Critic algorithm to the portfolio management problem and design a standalone deep reinforcement learning model. In a simulated market environment with practical portfolio constraint settings, the asset value managed by the proposed machine learning model largely outperforms the S&P 500 stock index over the test period.
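One common way an actor-critic portfolio manager turns raw network outputs into valid allocations is a softmax over per-asset scores, which enforces long-only weights summing to 1. A minimal sketch (the scores are made up for illustration; the paper's exact architecture may differ):

```python
import math

def portfolio_weights(scores):
    """Map raw actor scores to long-only portfolio weights via softmax."""
    m = max(scores)                      # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

w = portfolio_weights([1.2, 0.3, -0.5])  # three hypothetical assets
```

Higher-scoring assets receive larger weights, and the simplex constraint (non-negative, sum to one) is satisfied by construction.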
Article
This paper describes a new system for short-term speculation in the foreign exchange market, based on recent reinforcement learning (RL) developments. Neural networks with three hidden layers of ReLU neurons are trained as RL agents under the Q-learning algorithm by a novel simulated market environment framework that consistently induces stable learning which generalizes to out-of-sample data. This framework includes new state and reward signals, as well as a method for more efficient use of available historical tick data that improves training quality and testing accuracy. In the EUR/USD market from 2010 to 2017, the system yielded, over 10 tests with varying initial conditions, an average total profit of 114.0 ± 19.6%, for a yearly average of 16.3 ± 2.8%.
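Reward signals for RL traders of this kind are commonly built from the log return of the held position minus a transaction-cost penalty. A generic sketch of that shape (the cost value and function are illustrative assumptions; the paper's actual reward design differs in its details):

```python
import math

def step_reward(position, price_prev, price_now, cost=1e-4):
    """Per-step reward for a position in {-1, 0, +1}: the position-signed
    log return of the price, minus a proportional transaction-cost term."""
    return position * math.log(price_now / price_prev) - cost * abs(position)
```

A flat position earns nothing, a long position profits from rising prices, and the cost term discourages churning in and out of the market.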