RESEARCH ARTICLE
Adv. Sci. Lett. Vol. 4, No. 2, pp. 400–407, 2016
1936-6612/2011/4/400/008 doi:10.1166/asl.2011.1261
Copyright © 2016 American Scientific Publishers, Advanced Science Letters. All rights reserved.
Printed in the United States of America
Recognition of Sign Language System for
Indonesian Language using Long Short-Term
Memory Neural Networks
Erdefi Rakun1, Aniati M. Arymurthy1, Lim Y. Stefanus1, Alfan F. Wicaksono1, I Wayan W. Wisesa1
1 Fakultas Ilmu Komputer, Universitas Indonesia, Depok 16424, Jawa Barat, Indonesia
SIBI (Sign Language System for Indonesian Language) is the official sign language system for the Indonesian language. This research aims to find a suitable model for performing SIBI-to-text translation of inflectional word gestures. Extant research has been able to translate the alphabet, root words, and numbers from SIBI to text. Inflectional words are root words with prefixes, infixes, suffixes, or some combination of the three. A new method that splits an inflectional word into three feature vector sets was developed. This reduces the number of feature sets used, which would otherwise be as large as the product of the prefix, suffix, and root-word feature sets of the inflectional word gestures. Long Short-Term Memory (LSTM) is used, as this model can take entire sequences as input and does not have to rely on pre-clustered per-frame data. LSTM suits this system well, as SIBI sequence data has long-term temporal dependencies. The 2-layer LSTM performed best, being 95.4% accurate on root words. The same model is 77% accurate on inflectional words, using the combined skeleton-image feature set with an 800-epoch limit. The lower accuracy on inflectional words is due to difficulties in recognizing prefixes and suffixes.
Keywords: Inflectional Words, Long Short-Term Memory, Deep Learning, Kinect, sign language, SIBI.
1. INTRODUCTION
SIBI (Sistem Isyarat Bahasa Indonesia or the Sign
Language System for Indonesian Language) is the official
method of communication for the hearing-impaired in
Indonesia. SIBI is used for communication between the
hearing-impaired, as well as between the speech/hearing-
impaired and those without the impairment1. SIBI is the
Indonesian language, complete with its native syntax,
represented in gestures. There are four types of SIBI
gestures: root word, affix, inflectional and pronoun
gestures. Pronouns are further divided into four
categories: personal, possessive, pointer, and
conjunctive1.
*Email Address: efi@cs.ui.ac.id
Like most other sign languages, SIBI is not trivial to
master, and consequently there is a need for a system to
convert SIBI gestures into a text output. The challenge in
creating a SIBI-to-text translation system is the
recognition of the four different types of gestures in SIBI.
The translation system in progress must eventually be
able to recognize the gestures associated with all the
aforementioned linguistic elements efficiently, quickly,
and accurately.
This research and extant ones have attempted to create
the components that make up a SIBI-to-text translation
system. The first to be created was a system that can do
the translation solely for the alphabet and numbers2. A
system that can translate root words was then created
using the acquired knowledge. Previous work3 and this
research focus on creating a system that can translate
inflectional words. Inflectional words are root words
combined with prefixes, suffixes, infixes, or a mix of
some of the three. These additional morphemes and
inflections add extra information to the root word, as well
as ensuring that the resulting inflectional words serve a
particular grammatical or logical purpose. As an
illustration, adding the prefix "me-", and the suffix "-i" to
the root word "lempar" (to throw), results in the word
"melempari" (to throw at). This is a different result than
when "me-" and "-kan" are added to "lempar," which
results in the inflectional word "melemparkan" (to throw
with). The object of the latter is the thing being thrown,
while the object of the former is the target of the throw.
There are no specific gestures for inflectional words
in SIBI, but the gestures that represent them are
constructed modularly. For instance, there is no single
gesture for “berlompatan” (= to jump about); instead,
three gestures are used: the gesture for the prefix “ber-” +
the gesture for the root word “lompat” (=jump) + the
gesture for the suffix “-an,” as shown in Fig. 1. This
method of gesture concatenation is unique to SIBI4.
The translation system needs to employ a feature
extraction technique for a minimally-sized feature vector
set, yet still be capable of recognizing different types of
gestures5. There are 7 prefixes, 11 suffixes and thousands
of root words in the Indonesian language. To construct an
inflectional word, one can concatenate a maximum of 3
prefixes and 2 suffixes6. The number of possible
combinations is in the thousands; to recognize all these
inflectional words, the SIBI translation system would
need a correspondingly large number of feature vector
sets.
In this research, to reduce the number of required
feature vector sets, the gesture of an inflectional word is
separated into, and recognized by, its components (Fig. 2).
With this component-recognition capability, the
translation system uses only three feature vector sets: one
for prefix gestures, one for root word gestures, and one for
suffix gestures. This is much smaller than generating a
separate feature vector set for each inflectional word in
the Indonesian language. This separation method greatly
reduces the computation time required to interpret
inflectional word gestures and improves the efficiency of
the translation system.
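The scale of this reduction can be illustrated with a quick count. The root-word total below is a placeholder (the paper says only "thousands"), and the whole-word bound deliberately ignores morphological restrictions on which affixes may combine:

```python
# Illustrative count of output classes with and without component
# splitting.  The 3,000 root-word figure is a placeholder, not a
# number from the paper.
PREFIXES, SUFFIXES, ROOTS = 7, 11, 3000

# Whole-word approach: one class per inflectional word.  Upper bound:
# (prefix or none) x (suffix or none) per root, minus the bare roots.
whole_word_classes = ROOTS * (PREFIXES + 1) * (SUFFIXES + 1) - ROOTS

# Component approach: three independent label sets, one per component.
component_classes = PREFIXES + SUFFIXES + ROOTS

print(whole_word_classes, component_classes)
```

Even under this rough count, the component approach needs orders of magnitude fewer output classes than a whole-word classifier would.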
In addition to recognizing inflectional words, the
model developed in this research can be adjusted to
recognize other types of words in SIBI, such as root
words, pronouns, conjunctions, repeated words (for
example, "kemerah-merahan", reddish) and compound
words (for example, "bertanggung jawab", to be
responsible for). This adjustability is the major
contribution of this research in the realm of SIBI
gesture-to-text translation.
This research attempts to use Long Short-Term
Memory (LSTM7) neural networks, due to their ability to
take advantage of the long-term temporal dependencies
between the frames of a SIBI sequence to improve the
model's predictions.
2. RELATED WORKS
Only a few studies in Indonesia have focused on
translating gestures to text. Kurniawan8 and Rakun2
translated the alphabet to text, while Najiburahman9,
Rakun10, and Marcelita11 focused on translating root
words, not complex inflectional words. Najiburahman9 categorized
words in SIBI into words with one, two or three gestures,
while this research uses image and skeleton features to
correctly identify any words expressed in SIBI,
irrespective of the number of gestures that make up the
word.
Much related research has been done outside
Indonesia. Most of these studies attempted to recognize a
particular sign language using Hidden Markov Models
(HMM)12,13,14,15,16,17,18. None of this extant research
is suited to the recognition of SIBI's inflectional words.
In previous research, we divided the inflectional word
gestures into the following subsequences: prefix, root
word, and suffix. These subsequences were then classified
by a heuristic Hidden Markov Model (HMM), with a best
accuracy of 67.77%3. One shortcoming of HMM is that a
K-Means pre-clustered feature set had to be used as the
HMM's observed variable19. This pre-clustering results in
some loss of information from the extracted feature set.
The model's accuracy would likely improve if a model
could be used that accepts features that are not
pre-clustered.
HMM's architecture dictates that each frame be
Fig. 1. Gesture representing the inflectional word
“berlompatan” (= to jump about)4

Fig. 2. The possible components of an inflectional word:
prefix, root word, and suffix gesture frames, separated by
epenthesis frames, within the full sequence of inflectional
word gesture frames.
assigned a label, which is clearly not the ideal way to
classify the sequences. The frame-labeling method gives
one sequence multiple labels, making it difficult to
ascertain the validity of the error figure returned by the
model. HMM classifies each frame individually, a smaller
unit than what the model is actually required to classify,
namely the components of inflectional words, each of
which is represented by a whole sequence in the data.
In the interest of overcoming HMM's apparent
limitations, LSTM neural networks were used in this
research. The unique ability of LSTM is its ability to
decide whether to store, ignore, or forget information.
This stems from its inclusion of a four layer neural
network acting as gating units7,20,21. Because of this
ability, LSTM is considered to be well suited to speech
and handwriting recognition, polyphonic music
modeling22, as well as Chinese word segmentation23.
These tasks bear some similarities to the objective of this
research, as for instance with handwriting recognition, a
small sequence of a pen's strokes is relatively meaningless
unless the model correctly identifies the preceding
sequence of strokes. In short, highly accurate execution of
any of the above would be difficult if their respective
models cannot take advantage of their data's temporal
dependencies.
3. SCOPE OF RESEARCH
The prefixes and suffixes examined in this research
are all the prefixes, suffixes, and prefix + root word +
suffix combinations in the Indonesian language's
grammatical structure6. Inflectional word gestures in this
research are defined as:
• One prefix gesture + root word gesture, or
• Root word gesture + one suffix gesture, or
• One prefix gesture + root word gesture + one suffix
gesture.
SIBI categorizes its gestures into two groups: primary
gestures and supporting gestures. Primary gestures
determine the meaning of the gesture, whereas supporting
gestures give additional meaning to the gesture. An
example of a supporting gesture is a facial expression
added to the gesture, fulfilling a role similar to that of
intonation in speech. This research focuses on primary
gestures performed by both hands as well as the finger
movements of both hands.
4. EXPERIMENTS
4.1. DATASETS
This experiment uses 19 root words and 144
inflectional words, each recorded 10 times, for a total of
1,630 sequences. The gestures were
performed by 2 teachers from the Santi Rama School for
the hearing-impaired, in Jakarta. These are broken down
into subsequences of prefixes, root words, and suffixes.
The training data set consists of 457 root words, 290
prefixes, and 238 suffixes, for a total of 985
subsequences. The testing data set contains 416 root
words, 264 prefixes, and 213 suffixes, for a total of 893
subsequences.
4.2. FEATURES
This study uses Microsoft Kinect to record the SIBI
gestures. The output from the depth sensor and the
skeleton tracking output from the Kinect are used to
obtain the experimental features. The movement of the
tracked skeleton is computed from the angles formed
between pairs of joints in the skeleton, shown in Fig. 3.
The shoulder-center joint is used as the origin, and the
angles of interest are those formed between the origin and
the elbow, and between the origin and the hand. Each of
these directions is described by two angles, θ and φ,
which together specify the orientation of the joint relative
to the origin. The sequences of these angles represent the
movement of the combined arm-and-hand shape across
the frames.
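As a sketch of how such angle features might be computed: the paper's exact angle equations and axis conventions are not reproduced here, so the azimuth/elevation convention below is an assumption, as are the joint names in the frame dictionary.

```python
import numpy as np

def joint_angles(origin, joint):
    """Two angles describing a joint's direction relative to the origin.

    Assumes a spherical-coordinate convention: theta is the azimuth in
    the x-z plane, phi the elevation toward the y axis.  The paper's own
    equations may differ.
    """
    v = np.asarray(joint, dtype=float) - np.asarray(origin, dtype=float)
    x, y, z = v
    theta = np.arctan2(z, x)             # azimuth in the horizontal plane
    phi = np.arctan2(y, np.hypot(x, z))  # elevation above that plane
    return theta, phi

def skeleton_features(frame):
    """Per-frame feature vector: angle pairs from the shoulder-center
    origin to both elbows and both hands (joint names are illustrative)."""
    origin = frame["shoulder_center"]
    feats = []
    for name in ("elbow_left", "elbow_right", "hand_left", "hand_right"):
        feats.extend(joint_angles(origin, frame[name]))
    return np.array(feats)
```

Concatenating these per-frame vectors over time yields the skeleton (SKL) sequence fed to the model.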
We used MATLAB's regionprops24 function to
extract the image-based features. This function measures
the region properties in an image and returns them as a
structured array. Each frame's depth image is transformed
into a binary depth matrix, which is then passed as the
parameter to the regionprops function.
The returned array from this function may contain
more than one object, based on the object(s) or region(s)
found within the image. The largest region is assumed to
be the hand-blob area. This area is then selected as the
extraction region. The chosen features from the extraction
region are: area, centroid, major and minor axes,
orientation, and normalized convex hull. These five
features sufficiently describe the hand's position and
shape. The area helps the model recognize different
hand shapes; the major and minor axes help
define the perimeter of the hand blob. The orientation
feature will define the orientation of the hand in each
frame; the convex hull represents the smallest convex
Fig. 3. Joints from skeleton tracking generated by Kinect
polygon that envelops the extraction region. A normalized
convex hull is an alternate representation of the convex
hull, where the coordinates of the convex hull's vertices
are described relative to the extraction region's centroid,
as opposed to a static origin.
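A rough, self-contained sketch of the moment-based subset of these properties (area, centroid, orientation, axis lengths) follows. It assumes the binary mask already isolates the largest (hand) blob; the actual experiments used MATLAB's regionprops, which also performs the region labeling.

```python
import numpy as np

def blob_properties(mask):
    """Moment-based region properties of a binary mask, mirroring a
    subset of MATLAB's regionprops.  Assumes the mask already isolates
    the hand blob (no connected-component selection here)."""
    rows, cols = np.nonzero(mask)
    area = rows.size
    cy, cx = rows.mean(), cols.mean()  # centroid (row, col)
    # Central second moments = covariance of the pixel coordinates.
    mu_rr = ((rows - cy) ** 2).mean()
    mu_cc = ((cols - cx) ** 2).mean()
    mu_rc = ((rows - cy) * (cols - cx)).mean()
    # Orientation of the major axis from the covariance terms.
    orientation = 0.5 * np.arctan2(2 * mu_rc, mu_cc - mu_rr)
    # Axis lengths: 4 * sqrt(eigenvalues of the covariance matrix),
    # as in regionprops' MajorAxisLength / MinorAxisLength.
    common = np.sqrt(4 * mu_rc ** 2 + (mu_cc - mu_rr) ** 2)
    major = 4 * np.sqrt((mu_cc + mu_rr + common) / 2)
    minor = 4 * np.sqrt((mu_cc + mu_rr - common) / 2)
    return {"area": area, "centroid": (cy, cx),
            "orientation": orientation,
            "major_axis": major, "minor_axis": minor}
```

The normalized convex hull would be computed separately by subtracting the centroid from each hull vertex.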
4.3. OUTLINE OF EXPERIMENTS
The whole experiment is outlined in Fig. 4. Our inputs
consist of skeleton tracking data and frame video (depth
image) data captured by Kinect. The feature extraction
process yields skeleton point angle, depth image
properties, and combined skeleton-image feature data.
The raw data contains epentheses, which are
transitional gestures. They carry no information by
themselves, serving merely as links between the other
gestures. During preprocessing, the epentheses are cut out
of the data, so that what remains are only the gestures of
each inflectional word's components: the prefix, the root
word, and the suffix.
The average sequence length of the preprocessed data
set is then calculated, which turns out to be 32 frames.
All sequences are then homogenized to that length. The
entire data set is then split into training and testing data.
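The length homogenization step might look like the following sketch. The padding strategy (repeating the last frame) is an assumption, since the paper does not specify how shorter sequences are extended.

```python
import numpy as np

def homogenize(sequences, target_len=32):
    """Pad or truncate every (frames, features) sequence to target_len
    frames so all LSTM inputs share one length.  Shorter sequences are
    extended by repeating their last frame (an assumed strategy)."""
    out = []
    for seq in sequences:
        seq = np.asarray(seq, dtype=float)
        if len(seq) >= target_len:
            out.append(seq[:target_len])          # truncate long sequences
        else:
            pad = np.repeat(seq[-1:], target_len - len(seq), axis=0)
            out.append(np.concatenate([seq, pad], axis=0))
    return np.stack(out)                          # (n_sequences, target_len, n_features)
```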
The skeleton, image and combined data sets are then
fed into the LSTM model. The parameters measured for
evaluation purposes are the accuracy on both the training
and testing data sets, as well as the time required to
perform both training and testing. The experiments use an
LSTM implementation written in Python 2.7 with the
Keras and Theano libraries, run on an i7-equipped PC.
4.4. MODELS
Fig. 5 shows the 1-layer and 2-layer LSTM architectures
used in this experiment. In this LSTM implementation,
one label is assigned to one sequence, as opposed to
labeling each frame. xn denotes the features present in the
nth frame. A more detailed visualization of the internals of
each LSTM block is shown in Fig. 6.
Using a sigmoid function σ, the Forget gate discards
irrelevant information from the previous cell state Ct-1.
The Input gate layer, using a sigmoid function σ, and a
separate tanh layer then cooperate to calculate the new
value of the current cell state Ct. The fourth layer
determines the value of the output ht. The sigmoid
function σ in this final layer determines which part of Ct
will be carried through into ht, and the tanh function in
this final layer ensures that the value of ht lies between
-1 and 1. Using both of these functions, ht contains only
the relevant portion of Ct21.
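The four-gate computation described above can be written as a single minimal numpy step. This is an illustrative sketch of the standard LSTM equations, not the Keras implementation the experiments actually used.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM time step with the four gates described above.

    W has shape (4*hidden, n_input + hidden); b has shape (4*hidden,).
    The four row-blocks of W hold the forget, input, candidate, and
    output gate weights, in that order (an assumed layout)."""
    hidden = h_prev.size
    z = W @ np.concatenate([x, h_prev]) + b
    f = sigmoid(z[0 * hidden:1 * hidden])   # forget gate: drop from C_{t-1}
    i = sigmoid(z[1 * hidden:2 * hidden])   # input gate: admit new info
    g = np.tanh(z[2 * hidden:3 * hidden])   # candidate cell state
    o = sigmoid(z[3 * hidden:4 * hidden])   # output gate: select from C_t
    c = f * c_prev + i * g                  # new cell state C_t
    h = o * np.tanh(c)                      # new hidden state h_t, in (-1, 1)
    return h, c
```

Running this step over all 32 frames and feeding the final h into a softmax classifier corresponds to the one-label-per-sequence setup above.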
5. RESULTS
The independent variables here are the epoch limits
(denoted as nE, in hundreds of epochs), the feature type
(skeleton / SKL, image / IMG, combined / CMB), and
the model type (1 and 2-layer LSTM, denoted as L1 and
L2). The dependent variables are the training and testing
time, and the accuracy of the prediction relative to the
testing data. Table 1 shows each model's accuracy
variation with changing feature types and increasing
epoch limits, using the testing data set.
nE      CMB-L1  CMB-L2  SKL-L1  SKL-L2  IMG-L1  IMG-L2
100     0.559   0.677   0.594   0.624   0.479   0.677
200     0.560   0.736   0.637   0.686   0.533   0.710
300     0.601   0.727   0.653   0.709   0.501   0.703
400     0.604   0.721   0.646   0.730   0.498   0.709
500     0.577   0.753   0.646   0.723   0.516   0.736
600     0.587   0.766   0.665   0.709   0.504   0.718
700     0.591   0.739   0.661   0.712   0.503   0.714
800     0.580   0.770   0.675   0.747   0.494   0.709
900     0.615   0.747   0.673   0.705   0.508   0.705
1000    0.598   0.759   0.676   0.703   0.493   0.727
From these results, it can be concluded that the model
is most accurate when using the combined feature data
Fig. 4. Experiment flow
Fig. 5. Block diagram of (left) 1-layer LSTM and (right) 2-
layer LSTM
Fig. 6. Four interacting layers of LSTM21
Table 1. Inflectional word prediction accuracy for the
different types of features
set, as opposed to the individual feature data sets. This
was achieved with an 800-epoch limit for the 2-layer
LSTM, with an accuracy of 77%.
The time required by the 2-layer LSTM for training or
testing is nearly twice as much as what is required for the
1-layer LSTM. However, this increase is offset by the
gain in accuracy when using 2 layers. Using the combined
feature set and an 800-epoch limit, the 2-layer LSTM is
77% accurate, whereas the 1-layer LSTM is only 58%
accurate.
6. ANALYSIS
The results of the prediction are summarized in the
confusion matrix shown in Fig. 7. This matrix shows that
the errors occur mainly with prefixes and suffixes.
Consequently, further experiments are done to shed more
light on this issue.
6.1. ANALYSIS FROM DATA AND FEATURES
POINT OF VIEW
In order to understand the experimental results better,
the LSTM is then run on the individual groups (suffixes,
root words, and prefixes). An additional process was
implemented to separate both training and testing data
into three different groups. This also changes the average
sequence length, depending on the group to which a
sequence belongs. The resulting average lengths for
prefixes, root words, and suffixes are 30, 37, and 22
frames, respectively.
The best result for each group of data can be seen in
Table 2. From this table, it can be concluded that the
model is again most accurate when
Fig. 7. Confusion matrix for the prediction results on the testing data

Fig. 8. Confusion matrix for the prefixes

Fig. 9. Confusion matrix for the suffixes
using the combined feature data set. This was achieved
with the 2-layer LSTM, with an accuracy of 95.4% when
tested on root words only. The lowest accuracies were
attained for prefixes and suffixes, whose confusion
matrices are shown in Fig. 8 and Fig. 9, respectively.
This happens mainly because of the way prefixes are
expressed in SIBI. The expression of a prefix in SIBI is
done by making the right hand express the first letter of
the prefix, with the left palm facing to the right, and
moving both hands to meet in the middle. The issue is
that the left hand orientation is the same for every prefix,
which reduces the uniqueness of a prefix's features. This
is particularly true with the prefix group (me-, te-, and se-),
as seen in Fig. 10.
The suffixes also suffer from a general lack of feature
uniqueness. In SIBI, after the root word has been
expressed, the suffix is then expressed by making the
right hand express the first letter of the suffix. The right
hand will then move from the final position of the root
word's gesture (in front of the chest) towards the right hip
in a downward arc. The first letters of suffixes have very
similar xy-plane projections to begin with, and the
downward arc is similar in every suffix. The left hand
remains stationary in all suffix gestures. As a result, the
skeleton data for suffixes has a large degree of
commonality, and the model has to rely solely on the
right hand's image data. This problem proves especially
difficult among the suffixes –kah, –kan and –lah, as
shown in Fig. 11.
7. CONCLUSIONS AND FUTURE WORKS
Using LSTM results in higher accuracy, from 66.7%
with HMM to 77.4% with the 2-layer LSTM. The
remaining errors are mostly due to the misidentification
of prefixes and suffixes.
The aforementioned errors occur due to the lack of
feature uniqueness in the gestures of the error-causing
prefixes and suffixes.
Experiments using skeleton, image, and combined
feature sets reveal that the model performs best with the
combined features. This is because the relevant
movements of the arm and hand, as well as the finger
shapes, are best captured by the combined feature set.
When tasked with identifying only the root words, the
2-layer LSTM is 95.4% accurate. This means that the 2-
layer LSTM works well if the gestures to be identified are
sufficiently unique relative to each other. It is also worth
noting that root word sequences are 37 frames long on
average, longer than the prefix or suffix sequences.
Further investigation is needed into how LSTM's
accuracy varies with sequence length.
The time required for training and testing a 2-layer
LSTM is double that of 1-layer LSTM, but the 2-layer
LSTM is more accurate. Using the combined feature set
and an 800-epoch limit, the 2-layer LSTM is 77%
accurate, whereas the 1-layer LSTM is only 58% accurate.
To improve the performance of this model, a better
image processing technique is needed, one that captures
the shapes of the fingers more faithfully, in order to
resolve the current problem of insufficient uniqueness in
the prefix and suffix features.
ACKNOWLEDGMENTS
This work is supported by SINAS 2015 Research
Grant #RT-2015-0547, from The Ministry of Research,
Technology and Higher Education of Indonesia. This
support is gratefully received and acknowledged. The
author also wishes to thank M. I. Mas for the photographs
and final proofreading.
REFERENCES
[1] S. Siswomartono, Cara mudah belajar SIBI (Sistem Isyarat
Bahasa Indonesia), Yayasan Santi Rama, 2007.
[2] E. Rakun, M. Febrian Rachmadi, A. Tjandra, and K. Danniswara,
Spectral domain cross correlation function and generalized
Learning Vector Quantization for recognizing and classifying
Indonesian Sign Language, In IEEE International Conference on
Advanced Computer Science and Information Systems
(ICACSIS), pages 213–218, Depok, 2012.
[3] E. Rakun, M. I. Fanany, I.W. W. Wisesa, and A. Tjandra, A
heuristic Hidden Markov Model to recognize inflectional words
in sign system for Indonesian language known as SIBI (Sistem
Isyarat Bahasa Indonesia), In IEEE International Conference on
Technology, Informatics, Management, Engineering &
Environment (TIME-E), pages 53–58, Samosir, 2015.
[4] Kamus Sistem Isyarat Bahasa Indonesia, Departemen
Pendidikan Nasional, 2001.
Fig. 11. L-R: gestures for the suffixes -kah, -kan, and -lah
Fig. 10. L-R: gestures for the prefixes me-, te-, and se-
Table 2. Best prediction result for prefix, root word, and
suffix group

Group         Avg. Sequence Length (frames)   nE    Accuracy
Inflectional  32                              800   0.770
Root          37                              900   0.954
Prefix        30                              700   0.667
Suffix        22                              800   0.690
[5] S. Kausar and M. Y. Javed, A Survey on Sign Language
Recognition, In IEEE Frontiers of Information Technology (FIT),
pages 95–98, Islamabad, 2011.
[6] M. Adriani, J. Asian, B. Nazief, S. M. M. Tahaghoghi, and H. E.
Williams, Stemming Indonesian: A confix-stripping approach,
ACM Transactions on Asian Language Information Processing
(TALIP), 6(4):1–33, 2007.
[7] S. Hochreiter and J. Schmidhuber, Long Short-Term Memory,
Neural computation, 9(8):1735–1780, 1997.
[8] W. Kurniawan and A. Harjoko, Pengenalan Bahasa Isyarat
dengan Metode Segmentasi Warna Kulit dan Center of Gravity,
Indonesian Journal of Electronics and Instrumentation Systems
(IJEIS), 1(2):67–78, 2011.
[9] M. Najiburahman, Simulasi dan Analisis Sistem Penerjemah
Bahasa SIBI Menjadi Bahasa Indonesia Menggunakan Metode
Klasifikasi Hidden Markov Model, Bachelor’s thesis, Universitas
Telkom, 2015.
[10] E. Rakun, M. Andriani, I. W. Wiprayoga, K. Danniswara, and A.
Tjandra, Combining depth image and skeleton data from Kinect
for recognizing words in the sign system for Indonesian language
(SIBI [Sistem Isyarat Bahasa Indonesia]), In IEEE International
Conference on Advanced Computer Science and Information
Systems (ICACSIS), pages 387–392, Bali, 2013.
[11] F. Marcelita, Pengenalan Bahasa Isyarat dari Video
Menggunakan Ciri Geometris, K-Means, dan Hidden Markov
Model, Bachelor’s thesis, Universitas Telkom, 2008.
[12] T. Starner and A. Pentland, Real-Time American Sign Language
Recognition from Video Using Hidden Markov Models, In IEEE
International Symposium on Computer Vision, pages 265–270,
Coral Gables, FL, 1995.
[13] K. Grobel and M. Assan, Isolated sign language Recognition
using hidden Markov models, In IEEE Systems, Man, and
Cybernetics, 1997. Computational Cybernetics and Simulation,
pages 162–167, Orlando, FL, 1997.
[14] T. Matsuo, Y. Shirai, and N. Shimada, Automatic generation of
HMM topology for sign language recognition, In IEEE 19th
International Conference on Pattern Recognition (ICPR), pages
1–4, Tampa, FL, 2008.
[15] M. Maebatake, I. Suzuki, M. Nishida, Y. Horiuchi, and S.
Kuroiwa, Sign Language Recognition Based on Position and
Movement Using Multi-Stream HMM, In IEEE Proceedings of
the 2nd International Symposium on Universal Communication
(ISUC), pages 478–481, Osaka, 2008.
[16] S. Theodorakis, A. Katsamanis, and P. Maragos, Product-HMMs
for automatic sign language recognition, In IEEE International
Conference on Acoustics, Speech and Signal Processing
(ICASSP), pages 1601–1604, Taipei, 2009.
[17] C. Vogler and D. Metaxas, Parallel hidden Markov models for
American sign language recognition, In Proceedings of the
Seventh IEEE International Conference on Computer Vision,
volume 1, pages 116 –122, Kerkyra, 1999.
[18] M. Jebali, P. Dalle, and M. Jemni, Hmm-based method to
overcome spatiotemporal sign language recognition issues, In
IEEE International Conference on Electrical Engineering and
Software Applications (ICEESA), pages 1–6, Hammamet, 2013.
[19] L.R. Rabiner. A tutorial on Hidden Markov Models and selected
applications in speech recognition. Proceedings of the IEEE,
77(2):257–286, 1989.
[20] K. Cho, B. van Merrienboer, and D. Bahdanau, On the Properties
of Neural Machine Translation: Encoder Decoder Approaches, In
Proceedings of SSST-8, Eighth Workshop on Syntax, Semantics
and Structure in Statistical Translation, pages 103–111, Doha,
Qatar, 2014.
[21] C. Olah, Understanding lstm networks,
http://colah.github.io/posts/2015-08-Understanding-LSTMs/,
Last accessed April 6, 2015.
[22] K. Greff, R. K. Srivastava, J. Koutnik, B. R. Steunebrink, and J.
Schmidhuber, LSTM: A search space odyssey, arXiv preprint
arXiv:1503.04069, pages 1–10, 2015.
[23] X. Chen, X. Qiu, C. Zhu, P. Liu, and X. Huang, Long Short-
Term Memory Neural Networks for Chinese Word Segmentation,
In Proceedings of the 2015 Conference on Empirical Methods in
Natural Language Processing, September, pages 1197–1206,
Lisbon, Portugal, 2015.
[24] MATLAB, version 7.10.0 (R2010a), Natick, Massachusetts,
2010.
Received: May 10, 2016. Accepted: -