Human-Machine Interaction Sensing Technology
Based on Hand Gesture Recognition: A Review
Lin Guo, Zongxing Lu, and Ligang Yao
Abstract—Human machine interaction (HMI) is an interactive way of exchanging information between humans and machines. By collecting the information a person conveys to express an intention, and then transforming and processing that information, the machine can work according to the person's intention. However, traditional HMI devices, including the mouse and keyboard, usually require a fixed operating space, which limits people's actions and cannot directly reflect their intentions; they also require people to learn systematically how to operate them skillfully, which indirectly affects work efficiency. Hand gestures, as one of the important ways for humans to convey information and express intuitive intention, have the advantages of high distinguishability, strong flexibility, and high efficiency of information transmission, which makes hand gesture recognition (HGR) one of the research hotspots in the field of HMI. To enable readers to systematically and quickly understand the research status of HGR and grasp its basic problems and development directions, this article takes the sensing methods used by HGR technology as the entry point and provides a detailed elaboration and systematic summary with reference to a large number of recent research achievements.
Index Terms—Hand gesture recognition (HGR), human machine
interaction (HMI), information acquisition, sensors.
I. INTRODUCTION
ENABLING machines to work according to human intentions has been a goal of human beings since the emergence of machines. In the early days of machine development, people used buttons, joysticks, etc., to control a machine's electrical circuits, oil circuits, and even mechanical transmission in order to convey orders to the machine. In modern times, with the emergence of computers, the HMI environment has become more and more friendly: people can pass information to the machine through the mouse and keyboard and monitor its work through the display. In recent years, the size of computers has been constantly decreasing, and the working efficiency of machines has also improved significantly.
Manuscript received August 16, 2020; revised February 15, 2021; accepted
April 24, 2021. This work was supported in part by the National Natural
Science Foundation of China under Grant 61801122, in part by the Natural
Science Foundation of Fujian Province under Grant 2018J01762, and in part by
the Science Project of Fujian Education Department under Grant JK2017002.
This article was recommended by Associate Editor V. Fuccella. (Corresponding
authors: Zongxing Lu; Ligang Yao.)
The authors are with the School of Mechanical Engineering
and Automation, Fuzhou University, Fuzhou, Fujian 350116, China
(e-mail: guolin_fzu@sina.com; luzongxing@fzu.edu.cn; ylgyao@fzu.edu.cn).
Color versions of one or more figures in this article are available at https://doi.org/10.1109/THMS.2021.3086003.
Digital Object Identifier 10.1109/THMS.2021.3086003
This means that some simple input signals can control the machine to programmatically and autonomously complete complex tasks. On the basis of such technology, researchers have developed a variety of human-computer interaction technologies, such as voice control, brain-computer interaction, gesture recognition, etc.
Hand gesture, as one of the important ways for humans to convey information and express intuitive intention, has a variety of advantages such as strong intuition, high flexibility
and rich meaning. Therefore, hand gesture recognition (HGR)
is widely used and developed as a new HMI technology, and
has shown great potential in many fields such as novel control
mode and virtual reality, but it still faces huge challenges in
the process of being put into use due to its low accuracy and
lack of portability. The accuracy of HGR is closely related
to the sensors, acquisition methods, and algorithms. As the
hardware of HGR, sensing technology plays an important role
in the process of HGR. According to different kinds of sensing
technologies, HGR can be divided into different forms, which
mainly include data glove, vision, and various wearable devices
based on surface electromyography (sEMG) signals, ultrasonic
signals, etc. This article mainly reviews the sensing technology
and gesture acquisition process used in HGR.
In this article, Section II introduces the various realization methods, sensing technologies, and acquisition methods of HGR; Section III compares and analyzes the various sensing methods; Section IV analyzes the application directions of HGR; and Section V describes the future research directions and challenges of HGR technology.
II. SENSING TECHNOLOGY FOR GESTURE RECOGNITION
A. Data Glove
Data glove is a common form of HGR. A data glove can be defined as a system composed of an array of sensors, electronics for data acquisition/processing and power supply, and a support for the sensors that can be worn on the user's hand [1]. In a data glove, the sensors or sensor array are fixed on a special glove by stitching or glue; they transform the motion information of the hand into electrical signals according to the characteristics of the different sensors and then transmit them to the data processing module. After filtering, noise reduction, and other processes, the data are transferred to the computer, where the HGR process is completed by means of machine learning. Fig. 1 shows the detailed process of HGR using most sensing methods.
Obviously, the information conversion of sensors plays a key
role in this process, and the different features of various sensors
will inevitably affect the function of data gloves.
Fig. 1. Process of gesture recognition using most sensing methods.
Fig. 2. Data gloves. (a) Data glove based on inertial sensor. (b) Data glove based on bend sensor.
Currently, the
sensors used in data gloves are mainly inertial sensors, magnetic
sensors and bending sensors. Fig. 2(a) and (b) show data gloves
based on inertial sensor and data gloves based on bend sensor
respectively [2], [3].
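To make the pipeline of Fig. 1 concrete, the sketch below strings the stages together for a hypothetical five-channel bend-sensor glove: the raw readings are low-pass filtered, reduced to a small feature vector, and classified with a support vector machine. All data, channel counts, and gesture labels here are invented for illustration and are not taken from the gloves cited in this section.

```python
# Minimal sketch of the Fig. 1 pipeline for a hypothetical bend-sensor glove.
# Sensor values, window length, and labels are assumptions for illustration.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

def moving_average(window, width=5):
    """Simple low-pass step (the 'filtering, noise reduction' stage)."""
    kernel = np.ones(width) / width
    return np.apply_along_axis(lambda ch: np.convolve(ch, kernel, mode="same"), 0, window)

def extract_features(window):
    """Per-channel mean and peak-to-peak range of one filtered window."""
    return np.concatenate([window.mean(axis=0), np.ptp(window, axis=0)])

rng = np.random.default_rng(0)
n_samples, window_len, n_channels, n_gestures = 200, 50, 5, 4
X, y = [], []
for _ in range(n_samples):
    label = rng.integers(n_gestures)
    # Hypothetical raw glove data: each gesture biases the bend channels differently.
    raw = rng.normal(loc=label * 0.5, scale=0.2, size=(window_len, n_channels))
    X.append(extract_features(moving_average(raw)))
    y.append(label)

X_train, X_test, y_train, y_test = train_test_split(np.array(X), np.array(y), random_state=0)
clf = SVC(kernel="rbf").fit(X_train, y_train)
print("Held-out accuracy on synthetic data:", clf.score(X_test, y_test))
```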
Inertial sensor is a kind of sensor that detects and measures
acceleration, tilt, impact, vibration, rotation, and multidegree of
freedom movement. Due to its intuitive motion detection and
low cost, it is widely used in data gloves. Fang [2] et al. designed a novel data glove for gesture capturing and recognition based on inertial and magnetic measurement units, which are made up of three-axis gyroscopes, three-axis accelerometers, and three-axis magnetometers. They obtained basic data by collecting 3-D motion information of the arm, palm, and fingers, and used the extreme learning machine (ELM) method to train and test the data; the accuracy of static recognition reached 89.59% and that of dynamic gesture recognition reached 82.5%, and the system was applied to the operation of a robotic arm-hand [4]. Galka [5] et al.
collected data on the movements of finger, wrist, and arm during
sign language through an accelerometer glove, then trained and tested them with a parallel hidden Markov model. The results showed that the reference accuracy [(RA), which in this article is defined as the recognition accuracy of hand gestures, namely the percentage of correctly recognized gestures out of the total number of gestures in the prediction experiment; because different authors design their datasets differently, the reference value of this parameter also differs, an issue we discuss in Section III] was 99.75%. Nayan [6] et al. designed a data glove system containing ten angle sensors and carried out experiments to identify Indian Sign Language and American Sign Language letters, obtaining an average RA of 96.7% using ten-fold cross validation.
In the current stage, the RA of the data gloves based on inertial
sensors has reached a high level, but the sensors are relatively
bulky, which affects the user experience.
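To illustrate how the inertial measurements mentioned above are often turned into joint or segment angles, the sketch below fuses hypothetical accelerometer and gyroscope readings with a simple complementary filter. The data, sampling period, and blending factor are assumptions chosen only for the example; the gloves cited above use fuller IMMU fusion that also includes magnetometers.

```python
# Complementary-filter sketch: blend gyroscope integration with the tilt
# implied by gravity in the accelerometer. All readings are hypothetical.
import math

def complementary_filter(accel, gyro, dt=0.01, alpha=0.98):
    """accel: list of (ax, az) pairs in g; gyro: pitch rates in rad/s."""
    angle, history = 0.0, []
    for (ax, az), rate in zip(accel, gyro):
        accel_angle = math.atan2(ax, az)          # tilt implied by gravity direction
        angle = alpha * (angle + rate * dt) + (1 - alpha) * accel_angle
        history.append(angle)
    return history

# Hypothetical readings for a finger segment slowly flexing to about 45 degrees.
steps = 100
gyro = [math.radians(45) / (steps * 0.01)] * steps   # constant pitch rate over 1 s
accel = [(math.sin(math.radians(45) * i / steps),
          math.cos(math.radians(45) * i / steps)) for i in range(steps)]
angles = complementary_filter(accel, gyro)
print(f"Estimated final tilt: {math.degrees(angles[-1]):.1f} degrees")
```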
In recent years, with more research on the properties of new
materials, the use of bending sensors in data gloves has be-
come widespread. Compared with inertial sensors, bending sensors are lighter, fit gloves better, and offer a better user experience. Shen [7] et al. presented a soft bending sensor and evaluated its application in data gloves. Huang [3] et al. manufactured a data glove by sewing reduced graphene oxide-coated fiber, prepared by a simple method, onto a textile glove; it recognizes hand gestures by monitoring the motion of the ten finger joints of one hand, and the RA reached 98.5%. Wu [8] et al. presented a pair of fiber-based self-powered noncontact smart gloves with the unique function of recognizing a wide range of gestures without contact between the fingertips and the palm. Compared with traditional inertial-sensing data gloves, such fiber-sensor data gloves are more portable and offer improved air permeability and comfort. Accurate measurement of thumb carpometacarpal (CMC) joint movement remains challenging due to crosstalk between the multisensor outputs required to measure its degrees of freedom; Dong [9] et al. estimated the optimal sensor locations by the least squares method, which minimized the difference between the true CMC-joint angles and the joint angle estimates. Additionally, some new sensors have been applied to HGR data gloves in recent years: Chiu [10] et al. proposed a self-powered gesture sensing system that detects the output signals of triboelectric nanogenerators arranged on the back of the hand and successfully established a set of rules for converting gestures into English letters. The hand-joint motion features used by data gloves are among those most closely related to hand gestures, and their high accuracy in static gesture recognition is noteworthy. In the future, data gloves should become more lightweight and portable.
B. Vision
HGR based on vision is a relatively mature technology, which
uses cameras to capture videos of the scene containing gestures
and then uses computer algorithms to identify, extract, and clas-
sify the gesture features in the images. Most of the complete hand
interactive mechanisms that act as a building block for vision
based HGR system are comprised of three fundamental phases:
detection, tracking, and recognition [11]. In the detection stage,
besides using the camera to collect the image containing the
gesture, there is also an important step, that is, the segmentation
of the gesture and the background. The segmentation is based on
features extracted from the hand, such as skin color, shape, hand
movement, etc. Sun [12] et al. achieved gesture segmentation and recognition by building a skin color model; Indra [13] et al. proposed a method for recognizing Indonesian Sign Language letters based on hand-shape features; and Lin [14] et al. proposed a gesture recognition method based on histogram of oriented gradients (HOG) features, capturing gesture trajectory information. Tracking gives users the possibility of real-time dynamic gesture recognition. In this stage, the computer needs to continuously track the correspondence between
Fig. 3. HGR based on vision. (a) Kinect. (b) Leap Motion.
the features of each frame of the hand through the algorithm.
Finally, the computer realizes the recognition and classification
of gestures through machine learning algorithms such as support
vector machine (SVM) and hidden Markov model (HMM).
Pramod [15] et al. reviewed a large number of methods based
on visual gesture recognition and the comparison of algorithms.
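As a rough illustration of the detection stage described above, the sketch below segments skin-colored pixels in HSV space and computes a HOG descriptor that could then be fed to an SVM or HMM classifier. The threshold values, image file name, and window size are assumptions chosen only for this example, not parameters from the cited works.

```python
# Skin-color segmentation plus HOG features: a sketch of the detection stage.
import cv2
import numpy as np

def segment_hand(bgr_frame):
    """Return a binary mask of skin-colored pixels (rough hand/background segmentation)."""
    hsv = cv2.cvtColor(bgr_frame, cv2.COLOR_BGR2HSV)
    lower, upper = np.array([0, 40, 60]), np.array([25, 180, 255])  # assumed skin range
    mask = cv2.inRange(hsv, lower, upper)
    # Morphological opening to suppress small noise blobs.
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
    return cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)

def hog_features(mask):
    """HOG descriptor of the segmented region (computed on the binary mask for simplicity)."""
    resized = cv2.resize(mask, (64, 128))           # default HOG window size
    return cv2.HOGDescriptor().compute(resized).ravel()

frame = cv2.imread("gesture_frame.jpg")             # hypothetical input frame
if frame is not None:
    features = hog_features(segment_hand(frame))
    print("HOG feature length:", features.shape[0])  # 3780 for the default descriptor
```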
When it comes to sensing devices for visual gesture recognition, an average RGB camera on the market is generally up to the task; higher camera quality and pixel counts yield greater recognition accuracy but also a greater computational load. Nowadays, with the development and application of structured light technology and depth sensors, 3-D and stereo vision gesture recognition technologies have been widely applied and rapidly commercialized. For example, one stereoscopic-vision approach used disparity-map-based centroid movement and changes in its intensity as features, with a conditional random field (CRF) as the classifier [16]; the experiments verified that the average recognition rate reached 88%.
At present, the mature typical commercial products include
Kinect, Leap Motion etc. as shown in Fig. 3(a) and (b) [17], [18].
By using this kind of equipment with more complete functions
and higher reliability, many scholars and related professionals
have developed effective methods and algorithms for HGR [19].
A recognition algorithm based on the Kinect sensor was proposed by Wu et al. [20] and compared with a traditional method to verify that its RA was significantly improved. Murata [17] et al. tested the Kinect sensor's ability to recognize numbers and alphanumeric characters written in the air, and the results showed average recognition rates of 95.0% for numbers and 98.9% for alphanumeric characters. The Kinect sensor has also been used to recognize Arabic numerals and English letters written in the air based on the trajectory characteristics of the fingertips, obtaining high accuracy [21]. Leap Motion, combined with a model based on constant radial basis function (RBF) neural networks, was verified by experiments [18]. Almarzuqi [22] et al. used the Leap Motion depth sensor for gesture recognition to improve the accuracy of intelligent systems when interacting with robots.
The hardware of visual gesture recognition is relatively simple, but the computational cost is higher because 2-D or even 3-D image data must be processed. Furthermore, gesture recognition based on optics is limited by the focal length and coverage of the camera, so blind areas and occlusion of light occur easily, which affect the RA and the user's range of activity.
Fig. 4. Surface electromyography. (a) Relationship between gesture and forearm muscle. (b) Wet electrodes. (c) Dry electrodes.
Ahmad et al. systematically summarized and discussed the problems
and challenges of visual gesture recognition from three aspects:
system (challenges of response time and cost factor), environ-
ment (challenges of background, illumination, invariance, and
ethnic groups) and gesture (challenges of translation, scaling,
rotation, segmentation, feature selection, dynamic gesture, and
size of dataset) [23].
C. Surface Electromyography
Human hand movement is driven by the contraction and
stretching of the forearm muscles, as shown in Fig. 4(a) [24]. Anatomically, the flexor carpi radialis and flexor carpi ulnaris muscles play a major role in wrist joint movement [25], and the forearm muscles are also responsible for finger bending and extension. The obvious correlation between
forearm muscle and hand movement provides the possibility
for HGR based on forearm muscle characteristics. The acquisi-
tion methods of forearm muscle movement information mainly
include sEMG, ultrasound (US), forearm shape detection, and
mechanomyography (MMG).
sEMG signals are collected by the contact between electrodes
and the skin of the body. The resulting EMG signal is the
summation of the action potentials discharged by the active
muscle fibers in the proximity of the recording electrodes [26].
These electrodes are generally divided into wet electrodes and
dry electrodes, as shown in Fig. 4(b) and (c) [24], [27]. Wet
electrodes have lower skin-contact impedance, which reduces the influence of external interference sources on the electromyographic signal and improves the signal-to-noise ratio. Wet electrodes can be freely pasted to any position on the skin surface without additional fixation, but because of the gel, each electrode occupies a larger area, so wet electrodes cannot be used for acquisition over many channels. In contrast, dry electrodes, which are small, gel-free, and allow a denser layout, are easy to integrate into sleeves and other devices for multichannel array collection. However, because they lack adhesion, an additional fixation device is needed to keep them against the skin surface.
As with other sensing methods, HGR based on sEMG also uses machine learning for classification, but the features have fewer dimensions and the algorithms are simpler than in the visual method. In recent years, many scholars have demonstrated the feasibility of the sEMG method in experiments and system designs. Hu [28] et al. designed a human-machine coordinated control system that recognizes eight gestures of subjects in real time from a worn array of sEMG sensors and controls a six-degree-of-freedom dexterous artificial hand for synchronous movements. Ketyko [29] et al. proposed a domain adaptation method based on deep learning and verified its superiority. A novel pattern-recognition method has also been proposed and its classification performance compared with different algorithms for able-bodied subjects and amputees [30]. Fang
[31] et al. proposed a clustering-feedback strategy to improve
the accuracy of EMG online pattern recognition. Different from
other sensing methods, the method based on muscle activity
characteristics can also identify and estimate grip force. There
is a strong correlation between muscle force and sEMG signals
[32]. Xu [27] et al. verified the feasibility of three different
types of neural network algorithms for force estimation based on
sEMG through experiments. Wang [33] et al. proposed a wavelet scale selection technique based on nonlinear correlation to select effective wavelet scales from sEMG signals for grip force estimation. Fang [34] et al. proposed an attribute-driven granular model under a machine learning scheme and applied it to EMG-based pinch-type classification and fingertip force grand prediction. Of course, sEMG signals also have some shortcomings, mainly weak signal strength and crosstalk, as well as the inevitable movement of dry electrodes while being worn. The denoising of sEMG signals is a
very important topic. Baspinar [35] et al. studied three differ-
ent denoising methods of sEMG. Wu [36] et al. reviewed the
denoising methods of sEMG signals. Aiming at the problem of
robust motion recognition, Minjae [37] et al. proposed an sEMG
interface rotation compensation method. It is worth mentioning that an amputee's motion intention can still generate relevant sEMG signals on the body, so sEMG signals have some unique advantages in assisting the disabled [38], [39]. Fang [40] et al. reviewed extensive research on interacting with prostheses based on sensing techniques including EMG, SMG, MMG, EEG, ECoG, ENG, etc.
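As a concrete illustration of the kind of low-dimensional features typically extracted from sEMG before classification, the sketch below computes four classic time-domain features (mean absolute value, root mean square, zero crossings, and waveform length) over one analysis window. The window length, threshold, and synthetic signal are assumptions for illustration only, not values from the cited studies.

```python
# Classic sEMG time-domain features over one analysis window (synthetic signal).
import numpy as np

def semg_features(window, zc_threshold=0.01):
    mav = np.mean(np.abs(window))                        # mean absolute value
    rms = np.sqrt(np.mean(window ** 2))                  # root mean square
    sign_changes = np.diff(np.sign(window))
    zc = np.sum((np.abs(sign_changes) > 0) & (np.abs(window[:-1]) > zc_threshold))  # zero crossings
    wl = np.sum(np.abs(np.diff(window)))                 # waveform length
    return np.array([mav, rms, zc, wl])

fs = 1000                                                # assumed 1 kHz sampling rate
t = np.arange(0, 0.2, 1 / fs)                            # one 200 ms analysis window
rng = np.random.default_rng(1)
synthetic_emg = 0.3 * np.sin(2 * np.pi * 80 * t) * rng.normal(1.0, 0.3, t.size)
print(semg_features(synthetic_emg))
```

Feature vectors like this, computed over sliding windows and across channels, are what classifiers such as those in [28]-[31] typically operate on.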
D. Ultrasound
As mentioned in the previous sections, wearable HGR systems based on sEMG have made some achievements in experiments and practical applications, but sEMG still has defects such as weak signals and poor penetration, which affect the user experience. In recent years, scholars have therefore turned to ultrasound (US), which has greater penetrating power.
Sound waves are mechanical waves and travel by repeatedly compressing and expanding the medium. The complex ratio of the sound pressure p to the particle velocity v at a certain point in the medium is called the acoustic impedance, that is
$$Z_s = \frac{p}{v}. \quad (1)$$
Fig. 5. US-based HGR. (a) Acquisition methods of A-mode US. (b) Acquisition methods of B-mode US.
In general, the acoustic impedance of a plane harmonic wave propagating forward in one dimension can be expressed in the following form
$$Z_s = \rho c \quad (2)$$
where ρ is the density of the medium and c is the speed of sound.
The acoustic impedance of a plane sound wave is a real number that depends only on the properties of the medium and the speed of sound; it is also known as the characteristic acoustic impedance of the medium. When sound waves pass from one medium to another with a different impedance, an echo is reflected at the boundary, and the proportion of the echo is proportional to the impedance mismatch between the two media. If a series of ultrasonic pulses is sent into the forearm muscles, muscle movement and shape information can be obtained by detecting and analyzing the echo signal.
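A short worked example of these relations is given below: the pressure reflection coefficient at a boundary between two media follows from their characteristic impedances Z_s = ρc of (2), and the echo arrival time follows from the boundary depth and the speed of sound. The tissue values used are rough, textbook-style assumptions chosen only to illustrate the calculation.

```python
# Reflection at a tissue boundary and echo timing, using assumed textbook values.

def reflection_coefficient(z1, z2):
    """Pressure reflection coefficient at the boundary from medium 1 to medium 2."""
    return (z2 - z1) / (z2 + z1)

# Assumed characteristic impedances (kg m^-2 s^-1): fat vs. muscle.
z_fat, z_muscle = 1.38e6, 1.70e6
r = reflection_coefficient(z_fat, z_muscle)
print(f"Pressure reflection coefficient: {r:.3f}")       # about 0.10
print(f"Reflected energy fraction: {r**2:.4f}")          # about 1% of the pulse energy

# Round-trip delay for an echo from a boundary 20 mm deep, assuming c = 1540 m/s.
depth, c = 0.020, 1540.0
print(f"Echo delay: {2 * depth / c * 1e6:.1f} microseconds")
```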
The acquisition methods of US are divided into A-mode and
B-mode. Fig. 5 shows the gesture recognition equipment of
A-mode and B-mode US, respectively [41], [25]. Yang [41] et al.
designed a gesture recognition method based on multi A-mode
US, with an offline accuracy of 98.87% and an online accuracy
of 95.4%. Hettiarachchi [42] et al. introduced a new wearable
ultrasonic radial muscle activity detection system and obtained
good gesture RA through experiments. Sun [43] et al. compared
the dual-frequency transducer of A-mode ultrasonic probe with
the single-frequency transducer, and the results proved that the
dual-frequency transducer has better performance. Yan et al.
[44] designed a new lightweight A-mode ultrasonic probe and
verified its gesture classification performance. B-mode ultrasound, namely the luminance (brightness) mode that forms 2-D images from the echo signal, conveys more information and covers a wider range of the muscles. However, US imaging often requires more complicated algorithms and vision-like methods for feature extraction, so its processing and recognition algorithms have certain similarities with the visual approach. A gesture recognition method based on B-mode US without repeated training has been proposed, and the RA reached 94% [45], [46]. Akhlaghi [47] et al. used US images to recognize 15 different gestures performed by six participants; the results showed an offline RA of 91% and an online real-time RA of 92%. Compared with sEMG signals, US signals can penetrate deeper muscles and ensure higher accuracy. Huang [48] et al. compared recognition methods based on B-mode US and sEMG signals and found
that the accuracy of the ultrasonic method (95.88%) was higher than that of the sEMG method (90.14%). At the same time, the US method can dispense with many of the wires in wearable gesture recognition devices, but current B-mode US probes are still relatively bulky, and US gel must be applied to couple the probe to the skin, which affects the user's experience; the gel also adds cost and sometimes affects recognition performance. McIntosh [25] et al. explored the placement of US probes, which provides guidance for the design of US-based wearable devices, and many scholars are also studying how to reduce the cost of ultrasonic gels and improve their performance [49], [50].
E. Other Methods
HGR is closely related to the change of forearm muscle
morphology, which will inevitably cause the change of forearm
transverse section shape. It is a novel way to extract gesture
information by detecting such shape changes with piezoelectric
sensors. Zhang [51] et al. used four pressure sensors to realize gesture recognition. Booth [52] et al. recognized five-finger tapping motions by placing six piezoelectric sensors inside a wrist band to detect the changes in wrist shape, and the RA reached 97%. In addition to piezoelectric sensors, optical
sensors can also be used to extract the changing features of
the arm shape for gesture recognition [53]. Although these
detection methods can achieve high accuracy in the recognition
of individual gestures, they are still unable to collect the move-
ment information of deep muscles and have great limitations
in the recognition of flexible gestures. In some studies, forearm muscle movement information was collected by means of MMG, which records the vibration signals produced during muscle contraction. Like the ultrasonic method, it detects acoustic signals from the muscle for HGR, but it relies on the muscle's spontaneous acoustic signal, whose strength is far weaker than that of an ultrasonic echo and which is easily interfered with. Wilson [54] et al. used this method to recognize 12 hand gestures; the offline RA was 89% and the real-time RA was 68%. In recent years, some scholars have focused on HGR using WiFi and radar signals. Tian [55] et al. introduced a gesture recognition method based on transmitting and receiving weak WiFi signals. A system that leverages changes in WiFi signal strength has been presented to sense in-air hand gestures around the user's mobile device [56]. Novel continuous HGR systems have also been proposed based on single-radar and dual-channel Doppler radar sensors [57], [58].
Ryu [59] et al. introduced a frequency modulated continuous
wave radar system for HGR. As new types of HGR, these methods free users from the shackles of wearable equipment and may become a new way of deploying HGR in the future.
F. Hybrid Methods
Different sensors can detect different, and limited, gesture-related features: inertial sensors collect the motion features of the hand, visual or optical methods collect the shape and depth features of the gesture, and sEMG and US collect the motion features of the forearm muscles. The limitations of each individual method constrain the improvement of HGR performance. In recent years, developing HGR systems and technologies that integrate a variety of sensing technologies and collect a variety of features has gradually attracted attention. As a result, many mature products have been derived and even commercialized. Along with the aforementioned Kinect
and Leap Motion, which integrate camera, depth sensor, voice
recognition, and other functions, there are also Myo bracelet and
gForce bracelet, which integrate sEMG sensors and inertial sensors. Xia [60] et al. designed a device that integrates sEMG and A-mode US acquisition; experimental results show that the RA of the system is 20.6% and 4.85% higher than that of the single sEMG method and the single A-mode US method, respectively. Zhang [61]
et al. designed a new system that trains and tests on the acquired inertial measurement unit signals, EMG signals, and finger and palm pressure data, and established an effective gesture classification model based on the long short-term memory algorithm from deep learning. Guo [62] et al. used data gloves and
Kinect to obtain the data of changes in the angle of finger joints
and the data of centroid motion of hand for gesture recognition.
Accelerometer and sEMG signals have also been combined for HGR, and the results showed that the RA of this method improved to a certain extent [63]. Molchanov [64] et al. proposed a new system
that uses short-range radar, color camera, and depth camera to
recognize the gestures of drivers in cars. Skaria [65] et al. used
miniature radar sensors to pick up gestures and classified them,
with an accuracy of more than 95%. Wilson [66] et al. used
inertial sensors and MMG for HGR and control of prosthetic
hand.
III. COMPARISON AND ANALYSIS
Table I shows the specific approaches used within several mainstream sensing methods and the RA reported in the relevant literature. In the table, the definition of reference accuracy varies according to the calculation methods used in the different articles, and the specific content of each publication shall prevail. Obviously, there are certain differences in the sensing methods, algorithms, and processing pipelines used by the authors, so the final results also differ considerably. It is worth mentioning that the dataset size, gesture types, and verification methods used by the authors in their gesture recognition experiments are listed in Table I. In general, we consider that the higher the RA, the better the performance of the method; however, the reference value of this parameter varies with the dataset and experimental design. The more kinds of gestures, the more difficult the classification; the larger the sample size, the more convincing the RA. Through comparison and analysis, these methods and processes will have great reference value for future research work. Table I also mentions several different gesture types, which by default refer to static gesture recognition unless specified otherwise. We summarize the advantages and inherent defects of the various sensing methods in applications in Table II.
TABLE I
COMPARISON OF DIFFERENT METHODS USED IN HGR
Algorithm abbreviations: support vector machine (SVM), genetic algorithm (GA), neural network (NN), dynamic time warping (DTW), extreme learning machine (ELM), radial basis function (RBF), conditional random field (CRF), convolutional neural network (CNN), hidden Markov model (HMM), K-nearest neighbors (KNN), radial basis function based support vector machine (RBF SVM), weight-based Pearson correlation coefficient (WPCC).
TABLE II
ANALYSIS OF DIFFERENT METHODS USED IN HGR
Through comparison, it can be found that the HGR system based on data gloves has higher robustness, the overall system is stable, the sensor has low cost, and the accuracy is higher, but it is inconvenient
to wear, which affects the application and user’s experience.
Compared with other methods, visual gesture recognition frees the hands and arms from bondage, but the recognition area is limited to the space captured by the camera. From a technical perspective, visual recognition is relatively mature, with many studies and achievements. sEMG has an advantage in cost, but its accuracy is often affected by its weak signal. Although replacing forearm-muscle recognition with ultrasonic signals can improve the accuracy, commercial application is difficult due to problems such as high equipment cost and the inconvenience of wearing. From the recognizable gesture types, we can see that the visual mode can recognize gestures such as air writing, which makes it more suitable for dynamic information transmission. sEMG and US signals are particularly useful in the field of medical rehabilitation because they can be worn on the forearm to recognize the gestures of amputees.
In fact, from the content described in Table II and from existing commercial products, we can see that the sensing systems that can be truly popularized and successfully applied in daily life are those that are more portable to use, such as the Kinect in the visual mode and the sEMG-based Myo armband, which free the hands from their shackles and are more convenient for people. For specific applications, scholars and researchers need to choose according to the actual application scenario.
IV. APPLICATIONS
Human-computer interaction systems consisting of a variety of sensing technologies have great development and application potential. This section summarizes the applications of HGR
technology in the following five directions.
A. Simplify or Replace Traditional Control Methods
More intuitive and convenient methods of HMI such as
HGR will gradually replace traditional control methods such
as switches and buttons. This is an obvious trend, just as buttons
and switches replaced manual operation methods. In fact, HGR
shows great potential in various scenarios of daily life, and
performs better than traditional control methods. Lalithamani [71] applied single web-camera-based gesture recognition technology to controlling the mouse cursor, clicking actions, and a few shortcuts for opening specific applications. Rajesh [72] et al. examined the effect of applying HGR to a traditionally joystick-controlled wheelchair and showed, based on experimental results and user feedback, that the HGR-based wheelchair was the better choice. Shalahudin [73] et al. developed a controller prototype to facilitate family life by controlling lamps and other home devices through HGR. At the same time, because of its intuitiveness and convenience, HGR has shown good results as a substitute for traditional joysticks, buttons, etc., in 3-D motion tasks, mainly in the control of mobile machinery, mechanical prosthetic hands, etc. Luo [74] et al. realized the control of a Mecanum-wheeled mobile robot through two-hand gesture recognition and verified that the system could promptly, accurately, and stably complete tasks of directional movement, grasping, and clearing obstacles in mobile robot control experiments. HGR has thus proven to be a strong alternative to traditional control methods.
B. Human–Machine Cooperation
Many applications of HGR to replace traditional controls
have been mentioned. In fact, HGR also has development po-
tential in controlling industrial robots. Nuzzi [75] et al. built
a gesture recognition model for collaborative robots. Du [76] et al. proposed a natural human-robot interface using an adaptive tracking method, which removes the accuracy and operating-space limitations of traditional HMI, improves the user experience, and reduces the complexity of operation. Neto [77]
et al. proposed a method to program industrial robots by using
a hand-held accelerometer-based input device to recognize ges-
tures and voice, which can better replace the traditional process
of industrial robot teaching and programming, improve work
efficiency and save time. At present, many studies have shown
that HGR can realize the control of robots, but most of the control
is one-way control of robots by humans, that is, humans make
simple associated actions to control the actions of robots. We
believe that with the in-depth study of sensing technology and
algorithms, robots will be able to recognize and analyze human
hand movements based on the information obtained from sensors
when humans are engaged in some complex manual activities,
and autonomously make coordinated actions to assist human to
complete the work. These are not utopian ideas, but rather a
development direction that can be fully realized in the light of
existing technical conditions. The application of HGR to human-computer collaboration will help humans complete work more efficiently; after all, some work is more expensive for robots to complete, while other work is more suitable for humans to do manually. Whether this kind of human-machine collaboration can be realized depends on the optimization and development of sensing technology and algorithms in the future.
C. Sign Language Translation
Hand gestures, as one of the most important body languages of human beings, also play an important role in interpersonal communication. Standard sign language can be translated by HGR technology, which overcomes the communication barrier between deaf-mute people and people who do not know sign language, and provides auxiliary communication for those who lack verbal expression. Zhang [78] et al. proposed
a sign language recognition system that recognizes gestures by
extracting key frames of videos. Wei [79] et al. proposed a component-based, vocabulary-extensible sign language recognition framework using data from surface electromyographic sensors, accelerometers, and gyroscopes; they implemented recognition experiments with training sets of different sizes on a target set of 110 frequently used Chinese Sign Language words and achieved high RA. A 3-D recognition model has been proposed for Indian Sign Language recognition [80], verifying that a motionlet-based adaptive kernel matching algorithm on 500-class 3-D sign language data gives better RA than state-of-the-art action recognition models. Neiva
[81] et al. reviewed the work of scholars on sign language recog-
nition technology in detail, especially in mobile situations, and
summarized the research direction of sign language recognition
technology.
D. Interactive Entertainment and Virtual Reality (VR)
HGR technology also plays a key role in interactive en-
tertainment and virtual reality. In the process of interactive
entertainment, people mainly use hand gestures as input signals
to transmit to computers, televisions, and other entertainment
devices to get visual and auditory feedback to complete the
entertainment process. For example, one of the main functions of Microsoft's Kinect is the ability to recognize body movements to provide input signals for many supporting video games. In
the future, with the improvement of signal transmission speed
and sensor accuracy, HGR will be further applied in the field
of virtual reality, and hand movements will also be able to be
reproduced in VR. HGR based on Leap Motion has been applied to virtual-reality rehabilitation training for subacute stroke, and its effect was verified through experiments [82].
E. Correction of Standard Actions
Because HGR technology can quantify hand movements, it has the additional benefit that activities requiring standardized motions can be corrected and taught based on the quantified gesture data. For example, Sufen [83] et al. used an inertial data glove and an infrared detection rod to collect and identify people's gestures and the piano keys they pressed while playing the piano, realizing real-time correction of piano-playing gestures. In the future, this technique could be used to train many
professional skills based on hand movements. The quantified
gesture data will provide a standardized parameter for gesture
movement, through which people can learn more professional
gestures and correct their own movements.
V. FUTURE WORKS
HGR, as a novel technology, has gradually been accepted by the public and has caught the attention of more and more enterprises and scientific researchers. At present, many enterprises have begun to develop equipment for HGR, and many research institutions are designing HGR systems and have achieved high classification accuracy. As the basis of HGR, sensing technology will develop significantly in the future. This article roughly divides its future development into three major directions.
A. Integration of Various Sensors
Section II mentioned many research results that integrate various sensing methods. With continuing research, more and more sensor combinations will be verified by experiments to achieve a better RA.
B. Look for New Features
Feature extraction is a key process in pattern recognition, and the richness of the features also significantly affects the accuracy. Different sensors extract different gesture signals, and new features associated with gestures can be found by searching for new sensors, so as to develop HGR systems with better performance. In general, the higher the correlation between the selected features and the gestures, the easier the classification and the better the modeling effect will be. Therefore, exploring features with a higher degree of correlation is one of the tasks for future work. However, the type of feature often affects the generalization ability of the HGR system. Before designing an HGR system, we should first decide whether the system is intended for the entire human population (using features such as the shape of the hand and the bending of the joints) or for individual users (using features such as those of the forearm muscles). The appropriate features differ across application scenarios, and how to use the features chosen according to different work requirements is also an important part of future work.
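As a minimal illustration of such feature screening, the sketch below scores two candidate features by their mutual information with synthetic gesture labels. The feature matrix and labels are invented for the example; in practice, the columns would be features extracted from the chosen sensors, and other relevance measures could be substituted.

```python
# Rank candidate features by mutual information with the gesture labels.
import numpy as np
from sklearn.feature_selection import mutual_info_classif

rng = np.random.default_rng(42)
n_samples, n_gestures = 300, 5
labels = rng.integers(n_gestures, size=n_samples)

informative = labels + rng.normal(0, 0.3, n_samples)   # feature correlated with the gesture
noisy = rng.normal(0, 1.0, n_samples)                  # feature unrelated to the gesture
X = np.column_stack([informative, noisy])

scores = mutual_info_classif(X, labels, random_state=0)
for name, score in zip(["informative", "noisy"], scores):
    print(f"{name:>12s}: mutual information = {score:.3f}")
```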
C. Develop New Algorithms
Most sensor data must be processed by machine learning algorithms to classify gestures. Many research results show that pairing different algorithms with different sensing methods leads to different results. Through experimental comparison and analysis, algorithm-sensor combinations with a higher degree of mutual matching can be developed in the future.
There are still many challenges in HGR technology. The RA is the key parameter for the user's experience, and breaking through the RA bottleneck urgently needs to be solved. In addition, improving the wearing portability of the system is also an important issue. We need to constantly explore new approaches, such as switching from wired connections to Bluetooth, or reducing the weight of wearables through better craftsmanship.
ACKNOWLEDGMENT
The authors declare no conflict of interest. For this type of
study, formal consent is not required.
REFERENCES
[1] L. Dipietro, A. M. Sabatini, and P. Dario, “A survey of glove-based systems
and their applications,” IEEE Trans. Syst. Man Cybern. Part C Appl. Rev.,
vol. 38, no. 4, pp. 461–482, Jul. 2008.
[2] B. Fang, F. Sun, H. Liu, and C. Liu, “3-D human gesture capturing and
recognition by the IMMU-based data glove, Neurocomput., vol. 277,
pp. 198–207, 2018.
[3] X. Huang et al., “Tracing the motion of finger joints for gesture recognition
via sewing RGO-coated fibers onto a textile glove, IEEE Sensors J.,
vol. 19, no. 20, pp. 9504–9511, Oct. 2019.
[4] B. Fang, F. Sun, H. Liu, and D. Guo, “A novel data glove using inertial and
magnetic sensors for motion capture and robotic arm-hand teleoperation,”
Ind. Robot. Int. J., vol. 44, no. 2, pp. 155–165, 2017.
[5] J. Galka, M. Masior, M. Zaborski, and K. Barczewska, “Inertial motion
sensing glove for sign language gesture acquisition and recognition, IEEE
Sensors J., vol. 16, no. 16, pp. 6310–6316, Aug. 2016.
[6] N. M. Kakoty and M. D. Sharma, “Recognition of sign language alphabets
and numbers based on hand kinematics using a data glove, Procedia
Comput. Sci., vol. 133, pp. 55–62, 2018.
[7] Z. Shen et al., “A soft stretchable bending sensor and data glove applica-
tions,” Robot. Biomimetics, vol. 3, no. 1, 2016, Art. no. 22.
[8] H. Wu et al., “Fabric-based self-powered noncontact smart gloves for
gesture recognition,” J. Mater. Chem. A, vol. 6, no. 41, pp. 20277–20288,
2018.
[9] D. H. Kim, S. W. Lee, and H. S. Park, “Improving kinematic accuracy of
soft wearable data gloves by optimizing sensor locations, Sensors, vol. 16,
no. 6, 2016, Art. no. 766.
[10] C.-M. Chiu, S.-W. Chen, Y.-P. Pao, M.-Z. Huang, S.-W. Chan, and Z.-H.
Lin, “A smart glove with integrated triboelectric nanogenerator for self-
powered gesture recognition and language expression, Sci. Technol. Adv.
Mater., vol. 20, no. 1, pp. 964–971, 2019.
[11] S. S. Rautaray and A. Agrawal, “Vision-based hand gesture recognition for
human computer interaction: A survey, Artif. Intell. Rev., vol. 43, no. 1,
pp. 1–54, 2015.
[12] J. H. Sun, T. T. Ji, S. Bin Zhang, J. K. Yang, and G. R. Ji, “Research on the
hand gesture recognition based on deep learning,” in 2018 12th Int. Symp.
Antennas, Propagation and EM Theory (ISAPE), Hangzhou, China, 2018,
doi: 10.1109/ISAPE.2018.8634348.
[13] D. Indra, S. M. Purnawansyah, and E. P. Wibowo, “Indonesian sign
language recognition based on shape of hand gesture,” Procedia Comput.
Sci., vol. 161, pp. 74–81, 2019.
[14] J. Lin and Y. Ding, “A temporal hand gesture recognition system
based on HOG and motion trajectory,” Optik (Stuttg.), vol. 124, no. 24,
pp. 6795–6798, 2013.
[15] P. K. Pisharady and M. Saerbeck, “Recent methods and databases in vision-
based hand gesture recognition: A review, Comput. Vis. Image Underst.,
vol. 141, pp. 152–165, 2015.
[16] M. A. Laskar, A. J. Das, A. K. Talukdar, and K. K. Sarma, “Stereo
vision-based hand gesture recognition under 3-D environment, Procedia
Comput. Sci., vol. 58, pp. 194–201, 2015.
[17] T. Murata and J. Shin, “Hand gesture and character recognition based on
Kinect sensor,” Int. J. Distrib. Sens. Netw., vol. 2014, pp. 1–6, 2014.
[18] W. Zeng, C. Wang, and Q. Wang, “Hand gesture recognition using leap
motion via deterministic learning,” Multimed. Tools Appl., vol. 77, no. 21,
pp. 28185–28206, 2018.
[19] H. Cheng, L. Yang, and Z. Liu, “Survey on 3-D hand gesture recognition,”
IEEE Trans. Circuits Syst. Video Technol., vol. 26, no. 9, pp. 1659–1673,
Sep. 2016.
[20] X. Wu, C. Yang, Y. Wang, H. Li, and S. Xu, “An intelligent interactive
system based on hand gesture recognition algorithm and Kinect,” in Proc.
5th Int. Symp. Comput. Intell. Des., 2012, vol. 2, pp. 294–298.
[21] F. Liu, W. Zeng, C. Yuan, Q. Wang, and Y. Wang, “Kinect-based hand
gesture recognition using trajectory information, hand motion dynamics
and neural networks,” Artif. Intell. Rev., vol. 52, no. 1, pp. 563–583, 2019.
[22] A. Ahmed Almarzuqi and S. M. Buhari, “Enhance robotics ability in hand
gesture recognition by using leap motion controller,” Adv. Broad-Band
Wirel. Comput. Appl., vol. 2, pp. 513–523, 2017.
[23] A. S. Al-Shamayleh, R. Ahmad, M. A. M. Abushariah, K. A. Alam, and N.
Jomhari, “A systematic literature review on vision based gesture recogni-
tion techniques,” Multimed. Tools Appl., vol. 77, no. 21, pp. 28121–28184,
2018.
[24] T. R. Farrell and R. F. Weir, “A comparison of the effects of electrode
implantation and targeting on pattern classification accuracy for prosthe-
sis control,” IEEE Trans. Biomed. Eng., vol. 55, no. 9, pp. 2198–2211,
Sep. 2008.
[25] J. McIntosh, A. Marzo, M. Fraser, and C. Phillips, “EchoFlex: Hand
gesture recognition using ultrasound imaging,” in Proc. 35th Annu. CHI
Conf. Human Factors Comput. Syst., Denver, Colorado, USA, 2017,
pp. 1923–1934.
[26] N. Jiang, D. Falla, A. D’Avella, B. Graimann, and D. Farina, “Myoelectric
control in neurorehabilitation,” Crit. Rev. Biomed. Eng., vol. 38, no. 4,
pp. 381–391, 2010.
[27] L. Xu, X. Chen, S. Cao, X. Zhang, and X. Chen, “Feasibility study
of advanced neural networks applied to sEMG-based force estimation,
Sensors, vol. 18, no. 10, p. 3226, 2018, doi: 10.3390/s18103226.
[28] X. H. Hu, A. G. Song, and H. J. Li, “Dexterous robot hand control system
based on surface electromyography image,” Control Theory Appl., vol. 35,
no. 12, pp. 1707–1714, 2018.
[29] I. Ketyko, F. Kovacs, and K. Z. Varga, “Domain adaptation for sEMG-
based gesture recognition with recurrent neural networks,” Int. Jt. Conf.
Neural Netw., vol. 2019, pp. 14–19, 2019.
[30] J. J. V. Mayor, R. M. Costa, A. Frizera Neto, and T. F. Bastos, “Dexterous
hand gestures recognition based on low-density sEMG signals for upper-
limb forearm amputees,” Res. Biomed. Eng., vol. 33, no. 3, pp. 202–217,
2017.
[31] Y. Fang, D. Zhou, K. Li, and H. Liu, “Interface prostheses with classifier-
feedback based user training,” IEEE Trans. Biomed. Eng., vol. 64, no. 11,
pp. 2575–2583, Nov. 2017.
[32] C. Disselhorst-Klug, T. Schmitz-Rode, and G. Rau, “Surface electromyo-
graphy and muscle force: Limits in sEMG-force relationship and new
approaches for applications,” Clin. Biomech., vol. 24, no. 3, pp. 225–235,
2009.
[33] K. Wang, X. Zhang, J. Ota, and Y. Huang, “Estimation of handgrip force
from SEMG based on wavelet scale selection, Sensors, vol. 18, no. 2,
2018, Art. no. 663.
[34] Y. Fang, D. Zhou, K. Li, Z. Ju, and H. Liu, “Attribute-driven gran-
ular model for EMG-based pinch and fingertip force grand recog-
nition,” IEEE Trans. Cybern., vol. 51, no. 2, pp. 1–12, 2019, doi:
10.1109/TCYB.2019.2931142.
[35] U. Baspinar, V. Y. Senyürek, B. Dogan, and H. S. Varol, “Comparative
study of denoising sEMG signals,” Turkish J. Electr. Eng. Comput. Sci.,
vol. 23, no. 4, pp. 931–944, 2015.
[36] J. Wu, X. Li, W. Liu, and Z. J. Wang, “sEMG signal processing methods:
A review,” J. Phys. Conf. Ser., vol. 1237, 2019, Art. no. 032008.
[37] M. Kim, K. Kim, and W. K. Chung, “Simple and fast compensation of
sEMG interface rotation for robust hand motion recognition, IEEE Trans.
Neural Syst. Rehabil. Eng., vol. 26, no. 12, pp. 2397–2406, Dec. 2018.
[38] M. Atzori, A. Gijsberts, H. Muller, and B. Caputo, “Classification of hand
movements in amputated subjects by sEMG and accelerometers,” in Proc.
36th Annu. Int. Conf. IEEE Eng. Med. Biol. Soc., 2014, pp. 3545–3549.
[39] M. Atzori, H. Muller, and M. Baechler, “Recognition of hand movements
in a trans-radial amputated subject by sEMG,” in Proc. IEEE Int. Conf.
Rehabil. Robot., 2013, vol. 2013, pp. 1–5.
[40] Y. Fang, N. Hettiarachchi, D. Zhou, and H. Liu, “Multi-modal sensing
techniques for interfacing hand prostheses: A review, IEEE Sensors J.,
vol. 15, no. 11, pp. 6065–6076, Nov. 2015.
[41] X. Yang, X. Sun, D. Zhou, Y. Li, and H. Liu, “Towards wearable A-mode
ultrasound sensing for real-time finger motion recognition,” IEEE Trans.
Neural Syst. Rehabil. Eng., vol. 26, no. 6, pp. 1199–1208, Jun. 2018.
[42] N. Hettiarachchi, Z. Ju, and H. Liu, “A new wearable ultrasound muscle
activity sensing system for dexterous prosthetic control, in 2015 IEEE
Int. Conf. Syst., Man, and Cybernetics, SMC, 2015, pp. 1415–1420, doi:
10.1109/SMC.2015.251.
[43] X. Sun, X. Yang, X. Zhu, and H. Liu, “Dual-frequency ultrasound
transducers for the detection of morphological changes of deep-layered
muscles,” IEEE Sens. J., vol. 18, no. 4, pp. 1373–1383, 2018, doi:
10.1109/JSEN.2017.2778243.
[44] J. Yan, X. Yang, X. Sun, Z. Chen, and H. Liu, “A lightweight ultrasound
probe for wearable human-machine interfaces,” IEEE Sensors J., vol. 19,
no. 14, pp. 5895–5903, Jul. 2019.
[45] X. Yang, D. Zhou, Y. Zhou, Y. Huang, and H. Liu, “Towards zero re-
training for long-term hand gesture recognition via ultrasound sensing,”
IEEE J. Biomed. Heal. Inform., vol. 23, no. 4, pp. 1639–1646, Jul. 2019.
[46] W. Xia, L. W. Ye, and H. H. Liu, “A gesture database of B-
mode ultrasound-based human-machine interface,” in Proc. 2017 Int.
Conf. Mach. Learn. Cybernetics, ICMLC, 2017, pp. 118–122, doi:
10.1109/ICMLC.2017.8107752.
[47] N. Akhlaghi et al., “Real-time classification of hand motions using ultra-
sound imaging of forearm muscles,” IEEE Trans. Biomed. Eng., vol. 63,
no. 8, pp. 1687–1698, Aug. 2016.
[48] Y. Huang, X. Yang, Y. Li, D. Zhou, K. He, and H. Liu, “Ultrasound-based
sensing models for finger motion classification,” IEEE J. Biomed. Heal.
Inform., vol. 22, no. 5, pp. 1395–1405, Sep. 2018.
[49] C. Riguzzi, A. Binkowski, M. Butterfield, F. Sani, N. Teismann, and
J. Fahimi, “A randomised experiment comparing low-cost ultrasound
gel alternative with commercial gel, Emerg. Med. J., vol. 34, no. 4,
pp. 227–230, 2016.
[50] E. Sutton, W. M. Bullock, T. Khan, M. Eng, and J. Gadsden, “Hand
sanitizer as an alternative to ultrasound transmission gel,” Reg. Anesth. Pain Med., vol. 41, no. 5, pp. 655–656, 2016.
[51] Y. Zhang, B. Liu, and Z. Liu, “Recognizing hand gestures with pressure-
sensor-based motion sensing,” IEEE Trans. Biomed. Circuits Syst., vol. 13,
no. 6, pp. 1425–1436, Dec. 2019.
[52] R. Booth and P. Goldsmith, “A wrist-worn piezoelectric sensor array
for gesture input,” J. Med. Biol. Eng., vol. 38, no. 2, pp. 284–295,
2018.
[53] Y. Sugiura, F. Nakamura, W. Kawai, T. Kikuchi, and M. Sugimoto, “Behind
the palm: Hand gesture recognition through measuring skin deformation
on back of hand by using optical sensors,” in Proc. 56th Annu. Conf. Soc.
Instrum. Control Eng. Jpn., 2017, pp. 1082–1087.
[54] S. Wilson and R. Vaidyanathan, “Gesture recognition through classi-
fication of acoustic muscle sensing for prosthetic control,” Biommetic
Biohybrid Syst., vol. 10384, pp. 637–642, 2017.
[55] Z. Tian, J. Wang, X. Yang, and M. Zhou, “WiCatch: A Wi-Fi based
hand gesture recognition system,” IEEE Access, vol. 6, pp. 16911–16923,
2018.
[56] H. Abdelnasser, K. Harras, and M. Youssef, “A ubiquitous WiFi-based
fine-grained gesture recognition system,” IEEE Trans. Mob. Comput.,
vol. 18, no. 11, pp. 2474–2487, Nov. 2019.
[57] Z. Zhang, Z. Tian, and M. Zhou, “Latern: Dynamic continuous hand
gesture recognition using FMCW radar sensor, IEEE Sensors J., vol. 18,
no. 8, pp. 3278–3289, Apr. 2018.
[58] T. Fan et al., “Wireless hand gesture recognition based on continuous-wave
Doppler radar sensors,” IEEE Trans. Microw. Theory Tech., vol. 64, no. 11,
pp. 4012–4020, Nov. 2016.
[59] S. J. Ryu, J. S. Suh, S. H. Baek, S. Hong, and J. H. Kim, “Feature-
based hand gesture recognition using an FMCW radar and its tempo-
ral feature analysis,” IEEE Sensors J., vol. 18, no. 18, pp. 7593–7602,
Sep. 2018.
[60] W. Xia, Y. Zhou, X. Yang, K. He, and H. Liu, “Toward portable hybrid
surface electromyography/a-mode ultrasound sensing for human-machine
interface,” IEEE Sensors J., vol. 19, no. 13, pp. 5219–5228, Jul. 2019.
[61] X. Zhang, Z. Yang, T. Chen, D. Chen, and M. C. Huang, “Cooperative
sensing and wearable computing for sequential hand gesture recognition,”
IEEE Sensors J., vol. 19, no. 14, pp. 5775–5783, Jul. 2019.
[62] G. Xiaopei, F. Zhiquan, S. Kaiyun, L. Hong, X. Wei, and B. Jianping,
“Research on unified recognition model and algorithm for multi-modal
gestures,” J. China Univ. Posts Telecommun., vol. 26, no. 2, pp. 30–42,
2019.