Automatic classification of Polish sign language words

Authors: Tomasz Dziubich, Julian Szymanski

Abstract

In this article we present an approach to the automatic recognition of hand gestures using the eGlove device. We present the research results of a system for the detection and classification of static and dynamic words of Polish sign language. The results indicate that the use of eGlove allows good recognition quality to be achieved, which can additionally be improved using further data sources such as RGB cameras.
8th International Conference
Zakopane, Poland, June 18–21, 2013
Automatic classification of static and dynamic words in sign language
Tomasz Dziubich, Julian Szymanski
Gdańsk University of Technology
Faculty of Electronics, Telecommunications and Informatics
{tomasz.dziubich, julian.szymanski}@eti.pg.gda.pl
In this article we present an approach to the automatic recognition of hand gestures. The research results have been used to create a system for the detection of words of Polish sign language, which is widely used in the communication of deaf people. The proposed approach opens additional ways of building multimodal Human-Machine Interactions by processing information acquired from the monitoring of body movements.
We present the detailed architecture of the constructed device, called eGlove, which has been built using a set of wireless sensors. Its main element is a printed circuit board with an ATmega128 microcontroller and a Bluetooth module (FLC-BTM403), mounted on the upper surface of a special fabric glove. The device allows the current position of a hand to be precisely detected by analysing the signals from a magnetometer and tri-axis accelerometers. The signals from the sensors, transferred to the computer, have been used to construct a computational model of human hand movements. Analysis of the signals from eGlove allows us to build a classifier that detects particular patterns of hand movements, which we use for the recognition of words in sign language.
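The abstract does not specify the data format or the processing pipeline, so the following Python sketch is only an illustration of how hand orientation could be estimated from tri-axis accelerometer and magnetometer samples arriving over the Bluetooth link; the serial port name, frame layout, and axis conventions are all assumptions.

```python
# A minimal sketch (not the authors' implementation): estimating hand
# orientation from tri-axis accelerometer and magnetometer samples
# received over a Bluetooth serial link. Port name, frame format and
# units are assumptions made for illustration.
import math
import serial  # pyserial

def orientation(ax, ay, az, mx, my, mz):
    """Roll/pitch from the gravity vector, tilt-compensated yaw
    from the magnetometer (standard tilt-compensated compass)."""
    roll = math.atan2(ay, az)
    pitch = math.atan2(-ax, math.hypot(ay, az))
    # Rotate the magnetic field vector into the horizontal plane.
    mxh = mx * math.cos(pitch) + mz * math.sin(pitch)
    myh = (mx * math.sin(roll) * math.sin(pitch) + my * math.cos(roll)
           - mz * math.sin(roll) * math.cos(pitch))
    yaw = math.atan2(-myh, mxh)
    return roll, pitch, yaw

with serial.Serial("/dev/rfcomm0", 115200, timeout=1) as link:  # hypothetical port
    for line in link:
        # Assumed frame: six comma-separated raw sensor values per sample.
        ax, ay, az, mx, my, mz = map(float, line.decode().strip().split(","))
        print(orientation(ax, ay, az, mx, my, mz))
```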
We give a brief description of current state-of-the-art methods used for sign language detection, provide a detailed description of our system, and report the evaluation methodology used. We evaluate our approach in terms of classification quality and compare the achieved results to those reported in the literature.
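The classifier itself is not named above; as an illustration of how classification quality could be measured on recorded gesture sequences, the sketch below scores a simple dynamic-time-warping nearest-neighbour baseline with leave-one-out accuracy. The function names and the data layout (one feature-vector sequence per recording) are assumptions.

```python
# A sketch of one common evaluation setup for gesture time series:
# 1-nearest-neighbour under dynamic time warping (DTW), scored with
# leave-one-out accuracy. Not the authors' method, only a baseline.
import numpy as np

def dtw(a, b):
    """DTW distance between two sequences of feature vectors."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return cost[n, m]

def leave_one_out_accuracy(sequences, labels):
    """Classify each recording by its nearest neighbour among the rest."""
    hits = 0
    for i, query in enumerate(sequences):
        dists = [dtw(query, s) for j, s in enumerate(sequences) if j != i]
        others = [lab for j, lab in enumerate(labels) if j != i]
        hits += others[int(np.argmin(dists))] == labels[i]
    return hits / len(sequences)
```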
The results indicate that the use of eGlove significantly improves hand gesture recognition quality. In the conclusions we also propose an extension of our system that adds RGB cameras from the Microsoft Kinect, allowing multimodal analysis of body language. The promising results of applying eGlove to hand gesture recognition open up other areas of use, e.g. in medicine, where the device could be used for predicting epileptic seizures.
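As a purely hypothetical illustration of the proposed Kinect extension, the sketch below fuses per-word scores from a glove-based and a camera-based classifier by a weighted sum (late fusion); the weighting and normalisation are assumptions, not the authors' design.

```python
# Hypothetical late fusion of per-word scores from two classifiers:
# the glove-based one and a camera-based one. Assumes non-negative
# score vectors over the same word vocabulary.
import numpy as np

def fuse(glove_scores, camera_scores, w=0.6):
    """Weighted sum of normalised per-class scores; returns the best class index."""
    g = np.asarray(glove_scores, dtype=float)
    c = np.asarray(camera_scores, dtype=float)
    g, c = g / g.sum(), c / c.sum()
    return int(np.argmax(w * g + (1 - w) * c))
```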