Figure - available from: Journal of Ambient Intelligence and Humanized Computing
Recognition accuracy of different referee hand signals

Source publication
Article
Full-text available
Recognition of hand gestures (hand signals) is an active research area for human–computer interaction, with many possible applications. Automatic machine-vision-based hand gesture interfaces for real-time applications require fast and extremely robust detection of humans, poses, and hands, as well as gesture recognition. Attempting to recognize gestures performed...

Similar publications

Article
Full-text available
Digital menu boards (DMB) are convenient for customers as well as sellers. In this paper, we have implemented a DMB using IR-UWB transceivers. Unlike the traditional touch-based interfaces for menu selection, in our proposed system, users can select items from the menu without touching the screen. The screen is used to display the menu, and the use...

Citations

... The conducted review of existing solutions indicated a wide range of applications for CV, in particular, for instance segmentation across various domains [77][78][79], with a growing trend of employing CV in sports, especially in football. However, there is still a noticeable lack of research aimed at improving the quality of video broadcasts by eliminating visual distractions and providing a more immersive experience, pointing to promising directions for future exploration. ...
Article
Full-text available
Using instance segmentation and video inpainting provides a significant leap in real-time football video broadcast enhancements by removing potential visual distractions, such as an occasional person or another object accidentally occupying the frame. Despite its relevance and importance in the media industry, this area remains challenging and relatively understudied, thus offering potential for research. Specifically, the segmentation and inpainting of camera operator instances from video remains an underexplored research area. To address this challenge, this paper proposes a framework designed to accurately detect and remove camera operators while seamlessly hallucinating the background in real-time football broadcasts. The approach aims to enhance the quality of the broadcast by maintaining its consistency and level of engagement to retain and attract users during the game. To implement the inpainting task, a camera operator instance segmentation method must first be developed. We used a YOLOv8 model for accurate real-time operator instance segmentation. The resulting model produces masked frames, which are used for subsequent camera operator inpainting. Moreover, this paper presents an extensive “Cameramen Instances” dataset with more than 7500 samples, which serves as a solid foundation for future investigations in this area. The experimental results show that the YOLOv8 model performs better than other baseline algorithms in different scenarios. A precision of 95.5%, recall of 92.7%, mAP50-95 of 79.6, and a high FPS rate of 87 in a low-volume environment prove the solution's efficacy for real-time applications.
... Hand Gesture Recognition (HGR) plays an essential role in various interactive systems, including signaling systems that rely on gestures [1,2], recognition of sign language [3,4], sports-specific sign language recognition [5,6], human gesture recognition [7,8], pose and posture detection [9,10], physical exercise monitoring [11,12], and control of smart ...
... Yi Yao and Chang-Tsun Li's research [51] focuses on addressing the formidable task of recognizing and tracking hand movements in uncontrolled environments. They identify several critical challenges inherent in such environments, including multiple hand regions, moving background objects, variations in scale, speed, trajectory location, changing lighting conditions, and frontal occlusions. ...
... This shift in position creates additional training data, improving the model's ability to handle changes in position. Equation (5) is employed for translating the x-axis, while Equation (6) is utilized for translating the y-axis. In these equations, (x,y) represents the coordinates of a pixel in the original image, (x′,y′) represents the coordinates of the corresponding pixel in the translated image, and (dx, dy) represents the translation offsets. ...
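The translation described by Equations (5) and (6) amounts to mapping each pixel (x, y) to (x′, y′) = (x + dx, y + dy). A minimal sketch of that augmentation on a plain 2D array, pure Python for illustration only (a real pipeline would use NumPy or OpenCV):

```python
def translate(image, dx, dy, fill=0):
    # image: list of rows; moves pixel (x, y) to (x + dx, y + dy),
    # filling vacated positions with a constant (illustrative choice)
    h, w = len(image), len(image[0])
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            nx, ny = x + dx, y + dy
            if 0 <= nx < w and 0 <= ny < h:
                out[ny][nx] = image[y][x]
    return out
```

Shifting a 2x2 image one pixel right, for example, pushes the rightmost column out of frame and fills the left column with the constant.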
... The study adopts an "on the fly" augmentation strategy to achieve this goal and defines various parameters for each augmentation technique, as shown in Table 1. Augmentations are performed in a series of steps, starting with background augmentation (1), followed by geometry transformation (2), brightness (3), temperature (4), and blurriness (5). Each stage is applied during the deep learning algorithm's training phase. ...
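The fixed ordering of augmentation stages described in that snippet can be sketched as a simple composition of step functions; the five stand-ins below are hypothetical placeholders mirroring the stated order, not the study's actual implementations:

```python
def compose(steps):
    # chain augmentation steps so they run in the given order
    def run(sample):
        for step in steps:
            sample = step(sample)
        return sample
    return run

# hypothetical stand-ins for: background (1), geometry (2),
# brightness (3), temperature (4), blurriness (5)
pipeline = compose([
    lambda img: img,  # background augmentation
    lambda img: img,  # geometry transformation
    lambda img: img,  # brightness adjustment
    lambda img: img,  # temperature adjustment
    lambda img: img,  # blurring
])
```

Applying `pipeline` to each training sample "on the fly" yields a freshly augmented variant per epoch rather than a duplicated static dataset.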
Article
Full-text available
This research stems from the increasing use of hand gestures in various applications, ranging from sign language recognition to electronic device control. The focus is on the importance of accuracy and robustness in recognizing hand gestures, to avoid misinterpretation and instruction errors. However, many experiments on hand gesture recognition are conducted in limited laboratory environments, which do not fully reflect the everyday use of hand gestures. Therefore, the importance of an ideal background in hand gesture recognition, involving only the signer without any distracting background, is highlighted. In the real world, the use of hand gestures involves various unique environmental conditions, including differences in background colors, varying lighting conditions, and different hand gesture positions. However, the datasets available to train hand gesture recognition models often lack sufficient variability, thereby hindering the development of accurate and adaptable systems. This research aims to develop a robust hand gesture recognition model capable of operating effectively in diverse real-world environments. By leveraging deep learning-based image augmentation techniques, the study seeks to enhance the accuracy of hand gesture recognition by simulating various environmental conditions. Through data duplication and augmentation methods, including background, geometric, and lighting adjustments, the diversity of the primary dataset is expanded to improve the effectiveness of model training. It is important to note that the utilization of the green screen technique, combined with geometric and lighting augmentation, significantly contributes to the model’s ability to recognize hand gestures accurately. The research results show a significant improvement in accuracy, especially with implementing the proposed green screen technique, underscoring its effectiveness in adapting to various environmental contexts.
Additionally, the study emphasizes the importance of adjusting augmentation techniques to the dataset’s characteristics for optimal performance. These findings provide valuable insights into the practical application of hand gesture recognition technology and pave the way for further research in tailoring techniques to datasets with varying complexities and environmental variations.
... In this study, we introduce a novel technique to accurately recognize [20] hand signals given by basketball referees from game footage. Our technique exploits the performance of image segmentation algorithms and combines Histogram of Oriented Gradients (HOG) and Local Binary Patterns (LBP) features together. ...
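The LBP half of that HOG-plus-LBP feature combination can be illustrated with the classic 8-neighbour operator; this is a generic textbook sketch, not the cited paper's exact implementation:

```python
def lbp_code(patch):
    # 8-bit Local Binary Pattern code for the center pixel of a
    # 3x3 patch: each neighbour contributes a bit that is set
    # when its intensity is >= the center intensity
    center = patch[1][1]
    neighbours = [patch[0][0], patch[0][1], patch[0][2], patch[1][2],
                  patch[2][2], patch[2][1], patch[2][0], patch[1][0]]
    code = 0
    for bit, value in enumerate(neighbours):
        if value >= center:
            code |= 1 << bit
    return code
```

A per-region histogram of these codes would then be concatenated with the HOG descriptor to form the combined feature vector the snippet describes.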
Preprint
Full-text available
Verbal communication is the dominant form of self-expression and interpersonal communication. Speech is a considerable obstacle for individuals with disabilities, including those who are deaf, hard of hearing, mute, or nonverbal. Consequently, these individuals depend on sign language to communicate with others. Sign language is a complex system of gestures and visual cues that facilitates the inclusion of individuals into vocal communication groups. In this manuscript, a novel technique is proposed that uses deep learning to accurately recognize Arabic Sign Language (ArSL). Through this advanced system, the objective is to help communication between the hearing and deaf communities. The proposed mechanism relies on advanced attention mechanisms and state-of-the-art Convolutional Neural Network (CNN) architectures, together with the robust YOLO object detection model, which greatly improves the implementation and accuracy of ArSL recognition. In our proposed method, we integrate the self-attention block, channel attention module, spatial attention module, and cross-convolution module into the feature processing, and the ArSL recognition accuracy reaches 98.9%. The recognition accuracy of our method is significantly improved, with a higher detection rate. The presented approach showed significant improvement over conventional techniques, with a precision rate of 0.9. The mAP@0.5 score is 0.9909, and on mAP@0.5:0.95 the results top all state-of-the-art techniques. This shows that the model has a great capability to accurately detect and classify complex multiple ArSL signs. The model provides a unique way of linking people and improving communication while also promoting the social inclusion of deaf people in the Arabic region.
... Lagrangian measures are employed in another paper for violent video detection, outperforming other local features in detecting violence [11]. Lastly, a technique using histogram of oriented gradients and local binary pattern features is presented for accurate recognition of basketball referees' signals in game videos [12]. These studies collectively contribute to the field of video-based activity recognition, offering insights and advancements in various aspects such as player representation, violence detection, pose modeling, action valuation, and gesture recognition in sports videos. ...
Article
The objectives of this research are to develop a deep learning approach for event recognition in field hockey videos, construct a dataset that includes important activities in field hockey such as goals, penalty corners, and penalty, and evaluate the performance of the approach using the constructed dataset. By achieving these objectives, the research aims to improve the accuracy and effectiveness of event recognition in the fast-paced and complex domain of field hockey videos. The methods employed in this research involve utilizing a pretrained convolutional neural network (CNN) to train a classifier specifically designed for event recognition in field hockey videos. To facilitate this process, a dataset is constructed, consisting of labeled instances of key activities in field hockey, namely goals, penalty corners, and penalty. The performance of the approach is then evaluated using this carefully prepared dataset, providing insights into the effectiveness and accuracy of the proposed method for event recognition in the context of field hockey videos. The findings of this research reveal that the proposed deep learning approach for event recognition in field hockey videos achieves a remarkable accuracy of 99.47%. This high level of accuracy highlights the effectiveness of the approach in accurately identifying and classifying events in field hockey. Furthermore, the results demonstrate the potential of this approach in various field hockey applications, including performance analysis, coaching, and video replay. The accurate recognition of events opens new possibilities for leveraging field hockey videos for enhanced analysis, coaching strategies, and engaging video presentations. The novelty of this research lies in the introduction of a deep learning approach specifically designed for event recognition in field hockey videos. 
Unlike traditional methods, this approach leverages the power of deep learning, particularly a pretrained CNN, to improve the accuracy of event recognition. Additionally, the construction of a domain-specific dataset addresses the limitation of existing field hockey datasets and enhances the effectiveness of the approach. The remarkable accuracy achieved in event recognition further emphasizes the novelty and potential of this approach in the field of field hockey video analysis.
... In [22], the researchers proposed a method for recognizing basketball referee hand signals from recorded game recordings using image segmentation based on the histogram of oriented gradients (HOG) and local binary pattern (LBP) features. Using LBP features and a support vector machine (SVM) for classification, the proposed method obtained a 95.6% accuracy rate. ...
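At inference time, the LBP-plus-SVM classification mentioned in that snippet reduces, for a binary linear SVM, to the sign of a weighted sum over the feature vector; a minimal, hedged sketch of that decision rule (the weights and bias here are illustrative, not trained values from the paper):

```python
def svm_decision(weights, bias, features):
    # binary linear SVM decision: the sign of w.x + b picks the class
    score = sum(w * f for w, f in zip(weights, features)) + bias
    return 1 if score >= 0 else -1
```

A multi-class signal recognizer would typically combine several such binary decisions (one-vs-one or one-vs-rest), with the feature vector being the LBP histogram described above.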
Article
Full-text available
Every one of us has a unique manner of communicating to explore the world, and such communication helps to interpret life. Sign language is the popular language of communication for hearing and speech-disabled people. When a sign language user interacts with a non-sign language user, it becomes difficult for a signer to express themselves to another person. A sign language recognition system can help a signer to interpret the sign of a non-sign language user. This study presents a sign language recognition system that is capable of recognizing Arabic Sign Language from recorded RGB videos. To achieve this, two datasets were considered: (1) the raw dataset and (2) the face–hand region-based segmented dataset produced from the raw dataset. Moreover, an operational layer-based multi-layer perceptron, “SelfMLP”, is proposed in this study to build CNN-LSTM-SelfMLP models for Arabic Sign Language recognition. MobileNetV2 and ResNet18-based CNN backbones and three SelfMLPs were used to construct six different models of CNN-LSTM-SelfMLP architecture for performance comparison of Arabic Sign Language recognition. This study examined the signer-independent mode to deal with real-time application circumstances. As a result, MobileNetV2-LSTM-SelfMLP on the segmented dataset achieved the best accuracy of 87.69% with 88.57% precision, 87.69% recall, 87.72% F1 score, and 99.75% specificity. Overall, face–hand region-based segmentation and SelfMLP-infused MobileNetV2-LSTM-SelfMLP surpassed the previous findings on Arabic Sign Language recognition by 10.970% accuracy.
... The brain creates electrical impulses that are sent to muscle fibres through the spinal cord and nerve fibres to activate movements [6,7]. EMG-based research for medical assistance and smart gadgets, such as stroke assessment [8], analysis of neural impairments [9], heartbeat analysis [10], prosthetics [11], rehabilitation [12,13], physical training assessment [14,15], sports analytics [16], movement recognition and analysis [17], affective computing [18], human-machine interfaces [19,20], text input for the disabled [21], bio-signal fusion [22], and general healthcare [23], has been active. To aid rehabilitation from a distance, interactive forms for telerehabilitation can be used to measure the development of patients' range of motion (ROM) in real time using artificial intelligence algorithms, by manipulating the angles of action of limbs about a joint [24]. ...
Article
Full-text available
One of the most difficult components of stroke therapy is regaining hand mobility. This research describes a preliminary approach to robot-assisted hand motion therapy. Our objectives were twofold: First, we used machine learning approaches to determine and describe hand motion patterns in healthy people. Surface electrodes were used to collect electromyographic (EMG) data from the forearm’s flexion and extension muscles. The time and frequency characteristics were used as parameters in machine learning algorithms to recognize seven hand gestures and track rehabilitation progress. Eight EMG sensors were used to capture each contraction of the arm muscles during one of the seven actions. Feature selection was performed using the Pareto front. Our system was able to reconstruct the kinematics of hand/finger movement and simulate the behaviour of every motion pattern. Analysis has revealed that gesture categories substantially overlap in the feature space. The correlation of the computed joint trajectories based on EMG and the monitored hand movement was 0.96 on average. Moreover, statistical research conducted on various machine learning setups revealed a 92% accuracy in measuring the precision of finger motion patterns.
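The time-domain side of the EMG features mentioned above can be illustrated with two classics, RMS amplitude and zero-crossing count; this is a generic sketch of those measures, not the study's exact feature set:

```python
import math

def emg_features(signal):
    # two classic time-domain EMG features:
    # RMS amplitude (signal energy) and zero-crossing count
    rms = math.sqrt(sum(s * s for s in signal) / len(signal))
    zero_crossings = sum(1 for a, b in zip(signal, signal[1:]) if a * b < 0)
    return rms, zero_crossings
```

In a setup like the one described, such features would be computed per sliding window on each of the eight sensor channels and fed to the classifier.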
... Identification of specific sign languages is used in sports [3], as well as in applications for smart homes and supported living, including human action identification [4][5][6][7], pose and posture detection [8,9], physical activity monitoring [10], and hand gesture recognition for control [11]. Researchers in computer science have used a variety of mathematical models and techniques to solve problems in this field over time [12]. ...
... In Figure 4, the proposed DNN for the classification of number gestures from 0 to 9 signs is displayed, along with the dimension information for each layer. With the initial data indicating the RGB channel and the subsequent data indicating the input image dimension, the input layer's dimension is (3,128,128). Each of the 16 filters in the first ConvNet block has a size of 5, and the max-pooling layer comes next with a size of 2, before the final two ConvNet layers. ...
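The layer sizes quoted in that snippet follow the standard convolution arithmetic; assuming no padding and stride 1 for the 5×5 convolution (the snippet does not state these), the first block's spatial dimensions work out as:

```python
def conv_out(size, kernel, stride=1, padding=0):
    # standard output-size formula for convolution and pooling layers
    return (size + 2 * padding - kernel) // stride + 1

# 128x128 input through a 5x5 conv, then a 2x2 max-pool with stride 2
after_conv = conv_out(128, 5)                   # 124
after_pool = conv_out(after_conv, 2, stride=2)  # 62
```

The remaining two ConvNet layers would shrink the feature maps further by the same formula.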
... The DP value here represents an adjustable threshold for spotting key hand gestures: while ∀g : P(O|λ_g) < P(O|λ_non-gesture) (3) holds, no key gesture is spotted, whereas ∃g : P(O|λ_g) > P(O|λ_non-gesture) indicates that a key gesture is present. ...
Article
Full-text available
Automatic key gesture detection and recognition are difficult tasks in Human–Computer Interaction due to the need to spot the start and the end points of the gesture of interest. By integrating Hidden Markov Models (HMMs) and Deep Neural Networks (DNNs), the present research provides an autonomous technique that carries out hand gesture spotting and prediction simultaneously with no time delay. An HMM can be used to extract features, spot the meaning of gestures using a forward spotting mechanism with varying sliding window sizes, and then employ Deep Neural Networks to perform the recognition process. Therefore, a stochastic strategy for creating a non-gesture model using HMMs with no training data is suggested to accurately spot meaningful number gestures (0–9). The non-gesture model provides a confidence measure, which is utilized as an adaptive threshold to determine where meaningful gestures begin and stop in the input video stream. Furthermore, DNNs are extremely efficient and perform exceptionally well when it comes to real-time object detection. According to experimental results, the proposed method can successfully spot and predict significant motions with a reliability of 94.70%.
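The adaptive-threshold spotting described above boils down to comparing each gesture model's likelihood against the non-gesture model's; a minimal sketch under that reading, with the HMM log-likelihoods assumed to be precomputed elsewhere:

```python
def spot_gesture(log_likelihoods, non_gesture_ll):
    # log_likelihoods: {gesture_label: log P(O | gesture model)}
    # non_gesture_ll: log P(O | non-gesture model), the adaptive threshold
    # returns the best gesture whose likelihood beats the threshold, else None
    best = max(log_likelihoods, key=log_likelihoods.get)
    return best if log_likelihoods[best] > non_gesture_ll else None
```

Run over a sliding window of the input stream, the transition from `None` to a label marks a gesture's start point, and the reverse transition marks its end.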
... Also, the evolution of gesture detection technologies has been indispensable to the progress of human–computer interfaces, and the use of hand gestures has become much more prevalent in many fields. Hand gesture identification encompasses sub-domains including sign language recognition [14][15][16], identification of the unique signal languages used in sports [17], whole-body human motion diagnosis [18], pose and body position identification [19,20], physical activity monitoring [21], and control of smart home living applications with hand gestures [22]. Hand size diversity, skin texture and colouring, lighting, viewpoint discrepancy, resemblance among diverse motions, and the ecological context pose difficult obstacles to vision-dependent hand gesture detection. ...
Article
Hand gestures are a sort of nonverbal communication that may be utilized for many diverse purposes, including deaf-mute interaction, robotic manipulation, human-computer interfaces (HCI), residential management, and healthcare usage. Moreover, most current research effectively uses artificial intelligence approaches to extract dense features from hand gestures. Since most of these rely on neural network models, their recognition accuracy depends on tuning the models' hyperparameters. Therefore, our research proposes a capsule neural network (CapsNet), in which the internal computations on the inputs are better encapsulated by transforming the findings into a small vector of information outputs. Moreover, to increase the accuracy of recognizing hand gestures, the network has been optimized by inserting additional SoftMax layers before the output layer of the CapsNet. Subsequently, the findings of the tests were assessed and compared. The developed approach proved beneficial across all tests when contrasted against state-of-the-art systems.
... Despite achievements in resolving multiple obstacles under a variety of conditions, the basic issues remain complicated and difficult [10]. Because of its widespread applications in domains, such as gesture recognition [11], driver tracking [12], human action recognition [13], sports analysis [14], industrial work activity [15], monitoring the condition of industrial machinery [16], visual surveillance [17], and healthcare and rehabilitation [18], visual object tracking (VOT) is an active research issue in computer vision and machine learning. However, tracking is complicated by features such as partial or full occlusion, backdrop clutter, light change, deformation, and other environmental factors [19]. ...
Article
Full-text available
Pedestrian occurrences in images and videos must be accurately recognized in a number of applications that may improve the quality of human life. Radar can be used to identify pedestrians. When distinct portions of an object move in front of a radar, micro-Doppler signals are produced that may be utilized to identify the object. Using a deep-learning network and time–frequency analysis, we offer a method for classifying pedestrians and animals based on their micro-Doppler radar signature features. Based on these signatures, we employed a convolutional neural network (CNN) to recognize pedestrians and animals. The proposed approach was evaluated on the MAFAT Radar Challenge dataset. Encouraging results were obtained, with an AUC (Area Under Curve) value of 0.95 on the public test set and over 0.85 on the final (private) test set. The proposed DNN architecture, in contrast to more common shallow CNN architectures, is one of the first attempts to use such an approach in the domain of radar data. The use of the synthetic radar data, which greatly improved the final result, is the other novel aspect of our work.
... By estimating the pose of the player, the trajectory of the ball [38,39] is estimated from various distances to the basket. By recognizing and classifying the referee's signals [40], player behavior can be assessed and highlights of the game can be extracted [41]. The behavior of a basketball team [42] can be characterized by the dynamics of space creation presented in [43][44][45][46][47][48] that works to counteract space creation dynamics with a defensive play presented in [49]. ...
Article
Full-text available
Recent developments in video analysis of sports and computer vision techniques have achieved significant improvements to enable a variety of critical operations. To provide enhanced information, such as detailed complex analysis in sports such as soccer, basketball, cricket, and badminton, studies have focused mainly on computer vision techniques employed to carry out different tasks. This paper presents a comprehensive review of sports video analysis for various applications: high-level analysis such as detection and classification of players, tracking players or balls in sports and predicting the trajectories of players or balls, recognizing the team’s strategies, and classifying various events in sports. The paper further discusses published works in a variety of application-specific tasks related to sports and the present researchers’ views regarding them. Since there is wide research scope for deploying computer vision techniques in various sports, some of the publicly available datasets related to particular sports are also discussed. This paper provides a detailed discussion of some artificial intelligence (AI) applications, GPU-based workstations, and embedded platforms in sports vision. Finally, this review identifies the research directions, probable challenges, and future trends in the area of visual recognition in sports.