Figure 13 - uploaded by Robertas Damaševičius
Confusion matrices for all (outdoor and indoor) image categories 
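For readers unfamiliar with the figure type, the following is a minimal sketch of how a confusion matrix and the overall accuracy can be computed from true and predicted category labels. The function names, the three-category subset, and the toy labels are illustrative assumptions; they are not taken from the publication or its data.

```python
def confusion_matrix(true_labels, predicted_labels, categories):
    """Return a square matrix: rows are true categories, columns are predictions."""
    index = {name: i for i, name in enumerate(categories)}
    matrix = [[0] * len(categories) for _ in categories]
    for true, predicted in zip(true_labels, predicted_labels):
        matrix[index[true]][index[predicted]] += 1
    return matrix


def overall_accuracy(matrix):
    """Fraction of samples on the diagonal (correctly classified)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


# Hypothetical toy labels, for illustration only.
categories = ["coast", "forest", "kitchen"]
true_labels = ["coast", "coast", "forest", "kitchen", "kitchen", "forest"]
predicted = ["coast", "forest", "forest", "kitchen", "coast", "forest"]

cm = confusion_matrix(true_labels, predicted, categories)
for name, row in zip(categories, cm):
    print(name, row)
print("overall accuracy:", overall_accuracy(cm))
```

Off-diagonal cells show which category pairs are confused with each other, which is how a pair such as industrial/store would stand out in a matrix of this kind.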

Source publication
Conference Paper
Object and scene recognition solutions have a wide application field, from entertainment apps and medical tools to security systems. In this paper, scene recognition methods and applications are analysed, and the Bag of Words (BoW) model, a scene classification model based on local image features, is implemented. In the BoW model every picture is encoded by a...

Similar publications

Article
Feature extraction involves feature detection, description, and matching, and it is the baseline of many computer vision applications such as content-based image retrieval, image classification, image recognition, and object detection. Detected features should be highly repeatable and should yield descriptors that are highly d...
Conference Paper
Analysis of microscopy images is one of the available methods to support the diagnosis of many diseases. To contribute to the analysis of these images, we compare strategies for cell classification using handcrafted feature extraction based on key point information and local descriptors (SIFT, SURF, DAISY, and ORB) represented using a bag-o...
Chapter
The evolving and task-dependent nature of visual categories of problems calls for example-based solutions involving a machine learning approach. One such technique, the 'bag of features' approach, has gained popularity in computer vision applications, including texture recognition, image classification, and robot localization. Despite being quite newe...
Chapter
Remote sensing images, or images collected by unmanned aerial vehicles in hazy weather, are easily degraded by the scattering effect of atmospheric particulate matter. This interference not only seriously degrades image quality but also adversely affects the process of image feature extraction and image...
Article
The buildings in a city are of great importance. Certain historic buildings are landmarks and indicate the city’s architecture and culture. The buildings over time undergo changes because of various factors, such as structural changes, natural disaster damages, and aesthetic interventions. The form of buildings in each period is perceived and under...

Citations

... With advances in sensor and computer technology, human-computer interaction (HCI) systems are becoming more and more popular in our daily life, and it occurs to us that HCI technology can be used to facilitate the interaction between the referees, the players, and the play officials. The research is also important in the context of ambient assisted living environments, as gesture-based interfaces could be used to improve the daily life of hearing-impaired people and to control domestic appliances such as smart TVs by employing the Kinect device, while advances in image segmentation are important for scene and object recognition in assistive devices for visually impaired people (Petraitis et al. 2017; Malukas et al. 2018). ...
Article
Recognition of hand gestures (hand signals) is an active research area in human-computer interaction with many possible applications. Automatic machine-vision-based hand gesture interfaces for real-time applications require fast and extremely robust human, pose and hand detection, and gesture recognition. Attempting to recognize gestures performed by official referees in sports video (such as basketball games) places tough requirements on the image segmentation techniques. Here we propose an image segmentation technique based on histogram of oriented gradients and local binary pattern (LBP) features, which allows recognizing the signals of the basketball referee from recorded game videos and achieves an accuracy of 95.6% using LBP features and a support vector machine for classification. Our results are relevant for real-time analysis of basketball games.
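To make the LBP features mentioned above concrete, here is a minimal sketch of the basic 8-neighbour local binary pattern. The abstract does not state which LBP variant, neighbourhood, or bit ordering the authors used, so the clockwise ordering, the `>=` comparison, the function names, and the 3x3 toy image are all assumptions for illustration.

```python
# Offsets of the 8 neighbours, clockwise starting from the top-left pixel.
# This ordering is an assumption; other orderings give different code values.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(image, row, col):
    """Basic LBP code of one interior pixel: one bit per neighbour >= centre."""
    centre = image[row][col]
    code = 0
    for bit, (d_row, d_col) in enumerate(NEIGHBOURS):
        if image[row + d_row][col + d_col] >= centre:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """256-bin histogram of LBP codes over all interior pixels."""
    histogram = [0] * 256
    for row in range(1, len(image) - 1):
        for col in range(1, len(image[0]) - 1):
            histogram[lbp_code(image, row, col)] += 1
    return histogram


# Hypothetical 3x3 grayscale patch with a single interior pixel.
image = [
    [10, 20, 30],
    [40, 50, 60],
    [70, 80, 90],
]
code = lbp_code(image, 1, 1)
print(code)
```

A histogram like the one returned by `lbp_histogram` is the kind of feature vector that could then be passed to a classifier such as an SVM; the classifier itself is not shown here.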
Chapter
We analyse environment scene classification methods based on the Bag of Words (BoW) model. The BoW model encodes images by a bag of visual features, which is a sparse histogram over a dictionary of visual features extracted from an image. We analyse five feature detectors (Scale-Invariant Feature Transform (SIFT), Speeded-Up Robust Features (SURF), Features from Accelerated Segment Test (FAST), Maximally Stable Extremal Regions (MSER), and grid-based) and three feature descriptors (SIFT, SURF and U-SURF). Our experiments show that two combinations are most effective when classifying images (using a Support Vector Machine (SVM)) into eight outdoor scene categories (coast, forest, highway, inside city, mountain, open country, street, and high buildings): grid-based feature detection with the SIFT descriptor, and SURF feature detection with the U-SURF descriptor. Indoor scene classification into five categories (bedroom, industrial, kitchen, living room, and store) achieved worse results, and the most confused categories were industrial and store images. Classification of the full image dataset (15 outdoor and indoor categories) achieved an overall accuracy of 67.49 ± 1.50%, with most errors coming from misclassifications of indoor images. The results of the study are applicable to assisted living applications and security systems.
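The encoding step described above, a histogram over a dictionary of visual features, can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: the dictionary is taken as given (in practice it is usually learned by clustering descriptors), nearest-word assignment uses squared Euclidean distance, the histogram is normalised to sum to 1, and the two-dimensional toy descriptors stand in for real SIFT or SURF descriptors. All names are hypothetical.

```python
def nearest_word(descriptor, dictionary):
    """Index of the dictionary entry closest to the descriptor (squared Euclidean)."""
    best_index, best_distance = 0, float("inf")
    for i, word in enumerate(dictionary):
        distance = sum((a - b) ** 2 for a, b in zip(descriptor, word))
        if distance < best_distance:
            best_index, best_distance = i, distance
    return best_index


def bow_histogram(descriptors, dictionary):
    """Encode an image as a normalised histogram of visual-word counts."""
    counts = [0] * len(dictionary)
    for descriptor in descriptors:
        counts[nearest_word(descriptor, dictionary)] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * len(dictionary)
    return [count / total for count in counts]


# Hypothetical dictionary of three visual words and four local descriptors.
dictionary = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
descriptors = [[0.1, 0.0], [0.9, 1.2], [1.1, 0.8], [0.2, -0.1]]

hist = bow_histogram(descriptors, dictionary)
print(hist)
```

Words that no descriptor maps to keep a zero count, which is why the histogram tends to be sparse when the dictionary is large. The resulting vector is what a classifier such as an SVM would take as input.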
Article
Recognition of hand gestures, either static or dynamic, for human-computer interaction in real-time systems is an area of active research with many possible applications. However, vision-based hand gesture interfaces for real-time applications require fast and extremely robust hand detection and gesture recognition. Attempting to recognize gestures performed by officials in typical sports video places tremendous computational requirements on the image segmentation techniques. Here we propose an image segmentation technique based on Histogram of Oriented Gradients (HOG) features that allows recognizing the signals of the basketball referee from videos. We achieve an accuracy of 97.5% using a Support Vector Machine (SVM) for classification.