Article

A novel method for automatic face segmentation, facial feature extraction and tracking


Abstract

The present paper describes a novel method for the segmentation of faces, extraction of facial features and tracking of the face contour and features over time. Robust segmentation of faces out of complex scenes is done based on color and shape information. Additionally, face candidates are verified by searching for facial features in the interior of the face. As facial features of interest we employ eyebrows, eyes, nostrils, mouth and chin. We consider incomplete feature constellations as well. Once a face and its features are detected reliably, we track the face contour and the features over time. Face contour tracking is done using deformable models such as snakes. Facial feature tracking is performed by block matching. The success of our approach was verified by evaluating 38 different color image sequences, containing features such as beards, glasses and changing facial expressions.
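The abstract couples snake-based contour tracking with block matching for feature tracking. As a rough sketch of the block-matching idea only (not the authors' implementation; the SSD criterion, block size and search range here are assumptions), a tracked feature can be re-located in the next frame like this:

```python
import numpy as np

def block_match(prev_frame, next_frame, center, block=8, search=6):
    """Track a feature by finding the best-matching block (minimum sum of
    squared differences) in a small search window around its previous position.

    prev_frame, next_frame: 2-D grayscale arrays
    center: (row, col) of the feature in prev_frame
    block: half-size of the template block
    search: half-size of the search range, in pixels
    """
    r, c = center
    template = prev_frame[r - block:r + block, c - block:c + block].astype(float)
    best, best_pos = np.inf, center
    for dr in range(-search, search + 1):
        for dc in range(-search, search + 1):
            rr, cc = r + dr, c + dc
            candidate = next_frame[rr - block:rr + block, cc - block:cc + block].astype(float)
            if candidate.shape != template.shape:
                continue  # search window fell off the image border
            ssd = np.sum((candidate - template) ** 2)
            if ssd < best:
                best, best_pos = ssd, (rr, cc)
    return best_pos
```

In practice the block and search sizes would be tuned to the frame rate and feature size; the paper's exact parameters are not reproduced here.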


... • RG replaces the RGB color space (Storring et al. 2003) • The HS replaces the HSV color space (Baskan et al. 2002; Sandeep and Rajagopalan 2002; Juang and Shiu 2008; Tsekeridou and Pitas 1998; Sobottka and Pitas 1998; McKenna et al. 1998). • The CbCr replaces the YCbCr color space (Habili et al. 2004; Shih et al. 2008; Chai and Ngan 1999; Kumar and Bindu 2006; Ghazali et al. 2012; Yuetao and Nana 2011; Mahmoodi 2017). ...
... For a source image, both filters are applied and the one which gives the better shape of the face is selected. Sobottka and Pitas (1998) also considered that hue and saturation (HS) are sufficient to discriminate color information for the segmentation of skin regions (i.e., without taking the intensity value V into account). Based on extensive experiments, the threshold rules used for skin detection are: 0.23 ≤ S ≤ 0.68 and 0° ≤ H ≤ 50°. ...
... The shaded region defines the skin color cluster. Fig. 7 The graphical representation of classification rules used by Sobottka and Pitas (1998). The shaded region outlines the distribution of skin color in HSV color space. ...
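The quoted threshold rules (0.23 ≤ S ≤ 0.68, 0° ≤ H ≤ 50°) translate directly into a per-pixel classifier. A minimal sketch using Python's standard colorsys module, assuming RGB values normalized to [0, 1]:

```python
import colorsys

def is_skin(r, g, b):
    """Classify one pixel as skin using the HS thresholds attributed to
    Sobottka and Pitas (1998): 0° <= H <= 50°, 0.23 <= S <= 0.68.
    The value component V is deliberately ignored.
    r, g, b: floats in [0, 1].
    """
    h, s, _v = colorsys.rgb_to_hsv(r, g, b)
    hue_deg = h * 360.0  # colorsys returns hue in [0, 1)
    return 0.0 <= hue_deg <= 50.0 and 0.23 <= s <= 0.68
```

A pinkish pixel such as (0.9, 0.6, 0.5) falls inside the region (H ≈ 15°, S ≈ 0.44), while saturated blues and gray pixels fall outside it.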
Article
Full-text available
Color is an efficient feature for object detection as it has the advantage of being invariant to changes in scaling, rotation, and partial occlusion. Skin color detection is an essential required step in various applications related to computer vision. The rapidly-growing research in human skin detection is based on the premise that information about individuals, intent, mode, and image contents can be extracted from colored images, and computers can then respond in an appropriate manner. Detecting human skin in complex images has proven to be a challenging problem because skin color can vary dramatically in its appearance due to many factors such as illumination, race, aging, imaging conditions, and complex background. However, many methods have been developed to deal with skin detection problem in color images. The purpose of this study is to provide an up-to-date survey on skin color modeling and detection methods. We also discuss relevant issues such as color spaces, cost and risks, databases, testing, and benchmarking. After investigating these methods and identifying their strengths and limitations, we conclude with several implications for future direction.
... For each connected component, the algorithm adjusts an ellipse in order to determine the candidate area which corresponds to the face. Finally, a more detailed analysis of features within this region leads to the conclusion about the presence of a face or not [17]. ...
... A set of eleven lower-order geometrical moments is calculated using the Fourier transform and the radial Mellin transform to characterize the shape of "clusters" in the binarized image. To detect the face region, a neural network is trained using the extracted geometrical moments [17]. ...
Article
Full-text available
Face recognition is a field of great interest in research for several applications such as biometric identification, surveillance, and human-machine interaction. This paper presents a face recognition system. The system exploits a text document image embedding a color human face image. Initially, in its extraction phase, the system exploits the horizontal and vertical histograms of the document and detects the image which contains the human face. The second task of the system consists of detecting the included face in order to determine, with the help of invariant moments, the characteristics of the face. The third and last task of the system is to determine, via the same invariant moments, the characteristics of each face stored in a database in order to compare them, by means of a classification tool (neural networks and K nearest neighbors), with the one determined in the second task, for the purpose of identifying in that database the face most similar to the one detected in the input image.
... Rule-based methods use color as the only attribute and range from simple one-component thresholding to more or less complex rules. Simple thresholding in different spaces is used in [8] and [19] for the RGB space, [1] and [12] for the HSV space, and [2] and [4] for the YCbCr space. In [3] a constructive induction algorithm is used to construct three-component decision rules from normalized RGB components through simple arithmetic operations, and in [6] heuristic rules in the RGB color space are used to detect skin regions. ...
... Figure 5 also shows that the results obtained by model M2 were better than those obtained by the method proposed in [19], which validates our choice of the color space and the threshold model. Afterwards, our method was compared to other explicit methods; we selected the most widely used in the literature: the methods proposed by Sobottka and Pitas [1], Chai and Ngan [2], Gomez and Morales [3], Hsu et al. [4], Kovac et al. [6], Brancati et al. [18] and the RGB threshold used in Kolkur et al. [19]. This choice was made because we wanted to test our method against methods of the same class, i.e. rule-based methods that use color as a single attribute. ...
Chapter
Skin detection is a very important task in computer vision, since we can find it in many applications such as face detection and recognition, face tracking, gesture analysis, content-based image retrieval systems and human-machine interaction systems. In this paper we present a novel rule-based skin detection method in the Cyan Magenta Yellow Key (CMYK) color space. This space is a subtractive color space used in color printing, and it is poorly explored in image processing and still less in skin detection tasks. Our method uses thresholds based on the relation between CMYK color components in order to recognize skin pixels; two thresholding models were proposed and we have retained the best-performing one. The proposed method has been tested on two public skin image databases and has achieved very satisfactory qualitative and quantitative results against other widely used rule-based methods.
... [Flattened table excerpt: a quantitative comparison of skin detection methods reporting, among other metrics, precision, recall and F-score. Methods compared include Bayes (Jones and Rehg, 2002), FPSS (Kawulok et al., 2013), DSPF, SASS, MMSC (Hettiarachchi & Peters, 2016), YCbCr and CbCr (Hsu, Abdel-Mottaleb & Jain, 2002), Skin Color-map (Chai & Ngan, 1999), YCbCr (Basilio, Torres, Pérez, Medina & Meana, 2011), Color Clustering (Kovac, Peer & Solina, 2003), HSV (Sobottka & Pitas, 1998) with precision 83.53, recall 59.69 and F-score 69.63, Dynamic Color Clustering (Brancati, De Pietro, Frucci & Gallo, 2017), WSPM + Bayes and WSPM + DTCD (Chakraborty et al., 2016), RBCNN (Roy et al., 2017) and MCST (Rahmat et al., 2016). Results for Kawulok et al. (2014) and MMSC are taken from the research on MMSC (Hettiarachchi & Peters, 2016); results for YCbCr (Hsu et al., 2002), CbCr (Hsu et al., 2002), Skin Color-map (Chai & Ngan, 1999), YCbCr (Basilio et al., 2011), Color Clustering (Kovac et al., 2003), HSV (Sobottka & Pitas, 1998) and Dynamic Color Clustering (Brancati et al., 2017) are taken from the research on Dynamic Color Clustering (Brancati et al., 2017).] (Xu, Li, Xue, & Lu, 2005). ...
Article
Skin segmentation is one of the most important tasks for human activity recognition, video monitoring, face detection, hand gesture recognition, content-based detection, adult content filtering, human tracking, and robotic surgeries. Skin segmentation in ideal situations is easy to accomplish because the backgrounds are simple and uniform. However, skin segmentation in non-ideal situations is complicated due to difficult background illumination, the presence of skin-like pixels, and environmental changes. Current studies handle these challenges by adding preprocessing stages to their methods, which increases the overall cost of the system. In addition, prevailing segmentation studies have ignored black skin and mainly focused on white skin in their experiments. To deal with skin segmentation in challenging environments irrespective of skin color, and to eliminate the cost of preprocessing, this paper proposes an outer residual skip connection-based deep convolutional neural network (OR-Skip-Net) which innovatively empowers the features by transferring direct edge information from the initial layer to the end of the network. Experiments were performed on the following eight open datasets for skin segmentation in different environments: a hand gesture recognition dataset, an event detection dataset, the laboratoire d'informatique en image et systèmes d'information dataset, an in-house dataset, the UT-interaction dataset, the augmented multi-party interaction dataset, the Pratheepan dataset, and a black skin people dataset. In addition, two further experiments were performed to explore the possibility of applying our method to different applications: gland segmentation from colon cancer histology images for the diagnosis of colorectal cancer using the Warwick-QU dataset, and iris segmentation using the Noisy Iris Challenge Evaluation - Part II dataset. Experimental results showed that the proposed OR-Skip-Net outperformed existing methods in terms of skin, gland, and iris segmentation accuracies.
... The detection rate is defined as the ratio between the number of faces correctly detected by the system and the actual number of faces in the image. There are many existing techniques to detect faces, some for single images and others for image sequences (multiple images) [7][8][9]. Let us give a practical example to highlight the necessity of detecting facial features. ...
... Researchers have used hardware-oriented models for the separation of facial features [13]. Many research studies [6][7][8] use skin-tone colour since it is independent of the luminance component. Normally, the skin colours of different people vary over a wide range. ...
Article
Full-text available
Presently, detection of facial features is becoming very effective in face recognition and behavioral classification systems. There are several techniques, such as skin color-based segmentation, principal component analysis and template matching, which are used to detect various facial features. In this paper a method is presented for detecting various features of the human face.
... It is not a simple task to build a model of human skin color that works in all possible lighting conditions; therefore, a good model of skin color must provide some robustness to changes in lighting conditions. There are several classification algorithms, including multilayer perceptrons [19], self-organizing maps, linear decision boundaries [22], and probabilistic density estimation [22]. The choice of color space is also varied: RGB [12], YCbCr [12], HSV [12], CIE Luv [12], Farnsworth UCS [19], and normalized RGB [12]. ...
Conference Paper
The remote control, the standard interaction device between users and TV sets, has become a complex tool to use, as every innovation built into the TV set yields a growing number of options that can be selected by pressing one button or a button combination. In the new interactive Digital TV (IDTV), the remote control remains the interaction device, making TV manipulation rather difficult, especially for those with a low level of digital knowledge. In such a scenario, iGesture emerges as a useful and more natural way to provide user interaction with IDTV, reducing the users’ learning curve and increasing usability. This paper describes a mode of interaction between the user and IDTV through recognition of hand gestures using a Hidden Markov Model (HMM), implemented using the Adaboost tool, based on skin tone recognition and motion detection. The hand gestures used in the recognition algorithm were defined through experiments with a user group and the application of usability engineering rules. The approach described in this paper addresses challenges found in the areas of usability engineering, IDTV and computer vision. The final algorithm proposed here presented a hit rate for recognizing user hand gestures of 85%, higher than its competing algorithms.
... Facial elements or features are extracted to form a feature vector that represents the geometry of the face. Appearance features (skin texture) capture facial deformations, such as wrinkles, which can be extracted from face images over the whole face or in a specific part [40,92,103]. The algorithms of this approach are very diverse, including PCA [108], Gabor filters [68], local binary patterns (LBP) [97], and others. ...
Article
Full-text available
The recognition of facial expressions in images concerns one of the observable forms of emotional state, the face being one of the most frequent non-verbal channels through which a person conveys inner emotions. The recognition of facial expressions is used in a wide range of fields, including psychological and legal studies, animation, robotics, lip-reading, image and video conferencing, communications, telecommunications, and security protection for counterterrorism, to identify individuals as well as in human-machine interaction. The general solution to this problem includes three general steps: image preprocessing, feature extraction, and expression classification algorithms. A series of preprocessing steps must be performed to process the face area and then detect the expression; that is, a square around the face must be localized while the rest of the image is removed. Then, feature extraction is used to classify each facial expression into a specific category. We divide our data, including images of different expressions, into two parts: training and testing. Different categories are learned to specify different features that are tested thereafter. In recent years, a number of studies have been performed on facial expression analysis. Even though much progress has been made in this field, recognition of facial expressions with a high accuracy rate remains difficult to achieve due to their complexity and variability. In this research article, we noticed that most researchers use the JAFFE and CK+ databases due to their diversity and high accuracy. We also noticed that most researchers use PSO, PCA, and LBP features as well as HOG, which presented high accuracy, and that SVM and CNN classification algorithms have been used mostly due to their high accuracy and response latency with few errors.
... The threshold model is a simple and fast rule-based method for filtering skin pixels. Sobottka et al. [16] set a pair of thresholds in the H and S components of the HSV color space to distinguish skin pixels from non-skin pixels. Chai et al. [17] establish a pair of fixed thresholds in the YCbCr color space to detect skin regions. ...
Article
Full-text available
Skin segmentation plays an important role in image processing and human–computer interaction tasks. However, it is challenging to accurately detect skin regions in various scenes with different illumination or color styles. In addition, in the field of video processing, reducing the computational load and improving the real-time performance of the algorithm has also become an important topic in skin segmentation. Existing deep semantic segmentation networks usually pay too much attention to the detection performance of the model and tend toward complex model structures, which brings a heavy computational burden. To achieve a trade-off between the detection performance and the real-time performance of the skin segmentation algorithm, this paper proposes a lightweight skin segmentation network. Compared with existing semantic segmentation networks, this model adopts a simpler structure to improve real-time performance. In addition, to improve the feature fitting ability of the network without slowing down its inference speed, this paper proposes a color attention mechanism, which locates skin regions in images based on the distribution features of skin colors on the E-R/G color plane generated from the YES color space, and guides the network to update its parameters. Experimental results show that this method not only exhibits detection performance similar to existing semantic segmentation networks such as U-Net and DeepLab, but also reduces the computational load of the model by 18.1% compared to Fast-SCNN.
... So, researchers have adopted various other types of color cues. One of the earliest methods of skin detection was proposed by Sobottka and Pitas [16]. They proposed a skin detection boundary along the S and H channels of the HSV color space. ...
Article
Full-text available
Vision-based hand gesture recognition involves a visual analysis of handshape, position and/or movement. Most of the previous approaches require complex gesture representation as well as the selection of robust features for proper gesture recognition. To eliminate the problem of illumination variation and occlusion in gesture videos, a simple model-based framework has been presented here using a deep network for hand gesture recognition. The model is fed with ‘hand-trajectory-based-contour-images’. These images represent the motion trajectory of the hand for isolated trajectory gestures obtained via pre-processing steps—a two-level segmentation process and a double-tracking system. Deep features extracted from these images are used for estimating the hand gestures. Conventional machine learning methods involve tedious feature engineering schemes, while deep learning approaches can learn image features hierarchically from local to global with multiple layers of abstraction from a vast number of raw sample images. The feature learning capability of CNN architecture has been used here and it has shown outstanding results on three different datasets.
... Garcia and Tziritas [13] proposed a set of explicit thresholds found suitable for skin color under different illuminations. Sobottka and Pitas [14] used a pair of fixed thresholds in hue (H) and saturation (S) to discriminate skin color in the HSV color space. Gupta and Chaudhary [15] reported that combining the three color spaces YCbCr, HSV, and RGB can improve the classification rate under various illuminations. ...
Article
Full-text available
Skin color plays an important role in color image processing and human–computer interaction. However, factors such as rapidly changing illumination, various color styles, and camera characteristics make skin detection a challenging task. In particular, the real-time requirement of practical applications is challenging for skin detection. In this paper, face detection and alignment are applied to select facial reference points for modeling the skin color distribution. Moreover, we propose the concept of a skin color model updating unit (SCMUU) and a detection approach for it, based on the fact that the skin color distribution remains consistent over a range of frames. The redundant operation of frame-by-frame updating is avoided by using one model across the frames of an SCMUU. When no reliable faces are detected, two strategies are introduced to compensate and reduce the computational cost: the method uses the corresponding model parameters if a similar previous SCMUU is found; otherwise, fixed thresholds are used instead and the interval between two consecutive face detections is increased. In addition, the time-consuming steps are accelerated using a graphics processing unit (GPU) with CUDA. Experimental results show that, compared with other existing methods, the proposed method achieves good real-time performance and accuracy for skin detection in videos of various resolutions under different illumination conditions.
... Human intervention, for example tuning the parameters of the extraction algorithm, is inevitable when processing new unseen samples. In the latter study [7], ROIs are proposed using a human-designed expert system guided by a sliding window. ...
Article
Full-text available
Osteoarthritis (OA) is the most common form of arthritis. According to the evidence presented on both sides of the knee bones, radiologists assess the severity of OA based on the Kellgren-Lawrence (KL) grading system. Recently, computer-aided methods are proposed to improve the efficiency of OA diagnosis. However, the human interventions required by previous semiautomatic segmentation methods limit the application on large-scale datasets. Moreover, well-known CNN architectures applied to the OA severity assessment do not explore the relations between different local regions. In this work, by integrating the object detection model, YOLO, with the visual transformer into the diagnosis procedure, we reduce human intervention and provide an end-to-end approach to automatic osteoarthritis diagnosis. Our approach correctly segments 95.57% of data at the expense of training on 200 annotated images on a large dataset that contains more than 4500 samples. Furthermore, our classification result improves the accuracy by 2.5% compared to the traditional CNN architectures.
... A large portion of the segmentation strategies can be broadly classified as follows (Fig. 13): (a) skin color-based segmentation, (b) region based, (c) edge based, (d) Otsu thresholding and so on. The simplest method to recognize skin regions in an image is through an explicit boundary specification for skin tone in a particular color space, e.g., RGB [69], HSV [205], YCbCr [28] or CMYK [193]. Numerous researchers drop the luminance component and use just the chrominance components, since the chrominance signals contain the skin color information. ...
Article
Full-text available
Hand gesture recognition is viewed as a significant field of research in computer vision with diverse applications in the human–computer interaction (HCI) community. Significant uses of gesture recognition cover areas like sign language, medical assistance, virtual and augmented reality, and so on. The initial task of a hand gesture-based HCI framework is to acquire raw data, which can be accomplished fundamentally by two methodologies: sensor based and vision based. The sensor-based methodology requires instruments or sensors to be physically attached to the arm/hand of the user to extract information, while vision-based schemes require the acquisition of images or videos of the hand gestures through a still/video camera. Here, we essentially discuss vision-based hand gesture recognition, with a brief introduction to sensor-based data acquisition strategies. This paper surveys the primary methodologies in vision-based hand gesture recognition for HCI. Major topics include different types of gestures, gesture acquisition systems, major problems of gesture recognition systems, and steps in gesture recognition such as acquisition, detection and pre-processing, representation and feature extraction, and recognition. We provide an elaborated list of databases, and also discuss recent advances and applications of hand gesture-based systems. A detailed discussion is provided on feature extraction and the major classifiers in current use, including deep learning techniques. Special attention is given to classifying the schemes/approaches at the various stages of the gesture recognition system for a better understanding of the topic, to facilitate further research in this area.
... This motivates exploiting skin color to determine the correct face target. To extract more appropriate skin color, the HSV (Hue, Saturation, and Value) color space [30] is utilized in this work. Based on the skin color model, the conditions for detecting possible face regions are given in Eqs. ...
Article
Full-text available
This paper is dedicated to developing a high-efficiency face detection and tracking method for big dynamic crowds or numerous pedestrians. Three modules constitute the proposed method: face candidate generation, face candidate verification, and face target tracking. In this work, face candidates are localized using the features of the face area, edge information, and skin color. Non-face parts in the face candidates are further verified by the C-SVM learning model and then removed, by which the face targets can be generated with lower computational complexity and more satisfactory accuracy than other approaches. Finally, the face targets are tracked by an efficient and reliable searching scheme for improving the effective face detection rate. Experimental results show that an average face detection rate (FDR) of 85%, an average effective FDR of 95%, a frame rate of 28–66 frames per second (fps), and about 30 faces detected per frame are obtained from various test videos with big dynamic crowds or numerous pedestrians, indicating the feasibility of the proposed method for unconstrained face detection with high efficiency and cost-effectiveness. This result makes the proposed method more attractive for video surveillance systems as compared to other approaches, especially high-computational-complexity methods.
... If color-based recognition is also used, an additional condition can be added so that only regions whose color is close to the hue of the microorganisms are examined as potential microorganisms. Checking the relationships among the detected features of microorganisms can be based on: a certain empirical algorithm [1], statistics of the mutual arrangement of features collected from images of microorganisms [2], simulation of the processes occurring in the human brain during the recognition of visual stimuli, the application of rigid or deformable templates, and so forth. ...
Article
Full-text available
During long-duration space flights and interplanetary missions, among numerous outboard risks, the crew faces onboard microbiological intruders as well. There is no way to send biological samples to Earth for analysis in such missions, so a special onboard system of methods and activities must solve two different problems: sampling different types of bacteria and fungi, and performing independent analysis of the collected material without a professional microbiologist among the crew and without any help from Earth. We thus face an interesting task: to create a system that minimizes the human factor and relies mostly on computing machinery. In order to use a pattern recognition method, we need to perform proper sampling and prepare the samples for machine analysis. Stereoscopy and spectrometry are the only way to achieve our goal. Apart from sampling, it is necessary to develop a modified mathematical model for pattern recognition of the bacteria and fungi found during the flight. For that reason we are building a mathematical model describing the microbiological samples. We still have a lot of work to do, but the results of our research could come into common use not only in the space sector, but also in clinical medicine.
... The binary function I(x, y) = 1 if and only if the pixel at location (x, y) belongs to the skin tone, and I(x, y) = 0 otherwise. For simplicity, we employ the HSV threshold values defined by Sobottka et al. [33] for Asian and Caucasian skin (0 < H < 50, 0.23 < S < 0.68, 0 < V < 1). ...
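The binary map I described in this excerpt can be computed in one vectorized pass over a frame, assuming the frame has already been converted to HSV with H in degrees and S, V in [0, 1] (the array layout here is an assumption):

```python
import numpy as np

def skin_mask(hsv):
    """Binary map I: I[x, y] = 1 iff the pixel is skin under the thresholds
    0 < H < 50 (degrees), 0.23 < S < 0.68, 0 < V < 1.

    hsv: array of shape (rows, cols, 3) holding (H, S, V) per pixel.
    """
    h, s, v = hsv[..., 0], hsv[..., 1], hsv[..., 2]
    return ((0 < h) & (h < 50) &
            (0.23 < s) & (s < 0.68) &
            (0 < v) & (v < 1)).astype(np.uint8)
```

Libraries such as OpenCV use different HSV scalings (e.g. H in [0, 180) for 8-bit images), so the channels would need rescaling to degrees and unit range before applying these thresholds.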
Conference Paper
Throughout the past decade, numerous interaction techniques have been designed for mobile and wearable devices. Among these devices, smartglasses mostly rely on hardware interfaces such as touch-pads and buttons, which are often cumbersome and counter-intuitive to use. Furthermore, smartglasses feature cheap and low-power hardware, preventing the use of advanced pointing techniques. To overcome these issues, we introduce UbiPoint, a freehand mid-air interaction technique. UbiPoint uses the monocular camera embedded in smartglasses to detect the user's hand without relying on gloves, markers, or sensors, enabling intuitive and non-intrusive interaction. We introduce a computationally fast and lightweight algorithm for fingertip detection, which is especially suited to the limited hardware specifications and short battery life of smartglasses. UbiPoint processes pictures at a rate of 20 frames per second with high detection accuracy (no more than 6 pixels of deviation). Our evaluation shows that UbiPoint, as a mid-air non-intrusive interface, delivers a better experience for users interacting with smartglasses, with users completing typical tasks 1.82 times faster than when using the original hardware.
... Recent studies [10,19,33] show that skin segmentation methods can be broadly classified into three major classes: Explicit Boundary Specification, Parametric Modelling-based approach, and Non-Parametric Modelling-based approach. Explicit boundary specification-based methods explicitly define the ranges of pixel components in different colour spaces e.g., RGB [13], HSV [43], YCbCr [2] or CMYK [41] for skin colour. However, skin colour varies significantly among different human races. ...
Article
Full-text available
In recent times, the majority of colour-based skin detection methods used skin modelling in different colour spaces, and they are capable of skin classification at a pixel level. However, the accuracy of these methods is significantly affected by different issues, such as the presence of skin-like colours in scene background, variations in skin pigmentation, scene illumination, etc. Recent developments show that the discriminating power of a colour-based skin classifier can be increased by employing texture and spatial features. However, we observed that discriminability between skin and non-skin regions does not follow any statistics, and the discrimination is extremely image specific. In this paper, a novel adaptive discriminative analysis (ADA) is proposed to extract most discriminant features between skin and non-skin regions from an image itself in an unsupervised manner. Experimental results for standard databases show that the proposed method can efficiently segment out skin pixels in the presence of skin-like background colours.
... These domains affect many aspects of human life such as human-computer interaction for games and disabled users, immigration border control, airport/seaport security, video compression, real-time face recognition, and gaze tracking. The most used tracking features are: skin color [1], wavelet templates [2], eyes [3], motion [4], Active Appearance Models (AAM) [5], facial shape [6], or a combination of them [7,8]. ...
Conference Paper
Face tracking continuously estimates the location and extent of a face in an image sequence in real time. This paper provides two improvements to the performance of the well-known skin-color-based face tracker CAMSHIFT, in the detection and tracking parts of its operation. CAMSHIFT utilizes the mean-shift algorithm to climb the skin color density, finding its mode and parameters, which respectively represent the face center and dimensions. In the proposed tracker, the first improvement to CAMSHIFT replaces the hue color in the skin detection part with the grey-world color space, which is invariant to illuminant color changes, producing an improved tracker we call the "Grey World color based CAMSHIFT tracker". The second improvement builds on the Grey World color based CAMSHIFT by utilizing the motion vectors stored within the video file as a tracking guide, since they are part of the compressed video file and store, for each block of a frame, the best-matching block in the next frame. This second improvement produces the final proposed tracker, which we call the "Motion Vectors based CAMSHIFT tracker". The performance of the three trackers (original CAMSHIFT, Grey World color based CAMSHIFT and Motion Vectors based CAMSHIFT) is tested using five video sequences representing five different viewing conditions. The testing results show that the Motion Vectors based CAMSHIFT gives the best precision, accuracy and frames per second in tracking.
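The mean-shift core that CAMSHIFT builds on can be sketched without OpenCV. A toy NumPy version, assuming a precomputed skin-probability (back-projection) map; the function name and window convention are illustrative:

```python
import numpy as np

def mean_shift(prob, window, iters=10):
    """One CAMSHIFT building block: shift a search window to the
    centroid (mode) of a back-projected skin-probability map.

    `prob` is a 2-D array of skin likelihoods; `window` is
    (row, col, height, width). Returns the converged window origin.
    """
    r, c, h, w = window
    rows, cols = prob.shape
    for _ in range(iters):
        patch = prob[r:r + h, c:c + w]
        total = patch.sum()
        if total == 0:
            break
        ys, xs = np.mgrid[0:patch.shape[0], 0:patch.shape[1]]
        # Centroid of the probability mass inside the window
        # (zeroth and first moments), relative to the window centre.
        dy = (ys * patch).sum() / total - (h - 1) / 2.0
        dx = (xs * patch).sum() / total - (w - 1) / 2.0
        nr = int(round(np.clip(r + dy, 0, rows - h)))
        nc = int(round(np.clip(c + dx, 0, cols - w)))
        if (nr, nc) == (r, c):
            break
        r, c = nr, nc
    return r, c

# A blob of skin probability centred at (12, 15): the window slides
# from its initial position until it is centred on the blob.
prob = np.zeros((30, 30))
prob[10:15, 13:18] = 1.0
print(mean_shift(prob, (8, 10, 7, 7)))  # (9, 12)
```

CAMSHIFT additionally adapts the window size from the moments each frame; this sketch keeps the window fixed to isolate the mode-seeking step.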
... They segmented face-like structures based on skin colour: the input image is first enhanced using the colour information, and then textural features are derived using SGLD matrices [8]. ...
... We collected around 150 images per class and trained a Convolutional Neural Network (CNN) to extract a feature vector of length 4096 for four classes (face, palm, A, v) [13][14]. We used the pre-trained AlexNet network with 25 layers and replaced the last layer with a feature extraction layer [15][16]. ...
Conference Paper
A real-time sign language translator is an important milestone in facilitating communication between the deaf community and the general public. We hereby present the development and implementation of an American Sign Language (ASL) fingerspelling translator based on skin segmentation and machine learning algorithms. We present an automatic human skin segmentation algorithm based on color information. The YCbCr color space is employed because it is typically used in video coding and provides an effective use of chrominance information for modeling human skin color. We model the skin-color distribution as a bivariate normal distribution in the CbCr plane. The performance of the algorithm is illustrated by simulations carried out on images depicting people of different ethnicities. A Convolutional Neural Network (CNN) is then used to extract features from the images, and a deep learning method is used to train a classifier to recognize sign language.
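The bivariate normal skin model in the CbCr plane described above can be sketched as follows; the training data and function names are hypothetical, not taken from the paper:

```python
import numpy as np

def fit_skin_gaussian(cbcr_samples):
    """Fit the bivariate normal N(mu, Sigma) used to model skin
    colour in the CbCr plane from an (N, 2) array of skin pixels."""
    mu = cbcr_samples.mean(axis=0)
    sigma = np.cov(cbcr_samples, rowvar=False)
    return mu, sigma

def skin_likelihood(cbcr, mu, sigma):
    """Gaussian density of one (Cb, Cr) pixel under the fitted model."""
    d = cbcr - mu
    inv = np.linalg.inv(sigma)
    norm = 1.0 / (2 * np.pi * np.sqrt(np.linalg.det(sigma)))
    return norm * np.exp(-0.5 * d @ inv @ d)

# Toy training set clustered around (Cb, Cr) = (110, 150),
# a plausible skin region in YCbCr.
rng = np.random.default_rng(0)
train = rng.normal([110, 150], [5, 5], size=(500, 2))
mu, sigma = fit_skin_gaussian(train)
p_skin = skin_likelihood(np.array([110.0, 150.0]), mu, sigma)
p_far = skin_likelihood(np.array([60.0, 90.0]), mu, sigma)
print(p_skin > p_far)  # the skin-coloured pixel scores higher
```

Thresholding the likelihood (or the Mahalanobis distance) then yields the binary skin map used before classification.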
... A skin classifier defines a decision boundary of the skin colour in a colour space based on the training set of skin pixels. Sobottka and Pitas used fixed range values on the HS colour space, in the range 0 ≤ H ≤ 50 and 0.23 ≤ S ≤ 0.68 [23]. Wang … Previously, researchers moved to a dynamic or adaptive approach based on face or hand adaptation [25][26][27][28][29][30][31][32]. ...
... If the color of a pixel falls into the "skin" region, it is classified as "skin" and vice versa. Some studies that applied the thresholding method to skin detection can be found in [14,17–20]. In short, the thresholding method has the advantage of being a very basic and understandable method; however, it is mainly based on subjective experience and performs poorly when the thresholds are incorrectly tuned [1,21]. ...
Article
Full-text available
Skin detection is an interesting problem in image processing and an important preprocessing step for further techniques like face detection, objectionable-image detection, etc. However, its performance has not been high because of the large overlap between “skin” and “non-skin” pixels. This paper proposes a new approach to improve skin detection performance using a Bayesian classifier and a connected-component algorithm. Specifically, the Bayesian classifier is used to identify “true skin” pixels using a first posterior-probability threshold, which is close to 1, and to identify “skin candidate” pixels using a second posterior-probability threshold. Subsequently, the connected-component algorithm finds all the connected components containing “skin candidate” pixels. Since a skin pixel often connects with other skin pixels in an image, all pixels in a connected component are classified as “skin” if there is at least one “true skin” pixel in that component. This means that “non-skin” pixels whose color is similar to skin are classified as “non-skin” when their posterior probabilities are lower than the first threshold and they do not connect with any “true skin” pixel. This idea helps improve skin classification performance, especially the false-positive rate.
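The two-threshold idea, keeping "skin candidate" components only when they are anchored by a "true skin" pixel, can be sketched with a simple flood fill; the threshold values and names here are illustrative, not the paper's exact settings:

```python
import numpy as np
from collections import deque

def refine_skin(post, t_true=0.95, t_cand=0.5):
    """Two-threshold Bayesian refinement: 'candidate' pixels
    (posterior >= t_cand) are kept as skin only if their connected
    component contains at least one 'true skin' pixel (>= t_true)."""
    cand = post >= t_cand
    true_skin = post >= t_true
    out = np.zeros_like(cand, dtype=bool)
    seen = np.zeros_like(cand, dtype=bool)
    h, w = post.shape
    for sy, sx in zip(*np.nonzero(cand)):
        if seen[sy, sx]:
            continue
        # Flood-fill one 4-connected component of candidate pixels.
        comp, anchored = [], False
        q = deque([(sy, sx)])
        seen[sy, sx] = True
        while q:
            y, x = q.popleft()
            comp.append((y, x))
            anchored |= bool(true_skin[y, x])
            for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                if 0 <= ny < h and 0 <= nx < w and cand[ny, nx] and not seen[ny, nx]:
                    seen[ny, nx] = True
                    q.append((ny, nx))
        if anchored:
            for y, x in comp:
                out[y, x] = True
    return out

# The left blob holds a near-certain skin pixel and survives; the
# right blob is skin-like but unanchored, so it is rejected.
post = np.array([[0.6, 0.6, 0.0, 0.7],
                 [0.6, 0.99, 0.0, 0.7]])
print(refine_skin(post).astype(int))
```

The flood fill stands in for any connected-component labelling routine; the anchoring rule is the part specific to the approach.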
... The kernel-based approach is based on the face appearance representation, namely templates (Romero and Bobick, 2004) and density-appearance models (Kodirov et al., 2013). As for silhouette-based tracker, it models the face using its shape (Hou et al., 2007) or contour (Sobottka and Pitas, 1998). ...
... Human face recognition has evolved into a potential branch of machine vision. The applications of computer vision have received remarkable attention in face tracking [1][2], crowd analysis [3], steganography [4], facial expression [5], driver fatigue detection [6] and many more. Automated face recognition is a natural and accessible passive biometric practice in the field of machine vision. ...
Article
It is evident that research contributions in the domain of partially occluded images are quite sparse. This paper presents a novel method, termed Partially Occluded Face Recognition (POFR), using Maximally Stable Extremal Regions (MSER) feature sets and Dynamic Time Warping (DTW). The proposed system works in two phases: Phase I creates an annotated database using the non-occluded images, and Phase II focuses on the detection and recognition of a partially occluded probe image, which is also annotated using the mechanism of Phase I. Hence, POFR selectively and dynamically calibrates the annotated database as per the annotation of the probe image. Further, the similarity between the feature sets of the annotated database images and the probe image is computed using the principle of DTW. POFR is tested on face images from the University of Stirling dataset, and the average accuracy of face recognition is recorded as 88%. The method promises a computational advantage for partially occluded face recognition without any prior reconstruction or synthesis. POFR finds direct applications in surveillance and security systems.
... In digital images, human skin is characterized by a consistent distribution of intensity. Human skin segmentation is used as a preprocessing step in manual/machine vision techniques ranging from face detection [1][2][3][4], face recognition [5,6], face tracking, object tracking and crowd analysis [7], steganography [8], adult image filtering [9,10], body pose estimation [11,12], gesture recognition [13,14] and human-computer interaction [15], to driver fatigue detection [16,17]. Although human skin detection is an effortless task for human vision, it is a challenging and complicated task for machine vision. ...
... The HSV color space is a perceptual color space in which H stands for hue, S for saturation and V for value. Hue varies the color from red to green, saturation varies it from red to pink, and value determines the change from black to white, i.e. the image brightness [1,14]. The following rule was proposed in [15] for human skin color using HSV: 0.23 < S < 0.68 and 0 < H < 50. ...
... [29] Many algorithms have been proposed for face detection in still images, based on texture, depth, shape and color information. For example, face localization can be based on the observation that human faces are characterized by their oval shape and skin color [30]. S. V. Cerez proposed a system that detects a human face in an image and extracts or bounds facial features using a geometrical face model [31]. Burl and others use multi-oriented, multi-scale Gaussian derivative filters to model the texture around key points on the face. ...
Conference Paper
Skin detection plays a vital role in various human-related computer vision applications, including human-computer interaction, medical diagnostic tools, and web content filtering. However, accurate skin detection remains challenging due to factors such as luminosity variations, complex backgrounds, and diversity in skin tones. In this paper we present a rule-based skin detection method that applies dimensionality reduction using Principal Component Analysis (PCA) to pixels represented by multiple color channels. This process retains only the most pertinent information in the form of principal components. Subsequently, skin detection is achieved according to the individual contribution of the pixels along these principal components. To evaluate the effectiveness of our approach, we conducted comprehensive experiments on the SFA dataset. Our method demonstrated consistently superior skin detection performance compared to other rule-based methods, in both quantitative and qualitative aspects, across diverse scenarios.
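As a rough illustration of PCA over pixel colour channels (not the authors' exact pipeline), one can compute the principal axes of a skin training set and score pixels by their contribution along them; all data and names below are synthetic:

```python
import numpy as np

def pca_axes(pixels):
    """Principal components of an (N, C) pixel matrix (C colour
    channels), via eigendecomposition of the covariance matrix."""
    mu = pixels.mean(axis=0)
    cov = np.cov(pixels - mu, rowvar=False)
    vals, vecs = np.linalg.eigh(cov)
    order = np.argsort(vals)[::-1]          # strongest component first
    return mu, vecs[:, order], vals[order]

def pc_scores(pixels, mu, vecs, k=2):
    """Contribution of each pixel along the first k principal axes."""
    return (pixels - mu) @ vecs[:, :k]

# Synthetic skin cluster in RGB: variance lies mostly along one axis,
# so the first principal component should capture nearly all of it.
rng = np.random.default_rng(1)
axis = np.array([0.7, 0.5, 0.4])
train = 150 + rng.normal(size=(1000, 1)) * 20 * axis + rng.normal(size=(1000, 3))
mu, vecs, vals = pca_axes(train)
scores = pc_scores(train, mu, vecs, k=1)
print(vals[0] / vals.sum() > 0.9)  # True: first component dominates
```

A rule-based detector would then threshold the scores along the retained components instead of raw channel values.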
Article
Full-text available
Human-machine interface (HMI) is a crucial area of research, as gestures have the potential to efficiently control and interact with computers. Many applications for hand detection have been created as a result of the pervasive use of built-in cameras in computers, smartphones, and tablets. For the majority of users, however, many of these are not useful. A straightforward concept for a keyboard- and mouse-free music controller is presented in this research. A music player controller is developed from the real-time frame tracking of a camera, using MATLAB code that integrates skin detection, area labelling, erosion, dilation, and motion differentiation. Three hand detection algorithms are created and assessed for maximum performance and accuracy. The algorithm, designed with efficiency and speed in mind, provides real-time hand detection for operating the music player.
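The erosion and dilation steps mentioned above have compact NumPy equivalents; this sketch uses a 3x3 structuring element and a synthetic mask, both illustrative rather than taken from the paper:

```python
import numpy as np

def erode(mask):
    """Binary erosion with a 3x3 square structuring element:
    a pixel survives only if its whole 3x3 neighbourhood is set."""
    p = np.pad(mask, 1)
    out = np.ones_like(mask, dtype=bool)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            out &= p[1 + dy:p.shape[0] - 1 + dy, 1 + dx:p.shape[1] - 1 + dx]
    return out

def dilate(mask):
    """Binary dilation: a pixel is set if any 3x3 neighbour is set."""
    p = np.pad(mask, 1)
    out = np.zeros_like(mask, dtype=bool)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            out |= p[1 + dy:p.shape[0] - 1 + dy, 1 + dx:p.shape[1] - 1 + dx]
    return out

# Opening (erode then dilate) removes an isolated 1-pixel speckle
# while preserving a solid 3x3 "hand" blob.
mask = np.zeros((7, 7), dtype=bool)
mask[1:4, 1:4] = True      # blob
mask[5, 5] = True          # speckle
opened = dilate(erode(mask))
print(int(opened.sum()))   # 9: only the blob remains
```

This erode-then-dilate ("opening") pass is the usual way to clean a noisy skin mask before area labelling.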
Conference Paper
Changes in illumination can substantially impact the apparent color of the skin, jeopardizing the effectiveness of any color-based segmentation method. Our solution to this problem is to use adaptive technology to generate skin color models in real time. We employ a Viola-Jones feature-based face detector built into MATLAB to sample faces inside a picture in a moderate-recall, high-precision configuration. From these samples we extract a set of pixels that are likely to come from skin areas, then filter them by their relative luma values to remove non-skin facial characteristics, producing a representative set of pixels. Using this set, we train a unimodal Gaussian function in the normalized rg color space to model the skin color in the given image, a combination of modeling strategy and color space that aids us in various ways. The trained function is then evaluated for each pixel in the picture, giving the likelihood that the pixel represents skin, and a binary threshold applied to the computed probabilities may be used to segment the skin. In this work we discuss various current techniques, detail the methodology behind our newly proposed model, provide the outcomes of its application to random photos of individuals with recognizable faces, which we found quite encouraging, and explore its possibilities for use in real-time systems.
Chapter
Driver drowsiness is considered a major cause of road accidents around the globe. Driving nonstop for long periods of time causes accidents. The consequences of a drowsy state are the same as those of alcohol: it makes the driver's driving inputs poorer, destroys the driver's reaction times, and blurs the driver's thought processes. To prevent such disastrous situations, a real-time driver monitoring system is implemented using OpenCV, in which the aspect ratios of extracted eye and mouth contour features are measured and an alarm is generated. With EAR 0.25 and MAR 0.75, the results show that the alarm is generated for the blinks. The robustness of the implementation has been verified by varying the EAR and MAR values; the best results are obtained for EAR 0.25 and MAR 0.75.
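The eye aspect ratio (EAR) used above is commonly computed from six eye landmarks; a sketch under that assumption (the landmark ordering follows the usual formulation and has not been verified against this chapter's exact implementation):

```python
import numpy as np

EAR_THRESHOLD = 0.25  # below this, the eye is treated as closed

def eye_aspect_ratio(pts):
    """Eye aspect ratio from six landmarks ordered p1..p6 around the
    eye: two vertical distances over twice the horizontal distance."""
    p1, p2, p3, p4, p5, p6 = np.asarray(pts, dtype=float)
    vert = np.linalg.norm(p2 - p6) + np.linalg.norm(p3 - p5)
    horiz = np.linalg.norm(p1 - p4)
    return vert / (2.0 * horiz)

# Open eye: tall landmark spread; closed eye: vertical pairs collapse.
open_eye = [(0, 0), (2, 3), (4, 3), (6, 0), (4, -3), (2, -3)]
closed_eye = [(0, 0), (2, 0.3), (4, 0.3), (6, 0), (4, -0.3), (2, -0.3)]
print(eye_aspect_ratio(open_eye) > EAR_THRESHOLD)    # True
print(eye_aspect_ratio(closed_eye) < EAR_THRESHOLD)  # True
```

A mouth aspect ratio (MAR) works the same way with mouth landmarks; the alarm fires when EAR stays below 0.25, or MAR exceeds 0.75, for several consecutive frames.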
Conference Paper
Full-text available
The purpose of this paper is to use nonlinear Partial Differential Equations (PDEs) to reduce additive or multiplicative noise in images. First, we apply the nonlinear anisotropic diffusion technique. This filter greatly reduces additive noise; the weighted image gradient is used to stop diffusion near edges. Then Speckle Reducing Anisotropic Diffusion (SRAD), a multiplicative speckle-noise reduction method, is used for images containing speckle. Numerical results show that anisotropic diffusion performs well for images corrupted by additive noise, and that the performance of SRAD is highly dependent on the homogeneous area selected initially.
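A Perona-Malik style anisotropic diffusion, the family this paper builds on, can be sketched in a few lines of NumPy; the parameter values and the synthetic test image are illustrative:

```python
import numpy as np

def anisotropic_diffusion(img, iters=20, kappa=30.0, lam=0.2):
    """Perona-Malik style anisotropic diffusion: the conduction
    coefficient g = exp(-(|grad|/kappa)^2) shrinks near strong edges,
    so smoothing stops there while flat noisy areas are averaged."""
    u = img.astype(float).copy()
    for _ in range(iters):
        # Four nearest-neighbour differences (replicated border).
        n = np.vstack([u[:1], u[:-1]]) - u
        s = np.vstack([u[1:], u[-1:]]) - u
        w = np.hstack([u[:, :1], u[:, :-1]]) - u
        e = np.hstack([u[:, 1:], u[:, -1:]]) - u
        g = lambda d: np.exp(-(d / kappa) ** 2)
        u += lam * (g(n) * n + g(s) * s + g(w) * w + g(e) * e)
    return u

# A noisy two-level step image: diffusion lowers the noise variance
# inside each flat region while the 100-grey-level edge survives.
rng = np.random.default_rng(0)
img = np.zeros((32, 32))
img[:, 16:] = 100.0
noisy = img + rng.normal(0, 5, img.shape)
smooth = anisotropic_diffusion(noisy)
print(np.std(smooth[:, :12]) < np.std(noisy[:, :12]))  # noise reduced
print(abs(smooth[:, 20:].mean() - smooth[:, :12].mean()) > 90)  # edge kept
```

SRAD replaces the gradient-based conduction coefficient with one driven by the local coefficient of variation, which suits multiplicative speckle; the iteration structure is the same.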
Article
Full-text available
Driver drowsiness is a momentous factor in a huge number of vehicle accidents. Driver drowsiness detection systems have been valued highly and applied recently in various fields such as driver visual-attention monitoring and driver activity tracking. Drowsiness can be detected through a driver face monitoring system. Nowadays smartphone-based applications have developed rapidly and are thus also used for driver safety monitoring. In this paper, a detailed review of driver drowsiness detection techniques implemented on smartphones is given, with a focus on recent and state-of-the-art techniques. The advantages and limitations of each are summarized. A comparative study of recently implemented smartphone-based approaches and commonly used desktop-based approaches is also discussed. Most importantly, this paper helps others choose better techniques for effective drowsiness detection.
Thesis
Full-text available
To prove identification and authentication, this thesis introduces a recent class of biometrics based upon ear features for use in the development of passive identification systems. The future of biometrics will surely lead to systems based on image analysis, as the data acquisition is very simple and requires only cameras, scanners or sensors. More importantly, such methods can be passive: the user does not have to take an active part in the whole process or, in fact, even know that identification is taking place. The ear is the most interesting human anatomical part for such passive, physiological biometric systems based on images acquired from cameras. The ear contains a large volume of unique features that allow many users to be distinctively identified, and will surely be implemented in efficient biometric systems for many applications, for instance identifying terrorists at airports and stations, access control to various buildings, and crowd surveillance in any place. One goal of the work presented in this thesis is to build a passive secure system that identifies a person without his or her knowledge. The thesis presents three methods to recognize a person by the ear. The first is based on Similarity Matching Measures (SMM), which authenticates a subject by taking a profile image in a studio dedicated to the SMM method, after the subject wears a black cap so that only the ear is extracted. The matching efficiency rate of this method is 88%. The second method is the Choras Method Improvement (CMI) for authentication/identification applications. Due to the elliptical shape of the ear, CMI builds ellipses around the ear and uses them in the feature extraction operation, after discarding the outer helix, the lobe and anything beyond them, such as hair, sideburn, earring or chain, because most forgeries and surgical alterations happen in this outer area of the ear.
The matching efficiency rate of the CMI method is 63%. To improve this rate, the Moment Method (MM) uses only the internal part of the ear, which acts as a signature of the ear. This method avoids all the problems of forgery in the external area of the ear. The matching efficiency rate of the MM method is 84%, which is a good result and can be applied easily by end users.
Article
Face recognition is a biometric application used for criminal identification, visitor verification and many other real-time identification systems. We basically use two approaches for this system, namely 'Template Matching' and 'Feature Matching'. The Template Matching approach is independent of the feature points that we have used in this paper. Here we find the convolution values of the features for a test image and all the images in the database. In this work we introduce the novel idea of 'Energies'. The distance algorithm states that the image in the database having the least distance to the test image in terms of Energies is the identified image.
Article
Full-text available
With the emerging need for identification of individuals, face recognition plays a tremendous role. Face recognition is used in various domains such as access control, identification systems and surveillance. A new thermal face recognition system for biometric authentication that uses a Gabor feature descriptor is proposed. Thermal images are captured using IR-sensing cameras that provide images based on the vasculature information of objects that vary from their surroundings. The acquired thermal faces are segmented using the Otsu thresholding algorithm. Gabor features are used as feature descriptors for face recognition. The publicly available Terravic thermal dataset is used. LBP, LPQ and HOG feature descriptors, and Naive Bayes, Multi-Layer Perceptron, J48 and Random Forest classifiers are used for comparison with the proposed work. The novelty of the proposed work lies in the selection of combined features applied to the random forest classifier, yielding a promising result of 99.93% accuracy.
Article
Full-text available
We present a novel and practical way to integrate techniques from computer vision to low bit-rate coding systems for video teleconferencing applications. Our focus is to locate and track the faces of persons in typical head-and-shoulders video sequences, and to exploit the face location information in a ‘classical’ video coding/decoding system. The motivation is to enable the system to selectively encode various image areas and to produce psychologically pleasing coded images where faces are sharper. We refer to this approach as model-assisted coding. We propose a totally automatic, low-complexity algorithm, which robustly performs face detection and tracking. A priori assumptions regarding sequence content are minimal and the algorithm operates accurately even in cases of partial occlusion by moving objects. Face location information is exploited by a low bit-rate 3D subband-based video coder which uses both a novel model-assisted pixel-based motion compensation scheme, as well as model-assisted dynamic bit allocation with object-selective quantization. By transferring a small fraction of the total available bit-rate from the non-facial to the facial area, the coder produces images with better-rendered facial features. The improvement was found to be perceptually significant on video sequences coded at 96 kbps for an input luminance signal in CIF format. The technique is applicable to any video coding scheme that allows for fine-grain quantizer selection (e.g. MPEG, H.261), and can maintain full decoder compatibility.
Conference Paper
Full-text available
In Computer Mediated Communication (CMC) such as desktop video conferencing, static video cameras provide a restricted field of view of remote sites. The effective field of view can be enlarged, while maintaining the user's freedom of movement, by slaving a remote-controlled camera to the movements of the user's head. This paper concerns techniques for tracking faces. We demonstrate that robustness and reliability can be increased by combining multiple perceptual processes, such as eye-blink detection, skin color histograms and cross correlation, that adapt to a variety of operating conditions. It has been demonstrated that in CMC, robustness and performance have a strong impact on users' behavior and system acceptance (10). The central message of this paper is that robustness and performance can be increased by combining multiple perceptual processes that adapt to a variety of operating conditions. We illustrate this message with CoMedi, a media-space under development for experiments with both social and technical aspects of CMC. We first introduce the CoMedi system, then present the architecture used as the framework for integrating multiple perceptual processes, followed by a description of each of the perceptual processes implemented in CoMedi. We close the discussion with an example of cooperation of visual processes, based on their mutual merits and limitations, for robust and efficient face tracking.
Conference Paper
Full-text available
We designed a modular system using a combination of shape analysis, color segmentation and motion information for reliably locating heads and faces of different sizes and orientations in complex images. The first of the system's three channels performs a shape analysis on gray-level images to determine the location of individual facial features as well as the outlines of heads. In the second channel the color space is analyzed with a clustering algorithm to find areas of skin colors; the color space is first calibrated using the results from the other channels. In the third channel motion information is extracted from frame differences, and head outlines are determined by analyzing the shapes of areas with large motion vectors. All three channels produce lists of shapes, each marking an area of the image where a facial feature or part of the outline of a head may be present. Combinations of such shapes are evaluated with n-gram searches to produce a list of likely head positions and the locations of facial features. We tested the system for tracking the faces of people sitting in front of terminals and video phones, and used it to track people entering through a doorway.
Article
Full-text available
The use of energy-minimizing curves, known as “snakes” to extract features of interest in images has been introduced by Kass, Witkin and Terzopoulos (1987). A balloon model was introduced by Cohen (1991) as a way to generalize and solve some of the problems encountered with the original method. A 3-D generalization of the balloon model as a 3-D deformable surface, which evolves in 3-D images, is presented. It is deformed under the action of internal and external forces attracting the surface toward detected edgels by means of an attraction potential. We also show properties of energy-minimizing surfaces concerning their relationship with 3-D edge points. To solve the minimization problem for a surface, two simplified approaches are shown first, defining a 3-D surface as a series of 2-D planar curves. Then, after comparing finite-element method and finite-difference method in the 2-D problem, we solve the 3-D model using the finite-element method yielding greater stability and faster convergence. This model is applied for segmenting magnetic resonance images
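For reference, the snake energy and the balloon force discussed above take the following standard forms (standard notation after Kass-Witkin-Terzopoulos and Cohen, not necessarily this paper's exact symbols):

```latex
% Snake energy for a parametrised curve v(s), s in [0,1]:
E(v) = \int_0^1 \tfrac{1}{2}\bigl(\alpha\,|v'(s)|^{2} + \beta\,|v''(s)|^{2}\bigr)\,ds
     + \int_0^1 P\bigl(v(s)\bigr)\,ds,
\qquad P(v) = -\,\bigl|\nabla I(v)\bigr|^{2}.

% Cohen's balloon model adds an inflation force along the unit
% normal n(s), giving the external force
F = k_{1}\,n(s) - k\,\frac{\nabla P}{|\nabla P|}.
```

The first term controls elasticity and rigidity of the curve, the potential P attracts it to edges, and the balloon term keeps the model from collapsing when it starts far from the contour.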
Article
Full-text available
The problems of segmenting a noisy intensity image and tracking a nonrigid object in the plane are discussed. In evaluating these problems, a technique based on an active contour model, commonly called a snake, is examined. The technique is applied to cell locomotion and tracking studies. The snake permits both the segmentation and tracking problems to be solved simultaneously in constrained cases. A detailed analysis of the snake model, emphasizing its limitations and shortcomings, is presented, and improvements to the original description of the model are proposed. Problems of convergence of the optimization scheme are considered. In particular, an improved terminating criterion for the optimization scheme, based on topographic features of the graph of the intensity image, is proposed. Hierarchical filtering methods, as well as a continuation method based on a discrete scale-space representation, are discussed. Results for both segmentation and tracking are presented, and possible failures of the method are discussed.
Article
Full-text available
(C) In the proceedings of the "European Conf. on Multimedia Applications, Services and Techniques - ECMAST; Louvain-la-Neuve, 28-30 May, 1996". We propose multi-modal person verification using voice and images as a solution to the secured-access problem. The necessary I/O devices are now standard, cheaply available and, most importantly, constitute the two most important human communication modalities. The visual part currently involves (i) matching of a coarse grid containing Gabor phase information from face images, (ii) facial feature localization and extraction, and (iii) 3D biometrical feature extraction by structured light. The acoustic part uses three methods (DTW, SOSM and HMM) to compare voice references extracted from the speech signal; LPC coefficients are extracted and three different classifiers are used in parallel. The global decision is taken by applying a Furui threshold to the individual methods and combining the individual results according to a majo...
Conference Paper
In this paper, a face localization system is proposed in which local detectors are coupled with a statistical model of the spatial arrangement of facial features to yield robust performance. The outputs from the local detectors are treated as candidate locations and constellations are formed from these. The effects of translation, rotation, and scale are eliminated by mapping to a set of shape variables. The constellations are then ranked according to the likelihood that the shape variables correspond to a face versus an alternative model. Incomplete constellations, which occur when some of the true features are missed, are handled in a principled way.
Article
It is the key step for face recognition systems to extract the facial parts from complex backgrounds. In this paper, we propose a new method for face extraction in color complex backgrounds. By transforming color images from the RGB color representation to the YIQ one, the orange-like parts, including the face areas, are enhanced in the original images if only the I-component of the YIQ color system is used. A facial texture model based on space gray level dependence (SGLD) matrices is applied to these images. Using this model, facial parts are detected as those regions which satisfy a set of inequalities; the weight coefficients in the inequalities are decided by a conventional learning method. Using this textural model, we design a scanning scheme for face detection in complex backgrounds. Experiments show that this method can locate the face position in complex backgrounds effectively.
Article
The successful segmentation of magnetic resonance images depends upon the success of three separate stages. Initially, attention must be paid to the image acquisition, choosing appropriate spin sequences to enhance neurological contrast. Secondly, the images must be preprocessed to compensate for non-uniformities and noise, possibly using non-linear iterative techniques. The third stage is the actual segmentation processing, which may be low-level (image enhancement, smoothing, and resolution changes), medium-level (region-growing, clustering, edge detection, etc.) or high-level (incorporating knowledge-based approaches). The work in this project has concentrated on low- and medium-level techniques. Work is being carried out using C and C++ programming under Unix, making use of subroutines and software tools developed as part of the /usr/image package of the University of North Carolina, the SPIDER package, and the Synoptics programming language Semper 6+. The Mayo Clinic ANALYZE package is used for fast 3D visualisation. Three main methods are being compared: multi-resolution segmentation by detection of extremal regions in scale space, edge-based processing, and multi-spectral
Article
We propose a way to locate facial features from a video sequence captured by a camcorder undergoing strong translational motion. Pairs of stereo images containing frontal views of the human subject are sampled from the video sequence. A multiresolution hierarchical matching algorithm finds point correspondences over a large disparity range. The task of locating facial features such as the eyes, nose and mouth is aided by depth information derived from the matching data. We present experimental results to verify our approach.
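Block matching over a disparity range, as used here for stereo correspondence, can be sketched with a sum-of-squared-differences search; the block size, search range, and synthetic image pair are illustrative, not the paper's multiresolution hierarchy:

```python
import numpy as np

def match_block(left, right, y, x, bsize=5, max_disp=10):
    """Find the horizontal disparity of the block centred at (y, x)
    in `left` by minimising the sum of squared differences (SSD)
    against candidate blocks in `right`."""
    r = bsize // 2
    ref = left[y - r:y + r + 1, x - r:x + r + 1]
    best_d, best_ssd = 0, np.inf
    for d in range(max_disp + 1):
        if x - r - d < 0:
            break
        cand = right[y - r:y + r + 1, x - r - d:x + r + 1 - d]
        ssd = float(((ref - cand) ** 2).sum())
        if ssd < best_ssd:
            best_d, best_ssd = d, ssd
    return best_d

# A bright square shifted 4 pixels between the two views:
# the matcher recovers disparity 4.
left = np.zeros((20, 20))
left[8:12, 10:14] = 1.0
right = np.zeros((20, 20))
right[8:12, 6:10] = 1.0
print(match_block(left, right, 9, 11))  # 4
```

A hierarchical variant runs the same search on downsampled images first and refines the result at full resolution, which is what allows a large disparity range at low cost.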
Article
A model for representing image contours in a form that allows interaction with higher level processes has been proposed by Kass et al. (in Proceedings of First International Conference on Computer Vision, London, 1987, pp. 259–269). This active contour model is defined by an energy functional, and a solution is found using techniques of variational calculus. Amini et al. (in Proceedings, Second International Conference on Computer Vision, 1988, pp. 95–99) have pointed out some of the problems with this approach, including numerical instability and a tendency for points to bunch up on strong portions of an edge contour. They proposed an algorithm for the active contour model using dynamic programming. This approach is more stable and allows the inclusion of hard constraints in addition to the soft constraints inherent in the formulation of the functional; however, it is slow, having complexity O(nm³), where n is the number of points in the contour and m is the size of the neighborhood in which a point can move during a single iteration. In this paper we summarize the strengths and weaknesses of the previous approaches and present a greedy algorithm which has performance comparable to the dynamic programming and variational calculus approaches. It retains the improvements of stability, flexibility, and inclusion of hard constraints introduced by dynamic programming but is more than an order of magnitude faster than that approach, being O(nm). A different formulation is used for the continuity term than that of the previous authors so that points in the contour are more evenly spaced. The even spacing also makes the estimation of curvature more accurate. Because the concept of curvature is basic to the formulation of the contour functional, several curvature approximation methods for discrete curves are presented and evaluated as to efficiency of computation, accuracy of the estimation, and presence of anomalies.
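One greedy iteration of the kind described above can be sketched as follows. This is a simplified, unnormalized version of the energy terms (the original normalizes each term over the neighborhood); the function and parameter names are illustrative:

```python
import math

def greedy_snake_step(points, edge_map, alpha=1.0, beta=1.0, gamma=1.2, r=1):
    """One greedy pass over a closed contour (simplified sketch).

    points   : list of (x, y) contour points
    edge_map : dict mapping (x, y) -> gradient magnitude (higher = stronger edge)
    r        : neighborhood radius, i.e. m = (2r+1)^2 candidate positions
    """
    n = len(points)
    # Mean spacing drives the continuity term toward evenly spaced points.
    d_mean = sum(math.dist(points[i], points[(i + 1) % n]) for i in range(n)) / n
    new_points = []
    for i in range(n):
        x, y = points[i]
        prev = new_points[i - 1] if i > 0 else points[-1]
        nxt = points[(i + 1) % n]
        best, best_e = (x, y), float("inf")
        for dx in range(-r, r + 1):
            for dy in range(-r, r + 1):
                c = (x + dx, y + dy)
                e_cont = abs(d_mean - math.dist(prev, c))            # even spacing
                e_curv = ((prev[0] - 2 * c[0] + nxt[0]) ** 2
                          + (prev[1] - 2 * c[1] + nxt[1]) ** 2)      # curvature
                e_img = -edge_map.get(c, 0.0)                        # attract to edges
                e = alpha * e_cont + beta * e_curv + gamma * e_img
                if e < best_e:
                    best, best_e = c, e
        new_points.append(best)
    return new_points

square = [(0, 0), (4, 0), (4, 4), (0, 4)]
edges = {(1, 0): 5.0}  # one strong edge response near the first point
moved = greedy_snake_step(square, edges)
```

Each point moves at most r pixels per pass, so the loop is repeated until few points move, giving the O(nm) behavior per iteration noted above.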
Conference Paper
The authors present a new approach for automatic segmentation and tracking of faces in color images. Segmentation of faces is performed by evaluating color and shape information. First, skin-like regions are determined based on the color attributes hue and saturation. Then, regions with elliptical shape are selected as face hypotheses. They are verified by searching for facial features in their interior. After a face is reliably detected, it is tracked over time. Tracking is realized by using an active contour model. The exterior forces of the snake are defined based on color features. They push or pull snaxels perpendicular to the snake. Results are shown for tracking an image sequence consisting of 150 frames.
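The first stage, the hue/saturation skin test, can be sketched with the thresholds reported for this approach (0.23 ≤ S ≤ 0.68 and 0° ≤ H ≤ 50°, with the intensity V deliberately ignored); the function name is illustrative:

```python
import colorsys

# Thresholds reported by Sobottka and Pitas (1998).
H_MAX_DEG, S_MIN, S_MAX = 50.0, 0.23, 0.68

def is_skin(r, g, b):
    """Classify a normalized RGB pixel (0..1) as skin-like via HS thresholds."""
    h, s, v = colorsys.rgb_to_hsv(r, g, b)   # h, s, v each in 0..1
    return 0.0 <= h * 360.0 <= H_MAX_DEG and S_MIN <= s <= S_MAX
```

Applying this test per pixel yields a binary skin map, on which connected regions are then checked for elliptical shape.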
Conference Paper
This paper proposes a new method that can be used to extract a facial sketch image automatically and to transform facial expressions. The extraction of facial features is performed in three processing steps: segmentation using color characteristics, detection of feature points of each part of the face, and their approximation using polynomials. The resulting curves are called feature curves, and they constitute a facial sketch image. Expression transformation of the sketch image based on the facial action coding system (FACS) is carried out easily by adjusting the locations of the feature points. The control scheme of the feature point locations is based on the FACS. Experimental results on the extraction of facial sketch images and their expression transformations are provided.
Conference Paper
An efficient, physically based solution for recovering a 3-D solid model from collections of 3-D surface measurements is presented. Given a sufficient number of independent measurements, the solution is overconstrained and unique except for rotational symmetries. A physically based object recognition method that allows simple, closed-form comparisons of recovered 3-D solid models is given. The performance of these methods is evaluated using both synthetic and real laser rangefinder data
Article
The authors present a closed-form, physically based solution for recovering a three-dimensional (3-D) solid model from collections of 3-D surface measurements. Given a sufficient number of independent measurements, the solution is overconstrained and unique except for rotational symmetries. The proposed approach is based on the finite element method (FEM) and parametric solid modeling using implicit functions. This approach provides both the convenience of parametric modeling and the expressiveness of the physically based mesh formulation and, in addition, can provide great accuracy at physical simulation. A physically based object-recognition method that allows simple, closed-form comparisons of recovered 3-D solid models is presented. The performance of these methods is evaluated using both synthetic range data with various signal-to-noise ratios and using laser rangefinder data
Article
The goal of this paper is to present a critical survey of existing literature on human and machine recognition of faces. Machine recognition of faces has several applications, ranging from static matching of controlled photographs, as in mug shot matching and credit card verification, to surveillance video images. Such applications have different constraints in terms of complexity of processing requirements and thus present a wide range of different technical challenges. Over the last 20 years, researchers in psychophysics, neural sciences and engineering, image processing analysis and computer vision have investigated a number of issues related to face recognition by humans and machines. Ongoing research activities have been given a renewed emphasis over the last five years. Existing techniques and systems have been tested on different sets of images of varying complexities. But very little synergism exists between studies in psychophysics and the engineering literature. Most importantly, there exist no evaluation or benchmarking studies using large databases with the image quality that arises in commercial and law enforcement applications. In this paper, we first present different applications of face recognition in commercial and law enforcement sectors. This is followed by a brief overview of the literature on face recognition in the psychophysics community. We then present a detailed overview of more than 20 years of research done in the engineering community. Techniques for segmentation/location of the face, feature extraction and recognition are reviewed. Global transform and feature-based methods using statistical, structural and neural classifiers are summarized.
Article
A first step towards face recognition is the localization of face-like regions and the extraction of facial features such as eyes, mouth and nose. Although a lot of work has already been done in this research area, the recognition of human faces out of a scene with a cluttered background is still a problem that deserves further investigation. In this framework we present a novel approach to face localization by using color and shape information. For the extraction of facial features we employ two approaches. One is based on a modified watershed method and the second on min-max analysis. Results are shown for two different scenes. Index Terms: Face recognition, color image processing, watersheds, min-max analysis. 1 Introduction There are many different applications for face localization and recognition systems, e.g. model-based video coding, security systems, mug shot matching. Due to variations in illumination, background, visual angle and facial expressions, the problem is complex...
Article
Introduction Recognition of human faces out of still images or image sequences is a research field of rapidly increasing interest. There are many different applications for systems coping with the problem of face localization and recognition, e.g. model-based video coding, security systems, mug shot matching. Due to variations in illumination, background, visual angle and facial expressions, the problem is complex. A critical survey of existing literature on human and machine recognition of faces is given in [1]. In a first step of face recognition, the localization of facial regions and the detection of facial features such as eyes, nose and mouth is necessary. In this framework we present an approach that copes with the problems of this first step. For the detection of facial regions and facial features, approaches have been published using texture, depth, shape and color information or combinations of them ([2], [4], [3], [6]). Our approach is based on the observation that human f
Article
In this paper, a face localization system is proposed in which local detectors are coupled with a statistical model of the spatial arrangement of facial features to yield robust performance. The outputs from the local detectors are treated as candidate locations and constellations are formed from these. The effects of translation, rotation, and scale are eliminated by mapping to a set of shape variables. The constellations are then ranked according to the likelihood that the shape variables correspond to a face versus an alternative model. Incomplete constellations, which occur when some of the true features are missed, are handled in a principled way. 1 Introduction The problem of face recognition has received considerable attention in the literature [11, 24, 21, 4, 19, 17, 22, 10]; however, in most of these studies, the faces were either embedded in a benign background or were assumed to have been pre-segmented. For any of these recognition algorithms to work in realworld applicati...
Introduction into Digital Image Processing G. Galicia, A. Zakhor, Depth recovery of human facial features from video sequences
  • H Ernst
H. Ernst, Introduction into Digital Image Processing (in German), Franzis, 1991. G. Galicia, A. Zakhor, Depth recovery of human facial features from video sequences, in: IEEE Internat. Conf. on Image Processing, Washington DC, USA, 23–26 October 1995, IEEE Computer Society Press, Los Alamitos, California, pp. 603–606.
Face localization via shape statistics, in: Internat. Workshop on Automatic Face
  • M Bichsel
M. Bichsel (Ed.), Internat. Workshop on Automatic Face and Gesture Recognition, Zurich, Switzerland, 26–28 June 1995, IEEE Computer Society, Swiss Informaticians Society, MultiMedia Laboratory, Department of Computer Science, University of Zurich. M.C. Burl, T.K. Leung, P. Perona, Face localization via shape statistics, in: Internat. Workshop on Automatic Face and Gesture Recognition, Zurich, Switzerland, 26–28 June 1995, pp. 154–159.
An application of fuzzy theory: Face detection, in: Internat. Workshop on Automatic Face and Gesture Recognition
  • H Wu
  • Q Chen
  • M Yachida
H. Wu, Q. Chen, M. Yachida, An application of fuzzy theory: Face detection, in: Internat. Workshop on Automatic Face and Gesture Recognition, Zurich, Switzerland, 26–28 June 1995, pp. 314–319.
Looking for faces and facial features in color images. Pattern Recognition and Image Analysis: Advances in Mathematical Theory and Applications
  • K Sobottka
  • I Pitas
K. Sobottka and I. Pitas. Looking for faces and facial features in color images. Pattern Recognition and Image Analysis: Advances in Mathematical Theory and Applications, Russian Academy of Sciences, 7(1), 1997.