Figure 3 - uploaded by Bernd M. Radig
Acquiring a skin color mask. 

Source publication
Article
Full-text available
Skin color is an important feature of faces. Various applications benefit from robust skin color detection. Skin color may look quite different, depending on camera settings, illumination, shadows, people's tans, and ethnic groups. That variation is a challenging aspect of skin color classification. In this paper, we present an approach that uses...

Context in source publication

Context 1
... skin color is challenging, because it occupies a large cluster within color space. Camera types and settings, illumination conditions, as well as people’s tans and ethnic groups make skin color vary significantly. However, within one image skin color looks similar, because most of the above conditions are fixed. Our image-specific skin color model describes those conditions. Using this model, dynamic skin color classifiers are adapted to the image conditions, which improves their classification accuracy. Previous work focuses on detecting image-specific skin color models via low-level techniques such as color segmentation, background subtraction, or histogram prediction [8]. In order to improve the accuracy of the obtained image-specific skin color model, we use a more sophisticated vision module. It is a combination of the commonly known and widely accepted face detector of Viola and Jones, see Section 3.1, and an empirically obtained skin color mask, see Section 3.2. Since the face detector works with gray value images, it is not influenced by the color distribution. The entire module is capable of extracting a small number of skin color pixels from the image, which are used to set up the image-specific skin color model.
Viola and Jones [1] propose a visual object detection framework that processes images very quickly while achieving high detection rates. Their approach detects previously trained objects within rectangular regions of interest (ROI) by evaluating simple features within those rectangles. Two of those features are depicted in Figure 2. The features are called Haar-like features because they are computed similarly to the Haar wavelet parameters. The feature value is the sum of the gray values within the black area subtracted from the sum of the gray values within the white area.
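The feature value described above can be illustrated with a minimal sketch. It assumes a horizontal two-rectangle feature whose left half is the white area and whose right half is the black area; that layout, the function names, and the toy image are hypothetical choices for illustration and are not taken from the paper.

```python
def rect_sum(gray, top, left, height, width):
    """Sum of the gray values inside a rectangle of a 2-D list image."""
    return sum(
        gray[r][c]
        for r in range(top, top + height)
        for c in range(left, left + width)
    )


def two_rect_feature(gray, top, left, height, width):
    """Two-rectangle Haar-like feature: white sum minus black sum.

    Assumption: the white area is the left half of the rectangle and
    the black area is the right half.
    """
    half = width // 2
    white = rect_sum(gray, top, left, height, half)
    black = rect_sum(gray, top, left + half, height, half)
    return white - black


# Toy 2x4 gray value patch: bright on the left, dark on the right.
patch = [
    [200, 200, 50, 50],
    [200, 200, 50, 50],
]
feature_value = two_rect_feature(patch, 0, 0, 2, 4)
```

A large positive value indicates a bright-to-dark transition from left to right inside the rectangle. Summing pixels directly, as done here, costs time proportional to the rectangle area.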
Viola and Jones introduce an image representation called the Integral Image, which allows any of their features to be computed very quickly and independently of its size. They utilize simple yet fast classifiers that evaluate those features for object detection. High accuracy is achieved by combining those simple classifiers in a boosted cascade. Viola and Jones demonstrate the benefit of their approach in the domain of face detection. The trained frontal face detector runs at 15 frames per second on a standard desktop computer, which makes it suitable for real-time applications. We use this implementation for obtaining a ROI around a face within the processed image.
A skin color mask enables a face detector to extract skin color pixels from the ROI. It is a two-dimensional matrix that specifies the probability of skin color for each pixel within the ROI after being scaled to its size. Runtime performance is increased by only taking those probability entries into account that exceed a given threshold value. A skin color mask exists for any face detector; however, we show its benefit for the Viola and Jones face detector, see Section 3.1.
Skin color masks are learned via training images whose skin color pixels are previously known. We use a set of $K$ training images that show various faces, which originate from well-known face detectors, the Boston University skin color database [7], and various web pages. A skin color mask $M$ is an $n_1 \times n_2$ matrix whose entries are called $m_{i,j} \in [0, 1]$. In this work we take $n_1 = n_2 = 24$ as a reasonable compromise between accuracy and runtime performance. We apply the face detector to each image $k$ and receive a region of interest $\text{roi}_k$, which is divided into $n_1 \times n_2$ cells $f_{k,i,j}$ with $1 \le i \le n_1$ and $1 \le j \le n_2$. The likelihood for skin color within cell $f_{k,i,j}$ is expressed by $s_{k,i,j}$. Finally, we calculate the entries of the skin color mask $m_{i,j}$, see (2) and (3). The entire procedure is depicted in Figure ...
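The Integral Image that Viola and Jones use for fast feature evaluation can be illustrated with a short sketch. It follows the standard summed-area-table construction; the function names and the zero-padded layout are illustrative choices, not the authors' implementation.

```python
def integral_image(gray):
    """Summed-area table with one extra row and column of zeros.

    ii[r][c] holds the sum of all gray values above and to the left
    of position (r, c), exclusive.
    """
    rows, cols = len(gray), len(gray[0])
    ii = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        row_sum = 0
        for c in range(cols):
            row_sum += gray[r][c]
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii


def rect_sum(ii, top, left, height, width):
    """Sum over any rectangle from four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
ii = integral_image(image)
total = rect_sum(ii, 0, 0, 3, 3)        # whole image
lower_right = rect_sum(ii, 1, 1, 2, 2)  # cells 5, 6, 8, 9
```

Once the table is built, every rectangle sum costs four lookups regardless of the rectangle size, which is why the feature evaluation time is independent of the feature size.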

Similar publications

Conference Paper
Full-text available
Skin detection is used in applications in computer vision, including image correction, image–content filtering, image processing, and skin classification. In this study, we propose an accurate and effective method for detecting the most representative skin color in one's face based on the face's center region, which is free from nonskin-colored fea...
Article
Full-text available
Wide resection of malignant skin tumors in the upper orbital region often results in soft-tissue defects involving the eyebrow. We used composite skin grafts from the area around the sideburns for 1-stage reconstruction of skin and eyebrow defects. The results were aesthetically satisfying because the hair and shape of these regions were similar to...

Citations

... The hybrid-feature-based methods usually introduce different features for feature fusion to improve the adaptability of the skin segmentation. Wimmer et al. [21] propose a set of adaptive skin color models that can be dynamically updated by building the correlation between the human skin color and facial features. Sun [22] utilizes a large number of skin samples to train the global skin model and the non-skin model. ...
Article
Full-text available
Skin segmentation plays an important role in image processing and human–computer interaction tasks. However, it is a challenging task to accurately detect skin regions from various scenes with different illumination or color styles. In addition, in the field of video processing, reducing the computational load and improving the real-time performance of the algorithm has also become an important topic of skin segmentation. Existing deep semantic segmentation networks usually pay too much attention to the detection performance of the model and make the model structure tend to be complex, which brings heavy computational burden. To achieve the trade-off between detection performance and real-time performance of the skin segmentation algorithm, this paper proposes a lightweight skin segmentation network. Compared with existing semantic segmentation networks, this model adopts a simpler structure to improve the real-time performance. In addition, to improve the feature fitting ability of the network without slowing down its inference speed, this paper proposes a color attention mechanism, which locates skin regions in images based on the distribution features of skin colors on the E-R/G color plane generated from the YES color space, and guides the network to update parameters. Experimental results show that this method not only exhibits similar detection performance to existing semantic segmentation networks such as U-Net and DeepLab, but also the computation load of the model is 18.1% lower than Fast-SCNN.
... Our hand tracking system first starts with Haar-based face detection [57]. An adaptive skin colour model [60] is then learnt from the colour distribution of the face pixels. Candidate hand regions are generated based on the fusion of the face data, the skin likelihood map, as well as a motion likelihood map generated via weighted frame differencing. ...
Conference Paper
Sign languages are visual languages used by the Deaf community for communication purposes. Whilst recent years have seen a high growth in the quantity of sign language video collections available online, much of this material is hard to access and process due to the lack of associated text-based tagging information and because 'extracting' content directly from video is currently still a very challenging problem. Also limited is the support for the representation and documentation of sign language video resources in terms of sign writing systems. In this paper, we start with a brief survey of existing sign language technologies and we assess their state of the art from the perspective of a sign language digital information processing system. We then introduce our work, focusing on vision-based sign language recognition. We apply the factorisation method to sign language videos in order to factor out the signer's motion from the structure of the hands. We then model the motion of the hands in terms of a weighted combination of linear trajectory basis and apply a set of classifiers on the basis weights for the purpose of recognising meaningful phonological elements of sign language. We demonstrate how these classification results can be used for transcribing sign videos into a written representation for annotation and documentation purposes. Results from our evaluation process indicate the validity of our proposed framework.
... approach, are based on a parametric representation of the region of color space (RGB, HSV, YCbCr) that corresponds to skin color. In particular, simple threshold rules [5,6,12], principal component analysis [7,13], or Gaussian mixture models [4,14] are used. However, when shooting under real conditions, the configuration of the skin color region within the color space can change significantly as a result of exposure changes. ...
Article
Full-text available
This paper proposes a method for detecting a hand in a video stream based on a one-class pixel classifier, a probabilistic gamma-normal model, and a skeleton description. The initial segmentation of skin regions is performed with a modified version of a one-class classifier that is trained on an image fragment of part of the face and does not require a training set for building a background model. The result of the classification is the degree of membership in the class of interest. The initial segmentation is improved by reconciling local decisions and by drawing on information about the image structure. For this purpose, a special filter with structure-transfer properties based on the probabilistic gamma-normal model is applied. To make the final decision on whether the detected fragment is an image of a human hand, a method for comparing binary images based on their skeletons is used.
... We employ an adaptive skin colour classifier [18] for generating the skin likelihood map. The skin model used by this classifier is initialised via face detection as follows: a 24 × 24 mask, generated off-line using several hundred images of different persons, is applied to the face region that is found by the face detector; this mask indicates which pixels within the face region are most likely to be skin. Then, working within the normalised RGB colour space, a parametric skin colour model is estimated. ...
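The initialisation described in this excerpt can be sketched roughly as follows. Everything here is an assumption made for illustration: the nearest-neighbour scaling of the mask, the threshold of 0.5, the choice of independent per-channel Gaussians as the parametric model, and all function names. The toy example uses a 2 × 2 mask instead of a 24 × 24 one to stay short.

```python
def sample_skin_pixels(roi, mask, threshold=0.5):
    """Collect ROI pixels whose scaled mask entry exceeds the threshold."""
    n_rows, n_cols = len(mask), len(mask[0])
    height, width = len(roi), len(roi[0])
    samples = []
    for y in range(height):
        for x in range(width):
            # Nearest-neighbour scaling of the mask to the ROI size.
            i = y * n_rows // height
            j = x * n_cols // width
            if mask[i][j] > threshold:
                samples.append(roi[y][x])
    return samples


def normalised_rg(pixel):
    """Map an (R, G, B) pixel to normalised (r, g) chromaticities."""
    r, g, b = pixel
    total = r + g + b
    if total == 0:
        return (0.0, 0.0)
    return (r / total, g / total)


def fit_gaussian(samples):
    """Per-channel mean and variance of the normalised (r, g) samples."""
    points = [normalised_rg(p) for p in samples]
    n = len(points)
    means = tuple(sum(p[k] for p in points) / n for k in range(2))
    variances = tuple(
        sum((p[k] - means[k]) ** 2 for p in points) / n for k in range(2)
    )
    return means, variances


# Toy data: left half of the face region is skin, right half is not.
mask = [[0.9, 0.1], [0.8, 0.2]]
skin = (200, 120, 80)
background = (10, 10, 200)
roi = [[skin, skin, background, background] for _ in range(4)]

samples = sample_skin_pixels(roi, mask)
means, variances = fit_gaussian(samples)
```

Only pixels under high-probability mask entries contribute to the model, so background and non-skin facial regions inside the face rectangle have little influence on the estimated parameters.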
Conference Paper
Full-text available
In this paper, we propose to incorporate prior knowledge from sign language linguistic models about the motion of the hands within a multiple hypothesis tracking framework. A critical component for automated visual sign language recognition is the tracking of the signer’s hands, especially when faced with frequent and persistent occlusions and complex hand interactions. Hand motion constraints identified by sign language phonological models, such as the hand symmetry condition, are used as part of the data association process. Initial experimental results show the validity of the proposed approach.
... These methods are based on the assumption that at least one reliable face is present in the image and has been reliably detected. They differentiate among each other mainly in the way they select skin pixels from the detected face(s) to be used to train ad-hoc skin classifiers [26], [27], [28]. Bianco et al. [20] showed that skin classifiers initialized by reliable skin pixels extracted from faces outperform traditional methods, even when they are preceded by a color constancy preprocessing step. ...
Article
Full-text available
In this paper we propose a skin classification method exploiting faces and bodies automatically detected in the image, to adaptively initialize individual ad-hoc skin classifiers. Each classifier is initialized by a face and body couple or by a single face, if no reliable body is detected. Thus, the proposed method builds an ad-hoc skin classifier for each person in the image, resulting in a classifier less dependent on changes in skin color due to tan levels, races, genders, and illumination conditions. Experimental results on a heterogeneous dataset of labeled images show that our proposal outperforms the state-of-the-art methods, and that this improvement is statistically significant.
... Hence, if we could construct an adaptive skin color model, the misclassification rate would be greatly reduced. By exploiting skin color information from individual's face, we could create the skin color model for each person and then improve system robustness because of the reduced amount of color variations between a person's face and hands [29]. The face-based adaptive skin color model proposed by Liou [30] is adopted here. ...
Article
Full-text available
Due to the effect of lighting and complex background, most visual hand gesture recognition systems work only under restricted environments. Here, we propose a robust system which consists of three modules: digital zoom, adaptive skin detection, and hand gesture recognition. The first module detects the user's face and zooms in so that the face and upper torso take the central part of the image. The second module utilizes the detected user facial color information to detect the other skin color regions like hands. The last module is the most important part for doing both static and dynamic hand gesture recognition. The region of interest next to the detected user face is for fist/waving hand gesture recognition. To classify the dynamic hand gestures under complex background, motion history image and four groups of novel Haar-like features are investigated to classify the dynamic up, down, left, and right hand gestures. A simple efficient algorithm using Support Vector Machine is developed. These defined hand gestures are intuitive and make it easy for users to control most home appliances. Five users doing 50 dynamic hand gestures at near, medium, and far distances, respectively, were tested under complex environments. Experimental results showed that the accuracy was 95.37% on average and the processing speed was 3.93 ms per frame. An application integrated with the developed hand gesture recognition was also given to demonstrate the feasibility of the proposed system.
... The other approaches utilize high level information to detect image specific skin regions. Wimmer and Radig [21] presented an adaptive skin color classifier based on face detection. ...
... False positives could be filtered out by checking the color range of the extracted face region with the traditional skin color equations. Our approach differs from [21] in that we do not need to manually annotate the face mask for sampling the skin region. From the pixels in the skin region, the personalized skin color model consisting of normalized red/green and red could be described by Gaussian distributions. ...
... To prevent over- or under-segmentation of skin color pixels, a face-based adaptive skin color model was proposed. Wimmer and Radig [21] presented a parametric skin color classifier that can be adapted to the condition of each image or image sequence. They applied a stand-alone face detector [22] to get the rough position and size of the frontal view face. ...
Article
Full-text available
Man-machine interfaces based on video analysis have recently become popular. The most typical body gesture utilized for computer interaction is the hand gesture. Therefore, it is a very important topic to accurately extract hand regions from a sequence of images in real time. In this paper, we propose an adaptive skin color model which is based on detected face color. Skin colors are sampled from the extracted face region, where non-skin color pixels like eyebrows or glasses are excluded. Gaussian distributions of normalized RGB are then used to define the skin color model for the detected person. To demonstrate the robustness of the proposed model, experiments under diversified lighting and background are conducted. Traditional methods based on RGB, Normalized RGB, and YCbCr are all implemented for comparison. From experimental results, skin color pixels could be detected for each person. The accuracy rate is 95.73% on average and is superior to previously mentioned methods.
... Hence, if we could construct an adaptive skin color model, the misclassification rate would be greatly reduced. Exploiting skin color information from an individual's face to create the skin color model for each person will improve system robustness because of the reduced amount of color variations within a person's face and hands [17]. The face-based adaptive skin color model proposed by Liou [18] is adopted here. ...
Article
Full-text available
Man-machine interfaces based on hand gesture recognition have been developed vigorously in recent years. Due to the effect of lighting and complex background, most visual hand gesture recognition systems work only under restricted environments. An adaptive skin color model based on face detection is utilized to detect skin color regions like hands. To classify the dynamic hand gestures, we developed a simple and fast motion history image based method. Four groups of Haar-like directional patterns were trained for the up, down, left, and right hand gesture classifiers. Together with the fist hand and waving hand gestures, six hand gestures were defined in total. In general, they are suitable for controlling most home appliances. Five persons doing 250 hand gestures at near, medium, and far distances in front of the web camera were tested. Experimental results show that the accuracy is 94.1% on average and the processing time is 3.81 ms per frame. These results demonstrate the feasibility of the proposed system.
... Adaptive skin segmentation is implemented using a procedure similar to the one described in [10]. The central idea is to use the skin color distribution in a perceived face to build a specific skin model. ...
Conference Paper
Full-text available
In this article a robust and real-time hand gesture detection and recognition system for dynamic environments is proposed. The system is based on the use of boosted classifiers for the detection of hands and the recognition of gestures, together with the use of skin segmentation and hand tracking procedures. The main novelty of the proposed approach is the use of innovative training techniques - active learning and bootstrap -, which allow obtaining a much better performance than similar boosting-based systems, in terms of detection rate, number of false positives and processing time. In addition, the robustness of the system is increased due to the use of an adaptive skin model, a color-based hand tracking, and a multi-gesture classification tree. The system performance is validated in real video sequences.
... [10] proposed a skin segmentation method in YCbCr space, applying Bayesian decision rules. A face detector is used in [11] to generate a skin model and then applied to images to detect skin. [1] predicted changes of skin color during tracking with a second order Markov model. ...
Conference Paper
Full-text available
In this paper, we present a novel algorithm to detect homogeneous color regions in images. We show its performance by applying it to skin detection. In contrast to previously presented methods, we use only a rough skin direction vector instead of a static skin model as a priori knowledge. Thus, higher robustness is achieved in images captured under unconstrained conditions. We formulate the segmentation as a clustering problem in color space. A homogeneous color region in image space is modeled using a 3D gaussian distribution. Parameters of the gaussians are estimated using the EM algorithm with spatial constraints. We transform the image by a whitening transform and then apply a fuzzy k-means algorithm to the hue value in order to obtain initialization parameters for the EM algorithm. A divisive hierarchical approach is used to determine the number of clusters. The stopping criterion for further subdivision is based on the edge image. For evaluation, the proposed method is applied to skin segmentation and compared with a well known method.