Figure 5 - uploaded by Jin Zheng
The trained filters of the first convolutional layer in the CNN model used by the proposed method for face detection. 

Source publication
Article
Full-text available
Current face or object detection methods via convolutional neural network (such as OverFeat, R-CNN and DenseNet) explicitly extract multi-scale features based on an image pyramid. However, such a strategy increases the computational burden for face detection. In this paper, we propose a fast face detection method based on discriminative complete fe...

Contexts in source publication

Context 1
... for training: To balance the efficiency and effectiveness of the proposed face detection method, we design a lightweight CNN structure, shown in Table 1. Layer1 and Layer4 are convolutional layers with a filter size of 5 × 5 and 16 output feature maps; Layer2 and Layer6 are max-pooling layers with a filter size of 2 × 2; Layer3 and Layer5 are the local contrast normalization layers defined in Eq. 3; Layer7 is a fully connected layer. The hyperparameters used for training are given in Table 2, where epsW and epsB denote the learning rates for the weight vector and the bias in the CNN model, respectively; momW and momB denote the momentum of the weight vector and of the bias, respectively; wc denotes the weight decay parameter. In the training process, we use 16 batches of images for training and 3 batches of images for validation. The whole training process takes around one hour on a computer equipped with a GTX-780Ti GPU. The trained CNN achieves a test precision of 99.74% on the two validation batches. Given H[w] = 0.0008, two CNN models (calculated by Eq. (10)) are trained, and the filters at the convolution layers in one model are shown in Fig. 5. Figure 6: One fragment of DCFs at the sixth layer. The number of feature maps in one fragment is equal to the number of channels at the sixth layer in Table ...
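The layer stack described above (conv 5 × 5 → pool 2 × 2 → LCN → conv 5 × 5 → LCN → pool 2 × 2 → fully connected) can be sanity-checked with a short sketch that propagates feature-map sizes through the network. The 32 × 32 input size is an illustrative assumption, not a figure from the paper, and 'valid' convolutions with stride 1 and non-overlapping pooling are assumed:

```python
def conv_out(size, k):
    # 'valid' convolution: no padding, stride 1
    return size - k + 1

def pool_out(size, k):
    # non-overlapping max pooling, stride k
    return size // k

def cnn_shapes(input_size):
    """Propagate spatial size through the 7-layer stack described in the text."""
    s = input_size
    trace = [("input", s, 1)]
    s = conv_out(s, 5); trace.append(("Layer1 conv 5x5", s, 16))
    s = pool_out(s, 2); trace.append(("Layer2 maxpool 2x2", s, 16))
    trace.append(("Layer3 LCN", s, 16))   # normalization preserves the size
    s = conv_out(s, 5); trace.append(("Layer4 conv 5x5", s, 16))
    trace.append(("Layer5 LCN", s, 16))
    s = pool_out(s, 2); trace.append(("Layer6 maxpool 2x2", s, 16))
    # the fully connected layer flattens the remaining s x s x 16 maps
    trace.append(("Layer7 fully connected", 1, s * s * 16))
    return trace

for name, size, channels in cnn_shapes(32):
    print(f"{name}: {size}x{size} x {channels}")
```

For a 32 × 32 input this yields 5 × 5 × 16 = 400 features entering the fully connected layer; the actual input resolution used by the authors is given in Table 1, not here.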

Similar publications

Article
Full-text available
Vision-based object detection is an essential component of autonomous driving. Because vehicles typically have limited on-board computing resources, a small-sized detection model is required. Simultaneously, high object detection accuracy and real-time inference detection speeds are required to ensure safety while driving. In this paper, an anchor-...
Article
Full-text available
In the field of object detection, research on detecting small faces is the most extensive; however, when objects in an image differ markedly in scale, detection performance suffers, owing to the scale-invariance properties of deep convolutional neural networks. Although in recent years, there have...
Article
Full-text available
The detection of arbitrary-oriented and multi-scale objects in satellite optical imagery is an important task in remote sensing and computer vision. Despite significant research efforts, such detection remains largely unsolved due to the diversity of patterns in orientation, scale, aspect ratio, and visual appearance; the dense distribution of obje...
Article
Full-text available
The configuration of the detection head has a significant impact on detection performance. However, when the input resolution or detection scene changes, there is not a clear method for quantitatively and efficiently configuring the detection head. We find that there is a rule of matching degrees between the object scale and the detection head acro...
Article
Full-text available
This study aimed to address the problems of low detection accuracy and inaccurate positioning of small-object detection in remote sensing images. An improved architecture based on the Swin Transformer and YOLOv5 is proposed. First, Complete-IOU (CIOU) was introduced to improve the K-means clustering algorithm, and then an anchor of appropriate size...

Citations

... Such an analysis of a person's emotional state is especially useful when organizing the educational process, for example, for assessing students' concentration and attention. With CNNs it is possible to recognize not only faces with high accuracy [9,10] but also masks on faces. However, this requires extensive training on appropriate datasets. ...
Article
Full-text available
This work presents a software implementation of a face mask recognition system using the Viola-Jones method and fuzzy logic. The initial images are read from digital video cameras or from graphic files. The positions of the face, eyes, and mouth in the images are detected using appropriate Haar cascades. The confidence of detecting a face and its features is determined from the configured parameters of the Haar cascades. Face recognition in the image is performed from the results of face and eye detection by means of fuzzy logic using a Mamdani knowledge base. Fuzzy sets are described by triangular membership functions. Face mask recognition is performed from the results of face recognition and mouth detection, again by fuzzy logic with a Mamdani knowledge base. Jointly considering the results of different Haar cascades in detecting the face, eyes, and mouth increased the accuracy of face and face mask recognition. The system was implemented in Python using the OpenCV and Scikit-Fuzzy libraries and the Google Colab cloud platform. The developed recognition system allows monitoring for people without masks in vehicles, in the premises of educational institutions, shopping centers, etc. In educational institutions, a face mask recognition system can also be useful for determining the number of people in a room and for analyzing their behavior.
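The fuzzy-logic stage described above can be illustrated with a minimal Mamdani-style sketch built on triangular membership functions. The variable names, universes, and the single rule below are illustrative assumptions, not the paper's actual knowledge base (the authors used Scikit-Fuzzy):

```python
def tri(x, a, b, c):
    """Triangular membership function with feet a, c and peak b."""
    if x <= a or x >= c:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

def face_confidence(face_score, eye_score):
    """Toy Mamdani-style rule: IF face detection is high AND eye detection
    is high THEN face confidence is high. AND is the minimum of the two
    memberships; the rule's firing strength serves as the output here."""
    face_high = tri(face_score, 0.5, 1.0, 1.5)  # 'high' membership over [0, 1]
    eye_high = tri(eye_score, 0.5, 1.0, 1.5)
    return min(face_high, eye_high)             # Mamdani AND = minimum

print(face_confidence(0.9, 0.7))
```

A full Mamdani system would aggregate several such rules and defuzzify (e.g., by centroid); this sketch only shows the triangular-membership and min-AND mechanics the abstract refers to.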
... In their paper, Guo, Wang, Yan, Zheng, and Li [1] presented a fast face detection technique based on discriminative complete features (DCFs) extracted by a specially constructed convolutional neural network (CNN). This method classifies DCFs directly, which leads to significant gains in detection efficiency over existing CNN-based face detection approaches. ...
Article
The widespread adoption of drones is attributable to their low cost and convenience, which has driven their growing use for surveillance and, in turn, in many other areas. Even so, maximizing what drones deliver while minimizing their cost remains a challenge. This paper provides a comprehensive methodology that improves the efficiency of drone surveillance at lower cost through contemporary image processing technology. Our plan comprises RRDB-ESRGAN for image enhancement, face recognition for authentication, YOLO for object detection, and streamlined data collection and processing. By enhancing image quality, implementing secure access through facial recognition, and facilitating real-time object detection, our system seeks to maximize drone surveillance, improving both efficiency and accuracy. The findings of this study demonstrate that less expensive drone systems, improved with more advanced image processing algorithms, can enhance security and surveillance capabilities across a variety of fields.
... One of the classical and most used face detection methods is the boosted cascade of weak classifiers proposed by Viola and Jones [60]. Deep learning-based methods have also been developed for face detection, such as the cascaded convolutional neural network (CNN) [61] and the discriminative complete feature-based CNN [62]. Face alignment rotates and frontalizes the detected faces to ensure the in-plane consistency of different faces. ...
... Finally, it is worth noting that face recognition in real-world applications, such as those for security, biometric authentication, marketing, and healthcare, is much more challenging than in closely controlled experiments. Although the studies summarized in this review offer exciting insights for basic research at the intersection of neuroscience and artificial intelligence, the difficulty often lies in their practical implementation (for more technical perspectives, see Guo et al., 2020; Li et al., 2015). ...
Preprint
Full-text available
Deep convolutional neural networks (DCNNs) have become the state-of-the-art computational models of biological object recognition. Their remarkable success has helped vision science break new ground and recent efforts have started to transfer this achievement to research on biological face recognition. In this regard, face detection can be investigated by comparing face-selective biological neurons and brain areas to artificial neurons and model layers. Similarly, face identification can be examined by comparing in vivo and in silico multidimensional "face spaces". In this review, we summarize the first studies that use DCNNs to model biological face recognition. On the basis of a broad spectrum of behavioral and computational evidence, we conclude that DCNNs are useful models that closely resemble the general hierarchical organization of face recognition in the ventral visual pathway and the core face network. In two exemplary spotlights, we emphasize the unique scientific contributions of these models. First, studies on face detection in DCNNs indicate that elementary face selectivity emerges automatically through feedforward processing even in the absence of visual experience. Second, studies on face identification in DCNNs suggest that identity-specific experience and generative mechanisms facilitate this particular challenge. Taken together, as this novel modeling approach enables close control of predisposition (i.e., architecture) and experience (i.e., training data), it may be suited to inform long-standing debates on the substrates of biological face recognition.
... Finally, it is worth noting that face recognition in realworld applications, such as those for security, biometric authentication, marketing, and healthcare, is much more challenging than in closely controlled experiments. Although the studies summarized in this review offer exciting insights for basic research at the intersection of neuroscience and artificial intelligence, the difficulty often lies in their practical implementation (for more technical perspectives, see Guo, Wang, Yan, Zheng, & Li, 2020;Li, Lin, Shen, Brandt, & Hua, 2015). ...
Article
Full-text available
Deep convolutional neural networks (DCNNs) have become the state-of-the-art computational models of biological object recognition. Their remarkable success has helped vision science break new ground, and recent efforts have started to transfer this achievement to research on biological face recognition. In this regard, face detection can be investigated by comparing face-selective biological neurons and brain areas to artificial neurons and model layers. Similarly, face identification can be examined by comparing in vivo and in silico multidimensional "face spaces." In this review, we summarize the first studies that use DCNNs to model biological face recognition. On the basis of a broad spectrum of behavioral and computational evidence, we conclude that DCNNs are useful models that closely resemble the general hierarchical organization of face recognition in the ventral visual pathway and the core face network. In two exemplary spotlights, we emphasize the unique scientific contributions of these models. First, studies on face detection in DCNNs indicate that elementary face selectivity emerges automatically through feedforward processing even in the absence of visual experience. Second, studies on face identification in DCNNs suggest that identity-specific experience and generative mechanisms facilitate this particular challenge. Taken together, as this novel modeling approach enables close control of predisposition (i.e., architecture) and experience (i.e., training data), it may be suited to inform long-standing debates on the substrates of biological face recognition.
... The proposed model was tested on several datasets and achieved good results. Guo et al. (2020) [20] proposed a method for face detection based on complete discriminative features associated with a CNN. This approach is useful for face detection and had outstanding results. ...
Article
Full-text available
A new artificial intelligence-based approach is proposed by developing a deep learning (DL) model for identifying the people who violate the face mask protocol in public places. To achieve this goal, a private dataset was created, including different face images with and without masks. The proposed model was trained to detect face masks from real-time surveillance videos. The proposed face mask detection (FMDNet) model achieved a promising detection of 99.0% in terms of accuracy for identifying violations (no face mask) in public places. The model presented a better detection capability compared to other recent DL models such as FSA-Net, MobileNet V2, and ResNet by 24.03%, 5.0%, and 24.10%, respectively. Meanwhile, the model is lightweight and had a confidence score of 99.0% in a resource-constrained environment. The model can perform the detection task in real-time environments at 41.72 frames per second (FPS). Thus, the developed model can be applicable and useful for governments to maintain the rules of the SOP protocol.
... The author intends to leverage the capabilities of convolutional neural networks (CNNs) (Guo, G., 2020) in this study. CNNs are a class of deep neural networks applied mostly to analyzing visual imagery. ...
... The eighth study (Yu, B., 2018) concluded that anchor-based face detectors and CNN-based cascade face detectors can improve the accuracy of face detection but did not use the basic features of the T and U areas in detecting it. The ninth study (Guo, G., 2020) proposed a fast face detection method based on DCFs extracted by an elaborately designed CNN. Though that study confirmed that a CNN-based detector can achieve promising performance on several face detection datasets, it did not use the basic features of the T and U areas in detecting it. ...
Article
Every organism possesses a unique structural makeup, extending from molecules to organ systems. One such organ, the skin, is our body's largest and most complex. Its diversity in color corresponds to various human races, although facial skin type, an often-overlooked factor, also plays a significant role in race identification. In this research, a system was developed to recognize racial classifications from facial skin types by focusing on features in the facial T and U areas, using convolutional neural network (CNN) and Haar cascade methodologies. CNN was employed for its convolution process, which moves a kernel across an image and multiplies it with the applied filter, thereby generating new representative information; it is especially effective in image recognition and processing. The Haar cascade method, on the other hand, was used to outline the T and U areas on the face for the skin type detection system. The T area, known for oil detection, identifies skin types, while the U area identifies race types by forming facial patterns. This system, trained on 1670 race and 60 skin type datasets and optimized using the Adam optimizer, exhibited high accuracy levels. Upon testing with five new samples, it demonstrated an average accuracy of 98% in race detection and 97% in skin type detection.
... Image auto-rotation was accomplished by running a convolutional neural network (CNN) face detector on each of the four possible 90-degree rotations [10,11]. The rotation with the detected face was deemed the correct rotation. ...
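The auto-rotation strategy cited here (try all four 90-degree rotations, keep the one in which a face is found) can be sketched as follows. `detect_faces` and `rotate` are stand-ins for any face detector and any image-rotation routine, not calls from a specific library:

```python
def best_rotation(image, detect_faces, rotate):
    """Try all four 90-degree rotations of `image` and return the angle
    whose rotated image yields the most detected faces (ties broken in
    favor of the smaller angle)."""
    best_angle, best_count = 0, -1
    for angle in (0, 90, 180, 270):
        count = len(detect_faces(rotate(image, angle)))
        if count > best_count:
            best_angle, best_count = angle, count
    return best_angle
```

In practice `detect_faces` would be the CNN face detector from the cited work and `rotate` an image-library call; any detector returning a list of detections plugs in unchanged.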
Article
Full-text available
We describe implementation of a point-of-care system for simultaneous acquisition of patient photographs along with portable radiographs at a large academic hospital. During the implementation process, we observed several technical challenges in the areas of (1) hardware—automatic triggering for photograph acquisition, camera hardware enclosure, networking, and system server hardware and (2) software—post-processing of photographs. Additionally, we also faced cultural challenges involving workflow issues, communication with technologists and users, and system maintenance. We describe our solutions to address these challenges. We anticipate that these experiences will provide useful insights into deploying and iterating new technologies in imaging informatics.
... In general, the CNN model adheres to the architecture depicted in Figure 1. According to various studies, the advantages of CNNs include the capacity to increase the speed of the face detection process [11], [12]. The CNN method was used to implement face detection with multiple layers acting as a feature extractor to automatically obtain distinctive features [13]. ...
Article
Full-text available
Purpose: The identification and selection of food to be consumed are critical in determining the health quality of human life. Our diet and the illnesses we develop are closely linked. Public awareness of the significance of food quality has increased due to the rising prevalence of degenerative diseases such as obesity, heart disease, type 2 diabetes, hypertension, and cancer. This study aims to develop a model for food identification and to identify features that can aid in food identification. Methods: This study employs the convolutional neural network (CNN) approach, which identifies food objects or images based on detected features. Images of thirty-five different types of traditional, processed, and western foods were gathered as the study's input data. The image data for each type of food was repeated 100 times to produce a total of 3500 images. Each food image is described by its color, shape, and texture information. The hue, saturation, and value (HSV) extraction method for color features, the Canny extraction method for shape features, and the gray-level co-occurrence matrix (GLCM) method for texture features were used, in that sequence, to evaluate the data in addition to the CNN classification method. Results: The simulation results show that the classification model's accuracy and precision are 76% and 78%, respectively, when the CNN approach is used alone without the extraction methods. The CNN classification model with HSV color extraction yielded an accuracy and precision of 51% and 55%, respectively. The CNN classification model with the Canny shape extraction method has an accuracy and precision of 20% and 20%, respectively, while the combined CNN and GLCM extraction methods achieve 67% and 69%, respectively.
According to the simulation results, the food classification and identification model that uses the CNN approach without the HSV, Canny, and GLCM feature extraction methods produces better accuracy and precision. Novelty: This research has the potential to be used in a variety of food identification applications, such as food and nutrition service systems, and to improve product quality in the food and beverage industry.
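The GLCM texture descriptor mentioned above counts how often pairs of gray levels co-occur at a fixed pixel offset. A minimal NumPy sketch for the horizontal (0, 1) offset, using a tiny hypothetical image rather than the study's data:

```python
import numpy as np

def glcm(image, levels):
    """Gray-level co-occurrence matrix for the horizontal (0, 1) offset:
    m[i, j] counts how often gray level i appears immediately to the
    left of gray level j."""
    m = np.zeros((levels, levels), dtype=int)
    for i, j in zip(image[:, :-1].ravel(), image[:, 1:].ravel()):
        m[i, j] += 1
    return m

img = np.array([[0, 0, 1],
                [1, 2, 2]])
print(glcm(img, 3))
```

Library implementations (e.g., scikit-image's `graycomatrix`) additionally support multiple distances and angles, symmetry, and normalization; texture statistics such as contrast and homogeneity are then computed from the matrix.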
... Object detection is one of the main tasks in the field of computer vision. Its main purpose is to quickly and accurately locate and classify various objects in images, and it has been used in various applications, e.g., unmanned driving [1], facial recognition [2], intelligent security [3], and medical imaging [4]. The main aspects of traditional object detection lie in image preprocessing, feature extraction and classification and recognition [5][6][7]. ...
Article
Full-text available
Due to the limited feature information possessed by small objects in images, it is difficult for a single-shot multibox detector (SSD) to quickly notice the important regions of these small image objects. We propose an enhanced SSD based on feature cross-reinforcement (FCR-SSD). For shallow sampling, an improved group shuffling-efficient channel attention (GS-ECA) mechanism makes the model focus on object areas rather than the background. An FCR module then passes multiscale information from the shallow layer to the subsequent layer, where it is fused to generate an enhanced feature map, improving the utilization of the context information associated with small objects. We develop an adaptive algorithm that selects positive and negative samples by computing intersection-over-union (IoU) thresholds between candidate boxes and ground-truth boxes, adaptively determining the threshold for each ground-truth box. The proposed FCR-SSD algorithm achieves 79.6% mean average precision (mAP) on the PASCAL VOC 2007 dataset and 30.1% mAP on the MS COCO dataset at 34.2 frames per second (FPS) when run on an RTX 3080Ti GPU. The experimental results show that the FCR-SSD model yields high accuracy and a good detection speed in small-object detection tasks.
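The IoU matching at the heart of the sample-selection scheme above is the standard box-overlap ratio. A minimal sketch, with boxes represented as (x1, y1, x2, y2) corner coordinates (the paper's adaptive per-ground-truth thresholding is not reproduced here):

```python
def iou(a, b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))  # overlap width
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))  # overlap height
    inter = ix * iy
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

print(iou((0, 0, 2, 2), (1, 1, 3, 3)))
```

Anchor assignment then labels a candidate box positive when its IoU with some ground-truth box exceeds the (here, adaptively chosen) threshold and negative when it falls below it.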