Figure 5 - uploaded by Jin Zheng
The trained filters of the first convolutional layer in the CNN model used by the proposed method for face detection. 

Source publication
Article
Full-text available
Current face or object detection methods via convolutional neural network (such as OverFeat, R-CNN and DenseNet) explicitly extract multi-scale features based on an image pyramid. However, such a strategy increases the computational burden for face detection. In this paper, we propose a fast face detection method based on discriminative complete fe...

Contexts in source publication

Context 1
... for training: To balance the efficiency and effectiveness of the proposed face detection method, we design a lightweight CNN structure, shown in Table 1. Layer1 and Layer4 are convolutional layers with a filter size of 5 × 5 and 16 output feature maps; Layer2 and Layer6 are max-pooling layers with a filter size of 2 × 2; Layer3 and Layer5 are the local contrast normalization layers defined in Eq. 3; Layer7 is a fully connected layer. The hyperparameters used for training are given in Table 2, where epsW and epsB denote the learning rates for the weight vector and the bias in the CNN model, respectively; momW and momB denote the momentum of the weight vector and of the bias, respectively; wc denotes the weight decay parameter. In the training process, we use 16 batches of images for training and 3 batches of images for validation. The whole training process takes around one hour on a computer equipped with a GTX-780Ti GPU. The trained CNN achieves a test precision of 99.74% on the two validation batches. Given H[w] = 0.0008, two CNN models (calculated by Eq. (10)) are trained, and the filters at the convolution layers in one model are shown in Fig. 5. Figure 6: One fragment of DCFs at the sixth layer. The number of feature maps in one fragment is equal to the number of channels at the sixth layer in Table ...
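The layer stack described above (conv 5 × 5 → pool 2 × 2 → LCN → conv 5 × 5 → LCN → pool 2 × 2 → fully connected) can be sanity-checked with a short sketch that propagates feature-map sizes through the network. The 32 × 32 input size is an illustrative assumption, not a figure from the paper, and 'valid' convolutions with stride 1 and non-overlapping pooling are assumed:

```python
def conv_out(size, k):
    # 'valid' convolution: no padding, stride 1
    return size - k + 1

def pool_out(size, k):
    # non-overlapping max pooling, stride k
    return size // k

def cnn_shapes(input_size):
    """Propagate spatial size through the 7-layer stack described in the text."""
    s = input_size
    trace = [("input", s, 1)]
    s = conv_out(s, 5); trace.append(("Layer1 conv 5x5", s, 16))
    s = pool_out(s, 2); trace.append(("Layer2 maxpool 2x2", s, 16))
    trace.append(("Layer3 LCN", s, 16))   # normalization preserves the size
    s = conv_out(s, 5); trace.append(("Layer4 conv 5x5", s, 16))
    trace.append(("Layer5 LCN", s, 16))
    s = pool_out(s, 2); trace.append(("Layer6 maxpool 2x2", s, 16))
    # the fully connected layer flattens the remaining s x s x 16 maps
    trace.append(("Layer7 fully connected", 1, s * s * 16))
    return trace

for name, size, channels in cnn_shapes(32):
    print(f"{name}: {size}x{size} x {channels}")
```

For a 32 × 32 input this yields 5 × 5 × 16 = 400 features entering the fully connected layer; the actual input resolution used by the authors is given in Table 1, not here.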

Similar publications

Article
Full-text available
Vision-based object detection is an essential component of autonomous driving. Because vehicles typically have limited on-board computing resources, a small-sized detection model is required. Simultaneously, high object detection accuracy and real-time inference detection speeds are required to ensure safety while driving. In this paper, an anchor-...
Article
Full-text available
In the field of object detection, research on detecting small faces is the most extensive; however, when objects in an image differ markedly in scale, detection performance suffers, owing to the scale-invariance properties of deep convolutional neural networks. Although in recent years, there have...
Article
Full-text available
The detection of arbitrary-oriented and multi-scale objects in satellite optical imagery is an important task in remote sensing and computer vision. Despite significant research efforts, such detection remains largely unsolved due to the diversity of patterns in orientation, scale, aspect ratio, and visual appearance; the dense distribution of obje...
Article
Full-text available
The configuration of the detection head has a significant impact on detection performance. However, when the input resolution or detection scene changes, there is not a clear method for quantitatively and efficiently configuring the detection head. We find that there is a rule of matching degrees between the object scale and the detection head acro...
Article
Full-text available
This study aimed to address the problems of low detection accuracy and inaccurate positioning of small-object detection in remote sensing images. An improved architecture based on the Swin Transformer and YOLOv5 is proposed. First, Complete-IOU (CIOU) was introduced to improve the K-means clustering algorithm, and then an anchor of appropriate size...

Citations

... Such an analysis of a person's emotional state is especially useful when organizing the educational process, for example, for assessing students' concentration and attention. With CNNs it is possible to recognize not only faces with high accuracy [9,10] but also masks on faces. However, this requires extensive training on appropriate datasets. ...
Article
Full-text available
This work presents a software implementation of a face mask recognition system using the Viola-Jones method and fuzzy logic. The initial images are read from digital video cameras or from graphic files. The positions of the face, eyes, and mouth in the images are detected using appropriate Haar cascades. The confidence of detecting a face and its features is determined from the configured parameters of the Haar cascades. Face recognition in the image is performed from the results of face and eye detection by means of fuzzy logic using a Mamdani knowledge base. Fuzzy sets are described by triangular membership functions. Face mask recognition is performed from the results of face recognition and mouth detection, again by fuzzy logic with a Mamdani knowledge base. Jointly considering the results of different Haar cascades in detecting the face, eyes, and mouth increased the accuracy of face and face mask recognition. The system was implemented in Python using the OpenCV and Scikit-Fuzzy libraries and the Google Colab cloud platform. The developed recognition system allows monitoring for people without masks in vehicles, in the premises of educational institutions, shopping centers, etc. In educational institutions, a face mask recognition system can also be useful for determining the number of people in a room and for analyzing their behavior.
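The fuzzy-logic stage described above can be illustrated with a minimal Mamdani-style sketch built on triangular membership functions. The variable names, universes, and the single rule below are illustrative assumptions, not the paper's actual knowledge base (the authors used Scikit-Fuzzy):

```python
def tri(x, a, b, c):
    """Triangular membership function with feet a, c and peak b."""
    if x <= a or x >= c:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

def face_confidence(face_score, eye_score):
    """Toy Mamdani-style rule: IF face detection is high AND eye detection
    is high THEN face confidence is high. AND is the minimum of the two
    memberships; the rule's firing strength serves as the output here."""
    face_high = tri(face_score, 0.5, 1.0, 1.5)  # 'high' membership over [0, 1]
    eye_high = tri(eye_score, 0.5, 1.0, 1.5)
    return min(face_high, eye_high)             # Mamdani AND = minimum

print(face_confidence(0.9, 0.7))
```

A full Mamdani system would aggregate several such rules and defuzzify (e.g., by centroid); this sketch only shows the triangular-membership and min-AND mechanics the abstract refers to.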
... In their paper, Guo, Wang, Yan, Zheng, and Li [1] presented a fast face detection technique based on discriminative complete features (DCFs) extracted by a specially constructed convolutional neural network (CNN). This method classifies DCFs directly, which leads to significant gains in detection efficiency over existing CNN-based face detection approaches. ...
Article
The widespread adoption of drones is attributable to their low cost and convenience, which has driven their growing use for surveillance and, in turn, in many other areas. Even so, maximizing what drones deliver while minimizing their cost remains a challenge. This paper provides a comprehensive methodology that improves the efficiency of drone surveillance at lower cost through contemporary image processing technology. Our plan comprises RRDB-ESRGAN for image enhancement, face recognition for authentication, YOLO for object detection, and streamlined data collection and processing. By enhancing image quality, implementing secure access through facial recognition, and facilitating real-time object detection, our system seeks to maximize drone surveillance, improving both efficiency and accuracy. The findings of this study demonstrate that less expensive drone systems, improved with more advanced image processing algorithms, can enhance security and surveillance capabilities across a variety of fields.
... One of the classical and most used face detection methods is the boosted cascade of weak classifiers proposed by Viola and Jones [60]. Deep learning-based methods have also been developed for face detection, such as the cascaded convolutional neural network (CNN) [61] and the discriminative complete feature-based CNN [62]. Face alignment rotates and frontalizes the detected faces to ensure the in-plane consistency of different faces. ...
... Finally, it is worth noting that face recognition in real-world applications, such as those for security, biometric authentication, marketing, and healthcare, is much more challenging than in closely controlled experiments. Although the studies summarized in this review offer exciting insights for basic research at the intersection of neuroscience and artificial intelligence, the difficulty often lies in their practical implementation (for more technical perspectives, see Guo et al., 2020; Li et al., 2015). ...
Preprint
Full-text available
Deep convolutional neural networks (DCNNs) have become the state-of-the-art computational models of biological object recognition. Their remarkable success has helped vision science break new ground and recent efforts have started to transfer this achievement to research on biological face recognition. In this regard, face detection can be investigated by comparing face-selective biological neurons and brain areas to artificial neurons and model layers. Similarly, face identification can be examined by comparing in vivo and in silico multidimensional "face spaces". In this review, we summarize the first studies that use DCNNs to model biological face recognition. On the basis of a broad spectrum of behavioral and computational evidence, we conclude that DCNNs are useful models that closely resemble the general hierarchical organization of face recognition in the ventral visual pathway and the core face network. In two exemplary spotlights, we emphasize the unique scientific contributions of these models. First, studies on face detection in DCNNs indicate that elementary face selectivity emerges automatically through feedforward processing even in the absence of visual experience. Second, studies on face identification in DCNNs suggest that identity-specific experience and generative mechanisms facilitate this particular challenge. Taken together, as this novel modeling approach enables close control of predisposition (i.e., architecture) and experience (i.e., training data), it may be suited to inform long-standing debates on the substrates of biological face recognition.
... Finally, it is worth noting that face recognition in realworld applications, such as those for security, biometric authentication, marketing, and healthcare, is much more challenging than in closely controlled experiments. Although the studies summarized in this review offer exciting insights for basic research at the intersection of neuroscience and artificial intelligence, the difficulty often lies in their practical implementation (for more technical perspectives, see Guo, Wang, Yan, Zheng, & Li, 2020;Li, Lin, Shen, Brandt, & Hua, 2015). ...
Article
Full-text available
Deep convolutional neural networks (DCNNs) have become the state-of-the-art computational models of biological object recognition. Their remarkable success has helped vision science break new ground, and recent efforts have started to transfer this achievement to research on biological face recognition. In this regard, face detection can be investigated by comparing face-selective biological neurons and brain areas to artificial neurons and model layers. Similarly, face identification can be examined by comparing in vivo and in silico multidimensional "face spaces." In this review, we summarize the first studies that use DCNNs to model biological face recognition. On the basis of a broad spectrum of behavioral and computational evidence, we conclude that DCNNs are useful models that closely resemble the general hierarchical organization of face recognition in the ventral visual pathway and the core face network. In two exemplary spotlights, we emphasize the unique scientific contributions of these models. First, studies on face detection in DCNNs indicate that elementary face selectivity emerges automatically through feedforward processing even in the absence of visual experience. Second, studies on face identification in DCNNs suggest that identity-specific experience and generative mechanisms facilitate this particular challenge. Taken together, as this novel modeling approach enables close control of predisposition (i.e., architecture) and experience (i.e., training data), it may be suited to inform long-standing debates on the substrates of biological face recognition.
... The proposed model was tested on several datasets and achieved good results. Guo et al. (2020) [20] proposed a method for face detection based on complete discriminative features associated with a CNN. This approach is useful for face detection and had outstanding results. ...
Article
Full-text available
A new artificial intelligence-based approach is proposed by developing a deep learning (DL) model for identifying the people who violate the face mask protocol in public places. To achieve this goal, a private dataset was created, including different face images with and without masks. The proposed model was trained to detect face masks from real-time surveillance videos. The proposed face mask detection (FMDNet) model achieved a promising detection of 99.0% in terms of accuracy for identifying violations (no face mask) in public places. The model presented a better detection capability compared to other recent DL models such as FSA-Net, MobileNet V2, and ResNet by 24.03%, 5.0%, and 24.10%, respectively. Meanwhile, the model is lightweight and had a confidence score of 99.0% in a resource-constrained environment. The model can perform the detection task in real-time environments at 41.72 frames per second (FPS). Thus, the developed model can be applicable and useful for governments to maintain the rules of the SOP protocol.
... The author intends to leverage the capabilities of convolutional neural networks (CNNs) (Guo, G., 2020) in this study. CNNs are a class of deep neural networks applied mostly to analyzing visual imagery. ...
... The eighth study (Yu, B., 2018) concluded that anchor-based face detectors and CNN-based cascade face detectors can improve the accuracy of face detection but did not use the basic features of the T and U areas in detecting it. The ninth study (Guo, G., 2020) proposed a fast face detection method based on DCFs extracted by an elaborately designed CNN. Though that study confirmed that a CNN-based detector can achieve promising performance on several face detection datasets, it did not use the basic features of the T and U areas in detecting it. ...
Article
Every organism possesses a unique structural makeup, extending from molecules to organ systems. One such organ, the skin, is our body's largest and most complex. Its diversity in color corresponds to various human races, although facial skin type, an often-overlooked factor, also plays a significant role in race identification. In this research, a system was developed to recognize racial classifications from facial skin types by focusing on features in the facial T and U areas, using convolutional neural network (CNN) and Haar cascade methodologies. CNN was employed for its convolution process, which moves a kernel across an image and multiplies it with the applied filter, thereby generating new representative information; it is especially effective in image recognition and processing. The Haar cascade method, on the other hand, was used to outline the T and U areas on the face for the skin type detection system. The T area, known for oil detection, identifies skin types, while the U area identifies race types by forming facial patterns. This system, trained on 1670 race and 60 skin type datasets and optimized using the Adam optimizer, exhibited high accuracy levels. Upon testing with five new samples, it demonstrated an average accuracy of 98% in race detection and 97% in skin type detection.
... Image auto-rotation was accomplished by running a convolutional neural network (CNN) face detector on each of the four possible 90-degree rotations [10,11]. The rotation with the detected face was deemed the correct rotation. ...
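The auto-rotation strategy cited here (try all four 90-degree rotations, keep the one in which a face is found) can be sketched as follows. `detect_faces` and `rotate` are stand-ins for any face detector and any image-rotation routine, not calls from a specific library:

```python
def best_rotation(image, detect_faces, rotate):
    """Try all four 90-degree rotations of `image` and return the angle
    whose rotated image yields the most detected faces (ties broken in
    favor of the smaller angle)."""
    best_angle, best_count = 0, -1
    for angle in (0, 90, 180, 270):
        count = len(detect_faces(rotate(image, angle)))
        if count > best_count:
            best_angle, best_count = angle, count
    return best_angle
```

In practice `detect_faces` would be the CNN face detector from the cited work and `rotate` an image-library call; any detector returning a list of detections plugs in unchanged.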
Article
Full-text available
We describe implementation of a point-of-care system for simultaneous acquisition of patient photographs along with portable radiographs at a large academic hospital. During the implementation process, we observed several technical challenges in the areas of (1) hardware—automatic triggering for photograph acquisition, camera hardware enclosure, networking, and system server hardware and (2) software—post-processing of photographs. Additionally, we also faced cultural challenges involving workflow issues, communication with technologists and users, and system maintenance. We describe our solutions to address these challenges. We anticipate that these experiences will provide useful insights into deploying and iterating new technologies in imaging informatics.
... In general, the CNN model adheres to the architecture depicted in Figure 1. According to various studies, the advantages of CNNs include the capacity to increase the speed of the face detection process [11], [12]. The CNN method was used to implement face detection with multiple layers acting as a feature extractor to automatically obtain distinctive features [13]. ...
Article
Full-text available
Purpose: The identification and selection of food to be consumed are critical in determining the health quality of human life. Our diet and the illnesses we develop are closely linked. Public awareness of the significance of food quality has increased due to the rising prevalence of degenerative diseases such as obesity, heart disease, type 2 diabetes, hypertension, and cancer. This study aims to develop a model for food identification and to identify features that can aid in food identification. Methods: This study employs the convolutional neural network (CNN) approach, which identifies food objects or images based on detected features. Images of thirty-five different types of traditional, processed, and western foods were gathered as the study's input data. The image data for each type of food was repeated 100 times to produce a total of 3500 images. Each food image is described by its color, shape, and texture information. The hue, saturation, and value (HSV) extraction method for color features, the Canny extraction method for shape features, and the gray-level co-occurrence matrix (GLCM) method for texture features were used, in that sequence, to evaluate the data in addition to the CNN classification method. Results: The simulation results show that the classification model's accuracy and precision are 76% and 78%, respectively, when the CNN approach is used alone without the extraction methods. The CNN classification model with HSV color extraction yielded an accuracy and precision of 51% and 55%, respectively. The CNN classification model with the Canny shape extraction method has an accuracy and precision of 20% and 20%, respectively, while the combined CNN and GLCM extraction methods achieve 67% and 69%, respectively.
According to the simulation results, the food classification and identification model that uses the CNN approach without the HSV, Canny, and GLCM feature extraction methods produces better accuracy and precision. Novelty: This research has the potential to be used in a variety of food identification applications, such as food and nutrition service systems, and to improve product quality in the food and beverage industry.
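The GLCM texture descriptor mentioned above counts how often pairs of gray levels co-occur at a fixed pixel offset. A minimal NumPy sketch for the horizontal (0, 1) offset, using a tiny hypothetical image rather than the study's data:

```python
import numpy as np

def glcm(image, levels):
    """Gray-level co-occurrence matrix for the horizontal (0, 1) offset:
    m[i, j] counts how often gray level i appears immediately to the
    left of gray level j."""
    m = np.zeros((levels, levels), dtype=int)
    for i, j in zip(image[:, :-1].ravel(), image[:, 1:].ravel()):
        m[i, j] += 1
    return m

img = np.array([[0, 0, 1],
                [1, 2, 2]])
print(glcm(img, 3))
```

Library implementations (e.g., scikit-image's `graycomatrix`) additionally support multiple distances and angles, symmetry, and normalization; texture statistics such as contrast and homogeneity are then computed from the matrix.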
... Object detection is one of the main tasks in the field of computer vision. Its main purpose is to quickly and accurately locate and classify various objects in images, and it has been used in various applications, e.g., unmanned driving [1], facial recognition [2], intelligent security [3], and medical imaging [4]. The main aspects of traditional object detection lie in image preprocessing, feature extraction and classification and recognition [5][6][7]. ...
Article
Full-text available
Due to the limited feature information possessed by small objects in images, it is difficult for a single-shot multibox detector (SSD) to quickly notice the important regions of these small image objects. We propose an enhanced SSD based on feature cross-reinforcement (FCR-SSD). For shallow sampling, an improved group shuffling-efficient channel attention (GS-ECA) mechanism makes the model focus on object areas rather than the background. An FCR module then passes multiscale information from the shallow layer to the subsequent layer, where it is fused to generate an enhanced feature map, improving the utilization of the context information associated with small objects. We develop an adaptive algorithm that selects positive and negative samples by computing intersection-over-union (IoU) thresholds between candidate boxes and ground-truth boxes, adaptively determining the threshold for each ground-truth box. The proposed FCR-SSD algorithm achieves 79.6% mean average precision (mAP) on the PASCAL VOC 2007 dataset and 30.1% mAP on the MS COCO dataset at 34.2 frames per second (FPS) when run on an RTX 3080Ti GPU. The experimental results show that the FCR-SSD model yields high accuracy and a good detection speed in small-object detection tasks.
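The IoU matching at the heart of the sample-selection scheme above is the standard box-overlap ratio. A minimal sketch, with boxes represented as (x1, y1, x2, y2) corner coordinates (the paper's adaptive per-ground-truth thresholding is not reproduced here):

```python
def iou(a, b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))  # overlap width
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))  # overlap height
    inter = ix * iy
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

print(iou((0, 0, 2, 2), (1, 1, 3, 3)))
```

Anchor assignment then labels a candidate box positive when its IoU with some ground-truth box exceeds the (here, adaptively chosen) threshold and negative when it falls below it.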