Fusion of Multiple Simple Convolutional Neural
Networks for Gender Classification
Nihad A. Abdalrady
Electrical Engineering Department,
Faculty of Engineering
Aswan University, Aswan 81542, Egypt
nihad.abdalrady@eng.aswu.edu.eg
Saleh Aly
Electrical Engineering Department,
Faculty of Engineering
Aswan University, Aswan 81542, Egypt
saleh@aswu.edu.eg
Abstract— Gender classification using face images is one of
the most important and challenging tasks in automated face
analysis, especially in unrestricted scenarios. Gender
classification has become relevant to a growing number of
applications. Nevertheless, the performance of existing methods
on real-world images is still lacking. In this paper, we show that
the performance of a simple convolutional neural network can be
improved by learning multiple representations. We employ a
simple feature fusion method using two simple convolutional
neural network architectures. Our proposed method aims to
replace complex convolutional neural networks with two
simple Principal Component Analysis networks (PCANets) trained
on different patch sizes. In addition, the high-dimensional feature
vector generated by each PCANet is reduced using whitening
PCA. We evaluate our method on Gallagher's database, which is
identified as among the hardest databases for gender
classification. Our approach shows competitive performance
in comparison with state-of-the-art approaches.
Keywords—principal component analysis network; deep
learning; automatic gender classification; whitening PCA; deep
convolutional neural network.
I. INTRODUCTION
Automatic face-based gender classification is an important
and challenging problem in computer vision. It has attracted
many researchers in the last two decades due to its large
number of important applications [1], for example, surveillance
and access control of certain areas, organizing a huge amount
of image and video data, business intelligence approaches, etc.
Face-based gender classification is a challenging problem
because it is affected by variations in many factors such
as pose, illumination, and expression. Many approaches
have been developed to automatically classify gender from face
images, as described in [2]. These approaches can be divided
into appearance-based and feature-based approaches. In
appearance-based approaches, holistic and/or local
features are extracted [3, 4, 5, 6], while in feature-based
approaches geometric features are extracted from the face
image [8, 9, 10].
Recently, deep learning convolutional neural networks have
come to dominate many computer vision applications. In [3] a
Local Deep Neural Network (LDNN) model was developed to recognize
gender; this model was built using feed-forward neural
networks without dropout to extract features from the input
images. First, edges of the face image were detected, and then
small image patches were selected around these edges. The
neural networks were trained with all selected image patches, and
the predictions for all patches of an input test image were
averaged to form the final output. The LDNN model gives accuracies
of 96.25% and 90.58% on the LFW [11] and Gallagher [12]
datasets, respectively. However, this performance depends
on the quality of the edge detector results.
In [4] a modified version of the LDNN model was
proposed. In this model, the facial landmarks were first
detected instead of using fixed patch locations; image
patches around the detected landmarks were then selected to train
the neural network. This approach helps to reduce the training
time and gives 96% accuracy.
In [5] a deep convolutional neural network (CNN) with a
simple architecture was used to estimate age and gender. This
architecture comprised three convolutional layers and two fully
connected layers with a small number of neurons. The network,
trained from scratch and tested on the Adience benchmark,
achieves 86.8% accuracy.
In [6] a fine-tuned convolutional neural network (CNN)
combined with a linear support vector machine classifier [7]
was used to recognize gender. The model was tested on the Adience
and color FERET datasets. The best results were obtained
when applying an oversampling procedure that averages the
class scores of the final classifiers. The achieved
accuracies were 87.2% and 97.3% on the Adience and color
FERET datasets, respectively.
In [8] and [9], Local Binary Patterns (LBP) were used to extract
features from local facial regions, followed by a support
vector machine (SVM) classifier that finds the decision
hyperplane minimizing the expected classification error.
The approach proposed in [8] was applied to the
CAS-PEAL dataset, which contains thousands of samples with
different poses; the obtained accuracy was 96.75%.
In [9], only the discriminative LBP-Histogram (LBPH) bins
were used, producing a recognition performance of
94.81% on the LFW database.
In [10] the performance of periocular gender classification
was compared with state-of-the-art facial gender
classification systems. First, the periocular area was extracted
from the face image; the feature vector was then constructed
using local descriptors and passed to an SVM classifier. The
experiments were carried out on the Gallagher (GROUPS) dataset
using Dago's protocol [13]. Different local descriptors were
evaluated, including Local Binary Patterns (LBP), Histogram of
Oriented Gradients (HOG), Local Ternary Patterns, the Weber
Local Descriptor, and the Local Oriented Statistics Information
Booster. The best accuracy, 83.02%, was obtained with
features extracted from the histogram of oriented gradients.
Although various methods based on hand-crafted or CNN
features have been utilized to solve the gender classification
problem, the high computational resources required by CNN
architectures complicate the training process. The goal of this
paper is to replace the complex structure of recent CNNs with a
simple PCANet [14] trained using an unsupervised learning
algorithm. Two PCANet networks are trained with two
different patch sizes to capture various features from face
images. The dimensionality of the feature vector generated
by each PCANet is reduced using the whitening PCA algorithm.
Performing feature reduction with WPCA has several
advantages: it removes redundancy, reduces the
computational resources required for classification, and hence
improves performance. Features from the two PCANets are
fused to create the final feature vector, and a linear support
vector machine is then utilized to classify the concatenated
feature vector. The performance of the proposed method is
evaluated on Gallagher's (GROUPS) dataset [12], which is among
the most challenging and representative datasets of real-world
settings.
The rest of the paper is organized as follows: Section II
briefly explains the PCANet architecture; Section III presents the
proposed method; Section IV analyzes and evaluates the
experiments and results; and Section V presents the
conclusions and future work.
II. PRINCIPAL COMPONENT ANALYSIS NETWORK (PCANET)
The main objective of deep learning algorithms is to
discover multiple levels of data representation, in which
higher-level features represent more abstract aspects of
the data. The use of convolutional architectures is considered one of
the main reasons for the success of deep learning in image
classification tasks. A typical deep convolutional neural network
(CNN) architecture consists of multiple trainable stages
followed by an output classification layer. Each stage
comprises three layers: a convolutional filter bank layer, a
nonlinear processing layer, and a feature pooling layer.
Recently, a simple convolutional neural network named
PCANet was proposed to solve many image
classification problems efficiently [14]. The goal of PCANet is to
make a deep learning network very simple and easy to
train and adapt to various input data and tasks. To this end,
complex convolutional filters are replaced by a set of
PCA filter banks in each stage; binary hashing serves as
the nonlinear layer; and block-wise histograms form
the pooling layer, which produces the final output features. Fig. 1
illustrates how features are extracted from the input image
using a two-stage PCANet. The final feature vector produced
by the block histogram computation represents
localized spatial information of the face image, which helps to
discriminate between the male and female classes.
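
To make the architecture concrete, the sketch below outlines two-stage PCANet feature extraction in NumPy. It is a minimal illustration of the scheme in [14] rather than a reference implementation: the function names are ours, the zero-padded correlation stands in for the paper's boundary handling, and the hyperparameters follow the notation discussed next.

```python
# A minimal NumPy sketch of two-stage PCANet feature extraction [14].
# Function names and boundary handling are our own illustration.
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def learn_pca_filters(images, k, L):
    """Learn L filters of size k x k as the top-L principal eigenvectors
    of the covariance of mean-removed image patches."""
    patches = []
    for img in images:
        p = sliding_window_view(img, (k, k)).reshape(-1, k * k)
        patches.append(p - p.mean(axis=1, keepdims=True))  # remove patch mean
    X = np.concatenate(patches, axis=0)                    # (num_patches, k*k)
    _, vecs = np.linalg.eigh(X.T @ X)                      # ascending order
    return vecs[:, ::-1][:, :L].T.reshape(L, k, k)         # top-L as filters

def filter_response(img, filt):
    """Same-size filter response via zero padding (correlation; filter
    orientation is immaterial for learned PCA filters)."""
    k = filt.shape[0]
    padded = np.pad(img, k // 2)
    windows = sliding_window_view(padded, (k, k))[: img.shape[0], : img.shape[1]]
    return np.einsum('ijkl,kl->ij', windows, filt)

def pcanet_features(img, W1, W2, block, stride):
    """Two stages of PCA filtering, binary hashing, block-wise histograms."""
    L2 = len(W2)
    feats = []
    for w1 in W1:                                  # stage-1 output maps
        o = filter_response(img, w1)
        maps = [filter_response(o, w2) for w2 in W2]
        # Binarize the L2 stage-2 maps and fuse them into one decimal map
        dec = sum((2 ** l) * (m > 0) for l, m in enumerate(maps))
        # Block-wise histograms with 2^L2 bins form the pooling layer
        blocks = sliding_window_view(dec, (block, block))[::stride, ::stride]
        for b in blocks.reshape(-1, block * block):
            hist, _ = np.histogram(b, bins=2 ** L2, range=(0, 2 ** L2))
            feats.append(hist)
    return np.concatenate(feats)                   # final feature vector
```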
As discussed in [14], the PCANet architecture has mainly
five parameters that affect the network performance: the number
of stages, the filter sizes k1 and k2, the number of filters in
each stage L1 and L2, the block size for local histograms, and
the overlap ratio between blocks. We discuss and study the effect
of these parameters in the experimental section.
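
For reference, these parameters can be grouped into a single configuration. The values below are the ones the experiments in Section IV identify as best on the 61x49 images; the grouping itself is our illustration.

```python
# Illustrative hyperparameter set; values follow Section IV (61x49 images).
PCANET_PARAMS = {
    "num_stages": 2,       # two convolutional stages
    "k1": 5, "k2": 5,      # filter size in each stage
    "L1": 8, "L2": 8,      # number of filters in each stage
    "block_size": 7,       # 7x7 local histogram blocks
    "overlap_ratio": 0.5,  # overlap between neighboring blocks
}
```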
III. THE PROPOSED METHOD
We propose a method that can efficiently learn facial
features from low-resolution images under unconstrained
conditions of facial expression, pose, illumination, age,
and ethnicity. Features learned from the two convolutional
PCANet neural networks are fused to improve the results. The
proposed method is composed of two phases: 1) a training phase
and 2) a testing phase. First, two PCANet models are trained with
different filter sizes to capture different spatial levels of facial
features. After features are extracted from each PCANet, the
whitening PCA algorithm is applied to reduce their
dimensionality and make feature fusion more reliable. A linear
support vector machine classifier [7] is finally employed to
classify the fused feature vector. Details of the implemented
approach are described in the following subsections and
illustrated in the block diagram shown in Fig. 2.
Fig. 1. Illustration of the PCANet architecture. The network consists of two convolutional stages. The first and second stages contain L1 and L2 filters of k1 x k2 pixels, respectively. The output layer contains two processes: binarization and binary-to-decimal conversion; image concatenation and block-wise histogram computation.
A. Training phase
Our proposed method utilizes two PCANets with different
filter sizes to extract features from the input images. In this
phase, we first select sample images from the given database to
train the system; second, features are extracted from those
images using each PCANet; the output feature vectors are then
concatenated, and the SVM classifier is trained with the
concatenated feature vector. We found that the features extracted
from each PCANet are highly correlated due to block
overlapping. Therefore, we use the whitening PCA (WPCA)
algorithm to make the feature vector less redundant. Applying
whitening accomplishes two things: 1) it makes the
features less correlated with each other, and 2) it gives all
features the same variance. Using WPCA also helps to reduce
the feature vector dimension. The whitening operation has
two simple steps:
1) Project the feature vector onto the eigenspace of the
training images: this rotates the dataset so that we obtain
uncorrelated components.

$$\sigma = \frac{1}{m} \sum_{i=1}^{m} x_i x_i^T \qquad (1)$$

$$x_{rot,i} = U^T x_i = \begin{bmatrix} u_1^T x_i \\ u_2^T x_i \\ \vdots \\ u_n^T x_i \end{bmatrix} \qquad (2)$$

where $x_i$ is the feature vector of input image $i$; $m$ is
the total number of training images; $\sigma$ is the covariance matrix
of $x$; $u_1, u_2, \ldots, u_n$ are the principal eigenvectors of the covariance
matrix; and $x_{rot,i}$ is the rotated feature vector.
2) Normalization of the projected data: normalizing the
projected dataset gives a variance of 1 for all components;
this is done by simply dividing each component by the square
root of its corresponding eigenvalue.

$$x_{white,j} = \frac{x_{rot,j}}{\sqrt{\lambda_j + \epsilon}} \qquad (3)$$

where $x_{white,j}$ is the $j$-th component of the whitened feature
vector; $\lambda_1, \lambda_2, \ldots, \lambda_n$ are the
corresponding eigenvalues; and $\epsilon$ is a regularization
constant.
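
A compact NumPy sketch of Eqs. (1)-(3) follows; the epsilon default is an assumed placeholder, since the paper does not report its value. Because the PCANet feature dimension here far exceeds the number of training images, a practical implementation would eigendecompose the much smaller m x m Gram matrix instead and recover the eigenvectors of the covariance from it.

```python
# Whitening PCA (Eqs. 1-3) as a projection matrix; eps is an assumed value.
import numpy as np

def fit_wpca(X, n_components, eps=1e-5):
    """X: (m, d) zero-mean training feature matrix. Returns a (d, n)
    projection implementing rotation (Eq. 2) plus scaling (Eq. 3)."""
    m = X.shape[0]
    sigma = (X.T @ X) / m                    # Eq. (1): covariance matrix
    lam, U = np.linalg.eigh(sigma)           # eigenpairs, ascending order
    lam = lam[::-1][:n_components]           # top-n eigenvalues
    U = U[:, ::-1][:, :n_components]         # corresponding eigenvectors
    return U / np.sqrt(lam + eps)            # fold Eq. (3) into the columns

def apply_wpca(X, P):
    return X @ P                             # whitened, reduced features
```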
After applying WPCA to each PCANet feature vector, we
concatenate the two whitened feature vectors and then train
the SVM classifier with the concatenated feature vector. Once
the training phase is finished, the PCANet models, the WPCA1
and WPCA2 projection matrices, and the SVM classifier can be
used for testing.
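
Putting the pieces together, the training phase might look like the sketch below, reusing the illustrative helpers from the earlier sketches. The filter sizes 5 and 11 follow the best fusion result in Table IV, and the non-overlapping 12x12 blocks and 1000 retained components follow Section IV; the remaining details are assumptions of this sketch.

```python
# End-to-end training sketch: two PCANets -> WPCA each -> concatenate -> SVM.
# Relies on learn_pca_filters, filter_response, pcanet_features, fit_wpca
# from the earlier sketches; filter sizes (5, 11) follow Table IV.
import numpy as np
from sklearn.svm import LinearSVC

def train_pipeline(train_imgs, labels):
    models, fused = [], []
    for k in (5, 11):                        # two PCANets, different filters
        W1 = learn_pca_filters(train_imgs, k, L=8)
        # Stage-2 filters are learned from the stage-1 output maps
        stage1 = [filter_response(im, w) for im in train_imgs for w in W1]
        W2 = learn_pca_filters(stage1, k, L=8)
        F = np.array([pcanet_features(im, W1, W2, block=12, stride=12)
                      for im in train_imgs])
        mu = F.mean(axis=0)
        P = fit_wpca(F - mu, n_components=1000)   # 1000 PCs, as in Fig. 5
        models.append((W1, W2, mu, P))
        fused.append((F - mu) @ P)
    X = np.hstack(fused)                     # concatenated feature vector
    return models, LinearSVC().fit(X, labels)
```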
B. Testing phase
To evaluate the efficiency of our proposed method, we
select a set of images from the same database that are not
included in the training phase. Fig. 2 illustrates the framework
of our proposed method. As illustrated in the figure, the testing
process can be summarized in five steps (a code sketch follows
the list):
1) Select an input test image from the database.
2) Extract facial features using both PCANet models to
represent the input image.
3) Use the previously obtained whitening projection matrices
to eliminate the correlation between the features and
reduce the dimension of the test feature vector from each
PCANet.
4) Concatenate the two output test feature vectors.
5) Feed the concatenated test feature vector to the trained SVM
classifier for the final decision.
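
A matching sketch of these five steps, reusing the models and projection matrices produced by the illustrative training sketch above:

```python
# Testing-phase sketch mirroring steps 1-5 for a single test image.
import numpy as np

def classify(img, models, clf):
    parts = []
    for W1, W2, mu, P in models:                       # steps 2-3 per PCANet
        f = pcanet_features(img, W1, W2, block=12, stride=12)
        parts.append((f - mu) @ P)                     # whiten and reduce
    x = np.hstack(parts).reshape(1, -1)                # step 4: concatenate
    return clf.predict(x)[0]                           # step 5: SVM decision
```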
In the next section, we show the effectiveness of the
proposed method for gender classification.
Fig. 2. Block diagram of the proposed gender classification method based on PCANet.
Fig. 3. Sample images from Gallagher’s database.
IV. EXPERIMENTS
A. Gallagher’s (GROUPS) database
To evaluate the performance of the proposed method, we
employ Gallagher’s (GROUPS) database [12]. Gallagher’s
database is a public database comprising a large number of
individuals; it contains more than 28,000 low-resolution
labeled faces collected from Flickr images. Based on the
classification results reported in the FRVT report [16], this
database is among the most challenging for gender
classification. In our experiments, we follow Dago’s protocol
[13], which uses the subset of faces with an inner-eye
distance larger than 20 pixels, a total of 14,760 facial images.
Sample images from the database are shown in Fig. 3.
B. Experimental evaluation
As mentioned above, we use 14,760 facial images from the
Gallagher database with an image resolution of 61x49 pixels. In
all experiments, we use half of the database images (7,380) as
training data and the remaining half (7,380) as testing data.
We study the effect of changing the PCANet parameters on the
classification rate. The performance of feature fusion using
two PCANet models is then evaluated and compared with other
state-of-the-art methods.
a) Effect of changing filter size: here we fix the
number of filters at L1 = L2 = 8 and the histogram block
size at 7x7 with an overlap ratio of 0.5, and vary the filter
size k1 = k2 from 3 to 11 in steps of 2. Fig. 4(a) shows the
results: the PCANet achieves its best result at a filter size of
5, while increasing the filter size beyond 5 decreases the
accuracy, as the captured features can no longer discriminate
the male and female classes.
b) Effect of changing the number of filters: we fix
the filter size at k1 = k2 = 5 and the block size at 7x7 with
an overlap ratio of 0.5; we then set L2 = 8 and vary L1 from
3 to 10. Fig. 4(b) shows the results: the accuracy improves as
the number of filters in stage 1 increases, reaching its best
value at L1 = 8.
c) Effect of changing the histogram block size: we
examine the effect of varying the histogram block size on the
accuracy. The PCANet parameters are set as follows: k1 = k2 =
5, L1 = L2 = 8, and a block overlap ratio of 0.5. We vary the
block size from 7x7 to 17x17 in steps of 2. Fig. 4(c) shows
the results: the best accuracy is achieved at the smallest
block size of 7x7.
d) Effect of changing image resolution: in this
experiment, we aim to examine the performance of the proposed
method when the image resolution is reduced. We resize the
input images to 48x48 pixels and then tune the PCANet
histogram block size and the overlap ratio between blocks,
due to their significant impact on the feature vector length.
The filter size is set to k1 = k2 = 5 and the number of
filters to L1 = L2 = 8. The size of the final feature vector
depends on the block size; we therefore compare the performance
of three block sizes, 8x8, 12x12, and 16x16, with a block
overlap ratio of 0.5. Table I shows the classification accuracy
and the feature vector size.

Fig. 4. Classification accuracy of PCANet on Gallagher’s database: (a) impact of filter size; (b) impact of the number of filters; (c) impact of the block size.

Fig. 5. Classification accuracy of PCANet with a different number of principal components.
Guided by the results in Table I, we study the impact of
changing the overlap ratio for the 8x8 and 12x12 block sizes. We
start with non-overlapping blocks and increase the overlap ratio
in steps of 0.1 up to 0.5. Tables II and III show the
classification accuracy and the feature vector size for
different overlap ratios. The results reveal that the accuracy
improves as the overlap ratio increases, but at the
expense of a larger feature vector. We choose
non-overlapping blocks of size 12x12 for the subsequent
experiments.
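
The feature vector sizes in Tables I-III follow directly from the PCANet output dimensionality: each of the L1 decimal maps contributes a 2^L2-bin histogram per block, giving L1 x 2^L2 x B entries, where B is the number of blocks per map. The check below reproduces the reported sizes when the block stride is taken as the block size times one minus the overlap ratio, rounded to the nearest integer (the rounding rule is our inference from the tables).

```python
# Reproduces the feature vector sizes of Tables I-III (48x48 images,
# L1 = L2 = 8); the stride rounding rule is inferred from the tables.
def feature_size(img=48, block=8, overlap=0.5, L1=8, L2=8):
    stride = max(1, round(block * (1 - overlap)))  # block step in pixels
    blocks = ((img - block) // stride + 1) ** 2    # blocks per decimal map
    return L1 * (2 ** L2) * blocks                 # total histogram bins

assert feature_size(block=8,  overlap=0.5) == 247808   # Tables I and II
assert feature_size(block=12, overlap=0.5) == 100352   # Tables I and III
assert feature_size(block=16, overlap=0.5) == 51200    # Table I
assert feature_size(block=12, overlap=0.0) == 32768    # Table III
```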
e) Effect of applying Whitening PCA: as discussed
before, we aim to make the features less correlated and give
them equal variance, so that the classification process becomes
easier and faster. We use the whitening transform implemented
in the standard Eigenfaces method [17].
We feed the feature vectors extracted in the PCANet
training phase to the whitening transformation and vary the
number of retained principal components from 1000 to 5000.
Fig. 5 shows the classification accuracy for different numbers
of principal components. The best accuracy is achieved when
only 1000 principal components are retained. This experiment
confirms that, as the number of principal components
increases, the classification accuracy decreases.
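
The same sweep can be expressed with scikit-learn's PCA, which combines the rotation and variance normalization of Eqs. (1)-(3); the stand-in feature matrix and the step of 1000 are assumptions of this sketch.

```python
# Hypothetical sweep over the number of retained components, as in Fig. 5.
import numpy as np
from sklearn.decomposition import PCA

train_features = np.random.rand(6000, 8192)   # stand-in for PCANet features
for n in range(1000, 5001, 1000):
    wpca = PCA(n_components=n, whiten=True).fit(train_features)
    reduced = wpca.transform(train_features)  # whitened, n-dimensional
```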
f) Fusion of two PCANet feature vectors: this experiment
examines the effect of fusing two PCANet models trained with
two different filter sizes for gender classification. The
assumption is that fusing the two feature vectors helps the
linear classifier find the decision boundary between the male
and female classes, and thus improves the classification
accuracy. We compute the accuracy of each PCANet model
separately and after concatenating the two whitened PCANet
feature vectors. Table IV summarizes the classification accuracy
at different filter sizes for each PCANet. The results confirm
that the accuracy of the concatenated PCANets is higher than
that of either network alone. Finally, Table V compares the
proposed method with other state-of-the-art methods, indicating
that our method is competitive.
V. CONCLUSIONS
In this paper, we introduced a new method for gender
classification based on the combination of two convolutional
deep learning PCANets. We showed that our method is reliable
for gender classification in unconstrained scenarios by testing
it on small images of 48x48 pixels from Gallagher’s database.
The parameters of PCANet are optimized for the given
classification problem. In addition, whitening PCA is applied to
reduce the dimension of the feature vectors, which makes them
reliable for fusion. Our proposed method improves the
classification accuracy while using a small feature vector. As
future work, the classification accuracy could be further
improved by applying the single- and cross-database validation
approach of Dago’s protocol.
TABLE I. PCANET CLASSIFICATION ACCURACY AND FEATURE VECTOR SIZE FOR VARYING BLOCK SIZE.

Block size | Classification rate (%) | Size of feature vector
8x8        | 90.04                   | 247808
12x12      | 89.56                   | 100352
16x16      | 88.68                   | 51200
TABLE II. PCANET CLASSIFICATION ACCURACY AND FEATURE VECTOR SIZE FOR VARYING BLOCK OVERLAP RATIO WITH 8X8 BLOCK SIZE.

Overlap ratio           | 0.0   | 0.1   | 0.2    | 0.3    | 0.4    | 0.5
Classification rate (%) | 86.03 | 89.19 | 89.3   | 89.3   | 90.07  | 90.04
Feature vector size     | 73728 | 73728 | 100352 | 100352 | 165888 | 247808
TABLE III. PCANET CLASSIFICATION ACCURACY AND FEATURE VECTOR SIZE FOR VARYING BLOCK OVERLAP RATIO WITH 12X12 BLOCK SIZE.

Overlap ratio           | 0.0   | 0.1   | 0.2   | 0.3   | 0.4   | 0.5
Classification rate (%) | 88.05 | 87.63 | 87.58 | 88.96 | 89.55 | 89.56
Feature vector size     | 32768 | 32768 | 32768 | 51200 | 73728 | 100352
TABLE IV. CLASSIFICATION ACCURACY (%) OF SINGLE AND CONCATENATED PCANET FEATURES.

PCANet1 filter size (k11 = k12) | PCANet2 filter size (k21 = k22) | PCANet1 | PCANet2 | PCANet1+2
3 | 5  | 86.44 | 89.09 | 89.25
3 | 7  | 86.44 | 88.60 | 89.58
3 | 9  | 86.44 | 87.32 | 89.32
3 | 11 | 86.44 | 86.59 | 89.17
5 | 7  | 89.09 | 88.60 | 89.58
5 | 9  | 89.09 | 87.32 | 89.58
5 | 11 | 89.09 | 86.59 | 89.65
7 | 9  | 88.60 | 87.32 | 88.93
7 | 11 | 88.60 | 86.59 | 88.82
9 | 11 | 87.32 | 86.59 | 87.45
REFERENCES
[1] G. Guo, “Gender Classification,” Springer-Verlag London, January
2014.
[2] J. Bekios-Calfa, J. M. Buenaposada, and L. Baumela, “Revisiting Linear
Discriminant Techniques in Gender Recognition,” IEEE Transactions
On Pattern Analysis And Machine Intelligence, vol. 33, no. 4, April
2011.
[3] J. Mansanet, A. Albiol, and R. Paredes, “Local deep neural networks for
gender recognition,” Pattern Recognition Letters, vol. 70, pp. 80–86,
November 2015.
[4] Y. Zhang, and T. Xu, “Landmark-Guided Local Deep Neural Networks
for Age and Gender Classification,” Journal of Sensors, July 2018.
[5] G. Levi and T. Hassncer, “Age and gender classification using
convolutional neural networks,” in 2015 IEEE Conference on Computer
Vision and Pattern Recognition Workshops (CVPRW), Boston, MA,
USA, pp. 34–42, June 2015.
[6] J. van de Wolfshaar, M. F. Karaaba, and M. A. Wiering, “Deep
convolutional neural networks and support vector machines for gender
recognition,” in 2015 IEEE Symposium Series on Computational
Intelligence, Cape Town, South Africa, pp. 188–195, Dec. 2015.
[7] Y. Tang, “Deep Learning using Linear Support Vector Machines,” in
2013 International Conference on Machine Learning, Atlanta, Georgia,
USA, June 2013.
[8] H.-C. Lian and B.-L. Lu, “Multi-view Gender Classification
Using Local Binary Patterns and Support Vector Machines,” in Third
International Symposium on Neural Networks, Chengdu, China, pp. 202–
209, June 2006.
[9] C. Shan, “Learning local binary patterns for gender classification on
real-world face images,” Pattern Recognition Letters, vol. 33, pp. 431-
437, March 2012.
[10] M. Castrillón-Santana, J. Lorenzo-Navarro, and E. Ramón-Balmaseda,
“On using periocular biometric for gender classification in the wild,”
Pattern Recognition Letters, vol. 82, pp. 181–189, October 2016.
[11] G. B. Huang, M. Mattar, T. Berg, and E. Learned-Miller, “Labeled Faces
in the Wild: A Database for Studying Face Recognition in
Unconstrained Environments,” in Workshop on Faces in Real-Life Images:
Detection, Alignment, and Recognition, E. Learned-Miller, A. Ferencz,
and F. Jurie, Eds., Marseille, France, Oct. 2008.
[12] A. C. Gallagher, T. Chen, “Understanding Images of Groups Of
People,” in 2009 IEEE Computer Society Conference on Computer
Vision and Pattern Recognition (CVPR), Miami, FL, USA, pp. 256–263,
June 2009.
[13] P. Dago-Casas, D. González-Jiménez, L. Long-Yu, and J.L. Alba-
Castro, “Single- and Cross-Database Benchmarks for Gender
Classification Under Unconstrained Settings,” in 2011 IEEE
International Conference on Computer Vision Workshops (ICCV
Workshops), pp. 2152–2159, Nov. 2011.
[14] T.-H. Chan, K. Jia, S. Gao, J. Lu, Z. Zeng, and Y. Ma, “PCANet: A
Simple Deep Learning Baseline for Image Classification?,” IEEE
Transactions on Image Processing, vol. 24, pp. 5017–5032, Dec. 2015.
[15] M. Castrillón-Santana, J. Lorenzo-Navarro, and E. Ramón-Balmaseda,
“Improving Gender Classification Accuracy in the Wild,” in 18th
Iberoamerican Congress, CIARP 2013, Havana, Cuba, pp. 270–277,
November 2013.
[16] M. Ngan, P. Grother, “Face Recognition Vendor Test (FRVT)
Performance of Automated Gender Classification Algorithms,” Tech.
Rep., National Institute of Standards and Technology, April 2015.
[17] M.A. Turk, and A.P. Pentland, “Face recognition using eigenfaces,” in
Proceedings. 1991 IEEE Computer Society Conference on Computer
Vision and Pattern Recognition, Maui, HI, USA, June 1991.
TABLE V. COMPARISON OF ACCURACY RESULTS USING GALLAGHER’S DATABASE.

Method                       | Accuracy (%)
Gabor + PCA + SVM [13]       | 85.58–86.61
LBP + HOG + Bagging [15]     | 88.1
Local-DNN [3]                | 91.59
FHOG + FLBPu2 + HSHOG [10]   | 92.46
Proposed method              | 89.65