Gender Classification Based on Asian Faces
using Deep Learning
Tiagrajah V. Janahiraman¹ and Prasantth Subramaniam
Department of Electrical and Electronics Engineering, College of Engineering,
Universiti Tenaga Nasional (UNITEN)
Kajang, Selangor, Malaysia
1tiagrajah@uniten.edu.my
Abstract—For the past few years, gender classification has been
an active area of study, and researchers have put considerable
effort into contributing quality research in this area. The field
has great potential, as it can be applied to monitoring, surveillance,
and human-computer interaction. However, existing methods still
perform poorly on real live images. The rise of deep learning
algorithms has produced a spectacular increase in performance lately.
Many difficult tasks involving computer vision, speech recognition,
and natural language processing are readily solved with deep learning.
Consequently, deep learning approaches are growing notably, including
in image classification. Gender classification is an important subject
in the face recognition process. This paper presents the results of
classifying gender using Convolutional Neural Network based Deep
Learning architectures on TensorFlow's Deep Learning framework. We
used models provided by Keras with weights pre-trained on ImageNet
and compared three models: VGG-16, ResNet-50, and MobileNet. Our own
database consists of Asian faces, mainly Malaysians, with some
Caucasians. Trained on this database of 1000 images, VGG-16 delivered
the highest recognition accuracy.
Index Terms—Deep learning, TensorFlow, Gender classification
I. INTRODUCTION
Facial images can be helpful in extracting the information
needed for multiple tasks that involve human interaction.
Researchers have therefore been actively developing systems
and trying various kinds of algorithms that can approach the
reliability of the human visual system. Face recognition
typically involves image processing, feature extraction, and
image classification; hence, performance depends on the
classifier used and the number of features extracted. Facial
features that differentiate between male and female can
improve the accuracy of biometric systems and computer vision
applications that must gain a high level of understanding
from the given facial images.
Human-computer interaction is a field of study concentrating
on the design of computer technology, specifically the
interaction between humans and computers. Until the late
1970s, before personal computers were invented, only certain
people, such as professionals, were able to interact with
computers. Personal computers with better graphical user
interfaces made everyone a potential computer user, and today
the computer plays a vital role in everyone's life. Over the
past decade, massive progress in the technology sector has
made it seamlessly fast to compute big data with the help of
graphical processing units (GPUs), whose massive parallel
processing power and large memory bandwidth make heavy
computational tasks such as machine learning with deep
learning approaches possible.
In the current world scenario, artificial intelligence has
become a part of our daily life, generalising human cognitive
abilities. Popular artificial intelligence approaches include
machine learning, natural language processing, robotics, and
others. Industry experts say the term artificial intelligence
is so closely tied to popular culture that the public holds
unrealistic fears about how it will change the workplace and
normal human life in general. AI's growth is so tremendous
that it is now involved in analyzing purchase histories and
influencing marketing decisions, and expectations of AI can
outpace reality.
In this paper, we show how to classify gender from images of
Asian faces using the popular deep learning framework
TensorFlow. The system detects a face, predicts the gender,
and outputs the probability of the prediction. We evaluate
multiple image classification architectures, each with a
different number of parameters. Due to the lack of face
databases with Malaysian faces, we created our own database
consisting mainly of Asian faces, inclusive of Malaysians,
with a small portion of Caucasian faces. Our models were
trained using Deep Learning architectures on this database.
The block diagram of our system is shown in Figure 1.
A trained model, which stores the architecture parameters
and weights, will be generated after the training phase. In the
testing phase, the trained model will be utilized to perform the
classification of cropped face images in order to identify the
subject’s gender. A face detection module was used to identify
the location of the faces in a given still image. The cropped
face images were extracted from this still image.
In [1], Arora et al. proposed a deep learning model with a
custom architecture to classify gender from face images. This
Convolutional Neural Network (CNN) model consists of 10
convolutional layers with max pooling, followed by a final
layer of fully connected nodes. The model was trained using
1500 images and validated using 1000 images from the CASIA
database.
2019 IEEE 9th International Conference on System Engineering and Technology (ICSET), 7 October 2019, Shah Alam, Malaysia
978-1-7281-0758-5/19/$31.00 ©2019 IEEE
Fig. 1. Block diagram of our proposed method
Levi et al. [2] proposed a network architecture that consists
of only three convolutional layers and two fully connected
layers with a small number of neurons. This model was tested
on the latest version of the Adience benchmark, which was
collected for age and gender classification. The Adience
database contains approximately 26,000 images from 2,284
subjects.
Wang et al. used an SVM to classify gender from 310 face
images and achieved their highest accuracy of 72.73% when the
Haar-like classifier used 192 features from 8x8 images [3].
Haseena et al. [4] proposed a neural network architecture
that consists of 10 convolutional layers, 4 max pooling
layers, and an average pooling layer. The network was trained
using the standard back-propagation method, and the resulting
features were fed to a KNN classifier to identify the gender.
The performance was validated on the LFW dataset, which
consists of 13,233 images from 5,749 subjects.
II. DEEP LEARNING ARCHITECTURE
A. Overview
Deep Learning is a branch of machine learning based on
Artificial Intelligence methods; it is also referred to as
Deep Structured Learning or Hierarchical Learning. There are
several types of Deep Learning Architectures (DLA), such as
Deep Belief Networks, Recurrent Neural Networks, and
Convolutional Neural Networks (CNN). Deep Learning
architectures have been widely used in many applications such
as computer vision, machine translation, material inspection,
medical image analysis, and board game programs, and the
classification accuracy of DLAs is comparable to or better
than that of humans in several cases. The CNN has been the
most commonly used architecture in image classification,
extracting features from images and classifying them into
categories. The input given to the network is processed in
hidden layers during training, where the weights are
adjusted.
A CNN consists of two stages: a feature extraction stage and
a classification stage. The feature extraction stage consists
of multiple convolution layers, each activated with a
Rectified Linear Unit (ReLU) and followed by max pooling. The
classification stage consists of fully connected layers
computed from the output of the feature extraction stage [5].
The model then outputs a classification probability for each
class. To make the predictions more accurate, the weights are
adjusted so that the network finds the patterns it needs on
its own. Convolutional neural networks became an active
research topic in computer vision after AlexNet won the
ImageNet Large-Scale Visual Recognition Challenge (ILSVRC)
in 2012 [6].
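This two-stage structure can be sketched with the Keras Sequential API; the layer counts and filter sizes below are purely illustrative, not the architectures evaluated in this paper.

```python
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(224, 224, 3)),
    # Feature extraction stage: convolution + ReLU, then max pooling
    layers.Conv2D(32, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    # Classification stage: fully connected layers ending in class probabilities
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dense(2, activation="softmax"),   # two classes: male, female
])
```

The final soft-max layer is what produces the per-class probability mentioned above.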
B. VGG 16
VGG-16 is a VGGNet convolutional neural network consisting
of 16 weight layers, with five sets of small convolutional
filters of size 3x3. It was developed by Karen Simonyan and
Andrew Zisserman and was among the top-performing
architectures in the ILSVRC 2014 competition. When VGGNet was
evaluated on the ImageNet database, which consists of 1000
object classes and 1.3 million sample images, it achieved a
top-5 accuracy of 92.7% [7]. The input is a fixed-size
224x224 RGB image. The 13 convolution layers use 3x3
filters, with five 2x2 max-pooling layers, followed by 3
fully connected layers and a final soft-max layer. All the
hidden layers use ReLU activation [8].
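For reference, this architecture is available directly in Keras Applications; a quick sketch, where `weights="imagenet"` would download the pre-trained ILSVRC weights used in this work (here `weights=None` builds the same topology untrained so the sketch runs offline):

```python
from tensorflow.keras.applications import VGG16

# weights="imagenet" downloads the pre-trained ILSVRC weights;
# weights=None builds the same 16-layer topology with random weights.
model = VGG16(weights=None)

print(model.input_shape)     # fixed-size 224x224 RGB input
print(model.count_params())  # ~138 million parameters (cf. Table I)
```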
C. ResNet-50
The Deep Residual Network, often referred to by its short
name ResNet, was arguably the most ground-breaking work in
the deep learning community in recent years. This
architecture can be trained with up to hundreds or even
thousands of layers and still retain its performance.
ResNet-50 is a 50-layer residual network; there are several
modified versions of this architecture with different numbers
of layers, such as ResNet-101 and ResNet-152. ResNet is a CNN
architecture from the Microsoft team that won the ILSVRC
competition in 2015 and surpassed human performance on the
ImageNet database. ResNet-50 is a smaller adaptation of the
same design as ResNet-152 and is widely used for transfer
learning, as it gives promising results. This powerful
backbone model is used in many computer vision tasks, mainly
because of its skip connections, which add the output of a
previous layer to a later layer. This diminishes the
vanishing gradient problem when training the neural network.
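The skip connection can be sketched with the Keras functional API. This is the identity-shortcut variant of a residual block; the filter count and input shape are illustrative, not taken from ResNet-50 itself.

```python
from tensorflow.keras import layers, models

def residual_block(x, filters=64):
    """Identity-shortcut residual block: output = ReLU(F(x) + x)."""
    shortcut = x
    y = layers.Conv2D(filters, (3, 3), padding="same", activation="relu")(x)
    y = layers.Conv2D(filters, (3, 3), padding="same")(y)
    y = layers.Add()([y, shortcut])          # the skip connection
    return layers.Activation("relu")(y)

inputs = layers.Input(shape=(56, 56, 64))
outputs = residual_block(inputs)
block = models.Model(inputs, outputs)        # shape is preserved: 56x56x64
```

Because the shortcut bypasses the convolutions, gradients have a direct path backwards through the addition, which is what mitigates the vanishing gradient problem described above.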
D. MobileNet
MobileNet is a lightweight CNN architecture, proposed by
Google, which is very helpful for mobile and embedded vision
applications where computation power is limited. To reduce
the number of parameters, this architecture uses depthwise
separable convolutions: each normal convolution is replaced
by a depthwise convolution followed by a pointwise (1x1)
convolution. By reducing the number of parameters, the total
number of floating-point multiplication operations decreases,
which suits mobile computing and embedded vision applications
that cannot afford much power. Depthwise separable
convolutions trade a small loss in accuracy for much lower
complexity, making the network far more lightweight than
others [9].
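The parameter saving can be worked out directly. As an illustration (the channel counts below are chosen for the example, not taken from MobileNet), consider a 3x3 kernel with 64 input channels and 128 output channels:

```python
# Parameter count of a standard 3x3 convolution versus the
# depthwise separable form described above.
k, c_in, c_out = 3, 64, 128

standard = k * k * c_in * c_out    # one full convolution: 73728
depthwise = k * k * c_in           # one 3x3 filter per input channel: 576
pointwise = 1 * 1 * c_in * c_out   # 1x1 convolution to mix channels: 8192
separable = depthwise + pointwise  # 8768, roughly an 8.4x reduction

print(standard, separable)
```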
III. TENSORFLOW
TensorFlow is a machine learning system that is widely used
in research. It was released by Google as an open-source deep
learning software library and supports various applications
that focus on training and inference of deep neural networks.
TensorFlow-based applications can be executed on platforms
with single or multiple CPUs and GPUs. In GPU mode, the
Compute Unified Device Architecture (CUDA), or the SYCL
extension for OpenCL, is used to execute the Deep Learning
CNN architectures on the GPU. In this project, CUDA version
10.0 was used in the backend to support TensorFlow version
1.13. The TensorFlow framework can be used on various
operating systems such as Linux, Windows, and Macintosh; for
mobile and embedded devices, there is a lightweight framework
called TensorFlow Lite. The architecture is flexible enough
to run computations in a variety of environments, from
high-computation-power desktops and linked servers to
low-power mobile and embedded devices. TensorFlow
computations are expressed as stateful dataflow graphs that
operate on multidimensional data arrays, called "tensors".
Recently, TensorFlow has adopted the Keras library for
building and training models.
Fig. 2. Sample of Male images from our database
IV. KERAS
Keras is a high-level neural network application programming
interface (API) written entirely in the Python programming
language. Since Keras is developed at a higher layer, it can
run on top of any of three popular deep learning frameworks:
TensorFlow, CNTK, or Theano. The API grew out of research for
the project Open-ended Neuro-Electronic Intelligent Robot
Operating System (ONEIROS), and its purpose is to enable fast
experimentation. Keras supports both convolutional and
recurrent neural networks and runs on both CPU- and GPU-based
hardware. Keras is user-friendly, and new modules are easy to
add, which lets researchers run experiments conveniently.
Fig. 3. Sample of Female images from our database
V. DATABASE
The Keras CNN modules are pre-trained on the ImageNet
database. Its images were gathered from the web and annotated
by crowd-sourced workers hired through Amazon's Mechanical
Turk platform. A part of ImageNet has been extracted to form
a subset referred to as ILSVRC. This subset consists of 1000
classes, such as umbrella, soccer ball, and laptop, each
containing 1000 images. There are approximately 1.2 million
training images, 50,000 validation images, and 150,000 test
images. The images come in different resolutions, so they are
rescaled to a fixed resolution of 256x256 and cropped around
the subject. Our own database, referred to as U10 Face,
consists of 500 male and 500 female images, a total of 1000
images prepared for the training phase. All the images were
downloaded from the internet with the help of Google Images.
Human faces were extracted using the Haar Cascade algorithm
provided in the OpenCV library, and misclassified and small
face images were removed in a post-processing step. Samples
of images from our database are shown in Figures 2 and 3.
VI. RESULTS AND DISCUSSION
Training was done for all three models for 100 epochs with a
batch size of 16 on an Intel i7-8700 processor with 16 GB of
RAM and an Nvidia GTX 1080 GPU. Stochastic gradient descent
(SGD) was used as the optimization method for tuning the
parameters, with the learning rate set to 0.001.
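This training configuration can be reproduced in Keras roughly as follows. A tiny stand-in model and random data replace the U10 Face database, which is not public, and the epoch count is shortened so the sketch runs quickly; the paper trains VGG-16, ResNet-50 and MobileNet for 100 epochs.

```python
import numpy as np
from tensorflow.keras import layers, models
from tensorflow.keras.optimizers import SGD

# Tiny stand-in model; the paper fine-tunes full pre-trained CNNs.
model = models.Sequential([
    layers.Input(shape=(224, 224, 3)),
    layers.Flatten(),
    layers.Dense(2, activation="softmax"),   # male / female
])
model.compile(optimizer=SGD(learning_rate=0.001),  # learning rate from the paper
              loss="categorical_crossentropy",
              metrics=["accuracy"])

x = np.random.rand(16, 224, 224, 3).astype("float32")   # one batch of 16
y = np.eye(2)[np.random.randint(0, 2, size=16)]         # one-hot labels
history = model.fit(x, y, epochs=2, batch_size=16, verbose=0)  # 100 in the paper
```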
As tabulated in Table I, VGG-16 achieved the best training
accuracy of 100%, followed by ResNet-50 with 99.9% and
MobileNet with 99.8%. The TensorBoard logs shown in Figures 4
and 5 visualize the accuracy obtained
TABLE I
ACCURACY OF THE TRAIN SET FOR DIFFERENT TYPES OF MODELS

Model     | Input image size | Parameters  | No. of epochs | Training accuracy | Loss
VGG-16    | 224x224          | 138,357,544 | 100           | 100%              | 1.7074e-6
ResNet-50 | 224x224          | 25,636,712  | 100           | 99.9%             | 2.4288e-3
MobileNet | 224x224          | 4,253,864   | 100           | 99.8%             | 7.4571e-3
Fig. 4. Accuracy of the trainset (Accuracy vs Number of Epochs)
and loss occurred during the training process. These data are
plotted against epochs.
We randomly selected 43 images containing human faces. Our
face detection module detects the faces, then crops and
resizes them to feed into the CNN model selected for
classification. The prediction accuracy delivered by these
CNN models is shown in Table II.
TABLE II
ACCURACY OF THE TEST SET FOR DIFFERENT TYPES OF MODELS

Model     | Number of available faces | True Positive | False Positive | Recognition rate (%)
VGG-16    | 253                       | 223           | 30             | 88
ResNet-50 | 253                       | 215           | 38             | 85
MobileNet | 253                       | 124           | 130            | 49
Fig. 5. Loss occurred on the trainset (Loss vs Number of Epochs)
Fig. 6. Sample of predictions produced by VGG-16
Among the three CNN models, VGG-16 delivered the highest
accuracy of 88%, and MobileNet performed the worst with 49%.
Some sample classification results produced by our trained
models, with the label attached to each face, are shown in
Figures 6, 7 and 8. Here, the class label (i.e., male or
female) and its corresponding probability are depicted at the
bottom of the rectangular bounding box. Based on the
predictions produced by all the models, we conclude that the
performance delivered by VGG-16 is far superior, producing
the highest accuracy for gender classification. ResNet-50
performed moderately well, but it produced some wrong
classifications and the probability of each prediction was
lower compared to VGG-16. The MobileNet model produced more
misclassifications than ResNet-50. Hence, VGG-16 is the best
performer and MobileNet comes last with a poor result.
VII. CONCLUSION
A Convolutional Neural Network (CNN) based Deep Learning
model was proposed for gender classification in this paper.
Our CNN models were developed using the Keras library on the
TensorFlow-based Deep Learning framework. We compared three
CNN models pre-trained on the ImageNet database: VGG-16,
ResNet-50, and MobileNet. Our training database of Asian
faces was collected manually using Google Images and consists
of 500 male and 500 female samples. All the models were
trained for 100 epochs with a batch size of 16 on GPU-based
hardware. The VGG-16 model delivered the best accuracy on the
training set, followed by ResNet-50 and MobileNet. As part of
future work, other types of CNN models can be investigated;
some suggested models are InceptionResNetV2, InceptionV3,
AlexNet, and DenseNet.
Fig. 7. Sample of predictions produced by ResNet-50
Fig. 8. Sample of predictions produced by MobileNet
REFERENCES
[1] Arora, Shefali, and M. P. S. Bhatia, “A Robust Approach for Gender
Recognition Using Deep Learning,” In 2018 9th International Con-
ference on Computing, Communication and Networking Technologies
(ICCCNT), pp. 1-6, 2018.
[2] Levi, Gil, and Tal Hassner, “Age and gender classification using con-
volutional neural networks.” In Proceedings of the IEEE conference on
computer vision and pattern recognition workshops, pp. 34-42, 2015.
[3] Wang, Xiaofeng, Azliza Mohd Ali, and Plamen Angelov, “Gender and
age classification of human faces for automatic detection of anomalous
human behaviour.” In 2017 3rd IEEE International Conference on
Cybernetics (CYBCONF), pp. 1-6, 2017.
[4] Haseena, S., S. Bharathi, I. Padmapriya, and R. Lekhaa, “Deep Learning
Based Approach for Gender Classification.” In 2018 Second Inter-
national Conference on Electronics, Communication and Aerospace
Technology (ICECA), pp. 1396-1399, 2018.
[5] Ramdhani, B., Djamal, E.C. and Ilyas, R., “Convolutional Neural Net-
works Models for Facial Expression Recognition,” In 2018 International
Symposium on Advanced Intelligent Informatics (SAIN), pp. 96-101,
August 2018.
[6] Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S.,
Huang, Z., Karpathy, A., Khosla, A., Bernstein, M. and Berg, A.C.,
“Imagenet large scale visual recognition challenge,” International journal
of computer vision, 115(3), pp.211-252, 2015.
[7] Simonyan, K. and Zisserman, A., “Very deep convolutional networks for
large-scale image recognition,” arXiv preprint arXiv:1409.1556, 2014.
[8] Gopalakrishnan, K., Khaitan, S.K., Choudhary, A. and Agrawal, A.,
“Deep Convolutional Neural Networks with transfer learning for com-
puter vision-based data-driven pavement distress detection,” Construc-
tion and Building Materials, 157, pp.322-330, 2017.
[9] Howard, A.G., Zhu, M., Chen, B., Kalenichenko, D., Wang, W., Weyand,
T., Andreetto, M. and Adam, H., “MobileNets: Efficient convolu-
tional neural networks for mobile vision applications,” arXiv preprint
arXiv:1704.04861, 2017.