ArticlePDF Available

Ensemble of texture and shape descriptors using support vector machine classification for face recognition

Authors:

Abstract and Figures

One of the significant task in pattern recognition and computer vision along with artificial intelligence and machine learning is the Face Recognition. Most of the prevailing approaches on face recognition concentrates on the recognition of the utmost appropriate facial attributes for efficiently recognizing and differentiating amongst the considered images. In this paper, an ensemble aided facial recognition approach is suggested that performs well in wild environment using an ensemble of feature descriptors and preprocessing approaches. The combination of texture and color descriptors are mined from the preprocessed facial images and classified using support vector machine algorithm. The experimental outcome of the suggested methodology is illustrated using two databases such as FERET data samples and Labeled Faces in the Wild data samples. From the results, it is shown that, the proposed approach has good classification accuracy and combination utility of pre-processing techniques due to the usage of additional preprocessing and extracted feature descriptors. The average classification accuracies for the both the data samples are 99% and 94% respectively.
Content may be subject to copyright.
Vol.:(0123456789)
1 3
Journal of Ambient Intelligence and Humanized Computing
https://doi.org/10.1007/s12652-019-01192-7
ORIGINAL RESEARCH
Ensemble oftexture andshape descriptors using support vector
machine classification forface recognition
PallavaramVenkateswarLal1· GnaneswararRaoNitta2· AndePrasad3
Received: 7 October 2018 / Accepted: 3 January 2019
© Springer-Verlag GmbH Germany, part of Springer Nature 2019
Abstract
One of the significant task in pattern recognition and computer vision along with artificial intelligence and machine learning
is the Face Recognition. Most of the prevailing approaches on face recognition concentrates on the recognition of the utmost
appropriate facial attributes for efficiently recognizing and differentiating amongst the considered images. In this paper, an
ensemble aided facial recognition approach is suggested that performs well in wild environment using an ensemble of feature
descriptors and preprocessing approaches. The combination of texture and color descriptors are mined from the preprocessed
facial images and classified using support vector machine algorithm. The experimental outcome of the suggested methodol-
ogy is illustrated using two databases such as FERET data samples and Labeled Faces in the Wild data samples. From the
results, it is shown that, the proposed approach has good classification accuracy and combination utility of pre-processing
techniques due to the usage of additional preprocessing and extracted feature descriptors. The average classification accura-
cies for the both the data samples are 99% and 94% respectively.
Keywords Face recognition· Feature descriptions· Histograms of gradients· ICS_LBP· Color dominant structure·
Support vector machine
1 Introduction
Biometrics is the utmost significant portion of patternrec-
ognition in which face recognition is the utmost striking-
biometric approach. However, face recognition in practical
appliances is a challenging task (Xu etal. 2013) since face is
a non-rigid element, and frequently has diverse appearances
pertaining tonumerous facial look, dissimilar ages, various
viewpoints and furthersignificantly, various illumination
intensities. Face recognition (FR) is extensively studied due
to its hypothetical value in addition to its incredible probable
application (Ouanan etal. 2016, 2015a, b, c).Humans could
identify faces in a scene withtheir natural abilities without
any additionalequipment. It is very hard to create anauto-
mated system for the identification task. The developmentsin
hardware and software of computertechnologies are remov-
ing the limit of thedifficulty. The problem of finding face
patternsis actual problematic due to the large variationof
distortions that have to take into reason.
Face recognition is extensively employed in practical
appliances like exploration of internet image, access con-
trol, law enforcement for confidential services such as pris-
oners premises, safety investigation, theatre, home safety
to individual verification patterns in e-commerce, health,
and governance amenities, etc. Numerous issues exists in
working using these appliance sunder various features like
alterations in illumination, inconsistency in scale, position,
direction and posture. Additionally, facial appearance, facial
decorations, masquerade, fractional occlusion circumstances
that modify whole appearance so as to make the detection
of faces tougher.
* Pallavaram VenkateswarLal
venkateswarlal@gmail.com
Gnaneswarar Rao Nitta
gnani.nitta@gmail.com
Ande Prasad
prasadjkc@yahoo.co.in
1 Research Scholar ofVignan’s University andCSE
Department, Narayana Engineering College, Gudur,
AndhraPradesh, India
2 CSE Department, Vignan’s University, Vadlamudi, Guntur,
AndhraPradesh, India
3 Computer Science Department, Vikrama Simhapuri
University, Nellore, AndhraPradesh, India
P.VenkateswarLal et al.
1 3
Recently, an attractive, intuitive solution for the chal-
lenges faced is to exaggeratedly alter the postures of faces
coming in photographs through creating novel and front
facial viewpoints. This better supports its attributes and min-
imizes the inconsistency that face recognition systems ought
to address. In the proposed approach, rather than focusing
on shape, the main focus is specifically on texture and color
feature for effective face recognition. Color provides the vis-
ual characteristics for indexing and retrieval of images and
textual features provides the data on structural organization
of surface and objects of images. For this purpose, texture
and color feature descriptors are extracted from preprocessed
facial images depending on which an efficient classification
is performed using support vector machines. The texture
and color descriptors are extracted in such a manner that the
dominant color, orientation, texture patterns and transformed
features of the images are obtained.
2 Literature survey
Some of the utmost prominent face recognition methods
introduced for the previous five decades are given in this sec-
tion. Ouanan etal. (2015a, b, c), a facial image illustration
providing good results on FERET dataset is suggested, this
methodology depends on Gabor filters (GFs) and Zernike
moments (ZMs), where GFs is employed for mining of tex-
ture attributes and ZMs mines shape attributes, alternatively,
a flexible Genetic Algorithm (GA) is performed to choose
the moment attributes which finely differentiate individual
faces beneath numerous postures and illuminating issues.
Further, improved extracted attribute vectors are transformed
to a lower dimensional subdomain employing Random Pro-
jection (Menon 2007) approach. Cao etal. (2013), a regular-
ization outline is presented to know similarity measures for
face authentication in the remote. This methodology attains
a good outcomes on LFW data sample. Huang (2007), a
joint Bayesian methodology is suggested depending on the
traditional Bayesian face recognition technique (Chen etal.
2012). This technique attained 92.4% accuracy on LFW data
sample. The other fascinating methodology is Fisher vector
encryption that executes better on LFW. Nevertheless, the
accuracy of those approaches worsens onextreme postures
of face such as profile. This demonstration the necessity of
methodsproficient to reimbursehuge posture disparity.
A prototype for recognition of facial attributes employ-
ing deformable patterns is proposed in Yuille etal. (1992).
The facial attributes are defined through a parametric pattern
which associates peaks, breaks and edges using the energy
function in image pixels to consistent template attributes.
This template interrelates energetically using the input
through handling the parametric values to diminish the
dynamic function. Lam and Yan (1996), a methodology is
introduced to recognize the header constraint and the esti-
mated locations of eyes are evaluated. An analytic-to-holistic
technique is given in Lam and Yan (1998) that could recog-
nize at diverse perception disparities. The initial stage is to
trace 15 attribute on the face. A header prototype suggested
that the spin of face is evaluated employing geometrical
dimensions. In subsequent stage to adjust the windows for
nose, mouth and eyes. These attribute templates are matched
using the templates in data samples through correlation.
A face normalization methodology is suggested in (Liu
etal. 2006), depending on the position of eyes. The face
is recognized depending on enhanced cascade of flexible
Haar attributes. This methodology recognition the location
and distance amongst face and eyes through alignment fea-
tures. Wu etal. (2009), a prototype for recognizing a shared
nasal attributes for facial image using four descriptors is
suggested. The numerous attributes has been introduced for
facial identification and three added nasal attributes related
to significant nasal root, cylindrical appearance, and smaller
nasal are mined.
Kamencay etal. (2017), the performance of the suggested
convolutional neural network (CNN) with three famous
image recognition approach like PCA, Local Binary Patterns
Histograms (LBPH) and KNNis validated. The complete
detection accuracy of PCA, LBPH, KNN and suggested
CNN is illustrated. The complete experiment is executed
on ORL dataset and the attained results were demonstrated
and calculated. This facial dataset comprises of 400 diverse
entities (40 categories/10 images for every category). The
result exhibited that the LBPH gives good results compared
to PCA and KNN. These experimental outcomes on the ORL
datasets illustrated the efficiency of suggested methodology
for facial identification. For the given CNN, the finest iden-
tification accuracy of 98.3% is achieved. This suggested
approach depends on CNN that outperforms the existing
approaches.
Gupta etal. (2018) a new way of using a deep neural net-
work (another type of deep network) for face recognition. In
this approach, instead of providing raw pixel values as input,
only the extracted facial features are provided. This lowers
the complexity of while providing the accuracy of 97.05%
on Yale faces dataset. Dong etal. (2018), a virtual sam-
ple generating algorithm called k nearest neighbors based
virtual sample generating (kNNVSG) to enrich intra-class
variation information for training samples. Furthermore,
in order to use the intra-class variation information of the
virtual samples generated by kNNVSG algorithm, we pro-
pose image set based multimanifold discriminant learning
(ISMMDL) algorithm.by comprehensively using kNNVSG
and ISMMDL algorithms, we propose k nearest neighbor
virtual image set based multimanifold discriminant learning
(kNNMMDL) approach for single sample face recognition
(SSFR) tasks.
Ensemble oftexture andshape descriptors using support vector machine classification for…
1 3
3 Proposed approach
In this section, a novel methodology is introduced for robust
face recognition using significant techniques at every stage.
The combination of different descriptors of the image fea-
tures are extracted and categorization is performed depend-
ing on these descriptors. The proposed methodology is
mainly segregated into four significant stages in order to
make the ensemble stronger. The four different stages are
image frontalization, image pre-processing, feature extrac-
tion, and classification. The diagrammatic illustration ofsug-
gested methodology is given in Fig.1. Details description
of every technique used in each phase is briefly explained in
the below subsections.
3.1 Image frontalization
“Frontalization” is an approach of fusing frontal facial views
coming in an unrestricted pictures. This process transforms
the challenging issue of detecting frontal facial observed
from unrestricted perspectives to the flexible issue of detect-
ing facial frontal views in constrained positions. It produces
decently allied images that brings an accurate matching of
localized facial attributes amongst dissimilar faces. Any face
is distinguished through an off-the-shelf face indicator (Lee
etal. 2012) and further reaped, rescaled to the conventional
coordinate system. The Facial attributes are generalized and
employed for aligning the picture using a textured, 3D form
of a general, referencing face. A condensed, front view of
this face gives a referenced coordinate system. The initially
frontal face is extracted through back-projecting the form of
the query picture to its referencing coordinate system that
employed 3D surface like a proxy. A finalized outcome is
generated through borrowing appearances from consistent
symmetric edges of theface wherever facial attributes are
poorly detectable because of query’s position.
3.2 Preprocessing
Prior to feature extraction phase, the issue of illuminating
discrepancy need to be addressed through employing certain
currently introduced enhancement methods:
Adaptive single scale retinex (AR) (Park etal. 2008)
This is an alternate to retinexprocedure that was primar-
ily introduced to improve scene detail and color repro-
duction in the darker areas of an image. This technique
normalizes illumination using the spatial information
between surrounding pixels (it should be noted that AR
produced the best results in our experiments).
Anisotropic smoothing (AS) (Gross and Brajovic 2003)
This is flexible automated standardizationapproach
that initiated through estimating the illumination field
and then compensates for it byenhancing the local con-
trast of the image in a fashion similar to human visual
perception. This technique has provenhighly effective
with standard face recognition algorithms across many
face databases.
Fig. 1 Block diagram of the
suggested face recognition
approach
Hard
Frontalizaon
AR
AS
DG
DCT
OLHE
MSR
ISN
PN
GF
HOG
ICS_LBP
SHIFT
Dominant Color
Structure
Support Vector
Machine
Facial Image
Representaon
Pre-processing Extracon of Feature
Descriptors
Image Classificaon
P.VenkateswarLal et al.
1 3
Difference of Gaussians (DG) This is a methodology
which depends on the diverse Gaussians to generate a
normalize image. Band-pass filter is employed to the
initial input image prior to extraction of features. In
this paper, log transformations are employed prior to
filtering similar to Sˇ truc and Pavesˇic (2011).
Low-frequency discrete cosine transform aided tech-
nique (Chen etal. 2006) It is an illuminating normali-
zation technique for facial detection, where a discrete
cosine transformation (DCT) is employed to com-
pensate for illumination variations in the logarithm
domain. The rationale is that since illumination varia-
tions mainly lie in the low-frequency band, an appro-
priate number of DCT coefficients are truncated to
minimize variations under different lighting conditions.
Oriented local histogram equalization (OLHE) (Lee
etal. 2012) It is a HE which reimburses illumination
during encrypting quality data on edge orientations.
Multiple Scale Retinex (MSR) (Jobson etal. 1997) It is
a multiple scale retinex (prototype for light and color
intuition) which obtains instantaneous dynamically
ranging firmness, color stability, and light version
with goal to enhance reliability of colored images to
individual perceptions.
Isotropic smoothing normalization (ISN) (Heusch
etal. 2005) In this methodology, the issue of facial
authentication throughout radiance through isotropic
smoothening normalization is employed (a dissemi-
nation phase that fundamentally updates every pixel
employing the mean of its adjacent intensity values,
irrespective of the image data nearby the area beneath
deliberation).
Photometric normalization (PN) (Tan and Triggs
2007): It is a strong illuminating normalization tech-
nique that functions like a flexible and effective pre-
processing sequence aided on gamma alteration, vari-
ations of masking, Gaussian filtering, and distinction
equalization, and normalization using photometric that
eradicates the maximum effects of altering illumination
through further still conserving the crucial appearance
descriptions that are essential for detection.
Gradient faces (GFs) (Zhang etal. 2009) This is not
appropriately an improvement approach nevertheless
an illuminating impervious metric obtained from image
gradient which is strong enough to illumination vari-
ations comprising in unrestrained natural illuminating
atmosphere. In this paper, gradient faces similar to
the preprocessing technique is employed to signify an
image in gradient space.
3.3 Feature extraction
This phase is employed on every pre-processed images in
order to extract dissimilar forms of descriptors from different
features such as texture and color. In this section, four different
types of descriptors are for extraction such as,
histograms of gradients (HOG),
improved center-symmetric local binary patterns (ICS-
LBP),
SIFT descriptors,
dominant color structure descriptors.
Histograms of gradients The HOG depends on assess-
ing a good normalized local histograms of image gradient
orientations in a dense network. The rudimentary notion is
that localized image appearance and shape can frequently be
considered slightly better through the dissemination of local
intensity gradients or edge directions, deprived of any detailed
information of consistent gradient or edge location. In fact, this
is executed through segregating the image window into smaller
spatial sections (“cells”), for every cell adding a localized 1-D
histogram of gradient directions or edge alignments on the
cell pixels. The merged histogram items generates the format.
For higher invariance to illumination, shadowing, etc., it is
likewise beneficial to contrast-normalize the local replies prior
to use them. This could be accomplished through gathering
a measurement of localized histogram power on the certain
higher spatial section known as blocks and employing the out-
comes to normalize entire cells in the block. The normalized
descriptor blocks known as Histogram of Oriented gradient
(HOG) It seizes edge or gradient organization which is the
attribute of local shape, and it performs likewise in a local
demonstration with a flexibly manageable amount of invari-
ance to localized arithmetical and photometric alterations.
Improved center-symmetric local binary patterns The novel
operation categorizes the local feature depending on relative-
ness of centralized pixel and centralized symmetric pixels in
place of gray value variations amongst the centralized symmet-
ric pixels known as CS-LBP that could completely mine the
texture data thrown out by CS-LBP. Even if CS-LBP is further
effective region descriptor compared to LBP, the inclined data
is not deliberated completely due to the unawareness of cen-
tralized pixel. Similarly tough to pick an adjustable threshold.
This issue is addressed by suggesting ICS-LBP. In this novel
approach, the dependency of centralized pixel and centralized
symmetric group of pixels are considered in place of gray-level
alterations amongst centralized symmetric known as CS-LBP.
The schematic operation of LBP, CS-LBP, and ICS-LBP:
S
ICSLBP
(
pi,pc,pi+(p2)
)
=
{
1, (pipcand pcpi+(p2))∕(pi<pcand pcpi+(p2)
)
0, otherwise,
Ensemble oftexture andshape descriptors using support vector machine classification for…
1 3
Here
pi,pi+(p
2
)
and pc refers to gray-level of centralized
symmetric group of pixels and centralized pixel on the circle
of radius R. Using above operation, the binary representation
of LBP, CS-LBP and ICS-LBP are evaluated as:
here (x, y) refers to the coordinates of pixel. It is obvious that
LBP generates 256 (28) diverse binary forms, while CS-LBP
and ICS-LBP generates merely 16 (24) diverse binary pattern
for 8 neighbors.
Scale-invariant feature transform (SIFT) This approach
has been suggested primarily for mining typical invariant
attributes from images to obtain a consistent comparison
amongst diverse viewpoints of an image or picture. It com-
prises of two significant phases: keypoint recognition and
feature descriptor generation. To recognize the keypoints in
the image, this approach includes scale special Difference-
Of-Gaussian (DOG). Once the recognition is performed,
keypoints in image, this approach calculates the gradient
magnitude and orientation in an area around the keypoint
position weighted through a Gaussian window. The result-
ant attribute vector is the dimension of 128 for every sec-
tion. The coordinates of descriptor and the gradient orienta-
tions are revolved comparative to the keypoint alignment to
achieve the orientation independence.
Dominant color structure A novel dominant color struc-
ture descriptor (DCSD) is employed. It is considered to pro-
pose an effective methodology to characterize both color
and spatial structural data using solitary dense descriptor.
This descriptor merges density of dominant color descrip-
tor (DCD) and retrieval accurateness ofCSD to augment the
efficiency in an extremely effective way:
DCD mines image attributes through grouping image
colors to less colors which is given as:
These descriptor comprises of illustrative colors ci, its
percentage Pi, elective color alterations of every dominant
color Vi, and elective three-dimensional coherency Sc of
dominance colors. The QHDM (Manjunath etal. 2001) is
employed to measure the resemblance of DCD. Using this
flexible and dense illustration, DCD permits effective index-
ing for similarity retrieval through fore going accurateness
pertaining to the deficiency of spatial data of description
when matched with another color descriptors.
CSD (Manjunath etal. 2001) is dependent on his-
togram, however, intends at giving a further precise
ICS
LBPP,R(x,y)=
p
21
i=0
SICSLBP
pi,pc,pi+(p2)
×2i
,
F
=
{(
c
i
,P
i
,V
i)
,S
c}
,(i=1N)
.
description through recognizing local color dissemi-
nation of every color. The color structure histogram
(CSH)is employed to characterize CSD for M quantized
color, and is defined:
Here
M∈{32, 64, 128, 256}
and bin value h(m)refers to
the count of structuring elements (se) comprising higher
than single pixels using color cm. Distinct from traditional
histogram, CSH is mined through gathering employing
8 × 8-structuring frame. The se scans image and calcu-
lates numerous times a specific color is present within se.
Let I represent group of quantizing color index and
SI
refers to group of quantizing color index residing within
the subimage section enclosed through the se. Through
the se scanning the image, the color histogram bins are
collected rendering to,
Therefore, the concluding value of H(m) is defined
using numerous locations at which the structuring element
comprises of cm.
3.4 Image classification
In this phase, every preprocessed image combined using
the mined descriptor persuades a dissimilar individual
categorizer or distance metric. Thus, for every descriptor,
diverse value score is present for the referencing image.
The ultimate choice of ensemble is attained through merg-
ing the entire the scores through summation rule. It is a
direct methodology which is chosen as numerous catego-
rizers are fairly higher while comprising the entire pre-
processing images, descriptors and imitation postures
beneath discussion in this approach.
Support vector machines (SVMs) (Cristianini and Shawe-
Taylor 2000) are a standard binary class categorizers which
discovers the hyperplanewhich extremely segregates the
entire points amongst the two categories. SVM functions
on non-linearly segregating issues employing kernel func-
tions to map the data to a higher dimension attribute domain.
Various kernels functions are employed in the paper, how-
ever, the finest outcomes are achieved using a linear kernel.
SVM is accomplished to discriminate amongst honest and
imitator comparison. Consequently, the training set is an
amalgamation x with descriptors xi and xj and class label
l. These are merged so as to attain below resultant vector:
here the element-wise power and division (./) are executed.
H(m),m=1M.
x
=
(
x
i
x
j)2
.
(
x
i
+x
j),
P.VenkateswarLal et al.
1 3
4 Experimental results
The suggested face recognition system is experiments
using Matlab 16b version on two benchmark datasets such
as FERET (Phillips etal. 2000) and LFW (Huang etal.
2007). The comparison of the suggested face recognition
approach is accomplished against Robust Face Recogni-
tion using Multi-Scale Feature Pattern Sparse (Previous
Paper), Texture Ensemble (Lumini etal. 2017) and Face
Recognition based on Improved Robust Sparse Coding
Algorithm (Jun-Kai 2015).
4.1 4.1 Dataset descriptions
FERET database The FERET data sample comprises of
five data samples: Fc (194 images), Fb (1195 images),
Fa (1196 images), Dup2 (234 images) and Dup1 (722
images). The typical FERET estimation procedure
contains matching images in validating group to every
image in dataset. In this experiment, the entire images of
FERET gray scale are associated through true eye loca-
tions and reaped with 110 × 110 pixels.
LFW database This [54] dataset comprises of 13,233
images of 5749 personalities which are gathered from
internet. Totally, 1680 faces occurs in higher than two
images. Two views are given in LFW data sample. First
view comprises of a validating group of 2200 facial
pairs and a validating group with 1000 facial pairs and
employed for choosing of pattern merely. View 2 com-
prises of 10 non-overlapping group of 600 matches and
is for reporting the performance.
4.2 Results comparison
The experimentation in Tables1 and 2 was targeted towards
estimating diverse descriptors when merged using the pre-
processing approaches given in Sect.3.2. It exhibited that
efficiency of every descriptor is merged with every pre-
processing approach, considering the entire by sum rule of
complete techniques in the similar column. The outcomes
in Tables1 and 2 obviously demonstrate that the integra-
tion achieved through merging the entire the preprocessing
approach gives better results compared to the good single
preprocessing approach for every descriptor. The other
stimulating discovery is that good descriptor for this cat-
egorization issue is HOG, and the finest unique preproc-
essed approach is SHIFT. Table1 refers to the experiment
evaluated using FERET Datasets and Table2 refers to the
experiment evaluated using LFW Datasets.
The average classification accuracies using Support Vec-
tor Machines comparison is shown in Table3. From the
results in Table3, it is clearly shown that from the addressed
existing approaches, the proposed methodology has higher
classification accuracy for both the datasets compared to
the existing. It is inferred that the suggested preprocessing
approaches and texture, color descriptors has enhanced the
Table 1 Accuracies of merging preprocessed approaches for FERET
datasets
Preprocessing
approaches
Feature descriptors
HOG ICS_LBP SHIFT Dominant
color struc-
ture
AR 82.3 79.2 83.4 81.2
AS 86.6 81.5 84.5 82.1
DG 87.7 82.5 88.6 85.6
DCT 85.1 82.0 84.2 85.0
OLHE 83.4 84.3 80.3 81.7
MSR 85.1 83.1 82.3 81.0
ISN 86.4 82.3 84.3 81.4
PN 84.0 81.4 82.5 83.4
GF 84.0 81.0 82.1 81.5
All 88.3 85.6 82.6 85.5
Table 2 Accuracies of combining the preprocessing methods for
LFW datasets
Preprocessing
approaches
Feature descriptors
HOG ICS_LBP SHIFT Dominant
color struc-
ture
AR 82.5 79.6 84.2 82.2
AS 85.3 80.5 85.2 86.1
DG 84.6 83.7 84.5 85.7
DCT 86.1 86.0 85.0 85.5
OLHE 83.9 82.7 81.4 83.7
MSR 84.2 85.7 84.2 82.8
ISN 85.1 84.4 82.8 81.9
PN 82.8 82.8 82.4 83.0
GF 82.0 83.4 82.4 81.4
All 87.6 86.4 83.8 87.3
Table 3 Comparison of average classification accuracy for specific
datasets
References Fb Fc Dup1 Dup2 LFW
Chai etal. 2014 99.9 100 95.7 93.1
Juefei-Xu etal. 2015 – – 87.55
Proposed approach 99.5 100 94 92.5
Ensemble oftexture andshape descriptors using support vector machine classification for…
1 3
efficiency of face recognition. Figure2, refers to the average
classification accuracy of both the datasets. From the fig, it
can be inferred thatthe suggested methodology has higher
accuracy values when matched with the existing Ensemble
texture descriptors for face recognition.
5 Conclusions
In this paper, an ensemble of feature descriptors based
face recognition technique is suggested that classifies the
facial images in any wild environment. Usually, for recog-
nition facial images shape, texture and color plays a very
significant role in the extraction of significant feature for
accurate and efficient categorization. For this purpose, in
the proposed methodology, color, texture, and orientation
feature descriptors are considered such as HOG, ICS_LBP,
SHIFT, and color dominant structure descriptors. The facial
images from the wild datasets are represented using the hard
frontalization techniques that always attempts to obtain the
frontal view of an image. Further, dissimilar pre-processing
techniques are employed to obtain to have the image without
any noise elements. Upon which, the extracted color and tex-
ture features are employed for facial image recognition using
support vector machine. The experimental results is carried
out on FERET and LFW Datasets and the results inferred
that the suggested methodology gives effective results when
matched with previous methods and has good average clas-
sification accuracy. The average classification accuracies for
the both the data samples are 99% and 94% respectively.
References
Cao Q, Ying Y, Li P (2013) Similarity metric learning for face rec-
ognition. In: Proceedings of the international conference on
computer vision. IEEE, pp2408–2415
Chai Z etal (2014) Gabor ordinal measures for face recognition.
IEEE Trans Inf Foren Secur 9(1):14–26
Chen W, Meng-Joo E, Shiqian W (2006) Illumination compensation
and normalization for robust face recognition using discrete
cosine transform in logarithm domain. IEEE Trans Syst Man
Cybernet Part B 36:458–466
Chen D, Cao X, Wang L, Wen F, Sun J (2012) Bayesian face revis-
ited: a joint formulation. In: Fitzgibbon A, Lazebnik S, Perona
P, Sato Y, Schmid C (eds) Computer vision–ECCV 2012, vol
7574. Springer, Berlin, pp 566–579
Cristianini N, Shawe-Taylor J (2000) An introduction to support vec-
tor machines and other kernel-based learning methods. Cam-
bridge University Press, Cambridge
Dong X, Wu F, Jing XY (2018) Generic training set based multi-
manifold discriminant learning for single sample face recogni-
tion. KSII Trans Internet Inf Syst 12(1):368–391
Gross R, Brajovic V (2003) An image preprocessing algorithm for
illumination invariant face recognition. In: Kittler J, Nixon MS
(eds) 4th International conference on audio-and video-based
biometric person authentication, vol 2688. Springer, Berlin, pp
10–18. https ://doi.org/10.1007/3-540-44887 -X_2
Gupta P, Saxena N, Sharma M, Tripathia J (2018) Deep neural net-
work for human face recognition. IJ Eng Manuf 1:63–71
Heusch G, Cardinaux F, Marcel S (2005) Lighting normalization
algorithms for face verification. IDIAP, p 9
Huang GB, Ramesh M, Berg T, Learned-Miller E (2007) Labeled
faces in the wild: a database for studying face recognition in
unconstrained environments. University of Massachusetts,
Amherst (TR 07–49)
Jobson DJ, Rahman Zu, Woodell GA (1997) A multiscale retinex for
bridging the gap between color images and the human observation
of scenes. IEEE Trans Image Process 6(7):965–976
Fig. 2 Average Classification
Accuracies comparison of
FERET and LFW Datasets
P.VenkateswarLal et al.
1 3
Juefei-Xu F, Luu K, Savvides M (2015) Spartans: single-sample perioc-
ular-based alignment-robust recognition technique applied to non-
frontal scenarios. IEEE Trans Image Process 24(12):4780–4795
Kamencay P, Benco M, Mizdos T, Radil R (2017) A new method for
face recognition using convolutional neural network. Digital
Image Process Comput Graphics 15(4):663–672
Lam KM, Yan H (1996) Locating and extracting the eye in human face
images. Elsevier Science Inc, Amsterdam, pp771–779
Lam K-M, Yan H (1998) An analytic-to-holistic approach for face rec-
ognition based on a single frontal view. IEEE Trans Pattern Anal
Mach Intell 7(20):673–686
Lee PH, Wu SW, Hung YP (2012) Illumination compensation using
oriented local histogram equalization and its application to face
recognition. IEEE Trans Image Process 21:4280–4289
Liu Y, Li G, Cai X, Li X (2006) An efficient face normalization algo-
rithm based on eyes detection. In: International conference on
intelligent robots and systems. IEEE/RSJ, pp3843–3848
Lumini A, Nanni L, Brahman S (2017) Ensemble of texture descrip-
tors and classifiers for face recognition. Appl Comput Inform
13:79–91
Manjunath BS, Ohm JR, Vasudevan VV, Yamada A (2001) Color
and texture descriptors. IEEE Trans Circuits Syst Video Technol
11(6):703–715
Menon AK (2007) Random projections and applications to dimension-
ality reduction, Ph.D. thesis, School of Information Technologies,
The University of Sydney, Australia
Ouanan H, Ouanan M, Aksasse B (2015a) Gabor–Zernike fea-
tures based face recognition scheme. Int J Imaging Robot
16(2):118–131
Ouanan H, Ouanan M, Aksasse B (2015b) Face recognition by neural
networks based on Gabor filters and random projection. Int J Math
Comput 26(12):30–42
Ouanan H, Ouanan M, Aksasse B (2015c) Gabor-HOG features based
face recognition scheme. TELKOMNIKA Indones J Electr Eng
15(12):331–335
Ouanan H, Ouanan M, Aksasse B (2016) Gabor-Zernike features based
face recognition scheme. Int J Imaging Robot 16(12):118–131
Park YK, Park SL, Kim JK (2008) Retinex method based on adaptive
smoothing for illumination invariant face recognition. Sig Process
88(8):1929–1945
Phillips J etal (2000) The feret evaluation methodology for face
recognition algorithms. IEEE Trans Pattern Anal Mach Intell
22:1090–1104
Sˇ truc V, Pavesˇic N (2011) Photometric normalization techniques for
illumination invariance. In: Sˇ truc V, Pavesˇic N (eds) Advances
in face image analysis: techniques and technologies. IGI Global,
Hershey, pp279–300
Tan X, Triggs B (2007) Enhanced local texture feature sets for face
recognition under difficult lighting conditions. Anal. Model. Faces
Gestures, vol 4778. LNCS, pp 168–182
Wu J, Wilamowska K, Shapiro L, Heike C (2009) Automatic analy-
sis of local nasal features in 22q11.2DS affected individuals. In:
Engineering in medicine and biology society, annual international
conference of the IEEE, pp3597–3600
Xu Y, Zhu Q, Fan Z etal (2013) Using the idea of the sparse rep-
resentation to perform coarse-to-fine face recognition. Inf Sci
238(7):138–148
Yuille AL, Hallinan PW, Cohen DS (1992) Feature extraction
from faces using deformable templates. Int J Comput Vision
8(2):99–111
Zhang T etal (2009) Face recognition under varying illumination using
gradient faces. IEEE Trans Image Process 18(11):2599–2606
Publisher’s Note Springer Nature remains neutral with regard to
jurisdictional claims in published maps and institutional affiliations.
... SVM finds parallel hyperplanes passing through a small number of support vectors to reduce the structural risk of the classification problem. The obtained hyperplane separates the half-spaces with the lowest classification error (see [8,9,15,55,60,61,63]). SVM [28, 43, 52, 64] has emerged as a potent pattern recognition tool over the past decade and has been applied to HAR problems. ...
Article
Full-text available
Applying machine learning tools to human activity analysis presents two major challenges: Firstly, the transformation of actions into multiple attributes increases training and testing time significantly. Secondly, the presence of noises and outliers in the dataset adds complexity and makes it difficult to implement the activity detection system efficiently. This paper addresses both of the challenges by proposing a kernel fuzzy proximal support vector machine as a robust classifier for multicategory classification problems. It transforms the input patterns into a higher-dimensional space and assigns each pattern an appropriate membership degree to reduce the effect of noises and outliers. The proposed method only requires the solution of a set of linear equations to obtain the classifiers; thus, it is computationally efficient. The computer simulation results on ten UCI benchmark problems show that the proposed method outperforms established methods in predictive accuracy. Finally, numerical results from three human activity recognition problems validate the applicability of the proposed method.
Article
Face recognition and classification have gained increasing attraction in the recent decades due to their widespread adoption in real time application systems. Most of the conventional research efforts focused on developing face recognition frameworks using enhanced optimization‐based classification methods, they are hampered by issues such as computational complexity, increased overhead, limited capacity to handle large datasets, and lengthy processing time. The novel contribution of this paper is to develop a highly competent and precise face recognition methodology through an innovative mechanism. In this framework, the initial step involves face detection from input images using an analytical face parts detection methodology. Subsequently, the tutor face filtering (TFF) technique is applied to preprocess the face image, enhancing its quality and filtering out noise content. Following this preprocessing step, features are extracted from the processed image using the direction‐based pattern extraction (DBPE) model. To improve classifier accuracy, a novel adaptive gravitational search optimization (AGSO) technique is employed to select the optimal features during model training. Finally, an integrated deep learning model, referred to as convolutional neural network — long short‐term memory (LSTM), is utilized for accurate face image recognition based on the selected optimal features. To assess and compare the system's performance, various metrics are employed in the results analysis to demonstrate the superiority of the proposed approach.
Chapter
Many decagons faced several unresolved problems because of the growth of machine learning, including image identification, image detection, picture categorization, etc. The most fundamental, traditional, and important subject matter of research in the area of machine learning has always been image recognition. Image recognition software progresses in society at a faster rate than technology. The protection of personal information, for instance, when using mobile phones, depends on the picture recognition. For picture recognition, we used the GAN algorithm and the CNN algorithm. To categorize segment, and recognize images, machine learning-based image preprocessing technology is used. Nevertheless, because of the intricacy of video images and the current nature of things in various application qualifications, accurate categorization becomes vital and difficult. The usage of image recognition technologies is very useful in the future generation.
Article
Full-text available
Navigating the complex landscape of 21st-century agriculture involves overcoming numerous obstacles, such as changing dietary trends, food safety issues, and health concerns due to soil inconsistencies, climatic fluctuations, and varied agricultural practices. The global population surge, climate change, and resource depletion compound these issues. For various stakeholders, including farmers and policymakers, precise crop yield predictions at diverse spatial levels can be immensely advantageous. The value of these forecasts heightens when they are available at multiple spatial resolutions. However, accurately predicting the complex interplay between many data sources and regional yields presents a significant challenge. Conventional methods often fall short, delivering inconsistent results that are difficult to generalize due to their limited ability to incorporate spatial and temporal features, insufficient understanding of market trends, and challenges with scalability and nonlinearity. This paper proposes an innovative Deep Learning approach that adeptly captures and integrates spatial and temporal features, marking a substantial enhancement over traditional methodologies that often grapple with these aspects. This approach forecasts crop yields with a minimal error rate, leveraging the robustness of a unique WaveNet and LSTM hybrid architecture, introducing a fresh perspective to agricultural yield predictions. The novelty of our methodology lies in its two-tier design: the preliminary phase involves pre-processing similar to existing models, and the second phase harnesses the combined power of WaveNet and LSTM for Feature Extraction and regression, enabling precise predictions. The model has been tested on four crops in the Netherlands using varied data splitting criteria, demonstrating stellar performance by offering predictions with minimal error rates. When compared against prevailing machine learning methodologies using an identical Netherlands dataset, our approach outperforms them, highlighting its efficacy and potential practical application in real-world agricultural settings.
Article
Full-text available
Face recognition (FR) with a single sample per person (SSPP) is common in real-world face recognition applications. In this scenario, it is hard to predict intra-class variations of query samples by gallery samples due to the lack of sufficient training samples. Inspired by the fact that similar faces have similar intra-class variations, we propose a virtual sample generating algorithm called k nearest neighbors based virtual sample generating (kNNVSG) to enrich intra-class variation information for training samples. Furthermore, in order to use the intra-class variation information of the virtual samples generated by kNNVSG algorithm, we propose image set based multimanifold discriminant learning (ISMMDL) algorithm. For ISMMDL algorithm, it learns a projection matrix for each manifold modeled by the local patches of the images of each class, which aims to minimize the margins of intra-manifold and maximize the margins of inter-manifold simultaneously in low-dimensional feature space. Finally, by comprehensively using kNNVSG and ISMMDL algorithms, we propose k nearest neighbor virtual image set based multimanifold discriminant learning (kNNMMDL) approach for single sample face recognition (SSFR) tasks. Experimental results on AR, Multi-PIE and LFW face datasets demonstrate that our approach has promising abilities for SSFR with expression, illumination and disguise variations.
Article
Full-text available
In this paper, the performance of the proposed Convolutional Neural Network (CNN) with three well-known image recognition methods such as Principal Component Analysis (PCA), Local Binary Patterns Histograms (LBPH) and K–Nearest Neighbour (KNN) is tested. In our experiments, the overall recognition accuracy of the PCA, LBPH, KNN and proposed CNN is demonstrated. All the experiments were implemented on the ORL database and the obtained experimental results were shown and evaluated. This face database consists of 400 different subjects (40 classes/ 10 images for each class). The experimental result shows that the LBPH provide better results than PCA and KNN. These experimental results on the ORL database demonstrated the effectiveness of the proposed method for face recognition. For proposed CNN we have obtained a best recognition accuracy of 98.3 %. The proposed method based on CNN outperforms the state of the art methods.
Article
Full-text available
Presented in this paper is a novel system for face recognition that works well in the wild and that is based on ensembles of descriptors that utilize different preprocessing techniques. The power of our proposed approach is demonstrated on two datasets: the FERET dataset and the Labeled Faces in the Wild (LFW) dataset. In the FERET datasets, where the aim is identification, we use the angle distance. In the LFW dataset, where the aim is to verify a given match, we use the Support Vector Machine and Similarity Metric Learning. Our proposed system performs well on both datasets, obtaining, to the best of our knowledge, one of the highest performance rates published in the literature on the FERET datasets. Particularly noteworthy is the fact that these good results on both datasets are obtained without using additional training patterns. The MATLAB source of our best ensemble approach will be freely available at https://www.dei.unipd.it/node/2357.
Article
Full-text available
This article is part of the context of biometrics whose objective is the recognition of people through their faces. We propose an approach based on the texture characteristics, using the Gabor filters (GF), specifically; we construct a vector descriptor combining the amplitude and the phase of the convolution of the face image with a family of Gabor filters characterized by different orientations and resolutions. Moreover, to reduce the dimensionality of the descriptor vector, we make use of Random Projection (RP). Finally Multi-layer perceptron (MLP) based Back Propagation Neural Network (BPNN) has been used for classification.
Article
Full-text available
A Facial recognition (FR) system in still images is an important application in computer vision and image processing. Extraction of invariant features is the core of FR systems. In this paper, we propose a novel and efficient facial image representation based on Gabor energy filters (GFs) and Complex Zernike moments (ZMs), where GFs is used for texture feature extraction and ZMs extracts shape features, Almost all existing methods use only magnitude component of the ZMs (respectively GFs) as features in recognition task. Recently it is well known that the phase component of moments (respectively Gabor Filters) also captures useful information for image representation, in other hand, a simple Genetic Algorithm (GA) is applied to select the moment features that better discriminate human faces and facial expressions, under several pose and illumination conditions. Next, the extracted feature vectors are projected onto a low-dimensional subspace using Random Projection (RP) technique. The (GF+ZM+RP) feature vectors are then applied to a powerful face support vector machine (SVM) classifier, employing the Gaussian radial basis function as kernel function (RBF kernel). Comprehensive performance evaluation of our proposed algorithm is based on Libor Spacek’s Facial Images Databases and Color FERET Database. It can be concluded from the experimental results that the performance of the proposed FR system outperforms other related approaches in terms of recognition rate.
Article
Full-text available
Extraction of invariant features is the core of Face RecognitionSystems (FRS). This work proposes a novel feature extractor-fusion scheme using two powerful feature descriptor known as Gabor Filters (GFs) and Histogram of Oriented Gradient (HOG), which the face image is filtered with the multiscale multiresolution Gabor filter bank to generate multiple Gabor magnitude images (GMIs), then the down-sampled GMIs and apply Histogram of Oriented Gradient to form the features. The experimental results on the FERET face database show the effectiveness of our methods.
Article
Face recognition technology has come a long way since its beginnings in the previous century. Due to its countless application possibilities, it has attracted the interest of research groups from universities and companies around the world. Thanks to this enormous research effort, the recognition rates achievable with the state-of-the-art face recognition technology are steadily growing, even though some issues still pose major challenges to the technology. Amongst these challenges, coping with illumination-induced appearance variations is one of the biggest and still not satisfactorily solved. A number of techniques have been proposed in the literature to cope with the impact of illumination ranging from simple image enhancement techniques, such as histogram equalization, to more elaborate methods, such as anisotro-pic smoothing or the logarithmic total variation model. This chapter presents an overview of the most popular and efficient normalization techniques that try to solve the illumination variation problem at the preprocessing level. It assesses the techniques on the YaleB and XM2VTS databases and explores their strengths and weaknesses from the theoretical and implementation point of view.