Multimodal biometric scheme for human authentication
technique based on voice and face recognition fusion
Anter Abozaid¹ · Ayman Haggag¹ · Hany Kasban² · Mostafa Eltokhy¹

Received: 6 March 2018 / Revised: 31 October 2018 / Accepted: 30 November 2018
© Springer Science+Business Media, LLC, part of Springer Nature 2018
Abstract
In this paper, an effective multimodal biometric identification approach for human authentication based on face and voice recognition fusion is proposed. Cepstral coefficients and statistical coefficients are employed to extract voice features, and these two feature categories are compared. Face recognition features are extracted utilizing two extraction techniques, Eigenface and Principal Component Analysis (PCA), and the results are compared. Voice and face identification are performed using three different classifiers: Gaussian Mixture Model (GMM), Artificial Neural Network (ANN), and Support Vector Machine (SVM). The combination of the two biometric systems, voice and face, into a single multimodal biometric system is performed using features fusion and scores fusion. The computer simulation experiments reveal that, for voice recognition, better results are given when the cepstral coefficients and statistical coefficients are utilized together, while for face recognition the Eigenface and SVM experiment gives better results. Also, in the proposed multimodal biometric system the scores fusion performs better than the other scenarios.
Keywords Multimodal biometrics · SVM · ANN · GMM · Voice identification · Face recognition
1 Introduction
Biometric techniques are utilized for human identification and security. Biometrics refers to the technology applied to measure human physical characteristics, and it is considered a very promising tool for human authentication. Most biometric authentication techniques in use today take advantage of a unimodal biometric authentication scheme to execute the authentication process. A unimodal biometric authentication technique identifies the person
Multimedia Tools and Applications
https://doi.org/10.1007/s11042-018-7012-3
* Anter Abozaid
anter19731973@gmail.com
¹ Electronics Technology Department, Faculty of Industrial Education, Helwan University, Cairo, Egypt
² Engineering Department, Nuclear Research Center, Atomic Energy Authority, Cairo, Egypt
based on only one source of biometric data, such as a fingerprint, face, voice, hand geometry, palm print, gait, ear, retina, iris, or signature. Many researchers have presented the state of the art and have surveyed and compared different unimodal biometric methods [1, 14, 21].
Unimodal biometrics faces several challenges. Noise in the captured raw data, arising from the environment surrounding the sensor, may cause the person to be mislabeled and increases the false negative rate. Authentication using a unimodal biometric may also fail to capture significant biometric data from some persons because of failure-to-enroll errors. In addition, a biometric authentication system may suffer from spoofing attacks, in which an impostor attempts to imitate the trait of a validly enrolled subject [3]. To overcome the challenges of a unimodal biometric authentication system, different biometric systems can be combined by employing an approach that merges multiple sources of biometric input into a single decision; such a scheme is called multimodal biometric authentication.
Authentication using multimodal biometric schemes has been presented in [25], which also gives a good study of the diverse systems and architectures related to the multimodal idea. Biometric authentication with multimodal schemes improves the matching accuracy of the authentication process and achieves more reliability and security than a unimodal biometric system, because it takes a combination of different behavioral or physiological characteristics of the person into account to distinguish that person. This model resembles data or image security achieved by merging and combining different multi-level security techniques [11].
The most important challenge in implementing a multimodal biometric scheme is fusing the different modality inputs, such as a face image and a voice signal, because the fusion process must be performed considering the particular modality of each biometric input. Information fusion in multimodal biometric schemes may be performed before or after classification. In before-classification fusion, the information is integrated before the matching algorithm is applied, while in after-classification fusion the information is integrated after the matching algorithm is applied [18].
The rest of this paper is arranged as follows: in Section 2, the related work is presented. The voice recognition technique is introduced in Section 3. In Section 4, the face recognition techniques are discussed. The multimodal biometric fusion is presented in Section 5. Section 6 presents the computer simulation experiment results. Finally, the last section gives the concluding remarks.
2 Related work
In this section, previous research work and recently published papers are presented. Many researchers have presented different multimodal biometric schemes for person verification using voice and face. The voice and the face are integrated because they are easy to acquire in a short time with acceptable accuracy using low-cost technology. Srinivas Halvi et al. in 2017 presented a face recognition model based on transform domain and fusion techniques. The proposed model is given in Fig. 1.
In this model, two transform domain techniques are utilized, DWT and FFT, as shown in Fig. 1. The features extracted from the DWT and FFT are compared using the Euclidean Distance (ED) for computing the performance parameters [16]. Biometric security is also utilized to enhance Wireless Body Area Network (WBAN) security. A WBAN is a wireless network for medical applications; it can be considered a special branch of the Wireless Sensor Networks (WSNs) [9, 10, 29].
Poh and Korczak presented a hybrid prototype combining face and text-dependent voice biometrics to implement person authentication [28]. In this prototype, the face information is extracted into the features vector using moments, and the speech information is extracted using wavelets. The derived features are classified using two separate multilayer networks. This scheme achieved an Equal Error Rate (EER) equal to 0.15% for face recognition and 0.07% for voice recognition [28].
Chetty and Wagner suggested a powerful multilevel fusion strategy including a hybrid cascaded multimodal fusion of audio, Two-Dimensional (2-D) lip face motion, Three-Dimensional (3-D) face correlation and depth, and a tri-module combination (audio, lip motion, and correlation and depth) for biometric person authentication. The extracted audio features vector consists of Mel Frequency Cepstral Coefficient (MFCC) features, while the features vector extracted from the face images consists of three types of features: Discrete Cosine Transform (DCT) features, explicit grid-based lip motion (GRD) features, and contour-based lip motion (CTR) features. The features extracted from the 3-D face are 3-D shape and texture features. The audio signals are degraded by additive white Gaussian noise, and the visual speech is degraded with JPEG compression. The presented technique achieved an EER equal to 42.9%, 32%, 15% and 7.3% for audio, lip face motion, 3-D face and tri-module, respectively [6].
Palanivel and Yegnanarayana suggested a multimodal person authentication method based on speech, face and visual speech. Face recognition is performed using the Morphological Dynamic Link Architecture (MDLA) method; the features vector extracted from the speech consists of the Weighted Linear Prediction Cepstral Coefficient (WLPCC) features, while the features vector of the face is extracted using morphological operations. The extracted features are classified using an Auto Associative Neural Network (AANN). The EER for the face and the voice was 2.5% and 9.2%, respectively, and the EER equals 0.45% for the multimodal system [27].
Fig. 1 Block diagram of face recognition and fusion technique model [16]
Raghavendra et al. presented a person verification method based on voice and face. The features vector for voice recognition consists of the WLPCC features, while the face features vector is extracted using 2D LDA. The fusion of these features is accomplished using a GMM. The results achieved an EER equal to 2.1% for face recognition, 2.7% for voice recognition, and 1.2% for the multimodal system [30]. In [12], person authentication by a hierarchical multimodal method based on face and voice is presented. MFCC features are extracted from the voice, and a Gabor filter bank is used to establish the face features vector. The Cosine Mahalanobis Distance (CMD) is used for measuring the similarity between the projection coefficients. The results achieved an EER equal to 1.02% for face recognition, 22.37% for voice recognition, and 0.39% for the multimodal system.
In [32], a biometric authentication technique based on face and voice recognition is presented. The features vector for the voice consists of the MFCC features, while the face features vector is extracted using eigenfaces. The fusion of these features is accomplished using a GMM. The results achieved an EER equal to 0.39995% for face recognition, 0.00539% for voice recognition, and 0.28125% for the multimodal system [32]. Another person authentication technique based on face and voice recognition is presented in [19]. The features vector for the voice consists of the MFCC, LPC and LPCC features, while the face features vector is extracted using PCA, LDA and Gabor filters. The fusion of these features is accomplished using the Likelihood Ratio (LLR). The results achieved an EER equal to 1.95% for face recognition, 2.24% for voice recognition, and 0.64% for the multimodal system [19].
Table 1 summarizes several multimodal biometric schemes for person verification using face and voice, with and without fusion techniques.
In this research paper, a combined multimodal biometric scheme is proposed for person authentication based on the fusion of voice and face recognition. The proposed scheme combines different biometric modalities to establish a reliable biometric identification system. The fusion process is carried out using two methods: feature fusion, which uses the extracted features, and score fusion, which uses the matching scores.
3 Voice recognition
Several researchers have presented voice recognition as a unimodal biometric personal authentication system. There are several advantages of using the voice as a biometric for personal authentication.
Table 1 Different schemes of multimodal biometrics using face and voice

Multimodal biometric scheme | Extracted features (Face / Voice) | Fusion technique | Database | Results EER % (Face / Voice / Fusion)
Poh et al. [28] | Moments / Wavelet | No fusion | Persons | 0.15 / 0.07 / –
Chetty et al. [6] | DCT, GRD, CTR / MFCC | GMM | AVOZES | 3.2 / 4.2 / 0.73
Palanivel et al. [27] | MDLA / WLPCC | GMM | Newspapers | 2.9 / 9.2 / 0.45
Raghavendra [30] | 2D LDA / LPCC | GMM | VidTIMIT | 2.1 / 2.7 / 1.2
Elmir et al. [12] | Gabor filter / MFCCs | CMD | VidTIMIT | 1.02 / 22.37 / 0.39
Soltane [32] | Eigenfaces / MFCC | GMM | eNTERFACE | 0.399 / 0.0054 / 0.281
H. Kasban [19] | PCA, LDA, Gabor filter / MFCCs, LPCs, LPCCs | LLR | Proposed | 1.95 / 2.24 / 0.64
The voice biometric is an intuitive and natural technology because it uses the human voice; it can supply remote authentication without the need for the user's physical presence; and it is also a low-cost technology. The disadvantages of using the voice for biometric authentication are: speech variability due to background noise and temporary voice alterations; relatively low security and poor accuracy; and its susceptibility to cross-channel conditions. The block diagram of the voice recognition approach used in this research paper is shown in Fig. 2; it operates in two modes, a training mode and a recognition mode [17, 26].
The first step in the voice training mode is the feature extraction process, which transforms the voice signal into a features vector. In this paper, two categories of features are used. The first category is the statistical coefficient features derived from the voice signal, such as the mean, the standard deviation, the median, the third quartile, and the dominant. The second category is the voice features extracted in the form of cepstral coefficients, such as Mel Frequency Cepstral Coefficients (MFCCs), Linear Prediction Coefficients (LPCs), and Linear Prediction Cepstral Coefficients (LPCCs) [17]. The second step in the voice training mode is speaker modeling, which is carried out using three classifiers: Artificial Neural Network (ANN), Support Vector Machine (SVM), and Gaussian Mixture Model (GMM) [23].
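As an illustrative sketch of the statistical feature category described above, such a vector can be computed with NumPy. The exact feature set is our assumption, as is the reading of "the dominant" as the dominant (peak) spectral frequency:

```python
import numpy as np

def statistical_features(signal, fs=8000):
    """Sketch of a statistical feature vector: mean, standard deviation,
    median, third quartile, and dominant spectral frequency (assumed)."""
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    dominant = freqs[np.argmax(spectrum)]  # frequency bin with peak energy
    return np.array([
        np.mean(signal),
        np.std(signal),
        np.median(signal),
        np.percentile(signal, 75),         # third quartile
        dominant,
    ])

# Example: a pure 1 kHz tone sampled at 8 kHz (as in the paper's database)
t = np.arange(8000) / 8000.0
feats = statistical_features(np.sin(2 * np.pi * 1000 * t))
```

The dominant frequency of the tone above comes out as 1000 Hz, since the FFT bin spacing for one second of audio is exactly 1 Hz.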
3.1 Artificial neural network (ANN)
An Artificial Neural Network (ANN) is an information processing paradigm that is inspired by
the way biological nervous systems, such as the brain, process information. The key element of
this paradigm is the novel structure of the information processing system. It is made up of a
large number of highly interconnected processing elements (neurons) working in unison to
solve specific problems. ANNs, like people, learn by example. An ANN is prepared for a
specific application, such as pattern differentiation or data classification, through a learning
process. Learning in biological systems includes adjustments to the synaptic connections that
exist between the neurons [20]. The main idea of the Hölder function is as follows. Consider f(t) to be Hölder continuous, so that ∣f(t + Δt) − f(t)∣ ≤ const · (Δt)^α(t), with α(t) ∈ [0, 1]. The case α = 0 means that there is a break of second order, while α = 1 means that the increment behaves as O(Δt). This formula is, in a sense, a connection between "bad" functions and "good" functions. If we look at this formula more precisely, we notice that we can catch moments in time
Fig. 2 Speaker identification approach (block diagram: in the voice training mode, feature extraction and speaker modeling build the speaker model database; in the voice recognition mode, the degraded speech signal passes through feature extraction and pattern matching against the database to produce the decision)
when our function knows that it is going to change its behavior from one regime to another. This means that today we can make a forecast of tomorrow's behavior, although it should be mentioned that we do not know the sign of the coming change [15].
3.2 Support vector machine (SVM)
Support Vector Machine (SVM) is a classification and regression prediction tool that uses machine learning theory to maximize predictive accuracy while automatically avoiding over-fitting to the data. Support vector machines can be viewed as systems that use a hypothesis space of linear functions in a high-dimensional feature space, trained with a learning algorithm from optimization theory that implements a learning bias derived from statistical learning theory.
Here we present the QP formulation for SVM classification; this is a simplified representation only. The SV classification problem is shown in Eq. (1):

min_{f, ξ_i}  ‖f‖²_K + C Σ_{i=1}^{l} ξ_i,   subject to   y_i f(x_i) ≥ 1 − ξ_i,  ξ_i ≥ 0 for all i        (1)

The dual formulation of SVM classification is:

max_{α}  Σ_{i=1}^{l} α_i − (1/2) Σ_{i=1}^{l} Σ_{j=1}^{l} α_i α_j y_i y_j K(x_i, x_j),   subject to   0 ≤ α_i ≤ C for all i,  Σ_{i=1}^{l} α_i y_i = 0        (2)
The variables ξ_i are called slack variables, and they measure the error made at point (x_i, y_i). Training an SVM becomes quite challenging when the number of training points is large, and a number of methods for fast SVM training have been proposed [4].
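The soft-margin classifier of Eqs. (1) and (2) can be sketched with scikit-learn's SVC, which solves the same dual QP internally; the toy data, the RBF kernel, and C = 1.0 are illustrative assumptions, not the paper's experimental settings:

```python
import numpy as np
from sklearn.svm import SVC

# Two well-separated Gaussian clusters as a toy two-class problem
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2, 1, (50, 2)), rng.normal(2, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)

# C is the slack penalty from Eq. (1); the kernel K appears in Eq. (2)
clf = SVC(kernel="rbf", C=1.0).fit(X, y)
pred = clf.predict([[-2.0, -2.0], [2.0, 2.0]])  # one point near each cluster
```

For speaker or face classification, X would hold the extracted features vectors and y the enrolled identities.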
3.3 Gaussian mixture models (GMMs)
The GMM model contains a finite number of Gaussian distributions defined by three sets of parameters: the weights w_j, the mean vectors μ_j, and the covariance matrices ε_j. These parameters are estimated using the Expectation Maximization (EM) algorithm. For an input vector X = {X_1, ..., X_m}, the log-likelihood L of the GMM can be defined by [31] as given in Eq. (3):

L = log p(X | λ_j) − log p(X | λ_j̄)        (3)

where λ_j = (w_j, μ_j, ε_j) and λ_j̄ = (w_j̄, μ_j̄, ε_j̄) are the model of speaker j and the background model of speaker j, respectively.
In the voice recognition mode, after degrading the voice signal, the features vector is extracted from the voice signal as in the training mode. After that, pattern matching is executed by measuring the probability density of the observation given by the Gaussians. The likelihood of the features vector under the GMM is the weighted sum over the likelihoods of the Gaussian densities, as shown in Eq. (4):

p(x_i | λ) = Σ_{j=1}^{M} w_j b(x_i, λ_j)        (4)
The likelihood of x_i given the jth Gaussian mixture is given in Eq. (5):

b(x_i, λ_j) = (1 / ((2π)^{D/2} |ε_j|^{1/2})) exp( −(1/2) (x_i − μ_j)^T ε_j^{−1} (x_i − μ_j) )        (5)
where D is the vector dimension, and μ_j and ε_j are the mean vector and covariance matrix of the training vectors, respectively.
Pattern matching is executed by calculating the matching score between the features stored in the speaker model database and the given model in the recognition mode. The features extracted in the recognition mode are compared with the features stored in the speaker model database, and finally the decision is made. The decision is taken on the basis of the matching score: the speaker is either accepted as genuine or rejected as an impostor.
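A minimal sketch of the GMM scoring of Eq. (3), assuming scikit-learn's GaussianMixture for both the speaker model and the background model; the mixture sizes and the synthetic feature vectors are our assumptions, not the paper's settings:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(1)
speaker_feats = rng.normal(5.0, 1.0, (200, 4))     # enrollment features
background_feats = rng.normal(0.0, 1.0, (200, 4))  # "world"/background data

# Fit the speaker model lambda_j and the background model lambda_j-bar
speaker_gmm = GaussianMixture(n_components=2, random_state=0).fit(speaker_feats)
ubm = GaussianMixture(n_components=2, random_state=0).fit(background_feats)

# Eq. (3): log-likelihood ratio between speaker and background models
test_vec = rng.normal(5.0, 1.0, (1, 4))            # utterance from the speaker
llr = speaker_gmm.score(test_vec) - ubm.score(test_vec)
accept = llr > 0.0                                 # threshold at zero
```

A positive log-likelihood ratio accepts the claim as genuine; a negative one rejects it as an impostor.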
4 Face recognition
Face verification contains two main stages: face detection (or localisation) and face recognition. Face detection means determining the face region in the whole image. The block diagram of the face recognition model is given in Fig. 3. Face recognition means measuring the similarity between the detected face image and the templates stored in a database to determine the identity of the person. Many face recognition approaches have been presented by many researchers [8, 24, 36]. In this paper, two face recognition methods are used: Eigenface-based face recognition and Principal Components Analysis (PCA) [13, 22, 33, 35].
4.1 Eigenfaces face recognition method
This method is also called the Principal Components Analysis (PCA) based face recognition method. It consists of two stages: a training stage and an operational stage. In the training stage, a set of training images is used to determine the distribution of the face images in a lower-dimensional subspace (the Eigenspace). Consider a set of face images i_1, i_2, ..., i_M, where M is the number of images; then the average face image of this set is [2]:
Fig. 3 Block diagram of the face recognition model (acquisition, feature extraction, training sets, classification process, and face database)
ī = (1/M) Σ_{j=1}^{M} i_j        (6)

The difference between each face image and the average face image is:

φ_j = i_j − ī        (7)

The covariance matrix of the images is then constructed:

C = Σ_{j=1}^{M} φ_j φ_j^T = A A^T,   A = [φ_1 φ_2 ... φ_M]        (8)

Then the eigenvalues λ_k and the eigenvectors υ_k of C are calculated. The eigenvectors determine the linear combinations of the M difference images that form the Eigenfaces υ_l:

υ_l = Σ_{k=1}^{M} υ_{lk} φ_k,   l = 1, ..., M        (9)

Finally, the Eigenfaces corresponding to the K highest eigenvalues are selected, with K ≤ M.
In the operational stage, the face image is projected onto the same Eigenspace, and the similarity between the input face image and the templates stored in the database is computed to take the final decision.
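The training and operational stages of Eqs. (6)-(9) can be sketched in NumPy, using an SVD of the centered image matrix to obtain the Eigenfaces; the image sizes and the nearest-neighbour matching rule here are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)
M, D = 10, 64                      # 10 training "images", 64 pixels each (toy)
images = rng.normal(size=(M, D))

# Training stage, Eqs. (6)-(9)
mean_face = images.mean(axis=0)            # Eq. (6), average face
phi = images - mean_face                   # Eq. (7), difference images
# SVD of the matrix whose rows are phi_j: the rows of Vt are the eigenvectors
# of A A^T lifted into image space, i.e. the Eigenfaces of Eq. (9)
U, s, Vt = np.linalg.svd(phi, full_matrices=False)
eigenfaces = Vt                            # each row is one eigenface

# Operational stage: project training images and a probe onto the Eigenspace
weights = phi @ eigenfaces.T
probe = images[0] + 0.01 * rng.normal(size=D)   # slightly perturbed image 0
w_probe = (probe - mean_face) @ eigenfaces.T
nearest = np.argmin(np.linalg.norm(weights - w_probe, axis=1))
```

Here the probe is a noisy copy of training image 0, so the nearest projection in the Eigenspace is index 0.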
4.2 Principal components analysis (PCA)
The Eigenface face recognition method deals with the whole face image regardless of its structure. In principal components analysis (PCA) and factor analysis (FA), one wishes to extract from a set of p variables a reduced set of m components or factors that accounts for most of the variance in the p variables. In other words, we wish to reduce a set of p variables to a set of m underlying superordinate dimensions [5].
These underlying factors are inferred from the correlations among the p variables. Each factor is estimated as a weighted sum of the p variables. The ith factor is thus expressed in Eq. (10):

F_i = W_{i1} X_1 + W_{i2} X_2 + ... + W_{ip} X_p        (10)

One may also express each of the p variables as a linear combination of the m factors, as shown in Eq. (11):

X_j = A_{1j} F_1 + A_{2j} F_2 + ... + A_{mj} F_m + U_j        (11)

where U_j is the variance that is unique to variable j, i.e., variance that cannot be explained by any of the common factors.
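A brief sketch of this reduction, assuming scikit-learn's PCA and synthetic data in which a single latent factor drives p = 6 observed variables, so the first component should capture most of the variance:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(3)
latent = rng.normal(size=(500, 1))                  # one underlying factor
loadings = rng.normal(size=(1, 6))                  # weights onto p = 6 vars
X = latent @ loadings + 0.1 * rng.normal(size=(500, 6))  # observed variables

pca = PCA(n_components=2).fit(X)            # keep m = 2 components
scores = pca.transform(X)                   # F_i: weighted sums as in Eq. (10)
var_first = pca.explained_variance_ratio_[0]
```

Because all six variables are noisy copies of the same latent factor, the first component dominates the explained variance.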
5 Multimodal biometric fusion
Fusion in multimodal biometric schemes can be done before or after matching. Fusion before matching may be sensor fusion or feature fusion. Sensor fusion is performed when the biometric system utilizes multiple sensors for a single trait. Feature fusion is done by combining the different features vectors extracted from the multiple biometric systems. Fusion after matching may be score fusion or decision fusion. Score fusion is carried out by combining the individual matching scores into a single score according to rules such as the sum, max, and min rules, or by using a formula such as the Likelihood Ratio (LLR). Decision fusion is carried out when the outputs of the different matching techniques are available; it is considered the weakest form of fusion [7].
In this paper, two fusion methods are used and compared: features fusion and scores fusion. In the features fusion shown in Fig. 4a, the features vectors extracted from the voice signal and from the face image are combined into a single features vector, which is compared to the enrollment template and assigned a final matching score as in a single biometric system. The scores fusion shown in Fig. 4b is based on the LLR formula, which computes the total fused score [34] as given in Eq. (12):

S = [ p(S_voice | G) · p(S_face | G) ] / [ p(S_voice | I) · p(S_face | I) ]        (12)
where p(·|G) is the matching score probability density function of the genuine person, p(·|I) is the matching score probability density function of the impostor person, S_voice is the matching score of the voice recognition technique, and S_face is the matching score of the face recognition technique [28].
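The likelihood-ratio fusion of Eq. (12) can be sketched as follows, assuming Gaussian models for the genuine and impostor score densities (a real system would estimate these densities from development scores):

```python
import numpy as np
from scipy.stats import norm

# Synthetic development scores for each modality and each hypothesis
rng = np.random.default_rng(4)
gen_voice, imp_voice = rng.normal(2, 0.5, 500), rng.normal(0, 0.5, 500)
gen_face, imp_face = rng.normal(3, 0.7, 500), rng.normal(0, 0.7, 500)

def fused_score(s_voice, s_face):
    """Eq. (12): product of genuine densities over product of impostor ones."""
    p_g = (norm.pdf(s_voice, gen_voice.mean(), gen_voice.std())
           * norm.pdf(s_face, gen_face.mean(), gen_face.std()))
    p_i = (norm.pdf(s_voice, imp_voice.mean(), imp_voice.std())
           * norm.pdf(s_face, imp_face.mean(), imp_face.std()))
    return p_g / p_i

genuine_trial = fused_score(2.0, 3.0)     # scores near the genuine means
impostor_trial = fused_score(0.0, 0.0)    # scores near the impostor means
```

A fused score above 1 favors the genuine hypothesis; below 1 it favors the impostor hypothesis.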
6 Results and discussions
To test the performance of the proposed multimodal biometric system, a voice and face database was collected for 100 persons: for every person, five pictures were taken (500 face images) and every person said the same word five times (500 voice signals). The voice signals are sampled at 8 kHz over 3 s, and the face images are resized to 512 × 512 pixels in the RGB color model. The database was acquired using a Lenovo tablet (camera model A3500-HV) and a standard microphone.
Fig. 4 The proposed block diagram of the multimodal biometric fusion scheme: (a) features fusion, in which the features vectors extracted from the voice signal and the face image are fused before pattern matching against the database; (b) scores fusion, in which each modality is matched separately and the individual scores are fused before the decision
Figure 5 shows samples from the used image database, which is utilized in the computer simulation experiments for evaluating the proposed multimodal scheme.
The performance of the proposed scheme has been evaluated using the Receiver Operating Characteristic (ROC) curve and the Equal Error Rate (EER). The ROC curve is a plot of the False Acceptance Rate (FAR) against the False Rejection Rate (FRR). The FAR reflects the proportion of zero-effort impostor trials misclassified as genuine trials, while the FRR reflects the proportion of genuine trials misclassified as zero-effort impostor trials.
Fig. 5 Some face images from the used image database
The EER refers to the point where the FAR and FRR are equal; it is defined as shown in Eq. (13):

EER = (FAR + FRR) / 2,   when FAR = FRR        (13)
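The EER of Eq. (13) can be located numerically by sweeping a decision threshold over genuine and impostor scores; the score distributions below are synthetic illustrations, not the paper's data:

```python
import numpy as np

rng = np.random.default_rng(5)
genuine = rng.normal(2.0, 1.0, 1000)     # scores of genuine trials
impostor = rng.normal(-2.0, 1.0, 1000)   # scores of impostor trials

thresholds = np.linspace(-5, 5, 1001)
far = np.array([(impostor >= t).mean() for t in thresholds])  # impostors accepted
frr = np.array([(genuine < t).mean() for t in thresholds])    # genuines rejected

idx = np.argmin(np.abs(far - frr))       # threshold where FAR is closest to FRR
eer = (far[idx] + frr[idx]) / 2.0        # Eq. (13)
```

With these well-separated distributions the EER lands near the theoretical overlap of about 2%.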
In the voice training mode, 300 voice signals (3 signals per person) are used. The statistical coefficient features and the cepstral coefficient features are used individually and together in order to obtain the best results for the voice recognition process. The remaining 200 voice signals (2 signals per person) are used for testing in the voice recognition mode. During testing, the voice signals are degraded with Additive White Gaussian Noise (AWGN) in order to test the robustness of the proposed scheme. Figure 6 shows the ROC curves for voice recognition using the different features vectors. The equal error line (EEL) shows the values of the EERs at its intersections with the ROC curves. Table 2 compares the values of the EER for voice recognition using the different features vectors.
Fig. 6 ROC curves for voice recognition using different features vectors (statistical coefficients and cepstral coefficients, each with the ANN, SVM and GMM classifiers)
The results in Fig. 6 and Table 2 show that the cepstral coefficient features give the lowest EER among the feature extraction methods. The reason is that, in the cepstral features, any periodicities or repeated patterns in the spectrum are mapped to one or two specific components in the cepstrum, which separates the harmonic series just as the spectrum separates repetitive time patterns in the waveform.
Table 2 EER for voice recognition using different features vectors

Voice recognition method | EER (%)
Statistical Coefficients + ANN | 10.55
Statistical Coefficients + SVM | 10.73
Statistical Coefficients + GMM | 9.48
Cepstral Coefficients + ANN | 3.15
Cepstral Coefficients + SVM | 4.47
Cepstral Coefficients + GMM | 2.98
Using the statistical coefficients and the voice timbre features together improves the performance of the voice recognition technique; therefore, in this paper, all features are used for training and testing in the voice recognition technique.
In the face recognition computer experiments, 300 face images (3 images per person) are used for training and the remaining 200 face images (2 images per person) are used for testing the two recognition methods, Eigenface and PCA, so as to select the method that gives the best face recognition results. During testing, some of the face images are degraded with JPEG compression in order to test the robustness of the face recognition approaches. Figure 7 shows the ROC curves for the different face recognition methods, and Table 3 compares the values of the EER for these methods.
The results of the face recognition experiments are shown in Fig. 7 and Table 3. These results make clear that the FAR and FRR of the PCA with GMM face recognition method are lower than those of the other methods, and it gives the lowest EER among them; therefore, in this paper, the PCA face recognition method is used. The PCA with GMM is more robust because it finds the optimal projective direction by maximizing the between-class scatter and minimizing the within-class scatter.
Fig. 7 ROC curves for the different face recognition methods (Eigenface and PCA, each with the ANN, SVM and GMM classifiers)
Table 3 EER for the different face recognition methods

Face recognition method | EER (%)
Eigenface + ANN | 26.83
Eigenface + SVM | 27.12
Eigenface + GMM | 4.45
PCA + ANN | 1.71
PCA + SVM | 13.05
PCA + GMM | 1.43
The results of the two fusion processes are given in Fig. 8. This figure shows the ROC curves for the individual voice recognition, the individual face recognition, and the fusion using features fusion and scores fusion.
The results of the features fusion and scores fusion experiments make clear that the features fusion gives an EER equal to 2.81%, while the scores fusion gives the lowest EER, equal to 0.69%. Scores fusion gives the best results because it takes into consideration the strengths and weaknesses of the different biometric traits for different users, so the collected information leads to the correct identification of the user. In addition, the LLR between the genuine and impostor distributions reduces the probability of error. Furthermore, the obtained results are compared with some published results, as displayed in Table 4. The results reveal the ability of the proposed approach as a promising multimodal fusion approach.
As shown in the final computer simulation experiment results in Fig. 8 and Table 4, the proposed multimodal scheme gives a lower EER and performs better than the previous related work, whose EER values are tabulated alongside those of the proposed scheme in Table 4.
Fig. 8 ROC curves for the proposed multimodal fusion approach (voice recognition using cepstral coefficients with GMM, face recognition using PCA with GMM, face and voice features fusion, and face and voice scores fusion)
Table 4 Comparison between the obtained EER of the proposed scheme and other published results

Authentication method | EER % (Voice / Face / Fusion)
Poh and Korczak [28] | 0.07 / 0.15 / –
Chetty and Wagner [6] | 4.2 / 3.2 / 0.73
Palanivel and Yegnanarayana [27] | 9.2 / 2.9 / 0.45
Raghavendra et al. [30] | 2.7 / 2.1 / 1.2
Elmir et al. [12] | 22.37 / 1.02 / 0.39
Soltane [32] | 0.01 / 0.39 / 0.28
H. Kasban [19] | 2.24 / 1.95 / 0.64
Proposed scheme | 2.98 / 1.43 / 0.62
In future work, a triple multimodal scheme will be studied using iris, face and voice: three biometrics will be combined in one multimodal biometric scheme. Also, biometric security for WBANs using unimodal and combined models will be studied with power consumption and complexity considerations. The third research point in the future work will focus on the design and testing of a wireless combined biometric person authentication system.
7Conclusions
In this research paper, a fusion scheme for voice and face recognition as a multimodal
biometric system for human authentication is proposed. Both voice and face recognition
are performed using different feature extraction tools in order to choose the best one for the
recognition process. The voice recognition results showed that the best performance is
obtained with the cepstral coefficients and the GMM classifier. The face recognition results
showed that PCA with the GMM classifier is the best face recognition method among the
tested methods. The fusion results showed that scores fusion gives the lowest EER and is
considered a promising multimodal fusion approach. The proposed scheme performs better
than other biometric schemes. The computer simulation experiments reveal the superiority of
the proposed model for both the face recognition stage and the proposed fusion scenarios.
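The scores-fusion scenario found best here combines the matching scores of the two modalities after bringing them to a common scale. The exact normalization and weights used in the paper are not reproduced in this section, so the following Python sketch assumes min-max normalization and a weighted sum, in the spirit of the score-normalization work cited as [18]; the function names, raw score ranges, weight, and decision threshold are illustrative assumptions:

```python
import numpy as np

def min_max_normalize(scores, lo, hi):
    """Map raw matcher scores into [0, 1] via min-max normalization;
    lo and hi would be estimated from training scores in practice."""
    return np.clip((scores - lo) / (hi - lo), 0.0, 1.0)

def fuse_scores(voice_scores, face_scores, w_voice=0.5):
    """Weighted-sum score-level fusion of two normalized matchers."""
    return w_voice * voice_scores + (1.0 - w_voice) * face_scores

# Toy example: one trial per matcher (raw score ranges are made up)
voice = min_max_normalize(np.array([62.0]), lo=0.0, hi=80.0)
face = min_max_normalize(np.array([0.71]), lo=0.0, hi=1.0)
fused = fuse_scores(voice, face, w_voice=0.5)
accept = fused >= 0.5  # decision threshold tuned on a validation set
print(fused, accept)
```

Fusing at the score level keeps each matcher independent, so either modality can be retrained or replaced without touching the other, which is one reason this scenario can outperform feature-level fusion.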
Publisher's Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and
institutional affiliations.
References
1. Abdel Karim N, Shukur Z (2015) Review of user authentication methods in online examination. Asian Journal of Information Technology 14(5):166–175
2. Abhishree TM, Latha J, Manikantan K, Ramachandran S (2015) Face recognition using Gabor filter based feature extraction with anisotropic diffusion as a pre-processing technique. Procedia Computer Science 45:312–321
3. Agashe NM, Nimbhorkar S (2015) A survey paper on continuous authentication by multimodal biometric. International Journal of Advanced Research in Computer Engineering & Technology 4(11):4247–4253
4. Baken RJ, Orlikoff RF (2000) Clinical measurement of speech and voice, second edition. Singular Publishing Group, San Diego
5. Burges C (1998) A tutorial on support vector machines for pattern recognition. In: Data mining and knowledge discovery (Volume 2). Kluwer Academic Publishers, Boston, pp 1–43
6. Chetty G, Wagner M (2008) Robust face-voice based speaker identity verification using multilevel fusion. Image Vis Comput 26:1249–1260
7. Cristianini N, Shawe-Taylor J (2000) An introduction to support vector machines and other kernel-based learning methods. Cambridge University Press, Cambridge
8. De A, Saha A, Pal MC (2015) A human facial expression recognition model based on Eigen face approach. Procedia Computer Science 45:282–289
9. Dodangeh P, Jahangir AH (2018) A biometric security scheme for wireless body area networks. Journal of Information Security and Applications 41:62–74. https://doi.org/10.1016/j.jisa.2018.06.001
10. El-Bendary MAM (2015) Developing security tools of WSN and WBAN networks applications. Springer, Japan
11. El-Bendary MA (2017) FEC merged with double security approach based on encrypted image steganography for different purpose in the presence of noise and different attack. Multimed Tools Appl 76(24):26463–26501
12. Elmir Y, Elberrichi Z, Adjoudj R (2014) Multimodal biometric using a hierarchical fusion of a person's face, voice, and online signature. J Inf Process Syst:555–567
13. Fookes C, Lin F, Chandran V, Sridharan S (2012) Evaluation of image resolution and super-resolution on face recognition performance. J Vis Commun Image Represent 23(1):75–93
14. Gad R, El-Fishawy N, El-Sayed A, Zorkany M (2015) Multi-biometric systems: a state of the art survey and research directions. Int J Adv Comput Sci Appl 6(6):128–138
15. Galka J, Masior M, Salasa M (2014) Voice authentication embedded solution for secured access control. IEEE Trans Consum Electron 60(4):653–661
16. Halvi S, Ramapur N, Raja KB, Prasad S (2017) Fusion based face recognition system using 1D transform domains. Procedia Computer Science 115:383–390
17. Inthavisas K, Lopresti D (2012) Secure speech biometric templates for user authentication. IET Biometrics 1(1):46–54
18. Jain A, Nandakumar K, Ross A (2005) Score normalisation in multimodal biometric systems. Pattern Recogn 38:2270–2285
19. Kasban H (2017) A robust multimodal biometric authentication scheme with voice and face recognition. Arab Journal of Nuclear Sciences and Applications 50(3):120–130
20. Kinnunen T, Karpov E, Franti P (2006) Real time speaker identification and verification. IEEE Trans Audio Speech Lang Process 14(1):277–288
21. Kumar HCS, Janardhan NA (2016) An efficient personnel authentication through multi modal biometric system. International Journal of Scientific Engineering and Applied Science 2(1):534–543
22. Li H, Suen CY (2016) Robust face recognition based on dynamic rank representation. Pattern Recogn 60:13–24
23. Liu Z, Wang H (2014) A novel speech content authentication algorithm based on Bessel–Fourier moments. Digital Signal Process 24:197–208
24. Liu T, Mi JX, Liu Y, Li C (2016) Robust face recognition via sparse boosting representation. Neurocomputing 214:944–957
25. Lumini A, Nanni L (2017) Overview of the combination of biometric matchers. Information Fusion 33:71–85
26. Morgen B (2012) Voice biometrics for customer authentication. Biom Technol Today 2012(2):8–11
27. Palanivel S, Yegnanarayana B (2008) Multimodal person authentication using speech, face and visual speech. Comput Vis Image Underst 109:44–55
28. Poh N, Korczak J (2001) Hybrid biometric person authentication using face and voice features. International Conference on Audio- and Video-Based Biometric Person Authentication, Halmstad, Sweden, pp 348–353
29. Qi M, Chen J, Chen Y (2018) A secure biometrics-based authentication key exchange protocol for multi-server TMIS using ECC. Comput Methods Prog Biomed 164:101–109
30. Raghavendra R, Rao A, Kumar GH (2010) Multimodal person verification system using face and speech. Procedia Computer Science 2:181–187
31. Reynolds DA, Quatieri TF, Dunn RB (2000) Speaker verification using adapted Gaussian mixture models. Digital Signal Processing 10:19–41
32. Soltane M (2015) Greedy expectation maximization tuning algorithm of finite GMM based face, voice and signature multi-modal biometric verification fusion systems. International Journal of Engineering & Technology 15(03):41–52
33. Turk M, Pentland A (1991) Eigenfaces for recognition. J Cogn Neurosci 3(1):71–86
34. Vapnik V (1998) Statistical learning theory. Wiley, New York
35. Xuan S, Xiang S, Ma H (2016) Subclass representation-based face-recognition algorithm derived from the structure scatter of training samples. Comput Vis 10(6):493–502
36. Zheng CH, Hou YF, Zhang J (2016) Improved sparse representation with low-rank representation for robust face recognition. Neurocomputing 198:114–124
Multimedia Tools and Applications
Anter Abozaid was born in Elbehaira, Egypt in 1981. He received his B.Sc. degree from Bani Sweif University,
Egypt, in June 2003. Since March 2012, he has been with the Electronics Technology Dept., Faculty of Industrial
Education, Helwan University, Egypt. His current research interests are in the fields of Security and
Authentication.
Ayman Haggag was born in Cairo, Egypt in 1971. He received his B.Sc. degree from Ain Shams University,
Egypt, in June 1994, M.Sc. degree from Eindhoven University of Technology, The Netherlands, in December
1997, and Ph.D. degree from Chiba University, Japan, in September 2008. Since March 1996, he has been with
the Electronics Technology Department, Faculty of Industrial Education, Helwan University, Egypt. His current
research interests are in the fields of Network Security, Wireless Security, Software Defined Network and
Wireless Sensor Network.
Hany Kasban received the B. Sc., M. Sc. and Ph. D. degrees in Electrical and Electronic Engineering from
Menoufia University, Egypt in 2002, 2008 and 2012, respectively. He is currently an Associate Professor in the
Department of Engineering and Scientific Instruments, Nuclear Research Center (NRC), Egyptian Atomic
Energy Authority (EAEA), Cairo, Egypt. He is a co-author of many papers in national and international
conference proceedings and journals. His current research areas of interest are in the fields of electronics, digital
signal processing, communication systems and nuclear applications in industry and medicine.
Mostafa Eltokhy was born in Kaluobia, Egypt, in 1970. He received his B.Sc. degree from Zagazig University,
Banha branch, Egypt, and M.Sc. degree from Technical University, Eindhoven, The Netherlands in 1993 and
1998, respectively. He received his Ph.D. degree from Osaka University, Osaka, Japan in 2003. Presently, he is an
Associate Professor of Electronics Engineering at Department of Electronics Technology, Faculty of Industrial
Education, Helwan University, Cairo, Egypt. His current research interests are high performance digital circuits
and analog circuits. He is a member of the IEEE.