Multiple classifiers in biometrics.
Part 2: Trends and challenges
Julian Fierrez, Aythami Morales, Ruben Vera-Rodriguez, David Camacho
School of Engineering, Universidad Autonoma de Madrid, Madrid, Spain
Abstract
The present paper is Part 2 in this series of two papers. In Part 1 we provided an introduction to Multiple Classifier Systems (MCS) with a focus on the fundamentals: basic nomenclature, key elements, architecture, main methods, and prevalent theory and frameworks. Part 1 then overviewed the application of MCS to the particular field of multimodal biometric person authentication over the last 25 years, as a prototypical area in which MCS has resulted in important achievements.
Here in Part 2 we present in more technical detail recent trends and developments in MCS coming from multimodal biometrics that incorporate context information in an adaptive way. These new MCS architectures exploit input quality measures and pattern-specific particularities that deviate from general population statistics, resulting in robust multimodal biometric systems. As in Part 1, methods here are described in a general way so they can be applied to other information fusion problems as well. Finally, we also discuss open challenges in biometrics in which MCS can play a key role.
Keywords: classifier, fusion, biometrics, multimodal, adaptive, context

To appear as: J. Fierrez, A. Morales, R. Vera-Rodriguez and D. Camacho, "Multiple Classifiers in Biometrics. Part 2: Trends and Challenges", Information Fusion, Vol. 44, November 2018. (Preprint submitted to Information Fusion, December 19, 2017.)

1. Introduction
The present paper is Part 2 in a series of two papers dedicated to overviewing the field of Multiple Classifier Systems (MCS) in biometrics. In Part 1, we introduced the fundamentals of MCS [1], including nomenclature, architecture, and a flexible theoretical framework. We then provided a review of MCS applied to multimodal biometric person authentication in the last 25 years [2]. That review was developed using a generic MCS framework and
mathematical notation, with the purpose of facilitating the transfer of MCS
achievements from biometrics to other pattern recognition applications like
video surveillance [3], speech technologies [4], human-computer interaction
[5], data analytics [6], behavioural modelling [7], or recommender systems
[8].
Here in Part 2 we build on Part 1 to overview more recent trends in MCS applied to biometrics, with a focus on context-based information fusion [9]. In particular, the main MCS architectures in biometrics that have successfully exploited context information are based on quality measures [10] or user-specificities [11]. As in Part 1, the methods here are described in a general way so they can be applied to other information fusion problems as well. Additionally, particular implementations of the reported context-based MCS architectures are described using two paradigms: 1) statistical, based on Bayesian theory, and 2) discriminative, based on Support Vector Machine classifiers.
We end this series of two papers with a discussion of open challenges in biometrics. The challenges discussed largely follow the excellent survey and outlook of the field of biometric person recognition by Jain et al. [2], which we complement with our personal view and augment with the ways in which MCS developments can advance those key challenges in biometrics. With that, we also hope to shed some light on the future of other pattern recognition and information fusion areas as well.
The present paper is organized as follows. Section 2 overviews current trends in context-based fusion for biometrics, focusing first on user-dependent fusion and then on quality-based fusion. In both cases, we first discuss the general architecture and then describe specific fusion algorithms under two paradigms: statistical (combination approach) and discriminative (classification approach). Section 3 summarizes open challenges in biometrics and discusses the role of MCS methods in overcoming those challenges. The paper ends in Section 4 with some concluding remarks.
2. Trends in biometrics: Context-based MCS
This section is focused on MCS for multimodal biometric authentication,
adapted both to user-specificities and to the input biometric quality. In the
following sections we summarize key related works in these areas.
Figure 1: General system model of multimodal biometric authentication using score-level fusion, including naming conventions. Each of the M unimodal systems (e.g. fingerprint or signature recognition) performs pre-processing, feature extraction, and similarity computation against the enrolled models of the claimed identity k; the resulting scores are normalized and combined by the fusion function y = f(x), and the fused score is compared to a decision threshold to accept or reject the claim (authentication trials i = 1, ..., N; systems j = 1, ..., M; users k).
The adaptive MCS schemes for multimodal biometrics are divided into
three classes: 1) user-dependent, 2) quality-based, and 3) user-dependent
and quality-based. Although the last class includes the first two classes as
particular cases, the three classes are introduced sequentially in order to
facilitate the description.
For each class of methods, we first sketch the system model and then
we derive particular implementations by using standard pattern recognition
methods, either based on generative assumptions following Bayesian theory,
or discriminative criteria using Support Vector Machines. These two classes
of implementations aim at minimizing the Bayesian error and the Structural
Risk of the verification task, respectively.
In the rest of the paper we use the following nomenclature and conventions. Given a multimodal biometric verification system consisting of $M$ different unimodal systems $j = 1, \ldots, M$, each one computes a similarity score $s$ between an input biometric pattern and the enrolled pattern or model of the given claimant $k$. The similarity scores $s$ are normalized to $x$. Let the normalized similarity scores provided by the different unimodal systems be combined into a multimodal score $\mathbf{x} = [x_1, \ldots, x_M]^T$. The design of a fusion scheme consists in the definition of a function $f: \mathbb{R}^M \rightarrow \mathbb{R}$, so as to maximize the separability of the client $\{f(\mathbf{x}) \mid \text{client attempt}\}$ and impostor $\{f(\mathbf{x}) \mid \text{impostor attempt}\}$ fused score distributions. This function may be trained by using labelled training scores $(\mathbf{x}_i, z_i)$, where $z_i \in \{0 = \text{impostor attempt}, 1 = \text{client attempt}\}$.

In Fig. 1 we depict the general system model including all the notations defined above.
2.1. User-dependent multimodal biometrics
The idea of exploiting user-specific parameters at the score level in multimodal biometrics was introduced, to the best of our knowledge, by [12]. In that work, a user-independent weighted linear combination of similarity scores was shown to be improved by using either user-dependent weights or user-dependent decision thresholds, both of them computed by exhaustive search on the testing data. The idea of user-dependent fusion parameters was also explored by [13] using non-biased error estimation procedures. Other attempts at personalized multimodal biometrics include the use of the claimed identity index as a feature for a globally trained fusion scheme based on Neural Networks [14], computing user-dependent weights using lambness metrics [15], and using personalized Fisher ratios [16].
Toh et al. [17] proposed a taxonomy of score-level fusion approaches for multi-biometrics. Multimodal fusion approaches were classified as global or local depending firstly on the fusion function (i.e., user-independent or user-dependent fusion strategies) and secondly on the decision making process (i.e., user-independent or user-dependent decision thresholds): global-learning and global-decision (GG), local-learning and global-decision (LG), and similarly GL and LL. Some example works of each group are listed in Table 1.
These local methods (user-dependent fusion or decision) are confronted with a big challenge: training data scarcity, as the amount of available training data in localized learning is usually not sufficient and representative enough to guarantee good MCS parameter estimation and generalization capabilities. To cope with this lack of robustness derived from partial knowledge, the use of robust adaptive learning strategies based on background information was proposed in related research areas [23]. The idea of exploiting background information, and adapting the fusion functions of MCS from it based on context information, was introduced in biometrics by Fierrez et al. [11, 24], and was soon followed by others [25]. In brief, these context-based MCS methods balance the background information (from a pool of background users) and the local data (from a given user) as a tradeoff between both kinds of information.
The system model of user-dependent score fusion including the mentioned
adaptation from background information is shown in Fig. 2.
Two selected algorithms implementing the discussed adapted user-dependent
fusion are summarized in the following sections.
Table 1: Example works on multimodal biometrics based on local and global learning. M denotes the total number of classifiers combined. Architecture is Global-learning and Global-decision (GG), Local-learning and Global-decision (LG), and similarly GL and LL. Performance gain over the best single classifier is given for IDentification or VERification, either as an FR@FA pair, EER, or Total Error TE=FR+FA (in %).

Work | Modalities | M | Architecture | Gain
Brunelli and Falavigna (1995) [18] | Speaker, face | 5 | GG | ID: 17 → 2 (TE)
Kittler et al. (1998) [19] | Speaker, face | 3 | GG | VER: 1.4 → 0.7 (EER)
Hong and Jain (1998) [20] | Face, fingerprint | 2 | GG | ID: 6.9 → 4.5 (FR@0.1%FA)
Ben-Yacoub et al. (1999) [21] | Speaker, face | 3 | GG | VER: 4 → 0.5 (EER)
Verlinde et al. (2000) [22] | Speaker, face | 3 | GG | VER: 3.7 → 0.1 (TE)
Jain and Ross (2002) [12] | Face, fingerprint, hand | 3 | LG/GL | n/a
Kumar and Zhang (2003) [14] | Face, palmprint | 2 | LG | VER: 3.6 → 0.8 (EER)
Toh et al. (2004) [17] | Speaker, fingerprint, hand | 3 | LG/GL/LL | VER: 50% improvement (EER)
Fierrez et al. (2005) [11] | Signature, fingerprint | 2 | LG/GL/LL | VER: 3.5 → 0.8 (EER)
Figure 2: System model of multimodal biometric authentication with adapted user-dependent score fusion. The fusion function of the claimed user is adapted using training data from both a pool of users and the claimed user.
2.1.1. User-dependent MCS: combination approach
Here we outline this algorithm, representative of context-based MCS, which adapts the score fusion function to each user starting from general background information. For a more detailed description and experimental evaluation see [24].
Impostor and client score distributions are modelled as multivariate Gaussians $p(\mathbf{x}|\omega_0) = N(\mathbf{x}|\boldsymbol{\mu}_0, \boldsymbol{\sigma}^2_0)$ and $p(\mathbf{x}|\omega_1) = N(\mathbf{x}|\boldsymbol{\mu}_1, \boldsymbol{\sigma}^2_1)$, respectively¹. The fused score $y_T$ of a multimodal test score $\mathbf{x}_T$ is then defined as follows:

$$y_T = f(\mathbf{x}_T) = \log p(\mathbf{x}_T|\omega_1) - \log p(\mathbf{x}_T|\omega_0), \qquad (1)$$

which is known to be a Quadratic Discriminant (QD) function consistent with the Bayes estimate in case of equal impostor and client prior probabilities [26]. The score distributions are estimated using the available training data as follows:
Global. The training set $X_G = \{(\mathbf{x}_i, z_i)\}_{i=1}^{N_G}$ includes multimodal scores from a number of different clients, and $(\{\boldsymbol{\mu}_{G,0}, \boldsymbol{\sigma}^2_{G,0}\}, \{\boldsymbol{\mu}_{G,1}, \boldsymbol{\sigma}^2_{G,1}\})$ are estimated by using the standard Maximum Likelihood criterion [27]. The resulting fusion rule $f_G(\mathbf{x})$ is applied globally at the operational stage regardless of the claimed identity.

Local. A different fusion rule $f_{k,L}(\mathbf{x})$ is obtained for each client $k$ enrolled in the system by using Maximum Likelihood density estimates $(\{\boldsymbol{\mu}_{k,L,0}, \boldsymbol{\sigma}^2_{k,L,0}\}, \{\boldsymbol{\mu}_{k,L,1}, \boldsymbol{\sigma}^2_{k,L,1}\})$ computed from a set of development scores $X_k$ of the specific client $k$.

¹We use diagonal covariance matrices, so $\boldsymbol{\sigma}^2$ is shorthand for $\mathrm{diag}(\boldsymbol{\Sigma})$. Similarly, $\boldsymbol{\mu}^2$ is shorthand for $\mathrm{diag}(\boldsymbol{\mu}\boldsymbol{\mu}^T)$.
Adapted. The adapted fusion rule $f_{k,A}(\mathbf{x})$ of client $k$ trades off the general knowledge provided by the user-independent development data $X_G$ and the user specificities provided by the user-dependent training set $X_k$, through Maximum a Posteriori density estimation [27]. This is done by adapting the sufficient statistics as follows:

$$\boldsymbol{\mu}_{k,A,l} = \alpha_l \, \boldsymbol{\mu}_{k,L,l} + (1 - \alpha_l) \, \boldsymbol{\mu}_{G,l},$$
$$\boldsymbol{\sigma}^2_{k,A,l} = \alpha_l \, (\boldsymbol{\sigma}^2_{k,L,l} + \boldsymbol{\mu}^2_{k,L,l}) + (1 - \alpha_l)(\boldsymbol{\sigma}^2_{G,l} + \boldsymbol{\mu}^2_{G,l}) - \boldsymbol{\mu}^2_{k,A,l}. \qquad (2)$$

For each class $l \in \{0 = \mathrm{impostor}, 1 = \mathrm{client}\}$, a data-dependent adaptation coefficient

$$\alpha_l = N_l / (N_l + r) \qquad (3)$$

is used, where $N_l$ is the number of local training scores in class $l$, and $r$ is a fixed relevance factor.
Note that other statistical models or other techniques for trading-off the
general and local knowledge can be used in a similar way.
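The MAP adaptation above can be sketched numerically. The following minimal example (Python with NumPy; the class means, relevance factor $r$, and the synthetic scores are illustrative assumptions, not values from the paper) estimates global and local Gaussian statistics, adapts them according to Eqs. (2) and (3), and fuses a test score with the QD function of Eq. (1):

```python
import numpy as np

def fit_gaussian(scores):
    """Maximum Likelihood estimates (diagonal covariance) for a score set."""
    return scores.mean(axis=0), scores.var(axis=0)

def map_adapt(mu_L, var_L, N_l, mu_G, var_G, r=4.0):
    """MAP adaptation of the sufficient statistics, Eqs. (2)-(3)."""
    a = N_l / (N_l + r)                      # adaptation coefficient alpha_l
    mu_A = a * mu_L + (1 - a) * mu_G
    var_A = a * (var_L + mu_L**2) + (1 - a) * (var_G + mu_G**2) - mu_A**2
    return mu_A, var_A

def log_gauss(x, mu, var):
    """Log-density of a diagonal-covariance Gaussian."""
    return -0.5 * np.sum(np.log(2 * np.pi * var) + (x - mu)**2 / var)

def qd_fusion(x, params0, params1):
    """Quadratic Discriminant fused score, Eq. (1)."""
    return log_gauss(x, *params1) - log_gauss(x, *params0)

rng = np.random.default_rng(0)
# Global (background) scores, impostor class 0 and client class 1, M = 2 systems
G0 = rng.normal([0.2, 0.3], 0.1, size=(500, 2))
G1 = rng.normal([0.7, 0.8], 0.1, size=(500, 2))
# Scarce local development scores for one enrolled user k
L0 = rng.normal([0.3, 0.2], 0.1, size=(8, 2))
L1 = rng.normal([0.8, 0.7], 0.1, size=(8, 2))

p0 = map_adapt(*fit_gaussian(L0), len(L0), *fit_gaussian(G0))
p1 = map_adapt(*fit_gaussian(L1), len(L1), *fit_gaussian(G1))

y_client = qd_fusion(np.array([0.75, 0.75]), p0, p1)    # client-like input
y_impostor = qd_fusion(np.array([0.25, 0.25]), p0, p1)  # impostor-like input
```

With only 8 local samples per class, the adapted statistics stay close to the background model; as local data accumulates, $\alpha_l$ grows and the user-specific estimates dominate.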
2.1.2. User-dependent MCS: classification approach
As before, we only outline here the main aspects of this context-based MCS approach, which also adapts the score fusion function to each user from general background information. This particular implementation is based on SVM, but the approach is easily extensible to any other binary classifier. For a detailed description and experimental evaluation see [11].
Without loss of generality, suppose we train an SVM classifier with the following training set: $X = \{(\mathbf{x}_i, z_i)\}_{i=1}^{N}$, where $N$ is the number of multimodal scores in the training set, and $z_i \in \{-1, 1\} = \{\mathrm{Impostor}, \mathrm{Client}\}$. We train the SVM classifier by solving the following quadratic programming problem [28]:

$$\min_{\mathbf{w}, w_0, \xi_1, \ldots, \xi_N} \; \frac{1}{2}\|\mathbf{w}\|^2 + \sum_{i=1}^{N} C_i \xi_i \qquad (4)$$

subject to

$$z_i(\langle \mathbf{w}, \Phi(\mathbf{x}_i)\rangle_{\mathcal{H}} + w_0) \geq 1 - \xi_i, \quad i = 1, \ldots, N,$$
$$\xi_i \geq 0, \quad i = 1, \ldots, N, \qquad (5)$$
where slack variables $\xi_i$ are introduced to take into account the eventual non-separability of $\Phi(X)$, and the parameter $C_i = C$ is a positive constant that controls the relative influence of the two competing terms.

The optimization problem in Eqs. (4) and (5) is solved with the Wolfe dual representation by using the kernel trick [29]:

$$\max_{\alpha_1, \ldots, \alpha_N} \left( \sum_{i=1}^{N} \alpha_i - \frac{1}{2} \sum_{i,j=1}^{N} \alpha_i \alpha_j z_i z_j K(\mathbf{x}_i, \mathbf{x}_j) \right) \qquad (6)$$

subject to

$$0 \leq \alpha_i \leq C_i, \quad i = 1, \ldots, N,$$
$$\sum_{i=1}^{N} \alpha_i z_i = 0, \qquad (7)$$

where the kernel function $K(\mathbf{x}_i, \mathbf{x}_j) = \langle \Phi(\mathbf{x}_i), \Phi(\mathbf{x}_j)\rangle_{\mathcal{H}}$ is introduced to avoid direct manipulation of the elements of $\mathcal{H}$. Typical kernel functions include radial basis functions

$$K(\mathbf{x}_i, \mathbf{x}_j) = \exp\left(-\|\mathbf{x}_i - \mathbf{x}_j\|^2 / 2\sigma^2\right), \qquad (8)$$

and linear kernels

$$K(\mathbf{x}_i, \mathbf{x}_j) = \mathbf{x}_i^T \mathbf{x}_j, \qquad (9)$$

resulting in complex and linear separating surfaces between client and impostor distributions, respectively.
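As a quick illustration, the two kernels in Eqs. (8) and (9) can be written directly (the example vectors are arbitrary):

```python
import numpy as np

def rbf_kernel(xi, xj, sigma=1.0):
    """Radial basis function kernel, Eq. (8)."""
    return float(np.exp(-np.sum((xi - xj) ** 2) / (2 * sigma ** 2)))

def linear_kernel(xi, xj):
    """Linear kernel, Eq. (9)."""
    return float(np.dot(xi, xj))

a = np.array([0.9, 0.8])
b = np.array([0.1, 0.2])
# The RBF kernel equals 1 for identical inputs and decays with distance,
# while the linear kernel is just the inner product of the two score vectors.
k_same, k_diff, k_lin = rbf_kernel(a, a), rbf_kernel(a, b), linear_kernel(a, b)
```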
The fused score $y_T$ of a multimodal test pattern $\mathbf{x}_T$ is defined as follows:

$$y_T = f(\mathbf{x}_T) = \langle \mathbf{w}^*, \Phi(\mathbf{x}_T)\rangle_{\mathcal{H}} + w_0^*, \qquad (10)$$

which is a signed distance measure from $\mathbf{x}_T$ to the separating surface given by the solution of the SVM problem. Applying the Karush-Kuhn-Tucker (KKT) conditions to the problem in Eqs. (4) and (5), $y_T$ can be shown to be equivalent to the following sparse expression:

$$y_T = f(\mathbf{x}_T) = \sum_{i \in SV} \alpha_i^* z_i K(\mathbf{x}_i, \mathbf{x}_T) + w_0^*, \qquad (11)$$

where $(\mathbf{w}^*, w_0^*)$ is the optimal hyperplane, $(\alpha_1^*, \ldots, \alpha_N^*)$ is the solution to the problem in Eqs. (6) and (7), and $SV = \{i \mid \alpha_i^* > 0\}$ indexes the set of support vectors. The bias parameter $w_0^*$ is obtained from the solution to the problem in Eqs. (6) and (7) by using the KKT conditions [29].

As a result, the training procedure in Eqs. (6) and (7) and the testing strategy in Eq. (11) are obtained for the problem of multimodal fusion.
Global. The training set $X_G = \{(\mathbf{x}_i, z_i)\}_{i=1}^{N_G}$ includes multimodal scores from a number of different clients, and the resulting fusion rule $f_G(\mathbf{x})$ is applied globally at the operational stage regardless of the claimed identity.

Local. A different fusion rule $f_{k,L}(\mathbf{x})$ is obtained for each client $k$ enrolled in the system by using development scores $X_k$ of the specific client $k$. At the operational stage, the fusion rule $f_{k,L}(\mathbf{x})$ of the claimed identity $k$ is applied.

Adapted. This scheme trades off the general knowledge provided by a user-independent training set $X_G$ and the user specificities provided by a user-dependent training set $X_k$. To obtain the adapted fusion rule $f_{k,A}(\mathbf{x})$ for user $k$, we compute both the global fusion rule $f_G(\mathbf{x})$ and the local fusion rule $f_{k,L}(\mathbf{x})$, as described above, and finally combine them as follows:

$$f_{k,A}(\mathbf{x}) = \alpha f_{k,L}(\mathbf{x}) + (1 - \alpha) f_G(\mathbf{x}), \qquad (12)$$

where $\alpha$ is a trade-off parameter. This can be seen as a user-dependent fusion scheme adapted from user-independent information. The idea can also be extended easily to trained fusion schemes based on other classifiers. Worth noting, sequential algorithms to solve the SVM optimization problem in Eqs. (4) and (5) have already been proposed [30], and can be used to extend the proposed idea, first constructing the user-independent solution and then refining it by incorporating the local data.
2.1.3. User-dependent decision

Figure 3: System model of multimodal biometric authentication with adapted user-dependent decision. The decision function of the claimed user is adapted using training data from both a pool of users and the claimed user.

The system model of user-dependent decision is shown in Fig. 3. Once a fused similarity score has been obtained by using either a global, local, or adapted fusion method, the score is compared to a decision threshold in order to accept or reject the identity claim. This decision making process, which is also subject to training, can be made globally or locally, or can be adapted from global to local information. For this purpose, the methods presented
in Sects. 2.1.1 and 2.1.2 can be directly applied by exchanging the input multimodal scores $\mathbf{x}$ for fused scores $y$.
2.2. Quality-based multimodal biometrics
The 21st century began with a growing interest in studying the effects of signal quality on the performance of biometric systems [31, 32, 33]. As a result, several works showed that the performance of a unimodal system can drop significantly under noisy conditions [34]. Multimodal systems have been demonstrated to overcome this challenge to some extent by combining the evidence provided by a number of different traits. This idea can be extended by explicitly considering quality measures of the input biometric signals and weighting the various pieces of evidence based on this quality information. Following this idea, various quality-based multimodal authentication schemes have been proposed and studied since the mid-2000s [10].
Quality measures of the input biometric signals can be used for adapting the different modules of a multimodal authentication system [34]. Here we concentrate on quality-based score fusion. The system model of quality-based score fusion is shown in Fig. 4.
Bigun et al. [35] studied the problem of multimodal biometric authenti-
cation by using Bayesian statistics. The result was an Expert Conciliation
scheme including weighting factors not only for the accuracy of the experts
but also for the confidence of the experts on the particular input samples.
Experiments were provided by combining face and voice modalities. The idea
of relating the confidence value to quality measures of the input biometric
signals was nevertheless not developed.
Figure 4: System model of multimodal biometric authentication with quality-based score fusion. Signal quality measures are computed for each input modality and fed into the score fusion function.

The concept of confidence measure of matching scores was also studied by [36]. In that work Bengio et al. demonstrated that the confidence of matching
scores can help in the fusion process. In particular, they tested confidence
measures based on: 1) Gaussian assumptions on the score distributions, 2)
the adequacy of the trained biometric models to explain the input data, and
3) resampling techniques on the set of test scores. This research line was
further developed by Poh and Bengio [37], who devised confidence measures
based on the margin between impostor and client score distributions.
Chatzis et al. [38] evaluated a number of fusion schemes based on clustering strategies. In this case, quality measures obtained directly from the input biometric signals were used to fuzzify the scores provided by the different systems. They demonstrated that fuzzy versions of k-means and Vector Quantization incorporating the quality measures slightly outperformed the standard non-fuzzy clustering methods, although not in all cases. This work is, to the best of our knowledge, the first one reporting results of quality-based fusion. One limitation of its experimental setup was the reduced number of individuals used: only 37.
Another work on quality-based fusion, without the success of the previous methods, was reported by Toh et al. [39], who developed a score fusion scheme based on polynomial functions. Quality measures were introduced into the optimization problem for training the polynomials as weights in the regularization term. Unexpectedly, no performance improvements were obtained by including the quality measures. One limitation of this work was the use of a chimeric multimodal database combining the data from three different face, voice, and fingerprint databases.
2.2.1. Quality-based MCS: combination approach
One straightforward way to incorporate the input biometric quality into score fusion is by including weights in simple combination approaches. In the case of the weighted average presented in Part 1 Eq. (10), this can be achieved by using $w_j = q_j$ in order to obtain the following quality-based score fusion function:

$$y = \sum_{j=1}^{M} q_j x_j, \qquad (13)$$

where $q_j$ is a quality measure of the score $x_j$. This score quality should ideally be related to the confidence of system $j$ in providing a reliable matching score for the particular biometric signal being tested [40, 41]. The score quality proposed and used in [10] is as follows:

$$q = \sqrt{Q \cdot Q_{\mathrm{claim}}}, \qquad (14)$$

where $Q$ and $Q_{\mathrm{claim}}$ are the input biometric quality and the average quality of the biometric signals used for enrollment, respectively. The two quality measures $Q$ and $Q_{\mathrm{claim}}$ are supposed to be in the range $[0, 1]$, where 0 corresponds to the poorest quality and 1 corresponds to the highest quality.

Other definitions of score quality found in the literature include [34]: $q = (Q + Q_{\mathrm{claim}})/2$, $q = \min\{Q, Q_{\mathrm{claim}}\}$, etc.
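A minimal sketch of the quality-weighted fusion of Eqs. (13) and (14); the score and quality values below are arbitrary illustrations:

```python
import numpy as np

def score_quality(Q, Q_claim):
    """Score quality, Eq. (14): geometric mean of input and enrollment quality."""
    return np.sqrt(Q * Q_claim)

def quality_weighted_fusion(x, q):
    """Quality-based weighted score fusion, Eq. (13)."""
    return float(np.dot(q, x))

x = np.array([0.9, 0.4])         # normalized scores from M = 2 unimodal systems
Q = np.array([0.9, 0.2])         # input signal qualities, in [0, 1]
Q_claim = np.array([0.8, 0.3])   # average enrollment qualities, in [0, 1]

q = score_quality(Q, Q_claim)
y = quality_weighted_fusion(x, q)
# The high-quality first modality dominates the fused score, as intended.
```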
Preliminaries. The nomenclature and conventions summarized in Fig. 1 are extended here:

$x_{ij}$ : Similarity score $i$ delivered by system $j$
$v_{ij}$ : Variance of $x_{ij}$ as estimated by system $j$
$z_i$ : The true label corresponding to score $i$
$\zeta_{ij}$ : The error score $\zeta_{ij} = z_i - x_{ij}$

With respect to the previous cases developed in this paper, note that here we introduce the variance $v_{ij}$ of the input scores $x_{ij}$. The true labels $z_i$ can take only two numerical values corresponding to "Impostor" and "Client". If $x_{ij}$ is between 0 and 1, then these values are chosen to be 0 and 1, respectively. The fusion function is trained on shots $i \in 1, \ldots, N$ (i.e. $x_{ij}$ and $z_i$ are known for $i \in 1, \ldots, N$), and we consider trial $N+1$ as a test shot on the working multimodal system (i.e. $x_{(N+1)j}$ is known, but $z_{N+1}$ is not).
Statistical Model. The model for combining the different systems (here also called machine experts) is based on Bayesian statistics and the assumption of normally distributed expert errors, i.e. $\zeta_{ij}$ is considered to be a sample of a normally distributed random variable. It has been shown experimentally [35] that this assumption does not strictly hold for common audio- and video-based biometric machine experts, but it holds reasonably well when client and impostor distributions are considered separately. Taking this result into account, two different fusion functions are constructed, one of them based on genuine scores

$$C = \{x_{ij}, v_{ij} \mid 1 \leq i \leq N \text{ and } z_i = 1, \; 1 \leq j \leq M\}, \qquad (15)$$

while the other is based on impostor scores

$$I = \{x_{ij}, v_{ij} \mid 1 \leq i \leq N \text{ and } z_i = 0, \; 1 \leq j \leq M\}. \qquad (16)$$

The two fusion functions will be referred to as the client function and the impostor function, respectively.

The client function estimates the expected true label of an input claim based on its expertise in recognizing client data. More formally, it computes $M''_C = E[Z_{N+1} \mid C, x_{N+1,j}]$. Similarly, the impostor function computes $M''_I = E[Z_{N+1} \mid I, x_{N+1,j}]$. The conciliated overall score $M''$ takes into account the different expertise of the two fusion functions and chooses the one which came closest to its goal, i.e. 0 for the impostor function and 1 for the client function:

$$M'' = \begin{cases} M''_C & \text{if } |1 - M''_C| - |0 - M''_I| < 0 \\ M''_I & \text{otherwise.} \end{cases} \qquad (17)$$

Based on the normality assumption of the errors, the fusion training and testing algorithm described in [35] is obtained; see [42] for further background and details. In the following paragraphs we summarize the resulting algorithm in the two cases where it can be applied.
Bayesian simplified quality-based score fusion. When only the similarity scores $x_{ij}$ are available, the following simplified fusion function is obtained by using $v_{ij} = 1$:

Training. Estimate the bias parameters of each system. The bias parameters for the client function are

$$M_{Cj} = \frac{1}{n_C} \sum_i \zeta_{ij} \quad \text{and} \quad V_{Cj} = \frac{\alpha_{Cj}}{n_C}, \qquad (18)$$

where $i$ indexes the training set $C$, $n_C$ is the number of training samples in $C$, and

$$\alpha_{Cj} = \frac{1}{n_C - 3} \left[ \sum_i \zeta_{ij}^2 - \frac{1}{n_C} \left( \sum_i \zeta_{ij} \right)^2 \right]. \qquad (19)$$

Similarly $M_{Ij}$ and $V_{Ij}$ are obtained for the impostor function.

Authentication. At this step, both fusion functions are operational, so that the time instant is $N+1$ and the fusion functions have access to the similarity scores $x_{N+1,j}$ but not to the true label $z_{N+1}$. First the client and impostor functions are calibrated according to their past performance, yielding (for the client function)

$$M'_{Cj} = x_{N+1,j} + M_{Cj} \quad \text{and} \quad V'_{Cj} = (n_C + 1) V_{Cj}, \qquad (20)$$

and then the different calibrated systems are combined according to

$$M''_C = \frac{\sum_{j=1}^{M} M'_{Cj} / V'_{Cj}}{\sum_{j=1}^{M} 1 / V'_{Cj}}. \qquad (21)$$

Similarly, $M'_{Ij}$, $V'_{Ij}$ and $M''_I$ are obtained. The final fused output is obtained according to Eq. (17).

The algorithm described above has been successfully applied in [43] in a multimodal authentication system combining face and speech data. Verification performance improvements of almost an order of magnitude were reported as compared to the best modality.
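The simplified algorithm can be sketched as follows; the synthetic scores, class means, and number of training samples are illustrative assumptions:

```python
import numpy as np

def train_bias(scores, labels):
    """Bias parameters for one class set (client or impostor), Eqs. (18)-(19)."""
    zeta = labels[:, None] - scores              # error scores, zeta_ij = z_i - x_ij
    n = len(scores)
    M_j = zeta.mean(axis=0)                      # Eq. (18), left
    alpha_j = (np.sum(zeta ** 2, axis=0)
               - np.sum(zeta, axis=0) ** 2 / n) / (n - 3)   # Eq. (19)
    return M_j, alpha_j / n, n                   # (M_j, V_j, n)

def conciliate(x_new, client_params, impostor_params):
    """Calibration (Eq. 20), combination (Eq. 21), and selection (Eq. 17)."""
    def combine(M_j, V_j, n):
        Mp = x_new + M_j                         # M'_j, calibrated score
        Vp = (n + 1) * V_j                       # V'_j, calibrated variance
        return np.sum(Mp / Vp) / np.sum(1 / Vp)  # inverse-variance weighting
    MC = combine(*client_params)
    MI = combine(*impostor_params)
    return MC if abs(1 - MC) - abs(0 - MI) < 0 else MI   # Eq. (17)

rng = np.random.default_rng(2)
xs_c = rng.normal(0.75, 0.08, (100, 2))          # synthetic client scores, M = 2
xs_i = rng.normal(0.25, 0.08, (100, 2))          # synthetic impostor scores
client = train_bias(xs_c, np.ones(100))
impostor = train_bias(xs_i, np.zeros(100))

y_genuine = conciliate(np.array([0.8, 0.7]), client, impostor)
y_impostor = conciliate(np.array([0.2, 0.3]), client, impostor)
```

For a genuine-looking input, the client function lands near its goal of 1 and is selected; for an impostor-looking input, the impostor function lands near 0 and wins instead.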
Bayesian quality-based score fusion. When not only the scores but also the score variances are available, the following algorithm is obtained:

Training. Estimate the bias parameters. For the client function,

$$M_{Cj} = \frac{\sum_i \zeta_{ij} / \sigma_{ij}^2}{\sum_i 1 / \sigma_{ij}^2} \quad \text{and} \quad V_{Cj} = \frac{1}{\sum_i 1 / \sigma_{ij}^2}, \qquad (22)$$

where the training set $C$ is used. The variances $\sigma_{ij}^2$ are estimated through $\bar{\sigma}_{ij}^2 = v_{ij} \cdot \alpha_{Cj}$, where

$$\alpha_{Cj} = \frac{1}{n_C - 3} \left[ \sum_i \frac{\zeta_{ij}^2}{v_{ij}} - \left( \sum_i \frac{\zeta_{ij}}{v_{ij}} \right)^2 \left( \sum_i \frac{1}{v_{ij}} \right)^{-1} \right]. \qquad (23)$$

Similarly $M_{Ij}$ and $V_{Ij}$ are obtained for the impostor function.

Authentication. First we calibrate the systems according to their past performance; for the client function,

$$M'_{Cj} = x_{N+1,j} + M_{Cj} \quad \text{and} \quad V'_{Cj} = v_{N+1,j} \, \alpha_{Cj} + V_{Cj}, \qquad (24)$$

and then the different calibrated systems are combined according to Eq. (21). Similarly, $M'_{Ij}$, $V'_{Ij}$ and $M''_I$ are obtained. The final fused score is obtained according to Eq. (17). This combined output can be expressed in the form of Eq. (11) from Part 1.

The algorithm described above has been successfully applied not only in biometrics, where it originated [44], but also in other unrelated fields like risk assessment of aircraft accidents [42].
The variance $v_{ij}$ of the score $x_{ij}$ concerns a particular authentication assessment. It is not a general reliability measure for the system itself, but a certainty measure based on the performance of the system and the data being assessed. Typically the variance of the score is chosen as the width of the range in which one can place the score when considering human opinions. Because such intervals can be conveniently provided by a human expert, the algorithm presented here constitutes a systematic way of combining human and machine expertise in MCS applications. An example of such an application is forensic reporting using biometric evidence, where machine expert approaches are increasingly being used [45] and human opinions must be taken into consideration.
The context-based MCS approach summarized here calculates $v_{ij}$ as a function of quality measures computed on the input biometric signals (see Fig. 4). This implies, considering the right-hand expression in Eq. (24), that the trained fusion function adapts the weights of the experts using the input signal quality. For that purpose the quality $q_{ij}$ of the score $x_{ij}$ is defined as:
$$q_{ij} = \sqrt{Q_{ij} \cdot Q_{\mathrm{claim},j}}, \qquad (25)$$

where $Q_{ij}$ and $Q_{\mathrm{claim},j}$ are the quality label of biometric trait $j$ in trial $i$ and the average quality of the biometric signals used by system $j$ for modelling the claimed identity, respectively. The two quality labels $Q_{ij}$ and $Q_{\mathrm{claim},j}$ are supposed to be in the range $[0, Q_{\max}]$ with $Q_{\max} > 1$, where 0 corresponds to the poorest quality, 1 corresponds to normal quality, and $Q_{\max}$ corresponds to the highest quality. Finally, the variance parameter is calculated according to

$$v_{ij} = \frac{1}{q_{ij}^2}. \qquad (26)$$

Experimental evaluation of this quality-based fusion approach can be found in [44, 42].
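The mapping from quality labels to score variances in Eqs. (25) and (26) is a one-liner; the sketch below (with assumed quality labels and $Q_{\max} = 2$) shows how a higher-quality acquisition yields a smaller variance, and hence a larger weight for that expert in Eq. (21):

```python
import numpy as np

Q_max = 2.0                              # maximum quality label (> 1), assumed

def score_variance(Q_ij, Q_claim_j):
    """Score quality (Eq. 25) and the derived score variance (Eq. 26)."""
    q_ij = np.sqrt(Q_ij * Q_claim_j)     # geometric mean of the two labels
    return 1.0 / q_ij ** 2

v_good = score_variance(1.8, 1.5)        # high-quality acquisition, near Q_max
v_poor = score_variance(0.3, 0.9)        # low-quality acquisition
# A smaller variance gives the expert a larger weight via Eqs. (24) and (21),
# so low-quality modalities are automatically down-weighted at fusion time.
```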
2.2.2. Quality-based MCS: classification approach
Instead of assuming particular statistical models for the genuine and impostor score distributions as in the previous section, here we exemplify a quality-based score fusion approach based on any binary classifier. Without loss of generality, we sketch the approach considering SVM classifiers [10].

Let $\mathbf{q} = [q_1, \ldots, q_M]^T$ denote the quality vector of the multimodal similarity score $\mathbf{x} = [x_1, \ldots, x_M]^T$, where $q_j$ is a scalar quality measure corresponding to the similarity score $x_j$, with $j = 1, \ldots, M$ and $M$ being the number of modalities. As in the case of the Bayesian quality-based fusion algorithm, the quality values $q_j$ are computed as follows:

$$q_j = \sqrt{Q_j \cdot Q_{\mathrm{claim},j}}, \qquad (27)$$

where $Q_j$ and $Q_{\mathrm{claim},j}$ are the quality measure of the sensed signal for biometric trait $j$, and the average signal quality of the biometric signals used by unimodal system $j$ for modelling the claimed identity, respectively. The two quality labels $Q_j$ and $Q_{\mathrm{claim},j}$ are supposed to be in the range $[0, Q_{\max}]$ with $Q_{\max} > 1$, where 0 corresponds to the poorest quality, 1 corresponds to standard quality, and $Q_{\max}$ corresponds to the highest quality.

The score-level fusion scheme based on SVM classifiers and quality measures proposed in [10] is as follows:
Training. An initial fusion function:
16
fSVM :RMR, fSVM(xT) = hw,Φ(xT)i+w0(28)
is trained by solving the problem:
min
w,w01,...,ξN1
2kwk2+
N
P
i=1
Ciξi(29)
subject to
yi(hw,Φ(xi)iH+w0)1ξi, i = 1, . . . , N, (30)
ξi0, i = 1, . . . , N, (31)
as described in Sect. 2.1.2, but using as cost weights
Ci=C QM
j=1 qi,j
QM
max !α1
,(32)
where qi,j ,j= 1, . . . , M are the components of the quality vector qias-
sociated with training sample (xi, zi), zi∈ {−1,1}={Impostor,Client},
and Cis a positive constant. As a result, the higher the overall qual-
ity of a multimodal training score the higher its contribution to the
computation of the initial fusion function. Additionally, MSVMs of
dimension M1 (SVM1to SVMM) are trained leaving out traits 1 to
Mrespectively. Similarly to Eq. (32)
Ci=C Qr6=jqi,r
Q(M1)
max !α1
,(33)
for SVMjwith j= 1, . . . , M.
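A sketch of this quality-weighted training stage under illustrative assumptions (synthetic scores and qualities, $\alpha_1 = 1$, linear kernel): scikit-learn's per-sample weights in `SVC.fit` play the role of the costs $C_i$ of Eqs. (32)-(33), up to the global constant $C$.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
M, N, Q_MAX, ALPHA1 = 2, 200, 2.0, 1.0  # modalities, trials, quality range, exponent

# Synthetic multimodal score vectors: impostor (-1) vs. client (+1) trials.
X = np.vstack([rng.normal(0.3, 0.1, (N // 2, M)),   # impostor scores
               rng.normal(0.7, 0.1, (N // 2, M))])  # client scores
y = np.repeat([-1, 1], N // 2)
q = rng.uniform(0.5, Q_MAX, (N, M))                 # per-trait quality vectors

# Eq. (32): the cost of each training sample grows with its overall quality.
C_i = (np.prod(q, axis=1) / Q_MAX**M) ** ALPHA1

# Full fusion function f_SVM, plus M reduced SVMs, each leaving one trait out,
# with the leave-one-out costs of Eq. (33).
svm_full = SVC(kernel="linear").fit(X, y, sample_weight=C_i)
svm_reduced = []
for j in range(M):
    keep = [r for r in range(M) if r != j]
    C_ij = (np.prod(q[:, keep], axis=1) / Q_MAX**(M - 1)) ** ALPHA1
    svm_reduced.append(SVC(kernel="linear").fit(X[:, keep], y, sample_weight=C_ij))
```

Low-quality training trials therefore receive small slack penalties and influence the learned decision boundary less than high-quality ones.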
Authentication. Let the sensed multimodal biometric sample generate a quality vector $\mathbf{q}_T = [q_{T,1}, \ldots, q_{T,M}]^{\mathsf{T}}$. Re-index the individual traits in order to have $q_{T,1} \leq q_{T,2} \leq \ldots \leq q_{T,M}$. A multimodal similarity score $\mathbf{x}_T = [x_{T,1}, \ldots, x_{T,M}]^{\mathsf{T}}$ is then generated. The combined quality-based similarity score is computed as follows:
Figure 5: System model of multimodal biometric authentication with user-dependent and quality-based score fusion.
$$f_{\mathrm{SVMQ}}(\mathbf{x}_T) = \beta_1 \sum_{j=1}^{M-1} \frac{\beta_j}{\sum_{r=1}^{M-1} \beta_r} \, f_{\mathrm{SVM}_j}\!\left(\mathbf{x}_T^{(j)}\right) + (1 - \beta_1) \, f_{\mathrm{SVM}}(\mathbf{x}_T), \quad (34)$$

where $\mathbf{x}_T^{(j)} = [x_{T,1}, \ldots, x_{T,j-1}, x_{T,j+1}, \ldots, x_{T,M}]^{\mathsf{T}}$ and

$$\beta_j = \left( \frac{q_{T,M} - q_{T,j}}{Q_{\max}} \right)^{\alpha_2}, \quad j = 1, \ldots, M-1. \quad (35)$$

As a result, the adapted fusion function in Eq. (34) is a quality-based trade-off between not using and using low-quality traits.
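This authentication-time combination can be sketched as follows, assuming (as in Eq. (35)) that the weight $\beta_j$ grows with the quality gap between trait $j$ and the best trait; $Q_{\max}$, $\alpha_2$, and the toy fusion functions used in the test below are illustrative:

```python
import numpy as np

Q_MAX, ALPHA2 = 2.0, 1.0  # illustrative quality range and exponent

def fuse_quality_based(x_T, q_T, f_full, f_reduced):
    """Quality-based combination of the full and reduced fusion functions.

    x_T, q_T  : score and quality vectors (length M), re-indexed so that
                q_T is sorted in non-decreasing order.
    f_full    : fusion function trained on all M traits.
    f_reduced : f_reduced[j] was trained leaving out trait j.
    """
    x_T, q_T = np.asarray(x_T, float), np.asarray(q_T, float)
    M = len(x_T)
    # beta_j: large when trait j is much worse than the best trait,
    # favouring the reduced SVMs that drop it.
    beta = ((q_T[M - 1] - q_T[:M - 1]) / Q_MAX) ** ALPHA2
    if beta.sum() == 0.0:  # all traits equally good: rely on the full SVM
        return float(f_full(x_T))
    s = sum(beta[j] / beta.sum() * f_reduced[j](np.delete(x_T, j))
            for j in range(M - 1))
    return float(beta[0] * s + (1.0 - beta[0]) * f_full(x_T))
```

With $M = 2$ and a low-quality first trait, the fused score leans towards the reduced SVM that ignores that trait; with equal qualities it falls back to the full SVM.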
2.3. User-dependent and quality-based multimodal biometrics
Finally, we may combine previous strategies to derive fusion systems
adapted both to the user at hand and to the input biometric quality, as
shown in Fig. 5.
Practical implementations of this scheme can be obtained by combining
some of the procedures described previously in the present paper. One pos-
sibility is to use Bayesian user-dependent score fusion plus discriminative
quality-based adaptation.
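One minimal illustration of such a combined scheme (the weights, relevance factor, and update rules below are hypothetical simplifications, not the exact algorithms of the cited works): per-user linear fusion weights adapted from global ones in a Bayesian fashion, then modulated by signal quality at test time.

```python
import numpy as np

def user_adapted_weights(w_global, w_user, n_user, r=10.0):
    """Trade-off between pool-of-users weights and user-specific weights
    estimated from n_user enrolment scores (r is a hypothetical relevance
    factor; more user data shifts weight to the user-specific estimate)."""
    a = n_user / (n_user + r)
    return a * np.asarray(w_user, float) + (1 - a) * np.asarray(w_global, float)

def fuse(x, w, q, q_max=2.0):
    """Weighted-sum score fusion with the user-adapted weights modulated
    by per-modality quality and re-normalized (an illustrative quality
    adaptation, not the discriminative scheme of Sect. 2.2.2)."""
    wq = np.asarray(w, float) * (np.asarray(q, float) / q_max)
    return float(np.dot(wq / wq.sum(), np.asarray(x, float)))
```

A user with few enrolment samples is thus handled mostly with global weights, while at authentication time low-quality modalities are down-weighted before the weighted sum.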
3. Challenges in biometrics: Role of MCS
In the present section, similarly as in the excellent exposition by Jain et
al. [2], we discuss main challenges in biometrics, adapting their discussion
based on our personal view, and commenting how new MCS developments
may play a role in overcoming those challenges.
Note that biometric person recognition shares architectures, methods, issues, and challenges with almost any other pattern recognition application. Therefore, the challenges exposed here have a parallel in other research areas, and may shed some light on the future of other pattern recognition applications as well.
Challenge 0: Better understanding of the nature of biometrics (distinctiveness and permanence). Current knowledge about the nature of the variety of biometric modalities useful for person recognition is quite limited [2]. Although practical systems based on fingerprint or face recognition can satisfy certain applications, a better understanding of factors like their intrinsic distinctive capacity [46, 47] or their permanence [48, 49] will open the way to new and improved recognition methods, and will rationalize the application of such technologies depending on the application scenario and the potential population of use [50].
There have been some advances in these areas, but much work is still necessary to fully understand the nature of biometrics for person authentication. Towards this objective, MCS approaches can be instrumental for analyzing the increasing amount of multimodal biometric data available nowadays [51, 52]. MCS methods can be quite helpful for analyzing those data, as they make it possible to simultaneously analyze and model complex yet structured relations in heterogeneous data [53], which is the case in biometrics, e.g., the different representation levels existing in fingerprint [54, 55] or speech [56, 57].
Challenge 1: Design of robust algorithms (representation and matching) from uncooperative users in unconstrained and varying scenarios. This challenge has been the main focus of research in biometrics during the last 50 years [2], and still the performance achieved by many biometric applications in realistic scenarios is not yet satisfactory. There is a myriad of pattern representation schemes and matching procedures depending on the biometric modality (e.g., face images vs. speech time-sequences) and acquisition scenario (e.g., controlled vs. latent fingerprints), and one can find in the vast and growing literature representation and matching methods specifically adjusted for
many practical applications. Most of these approaches are variants of suc-
cessful representation and matching techniques coming from other research
areas like image and signal processing, speech analysis, or computer vision,
e.g., LBP or SIFT features [58].
As developed in Part 1 in our review of MCS applied to multimodal biometrics, combining several such representation-matching schemes provides significant benefits, not only when one has multiple pieces of evidence to combine [59], but also when one has a single piece of biometric evidence and wants to be robust against degraded or varying conditions by combining various representation schemes [60]. The success of such MCS schemes is related to the
diversity of classifiers being combined, a topic attracting much attention in
the MCS community [61, 8].
The MCS strategies in the previous paragraph assume that various classifiers are available to be combined, but one can also generate multiple base classifiers, e.g., the highly successful AdaBoost approach in the Viola-Jones cascade MCS [62]. These MCS approaches are especially useful when the patterns to be recognized are difficult to represent, or vary over time due to their intrinsic nature or to environmental changes. An adaptive generation of multiple base classifiers, together with adaptive fusion schemes like AdaBoost, may track and adapt well under those unconstrained and varying conditions. This topic of adaptive pattern recognition is also a source of interesting research in MCS under multiple names like concept drift [63, 64]. Advances in adaptive MCS can be instrumental for the future of this Challenge 1. In addition to such adaptive schemes, a better understanding of such unconstrained scenarios through benchmarks and public databases is also of utmost importance [65, 66].
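As a toy illustration of generating (rather than being given) the base classifiers, scikit-learn's `AdaBoostClassifier` can stand in for the boosting schemes discussed above; the dataset below is synthetic and merely plays the role of genuine/impostor patterns:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier

# Synthetic two-class problem standing in for genuine vs. impostor patterns.
X, y = make_classification(n_samples=400, n_features=10, random_state=0)

# AdaBoost builds its own pool of weak base classifiers (decision stumps
# by default), re-weighting the training samples at each round to focus
# on the patterns the current ensemble still misclassifies.
ensemble = AdaBoostClassifier(n_estimators=50, random_state=0)
ensemble.fit(X, y)
```

The adaptive sample re-weighting is what lets such schemes keep tracking difficult or drifting patterns as new base classifiers are added.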
On the other hand, in the last 5 years or so we have witnessed the triumph of data-agnostic (i.e., without any explicit representation) end-to-end machine learning approaches such as deep neural networks that, given enough
representative training data, can generate very robust classifiers for many
problems in unconstrained scenarios with highly varying conditions, e.g.,
face [67] or speaker recognition [68].
MCS methods exploiting deep learning [69], and new deep learning strategies that consider both existing classifiers (a common case in biometric applications) and contextual information [70], are also very promising lines for advancing in this area.
Challenge 2: Integration with end applications. Most traditional and widely
deployed biometric solutions for person recognition are designed for access
control or forensic scenarios. One important challenge in biometrics is how to properly integrate biometric technologies into other application scenarios
like mobile authentication [71, 72], video surveillance [3], forensics [73], large-
scale ID [74], cloud biometrics or ubiquitous biometrics [75].
Depending on the scenario at hand, traditional biometric technologies will need to be adapted, or perhaps redesigned, in order to satisfy new application requirements. In this case, adaptive MCS techniques incorporating context information, like the ones described here in Section 2, can be
quite useful.
Challenge 3: Understanding and improving the usability. As mentioned in
Challenge 2, the number and variety of biometric applications for person
recognition is ever growing, and some of them are strongly dependent on
an adequate interaction between the user and the biometric sensor, e.g., in
mobile authentication [71].
We currently lack a good understanding of how people naturally interact with some biometric sensors, and of the conditions under which the authentication
mechanisms generated with biometric technology perform best. There has
been some research in the past to analyze those factors between the user
and the biometric sensor in general [76], including specific models to analyze
and exploit the interaction between the user and the biometric sensor [77].
More recently, we can see some targeted studies towards understanding the
interaction between users and technology for key biometric end applications
like border control [78], or smartphone unlock [79].
Similar to Challenge 0, MCS approaches can be exploited here as a tool for analyzing multiple sources of heterogeneous, complex yet structured data [53], as is the case of human-biometric sensor interaction data [77].
Challenge 4: Understanding and improving the security. Pattern recognition applications based on biometrics are usually intended for securing information or controlling access to services or places [2]. Note that this is not the only possible usage, as biometric technologies may also be used to analyze personal data towards other objectives, like behaviour analysis [80] or medical
diagnosis [81].
When biometrics are used for security applications, one may want to know
the level of security provided by the application at hand, given a set of oper-
ational conditions. This question has been already addressed in the general
information security community, where various international standards have
been generated under the umbrella of Common Criteria (ISO/IEC 15408)
since 1990 [82]. That standardization effort includes some specific develop-
ments for biometric systems [83]. The basic idea behind those standards is to
measure quantitatively the effort required for potential attackers to bypass
the protection provided by biometrics, and the impact of such attacks.
These ideas have generated much research in biometrics towards under-
standing possible attacks [84], and the generation of protection methods
against attacks [85]. When MCS approaches are applied to biometrics, spe-
cific vulnerabilities appear [86], and protection methods can be generated by
exploiting specific MCS fusion strategies [87].
The topic of security against attackers seeking illicit access is related to
the privacy protection of users, and in particular their biometric templates.
Securing such templates against potential identity theft has also generated
much research activity in the last decade [88]. There are some recent developments in this area exploiting advances in cryptography, like homomorphic encryption [89], but there are still no generally satisfactory solutions for generating secure biometric templates that are at the same time 1) non-invertible, 2) non-linkable, and 3) highly discriminative [2]. Current trends for better
protecting templates containing multiple biometric data are usually based on
advanced cryptographic constructions and the principles of MCS described
in Part 1 [90].
4. Conclusions
The present paper is the Part 2 in a series of two papers. In Part 1
we first provided a brief introduction to Multiple Classifier Systems (MCS)
including basic nomenclature, architecture, and key elements [1]. Our main
focus there was on the fundamentals of MCS, providing pointers to detailed
descriptions of MCS algorithms.
Part 1 then overviewed the application of MCS to the particular field of
multimodal biometric person authentication in the last 25 years [2], including
general descriptions of main MCS elements, methods, and algorithms gener-
ated in the biometrics field. The presentation there was general with a generic
mathematical formulation, in order to facilitate the export of experiences and
methods to other information fusion problems, e.g.: video surveillance [3],
speech technologies [4], biomedical applications [91], human-computer in-
teraction [5], data analytics [6], behavioural modelling [7], or recommender
systems [8].
Part 1 was intended for the non-expert in MCS, or any other reader interested in an overview of the field of multimodal biometrics. Here in Part 2 we provide more advanced material intended for researchers already knowledgeable in MCS and multimodal biometrics, readers who completed Part 1, and any other researcher seeking ideas and prospects about the future of biometrics that may also apply to other pattern recognition areas.
We began this Part 2 by describing in technical detail recent trends and
developments in MCS from multimodal biometrics that incorporate context
information in an adaptive way, using the framework and mathematical tools
introduced in Part 1. These new MCS architectures exploit input qual-
ity measures [10] and pattern-specific particularities that move apart from
general population statistics [11], resulting in robust multimodal biometric
systems.
As in Part 1, the methods here in Part 2 were introduced in a general way so that they can be applied to other information fusion problems as well. In related works such as [9], one can find an excellent treatment of general context-based information fusion, including guidance on how to apply methods and algorithms like the ones developed here to other information fusion architectures.
Finally, we have discussed open challenges in biometrics in which MCS
may play a key role: 0) limited knowledge about the nature of biometrics (in
terms of distinctiveness and permanence for different populations), 1) design
of robust algorithms (representation and matching) from uncooperative users
in unconstrained and varying scenarios, 2) integration with end applications,
3) understanding and improving the usability, and 4) understanding and
improving the security.
5. Acknowledgements
This work was funded by projects CogniMetrics (TEC2015-70627-R) from
MINECO/FEDER, RiskTrack (JUST-2015-JCOO-AG-1), and DeepBio (TIN2017-
85727-C4-3-P). Part of this work was conducted during a research visit of J.F.
to Prof. Ludmila Kuncheva at Bangor University (UK) with STSM funding
from COST CA16101 (MULTI-FORESEE). Author J.F. wants to thank Prof.
Kuncheva for fruitful discussions during his visit.
References
[1] L. I. Kuncheva, Combining Pattern Classifiers: Methods and Algo-
rithms, Wiley, 2014.
[2] A. K. Jain, K. Nandakumar, A. Ross, 50 years of biometric research,
Pattern Recogn. Lett. 79 (2016) 80–105.
[3] A. Garcia-Martin, J. M. Martinez, People detection in surveillance: clas-
sification and evaluation, IET Computer Vision 9 (2015) 779–788(9).
[4] I. Lopez-Moreno, J. Gonzalez-Dominguez, D. Martinez, O. Plchot,
J. Gonzalez-Rodriguez, P. J. Moreno, On the use of deep feedforward
neural networks for automatic language identification, Computer Speech
and Language 40 (2016) 46 – 59.
[5] D. Rozado, T. Moreno, J. S. Agustin, F. B. Rodriguez, P. Varona,
Controlling a smartphone using gaze gestures as the input mechanism,
Human-Computer Interaction 30 (1) (2015) 34–63.
[6] G. Bello-Orgaz, J. J. Jung, D. Camacho, Social big data: Recent achieve-
ments and new challenges, Information Fusion 28 (2016) 45 – 59.
[7] V. Rodriguez-Fernandez, A. Gonzalez-Pardo, D. Camacho, Modelling
behaviour in UAV operations using higher order double chain Markov
models, IEEE Computational Intelligence Magazine 12 (4) (2017) 28–37.
[8] P. Castells, N. J. Hurley, S. Vargas, Recommender Systems Handbook,
Springer US, 2015, Ch. Novelty and Diversity in Recommender Systems,
pp. 881–918.
[9] L. Snidaro, J. Garca, J. Llinas, Context-based information fusion: A
survey and discussion, Information Fusion 25 (2015) 16 – 31.
[10] J. Fierrez-Aguilar, J. Ortega-Garcia, J. Gonzalez-Rodriguez, J. Bigun,
Discriminative multimodal biometric authentication based on quality
measures, Pattern Recognition 38 (5) (2005) 777–779.
[11] J. Fierrez-Aguilar, D. Garcia-Romero, J. Ortega-Garcia, J. Gonzalez-
Rodriguez, Adapted user-dependent multimodal biometric authentica-
tion exploiting general information, Pattern Recognition Letters 26 (16)
(2005) 2628–2639.
[12] A. K. Jain, A. Ross, Learning user-specific parameters in a multibio-
metric system, in: Proc. of IEEE Intl. Conf. on Image Processing, ICIP,
Vol. 1, 2002, pp. 57–60.
[13] Y. Wang, Y. Wang, T. Tan, Combining fingerprint and voice biometrics
for identity verification: An experimental comparison, in: D. Zhang,
A. K. Jain (Eds.), Proc. of Intl. Conf. on Biometric Authentication,
ICBA, Springer LNCS-3072, 2004, pp. 663–670.
[14] A. Kumar, D. Zhang, Integrating palmprint with face for user authen-
tication, in: Proc. of Workshop on Multimodal User Authentication,
MMUA, 2003, pp. 107–112.
[15] R. Snelick, U. Uludag, A. Mink, M. Indovina, A. K. Jain, Large scale
evaluation of multimodal biometric authentication using state-of-the-art
systems, IEEE Transactions on Pattern Analysis and Machine Intelli-
gence 27 (3) (2005) 450–455.
[16] N. Poh, S. Bengio, An investigation of f-ratio client-dependent normal-
isation on biometric authentication tasks, in: Proc. of the IEEE Intl.
Conf. on Acoustics, Speech and Signal Processing, ICASSP, Vol. 1, 2005,
pp. 721–724.
[17] K. A. Toh, X. Jiang, W. Y. Yau, Exploiting local and global decisions for
multimodal biometrics verification, IEEE Trans. on Signal Processing 52
(2004) 3059–3072.
[18] R. Brunelli, D. Falavigna, Person identification using multiple cues,
IEEE Trans. on Pattern Anal. and Machine Intell. 17 (10) (1995) 955–
966.
[19] J. Kittler, M. Hatef, R. Duin, J. Matas, On combining classifiers, IEEE
Trans. on Pattern Anal. and Machine Intell. 20 (3) (1998) 226–239.
[20] L. Hong, A. K. Jain, Integrating faces and fingerprints for personal iden-
tification, IEEE Trans. on Pattern Anal. and Machine Intell. 20 (12)
(1998) 1295–1307.
[21] S. Ben-Yacoub, Y. Abdeljaoued, E. Mayoraz, Fusion of face and speech
data for person identity verification, IEEE Trans. on Neural Networks
10 (5) (1999) 1065–1074.
[22] P. Verlinde, G. Chollet, M. Acheroy, Multi-modal identity verification
using expert fusion, Information Fusion 1 (1) (2000) 17–33.
[23] C. H. Lee, Q. Huo, On adaptive decision rules and decision parameter
adaptation for automatic speech recognition, Proceedings of the IEEE
88 (8) (2000) 1241–1269.
[24] J. Fierrez-Aguilar, D. Garcia-Romero, J. Ortega-Garcia, J. Gonzalez-
Rodriguez, Bayesian adaptation for user-dependent multimodal biomet-
ric authentication, Pattern Recognition 38 (8) (2005) 1317–1319.
[25] N. Poh, J. Kittler, T. Bourlai, Quality-based score normalization with
device qualitative information for multimodal biometric fusion, IEEE
Transactions on Systems, Man, and Cybernetics - Part A: Systems and
Humans 40 (3) (2010) 539–554.
[26] R. O. Duda, P. E. Hart, D. G. Stork, Pattern Classification, Wiley, 2001.
[27] D. A. Reynolds, T. F. Quatieri, R. B. Dunn, Speaker verification using
adapted Gaussian Mixture Models, Digital Signal Processing 10 (2000)
19–41.
[28] V. N. Vapnik, The Nature of Statistical Learning Theory, Springer, 2000.
[29] S. Theodoridis, K. Koutroumbas, Pattern Recognition, Academic Press,
2003.
[30] A. Navia-Vazquez, F. Perez-Cruz, A. Artes-Rodriguez, A. R. Figueiras-
Vidal, Weighted least squares training of support vector classifiers lead-
ing to compact and adaptive schemes, IEEE Trans. on Neural Networks
12 (5) (2001) 1047–1059.
[31] J. C. Junqua, G. V. Noord (Eds.), Robustness in Language and Speech
Technology, Kluwer Academic Publishers, 2001.
[32] D. Simon-Zorita, J. Ortega-Garcia, J. Fierrez-Aguilar, J. Gonzalez-
Rodriguez, Image quality and position variability assessment in
minutiae-based fingerprint verification, IEE Proceedings Vision, Image
and Signal Processing 150 (6) (2003) 402–408.
[33] C. Wilson, et al., FpVTE2003: Fingerprint Vendor Technology Evaluation 2003, NIST Research Report NISTIR 7123 (http://fpvte.nist.gov/)
(June 2004).
[34] F. Alonso-Fernandez, J. Fierrez, J. Ortega-Garcia, Quality measures
in biometric systems, IEEE Security and Privacy 10 (9) (2012) 52–62.
doi:10.1109/MSP.2011.178.
[35] E. S. Bigun, J. Bigun, B. Duc, S. Fischer, Expert conciliation for
multi modal person authentication systems by Bayesian statistics, in:
J. Bigun, G. Chollet, G. Borgefors (Eds.), Proc. of IAPR Intl. Conf.
on Audio- and Video-based Person Authentication, AVBPA, Springer
LNCS-1206, 1997, pp. 291–300.
[36] S. Bengio, C. Marcel, S. Marcel, J. Mariethoz, Confidence measures for
multimodal identity verification, Information Fusion 3 (4) (2002) 267–
276.
[37] N. Poh, S. Bengio, Improving fusion with margin-derived confidence in
biometric authentication tasks, in: Proc. of Intl. Conf. on Audio- and
Video-Based Biometric Person Authentication, AVBPA, Vol. Springer
LNCS-3546, 2005, pp. 474–483.
[38] V. Chatzis, A. G. Bors, I. Pitas, Multimodal decision-level fusion for
person authentication, IEEE Trans. on System, Man, and Cybernetics,
part A 29 (6) (1999) 674–680.
[39] K. A. Toh, W. Y. Yau, E. Lim, L. Chen, C. H. Ng, Fusion of auxiliary
information for multi-modal biometrics authentication, in: D. Zhang,
A. K. Jain (Eds.), Proc. of Intl. Conf. on Biometric Authentication,
ICBA, Springer LNCS-3072, 2004, pp. 678–685.
[40] P. Grother, E. Tabassi, Performance of biometric quality measures,
IEEE Transactions on Pattern Analysis and Machine Intelligence 29 (4)
(2007) 531–543.
[41] F. Alonso-Fernandez, J. Fierrez, D. Ramos, J. Gonzalez-Rodriguez,
Quality-based conditional processing in multi-biometrics: application
to sensor interoperability, IEEE Transactions on Systems, Man and Cybernetics Part A 40 (6) (2010) 1168–1179.
[42] E. S. Bigun, Risk analysis of catastrophes using experts’ judgments:
An empirical study on risk analysis of major civil aircraft accidents in
Europe, European J. Operational Research 87 (1995) 599–612.
[43] J. Bigun, B. Duc, S. Fischer, A. Makarov, F. Smeraldi, Multi modal per-
son authentication, in: H. Wechsler, et al. (Eds.), NATO-ASI Advanced
Study on Face Recognition, Vol. F-163, Springer, 1997, pp. 26–50.
[44] J. Bigun, J. Fierrez-Aguilar, J. Ortega-Garcia, J. Gonzalez-Rodriguez,
Multimodal biometric authentication using quality signals in mobile
communications, in: Proc. of Intl. Conf. on Image Analysis and Pro-
cessing, ICIAP, IEEE CS Press, 2003, pp. 2–13.
[45] J. Gonzalez-Rodriguez, J. Fierrez-Aguilar, D. Ramos-Castro, J. Ortega-
Garcia, Bayesian analysis of fingerprint, face and signature evidences
with automatic biometric systems, Forensic Science International 155 (2-
3) (2005) 126–140.
[46] J. Daugman, Information theory and the iriscode, IEEE Transactions
on Information Forensics and Security 11 (2) (2016) 400–409.
[47] S. Gong, V. N. Boddeti, A. K. Jain, On the capacity of face representa-
tion, CoRR abs/1709.10433 (2017) 1–9.
URL http://arxiv.org/abs/1709.10433
[48] J. Galbally, M. Martinez-Diaz, J. Fierrez, Aging in biometrics: An
experimental analysis on on-line signature, PLOS ONE 8 (7) (2013)
e69897.
[49] S. Yoon, A. K. Jain, Longitudinal study of fingerprint recognition, Pro-
ceedings of the National Academy of Sciences 112 (28) (2015) 8555–8560.
[50] N. Yager, T. Dunstone, The biometric menagerie, IEEE Trans. Pattern
Anal. Mach. Intell. 32 (2) (2010) 220–230.
[51] J. Ortega-Garcia, J. Fierrez, F. Alonso-Fernandez, J. Galbally, M. Freire, J. Gonzalez-Rodriguez, C. Garcia-Mateo, J.-L. Alba-Castro, E. Gonzalez-Agulla, E. Otero-Muras, S. Garcia-Salicetti, L. Allano, B. Ly-Van, B. Dorizzi, J. Kittler, T. Bourlai, N. Poh, F. Deravi, M. Ng, M. Fairhurst, J. Hennebert, A. Humm, M. Tistarelli, L. Brodo, J. Richiardi, A. Drygajlo, H. Ganster, F. M. Sukno, S.-K. Pavani, A. Frangi, L. Akarun, A. Savran, The multi-scenario multi-environment BioSecure multimodal database (BMDB), IEEE Trans. on Pattern Analysis and Machine Intelligence 32 (6) (2010) 1097–1111.
[52] B. Rios-Sanchez, M. F. Arriaga-Gomez, J. Guerra-Casanova,
D. de Santos-Sierra, I. de Mendizabal-Vazquez, G. Bailador, C. Sanchez-
Avila, gb2sumod: A multimodal biometric video database using visible
and IR light, Information Fusion 32 (2016) 64–79.
[53] L. Sorber, M. V. Barel, L. D. Lathauwer, Structured data fusion, IEEE
Journal of Selected Topics in Signal Processing 9 (4) (2015) 586–600.
[54] H. Fronthaler, K. Kollreider, J. Bigun, J. Fierrez, F. Alonso-Fernandez,
J. Ortega-Garcia, J. Gonzalez-Rodriguez, Fingerprint image quality es-
timation and its application to multi-algorithm verification, IEEE Trans.
on Information Forensics and Security 3 (2) (2008) 331–338.
[55] M. Vatsa, R. Singh, A. Noore, Unification of evidence-theoretic fusion
algorithms: A case study in level-2 and level-3 fingerprint features, IEEE
Transactions on Systems, Man, and Cybernetics - Part A: Systems and
Humans 39 (1) (2009) 47–56.
[56] J. Fierrez-Aguilar, D. Garcia-Romero, J. Ortega-Garcia, J. Gonzalez-
Rodriguez, Speaker verification using adapted user-dependent multilevel
fusion, in: Proc. 6th IAPR Intl. Workshop on Multiple Classifier Sys-
tems, MCS, Vol. 3541 of LNCS, Springer, 2005, pp. 356–365.
[57] H. Quene, Multilevel modeling of between-speaker and within-speaker
variation in spontaneous speech tempo, The Journal of the Acoustical
Society of America 123 (2) (2008) 1104–1113.
[58] E. Gonzalez-Sosa, R. Vera-Rodriguez, J. Fierrez, J. Ortega-Garcia, Exploring facial regions in unconstrained scenarios: Experience on ICB-RW,
IEEE Intelligent Systems (2018) 1–3.
[59] N. Poh, T. Bourlai, J. Kittler, L. Allano, F. Alonso-Fernandez, O. Am-
bekar, J. Baker, B. Dorizzi, O. Fatukasi, J. Fierrez, H. Ganster,
J. Ortega-Garcia, D. Maurer, A. A. Salah, T. Scheidat, C. Vielhauer,
Benchmarking quality-dependent and cost-sensitive score-level multi-
modal biometric fusion algorithms, IEEE Trans on Information Foren-
sics and Security 4 (4) (2009) 849–866.
[60] J. Fierrez-Aguilar, Y. Chen, J. Ortega-Garcia, A. Jain, Incorporating
image quality in multi-algorithm fingerprint verification, in: D. Zhang,
A. K. Jain (Eds.), Proc. of IAPR Intl. Conf. on Biometrics, ICB,
Springer LNCS-3832, 2006, pp. 213–220.
[61] L. I. Kuncheva, C. J. Whitaker, Measures of diversity in classifier ensem-
bles and their relationship with the ensemble accuracy, Machine Learn-
ing 51 (2) (2003) 181–207.
[62] P. Viola, M. J. Jones, Robust real-time face detection, Int. J. Comput.
Vision 57 (2) (2004) 137–154.
[63] R. Elwell, R. Polikar, Incremental learning of concept drift in nonsta-
tionary environments, IEEE Transactions on Neural Networks 22 (10)
(2011) 1517–1531.
[64] L. I. Kuncheva, Classifier ensembles for changing environments, in: 5th
International Workshop on Multiple Classifier Systems, MCS 04, Vol.
3077 of Lecture Notes in Computer Science, Springer-Verlag, 2004, pp.
1–15.
[65] J. Neves, J. C. Moreno, H. Proenca, QUIS-CAMPI: An annotated multi-biometrics data feed from surveillance scenarios, IET Biometrics (2018)
1–20.
[66] E. Gonzalez-Sosa, J. Fierrez, R. Vera-Rodriguez, F. Alonso-Fernandez,
Facial soft biometrics for recognition in the wild: Recent works, annotation and COTS evaluation, IEEE Trans. on Information Forensics and
Security (2018) 1–12.
[67] O. M. Parkhi, A. Vedaldi, A. Zisserman, Deep face recognition, in:
British Machine Vision Conference, 2015.
[68] E. Variani, X. Lei, E. McDermott, I. L. Moreno, J. Gonzalez-Dominguez,
Deep neural networks for small footprint text-dependent speaker veri-
fication, in: 2014 IEEE International Conference on Acoustics, Speech
and Signal Processing (ICASSP), 2014, pp. 4052–4056.
[69] J. Ngiam, A. Khosla, M. Kim, J. Nam, H. Lee, A. Y. Ng, Multimodal
deep learning, in: ICML, 2011, pp. 689–696.
[70] P. S. Aleksic, M. Ghodsi, A. H. Michaely, C. Allauzen, K. B. Hall,
B. Roark, D. Rybach, P. J. Moreno, Bringing contextual information to
google speech recognition, in: INTERSPEECH 2015, 16th Annual Con-
ference of the International Speech Communication Association, Dres-
den, Germany, September 6-10, 2015, 2015, pp. 468–472.
[71] V. M. Patel, R. Chellappa, D. Chandra, B. Barbello, Continuous user
authentication on mobile devices: Recent progress and remaining chal-
lenges, IEEE Signal Processing Magazine 33 (4) (2016) 49–61.
[72] J. Fierrez, A. Pozo, M. Martinez-Diaz, J. Galbally, A. Morales, Bench-
marking swipe biometrics for mobile authentication, IEEE Trans. on
Information Forensics and Security (2018) 1–12.
[73] M. Tistarelli, C. Champod (Eds.), Handbook of Biometrics for Forensic
Science, Springer, 2017.
[74] D. Wang, C. Otto, A. K. Jain, Face search at scale, IEEE Transactions
on Pattern Analysis and Machine Intelligence 39 (6) (2017) 1122–1136.
[75] R. He, B. Lovell, R. Chellappa, A. Jain, Z. Sun, Editorial: Special issue
on ubiquitous biometrics, Pattern Recognition 66 (2017) 1–3.
[76] R. Blanco-Gonzalo, R. Sanchez-Reillo, J. Liu-Jimenez, C. Sanchez-
Redondo, How to assess user interaction effects in biometric perfor-
mance, in: 2017 IEEE International Conference on Identity, Security
and Behavior Analysis (ISBA), 2017.
[77] M. Brockly, S. Elliott, R. Guest, R. Blanco-Gonzalo, Encyclopedia of
Biometrics, Springer, 2015, Ch. Human-Biometric Sensor Interaction,
pp. 887–893.
[78] J. J. Robertson, R. M. Guest, S. J. Elliott, K. O’Connor, A framework
for biometric and interaction performance assessment of automated bor-
der control processes, IEEE Transactions on Human-Machine Systems
47 (6) (2017) 983–993.
[79] M. Harbach, A. De Luca, S. Egelman, The anatomy of smartphone
unlocking: A field study of android lock screens, in: Proceedings of the
ACM Conference on Human Factors in Computing Systems, CHI, 2016,
pp. 4806–4817.
[80] P. Tzirakis, G. Trigeorgis, M. A. Nicolaou, B. W. Schuller, S. Zafeiriou,
End-to-end multimodal emotion recognition using deep neural networks,
CoRR abs/1704.08619 (2017) 1–9.
URL http://arxiv.org/abs/1704.08619
[81] J. Garre-Olmo, M. Faundez-Zanuy, K. Lopez-de Ipina, L. Calvo-Perxas,
O. Turro-Garriga, Kinematic and pressure features of handwriting and
drawing: Preliminary results between patients with mild cognitive impairment, Alzheimer disease and healthy controls, Current Alzheimer
Research 14 (9) (2017) 960–968.
[82] D. Mellado, E. Fernandez-Medina, M. Piattini, A common criteria based
security requirements engineering process for the development of secure
information systems, Computer Standards and Interfaces 29 (2) (2007)
244 – 253.
[83] A. Merle, J. Bringer, J. Fierrez, N. Tekampe, Beat: A methodology for
common criteria evaluations of biometrics systems, in: Intl. Common
Criteria Conf., London, UK, 2015.
[84] A. Hadid, N. Evans, S. Marcel, J. Fierrez, Biometrics systems under
spoofing attack: An evaluation methodology and lessons learned, IEEE
Signal Processing Magazine 32 (5) (2015) 20–30.
[85] J. Galbally, S. Marcel, J. Fierrez, Image quality assessment for fake
biometric detection: Application to iris, fingerprint and face recognition,
IEEE Trans. on Image Processing 23 (2) (2014) 710–724.
[86] M. Gomez-Barrero, J. Galbally, J. Fierrez, Efficient software attack to
multimodal biometric systems and its application to face and iris fusion,
Pattern Recognition Letters 36 (2014) 243–253.
[87] B. Biggio, G. Fumera, G. L. Marcialis, F. Roli, Statistical meta-analysis
of presentation attacks for secure multibiometric systems, IEEE Trans-
actions on Pattern Analysis and Machine Intelligence 39 (3) (2017) 561–
575.
[88] K. Nandakumar, A. K. Jain, Biometric template protection: Bridging
the performance gap between theory and practice, IEEE Signal
Processing Magazine 32 (5) (2015) 88–100.
[89] M. Gomez-Barrero, J. Galbally, A. Morales, J. Fierrez,
Privacy-preserving comparison of variable-length data with application
to biometric template protection, IEEE Access 5 (2017) 8606–8619.
[90] M. Gomez-Barrero, E. Maiorana, J. Galbally, P. Campisi, J. Fierrez,
Multi-biometric template protection based on homomorphic encryption,
Pattern Recognition 67 (2017) 149–163.
[91] L. Nanni, C. Salvatore, A. Cerasa, I. Castiglioni, Combining multiple
approaches for the early diagnosis of Alzheimer's disease, Pattern
Recognition Letters 84 (2016) 259–266.