Multiple classifiers in biometrics.
Part 2: Trends and challenges
Julian Fierrez, Aythami Morales, Ruben Vera-Rodriguez, David Camacho
School of Engineering, Universidad Autonoma de Madrid, Madrid, Spain
Abstract
The present paper is Part 2 in this series of two papers. In Part 1 we provided an introduction to Multiple Classifier Systems (MCS) with a focus on the fundamentals: basic nomenclature, key elements, architecture, main methods, and prevalent theory and frameworks. Part 1 then overviewed the application of MCS to the particular field of multimodal biometric person authentication over the last 25 years, as a prototypical area in which MCS has resulted in important achievements.
Here in Part 2 we present in more technical detail recent trends and developments in MCS coming from multimodal biometrics that incorporate context information in an adaptive way. These new MCS architectures exploit input quality measures and pattern-specific particularities that deviate from general population statistics, resulting in robust multimodal biometric systems. As in Part 1, methods here are described in a general way so they can be applied to other information fusion problems as well. Finally, we also discuss open challenges in biometrics in which MCS can play a key role.
Keywords: classifier, fusion, biometrics, multimodal, adaptive, context

To appear as: J. Fierrez, A. Morales, R. Vera-Rodriguez and D. Camacho, "Multiple Classifiers in Biometrics. Part 2: Trends and Challenges", Information Fusion, Vol. 44, November 2018. (Preprint submitted to Information Fusion, December 19, 2017.)

1. Introduction
The present paper is Part 2 in a series of two papers dedicated to overviewing the field of Multiple Classifier Systems (MCS) in biometrics. In Part 1, we introduced the fundamentals of MCS [1], including nomenclature, architecture, and a flexible theoretical framework. We then provided a review of MCS applied to multimodal biometric person authentication in the last 25 years [2]. That review was developed using a generic MCS framework and
mathematical notation, with the purpose of facilitating the transfer of MCS
achievements from biometrics to other pattern recognition applications like
video surveillance [3], speech technologies [4], human-computer interaction
[5], data analytics [6], behavioural modelling [7], or recommender systems
[8].
Here in Part 2 we build on Part 1 to overview more recent trends in MCS applied to biometrics, with a focus on context-based information fusion [9]. In particular, the main MCS architectures in biometrics that have successfully exploited context information are based on quality measures [10] or user-specificities [11]. As in Part 1, the methods here are described in a general way so they can be applied to other information fusion problems as well. Additionally, particular implementations of the reported context-based MCS architectures are described using two paradigms: 1) statistical, based on Bayesian theory, and 2) discriminative, based on Support Vector Machine classifiers.
We end this series of two papers with a discussion of open challenges in biometrics. The challenges discussed largely follow the excellent survey and outlook of the field of biometric person recognition by Jain et al. [2], which we complement with our personal view and augment with the ways in which MCS developments can advance those key challenges in biometrics. With that, we also hope to shed some light on the future of other pattern recognition and information fusion areas as well.
The present paper is organized as follows. Section 2 overviews current trends in context-based fusion for biometrics, focusing first on user-dependent fusion and then on quality-based fusion. In both cases, we first discuss the general architecture and then describe specific fusion algorithms under two paradigms: statistical (combination approach) and discriminative (classification approach). Section 3 summarizes open challenges in biometrics and discusses the role of MCS methods in overcoming those challenges. The paper ends in Section 4 with some concluding remarks.
2. Trends in biometrics: Context-based MCS
This section is focused on MCS for multimodal biometric authentication,
adapted both to user-specificities and to the input biometric quality. In the
following sections we summarize key related works in these areas.
Figure 1: General system model of multimodal biometric authentication using score-level fusion, including naming conventions. Each of the M unimodal systems (e.g. fingerprint or signature recognition) performs pre-processing, feature extraction, and similarity computation against the enrolled models of the claimed identity k; the resulting scores are normalized and combined by the fusion function y = f(x), and the fused score is compared to a decision threshold to accept or reject the claim (authentication trials i = 1, ..., N; systems j = 1, ..., M; users k).
The adaptive MCS schemes for multimodal biometrics are divided into
three classes: 1) user-dependent, 2) quality-based, and 3) user-dependent
and quality-based. Although the last class includes the first two classes as
particular cases, the three classes are introduced sequentially in order to
facilitate the description.
For each class of methods, we first sketch the system model and then
we derive particular implementations by using standard pattern recognition
methods, either based on generative assumptions following Bayesian theory,
or discriminative criteria using Support Vector Machines. These two classes
of implementations aim at minimizing the Bayesian error and the Structural
Risk of the verification task, respectively.
In the rest of the paper we use the following nomenclature and conventions. Given a multimodal biometric verification system consisting of $M$ different unimodal systems $j = 1, \ldots, M$, each one computes a similarity score $s$ between an input biometric pattern and the enrolled pattern or model of the given claimant $k$. The similarity scores $s$ are normalized to $x$. Let the normalized similarity scores provided by the different unimodal systems be combined into a multimodal score $\mathbf{x} = [x_1, \ldots, x_M]^T$. The design of a fusion scheme consists in the definition of a function $f: \mathbb{R}^M \rightarrow \mathbb{R}$, so as to maximize the separability of the client $\{f(\mathbf{x}) \mid \text{client attempt}\}$ and impostor $\{f(\mathbf{x}) \mid \text{impostor attempt}\}$ fused score distributions. This function may be trained by using labelled training scores $(\mathbf{x}_i, z_i)$, where $z_i \in \{0 = \text{impostor attempt}, 1 = \text{client attempt}\}$.

In Fig. 1 we depict the general system model including all the notations defined above.
2.1. User-dependent multimodal biometrics
The idea of exploiting user-specific parameters at the score level in multimodal biometrics was introduced, to the best of our knowledge, by [12]. In that work, a user-independent weighted linear combination of similarity scores was shown to be improved by using either user-dependent weights or user-dependent decision thresholds, both of them computed by exhaustive search on the testing data. The idea of user-dependent fusion parameters was also explored by [13] using non-biased error estimation procedures. Other attempts at personalized multimodal biometrics include the use of the claimed identity index as a feature for a globally trained fusion scheme based on Neural Networks [14], computing user-dependent weights using lambness metrics [15], and using personalized Fisher ratios [16].
Toh et al. [17] proposed a taxonomy of score-level fusion approaches for multi-biometrics. Multimodal fusion approaches were classified as global or local depending firstly on the fusion function (i.e., user-independent or user-dependent fusion strategies) and secondly on the decision making process (i.e., user-independent or user-dependent decision thresholds): global-learning and global-decision (GG), local-learning and global-decision (LG), and similarly GL and LL. Some example works of each group are listed in Table 1.
These local methods (user-dependent fusion or decision) are confronted with a big challenge: training data scarcity, as the amount of available training data in localized learning is usually not sufficient and representative enough to guarantee good MCS parameter estimation and generalization capabilities. To cope with this lack of robustness derived from partial knowledge, the use of robust adaptive learning strategies based on background information was proposed in related research areas [23]. The idea of exploiting background information, and adapting the fusion functions of MCS from it based on context information, was introduced in biometrics by Fierrez et al. [11, 24], and was soon followed by others [25]. In brief, these context-based MCS methods balance the background information (from a pool of background users) and the local data (from a given user) as a tradeoff between both kinds of information.
The system model of user-dependent score fusion including the mentioned
adaptation from background information is shown in Fig. 2.
Two selected algorithms implementing the discussed adapted user-dependent
fusion are summarized in the following sections.
Table 1: Example works on multimodal biometrics based on local and global learning. M denotes the total number of classifiers combined. Architecture is Global-learning and Global-decision (GG), Local-learning and Global-decision (LG), and similarly GL and LL. Performance gain over the best single classifier is given for IDentification or VERification, either as an FR@FA pair, EER, or Total Error TE=FR+FA (in %).

Work | Modalities | M | Architecture | Gain
Brunelli and Falavigna (1995) [18] | Speaker, face | 5 | GG | ID: 17 → 2 (TE)
Kittler et al. (1998) [19] | Speaker, face | 3 | GG | VER: 1.4 → 0.7 (EER)
Hong and Jain (1998) [20] | Face, fingerprint | 2 | GG | ID: 6.9 → 4.5 (FR@0.1%FA)
Ben-Yacoub et al. (1999) [21] | Speaker, face | 3 | GG | VER: 4 → 0.5 (EER)
Verlinde et al. (2000) [22] | Speaker, face | 3 | GG | VER: 3.7 → 0.1 (TE)
Jain and Ross (2002) [12] | Face, fingerprint, hand | 3 | LG/GL | n/a
Kumar and Zhang (2003) [14] | Face, palmprint | 2 | LG | VER: 3.6 → 0.8 (EER)
Toh et al. (2004) [17] | Speaker, fingerprint, hand | 3 | LG/GL/LL | VER: 50% improvement (EER)
Fierrez et al. (2005) [11] | Signature, fingerprint | 2 | LG/GL/LL | VER: 3.5 → 0.8 (EER)
Figure 2: System model of multimodal biometric authentication with adapted user-dependent score fusion. The fusion function of the claimed user is adapted using training data from both a pool of users and the claimed user.
2.1.1. User-dependent MCS: combination approach
Here we outline this algorithm, representative of context-based MCS, which adapts the score fusion function to each user starting from general background information. For a more detailed description and experimental evaluation see [24].
Impostor and client score distributions are modelled as multivariate Gaussians $p(\mathbf{x}|\omega_0) = N(\mathbf{x}|\boldsymbol{\mu}_0, \boldsymbol{\sigma}^2_0)$ and $p(\mathbf{x}|\omega_1) = N(\mathbf{x}|\boldsymbol{\mu}_1, \boldsymbol{\sigma}^2_1)$, respectively¹. The fused score $y_T$ of a multimodal test score $\mathbf{x}_T$ is then defined as follows:

$$y_T = f(\mathbf{x}_T) = \log p(\mathbf{x}_T|\omega_1) - \log p(\mathbf{x}_T|\omega_0), \qquad (1)$$

which is known to be a Quadratic Discriminant (QD) function consistent with the Bayes estimate in case of equal impostor and client prior probabilities [26]. The score distributions are estimated using the available training data as follows:
Global. The training set $X_G = \{(\mathbf{x}_i, z_i)\}_{i=1}^{N_G}$ includes multimodal scores from a number of different clients, and $(\{\boldsymbol{\mu}_{G,0}, \boldsymbol{\sigma}^2_{G,0}\}, \{\boldsymbol{\mu}_{G,1}, \boldsymbol{\sigma}^2_{G,1}\})$ are estimated by using the standard Maximum Likelihood criterion [27]. The resulting fusion rule $f_G(\mathbf{x})$ is applied globally at the operational stage regardless of the claimed identity.

Local. A different fusion rule $f_{k,L}(\mathbf{x})$ is obtained for each client $k$ enrolled in the system by using Maximum Likelihood density estimates $(\{\boldsymbol{\mu}_{k,L,0}, \boldsymbol{\sigma}^2_{k,L,0}\}, \{\boldsymbol{\mu}_{k,L,1}, \boldsymbol{\sigma}^2_{k,L,1}\})$ computed from a set of development scores $X_k$ of the specific client $k$.

¹We use diagonal covariance matrices, so $\boldsymbol{\sigma}^2$ is shorthand for $\mathrm{diag}(\boldsymbol{\Sigma})$. Similarly, $\boldsymbol{\mu}^2$ is shorthand for $\mathrm{diag}(\boldsymbol{\mu}\boldsymbol{\mu}^T)$.
Adapted. The adapted fusion rule $f_{k,A}(\mathbf{x})$ of client $k$ trades off the general knowledge provided by the user-independent development data $X_G$ and the user specificities provided by the user-dependent training set $X_k$, through Maximum a Posteriori density estimation [27]. This is done by adapting the sufficient statistics as follows:

$$\boldsymbol{\mu}_{k,A,l} = \alpha_l \, \boldsymbol{\mu}_{k,L,l} + (1 - \alpha_l) \, \boldsymbol{\mu}_{G,l},$$
$$\boldsymbol{\sigma}^2_{k,A,l} = \alpha_l \, (\boldsymbol{\sigma}^2_{k,L,l} + \boldsymbol{\mu}^2_{k,L,l}) + (1 - \alpha_l)(\boldsymbol{\sigma}^2_{G,l} + \boldsymbol{\mu}^2_{G,l}) - \boldsymbol{\mu}^2_{k,A,l}. \qquad (2)$$

For each class $l \in \{0 = \mathrm{impostor}, 1 = \mathrm{client}\}$, a data-dependent adaptation coefficient

$$\alpha_l = N_l / (N_l + r) \qquad (3)$$

is used, where $N_l$ is the number of local training scores in class $l$, and $r$ is a fixed relevance factor.
Note that other statistical models or other techniques for trading-off the
general and local knowledge can be used in a similar way.
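The MAP adaptation above can be sketched numerically. The following minimal example (Python with NumPy; the class means, relevance factor $r$, and the synthetic scores are illustrative assumptions, not values from the paper) estimates global and local Gaussian statistics, adapts them according to Eqs. (2) and (3), and fuses a test score with the QD function of Eq. (1):

```python
import numpy as np

def fit_gaussian(scores):
    """Maximum Likelihood estimates (diagonal covariance) for a score set."""
    return scores.mean(axis=0), scores.var(axis=0)

def map_adapt(mu_L, var_L, N_l, mu_G, var_G, r=4.0):
    """MAP adaptation of the sufficient statistics, Eqs. (2)-(3)."""
    a = N_l / (N_l + r)                      # adaptation coefficient alpha_l
    mu_A = a * mu_L + (1 - a) * mu_G
    var_A = a * (var_L + mu_L**2) + (1 - a) * (var_G + mu_G**2) - mu_A**2
    return mu_A, var_A

def log_gauss(x, mu, var):
    """Log-density of a diagonal-covariance Gaussian."""
    return -0.5 * np.sum(np.log(2 * np.pi * var) + (x - mu)**2 / var)

def qd_fusion(x, params0, params1):
    """Quadratic Discriminant fused score, Eq. (1)."""
    return log_gauss(x, *params1) - log_gauss(x, *params0)

rng = np.random.default_rng(0)
# Global (background) scores, impostor class 0 and client class 1, M = 2 systems
G0 = rng.normal([0.2, 0.3], 0.1, size=(500, 2))
G1 = rng.normal([0.7, 0.8], 0.1, size=(500, 2))
# Scarce local development scores for one enrolled user k
L0 = rng.normal([0.3, 0.2], 0.1, size=(8, 2))
L1 = rng.normal([0.8, 0.7], 0.1, size=(8, 2))

p0 = map_adapt(*fit_gaussian(L0), len(L0), *fit_gaussian(G0))
p1 = map_adapt(*fit_gaussian(L1), len(L1), *fit_gaussian(G1))

y_client = qd_fusion(np.array([0.75, 0.75]), p0, p1)    # client-like input
y_impostor = qd_fusion(np.array([0.25, 0.25]), p0, p1)  # impostor-like input
```

With only 8 local samples per class, the adapted statistics stay close to the background model; as local data accumulates, $\alpha_l$ grows and the user-specific estimates dominate.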
2.1.2. User-dependent MCS: classification approach
As before, we only outline here the main aspects of this context-based MCS approach, which also adapts the score fusion function to each user from general background information. This particular implementation is based on SVM, but the approach is easily extensible to any other binary classifier. For a detailed description and experimental evaluation see [11].
Without loss of generality, suppose we train an SVM classifier with the following training set: $X = \{(\mathbf{x}_i, z_i)\}_{i=1}^{N}$, where $N$ is the number of multimodal scores in the training set, and $z_i \in \{-1, 1\} = \{\mathrm{Impostor}, \mathrm{Client}\}$. We train the SVM classifier by solving the following quadratic programming problem [28]:

$$\min_{\mathbf{w}, w_0, \xi_1, \ldots, \xi_N} \; \frac{1}{2}\|\mathbf{w}\|^2 + \sum_{i=1}^{N} C_i \xi_i \qquad (4)$$

subject to

$$z_i(\langle \mathbf{w}, \Phi(\mathbf{x}_i)\rangle_{\mathcal{H}} + w_0) \geq 1 - \xi_i, \quad i = 1, \ldots, N,$$
$$\xi_i \geq 0, \quad i = 1, \ldots, N, \qquad (5)$$
where slack variables $\xi_i$ are introduced to take into account the eventual non-separability of $\Phi(X)$, and the parameter $C_i = C$ is a positive constant that controls the relative influence of the two competing terms.

The optimization problem in Eqs. (4) and (5) is solved with the Wolfe dual representation by using the kernel trick [29]:

$$\max_{\alpha_1, \ldots, \alpha_N} \left( \sum_{i=1}^{N} \alpha_i - \frac{1}{2} \sum_{i,j=1}^{N} \alpha_i \alpha_j z_i z_j K(\mathbf{x}_i, \mathbf{x}_j) \right) \qquad (6)$$

subject to

$$0 \leq \alpha_i \leq C_i, \quad i = 1, \ldots, N,$$
$$\sum_{i=1}^{N} \alpha_i z_i = 0, \qquad (7)$$

where the kernel function $K(\mathbf{x}_i, \mathbf{x}_j) = \langle \Phi(\mathbf{x}_i), \Phi(\mathbf{x}_j)\rangle_{\mathcal{H}}$ is introduced to avoid direct manipulation of the elements of $\mathcal{H}$. Typical kernel functions include radial basis functions

$$K(\mathbf{x}_i, \mathbf{x}_j) = \exp\left(-\|\mathbf{x}_i - \mathbf{x}_j\|^2 / 2\sigma^2\right), \qquad (8)$$

and linear kernels

$$K(\mathbf{x}_i, \mathbf{x}_j) = \mathbf{x}_i^T \mathbf{x}_j, \qquad (9)$$

resulting in complex and linear separating surfaces between client and impostor distributions, respectively.
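As a quick illustration, the two kernels in Eqs. (8) and (9) can be written directly (the example vectors are arbitrary):

```python
import numpy as np

def rbf_kernel(xi, xj, sigma=1.0):
    """Radial basis function kernel, Eq. (8)."""
    return float(np.exp(-np.sum((xi - xj) ** 2) / (2 * sigma ** 2)))

def linear_kernel(xi, xj):
    """Linear kernel, Eq. (9)."""
    return float(np.dot(xi, xj))

a = np.array([0.9, 0.8])
b = np.array([0.1, 0.2])
# The RBF kernel equals 1 for identical inputs and decays with distance,
# while the linear kernel is just the inner product of the two score vectors.
k_same, k_diff, k_lin = rbf_kernel(a, a), rbf_kernel(a, b), linear_kernel(a, b)
```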
The fused score $y_T$ of a multimodal test pattern $\mathbf{x}_T$ is defined as follows:

$$y_T = f(\mathbf{x}_T) = \langle \mathbf{w}^*, \Phi(\mathbf{x}_T)\rangle_{\mathcal{H}} + w_0^*, \qquad (10)$$

which is a signed distance measure from $\mathbf{x}_T$ to the separating surface given by the solution of the SVM problem. Applying the Karush-Kuhn-Tucker (KKT) conditions to the problem in Eqs. (4) and (5), $y_T$ can be shown to be equivalent to the following sparse expression:

$$y_T = f(\mathbf{x}_T) = \sum_{i \in SV} \alpha_i^* z_i K(\mathbf{x}_i, \mathbf{x}_T) + w_0^*, \qquad (11)$$

where $(\mathbf{w}^*, w_0^*)$ is the optimal hyperplane, $(\alpha_1^*, \ldots, \alpha_N^*)$ is the solution to the problem in Eqs. (6) and (7), and $SV = \{i \mid \alpha_i^* > 0\}$ indexes the set of support vectors. The bias parameter $w_0^*$ is obtained from the solution to the problem in Eqs. (6) and (7) by using the KKT conditions [29].

As a result, the training procedure in Eqs. (6) and (7) and the testing strategy in Eq. (11) are obtained for the problem of multimodal fusion.
Global. The training set $X_G = \{(\mathbf{x}_i, z_i)\}_{i=1}^{N_G}$ includes multimodal scores from a number of different clients, and the resulting fusion rule $f_G(\mathbf{x})$ is applied globally at the operational stage regardless of the claimed identity.

Local. A different fusion rule $f_{k,L}(\mathbf{x})$ is obtained for each client $k$ enrolled in the system by using development scores $X_k$ of the specific client $k$. At the operational stage, the fusion rule $f_{k,L}(\mathbf{x})$ of the claimed identity $k$ is applied.

Adapted. This scheme trades off the general knowledge provided by a user-independent training set $X_G$ and the user specificities provided by a user-dependent training set $X_k$. To obtain the adapted fusion rule $f_{k,A}(\mathbf{x})$ for user $k$, we compute both the global fusion rule $f_G(\mathbf{x})$ and the local fusion rule $f_{k,L}(\mathbf{x})$, as described above, and finally combine them as follows:

$$f_{k,A}(\mathbf{x}) = \alpha f_{k,L}(\mathbf{x}) + (1 - \alpha) f_G(\mathbf{x}), \qquad (12)$$

where $\alpha$ is a trade-off parameter. This can be seen as a user-dependent fusion scheme adapted from user-independent information. The idea can also be extended easily to trained fusion schemes based on other classifiers. Worth noting, sequential algorithms to solve the SVM optimization problem in Eqs. (4) and (5) have already been proposed [30], and can be used to extend the proposed idea, first constructing the user-independent solution and then refining it by incorporating the local data.
2.1.3. User-dependent decision

Figure 3: System model of multimodal biometric authentication with adapted user-dependent decision. The decision function of the claimed user is adapted using training data from both a pool of users and the claimed user.

The system model of user-dependent decision is shown in Fig. 3. Once a fused similarity score has been obtained by using either a global, local, or adapted fusion method, the score is compared to a decision threshold in order to accept or reject the identity claim. This decision making process, which is also subject to training, can be made globally or locally, or can be adapted from global to local information. For this purpose, the methods presented
in Sects. 2.1.1 and 2.1.2 can be directly applied by exchanging the input multimodal scores $\mathbf{x}$ for fused scores $y$.
2.2. Quality-based multimodal biometrics
The 21st century began with a growing interest in studying the effects of signal quality on the performance of biometric systems [31, 32, 33]. As a result, several works showed that the performance of a unimodal system can drop significantly under noisy conditions [34]. Multimodal systems have been demonstrated to overcome this challenge to some extent by combining the evidence provided by a number of different traits. This idea can be extended by explicitly considering quality measures of the input biometric signals and weighting the various pieces of evidence based on this quality information. Following this idea, various quality-based multimodal authentication schemes have been proposed and studied since the mid-2000s [10].
Quality measures of the input biometric signals can be used for adapting the different modules of a multimodal authentication system [34]. Here we concentrate on quality-based score fusion. The system model of quality-based score fusion is shown in Fig. 4.
Bigun et al. [35] studied the problem of multimodal biometric authenti-
cation by using Bayesian statistics. The result was an Expert Conciliation
scheme including weighting factors not only for the accuracy of the experts
but also for the confidence of the experts on the particular input samples.
Experiments were provided by combining face and voice modalities. The idea
of relating the confidence value to quality measures of the input biometric
signals was nevertheless not developed.
Figure 4: System model of multimodal biometric authentication with quality-based score fusion. Signal quality measures are computed for each input modality and fed into the score fusion function.

The concept of confidence measure of matching scores was also studied by [36]. In that work Bengio et al. demonstrated that the confidence of matching
scores can help in the fusion process. In particular, they tested confidence
measures based on: 1) Gaussian assumptions on the score distributions, 2)
the adequacy of the trained biometric models to explain the input data, and
3) resampling techniques on the set of test scores. This research line was
further developed by Poh and Bengio [37], who devised confidence measures
based on the margin between impostor and client score distributions.
Chatzis et al. [38] evaluated a number of fusion schemes based on clustering strategies. In this case, quality measures obtained directly from the input biometric signals were used to fuzzify the scores provided by the different systems. They demonstrated that fuzzy versions of k-means and Vector Quantization incorporating the quality measures slightly outperformed the standard non-fuzzy clustering methods, although not in all cases. This work is, to the best of our knowledge, the first one reporting results of quality-based fusion. One limitation of its experimental setup was the reduced number of individuals used: only 37.
Another work on quality-based fusion, without the success of the previous methods, was reported by Toh et al. [39], who developed a score fusion scheme based on polynomial functions. Quality measures were introduced into the optimization problem for training the polynomials as weights in the regularization term. Unexpectedly, no performance improvements were obtained by including the quality measures. One limitation of this work was the use of a chimeric multimodal database combining the data from three different face, voice, and fingerprint databases.
2.2.1. Quality-based MCS: combination approach
One straightforward way to incorporate the input biometric quality into score fusion is by including weights in simple combination approaches. In the case of the weighted average presented in Part 1 Eq. (10), this can be achieved by using $w_j = q_j$ in order to obtain the following quality-based score fusion function:

$$y = \sum_{j=1}^{M} q_j x_j, \qquad (13)$$

where $q_j$ is a quality measure of the score $x_j$. This score quality should ideally be related to the confidence of system $j$ in providing a reliable matching score for the particular biometric signal being tested [40, 41]. The score quality proposed and used in [10] is as follows:

$$q = \sqrt{Q \cdot Q_{\mathrm{claim}}}, \qquad (14)$$

where $Q$ and $Q_{\mathrm{claim}}$ are the input biometric quality and the average quality of the biometric signals used for enrollment, respectively. The two quality measures $Q$ and $Q_{\mathrm{claim}}$ are supposed to be in the range $[0, 1]$, where 0 corresponds to the poorest quality and 1 corresponds to the highest quality.

Other definitions of score quality found in the literature include [34]: $q = (Q + Q_{\mathrm{claim}})/2$, $q = \min\{Q, Q_{\mathrm{claim}}\}$, etc.
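A minimal sketch of the quality-weighted fusion of Eqs. (13) and (14); the score and quality values below are arbitrary illustrations:

```python
import numpy as np

def score_quality(Q, Q_claim):
    """Score quality, Eq. (14): geometric mean of input and enrollment quality."""
    return np.sqrt(Q * Q_claim)

def quality_weighted_fusion(x, q):
    """Quality-based weighted score fusion, Eq. (13)."""
    return float(np.dot(q, x))

x = np.array([0.9, 0.4])         # normalized scores from M = 2 unimodal systems
Q = np.array([0.9, 0.2])         # input signal qualities, in [0, 1]
Q_claim = np.array([0.8, 0.3])   # average enrollment qualities, in [0, 1]

q = score_quality(Q, Q_claim)
y = quality_weighted_fusion(x, q)
# The high-quality first modality dominates the fused score, as intended.
```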
Preliminaries. The nomenclature and conventions summarized in Fig. 1 are extended here:

$x_{ij}$ : Similarity score $i$ delivered by system $j$
$v_{ij}$ : Variance of $x_{ij}$ as estimated by system $j$
$z_i$ : The true label corresponding to score $i$
$\zeta_{ij}$ : The error score $\zeta_{ij} = z_i - x_{ij}$

With respect to the previous cases developed in this paper, note that here we introduce the variance $v_{ij}$ of the input scores $x_{ij}$. The true labels $z_i$ can take only two numerical values corresponding to "Impostor" and "Client". If $x_{ij}$ is between 0 and 1, then these values are chosen to be 0 and 1, respectively. The fusion function is trained on shots $i \in 1, \ldots, N$ (i.e. $x_{ij}$ and $z_i$ are known for $i \in 1, \ldots, N$), and we consider trial $N+1$ as a test shot on the working multimodal system (i.e. $x_{(N+1)j}$ is known, but $z_{N+1}$ is not).
Statistical Model. The model for combining the different systems (here also called machine experts) is based on Bayesian statistics and the assumption of normally distributed expert errors, i.e. $\zeta_{ij}$ is considered to be a sample of a normally distributed random variable. It has been shown experimentally [35] that this assumption does not strictly hold for common audio- and video-based biometric machine experts, but it holds reasonably well when client and impostor distributions are considered separately. Taking this result into account, two different fusion functions are constructed, one of them based on genuine scores

$$C = \{x_{ij}, v_{ij} \mid 1 \leq i \leq N \text{ and } z_i = 1, \; 1 \leq j \leq M\}, \qquad (15)$$

while the other is based on impostor scores

$$I = \{x_{ij}, v_{ij} \mid 1 \leq i \leq N \text{ and } z_i = 0, \; 1 \leq j \leq M\}. \qquad (16)$$

The two fusion functions will be referred to as the client function and the impostor function, respectively.

The client function estimates the expected true label of an input claim based on its expertise in recognizing client data. More formally, it computes $M''_C = E[Z_{N+1} \mid C, x_{N+1,j}]$. Similarly, the impostor function computes $M''_I = E[Z_{N+1} \mid I, x_{N+1,j}]$. The conciliated overall score $M''$ takes into account the different expertise of the two fusion functions and chooses the one which came closest to its goal, i.e. 0 for the impostor function and 1 for the client function:

$$M'' = \begin{cases} M''_C & \text{if } |1 - M''_C| - |0 - M''_I| < 0 \\ M''_I & \text{otherwise.} \end{cases} \qquad (17)$$

Based on the normality assumption of the errors, the fusion training and testing algorithm described in [35] is obtained; see [42] for further background and details. In the following paragraphs we summarize the resulting algorithm in the two cases where it can be applied.
Bayesian simplified quality-based score fusion. When only the similarity scores $x_{ij}$ are available, the following simplified fusion function is obtained by using $v_{ij} = 1$:

Training. Estimate the bias parameters of each system. The bias parameters for the client function are

$$M_{Cj} = \frac{1}{n_C} \sum_i \zeta_{ij} \quad \text{and} \quad V_{Cj} = \frac{\alpha_{Cj}}{n_C}, \qquad (18)$$

where $i$ indexes the training set $C$, $n_C$ is the number of training samples in $C$, and

$$\alpha_{Cj} = \frac{1}{n_C - 3} \left[ \sum_i \zeta_{ij}^2 - \frac{1}{n_C} \left( \sum_i \zeta_{ij} \right)^2 \right]. \qquad (19)$$

Similarly $M_{Ij}$ and $V_{Ij}$ are obtained for the impostor function.

Authentication. At this step, both fusion functions are operational, so that the time instant is $N+1$ and the fusion functions have access to the similarity scores $x_{N+1,j}$ but not to the true label $z_{N+1}$. First the client and impostor functions are calibrated according to their past performance, yielding (for the client function)

$$M'_{Cj} = x_{N+1,j} + M_{Cj} \quad \text{and} \quad V'_{Cj} = (n_C + 1) V_{Cj}, \qquad (20)$$

and then the different calibrated systems are combined according to

$$M''_C = \frac{\sum_{j=1}^{M} M'_{Cj} / V'_{Cj}}{\sum_{j=1}^{M} 1 / V'_{Cj}}. \qquad (21)$$

Similarly, $M'_{Ij}$, $V'_{Ij}$ and $M''_I$ are obtained. The final fused output is obtained according to Eq. (17).

The algorithm described above has been successfully applied in [43] in a multimodal authentication system combining face and speech data. Verification performance improvements of almost an order of magnitude were reported as compared to the best modality.
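The simplified algorithm can be sketched as follows; the synthetic scores, class means, and number of training samples are illustrative assumptions:

```python
import numpy as np

def train_bias(scores, labels):
    """Bias parameters for one class set (client or impostor), Eqs. (18)-(19)."""
    zeta = labels[:, None] - scores              # error scores, zeta_ij = z_i - x_ij
    n = len(scores)
    M_j = zeta.mean(axis=0)                      # Eq. (18), left
    alpha_j = (np.sum(zeta ** 2, axis=0)
               - np.sum(zeta, axis=0) ** 2 / n) / (n - 3)   # Eq. (19)
    return M_j, alpha_j / n, n                   # (M_j, V_j, n)

def conciliate(x_new, client_params, impostor_params):
    """Calibration (Eq. 20), combination (Eq. 21), and selection (Eq. 17)."""
    def combine(M_j, V_j, n):
        Mp = x_new + M_j                         # M'_j, calibrated score
        Vp = (n + 1) * V_j                       # V'_j, calibrated variance
        return np.sum(Mp / Vp) / np.sum(1 / Vp)  # inverse-variance weighting
    MC = combine(*client_params)
    MI = combine(*impostor_params)
    return MC if abs(1 - MC) - abs(0 - MI) < 0 else MI   # Eq. (17)

rng = np.random.default_rng(2)
xs_c = rng.normal(0.75, 0.08, (100, 2))          # synthetic client scores, M = 2
xs_i = rng.normal(0.25, 0.08, (100, 2))          # synthetic impostor scores
client = train_bias(xs_c, np.ones(100))
impostor = train_bias(xs_i, np.zeros(100))

y_genuine = conciliate(np.array([0.8, 0.7]), client, impostor)
y_impostor = conciliate(np.array([0.2, 0.3]), client, impostor)
```

For a genuine-looking input, the client function lands near its goal of 1 and is selected; for an impostor-looking input, the impostor function lands near 0 and wins instead.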
Bayesian quality-based score fusion. When not only the scores but also the score variances are available, the following algorithm is obtained:

Training. Estimate the bias parameters. For the client function,

$$M_{Cj} = \frac{\sum_i \zeta_{ij} / \sigma_{ij}^2}{\sum_i 1 / \sigma_{ij}^2} \quad \text{and} \quad V_{Cj} = \frac{1}{\sum_i 1 / \sigma_{ij}^2}, \qquad (22)$$

where the training set $C$ is used. The variances $\sigma_{ij}^2$ are estimated through $\bar{\sigma}_{ij}^2 = v_{ij} \cdot \alpha_{Cj}$, where

$$\alpha_{Cj} = \frac{1}{n_C - 3} \left[ \sum_i \frac{\zeta_{ij}^2}{v_{ij}} - \left( \sum_i \frac{\zeta_{ij}}{v_{ij}} \right)^2 \left( \sum_i \frac{1}{v_{ij}} \right)^{-1} \right]. \qquad (23)$$

Similarly $M_{Ij}$ and $V_{Ij}$ are obtained for the impostor function.

Authentication. First we calibrate the systems according to their past performance; for the client function,

$$M'_{Cj} = x_{N+1,j} + M_{Cj} \quad \text{and} \quad V'_{Cj} = v_{N+1,j} \, \alpha_{Cj} + V_{Cj}, \qquad (24)$$

and then the different calibrated systems are combined according to Eq. (21). Similarly, $M'_{Ij}$, $V'_{Ij}$ and $M''_I$ are obtained. The final fused score is obtained according to Eq. (17). This combined output can be expressed in the form of Eq. (11) from Part 1.

The algorithm described above has been successfully applied not only in biometrics, where it originated [44], but also in other unrelated fields like risk assessment of aircraft accidents [42].
The variance $v_{ij}$ of the score $x_{ij}$ concerns a particular authentication assessment. It is not a general reliability measure for the system itself, but a certainty measure based on the performance of the system and the data being assessed. Typically the variance of the score is chosen as the width of the range in which one can place the score when considering human opinions. Because such intervals can be conveniently provided by a human expert, the algorithm presented here constitutes a systematic way of combining human and machine expertise in MCS applications. An example of such an application is forensic reporting using biometric evidence, where machine expert approaches are increasingly being used [45] and human opinions must be taken into consideration.
The context-based MCS approach summarized here calculates $v_{ij}$ as a function of quality measures computed on the input biometric signals (see Fig. 4). This implies, considering the right-hand expression in Eq. (24), that the trained fusion function adapts the weights of the experts using the input signal quality. For that purpose the quality $q_{ij}$ of the score $x_{ij}$ is defined as:
$$q_{ij} = \sqrt{Q_{ij} \cdot Q_{\mathrm{claim},j}}, \qquad (25)$$

where $Q_{ij}$ and $Q_{\mathrm{claim},j}$ are the quality label of biometric trait $j$ in trial $i$ and the average quality of the biometric signals used by system $j$ for modelling the claimed identity, respectively. The two quality labels $Q_{ij}$ and $Q_{\mathrm{claim},j}$ are supposed to be in the range $[0, Q_{\max}]$ with $Q_{\max} > 1$, where 0 corresponds to the poorest quality, 1 corresponds to normal quality, and $Q_{\max}$ corresponds to the highest quality. Finally, the variance parameter is calculated according to

$$v_{ij} = \frac{1}{q_{ij}^2}. \qquad (26)$$

Experimental evaluation of this quality-based fusion approach can be found in [44, 42].
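The mapping from quality labels to score variances in Eqs. (25) and (26) is a one-liner; the sketch below (with assumed quality labels and $Q_{\max} = 2$) shows how a higher-quality acquisition yields a smaller variance, and hence a larger weight for that expert in Eq. (21):

```python
import numpy as np

Q_max = 2.0                              # maximum quality label (> 1), assumed

def score_variance(Q_ij, Q_claim_j):
    """Score quality (Eq. 25) and the derived score variance (Eq. 26)."""
    q_ij = np.sqrt(Q_ij * Q_claim_j)     # geometric mean of the two labels
    return 1.0 / q_ij ** 2

v_good = score_variance(1.8, 1.5)        # high-quality acquisition, near Q_max
v_poor = score_variance(0.3, 0.9)        # low-quality acquisition
# A smaller variance gives the expert a larger weight via Eqs. (24) and (21),
# so low-quality modalities are automatically down-weighted at fusion time.
```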
2.2.2. Quality-based MCS: classification approach
Instead of assuming particular statistical models for the genuine and impostor score distributions as in the previous section, here we exemplify a quality-based score fusion approach based on any binary classifier. Without loss of generality, we sketch the approach considering SVM classifiers [10].

Let $\mathbf{q} = [q_1, \ldots, q_M]^T$ denote the quality vector of the multimodal similarity score $\mathbf{x} = [x_1, \ldots, x_M]^T$, where $q_j$ is a scalar quality measure corresponding to the similarity score $x_j$, with $j = 1, \ldots, M$ and $M$ being the number of modalities. As in the case of the Bayesian quality-based fusion algorithm, the quality values $q_j$ are computed as follows:

$$q_j = \sqrt{Q_j \cdot Q_{\mathrm{claim},j}}, \qquad (27)$$

where $Q_j$ and $Q_{\mathrm{claim},j}$ are the quality measure of the sensed signal for biometric trait $j$, and the average signal quality of the biometric signals used by unimodal system $j$ for modelling the claimed identity, respectively. The two quality labels $Q_j$ and $Q_{\mathrm{claim},j}$ are supposed to be in the range $[0, Q_{\max}]$ with $Q_{\max} > 1$, where 0 corresponds to the poorest quality, 1 corresponds to standard quality, and $Q_{\max}$ corresponds to the highest quality.

The score-level fusion scheme based on SVM classifiers and quality measures proposed in [10] is as follows:
Training. An initial fusion function:
16
fSVM :RMR, fSVM(xT) = hw,Φ(xT)i+w0(28)
is trained by solving the problem:
min
w,w01,...,ξN1
2kwk2+
N
P
i=1
Ciξi(29)
subject to
yi(hw,Φ(xi)iH+w0)1ξi, i = 1, . . . , N, (30)
ξi0, i = 1, . . . , N, (31)
as described in Sect. 2.1.2, but using as cost weights
Ci=C QM
j=1 qi,j
QM
max !α1
,(32)
where qi,j ,j= 1, . . . , M are the components of the quality vector qias-
sociated with training sample (xi, zi), zi∈ {−1,1}={Impostor,Client},
and Cis a positive constant. As a result, the higher the overall qual-
ity of a multimodal training score the higher its contribution to the
computation of the initial fusion function. Additionally, MSVMs of
dimension M1 (SVM1to SVMM) are trained leaving out traits 1 to
Mrespectively. Similarly to Eq. (32)
Ci=C Qr6=jqi,r
Q(M1)
max !α1
,(33)
for SVMjwith j= 1, . . . , M.
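A sketch of this quality-weighted training stage under illustrative assumptions (synthetic scores and qualities, $\alpha_1 = 1$, linear kernel): scikit-learn's per-sample weights in `SVC.fit` play the role of the costs $C_i$ of Eqs. (32)-(33), up to the global constant $C$.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
M, N, Q_MAX, ALPHA1 = 2, 200, 2.0, 1.0  # modalities, trials, quality range, exponent

# Synthetic multimodal score vectors: impostor (-1) vs. client (+1) trials.
X = np.vstack([rng.normal(0.3, 0.1, (N // 2, M)),   # impostor scores
               rng.normal(0.7, 0.1, (N // 2, M))])  # client scores
y = np.repeat([-1, 1], N // 2)
q = rng.uniform(0.5, Q_MAX, (N, M))                 # per-trait quality vectors

# Eq. (32): the cost of each training sample grows with its overall quality.
C_i = (np.prod(q, axis=1) / Q_MAX**M) ** ALPHA1

# Full fusion function f_SVM, plus M reduced SVMs, each leaving one trait out,
# with the leave-one-out costs of Eq. (33).
svm_full = SVC(kernel="linear").fit(X, y, sample_weight=C_i)
svm_reduced = []
for j in range(M):
    keep = [r for r in range(M) if r != j]
    C_ij = (np.prod(q[:, keep], axis=1) / Q_MAX**(M - 1)) ** ALPHA1
    svm_reduced.append(SVC(kernel="linear").fit(X[:, keep], y, sample_weight=C_ij))
```

Low-quality training trials therefore receive small slack penalties and influence the learned decision boundary less than high-quality ones.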
Authentication. Let the sensed multimodal biometric sample generate a quality vector $\mathbf{q}_T = [q_{T,1}, \ldots, q_{T,M}]^{\mathsf{T}}$. Re-index the individual traits in order to have $q_{T,1} \leq q_{T,2} \leq \ldots \leq q_{T,M}$. A multimodal similarity score $\mathbf{x}_T = [x_{T,1}, \ldots, x_{T,M}]^{\mathsf{T}}$ is then generated. The combined quality-based similarity score is computed as follows:
Figure 5: System model of multimodal biometric authentication with user-dependent and quality-based score fusion.
$$f_{\mathrm{SVMQ}}(\mathbf{x}_T) = \beta_1 \sum_{j=1}^{M-1} \frac{\beta_j}{\sum_{r=1}^{M-1} \beta_r} \, f_{\mathrm{SVM}_j}\!\left(\mathbf{x}_T^{(j)}\right) + (1 - \beta_1) \, f_{\mathrm{SVM}}(\mathbf{x}_T), \quad (34)$$

where $\mathbf{x}_T^{(j)} = [x_{T,1}, \ldots, x_{T,j-1}, x_{T,j+1}, \ldots, x_{T,M}]^{\mathsf{T}}$ and

$$\beta_j = \left( \frac{q_{T,M} - q_{T,j}}{Q_{\max}} \right)^{\alpha_2}, \quad j = 1, \ldots, M-1. \quad (35)$$

As a result, the adapted fusion function in Eq. (34) is a quality-based trade-off between not using and using low-quality traits.
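This authentication-time combination can be sketched as follows, assuming (as in Eq. (35)) that the weight $\beta_j$ grows with the quality gap between trait $j$ and the best trait; $Q_{\max}$, $\alpha_2$, and the toy fusion functions used in the test below are illustrative:

```python
import numpy as np

Q_MAX, ALPHA2 = 2.0, 1.0  # illustrative quality range and exponent

def fuse_quality_based(x_T, q_T, f_full, f_reduced):
    """Quality-based combination of the full and reduced fusion functions.

    x_T, q_T  : score and quality vectors (length M), re-indexed so that
                q_T is sorted in non-decreasing order.
    f_full    : fusion function trained on all M traits.
    f_reduced : f_reduced[j] was trained leaving out trait j.
    """
    x_T, q_T = np.asarray(x_T, float), np.asarray(q_T, float)
    M = len(x_T)
    # beta_j: large when trait j is much worse than the best trait,
    # favouring the reduced SVMs that drop it.
    beta = ((q_T[M - 1] - q_T[:M - 1]) / Q_MAX) ** ALPHA2
    if beta.sum() == 0.0:  # all traits equally good: rely on the full SVM
        return float(f_full(x_T))
    s = sum(beta[j] / beta.sum() * f_reduced[j](np.delete(x_T, j))
            for j in range(M - 1))
    return float(beta[0] * s + (1.0 - beta[0]) * f_full(x_T))
```

With $M = 2$ and a low-quality first trait, the fused score leans towards the reduced SVM that ignores that trait; with equal qualities it falls back to the full SVM.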
2.3. User-dependent and quality-based multimodal biometrics
Finally, we may combine previous strategies to derive fusion systems
adapted both to the user at hand and to the input biometric quality, as
shown in Fig. 5.
Practical implementations of this scheme can be obtained by combining
some of the procedures described previously in the present paper. One pos-
sibility is to use Bayesian user-dependent score fusion plus discriminative
quality-based adaptation.
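One minimal illustration of such a combined scheme (the weights, relevance factor, and update rules below are hypothetical simplifications, not the exact algorithms of the cited works): per-user linear fusion weights adapted from global ones in a Bayesian fashion, then modulated by signal quality at test time.

```python
import numpy as np

def user_adapted_weights(w_global, w_user, n_user, r=10.0):
    """Trade-off between pool-of-users weights and user-specific weights
    estimated from n_user enrolment scores (r is a hypothetical relevance
    factor; more user data shifts weight to the user-specific estimate)."""
    a = n_user / (n_user + r)
    return a * np.asarray(w_user, float) + (1 - a) * np.asarray(w_global, float)

def fuse(x, w, q, q_max=2.0):
    """Weighted-sum score fusion with the user-adapted weights modulated
    by per-modality quality and re-normalized (an illustrative quality
    adaptation, not the discriminative scheme of Sect. 2.2.2)."""
    wq = np.asarray(w, float) * (np.asarray(q, float) / q_max)
    return float(np.dot(wq / wq.sum(), np.asarray(x, float)))
```

A user with few enrolment samples is thus handled mostly with global weights, while at authentication time low-quality modalities are down-weighted before the weighted sum.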
3. Challenges in biometrics: Role of MCS
In the present section, similarly as in the excellent exposition by Jain et
al. [2], we discuss main challenges in biometrics, adapting their discussion
based on our personal view, and commenting how new MCS developments
may play a role in overcoming those challenges.
Note that biometric person recognition shares architectures, methods, issues, and challenges with almost any other pattern recognition application. Therefore, the challenges exposed here have a parallel in other research areas, and may shed some light on the future of other pattern recognition applications as well.
Challenge 0: Better understanding of the nature of biometrics (distinctiveness and permanence). Current knowledge about the nature of the variety of biometric modalities useful for person recognition is quite limited [2]. Although practical systems based on fingerprint or face recognition can satisfy certain applications, a better understanding of factors like their intrinsic distinctive capacity [46, 47] or their permanence [48, 49] will open the way to new and improved recognition methods, and will rationalize the application of such technologies depending on the application scenario and the potential population of use [50].
There have been some advances in these areas, but much work is still necessary to fully understand the nature of biometrics for person authentication. Towards this objective, MCS approaches can be instrumental for analyzing the increasing amount of multimodal biometric data available nowadays [51, 52]. MCS methods can be quite helpful for analyzing those data, as they make it possible to simultaneously analyze and model complex yet structured relations in heterogeneous data [53], which is the case in biometrics, e.g., the different representation levels existing in fingerprint [54, 55] or speech [56, 57].
Challenge 1: Design of robust algorithms (representation and matching) from uncooperative users in unconstrained and varying scenarios. This challenge has been the main focus of research in biometrics during the last 50 years [2], and still the performance achieved by many biometric applications in realistic scenarios is not yet satisfactory. There is a myriad of pattern representation schemes and matching procedures depending on the biometric modality (e.g., face images vs. speech time-sequences) and acquisition scenario (e.g., controlled vs. latent fingerprints), and one can find in the vast and growing literature representation and matching methods specifically adjusted for
many practical applications. Most of these approaches are variants of suc-
cessful representation and matching techniques coming from other research
areas like image and signal processing, speech analysis, or computer vision,
e.g., LBP or SIFT features [58].
As developed in Part 1 in our review of MCS applied to multimodal biometrics, combining several such representation-matching schemes provides significant benefits, not only when one has multiple pieces of evidence to combine [59], but also when one has a single piece of biometric evidence and wants to be robust against degraded or varying conditions by combining various representation schemes [60]. The success of such MCS schemes is related to the
diversity of classifiers being combined, a topic attracting much attention in
the MCS community [61, 8].
The MCS strategies in the previous paragraph assume that various classifiers are available to be combined, but one can also generate multiple base classifiers, e.g., the highly successful AdaBoost approach in the Viola-Jones cascade MCS [62]. These MCS approaches are especially useful when the patterns to be recognized are difficult to represent, or vary over time due to their intrinsic nature or to environmental changes. An adaptive generation of multiple base classifiers, together with adaptive fusion schemes like AdaBoost, may track and adapt well under those unconstrained and varying conditions. This topic of adaptive pattern recognition is also a source of interesting research in MCS under multiple names like concept drift [63, 64]. Advances in adaptive MCS can be instrumental for the future of this Challenge 1. In addition to such adaptive schemes, a better understanding of such unconstrained scenarios through benchmarks and public databases is also of utmost importance [65, 66].
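As a toy illustration of generating (rather than being given) the base classifiers, scikit-learn's `AdaBoostClassifier` can stand in for the boosting schemes discussed above; the dataset below is synthetic and merely plays the role of genuine/impostor patterns:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier

# Synthetic two-class problem standing in for genuine vs. impostor patterns.
X, y = make_classification(n_samples=400, n_features=10, random_state=0)

# AdaBoost builds its own pool of weak base classifiers (decision stumps
# by default), re-weighting the training samples at each round to focus
# on the patterns the current ensemble still misclassifies.
ensemble = AdaBoostClassifier(n_estimators=50, random_state=0)
ensemble.fit(X, y)
```

The adaptive sample re-weighting is what lets such schemes keep tracking difficult or drifting patterns as new base classifiers are added.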
On the other hand, in the last 5 years or so we have witnessed the triumph of data-agnostic (i.e., without any explicit representation) end-to-end machine learning approaches such as deep neural networks that, given enough
representative training data, can generate very robust classifiers for many
problems in unconstrained scenarios with highly varying conditions, e.g.,
face [67] or speaker recognition [68].
MCS methods exploiting deep learning [69], and new deep learning strategies that consider both existing classifiers (a common case in biometric applications) and contextual information [70], are also very promising lines for advancing in this area.
Challenge 2: Integration with end applications. Most traditional and widely
deployed biometric solutions for person recognition are designed for access
control or forensic scenarios. One important challenge in biometrics is how to properly integrate biometric technologies into other application scenarios
like mobile authentication [71, 72], video surveillance [3], forensics [73], large-
scale ID [74], cloud biometrics or ubiquitous biometrics [75].
Depending on the scenario at hand, traditional biometric technologies will need to be adapted, or perhaps redesigned, in order to satisfy new application requirements. In this case, adaptive MCS techniques incorporating context information, like the ones described here in Section 2, can be
quite useful.
Challenge 3: Understanding and improving the usability. As mentioned in
Challenge 2, the number and variety of biometric applications for person
recognition is ever growing, and some of them are strongly dependent on
an adequate interaction between the user and the biometric sensor, e.g., in
mobile authentication [71].
We currently lack a good understanding of how people naturally interact with some biometric sensors, and of the conditions under which the authentication
mechanisms generated with biometric technology perform best. There has
been some research in the past to analyze those factors between the user
and the biometric sensor in general [76], including specific models to analyze
and exploit the interaction between the user and the biometric sensor [77].
More recently, we can see some targeted studies towards understanding the
interaction between users and technology for key biometric end applications
like border control [78], or smartphone unlock [79].
Similar to Challenge 0, MCS approaches can be exploited here as a tool for analyzing multiple sources of heterogeneous, complex yet structured data [53], as is the case of human-biometric sensor interaction data [77].
Challenge 4: Understanding and improving the security. Pattern recognition applications based on biometrics are usually intended for securing information or controlling access to services or places [2]. Note that this is not the only possible usage, as biometric technologies may also be used to analyze personal data towards other objectives, like behaviour analysis [80] or medical
diagnosis [81].
When biometrics are used for security applications, one may want to know
the level of security provided by the application at hand, given a set of oper-
ational conditions. This question has been already addressed in the general
information security community, where various international standards have
been generated under the umbrella of Common Criteria (ISO/IEC 15408)
since 1990 [82]. That standardization effort includes some specific develop-
ments for biometric systems [83]. The basic idea behind those standards is to
measure quantitatively the effort required for potential attackers to bypass
the protection provided by biometrics, and the impact of such attacks.
These ideas have generated much research in biometrics towards under-
standing possible attacks [84], and the generation of protection methods
against attacks [85]. When MCS approaches are applied to biometrics, spe-
cific vulnerabilities appear [86], and protection methods can be generated by
exploiting specific MCS fusion strategies [87].
The topic of security against attackers seeking illicit access is related to
the privacy protection of users, and in particular their biometric templates.
Securing such templates against potential identity theft has also generated
much research activity in the last decade [88]. There are some recent developments in this area exploiting advances in cryptography, like homomorphic encryption [89], but there are still no generally satisfactory solutions for generating secure biometric templates that are at the same time 1) non-invertible, 2) non-linkable, and 3) highly discriminative [2]. Current trends for better
protecting templates containing multiple biometric data are usually based on
advanced cryptographic constructions and the principles of MCS described
in Part 1 [90].
4. Conclusions
The present paper is the Part 2 in a series of two papers. In Part 1
we first provided a brief introduction to Multiple Classifier Systems (MCS)
including basic nomenclature, architecture, and key elements [1]. Our main
focus there was on the fundamentals of MCS, providing pointers to detailed
descriptions of MCS algorithms.
Part 1 then overviewed the application of MCS to the particular field of
multimodal biometric person authentication in the last 25 years [2], including
general descriptions of main MCS elements, methods, and algorithms gener-
ated in the biometrics field. The presentation there was general with a generic
mathematical formulation, in order to facilitate the export of experiences and
methods to other information fusion problems, e.g.: video surveillance [3],
speech technologies [4], biomedical applications [91], human-computer in-
teraction [5], data analytics [6], behavioural modelling [7], or recommender
systems [8].
Part 1 was intended for the non-expert in MCS, or any other reader interested in an overview of the field of multimodal biometrics. Here in Part 2 we provide more advanced material intended for researchers already knowledgeable in MCS and multimodal biometrics, readers who completed Part 1, and any other researcher seeking ideas and prospects about the future of biometrics that may also apply to other pattern recognition areas.
We began this Part 2 by describing in technical detail recent trends and
developments in MCS from multimodal biometrics that incorporate context
information in an adaptive way, using the framework and mathematical tools
introduced in Part 1. These new MCS architectures exploit input qual-
ity measures [10] and pattern-specific particularities that move apart from
general population statistics [11], resulting in robust multimodal biometric
systems.
As in Part 1, the methods here in Part 2 were introduced in a general way so that they can be applied to other information fusion problems as well. In related works such as [9], one can find an excellent treatment of general context-based information fusion, including guidance on how to apply methods and algorithms like the ones developed here to other information fusion architectures.
Finally, we have discussed open challenges in biometrics in which MCS
may play a key role: 0) limited knowledge about the nature of biometrics (in
terms of distinctiveness and permanence for different populations), 1) design
of robust algorithms (representation and matching) from uncooperative users
in unconstrained and varying scenarios, 2) integration with end applications,
3) understanding and improving the usability, and 4) understanding and
improving the security.
5. Acknowledgements
This work was funded by projects CogniMetrics (TEC2015-70627-R) from
MINECO/FEDER, RiskTrack (JUST-2015-JCOO-AG-1), and DeepBio (TIN2017-
85727-C4-3-P). Part of this work was conducted during a research visit of J.F.
to Prof. Ludmila Kuncheva at Bangor University (UK) with STSM funding
from COST CA16101 (MULTI-FORESEE). Author J.F. wants to thank Prof.
Kuncheva for fruitful discussions during his visit.
References
[1] L. I. Kuncheva, Combining Pattern Classifiers: Methods and Algo-
rithms, Wiley, 2014.
[2] A. K. Jain, K. Nandakumar, A. Ross, 50 years of biometric research,
Pattern Recogn. Lett. 79 (2016) 80–105.
[3] A. Garcia-Martin, J. M. Martinez, People detection in surveillance: clas-
sification and evaluation, IET Computer Vision 9 (2015) 779–788(9).
[4] I. Lopez-Moreno, J. Gonzalez-Dominguez, D. Martinez, O. Plchot,
J. Gonzalez-Rodriguez, P. J. Moreno, On the use of deep feedforward
neural networks for automatic language identification, Computer Speech
and Language 40 (2016) 46 – 59.
[5] D. Rozado, T. Moreno, J. S. Agustin, F. B. Rodriguez, P. Varona,
Controlling a smartphone using gaze gestures as the input mechanism,
Human-Computer Interaction 30 (1) (2015) 34–63.
[6] G. Bello-Orgaz, J. J. Jung, D. Camacho, Social big data: Recent achieve-
ments and new challenges, Information Fusion 28 (2016) 45 – 59.
[7] V. Rodriguez-Fernandez, A. Gonzalez-Pardo, D. Camacho, Modelling
behaviour in UAV operations using higher order double chain Markov
models, IEEE Computational Intelligence Magazine 12 (4) (2017) 28–37.
[8] P. Castells, N. J. Hurley, S. Vargas, Recommender Systems Handbook,
Springer US, 2015, Ch. Novelty and Diversity in Recommender Systems,
pp. 881–918.
[9] L. Snidaro, J. Garca, J. Llinas, Context-based information fusion: A
survey and discussion, Information Fusion 25 (2015) 16 – 31.
[10] J. Fierrez-Aguilar, J. Ortega-Garcia, J. Gonzalez-Rodriguez, J. Bigun,
Discriminative multimodal biometric authentication based on quality
measures, Pattern Recognition 38 (5) (2005) 777–779.
[11] J. Fierrez-Aguilar, D. Garcia-Romero, J. Ortega-Garcia, J. Gonzalez-
Rodriguez, Adapted user-dependent multimodal biometric authentica-
tion exploiting general information, Pattern Recognition Letters 26 (16)
(2005) 2628–2639.
[12] A. K. Jain, A. Ross, Learning user-specific parameters in a multibio-
metric system, in: Proc. of IEEE Intl. Conf. on Image Processing, ICIP,
Vol. 1, 2002, pp. 57–60.
[13] Y. Wang, Y. Wang, T. Tan, Combining fingerprint and voice biometrics
for identity verification: An experimental comparison, in: D. Zhang,
A. K. Jain (Eds.), Proc. of Intl. Conf. on Biometric Authentication,
ICBA, Springer LNCS-3072, 2004, pp. 663–670.
[14] A. Kumar, D. Zhang, Integrating palmprint with face for user authen-
tication, in: Proc. of Workshop on Multimodal User Authentication,
MMUA, 2003, pp. 107–112.
[15] R. Snelick, U. Uludag, A. Mink, M. Indovina, A. K. Jain, Large scale
evaluation of multimodal biometric authentication using state-of-the-art
systems, IEEE Transactions on Pattern Analysis and Machine Intelli-
gence 27 (3) (2005) 450–455.
[16] N. Poh, S. Bengio, An investigation of f-ratio client-dependent normal-
isation on biometric authentication tasks, in: Proc. of the IEEE Intl.
Conf. on Acoustics, Speech and Signal Processing, ICASSP, Vol. 1, 2005,
pp. 721–724.
[17] K. A. Toh, X. Jiang, W. Y. Yau, Exploiting local and global decisions for
multimodal biometrics verification, IEEE Trans. on Signal Processing 52
(2004) 3059–3072.
[18] R. Brunelli, D. Falavigna, Person identification using multiple cues,
IEEE Trans. on Pattern Anal. and Machine Intell. 17 (10) (1995) 955–
966.
[19] J. Kittler, M. Hatef, R. Duin, J. Matas, On combining classifiers, IEEE
Trans. on Pattern Anal. and Machine Intell. 20 (3) (1998) 226–239.
[20] L. Hong, A. K. Jain, Integrating faces and fingerprints for personal iden-
tification, IEEE Trans. on Pattern Anal. and Machine Intell. 20 (12)
(1998) 1295–1307.
[21] S. Ben-Yacoub, Y. Abdeljaoued, E. Mayoraz, Fusion of face and speech
data for person identity verification, IEEE Trans. on Neural Networks
10 (5) (1999) 1065–1074.
[22] P. Verlinde, G. Chollet, M. Acheroy, Multi-modal identity verification
using expert fusion, Information Fusion 1 (1) (2000) 17–33.
[23] C. H. Lee, Q. Huo, On adaptive decision rules and decision parameter
adaptation for automatic speech recognition, Proceedings of the IEEE
88 (8) (2000) 1241–1269.
[24] J. Fierrez-Aguilar, D. Garcia-Romero, J. Ortega-Garcia, J. Gonzalez-
Rodriguez, Bayesian adaptation for user-dependent multimodal biomet-
ric authentication, Pattern Recognition 38 (8) (2005) 1317–1319.
[25] N. Poh, J. Kittler, T. Bourlai, Quality-based score normalization with
device qualitative information for multimodal biometric fusion, IEEE
Transactions on Systems, Man, and Cybernetics - Part A: Systems and
Humans 40 (3) (2010) 539–554.
[26] R. O. Duda, P. E. Hart, D. G. Stork, Pattern Classification, Wiley, 2001.
[27] D. A. Reynolds, T. F. Quatieri, R. B. Dunn, Speaker verification using
adapted Gaussian Mixture Models, Digital Signal Processing 10 (2000)
19–41.
[28] V. N. Vapnik, The Nature of Statistical Learning Theory, Springer, 2000.
[29] S. Theodoridis, K. Koutroumbas, Pattern Recognition, Academic Press,
2003.
[30] A. Navia-Vazquez, F. Perez-Cruz, A. Artes-Rodriguez, A. R. Figueiras-
Vidal, Weighted least squares training of support vector classifiers lead-
ing to compact and adaptive schemes, IEEE Trans. on Neural Networks
12 (5) (2001) 1047–1059.
[31] J. C. Junqua, G. V. Noord (Eds.), Robustness in Language and Speech
Technology, Kluwer Academic Publishers, 2001.
[32] D. Simon-Zorita, J. Ortega-Garcia, J. Fierrez-Aguilar, J. Gonzalez-
Rodriguez, Image quality and position variability assessment in
minutiae-based fingerprint verification, IEE Proceedings Vision, Image
and Signal Processing 150 (6) (2003) 402–408.
[33] C. Wilson, et al., FpVTE2003: Fingerprint Vendor Technology Evaluation 2003, NIST Research Report NISTIR 7123 (http://fpvte.nist.gov/)
(June 2004).
[34] F. Alonso-Fernandez, J. Fierrez, J. Ortega-Garcia, Quality measures
in biometric systems, IEEE Security and Privacy 10 (9) (2012) 52–62.
doi:10.1109/MSP.2011.178.
[35] E. S. Bigun, J. Bigun, B. Duc, S. Fischer, Expert conciliation for
multi modal person authentication systems by Bayesian statistics, in:
J. Bigun, G. Chollet, G. Borgefors (Eds.), Proc. of IAPR Intl. Conf.
on Audio- and Video-based Person Authentication, AVBPA, Springer
LNCS-1206, 1997, pp. 291–300.
[36] S. Bengio, C. Marcel, S. Marcel, J. Mariethoz, Confidence measures for
multimodal identity verification, Information Fusion 3 (4) (2002) 267–
276.
[37] N. Poh, S. Bengio, Improving fusion with margin-derived confidence in
biometric authentication tasks, in: Proc. of Intl. Conf. on Audio- and
Video-Based Biometric Person Authentication, AVBPA, Vol. Springer
LNCS-3546, 2005, pp. 474–483.
[38] V. Chatzis, A. G. Bors, I. Pitas, Multimodal decision-level fusion for
person authentication, IEEE Trans. on System, Man, and Cybernetics,
part A 29 (6) (1999) 674–680.
[39] K. A. Toh, W. Y. Yau, E. Lim, L. Chen, C. H. Ng, Fusion of auxiliary
information for multi-modal biometrics authentication, in: D. Zhang,
A. K. Jain (Eds.), Proc. of Intl. Conf. on Biometric Authentication,
ICBA, Springer LNCS-3072, 2004, pp. 678–685.
[40] P. Grother, E. Tabassi, Performance of biometric quality measures,
IEEE Transactions on Pattern Analysis and Machine Intelligence 29 (4)
(2007) 531–543.
[41] F. Alonso-Fernandez, J. Fierrez, D. Ramos, J. Gonzalez-Rodriguez,
Quality-based conditional processing in multi-biometrics: application
to sensor interoperability, IEEE Transactions on Systems, Man and Cybernetics Part A 40 (6) (2010) 1168–1179.
[42] E. S. Bigun, Risk analysis of catastrophes using experts’ judgments:
An empirical study on risk analysis of major civil aircraft accidents in
Europe, European J. Operational Research 87 (1995) 599–612.
[43] J. Bigun, B. Duc, S. Fischer, A. Makarov, F. Smeraldi, Multi modal per-
son authentication, in: H. Wechsler, et al. (Eds.), NATO-ASI Advanced
Study on Face Recognition, Vol. F-163, Springer, 1997, pp. 26–50.
[44] J. Bigun, J. Fierrez-Aguilar, J. Ortega-Garcia, J. Gonzalez-Rodriguez,
Multimodal biometric authentication using quality signals in mobile
communications, in: Proc. of Intl. Conf. on Image Analysis and Pro-
cessing, ICIAP, IEEE CS Press, 2003, pp. 2–13.
[45] J. Gonzalez-Rodriguez, J. Fierrez-Aguilar, D. Ramos-Castro, J. Ortega-
Garcia, Bayesian analysis of fingerprint, face and signature evidences
with automatic biometric systems, Forensic Science International 155 (2-
3) (2005) 126–140.
[46] J. Daugman, Information theory and the iriscode, IEEE Transactions
on Information Forensics and Security 11 (2) (2016) 400–409.
[47] S. Gong, V. N. Boddeti, A. K. Jain, On the capacity of face representa-
tion, CoRR abs/1709.10433 (2017) 1–9.
URL http://arxiv.org/abs/1709.10433
[48] J. Galbally, M. Martinez-Diaz, J. Fierrez, Aging in biometrics: An
experimental analysis on on-line signature, PLOS ONE 8 (7) (2013)
e69897.
[49] S. Yoon, A. K. Jain, Longitudinal study of fingerprint recognition, Pro-
ceedings of the National Academy of Sciences 112 (28) (2015) 8555–8560.
[50] N. Yager, T. Dunstone, The biometric menagerie, IEEE Trans. Pattern
Anal. Mach. Intell. 32 (2) (2010) 220–230.
[51] J. Ortega-Garcia, J. Fierrez, F. Alonso-Fernandez, J. Galbally, M. Freire, J. Gonzalez-Rodriguez, C. Garcia-Mateo, J.-L. Alba-Castro, E. Gonzalez-Agulla, E. Otero-Muras, S. Garcia-Salicetti, L. Allano, B. Ly-Van, B. Dorizzi, J. Kittler, T. Bourlai, N. Poh, F. Deravi, M. Ng, M. Fairhurst, J. Hennebert, A. Humm, M. Tistarelli, L. Brodo, J. Richiardi, A. Drygajlo, H. Ganster, F. M. Sukno, S.-K. Pavani, A. Frangi, L. Akarun, A. Savran, The multi-scenario multi-environment BioSecure multimodal database (BMDB), IEEE Trans. on Pattern Analysis and Machine Intelligence 32 (6) (2010) 1097–1111.
[52] B. Rios-Sanchez, M. F. Arriaga-Gomez, J. Guerra-Casanova,
D. de Santos-Sierra, I. de Mendizabal-Vazquez, G. Bailador, C. Sanchez-
Avila, gb2sumod: A multimodal biometric video database using visible
and IR light, Information Fusion 32 (2016) 64–79.
[53] L. Sorber, M. V. Barel, L. D. Lathauwer, Structured data fusion, IEEE
Journal of Selected Topics in Signal Processing 9 (4) (2015) 586–600.
[54] H. Fronthaler, K. Kollreider, J. Bigun, J. Fierrez, F. Alonso-Fernandez,
J. Ortega-Garcia, J. Gonzalez-Rodriguez, Fingerprint image quality es-
timation and its application to multi-algorithm verification, IEEE Trans.
on Information Forensics and Security 3 (2) (2008) 331–338.
[55] M. Vatsa, R. Singh, A. Noore, Unification of evidence-theoretic fusion
algorithms: A case study in level-2 and level-3 fingerprint features, IEEE
Transactions on Systems, Man, and Cybernetics - Part A: Systems and
Humans 39 (1) (2009) 47–56.
[56] J. Fierrez-Aguilar, D. Garcia-Romero, J. Ortega-Garcia, J. Gonzalez-
Rodriguez, Speaker verification using adapted user-dependent multilevel
fusion, in: Proc. 6th IAPR Intl. Workshop on Multiple Classifier Sys-
tems, MCS, Vol. 3541 of LNCS, Springer, 2005, pp. 356–365.
[57] H. Quene, Multilevel modeling of between-speaker and within-speaker
variation in spontaneous speech tempo, The Journal of the Acoustical
Society of America 123 (2) (2008) 1104–1113.
[58] E. Gonzalez-Sosa, R. Vera-Rodriguez, J. Fierrez, J. Ortega-Garcia, Exploring facial regions in unconstrained scenarios: Experience on ICB-RW,
IEEE Intelligent Systems (2018) 1–3.
[59] N. Poh, T. Bourlai, J. Kittler, L. Allano, F. Alonso-Fernandez, O. Am-
bekar, J. Baker, B. Dorizzi, O. Fatukasi, J. Fierrez, H. Ganster,
J. Ortega-Garcia, D. Maurer, A. A. Salah, T. Scheidat, C. Vielhauer,
Benchmarking quality-dependent and cost-sensitive score-level multi-
modal biometric fusion algorithms, IEEE Trans on Information Foren-
sics and Security 4 (4) (2009) 849–866.
[60] J. Fierrez-Aguilar, Y. Chen, J. Ortega-Garcia, A. Jain, Incorporating
image quality in multi-algorithm fingerprint verification, in: D. Zhang,
A. K. Jain (Eds.), Proc. of IAPR Intl. Conf. on Biometrics, ICB,
Springer LNCS-3832, 2006, pp. 213–220.
[61] L. I. Kuncheva, C. J. Whitaker, Measures of diversity in classifier ensem-
bles and their relationship with the ensemble accuracy, Machine Learn-
ing 51 (2) (2003) 181–207.
[62] P. Viola, M. J. Jones, Robust real-time face detection, Int. J. Comput.
Vision 57 (2) (2004) 137–154.
[63] R. Elwell, R. Polikar, Incremental learning of concept drift in nonsta-
tionary environments, IEEE Transactions on Neural Networks 22 (10)
(2011) 1517–1531.
[64] L. I. Kuncheva, Classifier ensembles for changing environments, in: 5th
International Workshop on Multiple Classifier Systems, MCS 04, Vol.
3077 of Lecture Notes in Computer Science, Springer-Verlag, 2004, pp.
1–15.
[65] J. Neves, J. C. Moreno, H. Proenca, QUIS-CAMPI: An annotated multi-biometrics data feed from surveillance scenarios, IET Biometrics (2018)
1–20.
[66] E. Gonzalez-Sosa, J. Fierrez, R. Vera-Rodriguez, F. Alonso-Fernandez,
Facial soft biometrics for recognition in the wild: Recent works, annotation and COTS evaluation, IEEE Trans. on Information Forensics and
Security (2018) 1–12.
[67] O. M. Parkhi, A. Vedaldi, A. Zisserman, Deep face recognition, in:
British Machine Vision Conference, 2015.
[68] E. Variani, X. Lei, E. McDermott, I. L. Moreno, J. Gonzalez-Dominguez,
Deep neural networks for small footprint text-dependent speaker veri-
fication, in: 2014 IEEE International Conference on Acoustics, Speech
and Signal Processing (ICASSP), 2014, pp. 4052–4056.
[69] J. Ngiam, A. Khosla, M. Kim, J. Nam, H. Lee, A. Y. Ng, Multimodal
deep learning, in: ICML, 2011, pp. 689–696.
[70] P. S. Aleksic, M. Ghodsi, A. H. Michaely, C. Allauzen, K. B. Hall,
B. Roark, D. Rybach, P. J. Moreno, Bringing contextual information to
google speech recognition, in: INTERSPEECH 2015, 16th Annual Con-
ference of the International Speech Communication Association, Dres-
den, Germany, September 6-10, 2015, 2015, pp. 468–472.
[71] V. M. Patel, R. Chellappa, D. Chandra, B. Barbello, Continuous user
authentication on mobile devices: Recent progress and remaining chal-
lenges, IEEE Signal Processing Magazine 33 (4) (2016) 49–61.
[72] J. Fierrez, A. Pozo, M. Martinez-Diaz, J. Galbally, A. Morales, Bench-
marking swipe biometrics for mobile authentication, IEEE Trans. on
Information Forensics and Security (2018) 1–12.
[73] M. Tistarelli, C. Champod (Eds.), Handbook of Biometrics for Forensic
Science, Springer, 2017.
[74] D. Wang, C. Otto, A. K. Jain, Face search at scale, IEEE Transactions
on Pattern Analysis and Machine Intelligence 39 (6) (2017) 1122–1136.
[75] R. He, B. Lovell, R. Chellappa, A. Jain, Z. Sun, Editorial: Special issue
on ubiquitous biometrics, Pattern Recognition 66 (2017) 1–3.
[76] R. Blanco-Gonzalo, R. Sanchez-Reillo, J. Liu-Jimenez, C. Sanchez-
Redondo, How to assess user interaction effects in biometric perfor-
mance, in: 2017 IEEE International Conference on Identity, Security
and Behavior Analysis (ISBA), 2017.
[77] M. Brockly, S. Elliott, R. Guest, R. Blanco-Gonzalo, Encyclopedia of
Biometrics, Springer, 2015, Ch. Human-Biometric Sensor Interaction,
pp. 887–893.
[78] J. J. Robertson, R. M. Guest, S. J. Elliott, K. O’Connor, A framework
for biometric and interaction performance assessment of automated bor-
der control processes, IEEE Transactions on Human-Machine Systems
47 (6) (2017) 983–993.
[79] M. Harbach, A. De Luca, S. Egelman, The anatomy of smartphone
unlocking: A field study of android lock screens, in: Proceedings of the
ACM Conference on Human Factors in Computing Systems, CHI, 2016,
pp. 4806–4817.
[80] P. Tzirakis, G. Trigeorgis, M. A. Nicolaou, B. W. Schuller, S. Zafeiriou,
End-to-end multimodal emotion recognition using deep neural networks,
CoRR abs/1704.08619 (2017) 1–9.
URL http://arxiv.org/abs/1704.08619
[81] J. Garre-Olmo, M. Faundez-Zanuy, K. Lopez-de Ipina, L. Calvo-Perxas,
O. Turro-Garriga, Kinematic and pressure features of handwriting and
drawing: Preliminary results between patients with mild cognitive impairment, Alzheimer disease and healthy controls, Current Alzheimer
Research 14 (9) (2017) 960–968.
[82] D. Mellado, E. Fernandez-Medina, M. Piattini, A common criteria based
security requirements engineering process for the development of secure
information systems, Computer Standards and Interfaces 29 (2) (2007)
244 – 253.
[83] A. Merle, J. Bringer, J. Fierrez, N. Tekampe, Beat: A methodology for
common criteria evaluations of biometrics systems, in: Intl. Common
Criteria Conf., London, UK, 2015.
[84] A. Hadid, N. Evans, S. Marcel, J. Fierrez, Biometrics systems under
spoofing attack: An evaluation methodology and lessons learned, IEEE
Signal Processing Magazine 32 (5) (2015) 20–30.
[85] J. Galbally, S. Marcel, J. Fierrez, Image quality assessment for fake
biometric detection: Application to iris, fingerprint and face recognition,
IEEE Trans. on Image Processing 23 (2) (2014) 710–724.
[86] M. Gomez-Barrero, J. Galbally, J. Fierrez, Efficient software attack to
multimodal biometric systems and its application to face and iris fusion,
Pattern Recognition Letters 36 (2014) 243–253.
[87] B. Biggio, G. Fumera, G. L. Marcialis, F. Roli, Statistical meta-analysis
of presentation attacks for secure multibiometric systems, IEEE Trans-
actions on Pattern Analysis and Machine Intelligence 39 (3) (2017) 561–
575.
[88] K. Nandakumar, A. K. Jain, Biometric template protection: Bridging
the performance gap between theory and practice, IEEE Signal
Processing Magazine 32 (5) (2015) 88–100.
[89] M. Gomez-Barrero, J. Galbally, A. Morales, J. Fierrez,
Privacy-preserving comparison of variable-length data with application
to biometric template protection, IEEE Access 5 (2017) 8606–8619.
[90] M. Gomez-Barrero, E. Maiorana, J. Galbally, P. Campisi, J. Fierrez,
Multi-biometric template protection based on homomorphic encryption,
Pattern Recognition 67 (2017) 149–163.
[91] L. Nanni, C. Salvatore, A. Cerasa, I. Castiglioni, Combining multiple
approaches for the early diagnosis of Alzheimer's disease, Pattern
Recognition Letters 84 (2016) 259–266.