An Approach to Iris Contact Lens Detection
based on Deep Image Representations
Pedro Silva, Eduardo Luz, Rafael Baeta, David Menotti
Computing Department
Federal University of Ouro Preto - UFOP
Ouro Preto, MG, Brazil, 35400-000
Email: {pedroh21.silva,eduluz,rafael.baeta,menottid}@gmail.com
Helio Pedrini, Alexandre Xavier Falcão
Institute of Computing
University of Campinas - UNICAMP
Campinas, SP, Brazil, 13083-852
Email: {helio,afalcao}@ic.unicamp.br
Abstract—Spoofing detection, i.e., differentiating illegitimate
users from genuine ones, is a challenging task in biometric
systems. Although iris scans are far more inclusive than fingerprints,
and also more precise for person authentication, iris recognition
systems are vulnerable to spoofing via textured cosmetic contact
lenses. Iris spoofing detection is also referred to as liveness
detection (binary classification of fake and real images). In this
work, we focus on a three-class detection problem: images with
textured (colored) contact lenses, soft contact lenses, and no
lenses. Our approach uses a convolutional network to build a deep
image representation and an additional fully-connected single
layer with softmax regression for classification. Experiments are
conducted in comparison with a state-of-the-art approach (SOTA)
on two public iris image databases for contact lens detection: 2013
Notre Dame and IIIT-Delhi. Our approach achieves a 30%
reduction in classification error over SOTA on the former database
(raising accuracy from 80% to 86%) and comparable results on the
latter. Since IIIT-Delhi does not provide segmented iris images
and, unlike SOTA, our approach does not yet segment the iris,
we consider these very promising results.
Keywords-Iris Biometrics; Contact Lens Detection; Deep
Learning; Convolutional Networks.
I. INTRODUCTION
Biometric-based person identification systems have developed
rapidly over the last two decades. In particular, biometric
systems based on iris recognition have been deployed in
several applications, such as border-crossing control systems,
controlled environments, access to personal computers and
smartphones. The iris is considered the most promising, reliable,
and accurate biometric trait, providing rich texture that allows
high discrimination among subjects. Furthermore, the iris remains
stable as individuals age [1].
The first functional iris recognition method was introduced
by Daugman in 1993 [1], whereas the first patent proposing
iris texture as biometric modality appeared in 1987 [2].
Thenceforward, several iris recognition approaches have been
proposed in the literature [3]–[5].
Due to the increasing use of iris as a source of biometric
information in the last decade, attacks on these systems have
become more common [6]–[8]. These attacks are usually referred
to in the literature as iris spoofing, and several works dealing
with this problem have been
proposed [9]–[11]. Nonetheless, the definition of iris spoofing
detection may be confusing, where liveness and counterfeit
detection terms are used with different meanings and, in some
cases, interchangeably [12]. Several works in the literature
have addressed the problem of classifying an iris image as
real/live or as fake, in which a fake image is not a live one
(e.g., a printed image [6], [10], [13]). In addition, counterfeit
detection approaches have also been proposed in the past
years [14]–[19], in which counterfeit iris with printed color
contact lenses are considered fake images and iris images with
soft/clear or no lenses are considered real images.
Given that cosmetic contact lenses are becoming more
popular, the attacks with textured contact lenses that an iris
biometric system may suffer vary widely. For instance, a person
who has been banned from a country or geographical region and
placed on a watch list may try to re-enter that region by wearing
contact lenses that obfuscate his/her iris texture, thereby avoiding
identification. Similarly, an individual may want to impersonate
someone else by wearing textured contact lenses that mimic the
iris of an enrolled person [9]. Moreover, transparent or prescription
contact lenses worn during iris image acquisition have been shown
to degrade iris recognition performance by increasing the false
rejection rate [15], [19], demonstrating that it is important to
identify when soft/clear lenses are present.
Furthermore, the accuracy of textured contact lens detection
methods may be affected by the contact lens pattern and also
by the sensor manufacturer, as shown in [18].
In this context, we introduce the use of deep learning
techniques [20]. In the last few years, deep learning has
produced outstanding results in several important visual
analysis tasks, such as face recognition [21]–
[24], pedestrian detection [25], character recognition [26],
[27], traffic sign classification [28], general object recognition
in large categorized databases [29], among others. Besides the
success in these areas, the use of deep representations for
spoofing detection on iris, face, and fingerprint images has
also been recently proposed [10], in which a simpler two-
class problem of detecting fake/spoof and real/live images is
addressed.
The present paper addresses a more complex three-class
image detection problem, where iris images may appear with
textured (colored) contact lenses, soft contact lenses, and no
lenses. We propose a convolutional network to build deep
image representations, followed by a fully-connected single
layer with softmax regression for image classification. Our
approach is based on the work of Krizhevsky et al. [29], in
which the weights of all layers are learned by backpropagation.
In [30], the authors present two image databases to evaluate
methods on the three-class detection problem: the 2013 Notre
Dame Contact Lens Detection database (NDCL) and the IIIT-
Delhi Contact Lens Iris database (IIIT-D). Each database
contains images from two different sensors: LG4000 and
AD100 in the NDCL database, where images come with
iris location, and Cogent and Vista in the IIIT-D database,
where iris location is not available (i.e., more challenging).
We compare our approach with the state-of-the-art algorithm
(SOTA), also proposed in [30], by taking into account images
from each sensor and from different sensors.
The paper is organized as follows. In Section II, we present
a brief review of relevant works directly related to contact
lens spoofing detection. In Section III, the databases used in
our experiments are described. The methodology proposed
to cope with spoofing detection is detailed in Section IV.
Experimental results are described and discussed in Section V.
Finally, conclusions and directions for future work are outlined
in Section VI.
II. RELATED WORK
In this section, we review relevant works directly related
to the three-class iris image problem addressed in this paper,
that is, those that propose to classify iris images into (color)
textured contact lens, soft (prescription or clear) contact lens,
and no lens classes.
The first step of a recognition system is to capture the iris
images. Due to the difficulty of identifying iris texture in
color images, sensors have to operate under near-infrared
(NIR) illumination. However, cosmetic contact lenses can
change the apparent pattern of the iris, and their presence can
be very difficult to detect in images taken under NIR illumination.
This undesirable property works against iris recognition systems:
it makes spoofing attacks with textured lenses easier and also
increases false non-match rates even for prescription
lenses [15], [17], [31].
Lee et al. [32] propose a new method for detecting fake
iris based on the Purkinje image. To acquire the data, a
conventional USB camera is used with a modified CCD sensor
and special illumination. For the experiments, a dataset is
built with 300 live irises and 15 fake ones. The authors report
a false accept rate (FAR) of 0.33% and a false reject rate (FRR)
of 0.33% on this dataset; however, a more robust evaluation,
on a larger and more diverse dataset, would be needed to
properly validate the method.
Wei et al. [14] present three methods for detecting textured
lenses: measurement of iris edge sharpness, application of iris-
texton for characterizing the visual primitives of iris textures,
and use of selected features based on the co-occurrence matrix.
For the experiments, two datasets are built using the CASIA [33]
and BATH [34] datasets for live irises, whereas the fake irises
were collected by the authors. The reported correct classification
rate (CCR) is up to 100% for experiments using co-occurrence
matrix features.
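To make the co-occurrence idea concrete, the following minimal sketch computes a small gray-level co-occurrence matrix (GLCM) descriptor of the kind such methods rely on; it assumes scikit-image, and the distances, angles, and properties are illustrative choices, not the settings of [14]:

```python
# Illustrative GLCM texture descriptor in the spirit of [14];
# distances, angles, and properties are assumptions, not the authors' settings.
import numpy as np
from skimage.feature import graycomatrix, graycoprops

def glcm_features(iris_gray):
    """Small co-occurrence feature vector from an 8-bit grayscale iris image."""
    glcm = graycomatrix(iris_gray, distances=[1, 2], angles=[0, np.pi / 2],
                        levels=256, symmetric=True, normed=True)
    props = ("contrast", "homogeneity", "energy", "correlation")
    return np.hstack([graycoprops(glcm, p).ravel() for p in props])
```

Such a vector would then be fed to a standard classifier to separate textured from genuine iris texture.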
In [35], a method based on Local Binary Pattern (LBP)
encoding and Adaboost learning together with Gaussian kernel
density estimation achieves FAR of 0.67% and FRR of 2.64%
on discriminating fake iris texture from live iris. The method is
evaluated on CASIA-Iris-V3 [36] and ICE v1.0 [37] with the
addition of 600 custom fake iris images, covering 20 different
types of textured contact lenses.
In [16], a contact lens detection algorithm is proposed
based on the Scale-Invariant Feature Transform (SIFT), weighted
LBP, and Support Vector Machines (SVM). According to
the authors, combining SIFT and LBP improves invariance
to scale, illumination, and local affine distortion.
The authors claim that their method achieves state-of-the-art
performance in contact lens detection. They build a custom
dataset of 5000 fake iris images with 70 different types of
textured lenses.
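An LBP-based pipeline in the spirit of [16], [35] can be sketched as follows; the LBP parameters and the SVM classifier are assumptions for illustration, not the authors' configurations:

```python
# Minimal LBP-histogram-plus-SVM sketch, loosely following [16], [35];
# parameter values are illustrative assumptions.
import numpy as np
from skimage.feature import local_binary_pattern
from sklearn.svm import SVC

def lbp_histogram(iris_gray, points=8, radius=1):
    codes = local_binary_pattern(iris_gray, P=points, R=radius, method="uniform")
    hist, _ = np.histogram(codes, bins=points + 2, range=(0, points + 2), density=True)
    return hist

# Usage (train_images and labels are hypothetical placeholders):
# clf = SVC(kernel="rbf").fit([lbp_histogram(x) for x in train_images], labels)
```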
After Daugman [38] presented a method for allowing easy
detection of contact lens patterns, many other authors have
reported accuracy rates over 98% [14], [16], [35]. However,
since contact lens technology is under constant development,
robust detection has become more difficult [9]. Compounding
this, some studies in the literature are favored by their
methodology, as they use datasets containing contact lenses
from a single manufacturer in both training and test data [9],
[14]. According to [18], in more realistic scenarios, methods
whose accuracy is close to 100% can drop to below 60%.
To avoid this situation, two datasets are built in [18]
with textured contact lenses from three major manufacturers.
Multiple colors are selected for each manufacturer and some
lenses are also designed to correct astigmatism. The authors
show that textured lens detection accuracy can drop dramatically
when the method is tested on a lens manufacturer not seen in
the training data and when the iris sensor differs between
training and test data. An extension of this work is presented in [30],
where the datasets are well described and made available upon
request. Additionally, state-of-the-art results are reported by a
modified LBP feature extraction method and compared with 17
different classifiers. The databases are tested with techniques
available in the literature, such as textural features based on
the co-occurrence matrix and the weighted LBP approach, as
well as other techniques based on LBP and SVM. Finally, the authors
suggest that the development of a fully general approach to
textured lens detection is a problem that still requires attention.
In a recent work [19], a new contact lens detection method
based on binarized statistical image features reports near-perfect
accuracies on the NDCL database. However, in that work, the
authors deal with a two-class classification problem, that is,
soft/clear lens iris images are considered the same class as
no-lens iris images.
Fig. 1. Samples of images in the 2013 Notre Dame Contact Lens
Detection (NDCL) database. In the first and second columns, we show
images acquired with AD100 and LG4000 sensors, respectively. The first,
second and third rows present samples with textured/cosmetic contact lenses,
soft/clear/prescript contact lenses, and no contact lenses, respectively.
III. DATABASES
In this section, we describe the databases used in our
experiments. Both are publicly available upon request and
were specifically developed for the evaluation of contact lens
iris detection in a three-class setting [30]. We summarize the
main characteristics of each database in Table I and present
additional details in the following subsections. Note that all
images in these databases are grayscale with 640 × 480 pixels.
A. Notre Dame Contact Lens Database
The 2013 Notre Dame Contact Lens Detection (NDCLD’13
or simply NDCL) database consists of 5100 images [39]. All
640 × 480 pixel images of this database were acquired under
near-IR illumination using two types of cameras, LG4000
and IrisGuard AD100. This database is divided into two
subsets: LG4000 with 3000 images for training and 1200 for
verification; AD100 with 600 images for training and 300
for verification. These subsets are indeed used as primary
databases for intra-camera evaluation.
The entire database, i.e., the fusion of images acquired by
the LG4000 & AD100 cameras, is proposed as a multi-camera
training set of 3600 images and a verification (testing) set
of 1500 images. The images are equally divided into three
classes: (1) wearing cosmetic contact lenses, (2) wearing clear
soft contact lenses, and (3) wearing no contact lenses. Fig. 1
illustrates some samples of the NDCL and its cameras and
classes.
Fig. 2. Samples of images in the IIIT-Delhi Contact Lens Iris (IIIT-D) database. In the
first and second columns, we show images acquired with Cogent and Vista
sensors, respectively. The first, second and third rows present samples with
textured/cosmetic contact lenses, soft/clear/prescript contact lenses, and no
contact lenses, respectively.
All images in the database are annotated with the following
information: the ID of the subject to whom the image belongs,
the eye (left or right), the subject's gender and race, the type
of contact lenses used, and the coordinates of the pupil and
iris. These coordinates allow us to perform experiments
considering perfect iris segmentation. More specific details
on this database can be found in [39, Section II.B].
B. IIIT-D Contact Lens Iris Database
The Indraprastha Institute of Information Technology (IIIT)-
Delhi Contact Lens Iris (IIIT-D CLI or simply IIIT) database
contains 6583 iris images of 101 subjects. For each individual:
(1) both left and right eyes were captured, generating 202 iris
classes (distinct irises); (2) images were captured without lenses
and with soft and textured lenses, the three classes considered
here; (3) the textured lens images were captured with variations
in iris sensors and lenses (colors and manufacturers). Images
in this database are illustrated in Fig. 2.
The iris sensors used are the Cogent dual and VistaFA2E single.
Although this database offers a large variation of textured
contact lenses, the iris location information is not provided;
we therefore only conducted experiments on this database using
the entire eye image, and perfect iris segmentation or annotation
is planned as future work. More specific details on this database
can be found in [39, Section II.A].
TABLE I
MAIN FEATURES OF THE DATASETS CONSIDERED HEREIN AND INTRODUCED IN [30]. EACH CELL GIVES THE NUMBER OF TEXTURED / SOFT / NO-LENS IMAGES AND THE TOTAL.

Database  Sensor               Training               Testing/Verification   Full
NDCL      IrisGuard AD100      200/200/200/600        100/100/100/300        300/300/300/900
NDCL      LG4000 iris camera   1000/1000/1000/3000    400/400/400/1200       1400/1400/1400/4200
NDCL      Multi-camera         1200/1200/1200/3600    500/500/500/1500       1700/1700/1700/5100
IIIT      Cogent Scanner       589/569/563/1721       613/574/600/1787       1202/1143/1163/3508
IIIT      Vista Scanner        535/500/500/1535       530/510/500/1540       1065/1010/1000/3075
IIIT      Multi-scanner        1124/1069/1063/3256    1143/1084/1100/3327    2267/2153/2163/6583
IV. DEEP REPRESENTATIONS
In this section, we present the proposed method for iris
contact lens detection based on deep image representations.
Initially, we briefly describe the structure of the deep learning
techniques used to build deep representations for the problem,
which combine a convolutional network [40], for deep image
representation, with a fully-connected three-layered network [41]
for classification. Then, we detail the methodology for choosing
the network topology and learning its parameters, using domain
knowledge from the literature. The activation operation used here
is the rectified linear unit (ReLU) [29], which has been shown to
be essential for learning deep representations. Based on gain control
mechanisms found in cortical neurons [42], the normalization
operation promotes competition among filter outputs such
that high and isolated responses are further emphasized [10].
Spatial pooling is a foundational operation in convolutional
networks [40] that aims at bringing translational invariance to
the features by aggregating activations from the same filter
in a given region. The order of these last two operations,
i.e., normalization and pooling, in a convolutional layer is an
open problem and is application dependent. As we expect to
achieve higher discrimination power with deep representations,
the convolutional network stacks several layers for final image
representation. All these operations and layers demand the
determination of several parameters. Instead of performing
random search on the hyperparameter space [24], [43] or even
applying a specific search algorithm [44], we preferred
to empirically analyze one set of parameters at a time to build
the final network structure (topology), and to learn the filter
weights by backpropagation. The idea of learning the network
architecture by using random weights [10], [24], [43], [44]
certainly deserves more attention, and we leave this
approach for future work. The idea here is to first evaluate
how far one can go with domain knowledge from previous
works on object classification [29], on the CIFAR-10 database
(http://www.cs.toronto.edu/~kriz/cifar.html), and spoofing [10],
to establish a preliminary network topology and explore its
parameters according to our perception of the problem.
These steps are explained in Section IV-A and
employed in Section V.
The final layer of the convolutional network outputs a
deep image representation. For classification, we use a fully-
connected three-layered network [41]. We discard the use
of unshared local layers, since the literature [10] has shown
that they are inappropriate for problems in which the object
structure is irrelevant. The output layer of this network contains
only three neurons (one for each class), and classification is
performed by softmax regression. The weights of each layer in
both networks are learned by the well-known backpropagation
algorithm.
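For reference, given the deep representation \(\mathbf{x}\) produced by the convolutional network, softmax regression with one weight vector \(\mathbf{w}_k\) and bias \(b_k\) per class assigns

\[ P(y = k \mid \mathbf{x}) = \frac{\exp(\mathbf{w}_k^{\top}\mathbf{x} + b_k)}{\sum_{j=1}^{3} \exp(\mathbf{w}_j^{\top}\mathbf{x} + b_j)}, \qquad k \in \{1, 2, 3\}, \]

and backpropagation minimizes the corresponding cross-entropy loss through both networks.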
The framework described above is implemented in the
CUDA-convnet library, written in C++/CUDA by Krizhevsky
(https://code.google.com/p/cuda-convnet/). It is important to
highlight that such networks are a longstanding approach, but
they have recently enabled significant advances in computer
vision and pattern recognition, due to the availability of more
data and processing power, as well as a better understanding
of the learning process [21], [29], [44].
A. Methodology
The development of a network architecture for the three-class
detection problem, involving textured, soft, and no contact lens
images, starts from Spoofnet, a network specially developed to
address the two-class detection problem of fake and live
images [10]. From this network, we determine the range of
parameter values to evaluate and understand their influence on
the performance of the contact lens detection method. These
parameters fall into four groups: (i) the training methodology;
(ii) the network architecture; (iii) the input image size; and
(iv) the database annotation. These groups are described in
more detail next.
Training methodology: We follow the training methodology
established in [29] and described in the CUDA-convnet wiki
(https://code.google.com/p/cuda-convnet/wiki/Methodology).
An initial learning rate (LR) must be chosen. It is set to 10^-3
in [29] and to 10^-4 in [10]. We analyze both values.
Given an initial number of epochs, we perform the following
steps to train a network: (1) train 100% of the epochs on
three out of four batches of the training data, using the fourth
one as a validation set; (2) train 40% more epochs on all
four batches with the same learning rate; (3) train 10% more
epochs on all training batches, decreasing the LR by a factor
of 10; (4) finally, train 10% more epochs on all training
batches, decreasing the LR again by a factor of 10. In [29],
this initial number is set to 100, whereas in [10], it is set to 200.
The authors in [10] argue that this parameter is both data and
problem dependent. Thus, here, besides evaluating 100 and
200 for the initial number of epochs, we evaluate higher
numbers as long as overfitting does not occur. After those
steps, we compute the accuracy of the trained network using
the verification data.
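The staged schedule above can be written down compactly; the Python sketch below merely enumerates the four stages (each tuple standing for one CUDA-convnet training run), with the initial number of epochs and the LR as parameters:

```python
# Sketch of the four-stage training schedule described in the text.
def schedule(initial_epochs=100, lr=1e-4):
    """Yield (epochs, learning_rate, data) tuples for the four stages."""
    yield initial_epochs, lr, "batches 1-3 (batch 4 held out for validation)"
    yield int(0.4 * initial_epochs), lr, "all four batches"
    yield int(0.1 * initial_epochs), lr / 10, "all four batches"
    yield int(0.1 * initial_epochs), lr / 100, "all four batches"

for epochs, rate, data in schedule(initial_epochs=300):
    print(f"train {epochs} epochs at LR={rate:g} on {data}")
```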
Network architecture: Once the training methodology
parameters are defined, we focus on the network topology.
Although the specification of a network architecture admits
many layer and operation details (see
https://code.google.com/p/cuda-convnet/wiki/LayerParams),
here we evaluate: the number of convolutional layers, {1, 2, 3},
that is, networks with one, two, or three convolutional layers;
whether or not the normalization operation is used on top of
each layer; and the number of filters in each layer, with
combinations of {16, 32, 64} filters evaluated for one, two,
and three layers. The number of fully-connected layers is
fixed at a single layer (with one output neuron per class) in
order to reduce the number of possibilities to be evaluated.
The window sizes of the convolutional, pooling, and
normalization operations are kept identical to those of Spoofnet.
Input image dimension: After finding the best network
architecture, we investigate the influence of the input image
size. We evaluate different image sizes, i.e., 64 × 64, 128 × 128,
and 256 × 256 pixels, given that for sizes smaller than 64 × 64
the contact lens details are not visible, whereas for sizes larger
than 512 × 512, oversampling is performed and memory issues
arise. To obtain images with the proposed dimensions, we
resize them.
A very important aspect that also affects the input image
size is data augmentation, which is strongly recommended to
reduce overfitting. In Krizhevsky's framework [29], given an
input image, it is possible to define a window size such that
five image patches are cropped out of the original image.
We define the border, in pixels, to be cropped out of the
image. For instance, for a 64 × 64 input image with a 4-pixel
border, we take the 56 × 56 window at the center of the image,
and we also slide this central window by 4 pixels horizontally
and vertically to obtain crops at the four corners of the
original image. We also apply reflections to each of the five
crops, so that this procedure turns each original image into
10 training images, as illustrated in the sketch below. Here,
we evaluate crop border values of {2, 4, 6, 8} for 64 × 64,
{4, 8, 12, 16} for 128 × 128, and {8, 16, 24, 32} for 256 × 256
image sizes. Note that the crop border values are proportional
to the image size.
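The following NumPy sketch makes the cropping procedure explicit for a 64 × 64 image with a 4-pixel border; it illustrates the scheme described above and is not code from Krizhevsky's framework:

```python
# Five-crop-plus-reflection augmentation: one center and four corner crops,
# each also mirrored, yielding 10 training patches per original image.
import numpy as np

def five_crops_with_flips(img, border=4):
    h, w = img.shape
    ch, cw = h - 2 * border, w - 2 * border          # e.g., 56x56 for 64x64, border 4
    offsets = [(border, border),                     # center
               (0, 0), (0, w - cw),                  # top-left, top-right
               (h - ch, 0), (h - ch, w - cw)]        # bottom-left, bottom-right
    crops = [img[y:y + ch, x:x + cw] for (y, x) in offsets]
    return crops + [np.fliplr(c) for c in crops]     # 5 crops + 5 reflections

patches = five_crops_with_flips(np.zeros((64, 64), dtype=np.uint8))  # 10 patches of 56x56
```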
Database annotation: As previously mentioned in Section III,
the NDCL database (with images from the AD100 and LG4000
sensors) comes with annotations for the pupil and iris
locations, i.e., the x and y coordinates and the radius, allowing
a perfect iris segmentation or, in our case, only a perfect iris
location, since we use squared region crops. For these datasets,
through these annotations, we use the iris image region plus
a percentage of background and evaluate the following values:
0% (none), 10%, 20%, 30%, and 40%, in order to assess the
importance of background addition. A sketch of this cropping
is given below.
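A minimal sketch of this annotated crop follows; it assumes the background percentage enlarges the squared crop radius (our reading of the text), with (cx, cy, r) denoting the annotated iris center and radius:

```python
# Hypothetical squared iris crop with background margin; the convention that
# `bg` scales the crop half-width is an assumption, not the paper's exact rule.
def crop_iris(img, cx, cy, r, bg=0.1):
    half = int(round(r * (1.0 + bg)))               # iris radius plus bg fraction
    y0, y1 = max(cy - half, 0), min(cy + half, img.shape[0])
    x0, x1 = max(cx - half, 0), min(cx + half, img.shape[1])
    return img[y0:y1, x0:x1]                        # clipped at image borders
```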
Fig. 3. Spoofnet - initial network topology used here. Source: [10].
V. EXPERIMENTS AND RESULTS
In this section, we present the experiments performed in
this work. We start by evaluating the groups of parameters
established in the previous section to study their behavior
and to obtain a well-performing network topology for contact
lens detection, called CLDnet (see Fig. 4). Then, we compare the
effectiveness of our proposed approach with state-of-the-art
results in different scenarios.
A. Parameter Evaluation
As established in Section IV-A, we first evaluate the
parameters in order to analyze their influence on the
effectiveness of the proposed method and to design a robust
network topology. These experiments were conducted separately
on the NDCL database only, namely on the AD100 and LG4000
sensors, since the iris location is available.
As the initial network topology, we consider the one used
in [10], i.e., Spoofnet. Its configuration is illustrated in Fig. 3.
We also consider an image size of 128 × 128 and a crop border
of 8 pixels (so the actual network inputs are 112 × 112 pixels),
the values used in Spoofnet. Furthermore, 10% of background
addition was selected before cropping to generate the initial
input images. The 10% value was decided by visually inspecting
the images and verifying that this amount suffices to include
the contact lens borders in the cropped iris image.
The first evaluation is on the training methodology. We
verified that, for an initial learning rate of 10^-3, the framework
crashed in early iterations/epochs, probably because the
learning rate was too aggressive. Hence, for all remaining
experiments, an initial learning rate of 10^-4 was used. We
started the initial number of epochs at 100, but also tested
200, 300, and 400 epochs. With 400 epochs, we observed that
the learning process was overfitting on the validation batch of
the training set, so we chose 300 as the initial number of
epochs, since the learning process was still generalizing. This
is defined as our evaluation protocol for the remaining
experiments.
Next, we evaluate the network architecture parameters by
varying the number of layers and the number of filters in each
layer. The resulting correct classification rates (CCR) are
shown in Table II.
TABLE II
NETWORK ARCHITECTURE EVALUATION FOR THE AD100 AND LG4000 SENSORS ON THE NDCL DATABASE, VARYING THE NUMBER OF LAYERS AND THE NUMBER OF FILTERS IN EACH LAYER.

Sensor   N. Filters   CCR     N. Filters   CCR
AD100    16           72.33   16-16-16     73.67
         32           68.67   16-16-32     76.00
         64           70.00   16-16-64     77.00
         16-16        75.67   16-32-16     72.33
         16-32        75.00   16-32-32     76.33
         16-64        74.67   16-32-64     71.00
         32-32        76.00   32-32-16     75.00
         32-64        76.00   32-32-64     79.67
LG4000   16           79.50   16-16-16     77.59
         32           77.34   16-16-32     83.34
         64           80.84   16-16-64     81.17
         16-16        84.34   16-32-16     82.92
         16-32        84.82   16-32-32     81.75
         16-64        84.17   16-32-64     76.92
         32-32        85.59   32-32-16     81.34
         32-64        85.00   32-32-64     83.75
Note that using three convolutional layers does not
significantly increase the method's effectiveness, and networks
with a single layer do not present promising results. The
best result for the AD100 sensor (79.67%) is obtained using
three convolutional layers, while two layers yielded the best
result for the LG4000 sensor (85.59%). For our CLDnet, we
kept the two-layer configuration with 32 and 64 filters in the
first and second layers, respectively, since these results seemed
more stable across both sensors of the NDCL database.
We also evaluated whether or not to use the normalization
operation, but the results demonstrated that the method's
effectiveness is insensitive to this operation in the contact lens
detection problem. Thus, this operation was removed from
CLDnet.
Finally, we evaluate the input image size and database
annotation parameters simultaneously. The results of these
experiments are shown in Table III. From these results, we
can conclude that, in general, the largest input image size,
i.e., 256 × 256 pixels, yields the worst CCRs for both sensors,
AD100 and LG4000. Additionally, on average, the results
obtained with input image dimensions of 64 × 64 and 128 × 128
pixels are quite similar. As the image size might be a constraint
in some applications, we prefer the smallest one as the image
input of our CLDnet. Moreover, the best results in Table III
are obtained by networks with 64 × 64 pixel input images, a
4-pixel crop border, and 10% background addition, which
defines the final CLDnet shown in Fig. 4. Nevertheless, no
strong claim can be made about the crop border and background
addition parameters, since the results vary considerably
with them.

Fig. 4. CLDnet - network for Contact Lens Detection proposed here.
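For concreteness, a CLDnet-like topology can be sketched in PyTorch as follows; only the layer counts, filter counts, 56 × 56 grayscale input (a 64 × 64 image after the 4-pixel crop border), and the three-way softmax output reflect the text, while the kernel and pooling window sizes are placeholders, since the paper keeps Spoofnet's window sizes, which are not reproduced here:

```python
# Sketch of a CLDnet-like model under stated assumptions (kernel and
# pooling sizes are placeholders, not the exact Spoofnet values).
import torch
import torch.nn as nn

cldnet = nn.Sequential(
    nn.Conv2d(1, 32, kernel_size=5, padding=2), nn.ReLU(),   # conv layer 1: 32 filters
    nn.MaxPool2d(2),                                         # 56x56 -> 28x28
    nn.Conv2d(32, 64, kernel_size=5, padding=2), nn.ReLU(),  # conv layer 2: 64 filters
    nn.MaxPool2d(2),                                         # 28x28 -> 14x14
    nn.Flatten(),
    nn.Linear(64 * 14 * 14, 3),                              # single FC layer, 3 classes
)

logits = cldnet(torch.zeros(1, 1, 56, 56))   # one 56x56 grayscale crop
probs = torch.softmax(logits, dim=1)         # three-class probabilities
```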
B. Results
In this section, we compare the results obtained with our
method against the state-of-the-art (SOTA) results in [30].
Tables IV, V and VI present CCRs for no (N), textured
(T) and soft (S) contact lens classes and the overall (O)
CCR when analyzing intra, inter, and multi-sensor evaluations,
respectively. These results are analyzed as follows.
It is important to note that, for the experiments on the
sensors of the IIIT-D database, we use the same network,
CLDnet; however, we had to adjust the initial learning rate to
10^-3, because 10^-4 was not sufficient for effective learning.
All remaining parameters and procedures were kept the same
as for the sensors of the NDCL database.
1) Intra-sensor evaluation: The proposed method
outperformed SOTA for the AD100 & LG4000 sensors of the
NDCL database, in which the iris location is available, thereby
establishing new SOTA results. A marginal improvement is
observed for the AD100 sensor images; however, for the
LG4000 sensor, the CCR rises from approximately 80% to 86%,
a 30% reduction in classification error.
TABLE III
NETWORK ARCHITECTURE EVALUATION FOR THE AD100 AND LG4000 SENSORS ON THE NDCL DATABASE, EVALUATING THE INPUT IMAGE SIZE, THE CROP BORDER PARAMETER USED IN THE DATA AUGMENTATION, AND THE BACKGROUND ADDITION FROM THE DATABASE ANNOTATIONS.

                        Input image size & crop borders
                        64 × 64                     128 × 128                   256 × 256
Sensor   Bg. add. (%)   2      4      6      8      4      8      12     16     8      16     24     32
AD100    0              74.67  73.67  71.00  71.00  74.00  78.00  70.67  70.00  70.33  70.67  71.33  63.33
         10             74.67  78.33  74.00  73.33  73.33  76.00  72.67  65.33  71.33  73.67  68.00  62.33
         20             71.33  76.67  76.00  67.33  69.67  75.33  76.33  68.00  71.33  71.00  73.00  68.33
         30             69.00  70.00  72.67  75.00  68.33  72.33  73.00  75.33  67.33  70.00  72.00  67.00
         40             66.33  69.67  72.67  69.67  73.33  71.67  71.00  68.33  66.67  69.67  75.67  68.33
LG4000   0              82.50  81.92  82.75  82.08  84.25  83.83  84.17  82.08  77.25  76.50  76.00  77.00
         10             83.25  86.00  84.25  82.75  84.58  84.58  85.25  82.92  72.25  75.58  76.25  75.08
         20             81.25  82.83  84.08  80.58  84.75  84.83  85.58  84.33  72.17  74.92  75.83  73.58
         30             82.00  81.92  82.50  80.00  83.42  82.83  84.33  84.42  71.08  70.42  74.50  71.33
         40             80.25  81.42  81.50  82.08  82.17  82.92  84.92  82.67  68.00  70.58  72.08  71.08
We can also see results comparable to SOTA for the Cogent
& Vista sensor images of the IIIT-D database. In this case, the
iris location is not provided and the entire eye image was used
as input to our method. Nonetheless, our method achieves
higher accuracy than the second-best performing methods
reported in [30]. The results on the IIIT-D database can be
better understood when we consider that SOTA relies on an
iris segmentation algorithm.
TABLE IV
INTRA-SENSOR RESULTS FOR THE NDCL AND IIIT-D DATABASES.

       AD100          LG4000         Cogent                 Vista
Class  Ours   SOTA    Ours   SOTA    Ours   SOTA   2nd      Ours   SOTA   2nd
N      73.00  81.00   84.50  76.21   35.50  66.83  59.73    60.80  76.21  49.49
T      97.00  100.00  99.75  91.62   73.00  94.91  91.87    55.88  91.62  99.42
S      65.00  52.00   73.75  67.52   98.21  56.66  52.84    98.30  67.52  59.32
O      78.33  77.67   86.00  80.04   69.05  73.01  68.57    72.08  80.04  69.84
2) Inter-sensor evaluation: Again, our method achieved
new SOTA results in this scenario for the NDCL database,
improving the CCR by 18% and 15%. This result highlights
how robust deep representations can be when learning features
directly from the data. In contrast, disastrous results were
obtained on the IIIT-D database due to the absence of the iris
location, a feature that SOTA relies on.
TABLE V
INTER-SENSOR RESULTS FOR THE NDCL AND IIIT-D DATABASES.

Train  AD100          LG4000         Cogent         Vista
Test   LG4000         AD100          Vista          Cogent
Class  Ours   SOTA    Ours   SOTA    Ours   SOTA    Ours   SOTA
N      75.00  62.25   80.00  74.00   6.00   62.10   48.67  65.99
T      94.00  88.50   97.00  93.00   89.61  92.95   38.15  80.81
S      65.00  29.50   49.00  17.00   45.47  75.44   42.25  48.31
O      78.00  60.08   75.33  61.33   45.51  77.79   43.08  65.29
3) Multi-sensor evaluation: Finally, we observe that the
CCRs obtained by our method outperform the SOTA results
by almost 10% in the multi-sensor scenario for the NDCL
database and, even though the iris location is not provided for
the IIIT-D database, comparable performance is achieved there.
TABLE VI
MULTI-SENSOR RESULTS FOR THE NDCL AND IIIT-D DATABASES.

       NDCL           IIIT
Class  Ours   SOTA    Ours   SOTA
N      77.40  72.60   47.55  62.14
T      99.60  97.00   61.07  94.74
S      71.40  50.00   97.99  61.63
O      82.80  73.20   69.28  72.96
C. Architecture learning and processing times
In our experiments, we used six PCs with 32 GB RAM, Intel
Core i7 CPUs, and NVIDIA GPUs (Tesla K40 with 12 GB or
GeForce GTX Titan Black with 6 GB). The framework
(CUDA-convnet) clearly relied on the GPUs, and the difference
in processing time between the two GPU models was not
significant. The training time taken by the convolutional
networks is highly dependent on the input image size, number
of layers, and other parameters. For image sizes of 256 × 256,
128 × 128, and 64 × 64, the average training time was less
than 172, 49, and 11 minutes, respectively, for the LG4000
sensor, the one with the highest number of training samples.
Although we did not measure the classification time of a
single sample, our approach is quite suitable for real-world
applications. Indeed, there is an optimized framework, Jetpac's
iOS Deep Belief image recognition framework [45], that
implements the convolutional network architecture described
in [29] and can classify a 256 × 256 image into one of 1,000
categories in less than 300 ms on an iPhone 5S. That
architecture is significantly larger and more complex than the
ones we propose here: our architectures comprise fewer
operations and layers, and use lower-resolution images of
64 × 64 pixels. Therefore, contact lens detection systems with
architectures deployed using [45] should be suitable for
real-world applications.
VI. CONCLUSIONS AND FUTURE WORK
In this paper, we proposed the use of deep image
representations, learned as weights of a convolutional network
followed by a classification network, for the iris contact lens
detection problem. The conducted experiments validate our
method, which achieved a 30% reduction in classification error
over the state-of-the-art approach, SOTA, on the NDCL database
and comparable results on the IIIT-D database. In NDCL, the
iris location is available, which allows creating deep image
representations of regions of interest containing mostly iris
pixels. This becomes a problem in the IIIT-D database, where
neither iris segmentation nor location is available. SOTA
performs iris segmentation, but our approach is not yet prepared
to preprocess images and segment/locate the iris. We intend to
add this feature in future work and also to evaluate deep
learning techniques in which the architecture of the network
is first learned by using filters with random weights. Once
the architecture is learned, the weights can be improved by
backpropagation.
Effective comprehension and exploitation of representations
built through deep learning techniques, such as the convolu-
tional networks, are still open problems in the literature. We
also plan to put more effort into this subject to clarify such
points.
ACKNOWLEDGMENTS
We thank UFOP, the Brazilian National Research Council –
CNPq (Grants 307010/2014-7, 302970/2014-2, 479070/2013-0,
307113/2012-4), and the São Paulo Research Foundation –
FAPESP (Grants 2011/22749-8 and 2013/04172-0). D. Menotti
thanks NVIDIA for donating two GeForce GTX Titan Black
GPUs with 6 GB each.
REFERENCES
[1] J. G. Daugman, “High Confidence Visual Recognition of Persons by a Test of Statistical Independence,” IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 15, no. 11, pp. 1148–1161, 1993.
[2] L. Flom and A. Safir, “Iris Recognition System,” U.S. Patent 4,641,394, 1987.
[3] K. W. Bowyer, K. Hollingsworth, and P. J. Flynn, “Image Understanding for Iris Biometrics: A Survey,” Computer Vision and Image Understanding, vol. 110, no. 2, pp. 281–307, 2008.
[4] Y. Song, W. Cao, and Z. He, “Robust Iris Recognition using Sparse Error Correction Model and Discriminative Dictionary Learning,” Neurocomputing, vol. 137, pp. 198–204, 2014.
[5] A. F. M. Raffei, H. Asmuni, R. Hassan, and R. M. Othman, “Feature Extraction for Different Distances of Visible Reflection Iris using Multiscale Sparse Representation of Local Radon Transform,” Pattern Recognition, vol. 46, no. 10, pp. 2622–2633, 2013.
[6] J. Galbally, S. Marcel, and J. Fierrez, “Image Quality Assessment for Fake Biometric Detection: Application to Iris, Fingerprint, and Face Recognition,” IEEE Trans. on Image Processing, vol. 23, no. 2, pp. 710–724, 2014.
[7] P. Gupta, S. Behera, M. Vatsa, and R. Singh, “On Iris Spoofing using Print Attack,” in 22nd Int. Conf. on Pattern Recognition. IEEE, 2014, pp. 1681–1686.
[8] Z. Sun and T. Tan, “Iris Anti-Spoofing,” in Handbook of Biometric Anti-Spoofing, ser. Advances in Computer Vision and Pattern Recognition, S. Marcel, M. S. Nixon, and S. Z. Li, Eds. Springer London, 2014, pp. 103–123.
[9] K. W. Bowyer and J. S. Doyle, “Cosmetic Contact Lenses and Iris Recognition Spoofing,” Computer, vol. 47, no. 5, pp. 96–98, 2014.
[10] D. Menotti, G. Chiachia, A. Pinto, W. Schwartz, H. Pedrini, A. Falcão, and A. Rocha, “Deep Representations for Iris, Face, and Fingerprint Spoofing Detection,” IEEE Trans. on Information Forensics and Security, vol. 10, no. 4, pp. 864–879, 2015.
[11] R. Raghavendra and C. Busch, “Robust Scheme for Iris Presentation Attack Detection Using Multiscale Binarized Statistical Image Features,” IEEE Trans. on Information Forensics and Security, vol. 10, no. 4, pp. 703–715, 2015.
[12] Z. Sun, H. Zhang, T. Tan, and J. Wang, “Iris Image Classification Based on Hierarchical Visual Codebook,” IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 36, no. 6, pp. 1120–1133, Jun. 2014.
[13] A. Sequeira, H. Oliveira, J. Monteiro, J. Monteiro, and J. Cardoso, “MobILive 2014 - Mobile Iris Liveness Detection Competition,” in IEEE Int. Joint Conf. on Biometrics, Sept. 2014, pp. 1–6.
[14] Z. Wei, X. Qiu, Z. Sun, and T. Tan, “Counterfeit Iris Detection based on Texture Analysis,” in Int. Conf. on Pattern Recognition. IEEE, 2008, pp. 1–4.
[15] S. E. Baker, A. Hentz, K. W. Bowyer, and P. J. Flynn, “Degradation of Iris Recognition Performance due to non-Cosmetic Prescription Contact Lenses,” Computer Vision and Image Understanding, vol. 114, no. 9, pp. 1030–1044, 2010.
[16] H. Zhang, Z. Sun, and T. Tan, “Contact Lens Detection based on Weighted LBP,” in Int. Conf. on Pattern Recognition, 2010, pp. 4279–4282.
[17] N. Kohli, D. Yadav, M. Vatsa, and R. Singh, “Revisiting Iris Recognition with Color Cosmetic Contact Lenses,” in Int. Conf. on Biometrics, 2013, pp. 1–7.
[18] J. S. Doyle, K. W. Bowyer, and P. J. Flynn, “Variation in Accuracy of Textured Contact Lens Detection based on Sensor and Lens Pattern,” in IEEE Int. Conf. on Biometrics: Theory, Applications, and Systems, 2013, pp. 1–7.
[19] J. Komulainen, A. Hadid, and M. Pietikainen, “Generalized Textured Contact Lens Detection by Extracting BSIF Description from Cartesian Iris Images,” in IEEE Int. Joint Conf. on Biometrics, 2014, pp. 1–7.
[20] Y. Bengio, A. Courville, and P. Vincent, “Representation Learning: A Review and New Perspectives,” IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 35, no. 8, 2013.
[21] F. Schroff, D. Kalenichenko, and J. Philbin, “FaceNet: A Unified Embedding for Face Recognition and Clustering,” in IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2015, pp. 132–142, to appear.
[22] Y. Taigman, M. Yang, M. Ranzato, and L. Wolf, “DeepFace: Closing the Gap to Human-Level Performance in Face Verification,” in IEEE Int. Conf. on Computer Vision and Pattern Recognition, 2014, pp. 1701–1708.
[23] G. Chiachia, A. X. Falcão, N. Pinto, A. Rocha, and D. Cox, “Learning Person-Specific Representations From Faces in the Wild,” IEEE Trans. on Information Forensics and Security, vol. 9, no. 12, pp. 2089–2099, Dec. 2014.
[24] D. Cox and N. Pinto, “Beyond Simple Features: A Large-Scale Feature Search Approach to Unconstrained Face Recognition,” in IEEE Int. Conf. on Automatic Face Gesture Recognition and Workshops. IEEE, 2011, pp. 8–15.
[25] P. Sermanet, K. Kavukcuoglu, S. Chintala, and Y. LeCun, “Pedestrian Detection with Unsupervised Multi-Stage Feature Learning,” in IEEE Conf. on Computer Vision and Pattern Recognition. IEEE, 2013, pp. 3626–3633.
[26] D. Menotti, G. Chiachia, A. Falcão, and V. Oliveira Neto, “Vehicle License Plate Recognition with Random Convolutional Networks,” in 27th SIBGRAPI Conf. on Graphics, Patterns and Images, 2014, pp. 298–303.
[27] D. C. Ciresan, U. Meier, L. M. Gambardella, and J. Schmidhuber, “Deep Big Simple Neural Nets For Handwritten Digit Recognition,” Neural Computation, vol. 22, no. 12, pp. 3207–3220, 2010.
[28] D. Cireşan, U. Meier, J. Masci, and J. Schmidhuber, “Multi-Column Deep Neural Network for Traffic Sign Classification,” Neural Networks, vol. 32, pp. 333–338, 2012.
[29] A. Krizhevsky, I. Sutskever, and G. E. Hinton, “ImageNet Classification with Deep Convolutional Neural Networks,” in Advances in Neural Information Processing Systems, 2012, pp. 1097–1105.
[30] D. Yadav, N. Kohli, J. Doyle, R. Singh, M. Vatsa, and K. Bowyer, “Unraveling the Effect of Textured Contact Lenses on Iris Recognition,” IEEE Trans. on Information Forensics and Security, vol. 9, no. 5, pp. 851–862, 2014.
[31] S. E. Baker, A. Hentz, K. W. Bowyer, and P. J. Flynn, “Contact Lenses: Handle with Care for Iris Recognition,” in IEEE Int. Conf. on Biometrics: Theory, Applications, and Systems, 2009, pp. 1–8.
[32] E. C. Lee, K. R. Park, and J. Kim, “Fake Iris Detection by using Purkinje Image,” in Advances in Biometrics. Springer, 2005, pp. 397–403.
[33] Chinese Academy of Sciences (CASIA), Institute of Automation, “CASIA Iris Image Database,” http://biometrics.idealtest.org/findTotalDbByMode.do?mode=Iris, 2010, accessed 26 Mar. 2015 [online].
[34] University of Bath, Department of Electronic and Electrical Engineering, “University of Bath Iris Image Database,” 2008.
[35] Z. He, Z. Sun, T. Tan, and Z. Wei, “Efficient Iris Spoof Detection via Boosted Local Binary Patterns,” in Advances in Biometrics. Springer, 2009, pp. 1080–1090.
[36] Chinese Academy of Sciences (CASIA), Institute of Automation, “CASIA-IrisV3 Image Database,” http://biometrics.idealtest.org/dbDetailForUser.do?id=3, 2010, accessed 26 Mar. 2015 [online].
[37] National Institute of Standards and Technology (NIST), “Iris Challenge Evaluation (ICE),” http://www.nist.gov/itl/iad/ig/ice.cfm, 2008, accessed 26 Mar. 2015 [online].
[38] J. Daugman, “Demodulation by Complex-Valued Wavelets for Stochastic Pattern Recognition,” Int. Journal of Wavelets, Multiresolution and Information Processing, vol. 1, no. 1, pp. 1–17, 2003.
[39] J. Doyle and K. Bowyer, “Notre Dame Image Database for Contact Lens Detection In Iris Recognition-2013: README,” http://www3.nd.edu/~cvrl/papers/CosCon2013README.pdf, 2014.
[40] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, “Gradient-based Learning Applied to Document Recognition,” Proceedings of the IEEE, vol. 86, no. 11, pp. 2278–2324, 1998.
[41] C. Bishop, Neural Networks for Pattern Recognition, ser. Advanced Texts in Econometrics. Clarendon Press, 1995.
[42] W. S. Geisler and D. G. Albrecht, “Cortical Neurons: Isolation of Contrast Gain Control,” Vision Research, vol. 32, no. 8, pp. 2429–2454, 1992.
[43] N. Pinto, D. Doukhan, J. J. DiCarlo, and D. D. Cox, “A High-Throughput Screening Approach to Discovering Good Forms of Biologically-Inspired Visual Representation,” PLoS Computational Biology, vol. 5, no. 11, 2009.
[44] J. Bergstra, D. Yamins, and D. D. Cox, “Making a Science of Model Search: Hyperparameter Optimization in Hundreds of Dimensions for Vision Architectures,” in Int. Conf. on Machine Learning, 2013.
[45] P. Warden, “The SDK for Jetpac's iOS Deep Belief image recognition framework,” 2014. [Online]. Available: https://github.com/jetpacapp/DeepBeliefSDK
... This has encouraged many researchers to incorporate deep learning in the PAD problem. Instead of using handcrafted features, rely on a deep network or CNN to learn features that discriminate between bona-fide and attack face [1,2,29,85,86,[91][92][93] or iris [3,4,94,95] or both [5,98]. Some proposed to use hybrid features that combine information from both handcrafted and deeply-learnt features. ...
... Although several previous studies use CNN for PAD, they either used custom-designed networks, e.g. Spoofnet [95,98], or used pre-trained CNN to extract features that are later classified with conventional classification algorithms such as SVM. In this paper, we propose to train deeper CNNs for the direct classification of bona-fide and PA images. ...
Article
Full-text available
Biometric presentation attack detection (PAD) is gaining increasing attention. Users of mobile devices find it more convenient to unlock their smart applications with finger, face, or iris recognition instead of passwords. In this study, the authors survey the approaches presented in the recent literature to detect face and iris presentation attacks. Specifically, they investigate the effectiveness of fine-tuning very deep convolutional neural networks to the task of face and iris antispoofing. They compare two different fine-tuning approaches on six publicly available benchmark datasets. Results show the effectiveness of these deep models in learning discriminative features that can tell apart real from fake biometric images with a very low error rate. Cross-dataset evaluation on face PAD showed better generalisation than state-of-the-art. They also performed cross-dataset testing on iris PAD datasets in terms of equal error rate, which was not reported in the literature before. Additionally, they propose the use of a single deep network trained to detect both face and iris attacks. They have not noticed accuracy degradation compared to networks trained for only one biometric separately. Finally, they analysed the learned features by the network, in correlation with the image frequency components, to justify its prediction decision.
Chapter
Biometrics involves the analysis and statistical assessment of unique physical and behavioural characteristics of an individual. It finds application in areas like identification, access control, and surveillance. In security systems, biometric-based recognition is replacing conventional methods. Iris recognition (IR) has gained prominence in contemporary biometric technology deployed across various devices for security purposes. Recent advancements in deep convolutional neural networks (CNNs), computer vision, and access to extensive training data have significantly enhanced the performance of IR systems over the last decade. A presentation attack refers to a scenario where an impostor generates fake biometric data to deceive the system. This study introduces an effective strategy to enhance the precision of detecting iris presentation attacks and reviews the evolution of CNN techniques from 2015 to 2022. The proposed solution is a Dual-Channel Convolutional Neural Network Presentation Attack Detector (DC-CNNPAD), designed to improve the accuracy of real iris detection. An experiment is conducted on the LivDet-2015 dataset to evaluate the model’s effectiveness in identifying artefacts. The results obtained from the detection model on the sample dataset demonstrate highly favourable outcomes, and on LivDet-2015, the TDR is 98.70%.
Article
Full-text available
Presentation attacks that make the biometric systems vulnerable has become a growing concern in recent years keeping in view its widespread applications in the field of banking, medical, security systems etc. For instance, textured contact lenses, high-quality printouts and fabricated synthetic materials spoof the iris texture and fingerprints that lead to increase in false rejection. Till now, extensive work has been done on global features. However, this paper proposed local features with invariance properties. Thus, the paper proposes detection of spoofing attacks in which local features are extracted for micro-textural analysis with properties of invariance to scale, rotation and translation. The features are encoded using Lehmer code and transformed into histograms that act as feature descriptors for classification. The top 4 features are selected using Friedman test. Experiments are simulated on iris spoofing databases: IIITD-Contact Lens, IIITD-Iris Spoofing, Clarkson-2015, Warsaw-2015and fingerprint spoofing databases: LivDet-2013 and LivDet-2015. Results have been validated through intra-sensor, inter-sensor, cross-sensor and cross-material. In case of IIITD-CLI, an EER of 1.36% and an ACER of 1.45% is obtained. For IIS, 0.94% of EER and 1.61% of ACER is observed. For Clarkson database, 0.79% of EER and 2.10% of ACER is obtained. An ACER of 0.57% is obtained for LivDet-2013 and 0.47% for LivDet-2015.
Article
Full-text available
Despite the promising results achieved by deep iris presentation attack detection (PAD) in dataset-specific scenarios, the advanced approach remains vulnerable to novel attacks. Real-world attacks evolve over time. Typically, fine-tuning and retraining from scratch are employed to incrementally learn new attacks. However, fine-tuning degrades performance on old attacks, i.e., catastrophic forgetting. Retraining on all data is unavailable due to data privacy. To address these issues, we are the first to propose a lifelong iris PAD to incrementally learn new attacks without storing old data. Our approach utilizes a prompt pool to preserve attack-independent and attack-shared knowledge, wherein learnable prompts aid in prediction by the pre-trained Vision Transformer (ViT). Furthermore, adaptive attention masks for sequential new attacks are applied to pre-trained ViT. Consequently, our method improves plasticity while preserving stability. Extensive experiments are performed on our building dataset combing IITD and CASIA to evaluate iris PAD in incremental learning. Our proposed method obtains competitive performance over state-of-the-art Iris PAD schemes.
Chapter
Iris recognition technology has attracted an increasing interest in the last decades in which we have witnessed a migration from research laboratories to real-world applications. The deployment of this technology raises questions about the main vulnerabilities and security threats related to these systems. Among these threats, presentation attacks stand out as some of the most relevant and studied. Presentation attacks can be defined as the presentation of human characteristics or artifacts directly to the capture device of a biometric system trying to interfere with its normal operation. In the case of the iris, these attacks include the use of real irises as well as artifacts with different levels of sophistication such as photographs or videos. This chapter introduces iris Presentation Attack Detection (PAD) methods that have been developed to reduce the risk posed by presentation attacks. First, we summarize the most popular types of attacks including the main challenges to address. Second, we present a taxonomy of PAD methods as a brief introduction to this very active research area. Finally, we discuss the integration of these methods into iris recognition systems according to the most important scenarios of practical application.
Article
With the rapid development of the Mobile Internet and the Industrial Internet of Things, a variety of applications put forward an urgent demand for user and device identity recognition. Digital identity with hidden characteristics is essential for both individual users and physical devices. With the assistance of multimodalities as well as fusion strategies, identity recognition can be more reliable and robust. In this survey, we turn to investigate the concepts and limitations of unimodal identity recognition, the motivation, and advantages of multimodal identity recognition, and summarize the recognition technologies and applications via feature level, match score level, decision level, and rank level data fusion strategies. Additionally, we also discuss the security concerns and future research orientations of learning-based identity recognition, which enables researchers to achieve a better understanding of the current status of this field and select future research directions. This survey summarizes and expands the fusion processing technologies and methods for multi-source and multimodality data, and provides theoretical support for their applications in complicated scenarios. In addition, it enables researchers to achieve a better understanding of the current research status of this field and select proper future research directions.
Article
Multimodal biometric systems are widely applied in many real-world applications because of its ability to accommodate variety of great limitations of unimodal biometric systems, including sensitivity to noise, population coverage, intra-class variability, nonuniversality, and vulnerability to spoofing. during this paper, an efficient and real-time multimodal biometric system is proposed supported building deep learning representations for images of both the correct and left irises of someone, and fusing the results obtained employing a ranking-level fusion method. The trained deep learning system proposed is named IrisConvNet whose architecture relies on a mix of Convolutional Neural Network (CNN) and Softmax classifier to extract discriminative features from the input image with none domain knowledge where the input image represents the localized iris region and so classify it into one amongst N classes. during this work, a discriminative CNN training scheme supported a mixture of back-propagation algorithm and mini-batch AdaGrad optimization method is proposed for weights updating and learning rate adaptation, respectively. additionally, other training strategies (e.g., dropout method, data augmentation) also are proposed so as to gauge different CNN architectures. The performance of the proposed system is tested on three public datasets collected under different conditions: SDUMLA-HMT, CASIA-IrisV3 Interval and IITD iris database
Article
Full-text available
The use of the iris and periocular region as biometric traits has been extensively investigated, mainly due to the singularity of the iris features and the use of the periocular region when the image resolution is not sufficient to extract iris information. In addition to providing information about an individual’s identity, features extracted from these traits can also be explored to obtain other information such as the individual’s gender, the influence of drug use, the use of contact lenses, spoofing, among others. This work presents a survey of the databases created for ocular recognition, detailing their protocols and how their images were acquired. We also describe and discuss the most popular ocular recognition competitions (contests), highlighting the submitted algorithms that achieved the best results using only iris trait and also fusing iris and periocular region information. Finally, we describe some relevant works applying deep learning techniques to ocular recognition and point out new challenges and future directions. Considering that there are a large number of ocular databases, and each one is usually designed for a specific problem, we believe this survey can provide a broad overview of the challenges in ocular biometrics.
Article
Obfuscating an iris recognition system with forged iris samples is a major security threat in iris-based authentication. A detection mechanism is therefore essential that can explicitly discriminate between live iris and forged (attack) patterns. The majority of existing methods analyze the eye image as a whole to find features that discriminate fake from real irises. However, many attacks do not alter the entire eye image; only the iris region is affected. The iris therefore constitutes the region of interest (RoI) for an exhaustive search for forged iris patterns. This paper introduces a novel framework that locates the RoI using the YOLO approach and performs selective image enhancement to enrich the core textural details. The YOLO approach tightly bounds the iris region without any pattern loss, so that textural analysis through local and global descriptors is expected to be effective. Afterward, various handcrafted and CNN-based methods are employed to extract discriminative textural features from the RoI. The best k features are then identified through the Friedman test as the optimal feature set and combined using score-level fusion. The proposed approach is assessed on six different iris databases using predefined intra-dataset, cross-dataset, and combined-dataset validation protocols. The experimental outcomes show that the proposed method yields a significant error reduction relative to the state of the art.
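The RoI-plus-enhancement step could be sketched as below: cropping a detected iris bounding box and applying CLAHE contrast enhancement to enrich local texture. The bounding box is assumed to come from a separately trained detector (the paper uses YOLO), and the CLAHE parameters are illustrative defaults rather than the paper's settings.

```python
import cv2
import numpy as np

def enhance_iris_roi(eye_image, box):
    """Crop a detected iris bounding box and apply CLAHE to enrich local texture.

    `box` = (x, y, w, h) is assumed to come from a separately trained detector;
    the CLAHE parameters here are illustrative defaults.
    """
    x, y, w, h = box
    roi = eye_image[y:y + h, x:x + w]
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    return clahe.apply(roi)

# Synthetic stand-in for a grayscale eye image and a detector output.
eye = (np.random.rand(240, 320) * 255).astype(np.uint8)
enhanced = enhance_iris_roi(eye, box=(100, 80, 96, 96))
print(enhanced.shape)  # (96, 96)
```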
Article
Despite the prominence and robustness of iris recognition systems, iris image acquisition using heterogeneous cameras/sensors is a prime concern when deploying them in wide-scale applications. The textural qualities of iris samples captured with distinct sensors differ substantially due to differences in illumination and the underlying hardware, which yields intra-class variation within the iris dataset. This paper examines three different configurations of convolution and residual blocks to improve cross-domain iris recognition. The best of the three architectures is identified by the Friedman test, where the statistical differences between the proposed architectures are established based on the outcomes of the Nemenyi and Bonferroni-Dunn tests. The quantitative performance of these architectures is evaluated in several experiments on two iris datasets, ND-CrossSensor-Iris-2013 and ND-iris-0405. The best model, referred to as the "Collaborative Convolutional Residual Network (CCRNet)", is further examined in experiments prepared in similar and cross-domain settings. Results show that the two lowest error rates reported by CCRNet are 1.06% and 1.21%, improving on the state-of-the-art benchmark. This is due to the fast convergence and rapid weight updates achieved through the convolution and residual connections, respectively, which help recognize the micro-patterns existing within the iris region and result in better feature discrimination among large numbers of iris subjects.
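For readers unfamiliar with the building block involved, a generic convolution-plus-residual block in PyTorch is sketched below; it shows the skip-connection pattern the abstract relies on, not the exact CCRNet configuration, which is not specified here.

```python
import torch
import torch.nn as nn

class ConvResidualBlock(nn.Module):
    """A plain convolution path plus an identity skip connection.

    This is the generic residual pattern, not the exact CCRNet block,
    whose configuration is an assumption here.
    """
    def __init__(self, channels):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
        )

    def forward(self, x):
        return torch.relu(self.body(x) + x)  # the skip connection eases gradient flow

x = torch.randn(1, 32, 64, 64)
print(ConvResidualBlock(32)(x).shape)  # torch.Size([1, 32, 64, 64])
```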
Conference Paper
Despite decades of research on automatic license plate recognition (ALPR), optical character recognition (OCR) still leaves room for improvement in this context, given that a single OCR miss is enough to miss the entire plate. We propose an OCR approach based on convolutional neural networks (CNNs) for feature extraction. The architecture of our CNN is chosen from thousands of random possibilities, and its filter weights are set at random and normalized to zero mean and unit norm. By training linear support vector machines (SVMs) on the resulting CNN features, we can achieve recognition rates of over 98% for digits and 96% for letters, something that neither SVMs operating on image pixels nor CNNs trained via back-propagation can achieve. The results are obtained on a dataset that has 182 samples per digit and 28 per letter, and suggest the use of random CNNs as a promising alternative approach to ALPR systems.
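A toy version of this pipeline is sketched below: random convolutional filters normalized to zero mean and unit norm, mean-pooled ReLU responses as features, and a linear SVM on top. The filter count, image size, and pooling choice are simplifying assumptions; the paper searches thousands of random architectures.

```python
import numpy as np
from scipy.signal import convolve2d
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)

def random_filters(n_filters, size):
    """Random filters normalized to zero mean and unit norm, as in the paper."""
    f = rng.standard_normal((n_filters, size, size))
    f -= f.mean(axis=(1, 2), keepdims=True)
    f /= np.linalg.norm(f.reshape(n_filters, -1), axis=1).reshape(-1, 1, 1)
    return f

def random_cnn_features(image, filters):
    """One convolution layer with ReLU; mean-pooled responses form the feature vector."""
    maps = [np.maximum(convolve2d(image, k, mode='valid'), 0) for k in filters]
    return np.array([m.mean() for m in maps])

# Toy character crops; real data would be segmented plate characters.
X_img = rng.random((40, 20, 20))
y = rng.integers(0, 10, 40)                  # hypothetical digit labels
filters = random_filters(8, 5)
X = np.array([random_cnn_features(img, filters) for img in X_img])
clf = LinearSVC().fit(X, y)                  # linear SVM on fixed random-CNN features
```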
Conference Paper
Textured contact lenses cause severe problems for iris biometric systems because they can be used to alter the appearance of iris texture in order to deliberately increase the false positive and, especially, false negative match rates. Many texture-analysis-based techniques have been proposed for detecting the presence of cosmetic contact lenses. However, it has been shown recently that the generalization capability of the existing approaches is not sufficient, because they have been developed for detecting specific lens texture patterns and evaluated only on the same lens types seen during the development phase. This scenario does not hold in unpredictable practical applications, where unseen lens patterns will certainly be encountered in operation. In this paper, we address this issue by studying the effect of different iris image preprocessing techniques and introducing a novel approach for more generalized cosmetic contact lens detection using binarized statistical image features (BSIF). Our extensive experimental analysis on benchmark datasets shows that the BSIF description extracted from preprocessed Cartesian iris texture images yields promising generalization capabilities across unseen texture patterns and different iris sensors, with mean equal error rates of 0.14% and 0.88%, respectively. The findings support the intuition that the textural differences between genuine iris textures and fake ones are best described by preserving the regular structure of different printing signatures, without transforming the iris images into the polar coordinate system.
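A compact sketch of a BSIF-style descriptor is given below: each filter response is binarized at zero, the bits are packed into a code image, and the codes are histogrammed. Random filters stand in for the ICA-learned filters the actual method uses, so this is illustrative only.

```python
import numpy as np
from scipy.signal import convolve2d

def bsif_histogram(image, filters):
    """Binarize each filter response at zero, pack the bits into a code image,
    and histogram the codes (2**n_filters bins)."""
    n = len(filters)
    code = np.zeros(convolve2d(image, filters[0], mode='valid').shape, dtype=int)
    for i, k in enumerate(filters):
        code += (convolve2d(image, k, mode='valid') > 0).astype(int) << i
    hist, _ = np.histogram(code, bins=2 ** n, range=(0, 2 ** n))
    return hist / hist.sum()  # normalized descriptor

# Stand-in random filters; the actual method uses filters pre-learned with ICA
# on natural image patches.
rng = np.random.default_rng(0)
filters = rng.standard_normal((8, 7, 7))
iris_patch = rng.random((64, 64))
print(bsif_histogram(iris_patch, filters).shape)  # (256,)
```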
Article
Humans are natural face recognition experts, far out-performing current automated face recognition algorithms, especially in naturalistic, “in the wild” settings. However, a striking feature of human face recognition is that we are dramatically better at recognizing highly familiar faces, presumably because we can leverage large amounts of past experience with the appearance of an individual to aid future recognition. Meanwhile, the analogous situation in automated face recognition, where a large number of training examples of an individual are available, has been largely underexplored, in spite of the increasing relevance of this setting in the age of social media. Inspired by these observations, we propose to explicitly learn enhanced face representations on a per-individual basis, and we present two methods enabling this approach. By learning and operating within person-specific representations, we are able to significantly outperform the previous state-of-the-art on PubFig83, a challenging benchmark for familiar face recognition in the wild, using a novel method for learning representations in deep visual hierarchies. We suggest that such person-specific representations aid recognition by introducing an intermediate form of regularization to the problem.
Article
The vulnerability of iris recognition systems remains a challenge due to diverse presentation attacks that undermine their reliability when these systems are adopted in real-life scenarios. In this paper, we present an in-depth analysis of presentation attacks on iris recognition systems, focusing especially on photo print attacks and the electronic display (or screen) attack. To this end, we introduce a new, relatively large-scale visible-spectrum iris artefact database comprising 3300 normal and artefact iris samples, captured by simulating five different attacks on an iris recognition system. We also propose a novel presentation attack detection (PAD) scheme based on multiscale binarized statistical image features and linear support vector machines. Extensive experiments carried out on four different publicly available iris artefact databases reveal the outstanding performance of the proposed PAD scheme when benchmarked against various well-established state-of-the-art schemes.
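Building on the BSIF sketch above, a multiscale variant could concatenate histograms computed with filter banks of several sizes and feed them to a linear SVM, roughly as below; the filter sizes, bank sizes, and toy live/artefact labels are assumptions for illustration.

```python
import numpy as np
from scipy.signal import convolve2d
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)

def bsif_hist(img, bank):
    """Compact BSIF-style histogram (binarize responses, pack bits, histogram)."""
    n = len(bank)
    code = sum((convolve2d(img, k, mode='same') > 0).astype(int) << i
               for i, k in enumerate(bank))
    h, _ = np.histogram(code, bins=2 ** n, range=(0, 2 ** n))
    return h / h.sum()

# One fixed random bank per scale; stand-ins for pre-learned BSIF filters.
banks = [rng.standard_normal((6, s, s)) for s in (3, 5, 7)]

def multiscale_bsif(img):
    """Concatenate the histograms from all scales into one descriptor."""
    return np.concatenate([bsif_hist(img, bank) for bank in banks])

# Toy bona fide vs. artefact samples with hypothetical labels (0 = live, 1 = fake).
X = np.array([multiscale_bsif(rng.random((48, 48))) for _ in range(20)])
y = np.array([0] * 10 + [1] * 10)
clf = LinearSVC().fit(X, y)  # linear SVM separates the multiscale descriptors
```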
Conference Paper
We trained a large, deep convolutional neural network to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into the 1000 different classes. On the test data, we achieved top-1 and top-5 error rates of 37.5% and 17.0%, which is considerably better than the previous state-of-the-art. The neural network, which has 60 million parameters and 650,000 neurons, consists of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax. To make training faster, we used non-saturating neurons and a very efficient GPU implementation of the convolution operation. To reduce overfitting in the fully-connected layers we employed a recently-developed regularization method called dropout that proved to be very effective. We also entered a variant of this model in the ILSVRC-2012 competition and achieved a winning top-5 test error rate of 15.3%, compared to 26.2% achieved by the second-best entry.
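This architecture is widely reimplemented; for instance, torchvision ships an AlexNet with the same overall layout (five convolutional layers, some followed by max-pooling, and three fully connected layers ending in a 1000-way output), which can be instantiated as below. The `weights=None` argument assumes a recent torchvision release.

```python
import torch
from torchvision.models import alexnet

# torchvision's AlexNet mirrors the design described above: five conv layers,
# max-pooling, dropout in the classifier, and a final 1000-way output.
net = alexnet(weights=None)          # random initialization, no pretrained weights
x = torch.randn(1, 3, 224, 224)      # one ImageNet-sized RGB image
logits = net(x)
probs = torch.softmax(logits, dim=1) # softmax over the 1000 classes
print(logits.shape)                  # torch.Size([1, 1000])
```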
Chapter
Iris images contain rich texture information for reliable personal identification. However, forged iris patterns may be used to spoof iris recognition systems. This paper proposes an iris anti-spoofing approach based on the texture discrimination between genuine and fake iris images. Four texture analysis methods are used for iris liveness detection: the gray-level co-occurrence matrix (GLCM), the statistical distribution of iris texture primitives, local binary patterns (LBP), and weighted LBP. A fake iris image database is constructed for performance evaluation of iris liveness detection methods; the fake iris images are captured from artificial eyeballs, textured contact lenses, and iris patterns printed on paper, or synthesized from textured contact lens patterns. Experimental results demonstrate the effectiveness of the proposed texture analysis methods for iris liveness detection, and the learned statistical texture features based on weighted LBP can achieve 99% accuracy in classifying genuine and fake iris images.
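Two of these texture descriptors are available off the shelf in scikit-image; the sketch below computes a uniform LBP histogram and two GLCM statistics on a stand-in texture patch. The neighborhood, distance, and angle parameters are illustrative choices, not the chapter's settings (note that `graycomatrix` is spelled `greycomatrix` in older scikit-image releases).

```python
import numpy as np
from skimage.feature import local_binary_pattern, graycomatrix, graycoprops

rng = np.random.default_rng(0)
iris = (rng.random((64, 64)) * 255).astype(np.uint8)  # stand-in iris texture patch

# LBP: uniform patterns with 8 neighbors at radius 1, summarized as a histogram.
lbp = local_binary_pattern(iris, P=8, R=1, method='uniform')
lbp_hist, _ = np.histogram(lbp, bins=10, range=(0, 10))  # 10 uniform-pattern bins

# GLCM: co-occurrence at distance 1 and angle 0, reduced to scalar texture statistics.
glcm = graycomatrix(iris, distances=[1], angles=[0],
                    levels=256, symmetric=True, normed=True)
contrast = graycoprops(glcm, 'contrast')[0, 0]
homogeneity = graycoprops(glcm, 'homogeneity')[0, 0]
print(len(lbp_hist), contrast, homogeneity)
```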
Conference Paper
The human iris contains rich textural information that serves as the key information for biometric identification; it is very unique and one of the most accurate biometric modalities. However, spoofing techniques can be used to obfuscate or impersonate identities and increase the risk of false acceptance or false rejection. This paper revisits iris recognition under spoofing attacks and analyzes their effect on recognition performance. Specifically, a print attack with contact lens variations is used as the spoofing mechanism. It is observed that the print attack and contact lenses, individually and in conjunction, can significantly change the inter-personal and intra-personal distributions and thereby increase the possibility of deceiving iris recognition systems. The paper also presents the IIITD iris spoofing database, which contains over 4800 iris images pertaining to over 100 individuals with variations due to contact lenses, sensors, and print attacks. Finally, the paper shows that cost-effective descriptor approaches may help counter spoofing attacks.
Article
Despite significant recent advances in the field of face recognition, implementing face verification and recognition efficiently at scale presents serious challenges to current approaches. In this paper we present a system, called FaceNet, that directly learns a mapping from face images to a compact Euclidean space where distances directly correspond to a measure of face similarity. Once this space has been produced, tasks such as face recognition, verification, and clustering can be easily implemented using standard techniques with FaceNet embeddings as feature vectors. Our method uses a deep convolutional network trained to directly optimize the embedding itself, rather than an intermediate bottleneck layer as in previous deep learning approaches. To train, we use triplets of roughly aligned matching/non-matching face patches generated using a novel online triplet mining method. The benefit of our approach is much greater representational efficiency: we achieve state-of-the-art face recognition performance using only 128 bytes per face. On the widely used Labeled Faces in the Wild (LFW) dataset, our system achieves a new record accuracy of 99.63%. On YouTube Faces DB it achieves 95.12%. Our system cuts the error rate by 30% on both datasets in comparison to the best published results.
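The core training signal can be written in a few lines; the sketch below implements a FaceNet-style triplet loss on L2-normalized embeddings, pushing each anchor closer to its positive than to its negative by a margin. The batch size, embedding dimension, margin value, and random embeddings are placeholders, and the paper's online triplet mining is not shown.

```python
import torch
import torch.nn.functional as F

def triplet_loss(anchor, positive, negative, margin=0.2):
    """FaceNet-style triplet loss on L2-normalized embeddings:
    pull each anchor within `margin` closer to its positive than its negative."""
    a, p, n = (F.normalize(t, dim=1) for t in (anchor, positive, negative))
    d_ap = (a - p).pow(2).sum(dim=1)  # squared distance anchor-positive
    d_an = (a - n).pow(2).sum(dim=1)  # squared distance anchor-negative
    return F.relu(d_ap - d_an + margin).mean()

# Hypothetical 128-D embeddings for a batch of 4 triplets.
emb = lambda: torch.randn(4, 128, requires_grad=True)
loss = triplet_loss(emb(), emb(), emb())
loss.backward()
print(float(loss))
```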