Machine Learning Algorithm on Keystroke Dynamics Pattern

December 2018

DOI:10.1109/SPC.2018.8704135

Conference: 2018 IEEE Conference on Systems, Process and Control (ICSPC)

Authors:

Purvashi Baynath

University of Mauritius

K.M.S. Soyjaudah

University of Mauritius

Maleika Heenaye-Mamode Khan

University of Mauritius

This is a draft version of the paper. The full version is available on: https://ieeexplore.ieee.org/document/8704135

and should be cited as: Baynath, P., Soyjaudah, K.S. and Khan, M.Heenaye- Mamode khan., 2018, December.

Machine Learning Algorithm on Keystroke Dynamics Pattern. In 2018 IEEE Conference on Systems, Process and

Control (ICSPC) (pp. 11-16). IEEE.

Machine Learning Algorithm on Keystroke Dynamics pattern

Purvashi Baynath

Electrical and Electronics Engineering

University of Mauritius

Reduit

Mauritius

e-mail: p.baynath@gmail.com

K. M. Sunjiv Soyjaudah

Electrical and Electronics Engineering

University of Mauritius

Reduit,

Mauritius

e-mail: sunjivsoyjaudah@gmail.com

Maleika Heenaye-Mamode Khan

Software and Information Systems,

University of Mauritius,

Reduit,

Mauritius

e-mail: m.mamodekhan@uom.ac.mu

Abstract—In this paper, the machine learning algorithms have

been applied to distinct features of keystroke dynamics. Machine learning is important for correctly authenticating an individual: through learning, the models determine whether a person is a genuine user or an impostor. The algorithms studied in this work are the Fuzzy Expert System (FES), NeuroEvolution of Augmenting Topologies (NEAT), a proposed variant of NEAT, the Support Vector Machine (SVM) and the Chaotic Neural Network. Among the algorithms applied, the proposed NEAT algorithm performs best in terms of recognition rate.

Keywords—Biometric; Dwell time; Flight time; Keystroke

Dynamics; User Authentication

I. INTRODUCTION

In this digital era, privacy and data security are gaining importance, which is driving the adoption of biometrics over the usual modes of authentication. Keystroke dynamics is a behavioral biometric used to establish identity [1]. It measures the typing rhythm of an individual at a keyboard. Keystroke dynamics has been accepted by users because it is cheap and the only external device required for user authentication is the keyboard [1]. A measurement that purports to belong to a particular entity is compared against the data stored for that entity. When the measurement matches, the assertion is made that the person is who they claim to be. This process is known as authentication and is used to grant users access to data.

The application of keystroke dynamics on mobile devices is now very common; however, the main concern remains the compromise of companies' data. Because employees use desktop and laptop computers for their routine work, this study puts the emphasis on keystroke dynamics using a standard keyboard.

Since user authentication takes place instantaneously, identity fraud cannot be detected afterwards: an attacker who bypasses the authentication system is still considered a genuine user. Once an attacker has successfully forged the keystroke characteristics, the end user must change their password and adopt a new typing pattern. Continuously changing the password is tedious and may lead users to refrain from using the keystroke dynamics system. The classification phase of a biometric system consists of learning from the different datasets and classifying them accordingly. It can follow either a statistical approach or a machine learning approach. With an appropriate classification system, the point of attack, i.e. the attack on the matcher module, can be minimised since score manipulation becomes much more difficult.

In this work, various machine learning algorithms have been applied to keystroke features along with our proposed NEAT algorithm. The objective is to propose an algorithm that significantly raises the recognition rate and at the same time reduces the false acceptance rate and the false rejection rate. Two different databases have been used to train and validate these algorithms. The proposed algorithm has then been compared with existing techniques to determine the most appropriate machine learning technique. Section two provides a literature review. Section three presents the methodology used in the design of the system. Section four presents and discusses the simulation results. Section five concludes the paper, and section six outlines future work.

II. RELATED WORK

Machine learning techniques that have achieved remarkable results when applied to keystroke dynamics features include the Fuzzy Expert System (FES), NeuroEvolution of Augmenting Topologies (NEAT), the Support Vector Machine (SVM) and the Chaotic Neural Network [2][3][4]. Among recognition systems, SVM is one of the most efficient machine learning algorithms and has been widely used for pattern recognition since its introduction in the 1990s [5]. SVM has several advantages: its learning result is robust, over-fitting is uncommon, and training is never trapped in local minima. SVM is a commonly used supervised learning algorithm that works by classifying the data into different classes. Compared to neural networks, SVM has fewer parameters to tune. Sang et al. [6] applied SVM to fused features of keystroke dynamics and achieved 0.02% FAR and 0.1% FRR; however, only 10 user profiles were used in the study. In [7], the authors applied SVM learning to develop an industrial application. The database used to validate the technique had only 100 users, and the performance achieved was a 15.28% equal error rate. Yu and Cho [3] applied SVM as a novelty detector, attaining a recognition rate of 99.19% with an average error rate of 0.81%. Hocquet et al. [8] achieved an ERR of 4.5% by applying SVM.

A Fuzzy Expert System uses a method of reasoning that resembles human reasoning. The FES approach imitates human decision making, which involves all the intermediate possibilities between digital values. The main advantages of FESs are that they can predict good values from limited data and that they are simple and flexible. In [9], the authors applied FESs to keystroke patterns. The performance of the system was computed in terms of false acceptance rate (FAR) and false rejection rate (FRR).

NEAT is another technique that performs well in machine learning. NEAT optimizes both the structure and the weights of an Artificial Neural Network. It can overcome the local optimum issue because it maintains multiple genome structures. In [10], the authors applied NEAT in a keystroke dynamics system, where the learning rate achieved was impressive.

Chaotic Neural Networks have also attracted the attention of researchers lately. The chaos present in neural networks plays an essential role in the storage and retrieval of data in memory. Chaos can also provide an advantage over alternative memory-resolving methods in ANNs. Chaotic systems are easily controlled, as small changes in the system parameters can affect the behaviour of the controlled system. In [11], the authors used a Chaotic Neural Network with keystroke dynamics, applying it to the flight time and dwell time features. Performance was recorded in terms of recognition rate (RR); the RR achieved was 99.2% with 1000 subjects. The main difference between the Chaotic Neural Network and NEAT concerns the tuning of the system. The Chaotic Neural Network follows the conventional ANN, where tuning is done by trial and error until acceptable performance is obtained. In NEAT, tuning is automated: the algorithm evolves and searches for the optimal weights by itself. NEAT has a reputation for performing better than other neuroevolution techniques in various sectors such as gaming and other biometric systems [12][13].

Some of the work carried out on machine learning for keystroke dynamics is detailed in Table I.

TABLE I. APPLICATION OF MACHINE LEARNING TECHNIQUES

| Author | Technique | Result |
|---|---|---|
| Sang et al. [6] | Efficiency of SVM for keystroke dynamics verification | FAR 0.2, FRR 0.1 |
| Yu and Cho [3] | SVM | RR 99.19%, ERR 0.81% |
| Giot et al. [15] | SVM on concatenation of features | Identification rate 95%, ERR 13.45% |
| Killourhy and Maxion [14] | SVM, Fuzzy Logic on KD features | 0.136 FAR, 0.108 FAR |
| Hocquet et al. [8] | SVM applied on hold time | ERR 4.5% |
| Li et al. 2011 | SVM | EER 11.83% |
| Giot and Rosenberger [16] | SVM | Identification rate 95%, ERR 1.401% |
| De Ru and Eloff [17] | Fuzzy Expert System | FAR 2.79%, FRR 7.37% |
| Azavado et al. [18] | SVM | EER 1.57% |

Because of the numerous forms of attack, the machine learning component can be improved further. Hence, the focus of this work is to apply these machine learning algorithms to features of keystroke dynamics other than those currently used in the literature. The learning behaviour of the Fuzzy Expert System, NEAT, the proposed NEAT, the Support Vector Machine (SVM) and the Chaotic Neural Network has been evaluated and analysed. As a novel approach, a new NEAT algorithm has been proposed.

III. PROPOSED ARCHITECTURE FOR KEYSTROKE DYNAMICS

In this research work, the flight time and dwell time of keystroke dynamics have been adopted as the features used to develop the biometric system, unlike other applications. A supervised training method has been used. In this type of learning, labelled samples are presented to the classifier during the learning process. The flowchart in Fig. 1 gives an overview of the design of the application.

FIGURE 1: STEPS INVOLVED (Data Collection, Data Feature Extraction, Feature Subset Selection, Classification)

The steps carried out throughout the experiment are detailed below:

A. Data Capture

To carry out this experiment, we prepared our own datasets, because the environmental conditions vary across the datasets available online. For this experiment, 1000 volunteers from the University of Mauritius provided a total of 30000 typing samples. The software was designed to capture the position of the keys held, the dwell time and the flight time. The distance between the keys held was also captured. A standard QWERTY keyboard was used throughout the experiment. During dataset collection, the environmental conditions were fully monitored: the laboratory was well ventilated, and it was ensured that users had an optimal sitting posture and lighting conditions, among others.

Different types of passwords were chosen so that a variety of datasets could be tested, and they were devised so that the distances between the keys on the keyboard vary. The devised passwords were .tie5Roalnb, .aeihoz246@, .nzkla29zah.# and aeR5t.ilnb. These passwords are categorized as strong passwords [19][20]. The dataset has been divided into three types. The first type contains data where users were allowed to use both hands throughout the capture of the different passwords. The second type was captured by asking users to type with only one hand (their strong hand), so that the position of the keys affects the typing rate. The last type was captured after the emotional state of the user had been influenced, since the performance of a user can be affected by emotional factors.

Applying a technique to only one dataset does not qualitatively show its behaviour. So, for our study, one online dataset that is freely available on the internet was chosen to verify the proposed methodology [14]. To our knowledge, the Killourhy and Maxion dataset is the only dataset that uses a strong password. The password used there is ‘.tie5Roaln’, which resembles the password convention adopted for our password derivation, and the database contains the dwell time as well as the flight time of the digraphs of keys [19][20].

B. Features

Both the flight time and the dwell time have been considered.
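The two features can be illustrated with a minimal sketch. It assumes the common definitions (dwell time is the interval between pressing and releasing a key; flight time is the interval between releasing one key and pressing the next); the paper does not spell out its exact definitions, and the event layout, function names and timestamps below are hypothetical.

```python
from typing import List, Tuple

# Each event is (key, press_time_ms, release_time_ms); this layout is hypothetical.
KeyEvent = Tuple[str, float, float]


def dwell_times(events: List[KeyEvent]) -> List[float]:
    """Dwell time: how long each key is held (release minus press)."""
    return [release - press for _, press, release in events]


def flight_times(events: List[KeyEvent]) -> List[float]:
    """Flight time: gap between releasing one key and pressing the next.

    A negative value means the next key was pressed before the previous
    one was released (overlapping keystrokes).
    """
    return [nxt[1] - cur[2] for cur, nxt in zip(events, events[1:])]


# Made-up timestamps for the first three characters of a password.
sample = [(".", 0.0, 90.0), ("t", 150.0, 230.0), ("i", 300.0, 395.0)]
print(dwell_times(sample))   # [90.0, 80.0, 95.0]
print(flight_times(sample))  # [60.0, 70.0]
```

A password of n characters yields n dwell times and n - 1 flight times under these definitions.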

C. Normalization and Feature Subset Selection

Z-score normalization has been adopted to eliminate unwanted impurities. It has been chosen because it is robust and highly efficient compared to other normalization techniques. For feature subset selection, Ant Colony Optimization has been chosen, as previous research has demonstrated that it works well on keystroke dynamics features [11].
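As a brief illustration of the normalization step, the sketch below rescales one feature to zero mean and unit standard deviation. Whether the authors used the population or the sample standard deviation is not stated; the population form is assumed here, and the function name and sample values are hypothetical. Feature subset selection by Ant Colony Optimization is not shown.

```python
from statistics import mean, pstdev
from typing import List


def z_score(values: List[float]) -> List[float]:
    """Rescale values to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant feature has no spread; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Made-up dwell times in milliseconds.
dwell_ms = [80.0, 90.0, 100.0, 110.0, 120.0]
normalized = z_score(dwell_ms)
```

After this step, features measured on different scales (for example dwell and flight times) become directly comparable.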

D. Classifier

The Fuzzy Expert System, NEAT, the proposed NEAT, the Support Vector Machine and the Chaotic Neural Network have been chosen as the classifiers. The algorithms used are detailed below.

o NeuroEvolution of Augmenting Topologies

A simple NEAT has been developed. The genome structure contains the list of genes, which comprises the neurons, neuron genes, link genes and connections. The link genes contain the information about the interlinks, the weight of each connection and the flag that enables each link. The parent genome can undergo various mutations. Four kinds of mutation can occur throughout the evolutionary process, which can alter both the weights and the structure: (1) adding new connections, (2) perturbing a connection weight, (3) adding new hidden nodes and (4) disabling or enabling genes in the chromosome. For our study, the add-node mutation has been applied, where an existing connection is split and a node is added on the old connection. During this process, the old connection is disabled and two new connections are added to the genome. The weight assigned to the disabled connection is then transmitted to the first gene of the chain. However, the connection linking the new node to the last node still carries the same weight as before the split [13].
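The add-node mutation can be sketched as follows. This sketch follows the standard NEAT formulation, in which the link into the new node receives a weight of 1.0 and the link out of the new node keeps the old weight; the authors' implementation may assign the first weight differently. The class, field and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LinkGene:
    src: int          # id of the source neuron
    dst: int          # id of the destination neuron
    weight: float
    enabled: bool
    innovation: int   # historical marking of the gene


def add_node(genome: List[LinkGene], link_index: int,
             new_node_id: int, next_innovation: int) -> int:
    """Split genome[link_index] by inserting new_node_id on it.

    Returns the next unused innovation number.
    """
    old = genome[link_index]
    old.enabled = False  # the old connection is disabled, not deleted
    # Standard NEAT: the link into the new node gets weight 1.0 ...
    genome.append(LinkGene(old.src, new_node_id, 1.0, True, next_innovation))
    # ... and the link out of the new node keeps the old weight.
    genome.append(LinkGene(new_node_id, old.dst, old.weight, True,
                           next_innovation + 1))
    return next_innovation + 2


genome = [LinkGene(src=0, dst=1, weight=0.7, enabled=True, innovation=1)]
next_innov = add_node(genome, link_index=0, new_node_id=2, next_innovation=2)
```

With these weights the mutated network initially behaves almost like the original one, so the new structure can be optimized without an immediate loss of fitness.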

o Proposed NeuroEvolution of Augmenting Topologies

In this work, the standard NEAT implementation has been optimized by using an AND operator to mate the genomes of different parents. The matching genes are then inherited from the ‘more fit’ parent. This approach helps to improve the overall performance of the system. During the training phase of the evolution, a random population of neural networks was generated. During the evolution process, crossover and mutation are applied in order to produce better offspring. The evolution continues until the target fitness has been reached. Mutation occurs in both parents through the addition of nodes as well as connections in the structure of the network. New genes are assigned increasingly higher numbers. When a connection is added, a single new connection gene is added to the genome and given the next available number. When a node is added, the existing connection is disabled and two new connections are added to the end of the genome. The new node lies between the two new connections.
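One possible reading of the mating step is sketched below. It interprets the AND operator as selecting the genes whose innovation numbers appear in both parents and takes those genes from the fitter parent. This interpretation, the flat genome representation, the `keep_unmatched` flag and all names are assumptions made for illustration, not the authors' implementation.

```python
from typing import Dict

# A genome is modelled as {innovation number: connection weight};
# this flat representation is a simplification for illustration.
Genome = Dict[int, float]


def crossover(parent_a: Genome, fitness_a: float,
              parent_b: Genome, fitness_b: float,
              keep_unmatched: bool = True) -> Genome:
    """Mate two genomes, inheriting matching genes from the fitter parent."""
    if fitness_a >= fitness_b:
        fitter, weaker = parent_a, parent_b
    else:
        fitter, weaker = parent_b, parent_a
    # "AND": innovation numbers present in both parents.
    matching = fitter.keys() & weaker.keys()
    child = {innov: fitter[innov] for innov in sorted(matching)}
    if keep_unmatched:
        # Assumption: disjoint and excess genes also come from the fitter
        # parent, as in standard NEAT.
        for innov in sorted(fitter.keys() - matching):
            child[innov] = fitter[innov]
    return child


parent_a = {1: 0.5, 2: -0.3, 4: 0.8}
parent_b = {1: 0.1, 2: 0.9, 3: 0.4}
child = crossover(parent_a, 0.9, parent_b, 0.6)
print(child)  # {1: 0.5, 2: -0.3, 4: 0.8}
```

Standard NEAT picks each matching gene at random from either parent; always taking it from the fitter parent, as the text describes, makes the offspring less random and closer to the better genome.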

o Support Vector Machine

A typical SVM has been implemented as in [5]. Following that work, the radial basis function (RBF) kernel has been used. This choice was made to handle the probable nonlinearities between the input vectors and their corresponding classes.
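For reference, the RBF kernel measures the similarity of two feature vectors as K(x, y) = exp(-gamma * ||x - y||^2). The sketch below computes it directly; the value of gamma and the sample vectors are hypothetical, as the paper does not report its kernel parameters.

```python
import math
from typing import Sequence


def rbf_kernel(x: Sequence[float], y: Sequence[float], gamma: float) -> float:
    """K(x, y) = exp(-gamma * ||x - y||^2)."""
    squared_distance = sum((xi - yi) ** 2 for xi, yi in zip(x, y))
    return math.exp(-gamma * squared_distance)


# Two made-up normalized timing vectors.
typed_now = [0.2, -0.5, 1.1]
enrolled = [0.1, -0.4, 1.0]
similarity = rbf_kernel(typed_now, enrolled, gamma=0.5)
```

Identical vectors give a similarity of 1, and the value decays towards 0 as the vectors move apart, which is what lets the SVM draw nonlinear decision boundaries.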

o Fuzzy Expert System

The fuzzy system has been implemented as explained in [5]. The premise space consisted of three inputs, and each premise input was segmented by three trapezoidal membership functions, as shown in equation (1):

where the parameters a_{i,j} and d_{i,j} locate the “feet” of the “jth” trapezoid of the “ith” premise input and the parameters b_{i,j} and c_{i,j} locate the “shoulders”.
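Equation (1) is not reproduced in this draft. The sketch below shows the standard trapezoidal membership function that matches the description of feet and shoulders: 0 outside [a, d], rising linearly from a to b, equal to 1 between b and c, and falling linearly from c to d. It is assumed, not confirmed, that the authors' equation has this form; the function name and sample parameters are hypothetical.

```python
def trapezoid(x: float, a: float, b: float, c: float, d: float) -> float:
    """Membership degree of x for a trapezoid with feet a, d and shoulders b, c.

    Requires a <= b <= c <= d.
    """
    if b <= x <= c:
        return 1.0                    # flat top between the shoulders
    if x <= a or x >= d:
        return 0.0                    # outside the feet
    if x < b:
        return (x - a) / (b - a)      # rising edge between a and b
    return (d - x) / (d - c)          # falling edge between c and d


# Made-up trapezoid for a normalized dwell-time input.
degree = trapezoid(1.5, a=1.0, b=2.0, c=3.0, d=4.0)
print(degree)  # 0.5
```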

The fuzzy rule was used to produce an output from its linear function. An estimated value is calculated from equation (2) below. When the estimated value is higher than the threshold, the sample is classified as a genuine user; otherwise it is considered an impostor.

(2)

o Chaotic Neural Network

A multi-layer feed-forward neural network has been used. The layers consisted of input neurons, hidden neurons and an output neuron. Chaos has been introduced into the neural network to limit the search space of the classifier [11]. The sigmoid was chosen as the transfer function because it addresses the nonlinearities in the input data. The typical backpropagation method was used to train the weights. The optimum number of training iterations and the training parameters were set heuristically.

IV. EXPERIMENTAL RESULTS AND EVALUATION

In this section, the results of the data analysis phase are detailed. The experiments were performed with a total of 1000 users and 30000 samples overall. All the samples were used for simulation to ensure that the results reflect the real datasets. Each feature of the whole dataset, that is the flight time and the dwell time, was tested. Tables II, III and IV present the classification results achieved by each of the machine learning algorithms, summarized per feature. For both datasets (own and online), disjoint sets were chosen for training and testing. The experiments were repeated using the same random number generator. During the simulation, the data were divided into equal subsets, where one subset was used for validating the classifier and training was done with the remaining subsets. The process was repeated sequentially until every subset had acted as the validation dataset.
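The validation procedure described above corresponds to k-fold cross-validation. The sketch below generates the train and validation index sets; the number of folds and the seed are not given in the paper, so `k=5` and `seed=0` are hypothetical, as is the function name.

```python
import random
from typing import Iterator, List, Tuple


def k_fold_indices(n_samples: int, k: int,
                   seed: int = 0) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, validation_indices) for each of k folds."""
    indices = list(range(n_samples))
    # A fixed seed keeps the split identical across repeated experiments.
    random.Random(seed).shuffle(indices)
    fold_size, remainder = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # Spread any remainder over the first folds so sizes differ by at most 1.
        stop = start + fold_size + (1 if fold < remainder else 0)
        validation = indices[start:stop]
        train = indices[:start] + indices[stop:]
        yield train, validation
        start = stop


folds = list(k_fold_indices(n_samples=10, k=5))
```

Each sample appears in exactly one validation subset, so every classifier is scored on data it was not trained on in that round.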

The classifiers have been evaluated on the same data, under the same conditions and using the same procedures. Hence, differences in performance can be attributed to the classifiers themselves, and the top performer on a particular dataset can be statistically analyzed.
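The error rates reported in the tables can be computed as sketched below, using the usual definitions: FAR is the share of impostor attempts that are accepted and FRR is the share of genuine attempts that are rejected. The scores, the threshold and the function name are hypothetical, and how the paper derives RR from these quantities is not stated, so RR is not computed here.

```python
from typing import List, Tuple


def far_frr(genuine_scores: List[float], impostor_scores: List[float],
            threshold: float) -> Tuple[float, float]:
    """Return (FAR, FRR) in percent for a score threshold.

    A sample is accepted when its score is strictly higher than the threshold.
    """
    false_accepts = sum(score > threshold for score in impostor_scores)
    false_rejects = sum(score <= threshold for score in genuine_scores)
    far = 100.0 * false_accepts / len(impostor_scores)
    frr = 100.0 * false_rejects / len(genuine_scores)
    return far, frr


# Made-up classifier scores for genuine and impostor attempts.
genuine = [0.91, 0.85, 0.40, 0.77]
impostor = [0.10, 0.35, 0.62, 0.20, 0.05]
far, frr = far_frr(genuine, impostor, threshold=0.5)
print(far, frr)  # 20.0 25.0
```

Raising the threshold lowers FAR at the cost of a higher FRR, which is why both rates are reported together.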

TABLE II. RESULT ON EACH CLASSIFIER FOR OUR INBUILT DATABASE TAKING DISTANCE BETWEEN KEYS

| Dataset | Technique | False Rejection Rate (FRR) | False Acceptance Rate (FAR) | Recognition Rate (RR) |
|---|---|---|---|---|
| Flight Time (Inbuilt) | NEAT | 0.95 | 0.45 | 98.5 |
| Dwell Time (Inbuilt) | NEAT | 0.75 | 0.55 | 97.5 |
| Flight Time (Inbuilt) | Proposed NEAT | 0.25 | 0.15 | 99.1 |
| Dwell Time (Inbuilt) | Proposed NEAT | 0.30 | 0.25 | 98.7 |
| Flight Time (Inbuilt) | SVM | 0.65 | 0.85 | 95.2 |
| Dwell Time (Inbuilt) | SVM | 0.65 | 0.68 | 94.5 |
| Flight Time (Inbuilt) | Fuzzy Expert System | 1.2 | 0.9 | 93.5 |
| Dwell Time (Inbuilt) | Fuzzy Expert System | 1.3 | 0.65 | 93.7 |
| Flight Time (Inbuilt) | Chaotic Neural Network | 0.66 | 0.28 | 94.8 |
| Dwell Time (Inbuilt) | Chaotic Neural Network | 0.30 | 0.25 | 95.2 |

On the inbuilt database where long distances are present between the keys, the proposed NEAT performs best, with the smallest FRR and FAR as well as the best RR. Since the RR of the proposed NEAT is high (> 98%), tampering with the matcher module becomes very difficult, as the system is stable. Another advantage of adopting the proposed NEAT is that, since the FAR achieved is very low, the chance that an intruder accesses the system is minimal compared to the other machine learning systems.

TABLE III. RESULT ON EACH CLASSIFIER FOR OUR INBUILT DATABASE NOT TAKING DISTANCE BETWEEN KEYS

| Dataset | Technique | False Rejection Rate (FRR) | False Acceptance Rate (FAR) | Recognition Rate (RR) |
|---|---|---|---|---|
| Flight Time (Inbuilt) | NEAT | 1.2 | 1.3 | 95.5 |
| Dwell Time (Inbuilt) | NEAT | 1.1 | 0.9 | 95.1 |
| Flight Time (Inbuilt) | Proposed NEAT | 0.52 | 0.75 | 97.2 |
| Dwell Time (Inbuilt) | Proposed NEAT | 0.25 | 0.45 | 97.9 |
| Flight Time (Inbuilt) | SVM | 0.75 | 0.85 | 94.1 |
| Dwell Time (Inbuilt) | SVM | 0.95 | 0.68 | 93.9 |
| Flight Time (Inbuilt) | Fuzzy Expert System | 1.40 | 2.0 | 91.5 |
| Dwell Time (Inbuilt) | Fuzzy Expert System | 1.57 | 1.97 | 91.9 |
| Flight Time (Inbuilt) | Chaotic Neural Network | 0.85 | 0.70 | 96.2 |
| Dwell Time (Inbuilt) | Chaotic Neural Network | 0.72 | 0.65 | 96.8 |

On the inbuilt database where the distance between the keys has not been taken into account, the proposed NEAT again performs best overall, with the best RR and among the lowest FRR and FAR. The second-best results have been obtained with the Chaotic Neural Network.

TABLE IV. RESULT ON EACH CLASSIFIER FOR OUR ONLINE DATABASE

| Dataset | Technique | False Rejection Rate (FRR) | False Acceptance Rate (FAR) | Recognition Rate (RR) |
|---|---|---|---|---|
| Flight Time (Inbuilt) | NEAT | 1.57 | 1.58 | 94.5 |
| Dwell Time (Inbuilt) | NEAT | 1.0 | 0.75 | 93.5 |
| Flight Time (Inbuilt) | Proposed NEAT | 0.60 | 0.59 | 95.2 |
| Dwell Time (Inbuilt) | Proposed NEAT | 0.96 | 0.95 | 95.6 |
| Flight Time (Inbuilt) | SVM | 1.75 | 1.93 | 93.1 |
| Dwell Time (Inbuilt) | SVM | 2.10 | 1.95 | 91.7 |
| Flight Time (Inbuilt) | Fuzzy Expert System | 1.95 (FRR or FAR; only one value is given) | | 88.1 |
| Dwell Time (Inbuilt) | Fuzzy Expert System | 2.52 | 2.63 | 87.2 |
| Flight Time (Inbuilt) | Chaotic Neural Network | 0.95 | 1.10 | 94.2 |
| Dwell Time (Inbuilt) | Chaotic Neural Network | 1.00 | 1.30 | 94.5 |

Different datasets have been used to evaluate the performance of the machine learning algorithms on different keystroke dynamics features. Among the features, the best results in the experiments (on both the online and inbuilt databases) were obtained with the flight time. Among the machine learning algorithms, our proposed NEAT system performs best in terms of RR. The FAR and FRR achieved throughout the experiments are also remarkable.

Among the datasets, our inbuilt database, where the distance between the keys was taken into consideration, achieved better results than the other datasets. Hence, it is advisable for people to choose strong passwords with large distances between the keys so that the password is not easily compromised.

V. CONCLUSION

Each classification technique works differently on each feature. On the proposed keystroke dynamics system, our proposed NEAT algorithm achieved better results using the flight time features. Since the resulting FRR and FAR are low, it is advisable to use NEAT as the classification technique. The security of the keystroke dynamics system is also improved by the remarkable RR, as the system is not easily compromised. NEAT has the ability to handle large amounts of data. Two large databases have been collected and are open for public research. Different features and benchmark algorithms have been tested and summarized. Among the datasets, our proposed dataset, where the distance between the keys has been taken into consideration, yields the best results in terms of recognition rate (RR).

VI. FUTURE WORKS

It would be interesting to study the behaviour of the classification using multi-biometrics, by fusing the features at score level and at template level.

REFERENCES

[1] D. Shanmugapriya and G. A. Padmavathi, “Survey of Biometric keystroke Dynamics: Approaches, Security and Challenges,” International Journal of Computer Science and Information Security, Vol. 5, No. 1, 2009.

[2] M. Sridhar, S. Vaidya, and P. Yawalkar, “Intrusion detection using keystroke dynamics & fuzzy logic membership functions,” In Technologies for Sustainable Development (ICTSD), 2015 International Conference on (pp. 1-10). IEEE, 2015.

[3] E. Yu and S. Cho, “GA-SVM wrapper approach for feature subset selection in keystroke dynamics identity verification,” In Neural Networks, 2003. Proceedings of the International Joint Conference on, Vol. 3, pp. 2253-2257, 2003.

[4] Y. Zhong, Y. Deng, and A. K. Jain, “Keystroke dynamics for user authentication,” In Computer Vision and Pattern Recognition Workshops. IEEE Computer Society Conference. pp. 117-123, 2012.

[5] I. G. Damousis and D. Tzovaras, “Fuzzy fusion of eyelid activity indicators for hypovigilance-related accident prediction,” IEEE Transactions on Intelligent Transportation Systems, vol. 9, no. 3, pp. 491–500, 2008.

[6] T. Kudo and Y. Matsumoto, “Chunking with support vector machines,” In Proceedings of the second meeting of the North American Chapter of the Association for Computational Linguistics on Language technologies 2001 Jun 2 (pp. 1-8). Association for Computational Linguistics, 2001.

[7] R. Giot, M. El-Abed, B. Hemery, and C. Rosenberger, “Unconstrained keystroke dynamics authentication with shared secret,” Computers & Security, 30(6-7), 427-445, 2011.

[8] S. Hocquet, J.-Y. Ramel, and H. Cardot, “User classification for keystroke dynamics authentication,” in The Sixth International Conference on Biometrics (ICB2007), pp. 531–539, 2007.

[9] B. Scholkopf and A. Smola, “Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond,” MIT Press, vol. 1, p. 2, 2002.

[10] E. Hastings, R. Guha, and K. O. Stanley, “Neat particles: Design, representation, and animation of particle system effects,” In Computational Intelligence and Games, 2007. CIG 2007. IEEE Symposium on (pp. 154-160). IEEE, 2007.

[11] P. Baynath, K. M. S. Soyjaudah, and M. Heenaye-Mamode Khan, “Keystroke recognition using chaotic neural network,” Intelligent Systems and Signal Processing (ICSPIS), 2017 3rd Iranian Conference on. IEEE, 2017.

[12] P. Baynath, K. M. S. Soyjaudah, and M. Heenaye-Mamode Khan, “Keystroke Recognition Using Neural Network,” 5th International Symposium on Computational and Business Intelligence (ISCBI), IEEE, 2017.

[13] H. Mohabeer and K. S. Soyjaudah, “Application of Predictive Coding in Neuroevolution,” International Journal of Computer Applications, 114(2), 2015.

[14] K. Killourhy and R. Maxion, “Comparing anomaly-detection algorithms for keystroke dynamics,” IEEE/IFIP International Conference on Dependable Systems & Networks, DSN’09, pp. 125–134, Jul. 2009.

[15] R. Giot, M. El-Abed, and C. Rosenberger, “Greyc keystroke: a benchmark for keystroke dynamics biometric systems,” In Biometrics: Theory, Applications, and Systems, 2009. BTAS'09. IEEE 3rd International Conference on (pp. 1-6). IEEE, 2009.

[16] R. Giot and C. Rosenberger, “A new soft biometric approach for keystroke dynamics based on gender recognition,” International Journal of Information Technology and Management, 11(1-2), 35-49, 2012.

[17] W. G. De Ru and J. H. Eloff, “Enhanced password authentication through fuzzy logic,” IEEE Expert, 12(6), 38-45, 1995.

[18] M. J. Cardoso, J. Cardoso, N. Amaral, I. Azevedo, L. Barreau, M. Bernardo, and J. Johansen, “Turning subjective into objective: the BCCT.core software for evaluation of cosmetic results in breast cancer conservative treatment,” The Breast, 16(5), 456-461, 2007.

[19] P. Baynath, K. M. S. Soyjaudah, and M. Heenaye-Mamode Khan, “Improving Security Of Keystroke Dynamics By Increasing The Distance Between Keys,” In Proceedings of 3rd World Congress on Computer Applications and Information Systems 2016, DOI: 08. WCCAIS.2016.1.10, 2016.

[20] P. Baynath, K. M. S. Soyjaudah, and M. Heenaye-Mamode Khan, “Implementation of a Secure Keystroke Dynamics using Ant colony optimisation,” The International Conference on Communications Computer Science and Information Technology 2016, 2016.

Performance Analysis of Free Text Keystroke Authentication using XGBoost

Conference Paper

Full-text available

Mar 2023

Cite this as: I. Kuzminykh, S. Mathur, B. Ghita (2023). Performance Analysis of Free Text Keystroke Authentication using XGBoost. In Proceedings of 6th International Conference on Computer Science, Engineering and Education Applications (ICCSEEA2023), March 17–19, 2023,Warsaw, Poland.. Authentication based on keystroke dynamics is a form of behavioral biometric authentication that uses the user typing patterns and keyboard interaction as a discriminatory input. This type of authentication can be coupled with a fixed text password in a traditional login system to contribute to a multifactor authentication or provide continuous user authentication in a usable security system, where the typing patterns are continuously analysed to validate the user at run time. This paper investigates the effectiveness of free text keystroke for continuous authentication in real-world systems. Evaluation is performed using XGBoost multiclass classification, applied to an unbalanced free-text keystroke dataset. The introduction of additional activity-based features and removal of inaccuracies in the timing between keys allowed a reduction of the EER for the Clarkson II dataset from 14-24%, as achieved by previous studies, to 8% when employing the proposed method.

Machine Learning and Deep Learning for Fixed-Text Keystroke Dynamics

Preprint

Full-text available

Jul 2021

Keystroke dynamics can be used to analyze the way that users type by measuring various aspects of keyboard input. Previous work has demonstrated the feasibility of user authentication and identification utilizing keystroke dynamics. In this research, we consider a wide variety of machine learning and deep learning techniques based on fixed-text keystroke-derived features, we optimize the resulting models, and we compare our results to those obtained in related research. We find that models based on extreme gradient boosting (XGBoost) and multi-layer perceptrons (MLP)perform well in our experiments. Our best models outperform previous comparable research.

Understanding insiders in cloud adopted organizations: A survey on taxonomies, incident analysis, defensive solutions, challenges

Article

Apr 2024
FUTURE GENER COMP SY

Performance Analysis of Free Text Keystroke Authentication Using XGBoost

Chapter

Aug 2023

Authentication based on keystroke dynamics is a form of behavioral biometric authentication that uses the user typing patterns and keyboard interaction as a discriminatory input. This type of authentication can be coupled with a fixed text password in a traditional login system to contribute to a multifactor authentication or provide continuous user authentication in a usable security system, where the typing patterns are continuously analysed to validate the user at run time. This paper investigates the effectiveness of free text keystroke for continuous authentication in real-world systems. Evaluation is performed using XGBoost multiclass classification, applied to an unbalanced free-text keystroke dataset. The introduction of additional activity-based features and removal of inaccuracies in the timing between keys allowed a reduction of the EER for the Clarkson II dataset from 14–24%, as achieved by previous studies, to 8% when employing the proposed method. Keywords: Usable security; Keystroke dynamics; Continuous authentication; XGBoost

A Systematic Literature Review on Multimodal Machine Learning: Applications, Challenges, Gaps and Future Directions

Article

Jan 2023

Multimodal machine learning (MML) is a multidisciplinary research area where heterogeneous data from multiple modalities and machine learning (ML) are combined to solve critical problems. Usually, research works use data from a single modality, such as images, audio, text, and signals. However, real-world issues have become critical now, and handling them using multiple modalities of data instead of a single modality can significantly impact finding solutions. ML algorithms play an essential role by tuning parameters in developing MML models. This paper reviews recent advancements in the challenges of MML, namely representation, translation, alignment, fusion and co-learning, and presents the gaps and challenges. A systematic literature review (SLR) was applied to define the progress and trends on those challenges in the MML domain. In total, 1032 articles were examined in this review to extract features like source, domain, application, modality, etc. This research article will help researchers understand the current state of MML and navigate the selection of future research directions.

Transfer Learning for Behavioral Biometrics-based Continuous User Authentication

Conference Paper

Jul 2022

Online User Authentication System Using Keystroke Dynamics

Article

Aug 2022

Nowadays, people are increasingly connected to the internet through their mobile devices. They tend to use their critical and sensitive data across many applications. These applications provide security via user authentication. Authentication by passwords is a reliable and efficient access control procedure, but it is not sufficient. Additional procedures are needed to enhance the security of these applications. Keystroke dynamics (KSD) is one of the common behavior-based systems. KSD rhythm uses combinations of timing and non-timing features that are extracted and processed from several devices. This work presents a novel authentication approach based on two factors: password and KSD. It also presents an extensive comparative analysis of authentication systems based on KSD. It proposes a keyboard prototype to collect timing and non-timing information from KSD. Hence, the proposed approach uses timing and several non-timing features. These features have been shown to play a significant role in improving the performance measures of KSD behavioral authentication systems. Several experiments were conducted and show an acceptable level of performance as a second authentication factor. The approach has been tested using multiple classifiers. With the Random Forest classifier, the approach reached a 0% error rate with 100% accuracy for classification.

A Systematic Literature Review on Latest Keystroke Dynamics Based Models

Article

Jan 2022

The purpose of this study is to conduct a comprehensive evaluation and analysis of the most recent studies on the implications of keystroke dynamics (KD) patterns in user authentication, identification, and the determination of useful information. Another aim is to provide an extensive and up-to-date survey of the recent literature and potential research directions to understand the present state-of-the-art methodologies in this particular domain that are expected to be beneficial for the KD research community. From January 1st, 2017 to March 13th, 2022, six popular electronic databases were searched using the search criterion (“keystroke dynamics” OR “typing pattern”) AND (“authentication” OR “verification” OR “identification”). With this criterion, a total of nine thousand three hundred forty-eight results, including duplicates, were produced. However, one thousand five hundred forty-seven articles were chosen after removing duplicates and preliminary screening. Due to insufficient information, only one hundred twenty-seven high-quality quantitative research articles have been included in the article selection process. We compared and summarised several factors with multiple tables to comprehend the various methodologies, experimental settings, and findings. In this study, we have identified six unique KD-based designs and presented the status of findings toward an effective solution in authentication, identification, and prediction. We have also discovered considerable heterogeneity across studies in each KD-based design for desktops and smartphones separately. Finally, this paper found a few open research challenges and provided some indications for a deeper understanding of the issues and further study.

Machine Learning and Deep Learning for Fixed-Text Keystroke Dynamics

Chapter

Feb 2022

Keystroke dynamics can be used to analyze the way that users type by measuring various aspects of keyboard input. Previous work has demonstrated the feasibility of user authentication and identification utilizing keystroke dynamics. In this research, we consider a wide variety of machine learning and deep learning techniques based on fixed-text keystroke-derived features, we optimize the resulting models, and we compare our results to those obtained in related research. We find that models based on extreme gradient boosting (XGBoost) and multi-layer perceptrons (MLP) perform well in our experiments. Our best models outperform previous comparable research.

Design of biometric system and modeling of machine learning for entering the information system

Conference Paper

Sep 2021

Keystroke recognition using chaotic neural network

Conference Paper

Dec 2017

Keystroke dynamics, which distinguishes an individual by typing rhythm, is the most prevalent behavioral biometric authentication system. Neural networks are an active research area in which different approaches have been presented. This paper presents a keystroke dynamics biometric system that uses a chaotic neural network for dimensionality reduction and pattern recognition of the individual. Biometric schemes are being used extensively because of their security advantages over earlier authentication systems, whose credentials were easily lost, guessed, or forgotten. A biometric is more complex than a password and is unique to each individual. In this work, the focus is on the dwell time and flight time of the users' typing to recognize a user or reject an imposter. The recognition rate obtained with the chaotic neural network was 99.1%.

Keystroke recognition using neural network

Conference Paper

Aug 2017

This paper presents a keystroke dynamics biometric system that uses a neural network as its classifier to recognize an individual. Biometric schemes are being widely used because of their security merits over earlier authentication systems, whose credentials were easily lost, guessed, or forgotten. A biometric is more complex than a password and is unique to each individual. Keystroke dynamics, which distinguishes an individual by typing rhythm, is the most prevalent behavioral biometric authentication system. In this work, the focus is on the dwell time and flight time of the users' typing to recognize a user or reject an imposter. A multilayer perceptron (MLP) neural network is used to train on and authenticate the features. The neural network classifier is used to evaluate the features of the user. Based on the recognition rate of 98.5% achieved, the fusion of keystroke dynamics features with a neural network has proved to be a promising technique.

A multi-verse optimizer approach for feature selection and optimizing SVM parameters based on a robust system architecture

Article

Oct 2018
NEURAL COMPUT APPL

Support vector machine (SVM) is a well-regarded machine learning algorithm widely applied to classification tasks and regression problems. SVM was founded based on the statistical learning theory and structural risk minimization. Despite the high prediction rate of this technique in a wide range of real applications, the efficiency of SVM and its classification accuracy highly depends on the parameter setting as well as the subset feature selection. This work proposes a robust approach based on a recent nature-inspired metaheuristic called multi-verse optimizer (MVO) for selecting optimal features and optimizing the parameters of SVM simultaneously. In fact, the MVO algorithm is employed as a tuner to manipulate the main parameters of SVM and find the optimal set of features for this classifier. The proposed approach is implemented and tested on two different system architectures. MVO is benchmarked and compared with four classic and recent metaheuristic algorithms using ten binary and multi-class labeled datasets. Experimental results demonstrate that MVO can effectively reduce the number of features while maintaining a high prediction accuracy

Improving Security Of Keystroke Dynamics By Increasing The Distance Between Keys

Conference Paper

Jan 2016

Keystroke dynamics is gaining popularity, and researchers are striving to improve existing techniques or to explore aspects that have not been given much attention. In this paper, we provide a new means of authentication for keystroke dynamics by using a password with different distances between the keys. The classifier used in this paper is a neural network. The mean square error has been used to compute the performance of the classifier. After analysis and evaluation of the results, it was deduced that the distance between keys on a keyboard affects the reliability of the password. The mean square error of the most widely spaced digraph was in the range of 15.5×10⁻³ to 107.6×10⁻³, and the least distant digraph has a mean square error range of 8.5×10⁻⁸ to 3.7×10⁻⁹. It is thus observed that the smaller the distance between the keys of the password used, the easier it is to compromise the keystroke pattern compared with a larger distance between keys. Hence, it can be concluded that the larger the distance between the keys, the greater the security.

Application of Predictive Coding in Neuroevolution

Article

Mar 2015

Heman Mohabeer

This paper presents promising results achieved by applying a new coding scheme based on predictive coding to neuroevolution. The technique proposed exploits the ability of a bit, which contains sufficient information, to represent its neighboring bits. In this way, a single bit represents not only its own information, but also that of its neighborhood. Moreover, whenever there is a change in bit representation, it is determined by a threshold value that determines the point at which the change in information is significant. The main contributions of this work are the following: (i) the ratio of the number of bits to the amount of information content is reduced; (ii) the complexity of the overall system is reduced as there are fewer bits to process; (iii) the coding scheme is successfully applied to NEAT, which is used as a biometric classifier for the authentication of keystroke dynamics.

Keystroke dynamics for user authentication

Conference Paper

Jun 2012

In this paper we investigate the problem of user authentication using keystroke biometrics. A new distance metric that is effective in dealing with the challenges intrinsic to keystroke dynamics data, i.e., scale variations, feature interactions and redundancies, and outliers is proposed. Our keystroke biometrics algorithms based on this new distance metric are evaluated on the CMU keystroke dynamics benchmark dataset and are shown to be superior to algorithms using traditional distance metrics.

Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond

Book

Jan 2018

A comprehensive introduction to Support Vector Machines and related kernel methods. In the 1990s, a new type of learning algorithm was developed, based on results from statistical learning theory: the Support Vector Machine (SVM). This gave rise to a new class of theoretically elegant learning machines that use a central concept of SVMs (kernels) for a number of learning tasks. Kernel machines provide a modular framework that can be adapted to different tasks and domains by the choice of the kernel function and the base algorithm. They are replacing neural networks in a variety of fields, including engineering, information retrieval, and bioinformatics. Learning with Kernels provides an introduction to SVMs and related kernel methods. Although the book begins with the basics, it also includes the latest research. It provides all of the concepts necessary to enable a reader equipped with some basic mathematical knowledge to enter the world of machine learning using theoretically well-founded yet easy-to-use kernel algorithms and to understand and apply the powerful algorithms that have been developed over the last few years.

Chunking with Support Vector Machines.

Article

Oct 2002

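In this paper, we propose a general chunk identification method based on Support Vector Machines (SVMs) and evaluate it. Compared with conventional learning models, SVMs have a high generalization ability that does not depend on the number of input dimensions, and by introducing a kernel function they can learn classification problems while efficiently taking combinations of features into account. We applied SVMs to the problem of identifying English base noun phrases and other phrases, and an analysis using real tagged data showed higher accuracy than conventional methods. Furthermore, weighted majority voting over multiple models that use different chunk representations yielded a further improvement in accuracy.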

Intrusion detection using keystroke dynamics & fuzzy logic membership functions

Conference Paper

Feb 2015

If the password is compromised, either due to it being weak or someone getting to know it through other means, the system cannot detect it. To overcome this problem, we propose a system that can detect whether the current user is the authorized user, a substitute user, or an intruder pretending to be a valid user. The system therefore checks the identity of the user by their behaviour pattern, using keystroke dynamics to authenticate the user. A number of samples of login and password attempts by each user are gathered and stored in a database. From the samples collected, keystroke patterns called feature sets are derived, and signatures are formed for each user using fuzzy logic algorithms. Once signatures are formed, users are authenticated by comparing their typing pattern to the respective signatures. We study the performance of such a system based on features like False Acceptance Rate (FAR) and False Rejection Rate (FRR), thus evaluating the efficiency of the system. [1]

User Classification for Keystroke Dynamics Authentication

Conference Paper

Aug 2007

In this paper, we propose a method to classify keystroke dynamics users before performing user authentication. The objective is to automatically set the individual parameters of the classification method for each class of users. Features are extracted from each user's learning set, and then a clustering algorithm divides the user set into clusters. A set of parameters is estimated for each cluster. Authentication is then performed in a two-step process. First, the user is associated with a cluster; second, the parameters of this cluster are used during the authentication step. This two-step process provides better results than a system using global settings.
