
A robust machine learning structure for driving events recognition using smartphone motion sensors

July 2022
Journal of Intelligent Transportation Systems Technology Planning and Operations 28(2):1-15


DOI:10.1080/15472450.2022.2101109

Authors:

Mahdi Zarei Yazd

Khaje Nasir Toosi University of Technology

Iman Taheri

University of Melbourne

Driving behavior monitoring by smartphone sensors is one of the most investigated approaches to improving road safety. Various methods have been adopted in the literature; however, to the best of our knowledge, their robustness in predicting new, unseen data from different drivers and road conditions has not been explored. In this paper, a two-phase Machine Learning (ML) method that takes advantage of high-pass, low-pass, and wavelet filters is developed to detect brakes and turns. In the first phase, filtered accelerometer and gyroscope time series are fed into Random Forest and Artificial Neural Network classifiers, and suspicious intervals are extracted with high recall. In the second phase, statistical features calculated from the obtained intervals are used to separate false positive from true positive events. To compare the predicted and real labels of the recorded events and calculate the accuracy, a method that addresses the limitations of previous sliding-window approaches is also employed. Real-world experimental results show that the proposed method can predict new, unseen datasets with average F1-scores of 71% in brake detection and 82% in turn detection, which is comparable with previous works. Moreover, a sensitivity analysis of the proposed model shows that applying high-pass and low-pass filters can affect turn detection accuracy by up to 30%.



Mahdi Zarei Yazd, Iman Taheri Sarteshnizi, Amir Samimi & Majid Sarvi

To cite this article: Mahdi Zarei Yazd, Iman Taheri Sarteshnizi, Amir Samimi & Majid Sarvi (2022): A robust machine learning structure for driving events recognition using smartphone motion sensors, Journal of Intelligent Transportation Systems, DOI: 10.1080/15472450.2022.2101109

To link to this article: https://doi.org/10.1080/15472450.2022.2101109

Published online: 24 Jul 2022.



Mahdi Zarei Yazd, Iman Taheri Sarteshnizi, Amir Samimi, and Majid Sarvi

Department of Civil Engineering, Sharif University of Technology, Tehran, Iran; Department of Infrastructure Engineering, University of Melbourne, Melbourne, Australia; School of Civil Engineering, University of Sydney, Sydney, Australia


ARTICLE HISTORY

Received 4 November 2021

Revised 4 July 2022

Accepted 10 July 2022

KEYWORDS

driving behavior; driving monitoring; machine learning; smartphone sensor

Introduction

Road accidents cost most countries approximately 3% of their GDP, and 1.35 million people die annually because of these crashes. Among the main causes, speeding, drunk driving, and distracted driving are evidenced to be the largest contributors (World Health Organization, 2018). Drivers tend to perform some commonly known maneuvers, like braking and turning, more frequently in these conditions. Preventing them from driving in such situations would considerably help to avoid road accidents. Formerly, researchers focused on traffic policing to improve driving safety (Bates et al., 2012). However, with the appearance of in-vehicle sensors (Yuksel & Atmaca, 2021) and the spread of smartphones, developing platforms for driving monitoring became a hotspot in this area (Siami et al., 2021; Toledo et al., 2008). In addition to safety purposes, driving style monitoring also contributes to lower fuel consumption and leads to eco-driving (Jamson et al., 2015; Tanvir et al., 2021). Providing proper and timely feedback to drivers has proved to be effective, and it motivates them to pay more attention to their performance while driving. Therefore, a precise and robust method for this aim would lead to fewer daily accidents and lower fuel consumption.

Several methodologies for driving behavior monitoring have been introduced and tested in the literature, and excellent results have been achieved with them on different datasets. In these datasets, motion data of individual vehicles, such as acceleration, velocity, angular velocity, and orientation, is recorded using OBD (On-Board Diagnostic) devices or smartphones. The behavior of the drivers is also recorded alongside the data collection, using questionnaires or by labeling some specific events, to investigate the validity of the methodologies (Chan et al., 2020; Kazemeini et al., 2022). Despite such efforts, the robustness of previous works has remained an open area of research. A detection method is called robust if its performance does not decline significantly when it is applied to situations with different characteristics. For instance, in driving behavior monitoring using smartphones, the driver, smartphone type, and route condition are some of the parameters that affect the data collected by smartphone motion sensors, and changing these parameters may affect the results of trained models. The robustness of previous detection methods has been overlooked in the literature. In this paper, we developed a driving behavior detection method that performs well even on a test set whose characteristics differ from those of the training set. In the following paragraphs, we discuss the literature on driving behavior detection in depth and elaborate on the contributions of our method.

CONTACT Iman Taheri Sarteshnizi itaherisarte@student.unimelb.edu.au Department of Infrastructure Engineering, University of Melbourne, Melbourne, Victoria, Australia

© 2022 Taylor & Francis Group, LLC

Broadly, driving behavior recognition models fall into two categories: unsupervised and supervised models. The former deal with a huge mass of unlabeled data in which the ground truth of the targets (driving events) is not specified. For instance, Eftekhari and Ghatee (2018) developed a system that recognizes driving events by feeding the Discrete Wavelet Transform (DWT) of the time series data into an Adaptive Neuro-Fuzzy Inference System (ANFIS). Yao et al. (2021) also used DTW (Dynamic Time Warping) and an HMM (Hidden Markov Model) to cluster driver behavior. Although these researchers fully exploit the massive real-world data collected by smartphones, the accuracy of their systems remains open to doubt.

In supervised structures, which are the focus of this paper, multiple labeled instances of different driving events are available in the collected data. Data collection for designing such models is challenging and labor-intensive; however, one can quantify the prediction performance using the labeled instances and use it to compare the results in different scenarios. Methods implemented in this area are divided into three distinct groups, namely threshold-based, pattern matching-based, and learning-based methods. Detailed information regarding some of the main studies in this field is presented in Table 1. Threshold-based methods such as Chhabra et al. (2018) are simple and easy to implement, but very case dependent. Pattern matching-based methods such as Dynamic Time Warping (DTW) (Singh et al., 2017) measure the similarity between two signals. In these algorithms, one must define template signals, and the models' performance depends heavily on the selected templates (Chan et al., 2020).

Learning-based structures can learn and construct predictive models by exploiting a large amount of training data, which enables them to find more complex patterns than the other approaches. For example, Yu et al. (2017) used an Artificial Neural Network (ANN) and a Support Vector Machine (SVM) to classify six types of driving events using a total of 4029 labeled driving events. Bejani and Ghatee (2018) developed an ensemble learning method containing a Decision Tree (DT), SVM, ANN, and K-Nearest Neighbors (KNN) to evaluate the different driving styles of 27 drivers. Nuswantoro et al. (2020) also adopted an ANN algorithm, sliding a five-second time window over the collected data to identify different driving behaviors. Furthermore, other researchers such as Zhang et al. (2019), Wang et al. (2021), and Saleh et al. (2017) designed deep learning algorithms for this aim. To elaborate, Saleh et al. (2017) used Long Short-Term Memory (LSTM) to recognize normal, aggressive, and drowsy driving, applying a sliding window with 50 percent overlap. Zhang et al. (2019) also designed an attention-based convolutional and recurrent neural network to classify driving events like brakes and turns. Deep learning algorithms have been shown to be more powerful than simple ML methods, since their complicated nonlinear structure lets them capture more complex time-related features of multidimensional data (Shinde & Shah, 2018); however, they are more useful when a massive number of driving events are labeled.

Table 1. Summary of the main previous studies.

| Type | Reference | Method | EE method | Events | Testing approach |
|---|---|---|---|---|---|
| Threshold-based | (Chhabra et al., 2018) | Fixed Threshold | – | Acc, Brake, Turn | Not mentioned |
| Pattern matching-based | (Singh et al., 2017) | DTW | Using threshold | LC, Acc, Brake, Turn | Not mentioned |
| Learning-based | (Bejani & Ghatee, 2018) | DT | Sliding window | LC, Turn, U-Turn | Cross-validation |
| | (Carlos et al., 2020) | Bag of Words + ANN | Sliding window | Acc, Swerving, Brake | Train/Test split |
| | (Yu et al., 2017) | ANN, SVM | Not mentioned | Weaving, Swerving, Side slipping, U-Turn, Turn, Brake | Train/Test split |
| | (Xie et al., 2018) | RF | Sliding window | LC, Acc, Brake, Turn | Train/Test split |
| | (Ferreira et al., 2017) | RF, ANN, SVM, BN | Sliding window | LC, Acc, Turn, Brake | Not mentioned |
| | (Carvalho et al., 2017) | RNN | Sliding window | Acc, LC, Turn | Train/Test split |
| | (Saleh et al., 2017) | RNN | Sliding window | Normal, Aggressive, Drowsy Driving | Train/Test split |
| | (Wang et al., 2021) | CNN + RNN | Separated events | Acc, Brake, Turn | Different datasets |
| | (Zhang et al., 2019) | CNN + RNN | Sliding window | Brake, Acc, Turn | Cross-validation |

The event extraction method, data filtering, and testing approach are three key factors that deserve more discussion in previous learning-based models. In terms of event extraction, some previous works such as Yu et al. (2017), Ma et al. (2019), and Wang et al. (2021) collected separated driving events, and their aim was only to classify, not to recognize, different driving behaviors. They assumed that the driving events had already been extracted from the raw smartphone time series, and focused only on determining the type of these extracted events. Other studies by Eftekhari and Ghatee (2019) and Carlos et al. (2020) used sliding windows to split the whole time series and apply learning-based algorithms. Most recent papers use a sliding window; nevertheless, some drawbacks of this method are mentioned in the literature (Ouyang et al., 2018). For instance, some driving events may not be entirely captured by rolling a window over a time series, and assigning labels to the slices of a time series becomes a controversial task in a supervised dataset.

Data collected by smartphone sensors contain noise that may affect the results of detection methods (Wu et al., 2018). Different filtering approaches have been implemented in the literature to denoise smartphone sensor data, namely the simple moving average filter (Johnson & Trivedi, 2011), band-pass filter (Singh et al., 2017), low- and high-pass filters (Chhabra et al., 2018), and wavelet filter (Eftekhari & Ghatee, 2018). Previous papers used these approaches in their analyses; however, the impact of these denoising filters on accuracy has not been reported.

From the perspective of the testing approach, most of the conducted studies (Carvalho et al., 2017; Xie et al., 2018) randomly divided their datasets into train and test sets to show the performance of the designed models. Some other studies (Nguyen et al., 2020; Sánchez et al., 2018) used cross-validation to address the problem of overfitting. In these studies, the number of recorded labeled events per event type hardly reaches 100 (Carvalho et al., 2017; Daptardar et al., 2015; Eftekhari & Ghatee, 2019; Ferreira et al., 2017; Nuswantoro et al., 2020; Xie et al., 2018). The point here is that the predictive models in the literature have not been shown to be robust, nor have they been examined with rich train and test datasets. In other words, it is not clear whether these models still perform well when applied to other unseen drivers, mobile phones, or routes. Some works have developed personalized driving behavior monitoring systems (Yi et al., 2019), but a transferable and robust model that can be used for different drivers has not yet been explored.

In brief, the literature on driving behavior detection using smartphone motion data can be improved in three distinct ways: (1) providing a proper event extraction method for supervised datasets rather than using sliding windows, (2) examining the effect of data filtering approaches on detection accuracy, and (3) testing the reliability (or robustness) of the results using different testing scenarios. To address these gaps, we developed a two-phase ML structure for driving behavior recognition and demonstrated the proficiency and robustness of our model with different testing scenarios. Furthermore, a supervised dataset including 4193 braking and 1434 turning samples was collected with more than 40 drivers and 12 different smartphones to design multiple train/test scenarios. The findings of this paper suggest that our proposed model is practical, especially in situations where a new driver with a new smartphone joins a monitoring system and it is necessary to detect his/her behavior at an early stage without using his/her historical data. The main contributions of this research are highlighted below:

- An event extraction phase utilizing Random Forest and Multilayer Perceptron (ANN) classifiers, along with an algorithm for quantifying the performance of this phase, is proposed to overcome the limitations of traditional sliding windows.
- We trained and tested our proposed method with multiple datasets that differ completely from each other in terms of environmental properties, to demonstrate the robustness of our method.
- Unlike other studies, the accuracy of our method in recognizing braking and turning events is measured and discussed with a rich supervised dataset, and the marginal contributions of different input components in our method, specifically the denoising filters, are also quantified.

The rest of this paper is organized as follows. In the next section, the data collection method is described. In the Method section, our structure for brake and turn detection is presented. In the Result section, the results achieved by this research are shown and discussed. Finally, a summary of the findings is given in the Summary and conclusion section.


Data

To record the drivers' behavior, the accelerometer and gyroscope sensors embedded in different smartphones are utilized. These two sensors record motion data of the smartphones in three dimensions (x, y, and z, indicated in Figure 1(a)). The physical and gravitational acceleration (m/s²) of the device is collected by the accelerometer, and the angular velocity (rad/s) about the three axes of the smartphone is stored by the gyroscope. Since the coordinate system of the smartphone should be aligned with the vehicle direction, every device must be fixed to the car during data collection. In our experiment, smartphones were attached to the vehicles so that the z-axis points toward the sky and the y-axis is aligned with the direction of the car's movement (Figure 1(b)). To make sure that the smartphones were not affected by unwanted movements inside the vehicles, we used equipment such as mobile holders or adhesive tape to firmly attach the smartphones to the vehicles. Moreover, smartphones in our experiment were placed mostly on the vehicle's center console, over the glove box (using a holder), or on the small ledge provided on car doors, as these locations were the most used in the literature (Carlos et al., 2020; Carvalho et al., 2017; Ferreira et al., 2017; Yu et al., 2017; Zhang et al., 2019).

For the data collection phase of our research, 11 people were trained to label driving events (brakes and turns) while sitting next to the driver during the trip. We also asked the drivers to announce every brake and turn maneuver before performing it, to improve the correctness of the recorded labels. An Android-based application was developed to simultaneously record the accelerometer and gyroscope data along with the assigned labels (Figure 2). Data collectors were supposed to tap a brake or turn button provided in the application when a driving event was about to take place and tap it off when the maneuver was finished. Before recording the data, passengers were able to select the desired sensors (accelerometer and gyroscope in our case) for data collection (Figure 2(a)).

All the experiments were executed in an urban environment with 42 different drivers and 12 types of smartphones (from 9 different brands). Nine of the participating drivers were between 21 and 35 years old, eight were between 36 and 49, and the rest were experienced taxi drivers working within the city who were more than 50 years of age. The reason for using these different characteristics is to test the robustness of our proposed model and to address the generalizability problem of the results, which has been overlooked by previous models in the literature. We partitioned our dataset into six groups according to the smartphone brands. A description of the data gathered in each group is given in Table 2. In total, 245 trips were recorded, with an average duration of 10 minutes, including 5627 driving events (brakes and turns).

Method

A two-phase ML-based structure is developed in this paper to recognize brakes and turns during a trip when smartphone sensor data is available. A general picture of this method is illustrated in Figure 3. The Event Extraction Phase (Phase 1) is designed to find the potential driving event intervals. Since some outputs of the first phase may be false positive events, the second phase is designed to better separate the false and true positive intervals from the previous phase. The filters and classifiers used in our method, as well as our proposed structure, are discussed further in this section.

Filters

Fourier-based low-pass and high-pass filters, as well as wavelet filters, are applied in our method to eliminate the undesirable noise recorded by smartphones. Although only a few studies in the area of driving behavior detection, such as Eftekhari and Ghatee (2018), applied them to smooth smartphone data instead of a simple or exponential moving average (Yu et al., 2017), these filters are widely used in different areas related to time series analysis and have shown promising results over the basic filters (Haque et al., 2016; Malghan & Hota, 2020). Therefore, we follow the approach of Eftekhari and Ghatee (2018), also include Fourier-based high- and low-pass filters, and show that their contributions to improving the detection accuracy are considerable.

Figure 1. (a) Smartphone coordinate system, (b) Smartphone placement in the vehicle during the experiment.

The Fourier transform, which plays a crucial role in the high- and low-pass filters, is formulated as in (1):

$$F(p) = \sum_{t=0}^{n-1} f(t)\,\exp\!\left(-\frac{2\pi i\,p\,t}{n}\right) \tag{1}$$

where $F(p)$ is the Fourier transform of the time series $f(t)$, $p$ is the specified frequency ($p = 0, \ldots, n-1$), and $n$ is the number of observations in $f(t)$. In a low-pass filter, the frequency components of a time series are first decomposed by the Fourier transform. Then, the components with frequencies lower than the cutoff frequency are passed, and the others are discarded. With this filter, only the low-frequency patterns of the time series remain, and the noise contributing to the high-frequency components is smoothed out. The high-pass filter works similarly, except that, conversely, the components with frequencies higher than the cutoff are passed.
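The low-pass and high-pass filtering described above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function names, the sampling rate, and the cutoff value are hypothetical, and the transform is written as a direct evaluation of (1) for readability (a practical implementation would normally use an FFT routine).

```python
import cmath
import math


def dft(signal):
    """Direct evaluation of Eq. (1): F(p) = sum_t f(t) * exp(-2*pi*i*p*t/n)."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * math.pi * p * t / n) for t in range(n))
        for p in range(n)
    ]


def inverse_dft(spectrum):
    """Inverse transform; returns the real part of the reconstructed series."""
    n = len(spectrum)
    return [
        (sum(spectrum[p] * cmath.exp(2j * math.pi * p * t / n) for p in range(n)) / n).real
        for t in range(n)
    ]


def fourier_filter(signal, sample_rate_hz, cutoff_hz, kind="low"):
    """Keep components below (kind='low') or above (kind='high') the cutoff.

    sample_rate_hz and cutoff_hz are illustrative parameters (assumptions).
    """
    n = len(signal)
    spectrum = dft(signal)
    for p in range(n):
        # Bins p and n - p form a conjugate pair with the same physical frequency.
        freq_hz = min(p, n - p) * sample_rate_hz / n
        keep = freq_hz <= cutoff_hz if kind == "low" else freq_hz > cutoff_hz
        if not keep:
            spectrum[p] = 0
    return inverse_dft(spectrum)
```

With a cutoff between a slow maneuver-like component and a fast vibration-like component, the low-pass output retains the former and the high-pass output retains the latter; the two outputs sum back to the original series because every frequency bin is assigned to exactly one of them.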

Figure 2. An overview of the proposed Android application for data collection. (a) Selecting the desired sensors for data collection, (b) the application during data collection where no maneuver is taking place, (c) the application during data collection where a brake is happening and being labeled.

Table 2. Collected data description.

| Group | Number of trips | Number of turn events | Number of brake events |
|---|---|---|---|
| Nokia | 95 | 260 | 966 |
| Samsung | 63 | 377 | 1214 |
| Asus | 21 | 91 | 129 |
| Huawei | 12 | 304 | 467 |
| Xiaomi | 19 | 182 | 589 |
| Others | 35 | 220 | 828 |
| Sum | 245 | 1434 | 4193 |

In the wavelet filter, instead of the Fourier transform, a Discrete Wavelet Transform (DWT) is applied to separate the different frequency parts of the time series. In the DWT, time resolution is also an important factor. This means that the decomposed frequencies of a signal can be turned on or off depending on the time, which is not the case in the Fourier transform. The DWT is given by (2):

$$W(a,b) = \frac{1}{\sqrt{b}} \sum_{t=0}^{n-1} f(t)\,\psi\!\left(\frac{t-a}{b}\right) \tag{2}$$

where $a = k2^{j}$, $b = 2^{j}$, $j$ is the scale index, $k$ is the wavelet transform signal index, and $\psi(t)$ is the mother wavelet. Multi-level decomposition can be applied with the DWT. The input time series at the first level is decomposed into Approximation and Detail coefficients ($A_1$ and $D_1$). Then, $A_1$ can be fed again into a DWT to generate $A_2$ and $D_2$. This process can be continued, but in this research, $A_1$ and $A_2$ are extracted.
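As a rough illustration of the multi-level decomposition in the wavelet filter, the sketch below computes the level-one and level-two approximation coefficients of a short series. The mother wavelet is not specified in this section, so the Haar wavelet is used here purely as an assumption; the function names and the padding rule for odd-length inputs are likewise hypothetical.

```python
import math


def haar_dwt(signal):
    """One DWT level with the Haar wavelet (an assumed mother wavelet).

    Returns (approximation, detail) coefficient lists of half the input length.
    """
    values = list(signal)
    if len(values) % 2:
        # Assumption: pad odd-length input by repeating the last sample.
        values.append(values[-1])
    scale = math.sqrt(2.0)
    approx = [(values[i] + values[i + 1]) / scale for i in range(0, len(values), 2)]
    detail = [(values[i] - values[i + 1]) / scale for i in range(0, len(values), 2)]
    return approx, detail


def approximations(signal, levels=2):
    """Return [A_1, ..., A_levels] by repeatedly decomposing the approximation."""
    results = []
    current = list(signal)
    for _ in range(levels):
        current, _detail = haar_dwt(current)
        results.append(current)
    return results
```

Each level halves the number of coefficients, so the approximation at level two describes the series at a coarser time resolution than the one at level one, and the discarded detail coefficients carry the faster fluctuations.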

Classifiers

Random Forest (RF) and Artificial Neural Network (ANN) with fully connected layers are the two types of classifiers employed in our method. In RF (Ho, 1995), many decision trees work together as an ensemble to predict the output of unseen data. To elaborate, in a single decision tree (or C4.5), the prediction is based on values learned for different features of the dataset. In each step of training, a greedy algorithm is used to choose the best split of the dataset using the Gini impurity loss function, which measures the class distribution of the samples. The Gini impurity of a node can be calculated by the following equations:

$$G(\text{node}) = \sum_{k} p_k\,(1 - p_k) \tag{3}$$

$$p_k = \frac{\#\,\text{observations with class } k}{\#\,\text{all observations}} \tag{4}$$

where the sum in (3) runs over the classes $k$, $p_k$ is the probability of selecting a sample from class $k$, and $n$ is the number of observations in a node (the denominator in (4)). Using a greedy approach may cause some error in the result, since greedy models only find local optima. In the RF classifier, different combinations of the training dataset are produced and used as the input of different decision trees. Finally, once the trees have been trained on the various bags of the training dataset, the prediction is the mode of the decision trees' outputs. This approach leads to more accurate results, as more randomness is considered.

Figure 3. A brief overview of the methodology. (a) Event Extraction Phase (Phase 1), (b) Performance Improvement Phase (Phase 2).
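The Gini impurity in (3) and (4) can be computed directly from the labels in a node, as in the minimal sketch below. The function names are hypothetical, and the size-weighted impurity of a candidate split is included only as the usual way such a criterion is compared between splits, which this section does not spell out.

```python
from collections import Counter


def gini_impurity(labels):
    """Eq. (3) with p_k from Eq. (4): sum over classes of p_k * (1 - p_k)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return sum((count / n) * (1 - count / n) for count in Counter(labels).values())


def split_impurity(left_labels, right_labels):
    """Size-weighted impurity of a two-way split (a common convention, assumed here)."""
    total = len(left_labels) + len(right_labels)
    if total == 0:
        return 0.0
    return (
        len(left_labels) / total * gini_impurity(left_labels)
        + len(right_labels) / total * gini_impurity(right_labels)
    )
```

A node containing only one class has impurity 0, and a balanced two-class node has impurity 0.5, so a greedy splitter under this convention prefers the split with the lowest weighted impurity.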

ANN (Haykin, 1994) is another popular type of classifier, consisting of several connected nodes that transmit information. An activation function is embedded into the nodes of an ANN, adding nonlinearity to the results. In other words, the activation function decides whether or not to activate a neuron based on the signal received from the other nodes. The activation function used here (ReLU) is shown below:

$$\mathrm{ReLU}(z) = \max(0, z) \tag{5}$$

where $z$ is the output of each neuron. While training ANNs, the features of the training dataset are fed to the first layer of nodes, and then, using the weights between the nodes, signals are sent to the other nodes as below:

$$f(x, w) = \sum_{i=1}^{n} x_i w_i + b \tag{6}$$

where $x_i$ and $w_i$ are the $i$-th feature and weight from the $n$-dimensional vectors $x$ and $w$, and $b$ is the bias. This continues until the information reaches the nodes of the output layer, which is the final step in determining the prediction of the model (the feedforward function). In this step, a decision is made from the data as transformed by the previous nodes. Then an error function (binary cross-entropy in our case) is defined to measure how far the output is from reality:

$$L = -y \log(p) - (1 - y)\log(1 - p) \tag{7}$$

where $y$ is the real label and $p$ is the probability predicted by the model. After that, the backpropagation function modifies the weights between the nodes using the gradient descent technique. These two functions (feedforward and backpropagation) run iteratively until a reasonably low error is achieved. The selected hyperparameters of the ANN and RF are provided in Table 3. These hyperparameters were set experimentally by comparing the achieved accuracy.
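The feedforward step in (5) and (6) and the loss in (7) can be traced with a very small network. The sketch below is illustrative only: the single hidden layer, the weights, and the sigmoid output unit that turns the final signal into a probability are assumptions made for the example, not details taken from this section.

```python
import math


def relu(z):
    """Eq. (5): pass positive signals, block negative ones."""
    return max(0.0, z)


def neuron(x, w, b):
    """Eq. (6): weighted sum of the inputs plus the bias."""
    return sum(xi * wi for xi, wi in zip(x, w)) + b


def sigmoid(z):
    """Assumed output activation that maps the final signal to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, hidden_layer, output_neuron):
    """Feedforward pass: hidden_layer is a list of (weights, bias) pairs."""
    hidden = [relu(neuron(x, w, b)) for w, b in hidden_layer]
    out_w, out_b = output_neuron
    return sigmoid(neuron(hidden, out_w, out_b))


def binary_cross_entropy(y, p, eps=1e-12):
    """Eq. (7); p is clipped to avoid log(0)."""
    p = min(max(p, eps), 1.0 - eps)
    return -y * math.log(p) - (1 - y) * math.log(1 - p)
```

Training would repeat this forward pass, evaluate the loss against the real label, and adjust the weights by gradient descent; only the forward pass and the loss are shown here.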

Event Extraction Phase (Phase 1)

The main aim of the first phase is to determine the potential event intervals in the time series recorded over a whole trip. The raw data recorded by the smartphone sensors, as described in the Data section, can be expressed by (8) and (9):

$$R^{S}_{A} = \{ r^{(S,A)}_{t_1}, r^{(S,A)}_{t_2}, r^{(S,A)}_{t_3}, \ldots, r^{(S,A)}_{t_i}, \ldots, r^{(S,A)}_{t_N} \} \tag{8}$$

$$L^{E} = \{ l^{E}_{t_1}, l^{E}_{t_2}, l^{E}_{t_3}, \ldots, l^{E}_{t_i}, \ldots, l^{E}_{t_N} \} \tag{9}$$

where $R^{S}_{A}$ denotes the time series recorded by axis $A$ (x, y, or z) of sensor $S$ (acc or gyr) in the smartphone, and $r^{(S,A)}_{t_i}$ is a single data point recorded at the $t_i$-th time. Moreover, $L^{E}$ is the label time series recorded for event type $E$ (brake or turn), and $l^{E}_{t_i}$ is always 0 or 1, denoting whether $r^{(S,A)}_{t_i}$ is part of a specific event or not. Additionally, the total number of recorded data points in a whole trip is $N$. Based on Figure 1, the raw input for phase one in the brake detection part is $R^{Acc}_{y}$, and in the turn detection part the raw inputs are $R^{Gyr}_{z}$ and $R^{Acc}_{x}$. In this phase, the raw inputs are first modified by different filters. By noise elimination, $R^{S}_{A}$ becomes $FR^{S}_{A}$, where $F$ denotes the type of filter ($L$ for low-pass, $H$ for high-pass, $W_1$ for level one and $W_2$ for level two of the wavelet filter). After data filtration, the modified inputs are fed into the proposed classifiers in each part. Therefore, every single recorded point in the relevant $FR^{S}_{A}$ is an input of the classifier, and the classifier predicts $l^{E}_{t_i}$ (the prediction is denoted $\hat{l}_{t_i}$). The shape of a sample input for the brake and turn detection classifiers is shown in (10) and (11), respectively:

$$\text{brake}_{\text{input}} = \{ W_1 r^{(Acc,y)}_{t_1},\; L r^{(Acc,y)}_{t_1} \} \tag{10}$$

$$\text{turn}_{\text{input}} = \{ L r^{(Gyr,z)}_{t_1},\; H r^{(Gyr,z)}_{t_1},\; L r^{(Acc,x)}_{t_1} \} \tag{11}$$

Table 3. Classifiers' hyperparameters.

| Random forest | | Artificial neural network | |
|---|---|---|---|
| Criterion | Gini | Number of hidden layers | 2 |
| Maximum depth | 5 | Number of neurons per layer | 80 and 40 |
| Number of estimators | 40 | Learning rate | 0.001 (Constant) |
| Min_samples_split | 2 | Maximum iteration without meeting improvement | 10 |
| Min_samples_leaf | 1 | Solver | Adam |

All the notations are defined previously. The above equations show that, for example, the neural network classifier in the brake detection part is trained to determine whether a single data point is part of a brake or not using two features, $W_1 r^{(Acc,y)}_{t_1}$ and $L r^{(Acc,y)}_{t_1}$. Similarly, this is done in the turn part with three features, namely $L r^{(Gyr,z)}_{t_1}$, $H r^{(Gyr,z)}_{t_1}$, and $L r^{(Acc,x)}_{t_1}$. The predicted labels, denoted $\hat{l}_{t_i}$, together then form $L^{E\text{-pred}}$. In the last step of this phase, intervals with $\hat{l}_{t_i} = 1$ are considered primary driving behavior intervals. These intervals are extracted to be used in the next phase.
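The last step of this phase, turning the point-wise predictions into candidate intervals, can be sketched as below. The function name and the choice to report each interval as an inclusive pair of sample indices are assumptions made for illustration.

```python
def extract_intervals(labels):
    """Return (start, end) index pairs, inclusive, for each run of 1s in labels."""
    intervals = []
    start = None
    for index, value in enumerate(labels):
        if value == 1 and start is None:
            start = index  # a new run of predicted event points begins
        elif value != 1 and start is not None:
            intervals.append((start, index - 1))  # the run ended at the previous point
            start = None
    if start is not None:
        # The series ended while still inside a run.
        intervals.append((start, len(labels) - 1))
    return intervals
```

The same helper applies equally to the real label series, which gives the list of real event intervals against which the extracted ones can be compared.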

To show the performance of this phase and demonstrate the necessity of the second phase, a time series label comparison method (Figure 4) is proposed in this paper for comparing $L^{E}$ and $L^{E\text{-pred}}$. This method additionally creates ground-truth labels to be used as a measure of performance for training and testing the classifiers in the next phase. A point-by-point comparison of $L^{E}$ and $L^{E\text{-pred}}$ is not reasonable, since unintentional lags can occur between the real labels and the moment the driving event actually happens, due to the responsiveness of the labeler. A sample of this lag is depicted in Figure 5, where the yellow boxes show the time intervals specified by the labeler, which do not match the events. Therefore, a new method is needed to count these situations as true positives.

According to Figure 4, the Label Comparison Function takes two lists as input: the real intervals and the predicted intervals. Initially, all the predicted intervals are counted as False Positives (FP). Then, for the first real event (the first period with $l^{E}_{t_i} = 1$), the method finds all overlapping predicted intervals (periods whose predicted label equals 1) and saves them in the options list. If the options list remains empty, the first real event is considered a False Negative (FN). However, if the options list includes one or more predicted intervals, the first interval in this list is considered a True Positive (TP), and therefore one of the FP intervals becomes a TP. This procedure is applied to all the real events in $L_E$. Thereafter, based on the counts (FN, TP, and FP), the precision, recall, and F1-score of the first phase can be calculated with (12), (13), and (14):

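$$\mathrm{Precision} = \frac{TP}{TP + FP} \qquad (12)$$

$$\mathrm{Recall} = \frac{TP}{TP + FN} \qquad (13)$$

$$\mathrm{F1\text{-}score} = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \qquad (14)$$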
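The label comparison procedure and the metrics in (12) to (14) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' code: intervals are taken to be closed `(start, end)` index pairs, two intervals overlap when they share at least one index, and a predicted interval that is selected for more than one real event is counted as a single TP. The names `intervals_overlap`, `compare_labels`, and `scores` are hypothetical.

```python
def intervals_overlap(a, b):
    """Return True if two closed intervals (start, end) share at least one index."""
    return a[0] <= b[1] and b[0] <= a[1]


def compare_labels(real, predicted):
    """Count TP, FP and FN events following the procedure described for Figure 4."""
    status = ["FP"] * len(predicted)      # every predicted interval starts as FP
    fn = 0
    for event in real:
        # The "options list": indices of predicted intervals overlapping this real event.
        options = [j for j, p in enumerate(predicted) if intervals_overlap(event, p)]
        if not options:
            fn += 1                       # no overlapping prediction: false negative
        else:
            status[options[0]] = "TP"     # first overlapping prediction becomes TP
    return status.count("TP"), status.count("FP"), fn, status


def scores(tp, fp, fn):
    """Precision, recall and F1-score as in (12), (13) and (14)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


real = [(10, 20), (50, 60), (90, 95)]
predicted = [(12, 25), (30, 35), (58, 70)]
tp, fp, fn, status = compare_labels(real, predicted)
precision, recall, f1 = scores(tp, fp, fn)
print(tp, fp, fn, status)
```

In this toy input the second predicted interval matches no real event and stays FP, while the third real event has no overlapping prediction and is counted as FN. The per-interval `status` list is the kind of TP/FP label that the second phase could use as ground truth.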

Performance Improvement Phase (Phase 2)

As stated before, the outputs of phase one may contain some false positive events, whereas the number of false negatives in phase one is usually low. To enhance the precision of the model, we trained new classifiers to recognize the correctly extracted intervals. A sample interval extracted in the first phase is given by (15):

$$R'^{S}_{A} = \{ r'^{(S,A)}_{t_1},\ r'^{(S,A)}_{t_2},\ \ldots,\ r'^{(S,A)}_{t_i},\ \ldots,\ r'^{(S,A)}_{t_n} \} \qquad (15)$$

where $R'^{S}_{A}$ is the data interval from the $A$ axis (x, y, or z) of sensor $S$ (acc or gyr) captured by the previous phase, and $r'^{(S,A)}_{t_i}$ is a single data point recorded at the $t_i$-th time in the interval. Since $R'^{S}_{A}$ is a section of $R^{S}_{A}$, we can state that $n < N$.

Figure 4. Label comparison method.

According to Figure 3, in the second phase, some statistical features are experimentally chosen to be calculated from $F R'^{S}_{A}$ ($F$ denoting the type of applied filter). The details of these features and their definitions are given in Table 4. The minimum, maximum, and energy level of the intervals are the statistical features used in this section. Among all the features, B1 and T1 in Table 4 need more input than a single interval: one second of $L R^{Acc}_{y}$ or $L R^{Acc}_{x}$ before the extracted interval ($i = p$ to $q$) and another second after it ($i = p'$ to $q'$) are necessary to compute them. The other input features are extracted using only the intervals selected in the previous phase.
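As an illustration of how the brake features B1 to B4 of Table 4 might be computed, the following Python sketch operates on plain lists of already filtered samples. The function and argument names are hypothetical, the filtering itself is assumed to have been done beforehand, and the surrounding mean is taken over the number of samples in the two one-second windows, which is an assumption about how the denominator in Table 4 is meant to be read.

```python
from statistics import fmean


def energy(samples):
    """Sum of squared samples, used here as the energy of an interval."""
    return sum(v * v for v in samples)


def brake_features(low_interval, low_before, low_after, w1_interval, w2_interval):
    """Return [B1, B2, B3, B4] for one extracted brake interval.

    low_interval : low-pass filtered Acc-y samples inside the interval
    low_before   : low-pass filtered Acc-y samples from the second before it
    low_after    : low-pass filtered Acc-y samples from the second after it
    w1_interval, w2_interval : wavelet-filtered (W1, W2) Acc-y samples inside it
    """
    surrounding = list(low_before) + list(low_after)
    b1 = fmean(low_interval) - fmean(surrounding)  # interval mean minus surrounding mean
    b2 = min(low_interval)                         # strongest deceleration sample
    b3 = energy(w1_interval)                       # energy of the W1-filtered interval
    b4 = energy(w2_interval)                       # energy of the W2-filtered interval
    return [b1, b2, b3, b4]


features = brake_features(
    low_interval=[-2.0, -3.0, -1.0],
    low_before=[0.0, 0.5],
    low_after=[0.5, 1.0],
    w1_interval=[1.0, -2.0, 1.0],
    w2_interval=[0.5, 0.5, -1.0],
)
print(features)
```

The sample values are made up for the example. The turn features T1 to T5 would follow the same pattern, with maxima of absolute values and the interval length in place of the minimum and energies.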

After the features are derived from the extracted intervals, they are fed to the proposed classifiers. The classifiers are expected to determine whether an interval is a real event or a false positive. For brake intervals a random forest is designed, and for turn intervals a neural network with fully connected layers. The shape of a sample input (the $j$-th interval) for the brake and turn classifiers is given by (16) and (17), respectively:

$$\mathrm{brake}_{input} = \{ B1_j,\ B2_j,\ B3_j,\ B4_j \} \qquad (16)$$

$$\mathrm{turn}_{input} = \{ T1_j,\ T2_j,\ T3_j,\ T4_j,\ T5_j \} \qquad (17)$$

where $B1_j, \ldots, B4_j$ and $T1_j, \ldots, T5_j$ are the calculated features. It should be noted that, for training the classifiers in this phase, real labels are assigned by the label comparison method provided in phase one. Therefore, when testing the classifiers, it is determined whether a single interval refers to a true positive event or not ($L'_E = 0$ or 1).

Results and discussion

To demonstrate the proficiency of the proposed method, two different testing scenarios are considered in this section. A sensitivity analysis is also performed to clarify the importance of the different parts used in our method. In the first scenario, we show the performance of our designed model using the same approach adopted previously in the literature. Afterward, in the second scenario, we demonstrate that our model also yields satisfactory results when predicting newly collected data in different situations.

First scenario

In this part, several sub-datasets are extracted from the gathered data described previously to train and test the model. All the data collected in this research are divided into six groups, and each group differs from the others in terms of driver, route, and smartphone type. To show the model's performance in an experiment like those in the literature, sub-datasets are generated from different combinations of the six data groups presented in Table 2. Table 5 shows the properties of these sub-datasets (bundles). For this analysis, we created two- to five-phone bundles, and another bundle containing all the collected data. The proposed two-phase method is trained and tested on these bundles with 5-fold cross-validation, and the results are reported in Table 6.

Figure 5. Unintended lags between the real labels and the predicted events.

Table 4. Statistical features calculated in Phase 2.

| Event | Feature | Definition | Formula |
|---|---|---|---|
| Brake | B1 | Mean($L R'^{Acc}_{y}$) - Mean(1 sec before and after $L R'^{Acc}_{y}$) | $\frac{\sum_{i=1}^{n} L r'^{(Acc,y)}_{t_i}}{n} - \frac{\sum_{i=p}^{q} L r^{(Acc,y)}_{t_i} + \sum_{i=p'}^{q'} L r^{(Acc,y)}_{t_i}}{(q-p) + (q'-p')}$ |
| Brake | B2 | Min($L R'^{Acc}_{y}$) | $\min(L r'^{(Acc,y)}_{t_i})$ |
| Brake | B3 | Energy($W_1 R'^{Acc}_{y}$) | $\sum_{i=1}^{n} (W_1 r'^{(Acc,y)}_{t_i})^2$ |
| Brake | B4 | Energy($W_2 R'^{Acc}_{y}$) | $\sum_{i=1}^{n} (W_2 r'^{(Acc,y)}_{t_i})^2$ |
| Turn | T1 | Mean($L R'^{Acc}_{x}$) - Mean(1 sec before and after $L R'^{Acc}_{x}$) | $\frac{\sum_{i=1}^{n} L r'^{(Acc,x)}_{t_i}}{n} - \frac{\sum_{i=p}^{q} L r^{(Acc,x)}_{t_i} + \sum_{i=p'}^{q'} L r^{(Acc,x)}_{t_i}}{(q-p) + (q'-p')}$ |
| Turn | T2 | Max(abs($L R'^{Acc}_{x}$)) | $\max(\mathrm{abs}(L r'^{(Acc,x)}_{t_i}))$ |
| Turn | T3 | Max(abs($L R'^{Gyr}_{z}$)) | $\max(\mathrm{abs}(L r'^{(Gyr,z)}_{t_i}))$ |
| Turn | T4 | Max(abs($H R'^{Gyr}_{z}$)) | $\max(H r'^{(Gyr,z)}_{t_i})$ |
| Turn | T5 | Duration($R'^{S}_{A}$) | $n$ |

Table 6 reveals some essential points. As demonstrated, the first phase of the proposed method finds the potential intervals along with some false positive events. Therefore, a high recall and a relatively low precision are observed after this phase. However, by extracting the statistical features and using the second classifier, precision improves substantially at the cost of some recall (Table 6), and as a result a higher F1-score is obtained. Additionally, the proposed two-phase method performs better in turn detection than in brake detection. Turns are usually more perceptible to the labeler; brake intensity, however, varies across situations, which may confuse the labeler when confronting multiple intensities. Furthermore, two sensors (accelerometer and gyroscope) are available for turn recognition, whereas only the accelerometer is available in smartphones for brake detection.

In Table 6, the larger the bundle becomes, the more the accuracy drops. A large and diverse dataset contains different types of each driving event (brakes and turns), since it covers different smartphones, drivers, and trip routes. As a result, the unique characteristics of each event become less distinct in large datasets, which reduces the accuracy of the model. However, by training and testing our structure on such a dataset, rather than on datasets covering just one or a few situations, we can address the problem of overfitting more thoroughly than with train-test splitting or cross-validation alone. This phenomenon is mostly overlooked by previous papers that utilize just one dataset collected in unique situations. In other words, we demonstrated here that our model still works well (with F1-scores of more than 70 percent) not only when cross-validation or a train-test split is applied to data with unique characteristics (as in previous papers), but also when the data is collected in various situations.

Second scenario

In this scenario, we train and test the proposed model to measure its robustness. By robustness, we mean that our model performs well even when the train and test datasets are different and are gathered at independent times and places using different smartphones and drivers. As the partitioned datasets in Table 2 are completely different from each other, we employed them for this purpose in this section.

Table 5. Bundles description.

| Brake detection: bundle number | Brake detection: bundle phones | Turn detection: bundle number | Turn detection: bundle phones |
|---|---|---|---|
| 2 | Xiaomi and Huawei | 2 | Asus and Huawei |
| 3 | Xiaomi, Huawei, and other | 3 | Asus, Nokia, and Huawei |
| 4 | Xiaomi, Nokia, Huawei, and other | 4 | Asus, Xiaomi, Nokia, and Huawei |
| 5 | Samsung, Xiaomi, Nokia, Huawei, and other | 5 | Samsung, Asus, Xiaomi, Nokia, and Huawei |
| 6 | Samsung, Asus, Xiaomi, Nokia, Huawei, and other | 6 | Samsung, Asus, Xiaomi, Nokia, Huawei, and other |

Table 6. Brake and turn detection results by 5-fold cross validation.

| Event type | # Bundle | First phase precision % | First phase recall % | First phase F1-score % | Second phase precision % | Second phase recall % | Second phase F1-score % |
|---|---|---|---|---|---|---|---|
| Brake | 2 | 64.13 | 96.47 | 77.04 | 87.52 | 89.56 | 88.53 |
| Brake | 3 | 54.11 | 95.2 | 69.00 | 79.78 | 79.88 | 79.83 |
| Brake | 4 | 49.56 | 92.57 | 64.56 | 79.13 | 68.65 | 74.14 |
| Brake | 5 | 46.98 | 89.2 | 61.55 | 76.16 | 66.14 | 70.79 |
| Brake | 6 | 45.83 | 89.37 | 60.59 | 75.30 | 66.46 | 70.60 |
| Turn | 2 | 75.86 | 95.65 | 84.61 | 93.18 | 94.25 | 93.71 |
| Turn | 3 | 78.53 | 93.85 | 85.51 | 93.35 | 91.59 | 92.46 |
| Turn | 4 | 76.4 | 91.71 | 83.36 | 89.52 | 86.28 | 87.87 |
| Turn | 5 | 73.49 | 88.6 | 80.34 | 87.32 | 80.43 | 83.73 |
| Turn | 6 | 71.5 | 88.81 | 79.22 | 84.99 | 79.00 | 81.89 |

To demonstrate the robustness, we compare the accuracy of our model when a train-test split of a single dataset is used and when a different validation dataset is considered. Each dataset in Table 2 is first split into 80 percent training and 20 percent testing batches. Then, we trained and tested our model six times with these train-test splits. Additionally, we considered each row of Table 2 as a training set and the other rows as a testing set, and again trained and tested our method another six times. The results of these experiments are depicted in Figure 6.
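The two evaluation protocols of this scenario can be written down as a small Python sketch. The dataset names are taken from the phone labels in Table 5; everything else is an assumption for illustration, in particular the helper names and the order-preserving 80/20 cut (the paper does not state whether the split is random or chronological).

```python
DATASETS = ["Samsung", "Asus", "Xiaomi", "Nokia", "Huawei", "other"]


def within_dataset_split(events, train_fraction=0.8):
    """Split one dataset's events into train and test parts, preserving order."""
    cut = int(len(events) * train_fraction)
    return events[:cut], events[cut:]


def cross_dataset_runs(names):
    """List (train_name, test_names) pairs: train on one dataset, test on all others."""
    return [(name, [other for other in names if other != name]) for name in names]


# Protocol 1: six runs, each with an 80/20 split inside a single dataset.
train_part, test_part = within_dataset_split(list(range(10)))

# Protocol 2: six runs, each trained on one dataset and tested on the remaining five.
runs = cross_dataset_runs(DATASETS)
for train_name, test_names in runs:
    print(train_name, "->", test_names)
```

Both protocols produce six runs, which is what allows the paired comparison of F1-scores per training dataset.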

The first point about Figure 6 is the considerable drop in F1-score in most cases when the validation dataset differs from the training set. To the best of our knowledge, this experiment has not been done in previous studies. In the literature, all the generated models are fitted to a single dataset, which increases the probability of overfitting even if the collected dataset is large. Although our method does not perform as well on a validation dataset as with a train-test split, the F1-scores in Figure 6 remain acceptable and relatively high. From Figure 6, we can infer that by employing a small dataset (one of the six datasets in Table 2) as a training set and testing the model on an approximately five times larger, unseen dataset (the other five datasets), the accuracy remains reasonable (F1-scores of about 70 and 80 percent in brake and turn detection, respectively).

Another interesting observation in Figure 6 is that the performance is related to the training size: datasets containing fewer events lead to lower F1-scores. For instance, when the method is trained on the Asus or Xiaomi datasets, specifically in brake detection, the validation F1-scores are 66 and 68 percent, respectively. As shown in Table 2, these two datasets (Asus and Xiaomi) include 129 and 589 brake samples, which is a low number compared to the other datasets. However, dataset size is not the only factor affecting the results. Other factors such as labeling error, smartphone type, and environmental conditions should also be considered when interpreting the results. For example, the Huawei dataset contains 467 brake samples, yet its F1-score is higher than those of Asus and Xiaomi.

Figure 6. Results of testing the model by second scenario.

A detailed report of the results, with the Huawei dataset as the training set and the other datasets as the test set, is shown in Table 7. Trained on one part of a six-part dataset, the model recognizes brakes and turns with F1-scores of about 71 and 82 percent, respectively. The impact of the second phase on precision is also observable in this scenario (Table 7). In summary, the result indicates that although different smartphones, drivers, and route conditions affect the motion data of vehicles in reality, it is possible to train a model on data collected in a few situations and use it where there is no historical data for a new participant (e.g., a different driver or smartphone). This challenge becomes more visible when widely used smartphones are exploited for driving behavior monitoring instead of precise but pricey OBD devices, as there are many smartphone brands and types in the world. However, our experiments show that our structure is capable of handling it.
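As a consistency check, the brake F1-scores reported in Table 7 can be reproduced from the tabulated precision and recall with the F1 definition in (14). The worked arithmetic below uses only the Table 7 values; small differences in the last digit can arise because the tabulated precision and recall are themselves rounded.

$$\mathrm{F1}_{\text{brake, phase 1}} = \frac{2 \times 43.99 \times 88.35}{43.99 + 88.35} = \frac{7773.03}{132.34} \approx 58.74$$

$$\mathrm{F1}_{\text{brake, phase 2}} = \frac{2 \times 71.06 \times 71.52}{71.06 + 71.52} = \frac{10164.42}{142.58} \approx 71.29$$

The second phase trades about 17 points of recall for about 27 points of precision, which is what lifts the brake F1-score from roughly 59 to roughly 71 percent.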

Feature sensitivity analysis

In this section, the significance of each part of the model, such as inputs and features, is analyzed. As discussed earlier, accelerometer and gyroscope data, along with different filters and statistical features, are utilized to detect driving maneuvers. Therefore, their marginal contribution to accuracy improvement is worth further investigation. For this analysis, an 80-20 train-test split is applied to the Huawei dataset, and each part of the method is systematically dropped to find its effect on the accuracy.

The sensitivity analysis of the first phase is shown in Table 8. For brake and turn detection, we first implemented our complete method. Then, different parts of the method, such as filters in brake detection and sensors in turn detection, were removed. The aim of the first phase is a high recall, and it is observable that removing any part of this phase lowers the recall. The largest drop in recall is recorded when all filters in the turn detection part are removed and raw data is used instead. This shows the significance of noise reduction and filtering in our method. Although this is not the case for brake detection in this experiment, a large recall reduction was observed in another experiment that used different datasets for training and testing (not included in the results of Table 8). In this new experiment, we used one of the previous validation sets (as discussed in the second scenario) as a test set, and the recall reduction for brake detection was about 37 percent, which is considerable. This observation showed that the intensity of the noise produced by sensors differs from one type of smartphone to another, which may affect the result of event detection. Since the filters employed in our approach transform the data from the time domain to the frequency domain and then remove the noise based on frequency, we can state that our method is also robust to the amount of noise, which these observations support.

Table 7. The performance of our method trained by a single dataset and tested on the whole other data.

| Event type | First phase precision | First phase recall | First phase F1-score | Second phase precision | Second phase recall | Second phase F1-score |
|---|---|---|---|---|---|---|
| Brake | 43.99 | 88.35 | 58.74 | 71.06 | 71.52 | 71.29 |
| Turn | 72.97 | 89.38 | 80.35 | 81.7 | 81.4 | 81.56 |

Table 8. First phase feature sensitivity analysis.

| Event | Dropped item | Description | Precision | Recall | F1-score |
|---|---|---|---|---|---|
| Brake detection | Nothing | - | 65.51 | 97.64 | 78.41 |
| Brake detection | $W_1 R^{Acc}_{y}$ | No wavelet filter | 83.08 | 95.71 | 88.95 |
| Brake detection | $L R^{Acc}_{y}$ | No low-pass filter | 75.68 | 95.28 | 84.36 |
| Brake detection | Filters | Just raw data is used | 87.33 | 95.93 | 91.43 |
| Turn detection | Nothing | - | 70.35 | 98.35 | 82.03 |
| Turn detection | $R^{Acc}_{x}$ | No accelerometer data | 62.71 | 97.36 | 76.28 |
| Turn detection | $R^{Gyr}_{z}$ | No gyroscope data | 74.32 | 90.46 | 81.60 |
| Turn detection | Filters | Just raw data is used | 64.67 | 81.91 | 72.28 |

Table 9. Second phase feature sensitivity analysis.

| Event | Dropped item | Description | Precision | Recall | F1-score |
|---|---|---|---|---|---|
| Brake detection | Nothing | - | 93.68 | 94.68 | 94.18 |
| Brake detection | B3 and B4 | No energy related feature | 85.57 | 94.68 | 89.89 |
| Brake detection | B1 | Removing B1 | 91.66 | 93.61 | 92.63 |
| Brake detection | B2 | Removing B2 | 89.89 | 94.68 | 92.22 |
| Turn detection | Nothing | - | 97.82 | 91.84 | 94.74 |
| Turn detection | Duration (T5) | Not using T5 feature | 96.38 | 81.63 | 88.39 |
| Turn detection | T3 and T4 | No gyroscope feature | 95.89 | 71.42 | 81.87 |
| Turn detection | T1 and T2 | No accelerometric feature | 93.54 | 88.78 | 91.10 |

To study the sensitivity of the second phase to its input parameters, we used the complete first phase, as it has the highest recall, and then dropped different features of the second phase. The dropped features and the resulting accuracy metrics after the second phase are reported in Table 9. As we can see, in the brake detection section, removing the energy-related features or the other statistical features decreases the F1-score by approximately 2 to 4 percent. Since a reduction in accelerometer data is expected during brakes, energy-related features help flag some milder brakes, and the combination of statistical and energy-related features yields the best result. In the turn detection section, the marginal contribution of the features is higher than in the brake section. This can be explained by the fact that there are different types of turns in reality; for example, some are simple right or left turns while others may be U-turns. Therefore, the model embedded in our proposed structure may use each of the features to detect a specific type of event. Furthermore, gyroscopic features are more vital for a high F1-score than accelerometric features, and the duration feature (T5) is an essential part of the second phase, as its absence caused a 6.5% reduction in the F1-score. Some turn samples inherently contain a brake. Consequently, it would be difficult for the model to rely solely on the accelerometer information. In these situations, the duration of the events and the gyroscope data are beneficial for detection.
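The marginal contributions discussed above can be tabulated directly from the F1-scores in Table 9. The following Python sketch only re-uses the reported numbers; the dictionary layout and the name `f1_drops` are illustrative choices, and the drops are expressed in percentage points relative to the full model.

```python
# Second-phase F1-scores (percent) copied from Table 9.
TABLE_9_F1 = {
    "brake": {"Nothing": 94.18, "B3 and B4": 89.89, "B1": 92.63, "B2": 92.22},
    "turn": {"Nothing": 94.74, "T5": 88.39, "T3 and T4": 81.87, "T1 and T2": 91.10},
}


def f1_drops(table):
    """Return, per event type, the F1 drop in points caused by removing each item."""
    drops = {}
    for event, rows in table.items():
        baseline = rows["Nothing"]            # full model, nothing dropped
        drops[event] = {
            item: round(baseline - f1, 2)
            for item, f1 in rows.items()
            if item != "Nothing"
        }
    return drops


drops = f1_drops(TABLE_9_F1)
for event, items in drops.items():
    # Sort so that the most influential dropped item comes first.
    for item, drop in sorted(items.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{event}: without {item}, F1 falls by {drop:.2f} points")
```

Sorting by drop size makes the ranking explicit: for turns the gyroscope features (T3 and T4) matter most, followed by the duration feature, while for brakes the energy features (B3 and B4) have the largest effect.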

Summary and conclusion

In this paper, a two-phase method using Random Forest and Artificial Neural Network (Multi-Layer Perceptron) classifiers, along with high-pass, low-pass, and wavelet filters, is established to detect driving brakes and turns. Suspicious intervals are first extracted from filtered time series data with a high recall. Then, false positive samples are dropped by feeding statistical features of the suspicious intervals into dedicated classifiers. A label comparison method is also developed to compare the real labels of the events with the predicted ones and to overcome some limitations of previous sliding-window approaches. The robustness of this method is demonstrated in different scenarios employing six distinct datasets collected by different drivers and smartphones on multiple routes. Additionally, the marginal contribution of the input features, specifically the different filters, is quantified to show the significance of each part in achieving high performance.

The proposed methodology can be used as a monitoring system to detect driving maneuvers and provide feedback to drivers about their driving styles. Our experiments show that this method yields acceptable results not only when the train and test data are collected in similar situations but also when the test data differs from the training set. The results demonstrate that although performance usually drops when a trained model is transferred to a different situation, the proposed method does not experience a large reduction in accuracy when used for different drivers, smartphones, and routes. It is shown that by training our two-phase method with a dataset containing a sufficient number of events (e.g., 300 samples per event type), we can detect braking and turning of other drivers in different situations with average F1-scores of 71% and 81%, respectively. Using filters for data denoising also showed a significant impact on accuracy and raised the recall in the first phase of brake and turn detection by about 2% and 16%, respectively.

This research avenue may be extended in several ways, such as developing an unsupervised or semi-supervised approach for detecting driving behaviors and testing it on new unseen datasets. Collecting unsupervised or partially supervised data is less time- and labor-consuming. Another area worth exploring is the effect of smartphone placement on data quality.

Disclosure statement

No potential conflict of interest was reported by

the authors.

ORCID

Iman Taheri Sarteshnizi http://orcid.org/0000-0002-7798-8788


Utilizing mobile phone sensors and machine learning to detect drivers through right leg motion

Article

Nov 2023
COMPUT ELECTR ENG

This paper begins with the assumption that a mobile phone is placed in the right trouser pocket. Its primary objective is to ascertain whether the mobile phone belongs to the driver or the passenger. To achieve this goal, the paper conducts an analysis of raw data collected from phone sensors, specifically the accelerometer, gyroscope, orientation, and GPS, in order to identify and detect movement in the right leg. Based on this analysis, the application makes the decision to disconnect the signal and Internet access on the driver's phone if the car's speed exceeds 60 miles per hour, while maintaining these services for passengers. Subsequently, relevant features are extracted from the data, and a variety of machine learning algorithms are evaluated to determine the most suitable model for the task. The results demonstrate promising performance, suggesting that the proposed method could serve as an effective tool for detecting distracted driving.

Sensitivity analysis of driving event classification using smartphone motion data: case of classifier type, sensor bundling, and data acquisition rate

Article

Full-text available

Nov 2022

Classification of driving events is a crucial stage in driving behavior monitoring using smartphone sensory data. It has not been previously explored that to what extent classification performance depends on the classifier type and input data characteristics. To fill this gap, a real-world experiment is designed for supervised data collection. Then the effects of different machine learning (ML) classifiers, data sampling rates, and sensor combinations on the final classification accuracy are demonstrated. A considerable number of labeled events (4114) containing 11 types of driving maneuvers are collected using base sensors (accelerometer and gyroscope) and composite sensors (linear accelerometer and rotation vector) available in smartphones. Several models using 23 ML algorithms are trained. The sensitivity of these models is analyzed by changing the characteristics of the input data concerning the type of ML classifier, data sampling rate, and the bundle of mobile sensors. It is demonstrated that: (1) F1 scores vary from 70 to 96% for different ML classifiers, (2) F1 scores drop 30–40% depending on the classifier type when reducing the data sampling rate, and (3) using all four sensors as a bundle for classifying driving events is not reasonable since an approximate equal F1 score is achievable by a three-sensor bundle which includes an accelerometer and a linear accelerometer.

Kick-scooters identification in the context of transportation mode detection using inertial sensors: Methods and accuracy

Article

Full-text available

Nov 2022

This work presents a novel transportation mode detection algorithm that handles the recognition of kick-scooters. In 2015, 10 minutes of data from a kick-scooter were considered in a transportation mode detection study, yielding a 56% F1-score. Since then, kick-scooters have not been given much attention. Yet kick-scooters are now very present in the urban transportation ecosystem, and their consideration in transportation studies has become a must. To fill this gap, 4 hours of kick-scooter signals were collected by 18 participants, with a set of 6 different kick-scooters, using 3 body-worn inertial measurement units. Kick-scooter patterns are classified in contrast with other modes of transportation. Two classification scenarios are considered in order to gradually increase the classification model complexity. The first scenario includes walking, biking, and kick-scooter, while the second considers public transport (tramway and bus) in addition to the former transportation modes. Results show that kick-scooters can be detected with an F1-score of 80% in the first scenario. Walking and public transport samples were still accurately classified in the second scenario, with an F1-score above 80% for both classes. However, bike and kick-scooter samples were both classified with lower F1-scores, equal to 59% and 64% respectively. Therefore, the main focus of future works should be directed toward the separability of kick-scooters and bikes when public transport is considered. The findings also suggest placing the sensors preferably in the trouser pocket, allowing leg motion to be finely captured.

An automatic methodology to measure drivers’ behavior in public transport

Article

Oct 2022

The way in which public transport buses are driven influences users’ perception of and satisfaction with the service. Bus drivers’ behavior is usually assessed by surveying passengers and/or using the mystery passenger method, which does not necessarily allow for an objective and continuous evaluation. In this work, we introduce a novel methodology to automatically classify drivers’ behavior in a more consistent and objective manner, based on data from inertial measurement units and machine learning techniques. By substituting human evaluators with automatic data collection and classification algorithms, we are able to reduce the subjectivity and cost of the current methodology, while increasing sample size. Our approach is based on three components: i) data capture using inertial measurement units (e.g. mobile devices), ii) carefully tuned classifiers that deal with sample imbalance problems, and iii) an interpretable scoring system. Results show that the collected data capture several types of undesirable maneuvers, providing rich information to the classification process. In terms of categorization performance, the evaluated classifiers, namely support vector machines, decision trees and k-NN, deliver high and consistent accuracy after the tuning process, even in the presence of a highly imbalanced sample. Finally, the proposed driver’s behavior score shows high discriminative power, effectively characterizing differences between drivers and providing driver-tailored driving recommendations that can be generated at specific spots in order to improve passengers’ experience. The resulting methodology can be cost-effectively deployed at a large scale with good performance.

A review on ECG filtering techniques for rhythm analysis

Article

Full-text available

Mar 2020

Purpose: Electrocardiogram (ECG) signal recording is a challenging task in the field of biomedical engineering. ECG is the cardiac recording of systematic electrical activity arising from the electro-physiological rhythm of the heart muscle. But, during processing, the ECG signal is contaminated with different types of noise in the medical environment. An immense task is the separation of the preferred signal from noises caused by artifacts like muscle noise, power line interference (PLI), baseline wandering (BW), and motion artifacts (MA). Hence, our paper focuses on 50 Hz PLI which is a major artifact/noise affecting the recorded ECG signal. Methods: This paper comprehensively reviews fundamental concepts of different denoising techniques. Some of the pioneers’ works are also concisely explained in the paper. Further, in this work, comparative analysis is carried out using notch filter, adaptive filter, discrete wavelet transform (DWT) and empirical mode decomposition (EMD) for filtering 50 Hz PLI noise. Results: A considerable improvement in signal-to-noise ratio (SNR) can be observed from the results when compared with SNR input and SNR output values. Performance comparisons of all the four techniques are also analyzed based on variations in noise frequency. The simulations were carried out in the environment of MATLAB 2019b®. Conclusion: This work epitomizes the significance of our quantitative evaluation, in which adaptive filters are found to perform better with respect to the SNR, whereas DWT performs better with assessment of mean square error (MSE).
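To make the notch-filter idea in the abstract above concrete, here is a minimal sketch of removing 50 Hz interference and comparing input and output SNR. It is not the reviewed paper's MATLAB code: the second-order notch design (standard audio-EQ "biquad" formulas), the 500 Hz sampling rate, the quality factor of 30, and the 5 Hz sine used as a stand-in for an ECG trace are all assumptions chosen for illustration.

```python
import math

def notch_coefficients(f0, fs, q):
    """Second-order IIR notch centred at f0 Hz, normalised so a[0] == 1."""
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * q)
    cos_w0 = math.cos(w0)
    a0 = 1.0 + alpha
    b = [1.0 / a0, -2.0 * cos_w0 / a0, 1.0 / a0]
    a = [1.0, -2.0 * cos_w0 / a0, (1.0 - alpha) / a0]
    return b, a

def biquad(x, b, a):
    """Apply the filter sample by sample (direct form I, zero initial state)."""
    y = []
    x1 = x2 = y1 = y2 = 0.0
    for xn in x:
        yn = b[0] * xn + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        y.append(yn)
        x2, x1 = x1, xn
        y2, y1 = y1, yn
    return y

def snr_db(clean, observed):
    """SNR in dB, treating (observed - clean) as the noise."""
    signal_power = sum(c * c for c in clean)
    noise_power = sum((o - c) ** 2 for c, o in zip(clean, observed))
    return 10.0 * math.log10(signal_power / noise_power)

# Synthetic test signal (assumed values, not real ECG data).
fs = 500.0                      # sampling rate in Hz
n = int(4 * fs)                 # four seconds of samples
clean = [math.sin(2 * math.pi * 5 * k / fs) for k in range(n)]
noisy = [c + 0.5 * math.sin(2 * math.pi * 50 * k / fs)
         for k, c in enumerate(clean)]

b, a = notch_coefficients(50.0, fs, 30.0)
filtered = biquad(noisy, b, a)
snr_in = snr_db(clean, noisy)
snr_out = snr_db(clean, filtered)
```

Comparing `snr_out` with `snr_in` mirrors the input/output SNR comparison mentioned in the abstract. A narrow notch takes a fraction of a second to settle, so very short recordings would show a smaller improvement.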

Dynamic Basic Activity Sequence Matching Method in Abnormal Driving Pattern Detection Using Smartphone Sensors

Article

Full-text available

Jan 2020

In this work, we present a novel method, namely dynamic basic activity sequence matching (DAS), a combination of machine learning methods and flexible threshold-based methods for distinguishing normal and abnormal driving patterns. DAS relies on the activity detection module (ADM) presented in our previous work to analyze each driving pattern as a sequence of basic activities: stopping (S), going straight (G), turning left (L), and turning right (R). The threshold value and other parameters, such as the duration of long and short activities, are iteratively induced from the collected dataset. Hence, DAS is flexible and independent of driving contexts such as vehicle modes and road conditions. Experimental results on the dataset collected from numerous motorcyclists show that our proposed method outperforms dynamic time warping and two popular machine learning methods, random forest and neural network, in distinguishing normal and abnormal driving patterns. Moreover, we propose an efficient framework composed of two phases: in the first phase, the normal and abnormal driving patterns are distinguished by relying on DAS. In the second phase, the detected abnormal patterns are further classified into various specific abnormal driving patterns (weaving, sudden braking, etc.). This fusion framework again achieves the highest overall accuracy of 97.94%.

A GPS-based Algorithm for Brake and Turn Detection

Article

Mar 2022

Driving behavior recognition is a notable topic in travel safety, as transportation and insurance companies could adopt effective tools to detect unsafe driving and internalize the associated costs. Different driving events and the related severity must be detected to distinguish abnormal behaviors. The global positioning system (GPS) provides useful information regarding the location of the vehicle at any time and is widely used in various devices such as smartphones and GPS trackers. Other sensors provide valuable complementary information, but their implementation requires extra costs and more complex and intensive algorithms. We developed a threshold-based algorithm to detect the turning and braking of vehicles using the GPS sensor. The data contained 11 trips recorded at a frequency of 1 Hz with a total duration of 2.7 h. The algorithm utilizes supplementary map matching and a relabeling technique to boost the accuracy while preserving a reasonable computation load. The overall precision and recall of the turn-detecting model are 77.5% and 92.5%, respectively. The algorithm can also detect braking events with a precision of 68.18% and a recall of 83.33%. To address concerns about overfitting, we tested our algorithm on a secondary dataset and obtained similar accuracy values, showing the flexible nature of our algorithm when dealing with a different set of driving behaviors and road characteristics. Additionally, a sensitivity analysis showed the sensitive nature of the brake detection algorithm, in contrast with the turn detection algorithm. Overall, our algorithm showed promising results and could be a pioneering one in the field of low-cost detection algorithms built for smartphones or GPS trackers used by various trucking and car insurance companies.
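As a rough illustration of what a threshold-based detector on 1 Hz GPS data can look like, the sketch below flags a brake when the speed drops faster than a set deceleration and a turn when the heading changes faster than a set rate. This is not the paper's algorithm: the `GpsFix` record, the function names, and both threshold values are hypothetical, and the map-matching and relabeling steps described in the abstract are omitted.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class GpsFix:
    t: float        # timestamp in seconds
    speed: float    # ground speed in m/s
    heading: float  # course over ground in degrees, [0, 360)

def heading_change(a: float, b: float) -> float:
    """Signed difference b - a in degrees, wrapped into [-180, 180)."""
    return (b - a + 180.0) % 360.0 - 180.0

def detect_events(fixes: List[GpsFix],
                  brake_decel: float = 2.0,   # m/s^2, assumed threshold
                  turn_rate: float = 15.0     # deg/s, assumed threshold
                  ) -> List[Tuple[float, str]]:
    """Return (timestamp, label) pairs for consecutive-fix threshold hits."""
    events = []
    for prev, cur in zip(fixes, fixes[1:]):
        dt = cur.t - prev.t
        if dt <= 0:
            continue  # skip duplicated or out-of-order fixes
        accel = (cur.speed - prev.speed) / dt
        rate = heading_change(prev.heading, cur.heading) / dt
        if accel <= -brake_decel:
            events.append((cur.t, "brake"))
        if abs(rate) >= turn_rate:
            events.append((cur.t, "turn"))
    return events

# Example trip: braking from 15 to 9 m/s, then a right turn through north.
trip = [
    GpsFix(0.0, 15.0, 350.0),
    GpsFix(1.0, 15.0, 350.0),
    GpsFix(2.0, 12.0, 350.0),
    GpsFix(3.0, 9.0, 350.0),
    GpsFix(4.0, 9.0, 10.0),
    GpsFix(5.0, 9.0, 30.0),
]
events = detect_events(trip)
```

Wrapping the heading difference matters because a turn from 350° to 10° is a 20° change, not a 340° one. In practice, consecutive hits with the same label would be merged into a single event before computing precision and recall.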

Abnormal Driving Detection Based on Accelerometer and Gyroscope Sensor on Smartphone using Artificial Neural Network (ANN) Algorithm

Conference Paper

Sep 2020

Smartphone Sensors-Based Abnormal Driving Behaviors Detection: Serial-Feature Network

Article

Nov 2020

One of the important factors leading to traffic accidents is the abnormal driving behavior of drivers. Early detection of abnormal driving behaviors can effectively reduce the occurrence of traffic accidents. At present, most of the mainstream driving behavior detection methods are based on the data of a single moment, which breaks the continuity of driving behavior. In this paper, a driving behavior recognition algorithm based on the Serial-Feature Network (SF-Net) and smartphone inertial sensors is proposed, which fully considers the continuity of driving events and uses data from multiple adjacent time steps to identify driving status. The data used in this paper are collected from the GPS, 3-axis acceleration, and gyroscope data of a smartphone. Through the preprocessing operation, SF-Net makes the input vector not only contain the current sensor data but also fuse the relevant information of adjacent times. In SF-Net, a deep convolutional neural network is used for feature extraction, and 10 different driving behaviors can be identified by fusing multi-level and multi-time feature information. The field test results show that the accuracy rate of the serial feature network is 97.1%, and the recall rate is 98.4%, which is better than other tested network models. When the number of training samples is small, the serial feature network can still maintain a high recognition rate, and the network model is relatively stable.

Driver’s black box: a system for driver risk assessment using machine learning and fuzzy logic

Article

Aug 2021

Risky driving behaviors can cause accidents, which may result in major material and moral damages. Due to the increase in road accidents, it has become an important issue to identify risky driving behaviors and reward people who drive safely. With the development of technology, it is now possible to model driving behavior through advanced sensors integrated into embedded systems. In this study, we modeled four major risky driving behaviors and created driver profiles using data obtained from accelerometer and gyroscope sensors and applying widely used machine learning algorithms in behavior analysis, including the C4.5 Decision Tree, Random Forest, Artificial Neural Network, Support-Vector Machine, K-Nearest Neighbor, Naive Bayes, and K-Star algorithms. Risky driving behaviors and their risk levels were evaluated in accordance with the expert opinions of traffic officers, and driver risk was modeled using the fuzzy logic method. The applied machine learning algorithms were compared using common validation metrics such as accuracy, f-measure, precision, and recall. In our experiments, the K-Star algorithm was the most successful algorithm, with 100% accuracy. As a result, a highly accurate, low-cost system which acts as the driver’s black box was developed. The system can be integrated into vehicles and it can record the driver’s behaviors and identify the risky ones. It can also open up new horizons for insurance companies to utilize usage-based policies, in which customers who drive safely are rewarded with lower car insurance premiums, encouraging others to do the same.

A context aware system for driving style evaluation by an ensemble learning on smartphone sensors data

Article

Apr 2018
TRANSPORT RES C-EMER

There are many systems that evaluate driving style based on smartphone sensors without sufficient awareness of the context. To cover this gap, we propose a new system, namely the CADSE system, to consider the effects of traffic levels and car types on driving evaluation. The CADSE system includes three subsystems to calibrate the smartphone, to classify the maneuvers, and to evaluate driving styles. For each maneuver, the smartphone sensor data are gathered in three successive time intervals referred to as pre-maneuver, in-maneuver, and post-maneuver times. Then, we extract some important mathematical and experimental features from these data. Afterwards, we propose an ensemble learning method on these features to classify the maneuvers. This ensemble method includes decision tree, support vector machine, multi-layer perceptron, and k-nearest neighbors. Finally, we develop a rule-based fuzzy inference system to integrate the outputs of these algorithms and to recognize dangerous and safe maneuvers. CADSE saves this result in the driver’s profile for further use in dangerous driving recognition. The experimental results show that the accuracy, precision, recall, and F-measure of the CADSE system are greater than 94%, 92%, 92%, and 93%, respectively, which demonstrates the system’s efficiency.

A Mobile Telematics Pattern Recognition Framework for Driving Behavior Extraction

Article

Feb 2020

Mobile telematics is a relatively new innovation that involves collecting data on driving behavior using the internal sensors in a smartphone rather than from an in-vehicle data recorder. However, telematics data are usually not labeled, which makes extracting driving patterns from them very difficult. Therefore, unsupervised learning algorithms play an important role in this field. In addition, most current research is based on datasets developed in a laboratory or from site investigations and questionnaires, which are very different from real-world driving behaviors. To advance unsupervised learning techniques in this field, and to fill the gap in findings based on real-world data, we have developed an unsupervised pattern recognition framework for mobile telematics data. The framework comprises three main components: a self-organizing map (SOM), a nine-layer deep auto-encoder, and partitive clustering algorithms. The SOM algorithm reduces the complexity of the data, the deep auto-encoder extracts the features, and the clustering algorithm groups driving events with similar patterns into behaviors. Further, given that clustering with mobile telematics data is an under-researched area, we undertook an empirical comparison of five well-known clustering algorithms to determine the strengths and weaknesses of each method and which is best suited to categorizing driving styles. The study was conducted with a real-world insurance dataset containing 500,000 journeys by 2500 drivers, and the results were evaluated against three measures: Davies-Bouldin, Calinski-Harabasz, and execution time. Overall, we find that k-means clustering and a self-organizing map were able to extract more accurate patterns than others. A statistical analysis of the 29 clusters produced by SOM and k-means revealed 29 unique driving styles, all of which can be found in the transportation literature.
The results from the study, with support from the corresponding literature review, demonstrate the efficacy of the presented framework in unsupervised settings. Additionally, the results provide a basis for developing a future risk analysis and automatic decision support system for usage-based insurance companies.

A Comprehensive Review of Driver Behavior Analysis Utilizing Smartphones

Article

Sep 2019

Human factors are the primary catalyst for traffic accidents. Among different factors, fatigue, distraction, drunkenness, and/or recklessness are the most common types of abnormal driving behavior that lead to an accident. With technological advances, modern smartphones have the capabilities for driving behavior analysis. There has not yet been a comprehensive review of methodologies utilizing only a smartphone for drowsiness detection and abnormal driver behavior detection. In this paper, different methodologies proposed by different authors are discussed, including the sensing schemes, detection algorithms, and their corresponding accuracy and limitations. Challenges and possible solutions, such as integration of the smartphone behavior classification system with context awareness, mobile crowdsensing, and active steering control, are analyzed. The issue of model training and updating on the smartphone and in the cloud environment is also included.

Clustering driver behavior using dynamic time warping and hidden Markov model

Article

Aug 2019

Driver behavior data are collected from on-board diagnostics and Global Positioning System units installed in taxicabs. Left turn data on six similar curves are extracted, and time series of the drivers' speed, acceleration, yaw rate, and sideslip angle are selected as clustering indexes. Initial clustering is implemented by Dynamic Time Warping (DTW) and Hierarchical Clustering, and the clustering results are put into the Hidden Markov Model (HMM) to iteratively optimize the results until convergence. Driver behavior patterns over time while driving on the curves and the statistical characteristics of different groups are examined. All indexes, including lateral and longitudinal vehicle control, differ significantly between groups, indicating that the clustering method of DTW and HMM can effectively classify driver behavior. Finally, the driving behavior in different groups is further investigated and classified based on characteristics related to safe and ecological driving. This method can be applied by automobile insurance companies and for the development of specific training courses for drivers to optimize their driving behavior.
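For readers unfamiliar with Dynamic Time Warping, the following minimal sketch computes the classic DTW distance between two one-dimensional series, which is the kind of pairwise dissimilarity a hierarchical clustering step can consume. It is a textbook formulation with an absolute-difference local cost and no warping-window constraint; whether the paper above used these exact choices is not stated in the abstract, and the function name is hypothetical.

```python
from typing import Sequence

def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Classic DTW distance with |x - y| as the local cost.

    Uses two rolling rows of the cumulative-cost table, so memory is
    O(len(b)) while time is O(len(a) * len(b)).
    """
    inf = float("inf")
    n, m = len(a), len(b)
    prev = [inf] * (m + 1)
    prev[0] = 0.0  # empty prefix aligned with empty prefix costs nothing
    for i in range(1, n + 1):
        cur = [inf] * (m + 1)
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: insertion, deletion, or match.
            cur[j] = cost + min(prev[j], cur[j - 1], prev[j - 1])
        prev = cur
    return prev[m]

# Two yaw-rate-like profiles with the same shape but different timing.
fast_turn = [0.0, 1.0, 2.0, 1.0, 0.0]
slow_turn = [0.0, 1.0, 1.0, 2.0, 2.0, 1.0, 0.0]
d_same_shape = dtw_distance(fast_turn, slow_turn)
d_offset = dtw_distance([0.0, 0.0, 0.0], [1.0, 1.0, 1.0])
```

Because DTW allows samples to be repeated during alignment, the two turn profiles above are treated as identical in shape even though one driver took longer through the curve, which is why it suits comparing maneuvers of different durations.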
