
V-ITS: Video-based Intelligent Transportation System for Monitoring Vehicle Illegal Activities

Abstract

Vehicle monitoring is a challenging task for a video-based intelligent transportation system (V-ITS). V-ITS systems have a significant socioeconomic impact on the development of smart cities, which constantly need to monitor different traffic parameters. It has been noted that traffic accidents are increasing throughout the world at a rate of 1.7%. The increase in accidents and in the percentage of deaths is due to people who do not abide by the traffic rules. To address these challenges, this paper develops an improved V-ITS system to detect and track vehicles and drivers' activities during highway driving. The improved system is capable of automatic traffic management that helps to prevent traffic accidents. It provides real-time detection of driver immediate line overrun, speed limit overrun and yellow-line driving. To develop the system, a pre-trained convolutional neural network (CNN) model with a 4-layer architecture was built, and a deep-belief network (DBN) model was then used to recognize illegal activities. The system is implemented mainly with OpenCV and Python. The freely available GRAM-RTM online data sets were used to test its performance. The overall performance of this intelligent V-ITS system is comparable to that of other state-of-the-art systems. The real-time experimental results indicate that the V-ITS system can be used to reduce the number of accidents and ensure the safety of passengers as well as pedestrians.
(IJACSA) International Journal of Advanced Computer Science and Applications,
Vol. 10, No. 3, 2019
202 | P a g e
www.ijacsa.thesai.org
Qaisar Abbas
College of Computer and Information Sciences,
Al Imam Mohammad Ibn Saud Islamic University (IMSIU), Riyadh 11432, Saudi Arabia
Keywords: Computer vision; intelligent traffic management
system; traffic monitoring; vehicle tracking from video; image
processing; deep learning
I. INTRODUCTION
A video-based intelligent transportation system (V-ITS) that
automatically tracks illegal driving activities of vehicles is
an active research area in computer vision and socioeconomic
development. Because the number of vehicles is increasing
rapidly, an intelligent system is required to limit serious
injuries caused by traffic-related accidents. This problem is
widespread across the globe. Automatic vehicle tracking is also
required to monitor roads and highways. To monitor a road or
highway, many traffic parameters should be calculated, such as
over-speeding, yellow-lane or off-road driving, and the
detection of obstacles present on the road. Such an expert
system also facilitates driver assistance during automatic
driving.
In particular, the number of accidents in Saudi Arabia has
increased rapidly during the last five months. According to one
estimate, there were 82.281 accidents in one month. These
accidents were caused by not leaving enough safety distance
between vehicles, running red traffic lights, sudden lane
changes, and a lack of commitment to priority rules. The
proposed system predicts and discovers drivers' mistakes,
especially during highway driving. A desktop application was
developed to detect and track both vehicles and drivers'
activities during highway driving, using a real-time video
camera installed on the highway.
This V-ITS-based system communicates with the respective
authorities so that strict measures can be applied when a driver
exceeds the speed limit, changes lane suddenly, leaves the
yellow lane or stops in the left lane. Tracking activities on
the road is a challenging and important task because of its
influence on surveillance and on roadside accidents. According
to past studies, the number of surveillance cameras [1] on the
road is increasing because of the negligence of car and truck
drivers during highway driving. Therefore, there is a great
demand for more surveillance systems.
The main goal of this paper is to apply recent computational
intelligence algorithms based on deep learning and to develop an
effective and efficient system for detecting illegal vehicle
activities in video sequences. The use of video image processing
and machine learning algorithms for traffic monitoring has been
initiated in many countries. When these algorithms are
implemented in hardware, different parameters are calculated.
All video detection systems used for traffic monitoring can be
broadly classified into two categories: 1) systems that rely on
localized incident detection, and 2) systems that track
individual vehicles. The advantage of the first is that the
computational requirements are quite low and the algorithms are
relatively simple. Vehicle tracking systems need sophisticated
algorithms and are usually computationally demanding, but they
offer a more accurate estimation of microscopic traffic
parameters such as lane changes and erratic motion. The main
objective of this project is therefore to track vehicle
activities during highway driving.
The V-ITS system is planned to detect, monitor and track vehicle
activities on the spot using real-time video sequences. Its
intended functions are to generate a penalty if a traffic
violation occurs during highway driving, to automatically
discover the times and days with the most accidents and the
peaks in traffic violations, to predict accidents and congestion
during highway driving, and to automatically issue penalties to
offending drivers. If the V-ITS system is used, drivers will try
to keep within the speed limits, which should reduce accidents
on the road. Besides speeding, accidents are also caused by
immediate lane changes or by driving on the yellow line to
overtake other vehicles at high speed. Under different weather
conditions on the road, the chance of accidents due to immediate
line overrun, speed limit overrun and yellow-line driving will
be reduced to 70%.
II. RELATED WORK
Object tracking from live video sequences in video surveillance
[1] applications is an important and emerging research area that
is attracting many scientists. Object tracking belongs to the
domain of computer vision, under the image processing category.
It has many practical applications such as traffic control,
security, surveillance and mass events. Tracking an object
effectively is always a challenging task. During driving, ways
have to be found to track vehicle activities so that accidents
are minimized. Object tracking is also required to highlight the
role of humans in the next generation [2] of driver assistance
and intelligent vehicles. It is important to detect and track
the activities of human or robotic driving for safety reasons.
The number of roadside accidents is rapidly increasing
throughout the world, especially in Saudi Arabia. According to
one estimate, 82.281 accidents occurred within one month. These
accidents were caused by not leaving enough safety distance
between vehicles, running a red traffic light, sudden lane
changes, and a lack of commitment to priority rules. In this
paper, an automatic system is developed to predict and discover
drivers' mistakes, especially during highway driving. A desktop
application is implemented to detect and track both vehicles and
drivers' activities during highway driving, using a real-time
video camera installed on the highway. Real-time car detection
and tracking are applied over hundreds of image frames. The
proposed system will communicate with the respective authorities
so that strict measures can be applied when a driver exceeds the
speed limit, changes lane suddenly, leaves the yellow lane or
stops in the left lane.
Vision-based systems provide information about the road, traffic
lights and other drivers on the road [3]. The number of vehicles
has increased and places a heavy burden on computer vision
systems. Therefore, there is a dire need for effective and
efficient solutions for tracking illegal vehicle activities on
the road. Moreover, security risks have increased significantly,
which makes highway surveillance an important subject for law
enforcement authorities. Many researchers have developed
tracking algorithms to solve these problems, but the algorithms
are still inefficient and ineffective.
Another requirement of these systems in public spaces is to have
a reasonably large number of pixels on a target. Many different
types of targets could be important, and it is often not
possible to obtain both a large number of pixels on the target
and robustness [4]. Road digital cameras must provide penalties
for uncommon events rather than just over-speed information.
For example, these cameras must capture plenty of information
related to immediate line overrun, speed limit overrun and
yellow-line driving. Automatic detection and identification of
objects is of prime importance in security systems and video
surveillance applications. Automatic video surveillance is
positioned to cover the scenes of most interest. Within the
camera's field of view, some areas are of greater importance
than others and some areas are of no interest at all.
Automatic vehicle detection and tracking is an important subject
in the domain of computer vision. Therefore, this paper develops
an application that can track vehicle activities from live video
streams during highway driving. The primary aim of this project
(V-ITS) is the detection of moving vehicles for surveillance
purposes. The problem is also related to the domain of
artificial intelligence [5] because it involves recognizing
moving objects in live video streams. The V-ITS problem requires
an accurate solution that tracks and segments [6] an object at
the same time without losing time efficiency. Real-time vehicle
activity tracking therefore means that the tracking and
segmentation steps are integrated. It is also important that the
developed application helps law enforcement agencies to discover
any vehicle violating the law, improves traffic safety levels
and raises the quality of the existing road network.
The system's features can achieve a better standard of road
safety by reducing the rate of accidents and deaths among
drivers as far as possible. It can read by how much a vehicle
exceeds the speed limit. The system works on highways only
because, according to statistics, accidents often happen on
highways. To reduce the number of accidents, the system is
designed to monitor the vehicle closely so that it can follow
the driver's movements and behavior. The system also processes
the images of accidents in the video clips and adds them to a
dataset to discover which roads have the most accidents.
The speed limit used for traffic violations in KSA varies from
120 km/h to 140 km/h; if a driver exceeds the limit by more than
10%, the system detects the violation. In past studies, authors
used image processing and machine learning algorithms to detect
drivers' activities in real time. These machine learning
techniques include SVM [7], [8], PCA [9], neural networks [10]
and Bayesian decision-making [11]. In this paper, however,
recent deep learning techniques are integrated to develop a
novel system.
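The 10% tolerance rule stated above can be sketched as a small check. This is only an illustration: the function name and the idea of passing the posted limit as a parameter are assumptions, not part of the described system.

```python
def exceeds_speed_limit(speed_kmh: float, limit_kmh: float,
                        tolerance: float = 0.10) -> bool:
    """Return True when the speed is more than `tolerance` above the limit.

    The 10% default follows the rule quoted in the text; the posted limit
    (120 to 140 km/h in the text) is supplied by the caller.
    """
    if limit_kmh <= 0:
        raise ValueError("limit_kmh must be positive")
    return speed_kmh > limit_kmh * (1.0 + tolerance)


# A vehicle at 133 km/h on a 120 km/h road is flagged; 130 km/h is not.
flagged = exceeds_speed_limit(133.0, 120.0)
tolerated = exceeds_speed_limit(130.0, 120.0)
```

Keeping the tolerance as a parameter makes it easy to apply the same check to roads with different posted limits.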
Besides driving activity control systems, there is also a need
for driving assistance systems, which is considered a
challenging task [12, 13]. For this task, many algorithms have
been proposed to track vehicles in motorway driving [14]. The
advantage of these systems is their real-time performance. Many
systems also address the problems of detecting and tracking
vehicle activities in real time. These systems are explained
below.
A video-based intelligent transportation system (V-ITS) provides
an important solution to a problem with socioeconomic [15]
impact on society. For tracking or segmentation of vehicles,
different traffic parameters need to be monitored, such as
yellow-line driving, off-road driving or immediate lane changing
during highway driving. Traditional video-based tracking systems
face many challenges, such as vehicle drifting, occlusions,
obstacles and detection under various environmental conditions.
Few studies have addressed all of these problems during vehicle
tracking.
In [16], the authors developed a driving assistance system that
tracks head movement and eye blinking. In [17], the authors
presented a solution for detecting lanes with a vision system
mounted on the vehicle; they concluded, however, that lane
detection is a difficult problem because of the varying road
conditions encountered while driving. Other papers discuss
techniques for monitoring and understanding real-world human
activities, in particular those of drivers, from distributed
vision sensors [18]. To achieve this goal, the authors used
different components such as a head pose estimation module, hand
and foot tracking, ego-vehicle parameters, lane and road
geometry analysis, and surrounding vehicle trajectories. The
system was evaluated on a challenging dataset of naturalistic
driving in real-world settings.
In contrast to these approaches, the authors of [19] developed
an Intelligent Vehicle Monitoring System that uses the Global
Positioning System together with Google Maps and cloud computing
to collect useful information about a vehicle. Vehicle detection
is also a challenging task. In [20], the authors developed a
video-based analysis system that detects, tracks and archives
vehicles in video stream data at multiple resolutions. This step
is important even for controlling the activities of autonomous
vehicles. In urban areas [21], sensors in the car have been used
to communicate with the cloud and report drivers' activities
during driving. In [22], the authors introduced an activity
classification system based on a random forest (RF) classifier.
Moreover, in [23], the authors discussed a human-centered
perspective for developing an intelligent vehicle.
Existing V-ITS approaches to video-based vehicle tracking have
many limitations and deficiencies, which are as follows.
Previous approaches did not provide effective traffic monitoring
results because of occlusion and spillover. The state-of-the-art
approaches used old-fashioned image processing and machine
learning algorithms to track illegal vehicle movements. In
addition, large trucks often occlude neighboring vehicles, which
makes it hard to recognize the activity of the vehicle
alongside. The approach in this paper aims to overcome these
limitations. Past studies were unable to detect vehicles under
headlight reflections and in different weather situations such
as dust storms. Dust storms occur in KSA and cause automatic
systems to raise false positives when detecting vehicle
activities.
The accuracy of tracking is also affected by the distance of the
camera from the closest lane. A larger pan angle is required to
cover all lanes when the camera is placed far from the closest
lane, so the camera should be placed as close to the closest
lane as possible. In previous techniques, the selection of
vehicle key points is crucial, and those systems are sensitive
to drift and occlusion. Other systems developed outside KSA for
automatically determining illegal vehicle activities are
computationally expensive because of the increasing number of
vehicles. The literature shows that there is a dire need for a
high-performance vehicle detection method.
To overcome the above-mentioned problems, an improved V-ITS
system based on advanced deep learning algorithms is presented
to obtain robust results. In past studies, deep learning
algorithms [24-36] have had many variants for representing
visual features, such as the convolutional neural network (CNN),
recurrent neural network (RNN), deep belief network (DBN),
restricted Boltzmann machine (RBM) and autoencoder. A four-layer
convolutional neural network (CNN) [38] model is trained on
samples from different roads under diverse environmental
conditions. These samples were obtained from the GRAM
Road-Traffic Monitoring (GRAM-RTM) dataset [37]. Three layers of
the CNN model are dedicated to driver immediate line overrun,
speed limit overrun and yellow-line driving, together with one
feature extraction layer. After this CNN model, a deep belief
network (DBN) [39] model is used to classify these three illegal
driver activities.
III. METHODOLOGY
The systematic flow diagram of all steps of the proposed V-ITS
system is shown in Fig. 1. The increasing number of vehicles and
the variety of environmental conditions have created new
interest in developing technologies for real-time video image
processing. Past studies show that many commercial systems have
been proposed, but they have difficulties with congestion,
shadows, dust storms and lighting transitions. There are also
illegal driving activities, such as yellow-lane driving, that
they are unable to detect. Therefore, in this research study, a
feature-based transform system is developed to overcome these
difficulties by using advanced deep learning algorithms to
process real-time video sequences. The author calls this system
a video-based intelligent transportation system (V-ITS).
In this paper, vehicle features are tracked instead of entire
vehicles, which makes the system robust to various environmental
conditions such as occlusion and lighting changes. The developed
system can also easily differentiate between cars and trucks
through feature-based tracking. Some examples of training
samples for traffic violations on a highway are shown in Fig. 2.
Overall, the proposed vehicle tracking system consists of four
main stages. The first step is to segment the scene into
individual vehicles and then track each vehicle inside a
tracking zone. A simple background subtraction technique is used
to segment vehicles from the background of the video frames.
After a vehicle has been detected and segmented from the
background scene, the next step is to compute traffic parameters
such as the vehicle speed in the different lanes of the highway.
After the local parameters have been collected through the CNN
model at the collection site, this data is passed on to
automated and operator-assisted applications, which are
developed through the DBN model for classification. Finally, a
penalty is generated according to the specific traffic violation
and stored on a distributed server to help avoid further traffic
violations.
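The background subtraction stage mentioned above can be illustrated with a dependency-free sketch. The running-average model, the `alpha` blending factor and the threshold of 30 intensity levels are assumptions chosen for illustration; the paper does not state which background model or parameters were used, and an OpenCV implementation would more likely rely on routines such as `cv2.absdiff` or `cv2.createBackgroundSubtractorMOG2`.

```python
from typing import List

Frame = List[List[float]]  # grayscale image as rows of pixel intensities


def update_background(background: Frame, frame: Frame,
                      alpha: float = 0.05) -> Frame:
    """Blend the new frame into a running-average background model."""
    return [
        [(1.0 - alpha) * b + alpha * f for b, f in zip(b_row, f_row)]
        for b_row, f_row in zip(background, frame)
    ]


def foreground_mask(background: Frame, frame: Frame,
                    threshold: float = 30.0) -> List[List[int]]:
    """Mark pixels that differ from the background by more than threshold."""
    return [
        [1 if abs(f - b) > threshold else 0 for b, f in zip(b_row, f_row)]
        for b_row, f_row in zip(background, frame)
    ]


# Toy 3x4 scene: an empty road, then a bright "vehicle" on the right side.
empty_road = [[10.0] * 4 for _ in range(3)]
with_vehicle = [[10.0, 10.0, 200.0, 200.0] for _ in range(3)]

mask = foreground_mask(empty_road, with_vehicle)
background = update_background(empty_road, with_vehicle)
```

The connected regions of the mask would then be treated as vehicle candidates and handed to the tracking stage.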
The first step of this research study is to extract video frames
recorded from the side of the highway during daily driving. A
pre-trained four-layer convolutional neural network (CNN) model
is used to transform features for predicting drivers' illegal
activities during highway driving. Frames are extracted from the
video every 0.3 seconds. Afterward, a deep-belief network (DBN)
model is applied to make the final decision on the class of
illegal driving activity. The DBN model helps the pre-trained
CNN model at the pooling layer to define an effective feature
map. In the proposed system, a single feature map is extracted
from each video frame. The final prediction of the driver's
illegal activities uses a group of frames: fifteen feature maps
generated by the pre-trained CNN model, the equivalent of three
seconds of video, are stored for prediction. This group of
feature maps is then concatenated into one single pattern, which
is the input to the proposed DBN model, to obtain the final
classification of the traffic violation.
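The grouping of fifteen feature maps into one input pattern can be sketched with a fixed-length buffer. The class name, the flattened-list representation and the four-value toy feature maps are hypothetical; only the window size of fifteen comes from the text.

```python
from collections import deque
from typing import Deque, List, Optional

FeatureMap = List[float]  # one flattened feature map per sampled frame


class FeatureWindow:
    """Keep the most recent `size` feature maps and join them into a pattern."""

    def __init__(self, size: int = 15) -> None:
        self.size = size
        self._maps: Deque[FeatureMap] = deque(maxlen=size)

    def push(self, feature_map: FeatureMap) -> Optional[List[float]]:
        """Add a map; return the concatenated pattern once the window is full."""
        self._maps.append(feature_map)
        if len(self._maps) < self.size:
            return None
        return [value for fmap in self._maps for value in fmap]


window = FeatureWindow(size=15)
pattern = None
for frame_index in range(15):
    # Stand-in for the CNN output: a 4-value feature map per sampled frame.
    pattern = window.push([float(frame_index)] * 4)
```

Because the buffer drops the oldest map automatically, every new sampled frame yields an updated pattern once the first fifteen have arrived.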
To train the CNN model, the GRAM Road-Traffic Monitoring
(GRAM-RTM) dataset is used. From this dataset, sample videos are
selected under different environmental conditions so that the
system performs better than state-of-the-art systems. The
methodological steps are explained in the following subsections.
A. Acquisition of Datasets
The GRAM Road-Traffic Monitoring (GRAM-RTM) dataset [37] is used
in this paper to test and compare the performance of the
proposed V-ITS system. The GRAM-RTM dataset provides ground
truth for multiple-vehicle tracking in real-time video
processing. The V-ITS system was implemented and tested on an HP
laptop with an Intel Core i7 CPU @ 3.35 GHz and 8 GB of RAM
running Windows XP. The program was written using OpenCV and
Python. The experimental results were also measured
statistically. A dataset of 1200 region-of-interest (ROI) video
frame images, 600 normal and 600 with traffic violations, was
acquired from GRAM-RTM to test and evaluate the performance of
the V-ITS system. An example from the urban second video of this
dataset is shown in Fig. 2.
Fig. 1. A Systematic Diagram of Proposed V-ITS System for Detecting
Vehicle Illegal Activities.
Fig. 2. An Example of Training a Convolutional Neural Network Model for
Transfer Training Video Region-of-Interest (ROI).
The regions of interest (ROIs) of each video frame from the
GRAM-RTM dataset are defined automatically to train the
convolutional neural network (CNN) [38] for feature
transformation. In the video frames, three lines are drawn to
define vehicle tracking and to check for illegal activities.
This step is illustrated in Fig. 2. High-level features are
derived from these ROI video frames to train the layers of the
CNN model effectively. The CNN model is described in detail in
the next subsection.
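One way to use a drawn line for checking illegal activities is to test whether a tracked vehicle centroid moves from one side of the line to the other. The sketch below is an assumption about how such a check could work, not the paper's stated method; the line coordinates and the centroid track are made up, and the line is treated as infinite rather than as a finite segment.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def side_of_line(point: Point, line_start: Point, line_end: Point) -> float:
    """Signed area test: positive on one side of the line, negative on the other."""
    (px, py), (ax, ay), (bx, by) = point, line_start, line_end
    return (bx - ax) * (py - ay) - (by - ay) * (px - ax)


def crossing_indices(track: List[Point], line_start: Point,
                     line_end: Point) -> List[int]:
    """Return each index i where the track changes side between i - 1 and i."""
    crossings = []
    for i in range(1, len(track)):
        before = side_of_line(track[i - 1], line_start, line_end)
        after = side_of_line(track[i], line_start, line_end)
        if before * after < 0:
            crossings.append(i)
    return crossings


# Hypothetical yellow line along x = 100 in image coordinates (pixels).
yellow_line = ((100.0, 0.0), (100.0, 480.0))
# Vehicle centroid drifting from left to right across the line.
track = [(80.0, 400.0), (90.0, 350.0), (105.0, 300.0), (110.0, 250.0)]
crossed_at = crossing_indices(track, *yellow_line)
```

The same test applies to each of the three drawn lines, so a lane change and a yellow-line crossing can be distinguished by which line was crossed.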
B. Pretrain CNN Model
The convolutional neural network (CNN) model [38] is used in
many applications and fields to extract deep features. In
practice, a CNN detects pixel-level features using convolutional
filters in different layers and finally classifies them with a
softmax linear classifier. In past studies, CNNs achieved
significantly higher performance than manually tuned machine
learning algorithms.
Extracting features from an image with image processing requires
domain-expert knowledge to select the best features. If a
convolutional neural network (CNN) model is applied to a digital
image, however, it provides deep invariant features. These
features are extracted from the different layers of the CNN
model. To recognize objects in images, the feature map feeds the
output layer. However, the feature map created by a CNN model is
not optimized for real-time video processing. Therefore, a
pre-trained CNN model is used to obtain an effective feature
map.
In this paper, a pre-trained CNN model is used to extract
effective and optimized features from video frames. A
pre-training strategy is used to guide the CNN in defining the
features from video frames. The dataset is divided into three
groups: immediate line overrun, speed limit overrun and
yellow-line driving. The frames are extracted from the GRAM
Road-Traffic Monitoring (GRAM-RTM) dataset and represent these
three problems. The CNN network structure learns to recognize
the three traffic violation tasks from labeled video sequences.
At the top layer of the CNN model, the features are extracted
and transformed into weights to train the next layer of the CNN
model. The pre-training step for the CNN multilayer
architecture, together with the deep-belief network (DBN), is
explained in the following paragraphs.
[Fig. 1 block labels: Extract video frames; Transfer features
using pre-trained CNN; Deep-belief network; Detect and classify
illegal activities; Generate penalty; Illegal activities
tracking]
The regions of interest (ROIs) are extracted from fixed regions
of each video frame and then sent to the four-layer CNN
architecture, which has already been trained for high-level
feature extraction on GRAM-RTM video frames in three categories.
These categories correspond to traffic violations. Convolution
and max-pooling layers are added to the network to extract and
select the most important features. Two fully-connected deep
belief network (DBN) layers are then added to the network after
the convolution and pooling layers. In practice, a multi-layer
RBM network is used to build the unsupervised DBN. Among deep
learning algorithms, the DBN architecture has proved to be an
excellent generative model that performs well when its
parameters are fine-tuned.
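The convolution and max-pooling operations named above can be shown on a toy input. This is a plain-Python sketch for illustration only: the 2x2 filter, the ReLU activation and the 5x5 input are assumptions, and a real implementation would use a deep learning framework with learned filters.

```python
from typing import List

Matrix = List[List[float]]


def conv2d_valid(image: Matrix, kernel: Matrix) -> Matrix:
    """'Valid' 2-D cross-correlation, the operation CNN layers call convolution."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(feature_map: Matrix) -> Matrix:
    """Clamp negative responses to zero."""
    return [[max(0.0, v) for v in row] for row in feature_map]


def max_pool(feature_map: Matrix, size: int = 2) -> Matrix:
    """Non-overlapping max pooling; incomplete trailing windows are dropped."""
    rows, cols = len(feature_map) // size, len(feature_map[0]) // size
    return [
        [
            max(feature_map[r * size + i][c * size + j]
                for i in range(size) for j in range(size))
            for c in range(cols)
        ]
        for r in range(rows)
    ]


# 5x5 toy ROI with a vertical edge between columns 1 and 2.
roi = [[0.0, 0.0, 1.0, 1.0, 1.0] for _ in range(5)]
# Hypothetical 2x2 filter that responds to a dark-to-bright step.
edge_filter = [[-1.0, 1.0], [-1.0, 1.0]]

pooled = max_pool(relu(conv2d_valid(roi, edge_filter)))
```

Pooling keeps the strongest edge response in each 2x2 block, which is how the layer selects the most important features while shrinking the map.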
During the training stage, high-level features are learned
simultaneously with the training of the proposed CNN network.
After four layers of convolution, RBM and pooling, the features
output by the last layer are passed to the two-layer
fully-connected network for further training. Finally, the
trained feature matrix is taken as the input of the DBN. The DBN
is trained and fully connected to the output of the network to
predict saliency. In the inference stage, the full image is used
as the input of the network. As in the training stage, the
high-level features of the test video frame are extracted by the
trained CNN network. Finally, the penalty value is computed for
each frame using the trained DBN and the learned features.
To develop the pre-trained CNN model, the high-level features
are learned or extracted together to train the multi-layer
architecture of the network. The pre-training step is applied in
advance to learn informative features from the video frames. The
dataset samples are divided into three classes. The ROIs are
defined beforehand in each video to give the training step a
better initialization. These ROI frames are shown in Fig. 2. For
the three traffic violations, a set of informative features is
defined to train the CNN model effectively. A single feature map
is then generated from every image and convolved with a Gaussian
mask.
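A Gaussian mask of the kind mentioned above can be built as follows. The kernel size of 5 and sigma of 1.0 are assumptions for illustration; the paper does not report the mask parameters.

```python
import math
from typing import List


def gaussian_mask(size: int = 5, sigma: float = 1.0) -> List[List[float]]:
    """Build a normalized size x size Gaussian kernel centered on the middle cell."""
    if size < 1 or size % 2 == 0:
        raise ValueError("size must be a positive odd integer")
    center = size // 2
    raw = [
        [
            math.exp(-((r - center) ** 2 + (c - center) ** 2)
                     / (2.0 * sigma ** 2))
            for c in range(size)
        ]
        for r in range(size)
    ]
    total = sum(sum(row) for row in raw)
    # Normalizing keeps the overall intensity of the feature map unchanged.
    return [[value / total for value in row] for row in raw]


mask = gaussian_mask(size=5, sigma=1.0)
```

Convolving a feature map with this mask smooths it, giving more weight to responses near the center of each neighborhood.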
The first layer of the convolutional neural network (CNN) model
is generated following the DBN and max-pooling concepts. To
develop the pre-trained CNN model, the features obtained at the
first layer are further learned through the next three CNN
layers. Training is similar to the pre-training step: the
proposed CNN network is trained with the features from the
pre-training step as input, collected from the same target frame
regions.
C. Drivers Illegal Activities Prediction
Given a frame region, the driver's illegal activity is predicted
by the pre-trained CNN model and classified by the deep-belief
network (DBN) multilayer architecture. To obtain the high-level
traffic violation parameters, the ROI image region of the video
frame is used to train the CNN network. Convolutional filters
are applied at each layer of the pre-trained CNN model.
Afterward, the deep belief network (DBN) model is applied after
the pooling layer to classify the driver's illegal activities.
To recognize these activities, the last two layers are
fully-connected layers whose input is the feature map passed to
the DBN architecture.
After the DBN model has been run repeatedly, a weight matrix is
obtained that corresponds to the high-level video features of
the frame. Hence, the traffic violation value of the frame can
be obtained by multiplying the weights by the features, as
defined by Eq. (1).

   (1)
Here, w denotes the parameters learned by the deep-belief
network (DBN) classifier for the three penalty classes, and x is
the high-level feature matrix extracted by the trained
multilayer convolutional neural network (CNN) model. G denotes
the Gaussian masking template used to detect driver immediate
line overrun, speed limit overrun and yellow-line driving.
IV. EXPERIMENTAL RESULTS
This section describes experimental results that validate the
performance of the proposed V-ITS system for detecting illegal
vehicle activities without any pre- or post-processing of
images. The proposed V-ITS system is based on variants of
deep-learning multilayer architectures, which distinguishes it
from state-of-the-art detection systems.
Performance was evaluated on frames extracted from the GRAM
Road-Traffic Monitoring (GRAM-RTM) dataset. The multilayer CNN
model was trained on three different samples from the GRAM-RTM
dataset. In the three samples, the driver’s illegal activities
are counted as immediate line overrun, speed limit overrun and
yellow-line driving. The samples cover different environments,
such as normal, sunny, and cloudy conditions. To extract
high-level features from the video frames, an RBM layer was
added to the pre-trained CNN model. To show the effectiveness of
the proposed pre-trained CNN and DBN models, a comparison with
the CNN network is presented in Table I.
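The role of an RBM layer in pre-training can be illustrated with a minimal Bernoulli RBM trained by one-step contrastive divergence (CD-1). This is a generic textbook sketch in pure Python, not the V-ITS code: the class name TinyRBM, the layer sizes, the learning rate and the two toy binary "patches" are hypothetical, and the real system would operate on CNN feature maps rather than four-element vectors.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class TinyRBM:
    """Bernoulli RBM trained with one-step contrastive divergence (CD-1)."""

    def __init__(self, n_visible, n_hidden, seed=0):
        self.rng = random.Random(seed)
        # Small random weights; W[i][j] links visible unit i to hidden unit j.
        self.W = [[self.rng.gauss(0.0, 0.1) for _ in range(n_hidden)]
                  for _ in range(n_visible)]
        self.b = [0.0] * n_visible  # visible biases
        self.c = [0.0] * n_hidden   # hidden biases

    def hidden_probs(self, v):
        return [sigmoid(self.c[j] + sum(v[i] * self.W[i][j] for i in range(len(v))))
                for j in range(len(self.c))]

    def visible_probs(self, h):
        return [sigmoid(self.b[i] + sum(h[j] * self.W[i][j] for j in range(len(h))))
                for i in range(len(self.b))]

    def cd1_step(self, v0, lr=0.1):
        # Positive phase: hidden activations driven by the data.
        h0 = self.hidden_probs(v0)
        h0_sample = [1.0 if self.rng.random() < p else 0.0 for p in h0]
        # Negative phase: reconstruct the input, then recompute the hidden units.
        v1 = self.visible_probs(h0_sample)
        h1 = self.hidden_probs(v1)
        # Move the parameters towards the data statistics and away from the model's.
        for i in range(len(self.b)):
            for j in range(len(self.c)):
                self.W[i][j] += lr * (v0[i] * h0[j] - v1[i] * h1[j])
            self.b[i] += lr * (v0[i] - v1[i])
        for j in range(len(self.c)):
            self.c[j] += lr * (h0[j] - h1[j])


# Two toy binary patterns standing in for flattened feature patches.
patches = [[1, 1, 0, 0], [0, 0, 1, 1]]
rbm = TinyRBM(n_visible=4, n_hidden=2)
for _ in range(200):
    for v in patches:
        rbm.cd1_step(v)

# The hidden probabilities are the learned features passed to the next layer.
features = [rbm.hidden_probs(v) for v in patches]
print(features)
```

Stacking several such layers and training them one after another is the usual way a DBN is built, which is why an RBM layer is a natural bridge between the CNN features and the DBN classifier.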
The V-ITS system was implemented with Python and OpenCV on a
Windows XP 64-bit system. The learning rate for training the
four-layer CNN was initialized as (10 x 6), with a batch size of
32. The four-layer CNN model was trained for about 75 epochs,
and the whole training procedure took nearly 12 hours. In the
experiments, it took on average 0.254 seconds to train on an
image and 0.152 seconds to test a video frame. The performance
of the detector is measured by two kinds of
errors: missing a true violation and reporting a false
violation. The first error is measured by the detection rate,
while the second is measured by the number of false accepts.
Both measures are reported as averages in Table I.
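The two error measures can be made concrete with a short Python sketch. It assumes that ground truth and detections are both available as sets of (frame index, violation label) pairs; this data layout, the function name evaluate and the toy values are hypothetical and only illustrate how a detection rate and a false-accept count are typically computed.

```python
def evaluate(ground_truth, detections):
    """Return (detection_rate, false_accepts) for two sets of (frame, label) pairs.

    detection_rate: fraction of true violations that were detected.
    false_accepts:  number of detections with no matching true violation.
    """
    hits = ground_truth & detections
    detection_rate = len(hits) / len(ground_truth) if ground_truth else 0.0
    false_accepts = len(detections - ground_truth)
    return detection_rate, false_accepts


# Toy annotations: four true violations, one missed, one spurious detection.
truth = {(3, "speed"), (7, "yellow-line"), (9, "line-overrun"), (12, "speed")}
found = {(3, "speed"), (7, "yellow-line"), (12, "speed"), (15, "speed")}

rate, false_accepts = evaluate(truth, found)
print(rate, false_accepts)
```

A missed violation lowers the detection rate, while a spurious detection only increases the false-accept count, so the two measures capture the two kinds of errors separately.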
To evaluate the V-ITS system, different vehicles were considered
in the experiments. The results are shown in Table I, which
covers all of the environmental conditions considered. As the
table shows, the proposed system performs better than a simple
convolutional neural network (CNN) model pre-trained through the
principal component analysis (PCA) technique. On average, the
proposed V-ITS system achieves a detection rate of about 90%,
compared with about 84% for the CNN model. The V-ITS system can
detect and recognize traffic violations at up to four frames per
second (fps), so at normal speed (i.e., 100 km/h) the system has
at least two opportunities to identify the driver’s illegal
activities. The V-ITS system will be upgraded in the future to
cover more traffic violations at lower speeds. The system has
been tested in real-time mode during nighttime and daytime under
different environmental conditions.
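The frame-rate argument can be checked with simple arithmetic. The Python sketch below converts the stated speed and processing rate into the distance a vehicle travels between two processed frames, and into the stretch of road that must stay in view for two frames to be captured. The function names are hypothetical, and the length of road actually covered by the camera is not given in the paper.

```python
def metres_between_frames(speed_kmh, fps):
    """Distance travelled between two consecutive processed frames."""
    speed_ms = speed_kmh / 3.6  # km/h to m/s
    return speed_ms / fps


def min_visible_distance(speed_kmh, fps, frames_needed):
    """Road length the vehicle covers while frames_needed frames are processed."""
    return metres_between_frames(speed_kmh, fps) * frames_needed


gap = metres_between_frames(100, 4)       # roughly 6.9 m per frame
need = min_visible_distance(100, 4, 2)    # roughly 13.9 m for two frames
print(round(gap, 2), round(need, 2))
```

At 100 km/h and 4 fps a vehicle moves about 7 m between processed frames, so two detection opportunities require the vehicle to remain visible for roughly 14 m of road.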
TABLE I. RESULTS CONCERNING THE AVERAGE PRECISION OF
TRACKING METHODS FOR VEHICLES IN SELECTED DATASET FROM GRAM-RTM
IN CASE OF IMMEDIATE LINE OVERRUN, SPEED LIMIT OVERRUN AND
YELLOW-LINE DRIVING

                          V-ITS Performance
No.                       CNN       Pre-train CNN & DBN
1                         0.751     0.890
2                         0.845     0.920
3                         0.821     0.945
Average Detection Rate    84.50%    90.01%

In this paper, a novel V-ITS system based on a new CNN framework
is proposed to automatically detect the driver’s illegal
activities during highway driving. High-level features are
learned to obtain effective features, and the classification
results are predicted by the DBN model. The proposed pre-trained
model differs from state-of-the-art techniques in that an extra
layer was added to the CNN framework through the DBN to obtain
more accurate features. Moreover, to avoid manual annotation of
video frames, a Deep Belief Network (DBN) classifier was added
to the pre-trained CNN model to recognize the driver’s illegal
activities without complex image processing algorithms. The
proposed V-ITS system outperforms a simple pre-trained CNN model
using PCA on the same selected dataset.
V. CONCLUSIONS
Tracking vehicle illegal activities is a critical step in the
development of an automatic traffic management system. Many past
studies focus on extracting traffic parameters without
considering environmental conditions. In this paper, an
efficient V-ITS system is developed to predict the driver’s
illegal activities during highway driving. The system was
evaluated and tested on the GRAM-RTM dataset. The experimental
results indicate that the proposed V-ITS system outperformed
state-of-the-art video processing systems. This is due to the
use of a pre-trained CNN model to transform the features,
followed by a DBN that classifies the vehicle illegal activities
in multiple video frames. The pre-trained CNN deep learning
algorithm allowed the V-ITS system to measure traffic parameters
robustly and without delay. The main objective of this paper is
to apply the latest deep learning algorithms to calculate
illegal traffic parameters without the pre- or post-processing
steps used in many studies. In this study, transformed features
and the multilayer architecture of deep learning algorithms
(DBN) are effectively integrated for better classification
results. Python and the OpenCV computer vision tools were used
to implement and test the V-ITS system. In a future study, more
traffic violations will be added according to the type of
vehicle.
ACKNOWLEDGMENT
The author would like to thank the Deanship of Scientific
Research at Al Imam Mohammad ibn Saud Islamic University,
Saudi Arabia, for financing this project under grant no.
(380902).
REFERENCES
[1] S. Ojha and S. Sakhare, "Image processing techniques for object
tracking in video surveillance- A survey," International Conference on
Pervasive Computing (ICPC), Pune, pp. 1-6, 2015.
[2] E. Ohn-Bar and M. M. Trivedi, "Looking at Humans in the Age of Self-
Driving and Highly Automated Vehicles," in IEEE Transactions on
Intelligent Vehicles, vol. 1, no. 1, pp. 90-104, March 2016.
[3] J. Wu and X. Zhang, “A PCA Classifier and its Application in Vehicle
Detection,” in Neural Networks, Proceedings. IJCNN ’01. International
Joint Conference on, vol. 1, 2001.
[4] S. Sivaraman and M. Trivedi, “A General Active-Learning Framework
for On-road Vehicle Recognition and Tracking,” IEEE Trans. on ITS,
vol. 11, no. 2, june 2010.
[5] A. Broggi, P. Cerri, and P. Antonello, “Multi-resolution Vehicle
Detection using Artificial Vision, in Intelligent Vehicles Symposium,
2004 IEEE, june 2004.
[6] H. Grabner, M. Grabner, and H. Bischof, “Real-Time Tracking via On-
line Boosting,” in Proc. BMVC, pp. 6.16.10, 2006.
[7] C. Papageorgiou and T. Poggio, “A Trainable System for Object
Detection,” Int. J. Comp. Vision, vol. 38, no. 1, pp. 15–33, Jun. 2000.
[8] Z. Sun, G. Bebis, and R. Miller, “On-road Vehicle Detection using
Gabor Filters and Support Vector Machines,” in 14th Intern. Conf. on
Digital Signal Processing (DSP), vol. 2, pp. 10191022, 2002.
[9] J. Wu and X. Zhang, “A PCA Classifier and its Application in Vehicle
Detection,” in Neural Networks, Proceedings. IJCNN ’01. International
Joint Conference on, vol. 1, pp. 600604 vol.1, 2001.
[10] N. Matthews, P. An, D. Charnley, and C. Harris, “Vehicle Detection and
Recognition in Grayscale Imagery,” Control Engineering Practice, vol.
4, no. 4, pp. 473479, 1996.
[11] H. Schneiderman and T. Kanade, “A Statistical Method for 3D Object
Detection applied to Faces and Cars,” in IEEE Conf. on Computer
Vision and Pattern Recognition, vol. 1, pp. 746751, 2000.
[12] A. Ess, K. Schindler, B. Leibe, and L. Van Gool, “Object Detection and
Tracking for Autonomous Navigation in Dynamic Environments,” The
International Journal of Robotics Research, vol. 29, 2010.
[13] E. Richter, R. Schubert, and G. Wanielik, “Radar and Vision-based Data
Fusion - Advanced Filtering Techniques for a Multi-object Vehicle
Tracking System,” in Intelligent Vehicles Symposium, 2008 IEEE, pp.
120125, , june 2008.
(IJACSA) International Journal of Advanced Computer Science and Applications,
Vol. 10, No. 3, 2019
208 | P a g e
www.ijacsa.thesai.org
[14] NC Mithun, T Howlader, SMM Rahman, Video-based tracking of
vehicles using multiple time-spatial images, Expert Systems with
Applications, vol. 62, no. 15, pp. 17-31, 2016.
[15] C. Braunagel, E. Kasneci, W. Stolzmann and W. Rosenstiel, "Driver-
Activity Recognition in the Context of Conditionally Autonomous
Driving," 2015 IEEE 18th International Conference on Intelligent
Transportation Systems, Las Palmas, pp. 1652-1657, 2015.
[16] A. A. Assidiq, O. O. Khalifa, M. R. Islam and S. Khan, "Real time lane
detection for autonomous vehicles," 2008 International Conference on
Computer and Communication Engineering, Kuala Lumpur, pp. 82-88,
2008.
[17] Eshed Ohn-Bar, Ashish Tawari, Sujitha Martin, Mohan M. Trivedi, On
surveillance for safety critical events: In-vehicle video networks for
predictive driver assistance systems, Computer Vision and Image
Understanding, vol. 134, pp. 130-140, May 2015.
[18] Dimil Jose, Sanath Prasad, V.G. Sridhar, "Intelligent Vehicle
Monitoring Using Global Positioning System and Cloud Computing",
Procedia Computer Science, vol. 50, pp. 440-446, 2015.
[19] W Wu, EA Bernal, RP Loce, ME Hoover, "Multi-resolution video
analysis and key feature preserving video reduction strategy for (real-
time) vehicle tracking and speed enforcement systems", S Patent
8,953,044, 2015.
[20] M. Gerla, E. K. Lee, G. Pau and U. Lee, "Internet of vehicles: From
intelligent grid to autonomous cars and vehicular clouds," IEEE World
Forum on Internet of Things (WF-IoT), Seoul, pp. 241-246, 2014.
[21] Lijie Xu and Kikuo Fujimura, Real-Time Driver Activity Recognition
with Random Forests. In Proceedings of the 6th International
Conference on Automotive User Interfaces and Interactive Vehicular
Applications (AutomotiveUI '14). ACM, New York, NY, USA, Article
9 , 8 pages, 2014.
[22] E. Ohn-Bar and M. M. Trivedi, "Looking at Humans in the Age of Self-
Driving and Highly Automated Vehicles," in IEEE Transactions on
Intelligent Vehicles, vol. 1, no. 1, pp. 90-104, March 2016.
[23] Jianwei Ding, Yongzhen Huang, Wei Liu, and Kaiqi Huang, "Severely
Blurred Object Tracking by Learning Deep Image Representations,
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR
VIDEO TECHNOLOGY", vol. 26, no. 2, FEBRUARY 2016.
[24] Hong yang Xue, Yao Liu, Deng Cai, Xiaofei He, Tracking people in
RGBD videos using deep learning and motion clues, Neurocomputing
vol. 204, pp. 70-76, 2016.
[25] Bohan Zhuang, Li jun Wang, Huchuan Lu, Visual tracking via shallow
and deep collaborative model, Neurocomputing, vol. 218, pp. 6171,
2016.
[26] Naiyan Wang, Dit-Yan Yeung, Learning a deep compact image
representation for visual tracking, Proceeding NIPS'13 Proceedings of
the 26th International Conference on Neural Information Processing
Systems, Lake Tahoe, Nevada, pp. 809-817, 2013.
[27] Lijun Wang, Wanli Ouyang, Xiaogang Wang, and Huchuan Lu, Visual
Tracking with fully Convolutional Networks, EEE ICCV 2015.
[28] Chao Ma, Jia-Bin Huang, Xiao kang Yang, Ming-Hsuan Yang,
Hierarchical Convolutional Features for Visual Tracking, Proceeding
ICCV '15 Proceedings of the 2015 IEEE International Conference on
Computer Vision (ICCV), pp. 3074-3082 December 07 - 13, 2015.
[29] H. Li, Y. Li and F. Porikli, "DeepTrack: Learning Discriminative
Feature Representations Online for Robust Visual Tracking," in IEEE
Transactions on Image Processing, vol. 25, no. 4, pp. 1834-1848, April
2016.
[30] Charissa Ann Ronao, Sung-Bae Cho, Human activity recognition with
smartphone sensors using deep learning neural networks, Expert
Systems With Applications, vol. 59, pp. 235-244, 2016.
[31] Trumble, M., Gilbert, A., Hilton, A., Collomosse, J.: Learning
markerless human pose estimation from multiple viewpoint video”, In:
Proc. ECCV Workshops, 2016.
[32] Lishen Pei, Mao Ye, Xuezhuan Zhao, Tao Xiang Tao Li, Learning
spatio-temporal features for action recognition from the side of the
video, Signal, Image and Video Processing January, vol. 10, no. 1, pp
199206, 2016.
[33] Konstantinos Charalampous, Antonios Gasteratos, On-line deep
learning method for action recognition, Pattern Analysis &
Applications, vol. 19, no. 2, pp. 337-354, May 2016.
[34] Cheng-Sheng Chan, Shou-Zhong Chen, Pei-Xuan Xie, Chiung-Chih
Chang, Min Sun, Recognition from Hand Cameras: A Revisit with
Deep Learning, Computer Vision ECCV 2016 Volume 9908 of the
series Lecture Notes in Computer Science pp. 505-521, September 2016.
[35] Yanming Guo, Yu Liu, Ard Oerlemans, Songyang Lao, Song Wu,
Michael S. Lew, Deep learning for visual understanding: A review,
Neurocomputing, Recent Developments on Deep Big Vision, vol. 187,
pp. 2748, 26 April 2016.
[36] Guerrero-Gómez-Olmedo, R., López-Sastre, R. J., Maldonado-Bascón,
S., & Fernández-Caballero, A. Vehicle tracking by simultaneous
detection and viewpoint estimation”, In International Work-Conference
on the Interplay Between Natural and Artificial Springer, Berlin,
Heidelberg, pp. 306-316, 2013.
[37] G.W. Yang, and J. Hui-Fang, Multiple Convolutional Neural Network
for Feature Extraction, International Conference on Intelligent
Computing, pp. 104114. Springer International Publishing, 2015.
[38] A. Qaisar, DeepCAD: A Computer-Aided Diagnosis System for
Mammographic Masses Using Deep Invariant Features, Computers,
vol.5,pp.115,2016.
[39] Qaisar Abbas, “Glaucoma-Deep: Detection of Glaucoma Eye Disease on
Retinal Fundus Images using Deep Learning”, International Journal of
Advanced Computer Science and Applications, 8(6):, pp. 4145 ,2017.
... Its main objective is to enhance traffic efficiency, safety, and sustainability by integrating technologies, data, and communication networks. Over the years, several trends and technologies have emerged in the traffic management system, including Intelligent Transportation Systems, Big Data Analytics, Autonomous Vehicles, Connected Vehicle Technology, and Multi-modal Transportation (Abbas, 2019). Big Data Analytics involves using data analysis techniques to extract valuable insights from large and complex data sets. ...
... By mitigating human factors such as driver error and fatigue, these systems can enhance road safety. The introduction of Autonomous Vehicles (AVs) further holds promise for reducing accidents caused by human mistakes (Abbas, 2019). Additionally, CVT enables seamless communication between vehicles, infrastructure, and pedestrians, facilitating more efficient traffic flow and safer road conditions. ...
... Another advantage is the potential for increased safety. Autonomous Vehicles (AVs) can minimize accidents caused by human error, while CVT facilitates real-time exchange of information, enhancing traffic flow and safety (Abbas, 2019). AVs can be programmed to follow traffic laws and avoid accidents, further enhancing safety (Pathik et al., 2022). ...
Chapter
Full-text available
The main objective of this proposed project is to manage the movement of people and goods efficiently. This traffic management system is based on AI and deep learning, which works with a traffic signal controller, vehicle classifier, and fine system. In this project, the programmable peripheral interface's buffer and ports are used to connect the traffic lights to the microprocessor system. As a result, the traffic lights can be turned ON or OFF automatically. The Interface Board was created to operate with the parallel port of the microprocessor system. A vehicle classifier is a vision-based vehicle classifier that uses machine learning algorithms to recognize vehicles and trucks in video pictures. Drivers and owners of motor vehicles who disobey traffic laws are subject to a fine system. When a traffic challenge is issued, it suggests that the recipient is liable to pay a fine that varies in amount according to the specific type of traffic infringement that was observed.
... Also, adding context information transmission in the designated layer can achieve accurate illegal parking detection. Abbas [25] constructed a pretrained convolutional neural network model with a four-layer architecture and used Shenxin network model to detect vehicle overlimit, speed limit overlimit, and yellow line driving and other illegal phenomena. Liu et al. [26] realized effective vehicle tracking in CNTK toolkit based on Fast-RNN network model and realized the identification of parking violations. ...
... After that, the overlap area A sidewalk of the lower part of the detection frame with the sidewalk and the overlap area A road with the road, respectively, and the overlap ratio according to the following formula can be calculated. If P > 1, it is judged as sidewalk parking [25]. ...
... In this section, to verify the superiority of the proposed method's recognition performance, based on the works of some authors [23,25,26] as a comparison method, a simulation experiment for the identification of taxi violation behaviors is realized under the same experimental scenarios and conditions. Experimental evaluation indicators are receiver operating characteristic (ROC) curve and Equal error rate (EER). ...
Article
Full-text available
Taxi has the characteristics of strong mobility and wide dispersion, which makes it difficult for relevant law enforcement officers to make accurate judgment on their illegal acts quickly and accurately. With the investment of intelligent transportation system, image analysis technology has become a new method to determine the illegal behavior of taxis, but the current image analysis method is still difficult to support the detection of illegal behavior of taxis in the actual complex image scene. To solve this problem, this study proposed a method of taxi violation recognition based on semantic segmentation of PSPNet and improved YOLOv3. (1) Based on YOLOv3, the proposed method introduces spatial pyramid pooling (SPP) for taxi recognition, which can convert vehicle feature images with different resolutions into feature vectors with the same dimension as the full connection layer and solve the problem of repeated extraction of YOLOv3 vehicle image features. (2) This method can recognize two different violations of taxi (blocking license plate and illegal parking) rather than only one. (3) Based on PSPNet semantic segmentation network, a taxi illegal parking detection method is proposed. This method can collect the global information of road condition images and aggregate the image information of different regions, so as to improve the ability to obtain the global information orderly and improve the accuracy of taxi illegal parking detection. The experimental results show that the proposed method has excellent recognition performance for the detection rate of license plate occlusion behavior DR is 85.3%, and the detection rate of taxi illegal parking phenomenon DR is 96.1%.
... The mutual problems that all the technology mentioned previously are the integration between them and the processing data through real-time connection [2,17,18,19,5]. VANET has an issue detecting other objects on the road to collect data [23] as well as computer vision when detecting the exact kind of violation that has been captured through camera sensors [7]. As for RFID, it has an issue when controlling the signal to detect the tag and dealing with the range of all vehicles at a time [14]. ...
... One reactive routing protocol that has been tested in VANET is the adhoc on-demand routing protocol distance vector (AODV) [9]. Furthermore, another research discusses about performance analysis of the ad-hoc on-demand routing protocol the distance vector (AODV) with the parameter 802.11p in the VANET environment [18,19,5]. ...
Article
Full-text available
Traffic Violation Detection system using radio frequency identification (RFID) has been applied to detect vehicles with limited power sources through RFID Tag. Camera sensors are also applied to identify a vehicle and its plate number through image and video processing known as computer vision. On the other hand, vehicular ad hoc network (VANET) to gain information about location, speed, and so on through the vehicle-to-vehicle (V2V) connection. Lastly, the internet-of-things that is backed-up by cloud computing helps to store various information from each of these technologies and process them to get results. Many researchers have proposed and developed algorithms or technology with astonishing experimental results of them. However, there has been no review of the integration and reporting mechanism that correlates them. There has also been no review about how to connect the information to the authority and violator. Therefore, the review was made to explain each technology’s method with their experiment’s results and issues. Moreover, the future challenge also had been given to further research.
Article
Presently, most smart cities face massive traffic issues every day. The smart cities’ significant challenge is the traffic control system, wherein some places are automated and cost-effective. In this manuscript, cloud-assisted Internet of things Intelligent Transportation System (CIoT-ITS) is proposed to overcome traffic management’s challenges. Here, the IoT sensor integrated camera is installed in every traffic signal corner to monitor the vehicle’s flow. Further, the optimised vehicle flow data is sent to the cloud processes. The data from the various signal corners runs an algorithm to detect traffic direction and controls the signal lights. The alert notification is sent to the nearest traffic control room during traffic congestion using IoT sensors. Simulation analysis proved that the proposed CIoT-ITS could monitor and manage the vehicle flow successfully and automatically. The proposed system has been validated based on the optimisation parameter, which outperforms conventional methods.
Article
Traffic Clog is the main issue of the fast and evolving world. Due to the rise in the use of more private vehicles and low road network capacity managing traffic with the traditional approach is cumbersome. Pollution and productivity of individuals are highly affected due to traffic. The use of mundane methods may not be an efficient and significant solution for varying traffic congestion. Nowadays, artificial intelligence (AI) and machine learning (ML) are playing an important role in solving many real-world problems. So, to tackle this problem, use of artificial intelligence and machine learning can give optimal solutions. An AI-enabled traffic management system can provide greater leeway to vehicles as they can then be directed and controlled more by the external environment. The main aim of using AI is to decrease manual interfacing. Various algorithms have been designed to curb this problem. The traffic management system consists of tools and technologies to gather information from heterogeneous sources. This study will help in identifying hazards that may potentially degrade traffic efficiency and its overcome technique. This article presents the detailed methodology, review, challenges, and future scope of the use of various algorithms for optimizing different aspects of Traffic Management System, i.e., Smart Traffic Signal Management, Traffic Flow Prediction, Traffic Congestion Detection, and its Management, and Automatic Detection of Traffic Signal.
Article
As a kind of transformer installed on vehicles, the vehicular mobile transformer needs to realize frequent transportation quickly and safely. However, there are few studies on the on-line monitoring of vehicular mobile transformers. In this paper, a transportation condition monitoring system for vehicular mobile transformers is introduced and developed. The transportation process of a 110kV/40MVA vehicular mobile transformer is monitored from Jiangsu Province to Guangdong Province, which is recognized as the first time that the transformer is monitored for more than 12h and more than 1000 km transportation. The geo-location, driving speed, vibration and shock acceleration in six different road sections are obtained in real time. Results show that the vibration and shock acceleration has a strong positive correlation with the driving speed. The amplitude of Y-axis acceleration is generally the lowest among three-axis accelerations and the three-axis vibration and shock acceleration has the same trend of increase and decrease. During the whole transportation process, the driving speed of the 110kV/40MVA vehicular mobile transformer varies from 0 to 76 km/h, and the vibration and shock acceleration amplitudes obtained by the monitoring system do not exceed 1g, which has great significance to the safe operation of power grid and can provide the strong evidence of online monitoring data to set transportation guidelines for vehicular mobile transformers.
Article
Full-text available
Road accidents mainly caused by the state of driver drowsiness. Detection of driver drowsiness (DDD) or fatigue is an important and challenging task to save road-side accidents. To help reduce the mortality rate, the "HybridFatigue" DDD system was proposed. This HybridFatigue system is based on integrating visual features through PERCLOS measure and non-visual features by heart-beat (ECG) sensors. A hybrid system was implemented to combine both visual and non-visual features. Those hybrid features have been extracted and classified as driver fatigue by advanced deep-learning-based architectures in real-time. A multi-layer based transfer learning approach by using a convolutional neural network (CNN) and deep-belief network (DBN) was used to detect driver fatigue from hybrid features. To solve night-time driving and to get accurate results, the ECG sensors were utilized on steering by analyzing heartbeat signals in case if the camera is not enough to get facial features. Also to solve the accurate detection of center head-position of drivers, two-cameras were mounted instead of a single camera. As a result, a new HybridFatigue system was proposed to get high accuracy of driver's fatigue. To train and test this HybridFatigue system, three online datasets were used. Compare to state-of-the-art DDD system, the HybridFatigue system is outperformed. On average, the HybridFatigue system achieved 94.5% detection accuracy on 4250 images when tested on different subjects in the variable environment. The experimental results indicate that the HybridFatigue system can be utilized to decrease accidents.
Conference Paper
Full-text available
We present a novel human performance capture technique capable of robustly estimating the pose (articulated joint positions) of a performer observed passively via multiple viewpoint video (MVV). An affine invariant pose descriptor is learned using a convolutional neural network (CNN) trained over volumetric data extracted from a MVV dataset of diverse human pose and appearance. A manifold embedding is learned via Gaussian Processes for the CNN descriptor and articulated pose spaces enabling regression and so estimation of human pose from MVV input. The learned descriptor and manifold are shown to generalise over a wide range of human poses, providing an efficient performance capture solution that requires no fiducials or other markers to be worn. The system is evaluated against ground truth joint configuration data from a commercial marker-based pose estimation system.
Article
Full-text available
The development of a computer-aided diagnosis (CAD) system for differentiation between benign and malignant mammographic masses is a challenging task due to the use of extensive pre- and post-processing steps and ineffective features set. In this paper, a novel CAD system is proposed called DeepCAD, which uses four phases to overcome these problems. The speed-up robust features (SURF) and local binary pattern variance (LBPV) descriptors are extracted from each mass. These descriptors are then transformed into invariant features. Afterwards, the deep invariant features (DIFs) are constructed in supervised and unsupervised fashion through multilayer deep-learning architecture. A fine-tuning step is integrated to determine the features, and the final decision is performed via softmax linear classifier. To evaluate this DeepCAD system, a dataset of 600 region-of-interest (ROI) masses including 300 benign and 300 malignant masses was obtained from two publicly available data sources. The performance of DeepCAD system is compared with the state-of-the-art methods in terms of area under the receiver operating characteristics (AUC) curve. The difference between AUC of DeepCAD and other methods is statistically significant, as it demonstrates a sensitivity (SN) of 92%, specificity (SP) of 84.2%, accuracy (ACC) of 91.5% and AUC of 0.91. The experimental results indicate that the proposed DeepCAD system is reliable for providing aid to radiologists without the need for explicit design.
Article
Full-text available
Recent advances in communications, controls, and embedded systems have changed the perception of a car. A vehicle has been the extension of the man’s ambulatory system, docile to the driver’s commands. It is now a formidable sensor platform, absorbing information from the environment (and from other cars) and feeding it to drivers and infrastructure to assist in safe navigation, pollution control, and traffic management. The next step in this evolution is just around the corner: the Internet of Autonomous Vehicles. Pioneered by the Google car, the Internet of Vehicles will be a distributed transport fabric capable of making its own decisions about driving customers to their destinations. Like other important instantiations of the Internet of Things (e.g. the smart building), the Internet of Vehicles will have communications, storage, intelligence, and learning capabilities to anticipate the customers’ intentions. The concept that will help transition to the Internet of Vehicles is the vehicular fog, the equivalent of instantaneous Internet cloud for vehicles, providing all the services required by the autonomous vehicles. In this article, we discuss the evolution from intelligent vehicle grid to autonomous, Internet-connected vehicles, and vehicular fog.
Conference Paper
In this paper, we study the challenging problem of tracking the trajectory of a moving object in a video with possibly very complex background. In contrast to most existing trackers which only learn the appearance of the tracked object online, we take a different approach, inspired by recent advances in deep learning architectures, by putting more emphasis on the (unsupervised) feature learning problem. Specifically, by using auxiliary natural images, we train a stacked denoising autoencoder offline to learn generic image features that are more robust against variations. This is then followed by knowledge transfer from offline training to the online tracking process. Online tracking involves a classification neural network which is constructed from the encoder part of the trained autoencoder as a feature extractor and an additional classification layer. Both the feature extractor and the classifier can be further tuned to adapt to appearance changes of the moving object. Comparison with the state-of-the-art trackers on some challenging benchmark video sequences shows that our deep learning tracker is more accurate while maintaining low computational cost with real-time performance when our MATLAB implementation of the tracker is used with a modest graphics processing unit (GPU).
Article
Vehicle tracking for video-based intelligent traffic management systems is known to have a significant socioeconomic impact. A successful vehicle tracking method is always in demand to monitor traffic parameters such as average speed, unusual movements, and vehicle congestion, or even to detect accidents automatically on highways or freeways. The challenges of traditional video-based vehicle tracking methods include initializing tracking for an unknown number of targets and reducing the drift of targets from their true positions, which is mainly caused by variations in lighting conditions, occlusions, and camera position. To address these challenges, this paper presents a novel vehicle tracking method for a traffic management system that introduces multiple time-spatial image (MTSI)-based detection into stochastic filter-based tracking. The MTSI-based tracking employs multiple key vehicular frames (KVFs) for each vehicular object in the traffic. These KVFs provide highly accurate positional information about the vehicles because the shape and texture of the vehicles are comparable on the same scale and do not depend on the speed of the traffic. The spatial correspondence of a vehicle in successive KVFs is then incorporated as a low-complexity data association technique to alleviate the common problem of drifting in the stochastic filter-based method, thereby increasing the accuracy of the tracking trajectory. Comprehensive experiments are carried out using two publicly available video databases (EBVT and GRAM-RTM) that contain traffic in varying environments to evaluate the vehicle tracking performance of the proposed method against existing methods. Experimental results demonstrate that the introduction of MTSIs not only automates the initialization of tracking, but also significantly increases the accuracy of the tracking trajectories of vehicles on roads, evaluated both in the presence and absence of ground truths.
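As a rough illustration of the time-spatial image idea mentioned in this abstract, one common construction samples a fixed row of pixels (a virtual detection line) from every frame and stacks those rows over time. The sketch below assumes that construction and represents frames as nested lists of grayscale values. The function names and the choice of rows are hypothetical, and the KVF selection and stochastic filter steps of the cited method are not covered.

```python
def time_spatial_image(frames, row):
    """Stack the pixels of one detection line across all frames.

    frames: list of 2D grayscale frames (lists of rows of pixel values).
    row:    index of the virtual detection line within each frame.
    Returns a 2D list with one row per frame (time axis) and one column
    per pixel along the detection line (spatial axis).
    """
    return [list(frame[row]) for frame in frames]


def multiple_time_spatial_images(frames, rows):
    """Build one time-spatial image per detection line, keyed by row index."""
    return {row: time_spatial_image(frames, row) for row in rows}


if __name__ == "__main__":
    # Three synthetic 4x5 frames; each pixel encodes frame, row, and column.
    frames = [
        [[f * 100 + r * 10 + c for c in range(5)] for r in range(4)]
        for f in range(3)
    ]
    images = multiple_time_spatial_images(frames, rows=[1, 3])
    for row, image in images.items():
        print("detection line", row)
        for line in image:
            print("  ", line)
```

A vehicle crossing a detection line appears as a blob in the corresponding time-spatial image, which is why several lines can give several positional observations of the same vehicle.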
Article
Tracking people in videos is an important topic in surveillance. We consider the problem of human tracking in RGBD videos filmed by sensors such as MS Kinect and Primesense. Our goal is to track persons where the crowd of people is known in advance or all persons in the video have appeared at the very beginning. Thus we can train a classifier to help classify and track persons across the video. A deep learning model trained with big data has proven to be an effective classifier for various kinds of objects. We propose to train a deep convolutional neural network to classify people, which improves tracking performance. A motion model based on spatial and kinetic cues is combined with the network to track people in the scene. We demonstrate the effectiveness of our method by evaluating it on several datasets and comparing it with traditional methods such as SVM.
Article
Human activities are inherently translation invariant and hierarchical. Human activity recognition (HAR), a field that has garnered a lot of attention in recent years due to its high demand in various application domains, makes use of time-series sensor data to infer activities. In this paper, a deep convolutional neural network (convnet) is proposed to perform efficient and effective HAR using smartphone sensors by exploiting the inherent characteristics of activities and 1D time-series signals, while also providing a way to automatically and data-adaptively extract robust features from raw data. Experiments show that convnets indeed derive relevant and more complex features with every additional layer, although the difference in feature complexity decreases with every additional layer. A wider time span of temporal local correlation can be exploited (kernel sizes from 1 × 9 to 1 × 14), and a low pooling size (1 × 2 to 1 × 3) is shown to be beneficial. Convnets also achieved almost perfect classification on moving activities, especially very similar ones that were previously perceived to be very difficult to classify. Lastly, convnets outperform other state-of-the-art data mining techniques in HAR on the benchmark dataset collected from 30 volunteer subjects, achieving an overall performance of 94.79% on the test set with raw sensor data, and 95.75% with additional information from the temporal fast Fourier transform of the HAR data set.
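To make the kernel and pooling sizes quoted in this abstract concrete, the following minimal sketch applies one 1D convolution (kernel width 9), a ReLU, and non-overlapping max pooling (size 2) to a single sensor channel. The 128-sample window, the averaging kernel, and the ReLU activation are assumptions chosen for illustration; the cited network learns its filters and uses several layers.

```python
def conv1d_valid(signal, kernel):
    """Slide the kernel over the signal with no padding (cross-correlation,
    as in most deep learning libraries)."""
    n, k = len(signal), len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(n - k + 1)
    ]


def relu(values):
    """Element-wise rectified linear activation."""
    return [max(0.0, v) for v in values]


def max_pool1d(values, size):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [
        max(values[i:i + size])
        for i in range(0, len(values) - size + 1, size)
    ]


def conv_block(signal, kernel, pool_size):
    """Convolution, activation, then pooling for one channel."""
    return max_pool1d(relu(conv1d_valid(signal, kernel)), pool_size)


if __name__ == "__main__":
    window = [float(i % 7) for i in range(128)]   # assumed 128-sample window
    kernel = [1.0 / 9] * 9                        # 1 x 9 averaging kernel
    features = conv_block(window, kernel, pool_size=2)
    # 128 samples -> 120 after a width-9 valid convolution -> 60 after pooling
    print(len(features))
```

A wider kernel covers a longer time span per output value, while a small pooling size discards less temporal detail between layers.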