Behavior Cloning for Autonomous Driving using
Convolutional Neural Networks
Wael Farag1,2,4, Zakaria Saleh3,5
1American University of the Middle East (AUM), Kuwait.
2Electrical Eng., Cairo University, Egypt
3University of Bahrain, Bahrain.
4wael.farag@aum.edu.kw, 5zsaleh@ubo.edu.bh
Abstract—In this paper, we propose using a Convolutional
Neural Network (CNN) to learn safe driving behavior and
smooth steering maneuvers in support of autonomous driving
technologies. The training data consists of images from a
front-facing camera and the steering commands issued by an
experienced driver driving in traffic as well as on urban
roads. This data is then used to train the proposed CNN to
perform what we call behavioral cloning. The proposed
Behavior Cloning CNN is named “BCNet”, and its deep
seventeen-layer architecture was selected after extensive
trials. BCNet is trained using the Adam optimization
algorithm, a variant of the Stochastic Gradient Descent (SGD)
technique. The paper goes through the development and
training process in detail and presents the image processing
pipeline used in the development. After extensive simulations,
the proposed approach proved successful in cloning the
driving behavior embedded in the training data set.
Keywords—Behavioral Cloning, Convolutional Neural
Network, Autonomous Driving, Machine Learning
I. INTRODUCTION
In the past decade, the automobile industry has shifted
towards intelligent vehicles equipped with driving
assistance systems [1-2] and has recently introduced vision
systems in its high-end cars. The vision system (the
cameras mounted in the car, including the front-facing ones) is
being utilized by autonomous driving engineers to develop
many features of future self-driving cars, such as: a) road-lane
finding; b) free driving-space finding; c) traffic sign detection
and recognition [3-7]; d) traffic light detection and
recognition; e) road-object detection and tracking. In this
paper, we propose to use the car's mounted vision system (more
specifically, the front-facing camera) to improve the safety
and the driving behavior of future self-driving cars.
The main idea is to construct a Convolutional Neural
Network (CNN) that is able to learn safe driving
maneuvers from data collected while an expert driver
drives on urban roads. The main focus of this paper is to
let the proposed CNN map raw pixels from a single
front-facing camera directly to steering commands for the car.
This is an end-to-end approach that lets the car drive without
lane markings on highways and on roads with unclear visual
guidance, such as in parking lots and on unpaved roads [8]. The
CNN automatically learns internal representations of the
necessary processing pipeline steps, such as detecting useful
road features, with only the human steering angle as the
training signal. In comparison with the explicit decomposition
of the autonomous driving problem into lane-marking
detection, path planning, and control, the proposed end-to-end
CNN optimizes all processing steps simultaneously.
II. THE CNN ARCHITECTURE
The proposed CNN architecture is a seventeen-layer
Behavior Cloning CNN model named “BCNet”.
The model is coded using Keras [9] on top of TensorFlow [10]
in Python [11]. Fig. 1 illustrates the BCNet architecture, and
Table I below describes the architecture in detail:
TABLE I. BCNET ARCHITECTURE.
| # | Layer | Size/Output | Parameters | Comment |
|---|---|---|---|---|
| 1 | Input | 160x320x3 | ------------ | Color, 3 channels (RGB) |
| 2 | Normalization | 160x320x3 | lambda x: x/127.5 - 1 | Scaling the inputs to the range -1 to 1, using the Keras Lambda function |
| 3 | Cropping | 65x320x3 = 62,400 | The new height will be (160-70) - 25 = 65 | Cropping the images: cut 70 pixels from the top and 25 pixels from the bottom |
| 4 | Convolutional #1 | 31x159x24 | No. Filters = 24, Kernel = 5x5, Strides = 2x2, Padding: Valid, Activation = RELU | No Pooling; [(W−F+2P)/S]+1 |
| 5 | Convolutional #2 | 14x78x36 | No. Filters = 36, Kernel = 5x5, Strides = 2x2, Padding: Valid, Activation = RELU | No Pooling; [(W−F+2P)/S]+1 |
| 6 | Convolutional #3 | 5x38x48 | No. Filters = 48, Kernel = 5x5, Strides = 2x2, Padding: Valid, Activation = RELU | No Pooling; [(W−F+2P)/S]+1 |
| 7 | Convolutional #4 | 3x36x64 | No. Filters = 64, Kernel = 3x3, Strides = 1x1, Padding: Valid, Activation = RELU | No Pooling; [(W−F+2P)/S]+1 |
| 8 | Convolutional #5 | 1x34x64 | No. Filters = 64, Kernel = 3x3, Strides = 1x1, Padding: Valid, Activation = RELU | No Pooling; [(W−F+2P)/S]+1 |
| 9 | Flatten | 1x34x64 = 2,176 | Keras Flatten Function | - |
| 10 | Drop-out | 2,176 | Keep Probability: 0.5 => 0.7 | - |
| 11 | Fully Connected #1 | 200 | Keras Dense Layer | With biases |
| 12 | Drop-out | 200 | Keep Probability: 0.5 => 0.7 | - |
| 13 | Fully Connected #2 | 100 | Keras Dense Layer | With biases |
| 14 | Drop-out | 100 | Keep Probability: 0.5 => 0.7 | - |
| 15 | Fully Connected #3 | 20 | Keras Dense Layer | With biases |
| 16 | Drop-out | 20 | Keep Probability: 0.5 => 0.7 | - |
| 17 | Fully Connected #4 | 1 | Keras Dense Layer | Output layer |
2018 International Conference on Innovation and Intelligence for Informatics, Computing, and Technologies (3ICT)
978-1-5386-9207-3/18/$31.00 ©2018 IEEE
Four Drop-out layers are added to prevent over-fitting
during training, and the fully connected layers are widened.
Also, no pooling layers are used here, since this is a regression
problem rather than a classification problem. Additionally, all
the convolutional layers are sized according to the input image
sizes after normalization and cropping.
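The layer sizing can be checked with a short sketch that applies the output-size formula [(W−F+2P)/S]+1 from Table I to the cropped 65x320 input. The function and constant names below are illustrative and not taken from the authors' code. The sketch assumes floor rounding, which is what Keras applies for "valid" padding. Under that assumption the heights reproduce Table I (31, 14, 5, 3, 1), while the widths evaluate to 158, 77, 37, 35 and 33, one pixel narrower than the tabulated widths, so the tabulated widths may rest on a different rounding convention.

```python
def conv_output(size, kernel, stride, padding=0):
    """Output length along one axis: floor((W - F + 2P) / S) + 1."""
    return (size - kernel + 2 * padding) // stride + 1


# (kernel, stride, filters) for the five convolutional layers of Table I
CONV_LAYERS = [(5, 2, 24), (5, 2, 36), (5, 2, 48), (3, 1, 64), (3, 1, 64)]


def bcnet_conv_shapes(height=65, width=320):
    """Return the (height, width, channels) shape after each convolution."""
    shapes = []
    for kernel, stride, filters in CONV_LAYERS:
        height = conv_output(height, kernel, stride)
        width = conv_output(width, kernel, stride)
        shapes.append((height, width, filters))
    return shapes


for index, shape in enumerate(bcnet_conv_shapes(), start=1):
    print(f"Convolution #{index}: {shape}")
```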
III. THE TRAINING DATA SET
The following are the two main sources of data which are
utilized to construct the training data set that is used to train
the BCNet:
1) Source 1 - Udacity Supplied Data [12]: This collection,
with an unzipped size of 365MB, consists of 24,108 images
equally divided between center, left and right front
camera shots. Each image is 160x320 pixels in size with 3
channels for RGB colors. The index of the data is stored in
a CSV file which contains 8,036 records (lines).
2) Source 2 - Simulator Generated Data: collected using the
open source Udacity driving simulator [13]. The
recorded data set has an unzipped size of 808MB and
consists of 49,851 images equally divided between center,
left and right front camera shots. Each image is 160x320
pixels in size with 3 channels for RGB colors. The index of
the data is stored in a CSV file which contains 16,617
records (lines). The data has been generated by driving the car
manually around Track 1 in the mentioned simulator
several times (~ 10 times) with driving behavior that is as
safe as possible. In particular, it is advisable to include
"recovery" data in the training set. This means that data should
be captured starting from the point of approaching the edge
of the track (perhaps nearly missing a turn and almost
driving off the track) and recording the process of steering
the car back toward the center of the track, to give the
model a chance to learn recovery behavior.
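As an illustration of how such a CSV index relates to the image counts quoted above, the following sketch parses a small in-memory index in which every record points to three camera images. The column names, file paths and sample values are hypothetical placeholders; the actual layout of the index file is not specified in this paper.

```python
import csv
import io

# Hypothetical two-record excerpt of an index file (column names assumed).
SAMPLE_INDEX = """center,left,right,steering
IMG/center_001.jpg,IMG/left_001.jpg,IMG/right_001.jpg,0.0
IMG/center_002.jpg,IMG/left_002.jpg,IMG/right_002.jpg,-0.15
"""

IMAGES_PER_RECORD = 3  # center, left and right camera shots


def read_index(text):
    """Parse the CSV index into (center, left, right, steering) tuples."""
    records = []
    for row in csv.DictReader(io.StringIO(text)):
        records.append((row["center"].strip(), row["left"].strip(),
                        row["right"].strip(), float(row["steering"])))
    return records


records = read_index(SAMPLE_INDEX)
print(len(records), "records ->", len(records) * IMAGES_PER_RECORD, "images")
```

With three images per record, the 8,036 records of source 1 correspond to 24,108 images and the 16,617 records of source 2 to 49,851 images, which agrees with the figures above.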
Several subroutines have been written for data
visualization and analysis. They act as a sanity check
to verify that the preprocessing is not fundamentally flawed.
Flawed data will almost certainly confuse the model and
result in unacceptable performance. An example of the output
of these subroutines is presented in Fig. 2, which displays a
sample of the generated training data (source 2), and Fig. 3,
which presents the histogram of the steering angle values
collected during driving (source 2).
The data (both source 1 and source 2) is divided into 2 separate
parts: training data, which represents 80% of the data, and
validation data, which represents the remaining 20%.
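A minimal sketch of such an 80/20 split is given below. The function name is illustrative, and shuffling the records before splitting is an assumption made for this example; the paper does not state how the split is drawn.

```python
import random


def split_records(records, validation_fraction=0.2, seed=0):
    """Shuffle the records and split them into (training, validation) lists."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # assumption: random split
    n_validation = int(round(len(shuffled) * validation_fraction))
    return shuffled[n_validation:], shuffled[:n_validation]


training, validation = split_records(range(100))
print(len(training), len(validation))  # 80 20
```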
[Figure: block diagram of the BCNet layer sequence. Input (160x320x3) → Cropping (65x320x3) → Convolution #1 (31x159x24, kernel 5x5, strides 2x2) → Convolution #2 (14x78x36, kernel 5x5, strides 2x2) → Convolution #3 (5x38x48, kernel 5x5, strides 2x2) → Convolution #4 (3x36x64, kernel 3x3, strides 1x1) → Convolution #5 (1x34x64, kernel 3x3, strides 1x1) → Flatten (2176) → Fully Connected 200, 100 and 20 with Drop-out layers in between → Output Neuron]
Fig. 1. The BCNet Architecture.
IV. DRIVING DATA PRE-PROCESSING
Before the front camera's images are used in the
training/validation data sets, they need to be pre-processed
to make them more useful and convenient throughout the
learning process. The pre-processing steps are meant to improve
the training results and reduce the computation as much as
possible. The following are the implemented pre-processing
steps in order of execution:
1) Normalization (color): this is done for color images using
the “Lambda function” in Keras [9] by simply
implementing min-max scaling. The values of the RGB
pixels are scaled to the -1 → 1 range and centered on zero
instead of the 0→255 range.
2) Cropping images: The images have been cropped from the
top by 70 pixels and from the bottom by 25 pixels, in order
to focus on the region of interest (ROI) and to reduce the
number of inputs (faster learning process). The cropped
images have the size of 65x320x3.
3) Flipping images: The data has been doubled (augmented)
by flipping all the images (around the y-axis) and reversing
the sign of the corresponding steering angle. Accordingly,
the source-1 data becomes 48,216 samples, and source-2
data becomes 99,702 samples. In other words, each CSV
line record can generate 6 training samples (center, left,
right, flipped-center, flipped-left, and flipped-right).
4) Jittering images: To minimize the model's tendency to
overfit to the conditions of the test track, images are
"jittered" before being fed to the BCNet. The jittering
consists of a randomized brightness adjustment, a
randomized shadow, and a randomized horizon shift. The
shadow effect is simply a darkening of a random
rectangular portion of the image, starting at either the left
or right edge and spanning the height of the image. The
horizon shift applies a perspective transform beginning at
the horizon line (at roughly 2/5 of the height) and shifting
it up or down randomly by up to 1/8 of the image height.
The horizon shift is meant to mimic the topography
of the test track.
5) Data Distribution Flattening: Because the test track
includes long sections with very slight or no curvature, the
data captured from it tends to be heavily skewed toward
low and zero turning angles. This creates a problem for the
neural network, which then becomes biased toward driving
in a straight line and can become easily confused by sharp
turns. The distribution of the input data can be observed in
Fig. 3. To reduce the occurrence of low and zero angle data
points, a histogram of the turning angles is produced and
the average number of samples per bin is computed. Next,
a "keep probability" for the samples belonging to each bin
is determined. The keep probability is 1.0 for bins that
contain fewer samples than the computed average,
and for the other bins it is calculated as the average
number of samples per bin divided by the number of samples
in that bin. Finally, random data points are removed from the
data set at a rate of (1 - “keep probability”). The
resulting data distribution can be seen in Fig. 4. The
distribution is not uniform overall, but it is much closer to
uniform for low and zero turning angles. This method also
helped speed up the training process, as a smaller data set
of higher quality is used.
6) Cleaning the dataset: it was discovered that the model
performed especially poorly on certain data points, and
those data points were then found to be mislabeled in several
cases. A subroutine was created to display the frames from the
dataset on which the model performs worst, so that
the steering angles of the mislabeled frames could be
adjusted manually. Even though this approach is tedious,
it helped to improve the results of the training to some
extent.
7) Shuffling the training data: this is done once every training
epoch to avoid pattern memorization and, consequently,
getting trapped in local minima.
8) Using a generator function to load data into memory [14]:
this step is effectively mandatory, since loading the whole
data set into the computer memory was not possible (or at
least not practical), and it also helps to smooth out the
training process. Each batch (size = 128 images and angles) is
generated and loaded into memory individually. The
fit_generator() function of Keras [14] is used to manage
the whole process.
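Steps 7 and 8 can be sketched together as a plain Python generator that reshuffles the sample order at every epoch and yields one batch at a time. This is a simplified illustration rather than the authors' implementation: the samples here are placeholder integers, whereas a real generator would load and pre-process the images of each batch from disk before yielding them.

```python
import random


def batch_generator(samples, batch_size=128, seed=None):
    """Yield batches forever, reshuffling the sample order at every epoch."""
    rng = random.Random(seed)
    samples = list(samples)
    while True:
        rng.shuffle(samples)  # step 7: new order once per epoch
        for start in range(0, len(samples), batch_size):
            # step 8: only this slice needs to be held in memory
            yield samples[start:start + batch_size]


generator = batch_generator(range(300), batch_size=128, seed=1)
first_epoch = [next(generator) for _ in range(3)]
print([len(batch) for batch in first_epoch])  # [128, 128, 44]
```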
Fig. 2. A sample of the collected images: center, left and right
respectively.
Finally, the pre-processing pipeline actually used is:
Color-Image → Normalization → Cropping → Flipping →
Jittering → Shuffling → Batch Memory Loading.
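The first three stages of this pipeline can be illustrated on a dummy frame stored as nested Python lists. The helper names are illustrative only. In the BCNet itself, normalization and cropping are performed inside the network (layers 2 and 3 of Table I), so this sketch only mirrors the arithmetic of those stages.

```python
def normalize(image):
    """Scale 0..255 pixel values to the -1..1 range (x / 127.5 - 1)."""
    return [[[value / 127.5 - 1.0 for value in pixel] for pixel in row]
            for row in image]


def crop(image, top=70, bottom=25):
    """Drop `top` rows from the top and `bottom` rows from the bottom."""
    return image[top:len(image) - bottom]


def flip(image, angle):
    """Mirror the image left-to-right and reverse the steering angle sign."""
    return [row[::-1] for row in image], -angle


# Dummy 160x320 RGB frame: the pixel value encodes the column index mod 256.
frame = [[[col % 256] * 3 for col in range(320)] for _ in range(160)]
processed = crop(normalize(frame))
flipped, flipped_angle = flip(processed, 0.1)
print(len(processed), len(processed[0]), len(processed[0][0]))  # 65 320 3
```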
The above pipeline was found to be adequate and
produced the required results. However, other techniques
were also considered and may be needed in future
work, as follows:
1) Converting color images to grey: reduces complexity a lot
by reducing the size of training data and the associated
computation to 1/3rd.
2) Incorporating edge detection and lane finding using the
Canny algorithm and the Hough transform.
3) Blur-Filtering the input images using Gaussian filtering to
remove noise.
Fig. 3. The histogram of the steering angles collected during driving.
Fig. 4. The histogram of the steering angles after flattening.
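The flattening procedure that leads to a histogram like the one in Fig. 4 can be sketched as follows. The sketch assumes equal-width bins, and the bin count is an arbitrary choice. The keep probability of an over-full bin is taken as the average bin count divided by that bin's count, which keeps it below 1 and trims the bin to roughly the average. The function name is illustrative.

```python
import random


def flatten_distribution(angles, num_bins=25, seed=0):
    """Return the indices of the samples kept after histogram down-sampling."""
    lowest, highest = min(angles), max(angles)
    width = (highest - lowest) / num_bins or 1.0  # guard: all angles equal

    def bin_of(angle):
        return min(int((angle - lowest) / width), num_bins - 1)

    counts = [0] * num_bins
    for angle in angles:
        counts[bin_of(angle)] += 1
    average = len(angles) / num_bins

    # 1.0 for bins at or below the average, average / count otherwise
    keep_prob = [1.0 if count <= average else average / count
                 for count in counts]

    rng = random.Random(seed)
    return [index for index, angle in enumerate(angles)
            if rng.random() < keep_prob[bin_of(angle)]]


# Toy data: 900 zero-angle samples plus 100 samples spread over -0.5..0.49
angles = [0.0] * 900 + [i / 100 - 0.5 for i in range(100)]
kept = flatten_distribution(angles, num_bins=10)
print(len(angles), "->", len(kept), "samples")
```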
V. THE LEARNING ALGORITHM
In this work, the Adam learning algorithm [15] is used to train
the proposed BCNet and update its weights iteratively based
on the prepared training driving data. Adam is an
optimization algorithm that is used instead of the classical
stochastic gradient descent (SGD) learning algorithm [16].
The method computes individual adaptive learning rates for
different parameters from estimates of first and second
moments of the gradients.
Adaptive Moment Estimation (ADAM) [15] is a method
that computes adaptive learning rates for each parameter. In
addition to storing an exponentially decaying average of past
squared gradients $v_t$, Adam also keeps an exponentially
decaying average of past gradients $m_t$, similar to momentum,
and the update equations are given as follows:
$$m_t = \beta_1 m_{t-1} + (1 - \beta_1)\, g_t$$
$$v_t = \beta_2 v_{t-1} + (1 - \beta_2)\, g_t^2 \tag{1}$$
where $g_t$ denotes the gradient at time step $t$.
$m_t$ and $v_t$ are estimates of the first moment (the mean) and
the second moment (the uncentered variance) of the gradients
respectively, hence the name of the method. As $m_t$ and $v_t$ are
initialized as vectors of 0’s, the authors of Adam observe that
they are biased towards zero, especially during the initial time
steps, and especially when the decay rates are small (i.e. $\beta_1$
and $\beta_2$ are close to 1).
These biases have been counteracted by computing bias-
corrected first and second-moment estimates as follows:
$$\hat{m}_t = \frac{m_t}{1 - \beta_1^t}$$
$$\hat{v}_t = \frac{v_t}{1 - \beta_2^t} \tag{2}$$
The parameters are updated using the above
estimates, which yields the following Adam update rule:
$$\theta_{t+1} = \theta_t - \frac{\eta}{\sqrt{\hat{v}_t} + \epsilon}\, \hat{m}_t \tag{3}$$
for all of the neural network model's parameters $\theta \in \mathbb{R}^d$ (weights
and biases), where $\eta$ is the learning rate and $\epsilon$ is a very small
constant that prevents division by zero.
The authors of Adam’s method propose default values of
0.9 for $\beta_1$, 0.999 for $\beta_2$, and $10^{-8}$ for $\epsilon$. They show empirically
that Adam works well in practice and compares favorably to
other adaptive learning-method algorithms [17].
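A minimal sketch of one Adam update, written for plain Python lists so that each line can be compared with Eqs. (1) to (3), is given below. It is meant as an illustration of the update rule only; in this work the optimizer is provided by Keras, and the function name and argument layout here are illustrative.

```python
import math


def adam_step(theta, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999,
              eps=1e-8):
    """One Adam update for lists of parameters; t is the 1-based step count."""
    new_theta, new_m, new_v = [], [], []
    for p, g, m_i, v_i in zip(theta, grad, m, v):
        m_i = beta1 * m_i + (1 - beta1) * g      # first moment, Eq. (1)
        v_i = beta2 * v_i + (1 - beta2) * g * g  # second moment, Eq. (1)
        m_hat = m_i / (1 - beta1 ** t)           # bias correction, Eq. (2)
        v_hat = v_i / (1 - beta2 ** t)
        new_theta.append(p - lr * m_hat / (math.sqrt(v_hat) + eps))  # Eq. (3)
        new_m.append(m_i)
        new_v.append(v_i)
    return new_theta, new_m, new_v


# First step on f(theta) = theta^2 starting from theta = 1 (gradient = 2)
theta, m, v = adam_step([1.0], [2.0], [0.0], [0.0], t=1)
print(theta, m, v)
```

On the first step the bias-corrected ratio is close to the sign of the gradient, so the parameter moves by approximately the learning rate regardless of the gradient magnitude.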
The output layer of the proposed BCNet is a linear
regression function [18]. The network models the system as
a linear combination of features to produce the final estimated
output $\hat{y}$. The function is given by
$$\hat{y} = h(x, w) = \sum_{j=0}^{N-1} w_j x_j + b \tag{4}$$
where $x_j$ is the $j$th input to the output unit, $w_j$ is the
weight of the $j$th input, $b$ is the bias term, $N$ is the number of
inputs to the output unit (or in other words, the size of the
previous hidden layer), $w$ is the vector of weights $w = [w_0, w_1,
\ldots, w_j, \ldots, w_{N-1}]$, and $x$ is the vector of inputs
$x = [x_0, x_1, \ldots, x_j, \ldots, x_{N-1}]$.
The main task of the training process is to find the weights
that provide the best fit for the training data. One way to
measure this fit is to calculate the least squares error (or the
data loss) over the training dataset:
$$L(w) = \sum_{i=1}^{M} \left( h(x_i, w) - y_i \right)^2 = \sum_{i=1}^{M} \left( \hat{y}_i - y_i \right)^2 \tag{5}$$
where $L$ is the data loss function that needs to be
minimized using Adam’s algorithm, $y_i$ is the $i$th ground truth
sample output, $\hat{y}_i$ is the $i$th output estimate of the neural net,
and $M$ is the number of training samples. Gradient descent is
then applied to the loss's gradient $\nabla_w L(w)$ in order to
minimize the overall error on the training data. Using this, the
weights can be updated using the standard gradient descent:
$$w \leftarrow w - \eta \nabla_w L(w) \tag{6}$$
where $\eta$ is the learning rate.
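The linear output unit, the least squares loss and the standard gradient descent update of Eqs. (4) to (6) can be illustrated on a toy one-input problem. The sketch assumes the plain sum of squared errors as the loss (any constant scaling factor would only rescale the learning rate), and the data, learning rate and function names are illustrative.

```python
def predict(x, w, b):
    """Linear output unit, Eq. (4): weighted sum of the inputs plus bias."""
    return sum(w_j * x_j for w_j, x_j in zip(w, x)) + b


def loss(samples, w, b):
    """Sum of squared errors over the data set, Eq. (5)."""
    return sum((predict(x, w, b) - y) ** 2 for x, y in samples)


def gradient_step(samples, w, b, lr=0.05):
    """One plain gradient-descent update of w and b, Eq. (6)."""
    grad_w = [0.0] * len(w)
    grad_b = 0.0
    for x, y in samples:
        error = 2.0 * (predict(x, w, b) - y)  # derivative of squared error
        for j, x_j in enumerate(x):
            grad_w[j] += error * x_j
        grad_b += error
    new_w = [w_j - lr * g for w_j, g in zip(w, grad_w)]
    return new_w, b - lr * grad_b


# Toy data generated from y = 2x + 1
samples = [([0.0], 1.0), ([1.0], 3.0), ([2.0], 5.0)]
w, b = [0.0], 0.0
initial_loss = loss(samples, w, b)
for _ in range(500):
    w, b = gradient_step(samples, w, b)
print(w, b, loss(samples, w, b))
```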
VI. THE BCNET TRAINING RESULTS
The BCNet model is trained with ADAM’s optimization
algorithm using the parameters listed in Tables II, III and IV.
Fig. 5 shows the setup of the BCNet used during the training
phase, while Fig. 6 shows the setup during the running and
simulation modes. Furthermore, the training results are
presented in Table V, Fig. 7 and Fig. 8. After the training
shown in Fig. 8, the model is slightly over-fitted. For
this reason, the learning rate is reduced further and the keep
probability is increased.
Fig. 5. Overview of the CNN in training mode.
Fig. 6. Overview of the trained CNN in running mode.
TABLE II. BCNET TRAINING PARAMETERS (LEARNING RATE).
| Algorithm | Parameter | Value | Comment |
|---|---|---|---|
| ADAM Optimization | Learning Rate | 0.001 | For epochs: 0 – 5 for data source 1; for epochs: 0 – 5 for data source 2 |
| | | 0.0005 | For epochs: 6 – 8 for data source 2 |
| | | 0.0002 | For epochs: 8 – 9 for data source 2 |
TABLE III. BCNET TRAINING PARAMETERS (ANGLE CORRECTION).
| Parameter | Value | Comment |
|---|---|---|
| Left Angle Correction | 0.25 | Radians |
| Right Angle Correction | -0.25 | Radians |
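One plausible way to combine these angle corrections with the flipping step is sketched below: every CSV record is expanded into six training samples (center, left, right and their mirrored versions). The sketch assumes that the correction is added to the recorded steering angle for the corresponding side camera; the function name and the sample tuple layout are illustrative.

```python
LEFT_CORRECTION = 0.25    # assumed to be added for left-camera images
RIGHT_CORRECTION = -0.25  # assumed to be added for right-camera images


def expand_record(center, left, right, steering):
    """Return six (image_path, flip, angle) samples for one CSV record."""
    base = [
        (center, steering),
        (left, steering + LEFT_CORRECTION),
        (right, steering + RIGHT_CORRECTION),
    ]
    samples = [(path, False, angle) for path, angle in base]
    # Mirrored copies: same image flipped, steering angle sign reversed
    samples += [(path, True, -angle) for path, angle in base]
    return samples


for sample in expand_record("center.jpg", "left.jpg", "right.jpg", 0.1):
    print(sample)
```

With six samples per record, the 8,036 records of source 1 yield 48,216 samples and the 16,617 records of source 2 yield 99,702 samples, matching the counts given for the flipping step.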
TABLE IV. BCNET TRAINING PARAMETERS (OTHERS).
| Parameter | Value | Comment |
|---|---|---|
| Batch Size | 120 | For epochs: 0 - 9 |
| Epochs | 15 | The whole training |
| Keep Probability | 0.5 => 0.7 | For Drop-out Layers |
TABLE V. BCNET TRAINING RESULTS.
| Phase | Data Type | Loss Value | Parameters | Comment |
|---|---|---|---|---|
| Phase 1: Coarse Tuning | Source-1 Data | Training: 0.0235; Validation: 0.0205 | 5 Epochs; Learning Rate = 0.001; Keep Prob. = 0.5 | Figure 7. Coarse tuning with Udacity data. Not enough for full learning. |
| Phase 2: Fine Tuning | Source-2 Data | Training: 0.0455; Validation: 0.0411 | 5 Epochs; Learning Rate = 0.001; Keep Prob. = 0.5 | Figure 8. Fine tuning with self-collected data. Proved enough for full learning with acceptable performance. |
| Phase 3: Fine Tuning | Source-2 Data | Training: 0.0417; Validation: 0.0377 | 3 Epochs; Learning Rate = 0.0005; Keep Prob. = 0.6 | Figure 9. More fine tuning with self-collected data. Caused over-fitting with somewhat inferior performance. |
| Phase 4: Fine Tuning | Source-2 Data | Training: 0.0348; Validation: 0.0295 | 2 Epochs; Learning Rate = 0.0002; Keep Prob. = 0.7 | Figure 10. More fine tuning with self-collected data. Full learning with very good performance. |
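The four training phases of Table V can be captured as a small schedule, sketched below, which also shows that the phase lengths add up to the 15 epochs of the whole training. The dictionary layout, the helper name and the use of a single global 0-based epoch index are assumptions made for illustration. Note that the Keras Dropout layer takes a drop rate rather than a keep probability, so a keep probability of 0.7 corresponds to a rate of 0.3.

```python
PHASES = [
    {"name": "Phase 1", "data": "source-1", "epochs": 5,
     "learning_rate": 0.001, "keep_prob": 0.5},
    {"name": "Phase 2", "data": "source-2", "epochs": 5,
     "learning_rate": 0.001, "keep_prob": 0.5},
    {"name": "Phase 3", "data": "source-2", "epochs": 3,
     "learning_rate": 0.0005, "keep_prob": 0.6},
    {"name": "Phase 4", "data": "source-2", "epochs": 2,
     "learning_rate": 0.0002, "keep_prob": 0.7},
]


def settings_for_epoch(epoch):
    """Return the phase that covers a 0-based global epoch index."""
    start = 0
    for phase in PHASES:
        if epoch < start + phase["epochs"]:
            return phase
        start += phase["epochs"]
    raise ValueError("epoch beyond the scheduled training")


def dropout_rate(keep_prob):
    """Convert a keep probability into a drop rate (rate = 1 - keep)."""
    return 1.0 - keep_prob


total_epochs = sum(phase["epochs"] for phase in PHASES)
print(total_epochs)  # 15
```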
Fig. 7. Learning progress for data source 1 (Udacity Data).
Fig. 8. Learning progress for data source 2 (Self-Collected).
Fig. 9. Learning progress for data source 2 (Self-Collected).
Fig. 10. Learning progress for data source 2 (Self-Collected).
The training of the BCNet has been carried out through
several trials to achieve the results presented in Table V. The
following observations have been collected during the training
process:
1) Training the network using only the “source-1” Udacity
supplied data has been tried several times with
several forms of data augmentation; however, acceptable
results have not been achieved, and the car always hit the
track borders.
2) Using the Udacity driving simulator [13], training data has
been collected by maneuvering the car using a keyboard or
a joystick. Accordingly, useful data has been successfully
collected for training by driving the car around a track,
for example “Track 1”, several times (~ 10).
3) After coarse tuning the model using the “source-1” data,
the resultant model weights are reused for further
training and fine-tuning based on the “source-2” data using
an ADAM optimizer learning rate of 0.001, as in Table V and
Fig. 8. This sequence of coarse tuning followed by fine-tuning
resembles the transfer learning approach. Note that the two
types of data are never used together. After this fine-tuning
phase, the resultant model is tested on “Track 1” in
the simulator and produces acceptable results (no unsafe or
sudden maneuvering).
4) In order to improve the performance further, the learning
rate of the ADAM optimizer has been halved to 0.0005 and
the model has been trained for a further 3 epochs. However,
this results in an over-fitted model, as shown in Fig. 9.
Testing confirmed this by producing inferior performance,
even though both the training and validation losses are
lower than in the previous case.
Consequently, this model is set to get further fine-tuning.
5) The learning rate of the ADAM optimizer has been
reduced further to 0.0002 and the keep probability increased
to 0.7, and the model has been trained for an extra 2 epochs
(Fig. 10). The resultant model is then tested on “Track 1” and
produces very good performance.
VII. SHORTCOMINGS OF THE IMPLEMENTED APPROACH
The following list summarizes the identified shortcomings:
1) The presented neural network model does not have a
memory; it makes a momentary decision and does not build
on previous states to make the current decision. However,
driving is believed to be a sequential process, and the
current approach does not mimic that.
2) After training the network on one track and testing it on
another one (considerably different from the first one), it may
produce unacceptable results in some scenarios in terms of
driving behavior, as it has never gone through these
scenarios before. Accordingly, this approach may require
the network to be exposed to a massive number of tracks
in order to generalize well for actual street deployment
(commercial application).
VIII. SUGGESTED IMPROVEMENTS
The following points summarize the suggested
improvements:
1) Other network topologies with a memory like Long Short-
Term Memory (LSTM) models need to be tested for
behavior cloning end-to-end learning.
2) The network needs to be trained on many more tracks,
maneuvering scenarios and road conditions in order to
make it generalize as much as possible.
3) More useful data can be generated from the currently
collected data by adding random distortion, brightness
manipulation, jitter, rotation, etc.
4) Applying finite impulse response (FIR) filtering
or a moving average to the steering
angle estimate before issuing the final steering command,
instead of using the raw estimated value directly. In such a
case, the new estimated value will depend on the previous
history as well.
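The moving-average variant of improvement 4 can be sketched as an equal-weight FIR filter over the most recent raw estimates. The class name and the window length are illustrative choices, not values evaluated in this work.

```python
from collections import deque


class SteeringSmoother:
    """Moving-average (equal-weight FIR) filter over the last `window` angles."""

    def __init__(self, window=5):
        self.history = deque(maxlen=window)  # oldest values drop out

    def update(self, raw_angle):
        """Add a raw network estimate and return the smoothed command."""
        self.history.append(raw_angle)
        return sum(self.history) / len(self.history)


smoother = SteeringSmoother(window=3)
commands = [smoother.update(angle) for angle in [0.3, 0.0, 0.0, 0.0]]
print(commands)
```

A single outlier estimate (0.3 in this example) is spread over the following commands instead of being applied at full strength, at the cost of a slower response to genuine steering changes.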
IX. CONCLUSION
In this paper, a CNN-based safe steering controller,
“BCNet”, has been proposed. The architecture of the CNN is
presented in detail. The structure of the comprehensive
training, validation, and testing data is described. The
involved image processing algorithms have been described as
well, and their contributions are analyzed. The BCNet has
shown that it is able to learn the entire task of lane and road
following without manual decomposition into road or lane
marking detection, semantic abstraction, path planning, and
control. A small amount of training data from one or two
tracks was sufficient to train the car to drive safely on multiple
tracks. The CNN is able to learn meaningful road features
from a very sparse training signal (steering alone). It has been
shown throughout the training process that the quality of data
(much more than quantity) is crucial for this
application. Therefore, a comprehensive pipeline of training
data pre-processing has been carefully implemented.
Moreover, the shortcomings of the proposed approach
have been discussed, and improvement actions have been
proposed for future work. The presented solution represents
a cornerstone in facilitating the existence of fully autonomous
cars in the near future.
REFERENCES
[1] Karim Mansour, Wael Farag, “AiroDiag: A Sophisticated Tool that
Diagnoses and Updates Vehicles Software Over Air”, IEEE Intern.
Electric Vehicle Conference (IEVC), TD Convention Center
Greenville, SC, USA, March 4, 2012, ISBN: 978-1-4673-1562-3.
[2] Wael Farag, “CANTrack: Enhancing automotive CAN bus security
using intuitive encryption algorithms”, 7th Inter. Conf. on Modeling,
Simulation, and Applied Optimization (ICMSAO), UAE, March 2017.
[3] Long Chen, Qingquan Li, Ming Li, Qingzhou Mao, “Traffic sign
detection and recognition for intelligent vehicle”, IEEE Intelligent
Vehicles Symposium, June 2011, Baden-Baden, Germany.
[4] J. Greenhalgh and M. Mirmehdi, “Real-Time Detection and
Recognition of Road Traffic Signs”, IEEE trans. on intelligent
transportation systems, 13(4), Dec. 2012.
[5] Á. Arcos-García, J.A. Álvarez-García, L.M. Soria-Morillo, “Deep
neural network for traffic sign recognition systems: An analysis of
spatial transformers and stochastic optimisation methods”, Neural
Networks 99 (2018) 158–165, Elsevier.
[6] Wael Farag, Zakaria Saleh, "Traffic Signs Identification by Deep
Learning for Autonomous Driving", Smart Cities Symposium (SCS'18),
Bahrain, 22-23 April 2018.
[7] Wael Farag, “Recognition of traffic signs by convolutional neural nets
for self-driving vehicles”, International Journal of Knowledge-based
and Intelligent Engineering Systems, IOS Press, Vol: 22, No: 3, pp. 205
– 214, 2018.
[8] M Bojarski, D Del Testa, D Dworakowski, B Firner, B Flepp, P Goyal,
... et al., “End to End Learning for Self-Driving Cars”,
arXiv:1604.07316, 25 Apr 2016.
[9] Keras Documentation, “https://keras.io/”.
[10] TensorFlow, “https://www.tensorflow.org/”.
[11] Python, “https://www.python.org/”
[12] Udacity Sample Training Data,
https://d17h27t6h515a5.cloudfront.net/topher/2016/December/584f6e
dd_data/data.zip
[13] Udacity Simulator, https://github.com/udacity/self-driving-car-sim
[14] Shervine Amidi, “https://stanford.edu/~shervine/blog/keras-how-to-
generate-data-on-the-fly.html”.
[15] D.P. Kingma, J. Ba, “Adam: A Method for Stochastic Optimization”,
3rd Inter. Conf. for Learning Representations, San Diego, USA, 2015.
[16] Léon Bottou, "Online Algorithms and Stochastic Approximations",
Online Learning and Neural Nets, Cambridge Univ. Press, ISBN 978-
0-521-65263-6, (1998).
[17] Sebastian Ruder, “An overview of gradient descent optimization
algorithms”, arXiv:1609.04747v2, 15 Jun 2017.
[18] Wael Farag, “Synthesis of intelligent hybrid systems for modeling and
control”, University of Waterloo, Canada, 1998.