Contents lists available at ScienceDirect
Automation in Construction
journal homepage: www.elsevier.com/locate/autcon
Self-reconfigurable façade-cleaning robot equipped with deep-learning-based crack detection based on convolutional neural networks

Maryam Kouzehgar a,⁎, Yokhesh Krishnasamy Tamilselvam b, Manuel Vega Heredia a,c, Mohan Rajesh Elara a

a Engineering Production and Development Pillar, Singapore University of Technology and Design, Singapore 487372, Singapore
b Electrical Engineering Department (Robotics), University of Western Ontario, London, Ontario N6A 3K7, Canada
c Engineering and Technology Department, Campus Los Mochis, Universidad de Occidente, Sinaloa 81223, Mexico
ARTICLE INFO
Keywords:
Self-reconfigurable robot
Façade-cleaning robot
Glass crack detection
Convolutional neural network (CNN)
Deep learning
TensorFlow™
OpenCV
ROS
Adam optimizer
Adagrad optimizer
ABSTRACT
Despite advanced construction technologies that are unceasingly filling city skylines with glassy high-rise structures, the maintenance of these shining tall monsters has remained a high-risk, labor-intensive process. Thus, nowadays, utilizing façade-cleaning robots seems inevitable. However, when navigating on cracked glass, these robots may cause hazardous situations. Accordingly, it seems necessary to equip them with a crack-detection system so that they can avoid cracked areas. In this study, benefitting from convolutional neural networks developed in TensorFlow™, a deep-learning-based crack detection approach is introduced for a novel modular façade-cleaning robot. For experimental purposes, the robot is equipped with an on-board camera, and the live video is loaded using OpenCV. The vision-based training process is carried out by applying two different optimizers on a sufficiently generalized data set; data augmentation techniques and image pre-processing are also applied as part of the process. Simulation and experimental results show that the system can detect cracks with an accuracy of around 90%, which is satisfying enough to replace human-conducted on-site inspections. In addition, a thorough comparison between the performance of the optimizers is put forward: the Adam optimizer shows higher precision, while Adagrad yields a more satisfying recall; however, the Adam optimizer, with the lowest false negative rate and highest accuracy, has the better overall performance. Furthermore, the proposed CNN's performance is compared to a traditional NN, and the results show a remarkable difference in success level, proving the strength of the CNN.
1. Introduction
High-rise buildings with glass facades are increasingly common worldwide [1,2]. However, when it comes to their maintenance, cleaning is normally handled with a classical approach that is still labor-intensive and often of high risk to the manpower, especially in adverse weather conditions and strong winds. Recently, there has been a growing trend toward commercialized robots such as the Winbot series from Ecovacs (Winbot X, Winbot 950 and Winbot 850), the Hobot window-cleaning series (Hobot 168, 198 and 288), the Alfawise A168/S60, and Rumbot's window-cleaning robot. Furthermore, for high-rise structures, there are commercialized façade cleaning robots such as RobuGlass, designed for cleaning the Louvre museum [3], and Glazenwasrobot from KITE Robotics, which works with protective cranes.
In order to improve the capabilities of these increasingly demanded service robots, robotic systems for cleaning vertical glass facades have also been a matter of interest in academic research. In this field, there are numerous research challenges associated with mechanism design and autonomous capabilities. The essential need to use robots for vertical façade cleaning was recognized long ago and was addressed by introducing SIRIUSc, a modular robot basically designed to work on skyscrapers: this robot has been specifically utilized to clean the 25,000 m² vaulted glass hall of the Leipzig Trade Fair in Germany [4,5]. Continuing on the same stream toward autonomy, the robotic system proposed in [6] performs tasks such as cleaning, moving, rail alignment control, and obstacle detection.
In terms of mechanical design, a glass façade cleaning robot must first be a surface-climbing robot. A detailed review of wall-climbing robots is presented in [7], categorizing them into six distinct classes based on the applied adhesion mechanism. Apart from façade cleaning,
https://doi.org/10.1016/j.autcon.2019.102959
Received 25 January 2019; Received in revised form 2 September 2019; Accepted 7 September 2019
⁎ Corresponding author.
E-mail addresses: maryam_kouzehgar@sutd.edu.sg (M. Kouzehgar), ykrishn4@uwo.ca (Y. Krishnasamy Tamilselvam), manuel_vega@sutd.edu.sg (M. Vega Heredia), rajeshelara@sutd.edu.sg (M. Rajesh Elara).
Automation in Construction 108 (2019) 102959
Available online 23 October 2019
0926-5805/ © 2019 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
the surface-climbing robots are mainly utilized for remote inspection and maintenance applications [8] in difficult-access situations such as wall inspection [9], labeling oil tank volume [10] and nuclear power plant inspection [11].
Regardless of the application to façade cleaning, several types of mechanisms have been proposed for surface climbing. A legged morphology with eight articulated limbs for Robug III is presented in [12]. The structure illustrated in [13] is another type of legged morphology. In addition, a hexapod structure for climbing is put forward in [14]. Moreover, a shape-shifting rolling-crawling mechanism that is also able to climb vertical surfaces is introduced in [15]. The surface-climbing robots illustrated in [16,17] use a tracked-wheel mechanism. There are also pneumatic-type climbing robots [18], namely those utilizing suction cups [19,20]. Among the climbing robots equipped with suction cups, [21] proposes a biped mechanism, whereas [22] introduces a robot specifically designed for glass facade cleaning with passive suction cups driven by self-locking leadscrews. Additionally, some climbing robots benefit from material-based adhesiveness [23] and impedance control [17].
When it comes to improving the performance of façade cleaning robots, reconfigurability can add great value in terms of accessing more spaces and flexibility in morphology. In [24], a nested reconfigurable mechanism has been proposed specifically for vertical façade cleaning applications. However, the rolling-crawling mechanism
introduced in [15] is an example of reconfigurable morphology that can
also be used in façade cleaning applications due to its wall-climbing
ability. Moreover, bio-inspired mechanisms can also pave their way
toward vertical facade cleaning as with the climbing mechanism pro-
posed in [23] benefitting from special adhesive materials.
The robot considered in this paper is Mantis [25], a modular climbing robot that uses a powerful commercial impeller to provide the required mechanical attachment force as its adhesion mechanism. Even with robotic façade cleaners, usually a person is required to detach the robot from one window panel and attach it to another. Mantis is equipped with the ability to distinguish the frames (or any obstacles other than glass), and, owing to its modular morphology, it is able to perform a transition from one window panel to another, avoiding the frame. This way, the need for human interference is eliminated. This aspect of autonomy for the façade cleaning robot is thoroughly investigated in [25] in terms of design, locomotion strategies and control.
Furthermore, façade cleaning robots should avoid cracked glass regions, since navigating on them could cause dangerous situations. Hence, it is necessary that the robot can recognize the safe surfaces for navigation, just as the high-rise window cleaner from FatCat Robotics is claimed to be equipped with crack-detection technology. In this study, our aim is to equip Mantis with glass crack-detection ability using deep learning techniques, namely convolutional neural networks (CNNs). This is significant because it adds a very necessary capability to the façade cleaning robot: avoiding and preventing potentially dangerous situations for itself and for the humans around it. Given the irreparable harm to people's health, no one wants a glass panel to shatter merely because a tech-equipped cleaning robot is navigating on it; without this additional capability, people would prefer the glass to remain stained rather than risk breakage. This paper introduces this distinguished crack-avoidance capability, which safeguards everyone involved, as its main contribution.
Crack detection generally is viewed from two aspects: first, the material on which the crack exists, and second, the method used to analyze the crack. In terms of material, several other materials require significant crack-detection effort, such as metals, concrete, asphalt, and walls of various materials in the construction industry. For example, in [26,27], crack detection in metal surfaces is carried out by applying an eddy-current sensor. In [28], a nonlinear ultrasonic modulation technique based on dual laser excitation is proposed for fatigue crack detection in aluminum and steel plates.
Ultrasound crack detection in a simulated human tooth is investigated
in [29]. In terms of detecting civil infrastructure defects, a survey on
image-based crack detection for concrete surfaces is put forward in
[30].
Vision-based, deep-learning-supported crack detection methods have mostly targeted concrete and asphalt surfaces. For instance, [31] proposes a vision-based method using a deep CNN architecture for detecting concrete cracks. Road crack detection using CNNs is elaborated in [32], and crack detection on 3D asphalt surfaces using a deep-learning network is proposed in [33], while [34] applies deep CNNs with transfer learning for vision-based pavement distress detection. Furthermore, deep-learning-based crack detection using CNNs also has an interesting application in identifying cracks on reactors within the process of nuclear power plant inspection [35].
The recent surge of interest in CNNs stems from their outstanding advantages: notably, CNNs benefit from automatic feature extraction and feature learning. In fact, they can learn relevant features from an image or video at different levels, similar to a human brain, whereas, for example, classical NNs cannot do this and need a prior feature-extraction phase before being applied to any object or pattern recognition task. In general, when using NNs, one needs to extract relevant features for the given task and assign each of them to an element of the input vector, while a CNN automatically extracts such features provided that the input can be represented as a tensor with locally correlated elements, such as audio data, images, or video. Furthermore, a CNN is more efficient in terms of memory and complexity because it needs far fewer parameters, especially in image-processing applications where, dealing with the image matrix, the number of weight parameters grows drastically in other methods such as NNs, whereas in a CNN it depends only on the number and size of the filters. Another distinguished advantage of CNNs is the capability for transfer learning, i.e., re-using a pre-trained CNN and only slightly tuning it for a newly defined, related task, for example using knowledge gained on cars to distinguish trucks. This way, we avoid training a CNN from scratch and save memory and time.
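The parameter-count argument can be made concrete with simple arithmetic. The sketch below compares a fully connected layer on a 480 × 240 grayscale image against a small convolutional layer; the hidden-unit count, filter count and filter size are illustrative assumptions, not values taken from this paper's network.

```python
# Illustrative parameter-count comparison: dense layer vs. conv layer
# on a 480 x 240 grayscale image. Layer sizes below are assumptions.

height, width = 240, 480          # image dimensions (pixels)
hidden_units = 100                # assumed dense hidden-layer size

# Dense layer: every pixel connects to every hidden unit (+ biases).
dense_params = height * width * hidden_units + hidden_units

# Conv layer: parameters depend only on filter count and size (+ biases),
# e.g. eight 4x4 filters on a single input channel.
filters, fsize, in_channels = 8, 4, 1
conv_params = filters * (fsize * fsize * in_channels) + filters

print(dense_params)  # -> 11520100
print(conv_params)   # -> 136
```

The gap of roughly five orders of magnitude is what the paragraph above refers to: the convolutional layer's cost is independent of the image resolution.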
When it comes to crack detection on glass surfaces in particular, besides vision-based techniques, there are other methods, such as the ultrasonic glass crack detection proposed in [36] and defect detection on glass surfaces with wavelet transforms in [37]. Regarding vision-based techniques, glass crack detection has recently been developed mostly with classical image-processing methods such as edge detection and segmentation. As an illustrative example, the glass crack detection system proposed in [38] first applies pre-processing and smooth sharpening, followed by image segmentation, and ends with crack feature extraction, including calculation of crack area, perimeter and circularity. Also, in [39], a very basic image-processing method based on pixel coordinates is applied to detect cracks on bottles in a video-supervised production line.
However, vision-based deep learning has rarely been applied to glass crack detection and therefore presents scope for substantial research. The rarity of deep-learning applications in glass crack detection mostly has two causes. First, glass crack detection has not been a matter of interest in major industrial applications, and a glass-cleaning robot with crack-detection capability is rarely found with a valid commercialized patent. Second, dealing with images of glass is not straightforward, because lighting exposure on glass is a severe issue that varies tremendously at different times of day and night. Furthermore, many objects may lie in the background and foreground of the glass whose images are reflected on it, and intense light can be reflected on the glass as well. This reflection issue is referred to as the glare effect in this paper, and we have tried to train the network with sample videos containing different types of glare effect (reflection of the sun and of artificial intense light).
In this paper, the aim is to propose a deep-learning-based crack detection system for a glass façade cleaning modular robot. To this end, we use a live video sent from an onboard camera mounted on the robot, monitoring the glass surface ahead. The CNN that takes care of crack detection in each video frame is implemented in Python through the TensorFlow backend. The utilized CNN consists of two convolutional layers, two maximum pooling layers and one fully connected layer, and during the training process we use two types of optimizers: the Adam optimizer and the Adagrad optimizer. The overall performance of the proposed approach is quite satisfying, since it reaches an accuracy of about 90%. This way, our façade cleaning robot is equipped with a higher level of autonomy in terms of detecting cracked regions and providing adequate safety.
The rest of the paper is organized as follows. Section 2 describes the overall locomotion mechanism of the considered façade cleaning robot (Mantis). Section 3 gives a brief general review of the CNN concept along with the mechanism of each layer. Section 4 is concerned with the nature of the data set and details of data augmentation and preprocessing. Section 5 introduces the proposed CNN architecture and training algorithm along with the mathematical formulation of the utilized optimizers. Section 6 is devoted to results and discussion, summarizing the test results and casting light on the accuracy metrics, performance analysis graphs and sample data classification results, together with a performance comparison against a simple neural network. Eventually, Section 7 concludes the paper and opens a window toward future work.
2. Description of the robotic platform [25]
Mantis is a modular vertical climbing robot developed for façade cleaning [25]. Currently commercialized façade cleaning robots, such as the Winbot, Hobot and Alfawise series, usually require human interference to detach them, move them to the next window panel and re-attach them to continue the cleaning process. To address this shortcoming in current commercial designs, Mantis is designed so that it can make the transition from one window panel to another without manual assistance. The robot is built on a rigid structure that keeps the modules unified while providing an individual rotation and lifting mechanism for each module. Fig. 1 shows a picture of Mantis.
2.1. Adhesion mechanism
Mantis uses a powerful commercial impeller to create the suction required as the mechanical attachment force that keeps it attached to the glass and prevents it from falling. The impeller's input voltage is adjustable to control the amount of force needed, based on the surface sensitivity.
2.2. Locomotion mechanism
The robot is equipped with a wheeled locomotion mechanism using soft, flexible, high-friction rubber. In this way, we decrease the drift of the robot on the glass in the presence of dirt and liquids. The diameter of the wheels is 6 cm; they are connected to DC motors with 250 oz-in of torque. The maximal speed of the robot is 15 cm/s.
As depicted in Fig. 2, the wheels are attached to each module in a position orthogonal to the glass and equidistant from the center of the module, aligned with the Y axis.
The whole module can rotate β degrees around the Z-axis for each module i (where i denotes pad a, b or c). Thus, by rotating the module, the wheels rotate concentrically around a central axis.
Considering its locomotion mechanism, Mantis is a 12W6D3S robot. Based on the locomotive characteristics of the robot, given the restrictions imposed to facilitate control, it is nearly holonomic. Each of the modules can rotate 360°. However, in the current application the rotation range is limited to 90°, since this is enough for the robot to navigate throughout the whole window.
2.3. Transition mechanism
The transition mechanism consists of two steps: first, recognizing the window frame or any obstacle; second, lifting the module nearest to the obstacle. Recognition is accomplished by applying inductive sensors. The transition between windows, across frames or obstacles, is handled by the lifting system.
Fig. 1. Structure of Mantis while cleaning.
For the transition, a linear actuator is utilized, which linearly moves a screw and lifts a
Fig. 2. Mantis inertial system in red, module c inertial system in black; β is the steering of each module. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
module from the surface using the rotation around a screw rod located
in the center of each module. This mechanism is illustrated in Fig. 3.
2.4. Software architecture
As shown in Fig. 4, for this application the robot is teleoperated through a Bluetooth device (HC-06) and a mobile application. The robot can also be teleoperated from a computer, or via a USB interface. To enable autonomous navigation, the robot is equipped with a sensory system to determine its orientation and position and to detect the window frame.
In Fig. 4, continuous arrows denote power connections and dotted arrows denote signals.
The robot is controlled using a master system running the Robot Operating System (ROS). Using ROS and its topic-subscription system, it is possible to process package information. The packets flow bidirectionally between the on-board MCU (Arduino Mega) and the on-board ROS slave (Intel Compute Stick), as well as between the ROS slave and the ROS master, which communicate over WiFi. The ROS master is a server PC where information from the ROS slave is processed by software that cannot run on the ROS slave, given the processing resources necessary for correct operation. As shown in Fig. 5, the communication between TensorFlow and ROS is implemented by subscribing to data package topics [40] for identifying cracked glass by image processing.
2.5. On-board visual support system
In order to develop an autonomous robot that can navigate safely on the glass, the robot must determine the state of the surface, i.e., whether cracks exist. For the analysis of the glass surface, a vision system is used based on a high-resolution camera (HD 1080, 30 fps) connected to a Wi-Fi module. As shown in Fig. 6a, the camera is assembled in a fixture with an estimated inclination angle of 45°. It also has a 90° vision angle, as shown in Fig. 6b. In addition, Fig. 7 shows a picture of the real robot with the camera mounted on the middle module.
3. Convolutional neural networks - a very brief review
In this paper, we use a convolutional neural network (CNN) implemented in Python through the TensorFlow backend. A CNN consists of special layers called convolutional layers, which are very useful in detecting objects and patterns [41,42]. One advantage of a CNN is that it can be used to build a very deep network with a smaller number of parameters to train, thereby reducing the time and complexity of the training process. Apart from this, a CNN consists of different types of layers with specific characteristics, such as convolutional layers, activation layers, pooling layers, fully connected layers and SoftMax layers.
The primary idea behind image classification is horizontal or vertical edge detection, which can be achieved by performing a convolution operation on the input image. The algorithm takes a small square (or 'window') called a filter and applies it over the image. Each filter allows the CNN to identify certain patterns in the image. In the initial layers, a CNN starts by detecting simple features such as lines, circles and edges. In each layer, the network combines these findings and continually learns more complex concepts as it goes deeper; in our case, it detects cracks existing on the glass.
3.1. Convolution layer
The convolution layer consists of the filter, which convolves across the width and height of the input volume. In other words, the output of a convolutional layer is obtained by carrying out a dot-product operation between the filter weights and each unique position of the input image. This process results in a 2-dimensional activation map that gives the responses of that filter at every spatial position. The various parameters involved in this process are the number of filters, filter size and weight contents, stride, and padding. All these parameters will impact the size of the output.

Fig. 3. Mantis lifting the middle module for transition between window panels [25].
Fig. 4. Schematic architecture of Mantis.
Fig. 5. ROS-TensorFlow robot communication.
Fig. 6. Position and orientation of the camera mounted on Mantis (a. side view; b. top view).
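The standard convolution output-size formula, ⌊(n + 2p − f)/s⌋ + 1 for input size n, filter size f, padding p and stride s, makes the role of these parameters concrete. The snippet below is a generic sketch of that formula, not code from this paper.

```python
def conv_output_size(n, f, p, s):
    """Output length along one dimension for input size n,
    filter size f, padding p and stride s."""
    return (n + 2 * p - f) // s + 1

# 'Valid' convolution (no padding) shrinks the activation map:
print(conv_output_size(480, 4, 0, 1))  # -> 477
# 'Same' padding keeps the size at stride 1, e.g. p = (f - 1) / 2
# for an odd filter size f:
print(conv_output_size(480, 5, 2, 1))  # -> 480
```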
3.2. Pooling layers
Pooling layers generally aim at avoiding overfitting; by applying non-linear down-sampling on the activation maps, they reduce the dimension and complexity to speed up the computation.
The various parameters involved in this process are filter size and
stride, whereas padding is not used in pooling as it is against the pur-
pose of reducing the input dimension. In addition, pooling is applied on
each input channel individually. This way, the number of output and
input channels will be equal. There are two different types of pooling
namely max pooling and average pooling.
•Max pooling
The basic operation of the pooling layer is similar to that of the convolutional layer. One noticeable difference is that instead of taking the dot product of the input and the filter, we take the maximum neighboring value at each unique position in the input image. This is done for each channel in the input.
•Average pooling
In average pooling, we take the average of all the values sur-
rounding each unique position in the input image.
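Both operations can be sketched in a few lines of NumPy on a toy single-channel map with a 2 × 2 window and stride 2; this is an illustration, not the paper's code.

```python
import numpy as np

x = np.arange(16, dtype=float).reshape(4, 4)   # toy 4x4 activation map

# Split into non-overlapping 2x2 blocks, then reduce each block.
blocks = x.reshape(2, 2, 2, 2)    # (row-block, row, col-block, col)
max_pooled = blocks.max(axis=(1, 3))    # max pooling
avg_pooled = blocks.mean(axis=(1, 3))   # average pooling

print(max_pooled)  # [[ 5.  7.] [13. 15.]]
print(avg_pooled)  # [[ 2.5  4.5] [10.5 12.5]]
```

Either way, the 4 × 4 input is reduced to 2 × 2, which is the dimensionality reduction described above.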
3.3. Fully connected layers
The fully connected layer is the other name for the hidden layer
used in a regular neural network. Before this step, the input array is
converted into a single dimensional vector using a flattening layer. As
the name suggests, in a fully connected layer, each node in the input is
connected to every other node in the output.
3.4. SoftMax layer
SoftMax activation function is applied in the output layer of a CNN
to represent a categorical distribution over labels and gives the prob-
abilities of each input belonging to a label.
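Numerically, SoftMax is exponentiation followed by normalization, so the outputs sum to one and can be read as class probabilities. The two-class sketch below uses illustrative scores, not the network's actual logits.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))   # subtract max for numerical stability
    return e / e.sum()

# e.g. raw scores for [cracked, uncracked]
probs = softmax(np.array([2.0, 1.0]))
print(probs)        # ~[0.731 0.269]
print(probs.sum())  # 1.0
```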
4. Image Data Set, augmentation techniques and preprocessing
In this study, we basically generated cracked and non-cracked images from the video frames captured by the camera mounted on the robot. As discussed earlier, the viewing angle of this camera covers the glass that is to be cleaned.
Most of the data set consists of snapshots of videos (from robot) or
photos taken in Singapore University of Technology and Design (SUTD)
which is beautifully designed with a half-glass structure in all its
buildings. The samples are captured under different conditions (illumination, reflection, etc.), with different camera devices and resolutions, including different types of object reflections and glare effects (light and sun reflections). While the majority of these images were collected by our team, for the sake of diversity and to generalize the data set, some cracked/non-cracked images were also added from the web.
Since the main goal of this study was crack detection, the images were manually labelled into two separate categories: cracked or non-cracked. Originally, 1539 cracked and 1565 non-cracked images were labelled. After applying data augmentation techniques (mostly flipping and rotating), the final dataset consists of 2205 cracked and 4303 non-cracked images with different conditions of orientation, illumination and resolution.
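Flip- and rotation-based augmentation of this kind can be sketched with NumPy array operations (equivalent OpenCV calls such as cv2.flip exist); the exact augmentation code used in this study is not reproduced here.

```python
import numpy as np

img = np.arange(12).reshape(3, 4)   # stand-in for a grayscale image

flipped_h = np.fliplr(img)          # horizontal (left-right) flip
flipped_v = np.flipud(img)          # vertical (up-down) flip
rotated = np.rot90(img)             # 90-degree counter-clockwise rotation

# Each transform yields a new labelled sample with the same class.
print(rotated.shape)  # (4, 3) -- rotation swaps height and width
```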
After saving the trained CNN, it is tested with snapshots from the live video. Regarding online crack detection on the live video from the robot, the raw image data goes through several pre-processing phases:
- Converting the video into images by reading a video frame every 1 s.
- Converting the image to an appropriate size. The reason is that the software tends to provide better results if all input images have the same pixel dimensions; in this paper, we use 480 × 240 pixels.
- The resulting images are read using the OpenCV package in Python.
- We use grayscale images, as this reduces the number of dimensions and thereby the operational time and memory. The obtained array is then normalized (each pixel value is divided by 255). This is the last pre-processing step before the image is fed into the trained CNN for testing.
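The pre-processing chain above can be sketched as follows. This is a NumPy-only stand-in: a naive nearest-neighbour resize replaces cv2.resize, and the grayscale weights are the usual BGR luminance coefficients; the actual OpenCV calls used on the robot are not reproduced.

```python
import numpy as np

def preprocess(frame_bgr, out_w=480, out_h=240):
    """Grayscale -> resize to out_w x out_h -> normalize to [0, 1]."""
    # Luminance-weighted grayscale conversion (OpenCV stores BGR).
    gray = frame_bgr @ np.array([0.114, 0.587, 0.299])
    # Naive nearest-neighbour resize (stand-in for cv2.resize).
    rows = np.arange(out_h) * gray.shape[0] // out_h
    cols = np.arange(out_w) * gray.shape[1] // out_w
    resized = gray[rows][:, cols]
    return resized / 255.0          # per-pixel normalization

frame = np.random.randint(0, 256, size=(720, 1280, 3)).astype(float)
x = preprocess(frame)
print(x.shape)  # (240, 480)
```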
5. The training algorithm and applied CNN architecture
As stated before, this paper aims at classifying cracked glass versus non-cracked glass using the video captured from the robot. The purpose of the classification is that the robot needs to avoid the areas of the glass where cracks are present. As mentioned earlier, the video is converted into grayscale images and loaded using the OpenCV package; then a CNN developed in TensorFlow is used for classification. To get the best results and to enable a comparison, we use two optimizers, namely Adam and Adagrad, to perform the training.
Fig. 7. Camera mounted on the middle module of Mantis.
5.1. Optimizers
A brief elaboration on the formulation of these optimizers is given
below in order to make it possible to reproduce the simulations.
5.1.1. Adam optimizer
The Adam (Adaptive Moment Estimation) optimizer follows an algorithm for first-order gradient-based optimization based on adaptive estimates of lower-order moments. The pseudocode for the Adam algorithm is given below [43].

Require: α: step size
Require: β1, β2 ∈ [0, 1): exponential decay rates for the moment estimates
Require: f(θ): stochastic objective function with parameters θ
Require: θ_0: initial parameter vector
m_0 ← 0 (initialize 1st moment vector)
v_0 ← 0 (initialize 2nd moment vector)
t ← 0 (initialize timestep)
while θ_t not converged, do the following:
  t ← t + 1
  g_t ← ∇_θ f_t(θ_{t−1}) (get gradients w.r.t. stochastic objective at timestep t)
  m_t ← β1·m_{t−1} + (1 − β1)·g_t (update biased first moment estimate)
  v_t ← β2·v_{t−1} + (1 − β2)·g_t² (update biased second raw moment estimate)
  m̂_t ← m_t/(1 − β1^t) (compute bias-corrected first moment estimate)
  v̂_t ← v_t/(1 − β2^t) (compute bias-corrected second raw moment estimate)
  θ_t ← θ_{t−1} − α·m̂_t/(√v̂_t + ε) (update parameters)
end while
return θ_t (resulting parameters)

where g_t are the gradients, θ_t is the parameter vector at time t, β1 and β2 belong to [0, 1), and α is the learning rate. According to [43], g_t² indicates the element-wise square g_t ⊙ g_t, and the proposed default settings are α = 0.001, β1 = 0.9, β2 = 0.999 and ε = 10⁻⁸. All operations on vectors are element-wise, and β1^t and β2^t denote β1 and β2 to the power of t.
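The pseudocode can be reproduced in a few lines of NumPy. The following is a toy reimplementation minimizing f(θ) = θ², with a larger step size than the default 0.001 so the toy problem converges quickly; the paper itself uses the TensorFlow built-in optimizer.

```python
import numpy as np

def adam_minimize(grad, theta, alpha=0.01, beta1=0.9, beta2=0.999,
                  eps=1e-8, steps=2000):
    """Minimal Adam loop following the pseudocode above."""
    m = np.zeros_like(theta)          # 1st moment vector
    v = np.zeros_like(theta)          # 2nd moment vector
    for t in range(1, steps + 1):
        g = grad(theta)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        m_hat = m / (1 - beta1 ** t)  # bias-corrected 1st moment
        v_hat = v / (1 - beta2 ** t)  # bias-corrected 2nd moment
        theta = theta - alpha * m_hat / (np.sqrt(v_hat) + eps)
    return theta

# Minimize f(theta) = theta^2, whose gradient is 2*theta, from theta = 1.
theta = adam_minimize(lambda th: 2.0 * th, np.array([1.0]))
print(float(abs(theta[0])))  # small residual near the minimum at 0
```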
5.1.2. Adagrad optimizer
The Adagrad optimizer is a gradient-based optimization algorithm that works well for sparse gradients [44]. It automatically adapts the learning rate based on the parameters. The basic equation used for the parameter update is shown in Eq. (1), where θ_t is the parameter at time t, α is the learning rate, g_t is the gradient estimate, and ⊙ means element-wise multiplication.

θ_{t+1} = θ_t − α/(ε + √(Σ_{τ=1..t} g_τ ⊙ g_τ)) ⊙ g_t    (1)
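Eq. (1) amounts to dividing a global learning rate by the root of the accumulated squared gradients, so the effective step shrinks as evidence accumulates. A minimal NumPy sketch of one update (again a toy, not the TensorFlow implementation used in this paper):

```python
import numpy as np

def adagrad_step(theta, g, accum, alpha=0.5, eps=1e-8):
    """One Adagrad update; 'accum' holds the running sum of g*g."""
    accum = accum + g * g
    theta = theta - alpha / (eps + np.sqrt(accum)) * g
    return theta, accum

theta, accum = np.array([1.0]), np.zeros(1)
theta1, accum = adagrad_step(theta, np.array([1.0]), accum)
theta2, accum = adagrad_step(theta1, np.array([1.0]), accum)

step1 = float(theta[0] - theta1[0])   # ~0.5
step2 = float(theta1[0] - theta2[0])  # ~0.354
print(step1 > step2)  # True: the effective learning rate decays
```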
Fig. 8. CNN architecture for this classification process: input image (480 × 240 × 1) → convolution layer 1 (8 channels) → max pooling layer 1 (output 60 × 30 × 8) → convolution layer 2 (16 channels) → max pooling layer 2 (output 15 × 8 × 16) → flattening (1920 units) → fully connected network → SoftMax layer (cracked/uncracked).
Table 1. Parameters used in the CNN layers.
Layer | Filter size | Stride | Padding type | Activation
Convolution layer 1 | 4 | 1 | Same | ReLU
Max pooling layer 1 | 8 | 8 | Same (P = 0) | –
Convolution layer 2 | 2 | 1 | Same | ReLU
Max pooling layer 2 | 4 | 4 | Same (P = 0) | –
Fig. 9. The CNN training algorithm: input video from the on-board camera → get snapshots of the video → label the training data → normalize → 1st convolutional layer with ReLU activation → 1st maximum pooling layer → 2nd convolutional layer with ReLU activation → 2nd maximum pooling layer → fully connected layer with SoftMax function → calculate cost → if the stopping criteria are not fulfilled, repeat; otherwise calculate test accuracy, save the trained model, and stop.
5.2. Proposed CNN
A sketch of the utilized CNN architecture is shown in Fig. 8 in which
we have used two convolutional layers with ReLU activation, two
maximum pooling layers and one fully connected layer with SoftMax
activation. Due to the mutually exclusive nature of the crack detection
problem (cracked or non-cracked), a SoftMax layer is used as the last
layer to compute the probability of each class. Furthermore, Table 1
summarizes the parameters corresponding to CNN layers. Meanwhile,
the training algorithm is illustrated in Fig. 9.
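Given the 'same'-padding and stride values of Table 1, the layer output sizes sketched in Fig. 8 can be checked with simple ceiling arithmetic. The channel counts of 8 and 16 are read off Fig. 8; this is a shape check, not the training code.

```python
import math

def same_out(n, stride):
    """Output length with 'same' padding: ceil(n / stride)."""
    return math.ceil(n / stride)

h, w = 240, 480                           # grayscale input image
h, w = same_out(h, 1), same_out(w, 1)     # conv 1 (stride 1): 240 x 480, 8 ch
h, w = same_out(h, 8), same_out(w, 8)     # max pool 1 (stride 8): 30 x 60
h, w = same_out(h, 1), same_out(w, 1)     # conv 2 (stride 1): 30 x 60, 16 ch
h, w = same_out(h, 4), same_out(w, 4)     # max pool 2 (stride 4): 8 x 15

flattened = h * w * 16                    # 16 channels after conv 2
print((h, w, flattened))  # (8, 15, 1920) -- matches the 1920 units of Fig. 8
```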
6. Results and discussion
Training performance graphs for both optimizers are summarized in
Fig. 10 (both for 700 epochs).
The confusion matrix is a way of describing the performance of the classifier output. The layout of the confusion matrix in our case is as illustrated in Eq. (2).

Confusion Matrix = [True Cracked, False Cracked; False Uncracked, True Uncracked]    (2)
Performance metrics for a growing number of epochs are summarized in Table 2 for both optimizers. The table gives detailed metrics in terms of the confusion matrix, accuracy, sensitivity (TPR, true positive rate, also called recall), precision (PPV, positive predictive value), specificity (SPC), negative predictive value (NPV), false positive rate (FPR), false discovery rate (FDR), miss rate (FNR, false negative rate), false omission rate (FOR), and F1 score [45].
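As an illustrative check (a sketch, not the authors' implementation), the derived metrics can be recomputed from the confusion-matrix counts; using the Adam-optimizer matrix at 700 epochs from Table 2:

```python
def metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Derive the standard classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)                      # PPV
    recall = tp / (tp + fn)                         # TPR / sensitivity
    return {
        "accuracy_pct": 100 * (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
        "specificity": tn / (tn + fp),              # SPC
        "npv": tn / (tn + fn),
        "fpr": fp / (fp + tn),
        "fdr": fp / (fp + tp),
        "fnr": fn / (fn + tp),
        "for": fn / (fn + tn),
    }

# Adam optimizer, 700 epochs: [1804 271; 401 4032] from Table 2.
m = metrics(tp=1804, fp=271, fn=401, tn=4032)
print(round(m["accuracy_pct"], 3))  # 89.674
print(round(m["precision"], 3))     # 0.869
```

The recomputed values agree with the tabulated ones to the three decimals reported.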
Precision, recall, and F1 values during the training process are shown in Figs. 11 to 13. As can be seen, the precision of the Adam optimizer remains consistently much higher than that of the Adagrad optimizer throughout training; however, Adagrad appears able to reach the same precision value and intersect Adam's curve if given more training epochs. Although Adagrad shows weak precision, its recall is more satisfactory, keeping its graph
Fig. 10. Cost function (loss) values versus epochs for the Adam and Adagrad optimizers.
Table 2
Quantitative performance metrics for both optimizers. Confusion matrices are written row-wise as in Eq. (2): [True Cracked, False Cracked; False Uncracked, True Uncracked].

Optimizer           Epochs   Confusion matrix         Accuracy%   PPV (precision)   TPR (recall/sensitivity)   F1 score   SPC     NPV     FPR     FDR     FNR     FOR
Adam optimizer      700      [1804 271; 401 4032]     89.674      0.869             0.818                      0.842      0.937   0.909   0.062   0.130   0.181   0.090
                    350      [1777 298; 423 4010]     88.921      0.856             0.807                      0.831      0.930   0.904   0.069   0.143   0.192   0.095
                    300      [1774 301; 417 4016]     88.967      0.854             0.809                      0.831      0.930   0.905   0.069   0.145   0.190   0.094
                    280      [1777 298; 419 4014]     88.982      0.856             0.809                      0.832      0.930   0.905   0.069   0.143   0.190   0.094
Adagrad optimizer   700      [1322 753; 23 4410]      88.076      0.637             0.982                      0.773      0.854   0.994   0.145   0.362   0.017   0.005
                    350      [712 1363; 9 4424]       78.918      0.343             0.987                      0.509      0.764   0.997   0.235   0.656   0.012   0.002
                    300      [571 1504; 8 4425]       76.767      0.275             0.986                      0.430      0.746   0.998   0.253   0.724   0.013   0.001
                    280      [473 1602; 8 4425]       75.261      0.227             0.983                      0.370      0.734   0.998   0.265   0.772   0.016   0.001
Fig. 11. Precision values for both optimizers.
Fig. 12. Recall values for both optimizers.
higher throughout the epochs. Both optimizers achieve approximately the same level of accuracy, around 90%, although the Adam optimizer is slightly more accurate (89.674% vs. 88.076% for Adagrad). Overall, the Adam optimizer, with the lowest false negative rate (FNR) and the highest accuracy, yields a well-trained classifier able to distinguish cracked from non-cracked surfaces.
For further illustration, the probabilities of correct classification for sample test images are given for both optimizers in Figs. 14 and 15 (cracked samples) and Figs. 16 and 17 (non-cracked samples), where TP and TN stand for True Positive and True Negative, respectively.
In the corresponding figures for the cracked and non-cracked cases (Figs. 14–15 and Figs. 16–17), the same sample test images are used for both optimizers so that they can be compared directly [32]. Besides samples from real video snapshots captured by the robot, some samples from the web are also tested for the sake of diversity and to evaluate the classifiers on generalized samples. To clarify, the photos with gray stripes show the real robot navigating on a half-glass wall at the Singapore University of Technology and Design. For each test image, the Adam optimizer makes the correct decision with higher probability (at the level of 90%–100%). However, the Adagrad optimizer also never fails, always reporting probabilities safely above 50% for the true positive decision. Moreover, it is worth noting that in this glass-façade-cleaning application, not misclassifying a crack matters more than the confidence of a correct decision, because a single small mistake may cause a serious incident, and the proposed method effectively guarantees this: although some lower probabilities are also reported, all results keep a sufficiently safe margin above the 50% decision threshold. This supports the safety of replacing a human inspector with this process.
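The 50% threshold discussed above can be sketched as a simple decision rule over the SoftMax "cracked" probability (a hypothetical helper, not the authors' code); a conservative variant can additionally treat low-margin decisions as cracked so the robot errs on the side of avoidance:

```python
def classify(p_cracked: float, margin: float = 0.10) -> str:
    """Binary decision over the SoftMax 'cracked' probability.

    p_cracked: probability of the 'cracked' class.
    margin:    hypothetical safety margin around the 0.5 threshold;
               ambiguous outputs are treated as cracked for safety.
    """
    if p_cracked > 0.5 + margin:
        return "cracked"
    if p_cracked < 0.5 - margin:
        return "non-cracked"
    return "cracked"  # within the margin: avoid the area anyway

print(classify(0.998))  # cracked
print(classify(0.02))   # non-cracked
print(classify(0.55))   # cracked (inside the safety margin)
```

The margin value is an assumption for illustration; the paper itself only argues that all reported probabilities sit safely away from 50%.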
In addition, to demonstrate the strength of the applied CNN, we have also evaluated a simple neural network. This NN uses the Adam optimizer and, consisting of six layers, is very similar to the applied CNN in terms of architecture: one input layer, one output layer, and four hidden layers of sizes 32, 16, 8, and 4. However, an NN needs features fed into its input layer. Here, feature extraction is done with OpenCV, which converts the image into a multi-dimensional matrix; this matrix is then flattened and provided to the NN's input layer. Figs. 18 and 19 illustrate the NN's performance on the same cracked and non-cracked test samples. The NN clearly performs very poorly because it lacks convolutional layers.
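The flattening step can be sketched as follows (the image size and dtype are hypothetical; in the actual pipeline the array would come from OpenCV's `cv2.imread` or video capture rather than being synthesized):

```python
import numpy as np

# Stand-in for an image loaded with OpenCV: cv2.imread returns an
# H x W x 3 uint8 array. Here we synthesize one for illustration.
image = np.zeros((64, 64, 3), dtype=np.uint8)

# Flatten the multi-dimensional matrix into one feature vector
# suitable for a dense (fully connected) input layer.
features = image.reshape(-1).astype(np.float32) / 255.0  # scale to [0, 1]

print(features.shape)  # (12288,) = 64 * 64 * 3
```

Flattening discards the spatial neighborhood structure that convolutional layers exploit, which is consistent with the weak NN results reported below.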
Convolutional neural networks benefit from several advantages, as described earlier. Compared with NNs, they need less time and memory; in practice, however, they are still computationally expensive. This drawback can be addressed with better computing hardware such as GPUs and neuromorphic chips. Furthermore, some more recent types of
Fig. 13. F1 values for both optimizers.
Fig. 14. Cracked samples (True Positive) for the CNN with the Adam optimizer; reported probabilities range from 61.5% to 100%.
Fig. 15. Cracked samples (True Positive) for the CNN with the Adagrad optimizer; reported probabilities range from 54.2% to 97.5%.
Fig. 16. Non-cracked samples (True Negative) for the CNN with the Adam optimizer; reported probabilities range from 90.7% to 100%.
CNNs can provide a faster response than the basic CNN that uses a sliding window. The other drawback is that CNNs need a large amount of training data; this aspect is usually well addressed by applying data augmentation methods, as described in this paper.
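Typical augmentation operations of the kind referred to above can be sketched with NumPy (the exact transforms used in training are described earlier in the paper; the ones below are generic examples):

```python
import numpy as np

def augment(image: np.ndarray) -> list:
    """Generate simple augmented variants of one training image."""
    return [
        np.fliplr(image),      # horizontal flip
        np.flipud(image),      # vertical flip
        np.rot90(image, k=1),  # 90-degree rotation
        np.rot90(image, k=2),  # 180-degree rotation
    ]

img = np.arange(64 * 64 * 3, dtype=np.uint8).reshape(64, 64, 3)
for variant in augment(img):
    print(variant.shape)  # each variant keeps shape (64, 64, 3) for a square input
```

Each labelled image thus yields several training samples at no extra collection cost, which is why augmentation mitigates the data-hunger of CNNs.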
7. Conclusion
In this paper, a deep learning approach has been put forward to equip our modular façade-cleaning robot with a crack-detection
Fig. 17. Non-cracked samples (True Negative) for the CNN with the Adagrad optimizer; reported probabilities range from 62.5% to 99.2%.
Fig. 18. Cracked samples (True Positive) for the NN with the Adam optimizer; reported probabilities stay only marginally above 50% (roughly 51.7%–57.1%).
package. For façade-cleaning robots, the necessity of such an additional capability becomes more tangible when the dangers of navigating on cracked glass are taken into account. To this end, the modular Mantis robot is equipped with an on-board camera for experimental purposes, and the live video is loaded using the OpenCV package. In addition, a deep convolutional neural network developed in TensorFlow™ is proposed and trained on a sufficient data set including photos taken from video snapshots (fed directly from the robot), several other photos taken at the SUTD half-glass campus, and photos from the web with different illumination conditions and resolutions (taken with different devices). During training, two different optimizers are utilized, both of which reach a high accuracy of around 90%, which is trustworthy enough to replace the human operator with this system in real-time on-site inspections. Each of the two optimizers has its own advantages: Adam yields higher precision, while Adagrad yields higher recall. However, the Adam optimizer, with the lowest FNR and the highest accuracy, suggests the more trustworthy classifier. Moreover, the overall performance of the proposed CNN is also compared with that of a traditional NN-based method, which further demonstrates the strength of the CNN.
The overarching aim of this research is to make the robot avoid cracked regions. Our future work will involve enhancing the algorithm so that the robot can make a much more informed decision. In the next version of the crack-detection algorithm, we will focus on solving the localization problem by detecting the exact position of the crack relative to the robot and navigating the robot so that it avoids the cracked region while maintaining the highest possible coverage for efficient cleaning.
To further empower the crack-detection package, we will also focus on implementing a more powerful deep learning technique that could also help with object tracking, such as Faster R-CNN, and on enhancing the data set with more images (especially with different reflections and illuminations). Furthermore, another extension of the current work would be to run all processing on board rather than using a master-slave ROS system.
Acknowledgments
This work is financially supported by the National Robotics R&D Program Office, Singapore, under Grant no. RGAST1702, at the Singapore University of Technology and Design (SUTD), which is gratefully acknowledged.
Appendix A. Supplementary data
Supplementary data to this article can be found online at https://
doi.org/10.1016/j.autcon.2019.01.025.
References
[1] C.W. Bostick, Architectural trend thru the looking glass, Proceedings of the Glass
Performance Days, 2009, pp. 860–866. Tampere, Finland https://de.glassglobal.
com/gpd/downloads/ChangingMarkets-Bostick.pdf.
[2] F. Pariafsai, A review of design considerations in glass buildings, Frontiers of
Architectural Research 5 (2) (2016) 171–193, https://doi.org/10.1016/j.foar.2016.
01.006.
[3] A. Kochan, Robot cleans glass roof of louvre pyramid, Industrial Robot: An
International Journal 32 (5) (2005) 380–382, https://doi.org/10.1108/
01439910510614637.
[4] N. Elkmann, T. Felsch, M. Sack, J. Saenz, J. Hortig, Innovative service robot systems
for facade cleaning of difficult-to-access areas, IEEE/RSJ International Conference
on Intelligent Robots and Systems, vol. 1, 2002, pp. 756–762, , https://doi.org/10.
1109/IRDS.2002.1041481.
[5] N. Elkmann, et al., SIRIUSc — façade cleaning robot for a high-rise building in Munich, Germany, Climbing and Walking Robots, Springer Berlin Heidelberg, 2005, pp. 1033–1040, https://doi.org/10.1007/3-540-29461-9_101.
[6] Y.-S. Lee, et al., The study on the integrated control system for curtain wall building
façade cleaning robot, Autom. Constr. 94 (2018) 39–46, https://doi.org/10.1016/j.
autcon.2017.12.030.
[7] S. Nansai, M. Rajesh Elara, A survey of wall climbing robots: recent advances and
challenges, Robotics 5 (3) (2016) 14, https://doi.org/10.3390/robotics5030014.
[8] D. Schmidt, K. Berns, Climbing robots for maintenance and inspections of vertical
structures—a survey of design aspects and technologies, Robot. Auton. Syst. 61 (12)
(2013) 1288–1305, https://doi.org/10.1016/j.robot.2013.09.002.
[9] D. Longo, G. Muscato, The Alicia³ climbing robot: a three-module robot for
automatic wall inspection, IEEE Robotics & Automation Magazine 13 (1) (2006)
42–50, https://doi.org/10.1109/MRA.2006.1598052.
[10] Z. Xu, P. Ma, A wall-climbing robot for labelling scale of oil tank's volume, Robotica
Fig. 19. Non-cracked samples (True Negative) for the NN with the Adam optimizer; reported probabilities range from 15.0% to about 50%.
20 (2) (2002) 209–212, https://doi.org/10.1017/S0263574701003964.
[11] B.L. Luk, K.P. Liu, A.A. Collie, S. Chen, Tele-operated climbing and mobile service
robots for remote inspection and maintenance in nuclear industry, Industrial Robot:
An International Journal 33 (3) (2006) 194–204, https://doi.org/10.1108/
01439910610659105.
[12] B.L. Luk, L.K.P. Liu, A.A. Collie, Climbing service robots for improving safety in
building maintenance industry, in: M.K. Habib (Ed.), Bioinspiration and Robotics,
vol. Ch. 9, IntechOpen, Rijeka, 2007, , https://doi.org/10.5772/5498.
[13] A. Sintov, T. Avramovich, A. Shapiro, Design and motion planning of an autono-
mous climbing robot with claws, Robot. Auton. Syst. 59 (11) (2011) 1008–1019,
https://doi.org/10.1016/j.robot.2011.06.003.
[14] M. Henrey, A. Ahmed, P. Boscariol, L. Shannon, C. Menon, Abigaille-III: a versatile,
bioinspired hexapod for scaling smooth vertical surfaces, Journal of Bionic
Engineering 11 (1) (2014) 1–17, https://doi.org/10.1016/S1672-6529(14)
60015-9.
[15] T. Yanagida, R. Elara Mohan, T. Pathmakumar, K. Elangovan, M. Iwase, Design and
implementation of a shape shifting rolling-crawling-wall-climbing robot, Appl. Sci.
7 (4) (2017), https://doi.org/10.3390/app7040342.
[16] J. Zhu, D. Sun, S.-K. Tso, Development of a tracked climbing robot, J. Intell. Robot.
Syst. 35 (4) (2002) 427–443, https://doi.org/10.1023/A:1022383216233.
[17] T. Kim, K. Seo, J. Kim, H.S. Kim, Adaptive impedance control of a cleaning unit for a
novel wall-climbing mobile robotic platform (ROPE RIDE), 2014 IEEE/ASME
International Conference on Advanced Intelligent Mechatronics, 2014, pp.
994–999, , https://doi.org/10.1109/AIM.2014.6878210.
[18] H. Zhang, J. Zhang, W. Wang, R. Liu, G. Zong, A series of pneumatic glass-wall
cleaning robots for high-rise buildings, Industrial Robot: An International Journal
34 (2) (2007) 150–160, https://doi.org/10.1108/01439910710727504.
[19] N. Mir-Nasiri, H.S. J, M.H. Ali, Portable autonomous window cleaning robot,
Procedia Computer Science 133 (2018) 197–204, https://doi.org/10.1016/j.procs.
2018.07.024.
[20] J. Liu, K. Tanaka, L.M. Bao, I. Yamaura, Analytical modelling of suction cups used
for window-cleaning robots, Vacuum 80 (6) (2006) 593–598, https://doi.org/10.
1016/j.vacuum.2005.10.002.
[21] S. Nansai, K. Onodera, P. Veerajagadheswar, R. Elara Mohan, M. Iwase, Design and
experiment of a novel façade cleaning robot with a biped mechanism, Appl. Sci. 8
(2018) 2398, https://doi.org/10.3390/app8122398.
[22] T.T. Tun, M.R. Elara, M. Kalimuthu, A. Vengadesh, Glass facade cleaning robot with
passive suction cups and self-locking trapezoidal lead screw drive, Autom. Constr.
96 (2018) 180–188, https://doi.org/10.1016/j.autcon.2018.09.006.
[23] C. Menon, M. Murphy, M. Sitti, Gecko inspired surface climbing robots, 2004 IEEE
International Conference on Robotics and Biomimetics, 2004, pp. 431–436, ,
https://doi.org/10.1109/ROBIO.2004.1521817.
[24] S. Nansai, M. Elara, T. Tun, P. Veerajagadheswar, T. Pathmakumar, A novel nested
reconfigurable approach for a glass façade cleaning robot, Inventions 2 (3) (2017)
18, https://doi.org/10.3390/inventions2030018.
[25] M. Vega-Heredia, et al., Design and modelling of a modular window cleaning
robot, Autom. Constr. 103 (2019) 268–278, https://doi.org/10.1016/j.autcon.
2019.01.025.
[26] X. Peng, S. Katsunori, Eddy current sensor with a novel probe for crack position
detection, 2008 IEEE International Conference on Industrial Technology, 2008, pp.
1–6, , https://doi.org/10.1109/ICIT.2008.4608445.
[27] D.J. Sadler, C.H. Ahn, On-chip eddy current sensor for proximity sensing and crack
detection, Sensors Actuators A Phys. 91 (3) (2001) 340–345, https://doi.org/10.
1016/S0924-4247(01)00605-7.
[28] P. Liu, J. Jang, S. Yang, H. Sohn, Fatigue crack detection using dual laser induced
nonlinear ultrasonic modulation, Opt. Lasers Eng. 110 (2018) 420–430, https://doi.
org/10.1016/j.optlaseng.2018.05.025.
[29] M. Culjat, R. Singh, E. Brown, R. Neurgaonkar, D. Yoon, S. White, Ultrasound crack
detection in a simulated human tooth, Dentomaxillofacial Radiology 34 (2) (2005)
80–85, https://doi.org/10.1259/dmfr/12901010.
[30] A. Mohan, S. Poobal, Crack detection using image processing: a critical review and
analysis, Alexandria Engineering Journal 57 (2) (2018) 787–798, https://doi.org/
10.1016/j.aej.2017.01.020.
[31] Y.-J. Cha, W. Choi, O. Büyüköztürk, Deep learning-based crack damage detection
using convolutional neural networks, Computer-Aided Civil and Infrastructure
Engineering 32 (5) (2017) 361–378, https://doi.org/10.1111/mice.12263.
[32] L. Zhang, F. Yang, Y.D. Zhang, Y.J. Zhu, Road crack detection using deep con-
volutional neural network, 2016 IEEE International Conference on Image
Processing (ICIP), 2016, pp. 3708–3712, , https://doi.org/10.1109/ICIP.2016.
7533052.
[33] A. Zhang, et al., Automated pixel-level pavement crack detection on 3D asphalt
surfaces using a deep-learning network, Computer-Aided Civil and Infrastructure
Engineering 32 (10) (2017) 805–819, https://doi.org/10.1111/mice.12297.
[34] K. Gopalakrishnan, S.K. Khaitan, A. Choudhary, A. Agrawal, Deep convolutional
neural networks with transfer learning for computer vision-based data-driven pa-
vement distress detection, Constr. Build. Mater. 157 (2017) 322–330, https://doi.
org/10.1016/j.conbuildmat.2017.09.110.
[35] F. Chen, M.R. Jahanshahi, NB-CNN: deep learning-based crack detection using
convolutional neural network and Naïve Bayes data fusion, IEEE Trans. Ind.
Electron. 65 (5) (2018) 4392–4400, https://doi.org/10.1109/TIE.2017.2764844.
[36] M.E. Ibrahim, R.A. Smith, C.H. Wang, Ultrasonic detection and sizing of compressed
cracks in glass- and carbon-fibre reinforced plastic composites, NDT & E
International 92 (2017) 111–121, https://doi.org/10.1016/j.ndteint.2017.08.004.
[37] B. Akdemir, Ş. Öztürk, Glass surface defects detection with wavelet transforms,
International Journal of Materials, Mechanics and Manufacturing 3 (3) (2015),
https://doi.org/10.7763/IJMMM.2015.V3.189.
[38] Z. Yiyang, The design of glass crack detection system based on image preprocessing
technology, 2014 IEEE 7th Joint International Information Technology and
Artificial Intelligence Conference, 2014, pp. 39–42, , https://doi.org/10.1109/
ITAIC.2014.7065001.
[39] M. Hui-Min, S. Guang-Da, W. Jun-Yan, N. Zheng, A glass bottle defect detection
system without touching, Proceedings. International Conference on Machine
Learning and Cybernetics, vol. 2, 2002, pp. 628–632, , https://doi.org/10.1109/
ICMLC.2002.1174411.
[40] L. Joseph, ROS Robotics Projects, Packt Publishing Ltd., 2017 ISBN 10:
1783554711. (ISBN 13: 9781783554713).
[41] T. Guo, J. Dong, H. Li, Y. Gao, Simple convolutional neural network on image
classification, 2017 IEEE 2nd International Conference on Big Data Analysis
(ICBDA), 2017, pp. 721–724, , https://doi.org/10.1109/ICBDA.2017.8078730.
[42] S. Albawi, T.A. Mohammed, S. Al-Zawi, Understanding of a convolutional neural
network, 2017 International Conference on Engineering and Technology (ICET),
2017, pp. 1–6, , https://doi.org/10.1109/ICEngTechnol.2017.8308186.
[43] Diederik P. Kingma, J. Ba, Adam: A method for stochastic optimization, 3rd
International Conference for Learning Representations, San Diego, 2015 https://
arxiv.org/abs/1412.6980.
[44] J. Duchi, E. Hazan, Y. Singer, Adaptive subgradient methods for online learning and
stochastic optimization, J. Mach. Learn. Res. 12 (2011) 2121–2159 http://www.
jmlr.org/papers/volume12/duchi11a/duchi11a.pdf.
[45] E. Protopapadakis, N. Doulamis, Image based approaches for tunnels' defects re-
cognition via robotic inspectors, International Symposium on Visual Computing,
ISVC 2015: Advances in Visual Computing, vol. 9474, Springer, Cham, 2015, pp.
706–716, , https://doi.org/10.1007/978-3-319-27857-5_63.