Contents lists available at ScienceDirect
Automation in Construction
journal homepage: www.elsevier.com/locate/autcon
Self-reconfigurable façade-cleaning robot equipped with deep-learning-based crack detection based on convolutional neural networks

Maryam Kouzehgar a,⁎, Yokhesh Krishnasamy Tamilselvam b, Manuel Vega Heredia a,c, Mohan Rajesh Elara a

a Engineering Production and Development Pillar, Singapore University of Technology and Design, Singapore 487372, Singapore
b Electrical Engineering Department (Robotics), University of Western Ontario, London, Ontario N6A 3K7, Canada
c Engineering and Technology Department, Campus Los Mochis, Universidad de Occidente, Sinaloa 81223, Mexico
ARTICLE INFO

Keywords:
Self-reconfigurable robot
Façade-cleaning robot
Glass crack detection
Convolutional neural network (CNN)
Deep learning
TensorFlow
OpenCV
ROS
Adam optimizer
Adagrad optimizer
ABSTRACT

Despite advanced construction technologies that are unceasingly filling city skylines with glassy high-rise structures, maintenance of these shining tall monsters has remained a high-risk, labor-intensive process. Utilizing façade-cleaning robots therefore seems inevitable. However, when navigating on cracked glass, these robots may cause hazardous situations, so it is necessary to equip them with a crack-detection system that allows them to avoid cracked areas. In this study, benefiting from convolutional neural networks developed in TensorFlow, a deep-learning-based crack-detection approach is introduced for a novel modular façade-cleaning robot. For experimental purposes, the robot is equipped with an on-board camera, and the live video is loaded using OpenCV. The vision-based training process is fulfilled by applying two different optimizers on a sufficiently generalized data set; data augmentation techniques and image pre-processing are also applied as part of the process. Simulation and experimental results show that the system can detect cracks with an accuracy of around 90%, which is satisfying enough to replace human-conducted on-site inspections. In addition, a thorough comparison between the optimizers is put forward: the Adam optimizer shows higher precision, while Adagrad yields a more satisfying recall; however, Adam, with the lowest false negative rate and highest accuracy, performs better overall. Furthermore, the proposed CNN's performance is compared to a traditional NN, and the results show a remarkable difference in success level, proving the strength of the CNN.
1. Introduction
High-rise buildings with glass facades are increasingly common worldwide [1,2]. However, when it comes to their maintenance, cleaning normally follows a classical approach that is still labor-intensive and often of high risk to the manpower, especially in adverse weather conditions and strong winds. Recently, there has been a growing trend toward commercialized robots such as the Winbot series from Ecovacs (Winbot X, Winbot 950 and Winbot 850), the Hobot window-cleaning series (Hobot 168, 198 and 288), the Alfawise A168/S60, and Rumbot's window-cleaning robot. Furthermore, for high-rise structures, there are commercialized façade-cleaning robots such as RobuGlass, designed for cleaning the Louvre museum [3], and Glazenwasrobot from KITE Robotics, which works with protective cranes.
In order to improve the capabilities of these increasingly demanded service robots, robotic systems for cleaning vertical glass facades have also been a matter of interest in academic research. In this field, there are numerous research challenges associated with mechanism design and autonomous capabilities. The essential need to use robots for vertical façade cleaning was recognized long ago and has been addressed by introducing SIRIUSc, a modular robot basically designed to work on skyscrapers: this robot has been specifically utilized to clean the 25,000 m² vaulted glass hall of the Leipzig Trade Fair in Germany [4,5]. Going ahead on the same stream toward autonomy, the robotic system proposed in [6] performs tasks such as cleaning, moving, rail alignment control, and obstacle detection.
In terms of mechanical design, a glass façade-cleaning robot should first be a surface-climbing robot. A detailed review of wall-climbing robots is presented in [7], categorizing them into six distinguished classes based on the applied adhesive mechanism. Apart from façade cleaning,
https://doi.org/10.1016/j.autcon.2019.102959
Received 25 January 2019; Received in revised form 2 September 2019; Accepted 7 September 2019
Corresponding author.
E-mail addresses: maryam_kouzehgar@sutd.edu.sg (M. Kouzehgar), ykrishn4@uwo.ca (Y. Krishnasamy Tamilselvam),
manuel_vega@sutd.edu.sg (M. Vega Heredia), rajeshelara@sutd.edu.sg (M. Rajesh Elara).
Automation in Construction 108 (2019) 102959
Available online 23 October 2019
0926-5805/ © 2019 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license
(http://creativecommons.org/licenses/BY-NC-ND/4.0/).
surface-climbing robots are mainly utilized for remote inspection and maintenance applications [8] in difficult-access situations such as wall inspection [9], labeling oil-tank volume [10], and nuclear power plant inspection [11].
Regardless of the application to façade cleaning, several types of mechanisms have been proposed for surface climbing: a legged morphology with eight articulated limbs for Robug III is presented in [12]. The structure illustrated in [13] is another type of legged morphology. In addition, a hexapod structure for climbing is put forward in [14]. Moreover, a shape-shifting rolling-crawling mechanism, which is also able to climb vertical surfaces, is introduced in [15]. The surface-climbing robots illustrated in [16,17] use a tracked-wheel mechanism. There are also pneumatic-type climbing robots [18], namely those utilizing suction cups [19,20]. Among the climbing robots equipped with suction cups, [21] proposes a biped mechanism, whereas [22] introduces a robot specifically designed for glass façade cleaning with passive suction cups driven by self-locking leadscrews. Additionally, some climbing robots benefit from material-based adhesiveness [23] and impedance control [17].
When it comes to improving the performance of façade-cleaning robots, reconfigurability can add great value in terms of accessing more spaces and flexibility in morphology. In [24], a nested reconfigurable mechanism has been proposed specifically for vertical façade-cleaning applications. The rolling-crawling mechanism introduced in [15] is another example of reconfigurable morphology that can be used in façade-cleaning applications due to its wall-climbing ability. Moreover, bio-inspired mechanisms can also pave their way toward vertical façade cleaning, as with the climbing mechanism proposed in [23] benefiting from special adhesive materials.
The robot considered in this paper is Mantis [25], a modular climbing robot that uses a powerful commercial impeller to provide the required mechanical attachment force as the adhesion mechanism. Even with robotic façade cleaners, usually a person is required to detach the robot from one window panel and attach it to another. Mantis is equipped with the ability to distinguish the frames (or any obstacles other than glass) and, due to its modular morphology, is able to perform a transition from one window panel to another, avoiding its frame. This way, the need for human interference is eliminated. This is one aspect of autonomy worked out for the façade-cleaning robot thoroughly investigated in [25] in terms of design, locomotion strategies, and control.
Furthermore, façade-cleaning robots should avoid cracked glass regions, since navigating on them could cause dangerous situations. Hence, it is necessary that the robot can recognize surfaces safe for navigation, just as the high-rise window cleaner from FatCat Robotics is claimed to be equipped with crack-detection technology. In this study, our aim is to equip Mantis with glass crack-detection ability using deep-learning techniques, namely convolutional neural networks (CNNs). This is prominently significant because we are adding a very necessary capability that allows the façade-cleaning robot to avoid and prevent probable dangerous situations for itself and for the humans around it. Given the irreparable harm to people's health, no one would accept a shattering glass panel merely for the sake of having a tech-equipped cleaning robot navigate on it; without this additional capability, people would prefer the glass to remain stained rather than risk it shattering. This paper introduces this distinguished capability as its main contribution, which will guarantee safety for all.
Crack detection can generally be viewed from two aspects: first, the material on which the crack exists, and second, the method used to analyze the crack. In terms of material, several other materials require significant crack-detection efforts, such as metals, concrete, asphalt, and walls of different materials in the construction industry. For example, in [26,27], crack detection on metal surfaces is worked out by applying an Eddy-current sensor. In [28], a nonlinear ultrasonic modulation technique based on dual laser excitation is proposed for fatigue crack detection in aluminum and steel plates. Ultrasound crack detection in a simulated human tooth is investigated in [29]. In terms of detecting civil infrastructure defects, a survey on image-based crack detection for concrete surfaces is put forward in [30].
Vision-based, deep-learning-supported crack-detection methods have mostly been a matter of interest for concrete and asphalt surfaces. For instance, [31] proposes a vision-based method using a deep architecture of CNNs for detecting concrete cracks. Road crack detection using CNNs is elaborated in [32], and crack detection on 3D asphalt surfaces using a deep-learning network is proposed by [33], while [34] applies deep CNNs with transfer learning for vision-based pavement distress detection. Furthermore, deep-learning-based crack detection using CNNs also has an interesting application in identifying cracks on reactors within the process of nuclear power plant inspection [35].
The recent great interest in CNNs lies in their outstanding advantages: notably, CNNs benefit from automatic feature extraction and feature learning. In fact, they can learn relevant features from an image/video at different levels, similar to a human brain. Ordinary NNs, for example, cannot do this and need a prior feature-extraction phase before being applied to any object/pattern-recognition application. In general, when using NNs, one needs to extract relevant features for the given task and assign each of them to an element of the input vector, while a CNN will automatically extract such features provided that the input can be represented as a tensor with locally correlated elements, such as audio data, images, video, etc. Furthermore, a CNN is more efficient in terms of memory and complexity because it needs far fewer parameters, especially in image-processing applications where, dealing with the image matrix, the number of weight parameters grows drastically in other methods such as NNs; in a CNN it depends only on the number and size of the filters. Another distinguished advantage of CNNs is the capability for transfer learning, i.e., re-using a pre-trained CNN by feeding one's data at each level and only slightly tuning the CNN for the newly defined task, for example using the knowledge gained on cars to distinguish trucks. This way, training the CNN from scratch is avoided, saving memory and time.
Particularly when it comes to crack detection on glass surfaces, besides the vision-based techniques, there are also some other methods, such as the ultrasonic glass crack detection proposed in [36] and detecting defects on glass surfaces with wavelet transforms in [37]. Regarding vision-based techniques, glass crack detection specifically has recently been developed mostly on the basis of classical image-processing methods such as edge detection and segmentation. As an illustrative example, the glass crack-detection system proposed in [38] first applies pre-processing and smooth sharpening, followed by image segmentation, and ends with crack feature extraction including the calculation of crack area, crack perimeter, and crack circularity. Also, in [39], a very basic image-processing method based on pixel coordinates is applied to detect cracks on bottles in a video-supervised production line.

However, vision-based deep learning has rarely been applied to glass crack detection and therefore presents a scope for tremendous research. The rarity of deep-learning applications in glass crack detection mostly stems from two different issues. First, glass crack detection has not been a matter of interest in major industrial applications, and a glass-cleaning robot with crack-detection capability is rarely found with a valid commercialized patent. Second, dealing with images of glass is not straightforward, because lighting exposure on glass is a severe issue that varies tremendously at different times of day and night. Furthermore, many objects in the background and foreground of the glass may have their images reflected on it, and intense light can also be reflected on the glass. This reflection issue is referred to as the glare effect in this paper, and we have tried to train the network with sample videos containing different types of glare effect (reflection of the sun and of artificial intense light).
In this paper, the aim is to propose a deep-learning-based crack-detection system for a glass façade-cleaning modular robot. To this end, we will be using a live video sent from an onboard camera mounted on the robot, monitoring the glass surface ahead. The CNN that takes care of crack detection in each video frame is implemented in Python with a TensorFlow backend. The utilized CNN consists of two convolutional layers, two maximum pooling layers, and one fully connected layer, and during the training process we use two types of optimizers: the Adam optimizer and the Adagrad optimizer. The overall performance of the proposed approach is quite satisfying, since it can achieve an accuracy of more than 90%. This way, our façade-cleaning robot is equipped with a higher level of autonomy in terms of detecting cracked regions and providing adequate safety.
The rest of the paper is organized as follows. Section 2 describes the overall locomotion mechanism of the considered façade-cleaning robot (Mantis). Section 3 is dedicated to a brief general review of the CNN concept along with the mechanism of each layer. Section 4 is concerned with the nature of the data set and details about data augmentation and preprocessing. Furthermore, Section 5 introduces the proposed CNN architecture and training algorithm along with the mathematical support for the utilized optimizers. Section 6 is devoted to results and discussion, summarizing the test results and casting light on the accuracy metrics, performance-analysis graphs, and sample data classification results, together with a performance comparison against a simple neural network. Eventually, Section 7 concludes the paper and opens a window toward future works.
2. Description of the robotic platform [25]
Mantis is a modular vertical climbing robot developed for façade cleaning [25]. The currently commercialized façade-cleaning robots, such as the Winbot, Hobot, and Alfawise series, usually require human interference to detach them, move them to the next window frame, and attach them again to continue the cleaning process. To address this shortcoming in current commercial designs, Mantis is designed in such a way that it can perform the transition from one window panel to another without manual assistance. The robot has been developed on a rigid structure that keeps the modules unified, while an individual rotation and lifting mechanism is provided for each module. Fig. 1 shows a picture of Mantis.
2.1. Adhesion mechanism
Mantis uses a powerful commercial impeller to create the suction required as the mechanical attachment force that keeps it attached to the glass and prevents it from falling. The impeller's input voltage is adjustable to regulate the required force based on the surface sensitivity.
2.2. Locomotion mechanism
The robot is equipped with a locomotive wheel mechanism with soft, flexible, high-friction rubber. In this way, we decrease the drift of the robot on the glass in the presence of dirt and liquids. The diameter of the wheels is 6 cm, connected to DC motors with 250 oz-in of torque. The maximal speed of the robot is 15 cm/s.

As depicted in Fig. 2, the wheels are attached to each module in an orthogonal position to the glass and equidistant from the center of the module, aligned with the Y axis.

The whole module can rotate β degrees around the Z-axis for each module i (where i is the pad, i = a, b, c). Thus, by rotating the module, the wheels rotate concentrically around a central axis.

Considering the locomotive mechanism, Mantis is a 12W6D3S robot. Based on the locomotive characteristics of the robot, given the restrictions imposed to facilitate control, it is nearly holonomic. Each of the modules can rotate 360°. However, in the current application the rotation range is limited to 90°, since this is enough for the robot to navigate throughout the whole window.
2.3. Transition mechanism
The transition mechanism consists of two steps: first, recognizing the window frame or any obstacle; second, lifting the module that is nearest to the obstacle. The recognition is fulfilled by applying inductive sensors. The transition between the windows, across the frames or obstacles, is shouldered by the lifting system. Aiming at transition, a linear actuator is utilized, which linearly moves a screw and lifts a
Fig. 1. Structure of Mantis while cleaning.
Fig. 2. Mantis inertial system in red, module c inertial system in black; β is the steering of each module. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
module from the surface using the rotation around a screw rod located
in the center of each module. This mechanism is illustrated in Fig. 3.
2.4. Software architecture
As shown in Fig. 4, for this application the robot is teleoperated through a Bluetooth device (HC-06) and a mobile application. The robot can also be teleoperated by a computer or through a USB interface. To develop autonomous navigation, the robot is equipped with a sensory system to know its orientation and position and to detect the frame of the window.

In Fig. 4, continuous arrows are power connections and dotted arrows show signals.

The robot is controlled using a master system which works with the Robot Operating System (ROS). Using ROS and its topic-subscription system, it is possible to process package information. The packets are bidirectional between the MCU (Arduino Mega) and the on-board ROS slave (Intel Compute Stick), as well as between the ROS slave and the ROS master, which communicate over WiFi. The ROS master is a server PC where information from the ROS slave is processed in software that cannot run on the ROS slave, given the processing resources necessary for correct operation. As shown in Fig. 5, the communication between TensorFlow and ROS is developed by subscribing to data-package topics [40] for identifying the cracked glass by image processing.
2.5. On-board visual support system
In order to develop an autonomous robot that can navigate safely on the glass, it is necessary for it to determine the state of the surface, checking for existing cracks. For the analysis of the glass surface, a vision system is used, based on a high-resolution camera (HD 1080, 30 fps) connected to a Wi-Fi module. As shown in Fig. 6.a, the camera is assembled in a fixture with an estimated angle of inclination of 45°. It also has a 90° vision angle, as shown in Fig. 6.b. In addition, Fig. 7 shows a picture of the real robot with the camera mounted on the middle module.
3. Convolutional neural networks - a very brief review
In this paper, we will be using a convolutional neural network (CNN) implemented in Python with a TensorFlow backend. A CNN contains special layers called convolutional layers, which are very useful in detecting objects and patterns [41,42]. One advantage of a CNN is that it can be used to build a very deep network with a smaller number of parameters to train, thereby reducing the time and complexity of the training process. Apart from this, a CNN consists of different types of layers with specific characteristics, such as convolutional layers, activation layers, pooling layers, fully connected layers, and SoftMax layers.
The primary idea behind image classification is horizontal or vertical edge detection, which can be achieved by performing a convolution operation on the input image. The algorithm takes a small square (or window) called a filter and applies it over the image. Each filter allows the CNN to identify certain patterns in the image. In the initial layers, a CNN starts by detecting simple features such as lines, circles, and edges. In each layer, the network can combine these findings and continually learn more complex concepts as it goes deeper; in our case, it detects cracks existing on the glass.
3.1. Convolution layer
The convolution layer consists of the filter, which convolves across the width and height of the input volume. In other words, the output of a convolutional layer is obtained by carrying out a dot-product operation between the filter weight content and each
Fig. 3. Mantis lifting the middle module for transition between window panels [25].
Fig. 4. Schematic architecture of Mantis.
Fig. 5. ROS-TensorFlow robot communication.
Fig. 6. Position and orientation of the camera mounted on Mantis.
unique position of the input image. This process results in a 2-dimensional activation map that gives the responses of that filter at every spatial position. The various parameters involved in this process are the number of filters, the filter size and weight contents, the stride, and the padding. All of these parameters affect the size of the output.
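The sliding dot-product described above can be sketched with a naive NumPy routine (an illustrative sketch with a hypothetical vertical-edge filter, not the paper's implementation; stride 1, no padding):

```python
import numpy as np

def conv2d(image, kernel, stride=1):
    """Naive 2-D convolution (cross-correlation): slide the filter over the
    image and take the dot product at every unique position ("valid" padding)."""
    kh, kw = kernel.shape
    ih, iw = image.shape
    oh = (ih - kh) // stride + 1
    ow = (iw - kw) // stride + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            patch = image[i*stride:i*stride+kh, j*stride:j*stride+kw]
            out[i, j] = np.sum(patch * kernel)  # dot product with the filter
    return out

# A vertical-edge filter responds strongly where intensity changes left-to-right,
# e.g. across the bright-to-dark boundary in this toy 4x4 image.
image = np.array([[10, 10, 0, 0]] * 4, dtype=float)
vertical_edge = np.array([[1, 0, -1],
                          [1, 0, -1],
                          [1, 0, -1]], dtype=float)
activation_map = conv2d(image, vertical_edge)  # 2x2 map, strong response everywhere near the edge
```

In a trained CNN the filter weights are learned rather than hand-designed; this is exactly the automatic feature extraction discussed in the introduction.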
3.2. Pooling layers
Pooling layers generally aim at avoiding overfitting: by applying non-linear down-sampling to the activation maps, they reduce the dimension and complexity to speed up the computation.

The parameters involved in this process are filter size and stride, whereas padding is not used in pooling, as it runs against the purpose of reducing the input dimension. In addition, pooling is applied to each input channel individually, so the numbers of output and input channels are equal. There are two different types of pooling, namely max pooling and average pooling.
Max pooling
The basic operation of a pooling layer is similar to that of the convolutional layer. One noticeable difference is that instead of taking the dot product of the input and the filter, we take the maximum neighboring value at each unique position in the input image. This is done for each channel in the input.
Average pooling
In average pooling, we take the average of all the values surrounding each unique position in the input image.
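Both pooling variants can be sketched in one routine (an illustrative NumPy sketch, not the paper's code; single channel, no padding):

```python
import numpy as np

def pool2d(x, size, stride, mode="max"):
    """Down-sample one channel: take the max (or mean) of each size x size
    window, moving by `stride` positions (no padding, as in pooling layers)."""
    h, w = x.shape
    oh = (h - size) // stride + 1
    ow = (w - size) // stride + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            window = x[i*stride:i*stride+size, j*stride:j*stride+size]
            out[i, j] = window.max() if mode == "max" else window.mean()
    return out

x = np.array([[1, 2, 5, 6],
              [3, 4, 7, 8],
              [9, 8, 3, 2],
              [7, 6, 1, 4]], dtype=float)
mx = pool2d(x, size=2, stride=2, mode="max")  # [[4., 8.], [9., 4.]]
av = pool2d(x, size=2, stride=2, mode="avg")  # [[2.5, 6.5], [7.5, 2.5]]
```

Note how a 4 × 4 input shrinks to 2 × 2, which is the dimensionality reduction that speeds up the later layers.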
3.3. Fully connected layers
The fully connected layer is another name for the hidden layer used in a regular neural network. Before this step, the input array is converted into a one-dimensional vector using a flattening layer. As the name suggests, in a fully connected layer, each node in the input is connected to every node in the output.
3.4. SoftMax layer
The SoftMax activation function is applied in the output layer of a CNN to represent a categorical distribution over labels; it gives the probability of each input belonging to each label.
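For two classes (cracked / non-cracked), the SoftMax function can be sketched as follows (illustrative NumPy code with made-up logit values):

```python
import numpy as np

def softmax(logits):
    """Map raw output scores (logits) to class probabilities that sum to 1.
    Subtracting the max logit first is a standard numerical-stability trick."""
    z = logits - np.max(logits)
    e = np.exp(z)
    return e / e.sum()

# Two output nodes (cracked, non-cracked): the larger logit gets the
# larger probability, and the two probabilities sum to 1.
probs = softmax(np.array([2.0, 0.5]))
```

Because the two classes are mutually exclusive, the robot can simply act on whichever class receives the higher probability.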
4. Image Data Set, augmentation techniques and preprocessing
In this study, we generated cracked and non-cracked images from the video frames captured by the camera mounted on the robot. As discussed earlier, the viewing angle of this camera covers the glass that is to be cleaned.

Most of the data set consists of snapshots of videos (from the robot) or photos taken at the Singapore University of Technology and Design (SUTD), which is beautifully designed with a half-glass structure in all its buildings. The samples were captured under different conditions (illumination, reflection, etc.), with different camera devices and different resolutions, including different types of object reflections and glare effects (light and sun reflections). While the majority of these images were collected by our team, for the sake of diversity and to generalize the data set, some cracked/non-cracked images were also added from the web.
Since the main goal of this study was crack detection, the images were manually labelled into two separate categories: cracked or non-cracked. Originally, 1539 cracked images and 1565 non-cracked images were labelled. In addition, by applying some data augmentation techniques (mostly flipping and rotating), the final dataset consists of 2205 cracked and 4303 non-cracked images with different conditions of orientation, illumination, and resolution.
After saving the trained CNN, it is tested with snapshots from the live video. For online crack detection through the live video from the robot, the raw image data go through several pre-processing phases:

- Converting the video into images by reading a video frame every 1 s.
- Converting each image to an appropriate size. The reason is that the software tends to provide better results if all input images have the same pixel dimensions; in this paper, we have used dimensions of 480 × 240 pixels.
- The resultant images are read using the OpenCV package in Python.
- We use grayscale images, as this reduces the number of dimensions, thereby reducing the operational time and capacity. The obtained vector is then normalized (each pixel value is divided by 255). This is the last pre-processing step before the image is fed into the trained CNN for testing.
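The per-frame steps above can be sketched as follows. The paper performs them with OpenCV; the sketch below mimics them in plain NumPy so the arithmetic is explicit (the grayscale weights and nearest-neighbor resize are illustrative stand-ins for the OpenCV calls; only the 480 × 240 target size and the division by 255 come from the text):

```python
import numpy as np

def preprocess_frame(frame_rgb):
    """Pre-process one video frame: grayscale -> resize to 480x240 -> normalize.
    The paper uses OpenCV for these steps; here they are mimicked in NumPy."""
    # 1) Grayscale: collapse the 3 color channels (ITU-R BT.601 luma weights).
    gray = (0.299 * frame_rgb[..., 0]
            + 0.587 * frame_rgb[..., 1]
            + 0.114 * frame_rgb[..., 2])
    # 2) Resize to the fixed 480 (wide) x 240 (high) input size by
    #    nearest-neighbor sampling of rows and columns.
    target_h, target_w = 240, 480
    rows = np.arange(target_h) * gray.shape[0] // target_h
    cols = np.arange(target_w) * gray.shape[1] // target_w
    resized = gray[np.ix_(rows, cols)]
    # 3) Normalize each pixel to [0, 1] by dividing by 255.
    return resized / 255.0

# Simulate one raw 720p camera frame and run the pipeline.
frame = np.random.randint(0, 256, size=(720, 1280, 3)).astype(np.uint8)
x = preprocess_frame(frame)  # shape (240, 480), values in [0, 1]
```

The resulting array is what would be fed to the trained CNN for each one-second snapshot of the live video.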
5. The training algorithm and applied CNN architecture
As illustrated before, this paper aims at classifying cracked glass versus non-cracked glass using the video captured from the robot. The purpose of the classification is that the robot needs to avoid the areas of the glass where cracks are present. As mentioned earlier, the video is converted into grayscale images and loaded using the OpenCV package. A CNN developed in TensorFlow is then used for classification. To get the best results and also make a comparison, we have used two optimizers, namely Adam and Adagrad, to perform the training.
Fig. 7. Camera mounted on the middle module of Mantis.
5.1. Optimizers
A brief elaboration on the formulation of these optimizers is given
below in order to make it possible to reproduce the simulations.
5.1.1. Adam optimizer
The Adam optimizer (Adaptive Moment Estimation) follows an algorithm for first-order gradient-based optimization based on adaptive estimates of lower-order moments. The pseudo code for the Adam algorithm is given below [43].
Require: $\alpha$: step size
Require: $\beta_1, \beta_2 \in [0, 1)$: exponential decay rates for the moment estimates
Require: $f(\theta)$: stochastic objective function with parameters $\theta$
Require: $\theta_0$: initial parameter vector
$m_0 \leftarrow 0$ (initialize 1st moment vector)
$v_0 \leftarrow 0$ (initialize 2nd moment vector)
$t \leftarrow 0$ (initialize timestep)
while $\theta_t$ not converged, do the following:
  $t \leftarrow t + 1$
  $g_t \leftarrow \nabla_\theta f_t(\theta_{t-1})$ (get gradients w.r.t. stochastic objective at timestep $t$)
  $m_t \leftarrow \beta_1 m_{t-1} + (1 - \beta_1)\, g_t$ (update biased first moment estimate)
  $v_t \leftarrow \beta_2 v_{t-1} + (1 - \beta_2)\, g_t^2$ (update biased second raw moment estimate)
  $\hat{m}_t \leftarrow m_t / (1 - \beta_1^t)$ (compute bias-corrected first moment estimate)
  $\hat{v}_t \leftarrow v_t / (1 - \beta_2^t)$ (compute bias-corrected second raw moment estimate)
  $\theta_t \leftarrow \theta_{t-1} - \alpha\, \hat{m}_t / (\sqrt{\hat{v}_t} + \varepsilon)$ (update parameters)
end while
return $\theta_t$ (resulting parameters)

where $g_t$ are the gradients, $\theta_t$ is the parameter vector at time $t$, $\beta_1$ and $\beta_2$ belong to $[0, 1)$, and $\alpha$ is the learning rate. According to [43], $g_t^2$ indicates the elementwise square $g_t \odot g_t$, and the proposed default settings are $\alpha = 0.001$, $\beta_1 = 0.9$, $\beta_2 = 0.999$ and $\varepsilon = 10^{-8}$. All operations on vectors are element-wise, and $\beta_1^t$ and $\beta_2^t$ denote $\beta_1$ and $\beta_2$ to the power of $t$.
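The pseudo code translates directly into NumPy. The sketch below applies it to a toy quadratic objective $f(\theta) = \theta^2$ (the objective, step size, and iteration count are illustrative, not from the paper):

```python
import numpy as np

def adam_step(theta, grad, m, v, t, alpha=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update following the pseudo code; all ops are element-wise."""
    m = beta1 * m + (1 - beta1) * grad          # biased 1st moment estimate
    v = beta2 * v + (1 - beta2) * grad ** 2     # biased 2nd raw moment estimate
    m_hat = m / (1 - beta1 ** t)                # bias-corrected 1st moment
    v_hat = v / (1 - beta2 ** t)                # bias-corrected 2nd moment
    theta = theta - alpha * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Minimize f(theta) = theta^2 (gradient 2*theta) starting from theta = 1.
theta, m, v = np.array([1.0]), np.zeros(1), np.zeros(1)
for t in range(1, 5001):
    theta, m, v = adam_step(theta, 2 * theta, m, v, t, alpha=0.01)
# theta is now close to the minimum at 0
```

In the actual training, TensorFlow's built-in Adam implementation performs these updates on the CNN weights; the sketch only exposes the arithmetic behind them.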
5.1.2. Adagrad optimizer
The Adagrad optimizer is a gradient-based optimization algorithm that works well for sparse gradients [44]. It automatically adapts the learning rate based on the parameters. The basic equation used for the parameter update is shown in Eq. (1), where $\theta_t$ is the parameter at time $t$, $\alpha$ is the learning rate, $g_t$ is the gradient estimate, and $\odot$ means element-wise multiplication.
$$\theta_t = \theta_{t-1} - \frac{\alpha}{\varepsilon + \sqrt{\sum_{\tau=1}^{t} g_\tau^2}} \odot g_t \qquad (1)$$
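This per-parameter update, in which the learning rate shrinks as squared gradients accumulate, can be sketched in NumPy and applied to a toy quadratic objective (the objective, step size, and iteration count are illustrative, not from the paper):

```python
import numpy as np

def adagrad_step(theta, grad, accum, alpha=0.1, eps=1e-8):
    """One Adagrad update: each parameter's learning rate is scaled down by
    the square root of its accumulated squared gradients (element-wise)."""
    accum = accum + grad ** 2
    theta = theta - alpha * grad / (eps + np.sqrt(accum))
    return theta, accum

# Toy run: minimize f(theta) = theta^2 (gradient 2*theta) starting from 1.
theta, accum = np.array([1.0]), np.zeros(1)
for _ in range(2000):
    theta, accum = adagrad_step(theta, 2 * theta, accum)
# theta has decayed toward the minimum at 0
```

Because the accumulated term only grows, Adagrad's effective step size decays monotonically, which is one reason its training curves behave differently from Adam's in Section 6.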
[Fig. 8 sketch: the input image (480 × 240 × 1) passes through convolution layer 1, max pooling layer 1, convolution layer 2, and max pooling layer 2, is flattened into a 1920-element vector, and is fed to a fully connected network ending in a SoftMax layer with two outputs: Uncracked and Cracked.]
Fig. 8. CNN architecture for this classication process.
Table 1
Parameters used in the CNN layers.

Layer                 Filter size   Stride   Padding type   Activation
Convolution layer 1   4             1        Same           ReLU
Max pooling layer 1   8             8        Same (P = 0)   -
Convolution layer 2   2             1        Same           ReLU
Max pooling layer 2   4             4        Same (P = 0)   -
[Fig. 9 flowchart: Start → input video from the on-board camera → get snapshots of the video → label the training data → normalize → 1st convolutional layer with ReLU activation → 1st maximum pooling layer → 2nd convolutional layer with ReLU activation → 2nd maximum pooling layer → fully connected layer with SoftMax function → calculate cost → calculate test accuracy → if the stopping criteria are fulfilled, save the trained model and stop; otherwise, repeat.]
Fig. 9. The CNN training algorithm.
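The first preprocessing steps of this pipeline, normalizing the snapshots and labelling them for the two mutually exclusive classes, can be sketched as below; the function names, the [0, 1] scaling convention and the class ordering are illustrative assumptions, not taken from the paper's code:

```python
import numpy as np

CLASSES = ("uncracked", "cracked")  # the two mutually exclusive labels

def preprocess_snapshot(frame):
    """Scale an 8-bit grayscale snapshot to [0, 1], as in the
    normalization step of the training pipeline."""
    return frame.astype(np.float32) / 255.0

def one_hot(label):
    """Encode 'cracked'/'uncracked' as a one-hot target for the SoftMax layer."""
    target = np.zeros(len(CLASSES), dtype=np.float32)
    target[CLASSES.index(label)] = 1.0
    return target
```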
M. Kouzehgar, et al. Automation in Construction 108 (2019) 102959
6
5.2. Proposed CNN
A sketch of the utilized CNN architecture is shown in Fig. 8 in which
we have used two convolutional layers with ReLU activation, two
maximum pooling layers and one fully connected layer with SoftMax
activation. Due to the mutually exclusive nature of the crack detection
problem (cracked or non-cracked), a SoftMax layer is used as the last
layer to compute the probability of each class. Furthermore, Table 1
summarizes the parameters corresponding to CNN layers. Meanwhile,
the training algorithm is illustrated in Fig. 9.
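The layer parameters of Table 1 can be sanity-checked by tracing the spatial dimensions through the network. The sketch below assumes, as read off Fig. 8, a 480 × 240 single-channel input and 16 feature maps after the second convolution; with SAME padding, each layer's output size is the ceiling of the input size over the stride:

```python
import math

def same_out(size, stride):
    """Output spatial size under SAME padding: ceil(size / stride)."""
    return math.ceil(size / stride)

# Spatial sizes through the layers of Table 1 (input size and channel
# counts are assumptions read off Fig. 8):
h, w = 480, 240
h, w = same_out(h, 1), same_out(w, 1)   # conv 1, stride 1 -> 480 x 240
h, w = same_out(h, 8), same_out(w, 8)   # pool 1, stride 8 -> 60 x 30
h, w = same_out(h, 1), same_out(w, 1)   # conv 2, stride 1 -> 60 x 30
h, w = same_out(h, 4), same_out(w, 4)   # pool 2, stride 4 -> 15 x 8
flattened = h * w * 16                  # 15 * 8 * 16 = 1920
```

The trace reproduces the 1920-element flattened vector that feeds the fully connected network in Fig. 8.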
6. Results and discussion
Training performance graphs for both optimizers are summarized in
Fig. 10 (both for 700 epochs).
The confusion matrix is a way of describing the performance of the classifier output. In our case, it is laid out as in Eq. (2):

Confusion Matrix = [ True Cracked      False Cracked
                     False Uncracked   True Uncracked ]        (2)
Performance metrics for a growing number of epochs are summarized in Table 2 for both optimizers. The table gives detailed metrics in terms of the confusion matrix, accuracy, sensitivity as TPR (true positive rate, i.e. recall), precision (PPV: positive predictive value), specificity (SPC), negative predictive value (NPV), false positive rate (FPR), false discovery rate (FDR), miss rate or FNR (false negative rate), FOR (false omission rate) and F1 score [45].
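All of these metrics derive directly from the four confusion-matrix entries. The helper below, an illustrative sketch rather than the paper's code, reproduces the first row of Table 2 from its confusion matrix:

```python
def metrics(tp, fp, fn, tn):
    """Derive the Table 2 metrics from a confusion matrix laid out as in
    Eq. (2): [[TP (cracked), FP], [FN, TN (uncracked)]]."""
    return {
        "accuracy": 100 * (tp + tn) / (tp + fp + fn + tn),
        "precision": tp / (tp + fp),        # PPV
        "recall": tp / (tp + fn),           # TPR / sensitivity
        "f1": 2 * tp / (2 * tp + fp + fn),
        "specificity": tn / (tn + fp),      # SPC
        "npv": tn / (tn + fn),
        "fpr": fp / (fp + tn),
        "fdr": fp / (fp + tp),
        "fnr": fn / (fn + tp),              # miss rate
        "for": fn / (fn + tn),              # false omission rate
    }

# Adam optimizer at 700 epochs (first row of Table 2):
row = metrics(tp=1804, fp=271, fn=401, tn=4032)
```

Plugging in TP = 1804, FP = 271, FN = 401, TN = 4032 gives accuracy 89.674% and the other first-row values of Table 2 up to rounding.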
Precision, recall and F1 values during the training process are illustrated in Figs. 11 to 13. As can be seen, the precision of the Adam optimizer stays smoothly and markedly above that of the Adagrad optimizer throughout training; however, Adagrad seems able to reach the same precision value and intersect with Adam's precision if given more training epochs. Although Adagrad shows weak precision, in terms of recall it performs more satisfyingly, keeping its curve
[Fig. 10 plots the loss against epochs (0–800) for the Adagrad and Adam optimizers.]
Fig. 10. Cost function values.
Table 2
Quantitative performance metrics for both optimizers.

Optimizer          Epochs  Confusion matrix       Accuracy%  PPV (precision)  TPR (Recall/sensitivity)  F1 score  SPC    NPV    FPR    FDR    FNR    FOR
Adam optimizer     700     [1804 271; 401 4032]   89.674     0.869            0.818                     0.842     0.937  0.909  0.062  0.130  0.181  0.090
                   350     [1777 298; 423 4010]   88.921     0.856            0.807                     0.831     0.930  0.904  0.069  0.143  0.192  0.095
                   300     [1774 301; 417 4016]   88.967     0.854            0.809                     0.831     0.930  0.905  0.069  0.145  0.190  0.094
                   280     [1777 298; 419 4014]   88.982     0.856            0.809                     0.832     0.930  0.905  0.069  0.143  0.190  0.094
Adagrad optimizer  700     [1322 753; 23 4410]    88.076     0.637            0.982                     0.773     0.854  0.994  0.145  0.362  0.017  0.005
                   350     [712 1363; 9 4424]     78.918     0.343            0.987                     0.509     0.764  0.997  0.235  0.656  0.012  0.002
                   300     [571 1504; 8 4425]     76.767     0.275            0.986                     0.430     0.746  0.998  0.253  0.724  0.013  0.001
                   280     [473 1602; 8 4425]     75.261     0.227            0.983                     0.370     0.734  0.998  0.265  0.772  0.016  0.001
Fig. 11. Precision values for both optimizers.
Fig. 12. Recall values for both optimizers.
higher throughout the epochs. Both optimizers reach approximately the same accuracy level, around 90%; however, the Adam optimizer is slightly more accurate (89.674% for Adam vs. 88.076% for Adagrad). Overall, the Adam optimizer, with the lowest false negative rate (FNR) and the highest accuracy, suggests a well-trained classifier able to distinguish cracked from non-cracked surfaces.
For further illustration, the probability of correct classification for sample test images is given for both optimizers in Figs. 14 and 15 (cracked samples) and Figs. 16 and 17 (non-cracked samples), in which TP and TN stand for True Positive and True Negative, respectively.
In the corresponding figures for the cracked and non-cracked cases (Figs. 14-15 and Figs. 16-17), the same sample test images are used for both optimizers so that they can be compared more easily [32]. Besides the samples from real video snapshots (captured by the robot), some samples from the web are also tested for the sake of diversity, to challenge the classifiers with generalized samples. To clarify, the photos with gray stripes come from the real robot navigating on a half-glass wall at the Singapore University of Technology and Design. In every test image, the Adam optimizer makes the correct decision with higher probability (at the level of 90%-100%). However, the Adagrad optimizer also never fails to decide correctly, always reporting probabilities safely above 50% for the true positive decision. Moreover, it is worth noting that in this glass-façade-cleaning application, not missing a crack matters more than the confidence of each correct decision, because even one small mistake may cause a serious incident, and our proposed method effectively guarantees this: although some rather low probabilities are also reported, all results keep a sufficiently safe distance from the 50% decision threshold. This supports the safety of replacing the human inspector with this process.
In addition, in order to prove the strength of the applied CNN, we have also trained a simple neural network (NN). This NN uses the Adam optimizer and, consisting of six layers, is very similar to the applied CNN in architecture: one input layer, one output layer, and four hidden layers of sizes 32, 16, 8 and 4. However, an NN needs features to be fed into its input layer. Here, feature extraction is done using OpenCV, which converts each image into a multi-dimensional matrix; this matrix is then flattened and provided to the NN's input layer. Figs. 18 and 19 illustrate the NN's performance on the same test samples for the cracked and non-cracked cases. The NN clearly performs very poorly because it lacks convolutional layers.
Convolutional neural networks benefit from several advantages, as described earlier. Compared to NNs they need less time and memory; in practice, however, they are still computationally expensive. This drawback can be mitigated with better computing hardware such as GPUs and neuromorphic chips. Furthermore, some more recent types of
Fig. 13. F1 values for both optimizers.
[Fig. 14 shows twenty cracked test images, each correctly classified with probabilities between 61.5% and 100%.]
Fig. 14. Cracked samples (True Positive) for CNN with Adam Optimizer.
[Fig. 15 shows the same twenty cracked test images, classified with probabilities between 54.2% and 97.5%.]
Fig. 15. Cracked samples (True Positive) for CNN with Adagrad Optimizer.
[Fig. 16 shows twenty non-cracked test images, each correctly classified with probabilities between 90.7% and 100%.]
Fig. 16. Non-Cracked samples (True Negative) for CNN with Adam Optimizer.
CNNs can provide faster responses than a basic CNN that uses a sliding window. The other drawback is that they need a lot of training data; this aspect is usually well treated by applying data augmentation methods, as described in this paper.
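The memory argument can be made concrete with a rough weight count. The sketch below compares the comparison NN's dense layers (hidden sizes 32, 16, 8, 4 on the flattened input) against the CNN's kernels; the 480 × 240 input and the 8- and 16-map channel counts are assumptions read off Fig. 8, and biases are ignored:

```python
# Rough weight counts (biases ignored) illustrating why the dense NN
# is far more memory-hungry than the CNN. Input size and conv channel
# counts are assumptions taken from Fig. 8; the dense widths 32-16-8-4
# are the hidden layers quoted for the comparison NN.
flat_input = 480 * 240                                  # 115200 pixels
dense_weights = (flat_input * 32 + 32 * 16 + 16 * 8
                 + 8 * 4 + 4 * 2)                       # ~3.7 million
conv_weights = (4 * 4 * 1 * 8      # conv 1: 4x4 kernels, 8 maps
                + 2 * 2 * 8 * 16   # conv 2: 2x2 kernels, 16 maps
                + 1920 * 2)        # fully connected + SoftMax outputs
```

Even this rough count puts the dense NN at several hundred times more weights than the CNN's convolutional and output layers, which is why CNNs need less time and memory: weight sharing keeps each kernel tiny regardless of image size.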
7. Conclusion
In this paper, a deep learning approach has been put forward for
equipping our modular façade-cleaning robot with crack-detection
[Fig. 17 shows the same twenty non-cracked test images, classified with probabilities between 62.5% and 99.2%.]
Fig. 17. Non-Cracked samples (True Negative) for CNN with Adagrad Optimizer.
[Fig. 18 shows the twenty cracked test images classified by the NN with probabilities barely above chance, between 51.72% and 57.11%.]
Fig. 18. Cracked samples (True Positive) for NN with Adam Optimizer.
package. For façade-cleaning robots, the necessity of such an additional capability becomes more tangible when the dangers of navigating on cracked glass are taken into account. To this end, the modular Mantis robot is equipped with an on-board camera for experimental purposes, and the live video is loaded using the OpenCV package. In addition, a deep convolutional neural network developed in TensorFlow™ is proposed and trained with a sufficient data set including photos taken from video snapshots (fed directly from the robot), several other photos taken at the SUTD half-glass campus, and photos from the web with different conditions of illumination and resolution (taken with different devices). During training, two different optimizers are utilized, both of which achieve a very high accuracy of around 90%, trustworthy enough for this system to replace the human operator in real-time on-site inspections. Each of the two optimizers has its own advantages: Adam benefits from higher precision, while the Adagrad optimizer yields a higher recall factor. However, the Adam optimizer, with the lowest FNR and highest accuracy, suggests a more trustworthy classifier. Moreover, the overall performance of the proposed CNN is compared to that of a traditional NN-based method, which further proves the strength of the CNN.
The overarching aim of this research is to make the robot avoid cracked regions. Our future work involves enhancing the algorithm so that the robot can take a much more informed decision. In the next version of the crack detection algorithm, we will focus on solving the localization problem: detecting the exact position of the crack relative to the robot and navigating the robot so that it avoids the cracked region while maintaining the highest possible coverage for efficient cleaning.
To further empower the crack detection package, we will also focus on implementing a more powerful deep learning technique that can also help with object tracking, such as Faster R-CNN, and on enhancing the dataset with more images (especially covering different reflections and illuminations). Furthermore, another extension of the current work would be to run all processing on board rather than utilizing a master-slave ROS system.
Acknowledgments
This work is financially supported by the National Robotics R&D Program Office, Singapore, under Grant no. RGAST1702, at the Singapore University of Technology and Design (SUTD); this support is gratefully acknowledged.
Appendix A. Supplementary data
Supplementary data to this article can be found online at https://doi.org/10.1016/j.autcon.2019.01.025.
References
[1] C.W. Bostick, Architectural trend thru the looking glass, Proceedings of the Glass
Performance Days, 2009, pp. 860866. Tampere, Finland https://de.glassglobal.
com/gpd/downloads/ChangingMarkets-Bostick.pdf.
[2] F. Pariafsai, A review of design considerations in glass buildings, Frontiers of
Architectural Research 5 (2) (2016) 171193, https://doi.org/10.1016/j.foar.2016.
01.006.
[3] A. Kochan, Robot cleans glass roof of louvre pyramid, Industrial Robot: An
International Journal 32 (5) (2005) 380382, https://doi.org/10.1108/
01439910510614637.
[4] N. Elkmann, T. Felsch, M. Sack, J. Saenz, J. Hortig, Innovative service robot systems
for facade cleaning of difficult-to-access areas, IEEE/RSJ International Conference
on Intelligent Robots and Systems, vol. 1, 2002, pp. 756762, , https://doi.org/10.
1109/IRDS.2002.1041481.
[5] N. Elkmann, et al., SIRIUSc facade cleaning robot for a high-rise building in Munich, Germany, Climbing and Walking Robots, Springer Berlin Heidelberg, 2005, pp. 1033–1040, https://doi.org/10.1007/3-540-29461-9_101.
[6] Y.-S. Lee, et al., The study on the integrated control system for curtain wall building
façade cleaning robot, Autom. Constr. 94 (2018) 3946, https://doi.org/10.1016/j.
autcon.2017.12.030.
[7] S. Nansai, M. Rajesh Elara, A survey of wall climbing robots: recent advances and
challenges, Robotics 5 (3) (2016) 14, https://doi.org/10.3390/robotics5030014.
[8] D. Schmidt, K. Berns, Climbing robots for maintenance and inspections of vertical
structuresa survey of design aspects and technologies, Robot. Auton. Syst. 61 (12)
(2013) 12881305, https://doi.org/10.1016/j.robot.2013.09.002.
[9] D. Longo, G. Muscato, The Alicia³ climbing robot: a three-module robot for
automatic wall inspection, IEEE Robotics & Automation Magazine 13 (1) (2006)
4250, https://doi.org/10.1109/MRA.2006.1598052.
[10] Z. Xu, P. Ma, A wall-climbing robot for labelling scale of oil tank's volume, Robotica
[Fig. 19 reports per-image probabilities between 15.00% and 50.01% for the twenty non-cracked test samples classified by the NN.]
Fig. 19. Non-Cracked samples (True Negative) for NN with Adam Optimizer.
20 (2) (2002) 209212, https://doi.org/10.1017/S0263574701003964.
[11] B.L. Luk, K.P. Liu, A.A. Collie, S. Chen, Tele-operated climbing and mobile service
robots for remote inspection and maintenance in nuclear industry, Industrial Robot:
An International Journal 33 (3) (2006) 194204, https://doi.org/10.1108/
01439910610659105.
[12] B.L. Luk, L.K.P. Liu, A.A. Collie, Climbing service robots for improving safety in
building maintenance industry, in: M.K. Habib (Ed.), Bioinspiration and Robotics,
Ch. 9, IntechOpen, Rijeka, 2007, https://doi.org/10.5772/5498.
[13] A. Sintov, T. Avramovich, A. Shapiro, Design and motion planning of an autono-
mous climbing robot with claws, Robot. Auton. Syst. 59 (11) (2011) 10081019,
https://doi.org/10.1016/j.robot.2011.06.003.
[14] M. Henrey, A. Ahmed, P. Boscariol, L. Shannon, C. Menon, Abigaille-III: a versatile,
bioinspired hexapod for scaling smooth vertical surfaces, Journal of Bionic
Engineering 11 (1) (2014) 117, https://doi.org/10.1016/S1672-6529(14)
60015-9.
[15] T. Yanagida, R. Elara Mohan, T. Pathmakumar, K. Elangovan, M. Iwase, Design and
implementation of a shape shifting rolling-crawling-wall-climbing robot, Appl. Sci.
7 (4) (2017), https://doi.org/10.3390/app7040342.
[16] J. Zhu, D. Sun, S.-K. Tso, Development of a tracked climbing robot, J. Intell. Robot.
Syst. 35 (4) (2002) 427443, https://doi.org/10.1023/A:1022383216233.
[17] T. Kim, K. Seo, J. Kim, H.S. Kim, Adaptive impedance control of a cleaning unit for a
novel wall-climbing mobile robotic platform (ROPE RIDE), 2014 IEEE/ASME
International Conference on Advanced Intelligent Mechatronics, 2014, pp.
994999, , https://doi.org/10.1109/AIM.2014.6878210.
[18] H. Zhang, J. Zhang, W.L. Wang, Rong, G. Zong, A series of pneumatic glass-wall
cleaning robots for high-rise buildings, Industrial Robot: An International Journal
34 (2) (2007) 150160, https://doi.org/10.1108/01439910710727504.
[19] N. Mir-Nasiri, H.S. J, M.H. Ali, Portable autonomous window cleaning robot,
Procedia Computer Science 133 (2018) 197204, https://doi.org/10.1016/j.procs.
2018.07.024.
[20] J. Liu, K. Tanaka, L.M. Bao, I. Yamaura, Analytical modelling of suction cups used
for window-cleaning robots, Vacuum 80 (6) (2006) 593598, https://doi.org/10.
1016/j.vacuum.2005.10.002.
[21] S. Nansai, K. Onodera, P. Veerajagadheswar, R. Elara Mohan, M. Iwase, Design and
experiment of a novel façade cleaning robot with a biped mechanism, Appl. Sci. 8
(2018) 2398, https://doi.org/10.3390/app8122398.
[22] T.T. Tun, M.R. Elara, M. Kalimuthu, A. Vengadesh, Glass facade cleaning robot with
passive suction cups and self-locking trapezoidal lead screw drive, Autom. Constr.
96 (2018) 180188, https://doi.org/10.1016/j.autcon.2018.09.006.
[23] C. Menon, M. Murphy, M. Sitti, Gecko inspired surface climbing robots, 2004 IEEE
International Conference on Robotics and Biomimetics, 2004, pp. 431436, ,
https://doi.org/10.1109/ROBIO.2004.1521817.
[24] S. Nansai, M. Elara, T. Tun, P. Veerajagadheswar, T. Pathmakumar, A novel nested
reconfigurable approach for a glass façade cleaning robot, Inventions 2 (3) (2017)
18, https://doi.org/10.3390/inventions2030018.
[25] M. Vega-Heredia, et al., Design and modelling of a modular window cleaning
robot, Autom. Constr. 103 (2019) 268278, https://doi.org/10.1016/j.autcon.
2019.01.025.
[26] X. Peng, S. Katsunori, Eddy current sensor with a novel probe for crack position
detection, 2008 IEEE International Conference on Industrial Technology, 2008, pp.
16, , https://doi.org/10.1109/ICIT.2008.4608445.
[27] D.J. Sadler, C.H. Ahn, On-chip eddy current sensor for proximity sensing and crack
detection, Sensors Actuators A Phys. 91 (3) (2001) 340345, https://doi.org/10.
1016/S0924-4247(01)00605-7.
[28] P. Liu, J. Jang, S. Yang, H. Sohn, Fatigue crack detection using dual laser induced
nonlinear ultrasonic modulation, Opt. Lasers Eng. 110 (2018) 420430, https://doi.
org/10.1016/j.optlaseng.2018.05.025.
[29] M. Culjat, R. Singh, E. Brown, R. Neurgaonkar, D. Yoon, S. White, Ultrasound crack
detection in a simulated human tooth, Dentomaxillofacial Radiology 34 (2) (2005)
8085, https://doi.org/10.1259/dmfr/12901010.
[30] A. Mohan, S. Poobal, Crack detection using image processing: a critical review and
analysis, Alexandria Engineering Journal 57 (2) (2018) 787798, https://doi.org/
10.1016/j.aej.2017.01.020.
[31] Y.-J. Cha, W. Choi, O. Büyüköztürk, Deep learning-based crack damage detection
using convolutional neural networks, Computer-Aided Civil and Infrastructure
Engineering 32 (5) (2017) 361378, https://doi.org/10.1111/mice.12263.
[32] L. Zhang, F. Yang, Y.D. Zhang, Y.J. Zhu, Road crack detection using deep con-
volutional neural network, 2016 IEEE International Conference on Image
Processing (ICIP), 2016, pp. 37083712, , https://doi.org/10.1109/ICIP.2016.
7533052.
[33] A. Zhang, et al., Automated pixel-level pavement crack detection on 3D asphalt
surfaces using a deep-learning network, Computer-Aided Civil and Infrastructure
Engineering 32 (10) (2017) 805819, https://doi.org/10.1111/mice.12297.
[34] K. Gopalakrishnan, S.K. Khaitan, A. Choudhary, A. Agrawal, Deep convolutional
neural networks with transfer learning for computer vision-based data-driven pa-
vement distress detection, Constr. Build. Mater. 157 (2017) 322330, https://doi.
org/10.1016/j.conbuildmat.2017.09.110.
[35] F. Chen, M.R. Jahanshahi, NB-CNN: deep learning-based crack detection using
convolutional neural network and Naïve Bayes data fusion, IEEE Trans. Ind.
Electron. 65 (5) (2018) 43924400, https://doi.org/10.1109/TIE.2017.2764844.
[36] M.E. Ibrahim, R.A. Smith, C.H. Wang, Ultrasonic detection and sizing of compressed
cracks in glass- and carbon-fibre reinforced plastic composites, NDT & E
International 92 (2017) 111121, https://doi.org/10.1016/j.ndteint.2017.08.004.
[37] B. Akdemir, Ş. Öztürk, Glass surface defects detection with wavelet transforms,
International Journal of Materials, Mechanics and Manufacturing 3 (3) (2015),
https://doi.org/10.7763/IJMMM.2015.V3.189.
[38] Z. Yiyang, The design of glass crack detection system based on image preprocessing
technology, 2014 IEEE 7th Joint International Information Technology and
Artificial Intelligence Conference, 2014, pp. 39–42, https://doi.org/10.1109/
ITAIC.2014.7065001.
[39] M. Hui-Min, S. Guang-Da, W. Jun-Yan, N. Zheng, A glass bottle defect detection
system without touching, Proceedings. International Conference on Machine
Learning and Cybernetics, vol. 2, 2002, pp. 628632, , https://doi.org/10.1109/
ICMLC.2002.1174411.
[40] L. Joseph, ROS Robotics Projects, Packt Publishing Ltd., 2017 ISBN 10:
1783554711. (ISBN 13: 9781783554713).
[41] T. Guo, J. Dong, H. Li, Y. Gao, Simple convolutional neural network on image
classification, 2017 IEEE 2nd International Conference on Big Data Analysis
(ICBDA), 2017, pp. 721724, , https://doi.org/10.1109/ICBDA.2017.8078730.
[42] S. Albawi, T.A. Mohammed, S. Al-Zawi, Understanding of a convolutional neural
network, 2017 International Conference on Engineering and Technology (ICET),
2017, pp. 16, , https://doi.org/10.1109/ICEngTechnol.2017.8308186.
[43] Diederik P. Kingma, J. Ba, Adam: A method for stochastic optimization, 3rd
International Conference for Learning Representations, San Diego, 2015 https://
arxiv.org/abs/1412.6980.
[44] J. Duchi, E. Hazan, Y. Singer, Adaptive subgradient methods for online learning and
stochastic optimization, J. Mach. Learn. Res. 12 (2011) 21212159 http://www.
jmlr.org/papers/volume12/duchi11a/duchi11a.pdf.
[45] E. Protopapadakis, N. Doulamis, Image based approaches for tunnels' defects recognition via robotic inspectors, International Symposium on Visual Computing,
ISVC 2015: Advances in Visual Computing, vol. 9474, Springer, Cham, 2015, pp.
706716, , https://doi.org/10.1007/978-3-319-27857-5_63.
M. Kouzehgar, et al. Automation in Construction 108 (2019) 102959
12
... Automatic glass crack detection for façade-cleaning robot [118] AI Integration of service robots in the smart home by means of UPnP: A surveillance robot case study 2013 Implementing a basic garbage detection routine using built-in camera that allows the smart home system to instruct a service robot to clean whenever garbage is detected. ...
Article
Full-text available
The construction industry faces many challenges, including schedule and cost overruns, productivity constraints, and workforce shortages. Compared to other sectors, it lags in digitalization in every project phase. Artificial Intelligence (AI) and Machine Learning (ML) have emerged as transformative technologies revolutionizing the construction sector. However, a discernible gap persists in systematically categorizing the applications of these technologies throughout the various phases of the construction project life cycle. In response to this gap, this research aims to present a thorough assessment of the deployment of AI and ML across diverse phases in construction projects, with the ultimate goal of furnishing valuable insights for the effective integration of these intelligent systems within the construction sector. A thorough literature review was performed to identify AI and ML applications in the building sector. After scrutinizing the literature, the applications of AI and ML were presented based on a construction project life cycle. A critical review of existing literature on AI and ML applications in the building industry showed that AI and ML applications are more frequent in the planning and construction stages. Moreover, the opportunities for AI and ML applications in other stages were discussed based on the life cycle categorization and presented in this study. The practical contribution of the study lies in providing valuable insights for the effective integration of intelligent systems within the construction sector. Academically, the research contributes by conducting a thorough literature review, categorizing AI and ML applications based on the construction project life cycle, and identifying opportunities for their deployment in different stages.
Article
Wheeled mobile robots (WMRs) with variable wheelbases are capable of traveling on deformable terrains and handling complex detection tasks. While the variable wheelbase length of WMR allows it to interact with the terrains adaptively, enhancing its mobility, it brings a control challenge. Inspired by the worm's movement of stretching body at different lengths under different environmental resistance, a creeping gait (CG) strategy is proposed in this work to enable the WMR to be controlled in dual modes: wheeled following mode (WFM) and specified length mode (SLM). WFM adjusts the wheelbase's length by the wheels' movements freely to minimize the internal force and torque between wheels. SLM adjusts the wheelbase's length using a proposed fuzzy logic based algorithm to stabilize the body's posture on rough terrain and overcome specific motion challenges, like escaping wheel sinking. A state-adaptive mode-switching controller is then developed using the dwell time approach to smooth the output velocities during the switching phase, and a Lyapunov analysis is performed to verify its stability. According to the results of physical experiments, three-wheeled mobile robot movements with CG enable more precise path following by 37% and faster response by 11% compared to fixed wheelbase movements, and the dwell time approach achieves smoother speed transitions between the modes than the direct switching method, especially when moving from flat to slope terrain.
Article
Purpose Automation of detecting cracked surfaces on buildings or in any industrially manufactured products is emerging nowadays. Detection of the cracked surface is a challenging task for inspectors. Image-based automatic inspection of cracks can be very effective when compared to human eye inspection. With the advancement in deep learning techniques, by utilizing these methods the authors can create automation of work in a particular sector of various industries. Design/methodology/approach In this study, an upgraded convolutional neural network-based crack detection method has been proposed. The dataset consists of 3,886 images which include cracked and non-cracked images. Further, these data have been split into training and validation data. To inspect the cracks more accurately, data augmentation was performed on the dataset, and regularization techniques have been utilized to reduce the overfitting problems. In this work, VGG19, Xception and Inception V3, along with Resnet50 V2 CNN architectures to train the data. Findings A comparison between the trained models has been performed and from the obtained results, Xception performs better than other algorithms with 99.54% test accuracy. The results show detecting cracked regions and firm non-cracked regions is very efficient by the Xception algorithm. Originality/value The proposed method can be way better back to an automatic inspection of cracks in buildings with different design patterns such as decorated historical monuments.
Article
Full-text available
Façade cleaning in high-rise buildings has always been considered a hazardous task when carried out by labor forces. Even though numerous studies have focused on the development of glass façade cleaning systems, the available technologies in this domain are limited and their performances are broadly affected by the frames that connect the glass panels. These frames generally act as a barrier for the glass façade cleaning robots to cross over from one glass panel to another, which leads to a performance degradation in terms of area coverage. We present a new class of façade cleaning robot with a biped mechanism that is able overcome these obstacles to maximize its area coverage. The developed robot uses active suction cups to adhere to glass walls and adopts mechanical linkage to navigate the glass surface to perform cleaning. This research addresses the design challenges in realizing the developed robot. Its control system consists of inverse kinematics, a fifth polynomial interpolation, and sequential control. Experiments were conducted in a real scenario, and the results indicate that the developed robot achieves significantly higher coverage performance by overcoming both negative and positive obstacles in a glass panel.
Article
Full-text available
The idea of having a compact and autonomous office or house window cleaning robot is quite simple and very attractive. This small window climbing robot with pneumatic suction cups should be able to move autonomously along an outside surface of high-rise building office window with a relatively large area and meantime clean and wash it. Being manually attached to the outside surface of the room window the robot will execute and accomplish the task of window cleaning automatically in a predefined pattern. The sensory system will help to navigate the robot. It is noted that window cleaning robots are commercially available but pricey (in the range of USD 5000 or more). The designed robot is lightweight, small size and cheap because it is driven only by one rotary actuator and system of properly arranged conventional belts and pulleys. It uses the suction cups to stick to the window pane and set of optical sensors to detect the window frame. The microcontroller is programmed to move the robot in a specific pattern depending on the sensory data. There are no similar reasonably priced rival products available in the market yet.
Conference Paper
Full-text available
The term Deep Learning or Deep Neural Network refers to Artificial Neural Networks (ANN) with multi layers . Over the last few decades, it has been considered to be one of the most powerful tools, and has become very popular in the literature as it is able to handle a huge amount of data. The interest in having deeper hidden layers has recently begun to surpass classical methods performance in different fields; especially in pattern recognition. One of the most popular deep neural networks is the Convolutional Neural Network (CNN). It take this name from mathematical linear operation between matrixes called convolution. CNN have multiple layers; including convolutional layer, non-linearity layer, pooling layer and fully-connected layer. The convolutional and fully- connected layers have parameters but pooling and non-linearity layers don't have parameters. The CNN has an excellent performance in machine learning problems. Specially the applications that deal with image data, such as largest image classification data set (Image Net), computer vision, and in natural language processing (NLP) and the results achieved were very amazing . In this paper we will explain and define all the elements and important issues related to CNN, and how these elements work. In addition, we will also state the parameters that effect CNN efficiency. This paper assumes that the readers have adequate knowledge about both machine learning and artificial neural network.
Article
The design of a modular window façades cleaning robot is challenging given the conditions under which these robots are required to operate. In this work, we attempt to extend the locomotion capabilities of these robots beyond what is currently feasible. The modular design of three equal interconnected sections of our robot, called Mantis, allows increasing the range concerning the work of cleaning window façades. Mantis has the ability to make transition from one window panel to another by crossing over the metallic panel. We implemented the inductive sensors to detect the metallic frame for autonomous crossover. The mechanical design and system architecture are introduced in detail, followed by a detailed description of the locomotion control and the sensor system for the classification of the metallic frame. The experimental results are presented to validate Mantis' abilities.
Article
We report on the mechanism, design iteration, and performance of a new glass facade cleaning robot, vSlider. The passive suction cups, driven by self-locking lead screws, are used to engage the vSlider robot to the glass facade. This mechanism has higher efficiency, compared to active suction cups, and offers better power consumption and safety in the case of power disruption or power loss. Due to the self-locking leadscrews, the counter-moment in a static position is not transferred to the motor, and thus, the servos which drive the lead screws only consume the power needed for a typical free load. A DC motor with encoder generates the primary locomotion in vSlider which was tested both in position- and velocity-control modes. This paper also details the design iteration efforts and discusses the key findings from the experiments involving the first prototype, vSlider 1.x, and the application of these findings in the development of the second prototype, vSlider 2.x. Experiments were performed to validate the proposed design approach and to benchmark the performance of the two robot prototypes that were developed.
Article
In this study, a nonlinear ultrasonic modulation technique based on dual laser excitation is proposed for fatigue-crack detection. Two pulsed lasers are fired at the target specimen to generate ultrasonic waves. The corresponding ultrasonic responses are measured by a laser Doppler vibrometer (LDV) and analyzed to extract the crack-induced nonlinear ultrasonic modulation. First, the effect of the pulsed-laser beam size on the frequency content of the generated ultrasonic waves is investigated numerically and experimentally. This beam-size effect is then exploited to generate wideband (WB) and narrowband (NB) ultrasonic waves by adjusting the beam sizes of the two excitation lasers. Nonlinear ultrasonic modulation results from the interaction of the WB and NB waves when a fatigue crack exists in the target specimen. The fatigue crack is then detected by comparing the spectral responses obtained under a single WB input and under combined WB and NB inputs. Finally, a fully noncontact dual-laser ultrasonic system is developed and used to detect micro fatigue cracks in aluminum and steel plate specimens.
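The detection principle can be illustrated numerically: a crack acts as a quadratic nonlinearity that mixes the two input tones, producing sidebands at the sum and difference frequencies, which are absent in an intact specimen. The sketch below uses two single-frequency stand-ins for the WB and NB inputs and an assumed nonlinearity coefficient; the frequencies and signal model are illustrative only.

```python
import math

def tone_amplitude(signal, fs, f):
    """Single-bin DFT: amplitude of the sinusoidal component at f Hz."""
    n = len(signal)
    re = sum(s * math.cos(2 * math.pi * f * k / fs) for k, s in enumerate(signal))
    im = sum(s * math.sin(2 * math.pi * f * k / fs) for k, s in enumerate(signal))
    return 2 * math.hypot(re, im) / n

fs = 10000
f_nb, f_wb = 1000.0, 150.0            # illustrative excitation frequencies
t = [k / fs for k in range(fs)]       # 1 s record

def response(cracked):
    beta = 0.2 if cracked else 0.0    # assumed crack-induced nonlinearity
    return [math.sin(2 * math.pi * f_nb * x) + math.sin(2 * math.pi * f_wb * x)
            + beta * math.sin(2 * math.pi * f_nb * x) * math.sin(2 * math.pi * f_wb * x)
            for x in t]

# Sideband at f_nb + f_wb appears only when the crack mixes the tones
sb = f_nb + f_wb
assert tone_amplitude(response(True), fs, sb) > 10 * tone_amplitude(response(False), fs, sb)
```

Comparing the spectral response with and without the NB input at these sideband bins is exactly the decision rule the abstract describes.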
Article
Recently, with a growing number of high-rise buildings in cities, interest in building-façade maintenance is increasing. Cleaning the exterior walls of existing high-rise buildings has traditionally relied on workers using ropes, gondolas, and winch systems. More recently, building maintenance units (BMUs) have been developed and applied to resolve safety problems and boost work efficiency. In Germany, the USA, France, and other countries, various robot systems for building-façade maintenance are being deployed; in South Korea, façade-cleaning robots that attach to curtain walls are also being developed. In this paper, we propose an integrated control system for the stable control of robots equipped with building-façade cleaning technology. The proposed control system is divided into three stages: a preparation stage, a cleaning stage, and a return stage. Each independent robot system performs tasks such as cleaning, moving, and obstacle detection according to the current stage. A wireless communication system was proposed and applied for stable communication between robots and for controlling the robot system. The proposed integrated control system was applied to building-façade cleaning robots, and its efficiency was verified against existing high-rise building cleaning methods.
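The three-stage cycle described above maps naturally onto a small state machine in which each robot advances preparation → cleaning → return as its current tasks complete. The class and stage names below are a hypothetical sketch of that structure, not code from the paper's control system.

```python
# Stages of the integrated control cycle, in execution order
STAGES = ["preparation", "cleaning", "return"]

class FacadeCleaningController:
    """Minimal stage machine: each robot advances through the three
    stages and reports 'done' once the return stage completes."""

    def __init__(self):
        self.stage_index = 0

    @property
    def stage(self):
        if self.stage_index < len(STAGES):
            return STAGES[self.stage_index]
        return "done"

    def complete_stage(self):
        # Called when the tasks of the current stage (cleaning, moving,
        # obstacle detection, ...) have all finished for this robot.
        if self.stage_index < len(STAGES):
            self.stage_index += 1
        return self.stage

ctrl = FacadeCleaningController()
assert ctrl.stage == "preparation"
assert ctrl.complete_stage() == "cleaning"
assert ctrl.complete_stage() == "return"
assert ctrl.complete_stage() == "done"
```

In the paper's system the stage transitions would be driven over the wireless link by the integrated controller rather than by local calls.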
Article
Regular inspection of nuclear power plant components is important to guarantee safe operation. However, current practice is time-consuming, tedious, and subjective, as it involves human technicians reviewing inspection videos and identifying cracks on reactors. The few vision-based crack-detection approaches developed for metallic surfaces typically perform poorly when used to analyze nuclear inspection videos; detecting these cracks is challenging because they are tiny and the components' surfaces exhibit noisy patterns. This study proposes a deep-learning framework, called NB-CNN, that analyzes individual video frames for crack detection, together with a novel data-fusion scheme that aggregates the information extracted from each frame to enhance the overall performance and robustness of the system. A Convolutional Neural Network (CNN) detects crack patches in each video frame, the data-fusion scheme maintains the spatiotemporal coherence of cracks across the video, and Naïve Bayes decision making discards false positives effectively. The proposed framework achieves a 98.3% hit rate at 0.1 false positives per frame, significantly higher than the state-of-the-art approaches presented in this paper.
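The value of the Naïve Bayes fusion step can be shown with a toy aggregation: per-frame CNN scores for one spatiotemporally tracked crack candidate are combined in log-odds under a conditional-independence assumption, so a consistent detection is reinforced while a one-frame glint is suppressed. This is an illustrative reconstruction of the idea, not the paper's exact formulation.

```python
import math

def naive_bayes_fusion(frame_probs, prior=0.5):
    """Fuse per-frame crack probabilities for one tracked candidate
    under a naive Bayes (conditional independence) assumption.
    Returns the posterior probability that the candidate is a real crack."""
    log_odds = math.log(prior / (1 - prior))
    for p in frame_probs:
        p = min(max(p, 1e-6), 1 - 1e-6)   # clamp to avoid log(0)
        log_odds += math.log(p / (1 - p))
    return 1 / (1 + math.exp(-log_odds))

# A real crack: consistently high CNN scores across the tracked frames
assert naive_bayes_fusion([0.8, 0.9, 0.85, 0.7]) > 0.99
# A transient false positive: one spurious high score among low ones
assert naive_bayes_fusion([0.9, 0.1, 0.05, 0.1]) < 0.05
```

Thresholding this posterior, rather than any single frame's score, is what lets the framework discard false positives while keeping the hit rate high.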
Article
Automated pavement distress detection and classification has remained one of the high-priority research areas for transportation agencies. In this paper, we employed a Deep Convolutional Neural Network (DCNN) trained on the ‘big data’ ImageNet database, which contains millions of images, and transferred that deep learning to automatically detect cracks in Hot-Mix Asphalt (HMA)- and Portland Cement Concrete (PCC)-surfaced pavement images that also include a variety of non-crack anomalies and defects. Apart from the common sources of false positives encountered in vision-based automated pavement crack detection, a significantly higher order of complexity was introduced in this study by training a classifier on combined HMA- and PCC-surfaced images, which have different surface characteristics. A single-layer neural network classifier (with the ‘adam’ optimizer) trained on ImageNet-pretrained VGG-16 DCNN features yielded the best performance.
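The transfer-learning recipe here is: freeze the ImageNet-pretrained VGG-16 backbone, use it only to extract fixed feature vectors, and train a single shallow classifier on top. The sketch below stands in for that final step with a plain logistic layer trained by gradient descent on toy 2-D "features" (a stand-in for the 4096-D VGG-16 features, and for the ‘adam’ optimizer); everything numeric is illustrative.

```python
import math

def train_single_layer(features, labels, lr=0.5, epochs=200):
    """Train a single-layer (logistic) classifier on fixed, precomputed
    DCNN features: the pretrained backbone stays frozen and only the
    weights of this final layer are learned."""
    w = [0.0] * len(features[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            p = 1 / (1 + math.exp(-z))
            g = p - y                       # gradient of the log-loss
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b

def predict(w, b, x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0

# Toy feature vectors: cracked patches cluster away from intact ones
crack  = [[1.0, 0.8], [0.9, 1.1], [1.2, 0.9]]
intact = [[0.0, 0.1], [0.1, 0.0], [0.2, 0.2]]
w, b = train_single_layer(crack + intact, [1, 1, 1, 0, 0, 0])
assert predict(w, b, [1.0, 1.0]) == 1
assert predict(w, b, [0.05, 0.1]) == 0
```

Because only this thin layer is trained, the approach needs far fewer labeled pavement images than training a DCNN from scratch, which is the motivation for transfer learning in this setting.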