Contents lists available at ScienceDirect
Automation in Construction
journal homepage: www.elsevier.com/locate/autcon
Self-reconfigurable façade-cleaning robot equipped with deep-learning-based crack detection based on convolutional neural networks

Maryam Kouzehgar a,⁎, Yokhesh Krishnasamy Tamilselvam b, Manuel Vega Heredia a,c, Mohan Rajesh Elara a

a Engineering Production and Development Pillar, Singapore University of Technology and Design, Singapore 487372, Singapore
b Electrical Engineering Department (Robotics), University of Western Ontario, London, Ontario N6A 3K7, Canada
c Engineering and Technology Department, Campus Los Mochis, Universidad de Occidente, Sinaloa 81223, Mexico
ARTICLE INFO

Keywords:
Self-reconfigurable robot
Façade-cleaning robot
Glass crack detection
Convolutional neural network (CNN)
Deep learning
TensorFlow
OpenCV
ROS
Adam optimizer
Adagrad optimizer
ABSTRACT

Despite advanced construction technologies that are unceasingly filling city skylines with glassy high-rise structures, maintenance of these shining tall monsters has remained a high-risk, labor-intensive process. Utilizing façade-cleaning robots therefore seems inevitable. However, when navigating on cracked glass, these robots may cause hazardous situations, so it is necessary to equip them with a crack-detection system that allows them to avoid cracked areas. In this study, benefiting from convolutional neural networks developed in TensorFlow, a deep-learning-based crack-detection approach is introduced for a novel modular façade-cleaning robot. For experimental purposes, the robot is equipped with an on-board camera, and the live video is loaded using OpenCV. The vision-based training process is fulfilled by applying two different optimizers on a sufficiently generalized data set; data augmentation techniques and image pre-processing are also applied as part of the process. Simulation and experimental results show that the system can detect cracks with an accuracy of around 90%, which is satisfying enough to replace human-conducted on-site inspections. In addition, a thorough comparison between the optimizers is put forward: the Adam optimizer shows higher precision, while Adagrad yields a more satisfying recall; however, Adam, with the lowest false negative rate and highest accuracy, performs better overall. Furthermore, the proposed CNN's performance is compared to a traditional NN, and the results show a remarkable difference in success level, proving the strength of the CNN.
1. Introduction
High-rise buildings with glass facades are increasingly common worldwide [1,2]. However, when it comes to their maintenance, cleaning normally follows a classical approach that is still labor-intensive and often of high risk to the manpower, especially in adverse weather conditions and strong winds. Recently, there has been a growing trend toward commercialized robots such as the Winbot series from Ecovacs (Winbot X, Winbot 950 and Winbot 850), the Hobot window-cleaning series (Hobot 168, 198 and 288), the Alfawise A168/S60, and Rumbot's window-cleaning robot. Furthermore, for high-rise structures, there are commercialized façade-cleaning robots such as RobuGlass, designed for cleaning the Louvre museum [3], and Glazenwasrobot from KITE Robotics, which works with protective cranes.
In order to improve the capabilities of these increasingly demanded service robots, robotic systems for cleaning vertical glass facades have also been a matter of interest in academic research. In this field, there are numerous research challenges associated with mechanism design and autonomous capabilities. The essential need to use robots for vertical façade cleaning was recognized long ago and has been addressed by introducing SIRIUSc, a modular robot basically designed to work on skyscrapers: this robot has been specifically utilized to clean the 25,000 m² vaulted glass hall of the Leipzig Trade Fair in Germany [4,5]. Going ahead on the same stream toward autonomy, the robotic system proposed in [6] performs tasks such as cleaning, moving, rail alignment control, and obstacle detection.
In terms of mechanical design, a glass façade-cleaning robot should first be a surface-climbing robot. A detailed review of wall-climbing robots is presented in [7], categorizing them into six distinguished classes based on the applied adhesive mechanism. Apart from façade cleaning,
https://doi.org/10.1016/j.autcon.2019.102959
Received 25 January 2019; Received in revised form 2 September 2019; Accepted 7 September 2019
Corresponding author.
E-mail addresses: maryam_kouzehgar@sutd.edu.sg (M. Kouzehgar), ykrishn4@uwo.ca (Y. Krishnasamy Tamilselvam),
manuel_vega@sutd.edu.sg (M. Vega Heredia), rajeshelara@sutd.edu.sg (M. Rajesh Elara).
Automation in Construction 108 (2019) 102959
Available online 23 October 2019
0926-5805/ © 2019 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license
(http://creativecommons.org/licenses/BY-NC-ND/4.0/).
surface-climbing robots are mainly utilized for remote inspection and maintenance applications [8] in difficult-access situations such as wall inspection [9], labeling oil-tank volume [10], and nuclear power plant inspection [11].
Regardless of the application to façade cleaning, several types of mechanisms have been proposed for surface climbing: a legged morphology with eight articulated limbs for Robug III is presented in [12]. The structure illustrated in [13] is another type of legged morphology. In addition, a hexapod structure for climbing is put forward in [14]. Moreover, a shape-shifting rolling-crawling mechanism, which is also able to climb vertical surfaces, is introduced in [15]. The surface-climbing robots illustrated in [16,17] use a tracked-wheel mechanism. There are also pneumatic-type climbing robots [18], namely those utilizing suction cups [19,20]. Among the climbing robots equipped with suction cups, [21] proposes a biped mechanism, whereas [22] introduces a robot specifically designed for glass façade cleaning with passive suction cups driven by self-locking leadscrews. Additionally, some climbing robots benefit from material-based adhesiveness [23] and impedance control [17].
When it comes to improving the performance of façade-cleaning robots, reconfigurability can add great value in terms of accessing more spaces and flexibility in morphology. In [24], a nested reconfigurable mechanism has been proposed specifically for vertical façade-cleaning applications. The rolling-crawling mechanism introduced in [15] is another example of reconfigurable morphology that can be used in façade-cleaning applications due to its wall-climbing ability. Moreover, bio-inspired mechanisms can also pave their way toward vertical façade cleaning, as with the climbing mechanism proposed in [23] benefiting from special adhesive materials.
The robot considered in this paper is Mantis [25], a modular climbing robot that uses a powerful commercial impeller to provide the required mechanical attachment force as the adhesion mechanism. Even with robotic façade cleaners, usually a person is required to detach the robot from one window panel and attach it to another. Mantis is equipped with the ability to distinguish the frames (or any obstacles other than glass) and, due to its modular morphology, is able to perform a transition from one window panel to another, avoiding its frame. This way, the need for human interference is eliminated. This is one aspect of autonomy worked out for the façade-cleaning robot thoroughly investigated in [25] in terms of design, locomotion strategies, and control.
Furthermore, façade-cleaning robots should avoid cracked glass regions, since navigating on them could cause dangerous situations. Hence, it is necessary that the robot can recognize surfaces safe for navigation, just as the high-rise window cleaner from FatCat Robotics is claimed to be equipped with crack-detection technology. In this study, our aim is to equip Mantis with glass crack-detection ability using deep-learning techniques, namely convolutional neural networks (CNNs). This is prominently significant because we are adding a very necessary capability that allows the façade-cleaning robot to avoid and prevent probable dangerous situations for itself and for the humans around it. Given the irreparable harm to people's health, no one would accept a shattering glass panel merely for the sake of having a tech-equipped cleaning robot navigate on it; without this additional capability, people would prefer the glass to remain stained rather than risk it shattering. This paper introduces this distinguished capability as its main contribution, which will guarantee safety for all.
Crack detection can generally be viewed from two aspects: first, the material on which the crack exists, and second, the method used to analyze the crack. In terms of material, several other materials require significant crack-detection efforts, such as metals, concrete, asphalt, and walls of different materials in the construction industry. For example, in [26,27], crack detection on metal surfaces is worked out by applying an Eddy-current sensor. In [28], a nonlinear ultrasonic modulation technique based on dual laser excitation is proposed for fatigue crack detection in aluminum and steel plates. Ultrasound crack detection in a simulated human tooth is investigated in [29]. In terms of detecting civil infrastructure defects, a survey on image-based crack detection for concrete surfaces is put forward in [30].
Vision-based, deep-learning-supported crack-detection methods have mostly been a matter of interest for concrete and asphalt surfaces. For instance, [31] proposes a vision-based method using a deep architecture of CNNs for detecting concrete cracks. Road crack detection using CNNs is elaborated in [32], and crack detection on 3D asphalt surfaces using a deep-learning network is proposed by [33], while [34] applies deep CNNs with transfer learning for vision-based pavement distress detection. Furthermore, deep-learning-based crack detection using CNNs also has an interesting application in identifying cracks on reactors within the process of nuclear power plant inspection [35].
The recent great interest in CNNs lies in their outstanding advantages: notably, CNNs benefit from automatic feature extraction and feature learning. In fact, they can learn relevant features from an image/video at different levels, similar to a human brain. Ordinary NNs, for example, cannot do this and need a prior feature-extraction phase before being applied to any object/pattern-recognition application. In general, when using NNs, one needs to extract relevant features for the given task and assign each of them to an element of the input vector, while a CNN will automatically extract such features provided that the input can be represented as a tensor with locally correlated elements, such as audio data, images, video, etc. Furthermore, a CNN is more efficient in terms of memory and complexity because it needs far fewer parameters, especially in image-processing applications where, dealing with the image matrix, the number of weight parameters grows drastically in other methods such as NNs; in a CNN it depends only on the number and size of the filters. Another distinguished advantage of CNNs is the capability for transfer learning, i.e., re-using a pre-trained CNN by feeding one's data at each level and only slightly tuning the CNN for the newly defined task, for example using the knowledge gained on cars to distinguish trucks. This way, training the CNN from scratch is avoided, saving memory and time.
Particularly when it comes to crack detection on glass surfaces, besides the vision-based techniques, there are also some other methods, such as the ultrasonic glass crack detection proposed in [36] and detecting defects on glass surfaces with wavelet transforms in [37]. Regarding vision-based techniques, glass crack detection specifically has recently been developed mostly on the basis of classical image-processing methods such as edge detection and segmentation. As an illustrative example, the glass crack-detection system proposed in [38] first applies pre-processing and smooth sharpening, followed by image segmentation, and ends with crack feature extraction including the calculation of crack area, crack perimeter, and crack circularity. Also, in [39], a very basic image-processing method based on pixel coordinates is applied to detect cracks on bottles in a video-supervised production line.

However, vision-based deep learning has rarely been applied to glass crack detection and therefore presents a scope for tremendous research. The rarity of deep-learning applications in glass crack detection mostly stems from two different issues. First, glass crack detection has not been a matter of interest in major industrial applications, and a glass-cleaning robot with crack-detection capability is rarely found with a valid commercialized patent. Second, dealing with images of glass is not straightforward, because lighting exposure on glass is a severe issue that varies tremendously at different times of day and night. Furthermore, many objects in the background and foreground of the glass may have their images reflected on it, and intense light can also be reflected on the glass. This reflection issue is referred to as the glare effect in this paper, and we have tried to train the network with sample videos containing different types of glare effect (reflection of the sun and of artificial intense light).
In this paper, the aim is to propose a deep-learning-based crack-detection system for a glass façade-cleaning modular robot. To this end, we will be using a live video sent from an onboard camera mounted on the robot, monitoring the glass surface ahead. The CNN that takes care of crack detection in each video frame is implemented in Python with a TensorFlow backend. The utilized CNN consists of two convolutional layers, two maximum pooling layers, and one fully connected layer, and during the training process we use two types of optimizers: the Adam optimizer and the Adagrad optimizer. The overall performance of the proposed approach is quite satisfying, since it can achieve an accuracy of more than 90%. This way, our façade-cleaning robot is equipped with a higher level of autonomy in terms of detecting cracked regions and providing adequate safety.
The rest of the paper is organized as follows. Section 2 describes the overall locomotion mechanism of the considered façade-cleaning robot (Mantis). Section 3 is dedicated to a brief general review of the CNN concept along with the mechanism of each layer. Section 4 is concerned with the nature of the data set and details about data augmentation and preprocessing. Furthermore, Section 5 introduces the proposed CNN architecture and training algorithm along with the mathematical support for the utilized optimizers. Section 6 is devoted to results and discussion, summarizing the test results and casting light on the accuracy metrics, performance-analysis graphs, and sample data classification results, together with a performance comparison against a simple neural network. Eventually, Section 7 concludes the paper and opens a window toward future works.
2. Description of the robotic platform [25]
Mantis is a modular vertical climbing robot developed for façade cleaning [25]. The currently commercialized façade-cleaning robots, such as the Winbot, Hobot, and Alfawise series, usually require human interference to detach them, move them to the next window frame, and attach them again to continue the cleaning process. To address this shortcoming in current commercial designs, Mantis is designed in such a way that it can perform the transition from one window panel to another without manual assistance. The robot has been developed on a rigid structure that keeps the modules unified, while an individual rotation and lifting mechanism is provided for each module. Fig. 1 shows a picture of Mantis.
2.1. Adhesion mechanism
Mantis uses a powerful commercial impeller to create the suction required as the mechanical attachment force that keeps it attached to the glass and prevents it from falling. The impeller's input voltage is adjustable to regulate the required force based on the surface sensitivity.
2.2. Locomotion mechanism
The robot is equipped with a locomotive wheel mechanism with soft, flexible, high-friction rubber. In this way, we decrease the drift of the robot on the glass in the presence of dirt and liquids. The diameter of the wheels is 6 cm, connected to DC motors with 250 oz-in of torque. The maximal speed of the robot is 15 cm/s.

As depicted in Fig. 2, the wheels are attached to each module in an orthogonal position to the glass and equidistant from the center of the module, aligned with the Y axis.

The whole module can rotate β degrees around the Z-axis for each module i (where i is the pad, i = a, b, c). Thus, by rotating the module, the wheels rotate concentrically around a central axis.

Considering the locomotive mechanism, Mantis is a 12W6D3S robot. Based on the locomotive characteristics of the robot, given the restrictions imposed to facilitate control, it is nearly holonomic. Each of the modules can rotate 360°. However, in the current application the rotation range is limited to 90°, since this is enough for the robot to navigate throughout the whole window.
2.3. Transition mechanism
The transition mechanism consists of two steps: first, recognizing the window frame or any obstacle; second, lifting the module that is nearest to the obstacle. The recognition is fulfilled by applying inductive sensors. The transition between the windows, across the frames or obstacles, is shouldered by the lifting system. Aiming at transition, a linear actuator is utilized, which linearly moves a screw and lifts a
Fig. 1. Structure of Mantis while cleaning.
Fig. 2. Mantis inertial system in red, module c inertial system in black; β is the steering of each module. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
module from the surface using the rotation around a screw rod located
in the center of each module. This mechanism is illustrated in Fig. 3.
2.4. Software architecture
As shown in Fig. 4, for this application the robot is teleoperated through a Bluetooth device (HC-06) and a mobile application. The robot can also be teleoperated by a computer or through a USB interface. To develop autonomous navigation, the robot is equipped with a sensory system to know its orientation and position and to detect the frame of the window.

In Fig. 4, continuous arrows are power connections and dotted arrows show signals.

The robot is controlled using a master system which works with the Robot Operating System (ROS). Using ROS and its topic-subscription system, it is possible to process package information. The packets are bidirectional between the MCU (Arduino Mega) and the on-board ROS slave (Intel Compute Stick), as well as between the ROS slave and the ROS master, which communicate over WiFi. The ROS master is a server PC where information from the ROS slave is processed in software that cannot run on the ROS slave, given the processing resources necessary for correct operation. As shown in Fig. 5, the communication between TensorFlow and ROS is developed by subscribing to data-package topics [40] for identifying the cracked glass by image processing.
2.5. On-board visual support system
In order to develop an autonomous robot that can navigate safely on the glass, it is necessary for it to determine the state of the surface, checking for existing cracks. For the analysis of the glass surface, a vision system is used, based on a high-resolution camera (HD 1080, 30 fps) connected to a Wi-Fi module. As shown in Fig. 6.a, the camera is assembled in a fixture with an estimated angle of inclination of 45°. It also has a 90° vision angle, as shown in Fig. 6.b. In addition, Fig. 7 shows a picture of the real robot with the camera mounted on the middle module.
3. Convolutional neural networks - a very brief review
In this paper, we will be using a convolutional neural network (CNN) implemented in Python with a TensorFlow backend. A CNN contains special layers called convolutional layers, which are very useful in detecting objects and patterns [41,42]. One advantage of a CNN is that it can be used to build a very deep network with a smaller number of parameters to train, thereby reducing the time and complexity of the training process. Apart from this, a CNN consists of different types of layers with specific characteristics, such as convolutional layers, activation layers, pooling layers, fully connected layers, and SoftMax layers.
The primary idea behind image classification is horizontal or vertical edge detection, which can be achieved by performing a convolution operation on the input image. The algorithm takes a small square (or window) called a filter and applies it over the image. Each filter allows the CNN to identify certain patterns in the image. In the initial layers, a CNN starts by detecting simple features such as lines, circles, and edges. In each layer, the network can combine these findings and continually learn more complex concepts as it goes deeper; in our case, it detects cracks existing on the glass.
3.1. Convolution layer
The convolution layer consists of the filter, which convolves across the width and height of the input volume. In other words, the output of a convolutional layer is obtained by carrying out a dot-product operation between the filter weight content and each
Fig. 3. Mantis lifting the middle module for transition between window panels [25].
Fig. 4. Schematic architecture of Mantis.
Fig. 5. ROS-TensorFlow robot communication.
Fig. 6. Position and orientation of the camera mounted on Mantis.
unique position of the input image. This process results in a 2-dimensional activation map that gives the responses of that filter at every spatial position. The various parameters involved in this process are the number of filters, the filter size and weight contents, the stride, and the padding. All of these parameters affect the size of the output.
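The sliding dot-product described above can be sketched with a naive NumPy routine (an illustrative sketch with a hypothetical vertical-edge filter, not the paper's implementation; stride 1, no padding):

```python
import numpy as np

def conv2d(image, kernel, stride=1):
    """Naive 2-D convolution (cross-correlation): slide the filter over the
    image and take the dot product at every unique position ("valid" padding)."""
    kh, kw = kernel.shape
    ih, iw = image.shape
    oh = (ih - kh) // stride + 1
    ow = (iw - kw) // stride + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            patch = image[i*stride:i*stride+kh, j*stride:j*stride+kw]
            out[i, j] = np.sum(patch * kernel)  # dot product with the filter
    return out

# A vertical-edge filter responds strongly where intensity changes left-to-right,
# e.g. across the bright-to-dark boundary in this toy 4x4 image.
image = np.array([[10, 10, 0, 0]] * 4, dtype=float)
vertical_edge = np.array([[1, 0, -1],
                          [1, 0, -1],
                          [1, 0, -1]], dtype=float)
activation_map = conv2d(image, vertical_edge)  # 2x2 map, strong response everywhere near the edge
```

In a trained CNN the filter weights are learned rather than hand-designed; this is exactly the automatic feature extraction discussed in the introduction.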
3.2. Pooling layers
Pooling layers generally aim at avoiding overfitting: by applying non-linear down-sampling to the activation maps, they reduce the dimension and complexity to speed up the computation.

The parameters involved in this process are filter size and stride, whereas padding is not used in pooling, as it runs against the purpose of reducing the input dimension. In addition, pooling is applied to each input channel individually, so the numbers of output and input channels are equal. There are two different types of pooling, namely max pooling and average pooling.
Max pooling
The basic operation of a pooling layer is similar to that of the convolutional layer. One noticeable difference is that instead of taking the dot product of the input and the filter, we take the maximum neighboring value at each unique position in the input image. This is done for each channel in the input.
Average pooling
In average pooling, we take the average of all the values surrounding each unique position in the input image.
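Both pooling variants can be sketched in one routine (an illustrative NumPy sketch, not the paper's code; single channel, no padding):

```python
import numpy as np

def pool2d(x, size, stride, mode="max"):
    """Down-sample one channel: take the max (or mean) of each size x size
    window, moving by `stride` positions (no padding, as in pooling layers)."""
    h, w = x.shape
    oh = (h - size) // stride + 1
    ow = (w - size) // stride + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            window = x[i*stride:i*stride+size, j*stride:j*stride+size]
            out[i, j] = window.max() if mode == "max" else window.mean()
    return out

x = np.array([[1, 2, 5, 6],
              [3, 4, 7, 8],
              [9, 8, 3, 2],
              [7, 6, 1, 4]], dtype=float)
mx = pool2d(x, size=2, stride=2, mode="max")  # [[4., 8.], [9., 4.]]
av = pool2d(x, size=2, stride=2, mode="avg")  # [[2.5, 6.5], [7.5, 2.5]]
```

Note how a 4 × 4 input shrinks to 2 × 2, which is the dimensionality reduction that speeds up the later layers.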
3.3. Fully connected layers
The fully connected layer is another name for the hidden layer used in a regular neural network. Before this step, the input array is converted into a one-dimensional vector using a flattening layer. As the name suggests, in a fully connected layer, each node in the input is connected to every node in the output.
3.4. SoftMax layer
The SoftMax activation function is applied in the output layer of a CNN to represent a categorical distribution over labels; it gives the probability of each input belonging to each label.
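For two classes (cracked / non-cracked), the SoftMax function can be sketched as follows (illustrative NumPy code with made-up logit values):

```python
import numpy as np

def softmax(logits):
    """Map raw output scores (logits) to class probabilities that sum to 1.
    Subtracting the max logit first is a standard numerical-stability trick."""
    z = logits - np.max(logits)
    e = np.exp(z)
    return e / e.sum()

# Two output nodes (cracked, non-cracked): the larger logit gets the
# larger probability, and the two probabilities sum to 1.
probs = softmax(np.array([2.0, 0.5]))
```

Because the two classes are mutually exclusive, the robot can simply act on whichever class receives the higher probability.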
4. Image Data Set, augmentation techniques and preprocessing
In this study, we generated cracked and non-cracked images from the video frames captured by the camera mounted on the robot. As discussed earlier, the viewing angle of this camera covers the glass that is to be cleaned.

Most of the data set consists of snapshots of videos (from the robot) or photos taken at the Singapore University of Technology and Design (SUTD), which is beautifully designed with a half-glass structure in all its buildings. The samples were captured under different conditions (illumination, reflection, etc.), with different camera devices and different resolutions, including different types of object reflections and glare effects (light and sun reflections). While the majority of these images were collected by our team, for the sake of diversity and to generalize the data set, some cracked/non-cracked images were also added from the web.
Since the main goal of this study was crack detection, the images were manually labelled into two separate categories: cracked or non-cracked. Originally, 1539 cracked images and 1565 non-cracked images were labelled. In addition, by applying some data augmentation techniques (mostly flipping and rotating), the final dataset consists of 2205 cracked and 4303 non-cracked images with different conditions of orientation, illumination, and resolution.
After saving the trained CNN, it is tested with snapshots from the live video. For online crack detection through the live video from the robot, the raw image data go through several pre-processing phases:

- Converting the video into images by reading a video frame every 1 s.
- Converting each image to an appropriate size. The reason is that the software tends to provide better results if all input images have the same pixel dimensions; in this paper, we have used dimensions of 480 × 240 pixels.
- The resultant images are read using the OpenCV package in Python.
- We use grayscale images, as this reduces the number of dimensions, thereby reducing the operational time and capacity. The obtained vector is then normalized (each pixel value is divided by 255). This is the last pre-processing step before the image is fed into the trained CNN for testing.
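The per-frame steps above can be sketched as follows. The paper performs them with OpenCV; the sketch below mimics them in plain NumPy so the arithmetic is explicit (the grayscale weights and nearest-neighbor resize are illustrative stand-ins for the OpenCV calls; only the 480 × 240 target size and the division by 255 come from the text):

```python
import numpy as np

def preprocess_frame(frame_rgb):
    """Pre-process one video frame: grayscale -> resize to 480x240 -> normalize.
    The paper uses OpenCV for these steps; here they are mimicked in NumPy."""
    # 1) Grayscale: collapse the 3 color channels (ITU-R BT.601 luma weights).
    gray = (0.299 * frame_rgb[..., 0]
            + 0.587 * frame_rgb[..., 1]
            + 0.114 * frame_rgb[..., 2])
    # 2) Resize to the fixed 480 (wide) x 240 (high) input size by
    #    nearest-neighbor sampling of rows and columns.
    target_h, target_w = 240, 480
    rows = np.arange(target_h) * gray.shape[0] // target_h
    cols = np.arange(target_w) * gray.shape[1] // target_w
    resized = gray[np.ix_(rows, cols)]
    # 3) Normalize each pixel to [0, 1] by dividing by 255.
    return resized / 255.0

# Simulate one raw 720p camera frame and run the pipeline.
frame = np.random.randint(0, 256, size=(720, 1280, 3)).astype(np.uint8)
x = preprocess_frame(frame)  # shape (240, 480), values in [0, 1]
```

The resulting array is what would be fed to the trained CNN for each one-second snapshot of the live video.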
5. The training algorithm and applied CNN architecture
As illustrated before, this paper aims at classifying cracked glass versus non-cracked glass using the video captured from the robot. The purpose of the classification is that the robot needs to avoid the areas of the glass where cracks are present. As mentioned earlier, the video is converted into grayscale images and loaded using the OpenCV package. A CNN developed in TensorFlow is then used for classification. To get the best results and also make a comparison, we have used two optimizers, namely Adam and Adagrad, to perform the training.
Fig. 7. Camera mounted on the middle module of Mantis.
5.1. Optimizers
A brief elaboration on the formulation of these optimizers is given
below in order to make it possible to reproduce the simulations.
5.1.1. Adam optimizer
The Adam optimizer (Adaptive Moment Estimation) follows an algorithm for first-order gradient-based optimization based on adaptive estimates of lower-order moments. The pseudo code for the Adam algorithm is given below [43].
Require: $\alpha$: step size
Require: $\beta_1, \beta_2 \in [0, 1)$: exponential decay rates for the moment estimates
Require: $f(\theta)$: stochastic objective function with parameters $\theta$
Require: $\theta_0$: initial parameter vector
$m_0 \leftarrow 0$ (initialize 1st moment vector)
$v_0 \leftarrow 0$ (initialize 2nd moment vector)
$t \leftarrow 0$ (initialize timestep)
while $\theta_t$ not converged, do the following:
  $t \leftarrow t + 1$
  $g_t \leftarrow \nabla_\theta f_t(\theta_{t-1})$ (get gradients w.r.t. stochastic objective at timestep $t$)
  $m_t \leftarrow \beta_1 m_{t-1} + (1 - \beta_1)\, g_t$ (update biased first moment estimate)
  $v_t \leftarrow \beta_2 v_{t-1} + (1 - \beta_2)\, g_t^2$ (update biased second raw moment estimate)
  $\hat{m}_t \leftarrow m_t / (1 - \beta_1^t)$ (compute bias-corrected first moment estimate)
  $\hat{v}_t \leftarrow v_t / (1 - \beta_2^t)$ (compute bias-corrected second raw moment estimate)
  $\theta_t \leftarrow \theta_{t-1} - \alpha\, \hat{m}_t / (\sqrt{\hat{v}_t} + \varepsilon)$ (update parameters)
end while
return $\theta_t$ (resulting parameters)

where $g_t$ are the gradients, $\theta_t$ is the parameter vector at time $t$, $\beta_1$ and $\beta_2$ belong to $[0, 1)$, and $\alpha$ is the learning rate. According to [43], $g_t^2$ indicates the elementwise square $g_t \odot g_t$, and the proposed default settings are $\alpha = 0.001$, $\beta_1 = 0.9$, $\beta_2 = 0.999$ and $\varepsilon = 10^{-8}$. All operations on vectors are element-wise, and $\beta_1^t$ and $\beta_2^t$ denote $\beta_1$ and $\beta_2$ to the power of $t$.
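The pseudo code translates directly into NumPy. The sketch below applies it to a toy quadratic objective $f(\theta) = \theta^2$ (the objective, step size, and iteration count are illustrative, not from the paper):

```python
import numpy as np

def adam_step(theta, grad, m, v, t, alpha=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update following the pseudo code; all ops are element-wise."""
    m = beta1 * m + (1 - beta1) * grad          # biased 1st moment estimate
    v = beta2 * v + (1 - beta2) * grad ** 2     # biased 2nd raw moment estimate
    m_hat = m / (1 - beta1 ** t)                # bias-corrected 1st moment
    v_hat = v / (1 - beta2 ** t)                # bias-corrected 2nd moment
    theta = theta - alpha * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Minimize f(theta) = theta^2 (gradient 2*theta) starting from theta = 1.
theta, m, v = np.array([1.0]), np.zeros(1), np.zeros(1)
for t in range(1, 5001):
    theta, m, v = adam_step(theta, 2 * theta, m, v, t, alpha=0.01)
# theta is now close to the minimum at 0
```

In the actual training, TensorFlow's built-in Adam implementation performs these updates on the CNN weights; the sketch only exposes the arithmetic behind them.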
5.1.2. Adagrad optimizer
The Adagrad optimizer is a gradient-based optimization algorithm that works well for sparse gradients [44]. It automatically adapts the learning rate based on the parameters. The basic equation used for the parameter update is shown in Eq. (1), where $\theta_t$ is the parameter at time $t$, $\alpha$ is the learning rate, $g_t$ is the gradient estimate, and $\odot$ means element-wise multiplication.
$$\theta_t = \theta_{t-1} - \frac{\alpha}{\varepsilon + \sqrt{\sum_{\tau=1}^{t} g_\tau^2}} \odot g_t \qquad (1)$$
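This per-parameter update, in which the learning rate shrinks as squared gradients accumulate, can be sketched in NumPy and applied to a toy quadratic objective (the objective, step size, and iteration count are illustrative, not from the paper):

```python
import numpy as np

def adagrad_step(theta, grad, accum, alpha=0.1, eps=1e-8):
    """One Adagrad update: each parameter's learning rate is scaled down by
    the square root of its accumulated squared gradients (element-wise)."""
    accum = accum + grad ** 2
    theta = theta - alpha * grad / (eps + np.sqrt(accum))
    return theta, accum

# Toy run: minimize f(theta) = theta^2 (gradient 2*theta) starting from 1.
theta, accum = np.array([1.0]), np.zeros(1)
for _ in range(2000):
    theta, accum = adagrad_step(theta, 2 * theta, accum)
# theta has decayed toward the minimum at 0
```

Because the accumulated term only grows, Adagrad's effective step size decays monotonically, which is one reason its training curves behave differently from Adam's in Section 6.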
[Fig. 8 sketch: the input image (480 × 240 × 1) passes through convolution layer 1, max pooling layer 1, convolution layer 2, and max pooling layer 2, is flattened into a 1920-element vector, and is fed to a fully connected network ending in a SoftMax layer with two outputs: Uncracked and Cracked.]
Fig. 8. CNN architecture for this classication process.
Table 1
Parameters used in the CNN layers.

Layer                 Filter size   Stride   Padding type   Activation
Convolution layer 1   4             1        Same           ReLU
Max pooling layer 1   8             8        Same (P = 0)   -
Convolution layer 2   2             1        Same           ReLU
Max pooling layer 2   4             4        Same (P = 0)   -
[Fig. 9 flowchart: Start → input video from the on-board camera → get snapshots of the video → label the training data → normalize → 1st convolutional layer with ReLU activation → 1st maximum pooling layer → 2nd convolutional layer with ReLU activation → 2nd maximum pooling layer → fully connected layer with SoftMax function → calculate cost → calculate test accuracy → if the stopping criteria are fulfilled, save the trained model and stop; otherwise, repeat.]
Fig. 9. The CNN training algorithm.
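The first preprocessing steps of this pipeline, normalizing the snapshots and labelling them for the two mutually exclusive classes, can be sketched as below; the function names, the [0, 1] scaling convention and the class ordering are illustrative assumptions, not taken from the paper's code:

```python
import numpy as np

CLASSES = ("uncracked", "cracked")  # the two mutually exclusive labels

def preprocess_snapshot(frame):
    """Scale an 8-bit grayscale snapshot to [0, 1], as in the
    normalization step of the training pipeline."""
    return frame.astype(np.float32) / 255.0

def one_hot(label):
    """Encode 'cracked'/'uncracked' as a one-hot target for the SoftMax layer."""
    target = np.zeros(len(CLASSES), dtype=np.float32)
    target[CLASSES.index(label)] = 1.0
    return target
```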
M. Kouzehgar, et al. Automation in Construction 108 (2019) 102959
6
5.2. Proposed CNN
A sketch of the utilized CNN architecture is shown in Fig. 8 in which
we have used two convolutional layers with ReLU activation, two
maximum pooling layers and one fully connected layer with SoftMax
activation. Due to the mutually exclusive nature of the crack detection
problem (cracked or non-cracked), a SoftMax layer is used as the last
layer to compute the probability of each class. Furthermore, Table 1
summarizes the parameters corresponding to CNN layers. Meanwhile,
the training algorithm is illustrated in Fig. 9.
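The layer parameters of Table 1 can be sanity-checked by tracing the spatial dimensions through the network. The sketch below assumes, as read off Fig. 8, a 480 × 240 single-channel input and 16 feature maps after the second convolution; with SAME padding, each layer's output size is the ceiling of the input size over the stride:

```python
import math

def same_out(size, stride):
    """Output spatial size under SAME padding: ceil(size / stride)."""
    return math.ceil(size / stride)

# Spatial sizes through the layers of Table 1 (input size and channel
# counts are assumptions read off Fig. 8):
h, w = 480, 240
h, w = same_out(h, 1), same_out(w, 1)   # conv 1, stride 1 -> 480 x 240
h, w = same_out(h, 8), same_out(w, 8)   # pool 1, stride 8 -> 60 x 30
h, w = same_out(h, 1), same_out(w, 1)   # conv 2, stride 1 -> 60 x 30
h, w = same_out(h, 4), same_out(w, 4)   # pool 2, stride 4 -> 15 x 8
flattened = h * w * 16                  # 15 * 8 * 16 = 1920
```

The trace reproduces the 1920-element flattened vector that feeds the fully connected network in Fig. 8.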
6. Results and discussion
Training performance graphs for both optimizers are summarized in
Fig. 10 (both for 700 epochs).
The confusion matrix is a way of describing the performance of the classifier output. In our case, it is laid out as in Eq. (2):

Confusion Matrix = [ True Cracked      False Cracked
                     False Uncracked   True Uncracked ]        (2)
Performance metrics for a growing number of epochs are summarized in Table 2 for both optimizers. The table gives detailed metrics in terms of the confusion matrix, accuracy, sensitivity as TPR (true positive rate, i.e. recall), precision (PPV: positive predictive value), specificity (SPC), negative predictive value (NPV), false positive rate (FPR), false discovery rate (FDR), miss rate or FNR (false negative rate), FOR (false omission rate) and F1 score [45].
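All of these metrics derive directly from the four confusion-matrix entries. The helper below, an illustrative sketch rather than the paper's code, reproduces the first row of Table 2 from its confusion matrix:

```python
def metrics(tp, fp, fn, tn):
    """Derive the Table 2 metrics from a confusion matrix laid out as in
    Eq. (2): [[TP (cracked), FP], [FN, TN (uncracked)]]."""
    return {
        "accuracy": 100 * (tp + tn) / (tp + fp + fn + tn),
        "precision": tp / (tp + fp),        # PPV
        "recall": tp / (tp + fn),           # TPR / sensitivity
        "f1": 2 * tp / (2 * tp + fp + fn),
        "specificity": tn / (tn + fp),      # SPC
        "npv": tn / (tn + fn),
        "fpr": fp / (fp + tn),
        "fdr": fp / (fp + tp),
        "fnr": fn / (fn + tp),              # miss rate
        "for": fn / (fn + tn),              # false omission rate
    }

# Adam optimizer at 700 epochs (first row of Table 2):
row = metrics(tp=1804, fp=271, fn=401, tn=4032)
```

Plugging in TP = 1804, FP = 271, FN = 401, TN = 4032 gives accuracy 89.674% and the other first-row values of Table 2 up to rounding.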
Precision, recall and F1 values during the training process are illustrated in Figs. 11 to 13. As can be seen, the precision of the Adam optimizer stays smoothly and markedly above that of the Adagrad optimizer throughout training; however, Adagrad seems able to reach the same precision value and intersect with Adam's precision if given more training epochs. Although Adagrad shows weak precision, in terms of recall it performs more satisfyingly, keeping its curve
[Fig. 10 plots the loss against epochs (0–800) for the Adagrad and Adam optimizers.]
Fig. 10. Cost function values.
Table 2
Quantitative performance metrics for both optimizers.

Optimizer          Epochs  Confusion matrix       Accuracy%  PPV (precision)  TPR (Recall/sensitivity)  F1 score  SPC    NPV    FPR    FDR    FNR    FOR
Adam optimizer     700     [1804 271; 401 4032]   89.674     0.869            0.818                     0.842     0.937  0.909  0.062  0.130  0.181  0.090
                   350     [1777 298; 423 4010]   88.921     0.856            0.807                     0.831     0.930  0.904  0.069  0.143  0.192  0.095
                   300     [1774 301; 417 4016]   88.967     0.854            0.809                     0.831     0.930  0.905  0.069  0.145  0.190  0.094
                   280     [1777 298; 419 4014]   88.982     0.856            0.809                     0.832     0.930  0.905  0.069  0.143  0.190  0.094
Adagrad optimizer  700     [1322 753; 23 4410]    88.076     0.637            0.982                     0.773     0.854  0.994  0.145  0.362  0.017  0.005
                   350     [712 1363; 9 4424]     78.918     0.343            0.987                     0.509     0.764  0.997  0.235  0.656  0.012  0.002
                   300     [571 1504; 8 4425]     76.767     0.275            0.986                     0.430     0.746  0.998  0.253  0.724  0.013  0.001
                   280     [473 1602; 8 4425]     75.261     0.227            0.983                     0.370     0.734  0.998  0.265  0.772  0.016  0.001
Fig. 11. Precision values for both optimizers.
Fig. 12. Recall values for both optimizers.
higher throughout the epochs. Both optimizers reach approximately the same accuracy level, around 90%; however, the Adam optimizer is slightly more accurate (89.674% for Adam vs. 88.076% for Adagrad). Overall, the Adam optimizer, with the lowest false negative rate (FNR) and the highest accuracy, suggests a well-trained classifier able to distinguish cracked from non-cracked surfaces.
For further illustration, the probability of correct classification for sample test images is given for both optimizers in Figs. 14 and 15 (cracked samples) and Figs. 16 and 17 (non-cracked samples), in which TP and TN stand for True Positive and True Negative, respectively.
In the corresponding figures for the cracked and non-cracked cases (Figs. 14-15 and Figs. 16-17), the same sample test images are used for both optimizers so that they can be compared more easily [32]. Besides the samples from real video snapshots (captured by the robot), some samples from the web are also tested for the sake of diversity, to challenge the classifiers with generalized samples. To clarify, the photos with gray stripes come from the real robot navigating on a half-glass wall at the Singapore University of Technology and Design. In every test image, the Adam optimizer makes the correct decision with higher probability (at the level of 90%-100%). However, the Adagrad optimizer also never fails to decide correctly, always reporting probabilities safely above 50% for the true positive decision. Moreover, it is worth noting that in this glass-façade-cleaning application, not missing a crack matters more than the confidence of each correct decision, because even one small mistake may cause a serious incident, and our proposed method effectively guarantees this: although some rather low probabilities are also reported, all results keep a sufficiently safe distance from the 50% decision threshold. This supports the safety of replacing the human inspector with this process.
In addition, in order to prove the strength of the applied CNN, we have also trained a simple neural network (NN). This NN uses the Adam optimizer and, consisting of six layers, is very similar to the applied CNN in architecture: one input layer, one output layer, and four hidden layers of sizes 32, 16, 8 and 4. However, an NN needs features to be fed into its input layer. Here, feature extraction is done using OpenCV, which converts each image into a multi-dimensional matrix; this matrix is then flattened and provided to the NN's input layer. Figs. 18 and 19 illustrate the NN's performance on the same test samples for the cracked and non-cracked cases. The NN clearly performs very poorly because it lacks convolutional layers.
Convolutional neural networks benefit from several advantages, as described earlier. Compared to NNs they need less time and memory; in practice, however, they are still computationally expensive. This drawback can be mitigated with better computing hardware such as GPUs and neuromorphic chips. Furthermore, some more recent types of
Fig. 13. F1 values for both optimizers.
[Fig. 14 shows twenty cracked test images, each correctly classified with probabilities between 61.5% and 100%.]
Fig. 14. Cracked samples (True Positive) for CNN with Adam Optimizer.
[Fig. 15 shows the same twenty cracked test images, classified with probabilities between 54.2% and 97.5%.]
Fig. 15. Cracked samples (True Positive) for CNN with Adagrad Optimizer.
[Fig. 16 shows twenty non-cracked test images, each correctly classified with probabilities between 90.7% and 100%.]
Fig. 16. Non-Cracked samples (True Negative) for CNN with Adam Optimizer.
CNNs can provide faster responses than a basic CNN that uses a sliding window. The other drawback is that they need a lot of training data; this aspect is usually well treated by applying data augmentation methods, as described in this paper.
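The memory argument can be made concrete with a rough weight count. The sketch below compares the comparison NN's dense layers (hidden sizes 32, 16, 8, 4 on the flattened input) against the CNN's kernels; the 480 × 240 input and the 8- and 16-map channel counts are assumptions read off Fig. 8, and biases are ignored:

```python
# Rough weight counts (biases ignored) illustrating why the dense NN
# is far more memory-hungry than the CNN. Input size and conv channel
# counts are assumptions taken from Fig. 8; the dense widths 32-16-8-4
# are the hidden layers quoted for the comparison NN.
flat_input = 480 * 240                                  # 115200 pixels
dense_weights = (flat_input * 32 + 32 * 16 + 16 * 8
                 + 8 * 4 + 4 * 2)                       # ~3.7 million
conv_weights = (4 * 4 * 1 * 8      # conv 1: 4x4 kernels, 8 maps
                + 2 * 2 * 8 * 16   # conv 2: 2x2 kernels, 16 maps
                + 1920 * 2)        # fully connected + SoftMax outputs
```

Even this rough count puts the dense NN at several hundred times more weights than the CNN's convolutional and output layers, which is why CNNs need less time and memory: weight sharing keeps each kernel tiny regardless of image size.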
7. Conclusion
In this paper, a deep learning approach has been put forward for
equipping our modular façade-cleaning robot with crack-detection
[Fig. 17 shows the same twenty non-cracked test images, classified with probabilities between 62.5% and 99.2%.]
Fig. 17. Non-Cracked samples (True Negative) for CNN with Adagrad Optimizer.
[Fig. 18 shows the twenty cracked test images classified by the NN with probabilities barely above chance, between 51.72% and 57.11%.]
Fig. 18. Cracked samples (True Positive) for NN with Adam Optimizer.
package. For façade-cleaning robots, the necessity of such an additional capability becomes more tangible when the dangers of navigating on cracked glass are taken into account. To this end, the modular Mantis robot is equipped with an on-board camera for experimental purposes, and the live video is loaded using the OpenCV package. In addition, a deep convolutional neural network developed in TensorFlow™ is proposed and trained with a sufficient data set including photos taken from video snapshots (fed directly from the robot), several other photos taken at the SUTD half-glass campus, and photos from the web with different conditions of illumination and resolution (taken with different devices). During training, two different optimizers are utilized, both of which achieve a very high accuracy of around 90%, trustworthy enough for this system to replace the human operator in real-time on-site inspections. Each of the two optimizers has its own advantages: Adam benefits from higher precision, while the Adagrad optimizer yields a higher recall factor. However, the Adam optimizer, with the lowest FNR and highest accuracy, suggests a more trustworthy classifier. Moreover, the overall performance of the proposed CNN is compared to that of a traditional NN-based method, which further proves the strength of the CNN.
The overarching aim of this research is to make the robot avoid cracked regions. Our future work involves enhancing the algorithm so that the robot can take a much more informed decision. In the next version of the crack detection algorithm, we will focus on solving the localization problem: detecting the exact position of the crack relative to the robot and navigating the robot so that it avoids the cracked region while maintaining the highest possible coverage for efficient cleaning.
To further empower the crack detection package, we will also focus on implementing a more powerful deep learning technique that can also help with object tracking, such as Faster R-CNN, and on enhancing the dataset with more images (especially covering different reflections and illuminations). Furthermore, another extension of the current work would be to run all processing on board rather than utilizing a master-slave ROS system.
Acknowledgments
This work is financially supported by the National Robotics R&D Program Office, Singapore, under Grant no. RGAST1702, at the Singapore University of Technology and Design (SUTD); this support is gratefully acknowledged.
Appendix A. Supplementary data
Supplementary data to this article can be found online at https://doi.org/10.1016/j.autcon.2019.01.025.
References
[1] C.W. Bostick, Architectural trend thru the looking glass, Proceedings of the Glass
Performance Days, 2009, pp. 860866. Tampere, Finland https://de.glassglobal.
com/gpd/downloads/ChangingMarkets-Bostick.pdf.
[2] F. Pariafsai, A review of design considerations in glass buildings, Frontiers of
Architectural Research 5 (2) (2016) 171193, https://doi.org/10.1016/j.foar.2016.
01.006.
[3] A. Kochan, Robot cleans glass roof of louvre pyramid, Industrial Robot: An
International Journal 32 (5) (2005) 380382, https://doi.org/10.1108/
01439910510614637.
[4] N. Elkmann, T. Felsch, M. Sack, J. Saenz, J. Hortig, Innovative service robot systems
for facade cleaning of difficult-to-access areas, IEEE/RSJ International Conference
on Intelligent Robots and Systems, vol. 1, 2002, pp. 756762, , https://doi.org/10.
1109/IRDS.2002.1041481.
[5] N. Elkmann, et al., SIRIUSc facade cleaning robot for a high-rise building in Munich, Germany, Climbing and Walking Robots, Springer Berlin Heidelberg, 2005, pp. 1033–1040, https://doi.org/10.1007/3-540-29461-9_101.
[6] Y.-S. Lee, et al., The study on the integrated control system for curtain wall building
façade cleaning robot, Autom. Constr. 94 (2018) 3946, https://doi.org/10.1016/j.
autcon.2017.12.030.
[7] S. Nansai, M. Rajesh Elara, A survey of wall climbing robots: recent advances and
challenges, Robotics 5 (3) (2016) 14, https://doi.org/10.3390/robotics5030014.
[8] D. Schmidt, K. Berns, Climbing robots for maintenance and inspections of vertical
structuresa survey of design aspects and technologies, Robot. Auton. Syst. 61 (12)
(2013) 12881305, https://doi.org/10.1016/j.robot.2013.09.002.
[9] D. Longo, G. Muscato, The Alicia³ climbing robot: a three-module robot for
automatic wall inspection, IEEE Robotics & Automation Magazine 13 (1) (2006)
4250, https://doi.org/10.1109/MRA.2006.1598052.
[10] Z. Xu, P. Ma, A wall-climbing robot for labelling scale of oil tank's volume, Robotica
[Fig. 19 reports per-image probabilities between 15.00% and 50.01% for the twenty non-cracked test samples classified by the NN.]
Fig. 19. Non-Cracked samples (True Negative) for NN with Adam Optimizer.
20 (2) (2002) 209212, https://doi.org/10.1017/S0263574701003964.
[11] B.L. Luk, K.P. Liu, A.A. Collie, S. Chen, Tele-operated climbing and mobile service
robots for remote inspection and maintenance in nuclear industry, Industrial Robot:
An International Journal 33 (3) (2006) 194204, https://doi.org/10.1108/
01439910610659105.
[12] B.L. Luk, L.K.P. Liu, A.A. Collie, Climbing service robots for improving safety in
building maintenance industry, in: M.K. Habib (Ed.), Bioinspiration and Robotics,
Ch. 9, IntechOpen, Rijeka, 2007, https://doi.org/10.5772/5498.
[13] A. Sintov, T. Avramovich, A. Shapiro, Design and motion planning of an autono-
mous climbing robot with claws, Robot. Auton. Syst. 59 (11) (2011) 10081019,
https://doi.org/10.1016/j.robot.2011.06.003.
[14] M. Henrey, A. Ahmed, P. Boscariol, L. Shannon, C. Menon, Abigaille-III: a versatile,
bioinspired hexapod for scaling smooth vertical surfaces, Journal of Bionic
Engineering 11 (1) (2014) 117, https://doi.org/10.1016/S1672-6529(14)
60015-9.
[15] T. Yanagida, R. Elara Mohan, T. Pathmakumar, K. Elangovan, M. Iwase, Design and
implementation of a shape shifting rolling-crawling-wall-climbing robot, Appl. Sci.
7 (4) (2017), https://doi.org/10.3390/app7040342.
[16] J. Zhu, D. Sun, S.-K. Tso, Development of a tracked climbing robot, J. Intell. Robot.
Syst. 35 (4) (2002) 427443, https://doi.org/10.1023/A:1022383216233.
[17] T. Kim, K. Seo, J. Kim, H.S. Kim, Adaptive impedance control of a cleaning unit for a
novel wall-climbing mobile robotic platform (ROPE RIDE), 2014 IEEE/ASME
International Conference on Advanced Intelligent Mechatronics, 2014, pp.
994999, , https://doi.org/10.1109/AIM.2014.6878210.
[18] H. Zhang, J. Zhang, W.L. Wang, Rong, G. Zong, A series of pneumatic glass-wall
cleaning robots for high-rise buildings, Industrial Robot: An International Journal
34 (2) (2007) 150160, https://doi.org/10.1108/01439910710727504.
[19] N. Mir-Nasiri, H.S. J, M.H. Ali, Portable autonomous window cleaning robot,
Procedia Computer Science 133 (2018) 197204, https://doi.org/10.1016/j.procs.
2018.07.024.
[20] J. Liu, K. Tanaka, L.M. Bao, I. Yamaura, Analytical modelling of suction cups used
for window-cleaning robots, Vacuum 80 (6) (2006) 593598, https://doi.org/10.
1016/j.vacuum.2005.10.002.
[21] S. Nansai, K. Onodera, P. Veerajagadheswar, R. Elara Mohan, M. Iwase, Design and
experiment of a novel façade cleaning robot with a biped mechanism, Appl. Sci. 8
(2018) 2398, https://doi.org/10.3390/app8122398.
[22] T.T. Tun, M.R. Elara, M. Kalimuthu, A. Vengadesh, Glass facade cleaning robot with
passive suction cups and self-locking trapezoidal lead screw drive, Autom. Constr.
96 (2018) 180188, https://doi.org/10.1016/j.autcon.2018.09.006.
[23] C. Menon, M. Murphy, M. Sitti, Gecko inspired surface climbing robots, 2004 IEEE
International Conference on Robotics and Biomimetics, 2004, pp. 431436, ,
https://doi.org/10.1109/ROBIO.2004.1521817.
[24] S. Nansai, M. Elara, T. Tun, P. Veerajagadheswar, T. Pathmakumar, A novel nested
reconfigurable approach for a glass façade cleaning robot, Inventions 2 (3) (2017)
18, https://doi.org/10.3390/inventions2030018.
[25] M. Vega-Heredia, et al., Design and modelling of a modular window cleaning
robot, Autom. Constr. 103 (2019) 268278, https://doi.org/10.1016/j.autcon.
2019.01.025.
[26] X. Peng, S. Katsunori, Eddy current sensor with a novel probe for crack position
detection, 2008 IEEE International Conference on Industrial Technology, 2008, pp.
16, , https://doi.org/10.1109/ICIT.2008.4608445.
[27] D.J. Sadler, C.H. Ahn, On-chip eddy current sensor for proximity sensing and crack
detection, Sensors Actuators A Phys. 91 (3) (2001) 340345, https://doi.org/10.
1016/S0924-4247(01)00605-7.
[28] P. Liu, J. Jang, S. Yang, H. Sohn, Fatigue crack detection using dual laser induced
nonlinear ultrasonic modulation, Opt. Lasers Eng. 110 (2018) 420430, https://doi.
org/10.1016/j.optlaseng.2018.05.025.
[29] M. Culjat, R. Singh, E. Brown, R. Neurgaonkar, D. Yoon, S. White, Ultrasound crack
detection in a simulated human tooth, Dentomaxillofacial Radiology 34 (2) (2005)
8085, https://doi.org/10.1259/dmfr/12901010.
[30] A. Mohan, S. Poobal, Crack detection using image processing: a critical review and
analysis, Alexandria Engineering Journal 57 (2) (2018) 787798, https://doi.org/
10.1016/j.aej.2017.01.020.
[31] Y.-J. Cha, W. Choi, O. Büyüköztürk, Deep learning-based crack damage detection
using convolutional neural networks, Computer-Aided Civil and Infrastructure
Engineering 32 (5) (2017) 361378, https://doi.org/10.1111/mice.12263.
[32] L. Zhang, F. Yang, Y.D. Zhang, Y.J. Zhu, Road crack detection using deep con-
volutional neural network, 2016 IEEE International Conference on Image
Processing (ICIP), 2016, pp. 37083712, , https://doi.org/10.1109/ICIP.2016.
7533052.
[33] A. Zhang, et al., Automated pixel-level pavement crack detection on 3D asphalt
surfaces using a deep-learning network, Computer-Aided Civil and Infrastructure
Engineering 32 (10) (2017) 805819, https://doi.org/10.1111/mice.12297.
[34] K. Gopalakrishnan, S.K. Khaitan, A. Choudhary, A. Agrawal, Deep convolutional
neural networks with transfer learning for computer vision-based data-driven pa-
vement distress detection, Constr. Build. Mater. 157 (2017) 322330, https://doi.
org/10.1016/j.conbuildmat.2017.09.110.
[35] F. Chen, M.R. Jahanshahi, NB-CNN: deep learning-based crack detection using
convolutional neural network and Naïve Bayes data fusion, IEEE Trans. Ind.
Electron. 65 (5) (2018) 43924400, https://doi.org/10.1109/TIE.2017.2764844.
[36] M.E. Ibrahim, R.A. Smith, C.H. Wang, Ultrasonic detection and sizing of compressed
cracks in glass- and carbon-fibre reinforced plastic composites, NDT & E
International 92 (2017) 111121, https://doi.org/10.1016/j.ndteint.2017.08.004.
[37] B. Akdemir, Ş. Öztürk, Glass surface defects detection with wavelet transforms,
International Journal of Materials, Mechanics and Manufacturing 3 (3) (2015),
https://doi.org/10.7763/IJMMM.2015.V3.189.
[38] Z. Yiyang, The design of glass crack detection system based on image preprocessing
technology, 2014 IEEE 7th Joint International Information Technology and
Artificial Intelligence Conference, 2014, pp. 39–42, https://doi.org/10.1109/
ITAIC.2014.7065001.
[39] M. Hui-Min, S. Guang-Da, W. Jun-Yan, N. Zheng, A glass bottle defect detection
system without touching, Proceedings. International Conference on Machine
Learning and Cybernetics, vol. 2, 2002, pp. 628632, , https://doi.org/10.1109/
ICMLC.2002.1174411.
[40] L. Joseph, ROS Robotics Projects, Packt Publishing Ltd., 2017 ISBN 10:
1783554711. (ISBN 13: 9781783554713).
[41] T. Guo, J. Dong, H. Li, Y. Gao, Simple convolutional neural network on image
classification, 2017 IEEE 2nd International Conference on Big Data Analysis
(ICBDA), 2017, pp. 721724, , https://doi.org/10.1109/ICBDA.2017.8078730.
[42] S. Albawi, T.A. Mohammed, S. Al-Zawi, Understanding of a convolutional neural
network, 2017 International Conference on Engineering and Technology (ICET),
2017, pp. 16, , https://doi.org/10.1109/ICEngTechnol.2017.8308186.
[43] Diederik P. Kingma, J. Ba, Adam: A method for stochastic optimization, 3rd
International Conference for Learning Representations, San Diego, 2015 https://
arxiv.org/abs/1412.6980.
[44] J. Duchi, E. Hazan, Y. Singer, Adaptive subgradient methods for online learning and
stochastic optimization, J. Mach. Learn. Res. 12 (2011) 21212159 http://www.
jmlr.org/papers/volume12/duchi11a/duchi11a.pdf.
[45] E. Protopapadakis, N. Doulamis, Image based approaches for tunnels' defects recognition via robotic inspectors, International Symposium on Visual Computing,
ISVC 2015: Advances in Visual Computing, vol. 9474, Springer, Cham, 2015, pp.
706716, , https://doi.org/10.1007/978-3-319-27857-5_63.
M. Kouzehgar, et al. Automation in Construction 108 (2019) 102959
12
... Automatic glass crack detection for façade-cleaning robot [118] AI Integration of service robots in the smart home by means of UPnP: A surveillance robot case study 2013 Implementing a basic garbage detection routine using built-in camera that allows the smart home system to instruct a service robot to clean whenever garbage is detected. ...
Article
Full-text available
The construction industry faces many challenges, including schedule and cost overruns, productivity constraints, and workforce shortages. Compared to other sectors, it lags in digitalization in every project phase. Artificial Intelligence (AI) and Machine Learning (ML) have emerged as transformative technologies revolutionizing the construction sector. However, a discernible gap persists in systematically categorizing the applications of these technologies throughout the various phases of the construction project life cycle. In response to this gap, this research aims to present a thorough assessment of the deployment of AI and ML across diverse phases in construction projects, with the ultimate goal of furnishing valuable insights for the effective integration of these intelligent systems within the construction sector. A thorough literature review was performed to identify AI and ML applications in the building sector. After scrutinizing the literature, the applications of AI and ML were presented based on a construction project life cycle. A critical review of existing literature on AI and ML applications in the building industry showed that AI and ML applications are more frequent in the planning and construction stages. Moreover, the opportunities for AI and ML applications in other stages were discussed based on the life cycle categorization and presented in this study. The practical contribution of the study lies in providing valuable insights for the effective integration of intelligent systems within the construction sector. Academically, the research contributes by conducting a thorough literature review, categorizing AI and ML applications based on the construction project life cycle, and identifying opportunities for their deployment in different stages.
Article
Wheeled mobile robots (WMRs) with variable wheelbases are capable of traveling on deformable terrains and handling complex detection tasks. While the variable wheelbase length of WMR allows it to interact with the terrains adaptively, enhancing its mobility, it brings a control challenge. Inspired by the worm's movement of stretching body at different lengths under different environmental resistance, a creeping gait (CG) strategy is proposed in this work to enable the WMR to be controlled in dual modes: wheeled following mode (WFM) and specified length mode (SLM). WFM adjusts the wheelbase's length by the wheels' movements freely to minimize the internal force and torque between wheels. SLM adjusts the wheelbase's length using a proposed fuzzy logic based algorithm to stabilize the body's posture on rough terrain and overcome specific motion challenges, like escaping wheel sinking. A state-adaptive mode-switching controller is then developed using the dwell time approach to smooth the output velocities during the switching phase, and a Lyapunov analysis is performed to verify its stability. According to the results of physical experiments, three-wheeled mobile robot movements with CG enable more precise path following by 37% and faster response by 11% compared to fixed wheelbase movements, and the dwell time approach achieves smoother speed transitions between the modes than the direct switching method, especially when moving from flat to slope terrain.
Article
Purpose Automation of detecting cracked surfaces on buildings or in any industrially manufactured products is emerging nowadays. Detection of the cracked surface is a challenging task for inspectors. Image-based automatic inspection of cracks can be very effective when compared to human eye inspection. With the advancement in deep learning techniques, by utilizing these methods the authors can create automation of work in a particular sector of various industries. Design/methodology/approach In this study, an upgraded convolutional neural network-based crack detection method has been proposed. The dataset consists of 3,886 images which include cracked and non-cracked images. Further, these data have been split into training and validation data. To inspect the cracks more accurately, data augmentation was performed on the dataset, and regularization techniques have been utilized to reduce the overfitting problems. In this work, VGG19, Xception and Inception V3, along with Resnet50 V2 CNN architectures to train the data. Findings A comparison between the trained models has been performed and from the obtained results, Xception performs better than other algorithms with 99.54% test accuracy. The results show detecting cracked regions and firm non-cracked regions is very efficient by the Xception algorithm. Originality/value The proposed method can be way better back to an automatic inspection of cracks in buildings with different design patterns such as decorated historical monuments.
Article
Full-text available
Façade cleaning in high-rise buildings has always been considered a hazardous task when carried out by labor forces. Even though numerous studies have focused on the development of glass façade cleaning systems, the available technologies in this domain are limited and their performances are broadly affected by the frames that connect the glass panels. These frames generally act as a barrier for the glass façade cleaning robots to cross over from one glass panel to another, which leads to a performance degradation in terms of area coverage. We present a new class of façade cleaning robot with a biped mechanism that is able overcome these obstacles to maximize its area coverage. The developed robot uses active suction cups to adhere to glass walls and adopts mechanical linkage to navigate the glass surface to perform cleaning. This research addresses the design challenges in realizing the developed robot. Its control system consists of inverse kinematics, a fifth polynomial interpolation, and sequential control. Experiments were conducted in a real scenario, and the results indicate that the developed robot achieves significantly higher coverage performance by overcoming both negative and positive obstacles in a glass panel.
Article
Full-text available
The idea of having a compact and autonomous office or house window cleaning robot is quite simple and very attractive. This small window climbing robot with pneumatic suction cups should be able to move autonomously along an outside surface of high-rise building office window with a relatively large area and meantime clean and wash it. Being manually attached to the outside surface of the room window the robot will execute and accomplish the task of window cleaning automatically in a predefined pattern. The sensory system will help to navigate the robot. It is noted that window cleaning robots are commercially available but pricey (in the range of USD 5000 or more). The designed robot is lightweight, small size and cheap because it is driven only by one rotary actuator and system of properly arranged conventional belts and pulleys. It uses the suction cups to stick to the window pane and set of optical sensors to detect the window frame. The microcontroller is programmed to move the robot in a specific pattern depending on the sensory data. There are no similar reasonably priced rival products available in the market yet.
Conference Paper
Full-text available
The term Deep Learning or Deep Neural Network refers to Artificial Neural Networks (ANN) with multi layers . Over the last few decades, it has been considered to be one of the most powerful tools, and has become very popular in the literature as it is able to handle a huge amount of data. The interest in having deeper hidden layers has recently begun to surpass classical methods performance in different fields; especially in pattern recognition. One of the most popular deep neural networks is the Convolutional Neural Network (CNN). It take this name from mathematical linear operation between matrixes called convolution. CNN have multiple layers; including convolutional layer, non-linearity layer, pooling layer and fully-connected layer. The convolutional and fully- connected layers have parameters but pooling and non-linearity layers don't have parameters. The CNN has an excellent performance in machine learning problems. Specially the applications that deal with image data, such as largest image classification data set (Image Net), computer vision, and in natural language processing (NLP) and the results achieved were very amazing . In this paper we will explain and define all the elements and important issues related to CNN, and how these elements work. In addition, we will also state the parameters that effect CNN efficiency. This paper assumes that the readers have adequate knowledge about both machine learning and artificial neural network.
Article
The design of a modular window façades cleaning robot is challenging given the conditions under which these robots are required to operate. In this work, we attempt to extend the locomotion capabilities of these robots beyond what is currently feasible. The modular design of three equal interconnected sections of our robot, called Mantis, allows increasing the range concerning the work of cleaning window façades. Mantis has the ability to make transition from one window panel to another by crossing over the metallic panel. We implemented the inductive sensors to detect the metallic frame for autonomous crossover. The mechanical design and system architecture are introduced in detail, followed by a detailed description of the locomotion control and the sensor system for the classification of the metallic frame. The experimental results are presented to validate Mantis' abilities.
Article
We report on the mechanism, design iteration, and performance of a new glass facade cleaning robot, vSlider. The passive suction cups, driven by self-locking lead screws, are used to engage the vSlider robot to the glass facade. This mechanism has higher efficiency, compared to active suction cups, and offers better power consumption and safety in the case of power disruption or power loss. Due to the self-locking leadscrews, the counter-moment in a static position is not transferred to the motor, and thus, the servos which drive the lead screws only consume the power needed for a typical free load. A DC motor with encoder generates the primary locomotion in vSlider which was tested both in position- and velocity-control modes. This paper also details the design iteration efforts and discusses the key findings from the experiments involving the first prototype, vSlider 1.x, and the application of these findings in the development of the second prototype, vSlider 2.x. Experiments were performed to validate the proposed design approach and to benchmark the performance of the two robot prototypes that were developed.
Article
In this study, a nonlinear ultrasonic modulation technique based on dual laser excitation is proposed for fatigue-crack detection. Two pulsed lasers are fired at the target specimen to generate ultrasonic waves. The corresponding ultrasonic responses are measured by a laser Doppler vibrometer (LDV) and analyzed to extract the crack-induced nonlinear ultrasonic modulation. First, the effect of the pulsed-laser beam size on the frequency content of the generated ultrasonic waves is investigated numerically and experimentally. This beam-size effect is then exploited to generate wideband (WB) and narrowband (NB) ultrasonic waves by adjusting the beam sizes of the two excitation lasers. Nonlinear ultrasonic modulation results from the interaction of the WB and NB waves when a fatigue crack exists in the target specimen. The fatigue crack is then detected by comparing the spectral responses obtained under a single WB input and under combined WB and NB inputs. Finally, a fully noncontact dual-laser ultrasonic system is developed and used to detect micro fatigue cracks in aluminum and steel plate specimens.
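The detection principle can be illustrated numerically: a crack acts as a quadratic nonlinearity that mixes the two input tones, producing sidebands at the sum and difference frequencies, which are absent in an intact specimen. The sketch below uses two single-frequency stand-ins for the WB and NB inputs and an assumed nonlinearity coefficient; the frequencies and signal model are illustrative only.

```python
import math

def tone_amplitude(signal, fs, f):
    """Single-bin DFT: amplitude of the sinusoidal component at f Hz."""
    n = len(signal)
    re = sum(s * math.cos(2 * math.pi * f * k / fs) for k, s in enumerate(signal))
    im = sum(s * math.sin(2 * math.pi * f * k / fs) for k, s in enumerate(signal))
    return 2 * math.hypot(re, im) / n

fs = 10000
f_nb, f_wb = 1000.0, 150.0            # illustrative excitation frequencies
t = [k / fs for k in range(fs)]       # 1 s record

def response(cracked):
    beta = 0.2 if cracked else 0.0    # assumed crack-induced nonlinearity
    return [math.sin(2 * math.pi * f_nb * x) + math.sin(2 * math.pi * f_wb * x)
            + beta * math.sin(2 * math.pi * f_nb * x) * math.sin(2 * math.pi * f_wb * x)
            for x in t]

# Sideband at f_nb + f_wb appears only when the crack mixes the tones
sb = f_nb + f_wb
assert tone_amplitude(response(True), fs, sb) > 10 * tone_amplitude(response(False), fs, sb)
```

Comparing the spectral response with and without the NB input at these sideband bins is exactly the decision rule the abstract describes.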
Article
Recently, with a growing number of high-rise buildings in cities, interest in building-façade maintenance is increasing. Cleaning the exterior walls of existing high-rise buildings has traditionally relied on workers using ropes, gondolas, and winch systems. More recently, building maintenance units (BMUs) have been developed and applied to resolve safety problems and boost work efficiency. In Germany, the USA, France, and other countries, various robot systems for building-façade maintenance are being deployed; in South Korea, façade-cleaning robots that attach to curtain walls are also being developed. In this paper, we propose an integrated control system for the stable control of robots equipped with building-façade cleaning technology. The proposed control system is divided into three stages: a preparation stage, a cleaning stage, and a return stage. Each independent robot system performs tasks such as cleaning, moving, and obstacle detection according to the current stage. A wireless communication system was proposed and applied for stable communication between robots and for controlling the robot system. The proposed integrated control system was applied to building-façade cleaning robots, and its efficiency was verified against existing high-rise building cleaning methods.
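The three-stage cycle described above maps naturally onto a small state machine in which each robot advances preparation → cleaning → return as its current tasks complete. The class and stage names below are a hypothetical sketch of that structure, not code from the paper's control system.

```python
# Stages of the integrated control cycle, in execution order
STAGES = ["preparation", "cleaning", "return"]

class FacadeCleaningController:
    """Minimal stage machine: each robot advances through the three
    stages and reports 'done' once the return stage completes."""

    def __init__(self):
        self.stage_index = 0

    @property
    def stage(self):
        if self.stage_index < len(STAGES):
            return STAGES[self.stage_index]
        return "done"

    def complete_stage(self):
        # Called when the tasks of the current stage (cleaning, moving,
        # obstacle detection, ...) have all finished for this robot.
        if self.stage_index < len(STAGES):
            self.stage_index += 1
        return self.stage

ctrl = FacadeCleaningController()
assert ctrl.stage == "preparation"
assert ctrl.complete_stage() == "cleaning"
assert ctrl.complete_stage() == "return"
assert ctrl.complete_stage() == "done"
```

In the paper's system the stage transitions would be driven over the wireless link by the integrated controller rather than by local calls.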
Article
Regular inspection of nuclear power plant components is important to guarantee safe operation. However, current practice is time-consuming, tedious, and subjective, as it involves human technicians reviewing inspection videos and identifying cracks on reactors. The few vision-based crack-detection approaches developed for metallic surfaces typically perform poorly when used to analyze nuclear inspection videos; detecting these cracks is challenging because they are tiny and the components' surfaces exhibit noisy patterns. This study proposes a deep-learning framework, called NB-CNN, that analyzes individual video frames for crack detection, together with a novel data-fusion scheme that aggregates the information extracted from each frame to enhance the overall performance and robustness of the system. A Convolutional Neural Network (CNN) detects crack patches in each video frame, the data-fusion scheme maintains the spatiotemporal coherence of cracks across the video, and Naïve Bayes decision making discards false positives effectively. The proposed framework achieves a 98.3% hit rate at 0.1 false positives per frame, significantly higher than the state-of-the-art approaches presented in this paper.
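The value of the Naïve Bayes fusion step can be shown with a toy aggregation: per-frame CNN scores for one spatiotemporally tracked crack candidate are combined in log-odds under a conditional-independence assumption, so a consistent detection is reinforced while a one-frame glint is suppressed. This is an illustrative reconstruction of the idea, not the paper's exact formulation.

```python
import math

def naive_bayes_fusion(frame_probs, prior=0.5):
    """Fuse per-frame crack probabilities for one tracked candidate
    under a naive Bayes (conditional independence) assumption.
    Returns the posterior probability that the candidate is a real crack."""
    log_odds = math.log(prior / (1 - prior))
    for p in frame_probs:
        p = min(max(p, 1e-6), 1 - 1e-6)   # clamp to avoid log(0)
        log_odds += math.log(p / (1 - p))
    return 1 / (1 + math.exp(-log_odds))

# A real crack: consistently high CNN scores across the tracked frames
assert naive_bayes_fusion([0.8, 0.9, 0.85, 0.7]) > 0.99
# A transient false positive: one spurious high score among low ones
assert naive_bayes_fusion([0.9, 0.1, 0.05, 0.1]) < 0.05
```

Thresholding this posterior, rather than any single frame's score, is what lets the framework discard false positives while keeping the hit rate high.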
Article
Automated pavement distress detection and classification has remained one of the high-priority research areas for transportation agencies. In this paper, we employed a Deep Convolutional Neural Network (DCNN) trained on the ‘big data’ ImageNet database, which contains millions of images, and transferred that deep learning to automatically detect cracks in Hot-Mix Asphalt (HMA)- and Portland Cement Concrete (PCC)-surfaced pavement images that also include a variety of non-crack anomalies and defects. Apart from the common sources of false positives encountered in vision-based automated pavement crack detection, a significantly higher order of complexity was introduced in this study by training a classifier on combined HMA- and PCC-surfaced images, which have different surface characteristics. A single-layer neural network classifier (with the ‘adam’ optimizer) trained on ImageNet-pretrained VGG-16 DCNN features yielded the best performance.
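The transfer-learning recipe here is: freeze the ImageNet-pretrained VGG-16 backbone, use it only to extract fixed feature vectors, and train a single shallow classifier on top. The sketch below stands in for that final step with a plain logistic layer trained by gradient descent on toy 2-D "features" (a stand-in for the 4096-D VGG-16 features, and for the ‘adam’ optimizer); everything numeric is illustrative.

```python
import math

def train_single_layer(features, labels, lr=0.5, epochs=200):
    """Train a single-layer (logistic) classifier on fixed, precomputed
    DCNN features: the pretrained backbone stays frozen and only the
    weights of this final layer are learned."""
    w = [0.0] * len(features[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            p = 1 / (1 + math.exp(-z))
            g = p - y                       # gradient of the log-loss
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b

def predict(w, b, x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0

# Toy feature vectors: cracked patches cluster away from intact ones
crack  = [[1.0, 0.8], [0.9, 1.1], [1.2, 0.9]]
intact = [[0.0, 0.1], [0.1, 0.0], [0.2, 0.2]]
w, b = train_single_layer(crack + intact, [1, 1, 1, 0, 0, 0])
assert predict(w, b, [1.0, 1.0]) == 1
assert predict(w, b, [0.05, 0.1]) == 0
```

Because only this thin layer is trained, the approach needs far fewer labeled pavement images than training a DCNN from scratch, which is the motivation for transfer learning in this setting.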