IEIE Transactions on Smart Processing and Computing, vol. 10, no. 5, October 2021
https://doi.org/10.5573/IEIESPC.2021.10.5.390
Improved CNN-based Path Planning So an Autonomous
UAV Can Climb Stairs By using a LiDAR Sensor
Yeon Ji Choi1, Tariq Rahim2, and Soo Young Shin3*
Department of IT Convergence Engineering, Kumoh National Institute of Technology / Gumi, Korea
yzygzy@kumoh.ac.kr, tariqrahim@ieee.org, wdragon@kumoh.ac.kr
* Corresponding Author: Soo Young Shin
Received April 7, 2021; Revised May 25, 2021; Accepted July 9, 2021; Published October 30, 2021
* Regular Paper
* Extended from a Conference: Preliminary results of this paper were presented at ICEIC 2021. This paper has
been accepted by the editorial board through the regular reviewing process that confirms the original contribution.
Abstract: Unmanned aerial vehicles (UAVs) have tremendous potential in civil and public areas.
These are especially beneficial in applications where human lives would otherwise be threatened. Autonomous
navigation in unknown environments is a challenging issue for UAVs where decision-based
navigation is required. In this paper, a deep learning (DL) approach is presented that aids
autonomous navigation for UAVs in completely unknown, GPS-denied indoor environments. The
UAV is equipped with a monocular camera and a light detection and ranging (LiDAR) sensor to determine each next maneuver and to calculate distances, respectively. For deeper feature extraction, a version of You Only Look Once (YOLOv3-tiny) is improved by adding a convolution layer with different filter sizes. The process is treated as a classification exercise in which the DL model classifies the targeted image as stairs or not stairs. We created our dataset considering the indoor scenario for
specific implementation. Comprehensive experimental results are compared with YOLOv3-tiny,
indicating better performance in terms of accuracy, recall, F1-score, precision, and maneuvering
movements.
Keywords: UAVs, CNN, Path planning, Stair climbing, LiDAR sensor
1. Introduction
UAV use is growing in areas such as scientific research,
rescue missions, commerce, and agriculture. Originally,
UAVs were developed to be managed by an on-the-ground
pilot via remote-control communication [1]. Recently,
UAVs have been moving toward navigating with greater degrees of autonomy. Most UAVs employ global
navigation satellite system technology and inertial sensors
to determine their geospatial positioning. For stable UAV flight in indoor environments, factors such as GPS signal error, narrow passageways, and transparent glass must be overcome [2]. Studies in image-based stair
recognition for robots [3] and of techniques for ground
robots [4] are ongoing; however, there is a lack of such
research with UAVs. An abundance of techniques, varying
from learning-based to non–learning-based, have been
suggested to resolve UAV navigation dilemmas. The most
popular non–learning-based method is sensing and
avoidance, which prevents accidents by steering vehicles
in a reverse orientation and navigating by path planning [5,
6]. Another type of non–learning-based technique takes
advantage of simultaneous localization and mapping
(SLAM). The inspiration is that, after creating a map of the
surroundings by utilizing SLAM, navigation is
accomplished by path planning [7, 8]. The work in [7]
combines GraphSLAM [9] with an online path-planning module, proposing a UAV that determines obstacle-free trajectories in foliage. A general
characteristic of non–learning-based approaches is that
they demand precise path planning, which may result in
unanticipated failures when environments are extremely
dynamic and complicated. To address this matter, machine
learning (ML) methods such as imitation learning and
reinforcement learning (RL) have been explored [10-12].
For example, a model-based RL approach called
TEXPLORE [12] was presented, which is a high-level
control system for navigation of a UAV within a grid map
having no barriers. In addition, an imitation learning–based controller utilizing a small set of human demonstrations was presented that obtains reliable performance in forested areas [10].
Therefore, this paper proposes a convolutional neural
network (CNN)-based system based on real-time stair
recognition that can fly a UAV without colliding with
stairs, and that obtains distance information between walls
or stairs through 2D light detection and ranging (LiDAR)
with a camera mounted on the UAV. In addition,
algorithms were designed so that the system recognizes stairs (a common obstacle in autonomous indoor flight), avoids collisions, and maneuvers accordingly, and flight experiments were carried out after the actual UAV was implemented.
Deep learning (DL), a subcategory of machine learning within artificial intelligence (AI), uses multi-layer neural networks loosely modeled on the human brain. Many applications of
machine learning have been proposed, with different
signals representing data such as music signals [13], 2D
signals or images [14], and video signals [15]. CNNs are
used for various purposes, such as classification, detection,
and pattern recognition, especially in health [16], drone
applications [17], and autonomous driving systems.
Recently, You Only Look Once (YOLO) was introduced
for real-time detection of objects, with each version improving the trade-off between mean average precision (mAP) and frames per second [18].
In this work, we attempted for the first time to use the
YOLOv3-tiny model, and improved the model further by
adding a convolution layer to extract deep features for the
detection of stairs. This DL detection model was used in a
classification problem to determine each next maneuver.
The rest of this paper is organized as follows. Section 2
details related work, while Section 3 explains the proposed
scheme. Section 4 summarizes the experimental results
and the analysis. Section 5 provides concluding statements
and suggests the scope of future work.
2. Related Work
Previously, a 3D map of the local area was developed for autonomous UAV navigation. In some cases, these methods were used for precise quadcopter maneuvering [19, 20]. However, such methods rely on sophisticated control schemes, thereby restricting their use to laboratory settings [21-23]. In other approaches, the map is learned through manually flown routes, and quadcopters then travel the same path [24].
most outdoor flights (where precision is not as high as
indoors), a GPS-based posing projection is used.
Most applications use scale sensors, such as infrared
sensors, RGB-D (red, green, blue depth) sensors, or laser
range sensors [25]. In [26], a single ultrasonic sensor was used together with infrared sensors as an automated navigation device. A state-estimation method using LiDAR and an inertial measurement unit (IMU) was advanced to work independently in uncertain, GPS-denied conditions [27]. Range sensors have limitations, however, being heavy and high in power consumption.
The simultaneous localization and mapping (SLAM)
technique uses separate optical sensors to create a 3D
image [21-23] from every UAV position on the map. A 3D
map of an unknown indoor scenario was used for the
SLAM laser range finder [25]. The SLAM technique [29,
31] offers single-camera indoor navigation. SLAM is
highly complicated when it comes to regenerating the 3D
map region, requiring precise measurements and extensive
resources because additional sensors are needed.
SLAM can also introduce communication delays during real-time navigation; the studies in [31] and [32] addressed these issues. Although SLAM is primarily a practical system, its output on indoor surfaces (such as walls and ceilings) is not considered good, because their intensity gradients are very weak. An entire corridor comprises partitions, ceilings, and floors, so SLAM technologies cannot attain the desired navigational quality there.
3. The Proposed Scheme
This section discusses the system configuration for
UAV recognition of stairs, the deep learning model using
YOLOv3-tiny, and the improved YOLOv3-tiny for
detecting stairs.
3.1 System Configuration
The proposed system was designed based on
recognizing stairs with a camera mounted on the UAV for
indoor environments and on distances measured via the 2D
LiDAR sensor attached to the UAV’s side. Fig. 1 shows
the flowchart for the entire system. The connections and
communications between the parts are both wired and
wireless, as shown in Fig. 2. In particular, communications
among the ground control station, the UAV, and the
onboard PC is via Wi-Fi/LTE. Meanwhile, the wired
connection is only used for the sensor.
The system’s actual implementation uses a Parrot Bebop 2 drone, which is suitable for narrow passageways and convenient for carrying sensors. The UAV is equipped with an RPLiDAR S1 laser scanner, which rotates 360° and can measure distances up to 40 m, along with a lightweight Jetson TX2 embedded computing device on an Auvidea J120 carrier board, as shown in Fig. 3(c).

Fig. 1. Flowchart for the proposed implementation.

The
Lenovo ThinkPad T580 is used as a ground control system
(GCS), and the equipment required for the experiment is
listed in Table 1. All algorithms are implemented in Python, and the Robot Operating System (ROS, Kinetic distribution) was used as middleware (software that allows multiple different programs to run and communicate together).
The LiDAR sensor measures distances at 360 points, as shown in Fig. 3(b). Relative to the UAV's direction of travel, the distance data obtained by the LiDAR sensor correspond to 0° toward the floor, 90° toward the front, and 180° toward the ceiling. In the polar coordinate system, each raw laser point is defined as {(d_i, θ_i); 0 ≤ i ≤ 359}, where d_i is the distance from the UAV center to the object, and θ_i is the relative angle of measurement. The information obtained by the LiDAR is stored as a vector of (d_i, θ_i) pairs, and the stored data are checked so that infinite-range scan values are converted.
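As an illustration of this representation, a minimal preprocessing sketch in Python follows; the choice to clip infinite returns to the sensor's 40 m maximum range is an assumption, since the paper does not state what the infinity-scan values are converted to.

```python
import math
import numpy as np

MAX_RANGE_M = 40.0  # RPLiDAR S1 maximum range; used here as the clip value (assumption)

def preprocess_scan(ranges):
    """Convert a 360-point LiDAR sweep into (d_i, theta_i) pairs.

    `ranges` holds one distance per degree, where 0 = floor, 90 = front,
    and 180 = ceiling relative to the UAV's direction of travel. Infinite
    or NaN returns are replaced with the sensor's maximum range.
    """
    d = np.asarray(ranges, dtype=float)
    d[~np.isfinite(d)] = MAX_RANGE_M   # convert the infinity-scan values
    theta = np.arange(len(d))          # relative angle theta_i in degrees
    return list(zip(d, theta))

# Example: a sweep with an infinite return directly ahead (90 degrees)
scan = [2.0] * 360
scan[90] = math.inf
print(preprocess_scan(scan)[90])       # (40.0, 90)
```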
3.2 Stair-climbing System
Algorithm 1 is used by the UAV to climb stairs. The algorithm starts when stairs are recognized by the camera. If the distance between the UAV and the stairs is greater than r meters, the UAV moves straight ahead along the x-axis; if the distance is less than r meters, it performs a rising maneuver along the z-axis to avoid a collision. At that instant, if a staircase is no longer recognized, the stair-climbing mission is deemed complete, and recognition for climbing the next flight commences.
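Because the Algorithm 1 listing itself did not survive extraction, the following Python sketch reconstructs the loop just described; the threshold value, the velocities, and the helper methods stairs_detected(), front_distance(), move_x(), move_z(), and hover() are all assumptions made for illustration.

```python
import time

R_METERS = 1.0    # distance threshold r from Section 3.2 (value assumed)
FORWARD_V = 0.2   # forward velocity along the x-axis, m/s (assumed)
ASCEND_V = 0.2    # rising velocity along the z-axis, m/s (assumed)

def climb_stairs(uav):
    """Reconstruction of the stair-climbing loop described above.

    `uav` is assumed to expose the CNN detection result, the LiDAR front
    distance, and simple velocity commands along the body x- and z-axes.
    """
    while uav.stairs_detected():           # loop runs while stairs are seen
        if uav.front_distance() > R_METERS:
            uav.move_x(FORWARD_V)          # far from the step: go straight ahead
        else:
            uav.move_z(ASCEND_V)           # within r meters: rise to avoid collision
        time.sleep(0.1)
    # Stairs no longer recognized: this flight is complete; the UAV hovers
    # and recognition for the next flight of stairs commences.
    uav.hover()
```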
3.3 Deep Learning Model for Detection of
Stairs
In this study, a DL approach is implemented for
detecting stairs, which the drone uses to make decisions
intelligently in order to follow the stairs and determine the
next maneuver. In this work, we improved the YOLOv3-
tiny default model. The backbone of YOLO is darknet,
where the YOLOv3-tiny default model uses six max-
pooling and seven convolution layers. We modified it by
adding one more convolution layer. Since multi-class classification and detection is the problem at hand, regression is employed instead of the softmax function [33].
Fig. 2. Network connections and the architecture of the
proposed system.
Fig. 3. System configuration: (a) UAV movement axes;
(b) illustration of the RPLiDAR S1 scanning process;
(c) the 2D-LiDAR sensor and the Jetson-TX2 onboard
PC attached to the UAV; (d) the test environment.
Table 1. Experiment Parameters.
Device Model name Company
LiDAR sensor RPLiDAR S1 Slamtec
UAV Bebop drone 2 Parrot
Onboard PC Jetson TX2 Nvidia
Carrier board Auvidea J120 Auvidea
GCS ThinkPad T580 Lenovo
LTE modem LTE USB Stick Huawei
Algorithm 1. Stair-climbing algorithm.
The proposed model starts by dividing the stair-image input into a G × G grid in the training stage. A bounding box is used to label five features: width w, height h, vertical height v, and horizontal height u, as shown in Fig. 4, plus confidence score C, which represents the presence of stairs within the bounding box, and hence reflects detection accuracy.
In the proposed YOLOv3-tiny method, we attempt to make the model computationally inexpensive while also enabling it to extract more semantic features. Max-pooling is used after each convolution layer to reduce the
computational complexity and improve image feature
extraction. Fig. 6 shows the network architecture for both
the default and the improved YOLOv3-tiny models. The
loss function is obtained as an end-to-end network, and can
be expressed as follows [33]:
loss = \sum_{i=0}^{S^2} ( iouErr + coordErr + clsErr )        (1)
where iouErr, coordErr, and clsErr indicate the IOU error,
coordinates error, and classification error, respectively. We
used a rectified linear unit (ReLU) as an activation
function to achieve sparsity and reduce vanishing gradient
issues [25]. Table 2 details the training configuration
employed for both YOLOv3-tiny and the proposed
improved YOLOv3-tiny model.
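For concreteness, the sketch below shows, in PyTorch, the kind of backbone modification described above: the default tiny model's alternating convolution and max-pooling stack, plus one added convolution layer. Since Fig. 6 did not survive extraction, the channel widths and the kernel size of the added layer are assumptions.

```python
import torch
import torch.nn as nn

def conv_block(c_in, c_out, k=3):
    """Convolution followed by ReLU (used here for sparsity, per Section 3.3)."""
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, kernel_size=k, padding=k // 2),
        nn.ReLU(inplace=True),
    )

class ImprovedTinyBackbone(nn.Module):
    """Tiny-YOLO-style feature extractor with one extra convolution layer.

    The default YOLOv3-tiny backbone uses seven convolution layers and six
    max-pooling layers; an eighth convolution (here 1x1) is appended for
    deeper feature extraction. Widths and kernel sizes are assumptions.
    """
    def __init__(self):
        super().__init__()
        widths = [16, 32, 64, 128, 256, 512]
        layers, c_in = [], 3
        for c_out in widths:                     # six conv + max-pool pairs
            layers += [conv_block(c_in, c_out), nn.MaxPool2d(2, 2)]
            c_in = c_out
        layers += [conv_block(512, 1024)]        # seventh conv (default model)
        layers += [conv_block(1024, 1024, k=1)]  # added layer (improved model)
        self.features = nn.Sequential(*layers)

    def forward(self, x):
        return self.features(x)

# A 428 x 428 input, matching the resized stair images in Section 4
x = torch.randn(1, 3, 428, 428)
print(ImprovedTinyBackbone()(x).shape)  # torch.Size([1, 1024, 6, 6])
```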
3.4 ROS
The nodes, which are separated and managed by the ROS master, are shown in Fig. 5. Each topic continuously carries the results processed by its publisher node and makes them available to other nodes by subscription. The messages in the proposed system largely comprise UAV status messages, scan values obtained from the LiDAR, and visual messages obtained from the UAV camera. When running darknet on the ROS, the required messages among those published are subscribed to. Among them, a message containing bounding-box information is received through the darknet_ros node.
Fig. 5. ROS node graph.

Fig. 4. Definition of the bounding box.

Fig. 6. YOLO models: the default YOLOv3-tiny and the improved YOLOv3-tiny.

When the proposed DL model detects a staircase, a
message from the LiDAR is subscribed to as a trigger that allows the UAV to perform actions and maneuvers based on the incoming output. This process continues as long as detection is performed within darknet_ros.
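To make this message flow concrete, a minimal rospy node in this style is sketched below. The /darknet_ros/bounding_boxes and /scan topics follow the standard darknet_ros and LaserScan interfaces, but the class label, the Bebop command topic, the one-ray-per-degree scan resolution, and the maneuver velocities are assumptions.

```python
#!/usr/bin/env python
import rospy
from sensor_msgs.msg import LaserScan
from geometry_msgs.msg import Twist
from darknet_ros_msgs.msg import BoundingBoxes

R_METERS = 1.0  # distance threshold r (value assumed)

class StairClimbNode(object):
    """Subscribes to darknet_ros detections and LiDAR scans, publishes velocity."""

    def __init__(self):
        self.stairs_seen = False
        self.front_dist = float('inf')
        rospy.Subscriber('/darknet_ros/bounding_boxes', BoundingBoxes, self.on_boxes)
        rospy.Subscriber('/scan', LaserScan, self.on_scan)
        self.cmd_pub = rospy.Publisher('/bebop/cmd_vel', Twist, queue_size=1)

    def on_boxes(self, msg):
        # The detector classifies the image as stairs / not stairs
        self.stairs_seen = any(b.Class == 'stairs' for b in msg.bounding_boxes)

    def on_scan(self, msg):
        # Assumes one ray per degree; 90 degrees = front of the UAV
        self.front_dist = msg.ranges[90]

    def step(self):
        cmd = Twist()
        if self.stairs_seen:
            if self.front_dist > R_METERS:
                cmd.linear.x = 0.2   # move straight ahead on the x-axis
            else:
                cmd.linear.z = 0.2   # rise on the z-axis to avoid collision
        self.cmd_pub.publish(cmd)    # an all-zero Twist commands hovering

if __name__ == '__main__':
    rospy.init_node('stair_climb')
    node = StairClimbNode()
    rate = rospy.Rate(10)
    while not rospy.is_shutdown():
        node.step()
        rate.sleep()
```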
4. Experimental Results and Analysis
A dataset was created in the Kumoh National Institute
of Technology, South Korea, by employing a Bebop drone
that has a high-resolution camera and a GPS mounted on it.
The dataset comprises 1,000 images at a resolution of 1920 × 1080, resized to 428 × 428 before model training.
For training and testing purposes, the dataset was split
70% and 30%, respectively. Fig. 7 depicts the training
phase of the proposed improved YOLOv3-tiny model
where 20,000 epochs were set. As shown in Fig. 7, the
blue line represents the average loss achieved (0.215)
whereas the red line represents the highest mAP (91.6%).
The detection performance of the improved YOLOv3-
tiny model was benchmarked against the default model by
utilizing the same parametric configurations and dataset.
The metrics used to reflect the efficacy in stair detection of
both models are accuracy, recall, F1-score, and precision.
Table 3 shows that the proposed improved YOLOv3-tiny
model outperformed the default model in terms of
accuracy, recall, and F1-score. Although the precision value is lower, the higher values of the other performance metrics indicate stable overall performance from the model.
Fig. 8 shows the real-time detection of the proposed
model, where the top left image represents the starting
point of the UAV after takeoff, and the top right image
represents the middle position of the UAV when hovering
and climbing. In Fig. 8, the bottom left image shows the
last step of the stairs, while the bottom right image shows
the instant when the UAV was located at a distance of r
meters from the stairs.
For the experimental scenario, the set of stairs climbed
was 2.1 m long and 2.85 m wide, as shown in Fig. 3(d).
Based on Algorithm 1, some of the experiment’s results are shown in Fig. 9, depicting commands sent by the GCS
and the corresponding images from the built-in camera of
the UAV. In Fig. 9, we have tried to show the different
stages in the decisions made by the UAV, such as moving
forward or upward, hovering, and going to the next stair to
climb it. Furthermore, the actual trajectory-wise UAV
movement from the beginning of the staircase to the
beginning of the next step is shown in Fig. 10 as a 3D plot.
This movement started at approximately 0.8 m from the
starting point of the stairs. In total, 88 experiments were performed; Table 4 reports the takeoff and landing times for three representative runs, and the average time elapsed between takeoff and landing was 55.97 sec.
Table 2. Training Parameters for Both Models.
Parameters for training Configuration values
Image/stairs 428 × 428
Batch size 32
Learning rate 0.001
Optimizer Stochastic gradient descent
Decay 0.0005
Momentum 0.9
Epochs 20,000
Table 3. Performance of the Detection Scheme.
Metric       YOLOv3-tiny (%) [17]   Modified YOLOv3-tiny (%)
Accuracy     90.01                  92.06
Recall       89.00                  91.00
F1-score     83.00                  85.00
Precision    78.00                  73.00
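For reference, the four metrics in Table 3 are the standard ones computed from confusion counts on the stairs / not-stairs test split; the sketch below shows their definitions with made-up counts, not the paper's actual confusion matrix.

```python
def detection_metrics(tp, fp, fn, tn):
    """Accuracy, recall, F1-score, and precision as reported in Table 3."""
    precision = tp / float(tp + fp)
    recall = tp / float(tp + fn)
    accuracy = (tp + tn) / float(tp + fp + fn + tn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, recall, f1, precision

# Hypothetical counts for a 300-image test split (30% of 1,000 images)
print(detection_metrics(tp=140, fp=15, fn=12, tn=133))
```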
Fig. 7. Training phase of the improved YOLOv3-tiny.
Fig. 8. Detection results from the improved YOLOv3-tiny model.
Fig. 10. Trajectory of the UAV.

Table 4. Performance Time of the Proposed Stair-climbing Scheme.
No.       Takeoff   Landing
1         0:06.35   1:00.91
2         0:05.22   0:57.16
3         0:05.78   1:07.18
Average   0:05.78   1:01.75

Fig. 9. GCS screen commands and screenshots from the UAV’s built-in camera: (a) forward movement; (b) upward movement; (c) hovering; (d) going to the next stair.
5. Conclusion
In this study, we designed, implemented, and
experimented with a system in which a UAV recognizes
and climbs stairs, which are obstacles often encountered
during indoor flight. The system was implemented through
a CNN-based imaging process for real-time stair recognition and by using LiDAR-based distance measurements. The accuracy derived from stair recognition was 92.06%, and the actual test results showed that stair climbing was carried out without collisions.
Future research would require more efficient
algorithms to climb various types of stairs. Moreover, the
proposed system can be combined with SLAM navigation
to expand studies to systems that can autonomously fly
through multiple floors.
Acknowledgement
This work was supported by the Priority Research
Centers Program through the National Research
Foundation of Korea (NRF) funded by the Ministry of
Education, Science and Technology
(2018R1A6A1A03024003).
References
[1] P. R. Prasad, et al., "Monocular vision aided
autonomous UAV navigation in indoor corridor
environments." IEEE Transactions on Sustainable
Computing, Vol. 4, No. 1, pp. 96-108, 2018. Article
(CrossRefLink)
[2] Y. Lu, et al., "A survey on vision-based UAV
navigation." Geo-spatial information science, Vol. 21,
No. 1, pp. 21-32, 2018. Article (CrossRefLink)
[3] M. Ilyas, et al., "Design of sTetro: A Modular,
Reconfigurable, and Autonomous Staircase Cleaning
Robot," Journal of Sensors, Vol. 2018, 16 pages. Jul.
2018. Article (CrossRefLink)
[4] X. Gao, et al., “Dynamics and stability analysis on
stairs climbing of wheel–track mobile robot,”
International Journal of Advanced Robotic Systems,
Vol. 14, No. 4, pp. 1729881417720783, 2017. Article
(CrossRef Link)
[5] J. Israelsen, et al., "Automatic collision avoidance for
manually tele-operated unmanned aerial vehicles." In
2014 IEEE International Conference on Robotics and
Automation (ICRA), pp. 6638-6643, 2014. Article
(CrossRef Link)
[6] L. Chnibo, et al., "UAV position estimation and
collision avoidance using the extended Kalman
filter." IEEE Transactions on Vehicular Technology,
Vol. 62, No. 6, pp. 2749-2762, 2013. Article
(CrossRef Link)
[7] J. Cui, et al., "Autonomous navigation of UAV in
foliage environment." Journal of intelligent & robotic
systems, Vol. 84, No. 1 pp. 259-276, 2016. Article
(CrossRef Link)
[8] Z. Huizhong, et al., "StructSLAM: Visual SLAM
with building structure lines." IEEE Transactions on
Vehicular Technology, Vol. 64, No. 4 pp. 1364-1375,
2015. Article (CrossRef Link)
[9] A. E. Oguz, et al., "On the consistency analysis of A-SLAM for UAV navigation," Proc. SPIE 9084, Unmanned Systems Technology XVI, Vol. 9084, pp. 90840R, Jun. 2014. Article (CrossRef Link)
[10] S. Ross, et al., "Learning monocular reactive UAV control in cluttered natural environments." In 2013 IEEE International Conference on Robotics and Automation, pp. 1765-1772, 2013. Article (CrossRef Link)
[11] A. Faust, et al., "Automated aerial suspended cargo delivery through reinforcement learning." Artificial Intelligence, Vol. 247, pp. 381-398, 2017. Article (CrossRef Link)
[12] N. Imanberdiyev, et al., "Autonomous navigation of UAV by using real-time model-based reinforcement learning." In 2016 14th International Conference on Control, Automation, Robotics and Vision (ICARCV), pp. 1-6, 2016. Article (CrossRef Link)
[13] B. L. Sturm, et al., "Machine learning research that
matters for music creation: A case study," Journal of
New Music Research, Vol. 48, No.1, pp. 36-55, 2019.
Article (CrossRefLink)
[14] J. Raharjo, et al., “Cholesterol level measurement
through iris image using gray level co-occurrence
matrix and linear regression,” ARPN Journal of
Engineering and Applied Sciences, Vol. 14, No. 21,
pp. 3757–3763, Nov. 2019. Article (CrossRef Link)
[15] Y. Zhang, et al., "Machine learning based video
coding optimizations: A survey." Information
Sciences, Vol. 506, pp.395-423, Jan. 2020. Article
(CrossRef Link)
[16] M. Heidari, et al., "Improving the performance of
CNN to predict the likelihood of COVID-19 using
chest X-ray images with preprocessing algorithms,"
International journal of medical informatics, Vol. 144,
pp. 104284, Sep. 2020. Article (CrossRef Link)
[17] S. A. Hassan, et al., "Real-time UAV detection based on deep learning network," In 2019 International Conference on Information and Communication Technology Convergence, pp. 630-632, Oct. 2019. Article (CrossRef Link)
[18] J. Redmon, et al., “You only look once: Unified, real-
time object detection,” In Proceedings of the IEEE
conference on computer vision and pattern
recognition, pp. 779-788, 2016. Article (CrossRef
Link)
[19] D. Mellinger, et al., “Minimum snap trajectory
generation and control for quadrotors,” In 2011 IEEE
international conference on robotics and automation,
pp. 2520-2525, 2011. Article (CrossRef Link)
[20] D. Mellinger, et al., “Trajectory generation and
control for precise aggressive maneuvers with
quadrotors,” The International Journal of Robotics
Research, Vol. 31, No. 5, pp. 664-674, Jan. 2012.
Article (CrossRef Link)
[21] P. Checchin, et al., “Radar scan matching slam using
the fourier-mellin transform,” In Field and Service
Robotics, Vol. 62, pp. 151-161, 2010. Article
(CrossRef Link)
[22] J. Engel, et al., “LSD-SLAM: Large-scale direct
monocular SLAM,” In European conference on
computer vision, Vol. 8690, pp. 834-849, 2014.
Article (CrossRef Link)
[23] C. Mei, et al., “RSLAM: A system for large-scale
mapping in constant-time using stereo,” International Journal of Computer Vision, Vol. 94, No. 2, pp. 198-214, Jun. 2011. Article (CrossRef Link)
[24] M. Müller, et al., “Quadrocopter ball juggling,” in
2011 IEEE/RSJ International Conference on
Intelligent Robots and Systems, pp. 5113–5120, Sep.
2011. Article (CrossRef Link)
[25] A. S. Huang, et al., “Visual odometry and mapping
for autonomous flight using an RGB-D camera,”
Robotics Research. Vol. 100, pp. 235–252, Aug.
2011. Article (CrossRef Link)
[26] J. F. Roberts, et al., “Quadrotor using minimal
sensing for autonomous indoor flight,” In European
Micro Air Vehicle Conference and Flight
Competition (EMAV2007), Sep. 2007. Article
(CrossRef Link)
[27] A. Bry, et al., “State estimation for aggressive flight
in GPS-denied environments using onboard sensing,”
In 2012 IEEE International Conference on Robotics
and Automation, pp. 1-8, May, 2012. Article
(CrossRef Link)
[28] A. Bachrach, et al, “Autonomous flight in unknown
indoor environments,” International Journal of Micro
Air Vehicles, Vol. 1, No. 4, pp. 217-228, Dec. 2009.
Article (CrossRef Link)
[29] M. Achtelik, et al., "Onboard IMU and monocular
vision based control for MAVs in unknown in-and
outdoor environments." 2011 IEEE International
Conference on Robotics and Automation, pp. 3056-
3063, 2011. Article (CrossRef Link)
[30] M. Blösch, et al., "Vision based MAV navigation in
unknown and unstructured environments." 2010
IEEE International Conference on Robotics and
Automation, pp. 21-28, 2010. Article (CrossRef
Link)
[31] G. Nützi, et al., "Fusion of IMU and vision for
absolute scale estimation in monocular SLAM."
Journal of intelligent & robotic systems, Vol. 61, No.
1, pp. 287-299, Nov. 2011. Article (CrossRef Link)
[32] S. Weiss, et al., “Versatile distributed pose estimation
and sensor self-calibration for an autonomous MAV,”
In 2012 IEEE International Conference on Robotics
and Automation, pp. 31-38, 2012. Article (CrossRef
Link)
[33] T. Rahim, et al., “A Deep Convolutional Neural Network for the Detection of Polyps in Colonoscopy Images,” Biomedical Signal Processing and Control, Vol. 68, pp. 102654, 2021.
Yeonji Choi received her BSc in
Electrical Engineering in 2019 and
received her MSc from the Department
of IT Convergence Engineering at
Kumoh National Institute of
Technology (KIT) Gumi, South Korea,
in 2021. Currently, she is working as a graduate research assistant at the
Wireless and Emerging Network System (WENS) Lab in
the Department of IT Convergence Engineering, Kumoh
National Institute of Technology (KIT), Gumi, South
Korea. Her major research interests include intelligent
control and systems, Unmanned Aerial Vehicles, and
wireless communications.
Tariq Rahim is a PhD student in the
Wireless and Emerging Network
System Laboratory (WENS Lab) of the
Department of IT Convergence
Engineering, Kumoh National Institute
of Technology, Republic of Korea. He
completed his master’s degree in
Information and Communication
Engineering from Beijing Institute of Technology, PRC, in
2017. His research interests include image and video
processing and quality of experience for high-resolution
videos.
Soo Young Shin received his BSc,
MSc, and PhD in Electrical Engineering and Computer Science from
Seoul National University, Korea, in
1999, 2001, and 2006, respectively. He
was a visiting scholar for the FUN Lab
at the University of Washington,
U.S.A., from July 2006 to June 2007.
After three years working in the WiMAX Design Lab of
Samsung Electronics, he is now an associate professor for
the School of Electronics at Kumoh National Institute of
Technology, joining the institute in September 2010. His
research interests include wireless LANs, WPANs,
WBANs, wireless mesh networks, sensor networks,
coexistence among wireless networks, industrial and
military networks, cognitive radio networks, and next-
generation mobile wireless broadband networks.
Copyright © 2021 The Institute of Electronics and Information Engineers
Deployment of autonomous Unmanned Aerial Vehicles (UAV) in various sectors such as disaster hit environments, industries, agriculture etc. not only improves productivity but also reduces human intervention resulting in sustainable benefits. In this regard, we present a model for autonomous navigation and collision avoidance of UAVs in GPS-denied corridor environments. In the first stage, we suggest a fast procedure to estimate the set of parallel lines whose intersection would yield the position of the vanishing point (VP) inside the corridor. A suitable measure is then formulated based on the position of VP on the intersecting lines in reference to any of the image boundary axes which helps safe navigation of the UAV avoiding collisions with side walls. Furthermore, the relative Euclidean distance scale expansion of matched scale-invariant keypoints in a pair of frames is taken into account to estimate the depth of a frontal obstacle. However, turbulence in the UAV arising due to its rotors or external factors may intruduce uncertainty in depth estimation. It is rectified with the help of a constant velocity aided Kalman filter model. Necessary set of control commands are then generated to avoid the frontal collision. Exhaustive experiments advocate the efficacy of the proposed scheme.