IEIE Transactions on Smart Processing and Computing, vol. 10, no. 5, October 2021
https://doi.org/10.5573/IEIESPC.2021.10.5.390
Improved CNN-based Path Planning So an Autonomous
UAV Can Climb Stairs By using a LiDAR Sensor
Yeon Ji Choi1, Tariq Rahim2, and Soo Young Shin3*
Department of IT Convergence Engineering, Kumoh National Institute of Technology / Gumi, Korea
yzygzy@kumoh.ac.kr, tariqrahim@ieee.org, wdragon@kumoh.ac.kr
* Corresponding Author: Soo Young Shin
Received April 7, 2021; Revised May 25, 2021; Accepted July 9, 2021; Published October 30, 2021
* Regular Paper
* Extended from a Conference: Preliminary results of this paper were presented at ICEIC 2021. This paper has
been accepted by the editorial board through the regular reviewing process that confirms the original contribution.
Abstract: Unmanned aerial vehicles (UAVs) have tremendous potential in civil and public areas.
These are especially beneficial in applications where human lives would otherwise be threatened. Autonomous
navigation in unknown environments is a challenging issue for UAVs where decision-based
navigation is required. In this paper, a deep learning (DL) approach is presented that aids
autonomous navigation for UAVs in completely unknown, GPS-denied indoor environments. The
UAV is equipped with a monocular camera and a light detection and ranging (LiDAR) sensor to determine each next maneuver and to calculate distances, respectively. For deeper feature extraction, a version of You Only Look Once (YOLOv3-tiny) is improved by adding a convolution layer with different filter sizes. The process is treated as a classification exercise in which the DL model classifies the targeted image as stairs or not stairs. We created our dataset considering the indoor scenario for
specific implementation. Comprehensive experimental results are compared with YOLOv3-tiny,
indicating better performance in terms of accuracy, recall, F1-score, precision, and maneuvering
movements.
Keywords: UAVs, CNN, Path planning, Stair climbing, LiDAR sensor
1. Introduction
UAV use is growing in areas such as scientific research,
rescue missions, commerce, and agriculture. Originally,
UAVs were developed to be managed by an on-the-ground
pilot via remote-control communication [1]. Recently,
UAVs have been moving toward navigating with greater degrees of autonomy. Most UAVs employ global
navigation satellite system technology and inertial sensors
to determine their geospatial positioning. For stable UAV flight in indoor environments, factors such as GPS signal error, narrow passageways, and transparent glass must be overcome [2]. Studies in image-based stair
recognition for robots [3] and of techniques for ground
robots [4] are ongoing; however, there is a lack of such
research with UAVs. An abundance of techniques, varying
from learning-based to non–learning-based, have been
suggested to resolve UAV navigation dilemmas. The most
popular non–learning-based method is sensing and
avoidance, which prevents accidents by steering vehicles
in a reverse orientation and navigating by path planning [5,
6]. Another type of non–learning-based technique takes
advantage of simultaneous localization and mapping
(SLAM). The inspiration is that, after creating a map of the
surroundings by utilizing SLAM, navigation is
accomplished by path planning [7, 8]. The work in [7]
combines GraphSLAM [9] with an online path-planning module, proposing a UAV that determines obstacle-free trajectories in foliage. A general
characteristic of non–learning-based approaches is that
they demand precise path planning, which may result in
unanticipated failures when environments are extremely
dynamic and complicated. To address this matter, machine
learning (ML) methods such as imitation learning and
reinforcement learning (RL) have been explored [10-12].
For example, a model-based RL approach called
TEXPLORE [12] was presented, which is a high-level
control system for navigation of a UAV within a grid map
having no barriers. In addition, an imitation learning–based controller utilizing a small set of human demonstrations was presented that obtains reliable performance in forested areas [10].
Therefore, this paper proposes a convolutional neural
network (CNN)-based system based on real-time stair
recognition that can fly a UAV without colliding with
stairs, and that obtains distance information between walls
or stairs through 2D light detection and ranging (LiDAR)
with a camera mounted on the UAV. In addition,
algorithms were designed so that the system recognizes stairs (a common obstacle in autonomous indoor flight), avoids collisions, and maneuvers accordingly, and flight experiments were carried out after the actual UAV was implemented.
Deep learning (DL), a subcategory of machine learning within artificial intelligence (AI), uses multi-layer neural networks loosely modeled on the human brain. Many applications of
machine learning have been proposed, with different
signals representing data such as music signals [13], 2D
signals or images [14], and video signals [15]. CNNs are
used for various purposes, such as classification, detection,
and pattern recognition, especially in health [16], drone
applications [17], and autonomous driving systems.
Recently, You Only Look Once (YOLO) was introduced
for real-time detection of objects, with each version improving the trade-off between mean average precision (mAP) and frames per second [18].
In this work, we attempted for the first time to use the
YOLOv3-tiny model, and improved the model further by
adding a convolution layer to extract deep features for the
detection of stairs. This DL detection model was used in a
classification problem to determine each next maneuver.
The rest of this paper is organized as follows. Section 2
details related work, while Section 3 explains the proposed
scheme. Section 4 summarizes the experimental results
and the analysis. Section 5 provides concluding statements
and suggests the scope of future work.
2. Related Work
Previously, a 3D map of the local area was developed for autonomous UAV navigation. In some cases, these methods were used for precise quadcopter maneuvering [19, 20]. However, such methods rely on sophisticated control schemes, thereby restricting their use to laboratory settings [21-23]. In other approaches, the map is learned through manually flown routes, and quadcopters then travel the same path [24].
most outdoor flights (where precision is not as high as
indoors), a GPS-based posing projection is used.
Most applications use scale sensors, such as infrared
sensors, RGB-D (red, green, blue depth) sensors, or laser
range sensors [25]. In [26], a single ultrasonic sensor was used together with infrared sensors as an automated navigation device. A state-estimation method using LiDAR and an inertial measurement unit (IMU) was advanced to work independently in uncertain, GPS-denied conditions [27]. Range sensors have limitations, however, being heavy and high in power consumption.
The simultaneous localization and mapping (SLAM)
technique uses separate optical sensors to create a 3D
image [21-23] from every UAV position on the map. A 3D
map of an unknown indoor scenario was used for the
SLAM laser range finder [25]. The SLAM technique [29,
31] offers single-camera indoor navigation. SLAM is
highly complicated when it comes to regenerating the 3D
map region, requiring precise measurements and extensive
resources because additional sensors are needed.
SLAM can also introduce communication delays during real-time navigation; the studies in [31] and [32] addressed these issues. Although SLAM is primarily a practical system, its output on indoor surfaces (such as walls and ceilings) is not considered good, because their intensity gradients are very weak. An entire corridor comprises partitions, ceilings, and floors, so SLAM technologies cannot attain the desired navigational quality there.
3. The Proposed Scheme
This section discusses the system configuration for
UAV recognition of stairs, the deep learning model using
YOLOv3-tiny, and the improved YOLOv3-tiny for
detecting stairs.
3.1 System Configuration
The proposed system was designed based on
recognizing stairs with a camera mounted on the UAV for
indoor environments and on distances measured via the 2D
LiDAR sensor attached to the UAV’s side. Fig. 1 shows
the flowchart for the entire system. The connections and
communications between the parts are both wired and
wireless, as shown in Fig. 2. In particular, communications
among the ground control station, the UAV, and the
onboard PC is via Wi-Fi/LTE. Meanwhile, the wired
connection is only used for the sensor.
The system’s actual implementation uses a Parrot Bebop 2 drone, which is suitable for narrow passageways and convenient for carrying sensors. The UAV is equipped with an RPLiDAR S1 laser scanner, which rotates 360° and can measure distances up to 40 m, along with a lightweight Jetson TX2 embedded computing device on an Auvidea J120 carrier board, as shown in Fig. 3(c).

Fig. 1. Flowchart for the proposed implementation.

The
Lenovo ThinkPad T580 is used as a ground control system
(GCS), and the equipment required for the experiment is
listed in Table 1. All algorithms are implemented in Python, and the Robot Operating System (ROS, Kinetic distribution) was used as middleware (software that allows multiple different programs to run and communicate together).
The LiDAR sensor measures distances at 360 points, as shown in Fig. 3(b). Relative to the UAV's direction of travel, the distance data obtained by the LiDAR sensor correspond to 0° toward the floor, 90° toward the front, and 180° toward the ceiling. In the polar coordinate system, each raw laser point is defined as {(d_i, θ_i); 0 ≤ i ≤ 359}, where d_i is the distance from the UAV center to the object, and θ_i is the relative angle of measurement. The information obtained by the LiDAR is stored as a vector of (d_i, θ_i) pairs, and the stored data are checked so that infinite-range scan values are converted.
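As an illustration of this representation, a minimal preprocessing sketch in Python follows; the choice to clip infinite returns to the sensor's 40 m maximum range is an assumption, since the paper does not state what the infinity-scan values are converted to.

```python
import math
import numpy as np

MAX_RANGE_M = 40.0  # RPLiDAR S1 maximum range; used here as the clip value (assumption)

def preprocess_scan(ranges):
    """Convert a 360-point LiDAR sweep into (d_i, theta_i) pairs.

    `ranges` holds one distance per degree, where 0 = floor, 90 = front,
    and 180 = ceiling relative to the UAV's direction of travel. Infinite
    or NaN returns are replaced with the sensor's maximum range.
    """
    d = np.asarray(ranges, dtype=float)
    d[~np.isfinite(d)] = MAX_RANGE_M   # convert the infinity-scan values
    theta = np.arange(len(d))          # relative angle theta_i in degrees
    return list(zip(d, theta))

# Example: a sweep with an infinite return directly ahead (90 degrees)
scan = [2.0] * 360
scan[90] = math.inf
print(preprocess_scan(scan)[90])       # (40.0, 90)
```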
3.2 Stair-climbing System
Algorithm 1 is used by the UAV to climb stairs. The algorithm starts when stairs are recognized by the camera. If the distance between the UAV and the stairs is greater than r meters, the UAV moves straight ahead along the x-axis; if the distance is less than r meters, it performs a rising maneuver along the z-axis to avoid a collision. At that instant, if a staircase is no longer recognized, the stair-climbing mission is deemed complete, and recognition for climbing the next flight commences.
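Because the Algorithm 1 listing itself did not survive extraction, the following Python sketch reconstructs the loop just described; the threshold value, the velocities, and the helper methods stairs_detected(), front_distance(), move_x(), move_z(), and hover() are all assumptions made for illustration.

```python
import time

R_METERS = 1.0    # distance threshold r from Section 3.2 (value assumed)
FORWARD_V = 0.2   # forward velocity along the x-axis, m/s (assumed)
ASCEND_V = 0.2    # rising velocity along the z-axis, m/s (assumed)

def climb_stairs(uav):
    """Reconstruction of the stair-climbing loop described above.

    `uav` is assumed to expose the CNN detection result, the LiDAR front
    distance, and simple velocity commands along the body x- and z-axes.
    """
    while uav.stairs_detected():           # loop runs while stairs are seen
        if uav.front_distance() > R_METERS:
            uav.move_x(FORWARD_V)          # far from the step: go straight ahead
        else:
            uav.move_z(ASCEND_V)           # within r meters: rise to avoid collision
        time.sleep(0.1)
    # Stairs no longer recognized: this flight is complete; the UAV hovers
    # and recognition for the next flight of stairs commences.
    uav.hover()
```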
3.3 Deep Learning Model for Detection of
Stairs
In this study, a DL approach is implemented for
detecting stairs, which the drone uses to make decisions
intelligently in order to follow the stairs and determine the
next maneuver. In this work, we improved the YOLOv3-
tiny default model. The backbone of YOLO is darknet,
where the YOLOv3-tiny default model uses six max-
pooling and seven convolution layers. We modified it by
adding one more convolution layer. Since multi-class classification and detection is the problem at hand, regression is employed instead of the softmax function [33].
Fig. 2. Network connections and the architecture of the
proposed system.
Fig. 3. System configuration: (a) UAV movement axes;
(b) illustration of the RPLiDAR S1 scanning process;
(c) the 2D-LiDAR sensor and the Jetson-TX2 onboard
PC attached to the UAV; (d) the test environment.
Table 1. Experiment Parameters.
Device Model name Company
LiDAR sensor RPLiDAR S1 Slamtec
UAV Bebop drone 2 Parrot
Onboard PC Jetson TX2 Nvidia
Carrier board Auvidea J120 Auvidea
GCS ThinkPad T580 Lenovo
LTE modem LTE USB Stick Huawei
Algorithm 1. Stair-climbing algorithm.
The proposed model starts by dividing the stair-image input into a G × G grid in the training stage. A bounding box is used to label five features: width w, height h, vertical height v, and horizontal height u, as shown in Fig. 4, plus confidence score C, which represents the presence of stairs within the bounding box, and hence reflects detection accuracy.
In the proposed YOLOv3-tiny method, we attempt to make the model computationally inexpensive while also enabling it to extract more semantic features. Max-pooling is used after each convolution layer to reduce the
computational complexity and improve image feature
extraction. Fig. 6 shows the network architecture for both
the default and the improved YOLOv3-tiny models. The
loss function is obtained as an end-to-end network, and can
be expressed as follows [33]:
loss = \sum_{i=0}^{S^2} ( iouErr + coordErr + clsErr )        (1)
where iouErr, coordErr, and clsErr indicate the IOU error,
coordinates error, and classification error, respectively. We
used a rectified linear unit (ReLU) as an activation
function to achieve sparsity and reduce vanishing gradient
issues [25]. Table 2 details the training configuration
employed for both YOLOv3-tiny and the proposed
improved YOLOv3-tiny model.
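For concreteness, the sketch below shows, in PyTorch, the kind of backbone modification described above: the default tiny model's alternating convolution and max-pooling stack, plus one added convolution layer. Since Fig. 6 did not survive extraction, the channel widths and the kernel size of the added layer are assumptions.

```python
import torch
import torch.nn as nn

def conv_block(c_in, c_out, k=3):
    """Convolution followed by ReLU (used here for sparsity, per Section 3.3)."""
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, kernel_size=k, padding=k // 2),
        nn.ReLU(inplace=True),
    )

class ImprovedTinyBackbone(nn.Module):
    """Tiny-YOLO-style feature extractor with one extra convolution layer.

    The default YOLOv3-tiny backbone uses seven convolution layers and six
    max-pooling layers; an eighth convolution (here 1x1) is appended for
    deeper feature extraction. Widths and kernel sizes are assumptions.
    """
    def __init__(self):
        super().__init__()
        widths = [16, 32, 64, 128, 256, 512]
        layers, c_in = [], 3
        for c_out in widths:                     # six conv + max-pool pairs
            layers += [conv_block(c_in, c_out), nn.MaxPool2d(2, 2)]
            c_in = c_out
        layers += [conv_block(512, 1024)]        # seventh conv (default model)
        layers += [conv_block(1024, 1024, k=1)]  # added layer (improved model)
        self.features = nn.Sequential(*layers)

    def forward(self, x):
        return self.features(x)

# A 428 x 428 input, matching the resized stair images in Section 4
x = torch.randn(1, 3, 428, 428)
print(ImprovedTinyBackbone()(x).shape)  # torch.Size([1, 1024, 6, 6])
```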
3.4 ROS
The nodes, which are separated and managed by the ROS master, are shown in Fig. 5. Each topic continuously carries the results processed by its publisher node and makes them available to other nodes by subscription. The messages in the proposed system largely comprise UAV status messages, scan values obtained from the LiDAR, and visual messages obtained from the UAV camera. When running darknet on the ROS, the required messages among those published are subscribed to. Among them, a message containing bounding-box information is received through the darknet_ros node.
Fig. 5. ROS node graph.

Fig. 4. Definition of the bounding box.

Fig. 6. YOLO models: the default YOLOv3-tiny and the improved YOLOv3-tiny.

When the proposed DL model detects a staircase, a
message from the LiDAR is subscribed to as a trigger that allows the UAV to perform actions and maneuvers based on the incoming output. This process continues as long as detection is performed within darknet_ros.
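To make this message flow concrete, a minimal rospy node in this style is sketched below. The /darknet_ros/bounding_boxes and /scan topics follow the standard darknet_ros and LaserScan interfaces, but the class label, the Bebop command topic, the one-ray-per-degree scan resolution, and the maneuver velocities are assumptions.

```python
#!/usr/bin/env python
import rospy
from sensor_msgs.msg import LaserScan
from geometry_msgs.msg import Twist
from darknet_ros_msgs.msg import BoundingBoxes

R_METERS = 1.0  # distance threshold r (value assumed)

class StairClimbNode(object):
    """Subscribes to darknet_ros detections and LiDAR scans, publishes velocity."""

    def __init__(self):
        self.stairs_seen = False
        self.front_dist = float('inf')
        rospy.Subscriber('/darknet_ros/bounding_boxes', BoundingBoxes, self.on_boxes)
        rospy.Subscriber('/scan', LaserScan, self.on_scan)
        self.cmd_pub = rospy.Publisher('/bebop/cmd_vel', Twist, queue_size=1)

    def on_boxes(self, msg):
        # The detector classifies the image as stairs / not stairs
        self.stairs_seen = any(b.Class == 'stairs' for b in msg.bounding_boxes)

    def on_scan(self, msg):
        # Assumes one ray per degree; 90 degrees = front of the UAV
        self.front_dist = msg.ranges[90]

    def step(self):
        cmd = Twist()
        if self.stairs_seen:
            if self.front_dist > R_METERS:
                cmd.linear.x = 0.2   # move straight ahead on the x-axis
            else:
                cmd.linear.z = 0.2   # rise on the z-axis to avoid collision
        self.cmd_pub.publish(cmd)    # an all-zero Twist commands hovering

if __name__ == '__main__':
    rospy.init_node('stair_climb')
    node = StairClimbNode()
    rate = rospy.Rate(10)
    while not rospy.is_shutdown():
        node.step()
        rate.sleep()
```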
4. Experimental Results and Analysis
A dataset was created in the Kumoh National Institute
of Technology, South Korea, by employing a Bebop drone
that has a high-resolution camera and a GPS mounted on it.
The dataset comprises 1,000 images at a resolution of 1920 × 1080, resized to 428 × 428 before model training.
For training and testing purposes, the dataset was split
70% and 30%, respectively. Fig. 7 depicts the training
phase of the proposed improved YOLOv3-tiny model
where 20,000 epochs were set. As shown in Fig. 7, the
blue line represents the average loss achieved (0.215)
whereas the red line represents the highest mAP (91.6%).
The detection performance of the improved YOLOv3-
tiny model was benchmarked against the default model by
utilizing the same parametric configurations and dataset.
The metrics used to reflect the efficacy in stair detection of
both models are accuracy, recall, F1-score, and precision.
Table 3 shows that the proposed improved YOLOv3-tiny
model outperformed the default model in terms of
accuracy, recall, and F1-score. Although the precision value is lower, the higher values of the other performance metrics indicate stable overall performance from the model.
Fig. 8 shows the real-time detection of the proposed
model, where the top left image represents the starting
point of the UAV after takeoff, and the top right image
represents the middle position of the UAV when hovering
and climbing. In Fig. 8, the bottom left image shows the
last step of the stairs, while the bottom right image shows
the instant when the UAV was located at a distance of r
meters from the stairs.
For the experimental scenario, the set of stairs climbed
was 2.1 m long and 2.85 m wide, as shown in Fig. 3(d).
Based on Algorithm 1, some of the experiment’s results are shown in Fig. 9, depicting commands sent by the GCS
and the corresponding images from the built-in camera of
the UAV. In Fig. 9, we have tried to show the different
stages in the decisions made by the UAV, such as moving
forward or upward, hovering, and going to the next stair to
climb it. Furthermore, the actual trajectory-wise UAV
movement from the beginning of the staircase to the
beginning of the next step is shown in Fig. 10 as a 3D plot.
This movement started at approximately 0.8 m from the
starting point of the stairs. In total, 88 experiments were performed; Table 4 reports the takeoff and landing times for three representative runs, and the average time elapsed between takeoff and landing was 55.97 sec.
Table 2. Training Parameters for Both Models.
Parameters for training Configuration values
Image/stairs 428 × 428
Batch size 32
Learning rate 0.001
Optimizer Stochastic gradient descent
Decay 0.0005
Momentum 0.9
Epochs 20,000
Table 3. Performance of the Detection Scheme.
Metric       YOLOv3-tiny (%) [17]   Modified YOLOv3-tiny (%)
Accuracy     90.01                  92.06
Recall       89.00                  91.00
F1-score     83.00                  85.00
Precision    78.00                  73.00
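For reference, the four metrics in Table 3 are the standard ones computed from confusion counts on the stairs / not-stairs test split; the sketch below shows their definitions with made-up counts, not the paper's actual confusion matrix.

```python
def detection_metrics(tp, fp, fn, tn):
    """Accuracy, recall, F1-score, and precision as reported in Table 3."""
    precision = tp / float(tp + fp)
    recall = tp / float(tp + fn)
    accuracy = (tp + tn) / float(tp + fp + fn + tn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, recall, f1, precision

# Hypothetical counts for a 300-image test split (30% of 1,000 images)
print(detection_metrics(tp=140, fp=15, fn=12, tn=133))
```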
Fig. 7. Training phase of the improved YOLOv3-tiny.
Fig. 8. Detection results from the improved YOLOv3-tiny model.
Fig. 10. Trajectory of the UAV.

Table 4. Performance Time of the Proposed Stair-climbing Scheme.
No.       Takeoff   Landing
1         0:06.35   1:00.91
2         0:05.22   0:57.16
3         0:05.78   1:07.18
Average   0:05.78   1:01.75

Fig. 9. GCS screen commands and screenshots from the UAV’s built-in camera: (a) forward movement; (b) upward movement; (c) hovering; (d) going to the next stair.
5. Conclusion
In this study, we designed, implemented, and
experimented with a system in which a UAV recognizes
and climbs stairs, which are obstacles often encountered
during indoor flight. The system was implemented through
a CNN-based imaging process for real-time stair recognition and by using LiDAR-based distance measurements. The accuracy derived from stair recognition was 92.06%, and the actual test results showed that stair climbing was carried out without collisions.
Future research would require more efficient
algorithms to climb various types of stairs. Moreover, the
proposed system can be combined with SLAM navigation
to expand studies to systems that can autonomously fly
through multiple floors.
Acknowledgement
This work was supported by the Priority Research
Centers Program through the National Research
Foundation of Korea (NRF) funded by the Ministry of
Education, Science and Technology
(2018R1A6A1A03024003).
References
[1] P. R. Prasad, et al., "Monocular vision aided
autonomous UAV navigation in indoor corridor
environments." IEEE Transactions on Sustainable
Computing, Vol. 4, No. 1, pp. 96-108, 2018. Article
(CrossRefLink)
[2] Y. Lu, et al., "A survey on vision-based UAV
navigation." Geo-spatial information science, Vol. 21,
No. 1, pp. 21-32, 2018. Article (CrossRefLink)
[3] M. Ilyas, et al., "Design of sTetro: A Modular,
Reconfigurable, and Autonomous Staircase Cleaning
Robot," Journal of Sensors, Vol. 2018, 16 pages. Jul.
2018. Article (CrossRefLink)
[4] X. Gao, et al., “Dynamics and stability analysis on
stairs climbing of wheel–track mobile robot,”
International Journal of Advanced Robotic Systems,
Vol. 14, No. 4, pp. 1729881417720783, 2017. Article
(CrossRef Link)
[5] J. Israelsen, et al., "Automatic collision avoidance for
manually tele-operated unmanned aerial vehicles." In
2014 IEEE International Conference on Robotics and
Automation (ICRA), pp. 6638-6643, 2014. Article
(CrossRef Link)
[6] L. Chnibo, et al., "UAV position estimation and
collision avoidance using the extended Kalman
filter." IEEE Transactions on Vehicular Technology,
Vol. 62, No. 6, pp. 2749-2762, 2013. Article
(CrossRef Link)
[7] J. Cui, et al., "Autonomous navigation of UAV in
foliage environment." Journal of intelligent & robotic
systems, Vol. 84, No. 1 pp. 259-276, 2016. Article
(CrossRef Link)
[8] Z. Huizhong, et al., "StructSLAM: Visual SLAM
with building structure lines." IEEE Transactions on
Vehicular Technology, Vol. 64, No. 4 pp. 1364-1375,
2015. Article (CrossRef Link)
[9] A. E. Oguz, et al., "On the consistency analysis of A-SLAM for UAV navigation," Proc. SPIE 9084, Unmanned Systems Technology XVI, Vol. 9084, pp. 90840R, Jun. 2014. Article (CrossRef Link)
[10] S. Ross, et al., "Learning monocular reactive UAV control in cluttered natural environments." In 2013 IEEE International Conference on Robotics and Automation, pp. 1765-1772, 2013. Article (CrossRef Link)
[11] A. Faust, et al., "Automated aerial suspended cargo delivery through reinforcement learning." Artificial Intelligence, Vol. 247, pp. 381-398, 2017. Article (CrossRef Link)
[12] N. Imanberdiyev, et al., "Autonomous navigation of UAV by using real-time model-based reinforcement learning." In 2016 14th International Conference on Control, Automation, Robotics and Vision (ICARCV), pp. 1-6, 2016. Article (CrossRef Link)
[13] B. L. Sturm, et al., "Machine learning research that
matters for music creation: A case study," Journal of
New Music Research, Vol. 48, No.1, pp. 36-55, 2019.
Article (CrossRefLink)
[14] J. Raharjo, et al., “Cholesterol level measurement
through iris image using gray level co-occurrence
matrix and linear regression,” ARPN Journal of
Engineering and Applied Sciences, Vol. 14, No. 21,
pp. 3757–3763, Nov. 2019. Article (CrossRef Link)
[15] Y. Zhang, et al., "Machine learning based video
coding optimizations: A survey." Information
Sciences, Vol. 506, pp.395-423, Jan. 2020. Article
(CrossRef Link)
[16] M. Heidari, et al., "Improving the performance of
CNN to predict the likelihood of COVID-19 using
chest X-ray images with preprocessing algorithms,"
International journal of medical informatics, Vol. 144,
pp. 104284, Sep. 2020. Article (CrossRef Link)
[17] S. A. Hassan, et al., "Real-time UAV detection based on deep learning network," In 2019 International Conference on Information and Communication Technology Convergence, pp. 630-632, Oct. 2019. Article (CrossRef Link)
[18] J. Redmon, et al., “You only look once: Unified, real-
time object detection,” In Proceedings of the IEEE
conference on computer vision and pattern
recognition, pp. 779-788, 2016. Article (CrossRef
Link)
[19] D. Mellinger, et al., “Minimum snap trajectory
generation and control for quadrotors,” In 2011 IEEE
international conference on robotics and automation,
pp. 2520-2525, 2011. Article (CrossRef Link)
[20] D. Mellinger, et al., “Trajectory generation and
control for precise aggressive maneuvers with
quadrotors,” The International Journal of Robotics
Research, Vol. 31, No. 5, pp. 664-674, Jan. 2012.
Article (CrossRef Link)
[21] P. Checchin, et al., “Radar scan matching slam using
the fourier-mellin transform,” In Field and Service
Robotics, Vol. 62, pp. 151-161, 2010. Article
(CrossRef Link)
[22] J. Engel, et al., “LSD-SLAM: Large-scale direct
monocular SLAM,” In European conference on
computer vision, Vol. 8690, pp. 834-849, 2014.
Article (CrossRef Link)
[23] C. Mei, et al., “RSLAM: A system for large-scale
mapping in constant-time using stereo,” International Journal of Computer Vision, Vol. 94, No. 2, pp. 198-214, Jun. 2011. Article (CrossRef Link)
[24] M. Müller, et al., “Quadrocopter ball juggling,” in
2011 IEEE/RSJ International Conference on
Intelligent Robots and Systems, pp. 5113–5120, Sep.
2011. Article (CrossRef Link)
[25] A. S. Huang, et al., “Visual odometry and mapping
for autonomous flight using an RGB-D camera,”
Robotics Research. Vol. 100, pp. 235–252, Aug.
2011. Article (CrossRef Link)
[26] J. F. Roberts, et al., “Quadrotor using minimal
sensing for autonomous indoor flight,” In European
Micro Air Vehicle Conference and Flight
Competition (EMAV2007), Sep. 2007. Article
(CrossRef Link)
[27] A. Bry, et al., “State estimation for aggressive flight
in GPS-denied environments using onboard sensing,”
In 2012 IEEE International Conference on Robotics
and Automation, pp. 1-8, May, 2012. Article
(CrossRef Link)
[28] A. Bachrach, et al, “Autonomous flight in unknown
indoor environments,” International Journal of Micro
Air Vehicles, Vol. 1, No. 4, pp. 217-228, Dec. 2009.
Article (CrossRef Link)
[29] M. Achtelik, et al., "Onboard IMU and monocular
vision based control for MAVs in unknown in-and
outdoor environments." 2011 IEEE International
Conference on Robotics and Automation, pp. 3056-
3063, 2011. Article (CrossRef Link)
[30] M. Blösch, et al., "Vision based MAV navigation in
unknown and unstructured environments." 2010
IEEE International Conference on Robotics and
Automation, pp. 21-28, 2010. Article (CrossRef
Link)
[31] G. Nützi, et al., "Fusion of IMU and vision for
absolute scale estimation in monocular SLAM."
Journal of intelligent & robotic systems, Vol. 61, No.
1, pp. 287-299, Nov. 2011. Article (CrossRef Link)
[32] S. Weiss, et al., “Versatile distributed pose estimation
and sensor self-calibration for an autonomous MAV,”
In 2012 IEEE International Conference on Robotics
and Automation, pp. 31-38, 2012. Article (CrossRef
Link)
[33] T. Rahim, et al., “A Deep Convolutional Neural Network for the Detection of Polyps in Colonoscopy Images,” Biomedical Signal Processing and Control, Vol. 68, pp. 102654, 2021.
Yeonji Choi received her BSc in
Electrical Engineering in 2019 and
received her MSc from the Department
of IT Convergence Engineering at
Kumoh National Institute of
Technology (KIT) Gumi, South Korea,
in 2021. Currently, she is working as a graduate research assistant at the
Wireless and Emerging Network System (WENS) Lab in
the Department of IT Convergence Engineering, Kumoh
National Institute of Technology (KIT), Gumi, South
Korea. Her major research interests include intelligent
control and systems, Unmanned Aerial Vehicles, and
wireless communications.
Tariq Rahim is a PhD student in the
Wireless and Emerging Network
System Laboratory (WENS Lab) of the
Department of IT Convergence
Engineering, Kumoh National Institute
of Technology, Republic of Korea. He
completed his master’s degree in
Information and Communication
Engineering from Beijing Institute of Technology, PRC, in
2017. His research interests include image and video
processing and quality of experience for high-resolution
videos.
Soo Young Shin received his BSc,
MSc, and PhD in Electrical Engineering and Computer Science from
Seoul National University, Korea, in
1999, 2001, and 2006, respectively. He
was a visiting scholar for the FUN Lab
at the University of Washington,
U.S.A., from July 2006 to June 2007.
After three years working in the WiMAX Design Lab of
Samsung Electronics, he is now an associate professor for
the School of Electronics at Kumoh National Institute of
Technology, joining the institute in September 2010. His
research interests include wireless LANs, WPANs,
WBANs, wireless mesh networks, sensor networks,
coexistence among wireless networks, industrial and
military networks, cognitive radio networks, and next-
generation mobile wireless broadband networks.
Copyright © 2021 The Institute of Electronics and Information Engineers
Deployment of autonomous Unmanned Aerial Vehicles (UAV) in various sectors such as disaster hit environments, industries, agriculture etc. not only improves productivity but also reduces human intervention resulting in sustainable benefits. In this regard, we present a model for autonomous navigation and collision avoidance of UAVs in GPS-denied corridor environments. In the first stage, we suggest a fast procedure to estimate the set of parallel lines whose intersection would yield the position of the vanishing point (VP) inside the corridor. A suitable measure is then formulated based on the position of VP on the intersecting lines in reference to any of the image boundary axes which helps safe navigation of the UAV avoiding collisions with side walls. Furthermore, the relative Euclidean distance scale expansion of matched scale-invariant keypoints in a pair of frames is taken into account to estimate the depth of a frontal obstacle. However, turbulence in the UAV arising due to its rotors or external factors may intruduce uncertainty in depth estimation. It is rectified with the help of a constant velocity aided Kalman filter model. Necessary set of control commands are then generated to avoid the frontal collision. Exhaustive experiments advocate the efficacy of the proposed scheme.