Car Detector Based on YOLOv5 for Parking
Management
Duy-Linh Nguyen[0000-0001-6184-4133], Xuan-Thuy Vo[0000-0002-7411-0697],
Adri Priadana[0000-0002-1553-7631], and Kang-Hyun Jo[0000-0002-4937-7082]
Department of Electrical, Electronic and Computer Engineering, University of Ulsan,
Ulsan 44610, South Korea
ndlinh301@mail.ulsan.ac.kr, xthuy@islab.ulsan.ac.kr,
priadana@mail.ulsan.ac.kr, acejo@ulsan.ac.kr
Abstract. Nowadays, YOLOv5 is one of the most widely used object
detection network architectures in real-time systems for traffic manage-
ment and regulation. To develop a parking management tool, this paper
proposes a car detection network based on redesigning the YOLOv5 net-
work architecture. This research focuses on network parameter optimiza-
tion using lightweight modules from EfficientNet and PP-LCNet archi-
tectures. The proposed network is trained and evaluated on two benchmark
datasets, the Car Parking Lot Dataset and the Pontifical Catholic University
of Parana+ Dataset, and the results are reported using the mAP@0.5 and
mAP@0.5:0.95 metrics. As a result, this network achieves its best performance
at 95.8% and 97.4% mAP@0.5 on the Car Parking Lot Dataset and the Pontifical
Catholic University of Parana+ Dataset, respectively.
Keywords: Convolutional neural network (CNN) · EfficientNet · PP-LCNet · Parking management · YOLOv5.
1 Introduction
Along with the rapid development of modern and smart cities, the number of
vehicles in general and cars in particular has also increased in both quantity
and type. According to a report on the Statista website [15], there are currently
about one and a half billion cars in the world, and it is predicted that the
number of cars sold in 2023 will reach nearly 69.9 million. This number will increase
further in the coming years. Therefore, the management and development of
tools to support parking lots are essential. To construct smart parking lots,
researchers propose many methods based on geomagnetic [25], ultrasonic [16],
infrared [2], and wireless techniques [21]. These approaches mainly rely on the
operation of sensors designed and installed in the parking lot. Although these
designs achieve high accuracy, they require large investment, labor, and mainte-
nance costs, especially when deployed in large-scale parking lots. Exploiting the
benefits of convolutional neural networks (CNNs) in the field of computer vision,
several researchers have designed networks to detect empty or occupied parking
spaces using conventional cameras with quite good accuracy [5, 12, 13]. Following
that trend, this paper proposes a car detector to support smart parking management.
This work explores lightweight network architectures and redesigns the modules
inside the YOLOv5 network to balance network parameters, detection accuracy,
and computational complexity, ensuring deployment in real-time systems at the
lowest cost. The main contributions of this paper are shown below:
1 - Proposes an improved YOLOv5 architecture for car detection that can be
applied to parking management and other related fields of computer vision.
2 - The proposed detector performs better than other detectors on the Car Park-
ing Lot Dataset and the Pontifical Catholic University of Parana+ Dataset.
The distribution of the remaining parts in the paper is as follows: Section 2
presents the car detection-based methods. Section 3 explains the proposed ar-
chitecture in detail. Section 4 introduces the experimental setup and analyzes
the experimental results. Section 5 summarizes the issue and future work orien-
tation.
2 Related works
2.1 Traditional machine learning-based methods
The car detection process of traditional machine learning-based techniques is
divided into two stages, manual feature extraction and classification. First, fea-
ture extractors generate feature vectors using classical methods such as Scale-
invariant Feature Transform (SIFT), Histograms of Oriented Gradients (HOG),
and Haar-like features [18, 19, 22]. Then, the feature vectors go through classifiers
like the Support Vector Machine (SVM) and AdaBoost [6, 14] to obtain the target
classification result. Traditional feature extraction methods rely heavily
on prior knowledge. However, in practical applications there are many confounding
factors, including weather, exposure, and distortion. Therefore, the applicability
of these techniques in real-time systems is limited due to low accuracy.
2.2 CNN-based methods
Parking lot images obtained from drones or overhead cameras contain many
small-sized cars. In order to detect these objects well, many studies have focused
on the small object detection topic using a combination of CNN and traditional
methods or one-stage detectors. The authors in [1, 24, 3] fuse modern CNNs
with SVM networks to achieve high spatial resolution in vehicle detection
and counting. Research in [11] develops a network based on the YOLOv3
architecture, in which the backbone combines ResNet and DarkNet to address
small-object detection in drone images. The work in [10] proposes
a new feature-matching method and a spatial context analysis for pedestrian-
vehicle discrimination. An improved YOLOv5 network architecture is designed
Fig. 1. The architecture of the proposed car detector.
by [7] for vehicle detection and classification in Unmanned Aerial Vehicle (UAV)
imagery and [23] for real-world imagery. Another study in [20] provides a one-
stage detector (SF-SSD) with a new spatial cognition algorithm for car detection
in UAV imagery. The advantage of modern machine learning methods is high
detection and classification accuracy, especially for small-sized objects. However,
they require high-level feature extraction and fusion in the network, and
a certain complexity to ensure operation in real-world conditions.
3 Methodology
The proposed car detection network is shown in Fig. 1. This network is an
improved YOLOv5 architecture [9] including three main parts: backbone, neck,
and detection head.
3.1 Proposed network architecture
Basically, the structure of the proposed network follows the design of the YOLOv5
network architecture with many changes inside the backbone and neck modules.
Specifically, the Focus module is replaced by a simple block called Conv. This
block is constructed with a standard convolution layer (Con2D) with a kernel size
of 1×1, followed by a batch normalization (BN) and a ReLU activation function,
as shown in Fig. 2 (a). Subsequent blocks in the backbone module are also
Fig. 2. The architecture of the Conv (a), BottleNeck Cross Stage Partial (b), and Spatial Pyramid Pooling (c) blocks.
redesigned based on inspiration from lightweight network architectures such as
PP-LCNet [4] and EfficientNet [17]. The design of the PP-LCNet (PP-LC) layer
is described in detail in Fig. 3 (a). It consists of a depthwise convolution layer
Fig. 3. The architecture of the PP-LCNet (a) and SE (b) blocks.
(3×3 DWConv), an attention block (SE block), and ends with a standard convolution
layer (1×1 Con2D). In between these layers, the BN and the Hardswish
activation function are used. The SE block is an attention mechanism based on
a global average pooling (GAP) layer, a fully connected layer (FC1) followed by
a rectified linear unit (ReLU) activation function, and a second fully connected
layer (FC2) followed by a sigmoid activation function, as shown in Fig. 3 (b). This method
uses lightweight convolution layers that save a lot of network parameters. In
addition, the attention mechanism helps the network focus on learning impor-
tant information about the object on each feature map level. The next block
Fig. 4. The two types of LiteEfficientNet (LE) architecture: stride = 2 (a) and stride = 1 (b).
is LiteEfficientNet (LE). This block is very simple and is divided into two types
corresponding to two stride levels (stride = 1 or stride = 2). In the first type
with stride = 2, the LiteEfficientNet block uses an expand convolution layer
(1×1 Con2D), a depthwise convolution layer (3×3 DWConv), and ends with
a project convolution layer (1×1 Con2D). For the second type with stride =
1, the LiteEfficientNet block is designed exactly the same as the first type,
with an added skip connection that merges the current and original feature maps via
the addition operation. This block extracts feature maps along the channel
dimension. The combined use of the PP-LCNet and LiteEfficientNet blocks ensures
that feature extraction covers both the spatial and channel dimensions at each
feature map level. The detail of the LiteEfficientNet block is shown in Fig. 4. The last
block in the backbone module is the Spatial Pyramid Pooling (SPP) block. This
work re-applies the architecture of the SPP in YOLOv5, as shown in Fig. 2 (c).
However, to minimize the network parameters, the max-pooling kernel sizes are
reduced from 5×5, 9×9, and 13×13 to 3×3, 5×5, and 7×7, respectively.
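The backbone blocks described above can be written as a rough PyTorch sketch. The class names, channel widths, expansion ratio, and the SE reduction ratio below are illustrative assumptions, not the authors' code:

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze-and-Excitation: GAP -> FC1 + ReLU -> FC2 + Sigmoid -> rescale."""
    def __init__(self, ch, r=4):
        super().__init__()
        self.fc1 = nn.Linear(ch, ch // r)
        self.fc2 = nn.Linear(ch // r, ch)

    def forward(self, x):
        b, c, _, _ = x.shape
        w = x.mean(dim=(2, 3))  # global average pooling over H, W
        w = torch.sigmoid(self.fc2(torch.relu(self.fc1(w))))
        return x * w.view(b, c, 1, 1)

class PPLCLayer(nn.Module):
    """PP-LCNet layer: 3x3 depthwise conv -> SE block -> 1x1 pointwise conv,
    each convolution followed by BN and Hardswish (Fig. 3 (a))."""
    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        self.dw = nn.Sequential(
            nn.Conv2d(in_ch, in_ch, 3, stride, 1, groups=in_ch, bias=False),
            nn.BatchNorm2d(in_ch), nn.Hardswish())
        self.se = SEBlock(in_ch)
        self.pw = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 1, bias=False),
            nn.BatchNorm2d(out_ch), nn.Hardswish())

    def forward(self, x):
        return self.pw(self.se(self.dw(x)))

class LiteEfficientNet(nn.Module):
    """Inverted-residual block: 1x1 expand -> 3x3 depthwise -> 1x1 project;
    the skip connection is used only when stride = 1 (Fig. 4)."""
    def __init__(self, in_ch, out_ch, stride, expand=4):
        super().__init__()
        mid = in_ch * expand
        self.use_skip = stride == 1 and in_ch == out_ch
        self.block = nn.Sequential(
            nn.Conv2d(in_ch, mid, 1, bias=False),
            nn.BatchNorm2d(mid), nn.ReLU6(),
            nn.Conv2d(mid, mid, 3, stride, 1, groups=mid, bias=False),
            nn.BatchNorm2d(mid), nn.ReLU6(),
            nn.Conv2d(mid, out_ch, 1, bias=False),
            nn.BatchNorm2d(out_ch))

    def forward(self, x):
        y = self.block(x)
        return x + y if self.use_skip else y

class SPP(nn.Module):
    """Spatial Pyramid Pooling with the reduced 3/5/7 max-pool kernels."""
    def forward(self, x):
        pools = [nn.functional.max_pool2d(x, k, 1, k // 2) for k in (3, 5, 7)]
        return torch.cat([x] + pools, dim=1)
```

With stride = 2, PPLCLayer and LiteEfficientNet halve the spatial size (downsampling), while SPP keeps the spatial size and multiplies the channel count by four through concatenation.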
The neck module in the proposed network utilizes the Path Aggregation Network
(PAN) architecture following the original YOLOv5. This module combines the
current feature maps with previous feature maps by concatenation operations.
It outputs three multi-scale feature maps enriched with information, which
serve as the three inputs for the detection heads.
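One fusion step in the PAN neck can be sketched as upsampling a deeper map and concatenating it with a shallower one; the channel sizes here are illustrative, taken from the shapes in Fig. 1:

```python
import torch
import torch.nn.functional as F

# Deeper (semantically rich) and shallower (spatially rich) feature maps.
deep = torch.randn(1, 384, 40, 40)
shallow = torch.randn(1, 192, 80, 80)

# Upsample the deeper map 2x, then concatenate along the channel dimension.
up = F.interpolate(deep, scale_factor=2, mode="nearest")
fused = torch.cat([up, shallow], dim=1)  # shape: (1, 576, 80, 80)
```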
The detection head module also leverages the construction of three detection
heads from the YOLOv5. Three feature map scales of the PAN neck go through
three convolution operations to conduct prediction on three object scales: small,
medium, and large. Each detection head uses three anchor sizes, described in
Table 1.
Table 1. Detection heads and anchor sizes.

Heads | Input     | Anchor sizes                      | Output   | Object
1     | 80×80×192 | (10, 13), (16, 30), (33, 23)      | 80×80×18 | Small
2     | 40×40×384 | (30, 61), (62, 45), (59, 119)     | 40×40×18 | Medium
3     | 20×20×768 | (116, 90), (156, 198), (373, 326) | 20×20×18 | Large
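The 18 output channels per head in Table 1 follow directly from the YOLOv5 head layout: each of the three anchors predicts four box offsets, one objectness score, and one class score (only the car class here):

```python
num_anchors = 3   # anchor sizes per detection head (Table 1)
num_classes = 1   # single "car" class
out_channels = num_anchors * (4 + 1 + num_classes)  # box(4) + obj(1) + cls(1)
assert out_channels == 18  # matches the 80x80x18, 40x40x18, 20x20x18 outputs
```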
3.2 Loss function
The definition of the loss function is as follows:

L = λ_box L_box + λ_obj L_obj + λ_cls L_cls,    (1)

where L_box uses the CIoU loss to compute the bounding box regression, and
the object confidence score loss L_obj and the classification loss L_cls are
calculated using Binary Cross-Entropy loss. λ_box, λ_obj, and λ_cls are
balancing parameters.
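A minimal sketch of Eq. (1) follows. Plain 1 − IoU stands in for the CIoU box term (CIoU adds center-distance and aspect-ratio penalties), and the function and argument names are illustrative assumptions:

```python
import torch
import torch.nn.functional as F

def iou_xyxy(a, b, eps=1e-7):
    """IoU for boxes given as (x1, y1, x2, y2), shape (N, 4)."""
    lt = torch.max(a[:, :2], b[:, :2])          # intersection top-left
    rb = torch.min(a[:, 2:], b[:, 2:])          # intersection bottom-right
    inter = (rb - lt).clamp(min=0).prod(dim=1)  # intersection area
    area_a = (a[:, 2:] - a[:, :2]).prod(dim=1)
    area_b = (b[:, 2:] - b[:, :2]).prod(dim=1)
    return inter / (area_a + area_b - inter + eps)

def detection_loss(pb, tb, po, to, pc, tc,
                   lam_box=0.05, lam_obj=1.0, lam_cls=0.5):
    # Box term: the paper uses CIoU loss; plain 1 - IoU stands in here.
    l_box = (1.0 - iou_xyxy(pb, tb)).mean()
    # Objectness and classification: Binary Cross-Entropy (on logits).
    l_obj = F.binary_cross_entropy_with_logits(po, to)
    l_cls = F.binary_cross_entropy_with_logits(pc, tc)
    return lam_box * l_box + lam_obj * l_obj + lam_cls * l_cls
```

The balancing weights default to the values used in Section 4.2 (λ_box = 0.05, λ_obj = 1, λ_cls = 0.5).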
4 Experiments
4.1 Datasets
The proposed network is trained and evaluated on two benchmark datasets,
the Car Parking Lot Dataset (CarPK) and the Pontifical Catholic University
of Parana+ Dataset (PUCPR+) [8]. The CarPK dataset contains 89,777 cars
collected from the Phantom 3 Professional drone. The images were taken from
four parking lots with an approximate height of 40 meters. The CarPK dataset
is divided into 988 images for training and 459 images for validation phases.
The PUCPR+ dataset is selected from a part of the PUCPR dataset consisting
of 16,456 cars. The PUCPR+ dataset provides 100 images for training and 25
images for validation. These are image datasets for car counting in different
parking lots. The cars in the images are annotated with bounding boxes given by
top-left and bottom-right corners and stored as text files (*.txt). To accommodate
the training and evaluation processes, this experiment converts the entire format
of the annotation files to the YOLOv5 format.
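The conversion from corner-format boxes to the YOLOv5 label format (class id plus normalized center coordinates, width, and height) can be sketched as follows; the helper name is hypothetical:

```python
def corners_to_yolo(x1, y1, x2, y2, img_w, img_h, cls_id=0):
    """Convert a top-left/bottom-right box to a YOLOv5 label line:
    'class x_center y_center width height', all values normalized to [0, 1]."""
    xc = (x1 + x2) / 2.0 / img_w
    yc = (y1 + y2) / 2.0 / img_h
    w = (x2 - x1) / img_w
    h = (y2 - y1) / img_h
    return f"{cls_id} {xc:.6f} {yc:.6f} {w:.6f} {h:.6f}"

# e.g. a 100x50 box with top-left (200, 100) in a 1280x720 image
line = corners_to_yolo(200, 100, 300, 150, 1280, 720)
```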
4.2 Experimental setup
The proposed network is implemented with the PyTorch framework and the Python
programming language. The network is trained on a Tesla V100 32GB GPU
and evaluated on a GeForce GTX 1080Ti 11GB GPU. The optimizer is Adam.
The learning rate is initialized at 10^-5 and ends at 10^-3. The momentum
is set at 0.8 and then increased to 0.937. The training process goes through
300 epochs with a batch size of 64. The balancing parameters are set as follows:
λ_box = 0.05, λ_obj = 1, and λ_cls = 0.5. To increase training scenarios and avoid the
over-fitting issue, this experiment applies data augmentation methods such as
mosaic, translation, scaling, and flipping. For the inference process, the other
arguments are set as follows: an image size of 1024×1024, a batch size of 32, a
confidence threshold of 0.5, and an IoU threshold of 0.5. The speed results are
reported in milliseconds (ms).
4.3 Experimental results
The performance of the proposed network is evaluated by comparison with networks
retrained from scratch and with recent research on the two benchmark datasets
above. Specifically, this work trains and evaluates the proposed network and
the four versions of the YOLOv5 architecture (l, m, s, n), then compares the
results with those reported in [7, 20] on the CarPK dataset and in [20] on the
PUCPR+ dataset. As a result, the proposed network achieves 95.8% mean Average
Precision at an IoU threshold of 0.5 (mAP@0.5) and 63.1% mAP averaged over ten
IoU thresholds from 0.5 to 0.95 (mAP@0.5:0.95). This result shows the superior
ability of the proposed network compared to other networks, while its inference
time is only 1.7 ms higher than the retrained YOLOv5m network, nearly 1.5 times
lower than the retrained YOLOv5l network, and between 2.3 (YOLOv5m) and 7.9
(YOLOv5x) times lower than the experiments in [7]. Besides, the weight of
the network (22.7 MB) and the computational complexity (23.9 GFLOPs) are
only half those of the retrained YOLOv5m architecture. The comparison results on
the CarPK validation set are presented in Table 2. For the PUCPR+ dataset,
the proposed network achieves 97.4% of mAP@0.5 and 58.0% of mAP@0.5:0.95.
Table 2. Comparison results of the proposed car detection network with other networks
and the retrained YOLOv5 on the CarPK validation set. The symbol * denotes the
retrained networks. N/A means not-available values.

Models              | Parameters | Weight (MB) | GFLOPs | mAP@0.5 | mAP@0.5:0.95 | Inf. time (ms)
YOLOv5l*            | 46,631,350 | 93.7        | 114.2  | 95.3    | 62.3         | 26.4
YOLOv5m*            | 21,056,406 | 42.4        | 50.4   | 94.4    | 61.5         | 15.9
YOLOv5s*            | 7,022,326  | 14.3        | 15.8   | 95.6    | 62.7         | 8.7
YOLOv5n*            | 1,765,270  | 3.7         | 4.2    | 93.9    | 57.8         | 6.3
YOLOv5x [7]         | N/A        | 167.0       | 205.0  | 94.5    | 57.9         | 138.2
YOLOv5l [7]         | N/A        | 90.6        | 108.0  | 95.0    | 59.2         | 72.1
YOLOv5m [7]         | N/A        | 41.1        | 48.0   | 94.6    | 57.8         | 40.4
Modified YOLOv5 [7] | N/A        | 44.0        | 57.7   | 94.9    | 61.1         | 50.5
SSD [20]            | N/A        | N/A         | N/A    | 68.7    | N/A          | N/A
YOLO9000 [20]       | N/A        | N/A         | N/A    | 20.9    | N/A          | N/A
YOLOv3 [20]         | N/A        | N/A         | N/A    | 85.3    | N/A          | N/A
YOLOv4 [20]         | N/A        | N/A         | N/A    | 87.81   | N/A          | N/A
SA+CF+CRT [20]      | N/A        | N/A         | N/A    | 89.8    | N/A          | N/A
SF-SSD [20]         | N/A        | N/A         | N/A    | 90.1    | N/A          | N/A
Ours                | 11,188,534 | 22.7        | 23.9   | 95.8    | 63.1         | 17.6
This result is outstanding compared to other competitors and is only 0.3%
mAP@0.5 and 2.5% mAP@0.5:0.95 lower than the retrained YOLOv5m, respectively.
However, the proposed network has an inference time of 17.9 ms, only
slightly higher than the retrained YOLOv5m network (by 2.3 ms) and lower than
the retrained YOLOv5l network (by 4.5 ms). The comparison results are shown
in Table 3 and several qualitative results are shown in Fig. 5.
Table 3. Comparison results of the proposed car detection network with other networks
and the retrained YOLOv5 on the PUCPR+ validation set. The symbol * denotes the
retrained networks. N/A means not-available values.

Models         | Parameters | Weight (MB) | GFLOPs | mAP@0.5 | mAP@0.5:0.95 | Inf. time (ms)
YOLOv5l*       | 46,631,350 | 93.7        | 114.2  | 96.4    | 53.8         | 22.4
YOLOv5m*       | 21,056,406 | 42.4        | 50.4   | 97.7    | 60.5         | 15.6
YOLOv5s*       | 7,022,326  | 14.3        | 15.8   | 84.6    | 38.9         | 7.4
YOLOv5n*       | 1,765,270  | 3.7         | 4.2    | 89.7    | 41.6         | 5.9
SSD [20]       | N/A        | N/A         | N/A    | 32.6    | N/A          | N/A
YOLO9000 [20]  | N/A        | N/A         | N/A    | 12.3    | N/A          | N/A
YOLOv3 [20]    | N/A        | N/A         | N/A    | 95.0    | N/A          | N/A
YOLOv4 [20]    | N/A        | N/A         | N/A    | 94.1    | N/A          | N/A
SA+CF+CRT [20] | N/A        | N/A         | N/A    | 92.9    | N/A          | N/A
SF-SSD [20]    | N/A        | N/A         | N/A    | 90.8    | N/A          | N/A
Ours           | 11,188,534 | 22.7        | 23.9   | 97.4    | 58.0         | 17.9
From the results above, the proposed network balances performance, speed,
and network parameters. Therefore, it can be implemented in parking
management systems on low-computing and embedded devices. However, the
process of testing this network also revealed some disadvantages. Since the car
detection network relies mainly on the signal obtained from a drone-view or
floor-view camera, it is influenced by a number of environmental factors,
including illumination, weather, car density, occlusion, shadow, object
similarity, and the distance from the camera to the cars. Several mistaken
cases are shown in Fig. 5 with yellow circles.

Fig. 5. The qualitative results and several mistakes of the proposed network on the
validation sets of the CarPK and PUCPR+ datasets with IoU threshold = 0.5 and
confidence score = 0.5. Yellow circles denote the wrong detection areas.
4.4 Ablation study
The experiment conducted several ablation studies to inspect the importance
of each block in the proposed backbone. The blocks are replaced in turn,
trained on the CarPK training set, and evaluated on the CarPK validation
set, as shown in Table 4. The results in this table show that the PP-LCNet
block increases the network performance at mAP@0.5 (1.1%) but decreases it
at mAP@0.5:0.95 (0.8%) when compared to the LiteEfficientNet block. Combining
these two blocks gives the best result, along with the starting Conv and the
ending SPP blocks. Besides, the results also show the superiority of the SPP
block (0.4% of mAP@0.5 and mAP@0.5:0.95) over the SPPF block, although they
generate the same GFLOPs and network parameters.
Table 4. Ablation studies with different types of backbones on the CarPK validation
set.

Blocks           | Proposed backbones
Conv             | ✓          | ✓         | ✓          | ✓
PP-LCNet         | ✓          |           | ✓          | ✓
LiteEfficientNet |            | ✓         | ✓          | ✓
SPPF             |            |           | ✓          |
SPP              | ✓          | ✓         |            | ✓
Parameters       | 10,728,766 | 9,780,850 | 11,188,534 | 11,188,534
Weight (MB)      | 21.9       | 19.9      | 22.7       | 22.7
GFLOPs           | 20.8       | 18.5      | 23.9       | 23.9
mAP@0.5          | 95.1       | 94.3      | 95.4       | 95.8
mAP@0.5:0.95     | 58.2       | 59.3      | 62.7       | 63.1
5 Conclusion
This paper introduces an improved YOLOv5 architecture for car detection in
parking management systems. The proposed network contains three main mod-
ules: backbone, neck, and detection head. The backbone module is redesigned
using lightweight architectures: PP-LCNet and LiteEfficientNet. The network
achieves 95.8% of mAP@0.5 and 63.1% of mAP@0.5:0.95, performing better than
recent works. The optimization of network parameters, speed, and detection
accuracy provides the ability to deploy on real-time systems. In the future,
the neck and detection head modules will be developed to detect smaller
vehicles and will be implemented on larger datasets.
Acknowledgement
This result was supported by the "Regional Innovation Strategy (RIS)" through
the National Research Foundation of Korea (NRF), funded by the Ministry of
Education (MOE) (2021RIS-003).
References
1. Ammour, N., Alhichri, H., Bazi, Y., Benjdira, B., Alajlan, N., Zuair, M.: Deep
learning approach for car detection in uav imagery. Remote Sensing 9, 1–15 (03
2017). https://doi.org/10.3390/rs9040312
2. Chen, H.C., Huang, C.J., Lu, K.H.: Design of a non-processor obu device for
parking system based on infrared communication. In: 2017 IEEE International
Conference on Consumer Electronics - Taiwan (ICCE-TW). pp. 297–298 (2017).
https://doi.org/10.1109/ICCE-China.2017.7991113
3. Chen, S., Zhang, S., Shang, J., Chen, B., Zheng, N.: Brain-inspired cognitive model
with attention for self-driving cars. IEEE Transactions on Cognitive and Develop-
mental Systems 11(1), 13–25 (2019). https://doi.org/10.1109/TCDS.2017.2717451
4. Cui, C., Gao, T., Wei, S., Du, Y., Guo, R., Dong, S., Lu, B., Zhou, Y., Lv, X.,
Liu, Q., Hu, X., Yu, D., Ma, Y.: Pp-lcnet: A lightweight CPU convolutional neural
network. CoRR abs/2109.15099 (2021), https://arxiv.org/abs/2109.15099
5. Ding, X., Yang, R.: Vehicle and parking space detection based on improved yolo
network model. Journal of Physics: Conference Series 1325, 012084 (10 2019).
https://doi.org/10.1088/1742-6596/1325/1/012084
6. Freund, Y., Schapire, R.E.: A decision-theoretic generalization of on-line learn-
ing and an application to boosting. In: Vitányi, P. (ed.) Computational Learning
Theory. pp. 23–37. Springer Berlin Heidelberg, Berlin, Heidelberg (1995)
7. Hamzenejadi, M.H., Mohseni, H.: Real-time vehicle detection and classification
in uav imagery using improved yolov5. In: 2022 12th International Confer-
ence on Computer and Knowledge Engineering (ICCKE). pp. 231–236 (2022).
https://doi.org/10.1109/ICCKE57176.2022.9960099
8. Hsieh, M., Lin, Y., Hsu, W.H.: Drone-based object counting by spa-
tially regularized regional proposal network. CoRR abs/1707.05972 (2017),
http://arxiv.org/abs/1707.05972
9. Jocher, G., et al.: ultralytics/yolov5: v3.1 - Bug Fixes and Perfor-
mance Improvements (Oct 2020). https://doi.org/10.5281/zenodo.4154370,
https://doi.org/10.5281/zenodo.4154370
10. Liang, X., Zhang, J., Zhuo, L., Li, Y., Tian, Q.: Small object detection in unmanned
aerial vehicle images using feature fusion and scaling-based single shot detector
with spatial context analysis. IEEE Transactions on Circuits and Systems for Video
Technology pp. 1758–1770 (2019)
11. Liu, M., Wang, X., Zhou, A., Fu, X., Ma, Y., Piao, C.: Uav-yolo:
Small object detection on unmanned aerial vehicle perspective. Sensors
20(8) (2020). https://doi.org/10.3390/s20082238, https://www.mdpi.com/1424-
8220/20/8/2238
12. Martín Nieto, R., García-Martín, Á., Hauptmann, A.G., Martínez, J.M.: Automatic
vacant parking places management system using multicamera vehicle detection.
IEEE Transactions on Intelligent Transportation Systems 20(3), 1069–1080 (2019).
https://doi.org/10.1109/TITS.2018.2838128
13. Mettupally, S.N.R., Menon, V.: A smart eco-system for parking detection using
deep learning and big data analytics. In: 2019 SoutheastCon. pp. 1–4 (2019).
https://doi.org/10.1109/SoutheastCon42311.2019.9020502
14. Mitra, V., Wang, C.J., Banerjee, S.: Text classification: A least square sup-
port vector machine approach. Applied Soft Computing 7, 908–914 (06 2007).
https://doi.org/10.1016/j.asoc.2006.04.002
15. Scotiabank: Number of cars sold worldwide from 2010 to 2022, with a 2023 forecast
(in million units). https://www.statista.com/statistics/200002/international-
car-sales-since-1990/. Accessed: Jan. 01, 2023
16. Shao, Y., Chen, P., Tongtong, C.: A grid projection method based on
ultrasonic sensor for parking space detection. pp. 3378–3381 (07 2018).
https://doi.org/10.1109/IGARSS.2018.8519022
17. Tan, M., Le, Q.V.: Efficientnet: Rethinking model scaling for convolutional neural
networks. CoRR abs/1905.11946 (2019), http://arxiv.org/abs/1905.11946
18. Viola, P., Jones, M.: Rapid object detection using a boosted cascade of sim-
ple features. In: Proceedings of the 2001 IEEE Computer Society Conference on
Computer Vision and Pattern Recognition. CVPR 2001. vol. 1, pp. I–I (2001).
https://doi.org/10.1109/CVPR.2001.990517
19. XU Zihao, HUANG Weiquan, W.Y.: Multi-class vehicle detection in surveillance
video based on deep learning. Journal of Computer Applications 39(3), 700 (2019)
20. Yu, J., Gao, H., Sun, J., Zhou, D., Ju, Z.: Spatial cognition-driven deep
learning for car detection in unmanned aerial vehicle imagery. IEEE Trans-
actions on Cognitive and Developmental Systems 14(4), 1574–1583 (2022).
https://doi.org/10.1109/TCDS.2021.3124764
21. Yuan, C., Qian, L.: Design of intelligent parking lot system based on wireless
network. In: 2017 29th Chinese Control And Decision Conference (CCDC). pp.
3596–3601 (2017). https://doi.org/10.1109/CCDC.2017.7979129
22. Zhang, S., Wang, X.: Human detection and object tracking based on histograms of
oriented gradients. In: 2013 Ninth International Conference on Natural Computa-
tion (ICNC). pp. 1349–1353 (2013). https://doi.org/10.1109/ICNC.2013.6818189
23. Zhang, Y., Guo, Z., Wu, J., Tian, Y., Tang, H., Guo, X.: Real-time
vehicle detection based on improved yolo v5. Sustainability 14(19)
(2022). https://doi.org/10.3390/su141912274, https://www.mdpi.com/2071-
1050/14/19/12274
24. Zhao, F., Kong, Q., Zeng, Y., Xu, B.: A brain-inspired visual fear
responses model for uav emergent obstacle dodging. IEEE Transac-
tions on Cognitive and Developmental Systems 12(1), 124–132 (2020).
https://doi.org/10.1109/TCDS.2019.2939024
25. Zhou, F., Li, Q.: Parking guidance system based on zigbee and geomagnetic sen-
sor technology. In: 2014 13th International Symposium on Distributed Comput-
ing and Applications to Business, Engineering and Science. pp. 268–271 (2014).
https://doi.org/10.1109/DCABES.2014.58