The Face Mask Detection For Preventing the Spread of COVID-19 at Politeknik Negeri Batam

Susanto
Electrical Engineering
Politeknik Negeri Batam
Batam, Indonesia
susanto@polibatam.ac.id
Febri Alwan Putra
Electrical Engineering
Politeknik Negeri Batam
Batam, Indonesia
febry@polibatam.ac.id
Riska Analia
Electrical Engineering
Politeknik Negeri Batam
Batam, Indonesia
riskaanalia@polibatam.ac.id
Ika Karlina Laila Nur Suciningtyas
Electrical Engineering
Politeknik Negeri Batam
Batam, Indonesia
ikakarlina@polibatam.ac.id
Abstract— After the new Coronavirus disease (COVID-19)
spread rapidly in Wuhan, China, in December 2019, the World
Health Organization (WHO) confirmed that the virus is dangerous
and can spread from human to human through droplets and
airborne transmission. As a preventive measure, wearing a face
mask is essential when going outside or meeting other people.
However, some people refuse to wear a face mask, offering
various excuses. A face mask detector is therefore crucial in
this situation. This paper aims to develop a face mask detector
that can detect any kind of face mask. The YOLO V4 deep
learning algorithm was chosen as the mask detection method. The
experiments were carried out in a real-time application, and
the device has been installed at Politeknik Negeri Batam. The
experimental results show that the device accurately detects
people who are or are not wearing a face mask, even when they
move to various positions.
Keywords—COVID-19, mask detector, YOLO, deep learning.
I. INTRODUCTION
The spread of COVID-19 is increasingly worrying for
everyone in the world. The virus can be transmitted from human
to human through droplets and airborne particles. According to
WHO guidance, to reduce the spread of COVID-19, everyone needs
to wear a face mask, practice social distancing, avoid crowded
areas, and maintain their immune system. Therefore, to protect
each other, every person should wear a face mask properly when
outdoors. However, many people do not wear a face mask
properly, for various reasons.
To overcome this situation, a robust face mask detector
needs to be developed. An object detection algorithm can be
used to detect a face mask. A state-of-the-art object detection
algorithm with robust performance is You Only Look Once
(YOLO). As presented in [1], Susanto, et al. used the YOLO deep
learning method to distinguish the white ball and the goal in a
humanoid robot soccer system. The algorithm was run on the
NVIDIA JETSON TX1 controller board. Another YOLO
implementation was introduced by Liu, et al. [2]. In that work,
they applied traditional image processing to simulate the
noise, blurring, and rotation encountered in real-world
shooting, and then used the YOLO algorithm to train a robust
model that improves traffic sign detection. Yang, et al. [3]
used the YOLO algorithm to detect faces in a real-time
application with high accuracy and fast detection time. In
contrast with [3], the authors of [4] improved the YOLO
algorithm for detecting faces in a video sequence and compared
its detection accuracy with that of the traditional approach.
They also used the FDDB dataset for training and testing the
model. The YOLO model was also improved by Zhao, et al. [5],
who adapted it to pedestrian detection by addressing two
issues: leveraging real-time saliency from a surveillance
camera and extracting detailed distinguishing features.
© IEEE 2021. This article is free to access and download, along with rights
for full text and data mining, re-use and analysis.
The YOLO method was later improved to YOLO V2, with the
YOLO9000 variant able to detect over 9000 object categories.
In this version, a novel multi-scale training method was
developed [6]. Thereafter, Kim, et al. implemented YOLO-V2 for
image recognition, together with other testbenches, for a CNN
accelerator. In that work, the YOLO V2 method was applied in
both simulation and an FPGA experiment [7]. Harisankar, et al.
[8], on the other hand, modified YOLO V2 to detect and localize
pedestrians by adding the Model H architecture. To detect the
pedestrians precisely, they used a ZED camera and created a
depth map.
Fig. 1. The block diagram of the face mask detection hardware.
Fig. 2. The prototype of face mask detector.
2020 3rd International Conference on Applied Engineering (ICAE) | 978-1-7281-9917-7/20/$31.00 ©2020 IEEE | DOI: 10.1109/ICAE50557.2020.9350556
Authorized licensed use limited to: Central Michigan University. Downloaded on May 14,2021 at 05:29:57 UTC from IEEE Xplore. Restrictions apply.
Within two years Redmon, et al. introduced YOLO V3.
Although its architecture is a little bigger than the previous
version, it is as accurate as the SSD algorithm but three times
faster [9]. Hu, et al. [10] implemented this version to detect
workers with or without helmets in videos. First, the YOLO V3
model was used to identify and crop the workers from the video,
and then a sample set of helmet-wearing and non-wearing workers
was built. The theoretical analysis and experimental results
verify that the proposed algorithm detects helmets with high
accuracy. Because the YOLO method is open source, anyone can
improve the algorithm. As presented in [11], Bochkovskiy, et
al. introduced YOLOv4 with optimal speed and accuracy. In that
work, they assumed that universal features such as
Weighted-Residual-Connections (WRC), Cross-Stage-Partial
connections (CSP), Cross mini-Batch Normalization (CmBN),
Self-adversarial-training (SAT) and Mish activation give high
detection accuracy.
Based on the state of the art and the results already
reported by other researchers, in this work we developed a face
mask detector for COVID-19 prevention using the YOLO V4
algorithm. In contrast with [11], the YOLO V4 algorithm is used
here to detect the face mask as the object, with several added
features that are explained in Section III. The face mask
detector is applied in real time to detect all types of
commercial face mask. With YOLO V4, the device is expected to
detect whether or not users are wearing a face mask.
The rest of this paper is organized as follows. Section II
describes the hardware architecture. Section III describes the
face mask detection algorithm, and Section IV presents the
experimental results under real-world conditions. Section V
closes the paper with the conclusion and future work.
II. THE HARDWARE ELEMENTS OF FACE MASK DETECTION
This section explains the hardware architecture of our
system. The hardware block diagram is shown in Fig. 1. The
system consists of a digital webcam, a PC, and a speaker. A
deep neural network was chosen as the face mask detection
method. All detection computation is performed on a computer
equipped with a GPU to accelerate the image processing. The
system runs in real time, checking whether each user in the
camera view is wearing a face mask. When the system detects a
user who is not wearing a face mask, it triggers the speaker to
alert the user to wear one, and the alert continues until the
user has put the face mask on properly.
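The alert behaviour described above can be illustrated with a small sketch. It is not the authors' implementation: the class labels, the `AlertController` name, and the `hold_frames` debounce setting are all assumptions introduced here. The sketch only shows one way to keep the speaker alert active until the detector stops reporting an unmasked user, while ignoring single-frame flickers.

```python
from dataclasses import dataclass
from typing import List

# Assumed class labels; the paper only states that the detector
# outputs two classes (mask wearer and non-wearer).
MASK = "mask"
NO_MASK = "no_mask"


@dataclass
class AlertController:
    """Decides, frame by frame, whether the speaker should be sounding.

    hold_frames is a hypothetical debounce setting: the number of
    consecutive frames that must disagree with the current state
    before the alert is switched on or off.
    """
    hold_frames: int = 3
    alerting: bool = False
    _streak: int = 0

    def update(self, labels: List[str]) -> bool:
        """Feed the labels detected in one frame; return the alert state."""
        wants_alert = NO_MASK in labels
        if wants_alert == self.alerting:
            # The frame agrees with the current state, so reset the counter.
            self._streak = 0
        else:
            # The frame disagrees; switch only after enough such frames.
            self._streak += 1
            if self._streak >= self.hold_frames:
                self.alerting = wants_alert
                self._streak = 0
        return self.alerting
```

In use, `update` would be called once per processed frame with the labels of all detected people, and the returned flag would gate the speaker playback.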
The prototype of the face mask detector is presented in
Fig. 2, which shows the position of each part of the detector
when it operates in real time. The webcam was mounted on top of
the monitor, which allows users to be detected at a distance of
around 5 meters. The PC and speaker were placed behind the
monitor for safety and to keep the device compact so that it
can be moved freely. The prototype is equipped with a MiniPC
with an Nvidia GTX 1060 GPU for the object detection
computation.
III. THE FACE MASK DETECTION
For the face mask detection method, this work implemented
the deep neural network known as YOLO V4. According to [11],
YOLO V4 runs about twice as fast as EfficientDet, another deep
neural network object detector, with comparable performance.
It also improves YOLOv3's AP and FPS by 10% and 12%,
respectively. Given these results, the method is suitable for
a real-time face mask detector where high detection accuracy
is needed.
In this work, the object to be detected is the face mask
wearer. The YOLO V4 implemented in this work consists of a
two-stage detector. As seen in Fig. 3, the first stage consists
of the input, backbone, neck, and dense prediction. The second
stage has a sparse prediction that predicts the object from the
bounding boxes and the object classes. The input image is
handled at a resolution of around 1920x1080 pixels in real time
to ensure that moving people wearing a face mask are detected
properly. In this part, 3x3 convolutional layers are built, and
the layer with the larger number of parameters is selected as
the backbone input layer. In the backbone, Darknet53
(CSPDarknet53 in [11]) was chosen as the detector. In this
backbone, the network input resolution is reduced to 512x512,
with a receptive field size of 725x725, and the backbone
contains 29 convolutional layers of size 3x3, each of which is
sent to the neck. PANet is applied as the neck to aggregate
parameters from different backbone levels. All aggregated
layers from the neck are sent to the sparse prediction part as
its input layer, shown as the blue line in Fig. 3. The last
process of the first stage is the dense prediction. The YOLO V3
model is used in this stage to generate a prediction, which is
then used as the input prediction of the second-stage
predictor. The second-stage predictor has a sparse prediction
that applies Faster R-CNN as the prediction method. This stage
receives the 3x3 input layer from the neck and the input
prediction from the dense prediction. The output of this stage
is one of two classes: face mask wearer or non-wearer.
Fig. 3. The face mask detection system.
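To make the resolution step concrete, the following sketch works out the geometry of fitting a 1920x1080 camera frame into a 512x512 network input and mapping a predicted box back to the camera frame. The paper does not say how the resize is done; aspect-ratio-preserving letterboxing is an assumption here, and the function names are hypothetical.

```python
from typing import Tuple


def letterbox_params(src_w: int, src_h: int,
                     net: int = 512) -> Tuple[float, int, int, int, int]:
    """Return (scale, new_w, new_h, pad_x, pad_y) for fitting a frame
    into a square net x net input while keeping its aspect ratio."""
    scale = min(net / src_w, net / src_h)
    new_w = round(src_w * scale)
    new_h = round(src_h * scale)
    # Remaining space is split evenly as padding on both sides.
    pad_x = (net - new_w) // 2
    pad_y = (net - new_h) // 2
    return scale, new_w, new_h, pad_x, pad_y


def box_to_frame(box: Tuple[float, float, float, float],
                 scale: float, pad_x: int, pad_y: int
                 ) -> Tuple[float, float, float, float]:
    """Map an (x1, y1, x2, y2) box from network-input pixels back to
    the coordinates of the original camera frame."""
    x1, y1, x2, y2 = box
    return ((x1 - pad_x) / scale, (y1 - pad_y) / scale,
            (x2 - pad_x) / scale, (y2 - pad_y) / scale)


if __name__ == "__main__":
    scale, new_w, new_h, pad_x, pad_y = letterbox_params(1920, 1080)
    # A 1920x1080 frame becomes 512x288 with 112 pixels of padding
    # above and below.
    print(new_w, new_h, pad_x, pad_y)
```

Under this assumption the frame is scaled by 512/1920, so a face that is 150 pixels wide in the camera frame is only 40 pixels wide at the network input.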
IV. EXPERIMENTAL RESULTS
This section presents the experimental results of the face
mask detector running in real time; the device has already been
installed at Politeknik Negeri Batam. The first experiment,
depicted in Fig. 4, was a first trial carried out before
testing with moving people. Fig. 4(a) shows the detector
accurately detecting a single user wearing a face mask even
with some visual clutter in the area. In Fig. 4(b), a second
user slowly entered the frame from below the camera, and the
detector was able to detect the mask properly. When the users
stood close to each other, as seen in Fig. 4(c)-(f), the system
was also able to detect the face masks even though the users
were surrounded by many objects with similar colors.
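When users stand close together, their bounding boxes can overlap, and YOLO-style detectors commonly use non-maximum suppression (NMS) to keep one box per face. The paper does not describe its suppression settings, so the sketch below is only an illustration of the general technique; the 0.45 IoU threshold and the function names are assumptions.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: List[Box], scores: List[float],
        iou_thresh: float = 0.45) -> List[int]:
    """Greedy NMS: visit boxes from highest to lowest score and keep a
    box only if it does not overlap an already kept box too much.
    Returns the indices of the kept boxes."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_thresh for k in keep):
            keep.append(i)
    return keep
```

Two neighbouring users whose boxes overlap only slightly are both kept, while a near-duplicate box on the same face is dropped.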
After the trial completed without error, we verified the
device with more users. As seen in Fig. 5(a)-(c), we increased
the number of users to three, wearing different types of face
mask such as surgical and fabric masks. Each person stood in a
different position to test the performance of the detector.
The figures verify that the detector remains stable in
detecting the users' face masks even under different lighting
brightness. To vary the brightness, we switched the lamp in our
lab off and on.
The experiment on detecting users without a mask is
presented in Fig. 6(a)-(c). In Fig. 6(a), the first user, who
wears a white T-shirt, pulled off his mask, and the detector
was able to distinguish the non-wearing user from the
mask-wearing users. Fig. 6(b) also shows the mask being
detected precisely. Moreover, in Fig. 6(c) the first and third
users took off their masks, and the detector reported the face
mask condition stably.
Fig. 4. The face mask detector detected users wearing a face mask: (a) a
single user alone in the frame, (b) a newly added user, (c)-(f) users
standing close to each other.
Fig. 5. (a)-(c) The detector detected multiple people wearing face masks
in different positions from each other.
Fig. 6. (a)-(c) The face mask detector detected non-wearing users at
different pose angles.
Fig. 7. The detector detected a fabric face mask while (a) the user moved
towards the device and (b) the user moved away from the device.
When a user does not wear the face mask properly, as shown
in Fig. 8, the device announces through the attached speaker
that the user needs to wear the mask. Fig. 8(a) shows a user
not wearing a face mask and Fig. 8(b) shows a user wearing one.
Across all these experiments, the average frame rate of the
detector was about 11.1 FPS, as presented in Fig. 9. The
experiments show that the face mask detector built on the YOLO
v4 algorithm is able to detect and distinguish non-wearing and
mask-wearing users under the tested conditions, including
different lighting, cluttered areas, and clean areas.
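The paper reports the average frame rate but not how it was measured. As a minimal sketch, assuming the time at which each frame finishes processing is recorded in seconds, the average FPS and the corresponding per-frame time budget could be computed as follows; both function names are hypothetical.

```python
from typing import Sequence


def average_fps(timestamps: Sequence[float]) -> float:
    """Average frames per second over a run, given the time (in
    seconds) at which each frame finished processing."""
    if len(timestamps) < 2:
        raise ValueError("need at least two frame timestamps")
    elapsed = timestamps[-1] - timestamps[0]
    if elapsed <= 0:
        raise ValueError("timestamps must be increasing")
    # N timestamps span N - 1 frame intervals.
    return (len(timestamps) - 1) / elapsed


def frame_budget_ms(fps: float) -> float:
    """Processing time available per frame, in milliseconds."""
    return 1000.0 / fps
```

At the reported 11.1 FPS, each frame takes about 90 ms to capture, process, and display.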
Fig. 8. The detector detected (a) a user not wearing a face mask and (b) a
user wearing a face mask.
Fig. 9. The FPS results when a face mask was detected.
V. CONCLUSION AND FUTURE WORK
This work developed a face mask detector using the YOLO V4
algorithm, a deep learning method that is able to detect
objects reliably. The device has been installed at Politeknik
Negeri Batam and runs in real time to help prevent the spread
of COVID-19 on campus. The experimental results show that the
algorithm detects and distinguishes non-wearing and
mask-wearing users precisely under the surrounding conditions
tested. In the future, we will add thermal detection to this
device to make the guards' work easier. Furthermore, we hope
this device can be installed in other crowded areas that need
a face mask detector.
REFERENCES
[1] Susanto, E. Rudiawan, R. Analia, D. S. Pamungkas, and H. Soebakti,
“The deep learning development for real-time ball and goal detection
of barelang-FC,” in 2017 International Electronics Symposium on
Engineering Technology and Applications (IES-ETA), Surabaya,
2017, pp. 146-151, doi: 10.1109/ELECSYM.2017.8240393.
[2] C. Liu, Y. Tao, J. Liang, K. Li and Y. Chen, “Object detection based
on YOLO network,” in 2018 IEEE 4th Information Technology and
Mechatronics Engineering Conference (ITOEC), Chongqing, China,
2018, pp. 799-803, doi: 10.1109/ITOEC.2018.8740604.
[3] W. Yang and Z. Jiachun, “Real-time face detection based on YOLO,”
in 2018 1st IEEE International Conference on Knowledge Innovation
and Invention (ICKII), Jeju, 2018, pp. 221-224, doi:
10.1109/ICKII.2018.8569109.
[4] D. Garg, P. Goel, S. Pandya, A. Ganatra and K. Kotecha, “A deep
learning approach for face detection using YOLO,” in 2018 IEEE
Punecon, Pune, India, 2018, pp. 1-4, doi:
10.1109/PUNECON.2018.8745376.
[5] C. Zhao and B. Chen, “Real-time pedestrian detection based on
improved YOLO model,” in 2019 11th International Conference on
Intelligent Human-Machine Systems and Cybernetics (IHMSC),
Hangzhou, China, 2019, pp. 25-28, doi: 10.1109/IHMSC.2019.10101.
[6] J. Redmon and A. Farhadi, “YOLO9000: Better, faster, stronger,” 2016.
[7] C. Kim et al., “Implementation of Yolo-v2 image recognition and other
testbenches for a CNN accelerator,” in 2019 IEEE 9th International
Conference on Consumer Electronics (ICCE-Berlin), Berlin, Germany,
2019, pp. 242-247, doi: 10.1109/ICCE-Berlin47944.2019.8966213.
[8] H. V and K. R, “Real time pedestrian detection using modified YOLO
V2,” in 2020 5th International Conference on Communication and
Electronics Systems (ICCES), COIMBATORE, India, 2020, pp. 855-
859, doi: 10.1109/ICCES48766.2020.9138103.
[9] J. Redmon and A. Farhadi, “YOLOv3: An incremental improvement,”
2018.
[10] J. Hu, X. Gao, H. Wu and S. Gao, “Detection of workers without the
helments in videos based on YOLO V3,” in 2019 12th International
Congress on Image and Signal Processing, BioMedical Engineering
and Informatics (CISP-BMEI), Suzhou, China, 2019, pp. 1-4, doi:
10.1109/CISP-BMEI48845.2019.8966045.
[11] A. Bochkovskiy, C.-Y. Wang, and H.-Y. M. Liao, “YOLOv4: Optimal
speed and accuracy of object detection,” 2020.