Fire Detection based on Convolutional Neural
Networks with Channel Attention
Xiaobo Zhang
School of Automation
Southeast University
Nanjing 210096, China
zxb852@sina.com
Kun Qian
School of Automation, Southeast University,
Nanjing 210096, China
Key Laboratory of Measurement and
Control of Complex Systems of Engineering,
Ministry of Education of China
kqian@seu.edu.cn
Kaihe Jing
School of Automation
Southeast University
Nanjing 210096, China
928104929@qq.com
Jianwei Yang
Future Science & Technology Park
Changping District,
Beijing, China, 102209
hvdcyjw@sina.com
Hai Yu
Global Energy Interconnection Research
Institute, State Grid
Nanjing, China, 210000
83616883@qq.com
Abstract—Existing research on fire detection mostly relies on two-stage methods, which slows detection and bounds localization accuracy by the first-stage candidate region extraction algorithm. To achieve high-precision real-time fire detection, this paper proposes a Yolo detection network combined with an attention mechanism. The attention module is serially appended to the final convolutional layers at each of the three output scales of Yolo v3. The channel attention module updates each feature map by a weighted sum over all channels, which captures the semantic dependencies between channels in the deep layers of the network and improves the generalization ability of the model. Experiments show that the proposed method improves fire detection accuracy without reducing detection speed.
Keywords—Image recognition; fire detection; deep learning; attention mechanism; Yolo
This work is supported by the National Natural Science Foundation of China (Grant No. 61573101) and by the Science and Technology Program of the Global Energy Interconnection Research Institute, "Research on infrared/ultraviolet image recognition algorithm and software modules development".
I. INTRODUCTION
Typical industrial production scenes such as power plants contain flammable and explosive materials. A fire can cause severe damage to production and endanger personnel. In addition, a production environment may contain multiple operating devices, and knowing which device is burning enables a more targeted response. Real-time fire detection is therefore essential.
Fire detection methods can be divided into two categories: 1) traditional fire detection and 2) vision-based fire detection. Traditional fire detection generally relies on sensors such as smoke, temperature, and photosensitive detectors. It works only at close range, can monitor only a limited volume of space, and is unsuitable for open areas; moreover, it cannot localize the flame.
Computer vision and deep learning have developed rapidly in recent years, and image-based algorithms can effectively overcome the shortcomings of traditional fire detection. Compared with traditional sensors, images offer a wider field of view and a longer detection range, and they carry more information, including the position of the flame. Surveillance cameras are common in production scenes and inexpensive. Image-based fire detection is likewise divided into traditional methods and deep learning methods. Among deep learning methods, most current research focuses on image-level fire detection rather than region-level fire detection. However, in typical power plant environments such as control rooms and substations, detecting and localizing flame regions in images is essential for identifying the burning equipment.
Yolo v3 is a recently proposed object detection network with excellent performance. This paper combines the channel attention mechanism with the Yolo v3 network, using the channel attention module to capture the dependencies among Yolo's deep semantic features. Experiments show that the method improves fire detection accuracy in multi-scale scenes without reducing detection speed.
II. RELATED WORK
Image processing provides an effective solution to the problems of traditional fire detection. Fire detection based on image processing is divided into traditional methods and deep learning methods. Traditional recognition algorithms can be summarized into three stages: 1) flame pixel classification, 2) motion detection, and 3) candidate region feature analysis. Flame pixel classification typically establishes static color models in various color spaces, which generate flame candidate regions; motion detection captures the dynamic features of flames; candidate region feature analysis synthesizes the preceding results and realizes fire detection through conditional filtering. Different methods combine flame pixel classification and motion detection in different ways. Celik et al. [9] used manually marked flame masks to extract flame pixel values for subsequent color model research. Celik et al. [1] proposed a flame color model represented by geometric shapes in RGB space.
In recent years, deep learning has become the mainstream approach to fire detection. Most existing studies only identify whether an image contains flame; few address flame localization. Dunnings et al. [7] used superpixel segmentation to divide the image into regions, performed fire detection on each segment, and took the union of the detected regions to obtain the segmentation. The positioning accuracy of this two-stage method is poor, and the superpixel segmentation step is very time-consuming. Building on a traditional color model, Zhong et al. [4] proposed a two-stage algorithm combining candidate region generation with a classification network, but the candidate regions were not used to locate the flame, and the output is only a binary classification of the image. Zhao et al. [14] used saliency detection to extract suspected fire areas, computed color and texture features of the ROI, and then applied two logistic regression classifiers to the ROI feature vectors; this is also a two-stage method whose positioning accuracy and detection speed are limited.
Yolo v3 is a recently proposed detection network with excellent performance. It casts bounding box localization as a regression problem, achieves high accuracy at high speed, and is well suited to real-time object detection. The attention mechanism was first used in machine translation and has since been applied to convolutional networks for image processing with good results. Fu et al. [6] proposed dual spatial and channel attention modules that capture the spatial and channel relationships of deep semantics. Wu et al. [2] introduced the channel attention mechanism into a flame classification network, using attention to learn the nonlinear interaction between channels and improving the accuracy of flame type classification. This paper combines Yolo v3 with the attention mechanism to achieve accurate and efficient fire detection.
III. BUILDING FLAME DATASET BY RGB-T AUTOMATIC
ANNOTATION
At present, most flame datasets available online provide only binary image-level labels rather than the bounding box labels required for fire detection. Therefore, this paper constructs a flame dataset. To obtain sample labels accurately and quickly, our previous work on RGB-T image registration was used. The RGB-IR camera shown in Fig. 1 was used to collect flame samples, and flames in multi-scale scenes were captured as paired RGB-T samples. A total of 5 videos were collected. The RGB images serve as the network input, and the infrared images feed the automatic annotation algorithm that generates the sample labels. To segment the flame mask in the infrared image accurately, we kept the relative position and viewing angle between the camera and the flame fixed and minimized background interference in the infrared image.
Fig. 1. RGB-IR camera (with RGB image channel, thermal image channel, and camera tripod labeled)
The automatic annotation process shown in Fig. 2 is based on our previous work on RGB-T image registration [18]. A mutual information method computes the transformation between a pair of infrared and RGB images so as to realize the automatic annotation. Registering the flame samples of the RGB and infrared images yields the transformation matrix M from the visible pixel coordinate system to the infrared pixel coordinate system:
M = \begin{bmatrix} R & T \\ 0^{T} & 1 \end{bmatrix} \quad (1)

\begin{bmatrix} X' \\ Y' \\ 1 \end{bmatrix} = \begin{bmatrix} R & T \\ 0^{T} & 1 \end{bmatrix} \begin{bmatrix} X \\ Y \\ 1 \end{bmatrix} \quad (2)
First, we use the transformation matrix to transform the RGB
images as shown in (2), and then crop the infrared field of view
and generate pixel-aligned pairs of RGB-T samples. An
example is shown in Fig. 3.
Fig. 2. Automatic annotation process
A sample video only needs to be registered once, and the
transformation matrix is applicable to all frames of this sample
video. Because the pixels in the flame area have higher
brightness in the infrared image, simple image processing
methods can be used to obtain the mask of the flame in the
infrared image. Since the infrared image and the RGB image are
pixel-aligned, the flame mask in the infrared image coincides with the flame mask in the RGB image. Finally, the minimum bounding rectangle of the mask is taken as the sample label.
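As a rough illustration, the following sketch shows the core of this annotation step with OpenCV and NumPy: warp the RGB frame with M, crop to the infrared field of view, threshold the infrared image to obtain the flame mask, and take the minimum bounding rectangle as the label. The file names, matrix values, and brightness threshold are hypothetical placeholders, not the authors' released code.

```python
# Minimal sketch of the RGB-T automatic annotation step.
# Assumptions: M (a 2x3 affine form of the [R|T] transform estimated once
# per video by registration) is known; file names and the brightness
# threshold below are hypothetical.
import cv2
import numpy as np

rgb = cv2.imread("frame_rgb.png")                        # visible-light frame
ir = cv2.imread("frame_ir.png", cv2.IMREAD_GRAYSCALE)    # infrared frame

# M maps visible pixel coordinates into the infrared pixel coordinate system.
M = np.array([[0.98, 0.01, 12.0],
              [-0.01, 0.98, 8.0]], dtype=np.float32)     # placeholder values

h, w = ir.shape
rgb_aligned = cv2.warpAffine(rgb, M, (w, h))  # warp + crop to IR field of view

# Flame pixels are much brighter in the infrared image, so a simple
# threshold (scene-dependent value) recovers the flame mask.
_, mask = cv2.threshold(ir, 200, 255, cv2.THRESH_BINARY)
mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((3, 3), np.uint8))

# Because rgb_aligned and ir are pixel-aligned, the same mask applies to the
# RGB image; its minimum bounding rectangle is the bounding-box label.
contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
if contours:
    x, y, bw, bh = cv2.boundingRect(max(contours, key=cv2.contourArea))
    print("bounding-box label:", (x, y, x + bw, y + bh))
```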
Fig. 3. Example of generating pixel-aligned pairs of RGB-T samples: the RGB image is warped by M and cropped to the infrared field of view, then merged with the infrared image
Fig. 4 shows some examples of automatic annotation. The left column shows the infrared images, the middle column the flame masks extracted by the image processing algorithm, and the right column the generated sample labels.
Fig. 4. Examples of automatic annotation
To expand the diversity of the sample set, 10 flame videos were downloaded from the Internet, covering flame samples in different scenes such as indoor and outdoor, long and short range, and day and night; these were manually labeled. Finally, the original dataset was expanded by data augmentation, and the final flame dataset contains 6000 samples.
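The paper does not specify which augmentation operations were used. Purely as an illustration, a sketch of two common, box-safe augmentations for detection data (horizontal flip with box adjustment, and brightness jitter) might look like this:

```python
# Illustrative data augmentation for box-labeled flame samples.
# The operations shown (horizontal flip, brightness jitter) are assumptions;
# the paper does not state which augmentations were applied.
import random
import numpy as np

def augment(image: np.ndarray, boxes: np.ndarray):
    """image: HxWx3 uint8; boxes: Nx4 array of (x1, y1, x2, y2)."""
    h, w = image.shape[:2]
    boxes = boxes.copy().astype(np.float32)

    if random.random() < 0.5:                     # horizontal flip
        image = image[:, ::-1].copy()
        boxes[:, [0, 2]] = w - boxes[:, [2, 0]]   # mirror x-coordinates

    if random.random() < 0.5:                     # brightness jitter
        gain = random.uniform(0.7, 1.3)
        image = np.clip(image.astype(np.float32) * gain, 0, 255).astype(np.uint8)

    return image, boxes
```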
IV. PROPOSED METHOD
The proposed method combines the Yolo v3 network with the attention mechanism. The network structure is shown in Fig. 5. The attention module is serially added after the final convolutional layers at each of the three scales. The structure of the attention module follows DANet [6]. The deep layers of the network mainly extract semantic information, where each channel corresponds to a certain type of semantic response. The channel attention module captures the semantic dependencies between channels and encodes channel context into local features. Its input is the original feature maps, and the pairwise dependency between feature maps serves as the weight used to update them. The updated feature map is a weighted sum over all feature maps, which captures the semantic dependency between channels and strengthens the expressive power of semantic features.
The channel attention mechanism is shown in Fig. 6, where the input feature maps A have scale C×H×W. First, A is reshaped to a C×N feature map, where N = H×W is the number of pixels in each channel; A is also reshaped and transposed to obtain an N×C feature map. The product of these two feature maps is passed through a softmax layer to obtain the channel attention matrix X (C×C), whose elements are:
x_{ji} = \frac{\exp(A_i \cdot A_j)}{\sum_{i=1}^{C} \exp(A_i \cdot A_j)} \quad (3)
Next, the original input feature maps A are reshaped to C×N and multiplied by the transpose of X, and the result is reshaped back to C×H×W, the same scale as the input. Finally, the result is multiplied by a scale factor and added to the original feature maps to give the final output E (C×H×W):
E_j = \beta \sum_{i=1}^{C} \left( x_{ji} \cdot A_i \right) + A_j \quad (4)
The scale factor β is the only trainable parameter in the channel attention module; it is initialized to 0 and is progressively assigned weight during training. X is the channel attention matrix: each row represents the weighting relationship between channels, and each value in a row is the weight of the corresponding channel. The final output feature E has the same scale as the input feature A, and the feature map of each channel is a weighted sum over the channels of the original input A. The channel attention module thus captures the relevance between different layers of semantic features, improves their expressive power, and models the semantic dependence between channels.
Fig. 6. Channel attention module
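To make the computation concrete, here is a minimal PyTorch sketch of the channel attention module as described by equations (3) and (4). The formulation follows DANet; the class and variable names are ours, and the point of insertion into Yolo v3 is indicated only by a comment, so treat this as an illustration rather than the authors' implementation.

```python
# Minimal PyTorch sketch of the DANet-style channel attention module
# described above. Names are ours; this is an illustration under the
# paper's equations, not the authors' released code.
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    def __init__(self):
        super().__init__()
        # beta is the module's only trainable parameter, initialized to 0.
        self.beta = nn.Parameter(torch.zeros(1))

    def forward(self, A):                  # A: (B, C, H, W)
        B, C, H, W = A.shape
        a = A.view(B, C, -1)               # (B, C, N), N = H*W
        # Channel affinity: (B, C, N) x (B, N, C) -> (B, C, C)
        energy = torch.bmm(a, a.transpose(1, 2))
        X = torch.softmax(energy, dim=-1)  # x_ji, normalized over i (eq. 3)
        # Weighted sum over channels, reshaped back to (B, C, H, W)
        out = torch.bmm(X, a).view(B, C, H, W)
        return self.beta * out + A         # E_j = beta * sum_i x_ji A_i + A_j (eq. 4)

# In Yolo v3-SA the module would be inserted serially before each of the
# three multi-scale detection heads, e.g. feat = self.attn(feat) right
# before the final detection convolutions.
```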
Fig. 5. Yolo v3 detection network combined with the attention mechanism: a 416×416×3 input passes through the Darknet-53 backbone (without FC layers); attention modules are serially inserted at the three scales, whose heads produce multi-scale outputs of 13×13×255, 26×26×255, and 52×52×255 via upsampling and concatenation
V. EXPERIMENT AND EVALUATION
The tests were conducted on a computer with an Intel Core i7-9700K CPU @ 3.60 GHz, an NVIDIA RTX 2080 Ti GPU, 16 GB RAM, and Ubuntu 16.04. Detection performance was evaluated on two test sets. The first test set was generated by the hold-out method: the original dataset was randomly split into a training set and a test set, with 600 samples in the test set. The distribution of the first test set is close to that of the training set. Yolo v3 and the proposed method (Yolo v3-SA) were both evaluated on this set using the Average Precision (AP) calculation of Pascal VOC (a sketch of this computation is given below). At an IOU threshold of 0.8, the AP of Yolo v3 is 66.3% and that of Yolo v3-SA is 66.77%, a slight increase. The channel attention mechanism captures the semantic dependencies between channels by weighting and summing the channels, but it also blurs the original semantic features; as a result, the improvement is small on this first test set, whose distribution is close to the training set.
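For reference, the following is a compact sketch of a Pascal VOC style AP computation from a precision-recall curve. It uses all-point interpolation; the paper does not state which VOC interpolation variant (11-point or all-point) was used, so that choice is an assumption.

```python
# Sketch of Pascal VOC style AP from a precision-recall curve
# (all-point interpolation; the exact variant used in the paper is
# not stated, so this is an assumption).
import numpy as np

def voc_ap(recall: np.ndarray, precision: np.ndarray) -> float:
    """recall must be sorted ascending, with precision aligned to it."""
    # Append sentinel values at both ends of the curve.
    r = np.concatenate(([0.0], recall, [1.0]))
    p = np.concatenate(([0.0], precision, [0.0]))
    # Make precision monotonically decreasing (interpolation step).
    for i in range(len(p) - 2, -1, -1):
        p[i] = max(p[i], p[i + 1])
    # Sum the area under the curve where recall changes.
    idx = np.where(r[1:] != r[:-1])[0]
    return float(np.sum((r[idx + 1] - r[idx]) * p[idx + 1]))
```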
During RGB-T registration, the RGB image is transformed and cropped, so part of the field of view is lost, which shifts the distribution relative to the original flame samples. To test the detection performance of the model on samples with a different distribution, 250 new test samples were generated by manually labeling the original flame samples. The first test set was expanded with these new samples, giving a new test set of 850 samples, called the generalized test set.
First, detection speed was tested; the results are shown in Table I. The experiment shows that the proposed method reduces detection speed only slightly, and the algorithm remains real-time.
TABLE I. DETECTION SPEED

Algorithm            Yolo v3    Yolo v3-SA
Frames Per Second    22.14      21.90
The two models were then tested on the generalized test set. Table II lists the AP of the two models under different IOU thresholds (a minimal IoU helper is sketched after the table). The precision-recall curves at an IOU of 0.75 are drawn in Fig. 7. On the generalized test set, the accuracy of the proposed method is 4.57% higher than that of Yolo v3. Fig. 9 shows some test examples.
TABLE II. DETECTION AVERAGE PRECISION

Algorithm     0.85 IOU   0.80 IOU   0.75 IOU   0.70 IOU   0.65 IOU
Yolo v3       28.99%     52.88%     69.31%     80.04%     85.79%
Yolo v3-SA    29.57%     55.86%     73.88%     86.51%     91.74%
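The IOU thresholds in Table II decide whether a detection matches a ground-truth box. A minimal helper for this matching criterion (a generic textbook computation, not code from the paper) is:

```python
# IoU between two boxes (x1, y1, x2, y2); used as the matching criterion
# for the thresholds in Table II. Generic computation, not the paper's code.
def iou(a, b):
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)
```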
Fig. 7. Precision-Recall curves at 0.75 IOU: (a) Yolo v3, (b) Yolo v3-SA
Finally, supplementary experiments were performed on the public VOC2007 dataset to verify the effectiveness of the proposed method. The mAP results are shown in Fig. 8: the mean Average Precision of the proposed method is 0.54% higher than that of Yolo v3.
Fig. 8. Supplementary experimental results on the VOC2007 dataset: (a) Yolo v3, (b) Yolo v3-SA
Fig. 9. Examples of Yolo v3-SA fire detection
VI. CONCLUSION
To achieve high-precision real-time fire detection, this paper proposes a Yolo detection network combined with an attention mechanism. A flame dataset is generated using an automatic annotation algorithm and data augmentation. By combining the attention mechanism with the Yolo v3 network, the channel attention module captures the semantic dependencies between channels. Experiments show that the proposed method improves fire detection accuracy without reducing detection speed. Future work will address the scarcity of flame samples from real production scenes to further improve the generalization ability of the model.
REFERENCES
[1] T. Celik, H. Demirel, and H. Ozkaramanli, "Automatic fire detection in
video sequences," in 2006 14th European Signal Processing Conference,
2006: IEEE, pp. 1-5.
[2] Y. Wu, Y. He, P. Shivakumara, Z. Li, H. Guo, and T. Lu, "Channel-wise
attention model-based fire and rating level detection in video," CAAI
Transactions on Intelligence Technology, vol. 4, no. 2, pp. 117-121, 2019.
[3] B. U. Töreyin, Y. Dedeoğlu, U. Güdükbay, and A. E. Çetin, "Computer
vision based method for real-time fire and flame detection," Pattern
recognition letters, vol. 27, no. 1, pp. 49-58, 2006.
[4] Z. Zhong, M. Wang, Y. Shi, and W. Gao, "A convolutional neural
network-based flame detection method in video sequence," Signal, Image
and Video Processing, vol. 12, no. 8, pp. 1619-1627, 2018.
[5] K. Muhammad, J. Ahmad, I. Mehmood, S. Rho, and S. W. Baik,
"Convolutional neural networks based fire detection in surveillance
videos," IEEE Access, vol. 6, pp. 18174-18183, 2018.
[6] J. Fu et al., "Dual attention network for scene segmentation," in
Proceedings of the IEEE Conference on Computer Vision and Pattern
Recognition, 2019, pp. 3146-3154.
[7] A. J. Dunnings and T. P. Breckon, "Experimentally defined convolutional
neural network architecture variants for non-temporal real-time fire
detection," in 2018 25th IEEE International Conference on Image
Processing (ICIP), 2018: IEEE, pp. 1558-1562.
[8] X. Wang, A. Shrivastava, and A. Gupta, "A-fast-rcnn: Hard positive
generation via adversary for object detection," in Proceedings of the IEEE
conference on computer vision and pattern recognition, 2017, pp. 2606-
2615.
[9] T. Celik and H. Demirel, "Fire detection in video sequences using a
generic color model," Fire safety journal, vol. 44, no. 2, pp. 147-158, 2009.
[10] B. U. Toreyin, Y. Dedeoglu, and A. E. Cetin, "Flame detection in video
using hidden Markov models," in IEEE International Conference on
Image Processing 2005, 2005, vol. 2: IEEE, pp. II-1230.
[11] W. Phillips III, M. Shah, and N. da Vitoria Lobo, "Flame recognition in
video," Pattern recognition letters, vol. 23, no. 1-3, pp. 319-327, 2002.
[12] Y. Ren, C. Zhu, and S. Xiao, "Object detection based on fast/faster RCNN
employing fully convolutional architectures," Mathematical Problems in
Engineering, vol. 2018, 2018.
[13] C. Hu, P. Tang, W. Jin, Z. He, and W. Li, "Real-time fire detection based
on deep convolutional long-recurrent networks and optical flow method,"
in 2018 37th Chinese Control Conference (CCC), 2018: IEEE, pp. 9061-
9066.
[14] Y. Zhao, J. Ma, X. Li, and J. Zhang, "Saliency detection and deep
learning-based wildfire identification in UAV imagery," Sensors, vol. 18,
no. 3, p. 712, 2018.
[15] B. Kim and J. Lee, "A video-based fire detection using deep learning
models," Applied Sciences, vol. 9, no. 14, p. 2862, 2019.
[16] C. B. Liu and N. Ahuja, "Vision based fire detection," in Proceedings of
the 17th International Conference on Pattern Recognition, 2004. ICPR
2004., 2004, vol. 4: IEEE, pp. 134-137.
[17] Q. X. Zhang, G. H. Lin, Y. M. Zhang, G. Xu, and J. I. Wang, "Wildland
forest fire smoke detection based on faster R-CNN using synthetic smoke
images," Procedia engineering, vol. 211, pp. 441-446, 2018.
[18] J. Ma, K. Qian, X. Zhang, and X. Ma, "Weakly Supervised Instance
Segmentation of Electrical Equipment Based on RGB-T Automatic
Annotation," IEEE Transactions on Instrumentation and Measurement,
2020, DOI: 10.1109/TIM.2020.3001796.