Indonesian Journal of Electrical Engineering and Computer Science
Vol. 29, No. 1, January 2023, pp. 421~430
ISSN: 2502-4752, DOI: 10.11591/ijeecs.v29.i1.pp421-430
Journal homepage: http://ijeecs.iaescore.com
An adaptive algorithm based on principal component analysis-deep learning for anomalous events detection
Zainab K. Abbas, Ayad A. Al-Ani
Department of Information and Communication Engineering, College of Information Engineering, Al-Nahrain University,
Baghdad, Iraq
Article Info
Article history:
Received Aug 13, 2022
Revised Sep 9, 2022
Accepted Sep 19, 2022
Keywords:
Anomaly detection
Bidirectional long short-term memory
Deep learning
Machine learning
Principal component analysis
ResNet50
ABSTRACT
Anomaly detection, the subject of this paper, is one of the most common applications of human activity detection. Providing security for individuals is a key issue in every community because of the constantly expanding range of dangerous activities, from planned violence to accidental harm. Classical closed-circuit television is insufficient because it requires a person to stay awake and constantly monitor the cameras, which is expensive. In addition, a person's attention decreases over time. For these reasons, an automated security system that can identify suspicious activities in real time and quickly aid victims is required. Because activities must be identified with high accuracy and in the shortest possible time, we adopt an adaptive algorithm that combines principal component analysis (PCA), a machine learning (ML) technique, with deep learning (DL). The UCF-Crime dataset was used for the experiments in this work. The proposed approach achieved an area under the curve (AUC) of 94.21% and a detection accuracy of 88.46% on the test set. The proposed system demonstrated its robustness and achieved the best accuracy when compared with earlier systems.
This is an open access article under the CC BY-SA license.
Corresponding Author:
Zainab K. Abbas
Department of Information and Communication Engineering, College of Information Engineering
Al-Nahrain University, Baghdad, Iraq
Email: zainabkudair@gmail.com
1. INTRODUCTION
Video surveillance systems (VSS) are frequently used to increase public safety in settings such as malls, hospitals, banks, markets, smart cities, educational institutions, and roadways [1]. The main goals of security applications are typically the accuracy and speed of video anomaly identification [2]. For the sake of public safety, numerous surveillance cameras have recently been installed throughout the world [3]. These cameras continuously generate massive volumes of video data [3]. Consequently, analyzing video in real time to find abnormal cases requires a lot of human resources and is prone to error because human attention declines over time [1]. Because human monitoring is ineffective, automatic anomaly detection solutions based on artificial intelligence (AI) algorithms have become necessary in surveillance systems [4].
Various techniques in the literature define anomalous actions as "the occurrence of variance in regular patterns" [3]. Applications of abnormal event detection in surveillance videos include traffic security, automated intelligent visual monitoring, and crime prevention [5]. Because real anomalous instances are scarce, video anomaly detection was formerly treated as a one-class classification task [6]-[8]; that is, the classifier model is trained on normal videos only, and a video is classified as anomalous when irregular patterns are seen during testing [5]. As a result, normal behaviors that deviate from the normal events in the training set may trigger false alarms [1], [9].
Many studies have been performed in this field, each adopting a strategy for a specific situation. Waqas et al. [10] provided a framework that can recognize abnormal behavior and inform the user of the type of behavior. Shreyas et al. [11] suggested adaptively compressing each video before transmitting it to the event detection system. Anala et al. [12], Hao et al. [13], and Dubey et al. [14] treated anomalous behavior detection as a regression problem. Ullah et al. [5] offered a lightweight convolutional neural network (CNN). Ullah et al. [3] proposed an intelligent anomaly detection system that combines ResNet50 with a multilayer bidirectional long short-term memory (BiLSTM). Zaheer et al. [15] presented a weakly supervised anomaly detection method that trains the model using video-level labels. Majhi et al. [16] provided another weakly supervised learning model that handles both anomaly detection and classification. Wu et al. [17] developed a dual-branch network that takes multi-detail ideas in both the temporal and spatial dimensions as input. To detect video anomalies, Cao et al. [18] suggested taking into consideration the spatial-temporal relationships between video parts. Abbas and Al-Ani [2] suggested compressing each video using high-efficiency video coding (H.265) before feeding it into the anomaly detection system. Abbas and Al-Ani [19] also suggested an algorithm for reducing the size of the extracted features before anomaly identification.
Even though the word "anomaly" is used frequently in the literature, it has no standard definition yet [9]. Most existing techniques have a high false-alarm rate. Additionally, although these approaches work well on simple databases, their effectiveness drops in real-world situations.
To overcome these difficulties, we propose reducing the dimensionality of the features extracted by a pre-trained convolutional neural network (ResNet50) using principal component analysis (PCA) [20], in order to improve the performance of the model and reduce its complexity. We then feed the reduced features to our classifier model, a BiLSTM, which we train with a weakly supervised method based on spatio-temporal features. BiLSTMs have proven highly useful when the context of the input is needed. In a unidirectional LSTM, information flows in one direction only, from earlier to later time steps (backward to forward). A BiLSTM uses two hidden states so that information flows not only backward to forward but also forward to backward. As a result, BiLSTMs are better able to capture context [21].
2. METHODS
The results of a study that compared the various methods used for anomaly detection showed that
deep learning (DL) has outperformed other methods in this field [1]. The recommended approach used in this
work is split into three stages:
Feature extraction using Resnet50.
Dimensionality reduction using PCA.
Anomaly events detection using BiLSTM.
This work assesses, for the first time on the UCF-Crime database, a combination of machine learning (ML) and DL algorithms for anomalous event detection in videos. Figure 1 illustrates the suggested framework.
Figure 1. The suggested framework
2.1. Input database UCF-crime
The UCF-Crime database was used in this work [22], [23]. It comprises 13 categories of anomalies, such as explosions, fights, abuse, and accidents, in addition to normal events. The collection contains 1900 surveillance videos, with almost equal numbers of normal and abnormal videos. The training set includes 810 anomalous and 800 normal videos, while the testing set includes 140 anomalous and 150 normal videos [10], [24]. The set contains almost 129 hours of video (about 13 million frames) at a resolution of 320x240, with videos of different lengths [10], [22], [24]. We selected this database because it covers a variety of anomalous event categories whose irregularities have a big influence on public safety. However, the database has two issues. First, its anomalous class has high inter-class variation. Second, only video-level labels are available, which means we know only that a video contains an abnormality, not which part of it is abnormal. This may lead to overfitting [2], [22]. In this work, we selected the videos with a length of two minutes or less, which gave 1324 videos in total: 1116 videos for training (90% for training and 10% for validation) and the remaining 208 for testing.
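The selection and split described above reduce to simple arithmetic. The sketch below only restates the counts given in the text; the rounding of the 10% validation share is an assumption, since the exact number of validation videos is not reported.

```python
# Counts reported in the text above.
TOTAL_VIDEOS = 1900       # full UCF-Crime collection
SELECTED_VIDEOS = 1324    # videos with a length of two minutes or less
TRAIN_POOL = 1116         # selected videos used for training and validation

# The remainder of the selected videos is held out for testing.
test_videos = SELECTED_VIDEOS - TRAIN_POOL

# Assumption: the 10% validation share is rounded to the nearest whole video.
validation_videos = round(TRAIN_POOL * 0.10)
training_videos = TRAIN_POOL - validation_videos

# Share of the full collection that satisfies the two-minute condition.
selected_share = SELECTED_VIDEOS / TOTAL_VIDEOS

print(f"test videos:       {test_videos}")
print(f"training videos:   {training_videos}")
print(f"validation videos: {validation_videos}")
print(f"selected share:    {selected_share:.1%}")
```

The test count of 208 follows directly from the reported totals; the training and validation counts depend on the rounding assumption.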
2.2. Machine learning (ML) and deep learning (DL)
DL, which is considered a subset of ML, can successfully handle unstructured data, and DL approaches outperform conventional ML approaches. DL enables computational models to learn and represent the features of the data progressively at multiple levels. It has become increasingly prevalent as data availability has expanded and computers have become more powerful. A DL model transforms the input through successive layers, each of which extracts features and passes them to the next layer. The initial layers capture basic features, which the later layers combine into a complete description. DL can be implemented with a variety of architectures, including CNNs, pre-trained networks, recurrent neural networks (RNNs), and others. The performance of DL algorithms keeps improving as the amount of training data increases, whereas the performance of traditional ML algorithms plateaus after a certain amount of training data. Although deep structures require more time to train, they perform better than simple artificial neural networks (ANNs). However, strategies such as transfer learning and GPU computing can shorten the training time [1], [2], [25].
A CNN is a type of neural network that contains convolutional layers. Although CNNs process spatial data well, RNNs are better suited to sequential data. This is because an RNN uses state variables to store past information and uses it together with the current inputs to determine the current outputs [2], [26]. CNNs are mostly used in image processing. Learnable weights and biases are assigned to the various objects in an image, which allows them to be distinguished from one another. A CNN needs less preprocessing than other classification techniques because it uses appropriate filters to capture both the temporal and spatial dependencies in an image. CNN architectures include LeNet, AlexNet, ZFNet, VGGNet, GoogLeNet, and ResNet [2], [25].
RNNs, on the other hand, use prior outputs as inputs when determining the current state. The hidden layers of an RNN can retain information: the hidden state is updated using the output generated in the previous state. Because they can remember previous inputs, RNNs can be used to predict time series. The LSTM is an example of an RNN [2], [25].
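The recurrence described above can be written compactly. The form below is the standard simple (Elman) RNN update, given as an illustration rather than as the formulation used in [2], [25]. The symbols W_x, W_h, W_y, b_h, b_y and the choice of tanh are assumptions introduced for this sketch: x_t is the input at time step t, h_t is the hidden state, and y_t is the output.

```latex
\begin{aligned}
h_t &= \tanh\left(W_x x_t + W_h h_{t-1} + b_h\right), \\
y_t &= W_y h_t + b_y .
\end{aligned}
```

The term W_h h_{t-1} is what carries information from previous inputs into the current step. An LSTM replaces this single update with gated updates of a separate cell state.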
2.2.1. Features extraction
The 50-layer deep convolutional neural network ResNet50, which has 23.5 million learnable parameters, was used in this study to extract features [3]. This network generates 1000 features for each frame [2]. ResNet50 takes an input of 224 by 224 pixels. For this reason, each video frame was cropped along its longest edge using a center crop and then resized to match the input dimensions.
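The cropping and resizing step can be sketched as follows. This is a minimal illustration in plain Python, assuming a frame stored as a list of pixel rows and nearest-neighbour resizing; the function name center_crop_and_resize and the interpolation method are assumptions, since the paper does not describe how resizing was performed.

```python
def center_crop_and_resize(frame, size=224):
    """Center-crop a frame to a square, then resize it to size x size.

    frame is a list of rows, each row a list of pixels.
    Nearest-neighbour sampling is used here as an assumption.
    """
    height, width = len(frame), len(frame[0])
    side = min(height, width)              # the longest edge is cropped
    top = (height - side) // 2
    left = (width - side) // 2
    # Source index inside the square crop for every output coordinate.
    source = [i * side // size for i in range(size)]
    return [[frame[top + r][left + c] for c in source] for r in source]


# A 320x240 frame (240 rows, 320 columns) whose pixels record their own position.
frame = [[(row, col) for col in range(320)] for row in range(240)]
resized = center_crop_and_resize(frame)
print(len(resized), len(resized[0]))
```

For a 320x240 frame, 40 columns are dropped on each side before the 240x240 square is scaled down to 224x224.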
2.2.2. Dimensionality reduction - principal component analysis (PCA)
This technique appears in the literature under different names, such as singular value decomposition (SVD), the Hotelling transform, the Karhunen-Loève transform (KLT), and the empirical orthogonal function (EOF) method [27]. It is a long-established, straightforward statistical approach that maps data from a higher dimension to a lower dimension in the hope of gaining a better understanding of the data. PCA is used for data compression, dimensionality reduction, and data visualization. In dimensionality reduction, it greatly simplifies the problem by drastically lowering the number of features: if the original dataset has x features, the new dataset has y features, where y is less than x. Some information is lost because the new dimension is smaller than the original one. PCA can help uncover factors buried deep within the data. The principal components are uncorrelated linear combinations of the original variables. Two or three principal components can account for as much as 90% of the original signal. In machine learning, PCA is an unsupervised learning approach for dimensionality reduction. It is a statistical technique that uses an orthogonal transformation to convert observations of correlated features into a set of linearly uncorrelated features. These newly transformed features are called the principal components (PCs) [20], [27], [28]. Figure 2 shows the steps of PCA applied in this work [27].
Figure 2. The PCA steps
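As an illustration of how a variance threshold determines the number of retained components, the sketch below implements PCA through the singular value decomposition of mean-centred data. It is a generic formulation and not necessarily the exact sequence of steps in Figure 2. The function name pca_reduce, the synthetic data, and the use of NumPy are assumptions for this example.

```python
import numpy as np


def pca_reduce(features, variance=0.95):
    """Project features onto the fewest principal components that
    explain at least the requested fraction of the total variance.

    features: array of shape (n_observations, n_features).
    Returns (scores, n_components).
    """
    centered = features - features.mean(axis=0)
    # Rows of vt are the principal directions; the squared singular
    # values are proportional to the variance along each direction.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    explained = singular_values**2 / np.sum(singular_values**2)
    cumulative = np.cumsum(explained)
    # First index whose cumulative share reaches the threshold.
    n_components = int(np.searchsorted(cumulative, variance)) + 1
    n_components = min(n_components, len(singular_values))
    scores = centered @ vt[:n_components].T
    return scores, n_components


# Synthetic stand-in for per-frame features: 200 observations of 50
# correlated features generated from only 3 underlying factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 3))
mixing = rng.normal(size=(3, 50))
frames = latent @ mixing

reduced, n_components = pca_reduce(frames, variance=0.95)
print(reduced.shape, n_components)
```

Because the synthetic features are built from three factors, at most three components are needed here; real ResNet50 features would typically need more.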
2.2.3. Classifier
Unlike previous studies, which fed the classifier model with the features produced by the feature extraction stage, in this work we fed the classifier model with the features obtained from the dimensionality reduction stage. The BiLSTM is considered suitable for long sequential data such as videos [2], so we used it as the classifier model for anomaly detection.
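To make the bidirectional idea concrete, the sketch below runs one LSTM pass over a feature sequence from first frame to last, a second pass from last to first, and combines the two final hidden states into a single anomaly score. It is a forward pass with random, untrained weights, written with NumPy as an assumption; the paper's classifier was built in MATLAB and its exact layer configuration beyond Table 1 is not given. The sequence length, the input dimension, and all function names are hypothetical.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def lstm_last_hidden(sequence, params):
    """Run one LSTM direction over sequence (T, D); return the final hidden state."""
    w_in, w_rec, bias = params        # shapes (4H, D), (4H, H), (4H,)
    hidden_size = w_rec.shape[1]
    h = np.zeros(hidden_size)
    c = np.zeros(hidden_size)
    for x in sequence:
        gates = w_in @ x + w_rec @ h + bias
        i, f, g, o = np.split(gates, 4)   # input, forget, candidate, output
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
    return h


def init_params(rng, input_dim, hidden_size, scale=0.1):
    """Random (untrained) weights, for illustration only."""
    return (rng.normal(0.0, scale, (4 * hidden_size, input_dim)),
            rng.normal(0.0, scale, (4 * hidden_size, hidden_size)),
            np.zeros(4 * hidden_size))


def bilstm_anomaly_score(sequence, forward, backward, w_out, b_out):
    """Concatenate the forward and backward summaries and squash to (0, 1)."""
    h_forward = lstm_last_hidden(sequence, forward)
    h_backward = lstm_last_hidden(sequence[::-1], backward)
    summary = np.concatenate([h_forward, h_backward])
    return float(sigmoid(w_out @ summary + b_out))


rng = np.random.default_rng(0)
T, D, H = 30, 20, 112   # H = 112 hidden nodes as listed in Table 1; T and D are hypothetical
video_features = rng.normal(size=(T, D))
forward = init_params(rng, D, H)
backward = init_params(rng, D, H)
w_out = rng.normal(0.0, 0.1, 2 * H)
score = bilstm_anomaly_score(video_features, forward, backward, w_out, 0.0)
print(f"anomaly score: {score:.3f}")
```

With untrained weights the score carries no meaning; the point is only that the backward pass lets the summary of a video depend on later frames as well as earlier ones.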
3. RESULTS AND DISCUSSION
The code in this research was implemented in the MATLAB software environment (version 2021a). The work was done on a computer with the following specifications: Windows 10 (64-bit), Intel Core i7 processor, 1 TB SSD, 16 GB RAM, and an NVIDIA GeForce MX450 graphics processing unit. The results of each stage were as follows:
3.1. Input dataset UCF-crime
The UCF-Crime database was used for this study's experiments. The 13 anomaly classes were used in addition to the normal class. Because anomaly identification in this work is performed at the video level, the duration of a video has no bearing on the detection task, so we chose videos no longer than two minutes. This gave 1324 videos in total: 1116 videos for training (90% for training and 10% for validation) and the remaining 208 for testing.
3.2. Machine learning (ML) and deep learning (DL)
3.2.1. Features extraction
In this study, feature extraction was performed using the pre-trained ResNet50 model. The features are taken from the fc1000 layer rather than from the softmax layer. After this step, each frame in the video is represented by 1000 features; that is, a video with x frames yields 1000 * x features, which is a huge amount of data. For this reason, we propose in this paper, for the first time, feeding the extracted features to a PCA model before classifying them, in order to reduce the training time while increasing the classification accuracy.
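The size of the raw feature set grows linearly with the number of frames. The short sketch below evaluates 1000 * x for a video at the two-minute length limit; the frame rate of 30 frames per second is an assumption, since the paper does not state it, and the function name feature_count is hypothetical.

```python
FEATURES_PER_FRAME = 1000   # output size of ResNet50's fc1000 layer


def feature_count(duration_seconds, frames_per_second=30.0):
    """Return (frames, features) for one video.

    frames_per_second=30 is an assumption, not a value from the paper.
    """
    frames = int(duration_seconds * frames_per_second)
    return frames, frames * FEATURES_PER_FRAME


frames, features = feature_count(120)   # a two-minute video
print(f"{frames} frames -> {features:,} features")
```

Under this assumption a single two-minute video already produces several million feature values, which motivates the reduction step.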
3.2.2. Dimensionality reduction - principal component analysis (PCA)
In this stage, the principal components (PCs) of the extracted features are calculated; these newly transformed features are fed to the classifier model instead of the features extracted by ResNet50. In this work, the PCs were calculated for each video in the database at different variance values. Figures 3 and 4 show the number of PCs for every video at different variance values for the training and test data of the UCF-Crime dataset, respectively. The x-axis represents the video number and the y-axis represents the number of PCs of the video.
Figure 3. The video’s PC number for train datasets at different variance values
Figure 4. The video’s PC number for test datasets at different variance values
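The relationship plotted in Figures 3 and 4, where a higher variance value requires more PCs, can be reproduced on a toy example. The sketch below counts how many components are needed to reach each variance value used in this work; the eigenvalues are hypothetical numbers chosen for illustration, not values from the paper, and the function name components_for_variance is an assumption.

```python
def components_for_variance(eigenvalues, variance):
    """Smallest number of components whose eigenvalues cover the
    requested fraction of the total variance."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    running = 0.0
    for count, value in enumerate(ordered, start=1):
        running += value
        if running / total >= variance:
            return count
    return len(ordered)


# Hypothetical eigenvalues of one video's feature covariance matrix.
eigenvalues = [5.3, 2.3, 1.1, 0.7, 0.4, 0.2]

# Variance values evaluated in this work (Figures 5-14).
for variance in (0.55, 0.60, 0.65, 0.70, 0.75, 0.80, 0.85, 0.90, 0.95, 0.99):
    count = components_for_variance(eigenvalues, variance)
    print(f"variance {variance:.0%}: {count} PCs")
```

The count never decreases as the variance value rises, which matches the ordering of the curves one would expect in Figures 3 and 4.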
3.2.3. Classifier
Once the features are ready, a classifier must be applied. In this work, a BiLSTM classification model was used. Table 1 shows the parameters of the classifier model for different variance values; the Adam optimizer was used in all cases. The values of L2 regularization and dropout were chosen to reduce the model's overfitting. The model parameters were selected by trial and error.
Table 1. The parameter values of the classifier model at different variance values
Classifier parameter        Variance value 75%
Mini-batch size             32
Hidden layer nodes No.      112
Dropout                     0.8
Initial learning rate       1e-5
Maximum epochs              170
L2 regularization           0.8
Following previous research, the receiver operating characteristic (ROC) curve and the area under the curve (AUC) were used as performance measures to assess the proposed work. Moreover, we determined the detection accuracy of our classifier model using (1) [29].
\[ \text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \times 100\% \tag{1} \]
The experimental results show the efficiency of our proposed framework, as it detects anomalous events more accurately than existing methods. For each variance value, Figures 5-14 show the ROC curve of our classifier model on the left and the confusion matrix on the right. The highest AUC was recorded at a variance value of 90%, while the highest detection accuracy was recorded at a variance value of 95%. In the latter case, the true positives (TP, anomalies classified as anomalies) numbered 84, the false negatives (FN, anomalies classified as normal) numbered 9, the true negatives (TN, normal videos classified as normal) numbered 100, and the false positives (FP, normal videos classified as anomalies) numbered 15.
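The figures quoted in this paragraph can be checked directly. The sketch below takes the AUC and detection accuracy reported with Figures 5-14, finds the variance value that maximises each, and recomputes the accuracy from the confusion-matrix counts as (TP + TN) / (TP + TN + FP + FN). The false-alarm and miss rates are derived here from the same counts and are not reported in the paper.

```python
# (AUC %, detection accuracy %) per variance value, as reported with Figures 5-14.
results = {
    99: (92.30, 84.13),
    95: (93.60, 88.46),
    90: (94.21, 86.54),
    85: (94.11, 87.50),
    80: (93.61, 86.54),
    75: (93.28, 87.50),
    70: (92.88, 86.54),
    65: (92.13, 85.58),
    60: (93.25, 86.54),
    55: (93.44, 86.06),
}

best_auc_variance = max(results, key=lambda v: results[v][0])
best_accuracy_variance = max(results, key=lambda v: results[v][1])

# Confusion-matrix counts reported for the 95% variance value.
tp, fn, tn, fp = 84, 9, 100, 15
total = tp + tn + fp + fn
accuracy = 100.0 * (tp + tn) / total

# Derived quantities (not reported in the paper).
false_alarm_rate = 100.0 * fp / (fp + tn)   # normal videos flagged as anomalies
miss_rate = 100.0 * fn / (fn + tp)          # anomalies classified as normal

print(f"highest AUC at variance {best_auc_variance}%")
print(f"highest accuracy at variance {best_accuracy_variance}%")
print(f"accuracy from counts: {accuracy:.2f}% over {total} test videos")
print(f"false-alarm rate: {false_alarm_rate:.2f}%, miss rate: {miss_rate:.2f}%")
```

The recomputed accuracy agrees with the 88.46% reported for the 95% variance value, and the four counts sum to the 208 test videos. Note that the best AUC and the best accuracy come from different variance values.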
Figure 5. The classifier model ROC curve and the confusion matrix for variance = 99% (AUC = 92.30%, detection accuracy = 84.13%)
Figure 6. The classifier model ROC curve and the confusion matrix for variance = 95% (AUC = 93.60%, detection accuracy = 88.46%)
Figure 7. The classifier model ROC curve and the confusion matrix for variance = 90% (AUC = 94.21%, detection accuracy = 86.54%)
Figure 8. The classifier model ROC curve and the confusion matrix for variance = 85% (AUC = 94.11%, detection accuracy = 87.50%)
Figure 9. The classifier model ROC curve and the confusion matrix for variance = 80% (AUC = 93.61%, detection accuracy = 86.54%)
Figure 10. The classifier model ROC curve and the confusion matrix for variance = 75% (AUC = 93.28%, detection accuracy = 87.50%)
Figure 11. The classifier model ROC curve and the confusion matrix for variance = 70% (AUC = 92.88%, detection accuracy = 86.54%)
Figure 12. The classifier model ROC curve and the confusion matrix for variance = 65% (AUC = 92.13%, detection accuracy = 85.58%)
Figure 13. The classifier model ROC curve and the confusion matrix for variance = 60% (AUC = 93.25%, detection accuracy = 86.54%)
Figure 14. The classifier model ROC curve and the confusion matrix for variance = 55% (AUC = 93.44%, detection accuracy = 86.06%)
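The AUC values reported with Figures 5-14 summarise each ROC curve in a single number. The paper does not state how the AUC was computed; as a minimal sketch, it can be obtained from per-video anomaly scores with the rank (Mann-Whitney) formulation, which equals the probability that a randomly chosen anomalous video receives a higher score than a randomly chosen normal one. The labels and scores below are made-up illustration data, not results from the paper, and the function name roc_auc is an assumption.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for anomalous, 0 for normal.
    scores: classifier outputs, higher meaning more anomalous.
    Ties between a positive and a negative count as half a win.
    """
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Made-up example: three anomalous and three normal videos.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
auc = roc_auc(labels, scores)
print(f"AUC = {100 * auc:.2f}%")
```

In this toy example one anomalous video scores below one normal video, so 8 of the 9 anomalous/normal pairs are ranked correctly. Unlike the detection accuracy, the AUC does not depend on the decision threshold.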
Having described the proposed system, we now compare it with previous studies. Table 2 compares the AUC scores with those of previous research works; the proposed system achieved the highest AUC, 94.21%.
Table 2. AUC score comparison between the proposed work and the previous works
Method                      AUC (%)
Waqas et al. [10]           75.41
Anala et al. [12]           85
Shreyas et al. [11]         79.8
Hao et al. [13]             81.22
Dubey et al. [14]           81.91
Ullah et al. [5]            78.43
Ullah et al. [3]            85.53
Zaheer et al. [15]          78.27
Majhi et al. [16]           82.12
Wu et al. [17]              87.65
Cao et al. [18]             83.14
Abbas and Al-Ani [2]        90.16
Abbas and Al-Ani [19]       93.61
Our adaptive algorithm      94.21
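The size of the improvement in Table 2 can be made explicit. The sketch below copies the AUC scores from the table, finds the best previous method, and computes the margin of the proposed algorithm in percentage points; the variable names are illustrative.

```python
# AUC scores (%) of the previous works, copied from Table 2.
previous_auc = {
    "Waqas et al. [10]": 75.41,
    "Anala et al. [12]": 85.0,
    "Shreyas et al. [11]": 79.8,
    "Hao et al. [13]": 81.22,
    "Dubey et al. [14]": 81.91,
    "Ullah et al. [5]": 78.43,
    "Ullah et al. [3]": 85.53,
    "Zaheer et al. [15]": 78.27,
    "Majhi et al. [16]": 82.12,
    "Wu et al. [17]": 87.65,
    "Cao et al. [18]": 83.14,
    "Abbas and Al-Ani [2]": 90.16,
    "Abbas and Al-Ani [19]": 93.61,
}
PROPOSED_AUC = 94.21

best_method, best_auc = max(previous_auc.items(), key=lambda item: item[1])
margin = PROPOSED_AUC - best_auc

print(f"best previous method: {best_method} ({best_auc:.2f}%)")
print(f"margin of the proposed algorithm: {margin:.2f} percentage points")
```

The margin over the strongest earlier result is well under one percentage point, while the margin over most of the other methods in the table is several points.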
4. CONCLUSION
An anomalous event detection system based on a combination of ML and DL was designed and tested. The feature vector of each video was extracted using a pre-trained ResNet50. The PCA algorithm was then used to reduce the dimensionality of the features, which is the first time it has been used in such work. Finally, the new feature vectors were fed into a BiLSTM to distinguish between the abnormal and normal classes. In comparison with previous anomaly detection approaches, the suggested system was shown to have superior accuracy. According to the experimental results, the AUC on the UCF-Crime database reached 94.21%. We also measured the classifier's detection accuracy, which was 88.46%. This demonstrates the effectiveness of our proposal in increasing the accuracy of anomalous event detection while keeping both false negative and false positive alarms low. Future work will focus on investigating different feature extraction models, feature selection techniques, and dimensionality reduction techniques and combining them with our proposed system to further improve accuracy.
REFERENCES
[1] Z. K. Abbas and A. A. Al-Ani, “A comprehensive review for video anomaly detection on videos,” Proceedings of the 2nd 2022 International Conference on Computer Science and Software Engineering, CSASE 2022, pp. 30-35, doi: 10.1109/CSASE51777.2022.9759598.
[2] Z. K. Abbas and A. A. Al-Ani, “Anomaly detection in surveillance videos based on H265 and deep learning,” Int. J. Adv. Technol. Eng. Explor., vol. 9, no. 92, pp. 910-922, 2022, doi: 10.19101/IJATEE.2021.875907.
[3] W. Ullah, A. Ullah, I. U. Haq, K. Muhammad, M. Sajjad, and S. W. Baik, “CNN features with bi-directional LSTM for real-time anomaly detection in surveillance networks,” Multimed. Tools Appl., vol. 80, no. 11, pp. 16979-16995, 2021, doi: 10.1007/s11042-020-09406-3.
[4] H. İ. Öztürk and A. B. Can, “ADNet: temporal anomaly detection in surveillance videos,” Lect. Notes Comput. Sci. (including Subser. Lect. Notes Artif. Intell. Lect. Notes Bioinformatics), vol. 12664 LNCS, pp. 88-101, 2021, doi: 10.1007/978-3-030-68799-1_7.
[5] W. Ullah, A. Ullah, T. Hussain, Z. A. Khan, and S. W. Baik, “An efficient anomaly recognition framework using an attention
residual lstm in surveillance videos,” Sensors, vol. 21, no. 8, 2021, doi: 10.3390/s21082811.
[6] R. Morais, V. Le, T. Tran, B. Saha, M. Mansour, and S. Venkatesh, “Learning regularity in skeleton trajectories for anomaly detection in videos,” Proc. IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit., 2019, pp. 11988-11996, doi: 10.1109/CVPR.2019.01227.
[7] Y. Fan, G. Wen, D. Li, S. Qiu, M. D. Levine, and F. Xiao, “Video anomaly detection and localization via Gaussian mixture fully convolutional variational autoencoder,” Comput. Vis. Image Underst., vol. 195, 2020, doi: 10.1016/j.cviu.2020.102920.
[8] F. P. dos Santos, L. S. F. Ribeiro, and M. A. Ponti, “Generalization of feature embeddings transferred from different video anomaly detection domains,” J. Vis. Commun. Image Represent., vol. 60, pp. 407-416, 2019, doi: 10.1016/j.jvcir.2019.02.035.
[9] J. Ren, F. Xia, Y. Liu, and I. Lee, “Deep video anomaly detection: Opportunities and challenges,” 2021 International Conference on Data Mining Workshops (ICDMW), pp. 959-966, IEEE, 2021, doi: 10.1109/ICDMW53433.2021.00125.
[10] S. Waqas, C. Chen, and S. Mubarak, “Real-world anomaly detection in surveillance videos,” Proceedings of the IEEE Conference
on Computer Vision and Pattern Recognition, pp. 6479-6488, 2019, doi: 10.1109/HiPCW.2019.00031.
[11] D. G. Shreyas, S. Raksha, and B. G. Prasad, “Implementation of an anomalous human activity recognition system,” SN Comput.
Sci., vol. 1, no. 3, 2020, doi: 10.1007/s42979-020-00169-0.
[12] M. R. Anala, M. Makker, and A. Ashok, “Anomaly detection in surveillance videos,” Proc. - 26th IEEE Int. Conf. High Perform. Comput. Work. HiPCW, 2019, pp. 93-98, doi: 10.1109/HiPCW.2019.00031.
[13] W. Hao et al., “Anomaly event detection in security surveillance using two-stream based model,” Hindawi Secur. Commun.
Networks, vol. 2020, 2020, doi: 10.1155/2020/8876056.
[14] S. Dubey, A. Boragule, J. Gwak, and M. Jeon, “Anomalous event recognition in videos based on joint learning of motion and appearance with multiple ranking measures,” Appl. Sci., vol. 11, no. 3, pp. 1-21, 2021, doi: 10.3390/app11031344.
[15] M. Z. Zaheer, J. Lee, M. Astrid, A. Mahmood, and S.-I. Lee, “Cleaning label noise with clusters for minimally supervised anomaly
detection,” arXiv e-prints, pp. 36, 2021, [Online]. Available: http://arxiv.org/abs/2104.14770.
[16] S. Majhi, S. Das, F. Bremond, R. Dash, and P. K. Sa, “Weakly-supervised joint anomaly detection and classification,” 16th IEEE International Conference on Automatic Face and Gesture Recognition (FG 2021), pp. 1-7, IEEE, 2021, doi: 10.1109/FG52635.2021.9667006.
[17] J. Wu et al., “Weakly-supervised spatio-temporal anomaly detection in surveillance video,” arXiv preprint arXiv:2108.03825, pp. 1172-1178, 2021, doi: 10.24963/ijcai.2021/162.
[18] C. Cao, X. Zhang, S. Zhang, P. Wang, and Y. Zhang, “Adaptive graph convolutional networks for weakly supervised anomaly detection in videos,” arXiv:2202.06503, 2022.
[19] Z. K. Abbas and A. A. Al-Ani, “Detection of anomalous events based on deep learning-BiLSTM,” In press.
[20] H. Alquran et al., “Cervical cancer classification using combined machine learning and deep learning approach,” Comput. Mater. Contin., vol. 72, no. 3, pp. 5117-5134, 2022, doi: 10.32604/cmc.2022.025692.
[21] A. A. Sharfuddin, M. N. Tihami, and M. S. Islam, “A deep recurrent neural network with bilstm model for sentiment classification,”
2018 Int. Conf. Bangla Speech Lang. Process. ICBSLP 2018, no. September, 2018, doi: 10.1109/ICBSLP.2018.8554396.
[22] R. Maqsood, U. I. Bajwa, G. Saleem, R. H. Raza, and M. W. Anwar, “Anomaly recognition from surveillance videos using 3D convolution neural network,” Multimed. Tools Appl., vol. 80, no. 12, pp. 18693-18716, 2021, doi: 10.1007/s11042-021-10570-3.
[23] “UCF-Crime-database,” [Online]. Available: https://www.dropbox.com/sh/75v5ehq4cdg5g5g/AABvnJSwZI7zXb8_myBA0CLHa?dl=0.
[24] M. S. P. Kumari and A. K. Bedi, “Multimedia datasets for anomaly detection: A review.” arxiv:2112.05410, 2022.
[25] A. Mathew, P. Amudha, and S. Sivakumari, “Deep learning techniques: an overview,” Adv. Intell. Syst. Comput., vol. 1141, no. August 2020, pp. 599-608, 2021, doi: 10.1007/978-981-15-3383-9_54.
[26] J. M. Czum, “Dive into deep learning,” J. Am. Coll. Radiol., vol. 17, no. 5, pp. 637-638, 2020, doi: 10.1016/j.jacr.2020.02.005.
[27] A. A. Al-Ani, L. A. Al-Ani, and Y. A. Jasim, “Iris recognition using independent and principal component analysis,” Dissertation, 2018.
[28] I. T. Jolliffe and J. Cadima, “Principal component analysis: A review and recent developments,” Philos. Trans. R. Soc. A Math. Phys. Eng. Sci., vol. 374, no. 2065, 2016, doi: 10.1098/rsta.2015.0202.
[29] S. H. Abd, I. A. Hashim, and A. S. A. Jalal, “Automatic deception detection system based on hybrid feature extraction techniques,” Indones. J. Electr. Eng. Comput. Sci., vol. 26, no. 1, pp. 381-393, 2022, doi: 10.11591/ijeecs.v26.i1.pp381-393.
BIOGRAPHIES OF AUTHORS
Zainab K. Abbas was born in Baghdad, Iraq, in 1993. She received her B.Sc. and M.Sc. degrees from the Department of Control and Systems Engineering, University of Technology, in 2015 and 2018, respectively. She is currently a Ph.D. student in the Department of Information and Communication Engineering, College of Information Engineering, Al-Nahrain University, Baghdad, Iraq. Her research focuses on digital processing with artificial intelligence. She can be contacted at email: zainabkudair@gmail.com.
Ayad A. Al-Ani was born in Baghdad, Iraq, in 1961. He received his B.Sc., M.Sc., and Ph.D. degrees from the Department of Physics, College of Science, Baghdad University, in 1983, 1990, and 1995, respectively. Previously, he was University Vice President for Administrative Affairs at Al-Nahrain University, Deputy Dean for Postgraduate Studies and Scientific Research at Baghdad University, and Head of the Space and Astronomy Department, College of Science, Baghdad University. Since 2006, he has been a Professor of Digital Image Processing in the Department of Information and Communication Engineering, College of Information Engineering, Al-Nahrain University. He has published 74 papers and 4 books. He can be contacted at email: ayad.a@nahrainuniv.edu.iq.
... An algorithm for decreasing the features maps size has been suggested by Abbas and Al-Ani [19]. Abbas and Al-Ani [20] suggest using principal component analysis for reducing the dimensionality of the data features. Although the term "anomaly" is commonly used in literature, there isn't yet agreement on what it means [9]. ...
... AUC score comparison between the proposed work and the previous works Method AUC % Waqas and colleagues, 2019 [10] 75.41 Anala and colleagues, 2019 [12] 85 Shreyas and colleagues, 2020 [11] 79.8 Hao and colleagues, 2020 [13] 81. 22 Dubey and colleagues, 2021 [14] 81.91 Ullah and colleagues, 2021 [5] 78.43 Ullah and colleagues, 2021 [3] 85.53 Zaheer and colleagues, 2021 [15] 78.27 Majhi and colleagues, 2021 [16] 82.12 Wu and colleagues, 2021 [17] 87.65 Cao and colleagues, 2022 [18] 83.14 Abbas and Al-Ani [2] 90. 16 Abbas and Al-Ani [19] 93.61 Abbas and Al-Ani [20] 94.21 Our adaptive algorithm 94.58 ...
Article
Full-text available
One of the most widely used human behavior detection methods is anomaly detection, which this article covers. Ensuring a person's safety is a crucial task in every community today due to the ever-increasing actions that can be dangerous, from planned crime to harm from an accident. Classic closed- circuit television is insufficient since a person must always be awake and available to monitor the cameras, which is costly. Also, someone's attention tends to decrease after a certain period of time. Due to these reasons, a surveillance system that is automated and able to detect unusual activities in real-time and give sufferers prompt aid is necessary. It should be noted that the identification process must be completed swiftly and correctly. In this paper, we employ a model based on mixes the machine learning (ML) model, namely genetic algorithms with deep learning (DL). In this study's experimentation, the UCF-Crime dataset was employed. The detection accuracy on the testing sample dataset was equal to 89.90%, while the area under the curve (AUC) was equal to 94.58%. The developed models have demonstrated reliability and the ability to achieve the greatest accuracy when compared to models that have already been designed.
... Authors Year Document Type [22] Chebbi and Jebara 2020 Conference paper [23] Abbas and Al-Ani 2023 Article [24] Shen et al. The selected studies were published mainly in conference proceedings (15 studies; 62.5%), followed by journals (9 studies; 37.5%). ...
Article
Full-text available
Interest in detecting deceptive behaviours by various application fields, such as security systems, political debates, advanced intelligent user interfaces, etc., makes automatic deception detection an active research topic. This interest has stimulated the development of many deception-detection methods in the literature in recent years. This work systematically reviews the literature focused on facial cues of deception. The most relevant methods applied in the literature of the last decade have been surveyed and classified according to the main steps of the facial-deception-detection process (video pre-processing, facial feature extraction, and decision making). Moreover, datasets used for the evaluation and future research directions have also been analysed.
Article
Full-text available
Despite being extensively used in numerous applications, precise and effective human activity recognition continues to be an interesting research issue in the area of computer vision. Currently, a lot of research is being done on topics such as pedestrian activity recognition and ways to recognize people's movements using depth data, 3D skeletal data, still image data, or strategies that utilize spatiotemporal interest points. This study aims to investigate and evaluate DL approaches for detecting human activity in video. The focus is on multiple architectures for detecting human activities that use DL as their primary strategy. Based on the application, namely face recognition, emotion recognition, action recognition, and anomaly detection, the human activity predictions are divided into four subcategories. Several studies in the literature build on these recognition tasks to predict human behavior and activity for video surveillance applications. The state-of-the-art DL techniques for the four applications are compared. This paper also presents the application areas, scientific issues, and potential goals in the field of DL-based human behavior and activity recognition/detection.
Article
Social networks are inevitable parts of our daily life, where an unprecedented amount of complex data corresponding to a diverse range of applications is generated. As such, it is imperative to conduct research on social events and patterns from the perspectives of conventional sociology to optimize services that originate from social networks. Event tracking in social networks finds various applications, such as network security and societal governance, which involve analyzing data generated by user groups on social networks in real time. Moreover, as deep learning techniques continue to advance and make important breakthroughs in various fields, researchers are using this technology to progressively optimize the effectiveness of Event Detection (ED) and tracking algorithms. In this regard, this paper presents an in-depth comprehensive review of the concept and methods involved in ED and tracking in social networks. We introduce mainstream event tracking methods, which involve three primary technical steps: ED, event propagation, and event evolution. We then introduce benchmark datasets and evaluation metrics for ED and tracking, which allow comparative analysis of the performance of mainstream methods. Finally, we present a comprehensive analysis of the main research findings and existing limitations in this field, as well as future research prospects and challenges.
Article
Full-text available
Multimedia anomaly datasets play a crucial role in automated surveillance. They have a wide range of applications, from the detection of outlier objects and situations to the detection of life-threatening events. For more than 1.5 decades, this field has attracted a lot of research attention, and as a result, more and more datasets dedicated to anomalous action and object detection have been developed. Tapping these public anomaly datasets enables researchers to generate and compare various anomaly detection frameworks with the same input data. This paper presents a comprehensive survey of a variety of video, audio, and audio-visual datasets based on the application of anomaly detection. This survey aims to address the lack of a comprehensive comparison and analysis of public multimedia datasets for anomaly detection. It can also assist researchers in selecting the best available dataset for benchmarking frameworks. Additionally, we discuss gaps in the existing datasets and insights for future directions towards developing multimodal anomaly detection datasets.
Article
Full-text available
Video anomaly detection in smart cities is a critical task in computer vision that plays an imperative role in intelligent surveillance and public security, but it is challenging because anomalous events in real-time surveillance situations are diverse, complex, and rare. Many deep learning models use a large amount of training data, lack generalization capabilities, and have long running times. To overcome these problems, this work suggests an algorithm for reducing the size of the extracted features. It combines every 15 video frames to generate new feature vectors, whose values are the summation of the values of the original feature vectors obtained from Resnet50. The new feature vectors are then fed into our classifier model to detect the abnormality. We conducted comprehensive tests on a variety of anomaly detection benchmark datasets to verify the proposed framework's functionality in complex surveillance scenarios. The numerical results were obtained on the UCF-Crime dataset, with the proposed approach achieving an Area Under Curve (AUC) score of 93.61% on the database's test set.
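The feature-size reduction described in this abstract (summing the feature vectors of every 15 frames) can be sketched as follows. This is a minimal plain-Python illustration, not the authors' implementation: the function name `sum_frame_features`, the 4-dimensional toy vectors, and the decision to drop a trailing partial group are all assumptions, since the abstract does not say how leftover frames are handled.

```python
from typing import List, Sequence


def sum_frame_features(
    features: Sequence[Sequence[float]], group_size: int = 15
) -> List[List[float]]:
    """Sum per-frame feature vectors over consecutive, non-overlapping groups."""
    if group_size <= 0:
        raise ValueError("group_size must be positive")
    # Assumption: frames that do not fill a complete group are discarded.
    usable = len(features) - len(features) % group_size
    pooled = []
    for start in range(0, usable, group_size):
        group = features[start:start + group_size]
        # Element-wise sum across the frames in this group.
        pooled.append([sum(column) for column in zip(*group)])
    return pooled


# Toy input: 30 frames, each with a 4-dimensional feature vector whose
# entries all equal the frame index.
frames = [[float(i)] * 4 for i in range(30)]
pooled = sum_frame_features(frames)
print(len(pooled))   # 2
print(pooled[0])     # [105.0, 105.0, 105.0, 105.0]
```

With a group size of 15, the sequence fed to the classifier is 15 times shorter than the original frame sequence, while each pooled vector keeps the original feature dimension.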
Article
Full-text available
This paper discusses anomaly detection, which is one of the most well-known applications of human activity recognition. Due to the ever-increasing activities posing risks ranging from planned aggression to harm caused by an accident, providing security to an individual is a major issue in any community today. Traditional closed-circuit television does not suffice since it requires a human being to stay awake and watch the cameras at all times, which is costly. This necessitates the creation of an automated security system that detects anomalous activity in real time and provides rapid assistance to victims. However, identifying activity from long surveillance footage takes time. Hence, in this research, we study the effect of down-sampling on the challenging University of Central Florida (UCF-Crime) database, using high efficiency video coding (HEVC/H.265) before feeding the videos into the anomaly detection system. This step reduces the size of the data, making it easier to store and transfer, and highlights the unique properties of each video clip. In the proposed work, we first down-sample each video frame by half using H.265 on the Fast Forward Moving Picture Experts Group (FFmpeg) platform. Spatiotemporal features are then extracted from a series of frames (frame level) using a pre-trained convolutional neural network (CNN) called Resnet50. To boost the features, we combine the features of every 15 video frames to generate a new feature vector that is fed into the classifier model. The values of the new feature vectors represent the summation of the values of the original feature vectors obtained from Resnet50. Finally, the features obtained from a series of frames are fed to the bidirectional long short-term memory (BiLSTM) model to classify the video as normal or abnormal. We conducted comprehensive tests on a benchmark dataset for anomaly detection to verify the proposed framework's functionality in complex surveillance scenarios. The numerical results were obtained on the UCF-Crime dataset, with the proposed approach achieving an area under curve (AUC) score of 90.16% on the database's test set.
Article
Full-text available
Cervical cancer is screened by the pap smear methodology for detection and classification purposes. Pap smear images of the cervical region are employed to detect and classify the abnormality of cervical tissues. In this paper, we propose the first system that is able to classify pap smear images as a seven-class problem. Pap smear images are exploited to design a computer-aided diagnosis system that classifies the abnormality in cervical cell images. Automated features extracted using ResNet101 are employed to discriminate seven classes of images with a Support Vector Machine (SVM) classifier. The proposed system distinguishes between the levels of normal cases with 100% accuracy and 100% sensitivity. On top of that, it can distinguish between normal and abnormal cases with an accuracy of 100%. The high level of abnormality is then studied and classified with high accuracy. On the other hand, the low level of abnormality is studied separately and classified into two classes, mild and moderate dysplasia, with ∼92% accuracy. The proposed system is built in a cascading manner with five polynomial SVM classifier models. The overall training accuracy for all cases is 100%, while the overall test accuracy for all seven classes is around 92%, and the overall accuracy reaches 97.3%. The proposed system facilitates the detection and classification of cervical cells in pap smear images and leads to early diagnosis of cervical cancer, which may lead to an increase in the survival rate in women.
Article
Full-text available
The human face is considered a rich source of non-verbal features. These features have proven their efficiency, so they are used by the deception detection system (DDS) to distinguish liars from innocent subjects. The suggested DDS utilizes three kinds of features: facial expressions, head movements, and eye gaze. Facial expressions are encoded and represented in the form of action units (AUs) based on the facial action coding system (FACS). Head movements are represented based on both transitions and rotation. For eye gaze features, the eye gaze directional angles along both the x-axis and the y-axis are extracted. The collected database used to prove the validity and robustness of the suggested system contains videos of 102 subjects of both genders, aged 18-55 years. The detection accuracy of the suggested DDS based on the logistic regression classifier is 88.0631%. The proposed system has proven its robustness and achieved the highest detection accuracy when compared with previously designed systems.
Article
For weakly supervised anomaly detection, most existing work suffers from inadequate video representation because it cannot model long-term contextual information. To solve this, we propose a novel weakly supervised adaptive graph convolutional network (WAGCN) to model the complex contextual relationship among video segments. In this way, we fully consider the influence of other video segments on the current one when generating the anomaly probability score for each segment. Firstly, we combine the temporal consistency and the feature similarity of video segments to construct a global graph, which makes full use of the association information among spatial-temporal features of anomalous events in videos. Secondly, we propose a graph learning layer to remove the limitation of setting the topology manually; it extracts the graph adjacency matrix from the data adaptively and effectively. Extensive experiments on two public datasets (i.e., UCF-Crime dataset and ShanghaiTech dataset) demonstrate the effectiveness of our approach, which achieves state-of-the-art performance.
Conference Paper
Video Surveillance Systems (VSS) are widely utilized to increase public safety in public and private areas such as shopping malls, markets, banks, hospitals, educational institutions, streets, and smart cities. Accurate and fast identification of video anomalies is usually the major goal of security applications. However, because of varying environmental factors, the complexities of human activity, the ambiguous nature of the anomaly, and the absence of appropriate datasets, detecting video anomalies is challenging. This paper presents a comprehensive survey of video anomaly detection studies from the last three years and of the recently used datasets. Moreover, a comparison of the different approaches used for anomaly detection has been performed. We have noticed that deep learning has outperformed other methods in this field.
Conference Paper
In this paper, we introduce a novel task, referred to as Weakly-Supervised Spatio-Temporal Anomaly Detection (WSSTAD) in surveillance video. Specifically, given an untrimmed video, WSSTAD aims to localize a spatio-temporal tube (i.e., a sequence of bounding boxes at consecutive times) that encloses the abnormal event, with only coarse video-level annotations as supervision during training. To address this challenging task, we propose a dual-branch network which takes as input the proposals with multi-granularities in both spatial-temporal domains. Each branch employs a relationship reasoning module to capture the correlation between tubes/videolets, which can provide rich contextual information and complex entity relationships for the concept learning of abnormal behaviors. Mutually-guided Progressive Refinement framework is set up to employ dual-path mutual guidance in a recurrent manner, iteratively sharing auxiliary supervision information across branches. It impels the learned concepts of each branch to serve as a guide for its counterpart, which progressively refines the corresponding branch and the whole framework. Furthermore, we contribute two datasets, i.e., ST-UCF-Crime and STRA, consisting of videos containing spatio-temporal abnormal annotations to serve as the benchmarks for WSSTAD. We conduct extensive qualitative and quantitative evaluations to demonstrate the effectiveness of the proposed approach and analyze the key factors that contribute more to handle this task.