All content in this area was uploaded by Yousef Sharrab on Jun 29, 2023
Improving Arabic Fake News Detection Using
Optimized Feature Selection
Bilal Hawashin
Department of Artificial Intelligence
Alzaytoonah University of Jordan
Amman, Jordan
b.hawashin@zuj.edu.jo
Shadi AlZu’bi
Department of Computer Science
Alzaytoonah University of Jordan
Amman, Jordan
smalzubi@zuj.edu.jo
Ahmad Althunibat
Department of Software Engineering
Alzaytoonah University of Jordan
Amman, Jordan
a.thunibat@zuj.edu.jo
Tarek Kanan
Department of Artificial Intelligence
Alzaytoonah University of Jordan
Amman, Jordan
tarek.kanan@zuj.edu.jo
Yousef Sharrab
Department of Artificial Intelligence
Isra’a University
Amman, Jordan
sharrab@iu.edu.jo
Abstract— It is of no doubt that the advent of social media
has brought several important benefits. However, social media
has also been abused in several ways, one of which is the
distribution of fake news. Fake news can change public
opinion, so it is necessary to detect such
attempts. Despite the importance of this problem, little research
has addressed it for Arabic posts. The few
works that studied this topic in Arabic did not give
much attention to optimizing the feature selection process,
which can play an important role in further improving
detection accuracy. This work improves fake news
detection performance by optimizing the feature selection
phase. Experiments show that this optimization
improved the detection accuracy of traditional machine
learning methods.
Keywords— Arabic Fake News Detection, Social Media,
Classification, Machine Learning, Data Science.
I. INTRODUCTION
With the advent of social media, the world has become a
small place. Social media has proved its ability to increase
connectivity. Furthermore, it has been used as an educational
tool, a means of raising awareness of important issues,
building virtual communities, supporting noble causes, and
more. No one can doubt the great benefits of social
media. However, it has also been abused in several ways, one
of which is the spreading of fake news.
Spreading fake news on social media is a new phenomenon
that aims at changing public opinion on some issue. Such
fake news can manipulate people's opinions to serve the
interests of individuals. Therefore, this topic has been gaining
more and more attention in recent years. Thanks to advances in
artificial intelligence, several solutions have been used to
detect such fake posts automatically, one of which is text
classification.
Text classification is the process of labeling text with one
or more predefined labels. It has applications in a wide
range of domains, such as sentiment analysis, document
classification, author authentication, and many more.
Although several works have been proposed in recent
years to detect fake news in English via
classification, such as [1][2][3][4][5], very few works have
been proposed to detect Arabic fake news despite its
importance. It is necessary to detect such
posts, and the lack of works in this direction may be due to the
lack of Arabic datasets and the lack of attention to this
important issue. Even the works that tackled this issue in
Arabic have some limitations, such as not giving
much attention to the feature selection process despite its
importance. Feature selection is one of the
important phases in natural language processing, and it
can play a vital role in improving accuracy and
reducing time. Although some previous works proposed
solutions based on deep learning, these solutions come with a
time and computational cost. It is therefore worthwhile to
optimize the accuracy of the traditional, less time-consuming
methods.
In this work, we optimize the feature selection phase and
compare the optimized performance with the original one. As
part of this work, we use several classification methods:
Logistic Regression, Support Vector Machines (SVM),
Random Forest, K Nearest Neighbor (KNN), Naïve Bayes,
and AraBERT. We use a publicly available dataset of Arabic
fake news [21]. As for the evaluation measurements, we use
recall, precision, and F1, which are commonly used in
classification evaluation.
The contributions of this paper are as follows.
• Further improving Arabic fake news detection by
optimizing the feature selection phase.
• Increasing the awareness of this important issue in
Arabic language.
The remainder of the paper is organized as follows. Section II
presents the literature review. Section III describes the methodology.
Section IV presents the experiments and the discussion,
while Section V gives the conclusions and future work.
II. LITERATURE REVIEW
Several studies have been conducted in the field of fake news
detection. In this section, we present the important
works in this field and then outline their
limitations.
[1] discussed different characteristics and types of fake news
and stressed the importance of handling them properly.
Furthermore, it proposed a fake news detection algorithm for
online social media (OSM) networks. The best achieved accuracy
was 93% when using bi-LSTM.
[2] compared several machine learning approaches, natural
language processing techniques, and social network analysis
methods. Furthermore, they made a thorough survey of
different means used for fake news identification and
mitigation.
[3] proposed an intelligent approach to recognize rumors on
microblogging websites. It used time series information
from social media websites, such as user comments and
retweet dynamics, to enhance the performance of
rumor detection.
[4] tackled the fake news problem from a new perspective by
using different propagation characteristics, textual features,
and social features. Next, it evaluated several machine
learning methods according to their performance in detecting
fake news based on these features.
[5] handled the issue of domain bias when detecting fake
news. In this issue, trained classifiers do not work well and
fail to detect fake news if the domain is not known. Therefore,
they proposed a solution for cross-domain detection. They
suggested the use of paired news to improve the accuracy in
this scenario.
[6] also proposed a solution for multi-domain fake news
detection. As part of their work, they used a history news
environment perception framework, which played an
important role in improving the accuracy.
[7] proposed a model named Modality and Event Adversarial
Networks to detect fake news. This work concentrated on the
multi-modality case and how to learn efficiently when text,
images, and other modalities exist together.
[8] provided a survey on the methods proposed recently to
detect fake news using machine learning. They also proposed
a hybrid method composed of Naïve Bayes and LSTM for this
sake.
[9] surveyed the recent works in this domain and discussed the
challenges and the opportunities for improvements.
[10] proposed the use of multi-layer Bi-LSTM for fake news
detection.
[11] adopted sequential models, specifically the
recurrent neural network (RNN) architecture, to detect rumors
on microblogging platforms by capturing temporal
dependencies in the data.
[12] proposed the use of a convolutional neural network (CNN)
and argued for its importance and efficiency in the domain of
fake speech recognition. As part of their work, they used both
content-based features and user-based features.
As for Arabic fake news detection, a few works were proposed
in this direction. For example, [13] proposed a multi-modal
fake news detection method for the Arabic language. They used
MARBERTv2 for textual feature extraction and a
combination of VGG-19 and ResNet50 for visual feature
extraction. In their experimental results, textual features
proved their efficiency in this task.
[14] proposed a novel rumor detection method for the Arabic language.
They compared AraBERT and MARBERT for this task.
[15] introduced a novel dataset for fake news training. The
dataset was related to COVID-19 and was extracted from
Facebook and Twitter. In their work, they also compared the
performance of two pretrained models: BERT and
AraBERT. The experimental results showed that
AraBERT outperformed BERT.
[16] analyzed the challenges facing this issue in
Arabic. They found that the most studied platform in
Arabic was Twitter and recommended more studies on
other platforms. Moreover, they recommended more work on
dialects.
[17] provided a dataset composed of real and fake news for
training purposes in the field of COVID-19. They argued for the
importance of detecting such fake news due to its negative
effect on public opinion.
[18] proposed the use of a BERT-based method for detecting
fake news. The proposed method proved its efficiency in
detecting fake news in Arabic.
[19] proposed the first large dataset in this domain. The dataset
was composed of around 600,000 articles. They used both
traditional machine learning and deep learning methods.
[20] proposed the use of text analysis in order to detect fake
news.
III. METHODOLOGY
From the literature review, it is clear that although
several works have addressed English fake news
detection, only a few works were proposed to improve Arabic
fake news detection. Furthermore, these works did not give
much attention to the optimization of feature selection. Some
works showed the superiority of deep learning models, such
as [14][15]; however, this performance has its own cost. Deep
learning models are very complex and require much more training
time and computational power. This motivates our
work to find an efficient feature selection method for
Arabic fake news detection.
In this work, we optimize one of the well-known feature
selection methods, namely chi-square. The classification
performance of various classifiers is first measured without
feature selection as a baseline. The number of features selected
by chi-square is then tuned, and the resulting performance is
compared with the baseline.
In general, the preprocessing is composed of several tasks,
including the following:
1- Tokenization: splits the text into words.
2- Stopword removal: removes non-important
terms.
3- Normalization: unifies the same word that is
written in different forms, for example by removing Harakat or
Hamza.
4- Stemming: finds the stem of each term.
5- Feature selection: selects part of the terms instead of
using all the terms.
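The normalization, tokenization, and stopword removal steps listed above can be sketched as follows. This is a minimal illustration and not the implementation used in this paper: the function names, the four-word stopword list, and the choice to map hamza-carrying alif forms to a bare alif are assumptions made for the example. Stemming and feature selection are omitted.

```python
import re

# Arabic diacritics (harakat): fathatan through sukun, plus the dagger alif.
HARAKAT = re.compile(r"[\u064B-\u0652\u0670]")
# Alif with madda or hamza, unified to a bare alif (an assumed convention).
HAMZA_ALIF = re.compile(r"[\u0622\u0623\u0625]")
TATWEEL = "\u0640"  # elongation character

# Tiny illustrative stopword list (assumption); a real list is far longer.
STOPWORDS = {"في", "من", "على", "إلى"}


def normalize(text: str) -> str:
    """Remove harakat and tatweel, and unify hamza-carrying alif forms."""
    text = HARAKAT.sub("", text)
    text = text.replace(TATWEEL, "")
    return HAMZA_ALIF.sub("\u0627", text)


# Stopwords are normalized too, so they match normalized tokens.
NORMALIZED_STOPWORDS = {normalize(w) for w in STOPWORDS}


def preprocess(text: str) -> list[str]:
    """Normalize, tokenize on runs of Arabic letters, and drop stopwords."""
    tokens = re.findall(r"[\u0621-\u064A]+", normalize(text))
    return [t for t in tokens if t not in NORMALIZED_STOPWORDS]
```

Normalizing the stopword list with the same function keeps a word such as "إلى" removable after its hamza has been unified.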
As stated earlier, we compare the performance of
various classifiers with and without feature
selection. As for the classifiers, we use SVM, Naïve Bayes,
Logistic Regression, KNN, Random Forest, and AraBERT.
As for the feature selection, we use chi-square.
The results of classification are provided in the experimental
section.
IV. EXPERIMENTS
A. Data set
The dataset used is from [21]. It is composed of 606,912
posts from 134 different sources. Misbar was used to
annotate the data as credible, not credible, or not
sure. In our experiments, only the first two labels were used,
for which the annotation had high confidence.
We used a balanced subset of the
dataset composed of 30,000 fake news posts and 30,000
real news posts.
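One way to draw such a balanced subset is sketched below. The paper does not state how the 30,000 posts per label were chosen, so uniform random sampling with a fixed seed is an assumption, as are the function name and the (text, label) record layout.

```python
import random

KEPT_LABELS = ("credible", "not credible")  # "not sure" posts are dropped


def balanced_subset(records, per_class, seed=0):
    """Sample `per_class` records for each kept label.

    records: iterable of (text, label) pairs (assumed layout).
    """
    rng = random.Random(seed)
    by_label = {label: [] for label in KEPT_LABELS}
    for text, label in records:
        if label in by_label:
            by_label[label].append((text, label))
    subset = []
    for items in by_label.values():
        # random.sample raises ValueError if a class has too few records.
        subset.extend(rng.sample(items, per_class))
    rng.shuffle(subset)
    return subset
```

With the full dataset, `balanced_subset(records, per_class=30000)` would return 60,000 shuffled posts.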
B. Evaluation Measurements
As for the evaluation measurements, we adopted
recall, precision, and the F1 measurement. They are defined
as follows.
• Recall: the ratio of true positives to the sum of
true positives and false negatives. It measures the
ability of the classifier to correctly recognise all
the records of the target label.
• Precision: the ratio of true
positives to the sum of true positives and false positives. It
measures how likely a record assigned to the target label
is to truly belong to that label.
• F1 Measurement: the
harmonic mean of the recall (R) and the precision (P),
given in Equation 1.
$$F_1 = \frac{2 \cdot R \cdot P}{R + P} \qquad (1)$$
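The three measurements can be computed directly from confusion counts. The sketch below is illustrative only; the function names are hypothetical, and returning 0.0 when a denominator is zero is an assumed convention.

```python
def f1_from_rates(recall, precision):
    """Equation 1: harmonic mean of recall and precision."""
    if recall + precision == 0:
        return 0.0
    return 2 * recall * precision / (recall + precision)


def precision_recall_f1(tp, fp, fn):
    """Compute (precision, recall, F1) from true/false positive and false negative counts."""
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return precision, recall, f1_from_rates(recall, precision)
```

For example, `f1_from_rates(0.94, 0.96)` rounds to 0.95, which matches the recall, precision, and F1 reported for the baseline SVM in Table II.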
C. Experimental Settings
For our experiments, we used an Intel® Core™ i7-8550U 1.8
GHz CPU and 16 GB RAM, running the Microsoft Windows 10
operating system. We used Python in a Jupyter notebook
for the implementations of the classifiers.
D. The Compared Classifiers
The following list provides more information about
the compared classifiers.
• Support Vector Machines
This classifier has been used widely in the literature
due to its superior accuracy and relatively fast
learning time. It uses the support vectors as a
discriminative tool to find the best hyperplane
between the given classes. SVM can be either linear
or nonlinear based on the type of the hyperplane.
• Logistic Regression
This classifier uses the regression concept to predict the
probability of the positive label given the input data. It
is considered the basis of the artificial neural network
classifier.
• K Nearest Neighbor
This classifier belongs to the lazy classifiers as it does
not learn a model. Instead, whenever a testing record
arrives, it assigns the majority label among the k closest
training records.
• Random Forest
This classifier has gained wide attention in the
literature due to its relatively high accuracy. It uses the
concept of bootstrapping to generate several datasets
from the original one, and uses bagging to build
different tree models from these datasets. Finally, it
uses majority voting to reach the final decision.
• Naïve Bayes
This classifier finds the probability of each label given
the data. For this purpose, it uses Bayes' theorem,
which provides the posterior probability. This classifier
assumes that features are independent given the label, which
justifies the word "naïve" in its name. It is well known for its very
fast training time even when the data size is large.
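As a reminder of the standard textbook formulation (not a detail taken from this paper's implementation), the Naïve Bayes decision rule for a post with features $x_1, \dots, x_n$ and a candidate label $c$ follows from Bayes' theorem together with the independence assumption:

```latex
P(c \mid x_1, \dots, x_n)
  = \frac{P(c)\, P(x_1, \dots, x_n \mid c)}{P(x_1, \dots, x_n)}
  \;\propto\; P(c) \prod_{i=1}^{n} P(x_i \mid c),
\qquad
\hat{c} = \arg\max_{c \in \{\text{fake},\, \text{real}\}} P(c) \prod_{i=1}^{n} P(x_i \mid c)
```

The denominator is the same for every label, so it can be dropped when choosing the most probable label.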
E. Experimental Results and Discussion
As for the dataset, and as mentioned earlier, we used a
balanced dataset of 30,000 records for each of the two labels
(fake, real). Next, we removed stopwords and applied data
normalization by removing punctuation, duplicate
letters, and all the harakat. This step unifies the
appearance of the same term. Next, we applied a TF-IDF
vectorizer to compute the weight of each term in each
post. We selected the first 10,000 features and removed
features that appeared in fewer than three documents. The next
step is to optimize the classifiers, as optimization plays a vital
role in the results. For this purpose, we used
GridSearchCV in Python. We used cross
validation with 10 folds, as it has been used widely in the
literature and proved its efficiency.
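A pipeline of this kind can be sketched with scikit-learn as shown below. This is an illustrative sketch, not the code used in the experiments: the parameter grid is hypothetical (only the best values are reported in this paper), the `f1` scoring choice is an assumption, and the toy English corpus merely stands in for the preprocessed Arabic posts.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC


def build_search(max_features=10_000, min_df=3, folds=10):
    """TF-IDF followed by an SVM, tuned with k-fold grid search."""
    pipeline = Pipeline([
        # Keep at most 10,000 terms; drop terms seen in fewer than 3 documents.
        ("tfidf", TfidfVectorizer(max_features=max_features, min_df=min_df)),
        ("svm", SVC()),
    ])
    # Hypothetical search grid (assumption).
    param_grid = {
        "svm__C": [0.1, 0.5, 1.0],
        "svm__kernel": ["rbf"],
        "svm__gamma": [0.1, 1],
    }
    return GridSearchCV(pipeline, param_grid, cv=folds, scoring="f1")


# Toy stand-in corpus: label 1 = fake, label 0 = real.
texts = ["fake rumor shocking claim"] * 20 + ["official report confirmed source"] * 20
labels = [1] * 20 + [0] * 20

search = build_search().fit(texts, labels)
best_params = search.best_params_  # best grid point found by cross validation
```

Placing the vectorizer inside the pipeline means the vocabulary is refitted on each training fold, which avoids leaking information from the held-out fold.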
TABLE I. BEST PARAMETERS OF THE COMPARED CLASSIFIERS

| Classifier | Best Parameters |
|---|---|
| SVM | C = 0.1, Kernel = RBF, gamma = 1 |
| Logistic Regression | C = 0.5 |
| Naïve Bayes | - |
| RF | N_estimators = 100 |
| KNN | K = 10 |
| AraBERT | Default |
After the optimization process, we compared the classifiers
using their best parameters according to their performance in
detecting Arabic fake news. The optimized parameters are
provided in Table I. The results of applying the classification
methods are provided in Table II. These results
are without feature selection and serve as the baseline
results.
TABLE II. ACCURACIES OF VARIOUS CLASSIFIERS WITHOUT
FEATURE SELECTION

| Classifier | Recall | Precision | F1 |
|---|---|---|---|
| SVM | 0.94 | 0.96 | 0.95 |
| Logistic Regression | 0.93 | 0.93 | 0.93 |
| Naïve Bayes | 0.86 | 0.86 | 0.86 |
| RF | 0.93 | 0.93 | 0.93 |
| KNN | 0.88 | 0.88 | 0.88 |
| AraBERT | 0.96 | 0.98 | 0.97 |
From Table II, the deep learning model AraBERT outperformed
the traditional machine learning methods, reaching an F1 of
0.97. The best performance among the traditional machine
learning methods was for SVM, with an F1 of 0.95.
Both results were expected, as deep learning and SVM
have shown high performance in the literature. The worst
performances were for KNN and NB: the former does not
learn a model, and the latter relies on a simple probability
model that assumes feature independence.
Next, we conducted a set of experiments to optimize the chi-square
feature selection method by finding the best number
of features. Different numbers of
features can lead to different F1 scores in the classification
phase. Therefore, we used several values for the number of
features, ranging from 200 to 1200 in steps of 200. For each value, we
performed feature selection and classification using SVM
and computed the F1 score. The results are provided in Table III.
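To make the selection step concrete, the sketch below scores each term with the textbook chi-square statistic on a 2x2 table of document counts (term present or absent versus target label or other label) and keeps the k highest-scoring terms. It is an assumption-laden illustration: the function names are hypothetical, and library implementations may compute the statistic differently, for instance from TF-IDF weights rather than document counts.

```python
from collections import Counter


def chi2_scores(docs, labels, positive="fake"):
    """Chi-square score per term from document frequencies.

    docs: list of token lists; labels: parallel list of class labels.
    """
    n = len(docs)
    n_pos = sum(1 for y in labels if y == positive)
    n_neg = n - n_pos
    df_pos, df_neg = Counter(), Counter()
    for tokens, y in zip(docs, labels):
        (df_pos if y == positive else df_neg).update(set(tokens))
    scores = {}
    for term in set(df_pos) | set(df_neg):
        a = df_pos[term]   # positive docs containing the term
        b = df_neg[term]   # negative docs containing the term
        c = n_pos - a      # positive docs without the term
        d = n_neg - b      # negative docs without the term
        denom = (a + b) * (c + d) * (a + c) * (b + d)
        # A term present in every document carries no information: score 0.
        scores[term] = n * (a * d - b * c) ** 2 / denom if denom else 0.0
    return scores


def top_k_terms(scores, k):
    """Return the k best-scoring terms; ties are broken alphabetically."""
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]
```

A sweep such as the one described above would call `top_k_terms(scores, k)` for each k in `range(200, 1201, 200)`, retrain the SVM on the reduced vocabulary, and record the F1 score.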
TABLE III. OPTIMIZING THE SELECTED NUMBER OF FEATURES FOR CHI
SQUARE BY FINDING F1 FOR SVM CLASSIFIER USING DIFFERENT VALUES

| Feature Selection Method | 200 | 400 | 600 | 800 | 1000 | 1200 |
|---|---|---|---|---|---|---|
| Chi Squared | 0.871 | 0.900 | 0.929 | 0.946 | 0.958 | 0.953 |
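The values in Table III can be inspected with a few lines of code to locate the best feature count and to see how the gain changes between consecutive settings. The numbers are copied from Tables II and III; the variable names are illustrative.

```python
# F1 of the SVM classifier per number of chi-square features (Table III).
f1_by_k = {200: 0.871, 400: 0.900, 600: 0.929, 800: 0.946, 1000: 0.958, 1200: 0.953}
baseline_f1 = 0.95  # SVM without feature selection (Table II)

# Feature count with the highest F1.
best_k = max(f1_by_k, key=f1_by_k.get)

# Change in F1 when moving from each setting to the next larger one.
ks = sorted(f1_by_k)
gains = {k2: round(f1_by_k[k2] - f1_by_k[k1], 3) for k1, k2 in zip(ks, ks[1:])}

# Improvement of the best setting over the baseline.
improvement = round(f1_by_k[best_k] - baseline_f1, 3)
```

The gains shrink as features are added and turn negative at the last step, so the best setting is the one just before the decline.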
From the table, the F1 score rises steadily as the number of
features grows from 200 to 1000, with gains that shrink from
about 0.03 to about 0.01 per 200 added features. At 1000
features, the F1 score was 0.958, which exceeds the baseline SVM
(0.95). The performance starts to degrade
beyond that point. The improvement can be due to the
elimination of noisy columns that existed in the baseline
SVM. When they were eliminated, the best performance was attained.
However, when more features were added, which tended to
be noisier, the performance started to degrade. Table IV
compares the baseline SVM with both the SVM after feature
selection and AraBERT.
TABLE IV. COMPARING THE BASELINE PERFORMANCE WITH THE
OPTIMIZED PERFORMANCE USING FEATURE SELECTION

| Classifier | Recall | Precision | F1 |
|---|---|---|---|
| AraBERT | 0.96 | 0.98 | 0.97 |
| SVM Baseline | 0.94 | 0.96 | 0.95 |
| SVM + Chi | 0.943 | 0.969 | 0.958 |
It is clear from the table that the best performance was for
the AraBERT deep learning method. However, this method has
a high computational cost and training time. The difference in
F1 score between the optimized SVM after feature selection and
AraBERT was small: 0.958 versus 0.97 in Table IV, about 1.2
percentage points. However, this difference is not the only
factor that must be considered. Although the accuracy of the
optimized SVM is lower than that of deep learning, the
optimized SVM trains much faster than
AraBERT. Therefore, the best choice depends on the
domain. If accuracy is needed regardless of the
model complexity or the training time, deep learning
would be the first option. If the model complexity
and the training time are key factors for the domain, feature
selection would provide a large improvement in training
time and model complexity with a slight decrease in
accuracy. Therefore, despite the importance of further
improving deep learning methods, it is equally important to
shed more light on optimizing simple models to get the
best performance from them.
V. CONCLUSIONS AND FUTURE WORKS
In this work, we proposed an optimized fake news
classification method for Arabic text. Experimental work
showed that optimizing feature selection can improve the
performance of fake news classification in comparison with
no feature selection, and that such performance can be close to
that of deep learning methods with a large improvement in
model complexity and training time.
Future work can optimize other parts of the
preprocessing phase. Furthermore, more studies are needed
to provide more Arabic fake news datasets and to direct more
work toward the detection of this important issue.
REFERENCES
[1] X. Jose, S. M. Kumar, & P. Chandran, “Characterization,
Classification and Detection of Fake News in Online Social
Media Networks,” In 2021 IEEE Mysore Sub Section
International Conference (MysuruCon), pp. 759-765, 2021.
[2] K. Sharma, F. Qian, H. Jiang, N. Ruchansky, M. Zhang & Y.
Liu, “Combating fake news: A survey on identification and
mitigation techniques,” ACM Transactions on Intelligent
Systems and Technology (TIST), vol. 10, no. 3, pp. 1-42, 2019.
[3] J. Ma, W. Gao, Z. Wei, Y. Lu, & K. F. Wong, “Detect rumors
using time series of social context information on
microblogging websites,” In Proceedings of the 24th ACM
international on conference on information and knowledge
management, pp. 1751-1754, 2015.
[4] K. Shu, A. Sliva, S. Wang, J. Tang, & H. Liu, “Fake news
detection on social media: A data mining perspective,” ACM
SIGKDD explorations newsletter, vol. 19, no. 1, pp. 22-36,
2017.
[5] S. Kato, L. Yang, & D. Ikeda, “Domain Bias in Fake News
Datasets Consisting of Fake and Real News Pairs,” In 2022
12th International Congress on Advanced Applied Informatics
(IIAI-AAI) pp. 101-106, 2022.
[6] W. Yu, J. Ge, Z. Yang, Y. Dong, Y. Zheng, & H. Dai, “Multi-domain
Fake News Detection for History News Environment
Perception,” In 2022 IEEE 17th Conference on Industrial
Electronics and Applications (ICIEA), pp. 428-433, 2022.
[7] P. Wei, F. Wu, Y. Sun, H. Zhou, & X.Y. Jing, “Modality and
Event Adversarial Networks for Multi-Modal Fake News
Detection,” IEEE Signal Processing Letters, vol. 29, pp. 1382-
1386, 2022.
[8] D. Rohera, et al., “A taxonomy of fake news classification
techniques: Survey and implementation aspects,” IEEE
Access, vol. 10, pp. 30367-30394, 2022.
[9] X. Zhou & R. Zafarani, “A survey of fake news: Fundamental
theories, detection methods, and opportunities,” ACM
Computing Surveys (CSUR), vol. 53, no. 5, pp. 1-40, 2020.
[10] A. R. Merryton, & M. G. Augasta, “A Novel Framework for
Fake News Detection using Double Layer BI-LSTM,” In 2023
5th International Conference on Smart Systems and Inventive
Technology (ICSSIT), pp. 1689-1696, 2023.
[11] J. Ma, W. Gao, P. Mitra, S. Kwon, B. J. Jansen, K. F. Wong, & M.
Cha, “Detecting rumors from microblogs with recurrent neural
networks,” 3818, 2016.
[12] Y. Yang, L. Zheng, J. Zhang, Q. Cui, Z. Li, & P. S. Yu,
“TI-CNN: Convolutional neural networks for fake news
detection,” arXiv preprint arXiv:1806.00749,
2018.
[13] R. M. Albalawi, A. T. Jamal, A. O. Khadidos, & A. M.
Alhothali, “Multimodal Arabic Rumors Detection,” IEEE
Access, vol. 11, pp. 9716-9730, 2023.
[14] N. O. Bahurmuz, G. A. Amoudi, F. Baothman, A. T. Jamal, H.
S. Alghamdi, & A. M. Alhothali, “Arabic Rumor
Detection Using Contextual Deep Bidirectional Language
Modeling,” IEEE Access, vol. 10, pp. 114907-114918, 2022.
[15] S. B. Ali, Z. Kechaou, & A. Wali, “Arabic fake news
detection in social media Based on AraBERT,” In 2022 IEEE
21st International Conference on Cognitive Informatics &
Cognitive Computing (ICCI*CC), pp. 214-220, 2022.
[16] H. Rahab, A. Zitouni, & M. Djoudi, “Arabic Fake News and
Spam Handling: Methods, Resources and Opportunities,”
In 2021 International Conference on Artificial Intelligence for
Cyber Security Systems and Privacy (AI-CSP), pp. 1-7, 2021.
[17] D. Mohdeb, M. Laifa, & M. Naidja, “An Arabic Corpus for
Covid-19 related Fake News,” In 2021 International
Conference on Recent Advances in Mathematics and
Informatics (ICRAMI) , pp. 1-5, 2021.
[18] W. Shishah, “JointBert for Detecting Arabic Fake
News,” IEEE Access, vol. 10, pp. 71951-71960, 2022.
[19] A. Khalil, M. Jarrah, M. Aldwairi, and Y. Jararweh, “Detecting
arabic fake news using machine learning,” In 2021 Second
International Conference on Intelligent Data Science
Technologies and Applications (IDSTA) , pp. 171-17, 2021.
[20] H.T. Himdi, & F. Y. Assiri, “Development of Classification
Model based on Arabic Textual Analysis to Detect Fake News:
Case Studies,” In 2023 1st International Conference on
Advanced Innovations in Smart Cities (ICAISC), pp. 1-6,
2023.
[21] A. Khalil, M. Jarrah, and M. Aldwairi, Arabic Fake News
Dataset (AFND), Accessed May 2023.