Figure: Top ten internet-using languages (available from Social Network Analysis and Mining)

Source publication
Article
Full-text available
Recently, the world has witnessed an exponential growth of social networks, which has opened an avenue for online users to express and share their opinions on different aspects of life. Sentiment analysis has become a trending research topic in the field of natural language processing due to its significant role in analyzing the public’s opinion and d...

Similar publications

Article
The rapid proliferation of user-generated content has given rise to large volumes of text corpora. Increasingly, scholars, researchers, and organizations employ text classification to mine novel insights for high-impact applications. Despite their prevalence, conventional text classification methods rely on labor-intensive feature engineering effor...

Citations

... The highest accuracy of 96% and recall of 100% are achieved by GBM. Recent analyses show that deep learning models tend to outperform traditional sentiment analysis models [24][25][26][27][28][29]. Dang et al. [8] experimented with deep NN, CNN, and LSTM models, alternately using word embedding and TF-IDF features, on eight different datasets; the highest accuracy of 90.45% was obtained using word embedding and RNN on the Tweets Airline dataset. ...
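To make the traditional GBM baseline mentioned in this snippet concrete, the following is a minimal sketch of a gradient boosting sentiment classifier over TF-IDF features. The toy texts, labels, and parameters are illustrative assumptions, not the datasets or settings used in the cited studies.

```python
# Hedged sketch: TF-IDF features fed to a gradient boosting (GBM) classifier.
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline

texts = ["the flight was great", "worst airline experience",
         "loved the crew", "delayed again, terrible service"]   # toy samples
labels = [1, 0, 1, 0]                                            # 1 = positive, 0 = negative

X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.5, stratify=labels, random_state=0)

model = Pipeline([
    ("tfidf", TfidfVectorizer(ngram_range=(1, 2))),              # unigram + bigram features
    ("gbm", GradientBoostingClassifier(n_estimators=200, learning_rate=0.1)),
])
model.fit(X_train, y_train)
pred = model.predict(X_test)
print("accuracy:", accuracy_score(y_test, pred), "recall:", recall_score(y_test, pred))
```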
Preprint
Full-text available
In the contemporary era, social media platforms amass an extensive volume of social data contributed by their users. In order to promptly grasp the opinions and emotional inclinations of individuals regarding a product or event, it becomes imperative to perform sentiment analysis on the user-generated content. Microblog comments often encompass both lengthy and concise text entries, presenting a complex scenario. This complexity is particularly pronounced in extensive textual content due to its rich content and intricate word interrelations compared to shorter text entries. Sentiment analysis of public opinion shared on social networking websites such as Facebook or Twitter has evolved and found diverse applications. However, several challenges remain to be tackled in this field. Hybrid methodologies have emerged as promising models for mitigating sentiment analysis errors, particularly when dealing with progressively intricate training data. In this article, to investigate COVID-19 vaccination hesitancy, we propose eight different hybrid deep learning models for sentiment classification with the aim of improving the overall accuracy of the model. Sentiment prediction is achieved using embeddings, deep learning models, and a grid search algorithm on a Twitter COVID-19 dataset. According to the study, public sentiment towards COVID-19 immunization appears to be improving with time, as evidenced by the gradual decline in vaccine reluctance. Through extensive evaluation, the proposed model reported an increased accuracy of 98.86%, outperforming other models. Specifically, the combination of BERT, CNN, and GS yields the highest accuracy, while the combination of GloVe, BiLSTM, CNN, and GS follows closely behind with an accuracy of 98.17%. In addition, an increase in accuracy in the range of 2.11% to 14.46% is reported by the proposed model in comparison with existing works.
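As an illustration of this family of hybrids, below is a minimal sketch of one variant (embedding, BiLSTM, convolution, dense head) with a small manual grid search. It is not the authors' implementation: the vocabulary size, sequence length, random stand-in data, and the tiny search grid are assumptions made only for the example.

```python
# Hedged sketch: embedding -> BiLSTM -> CNN -> dense head, with a manual grid search.
import itertools
import numpy as np
import tensorflow as tf

VOCAB, SEQ_LEN = 20000, 60
x = np.random.randint(1, VOCAB, size=(256, SEQ_LEN))   # stand-in token ids
y = np.random.randint(0, 2, size=(256,))                # stand-in sentiment labels

def build_model(lstm_units, dropout):
    return tf.keras.Sequential([
        tf.keras.layers.Embedding(VOCAB, 128),
        tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(lstm_units, return_sequences=True)),
        tf.keras.layers.Conv1D(64, 3, activation="relu"),
        tf.keras.layers.GlobalMaxPooling1D(),
        tf.keras.layers.Dropout(dropout),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])

best = None
for units, drop in itertools.product([64, 128], [0.3, 0.5]):    # illustrative grid
    model = build_model(units, drop)
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    hist = model.fit(x, y, validation_split=0.2, epochs=2, batch_size=32, verbose=0)
    acc = hist.history["val_accuracy"][-1]
    if best is None or acc > best[0]:
        best = (acc, units, drop)
print("best (val_acc, lstm_units, dropout):", best)
```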
... Alayba et al. (2018) investigated the integration of CNN and LSTM networks for ASA and observed a rise in the precision of ASA when analysing multiple datasets. Furthermore, Ombabi et al. (2020) proposed a deep learning model for Arabic Sentiment Analysis (ASA) that utilizes a single-layer CNN for extracting local features and a two-layer LSTM for preserving long-term dependencies. The outputs of the CNN and LSTM are fed into an SVM classifier, which produces the final classification result. ...
Article
Full-text available
In the last few years, various topics, including sports, have seen social media platforms emerge as significant sources of information and viewpoints. Football fans use social media to express their opinions and sentiments about their favourite teams and players. Analysing these opinions can provide valuable information on the satisfaction of football fans with their teams. In this article, we present Soutcom, a scalable real‐time system that estimates the satisfaction of football fans with their teams. Our approach leverages the power of social media platforms to gather real‐time opinions and emotions of football fans and applies state‐of‐the‐art machine learning‐based sentiment analysis techniques to accurately predict the sentiment of Arabic posts. Soutcom is designed as a cloud‐based scalable system integrated with the X (formerly known as Twitter) API and a football data service to retrieve live posts and match data. The Arabic posts are analysed using our proposed bidirectional LSTM (biLSTM) model, which we trained on a custom dataset specifically tailored for the sports domain. Our evaluation shows that the proposed model outperforms other machine learning models such as Random Forest, XGBoost and Convolutional Neural Networks (CNNs) in terms of accuracy and F1‐score with values of 0.83 and 0.82, respectively. Furthermore, we analyse the inference time of our proposed model and suggest that there is a trade‐off between performance and efficiency when selecting a model for sentiment analysis on Arabic posts.
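To illustrate the kind of model and latency trade-off discussed here, the sketch below builds a BiLSTM classifier and takes a rough per-post inference-time measurement. It is not the Soutcom biLSTM model; the vocabulary, layer sizes, and random input batch are placeholders.

```python
# Hedged sketch: BiLSTM sentiment classifier plus a rough per-post latency estimate.
import time
import numpy as np
import tensorflow as tf

VOCAB, SEQ_LEN = 30000, 50
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(VOCAB, 200),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(128)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

batch = np.random.randint(1, VOCAB, size=(64, SEQ_LEN))   # stand-in tokenized posts
start = time.perf_counter()
_ = model.predict(batch, verbose=0)
print(f"latency per post: {(time.perf_counter() - start) / len(batch) * 1000:.2f} ms")
```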
... An integrated FastText over SVC and LR classifiers (Altowayan, 2017). A joint model based on CNN, LSTM, and SVM classifier (Ombabi et al., 2020). A Multi-Channel Embedding Convolutional Neural Network Model (Dahou et al., 2019). ...
... As detailed in Table 9 and Fig. 6, Deep Conv-ABiLSTM consistently outperformed all the other baseline methods. Notably, Deep Conv-ABiLSTM achieved a significant improvement, with 93.73% accuracy, over the CNN architecture reported in Alayba et al. (2018b), which achieved 88.10% accuracy, and the CNN-LSTM architecture reported in Ombabi et al. (2020), which achieved 88.52%. ...
... Using the ASTD benchmark, Deep Conv-ABiLSTM achieved 92.21% accuracy, which provides a +12.96% accuracy improvement. Notably, Deep Conv-ABiLSTM outperformed the CNN-LSTM-based models reported in Dahou et al. (2019) and Ombabi et al. (2020) with +7.68% and +2.49% accuracy improvements respectively; this comparison demonstrates the significant influence of the Bi-LSTM and attention layers on contextual information processing. Meanwhile, Deep Conv-ABiLSTM achieved the best performance among the approaches on this dataset. ...
Article
Full-text available
With the enormous growth of social data in recent years, sentiment analysis has gained increasing research attention and has been widely explored in various languages. The nature of the Arabic language imposes several challenges, such as its complicated morphological structure and limited resources; thus, the current state-of-the-art methods for sentiment analysis remain to be enhanced. This inspired us to explore the application of emerging deep-learning architectures to Arabic text classification. In this paper, we present an ensemble model which integrates a convolutional neural network, bidirectional long short-term memory (Bi-LSTM), and an attention mechanism to predict the sentiment orientation of Arabic sentences. The convolutional layer is used for feature extraction from the higher-level sentence representation layer, and the BiLSTM is integrated to further capture contextual information from the produced set of features. Two attention mechanism units are incorporated to highlight the critical information in the contextual feature vectors produced by the Bi-LSTM hidden layers. The context-related vectors generated by the attention mechanism layers are then concatenated and passed into a classifier to predict the final label. To disentangle the influence of these components, the proposed model is validated as three variant architectures on a multi-domain corpus as well as four benchmarks. Experimental results show that incorporating Bi-LSTM and the attention mechanism improves the model’s performance, yielding 96.08% accuracy. Consequently, this architecture consistently outperforms the other state-of-the-art approaches with up to +14.47%, +20.38%, and +18.45% improvements in accuracy, precision, and recall respectively. These results demonstrate the strengths of this model in addressing the challenges of text classification tasks.
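A compact sketch of the pipeline this abstract describes (convolution for local features, BiLSTM for context, attention over the hidden states, then a dense classifier) is given below. It is an illustrative approximation rather than the authors' Deep Conv-ABiLSTM: a single generic dot-product attention layer stands in for the paper's two attention units, and the vocabulary, sequence length, and layer sizes are assumptions.

```python
# Hedged sketch: CNN -> BiLSTM -> attention -> dense classifier.
import tensorflow as tf

VOCAB, SEQ_LEN = 40000, 80
inp = tf.keras.Input(shape=(SEQ_LEN,), dtype="int32")
x = tf.keras.layers.Embedding(VOCAB, 300)(inp)
x = tf.keras.layers.Conv1D(128, 3, padding="same", activation="relu")(x)   # local n-gram features
h = tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(128, return_sequences=True))(x)
att = tf.keras.layers.Attention()([h, h])             # dot-product attention over hidden states
ctx = tf.keras.layers.GlobalAveragePooling1D()(att)   # pooled context vector
out = tf.keras.layers.Dense(1, activation="sigmoid")(ctx)

model = tf.keras.Model(inp, out)
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()
```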
... Completely automatic deep learning has revolutionized the domain of computer vision and image processing by enabling the efficient processing of vast quantities of visual data. These systems [21,31] have become state-of-the-art technology for tasks such as image acquisition, processing, and analysis. Thanks to their ability to process large numbers of images rapidly, they have found applications in a variety of fields, from medical diagnostics to autonomous vehicles. ...
Article
Full-text available
Detecting traffic signs in natural scenes requires a great deal of data, owing to the variations such scenes undergo over time, for the output of the detection model to be effective and relevant. The amount of annotated data needed can range from a few hundred to thousands or even millions of examples; the availability of labeled data is often a critical bottleneck in the development and deployment of deep learning models, and acquiring high-quality annotations can be time-consuming and costly. Unlabeled data is easy to acquire but expensive to annotate. Several works focus on active learning as an approach to reduce the cost of annotation and the computation time. To address these difficulties, we propose a Semi-Automatic Deep Image Annotation system using a new Evolutionary Strategy for Reinforcement Learning (SADIA-ESRL). Our experiments demonstrate remarkable efficiency on the Natural Scene Traffic Sign and panel guide Arabic-Latin Text Dataset (NaSTSArLaT). The annotation approach studied allows annotating only 1/4 of the images without compromising the model’s efficiency. This reduction in annotation effort is accompanied by significant time savings, with the labeling process now taking as little as 1/5 of the initial time. Furthermore, this strategy grants us the capability to selectively annotate instances, ensuring optimal performance of the detection model used.
... Multilingual BERT was used in [5,13,24]. As for the MARBERT model, it was employed in [13,24,38,41]. Furthermore, the AraBERT model was investigated by [13,24,33,37,42,43], as mentioned in Table 2. ...
... To give a clearer overview, Table 3 presents all the reviewed models and techniques with their best reported performance: Glove [2,9,40]: 94.80% (Accuracy); MUSE [31]: 70% (Accuracy); ArWordVec [29]: 88.56% (F1 score); Sense2Vec [26]: 89.4% (Accuracy); AraBERT [13,24,33,37,43]: 93.8% (Accuracy); QARIB [24,28,31]: 95.4% (Accuracy); MARBERT [13,24,38,41]: 95.2% (Accuracy); mBERT [13,24,40]: 95.7% (Accuracy); CAMeLBERT [24]: 95.6% (Accuracy); XLM [24]: 95.1% (Accuracy); AraELECTRA [24]: 95.2% (Accuracy); ALBERT [24]: 94.5% (Accuracy); Flair [37]: 54.15% (Accuracy) ...
Article
Full-text available
Feature extraction has transformed the field of Natural Language Processing (NLP) by providing an effective way to represent linguistic features. Various techniques are utilised for feature extraction, such as word embedding. The latter has emerged as a powerful technique for semantic feature extraction in Arabic Natural Language Processing (ANLP). Notably, research on feature extraction in the Arabic language remains relatively limited compared to English. In this paper, we present a review of recent studies focusing on word embedding as a semantic feature extraction technique applied in Arabic NLP. The review primarily includes studies on word embedding techniques applied to Arabic corpora. We collected and analysed a selection of journal papers published between 2018 and 2023 in this field. Through our analysis, we categorised the different feature extraction techniques, identified the Machine Learning (ML) and/or Deep Learning (DL) algorithms employed, and assessed the performance metrics utilised in these studies. We demonstrate the superiority of word embeddings as a semantic feature representation in ANLP. We compare their performance with other feature extraction techniques, highlighting the ability of word embeddings to capture semantic similarities, detect contextual associations, and facilitate a better understanding of Arabic text. Consequently, this article provides valuable insights into the current state of research in word embedding for Arabic NLP.
... Ombabi, Abubakr H., et al. [24] introduced a deep learning approach for Arabic sentiment classification built upon a one-layer CNN model for local feature extraction and a two-layer LSTM architecture for maintaining long-term dependencies. An SVM classifier produces the final category from the feature maps learned by the CNN and LSTM. ...
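The following is a hedged sketch of the design as described in this citation: one convolution layer for local features, two stacked LSTM layers whose final state serves as the feature vector, and an SVM for the final decision. It is not the authors' implementation; the trainable embedding stands in for pre-trained FastText vectors, end-to-end training of the extractor is omitted, and all sizes and data are placeholders.

```python
# Hedged sketch: CNN + two-layer LSTM feature extractor, SVM for the final label.
import numpy as np
import tensorflow as tf
from sklearn.svm import SVC

VOCAB, SEQ_LEN = 50000, 100
inp = tf.keras.Input(shape=(SEQ_LEN,), dtype="int32")
x = tf.keras.layers.Embedding(VOCAB, 300)(inp)             # stand-in for FastText embeddings
x = tf.keras.layers.Conv1D(128, 5, activation="relu")(x)   # single convolution layer
x = tf.keras.layers.MaxPooling1D(2)(x)
x = tf.keras.layers.LSTM(128, return_sequences=True)(x)    # LSTM layer 1
feat = tf.keras.layers.LSTM(128)(x)                        # LSTM layer 2 -> feature vector
extractor = tf.keras.Model(inp, feat)                      # (training omitted for brevity)

tokens = np.random.randint(1, VOCAB, size=(200, SEQ_LEN))  # stand-in tokenized sentences
labels = np.random.randint(0, 2, size=(200,))
features = extractor.predict(tokens, verbose=0)

svm = SVC(kernel="linear").fit(features, labels)           # SVM produces the final category
print("train accuracy:", svm.score(features, labels))
```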
Article
Full-text available
Aspect-oriented extraction involves the identification and extraction of specific aspects, features, or entities within a piece of text. Traditional methods often struggled with the complexity and variability of language, leading to the exploration of advanced deep learning approaches. In the realm of sentiment analysis, the conventional approaches often fall short when it comes to providing a nuanced understanding of sentiments expressed in textual data. Traditional sentiment analysis models often overlook the specific aspects or entities within the text that contribute to the overall sentiment. This limitation poses a significant challenge for businesses and organizations aiming to gain detailed insights into customer opinions, product reviews, and other forms of user-generated content. In this research, we propose an innovative approach for aspect-oriented extraction and sentiment analysis leveraging optimized hybrid deep learning techniques. Our methodology integrates the powerful capabilities of deep learning models with the efficiency of Reptile Search Optimization. Furthermore, we introduce an advanced sentiment analysis framework employing the state-of-the-art Extreme Gradient Boosting Algorithm. The fusion of these techniques aims to enhance the precision and interpretability of aspect-oriented sentiment analysis. The proposed approach first utilizes deep learning architectures to extract and comprehend diverse aspects within textual data. Through the incorporation of Reptile Search Optimization, we optimize the learning process, ensuring adaptability and improved model generalization across various datasets. Subsequently, the sentiment analysis phase employs the robust Extreme Gradient Boosting Algorithm, known for its effectiveness in handling complex relationships and patterns within data. Our experiments, conducted on diverse datasets, demonstrate the superior performance of the proposed methodology in comparison to traditional approaches. The optimized hybrid deep learning approach, coupled with Reptile Search Optimization and the Extreme Gradient Boosting Algorithm, shows promising results in accurately capturing nuanced sentiments associated with different aspects. This research contributes to the advancement of aspect-oriented sentiment analysis techniques, offering a comprehensive and efficient solution for understanding sentiment nuances in textual data across various domains. The ResNet 50 and EfficientNet B7 architectures of the modified pre-trained model are proposed for the aspect extraction function. The Reptile Search Optimization based Extreme Gradient Boosting Algorithm (RSO-EGBA) is proposed to analyze and predict customer sentiments. The study is implemented using Python. It has been observed that the overall accuracy of our proposed method is 99.8%, an increment of 9–16% over that of the state-of-the-art methods.
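As a rough illustration of the sentiment-classification stage only, the sketch below trains an Extreme Gradient Boosting classifier over TF-IDF features, with scikit-learn's RandomizedSearchCV standing in for the Reptile Search Optimization step, which is not reproduced here. The texts, labels, and parameter grid are illustrative assumptions.

```python
# Hedged sketch: XGBoost sentiment classifier with a randomized hyperparameter search.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import RandomizedSearchCV
from xgboost import XGBClassifier

texts = ["battery life is excellent", "screen cracked in a week",
         "camera quality is superb", "support never replied"]   # toy reviews
labels = [1, 0, 1, 0]

X = TfidfVectorizer().fit_transform(texts)
search = RandomizedSearchCV(
    XGBClassifier(eval_metric="logloss"),
    param_distributions={"n_estimators": [100, 300], "max_depth": [3, 6],
                         "learning_rate": [0.05, 0.1]},
    n_iter=4, cv=2, random_state=0,
)
search.fit(X, labels)
print(search.best_params_, search.best_score_)
```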
... SN platforms are becoming more crucial than ever for the dissemination of ideas about everything [2,3]. On SNs, members can utilize a variety of social data formats to share and express their thoughts and experiences. ...
... These forms include textual data (like reviews, tweets, and comments), visual data (like liked and shared photos), and multimedia data (like sounds and movies) [3][4][5]. Sentiment analysis (SA) seeks to identify a group of people's views about a certain issue on one or more social media platforms [5]. For decision-makers, corporate executives, and others, understanding public opinions and concerns voiced on these many platforms is a vital issue. ...
... Arabic language is among the languages widely utilized on SNs [3]. Around 422 million people speak Arabic natively or in one of its many dialects. ...
... Additionally, the researchers employed an SVM classifier for the final classification stage. The FastText word embedding model was utilized to further support their proposed model [33]. ...
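A minimal sketch of that combination (FastText word vectors averaged into sentence vectors, then an SVM for the final classification stage) is shown below. The toy Arabic corpus, labels, and vector dimensions are placeholders; this is not the cited model.

```python
# Hedged sketch: FastText sentence vectors (mean of word vectors) classified by an SVM.
import numpy as np
from gensim.models import FastText
from sklearn.svm import SVC

corpus = [["الخدمة", "ممتازة", "جدا"], ["تجربة", "سيئة", "للغاية"],
          ["المنتج", "رائع"], ["لن", "اشتري", "مرة", "اخرى"]]   # toy tokenized sentences
labels = [1, 0, 1, 0]

ft = FastText(sentences=corpus, vector_size=100, window=3, min_count=1, epochs=20)
sent_vecs = np.array([np.mean([ft.wv[w] for w in doc], axis=0) for doc in corpus])

clf = SVC(kernel="rbf").fit(sent_vecs, labels)   # SVM handles the final classification
print(clf.predict(sent_vecs))
```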
Article
Full-text available
The rapid growth of social networking and micro-blogging websites has motivated researchers to analyze published content to identify and predict human behavior. With the ongoing growth in data volume, the efficient and effective extraction of valuable information has become crucial. Researchers are tackling this problem using big data analytics. However, most studies have focused on the English language and fewer research efforts have been devoted to the Arabic language. This paper covers the recent Arabic sentiment analysis research, highlighting the most important studies that analyzed content from social media to predict human behavior and the different approaches used. Sentiment analysis studies varied between lexicon-based, traditional machine learning-based, deep learning, and hybrid approaches, in addition to employing swarm intelligence to optimize the performance of text classification algorithms. The reviews showed that Naïve Bayes (NB) and Support Vector Machine (SVM) were the most widely used algorithms among traditional machine learning algorithms. The results also showed that using deep learning approaches achieves better accuracy than other approaches. Also, the use of optimization algorithms based on swarm intelligence had a significant impact on increasing the accuracy of text clustering.
... compared to standalone CNN and LSTM models. This significant performance can be related to the hybrid CNN-LSTM architecture, where the convolution layer extracts local features which are passed to LSTMs for sequence prediction [67]. The BiLSTM has the best performance among the baseline classifiers used in the experiment analysis with accuracy, precision, recall, and F1 score values of 76.31%, 76.62%, 75.79%, and 76.21% respectively. ...
Article
Full-text available
Sarcasm has a significant role in human communication, especially on social media platforms where users express their sentiments through humor, satire, and criticism. The identification of sarcasm is crucial in comprehending the sentiment and the communication context on platforms like Twitter. The ambiguous nature of such expressions makes sarcasm detection a considerable challenge in natural language processing (NLP). The importance and the challenges increase further in languages like Urdu, where NLP resources are limited. Traditional rule-based approaches lack the desired performance because of the subtle and context-dependent nature of sarcasm. However, recent advancements in NLP, particularly transformer-based large language models (LLMs) like BERT, offer promising solutions. In this research, we have utilized a newly created Urdu sarcasm dataset comprising 12,910 tweets manually re-annotated into sarcastic and non-sarcastic classes. These tweets were derived from the public Urdu tweet dataset consisting of 19,995 tweets. We have established baseline results using deep learning classifiers comprising CNN, LSTM, GRU, BiLSTM, and CNN-LSTM. To comprehensively capture the contextual information, we propose a novel hybrid model architecture that integrates multilingual BERT (mBERT) embeddings with BiLSTM and multi-head attention (MHA) for Urdu sarcasm detection. The proposed mBERT-BiLSTM-MHA model demonstrates superior performance by achieving an accuracy of 79.51% and an F1 score of 80.04%, outperforming deep learning classifiers trained with fastText word embeddings.
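To make the described hybrid concrete, below is a hedged PyTorch sketch combining frozen mBERT embeddings, a BiLSTM, and multi-head self-attention with a binary head. It is not the paper's mBERT-BiLSTM-MHA configuration: hidden sizes, number of heads, the frozen encoder, and the sample input are assumptions.

```python
# Hedged sketch: mBERT embeddings -> BiLSTM -> multi-head self-attention -> binary head.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class MBertBiLstmMHA(nn.Module):
    def __init__(self, hidden=128, heads=4):
        super().__init__()
        self.bert = AutoModel.from_pretrained("bert-base-multilingual-cased")
        self.lstm = nn.LSTM(self.bert.config.hidden_size, hidden,
                            batch_first=True, bidirectional=True)
        self.mha = nn.MultiheadAttention(2 * hidden, heads, batch_first=True)
        self.head = nn.Linear(2 * hidden, 1)

    def forward(self, input_ids, attention_mask):
        with torch.no_grad():                       # keep mBERT frozen in this sketch
            emb = self.bert(input_ids, attention_mask=attention_mask).last_hidden_state
        h, _ = self.lstm(emb)                       # contextual BiLSTM states
        ctx, _ = self.mha(h, h, h)                  # self-attention over LSTM states
        return self.head(ctx.mean(dim=1)).squeeze(-1)

tok = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
batch = tok(["واہ کیا بات ہے"], return_tensors="pt", padding=True)   # toy Urdu input
model = MBertBiLstmMHA()
print(torch.sigmoid(model(batch["input_ids"], batch["attention_mask"])))
```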