Sentiment Analysis on Bangla Food Reviews Using Machine Learning and Explainable NLP

2023 26th International Conference on Computer and Information Technology (ICCIT)
13-15 December 2023, Cox’s Bazar, Bangladesh
979-8-3503-5901-5/23/$31.00 ©2023 IEEE
Md. Shymon Islam¹ and Dr. Kazi Masudul Alam²
¹,²Computer Science and Engineering Discipline
Khulna University, Khulna-9208, Bangladesh
Email: shymum1702@cseku.ac.bd, kazi@cseku.ac.bd
Abstract—Sentiment analysis (SA) is a subfield of natural
language processing (NLP) that extracts valuable insights
from textual data. Food review analysis is a trending domain
of SA that has become very useful as growing internet
dependency rapidly shifts people’s food ordering preferences
from restaurants to online platforms. This work examines
various machine learning (ML) and deep learning (DL)
algorithms for Bangla sentiment analysis on food reviews,
using a new dataset of 44,491 reviews collected from various
restaurant Facebook pages and groups. Furthermore, we use
explainable NLP to interpret why a model performs well or
poorly. The Random Forest (RF) and Convolutional Neural
Network-Bidirectional Gated Recurrent Unit (CNN-BiGRU)
models outperformed the other models, achieving the highest
accuracies of 88.73% and 90.96% in the ML and DL domains,
respectively. A Friedman statistical test was performed on the
obtained results, and the test results are significant at
p < 0.05. “দর” is the most important feature that helps the
hybrid DL (CNN-BiGRU) model classify reviews more
accurately.
Keywords— Sentiment Analysis, Food Review Analysis, CNN,
Bi-GRU, Explainable NLP
I. INTRODUCTION
Sentiment analysis (SA) is the process of predicting the
sentiments or emotions expressed in a set of text data [1]. As
a subfield of natural language processing (NLP), SA has
emerged as a critical tool for extracting valuable insights from
textual data by discerning the emotional tone and underlying
sentiment conveyed by the language used [2]. It focuses on
assessing people’s opinions, feelings, and emotions [3].
Nearly 250 million people speak Bangla as their first
language, and 160 million of them are Bangladeshis [4].
Bangla language processing is still at a developing stage: the
majority of research studies and publicly accessible datasets
on sentiment analysis are restricted to English and other
resource-rich languages like Arabic, Turkish, Hindi and
Urdu [5]. Bangla is a language with a rich morphology that
has developed over thousands of years with its longstanding
customs, including numerous dialects. It therefore presents
both challenges and opportunities for developing effective
sentiment analysis techniques that can accurately capture the
sentiment conveyed in Bangla texts across various domains
[6]. In this work, we mainly focus on the food review domain
of SA. The participation of Bangladeshi residents in internet
activities is rapidly increasing, and reviews of food and food
delivery systems are a particularly interesting part of this
activity. In Bangladesh, there is a rising number of online
food delivery services [7]. The majority of Bangladeshis
provide insightful feedback in Bangla on social media [2].
An examination of several blogs and social media platforms
shows that, in addition to English, users also post comments
in Bangla [6]. Food review sentiment analysis has many
application areas, such as the analysis of food delivery
services [8], food quality analysis [9], financial market
analysis [10] and customer satisfaction tracking [11].
Most research on multi-class SA uses machine learning (ML)
or deep learning (DL) algorithms to predict positive, negative
or neutral sentiments. This work focuses on accurately
detecting the sentiment of Bangla food reviews using learning
models, and on explaining why a model performs better or
worse using explainable NLP. The contributions of the
proposed work are as follows:
1) We created a new specialized SA dataset of 44,491 food
reviews collected from several Facebook pages and
groups.
2) We examined various ML and DL algorithms and made a
comparative study among them.
3) We introduced a novel hybrid DL method (CNN-BiGRU)
that outperforms all the other algorithms.
4) We applied explainable NLP, using the LIME and SHAP
modules, to explain why a model produces good or bad
results.
The rest of this article is organized as follows: Section II
reviews the related works, Section III presents the proposed
methodology for Bangla SA on food reviews, Section IV
describes the experimental results analysis and discussion, and
Section V concludes the work with some remarks on future
directions.
II. RELATED WORKS
Some recent studies of Bangla SA on food reviews are
discussed and summarized here.
In 2023, M. I. H. Junaid et al. [2] proposed a machine-
learning-based method for sentiment analysis of food reviews
in Bangla. They created a dataset of only 1040 reviews from
Foodpanda and Hungrynaki and showed that the Long
Short-Term Memory (LSTM) deep learning model
outperformed the others with an accuracy of 90.89%. M. Hasan
et al. [6] studied Bangla SA on the topic of the Russia-Ukraine
war. A total of 10,861 Bangla comments were collected and
labeled with three polarity sentiments, namely Neutral,
Pro-Ukraine (Positive), and Pro-Russia (Negative). They used
several transformer language models, including BanglaBERT,
XLM-RoBERTa-base, XLM-RoBERTa-large, Distil-mBERT
and mBERT, and showed experimentally that BanglaBERT
outperformed the baseline and all the other transformer-based
classifiers. Another 2023 study, by E. R. Rhythm et al. [7],
delved into SA on restaurant reviews sourced from
Bangladeshi food delivery apps. They created a dataset named
“Bangladeshi Restaurant Reviews” covering 15,018 instances
by collecting customer feedback from popular apps like
Foodpanda and Hungrynaki. They employed the Robustly
Optimized BERT Pretraining Approach (RoBERTa), AFINN,
and DistilBERT models and obtained accuracies of 74%, 73%
and 77%, respectively. Also in 2023, Bitto et al. [12] studied
user reviews collected from food delivery startups. They
collected 1400 reviews from 4 food delivery Facebook pages
and applied bipolar SA. Applying ML and DL algorithms, they
obtained their highest accuracies of 89.64% with XGB and
91.07% with an LSTM classifier. R. Haque et al. [13]
proposed, in 2023, a supervised deep learning classifier based
on CNN and LSTM for multi-class SA on Bengali social media
comments. Their proposed CLSTM (Convolutional Long
Short-Term Memory) architecture greatly improved SA
performance, reaching 85.8% accuracy and an F1-score of 0.86
on a labeled dataset of 42,036 Facebook comments. The recent
studies [1-13] suggest that DL and hybrid methods are more
promising than traditional ML approaches for this task.
The studies in [9], [14] and [15] used datasets of 8435,
1000 and 2053 restaurant reviews, respectively. Across the
literature studied [1-15] on Bangla SA for food reviews, the
lack of a rich dataset is a persistent problem for the Bangla
language. To the best of our knowledge, we have developed
one of the largest Bangla SA datasets on food reviews,
consisting of 44,491 reviews from several Facebook pages
and groups. Explainable NLP has not yet been utilized in the
domain of Bangla food review analysis. In our proposed
method, we build on these findings and try to close this
knowledge gap.
III. PROPOSED METHODOLOGY FOR BANGLA SA ON FOOD
REVIEWS
The main goal of this study is to develop different ML and
DL models that can differentiate among positive, negative and
neutral Bangla food reviews. The overall workflow of the
proposed method is depicted in Fig. 1.
A. Dataset Preparation
In this study, we have gathered Bengali food reviews from
several Facebook food blogs and groups, such as Street Food
Hunting, Dhaka Food Review, Food Review Jashore, Food
Bloggers Barishal and Rafsan the Chotobhai.
TABLE I. SUMMARY OF THE DATASET
Sentiment        No. of Reviews
Positive         14,424
Negative         12,371
Neutral          17,696
Total Reviews    44,491
Six graduate students who are native Bangla speakers were
involved in the data collection process (4 males and 2
females), and five were involved in the data annotation
process (3 males and 2 females). The proposed dataset
contains a total of 44,491 Bangla food reviews, and its
distribution is shown in Table I. We split the dataset into
training and test sets of 90% and 10%, respectively, using the
holdout method: 40,041 reviews were used for training the
models and 4,450 for testing. Since the classes contain
unequal numbers of reviews, the dataset is imbalanced, which
may produce biased results [5]. We therefore used the
synthetic minority oversampling technique (SMOTE) to
balance the dataset.
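As a rough illustration of the balancing step, the sketch below shows the interpolation idea behind SMOTE in plain Python. It is not the authors' implementation: the function names, the toy 2-D points and the choice of k are hypothetical, and the class counts are taken from Table I on the assumption that balancing targets the size of the largest class (the text does not state whether SMOTE was applied before or after the train/test split).

```python
import random

def smote_like(minority, n_new, k=2, seed=0):
    """Create n_new synthetic samples by interpolating between a minority
    sample and one of its k nearest minority-class neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by squared Euclidean distance to base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        lam = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([a + lam * (b - a) for a, b in zip(base, neighbour)])
    return synthetic

def samples_needed(class_counts):
    """Synthetic samples each class needs to match the largest class."""
    target = max(class_counts.values())
    return {label: target - n for label, n in class_counts.items()}

# Class counts from Table I (assumption: balancing uses these totals).
counts = {"Positive": 14424, "Negative": 12371, "Neutral": 17696}
needed = samples_needed(counts)

# Hypothetical 2-D feature vectors standing in for TF-IDF rows.
toy_minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote_like(toy_minority, n_new=5)
print(needed)
print(new_points)
```

Because each synthetic point lies on a line segment between two real minority samples, oversampling of this kind adds plausible variations rather than exact duplicates.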
B. Data Preprocessing
Data preprocessing is a prerequisite for any classification
model to perform well [2]. The reviews we collected from
Facebook contain spelling mistakes, punctuation marks,
emojis, special characters, numerical values and so on.
Several preprocessing steps suited to the Bangla language
were applied, such as tokenization, non-Bangla word removal,
emoji removal, URL removal, cleaning, stop word removal¹
and stemming.
1) Non-Bangla Words, Punctuation, URLs and Emoticon
Removal: The dataset contains many redundant or irrelevant
elements that should be removed to eliminate ambiguity.
These include non-Bangla words, punctuation, URLs, special
characters and emoticons. Data cleaning is performed on the
dataset to obtain a reduced, clean version to work with.
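The cleaning step described above can be sketched with regular expressions. This is a minimal example under stated assumptions, not the authors' pipeline: it assumes that "Bangla" text means characters in the Bengali Unicode block (U+0980 to U+09FF), that Bengali digits count as numerical values to be dropped, and that whitespace splitting is an acceptable stand-in for tokenization. The function name and the sample review are hypothetical; stop word removal and stemming are not shown.

```python
import re

# URLs are removed first so that their fragments do not survive as tokens.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
# Anything outside the Bengali Unicode block (and whitespace) is dropped:
# Latin words, punctuation (including the danda), emojis, special characters.
NON_BANGLA_RE = re.compile(r"[^\u0980-\u09FF\s]")
# Bengali digits (U+09E6 to U+09EF) are treated as numerical values.
BANGLA_DIGIT_RE = re.compile(r"[\u09E6-\u09EF]")

def clean_review(text):
    """Return the Bangla word tokens of a raw review."""
    text = URL_RE.sub(" ", text)
    text = NON_BANGLA_RE.sub(" ", text)
    text = BANGLA_DIGIT_RE.sub(" ", text)
    return text.split()

# Hypothetical review mixing Bangla, English, an emoji, a URL and a rating.
sample = "খাবারটা খুব ভালো ছিল!!! 😋 Delicious https://example.com/menu ১০/১০"
tokens = clean_review(sample)
print(tokens)
```

Whitelisting the Bengali block is a simple way to drop emojis and Latin text in one pass, at the cost of also discarding code-mixed words that may carry sentiment.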
C. Feature Extraction
Learning models require a numeric representation of the
data, and the process of deriving such a representation from
text is termed feature extraction [7]. We have used Term
Frequency - Inverse Document Frequency (TF-IDF), which is
one of the most widely used feature extraction methods [12].
For TF-IDF, we have considered 50,000 features with an
n-gram range from 1 to 3, computed according to equation (3).
In equations (1) and (2), X is the frequency of a word in a
review, Y is the total number of words in the review, P is the
total number of review classes, and Q is the number of
reviews from those classes that contain the word.

$$\mathrm{TF} = \frac{X}{Y} \tag{1}$$

$$\mathrm{IDF} = \log\left(\frac{P}{Q}\right) \tag{2}$$

$$\text{TF-IDF} = \mathrm{TF} \times \mathrm{IDF} \tag{3}$$

¹ https://www.ranks.nl/stopwords/bengali
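A small worked example may make the weighting concrete. The sketch below assumes the common formulation TF = X/Y, IDF = log(P/Q) and TF-IDF = TF × IDF, with P taken as the number of reviews and Y counted over all generated n-grams; both readings are assumptions. The function names and the two toy reviews are hypothetical, and the 50,000-feature cap is not implemented.

```python
import math
from collections import Counter

def ngrams(tokens, n_min=1, n_max=3):
    """All word n-grams with n_min <= n <= n_max, joined by spaces."""
    return [" ".join(tokens[i:i + n])
            for n in range(n_min, n_max + 1)
            for i in range(len(tokens) - n + 1)]

def tfidf(reviews):
    """reviews: list of token lists. Returns one {term: weight} dict per review."""
    docs = [ngrams(tokens) for tokens in reviews]
    p = len(docs)  # P: number of reviews (assumed reading of P)
    # Q: number of reviews that contain each term.
    q = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)  # X: frequency of each term in this review
        y = len(doc)           # Y: number of terms in this review
        weights.append({t: (x / y) * math.log(p / q[t])
                        for t, x in counts.items()})
    return weights

# Two hypothetical tokenized reviews ("food good" / "food bad").
reviews = [["খাবার", "ভালো"], ["খাবার", "খারাপ"]]
w = tfidf(reviews)
print(w[0])
```

A term that occurs in every review, such as "খাবার" here, receives weight zero, while terms that distinguish one review from the other keep a positive weight.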
D. SA Method for Bangla Food Reviews
The machine learning models we used for Bangla food
review sentiment analysis are Multinomial Naive Bayes
(MNB), Support Vector Machine (SVM), K-Nearest Neighbors
(KNN), Logistic Regression (LR) and Random Forest (RF);
from deep learning we used CNN, Long Short-Term Memory
(LSTM), Gated Recurrent Unit (GRU), BiLSTM and BiGRU.
RF works by combining multiple decision trees to make
predictions. It uses randomness in creating these trees and
then combines their outputs through voting. This approach
increases accuracy and helps prevent overfitting [4]. MNB is
a probabilistic classifier based on Bayes' theorem and is very
time-efficient [10]. SVM finds the best linear separator
(hyperplane) between classes in the data space. It prioritizes
the hyperplane that maximizes the distance to the nearest data
points [11]. KNN is a lazy learner that often underperforms on
text classification [12]. DL models generally consist of an
embedding layer, hidden layers, a fully connected layer and an
output layer. The experimental results of all the implemented
algorithms are described in Section IV.
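The voting step that Random Forest relies on can be illustrated in a few lines. This is only a sketch of hard majority voting over already-computed per-tree predictions; the function name and the labels are hypothetical, and library implementations may combine trees differently (scikit-learn, for example, averages the trees' class probabilities rather than counting hard votes).

```python
from collections import Counter

def majority_vote(tree_predictions):
    """Combine per-tree class labels into one prediction per review.

    tree_predictions holds one list of labels per tree, all of equal length.
    Ties go to the label that was seen first among the tied ones.
    """
    n_samples = len(tree_predictions[0])
    combined = []
    for i in range(n_samples):
        votes = Counter(tree[i] for tree in tree_predictions)
        combined.append(votes.most_common(1)[0][0])
    return combined

# Hypothetical predictions of three trees for three reviews.
trees = [
    ["pos", "neg", "neu"],
    ["pos", "neu", "neu"],
    ["neg", "neg", "neu"],
]
forest_prediction = majority_vote(trees)
print(forest_prediction)
```

Because each tree is trained with some randomness, individual trees make different mistakes, and the vote tends to cancel errors that are not shared by most trees.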
IV. EXPERIMENTAL RESULTS ANALYSIS AND DISCUSSION
The proposed work mainly focuses on accurately detecting
the sentiment of Bangla food reviews using learning models,
and on explaining why a model performs better or worse using
explainable NLP. In this section we describe, in turn, the
performance metrics, the analysis of the obtained results, the
Friedman test statistics and explainable NLP.
A. Performance Metric
The recent studies [1-15] show that several evaluation
metrics, such as accuracy, F1-score, precision and recall, are
used to analyze a model’s performance. Precision (equation 4)
is the ratio of true positives (TP) to the sum of true positives
and false positives (FP). Recall (equation 5) is the ratio of true
positives to the sum of true positives and false negatives (FN).
F1-score (equation 6) is the harmonic mean of precision and
recall, and accuracy (equation 7) is the ratio of correctly
classified instances to all instances, where TN denotes true
negatives.

$$\mathrm{Precision} = \frac{TP}{TP + FP} \tag{4}$$

$$\mathrm{Recall} = \frac{TP}{TP + FN} \tag{5}$$

$$\text{F1-Score} = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \tag{6}$$

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{7}$$
B. Obtained Results Analysis
Parameter tuning is essential for obtaining the best results
in any classification task. The best-performing models from
the ML and DL domains are RF and CNN-BiGRU,
respectively. We tuned the initial parameters of the RF model
as n_estimators = 1.5 and random_state = 0. For the
CNN-BiGRU model, the parameters are set to max features =
50000, embedding dimension = 64, sequence length = 40,
number of filters = 128, kernel size = 3 and dropout = 0.5.
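To give a feel for what the CNN-BiGRU settings imply, the sketch below stores them in a small configuration object and derives a few layer sizes. The class and function names are hypothetical, and the derived numbers rest on assumptions the text does not state: a single 1-D convolution applied directly to the embeddings, "valid" padding, stride 1 and one bias term per filter.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CnnBiGruConfig:
    # Values as listed in the text.
    max_features: int = 50_000
    embedding_dim: int = 64
    sequence_length: int = 40
    num_filters: int = 128
    kernel_size: int = 3
    dropout: float = 0.5

def embedding_params(cfg):
    """One embedding vector per vocabulary entry."""
    return cfg.max_features * cfg.embedding_dim

def conv1d_params(cfg):
    """Weights plus one bias per filter (assumes the input is the embedding)."""
    return cfg.kernel_size * cfg.embedding_dim * cfg.num_filters + cfg.num_filters

def conv1d_output_length(cfg):
    """Time steps after the convolution (assumes valid padding, stride 1)."""
    return cfg.sequence_length - cfg.kernel_size + 1

cfg = CnnBiGruConfig()
print(embedding_params(cfg), conv1d_params(cfg), conv1d_output_length(cfg))
```

Under these assumptions the embedding table dominates the parameter count, which is typical for text models with a large vocabulary and a short input sequence.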
We have summarized the obtained results with and
without SMOTE of the implemented ML algorithms in Table
II. RF algorithm maintains its status as the best performer.
Fig. 1. Workflow of the Proposed Bangla Sentiment Analysis Method on Food Reviews
TABLE II. OBTAINED RESULTS OF APPLIED MACHINE LEARNING ALGORITHMS; FRIEDMAN TEST: CHI-SQUARE = 10.04, P-VALUE = 0.03976, DEGREES OF FREEDOM = 5, SIGNIFICANCE LEVEL = 0.05 AND RESULT IS SIGNIFICANT AT P < 0.05
             Results Without SMOTE                                      Results With SMOTE
Algorithm    Precision  Recall  F1-Score  Accuracy (%)  AUC   MSE       Precision  Recall  F1-Score  Accuracy (%)  AUC   MSE
MNB          0.76       0.76    0.76      76.42         0.83  0.31      0.78       0.78    0.78      78.04         0.90  0.16
SVM          0.82       0.81    0.81      81.35         0.86  0.27      0.83       0.84    0.84      84.31         0.90  0.14
KNN          0.78       0.76    0.75      76.67         0.81  0.35      0.80       0.79    0.79      79.90         0.85  0.31
LR           0.81       0.82    0.81      81.77         0.87  0.24      0.88       0.80    0.83      83.65         0.89  0.17
RF           0.85       0.85    0.85      85.46         0.90  0.15      0.86       0.90    0.87      88.73         0.95  0.12
TABLE III. OBTAINED RESULTS OF APPLIED DEEP LEARNING ALGORITHMS; FRIEDMAN TEST: CHI-SQUARE = 15.9333, P-VALUE = 0.00311, DEGREES OF FREEDOM = 5, SIGNIFICANCE LEVEL = 0.05 AND RESULT IS SIGNIFICANT AT P < 0.05
             Results Without SMOTE                                      Results With SMOTE
Algorithm    Precision  Recall  F1-Score  Accuracy (%)  AUC   MSE       Precision  Recall  F1-Score  Accuracy (%)  AUC   MSE
CNN          0.78       0.78    0.78      78.34         0.87  0.29      0.87       0.82    0.84      83.24         0.91  0.24
LSTM         0.76       0.76    0.76      76.88         0.86  0.32      0.85       0.84    0.84      84.12         0.89  0.25
GRU          0.76       0.76    0.76      76.81         0.86  0.31      0.81       0.88    0.84      84.31         0.89  0.25
Bi-LSTM      0.77       0.76    0.76      76.91         0.86  0.31      0.84       0.83    0.83      83.29         0.89  0.25
Bi-GRU       0.77       0.76    0.76      76.89         0.86  0.32      0.85       0.82    0.83      83.88         0.89  0.26
CNN-LSTM     0.77       0.77    0.77      77.83         0.88  0.28      0.85       0.85    0.85      85.33         0.91  0.25
CNN-BiLSTM   0.76       0.76    0.76      76.95         0.89  0.28      0.86       0.90    0.87      88.21         0.90  0.23
CNN-GRU      0.77       0.77    0.77      77.93         0.89  0.29      0.92       0.85    0.88      87.38         0.90  0.22
CNN-BiGRU    0.76       0.76    0.76      76.98         0.88  0.27      0.89       0.93    0.90      90.96         0.96  0.11
With SMOTE, RF obtained the highest F1-score (0.87) and
accuracy (88.73%) among the ML models. It also stands out
with an area under the curve (AUC) value of 0.95, which
indicates that it separates the classes well. RF also achieved
the smallest mean squared error (MSE) value of 0.12, a good
sign that its predictions are consistently close to the actual
values. The second-best ML model is SVM, with an accuracy
of 84.31%. The results of the implemented DL algorithms
with and without SMOTE are given in Table III. Among the
models evaluated, the CNN-BiGRU and CNN-BiLSTM
models achieved the best performance across multiple
metrics. The hybrid CNN-BiGRU model achieved an F1-score
of 0.90 and an accuracy of 90.96%. It also produced AUC and
MSE values of 0.96 and 0.11, respectively, which indicate
strong discriminatory power. The CNN-BiLSTM hybrid
model also performed well, with an accuracy of 88.21%,
whereas the plain CNN model reached only 83.24% accuracy.
Overall, hybrid models performed better than the base DL
models.
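The effect of SMOTE on the accuracies in Tables II and III can be summarized per model family. The sketch below assumes that "improvement" means the relative gain in accuracy, (after − before) / before, averaged over the models. This is an interpretation rather than something the text states, but under it the averages come out close to the 3.22% (ML) and 10.81% (DL) figures the paper reports. The variable and function names are hypothetical; the accuracy pairs are copied from the two tables.

```python
# (accuracy without SMOTE, accuracy with SMOTE), in percent, from Table II.
ML_ACCURACY = {
    "MNB": (76.42, 78.04),
    "SVM": (81.35, 84.31),
    "KNN": (76.67, 79.90),
    "LR": (81.77, 83.65),
    "RF": (85.46, 88.73),
}

# Same layout, from Table III.
DL_ACCURACY = {
    "CNN": (78.34, 83.24),
    "LSTM": (76.88, 84.12),
    "GRU": (76.81, 84.31),
    "Bi-LSTM": (76.91, 83.29),
    "Bi-GRU": (76.89, 83.88),
    "CNN-LSTM": (77.83, 85.33),
    "CNN-BiLSTM": (76.95, 88.21),
    "CNN-GRU": (77.93, 87.38),
    "CNN-BiGRU": (76.98, 90.96),
}

def mean_relative_gain(accuracy_pairs):
    """Average relative accuracy gain in percent (assumed definition)."""
    gains = [100.0 * (after - before) / before
             for before, after in accuracy_pairs.values()]
    return sum(gains) / len(gains)

ml_gain = mean_relative_gain(ML_ACCURACY)
dl_gain = mean_relative_gain(DL_ACCURACY)
print(round(ml_gain, 2), round(dl_gain, 2))
```

Reading the gain as a relative change explains why the averages are larger than the plain differences in percentage points between the two accuracy columns.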
TABLE IV. COMPARISON BETWEEN ML AND DL ALGORITHMS
The comparison between the ML and DL algorithms is given
in Table IV, which shows that the best DL model
(CNN-BiGRU) outperforms the best ML model (RF) on every
reported metric. SMOTE was used to balance the dataset, and
its effect on the ML and DL algorithms is depicted in Fig. 2.
SMOTE improves the performance of all the implemented
models; the average improvements after SMOTE for the ML
and DL models are 3.22% and 10.81%, respectively. A
Friedman test was performed on the obtained results at the
0.05 level of significance, and both test results were
significant. The comparison between existing works and our
proposed work is presented in Table V. The proposed model
is trained on a substantial dataset of 44,491 food reviews,
which is comparable to or larger than the datasets of many
other studies and can improve the model’s ability to learn
patterns and generalize well. The proposed hybrid
CNN-BiGRU model achieves the highest accuracy of 90.96%
and an F1-score of 0.90, which puts it in line with, or ahead
of, some of the existing works in the sentiment analysis
domain.
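For readers unfamiliar with the Friedman test, the sketch below computes its chi-square statistic from a table of scores, ranking the models within each row. It is a simplified illustration and not the authors' procedure: the text does not say which rows (blocks) and columns (treatments) were compared, no tie correction is applied, the p-value helper only covers even degrees of freedom, and the example scores are hypothetical.

```python
import math

def average_ranks(values):
    """Rank values ascending (1 = smallest), averaging ranks over ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i..j, 1-based
        for m in range(i, j + 1):
            ranks[order[m]] = shared
        i = j + 1
    return ranks

def friedman_statistic(blocks):
    """blocks: rows of scores, one score per model. Returns (chi2, df)."""
    n, k = len(blocks), len(blocks[0])
    rank_sums = [0.0] * k
    for row in blocks:
        for col, rank in enumerate(average_ranks(row)):
            rank_sums[col] += rank
    chi2 = (12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums)
            - 3.0 * n * (k + 1))
    return chi2, k - 1

def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; closed form for even df only."""
    if df % 2:
        raise ValueError("closed form requires an even number of degrees of freedom")
    half = x / 2.0
    return math.exp(-half) * sum(half ** i / math.factorial(i)
                                 for i in range(df // 2))

# Hypothetical scores of three models (columns) on three metrics (rows).
scores = [
    [0.76, 0.81, 0.85],
    [0.78, 0.84, 0.87],
    [0.77, 0.83, 0.90],
]
chi2, df = friedman_statistic(scores)
p_value = chi2_sf_even_df(chi2, df)
print(chi2, df, p_value)
```

The result is declared significant when the p-value falls below the chosen significance level, which the paper sets at 0.05.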
C. Explainable NLP
In the previous section we analyzed the performance of
different models, but performance figures alone cannot
explain why a model classifies correctly or not. A
fundamental problem of AI and learning-related tasks is the
lack of interpretability: why a model performs well or poorly,
which features are responsible for assigning a document to a
category, which features are the most important, and so on.
We have used the local interpretable model-agnostic
explanations (LIME) and Shapley additive explanations
(SHAP) modules of Python for this purpose. LIME can explain
which features contribute significantly to categorizing a
document, and it also shows which category is more likely to
be chosen, together with the prediction probabilities. A sample
LIME explanation for the prediction of a review is shown in
Fig. 3, where the responsible text features are highlighted.
SHAP provides a global view by quantifying the importance
of each feature to the model's output. Fig. 4 shows the
average impact on the CNN-BiGRU model output magnitude
using Shapley values. There are 50,000 features in total for
the model, but we show only the top 20 features for
predicting the 3 target classes.
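The idea behind Shapley-value attributions can be shown on a toy model. The sketch below computes exact Shapley values by averaging each word's marginal contribution over all orderings. It does not use the SHAP library and is not the authors' setup: the function names, the three words and their weights are hypothetical, and the brute-force enumeration is only feasible for a handful of features.

```python
from itertools import permutations

def shapley_values(features, value_fn):
    """Exact Shapley values: the average marginal contribution of each
    feature over all orderings in which features can be added."""
    totals = {f: 0.0 for f in features}
    orderings = list(permutations(features))
    for order in orderings:
        present = set()
        for f in order:
            before = value_fn(frozenset(present))
            present.add(f)
            totals[f] += value_fn(frozenset(present)) - before
    return {f: total / len(orderings) for f, total in totals.items()}

# Hypothetical contribution of each word to a "positive" score.
WEIGHTS = {"ভালো": 0.5, "খাবার": 0.1, "খারাপ": -0.4}

def positive_score(words_present):
    """Toy additive model: the score is the sum of the present words' weights."""
    return sum(WEIGHTS[w] for w in words_present)

phi = shapley_values(list(WEIGHTS), positive_score)
# A global importance in the spirit of Fig. 4 would average |phi| over many reviews.
ranking = sorted(phi, key=lambda w: abs(phi[w]), reverse=True)
print(phi, ranking)
```

For an additive model like this one, each word's Shapley value equals its weight, and the values always sum to the difference between the full prediction and the empty-input prediction.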
Table IV:
Domain   Algorithm    F1-Score   Accuracy (%)   AUC value   MSE value
ML       RF           0.87       88.73          0.95        0.12
DL       CNN-BiGRU    0.90       90.96          0.96        0.11
TABLE V. COMPARISON AMONG EXISTING RECENT WORKS AND OUR PROPOSED WORK
Name                         Year   Dataset Size   No. of Classes   Best Model    F1-Score   Accuracy (%)
M. I. H. Junaid et al. [2]   2023   1040           2                LSTM          N/A        90.89
M. Hasan et al. [6]          2023   10,861         3                Bangla-BERT   0.82       86.00
E. R. Rhythm et al. [7]      2023   15,018         N/A              DistilBERT    N/A        77.00
Our Proposed                 2023   44,491         3                CNN-BiGRU     0.90       90.96
V. CONCLUSIONS AND FUTURE WORKS
Sentiment analysis of food reviews is a topic of great
importance in every language due to its versatile applications.
Unfortunately, there are still no benchmark datasets or studies
to refer to for food reviews in the Bengali language. In this
work we have developed a dataset of 44,491 Bangla food
reviews from various food review Facebook pages and groups
and annotated them manually. We focus on accurately
detecting the sentiment of Bangla food reviews using learning
models, and on explaining why a model performs better or
worse using explainable NLP. SMOTE improves the
performance of all the implemented models; the average
improvements after SMOTE for the ML and DL models are
3.22% and 10.81%, respectively. The RF and CNN-BiGRU
models outperformed the other models and achieved the
highest accuracies of 88.73% and 90.96% in the ML and DL
domains, respectively. The hybrid deep learning methods
outperform the base deep learning methods. A Friedman
statistical test was performed on the obtained results, and the
test results are significant at p < 0.05. Additionally, LIME and
SHAP from explainable NLP are used to examine why a
model performs well or poorly from local and global points of
view. দর is the most important feature for the prediction of
Bangla food reviews. In the future, we want to further enrich
and balance our dataset, explore different hybrid feature
extraction techniques for SA on Bangla food reviews, and
implement transformer-based learning and different hybrid
methods.
Fig. 2. Effect of Different Model Performance After Balancing the Dataset
Fig. 3. Sample LIME Explanation for Prediction
REFERENCES
[1] J. K. Adarsh, V. T. Sreedevi, D. Thangavelusamy, “Product Review
System With BERT for Sentiment Analysis and Implementation of
Administrative Privileges on Node-RED,” in IEEE Access, vol. 11, pp.
65968-65976, 2023, doi: 10.1109/ACCESS.2023.3275738.
[2] M. I. H. Junaid, F. Hossain, U. S. Upal, A. Tameem, A. Kashim, A.
Fahmin, “Bangla Food Review Sentimental Analysis using Machine
Learning,” 2022 IEEE 12th Annual Computing and Communication
Workshop and Conference (CCWC), Las Vegas, NV, USA, pp. 0347-
0353, 2022, doi: 10.1109/CCWC54503.2022.9720761.
[3] A. S. Talaat, “Sentiment analysis classification system using hybrid
BERT models,” J Big Data, vol. 10, no. 1, Dec. 2023, doi:
10.1186/s40537-023-00781-w.
[4] A. Akther, M. S. Islam, H. Sultana, A. R. Rahman, S. Saha, K. M.
Alam, R. Debnath, “Compilation, Analysis and Application of a
Comprehensive Bangla Corpus KUMono,” IEEE Access, vol. 10, pp.
79999-80014, 2022, doi: 10.1109/ACCESS.2022.3195236.
[5] G. M. Shahariar, M. T. R. Shawon, F. M. Shah, M. S. Alam, M. S.
Mahbub, “Bengali Fake Reviews: A Benchmark Dataset and
Detection System,” arXiv:2308.01987v1.
https://doi.org/10.48550/arXiv.2308.01987.
[6] M. Hasan, L. Islam, I. Jahan, S. M. Meem, R. M. Rahman, “Natural
Language Processing and Sentiment Analysis on Bangla Social Media
Comments on Russia–Ukraine War Using Transformers,” Vietnam
Journal of Computer Science, vol. 10, no. 3, pp. 329-356, 2023,
https://doi.org/10.1142/S2196888823500021.
[7] E. R. Rhythm, R. A. Shuvo, M. S. Hossain, M. F. Islam, A. A. Rasel,
“Sentiment Analysis of Restaurant Reviews from Bangladeshi Food
Delivery Apps,” 2023 International Conference on Emerging Smart
Computing and Informatics (ESCI), Pune, India, pp. 1-5, 2023, doi:
10.1109/ESCI56872.2023.10100214.
[8] A. Anirban, B. Pradhan, N. Shukla, “Sentiment analysis of customer
reviews of food delivery services using deep learning and explainable
artificial intelligence: Systematic review,” Foods, vol. 11, no. 10,
p. 1500, 2022.
[9] E. Hossain, O. Sharif, M. M. Hoque, I. H. Sarker, “SentiLSTM: A Deep
Learning Approach for Sentiment Analysis of Restaurant Reviews,” in
Hybrid Intelligent Systems (HIS 2020), Advances in Intelligent
Systems and Computing, vol. 1375, Springer, Cham, 2021,
https://doi.org/10.1007/978-3-030-73050-5_19.
[10] O. Sharif, M. M. Hoque and E. Hossain, “Sentiment Analysis of
Bengali Texts on Online Restaurant Reviews Using Multinomial Naïve
Bayes,” 2019 1st International Conference on Advances in Science,
Engineering and Robotics Technology (ICASERT), Dhaka,
Bangladesh, pp. 1-6, 2019, doi: 10.1109/ICASERT.2019.8934655.
[11] V. Lokam, V. Shinde, A. Raikar, V. Kate, G. Jumnake, “Restaurant and
Cuisine Review Sentiment Analysis using SVM,” International Journal
of Advanced Research in Computer and Communication Engineering,
Vol. 10, Issue 5, May 2021, doi: 10.17148/IJARCCE.2021.10545.
[12] A. K. Bitto, M. H. I. Bijoy, M. S. Arman, I. Mahmud, A. Das, J.
Majumder, “Sentiment Analysis from Bangladeshi Food Delivery
Startup Based on User Reviews Using Machine learning and Deep
Learning,” Bulletin of Electrical Engineering and Informatics, vol. 12,
no. 4, pp. 2282-2291, 2023.
[13] R. Haque, N. Islam, M. Tasneem, A. K. Das, “Multiclass sentiment
classification on Bengali social media comments using machine
learning,” International Journal of Cognitive Computing in
Engineering, Vol. 4, pp. 21-35, 2023,
https://doi.org/10.1016/j.ijcce.2023.01.001.
[14] N. Hossain, M. R. Bhuiyan, Z. N. Tumpa, S. A. Hossain, “Sentiment
Analysis of Restaurant Reviews using Combined CNN-LSTM,” 2020
11th International Conference on Computing, Communication and
Networking Technologies (ICCCNT), Kharagpur, India, pp. 1-5, 2020,
doi: 10.1109/ICCCNT49239.2020.9225328.
[15] M. A. Rahman and E. Kumar Dey, “Aspect Extraction from Bangla
Reviews using Convolutional Neural Network,” 2018 Joint 7th
International Conference on Informatics, Electronics & Vision
(ICIEV) and 2018 2nd International Conference on Imaging, Vision &
Pattern Recognition (icIVPR), Kitakyushu, Japan, 2018, pp. 262-267,
doi: 10.1109/ICIEV.2018.8641050.
Fig. 4. Average Impact on CNN-BiGRU Model Output Magnitude Using Shapley Value
... Sentiment analysis (SA) which is also known as opinion mining is a process of determining a person's views on a particular topic (Kabir et al., 2023;Islam and Alam, 2023a). SA is the mining study of human opinion that analyzes people's opinions, feelings, evaluations and judgment towards social entities such as services, products, people, events, organizations etc (Kabir et al., 2023;Islam and Alam, 2023b). Sentiments can vary across cultures and languages (Nafisa et al., 2023). ...
Article
Full-text available
In this modern technologically advanced world, Sentiment Analysis (SA) is a very important topic in every language due to its various trendy applications. But SA in Bangla language is still in a dearth level. This work focuses on examining different hybrid feature extraction techniques and learning algorithms on Bangla Document level Sentiment Analysis using a new comprehensive dataset (BangDSA) of 203,493 comments collected from various microblogging sites. The proposed BangDSA dataset approximately follows the Zipf’s law, covering 32.84% function words with a vocabulary growth rate of 0.053, tagged both on 15 and 3 categories. In this study, we have implemented 21 different hybrid feature extraction methods including Bag of Words (BOW), N-gram, TF-IDF, TF-IDF-ICF, Word2Vec, FastText, GloVe, Bangla-BERT etc with CBOW and Skipgram mechanisms. The proposed novel method (Bangla-BERT+Skipgram), skipBangla-BERT outperforms all other feature extraction techniques in machine leaning (ML), ensemble learning (EL) and deep learning (DL) approaches. Among the built models from ML, EL and DL domains the hybrid method CNN-BiLSTM surpasses the others. The best acquired accuracy for the CNN-BiLSTM model is 90.24% in 15 categories and 95.71% in 3 categories. Friedman test has been performed on the obtained results to observe the statistical significance. For both real 15 and 3 categories, the results of the statistical test are significant.
Article
Full-text available
Because of the rapid growth of mobile technology, social media has become an essential platform for people to express their views and opinions. Understanding public opinion can help businesses and political institutions make strategic decisions. Considering this, sentiment analysis is critical for understanding the polarity of public opinion. Most social media analysis studies divide sentiment into three categories: positive, negative, and neutral. The proposed model is a machine-learning application of a classification problem trained on three datasets. Recently, the BERT model has demonstrated effectiveness in sentiment analysis. However, the accuracy of sentiment analysis still needs to be improved. We propose four deep learning models based on a combination of BERT with Bidirectional Long ShortTerm Memory (BiLSTM) and Bidirectional Gated Recurrent Unit (BiGRU) algorithms. The study is based on pre-trained word embedding vectors that aid in the model fine-tuning process. The proposed methods are trying to enhance accuracy and check the effect of hybridizing layers of BIGRU and BILSTM on both Bert models (DistilBERT, RoBERTa) for no emoji (text sentiment classifier) and also with emoji cases. The proposed methods were compared to two pre-trained BERT models and seven other models built for the same task using classical machine learning. The proposed architectures with BiGRU layers have the best results.
Article
Full-text available
There are many ways to collect the product review, either by using a website or by manually asking the customers to leave their comments after buying the product. The traditional product feedback process is time-consuming. Nowadays, although industries use online review system to rate their product, the process lacks modernization. In this paper, BERT sentiment analysis is used to process the product review from the customers and automatically convert the review into a numeric score. This average score value is used for processing the sentiment analysis. The system is built entirely on Node-RED as a backbone to gather the review from the users with certain administrator privileges. An interactive dashboard is created to analyze the feedback of the product review. The above method of review system also reduces malpractice that the manufacturer or service provider may indulge into increase their ratings.
Conference Paper
Full-text available
In this study, we conducted sentiment analysis on restaurant reviews from Bangladeshi food delivery apps using natural language processing techniques. Food delivery apps have become increasingly popular in Bangladesh, and understanding the sentiment of customer reviews can provide valuable insights for restaurant owners and food delivery app companies. In this research, we have created a dataset named "Bangladeshi Restaurant Reviews" by gathering customer reviews of restaurants available on Foodpanda and Hungrynaki, which are two popular food delivery apps in Bangladesh. We used Robustly Optimized BERT Pretraining Approach (RoBERTa), AFINN, and DistilBERT, a distilled version of Bidirectional Encoder Representations from Transformers (BERT) to perform the sentiment analysis. Overall, this research paper highlights the importance of sentiment analysis in the food delivery industry and demonstrates the effectiveness of different models in performing this task. It also provides insights for businesses looking to use sentiment analysis to improve their services and products. The accuracy of the models evaluated, RoBERTa, AFINN, and DistilBERT, were 74%, 73%, and 77% respectively.
Article
Food delivery methods are at the top of the list in today's world. People's attitudes toward food delivery systems are usually influenced by food quality and delivery time. We performed a sentiment analysis of consumer comments collected from the Facebook pages of Food Panda, HungryNaki, Pathao Food, and Shohoz Food. As in any natural language processing (NLP) task, before the models were implemented, we went through a rigorous data pre-processing process that included stages like adding contractions, removing stop words, tokenizing, and more. Four supervised classification techniques are used: extreme gradient boosting (XGB), random forest classifier (RFC), decision tree classifier (DTC), and multinomial Naive Bayes (MNB). Three deep learning (DL) models are used: convolutional neural network (CNN), long short-term memory (LSTM), and recurrent neural network (RNN). The XGB model outperforms the other three machine learning (ML) algorithms with an accuracy of 89.64%. LSTM has the highest accuracy of the three DL algorithms, at 91.07%. Across the ML and DL models, LSTM takes the lead in predicting sentiment.
Article
The Bangla language ranks seventh in the list of most spoken languages, with 265 million native and non-native speakers around the world, and is the second most spoken Indo-Aryan language after Hindi. However, the growth of research for tasks such as sentiment analysis (SA) in Bangla is relatively low compared to SA in the English language. This is because there are not enough high-quality, publicly available datasets for training language models for text classification tasks in Bangla. In this paper, we propose a Bangla annotated dataset for sentiment analysis on the ongoing Ukraine–Russia war. The dataset was developed by collecting Bangla comments from various videos of three prominent YouTube TV news channels of Bangladesh covering the ongoing conflict. A total of 10,861 Bangla comments were collected and labeled with three polarity sentiments, namely Neutral, Pro-Ukraine (Positive), and Pro-Russia (Negative). A benchmark classifier was developed by experimenting with several transformer-based language models, all pre-trained on unlabeled Bangla corpora. The models were fine-tuned using our procured dataset. Hyperparameter optimization was performed on all five transformer language models: BanglaBERT, XLM-RoBERTa-base, XLM-RoBERTa-large, Distil-mBERT, and mBERT. Each model was evaluated and analyzed using several evaluation metrics: F1 score, accuracy, and AIC (Akaike Information Criterion). The best-performing model achieved the highest accuracy of 86% with an F1 score of 0.82. Based on accuracy, F1 score, and AIC, BanglaBERT outperforms the baseline and all the other transformer-based classifiers.
Article
Multi-class Sentiment Analysis (SA) is an important field of computational linguistics that extracts multiple opinions expressed in a text using NLP and text-mining techniques. Existing research on multi-class SA in the Bengali language is directed towards ternary classification with unsatisfactory classification performance. Obtaining a higher performance score is challenging due to the peculiarities of Bengali text, the lack of ground truth datasets, and the scarcity of preprocessing tools. Moreover, no research has shown that deep learning algorithms perform better on four types of sentiments. Therefore, we proposed a supervised deep learning classifier based on CNN and LSTM to conduct multi-class SA on Bengali social media comments labelled as sexual, religious, political, and acceptable. The study aims to achieve maximum accuracy using the proposed model and provide a comparative analysis with the baseline models. Six machine learning models with two different feature extraction techniques were considered as baseline models. Our proposed CLSTM architecture greatly improves SA performance, achieving 85.8% accuracy and an F1 score of 0.86 on a labelled dataset of 42,036 Facebook comments. A web application based on the proposed model and the highest-performing baseline model was built to detect the real-life sentiment of social media comments.
Article
Research in Natural Language Processing (NLP) and computational linguistics depends heavily on a good-quality, representative corpus of the language in question. Bangla is one of the most spoken languages in the world, but Bangla NLP research is in its early stage of development due to the lack of quality public corpora. This article describes the detailed compilation methodology of a comprehensive monolingual Bangla corpus, KUMono (Khulna University Monolingual corpus). The newly developed corpus consists of more than 350 million word tokens and more than one million unique tokens from 18 major text categories of online Bangla websites. We have conducted several word-level and character-level linguistic phenomenon analyses based on empirical studies of the developed corpus. The corpus follows Zipf’s curve and the hapax legomena rule. The quality of the corpus is also assessed by analyzing and comparing its inherent sparseness with that of existing Bangla corpora, and by analyzing the distribution of its function words and its vocabulary growth rate. We have developed a Bangla article categorization application based on the KUMono corpus and obtained compelling results in comparison with state-of-the-art models.
Article
During the COVID-19 crisis, customers’ preference for having food delivered to their doorstep instead of waiting in a restaurant has propelled the growth of food delivery services (FDSs). With all restaurants going online and bringing FDSs onboard, such as UberEATS, Menulog or Deliveroo, customer reviews on online platforms have become an important source of information about the company’s performance. FDS organisations aim to gather complaints from customer feedback and effectively use the data to determine the areas for improvement to enhance customer satisfaction. This work aimed to review machine learning (ML) and deep learning (DL) models and explainable artificial intelligence (XAI) methods to predict customer sentiments in the FDS domain. A literature review revealed the wide usage of lexicon-based and ML techniques for predicting sentiments through customer reviews in FDS. However, few studies applying DL techniques were found, owing to the lack of model interpretability and explainability of the decisions made. The key findings of this systematic review are as follows: 77% of the models are non-interpretable in nature, and organisations can argue for the explainability and trust in the system. DL models in other domains perform well in terms of accuracy but lack explainability, which can be achieved with XAI implementation. Future research should focus on implementing DL models for sentiment analysis in the FDS domain and incorporating XAI techniques to bring out the explainability of the models.
Conference Paper
In this modern age, people are dependent on the internet. They prefer to order food online or through a food app rather than at a restaurant, and they post various reviews online about the food. In this project, we aim to build a machine learning model to analyze the sentiment of those reviews. In Bangladesh, the number of internet users is increasing day by day, so we decided to build the model for the Bangla language. We found no Bangla dataset for food reviews that we could use for our project. We therefore collected more than one thousand Bangla food reviews from various online platforms like Foodpanda, Hungrynaki, Shohoz food, Pathao food, etc., and labeled them. After some necessary preprocessing, we extracted various features from the cleaned data and used them to train and test machine learning and deep learning models. We found that Long Short-Term Memory (LSTM), a deep learning model, gives the best accuracy, 90.89%, with word2sequence used for feature extraction. This model can help the food industry understand the sentiment of Bangla food reviews.