
Sentiment Analysis on Bangla Food Reviews Using Machine Learning and Explainable NLP

December 2023


DOI:10.1109/ICCIT60459.2023.10441309

Conference: 2023 26th International Conference on Computer and Information Technology (ICCIT)

Authors:

Md. Shymon Islam

Khulna University



2023 26th International Conference on Computer and Information Technology (ICCIT)

13-15 December 2023, Cox’s Bazar, Bangladesh

Sentiment Analysis on Bangla Food Reviews Using

Machine Learning and Explainable NLP

Md. Shymon Islam¹ and Dr. Kazi Masudul Alam²

¹,²Computer Science and Engineering Discipline, Khulna University, Khulna-9208, Bangladesh

Email: shymum1702@cseku.ac.bd, kazi@cseku.ac.bd

Abstract—Sentiment analysis (SA) is a sub-field of natural language processing (NLP) that extracts valuable insights from textual data. Food review analysis is a trending domain of SA that has become very useful as internet dependency has rapidly shifted people’s food ordering preferences from restaurants to online platforms. This work examines various machine learning (ML) and deep learning (DL) algorithms for Bangla sentiment analysis on food reviews using a new dataset of 44,491 reviews collected from various restaurant Facebook pages and groups. Furthermore, we use explainable NLP to interpret why a model performs well or poorly. The Random Forest (RF) and Convolutional Neural Network-Bidirectional Gated Recurrent Unit (CNN-BiGRU) models outperformed the other models and achieved the highest accuracies of 88.73% and 90.96% in the ML and DL domains respectively. A Friedman statistical test was performed on the obtained results, and the test results are significant at p<0.05. “দর ” is the feature that contributes most to the hybrid DL (CNN-BiGRU) model classifying reviews accurately.

Keywords— Sentiment Analysis, Food Review Analysis, CNN,

Bi-GRU, Explainable NLP

I. INTRODUCTION

Sentiment analysis (SA) is the process of predicting the sentiments or emotions expressed in a set of text data [1]. SA, a subfield of natural language processing (NLP), has emerged as a critical tool for extracting valuable insights from textual data by discerning the emotional tone and underlying sentiment conveyed by the language used [2]. It focuses on assessing people’s opinions, feelings, and emotions [3].

Nearly 250 million people speak Bangla as their first language, and 160 million of them are Bangladeshis [4]. While Bangla is still a developing language in terms of NLP resources, the majority of research studies and publicly accessible datasets on sentiment analysis are restricted to English and other resource-rich languages such as Arabic, Turkish, Hindi and Urdu [5]. Bangla has a rich morphology that has developed over thousands of years, with longstanding customs and numerous dialects. It therefore presents both challenges and opportunities for developing effective sentiment analysis techniques that can accurately capture the sentiment conveyed in Bangla texts across various domains [6]. In this work, we mainly focus on the food review domain of SA. The participation of Bangladeshi residents in internet activities is rapidly increasing, and reviews of food and food distribution systems are an interesting area of study. There is a rising number of online food delivery services in Bangladesh [7]. The majority of Bangladeshis provide insightful feedback in Bangla on social media [2]. An examination of several blogs and social media platforms shows that, in addition to English, users also post comments in Bangla [6]. Food review sentiment analysis has many application areas, such as the analysis of food delivery services [8], food quality analysis [9], financial market analysis [10] and customer satisfaction tracking [11].

The majority of research on multi-class SA has used machine learning (ML) or deep learning (DL) algorithms to predict positive, negative or neutral sentiments. This work focuses on accurately detecting the sentiment of Bangla food reviews using learning models, and on explaining why a model produces better or worse results using explainable NLP. The contributions of the proposed work are as follows:

1) Creating a new specialized SA dataset on food reviews consisting of 44,491 reviews from several Facebook pages and groups.

2) Examining various ML and DL algorithms and making a comparative study among them.

3) Introducing a novel hybrid DL (CNN-BiGRU) method which outperforms all the other algorithms.

4) Applying explainable NLP, using the LIME and SHAP modules, to explain why a model produces good or bad results.

The rest of this article is organized as follows: Section II reviews related works, Section III presents the proposed methodology for Bangla SA on food reviews, Section IV describes the experimental results analysis and discussion, and Section V concludes the work with some remarks on future directions.

II. RELATED WORKS

Some of the recent studies of Bangla SA on food reviews

are discussed and summarized here.

In 2023, M. I. H. Junaid et al. [2] proposed a machine learning based method for sentiment analysis of food reviews in Bangla. They created a dataset of only 1040 reviews from Foodpanda and Hungrynaki and showed that the Long Short-Term Memory (LSTM) deep learning model outperformed the others with an accuracy of 90.89%. M. Hasan et al. [6] studied Bangla SA on the topic of the “Russia-Ukraine war”. A total of 10,861 Bangla comments were collected and labeled with three polarity sentiments, namely Neutral, Pro-Ukraine (Positive), and Pro-Russia (Negative). They used several transformer language models, including BanglaBERT, XLM-RoBERTa-base, XLM-RoBERTa-large, Distil-mBERT and mBERT, and showed experimentally that BanglaBERT outperformed the baseline and all the other transformer-based classifiers. Another 2023 study, by E. R. Rhythm et al. [7], delved into SA on restaurant reviews sourced from Bangladeshi food delivery apps. They created a dataset named “Bangladeshi Restaurant Reviews” covering 15,018 instances by collecting customer feedback from popular apps like Foodpanda and Hungrynaki. They employed the Robustly Optimized BERT Pretraining Approach (RoBERTa), AFINN, and DistilBERT models and obtained accuracies of 74%, 73% and 77% respectively. Also in 2023, Bitto et al. [12] studied user reviews collected from food delivery startups. They collected 1400 reviews from 4 food delivery Facebook pages and applied bipolar SA. Applying ML and DL algorithms, they obtained their highest accuracies of 89.64% using XGB and 91.07% using an LSTM classifier. A supervised deep learning classifier based on CNN and LSTM for multi-class SA on Bengali social media comments was proposed by R. Haque et al. [13] in 2023. Their proposed CLSTM (Convolutional Long Short-Term Memory) architecture achieved 85.8% accuracy and a 0.86 f1-score on a labeled dataset of 42,036 Facebook comments. The recent studies [1-13] suggest that DL and hybrid methods are more promising than traditional ML approaches.

The studies in [9], [14] and [15] used datasets of 8435, 1000 and 2053 restaurant reviews respectively. From the literature studied [1-15] for Bangla SA on food reviews, it is observed that the lack of a rich dataset is a persistent problem for the Bangla language. To the best of our knowledge, we have developed one of the largest Bangla SA datasets on food reviews, consisting of 44,491 reviews from several Facebook pages and groups. Explainable NLP has not yet been used in the domain of Bangla food review analysis. In our proposed method, we build on these findings and try to fill the knowledge gap in this domain of research.

III. PROPOSED METHODOLOGY FOR BANGLA SA ON FOOD REVIEWS

The main goal of this study is to develop different ML and DL models that can differentiate among positive, negative and neutral Bangla food reviews. The overall workflow of the proposed method is depicted in Fig. 1.

A. Dataset Preparation

In this study, we have gathered Bengali food reviews from

several Facebook food blogs and groups, such as Street Food

Hunting, Dhaka Food Review, Food Review Jashore, Food

Bloggers Barishal and Rafsan the Chotobhai.

TABLE I. SUMMARY OF THE DATASET

Sentiment        No. of Reviews
Positive         14,424
Negative         12,371
Neutral          17,696
Total Reviews    44,491

Six native Bangla-speaking graduate students (4 males and 2 females) were involved in the data collection process, and five (3 males and 2 females) were involved in the data annotation process. The proposed dataset contains a total of 44,491 Bangla food reviews, and its class distribution is shown in Table I. We split the dataset into training and test sets of 90% and 10% respectively using the holdout method: 40,041 reviews were used for training the models and 4,450 reviews for testing. Since the class distribution of the reviews is unequal, the dataset is imbalanced, which may produce biased results [5]. We therefore used the synthetic minority oversampling technique (SMOTE) to balance the dataset.
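The holdout split and the idea behind SMOTE can be sketched with the Python standard library alone. This is an illustrative sketch, not the authors' code: the function names, the seed, and the rounding rule (rounding the test size up, which happens to reproduce the 40,041 / 4,450 counts above) are assumptions, and the oversampler only shows SMOTE's core interpolation step on numeric feature vectors. Libraries such as imbalanced-learn provide a full SMOTE implementation.

```python
import math
import random


def holdout_split(items, test_fraction=0.10, seed=0):
    """Shuffle the items and split them into (train, test) once."""
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    # Assumption: the test size is rounded up, the training set gets the rest.
    n_test = math.ceil(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def smote_like_oversample(minority, n_new, k=2, seed=0):
    """Create n_new synthetic vectors for a minority class.

    Each synthetic vector lies on the line segment between a randomly
    chosen minority vector and one of its k nearest minority neighbours.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([a + gap * (b - a) for a, b in zip(base, other)])
    return synthetic


train, test = holdout_split(range(44_491))
print(len(train), len(test))
```

Because the synthetic vectors are interpolations, they always stay inside the region spanned by the existing minority samples; whether oversampling was applied before or after the split is not stated in the text.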

B. Data Preprocessing

Data preprocessing is a prerequisite for any classification model to perform well [2]. The reviews we collected from Facebook contain spelling mistakes, punctuation marks, emojis, special characters, numerical values and so on. Several preprocessing steps suited to the Bangla language were applied, such as tokenization, removal of non-Bangla words, emojis and URLs, cleaning, stop word removal and stemming.

1) Non-Bangla Words, Punctuation, URLs and Emoticon Removal: The dataset contains many redundant or irrelevant elements that should be removed to eliminate ambiguity. These include non-Bangla words, punctuation, URLs, special characters and emoticons. Data cleaning is performed to obtain a reduced, clean version of the dataset to work with.
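A cleaning step of this kind can be sketched with regular expressions. The sketch below is an assumption about one reasonable way to do it, not the authors' pipeline: it keeps only characters from the Bengali Unicode block (U+0980 to U+09FF), which also discards emojis, Latin text and most punctuation, and it drops Bengali digits separately. The function name `clean_review` is hypothetical.

```python
import re

# URLs are removed first so that their fragments do not survive as noise.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
# Anything outside the Bengali Unicode block (and whitespace) is dropped:
# Latin words, emojis, ASCII digits and most punctuation marks.
NON_BANGLA_RE = re.compile(r"[^\u0980-\u09FF\s]")
# Bengali digits live inside that block, so they are removed separately.
BANGLA_DIGIT_RE = re.compile(r"[\u09E6-\u09EF]")


def clean_review(text):
    """Return the review with URLs, non-Bangla characters and digits removed."""
    text = URL_RE.sub(" ", text)
    text = NON_BANGLA_RE.sub(" ", text)
    text = BANGLA_DIGIT_RE.sub(" ", text)
    return " ".join(text.split())  # collapse repeated whitespace


print(clean_review("খাবার ভালো ছিল!!! 😋 visit https://example.com 123"))
```

One limitation of this simple filter is that it also removes zero-width joiner characters, which some Bangla text uses inside conjuncts; a production cleaner may want to preserve them.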

C. Feature Extraction

Learning models require a numeric representation of data to work with, and the process of deriving such a representation from text is termed feature extraction [7]. We have used Term Frequency - Inverse Document Frequency (TF-IDF), which is the most widely used feature extraction method [12]. For TF-IDF, we have considered 50,000 features with an n-gram range from 1 to 3; the weights are computed according to equation (3).

1 https://www.ranks.nl/stopwords/bengali

In equations (1) and (2), the terms X, Y, P and Q are, respectively, the frequency of a word in a review, the total number of words in the review, the total number of reviews and the number of reviews that contain the word.
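The TF-IDF weighting over word n-grams can be sketched as follows. This is a minimal illustration under stated assumptions: it uses the common forms TF = X/Y and IDF = log(P/Q), takes P as the number of reviews and Q as the number of reviews containing a term, counts Y over all n-gram terms of a review, and does not apply the 50,000-feature cap. Library implementations such as scikit-learn's `TfidfVectorizer` use a smoothed IDF and normalization by default, so their values differ.

```python
import math
from collections import Counter


def ngrams(tokens, n_min=1, n_max=3):
    """All contiguous n-grams (joined by spaces) for n in [n_min, n_max]."""
    return [
        " ".join(tokens[i:i + n])
        for n in range(n_min, n_max + 1)
        for i in range(len(tokens) - n + 1)
    ]


def tf_idf(reviews):
    """Return one {term: weight} dict per review."""
    docs = [ngrams(review.split()) for review in reviews]
    P = len(docs)  # total number of reviews
    # Q[t]: number of reviews in which term t occurs at least once.
    Q = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        X = Counter(doc)  # frequency of each term in this review
        Y = len(doc)      # total number of terms in this review
        weights.append({t: (X[t] / Y) * math.log(P / Q[t]) for t in X})
    return weights


for row in tf_idf(["a b", "a c"]):
    print(row)
```

A term that occurs in every review gets weight zero under this unsmoothed form, which is why very common words contribute nothing to the representation.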

$$\mathrm{TF} = \frac{X}{Y} \tag{1}$$

$$\mathrm{IDF} = \log\left(\frac{P}{Q}\right) \tag{2}$$

$$\text{TF-IDF} = \mathrm{TF} \times \mathrm{IDF} \tag{3}$$

D. SA Method for Bangla Food Reviews

The machine learning models we used for Bangla food review sentiment analysis are Multinomial Naive Bayes (MNB), Support Vector Machine (SVM), K-Nearest Neighbors (KNN), Logistic Regression (LR) and Random Forest (RF), and the deep learning models are CNN, LSTM, GRU, BiLSTM and BiGRU. Random Forest works by combining multiple decision trees to make predictions. It uses randomness in creating these trees and then combines their outputs through voting. This approach increases accuracy and reduces overfitting [4]. MNB is a probabilistic classifier based on Bayes’ theorem and is very time-efficient [10]. SVM finds the best linear separator (hyperplane) between classes in the data space, prioritizing the hyperplane that maximizes the distance from the nearest data points [11]. KNN is a lazy learner which often underperforms in text classification [12]. DL models generally consist of an embedding layer, hidden layers, a fully connected layer and an output layer. The experimental results of all the implemented algorithms are described in Section IV.
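The two ingredients of Random Forest mentioned above, randomness when building trees and voting when combining them, can be illustrated in a few lines. This is only a sketch of those two steps with hypothetical labels and function names; it does not train decision trees, and real implementations also randomize the features considered at each split.

```python
import random
from collections import Counter


def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement, as used to train one tree."""
    return [rng.choice(rows) for _ in rows]


def majority_vote(tree_predictions):
    """Combine per-tree labels into one label per review.

    tree_predictions[t][i] is the label that tree t assigns to review i.
    Ties are resolved in favour of the label seen first.
    """
    n_reviews = len(tree_predictions[0])
    combined = []
    for i in range(n_reviews):
        votes = Counter(tree[i] for tree in tree_predictions)
        combined.append(votes.most_common(1)[0][0])
    return combined


# Hypothetical outputs of three trees for three reviews.
predictions = [
    ["pos", "neg", "neu"],
    ["pos", "pos", "neu"],
    ["neg", "neg", "neu"],
]
print(majority_vote(predictions))
```

Because each tree sees a different bootstrap sample, the individual errors are partly independent, and the vote tends to cancel them out.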

IV. EXPERIMENTAL RESULTS ANALYSIS AND DISCUSSION

The proposed work focuses on accurately detecting the sentiment of Bangla food reviews using learning models and on explaining, with explainable NLP, why a model produces better or worse results. In this section we describe, in turn, the performance metrics, the analysis of the obtained results, the Friedman test statistics and explainable NLP.

A. Performance Metric

The recent studies [1-15] show that several evaluation metrics, such as accuracy, f1-score, precision and recall, are used to analyze a model’s performance. Precision (equation 4) is the ratio of true positives (TP) to the sum of true positives (TP) and false positives (FP). Recall (equation 5) is the ratio of true positives (TP) to the sum of true positives (TP) and false negatives (FN). F1-score (equation 6) is the harmonic mean of precision and recall, and accuracy (equation 7) is the ratio of correctly classified instances to all instances.
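For a three-class problem these metrics are computed per class and then averaged. The sketch below follows the definitions above; the macro (unweighted) average is an assumption, since the text does not say which averaging scheme was used, and the labels are hypothetical.

```python
def per_class_metrics(y_true, y_pred, label):
    """Precision, recall and F1 for one class treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def accuracy(y_true, y_pred):
    """Share of reviews whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_average(y_true, y_pred, labels):
    """Unweighted mean of the per-class precision, recall and F1."""
    scores = [per_class_metrics(y_true, y_pred, c) for c in labels]
    return tuple(sum(s[i] for s in scores) / len(labels) for i in range(3))


y_true = ["pos", "pos", "neg", "neu"]
y_pred = ["pos", "neg", "neg", "neu"]
print(accuracy(y_true, y_pred))
print(macro_average(y_true, y_pred, ["pos", "neg", "neu"]))
```

With an imbalanced dataset the macro average weights each class equally, so a model that ignores a small class is penalized more than accuracy alone would show.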

$$\mathrm{Precision} = \frac{TP}{TP + FP} \tag{4}$$

$$\mathrm{Recall} = \frac{TP}{TP + FN} \tag{5}$$

$$\text{F1-Score} = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \tag{6}$$

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{7}$$

B. Obtained Results Analysis

Parameter tuning is essential for getting the best results in any classification task. The best performing models from the ML and DL domains are RF and CNN-BiGRU respectively. For the RF model, we set the initial parameters to n_estimators = 1.5 and random_state = 0. For the CNN-BiGRU model, the parameters are set to max features = 50000, embedding dimension = 64, sequence length = 40, number of filters = 128, kernel size = 3 and dropout = 0.5.
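The stated CNN-BiGRU hyperparameters already fix the size of the first two layers, which a short calculation makes concrete. This is a worked example under assumptions that the text does not state: a standard embedding table, a 1-D convolution with one bias per filter, stride 1 and no padding. The recurrent layer is left out because its number of units is not given.

```python
# Hyperparameters taken from the text.
max_features = 50_000   # vocabulary size
embedding_dim = 64
sequence_length = 40
n_filters = 128
kernel_size = 3

# Embedding table: one embedding_dim-sized vector per vocabulary entry.
embedding_params = max_features * embedding_dim

# 1-D convolution: each filter spans kernel_size time steps across all
# embedding channels, plus one bias per filter (assumed).
conv_params = kernel_size * embedding_dim * n_filters + n_filters

# Output length assuming stride 1 and no padding.
conv_output_length = sequence_length - kernel_size + 1

print(embedding_params, conv_params, conv_output_length)
```

Under these assumptions the embedding table dominates the parameter count, which is typical for text models with a large vocabulary.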

We have summarized the results obtained by the implemented ML algorithms, with and without SMOTE, in Table II. The RF algorithm maintains its status as the best performer.

Fig. 1. Workflow of the Proposed Bangla Sentiment Analysis Method on Food Reviews

TABLE II. OBTAINED RESULTS OF APPLIED MACHINE LEARNING ALGORITHMS; FRIEDMAN TEST: CHI-SQUARE = 10.04, P-VALUE = 0.03976, DEGREES OF FREEDOM = 5, SIGNIFICANCE LEVEL = 0.05 AND RESULT IS SIGNIFICANT AT P < 0.05

Results Without SMOTE
Algorithm    Precision, Recall, F1-Score    Accuracy (%)    AUC value    MSE value
MNB          0.76                           76.42           0.83         0.31
SVM          0.82, 0.81                     81.35           0.86         0.27
KNN          0.78, 0.76, 0.75               76.67           0.81         0.35
LR           0.81, 0.82, 0.81               81.77           0.87         0.24
RF           0.85                           85.46           0.90         0.15

Results With SMOTE
Algorithm    Precision, Recall, F1-Score    Accuracy (%)    AUC value    MSE value
MNB          0.78                           78.04           0.90         0.16
SVM          0.83, 0.84                     84.31           0.90         0.14
KNN          0.80, 0.79                     79.90           0.85         0.31
LR           0.88, 0.80, 0.83               83.65           0.89         0.17
RF           0.86, 0.90, 0.87               88.73           0.95         0.12

TABLE III. OBTAINED RESULTS OF APPLIED DEEP LEARNING ALGORITHMS; FRIEDMAN TEST: CHI-SQUARE = 15.9333, P-VALUE = 0.00311, DEGREES OF FREEDOM = 5, SIGNIFICANCE LEVEL = 0.05 AND RESULT IS SIGNIFICANT AT P < 0.05

Results Without SMOTE
Algorithm     Precision, Recall, F1-Score    Accuracy (%)    AUC value    MSE value
CNN           0.78                           78.34           0.87         0.29
LSTM          0.76                           76.88           0.86         0.32
GRU           0.76                           76.81           0.86         0.31
Bi-LSTM       0.77, 0.76                     76.91           0.86         0.31
Bi-GRU        0.77, 0.76                     76.89           0.86         0.32
CNN-LSTM      0.77                           77.83           0.88         0.28
CNN-BiLSTM    0.76                           76.95           0.89         0.28
CNN-GRU       0.77                           77.93           0.89         0.29
CNN-BiGRU     0.76                           76.98           0.88         0.27

Results With SMOTE
Algorithm     Precision, Recall, F1-Score    Accuracy (%)    AUC value    MSE value
CNN           0.87, 0.82, 0.84               83.24           0.91         0.24
LSTM          0.85, 0.84                     84.12           0.89         0.25
GRU           0.81, 0.88, 0.84               84.31           0.89         0.25
Bi-LSTM       0.84, 0.83                     83.29           0.89         0.25
Bi-GRU        0.85, 0.82, 0.83               83.88           0.89         0.26
CNN-LSTM      0.85                           85.33           0.91         0.25
CNN-BiLSTM    0.86, 0.90, 0.87               88.21           0.90         0.23
CNN-GRU       0.92, 0.85, 0.88               87.38           0.90         0.22
CNN-BiGRU     0.89, 0.93, 0.90               90.96           0.96         0.11
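The Friedman statistics quoted in the captions of Tables II and III come from ranking the compared treatments within each block and testing whether the rank sums differ. The sketch below computes the basic chi-square statistic with the standard library; it omits the tie correction, and how the authors arranged blocks and treatments is not stated, so the layout and the sample numbers here are assumptions. The reported 5 degrees of freedom would be consistent with six treatments being ranked, since the degrees of freedom equal the number of treatments minus one. SciPy's `scipy.stats.friedmanchisquare` also returns the p-value.

```python
def average_ranks(values):
    """Rank values from 1 (smallest) upward; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for pos in range(i, j + 1):
            ranks[order[pos]] = mean_rank
        i = j + 1
    return ranks


def friedman_statistic(blocks):
    """blocks[b][t] is the score of treatment t in block b.

    Returns (chi_square, degrees_of_freedom), without the tie correction.
    """
    n = len(blocks)
    k = len(blocks[0])
    rank_sums = [0.0] * k
    for row in blocks:
        for t, rank in enumerate(average_ranks(row)):
            rank_sums[t] += rank
    chi_square = (
        12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums)
        - 3.0 * n * (k + 1)
    )
    return chi_square, k - 1


# Hypothetical scores: three blocks, three treatments, same ordering in each.
print(friedman_statistic([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))
```

The statistic grows when the treatments are ranked consistently across blocks and stays near zero when the rankings look random.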

RF obtained the highest f1-score and accuracy (88.73%) among the ML models. It also stands out with an area under the curve (AUC) value of 0.95, which means it is very good at telling the classes apart. RF also has the smallest mean squared error (MSE) value, 0.12, which is a good sign that its predictions are consistently close to the actual values. The second best ML model is SVM, with an accuracy of 84.31%. The results of the implemented DL algorithms with and without SMOTE are given in Table III. Among the models evaluated, the CNN-BiGRU and CNN-BiLSTM models achieved the best performance across multiple metrics. The hybrid CNN-BiGRU model achieved an f1-score of 0.90 and an accuracy of 90.96%. It also produced AUC and MSE values of 0.96 and 0.11 respectively, which indicate strong discriminatory power. The CNN-BiLSTM hybrid model also performed well, with an accuracy of 88.21%, whereas the base CNN model reached only 83.24% accuracy. Overall, the hybrid models performed better than the base DL models.
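The accuracy columns of Tables II and III can be turned into a single figure for the effect of SMOTE. The sketch below averages the relative gain in accuracy per model; reading the improvement as a relative gain is an assumption, although on these table values the averages come out close to the 3.22% and 10.81% reported in the paper. The row labels LR and RF follow the model order given in Section III-D.

```python
# Accuracy (%) without and with SMOTE, copied from Tables II and III.
ML_ACCURACY = {
    "MNB": (76.42, 78.04),
    "SVM": (81.35, 84.31),
    "KNN": (76.67, 79.90),
    "LR": (81.77, 83.65),
    "RF": (85.46, 88.73),
}
DL_ACCURACY = {
    "CNN": (78.34, 83.24),
    "LSTM": (76.88, 84.12),
    "GRU": (76.81, 84.31),
    "Bi-LSTM": (76.91, 83.29),
    "Bi-GRU": (76.89, 83.88),
    "CNN-LSTM": (77.83, 85.33),
    "CNN-BiLSTM": (76.95, 88.21),
    "CNN-GRU": (77.93, 87.38),
    "CNN-BiGRU": (76.98, 90.96),
}


def mean_relative_gain(table):
    """Average of (after - before) / before over all models, in percent."""
    gains = [(after - before) / before * 100 for before, after in table.values()]
    return sum(gains) / len(gains)


print(round(mean_relative_gain(ML_ACCURACY), 2))
print(round(mean_relative_gain(DL_ACCURACY), 2))
```

The hybrid DL models gain the most from balancing, which is what pulls the DL average well above the ML average.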

TABLE IV. COMPARISON BETWEEN ML AND DL ALGORITHMS

Domain    Algorithm    F1-Score    Accuracy (%)    AUC value    MSE value
ML        RF           0.87        88.73           0.95         0.12
DL        CNN-BiGRU    0.90        90.96           0.96         0.11

The comparison between ML and DL algorithms is given in Table IV, which shows that the best DL model outperforms the best ML model. SMOTE was used to balance the dataset, and its effect on the ML and DL algorithms is depicted in Fig. 2. SMOTE improves the performance of all the implemented models: the average improvements after SMOTE for the ML and DL models are 3.22% and 10.81% respectively. A Friedman test was performed on the obtained results at the 0.05 level of significance, and both test results were significant. The comparison between existing works and our proposed work is presented in Table V. The proposed model is trained on a substantial dataset of 44,491 food reviews, which is comparable to or larger than those of many other studies and can positively affect the model’s ability to learn patterns and generalize well. The proposed hybrid CNN-BiGRU model achieves the highest accuracy of 90.96% and an f1-score of 0.90. This performance is in line with, or even surpasses, some of the existing works in the sentiment analysis domain.

C. Explainable NLP

In the previous section we analyzed the performance of different models, but performance figures alone cannot explain why a model classifies correctly or not. A fundamental problem of AI and learning related tasks is the lack of interpretability: why a model performs well or poorly, which features are responsible for assigning a document to a category, which are the most important features, and so on. We have used the local interpretable model-agnostic explanations (LIME) and Shapley additive explanations (SHAP) modules of Python for this purpose. LIME can explain which features make a significant contribution to categorizing a document, and it can also show which category is more likely to be chosen, along with the prediction probabilities. A sample LIME explanation for the prediction of a review is shown in Fig. 3, where the responsible text features are highlighted. SHAP provides a broader view by quantifying the importance of each feature to the model’s output. Fig. 4 shows the average impact on the CNN-BiGRU model output magnitude using Shapley values. There are 50,000 features in total for the model, but we show only the top 20 features for predicting the 3 target classes.
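The quantity that SHAP estimates, the Shapley value of each feature, can be computed exactly when there are only a few features. The sketch below enumerates all feature coalitions for a hypothetical toy scoring function; the words, weights and function names are illustrative assumptions and have nothing to do with the trained CNN-BiGRU model, for which SHAP relies on approximations.

```python
from itertools import combinations
from math import factorial


def shapley_values(features, value):
    """Exact Shapley value of each feature for the set function `value`.

    value(subset) returns the model output when only the features in
    `subset` (a frozenset) are present.
    """
    n = len(features)
    result = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for size in range(n):
            # Weight of a coalition of this size in the Shapley formula.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = frozenset(subset)
                total += weight * (value(s | {f}) - value(s))
        result[f] = total
    return result


# Hypothetical toy "model": additive word weights plus one interaction.
WEIGHTS = {"tasty": 0.4, "cold": -0.3, "price": 0.1}


def toy_score(words):
    score = sum(WEIGHTS[w] for w in words)
    if "tasty" in words and "price" in words:
        score += 0.2  # the two words reinforce each other
    return score


print(shapley_values(list(WEIGHTS), toy_score))
```

The interaction bonus is shared equally between the two words involved, and the values add up to the difference between the full score and the empty score. Averaging the absolute Shapley values of a feature over many reviews gives the kind of global importance ranking that a SHAP summary bar plot shows.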

TABLE V. COMPARISON AMONG EXISTING RECENT WORKS AND OUR PROPOSED WORK

Name                          Year    Dataset Used    No. of Classes    Best Model     F1-Score    Accuracy (%)
M. I. H. Junaid et al. [2]    2023    1040                              LSTM           N/A         90.89
M. Hasan et al. [6]           2023    10,861          3                 Bangla-BERT    0.82        86.00
E. R. Rhythm et al. [7]       2023    15,018          N/A               DistilBERT     N/A         77.00
Our Proposed                  2023    44,491          3                 CNN-BiGRU      0.90        90.96

V. CONCLUSIONS AND FUTURE WORKS

Sentiment analysis on food reviews is a topic of great importance in every language due to its versatile applications. Unfortunately, there are still no benchmark datasets or studies to refer to for food reviews in the Bengali language. In this work we have developed a dataset of 44,491 Bangla food reviews from various food review Facebook pages and groups and annotated them manually. We focus on accurately detecting the sentiment of Bangla food reviews using learning models, and on explaining why a model produces better or worse results using explainable NLP. SMOTE improves the performance of all the implemented models; the average improvements after SMOTE for the ML and DL models are 3.22% and 10.81% respectively. The RF and CNN-BiGRU models outperformed the other models and achieved the highest accuracies of 88.73% and 90.96% in the ML and DL domains respectively. The hybrid deep learning methods outperform the base deep learning methods. A Friedman statistical test was performed on the obtained results, and the test results are significant at p<0.05. Additionally, LIME and SHAP from explainable NLP are used to observe why a model performs well or poorly from local and global points of view. “দর ” is the feature that contributes most to the prediction of Bangla food reviews. In the future, we want to further enrich and balance our dataset, explore different hybrid feature extraction techniques for SA on Bangla food reviews, and implement transformer-based learning and different hybrid methods.

Fig. 2. Effect of Different Model Performance After Balancing the Dataset

Fig. 3. Sample LIME Explanation for Prediction

REFERENCES

[1] J. K. Adarsh, V. T. Sreedevi, D. Thangavelusamy, “Product Review

System With BERT for Sentiment Analysis and Implementation of

Administrative Privileges on Node-RED,” in IEEE Access, vol. 11, pp.

65968-65976, 2023, doi: 10.1109/ACCESS.2023.3275738.

[2] M. I. H. Junaid, F. Hossain, U. S. Upal, A. Tameem, A. Kashim, A.

Fahmin, “Bangla Food Review Sentimental Analysis using Machine

Learning,” 2022 IEEE 12th Annual Computing and Communication

Workshop and Conference (CCWC), Las Vegas, NV, USA, pp. 0347-

0353, 2022, doi: 10.1109/CCWC54503.2022.9720761.

[3] A. S. Talaat, “Sentiment analysis classification system using hybrid

BERT models,” J Big Data, vol. 10, no. 1, Dec. 2023, doi:

10.1186/s40537-023-00781-w.

[4] A. Akther, M. S. Islam, H. Sultana, A. R. Rahman, S. Saha, K. M.

Alam, R. Debnath, “Compilation, Analysis and Application of a

Comprehensive Bangla Corpus KUMono,” IEEE Access, vol. 10, pp.

79999-80014, 2022, doi: 10.1109/ACCESS.2022.3195236.

[5] G. M. Shahariar, M. T. R. Shawon, F. M. Shah, M. S. Alam, M. S.

Mahbub, “Bengali Fake Reviews: A Benchmark Dataset and

Detection System,” arXiv:2308.01987v1.

https://doi.org/10.48550/arXiv.2308.01987.

[6] M. Hasan, L. Islam, I. Jahan, S. M. Meem, R. M. Rahman, “Natural Language Processing and Sentiment Analysis on Bangla Social Media Comments on Russia–Ukraine War Using Transformers,” Vietnam Journal of Computer Science, vol. 10, no. 3, pp. 329-356, 2023, https://doi.org/10.1142/S2196888823500021.

[7] E. R. Rhythm, R. A. Shuvo, M. S. Hossain, M. F. Islam, A. A. Rasel,

“Sentiment Analysis of Restaurant Reviews from Bangladeshi Food

Delivery Apps,” 2023 International Conference on Emerging Smart

Computing and Informatics (ESCI), Pune, India, pp. 1-5, 2023, doi:

10.1109/ESCI56872.2023.10100214.

[8] A. Adak, B. Pradhan, N. Shukla, “Sentiment analysis of customer reviews of food delivery services using deep learning and explainable artificial intelligence: Systematic review,” Foods, vol. 11, no. 10, p. 1500, 2022.

[9] E. Hossain, O. Sharif, M. M. Hoque, I. H. Sarker, “SentiLSTM: A Deep Learning Approach for Sentiment Analysis of Restaurant Reviews,” in Hybrid Intelligent Systems (HIS 2020), Advances in Intelligent Systems and Computing, vol. 1375, Springer, Cham, 2021, https://doi.org/10.1007/978-3-030-73050-5_19.

[10] O. Sharif, M. M. Hoque and E. Hossain, “Sentiment Analysis of

Bengali Texts on Online Restaurant Reviews Using Multinomial Naïve

Bayes,” 2019 1st International Conference on Advances in Science,

Engineering and Robotics Technology (ICASERT), Dhaka,

Bangladesh, pp. 1-6, 2019, doi: 10.1109/ICASERT.2019.8934655.

[11] V. Lokam, V. Shinde, A. Raikar, V. Kate, G. Jumnake, “Restaurant and

Cuisine Review Sentiment Analysis using SVM,” International Journal

of Advanced Research in Computer and Communication Engineering,

Vol. 10, Issue 5, May 2021, doi: 10.17148/IJARCCE.2021.10545.

[12] A. K. Bitto, M. H. I. Bijoy, M. S. Arman, I. Mahmud, A. Das, J.

Majumder, “Sentiment Analysis from Bangladeshi Food Delivery

Startup Based on User Reviews Using Machine learning and Deep

Learning,” Bulletin of Electrical Engineering and Informatics, vol. 12,

no. 4, pp. 2282-2291, 2023.

[13] R. Haque, N. Islam, M. Tasneem, A. K. Das, “Multiclass sentiment

classification on Bengali social media comments using machine

learning,” International Journal of Cognitive Computing in

Engineering, Vol. 4, pp. 21-35, 2023,

https://doi.org/10.1016/j.ijcce.2023.01.001.

[14] N. Hossain, M. R. Bhuiyan, Z. N. Tumpa, S. A. Hossain, “Sentiment

Analysis of Restaurant Reviews using Combined CNN-LSTM,” 2020

11th International Conference on Computing, Communication and

Networking Technologies (ICCCNT), Kharagpur, India, pp. 1-5, 2020,

doi: 10.1109/ICCCNT49239.2020.9225328.

[15] M. A. Rahman and E. Kumar Dey, “Aspect Extraction from Bangla

Reviews using Convolutional Neural Network,” 2018 Joint 7th

International Conference on Informatics, Electronics & Vision

(ICIEV) and 2018 2nd International Conference on Imaging, Vision &

Pattern Recognition (icIVPR), Kitakyushu, Japan, 2018, pp. 262-267,

doi: 10.1109/ICIEV.2018.8641050.

Fig. 4. Average Impact on CNN-BiGRU Model Output Magnitude Using Shapley Value

Sentiment analysis of Bangla language using a new comprehensive dataset BangDSA and the novel feature metric skipBangla-BERT

Article

Full-text available

Jun 2024

In this modern technologically advanced world, Sentiment Analysis (SA) is a very important topic in every language due to its various trendy applications. But SA in Bangla language is still in a dearth level. This work focuses on examining different hybrid feature extraction techniques and learning algorithms on Bangla Document level Sentiment Analysis using a new comprehensive dataset (BangDSA) of 203,493 comments collected from various microblogging sites. The proposed BangDSA dataset approximately follows the Zipf’s law, covering 32.84% function words with a vocabulary growth rate of 0.053, tagged both on 15 and 3 categories. In this study, we have implemented 21 different hybrid feature extraction methods including Bag of Words (BOW), N-gram, TF-IDF, TF-IDF-ICF, Word2Vec, FastText, GloVe, Bangla-BERT etc with CBOW and Skipgram mechanisms. The proposed novel method (Bangla-BERT+Skipgram), skipBangla-BERT outperforms all other feature extraction techniques in machine leaning (ML), ensemble learning (EL) and deep learning (DL) approaches. Among the built models from ML, EL and DL domains the hybrid method CNN-BiLSTM surpasses the others. The best acquired accuracy for the CNN-BiLSTM model is 90.24% in 15 categories and 95.71% in 3 categories. Friedman test has been performed on the obtained results to observe the statistical significance. For both real 15 and 3 categories, the results of the statistical test are significant.

Sentiment analysis classification system using hybrid BERT models

Article

Full-text available

Jun 2023

Amira Samy

Because of the rapid growth of mobile technology, social media has become an essential platform for people to express their views and opinions. Understanding public opinion can help businesses and political institutions make strategic decisions. Considering this, sentiment analysis is critical for understanding the polarity of public opinion. Most social media analysis studies divide sentiment into three categories: positive, negative, and neutral. The proposed model is a machine-learning application of a classification problem trained on three datasets. Recently, the BERT model has demonstrated effectiveness in sentiment analysis. However, the accuracy of sentiment analysis still needs to be improved. We propose four deep learning models based on a combination of BERT with Bidirectional Long ShortTerm Memory (BiLSTM) and Bidirectional Gated Recurrent Unit (BiGRU) algorithms. The study is based on pre-trained word embedding vectors that aid in the model fine-tuning process. The proposed methods are trying to enhance accuracy and check the effect of hybridizing layers of BIGRU and BILSTM on both Bert models (DistilBERT, RoBERTa) for no emoji (text sentiment classifier) and also with emoji cases. The proposed methods were compared to two pre-trained BERT models and seven other models built for the same task using classical machine learning. The proposed architectures with BiGRU layers have the best results.

Product Review System With BERT for Sentiment Analysis and Implementation of Administrative Privileges on Node-RED

Article

Full-text available

Jan 2023

There are many ways to collect the product review, either by using a website or by manually asking the customers to leave their comments after buying the product. The traditional product feedback process is time-consuming. Nowadays, although industries use online review system to rate their product, the process lacks modernization. In this paper, BERT sentiment analysis is used to process the product review from the customers and automatically convert the review into a numeric score. This average score value is used for processing the sentiment analysis. The system is built entirely on Node-RED as a backbone to gather the review from the users with certain administrator privileges. An interactive dashboard is created to analyze the feedback of the product review. The above method of review system also reduces malpractice that the manufacturer or service provider may indulge into increase their ratings.

Sentiment Analysis of Restaurant Reviews from Bangladeshi Food Delivery Apps

Conference Paper

Apr 2023

In this study, we conducted sentiment analysis on restaurant reviews from Bangladeshi food delivery apps using natural language processing techniques. Food delivery apps have become increasingly popular in Bangladesh, and understanding the sentiment of customer reviews can provide valuable insights for restaurant owners and food delivery app companies. In this research, we have created a dataset named "Bangladeshi Restaurant Reviews" by gathering customer reviews of restaurants available on Foodpanda and Hungrynaki, which are two popular food delivery apps in Bangladesh. We used Robustly Optimized BERT Pretraining Approach (RoBERTa), AFINN, and DistilBERT, a distilled version of Bidirectional Encoder Representations from Transformers (BERT), to perform the sentiment analysis. Overall, this research paper highlights the importance of sentiment analysis in the food delivery industry and demonstrates the effectiveness of different models in performing this task. It also provides insights for businesses looking to use sentiment analysis to improve their services and products. The accuracies of the evaluated models, RoBERTa, AFINN, and DistilBERT, were 74%, 73%, and 77% respectively.

Sentiment analysis from Bangladeshi food delivery startup based on user reviews using machine learning and deep learning

Article

Aug 2022

Food delivery methods are at the top of the list in today's world. People's attitudes toward food delivery systems are usually influenced by food quality and delivery time. We performed a sentiment analysis of consumer comments collected from the Facebook pages of Food Panda, HungryNaki, Pathao Food, and Shohoz Food. As in any natural language processing (NLP) task, before the models were implemented, we went through a rigorous data pre-processing process that included stages like adding contractions, removing stop words, tokenizing, and more. Four supervised classification techniques are used: extreme gradient boosting (XGB), random forest classifier (RFC), decision tree classifier (DTC), and multinomial Naive Bayes (MNB). Three deep learning (DL) models are used: convolutional neural network (CNN), long short-term memory (LSTM), and recurrent neural network (RNN). The XGB model outperforms the other three machine learning (ML) algorithms with an accuracy of 89.64%. LSTM has the highest accuracy of the three DL algorithms, at 91.07%. Across the ML and DL models, LSTM performs best at predicting sentiment.

Natural Language Processing and Sentiment Analysis on Bangla Social Media Comments on Russia–Ukraine War Using Transformers

Article

Mar 2023

The Bangla language ranks seventh in the list of most spoken languages, with 265 million native and non-native speakers around the world, and is the second most spoken Indo-Aryan language after Hindi. However, the growth of research for tasks such as sentiment analysis (SA) in Bangla is relatively low compared to SA in the English language. This is because there are not enough high-quality publicly available datasets for training language models for text classification tasks in Bangla. In this paper, we propose a Bangla annotated dataset for sentiment analysis on the ongoing Ukraine–Russia war. The dataset was developed by collecting Bangla comments from various videos of three prominent YouTube TV news channels of Bangladesh covering their report on the ongoing conflict. A total of 10,861 Bangla comments were collected and labeled with three polarity sentiments, namely Neutral, Pro-Ukraine (Positive), and Pro-Russia (Negative). A benchmark classifier was developed by experimenting with several transformer-based language models, all pre-trained on an unlabeled Bangla corpus. The models were fine-tuned using our procured dataset. Hyperparameter optimization was performed on all 5 transformer language models: BanglaBERT, XLM-RoBERTa-base, XLM-RoBERTa-large, Distil-mBERT and mBERT. Each model was evaluated and analyzed using several evaluation metrics: F1 score, accuracy, and AIC (Akaike Information Criterion). The best-performing model achieved the highest accuracy of 86% with 0.82 F1 score. Based on accuracy, F1 score and AIC, BanglaBERT outperforms the baseline and all the other transformer-based classifiers.

Multi-class sentiment classification on Bengali social media comments using machine learning

Article

Jan 2023

Multi-class Sentiment Analysis (SA) is an important field of computational linguistics that extracts multiple opinions expressed in a text using NLP and text-mining techniques. Existing research on multi-class SA in the Bengali language is directed towards ternary classification with unsatisfactory classification performance. Moreover, obtaining a higher performance score is challenging due to the peculiarities of Bengali text, the lack of ground truth datasets, and the scarcity of preprocessing tools. In addition, no research has shown that deep learning algorithms perform better on four types of sentiments. Therefore, we proposed a supervised deep learning classifier based on CNN and LSTM to conduct multi-class SA on Bengali social media comments labelled as sexual, religious, political, and acceptable. The study aims to achieve maximum accuracy using the proposed model and provide a comparative analysis with the baseline models. Six machine learning models with two different feature extraction techniques were considered baseline models. Our proposed CLSTM architecture greatly improves SA performance, achieving 85.8% accuracy and an F1 score of 0.86 on a labelled dataset of 42,036 Facebook comments. A web application based on the proposed model and the highest-performing baseline model was built to detect the real-life sentiment of social media comments.

Compilation, Analysis and Application of a Comprehensive Bangla Corpus KUMono

Article

Aug 2022

Research in Natural Language Processing (NLP) and computational linguistics depends heavily on a good-quality representative corpus of the language in question. Bangla is one of the most spoken languages in the world, but Bangla NLP research is in its early stage of development due to the lack of a quality public corpus. This article describes the detailed compilation methodology of a comprehensive monolingual Bangla corpus, KUMono (Khulna University Monolingual corpus). The newly developed corpus consists of more than 350 million word tokens and more than one million unique tokens from 18 major text categories of online Bangla websites. We have conducted several word-level and character-level linguistic phenomenon analyses based on empirical studies of the developed corpus. The corpus follows Zipf’s curve and the hapax legomena rule. The quality of the corpus is also assessed by analyzing and comparing the inherent sparseness of the corpus with existing Bangla corpora, and by analyzing the distribution of function words of the corpus and its vocabulary growth rate. We have developed a Bangla article categorization application based on the KUMono corpus and obtained compelling results compared to the state-of-the-art models.

Sentiment Analysis of Customer Reviews of Food Delivery Services Using Deep Learning and Explainable Artificial Intelligence: Systematic Review

Article

May 2022

During the COVID-19 crisis, customers’ preference for having food delivered to their doorstep instead of waiting in a restaurant has propelled the growth of food delivery services (FDSs). With all restaurants going online and bringing FDSs onboard, such as UberEATS, Menulog or Deliveroo, customer reviews on online platforms have become an important source of information about the company’s performance. FDS organisations aim to gather complaints from customer feedback and effectively use the data to determine the areas for improvement to enhance customer satisfaction. This work aimed to review machine learning (ML) and deep learning (DL) models and explainable artificial intelligence (XAI) methods to predict customer sentiments in the FDS domain. A literature review revealed the wide usage of lexicon-based and ML techniques for predicting sentiments through customer reviews in FDS. However, limited studies applying DL techniques were found due to the lack of model interpretability and explainability of the decisions made. The key findings of this systematic review are as follows: 77% of the models are non-interpretable in nature, and organisations can argue for the explainability and trust in the system. DL models in other domains perform well in terms of accuracy but lack explainability, which can be achieved with XAI implementation. Future research should focus on implementing DL models for sentiment analysis in the FDS domain and incorporating XAI techniques to bring out the explainability of the models.

Bangla Food Review Sentimental Analysis using Machine Learning

Conference Paper

Jan 2022

In this modern age, people depend on the internet. They prefer to order food online or through a food app rather than at a restaurant, and they post various reviews of the food online. In this project, we aim to build a machine learning model to analyze the sentiment of those reviews. In Bangladesh, the number of internet users is increasing day by day, so we decided to build the model for the Bangla language. We found no Bangla dataset for food reviews that we could use for our project, so we collected more than one thousand Bangla food reviews from various online platforms such as Foodpanda, Hungrynaki, Shohoz Food, and Pathao Food, and labeled them. After the necessary preprocessing, we extracted various features from the cleaned data and used them to train and test machine learning and deep learning models. We found that Long Short-Term Memory (LSTM), a deep learning model, gives the best accuracy, 90.89%, using word2sequence for feature extraction. Our research contribution will help the food industry: this model can help them understand the sentiment of Bangla food reviews.

Bengali fake reviews: A benchmark dataset and detection system

Article