A Method for Ranking Products Through Online Reviews Based on Sentiment Classification and Interval-Valued Intuitionistic Fuzzy TOPSIS

Studies have shown that online product reviews significantly affect consumer purchase decisions. However, it is difficult for the consumer to read online product reviews one by one because the number of online reviews is very large. Thus, to facilitate consumer purchase decisions, how to rank products through online reviews is a valuable research topic. This paper proposes a method for ranking products through online reviews based on sentiment classification and the interval-valued intuitionistic fuzzy Technique for Order Preference by Similarity to an Ideal Solution (TOPSIS). The method consists of two parts: (1) identifying sentiment orientations of the online reviews based on sentiment classification and (2) ranking alternative products based on interval-valued intuitionistic fuzzy TOPSIS. In the first part, the online reviews of the alternative products concerning multiple attributes are preprocessed, and an algorithm based on support vector machine and one-versus-one strategy is developed for classifying the sentiment orientations of online reviews into three categories: positive, neutral, and negative. In the second part, based on the percentages of the online reviews with different sentiment orientations and the numbers of online reviews of different products crawled from the website, an interval-valued intuitionistic fuzzy number is constructed to represent the performance of an alternative product with respect to the product attribute. Additionally, the interval-valued intuitionistic fuzzy TOPSIS method is employed to determine a ranking of the alternative products. Finally, a case analysis is provided to illustrate the application of the proposed method.
Yang Liu*, Jian-Wu Bi*,§ and Zhi-Ping Fan*
*Department of Information Management and Decision Sciences
School of Business Administration
Northeastern University, Shenyang 110167, P. R. China
State Key Laboratory of Synthetical Automation for Process Industries
Northeastern University, Shenyang 110819, P. R. China
liuy@mail.neu.edu.cn
§jianwubi@126.com
zpfan@mail.neu.edu.cn
Published 29 September 2017
Keywords: Product ranking; online reviews; SVM; sentiment classification; interval-valued intuitionistic fuzzy number; TOPSIS.
§Corresponding author.
International Journal of Information Technology & Decision Making
Vol. 16 (2017)
© World Scientific Publishing Company
DOI: 10.1142/S021962201750033X
Int. J. Info. Tech. Dec. Mak. Downloaded from www.worldscientific.com
by SAM HOUSTON STATE UNIVERSITY on 10/02/17. For personal use only.
1. Introduction
With the rapid development of e-commerce, an increasing number of people are
buying products from e-commerce websites. Compared with products exhibited in
physical stores, the products exhibited on the websites are viewed with more
uncertainty because consumers cannot touch or try out the products until they are delivered. Thus, online product reviews published by consumers who have bought or used the products help potential consumers to understand the products more clearly [1,2]. In fact, even when consumers want to buy products from physical stores, they can visit the websites and obtain more product information by reading the related online product reviews. For example, suppose a consumer wants to buy a medium-sized car. From a preliminary investigation, several acceptable cars are determined, which can be considered the alternative cars. Nevertheless, the consumer wavers among the alternative cars because of limited knowledge and expertise. To make a desirable selection among the alternative cars, the consumer might read online reviews concerning them to understand them more clearly and make a reasonable decision. Previous studies have indicated that online product reviews significantly affect consumer purchase decisions [3-5]. However, because the number of online reviews concerning the alternative products is usually large, it can be tedious and time-consuming for the consumer to read all of the online reviews one by one. In this situation, several types of approaches can help the consumer capture the information embedded in the online reviews and make the final decision. These approaches include extracting a subset of important reviews [6], summarizing the opinions of a huge number of online reviews [7], and ranking alternative products through online reviews [8-15]. Among these approaches, ranking alternative products through online reviews is considered a comprehensive approach because it considers multiple factors of product selection such as product attributes, sentiment orientations of online reviews, product attribute weights, and posted time of reviews [8-15]. Thus, how to rank products through online reviews is a valuable research topic with extensive application backgrounds.
To date, the problem of ranking products through online reviews has attracted the attention of some scholars, and several methods for ranking products through online reviews have been proposed [8-15]. The existing methods often involve two processes, namely, (1) information extraction and (2) product ranking. The former extracts the related information from online reviews, such as product attributes and sentiment orientations. The latter ranks products based on the extracted information. These studies have made significant contributions to ranking products through online reviews. However, in most of the previous studies [8-14], the online reviews with neutral sentiment orientations are ignored. In fact, the reviews with neutral sentiment orientations represent the hesitant or uncertain evaluations of consumers concerning products, and they should not be ignored [15-17] because they are also valuable for the potential consumer to make a reasonable decision. For example, consider a consumer who wants to buy one of two alternative cars. One car (denoted as $A_1$) has received 100 reviews, including 45 reviews with positive sentiment orientation, 50 reviews with neutral sentiment orientation and 5 reviews with negative sentiment orientation; the other car (denoted as $A_2$) has received 50 reviews, including 45 reviews with positive sentiment orientation and 5 reviews with negative sentiment orientation. It would be difficult to choose between the two cars if the online reviews with neutral sentiment orientation were ignored, because each car would then show 45 positive and 5 negative reviews. However, if the online reviews with neutral sentiment orientation were considered, most consumers would select car $A_2$ because of the higher percentage of positive evaluations (90% versus 45%) and the lower percentage of hesitant or uncertain evaluations (0% versus 50%) received by $A_2$. Thus, the reviews with neutral sentiment orientation should not be ignored when determining the ranking of alternative products.
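The arithmetic behind the two-car example can be checked with a few lines of Python. This is only an illustrative sketch: the function name `sentiment_shares` and its `include_neutral` flag are hypothetical and are not part of the proposed method; the review counts are the ones given in the example.

```python
from fractions import Fraction


def sentiment_shares(pos, neu, neg, include_neutral=True):
    """Return the (positive, neutral, negative) shares of a product's reviews.

    With include_neutral=False the neutral reviews are dropped before the
    shares are computed, which mimics methods that ignore them.
    """
    if not include_neutral:
        neu = 0
    total = pos + neu + neg
    return Fraction(pos, total), Fraction(neu, total), Fraction(neg, total)


# Review counts taken from the example in the text.
a1 = {"pos": 45, "neu": 50, "neg": 5}   # car A1: 100 reviews
a2 = {"pos": 45, "neu": 0, "neg": 5}    # car A2: 50 reviews

# Neutral reviews ignored: the two cars cannot be told apart.
same_without_neutral = (
    sentiment_shares(**a1, include_neutral=False)
    == sentiment_shares(**a2, include_neutral=False)
)

# Neutral reviews kept: A2 has the larger positive share (9/10 versus 9/20).
a1_shares = sentiment_shares(**a1)
a2_shares = sentiment_shares(**a2)
```

Exact fractions are used so that the comparison between the two cars does not depend on floating-point rounding.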
An intuitionistic fuzzy number is a valuable data form for representing information with hesitance and uncertainty [18,19]. It can be used to represent evaluations or judgments with different degrees of support, hesitation, and opposition [20,21]. Thus, based on the sentiment analysis technique and intuitionistic fuzzy set theory, Liu et al. [15] proposed a method for ranking products through online reviews. In the method, based on the percentages of reviews with positive, neutral, and negative sentiment orientations, an intuitionistic fuzzy number is constructed to represent the performance of an alternative product with respect to a product attribute. Then, the Preference Ranking Organization Method for Enrichment Evaluations (PROMETHEE) II method is used to determine the ranking of alternative products. In the study of Liu et al. [15], a large number of sentiment orientations of online reviews of an alternative product concerning a product attribute can be represented simply and completely by an intuitionistic fuzzy number. This approach is a new idea and a valuable attempt to process and fuse a large number of sentiment orientations embedded in online reviews. However, although an intuitionistic fuzzy number can simply reflect the percentages of online reviews with different sentiment orientations, the numbers of online reviews of different products crawled from the website are not considered. In fact, there might be great differences among the numbers of online reviews concerning different products crawled from the website. The number of online reviews affects the confidence of the decision data refined from the online reviews. That is, the intuitionistic fuzzy number constructed based on the sentiment orientations of a large number of online reviews should have a high confidence level; conversely, the intuitionistic fuzzy number constructed based on the sentiment orientations of a small number of online reviews should have a low confidence level. Thus, to eliminate the effects of different numbers of online reviews on the confidence levels of the constructed intuitionistic fuzzy numbers, confidence intervals can be obtained, based on confidence interval estimation in probability theory [22], from the percentages of different sentiment orientations and the numbers of online reviews of the alternative products. That is, the percentages of different sentiment orientations in the form of crisp numbers are replaced by confidence intervals. Therefore, an interval-valued intuitionistic fuzzy number can be constructed to reflect the performance of an alternative product concerning a product attribute, and then the interval-valued intuitionistic fuzzy Technique for Order Preference by Similarity to an Ideal Solution (TOPSIS) [23,24] can be used to determine a ranking of alternative products.
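To make the role of the review count concrete, the following sketch replaces a crisp percentage with an interval whose width shrinks as the number of reviews grows. It is a sketch under stated assumptions rather than the construction used in the proposed method: the normal-approximation (Wald) interval, the default `z = 1.96`, the clipping to [0, 1], and the names `proportion_interval` and `interval_valued_evaluation` are all illustrative choices. It also does not enforce the usual requirement that the upper bounds of support and opposition sum to at most 1.

```python
import math


def proportion_interval(count, n, z=1.96):
    """Normal-approximation interval for the proportion count/n, clipped to [0, 1].

    Assumption: a Wald interval with z = 1.96 (about 95% confidence).
    """
    p = count / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return max(0.0, p - half_width), min(1.0, p + half_width)


def interval_valued_evaluation(pos, neu, neg):
    """Intervals for the shares of positive (support) and negative (opposition) reviews."""
    n = pos + neu + neg
    return {
        "support": proportion_interval(pos, n),
        "opposition": proportion_interval(neg, n),
    }


# Same positive share (90%), very different numbers of reviews.
few_reviews = proportion_interval(9, 10)
many_reviews = proportion_interval(900, 1000)

# Car A1 from the introduction: 45 positive, 50 neutral, 5 negative reviews.
a1_evaluation = interval_valued_evaluation(45, 50, 5)
```

With only 10 reviews the interval around 90% is much wider than with 1000 reviews, which is the intuition behind replacing crisp percentages with intervals.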
The objective of this paper is to propose a method for ranking products through online reviews based on sentiment classification and interval-valued intuitionistic fuzzy TOPSIS. The method consists of two parts: (1) identifying sentiment orientations of the online reviews based on sentiment classification and (2) ranking alternative products based on interval-valued intuitionistic fuzzy TOPSIS. In the first part, the online reviews of the alternative products concerning multiple attributes are preprocessed using ICTCLAS 2016 (Institute of Computing Technology, Chinese Lexical Analysis System, http://ictclas.nlpir.org/), and a set of notional words is constructed with respect to alternative products concerning each product attribute. Then, based on support vector machine (SVM) and the one-versus-one (OVO) strategy, an algorithm is developed for classifying the sentiment orientations of online reviews of the alternative products concerning different attributes into positive, neutral, and negative. In the second part, based on the percentages of the online reviews with different sentiment orientations and the numbers of online reviews of different products crawled from the website, an interval-valued intuitionistic fuzzy number is constructed to represent the performance of an alternative product concerning a product attribute. Then, the interval-valued intuitionistic fuzzy TOPSIS method is employed to determine a ranking of the alternative products.
The remainder of this paper is arranged as follows. Section 2 provides a literature review of methods for ranking products through online reviews. Section 3 formulates the problem of ranking products through online reviews. In Sec. 4, descriptions of the two parts of the proposed method are presented. In Sec. 5, a case study on ranking 11 cars is provided to illustrate the use of the proposed method. Finally, Sec. 6 summarizes and highlights the major contributions of this paper.
2. Literature Review of Methods for Ranking Products
Through Online Reviews
The problem of ranking products through online reviews has received the attention of scholars, and several methods for ranking products through online reviews have been proposed [8-15]. Zhang et al. [8] focused on the problem of ranking products through online reviews and proposed a method based on a directed and weighted product graph. In the method, subjective sentences and comparative sentences in online product reviews are first distinguished, in which a subjective sentence represents the subjective opinion of a consumer on a product and a comparative sentence represents a comparison relationship between a pair of products. Then, the positive or negative sentiment orientation of each subjective sentence and each comparative sentence is identified using the sentiment analysis technique, and a directed and weighted product graph is constructed which simultaneously reflects the subjective opinions and comparison relationships of the products. Finally, an improved page-rank algorithm is proposed to determine a ranking of products based on the directed and weighted product graph. On this basis, Zhang et al. [9] further proposed a method for ranking products through online reviews based on different aspects of product attributes. In the method, sentences in the online reviews are initially classified into different subsets based on the product attribute mentioned in the sentences. Then, the procedure proposed by Zhang et al. [8] is used to determine a ranking of products based on the sentences concerning each product attribute. Thereafter, Zhang et al. [10] incorporated the helpfulness and the age importance of each review into the determination of the ranking of products, in which the helpfulness of each review is measured according to the number of helpful votes received by the review, and the age importance of each review is calculated according to the posted date of the review. Peng et al. [11] suggested a method for ranking products through Chinese online reviews based on the fuzzy PROMETHEE method. According to similarity degrees of different Chinese words, the synonyms concerning each product attribute are determined. According to the total frequency of the synonyms concerning each product attribute, the set of important product attributes is determined. Then, several domain experts are invited to provide subjective evaluations on multiple products concerning the important product attributes. Furthermore, according to the subjective evaluations, the ranking of products is determined by the fuzzy PROMETHEE method. Chen et al. [12] proposed a method for market structure visualization through online product reviews. In their method, the online product reviews are initially classified into positive and negative reviews. Additional analysis is conducted based on the positive reviews and the negative reviews, respectively. Using topic modeling and the scree plot technique, the topic distribution matrix can be obtained, and the weight matrix of all brands and the weights of important topics are determined according to the topic distribution matrix. Furthermore, a perceptual map of market structure is built by the multi-dimensional scaling method, and a ranking of products is obtained using the TOPSIS method. Najmi et al. [13] proposed a comprehensive method for ranking products through online reviews. In the method, the brand score of each product is calculated by an improved page-rank algorithm, and the review score is calculated based on the results of sentiment analysis and usefulness analysis of each online review. The final ranking score of each product is determined by aggregating the brand score and the review score. Yang et al. [14] proposed a method for ranking multiple products by integrating heterogeneous information including numeric ratings, text reviews, and comparative votes. In the method, the heterogeneous information is first classified into two categories: descriptive information and comparative information. Then, the descriptive information and comparative information are integrated into a digraph structure, from which an integrated electronic word-of-mouth (eWOM) score of each product can be calculated. Furthermore, based on the obtained eWOM score, the overall ranking of the multiple products can be determined. Liu et al. [15] proposed a method for ranking products based on the sentiment analysis technique and intuitionistic fuzzy set theory. In the method, an algorithm based on sentiment dictionaries is developed to identify the positive, neutral, and negative sentiment orientations of online reviews. Then, according to the identified sentiment orientations, an intuitionistic fuzzy number is constructed to represent the performance of each alternative product concerning each product attribute. Furthermore, the dominance degrees on pairwise comparisons of the alternative products are calculated, and a ranking of the alternative products is determined using the PROMETHEE II method.
The previous studies have made significant contributions to ranking products through online reviews. However, in most of the previous studies [8-14], the online reviews with neutral sentiment orientations are ignored, which leads to a loss of valuable decision data (an example can be found in Sec. 1). Although the online reviews with neutral sentiment orientations are considered in the study of Liu et al. [15], the numbers of online reviews of different products crawled from the website are not considered. In fact, the different numbers of online reviews reflect the different confidence levels of decision data refined from the online reviews and are valuable for the consumer to select a desirable alternative product. Thus, to support consumer purchase decisions, it is necessary to further develop the method for ranking products through online reviews.
3. Problem Description
Consider a consumer who wants to buy a product such as a car. Through a preliminary investigation, several alternative products are identified. The alternative products are all acceptable, but the consumer cannot decide which one to buy because of limited knowledge and expertise. To select the most desirable one from the alternative products, the consumer provides his/her personalized preferences on the important product attributes and the weights of these important product attributes. To support the consumer purchase decision, a large number of online reviews of the alternative products concerning the product attributes are crawled from the related website. The problem addressed in this paper is how to rank the alternative products based on the online reviews and the attribute weights provided by the consumer. The problem of ranking products through online reviews is illustrated in Fig. 1.
[Fig. 1. Problem of ranking products through online reviews. The consumer supplies the alternative products $A_1, A_2, \ldots, A_n$, the product attributes $f_1, f_2, \ldots, f_m$, and the attribute weights $w_1, w_2, \ldots, w_m$; these, together with the online reviews concerning the alternative products, are the inputs of the method for ranking products through online reviews, which outputs a ranking of the alternative products.]
The following notations are used to denote the sets and variables in the problem. These notations will be used throughout this paper.
• $A = \{A_1, A_2, \ldots, A_n\}$: the set of $n$ acceptable alternative products, where $A_i$ denotes the $i$th acceptable alternative product, $i = 1, 2, \ldots, n$. The set $A$ can be determined by the consumer according to his/her personal preference.
• $F = \{f_1, f_2, \ldots, f_m\}$: the set of $m$ product attributes of interest to the consumer, where $f_j$ denotes the $j$th product attribute of interest to the consumer, $j = 1, 2, \ldots, m$. Usually, a predefined attribute set with a great number of attributes can be determined by the website based on the characteristics of the type of product, and the consumer can determine set $F$ by selecting some or all of the predefined attributes according to his/her personal preference.
• $w = (w_1, w_2, \ldots, w_m)$: the vector of attribute weights, where $w_j$ denotes the weight of attribute $f_j$, such that $w_j \geq 0$ and $\sum_{j=1}^{m} w_j = 1$, $j = 1, 2, \ldots, m$. $w$ can be directly determined by the assignment of the consumer or indirectly obtained using existing procedures such as the analytic hierarchy process (AHP) [25-27]. If the consumer is not familiar with AHP, a webpage including the framework of AHP could help the consumer determine the attribute weights. In the webpage, a series of questions on attribute comparisons are embedded beforehand. Then, according to the answers of the consumer, the attribute weights can be obtained automatically based on AHP [25].
• $Q = (q_1, q_2, \ldots, q_n)$: the vector of numbers of the online reviews concerning alternative products, where $q_i$ denotes the number of the online reviews concerning alternative product $A_i$, $i = 1, 2, \ldots, n$. The online reviews can be crawled from the related website.
• $D_{ik} = (D_{ik}^{1}, D_{ik}^{2}, \ldots, D_{ik}^{m})$: the $k$th online review concerning alternative product $A_i$, where $D_{ik}^{j}$ denotes the sentence concerning attribute $f_j$ in the $k$th online review of alternative product $A_i$, $i = 1, 2, \ldots, n$, $j = 1, 2, \ldots, m$, $k = 1, 2, \ldots, q_i$. If there is no sentence concerning product attribute $f_j$ in review $D_{ik}$, then we denote $D_{ik}^{j} = \varnothing$, $i = 1, 2, \ldots, n$, $j = 1, 2, \ldots, m$, $k = 1, 2, \ldots, q_i$. Currently, some websites, such as Automobile home (http://www.autohome.com.cn/), encourage consumers to post their reviews according to a pre-established framework of product attributes. Even when the reviews are not posted according to the product attributes, some existing techniques [28,29] can be used to extract sentences concerning different attributes from the online reviews. Thus, in this study, we consider that the online reviews have been expressed in the form of sentences concerning different attributes, i.e., $D_{ik} = (D_{ik}^{1}, D_{ik}^{2}, \ldots, D_{ik}^{m})$.
• $\bar{T}^{j} = \{\bar{T}_{1}^{j}, \bar{T}_{2}^{j}, \ldots, \bar{T}_{q_t^j}^{j}\}$: the set of $q_t^j$ training samples for identifying the sentiment orientations of online reviews concerning attribute $f_j$, where $\bar{T}_{z}^{j}$ is a review (or sentence) on the same type of product concerning attribute $f_j$ that has been labeled with a positive, neutral, or negative sentiment orientation, $z = 1, 2, \ldots, q_t^j$, $j = 1, 2, \ldots, m$. Thus, the set $\bar{T}^{j}$ can be further divided into three subsets $\bar{T}_{\mathrm{pos}}^{j}$, $\bar{T}_{\mathrm{neu}}^{j}$, and $\bar{T}_{\mathrm{neg}}^{j}$, which, respectively, denote the sets of training samples with positive, neutral, and negative sentiment orientations, such that $\bar{T}_{\mathrm{pos}}^{j} \cup \bar{T}_{\mathrm{neu}}^{j} \cup \bar{T}_{\mathrm{neg}}^{j} = \bar{T}^{j}$, $\bar{T}_{\mathrm{pos}}^{j} \cap \bar{T}_{\mathrm{neu}}^{j} = \varnothing$, $\bar{T}_{\mathrm{pos}}^{j} \cap \bar{T}_{\mathrm{neg}}^{j} = \varnothing$, and $\bar{T}_{\mathrm{neu}}^{j} \cap \bar{T}_{\mathrm{neg}}^{j} = \varnothing$, $j = 1, 2, \ldots, m$.
The problem of concern in this paper is how to rank alternative products $A_1, A_2, \ldots, A_n$ based on the online reviews $D_{ik}$ and attribute weights $w_j$, $i = 1, 2, \ldots, n$, $j = 1, 2, \ldots, m$, $k = 1, 2, \ldots, q_i$.
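The inputs defined by the notation above can be held in a small container, sketched below. The class name `RankingProblem`, its field names, and the use of `None` for a missing sentence (the case $D_{ik}^{j} = \varnothing$) are hypothetical choices made for illustration; only the constraints on the weights and the shape of each review come from the problem description.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class RankingProblem:
    alternatives: List[str]      # A_1, ..., A_n
    attributes: List[str]        # f_1, ..., f_m
    weights: List[float]         # w_1, ..., w_m
    # reviews[A_i][k][j] is the sentence on attribute f_j in the k-th review
    # of A_i, or None when the review has no sentence on that attribute.
    reviews: Dict[str, List[List[Optional[str]]]]

    def validate(self) -> None:
        m = len(self.attributes)
        if len(self.weights) != m:
            raise ValueError("one weight per attribute is required")
        if any(w < 0 for w in self.weights):
            raise ValueError("weights must be non-negative")
        if not math.isclose(sum(self.weights), 1.0):
            raise ValueError("weights must sum to 1")
        for name in self.alternatives:
            for review in self.reviews.get(name, []):
                if len(review) != m:
                    raise ValueError("each review needs one entry per attribute")

    def review_counts(self) -> List[int]:
        """Return Q = (q_1, ..., q_n), the number of reviews per alternative."""
        return [len(self.reviews.get(name, [])) for name in self.alternatives]


# Toy instance with made-up review sentences.
problem = RankingProblem(
    alternatives=["A1", "A2"],
    attributes=["space", "comfort"],
    weights=[0.4, 0.6],
    reviews={
        "A1": [["roomy cabin", None], [None, "the seat is very comfortable"]],
        "A2": [["small trunk", "firm seat"]],
    },
)
problem.validate()
counts = problem.review_counts()
```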
4. The Proposed Method
To solve the above problem, a method for ranking products through online reviews based on sentiment classification and interval-valued intuitionistic fuzzy TOPSIS is proposed in this section. The method consists of two parts: (1) identifying sentiment orientations of the online reviews based on sentiment classification and (2) ranking alternative products based on interval-valued intuitionistic fuzzy TOPSIS. Detailed descriptions of the two parts are, respectively, provided in Secs. 4.1 and 4.2.
4.1. Identifying sentiment orientations of the online reviews based
on sentiment classification
In this paper, the sentiment classification technique is employed to identify the sentiment orientations of online reviews of alternative products with respect to each attribute. Thus, the online reviews are initially preprocessed, and then an algorithm based on SVM and the OVO strategy is proposed to identify the positive, neutral, and negative sentiment orientations of the online reviews. The details are, respectively, provided in Secs. 4.1.1 and 4.1.2.
4.1.1. Preprocessing online reviews concerning the alternative products
The preprocessing includes two processes, namely (1) word segmentation and part-
of-speech (POS) tagging and (2) stop word removal. The details are provided below.
(1) Word segmentation and POS tagging
In this paper, ICTCLAS 2016 is used for word segmentation and POS tagging. Each sentence in the online review is decomposed into several words, and the POS of each word is tagged after the word. For example, if a Chinese sentence meaning "the seat is very comfortable" is imported into ICTCLAS 2016, then the output is the segmented and tagged word sequence corresponding to "seat/n, very/d, comfortable/a", where "/n" denotes "noun"; "/d" denotes "adverb"; "/v" denotes "verb"; and "/a" denotes "adjective". Let $D_{ik}^{\prime} = (D_{ik}^{\prime 1}, D_{ik}^{\prime 2}, \ldots, D_{ik}^{\prime m})$ denote the output of ICTCLAS 2016 when the online review $D_{ik}$ is input, $i = 1, 2, \ldots, n$, $j = 1, 2, \ldots, m$, $k = 1, 2, \ldots, q_i$. Specifically, if $D_{ik}^{j} = \varnothing$, then $D_{ik}^{\prime j} = \varnothing$, $i = 1, 2, \ldots, n$, $j = 1, 2, \ldots, m$, $k = 1, 2, \ldots, q_i$.
(2) Stop word removal
To improve the efficiency and effectiveness of sentiment classification, stop words must be removed. The words in the stop word list (see: http://www.datatang.com/data/19300) are deleted from $D_{ik}^{\prime} = (D_{ik}^{\prime 1}, D_{ik}^{\prime 2}, \ldots, D_{ik}^{\prime m})$, $i = 1, 2, \ldots, n$, $j = 1, 2, \ldots, m$, $k = 1, 2, \ldots, q_i$. Let $\bar{W}_{ik} = (\bar{W}_{ik}^{1}, \bar{W}_{ik}^{2}, \ldots, \bar{W}_{ik}^{m})$ denote the vector of notional words obtained by removing the stop words from $D_{ik}^{\prime}$, where $\bar{W}_{ik}^{j} = \{W_{ik}^{j1}, W_{ik}^{j2}, \ldots, W_{ik}^{j q_{ik}^{j}}\}$ denotes the set of notional words in sentence $D_{ik}^{\prime j}$, and $q_{ik}^{j}$ denotes the number of words in $\bar{W}_{ik}^{j}$, $i = 1, 2, \ldots, n$, $j = 1, 2, \ldots, m$, $k = 1, 2, \ldots, q_i$. Specifically, if $D_{ik}^{j} = \varnothing$, then we denote $\bar{W}_{ik}^{j} = \varnothing$, $i = 1, 2, \ldots, n$, $j = 1, 2, \ldots, m$, $k = 1, 2, \ldots, q_i$.
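The two preprocessing steps can be sketched on already-tagged text as follows. This sketch does not call ICTCLAS 2016; it assumes the tagger output is available as space-separated "word/tag" tokens, and the helper names `parse_tagged` and `notional_words` as well as the one-word stop list are hypothetical. English tokens stand in for the Chinese words of the example above.

```python
def parse_tagged(tagged):
    """Split space-separated 'word/tag' tokens into (word, tag) pairs."""
    pairs = []
    for token in tagged.split():
        # rpartition keeps any '/' inside the word and splits on the last one.
        word, _, tag = token.rpartition("/")
        pairs.append((word, tag))
    return pairs


def notional_words(tagged, stop_words):
    """Return the words of a tagged sentence that are not stop words, in order."""
    return [word for word, _ in parse_tagged(tagged) if word not in stop_words]


# Hypothetical stop word list; the list referenced in the text is not reproduced here.
stop_words = {"very"}

tagged_sentence = "seat/n very/d comfortable/a"
pairs = parse_tagged(tagged_sentence)
words = notional_words(tagged_sentence, stop_words)

# A review with no sentence on the attribute yields no notional words.
empty = notional_words("", stop_words)
```

A list is returned instead of a set so that word order and repeated words are preserved; a later weighting step that counts word frequencies needs the repeats.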
4.1.2. An algorithm based on SVM and the OVO strategy to identify the positive,
neutral, and negative sentiment orientations of online reviews
To rank products through online reviews, we must identify the sentiment orientations of the online reviews of alternative products with respect to different attributes. Previous studies [30-32] have shown that SVM performs well not only for binary sentiment classification but also, when combined with the OVO strategy, for multi-class sentiment classification. Thus, in this section, an algorithm based on SVM and the OVO strategy is proposed to identify the positive, neutral, and negative sentiment orientations of online reviews. The structure of the algorithm is shown in Fig. 2.
As seen in Fig. 2, the algorithm can be divided into two stages: (1) converting training samples and online reviews into feature vectors concerning the notional words and (2) training the SVM classifiers and identifying the sentiment orientations of online reviews using the OVO strategy. In the first stage, according to the training samples and online reviews, a set of notional words is constructed with respect to alternative products concerning each product attribute. Then, the training samples and the online reviews of alternative products are uniformly converted into feature vectors concerning the notional words. In the second stage, three subsets of training samples (i.e., "positive-neutral" $\bar{T}_{\mathrm{pos}}^{j} \cup \bar{T}_{\mathrm{neu}}^{j}$, "positive-negative" $\bar{T}_{\mathrm{pos}}^{j} \cup \bar{T}_{\mathrm{neg}}^{j}$, and "neutral-negative" $\bar{T}_{\mathrm{neu}}^{j} \cup \bar{T}_{\mathrm{neg}}^{j}$) are first constructed with respect to each product attribute. Then, three SVM classifiers are, respectively, trained on the three subsets with respect to the product attribute. Furthermore, each SVM classifier yields a classification result for the sentiment orientation of each review concerning each attribute, and the OVO strategy is used to integrate the results of the three SVM classifiers and to determine the final sentiment orientation of each review with respect to each product attribute. A detailed description of each stage is provided below.
[Fig. 2. Structure of the algorithm based on SVM and the OVO strategy. Stage (1), converting training samples and online reviews into feature vectors concerning the notional words: the labeled training samples (pos, neu, neg) and the online reviews are mapped onto the set of notional words. Stage (2), training the SVM classifiers and identifying the sentiment orientations of online reviews using the OVO strategy: the training vectors are split into Subset(pos-neu), Subset(pos-neg), and Subset(neu-neg); SVM(pos-neu), SVM(pos-neg), and SVM(neu-neg) are trained on the three subsets, respectively; the OVO strategy then gives the sentiment orientations of the online reviews.]
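The integration step in the second stage can be illustrated with a small voting routine. This is a sketch that assumes the OVO strategy is applied as plain majority voting over the three pairwise results, which is the common form of OVO; the function name `ovo_vote`, the label strings, and in particular the rule of falling back to "neu" when all three labels receive one vote each are assumptions for illustration and are not taken from the method description.

```python
from collections import Counter

LABELS = ("pos", "neu", "neg")


def ovo_vote(pairwise_predictions, tie_break="neu"):
    """Combine the outputs of the three pairwise classifiers by majority vote.

    pairwise_predictions maps each pair of labels to the label chosen by the
    classifier trained on that pair. The tie_break label is an assumed rule
    for the case in which no label wins outright.
    """
    votes = Counter(pairwise_predictions.values())
    best = max(votes.values())
    winners = [label for label in LABELS if votes[label] == best]
    return winners[0] if len(winners) == 1 else tie_break


# Two of the three pairwise classifiers choose "pos".
clear_case = ovo_vote({
    ("pos", "neu"): "pos",
    ("pos", "neg"): "pos",
    ("neu", "neg"): "neu",
})

# Each label receives exactly one vote, so the assumed tie-break applies.
tied_case = ovo_vote({
    ("pos", "neu"): "neu",
    ("pos", "neg"): "pos",
    ("neu", "neg"): "neg",
})
```

With three classes, a label can receive at most two votes, so a three-way tie is the only ambiguous outcome that needs a rule.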
(1) Converting training samples and online reviews into feature vectors concerning
the notional words
Without loss of generality, we assume that training sample $T^j_z$ has been preprocessed by the process shown in Sec. 4.1.1 and has been changed into a set of notional words, $j = 1, 2, \ldots, m$; $z = 1, 2, \ldots, q^j_t$. Let $W^j_i$ denote the set of notional words of the online reviews of alternative product $A_i$ with respect to attribute $f_j$, and let $W^j$ denote the set of notional words of the training samples and online reviews with respect to attribute $f_j$. Then $W^j_i$ and $W^j$ can be determined by Eqs. (1) and (2), respectively, i.e.,

$$W^j_i = W^j_{i1} \cup W^j_{i2} \cup \cdots \cup W^j_{iq_i}, \quad i = 1, 2, \ldots, n; \; j = 1, 2, \ldots, m, \tag{1}$$

$$W^j = W^j_1 \cup W^j_2 \cup \cdots \cup W^j_n \cup T^j_1 \cup T^j_2 \cup \cdots \cup T^j_{q^j_t}, \quad j = 1, 2, \ldots, m. \tag{2}$$
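As a minimal sketch of the unions in Eqs. (1) and (2), the snippet below merges per-review word sets into a per-product set and then into one vocabulary per attribute. The function name `build_vocabulary`, the dictionary layout, and the toy words are illustrative assumptions, not part of the original method description.

```python
def build_vocabulary(review_words, training_words):
    """Union notional-word sets in the spirit of Eqs. (1) and (2).

    review_words: dict mapping a product id to a list of word sets (one per review).
    training_words: list of word sets (one per training sample).
    """
    # Eq. (1): words of all reviews of one product, for one attribute.
    per_product = {
        product: set().union(*word_sets)
        for product, word_sets in review_words.items()
    }
    # Eq. (2): words of all products' reviews plus all training samples.
    vocabulary = set().union(*per_product.values(), *training_words)
    return per_product, sorted(vocabulary)


# Toy data (hypothetical): two products, one training sample.
reviews = {"A1": [{"power", "OK"}, {"power", "not"}], "A2": [{"oil", "run"}]}
training = [{"power", "surprise"}]
per_product, vocab = build_vocabulary(reviews, training)
```

Sorting the vocabulary fixes a word order, so every feature vector built afterwards uses the same index for the same word.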
Let $E_j$ denote the number of words in set $W^j$; then $W^j$ can be written as $W^j = \{W^{j1}, W^{j2}, \ldots, W^{jE_j}\}$, where $W^{jh}$ denotes the $h$th notional word in $W^j$, $h = 1, 2, \ldots, E_j$. Based on the bag-of-words (BOW) model,33 review $D^j_{ik}$ can be represented by a feature vector $\omega^j_{ik} = (\omega^{j1}_{ik}, \omega^{j2}_{ik}, \ldots, \omega^{jE_j}_{ik})$ over the set $W^j$, where $\omega^{jh}_{ik}$ denotes the weight of word $W^{jh}$ for distinguishing the semantics of review $D^j_{ik}$, $i = 1, 2, \ldots, n$, $j = 1, 2, \ldots, m$, $k = 1, 2, \ldots, q_i$, $h = 1, 2, \ldots, E_j$. The weight $\omega^{jh}_{ik}$ is determined following the idea of term frequency-inverse document frequency (TF-IDF).34 On the one hand, the more frequently word $W^{jh}$ occurs in review $D^j_{ik}$, the more important it is for distinguishing the semantics of that review, and the greater $\omega^{jh}_{ik}$ is. On the other hand, the more frequently word $W^{jh}$ occurs across all of the reviews, the less it helps to distinguish the semantics of any single review, and the smaller $\omega^{jh}_{ik}$ is. Thus, based on the idea of TF-IDF,34 the value of $\omega^{jh}_{ik}$ can be calculated by the following Eq. (3), i.e.,
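$$\omega^{jh}_{ik} = \frac{\tau^{jh}_{ik}}{\tau^{j}_{ik}} \log \frac{q^j_t + |\{(i,k) : D^j_{ik} \neq \emptyset\}|}{|\{z : W^{jh} \in T^j_z\}| + |\{(i,k) : W^{jh} \in D^j_{ik}\}| + 1}. \tag{3}$$

In Eq. (3), $\tau^{jh}_{ik}$ denotes the frequency of word $W^{jh}$ in review $D^j_{ik}$, $\tau^{j}_{ik}$ denotes the number of notional words in review $D^j_{ik}$, $|\{(i,k) : D^j_{ik} \neq \emptyset\}|$ denotes the number of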
10 Y. Liu, J.-W. Bi & Z.-P. Fan
non-empty reviews of all alternative products concerning attribute $f_j$, and $|\{z : W^{jh} \in T^j_z\}|$ and $|\{(i,k) : W^{jh} \in D^j_{ik}\}|$ denote the numbers of training samples containing the word $W^{jh}$ and of online reviews containing the word $W^{jh}$, respectively, $i = 1, 2, \ldots, n$, $j = 1, 2, \ldots, m$, $k = 1, 2, \ldots, q_i$, $h = 1, 2, \ldots, E_j$.
Similarly, let $\omega^j_z = (\omega^{j1}_z, \omega^{j2}_z, \ldots, \omega^{jE_j}_z)$ denote the feature vector of training sample $T^j_z$ over the set $W^j = \{W^{j1}, W^{j2}, \ldots, W^{jE_j}\}$, where $\omega^{jh}_z$ denotes the weight of word $W^{jh}$ for distinguishing the semantics of training sample $T^j_z$, $j = 1, 2, \ldots, m$; $z = 1, 2, \ldots, q^j_t$. Based on the idea of TF-IDF,34 the value of $\omega^{jh}_z$ can be calculated by the following Eq. (4), i.e.,

$$\omega^{jh}_{z} = \frac{\tau^{jh}_{z}}{\tau^{j}_{z}} \log \frac{q^j_t + |\{(i,k) : D^j_{ik} \neq \emptyset\}|}{|\{z : W^{jh} \in T^j_z\}| + |\{(i,k) : W^{jh} \in D^j_{ik}\}| + 1}, \tag{4}$$

where $\tau^{jh}_z$ denotes the frequency of word $W^{jh}$ in training sample $T^j_z$, and $\tau^{j}_z$ denotes the number of notional words in training sample $T^j_z$, $j = 1, 2, \ldots, m$, $z = 1, 2, \ldots, q^j_t$, $h = 1, 2, \ldots, E_j$.
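The weighting in Eqs. (3) and (4) can be sketched in a few lines of Python. This is an illustration under stated assumptions: the function name `tfidf_vectors` and the toy documents are hypothetical, each document is a list of notional words, and the natural logarithm is assumed because the base of the logarithm is not specified in the text.

```python
import math
from collections import Counter


def tfidf_vectors(training, reviews, vocab):
    """Weight vectors for training samples and reviews, following Eqs. (3)-(4).

    training, reviews: lists of documents, each a list of notional words.
    vocab: ordered list of all notional words for the attribute.
    """
    docs = training + reviews
    # Numerator of the log term: training samples plus non-empty reviews.
    numerator = len(training) + sum(1 for r in reviews if r)
    # Number of documents (training samples and reviews) containing each word.
    doc_freq = Counter(w for d in docs for w in set(d))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append([
            (counts[w] / len(doc)) * math.log(numerator / (doc_freq[w] + 1))
            if counts[w] else 0.0
            for w in vocab
        ])
    return vectors


# Toy data (hypothetical): one training sample, three reviews (one empty).
training_docs = [["power", "OK"]]
review_docs = [["power", "not", "OK"], ["oil"], []]
words = ["OK", "not", "oil", "power"]
vectors = tfidf_vectors(training_docs, review_docs, words)
```

Because of the "+ 1" in the denominator, a word that occurs in every non-empty document receives a weight of zero or below, so only words that separate documents contribute to the feature vector.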
To illustrate the process of converting training samples and online reviews into feature vectors more clearly, an example is provided. Table 1 shows the notional words of 1109 training samples and 346 online reviews concerning the attribute of car power. The set of notional words concerning the attribute of car power can be obtained using Eq. (2); it is composed of 2194 words, i.e., {(power), (aspect), (a little), (surprise), (OK), (not), (sedan car), (pretty good), (give), (oil), (run), ...}. Then, the weight of each word for distinguishing the semantics of the training samples and the online reviews can be calculated using Eqs. (3) and (4). Furthermore, according to the obtained weight of each word, the feature vectors of training samples and online reviews can be determined. These vectors are shown in Table 2.
(2) Training the SVM classifiers and identifying the sentiment orientations of online reviews using the OVO strategy
Among the existing machine learning algorithms, SVM is considered the one most suitable for identifying the sentiment orientations of online reviews.30–32,35–37
Table 1. Notional words of training samples and online reviews concerning the attribute of car power.

| | | Notional words |
|---|---|---|
| Training samples | $T^j_1$ | (power) (aspect) (a little), (surprise) |
| | ... | ... |
| | $T^j_{1109}$ | (power) (OK) (not) (sedan car) ... |
| Online reviews | $D^j_{i1}$ | (power) (aspect) (not) (OK) ... |
| | ... | ... |
| | $D^j_{i346}$ | (pretty good) (give) (oil) (run) ... |
However, SVM cannot be directly used for classifying the sentiment orientations of online reviews into three categories, i.e., positive, neutral, and negative, because SVM is a binary classifier. Thus, SVM is usually combined with the OVO strategy. In previous studies, several OVO strategies have been proposed, such as the voting strategy38 and the weighted voting strategy.39 In this study, the weighted voting strategy39 is used, and an algorithm based on SVM and the weighted voting strategy is proposed to identify the positive, neutral, and negative sentiment orientations of online reviews. Details of the algorithm are provided below.
According to the feature vectors of training samples $\omega^j_z = (\omega^{j1}_z, \omega^{j2}_z, \ldots, \omega^{jE_j}_z)$, $j = 1, 2, \ldots, m$, $z = 1, 2, \ldots, q^j_t$, three subsets of training samples for distinguishing different pairs of sentiment orientations are constructed. The subset of training samples for discriminating "positive-neutral" (pos-neu) is the union of the training samples with positive and neutral sentiment orientations ($T^j_{\mathrm{pos}} \cup T^j_{\mathrm{neu}}$), that for "positive-negative" (pos-neg) is $T^j_{\mathrm{pos}} \cup T^j_{\mathrm{neg}}$, and that for "neutral-negative" (neu-neg) is $T^j_{\mathrm{neu}} \cup T^j_{\mathrm{neg}}$, $j = 1, 2, \ldots, m$. Then, by training the SVM on each of the three subsets, three SVM classifiers for discriminating different pairs of sentiment orientations are obtained. These classifiers are denoted $\mathrm{SVM}^{\text{pos-neu}}_j$, $\mathrm{SVM}^{\text{pos-neg}}_j$, and $\mathrm{SVM}^{\text{neu-neg}}_j$, $j = 1, 2, \ldots, m$. To identify the sentiment orientation of online review $D^j_{ik}$, its feature vector $\omega^j_{ik} = (\omega^{j1}_{ik}, \omega^{j2}_{ik}, \ldots, \omega^{jE_j}_{ik})$ is input into each of the three SVM classifiers $\mathrm{SVM}^{\text{pos-neu}}_j$, $\mathrm{SVM}^{\text{pos-neg}}_j$, and $\mathrm{SVM}^{\text{neu-neg}}_j$, $i = 1, 2, \ldots, n$, $j = 1, 2, \ldots, m$, $k = 1, 2, \ldots, q_i$. Let $x^{j\#}_{ik} \in \{0, 1\}$ denote the output of the SVM classifier $\mathrm{SVM}^{\#}_j$, where # denotes one of "pos-neu", "pos-neg", or "neu-neg", $j = 1, 2, \ldots, m$. If $x^{j\#}_{ik} = 1$, the result obtained by classifier $\mathrm{SVM}^{\#}_j$ is that the sentiment orientation of online review $D^j_{ik}$ is the first orientation named in #; if $x^{j\#}_{ik} = 0$, the result is the second orientation named in #, $i = 1, 2, \ldots, n$, $j = 1, 2, \ldots, m$, $k = 1, 2, \ldots, q_i$. Specifically, $x^{j,\text{pos-neu}}_{ik} = 1$ denotes that the result obtained by classifier $\mathrm{SVM}^{\text{pos-neu}}_j$ concerning review $D^j_{ik}$ is "pos", $x^{j,\text{pos-neg}}_{ik} = 1$ denotes that the result obtained by $\mathrm{SVM}^{\text{pos-neg}}_j$ concerning review $D^j_{ik}$ is "pos", and $x^{j,\text{neu-neg}}_{ik} = 1$ denotes that the result obtained by $\mathrm{SVM}^{\text{neu-neg}}_j$ is "neu"; otherwise, the results obtained by $\mathrm{SVM}^{\text{pos-neu}}_j$, $\mathrm{SVM}^{\text{pos-neg}}_j$ and
Table 2. Feature vectors of training samples and online reviews concerning the attribute of car power. The columns are the words in the set of notional words concerning the attribute of car power.

| | | (power) | (aspect) | (a little) | (surprise) | (OK) | (not) | (sedan car) | (give) | ... |
|---|---|---|---|---|---|---|---|---|---|---|
| Training samples | $T^j_1$ | 0.39 | 2.98 | 2.81 | 3.90 | 0.00 | 0.00 | 0.00 | 0.00 | ... |
| | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| | $T^j_{1109}$ | 0.39 | 0.00 | 0.00 | 0.00 | 2.34 | 0.85 | 3.34 | 0.00 | ... |
| Online reviews | $D^j_{i1}$ | 0.39 | 2.98 | 0.00 | 0.00 | 0.00 | 0.85 | 0.00 | 0.00 | ... |
| | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| | $D^j_{i346}$ | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 1.36 | ... |
$\mathrm{SVM}^{\text{neu-neg}}_j$ are "neu", "neg", and "neg", respectively, $i = 1, 2, \ldots, n$, $j = 1, 2, \ldots, m$, $k = 1, 2, \ldots, q_i$. Thus, the three SVM classifiers can give different results for the sentiment orientation of review $D^j_{ik}$. Therefore, to determine the final sentiment orientation of online review $D^j_{ik}$, the results obtained by the three SVM classifiers must be integrated. Let $S^j_{ik}$ represent the sentiment orientation of review $D_{ik}$ on attribute $f_j$, $i = 1, 2, \ldots, n$, $j = 1, 2, \ldots, m$, $k = 1, 2, \ldots, q_i$. For convenience of analysis and further calculation, $S^j_{ik}$ is represented by an indicator vector, i.e., $S^j_{ik} = (s^{j,\mathrm{pos}}_{ik}, s^{j,\mathrm{neu}}_{ik}, s^{j,\mathrm{neg}}_{ik})$, where $s^{j,\mathrm{pos}}_{ik}$, $s^{j,\mathrm{neu}}_{ik}$, and $s^{j,\mathrm{neg}}_{ik}$ are indicator variables for positive, neutral, and negative sentiment orientations, respectively, $s^{j,\mathrm{pos}}_{ik}, s^{j,\mathrm{neu}}_{ik}, s^{j,\mathrm{neg}}_{ik} \in \{0, 1\}$, $s^{j,\mathrm{pos}}_{ik} + s^{j,\mathrm{neu}}_{ik} + s^{j,\mathrm{neg}}_{ik} \le 1$, $i = 1, 2, \ldots, n$, $j = 1, 2, \ldots, m$, $k = 1, 2, \ldots, q_i$. Specifically, $S^j_{ik} = (0, 0, 0)$ denotes $D^j_{ik} = \emptyset$; $S^j_{ik} = (1, 0, 0)$, $S^j_{ik} = (0, 1, 0)$, and $S^j_{ik} = (0, 0, 1)$ denote that review $D^j_{ik}$ represents the positive, neutral, and negative sentiment orientation, respectively. In this paper, the weighted voting strategy39 is used to integrate the results obtained by the three SVM classifiers ($x^{j,\text{pos-neu}}_{ik}$, $x^{j,\text{pos-neg}}_{ik}$, and $x^{j,\text{neu-neg}}_{ik}$) and determine $S^j_{ik}$, $i = 1, 2, \ldots, n$, $j = 1, 2, \ldots, m$, $k = 1, 2, \ldots, q_i$; a detailed description is provided in Appendix A.
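The pairwise structure described above can be sketched as follows. Note the swap: this sketch uses plain majority voting, not the weighted voting strategy of Appendix A, which is not reproduced here. The names `PAIRS`, `pairwise_subsets`, and `majority_vote`, and the choice to return `None` on a three-way tie, are illustrative assumptions.

```python
from collections import Counter

# For each pairwise classifier: (orientation for output 1, orientation for output 0).
PAIRS = {
    "pos-neu": ("pos", "neu"),
    "pos-neg": ("pos", "neg"),
    "neu-neg": ("neu", "neg"),
}


def pairwise_subsets(samples):
    """Split labeled samples (feature_vector, label) into the three OVO training subsets."""
    return {
        name: [(x, y) for x, y in samples if y in labels]
        for name, labels in PAIRS.items()
    }


def majority_vote(outputs):
    """Combine the three binary outputs into an indicator vector (pos, neu, neg).

    outputs: dict such as {"pos-neu": 1, "pos-neg": 1, "neu-neg": 0}.
    Returns None when each orientation receives exactly one vote (assumed tie rule).
    """
    votes = Counter(
        PAIRS[name][0] if x == 1 else PAIRS[name][1]
        for name, x in outputs.items()
    )
    label, count = votes.most_common(1)[0]
    if count < 2:
        return None
    return tuple(int(label == c) for c in ("pos", "neu", "neg"))


# Toy data (hypothetical feature vectors and labels).
labeled = [([0.1], "pos"), ([0.2], "neu"), ([0.3], "neg")]
subsets = pairwise_subsets(labeled)
clear_case = majority_vote({"pos-neu": 1, "pos-neg": 1, "neu-neg": 0})
tie_case = majority_vote({"pos-neu": 1, "pos-neg": 0, "neu-neg": 1})
```

The tie case shows why plain voting is not always decisive with three pairwise classifiers, which is one motivation for a weighted scheme.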
4.2. Ranking alternative products based on interval-valued
intuitionistic fuzzy TOPSIS
In the study of Liu et al.,15 a review with positive sentiment orientation is considered a vote in support, a review with neutral sentiment orientation is considered hesitation, and a review with negative sentiment orientation is considered a vote in opposition. Then, based on the identified sentiment orientations and the physical interpretation of intuitionistic fuzzy numbers,18–21 an intuitionistic fuzzy number is built to represent the performance of an alternative product concerning a product attribute. Although intuitionistic fuzzy numbers reflect the percentages of online reviews with different sentiment orientations, they do not account for the numbers of online reviews of different products crawled from the website. In fact, these numbers can differ greatly among products, and the number of online reviews affects the confidence of the decision data refined from those reviews. Thus, to eliminate the effects of different numbers of online reviews on the confidence of the constructed intuitionistic fuzzy numbers, the crisp percentages of positive and negative sentiment orientations can be replaced by confidence intervals. Therefore, an interval-valued intuitionistic fuzzy number can be constructed to reflect the performance of an alternative product with respect to a product attribute. Then, based on the obtained interval-valued intuitionistic fuzzy numbers and the weights of product attributes, the ranking of the alternative products can be determined by the interval-valued intuitionistic fuzzy TOPSIS method.23,24 A detailed description is provided below.
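Let $q^{\mathrm{pos}}_{ij}$, $q^{\mathrm{neu}}_{ij}$, and $q^{\mathrm{neg}}_{ij}$ denote the numbers of reviews of alternative product $A_i$ concerning attribute $f_j$ with positive, neutral, and negative sentiment orientations, respectively, $i = 1, 2, \ldots, n$, $j = 1, 2, \ldots, m$. With $s^{j,\mathrm{pos}}_{ik}$, $s^{j,\mathrm{neu}}_{ik}$, and $s^{j,\mathrm{neg}}_{ik}$ denoting the components of the indicator vector $S^j_{ik}$, the values of $q^{\mathrm{pos}}_{ij}$, $q^{\mathrm{neu}}_{ij}$, and $q^{\mathrm{neg}}_{ij}$ can be calculated by the following Eqs. (5)–(7), respectively, i.e.,

$$q^{\mathrm{pos}}_{ij} = \sum_{k=1}^{q_i} s^{j,\mathrm{pos}}_{ik}, \quad i = 1, 2, \ldots, n; \; j = 1, 2, \ldots, m, \tag{5}$$

$$q^{\mathrm{neu}}_{ij} = \sum_{k=1}^{q_i} s^{j,\mathrm{neu}}_{ik}, \quad i = 1, 2, \ldots, n; \; j = 1, 2, \ldots, m, \tag{6}$$

$$q^{\mathrm{neg}}_{ij} = \sum_{k=1}^{q_i} s^{j,\mathrm{neg}}_{ik}, \quad i = 1, 2, \ldots, n; \; j = 1, 2, \ldots, m. \tag{7}$$

Let $\mu_{ij}$ and $\nu_{ij}$ denote the percentages of support and opposition degrees of alternative product $A_i$ concerning attribute $f_j$, respectively, $i = 1, 2, \ldots, n$, $j = 1, 2, \ldots, m$. According to the obtained $q^{\mathrm{pos}}_{ij}$, $q^{\mathrm{neu}}_{ij}$, and $q^{\mathrm{neg}}_{ij}$, the values of $\mu_{ij}$ and $\nu_{ij}$ can be calculated by the following Eqs. (8) and (9), respectively, i.e.,

$$\mu_{ij} = \frac{q^{\mathrm{pos}}_{ij}}{q^{\mathrm{pos}}_{ij} + q^{\mathrm{neu}}_{ij} + q^{\mathrm{neg}}_{ij}}, \quad i = 1, 2, \ldots, n; \; j = 1, 2, \ldots, m, \tag{8}$$

$$\nu_{ij} = \frac{q^{\mathrm{neg}}_{ij}}{q^{\mathrm{pos}}_{ij} + q^{\mathrm{neu}}_{ij} + q^{\mathrm{neg}}_{ij}}, \quad i = 1, 2, \ldots, n; \; j = 1, 2, \ldots, m. \tag{9}$$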
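A short sketch of Eqs. (5)–(9) follows: it counts indicator vectors and converts the counts into support and opposition percentages. The function names and the toy indicator list are illustrative assumptions; the counts 4932, 57, and 14 are the Table 3 entries for car A6 on attribute f1.

```python
def counts_from_indicators(indicators):
    """Eqs. (5)-(7): sum the (pos, neu, neg) indicator vectors of one product and attribute."""
    q_pos = sum(s[0] for s in indicators)
    q_neu = sum(s[1] for s in indicators)
    q_neg = sum(s[2] for s in indicators)
    return q_pos, q_neu, q_neg


def support_opposition(q_pos, q_neu, q_neg):
    """Eqs. (8)-(9): percentages of support (mu) and opposition (nu)."""
    total = q_pos + q_neu + q_neg
    if total == 0:
        raise ValueError("no non-empty reviews for this product and attribute")
    return q_pos / total, q_neg / total


# Toy indicator vectors (hypothetical); (0, 0, 0) marks an empty review and adds nothing.
toy_counts = counts_from_indicators([(1, 0, 0), (0, 1, 0), (1, 0, 0), (0, 0, 0), (0, 0, 1)])

# Counts of car A6 on attribute f1 as listed in Table 3.
mu, nu = support_opposition(4932, 57, 14)
```

Neutral reviews enter only through the denominator, which is how they act as hesitation rather than as support or opposition.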
Note that because the numbers of online reviews crawled from the website can differ greatly among products, the confidences of the values of $\mu_{ij}$ and $\nu_{ij}$ can differ greatly as well, $i = 1, 2, \ldots, n$, $j = 1, 2, \ldots, m$. Thus, to eliminate the effects of different numbers of online reviews on the confidences of the values of $\mu_{ij}$ and $\nu_{ij}$, confidence intervals $[\mu^L_{ij}, \mu^U_{ij}]$ and $[\nu^L_{ij}, \nu^U_{ij}]$ are, respectively, used to replace the crisp values of $\mu_{ij}$ and $\nu_{ij}$, $i = 1, 2, \ldots, n$, $j = 1, 2, \ldots, m$. Based on the calculation formula of confidence interval estimation of the binomial distribution,24 $\mu^L_{ij}$, $\mu^U_{ij}$, $\nu^L_{ij}$, and $\nu^U_{ij}$ can be calculated by the following Eqs. (10)–(13), respectively, i.e.,

$$\mu^L_{ij} = \mu_{ij} - z_{\alpha/2} \sqrt{\frac{\mu_{ij}(1 - \mu_{ij})}{q^{\mathrm{pos}}_{ij} + q^{\mathrm{neu}}_{ij} + q^{\mathrm{neg}}_{ij}}}, \quad i = 1, 2, \ldots, n; \; j = 1, 2, \ldots, m, \tag{10}$$

$$\mu^U_{ij} = \mu_{ij} + z_{\alpha/2} \sqrt{\frac{\mu_{ij}(1 - \mu_{ij})}{q^{\mathrm{pos}}_{ij} + q^{\mathrm{neu}}_{ij} + q^{\mathrm{neg}}_{ij}}}, \quad i = 1, 2, \ldots, n; \; j = 1, 2, \ldots, m, \tag{11}$$

$$\nu^L_{ij} = \nu_{ij} - z_{\alpha/2} \sqrt{\frac{\nu_{ij}(1 - \nu_{ij})}{q^{\mathrm{pos}}_{ij} + q^{\mathrm{neu}}_{ij} + q^{\mathrm{neg}}_{ij}}}, \quad i = 1, 2, \ldots, n; \; j = 1, 2, \ldots, m, \tag{12}$$
$$\nu^U_{ij} = \nu_{ij} + z_{\alpha/2} \sqrt{\frac{\nu_{ij}(1 - \nu_{ij})}{q^{\mathrm{pos}}_{ij} + q^{\mathrm{neu}}_{ij} + q^{\mathrm{neg}}_{ij}}}, \quad i = 1, 2, \ldots, n; \; j = 1, 2, \ldots, m, \tag{13}$$

where $\alpha$ is the significance level ($1 - \alpha$ is the confidence level), $z_{\alpha/2}$ is the parameter corresponding to significance level $\alpha$, and $z_{\alpha/2}$ can be determined by referencing the table of the normal distribution. For example, if the confidence level is 0.95, i.e., $\alpha = 0.05$ and $1 - \alpha = 0.95$, then we have $z_{0.05/2} = 1.96$ by referencing the table of the normal distribution. Note that $\mu^U_{ij}$ and $\nu^U_{ij}$ obtained by Eqs. (11) and (13) might not satisfy the requirement of an interval-valued intuitionistic fuzzy number that $\mu^U_{ij} + \nu^U_{ij} \le 1$. Thus, $\mu^L_{ij}$, $\mu^U_{ij}$, $\nu^L_{ij}$, and $\nu^U_{ij}$ are normalized by the following Eqs. (14)–(17), respectively, i.e.,

$$\tilde{\mu}^L_{ij} = \frac{\mu^L_{ij}}{\max(\mu^U_{ij} + \nu^U_{ij})}, \quad i = 1, 2, \ldots, n; \; j = 1, 2, \ldots, m, \tag{14}$$

$$\tilde{\mu}^U_{ij} = \frac{\mu^U_{ij}}{\max(\mu^U_{ij} + \nu^U_{ij})}, \quad i = 1, 2, \ldots, n; \; j = 1, 2, \ldots, m, \tag{15}$$

$$\tilde{\nu}^L_{ij} = \frac{\nu^L_{ij}}{\max(\mu^U_{ij} + \nu^U_{ij})}, \quad i = 1, 2, \ldots, n; \; j = 1, 2, \ldots, m, \tag{16}$$

$$\tilde{\nu}^U_{ij} = \frac{\nu^U_{ij}}{\max(\mu^U_{ij} + \nu^U_{ij})}, \quad i = 1, 2, \ldots, n; \; j = 1, 2, \ldots, m. \tag{17}$$

Based on the $\tilde{\mu}^L_{ij}$, $\tilde{\mu}^U_{ij}$, $\tilde{\nu}^L_{ij}$, and $\tilde{\nu}^U_{ij}$ obtained by Eqs. (14)–(17), an interval-valued intuitionistic fuzzy number can be constructed to represent the performance of alternative $A_i$ concerning product attribute $f_j$, i.e., $r_{ij} = \{[\tilde{\mu}^L_{ij}, \tilde{\mu}^U_{ij}], [\tilde{\nu}^L_{ij}, \tilde{\nu}^U_{ij}]\}$, $i = 1, 2, \ldots, n$, $j = 1, 2, \ldots, m$.
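The construction in Eqs. (10)–(17) can be sketched as below. Two points are assumptions rather than statements from the text: the maximum in Eqs. (14)–(17) is taken over all product and attribute pairs supplied, and lower bounds are not clipped at zero. The function names are hypothetical; the counts are the Table 3 entries for (A6, f1) and (A10, f3).

```python
import math


def confidence_interval(p, n, z=1.96):
    """Eqs. (10)-(13): normal-approximation interval for a binomial proportion."""
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width


def ivif_numbers(counts, z=1.96):
    """counts: dict key -> (q_pos, q_neu, q_neg).

    Returns dict key -> ((mu_L, mu_U), (nu_L, nu_U)) after the scaling of Eqs. (14)-(17).
    """
    raw = {}
    for key, (q_pos, q_neu, q_neg) in counts.items():
        n = q_pos + q_neu + q_neg
        raw[key] = (confidence_interval(q_pos / n, n, z),
                    confidence_interval(q_neg / n, n, z))
    # Assumption: one common divisor, the largest mu_U + nu_U over all keys.
    scale = max(mu[1] + nu[1] for mu, nu in raw.values())
    return {
        key: (tuple(v / scale for v in mu), tuple(v / scale for v in nu))
        for key, (mu, nu) in raw.items()
    }


# Two entries taken from Table 3.
table3_subset = {("A6", "f1"): (4932, 57, 14), ("A10", "f3"): (4694, 38, 7)}
numbers = ivif_numbers(table3_subset)
(mu_l, mu_u), (nu_l, nu_u) = numbers[("A6", "f1")]
```

With these two entries the result for (A6, f1) is approximately ([0.9866, 0.9932], [0.0013, 0.0043]), and after scaling the largest value of the upper membership bound plus the upper non-membership bound equals 1 by construction.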
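Then, according to the obtained $r_{ij} = \{[\tilde{\mu}^L_{ij}, \tilde{\mu}^U_{ij}], [\tilde{\nu}^L_{ij}, \tilde{\nu}^U_{ij}]\}$, $i = 1, 2, \ldots, n$, $j = 1, 2, \ldots, m$, the ideal and anti-ideal products can be defined, i.e., $A^+ = (r^+_1, r^+_2, \ldots, r^+_m)$ and $A^- = (r^-_1, r^-_2, \ldots, r^-_m)$, where $r^+_j$ and $r^-_j$ represent the performance of the ideal product and the anti-ideal product concerning attribute $f_j$, $j = 1, 2, \ldots, m$. According to the idea of TOPSIS,40,41 $r^+_j$ and $r^-_j$ can be represented by the following Eqs. (18) and (19), i.e.,

$$r^+_j = \left( \left[ \max_i \tilde{\mu}^L_{ij}, \max_i \tilde{\mu}^U_{ij} \right], \left[ \min_i \tilde{\nu}^L_{ij}, \min_i \tilde{\nu}^U_{ij} \right] \right), \quad i = 1, 2, \ldots, n; \; j = 1, 2, \ldots, m, \tag{18}$$

$$r^-_j = \left( \left[ \min_i \tilde{\mu}^L_{ij}, \min_i \tilde{\mu}^U_{ij} \right], \left[ \max_i \tilde{\nu}^L_{ij}, \max_i \tilde{\nu}^U_{ij} \right] \right), \quad i = 1, 2, \ldots, n; \; j = 1, 2, \ldots, m. \tag{19}$$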
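Eqs. (18) and (19) amount to component-wise maxima and minima over the products for one attribute, which the sketch below makes explicit. The function name is a hypothetical choice, and the three input numbers are the Table 4 entries of cars A1, A6, and A10 on attribute f1, used here only as sample data.

```python
def ideal_and_anti_ideal(column):
    """Eqs. (18)-(19) for one attribute.

    column: list of ((mu_L, mu_U), (nu_L, nu_U)), one entry per product.
    """
    mu_l = [r[0][0] for r in column]
    mu_u = [r[0][1] for r in column]
    nu_l = [r[1][0] for r in column]
    nu_u = [r[1][1] for r in column]
    ideal = ((max(mu_l), max(mu_u)), (min(nu_l), min(nu_u)))
    anti_ideal = ((min(mu_l), min(mu_u)), (max(nu_l), max(nu_u)))
    return ideal, anti_ideal


# Table 4 entries for A1, A6, and A10 on attribute f1.
f1_column = [
    ((0.8869, 0.9011), (0.0081, 0.0128)),
    ((0.9825, 0.9891), (0.0013, 0.0043)),
    ((0.6760, 0.7024), (0.0359, 0.0473)),
]
ideal_f1, anti_ideal_f1 = ideal_and_anti_ideal(f1_column)
```

Because each bound is taken separately, the ideal number need not coincide with any single product, although in this sample it matches A6.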
Furthermore, let $d^+_i$ and $d^-_i$ denote the distances of alternative product $A_i$ from the ideal product $A^+ = (r^+_1, r^+_2, \ldots, r^+_m)$ and the anti-ideal product $A^- = (r^-_1, r^-_2, \ldots, r^-_m)$, respectively. Based on the idea in the studies of Xu23 and Ye,24 $d^+_i$ and
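$d^-_i$ can be, respectively, calculated by the following Eqs. (20) and (21), i.e.,

$$d^+_i = \left[ \frac{1}{2} \sum_{j=1}^{m} w_j \left( \left( \tilde{\mu}^L_{ij} - \max_i \tilde{\mu}^L_{ij} \right)^2 + \left( \tilde{\mu}^U_{ij} - \max_i \tilde{\mu}^U_{ij} \right)^2 + \left( \tilde{\nu}^L_{ij} - \min_i \tilde{\nu}^L_{ij} \right)^2 + \left( \tilde{\nu}^U_{ij} - \min_i \tilde{\nu}^U_{ij} \right)^2 + \left( \pi^L_{ij} - \pi^{L+}_j \right)^2 + \left( \pi^U_{ij} - \pi^{U+}_j \right)^2 \right) \right]^{1/2}, \quad i = 1, 2, \ldots, n, \tag{20}$$

$$d^-_i = \left[ \frac{1}{2} \sum_{j=1}^{m} w_j \left( \left( \tilde{\mu}^L_{ij} - \min_i \tilde{\mu}^L_{ij} \right)^2 + \left( \tilde{\mu}^U_{ij} - \min_i \tilde{\mu}^U_{ij} \right)^2 + \left( \tilde{\nu}^L_{ij} - \max_i \tilde{\nu}^L_{ij} \right)^2 + \left( \tilde{\nu}^U_{ij} - \max_i \tilde{\nu}^U_{ij} \right)^2 + \left( \pi^L_{ij} - \pi^{L-}_j \right)^2 + \left( \pi^U_{ij} - \pi^{U-}_j \right)^2 \right) \right]^{1/2}, \quad i = 1, 2, \ldots, n, \tag{21}$$

where $\pi^L_{ij} = 1 - \tilde{\mu}^U_{ij} - \tilde{\nu}^U_{ij}$, $\pi^U_{ij} = 1 - \tilde{\mu}^L_{ij} - \tilde{\nu}^L_{ij}$, $\pi^{L+}_j = 1 - \max_i \tilde{\mu}^U_{ij} - \min_i \tilde{\nu}^U_{ij}$, $\pi^{U+}_j = 1 - \max_i \tilde{\mu}^L_{ij} - \min_i \tilde{\nu}^L_{ij}$, $\pi^{L-}_j = 1 - \min_i \tilde{\mu}^U_{ij} - \max_i \tilde{\nu}^U_{ij}$, $\pi^{U-}_j = 1 - \min_i \tilde{\mu}^L_{ij} - \max_i \tilde{\nu}^L_{ij}$, and $w_j$ is the weight of product attribute $f_j$ provided by the consumer, $i = 1, 2, \ldots, n$; $j = 1, 2, \ldots, m$.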
Moreover, based on $d^+_i$ and $d^-_i$, the closeness coefficient of alternative product $A_i$ can be calculated, i.e.,

$$C_i = \frac{d^-_i}{d^-_i + d^+_i}, \quad i = 1, 2, \ldots, n. \tag{22}$$

Obviously, the closer alternative product $A_i$ is to the ideal product and the farther it is from the anti-ideal product, the greater $C_i$ is and the more preferable $A_i$ is. Therefore, the ranking of the alternative products can be determined in descending order of their closeness coefficients.
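The distance and closeness computations can be sketched as follows, assuming the weighted Euclidean form with the factor 1/2 inside the square root and with hesitancy bounds defined as one minus the corresponding membership and non-membership bounds. The function names and the two-product example are illustrative assumptions, not data from the case study.

```python
import math


def hesitancy(r):
    """Return (pi_L, pi_U) of an interval-valued intuitionistic fuzzy number."""
    (mu_l, mu_u), (nu_l, nu_u) = r
    return 1 - mu_u - nu_u, 1 - mu_l - nu_l


def distance(row, reference, weights):
    """Weighted distance between one product's row and a reference row (ideal or anti-ideal)."""
    total = 0.0
    for r, ref, w in zip(row, reference, weights):
        (ml, mu), (nl, nu) = r
        (rml, rmu), (rnl, rnu) = ref
        pl, pu = hesitancy(r)
        rpl, rpu = hesitancy(ref)
        total += w * ((ml - rml) ** 2 + (mu - rmu) ** 2 + (nl - rnl) ** 2
                      + (nu - rnu) ** 2 + (pl - rpl) ** 2 + (pu - rpu) ** 2)
    return math.sqrt(0.5 * total)


def closeness(matrix, weights):
    """matrix: one row per product, one fuzzy number per attribute. Returns C_i per product.

    Note: if all products are identical, both distances are zero and C_i is undefined.
    """
    columns = list(zip(*matrix))
    ideal = [((max(r[0][0] for r in c), max(r[0][1] for r in c)),
              (min(r[1][0] for r in c), min(r[1][1] for r in c))) for c in columns]
    anti = [((min(r[0][0] for r in c), min(r[0][1] for r in c)),
             (max(r[1][0] for r in c), max(r[1][1] for r in c))) for c in columns]
    result = []
    for row in matrix:
        d_plus = distance(row, ideal, weights)
        d_minus = distance(row, anti, weights)
        result.append(d_minus / (d_minus + d_plus))
    return result


# Hypothetical example: product 0 dominates product 1 on a single attribute.
toy_matrix = [
    [((0.8, 0.9), (0.05, 0.1))],
    [((0.5, 0.6), (0.2, 0.3))],
]
toy_closeness = closeness(toy_matrix, [1.0])
```

In this example the first product coincides with the ideal product and the second with the anti-ideal product, so the closeness coefficients take their extreme values 1 and 0.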
In summary, the proposed method for ranking products through online reviews is provided below.
Step 1. Input the online review $D_{ik} = (D^1_{ik}, D^2_{ik}, \ldots, D^m_{ik})$ into ICTCLAS 2016 and obtain the output $D'_{ik} = (D'^1_{ik}, D'^2_{ik}, \ldots, D'^m_{ik})$, $i = 1, 2, \ldots, n$, $j = 1, 2, \ldots, m$, $k = 1, 2, \ldots, q_i$. Then, based on the stop word list (see: http://www.datatang.com/data/19300), the stop words are deleted from $D'_{ik} = (D'^1_{ik}, D'^2_{ik}, \ldots, D'^m_{ik})$ and the vector of notional words $W_{ik} = (W^1_{ik}, W^2_{ik}, \ldots, W^m_{ik})$ can be obtained, $i = 1, 2, \ldots, n$, $k = 1, 2, \ldots, q_i$.
Step 2. Determine the feature vectors of online reviews and training samples using Eqs. (1)–(4), i.e., $\omega^j_{ik} = (\omega^{j1}_{ik}, \omega^{j2}_{ik}, \ldots, \omega^{jE_j}_{ik})$ and $\omega^j_z = (\omega^{j1}_z, \omega^{j2}_z, \ldots, \omega^{jE_j}_z)$, $i = 1, 2, \ldots, n$, $j = 1, 2, \ldots, m$, $k = 1, 2, \ldots, q_i$, $z = 1, 2, \ldots, q^j_t$.
Step 3. Based on the training samples concerning attribute $f_j$, construct the subsets of training samples for "positive-neutral" $T^j_{\mathrm{pos}} \cup T^j_{\mathrm{neu}}$, "positive-negative"
$T^j_{\mathrm{pos}} \cup T^j_{\mathrm{neg}}$, and "neutral-negative" $T^j_{\mathrm{neu}} \cup T^j_{\mathrm{neg}}$. Then, by training the SVM using the three subsets, three sentiment classifiers can be obtained with respect to product attribute $f_j$, i.e., $\mathrm{SVM}^{\text{pos-neu}}_j$, $\mathrm{SVM}^{\text{pos-neg}}_j$, and $\mathrm{SVM}^{\text{neu-neg}}_j$, $j = 1, 2, \ldots, m$.
Step 4. Identify the sentiment orientation of online review $D^j_{ik}$ using $\mathrm{SVM}^{\text{pos-neu}}_j$, $\mathrm{SVM}^{\text{pos-neg}}_j$, and $\mathrm{SVM}^{\text{neu-neg}}_j$, respectively, and obtain the results $x^{j,\text{pos-neu}}_{ik}$, $x^{j,\text{pos-neg}}_{ik}$, and $x^{j,\text{neu-neg}}_{ik}$, $i = 1, 2, \ldots, n$, $j = 1, 2, \ldots, m$, $k = 1, 2, \ldots, q_i$.
Step 5. Based on the results $x^{j,\text{pos-neu}}_{ik}$, $x^{j,\text{pos-neg}}_{ik}$, and $x^{j,\text{neu-neg}}_{ik}$, determine the indicator vector $S^j_{ik} = (s^{j,\mathrm{pos}}_{ik}, s^{j,\mathrm{neu}}_{ik}, s^{j,\mathrm{neg}}_{ik})$ by the weighted voting strategy shown in Appendix A, $i = 1, 2, \ldots, n$, $j = 1, 2, \ldots, m$, $k = 1, 2, \ldots, q_i$.
Step 6. Based on the indicator vector $S^j_{ik} = (s^{j,\mathrm{pos}}_{ik}, s^{j,\mathrm{neu}}_{ik}, s^{j,\mathrm{neg}}_{ik})$, construct the interval-valued intuitionistic fuzzy number $r_{ij} = \{[\tilde{\mu}^L_{ij}, \tilde{\mu}^U_{ij}], [\tilde{\nu}^L_{ij}, \tilde{\nu}^U_{ij}]\}$ to represent the performance of alternative product $A_i$ with respect to product attribute $f_j$, using Eqs. (5)–(17), $i = 1, 2, \ldots, n$, $j = 1, 2, \ldots, m$.
Step 7. Determine the ideal product $A^+ = (r^+_1, r^+_2, \ldots, r^+_m)$ and the anti-ideal product $A^- = (r^-_1, r^-_2, \ldots, r^-_m)$ using Eqs. (18) and (19).
Step 8. Calculate the closeness coefficient $C_i$ using Eqs. (20)–(22), $i = 1, 2, \ldots, n$, and determine the ranking of the alternative products in descending order of the closeness coefficients.
5. Case Study
Consider a consumer who wants to buy a medium-sized car. After a preliminary investigation, 11 alternative cars are identified, i.e., Bora ($A_1$), Golf ($A_2$), Corolla ($A_3$), Cruze ($A_4$), Long Yat ($A_5$), Mazda ($A_6$), Octavia ($A_7$), Sega ($A_8$), Jetta ($A_9$), Sylphy ($A_{10}$), and Citroen ($A_{11}$). The 11 alternative cars are all acceptable, but the consumer is not sure which one is the best because of limited knowledge and expertise. To select a desirable car, the consumer is concerned with the following five attributes of cars: controllability ($f_1$), oil consumption ($f_2$), space ($f_3$), power ($f_4$), and cost performance ($f_5$). The consumer provides the vector of weights of the five attributes, i.e., $w = (0.2, 0.3, 0.2, 0.2, 0.1)$. To support the consumer purchase decision, the method proposed in this paper is used. The computation processes and results are presented below.
Locoy Spider software (http://www.locoy.com/) is used to crawl online reviews of the 11 cars concerning the five attributes from Automobile home (http://www.autohome.com.cn/). The obtained online reviews are expressed by $D_{ik} = (D^1_{ik}, D^2_{ik}, D^3_{ik}, D^4_{ik}, D^5_{ik})$, $i = 1, 2, \ldots, 11$, $k = 1, 2, \ldots, q_i$, $q_1 = 7244$, $q_2 = 7748$, $q_3 = 5964$, $q_4 = 9882$, $q_5 = 5252$, $q_6 = 5003$, $q_7 = 7700$, $q_8 = 5431$, $q_9 = 8466$, $q_{10} = 4739$, $q_{11} = 3572$. The training samples concerning each attribute are crawled from the same website; the sentiment orientations of the training samples are labeled beforehand. The number of training samples concerning each attribute is
$q^1_t = 1119$, $q^2_t = 1229$, $q^3_t = 1128$, $q^4_t = 1109$, and $q^5_t = 1158$. Based on Eqs. (1)–(4), the feature vectors of online reviews and training samples are determined.
Then, based on the training samples concerning attribute $f_j$, the subsets of training samples for "positive-neutral", "positive-negative", and "neutral-negative" are constructed, and three SVM classifiers $\mathrm{SVM}^{\text{pos-neu}}_j$, $\mathrm{SVM}^{\text{pos-neg}}_j$, and $\mathrm{SVM}^{\text{neu-neg}}_j$ are obtained by training on the three subsets of training samples, respectively, $j = 1, 2, 3, 4, 5$. When training the SVM classifiers, the parameter settings are the following: cost parameter = 1.0, tolerance parameter = 0.001, kernel type = radial basis function, and degree = 3. Using the three SVM classifiers, the results concerning online review $D^j_{ik}$ are obtained and are denoted $x^{j,\text{pos-neu}}_{ik}$, $x^{j,\text{pos-neg}}_{ik}$, and $x^{j,\text{neu-neg}}_{ik}$, $i = 1, 2, \ldots, 11$, $j = 1, 2, 3, 4, 5$, $k = 1, 2, \ldots, q_i$. Furthermore, based on the obtained results, the indicator vector $S^j_{ik} = (s^{j,\mathrm{pos}}_{ik}, s^{j,\mathrm{neu}}_{ik}, s^{j,\mathrm{neg}}_{ik})$ of review $D^j_{ik}$ is determined using the weighted voting strategy. Based on Eqs. (5)–(7), the values of $q^{\mathrm{pos}}_{ij}$, $q^{\mathrm{neu}}_{ij}$, and $q^{\mathrm{neg}}_{ij}$ are obtained, $i = 1, 2, \ldots, 11$, $j = 1, 2, 3, 4, 5$. These values are shown in Table 3. The interval-valued intuitionistic fuzzy number representing the performance of alternative car $A_i$ concerning car attribute $f_j$ is constructed (here $\alpha = 0.05$) using Eqs. (8)–(17), $i = 1, 2, \ldots, 11$, $j = 1, 2, 3, 4, 5$. The constructed interval-valued intuitionistic fuzzy numbers are shown in Table 4. Based on Eqs. (18) and (19), the ideal car and anti-ideal car can be defined, i.e., $A^+ = (r^+_1, r^+_2, r^+_3, r^+_4, r^+_5)$ and $A^- = (r^-_1, r^-_2, r^-_3, r^-_4, r^-_5)$, where $r^+_1 = ([0.9866, 0.9932], [0.0013, 0.0043])$, $r^+_2 = ([0.9317, 0.9454], [0.0081, 0.0139])$, $r^+_3 = ([0.9919, 0.9974], [0.0003, 0.0026])$, $r^+_4 = ([0.8531, 0.8688], [0.0032, 0.0081])$, $r^+_5 = ([0.9241, 0.9358], [0.0081, 0.0136])$; $r^-_1 = ([0.6788, 0.7053], [0.0360, 0.0475])$, $r^-_2 = ([0.5208, 0.5475], [0.1269, 0.1452])$,
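Table 3. Values of $q^{\mathrm{pos}}_{ij}$, $q^{\mathrm{neu}}_{ij}$, and $q^{\mathrm{neg}}_{ij}$, $i = 1, 2, \ldots, 11$, $j = 1, 2, 3, 4, 5$.

| | $q^{\mathrm{pos}}_{i1}$ | $q^{\mathrm{neu}}_{i1}$ | $q^{\mathrm{neg}}_{i1}$ | $q^{\mathrm{pos}}_{i2}$ | $q^{\mathrm{neu}}_{i2}$ | $q^{\mathrm{neg}}_{i2}$ | $q^{\mathrm{pos}}_{i3}$ | $q^{\mathrm{neu}}_{i3}$ | $q^{\mathrm{neg}}_{i3}$ | $q^{\mathrm{pos}}_{i4}$ | $q^{\mathrm{neu}}_{i4}$ | $q^{\mathrm{neg}}_{i4}$ | $q^{\mathrm{pos}}_{i5}$ | $q^{\mathrm{neu}}_{i5}$ | $q^{\mathrm{neg}}_{i5}$ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| $A_1$ | 6476 | 692 | 76 | 6216 | 848 | 180 | 6258 | 879 | 107 | 4837 | 2127 | 280 | 6237 | 836 | 171 |
| $A_2$ | 7502 | 222 | 24 | 6931 | 684 | 133 | 6596 | 1044 | 108 | 6643 | 973 | 132 | 5759 | 1661 | 328 |
| $A_3$ | 5092 | 783 | 89 | 5334 | 523 | 107 | 5451 | 450 | 63 | 5022 | 883 | 59 | 5196 | 646 | 122 |
| $A_4$ | 9267 | 551 | 64 | 5813 | 3063 | 1006 | 8029 | 1640 | 213 | 5145 | 3955 | 782 | 8507 | 1192 | 183 |
| $A_5$ | 4502 | 654 | 96 | 4746 | 427 | 79 | 4910 | 295 | 47 | 3627 | 1405 | 220 | 4268 | 814 | 170 |
| $A_6$ | 4932 | 57 | 14 | 4676 | 272 | 55 | 3399 | 1405 | 199 | 4148 | 778 | 77 | 4034 | 845 | 124 |
| $A_7$ | 7279 | 377 | 44 | 6992 | 613 | 95 | 7357 | 324 | 19 | 5041 | 2399 | 260 | 7131 | 483 | 86 |
| $A_8$ | 5260 | 144 | 27 | 2889 | 1806 | 736 | 4423 | 894 | 114 | 3846 | 1437 | 148 | 5012 | 360 | 59 |
| $A_9$ | 7887 | 462 | 117 | 7188 | 1044 | 234 | 7758 | 614 | 94 | 6207 | 1861 | 398 | 5433 | 2295 | 738 |
| $A_{10}$ | 3266 | 1276 | 197 | 4365 | 297 | 77 | 4694 | 38 | 7 | 2814 | 1710 | 215 | 4215 | 457 | 67 |
| $A_{11}$ | 3414 | 129 | 29 | 2170 | 1092 | 310 | 3526 | 40 | 6 | 3415 | 137 | 20 | 3220 | 289 | 63 |

Columns with second subscript 1 to 5 correspond to attributes $f_1$ to $f_5$, respectively.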
Table 4. Interval-valued intuitionistic fuzzy number of alternative car $A_i$ concerning $f_j$, $i = 1, 2, \ldots, 11$, $j = 1, 2, 3, 4, 5$.

| | $f_1$ | $f_2$ | $f_3$ | $f_4$ | $f_5$ |
|---|---|---|---|---|---|
| $A_1$ | ([0.8869, 0.9011], [0.0081, 0.0128]) | ([0.8501, 0.8661], [0.0213, 0.0284]) | ([0.8560, 0.8718], [0.0120, 0.0175]) | ([0.6569, 0.6786], [0.0342, 0.0431]) | ([0.8530, 0.8690], [0.0201, 0.0271]) |
| $A_2$ | ([0.9643, 0.9722], [0.0019, 0.0043]) | ([0.8877, 0.9014], [0.0143, 0.0201]) | ([0.8434, 0.8592], [0.0113, 0.0165]) | ([0.8496, 0.8652], [0.0142, 0.0199]) | ([0.7336, 0.7530], [0.0379, 0.0468]) |
| $A_3$ | ([0.8448, 0.8628], [0.0118, 0.0180]) | ([0.8866, 0.9022], [0.0146, 0.0213]) | ([0.9069, 0.9211], [0.0080, 0.0132]) | ([0.8328, 0.8513], [0.0074, 0.0124]) | ([0.8627, 0.8797], [0.0169, 0.0240]) |
| $A_4$ | ([0.9330, 0.9425], [0.0049, 0.0081]) | ([0.5785, 0.5979], [0.0958, 0.1078]) | ([0.8048, 0.8202], [0.0187, 0.0244]) | ([0.5108, 0.5305], [0.0738, 0.0845]) | ([0.8540, 0.8677], [0.0159, 0.0212]) |
| $A_5$ | ([0.8477, 0.8667], [0.0147, 0.0219]) | ([0.8957, 0.9116], [0.0117, 0.0183]) | ([0.9282, 0.9416], [0.0064, 0.0115]) | ([0.6781, 0.7031], [0.0365, 0.0473]) | ([0.8021, 0.8232], [0.0276, 0.0372]) |
| $A_6$ | ([0.9825, 0.9891], [0.0013, 0.0043]) | ([0.9278, 0.9415], [0.0081, 0.0139]) | ([0.6665, 0.6923], [0.0344, 0.0452]) | ([0.8187, 0.8395], [0.0120, 0.0188]) | ([0.7954, 0.8173], [0.0205, 0.0291]) |
| $A_7$ | ([0.9402, 0.9504], [0.0040, 0.0074]) | ([0.9016, 0.9145], [0.0099, 0.0148]) | ([0.9508, 0.9601], [0.0014, 0.0036]) | ([0.6441, 0.6653], [0.0297, 0.0378]) | ([0.9203, 0.9319], [0.0088, 0.0135]) |
| $A_8$ | ([0.9639, 0.9732], [0.0031, 0.0068]) | ([0.5178, 0.5452], [0.1264, 0.1446]) | ([0.8041, 0.8247], [0.0172, 0.0248]) | ([0.6961, 0.7202], [0.0229, 0.0316]) | ([0.9158, 0.9299], [0.0081, 0.0136]) |
| $A_9$ | ([0.9262, 0.9370], [0.0113, 0.0163]) | ([0.8414, 0.8567], [0.0241, 0.0311]) | ([0.9105, 0.9223], [0.0089, 0.0133]) | ([0.7237, 0.7426], [0.0425, 0.0515]) | ([0.6315, 0.6520], [0.0812, 0.0932]) |
| $A_{10}$ | ([0.6760, 0.7024], [0.0359, 0.0473]) | ([0.9134, 0.9288], [0.0126, 0.0198]) | ([0.9877, 0.9933], [0.0004, 0.0026]) | ([0.5798, 0.6078], [0.0394, 0.0513]) | ([0.8805, 0.8984], [0.0108, 0.0175]) |
| $A_{11}$ | ([0.9490, 0.9625], [0.0052, 0.0111]) | ([0.5915, 0.6235], [0.0776, 0.0960]) | ([0.9834, 0.9908], [0.0003, 0.0030]) | ([0.9493, 0.9628], [0.0032, 0.0080]) | ([0.8917, 0.9112], [0.0133, 0.0220]) |
$r^-_3 = ([0.6692, 0.6952], [0.0345, 0.0454])$, $r^-_4 = ([0.5129, 0.5327], [0.0741, 0.0848])$, and $r^-_5 = ([0.6342, 0.6547], [0.0815, 0.0936])$. According to Eqs. (20)–(22), the closeness coefficient of each alternative car is obtained, i.e., $C_1 = 0.3107$, $C_2 = 0.3224$, $C_3 = 0.3210$, $C_4 = 0.2930$, $C_5 = 0.3596$, $C_6 = 0.3720$, $C_7 = 0.3700$, $C_8 = 0.3010$, $C_9 = 0.3531$, $C_{10} = 0.3496$, and $C_{11} = 0.3640$. In descending order of closeness coefficients, the ranking of the alternative cars is determined, i.e., $A_6 \succ A_7 \succ A_{11} \succ A_5 \succ A_9 \succ A_{10} \succ A_2 \succ A_3 \succ A_1 \succ A_8 \succ A_4$.
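As a quick check of the final step, the ranking can be reproduced by sorting the closeness coefficients reported above in descending order. The dictionary and variable names below are illustrative; the values are those given in the text.

```python
# Closeness coefficients reported for the 11 cars with w = (0.2, 0.3, 0.2, 0.2, 0.1).
closeness_coefficients = {
    "A1": 0.3107, "A2": 0.3224, "A3": 0.3210, "A4": 0.2930,
    "A5": 0.3596, "A6": 0.3720, "A7": 0.3700, "A8": 0.3010,
    "A9": 0.3531, "A10": 0.3496, "A11": 0.3640,
}

# A greater closeness coefficient means a more preferable car.
ranking = sorted(closeness_coefficients, key=closeness_coefficients.get, reverse=True)
```

The sorted list starts with A6, A7, and A11 and ends with A8 and A4, in agreement with the ranking stated in the text.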
In previous studies, several methods for ranking products through online reviews have been proposed. However, in most of the existing studies, the product attributes and attribute weights are objectively determined based on the online reviews or are not considered. To compare the result obtained by the proposed method with the results obtained by the existing methods, the weights of the different attributes are set equal, i.e., $w = (0.2, 0.2, 0.2, 0.2, 0.2)$. The situation of equal attribute weights can be considered approximately equivalent to the situation in which product attributes are not considered. Given the equal attribute weights, the proposed method and the methods proposed by Zhang et al.,10 Najmi et al.,13 and Liu et al.15 are applied, and the ranking results of the alternative cars are obtained. These results are shown in Table 5. It can be seen that the same or similar ranking results are obtained by the different methods. However, if unequal attribute weights are considered, then most of the existing methods cannot be used. To illustrate the characteristics of the proposed method, different attribute weights are considered, and a series of ranking results of the 11 alternative cars can be obtained. These results are shown in Table 6. In Table 6, each row represents a vector of attribute weights (such that $w_1 + w_2 + w_3 + w_4 + w_5 = 1$) and the corresponding ranking result of alternative
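Table 5. Ranking results of the alternative cars obtained by different methods.

| Different methods | Ranking results of the alternative cars |
|---|---|
| The proposed method | $A_{11} \succ A_{10} \succ A_9 \succ A_3 \succ A_2 \succ A_7 \succ A_6 \succ A_5 \succ A_1 \succ A_8 \succ A_4$ |
| The method proposed by Zhang et al.10 | $A_{11} \succ A_9 \succ A_{10} \succ A_3 \succ A_2 \succ A_6 \succ A_7 \succ A_5 \succ A_1 \succ A_8 \succ A_4$ |
| The method proposed by Najmi et al.13 | $A_{11} \succ A_{10} \succ A_9 \succ A_2 \succ A_3 \succ A_7 \succ A_5 \succ A_6 \succ A_1 \succ A_8 \succ A_4$ |
| The method proposed by Liu et al.15 | $A_{11} \succ A_{10} \succ A_9 \succ A_3 \succ A_2 \succ A_7 \succ A_6 \succ A_5 \succ A_1 \succ A_8 \succ A_4$ |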
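Table 6. Ranking results of alternative cars with different attribute weights.

| $w_1$ | $w_2$ | $w_3$ | $w_4$ | $w_5$ | Ranking results of alternative cars |
|---|---|---|---|---|---|
| 0.2 | 0.3 | 0.2 | 0.2 | 0.1 | $A_6 \succ A_7 \succ A_{11} \succ A_5 \succ A_9 \succ A_{10} \succ A_2 \succ A_3 \succ A_1 \succ A_8 \succ A_4$ |
| 0.1 | 0.3 | 0.2 | 0.2 | 0.2 | $A_7 \succ A_6 \succ A_{11} \succ A_{10} \succ A_5 \succ A_9 \succ A_3 \succ A_2 \succ A_1 \succ A_8 \succ A_4$ |
| 0.1 | 0.01 | 0.2 | 0.2 | 0.4 | $A_{11} \succ A_7 \succ A_8 \succ A_{10} \succ A_5 \succ A_6 \succ A_3 \succ A_2 \succ A_1 \succ A_9 \succ A_4$ |
| 0.1 | 0.5 | 0.1 | 0.2 | 0.1 | $A_6 \succ A_7 \succ A_5 \succ A_{10} \succ A_9 \succ A_{11} \succ A_2 \succ A_3 \succ A_1 \succ A_4 \succ A_8$ |
| 0.05 | 0.7 | 0.1 | 0.05 | 0.1 | $A_6 \succ A_{10} \succ A_7 \succ A_5 \succ A_9 \succ A_3 \succ A_2 \succ A_1 \succ A_{11} \succ A_4 \succ A_8$ |
| 0.1 | 0.1 | 0.1 | 0.1 | 0.6 | $A_{11} \succ A_{10} \succ A_7 \succ A_3 \succ A_8 \succ A_1 \succ A_6 \succ A_5 \succ A_2 \succ A_4 \succ A_9$ |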
cars. As seen in Table 6, using the proposed method, different products could be considered the most desirable one with respect to the consumer's different subjective preferences on product attributes. However, the consumer's subjective preferences on product attributes cannot be considered in most of the existing studies.
6. Conclusions
This paper proposes a method for ranking products through online reviews based on
sentiment classi¯cation and interval-valued intuitionistic fuzzy TOPSIS. In the
method, the online reviews of the alternative products concerning multiple attributes
are preprocessed using the ICTCLAS 2016, and the notional words of each online
review are obtained by removing the stop words. Then, an algorithm based on SVM
and the OVO strategy is developed to identify the positive, neutral, and negative
sentiment orientations of online reviews of the alternative products concerning
di®erent attributes. Furthermore, based on the percentages of online reviews with
di®erent sentiment orientations and the numbers of online reviews concerning dif-
ferent alternatives, interval-valued intuitionistic fuzzy numbers are constructed to
represent the performance of the alternative products concerning the product
attributes, and the ranking of the alternative products is determined by the interval-
valued intuitionistic fuzzy TOPSIS. The proposed method has several distinct
characteristics as discussed below.
First, in the proposed method, to identify the positive, neutral, and negative
sentiment orientations of online reviews, an algorithm based on SVM and the OVO
strategy is proposed. In the algorithm, with respect to each product attribute, three
SVM classi¯ers are, respectively, trained using the three subsets of training samples
(i.e., \positive-neutral", \positive-negative", and \neutral-negative"), and the OVO
strategy is introduced to integrate the classi¯cation results obtained by the three
SVM classi¯ers. The algorithm has a clear logic and is a valuable attempt at re¯ning
more-valuable information for ranking products through online reviews.
Second, the key strength of the proposed method is the use of interval-valued
intuitionistic fuzzy numbers to represent the performances of the alternative
products with respect to the product attributes. The transformation process is
theoretically sound and complete because it is based on classical theories, including
intuitionistic fuzzy theory and confidence interval estimation in probability
theory. The use of interval-valued intuitionistic fuzzy numbers overcomes the
limitations of previous studies, in which either the reviews with neutral sentiment
orientations are ignored or the numbers of reviews of different products cannot be
considered. It is not only a new idea for representing a large number of sentiment
orientations but also the first attempt to obtain decision data objectively in the
form of interval-valued intuitionistic fuzzy numbers. This approach will
significantly extend the application backgrounds of interval-valued intuitionistic
fuzzy numbers.
We emphasize that, because the proposed method is new and different from the
existing methods, it is important for developing and enriching theories and methods
for ranking products through online reviews.
In terms of future research, a support system based on the proposed method needs
to be developed so that consumers can use the method to make purchase
decisions more conveniently. Moreover, to extract valuable decision data from online
reviews more efficiently, research on online review analysis based on the Natural
Language Toolkit is needed.
Acknowledgments
This work was partly supported by the National Science Foundation of China
(Project Nos. 71371002, 71571039 and 71771043), the Fundamental Research Funds
for the Central Universities, China (Project No. N140607001), and the 111 Project
(B16009).
Appendix A.
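A description of the weighted voting strategy for integrating the results obtained
by the three SVM classifiers ($x^{j\,\text{pos-neu}}_{ik}$, $x^{j\,\text{pos-neg}}_{ik}$, and $x^{j\,\text{neu-neg}}_{ik}$) and determining the indicator vector $S^j_{ik}$ is provided below; $i = 1, 2, \ldots, n$, $j = 1, 2, \ldots, m$, $k = 1, 2, \ldots, q_i$.

Let $p^j_{ik}(* \mid x^{j\,*\text{-}\#}_{ik} = 1)$ and $p^j_{ik}(* \mid x^{j\,*\text{-}\#}_{ik} = 0)$ denote the confidences that "the sentiment orientation of online review $D^j_{ik}$ is $*$" given $x^{j\,*\text{-}\#}_{ik} = 1$ and $x^{j\,*\text{-}\#}_{ik} = 0$, respectively, $*\text{-}\# \in \{\text{pos-neu}, \text{pos-neg}, \text{neu-neg}\}$. Based on the study of Platt [42], $p^j_{ik}(* \mid x^{j\,*\text{-}\#}_{ik} = 1)$ and $p^j_{ik}(* \mid x^{j\,*\text{-}\#}_{ik} = 0)$ can be represented by the following Eqs. (A.1) and (A.2), respectively, i.e.,

$$
\begin{gathered}
p^j_{ik}(* \mid x^{j\,*\text{-}\#}_{ik} = 1) = \frac{1}{1 + \exp(\alpha^{*\text{-}\#}_j + \beta^{*\text{-}\#}_j)}, \\
i = 1, 2, \ldots, n;\ j = 1, 2, \ldots, m;\ k = 1, 2, \ldots, q_i;\ *\text{-}\# \in \{\text{pos-neu}, \text{pos-neg}, \text{neu-neg}\}
\end{gathered}
\tag{A.1}
$$

$$
\begin{gathered}
p^j_{ik}(* \mid x^{j\,*\text{-}\#}_{ik} = 0) = \frac{1}{1 + \exp(\alpha^{*\text{-}\#}_j)}, \\
i = 1, 2, \ldots, n;\ j = 1, 2, \ldots, m;\ k = 1, 2, \ldots, q_i;\ *\text{-}\# \in \{\text{pos-neu}, \text{pos-neg}, \text{neu-neg}\}
\end{gathered}
\tag{A.2}
$$

In Eqs. (A.1) and (A.2), $\alpha^{*\text{-}\#}_j$ and $\beta^{*\text{-}\#}_j$ are parameters. The values of $\alpha^{*\text{-}\#}_j$ and $\beta^{*\text{-}\#}_j$ depend on the numbers and categories of training samples for the different SVM classifiers, $*\text{-}\# \in \{\text{pos-neu}, \text{pos-neg}, \text{neu-neg}\}$, $j = 1, 2, \ldots, m$. The values of $\alpha^{*\text{-}\#}_j$ and $\beta^{*\text{-}\#}_j$ can be determined by maximizing the likelihood function on the training samples [42], i.e.,

$$
\begin{gathered}
L(\alpha^{*\text{-}\#}_j, \beta^{*\text{-}\#}_j) = \prod_{i=1}^{n^{*\text{-}\#}_j} \left(p^{*\text{-}\#}_{ij}\right)^{t^{*\text{-}\#}_{j*}} \left(1 - p^{*\text{-}\#}_{ij}\right)^{t^{*\text{-}\#}_{j\#}}, \\
j = 1, 2, \ldots, m;\ *\text{-}\# \in \{\text{pos-neu}, \text{pos-neg}, \text{neu-neg}\},
\end{gathered}
\tag{A.3}
$$

$$
\begin{gathered}
p^{*\text{-}\#}_{ij} = \frac{1}{1 + \exp(\alpha^{*\text{-}\#}_j + \beta^{*\text{-}\#}_j f^{*\text{-}\#}_{ij})}, \quad i = 1, 2, \ldots, n^{*\text{-}\#}_j;\ j = 1, 2, \ldots, m; \\
*\text{-}\# \in \{\text{pos-neu}, \text{pos-neg}, \text{neu-neg}\},
\end{gathered}
\tag{A.4}
$$

$$
t^{*\text{-}\#}_{j*} = \frac{n^{*\text{-}\#}_{j*} + 1}{n^{*\text{-}\#}_{j*} + 2}, \quad j = 1, 2, \ldots, m;\ *\text{-}\# \in \{\text{pos-neu}, \text{pos-neg}, \text{neu-neg}\},
\tag{A.5}
$$

$$
t^{*\text{-}\#}_{j\#} = \frac{1}{n^{*\text{-}\#}_{j\#} + 2}, \quad j = 1, 2, \ldots, m;\ *\text{-}\# \in \{\text{pos-neu}, \text{pos-neg}, \text{neu-neg}\}.
\tag{A.6}
$$

In Eqs. (A.3)–(A.6), $n^{*\text{-}\#}_j$ denotes the number of training samples for classifier $\mathrm{SVM}^{*\text{-}\#}_j$; $n^{*\text{-}\#}_{j*}$ and $n^{*\text{-}\#}_{j\#}$, respectively, denote the numbers of training samples with the sentiment orientations of $*$ and $\#$; $f^{*\text{-}\#}_{ij}$ is a score value of $\omega^{jh}_z$ calculated by a pre-trained classifier; and $p^{*\text{-}\#}_{ij}$ denotes the confidence of $\mathrm{SVM}^{*\text{-}\#}_j$ discriminating classes $*$ and $\#$ in favor of the former [42].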
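As an illustration of the sigmoid fit behind Eqs. (A.3)–(A.6), the following Python sketch estimates the two sigmoid parameters for one pairwise classifier. It is a sketch under stated assumptions, not the authors' implementation: the names `alpha` (intercept) and `beta` (slope) are chosen here for readability, plain gradient descent stands in for whatever optimizer was actually used, and the loss follows Platt's cross-entropy form, in which a sample with target $t$ contributes $p^{t}(1-p)^{1-t}$ and $t$ is set by Eq. (A.5) or (A.6) according to the sample's class. The toy scores and labels are hypothetical.

```python
import math


def confidence(alpha, beta, score):
    """Sigmoid confidence 1 / (1 + exp(alpha + beta * score)), computed stably."""
    z = alpha + beta * score
    if z >= 0:
        e = math.exp(-z)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(z))


def fit_platt(scores, labels, lr=0.5, steps=5000):
    """Fit (alpha, beta) by gradient descent on Platt's smoothed-target loss.

    labels[i] is 1 if sample i belongs to the first class (*) of the pair
    and 0 if it belongs to the second class (#).
    """
    n_star = sum(labels)
    n_hash = len(labels) - n_star
    t_star = (n_star + 1.0) / (n_star + 2.0)  # target for class *, Eq. (A.5)
    t_hash = 1.0 / (n_hash + 2.0)             # target for class #, Eq. (A.6)
    targets = [t_star if y == 1 else t_hash for y in labels]

    alpha, beta = 0.0, 0.0
    n = len(scores)
    for _ in range(steps):
        # With z = alpha + beta * score, d(loss)/dz = t - p for each sample.
        residuals = [t - confidence(alpha, beta, f)
                     for t, f in zip(targets, scores)]
        alpha -= lr * sum(residuals) / n
        beta -= lr * sum(r * f for r, f in zip(residuals, scores)) / n
    return alpha, beta


# Hypothetical SVM scores: positive scores favor the first class of the pair.
scores = [-2.0, -1.5, -1.0, -0.5, 0.5, 1.0, 1.5, 2.0]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
alpha, beta = fit_platt(scores, labels)
```

With scores oriented this way, the fitted slope is negative, so the confidence in the first class grows with the score.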
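Correspondingly, let $p^j_{ik}(\# \mid x^{j\,*\text{-}\#}_{ik} = 1)$ and $p^j_{ik}(\# \mid x^{j\,*\text{-}\#}_{ik} = 0)$ denote the confidences of "the sentiment orientation of online review $D^j_{ik}$ is $\#$" given $x^{j\,*\text{-}\#}_{ik} = 1$ and $x^{j\,*\text{-}\#}_{ik} = 0$, respectively. Then, based on the studies of Ailon and Mohri [43] and Galar et al. [44], $p^j_{ik}(\# \mid x^{j\,*\text{-}\#}_{ik} = 1)$ and $p^j_{ik}(\# \mid x^{j\,*\text{-}\#}_{ik} = 0)$ can be calculated by the following Eqs. (A.7) and (A.8), i.e.,

$$
\begin{gathered}
p^j_{ik}(\# \mid x^{j\,*\text{-}\#}_{ik} = 1) = 1 - p^j_{ik}(* \mid x^{j\,*\text{-}\#}_{ik} = 1) = 1 - \frac{1}{1 + \exp(\alpha^{*\text{-}\#}_j + \beta^{*\text{-}\#}_j)}, \\
i = 1, 2, \ldots, n;\ j = 1, 2, \ldots, m;\ k = 1, 2, \ldots, q_i;\ *\text{-}\# \in \{\text{pos-neu}, \text{pos-neg}, \text{neu-neg}\}
\end{gathered}
\tag{A.7}
$$

$$
\begin{gathered}
p^j_{ik}(\# \mid x^{j\,*\text{-}\#}_{ik} = 0) = 1 - p^j_{ik}(* \mid x^{j\,*\text{-}\#}_{ik} = 0) = 1 - \frac{1}{1 + \exp(\alpha^{*\text{-}\#}_j)}, \\
i = 1, 2, \ldots, n;\ j = 1, 2, \ldots, m;\ k = 1, 2, \ldots, q_i;\ *\text{-}\# \in \{\text{pos-neu}, \text{pos-neg}, \text{neu-neg}\}
\end{gathered}
\tag{A.8}
$$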
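For a single pairwise classifier, the four confidences of Eqs. (A.1), (A.2), (A.7), and (A.8) follow directly from the two sigmoid parameters. The short Python sketch below makes that explicit; the parameter names `alpha` and `beta`, the dictionary layout, and the numeric values are assumptions chosen for illustration.

```python
import math


def pairwise_confidences(alpha, beta):
    """Return the four confidences of Eqs. (A.1), (A.2), (A.7), and (A.8).

    Keys are (cls, x): cls is "*" (first class) or "#" (second class) and
    x is the 0/1 output of the pairwise classifier.
    """
    p_star_x1 = 1.0 / (1.0 + math.exp(alpha + beta))  # Eq. (A.1)
    p_star_x0 = 1.0 / (1.0 + math.exp(alpha))         # Eq. (A.2)
    return {
        ("*", 1): p_star_x1,
        ("*", 0): p_star_x0,
        ("#", 1): 1.0 - p_star_x1,                    # Eq. (A.7)
        ("#", 0): 1.0 - p_star_x0,                    # Eq. (A.8)
    }


# Hypothetical parameter values for one attribute and one class pair.
conf = pairwise_confidences(alpha=1.0, beta=-3.0)
```

With these illustrative values, the confidence in the first class is above 0.5 when the classifier outputs 1 and below 0.5 when it outputs 0.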
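Let $p^{j\,\text{pos}}_{ik}$, $p^{j\,\text{neu}}_{ik}$, and $p^{j\,\text{neg}}_{ik}$ denote the confidences to identify the sentiment orientation of $D^j_{ik}$ as positive, neutral, and negative, $i = 1, 2, \ldots, n$, $j = 1, 2, \ldots, m$, $k = 1, 2, \ldots, q_i$. Then, $p^{j\,\text{pos}}_{ik}$, $p^{j\,\text{neu}}_{ik}$, and $p^{j\,\text{neg}}_{ik}$ can be calculated by the following Eqs. (A.9)–(A.11), respectively, i.e.,

$$
\begin{aligned}
p^{j\,\text{pos}}_{ik} ={}& p^j_{ik}(\text{pos} \mid x^{j\,\text{pos-neu}}_{ik} = 1)\, x^{j\,\text{pos-neu}}_{ik} + p^j_{ik}(\text{pos} \mid x^{j\,\text{pos-neu}}_{ik} = 0)\,(1 - x^{j\,\text{pos-neu}}_{ik}) \\
&+ p^j_{ik}(\text{pos} \mid x^{j\,\text{pos-neg}}_{ik} = 1)\, x^{j\,\text{pos-neg}}_{ik} + p^j_{ik}(\text{pos} \mid x^{j\,\text{pos-neg}}_{ik} = 0)\,(1 - x^{j\,\text{pos-neg}}_{ik}), \\
& i = 1, 2, \ldots, n;\ j = 1, 2, \ldots, m;\ k = 1, 2, \ldots, q_i;
\end{aligned}
\tag{A.9}
$$

$$
\begin{aligned}
p^{j\,\text{neu}}_{ik} ={}& p^j_{ik}(\text{neu} \mid x^{j\,\text{pos-neu}}_{ik} = 1)\, x^{j\,\text{pos-neu}}_{ik} + p^j_{ik}(\text{neu} \mid x^{j\,\text{pos-neu}}_{ik} = 0)\,(1 - x^{j\,\text{pos-neu}}_{ik}) \\
&+ p^j_{ik}(\text{neu} \mid x^{j\,\text{neu-neg}}_{ik} = 1)\, x^{j\,\text{neu-neg}}_{ik} + p^j_{ik}(\text{neu} \mid x^{j\,\text{neu-neg}}_{ik} = 0)\,(1 - x^{j\,\text{neu-neg}}_{ik}), \\
& i = 1, 2, \ldots, n;\ j = 1, 2, \ldots, m;\ k = 1, 2, \ldots, q_i;
\end{aligned}
\tag{A.10}
$$

$$
\begin{aligned}
p^{j\,\text{neg}}_{ik} ={}& p^j_{ik}(\text{neg} \mid x^{j\,\text{pos-neg}}_{ik} = 1)\, x^{j\,\text{pos-neg}}_{ik} + p^j_{ik}(\text{neg} \mid x^{j\,\text{pos-neg}}_{ik} = 0)\,(1 - x^{j\,\text{pos-neg}}_{ik}) \\
&+ p^j_{ik}(\text{neg} \mid x^{j\,\text{neu-neg}}_{ik} = 1)\, x^{j\,\text{neu-neg}}_{ik} + p^j_{ik}(\text{neg} \mid x^{j\,\text{neu-neg}}_{ik} = 0)\,(1 - x^{j\,\text{neu-neg}}_{ik}), \\
& i = 1, 2, \ldots, n;\ j = 1, 2, \ldots, m;\ k = 1, 2, \ldots, q_i.
\end{aligned}
\tag{A.11}
$$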
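The aggregation in Eqs. (A.9)–(A.11) can be sketched in Python as follows. Each class collects one confidence from each of the two pairwise classifiers it takes part in, selected by that classifier's 0/1 output. The function name, the dictionary layout, and the numeric confidences are hypothetical; `conf[pair][v]` is assumed to hold the confidence in the first class of `pair` given output `v`, with the second class receiving the complement.

```python
def class_confidences(x, conf):
    """Aggregate pairwise confidences into (p_pos, p_neu, p_neg).

    x[pair] is the 0/1 output of the classifier for `pair`.
    conf[pair][v] is the confidence in the FIRST class of `pair` given
    output v; the confidence in the second class is 1 - conf[pair][v].
    """
    def first(pair):   # confidence in the first class of the pair
        return conf[pair][1] * x[pair] + conf[pair][0] * (1 - x[pair])

    def second(pair):  # confidence in the second class of the pair
        return ((1 - conf[pair][1]) * x[pair]
                + (1 - conf[pair][0]) * (1 - x[pair]))

    p_pos = first("pos-neu") + first("pos-neg")    # Eq. (A.9)
    p_neu = second("pos-neu") + first("neu-neg")   # Eq. (A.10)
    p_neg = second("pos-neg") + second("neu-neg")  # Eq. (A.11)
    return p_pos, p_neu, p_neg


# Hypothetical confidences and classifier outputs for one review.
conf = {
    "pos-neu": {1: 0.8, 0: 0.3},
    "pos-neg": {1: 0.9, 0: 0.2},
    "neu-neg": {1: 0.7, 0: 0.4},
}
x = {"pos-neu": 1, "pos-neg": 1, "neu-neg": 0}
p_pos, p_neu, p_neg = class_confidences(x, conf)  # approximately (1.7, 0.6, 0.7)
```

Because each pairwise classifier splits a total confidence of 1 between its two classes, the three aggregated values always sum to 3.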
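Based on the obtained $p^{j\,\text{pos}}_{ik}$, $p^{j\,\text{neu}}_{ik}$, and $p^{j\,\text{neg}}_{ik}$, the indicator vector of sentiment orientation of online review $D^j_{ik}$ can be determined by the following Eq. (A.12), i.e.,

$$
\begin{gathered}
S^j_{ik} =
\begin{cases}
(0, 0, 0), & \text{if } D^j_{ik} = \emptyset, \\
(1, 0, 0), & \text{if } \max\{p^{j\,\text{pos}}_{ik}, p^{j\,\text{neu}}_{ik}, p^{j\,\text{neg}}_{ik}\} = p^{j\,\text{pos}}_{ik}, \\
(0, 1, 0), & \text{if } \max\{p^{j\,\text{pos}}_{ik}, p^{j\,\text{neu}}_{ik}, p^{j\,\text{neg}}_{ik}\} = p^{j\,\text{neu}}_{ik}, \\
(0, 0, 1), & \text{if } \max\{p^{j\,\text{pos}}_{ik}, p^{j\,\text{neu}}_{ik}, p^{j\,\text{neg}}_{ik}\} = p^{j\,\text{neg}}_{ik},
\end{cases} \\
i = 1, 2, \ldots, n;\ j = 1, 2, \ldots, m;\ k = 1, 2, \ldots, q_i.
\end{gathered}
\tag{A.12}
$$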
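The selection rule of Eq. (A.12) amounts to a one-hot encoding of the largest confidence, with an all-zero vector when the review says nothing about the attribute. A minimal Python sketch is given below; the function name and the `has_review` flag are hypothetical, and resolving ties in the order positive, neutral, negative is an assumption, since the equation does not state a tie rule.

```python
def indicator_vector(p_pos, p_neu, p_neg, has_review=True):
    """Return the (positive, neutral, negative) indicator vector of Eq. (A.12).

    has_review=False models the case in which the review contains no text
    on the attribute, which yields (0, 0, 0).
    """
    if not has_review:
        return (0, 0, 0)
    confidences = (p_pos, p_neu, p_neg)
    # index() returns the first maximum, so ties resolve as pos, neu, neg
    # (a tie rule assumed for this sketch).
    best = confidences.index(max(confidences))
    return tuple(1 if i == best else 0 for i in range(3))


s = indicator_vector(1.7, 0.6, 0.7)  # hypothetical confidences
```

The proportions of reviews falling into each of the three categories can then be obtained by summing these vectors over all reviews of a product for a given attribute.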
References
1. H. Chen, R. H. Chiang and V. C. Storey, Business intelligence and analytics: From big
data to big impact, MIS Quarterly 36(4) (2012) 1165–1188.
2. Y. Li, Q. Ye, Z. Zhang and T. Wang, Snippet-based unsupervised approach for sentiment
classification of Chinese online reviews, International Journal of Information Technology
& Decision Making 10(6) (2011) 1097–1110.
3. T. Hennig-Thurau, K. P. Gwinner, G. Walsh and D. D. Gremler, Electronic word-of-mouth
via consumer-opinion platforms: What motivates consumers to articulate themselves
on the Internet?, Journal of Interactive Marketing 18(1) (2004) 38–52.
4. B. Bickart and R. M. Schindler, Internet forums as influential sources of consumer
information, Journal of Interactive Marketing 15(3) (2001) 31–40.
5. S. Senecal and J. Nantel, The influence of online product recommendations on consumers'
online choices, Journal of Retailing 80(2) (2004) 159–169.
6. A. Ghose and P. G. Ipeirotis, Estimating the helpfulness and economic impact of product
reviews: Mining text and reviewer characteristics, IEEE Transactions on Knowledge and
Data Engineering 23(10) (2011) 1498–1512.
7. T. Lappas, M. Crovella and E. Terzi, Selecting a characteristic set of reviews, in
Proc. 18th ACM SIGKDD Int. Conf. Knowledge Discovery and Data Mining, Beijing,
China (2012), pp. 832–840.
8. K. Zhang, R. Narayanan and A. Choudhary, Mining online customer reviews for ranking
products, EECS Department, Northwestern University (2009).
9. K. Zhang, R. Narayanan and A. N. Choudhary, Voice of the customers: Mining online
customer reviews for product feature-based ranking, in Proc. 3rd Conf. Online Social
Networks, Boston, MA, USA (2010), pp. 1–9.
10. K. Zhang, Y. Cheng, W. K. Liao and A. Choudhary, Mining millions of reviews:
A technique to rank products based on importance of reviews, in Proc. 13th Int. Conf.
Electronic Commerce, Liverpool, United Kingdom (2011), pp. 121–128.
11. Y. Peng, G. Kou and J. Li, A fuzzy PROMETHEE approach for mining customer reviews
in Chinese, Arabian Journal for Science and Engineering 39(6) (2014) 5245–5252.
12. K. Chen, G. Kou, J. Shang and Y. Chen, Visualizing market structure through online
product reviews: Integrate topic modeling, TOPSIS, and multi-dimensional scaling
approaches, Electronic Commerce Research and Applications 14(1) (2015) 58–74.
13. E. Najmi, K. Hashmi, Z. Malik, A. Rezgui and H. U. Khan, CAPRA: A comprehensive
approach to product ranking using customer reviews, Computing 97(8) (2015)
843–867.
14. X. Yang, G. Yang and J. Wu, Integrating rich and heterogeneous information to design a
ranking system for multiple products, Decision Support Systems 84 (2016) 117–133.
15. Y. Liu, J. W. Bi and Z. P. Fan, Ranking products through online reviews: A method based
on sentiment analysis technique and intuitionistic fuzzy set theory, Information Fusion
36 (2016) 149–161.
16. Z. Xu and H. Hu, Projection models for intuitionistic fuzzy multiple attribute decision
making, International Journal of Information Technology & Decision Making 9(2) (2010)
267–280.
17. Z. Xu, Intuitionistic fuzzy aggregation operators, IEEE Transactions on Fuzzy Systems
15(6) (2007) 1179–1187.
18. Z. Xu and R. R. Yager, Some geometric aggregation operators based on intuitionistic
fuzzy sets, International Journal of General Systems 35(4) (2006) 417–433.
19. K. T. Atanassov, Intuitionistic fuzzy sets, Fuzzy Sets and Systems 20(1) (1986) 87–96.
20. Z. Xu and X. Cai, Intuitionistic Fuzzy Information Aggregation (Springer, Berlin
Heidelberg, 2012).
21. Z. Xu and R. R. Yager, Dynamic intuitionistic fuzzy multi-attribute decision making,
International Journal of Approximate Reasoning 48(1) (2008) 246–262.
22. W. Feller, An Introduction to Probability Theory and Its Applications (Wiley, New York,
2008).
23. Z. Xu, Models for multiple attribute decision making with intuitionistic fuzzy
information, International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems
15(3) (2007) 285–297.
24. F. Ye, An extended TOPSIS method with interval-valued intuitionistic fuzzy numbers for
virtual enterprise partner selection, Expert Systems with Applications 37(10) (2010)
7050–7055.
25. T. L. Saaty, The Analytical Hierarchy Process (McGraw-Hill, Toronto, 1980).
26. G. Kou, D. Ergu, C. Lin and Y. Chen, Pairwise comparison matrix in multiple criteria
decision making, Technological and Economic Development of Economy 22(5) (2016)
738–765.
27. G. Kou and C. Lin, A cosine maximization method for the priority vector derivation in
AHP, European Journal of Operational Research 235(1) (2014) 225–232,
DOI: HTTP://DX.DOI.ORG/10.1016/j.ejor.2013.10.019.
28. S. L. Huang and W. C. Cheng, Discovering Chinese sentence patterns for feature-based
opinion summarization, Electronic Commerce Research and Applications 14(6) (2015)
582–591.
29. H. Q. Zhang, A. Sekhari, Y. Ouzrout and A. Bouras, Jointly identifying opinion mining
elements and fuzzy measurement of opinion intensity to analyze product features,
Engineering Applications of Artificial Intelligence 47 (2016) 122–139.
30. J. A. Balazs and J. D. Velásquez, Opinion mining and information fusion: A survey,
Information Fusion 27 (2016) 95–110.
31. W. Medhat, A. Hassan and H. Korashy, Sentiment analysis algorithms and applications:
A survey, Ain Shams Engineering Journal 5(4) (2014) 1093–1113.
32. Y. Liu, J. W. Bi and Z. P. Fan, A method for multi-class sentiment classification based on
an improved one-vs-one (OVO) strategy and the support vector machine (SVM)
algorithm, Information Sciences 394–395 (2017) 38–52.
33. B. Pang and L. Lee, Opinion mining and sentiment analysis, Foundations and Trends in
Information Retrieval 2 (2008) 1–135.
34. R. Baeza-Yates and B. Ribeiro-Neto, Modern Information Retrieval (ACM Press,
New York, 1999).
35. S. Tan and J. Zhang, An empirical study of sentiment analysis for Chinese documents,
Expert Systems with Applications 34(4) (2008) 2622–2629.
36. G. Wang, J. Sun, J. Ma, K. Xu and J. Gu, Sentiment classification: The contribution of
ensemble learning, Decision Support Systems 57 (2014) 77–93.
37. V. N. Vapnik and V. Vapnik, Statistical Learning Theory (Wiley, New York, 1998).
38. M. Galar, A. Fernández, E. Barrenechea, H. Bustince and F. Herrera, An overview
of ensemble methods for binary classifiers in multi-class problems: Experimental study on
one-vs-one and one-vs-all schemes, Pattern Recognition 44(8) (2011) 1761–1776.
39. E. Hüllermeier and S. Vanderlooy, Combining predictions in pairwise classification:
An optimal adaptive voting strategy and its relation to weighted voting, Pattern
Recognition 43(1) (2010) 128–142.
40. G. Kou, Y. Lu, Y. Peng and Y. Shi, Evaluation of classification algorithms using MCDM
and rank correlation, International Journal of Information Technology & Decision
Making 11(01) (2012) 197–225.
41. G. Kou, Y. Peng and G. Wang, Evaluation of clustering algorithms for financial risk
analysis using MCDM methods, Information Sciences 275 (2014) 1–12.
42. J. Platt, Probabilistic outputs for support vector machines and comparisons to
regularized likelihood methods, Advances in Large Margin Classifiers 10(3) (1999) 61–74.
43. N. Ailon and M. Mohri, An efficient reduction of ranking to classification, Machine
Learning 29(2) (2008) 103–130.
44. M. Galar, A. Fernández, E. Barrenechea, H. Bustince and F. Herrera, An overview of
ensemble methods for binary classifiers in multi-class problems: Experimental study on
one-vs-one and one-vs-all schemes, Pattern Recognition 44(8) (2011) 1761–1776.
... TOPSIS offers three notable benefits: it is comprehensive; it requires minimal data; and it produces intuitive and easily comprehensible results [55]. Given its excellent performance in our investigation [61], the method combining the interval intuitionistic fuzzy set with TOPSIS was chosen as the ranking method for this study. ...
Article
Full-text available
As the leading platform of online education, MOOCs provide learners with rich course resources, but course designers are still faced with the challenge of how to accurately improve the quality of courses. Current research mainly focuses on learners’ emotional feedback on different course attributes, neglecting non-emotional content as well as the costs required to improve these attributes. This limitation makes it difficult for course designers to fully grasp the real needs of learners and to accurately locate the key issues in the course. To overcome the above challenges, this study proposes an MOOC improvement method based on text mining and multi-attribute decision-making. Firstly, we utilize word vectors and clustering techniques to extract course attributes that learners focus on from their comments. Secondly, with the help of some deep learning methods based on BERT, we conduct a sentiment analysis on these comments to reveal learners’ emotional tendencies and non-emotional content towards course attributes. Finally, we adopt the multi-attribute decision-making method TOPSIS to comprehensively consider the emotional score, attention, non-emotional content, and improvement costs of the attributes, providing course designers with a priority ranking for attribute improvement. We applied this method to two typical MOOC programming courses—C language and Java language. The experimental findings demonstrate that our approach effectively identifies course attributes from reviews, assesses learners’ satisfaction, attention, and cost of improvement, and ultimately generates a prioritized list of course attributes for improvement. This study provides a new approach for improving the quality of online courses and contributes to the sustainable development of online course quality.
... Interval-valued intuitionistic fuzzy TOPSIS has been used in past research for ranking products using online reviews (Y. Liu, Bi, & Fan, 2017). Different research combined fuzzy TOPSIS and fuzzy AHP to comprehensively and effectively evaluate hotel websites (Baki, 2020). ...
Preprint
Full-text available
The hospitality sector generates a lot of data, which is added as written feedbacks and/or numerical ratings. Online travel agencies (OTAs) thrive on giving consumers a variety of options tailored according to their tastes and needs. In this research, we propose ranking of hotels according to customer provided reviews. The whole process is carried out using several techniques. The reviews are initially pre-processed and a topic modelling technique Latent Dirichlet Allocation (LDA) has been utilized to extract important features. Maximum Entropy Minimum Variance Ordered Weighted Averaging (MEMV-OWA) method is then utilized to assign weights to individual topics. For ranking, multi-criteria decision-modelling approach Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS) is utilized. To address the imprecision inherent in human judgement, picture fuzzy numbers are used. The findings of the research provide theoretical and practical implementations and are discussed in the paper.
... A curative testimonial obtained from these different B2C web sites in a common format is crucial for making an analysis while implementing different meta data and content-based features that deduce the rank scores for different product alternatives. This has proven to be beneficial for both new customers and manufacturers [4,5,13,19,20,23,26,36,45,47]. ...
Article
Full-text available
The sentiment analysis approach or opinion mining for product ranking deals with computing people's opinions using structured and unstructured data from blogs, review sites/articles and social media. The fast pace of changing user preferences in different age groups and geographical regions has made product ranking a valuable research area. The product ranking based on user's opinion for multi-criteria decision-making has gained prominence with the rise in e-commerce and online selling of goods and services. The reviews on online products display significant impacts on decisions made by consumers purchase. Ranking of the products through online reviews influences consumers' purchase decisions and a source for sellers for evaluating market response to their product. The increasing number of competitive business models has made the right product selection by the end user based on other users opinion and feedback on different forums a challenging task. Aggregation of product features by the e-commerce portals based on data collected from various sources is not enough for the buyer to make decision on appropriate product choice. The present research introduces a novel approach for ranking the alternatives based on machine learning techniques, fuzzy analytical hierarchy process, the Technique for Order Preference by Similarity to Ideal Solution (TOPSIS), and wavelet transformations. Experiments are conducted over the real data sets, and efficacy of the proposed method is assessed and compared the results with the rank given by the domain experts.
... This enhanced approach emphasized both membership and non-membership degrees, providing a richer representation of a decision maker's hesitations. Later on, Liu et al. [22,23] built upon this theory by incorporating sentiment analysis, allowing online product reviews to be represented by intuitionistic fuzzy numbers. Furthermore, Roszkowska et al. [24][25][26] introduced a composite measure. ...
Article
Full-text available
With the burgeoning growth of the internet, online evaluation systems have become increasingly pivotal in shaping consumer decision making. In this context, this study introduces an intuitionistic fuzzy TODIM (an acronym in Portuguese for interactive and multicriteria decision making) methodology to rank products based on online reviews. Our approach aims to enhance user decision making efficiency and address the prevalent issue of information overload. Initially, we devised a product attribute emotion quantification framework within the confines of the intuitionistic fuzzy paradigm. This allows for the transformation of online reviews into exact functional outputs via our advanced intuitionistic fuzzy scoring mechanism and its associated precise function. Following this, we take into account the inherent correlation among product attributes, leading to the development of an attribute-associated intuitionistic fuzzy model. This model further ascertains the dominance degree of alternative products. Moreover, by integrating the risk aversion factor, we can derive a hierarchical structure for alternative products, aiding in the prioritization process. Finally, this paper validates the proposed method using movie sequencing as a case study. The results show that the proposed method, which takes into account the emotional tendencies of different attributes in a movie and the different preferences of viewers in the attribute weighting and movie selection process, is more reasonable than methods proposed in previous studies.
Article
In the process of product ranking considering online reviews, they often are based on initial reviews and do not consider additional consumer reviews, but additional review information can sometimes directly affect consumers’ final decisions. To fully characterize the rich emotional preferences of consumers embedded in two-stage online customer reviews information, considering consumers’ individual preferences and product objective evaluation information, we construct a combination weighting method to calculate comprehensive weights of product attributes, and then exploit the sentiment analysis technique, interval-valued probabilistic linguistic term set (IVPLTS) and preference ranking organization method for enrichment evaluations (PROMETHEE) to establish a products ranking method based on compound reviews, and then we use it to identify the sentiment orientation of reviews and the results. Finally, a real-life case illustrates a real-world application of the proposed method.
Article
The surge in online shopping has led to an increase in online customer reviews (OCRs), posing challenges for product selection based on product features and customer sentiment. This is where the combination of multicriteria decision-making (MCDM) and sentiment analysis (SA) methods come in. In this article, we propose a hybrid approach for product ranking that addresses challenges identified in previous studies. These challenges include accurately considering feature interdependencies, identifying hesitancy and uncertainty in consumer purchase decisions, and using a more robust method for ranking alternative products. In doing so, we utilize SA and unsupervised machine learning to extract features from OCRs. We employ a combination of association rule mining (ARM) and fuzzy cognitive maps (FCM) to calculate feature weights based on interdependencies among features. In addition, we formulate a decision matrix using sentiment orientation and intuitionistic fuzzy theory. The interval-valued intuitionistic fuzzy (IVIF) theory ensures reliable decision-making information. The IVIF-multiobjective optimization by ratio analysis plus full multiplicative form method (MULTIMOORA) is applied to rank alternative products. Using Amazon comments, five mobile phones are ranked to demonstrate the methodology. The proposed framework improves decision-making in product selection based on OCRs by considering feature interdependencies. Sensitivity analysis and comparisons with other MCDM methods evaluate its robustness. By addressing previous limitations and incorporating interdependencies among features, this comprehensive approach provides reliable decision-making in product selection based on OCRs.
Article
Full-text available
The production decision of a large commodity or equipment manufacturing enterprise can be modeled as a newsvendor problem. Managers must determine the optimal production volume in advance to minimize the underage cost and the overage cost. However, the traditional newsvendor problem assumes the known demand distribution, which is not the case in practice. Data-driven approaches have become the hot research topic and opened up new avenues for such issues. Recent studies have considered demand-related features but have failed to address how to optimize production and inventory using informative textual reviews, not just numerical feature data. To address this issue, we propose a data-driven newsvendor model that leverages sentiment analysis on textual reviews using a deep learning model to solve the data-driven newsvendor problem by integrating estimation and optimization. Experiments on real data show that our proposed method reduces the average cost by approximately 14.18% compared to the most advanced deep neural network method, making it the best-performing method. Furthermore, our method is more suitable for situations where unit shortage costs are greater than unit overage costs. Finally, our method is robust in terms of sample size and can still obtain good results even with insufficient historical data.
Article
Full-text available
The measurement scales, consistency index, inconsistency issues, missing judgment estimation and priority derivation methods have been extensively studied in the pairwise comparison matrix (PCM). Various approaches have been proposed to handle these problems, and made great contributions to the decision making. This paper reviews the literature of the main developments of the PCM. There are plenty of literature related to these issues, thus we mainly focus on the literature published in 37 peer reviewed international journals from 2010 to 2015 (searched via ISI Web of science). We attempt to analyze and classify these literatures so as to find the current hot research topics and research techniques in the PCM, and point out the future directions on the PCM. It is hoped that this paper will provide a comprehensive literature review on PCM, and act as informative summary of the main developments of the PCM for the researchers for their future research. First published online: 02 Sep 2016
Article
Full-text available
Business intelligence and analytics (BI&A) has emerged as an important area of study for both practitioners and researchers, reflecting the magnitude and impact of data-related problems to be solved in contemporary business organizations. This introduction to the MIS Quarterly Special Issue on Business Intelligence Research first provides a framework that identifies the evolution, applications, and emerging research areas of BI&A. BI&A 1.0, BI&A 2.0, and BI&A 3.0 are defined and described in terms of their key characteristics and capabilities. Current research in BI&A is analyzed and challenges and opportunities associated with BI&A research and education are identified. We also report a bibliometric study of critical BI&A publications, researchers, and research topics based on more than a decade of related academic and industry publications. Finally, the six articles that comprise this special issue are introduced and characterized in terms of the proposed BI&A research framework.
Article
Full-text available
Online shopping generates billions of dollars in revenues, including both the physical goods and online services. Product images and associated descriptions are the two main sources of information used by the shoppers to gain knowledge about a product. However, these two pieces of information may not always present the true picture of the product. Images could be deceiving, and descriptions could be overwhelming or cryptic. Moreover, the relative rank of these products among the peers may lead to inconsistencies. Hence, a useful and widely used piece of information is “user reviews”. A number of vendors like Amazon have created whole ecosystems around user reviews, thereby boosting their revenues. However, extracting the relevant and useful information out of the plethora of reviews is not straight forward, and is a very tedious job. In this paper we propose a product ranking system that facilitates the online shopping experience by analyzing the reviews for sentiments, evaluating their usefulness, extracting and weighing different product features and aspects, ranking it among similar comparable products, and finally creating a unified rank for each product. Experiment results show the usefulness of our proposed approach in providing an effective and reliable online shopping experience in comparison with similar approaches.
Article
Multi-class sentiment classification is a valuable research topic with extensive applications; however, studies in the field remain relatively scarce. In the present paper, a method for multi-class sentiment classification based on an improved one-vs-one (OVO) strategy and the support vector machine (SVM) algorithm is proposed. First, an improved OVO strategy is proposed wherein the relative competence weight of each binary classifier is determined according to the K nearest neighbors and the class center of each class in the training sample set concerning the binary classifier. A method for multi-class sentiment classification is proposed based on this improved OVO strategy and the SVM algorithm. After converting the training texts into term feature vectors, the important features (terms) for multi-class sentiment classification are selected using the information gain (IG) algorithm. A binary SVM classifier is then trained on the training feature vectors of each pair of sentiment classes. To identify the sentiment class of a test text, a confidence score matrix of multiple SVM classifiers is constructed based on the results of multiple SVM classifiers. Using this score matrix, the sentiment class of the test text can be determined using the improved OVO strategy. The results of our experimental studies show that the performance of the proposed method is significantly better than that of the existing methods for multi-class sentiment classification.
Article
Online product reviews have significant impacts on consumers’ purchase decisions. To support consumers’ purchase decisions, how to rank the products through online reviews is a valuable research topic, while research concerning this issue is still relatively scarce. This paper proposes a method based on the sentiment analysis technique and the intuitionistic fuzzy set theory to rank the products through online reviews. An algorithm based on sentiment dictionaries is developed to identify the positive, neutral or negative sentiment orientation on the alternative product concerning the product feature in each review. According to the identified positive, neutral and negative sentiment orientations, an intuitionistic fuzzy number is constructed for representing the performance of an alternative product concerning a product feature. The ranking of alternative products is determined by intuitionistic fuzzy weighted averaging (IFWA) operator and preference ranking organization methods for enrichment evaluations II (PROMETHEE II). A case study is given to illustrate the use of the proposed method. The comparisons and experiments are further conducted to illustrate the characteristics and advantages of the proposed method. Converting the identified positive, neutral and negative sentiment orientations into intuitionistic fuzzy numbers is a new idea for processing and fusing a large number of sentiment orientations of online reviews. Based on the proposed method, decision support system can be developed to support the consumers’ purchase decisions more conveniently.
Article
Online reviews play an important role as electronic word-of-mouth (eWOM) that helps potential consumers make informed purchase decisions. However, the large number of reviews poses a considerable challenge because it is impossible for customers to read all of them for reference. Moreover, there are different types of online reviews with distinct features, such as numeric ratings, text descriptions, and comparative words; such heterogeneous information adds complexity for customers. In this paper, we propose a method to integrate such rich and heterogeneous information. The integrated information can be classified into two categories: descriptive information and comparative information. The descriptive information consists of online opinions directly given by consumers using text sentiments and numeric ratings to describe one specific product. The comparative information comes from comparative sentences that are implicitly embedded in the reviews and online comparative votes that are explicitly provided by third-party websites to compare more than one product. Both descriptive information and comparative information are integrated into a digraph structure, from which an overall eWOM score for each product and a ranking of all products can be derived. We collect both descriptive and comparative information for three different categories of products (mobile phones, laptops, and digital cameras) during a period of 10 days. The results demonstrate that our method outperforms existing product ranking methods. A ranking system based on our method is also provided to help consumers compare multiple products and make appropriate purchase decisions effortlessly.
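To make the idea of deriving a ranking from a digraph concrete, the sketch below scores products with a PageRank-style power iteration over comparative edges. This scoring rule is a swapped-in assumption: the abstract does not say how the eWOM score is computed from the digraph, and descriptive information is left out of this sketch entirely. The function name `rank_products`, the edge convention (loser points to winner), and the example votes are hypothetical.

```python
def rank_products(products, prefer_edges, damping=0.85, iters=100):
    """Rank products from pairwise preferences with a PageRank-style iteration.

    prefer_edges maps (loser, winner) -> weight, e.g. the number of
    comparative votes in which `winner` was preferred over `loser`.
    This is an illustrative stand-in, not the scoring rule of the paper.
    """
    n = len(products)
    score = {p: 1.0 / n for p in products}
    out_weight = {p: 0.0 for p in products}
    for (loser, _winner), w in prefer_edges.items():
        out_weight[loser] += w

    for _ in range(iters):
        new = {p: (1.0 - damping) / n for p in products}
        # Products that never lose a comparison have no outgoing edges;
        # spread their score uniformly so the total stays equal to 1.
        dangling = sum(score[p] for p in products if out_weight[p] == 0.0)
        for p in products:
            new[p] += damping * dangling / n
        # Each loser passes its score to the products preferred over it.
        for (loser, winner), w in prefer_edges.items():
            new[winner] += damping * score[loser] * w / out_weight[loser]
        score = new

    ranking = sorted(products, key=lambda p: score[p], reverse=True)
    return ranking, score


# Hypothetical comparative votes among three products.
products = ["A", "B", "C"]
votes = {("B", "A"): 3, ("C", "A"): 2, ("C", "B"): 1}
ranking, score = rank_products(products, votes)
print(ranking)
```

Under this convention a product accumulates score whenever other products "lose" to it, so products that win against strong rivals rank higher than products that only win against weak ones.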
Article
This study discovers part-of-speech (POS) patterns of sentences that express opinions in Chinese product reviews. The use of these patterns makes it possible to identify opinion sentences, feature words, and opinion/feeling words. Degree words and negation words are used to determine the orientation of opinions as well as the degree of their intensity. To identify the subject of opinions, the associations between opinion/feeling words, feature words, and corresponding features were ascertained. An algorithm for feature-based opinion summarization is then proposed based on these patterns and association rules. Car and movie reviews were collected to discover the patterns and to test both the patterns and the algorithm. The experimental results demonstrate that the proposed algorithm and approaches perform well on Chinese product reviews.
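The role of degree words and negation words can be illustrated with a small lexicon-based sketch. Everything in it is hypothetical: the English word lists stand in for the Chinese lexicons of the study, the numeric multipliers are invented for illustration, and the rule that modifiers apply only to the immediately following opinion word is a simplifying assumption rather than the POS-pattern approach described above.

```python
# Hypothetical toy lexicons (not taken from the study).
OPINION = {"good": 1.0, "bad": -1.0, "comfortable": 1.0, "noisy": -1.0}
DEGREE = {"slightly": 0.5, "very": 1.5, "extremely": 2.0}
NEGATION = {"not", "never"}


def score_tokens(tokens):
    """Sum signed opinion intensities over a tokenized sentence.

    A negation word flips the sign and a degree word scales the intensity
    of the next opinion word; any other token clears pending modifiers.
    """
    total = 0.0
    multiplier = 1.0
    for tok in tokens:
        if tok in NEGATION:
            multiplier *= -1.0
        elif tok in DEGREE:
            multiplier *= DEGREE[tok]
        elif tok in OPINION:
            total += multiplier * OPINION[tok]
            multiplier = 1.0
        else:
            multiplier = 1.0
    return total


print(score_tokens("not very good".split()))
print(score_tokens(
    "the seat is very comfortable but the engine is slightly noisy".split()
))
```

A positive total indicates a positive orientation, a negative total a negative one, and the magnitude reflects the intensity contributed by the degree words.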
Article
Opinion mining mainly involves three elements: feature and feature-of relations, opinion expressions and the related opinion attributes (e.g., polarity), and feature–opinion relations. Although many works have pursued this aim, previous research has typically handled each of the three elements in isolation, which yields insufficient information extraction results and increases the complexity and running time of information extraction. In this paper, we propose an opinion mining extraction algorithm to jointly discover the main opinion mining elements. Specifically, the algorithm automatically builds kernels to combine closely related words into new terms from word level to phrase level based on dependency relations, and we ensure the accuracy of opinion expressions and polarity based on fuzzy measurements, opinion degree intensifiers, and opinion patterns. An analysis of 3458 reviews shows that the proposed algorithm can effectively identify the main elements simultaneously and outperforms the baseline methods. The proposed algorithm is used to analyze the features among heterogeneous products in the same category. The feature-by-feature comparison can help to select the weaker features and recommend the correct specifications from the beginning of a product's life. From this comparison, some interesting observations are revealed. For example, the negative polarity of the video dimension is higher than that of the product usability dimension for a product, yet enhancing the product usability dimension can improve the product more effectively.
Article
Interest in Opinion Mining has been growing steadily in recent years, mainly because of its great number of applications and the scientific challenge it poses. Accordingly, there are many resources and techniques to help tackle the problem, and most of the latest work fuses them at some stage of the process. However, this combination is usually executed without following any defined guidelines and without considering how it could be replicated and improved; hence the need for a deeper understanding of the fusion process becomes apparent. Information Fusion is the field charged with researching efficient methods for transforming information from different sources into a single coherent representation, and therefore can be used to guide fusion processes in Opinion Mining. In this paper we present a survey on Information Fusion applied to Opinion Mining. We first define Opinion Mining and describe its most fundamental aspects, then explain Information Fusion, and finally review several Opinion Mining studies that rely at some point on the fusion of information.