A Method for Ranking Products Through Online Reviews
Based on Sentiment Classification and Interval-Valued
Intuitionistic Fuzzy TOPSIS
Yang Liu*,‡, Jian-Wu Bi*,§ and Zhi-Ping Fan*,†,¶
*Department of Information Management and Decision Sciences
School of Business Administration
Northeastern University, Shenyang 110167, P. R. China
†State Key Laboratory of Synthetical Automation for Process Industries
Northeastern University, Shenyang 110819, P. R. China
‡liuy@mail.neu.edu.cn
§jianwubi@126.com
¶zpfan@mail.neu.edu.cn
Published 29 September 2017
Studies have shown that online product reviews significantly affect consumer purchase decisions. However, it is difficult for the consumer to read online product reviews one by one because the number of online reviews is very large. Thus, to facilitate consumer purchase decisions, how to rank products through online reviews is a valuable research topic. This paper proposes a method for ranking products through online reviews based on sentiment classification and the interval-valued intuitionistic fuzzy Technique for Order Preference by Similarity to an Ideal Solution (TOPSIS). The method consists of two parts: (1) identifying sentiment orientations of the online reviews based on sentiment classification and (2) ranking alternative products based on interval-valued intuitionistic fuzzy TOPSIS. In the first part, the online reviews of the alternative products concerning multiple attributes are preprocessed, and an algorithm based on support vector machine and one-versus-one strategy is developed for classifying the sentiment orientations of online reviews into three categories: positive, neutral, and negative. In the second part, based on the percentages of the online reviews with different sentiment orientations and the numbers of online reviews of different products crawled from the website, an interval-valued intuitionistic fuzzy number is constructed to represent the performance of an alternative product with respect to the product attribute. Additionally, the interval-valued intuitionistic fuzzy TOPSIS method is employed to determine a ranking of the alternative products. Finally, a case analysis is provided to illustrate the application of the proposed method.
Keywords: Product ranking; online reviews; SVM; sentiment classification; interval-valued intuitionistic fuzzy number; TOPSIS.
§Corresponding author.
International Journal of Information Technology & Decision Making
Vol. 16 (2017)
© World Scientific Publishing Company
DOI: 10.1142/S021962201750033X
Int. J. Info. Tech. Dec. Mak. Downloaded from www.worldscientific.com
by SAM HOUSTON STATE UNIVERSITY on 10/02/17. For personal use only.
1. Introduction
With the rapid development of e-commerce, an increasing number of people are buying products from e-commerce websites. Compared with products exhibited in physical stores, the products exhibited on the websites are viewed with more uncertainty because the consumers cannot touch or try out the products until those products are delivered. Thus, online product reviews published by consumers who have bought or used the products would be helpful for potential consumers to understand the products more clearly [1,2]. In fact, even when consumers want to buy the products from physical stores, they can also visit the websites and obtain more product information by reading the related online product reviews. For example, suppose a consumer wants to buy a medium-sized car. From a preliminary investigation, several acceptable cars are determined, which can be considered the alternative cars. Nevertheless, the consumer wavers among the alternative cars because of limited knowledge and expertise. To make a desirable selection among the alternative cars, the consumer might read online reviews concerning the alternative cars to understand them more clearly and make a reasonable decision. Results of the previous studies have indicated that online product reviews significantly affect consumer purchase decisions [3–5]. However, because the number of online reviews concerning the alternative products is usually large, it can be tedious and time-consuming for the consumer to read all of the online reviews one by one. In this situation, several types of approaches can help the consumer capture the information embedded in the online reviews and make the final decision. These approaches include extracting a subset of important reviews [6], summarizing the opinions of a huge number of online reviews [7], and ranking alternative products through online reviews [8–15]. Among these approaches, ranking alternative products through online reviews is considered a comprehensive approach because it considers multiple factors of product selection such as product attributes, sentiment orientations of online reviews, product attribute weights, and posted time of reviews [8–15]. Thus, how to rank products through online reviews is a valuable research topic with extensive application backgrounds.
Until now, the problem of ranking products through online reviews has attracted the attention of some scholars, and several methods for ranking products through online reviews have been proposed [8–15]. The existing methods usually involve two processes, namely, (1) information extraction and (2) product ranking. The former extracts the related information from online reviews, such as product attributes and sentiment orientations. The latter ranks products based on the extracted information. These studies have made significant contributions to ranking products through online reviews. However, in most of the previous studies [8–14], the online reviews with neutral sentiment orientations are ignored. In fact, the reviews with neutral sentiment orientations represent the hesitant or uncertain evaluations of consumers concerning products, and they should not be ignored [15–17] because they are also valuable for the potential consumer to make a reasonable decision. For example, let us consider a consumer who wants to buy one of two alternative cars. One car (denoted as A1) has received 100 reviews, including 45 reviews with positive sentiment orientation, 50 reviews with neutral sentiment orientation and 5 reviews with negative sentiment orientation; the other car (denoted as A2) has received 50 reviews, including 45 reviews with positive sentiment orientation and 5 reviews with negative sentiment orientation. It would be difficult to choose between the two cars if the online reviews with neutral sentiment orientation were ignored, because both cars would then show 45 positive and 5 negative reviews. However, if the online reviews with neutral sentiment orientation were considered, most consumers would select car A2 because of the higher percentage of positive evaluations (90% versus 45%) and the lower percentage of hesitant or uncertain evaluations (0% versus 50%) received by A2. Thus, the reviews with neutral sentiment orientation should not be ignored when determining the ranking of alternative products.
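The arithmetic behind the two-car example can be checked with a short sketch. The function name `sentiment_shares` is introduced here purely for illustration and is not part of the proposed method; the review counts are the ones given in the example.

```python
def sentiment_shares(positive, neutral, negative):
    """Return the (positive, neutral, negative) shares of all reviews."""
    total = positive + neutral + negative
    return positive / total, neutral / total, negative / total


# Counts from the example: A1 has 45/50/5 reviews, A2 has 45/0/5.
a1_all = sentiment_shares(45, 50, 5)        # neutral reviews kept
a2_all = sentiment_shares(45, 0, 5)

# Dropping the neutral reviews makes the two cars indistinguishable.
a1_no_neutral = sentiment_shares(45, 0, 5)
a2_no_neutral = sentiment_shares(45, 0, 5)

print(a1_no_neutral == a2_no_neutral)
print(a1_all, a2_all)
```

With the neutral reviews kept, the shares of A1 and A2 differ in both the positive and the neutral component, which is the information lost by methods that discard neutral reviews.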
An intuitionistic fuzzy number is a valuable data form to represent information with hesitance and uncertainty [18,19]. An intuitionistic fuzzy number can be used to represent evaluations or judgments with different degrees of support, hesitation, and opposition [20,21]. Thus, based on sentiment analysis technique and intuitionistic fuzzy set theory, Liu et al. [15] proposed a method for ranking products through online reviews. In the method, based on the percentages of reviews with positive, neutral, and negative sentiment orientations, an intuitionistic fuzzy number is constructed to represent the performance of an alternative product with respect to a product attribute. Then, the Preference Ranking Organization Method for Enrichment Evaluations (PROMETHEE) II method is used to determine the ranking of alternative products. In the study of Liu et al. [15], a large number of sentiment orientations of online reviews of an alternative product concerning a product attribute can be represented simply and completely by an intuitionistic fuzzy number. This approach is a new idea and a valuable attempt to process and fuse a large number of sentiment orientations embedded in online reviews. However, although an intuitionistic fuzzy number can simply reflect the percentages of online reviews with different sentiment orientations, the numbers of online reviews of different products crawled from the website are not considered. In fact, there might be great differences among the numbers of online reviews concerning different products crawled from the website. The number of online reviews affects the confidence of the decision data refined from the online reviews. That is, an intuitionistic fuzzy number constructed based on the sentiment orientations of a large number of online reviews should have a high confidence level; conversely, an intuitionistic fuzzy number constructed based on the sentiment orientations of a small number of online reviews should have a low confidence level. Thus, to eliminate the effects of different numbers of online reviews on the confidence levels of the constructed intuitionistic fuzzy numbers, confidence intervals can be obtained, based on the confidence interval estimation in probability theory [22], from the percentages of different sentiment orientations and the numbers of online reviews of the alternative products. That is, the percentages of different sentiment orientations in the form of crisp numbers are replaced by confidence intervals. Therefore, an interval-valued intuitionistic fuzzy number can be constructed to reflect the performance of an alternative product concerning a product attribute, and then the interval-valued intuitionistic fuzzy Technique for Order Preference by Similarity to an Ideal Solution (TOPSIS) [23,24] can be used to determine a ranking of alternative products.
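The effect of the number of reviews on confidence can be illustrated with a minimal sketch. The interval formula used by the method is not given in this section, so the code below assumes the common normal-approximation (Wald) interval at the 95% level (z = 1.96); the function name `proportion_interval` and the sample counts are hypothetical.

```python
import math


def proportion_interval(count, total, z=1.96):
    """Normal-approximation interval for a proportion, clipped to [0, 1].

    Assumption: a Wald interval; the paper's own estimator may differ.
    """
    p = count / total
    half_width = z * math.sqrt(p * (1.0 - p) / total)
    return max(0.0, p - half_width), min(1.0, p + half_width)


# Same 90% positive share, very different numbers of reviews.
few = proportion_interval(9, 10)
many = proportion_interval(900, 1000)

print(few)
print(many)
```

Both products show the same percentage of positive reviews, but the interval for the product with 1000 reviews is much narrower, which is the kind of difference a crisp percentage cannot express.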
The objective of this paper is to propose a method for ranking products through online reviews based on sentiment classification and interval-valued intuitionistic fuzzy TOPSIS. The method consists of two parts: (1) identifying sentiment orientations of the online reviews based on sentiment classification and (2) ranking alternative products based on interval-valued intuitionistic fuzzy TOPSIS. In the first part, the online reviews of the alternative products concerning multiple attributes are preprocessed using the ICTCLAS 2016 (Institute of Computing Technology, Chinese Lexical Analysis System, http://ictclas.nlpir.org/), and a set of notional words is constructed with respect to alternative products concerning each product attribute. Then, based on support vector machine (SVM) and one-versus-one (OVO) strategy, an algorithm is developed for classifying the sentiment orientations of online reviews of the alternative products concerning different attributes into positive, neutral, and negative. In the second part, based on the percentages of the online reviews with different sentiment orientations and the numbers of online reviews of different products crawled from the website, an interval-valued intuitionistic fuzzy number is constructed to represent the performance of an alternative product concerning a product attribute. Then, the interval-valued intuitionistic fuzzy TOPSIS method is employed to determine a ranking of the alternative products.
The remainder of this paper is arranged as follows. Section 2 provides a literature review of methods for ranking products through online reviews. Section 3 formulates the problem of ranking products through online reviews. In Sec. 4, descriptions of the two parts of the proposed method are presented. In Sec. 5, a case study on ranking 11 cars is provided to illustrate the use of the proposed method. Finally, Sec. 6 summarizes and highlights the major contributions of this paper.
2. Literature Review of Methods for Ranking Products
Through Online Reviews
The problem of ranking products through online reviews has received the attention of scholars, and several methods for ranking products through online reviews have been proposed [8–15]. Zhang et al. [8] focused on the problem of ranking products through online reviews and proposed a method based on a directed and weighted product graph. In the method, subjective sentences and comparative sentences in online product reviews are first distinguished, in which a subjective sentence represents the subjective opinion of a consumer on a product and a comparative sentence represents a comparison relationship of a pair of products. Then, the positive or negative sentiment orientation of each subjective sentence and each comparative sentence is identified using the sentiment analysis technique, and a directed and weighted product graph is constructed which simultaneously reflects the subjective opinions and comparison relationships of the products. Finally, an improved PageRank algorithm is proposed to determine a ranking of products based on the directed and weighted product graph. On this basis, Zhang et al. [9] further proposed a method for ranking products through online reviews based on different aspects of product attributes. In the method, sentences in the online reviews are initially classified into different subsets based on the product attribute mentioned in the sentences. Then, the procedure proposed by Zhang et al. [8] is used to determine a ranking of products based on the sentences concerning each product attribute. Thereafter, Zhang et al. [10] incorporated the helpfulness and the age importance of each review into the determination of the ranking of products, in which the helpfulness of each review is measured according to the number of helpful votes received by the review, and the age importance of each review is calculated according to the posted date of the review. Peng et al. [11] suggested a method for ranking products through Chinese online reviews based on the fuzzy PROMETHEE method. According to similarity degrees of different Chinese words, the synonyms concerning each product attribute are determined. According to the total frequency of the synonyms concerning each product attribute, the set of important product attributes is determined. Then, several domain experts are invited to provide subjective evaluations on multiple products concerning the important product attributes. Furthermore, according to the subjective evaluations, the ranking of products is determined by the fuzzy PROMETHEE method. Chen et al. [12] proposed a method for market structure visualization through online product reviews. In their method, the online product reviews are initially classified into positive and negative reviews. Additional analysis is conducted based on the positive reviews and the negative reviews, respectively. Using topic modeling and the scree plot technique, the topic distribution matrix can be obtained, and the weight matrix of all brands and the weights of important topics are determined according to the topic distribution matrix. Furthermore, a perceptual map of market structure is built by the multi-dimensional scaling method, and a ranking of products is obtained using the TOPSIS method. Najmi et al. [13] proposed a comprehensive method for ranking products through online reviews. In the method, the brand score of each product is calculated by an improved PageRank algorithm, and the review score is calculated based on the results of sentiment analysis and usefulness analysis of each online review. The final ranking score of each product is determined by aggregating the brand score and the review score. Yang et al. [14] proposed a method for ranking multiple products by integrating heterogeneous information including numeric ratings, text reviews, and comparative votes. In the method, the heterogeneous information is first classified into two categories: descriptive information and comparative information. Then, the descriptive information and comparative information are integrated into a digraph structure, from which an integrated electronic word-of-mouth (eWOM) score of each product can be calculated. Furthermore, based on the obtained eWOM score, the overall ranking of the multiple products can be determined. Liu et al. [15] proposed a method for ranking products based on the sentiment analysis technique and intuitionistic fuzzy set theory. In the method, an algorithm based on sentiment dictionaries is developed to identify the positive, neutral, and negative sentiment orientations of online reviews. Then, according to the identified sentiment orientations, an intuitionistic fuzzy number is constructed to represent the performance of each alternative product concerning each product attribute. Furthermore, the dominance degrees on pairwise comparisons of the alternative products are calculated, and a ranking of the alternative products is determined using the PROMETHEE II method.
The previous studies have made significant contributions to ranking products through online reviews. However, in most of the previous studies [8–14], the online reviews with neutral sentiment orientations are ignored, which leads to a loss of valuable decision data (an example can be found in Sec. 1). Although the online reviews with neutral sentiment orientations are considered in the study of Liu et al. [15], the numbers of online reviews of different products crawled from the website are not considered. In fact, the different numbers of online reviews reflect the different confidence levels of decision data refined from the online reviews and are valuable for the consumer to select a desirable alternative product. Thus, to support consumer purchase decisions, it is necessary to further develop the method for ranking products through online reviews.
3. Problem Description
Consider a consumer who wants to buy a product such as a car. Through a preliminary investigation, several alternative products are identified. The alternative products are all acceptable, but the consumer cannot decide which one to buy because of limited knowledge and expertise. To select the most desirable one from the alternative products, the consumer provides his/her personalized preferences on the important product attributes and the weights of these important product attributes. To support the consumer purchase decision, a large number of online reviews of the alternative products concerning the product attributes are crawled from the related website. The problem addressed in this paper is how to rank the alternative products based on the online reviews and the attribute weights provided by the consumer. The problem of ranking products through online reviews is illustrated in Fig. 1. The following notations are used to denote the sets and variables in the problem. These notations will be used throughout this paper.
[Figure 1: the consumer supplies the alternative products A1, A2, ..., An, the product attributes f1, f2, ..., fm, and the attribute weights w1, w2, ..., wm; together with the online reviews concerning the alternative products, these are the inputs of the method for ranking products through online reviews, which returns a ranking of the alternative products to the consumer.]
Fig. 1. Problem of ranking products through online reviews.
• $A = \{A_1, A_2, \ldots, A_n\}$: the set of $n$ acceptable alternative products, where $A_i$ denotes the $i$th acceptable alternative product, $i = 1, 2, \ldots, n$. The set $A$ can be determined by the consumer according to his/her personal preference.
• $F = \{f_1, f_2, \ldots, f_m\}$: the set of $m$ product attributes of interest to the consumer, where $f_j$ denotes the $j$th product attribute of interest to the consumer, $j = 1, 2, \ldots, m$. Usually, a predefined attribute set with a great number of attributes can be determined by the website based on the characteristics of the type of product, and the consumer can determine set $F$ by selecting some or all of the predefined attributes according to his/her personal preference.
• $w = (w_1, w_2, \ldots, w_m)$: the vector of attribute weights, where $w_j$ denotes the weight of attribute $f_j$, such that $w_j \geq 0$ and $\sum_{j=1}^{m} w_j = 1$, $j = 1, 2, \ldots, m$. $w$ can be directly determined by the assignment of the consumer or indirectly obtained using existing procedures such as analytic hierarchy process (AHP) [25–27]. If the consumer is not familiar with AHP, a webpage including the framework of AHP could be helpful for the consumer to determine the attribute weights. In the webpage, a series of questions on attribute comparisons are embedded beforehand. Then, according to the answers of the consumer, the attribute weights can be obtained automatically based on AHP [25].
• $Q = (q_1, q_2, \ldots, q_n)$: the vector of numbers of the online reviews concerning alternative products, where $q_i$ denotes the number of the online reviews concerning alternative product $A_i$, $i = 1, 2, \ldots, n$. The online reviews can be crawled from the related website.
• $D_{ik} = (D_{ik}^{1}, D_{ik}^{2}, \ldots, D_{ik}^{m})$: the $k$th online review concerning alternative product $A_i$, where $D_{ik}^{j}$ denotes the sentence concerning attribute $f_j$ in the $k$th online review of alternative product $A_i$, $i = 1, 2, \ldots, n$; $j = 1, 2, \ldots, m$; $k = 1, 2, \ldots, q_i$. If there is no sentence concerning product attribute $f_j$ in review $D_{ik}$, then we denote $D_{ik}^{j} = \emptyset$, $i = 1, 2, \ldots, n$, $j = 1, 2, \ldots, m$, $k = 1, 2, \ldots, q_i$. Currently, some websites encourage consumers to post their reviews according to a pre-established framework of product attributes, such as Automobile home (http://www.autohome.com.cn/). Even when the reviews are not posted according to the product attributes, some existing techniques [28,29] can be used to extract sentences concerning different attributes from the online reviews. Thus, in this study, we consider that the online reviews have been expressed in the form of sentences concerning different attributes, i.e., $D_{ik} = (D_{ik}^{1}, D_{ik}^{2}, \ldots, D_{ik}^{m})$.
• $T^{j} = \{T_{1}^{j}, T_{2}^{j}, \ldots, T_{q_t^j}^{j}\}$: the set of $q_t^j$ training samples for identifying the sentiment orientations of online reviews concerning attribute $f_j$, where $T_{z}^{j}$ is a review (or sentence) on the same type of products concerning attribute $f_j$ that has been labeled with a positive, neutral, or negative sentiment orientation, $z = 1, 2, \ldots, q_t^j$, $j = 1, 2, \ldots, m$. Thus, the set $T^{j}$ can be further divided into three subsets $T_{pos}^{j}$, $T_{neu}^{j}$, and $T_{neg}^{j}$, which, respectively, denote the sets of training samples with positive, neutral, and negative sentiment orientations, such that $T_{pos}^{j} \cup T_{neu}^{j} \cup T_{neg}^{j} = T^{j}$, $T_{pos}^{j} \cap T_{neu}^{j} = \emptyset$, $T_{pos}^{j} \cap T_{neg}^{j} = \emptyset$, and $T_{neu}^{j} \cap T_{neg}^{j} = \emptyset$, $j = 1, 2, \ldots, m$.
The problem of concern in this paper is how to rank alternative products $A_1, A_2, \ldots, A_n$ based on the online reviews $D_{ik}$ and attribute weights $w_j$, $i = 1, 2, \ldots, n$, $j = 1, 2, \ldots, m$, $k = 1, 2, \ldots, q_i$.
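To make the notation concrete, the inputs of the problem can be sketched as a small data structure. The class name `RankingProblem`, its field names, and the sample values are hypothetical and are introduced only for illustration; `None` is used here as a stand-in for an empty entry of a review.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RankingProblem:
    products: List[str]      # A_1, ..., A_n
    attributes: List[str]    # f_1, ..., f_m
    weights: List[float]     # w_1, ..., w_m
    # reviews[i][k][j] holds the sentence D_ik^j; None stands for an empty entry.
    reviews: List[List[List[Optional[str]]]]

    def review_counts(self) -> List[int]:
        """Return Q = (q_1, ..., q_n), the number of reviews per product."""
        return [len(product_reviews) for product_reviews in self.reviews]

    def validate(self) -> None:
        """Check the constraints stated in the notation list."""
        m = len(self.attributes)
        if len(self.weights) != m:
            raise ValueError("one weight per attribute is required")
        if any(w < 0 for w in self.weights):
            raise ValueError("weights must be non-negative")
        if abs(sum(self.weights) - 1.0) > 1e-9:
            raise ValueError("weights must sum to 1")
        if len(self.reviews) != len(self.products):
            raise ValueError("one review list per product is required")
        for product_reviews in self.reviews:
            for review in product_reviews:
                if len(review) != m:
                    raise ValueError("each review needs one entry per attribute")


# Made-up sample data: two products, two attributes.
problem = RankingProblem(
    products=["A1", "A2"],
    attributes=["power", "comfort"],
    weights=[0.6, 0.4],
    reviews=[
        [["strong engine", None], ["a little weak", "seat is fine"]],
        [[None, "very comfortable"]],
    ],
)
problem.validate()
print(problem.review_counts())
```

Keeping one entry per attribute in every review, even when it is empty, mirrors the vector form of each review used in the notation.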
4. The Proposed Method
To solve the above problem, a method for ranking products through online reviews based on sentiment classification and interval-valued intuitionistic fuzzy TOPSIS is proposed in this section. The method consists of two parts: (1) identifying sentiment orientations of the online reviews based on sentiment classification and (2) ranking alternative products based on interval-valued intuitionistic fuzzy TOPSIS. Detailed descriptions of the two parts are, respectively, provided in Secs. 4.1 and 4.2.
4.1. Identifying sentiment orientations of the online reviews based
on sentiment classification
In this paper, the sentiment classification technique is employed to identify the sentiment orientations of online reviews of alternative products with respect to each attribute. Thus, the online reviews are initially preprocessed, and then an algorithm based on SVM and the OVO strategy is proposed to identify the positive, neutral, and negative sentiment orientations of the online reviews. The details are, respectively, provided in Secs. 4.1.1 and 4.1.2.
4.1.1. Preprocessing online reviews concerning the alternative products
The preprocessing includes two processes, namely (1) word segmentation and part-
of-speech (POS) tagging and (2) stop word removal. The details are provided below.
(1) Word segmentation and POS tagging
In this paper, ICTCLAS 2016 is used for word segmentation and POS tagging. Each sentence in the online review is decomposed into several words, and the POS of each word is tagged after the word. If the Chinese sentence meaning "the seat is very comfortable" is imported into ICTCLAS 2016, then the output consists of three tagged words (i.e., "seat/n, very/d, comfortable/a"), where "/n" denotes "noun"; "/d" denotes "adverb"; "/v" denotes "verb"; and "/a" denotes "adjective". Let $D'_{ik} = (D'^{1}_{ik}, D'^{2}_{ik}, \ldots, D'^{m}_{ik})$ denote the output of ICTCLAS 2016 when the online review $D_{ik}$ is input, $i = 1, 2, \ldots, n$, $j = 1, 2, \ldots, m$, $k = 1, 2, \ldots, q_i$. Specifically, if $D^{j}_{ik} = \emptyset$, then $D'^{j}_{ik} = \emptyset$, $i = 1, 2, \ldots, n$, $j = 1, 2, \ldots, m$, $k = 1, 2, \ldots, q_i$.
(2) Stop word removal
To improve the efficiency and effectiveness of sentiment classification, stop words must be removed. The words in the stop word list (see: http://www.datatang.com/data/19300) are deleted from $D'_{ik} = (D'^{1}_{ik}, D'^{2}_{ik}, \ldots, D'^{m}_{ik})$, $i = 1, 2, \ldots, n$, $j = 1, 2, \ldots, m$, $k = 1, 2, \ldots, q_i$. Let $W_{ik} = (W^{1}_{ik}, W^{2}_{ik}, \ldots, W^{m}_{ik})$ denote the vector of notional words obtained by removing the stop words from $D'_{ik}$, where $W^{j}_{ik} = \{W^{j1}_{ik}, W^{j2}_{ik}, \ldots, W^{j q^{j}_{ik}}_{ik}\}$ denotes the set of notional words in sentence $D'^{j}_{ik}$, and $q^{j}_{ik}$ denotes the number of words in $W^{j}_{ik}$, $i = 1, 2, \ldots, n$, $j = 1, 2, \ldots, m$, $k = 1, 2, \ldots, q_i$. Specifically, if $D^{j}_{ik} = \emptyset$, then we denote $W^{j}_{ik} = \emptyset$, $i = 1, 2, \ldots, n$, $j = 1, 2, \ldots, m$, $k = 1, 2, \ldots, q_i$.
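The two preprocessing steps can be sketched as follows. This is not the ICTCLAS 2016 interface: the sketch assumes the segmenter output is already available as a string of word/tag tokens (shown with English stand-ins for the Chinese words), and the stop word list, the function names, and the use of `None` for an empty sentence are all hypothetical.

```python
def parse_tagged(tagged_sentence):
    """Split a 'word/tag word/tag' string into (word, tag) pairs."""
    pairs = []
    for token in tagged_sentence.split():
        word, sep, tag = token.rpartition("/")
        if not sep:  # token carries no tag
            word, tag = token, ""
        pairs.append((word, tag))
    return pairs


def notional_words(tagged_sentence, stop_words):
    """Return the words left after stop word removal; empty input gives []."""
    if not tagged_sentence:
        return []
    return [w for w, _tag in parse_tagged(tagged_sentence) if w not in stop_words]


STOP_WORDS = {"the", "is", "very"}  # hypothetical stand-in list
# One tagged sentence per attribute; None marks an attribute with no sentence.
review = ["seat/n very/d comfortable/a", None]
word_lists = [notional_words(s, STOP_WORDS) for s in review]
print(word_lists)
```

An attribute without a sentence yields an empty word list, matching the convention that an empty sentence gives an empty set of notional words.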
4.1.2. An algorithm based on SVM and the OVO strategy to identify the positive,
neutral, and negative sentiment orientations of online reviews
To rank products through online reviews, we must identify the sentiment orientations of the online reviews of alternative products with respect to different attributes. Previous studies [30–32] have shown that SVM performs well not only for binary sentiment classification but also for multi-class sentiment classification when combined with the OVO strategy. Thus, in this section, an algorithm based on SVM and the OVO strategy is proposed to identify the positive, neutral, and negative sentiment orientations of online reviews. The structure of the algorithm is shown in Fig. 2.
As seen in Fig. 2, the algorithm can be divided into two stages: (1) converting training samples and online reviews into feature vectors concerning the notional words and (2) training the SVM classifiers and identifying the sentiment orientations of online reviews using the OVO strategy. In the first stage, according to the training samples and online reviews, a set of notional words is constructed with respect to alternative products concerning each product attribute. Then, the training samples and the online reviews of alternative products are uniformly converted into feature vectors concerning the notional words. In the second stage, three subsets of training samples (i.e., "positive-neutral" $T^{j}_{pos} \cup T^{j}_{neu}$, "positive-negative" $T^{j}_{pos} \cup T^{j}_{neg}$, and "neutral-negative" $T^{j}_{neu} \cup T^{j}_{neg}$) are first constructed with respect to each product attribute. Then, three SVM classifiers are, respectively, trained by the three subsets with respect to the product attribute. Furthermore, a classification result of the sentiment orientation of each review concerning each attribute can be obtained by each SVM classifier, and the OVO strategy is used to integrate the results of the three SVM classifiers and to determine the final sentiment orientation of each review with respect to each product attribute. A detailed description of each stage is provided below.
[Figure 2: (1) training samples with labeled sentiment orientations (pos, neu, neg) and online reviews are converted into feature vectors over the set of notional words; (2) the training samples are split into Subset(pos-neu), Subset(pos-neg), and Subset(neu-neg), the classifiers SVM(pos-neu), SVM(pos-neg), and SVM(neu-neg) are trained on the three subsets, respectively, and the OVO strategy combines their outputs into the sentiment orientations of the online reviews.]
Fig. 2. Structure of the algorithm based on SVM and the OVO strategy.
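The integration step of the OVO strategy can be sketched as a majority vote over the three pairwise decisions. This sketch makes two assumptions that are not stated in the text above: each binary classifier is represented only by the label it outputs, and a three-way tie (one vote per class) falls back to the neutral label. The names `PAIRS`, `ovo_vote`, and `tie_label` are hypothetical.

```python
from collections import Counter

PAIRS = (("pos", "neu"), ("pos", "neg"), ("neu", "neg"))


def ovo_vote(pairwise_winners, tie_label="neu"):
    """Combine three pairwise decisions into one label by majority vote.

    pairwise_winners maps each pair in PAIRS to the label chosen by the
    binary classifier trained on that pair.
    """
    for pair in PAIRS:
        if pairwise_winners[pair] not in pair:
            raise ValueError(f"classifier {pair} returned an unknown label")
    votes = Counter(pairwise_winners[pair] for pair in PAIRS)
    label, count = votes.most_common(1)[0]
    # With three classes, a winner needs two votes; otherwise every class has one.
    return label if count == 2 else tie_label


clear = {("pos", "neu"): "pos", ("pos", "neg"): "pos", ("neu", "neg"): "neg"}
cyclic = {("pos", "neu"): "pos", ("pos", "neg"): "neg", ("neu", "neg"): "neu"}
print(ovo_vote(clear), ovo_vote(cyclic))
```

Because each class takes part in exactly two of the three pairwise classifiers, a class can collect at most two votes, so the only ambiguous outcome is the cyclic case handled by `tie_label`.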
(1) Converting training samples and online reviews into feature vectors concerning
the notional words
Without loss of generality, we assume that the training sample $T^{j}_{z}$ has been preprocessed by the process shown in Sec. 4.1.1 and has been changed into the set of notional words, $j = 1, 2, \ldots, m$; $z = 1, 2, \ldots, q_t^j$. Let $W^{j}_{i}$ denote the set of notional words of online reviews of alternative product $A_i$ with respect to attribute $f_j$, and let $W^{j}$ denote the set of notional words of training samples and online reviews with respect to attribute $f_j$; then $W^{j}_{i}$ and $W^{j}$ can be, respectively, determined by Eqs. (1) and (2), i.e.,
$$W^{j}_{i} = W^{j}_{i1} \cup W^{j}_{i2} \cup \cdots \cup W^{j}_{iq_i}, \quad i = 1, 2, \ldots, n; \; j = 1, 2, \ldots, m, \qquad (1)$$
$$W^{j} = W^{j}_{1} \cup W^{j}_{2} \cup \cdots \cup W^{j}_{n} \cup T^{j}_{1} \cup T^{j}_{2} \cup \cdots \cup T^{j}_{q_t^j}, \quad j = 1, 2, \ldots, m. \qquad (2)$$
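Equations (1) and (2) are plain set unions, which the following sketch makes explicit for a single attribute. The function names and the sample word sets are hypothetical, and sorting the result is an assumption added here so that each word has a fixed position in the later feature vectors.

```python
def product_vocabulary(review_word_sets):
    """Eq. (1): union of the notional word sets of one product's reviews."""
    vocabulary = set()
    for words in review_word_sets:
        vocabulary |= set(words)
    return vocabulary


def attribute_vocabulary(per_product_review_word_sets, training_word_sets):
    """Eq. (2): union over all products' reviews and all training samples."""
    vocabulary = set()
    for review_word_sets in per_product_review_word_sets:
        vocabulary |= product_vocabulary(review_word_sets)
    for words in training_word_sets:
        vocabulary |= set(words)
    # Assumption: a sorted list gives every word a stable index h.
    return sorted(vocabulary)


# Made-up word sets for one attribute: two products and one training sample.
reviews_by_product = [[{"power", "ok"}, set()], [{"oil"}]]
training = [{"power", "weak"}]
vocab = attribute_vocabulary(reviews_by_product, training)
print(vocab)
```

Empty reviews contribute nothing to the union, so they need no special handling at this step.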
Let $E_j$ denote the number of words in set $\overline{W}^j$; then set $\overline{W}^j$ can be further represented by $\overline{W}^j = \{W^{j1}, W^{j2}, \ldots, W^{jE_j}\}$, where $W^{jh}$ denotes the $h$th notional word in the set $\overline{W}^j$, $h = 1, 2, \ldots, E_j$. Based on the bag-of-words (BOW) model,$^{33}$ review $D_{ik}^j$ can be represented by a feature vector $\omega_{ik}^j = (\omega_{ik}^{j1}, \omega_{ik}^{j2}, \ldots, \omega_{ik}^{jE_j})$ concerning the set $\overline{W}^j = \{W^{j1}, W^{j2}, \ldots, W^{jE_j}\}$, where $\omega_{ik}^{jh}$ denotes the weight of word $W^{jh}$ for distinguishing the semantics of review $D_{ik}^j$, $i = 1, 2, \ldots, n$, $j = 1, 2, \ldots, m$, $k = 1, 2, \ldots, q_i$, $h = 1, 2, \ldots, E_j$. To determine the weight $\omega_{ik}^{jh}$, the idea of term frequency-inverse document frequency (TF-IDF)$^{34}$ is employed. On the one hand, the greater the frequency of word $W^{jh}$ in review $D_{ik}^j$ is, the more important word $W^{jh}$ is for distinguishing the semantics of review $D_{ik}^j$, and the greater $\omega_{ik}^{jh}$ will be. On the other hand, the greater the frequency of word $W^{jh}$ across all of the reviews is, the less important word $W^{jh}$ is for distinguishing the semantics of any single review $D_{ik}^j$, and the smaller $\omega_{ik}^{jh}$ will be. Thus, based on the idea of TF-IDF,$^{34}$ the value of $\omega_{ik}^{jh}$ can be calculated by the following Eq. (3), i.e.,

$$\omega_{ik}^{jh} = \frac{\tau_{ik}^{jh}}{\eta_{ik}^{j}} \log \frac{q_t^j + |\{(i,k) : D_{ik}^j \neq \emptyset\}|}{|\{z : W^{jh} \in \overline{T}_z^j\}| + |\{(i,k) : W^{jh} \in D_{ik}^j\}| + 1}. \tag{3}$$

In Eq. (3), $\tau_{ik}^{jh}$ denotes the frequency of word $W^{jh}$ in review $D_{ik}^j$, $\eta_{ik}^{j}$ denotes the number of notional words in review $D_{ik}^j$, and $|\{(i,k) : D_{ik}^j \neq \emptyset\}|$ denotes the number of
10 Y. Liu, J.-W. Bi & Z.-P. Fan
Int. J. Info. Tech. Dec. Mak. Downloaded from www.worldscientific.com
by SAM HOUSTON STATE UNIVERSITY on 10/02/17. For personal use only.
non-empty reviews of all alternative products concerning attribute $f_j$, and $|\{z : W^{jh} \in \overline{T}_z^j\}|$ and $|\{(i,k) : W^{jh} \in D_{ik}^j\}|$ denote the numbers of training samples containing the word $W^{jh}$ and online reviews containing the word $W^{jh}$, respectively, $i = 1, 2, \ldots, n$, $j = 1, 2, \ldots, m$, $k = 1, 2, \ldots, q_i$, $h = 1, 2, \ldots, E_j$.
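Similarly, let $\omega_z^j = (\omega_z^{j1}, \omega_z^{j2}, \ldots, \omega_z^{jE_j})$ denote the feature vector of training sample $\overline{T}_z^j$ concerning set $\overline{W}^j = \{W^{j1}, W^{j2}, \ldots, W^{jE_j}\}$, where $\omega_z^{jh}$ denotes the weight of word $W^{jh}$ for distinguishing the semantics of training sample $\overline{T}_z^j$; $j = 1, 2, \ldots, m$; $z = 1, 2, \ldots, q_t^j$. Based on the idea of TF-IDF,$^{34}$ the value of $\omega_z^{jh}$ can be calculated by the following Eq. (4), i.e.,

$$\omega_z^{jh} = \frac{\tau_z^{jh}}{\eta_z^{j}} \log \frac{q_t^j + |\{(i,k) : D_{ik}^j \neq \emptyset\}|}{|\{z : W^{jh} \in \overline{T}_z^j\}| + |\{(i,k) : W^{jh} \in D_{ik}^j\}| + 1}, \tag{4}$$

where $\tau_z^{jh}$ denotes the frequency of word $W^{jh}$ in training sample $\overline{T}_z^j$, and $\eta_z^{j}$ denotes the number of notional words in training sample $\overline{T}_z^j$, $j = 1, 2, \ldots, m$, $z = 1, 2, \ldots, q_t^j$, $h = 1, 2, \ldots, E_j$.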
To illustrate the process of converting training samples and online reviews into feature vectors more clearly, an example is provided. Table 1 shows the notional words of 1109 training samples and 346 online reviews concerning the attribute of car power. The set of notional words concerning the attribute of car power can be obtained using Eq. (2), which is composed of 2194 words, i.e., {(power), (aspect), (a little), (surprise), (OK), (not), (sedan car), (pretty good), (give), (oil), (run), ...}. Then, the weight of each word for distinguishing the semantics of the training samples and the online reviews can be calculated using Eqs. (3) and (4). Furthermore, according to the obtained weight of each word, the feature vectors of training samples and online reviews can be determined. These vectors are shown in Table 2.
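The conversion described by Eqs. (1)–(4) can be sketched in Python as follows. This is an illustrative sketch only: documents are assumed to be lists of notional words that were already segmented and stripped of stop words, and the function names, the natural logarithm, and the reading of the "+1" term as part of the denominator are assumptions of the sketch rather than details confirmed by the text.

```python
import math
from collections import Counter


def build_vocabulary(training_docs, review_docs):
    """Union of all notional words, as in Eqs. (1)-(2), in sorted order."""
    vocabulary = set()
    for doc in list(training_docs) + list(review_docs):
        vocabulary.update(doc)
    return sorted(vocabulary)


def feature_vector(doc, vocabulary, training_docs, review_docs):
    """TF-IDF style weights of one document, following Eqs. (3)-(4)."""
    if not doc:
        return [0.0] * len(vocabulary)  # an empty review carries no weight
    counts = Counter(doc)
    # Training samples plus non-empty reviews (numerator inside the log).
    n_docs = len(training_docs) + sum(1 for d in review_docs if d)
    vector = []
    for word in vocabulary:
        doc_freq = sum(1 for d in training_docs if word in d)
        doc_freq += sum(1 for d in review_docs if word in d)
        tf = counts[word] / len(doc)
        # Assumption: natural log, with the "+1" placed in the denominator.
        idf = math.log(n_docs / (doc_freq + 1))
        vector.append(tf * idf)
    return vector
```

Under this reading, a word that occurs in almost every document (such as the attribute name itself) receives a weight close to zero or slightly negative, while a word concentrated in a few documents receives a large weight.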
(2) Training the SVM classifiers and identifying the sentiment orientations of online reviews using the OVO strategy
Among the existing multiple machine learning algorithms, SVM is considered the one most suitable for identifying the sentiment orientations of online reviews.$^{30–32,35–37}$
Table 1. Notional words of training samples and online reviews concerning the attribute of car power.

| | | Notional words |
|---|---|---|
| Training samples | $\overline{T}_1^j$ | (power) (aspect) (a little), (surprise) |
| | ... | ... |
| | $\overline{T}_{1109}^j$ | (power) (OK) (not) (sedan car) ... |
| Online reviews | $D_{i1}^j$ | (power) (aspect) (not) (OK) ... |
| | ... | ... |
| | $D_{i346}^j$ | (pretty good) (give) (oil) (run) ... |
However, SVM cannot be directly used for classifying the sentiment orientations of online reviews into three categories, i.e., positive, neutral, and negative, because SVM is a binary classifier. Thus, SVM is usually combined with the OVO strategy. In the previous studies, several OVO strategies have been proposed, such as voting strategy$^{38}$ and weighted voting strategy.$^{39}$ In this study, the weighted voting strategy$^{39}$ is used, and an algorithm based on SVM and the weighted voting strategy is proposed to identify the positive, neutral, and negative sentiment orientations of online reviews. Details of the algorithm are provided below.
According to the feature vectors of training samples $\omega_z^j = (\omega_z^{j1}, \omega_z^{j2}, \ldots, \omega_z^{jE_j})$, $j = 1, 2, \ldots, m$, $z = 1, 2, \ldots, q_t^j$, three subsets of training samples for distinguishing different pairs of sentiment orientations are, respectively, constructed. The subset of training samples for discriminating "positive-neutral" (pos-neu) is the union of training samples with positive and neutral sentiment orientations ($\overline{T}_{\mathrm{pos}}^j \cup \overline{T}_{\mathrm{neu}}^j$), that for "positive-negative" (pos-neg) is $\overline{T}_{\mathrm{pos}}^j \cup \overline{T}_{\mathrm{neg}}^j$, and that for "neutral-negative" (neu-neg) is $\overline{T}_{\mathrm{neu}}^j \cup \overline{T}_{\mathrm{neg}}^j$, $j = 1, 2, \ldots, m$. Then, by training the SVM using the three subsets of training samples, respectively, three SVM classifiers for discriminating different pairs of sentiment orientations can be obtained. These classifiers are denoted as $\mathrm{SVM}_j^{\text{pos-neu}}$, $\mathrm{SVM}_j^{\text{pos-neg}}$, and $\mathrm{SVM}_j^{\text{neu-neg}}$, $j = 1, 2, \ldots, m$. To identify the sentiment orientation of online review $D_{ik}^j$, the feature vector $\omega_{ik}^j = (\omega_{ik}^{j1}, \omega_{ik}^{j2}, \ldots, \omega_{ik}^{jE_j})$ of $D_{ik}^j$ is, respectively, input into the three SVM classifiers $\mathrm{SVM}_j^{\text{pos-neu}}$, $\mathrm{SVM}_j^{\text{pos-neg}}$, and $\mathrm{SVM}_j^{\text{neu-neg}}$, $i = 1, 2, \ldots, n$, $j = 1, 2, \ldots, m$, $k = 1, 2, \ldots, q_i$. Let $x_{ik}^{j,\ast\text{-}\#} \in \{0, 1\}$ denote the output of the SVM classifier $\mathrm{SVM}_j^{\ast\text{-}\#}$, where $\ast\text{-}\#$ denotes one of "pos-neu", "pos-neg", or "neu-neg", $j = 1, 2, \ldots, m$. If $x_{ik}^{j,\ast\text{-}\#} = 1$, then the result obtained by classifier $\mathrm{SVM}_j^{\ast\text{-}\#}$ is "the sentiment orientation of online review $D_{ik}^j$ is $\ast$"; if $x_{ik}^{j,\ast\text{-}\#} = 0$, then the result obtained by $\mathrm{SVM}_j^{\ast\text{-}\#}$ is "the sentiment orientation of online review $D_{ik}^j$ is $\#$", $i = 1, 2, \ldots, n$, $j = 1, 2, \ldots, m$, $k = 1, 2, \ldots, q_i$. Specifically, $x_{ik}^{j,\text{pos-neu}} = 1$ denotes that the result obtained by classifier $\mathrm{SVM}_j^{\text{pos-neu}}$ concerning review $D_{ik}^j$ is "pos", $x_{ik}^{j,\text{pos-neg}} = 1$ denotes that the result obtained by $\mathrm{SVM}_j^{\text{pos-neg}}$ concerning review $D_{ik}^j$ is "pos", and $x_{ik}^{j,\text{neu-neg}} = 1$ denotes that the result obtained by $\mathrm{SVM}_j^{\text{neu-neg}}$ is "neu"; otherwise, the results obtained by $\mathrm{SVM}_j^{\text{pos-neu}}$, $\mathrm{SVM}_j^{\text{pos-neg}}$ and
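Table 2. Feature vectors of training samples and online reviews concerning the attribute of car power. The columns are the words in the set of notional words concerning the attribute of car power.

| | | (power) | (aspect) | (a little) | (surprise) | (OK) | (not) | (sedan car) | (give) | ... |
|---|---|---|---|---|---|---|---|---|---|---|
| Training samples | $\overline{T}_1^j$ | 0.39 | 2.98 | 2.81 | 3.90 | 0.00 | 0.00 | 0.00 | 0.00 | ... |
| | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| | $\overline{T}_{1109}^j$ | 0.39 | 0.00 | 0.00 | 0.00 | 2.34 | 0.85 | 3.34 | 0.00 | ... |
| Online reviews | $D_{i1}^j$ | 0.39 | 2.98 | 0.00 | 0.00 | 0.00 | 0.85 | 0.00 | 0.00 | ... |
| | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| | $D_{i346}^j$ | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 1.36 | ... |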
$\mathrm{SVM}_j^{\text{neu-neg}}$ are "neu", "neg", and "neg", respectively, $i = 1, 2, \ldots, n$, $j = 1, 2, \ldots, m$, $k = 1, 2, \ldots, q_i$. Thus, different results of sentiment orientations of review $D_{ik}^j$ can be obtained by different SVM classifiers. Therefore, to determine the final sentiment orientation of online review $D_{ik}^j$, we must integrate the results obtained by the three SVM classifiers. Let $S_{ik}^j$ represent the sentiment orientation for review $D_{ik}$ on feature $f_j$, $i = 1, 2, \ldots, n$, $j = 1, 2, \ldots, m$, $k = 1, 2, \ldots, q_i$. For convenience of analysis and further calculation, $S_{ik}^j$ is represented by an indicator vector, i.e., $S_{ik}^j = (\xi_{ik}^j, \zeta_{ik}^j, \psi_{ik}^j)$, where $\xi_{ik}^j$, $\zeta_{ik}^j$ and $\psi_{ik}^j$ are indicator variables for positive, neutral, and negative sentiment orientations, respectively, $\xi_{ik}^j, \zeta_{ik}^j, \psi_{ik}^j \in \{0, 1\}$, $\xi_{ik}^j + \zeta_{ik}^j + \psi_{ik}^j \le 1$, $i = 1, 2, \ldots, n$, $j = 1, 2, \ldots, m$, $k = 1, 2, \ldots, q_i$. Specifically, $S_{ik}^j = (0, 0, 0)$ denotes $D_{ik}^j = \emptyset$; $S_{ik}^j = (1, 0, 0)$, $S_{ik}^j = (0, 1, 0)$, and $S_{ik}^j = (0, 0, 1)$ denote that review $D_{ik}^j$ represents the positive sentiment orientation, neutral sentiment orientation, and negative sentiment orientation, respectively. In this paper, the weighted voting strategy$^{39}$ is used to integrate the results obtained by the three SVM classifiers ($x_{ik}^{j,\text{pos-neu}}$, $x_{ik}^{j,\text{pos-neg}}$ and $x_{ik}^{j,\text{neu-neg}}$) and determine $S_{ik}^j = (\xi_{ik}^j, \zeta_{ik}^j, \psi_{ik}^j)$, $i = 1, 2, \ldots, n$, $j = 1, 2, \ldots, m$, $k = 1, 2, \ldots, q_i$; a detailed description is provided in Appendix A.
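To make the encoding of the three binary outputs concrete, the following Python sketch decodes each output into a class label and combines the labels into an indicator vector. Note that it uses plain (unweighted) majority voting as a simplified stand-in, not the weighted voting strategy of Appendix A; the function and dictionary names are hypothetical, and a three-way tie is deliberately left unresolved because resolving it is exactly what the confidence weights are for.

```python
# Each pairwise classifier outputs 1 for its first class and 0 for its second.
PAIRS = {
    "pos-neu": ("pos", "neu"),
    "pos-neg": ("pos", "neg"),
    "neu-neg": ("neu", "neg"),
}
CLASSES = ("pos", "neu", "neg")


def decode(pair, x):
    """Translate the binary output x of one pairwise classifier into a label."""
    first, second = PAIRS[pair]
    return first if x == 1 else second


def majority_vote(outputs):
    """Combine outputs {pair: 0 or 1} into an indicator vector (pos, neu, neg).

    Returns None when each class receives exactly one vote, since plain
    voting cannot break that tie.
    """
    votes = {c: 0 for c in CLASSES}
    for pair, x in outputs.items():
        votes[decode(pair, x)] += 1
    best = max(votes.values())
    winners = [c for c in CLASSES if votes[c] == best]
    if len(winners) != 1:
        return None
    return tuple(1 if c == winners[0] else 0 for c in CLASSES)
```

For example, the outputs 1, 1, 0 for "pos-neu", "pos-neg", and "neu-neg" give two votes for "pos" and one for "neg", so the indicator vector is (1, 0, 0).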
4.2. Ranking alternative products based on interval-valued
intuitionistic fuzzy TOPSIS
In the study of Liu et al.,$^{15}$ a review with positive sentiment orientation is considered a vote in support, a review with neutral sentiment orientation is considered hesitation, and a review with negative sentiment orientation is considered a vote in opposition. Then, based on the identified sentiment orientations and the physical interpretation of intuitionistic fuzzy numbers,$^{18–21}$ an intuitionistic fuzzy number is built to represent the performance of an alternative product concerning a product attribute. Although the use of intuitionistic fuzzy numbers can simply reflect the percentages of online reviews with different sentiment orientations, the numbers of online reviews of different products crawled from the website are not considered. In fact, there can be great differences among the numbers of online reviews of different products crawled from the website. The numbers of online reviews affect the confidences of the decision data refined from these online reviews. Thus, to eliminate the effects of different numbers of online reviews on the confidences of the constructed intuitionistic fuzzy numbers, the percentages of positive and negative sentiment orientations in the form of crisp numbers can be replaced by confidence intervals. Therefore, an interval-valued intuitionistic fuzzy number can be constructed to reflect the performance of an alternative product with respect to a product attribute. Then, based on the obtained interval-valued intuitionistic fuzzy numbers and the weights of product attributes, the ranking of the alternative products can be determined by the interval-valued intuitionistic fuzzy TOPSIS method.$^{23,24}$ A detailed description is provided below.
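Let $q_{ij}^{\mathrm{pos}}$, $q_{ij}^{\mathrm{neu}}$, and $q_{ij}^{\mathrm{neg}}$ denote the numbers of reviews of alternative product $A_i$ concerning attribute $f_j$ with positive, neutral, and negative sentiment orientations, respectively, $i = 1, 2, \ldots, n$, $j = 1, 2, \ldots, m$. The values of $q_{ij}^{\mathrm{pos}}$, $q_{ij}^{\mathrm{neu}}$, and $q_{ij}^{\mathrm{neg}}$ can be calculated by the following Eqs. (5)–(7), respectively, i.e.,

$$q_{ij}^{\mathrm{pos}} = \sum_{k=1}^{q_i} \xi_{ik}^j, \quad i = 1, 2, \ldots, n; \; j = 1, 2, \ldots, m, \tag{5}$$

$$q_{ij}^{\mathrm{neu}} = \sum_{k=1}^{q_i} \zeta_{ik}^j, \quad i = 1, 2, \ldots, n; \; j = 1, 2, \ldots, m, \tag{6}$$

$$q_{ij}^{\mathrm{neg}} = \sum_{k=1}^{q_i} \psi_{ik}^j, \quad i = 1, 2, \ldots, n; \; j = 1, 2, \ldots, m, \tag{7}$$

where $(\xi_{ik}^j, \zeta_{ik}^j, \psi_{ik}^j)$ is the indicator vector $S_{ik}^j$ of review $D_{ik}^j$.

Let $\mu_{ij}$ and $v_{ij}$ denote the percentages of support and opposition degrees of alternative product $A_i$ concerning attribute $f_j$, respectively, $i = 1, 2, \ldots, n$, $j = 1, 2, \ldots, m$. According to the obtained $q_{ij}^{\mathrm{pos}}$, $q_{ij}^{\mathrm{neu}}$, and $q_{ij}^{\mathrm{neg}}$, the values of $\mu_{ij}$ and $v_{ij}$ can be, respectively, calculated by the following Eqs. (8) and (9), i.e.,

$$\mu_{ij} = \frac{q_{ij}^{\mathrm{pos}}}{q_{ij}^{\mathrm{pos}} + q_{ij}^{\mathrm{neu}} + q_{ij}^{\mathrm{neg}}}, \quad i = 1, 2, \ldots, n; \; j = 1, 2, \ldots, m, \tag{8}$$

$$v_{ij} = \frac{q_{ij}^{\mathrm{neg}}}{q_{ij}^{\mathrm{pos}} + q_{ij}^{\mathrm{neu}} + q_{ij}^{\mathrm{neg}}}, \quad i = 1, 2, \ldots, n; \; j = 1, 2, \ldots, m. \tag{9}$$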
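Eqs. (5)–(9) amount to counting indicator vectors and normalizing the counts. A short Python sketch, with hypothetical function names and indicator vectors given as (positive, neutral, negative) tuples, could look as follows.

```python
def sentiment_counts(indicators):
    """Eqs. (5)-(7): count positive, neutral, and negative reviews.

    `indicators` holds one (pos, neu, neg) tuple per review; an empty
    review is (0, 0, 0) and therefore contributes to no count.
    """
    q_pos = sum(s[0] for s in indicators)
    q_neu = sum(s[1] for s in indicators)
    q_neg = sum(s[2] for s in indicators)
    return q_pos, q_neu, q_neg


def support_opposition(q_pos, q_neu, q_neg):
    """Eqs. (8)-(9): shares of supporting and opposing reviews."""
    total = q_pos + q_neu + q_neg
    if total == 0:
        raise ValueError("no non-empty reviews for this product and attribute")
    return q_pos / total, q_neg / total
```

With the counts reported in Table 3 for $A_1$ and $f_1$ (6476, 692, 76), this gives a support share of about 0.894 and an opposition share of about 0.0105.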
Note that because there can be great differences among the numbers of online reviews concerning different products crawled from the website, there can be great differences among the confidences of the values of $\mu_{ij}$ and $v_{ij}$, $i = 1, 2, \ldots, n$, $j = 1, 2, \ldots, m$. Thus, to eliminate the effects of different numbers of online reviews on the confidences of the values of $\mu_{ij}$ and $v_{ij}$, confidence intervals $[\mu_{ij}^L, \mu_{ij}^U]$ and $[v_{ij}^L, v_{ij}^U]$ are, respectively, used to replace the crisp values of $\mu_{ij}$ and $v_{ij}$, $i = 1, 2, \ldots, n$, $j = 1, 2, \ldots, m$. Based on the calculation formula of confidence interval estimation of the binomial distribution,$^{24}$ $\mu_{ij}^L$, $\mu_{ij}^U$, $v_{ij}^L$, and $v_{ij}^U$ can be, respectively, calculated by the following Eqs. (10)–(13), i.e.,

$$\mu_{ij}^L = \mu_{ij} - z_{\alpha/2} \sqrt{\frac{\mu_{ij}(1 - \mu_{ij})}{q_{ij}^{\mathrm{pos}} + q_{ij}^{\mathrm{neu}} + q_{ij}^{\mathrm{neg}}}}, \quad i = 1, 2, \ldots, n; \; j = 1, 2, \ldots, m, \tag{10}$$

$$\mu_{ij}^U = \mu_{ij} + z_{\alpha/2} \sqrt{\frac{\mu_{ij}(1 - \mu_{ij})}{q_{ij}^{\mathrm{pos}} + q_{ij}^{\mathrm{neu}} + q_{ij}^{\mathrm{neg}}}}, \quad i = 1, 2, \ldots, n; \; j = 1, 2, \ldots, m, \tag{11}$$

$$v_{ij}^L = v_{ij} - z_{\alpha/2} \sqrt{\frac{v_{ij}(1 - v_{ij})}{q_{ij}^{\mathrm{pos}} + q_{ij}^{\mathrm{neu}} + q_{ij}^{\mathrm{neg}}}}, \quad i = 1, 2, \ldots, n; \; j = 1, 2, \ldots, m, \tag{12}$$
$$v_{ij}^U = v_{ij} + z_{\alpha/2} \sqrt{\frac{v_{ij}(1 - v_{ij})}{q_{ij}^{\mathrm{pos}} + q_{ij}^{\mathrm{neu}} + q_{ij}^{\mathrm{neg}}}}, \quad i = 1, 2, \ldots, n; \; j = 1, 2, \ldots, m, \tag{13}$$

where $\alpha$ is the significance level ($1 - \alpha$ is the confidence level), $z_{\alpha/2}$ is the parameter corresponding to significance level $\alpha$, and $z_{\alpha/2}$ can be determined by referencing the table of normal distribution. For example, if the confidence level is 0.95, i.e., $\alpha = 0.05$ and $1 - \alpha = 0.95$, then we have $z_{0.05/2} = 1.96$ by referencing the table of normal distribution. Note that $\mu_{ij}^U$ and $v_{ij}^U$ obtained by Eqs. (11) and (13) might not satisfy the assumption of interval-valued intuitionistic fuzzy number that $\mu_{ij}^U + v_{ij}^U \le 1$. Thus, $\mu_{ij}^L$, $\mu_{ij}^U$, $v_{ij}^L$, and $v_{ij}^U$ are, respectively, unified by the following Eqs. (14)–(17), i.e.,

$$\overline{\mu}_{ij}^L = \frac{\mu_{ij}^L}{\max(\mu_{ij}^U + v_{ij}^U)}, \quad i = 1, 2, \ldots, n; \; j = 1, 2, \ldots, m, \tag{14}$$

$$\overline{\mu}_{ij}^U = \frac{\mu_{ij}^U}{\max(\mu_{ij}^U + v_{ij}^U)}, \quad i = 1, 2, \ldots, n; \; j = 1, 2, \ldots, m, \tag{15}$$

$$\overline{v}_{ij}^L = \frac{v_{ij}^L}{\max(\mu_{ij}^U + v_{ij}^U)}, \quad i = 1, 2, \ldots, n; \; j = 1, 2, \ldots, m, \tag{16}$$

$$\overline{v}_{ij}^U = \frac{v_{ij}^U}{\max(\mu_{ij}^U + v_{ij}^U)}, \quad i = 1, 2, \ldots, n; \; j = 1, 2, \ldots, m. \tag{17}$$
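The interval construction of Eqs. (10)–(13) and the unification of Eqs. (14)–(17) can be sketched in Python as below. The function names are hypothetical. Two points are assumptions of the sketch: the critical value is taken from the standard library's normal distribution instead of a printed table, and the maximum in Eqs. (14)–(17) is taken over all products and attributes. The second assumption is consistent with the case study, where dividing the Table 4 entry 0.9825 by the largest sum 0.9933 + 0.0026 gives roughly 0.9866, the ideal value reported for $f_1$.

```python
import math
from statistics import NormalDist


def interval(p, n, alpha=0.05):
    """Normal-approximation confidence interval for a proportion (Eqs. (10)-(13))."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width


def interval_valued_number(q_pos, q_neu, q_neg, alpha=0.05):
    """Return (mu_L, mu_U, v_L, v_U) for one product-attribute pair."""
    n = q_pos + q_neu + q_neg
    mu_l, mu_u = interval(q_pos / n, n, alpha)
    v_l, v_u = interval(q_neg / n, n, alpha)
    return mu_l, mu_u, v_l, v_u


def normalise(matrix):
    """Eqs. (14)-(17): divide every bound by the largest mu_U + v_U.

    Assumption: the maximum runs over all products and attributes.
    """
    scale = max(r[1] + r[3] for row in matrix for r in row)
    return [[tuple(x / scale for x in r) for r in row] for row in matrix]
```

Applied to the Table 3 counts for $A_1$ and $f_1$ (6476, 692, 76), `interval_valued_number` yields bounds that round to ([0.8869, 0.9011], [0.0081, 0.0128]), which matches the first entry of Table 4.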
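Based on the $\overline{\mu}_{ij}^L$, $\overline{\mu}_{ij}^U$, $\overline{v}_{ij}^L$, and $\overline{v}_{ij}^U$ obtained by Eqs. (14)–(17), an interval-valued intuitionistic fuzzy number can be constructed to represent the performance of alternative $A_i$ concerning product attribute $f_j$, i.e., $r_{ij} = \{[\overline{\mu}_{ij}^L, \overline{\mu}_{ij}^U], [\overline{v}_{ij}^L, \overline{v}_{ij}^U]\}$, $i = 1, 2, \ldots, n$, $j = 1, 2, \ldots, m$.

Then, according to the obtained $r_{ij} = \{[\overline{\mu}_{ij}^L, \overline{\mu}_{ij}^U], [\overline{v}_{ij}^L, \overline{v}_{ij}^U]\}$, $i = 1, 2, \ldots, n$, $j = 1, 2, \ldots, m$, the ideal and anti-ideal products can be defined, i.e., $A^+ = (r_1^+, r_2^+, \ldots, r_m^+)$ and $A^- = (r_1^-, r_2^-, \ldots, r_m^-)$, where $r_j^+$ and $r_j^-$ represent the performance of the ideal product and the anti-ideal product concerning attribute $f_j$, $j = 1, 2, \ldots, m$. According to the idea of TOPSIS,$^{40,41}$ $r_j^+$ and $r_j^-$ can be represented by the following Eqs. (18) and (19), i.e.,

$$r_j^+ = \left( \left[ \max_i \overline{\mu}_{ij}^L, \max_i \overline{\mu}_{ij}^U \right], \left[ \min_i \overline{v}_{ij}^L, \min_i \overline{v}_{ij}^U \right] \right), \quad i = 1, 2, \ldots, n; \; j = 1, 2, \ldots, m, \tag{18}$$

$$r_j^- = \left( \left[ \min_i \overline{\mu}_{ij}^L, \min_i \overline{\mu}_{ij}^U \right], \left[ \max_i \overline{v}_{ij}^L, \max_i \overline{v}_{ij}^U \right] \right), \quad i = 1, 2, \ldots, n; \; j = 1, 2, \ldots, m. \tag{19}$$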
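Eqs. (18) and (19) take, attribute by attribute, the best and worst bounds found among the alternatives. The following Python sketch illustrates this; the function name and the data layout (`matrix[i][j]` as a tuple of the four bounds) are assumptions made for the example.

```python
def ideal_and_anti_ideal(matrix):
    """Eqs. (18)-(19) for a decision matrix.

    matrix[i][j] = (mu_L, mu_U, v_L, v_U) for product i and attribute j.
    Returns two lists with one tuple per attribute.
    """
    ideal, anti_ideal = [], []
    for column in zip(*matrix):  # iterate over attributes
        mu_l, mu_u, v_l, v_u = zip(*column)
        # Ideal: largest support bounds, smallest opposition bounds.
        ideal.append((max(mu_l), max(mu_u), min(v_l), min(v_u)))
        # Anti-ideal: smallest support bounds, largest opposition bounds.
        anti_ideal.append((min(mu_l), min(mu_u), max(v_l), max(v_u)))
    return ideal, anti_ideal
```

Because each bound is optimized separately, the ideal product for an attribute need not coincide with any single real alternative.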
Furthermore, let $d_i^+$ and $d_i^-$, respectively, denote the distances of alternative product $A_i$ from the ideal product $A^+ = (r_1^+, r_2^+, \ldots, r_m^+)$ and anti-ideal product $A^- = (r_1^-, r_2^-, \ldots, r_m^-)$. Based on the idea in the studies of Xu$^{23}$ and Ye,$^{24}$ $d_i^+$ and
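$d_i^-$ can be, respectively, calculated by the following Eqs. (20) and (21), i.e.,

$$d_i^+ = \left[ \frac{1}{2} \sum_{j=1}^{m} w_j \left( \left( \overline{\mu}_{ij}^L - \max_i \overline{\mu}_{ij}^L \right)^2 + \left( \overline{\mu}_{ij}^U - \max_i \overline{\mu}_{ij}^U \right)^2 + \left( \overline{v}_{ij}^L - \min_i \overline{v}_{ij}^L \right)^2 + \left( \overline{v}_{ij}^U - \min_i \overline{v}_{ij}^U \right)^2 + \left( \pi_{ij}^L - \pi_j^{L+} \right)^2 + \left( \pi_{ij}^U - \pi_j^{U+} \right)^2 \right) \right]^{\frac{1}{2}}, \quad i = 1, 2, \ldots, n, \tag{20}$$

$$d_i^- = \left[ \frac{1}{2} \sum_{j=1}^{m} w_j \left( \left( \overline{\mu}_{ij}^L - \min_i \overline{\mu}_{ij}^L \right)^2 + \left( \overline{\mu}_{ij}^U - \min_i \overline{\mu}_{ij}^U \right)^2 + \left( \overline{v}_{ij}^L - \max_i \overline{v}_{ij}^L \right)^2 + \left( \overline{v}_{ij}^U - \max_i \overline{v}_{ij}^U \right)^2 + \left( \pi_{ij}^L - \pi_j^{L-} \right)^2 + \left( \pi_{ij}^U - \pi_j^{U-} \right)^2 \right) \right]^{\frac{1}{2}}, \quad i = 1, 2, \ldots, n, \tag{21}$$

where $\pi_{ij}^L = 1 - \overline{\mu}_{ij}^U - \overline{v}_{ij}^U$, $\pi_{ij}^U = 1 - \overline{\mu}_{ij}^L - \overline{v}_{ij}^L$, $\pi_j^{L+} = 1 - \max_i \overline{\mu}_{ij}^U - \min_i \overline{v}_{ij}^U$, $\pi_j^{U+} = 1 - \max_i \overline{\mu}_{ij}^L - \min_i \overline{v}_{ij}^L$, $\pi_j^{L-} = 1 - \min_i \overline{\mu}_{ij}^U - \max_i \overline{v}_{ij}^U$, $\pi_j^{U-} = 1 - \min_i \overline{\mu}_{ij}^L - \max_i \overline{v}_{ij}^L$, and $w_j$ is the weight of product attribute $f_j$ provided by the consumer, $i = 1, 2, \ldots, n$; $j = 1, 2, \ldots, m$.

Moreover, based on $d_i^+$ and $d_i^-$, the closeness coefficient of alternative product $A_i$ can be calculated, i.e.,

$$C_i = \frac{d_i^-}{d_i^- + d_i^+}, \quad i = 1, 2, \ldots, n. \tag{22}$$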
Obviously, if alternative product $A_i$ is closer to the ideal product and farther from the anti-ideal product, namely, $C_i$ is greater, then alternative product $A_i$ is preferable. Therefore, in accordance with a descending order of the closeness coefficients of all alternative products, the ranking of the alternative products can be determined.
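The distance and closeness computations of Eqs. (20)–(22) can be sketched in Python as follows. The names and the data layout (`matrix[i][j]` as a tuple of the four unified bounds) are assumptions of the sketch, and the degenerate case in which all alternatives are identical (both distances zero) is not handled.

```python
import math


def hesitancy(r):
    """Hesitancy bounds (pi_L, pi_U) of r = (mu_L, mu_U, v_L, v_U)."""
    return 1 - r[1] - r[3], 1 - r[0] - r[2]


def distance(row, reference, weights):
    """Weighted distance of Eqs. (20)-(21) between one product and a reference."""
    total = 0.0
    for w, r, ref in zip(weights, row, reference):
        a = tuple(r) + hesitancy(r)
        b = tuple(ref) + hesitancy(ref)
        total += w * sum((x - y) ** 2 for x, y in zip(a, b))
    return math.sqrt(total / 2)


def closeness_coefficients(matrix, weights):
    """Eq. (22) for every product; matrix[i][j] = (mu_L, mu_U, v_L, v_U)."""
    columns = list(zip(*matrix))
    ideal = [(max(c[0] for c in col), max(c[1] for c in col),
              min(c[2] for c in col), min(c[3] for c in col)) for col in columns]
    anti = [(min(c[0] for c in col), min(c[1] for c in col),
             max(c[2] for c in col), max(c[3] for c in col)) for col in columns]
    result = []
    for row in matrix:
        d_plus = distance(row, ideal, weights)   # distance to the ideal product
        d_minus = distance(row, anti, weights)   # distance to the anti-ideal product
        result.append(d_minus / (d_minus + d_plus))
    return result
```

A product that coincides with the ideal on every attribute obtains a coefficient of 1, and one that coincides with the anti-ideal obtains 0; all other products fall in between.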
In summary, the proposed method for ranking products through online reviews is
provided below.
Step 1. Input the online review $D_{ik} = (D_{ik}^1, D_{ik}^2, \ldots, D_{ik}^m)$ into ICTCLAS 2016 and obtain the output $D'_{ik} = (D'^{1}_{ik}, D'^{2}_{ik}, \ldots, D'^{m}_{ik})$, $i = 1, 2, \ldots, n$, $j = 1, 2, \ldots, m$, $k = 1, 2, \ldots, q_i$. Then, based on the stop word list (see: http://www.datatang.com/data/19300), the stop words are deleted from $D'_{ik} = (D'^{1}_{ik}, D'^{2}_{ik}, \ldots, D'^{m}_{ik})$ and the vector of notional words $\overline{W}_{ik} = (\overline{W}_{ik}^1, \overline{W}_{ik}^2, \ldots, \overline{W}_{ik}^m)$ can be obtained, $i = 1, 2, \ldots, n$, $k = 1, 2, \ldots, q_i$.
Step 2. Determine the feature vectors of online reviews and training samples using Eqs. (1)–(4), i.e., $\omega_{ik}^j = (\omega_{ik}^{j1}, \omega_{ik}^{j2}, \ldots, \omega_{ik}^{jE_j})$ and $\omega_z^j = (\omega_z^{j1}, \omega_z^{j2}, \ldots, \omega_z^{jE_j})$, $i = 1, 2, \ldots, n$, $j = 1, 2, \ldots, m$, $k = 1, 2, \ldots, q_i$, $z = 1, 2, \ldots, q_t^j$.
Step 3. Based on the training samples concerning attribute $f_j$, construct the subsets of training samples for "positive-neutral" $\overline{T}_{\mathrm{pos}}^j \cup \overline{T}_{\mathrm{neu}}^j$, "positive-negative"
$\overline{T}_{\mathrm{pos}}^j \cup \overline{T}_{\mathrm{neg}}^j$, and "neutral-negative" $\overline{T}_{\mathrm{neu}}^j \cup \overline{T}_{\mathrm{neg}}^j$. Then, by training the SVM using the three subsets, three sentiment classifiers can be obtained with respect to product attribute $f_j$, i.e., $\mathrm{SVM}_j^{\text{pos-neu}}$, $\mathrm{SVM}_j^{\text{pos-neg}}$, and $\mathrm{SVM}_j^{\text{neu-neg}}$, $j = 1, 2, \ldots, m$.
Step 4. Identify the sentiment orientation of online review $D_{ik}^j$ using $\mathrm{SVM}_j^{\text{pos-neu}}$, $\mathrm{SVM}_j^{\text{pos-neg}}$, and $\mathrm{SVM}_j^{\text{neu-neg}}$, respectively, and the results $x_{ik}^{j,\text{pos-neu}}$, $x_{ik}^{j,\text{pos-neg}}$, and $x_{ik}^{j,\text{neu-neg}}$ are obtained, $i = 1, 2, \ldots, n$, $j = 1, 2, \ldots, m$, $k = 1, 2, \ldots, q_i$.
Step 5. Based on the results $x_{ik}^{j,\text{pos-neu}}$, $x_{ik}^{j,\text{pos-neg}}$, and $x_{ik}^{j,\text{neu-neg}}$, the indicator vector $S_{ik}^j = (\xi_{ik}^j, \zeta_{ik}^j, \psi_{ik}^j)$ can be determined by the weighted voting strategy shown in Appendix A, $i = 1, 2, \ldots, n$, $j = 1, 2, \ldots, m$, $k = 1, 2, \ldots, q_i$.
Step 6. Based on the indicator vector $S_{ik}^j = (\xi_{ik}^j, \zeta_{ik}^j, \psi_{ik}^j)$, the interval-valued intuitionistic fuzzy number $r_{ij} = \{[\overline{\mu}_{ij}^L, \overline{\mu}_{ij}^U], [\overline{v}_{ij}^L, \overline{v}_{ij}^U]\}$ can be constructed to represent the performance of alternative product $A_i$ with respect to product attribute $f_j$, using Eqs. (5)–(17), $i = 1, 2, \ldots, n$, $j = 1, 2, \ldots, m$.
Step 7. Determine the ideal product $A^+ = (r_1^+, r_2^+, \ldots, r_m^+)$ and anti-ideal product $A^- = (r_1^-, r_2^-, \ldots, r_m^-)$ using Eqs. (18) and (19).
Step 8. Calculate the closeness coefficient $C_i$ using Eqs. (20)–(22), $i = 1, 2, \ldots, n$, and determine the ranking of the alternative products in accordance with a descending order of the closeness coefficients.
5. Case Study
Consider a consumer who wants to buy a medium-sized car. After a preliminary investigation, 11 alternative cars are identified, i.e., Bora ($A_1$), Golf ($A_2$), Corolla ($A_3$), Cruze ($A_4$), Long Yat ($A_5$), Mazda ($A_6$), Octavia ($A_7$), Sega ($A_8$), Jetta ($A_9$), Sylphy ($A_{10}$), and Citroen ($A_{11}$). The 11 alternative cars are all acceptable, but the consumer is not sure which one is the best because of limited knowledge and expertise. To select a desirable car, the consumer is concerned with the following five attributes of cars: controllability ($f_1$), oil consumption ($f_2$), space ($f_3$), power ($f_4$), and cost performance ($f_5$). The consumer provides the vector of weights of the five attributes, i.e., $w = (0.2, 0.3, 0.2, 0.2, 0.1)$. To support the consumer purchase decision, the method proposed in this paper is used. The computation processes and results are presented below.
Locoy Spider software (http://www.locoy.com/) is used to crawl online reviews of the 11 cars concerning the five attributes from Automobile home (http://www.autohome.com.cn/). The obtained online reviews are expressed by $D_{ik} = (D_{ik}^1, D_{ik}^2, D_{ik}^3, D_{ik}^4, D_{ik}^5)$, $i = 1, 2, \ldots, 11$, $k = 1, 2, \ldots, q_i$, $q_1 = 7244$, $q_2 = 7748$, $q_3 = 5964$, $q_4 = 9882$, $q_5 = 5252$, $q_6 = 5003$, $q_7 = 7700$, $q_8 = 5431$, $q_9 = 8466$, $q_{10} = 4739$, $q_{11} = 3572$. The training samples concerning each attribute are crawled from the same website; the sentiment orientations of the training samples are labeled beforehand. The number of training samples concerning each attribute is
$q_t^1 = 1119$, $q_t^2 = 1229$, $q_t^3 = 1128$, $q_t^4 = 1109$ and $q_t^5 = 1158$. Based on Eqs. (1)–(4), the feature vectors of online reviews and training samples are determined. Then, based on the training samples concerning attribute $f_j$, the subsets of training samples for "positive-neutral", "positive-negative", and "neutral-negative" are constructed, and three SVM classifiers $\mathrm{SVM}_j^{\text{pos-neu}}$, $\mathrm{SVM}_j^{\text{pos-neg}}$, and $\mathrm{SVM}_j^{\text{neu-neg}}$ are obtained by training the SVM classifiers using the three subsets of training samples, respectively, $j = 1, 2, 3, 4, 5$. When training the SVM classifiers, the parameter settings are the following: cost parameter = 1.0, tolerance parameter = 0.001, kernel type = radial basis function, and degree = 3. Using the three SVM classifiers $\mathrm{SVM}_j^{\text{pos-neu}}$, $\mathrm{SVM}_j^{\text{pos-neg}}$, and $\mathrm{SVM}_j^{\text{neu-neg}}$, the results concerning online review $D_{ik}^j$ are obtained and are denoted as $x_{ik}^{j,\text{pos-neu}}$, $x_{ik}^{j,\text{pos-neg}}$, and $x_{ik}^{j,\text{neu-neg}}$, $i = 1, 2, \ldots, 11$, $j = 1, 2, 3, 4, 5$, $k = 1, 2, \ldots, q_i$. Furthermore, based on the obtained $x_{ik}^{j,\text{pos-neu}}$, $x_{ik}^{j,\text{pos-neg}}$, and $x_{ik}^{j,\text{neu-neg}}$, the indicator vector $S_{ik}^j = (\xi_{ik}^j, \zeta_{ik}^j, \psi_{ik}^j)$ of review $D_{ik}^j$ is determined using the weighted voting strategy, $i = 1, 2, \ldots, 11$, $j = 1, 2, 3, 4, 5$, $k = 1, 2, \ldots, q_i$. Based on Eqs. (5)–(7), the values of $q_{ij}^{\mathrm{pos}}$, $q_{ij}^{\mathrm{neu}}$ and $q_{ij}^{\mathrm{neg}}$ are obtained, $i = 1, 2, \ldots, 11$, $j = 1, 2, 3, 4, 5$. These values are shown in Table 3. The interval-valued intuitionistic fuzzy number to represent the performance of alternative car $A_i$ concerning car attribute $f_j$ is constructed (here $\alpha = 0.05$) using Eqs. (8)–(17), $i = 1, 2, \ldots, 11$, $j = 1, 2, 3, 4, 5$. The constructed interval-valued intuitionistic fuzzy numbers are shown in Table 4. Based on Eqs. (18) and (19), the ideal car and anti-ideal car can be defined, i.e., $A^+ = (r_1^+, r_2^+, r_3^+, r_4^+, r_5^+)$ and $A^- = (r_1^-, r_2^-, r_3^-, r_4^-, r_5^-)$, where $r_1^+ = ([0.9866, 0.9932], [0.0013, 0.0043])$, $r_2^+ = ([0.9317, 0.9454], [0.0081, 0.0139])$, $r_3^+ = ([0.9919, 0.9974], [0.0003, 0.0026])$, $r_4^+ = ([0.8531, 0.8688], [0.0032, 0.0081])$, $r_5^+ = ([0.9241, 0.9358], [0.0081, 0.0136])$; $r_1^- = ([0.6788, 0.7053], [0.0360, 0.0475])$, $r_2^- = ([0.5208, 0.5475], [0.1269, 0.1452])$,
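Table 3. Values of $q_{ij}^{\mathrm{pos}}$, $q_{ij}^{\mathrm{neu}}$, and $q_{ij}^{\mathrm{neg}}$, $i = 1, 2, \ldots, 11$, $j = 1, 2, 3, 4, 5$.

| | $f_1$: $q_{i1}^{\mathrm{pos}}$ | $q_{i1}^{\mathrm{neu}}$ | $q_{i1}^{\mathrm{neg}}$ | $f_2$: $q_{i2}^{\mathrm{pos}}$ | $q_{i2}^{\mathrm{neu}}$ | $q_{i2}^{\mathrm{neg}}$ | $f_3$: $q_{i3}^{\mathrm{pos}}$ | $q_{i3}^{\mathrm{neu}}$ | $q_{i3}^{\mathrm{neg}}$ | $f_4$: $q_{i4}^{\mathrm{pos}}$ | $q_{i4}^{\mathrm{neu}}$ | $q_{i4}^{\mathrm{neg}}$ | $f_5$: $q_{i5}^{\mathrm{pos}}$ | $q_{i5}^{\mathrm{neu}}$ | $q_{i5}^{\mathrm{neg}}$ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| $A_1$ | 6476 | 692 | 76 | 6216 | 848 | 180 | 6258 | 879 | 107 | 4837 | 2127 | 280 | 6237 | 836 | 171 |
| $A_2$ | 7502 | 222 | 24 | 6931 | 684 | 133 | 6596 | 1044 | 108 | 6643 | 973 | 132 | 5759 | 1661 | 328 |
| $A_3$ | 5092 | 783 | 89 | 5334 | 523 | 107 | 5451 | 450 | 63 | 5022 | 883 | 59 | 5196 | 646 | 122 |
| $A_4$ | 9267 | 551 | 64 | 5813 | 3063 | 1006 | 8029 | 1640 | 213 | 5145 | 3955 | 782 | 8507 | 1192 | 183 |
| $A_5$ | 4502 | 654 | 96 | 4746 | 427 | 79 | 4910 | 295 | 47 | 3627 | 1405 | 220 | 4268 | 814 | 170 |
| $A_6$ | 4932 | 57 | 14 | 4676 | 272 | 55 | 3399 | 1405 | 199 | 4148 | 778 | 77 | 4034 | 845 | 124 |
| $A_7$ | 7279 | 377 | 44 | 6992 | 613 | 95 | 7357 | 324 | 19 | 5041 | 2399 | 260 | 7131 | 483 | 86 |
| $A_8$ | 5260 | 144 | 27 | 2889 | 1806 | 736 | 4423 | 894 | 114 | 3846 | 1437 | 148 | 5012 | 360 | 59 |
| $A_9$ | 7887 | 462 | 117 | 7188 | 1044 | 234 | 7758 | 614 | 94 | 6207 | 1861 | 398 | 5433 | 2295 | 738 |
| $A_{10}$ | 3266 | 1276 | 197 | 4365 | 297 | 77 | 4694 | 38 | 7 | 2814 | 1710 | 215 | 4215 | 457 | 67 |
| $A_{11}$ | 3414 | 129 | 29 | 2170 | 1092 | 310 | 3526 | 40 | 6 | 3415 | 137 | 20 | 3220 | 289 | 63 |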
Table 4. Interval-valued intuitionistic fuzzy number of alternative car $A_i$ concerning $f_j$, $i = 1, 2, \ldots, 11$, $j = 1, 2, 3, 4, 5$.

| | $f_1$ | $f_2$ | $f_3$ | $f_4$ | $f_5$ |
|---|---|---|---|---|---|
| $A_1$ | ([0.8869, 0.9011],[0.0081, 0.0128]) | ([0.8501, 0.8661],[0.0213, 0.0284]) | ([0.8560, 0.8718],[0.0120, 0.0175]) | ([0.6569, 0.6786],[0.0342, 0.0431]) | ([0.8530, 0.8690],[0.0201, 0.0271]) |
| $A_2$ | ([0.9643, 0.9722],[0.0019, 0.0043]) | ([0.8877, 0.9014],[0.0143, 0.0201]) | ([0.8434, 0.8592],[0.0113, 0.0165]) | ([0.8496, 0.8652],[0.0142, 0.0199]) | ([0.7336, 0.7530],[0.0379, 0.0468]) |
| $A_3$ | ([0.8448, 0.8628],[0.0118, 0.0180]) | ([0.8866, 0.9022],[0.0146, 0.0213]) | ([0.9069, 0.9211],[0.0080, 0.0132]) | ([0.8328, 0.8513],[0.0074, 0.0124]) | ([0.8627, 0.8797],[0.0169, 0.0240]) |
| $A_4$ | ([0.9330, 0.9425],[0.0049, 0.0081]) | ([0.5785, 0.5979],[0.0958, 0.1078]) | ([0.8048, 0.8202],[0.0187, 0.0244]) | ([0.5108, 0.5305],[0.0738, 0.0845]) | ([0.8540, 0.8677],[0.0159, 0.0212]) |
| $A_5$ | ([0.8477, 0.8667],[0.0147, 0.0219]) | ([0.8957, 0.9116],[0.0117, 0.0183]) | ([0.9282, 0.9416],[0.0064, 0.0115]) | ([0.6781, 0.7031],[0.0365, 0.0473]) | ([0.8021, 0.8232],[0.0276, 0.0372]) |
| $A_6$ | ([0.9825, 0.9891],[0.0013, 0.0043]) | ([0.9278, 0.9415],[0.0081, 0.0139]) | ([0.6665, 0.6923],[0.0344, 0.0452]) | ([0.8187, 0.8395],[0.0120, 0.0188]) | ([0.7954, 0.8173],[0.0205, 0.0291]) |
| $A_7$ | ([0.9402, 0.9504],[0.0040, 0.0074]) | ([0.9016, 0.9145],[0.0099, 0.0148]) | ([0.9508, 0.9601],[0.0014, 0.0036]) | ([0.6441, 0.6653],[0.0297, 0.0378]) | ([0.9203, 0.9319],[0.0088, 0.0135]) |
| $A_8$ | ([0.9639, 0.9732],[0.0031, 0.0068]) | ([0.5178, 0.5452],[0.1264, 0.1446]) | ([0.8041, 0.8247],[0.0172, 0.0248]) | ([0.6961, 0.7202],[0.0229, 0.0316]) | ([0.9158, 0.9299],[0.0081, 0.0136]) |
| $A_9$ | ([0.9262, 0.9370],[0.0113, 0.0163]) | ([0.8414, 0.8567],[0.0241, 0.0311]) | ([0.9105, 0.9223],[0.0089, 0.0133]) | ([0.7237, 0.7426],[0.0425, 0.0515]) | ([0.6315, 0.6520],[0.0812, 0.0932]) |
| $A_{10}$ | ([0.6760, 0.7024],[0.0359, 0.0473]) | ([0.9134, 0.9288],[0.0126, 0.0198]) | ([0.9877, 0.9933],[0.0004, 0.0026]) | ([0.5798, 0.6078],[0.0394, 0.0513]) | ([0.8805, 0.8984],[0.0108, 0.0175]) |
| $A_{11}$ | ([0.9490, 0.9625],[0.0052, 0.0111]) | ([0.5915, 0.6235],[0.0776, 0.0960]) | ([0.9834, 0.9908],[0.0003, 0.0030]) | ([0.9493, 0.9628],[0.0032, 0.0080]) | ([0.8917, 0.9112],[0.0133, 0.0220]) |
$r_3^- = ([0.6692, 0.6952], [0.0345, 0.0454])$, $r_4^- = ([0.5129, 0.5327], [0.0741, 0.0848])$, and $r_5^- = ([0.6342, 0.6547], [0.0815, 0.0936])$. According to Eqs. (20)–(22), the closeness coefficient of each alternative car is obtained, i.e., $C_1 = 0.3107$, $C_2 = 0.3224$, $C_3 = 0.3210$, $C_4 = 0.2930$, $C_5 = 0.3596$, $C_6 = 0.3720$, $C_7 = 0.3700$, $C_8 = 0.3010$, $C_9 = 0.3531$, $C_{10} = 0.3496$, and $C_{11} = 0.3640$. In accordance with a descending order of closeness coefficients, the ranking of the alternative cars is determined, i.e., $A_6 \succ A_7 \succ A_{11} \succ A_5 \succ A_9 \succ A_{10} \succ A_2 \succ A_3 \succ A_1 \succ A_8 \succ A_4$.
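As a quick check, the ranking can be reproduced from the reported closeness coefficients by sorting them in descending order. The dictionary below simply restates the values given above; the variable names are illustrative.

```python
# Closeness coefficients reported in the case study (weights 0.2, 0.3, 0.2, 0.2, 0.1).
closeness = {
    "A1": 0.3107, "A2": 0.3224, "A3": 0.3210, "A4": 0.2930,
    "A5": 0.3596, "A6": 0.3720, "A7": 0.3700, "A8": 0.3010,
    "A9": 0.3531, "A10": 0.3496, "A11": 0.3640,
}

# A larger coefficient means closer to the ideal car, so sort descending.
ranking = sorted(closeness, key=closeness.get, reverse=True)
print(" > ".join(ranking))
```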
In previous studies, several methods for ranking products through online reviews have been proposed. However, in most of the existing studies, the product attributes and attribute weights are objectively determined based on the online reviews or are not considered. To compare the result obtained by the proposed method with the results obtained by the existing methods, it is considered that the weights of different attributes are equal, i.e., $w = (0.2, 0.2, 0.2, 0.2, 0.2)$. The situation of equal attribute weights can be considered approximately equivalent to the situation that product attributes are not considered. Given the equal attribute weights, the proposed method and the methods proposed by Zhang et al.,$^{10}$ Najmi et al.,$^{13}$ and Liu et al.$^{15}$ are simultaneously used, and the ranking results of the alternative cars are obtained. These results are shown in Table 5. It can be seen that the same or similar ranking results are obtained by different methods. However, if the unequal attribute weights are considered, then most of the existing methods cannot be used. To illustrate the characteristics of the proposed method, different attribute weights are considered, and a series of ranking results of the 11 alternative cars can be obtained. These results are shown in Table 6. In Table 6, each row represents a vector of attribute weights (such that $w_1 + w_2 + w_3 + w_4 + w_5 = 1$) and the corresponding ranking result of alternative
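Table 5. Ranking results of the alternative cars obtained by different methods.

| Different methods | Ranking results of the alternative cars |
|---|---|
| The proposed method | $A_{11} \succ A_{10} \succ A_9 \succ A_3 \succ A_2 \succ A_7 \succ A_6 \succ A_5 \succ A_1 \succ A_8 \succ A_4$ |
| The method proposed by Zhang et al.$^{10}$ | $A_{11} \succ A_9 \succ A_{10} \succ A_3 \succ A_2 \succ A_6 \succ A_7 \succ A_5 \succ A_1 \succ A_8 \succ A_4$ |
| The method proposed by Najmi et al.$^{13}$ | $A_{11} \succ A_{10} \succ A_9 \succ A_2 \succ A_3 \succ A_7 \succ A_5 \succ A_6 \succ A_1 \succ A_8 \succ A_4$ |
| The method proposed by Liu et al.$^{15}$ | $A_{11} \succ A_{10} \succ A_9 \succ A_3 \succ A_2 \succ A_7 \succ A_6 \succ A_5 \succ A_1 \succ A_8 \succ A_4$ |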
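Table 6. Ranking results of alternative cars with different attribute weights.

| $w_1$ | $w_2$ | $w_3$ | $w_4$ | $w_5$ | Ranking results of alternative cars |
|---|---|---|---|---|---|
| 0.2 | 0.3 | 0.2 | 0.2 | 0.1 | $A_6 \succ A_7 \succ A_{11} \succ A_5 \succ A_9 \succ A_{10} \succ A_2 \succ A_3 \succ A_1 \succ A_8 \succ A_4$ |
| 0.1 | 0.3 | 0.2 | 0.2 | 0.2 | $A_7 \succ A_6 \succ A_{11} \succ A_{10} \succ A_5 \succ A_9 \succ A_3 \succ A_2 \succ A_1 \succ A_8 \succ A_4$ |
| 0.1 | 0.01 | 0.2 | 0.2 | 0.4 | $A_{11} \succ A_7 \succ A_8 \succ A_{10} \succ A_5 \succ A_6 \succ A_3 \succ A_2 \succ A_1 \succ A_9 \succ A_4$ |
| 0.1 | 0.5 | 0.1 | 0.2 | 0.1 | $A_6 \succ A_7 \succ A_5 \succ A_{10} \succ A_9 \succ A_{11} \succ A_2 \succ A_3 \succ A_1 \succ A_4 \succ A_8$ |
| 0.05 | 0.7 | 0.1 | 0.05 | 0.1 | $A_6 \succ A_{10} \succ A_7 \succ A_5 \succ A_9 \succ A_3 \succ A_2 \succ A_1 \succ A_{11} \succ A_4 \succ A_8$ |
| 0.1 | 0.1 | 0.1 | 0.1 | 0.6 | $A_{11} \succ A_{10} \succ A_7 \succ A_3 \succ A_8 \succ A_1 \succ A_6 \succ A_5 \succ A_2 \succ A_4 \succ A_9$ |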
cars. As seen in Table 6, using the proposed method, different products could be considered the most desirable one with respect to the consumer's different subjective preferences on product attributes. However, the consumer's subjective preferences on product attributes cannot be considered in most of the existing studies.
6. Conclusions
This paper proposes a method for ranking products through online reviews based on sentiment classification and interval-valued intuitionistic fuzzy TOPSIS. In the method, the online reviews of the alternative products concerning multiple attributes are preprocessed using the ICTCLAS 2016, and the notional words of each online review are obtained by removing the stop words. Then, an algorithm based on SVM and the OVO strategy is developed to identify the positive, neutral, and negative sentiment orientations of online reviews of the alternative products concerning different attributes. Furthermore, based on the percentages of online reviews with different sentiment orientations and the numbers of online reviews concerning different alternatives, interval-valued intuitionistic fuzzy numbers are constructed to represent the performance of the alternative products concerning the product attributes, and the ranking of the alternative products is determined by the interval-valued intuitionistic fuzzy TOPSIS. The proposed method has several distinct characteristics as discussed below.
First, in the proposed method, to identify the positive, neutral, and negative sentiment orientations of online reviews, an algorithm based on SVM and the OVO strategy is proposed. In the algorithm, with respect to each product attribute, three SVM classifiers are, respectively, trained using the three subsets of training samples (i.e., "positive-neutral", "positive-negative", and "neutral-negative"), and the OVO strategy is introduced to integrate the classification results obtained by the three SVM classifiers. The algorithm has a clear logic and is a valuable attempt at refining more-valuable information for ranking products through online reviews.
Secondly, the key strength of the proposed method is the use of interval-valued intuitionistic fuzzy numbers to represent the performances of the alternative products with respect to the product attributes. The transformation process is theoretically sound and complete because it is based on classical theories, including intuitionistic fuzzy theory and the confidence interval estimation in probability theory. The use of interval-valued intuitionistic fuzzy numbers overcomes the limitations of the previous studies, in which either the reviews with neutral sentiment orientations are ignored or the numbers of reviews of different products cannot be considered. It is not only a new idea to represent a large number of sentiment orientations but also the first attempt to obtain decision data objectively in the form of interval-valued intuitionistic fuzzy numbers. This approach will significantly extend the application backgrounds of interval-valued intuitionistic fuzzy numbers.
We emphasize that, because the proposed method is new and different from the existing methods, it is important for developing and enriching theories and methods for ranking products through online reviews.
In terms of future research, a support system based on the proposed method needs to be developed to support consumer use of the proposed method to make purchase decisions more conveniently. Moreover, to refine valuable decision data from online reviews more efficiently, research on online review analysis based on the Natural Language Toolkit is needed.
Acknowledgments
This work was partly supported by the National Science Foundation of China
(Project Nos. 71371002, 71571039 and 71771043), the Fundamental Research Funds
for the Central Universities, China (Project No. N140607001), and the 111 Project
(B16009).
Appendix A.
A description of the weighted voting strategy for integrating the results obtained
by the three SVM classifiers ($x_{ik}^{j,\text{pos-neu}}$, $x_{ik}^{j,\text{pos-neg}}$,
and $x_{ik}^{j,\text{neu-neg}}$) and determining the indicator vector $S_{ik}^{j}$ is
provided below; $i = 1, 2, \ldots, n$, $j = 1, 2, \ldots, m$, $k = 1, 2, \ldots, q_i$.
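Let $p_{ik}^{j}(* \mid x_{ik}^{j,*\text{-}\#} = 1)$ and
$p_{ik}^{j}(* \mid x_{ik}^{j,*\text{-}\#} = 0)$ denote the confidences that "the
sentiment orientation of online review $D_{ik}^{j}$ is $*$" given
$x_{ik}^{j,*\text{-}\#} = 1$ and $x_{ik}^{j,*\text{-}\#} = 0$, respectively,
$*\text{-}\# \in \{\text{pos-neu}, \text{pos-neg}, \text{neu-neg}\}$. Based on the
study of Platt,$^{42}$ $p_{ik}^{j}(* \mid x_{ik}^{j,*\text{-}\#} = 1)$ and
$p_{ik}^{j}(* \mid x_{ik}^{j,*\text{-}\#} = 0)$ can be represented by the following
Eqs. (A.1) and (A.2), respectively, i.e.,

\[
\begin{gathered}
p_{ik}^{j}\bigl(* \mid x_{ik}^{j,*\text{-}\#} = 1\bigr)
  = \frac{1}{1 + \exp\bigl(\alpha_{j}^{*\text{-}\#} + \beta_{j}^{*\text{-}\#}\bigr)}, \\
i = 1, 2, \ldots, n;\ j = 1, 2, \ldots, m;\ k = 1, 2, \ldots, q_i;\
*\text{-}\# \in \{\text{pos-neu}, \text{pos-neg}, \text{neu-neg}\},
\end{gathered}
\tag{A.1}
\]

\[
\begin{gathered}
p_{ik}^{j}\bigl(* \mid x_{ik}^{j,*\text{-}\#} = 0\bigr)
  = \frac{1}{1 + \exp\bigl(\alpha_{j}^{*\text{-}\#}\bigr)}, \\
i = 1, 2, \ldots, n;\ j = 1, 2, \ldots, m;\ k = 1, 2, \ldots, q_i;\
*\text{-}\# \in \{\text{pos-neu}, \text{pos-neg}, \text{neu-neg}\}.
\end{gathered}
\tag{A.2}
\]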
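In Eqs. (A.1) and (A.2), $\alpha_{j}^{*\text{-}\#}$ and $\beta_{j}^{*\text{-}\#}$ are
parameters. The values of $\alpha_{j}^{*\text{-}\#}$ and $\beta_{j}^{*\text{-}\#}$
depend on the numbers and categories of training samples for the different
SVM classifiers, $*\text{-}\# \in \{\text{pos-neu}, \text{pos-neg}, \text{neu-neg}\}$,
$j = 1, 2, \ldots, m$. The values of $\alpha_{j}^{*\text{-}\#}$ and
$\beta_{j}^{*\text{-}\#}$ can be determined by maximizing the likelihood function on
the training samples,$^{42}$ i.e.,

\[
\begin{gathered}
L\bigl(\alpha_{j}^{*\text{-}\#}, \beta_{j}^{*\text{-}\#}\bigr)
  = \prod_{i=1}^{n_{j}^{*\text{-}\#}}
    \bigl(p_{ij}^{*\text{-}\#}\bigr)^{t_{j*}^{*\text{-}\#}}
    \bigl(1 - p_{ij}^{*\text{-}\#}\bigr)^{t_{j\#}^{*\text{-}\#}}, \\
j = 1, 2, \ldots, m;\ *\text{-}\# \in \{\text{pos-neu}, \text{pos-neg}, \text{neu-neg}\},
\end{gathered}
\tag{A.3}
\]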
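\[
\begin{gathered}
p_{ij}^{*\text{-}\#}
  = \frac{1}{1 + \exp\bigl(\alpha_{j}^{*\text{-}\#}
      + \beta_{j}^{*\text{-}\#}\,\delta_{ij}^{*\text{-}\#}\bigr)}, \\
i = 1, 2, \ldots, n_{j}^{*\text{-}\#};\ j = 1, 2, \ldots, m;\
*\text{-}\# \in \{\text{pos-neu}, \text{pos-neg}, \text{neu-neg}\},
\end{gathered}
\tag{A.4}
\]

\[
t_{j*}^{*\text{-}\#} = \frac{n_{j*}^{*\text{-}\#} + 1}{n_{j*}^{*\text{-}\#} + 2},
\quad j = 1, 2, \ldots, m;\ *\text{-}\# \in \{\text{pos-neu}, \text{pos-neg}, \text{neu-neg}\},
\tag{A.5}
\]

\[
t_{j\#}^{*\text{-}\#} = \frac{1}{n_{j\#}^{*\text{-}\#} + 2},
\quad j = 1, 2, \ldots, m;\ *\text{-}\# \in \{\text{pos-neu}, \text{pos-neg}, \text{neu-neg}\}.
\tag{A.6}
\]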
In Eqs. (A.3)–(A.6), $n_{j}^{*\text{-}\#}$ denotes the number of training samples for
classifier $\mathrm{SVM}_{j}^{*\text{-}\#}$; $n_{j*}^{*\text{-}\#}$ and
$n_{j\#}^{*\text{-}\#}$ denote the numbers of training samples with the sentiment
orientations $*$ and $\#$, respectively; $\delta_{ij}^{*\text{-}\#}$ is a score value
of $\omega_{z}^{jh}$ calculated by a pre-trained classifier; and
$p_{ij}^{*\text{-}\#}$ denotes the confidence of $\mathrm{SVM}_{j}^{*\text{-}\#}$,
which discriminates classes $*$ and $\#$, in favor of the former.$^{42}$
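As an illustration of the parameter fitting described above, the following is a minimal Python sketch, not the authors' implementation. It assumes the standard formulation from Platt's paper, in which each training sample carries its own regularized target (the first-class target for samples of class `*`, the second-class target for samples of class `#`) and the sigmoid has the form p = 1 / (1 + exp(a + b * score)). The names `fit_platt`, `platt_targets`, `a`, `b`, and the use of plain gradient descent are assumptions made for this example.

```python
import math


def sigmoid_neg(z):
    """Return 1 / (1 + exp(z)), evaluated without overflow for large |z|."""
    if z >= 0:
        e = math.exp(-z)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(z))


def platt_targets(labels):
    """Regularized targets: (n1 + 1) / (n1 + 2) for the first class (label 1),
    1 / (n2 + 2) for the second class (label 0)."""
    n_first = sum(1 for y in labels if y == 1)
    n_second = len(labels) - n_first
    t_first = (n_first + 1.0) / (n_first + 2.0)
    t_second = 1.0 / (n_second + 2.0)
    return [t_first if y == 1 else t_second for y in labels]


def fit_platt(scores, labels, lr=0.1, steps=5000):
    """Fit intercept a and slope b by gradient descent on the negative
    log-likelihood sum_i -[t_i log p_i + (1 - t_i) log(1 - p_i)]."""
    targets = platt_targets(labels)
    a, b = 0.0, 0.0
    n = len(scores)
    for _ in range(steps):
        grad_a = 0.0
        grad_b = 0.0
        for s, t in zip(scores, targets):
            p = sigmoid_neg(a + b * s)
            # d(NLL)/d(a + b*s) = t - p for this parameterization
            grad_a += t - p
            grad_b += (t - p) * s
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


# Hypothetical SVM scores: positive scores favor the first class.
example_scores = [2.0, 1.5, 1.0, -1.0, -1.5, -2.0]
example_labels = [1, 1, 1, 0, 0, 0]
a_hat, b_hat = fit_platt(example_scores, example_labels)
```

With this sign convention, a negative fitted slope means that larger classifier scores yield a higher confidence for the first class. Plain gradient descent is used only to keep the sketch dependency-free; Platt's paper describes a more robust optimization procedure.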
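Correspondingly, let $p_{ik}^{j}(\# \mid x_{ik}^{j,*\text{-}\#} = 1)$ and
$p_{ik}^{j}(\# \mid x_{ik}^{j,*\text{-}\#} = 0)$ denote the confidences of "the
sentiment orientation of online review $D_{ik}^{j}$ is $\#$" given
$x_{ik}^{j,*\text{-}\#} = 1$ and $x_{ik}^{j,*\text{-}\#} = 0$, respectively. Then,
based on the studies of Ailon and Mohri$^{43}$ and Galar et al.,$^{44}$
$p_{ik}^{j}(\# \mid x_{ik}^{j,*\text{-}\#} = 1)$ and
$p_{ik}^{j}(\# \mid x_{ik}^{j,*\text{-}\#} = 0)$ can be calculated by the following
Eqs. (A.7) and (A.8), i.e.,

\[
\begin{gathered}
p_{ik}^{j}\bigl(\# \mid x_{ik}^{j,*\text{-}\#} = 1\bigr)
  = 1 - p_{ik}^{j}\bigl(* \mid x_{ik}^{j,*\text{-}\#} = 1\bigr)
  = 1 - \frac{1}{1 + \exp\bigl(\alpha_{j}^{*\text{-}\#} + \beta_{j}^{*\text{-}\#}\bigr)}, \\
i = 1, 2, \ldots, n;\ j = 1, 2, \ldots, m;\ k = 1, 2, \ldots, q_i;\
*\text{-}\# \in \{\text{pos-neu}, \text{pos-neg}, \text{neu-neg}\},
\end{gathered}
\tag{A.7}
\]

\[
\begin{gathered}
p_{ik}^{j}\bigl(\# \mid x_{ik}^{j,*\text{-}\#} = 0\bigr)
  = 1 - p_{ik}^{j}\bigl(* \mid x_{ik}^{j,*\text{-}\#} = 0\bigr)
  = 1 - \frac{1}{1 + \exp\bigl(\alpha_{j}^{*\text{-}\#}\bigr)}, \\
i = 1, 2, \ldots, n;\ j = 1, 2, \ldots, m;\ k = 1, 2, \ldots, q_i;\
*\text{-}\# \in \{\text{pos-neu}, \text{pos-neg}, \text{neu-neg}\}.
\end{gathered}
\tag{A.8}
\]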
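The pair of complementary confidences for one binary classifier can be sketched as follows. This is a hedged illustration only: the function name `pair_confidences` and the parameter names `a` (intercept) and `b` (coefficient of the binary output `x`) are assumptions, and the numeric parameter values in the usage line are hypothetical rather than fitted values from the paper.

```python
import math


def pair_confidences(x, a, b):
    """Return (confidence for the first class, confidence for the second class)
    given the binary output x in {0, 1} of one pairwise classifier.

    The first-class confidence is 1 / (1 + exp(a + b * x)); the second-class
    confidence is its complement, so the two always sum to 1.
    """
    if x not in (0, 1):
        raise ValueError("x must be 0 or 1")
    p_first = 1.0 / (1.0 + math.exp(a + b * x))
    return p_first, 1.0 - p_first


# Hypothetical parameters a = 0.0, b = -2.0 for illustration only.
p_first_given_1, p_second_given_1 = pair_confidences(1, 0.0, -2.0)
p_first_given_0, p_second_given_0 = pair_confidences(0, 0.0, -2.0)
```

Because `x` is binary, only two distinct confidence values exist per classifier and class, which is why the appendix states the cases x = 1 and x = 0 separately.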
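Let $p_{ik,\mathrm{pos}}^{j}$, $p_{ik,\mathrm{neu}}^{j}$, and $p_{ik,\mathrm{neg}}^{j}$
denote the confidences to identify the sentiment orientation of $D_{ik}^{j}$ as
positive, neutral, and negative, $i = 1, 2, \ldots, n$, $j = 1, 2, \ldots, m$,
$k = 1, 2, \ldots, q_i$. Then, $p_{ik,\mathrm{pos}}^{j}$, $p_{ik,\mathrm{neu}}^{j}$, and
$p_{ik,\mathrm{neg}}^{j}$ can be calculated by the following Eqs. (A.9)–(A.11),
respectively, i.e.,

\[
\begin{aligned}
p_{ik,\mathrm{pos}}^{j}
  ={}& p_{ik}^{j}\bigl(\mathrm{pos} \mid x_{ik}^{j,\text{pos-neu}} = 1\bigr)\, x_{ik}^{j,\text{pos-neu}}
     + p_{ik}^{j}\bigl(\mathrm{pos} \mid x_{ik}^{j,\text{pos-neu}} = 0\bigr)\,\bigl(1 - x_{ik}^{j,\text{pos-neu}}\bigr) \\
   &+ p_{ik}^{j}\bigl(\mathrm{pos} \mid x_{ik}^{j,\text{pos-neg}} = 1\bigr)\, x_{ik}^{j,\text{pos-neg}}
     + p_{ik}^{j}\bigl(\mathrm{pos} \mid x_{ik}^{j,\text{pos-neg}} = 0\bigr)\,\bigl(1 - x_{ik}^{j,\text{pos-neg}}\bigr), \\
   & i = 1, 2, \ldots, n;\ j = 1, 2, \ldots, m;\ k = 1, 2, \ldots, q_i;
\end{aligned}
\tag{A.9}
\]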
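\[
\begin{aligned}
p_{ik,\mathrm{neu}}^{j}
  ={}& p_{ik}^{j}\bigl(\mathrm{neu} \mid x_{ik}^{j,\text{pos-neu}} = 1\bigr)\, x_{ik}^{j,\text{pos-neu}}
     + p_{ik}^{j}\bigl(\mathrm{neu} \mid x_{ik}^{j,\text{pos-neu}} = 0\bigr)\,\bigl(1 - x_{ik}^{j,\text{pos-neu}}\bigr) \\
   &+ p_{ik}^{j}\bigl(\mathrm{neu} \mid x_{ik}^{j,\text{neu-neg}} = 1\bigr)\, x_{ik}^{j,\text{neu-neg}}
     + p_{ik}^{j}\bigl(\mathrm{neu} \mid x_{ik}^{j,\text{neu-neg}} = 0\bigr)\,\bigl(1 - x_{ik}^{j,\text{neu-neg}}\bigr), \\
   & i = 1, 2, \ldots, n;\ j = 1, 2, \ldots, m;\ k = 1, 2, \ldots, q_i;
\end{aligned}
\tag{A.10}
\]

\[
\begin{aligned}
p_{ik,\mathrm{neg}}^{j}
  ={}& p_{ik}^{j}\bigl(\mathrm{neg} \mid x_{ik}^{j,\text{pos-neg}} = 1\bigr)\, x_{ik}^{j,\text{pos-neg}}
     + p_{ik}^{j}\bigl(\mathrm{neg} \mid x_{ik}^{j,\text{pos-neg}} = 0\bigr)\,\bigl(1 - x_{ik}^{j,\text{pos-neg}}\bigr) \\
   &+ p_{ik}^{j}\bigl(\mathrm{neg} \mid x_{ik}^{j,\text{neu-neg}} = 1\bigr)\, x_{ik}^{j,\text{neu-neg}}
     + p_{ik}^{j}\bigl(\mathrm{neg} \mid x_{ik}^{j,\text{neu-neg}} = 0\bigr)\,\bigl(1 - x_{ik}^{j,\text{neu-neg}}\bigr), \\
   & i = 1, 2, \ldots, n;\ j = 1, 2, \ldots, m;\ k = 1, 2, \ldots, q_i.
\end{aligned}
\tag{A.11}
\]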
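Based on the obtained $p_{ik,\mathrm{pos}}^{j}$, $p_{ik,\mathrm{neu}}^{j}$, and
$p_{ik,\mathrm{neg}}^{j}$, the indicator vector of sentiment orientation of online
review $D_{ik}^{j}$ can be determined by the following Eq. (A.12), i.e.,

\[
\begin{gathered}
S_{ik}^{j} =
\begin{cases}
(0, 0, 0), & \text{if } D_{ik}^{j} = \varnothing, \\
(1, 0, 0), & \text{if } \max\{p_{ik,\mathrm{pos}}^{j}, p_{ik,\mathrm{neu}}^{j}, p_{ik,\mathrm{neg}}^{j}\} = p_{ik,\mathrm{pos}}^{j}, \\
(0, 1, 0), & \text{if } \max\{p_{ik,\mathrm{pos}}^{j}, p_{ik,\mathrm{neu}}^{j}, p_{ik,\mathrm{neg}}^{j}\} = p_{ik,\mathrm{neu}}^{j}, \\
(0, 0, 1), & \text{if } \max\{p_{ik,\mathrm{pos}}^{j}, p_{ik,\mathrm{neu}}^{j}, p_{ik,\mathrm{neg}}^{j}\} = p_{ik,\mathrm{neg}}^{j},
\end{cases} \\
i = 1, 2, \ldots, n;\ j = 1, 2, \ldots, m;\ k = 1, 2, \ldots, q_i.
\end{gathered}
\tag{A.12}
\]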
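The weighted voting step of this appendix can be sketched end to end in a few lines of Python. This is an illustrative sketch under stated assumptions, not the authors' code: it assumes that an output of 1 from a pairwise classifier means the first class of the pair was predicted, that each classifier has a hypothetical parameter pair `(a, b)` with first-class confidence 1 / (1 + exp(a + b * x)), and that ties are broken in the order positive, neutral, negative (the source does not state a tie-breaking rule). The names `weighted_vote`, `PAIRS`, and `LABELS` are introduced here for illustration.

```python
import math

LABELS = ("pos", "neu", "neg")
PAIRS = (("pos", "neu"), ("pos", "neg"), ("neu", "neg"))


def first_class_conf(x, a, b):
    """Confidence for the first class of a pair given binary output x."""
    return 1.0 / (1.0 + math.exp(a + b * x))


def weighted_vote(outputs, params):
    """Return the indicator vector (pos, neu, neg) for one review.

    outputs: dict mapping "pos-neu", "pos-neg", "neu-neg" to 0 or 1,
             or None when the review is empty.
    params:  dict mapping the same keys to hypothetical (a, b) pairs.
    """
    if outputs is None:
        return (0, 0, 0)  # empty review: no orientation assigned

    totals = {label: 0.0 for label in LABELS}
    for first, second in PAIRS:
        key = first + "-" + second
        a, b = params[key]
        p_first = first_class_conf(outputs[key], a, b)
        # Each class collects confidence from the two classifiers it appears in.
        totals[first] += p_first
        totals[second] += 1.0 - p_first

    # max() keeps the earliest label on ties (assumed tie-breaking order).
    best = max(LABELS, key=lambda label: totals[label])
    return tuple(int(label == best) for label in LABELS)


# Hypothetical parameters shared by all three classifiers.
example_params = {key: (2.0, -4.0) for key in ("pos-neu", "pos-neg", "neu-neg")}
example_vector = weighted_vote(
    {"pos-neu": 1, "pos-neg": 1, "neu-neg": 0}, example_params
)
```

Because each output is binary, the four-term sums in the confidence formulas reduce to two terms per class, namely the confidence evaluated at the observed output of each of the two classifiers that involve that class.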
References
1. H. Chen, R. H. Chiang and V. C. Storey, Business intelligence and analytics: From big
data to big impact, MIS Quarterly 36(4) (2012) 1165–1188.
2. Y. Li, Q. Ye, Z. Zhang and T. Wang, Snippet-based unsupervised approach for sentiment
classification of Chinese online reviews, International Journal of Information Technology
& Decision Making 10(6) (2011) 1097–1110.
3. T. Hennig-Thurau, K. P. Gwinner, G. Walsh and D. D. Gremler, Electronic word-of-mouth
via consumer-opinion platforms: What motivates consumers to articulate themselves
on the Internet?, Journal of Interactive Marketing 18(1) (2004) 38–52.
4. B. Bickart and R. M. Schindler, Internet forums as influential sources of consumer
information, Journal of Interactive Marketing 15(3) (2001) 31–40.
5. S. Senecal and J. Nantel, The influence of online product recommendations on consumers'
online choices, Journal of Retailing 80(2) (2004) 159–169.
6. A. Ghose and P. G. Ipeirotis, Estimating the helpfulness and economic impact of product
reviews: Mining text and reviewer characteristics, IEEE Transactions on Knowledge and
Data Engineering 23(10) (2011) 1498–1512.
7. T. Lappas, M. Crovella and E. Terzi, Selecting a characteristic set of reviews, in
Proc. 18th ACM SIGKDD Int. Conf. Knowledge Discovery and Data Mining, Beijing,
China (2012), pp. 832–840.
8. K. Zhang, R. Narayanan and A. Choudhary, Mining online customer reviews for ranking
products, EECS Department, Northwestern University (2009).
9. K. Zhang, R. Narayanan and A. N. Choudhary, Voice of the customers: Mining online
customer reviews for product feature-based ranking, in Proc. 3rd Conf. Online Social
Networks, Boston, MA, USA (2010), pp. 1–9.
10. K. Zhang, Y. Cheng, W. K. Liao and A. Choudhary, Mining millions of reviews:
A technique to rank products based on importance of reviews, in Proc. 13th Int. Conf.
Electronic Commerce, Liverpool, United Kingdom (2011), pp. 121–128.
11. Y. Peng, G. Kou and J. Li, A fuzzy PROMETHEE approach for mining customer reviews
in Chinese, Arabian Journal for Science and Engineering 39(6) (2014) 5245–5252.
12. K. Chen, G. Kou, J. Shang and Y. Chen, Visualizing market structure through online
product reviews: Integrate topic modeling, TOPSIS, and multi-dimensional scaling
approaches, Electronic Commerce Research and Applications 14(1) (2015) 58–74.
13. E. Najmi, K. Hashmi, Z. Malik, A. Rezgui and H. U. Khan, CAPRA: A comprehensive
approach to product ranking using customer reviews, Computing 97(8) (2015)
843–867.
14. X. Yang, G. Yang and J. Wu, Integrating rich and heterogeneous information to design a
ranking system for multiple products, Decision Support Systems 84 (2016) 117–133.
15. Y. Liu, J. W. Bi and Z. P. Fan, Ranking products through online reviews: A method based
on sentiment analysis technique and intuitionistic fuzzy set theory, Information Fusion
36 (2016) 149–161.
16. Z. Xu and H. Hu, Projection models for intuitionistic fuzzy multiple attribute decision
making, International Journal of Information Technology & Decision Making 9(2) (2010)
267–280.
17. Z. Xu, Intuitionistic fuzzy aggregation operators, IEEE Transactions on Fuzzy Systems
15(6) (2007) 1179–1187.
18. Z. Xu and R. R. Yager, Some geometric aggregation operators based on intuitionistic
fuzzy sets, International Journal of General Systems 35(4) (2006) 417–433.
19. K. T. Atanassov, Intuitionistic fuzzy sets, Fuzzy Sets and Systems 20(1) (1986) 87–96.
20. Z. Xu and X. Cai, Intuitionistic Fuzzy Information Aggregation (Springer, Berlin
Heidelberg, 2012).
21. Z. Xu and R. R. Yager, Dynamic intuitionistic fuzzy multi-attribute decision making,
International Journal of Approximate Reasoning 48(1) (2008) 246–262.
22. W. Feller, An Introduction to Probability Theory and Its Applications (Wiley, New York,
2008).
23. Z. Xu, Models for multiple attribute decision making with intuitionistic fuzzy
information, International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems
15(3) (2007) 285–297.
24. F. Ye, An extended TOPSIS method with interval-valued intuitionistic fuzzy numbers for
virtual enterprise partner selection, Expert Systems with Applications 37(10) (2010)
7050–7055.
25. T. L. Saaty, The Analytical Hierarchy Process (McGraw-Hill, Toronto, 1980).
26. G. Kou, D. Ergu, C. Lin and Y. Chen, Pairwise comparison matrix in multiple criteria
decision making, Technological and Economic Development of Economy 22(5) (2016)
738–765.
27. G. Kou and C. Lin, A cosine maximization method for the priority vector derivation in
AHP, European Journal of Operational Research 235(1) (2014) 225–232,
DOI: http://dx.doi.org/10.1016/j.ejor.2013.10.019.
28. S. L. Huang and W. C. Cheng, Discovering Chinese sentence patterns for feature-based
opinion summarization, Electronic Commerce Research and Applications 14(6) (2015)
582–591.
29. H. Q. Zhang, A. Sekhari, Y. Ouzrout and A. Bouras, Jointly identifying opinion mining
elements and fuzzy measurement of opinion intensity to analyze product features,
Engineering Applications of Artificial Intelligence 47 (2016) 122–139.
30. J. A. Balazs and J. D. Velásquez, Opinion mining and information fusion: A survey,
Information Fusion 27 (2016) 95–110.
31. W. Medhat, A. Hassan and H. Korashy, Sentiment analysis algorithms and applications:
A survey, Ain Shams Engineering Journal 5(4) (2014) 1093–1113.
32. Y. Liu, J. W. Bi and Z. P. Fan, A method for multi-class sentiment classification based on
an improved one-vs-one (OVO) strategy and the support vector machine (SVM)
algorithm, Information Sciences 394–395 (2017) 38–52.
33. B. Pang and L. Lee, Opinion mining and sentiment analysis, Foundations and Trends in
Information Retrieval 2 (2008) 1–135.
34. R. Baeza-Yates and B. Ribeiro-Neto, Modern Information Retrieval (ACM Press,
New York, 1999).
35. S. Tan and J. Zhang, An empirical study of sentiment analysis for Chinese documents,
Expert Systems with Applications 34(4) (2008) 2622–2629.
36. G. Wang, J. Sun, J. Ma, K. Xu and J. Gu, Sentiment classification: The contribution of
ensemble learning, Decision Support Systems 57 (2014) 77–93.
37. V. N. Vapnik and V. Vapnik, Statistical Learning Theory (Wiley, New York, 1998).
38. M. Galar, A. Fernández, E. Barrenechea, H. Bustince and F. Herrera, An overview
of ensemble methods for binary classifiers in multi-class problems: Experimental study on
one-vs-one and one-vs-all schemes, Pattern Recognition 44(8) (2011) 1761–1776.
39. E. Hüllermeier and S. Vanderlooy, Combining predictions in pairwise classification:
An optimal adaptive voting strategy and its relation to weighted voting, Pattern
Recognition 43(1) (2010) 128–142.
40. G. Kou, Y. Lu, Y. Peng and Y. Shi, Evaluation of classification algorithms using MCDM
and rank correlation, International Journal of Information Technology & Decision
Making 11(01) (2012) 197–225.
41. G. Kou, Y. Peng and G. Wang, Evaluation of clustering algorithms for financial risk
analysis using MCDM methods, Information Sciences 275 (2014) 1–12.
42. J. Platt, Probabilistic outputs for support vector machines and comparisons to
regularized likelihood methods, Advances in Large Margin Classifiers 10(3) (1999) 61–74.
43. N. Ailon and M. Mohri, An efficient reduction of ranking to classification, Machine
Learning 29(2) (2008) 103–130.
44. M. Galar, A. Fernández, E. Barrenechea, H. Bustince and F. Herrera, An overview of
ensemble methods for binary classifiers in multi-class problems: Experimental study on
one-vs-one and one-vs-all schemes, Pattern Recognition 44(8) (2011) 1761–1776.