Table 3 - uploaded by Guohong Fu
Examples of generated paraphrases. 

Source publication
Conference Paper
Full-text available
While substantial progress has been made on sentiment polarity classification to date, the lack of enough opinion-annotated corpora for reliable training is still a challenge. In this paper we propose to improve a support vector machine-based polarity classifier by enriching both training data and test data via opinion paraphrasing. In partic...

Context in source publication

Context 1
... path forms a probable paraphrase for the input sentence. Table 3 shows some generated paraphrases and their bigram scores. ...
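
The excerpt above does not say how the bigram scores in Table 3 are computed. As a minimal sketch only, assuming a length-normalized, add-one-smoothed bigram log-probability estimated from a tokenized corpus, candidate paraphrases could be ranked as follows. The function names, the smoothing choice, and the toy corpus are all illustrative assumptions, not the authors' method.

```python
import math
from collections import Counter

def train_bigram_counts(corpus):
    """Count unigrams and bigrams over tokenized sentences with boundary markers."""
    unigrams, bigrams = Counter(), Counter()
    for tokens in corpus:
        padded = ["<s>"] + tokens + ["</s>"]
        unigrams.update(padded)
        bigrams.update(zip(padded, padded[1:]))
    return unigrams, bigrams

def bigram_score(tokens, unigrams, bigrams):
    """Average add-one-smoothed bigram log-probability of one candidate (assumed scoring)."""
    padded = ["<s>"] + tokens + ["</s>"]
    vocab_size = len(unigrams)
    log_prob = 0.0
    for prev, cur in zip(padded, padded[1:]):
        log_prob += math.log((bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab_size))
    # Normalize by the number of transitions so longer candidates are not penalized.
    return log_prob / (len(padded) - 1)

# Hypothetical toy corpus and candidates, for illustration only.
corpus = [
    ["the", "screen", "is", "great"],
    ["the", "screen", "is", "very", "good"],
    ["the", "battery", "is", "great"],
]
unigrams, bigrams = train_bigram_counts(corpus)
candidates = [
    ["screen", "the", "good", "is"],
    ["the", "screen", "is", "good"],
]
ranked = sorted(candidates, key=lambda c: bigram_score(c, unigrams, bigrams), reverse=True)
```

Under these assumptions, the fluent word order ranks above the scrambled one because more of its bigrams were observed in the corpus.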

Citations

... The sentiment analysis research community has adopted the consensus that semantically equivalent text should have the same sentiment polarity [8,19,28,62,72,77]. For example, [19] proposed an approach that learns the polarity of affective events in a narrative text based on weakly-supervised labels, where the semantically equivalent pairs (event/effect) got the same polarity, and opposite pairs (event/effect) got opposite polarities. ...
Article
Full-text available
In this paper, we present a comprehensive study that evaluates six state-of-the-art sentiment analysis tools on five public datasets, based on the quality of predictive results in the presence of semantically equivalent documents, i.e., how consistent existing tools are in predicting the polarity of documents based on paraphrased text. We observe that sentiment analysis tools exhibit intra-tool inconsistency, which is the prediction of different polarity for semantically equivalent documents by the same tool, and inter-tool inconsistency, which is the prediction of different polarity for semantically equivalent documents across different tools. We introduce a heuristic to assess the data quality of an augmented dataset and a new set of metrics to evaluate tool inconsistencies. Our results indicate that tool inconsistency is still an open problem, and they point towards promising research directions and accuracy improvements that can be obtained if such inconsistencies are resolved.
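
The two notions of inconsistency defined in this abstract can be illustrated with a small sketch. It is not the set of metrics the article introduces: the rates below, the function names, and the toy labels are assumptions, and inter-tool inconsistency is simplified here to disagreement between tools on the same document.

```python
from collections import defaultdict

def intra_tool_inconsistency(predictions, cluster_of):
    """Fraction of paraphrase clusters on which one tool predicts more than one polarity.

    predictions: dict mapping doc_id -> polarity label from a single tool
    cluster_of:  dict mapping doc_id -> id of its group of semantically equivalent documents
    """
    labels_per_cluster = defaultdict(set)
    for doc_id, label in predictions.items():
        labels_per_cluster[cluster_of[doc_id]].add(label)
    inconsistent = sum(1 for labels in labels_per_cluster.values() if len(labels) > 1)
    return inconsistent / len(labels_per_cluster)

def inter_tool_inconsistency(predictions_by_tool):
    """Fraction of shared documents on which at least two tools disagree (simplified)."""
    tools = list(predictions_by_tool.values())
    doc_ids = set(tools[0])
    for preds in tools[1:]:
        doc_ids &= set(preds)
    disagreements = sum(1 for d in doc_ids if len({preds[d] for preds in tools}) > 1)
    return disagreements / len(doc_ids)

# Hypothetical toy data: d1/d2 are paraphrases of each other, as are d3/d4.
cluster_of = {"d1": "c1", "d2": "c1", "d3": "c2", "d4": "c2"}
tool_a = {"d1": "pos", "d2": "neg", "d3": "neg", "d4": "neg"}
tool_b = {"d1": "pos", "d2": "pos", "d3": "neg", "d4": "neg"}

intra_a = intra_tool_inconsistency(tool_a, cluster_of)
intra_b = intra_tool_inconsistency(tool_b, cluster_of)
inter_ab = inter_tool_inconsistency({"tool_a": tool_a, "tool_b": tool_b})
```

Both functions assume non-empty inputs; a fuller version would guard against empty prediction sets before dividing.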
... Nevertheless, sentiment analysis of social media data is still a challenging task [10] due to the complexity and variety of natural language, through which the same idea can be expressed and interpreted using different text. Many research works have adopted the consensus that semantically equivalent documents should have the same polarity [3,5,12,22,26,28]. For instance, [5] attributed the same polarity labels to the semantically equivalent couples (event/effect), while [12] augmented their sentiment dataset using paraphrases and assigned the original document's polarity to the generated paraphrases. ...
Preprint
Full-text available
The opinions expressed on various websites and social media are an essential contributor to the decision-making process of several organizations. Existing sentiment analysis tools aim to extract the polarity (i.e., positive, negative, neutral) from this opinionated content. Despite the advance of research in the field, sentiment analysis tools give inconsistent polarities, which is harmful to business decisions. In this paper, we propose SentiQ, an unsupervised Markov Logic Network-based approach that injects the semantic dimension into the tools through rules. It detects and resolves inconsistencies and thereby improves the overall accuracy of the tools. Preliminary experimental results demonstrate the usefulness of SentiQ.
... With the rapid development of social networks over the past years, sentiment analysis of short social media texts has been attracting an ever-increasing amount of attention from the natural language processing community (Hu et al., 2004; Fu et al., 2014; Santos and Gatti, 2014). While substantial progress has been made on sentiment analysis to date (Pang et al., 2002; Hu et al., 2004; Wang and Manning, 2012; Kim et al., 2013; Liu et al., 2014; He et al., 2015), it is still challenging to explore enough contextual information or specific cues for polarity classification of short text like online product reviews (Fu et al., 2014; Santos and Gatti, 2014). On the one hand, online product reviews are short and thus contain a limited amount of contextual information for sentiment analysis. ...
... How to explore enough contextual information or specific cues is one important challenge for polarity classification of online product reviews (Fu et al., 2014; Santos and Gatti, 2014). Online product reviews are short texts with a limited amount of contextual information for sentiment analysis. ...
Conference Paper
While substantial progress has been made on sentiment analysis to date, it is still challenging to explore enough contextual information or specific cues for polarity classification of short text like online product reviews. In this work we explore review clustering and opinion paraphrasing to build multiple cluster-based classifiers for polarity classification of Chinese product reviews under the framework of support vector machines. We apply our approach to two corpora of product reviews in the car and mobile phone domains. Our experimental results demonstrate that opinion clustering and paraphrasing are of great value to polarity classification.
Article
Sentiment analysis has received constant research attention due to its usefulness and importance in different applications. However, despite the research advances in this field, most current tools suffer in prediction quality due to the inconsistencies in their results, i.e., intra- and inter-tool inconsistencies. This demonstration proposes a system for the evaluation of sentiment analysis quality, namely SA-Q. The system allows the evaluation of inconsistency in sentiment analysis tools, the resolution of the inconsistency using state-of-the-art methods, and the recommendation of a relevant sentiment analysis tool for any type of dataset provided by the attendees. It also allows the attendees to compare the tools. Moreover, we demonstrate that SA-Q evaluates the consistency of tools on two levels (intra-tool and inter-tool). Through various scenarios, we showcase the challenges of inconsistency resolution and demonstrate the usefulness of the proposed system and the recommendations that can be given to the attendees for their datasets. We demonstrate that the SA-Q system has practical utility in many areas of industrial applications for better decision making. This demonstration shows promising research areas for the data management, NLP, and machine learning communities by adopting and drawing inspiration from truth inference methods to create more robust tools and improve the tools' scalability.
Conference Paper
The Chinese language is a character-based language, with no explicit separators between words as in English. Traditionally, word segmentation is conducted to convert Chinese sentences into word sequences, so that the same framework used for English sentiment analysis can be exploited for Chinese. These works use a specified word segmentor as a prerequisite step, yet ignore the fact that different segmentation styles exist in Chinese word segmentation, such as CTB, PKU, and MSR. In this paper, we study the influence of these heterogeneous segmentations on Chinese sentiment analysis, and then integrate these segmentations, based on both discrete and neural models. Experimental results show that different segmentations do affect the final performance, and that the integrated models can achieve better performance.
Conference Paper
Although much progress has been made to date on sentiment classification, the lack of annotated corpora remains a problem. In this paper we propose to expand corpora for Chinese polarity classification via opinion paraphrase generation. To this end, we first exploit three strategies for opinion paraphrase generation, namely sentence re-ordering, opinion element substitution and explicit attribution implying. To improve the quality of the generated opinion paraphrases, we define four criteria for opinion paraphrase evaluation and present a filtering algorithm to discard improper opinion paraphrase candidates. To assess the proposed method, we further apply the expanded corpus to an SVM classifier for polarity classification. The experimental results show that the generated opinion paraphrases are beneficial to polarity classification.