Research

A Hybrid Approach for Big Data Analysis of Cricket Fan Sentiments in Twitter

Abstract

Twitter has become one of the most widely used social networks, and its popularity keeps increasing as the number of tweets grows exponentially, now on the order of millions each day. Twitter data is widely used for personal, academic, and business purposes. In this paper, we collected real-time tweets from Indian cricket team fans during the eight matches that India played in the ICC Cricket World Cup (CWC) 2015. We performed sentiment analysis on the tweets to gauge the emotions of Indian cricket lovers. The analysis rests on the observation that fans' emotions change with each event: when the home country is batting, fans are happy when runs are scored and sad at every loss of a wicket; when the team is bowling, they are sad when a six is hit and happy when a wicket is taken. When fans are happy they react with positive tweets, and when they are sad they react with negative tweets. We found that, when India is batting, fear, anger, nervousness, and tension are the most frequently used negative words, and words like awesome, happy, and love are the most used positive words. All emotions depend entirely on team India's performance. Negative emotions increase when the opposing team scores runs or reaches a high score, and decrease when the Indian team scores runs or takes wickets. This paper uses tweets to capture emotions for big data analytics and to analyze the emotional quotient.
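The word-level analysis described in the abstract can be illustrated with a minimal lexicon-counting sketch. The abstract does not specify the paper's actual pipeline or lexicon, so the `LEXICON` table, the function names, and the tie-breaking rule below are all assumptions made for illustration; only the example words are taken from the abstract.

```python
import re
from collections import Counter

# Hypothetical mini-lexicon: the words are quoted from the abstract, but this
# polarity table is an illustrative assumption, not the paper's lexicon.
LEXICON = {
    "fear": "negative",
    "anger": "negative",
    "nervousness": "negative",
    "tension": "negative",
    "awesome": "positive",
    "happy": "positive",
    "love": "positive",
}


def score_tweet(text):
    """Count how many positive and negative lexicon words a tweet contains."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(LEXICON[t] for t in tokens if t in LEXICON)


def label_tweet(text):
    """Label a tweet by majority polarity; ties and no matches are neutral."""
    counts = score_tweet(text)
    if counts["positive"] > counts["negative"]:
        return "positive"
    if counts["negative"] > counts["positive"]:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for tweet in ("Awesome batting, love it", "So much tension and fear"):
        print(tweet, "->", label_tweet(tweet))
```

A real system would likely need a much larger lexicon and handling for negation, slang, and hashtags, none of which this sketch attempts.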

Conference Paper
As people increasingly use emoticons in text in order to express, stress, or disambiguate their sentiment, it is crucial for automated sentiment analysis tools to correctly account for such graphical cues for sentiment. We analyze how emoticons typically convey sentiment and demonstrate how we can exploit this by using a novel, manually created emoticon sentiment lexicon in order to improve a state-of-the-art lexicon-based sentiment classification method. We evaluate our approach on 2,080 Dutch tweets and forum messages, which all contain emoticons and have been manually annotated for sentiment. On this corpus, paragraph-level accounting for sentiment implied by emoticons significantly improves sentiment classification accuracy. This indicates that whenever emoticons are used, their associated sentiment dominates the sentiment conveyed by textual cues and forms a good proxy for intended sentiment.
Conference Paper
Opinion mining and sentiment analysis is a fast-growing topic with various real-world applications, from polls to advertisement placement. Traditionally, individuals gathered feedback from their friends or relatives before purchasing an item, but today the trend is to identify the opinions of a variety of individuals around the globe using microblogging data. This paper discusses an approach in which a public stream of tweets from the Twitter microblogging site is preprocessed and classified by emotional content as positive, negative, or irrelevant, and analyses the performance of various classification algorithms based on their precision and recall in such cases. Further, the paper exemplifies the applications of this research and its limitations.
Article
This paper presents a simple unsupervised learning algorithm for classifying reviews as recommended (thumbs up) or not recommended (thumbs down). The classification of a review is predicted by the average semantic orientation of the phrases in the review that contain adjectives or adverbs. A phrase has a positive semantic orientation when it has good associations (e.g., "subtle nuances") and a negative semantic orientation when it has bad associations (e.g., "very cavalier"). In this paper, the semantic orientation of a phrase is calculated as the mutual information between the given phrase and the word "excellent" minus the mutual information between the given phrase and the word "poor". A review is classified as recommended if the average semantic orientation of its phrases is positive. The algorithm achieves an average accuracy of 74% when evaluated on 410 reviews from Epinions, sampled from four different domains (reviews of automobiles, banks, movies, and travel destinations). The accuracy ranges from 84% for automobile reviews to 66% for movie reviews.
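The semantic-orientation rule in the abstract above can be sketched as follows, reading "mutual information" as pointwise mutual information, PMI(a, b) = log2(p(a, b) / (p(a) p(b))). In the difference PMI(phrase, "excellent") - PMI(phrase, "poor"), the phrase probability cancels, leaving a ratio of co-occurrence counts. The count arguments, the `smoothing` parameter, and the function names are assumptions for illustration; the abstract does not say how the counts are obtained.

```python
import math
from statistics import mean


def semantic_orientation(near_excellent, near_poor,
                         hits_excellent, hits_poor, smoothing=0.01):
    """PMI(phrase, "excellent") - PMI(phrase, "poor") from raw counts.

    near_excellent / near_poor: co-occurrence counts of the phrase with
    each reference word. hits_excellent / hits_poor: total counts of each
    reference word. smoothing is an assumed constant that avoids division
    by zero and log of zero when a co-occurrence count is 0.
    """
    numerator = (near_excellent + smoothing) * hits_poor
    denominator = (near_poor + smoothing) * hits_excellent
    return math.log2(numerator / denominator)


def classify_review(phrase_orientations):
    """Recommend a review if its average phrase orientation is positive.

    Expects a non-empty list; statistics.mean raises on an empty one.
    """
    if mean(phrase_orientations) > 0:
        return "recommended"
    return "not recommended"
```

For example, a phrase seen 8 times near "excellent" and 2 times near "poor", with both reference words equally frequent and no smoothing, has orientation log2(4) = 2.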
Conference Paper
In this paper, we present TwiSent, a sentiment analysis system for Twitter. Based on the topic searched, TwiSent collects tweets pertaining to it and categorizes them into the polarity classes positive, negative, and objective. However, analyzing micro-blog posts poses many inherent challenges compared to other text genres. Through TwiSent, we address the problems of 1) spam pertaining to sentiment analysis in Twitter, 2) structural anomalies in the text in the form of incorrect spellings, nonstandard abbreviations, slang, etc., 3) entity specificity in the context of the topic searched, and 4) pragmatics embedded in text. The system performance is evaluated on manually annotated gold standard data and on an automatically annotated tweet set based on hashtags. It is common practice to show the efficacy of a supervised system on an automatically annotated dataset. However, we show that such a system achieves lower classification accuracy when tested on a generic Twitter dataset. We also show that our system performs much better than an existing system.
Conference Paper
Social networks are a popular place for people to express their opinions about products and services. One question is whether two similar products (e.g., two different brands of mobile phones) can be made comparable to each other. In this paper, we present our system, OpinionAnalyzer, a novel social network analyser designed to collect opinions from Twitter micro-blogs about two given similar products for an effective comparison between them. The system outcome is a structure of features for the given products that people have expressed opinions about. Sentiment analysis is then performed on those features. Our system can be used to understand users' preference for a certain product and to show the reasons why users prefer this product. The experiments are evaluated based on accuracy, precision/recall, and F-score. Our experimental results show that the system is effective and efficient.
Conference Paper
Opinions are fundamental to almost all decision-making activities. The increased usage of the internet and the exchange of user opinions through social media and public forums on the web have become the motivation for sentiment analysis. Due to the vast amount of user opinions available throughout the web, it is necessary to automatically analyze and classify the sentiment expressed in opinions. The basic task of sentiment analysis or opinion mining is sentiment classification, which classifies content as positive, negative, or irrelevant. This paper discusses an approach in which an exposed stream of tweets from the Twitter micro-blogging site is preprocessed and classified based on sentiment. The sentiment classification system accounts for the concept of opinion subjectivity. In this paper, we present an opinion detection and organization subsystem, which has already been integrated into our larger question-answering system. The subjectivity classification system uses a Genetic-Based Machine Learning (GBML) technique that treats subjectivity as a semantic problem. The classification of a review is predicted through the average semantic orientation of the phrases in the review that contain adjectives or adverbs. Experimental results indicate that the proposed techniques are efficient and evaluate well.
Conference Paper
In Chap. 9, we studied the extraction of structured data from Web pages. The Web also contains a huge amount of information in unstructured texts. Analyzing these texts is of great importance as well and perhaps even more important than extracting structured data because of the sheer volume of valuable information of almost any imaginable type contained in text. In this chapter, we only focus on mining opinions which indicate positive or negative sentiments. The task is technically challenging and practically very useful. For example, businesses always want to find public or consumer opinions on their products and services. Potential customers also want to know the opinions of existing users before they use a service or purchase a product.
Article
We introduce a novel approach for automatically classifying the sentiment of Twitter messages. These messages are classified as either positive or negative with respect to a query term. This is useful for consumers who want to research the sentiment of products before purchase, or companies that want to monitor the public sentiment of their brands. There is no previous research on classifying sentiment of messages on microblogging services like Twitter. We present the results of machine learning algorithms for classifying the sentiment of Twitter messages using distant supervision. Our training data consists of Twitter messages with emoticons, which are used as noisy labels. This type of training data is abundantly available and can be obtained through automated means. We show that machine learning algorithms (Naive Bayes, Maximum Entropy, and SVM) have accuracy above 80% when trained with emoticon data. This paper also describes the preprocessing steps needed in order to achieve high accuracy. The main contribution of this paper is the idea of using tweets with emoticons for distant supervised learning.
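The idea of using emoticons as noisy labels, as described in the abstract above, can be sketched as a small labeling step that runs before any classifier is trained. The emoticon lists, the rule for discarding mixed tweets, and the removal of emoticons from the training text are assumptions for illustration; the abstract does not list the exact emoticons or preprocessing steps used.

```python
# Assumed emoticon sets; the abstract does not enumerate the ones used.
POSITIVE_EMOTICONS = (":)", ":-)", ":D", "=)")
NEGATIVE_EMOTICONS = (":(", ":-(")


def noisy_label(tweet):
    """Return 'positive', 'negative', or None when there is no clear signal."""
    has_pos = any(e in tweet for e in POSITIVE_EMOTICONS)
    has_neg = any(e in tweet for e in NEGATIVE_EMOTICONS)
    if has_pos == has_neg:
        return None  # no emoticon at all, or both polarities present
    return "positive" if has_pos else "negative"


def strip_emoticons(tweet):
    """Remove label emoticons so a classifier cannot simply memorize them."""
    for emoticon in POSITIVE_EMOTICONS + NEGATIVE_EMOTICONS:
        tweet = tweet.replace(emoticon, "")
    return " ".join(tweet.split())


def build_training_set(tweets):
    """Turn raw tweets into (text, noisy_label) pairs, skipping unclear ones."""
    examples = []
    for tweet in tweets:
        label = noisy_label(tweet)
        if label is not None:
            examples.append((strip_emoticons(tweet), label))
    return examples
```

The resulting pairs could then be fed to any of the classifiers the abstract names (Naive Bayes, Maximum Entropy, or SVM); that training step is not shown here.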
Conference Paper
In this paper, we propose an approach to automatically detect sentiments on Twitter messages (tweets) that explores some characteristics of how tweets are written and meta-information of the words that compose these messages. Moreover, we leverage sources of noisy labels as our training data. These noisy labels were provided by a few sentiment detection websites over Twitter data. In our experiments, we show that since our features are able to capture a more abstract representation of tweets, our solution is more effective than previous ones and also more robust regarding biased and noisy data, which is the kind of data provided by these sources.
  • Saif M Mohammad
  • Peter D Turney
Mohammad, Saif M., and Peter D. Turney. NRC Emotion Lexicon. NRC Technical Report, 2013.