Development and use of a gold standard data set for subjectivity classifications

WIKIBIAS: Detecting Multi-Span Subjective Biases in Language

Conference Paper

Jan 2021

Sentiment Analysis Tool in Website Comments

Chapter

Jan 2020

Don't Let Me Be Misunderstood: Comparing Intentions and Perceptions in Online Discussions

Preprint

Full-text available

Apr 2020

Discourse involves two perspectives: a person's intention in making an utterance and others' perception of that utterance. The misalignment between these perspectives can lead to undesirable outcomes, such as misunderstandings, low productivity and even overt strife. In this work, we present a computational framework for exploring and comparing both perspectives in online public discussions. We combine logged data about public comments on Facebook with a survey of over 16,000 people about their intentions in writing these comments or about their perceptions of comments that others had written. Unlike previous studies of online discussions that have largely relied on third-party labels to quantify properties such as sentiment and subjectivity, our approach also directly captures what the speakers actually intended when writing their comments. In particular, our analysis focuses on judgments of whether a comment is stating a fact or an opinion, since these concepts were shown to be often confused. We show that intentions and perceptions diverge in consequential ways. People are more likely to perceive opinions than to intend them, and linguistic cues that signal how an utterance is intended can differ from those that signal how it will be perceived. Further, this misalignment between intentions and perceptions can be linked to the future health of a conversation: when a comment whose author intended to share a fact is misperceived as sharing an opinion, the subsequent conversation is more likely to derail into uncivil behavior than when the comment is perceived as intended. Altogether, these findings may inform the design of discussion platforms that better promote positive interactions.

Extracting all Aspect-polarity Pairs Jointly in a Text with Relation Extraction Approach

Preprint

Sep 2021

Extracting aspect-polarity pairs from texts is an important task of fine-grained sentiment analysis. While the existing approaches to this task have made much progress, they are limited in capturing relationships among aspect-polarity pairs in a text, which degrades extraction performance. Moreover, the existing state-of-the-art approaches, namely token-based sequence tagging and span-based classification, have their own defects, such as polarity inconsistency resulting from tagging tokens separately in the former and heterogeneous categorization in the latter, where aspect-related and polarity-related labels are mixed. To remedy these defects, and inspired by recent advances in relation extraction, we propose to generate aspect-polarity pairs directly from a text with relation extraction technology, regarding aspect-polarity pairs as unary relations where aspects are entities and the corresponding polarities are relations. Based on this perspective, we present a position- and aspect-aware sequence2sequence model for joint extraction of aspect-polarity pairs. The model is characterized by its ability to capture not only relationships among aspect-polarity pairs in a text through sequence decoding, but also correlations between an aspect and its polarity through position- and aspect-aware attention. Experiments on three benchmark datasets demonstrate that our model outperforms the existing state-of-the-art approaches by a significant margin.

Product Reputation Evaluation based on Multiple Web Sources

Thesis

Full-text available

Mar 2017

Umar Farooq

Extracting unstructured data from the Web and analysing it to obtain useful information that customers and manufacturers can use to make decisions about a product is a challenging task. There are some existing techniques to evaluate products based on the ratings and product reviews posted on the Web. However, all these techniques have inherent issues and limitations and are therefore unable to fulfil the needs and requirements of both customers and manufacturers. For instance, the existing sentiment analysis methods (which classify the opinions in customer reviews about a product as positive or negative) are not able to determine the context of a word in a sentence accurately. In addition, the negation handling methods adopted while determining the sentiment are not able to deal with all types of negation, and they do not consider all exceptions where negations behave differently. Similarly, the existing product reputation models are based on a single source, are not robust to false and biased ratings, are not able to reflect recent opinions, do not allow users to evaluate a product on different criteria, and do not provide good estimation accuracy. Moreover, the existing product reputation systems are centralized, which brings issues such as a single point of failure, evaluation information that is easy to falsify, and unsuitability for solving a complex problem. This thesis proposes methods and techniques for evaluating product reputation based on data available on the Web and for providing valuable information to customers and manufacturers for decision making.
These methods perform the following tasks: 1) extract product evaluation data from multiple Web sources, 2) analyse product reviews to determine whether opinions about product features in customer reviews are positive or negative, 3) compute different product reputation values while considering different evaluation criteria, and 4) provide the results to customers and manufacturers to support their decisions. This thesis contributes to three main research areas: 1) feature-level sentiment analysis, 2) product reputation models and 3) multiagent architecture. First, word sense disambiguation and negation handling methods are proposed to improve the performance of feature-level sentiment analysis. Second, a novel mathematical model is proposed which computes several reputation values in order to evaluate a product based on different criteria. Finally, a multiagent architecture for review analysis and product evaluation is proposed. A huge amount of the product evaluation data on the Web is in textual form (i.e. product reviews). To analyse product reviews and evaluate a product, we propose a feature-level sentiment analysis method which determines the opinions about different features of a product. A word sense disambiguation method is introduced which identifies the sense of words according to the context while determining the polarity. In addition, a negation handling method is proposed which determines the sequence of words affected by different types of negation. The results show that both the word sense disambiguation and the negation handling methods improve the overall accuracy of feature-level sentiment analysis. A multi-source product reputation model is proposed in which informative, robust and strategy-proof aggregation methods are introduced to compute different reputation values.
Sources from which reviews are extracted may not be credible, hence a source credibility measuring method is proposed in order to avoid malicious web sources. In addition, suitable decay principles for product reputation are introduced so that the newest opinions about a product are reflected quickly. The model also considers several parameters, such as reviewer expertise, rating trustworthiness, time span of ratings, and reviewer age, sex and location, in order to evaluate a product in different ways. Different types of ratings (i.e. textual and numeric ratings) are considered when computing reputation values, which increases the choices available to customers and manufacturers when making decisions. The results show that the proposed model is robust, strategy-proof, able to reflect recent opinions, and able to estimate true reputation values even if some ratings are false. A multiagent layered architecture is proposed for product reputation evaluation. The main idea behind this layered architecture is to divide the complex problem of product evaluation, which is usually handled by a single entity in a centralized fashion, into simpler and smaller problems handled by several entities in a distributed fashion. The architecture addresses different aspects of product evaluation, such as taking inputs and displaying results to users, review extraction, feature-level sentiment analysis and computing reputation values. In addition, the architecture addresses issues associated with the centralized approach and offers additional benefits such as autonomy, pro-activeness, openness and social ability.
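The abstract above mentions decay principles that let recent opinions dominate a reputation value but does not give the formula. As a minimal sketch only, assuming an exponential decay with a hypothetical `half_life_days` parameter (not taken from the thesis), a time-decayed average of numeric ratings could look like this:

```python
def decayed_reputation(ratings, now, half_life_days=90.0):
    """Time-decayed mean of ratings.

    ratings: iterable of (score, day) pairs, where day is the day the rating
    was posted. The exponential decay and half_life_days are illustrative
    assumptions, not the thesis's actual decay principle.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for score, day in ratings:
        age = max(0.0, now - day)              # age of the rating in days
        weight = 0.5 ** (age / half_life_days)  # weight halves every half-life
        weighted_sum += weight * score
        weight_total += weight
    if weight_total == 0.0:
        raise ValueError("no ratings to aggregate")
    return weighted_sum / weight_total


# An old 5-star rating and a fresh 1-star rating: the decayed value
# sits below the plain mean of 3.0 because the recent rating counts more.
example = decayed_reputation([(5.0, 0.0), (1.0, 90.0)], now=90.0)
```

A shorter half-life makes the value react faster to new opinions, at the cost of making it easier to move with a small number of recent ratings.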

Linguistic Shields: Safeguarding Against Harmful Content

Preprint

Full-text available

Apr 2024

Mohammad Dehghani

In today's digital world, online platforms play a significant role in facilitating communication and content sharing. Furthermore, with the emergence of large language models (LLMs), there has been a notable increase in both the quantity and variety of content. The exponential rise in user-generated content has led to challenges in maintaining a respectful online environment. In some cases, users have taken advantage of anonymity to post harmful content, which can negatively affect the user experience and pose serious social problems. Recognizing the limitations of manual moderation, automatic detection systems have been developed to tackle this problem. Nevertheless, several obstacles persist, including the lack of a comprehensive framework, the absence of a universal definition of harmful content, and the need for detailed annotation guidelines. The current definitions of harmful content are static and do not adapt to changes over time. Furthermore, the detection methods are outdated and fail to keep pace with advancements in content, platforms, and new technologies such as LLMs. This study aims to address these challenges by introducing, for the first time, a detailed framework adaptable to any content, language and platform. This framework encompasses various aspects of harmful content detection. One of the key components of the framework is our development of a cross-language annotation guideline. Additionally, the integration of sentiment analysis represents an approach to enhancing harmful content detection. Furthermore, a definition of harmful content is proposed, which is formulated through a comprehensive review of various related concepts and emerging needs, allowing for adaptability to dynamic changes. Addressing these challenges and implementing a harmful content detection framework is vital, as it allows for early detection and prevention of harmful content, greatly improving the safety and security of all online users.

Transformer based model for offensive content recognition in dravidian languages

Article

Dec 2023

This paper describes a model for detecting offensive content in comments collected from social media. The comments include expressions and emoticons and are mostly written in code-mixed language, which makes classifying them tricky. The proposed system uses a multi-head attention model to extract features from the code-mixed Tamil input data. Various classification algorithms are applied to these extracted features to categorize offensive comments. The generated labels are optimized by performing majority voting on the labels generated by the different algorithms. The system is validated on the validation set and evaluated on the Tamil CodeMix test data from the dataset published by the HASOC task (Task2-subtask1) at FIRE 2021. The evaluation yields an average weighted F1 score of 0.83, ranking 3rd in the official ranking.
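The majority-voting step described above can be sketched in a few lines. This is an illustration only: the function name, the label strings and the tie-breaking rule (ties go to the label seen first, i.e. the earliest-listed classifier) are assumptions, not details from the paper.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label lists into one label per comment.

    predictions: list of label lists, one list per classifier, all of the
    same length. On a tie, the label encountered first (from the
    earliest-listed classifier) wins, because Counter.most_common keeps
    first-seen order among equal counts.
    """
    if not predictions:
        return []
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all classifiers must label the same comments")
    voted = []
    for labels in zip(*predictions):  # labels for one comment across classifiers
        voted.append(Counter(labels).most_common(1)[0][0])
    return voted


# Hypothetical outputs of three classifiers on three comments.
svm_labels = ["OFF", "NOT", "OFF"]
nb_labels = ["OFF", "OFF", "NOT"]
lr_labels = ["NOT", "NOT", "NOT"]
combined = majority_vote([svm_labels, nb_labels, lr_labels])
```

Using an odd number of classifiers avoids ties in a two-class setting, so the tie-breaking rule only matters when the number of voters is even or there are more than two classes.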

Revealing People’s Sentiment in Natural Italian Language Sentences

Article

Full-text available

Nov 2023

Social network systems are constantly fed with text messages. While this enables rapid communication and global awareness, some messages may be deliberately crafted to hurt or mislead. Automatically identifying meaningful parts of a sentence, such as positive or negative sentiments in a phrase, would give valuable support for automatically flagging hateful messages, propaganda, etc. Many existing approaches that study people’s opinions, attitudes and emotions are based on machine learning; they require an extensive labelled dataset and, in many circumstances, provide results that are not very decisive, due to the complexity of the language structure and the fuzziness inherent in most of the techniques adopted. This paper proposes a deterministic approach that automatically identifies people’s sentiments at the sentence level. The approach is based on text analysis rules that are manually derived from the way Italian grammar works. Such rules are embedded in finite-state automata and then expressed in a way that facilitates checking unstructured Italian text. A few grammar rules suffice to analyse an ample amount of correctly formed text. We have developed a tool that has validated the proposed approach by analysing several hundred sentences gathered from social media, i.e. actual comments written by users. The tool exploits parallel execution so that it can process many thousands of sentences in a fraction of a second. Our approach outperforms a well-known previous approach in terms of precision.

The Impact of Sanctions on the Capitalization of Russian Companies: The Sectoral Aspect

Article

Sep 2023

The purpose of this research is to evaluate the influence of sanctions on the Russian economy, taking into consideration the sectoral aspect (oil and gas, telecommunications and the consumer sector). The research methodology comprises econometric modeling (elastic net and GARCH modeling) and text analysis. In the paper we developed our own sanctions indices based on text analysis. We used the EcSentiThemeLex dictionary to assess the positivity and negativity of the news. The empirical base consists of news publications from the lenta.ru portal for the period from 01.01.2014 to 31.03.2023, taken from the thematic sections “economy” and “science and technology”. The research results are as follows. On the basis of GARCH modeling we found that sanctions have a negative impact on the capitalization of the largest companies in oil and gas, the consumer sector and telecommunications. The news tonality influences companies’ capitalization. We have developed sanctions indices (a minimal index, an expanded index and a maximally expanded index) which make it possible to assess the extent of sanctions pressure. On the basis of the elastic net method we concluded that sentiment variables take priority over the control ones, i.e. information on sanctions and its tonality influences the stock market more than oil prices, the rouble exchange rate and the interbank rate in the short term. The influence of sanctions is not industry specific. However, the study has certain limitations: 1. reliance on publications from a single source; 2. the use of a single dictionary for evaluating news sentiment; 3. the sanctions index does not allow the incorporation of new terms when fresh sanctions are imposed. We intend to address these issues in future research.
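Dictionary-based tonality scoring of the kind mentioned above can be illustrated with a toy example. The lexicon below is invented for illustration and is not EcSentiThemeLex; the scoring rule (mean polarity of matched words) is likewise an assumption rather than the paper's procedure.

```python
import re

# Hypothetical toy lexicon: word -> polarity (+1 positive, -1 negative).
TOY_LEXICON = {
    "growth": 1,
    "profit": 1,
    "recovery": 1,
    "sanctions": -1,
    "loss": -1,
    "decline": -1,
}


def tone_score(text, lexicon=TOY_LEXICON):
    """Return the mean polarity of lexicon words found in text.

    The result lies in [-1, 1]; texts with no lexicon words score 0.0.
    """
    tokens = re.findall(r"\w+", text.lower())  # \w+ also matches Cyrillic words
    hits = [lexicon[token] for token in tokens if token in lexicon]
    if not hits:
        return 0.0
    return sum(hits) / len(hits)


headline_score = tone_score("New sanctions cause loss, but profit holds")
```

A real dictionary for Russian news would also need lemmatization before lookup, since inflected word forms would otherwise miss their lexicon entries.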

Adaptive Prompt Learning-Based Few-Shot Sentiment Analysis

Article

Full-text available

Mar 2023
NEURAL PROCESS LETT

In the field of natural language processing, sentiment analysis via deep learning achieves excellent performance when large labeled datasets are available. However, labeled data are insufficient in many sentiment analysis tasks, and obtaining these data is time-consuming and laborious. Prompt learning is devoted to resolving this data deficiency by reformulating downstream tasks with the help of a prompt. The performance of this method depends on the quality of the prompt. This paper proposes an adaptive prompting (AP) construction strategy that uses a seq2seq-attention structure to acquire the semantic information of the input sequence. Our method of dynamically constructing adaptive prompts can not only improve the quality of the prompt, but can also generalize effectively to other fields by constructing a pre-trained prompt with existing public labeled data. The experimental results on the FewCLUE datasets demonstrate that the proposed AP method can effectively construct appropriate adaptive prompts regardless of the quality of the hand-crafted prompt and outperforms the state-of-the-art baselines.

Sentiment Overflow in the Testing Stack: Analysing Software Testing Posts on Stack Overflow

Preprint

Full-text available

Feb 2023

Software testing is an integral part of modern software engineering practice. Past research has not only underlined its significance, but also revealed its multi-faceted nature. The practice of software testing and its adoption is influenced by many factors that go beyond tools or technology. This paper sets out to investigate the context of software testing from the practitioners' point of view by mining and analyzing sentimental posts on the widely used question and answer website Stack Overflow. By qualitatively analyzing sentimental expressions of practitioners, which we extract from the Stack Overflow dataset using sentiment analysis tools, we discern factors that help us to better understand the lived experience of software engineers with regards to software testing. Grounded in the data that we have analyzed, we argue that sentiments like insecurity, despair and aspiration, have an impact on practitioners' attitude towards testing. We suggest that they are connected to concrete factors like the level of complexity of projects in which software testing is practiced.

TONALITY OF TWEETS ABOUT UKRAINE DURING THE RUSSIAN-UKRAINIAN WAR

Article

Jan 2022

An Insight on Sentiment Analysis Research from Text using Deep Learning Methods

Article

Full-text available

Aug 2019

Nowadays, Deep Learning (DL) is a fast-growing and highly attractive research field in image processing and natural language processing (NLP), and it is being adopted across several sectors such as medicine, agriculture and commerce. This is mainly because of the advantages of DL, such as automatic feature extraction, the capability to process a larger number of parameters and the capacity to produce more accurate results. In this paper, we have examined research works that have used DL-based Sentiment Analysis (SA) on social network data. This paper provides a brief explanation of SA, the necessity of text pre-processing, performance metrics and the roles of DL models in SA. The main focus of this paper is to explore how DL algorithms can enhance the performance of SA compared with traditional machine learning algorithms for text-based analysis. Since DL models are more effective for NLP research, text classification can be applied to complex sentences that contain two opposing emotions about an event. Through this literature appraisal we conclude that the Convolutional Neural Network (CNN) technique yields higher accuracy than the others. The paper also brings to light that there has been no major focus on mixed emotions using DL methods, which widens the scope for future research.

Multi Layered Rule-Based Technique for Explicit Aspect Extraction from Online Reviews

Article

Full-text available

Jan 2022
CMC-COMPUT MATER CON

Detecting Islamic Radicalism Arabic Tweets Using Natural Language Processing

Article

Full-text available

Jan 2022

The image of the tolerant religion of Islam has been distorted by extremists in the last two decades in many ways, such as luring teenagers into terrorist acts. Nowadays, millions of users socialize and share ideas using social media platforms such as Twitter. Typically, the ideas shared on Twitter (tweets) reach and influence many people who could simply retweet them and make them even spread faster. Unfortunately, some of these ideas are posted by extremists who share hateful Arabic content. Thus, it is very important to automate the process of controlling and monitoring hateful Arabic tweets, given that Arabic is the most widely used language in the Islamic world. In this paper, we provide a manually labeled and curated dataset of 3,000 Arabic tweets that contain hateful and non-hateful tweets. To automate the process of detecting hateful tweets, we utilize advanced Machine Learning (ML) techniques and perform sentiment analysis to capture the meaning of the Arabic words in a proper word embedding (Word2Vec). Also, we used the proposed model to classify and analyze 100,000 tweets of the last decade. The outcome of this work promotes future research on analyzing Arabic hateful speech by providing a manually labeled Arabic dataset, and the trained model (achieved 92% accuracy) which can be used as an underlying tool by governments, Internet service providers, and social media applications to detect any inflammatory tweets before they spread to a wider audience.

Polarity and Subjectivity Detection with Multitask Learning and BERT Embedding

Preprint

Full-text available

Jan 2022

Multitask learning often helps improve the performance of related tasks, as these are often inter-dependent and perform better when solved in a joint framework. In this paper, we present a deep multitask learning framework that jointly performs polarity and subjectivity detection. We propose an attention-based multitask model for predicting polarity and subjectivity. The input sentences are transformed into vectors using pre-trained BERT and GloVe embeddings, and the results show that the BERT embedding based model works better than the GloVe based model. We compare our approach with state-of-the-art models in both subjectivity and polarity classification, in single-task and multitask frameworks. The proposed approach reports baseline performances for both polarity detection and subjectivity detection.

Opinionated Text Classification For Hindi Tweets Using Deep Learning

Conference Paper

Full-text available

Apr 2021

Recent years have witnessed significant growth in the data collected from reviews posted on various websites. Reviews are a direct way of getting the response of the customers and clients of any business, making them a convenient way of getting feedback on marketing, performance and other such characteristics of any product or service. The opinions mined from these collections of data can provide strategies to improve sales based on how well a product is received. This is done in two steps, the first being Subjectivity Detection, followed by Sentiment Analysis. Various methods have already been introduced in this field for this process, ranging from SVMs and Naive Bayes to deep learning. Since English is the most commonly used language in the world, it is not surprising that most work done in this field focuses on it. But it is already known that there are roughly 6500 languages used around the world. India alone has 447 languages, which ranks it fourth on the list of countries with the greatest number of languages. The proposed research work focuses on sentiment classification in Hindi language text. It experiments with a method that does not rely on the availability of language dictionaries. This is done by creating completely numerical data corresponding to the text. The model proposed in this paper uses a combination of a Recurrent Neural Network and a Convolutional Neural Network to extract the subjective data from the given dataset of movie reviews.

Insight to Emotional tones in WhatsApp Through Sentiment Analysis

Article

Feb 2019

Ritinder Kaur

Every day the social media networks (SMN) are generating massive amounts of data that may be structured, unstructured or semi-structured. The data may range from normal text to a graphic image, audio or video. Analysing this varied and ever-growing data is a big challenge. This paper focusses on extracting and analyzing data from the widely used online text message application WhatsApp through the process of Sentiment Analysis. Sentiment Analysis, also known as opinion mining, is the contextual mining of data to identify, extract and analyse the underlying sentiment in messages and classify them as positive, negative or neutral. The R language has been used in this paper to understand the different emotions in a WhatsApp chat.

SubjectivITA: An Italian Corpus for Subjectivity Detection in Newspapers

Conference Paper

Sep 2021

We present SubjectivITA: the first Italian corpus for subjectivity detection on news articles, with annotations at sentence and document level. Our corpus consists of 103 articles extracted from online newspapers, amounting to 1,841 sentences. We also define baselines for sentence- and document-level subjectivity detection using transformer-based and statistical classifiers. Our results suggest that sentence-level subjectivity annotations may often be sufficient to classify the whole document.

A Deep Learning Approach for Sentiment & Trend Analysis of GST Tweets in India using Topic-Sentiment Modeling

Thesis

Full-text available

May 2018

Sourav Das

With my work, I intend to showcase a lexical-level sentiment analysis that matches the found tokens against previous state-of-the-art datasets, assigns sentiment ratings to the tweets containing these words, and develops a word occurrence probability cluster for this type of event from these features. I then present an LSTM model for predicting sentiment, trained and tested on the sentiment-rated tweets, which achieved an accuracy of 84.51%; I analyze this result further through epoch-level error analysis. I also perform a trend analysis of tweet counts and of people’s changing opinions about this new topic, as reflected in tweets over time.

Deep learning and multilingual sentiment analysis on social media data: An overview

Article

Apr 2021
APPL SOFT COMPUT

Twenty-four studies on twenty-three distinct languages and eleven social media illustrate the steady interest in deep learning approaches for multilingual sentiment analysis of social media. We improve over previous reviews with wider coverage, from 2017 to 2020, as well as a study focused on the underlying ideas and commonalities behind the different solutions for achieving multilingual sentiment analysis. Interesting findings of our research are (i) the shift of research interest to cross-lingual and code-switching approaches, (ii) the apparent stagnation of the less complex architectures derived from a backbone featuring an embedding layer, a feature extractor based on a single CNN or LSTM and a classifier, (iii) the lack of approaches tackling multilingual aspect-based sentiment analysis through deep learning, and, surprisingly, (iv) the lack of more complex architectures such as transformer-based ones, even though results suggest that the more difficult tasks require more elaborate architectures. Full text: https://authors.elsevier.com/a/1cv0e5aecSjupP

A Public Opinion Analysis System Based on Emotion Analysis in Linyi

Article

Full-text available

Jan 2021

This article, the official news site of Linyi five services in Shanghai, Yiwu GanZhou Shenzhen national logistics hub of news as the data source, through Word2Vec and construction LSTM emotion classification model, with positive or negative emotion in general categories, calculating to analyze its emotional value, from the emotional category and time series analysis and word frequency vector to Linyi public opinion analysis of logistics hub

Exploring list of markers in unstructured text automatic processing

Conference Paper

Full-text available

Nov 2015

In this study we propose to identify data sources that can be used to determine a "controlled vocabulary" (lexicon of markers), in order to develop an engine for extracting articles referring to a given topic. Methods for automatically enriching the created lexicon of markers are presented.

Machine Learning based Analysis on Human Aggressiveness and Reactions towards Uncertain Decisions

Article

Full-text available

Oct 2020

Tweet data can be processed into useful information. Social media sites like Twitter, Facebook and Google+ are rapidly growing in popularity. These sites provide a platform for people to share and express their views about daily life, discuss particular topics, hold discussions with different communities, or connect with the world by posting messages. Tweets posted on Twitter express opinions. These opinions can be used for different purposes, such as gauging public views on uncertain decisions like the Muslim ban in America, the war in Syria, American soldiers in Afghanistan, etc. These decisions have a direct impact on users' lives, with violations and aggressiveness as common results. For this purpose, we will collect opinions from Twitter on some popular decisions taken in the past decade. We will divide the sentiments into two classes, anger (hatred) and positive. We will propose a hypothesis model for such data which will be used in the future. We will use Support Vector Machine (SVM), Naive Bayes (NB), and Logistic Regression (LR) classifiers for the text classification task. Furthermore, we will compare the SVM results with those of NB and LR. This research will help us to predict people's early behaviors and reactions before such decisions have major consequences.

Evaluating Richer Features and Varied Machine Learning Models for Subjectivity Classification of Book Review Sentences in Portuguese

Article

Full-text available

Sep 2020

Texts published on social media have been a valuable source of information for companies and users, as the analysis of this data helps in improving and selecting products and services of interest. Due to the huge amount of data, techniques for automatically analyzing user opinions are necessary. The research field that investigates these techniques is called sentiment analysis. This paper focuses specifically on the task of subjectivity classification, which aims to predict whether a text passage conveys an opinion. We report the study and comparison of machine learning methods of different paradigms to perform subjectivity classification of book review sentences in Portuguese, which have been shown to be a challenging domain in the area. Specifically, we explore richer features for the task, using several lexical, centrality-based and discourse features. We show the contributions of the different feature sets and provide evidence that the combination of lexical, centrality-based and discourse features produces better results than any of the feature sets individually. Additionally, by analyzing the achieved results and the knowledge acquired by some symbolic machine learning methods, we show that some discourse relations may clearly signal subjectivity. Our corpus annotation also reveals some distinctive discourse structuring patterns for sentence subjectivity.

Understanding the Effects of Real-time Sentiment Analysis and Morale Visualisation in Backchannel Systems: A Case Study

Article

Aug 2020

When presenting to a large group of students, either in an amphitheatre or through an online platform, effectively connecting to the audience – understanding how well the audience is following the presentation and taking appropriate actions promptly if they experience difficulties – is a serious challenge. Backchannel systems are sometimes deployed to address this issue by allowing the audience to give feedback to the presenter without interrupting the current discourse. However, these systems are not designed to immediately aggregate and present the audience's feedback to the presenter in a meaningful way that is easy to quickly digest. To fill this gap, we have explored a proof-of-concept method for analysing the emotions and sentiments from the audience's feedback in real time and displaying to the presenter a morale graph showing a trend of the audience's overall reaction over time. This allows a presenter to effectively connect to their audience in real time, knowing whether their presentation is going well and what issues their audience may have in common at any specific moment. We have further implemented this method in an educational context, using a prototype backchannel system, known as ClasSense, for a lecturer to effectively connect to their students. This paper presents the evaluation of this system, which shows that lecturers accept and prefer the morale graph based user interface developed over other backchannel user interfaces that display all posts in chronological order. Students also positively expressed their agreement that the system not only makes their feedback an important part of the class but also increases their interactions with the lecturers. This is further confirmed with a Markov chain predicting the probability that students’ and lecturers’ survey results lead to their overall positive sentiment towards the tool. The flexibility of the ClasSense system suggests it may also be suitable in contexts other than education.

SubjQA: A Dataset for Subjectivity and Review Comprehension

Preprint

Full-text available

Apr 2020

Subjectivity is the expression of internal opinions or beliefs which cannot be objectively observed or verified, and has been shown to be important for sentiment analysis and word-sense disambiguation. Furthermore, subjectivity is an important aspect of user-generated data. In spite of this, subjectivity has not been investigated in contexts where such data is widespread, such as in question answering (QA). We therefore investigate the relationship between subjectivity and QA, while developing a new dataset. We compare and contrast with analyses from previous work, and verify that findings regarding subjectivity still hold when using recently developed NLP architectures. We find that subjectivity is also an important feature in the case of QA, albeit with more intricate interactions between subjectivity and QA performance. For instance, a subjective question may or may not be associated with a subjective answer. We release an English QA dataset (SubjQA) based on customer reviews, containing subjectivity annotations for questions and answer spans across 6 distinct domains.

Emotion Mining in Social Media Data

Article

Full-text available

Jan 2019

Emotions are known to influence the perception of human beings along with their memory, thinking and imagination. Human perception is important in today’s world in a wide range of areas including but not limited to business, education, art, and music. Microblogging and social networking sites like Twitter and Facebook are challenging sources of information that allow people to share their feelings and thoughts on a daily basis. In this paper we propose an approach to automatically detect emotions in Twitter messages that explores characteristics of the tweets and the writer’s emotion using a Support Vector Machine LibLinear model and achieves 98% accuracy. Emotion mining has gained attention in the field of computer science due to the vast variety of systems that can be developed and promising applications such as remote health care systems, customer care services, smart phones that react based on the user’s emotion, and vehicles that sense the emotion of the driver. These emotions help in understanding the current state of the user. In order to perform suitable actions or provide suggestions on how users can enhance their feelings for a healthier life-style, we use actionable recommendations. In this work we extract action rules with respect to the user emotions that help provide suggestions for users.

A Survey of Computational Approaches and Challenges in Multimodal Sentiment Analysis

Article

Full-text available

Jan 2019
IJCSE

Most of the recent work in sentiment analysis is carried out on textual data. Text-based sentiment analysis mainly relies on the construction of word dictionaries, using machine learning techniques that learn and extract opinions from large text corpora. Text-based sentiment analysis has numerous applications, such as analyzing customer satisfaction with a brand or product and gauging voting intentions. With the rapid growth of social media, users post huge volumes of data in various modalities such as text, image, audio, and video. These multimodal data streams bring new opportunities for going beyond text-based sentiment analysis and improving possible results. Since sentiment can be extracted from facial and vocal expressions, prosody and body posture, multimodal sentiment analysis offers new avenues in sentiment analysis. In multimodal sentiment analysis, the sentiment is extracted from transcribed content, visual and vocal features. This survey defines sentiment and sentiment analysis, states problems and challenges in multimodal sentiment analysis, and finally reviews some of the recent computational approaches used in multimodal sentiment analysis.

Machine Learning Models for Analysis of User Credibility Index in the E-Marketplaces

Conference Paper

May 2024

Efficient Utilization of Pre-trained Models: A Review of Sentiment Analysis via Prompt Learning

Article

Nov 2023
KNOWL-BASED SYST

Sentiment overflow in the testing stack: Analysing software testing posts on Stack Overflow

Article

Jul 2023
J SYST SOFTWARE

A Survey on Sentiment Analysis in Health Care: New Opportunities and Challenges

Chapter

Jul 2023

Twitter is increasingly being used as a venue for medical research because of the large number of unstructured and free-text tweets sent there on healthcare-related topics. In natural language processing, sentiment analysis is one sort of data mining that may be used to assess the direction of a person's personality. Computational linguistics is applied to the analysis of text to infer and assess conceptual understanding of the internet, social media, and related topics. Healthcare information is also widely available online in the form of personal blogs, social media, and websites that rate medical conditions, but this data has not been compiled in a systematic fashion. A few of the numerous advantages of sentiment analysis include better healthcare outcomes and more efficient medical practice. In this paper, we explore possible new opportunities for researchers who want to work in the domain of sentiment analysis in the medical field. We explore many recent and existing papers and identify the strengths and research gaps of these papers in terms of methodologies, datasets used, and different machine learning and deep learning models. These tabular forms give new direction for research in this domain.
Keywords: Sentiment analysis; Lexicon-based sentiment classification deep learning; Clinical text mining; Health status analysis computational linguistics

Literature Perlustration of Opinion Analysis on E-commerce

Article

Full-text available

Jun 2023

Recomendação de reviews personalizada para donos de estabelecimentos

Conference Paper

Oct 2016

Online review applications generally recommend the most useful reviews to the users who read them. Here, we introduce a new problem: assessing the usefulness of a review for the owner of an establishment. Specifically, we propose using the aspects and sentiments of reviews to generate a ranking ordered from the reviews most useful for managing and developing the establishment.

Verbal sentiment analysis and detection using recurrent neural network

Chapter

Jan 2022

The illusion of data validity: Why numbers about people are likely wrong

Article

Sep 2022

This reflection article addresses a difficulty faced by scholars and practitioners working with numbers about people, which is that those who study people want numerical data about these people. Unfortunately, time and time again, this numerical data about people is wrong. Addressing the potential causes of this wrongness, we present examples of analyzing people numbers, i.e., numbers derived from digital data by or about people, and discuss the comforting illusion of data validity. We first lay a foundation by highlighting potential inaccuracies in collecting people data, such as selection bias. Then, we discuss inaccuracies in analyzing people data, such as the flaw of averages, followed by a discussion of errors that are made when trying to make sense of people data through techniques such as posterior labeling. Finally, we discuss a root cause of people data often being wrong – the conceptual conundrum of thinking the numbers are counts when they are actually measures. Practical solutions to address this illusion of data validity are proposed. The implications for theories derived from people data are also highlighted, namely that these people theories are generally wrong as they are often derived from people numbers that are wrong.

ECONOMIC INTEGRATION AND SOUTH ASIA: EXPLORING SPILLOVER EFFECTS FOR NORTH-EAST INDIA

Article

Full-text available

Oct 2018

Jayanti Bhattacharjee

The present research paper attempts to estimate the influence of physical capital investment, education expenditure and trade on the economic growth of the four major countries in South Asia, namely, India, Pakistan, Bangladesh and Sri Lanka. In addition, the study estimates spillover benefits of the institutional measures of voice and accountability, political stability and absence of violence and terrorism in the neighbouring countries on economic growth of the home country. The paper diagnoses the intra-regional trade in South Asia and whether Northeast India can catalyse the economic integration in the region. The study also throws light on the spillover benefits from regional integration and hindrances in realisation of the spillover benefits for the North-Eastern states from the Act-East Policy of the Government of India. We run a panel regression for the period 1996-2016 to estimate the influence of various conventional factors and spillover effects of institutional measures of voice and accountability and political stability and absence of violence and terrorism on economic growth of the four major economies of South Asia. Annual data on various explanatory variables have been collected from Penn World Tables, World Bank, and World Bank Governance Indicators for the four South Asian countries, namely, India, Pakistan, Bangladesh and Sri Lanka, for the period 1996 to 2016. Significant positive effects of physical capital investment, trade, regional institutions of voice and accountability and political stability are observed. Surprisingly, it is observed that international trade measured by trade-GDP ratio has positive and significant influence on economic growth of the countries in South Asia, but intraregional trade within South Asia remains meagre. Policy makers should make the most of the geographical location of the Northeast states in escalating the economic growth of South Asian nations. This also provides the Northeast region a generous opportunity to reap the benefits of the Act East Policy of India. Keywords: International Economics, Institutions and Macroeconomy, Panel Data Models, Estimation

Polarity and Subjectivity Detection with Multitask Learning and BERT Embedding

Article

Full-text available

Jun 2022

In recent years, deep learning-based sentiment analysis has received attention mainly because of the rise of social media and e-commerce. In this paper, we showcase the fact that the polarity detection and subjectivity detection subtasks of sentiment analysis are inter-related. To this end, we propose a knowledge-sharing-based multitask learning framework. To ensure high-quality knowledge sharing between the tasks, we use the Neural Tensor Network, which consists of a bilinear tensor layer that links the two entity vectors. We show that BERT-based embedding with our MTL framework outperforms the baselines and achieves a new state-of-the-art status in multitask learning. Our framework shows that the information across datasets for related tasks can be helpful for understanding task-specific features.

A Survey on Feature Selection and Classification Techniques for EEG Signal Processing

Chapter

Full-text available

Jan 2022

The electroencephalogram (EEG) is a test that is used to keep track of brain activity. These signals are generally used in clinical areas to identify various brain activities that happen during specific tasks and to design brain–machine interfaces (BMI) to help in prosthesis, orthosis, exoskeletons, etc. One of the tedious tasks in designing a brain–machine interface application is processing the EEG signals acquired from a real-time environment. The complexity arises due to the fact that the signals are noisy, non-stationary, and high-dimensional in nature. So, building a robust BMI depends on the efficient processing of these signals. Optimal selection of features from the signals and of the classifiers used plays a vital role in building efficient devices. This paper concentrates on surveying the recent feature selection, feature extraction, and classification algorithms used in various applications for the development of BMI.
Keywords: EEG; Prosthesis; Orthosis; Exoskeletons

Innovations and Sustainable Growth in Business Management Opportunities and Challenges. Edited Book

Book

Full-text available

May 2019

“Learning gives creativity, creativity leads to thinking, thinking provides knowledge, and knowledge makes you great.” — Abdul Kalam Azad. “The capacity to learn is a Gift, The ability to learn is a skill, The willingness to learn is a choice.” — Brian Herbert. “Anyone who stops learning is old whether at twenty or eighty. Anyone who keeps learning stays young.” —Henry Ford. “It is the customer who pays the wages and the more you engage with customers the clearer things become and the easier it is to determine what you should be doing. ” — John Russell, President, Harley Davidson. This is an Edited book from the chapters of researchers presented in a seminar.

Beyond the Polarities: Sentiment Analysis of French Restaurant Reviews Using BERT-based Models

Conference Paper

Oct 2021

Sentiment Analysis and Opinion Mining

Chapter

Jan 2021

With the rapid development and popularity of the World Wide Web, the Internet has entered the Web 2.0 and social network era. In Web 1.0, the Internet was characterized by static web pages. Web 2.0 refers to a World Wide Web that highlights user-generated content. Social networks are represented by a number of online tools and platforms, such as Twitter, Facebook, Weibo, and WeChat, where people share their perspectives, opinions, thoughts, and experiences. These online platforms contain innumerable subjective texts regarding different topics and events that fully reflect the individual opinions, sentiments, attitudes, and emotions of all of society.

Sentiment analysis: Automatically Detecting Valence, Emotions, and Other Affectual States from Text

Chapter

Full-text available

Jan 2021

Saif M. Mohammad

Recent advances in machine learning have led to computer systems that are humanlike in behavior. Sentiment analysis, the automatic determination of emotions in text, is allowing us to capitalize on substantial previously unattainable opportunities in commerce, public health, government policy, social sciences, and art. Further, analysis of emotions in text, from news to social media posts, is improving our understanding of not just how people convey emotions through language but also how emotions shape our behavior. This article presents a sweeping overview of sentiment analysis research that includes: the origins of the field, the rich landscape of tasks, challenges, a survey of the methods and resources used, and applications. We also discuss how, without careful fore-thought, sentiment analysis has the potential for harmful outcomes. We outline the latest lines of research in pursuit of fairness in sentiment analysis.

A Literature Review on Sentiment Analysis

Article

May 2020

This paper is an attempt to consolidate reviews and perform a literature survey on sentiment analysis and opinion mining. In this paper we try to analyze people's sentiments, opinions, and emotions from their text, from which we can try to understand what mood or emotion the person was in while writing the text message. The sentiment in which a person writes a text can be classified into many types of mood, such as happy, sad, neutral, or angry. There are also times when the user is sad and angry at the same time, which the analysis needs to identify.

Sentiment Analysis Techniques: A Review

Article

Oct 2020

Sentiments are the attitude, opinions, thoughts, beliefs or feelings of the writer towards something, such as people, artifacts, company or location. Sentiment analysis intends to conclude the judgment of a presenter or an author apropos to some subject matter or on the whole relative polarity of the manuscript. The outlook could be the perception or assessment, emotional condition, or the projected poignant message of the person behind

Reducing Information Overload with Aspect Based Analysis of Online Product Reviews

Conference Paper

Full-text available

Nov 2017

Consumers use numerical review ratings, and unstructured review text on various latent aspects of the products/services to make online purchase decisions. The sheer volume of content creates an information overload issue for consumers in extracting relevant information. With the advent of advanced analytical tools and cheaper computing resources, it is now possible to extract rich information from unstructured review text. Using latest data mining and analytics tools and techniques, this work proposes a novel approach to derive and extract objective latent aspect ratings and ways to integrate the proposed model with extant review systems to address the information overload issue.

Transformer based Deep Intelligent Contextual Embedding for Twitter sentiment analysis

Article

Jul 2020
FUTURE GENER COMP SY

Along with the emergence of the Internet, the rapid development of handheld devices has democratized content creation due to the extensive use of social media and has resulted in an explosion of short informal texts. Although a sentiment analysis of these texts is valuable for many reasons, this task is often perceived as a challenge given that these texts are often short, informal, noisy, and rich in language ambiguities, such as polysemy. Moreover, most of the existing sentiment analysis methods are based on clean data. In this paper, we present DICET, a transformer-based method for sentiment analysis that encodes representation from a transformer and applies deep intelligent contextual embedding to enhance the quality of tweets by removing noise while taking word sentiments, polysemy, syntax, and semantic knowledge into account. We also use the bidirectional long- and short-term memory network to determine the sentiment of a tweet. To validate the performance of the proposed framework, we perform extensive experiments on three benchmark datasets, and results show that DICET considerably outperforms the state of the art in sentiment classification.

Deep Feature Weighting Based on Genetic Algorithm and Naïve Bayes for Twitter Sentiment Analysis

Conference Paper

Sep 2019
