Figure 4 - uploaded by Ali Shariq Imran
An example RDF graph representation

Source publication
Article
Full-text available
This paper provides a comprehensive performance analysis of parametric and non-parametric machine learning classifiers including a deep feed-forward multi-layer perceptron (MLP) network on two variants of improved Concept Vector Space (iCVS) model. In the first variant, a weighting scheme enhanced with the notion of concept importance is used to as...

Context in source publication

Context 1
... difference is that a relation in an ontology graph is defined as a vertex in the RDF graph. For example, the relation isReceived in the ontology graph shown in Figure 3 is represented as a vertex in the RDF graph, as shown in Figure 4. In other words, a relation in the RDF graph is a link between a subject, denoted by the rdfs:domain property, and an object, denoted by the rdfs:range property, as given in Definition 3. The next step is to compute the importance of the vertices of the graph using an adaptation of Markov-based algorithms. ...
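The idea that a relation becomes a vertex of its own can be sketched with plain subject-predicate-object triples. This is a minimal illustration, not the representation used in the source publication: the `ex:` prefix and the classes `ex:Grant` and `ex:Researcher` are hypothetical, since the excerpt names only the relation isReceived.

```python
# Each RDF statement is a (subject, predicate, object) triple.
# ASSUMPTION: "ex:Grant" and "ex:Researcher" are made-up classes for illustration;
# only the relation name isReceived appears in the excerpt.
triples = [
    ("ex:isReceived", "rdf:type", "owl:ObjectProperty"),
    ("ex:isReceived", "rdfs:domain", "ex:Grant"),
    ("ex:isReceived", "rdfs:range", "ex:Researcher"),
]


def vertices(triples):
    """Return every subject and object; each one is a vertex of the RDF graph."""
    return {s for s, _, _ in triples} | {o for _, _, o in triples}


def relation_endpoints(triples, relation):
    """Return the classes linked to a relation by rdfs:domain and rdfs:range."""
    domain = [o for s, p, o in triples if s == relation and p == "rdfs:domain"]
    range_ = [o for s, p, o in triples if s == relation and p == "rdfs:range"]
    return domain, range_


graph_vertices = vertices(triples)
domain, range_ = relation_endpoints(triples, "ex:isReceived")
```

Here the relation itself appears in `graph_vertices`, alongside the two classes it connects, which is the difference from an ontology graph where the relation would be an edge.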

Similar publications

Preprint
Full-text available
Background: Accumulating evidence has linked environmental exposures, such as ambient air pollution and meteorological factors, to the development and severity of cardiovascular diseases (CVDs), resulting in increased healthcare demand. Effective prediction of situations of demand for healthcare services particularly those associated with peak events...

Citations

... The conventional machine learning models employed in this study for sentiment and emotion classification include Naive Bayes (NB), Logistic Regression (LR), Support Vector Machine (SVM), Decision Tree (DT), and AdaBoost, as they are known for their good performance [15] and efficiency even when handling millions of tweets [18]. All the algorithms are trained with the scikit-learn library in a Jupyter Notebook under Anaconda, using default parameter values for all classifiers. ...
Chapter
Automatic text-based sentiment analysis and emotion detection on social media platforms has gained tremendous popularity recently due to its widespread application reach, despite the unavailability of a massive amount of labeled datasets. With social media platforms in the limelight in recent years, it’s easier for people to express their opinions and reach a larger target audience via Twitter and Facebook. Large tweet postings provide researchers with much data to train deep learning models for analysis and predictions for various applications. However, deep learning-based supervised learning is data-hungry and relies heavily on abundant labeled data, which remains a challenge. To address this issue, we have created a large-scale labeled emotion dataset of 1.83 million tweets by harnessing emotion-indicative emojis available in tweets. We conducted a set of experiments on our distant-supervised labeled dataset using conventional machine learning and deep learning models for estimating sentiment polarity and multi-class emotion detection. Our experimental results revealed that deep neural networks such as BiLSTM and CNN-BiLSTM outperform other models in both sentiment polarity and multi-class emotion classification tasks achieving an F1 score of 62.21% and 39.46%, respectively, an average performance improvement of nearly 2–3 percentage points on the baseline results.
Keywords: Sentiment polarity, Emotion detection, Distant supervision, Emoji, Deep learning, Twitter, Classification
... This research work is however limited in this respect. Furthermore, semantics [64][65][66] and concept space [67,68], as well as ontology models [69][70][71], and processing systems [72] can be utilized to enrich Urdu lexicon for developing polarity assessment models for Urdu [73], rather than using translation services to convert text into English first. ...
Preprint
Full-text available
Discovering what other people think has always been a key aspect of our information-gathering strategy. People can now actively utilize information technology to seek out and comprehend the ideas of others, thanks to the increased availability and popularity of opinion-rich resources such as online review sites and personal blogs. Because of its crucial function in understanding people's opinions, sentiment analysis (SA) is a crucial task. Existing research, on the other hand, is primarily focused on the English language, with only a small amount of study devoted to low-resource languages. For sentiment analysis, this work presents a new multi-class Urdu dataset based on user evaluations. The Urdu dataset was collected from Twitter. Our proposed dataset includes 10,000 reviews that have been carefully classified by human experts into two categories: positive and negative. The primary purpose of this research is to construct a manually annotated dataset for Urdu sentiment analysis and to establish the baseline result. Five different lexicon- and rule-based algorithms, including Naive Bayes, Stanza, TextBlob, VADER, and Flair, are employed, and the experimental results show that Flair, with an accuracy of 70%, outperforms the other tested algorithms.
... Currently, the integrated sentiment analysis algorithms are lexicon-based. It would be interesting to add machine learning-based algorithms to compare the differences and train them on domain-specific topics using ontology [62,63] or concept vectors [64]. Another aspect of future work is multilingual compatibility [46]. ...
Preprint
Full-text available
The amount of opinionated data on the internet is rapidly increasing. More and more people are sharing their ideas and opinions in reviews, discussion forums, microblogs and general social media. As opinions are central in all human activities, sentiment analysis has been applied to gain insights into this type of data. Several approaches have been proposed for sentiment classification. The major drawback is the lack of standardized solutions for classification and high-level visualization. In this study, a sentiment analyzer dashboard for online social networking analysis is proposed to enable people to gain insights into topics that interest them. The tool allows users to run the desired sentiment analysis algorithm in the dashboard. In addition to providing several visualization types, the dashboard provides raw results from the sentiment classification, which can be downloaded for further analysis.
... We were limited to only bag of words. These features can further be incorporated with semantics [39][40][41] and vector space representation models [42][43][44][45][46] to improve classification performance. ...
Preprint
Full-text available
In today's world, everyone is expressive in some way, and the focus of this project is on people's opinions about rising electricity prices in the United Kingdom and India, using data from Twitter, a micro-blogging platform on which people post messages known as tweets. Because many people have low incomes and must pay many taxes and bills, maintaining a home has become a contested issue. Although the government offered subsidy schemes to compensate people for their electricity bills, these schemes were not welcomed. In this project, the aim is to perform sentiment analysis on the opinions people express on Twitter. To understand opinion on electricity prices, sentiment analysis is needed for both the government and consumers in the energy market. Furthermore, text on these media is unstructured, so it must first be pre-processed. There are many feature extraction techniques, such as Bag of Words, TF-IDF (Term Frequency-Inverse Document Frequency), word embeddings, and NLP-based features like word count. In this project, we analysed the impact of word-level TF-IDF features on an electricity bills sentiment analysis dataset. We found that with word-level TF-IDF, sentiment analysis performance is 3-4 higher than with N-gram features. The analysis uses four classification algorithms (Naive Bayes, Decision Tree, Random Forest, and Logistic Regression) and considers the F-Score, Accuracy, Precision, and Recall performance measures.
... Then, Hedwig [26] developed a semantic data mining algorithm that exploits this summarized knowledge for deriving efficient rules. Kastrati & Imran [27] also applied the PageRank algorithm to identify the importance value of each concept in the ontology for assisting the document classification task. The importance value is aggregated with the concept relevance score, which is the frequency of the concept in the document, to determine the final weight of each concept for the classification process. ...
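A rough sketch of the two steps described in this snippet, under stated assumptions: concept importance is computed with a textbook power-iteration PageRank, and the final weight is taken as importance times frequency. The snippet says only that the two values are "aggregated", so the multiplication, the toy concept graph, and the frequencies below are all assumptions for illustration, not the cited authors' implementation.

```python
def pagerank(graph, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict {node: [successor nodes]}.

    Every node must appear as a key; a node with no successors spreads
    its rank evenly over all nodes.
    """
    nodes = list(graph)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        new_rank = {v: (1.0 - damping) / n for v in nodes}
        for v in nodes:
            targets = graph[v] or nodes  # dangling node: link to every node
            share = damping * rank[v] / len(targets)
            for t in targets:
                new_rank[t] += share
        rank = new_rank
    return rank


def concept_weights(importance, frequency):
    """ASSUMPTION: aggregate by multiplying importance with in-document frequency."""
    return {c: importance[c] * frequency.get(c, 0) for c in importance}


# Hypothetical ontology fragment: edges point from a concept to related concepts.
concept_graph = {
    "Grant": ["Project"],
    "Researcher": ["Project", "Grant"],
    "Project": [],
}
importance = pagerank(concept_graph)

# Hypothetical concept counts for one document.
frequency = {"Project": 2, "Researcher": 1}
weights = concept_weights(importance, frequency)
```

With this toy graph, "Project" receives the most incoming links and therefore the highest importance, while a concept that never occurs in the document ("Grant") ends up with weight zero regardless of its importance.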
Article
Full-text available
Decision Trees are a common approach used for classifying unseen data into defined classes. The Information Gain is usually applied as the splitting criterion in the node selection process for constructing the decision tree. However, bias in selecting the multi-variation attributes is a major limitation of using this splitting condition, leading to unsatisfactory classification performance. To deal with this problem, a new decision tree algorithm called "Knowledge-Based Decision Tree (KDT)" is proposed which exploits the knowledge in an ontology to assist the decision tree construction. The novelty of the study is that an ontology is applied to determine the attribute importance values using the PageRank algorithm. These values are used to modify the Information Gain to obtain appropriate attributes to be nodes in the decision tree. Four different datasets, Soybean, Heart disease, Dengue fever, and COVID-19 dataset, were employed to evaluate the proposed approach. The experimental results show that the proposed method is superior to the other decision tree algorithms, such as the traditional ID3 and the Mutual Information Decision tree (MIDT), and also performs better than a non-decision tree algorithm, e.g., the k-Nearest Neighbors.
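The bias mentioned in this abstract can be seen in a small sketch of standard Information Gain. The data, the attribute names, and the importance values are hypothetical, and scaling the gain by the importance value is only one possible reading of "modify the Information Gain"; the abstract does not give the actual KDT formula.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, attribute):
    """Entropy reduction obtained by splitting the rows on one attribute."""
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attribute], []).append(label)
    remainder = sum(len(part) / len(labels) * entropy(part) for part in partitions.values())
    return entropy(labels) - remainder


# Hypothetical toy data: "patient_id" is unique per row, "fever" is a real symptom.
rows = [
    {"patient_id": 1, "fever": "yes"},
    {"patient_id": 2, "fever": "yes"},
    {"patient_id": 3, "fever": "no"},
    {"patient_id": 4, "fever": "no"},
]
labels = ["sick", "sick", "healthy", "healthy"]

gains = {a: information_gain(rows, labels, a) for a in ("patient_id", "fever")}

# Hypothetical importance values, standing in for ontology-derived PageRank scores.
importance = {"patient_id": 0.1, "fever": 0.9}

# ASSUMPTION: scale the gain by the importance value; the abstract does not
# specify how the importance values modify the Information Gain.
weighted_gains = {a: gains[a] * importance[a] for a in gains}
```

Plain Information Gain cannot tell the two attributes apart here, because the many-valued identifier splits the data into pure single-row partitions. Any importance-based adjustment, such as the assumed scaling, lets the meaningful attribute win the split.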
... Term Presence and Frequency, Part of Speech Tagging, and Negation are some of the features that can be used. Incorporating the semantic context using publicly available lexical databases (e.g., WordNet, SentiWordNet, SenticNet) [54] or semantically rich representations using ontologies [60,61] and their thesauri [62,63] to identify the opinions and attitudes of users from text would also be an important aspect to investigate further. ...
Preprint
Full-text available
Online learning is becoming increasingly popular, whether for convenience, to accommodate work hours, or simply to have the freedom to study from anywhere. Especially during the COVID-19 pandemic, it has become the only viable option for learning. The effectiveness of teaching various hard-core programming courses with a mix of theoretical content is determined by the student interaction and responses. In contrast to a digital lecture through Zoom or Teams, a lecturer may rapidly acquire such responses from students' facial expressions, behavior, and attitude in a physical session, even if the listener is largely idle and non-interactive. However, student assessment in virtual learning is a challenging task. Despite the challenges, different technologies are progressively being integrated into teaching environments to boost student engagement and motivation. In this paper, we evaluate the effectiveness of various in-class feedback assessment methods such as Kahoot!, Mentimeter, Padlet, and polling to assist a lecturer in obtaining real-time feedback from students throughout a session and adapting the teaching style accordingly. Furthermore, some of the topics covered by student suggestions include tutor suggestions, enhancing teaching style, course content, and other subjects. Any input gives the instructor valuable insight into how to improve the student's learning experience; however, manually going through all of the qualitative comments and extracting the ideas is tedious. Thus, in this paper, we propose a sentiment analysis model for extracting the explicit suggestions from the students' qualitative feedback comments.
... The more times a word appears in a tweet, the higher its TF-IDF value. We convert the text messages in tweets to the Vector Space Model (VSM) (Kastrati and Imran 2019). In the VSM, each tweet's text is represented as a vector. ...
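As a minimal sketch of turning tweets into TF-IDF vectors, assuming the common variant with raw term counts and idf = log(N / df); the citing paper does not state which weighting variant or tokenizer it uses, and the example tweets are made up.

```python
import math
from collections import Counter


def tfidf_vectors(documents):
    """Map each document to a TF-IDF vector over a shared, sorted vocabulary.

    ASSUMPTION: tf is the raw term count and idf is log(N / df);
    tokens are obtained by lowercasing and splitting on whitespace.
    """
    tokenized = [doc.lower().split() for doc in documents]
    vocabulary = sorted({term for doc in tokenized for term in doc})
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for doc in tokenized for term in set(doc))
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        vectors.append(
            [counts[term] * math.log(n_docs / doc_freq[term]) for term in vocabulary]
        )
    return vocabulary, vectors


# Hypothetical tweets for illustration.
tweets = ["flood flood warning", "flood relief", "sunny day"]
vocabulary, vectors = tfidf_vectors(tweets)
flood = vocabulary.index("flood")
```

Each tweet becomes one row in the vector space. The entry for "flood" is larger in the first tweet, where the word occurs twice, than in the second, where it occurs once.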
Article
Full-text available
The success of sentiment analysis lies in identifying the most frequent and relevant opinions among users relating to a particular topic. In this paper, we develop a framework to analyze users’ sentiments on Twitter about natural disasters using data pre-processing techniques and a hybrid of machine learning, statistical modeling, and lexicon-based approaches. We choose TF-IDF and K-means for sentiment classification among affinitive and hierarchical clustering. Latent Dirichlet Allocation and a pipeline of Doc2Vec and K-means are used to capture themes, followed by multi-level polarity indices classification and its time series analysis. In our study, we draw insights from 243,746 tweets for Kerala’s 2018 natural disasters in India. The key findings of the study are the classification of sentiments based on similarity and polarity indices and the identification of themes among the topics discussed on Twitter. We observe different sets of emotions and influencers, among others. This case example of the Kerala floods shows how the government and other organizations could track positive/negative sentiments with respect to time and location; gain a better understanding of the topics of discussion trending among the public; and collaborate with crucial Twitter users/influencers to spread and figure out the gaps in the implementation of schemes in terms of design and execution. This research’s uniqueness is the streamlined and efficient combination of algorithms and techniques embedded in the framework used in achieving the above output, which can be integrated into a platform with GUI for further automation.
... BERT and all other deep learning models are good only at NLP tasks, and less so when it comes to natural language understanding. Such issues can be addressed by employing ontologies, better vector space representation models [47,48], and objective and semantic metrics [49,50]. ...
Preprint
Full-text available
While the whole world is still struggling with the COVID-19 pandemic, online learning and working from home have become more common. Many schools have moved their course teaching to online classrooms. Therefore, it is important to mine students' feedback and opinions from their reviews of their studies so that both schools and teachers know where they need to improve. This paper trains machine learning and deep learning models using both balanced and imbalanced datasets for sentiment classification. Two state-of-the-art (SOTA) category-aware text generation GAN models, CatGAN and SentiGAN, are used to synthesize text to balance the highly imbalanced dataset. Results on three datasets with different degrees of imbalance from distinct domains show that when generated text is used to balance the dataset, the F1-score of machine learning and deep learning models on sentiment classification increases by 2.79% to 9.28%. The results also indicate that the average growth is higher for CR100k than for CR23k, higher for deep learning than for machine learning algorithms, and higher for more complex deep learning models than for simpler ones.
... We can see that BERT and other deep learning algorithms are not all good at natural language processing tasks, and less so when it comes to natural language understanding. Such issues can be addressed by employing ontologies, better vector space representation models [43,44], and objective and semantic metrics [45,46]. Also, models that can handle multi-lingual reviews are absent, and most current sentiment analysis models can only process reviews written in English. ...
Preprint
With the whole world still in the COVID-19 pandemic, many schools have moved teaching from physical classrooms to online platforms. It is highly important for schools and online learning platforms to investigate feedback to gain valuable insights into the online teaching process, so that both platforms and teachers can learn which aspects they can improve to achieve better teaching performance. However, handling student reviews manually would be laborious, and it is unrealistic for large-scale feedback from e-learning platforms. To address this problem, recent research uses both machine learning algorithms and deep learning models to automatically process students' reviews and extract the opinions, sentiments, and attitudes they express. Such studies may play a crucial role in improving various interactive online learning platforms by incorporating automatic analysis of feedback. Therefore, we conduct an overview study of sentiment analysis in the educational field as presented in recent research, to help readers gain an overall understanding of this research. In addition, based on the literature review, we identify three future directions for automatic feedback processing: high-level entity extraction, multi-lingual sentiment analysis, and handling of figurative language.
... The framework was split into two modules: (i) a document representation module, and (ii) a classification module. For the classification mechanism to show the desired performance, a document was enriched with semantics using background knowledge provided by an ontology and through the acquisition of its relevant terminology [39]. This way, in-depth coverage of concepts is achieved and the conceptualization involved in documents is captured. ...
Preprint
Full-text available
Analysis of a large amount of data has always brought value to institutions and organizations. Lately, people's opinions expressed through text have become a very important aspect of this analysis. In response to this challenge, a natural language processing technique known as Aspect-Based Sentiment Analysis (ABSA) has emerged. With its ability to extract the polarity of each aspect of an opinion separately, ABSA has proven useful in a wide range of domains. Education is one of the domains in which ABSA can be successfully utilized. Being able to understand and find out what students like and don't like most about a course, professor, or teaching methodology can be of great importance for the respective institutions. While this task represents a unique NLP challenge, many studies have proposed different approaches to tackle the problem. In this work, we present a comprehensive review of the existing work in ABSA with a focus on the education domain. A wide range of methodologies are discussed and conclusions are drawn.