Article

Abstract

Microblogging services like Twitter and Facebook collect millions of user-generated posts every moment about trending news, ongoing events, and so on. Nevertheless, finding information of interest among this huge amount of often noisy and redundant posts is extremely difficult. In general, social media analytics services have attracted increasing attention from both research and industry. Specifically, the dynamic context of microblogging requires managing not only the meaning of information but also the evolution of knowledge over the timeline. This work defines the Time Aware Knowledge Extraction (TAKE) methodology, which relies on a temporal extension of Fuzzy Formal Concept Analysis. In particular, a microblog summarization algorithm has been defined that filters the concepts organized by TAKE into a time-dependent hierarchy. The algorithm addresses topic-based summarization on Twitter. Besides accounting for the timing of concepts, another distinguishing feature of the proposed microblog summarization framework is the possibility of producing a more or less detailed summary, according to the user's needs, with good levels of quality and completeness, as highlighted in the experimental results.
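To make the core idea concrete, the following is a minimal sketch of threshold-based fuzzy Formal Concept Analysis applied to a single time window of tweets, in the spirit of TAKE's time-dependent concept hierarchy. The fuzzy context, the terms and the threshold are illustrative assumptions, not the authors' actual data, and the full TAKE pipeline (semantic annotation, temporal extension, summary filtering) is not reproduced here.

```python
# A minimal sketch of one-sided, threshold-based fuzzy Formal Concept Analysis
# applied per time window, loosely inspired by TAKE's idea of organizing tweet
# terms into a time-dependent concept hierarchy. Data and threshold are
# illustrative assumptions.
from itertools import combinations

THETA = 0.5  # assumed membership threshold for the crisp-side derivation

# fuzzy context for one time window: tweet id -> {term: membership degree}
window = {
    "t1": {"earthquake": 0.9, "rescue": 0.7, "donation": 0.1},
    "t2": {"earthquake": 0.8, "rescue": 0.2, "donation": 0.6},
    "t3": {"earthquake": 0.7, "rescue": 0.9, "donation": 0.0},
}
terms = sorted({m for degrees in window.values() for m in degrees})

def extent(term_set):
    """Tweets whose membership in every term of term_set reaches THETA."""
    return frozenset(g for g, deg in window.items()
                     if all(deg.get(m, 0.0) >= THETA for m in term_set))

def intent(tweet_set):
    """Terms shared (above THETA) by every tweet in tweet_set."""
    return frozenset(m for m in terms
                     if all(window[g].get(m, 0.0) >= THETA for g in tweet_set))

# Enumerate the concepts of this window by closing every term subset.
concepts = set()
for k in range(len(terms) + 1):
    for subset in combinations(terms, k):
        ext = extent(subset)
        concepts.add((ext, intent(ext)))

for ext, inten in sorted(concepts, key=lambda c: -len(c[0])):
    print(sorted(ext), "->", sorted(inten))
```

Running the same derivation on each time window yields one small concept hierarchy per window, which is the kind of time-dependent organization that a summarization step can then filter.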


... This has led to extensive research in event analysis in microblogs which encompasses topic mining and event summarization. Existing works on summarizing an event based on the textual content of microblogs mostly provide a general factual summary [3,12,14,19,28] without considering users' emotional reactions. Yet, having a good understanding of the emotional responses of users is useful for policy makers. ...
... Event summarization in microblogs has mostly focused on generating a textual summary of the event [3,4,10,12,17,32] and ignored user reactions. Traditional summarization approaches utilize either k-means clustering [12,32] or detect volume peaks in microblogs [3,4,10] to identify subevents and pick the most representative microblogs to summarize the subevents. ...
Conference Paper
Full-text available
Microblogs have become the preferred means of communication for people to share information and feelings, especially for fast evolving events. Understanding the emotional reactions of people allows decision makers to formulate policies that are likely to be more well-received by the public and hence better accepted especially during policy implementation. However, uncovering the topics and emotions related to an event over time is a challenge due to the short and noisy nature of microblogs. This work proposes a weakly supervised learning approach to learn coherent topics and the corresponding emotional reactions as an event unfolds. We summarize the event by giving the representative microblogs and the emotion distributions associated with the topics over time. Experiments on multiple real-world event datasets demonstrate the effectiveness of the proposed approach over existing solutions.
... In this thesis, we develop a new approach to identify the topics and the corresponding emotional reactions of people as an event unfolds. Specifically, we design an event analysis framework called MOSAIC that comprises three stages: (1) a trend interval detection algorithm to determine the granularity over the event timeline for discovering the hot topics while minimizing potential information loss, (...) in microblogs, which encompasses topic mining [1,2,3] and event summarization [4,5,6,7,8]. ...
... Existing works on summarizing an event based on the textual content of microblogs mostly provide a general factual summary [4,9,8,10,11] without considering users' emotional reactions. Yet, having an accurate understanding of the emotional responses of users as an event unfolds is critical to decision makers in adopting and implementing intervention strategies in a timely and appropriate manner. ...
... Several works [6,7,8] have assumed that peaks in the posting rate of microblogs are a good indicator of the emergence of subevents. A popular peak area detection algorithm called the Offline Peak Detection (OPAD) algorithm [23] is generally used to identify peak areas in the microblog posting rate. ...
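As a rough illustration of the volume-peak idea mentioned above, the sketch below flags bins whose posting rate exceeds a running mean by a few standard deviations. This is a generic burst detector under assumed parameters and data, not the OPAD algorithm cited in the snippet.

```python
# A minimal sketch of peak (burst) detection on a microblog posting-rate
# series: flag bins whose volume exceeds a trailing mean by n_sigma standard
# deviations. Series and parameters are illustrative assumptions.
from statistics import mean, stdev

def detect_peaks(counts, window=5, n_sigma=2.0):
    """Return indices of bins whose count is a burst w.r.t. the trailing window."""
    peaks = []
    for i in range(window, len(counts)):
        history = counts[i - window:i]
        mu, sigma = mean(history), stdev(history)
        if counts[i] > mu + n_sigma * max(sigma, 1e-9):  # guard against flat history
            peaks.append(i)
    return peaks

# Tweets per minute for a hypothetical event timeline.
rate = [12, 14, 11, 13, 12, 13, 80, 95, 20, 14, 13, 60, 12]
print(detect_peaks(rate))  # -> [6, 7]: the burst around the 80/95 bins
```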
Thesis
Full-text available
One aspect of crisis management involves the ability to understand the emotional reactions of people to adjust the response strategies. Uncovering the topics and emotions related to an event over time is critical for timely intervention. For fast-evolving events, microblogs tend to be the preferred means of communication for people to share information and feelings. In this thesis, we develop an event analysis framework to identify the topics and emotional reactions of people in microblogs. The framework has three components: (1) a trend interval detection algorithm to determine the granularity for discovering hot topics while minimizing potential information loss, (2) a weakly supervised learning approach to learn coherent topics and their corresponding emotional reactions, (3) an event summary generator that gives representative microblogs and emotion distributions associated with the topics over time. Extensive experiments on multiple real-world event datasets demonstrate the effectiveness of the proposed approach over existing solutions.
... The same study also evaluated the relationship between topics using seven dissimilarity measures and found that the Kullback-Leibler and Euclidean distances performed better in identifying related topics useful for a user-based interactive approach. Similarly, extant research applying the Time Aware Knowledge Extraction (TAKE) methodology [22] demonstrated methods to discover valuable information from the huge amounts of content posted on Facebook and Twitter. The study used topic-based summarization of Twitter data to explore content of research interest. ...
... where the gradient in (24) represents the difference between ŷ and y multiplied by the corresponding input x_j. Please note that in (22), we need to do the partial derivatives for all the values of x_j where 1 ≤ j ≤ n. ...
... We observed on the test data with 70 items that, akin to the Naïve Bayes classification accuracy, shorter Tweets were classified using logistic regression with a greater degree of accuracy of just above 74%, and the classification accuracy decreased to 52% with longer Tweets. We calculated the Sensitivity of the classification test, which is given by the ratio of the number of correct positive predictions (22) in the output, to the total number of positives (35), to be 0.63 for the short Tweets, and 0.46 for the longer Tweets. We calculated the Specificity of the classification test, which is given by the ratio of the number of correct negative predictions (30) in the output, to the total number of negatives (35), to be 0.86 for the short Tweets, and 0.60 for the longer Tweets classification. ...
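For reference, the sensitivity and specificity figures quoted above follow directly from the stated counts. A minimal sketch of that arithmetic for the short-Tweet case (the counts come from the cited passage; the helper names are ours):

```python
# Reproduce the quoted sensitivity/specificity arithmetic: 22 correct
# positives out of 35 positives, 30 correct negatives out of 35 negatives.
def sensitivity(tp, positives):
    return tp / positives

def specificity(tn, negatives):
    return tn / negatives

print(round(sensitivity(22, 35), 2))  # 0.63
print(round(specificity(30, 35), 2))  # 0.86
```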
Article
Full-text available
Along with the Coronavirus pandemic, another crisis has manifested itself in the form of mass fear and panic phenomena, fueled by incomplete and often inaccurate information. There is therefore a tremendous need to address and better understand COVID-19's informational crisis and gauge public sentiment, so that appropriate messaging and policy decisions can be implemented. In this research article, we identify public sentiment associated with the pandemic using Coronavirus specific Tweets and R statistical software, along with its sentiment analysis packages. We demonstrate insights into the progress of fear-sentiment over time as COVID-19 approached peak levels in the United States, using descriptive textual analytics supported by necessary textual data visualizations. Furthermore, we provide a methodological overview of two essential machine learning (ML) classification methods, in the context of textual analytics, and compare their effectiveness in classifying Coronavirus Tweets of varying lengths. We observe a strong classification accuracy of 91% for short Tweets, with the Naïve Bayes method. We also observe that the logistic regression classification method provides a reasonable accuracy of 74% with shorter Tweets, and both methods showed relatively weaker performance for longer Tweets. This research provides insights into Coronavirus fear sentiment progression, and outlines associated methods, implications, limitations and opportunities.
... An example of a method that uses a fuzzy-based approach is fuzzy logic with Zadeh's classic calculus of linguistically quantified propositions (Kacprzyk et al., 2008), which addresses trend extraction and real-time problems; its results are superior under t-norm evaluation but weak on semantic problems, because the semantic results of other t-norms are unclear and can hardly be understood. Fuzzy Formal Concept Analysis (Fuzzy FCA) (Maio et al., 2015) addresses semantic and real-time problems, and its results excel in F-measure evaluations with optimal recall and comparable precision. An example of a method that uses a machine learning approach is Incremental Short Text Summarization (IncreSTS) by (C. ...), which has better outlier handling, high efficiency, and scalability on target problems. ...
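As a point of reference for the first approach mentioned in the snippet, the sketch below evaluates a linguistically quantified proposition in Zadeh's style, using a common piecewise-linear definition of the relative quantifier "most". The quantifier shape and the degrees are textbook-style assumptions, not the cited paper's setup.

```python
# A minimal sketch of Zadeh's calculus of linguistically quantified
# propositions: the truth of "most tweets show an upward trend" is the
# quantifier membership evaluated on the mean trend degree.
def most(r):
    """A standard piecewise-linear membership for the relative quantifier 'most'."""
    if r >= 0.8:
        return 1.0
    if r <= 0.3:
        return 0.0
    return 2.0 * r - 0.6

def truth_of_proposition(degrees, quantifier=most):
    """Truth of 'Q of the items satisfy the property', Zadeh-style."""
    r = sum(degrees) / len(degrees)
    return quantifier(r)

trend_degrees = [0.9, 0.7, 0.8, 0.4, 0.6]   # per-tweet degree of "upward trend"
print(truth_of_proposition(trend_degrees))   # ~0.76 for this toy data
```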
... , (S et al., 2017), (Guo et al., 2019), (Dilawari and Khan, 2019); Unsupervised learning: (Song et al., 2011), (Yousefi-azar and Hamey, 2017), (Tayal et al., 2016), (Alami et al., 2019), (Y. ...), (Sun and Zhuge, 2018), (Zhou et al., 2016); Single document: (Li et al., 2016), (Patel et al., 2019), (Sharifi et al., 2013), (Goyal et al., 2013), (Cagliero et al., 2019); Multi-document: (Malallah and Ali, 2017), (Patel et al., 2019), (Lee et al., 2013), (Padmapriya and Duraiswamy, 2014), (Fuad et al., 2019), (Khan et al., 2015a), (Khan et al., 2015b), (S et al., 2017), (Sanchez-gomez et al., 2018), (Qiang et al., 2016), (Ansamma et al., 2017), (Widjanarko et al., 2018), (Azhari et al., 2018), (Sharifi et al., 2013), (Alzuhair and Al-dhelaan, 2019), (Liu et al., 2012), (Bian et al., 2013), (Yulianti et al., 2017), (Qiang et al., 2019), (Ketui et al., 2015), (Yan and Wan, 2015), (Baralis et al., 2015); Optimization: (Song et al., 2011), (Abbasi-ghalehtaki et al., 2016), (Binwahlan et al., 2009a), (Khosravi et al., 2008), (Sanchez-gomez et al., 2018); Real-time: (Maio et al., 2015), (Rodríguez-Vidal et al., 2019), (Chua and Asur, 2009), (Kacprzyk et al., 2008), (Fu et al., 2015), (Chellal et al., 2016). Preprocessing is the initial step for preparing data: unstructured data is converted into structured data according to the needs of summarization. ...
Article
Full-text available
Text summarization automatically produces a summary containing important sentences and including all relevant important information from the original document. The main approaches, when viewed from the summary results, are extractive and abstractive. Extractive summarization is approaching maturity, and research has now shifted towards abstractive and real-time summarization. Although many achievements in datasets, methods, and techniques have been published, few papers provide a broad picture of the current state of research in this field. This paper provides a broad and systematic review of research in the field of text summarization published from 2008 to 2019. There are 85 journal and conference publications, extracted as selected studies, that were identified and analyzed to describe research topics/trends, datasets, preprocessing, features, techniques, methods, evaluations, and problems in this field of research. The results of the analysis provide an in-depth explanation of the topics/trends that are the focus of research in the field of text summarization; provide references to the public datasets, preprocessing steps, and features that have been used; and describe the techniques and methods that are often used by researchers as comparisons and as means for developing methods. At the end of this paper, several recommendations for opportunities and challenges related to text summarization research are mentioned.
... The same study also evaluated the relationship between topics using seven dissimilarity measures and found that the Kullback-Leibler and Euclidean distances performed better in identifying related topics useful for a user-based interactive approach. Similarly, extant research applying the Time Aware Knowledge Extraction (TAKE) methodology [24] demonstrated methods to discover valuable information from the huge amounts of content posted on Facebook and Twitter. The study used topic-based summarization of Twitter data to explore content of research interest. ...
... where the gradient in (24) represents the difference between ŷ and y multiplied by the corresponding input x_j. Note that in (22), we need to do the partial derivatives for all the values of x_j where 1 ≤ j ≤ n. ...
Preprint
Full-text available
Along with the Coronavirus pandemic, another crisis has manifested itself in the form of mass fear and panic phenomena, fuelled by incomplete and often inaccurate information. There is therefore a tremendous need to address and better understand COVID-19's informational crisis and gauge public sentiment, so that appropriate messaging and policy decisions can be implemented. In this research article, we identify public sentiment associated with the pandemic using Coronavirus specific Tweets and R statistical software, along with its sentiment analysis packages. We demonstrate insights into the progress of fear-sentiment over time as COVID-19 approached peak levels in the United States, using descriptive textual analytics supported by necessary textual data visualizations. Furthermore, we provide a methodological overview of two essential machine learning classification methods, in the context of textual analytics, and compare their effectiveness in classifying Coronavirus Tweets of varying lengths. We observe a strong classification accuracy of 91% for short Tweets, with the Naïve Bayes method. We also observe that the logistic regression classification method provides a reasonable accuracy of 74% with shorter Tweets, and both methods showed relatively weaker performance for longer Tweets. This research provides insights into Coronavirus fear sentiment progression, and outlines associated methods, implications, limitations and opportunities.
... Rodriguez [15] implemented a knowledge extraction system, oriented to unstructured text, to generate semantic knowledge based on entity and relationship extraction. Maio [16] defined a time-aware method that can generate topic-based digests from discrete short texts on Twitter. Morgan [17] proposed a topic classification method based on maximum entropy for tracking and detecting topic letters in microblogs. ...
... This algorithm first calculates the distribution of word W among different categories (NCD), which is defined using the information entropy shown in Eq. (16). ...
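Since Eq. (16) itself is not reproduced in the snippet, the following sketch uses plain Shannon entropy over a word's normalized category counts as an assumed stand-in for the NCD measure; the word and the counts are illustrative.

```python
# Score how a word distributes across categories with information entropy,
# in the spirit of the NCD measure mentioned above (assumed stand-in, not
# the cited paper's exact Eq. (16)).
import math

def category_entropy(counts):
    """Shannon entropy (in bits) of a word's occurrence counts per category."""
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    return -sum(p * math.log2(p) for p in probs)

# Occurrences of the word "vaccine" in four hypothetical topic categories.
print(category_entropy([40, 5, 3, 2]))     # low entropy: concentrated in one category
print(category_entropy([12, 13, 12, 13]))  # high entropy: spread evenly (~2 bits)
```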
Article
Full-text available
“Pattern” can always help a machine to recognize new encounters, and so does the “requirement pattern.” Requirement patterns are essential for a cognitive service to understand a customer’s intention. Since crowdsourcing service platforms hold abundant user demands in the form of text, the method proposed in this paper aims at eliciting valuable patterns from this “treasure.” The method is based on a knowledge graph, which is constructed from the refined concepts of texts from several different domains. Due to the irregularity and variability of user demand expressions, this paper first explains the knowledge extraction method for heterogeneous text and the knowledge fusion-based knowledge graph construction method. Afterward, we introduce the requirement pattern elicitation method based on this knowledge graph. The pattern can be either a frequent demand sequence or a domain-oriented rule or link. Finally, this paper demonstrates a case study to show how those patterns can help to understand customers’ intentions effectively and accurately.
... Given the considerable number of publications on SSMED and the restrictive filters (such as journals, time of publication, etc.) adopted by the authors of the extant literature reviews (as reported in Paragraph 2.3), which eventually changed or even reversed some findings, the authors implemented an alternative, innovative, extensive and quantitative review process, called Semiautomatic Literature Review (SALR), conducted using integrated techniques for knowledge extraction. Specifically, the work adopted a methodology known as 'Time Aware Knowledge Extraction' (TAKE), introduced by De Maio, Fenza, Loia, and Parente (2016) to analyse unstructured data in the Information Technology field. Moreover, qualitative methods were also adopted to fully answer the two RQs. ...
... This work is based on an innovative technique, TAKE, originally proposed by De Maio et al. (2016) for purposes different from those pursued in this SALR but related to the analysis of temporal and conceptual data with unstructured content. TAKE consists of 3 steps, as shown in Figure 2. The Narrative Literature Review (NLR) offers an overview of a given topic and generally addresses different aspects, providing answers to broad and generic questions that investigate the entire context of a certain topic and aim to generate a basic understanding (Green, Johnson, & Adams, 2006). ...
Article
Full-text available
SSMED (Service Science, Management Engineering and Design) is multidisciplinary by nature. However, some authors stated that SSMED publications remain focused on single scientific domains. This paper proposes a Semiautomatic Literature Review (SALR) using integrated techniques for knowledge extraction –‘Time Aware Knowledge Extraction’ (TAKE) – to analyse the interdisciplinarity of SSMED publications and the potential for transdisciplinarity based on the actual adoption of Service-Dominant Logic as the foundation of SSMED research. Findings reveal that: 1) most SSMED publications are not interdisciplinary and are mainly related to Management; 2) Service-Dominant Logic has been adopted very often in SSMED publications, paving the way for SSMED transdisciplinarity. This paper offers theoretical and practical insights by enhancing the knowledge about SSMED literature and enriching the state of the art related to techniques to perform literature reviews. Furthermore, it stimulates the expansion of scholars’ and managers’ views of holistic approaches to service systems while fostering SSMED viability.
... The extraction process is based on extended temporal concept analysis and description logic, to reason over semantically represented tweet streams. A microblog summarization algorithm has been defined in [18], filtering the concepts organized by a time-aware knowledge extraction method into a time-dependent hierarchy. ...
Article
Full-text available
In settings wherein discussion topics are not statically assigned, such as in microblogs, a need exists for identifying and separating topics of a given event. We approach the problem by using a novel type of similarity, calculated between the major terms used in posts. The occurrences of such terms are periodically sampled from the posts stream. The generated temporal series are processed by using marker-based stigmergy, i.e., a biologically-inspired mechanism performing scalar and temporal information aggregation. More precisely, each sample of the series generates a functional structure, called a mark, associated with some concentration. The concentrations disperse in a scalar space and evaporate over time. Multiple deposits, when samples are close in terms of instants of time and values, aggregate into a trail and then persist longer than an isolated mark. To measure similarity between time series, the Jaccard similarity coefficient between trails is calculated. Discussion topics are generated by such a similarity measure in a clustering process using Self-Organizing Maps, and are represented via a colored term cloud. Structural parameters are correctly tuned via an adaptation mechanism based on Differential Evolution. Experiments are completed for a real-world scenario, and the resulting similarity is compared with Dynamic Time Warping (DTW) similarity.
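A heavily simplified sketch of the trail-based similarity described above is given below: each term's sampled series is mapped to a set of occupied cells in a discretized (time, value) space, standing in for the aggregated marks, and similarity is the Jaccard coefficient between the cell sets. Evaporation, SOM clustering and the Differential Evolution tuning of the paper are omitted; the bin sizes and series are assumptions.

```python
# A drastically simplified stand-in for the mark/trail aggregation of the
# paper: a "trail" is the set of discretized (time, value) cells a series
# touches, and similarity is the Jaccard coefficient between trails.
def trail(series, value_bin=5, spread=1):
    """Cells touched by the series, with a small vertical spread per sample."""
    cells = set()
    for t, v in enumerate(series):
        level = round(v / value_bin)
        for d in range(-spread, spread + 1):  # crude stand-in for mark dispersion
            cells.add((t, level + d))
    return cells

def jaccard(a, b):
    return len(a & b) / len(a | b)

term_a = [10, 12, 30, 55, 50, 20]   # sampled occurrences of term A
term_b = [11, 14, 28, 60, 48, 18]   # sampled occurrences of term B
term_c = [50, 45, 10, 5, 8, 40]     # a term with a different temporal profile

print(jaccard(trail(term_a), trail(term_b)))  # high: similar temporal behaviour
print(jaccard(trail(term_a), trail(term_c)))  # low: different behaviour
```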
... In recent years, in politics, emergency response, and business, it has become important to account for influential social network nodes in order to influence and reach as many people as possible [6]. As such, several studies have aimed to identify the influentials within a society or a community for commercial, political or economic purposes [7][8][9][10][11][12]. Bennett [13] investigated the ability to influence and then take common actions, while [14] studied the logic of the connections and relationships with human beings. ...
... Information propagation in Twitter is dependent on several factors, such as the number of followers, network topology, user influence, time of day, and events such as elections or natural disasters, among numerous others [1,10,11,23-25]. The influence of a user will determine how rapidly and far a tweet will reach other users, but also how long that tweet will survive on the net. ...
Article
Over the past several years, social networks have become a major channel for information delivery. At present, social networks are being used to obtain more followers and exert influence over people during political campaigns. However, the propagation of a social network post is dependent on numerous factors. Some of these are known; for example, the post contents, the time when it was posted, and the person or entity by whom it was posted. However, other factors remain unknown, such as what makes a post more successful than others, and how posts from similar profiles evolve and propagate differently over time. The main subject of this work is addressing these types of questions. Our approach relies on a three-fold methodology for studying the influence and propagation of posts: graph-based, semantic, and contrast pattern recognition analysis. The results obtained are complemented by a dynamic visualization that encompasses all of the variables involved. In order to corroborate our results, we collected all posts from the Twitter accounts of the most prominent Mexican political figures and analyzed the influence and propagation of each post issued.
... (12) Ease of use: the degree to which the use of the method by individuals is free of effort. (13) Consistency: the degree of uniformity, standardization, and freedom from contradiction among the elements of the structure of the method. (14) Utility: measures the value of achieving the method's goal, i.e., the difference between the worth of achieving this goal and the price paid for achieving it. ...
Article
Full-text available
The massive adoption of the Internet of Things in many industrial areas, in addition to the requirements of modern services, is posing huge challenges to the field of data mining. Moreover, the semantic interoperability of systems and enterprises requires operating across many different formats, such as ontologies, knowledge graphs, or relational databases, as well as different contexts, such as static, dynamic, or real time. Consequently, supporting this semantic interoperability requires a wide range of knowledge discovery methods with different capabilities that respond to the context of distributed architectures (DA). However, to the best of our knowledge, there is no recent general review of the state of the art in Concept Analysis (CA) and multi-relational data mining (MRDM) methods regarding knowledge discovery in DA considering semantic interoperability. In this work, a systematic literature review on CA and MRDM is conducted, providing a discussion of the characteristics they have according to the papers reviewed, supported by a clustering technique based on association rules. Moreover, the review allowed the identification of three research gaps toward a more scalable set of methods in the context of DA and heterogeneous sources.
... Moreover, the results improved by 8% over the current solution for classifying running speed conditions using a single wearable sensor in the context of the elderly in smart homes [80]. Wearable sensors and a feature extraction algorithm are used for accurate monitoring of speed conditions [74]. Moreover, MATLAB software and wearable sensors are used for monitoring the healthcare parameters of the elderly body [48]. ...
Article
Full-text available
The growing elderly population in smart home environments necessitates increased remote medical support and frequent doctor visits. To address this need, wearable sensor technology plays a crucial role in designing effective healthcare systems for the elderly, facilitating human–machine interaction. However, wearable technology has not yet been applied accurately to monitoring the various vital healthcare parameters of elders. In addition, healthcare providers encounter issues regarding the acceptability of healthcare parameter monitoring and secure data communication within the context of elderly care in smart home environments. Therefore, this research is dedicated to investigating the accuracy of wearable sensors in monitoring healthcare parameters and ensuring secure data transmission. An architectural framework is introduced, outlining the critical components of a comprehensive system, including Sensing, Data storage, and Data communication (SDD) for the monitoring process. These vital components highlight the system's functionality and introduce elements for monitoring and tracking various healthcare parameters through wearable sensors. The collected data is subsequently communicated to healthcare providers to enhance the well-being of elderly individuals. The SDD taxonomy guides the implementation of wearable sensor technology through environmental and body sensors. The proposed system demonstrates the accuracy enhancement of healthcare parameter monitoring and tracking through smart sensors. This study evaluates state-of-the-art articles on monitoring and tracking healthcare parameters through wearable sensors. In conclusion, this study underscores the importance of delineating the SDD taxonomy by classifying the system's major components, contributing to the analysis and resolution of existing challenges. It emphasizes the efficiency of remote monitoring techniques in enhancing healthcare services for the elderly in smart home environments.
... Initially, single-document summarisation was introduced. In this method, all the detailed data present in a single document were summarised and converted into a short data set (De Maio et al., 2016). Later on, multi-document summarisation methods gained popularity. ...
Article
Full-text available
This study compared the salient features of the three basic types of automatic text summarisation methods (ATSMs)—extractive, abstractive, and real-time—along with the available approaches used for each type. The data set comprised 12 reports on current issues in automatic text summarisation methods and techniques across languages, with a special focus on Arabic, whose structure has been largely claimed to be problematic in most ATSMs. Three main summarizers were compared: TAAM, OTExtSum, and OntoRealSumm. Further to this, a human-produced version of the summary of the data set was prepared and then compared to the automatically generated summary. A 10-item questionnaire was built to help with the assessment of the target ATSMs. Also, ROUGE analysis was performed to assess the efficacy of all techniques in minimising the redundancy of the data set. Findings showed that the precision of the target summarizers differed considerably, as 80% of the data set has been proven to be aware of the problems underlying ATSMs. The remaining parameters were in the normal range (65–75%). In light of the equation-based assessment of ATSMs, the highest range was noted with the removal of stop words, and the lowest range was noted with POS tagging, stem weight, and stem collection. Regarding Arabic, statistical analysis has been proven to be the most effective summarisation method (accuracy = 57.59%; recall = 58.79%; F-value = 57.99%). Further research is required to explore how the lexicogrammatical nature of languages and generic text structure would affect the text summarisation process.
... Machine learning and deep learning classifiers play an important role in classifying this information; for short texts, logistic regression and Naive Bayes classifiers provide accuracy of up to 74 and 91 percent, respectively, but for long texts the performance of these models is considerably weaker [16]. Nowadays, researchers have adopted the Time-Aware Knowledge Extraction (TAKE) methodology [17] to identify useful information from the large amount of content posted on Twitter and Facebook. To predict the future, a machine learning and cloud computing-based model was developed by researchers in May 2020 [22]. ...
Article
Full-text available
Psychologists and social scientists are interested in evaluating how people express their feelings and sentiments about natural disasters, terrorism, and pandemic situations. Covid-19 has increased the number of psychological issues, such as depression, due to social changes and employment issues. The everyday life of people has been disturbed by the covid-19 pandemic. During the lockdown, people share their opinions on social sites like Twitter and Facebook. Due to this pandemic situation and lockdown, people's emotions differ and are categorized as fear, anger, joy, and sadness in relation to covid-19 and the lockdown. In this paper, we have used machine learning and Natural Language Processing approaches to design an effective machine learning model for the classification of people's emotions related to covid-19. The early detection of sentiment allows for better handling of the pandemic situation and government policies. Text is categorized into fear, joy, anger, and sadness sentiment classes. We have proposed a deep learning-based LSTM model for covid-19 related emotion identification and achieved an accuracy of 71.7% with the proposed model. For the robustness of the proposed model, we considered several machine learning classifiers and compared them with our proposed model. Data Availability: In this study, an open-source dataset is used: https://www.kaggle.com/code/poulamibakshi/covid-19-sentiment-analysis/data
... where I corresponds to the score extracted by the Semantic Annotation phase, indicating the triple fuzzy relationships between users U, topics URIs and time T [14]. Example 2. Let us assume we have a user tweet stream between t1 and t2, including 6 users U = u1, u2, . . . ...
... Tweet contents are user-generated data in the form of text, images, audio or video. Furthermore, tweet contents can include other types of data, such as hashtags and mentions of other accounts [5][6][7][8]. ...
Method
Full-text available
The main objective of this study is to propose and develop a deep learning-based model for sentiment analysis using data extracted from Twitter.
... The triadic timed FCA for users' post contents is composed of three dimensions, i.e., users, topics (linguistic terms extracted from the tweets' content in the semantic representation phase), and time (i.e., objects, attributes, and conditions): TFC = (U, URIs, T, I), in which I indicates the triple fuzzy relationships (De Maio et al. 2016), with values in [0, 1], among users U, topics URIs, and time T. ...
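A minimal sketch of how such a triadic fuzzy context can be represented is shown below: the incidence I maps (user, topic URI, time slot) triples to degrees in [0, 1], and simple threshold queries slice it by user or by topic. The data, the threshold and the helper functions are illustrative assumptions, not the cited system's implementation.

```python
# A toy triadic fuzzy context TFC = (U, URIs, T, I): the incidence I maps
# (user, topic URI, time slot) triples to a degree in [0, 1].
I = {
    ("u1", "dbpedia:Football",  "morning"): 0.9,
    ("u1", "dbpedia:Elections", "morning"): 0.2,
    ("u1", "dbpedia:Football",  "evening"): 0.4,
    ("u2", "dbpedia:Elections", "evening"): 0.8,
    ("u2", "dbpedia:Football",  "evening"): 0.6,
}

def topics_of(user, slot, theta=0.5):
    """Topic URIs the user relates to at least to degree theta in a time slot."""
    return {uri: d for (u, uri, t), d in I.items()
            if u == user and t == slot and d >= theta}

def users_of(uri, slot, theta=0.5):
    """Users related to a topic at least to degree theta in a time slot."""
    return {u: d for (u, topic, t), d in I.items()
            if topic == uri and t == slot and d >= theta}

print(topics_of("u1", "morning"))               # {'dbpedia:Football': 0.9}
print(users_of("dbpedia:Football", "evening"))  # {'u2': 0.6}
```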
Article
Full-text available
Advertising is becoming a business on social networks. Billions of people around the world use social media, and it has quickly become one of the defining technologies of our time. Social platforms like Twitter are one of the primary means of communication and information dissemination and can capture the interest of potential customers. Therefore, it is crucial to select suitable advertisements for users at specific times and locations to capture their attention profitably. In this paper, we propose a context-aware advertising recommendation system that, by analyzing users' tweets and movements along a timeline, infers the personal interests of users and provides attractive ads to users through triadic formal concept analysis theory.
... The same study also evaluated the relationship between topics using seven dissimilarity measures and found that the Kullback-Leibler and Euclidean distances performed better in identifying related topics useful for a user-based interactive approach. Similarly, extant research applying the Time Aware Knowledge Extraction (TAKE) methodology [25] demonstrated methods to discover valuable information from the huge amounts of content posted on Facebook and Twitter. The study used topic-based summarization of Twitter data to explore content of research interest. ...
Preprint
Full-text available
Along with the Coronavirus pandemic, another crisis has manifested itself in the form of mass fear and panic phenomena, fueled by incomplete and often inaccurate information. There is therefore a tremendous need to address and better understand COVID-19's informational crisis and gauge public sentiment, so that appropriate messaging and policy decisions can be implemented. In this research article, we identify public sentiment associated with the pandemic using Coronavirus specific Tweets and R statistical software, along with its sentiment analysis packages. We demonstrate insights into the progress of fear-sentiment over time as COVID-19 approached peak levels in the United States, using descriptive textual analytics supported by necessary textual data visualizations. Furthermore, we provide a methodological overview of two essential machine learning (ML) classification methods, in the context of textual analytics, and compare their effectiveness in classifying Coronavirus Tweets of varying lengths. We observe a strong classification accuracy of 91% for short Tweets, with the Naïve Bayes method. We also observe that the logistic regression classification method provides a reasonable accuracy of 74% with shorter Tweets, and both methods showed relatively weaker performance for longer Tweets. This research provides insights into Coronavirus fear sentiment progression, and outlines associated methods, implications, limitations and opportunities.
... The same study also evaluated the relationship between topics using seven dissimilarity measures and found that the Kullback-Leibler and Euclidean distances performed better in identifying related topics useful for a user-based interactive approach. Similarly, extant research applying the Time Aware Knowledge Extraction (TAKE) methodology [25] demonstrated methods to discover valuable information from the huge amounts of content posted on Facebook and Twitter. The study used topic-based summarization of Twitter data to explore content of research interest. ...
Preprint
Full-text available
Along with the Coronavirus pandemic, another crisis has manifested itself in the form of mass fear and panic phenomena, fueled by incomplete and often inaccurate information. There is therefore a tremendous need to address and better understand COVID-19's informational crisis and gauge public sentiment, so that appropriate messaging and policy decisions can be implemented. In this research article, we identify public sentiment associated with the pandemic using Coronavirus specific Tweets and R statistical software, along with its sentiment analysis packages. We demonstrate insights into the progress of fear-sentiment over time as COVID-19 approached peak levels in the United States, using descriptive textual analytics supported by necessary textual data visualizations. Furthermore, we provide a methodological overview of two essential machine learning (ML) classification methods, in the context of textual analytics, and compare their effectiveness in classifying Coronavirus Tweets of varying lengths. We observe a strong classification accuracy of 91% for short Tweets, with the Naive Bayes method. We also observe that the logistic regression classification method provides a reasonable accuracy of 74% with shorter Tweets, and both methods showed relatively weaker performance for longer Tweets. This research provides insights into Coronavirus fear sentiment progression, and outlines associated methods, implications, limitations and opportunities.
... A feasible security system model for the smart city environment was studied by Zhou and Luo (2017) using a combination of fuzzy logic and entropy weight methods. De Maio et al. (2016) proposed Time Aware Knowledge Extraction (TAKE), which applies fuzzy formal concept analysis to extract useful information from emerging data on social network platforms. Zeng et al. (2017) analysed the variations in the attribute values of fuzzy rough approximations. ...
Article
Over the last few years, Big Data has gained tremendous attention from the research community. The data being generated in huge quantities from almost every field is unstructured and unprocessed. Extracting a knowledge base and useful information from big raw data is one of the major challenges present today. Various computational intelligence and soft computing techniques have been proposed for efficient big data analytics. Fuzzy techniques are one of the soft computing approaches that can play a crucial role in current big data challenges by pre-processing and reconstructing data. There is a wide spread of application domains where traditional fuzzy sets (type-1 fuzzy sets) and higher-order fuzzy sets (type-2 fuzzy sets) have shown remarkable outcomes. Although this research domain of "fuzzy techniques in Big Data" is gaining some attention, there is a strong need to motivate researchers to explore this area further. In this paper, we have conducted a bibliometric study on recent developments in the field of "fuzzy techniques in big data". In the bibliometric study, various performance metrics, including total papers, total citations, and citations per paper, are calculated. Further, the top 10 most productive and highly cited authors, disciplines, source journals, countries, institutions, and highly influential papers are also evaluated. Later, a comparative analysis of fuzzy techniques in big data is performed after analysing the most influential works in this field.
... In Latin America, few findings are recorded (n=3). Regarding sample size, the studies show that it is possible to run experiments with small or large data samples (Han, Pei, and Kamber, 2011), ranging from 8 to 25,000 users (Bayne, 2015; Kuznetsov et al., 2015; Pill et al., 2017), as well as from 165 to more than one million tweets (De Maio et al., 2016; Kimmons et al., 2017); these findings indicate that sample sizes vary across experiments. ...
Article
Full-text available
In recent years, there has been growing interest among education stakeholders in incorporating ICT into their institutions, as is the case with social networks, which, far from being a problem and through guided use, make it possible to innovate traditional class sessions and improve communication between teachers and students. This study set two objectives: (1) to conduct a systematic literature review, searching for articles published between January 2007 and March 2019 in databases such as ACM, IEEE, ScienceDirect, and Springer, among others, to identify research that has applied data mining techniques for the extraction and analysis of Twitter data in higher education; and (2) to highlight the pedagogical practices that have incorporated Twitter and data mining to improve educational processes. Of the 315 articles retrieved, 65 that met the inclusion criteria were selected. The main results indicate that: (1) the most widely used data mining techniques are predictive, with classification tasks; (2) Twitter is used mainly to: (a) determine student perception; (b) share information, materials, and resources; (c) generate communication and participation; (d) foster skills; and (e) improve oral expression and academic performance; (3) the United States is the country with the largest number of studies; however, in Latin American countries the findings are few, which opens a field of research in this region; and (4) the studies included models, methods, strategies, theories, or instruments as pedagogical practice; thus, there is no consensus on how data extracted from Twitter could be incorporated into higher education to improve teaching and learning processes.
... Motivated by these concerns, this paper introduces a topic relation identification methodology after applying topic modeling to massive scientific literature. In the past five years, topic model-based approaches have attracted increasing interest in bibliometrics [8], [16], [17]. Wei and Croft [18] have demonstrated that topic models outperform most cluster-based approaches in information retrieval. ...
Article
Over the past five years, topic models have been applied to bibliometrics research as an efficient tool for discovering latent and potentially useful content. The combination of topic modeling algorithms and bibliometrics has generated new challenges in interpreting and understanding the outcome of topic modeling. Motivated by these new challenges, this paper proposes a systematic methodology for topic analysis in scientific literature corpora to address the concerns of conducting post-topic-modeling analysis. By linking the corpus metadata with the discovered topics, we characterize them with a number of topic-based analytic indices to explore their significance, developing trend, and received attention. A topic relation identification approach is then presented to quantitatively model the relations among the topics. To demonstrate the feasibility and effectiveness of our methodology, we present two case studies, using big data and dye-sensitized solar cell publications derived from searches in Web of Science. Possible application of the methodology in telling good stories of a target corpus is also explored to facilitate further research management and opportunity discovery.
... To model social network data, Kundu and Pal (2015) proposed a novel technique called the Fuzzy Granular Social Network (FGSN), combining granular computing and a fuzzy neighborhood approach. To tackle big data veracity, fuzzy formal concept analysis was used in De Maio et al. (2016) to handle noisy and redundant data. In Ramachandramurthy et al. (2015), the authors used fuzzy Bayesian inference to refine the dataset. ...
Article
Within big data, veracity refers to the uncertainty present in the dataset. The continuous flow of unstructured data with unwanted noise may introduce abnormalities into the dataset, making it unusable. In this paper, we propose a novel method to handle the veracity characteristic of big data using the concept of the footprint of uncertainty (FOU) in interval type-2 fuzzy sets (IT2 FSs). The proposed method helps in handling the veracity issue in big data and reduces the instances to a manageable extent. We have compared the results with existing clustering-based methods and examined the relationship between the clusters and the FOUs by comparing their centroids and defuzzified values. To scrutinize the validity of our results, we have also performed a number of additional experiments by appending extra instances to the datasets. To check its consistency and efficacy, the proposed methodology is assessed from three different aspects. Experimental results validate that the proposed method can suitably handle the veracity issue in big datasets and is efficient in reducing the instances.
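To illustrate the footprint-of-uncertainty idea referenced above, the sketch below encodes an interval type-2 fuzzy set as a pair of lower/upper memberships, measures the area between them, and defuzzifies by averaging two type-1 centroids. The membership shapes are assumptions, and the averaging is a crude stand-in for proper Karnik-Mendel type reduction, not the method of the cited paper.

```python
# A toy interval type-2 fuzzy set: lower and upper memberships bound the
# footprint of uncertainty (FOU); defuzzification here is a crude average of
# two type-1 centroids.
def tri(x, a, b, c):
    """Triangular type-1 membership."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def lower_mf(x):  # narrower, scaled triangle
    return 0.8 * tri(x, 3.0, 5.0, 7.0)

def upper_mf(x):  # wider triangle enclosing the lower one
    return tri(x, 2.0, 5.0, 8.0)

xs = [i / 10.0 for i in range(0, 101)]                         # domain [0, 10]
fou_area = sum(upper_mf(x) - lower_mf(x) for x in xs) * 0.1    # size of the FOU

def centroid(mf):
    num = sum(x * mf(x) for x in xs)
    den = sum(mf(x) for x in xs)
    return num / den

defuzzified = (centroid(lower_mf) + centroid(upper_mf)) / 2.0
print(round(fou_area, 2), round(defuzzified, 2))  # FOU size and crisp output (~5.0)
```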
... The topic modeling techniques use either dynamic topic modeling [24], hierarchical topic modeling [7] or keyword-based clustering [25]. The sub-event detection based techniques extract different sub-events from tweet streams through several features like tweet bursts [26], temporal changes in the tweet vocabulary set [27], [28], [29], [30], as well as information from external sources [31], [32]. Generative models like Hidden Markov Models that consider both burstiness and word distribution of the tweets [33] and Hierarchical Dirichlet Process based topic modeling have also been proposed [34]. ...
Article
Twitter has become an essential platform for news media sources to disseminate news. The opinions expressed through Twitter can be mined by news media sources to obtain users' reactions centered around different news articles. A comprehensive summary of the users' reactions with respect to a news article can be crucial for various reasons, such as: 1) understanding the sensitivity/importance of the news; 2) obtaining insights about the diverse opinions of the readers with respect to the news; and 3) understanding the key aspects that draw the interest of the readers. However, the selected summary tweets must fulfill multiple objectives, like relevance to the news article and diversity among the selected tweets, and they should cover the entire spectrum of opinions expressed through the tweets. Existing methods primarily attempt to identify a set of relevant tweets from which the summary tweets are selected so as to maintain the diversity and coverage requirements. However, the noise and the nontemporal behavior of the article-specific tweets make the identification of such relevant tweets extremely difficult, resulting in poor summary quality. In this paper, through empirical investigations, we show that initially identifying the diverse opinions can lead to better identification of the relevant tweets, i.e., following a specific ordering of the objectives can lead to an improved summary. We subsequently propose a tweet summarization technique that follows such a specific ordering. Validation of our proposed approach on 800 news articles with 2.1 billion related tweets shows that the proposed approach produces 11.6%-34.8% improvement in summary quality as compared to existing state-of-the-art techniques.
... Most publications rely on general-purpose techniques from traditional text summarization along with redundancy detection methods to avoid the repetition of contents in the summary (Inouye and Kalita 2011;Takamura et al. 2011). Social network specific signals (such as user connectivity and activity (Liu et al. 2012) and time-based features (Alsaedi et al. 2016;De Maio et al. 2016;He et al. 2017)) have also been widely exploited. ...
Article
Full-text available
Producing online reputation summaries for an entity (company, brand, etc.) is a focused summarization task with a distinctive feature: issues that may affect the reputation of the entity take priority in the summary. In this paper we (i) present a new test collection of manually created (abstractive and extractive) reputation reports which summarize tweet streams for 31 companies in the banking and automobile domains; (ii) propose a novel methodology to evaluate summaries in the context of online reputation monitoring, which profits from an analogy between reputation reports and the problem of diversity in search; and (iii) provide empirical evidence that producing reputation reports is different from a standard summarization problem, and incorporating priority signals is essential to address the task effectively.
... Meanwhile, there are real-life problems in which massive parallelization of computations on Apache Hadoop or Spark, and the use of scalable environments like the Cloud, brought significant improvements in the performance of data processing and analysis. The big data challenge was observed and addressed in various works devoted to intelligent transport and smart cities [11,19,42,43,74,75,84], water monitoring [12,22,90], social network analysis [13,14,77], multimedia processing [72,82], the internet of things (IoT) [9], social media monitoring [50], life sciences [3,31,32,44,58,69] and disease data analysis [6,45,81], telecommunication [27], and finance [2], to mention just a few. Many hot issues in various sub-fields of bioinformatics were also solved with the use of Big Data ecosystems and Cloud computing, e.g., mapping next-generation sequencing data to the human genome and other reference genomes, for use in a variety of biological analyses including SNP discovery, genotyping and personal genomics [65], sequence analysis and assembly [17,30,34,35,47,62], multiple alignments of DNA and RNA sequences [86,91], codon analysis with local MapReduce aggregations [63], NGS data analysis [8], phylogeny [24,48], proteomics [37], analysis of protein-ligand binding sites [23], and others. ...
Article
Full-text available
Intrinsically disordered proteins (IDPs) constitute a significant part of the proteins that exist and act in the cells of living organisms. IDPs play key roles in central cellular processes, and some of them are closely related to various human diseases, like cancer or neurodegenerative disorders. Identification of IDPs and studying their structural characteristics have become an important part of structural bioinformatics and structural genomics. However, the growing amount of genomic and protein sequences in public repositories puts pressure on existing methods for the identification of IDPs. Large volumes of protein amino acid sequences need to be analyzed in terms of their propensity to form disordered regions, and this task requires novel tools and scalable platforms to cope with this big biological data challenge. In this paper, we show how the identification of disordered regions of 3D protein structures can be efficiently accelerated with the use of an Apache Spark cluster established and scaled on the public Cloud. For this purpose, we propose a Spark-based meta-predictor (Spark-IDPP), which enables efficient prediction of disordered regions of proteins on a large scale. Results of our performance tests show that, for large data sets, our method achieves almost linear speedup when scaling out the computations on a 32-node Spark cluster located in the Azure cloud. This proves that through appropriate partitioning of data and by increasing the degree of parallelism, we can significantly improve the efficiency of IDP predictions. Additionally, by using several basic predictors, aggregating their ranks in various consensus modes, and filtering the final outcome with a dedicated fuzzy filter, Spark-IDPP increases the quality of predictions.
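The data-parallel pattern behind such Spark-based acceleration can be sketched as follows in PySpark; the per-sequence scoring function is a toy placeholder (fraction of disorder-promoting residues) and does not reproduce the Spark-IDPP ensemble, its consensus modes, or its fuzzy filter.

```python
# A minimal PySpark sketch of the partition-and-map pattern: protein
# sequences are distributed across the cluster and a per-sequence predictor
# runs on each partition. The predictor below is a toy placeholder.
from pyspark import SparkContext

DISORDER_PRONE = set("PESQKRG")  # residues often enriched in disordered regions

def toy_disorder_score(record):
    """(id, sequence) -> (id, naive disorder propensity in [0, 1])."""
    seq_id, seq = record
    hits = sum(1 for aa in seq if aa in DISORDER_PRONE)
    return seq_id, hits / max(len(seq), 1)

if __name__ == "__main__":
    sc = SparkContext("local[*]", "idp-sketch")          # scale out by changing the master URL
    sequences = [("P1", "MKPSSEQQKRG"), ("P2", "MWLVAILFA")]
    scores = (sc.parallelize(sequences, numSlices=2)     # partition the input
                .map(toy_disorder_score)                 # run the predictor per sequence
                .collect())
    print(scores)
    sc.stop()
```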
... We observe these posts in a time interval, divided into temporal windows (say, for example, morning, afternoon and evening). By exploiting known techniques to extract concepts from these tweets [3], [4], we are able to give a semantic value to these tweets in order to obtain a measure of similarity among them. Then we formally model users, advertisements, time intervals, temporal windows and this similarity measure through rough sets to get a set of the most suitable advertisements for a given user in a particular temporal window within the time interval under observation. ...
... Meanwhile, there are real-life problems in which massive parallelization of computations on Apache Hadoop or Spark, and the use of scalable environments like the Cloud, brought significant improvements in the performance of data processing and analysis. The big data challenge was observed and addressed in various works devoted to intelligent transport and smart cities [11,19,42,43,74,75,84], water monitoring [12,22,90], social network analysis [13,14,77], multimedia processing [72,82], the internet of things (IoT) [9], social media monitoring [50], life sciences [3,31,32,44,58,69] and disease data analysis [6,45,81], telecommunication [27], and finance [2], to mention just a few. Many hot issues in various sub-fields of bioinformatics were also solved with the use of Big Data ecosystems and Cloud computing, e.g., mapping next-generation sequencing data to the human genome and other reference genomes, for use in a variety of biological analyses including SNP discovery, genotyping and personal genomics [65], sequence analysis and assembly [17,30,34,35,47,62], multiple alignments of DNA and RNA sequences [86,91], codon analysis with local MapReduce aggregations [63], NGS data analysis [8], phylogeny [24,48], proteomics [37], analysis of protein-ligand binding sites [23], and others. ...
Chapter
Intrinsically disordered proteins (IDPs) constitute a wide range of molecules that act in cells of living organisms and mediate many protein–protein interactions and many regulatory processes. Computational identification of disordered regions in protein amino acid sequences, thus, became an important branch of 3D protein structure prediction and modeling. In this chapter, we will see the IDP meta-predictor that applies an ensemble of primary predictors in order to increase the quality of IDP prediction. We will also see the highly scalable implementation of the meta-predictor on the Spark cluster (Spark-IDPP) that mitigates the problem of the exponentially growing number of protein amino acid sequences in public repositories. Spark-IDPP responds very well to the current needs of IDP prediction by parallelizing computations on the Spark cluster that can be scaled on demand on the Microsoft Azure cloud according to particular requirements for computing power.
... • social networks [6,7,8,11,13,14,31,32], • multimedia processing [29], • internet of things (IoT) [3], • intelligent transport [15,12,30,16], • medicine and bioinformatics [22,21,20,24], • finance [1], and many others [19,33]. ...
Chapter
Scientific solutions presented in this book rely on various technologies that emerged in computer science. Some of them emerged recently and are quite new in the bioinformatics field. Some of them have been widely used for many years in developing efficient and reliable IT systems supporting various forms of business, but are not frequently used in bioinformatics. This chapter provides a technological road map for the solutions presented in this book. It covers a brief introduction to the concept of cloud computing, cloud service, and deployment models. It also defines the Big Data challenge and presents the benefits of using multi-threading in scientific computations. It then explains graphics processing units (GPUs) and the CUDA architecture. Finally, it focuses on relational databases and the SQL language used for declarative querying.
... Because of its excellent performance in knowledge representation and extraction from large volumes of unstructured data [54], FCA is widely used in fields such as knowledge discovery, ontology learning [55], information retrieval and recommender systems [56] to extract useful information and to construct knowledge graphs for data organization and visualization [57]. ...
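For readers unfamiliar with FCA, the following minimal sketch enumerates the formal concepts of a small, hypothetical object-attribute context by closing every attribute subset; it illustrates the general technique only, not any specific system cited above:

from itertools import combinations

# hypothetical formal context: object -> set of attributes
context = {
    "tweet1": {"election", "results"},
    "tweet2": {"election", "debate"},
    "tweet3": {"election", "results", "debate"},
}
attributes = set().union(*context.values())

def extent(attrs):          # objects sharing all attributes in attrs
    return {o for o, a in context.items() if attrs <= a}

def intent(objs):           # attributes shared by all objects in objs
    return set.intersection(*(context[o] for o in objs)) if objs else set(attributes)

# every formal concept is (extent(B), intent(extent(B))) for some attribute set B
concepts = set()
for r in range(len(attributes) + 1):
    for subset in combinations(sorted(attributes), r):
        objs = extent(set(subset))
        concepts.add((frozenset(objs), frozenset(intent(objs))))

for ext, inte in sorted(concepts, key=lambda c: len(c[0])):
    print(sorted(ext), "<->", sorted(inte))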
Article
Activity recognition is one of the most important prerequisites for smart home applications. It is a challenging topic due to the high requirements for reliable data acquisition and efficient data analysis. Besides, the heterogeneous layouts of smart homes, the number of residents and varied human behavioral patterns also aggravate the complexity of recognition. Therefore, most human activity recognition systems are based on an unrealistic assumption that there is only one resident performing activities. In this paper, we investigate the issue of multi-resident activity recognition and propose a knowledge-driven solution on the basis of formal concept analysis (FCA) to identify human activities from non-intrusive sensor data. We extract the ontological correlations among sequential behavioral patterns. At the same time, these correlations are well organized in a graphical knowledge base, without intervention from domain experts. We propose an incremental lattice search strategy in order to retrieve the best inference given a few sensor events. Compared with other conventional probabilistic methods, our solution outperforms on the CASAS multi-resident benchmark dataset. Furthermore, we open up a promising solution of sequential pattern mining to discover the ontological features of temporal and sequential sensor data.
... First, our summarization process does not currently consider the time of occurrence of the tweets. Motivated by the Time-Aware Knowledge Extraction (TAKE) methodology presented in De Maio et al. (2016), we plan to extend our summarization process to incorporate the temporal evolution of tweets by identifying temporal peaks of tweet frequency through analyzing their timestamps. Second, it would be interesting to evaluate the performance of the proposed summarization process in the context of question-answering systems. ...
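A minimal sketch of the kind of temporal-peak detection mentioned in this excerpt, using hypothetical timestamps and a simple mean-plus-standard-deviation threshold rather than any published detector:

from datetime import datetime, timedelta
from statistics import mean, stdev

# hypothetical tweet timestamps for one topic
start = datetime(2024, 1, 1)
timestamps = [start + timedelta(minutes=m) for m in
              [1, 2, 3, 30, 31, 32, 33, 34, 35, 36, 90, 91, 150]]

def peak_windows(timestamps, window=timedelta(minutes=15), k=1.0):
    """Bin timestamps into fixed windows and flag windows whose count
    exceeds mean + k * std of all window counts."""
    first, last = min(timestamps), max(timestamps)
    counts = []
    t = first
    while t <= last:
        counts.append(((t, t + window),
                       sum(1 for ts in timestamps if t <= ts < t + window)))
        t += window
    values = [c for _, c in counts]
    threshold = mean(values) + k * stdev(values)
    return [(w, c) for w, c in counts if c > threshold]

for (lo, hi), c in peak_windows(timestamps):
    print(f"{lo:%H:%M}-{hi:%H:%M}: {c} tweets (peak)")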
Article
Full-text available
Recent advances in microblog content summarization have primarily viewed this task in the context of traditional multi-document summarization techniques where a microblog post or a collection of posts forms one document. While these techniques already facilitate information aggregation, categorization and visualization of microblog posts, they fall short in two aspects: i) when summarizing a certain topic from microblog content, not all existing techniques take topic polarity into account. This is an important consideration in that the summarization of a topic should cover all aspects of the topic, and hence taking polarity (sentiment) into account can lead to the inclusion of the less popular polarity in the summarization process. ii) Some summarization techniques produce summaries at the topic level. However, a given topic can have more than one important aspect that needs to be represented in the summarization process. Our work in this paper addresses these two challenges by considering both topic sentiments and topic aspects in tandem. We compare our work with state-of-the-art Twitter summarization techniques and show that our method is able to outperform existing methods on standard metrics such as ROUGE-1.
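Since the comparison above is reported in ROUGE-1, a minimal sketch of that metric (unigram precision, recall, and F1 between a candidate and a reference summary, without the stemming and stop-word options of the official toolkit):

from collections import Counter

def rouge_1(candidate, reference):
    """Unigram precision, recall and F1 between two summaries."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())      # clipped unigram matches
    precision = overlap / max(sum(cand.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    f1 = (2 * precision * recall / (precision + recall)) if overlap else 0.0
    return precision, recall, f1

p, r, f = rouge_1("storm hits the coast tonight",
                  "a severe storm hits the northern coast")
print(f"ROUGE-1  P={p:.2f}  R={r:.2f}  F1={f:.2f}")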
... Therefore, automatic labelling of key phrases from forum text increases the chance of getting an answer, which is the primary intention of posting a query on the web. In the literature, we find several works on automatic summarization of general-domain text or Twitter-like microblog text [2,4,7,8], but we are unable to find any work on summarization of diagnosis questions. To develop the system, we use a hybrid strategy in which a set of statistical measures, syntactic information and semantic information are combined. ...
... The era of Big data that we entered several years ago has changed our imagination about the type and the volume of data that can be processed, as well as the value of data. This is now visible in many fields which are experiencing an explosion of data that are considered relevant, including social networks [1], [2], [3], [4], [5], [6], [7], [8], multimedia processing [9], internet of things (IoT) [10], intelligent transport [11], [12], [13], [14], medicine and bioinformatics [15], finance [16], and many others [17], [18], that face the problem of big data. The big data problem (or opportunity) usually arises when data sets are so large that the conventional database management and data analysis tools are insufficient to process them [19]. ...
Article
In recent years, many fields have experienced a sudden proliferation of data, which increases both the volume of data that must be processed and the variety of formats in which the data are stored. This puts pressure on existing compute infrastructures and data analysis methods, as more and more data are considered a useful source of information for making critical decisions in particular fields. Among these fields are several areas related to human life, e.g., various branches of medicine, where the uncertainty of data complicates the data analysis, and where the inclusion of fuzzy expert knowledge in data processing brings many advantages. In this paper, we show how fuzzy techniques can be incorporated in Big Data analytics carried out with the declarative U-SQL language over a Big Data Lake located on the Cloud. We define the concept of the Big Data Lake together with the Extract, Process, and Store (EPS) process performed while schematizing and processing data from the Data Lake, and while storing results of the processing. Our solution, developed as a Fuzzy Search Library for Data Lake, introduces the possibility of (1) massively-parallel, declarative querying of the Big Data Lake with simple and complex fuzzy search criteria, (2) using fuzzy linguistic terms in various data transformations, and (3) fuzzy grouping. The presented ideas are exemplified by a distributed analysis of large volumes of biomedical data on the Microsoft Azure cloud. Results of the performed tests confirm that the presented solution is highly scalable on the Cloud and is a successful step toward soft and declarative processing of data on a large scale. The solution presented in this paper directly addresses three characteristics of Big Data, i.e., volume, variety, and velocity, and indirectly addresses veracity and value.
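A toy illustration of fuzzy linguistic terms used as search criteria, in plain Python rather than the U-SQL library the paper describes; the records and the membership function are hypothetical:

def trapezoid(x, a, b, c, d):
    """Trapezoidal membership function for a linguistic term."""
    if x <= a or x >= d:
        return 0.0
    if b <= x <= c:
        return 1.0
    return (x - a) / (b - a) if x < b else (d - x) / (d - c)

# hypothetical linguistic term "high blood pressure" (systolic, mmHg)
def high_pressure(systolic):
    return trapezoid(systolic, 125, 140, 200, 220)

records = [("patient_a", 118), ("patient_b", 136), ("patient_c", 162)]

# fuzzy selection: keep records whose membership exceeds a cut-off,
# and report the degree instead of a hard yes/no answer
for pid, sys_bp in records:
    mu = high_pressure(sys_bp)
    if mu >= 0.5:
        print(f"{pid}: membership in 'high blood pressure' = {mu:.2f}")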
Article
Full-text available
Unexpected events occur frequently, and predicting network public opinion about them is an important research direction. Current public opinion prediction models mostly focus on improving model accuracy and rarely explore the laws governing how opinion spreads. This study analyzes how emergencies propagate on microblogs, introducing the implicit relationships among emotion vectors, user browsing behavior, and the emergencies themselves. It also studies the factors that cause fluctuations in the microblog transmission of emergencies and selects the grey prediction model as a starting point. The defects of that model are analyzed: it suffers from a constant-increment problem and lacks the ability to deal with interference factors, so a metabolic grey prediction model is used for predicting microblog emergencies. The concept of an incremental coefficient is introduced and a hybrid fuzzy neural network is adopted, with emotional knowledge treated as the key factor affecting the increment of the grey prediction model. The fuzzy neural network is used to analyze how microblog emotional data are generated, yielding a hybrid public opinion prediction model that combines the fuzzy neural network with the grey prediction model. In the experiments, the performance of the optimized prediction model is compared with that of the original prediction model, and extensive data analyses show that the optimized prediction model is effective. The experimental results show that the optimized prediction model has higher prediction accuracy.
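A minimal NumPy sketch of the classical GM(1,1) grey prediction model that the study takes as its starting point, on hypothetical daily post counts; the metabolic and fuzzy-neural extensions described above are not reproduced:

import numpy as np

def gm11(x0, horizon=3):
    """Classical GM(1,1): fit on the series x0 and forecast `horizon` steps."""
    x0 = np.asarray(x0, dtype=float)
    x1 = np.cumsum(x0)                                 # accumulated series
    z1 = 0.5 * (x1[1:] + x1[:-1])                      # background values
    B = np.column_stack((-z1, np.ones(len(z1))))
    Y = x0[1:]
    a, b = np.linalg.lstsq(B, Y, rcond=None)[0]        # development / grey input coefficients
    k = np.arange(len(x0) + horizon)
    x1_hat = (x0[0] - b / a) * np.exp(-a * k) + b / a  # fitted accumulated series
    x0_hat = np.concatenate(([x1_hat[0]], np.diff(x1_hat)))
    return x0_hat[len(x0):]                            # the forecast part only

# hypothetical daily counts of posts about an unfolding emergency
history = [120, 150, 190, 240, 300, 380]
print("next 3 days:", np.round(gm11(history), 1))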
Article
The purpose of this study is to investigate the status and evolution of scientific research on the effect of social networks on big data and on the use of big data for modeling the behavior of social network users. This paper presents a comprehensive review of the studies associated with big data in social media. The study uses the Scopus database as the primary search engine and covers 2000 highly cited articles over the period 2012-2019. The records are statistically analyzed and categorized according to different criteria. The findings show that research has grown exponentially since 2014 and the trend has continued at relatively stable rates. Based on the survey, decision support systems is the keyword with the highest density, followed by heuristic methods. Among the most cited articles, papers published by researchers in the United States have received the highest number of citations (7548), followed by the United Kingdom (588) and China with 543 citations. Thematic analysis shows that the subject has remained an important and well-developed research field, and better results can be obtained by combining this research with "big data analytics" and "twitter", which are important topics in this field but not yet well developed.
Article
Microblog summarization systems are gaining importance during natural disasters. A lot of tweets are posted along with multimedia content during the occurrence of any natural disaster event. Extracting relevant information/summary from these tweets is important for the smooth functioning of the rescue operation. Moreover, because of the limited size of the tweets, in many cases, tweets are associated with images. The current work is the first of its kind where both the image and the tweet text are utilized simultaneously to generate a summary from microblog data generated during a disaster event. Different aspects, such as syntactic similarity, the maximum length of the tweets, retweet score, and antiredundancy, are considered as objective functions and those are simultaneously optimized using a metaheuristic population-based evolutionary strategy to select a good set of tweets to form a good quality summary. In order to extract information from images, a dense captioning model is utilized and the dense captions are further utilized for calculating the antiredundancy measure. We employed word mover distance to capture the semantic similarity between two tweets. Due to the unavailability of the dataset for multimodal microblog summarization tasks in a disaster-event scenario, datasets are created and made openly available to the community. The obtained summarization results are evaluated using the well-known ROUGE measure.
Article
Social media (SM) are among the most widespread and fastest-growing data-generating applications on the Internet, which has increased the study of these data. However, the efficient processing of such massive data is challenging, so we require systems that can learn from these data, such as machine learning. Machine learning methods enable systems to learn by themselves. Many papers on SM analysis using machine learning approaches have been published over the past few decades. In this paper, we provide a comprehensive survey of multiple applications of SM analysis using robust machine learning algorithms. Initially, we discuss a summary of machine learning algorithms that are used in SM analysis. After that, we provide a detailed survey of machine learning approaches to SM analysis. Furthermore, we summarize the challenges and benefits of machine learning usage in SM analysis. Finally, we present open issues in SM analysis for further research.
Article
In online learning, the dropout phenomenon is a relevant issue to address with practical solutions. Several data sets stimulate original and resolutive data analysis approaches, demonstrating the importance of the dropout phenomenon. This study proposes a novel approach to predicting massive open online course (MOOC) students at risk of dropout, stressing the need to consider the temporal dimension of the data log. The proposal aims to build a data-driven decision support system able to identify students at risk of dropout based on the conceptualization of such students' behavior and its evolution along the time dimension. The primary theoretical model behind the proposed method is formal concept analysis and its temporal extension (i.e., temporal concept analysis), used for analyzing timestamped data and deriving a timed lattice. The main result of the paper is a method to extract behavioral patterns of MOOC students at risk of dropout. Such patterns are defined as Time-based Behavior Rules extracted from the aforementioned timed lattice obtained through the preprocessing of MOOC platform log files. The resulting rule set can be easily integrated for implementing educational DSS, as shown in the last part of the paper. The conducted experiments reveal promising results in terms of F-score and students' monitoring time.
Article
Full-text available
The Internet has become a distribution center for ideological and cultural information and an amplifier of public opinion. Mining, analyzing, and studying hot public opinions on the Internet is an important means of fully understanding what netizens are thinking and doing. This paper analyzes the functions of a public opinion system and introduces the key technologies for predicting network public opinion.
Article
Full-text available
In this paper, we have proposed a fusion of two architectures, self-organizing map and granular self-organizing map (SOM + GSOM), for solving the microblog summarization task, in which a set of relevant tweets is extracted from the available set of tweets. SOM is used to reduce the available set of tweets to a smaller subset, and GSOM is used for extracting relevant tweets. A SOM + SOM fusion is also evaluated to illustrate the effectiveness of GSOM over SOM in the second architecture, and a SOM-only version is used to illustrate the benefit of fusion in our proposed approaches. Since similarity/dissimilarity measures play a major role in any summarization system, various measures such as word mover distance, cosine distance and Euclidean distance are also explored to compare tweets. The results obtained are evaluated on four datasets related to disaster events using ROUGE measures. Experimental results demonstrate that our best proposed approach (SOM + GSOM) has obtained 17% and 5.9% improvements in terms of ROUGE-2 and ROUGE-L scores, respectively, over the existing techniques. The results are also validated using a statistical significance t-test.
Article
In recent years, social networking sites such as Twitter have become the primary sources for real-time information on ongoing events such as political rallies, natural disasters, and so on. At the time of occurrence of natural disasters, it has been seen that relevant information collected from tweets can help in different ways. Therefore, there is a need to develop an automated microblog/tweet summarization system to automatically select relevant tweets. In this article, we employ the concept of multiobjective optimization in microblog summarization to produce good quality summaries. Different statistical quality measures, namely length, tf-idf score of the tweets, and antiredundancy, measuring different aspects of the summary, are optimized simultaneously using the search capability of a multiobjective differential evolution technique. Different types of genetic operators, including a recently developed self-organizing map (a type of neural network) based operator, are explored in the proposed framework. To measure the similarity between tweets, word mover distance is utilized, which is capable of capturing the semantic similarity between tweets. For evaluation, four benchmark data sets related to disaster events are used, and the results obtained are compared with various state-of-the-art techniques using ROUGE measures. It has been found that our algorithm improves by 62.37% and 5.65% in terms of ROUGE-2 and ROUGE-L scores, respectively, over the state-of-the-art techniques. Results are also validated using a statistical significance t-test. At the end of the article, the extension of the proposed approach to solve the multidocument summarization task is also illustrated.
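As a much-simplified stand-in for the idea of balancing informativeness against redundancy, the following greedy sketch scores hypothetical tweets by tf-idf mass minus a redundancy penalty; the paper's actual multiobjective differential evolution and genetic operators are not reproduced here:

import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

tweets = [
    "flood waters rising near the old bridge, roads closed",
    "roads closed near the bridge because of flooding",
    "rescue teams deployed to the eastern district tonight",
    "power outage reported in the eastern district",
    "stay away from the old bridge, flood waters rising",
]

X = TfidfVectorizer().fit_transform(tweets)
scores = np.asarray(X.sum(axis=1)).ravel()       # informativeness: tf-idf mass
sim = cosine_similarity(X)                       # pairwise tweet similarity

def greedy_summary(k=3, redundancy_weight=0.7):
    selected = []
    while len(selected) < k:
        best, best_val = None, -np.inf
        for i in range(len(tweets)):
            if i in selected:
                continue
            redundancy = max((sim[i][j] for j in selected), default=0.0)
            val = scores[i] - redundancy_weight * redundancy
            if val > best_val:
                best, best_val = i, val
        selected.append(best)
    return [tweets[i] for i in selected]

for t in greedy_summary():
    print("-", t)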
Article
Due to the considerable growth of the volume of text documents on the Internet and in digital libraries, manual analysis of these documents is no longer feasible. Having efficient approaches to keyword extraction in order to retrieve the ‘key’ elements of the studied documents is now a necessity. Keyword extraction has been an active research field for many years, covering various applications in Text Mining, Information Retrieval, and Natural Language Processing, and meeting different requirements. However, it is not a unified domain of research. In spite of the existence of many approaches in the field, there is no single approach that effectively extracts keywords from different data sources. This shows the importance of having a comprehensive review, which discusses the complexity of the task and categorizes the main approaches of the field based on the features and methods of extraction that they use. This paper presents a general introduction to the field of keyword/keyphrase extraction. Unlike the existing surveys, different aspects of the problem along with the main challenges in the field are discussed. This mainly includes the unclear definition of ‘keyness’, complexities of targeting proper features for capturing desired keyness properties and selecting efficient extraction methods, and also the evaluation issues. By classifying a broad range of state-of-the-art approaches and analysing the benefits and drawbacks of different features and methods, we provide a clearer picture of them. This review is intended to help readers find their way around all the works related to keyword extraction and guide them in choosing or designing a method that is appropriate for the application they are targeting.
Article
Full-text available
The purpose of this research is to investigate the status and the evolution of the scientific studies on the effect of social networks on big data and the usage of big data for modeling the behavior of social network users. This paper presents a comprehensive review of the studies associated with big data in social media. The study uses the Scopus database as the primary search engine and covers 2000 highly cited articles over the period 2012-2019. The records are statistically analyzed and categorized in terms of different criteria. The findings show that research has grown exponentially since 2014 and the trend has continued at relatively stable rates. Based on the survey, decision support systems is the keyword with the highest density, followed by heuristics methods. Among the most cited articles, papers published by researchers in the United States have received the highest citations (7548), followed by the United Kingdom (588) and China with 543 citations. The thematic analysis shows that the subject has remained an important and well-developed research field, and for better results this research can be merged with “big data analytics” and “twitter”, which are important topics in this field but not yet well developed.
Article
There are two fundamental difficulties that are still hindering the development of microblog summarization. The first is the feature sparseness of microblogs, which restricts the performance of sub-topic detection. The second is sentence selection from sub-topics, which is based mainly on centrality approaches to measure sentence salience; the semantic features and relation features between sentences and sub-topics have not been given much attention. In order to address the two aforementioned problems, we propose a summarization method considering Paragraph Vector and semantic structure. Firstly, we construct a sentence similarity matrix that involves the contextual information of microblogs to detect sub-topics by using Paragraph Vector. Secondly, we analyze the sentences by utilizing the Chinese Sentential Semantic Model (CSM) to get semantic features; then the relation features are obtained based on the similarity matrix and the semantic features above. Finally, the most informative sentences can be selected accurately from microblogs belonging to the same sub-topics by semantic features and relation features. The experimental results show that the ROUGE-1 value is up to 53.17% with a 1.5% compression ratio. The results indicate that applying Paragraph Vector to the field of microblog summarization can effectively improve sub-topic detection. Additionally, semantic features and relation features jointly enhance the summarization results. Furthermore, CSM provides a promising tool for sentence semantic analysis.
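A minimal Paragraph Vector sketch using gensim's Doc2Vec on hypothetical posts, followed by k-means to group them into sub-topics; the Chinese Sentential Semantic Model and the relation features described above are not reproduced:

from gensim.models.doc2vec import Doc2Vec, TaggedDocument
from sklearn.cluster import KMeans

posts = [
    "team wins the championship after extra time",
    "fans celebrate the championship in the city square",
    "heavy rain expected over the weekend",
    "weekend forecast warns of heavy rain and wind",
]
corpus = [TaggedDocument(words=p.split(), tags=[i]) for i, p in enumerate(posts)]

# train Paragraph Vector on the (tiny, illustrative) corpus
model = Doc2Vec(vector_size=32, min_count=1, epochs=60, seed=1)
model.build_vocab(corpus)
model.train(corpus, total_examples=model.corpus_count, epochs=model.epochs)

# infer one vector per post, then cluster into sub-topics
vectors = [model.infer_vector(p.split()) for p in posts]
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(vectors)
for post, label in zip(posts, labels):
    print(label, post)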
Article
Full-text available
Social media services such as Twitter generate a phenomenal volume of content for most real-world events on a daily basis. Digging through the noise and redundancy to understand the important aspects of the content is a very challenging task. We propose a search and summarization framework to extract relevant representative tweets from a time-ordered sample of tweets to generate a coherent and concise summary of an event. We introduce two topic models that take advantage of temporal correlation in the data to extract relevant tweets for summarization. The summarization framework has been evaluated using Twitter data on four real-world events. Evaluations are performed using Wikipedia articles on the events as well as using Amazon Mechanical Turk (MTurk) with human readers (MTurkers). Both experiments show that the proposed models outperform traditional LDA and lead to informative summaries.
Conference Paper
Full-text available
Grooming is the process by which pedophiles try to find children on the internet for sex-related purposes. In chat conversations they may try to establish a connection and escalate the conversation towards a physical meeting. To date, no effective methods exist for quickly analyzing the contents, evolution over time, present state and threat level of these chat conversations. In this paper we propose a novel method based on Temporal Relational Semantic Systems, the main structure in the temporal and relational version of Formal Concept Analysis. For rapidly gaining insight into the topics of chat conversations we combine a linguistic ontology for chat terms with conceptual scaling and represent the dynamics of chats by life tracks in nested line diagrams. To showcase the possibilities of our approach we used chat conversations of a private American organization which actively searches for pedophiles on the internet.
Article
Full-text available
During recent years, socially generated content has become pervasive on the World Wide Web. The enormous amount of content generated in blog sites, social networking sites such as Facebook and Myspace, and encyclopedic sites such as Wikipedia has not only empowered ordinary users of the Web but also contributed to the vastness as well as richness of the Web's contents. In this paper, we focus on a recent trend called microblogging, and in particular a site called Twitter that allows a huge number of users to contribute frequent short messages. The content of such a site is an extraordinarily large number of small textual messages, posted by millions of users, at random or in response to perceived events or situations. However, out of such random and massive disorganization of messages, trends usually emerge as a large number of users post similar messages on similar topics. These trends can be discovered using statistical analysis of the mass of posts. We have developed an algorithm that takes a trending phrase or any phrase specified by a user, collects a large number of posts containing the phrase, and provides an automatically created summary of the posts related to the term. We present examples of summaries we produce along with an initial qualitative evaluation. It is possible to get a global view of the content of the text message repository in terms of a set of short summaries of trending terms during the course of a period of time such as an hour or a day.
Conference Paper
Full-text available
In traditional text clustering methods, documents are represented as "bags of words" without considering the semantic information of each document. For instance, if two documents use different collections of core words to represent the same topic, they may be falsely assigned to different clusters due to the lack of shared core words, although the core words they use are probably synonyms or semantically associated in other forms. The most common way to solve this problem is to enrich document representation with the background knowledge in an ontology. There are two major issues for this approach: (1) the coverage of the ontology is limited, even for WordNet or Mesh, (2) using ontology terms as replacement or additional features may cause information loss, or introduce noise. In this paper, we present a novel text clustering method to address these two issues by enriching document representation with Wikipedia concept and category information. We develop two approaches, exact match and relatedness-match, to map text documents to Wikipedia concepts, and further to Wikipedia categories. Then the text documents are clustered based on a similarity metric which combines document content information, concept information as well as category information. The experimental results using the proposed clustering framework on three datasets (20-newsgroup, TDT2, and LA Times) show that clustering performance improves significantly by enriching document representation with Wikipedia concepts and categories.
Conference Paper
Full-text available
Microblogs are a tremendous repository of user-generated content about world events. However, for people trying to understand events by querying services like Twitter, a chronological log of posts makes it very difficult to get a detailed understanding of an event. In this paper, we present TwitInfo, a system for visualizing and summarizing events on Twitter. TwitInfo allows users to browse a large collection of tweets using a timeline-based display that highlights peaks of high tweet activity. A novel streaming algorithm automatically discovers these peaks and labels them meaningfully using text from the tweets. Users can drill down to subevents, and explore further via geolocation, sentiment, and popular URLs. We contribute a recall-normalized aggregate sentiment visualization to produce more honest sentiment overviews. An evaluation of the system revealed that users were able to reconstruct meaningful summaries of events in a small amount of time. An interview with a Pulitzer Prize-winning journalist suggested that the system would be especially useful for understanding a long-running event and for identifying eyewitnesses. Quantitatively, our system can identify 80-100% of manually labeled peaks, facilitating a relatively complete view of each event studied.
Conference Paper
Full-text available
Television broadcasters are beginning to combine social micro-blogging systems such as Twitter with television to create social video experiences around events. We looked at one such event, the first U.S. presidential debate in 2008, in conjunction with aggregated ratings of message sentiment from Twitter. We begin to develop an analytical methodology and visual representations that could help a journalist or public affairs person better understand the temporal dynamics of sentiment in reaction to the debate video. We demonstrate visuals and metrics that can be used to detect sentiment pulse, anomalies in that pulse, and indications of controversial topics that can be used to inform the design of visual analytic systems for social media events.
Conference Paper
Full-text available
This paper presents algorithms for summarizing microblog posts. In particular, our algorithms process collections of short posts on specific topics on the well-known site called Twitter and create short summaries from these collections of posts on a specific topic. The goal is to produce summaries that are similar to what a human would produce for the same collection of posts on a specific topic. We evaluate the summaries produced by the summarizing algorithms, compare them with human-produced summaries and obtain excellent results.
Conference Paper
Full-text available
This paper presents online topic model (OLDA), a topic model that automatically captures the thematic patterns and identifies emerging topics of text streams and their changes over time. Our approach allows the topic modeling framework, specifically the latent Dirichlet allocation (LDA) model, to work in an online fashion such that it incrementally builds an up-to-date model (mixture of topics per document and mixture of words per topic) when a new document (or a set of documents) appears. A solution based on the empirical Bayes method is proposed. The idea is to incrementally update the current model according to the information inferred from the new stream of data with no need to access previous data. The dynamics of the proposed approach also provide an efficient means to track the topics over time and detect emerging topics in real time. Our method is evaluated both qualitatively and quantitatively using benchmark datasets. In our experiments, OLDA has discovered interesting patterns by analyzing just a fraction of data at a time. Our tests also demonstrate the ability of OLDA to align the topics across epochs, with which the evolution of the topics over time is captured. OLDA is also comparable to, and sometimes better than, the original LDA in predicting the likelihood of unseen documents.
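gensim's LdaModel offers a broadly similar incremental update mechanism; the sketch below illustrates online topic updating on a hypothetical two-batch stream, not the exact OLDA algorithm of the paper:

from gensim import corpora
from gensim.models import LdaModel

batch_1 = [["election", "votes", "poll"], ["poll", "results", "election"]]
batch_2 = [["storm", "flood", "rescue"], ["flood", "warning", "storm"]]

# for simplicity the vocabulary is fixed up front rather than grown per epoch
dictionary = corpora.Dictionary(batch_1 + batch_2)
corpus_1 = [dictionary.doc2bow(doc) for doc in batch_1]
corpus_2 = [dictionary.doc2bow(doc) for doc in batch_2]

lda = LdaModel(corpus_1, id2word=dictionary, num_topics=2,
               random_state=1, passes=10)
print("after batch 1:", lda.print_topics())

lda.update(corpus_2)          # incremental update, no re-training from scratch
print("after batch 2:", lda.print_topics())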
Article
Full-text available
In this paper we face the problem of specifying and verifying security protocols where temporal aspects explicitly appear in the description. For these kinds of protocols we have designed a specification formalism, which consists of a state-transition graph for each participant of the protocol, with edges labelled by trigger/action clauses. The specification of a protocol is translated into a Timed Automaton on which standard model checking techniques can be exploited (properties to be checked can be expressed in a linear/branching untimed/timed temporal logic). Throughout the presentation we use, as a running example, a two-party non-repudiation protocol for which we show how our framework applies in the verification of the fairness property of the protocol (establishing whether there is a step of the protocol in which one of the two participants can take any advantage over the other).
Article
Full-text available
A hierarchical state machine (Hsm) is a finite state machine where a vertex can either expand to another hierarchical state machine (box) or be a basic vertex (node). Each node is labeled with atomic propositions. We study an extension of this model which allows atomic propositions to label boxes as well (Shsm). We show that Shsms can be exponentially more succinct than Hsms, and that verification is in general harder by an exponential factor. We carefully establish the computational complexity of reachability, cycle detection, and model checking against general Ltl and Ctl specifications. We also discuss some natural and interesting restrictions of the considered problems for which we can prove that Shsms can be verified as efficiently as Hsms, still preserving an exponential gap of succinctness.
Article
Full-text available
We introduce the concept of a Visual Backchannel as a novel way of following and exploring online conversations about large-scale events. Microblogging communities, such as Twitter, are increasingly used as digital backchannels for timely exchange of brief comments and impressions during political speeches, sport competitions, natural disasters, and other large events. Currently, shared updates are typically displayed in the form of a simple list, making it difficult to get an overview of the fast-paced discussions as it happens in the moment and how it evolves over time. In contrast, our Visual Backchannel design provides an evolving, interactive, and multi-faceted visual overview of large-scale ongoing conversations on Twitter. To visualize a continuously updating information stream, we include visual saliency for what is happening now and what has just happened, set in the context of the evolving conversation. As part of a fully web-based coordinated-view system we introduce Topic Streams, a temporally adjustable stacked graph visualizing topics over time, a People Spiral representing participants and their activity, and an Image Cloud encoding the popularity of event photos by size. Together with a post listing, these mutually linked views support cross-filtering along topics, participants, and time ranges. We discuss our design considerations, in particular with respect to evolving visualizations of dynamically changing data. Initial feedback indicates significant interest and suggests several unanticipated uses.
Conference Paper
Full-text available
Association rule mining is an exploratory learning task to discover some hidden dependency relationships among items in transaction data. Quantitative association rules denote association rules with both categorical and quantitative attributes. There have been several works on quantitative association rule mining such as the application of fuzzy techniques to quantitative association rule mining, the generalized association rule mining for quantitative association rules, and importance weight incorporation into association rule mining for taking into account the user's interest. This paper introduces a new method for generalized fuzzy quantitative association rule mining with importance weights. The method uses fuzzy concept hierarchies for categorical attributes and generalization hierarchies of fuzzy linguistic terms for quantitative attributes. It enables the users to flexibly perform the association rule mining by controlling the generalization levels for attributes and the importance weights for attributes
Article
We present TweetMotif, an exploratory search application for Twitter. Unlike traditional approaches to information retrieval, which present a simple list of messages, TweetMotif groups messages by frequent significant terms — a result set's subtopics — which facilitate navigation and drilldown through a faceted search interface. The topic extraction system is based on syntactic filtering, language modeling, near-duplicate detection, and set cover heuristics. We have used TweetMotif to deflate rumors, uncover scams, summarize sentiment, and track political protests in real-time. A demo of TweetMotif, plus its source code, is available at http://tweetmotif.com.
Article
Microblogging concurrently with live media events is becoming commonplace. The resulting comment stream represents a parallel, social conversational reflection on the event. Although not formally `attached' to the actual event stream itself, we demonstrate it is possible to establish a relationship between the two streams by mapping their structural properties. In this article, we examine: How do people produce and consume real-time commentary? And how does the structure of commentary and conversation change in response to moments of interest? Using a dataset of 53,712 Twitter posts, or tweets, sampled during the inauguration of Barack Obama in January 2009, we develop methods for exploring these questions. We find that short message activity reflects the structure and content of this media event. Specifically, messages directed at large audiences can serve as broadcast announcements, while variations in the level of conversation can reflect levels of interest in the media event itself. Finally, we present some implications for the design of future tools for a variety of users ranging from consumers to journalists.
Article
Based on Formal Concept Analysis, we introduce Temporal Concept Analysis as a temporal conceptual granularity theory for movements of general objects in abstract or "real" space and time such that the notions of states, situations, transitions and life tracks of objects in conceptual time systems are defined mathematically. The life track lemma is a first approach to granularity reasoning. Applications of Temporal Concept Analysis in medicine and in chemical industry are demonstrated as well as recent developments of computer programs for graphical representations of temporal systems. Basic relations between Temporal Concept Analysis and other temporal theories, namely theoretical physics, mathematical system theory, automata theory, and temporal logic are discussed.
Conference Paper
In this paper, we propose a method for short text categorization using a topic model and an integrated classifier. To enrich the representation of short texts, the Latent Dirichlet Allocation (LDA) model is used to extract latent topic information. For classification, we combine two classifiers to achieve high reliability. In particular, we train LDA models with a variable number of topics using the Wikipedia corpus as an external knowledge base, and extend labeled Web snippets with potential topics extracted by LDA. Then, the enriched representations of snippets are used to learn Maximum Entropy (MaxEnt) and support vector machine (SVM) classifiers separately. Finally, observing that the most likely prediction will appear among the top two candidates selected by the MaxEnt classifier, we develop a novel scheme: if the gap between these candidates is large enough, the predicted result is considered reliable; otherwise, the SVM classifier is integrated with the MaxEnt classifier to make a comprehensive prediction. Experimental results show that our framework is effective and can outperform the state-of-the-art techniques.
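A minimal sketch of the described decision scheme, with scikit-learn's LogisticRegression standing in for the MaxEnt classifier and LinearSVC for the SVM, on hypothetical snippets; the LDA-based topic enrichment is omitted:

import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.svm import LinearSVC

train_texts = ["cheap flights and hotel deals", "book your holiday trip now",
               "new graphics card benchmark results", "laptop cpu performance review"]
train_labels = ["travel", "travel", "tech", "tech"]

vec = TfidfVectorizer().fit(train_texts)
X = vec.transform(train_texts)
maxent = LogisticRegression(max_iter=1000).fit(X, train_labels)   # MaxEnt stand-in
svm = LinearSVC().fit(X, train_labels)

def predict(text, gap_threshold=0.2):
    x = vec.transform([text])
    probs = maxent.predict_proba(x)[0]
    top2 = np.sort(probs)[-2:]                  # two best MaxEnt candidates
    if top2[1] - top2[0] >= gap_threshold:      # large gap: trust MaxEnt
        return maxent.classes_[np.argmax(probs)]
    return svm.predict(x)[0]                    # otherwise fall back to the SVM

print(predict("hotel booking discount"))
print(predict("cpu and flights"))               # ambiguous case: SVM decides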
Article
Owing to the sheer volume of text generated by a microblog site like Twitter, it is often difficult to fully understand what is being said about various topics. This paper presents algorithms for summarizing microblog documents. Initially, we present algorithms that produce single-document summaries but later extend them to produce summaries containing multiple documents. We evaluate the generated summaries by comparing them to both manually produced summaries and, for the multiple-post summaries, to the summarization results of some of the leading traditional summarization systems.
Article
As an information delivering platform, Twitter collects millions of tweets every day. However, some users, especially new users, often find it difficult to understand trending topics in Twitter when confronted with the overwhelming and unorganized tweets. Existing work has attempted to provide a short snippet to explain a topic, but this only provides limited benefits and cannot satisfy the users' expectations. In this paper, we propose a new summarization task, namely sequential summarization, which aims to provide a series of chronologically ordered short sub-summaries for a trending topic in order to provide a complete story about the development of the topic while retaining the order of information presentation. Different from the traditional summarization task, the numbers of sub-summaries for different topics are not fixed. Two approaches, i.e., stream-based and semantic-based approaches, are developed to detect the important subtopics within a trending topic. Then a short sub-summary is generated for each subtopic. In addition, we propose three new measures to evaluate the position-aware coverage, sequential novelty and sequence correlation of the system-generated summaries. The experimental results based on the proposed evaluation criteria have demonstrated the effectiveness of the proposed approaches.
Conference Paper
User-contributed content is creating a surge on the Internet. A list of "buzzing topics" can effectively monitor the surge and lead people to their topics of interest. Yet a topic phrase alone, such as "SXSW", can rarely present the information clearly. In this paper, we propose to explore a variety of text sources for summarizing the Twitter topics, including the tweets, normalized tweets via a dedicated tweet normalization system, web contents linked from the tweets, as well as integration of different text sources. We employ the concept-based optimization framework for topic summarization, and conduct both automatic and human evaluation regarding the summary quality. Performance differences are observed for different input sources and types of topics. We also provide a comprehensive analysis regarding the task challenges.
Conference Paper
With the explosive growth of microblogging services, short-text messages (also known as tweets) are being created and shared at an unprecedented rate. Tweets in their raw form can be incredibly informative, but also overwhelming. For both end-users and data analysts it is a nightmare to plow through millions of tweets which contain enormous noise and redundancy. In this paper, we study continuous tweet summarization as a solution to address this problem. While traditional document summarization methods focus on static and small-scale data, we aim to deal with dynamic, quickly arriving, and large-scale tweet streams. We propose a novel prototype called Sumblr (SUMmarization By stream cLusteRing) for tweet streams. We first propose an online tweet stream clustering algorithm to cluster tweets and maintain distilled statistics called Tweet Cluster Vectors. Then we develop a TCV-Rank summarization technique for generating online summaries and historical summaries of arbitrary time durations. Finally, we describe a topic evolvement detection method, which consumes online and historical summaries to produce timelines automatically from tweet streams. Our experiments on large-scale real tweets demonstrate the efficiency and effectiveness of our approach.
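A heavily reduced sketch of online tweet stream clustering in the spirit of the approach described above: each incoming tweet joins the closest centroid or opens a new cluster, with centroids kept as running sums (a stand-in for Tweet Cluster Vectors, not the paper's algorithm); the data and threshold are hypothetical:

import numpy as np
from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.metrics.pairwise import cosine_similarity

vectorizer = HashingVectorizer(n_features=2**10, alternate_sign=False, norm="l2")

class StreamClusters:
    """Assign each incoming tweet to the closest centroid, or open a new
    cluster when nothing is similar enough; centroids are running sums."""
    def __init__(self, threshold=0.3):
        self.threshold = threshold
        self.sums, self.counts, self.members = [], [], []

    def add(self, tweet):
        v = vectorizer.transform([tweet]).toarray()[0]
        best, best_sim = None, self.threshold
        for i, (s, n) in enumerate(zip(self.sums, self.counts)):
            sim = cosine_similarity([v], [s / n])[0][0]
            if sim >= best_sim:
                best, best_sim = i, sim
        if best is None:                  # nothing similar enough: new cluster
            self.sums.append(v)
            self.counts.append(1)
            self.members.append([tweet])
        else:                             # update the running centroid
            self.sums[best] = self.sums[best] + v
            self.counts[best] += 1
            self.members[best].append(tweet)

stream = ["earthquake felt in the city center",
          "strong earthquake shakes the city center tonight",
          "new phone model announced at the expo",
          "expo crowd reacts to the new phone model"]
clusters = StreamClusters()
for t in stream:
    clusters.add(t)
for i, tweets in enumerate(clusters.members):
    print(f"cluster {i}:", tweets)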
Conference Paper
Ontologies have been extensively applied in various fields, such as artificial intelligence, information extraction, and information retrieval. In this paper we describe a new approach for automatically learning terminological ontologies. The method takes the topics generated by a generative topic model as concepts and builds subsumption relationships between them to learn an ontology without requiring a seed ontology. It introduces the CosTMI measure to compute semantic similarity between topics, organizing them into a hierarchical structure to form a new ontology. We evaluate our method on the GENIA corpus, a real-world collection of biomedical literature, and the experimental results demonstrate the validity and efficiency of the proposed method.
Article
With the widespread applications of electronic learning (e-Learning) technologies to education at all levels, increasing number of online educational resources and messages are generated from the corresponding e-Learning environments. Nevertheless, it is quite difficult, if not totally impossible, for instructors to read through and analyze the online messages to predict the progress of their students on the fly. The main contribution of this paper is the illustration of a novel concept map generation mechanism which is underpinned by a fuzzy domain ontology extraction algorithm. The proposed mechanism can automatically construct concept maps based on the messages posted to online discussion forums. By browsing the concept maps, instructors can quickly identify the progress of their students and adjust the pedagogical sequence on the fly. Our initial experimental results reveal that the accuracy and the quality of the automatically generated concept maps are promising. Our research work opens the door to the development and application of intelligent software tools to enhance e-Learning.
Article
A fuzzy set is a class of objects with a continuum of grades of membership. Such a set is characterized by a membership (characteristic) function which assigns to each object a grade of membership ranging between zero and one. The notions of inclusion, union, intersection, complement, relation, convexity, etc., are extended to such sets, and various properties of these notions in the context of fuzzy sets are established. In particular, a separation theorem for convex fuzzy sets is proved without requiring that the fuzzy sets be disjoint.
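In symbols, the standard definitions recalled in this abstract read:

\[
A = \{\, (x, \mu_A(x)) \mid x \in X \,\}, \qquad \mu_A : X \to [0,1],
\]
\[
\mu_{A \cup B}(x) = \max\bigl(\mu_A(x), \mu_B(x)\bigr), \qquad
\mu_{A \cap B}(x) = \min\bigl(\mu_A(x), \mu_B(x)\bigr), \qquad
\mu_{\bar{A}}(x) = 1 - \mu_A(x).
\]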
Conference Paper
The detection of new information in a document stream is an important component of many potential applications. In this work, a new novelty detection approach based on the identification of sentence-level information patterns is proposed. First, the information-pattern concept for novelty detection is presented, with emphasis on new information patterns for general topics (queries) that cannot simply be turned into specific questions whose answers are specific named entities (NEs). Then we elaborate a thorough analysis of sentence-level information patterns on data from the TREC novelty tracks, including sentence lengths, named entities, and sentence-level opinion patterns. This analysis provides guidelines for applying those patterns in novelty detection, particularly for general topics. Finally, a unified pattern-based approach to novelty detection is presented for both general and specific topics, with the new method for dealing with general topics as the focus. Experimental results show that the proposed approach significantly improves the performance of novelty detection for general topics as well as the overall performance for all topics from the 2002-2004 TREC novelty tracks.
Conference Paper
We propose a system for detecting local events in the real world using geolocation information from microblog documents. A local event happens when people with a common purpose gather at the same time and place. To detect such an event, we identify a group of Twitter documents describing the same theme that were generated within a short time and a small geographic area. Timestamps and geotags are useful for finding such documents, but only 0.7% of documents are geotagged, which is not sufficient for this purpose. Therefore, we propose an automatic geotagging method that identifies the location of non-geotagged documents. Our geotagging method successfully increased the number of geographic groups by about 115 times. For each group of documents, we extract co-occurring terms to identify its theme and determine whether it is about an event. We subjectively evaluated the precision of our detected local events and found that it was 25.5%. These results demonstrate that our system can detect local events that are difficult to identify using existing event detection methods. A user can interactively specify the size of a desired event by manipulating the parameters of date, area size, and the minimum number of Twitter users associated with the location. Our system allows users to enjoy the novel experience of finding a local event happening near their current location in real time.
Conference Paper
Automatic taxonomy generation deals with organizing text documents in terms of an unknown labeled hierarchy. The main issues here are (i) how to identify documents that have similar content, (ii) how to discover the hierarchical structure of the topics and subtopics, and (iii) how to find appropriate labels for each of the topics and subtopics. In this paper, we review several approaches to automatic taxonomy generation to provide an insight into the issues involved. We also describe how fuzzy hierarchies can overcome some of the problems associated with traditional crisp taxonomies.
Article
Nowadays, Web 2.0 focuses on user-generated content, data sharing and collaboration activities. Formats like Really Simple Syndication (RSS) provide structured Web information, display changes in summary form and keep users updated about news headlines of interest. This trend has also affected the e-learning domain, where RSS feeds support dynamic learning activities, enabling learners and teachers to access new blog posts, keep track of newly shared media, and consult Learning Objects which meet their needs. This paper presents an approach to enrich personalized e-learning experiences with user-generated content, through contextualized consumption of RSS feeds. The synergic exploitation of Knowledge Modeling and Formal Concept Analysis techniques enables the design and development of a system that supports learners in their learning activities by collecting, conceptualizing, classifying and providing updated information on specific topics coming from relevant information sources. An agent-based layer supervises the extraction and filtering of RSS feeds whose topics cover a specific educational domain.
Article
This paper introduces the use of Wikipedia as a resource for automatic keyword extraction and word sense disambiguation, and shows how this online encyclopedia can be used to achieve state-of-the-art results on both these tasks. The paper also shows how the two methods can be combined into a system able to automatically enrich a text with links to encyclopedic knowledge. Given an input document, the system identifies the important concepts in the text and automatically links these concepts to the corresponding Wikipedia pages, providing the users with a quick way of accessing additional information. Evaluations of the system show that the automatic annotations are reliable and hardly distinguishable from manual annotations. Wikipedia contributors perform such annotations by hand following a Wikipedia "manual of style," which gives guidelines concerning the selection of important concepts in a text, as well as the assignment of links to appropriate related articles. For instance, Figure 1 shows an example of a Wikipedia page, including the definition for one of the meanings of the word "plant."
Article
Information retrieval is an important task for electronic commerce. Ontology-based semantic retrieval is a hotspot of current research. In order to achieve fuzzy semantic retrieval, this paper applies a fuzzy ontology framework to an information retrieval system in e-commerce. The framework includes three parts: concepts, properties of concepts and values of properties, in which a property's value can be either a standard data type or a linguistic value of a fuzzy concept. The semantic query expansions are constructed by order relation, equivalence relation, inclusion relation, reversion relation and complement relation between fuzzy concepts defined in linguistic variable ontologies with the Resource Description Framework (RDF). The application to retrieving customer, product and supplier information shows that the framework can overcome the limitations of other fuzzy ontology models, and this research facilitates the semantic retrieval of information through fuzzy concepts on the Semantic Web.
Conference Paper
This paper describes an algorithm for building fuzzy hierarchies. These are hierarchies where the elements can have fuzzy membership to the nodes. The paper presents an approach that mainly follows a bottom-up strategy, and describes the functions needed to operate with fuzzy variables. An example of the application of the approach is also presented
Article
In this paper, we extend the work of Kraft et al. to present a new method for fuzzy information retrieval based on fuzzy hierarchical clustering and fuzzy inference techniques. First, we present a fuzzy agglomerative hierarchical clustering algorithm for clustering documents and to get the document cluster centers of document clusters. Then, we present a method to construct fuzzy logic rules based on the document clusters and their document cluster centers. Finally, we apply the constructed fuzzy logic rules to modify the user's query for query expansion and to guide the information retrieval system to retrieve documents relevant to the user's request. The fuzzy logic rules can represent three kinds of fuzzy relationships (i.e., fuzzy positive association relationship, fuzzy specialization relationship and fuzzy generalization relationship) between index terms. The proposed fuzzy information retrieval method is more flexible and more intelligent than the existing methods due to the fact that it can expand users' queries for fuzzy information retrieval in a more effective manner.
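A toy sketch of threshold-based query expansion over a hypothetical fuzzy term-term relation, illustrating the general idea; the clustering-derived fuzzy logic rules and the three relationship types of the paper are not reproduced:

# hypothetical fuzzy relation: degree to which one index term implies another
relation = {
    ("laptop", "notebook"): 0.9,     # positive association
    ("laptop", "computer"): 0.8,     # generalization
    ("computer", "laptop"): 0.5,     # specialization
    ("laptop", "charger"): 0.4,
}

def expand_query(terms, alpha=0.6):
    """Add every term related to a query term with degree >= alpha,
    keeping the degree as the weight of the expanded term."""
    expanded = {t: 1.0 for t in terms}
    for (src, dst), degree in relation.items():
        if src in terms and degree >= alpha:
            expanded[dst] = max(expanded.get(dst, 0.0), degree)
    return expanded

print(expand_query(["laptop"]))   # {'laptop': 1.0, 'notebook': 0.9, 'computer': 0.8}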
Conference Paper
Formal Concept Analysis is an unsupervised learning technique for conceptual clustering. We introduce the notion of iceberg concept lattices and show their use in Knowledge Discovery in Databases (KDD). Iceberg lattices are designed for analyzing very large databases. In particular they serve as a condensed representation of frequent patterns as known from association rule mining. In order to show the interplay between Formal Concept Analysis and association rule mining, we discuss the algorithm Titanic. We show that iceberg concept lattices are a starting point for computing condensed sets of association rules without loss of information, and are a visualization method for the resulting rules.
Jasmine: A real-time local-event detection system based on geolocation information propagated to microblogs
K. Watanabe, M. Ochi, M. Okabe, R. Onai, in: Proceedings of the 20th ACM International Conference on Information and Knowledge Management, CIKM '11, ACM, New York, NY, USA, 2011, pp. 2541-2544. doi:10.1145/2063576.2064014.
Enhancing query-oriented summarization based on sentence wikification
Y. Miao, C. Li, in: Workshop of the 33rd Annual International, 2010, p. 32.
Twitinfo: Aggregating and visualizing microblogs for event exploration
A. Marcus, M. S. Bernstein, O. Badar, D. R. Karger, S. Madden, R. C. Miller, in: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '11, ACM, New York, NY, USA, 2011, pp. 227-236. doi:10.1145/1978942.1978975.
Sumblr: Continuous summarization of evolving tweet streams
L. Shou, Z. Wang, K. Chen, G. Chen, in: Proceedings of the 36th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '13, ACM, New York, NY, USA, 2013, pp. 533-542. doi:10.1145/2484028.2484045.
A visual backchannel for large-scale events
M. Dork, D. Gruen, C. Williamson, S. Carpendale, IEEE Transactions on Visualization and Computer Graphics 16 (6) (2010) 1129-1138. doi:10.1109/TVCG.2010.129.
Towards a temporal extension of formal concept analysis
R. Neouchi, A. Tawfik, R. Frost, in: E. Stroulia, S. Matwin (Eds.), Advances in Artificial Intelligence, Vol. 2056 of Lecture Notes in Computer Science, Springer Berlin Heidelberg, 2001, pp. 335-344. doi:10.1007/3-540-45153-6_33.
Soft computing for information retrieval on the web
G. Bordogna, M. Pagani, G. Pasi.