Figure - available from: Journal of Big Data
Event detection results at time-stamp t = 2017-01-15 7pm: (a) constructed quad-tree using tweets in the interval [t-T : t), where T = 3 days; (b) flagged events with Poisson signal; and (c) distribution of Poisson signals for all nodes.

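The caption describes the system's core test: each quad-tree node's tweet count in the sliding window [t-T : t) is compared against a Poisson model of that node's history, and nodes with improbably high counts are flagged. A minimal sketch of such a Poisson burst test follows; the historical-mean rate estimate and the significance threshold are illustrative assumptions, not the paper's exact formulation.

```python
from scipy.stats import poisson

def poisson_burst_signal(observed_count, historical_counts, alpha=0.001):
    """Flag a quad-tree node whose tweet count in the current window
    [t-T : t) is improbably high under a Poisson model of its history.

    observed_count: tweets falling in the node during the current window
    historical_counts: tweet counts for the same node in past windows
    alpha: significance threshold (an assumption, not from the paper)
    """
    # Estimate the node's baseline rate from past windows
    lam = max(sum(historical_counts) / len(historical_counts), 1e-9)
    # Tail probability P(X >= observed_count) under Poisson(lam)
    p_value = poisson.sf(observed_count - 1, lam)
    return p_value < alpha, p_value

# A node that usually sees ~3 tweets per 3-day window suddenly sees 20
flagged, p = poisson_burst_signal(20, [2, 4, 3, 3])
print(flagged, p)  # True, with p far below 0.001
```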

Source publication
Article
Full-text available
A key challenge in mining social media data streams is to identify events which are actively discussed by a group of people in a specific local or global area. Such events are useful for early warning for accident, protest, election or breaking news. However, neither the list of events nor the resolution of both event time and space is fixed or kno...

Similar publications

Preprint
Full-text available
A key challenge in mining social media data streams is to identify events which are actively discussed by a group of people in a specific local or global area. Such events are useful for early warning for accident, protest, election or breaking news. However, neither the list of events nor the resolution of both event time and space is fixed or kno...

Citations

... Spatiotemporal pattern mining or event detection (Yu et al. 2020) can detect changes in air pollution levels by extracting meaningful patterns from accumulated datasets (Guralnik and Srivastava 1999). Event detection is a data mining method used in many applications such as social media, sensor networks, urban traffic, and video streams (George et al. 2021; Guille and Favre 2015; Souto and Liebig 2016; Medioni et al. 2001). Spatial and temporal dimensions are critical for event detection (Kisilevich et al. 2010) and extracting patterns. ...
Article
Full-text available
In recent years, our world has experienced significant disruptions due to the COVID-19 pandemic and Russia's 2022 invasion of Ukraine, impacting human activities and the global environment. This paper explored air quality changes in Ukraine due to COVID-19 and Russia's invasion of Ukraine using an on-demand, what-you-see-is-what-you-get approach. During the COVID-19 pandemic, strict quarantine policies in Ukraine led to a 2% reduction in tropospheric NO2 concentration before the lockdown and 4% during the lockdown period. Cities like Kyiv, Donetsk, and Dnipro exhibited reductions of 5%, 11%, and 16%, respectively. Total SO2 column concentration decreased by 6% before the lockdown and 2.5% during the lockdown period, except in high population density areas. Kyiv showed the highest reduction in SO2 concentration at 17%, while Donetsk and Dnipro exhibited an 11% reduction. However, during the Russian invasion, there was a significant increase in tropospheric NO2 concentration in heavily destroyed Kharkiv, while most eastern regions experienced a reduction. The total SO2 column was 48% higher before the war but decreased throughout the country after the war began, except in Kyiv and a few central regions. These findings can contribute to analyzing air pollution and building digital twin simulations for future reconstruction scenarios.
... They also suggest a new quality metric called the strength index, which automatically determines the accuracy of the reported event. To build an intrusion detection platform and gather massive amounts of data for intrusion detection, Hye-Min Lee and Sang-Joon Lee [25] recommended using deep learning and convolutional neural networks (CNNs). By collecting and analyzing user visit logs and linking them to big data, they develop an intelligent big data platform for data gathering. ...
Article
Event detection plays an important role in modern society, and it is a popular computational task that permits events to be detected automatically. Big data is especially useful for event detection due to its large volume. Multimodal event detection detects events using heterogeneous types of data. This work aims to classify diverse events using an optimized ensemble learning approach. The multi-modal event data, including text, image and audio, are sent to user devices from a cloud or server, where three models are generated for processing audio, text and image. First, the text, image and audio data are processed separately. Creating the text model involves pre-processing using imputation of missing values and data normalization, textual feature extraction using an integrated N-gram approach, and generation of the text model using a convolutional two-directional LSTM (2DCon_LSTM). Image model generation involves pre-processing using Min-Max Gaussian filtering (MMGF), image feature extraction using a VGG-16 network model, and generation of the image model using a tweaked auto-encoder (TAE). Audio model generation involves pre-processing using the discrete wavelet transform (DWT), audio feature extraction using the Hilbert-Huang transform (HHT), and generation of the audio model using an attention-based convolutional capsule network (Attn_CCNet). The features obtained from the text, image and audio models are fused by a feature ensemble approach. From the fused feature vector, the optimal features are selected through an improved battle royal optimization (IBRO) algorithm. A deep learning model called convolutional duo gated recurrent unit with auto-encoder (C-Duo GRU_AE) is used as the classifier. Finally, different types of events are classified, and the global model is sent to the user devices with high security, offering better decision making. The proposed methodology achieves accuracy of 99.93%, F1-score of 99.91%, precision of 99.93%, recall of 99.93%, processing time of 17 seconds and training time of 0.05 seconds. This performance exceeds several comparable methodologies in precision, recall, accuracy, F1-score, training time, and processing time, indicating that the proposed methodology performs better than the compared schemes. In addition, the proposed scheme detects multi-modal events accurately.
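The abstract's "feature ensemble approach" amounts to fusing the per-modality feature vectors into one vector before feature selection and classification. Below is a generic late-fusion sketch; the normalization-plus-concatenation scheme is an assumption, and the paper's actual models (2DCon_LSTM, TAE, Attn_CCNet) are not reproduced.

```python
import numpy as np

def fuse_features(text_feat, image_feat, audio_feat):
    """Generic late fusion: L2-normalize each modality's feature vector
    so no modality dominates by scale, then concatenate them into a
    single fused vector for downstream selection and classification."""
    def l2(v):
        v = np.asarray(v, dtype=float)
        n = np.linalg.norm(v)
        return v / n if n > 0 else v
    return np.concatenate([l2(text_feat), l2(image_feat), l2(audio_feat)])

# Three toy modality embeddings fused into one 9-dimensional vector
fused = fuse_features([1.0, 2.0, 2.0], [0.0, 3.0, 4.0], [1.0, 0.0, 0.0])
print(fused.shape)  # (9,)
```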
... This makes social media a viable data source for user-generated content, particularly during a disaster when traditional surveys are inconvenient. Many studies have demonstrated the effectiveness of social media data in disaster management, particularly in crisis communication, human mobility, damage valuation, and event detection, among others (16,(24)(25)(26)(27)(28)(29)(30)(31)(32)(33)(34)(35)(36). However, very few studies have looked at the social network connectivity of the local people and network-level properties of the local communities along with their crisis narrative analysis during a compound hazard like Hurricane Laura amid the COVID-19 pandemic. ...
Article
Full-text available
Online social networks allow different agencies and the public to interact and share the underlying risks and protective actions during major disasters. This study revealed such crisis communication patterns during Hurricane Laura, compounded by the COVID-19 pandemic. Hurricane Laura was one of the strongest (Category 4) hurricanes on record to make landfall in Cameron, Louisiana, U.S. Using an application programming interface (API), this study utilizes large-scale social media data obtained from Twitter through the recently released academic track that provides complete and unbiased observations. The data captured publicly available tweets shared by active Twitter users from the vulnerable areas threatened by Hurricane Laura. Online social networks were based on Twitter's user influence feature (i.e., mentions or tags) that allows notification of other users while posting a tweet. Using network science theories and advanced community detection algorithms, the study split these networks into 21 components of various sizes, the largest of which contained eight well-defined communities. Several natural language processing techniques (i.e., word clouds, bigrams, topic modeling) were applied to the tweets shared by the users in these communities to observe their risk-taking or risk-averse behavior during a major compounding crisis. Social media accounts of local news media, radio, universities, and popular sports pages were among those that were heavily involved and interacted closely with local residents. In contrast, emergency management and planning units in the area engaged less with the public. The findings of this study provide novel insights into the design of efficient social media communication guidelines to respond better in future disasters.
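The study's pipeline (build a mention network, split it into connected components, detect communities within them) can be illustrated in a few lines of networkx. The edges below are hypothetical, and greedy modularity maximization is a stand-in for whichever community detection algorithm the authors actually used.

```python
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

# Hypothetical (author, mentioned_user) pairs parsed from tweets
mention_edges = [("alice", "wx_radio"), ("bob", "wx_radio"),
                 ("bob", "local_news"), ("carol", "local_news"),
                 ("dave", "sports_page"), ("erin", "sports_page")]

G = nx.Graph()
G.add_edges_from(mention_edges)

# Split the mention network into connected components, as in the study
components = [G.subgraph(c).copy() for c in nx.connected_components(G)]

# Detect communities inside the largest component (the study found
# eight well-defined ones; this toy graph yields far fewer)
largest = max(components, key=len)
for i, community in enumerate(greedy_modularity_communities(largest)):
    print(f"community {i}: {sorted(community)}")
```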
... The alternative lies in evaluations such as George et al. (2021)'s. George et al. (2021)'s evaluation only includes two baselines, but the lengthy experiments to tweak the parameter-laden algorithms, including the authors' own, constrained the evaluation to a small sample of the data. ...
... The alternative lies in evaluations such as George et al. (2021)'s. George et al. (2021)'s evaluation only includes two baselines, but the lengthy experiments to tweak the parameter-laden algorithms, including the authors' own, constrained the evaluation to a small sample of the data. Sacrifices like George et al. (2021)'s appear commonly. ...
... George et al. (2021)'s evaluation only includes two baselines, but the lengthy experiments to tweak the parameter-laden algorithms, including the authors' own, constrained the evaluation to a small sample of the data. Sacrifices like George et al. (2021)'s appear commonly. The 20 manual evaluations that provided dataset statistics in our review averaged just 5.95 corpora. ...
Article
Full-text available
Event tracking literature based on Twitter does not have a state-of-the-art. What it does have is a plethora of manual evaluation methodologies and inventive automatic alternatives: incomparable and irreproducible studies incongruous with the idea of a state-of-the-art. Many researchers blame Twitter's data sharing policy for the lack of common datasets and a universal ground truth–for the lack of reproducibility–but many other issues stem from the conscious decisions of those same researchers. In this paper, we present the most comprehensive review yet on event tracking literature's evaluations on Twitter. We explore the challenges of manual experiments, the insufficiencies of automatic analyses and the misguided notions on reproducibility. Crucially, we discredit the widely-held belief that reusing tweet datasets could induce reproducibility. We reveal how tweet datasets self-sanitize over time; how spam and noise become unavailable at much higher rates than legitimate content, rendering downloaded datasets incomparable with the original. Nevertheless, we argue that Twitter's policy can be a hindrance without being an insurmountable barrier, and propose how the research community can make its evaluations more reproducible. A state-of-the-art remains attainable for event tracking research.
... A spatio-temporal event is defined as an event that occurs at a specific time and location of interest to stakeholders [6,7]. Natural disasters, crimes, and business events that occurred at a specific time and location are examples of spatio-temporal events. ...
Article
Full-text available
As the scale of online news and social media expands, attempts to analyze the latest social issues and consumer trends are increasing. Research on detecting spatio-temporal event sentences in text data is being actively conducted. However, a document contains not only the important spatio-temporal events necessary for event analysis but also events that are non-critical for it. It is important to increase the accuracy of event analysis by extracting only the key events from among the many events in a document. In this study, we define 'representative spatio-temporal event documents', which capture the core subject of a document, and propose a BiLSTM-based document classification model to classify them. We build a gold-standard training dataset of 10,000 documents to train the proposed BiLSTM model. The experimental results show that our BiLSTM model improves the F1 score by 2.6% and the accuracy by 4.5% compared to the baseline CNN model.
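A minimal PyTorch sketch of the kind of BiLSTM document classifier the abstract describes: tokens are embedded, a bidirectional LSTM encodes the document, and the concatenated final forward and backward hidden states are classified. All hyperparameters here are illustrative assumptions, not the paper's.

```python
import torch
import torch.nn as nn

class BiLSTMDocClassifier(nn.Module):
    """Embed tokens, run a bidirectional LSTM, and classify from the
    concatenated final forward and backward hidden states."""
    def __init__(self, vocab_size, embed_dim=128, hidden_dim=64, num_classes=2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True,
                            bidirectional=True)
        self.fc = nn.Linear(2 * hidden_dim, num_classes)

    def forward(self, token_ids):               # token_ids: (batch, seq_len)
        x = self.embedding(token_ids)
        _, (h_n, _) = self.lstm(x)               # h_n: (2, batch, hidden_dim)
        h = torch.cat([h_n[0], h_n[1]], dim=1)   # forward + backward states
        return self.fc(h)                        # logits: (batch, num_classes)

model = BiLSTMDocClassifier(vocab_size=10_000)
logits = model(torch.randint(1, 10_000, (4, 32)))  # 4 docs, 32 tokens each
print(logits.shape)  # torch.Size([4, 2])
```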
... Yang et al. [20] proposed a burst event detection method for social networks based on topological features, mining the network's structural features. George et al. [21] proposed an unsupervised online spatiotemporal event detection system that can detect events in social media data at different spatiotemporal resolutions in real time. However, current burst event detection mainly focuses on a single social media platform, using only Twitter or Weibo data without incorporating information published on other platforms. ...
Preprint
Full-text available
With the frequent occurrence of public emergencies around the world today, how to effectively use big data and artificial intelligence technologies to accurately and efficiently detect and identify burst events on the Internet has become a hot issue. Existing burst event detection methods fail to comprehensively consider the multiple social media data sources and their influence, which leads to lower accuracy. This paper proposes a novel burst event detection model based on cross-social-media influence and unsupervised clustering. In this article, we explain the basic framework of burst event detection, along with the characteristics of social media influence, the word frequency features and the growth rate features. In our proposed approach, according to the time information in the data stream, social media network data are sliced and the burst word features in each time window are calculated. Then, the three burst features are fused to compute the burst degree of words, after which the words whose burst degree exceeds a threshold are selected to form the burst word set. Finally, the agglomerative hierarchical clustering method is introduced to cluster the burst word set and extract the burst events from it. The results of an experiment on a real-world social media dataset show that the detection method significantly improves Precision and F1-score compared with the latest four burst event detection methods, proving the effectiveness of the proposed method.
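The described pipeline (per-window burst word features fused into a burst degree, thresholding, then agglomerative clustering of the burst word set into events) can be sketched as follows. The paper fuses three features including cross-media influence; this sketch fuses only word frequency and growth rate, and the fusion rule, threshold, and co-occurrence distance are assumptions.

```python
from collections import Counter
from itertools import combinations
from scipy.cluster.hierarchy import fcluster, linkage

def burst_words(prev_window, curr_window, threshold=0.5):
    """Fuse a frequency feature and a growth-rate feature into a burst
    degree per word, then keep words above the threshold. (The paper's
    third feature, cross-media influence, is omitted here.)"""
    prev, curr = Counter(prev_window), Counter(curr_window)
    scores = {}
    for word, count in curr.items():
        freq = count / len(curr_window)
        growth = count / (prev.get(word, 0) + 1)  # smoothed growth rate
        scores[word] = freq * growth              # fused burst degree
    return [w for w, s in scores.items() if s >= threshold]

def cluster_burst_words(words, docs, num_events):
    """Agglomerative hierarchical clustering of burst words, using
    1 - co-occurrence rate across documents as the distance."""
    def dist(a, b):
        both = sum(1 for d in docs if a in d and b in d)
        return 1.0 - both / len(docs)
    condensed = [dist(a, b) for a, b in combinations(words, 2)]
    labels = fcluster(linkage(condensed, method="average"),
                      t=num_events, criterion="maxclust")
    return dict(zip(words, labels))

print(burst_words(["traffic", "rain"],
                  ["quake", "quake", "tremor", "help", "rain"]))  # ['quake']
docs = [["quake", "tremor"], ["quake", "tremor", "help"],
        ["goal", "match"], ["goal", "match", "fans"]]
print(cluster_burst_words(["quake", "tremor", "goal", "match"], docs, 2))
```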
... Cao et al., 2015; X. Zheng, J. Han, and A. Sun, 2018), event detection (George et al., 2021), among others. In contrast to the previous topic quality metrics (TC and PMI), these metrics allow us to evaluate how relevant and accurate the detected topics are compared with the ground truth topics. ...
Preprint
Capturing the similarities between human language units is crucial for explaining how humans associate different objects, and therefore its computation has received extensive attention, research, and application. With the ever-increasing amount of information around us, calculating similarity becomes increasingly complex; in many cases, such as legal or medical affairs, measuring similarity requires extra care and precision, as small acts within a language unit can have significant real-world effects. My research goal in this thesis is to develop regression models that account for similarities between language units in a more refined way. Computation of similarity has come a long way, but approaches to debugging the measures are often based on continually fitting human judgment values. To this end, my goal is to develop an algorithm that precisely catches loopholes in a similarity calculation. Furthermore, most methods have vague definitions of the similarities they compute and are often difficult to interpret. The proposed framework addresses both shortcomings: it constantly improves the model by catching different loopholes, and every refinement of the model provides a reasonable explanation. The regression model introduced in this thesis is called progressively refined similarity computation, which combines attack testing with adversarial training. The similarity regression model of this thesis achieves state-of-the-art performance in handling edge cases.
... This makes social media a viable data source for user-generated content, particularly during a disaster when traditional surveys are inconvenient. Many studies have demonstrated the effectiveness of social media data in disaster management, particularly in crisis communication (16,(24)(25)(26)(27)(28)(29), human mobility (30)(31)(32)(33), damage valuation (34), and event detection (35,36), among others. However, very few studies have looked at the social network connectivity of the local people and network-level properties of the local communities along with their crisis narrative analysis during a compound hazard like Hurricane Laura amid the COVID-19 pandemic. ...
Preprint
Full-text available
Online social networks allow different agencies and the public to interact and share the underlying risks and protective actions during major disasters. This study revealed such crisis communication patterns during Hurricane Laura, compounded by the COVID-19 pandemic. Laura was one of the strongest (Category 4) hurricanes on record to make landfall in Cameron, Louisiana. Using the Application Programming Interface (API), this study utilizes large-scale social media data obtained from Twitter through the recently released academic track that provides complete and unbiased observations. The data captured publicly available tweets shared by active Twitter users from the vulnerable areas threatened by Laura. Online social networks were based on the user influence feature (mentions or tags) that allows notifying other users while posting a tweet. Using network science theories and advanced community detection algorithms, the study split these networks into twenty-one components of various sizes, the largest of which contained eight well-defined communities. Several natural language processing techniques (i.e., word clouds, bigrams, topic modeling) were applied to the tweets shared by the users in these communities to observe their risk-taking or risk-averse behavior during a major compounding crisis. Social media accounts of local news media, radio, universities, and popular sports pages were among those that were heavily involved and interacted closely with local residents. In contrast, emergency management and planning units in the area engaged less with the public. The findings of this study provide novel insights into the design of efficient social media communication guidelines to respond better in future disasters.
... In the paper [26], random forest and gradient boosting classifiers are analyzed, achieving precision rates of 81 and 79, respectively. Similarly, other methods such as PGM, SVM and DT are analyzed in [27], and a clustering method in [28]. ...
... In the paper [26], random forest and gradient boosting classifiers are analyzed, achieving precision rates of 81 and 79, respectively. Similarly, other methods such as PGM, SVM and DT are analyzed in [27], and a clustering method in [28]. Comparing the methods, the lowest accuracy, 72.5, is achieved by DT, while the highest accuracy, 91.032, is achieved by the proposed method. ...
Article
Full-text available
Nowadays, event forecasting on Twitter can be considered an essential, significant and difficult problem. Most conventional methods focus on temporal events like sports or elections and do not consider spatial features or their correlation. Hence, this paper proposes an improved Deep Belief Neural Network (iDBNN) for civil unrest event forecasting on Twitter data. The proposed method is used to forecast future events from tweets. It is designed with three phases: a pre-processing phase, a feature extraction phase, and civil unrest event forecasting. Initially, the proposed method is trained on the 2019 Hong Kong protest tweet data for forecasting events. In the pre-processing phase, removal of special symbols, removal of URLs, username removal, tokenization and stop-word removal are performed. After that, essential features such as domain weight, event weight, textual similarity, spatial similarity, temporal similarity, and Relative Document-Term Frequency Difference (RDTFD) are extracted and used for training the proposed model. To strengthen the training phase of the proposed iDBNN, the Jellyfish Algorithm is utilized to select optimal weight parameter coefficients of the DBNN. The proposed technique is validated by statistical measures and compared with conventional methods such as the Hidden Markov Model (HMM) and Random Forest (RF). Compared with other traditional methods, the proposed model shows better performance in terms of prediction and processing time. The iDBNN model shows 91% prediction accuracy, which is much higher than the traditional DBNN.
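The pre-processing phase the abstract lists (special-symbol removal, URL removal, username removal, tokenization, stop-word removal) is straightforward to sketch; the regular expressions and the tiny stop-word list below are illustrative assumptions.

```python
import re

# A tiny illustrative stop list; a real one would be much larger
STOP_WORDS = {"the", "a", "an", "is", "are", "to", "of", "in", "and"}

def preprocess_tweet(text):
    """Apply the listed pre-processing steps: URL removal, username
    removal, special-symbol removal, tokenization, stop-word removal."""
    text = re.sub(r"https?://\S+", " ", text)    # remove URLs
    text = re.sub(r"@\w+", " ", text)            # remove @usernames
    text = re.sub(r"[^a-zA-Z0-9\s]", " ", text)  # remove special symbols
    tokens = text.lower().split()                # tokenization
    return [t for t in tokens if t not in STOP_WORDS]

print(preprocess_tweet("@user Protest at Central Square! https://t.co/xyz"))
# ['protest', 'at', 'central', 'square']
```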
... With the rapid emergence of social media, social data with geotags (locations) open up new possibilities for many significant applications, such as location-based services ranging from targeted advertising to crisis detection (George et al. 2021). However, the low ratio of geotagged social data makes the pursuit of the aforementioned applications challenging. ...
Article
Geographical information provided in social media data is useful for many valuable applications. However, only a small proportion of social media posts are explicitly geotagged with their posting locations, which makes the pursuit of these applications challenging. Motivated by this, we propose a 2-level hierarchical classification method that builds upon a BERT model, coupled with textual information and temporal context, which we denote HierBERT. As far as we are aware, this work is the first to utilize a 2-level hierarchical classification approach alongside BERT and temporal information for geolocation prediction. Experimental results based on two social media datasets show that HierBERT outperforms various state-of-the-art baselines in terms of accuracy and distance error metrics.
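The 2-level hierarchical idea is easiest to see as a dispatch: one coarse classifier predicts a region, then a region-specific classifier predicts the city. This sketch substitutes TF-IDF features and logistic regression for HierBERT's BERT encoder and temporal context, and all texts, regions, and cities are hypothetical.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Hypothetical (text, region, city) posts; repeated so each tiny class
# has a few training samples
train = [("snow on the harbour bridge", "AU", "Sydney"),
         ("trams are late again", "AU", "Melbourne"),
         ("tube strike downtown", "UK", "London"),
         ("rainy day by the mersey", "UK", "Liverpool")] * 5

vec = TfidfVectorizer().fit([t for t, _, _ in train])
X = vec.transform([t for t, _, _ in train])

# Level 1: coarse region classifier over all posts
coarse = LogisticRegression().fit(X, [r for _, r, _ in train])

# Level 2: one fine-grained city classifier per region
fine = {}
for region in {"AU", "UK"}:
    rows = [i for i, (_, r, _) in enumerate(train) if r == region]
    fine[region] = LogisticRegression().fit(X[rows],
                                            [train[i][2] for i in rows])

def predict_city(text):
    x = vec.transform([text])
    region = coarse.predict(x)[0]              # level 1: coarse region
    return region, fine[region].predict(x)[0]  # level 2: fine city

print(predict_city("big tube delays downtown today"))  # likely ('UK', 'London')
```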