April 2014
ICIC Express Letters
Society communicates continuously through social networks, blogs, and news. This constant generation of text data makes it an attractive challenge to discover what people are talking about and to track the non-linear state of a given topic over time. In this paper we explore the particle filtering (PF) methodology applied to text data. PF is a powerful methodology for sequential signal processing with a wide scope of applications. It has been applied to several non-linear problems, such as localization and tracking of targets and recognition of objects in video or images, among others. This paper aims to explore PF for tracking a set of related words in a news stream. We track a predefined topic and its "relevance", estimating its state value over time. PF uses a state-space model to perform estimations; in our work, the state-space model is simulated using the Unigram method. Unigram models a topic named "nuclear issue" from a news stream composed of 421 news items generated during the last natural disaster in Japan. The PF sampling importance re-sampling (SIR) algorithm is used to process data as it arrives, for rapid adaptation to changing signal characteristics. Our application computes the posterior pdf of the system's state based on the available information; every time a news item is read, the system tracks and estimates the selected topic "nuclear issue".
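To make the SIR loop described above concrete, the following is a minimal sketch, not the paper's implementation. It assumes a scalar relevance state in [0, 1], a random-walk state transition, a Gaussian observation likelihood, and an observation defined as the fraction of an article's tokens found in a topic word list. The word list, the function names (`unigram_relevance`, `sir_step`, `track_topic`), and the noise parameters are all hypothetical choices made for illustration.

```python
import math
import random

# Hypothetical topic word list; the paper's actual unigram model is not given here.
TOPIC_WORDS = {"nuclear", "reactor", "radiation", "plant"}


def unigram_relevance(text, topic_words=TOPIC_WORDS):
    """Fraction of tokens in `text` that belong to the topic word list."""
    tokens = [t.strip(".,;:!?\"'") for t in text.lower().split()]
    if not tokens:
        return 0.0
    return sum(t in topic_words for t in tokens) / len(tokens)


def sir_step(particles, observation, rng, process_std=0.05, obs_std=0.1):
    """One SIR update: propagate, weight by likelihood, estimate, resample."""
    # 1. Propagate each particle through an assumed random-walk state model,
    #    clipping so the relevance state stays in [0, 1].
    predicted = [min(1.0, max(0.0, p + rng.gauss(0.0, process_std)))
                 for p in particles]
    # 2. Weight each particle by an assumed Gaussian observation likelihood.
    weights = [math.exp(-0.5 * ((observation - p) / obs_std) ** 2)
               for p in predicted]
    total = sum(weights)
    if total == 0.0:
        # Degenerate case: fall back to uniform weights.
        weights = [1.0 / len(predicted)] * len(predicted)
    else:
        weights = [w / total for w in weights]
    # 3. Posterior-mean estimate of the state from the weighted particles.
    estimate = sum(w * p for w, p in zip(weights, predicted))
    # 4. Multinomial resampling in proportion to the weights.
    resampled = rng.choices(predicted, weights=weights, k=len(predicted))
    return resampled, estimate


def track_topic(news_stream, n_particles=500, seed=0):
    """Run the filter over a stream of articles, one update per article."""
    rng = random.Random(seed)
    particles = [rng.random() for _ in range(n_particles)]  # uniform prior
    estimates = []
    for article in news_stream:
        observation = unigram_relevance(article)
        particles, estimate = sir_step(particles, observation, rng)
        estimates.append(estimate)
    return estimates
```

Calling `track_topic(list_of_article_texts)` returns one relevance estimate per article, in arrival order. The resampled particle set approximates the posterior pdf of the state at each step; the posterior mean is only one possible summary of it.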