Figure 3. Input/Output behavior of query.

Source publication
Conference Paper
Continuous query systems are an intuitive way for users to access streaming data in large-scale scientific applications containing many hundreds of streams. A challenge in these systems is to join streams in such a way that memory is conserved. Storing events that could not possibly participate in a join any longer wastes memory and limits scalabil...

Context in source publication

Context 1
... microbenchmarks break down the overhead of the join window algorithm. In the scenario, depicted in Figure 3, a single query running in a quoblet container accepts two input streams, D and R, and joins them on timestamp to produce the aggregate event <D R>. ...
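A minimal sketch of that scenario, assuming event records that carry a timestamp used as the join key (the `Event` class, stream names, and matching tolerance below are illustrative, not taken from the paper):

```python
from dataclasses import dataclass

@dataclass
class Event:
    stream: str       # source stream name, e.g. "D" or "R"
    timestamp: float  # event timestamp used as the join key
    payload: dict

def join_on_timestamp(d_events, r_events, tolerance=0.0):
    """Pair each D event with every R event whose timestamp matches
    (within `tolerance`), producing the aggregate events <D R>."""
    joined = []
    for d in d_events:
        for r in r_events:
            if abs(d.timestamp - r.timestamp) <= tolerance:
                joined.append((d, r))  # one aggregate event <D R>
    return joined

# Illustrative use with two tiny streams
d_stream = [Event("D", t, {"val": t * 10}) for t in (1.0, 2.0, 3.0)]
r_stream = [Event("R", t, {"val": t + 0.5}) for t in (2.0, 3.0, 4.0)]
print(join_on_timestamp(d_stream, r_stream))  # pairs at t=2.0 and t=3.0
```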

Similar publications

Article
As maritime logistics becomes global, issues related to real-time monitoring of problems that can occur during transport have come to the fore. As RFID technology has been used to monitor the transport process and a significant amount of stream data is fed into middleware, the function for processing it in rea...
Conference Paper
In the last decade, Stream Processing Engines (SPEs) have emerged as a new processing paradigm that can process huge amounts of data while retaining low latency and high throughput. Yet, it is often necessary to join streaming data with traditional databases to provide more contextual information for end-users and applications. The major probl...
Article
Finding the occurrences of structural patterns in XML data is a key operation in XML query processing. Existing algorithms for this operation focus almost exclusively on path patterns or tree patterns. Current applications of XML require querying of data whose structure is complex or is not fully known to the user, or integrating XML data sources w...
Chapter
Linked Stream Data has emerged as an effort to represent dynamic, time-dependent data streams following the principles of Linked Data. Given the increasing number of available stream data sources like sensors and social network services, Linked Stream Data allows an easy and seamless integration, not only among heterogeneous stream data, but also be...
Conference Paper
Continuous queries over stream data are persistent queries that continuously output results as they arrive. Query processing in Data Stream Management Systems (DSMSs) has to meet various Quality-of-Service (QoS) requirements. In many data stream applications, processing delay is the most critical quality requirement, since the value of query results decreases dramatically ov...

Citations

... Events are then pushed through the join operator. Each join operator internally appends a cost operator that samples the input streams to detect their rates, for use in calculating the join window size [24]. Joins in Calder are a Cartesian product followed by a time-based comparison. ...
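A rough sketch of the rate-sampling idea described above, assuming the window is sized to cover a fixed join horizon at the observed rate (the sampling scheme and sizing rule are illustrative, not Calder's actual formula):

```python
from collections import deque

class CostOperator:
    """Samples arrival times on one input stream and estimates its rate,
    which the join can consult when sizing its time-based window."""
    def __init__(self, sample_size=100):
        self.arrivals = deque(maxlen=sample_size)

    def observe(self, arrival_time):
        self.arrivals.append(arrival_time)

    def rate(self):
        """Events per second over the sampled arrivals (0 if too few samples)."""
        if len(self.arrivals) < 2:
            return 0.0
        span = self.arrivals[-1] - self.arrivals[0]
        return (len(self.arrivals) - 1) / span if span > 0 else 0.0

def window_size(cost_op, join_horizon_sec):
    """Retain enough events to cover `join_horizon_sec` of stream time
    at the observed rate (illustrative sizing rule)."""
    return max(1, int(cost_op.rate() * join_horizon_sec))

cost = CostOperator()
for t in (0.0, 0.5, 1.0, 1.5, 2.0):   # one event every 0.5 s
    cost.observe(t)
print(cost.rate(), window_size(cost, join_horizon_sec=10))  # 2.0 events/s, 20
```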
... A naive approach to detecting a missing stream is to wait for a preset period of time and, if the data does not arrive within that period, consider the stream missing. This approach does not generalize, however, because stream rates may change dynamically (STORM mode and CLEAR mode for weather sensors [24]), so a single preset timeout may not be appropriate for different streams or even for subsets of the same stream. It is reasonable to assume that the inter-arrival time of a time series falls within a particular range averaging around the set generation time interval. ...
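A hedged sketch of the adaptive alternative: derive the timeout from each stream's recently observed inter-arrival times rather than from a single preset value (the history length and slack factor below are assumptions for illustration):

```python
from collections import deque

class MissingStreamDetector:
    """Flags a stream as missing when the gap since its last event greatly
    exceeds its recently observed inter-arrival time."""
    def __init__(self, history=20, slack=3.0):
        self.gaps = deque(maxlen=history)  # recent inter-arrival times
        self.last_arrival = None
        self.slack = slack                 # tolerated multiple of the mean gap

    def observe(self, arrival_time):
        if self.last_arrival is not None:
            self.gaps.append(arrival_time - self.last_arrival)
        self.last_arrival = arrival_time

    def is_missing(self, now):
        if self.last_arrival is None or not self.gaps:
            return False  # not enough history to decide
        mean_gap = sum(self.gaps) / len(self.gaps)
        return (now - self.last_arrival) > self.slack * mean_gap

det = MissingStreamDetector()
for t in (0.0, 1.0, 2.0, 3.0):         # events arriving once per second
    det.observe(t)
print(det.is_missing(now=3.5), det.is_missing(now=12.0))  # False, True
```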
Article
Sensors and instruments are an important source of real-time data. However, sensor networks and instruments and their delivery systems can fail due to intrusion attacks, node failures, link failures, or problems in the measuring instruments. Missing data can cause prediction inaccuracies or problems in continuous event processing. Estimation techniques can approximate missing data in a stream, thus enabling a continuous flow of data when the stream goes down temporarily. We propose Kalman filters for predicting missing events in sensor streams, specifically, with the dynamic linear model. Our study compares the Kalman filter-based approach to reservoir sampling and histogram-based approaches. We show that Kalman filtering is promising and has the least root mean squared error for most cases. We introduce a novel solution for inserting this approximation technique into an SQL-based events processing system as a new query operator. Our experimental analysis shows that the prediction operator has low overhead and is effective in estimating missing events in weather data streams, specifically, the METAR streams.
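A minimal one-dimensional dynamic linear model of the kind the abstract describes, here a local-level Kalman filter whose prediction step fills in a missing event (the noise variances and the tiny example series are placeholders; the paper's actual model may differ):

```python
class LocalLevelKalman:
    """1-D Kalman filter (local-level dynamic linear model): a hidden level
    follows a random walk and each event is a noisy observation of it."""
    def __init__(self, init_level=0.0, init_var=1.0,
                 process_var=1e-2, obs_var=1.0):
        self.level, self.var = init_level, init_var
        self.q, self.r = process_var, obs_var   # placeholder noise variances

    def predict(self):
        """Time update: project the level forward; this value is what
        fills in a missing event."""
        self.var += self.q
        return self.level

    def update(self, observation):
        """Measurement update when an event actually arrives."""
        k = self.var / (self.var + self.r)       # Kalman gain
        self.level += k * (observation - self.level)
        self.var *= (1.0 - k)
        return self.level

# Fill a gap in a temperature-like series (None marks a missing event).
kf = LocalLevelKalman(init_level=20.0)
series = [20.1, 20.3, None, 20.8]
estimates = []
for x in series:
    predicted = kf.predict()                     # prediction step
    estimates.append(predicted if x is None else kf.update(x))
print(estimates)
```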
... As shown in [17,16], a time-based sliding window can be implemented at low cost and, when tied to the input stream rate, can be more intuitive for users than a count-based sliding window, where the window size is defined as the number of events of interest. ...
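As an illustration of the distinction, a small sketch of a time-based window that evicts events by age rather than by count (the 10-second span is an arbitrary example):

```python
from collections import deque

class TimeBasedWindow:
    """Keeps only the events whose timestamps fall within the last
    `span_sec` seconds, regardless of how many events that is."""
    def __init__(self, span_sec=10.0):
        self.span = span_sec
        self.events = deque()  # (timestamp, event) in arrival order

    def insert(self, timestamp, event):
        self.events.append((timestamp, event))
        self._evict(timestamp)

    def _evict(self, now):
        while self.events and now - self.events[0][0] > self.span:
            self.events.popleft()  # too old to participate in any join

# A count-based window, by contrast, would simply be deque(maxlen=N).
```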
Conference Paper
With the recent explosive growth of sensors and instruments, large-scale data-intensive and computation-intensive applications are emerging, especially in scientific fields. Helping scientists to efficiently, even in real time, process queries over those large-scale scientific streams is thus in great demand. However, query optimization for high-volume stream applications, in particular its core component, the evaluation model, has not been systematically studied. We observe that evaluating stream query plans should consider three aspects: output rate, computation cost, and memory consumption. However, to our knowledge, no existing research on evaluating stream query plans considers all three metrics. In this paper, we propose a new combined optimization goal which leverages all these aspects and develop a multi-model based optimization framework to accomplish this goal. Specifically, we build three models to evaluate a plan's output rate, computation cost, and memory consumption, respectively. Based on these three models, we search for an optimal plan while considering the system's computation resource and memory constraints. We also experimentally evaluate our optimization framework.
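A toy sketch of the kind of constrained plan selection the abstract outlines: each candidate plan is scored by three per-plan estimates, and the search keeps only plans that fit the resource budgets (the plan records, weights, and objective are hypothetical, not the paper's formulation):

```python
from dataclasses import dataclass

@dataclass
class PlanEstimate:
    name: str
    output_rate: float  # results/sec, from the output-rate model
    cpu_cost: float     # CPU units/sec, from the computation-cost model
    memory: float       # MB, from the memory-consumption model

def best_plan(candidates, cpu_budget, mem_budget, w_rate=1.0, w_cpu=0.1):
    """Pick the feasible plan maximizing a combined score
    (illustrative objective, not the paper's actual formula)."""
    feasible = [p for p in candidates
                if p.cpu_cost <= cpu_budget and p.memory <= mem_budget]
    if not feasible:
        return None
    return max(feasible, key=lambda p: w_rate * p.output_rate - w_cpu * p.cpu_cost)

plans = [PlanEstimate("join-first", 90.0, 40.0, 200.0),
         PlanEstimate("filter-first", 80.0, 20.0, 120.0)]
print(best_plan(plans, cpu_budget=30.0, mem_budget=150.0).name)  # "filter-first"
```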
... Optimizations have been applied to yield memory savings, for instance in [13, 14, 16]. The SPE architecture uses an underlying storage and/or transport medium that can be files [12, 15], a publish-subscribe system [17], or sockets [18]. ...
... We can see that query execution consumes a small fraction of total service time. More complex queries may consume longer execution time, but this confirms earlier results [29] and also confirms our earlier finding that service time is dependent on the rates of the input streams when joins are involved [17]. ...
Conference Paper
The use of real-time data streams in data-driven computational science is driving the need for stream processing tools that work within the architectural framework of the larger application. Data stream processing systems are beginning to emerge in the commercial space, but these systems fail to address the needs of large-scale scientific applications. In this paper we illustrate the unique needs of large-scale data driven computational science through an example taken from weather prediction and forecasting. We apply a realistic workload from this application against our Calder stream processing system to determine effective throughput, event processing latency, data access scalability, and deployment latency.
... The Calder system enables execution of SQL-based continuous queries on data streams. It uses a query planner service that optimizes and distributes queries to the computational nodes [11]; sophisticated algorithms to join streams with asynchronous arrival rates [16,17]; and a Kalman filter based approximation technique that estimates the input values when there are gaps in stream data [23]. ...
Article
Workflow-driven, dynamically adaptive e-Science is a form of scientific investigation often using a Service-Oriented Architecture (SOA) paradigm, designed to use large-scale computational resources on-the-fly to execute workflows consisting of parallel models, analysis, and visualization tasks. In the Linked Environments for Atmospheric Discovery (LEAD) project, with which our team is involved, our research has centered around event processing and mining of observational and model generated weather data such that users can dynamically trigger regional weather forecasts on-demand in response to developing weather. In this paper we describe stream provenance in complex event processing (CEP) systems. Specifically, we give an information model and architecture for stream provenance capture and collection, and evaluate the provenance service for perturbation and scalability.
... While the user's input is required to expose the adaptation parameter, adaptation to the real-time constraint associated with analyzing streaming data is performed automatically by the system. Another effort in supporting stream processing on the Grid via middleware is the dQUOB project [99,100,101]. This system enables continuous processing of SQL queries on data streams. ...
Conference Paper
In recent years, there has been a growing trend towards supporting more tightly coupled applications on the grid, including scientific workflows, applications that use pipelined or data-flow like processing, and distributed streaming applications. As availability of resources can vary over time in a grid environment, dynamic reallocation of resources is very important for these applications, particularly because of their long-running nature, and because they often require large-volume data transfers between processing stages. This paper considers the problem of supporting and efficiently implementing dynamic resource allocation for tightly-coupled and pipelined applications in a grid environment. We provide an alternative to basic checkpointing, using the notion of light-weight summary structure (LSS), to enable efficient migration. The idea behind LSS is that at certain points during the execution of a processing stage, the state of the program can be summarized by a small amount of memory. This allows us to perform low-cost process migration, as long as such memory can be identified by an application developer, and migration is performed only at these points. Our implementation and evaluation of LSS-based process migration has been in the context of the GATES (grid-based adaptive execution on streams) middleware that we have been developing. We also present an algorithm for dynamic resource allocation and describe an architecture for resource monitoring and allocation. We have extensively evaluated our implementation using three stream data processing applications, and show that the use of LSS allows efficient process migration.
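A highly simplified sketch of the LSS idea, assuming a processing stage whose state at a safe migration point reduces to a few running aggregates (the stage, field names, and serialization are invented for illustration):

```python
import pickle

class RunningAverageStage:
    """A stream-processing stage whose state at a safe migration point is just
    a small summary: the running sum and count of processed values."""
    def __init__(self, total=0.0, count=0):
        self.total, self.count = total, count

    def process(self, batch):
        for v in batch:
            self.total += v
            self.count += 1
        return self.total / self.count if self.count else 0.0

    def summary(self):
        """Light-weight summary structure: far smaller than a full checkpoint."""
        return pickle.dumps({"total": self.total, "count": self.count})

    @classmethod
    def resume(cls, blob):
        """Rebuild the stage on the destination node from the summary alone."""
        s = pickle.loads(blob)
        return cls(s["total"], s["count"])

# Migrate between batches: serialize the summary, ship it, resume elsewhere.
stage = RunningAverageStage()
stage.process([1.0, 2.0, 3.0])
migrated = RunningAverageStage.resume(stage.summary())
print(migrated.process([4.0]))  # continues from the migrated state
```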
Conference Paper
This paper describes how we have used a self-adapting middleware to implement a distributed and adaptive volume rendering application. The middleware we have used is GATES (grid-based adaptive execution on streams), which allows processing of streaming data in a distributed environment. A challenge in supporting such an application on streaming data is to balance the visualization quality and the speed of processing, which can be automatically done by the GATES middleware. We describe how we divide the application into a number of processing stages, and what adaptation parameters we use. Our experimental studies have focused on evaluating the self-adaptation enabled by the middleware, and measuring the overhead associated with the use of middleware.
Conference Paper
We have architected and evaluated a new kind of data resource, one that is composed of a logical collection of ephemeral data streams that could be viewed as a collection of publish-subscribe "channels" over which rich data-access and semantic operations can be performed. This paper contributes new insight to stream processing under the highly asynchronous stream workloads often found in data-driven scientific applications, and presents insights gained through porting a distributed stream processing system to a grid services framework. Experimental results reveal limits on stream processing rates that are directly tied to differences in stream rates.