Figure 4 - uploaded by Alexander S. Szalay
Cartesian coordinates allow quick tests for point-in-polygon and point-near-point. Each lat/lon point has a corresponding (x,y,z) unit vector.
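The conversion the caption describes can be sketched in a few lines. This is a minimal Python illustration of the standard lat/lon-to-unit-vector mapping; the function names are ours, not the library's:

```python
import math

def unit_vector(ra_deg, dec_deg):
    """Map a (longitude/RA, latitude/Dec) pair in degrees to an
    (x, y, z) point on the unit sphere."""
    ra, dec = math.radians(ra_deg), math.radians(dec_deg)
    return (math.cos(dec) * math.cos(ra),
            math.cos(dec) * math.sin(ra),
            math.sin(dec))

def angular_distance_deg(p, q):
    """Angle between two unit vectors, in degrees; this is the
    great-circle distance between the corresponding sky positions."""
    dot = sum(a * b for a, b in zip(p, q))
    dot = max(-1.0, min(1.0, dot))  # clamp rounding noise before acos
    return math.degrees(math.acos(dot))
```

Because the angle enters only through its cosine, a point-near-point test reduces to one dot product per pair, which is what makes the Cartesian representation fast.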


Source publication
Article
Full-text available
This article explains how to add spatial search functions (point-near-point and point in polygon) to Microsoft SQL Server 2005 using C# and table-valued functions. It is possible to use this library to add spatial search to your application without writing any special code. The library implements the public-domain C# Hierarchical Triangular Mesh (H...
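Both tests the abstract names can be phrased as dot-product comparisons on unit vectors. The following sketch is ours, not the library's C# API: point-near-point becomes a single cosine threshold, and point-in-polygon (for a convex spherical polygon) becomes a half-space test against each edge's great-circle plane:

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def point_near_point(p, q, radius_deg):
    """True if unit vectors p and q are within radius_deg of each other:
    the angle is small iff the dot product is large."""
    return dot(p, q) >= math.cos(math.radians(radius_deg))

def point_in_convex_spherical_polygon(p, vertices):
    """True if unit vector p lies inside the convex spherical polygon with
    the given unit-vector vertices (counter-clockwise seen from outside).
    Each edge defines a great-circle plane; p must lie on the inner side
    of every one."""
    n = len(vertices)
    return all(dot(cross(vertices[i], vertices[(i + 1) % n]), p) >= 0
               for i in range(n))
```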

Similar publications

Article
Full-text available
Data mining, a process for assembling and analysing data into useful information, can be applied as a rapid measure for malaria diagnosis. In this research work we implemented a knowledge-based inference engine that will help in mining sample patient records to discover interesting relationships in malaria-related cases. The computer programming langua...
Article
Full-text available
An online data submission and retrieval system for coordinated research trials in food legumes has been developed with the aim of reducing the time and cost of collection, compilation, data analysis, retrieval and report generation. In the first instance, efforts were made to use plant breeding data, as it covers more than 60% of total trials. The system...
Article
Full-text available
The increasing need for information dissemination and the tremendous population growth in today's organizations call for migrating most applications and their associated data to the network, where several million people can access them concurrently. Several challenges, such as security risk and management/maintenance of the database, have bee...

Citations

... For example, VizieR and Topcat could only process small datasets because they adopted in-memory algorithms. The US National Virtual Observatory (NVO) proposed a sophisticated efficiency-optimizing approach for this problem, named "zoned" [2][3][4], which was carried out entirely in SQL commands to avoid expensive procedures or table-valued functions. However, the approach is hard to grasp, which has to some extent limited its adoption by other astronomical organizations. ...
Article
Astronomical cross-matching is a basic method for aggregating the observational data of different wavelengths. Through data aggregation, the properties of astronomical objects can be understood comprehensively. Aiming at decreasing the time spent on I/O operations, several improved methods are introduced: a processing flow based on the boundary growing model, which can reduce the database query operations; the concept of the biggest growing block and its determination, which can improve the performance of task partition and resolve the data-sparse problem; and a fast bitwise algorithm to compute the index numbers of the neighboring blocks, which is a significant efficiency guarantee. Experiments show that the methods can effectively speed up cross-matching on both sparse datasets and high-density datasets. Keywords: astronomical cross-matching, boundary growing model, HEALPix, task partition, data-sparse problem
... Goodchild [2,3] and Song et al. [4] created the Discrete Global Grid, with precisely equal areas. Szalay et al. [1] and Gray [5] used the HTM as a spatial index based on spherical partitioning, mapping it onto the B-Tree index in SQL Server. In this paper we followed their HTM theory but took a different approach to implementing the system architecture. ...
... IV. SYSTEM IMPLEMENTATION Gray [5] has implemented a database system that supports spatial queries based on SQL Server 2005. Unlike them, in this paper we have re-implemented the HTM library in Java and deployed it on a distributed system. ...
... The globe's orientation in the three-dimensional Cartesian coordinate system [5]. ...
Conference Paper
Full-text available
Spatial indexing is one of the most important techniques in the field of spatial data management. Many kinds of spatial indexing techniques have been successfully developed, and each has advantages for particular applications. As a type of spatial data structure, the Hierarchical Triangular Mesh (HTM) has excellent properties of global continuity, stability, hierarchy and uniformity, which have attracted the interest of researchers for many years. This paper investigates a method that uses the HTM as an index for global geographical data (currently only point-like objects). The HTM is defined by recursively subdividing a unit sphere; its basic elements are spherical triangles that are encoded as integers, called HTM codes, in the computer system. At the global scale, all regions on the sphere are spherical and can be intersected with HTM elements according to certain equations. The spatial position of each input object can also be represented by an HTM code. HTM codes thus become the bridge between query regions and input objects. Our system is based on the combination of a database management system (DBMS) and a distributed file system. The major information of the input files is extracted as metadata stored in DBMS tables, while the original files are stored on the distributed file system (HDFS), which has the potential to support parallel processing. Millions of point-like objects across the globe were examined, and the experiments indicated that the system's performance was acceptable.
... We have used all three methods extensively since that article was written [2], [4], [5], [6], [7]. The Zone Algorithm is particularly well suited to point-near-point queries with a search radius known in advance. ...
... Pushing the logic entirely into SQL allows the query optimizer to do a very efficient job of filtering the objects. In particular, the Zone design gives point-near-point performance comparable to that of the C# HTM sample code described in [5]. Both execute the following statement on the sample USGS Place table at a rate of about 600 lookups per second: ...
Article
Full-text available
Zones index an N-dimensional Euclidean or metric space to efficiently support points-near-a-point queries either within a dataset or between two datasets. The approach uses relational algebra and the B-Tree mechanism found in almost all relational database systems. Hence, the Zones Algorithm gives a portable-relational implementation of points-near-point, spatial cross-match, and self-match queries. This article corrects some mistakes in an earlier article we wrote on the Zones Algorithm and describes some algorithmic improvements. The Appendix includes an implementation of point-near-point, self-match, and cross-match using the USGS city and stream gauge database.
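The core of the Zones idea fits in a few lines: map each declination to an integer zone, bucket one catalog by zone, and for each probe point scan only the zones that could contain a match before applying the exact distance test. Below is an in-memory Python sketch of that idea (the relational version does the same zone arithmetic as a join on zoneID backed by a B-Tree, plus an RA range pre-filter we omit for brevity; ZONE_HEIGHT is an assumed tuning parameter):

```python
import math
from collections import defaultdict

ZONE_HEIGHT = 1.0  # zone height in degrees of declination (a tuning knob)

def zone_id(dec_deg):
    """Integer zone for a declination in [-90, 90]."""
    return int(math.floor((dec_deg + 90.0) / ZONE_HEIGHT))

def angular_sep_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation via the dot product of unit vectors."""
    r1, d1, r2, d2 = map(math.radians, (ra1, dec1, ra2, dec2))
    dot = (math.cos(d1) * math.cos(d2) * math.cos(r1 - r2) +
           math.sin(d1) * math.sin(d2))
    return math.degrees(math.acos(max(-1.0, min(1.0, dot))))

def cross_match(cat1, cat2, radius_deg):
    """All (p1, p2) pairs, p1 in cat1 and p2 in cat2, within radius_deg."""
    zones = defaultdict(list)
    for ra, dec in cat2:
        zones[zone_id(dec)].append((ra, dec))
    span = int(math.ceil(radius_deg / ZONE_HEIGHT))  # zones per side
    matches = []
    for ra, dec in cat1:
        z = zone_id(dec)
        for dz in range(-span, span + 1):
            for ra2, dec2 in zones.get(z + dz, ()):
                if angular_sep_deg(ra, dec, ra2, dec2) <= radius_deg:
                    matches.append(((ra, dec), (ra2, dec2)))
    return matches
```

The zone filter prunes almost all candidates before the (comparatively expensive) trigonometric distance test runs, which is why the algorithm maps so well onto a B-Tree range scan.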
... SQL Server 2005 lacks adequate support for spatial data, as testified by the article of Fekete, Szalay, and Gray (2005), which explains how to add spatial search functions to Microsoft SQL Server 2005 using C# and a set of scalar-valued and table-valued functions. ...
Article
The relevance of merging spatial data with standard relational data is widely recognized, since adding geometry to databases enhances the stored information and expands the possibilities of business intelligence as well. This paper reports on the current state of the technological transfer of research results on "spatial SQL" into relational database management systems, and sets out a list of the next first-class features that will hopefully be implemented to further enhance business intelligence.
Chapter
Working with Jim Gray, we set out more than 20 years ago to design and build the archive for the Sloan Digital Sky Survey (SDSS), the SkyServer. The SDSS project collected a huge data set over a large fraction of the Northern Sky and turned it into an open resource for the world’s astronomy community. Over the years the project has changed astronomy. Now the project is faced with the problem of how to ensure that the data will be preserved and kept alive for active use for another 15 to 20 years. At the time there were very few examples to learn from and we had to invent much of the system ourselves. The paper discusses the lessons learned, future directions and recalls some memorable moments of our collaboration.
Article
Twenty years ago, work commenced on the Sloan Digital Sky Survey. The project aimed to collect a statistically complete dataset over a large fraction of the sky and turn it into an open data resource for the world’s astronomy community. There were few examples to learn from, and those of us who worked on it had to invent much of the system ourselves. The project has made fundamental changes to astronomy, and we are now faced with the problem of ensuring that the data will be preserved and kept in active use for another 20 years. In redesigning this very large, open archive of data, we made a system that is able to serve a much broader set of communities. In this article, I discuss what we have learned by rebuilding a massive dataset that is available to an increasingly sophisticated set of users, and how we have been challenged and motivated to incorporate more of the patterns of data analytics required by contemporary science.
Conference Paper
To enable historical analyses of logged data streams by SQL queries, the Stream Log Analysis System (SLAS) bulk loads data streams derived from sensor readings into a relational database system. SQL queries over such log data often involve numerical conditions containing inequalities, e.g. to find suspected deviations from normal behavior based on some function over measured sensor values. However, such queries are often slow to execute because the query optimizer is unable to utilize ordered indexed attributes inside numerical conditions. To speed up the queries, they need to be reformulated to utilize available indexes. In SLAS, the query transformation algorithm AQIT (Algebraic Query Inequality Transformation) automatically transforms SQL queries involving a class of algebraic inequalities into more scalable SQL queries utilizing ordered indexes. The experimental results show that the queries execute substantially faster on a commercial DBMS when AQIT has been applied to preprocess them.
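The kind of transformation AQIT automates can be illustrated with the simplest cases: an inequality wrapping an indexed column in a function is rewritten into an equivalent range predicate the B-Tree can serve directly. The toy rewriters below generate SQL text; the predicate shapes and names are ours, and AQIT itself handles a far broader class of algebraic inequalities:

```python
def rewrite_abs_lt(col, center, delta):
    """Rewrite ABS(col - center) < delta into an index-friendly range.
    Valid because |x - c| < d  <=>  c - d < x < c + d."""
    return f"{col} > {center - delta} AND {col} < {center + delta}"

def rewrite_sqrt_gt(col, bound):
    """Rewrite SQRT(col) > bound (for col >= 0, bound >= 0) into
    col > bound^2; valid because squaring is monotone on non-negatives."""
    return f"{col} > {bound * bound}"
```

After the rewrite, the optimizer sees a plain range condition on the bare column and can drive it from an ordered index instead of scanning and evaluating the function row by row.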
Article
High Performance Computing is becoming an instrument in its own right. The largest simulations performed on our supercomputers are now approaching petabytes. As the volume of these simulations grows, it is becoming harder to access, analyze and visualize the data. At the same time, for broad community buy-in, we need to provide public access to some of the simulation results. This is becoming another Big Data challenge, where we have to move the analyses and visualizations right to where the data is. The paper discusses the challenges in creating such interactive numerical laboratories.
Conference Paper
Multi-wavelength data cross-matching among multiple catalogs is a basic and unavoidable step in making distributed digital archives accessible and interoperable. As current catalogs often contain millions or billions of objects, this is a typical data-intensive computation problem. In this paper, a highly efficient parallel approach to astronomical cross-matching is introduced. We present our partitioning and parallelization approach, and then address some problems introduced by task partitioning and give the corresponding solutions, including the sky-splitting function we selected, HEALPix, which plays a key role in both task partitioning and database indexing, and a quick bit-operation algorithm we developed to resolve the block-edge problem. Our experiments prove that the approach has a marked performance advantage over previous functions and is fully applicable to large-scale cross-matching.
Article
Our collaboration with Jim Gray has created some of the world's largest astronomy databases, and has enabled us to test many avant-garde ideas in practice. The astronomers have been very receptive to these and embraced Jim as a 'card carrying member' of their community. Jim's contributions have made a permanent mark on astronomy, and eScience in general.