Article

A B-tree based indexing technique for fuzzy numerical data


Abstract

This paper proposes an indexing technique for fuzzy numerical data that improves query-processing performance when the query involves an atomic, possibility-measured flexible condition. The proposal is based on a classical indexing mechanism for crisp numerical data, the B+-tree, which is implemented in most commercial database management systems (DBMS). This makes the proposed technique a good candidate for integration into a fuzzy DBMS developed as an extension of a crisp DBMS. The efficiency of the proposal is compared with that of another indexing method for similar data and queries, the G-tree, which is specifically designed to index multidimensional data. Results show that the performance of the proposal is similar to, and more stable than, that measured for the G-tree when used for indexing fuzzy numbers.


... Among the literature, we can highlight the seminal paper,11 as commented in Section 2.3, where the principles for indexing in fuzzy databases are laid out. On top of this, there are proposals to extend traditional database indexes, such as B+-trees,13,14 bitmaps,15 or multidimensional indexes such as the G-tree,16 to make them able to support flexible querying on fuzzy data. Apart from this, there are proposals in the literature that do not use the above indexing principle, such as some specially designed structures for fuzzy object-oriented databases. ...
... In Ref. 13, the authors propose an indexing technique to enhance possibilistic queries by means of the use of two indexes defined on alpha and delta, respectively. To preselect the tuples that satisfy the query, it uses the first index to get the tuples that accomplish the condition alpha ≤ U_CT and uses the second one to get the tuples that satisfy the condition delta ≥ L_CT; then, the intersection of these sets is calculated to get the tuples that satisfy the preselection criterion. ...
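The preselection step described in this excerpt can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: two sorted Python lists stand in for the two B+-tree indexes, alpha and delta are assumed to be the lower and upper bounds of each stored fuzzy value, and [L_CT, U_CT] is assumed to be the interval obtained from the query condition at the chosen threshold. The class and parameter names are hypothetical.

```python
from bisect import bisect_left, bisect_right


class TwoIndexPreselector:
    """Two sorted lists stand in for the B+-tree indexes on alpha and delta."""

    def __init__(self, rows):
        # rows: mapping of tuple id -> (alpha, delta); assumed layout.
        self._by_alpha = sorted((a, tid) for tid, (a, _) in rows.items())
        self._by_delta = sorted((d, tid) for tid, (_, d) in rows.items())
        self._alpha_keys = [a for a, _ in self._by_alpha]
        self._delta_keys = [d for d, _ in self._by_delta]

    def preselect(self, l_ct, u_ct):
        # First index: every tuple with alpha <= U_CT (a prefix of the list).
        hi = bisect_right(self._alpha_keys, u_ct)
        left = {tid for _, tid in self._by_alpha[:hi]}
        # Second index: every tuple with delta >= L_CT (a suffix of the list).
        lo = bisect_left(self._delta_keys, l_ct)
        right = {tid for _, tid in self._by_delta[lo:]}
        # Candidates are the tuples that satisfy both conditions.
        return left & right


# Hypothetical data: four stored fuzzy values given by their (alpha, delta).
rows = {1: (0, 5), 2: (6, 9), 3: (12, 20), 4: (4, 7)}
index = TwoIndexPreselector(rows)
candidates = index.preselect(l_ct=6, u_ct=10)
```

The result is only a candidate set; the excerpt calls this a preselection criterion, so the exact degree of each candidate would presumably still be computed afterwards.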
... However, when a threshold of 1 is set, the lists leftQueries, rightQueries, and innerQueries change to contain, respectively, the nodes {8, 10, 11}, {14, 16}, and {12 − 13}. Again, the indexed trapezoid belongs to the new innerQueries list, and following the original RI-tree algorithm this trapezoid would be recovered; however, for this threshold, T4 becomes the interval [10,11], which does not intersect the queried interval [12,13] (dark gray rectangle). For this reason, for possibilistic queries, the indexed nodes that belong to the innerQueries list do not necessarily satisfy the query, and it is necessary that they also satisfy the selection condition. ...
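The threshold effect in this excerpt can be reproduced with a short sketch. The core [10,11] of the trapezoid and the queried interval [12,13] come from the excerpt; the support bounds 8 and 14 are hypothetical values chosen only for illustration, as are the function names.

```python
def alpha_cut(trapezoid, threshold):
    """Closed interval of points whose membership is >= threshold (0..1)."""
    a, b, c, d = trapezoid  # support [a, d], core [b, c]
    return (a + threshold * (b - a), d - threshold * (d - c))


def intersects(first, second):
    """True when two closed intervals share at least one point."""
    return first[0] <= second[1] and second[0] <= first[1]


# Core [10, 11] is from the excerpt; support bounds 8 and 14 are assumed.
t4 = (8, 10, 11, 14)
query = (12, 13)

cut_low = alpha_cut(t4, 0.5)   # a wider interval at a lower threshold
cut_one = alpha_cut(t4, 1.0)   # shrinks to the core at threshold 1

matches_low = intersects(cut_low, query)
matches_one = intersects(cut_one, query)
```

With these assumed bounds, the trapezoid matches the query at threshold 0.5 but not at threshold 1, which is why membership in innerQueries alone does not decide the answer.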
Article
A common way to implement a fuzzy database is on top of a classical relational database management system (RDBMS). Given that almost all RDBMS provide indexing mechanisms to enhance classical query processing performance, finding ways to use these mechanisms to enhance the performance of flexible query processing is of enormous interest. This work proposes and evaluates a set of indexing strategies, implemented exclusively on top of classical RDBMS indexing structures, designed to improve flexible query processing performance, focusing on the case of possibilistic queries. Results show the best indexing strategies for different data and query scenarios, offering effective ways to implement fuzzy data indexes on top of a classical RDBMS.
... Contemporary schools need to manage more information than ever before. Consequently, without a solid internal infrastructure for teachers, administrators and departments to share data, critical school and student information can be lost, leading to a host of problems that can affect a school's image and overall performance [4]. To remain competitive, a school needs a simple solution that can run individual functions, connect its entire operation, use the web as a key communication tool and simplify day-to-day operational responsibilities, giving staff more time with students. ...
Article
The need for effective indexing and retrieval of data is paramount in any contemporary organization. Tree data structures have proved effective in this regard, as is evident in the literature. This article gives an overview of the B+-tree data structure, its indexing technique and its application to indexing and retrieving students' academic records in the school system in order to make such records flexible. The study demonstrates the indexing and arrangement patterns of some numerical data. In essence, it discusses how to use the B+-tree data structure to manage numerical data in order to enhance the indexing, retrieval and modification of such records. It concludes that good record management results in more convenient indexing and retrieval of students' academic records within the school system.
... The TRIE index structure [6,7] is a high-speed index structure used in language processing systems and dictionary search, and it is widely used because it provides quick retrieval time that is not dependent on the number of keys and various essential functions. However, in applications where the size is very large and the keys need to be constantly updated, the requirements of index construction cannot be satisfied and the range search cannot be performed more efficiently than the B+ tree-based index method [8][9][10]. The B+ tree-based indexing method is the most commonly used dynamic indexing structure in database systems because of its effectiveness. ...
Article
Data stored in a database is ordered, and this ordering can be used to build a tree-based index structure. Many cryptographic techniques for databases are currently being studied, but some of them significantly degrade system performance once applied in a DBMS. If data is encrypted at the field level, it no longer preserves its ordering, so the indexing techniques used by the DBMS become ineffective and query processing over the encrypted data slows down. High-speed query processing over encrypted data in database systems has therefore become an important research problem. In this paper we improve the retrieval performance of encrypted data by building an external index file based on a B+ tree and by encrypting the nodes of the external index file. The results show that the proposed method surpasses previous methods in terms of execution time and data retrieval capability.
... As already mentioned, we assume that Ω is discrete, since we are interested in indexing scalar data. In the case of applications using continuous numerical data, we either have to discretize the possibility distribution or use a different kind of index structure [9]. ...
Conference Paper
We propose an approach for indexing fuzzy data based on inverted files that speeds up retrieval considerably by stopping the traversal of postings lists early. This is possible because the entries in the postings lists are organized in a way that guarantees that there are no matching items beyond a certain point in a list. Consequently, we can reduce the number of false positives significantly, leading to an increase in retrieval performance. We have implemented our approach and evaluated it experimentally, comparing it to an approach that has previously been shown to be superior to other methods. Keywords: fuzzy databases, access methods, inverted files, physical design
... For this reason, the queries shown have been computed by means of a sequential search; moreover, the complex fuzzy datatype structure used for the spine curve representation further decreases efficiency if no indexing technique is used. We are working on the implementation of the indexing techniques for fuzzy data proposed in [21,22]; this will provide us with mechanisms to optimize the retrieval process. ...
Conference Paper
This paper presents a novel approach for medical image storage using a Fuzzy Object-Relational Database Management System (FORDBMS). The system stores medical images along with a set of parameters describing their content. Flexible queries can be performed over these parameters to retrieve images matching visually. To illustrate the capabilities of the FORDBMS, parameter curves are obtained from X-Ray images of patients suffering from scoliosis, and queries are performed when looking for images with a determined curve pattern. Results show that retrieved images visually match the condition established in the query.
Chapter
Cloud computing is highly praised for its high data reliability, lower cost, and nearly unlimited storage. In cloud computing projects, the MapReduce distributed computing model is prevalent. MapReduce distributed computing model is mainly divided into the Map and Reduce functions. As a mapper, the Map function is responsible for dividing tasks (such as uploaded files) into multiple small tasks executed separately; As a reducer, the Reduce function is responsible for summarizing the processing results of multiple tasks after decomposition. It is a scalable and fault-tolerant data processing tool that can process huge voluminous data in parallel with many low-end computing nodes. This paper implements the wordcount program based on the MapReduce framework and uses different dividing methods and data sizes to test the program. The common faults faced by the MapReduce framework also emerged during the experiment. This paper proposes schemes to improve the efficiency of the MapReduce framework. Finally, building an index or using a machine learning model to alleviate data skew is proposed to improve program efficiency. The application system is recommended to be a hybrid system with different modules to process variant tasks.
Article
Uncertainty extensively exists in data and knowledge intensive applications, in which fuzzy information processing plays a crucial role. Fuzzy sets have been extensively used to enhance various database models for managing fuzzy data or flexibly querying crisp data. This has resulted in numerous contributions in this research area. This paper pays attention to three crucial issues in fuzzy techniques for data management: modeling fuzzy data, querying fuzzy data, and fuzzy queries over crisp data, and provides a full up-to-date survey on the current state of the art in fuzzy data modeling and querying. The paper identifies fuzzy conceptual data models, fuzzy (relational and object-oriented) database models and fuzzy XML model as well as the relationships among these fuzzy data models. For each type of fuzzy data models, the paper summarizes its query processing. The paper also reviews fuzzy querying over classical data models. In addition to providing a generic overview of the approaches for fuzzy data modeling and querying, this survey paper serves for identifying possible research opportunities in the area of fuzzy data processing.
Article
When record sets become large, indexing becomes a required technique for speeding up querying. This paper proposes an indexing technique for interval data. Such data are common in possibility based relational databases but are also frequently used in other applications. Our approach is an adaptation of a B⁺-tree, which is currently still one of the most efficient indexing techniques. Because it can store interval data, we name it the Interval B⁺-tree (IBPT). It is illustrated how an IBPT index can be built and applied in practice to speed up the evaluation of fuzzy queries on possibilistic relational databases.
Chapter
In content-based image retrieval applications, there is an exhaustive search in the image database for finding relevant images, which is non-scalable. This chapter presents methods on indexing scheme, encoding scheme and similarity measure for handling the non-scalable issue. An image is represented in terms of colour feature, and the bin content of the feature is analysed to understand the colour content of the images. Based on the bin values and its contribution to the colour information, the size of the feature is truncated. The features are clustered based on the dimension of the histogram. The bin values of the truncated feature are encoded with Golomb–Rice (GR) coding scheme. The similarity between the query and database image is calculated by measuring the degree of overlap in terms of bins and its content. Benchmark datasets are used for evaluating the performance of the all the proposed schemes.
Article
It is widely known that the most effective way to implement a fuzzy database is to use a classical Relational Database Management System (RDBMS) as the basis. All these systems provide several kinds of indexing methods to improve the execution time of classical queries, but they are useless when directly applied to fuzzy queries. For this reason, in this work we propose and evaluate several fuzzy indexing techniques implemented over the indexing techniques available on classical RDBMS in order to enhance flexible queries when based on the necessity measure. As the results show, the best evaluated fuzzy indexing techniques can be implemented on top of classical RDBMS.
Conference Paper
When record sets become large, indexing becomes a required technique for speeding up querying. This holds for regular databases, but also for ‘fuzzy’ databases. In this paper we propose a novel indexing technique, supporting the querying of imperfect numerical data. A possibility based relational database setting is considered. Our approach is based on a novel adaptation of a B⁺-tree, which is currently still one of the most efficient indexing techniques for databases. The leaf nodes of a B⁺-tree are enriched with extra data and an extra tree pointer so that interval data can be stored and handled with them, hence the name Interval B⁺-tree (IBPT). An IBPT allows possibility distributions to be indexed using a single index structure, offering almost the same benefits as a B⁺-tree. We illustrate how an IBPT index can be used to index fuzzy sets and demonstrate its benefits for supporting ‘fuzzy’ querying of ‘fuzzy’ databases. More specifically, we focus on the handling of elementary query criteria that use the so-called compatibility operator IS, which checks whether stored imperfect data are compatible with user preferences (or not).
Article
Index data distribution is an important approach that provides parallelism and can improve the usability of a distributed parallel database. B+ tree is a storage structure, which perfectly fits for distributed and parallel indexing, and the distributed B+ tree is adopted to index the massive and rapidly increasing data available in a distributed network. This paper proposes an index data distribution strategy using distributed parallel B+ tree in a distributed network environment. In our proposal, the basic data distribution strategy can improve the efficiency of a query by utilizing a data fragment method based on the scope of value, and the replica distribution can be adjusted dynamically, according to the number of system access. The performance evaluation and experiment results show that this index data distribution strategy can improve the query's efficiency and load balance.
Article
In a Content Based Image Retrieval (CBIR) system, the exhaustive search of the database for images relevant to a given query image is not scalable. In this paper, we propose an indexing scheme, a coding technique and a similarity measure to address this problem. We consider the color histogram of the image, and its bin values are analyzed to understand the color information in the image. The histogram dimension is reduced by removing trivial bins, so that only those bins that represent color information significantly are considered. The histograms are clustered and indexed based on their dimensions. Golomb Rice (GR) coding is used to encode the indexed histograms. The Bin Overlapped Similarity Measure (BOSM) is proposed to compute the distance values between query and database image histograms. The proposed approach is evaluated on benchmark datasets, and its performance is found to be encouraging.
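Golomb Rice coding of integer bin values, as mentioned in this abstract, can be illustrated with a minimal sketch. The abstract does not give the coding parameter or the bit layout, so the parameter `k`, the unary convention (a run of 1s closed by a 0), the sample bin values and the function names are all assumptions made for illustration.

```python
def rice_encode(value, k):
    """Encode a non-negative integer: unary quotient, then k remainder bits."""
    if value < 0 or k < 0:
        raise ValueError("value and k must be non-negative")
    quotient = value >> k
    remainder = value & ((1 << k) - 1)
    tail = format(remainder, f"0{k}b") if k else ""
    return "1" * quotient + "0" + tail


def rice_decode_all(bits, k):
    """Decode a concatenation of Rice codewords back into integers."""
    values, pos = [], 0
    while pos < len(bits):
        quotient = 0
        while bits[pos] == "1":      # count the unary run
            quotient += 1
            pos += 1
        pos += 1                     # skip the terminating "0"
        remainder = int(bits[pos:pos + k], 2) if k else 0
        pos += k
        values.append((quotient << k) | remainder)
    return values


# Hypothetical truncated histogram bin values, coded with k = 2.
bins = [0, 3, 9, 1]
stream = "".join(rice_encode(b, 2) for b in bins)
decoded = rice_decode_all(stream, 2)
```

Small values get short codewords, which suits histograms dominated by low bin counts; the choice of `k` trades codeword length for small values against length for large ones.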
Article
This chapter presents a schema and a transformation algorithm to store OWL ontologies in Object Relational Databases. The database schema allows the storage of an ontology structure, while the transformation algorithm creates an appropriate schema to store its instances preserving all information. We allow the use of instance data of imprecise nature, mostly fuzzy numerical data. An OWL ontology is defined allowing numerical fuzzy datatypes as the range of properties. In order to manage all the information, instance data handling is delegated onto a Fuzzy ORDBMS, which is briefly described. We present here a complete description of the structures conforming the storage schema proposed, and the algorithms used to transform the OWL ontology to a database schema. We also discuss the role of ontologies as relational database design tools.
Chapter
This chapter presents definitions and descriptions of intelligent decision support systems (IDSSs) and analyzes the technology and AI methods, which serve as bases of the IDSS. Scholars have offered various definitions of IDSS. Every one of them accents that an intelligent decision support system is a DSS, which makes extensive use of artificial intelligence techniques. Artificial intelligence techniques can be utilized in all the components of IDSSs, such as in the data base, knowledge base, model base, user interface and the rest. Therefore this chapter deliberates the intelligent databases, hardware (sensors, iris camera hardware, hardware for fingerprint biometric identification, etc.) and computer human interfaces (gesture, intelligent user, motion tracking, voice and natural-language interfaces) in intelligent decision support systems.
Article
Crisp index structures introduce the problem of sharp decision boundaries, which may not exist in real-life clustering problems. In the real world, and specifically in the CBIR context, a data item may not be fully assigned to one cluster and may partially belong to other clusters, whereas crisp index structures fully assign data to clusters according to their proximity in terms of distance in the high-dimensional vector space. Based on the kernel-fuzzy C-means clustering (KFCM) mechanism, this paper presents a fast and efficient index structure to support high-dimensional indexing for both crisp and fuzzy data. The proposed index structure offers a number of advantages, such as compact and efficient fuzzy data clustering. The experimental study demonstrates the efficiency and effectiveness of our method.
Article
This paper proposes an indexing procedure for improving the performance of query processing on a fuzzy database. It focuses on the case when a necessity-measured atomic flexible condition is imposed on the values of a fuzzy numerical attribute. The proposal is to apply a classical indexing structure for numerical crisp data, a B+-tree combined with a Hilbert curve. The use of such a common indexing technique makes its incorporation into current systems straightforward. The efficiency of the proposal is compared with that of another indexing procedure for similar fuzzy data and flexible query types. Experimental results reveal that the performance of the proposed method is similar to, and more stable than, that of its competitor.
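To show how a Hilbert curve can feed a one-dimensional index such as a B+-tree, here is a minimal sketch of the standard mapping from a grid cell to its position along the curve. The abstract does not say which pair of values is mapped onto the curve, so the two coordinates below are placeholders, and the grid size and function names are assumptions.

```python
def _rotate(n, x, y, rx, ry):
    """Rotate or flip a quadrant so the sub-curve has the right orientation."""
    if ry == 0:
        if rx == 1:
            x = n - 1 - x
            y = n - 1 - y
        x, y = y, x
    return x, y


def hilbert_key(n, x, y):
    """Position of cell (x, y) along the Hilbert curve on an n-by-n grid.

    n must be a power of two and 0 <= x, y < n.
    """
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) > 0 else 0
        ry = 1 if (y & s) > 0 else 0
        d += s * s * ((3 * rx) ^ ry)
        x, y = _rotate(n, x, y, rx, ry)
        s //= 2
    return d


# Hypothetical two-dimensional points (for example, two bounds of a stored
# fuzzy number scaled onto an 8x8 grid), sorted by their one-dimensional key.
points = [(5, 2), (0, 0), (7, 0), (3, 3)]
ordered = sorted(points, key=lambda p: hilbert_key(8, p[0], p[1]))
```

Because each cell receives a single integer key, the keys can be stored in any ordinary one-dimensional index, and cells that are close along the curve are also close on the grid.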
Article
This paper introduces a novel approach to medical image retrieval using a fuzzy object-relational database management system (FORDBMS). The system stores medical images along with information about the content of the image, such as the presence or absence of certain indicators of pathologies. It allows us to flexibly retrieve them on the basis of these indicators, making it possible to obtain images from patients with similar diagnosis and thus, following a common visual pattern. To illustrate the capabilities of the FORDBMS, this paper focuses on X-ray images of patients suffering from scoliosis (a medical condition in which the patient's spine is curved) from which spine descriptions are obtained. Then queries are performed to obtain a set of images with a certain curvature pattern. Results show high accuracy when evaluated by medical experts. Compared with other ad hoc content-based image retrieval systems, the one presented here is easily adaptable to other application domains, customizable, and very scalable.
Article
This paper presents a medical image viewer implemented in Java whose innovative features are: on the one hand, its capability for visual edition and storage of measurements involved in diagnosis and treatment of scoliosis (a medical condition in which the patient’s spine is curved) and performed on digital X-rays; on the other hand, its capability for retrieving images in a flexible way from medical image databases on the basis of those measurements, which are the standard method for diagnosing this pathology. Hence, the viewer is intended to be a useful tool for physicians in diagnosis and treatment of scoliosis.
Article
Location management guarantees that a call is delivered to a mobile user while the user moves, and it is a key challenge in wireless cellular networks. In this paper, we introduce a new index-based location management scheme. It is based on indexing location update information at the home agent of the network. An index tuple keeps track of a range of location update information and the corresponding thread, which is connected to the stack of the information table. To register a new mobile user, the mobile switching centre generates a new identification number from the mobile switching centre identification number and the temporary mobile subscriber identity of the subscriber. If the identification number is within the range of the index, the Care of Address of the mobile subscriber is added to the information table; otherwise, the index is reconstructed based on the new range of identification numbers. It has been observed that the proposed technique reduces call setup delay and network overhead at the cost of a minor increase in registration delay. The analytical model and numerical results show the effectiveness of the proposed scheme over existing schemes.
Article
We propose an approach for indexing fuzzy data based on inverted files that speeds up retrieval considerably by stopping the traversal of postings lists early. This is possible because the entries in the postings lists are organized in a way that guarantees that there are no matching items beyond a certain point in a list. Consequently, we can reduce the number of false positives significantly, leading to an increase in retrieval performance. We have implemented our approach and evaluated it experimentally, including a test on skewed and real-world data, comparing it to an approach that has previously been shown to be superior to other methods.
Conference Paper
This paper studies the influence of data distribution and clustering on the performance of currently available indexing methods, namely GT and HBPT, for solving necessity-measured flexible queries on imprecise numerical data. Studying these data scenarios yields valuable information about the expected performance of these indexes on real-world data and query sets, which are usually affected by different skew factors. Results reveal some sensitivity of GT to the considered data scenarios and no influence on HBPT.
Article
The paper proposes an indexing mechanism for imprecise numerical data, that is, fuzzy data defined on an ordered numerical domain. The proposal is based on a classical indexing mechanism for crisp numerical data, B+-trees, included in most recent DBMS. This makes the proposed mechanism more suitable than others for integration into a Fuzzy Object-Relational DBMS, where it will improve the performance of query processing on imprecise data.
Conference Paper
G-tree is a data structure designed to provide multi-dimensional access in databases. It has the self-balancing property of B+-tree. In this paper, performance evaluation of G-tree is provided for various data distributions. For point queries, the experiments show that its retrieval and update performance is similar to that of B+-trees independently of the data distribution. For range queries, the performance varies significantly with the data distributions. While the performance is good for the 2-dimensional case, it deteriorates as the number of dimensions increases. This empirical evidence is confirmed by an analytical proof, which also yields a simple way of computing the expected number of data pages accessed. This analytical result, which shows that the number of data pages accessed by a range query increases exponentially with the number of dimensions, applies to many multi-dimensional schemes. We also apply the G-tree to fuzzy databases and show empirically that it has good performance for imprecise queries on relatively imprecise data. But it is less efficient for precise queries on relatively precise data.
Chapter
We present an “add-on” to Microsoft Access, one of new Microsoft Windows based popular DBMSs, which makes possible the use of queries that allow for a more intelligent and human consistent information retrieval. More specifically, fuzzy (imprecise) descriptions and linguistic quantifiers are accommodated to allow for queries as, e.g., “find (all) records such that most of the (important) clauses are satisfied (to a degree from [0,1])”. Zadeh’s (1983) fuzzy logic based calculus of linguistically quantified propositions is employed.
Article
The data model and manipulation language in FREEDOM-0 are described. A fuzzy relational model in FREEDOM-0 is based on the possibility distribution for representing fuzzy data. It is considered as an extension of Codd's relational model of data. The manipulation language provides QUERY, INSERT, DELETE, DEFR (DEfine Fuzzy Relation) and DEFP (DEfine Fuzzy Predicate) statements. The interpretation method of QUERY statement is described and several examples are illustrated. This manipulation language is implemented in FSTDSL/FORTRAN and it is currently running on a FACOM 230-45S computer.
Article
This paper deals with relational databases which are extended in the sense that fuzzily known values are allowed for attributes. Precise as well as partial (imprecise, uncertain) knowledge concerning the value of the attributes are represented by means of [0,1]-valued possibility distributions in Zadeh's sense. Thus, we have to manipulate ordinary relations on Cartesian products of sets of fuzzy subsets rather than fuzzy relations. Besides, vague queries whose contents are also represented by possibility distributions can be taken into account. The basic operations of relational algebra, union, intersection, Cartesian product, projection, and selection are extended in order to deal with partial information and vague queries. Approximate equalities and inequalities modeled by fuzzy relations can also be taken into account in the selection operation. Then, the main features of a query language based on the extended relational algebra are presented. An illustrative example is provided. This approach, which enables a very general treatment of relational databases with fuzzy attribute values, makes an extensive use of dual possibility and necessity measures.
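The dual possibility and necessity measures mentioned in this abstract can be illustrated on a discrete domain using the standard max-min and min-max definitions from possibility theory. This is a minimal sketch: the dictionaries, the sample values and the function names are assumptions for illustration and are not taken from the paper.

```python
def possibility(condition, stored):
    """Poss = max over x of min(mu_condition(x), pi_stored(x))."""
    return max(min(condition.get(x, 0.0), p) for x, p in stored.items())


def necessity(condition, stored):
    """Nec = min over x of max(mu_condition(x), 1 - pi_stored(x))."""
    return min(max(condition.get(x, 0.0), 1.0 - p) for x, p in stored.items())


# Hypothetical stored attribute value: a possibility distribution over {1, 2, 3}.
stored_value = {1: 1.0, 2: 0.5, 3: 0.0}
# Hypothetical vague query condition: membership degree of each domain element.
query_condition = {1: 0.2, 2: 1.0, 3: 1.0}

poss = possibility(query_condition, stored_value)
nec = necessity(query_condition, stored_value)
```

Possibility measures how far the stored value can be compatible with the condition, while necessity measures how far it must be; for a normalized distribution, necessity never exceeds possibility.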
Article
Providing efficient query processing in database systems is one step towards gaining acceptance of such systems by end users. We propose several techniques for indexing fuzzy sets in databases to improve the query evaluation performance. Three of the presented access methods are based on superimposed coding, while the fourth relies on inverted files. The efficiency of these techniques was evaluated experimentally. We present results from these experiments, which clearly show the superiority of the inverted files.
Article
A revised fuzzy-set interpretation of possibility theory is introduced in this paper. Contrary to the standard fuzzy-set interpretation of possibility theory, which is coherent only for normal fuzzy sets, the revised interpretation is shown to be coherent for all fuzzy sets. It is also argued that the revised interpretation, which coincides with the standard one for normal fuzzy sets, is more meaningful on intuitive grounds. Prior to the introduction of the revised interpretation, previous efforts to overcome the well-known difficulties of the standard interpretation are critically examined, and it is demonstrated that none of them results in a coherent and meaningful interpretation of possibility theory.
Article
Up to now, many theoretical works about fuzzy databases have been defined by designing some extension of the relational model. Our purpose is to discuss some implementation aspects related to these databases. This point of view does not seem to be a usual concern of such systems, whereas it is of prime importance with respect to their future performances. We focus on the evaluation of a class of queries, called mono-attribute restrictions. We show how some basic principles can be applied to improve such a “fuzzy associative” retrieval, by means of an indexing-like access method. Lastly, some implementation solutions are presented.
Article
A significant effort has been made in representing imprecise information in database models by using fuzzy set theory. However, the research directed toward access structures to handle fuzzy querying effectively is still at an immature stage. Fuzzy querying involves more complex processing than the ordinary querying does. Additionally, a larger number of tuples are possibly selected by fuzzy conditions in comparison to the crisp ones. It is obvious that the need for fast response time becomes very important when the database system deals with imprecise (fuzzy) data. The current crisp index structures are inappropriate for representing and efficiently accessing fuzzy data. At the same time, in many complex applications such as Expert Database Systems, Multimedia Database Systems, Decision Support Systems, etc., fuzzy queries are usually intermingled with crisp queries. For the effectiveness of fuzzy databases, it is necessary to allow both the non-fuzzy and fuzzy attributes to be indexed together; therefore, a multi-dimensional access structure is required. Beside a suitable access structure, an effective partitioning, representation, and storage of fuzzy data are also necessary for efficient retrieval. In this study we utilise a multi-dimensional data structure, namely Multi Level Grid File (MLGF), for efficiently accessing both crisp and fuzzy data from fuzzy databases. Therefore, we focus on the issue of partitioning, representation and organisation of fuzzy and crisp data at physical database level, i.e., record and file structures, in addition to the design of the access structure. The implementation of the access structure is also described and its comparison with a previously proposed fuzzy access method is given along with the experimental results.
Conference Paper
The client-server model is the one most widely used in current Database Management Systems (DBMS). However, these DBMS allow neither flexible queries to the database nor the storage of vague information in it. We have developed an FSQL Server for a Fuzzy Relational Database (FRDB). The FSQL language (Fuzzy SQL) is an extension of the SQL language that allows us to write flexible conditions in our queries. This Server has been developed for Oracle, following the GEFRED model, a theoretical model for FRDB that includes fuzzy attributes to store vague information in the tables. The FSQL Server allows us to make flexible queries about traditional (crisp) or fuzzy attributes, and we can use linguistic labels defined on any attribute.
Conference Paper
The fuzzy object-oriented data model is a fuzzy logic-based extension of the object-oriented database model that permits uncertain data to be explicitly represented. One of the proposed fuzzy object-oriented database models based on similarity relations is the FOOD model. Several kinds of fuzziness are dealt with in the FOOD model, including fuzziness in object/class and class/superclass relations. The traditional index structures are inappropriate for the FOOD model, since they are not efficient for processing both crisp and fuzzy queries over objects with crisp or fuzzy values. In this study we propose a new index structure (the FOOD Index) that deals with different kinds of fuzziness in FOOD databases and supports multi-dimensional indexing. We describe how the FOOD Index supports various types of flexible queries and evaluate performance results of crisp, range, and fuzzy queries using the FOOD Index.
Conference Paper
Organization and maintenance of an index for a dynamic random access file is considered. It is assumed that the index must be kept on some pseudo random access backup store like a disc or a drum. The index organization described allows retrieval, insertion, and deletion of keys in time proportional to log_k I, where I is the size of the index and k is a device dependent natural number such that the performance of the scheme becomes near optimal. Storage utilization is at least 50% but generally much higher. The pages of the index are organized in a special data structure, so-called B-trees. The scheme is analyzed, performance bounds are obtained, and a near optimal k is computed. Experiments have been performed with indexes up to 100000 keys. An index of size 15000 (100000) can be maintained with an average of 9 (at least 4) transactions per second on an IBM 360/44 with a 2311 disc.
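To give a feel for the log_k I behaviour mentioned in this abstract, the following minimal Python sketch counts how many page levels are needed before k-way branching can cover I keys. It is a simplified illustration only, not the bound derived in the paper: the function name `levels_needed` and the fan-out of 100 used below are assumptions chosen for the example.

```python
def levels_needed(index_size: int, fanout: int) -> int:
    """Smallest h with fanout**h >= index_size (integer arithmetic only).

    Simplified model (an assumption): every page branches exactly
    `fanout` ways, so h approximates the number of page accesses.
    """
    if index_size < 1 or fanout < 2:
        raise ValueError("index_size must be >= 1 and fanout >= 2")
    levels, capacity = 0, 1
    while capacity < index_size:
        capacity *= fanout  # one more level multiplies the reach by the fan-out
        levels += 1
    return levels


# Hypothetical fan-out of 100 keys per page, index sizes taken from the abstract.
for size in (15000, 100000):
    print(size, levels_needed(size, 100))
```

With this assumed fan-out, both index sizes quoted in the abstract fit within three levels, which is why the cost grows so slowly as the index gets larger.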
Conference Paper
This paper presents and discusses a radically different approach to multi-dimensional indexing based on the concept of the space-filling curve. It reports on the novel algorithms that had to be developed to create the first actual implementation of a system based on this approach, on some comparative performance tests, and on its actual use within the TriStarp Group at Birkbeck to provide a Triple Store repository. An important result that goes beyond this requirement, however, is that the performance improvement over the Grid File increases with the number of dimensions.
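The idea of indexing multi-dimensional points through a space-filling curve can be sketched as follows. The abstract does not say which curve the authors use, so this example swaps in the Z-order (Morton) curve because it is the simplest to write down; the function name `morton_key` and the sample points are assumptions made for illustration.

```python
def morton_key(coords, bits):
    """Interleave the bits of non-negative integer coordinates (Z-order).

    The result is a single integer, so the points can be stored in any
    one-dimensional ordered index such as a B-tree.
    """
    key = 0
    for bit in range(bits - 1, -1, -1):      # most significant bit first
        for c in coords:                      # take one bit from each dimension
            key = (key << 1) | ((c >> bit) & 1)
    return key


# Hypothetical 2-D points on a 4 x 4 grid (2 bits per coordinate).
points = [(3, 3), (1, 2), (0, 0), (2, 1)]
ordered = sorted(points, key=lambda p: morton_key(p, bits=2))
print(ordered)
```

Sorting by the interleaved key places points along the curve; how well neighbouring points stay close depends on the curve chosen, which is one reason other curves are often preferred in practice.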
Article
In this paper, we present a Fuzzy Relational Databases model whose main characteristics are: the integration of previous models in the same framework, representation capabilities for a wide range of fuzzy information, and a coherent and flexible handling of it. This model aims to solve each problem of representation and handling of fuzzy information taking into account its specific nature, and hence it allows the user to choose the comparison operator and the fuzzy compatibility measure to be used in a query. In addition, it permits the user to specify the precision with which the conditions involved in a query are satisfied.
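As a rough illustration of a compatibility measure combined with a user-chosen threshold, the sketch below computes the possibility that two trapezoidal fuzzy numbers are equal, i.e. the height of their intersection, and compares it with a threshold. The trapezoid representation (a, b, c, d), the names `Trapezoid`, `possibility` and `satisfies`, and the sample values are all assumptions for this example; they are not taken from the model described in the abstract.

```python
from typing import NamedTuple


class Trapezoid(NamedTuple):
    """Fuzzy number with support [a, d] and core [b, c], a <= b <= c <= d."""
    a: float
    b: float
    c: float
    d: float


def possibility(x: Trapezoid, y: Trapezoid) -> float:
    """sup over t of min(mu_x(t), mu_y(t)): the height of the intersection."""
    if x.b > y.b:                 # order so that x's core starts first
        x, y = y, x
    if x.c >= y.b:                # cores overlap, so the degree is 1
        return 1.0
    if x.d <= y.a:                # supports do not overlap
        return 0.0
    # Right slope of x crosses left slope of y; the denominator is > 0 here.
    return (x.d - y.a) / ((x.d - x.c) + (y.b - y.a))


def satisfies(x: Trapezoid, y: Trapezoid, threshold: float) -> bool:
    """True when the possibility degree reaches the user-chosen threshold."""
    return possibility(x, y) >= threshold


stored = Trapezoid(0, 1, 2, 4)    # hypothetical stored value
wanted = Trapezoid(3, 5, 6, 7)    # hypothetical query constant
print(possibility(stored, wanted), satisfies(stored, wanted, 0.5))
```

Raising the threshold makes the condition stricter, which is one way to read the "precision" with which a condition must be satisfied.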
Conference Paper
Fuzzy querying involves more complex processing than ordinary querying does. In addition, fuzzy conditions may select a larger number of tuples than crisp ones. The current index structures are inefficient in representing and dealing with uncertain and fuzzy data. In this paper we extend one of the multi-dimensional data structures, namely the Multi Level Grid File (Whang and Krishnamurthy, 1991), for efficient access to both crisp and fuzzy data. In order to take advantage of the indexing data structure proposed here, we first partition uncertain data in such a way that accessing such data in a database is reasonably efficient. Therefore, we also focus on the issue of preparing uncertain data before building the access structure. We then compare the proposed structure with sequential access, along with experimental results.
Article
An important issue in extending the functionality of database management systems is to allow the expression of imprecise queries, so that these systems can satisfy user needs more closely. This paper deals with imprecise querying of regular relational databases. The basic idea is to extend an existing query language, namely SQL. In this context, two important points must be considered: the first concerns the integration into the extended language of many proposals that have been made elsewhere, in particular those concerning fuzzy aggregation operators; the second is whether the equivalences that are valid in SQL still hold in the extended language. Both topics are investigated in this paper.
Article
The author describes an efficient data structure called the G-tree (or grid tree) for organizing multidimensional data. The data structure combines the features of grids and B-trees in a novel manner. It also exploits an ordering property that numbers the partitions in such a way that partitions that are spatially close to one another in a multidimensional space are also close in terms of their partition numbers. This structure adapts well to dynamic data spaces with a high frequency of insertions and deletions, and to nonuniform distributions of data. We demonstrate that it is possible to perform insertion, retrieval, and deletion operations, and to run various range queries efficiently using this structure. A comparison with the BD tree, zkdb tree and the KDB tree is carried out, and the advantages of the G-tree over the other structures are discussed. The simulated bucket utilization rates for the G-tree are also reported.
F. E. Petry and P. Bosc. Fuzzy databases: principles and applications. International Series in Intelligent Technologies. Kluwer Academic Publishers, 1996.
S. Fukami, M. Umano, M. Mizumoto, and H. Tanaka. Fuzzy database retrieval and manipulation language. IEICE Technical Reports, Vol. 78 (233), 1979, pp. 65–72.