Fig 2 - uploaded by Bartosz Bębel
d1 ≤D d3 and d9 ≤D d6, but d2 ≰D d4

Source publication
Conference Paper
Full-text available
A data warehouse (DW) is a database that integrates data from external data sources (EDSs) for the purpose of advanced analysis. EDSs are production systems that often change not only their contents but also their structures. The evolution of EDSs has to be reflected in a DW that integrates the sources. Traditional DW systems offer a limited suppor...

Contexts in source publication

Context 1
... note that the notion of default subsumption may appear strange to people accustomed to classical subsumption because of its symmetry. As a consequence, it does not define an ordering relationship on the description space D. The notation ≤D may be confusing with respect to this symmetry, but it reflects the underlying idea of generality. Fig. 2 gives two examples extracted from fig. 1 where the default subsumption holds and a third case where it does not. Let us consider the previous descriptions d1 ...
Context 2
... the interpretation of a role/co-role label pair as being a part-of or specialisation relation is delegated to the Commitment Layer, where the semantic axiomatisation takes place. A lexon could be approximately considered as a combination of an RDF/OWL triple and its inverse. Lexons and commitments are visualised in a NIAM-like schema (cfr. Fig. 2). ...
Context 3
... document title is "Java Enterprise in a Nutshell, Second Edition". In the DMoz web directory, reduced for the sake of presentation, the example title can be found through two different search paths (see Figure 2), namely: ...
Context 4
... challenge here is to disambiguate natural language words and labels. For example, the classifier has to understand that in the label of node n7 (see Figure 2) the word "Java" has at least three senses, which are: an island in Indonesia; a coffee beverage; and an object-oriented programming language. Moreover, words in a label are combined to build complex concepts. ...
Context 5
... an object, the classifier has to understand what the classification alternatives for this object are. For instance, the book "Java Enterprise in a Nutshell, Second Edition" might potentially be put in all the nodes of the hierarchy shown in Figure 2. The reason for this is that the book is related to both business and technology branches; ...
Context 6
... all the possible paths converge to the same semantically equivalent concept. Consider, for instance, node n8 in the classification shown in Figure 2. The two paths below will converge to the same concept for the node: ...
Context 7
... 4 (Disambiguating edges in a web directory). Recall the example of the part of the DMoz directory shown in Figure 2 and let us see how the concept at node n7 can be computed. Remember the three senses of the word "java" (which is the label of n7) discussed earlier in the paper, and consider the parent node's label, "programming languages", which is recognized as a multi-word with only one sense, whose gloss is "a language designed for programming computers". ...
Context 8
... 5 (Document classification). As an example, recall the classification in Figure 2, and suppose that we need to classify the book "Java Enterprise in a Nutshell, Second Edition", whose concept is java#3 enterprise#2 book#1. It can be shown, by means of propositional reasoning, that the set of classification alternatives includes all the nodes of the corresponding NFC. ...
Context 9
... to space constraints, the table does not embody all metamodel elements and correspondences in the different metamodels. Figure 2 presents the Generic Role based Metamodel GeRoMe at its current state, based on the analysis of the previous section. All role classes inherit from RoleObject but we omitted these links for the sake of readability. ...
Context 10
... top of the storage layer, an abstract object model corresponding to the model in fig. 2 has been implemented as a Java library. This is a set of interfaces and base implementations in Java. An implementation of these interfaces can be chosen by instantiating a factory class. Consequently, the object model is independent from the underlying implementation and storage strategy. The relationship between roles and model ...
Context 11
... the correspondences are not hidden in imperative code, but are given as a set of equivalence rules, the developer can concentrate on the logical correspondences and does not have to deal with implementation details. Besides, only two classes have to be implemented that produce facts about a concrete model from an API (e.g., the Jena OWL API, see fig. 12) or read facts and produce the model with calls to the API, respectively. These two classes merely produce (or read) a different syntactic representation of the native model and do not perform any sophisticated processing of schemas. Creating and processing of facts about the GeRoMe representation is completely done with ...
Context 12
... a ROLAP implementation, a data cube is stored in relational tables, some of them represent levels and are called level tables (e.g., Categories and Items in Fig. 2), while others store values of measures, and are called fact tables (Sales in Fig. 2). Two basic types of ROLAP schemas are used for the implementation of a data cube, i.e., a star schema and a snowflake schema [19]. In a star schema, a dimension is composed of only one level table (e.g., Time in Fig. 2). In a snowflake schema, a ...
Context 13
... a ROLAP implementation, a data cube is stored in relational tables, some of them represent levels and are called level tables (e.g., Categories and Items in Fig. 2), while others store values of measures, and are called fact tables (Sales in Fig. 2). Two basic types of ROLAP schemas are used for the implementation of a data cube, i.e., a star schema and a snowflake schema [19]. In a star schema, a dimension is composed of only one level table (e.g., Time in Fig. 2). In a snowflake schema, a dimension is composed of multiple level tables connected by foreign key - primary key ...
Context 14
... level tables (e.g., Categories and Items in Fig. 2), while others store values of measures and are called fact tables (Sales in Fig. 2). Two basic types of ROLAP schemas are used for the implementation of a data cube, i.e., a star schema and a snowflake schema [19]. In a star schema, a dimension is composed of only one level table (e.g., Time in Fig. 2). In a snowflake schema, a dimension is composed of multiple level tables connected by foreign key - primary key relationships (e.g., dimension Location with level tables Shops, Cities, and Regions). In practice, one also builds the so-called starflake schemas, where some dimensions are composed of multiple level tables and some ...
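The star/snowflake distinction described in this excerpt can be sketched with a few tables. The following is a minimal illustration using Python's built-in sqlite3; the table names (Sales, Time, Shops, Cities, Regions) follow the figure's example, while the column names are assumptions made for the sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Star-schema dimension: Time is a single level table.
cur.execute("CREATE TABLE Time (time_id INTEGER PRIMARY KEY, day TEXT, month TEXT)")

# Snowflake dimension: Location is split into multiple level tables
# linked by foreign key - primary key relationships.
cur.execute("CREATE TABLE Regions (region_id INTEGER PRIMARY KEY, name TEXT)")
cur.execute("""CREATE TABLE Cities (city_id INTEGER PRIMARY KEY, name TEXT,
               region_id INTEGER REFERENCES Regions(region_id))""")
cur.execute("""CREATE TABLE Shops (shop_id INTEGER PRIMARY KEY, name TEXT,
               city_id INTEGER REFERENCES Cities(city_id))""")

# Fact table storing measure values, referencing the lowest level of
# each dimension.
cur.execute("""CREATE TABLE Sales (time_id INTEGER REFERENCES Time(time_id),
               shop_id INTEGER REFERENCES Shops(shop_id),
               amount REAL)""")

tables = [r[0] for r in cur.execute(
    "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name")]
print(tables)
```

A starflake schema would simply mix the two styles, keeping some dimensions in one level table and splitting others.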
Context 15
... 2. In order to illustrate the idea and usage of mapping tables, let us consider the DW schema from Fig. 2 and let us assume that initially, in the real versions from February (R_FEB) to March (R_MAR), there existed 3 shops, namely ShopA, ShopB, and ShopC, that were represented by appropriate instances of the Location dimension. In April, a new DW version was created, namely R_APR, in order to represent a new reality where ShopA and ShopB ...
Context 16
... 3. In order to illustrate annotating result sets of SVQs with metadata (step 3), let us consider the DW schema from Fig. 2. Let us further assume that initially, in the real version from April 2004 (R_APR), there existed 3 shops, namely ShopA, ShopB, and ShopC. These shops were selling porotherm bricks with 7% VAT (tax). Let us assume that in May, porotherm bricks were reclassified to the 22% VAT category (which is a real case in Poland after joining the ...
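The annotation scheme from this excerpt can be sketched as follows: the same query runs against each version, and every partial result is tagged with the metadata of the version it came from. The version name R_MAY, the dictionary layout, and the numeric values are assumptions made for the sketch; only the 7%/22% VAT figures come from the text.

```python
# Hypothetical per-version data and metadata: porotherm bricks carried
# 7% VAT in R_APR and 22% after the May reclassification.
versions = {
    "R_APR": {"data": [("porotherm brick", 1000.0)], "vat": 0.07},
    "R_MAY": {"data": [("porotherm brick", 1200.0)], "vat": 0.22},
}

def query_all_versions(versions):
    """Run the same single-version query against every version and
    annotate each partial result with its version's metadata."""
    annotated = []
    for name, v in sorted(versions.items()):
        for product, net in v["data"]:
            annotated.append({"version": name, "product": product,
                              "net": net, "vat": v["vat"]})
    return annotated

for row in query_all_versions(versions):
    print(row)
```

With the annotation attached, a user can see that sales figures from different versions were computed under different VAT rates instead of silently comparing incomparable numbers.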
Context 17
... structure of the mapping metaschema is shown in Fig. 12 (represented in the Oracle notation). The SRC_SOURCES dictionary table stores descriptions of external data sources. It contains, among others, connection parameters for accessing every EDS. Data about EDS data structures whose changes are to be monitored are registered in two dictionary tables: SRC_OBJECTS and SRC_ATTRIBUTES. All ...
Context 18
... range typing is made with the class P2:Painting of P2. If we suppose that this mapping belongs to P1, its graphical notation is the one in Figure 2. In that case, P2:Painting is a shared relation between P1 and P2. ...
Context 19
... The schema of a SomeRDFS PDMS forms a knowledge base R of function-free Horn rules with single conditions (see the FOL axiomatization of core-RDFS in Figure 2). A simple backward chaining algorithm [18] with cycle detection applied to each atom of a user query Q ensures finding all the maximal conjunctive rewritings of each atom of Q with at most n chaining steps, if n is the number of rules in the schema. ...
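The backward chaining with cycle detection mentioned above can be sketched for the single-condition case. The rule encoding and the RDFS-like class names below are illustrative assumptions: a pair (head, body) reads "head(x) <- body(x)", and the rewritings of an atom are all atoms from which it can be derived by chaining through the rules.

```python
def rewritings(atom, rules):
    """Backward chaining over single-condition Horn rules; the `seen`
    set provides the cycle detection, so rule cycles terminate."""
    result, frontier, seen = set(), [atom], {atom}
    while frontier:
        a = frontier.pop()
        result.add(a)
        for head, body in rules:
            if head == a and body not in seen:  # avoid revisiting atoms
                seen.add(body)
                frontier.append(body)
    return result

# Hypothetical class hierarchy: Painting and Sculpture are subclasses
# of Artwork, OilPainting a subclass of Painting.
rules = [("Artwork", "Painting"), ("Painting", "OilPainting"),
         ("Artwork", "Sculpture")]
print(sorted(rewritings("Artwork", rules)))
# → ['Artwork', 'OilPainting', 'Painting', 'Sculpture']
```

Since every atom enters the frontier at most once, the chaining depth is bounded by the number of rules, matching the n-step bound stated in the excerpt.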
Context 20
... the FOL axiomatization of core-RDFS (see Figure 2) is made of safe rules only. Therefore, conjuncting views that are relevant to each atom of the query provides rewritings of the query. ...
Context 21
... output of the alignment algorithm is a set of alignment relationships between terms from the source ontologies. Figure 2 shows a simple merging algorithm. A new ontology is computed from the source ontologies and their identified alignment. ...

Similar publications

Preprint
Full-text available
Multiverse analysis, a paradigm for statistical analysis that considers all combinations of reasonable analysis choices in parallel, promises to improve transparency and reproducibility. Although recent tools help analysts specify multiverse analyses, they remain difficult to use in practice. In this work, we conduct a formative study with four multi...
Conference Paper
Full-text available
A data warehouse (DW) is supplied with data that come from external data sources (EDSs) that are production systems. EDSs, which are usually autonomous, often change not only their contents but also their structures. The evolution of external data sources has to be reflected in a DW that uses the sources. Traditional DW systems offer a limited supp...

Citations

... The essence of multi-version queries involves transforming the data of previous versions (that obey a previous structure) to the current version of the structure of the data warehouse, in order to allow their uniform querying with the current data. In this section, we discuss the adaptation of multiversion data warehouses [49], the use of data mining techniques in order to detect structural changes in data warehouses [50][51][52], and the use of graph representations (directed graphs) [53] in order to achieve correct cross-version queries. We summarize problems and solutions in a table at the end of the subsection. ...
... Since every previous version is accompanied by an augmented schema that transforms it to the current one, it is possible to pose a query that spans different versions and translate the data of the previous versions to a representation obeying the current schema, as explained above. Practically around the same time, Wrembel and Bebel [49] deal both with cross-version querying and with the problems that appear when changes take place at the external data sources (EDS) of a data warehouse. Those problems can be related to a multi-version data warehouse which is composed of a sequence of persistent versions that describe the schema and data for a given period of time. ...
Conference Paper
Like all software systems, databases are subject to evolution as time passes. The impact of this evolution is tremendous as every change to the schema of a database affects the syntactic correctness and the semantic validity of all the surrounding applications and de facto necessitates their maintenance in order to remove errors from their source code. This survey provides a walk-through on different approaches to the problem of handling database and data warehouse schema evolution. The areas covered include (a) published case studies with statistical information on database evolution, (b) techniques for managing schema and view evolution, (c) techniques pertaining to the area of data warehouses, and, (d) prospects for future research.
... When the user creates a new schema version from an existing one, an augmented schema is also associated with the old version: it is the most generic schema containing all the elements from both the new and the old versions. In [19], the authors also presented a metadata-based version management system for MVDWs. In both of the above-mentioned approaches, to answer a cross-version query, the user query is first converted into individual queries against each version, and then the results of these individual queries are combined and presented to the user. ...
... At a given instant, only one DW version is used to store data and it is called the current version. Although it is possible to derive multiple schema versions from the current version, for the sake of simplicity, we only consider the sequential versioning approach [19], in which a new version can be derived by applying changes to the current version only. Each version has an associated begin application time (BAT) and end application time (EAT) that represent a close-open interval during which a version is used to store data. ...
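The cross-version querying scheme described in these excerpts, in which the user query runs against each version and older data is translated to the current schema before the partial results are combined, can be sketched as follows. The version names, row layouts, and the translate functions are illustrative assumptions, not the cited systems' actual interfaces.

```python
versions = [
    {"name": "V1",  # an older version whose schema used "shop"
     "rows": [{"shop": "ShopA", "sales": 10}],
     "translate": lambda r: {"store": r["shop"], "sales": r["sales"]}},
    {"name": "V2",  # the current version: identity mapping
     "rows": [{"store": "ShopA", "sales": 20}],
     "translate": lambda r: r},
]

def cross_version_query(versions):
    """Answer a query spanning versions: translate each version's rows
    to the current schema, then combine the per-version results."""
    combined = []
    for v in versions:
        combined.extend(v["translate"](r) for r in v["rows"])
    return combined

print(cross_version_query(versions))
```

In a full system each per-version result would additionally carry its version's BAT/EAT interval, so the user can tell which period of reality each row describes.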
Conference Paper
Data warehouses (DWs) change in their content and structure due to changes in the feeding sources, business requirements, the modeled reality, and legislation, to name a few. Keeping the history of changes in the content and structure of a DW enables the user to analyze the state of the business world retrospectively or prospectively. Multiversion data warehouses (MVDWs) keep the history of content and structure changes by creating multiple data warehouse versions. Querying such DWs is complex as data is stored in multiple schema versions. In this paper, we discuss various schema changes in a multidimensional model, and elaborate their impact on the queries. Further, we also propose a system to support querying MVDWs.
... They located the changes that may affect the DW at three levels: physical, logical and semantic. The solutions proposed by (Solodovnikova, 2008) and (Wrembel et al., 2007) allow the automatic detection of the DS changes and assist the administrator in the propagation of these changes towards the DW. These studies are mainly based on the administrator's expertise and do not propose automatic propagation rules for the DW alterations. ...
Article
Modeling and data warehousing have been considered, for more than one decade, a challenging research topic for which different approaches have been proposed. Nevertheless, these proposals have focused on static aspects only. In practice, the evolution of the operational information system can lead to changes in its dependent multidimensional data warehouse (i.e., the warehouse that this system feeds with data), and therefore may require the evolution of the data warehouse model. In this evolving context, the authors propose a model-driven approach in order to automate the propagation of the evolutions that occurred in the source database towards the multidimensional data warehouse. This approach is based on two evolution models, along with a set of transformation rules formalized in Query/View/Transformation. This paper describes this evolution approach, for which we are developing a software prototype called DWE
... The alternative schema versions can be used for what-if analysis. In [12], the authors presented a logical model for the implementation of a MVDW and discussed various constraints to maintain data integrity across DW versions. They used the work presented in [13] to query data from multiple versions of a DW. ...
... The creation and maintenance of these structures is relatively complex. Further, the approaches to handling DW evolution either manage changes in the content only [9], changes in the schema only [2], or changes in the content and schema simultaneously [12]. Data warehouse versioning approaches support both changes in the content and the schema at the same time, but none of the existing versioning approaches deals with issues of schema and content evolution independently of each other. ...
... The EAT for the current version is set to UC (until-changed). It is possible to create alternative schema versions [12] using the model presented in Sect. 4, but for simplicity's sake, we do not consider the branching versioning model. Figure 1 shows an example of multiple DW versions. ...
Conference Paper
Data warehouse systems integrate data from heterogeneous sources. These sources are autonomous in nature and change independently of a data warehouse. Owing to changes in data sources, the content and the schema of a data warehouse may need to be changed for accurate decision making. Slowly changing dimensions and temporal data warehouses are the available solutions to manage changes in the content of the data warehouse. Multiversion data warehouses are capable of managing changes in the content and the structure simultaneously however, they are relatively complex and not easy to implement. In this paper, we present a logical model of a multiversion data warehouse which is capable of handling schema changes independently of changes in the content. We also introduce a new hybrid table version approach to implement the multiversion data warehouse.
... The method allows tracking history and comparing data using temporal modes of presentation, that is, data mapping into a particular structure version. In [13], metadata management solutions in a multiversion data warehouse are proposed. The above-mentioned papers do not address the problems of data warehouse adaptation after changes in data sources directly. ...
Conference Paper
We propose a query-driven method that elicits the information requirements from existing queries on data sources and their usage statistics. Our method presumes that the queries against the source database reflect the analysis needs of users. We use this method to recommend changes to the existing data warehouse schemata. In our method, we take advantage of the schema versioning approach to reflect all changes that occur in the analysed process, and we analyse the activity of users in the source system, rather than changes in physical data structure, to infer the necessary improvements to the data warehouse schema.
... Several authors [1], [9], [10] propose the data warehouse schema versioning approach to solve the problems of schema evolution. The main idea in [1] is to store augmented schemata together with schema versions to support cross-version querying. ...
... The method allows tracking history and comparing data using temporal modes of presentation, that is, data mapping into a particular structure version. In [9], metadata management solutions in a multiversion data warehouse are proposed. Issues related to queries over a multiversion data warehouse are considered in [11], but the translation of queries to SQL is not discussed. ...
Chapter
Full-text available
Data warehouses tend to evolve because of changes in data sources and business requirements of users. All these kinds of changes must be properly handled; therefore, data warehouse development is a never-ending process. In this paper we propose an evolution-oriented, user-centric data warehouse design, which on the one hand allows managing data warehouse evolution automatically or semi-automatically, and on the other hand provides users with understandable, easy, and transparent data analysis possibilities. The proposed approach supports versions of data warehouse schemata and data semantics.
... Kaas (2004) proposes operators for changing a DW schema, among them operators for inserting and deleting dimensions and levels. Other authors (Eder, 2001; Body, 2002; Morzy, 2004; Golfarelli, 2006; Ravat, 2006; Rechy-Ramirez, 2006; Wrembel, 2007) focus on the problem of DW versioning, that is, how to transform or query data spanning several DW versions arising from dimension changes. A recent survey on DWs considering temporal aspects can be found in Golfarelli (2009b). ...
... Kaas (2004) has proposed operators for changing DW schema, including operators for inserting and deleting dimensions and levels. Other authors (Eder, 2001; Body, 2002; Morzy, 2004; Golfarelli, 2006; Ravat, 2006; Rechy-Ramirez, 2006; Wrembel, 2007) have focused on DW versioning, i.e. how to transform and/or query data covering several DW versions arising from dimension changes. A recent survey on DWs considering temporal aspects can be found in Golfarelli (2009b). A few works have specifically dealt with reclassification. ...
Article
Full-text available
Data warehouse dimensions are usually considered to be static because their schema and data tend not to change; however, both dimension schema and dimension data can change. This paper focuses on a type of dimension data change called reclassification which occurs when a member of a certain level becomes a member of a higher level in the same dimension, e.g. when a product changes category (it is reclassified). This type of change gives rise to the notion of classification period and to a type of query that can be useful for decision-support. For example, What were total chess-set sales during first classification period in Toy category? A set of operators has been proposed to facilitate formulating this type of query and it is shown how to incorporate them in SQL, a familiar database developer language. Our operators’ expressivity is also shown because formulating such queries without using these operators usually leads to complex and nonintuitive solutions.
... Kaas (2004) has proposed operators for changing DW schema, including operators for inserting and deleting dimensions and levels. Other authors (Eder, 2001; Body, 2002; Morzy, 2004; Golfarelli, 2006; Ravat, 2006; Rechy-Ramirez, 2006; Wrembel, 2007) have focused on DW versioning, i.e. how to transform and/or query data covering several DW versions arising from dimension changes. A recent survey on DWs considering temporal aspects can be found in Golfarelli (2009b). ...
Article
Full-text available
Data warehouse dimensions are usually considered static because their schema and data tend not to change. However, both the schema and the data of dimensions can change. This article focuses on a type of dimension data change called reclassification, which occurs when a member of a level changes its member in a higher level of the dimension, e.g., when a product changes category (it is reclassified). This type of change gives rise to the concept of classification period and to a type of query that can be useful for decision making. For example, what were the total sales of the chess-set product during its first classification period in the toy category? To facilitate formulating this type of query, a set of operators is proposed, and it is shown how they are incorporated into SQL, a language familiar to database developers. The expressiveness of the proposed operators is also demonstrated, since formulating such queries without them usually leads to complex and unintuitive solutions.
... Kaas et al. [6] consider operators to change the DW schema, such as the insertion and deletion of dimensions and levels. Other authors, like Eder and Koncilia [7], Body et al. [8], Morzy and Wrembel [9], Golfarelli et al. [10], Ravat and Teste [11], Rechy-Ramirez and Benitez-Guerrero [12], and Wrembel and Bebel [13], focus on DW versioning, i.e., how to transform and/or query data that span several DW versions originated from dimension changes. For a recent survey on temporal DWs refer to [3]. ...
Article
Dimensions are usually considered static in a data warehouse. However, because of changing requirements, dimension data and dimension structure can evolve. In this paper we focus on a type of dimension data change called reclassification, i.e., when a member of a level changes its parent in a higher level of a dimension. This kind of change gives rise to the notion of season, i.e., an interval during which two members of a dimension are associated with each other. In this paper we extend a formal temporal multidimensional model with the notion of season and propose query language constructs to enable season queries. A case study about soccer illustrates the application of the proposed extensions, exemplified with several season queries.
... Wrembel et al. discussed detecting changes in external data sources and metadata management in a multi-version data warehouse [8]. Hauch et al. describe how MetaMatrix captures and manages metadata through the use of the OMG's MOF architecture and multiple domain-specific modeling languages, and how this semantic and syntactic metadata is then used for accessing and integrating data [9]. ...
Article
As a new paradigm for data warehousing demanded by today's decision-support community, DW 2.0 recognizes the life cycle of the data within it, which makes the metadata evolution mechanism one of the important research issues. The requirements of multi-version management for the four data sectors in the DW 2.0 environment are described. Then a novel metadata versioning meta-model is proposed that is capable of storing and managing schema versions, comparing and interpreting the results of version queries, and tracing version evolution. In the implementation, schema evolution with versions is expressed abstractly by model management operators, and a verification engine that resolves evolution inconsistencies is presented. A prototype has verified the approach's feasibility and validity.