Fig 2 - uploaded by Bartosz Bębel
d1 ≤D d3 and d9 ≤D d6, but d2 ≰D d4

Source publication
Conference Paper
Full-text available
A data warehouse (DW) is a database that integrates data from external data sources (EDSs) for the purpose of advanced analysis. EDSs are production systems that often change not only their contents but also their structures. The evolution of EDSs has to be reflected in a DW that integrates the sources. Traditional DW systems offer a limited suppor...

Contexts in source publication

Context 1
... note that the notion of default subsumption may appear strange to people accustomed to classical subsumption because of its symmetry. As a consequence, it does not define an ordering relationship on the description space D. The notation ≤D may be confusing with respect to this symmetry, but it reflects the underlying idea of generality. Fig. 2 gives two examples extracted from fig. 1 where the default subsumption holds and a third case where it does not. Let us consider the previous descriptions d1 ...
Context 2
... the interpretation of a role/co-role label pair as being a part-of or specialisation relation is delegated to the Commitment Layer, where the semantic axiomatisation takes place. A lexon could be approximately considered as a combination of an RDF/OWL triple and its inverse. Lexons and commitments are visualised in a NIAM-like schema (cfr. Fig. 2). ...
Context 3
... document title is "Java Enterprise in a Nutshell, Second Edition". In the DMoz web directory, reduced for the sake of presentation, the example title can be found through two different search paths (see Figure 2), namely: ...
Context 4
... challenge here is to disambiguate natural language words and labels. For example, the classifier has to understand that in the label of node n7 (see Figure 2) the word "Java" has at least three senses, which are: an island in Indonesia; a coffee beverage; and an object-oriented programming language. Moreover, words in a label are combined to build complex concepts. ...
Context 5
... an object, the classifier has to understand what the classification alternatives for this object are. For instance, the book "Java Enterprise in a Nutshell, Second Edition" might potentially be put in all the nodes of the hierarchy shown in Figure 2. The reason for this is that the book is related to both business and technology branches; ...
Context 6
... all the possible paths converge to the same semantically equivalent concept. Consider, for instance, node n8 in the classification shown in Figure 2. The two paths below will converge to the same concept for the node: ...
Context 7
... 4 (Disambiguating edges in a web directory). Recall the example of the part of the DMoz directory shown in Figure 2 and let us see how the concept at node n7 can be computed. Remember the three senses of the word "java" (which is the label of n7) discussed earlier in the paper, and consider the parent node's label, "programming languages", which is recognized as a multi-word with only one sense, whose gloss is "a language designed for programming computers". ...
Context 8
... 5 (Document classification). As an example, recall the classification in Figure 2, and suppose that we need to classify the book "Java Enterprise in a Nutshell, Second Edition", whose concept is java#3 enterprise#2 book#1. It can be shown, by means of propositional reasoning, that the set of classification alternatives includes all the nodes of the corresponding NFC. ...
Context 9
... to space constraints, the table does not embody all metamodel elements and correspondences in the different metamodels. Figure 2 presents the Generic Role based Metamodel GeRoMe at its current state, based on the analysis of the previous section. All role classes inherit from RoleObject but we omitted these links for the sake of readability. ...
Context 10
... top of the storage layer, an abstract object model corresponding to the model in fig. 2 has been implemented as a Java library. This is a set of interfaces and base implementations in Java. An implementation of these interfaces can be chosen by instantiating a factory class. Consequently, the object model is independent from the underlying implementation and storage strategy. The relationship between roles and model ...
Context 11
... the correspondences are not hidden in imperative code, but are given as a set of equivalence rules, the developer can concentrate on the logical correspondences and does not have to deal with implementation details. Besides, only two classes have to be implemented that produce facts about a concrete model from an API (e.g., the Jena OWL API, see fig. 12) or read facts and produce the model with calls to the API, respectively. These two classes merely produce (or read) a different syntactic representation of the native model and do not perform any sophisticated processing of schemas. Creating and processing of facts about the GeRoMe representation is completely done with ...
Context 12
... a ROLAP implementation, a data cube is stored in relational tables, some of them represent levels and are called level tables (e.g., Categories and Items in Fig. 2), while others store values of measures, and are called fact tables (Sales in Fig. 2). Two basic types of ROLAP schemas are used for the implementation of a data cube, i.e., a star schema and a snowflake schema [19]. In a star schema, a dimension is composed of only one level table (e.g., Time in Fig. 2). In a snowflake schema, a ...
Context 13
... a ROLAP implementation, a data cube is stored in relational tables, some of them represent levels and are called level tables (e.g., Categories and Items in Fig. 2), while others store values of measures, and are called fact tables (Sales in Fig. 2). Two basic types of ROLAP schemas are used for the implementation of a data cube, i.e., a star schema and a snowflake schema [19]. In a star schema, a dimension is composed of only one level table (e.g., Time in Fig. 2). In a snowflake schema, a dimension is composed of multiple level tables connected by foreign key - primary key ...
Context 14
... level tables (e.g., Categories and Items in Fig. 2), while others store values of measures and are called fact tables (Sales in Fig. 2). Two basic types of ROLAP schemas are used for the implementation of a data cube, i.e., a star schema and a snowflake schema [19]. In a star schema, a dimension is composed of only one level table (e.g., Time in Fig. 2). In a snowflake schema, a dimension is composed of multiple level tables connected by foreign key - primary key relationships (e.g., dimension Location with level tables Shops, Cities, and Regions). In practice, one also builds the so-called starflake schemas, where some dimensions are composed of multiple level tables and some ...
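The star/snowflake distinction described in this excerpt can be sketched with a few tables. The following is a minimal illustration using Python's built-in sqlite3; the table names (Sales, Time, Shops, Cities, Regions) follow the figure's example, while the column names are assumptions made for the sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Star-schema dimension: Time is a single level table.
cur.execute("CREATE TABLE Time (time_id INTEGER PRIMARY KEY, day TEXT, month TEXT)")

# Snowflake dimension: Location is split into multiple level tables
# linked by foreign key - primary key relationships.
cur.execute("CREATE TABLE Regions (region_id INTEGER PRIMARY KEY, name TEXT)")
cur.execute("""CREATE TABLE Cities (city_id INTEGER PRIMARY KEY, name TEXT,
               region_id INTEGER REFERENCES Regions(region_id))""")
cur.execute("""CREATE TABLE Shops (shop_id INTEGER PRIMARY KEY, name TEXT,
               city_id INTEGER REFERENCES Cities(city_id))""")

# Fact table storing measure values, referencing the lowest level of
# each dimension.
cur.execute("""CREATE TABLE Sales (time_id INTEGER REFERENCES Time(time_id),
               shop_id INTEGER REFERENCES Shops(shop_id),
               amount REAL)""")

tables = [r[0] for r in cur.execute(
    "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name")]
print(tables)
```

A starflake schema would simply mix the two styles, keeping some dimensions in one level table and splitting others.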
Context 15
... 2. In order to illustrate the idea and usage of mapping tables, let us consider the DW schema from Fig. 2 and let us assume that initially, in the real versions from February (R_FEB) to March (R_MAR), there existed 3 shops, namely ShopA, ShopB, and ShopC, that were represented by appropriate instances of the Location dimension. In April, a new DW version was created, namely R_APR, in order to represent a new reality where ShopA and ShopB ...
Context 16
... 3. In order to illustrate annotating result sets of SVQs with metadata (step 3), let us consider the DW schema from Fig. 2. Let us further assume that initially, in the real version from April 2004 (R_APR), there existed 3 shops, namely ShopA, ShopB, and ShopC. These shops were selling porotherm bricks with 7% VAT (tax). Let us assume that in May, porotherm bricks were reclassified to the 22% VAT category (which is a real case in Poland after joining the ...
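The annotation scheme from this excerpt can be sketched as follows: the same query runs against each version, and every partial result is tagged with the metadata of the version it came from. The version name R_MAY, the dictionary layout, and the numeric values are assumptions made for the sketch; only the 7%/22% VAT figures come from the text.

```python
# Hypothetical per-version data and metadata: porotherm bricks carried
# 7% VAT in R_APR and 22% after the May reclassification.
versions = {
    "R_APR": {"data": [("porotherm brick", 1000.0)], "vat": 0.07},
    "R_MAY": {"data": [("porotherm brick", 1200.0)], "vat": 0.22},
}

def query_all_versions(versions):
    """Run the same single-version query against every version and
    annotate each partial result with its version's metadata."""
    annotated = []
    for name, v in sorted(versions.items()):
        for product, net in v["data"]:
            annotated.append({"version": name, "product": product,
                              "net": net, "vat": v["vat"]})
    return annotated

for row in query_all_versions(versions):
    print(row)
```

With the annotation attached, a user can see that sales figures from different versions were computed under different VAT rates instead of silently comparing incomparable numbers.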
Context 17
... structure of the mapping metaschema is shown in Fig. 12 (represented in the Oracle notation). The SRC_SOURCES dictionary table stores descriptions of external data sources. It contains, among others, connection parameters for accessing every EDS. Data about EDS data structures whose changes are to be monitored are registered in two dictionary tables: SRC_OBJECTS and SRC_ATTRIBUTES. All ...
Context 18
... range typing is made with the class P2:Painting of P2. If we suppose that this mapping belongs to P1, its graphical notation is the one in Figure 2. In that case, P2:Painting is a shared relation between P1 and P2. ...
Context 19
... The schema of a SomeRDFS PDMS forms a knowledge base R of function-free Horn rules with single conditions (see the FOL axiomatization of core-RDFS in Figure 2). A simple backward chaining algorithm [18] with cycle detection applied to each atom of a user query Q ensures finding all the maximal conjunctive rewritings of each atom of Q with at most n chaining steps, if n is the number of rules in the schema. ...
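The backward chaining with cycle detection mentioned above can be sketched for the single-condition case. The rule encoding and the RDFS-like class names below are illustrative assumptions: a pair (head, body) reads "head(x) <- body(x)", and the rewritings of an atom are all atoms from which it can be derived by chaining through the rules.

```python
def rewritings(atom, rules):
    """Backward chaining over single-condition Horn rules; the `seen`
    set provides the cycle detection, so rule cycles terminate."""
    result, frontier, seen = set(), [atom], {atom}
    while frontier:
        a = frontier.pop()
        result.add(a)
        for head, body in rules:
            if head == a and body not in seen:  # avoid revisiting atoms
                seen.add(body)
                frontier.append(body)
    return result

# Hypothetical class hierarchy: Painting and Sculpture are subclasses
# of Artwork, OilPainting a subclass of Painting.
rules = [("Artwork", "Painting"), ("Painting", "OilPainting"),
         ("Artwork", "Sculpture")]
print(sorted(rewritings("Artwork", rules)))
# → ['Artwork', 'OilPainting', 'Painting', 'Sculpture']
```

Since every atom enters the frontier at most once, the chaining depth is bounded by the number of rules, matching the n-step bound stated in the excerpt.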
Context 20
... the FOL axiomatization of core-RDFS (see Figure 2) is made of safe rules only. Therefore, conjuncting views that are relevant to each atom of the query provides rewritings of the query. ...
Context 21
... output of the alignment algorithm is a set of alignment relationships between terms from the source ontologies. Figure 2 shows a simple merging algorithm. A new ontology is computed from the source ontologies and their identified alignment. ...

Similar publications

Preprint
Full-text available
Multiverse analysis, a paradigm for statistical analysis that considers all combinations of reasonable analysis choices in parallel, promises to improve transparency and reproducibility. Although recent tools help analysts specify multiverse analyses, they remain difficult to use in practice. In this work, we conduct a formative study with four multi...
Conference Paper
Full-text available
A data warehouse (DW) is supplied with data that come from external data sources (EDSs) that are production systems. EDSs, which are usually autonomous, often change not only their contents but also their structures. The evolution of external data sources has to be reflected in a DW that uses the sources. Traditional DW systems offer a limited supp...

Citations

... The essence of multi-version queries involves transforming the data of previous versions (that obey a previous structure) to the current version of the structure of the data warehouse, in order to allow their uniform querying with the current data. In this section, we discuss the adaptation of multiversion data warehouses [49], the use of data mining techniques in order to detect structural changes in data warehouses [50][51][52], and the use of graph representations (directed graphs) [53] in order to achieve correct cross-version queries. We summarize problems and solutions in a table at the end of the subsection. ...
... Since every previous version is accompanied by an augmented schema that transforms it to the current one, it is possible to pose a query that spans different versions and translate the data of the previous versions to a representation obeying the current schema, as explained above. Practically around the same time, Wrembel and Bebel [49] deal both with cross-version querying and with the problems that appear when changes take place at the external data sources (EDS) of a data warehouse. Those problems can be related to a multi-version data warehouse which is composed of a sequence of persistent versions that describe the schema and data for a given period of time. ...
Conference Paper
Like all software systems, databases are subject to evolution as time passes. The impact of this evolution is tremendous as every change to the schema of a database affects the syntactic correctness and the semantic validity of all the surrounding applications and de facto necessitates their maintenance in order to remove errors from their source code. This survey provides a walk-through on different approaches to the problem of handling database and data warehouse schema evolution. The areas covered include (a) published case studies with statistical information on database evolution, (b) techniques for managing schema and view evolution, (c) techniques pertaining to the area of data warehouses, and, (d) prospects for future research.
... When the user creates a new schema version from an existing one, an augmented schema is also associated with the old version: it is the most generic schema containing all the elements from both the new and the old versions. In [19], the authors also presented a metadata-based version management system for MVDWs. In both of the above-mentioned approaches, to answer a cross-version query, the user query is first converted into individual queries against each version, and then the results of these individual queries are combined and presented to the user. ...
... At a given instant, only one DW version is used to store data and it is called the current version. Although it is possible to derive multiple schema versions from the current version, for the sake of simplicity, we only consider the sequential versioning approach [19], in which a new version can be derived by applying changes to the current version only. Each version has an associated begin application time (BAT) and end application time (EAT) that represent a close-open interval during which a version is used to store data. ...
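The cross-version querying scheme described in these excerpts, in which the user query runs against each version and older data is translated to the current schema before the partial results are combined, can be sketched as follows. The version names, row layouts, and the translate functions are illustrative assumptions, not the cited systems' actual interfaces.

```python
versions = [
    {"name": "V1",  # an older version whose schema used "shop"
     "rows": [{"shop": "ShopA", "sales": 10}],
     "translate": lambda r: {"store": r["shop"], "sales": r["sales"]}},
    {"name": "V2",  # the current version: identity mapping
     "rows": [{"store": "ShopA", "sales": 20}],
     "translate": lambda r: r},
]

def cross_version_query(versions):
    """Answer a query spanning versions: translate each version's rows
    to the current schema, then combine the per-version results."""
    combined = []
    for v in versions:
        combined.extend(v["translate"](r) for r in v["rows"])
    return combined

print(cross_version_query(versions))
```

In a full system each per-version result would additionally carry its version's BAT/EAT interval, so the user can tell which period of reality each row describes.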
Conference Paper
Data warehouses (DWs) change in their content and structure due to changes in the feeding sources, business requirements, the modeled reality, and legislation, to name a few. Keeping the history of changes in the content and structure of a DW enables the user to analyze the state of the business world retrospectively or prospectively. Multiversion data warehouses (MVDWs) keep the history of content and structure changes by creating multiple data warehouse versions. Querying such DWs is complex as data is stored in multiple schema versions. In this paper, we discuss various schema changes in a multidimensional model, and elaborate their impact on the queries. Further, we also propose a system to support querying MVDWs.
... They located the changes that may affect the DW at three levels: physical, logical and semantic. The solutions proposed by (Solodovnikova, 2008) and (Wrembel et al., 2007) allow the automatic detection of the DS changes and assist the administrator in the propagation of these changes towards the DW. These studies are mainly based on the administrator's expertise and do not propose automatic propagation rules for the DW alterations. ...
Article
Modeling and data warehousing have been considered, for more than one decade, a challenging research topic for which different approaches have been proposed. Nevertheless, these proposals have focused on static aspects only. In practice, the evolution of the operational information system can lead to changes in its dependent multidimensional data warehouse (i.e., the warehouse that this system feeds with data), and therefore may require the evolution of the data warehouse model. In this evolving context, the authors propose a model-driven approach in order to automate the propagation of the evolutions that occurred in the source database towards the multidimensional data warehouse. This approach is based on two evolution models, along with a set of transformation rules formalized in Query/View/Transformation. This paper describes this evolution approach, for which we are developing a software prototype called DWE
... The alternative schema versions can be used for what-if analysis. In [12], the authors presented a logical model for the implementation of a MVDW and discussed various constraints to maintain data integrity across DW versions. They used the work presented in [13] to query data from multiple versions of a DW. ...
... The creation and maintenance of these structures is relatively complex. Further, the approaches to handling DW evolution either manage changes in the content only [9], changes in the schema only [2], or changes in the content and schema simultaneously [12]. Data warehouse versioning approaches support both changes in the content and the schema at the same time, but none of the existing versioning approaches deals with issues of schema and content evolution independently of each other. ...
... The EAT for the current version is set to UC (until-changed). It is possible to create alternative schema versions [12] using the model presented in Sect. 4, but for simplicity's sake, we do not consider the branching versioning model. Figure 1 shows an example of multiple DW versions. ...
Conference Paper
Data warehouse systems integrate data from heterogeneous sources. These sources are autonomous in nature and change independently of a data warehouse. Owing to changes in data sources, the content and the schema of a data warehouse may need to be changed for accurate decision making. Slowly changing dimensions and temporal data warehouses are the available solutions to manage changes in the content of the data warehouse. Multiversion data warehouses are capable of managing changes in the content and the structure simultaneously however, they are relatively complex and not easy to implement. In this paper, we present a logical model of a multiversion data warehouse which is capable of handling schema changes independently of changes in the content. We also introduce a new hybrid table version approach to implement the multiversion data warehouse.
... The method allows tracking history and comparing data using temporal modes of presentation, that is, data mapping into a particular structure version. In [13], metadata management solutions in a multiversion data warehouse are proposed. The above-mentioned papers do not address the problems of data warehouse adaptation after changes in data sources directly. ...
Conference Paper
We propose a query-driven method that elicits the information requirements from existing queries on data sources and their usage statistics. Our method presumes that the queries against the source database reflect the analysis needs of users. We use this method to recommend changes to the existing data warehouse schemata. In our method, we take advantage of the schema versioning approach to reflect all changes that occur in the analysed process, and we analyse the activity of users in the source system, rather than changes in physical data structure, to infer the necessary improvements to the data warehouse schema.
... Several authors [1], [9], [10] propose the data warehouse schema versioning approach to solve the problems of schema evolution. The main idea in [1] is to store augmented schemata together with schema versions to support cross-version querying. ...
... The method allows tracking history and comparing data using temporal modes of presentation, that is, data mapping into a particular structure version. In [9], metadata management solutions in a multiversion data warehouse are proposed. Issues related to queries over a multiversion data warehouse are considered in [11], but the translation of queries to SQL is not discussed. ...
Chapter
Full-text available
Data warehouses tend to evolve because of changes in data sources and business requirements of users. All these kinds of changes must be properly handled; therefore, data warehouse development is a never-ending process. In this paper we propose an evolution-oriented, user-centric data warehouse design, which on the one hand allows managing data warehouse evolution automatically or semi-automatically, and on the other hand provides users with understandable, easy, and transparent data analysis possibilities. The proposed approach supports versions of data warehouse schemata and data semantics.
... Kaas (2004) proposes operators for changing a DW schema, among them operators for inserting and deleting dimensions and levels. Other authors (Eder, 2001; Body, 2002; Morzy, 2004; Golfarelli, 2006; Ravat, 2006; Rechy-Ramirez, 2006; Wrembel, 2007) focus on the problem of DW versioning, that is, how to transform or query data spanning several DW versions arising from dimension changes. A recent survey on DWs considering temporal aspects can be found in Golfarelli (2009b). ...
... Kaas (2004) has proposed operators for changing DW schema, including operators for inserting and deleting dimensions and levels. Other authors (Eder, 2001; Body, 2002; Morzy, 2004; Golfarelli, 2006; Ravat, 2006; Rechy-Ramirez, 2006; Wrembel, 2007) have focused on DW versioning, i.e. how to transform and/or query data covering several DW versions arising from dimension changes. A recent survey on DWs considering temporal aspects can be found in Golfarelli (2009b). A few works have specifically dealt with reclassification. ...
Article
Full-text available
Data warehouse dimensions are usually considered to be static because their schema and data tend not to change; however, both dimension schema and dimension data can change. This paper focuses on a type of dimension data change called reclassification which occurs when a member of a certain level becomes a member of a higher level in the same dimension, e.g. when a product changes category (it is reclassified). This type of change gives rise to the notion of classification period and to a type of query that can be useful for decision-support. For example, What were total chess-set sales during first classification period in Toy category? A set of operators has been proposed to facilitate formulating this type of query and it is shown how to incorporate them in SQL, a familiar database developer language. Our operators’ expressivity is also shown because formulating such queries without using these operators usually leads to complex and nonintuitive solutions.
... Kaas (2004) has proposed operators for changing DW schema, including operators for inserting and deleting dimensions and levels. Other authors (Eder, 2001; Body, 2002; Morzy, 2004; Golfarelli, 2006; Ravat, 2006; Rechy-Ramirez, 2006; Wrembel, 2007) have focused on DW versioning, i.e. how to transform and/or query data covering several DW versions arising from dimension changes. A recent survey on DWs considering temporal aspects can be found in Golfarelli (2009b). ...
Article
Full-text available
Data warehouse dimensions are usually considered static because their schema and data tend not to change. However, both the schema and the data of dimensions can change. This article focuses on a type of dimension data change called reclassification, which occurs when a member of a level changes its member in a higher level of the dimension, e.g., when a product changes category (it is reclassified). This type of change gives rise to the concept of classification period and to a type of query that can be useful for decision making. For example, what were the total sales of the chess-set product during its first classification period in the toy category? To facilitate formulating this type of query, a set of operators is proposed, and it is shown how they are incorporated into SQL, a language familiar to database developers. The expressiveness of the proposed operators is also demonstrated, since formulating such queries without them usually leads to complex and unintuitive solutions.
... Kaas et al. [6] consider operators to change the DW schema, such as the insertion and deletion of dimensions and levels. Other authors, like Eder and Koncilia [7], Body et al. [8], Morzy and Wrembel [9], Golfarelli et al. [10], Ravat and Teste [11], Rechy-Ramirez and Benitez-Guerrero [12], and Wrembel and Bebel [13], focus on DW versioning, i.e., how to transform and/or query data that span several DW versions originated from dimension changes. For a recent survey on temporal DWs refer to [3]. ...
Article
Dimensions are usually considered static in a data warehouse. However, because of changing requirements, dimension data and dimension structure can evolve. In this paper we focus on a type of dimension data change called reclassification, i.e., when a member of a level changes its parent in a higher level of a dimension. This kind of change gives rise to the notion of season, i.e., an interval during which two members of a dimension are associated with each other. In this paper we extend a formal temporal multidimensional model with the notion of season and propose query language constructs to enable season queries. A case study about soccer illustrates the application of the proposed extensions, exemplified with several season queries.
... Wrembel et al. discussed detecting changes in external data sources and metadata management in a multi-version data warehouse [8]. Hauch et al. describe how MetaMatrix captures and manages metadata through the use of the OMG's MOF architecture and multiple domain-specific modeling languages, and how this semantic and syntactic metadata is then used for accessing and integrating data [9]. ...
Article
As a new paradigm for data warehousing demanded by today's decision-support community, DW 2.0 recognizes the life cycle of the data within it, which makes the metadata evolution mechanism one of the important research issues. The requirements of multi-version management for the four data sectors in the DW 2.0 environment are described. Then a novel metadata versioning meta-model is proposed that is capable of storing and managing schema versions, comparing and interpreting the results of version queries, and tracing version evolution. In the implementation, schema evolution with versions is expressed abstractly by model management operators, and a verification engine that resolves evolution inconsistencies is presented. A prototype has verified the approach's feasibility and validity.