Table 1 - uploaded by Franck Ravat
Transformation operations

Source publication
Conference Paper
During the last few years, several frameworks have dealt with Data Warehousing (DW) design issues. Most of these frameworks provide partial answers that focus either on multidimensional (MD) modelling or on Extraction-Transformation-Loading (ETL) modelling. Yet, neither the study of unifying both modelling issues nor their automation have been cons...

Context in source publication

Context 1
... OCL lacks ways to express the data aggregation often used to transform the source data. We extend OCL with an aggregation function as follows: Table 1 presents the main ETL conceptual operations and their definitions using the ETL-OCL language [17], [19]. ...

Citations

... Most frameworks typically focus either on DW design or on ETL process modelling. However, Atigui et al. (2012) proposed a generic unified method that automatically integrated DW and ETL design. Their approach employed the MDA framework and used UML profile and diagrams for DW and ETL design, while the extraction formulas were formalized using an OCL extension. ...
Article
Maintenance of Data Warehouse (DW) systems is a critical task because any downtime or data loss can have significant consequences on business applications. Existing DW maintenance solutions mostly rely on concrete technologies and tools that are dependent on: the platform on which the DW system was created; the specific data extraction, transformation, and loading (ETL) tool; and the database language the DW uses. Different languages for different versions of DW systems make organizing DW processes difficult, as minimal changes in the structure require major changes in the application code for managing ETL processes. This article proposes a domain-specific language (DSL) for ETL process management that mitigates these problems by centralizing all program logic, making it independent from a particular platform. This approach would simplify DW system maintenance. The platform-independent language proposed in this article also provides an easier way to create a unified environment to control DW processes, regardless of the language, environment, or ETL tool the DW uses.
... The R-OLAP approach allows transforming the multidimensional data model of a data warehouse into relational logical models in the form of star or snowflake schemas. These relational logical models are automatically generated from conceptual models by applying a set of rules [49]. Using these transformation rules in the context of big data has many weaknesses ascribed to the limitations of the relational data model mainly when queries require multiple complex aggregations. ...
Article
Nowadays, the data used for decision-making come from a wide variety of sources which are difficult to manage using relational databases. To address this problem, many researchers have turned to Not only SQL (NoSQL) databases to provide scalability and flexibility for On-Line Analytical Processing (OLAP) systems. In this paper, we propose a set of formal rules to convert a multidimensional data model into a graph data model (MDM2G). These rules allow conventional star and snowflake schemas to fit into NoSQL graph databases. We apply the proposed rules to implement star-like and snowflake-like graph data warehouses. We compare their performance to that of similar relational ones, focusing on the data model, dimensionality, and size. The experimental results show large differences between relational and graph implementations of a data warehouse. A relational implementation performs better for queries on a couple of tables, but conversely, a graph implementation is better when queries involve many tables. Surprisingly, the performance of star-like and snowflake-like graph data warehouses is very close. Hence, a snowflake schema could be used in order to easily consider new sub-dimensions in a graph data warehouse.
... Several works [5], [9], [11], [12] have dealt with ETL process modeling, but they do not focus on incorporating the pre-processing phase of the ETL process from the conceptual modeling phase of the DW onward. Furthermore, it has been noticed that when designing the ETL process, people tend to overlook the work done in the conceptual phases, which contains useful knowledge for the ETL process. ...
... Various approaches for designing and optimizing the ETL process have been proposed in the last few years [5], [9], [11], [12]. These approaches can be classified into three main groups. ...
... The PIM is modeled using the UML activity diagram. Atigui et al. [9] have proposed an approach in which the designer builds a unified conceptual model (PIM) that describes the multidimensional structures and the related ETL process. ...
Conference Paper
Building the ETL process is potentially one of the biggest tasks of building a warehouse. In fact, it is complex and time consuming, and it consumes most of a data warehouse project's implementation effort, cost, and resources. Moreover, differences in data structures impose new requirements on ETL process implementation and maintenance. What makes these tasks even more challenging is the fact that data continue to grow rapidly and business requirements change over time. In this paper, we propose a method comprising two ETL phases: one handles the pre-treatment phase and the other deals with the actual ETL. Our method consists of determining the correspondence table, modeling new operations using the Business Process Modeling Notation (BPMN), and implementing these operations with Talend Open Source (TOS). In addition, our method allows the ETL process to be designed at an earlier stage, which greatly facilitates its implementation. Another advantage of our proposal is the use of BPMN, which helps bridge the communication gap that often occurs between the design and implementation of business processes.