Figure 6 (uploaded by Abdeltawab Hendawi): Relational schema DS2 for products-orders database
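The schema diagram itself is not reproduced on this page. Purely as a hedged illustration of what a products-orders relational schema such as DS2 might contain, the following minimal sqlite3 sketch defines hypothetical product, customer, order, and order-line tables; every table and column name here is an assumption for illustration, not taken from the figure.

```python
import sqlite3

# Hypothetical products-orders schema; table and column names are
# illustrative assumptions, not the actual DS2 schema from Figure 6.
DDL = """
CREATE TABLE product (
    product_id   INTEGER PRIMARY KEY,
    name         TEXT NOT NULL,
    unit_price   REAL NOT NULL
);

CREATE TABLE customer (
    customer_id  INTEGER PRIMARY KEY,
    name         TEXT NOT NULL
);

CREATE TABLE "order" (
    order_id     INTEGER PRIMARY KEY,
    customer_id  INTEGER NOT NULL REFERENCES customer(customer_id),
    order_date   TEXT NOT NULL
);

CREATE TABLE order_line (
    order_id     INTEGER NOT NULL REFERENCES "order"(order_id),
    product_id   INTEGER NOT NULL REFERENCES product(product_id),
    quantity     INTEGER NOT NULL,
    PRIMARY KEY (order_id, product_id)
);
"""

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.executescript(DDL)
    # List the created tables to confirm the sketch loads without errors.
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    ).fetchall()
    print([t[0] for t in tables])
```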

Source publication
Article
During the last few years, researchers and developers have proposed various attempts to establish a standard conceptual design of ETL processes in data warehouses. These attempts try to represent the main mapping activities at the conceptual level. Due to the limitations of the previous attempts, in this paper 1) we propose a model for the conceptual design of ETL proce...
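To make the idea of representing mapping activities at the conceptual level concrete, the following minimal Python sketch records source-to-target attribute mappings with attached transformations. It is not the paper's entity mapping diagram (EMD) notation; the entity, attribute, and transformation names are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import Any, Callable

# A minimal, hypothetical representation of a conceptual ETL mapping:
# each mapping links a source entity/attribute to a target entity/attribute
# through a transformation. Illustration only, not the paper's EMD notation.
@dataclass
class Mapping:
    source_entity: str
    source_attribute: str
    transformation: Callable[[Any], Any]
    target_entity: str
    target_attribute: str

# Example mappings for a hypothetical products-orders source feeding a
# sales data warehouse (all names are assumptions).
mappings = [
    Mapping("orders", "order_date", lambda v: v[:4], "sales_fact", "order_year"),
    Mapping("products", "unit_price", lambda v: round(v * 1.2, 2), "sales_fact", "price_with_tax"),
]

def apply_mappings(source_row: dict, entity: str) -> dict:
    """Apply all mappings defined for one source entity to a single row."""
    target = {}
    for m in mappings:
        if m.source_entity == entity and m.source_attribute in source_row:
            target[m.target_attribute] = m.transformation(source_row[m.source_attribute])
    return target

print(apply_mappings({"order_date": "2024-05-01"}, "orders"))   # {'order_year': '2024'}
print(apply_mappings({"unit_price": 10.0}, "products"))         # {'price_with_tax': 12.0}
```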

Similar publications

Conference Paper
This work involves the comparison of protein information on a genomic scale. The main goal is to improve the quality and interpretation of biological data, as well as our understanding of biological systems and their interactions. Stringent comparisons were obtained after applying the Smith-Waterman algorithm in a pairwise manner to all pre...
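Since the excerpt names the Smith-Waterman algorithm applied pairwise, a minimal Python sketch of its local-alignment scoring recurrence is given below. The scoring parameters (match, mismatch, gap) are assumptions, and traceback as well as the optimizations needed for genome-scale all-against-all comparison are omitted.

```python
def smith_waterman_score(a: str, b: str, match: int = 2,
                         mismatch: int = -1, gap: int = -2) -> int:
    """Return the best local-alignment score between sequences a and b.

    Minimal dynamic-programming sketch of Smith-Waterman; scoring values
    are illustrative assumptions, and traceback (to recover the actual
    alignment) is omitted for brevity.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = score[i - 1][j] + gap
            left = score[i][j - 1] + gap
            score[i][j] = max(0, diag, up, left)  # 0 restarts the local alignment
            best = max(best, score[i][j])
    return best

# Pairwise comparison of two toy protein fragments.
print(smith_waterman_score("HEAGAWGHEE", "PAWHEAE"))
```

In an all-against-all setting, this scoring would be applied to every pair of predicted proteins, usually through optimized implementations rather than pure Python.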

Citations

... Finally, [32] defined the architecture of a prototype tool, "EMD builder". This prototype tool was implemented in [89]. ...
... However, despite the effort in [22] to propose an extension mechanism that allows UML to model ETL transformations at the low "attribute" level, other authors [6,34,39] consider that this gap still constrains them. They argue that UML-based modeling at the attribute level leads to overly complicated models, unlike using conceptual constructs to conceptually model the elements involved in the ETL process, as mentioned in [77,89]. ...
Article
The extract, transform, and load (ETL) process is at the core of data warehousing architectures. As such, the success of data warehouse (DW) projects is essentially based on the proper modeling of the ETL process. As there is no standard model for the representation and design of this process, several researchers have made efforts to propose modeling methods based on different formalisms, such as unified modeling language (UML), ontology, model-driven architecture (MDA), model-driven development (MDD), and graphical flow, which includes business process model and notation (BPMN), colored Petri nets (CPN), Yet Another Workflow Language (YAWL), CommonCube, entity mapping diagram (EMD), and so on. With the emergence of Big Data, despite the multitude of relevant approaches proposed for modeling the ETL process in classical environments, part of the community has been motivated to provide new data warehousing methods that support Big Data specifications. In this paper, we present a summary of relevant works related to the modeling of data warehousing approaches, from classical ETL processes to ELT design approaches. A systematic literature review is conducted and a detailed set of comparison criteria is defined in order to allow the reader to better understand the evolution of these processes. Our study paints a complete picture of ETL modeling approaches, from their advent to the era of Big Data, while comparing their main characteristics. This study allows for the identification of the main challenges and issues related to the design of Big Data warehousing systems, mainly involving the lack of a generic design model for data collection, storage, processing, querying, and analysis.
... This work did not investigate the generalization of the results obtained in the context of ETL process models. Hendawi et al. proposed a conceptual model, the entity mapping diagram, as a simplified model for representing ETL processes for data warehouses [17]. However, the transformation was not implemented. ...
Article
This paper presents an extract-transform-load (ETL) approach based on multilayer task execution for processing massive sequential data collected from infrastructure operation and maintenance. The proposed approach consists of ETL task partition, execution mode selection, and ETL modeling. The task partition divides the ETL process into four tasks that are executed in accordance with different organizational forms of data. Either a sequenced or a non-sequenced load mode can be selected, independently of the data standardization. In addition, the ETL modeling phase implements conceptual, logical, and physical modeling for the multi-dimensional model. Our main objective is to integrate massive sequential data, enhancing decision-making performance for the intelligent management platform. Traffic data for two years were collected from various systems and acquisition tools of different providers to evaluate the data integration capability of the proposed approach. Furthermore, the Kettle software was used to implement the transformation and job modules for the multilayer tasks. In addition, a machine learning algorithm was used to generate traffic warnings in the tunnels based on the integrated data. The proposed approach is promising for the management and analysis of massive sequential data generated in the operation and maintenance of transportation tunnels, as well as for effective decision-making.
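As a loose sketch of the multilayer-task idea only (not the paper's Kettle-based implementation), the following Python snippet chains hypothetical extract and standardization tasks and offers a sequenced or non-sequenced load; all task names, fields, and sample records are assumptions.

```python
from typing import Callable, Iterable

Record = dict

def extract(rows: Iterable[Record]) -> list[Record]:
    """Task 1 (hypothetical): pull raw records from a source system."""
    return list(rows)

def standardize(rows: list[Record]) -> list[Record]:
    """Task 2 (hypothetical): normalise field names and units."""
    return [{"sensor": r["sensor"], "ts": r["ts"], "speed_kmh": float(r["speed"])}
            for r in rows]

def load(rows: list[Record], sequenced: bool = True) -> list[Record]:
    """Task 3 (hypothetical): load records either in timestamp order
    (the 'sequenced' mode mentioned in the abstract) or as received."""
    return sorted(rows, key=lambda r: r["ts"]) if sequenced else rows

# Compose the multilayer tasks into one pipeline; each stage is independent
# and could correspond to a separate Kettle transformation or job in practice.
pipeline: list[Callable] = [extract, standardize]

raw = [
    {"sensor": "T1", "ts": "2021-03-02T08:01", "speed": "62.0"},
    {"sensor": "T1", "ts": "2021-03-02T08:00", "speed": "58.5"},
]

data = raw
for task in pipeline:
    data = task(data)

print(load(data, sequenced=True))
```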
Article
This study aims to investigate whether data infrastructure and resource support affect the integration of business intelligence (BI) into enterprise resource planning (ERP) systems. A Bayesian network model that includes the variables data warehouse, OLAP, data mining, ERP vendor, online period of ERP, return on assets, return on sales, return on investment, sales over employees, and BI implementation was developed to investigate the issues of this research. Empirical findings from ERP-implemented manufacturers suggest that BI implementation may not have a positive impact on financial performance. In contrast, BI-implemented companies generally have more complicated data infrastructure than companies without BI systems. In addition, the results of Bayesian inference suggest that ERP vendor, data warehouse, OLAP, and data mining may have significant impacts on the implementation of BI systems. Hence, companies should choose their ERP solutions carefully or start planning their data infrastructure if they expect to adopt BI solutions in the future.
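As a rough sketch of the kind of model described, not the study's actual network, data, or results, the snippet below lists a few of the named variables as nodes with assumed edges toward BI implementation and estimates one conditional probability from a tiny made-up sample.

```python
# Hypothetical structure: edges pointing into "BI" (BI implementation) from
# infrastructure variables named in the abstract. The edges and the data
# below are illustrative assumptions, not the study's learned network.
edges = [("data_warehouse", "BI"), ("OLAP", "BI"),
         ("data_mining", "BI"), ("ERP_vendor", "BI")]

# Tiny made-up sample: (data_warehouse, OLAP, BI implemented), all binary.
samples = [
    (1, 1, 1), (1, 1, 1), (1, 0, 0), (0, 0, 0),
    (1, 1, 0), (0, 1, 0), (1, 1, 1), (0, 0, 0),
]

def p_bi_given(dw: int, olap: int) -> float:
    """Estimate P(BI = 1 | data_warehouse = dw, OLAP = olap) by counting."""
    matching = [bi for d, o, bi in samples if d == dw and o == olap]
    return sum(matching) / len(matching) if matching else float("nan")

print("Edges:", edges)
print("P(BI=1 | DW=1, OLAP=1) =", p_bi_given(1, 1))  # 3/4 on this toy data
```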
Article
Data warehouses play an important role in the strategic decision-making process for complex business solutions. To gain competitive advantage, business executives increasingly make use of data warehouse concepts, which play a vital role in analysing and predicting future trends based on past and current scenarios. We have surveyed the various techniques used in building data warehouses and the methods used to implement those techniques. We conducted an in-depth survey of the existing literature from well-known international journals to come up with a framework that will help researchers focus on specific and emerging areas in the field of data warehouse development, as well as on applications of data warehouses in various business domains.