Memory Architecture for Optimized HYBRIDJOIN.

Source publication
Article
Full-text available
As rapid decision making in business organizations gains in popularity, the complexity and adaptability of the extract, transform, and load (ETL) process in near real-time data warehousing have increased dramatically. The most important task of a near real-time data warehouse is feeding in new data from different data sources on a near-real-time basis. However,...
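The excerpt above describes HYBRIDJOIN only at a high level, so the following is a minimal Python sketch of the in-memory structures usually attributed to the algorithm: a hash table of buffered stream tuples, a queue recording arrival order, and a disk buffer holding one partition of the master relation R at a time. It is an illustration under those assumptions, not the optimized memory architecture the article proposes, and `load_partition` is a hypothetical disk-access callback.

```python
from collections import deque

class HybridJoinSketch:
    """Toy model of HYBRIDJOIN's in-memory components: a hash table H
    of buffered stream tuples keyed on the join attribute, a queue Q
    recording their arrival order, and a disk buffer that holds one
    partition of the master relation R at a time."""

    def __init__(self, load_partition):
        self.H = {}                 # join key -> buffered stream tuples
        self.Q = deque()            # join keys in arrival order
        self.load_partition = load_partition   # reads one partition of R

    def add_stream_tuples(self, tuples, key_of):
        for t in tuples:
            k = key_of(t)
            self.H.setdefault(k, []).append(t)
            self.Q.append(k)

    def iterate(self):
        """One iteration: the oldest queued key selects which partition
        of R to load; every R tuple in it then probes H, so a single
        disk read can join many buffered stream tuples at once."""
        while self.Q and self.Q[0] not in self.H:
            self.Q.popleft()        # skip keys already joined earlier
        if not self.Q:
            return []
        oldest = self.Q.popleft()
        output = []
        for r in self.load_partition(oldest):
            for s in self.H.pop(r["key"], []):   # join and evict matches
                output.append((s, r))
        self.H.pop(oldest, None)    # oldest key had no partner in R
        return output
```

Amortizing one partition read over many buffered stream tuples is what lets this family of algorithms keep up with a fast stream despite a slow disk-based R.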

Similar publications

Conference Paper
Full-text available
Extract, Transform, and Load (ETL) pipelines are widely used to ingest data into Enterprise Data Warehouse (EDW) systems. These pipelines can be very complex and often tightly coupled to a given EDW, making it challenging to upgrade from a legacy EDW to a Cloud Data Warehouse (CDW). This paper presents a novel solution for a transparent and fully-a...

Citations

... The ETL process is needed to cleanse data, eliminate null values, replace missing attributes, etc. In the ETL process, before data is loaded into the data warehouse, the transform phase must handle it appropriately by eliminating irrelevant data columns, reducing repeated data in the database, and reconciling data collected in various formats; this is why a normalization process is needed [2]. ...
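As a concrete illustration of the transform steps this excerpt lists, here is a minimal pandas sketch that drops irrelevant columns, removes repeated rows, fills missing attributes, and min-max normalizes numeric columns. The function and column handling are hypothetical, not taken from the cited work.

```python
import pandas as pd

def transform(df: pd.DataFrame, irrelevant: list[str]) -> pd.DataFrame:
    """Illustrative ETL transform phase: drop irrelevant columns,
    remove repeated rows, replace missing attributes, and
    min-max normalize the numeric columns."""
    df = df.drop(columns=irrelevant, errors="ignore")
    df = df.drop_duplicates()
    df = df.fillna(df.mean(numeric_only=True))   # replace missing attributes
    num = df.select_dtypes("number").columns
    span = df[num].max() - df[num].min()
    df[num] = (df[num] - df[num].min()) / span.replace(0, 1)  # guard constants
    return df
```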
... Big data comes with an enormous range of different features. In [2], the author addressed ETL for big data cost-effectively using a data aggregation model; the cloud-ETL model is used to handle big data. ...
... We proposed efficient hybrid optimization of the transformation process in the cloud-based architecture of the data warehouse for data maintenance. The first two processes, extract and transform, have been widely studied [2], but loading complex data without redundancy and handling dimensionality reduction across different data formats remains a challenging research problem. In the transform phase, this paper proposes two things: reducing the original data size through high-dimensionality reduction using the grey-wolf optimizer, a swarm intelligence algorithm. ...
Article
Full-text available
In big data analysis, data is collected from different sources in various formats, then cleansed, customized, and loaded into a data warehouse. Extracting data in diverse formats and transforming it into the required format requires transformation algorithms. This transformation stage suffers from redundancy issues, and data may be stored at arbitrary locations in the data warehouse, which increases computation costs. The main issues in big data ETL are handling high-dimensional data and keeping similar data together for effective data warehouse usage. Extract, Transform, Load (ETL) therefore plays a vital role in extracting meaningful information from the data warehouse and retaining users. This paper proposes a hybrid optimization of swarm intelligence with a tabu search algorithm for handling big data in a cloud-based ETL process. The proposed work overcomes many issues related to complex data storage and retrieval in the data warehouse. Swarm intelligence algorithms can address problems such as high-dimensional data, dynamically changing huge data, and cost optimization in the transformation stage. In this work, a Grey-Wolf Optimizer (GWO) is implemented as the swarm intelligence algorithm to reduce the high dimensionality of the data, and Tabu Search (TS) is used to cluster relevant data into groups, that is, to segregate relevant data accurately from the data warehouse. The cluster size in the ETL process can be optimized by the proposed GWO-TS approach. Therefore, the huge data in the warehouse can be processed within an expected latency.
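To make the GWO step concrete, below is a minimal NumPy sketch of the canonical grey-wolf update rule applied to feature selection, one common way to reduce dimensionality. The fitness callback, the 0.5 selection threshold, and all parameters are illustrative assumptions, not the paper's GWO-TS implementation.

```python
import numpy as np

def gwo_feature_selection(fitness, dim, wolves=10, iters=50, seed=0):
    """Minimal grey-wolf optimizer over [0, 1]^dim; a position is
    thresholded at 0.5 to yield a boolean feature mask. `fitness`
    scores a mask (higher is better), e.g. model accuracy minus a
    small penalty on the number of selected features."""
    rng = np.random.default_rng(seed)
    X = rng.random((wolves, dim))                  # wolf positions
    for t in range(iters):
        scores = np.array([fitness(x > 0.5) for x in X])
        leaders = X[np.argsort(scores)[::-1][:3]]  # alpha, beta, delta
        a = 2 - 2 * t / iters                      # decreases 2 -> 0
        pulls = []
        for leader in leaders:
            A = 2 * a * rng.random((wolves, dim)) - a
            C = 2 * rng.random((wolves, dim))
            D = np.abs(C * leader - X)             # distance to leader
            pulls.append(leader - A * D)
        X = np.clip(sum(pulls) / 3, 0, 1)          # mean of the three pulls
    scores = np.array([fitness(x > 0.5) for x in X])
    return X[np.argmax(scores)] > 0.5              # selected-feature mask
```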
... Apache Hadoop is an open-source Java framework used to run applications in a big data environment [20]. Hadoop is usually deployed on clusters of interconnected computers/nodes, providing distributed storage and parallel computing over large data volumes. ...
... In Aziz et al. (2021), the authors proposed two new optimized algorithms namely Parallel-Hybrid Join (P-HYBRIDJOIN) and Hybrid Join with Queue and Stack (QaS-HYBRIDJOIN). Both of these algorithms were extensions of the existing HYBRIDJOIN algorithm. ...
Article
Full-text available
Semi-stream join is an emerging research problem in the domain of near-real-time data warehousing. A semi-stream join is a join between a fast stream (S) and a slow disk-based relation (R). In the modern era of technology, huge amounts of data are generated swiftly on a daily basis and need to be analyzed instantly to make successful business decisions. With this in mind, a well-known algorithm called CACHEJOIN (Cache Join) was proposed. The limitation of the CACHEJOIN algorithm is that it does not deal efficiently with frequently changing trends in stream data. To overcome this limitation, in this paper we propose TinyLFU-CACHEJOIN, a modified version of the original CACHEJOIN algorithm designed to enhance its performance. TinyLFU-CACHEJOIN employs an intelligent strategy that keeps in the cache only those records of R that have a high hit rate in S. This mechanism allows TinyLFU-CACHEJOIN to deal with sudden and abrupt trend changes in S. We developed a cost model for our TinyLFU-CACHEJOIN algorithm and validated it empirically. We also compared the performance of our proposed TinyLFU-CACHEJOIN algorithm with the existing CACHEJOIN algorithm on a skewed synthetic dataset. The experiments showed that TinyLFU-CACHEJOIN significantly outperforms CACHEJOIN.
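The admission idea behind this family of caches can be sketched as a fixed-size store that admits a record of R only when its key's hit count in S exceeds that of the coldest resident. The sketch below uses exact counts in a plain dictionary for clarity, whereas TinyLFU proper uses compact approximate counters; `lookup_r` and all names are hypothetical.

```python
from collections import Counter

class FrequencyFilteredCache:
    """Fixed-size cache for records of R that admits a key only when
    its stream hit count beats the least-frequent resident (the
    TinyLFU admission idea, with exact counts for simplicity)."""

    def __init__(self, capacity, lookup_r):
        self.capacity = capacity
        self.lookup_r = lookup_r     # fetches a record of R from disk
        self.freq = Counter()        # hit counts of keys seen in S
        self.cache = {}              # key -> cached record of R

    def probe(self, key):
        self.freq[key] += 1
        if key in self.cache:                 # cache hit: no disk access
            return self.cache[key]
        record = self.lookup_r(key)           # cache miss: go to disk
        if len(self.cache) < self.capacity:
            self.cache[key] = record
        else:
            victim = min(self.cache, key=self.freq.__getitem__)
            if self.freq[key] > self.freq[victim]:   # admit only if hotter
                del self.cache[victim]
                self.cache[key] = record
        return record
```

Because admission is gated on observed frequency, a sudden burst of a new hot key quickly out-counts a stale resident, which is how the scheme tracks abrupt trend changes in S.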
... Optimizing supply chain management by applying active RFID technology (Nguyen et al., 2021; Yang et al., 2021). Improved product life cycle (Guo et al., 2019). RFID-based production and distribution management systems for the home appliance industry (Gonzalez et al., 2006; Aziz et al., 2021). An intelligent system for production resources planning (Liu, 2021). Optimizing supply chain waste management through the use of RFID technology (Reyes et al., 2020; Reyes et al., 2021). ...
Article
Full-text available
Supply chain processes are continuously marred by myriad factors, including varying demands, changing routes, major disruptions, and compliance issues. Therefore, supply chains require monitoring and ongoing optimization. Data science uses real-time data to provide analytical insights, leading to automation and improved decision making. RFID is an ideal technology for sourcing big data, particularly in supply chains, because RFID tags are read across supply chain processes, including scanning raw materials, finished products, goods in transit, and stored products, with accuracy and speed. This study carries out a systematic literature review of research articles published during 2000-2021 that discuss the role of RFID technology in developing decision support systems that optimize supply chains in light of Industry 4.0. Furthermore, the study offers recommendations on the operational efficiency of supply chains while reducing the costs of implementing RFID technology. The core contribution of this paper is its analysis and evaluation of various RFID implementation methods in supply chains, with the aim of saving time effectively and achieving cost efficiencies.
... However, data dispersion and heterogeneity make interrogation necessary but difficult, especially if the data is non-indexed. User studies have shown that high user latency drives customers away [8]. Similarly, queries, or calls to stored procedures/user-defined functions, are typically executed numerous times in a relational model, either within a loop in an application program or from the WHERE/SELECT clause of an outer query [9]. ...
Article
Full-text available
The increasing demand for simultaneous transaction and review of data, for either decision making or forecasting, has created a need for faster and better Hybrid Transactional/Analytical Processing (HTAP). This paper emphasizes the speedup of Online Analytical Processing (OLAP) operations in an HTAP environment where analytical queries are mainly repetitive and contain non-indexed keys as their predicates. Zone maps and materialized views are popular approaches adopted by larger databases to address this issue. However, they are absent in in-memory databases because of space constraints. Instead, in-memory databases load the cache with result pages of frequently accessed queries. Increasing the number of such queries can fill the cache and raise the system's overhead. This paper presents Query_Dictionary, a hybrid storage solution that leverages the full capabilities of SQLite by retaining less information about repetitive queries in the cache while efficiently accommodating newly updated data from the end-user. The solution stores page-level metadata for queries with larger result sets and row-level information for smaller result sets. It demonstrates Query_Dictionary's capabilities on three types of representative queries: single-table, binary-join, and transactional queries on non-indexed attributes. In comparison with SQLite, the proposed method performs better.
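The page-level/row-level split the abstract describes can be sketched as a result cache that keeps full rows for small result sets and only page IDs for large ones. The threshold, the `run_query`/`read_pages` callbacks, and the simplified invalidation rule below are illustrative assumptions, not SQLite's or the paper's actual storage format.

```python
PAGE_THRESHOLD = 64   # illustrative cutoff between row- and page-level caching

class QueryDictionarySketch:
    """Caches repetitive query results two ways: small result sets
    as full rows, large ones as the set of page IDs to re-read."""

    def __init__(self, run_query, read_pages):
        self.run_query = run_query    # executes SQL, returns (rows, page_ids)
        self.read_pages = read_pages  # re-reads rows from the given pages
        self.dictionary = {}          # sql -> ("rows", rows) | ("pages", ids)

    def execute(self, sql):
        entry = self.dictionary.get(sql)
        if entry is not None:
            kind, payload = entry
            return payload if kind == "rows" else self.read_pages(payload)
        rows, page_ids = self.run_query(sql)
        if len(rows) <= PAGE_THRESHOLD:
            self.dictionary[sql] = ("rows", rows)       # row-level caching
        else:
            self.dictionary[sql] = ("pages", page_ids)  # page-level metadata
        return rows

    def invalidate(self, touched_page_ids):
        """Drop page-level entries whose pages were updated; a fuller
        version would also track source pages for row-level entries."""
        touched = set(touched_page_ids)
        self.dictionary = {
            sql: entry for sql, entry in self.dictionary.items()
            if entry[0] == "rows" or not touched & set(entry[1])
        }
```

Storing only page IDs for large result sets is what keeps the cache small: the heavy rows stay on disk and are re-read on demand, while small, hot results are served straight from memory.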
... A BC-enabled AI intelligent IoT architecture offers an effective way to merge BC and AI for IoT with existing state-of-the-art techniques, algorithms, and applications [46]. To reduce the optimization time of those algorithms, several hybrid techniques can be used [47], [48]. Hospital records are also a significant contribution, provided confidentiality is not breached. ...
Article
Full-text available
Fifth Generation (5G) technology is the most recent advancement in wireless communication networks, and 5G is increasingly used with diverse data structures. Blockchain (BC) has become a popular choice for decentralized, peer-to-peer, distributed, transparent ledger systems over diverse data structures, and the use of 5G with BC is an emerging trend in communication technology. The elasticity of 5G with BC enables many applications to exchange information, making data transport fast, transparent, consequential, and safe in this smart era. Green computing (GC) is presently the most promising approach for integrating smart technology into a diverse and distributed world of power consumption. This Systematic Mapping Study (SMS) analyzes carefully selected publications from 2016 to 2020 in well-reputed venues. The study reviews advanced research on power consumption solutions for BC-based 5G communication and presents a taxonomy of 5G based on green BC and GC across various areas. Furthermore, green energy renewable communication (GERC) problems are examined by integrating three distinct technologies, namely 5G with green BC and GC, along with smart systems. Lastly, research gaps are presented to provide future directions for researchers in 5G with green BC and GC as a solution for rechargeable data packets.