A high level view of Adaptive Scheduling Algorithm for Dynamic Heterogeneous Hadoop Systems

Source publication
Article
Hadoop is an open-source cloud computing system used in large-scale data processing. It has become the basic computing platform for many internet companies. With the Hadoop platform, users can develop cloud computing applications and then submit tasks to the platform. Hadoop has strong fault tolerance and can easily increase the number of cluster...

Context in source publication

Context 1
... the mean job execution times are estimated when a new job is submitted to the system, which makes the scheduler adaptable to changes in job execution times. A high level view of Dynamic Heterogeneous Hadoop Systems is shown in Figure 5. ...
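The context above says the scheduler re-estimates mean job execution times whenever a new job is submitted. A minimal sketch of such a running-mean estimator, with illustrative names not taken from the paper:

```python
from collections import defaultdict

class AdaptiveEstimator:
    """Running-mean estimate of job execution time per job type.

    Hypothetical sketch: the paper re-estimates mean execution times
    on each job submission; the class and method names here are
    illustrative, not the authors' implementation.
    """
    def __init__(self):
        self._totals = defaultdict(float)   # sum of observed runtimes
        self._counts = defaultdict(int)     # number of completed jobs

    def record(self, job_type, runtime_s):
        # Called when a job finishes: fold its runtime into the mean.
        self._totals[job_type] += runtime_s
        self._counts[job_type] += 1

    def estimate(self, job_type, default_s=60.0):
        # Called on submission: return the current mean, or a default
        # when no history exists yet for this job type.
        n = self._counts[job_type]
        return self._totals[job_type] / n if n else default_s

est = AdaptiveEstimator()
est.record("sort", 120.0)
est.record("sort", 80.0)
print(est.estimate("sort"))   # 100.0
print(est.estimate("grep"))   # 60.0 (no history yet)
```

Because the estimate is refreshed from observed completions, the scheduler adapts as job execution times drift, which is the adaptivity the figure illustrates.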

Similar publications

Article
MapReduce is a framework for addressing large applications that handle tremendous volumes of data in parallel. These large tasks are carried out by a master-and-slave node architecture, where the master node tracks all the available resources and manages the distributed applications, and the slave node is responsible for maintaining the resources usa...
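The abstract above describes the MapReduce model of parallel processing. A minimal, single-process sketch of the map / shuffle / reduce phases (a toy illustration, not Hadoop's distributed implementation):

```python
from collections import defaultdict
from itertools import chain

def mapper(line):
    # Map phase: each slave-node task emits (word, 1) pairs.
    return [(w, 1) for w in line.split()]

def shuffle(pairs):
    # Shuffle phase: group emitted values by key across map outputs.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reducer(key, values):
    # Reduce phase: aggregate all values for one key.
    return key, sum(values)

lines = ["big data big", "data systems"]
mapped = chain.from_iterable(mapper(line) for line in lines)
result = dict(reducer(k, vs) for k, vs in shuffle(mapped).items())
print(result)  # {'big': 2, 'data': 2, 'systems': 1}
```

In a real cluster the master assigns map and reduce tasks to slave nodes and the shuffle moves data over the network; here all three phases run in one process for clarity.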

Citations

... Paper [10] presented that in 2016 Facebook handled 4 petabytes of data on a daily basis, and that each day in 2017, 2.5 quintillion bytes of information were generated, according to IBM. Amazon uses ad hoc groups to manage enormous volumes of data on a regular basis, according to papers [11,12,13]. A substantial dataset is broken down into little segments and spread among multiple nodes, each with its own distinct computing and caching capabilities. ...
... When compared to the conventional methodology, problem solving using metaheuristic approaches performed better as the dimensions of the searched space expanded. The MapReduce framework was the main emphasis of the authors' studies in [13,14,42,63-65], as well as its limitations, problems with job scheduling between nodes, and other algorithms presented by different academics. In some of these studies, the algorithms were then categorized based on a variety of performance-related quality indicators. ...
Article
Rapid advancements in Big data systems have occurred over the last several decades. The key element for attaining high performance in Big data systems is job scheduling, which demands the utmost attention in order to resolve several scheduling challenges. To obtain higher performance when processing big data, proper scheduling is required. Apache Hadoop is most commonly used to manage immense data volumes efficiently, and it is also proficient in handling the issues associated with job scheduling. To improve the performance of big data systems, we analyzed various Hadoop job scheduling algorithms in depth. To give an overall idea of scheduling algorithms, this paper presents a rigorous background, gives an overview of the fundamental architecture of the Hadoop Big data framework and of job scheduling and its issues, and then reviews and compares the most important and fundamental Hadoop job scheduling algorithms. In addition, this paper includes a review of other improved algorithms. The primary objective is to present an overview of various scheduling algorithms for improving performance when analyzing big data. This study will also guide researchers toward an appropriate job scheduling algorithm according to which characteristics are most significant.
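Among the fundamental Hadoop schedulers such a survey compares, the simplest is the default FIFO scheduler, which dispatches jobs strictly in submission order. A minimal sketch of that behavior (illustrative only, not Hadoop's actual code):

```python
from collections import deque

class FifoScheduler:
    """Toy FIFO job scheduler: jobs run strictly in submission order.

    Illustrative sketch of the default Hadoop FIFO policy discussed
    in the survey; names and structure are the author's own.
    """
    def __init__(self):
        self._queue = deque()

    def submit(self, job):
        # New jobs always join the back of the queue.
        self._queue.append(job)

    def next_job(self):
        # The oldest submitted job runs first; None when idle.
        return self._queue.popleft() if self._queue else None

sched = FifoScheduler()
for job in ("job-A", "job-B", "job-C"):
    sched.submit(job)
print(sched.next_job())  # job-A
```

FIFO's weakness, which motivates the FAIR and capacity schedulers the survey reviews, is that a long-running job at the head of the queue delays every job behind it.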
... Table 6 shows a comparison of various Hadoop schedulers and the proposed scheduler. This table was prepared using the results obtained in [12,25,26] and [27] together with the results obtained in this article. According to the table, the proposed scheduler allocates resources dynamically and considers job priority. ...
Article
Job scheduling in Hadoop has thus far been investigated in several studies. However, some challenges facing Hadoop clusters, including minimum share (min-share), cluster heterogeneity, execution time estimation, and scheduling program size, have received less attention. One of the most important algorithms with regard to min-share is the FAIR scheduler, presented by Facebook Inc. based on its own needs, in which an equal min-share is considered for all users. In this article, an attempt has been made to make the proposed method superior to existing methods through automation and configuration, performance optimization, fairness, and data locality. A high-level architectural model is designed, and a scheduler is then defined on this architectural model. The provided scheduler contains four components: three components schedule jobs, and one component distributes the data for each job among the nodes. The scheduler can be executed on heterogeneous Hadoop clusters and run jobs in parallel, and disparate min-shares can be assigned to each job or user. Moreover, an approach is presented for each problem associated with min-share, cluster heterogeneity, execution time estimation, and scheduler program size; these approaches can also be utilized on their own to improve the performance of other scheduling algorithms. The scheduler presented in this paper showed acceptable performance compared with the First-In, First-Out (FIFO) and FAIR schedulers.