Fig. 1. Data transfer process of block device driver.

Source publication
Article
In the era of cloud computing and big data, virtualization is gaining great popularity in storage systems. Since multiple guest virtual machines (DomUs) run on a single physical device, disk I/O fairness among DomUs and aggregated throughput remain key challenges in virtualized environments. Although several methods have been developed for...

Contexts in source publication

Context 1
... VMs meet VM users' demand for block devices by providing a virtual block device interface. The data transfer process in block device drivers is shown in Fig. 1. A VM's disk access goes through the following components. 1) File system. A DomU application's access to the virtual disk is encapsulated into an I/O request to the virtual disk as it passes through the file system layer. 2) I/O scheduling in DomU. According to the real-time priority or the principle ...
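For illustration only, the two layers named so far can be modeled with a short Python sketch; all class and function names here are ours, not the paper's, and the elevator-style sort merely stands in for whatever scheduling principle the elided text describes.

# Toy model of the layered path in Fig. 1; all names are ours, not the paper's.
from dataclasses import dataclass, field

@dataclass
class IORequest:
    sector: int        # start sector on the virtual disk
    nr_sectors: int    # request length in sectors
    op: str            # "read" or "write"

@dataclass
class DomUScheduler:
    queue: list = field(default_factory=list)

    def submit(self, req: IORequest) -> None:
        # 2) The DomU I/O scheduler queues and reorders pending requests,
        #    here with a simple elevator-style sort by sector.
        self.queue.append(req)
        self.queue.sort(key=lambda r: r.sector)

def file_system_encapsulate(offset: int, length: int, op: str) -> IORequest:
    # 1) The file system turns an application's file operation into a
    #    block-level I/O request against the virtual disk (512 B sectors).
    return IORequest(sector=offset // 512, nr_sectors=(length + 511) // 512, op=op)

sched = DomUScheduler()
sched.submit(file_system_encapsulate(offset=8192, length=4096, op="read"))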
Context 2
... our test is also performed with three concurrent DomUs, but each DomU runs a different workload. The I/O sizes of the three DomUs are 512 B, 4 K, and 128 K, respectively. Fig. 10 shows the bandwidth allocated to each DomU. The bandwidth utilization of each DomU is improved, most remarkably with a request block size of 512 B (the most frequent I/O operation size). The results of workloads with different I/O types running within three DomUs are shown in Fig. 9. VMCD (28.35 Mbps) shows higher bandwidth ...
Context 3
... is the average response time of all requests of multiple DomUs. An experiment is designed in this section to evaluate latency. Three concurrent DomUs run different workloads with diverse latency requirements. Fig. 11 shows the ART of each ...
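For clarity, the ART metric as defined in this excerpt reduces to a simple mean over all requests from all DomUs; in our own notation (not the paper's):

\mathrm{ART} = \frac{1}{N} \sum_{i=1}^{N} \left( t_i^{\mathrm{finish}} - t_i^{\mathrm{arrive}} \right)

where N is the total number of requests issued by all DomUs, and t_i^{\mathrm{finish}}, t_i^{\mathrm{arrive}} are the completion and arrival times of request i.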
Context 4
... running remotely on another computer. These I/O-intensive applications are simulated by random requests, and the total size of each DomU's read/write requests is 2 GB. We vary the queue depth of each DomU from 1 to 256 and collect IOPS for each experiment. The results for the three types of I/O-intensive applications are shown in Fig. 12. The IOPS of multiple DomUs under VMCD is higher than under the other three schedulers in all cases. VMCD shows improvements of 11.5, 11.2, and 12.2 percent compared with CFQ, Deadline, and Anticipatory, respectively, for the file server; 12.1, 4.9, and 12.5 percent for the web server; and 11, 5.7, and 5.6 percent for OLTP. The ...
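As a sanity check on how such percentages are computed, relative IOPS improvement is simply (candidate - baseline) / baseline x 100; a minimal Python sketch with hypothetical IOPS values (the excerpt reports only the resulting percentages):

def improvement_pct(candidate_iops: float, baseline_iops: float) -> float:
    """Relative IOPS improvement of one scheduler over a baseline, in percent."""
    return (candidate_iops - baseline_iops) / baseline_iops * 100.0

# Hypothetical values for illustration only, chosen to reproduce the 11.5
# percent reported for VMCD over CFQ on the file-server workload.
print(improvement_pct(1115.0, 1000.0))  # -> 11.5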
Context 5
... the fairness index can be calculated from the bandwidth allocated to each DomU. Fig. 13 presents the fairness results for three types of workloads. VMCD improves fairness significantly compared to CFQ, Deadline, and Anticipatory: it achieves 56, 53, and 56.8 percent fairness improvement with three DomUs acting as file servers, 53.5, 51, and 54 percent for the web server, and 54, 52, and 55 percent for ...
Context 6
... three DomUs are deployed as specific real servers, running the file server, web server, and OLTP workloads respectively. We record the allocated bandwidth under four different schedulers. Fig. 14 ...
Context 7
... it does not rely on any special disk hardware support. To show why VMCD does not hinder live migration and portability, we implement VMCD on an SSD-based storage system. The SSD used is an Intel 320 Series 40 GB, and three DomUs with different bandwidth requirements (64 K random write; IOPS of 10, 100, and 300) are deployed concurrently. Fig. 15a shows the bandwidth obtained by the three DomUs on different storage devices (HDD and SSD) with VMCD. The three DomUs respectively acquired 18.58, 41.74, and 106.9 Mbps with SSD, and 3.16, 8.41, and 16.75 Mbps with HDD. The fairness index according to formula (2) is 4.2930 for SSD and 4.5932 for HDD. It can be seen that VMCD also exhibits ...
Context 8
... (2) is 4.2930 for SSD and 4.5932 for HDD. It can be seen that VMCD also exhibits good fairness on SSD. At the same time, the overall performance is much better than on HDD (intuitively, because the SSD itself is faster). On the SSD storage system we also compare VMCD with the original CFQ scheduling in Xen. Fig. 15b shows that VMCD also outperforms the original Xen on the SSD storage system. In addition to fairer disk bandwidth allocation (the fairness index is 4.293 for VMCD and 11.592 for CFQ), VMCD clearly improves bandwidth utilization, achieving a 20.5 percent gain on SSD (the aggregated bandwidth of the three DomUs, i.e., the sum of the per-DomU bandwidths above, is 167.22 Mbps ...
Context 9
... to each pending queue through CPU computation, the replenishment mechanism requires the CPU to continually check whether its event has occurred, and some additional data structures also need CPU processing. To evaluate the overhead of VMCD, we monitor the CPU utilization of Dom0 in a sequential read experiment, varying the number of DomUs from 1 to 5. As shown in Fig. 16, we compare the CPU utilization under VMCD and under CFQ. Because CFQ is simpler than VMCD in disk I/O scheduling, VMCD raises CPU utilization by 3 percent compared to CFQ with the same number of DomUs, but brings better fairness and stability of disk I/O bandwidth allocation and higher aggregated disk I/O bandwidth ...
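The excerpt does not say how CPU utilization was collected; a minimal sketch of one standard way to sample it in Dom0 on Linux (reading /proc/stat over an interval):

import time

def cpu_times():
    # First line of /proc/stat: "cpu user nice system idle iowait irq softirq ..."
    with open("/proc/stat") as f:
        fields = [int(v) for v in f.readline().split()[1:]]
    idle = fields[3] + fields[4]   # idle + iowait count as not busy
    return idle, sum(fields)

def cpu_utilization(interval: float = 1.0) -> float:
    idle0, total0 = cpu_times()
    time.sleep(interval)
    idle1, total1 = cpu_times()
    busy = (total1 - total0) - (idle1 - idle0)
    return 100.0 * busy / (total1 - total0)

print(f"Dom0 CPU utilization: {cpu_utilization():.1f}%")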

Citations

... Based on the derived disk I/O information, the existing disk I/O scheduler, which considers only the number of disk I/O requests, is extended into a disk I/O scheduling technique that also considers the data size of the disk I/O requests. Tan [58] proposed VMCD, a virtual multichannel-based disk I/O scheduler. VMCD isolates disk I/O streams by allocating a virtual channel to the I/O device in Dom0 for each guest VM. ...
Article
The development of IT technology in the 21st century has created a new paradigm for real-time, data-intensive user services, such as connected cars, smart factories, and remote health care services. The considerable computational resources required by these services are rendering the cloud increasingly important. In a cloud server, user services are forced to share physical resources because of the emerging resource competition, thus introducing various types of unpredictable workloads. The core technology of the cloud is the virtualized system, which isolates and shares the powerful physical resources of the server in the form of virtual machines (VMs) to increase resource efficiency. However, the scheduling policy of a virtual CPU (vCPU), the logical CPU of a VM, generally schedules the vCPU based on its occupation of the physical CPU (pCPU) without regard to I/O strength, which leads to unfair I/O performance among VMs in virtualized systems. User services running on VMs are unaware of the contention architecture under which I/O devices are shared; furthermore, current virtualized systems simply adopt the Linux-based I/O processing path, which was optimized for contention-free architectures. The architectural causes of unfair I/O device usage among user services are therefore hardly considered in current virtualized systems. To overcome this problem, this study presents I-Balancer, which provides fair I/O performance among I/O-intensive user services by applying an asynchronous inter-communication control technique in virtualized systems with high VM density. The main design goal of I-Balancer is to increase the hypervisor's awareness of the contention architecture. I-Balancer derives a fine-grained workload and I/O strength for each vCPU from the scheduler and event-channel areas. Subsequently, to strengthen I/O fairness, an I/O traffic control mechanism is implemented that regulates inter-domain communication traffic according to the I/O strength of the VMs. Experiments on I/O (disk and network) fairness were performed on a virtualized system based on the Xen 4.12 hypervisor using various performance metrics. The experimental results showed that a virtualized system with I-Balancer reduces the standard deviation of network and disk I/O performance among VMs by up to 71% and 61%, respectively, compared to the existing virtualized system, while performance interference and overhead are confirmed to be negligible.
... Fairness Metric: We use Jain's fairness measure to quantify the fairness [23]. It is a widely used metric to quantitatively measure fairness in shared computer systems [24][25][26]: ...
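The excerpt truncates before the formula itself; Jain's fairness index for per-workload allocations x_1, ..., x_n is commonly written as

J(x_1, \dots, x_n) = \frac{\left( \sum_{i=1}^{n} x_i \right)^2}{n \sum_{i=1}^{n} x_i^2}

which ranges from 1/n (one workload receives everything) to 1 (perfectly equal shares). Note that this is a higher-is-better measure, whereas formula (2) of the VMCD paper, judging from the values quoted in the contexts above (4.293 for VMCD versus 11.592 for CFQ, with VMCD the fairer), is a lower-is-better index.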
Conference Paper
While current fairness-driven I/O schedulers succeed in allocating equal time/resource shares to concurrent workloads, they ignore I/O request queueing and reordering in the storage device layer, such as Native Command Queueing (NCQ). As a result, requests of different workloads do not have an equal chance to enter the NCQ (NCQ conflict) and fairness is violated. We address this issue by providing the first systematic empirical analysis of how NCQ affects I/O fairness and SSD utilization, and accordingly propose an NCQ-aware I/O scheduling scheme, NASS. The basic idea of NASS is to carefully control the request dispatch of workloads to relieve NCQ conflict and improve NCQ utilization. NASS builds on two core components: an evaluation model to quantify important features of a workload, and a dispatch control algorithm to set the appropriate request dispatch of running workloads. We integrate NASS into four state-of-the-art I/O schedulers and evaluate its effectiveness using widely used benchmarks and real-world applications. Results show that with NASS, I/O schedulers achieve 11-23% better fairness and at the same time improve device utilization by 9-29%.
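The dispatch-control idea can be caricatured in a few lines; this is only a hedged sketch of the general principle (limiting each workload's in-flight requests so none monopolizes the device queue), not NASS's actual evaluation model or algorithm:

SATA_NCQ_DEPTH = 32  # SATA NCQ supports up to 32 outstanding commands

def per_workload_budget(n_workloads: int) -> int:
    # Cap each workload's in-flight requests so no single workload
    # monopolizes the device's NCQ slots.
    return max(SATA_NCQ_DEPTH // n_workloads, 1)

print(per_workload_budget(4))  # -> 8 slots per workload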
... it is useful to have a taxonomic scheme that can identify dark dimensions of the problem. In this regard, important performance management goals, operating scope, and the related research literature can be summarized as follows:
Managing performance interference effects: Application-level [6][7][8][9][10][11]; Middleware-level [11][12][13][14][15][16][17]; Infrastructure-level [11][18][19][20][21][22][23][24][25][26][27][28][29][30].
Managing resource challenges: Application-level [2][6][7][8][9][10][31][32][33]; Middleware-level [2][33][34][35][36]; Infrastructure-level [1][37][38][39][40][41][42][43][44][45][46][47][48][49][50][51][52].
SLA compliance: Application-level [2][53][54]; Middleware-level [2][53][55]; Infrastructure-level [2][53].
Multi-dimensional goals and trade-offs: Application-level [56][57][58][59][60]; Middleware-level [59][61][62]; Infrastructure-level [3][5][63][64][65][66][67][68][69][70][71][72][73][74][75]. ...
... In some research work, such as [1,17,83], both homogeneous and heterogeneous workload types have been examined. Furthermore, some work, such as [20,23,29,40,53,55], has examined real workload scenarios for performance evaluations and workload tests, while others, such as [1,2,15,19,21,60,64,69], have examined synthetic workloads; in some cases [34,37], both real and synthetic workloads have been examined to emphasize the capability of the solutions. In another category, the structure of major workflows from diverse scientific applications includes Montage (astronomy), CyberShake (earthquake science), Epigenomics (biology), LIGO (gravitational physics), and SIPHT (biology). ...
... During performance management, cloud workloads in the target service environment are evaluated based on the preferred solution approach toward the performance objectives. After determining the characteristics of cloud workloads and the performance requirements, a solution policy must be applied toward performance improvement and management. The solution policies and related literature are: Concurrency control [7,9,11]; Resource tuning and scheduling [1,3,11,17,21,23,25,26,28,29,38,51,68,69,70,74]; Provisioning (scalability, elasticity) [14,15,18,19,35,36,48,55,60,88]; Combinational [5,22,37,64]. In the next section, solution approaches and related literature will be reviewed. ...
Article
Cloud computing is an evolving paradigm with tremendous momentum. Performance is a major challenge in providing cloud services, and performance management is prerequisite to meet quality objectives in clouds. Although many researches have studied this issue, there is a lack of analysis on management dimensions, challenges and opportunities. As an attempt toward compensating the shortage, this work first gives a review on performance management dimensions in clouds. Moreover, a taxonomic scheme has devised to classify the recent literature, help to standardize the problem and highlight commonalities and deviations. Afterward, an autonomic and integrated performance management framework has been proposed. The proposed framework enables cloud providers to realize optimization schemes without major changes. Practicality and effectiveness of the proposed framework has been demonstrated by prototype implementation on top of the CloudSim. Experiments present promising results, in terms of the performance improvement and management. Finally, open issues, opportunities and suggestions have been presented.
... In a server system in a virtualized environment, CPU and memory resources are easy to allocate compared to other resources. However, resource allocation for storage devices is one of the relatively difficult research areas [3]. This is particularly evident in terms of service level agreements (SLAs). ...
Article
There have been scheduler studies targeting QoS (Quality of Service) or SLA (Service Level Agreement) guarantees for hard disks. The use of SSDs as storage has increased dramatically in recent systems due to their fast performance and low power usage. However, existing studies on guaranteeing SLAs are based on hard disks and do not consider the SSD, a flash storage device. In an SSD, the GC (Garbage Collection) process copies valid data to an empty block and then erases the original block. This causes SSD performance to degrade in a virtualized environment with many I/Os. We modified the Linux scheduler to take SSD characteristics into account and improve I/O performance. In this paper, we propose the MTS-CFQ I/O scheduler, implemented by modifying the existing Linux CFQ I/O scheduler. Our proposed method controls the time slice based on the I/O bandwidth of the current storage device. Experimental results from real workload-driven simulations show that MTS-CFQ can improve performance by up to 20%, with an average of 5%, compared with traditional CFQ for the four workloads considered.
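The abstract does not give the control law; as a hedged sketch of the stated idea (scaling a CFQ-like time slice with the bandwidth currently measured on the device, so slices shrink when the SSD is fast and stretch when garbage collection slows it down), with all constants hypothetical:

BASE_SLICE_MS = 100.0       # hypothetical baseline time slice
REFERENCE_BW_MBPS = 100.0   # hypothetical reference bandwidth

def scaled_time_slice(measured_bw_mbps: float) -> float:
    scale = REFERENCE_BW_MBPS / max(measured_bw_mbps, 1.0)
    return min(max(BASE_SLICE_MS * scale, 10.0), 400.0)  # clamp to a sane range

print(scaled_time_slice(200.0))  # fast device -> 50.0 ms slice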
... In [34] the authors address the difficulty that the hypervisor's inability to predict workload types imposes on the resource allocation process; they propose a mechanism that distinguishes between I/O-bound and CPU-bound workloads, which improves performance and achieves complete CPU fairness among virtual machines. Similarly, the authors of [35] present the design, implementation, and evaluation of a system that can fairly allocate disk I/O bandwidth in a virtualized environment. This system mitigates interference between multiple DomUs by introducing a separate V-Channel and I/O queue for each DomU. ...
... Some works in the literature, including [33] and [35], consider the latter assumption as part of their theoretical analyses. More specifically, the distribution of service times in our theoretical analysis is taken to be exponential; this derives from the fact that the operations, on average, require a service time related to the average operation length, but a level of uncertainty due to physical system complexity adds a stochastic component to their service times. ...
Article
Hypervisors' smooth operation and efficient performance have an immediate effect on the supported cloud services. We investigate scheduling algorithms that match I/O requests originating from virtual resources to the physical CPUs that do the actual processing. We envisage a new paradigm of virtualized resource consolidation, where the I/O resources required by several Virtual Machines (VMs) in different physical hosts are provided by one (or more) external, powerful, dedicated appliance(s), namely the I/O Hypervisor (IOH). To this end, I/O operations are transferred from the VMs to the IOH, where they are executed. We propose and evaluate a number of scheduling algorithms for this hypervisor model, concentrating on providing guaranteed fairness among the virtual resources. A simulator that describes this model has been built and is used for the implementation and evaluation of the algorithms. We also analyze the performance of the different hypervisor models and highlight the importance of fair scheduling.
Chapter
Applications are increasingly containerized using techniques such as LXC and Docker. Scientific workflow applications are no exception. In this paper, we address the problem of resource contention between concurrently running containerized scientific workflows. To this end, we design and implement Hierarchical Recursive Resource Sharing (HRRS), which structures multiple concurrent containers in a hierarchy that automatically and dynamically regulates their resource consumption based on their level/tier in the hierarchy. The hierarchy is recursively updated: when the top-tier container completes its execution, the second-tier container becomes the top-tier container and inherits the resource consumption priority. We have evaluated the performance of HRRS using multiple large-scale scientific workflows containerized with Docker. The experimental results show a significant reduction in resource contention, evidenced by performance improvements of 49%, 160%, and 18% compared with sequential execution, concurrent execution with fair resource shares, and execution with submission intervals, respectively.
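A minimal Python sketch (names and the halving policy are ours, purely illustrative) of the recursive promotion the chapter describes, where the second-tier container inherits the top-tier priority once the top-tier workflow completes:

from collections import deque

class HRRSHierarchy:
    """Containers ordered by tier; tier 0 holds the highest resource priority."""
    def __init__(self, containers):
        self.tiers = deque(containers)   # position in the deque == tier

    def shares(self, total=1024):
        # Illustrative policy: each tier receives half the shares of the tier above.
        return {c: total >> i for i, c in enumerate(self.tiers)}

    def on_top_tier_complete(self):
        # The finished workflow leaves; every container below moves up one tier.
        return self.tiers.popleft()

h = HRRSHierarchy(["wf-a", "wf-b", "wf-c"])
print(h.shares())           # {'wf-a': 1024, 'wf-b': 512, 'wf-c': 256}
h.on_top_tier_complete()    # wf-a finishes; wf-b becomes the top tier
print(h.shares())           # {'wf-b': 1024, 'wf-c': 512}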
Article
In modern virtual computing environments, existing GPU virtualization techniques are unable to take full advantage of a GPU's powerful 2D/3D hardware-accelerated graphics rendering performance or parallel computing potential, and they do not consider fairly allocating the internal resources of a GPU domain between VMs with different performance requirements. Therefore, we propose a multi-channel GPU virtualization architecture (VMCG), model the corresponding credit allocating and transferring mechanisms, and redesign the virtual multi-channel GPU fair-scheduling algorithm. VMCG provides a separate V-Channel for each guest VM (DomU) that competes with other VMs for the same physical GPU resources, and each DomU submits command request blocks to its respective V-Channel according to the corresponding DomU ID. Through the virtual multi-channel GPU fair-scheduling algorithm, not only do multiple DomUs make full use of native GPU hardware acceleration, but the fairness of GPU resource allocation is significantly improved under GPU-intensive workloads from multiple DomUs running on the same host. Experimental results show that, for 2D/3D graphics applications, performance is close to 96% of that of the native GPU; performance is improved by approximately 500% for parallel computing applications; and GPU resource-allocation fairness is improved by approximately 60%-80%.
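The credit allocating and transferring model is not detailed in the abstract; a minimal hedged sketch of per-channel credit scheduling in this spirit (structure and policy ours, not VMCG's):

from collections import deque

class VChannel:
    """One channel per DomU; command request blocks are queued by DomU ID."""
    def __init__(self, domu_id: int, credits: int):
        self.domu_id = domu_id
        self.credits = credits     # replenished each accounting period
        self.queue = deque()       # pending command request blocks

def schedule_round(channels):
    # Round-robin over channels, dispatching one request per visit while
    # credits remain, so no DomU can starve the others within a round.
    dispatched, progress = [], True
    while progress:
        progress = False
        for ch in channels:
            if ch.queue and ch.credits > 0:
                dispatched.append((ch.domu_id, ch.queue.popleft()))
                ch.credits -= 1
                progress = True
    return dispatched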
Article
Applications with different characteristics in the cloud may have different resource preferences. However, traditional resource allocation and scheduling strategies rarely take the characteristics of applications into account. I/O-intensive applications are a typical application type, and their frequent I/O accesses, especially random disk accesses to small files, may lead to inefficient use of resources and reduce the quality of service (QoS) of applications. A weight allocation strategy is therefore proposed based on the available resources that a physical server can provide as well as the characteristics of the applications. Using the obtained weights, a resource allocation and scheduling strategy is presented based on the specific application characteristics in the data center. Extensive experiments show that the strategy is correct and can guarantee high I/O-per-second (IOPS) concurrency in a cloud data center with high QoS. Additionally, the strategy can efficiently improve the utilization of the disk and other resources of the data center without affecting the service quality of applications.
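The excerpt does not give the weight formula; a proportional-share split is the simplest illustrative reading (function and example values are hypothetical):

def allocate(weights: dict, capacity: float) -> dict:
    """Proportional share: resource_i = capacity * w_i / sum(w)."""
    total = sum(weights.values())
    return {app: capacity * w / total for app, w in weights.items()}

# Hypothetical example: an I/O-intensive app weighted higher for disk bandwidth.
print(allocate({"io-heavy": 3, "cpu-heavy": 1}, capacity=200.0))
# -> {'io-heavy': 150.0, 'cpu-heavy': 50.0}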