Conference Paper

Elastic Queue: A Universal SSD Lifetime Extension Plug-in for Cache Replacement Algorithms

Abstract

Flash-based solid-state drives (SSDs) are increasingly deployed as a second-level cache in storage systems because they noticeably accelerate performance while remaining transparent to the original software. However, the frequent data updates of existing cache replacement algorithms (e.g., LRU, LIRS, and LARC) cause too many writes on SSDs, leading to short device lifetimes and high costs. SSD-oriented cache schemes that issue fewer SSD writes rely on fixed strategies for selecting cache blocks, so one cannot freely choose a cache algorithm suited to an application's features for higher performance. Therefore, this paper proposes a universal SSD lifetime extension plug-in called Elastic Queue (EQ), which can cooperate with any cache algorithm to extend the lifetime of SSDs. EQ reduces the data update frequency by elastically extending the eviction border of cache blocks, making SSD devices serve much longer. Experimental results based on real-world traces indicate that adding the EQ plug-in to the original LRU, LIRS, and LARC schemes reduces their SSD write volumes by a factor of 39.03 and improves cache hit rates by 17.30% on average.
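To make the border-extension idea concrete, here is a minimal sketch assuming a plain LRU base algorithm; the class name, the `margin` parameter, and the eviction rule are illustrative, not the paper's actual design:

```python
from collections import OrderedDict

class ElasticLRU:
    """Sketch of an LRU queue with an elastic eviction border.

    A block that falls past the nominal border of `size` entries is
    kept in an elastic extension of up to `margin` entries before it
    is truly evicted, so a block re-referenced soon after crossing
    the border does not have to be rewritten to the SSD.
    """

    def __init__(self, size, margin):
        self.size = size        # nominal cache capacity (in blocks)
        self.margin = margin    # elastic extension beyond the border
        self.queue = OrderedDict()  # block id -> True, MRU at the end

    def access(self, block):
        hit = block in self.queue
        if hit:
            self.queue.move_to_end(block)  # refresh recency, no SSD write
        else:
            self.queue[block] = True       # admission costs one SSD write
            # evict only once even the elastic border is exceeded
            while len(self.queue) > self.size + self.margin:
                self.queue.popitem(last=False)
        return hit
```

With `margin = 0` this degenerates to plain LRU; EQ's actual contribution lies in how the elastic border is sized and managed, which this sketch does not capture.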
... Hybrid storage systems (HSS) take advantage of both fast-yet-small storage devices and large-yet-slow storage devices to deliver high storage capacity at low latency [27,39,40,55,62,70,74,76,82,84,100,115,130,184,229,230,238,244,253,258,267,270,274,283,284,287,288,291,312,343,347,349,356,396,408,409,439,441,453,456,489,511,517,518,520,530,531,540]. The key challenge in designing a high-performance and cost-effective hybrid storage system is to accurately identify the performance-criticality of application data and place data in the "best-fit" storage device [343]. ...
Preprint
The cost of moving data between the memory units and the compute units is a major contributor to the execution time and energy consumption of modern workloads in computing systems. At the same time, we are witnessing an enormous amount of data being generated across multiple application domains. These trends suggest a need for a paradigm shift towards a data-centric approach where computation is performed close to where the data resides. Further, a data-centric approach can enable a data-driven view where we take advantage of vast amounts of available data to improve architectural decisions. As a step towards modern architectures, this dissertation contributes to various aspects of the data-centric approach and proposes several data-driven mechanisms. First, we design NERO, a data-centric accelerator for a real-world weather prediction application. Second, we explore the applicability of different number formats, including fixed-point, floating-point, and posit, for different stencil kernels. Third, we propose NAPEL, an ML-based application performance and energy prediction framework for data-centric architectures. Fourth, we present LEAPER, the first use of few-shot learning to transfer FPGA-based computing models across different hardware platforms and applications. Fifth, we propose Sibyl, the first reinforcement learning-based mechanism for data placement in hybrid storage systems. Overall, this thesis provides two key conclusions: (1) hardware acceleration on an FPGA+HBM fabric is a promising solution to overcome the data movement bottleneck of our current computing systems; (2) data should drive system and design decisions by leveraging inherent data characteristics to make our computing systems more efficient.
... In this section, we evaluate the SSD endurance improvement of ETICA over ECI-Cache. We measure endurance with the number of writes into the SSD cache as our metric (as discussed and used by prior works [6], [86], [87], [88], [89]). Fig. 14 compares the endurance of SSD using ETICA and ECI-Cache (in terms of the number of writes performed into the SSD cache). ...
Preprint
Full-text available
In this paper, we propose an Efficient Two-Level I/O Caching Architecture (ETICA) for virtualized platforms that can significantly improve I/O latency, endurance, and cost (in terms of cache size) while preserving the reliability of write-pending data blocks. As opposed to previous one-level I/O caching schemes in virtualized platforms, our proposed architecture 1) provides two levels of cache by employing both Dynamic Random-Access Memory (DRAM) and SSD in the I/O caching layer of virtualized platforms and 2) effectively partitions the cache space between running VMs to achieve maximum performance and minimum cache size. To manage the two-level cache, unlike the previous reuse distance calculation schemes such as Useful Reuse Distance (URD), which only consider the request type and neglect the impact of cache write policy, we propose a new metric, Policy Optimized reuse Distance (POD). The key idea of POD is to effectively calculate the reuse distance and estimate the amount of two-level DRAM+SSD cache space to allocate by considering both 1) the request type and 2) the cache write policy. Doing so results in enhanced performance and reduced cache size due to the allocation of cache blocks only for the requests that would be served by the I/O cache. ETICA maintains the reliability of write-pending data blocks and improves performance by 1) assigning an effective and fixed write policy at each level of the I/O cache hierarchy and 2) employing effective promotion and eviction methods between cache levels. Our extensive experiments conducted with a real implementation of the proposed two-level storage caching architecture show that ETICA provides 45% higher performance, compared to the state-of-the-art caching schemes in virtualized platforms, while improving both cache size and SSD endurance by 51.7% and 33.8%, respectively.
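The intuition behind POD can be sketched as follows; the function, the policy encoding, and the counting of distance in intervening requests (rather than distinct blocks) are simplifications of ours, not the paper's definition:

```python
def pod_style_distances(trace, policy):
    """Illustrative policy-aware reuse-distance counting.

    trace:  iterable of (block, op) pairs, op in {"read", "write"}
    policy: "write-back" or "write-through" (assumed encoding)

    Only requests that the I/O cache would serve under `policy`
    advance the clock; ignoring the rest shrinks the distances and
    hence the cache size estimated from them.
    """
    def served_by_cache(op):
        # Assumption: under write-through, writes are not allocated
        # in the cache; under write-back, reads and writes both are.
        return op == "read" or policy == "write-back"

    last_seen = {}   # block -> clock value at its previous counted touch
    clock = 0
    distances = []
    for block, op in trace:
        if not served_by_cache(op):
            continue
        if block in last_seen:
            distances.append(clock - last_seen[block])
        last_seen[block] = clock
        clock += 1
    return distances
```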
... Notice that random writes in an SSD device are roughly tenfold slower than sequential writes and cause excessive internal fragmentation. Many algorithms [7,21,10,33] and architectures [65,35,44] have been designed to alleviate the write traffic and control the GC process in SSD caches. ...
Preprint
Although every storage technology invented so far has taken a big step toward perfection, none of them is spotless. Essential data-store requirements such as performance, availability, and recoverability have not yet been met by a single, economically affordable medium. One of the most influential factors is price, so there has always been a trade-off between having the desired set of storage features and the cost. To address this issue, a network of various types of storage media is used to deliver the high performance of expensive devices, such as solid-state drives and non-volatile memories, along with the high capacity of inexpensive ones, like hard disk drives. In software, caching and tiering are long-established concepts for handling file operations, moving data automatically within such a storage network, and managing data backups on low-cost media. Intelligently moving data across devices according to need is the key insight here. In this survey, we discuss recent research on improving high-performance storage systems with caching and tiering techniques.
... In [15], various ARC-based management policies for DRAM-SSD caching architectures are compared. A more general approach to preventing repetitive replacement of data pages in SSDs is suggested in [16], which gives buffered data pages a better chance to be accessed again and therefore to stay in the cache. S-RAC [17] characterizes workloads into six groups. ...
Article
Full-text available
SSDs are emerging storage devices that, unlike HDDs, have no mechanical parts and therefore offer superior performance. Due to the high cost of SSDs, entirely replacing HDDs with SSDs is not economically justified. Additionally, SSDs can endure only a limited number of writes before failing. To mitigate the shortcomings of SSDs while taking advantage of their high performance, SSD caching is practiced in both academia and industry. Previously proposed caching architectures have focused on either performance or endurance, but not both, and their cost, reliability, and power consumption are not evaluated. This paper proposes a hybrid I/O caching architecture that offers higher performance than previous studies while also improving power consumption on a similar budget. The proposed architecture uses DRAM, a Read-Optimized SSD (RO-SSD), and a Write-Optimized SSD (WO-SSD) in a three-level cache hierarchy and tries to efficiently redirect read requests to either DRAM or the RO-SSD while sending writes to the WO-SSD. To provide high reliability, dirty pages are written to at least two devices, which removes any single point of failure. Power consumption is managed by reducing the number of accesses issued to the SSDs. The proposed architecture reconfigures itself between performance- and endurance-optimized policies based on workload characteristics to maintain an effective trade-off between performance and endurance. We have implemented the proposed architecture on a server equipped with industrial SSDs and HDDs. The experimental results show that, compared to state-of-the-art studies, the proposed architecture improves performance and power consumption by an average of 8% and 28%, respectively, and reduces cost by 5%, while increasing the endurance cost by 4.7% with a negligible reliability penalty.
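A rough sketch of the request routing this hierarchy implies; the device objects, their methods, and the choice of where the duplicate dirty copy lives are placeholders, not the paper's actual design:

```python
def route_request(op, block, dram, ro_ssd, wo_ssd):
    """Illustrative routing for a DRAM / RO-SSD / WO-SSD hierarchy.

    Reads are served from DRAM or the read-optimized SSD when
    possible; writes go to the write-optimized SSD, with a second
    copy of the dirty page on another non-volatile device (the
    RO-SSD here, as an assumed choice) so that no single device
    is a point of failure for write-pending data.
    """
    if op == "read":
        for device in (dram, ro_ssd, wo_ssd):
            if device.has(block):
                return device.read(block)
        return None  # miss at every cache level
    # op == "write": keep the dirty page on two devices
    wo_ssd.write(block)
    ro_ssd.write(block)
```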
... In particular, lifetime degradation is a serious problem because it reduces the number of available blocks, which are a limited resource. Thus, enhancing lifetime is an important and urgent demand, and many studies have been proposed recently [6][7][8][9][10]. ...
Article
Full-text available
Solid-state drives (SSDs) have become popular as main storage devices. Over time, however, the reliability of an SSD degrades due to bit errors, which poses a serious issue. Periodic remapping (PR) has been suggested to overcome this issue, but it has a critical weakness: it increases lifetime loss. Therefore, we propose the conditional remapping invocation method (CRIM) to sustain reliability without lifetime loss. CRIM uses a probability-based threshold to decide when to invoke the remapping operation. We evaluate the effectiveness of CRIM using real workload trace data. Our experiments show that CRIM can extend the lifetime of an SSD beyond PR by 12.6% to 17.9% of the 5-year warranty time. In addition, CRIM can reduce the bit-error probability of an SSD by up to 73 times compared with PR, in terms of the typical bit-error rate.
... A smaller number of writes is expected to lead to better endurance. Similar metrics are used in previous system-level studies, such as [22,29,31,40]. Note that the total number of writes for each workload (reported in the experiments) is calculated by Eq. 3 which includes writes from the disk subsystem to the SSD and also the writes from the CPU to the SSD: ...
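Written out as an equation, the total-write metric the snippet describes (our reconstruction of what it calls Eq. 3; the symbols are ours):

```latex
W_{\text{total}} \;=\; W_{\text{disk}\rightarrow\text{SSD}} \;+\; W_{\text{CPU}\rightarrow\text{SSD}}
```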
Article
Full-text available
In recent years, high interest in using Virtual Machines (VMs) in data centers and Cloud computing has significantly increased the demand for high-performance data storage systems. Recent studies suggest using SSDs as a caching layer for HDD-based storage subsystems in virtualization platforms. Such studies neglect to address the endurance and cost of SSDs, which can significantly affect the efficiency of I/O caching. Moreover, previous studies only configure the cache size to provide the required performance level for each VM, while neglecting other important parameters such as cache write policy and request type, which can adversely affect both performance-per-cost and endurance. In this paper, we present a new high-Endurance and Cost-efficient I/O Caching (ECI-Cache) scheme for virtualized platforms, which can significantly improve both the performance-per-cost and endurance of storage subsystems as opposed to previously proposed I/O caching schemes. Unlike traditional I/O caching schemes which allocate cache size only based on reuse distance of accesses, we propose a new metric, Useful Reuse Distance (URD), which considers the request type in reuse distance calculation, resulting in improved performance-per-cost and endurance for the SSD cache. Via online characterization of workloads and using URD, ECI-Cache partitions the SSD cache across VMs and is able to dynamically adjust the cache size and write policy for each VM. To evaluate the proposed scheme, we have implemented ECI-Cache in an open source hypervisor, QEMU (version 2.8.0), on a server running the CentOS 7 operating system (kernel version 3.10.0-327). Experimental results show that our proposed scheme improves the performance, performance-per-cost, and endurance of the SSD cache by 17%, 30% and 65%, respectively, compared to the state-of-the-art dynamic cache partitioning scheme.
Preprint
Hybrid storage systems (HSS) use multiple different storage devices to provide high and scalable storage capacity at high performance. Recent research proposes various techniques that aim to accurately identify performance-critical data to place it in a "best-fit" storage device. Unfortunately, most of these techniques are rigid, which (1) limits their adaptivity to perform well for a wide range of workloads and storage device configurations, and (2) makes it difficult for designers to extend these techniques to different storage system configurations (e.g., with a different number or different types of storage devices) than the configuration they are designed for. We introduce Sibyl, the first technique that uses reinforcement learning for data placement in hybrid storage systems. Sibyl observes different features of the running workload as well as the storage devices to make system-aware data placement decisions. For every decision it makes, Sibyl receives a reward from the system that it uses to evaluate the long-term performance impact of its decision and continuously optimizes its data placement policy online. We implement Sibyl on real systems with various HSS configurations. Our results show that Sibyl provides 21.6%/19.9% performance improvement in a performance-oriented/cost-oriented HSS configuration compared to the best previous data placement technique. Our evaluation using an HSS configuration with three different storage devices shows that Sibyl outperforms the state-of-the-art data placement policy by 23.9%-48.2%, while significantly reducing the system architect's burden in designing a data placement mechanism that can simultaneously incorporate three storage devices. We show that Sibyl achieves 80% of the performance of an oracle policy that has complete knowledge of future access patterns while incurring a very modest storage overhead of only 124.4 KiB.
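The abstract's observe-act-reward cycle can be sketched as follows; the `agent`/`system` objects and their methods are hypothetical stand-ins, not Sibyl's real interface:

```python
def placement_loop(agent, system):
    """Illustrative reinforcement-learning data-placement loop.

    Each iteration: observe workload and device features, choose a
    storage device for the next request, collect a reward from the
    system, and update the policy online.
    """
    state = system.observe()                  # workload + device features
    while system.has_requests():
        device = agent.act(state)             # pick a storage device
        system.place(system.next_request(), device)
        reward = system.reward()              # long-term performance signal
        next_state = system.observe()
        agent.update(state, device, reward, next_state)
        state = next_state
```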
Conference Paper
Full-text available
Recently, flash-based solid-state drives (SSDs) have been widely deployed as cache devices to boost system performance. However, classical SSD cache algorithms (e.g., LRU) replace the cached data frequently to maintain high hit rates. Such aggressive data-updating strategies cause too many write operations on SSDs and make them wear out quickly, which ultimately leads to high SSD costs for enterprise applications. In this paper, we propose a novel Expiration-Time Driven Cache (ETD-Cache) method to solve this problem. ETD-Cache adopts an active data eviction mechanism: an already cached block leaves the SSD cache if and only if it has not been accessed for longer than a specified expiration time. This mechanism gives cached contents more time to wait for their subsequent accesses and limits the admission of newly arrived blocks, generating fewer SSD writes. In addition, a low-overhead candidate management module is designed to maintain the most popular data in the system for potential cache replacement. Simulations driven by a series of typical real-world traces indicate that, thanks to the great reduction in data update frequency, ETD-Cache lowers the total SSD costs by 98.45% compared with LRU at the same cache hit rate.
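A minimal sketch of the expiration rule alone (the class, the timer granularity, and the parameter names are illustrative; the admission-limiting and candidate-management parts are omitted):

```python
import time

class ExpirationCacheSketch:
    """Illustrative expiration-time-driven eviction.

    Every hit restarts a block's clock; a block is evicted only
    after it has gone unreferenced for longer than `expiration`.
    """

    def __init__(self, expiration_secs):
        self.expiration = expiration_secs
        self.last_access = {}  # block id -> time of last access

    def access(self, block, now=None):
        now = time.monotonic() if now is None else now
        hit = block in self.last_access
        self.last_access[block] = now  # (re)start the block's clock
        return hit

    def expire(self, now=None):
        now = time.monotonic() if now is None else now
        stale = [b for b, t in self.last_access.items()
                 if now - t > self.expiration]
        for b in stale:  # leave iff unreferenced past the expiration time
            del self.last_access[b]
        return stale
```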
Article
Full-text available
Serving as cache disks, flash-based solid-state drives (SSDs) can significantly boost the performance of read-intensive applications. However, frequent data updating, the necessary condition for classical replacement algorithms (e.g., LRU, MQ, LIRS, and ARC) to achieve a high hit rate, makes SSDs wear out quickly. To address this problem, we propose a new approach, write-efficient caching (WEC), to greatly improve the write durability of SSD caches. WEC reduces the total number of writes issued to SSDs while achieving high hit rates. WEC takes two steps to improve the write durability and performance of an SSD cache. First, WEC discovers write-efficient data, which tend to remain active for a long period and to be accessed frequently. Second, WEC keeps the write-efficient data in SSDs long enough to avoid an excessive number of unnecessary updates. Our findings, based on a wide range of popular real-world traces, show that write-efficient data does exist in many read-intensive applications. Our experimental results indicate that, compared with the classical algorithms, WEC improves the mean hits of each written block by approximately two orders of magnitude while exhibiting similar or even higher hit rates.
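The "write-efficient" notion can be sketched as a simple predicate; the thresholds and the `stats` fields are invented for illustration:

```python
def is_write_efficient(stats, min_hits=64, min_active_secs=3600):
    """Illustrative test for write-efficient data: blocks that stay
    active for a long time and are hit often repay the single SSD
    write needed to cache them many times over."""
    active_span = stats.last_access - stats.first_access
    return stats.hit_count >= min_hits and active_span >= min_active_secs
```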
Conference Paper
Full-text available
We examine the write endurance of USB flash drives using a range of approaches: chip-level measurements, reverse engineering, timing analysis, whole-device endurance testing, and simulation. The focus of our investigation is not only measured endurance but also the underlying factors, at the level of chips and algorithms (both typical and ideal), that determine the endurance of a device. Our chip-level measurements show endurance far in excess of the nominal values quoted by manufacturers, by a factor of as much as 100. We reverse-engineer specifics of the Flash Translation Layers (FTLs) used by several devices and find a close correlation between measured whole-device endurance and predictions from reverse-engineered FTL parameters and measured chip endurance values. We present methods based on analysis of operation latency which provide a non-intrusive mechanism for determining FTL parameters. Finally, we present Monte Carlo simulation results giving numerical bounds on the endurance achievable by any online algorithm in the face of arbitrary or malicious access patterns.
Conference Paper
Full-text available
Large caches in storage servers have become essential for meeting the service levels required by applications. Today these caches often need to be warmed with data, due to scenarios such as the dynamic creation of cache space and server restarts that clear cache contents. When large storage caches are warmed at the rate of application I/O, warmup can take hours or even days, affecting both application performance and server load over a long period. We have created Bonfire, a mechanism for accelerating cache warmup. Bonfire monitors storage server workloads, logs important warmup data, and efficiently preloads storage-level caches with that data. Bonfire is based on our detailed analysis of block-level data-center traces, which provides insights into warmup heuristics as well as the potential for efficient mechanisms. We show through both simulation and trace replay that Bonfire reduces both warmup time and backend server load significantly, compared to a cache that is warmed up on demand.
Article
Emerging solid-state storage media can significantly improve storage performance and energy efficiency. However, the high cost per byte of solid-state media has hindered widespread adoption in servers. This paper proposes a new, cost-effective architecture, SieveStore, which enables the use of solid-state media to selectively filter access to storage ensembles. Our paper makes three key contributions. First, we make a case for highly selective, storage-ensemble-level disk-block caching, based on the highly skewed block-popularity distribution and the dynamic nature of the popular block set. Second, we identify the problem of allocation-writes and show that selective cache allocation to reduce allocation-writes, which we call sieving, is fundamental to efficient ensemble-level disk caching. Third, we propose two practical variants of SieveStore. Based on week-long block-access traces from a storage ensemble of 13 servers, we find that the two components (sieving and ensemble-level caching) each contribute to SieveStore's cost-effectiveness. Compared to unsieved ensemble-level disk caches, SieveStore achieves significantly higher hit ratios (35%-50% more, on average) while using only one-seventh the number of SSD drives. Further, ensemble-level caching is strictly better in cost-performance than per-server caching.
Conference Paper
The increasing popularity of flash memory has changed storage systems. Flash-based solid-state drives (SSDs) are now widely deployed as caches for magnetic hard disk drives (HDDs) to speed up data-intensive applications. However, existing cache algorithms focus exclusively on performance improvements and ignore the write endurance of SSDs. In this paper, we propose a novel cache management algorithm for flash-based disk caches, named Lazy Adaptive Replacement Cache (LARC). LARC filters out seldom-accessed blocks and prevents them from entering the cache. This avoids cache pollution and keeps popular blocks in the cache for a longer period of time, leading to a higher hit rate. Meanwhile, LARC reduces the number of cache replacements and thus incurs less write traffic to the SSD, especially for read-dominant workloads. In this way, LARC improves performance and extends SSD lifetime at the same time. LARC is self-tuning and low-overhead. It has been extensively evaluated by both trace-driven simulations and a prototype implementation in flashcache. Our experiments show that LARC outperforms state-of-the-art algorithms and reduces write traffic to the SSD by up to 94.5% for read-dominant workloads and by 11.2-40.8% for write-dominant workloads.
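The filtering idea can be sketched with a ghost queue that holds only block IDs, so a block is admitted to the SSD cache on its second touch. This is a simplified sketch: real LARC adaptively tunes the ghost-queue length, which the fixed `ghost_size` here does not capture.

```python
from collections import OrderedDict

class LARCSketch:
    """Illustrative LARC-style admission filter."""

    def __init__(self, cache_size, ghost_size):
        self.cache = OrderedDict()  # SSD cache: block id -> True, LRU order
        self.ghost = OrderedDict()  # candidate block ids only, no data
        self.cache_size = cache_size
        self.ghost_size = ghost_size

    def access(self, block):
        if block in self.cache:      # hit: plain LRU promotion
            self.cache.move_to_end(block)
            return True
        if block in self.ghost:      # second touch: admit to the SSD cache
            del self.ghost[block]
            self.cache[block] = True
            if len(self.cache) > self.cache_size:
                self.cache.popitem(last=False)
        else:                        # first touch: remember the id only
            self.ghost[block] = True
            if len(self.ghost) > self.ghost_size:
                self.ghost.popitem(last=False)
        return False
```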
Article
In recent years, flash-based SSDs have grown enormously in both capacity and popularity. In high-performance enterprise storage applications, accelerating adoption of SSDs is predicated on the ability of manufacturers to deliver performance that far exceeds disks while closing the gap in cost per gigabyte. However, while flash density continues to improve, other metrics such as reliability, endurance, and performance are all declining. As a result, building larger-capacity flash-based SSDs that are reliable enough to be useful in enterprise settings and high-performance enough to justify their cost will become challenging. In this work, we present empirical data collected from 45 flash chips from six manufacturers and examine the performance trends of these raw flash devices as flash scales down in feature size. We use this analysis to predict the performance and cost characteristics of future SSDs. We show that future gains in density will come at significant drops in performance and reliability. As a result, SSD manufacturers and users will face a tough choice in trading off cost, performance, capacity, and reliability.