Figure 2. Write Amplification. In S2_slc, maximum bandwidth is achieved when the write size aligns with the stripe size (1 MB).

Source publication
Article
Full-text available
Solid-state devices (SSDs) have the potential to replace traditional hard disk drives (HDDs) as the de facto storage medium. Unfortunately, there are several decades of spinning-media assumptions embedded in the software stack as an "unwritten contract" [20]. In this paper, we revisit these system-level assumptions in light of SSDs and find that se...

Context in source publication

Context 1
... amplification is not a new phenomenon; it happens on RAID arrays that need to update parity blocks. We measured the effect of write amplification on one of the engineering samples (a low-end SSD, S2_slc); Figure 2 shows the results. We plot the bandwidth against the write size. ...
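The excerpt above attributes the bandwidth pattern in Figure 2 to stripe-level read-modify-write: any write that does not cover whole stripes forces extra data to be written. A minimal sketch of that arithmetic (assuming, purely for illustration, a 1 MB stripe and that every partially touched stripe is rewritten in full):

```python
# Illustrative sketch only: estimate write amplification on a striped device
# where any partially written stripe must be rewritten in full
# (read-modify-write). The 1 MB stripe size follows the figure caption;
# the cost model is an assumption, not the measured device behavior.

STRIPE_SIZE = 1 * 1024 * 1024  # 1 MB stripe


def physical_bytes_written(write_size: int, stripe_size: int = STRIPE_SIZE) -> int:
    """Bytes actually written: every stripe touched by the request is rewritten whole."""
    stripes_touched = -(-write_size // stripe_size)  # ceiling division
    return stripes_touched * stripe_size


def write_amplification(write_size: int) -> float:
    """Write amplification factor = physical bytes written / requested bytes."""
    return physical_bytes_written(write_size) / write_size


if __name__ == "__main__":
    for size_kb in (4, 256, 512, 1024, 1536, 2048):
        wa = write_amplification(size_kb * 1024)
        print(f"{size_kb:>5} KB write -> amplification {wa:.2f}x")
```

Under this toy model, amplification is exactly 1.0x at multiples of the stripe size, matching the caption's observation that bandwidth peaks when writes align with the 1 MB stripe.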

Similar publications

Article
Full-text available
Coincidence processing in positron emission tomography (PET) is typically done during acquisition of the data. However, on the EXPLORER total-body PET scanner we plan, in addition, to store unpaired single events (i.e. singles) for post-acquisition coincidence processing. A software-based coincidence processor was developed for EXPLORER and its per...
Article
Full-text available
This paper describes trends in the storage technologies associated with Linear Tape Open (LTO) Tape cartridges, hard disk drives (HDD), and NAND Flash based storage devices including solid-state drives (SSD). This technology discussion centers on the relationship between cost/bit and bit density and, specifically on how the Moore’s Law perception t...
Article
Full-text available
Solid-state drives (SSDs) have accelerated the architectural evolution of storage systems with several characteristics (e.g., out-of-place update) compared with hard disk drives (HDDs). The out-of-place update of SSDs can naturally support the transaction mechanism commonly used in systems to provide crash consistency. Thus, transactional functiona...
Article
Full-text available
In this article, we design and implement a cooperative shingle-aware file system, called CosaFS, on heterogeneous storage devices that mix solid-state drives (SSDs) and shingled magnetic recording (SMR) technology to improve the overall performance of storage systems. The basic idea of CosaFS is to classify objects as hot or cold objects based on a...
Article
Full-text available
Solid-State Drives (SSDs) have significant performance advantages over traditional Hard Disk Drives (HDDs), such as lower latency and higher throughput. However, a significantly higher price per capacity and a limited lifetime prevent designers from completely substituting HDDs with SSDs in enterprise storage systems. In this paper, we propose RC-RNN, the f...

Citations

... The system-wide one can also be used in the data servers [23,118,169,187,236] and forwarding layer [2,16,20,159,232] to coordinate accesses and optimize I/O performance of the whole system. ...
Article
Full-text available
The high-performance computing (HPC) I/O stack is complex due to its multiple software layers, the inter-dependencies among these layers, and the different performance tuning options for each layer. In this complex stack, the definition of an “I/O access pattern” has been re-appropriated to describe what an application is doing to write or read data from the perspective of different layers of the stack, often comprising a different set of features. It has become common to have to redefine what is meant by a pattern in every new study, as no assumption can be made. This survey aims to propose a baseline taxonomy, harnessing the I/O community’s knowledge over the last 20 years. This definition can serve as common ground for HPC I/O researchers and developers to apply known I/O tuning strategies and design new strategies for improving I/O performance. We seek to summarize and bring consensus to the multiple ways of describing a pattern, based on common features already used by the community over the years.
... To overcome the architectural limitation of the block-based storage model, an object-based storage model has been proposed. In this model, the storage management layer is offloaded to the underlying object-based NAND flash device (ONFD) [10]. The ONFD manages data in units of objects instead of logical blocks. ...
Conference Paper
Write amplification is a major cause of performance and endurance degradation in NAND flash based storage systems. In an object-based NAND flash device, two causes of write amplification are onode partial update and cascading update. Updating one onode, a kind of small object metadata, invokes a partial page update (i.e., an onode partial update) that incurs unnecessary migration of the un-updated data. A cascading update denotes that object metadata is updated in a cascading manner due to the erase-before-program property of NAND flash memory. In this work, we propose a system design to alleviate onode partial updates and cascading updates. The proposed system design includes: 1) a multi-level garbage collection technique to minimize the unnecessary data migration incurred by onode partial updates; 2) a B+ table tree and selective cache design to reduce the write operations associated with cascading updates; and 3) a power failure handling technique to guarantee system consistency. Experimental results show that our proposed design can achieve up to 20% write reduction compared to the best state-of-the-art.
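As a rough illustration of why an onode partial update is costly: a small metadata object that shares a flash page with other data forces the whole page to be rewritten out of place, migrating bytes the host never modified. A hedged sketch (page and onode sizes are assumptions for illustration only, not values from the cited paper):

```python
# Illustrative sketch: overhead of an onode partial update, where updating a
# small metadata object forces a whole flash page to be rewritten out of
# place. The sizes below are assumptions, not values from the cited paper.

PAGE_SIZE = 4096   # assumed flash page size in bytes
ONODE_SIZE = 256   # assumed onode (small object metadata) size in bytes


def partial_update_overhead(updated_bytes: int, page_size: int = PAGE_SIZE) -> int:
    """Bytes migrated even though the host never asked to rewrite them."""
    return page_size - updated_bytes


if __name__ == "__main__":
    wasted = partial_update_overhead(ONODE_SIZE)
    print(f"Updating a {ONODE_SIZE}-byte onode rewrites a {PAGE_SIZE}-byte page, "
          f"migrating {wasted} un-updated bytes "
          f"(about {PAGE_SIZE / ONODE_SIZE:.0f}x amplification for that update).")
```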
... Second, due to the inherent write amplification phenomenon of flash chips, actual write sizes are likely to be much larger than requested ones. Write amplification is triggered by the mismatch of erase and read/write operation units [6] as well as the extra migrations of valid data on to-be-erased blocks. ...
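The write amplification factor referred to in this excerpt is commonly computed as the ratio of pages physically programmed (host writes plus valid pages migrated by garbage collection before erases) to pages the host requested. A minimal sketch with made-up numbers:

```python
# Illustrative sketch: write amplification factor (WAF) on flash, counting
# both host writes and the valid pages garbage collection must migrate off
# blocks before they can be erased. The sample numbers are made up.

def write_amplification_factor(host_pages_written: int,
                               valid_pages_migrated_by_gc: int) -> float:
    """WAF = total pages programmed / pages the host requested."""
    total = host_pages_written + valid_pages_migrated_by_gc
    return total / host_pages_written


if __name__ == "__main__":
    waf = write_amplification_factor(host_pages_written=1000,
                                     valid_pages_migrated_by_gc=400)
    print(f"WAF = {waf:.2f}")  # -> 1.40
```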
Article
Full-text available
Serving as cache disks, flash-based solid-state drives (SSDs) can significantly boost the performance of read-intensive applications. However, frequent data updating, the necessary condition for classical replacement algorithms (e.g., LRU, MQ, LIRS, and ARC) to achieve a high hit rate, makes SSDs wear out quickly. To address this problem, we propose a new approach, write-efficient caching (WEC), to greatly improve the write durability of SSD caches. WEC reduces the total number of writes issued to SSDs while achieving high hit rates. WEC takes two steps to improve the write durability and performance of SSD caches. First, WEC discovers write-efficient data, which tend to remain active for a long time and to be frequently accessed. Second, WEC keeps the write-efficient data in SSDs long enough to avoid an excessive number of unnecessary updates. Our findings, based on a wide range of popular real-world traces, show that write-efficient data does exist in a wide range of popular read-intensive applications. Our experimental results indicate that, compared with the classical algorithms, WEC judiciously improves the mean hits of each written block by approximately two orders of magnitude while exhibiting similar or even higher hit rates.
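The core idea, as described above, is to spend SSD writes only on data that will earn many hits. A hedged sketch of that admission idea (not the WEC algorithm itself; the threshold and structures are assumptions for illustration):

```python
# Illustrative sketch of write-efficient admission: a block is written to the
# SSD cache only after it has proven to be read repeatedly, so each flash
# write earns many hits. This is NOT the WEC algorithm; the threshold and
# structures are assumptions for illustration.

from collections import defaultdict

ADMIT_AFTER_READS = 4            # assumed admission threshold
read_counts = defaultdict(int)   # per-block read counters, tracked in DRAM
ssd_cache = set()                # blocks currently resident on the SSD


def on_read(block_id: int) -> str:
    """Serve a read; admit the block to the SSD once it looks write-efficient."""
    if block_id in ssd_cache:
        return "ssd hit"
    read_counts[block_id] += 1
    if read_counts[block_id] >= ADMIT_AFTER_READS:
        ssd_cache.add(block_id)  # one SSD write, expected to earn many future hits
        return "admitted to ssd"
    return "served from backing store"


if __name__ == "__main__":
    for i in range(6):
        print(f"read {i}: {on_read(block_id=42)}")
```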
... Unfortunately, most operating-system-level approaches still use these devices to store files, even if more efficiently [22,23]. However, with the removal of the file concept altogether, this approach will be a significant factor along with further adoption of SSD storage [17]. In fact, Seagate has recently introduced an actual network-attached object-based device called Kinetic Storage [21], which provides a hardware back end for object-based databases without any file system protocol access. ...
Conference Paper
Full-text available
With the continuously increasing amount of online resources and data, use cases such as discovery, maintenance, and inter-operation become more and more complex. In particular, data management is becoming one of the main issues with respect to both scientific use cases (large-scale simulations or data mining applications) and consumer use cases (accessing photos or email attachments on mobile devices). We believe that one of the main bottlenecks blocking the development of solutions providing a truly seamless developer and user experience is the concept of the file and filesystem. We present Filess, a vision and architecture for file-less information systems where files are not necessary in either the application or operating system layers.
... Several assumptions about performance from HDDs do not hold when using SSDs and RAID arrays, and different requirements arise [5]. Therefore, we cannot simply classify optimizations by saying they are only suitable for HDDs or SSDs. ...
Conference Paper
Full-text available
This work presents the parallel storage device profiling tool SeRRa. Our tool obtains the sequential to random throughput ratio for reads and writes of different sizes on storage devices. In order to provide this information efficiently, SeRRa employs benchmarks to obtain the values for only a subset of the parameter space and estimates the remaining values through linear models. The MPI parallelization of SeRRa presented in this paper allows for faster profiling. Our results show that our parallel SeRRa provides profiles up to 8.7 times faster than the sequential implementation, up to 895 times faster than the originally required time (without SeRRa).
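SeRRa's approach, as summarized above, is to benchmark only a few request sizes and estimate the sequential-to-random throughput ratio for the rest with a linear model. A minimal sketch of that interpolation step (the measured values below are invented for illustration and are not SeRRa's data or model):

```python
# Illustrative sketch of profiling by interpolation: benchmark a few request
# sizes, fit a simple least-squares line, and estimate the sequential-to-random
# throughput ratio for unmeasured sizes. The sample numbers are made up and
# this is not SeRRa's actual model or data.

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a*x + b."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    a = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
        sum((x - mean_x) ** 2 for x in xs)
    return a, mean_y - a * mean_x


if __name__ == "__main__":
    sizes_kb = [4, 64, 1024]            # the few sizes actually benchmarked
    seq_rand_ratio = [25.0, 8.0, 1.5]   # measured ratios (illustrative values)
    a, b = fit_line(sizes_kb, seq_rand_ratio)
    for size in (16, 128, 512):         # sizes estimated rather than measured
        print(f"{size:>4} KB: estimated seq/rand ratio {a * size + b:.1f}")
```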
... On some SSDs, there is no difference between sequential and random accesses, but on others, this difference achieves orders of magnitude [3]. The sequential to random throughput ratio on some SSDs surpasses what is observed on some HDDs [4,5]. Therefore, approaches that aim at generating contiguous accesses (originally designed for HDDs) can greatly improve performance when used on SSDs that are also sensitive to access sequentiality. ...
... Their results motivated us to include SJF in our study. Another reason to include it was that a similar algorithm - Shortest Wait Time First - was reported to present good results as a disk scheduler for SSDs [4]. ...
Article
This article presents our approach to provide input/output (I/O) scheduling with double adaptivity: to applications and devices. In high-performance computing environments, parallel file systems provide a shared storage infrastructure to applications. In the situation where multiple applications access this shared infrastructure concurrently, their performance can be impaired because of interference. Our work focuses on I/O scheduling as a tool to improve performance by alleviating interference effects. The role of the I/O scheduler is to decide the order in which applications' requests must be processed by the parallel file system's servers, applying optimizations to adjust the resulting access pattern for improved performance. Our approach to improve I/O scheduling results is based on using information from applications' access patterns and storage devices' sensitivity to access sequentiality. We have applied machine learning to provide the ability to automatically select the best scheduling algorithm for each situation. Our approach improves performance by up to 75% over an approach that uses the same scheduling algorithm to all situations, without adaptability. Our results evidence that both aspects – applications and storage devices – are essential to make good scheduling decisions.
... Nonetheless, since both SSDs and RAID solutions are inherently different from HDDs, they should not be treated simply as "faster disks". Several assumptions about performance from HDDs do not hold when using SSDs and RAID arrays, and different requirements arise [Rajimwale et al. 2009]. ...
... On some SSDs, there is no difference between sequential and random accesses, but on others this difference achieves orders of magnitude [Chen et al. 2009]. The sequential to random throughput ratio on some SSDs surpasses what is observed on some HDDs [Rajimwale et al. 2009]. ...
... With the growing adoption of solid state drives, several works focused on characterizing these devices by evaluating their performance over several access patterns [Chen et al. 2009, Rajimwale et al. 2009]. These works point at SSDs' design options and their impact on performance, and illustrate common phenomena such as write amplification and stripe alignment. ...
Conference Paper
Full-text available
This work presents the parallel storage device profiling tool SeRRa. Our tool obtains the sequential to random throughput ratio for reads and writes of different sizes on storage devices. In order to provide this information efficiently, SeRRa employs benchmarks to obtain the values for only a subset of the parameter space and estimates the remaining values through linear models. The MPI parallelization of SeRRa presented in this paper allows for faster profiling. Our results show that our parallel SeRRa provides profiles up to 8.7 times faster than the sequential implementation, up to 895 times faster than the originally required time (without SeRRa).
... -Reduced Parallelism. While striping and interleaving can improve performance for sequential writes, their ability to deal with random writes is very limited [Rajimwale et al. 2009]. ...
Article
Existing space management and address mapping schemes for flash-based Solid-State Drives (SSDs) operate either at page or block granularity, with inevitable limitations in terms of memory requirements, performance, garbage collection, and scalability. To overcome these limitations, we propose a novel space management and address mapping scheme for flash referred to as Z-MAP, which manages flash space at the granularity of a Zone. Each Zone consists of multiple flash blocks. Leveraging workload classification, Z-MAP uses a Page-mapping Zone (Page Zone) to store random data and handle a large number of partial updates, and a Block-mapping Zone (Block Zone) to store sequential data and keep the overall mapping table small. Zones are dynamically allocated, and the mapping scheme for a Zone is determined only when it is allocated. Z-MAP uses a small part of flash memory or phase-change memory as a streaming Buffer Zone to log data sequentially and migrate it into the Page Zone or Block Zone based on workload classification. A two-level address mapping is designed to reduce the overall mapping table size and the address translation latency. Z-MAP classifies data before it is permanently stored in flash memory so that different workloads can be isolated and garbage collection overhead can be minimized. Z-MAP has been extensively evaluated through trace-driven simulation and a prototype implementation on OpenSSD. Our benchmark results conclusively demonstrate that Z-MAP can achieve up to 76% performance improvement, 81% mapping table reduction, and 88% garbage collection overhead reduction compared to existing Flash Translation Layer (FTL) schemes.
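A two-level, zone-based translation of the kind described above can be pictured as a zone table that first locates the Zone a logical page belongs to, with each Zone then resolving the page either through a per-page map (Page Zone) or a base-plus-offset rule (Block Zone). A hedged sketch, with all sizes and structures assumed rather than taken from the paper:

```python
# Illustrative sketch of two-level, zone-based address translation: level one
# selects the Zone, level two resolves the page inside it. Page Zones keep a
# per-page map (random data); Block Zones map contiguously (sequential data).
# Sizes and structures are assumptions, not Z-MAP's implementation.

PAGES_PER_ZONE = 1024  # assumed zone size in pages


class PageZone:
    def __init__(self):
        self.page_map = {}                  # logical offset -> physical page

    def translate(self, offset: int) -> int:
        return self.page_map[offset]


class BlockZone:
    def __init__(self, physical_base: int):
        self.physical_base = physical_base  # zone mapped contiguously

    def translate(self, offset: int) -> int:
        return self.physical_base + offset


def translate(lpn: int, zone_table: dict) -> int:
    """Two-level lookup: zone table first, then the zone's own mapping."""
    zone = zone_table[lpn // PAGES_PER_ZONE]
    return zone.translate(lpn % PAGES_PER_ZONE)


if __name__ == "__main__":
    pz = PageZone()
    pz.page_map[5] = 90017                            # random page placed individually
    zone_table = {0: pz, 1: BlockZone(physical_base=200000)}
    print(translate(5, zone_table))                   # -> 90017
    print(translate(PAGES_PER_ZONE + 7, zone_table))  # -> 200007
```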
... In addition, each block can be erased only a finite number of times. A typical MLC flash memory has around 10,000 erase cycles, while an SLC flash memory has around 100,000 erase cycles [11,13]. ...
... -Reduced Parallelism. While striping and interleaving can improve performance for sequential writes, its ability to deal with random write is very limited [13]. ...
... Four buffer schemes are implemented: 1) our CBM, 2) BPLRU [14], 3) FAB [20], and 4) BPAC [26]. We use FAST as the FTL and allocate 3% of the total flash memory as log blocks [13]. We use DRAM as the read cache and STT-MRAM as the write buffer. ...
Conference Paper
Random writes significantly limit the application of Solid State Drives (SSDs) in I/O-intensive applications such as scientific computing, Web services, and databases. While several buffer management algorithms have been proposed to reduce random writes, their ability to deal with workloads that mix sequential and random accesses is limited. In this paper, we propose a cooperative buffer management scheme referred to as CBM, which coordinates the write buffer and read cache to fully exploit the temporal and spatial localities in I/O-intensive workloads. To improve both buffer hit rate and destage sequentiality, CBM divides the write buffer space into a Page Region and a Block Region. Randomly written data is put in the Page Region at page granularity, while sequentially written data is stored in the Block Region at block granularity. CBM leverages threshold-based migration to dynamically separate random writes from sequential writes. When a block is evicted from the write buffer, CBM merges the dirty pages in the write buffer and the clean pages in the read cache belonging to the evicted block to maximize the possibility of forming a full block write. CBM has been extensively evaluated with simulation and a real implementation on OpenSSD. Our testing results conclusively demonstrate that CBM can achieve up to 84% performance improvement and 85% garbage collection overhead reduction compared to existing buffer management schemes.
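The destage step described above, merging dirty write-buffer pages with clean read-cache pages of the same block to approach a full block write, can be sketched as follows (names, sizes, and data are assumptions for illustration, not CBM's implementation):

```python
# Illustrative sketch of cooperative destaging: on eviction, dirty pages from
# the write buffer are merged with clean pages of the same logical block held
# in the read cache, so the destage comes as close as possible to a full
# block write. Names, sizes, and data are assumptions, not CBM's code.

PAGES_PER_BLOCK = 8  # assumed pages per flash block

# page id -> data, kept per logical block
write_buffer = {0: {1: "dirty-1", 4: "dirty-4"}}            # dirty pages
read_cache = {0: {0: "clean-0", 2: "clean-2", 4: "stale"}}  # clean pages


def destage_block(block_id: int):
    """Merge dirty and clean pages of one block; dirty data wins on overlap."""
    merged = dict(read_cache.get(block_id, {}))
    merged.update(write_buffer.pop(block_id, {}))  # dirty pages override clean ones
    full = len(merged) == PAGES_PER_BLOCK          # did we form a full block write?
    return merged, full


if __name__ == "__main__":
    pages, full = destage_block(0)
    print(sorted(pages))  # pages 0, 1, 2, 4 destaged together
    print("full block write" if full else "partial block write")
```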