Fig. 1 (uploaded by Jiachen Zhang)

Contexts in source publication

Context 1
... displayed in Fig. 1, NAND-flash-based SSDs are usually connected through SATA or PCIe, while NVM-based devices can be connected through either DIMM or PCIe, and so compete with both SSD and DRAM devices. There have been many studies focusing on NVMs [1]. However, most of them were based on NVM emulators, as NVM-based devices were previously not ...
Context 2
... common situation is one where a MySQL server is connected to multiple clients simultaneously. In this situation, requests are submitted concurrently to the InnoDB storage engine. We simulate this situation using the multi-threaded benchmark provided by Sysbench, as sketched below. The results in the PE and the VE are shown in Fig. 10 and Fig. 11 ...
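A minimal sketch of such a run, assuming the standard Sysbench 1.x OLTP scripts (oltp_read_only, oltp_write_only, oltp_read_write) and a pre-populated sbtest database; the excerpt does not give the exact flags the authors used, so the run duration and table parameters below are placeholders:

```python
import subprocess

# Hypothetical reconstruction of the multi-threaded Sysbench runs described
# above; the exact flags the authors used are not given in the excerpt.
THREAD_COUNTS = [1, 2, 4, 8, 16, 32, 64]

def run_oltp(workload: str, threads: int) -> str:
    """Run one Sysbench OLTP workload (e.g. oltp_read_only) and return its output."""
    cmd = [
        "sysbench", workload,
        f"--threads={threads}",
        "--time=60",              # assumed run duration
        "--tables=10",            # assumed table count
        "--table-size=1000000",   # assumed rows per table
        "--mysql-db=sbtest",      # database must be prepared beforehand
        "run",
    ]
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout

if __name__ == "__main__":
    for n in THREAD_COUNTS:
        out = run_oltp("oltp_read_only", n)
        # Sysbench prints a line like "transactions: 12345 (411.49 per sec.)";
        # the per-second figure is the TPS plotted in Fig. 10 and Fig. 11.
        tps_line = next(l for l in out.splitlines() if "transactions:" in l)
        print(f"{n:>2} threads -> {tps_line.strip()}")
```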
Context 3
... Optane achieves better TPS in all situations, as expected. As shown in Fig. 10 (a), when only one thread is connected, the TPS of Optane is only 2.6× higher than that of NAND. When the thread number reaches 64, the TPS of Optane is 3.8× higher than that of NAND. For the write-only case and the mixed read/write case in Fig. 10 (b) and (c), Optane shows better scalability when the thread number is smaller than 8. We also believe the good scalability of NAND is due to the write buffering in our NAND hardware, as explained in Section III. Thus, we conclude that Optane has better scalability in MySQL workloads, especially for read ...
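To make the comparison concrete, here is the arithmetic behind the scalability claim; the absolute TPS values are hypothetical, and only the 2.6× and 3.8× ratios come from the text:

```python
# A minimal sketch of how "scalability" is compared in the passage: the
# speedup of Optane over NAND at each thread count. Absolute TPS values
# here are hypothetical; only the 2.6x and 3.8x ratios come from the text.
optane_tps = {1: 2600, 64: 38000}   # hypothetical
nand_tps   = {1: 1000, 64: 10000}   # hypothetical

for n in sorted(optane_tps):
    print(f"{n:>2} threads: Optane is {optane_tps[n] / nand_tps[n]:.1f}x NAND")

# Scaling efficiency relative to the single-thread case:
for name, tps in [("Optane", optane_tps), ("NAND", nand_tps)]:
    print(f"{name} 64-thread scaling: {tps[64] / tps[1]:.1f}x over 1 thread")
```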
Context 4
... results in the PE and the VE are shown in Fig. 12 (a) and Fig. 13 (a), respectively. Moving from 3% of the data cached to 50% cached increased the TPS of Optane in the PE 1.4-fold, that of NAND in the PE 1.9-fold, that of Optane in the VE 1.3-fold, and that of NAND in the VE 1.5-fold. Therefore, the cache is more important for the slower NAND, both in the VE and in the PE. ...
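In InnoDB, the cache in question is the buffer pool, so the 3% and 50% settings correspond to sizing innodb_buffer_pool_size relative to the data set. A hedged sketch using the mysql-connector-python package; the 20 GB working-set size and the connection details are assumptions:

```python
import mysql.connector  # pip install mysql-connector-python

# Hypothetical illustration of the "3% vs. 50% of data cached" settings:
# in InnoDB the cache is the buffer pool, sized relative to the data set.
DATASET_BYTES = 20 * 2**30          # assumed 20 GB working set
CACHE_RATIO = 0.50                  # use 0.03 for the 3% case

pool_bytes = int(DATASET_BYTES * CACHE_RATIO)

conn = mysql.connector.connect(user="root", password="...", host="localhost")
cur = conn.cursor()
# innodb_buffer_pool_size is dynamically resizable in MySQL >= 5.7
# (the server rounds it to a multiple of the chunk size).
cur.execute(f"SET GLOBAL innodb_buffer_pool_size = {pool_bytes}")
conn.close()
print(f"buffer pool set to {pool_bytes / 2**30:.1f} GiB ({CACHE_RATIO:.0%} of data)")
```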
Context 5
... Performance boosting from compression can only happen when using slow HDDs; the purpose of using table compression on SSDs is only to save storage space. As displayed in Fig. 12 (b)(c) and Fig. 13 (b)(c), we set the compressed page size to 8 KB and 4 KB and test the OLTP performance as the cache size increases in the PE and the VE. Compared with the uncompressed situation in Fig. 12 (a) and Fig. 13 (a), using the table compression feature slightly decreases the performance of MySQL equipped with NAND, while the decrease for MySQL equipped with Optane is significant. Therefore, in MySQL InnoDB, configuring the table compression feature is a trade-off between storage space savings and performance for Optane and NAND in the PE. In the VE, ...
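A sketch of how such a compressed table could be declared; the schema below is hypothetical, but ROW_FORMAT=COMPRESSED and KEY_BLOCK_SIZE are the standard InnoDB compression knobs, with KEY_BLOCK_SIZE values of 8 and 4 matching the 8 KB and 4 KB compressed page sizes tested above:

```python
import mysql.connector  # pip install mysql-connector-python

# Hypothetical table definition illustrating the InnoDB compression feature
# evaluated in Fig. 12 and Fig. 13; requires innodb_file_per_table=ON
# (the default in modern MySQL).
COMPRESSED_PAGE_KB = 8  # or 4, matching the two tested page sizes

conn = mysql.connector.connect(user="root", password="...", database="sbtest")
cur = conn.cursor()
cur.execute(
    "CREATE TABLE sbtest_compressed ("
    "  id INT PRIMARY KEY,"
    "  payload VARCHAR(255)"
    f") ENGINE=InnoDB ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE={COMPRESSED_PAGE_KB}"
)
conn.close()
```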

Citations

... However, these NVM-based devices are still slower than DRAM [48, 50, 51, 88–93]. For example, the state-of-the-art Intel Optane SSD (solid-state drive) [94], which is a low-latency NVM-based SSD device (i.e., an SSD device that uses NVM as its primary persistent storage media), has an access latency two orders of magnitude higher than that of DRAM [95, 96] (but still one order of magnitude lower than that of traditional NAND-flash-based SSDs [88, 89, 97–103]), while providing a better cost per byte ($1.50 per GB [104] vs. $5 per GB for DRAM [105]). Previous works propose two different ways to integrate an NVM device into state-of-the-art computers in order to alleviate DRAM scalability issues. ...
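The quoted comparisons reduce to simple ratios; the latency figures below are illustrative placeholders consistent with the orders of magnitude stated in the excerpt, not measurements:

```python
# Quick check of the cost and latency comparisons quoted above.
optane_per_gb, dram_per_gb = 1.50, 5.00        # $/GB, from the excerpt
print(f"DRAM costs {dram_per_gb / optane_per_gb:.1f}x more per GB than Optane")

# Rough latencies implied by the text (illustrative values, not measurements):
dram_ns   = 100          # assumed DRAM access latency
optane_us = 10           # ~two orders of magnitude slower than DRAM
nand_us   = 100          # ~one order of magnitude slower than the Optane SSD
print(f"Optane SSD / DRAM latency: ~{optane_us * 1000 / dram_ns:.0f}x")
print(f"NAND SSD / Optane latency: ~{nand_us / optane_us:.0f}x")
```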
... For the Intel Optane SSD, the device has a system interface similar to current NAND-based flash memory devices, where the system communicates with the device via the PCIe bus [177]. This configuration provides one order of magnitude lower latency than traditional NAND-flash-based SSDs [88, 89, 97–103]. For the Intel Optane DC Persistent Memory DIMM, the device is integrated into the system with a DIMM-based interface, similar to DRAM devices. ...
... There are three major differences between the Intel Optane SSD and a traditional NAND-flash-based SSD: (1) lower access latency in the Intel Optane SSD, (2) higher endurance in the Intel Optane SSD, and (3) higher cost of the Intel Optane SSD device. First, as previous works [88, 89, 97–103] show, performing a 4 KB random read using the Intel Optane SSD is approximately 6× faster than using a traditional NAND-flash-based SSD. Second, the Intel Optane SSD can provide 10× the endurance of a traditional NAND-flash-based SSD [223]. ...
Preprint
Full-text available
The number and diversity of consumer devices are growing rapidly, alongside their target applications' memory consumption. Unfortunately, DRAM scalability is becoming a limiting factor to the available memory capacity in consumer devices. As a potential solution, manufacturers have introduced emerging non-volatile memories (NVMs) into the market, which can be used to increase the memory capacity of consumer devices by augmenting or replacing DRAM. Since entirely replacing DRAM with NVM in consumer devices imposes large system integration and design challenges, recent works propose extending the total main memory space available to applications by using NVM as swap space for DRAM. However, no prior work analyzes the implications of enabling a real NVM-based swap space in real consumer devices. In this work, we provide the first analysis of the impact of extending the main memory space of consumer devices using off-the-shelf NVMs. We extensively examine system performance and energy consumption when the NVM device is used as swap space for DRAM main memory to effectively extend the main memory capacity. For our analyses, we equip real web-based Chromebook computers with the Intel Optane SSD, which is a state-of-the-art low-latency NVM-based SSD device. We compare the performance and energy consumption of interactive workloads running on our Chromebook with NVM-based swap space, where the Intel Optane SSD capacity is used as swap space to extend main memory capacity, against two state-of-the-art systems: (i) a baseline system with double the amount of DRAM than the system with the NVM-based swap space; and (ii) a system where the Intel Optane SSD is naively replaced with a state-of-the-art (yet slower) off-the-shelf NAND-flash-based SSD, which we use as a swap space of equivalent size as the NVM-based swap space.
... All the works used the same Intel Optane SSD (900P), but they used different configurations. For the FIO configuration, Zhang et al. [33] ran FIO for 30 s and stored fixed-size data (i.e., 20 GB) on the SSD before the performance evaluation. Yang et al. [34] ran FIO with a 4 KB block size, four threads, and an I/O depth of 32. ...
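A speculative reconstruction of the two FIO setups in Python; only the block size, thread count, queue depth, run time, and data-set size come from the excerpt, so the I/O engine, access pattern, and target device below are assumptions:

```python
import subprocess

# Hypothetical reconstruction of the two FIO setups described above.
def run_fio(name: str, **kw) -> str:
    """Invoke fio with the given per-run options and return its output."""
    cmd = ["fio", f"--name={name}", "--ioengine=libaio", "--direct=1",
           "--rw=randread"] + [f"--{k}={v}" for k, v in kw.items()]
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout

# Zhang et al. [33]: 30 s runs against 20 GB of pre-written data.
zhang = run_fio("zhang", filename="/dev/nvme0n1", runtime=30,
                time_based=1, size="20G")

# Yang et al. [34]: 4 KB blocks, four jobs, queue depth 32.
yang = run_fio("yang", filename="/dev/nvme0n1", bs="4k",
               numjobs=4, iodepth=32)
```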
Article
Full-text available
Cloud computing as a service-on-demand architecture has grown in importance over the last few years. The storage subsystem in cloud computing has undergone enormous innovation to provide high-quality cloud services. Emerging Non-Volatile Memory Express (NVMe) technology has attracted considerable attention in cloud computing by delivering high I/O performance in latency and bandwidth. Specifically, multiple NVMe solid-state drives (SSDs) can provide higher performance, fault tolerance, and storage capacity in the cloud computing environment. In this paper, we performed an empirical performance evaluation of recent NVMe SSDs (i.e., Intel Optane SSDs) in different redundant array of independent disks (RAID) environments. We analyzed multiple NVMe SSDs with RAID in terms of different performance metrics via synthetic and database benchmarks. We anticipate that our experimental results and performance analysis will have implications for various storage systems. Experimental results showed that software stack overhead reduced performance by up to 75%, 52%, 76%, 91%, and 92% in RAID 0, 1, 10, 5, and 6, respectively, compared with theoretical and expected performance.
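One way to read the quoted overhead numbers: overhead = 1 − measured/theoretical. The sketch below uses an illustrative first-order model of per-level theoretical bandwidth (treating parity capacity as unusable bandwidth; the paper's exact model is not given here) with a hypothetical 4-drive array, and the "measured" values are back-solved so the printed overheads match the percentages quoted in the abstract:

```python
# Illustrative baseline for "theoretical and expected performance".
N = 4                      # number of Optane SSDs in the array (assumed)
single_bw = 2500.0         # MB/s per SSD (assumed)

# First-order bandwidth expectations per RAID level (one common model):
theoretical = {
    "RAID 0":  N * single_bw,        # striping across all N devices
    "RAID 1":  N * single_bw,        # reads can be served by every mirror
    "RAID 10": N * single_bw,        # striped mirrors, reads from all
    "RAID 5":  (N - 1) * single_bw,  # one device's worth holds parity
    "RAID 6":  (N - 2) * single_bw,  # two devices' worth hold parity
}

# Hypothetical measurements, back-solved to reproduce the quoted overheads.
measured = {"RAID 0": 2500, "RAID 1": 4800, "RAID 10": 2400,
            "RAID 5": 675, "RAID 6": 400}

for level, theo in theoretical.items():
    overhead = 1 - measured[level] / theo
    print(f"{level:>7}: expected {theo:.0f} MB/s, "
          f"measured {measured[level]} MB/s, overhead {overhead:.0%}")
```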
... Unlike flash SSDs, the 3D XPoint SSD uses a resistance-based recording material to store bits, enabling it to provide much lower latency and higher throughput. Most importantly, the 3D XPoint SSD removes many long-standing concerns about flash SSDs, such as the read-write speed disparity, slow random writes, and endurance problems [24], [45]. Thus, the 3D XPoint SSD is widely regarded as a pivotal technology for building next-generation storage systems. ...
... For slow devices like HDDs or low-end SSDs, a 4 KB copy is usually not worthwhile, as the hardware access latency is far larger than the time spent transferring a data cluster [25]. However, because PM has ultra-low latency, the transfer time dominates the data-copy process even for relatively small data sizes, as verified in Section VI-D. ...
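The argument can be made concrete with the simple cost model total = access latency + size/bandwidth; the device figures below are rough assumed values, not the paper's measurements:

```python
# Worked version of the argument above: total copy cost = access latency +
# transfer time (size / bandwidth). Device numbers are rough, assumed figures.
def copy_time_us(size_kb: float, latency_us: float, bw_gb_s: float):
    transfer_us = size_kb * 1024 / (bw_gb_s * 1e9) * 1e6
    return latency_us + transfer_us, transfer_us

for name, latency_us, bw in [("HDD", 5000.0, 0.2),   # ~5 ms seek, 200 MB/s
                             ("PM",  0.3,    2.0)]:  # ~300 ns, 2 GB/s
    total, transfer = copy_time_us(4, latency_us, bw)
    print(f"{name}: 4 KB copy = {total:.2f} us total, "
          f"transfer is {transfer / total:.1%} of it")
# For the HDD, access latency dwarfs the transfer time (~0.4% of the total);
# for PM, the transfer time dominates (~87%), as the passage argues.
```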
Article
Full-text available
Over the past six decades, the computing systems field has experienced significant transformations, profoundly impacting society with transformational developments, such as the Internet and the commodification of computing. Underpinned by technological advancements, computer systems, far from being static, have been continuously evolving and adapting to cover multifaceted societal niches. This has led to new paradigms such as cloud, fog, edge computing, and the Internet of Things (IoT), which offer fresh economic and creative opportunities. Nevertheless, this rapid change poses complex research challenges, especially in maximizing potential and enhancing functionality. As such, to maintain an economical level of performance that meets ever-tighter requirements, one must understand the drivers of new model emergence and expansion, and how contemporary challenges differ from past ones. To that end, this article investigates and assesses the factors influencing the evolution of computing systems, covering established systems and architectures as well as newer developments, such as serverless computing, quantum computing, and on-device AI on edge devices. Trends emerge when one traces technological trajectory, which includes the rapid obsolescence of frameworks due to business and technical constraints, a move towards specialized systems and models, and varying approaches to centralized and decentralized control. This comprehensive review of modern computing systems looks ahead to the future of research in the field, highlighting key challenges and emerging trends, and underscoring their importance in cost-effectively driving technological progress.
Article
Full-text available
DRAM scalability is becoming a limiting factor to the available memory capacity in consumer devices. As a potential solution, manufacturers have introduced emerging non-volatile memories (NVMs) into the market, which can be used to increase the memory capacity of consumer devices by augmenting or replacing DRAM. In this work, we provide the first analysis of the impact of extending the main memory space of consumer devices using off-the-shelf NVMs. We equip real web-based Chromebook computers with the Intel Optane solid-state drive (SSD), which contains state-of-the-art low-latency NVM, and use the NVM as swap space. We analyze the performance and energy consumption of the Optane-equipped Chromebooks, and compare this with (i) a baseline system with double the amount of DRAM than the system with the NVM-based swap space; and (ii) a system where the Intel Optane SSD is naively replaced with a state-of-the-art NAND-flash-based SSD. Our experimental analysis reveals that while Optane-based swap space provides a cost-effective way to alleviate the DRAM capacity bottleneck in consumer devices, naive integration of the Optane SSD leads to several system-level overheads, mostly related to (1) the Linux block I/O layer, which can negatively impact overall performance; and (2) the off-chip traffic to the swap space, which can negatively impact energy consumption. To reduce the Linux block I/O layer overheads, we tailor several system-level mechanisms (i.e., the I/O scheduler and the I/O request completion mechanism) to the currently-running application’s access pattern. To reduce the off-chip traffic overhead, we leverage an operating system feature (called Zswap) that allocates some DRAM space to be used as a compressed in-DRAM cache for data swapped between DRAM and the Intel Optane SSD, significantly reducing energy consumption caused by the off-chip traffic to the swap space. We conclude that emerging NVMs are a cost-effective solution to alleviate the DRAM capacity bottleneck in consumer devices, which can be further enhanced by tailoring system-level mechanisms to better leverage the characteristics of our workloads and the NVM.
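For reference, Zswap is configured through standard Linux module parameters; a minimal sketch with illustrative values rather than the paper's actual configuration (requires root, and a swap device, e.g. the Optane SSD, must already be active):

```python
from pathlib import Path

# Standard Linux sysfs knobs for the Zswap compressed in-DRAM swap cache.
ZSWAP = Path("/sys/module/zswap/parameters")

def configure_zswap(enabled: bool = True, max_pool_percent: int = 20,
                    compressor: str = "lzo") -> None:
    """Enable Zswap and reserve a fraction of DRAM as its compressed pool."""
    (ZSWAP / "enabled").write_text("Y" if enabled else "N")
    (ZSWAP / "max_pool_percent").write_text(str(max_pool_percent))
    (ZSWAP / "compressor").write_text(compressor)

if __name__ == "__main__":
    configure_zswap()  # values above are illustrative, not the paper's setup
```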
Article
Persistent memory's (PM) byte-addressability and high capacity also make it appealing for virtualized environments. Modern virtual machine monitors virtualize PM using either I/O virtualization or memory virtualization. However, I/O virtualization sacrifices PM's byte-addressability, and memory virtualization provides no opportunity for PM image management. In this article, we enhance QEMU's memory virtualization mechanism. The enhanced system achieves both PM byte-addressability inside virtual machines and PM image management outside the virtual machines. We also design pcow, a virtual machine image format for PM, which is compatible with our enhanced memory virtualization and supports storage virtualization features including thin provisioning, base images, snapshots, and striping. Address translation is performed with the help of the Extended Page Table and is thus much faster than in image formats implemented with I/O virtualization. We also optimize pcow for PM's characteristics. We perform exhaustive performance evaluations on an x86 server equipped with Intel's Optane DC persistent memory. The evaluation demonstrates that our scheme boosts overall performance by up to 50× compared with qcow2, an image format implemented with I/O virtualization, and introduces almost no performance overhead compared with native memory virtualization. The striping feature can also scale out the virtual PM's bandwidth.
Article
In high-performance computing (HPC), data and metadata are stored on special server nodes, and client applications access the servers' data and metadata through a network, which induces network latencies and resource contention. These server nodes are typically equipped with (slow) magnetic disks, while the client nodes store temporary data on fast SSDs or even on non-volatile main memory (NVMM). Therefore, the full potential of parallel file systems can only be reached if fast client-side storage devices are included in the overall storage architecture. In this article, we propose an NVMM-based hierarchical persistent client cache for the Lustre file system (NVMM-LPCC for short). NVMM-LPCC implements two caching modes: a read-write mode (RW-NVMM-LPCC for short) and a read-only mode (RO-NVMM-LPCC for short). NVMM-LPCC integrates with the Lustre Hierarchical Storage Management (HSM) solution and the Lustre layout lock mechanism to provide consistent persistent caching services for I/O applications running on client nodes, while maintaining a global unified namespace of the entire Lustre file system. The evaluation results presented in this article show that NVMM-LPCC can increase the average read throughput by up to 35.80 times and the average write throughput by up to 9.83 times compared with the native Lustre system, while providing excellent scalability.
Article
Intel's Optane solid-state nonvolatile storage device is constructed using its new 3D XPoint technology. Although this technology is claimed to deliver substantial performance improvements over NAND-based storage systems, its performance characteristics have not been well studied. In this study, intensive experiments and measurements were carried out to extract the intrinsic performance characteristics of the Optane SSD, including its basic I/O performance behavior, advanced interleaving technology, performance consistency under highly intensive I/O workloads, the influence of unaligned request sizes, the elimination of write-driven garbage collection, read disturb issues, and the tail latency problem. The performance is compared with that of a conventional NAND SSD to show the performance difference of the Optane SSD in each scenario. In addition, using TPC-H, a read-intensive benchmark, a database system's performance has been studied on our target storage devices to quantify the potential benefits of the Optane SSD for a real application. Finally, the performance impact of hybrid Optane and NAND SSD storage systems on a database application has been investigated.