Figure 3: Results for Quad-core System

Source publication
Conference Paper
Full-text available
Use of NVM (Non-volatile memory) devices such as ReRAM (resistive RAM) and STT-RAM (spin transfer torque RAM) for designing on-chip caches holds the promise of providing a high-density, low-leakage alternative to SRAM. However, low write endurance of NVMs, along with the write-variation introduced by existing cache management schemes may significan...

Similar publications

Technical Report
Full-text available
The limitations of SRAM viz. low-density and high leakage power have motivated the researchers to explore non-volatile memory (NVM) as an alternative. However, the write-endurance of NVMs is orders of magnitude smaller than that of SRAM, and existing cache management schemes may introduce significant write-variation, and hence, the use of NVMs for...
Article
Full-text available
Monolithic 3D (M3D) integration has emerged as a promising technology for fine-grained 3D stacking. As M3D integration offers extremely small, nanometer-scale via dimensions, it is beneficial for small microarchitectural blocks such as caches, register files, translation look-aside buffers (TLBs), etc. However, since the M3D integrat...
Article
Full-text available
While non-volatile memories (NVMs) provide high-density and low-leakage, they also have low write-endurance. This, along with the write-variation introduced by the cache management policies can lead to very small cache lifetime. In this paper, we propose ENLIVE, a technique for improving the lifetime of NVM caches. Our technique uses a small SRAM s...

Citations

... NVM technologies (e.g. STT-RAM, ReRAM, and PCM) [28], [48], [71] could be considered but they suffer from limited write endurance, high access latency, high write energy, and low write bandwidth [42], [61], [87]. These problems make NVCache more suitable for last level (or near last level) cache. ...
... These problems will be more pronounced than in NVMM because caches are written at a much higher rate than main memory, and the closer the cache is to the core, the higher the rate. Spin-Transfer Torque Random Access Memory (STT-RAM) has a relatively high write endurance of 4 × 10¹² writes, higher than alternatives such as Phase Change Memory (PCM) with 10⁸ writes [52], [61], [71], [87] and Resistive Random Access Memory (ReRAM) with 10¹¹ writes [4], [48], [61], [87]. However, the write endurance of these NVMs is still orders of magnitude lower than that of SRAM memory cells (about 10¹⁵ writes) [61], [71], [87]. ...
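As a rough, back-of-the-envelope illustration of why the endurance figures quoted above matter for caches (the hot-block write rate below is an assumed figure for illustration only, not taken from the cited papers), the expected lifetime of the most heavily written cell is simply its endurance divided by its write rate:

    # Back-of-the-envelope cell lifetime = write endurance / write rate.
    # Endurance values follow the excerpt above; the 10 MHz write rate for a
    # "hot" cache block is an assumed, workload-dependent figure.
    ENDURANCE = {
        "SRAM":    1e15,
        "STT-RAM": 4e12,
        "ReRAM":   1e11,
        "PCM":     1e8,
    }
    HOT_BLOCK_WRITES_PER_SEC = 1e7   # assumed: one write every 100 ns
    SECONDS_PER_DAY = 86_400

    for tech, endurance in ENDURANCE.items():
        lifetime_days = endurance / HOT_BLOCK_WRITES_PER_SEC / SECONDS_PER_DAY
        print(f"{tech:8s} {lifetime_days:12.4f} days")

Under this assumed rate, PCM survives only seconds, ReRAM a few hours, and STT-RAM a few days, which is why write-reduction and wear-leveling techniques are essential for NVM caches.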
... Several studies tried to extend the cache lifetime by reducing the number of writes to NVM cells [2,3,24,28,31,32] or by evenly distributing the write activities over the cells [14,15,26,27,33,42,44]. The vast majority of these studies focused on the data part of the cache. ...
... The studies in this category address the write variation among the data blocks of a cache set and try to minimize this variation. In [42] and [33], intra-set wear-leveling is provided by invalidating write-intensive data blocks without updating their age bits. The next request for that data then causes a cache miss, and the data is stored in another cache block. ...
Article
Full-text available
Emerging non-volatile memories (NVMs) are known as promising alternatives to SRAMs in on-chip caches. However, their limited write endurance is a major challenge when NVMs are employed in these frequently written caches. Early wear-out of NVM cells makes the lifetime of the caches extremely insufficient for today's computational systems. Previous studies only addressed the lifetime of the data part of the cache. This paper first demonstrates that the age-bits field of the cache replacement algorithm is the most frequently written part of a cache block and that its lifetime is shorter than that of the data part by more than 27×. Second, it investigates the effect of age-bit wear-out on cache operation and shows that performance is severely degraded after even a small portion of the age bits becomes non-operational. Third, a novel cache replacement algorithm, called Sleepy-LRU, is proposed to reduce the write activity of the age bits with negligible overheads. The evaluations show that Sleepy-LRU extends the lifetime of instruction and data caches to 3.63× and 3.00×, respectively, with an average of 0.06% performance overhead. In addition, Sleepy-LRU imposes no area or power consumption overhead.
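To see why the replacement state can wear out faster than the data array, note that under true LRU the recency (age) information of a set is updated on essentially every access to that set, while any single data block is written only when it is filled or receives a write hit. The toy model below illustrates this gap; the access mix, set geometry, and stream length are illustrative assumptions, not parameters from the paper.

    import random

    # Toy model of one 8-way LRU set: count rewrites of the set's age/recency
    # state versus writes to any single data block. In this model the age bits
    # are charged one write per access, which is the pessimistic LRU behavior.
    random.seed(0)
    WAYS = 8
    N_ACCESSES = 100_000
    WRITE_FRACTION = 0.3            # assumed fraction of accesses that are writes
    DISTINCT_BLOCKS = 16            # assumed working set mapping to this set

    lru_stack = list(range(WAYS))   # way indices, most recently used first
    tags = [None] * WAYS
    age_bit_writes = 0
    data_writes = [0] * WAYS

    for _ in range(N_ACCESSES):
        addr = random.randrange(DISTINCT_BLOCKS)
        is_write = random.random() < WRITE_FRACTION
        way = next((w for w in range(WAYS) if tags[w] == addr), None)
        if way is None:             # miss: evict the LRU way and fill it
            way = lru_stack[-1]
            tags[way] = addr
            data_writes[way] += 1   # the fill writes the data array
        elif is_write:
            data_writes[way] += 1   # a write hit updates the data array
        lru_stack.remove(way)       # every access rewrites the recency state
        lru_stack.insert(0, way)
        age_bit_writes += 1

    print("age-bit writes:       ", age_bit_writes)
    print("max data-block writes:", max(data_writes))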
... A ReRAM NUCA that addresses the lifetime problem in a performance-conscious manner is reported by Kotra et al. [14]. LastingNVCache, proposed by Mittal et al. [15], reduces the intra-set write variation by adding a write counter to each block in the cache. After the counter reaches a specified limit, the write operation is skipped by invalidating the block. ...
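A minimal sketch of this counter-based flushing idea follows; the threshold, the set geometry, and the use of per-way write totals as a stand-in for the replacement policy's choice of a cold victim are assumptions for illustration, not the exact LastingNVCache or PoLF microarchitecture. Each block carries a small write counter; once the counter reaches the limit, the block is invalidated instead of being written, so the next access misses and the data is refilled into a cold way.

    # Counter-based flushing in one cache set for intra-set wear-leveling.
    # FLUSH_THRESHOLD and the victim choice are illustrative assumptions.
    FLUSH_THRESHOLD = 16
    WAYS = 4

    class Block:
        def __init__(self):
            self.tag = None
            self.valid = False
            self.write_counter = 0   # small per-block counter, reset on refill
            self.total_writes = 0    # wear statistic (not real hardware state)

    class WearLevelingSet:
        def __init__(self):
            self.blocks = [Block() for _ in range(WAYS)]

        def _lookup(self, tag):
            for b in self.blocks:
                if b.valid and b.tag == tag:
                    return b
            return None

        def _victim(self):
            # Proxy for the cold block the replacement policy would pick when
            # the flushed block's age bits are deliberately left stale.
            return min(self.blocks, key=lambda b: b.total_writes)

        def write(self, tag):
            b = self._lookup(tag)
            if b is None:                        # miss: refill into a cold way
                b = self._victim()
                b.tag, b.valid, b.write_counter = tag, True, 0
            if b.write_counter >= FLUSH_THRESHOLD:
                b.valid = False                  # flush: skip the write here
                return                           # next access misses, refills
            b.write_counter += 1
            b.total_writes += 1

    # Hammer one address: without flushing, all writes would land on one way.
    s = WearLevelingSet()
    for _ in range(1000):
        s.write(0xABC)
    print([b.total_writes for b in s.blocks])    # writes spread over the ways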
Article
Full-text available
The attractive features exhibited by Non-Volatile Memory (NVM) technologies, such as low static power and high density, make them promising candidates in the memory hierarchy, including caches. However, their limited write endurance, together with the write variation governed by access patterns and the applied replacement policies, reduces the chances of NVMs succeeding SRAM. These write variations are of concern as they not only break down the NVM cells but also reduce the effective lifetime. This paper proposes efficient techniques to mitigate the intra-set write variation to improve the lifetime of NVM caches. Our first two techniques partition the cache into windows of equal size and distribute the writes uniformly across the cache set by employing a window as write-restricted or read-only; the window is selected by rotation or with the help of counters. In our third technique, different cache ways are employed as write-restricted over the period of execution to distribute the writes uniformly. Experimental results using full-system simulation show a significant reduction in intra-set write variation along with an improvement in cache lifetime.
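A hedged sketch of the rotation-based window idea described above; the window size, epoch length, and the rule for steering a write to a non-restricted way are illustrative assumptions rather than the paper's exact design.

    # Rotation-based window write-restriction within one cache set.
    # WAYS, WINDOW_SIZE, EPOCH and the steering rule are assumptions.
    WAYS = 8
    WINDOW_SIZE = 2                  # ways per window -> four windows
    EPOCH = 100                      # writes between window rotations

    writes_per_way = [0] * WAYS
    restricted_window = 0            # index of the write-restricted window
    writes_seen = 0

    def allowed_ways():
        """Ways outside the currently write-restricted window."""
        lo = restricted_window * WINDOW_SIZE
        return [w for w in range(WAYS) if not (lo <= w < lo + WINDOW_SIZE)]

    def place_write():
        """Steer a fill/write to the least-written allowed way."""
        global restricted_window, writes_seen
        way = min(allowed_ways(), key=lambda w: writes_per_way[w])
        writes_per_way[way] += 1
        writes_seen += 1
        if writes_seen % EPOCH == 0:             # rotate the restricted window
            restricted_window = (restricted_window + 1) % (WAYS // WINDOW_SIZE)

    for _ in range(10_000):
        place_write()
    print(writes_per_way)            # roughly uniform across the eight ways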
... Depending on the memory access behavior of a program and the cache replacement policy, cache management policies have been proposed to reduce the writes to a hybrid cache. Mittal et al. [78,75] presented techniques to increase cache lifetime by reducing intra-set write variation. The idea behind them is to change the physical cache-block location of a write-intensive data item within a set to achieve wear-leveling by periodically flushing a frequently-written data item. ...
Thesis
Traditional memories such as SRAM, DRAM and Flash have faced, during the last years, critical challenges related to what modern computing systems require: high performance, high storage density and low power. As the number of CMOS transistors is increasing, the leakage power consumption becomes a critical issue for energy-efficient systems. SRAM and DRAM consume too much energy and have low density, and Flash memories have a limited write endurance. Therefore, these technologies can no longer ensure the needs in both embedded and high-performance computing domains. Future memory systems must respect the energy and performance requirements. Since Non-Volatile Memories (NVMs) appeared, many studies have shown prominent features where such technologies can be a potential replacement of the conventional memories used on-chip and off-chip. NVMs have important qualities in storage density, scalability, leakage power, access performance and write endurance. Many research works have proposed designs based on NVMs, whether on main memory or on cache memories. Nevertheless, there are still some critical drawbacks of these new technologies. The main drawback is the cost of write operations in terms of latency and energy consumption. Ideally, we want to replace traditional technologies with NVMs to benefit from storage density and very low leakage, but eventually without the write operations overhead.

The scope of this work is to exploit the advantages of NVMs employed mainly on cache memories by mitigating the cost of write operations. Obviously, reducing the number of write operations in a program will help in reducing the energy consumption of that program. Many approaches for reducing write operations exist at the circuit level, architectural level and software level. We propose a compiler-level optimization that reduces the number of write operations by eliminating the execution of redundant stores, called silent stores. A store is silent if it writes to a memory address the same value that is already stored at this address. The LLVM-based optimization eliminates the identified silent stores in a program by not executing them.

Furthermore, the cost of a write operation is highly dependent on the used NVM and its non-volatility, called retention time; when the retention time is high, the latency and the energetic cost of a write operation are considerably high, and vice versa. Based on this characteristic, we propose an approach applicable in a multi-bank NVM where each bank is designed with a specific retention time. We analyze a program and compute the worst-case lifetime of a store instruction. The worst-case lifetime will help to allocate data to the most appropriate NVM bank.
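The thesis performs the silent-store elimination statically with an LLVM pass; the sketch below only illustrates the underlying check at run time, using a dictionary as a stand-in memory model: a store is skipped when the value to be written equals the value the location already holds.

    # Runtime illustration of silent-store elimination: a store is "silent"
    # when it writes the value the location already holds, so it is skipped.
    # The dictionary-based memory model is an illustrative stand-in only; the
    # thesis detects and removes such stores statically in an LLVM pass.
    memory = {}
    stores_executed = 0
    stores_eliminated = 0

    def store(addr, value):
        global stores_executed, stores_eliminated
        if memory.get(addr) == value:     # silent: same value already stored
            stores_eliminated += 1        # skip the expensive NVM write
            return
        memory[addr] = value
        stores_executed += 1

    for addr in range(100):
        store(addr, 0)                    # first pass: 100 real writes
    for _ in range(4):
        for addr in range(100):
            store(addr, 0)                # later passes: all silent, skipped

    print("executed:", stores_executed, "eliminated:", stores_eliminated)

In practice such a check costs an extra load per store, which is one motivation for identifying silent stores at compile time, as the thesis does.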
... Microarchitectural simulations have been performed using the Sniper simulator and workloads from the SPEC2006 suite and the HPC field (Section 4). In addition, ENLIVE has been compared with two recently proposed techniques for improving the lifetime of NVM caches, namely PoLF (probabilistic line-flush) [5] and LastingNVCache [13] (see Section 5.1). The results have shown that, compared to the other techniques, ENLIVE provides a larger improvement in cache lifetime and performance, with a smaller energy loss (Section 5.2). ...
... Note that although intra-set wear-leveling techniques (e.g., [5, 13, 19-21]) can also be used for mitigating repeated-address attacks on NVM, ENLIVE offers a distinct advantage over them. An intra-set wear-leveling technique only distributes writes uniformly within a set and does not reduce the total number of writes to the set. ...
... 1. Relative cache lifetime, where lifetime is defined as the inverse of the maximum number of writes on any cache block [4,13]; 2. Weighted speedup [14] (called relative performance); 3. Percentage energy loss; 4. Absolute increase in MPKI (misses per kilo-instruction) ...
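Restating the first metric as a small computation (the per-block write counts are made-up numbers for illustration): since lifetime is taken as the inverse of the maximum number of writes landing on any one block, the relative lifetime of a technique over a baseline is the baseline's maximum per-block write count divided by the technique's.

    # Relative lifetime = baseline max per-block writes / technique max writes,
    # following the "inverse of maximum writes on any cache block" definition.
    # The per-block write counts are made-up numbers for illustration.
    baseline_writes  = [900, 120, 80, 60]    # skewed: one hot block
    technique_writes = [310, 300, 280, 270]  # same total, spread by wear-leveling

    relative_lifetime = max(baseline_writes) / max(technique_writes)
    print(f"relative lifetime: {relative_lifetime:.2f}x")   # -> 2.90x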
Article
Full-text available
While non-volatile memories (NVMs) provide high density and low leakage, they also have low write endurance. This, along with the write variation introduced by the cache management policies, can lead to very small cache lifetime. In this paper, we propose ENLIVE, a technique for improving the lifetime of NVM caches. Our technique uses a small SRAM storage, called HotStore. ENLIVE detects frequently written blocks and transfers them to the HotStore so that they can be accessed with smaller latency and energy. This also reduces the number of writes to the NVM cache, which improves its lifetime. We present microarchitectural schemes for managing the HotStore. Simulations have been performed using an x86-64 simulator and benchmarks from the SPEC2006 suite. We observe that ENLIVE provides higher improvement in lifetime and better performance and energy efficiency than two state-of-the-art techniques for improving NVM cache lifetime. ENLIVE provides 8.47X, 14.67X and 15.79X improvement in lifetime for 2-, 4- and 8-core systems, respectively. Also, it works well for a range of system and algorithm parameters and incurs only a small overhead.
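A minimal sketch of the HotStore idea as described in the abstract; the detection threshold, the HotStore capacity, and the LRU-style eviction below are assumptions for illustration, not ENLIVE's actual microarchitectural parameters. Blocks whose write count crosses the threshold are moved into a small SRAM structure so that their subsequent writes bypass the NVM array.

    from collections import OrderedDict

    # SRAM "HotStore" in front of an NVM cache: frequently written blocks are
    # migrated to SRAM so their later writes stop wearing the NVM array.
    # HOT_THRESHOLD, CAPACITY and LRU eviction are illustrative assumptions.
    HOT_THRESHOLD = 8          # writes before a block is considered "hot"
    CAPACITY = 4               # HotStore entries

    nvm_writes = {}            # per-block writes that reached the NVM array
    write_counts = {}          # per-block detection counters
    hotstore = OrderedDict()   # block -> data, ordered for LRU eviction

    def write_block(block, data):
        if block in hotstore:                     # hot block: SRAM write only
            hotstore[block] = data
            hotstore.move_to_end(block)
            return
        write_counts[block] = write_counts.get(block, 0) + 1
        nvm_writes[block] = nvm_writes.get(block, 0) + 1
        if write_counts[block] >= HOT_THRESHOLD:  # promote to the HotStore
            if len(hotstore) >= CAPACITY:         # evict LRU entry back to NVM
                victim, _ = hotstore.popitem(last=False)
                nvm_writes[victim] = nvm_writes.get(victim, 0) + 1
                write_counts[victim] = 0
            hotstore[block] = data

    # One block written 1000 times: only the first few writes reach the NVM.
    for i in range(1000):
        write_block("A", i)
    print("NVM writes to block A:", nvm_writes["A"])   # -> 8 instead of 1000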
... This configuration could help reduce the large power and area requirements of SRAM; however, the memory system would need to use clever new algorithms to prevent the processor from creating either performance or endurance hotspots in the NVM technology, which, if left unaddressed, could lead to a very short cell lifetime. Such algorithms are currently being investigated [21,32], but numerous manufacturing and deployment hurdles remain. ...
Article
Full-text available
For extreme-scale high performance computing systems, system-wide power consumption has been identified as one of the key constraints moving forward, where the DRAM main memory systems account for about 30-50% of a node's overall power consumption. Moreover, as the benefits of device scaling for DRAM memory slow, it will become increasingly difficult to keep memory capacities balanced with increasing computational rates offered by next-generation processors. However, a number of emerging memory technologies - nonvolatile memory (NVM) devices - are being investigated as an alternative for DRAM. Moving forward, these NVM devices may offer a number of solutions for HPC architectures. First, as the name, NVM, implies, these devices retain state without continuous power, which can, in turn, reduce power costs. Second, certain NVM devices can be as dense as DRAM, facilitating more memory capacity in the same physical volume. Finally, NVM, such as contemporary NAND flash memory, can be less expensive than DRAM in terms of cost per bit. Taken together, these benefits can provide opportunities for revolutionizing the design of extreme-scale HPC systems. Researchers are investigating how to integrate these emerging technologies into future extreme-scale HPC systems, and how to expose these capabilities in the software stack and applications. Current results show a number of these strategies may offer high-bandwidth I/O, larger main memory capacities, persistent data structures, and new approaches for application resilience and output post-processing, such as transaction-based, incremental-checkpointing and in-situ visualization, respectively.
... Classification of cache WLTs: Based on their granularity, the WLTs can be classified as inter-color [29], inter-set [6], [15], intra-set [6], [16], [30]-[32] and memory-cell level [27], [28], [33]. We propose an intra-set WLT and compare our technique to other intra-set WLTs (see Sections V and VI for more details). ...
... The WLTs can also be divided according to whether they use data invalidation (also called 'flushing') [6], [15], [29], [30] or in-cache data movement (also called data migration or shifting) ([16] and PoLSwap, shown in Section V). Data invalidation increases off-chip accesses, leading to contention and endurance issues in main memory. ...
... Taken together, they evaluate a technique in a comprehensive manner. These metrics have also been used by other research works [6], [15], [16], [27], [30], [32]. ...
Article
Full-text available
Driven by the trends of increasing core counts and the bandwidth-wall problem, the size of last-level caches (LLCs) has greatly increased, and hence researchers have explored non-volatile memories (NVMs), which provide high density and consume low leakage power. Since NVMs have low write endurance and the existing cache management policies are write-variation-unaware, effective wear-leveling techniques are required for achieving reasonable cache lifetimes using NVMs. We present EqualWrites, a technique for mitigating intra-set write variation. Our technique works by recording the number of writes on a block and changing the cache-block location of a hot data item to redirect future writes to a cold block and achieve wear-leveling. Simulation experiments have been performed using an x86-64 simulator and benchmarks from SPEC06 and the HPC (high-performance computing) field. The results show that for single, dual and quad-core system configurations, EqualWrites improves cache lifetime by 6.31X, 8.74X and 10.54X, respectively. Also, its implementation overhead is very small and it provides a larger improvement in lifetime than three other intra-set wear-leveling techniques and a cache replacement policy.
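A hedged sketch of the data-movement idea as worded in the abstract; the per-way counters, the relocation threshold, and the choice of the least-written way as the 'cold' destination are illustrative assumptions, not EqualWrites' exact mechanism. When one way has absorbed enough writes since its data last moved, its data item is swapped with the data in the coldest way so that future writes to the hot item land elsewhere.

    # In-set wear-leveling by relocating a hot data item to a cold way.
    # The counters, threshold and swap rule are illustrative assumptions.
    WAYS = 4
    SWAP_THRESHOLD = 32          # writes to one way before its data is moved

    tags = list("ABCD")          # which data item sits in each way
    writes = [1] * WAYS          # per-way write counts (initial fills)
    since_move = [0] * WAYS      # writes to a way since its data last moved

    def write_item(tag):
        """Write to a data item already resident somewhere in this set."""
        way = tags.index(tag)
        writes[way] += 1
        since_move[way] += 1
        if since_move[way] >= SWAP_THRESHOLD:
            cold = min(range(WAYS), key=lambda w: writes[w])  # coldest way
            if cold != way:
                tags[way], tags[cold] = tags[cold], tags[way] # swap contents
                writes[cold] += 1        # the swap writes both ways once
                writes[way] += 1
            since_move[way] = 0
            since_move[cold] = 0

    for _ in range(10_000):
        write_item("A")          # hammer one item
    print(writes)                # writes spread across the four ways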
... Since SRAM provides high write endurance and performance, it has been conventionally used for designing LLCs. However, this has also led to an increase in the contribution of LLCs towards chip area and power consumption, since SRAM consumes high leakage power and has low density. (This technical report is an extension of our IEEE ISVLSI 2014 paper [1]; the specific extensions made in this report are listed at the end of Section 1.) ...
... Extensions from previous version: This paper makes the following extensions to the previous version [1]. ...
Technical Report
Full-text available
The limitations of SRAM, viz. low density and high leakage power, have motivated researchers to explore non-volatile memory (NVM) as an alternative. However, the write endurance of NVMs is orders of magnitude smaller than that of SRAM, and existing cache management schemes may introduce significant write variation; hence, the use of NVMs for designing on-chip caches is challenging. In this paper, we present LastingNVCache, a technique for improving the cache lifetime by mitigating the intra-set write variation. LastingNVCache works on the key idea that, by periodically flushing a frequently-written data item, the data can be made to load into a cold block in the set the next time it is accessed. Through this, future writes to that data item can be redirected from a hot block to a cold block, which leads to an improvement in the cache lifetime. Microarchitectural simulations have shown that for single, dual and quad-core systems, LastingNVCache provides 6.36X, 9.79X, and 10.94X improvement in lifetime, respectively. Also, its implementation overhead is small and it outperforms two recently proposed techniques for improving the lifetime of NVM caches.
Article
Due to the technological advancements in the last few decades, several applications have emerged that demand more computing power and larger on-chip and off-chip memories. However, the scaling of memory technologies is not on par with the computing throughput of modern-day multi-core processors. Conventional memory technologies such as SRAM and DRAM have technological limitations in meeting large on-chip memory requirements owing to their low packaging density and high leakage power. In order to meet the ever-increasing demand for memory, researchers came up with alternative solutions such as emerging non-volatile memory technologies, including STT-RAM, PCM and ReRAM. However, these memory technologies have limited write endurance and high write energy. This emphasizes the need for a policy that will reduce the writes or distribute the writes uniformly across the memory, thereby enhancing its lifetime by delaying the early wear-out of memory cells due to frequent writes. We propose two techniques, Enhanced-Virtually Split Cache (E-ViSC) and Protean-Virtually Split Cache (P-ViSC), which dynamically adjust the cache configuration to distribute the writes uniformly across the memory and enhance the lifetime. Experimental studies show that E-ViSC and P-ViSC improve the lifetime of NVM L2 caches by up to 2.31x and 1.97x, respectively.