Proposed N-bit ripple carry adder (RCA). (a) The structure of the N-bit RCA p-circuit using two multiplexing strategies; (b) the structure of the FA using time-division multiplexing; (c) the calculation process in the weight-matrix.

Source publication
Article
Full-text available
Probabilistic computing is an emerging computational paradigm that uses probabilistic circuits to efficiently solve optimization problems such as invertible logic, where traditional digital computations are difficult to solve. This paper proposes a true random number generator (TRNG) based on resistive random-access memory (RRAM), which is combined...

Contexts in source publication

Context 1
... the N-bit ripple carry adder (RCA) as an example, in which we adopt the two multiplexing strategies. The first multiplexing strategy is applied to the basic unit full adder (FA) of RCA as shown in Figure 5b,c. Usually, it takes five p-bits to construct an FA, where A, B, and CI are the inputs of the FA, and S and CO are the outputs of the FA. ...
Context 2
... multiplexing strategy not only performs serial updates naturally but also greatly reduces the number of p-bits. We use two MUXs to achieve p-bit time-division multiplexing as shown in Figure 5b. The signal to control the MUX is generated by the weight-matrix module, and the concrete operation process is shown in Figure 5c. ...
Context 3
... use two MUXs to achieve p-bit time-division multiplexing as shown in Figure 5b. The signal to control the MUX is generated by the weight-matrix module, and the concrete operation process is shown in Figure 5c. Starting with state 1, the circuit accomplishes two things in each state: first, the input of the corresponding p-bit is calculated from the interconnect coefficients (J_ij), the external bias (h_i), and the p-bit outputs (m_i); second, the corresponding control signal is set to 1 to update that p-bit. ...
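The per-state operation described above — compute one p-bit's input from the interconnect coefficients J_ij, the bias h_i, and the other p-bit outputs m_j, then stochastically refresh that single p-bit — matches the standard p-bit update rule from the probabilistic-computing literature. A minimal Python sketch under that assumption (the J and h values here are zero placeholders, not the paper's actual FA coefficients):

```python
import numpy as np

rng = np.random.default_rng(0)

def pbit_update(i, m, J, h, beta=1.0):
    """Serially update p-bit i: compute its input from the weight
    matrix, then draw a new bipolar state (+1 / -1) stochastically."""
    I_i = h[i] + J[i] @ m                      # input from J_ij and h_i
    p_up = 0.5 * (1.0 + np.tanh(beta * I_i))   # P(m_i = +1)
    m[i] = 1 if rng.random() < p_up else -1
    return m

# Illustrative 5-p-bit circuit (A, B, CI, S, CO of one FA).
# J and h are placeholders, NOT the paper's coefficients.
n = 5
J = np.zeros((n, n))
h = np.zeros(n)
m = rng.choice([-1, 1], size=n)

# One full FA update: states 1..5, one p-bit per state (Figure 5c).
for state in range(n):
    m = pbit_update(state, m, J, h)
```

With time-division multiplexing, only one p-bit is evaluated per state, which is exactly the serial (one-at-a-time) update that Gibbs-style p-circuit sampling requires.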
Context 4
... FA needs to go through five states to complete an update of all bits. The second multiplexing strategy is applied to the N-bit RCA as shown in Figure 5a, and the update order is from FA1 to FAn. Although the multiplexing strategy increases the operation time, this is acceptable for statistics-based probabilistic computing in exchange for the reduced hardware consumption. ...
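Under the two strategies combined, a full update of the N-bit RCA reduces to a nested serial schedule — FA1 through FAn, and five p-bit states within each FA — so one complete sweep takes 5N state steps. A small sketch of that schedule (the function name is illustrative, not from the paper):

```python
def rca_sweep_schedule(n_fas, states_per_fa=5):
    """Return the serial update order for the N-bit RCA:
    FA1..FAn in order, and within each FA, states 1..5
    (one p-bit updated per state)."""
    return [(fa, state)
            for fa in range(1, n_fas + 1)
            for state in range(1, states_per_fa + 1)]

schedule = rca_sweep_schedule(4)  # 4 FAs x 5 states = 20 serial steps
```

The trade-off the excerpt names is visible here: hardware shrinks from 5N p-bits to the multiplexed few, at the cost of a sweep time linear in 5N.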
Context 5
... section shows simulation results of the invertible AND gate, FA, 16-bit RCA, and 4-bit multiplier. The implementation of the invertible circuits in this section is based on the mathematical description of the p-bit and the coupling relationship between p-bits. ...

Similar publications

Article
Full-text available
The in-memory computing paradigm aims at overcoming the intrinsic inefficiencies of von Neumann computers by reducing the data transport per arithmetic operation. Crossbar arrays of multilevel memristive devices enable efficient calculation of matrix-vector multiplications, an operation extensively called on in artificial intelligence (AI) tasks....

Citations

... [18,76,77] (STT) and in Ref. [78] (SOT). While many other implementations of p-bits are possible, from molecular nanomagnets [79] to diffusive memristors [80], RRAM [81], perovskite nickelates [82] and others, two additional advantages of the MRAM-based p-bits are the proven manufacturability (up to billion bit densities) and the amplification of room temperature noise. Even with the thermal energy of kT in the environment, magnetic switching causes large resistance fluctuations in MTJs, creating hundreds of millivolts of change in resistive dividers [70]. ...
Article
Full-text available
The transistor celebrated its 75th birthday in 2022. The continued scaling of the transistor defined by Moore’s Law continues, albeit at a slower pace. Meanwhile, computing demands and energy consumption required by modern artificial intelligence (AI) algorithms have skyrocketed. As an alternative to scaling transistors for general-purpose computing, the integration of transistors with unconventional technologies has emerged as a promising path for domain-specific computing. In this article, we provide a full-stack review of probabilistic computing with p-bits as a representative example of the energy-efficient and domain-specific computing movement. We argue that p-bits could be used to build energy-efficient probabilistic systems, tailored for probabilistic algorithms and applications. From hardware, architecture, and algorithmic perspectives, we outline the main applications of probabilistic computers ranging from probabilistic machine learning and AI to combinatorial optimization and quantum simulation. Combining emerging nanodevices with the existing CMOS ecosystem will lead to probabilistic computers with orders of magnitude improvements in energy efficiency and probabilistic sampling, potentially unlocking previously unexplored regimes for powerful probabilistic algorithms.
... No graph coloring or engineered phase shifting is required. memristors [76], RRAM [77], perovskite nickelates [78] and others, two additional advantages of the MRAM-based p-bits are the proven manufacturability (up to billion bit densities) and the amplification of room temperature noise. Even with the thermal energy of kT in the environment, magnetic switching causes large resistance fluctuations in MTJs, creating hundreds of millivolts of change in resistive dividers [66]. ...
Preprint
Full-text available
The transistor celebrated its 75th birthday in 2022. The continued scaling of the transistor defined by Moore's Law continues, albeit at a slower pace. Meanwhile, computing demands and energy consumption required by modern artificial intelligence (AI) algorithms have skyrocketed. As an alternative to scaling transistors for general-purpose computing, the integration of transistors with unconventional technologies has emerged as a promising path for domain-specific computing. In this article, we provide a full-stack review of probabilistic computing with p-bits as a representative example of the energy-efficient and domain-specific computing movement. We argue that p-bits could be used to build energy-efficient probabilistic systems, tailored for probabilistic algorithms and applications. From hardware, architecture, and algorithmic perspectives, we outline the main applications of probabilistic computers ranging from probabilistic machine learning and AI to combinatorial optimization and quantum simulation. Combining emerging nanodevices with the existing CMOS ecosystem will lead to probabilistic computers with orders of magnitude improvements in energy efficiency and probabilistic sampling, potentially unlocking previously unexplored regimes for powerful probabilistic algorithms.
Article
Stochastic magnetic tunnel junctions (SMTJs) using low-barrier nanomagnets have shown promise as fast, energy-efficient, and scalable building blocks for probabilistic computing. Despite recent experimental and theoretical progress, SMTJs exhibiting the ideal characteristics necessary for probabilistic bits (p-bits) are still lacking. Ideally, SMTJs should have (a) voltage-bias independence, preventing read disturbance; (b) uniform randomness in the magnetization angle between the two magnetic layers; and (c) fast fluctuations without requiring external magnetic fields, while being robust to magnetic field perturbations. Here, we propose a design that satisfies all of these requirements, using double-free-layer SMTJs with synthetic antiferromagnets (SAFs). We evaluate the proposed SMTJ design with experimentally benchmarked spin-circuit models accounting for transport physics, coupled with the stochastic Landau-Lifshitz-Gilbert equation for magnetization dynamics. We find that the use of low-barrier SAF layers reduces dipolar coupling, achieving uncorrelated fluctuations at zero magnetic field that survive up to diameters exceeding D ≈ 100 nm if the nanomagnets can be made thin enough (≈1–2 nm). The double-free-layer structure retains bias independence, and the circular nature of the nanomagnets provides near-uniform randomness with fast fluctuations. Combining our full SMTJ model with advanced transistor models, we estimate the energy to generate a random bit to be about 3.6 fJ, with fluctuation rates of about 3.3 GHz per p-bit. Our results will guide the experimental development of superior stochastic magnetic tunnel junctions for large-scale, energy-efficient probabilistic computation for problems relevant to machine learning and artificial intelligence.
Article
Full-text available
Extending Moore’s law by augmenting complementary-metal-oxide semiconductor (CMOS) transistors with emerging nanotechnologies (X) has become increasingly important. One important class of problems involve sampling-based Monte Carlo algorithms used in probabilistic machine learning, optimization, and quantum simulation. Here, we combine stochastic magnetic tunnel junction (sMTJ)-based probabilistic bits (p-bits) with Field Programmable Gate Arrays (FPGA) to create an energy-efficient CMOS + X (X = sMTJ) prototype. This setup shows how asynchronously driven CMOS circuits controlled by sMTJs can perform probabilistic inference and learning by leveraging the algorithmic update-order-invariance of Gibbs sampling. We show how the stochasticity of sMTJs can augment low-quality random number generators (RNG). Detailed transistor-level comparisons reveal that sMTJ-based p-bits can replace up to 10,000 CMOS transistors while dissipating two orders of magnitude less energy. Integrated versions of our approach can advance probabilistic computing involving deep Boltzmann machines and other energy-based learning algorithms with extremely high throughput and energy efficiency.