FIG 7 - uploaded by Kerem Çamsarı
4-bit Ripple Carry Adder (RCA): A 4-bit adder is implemented using 3 Full Adders and a Half Adder. A schematic and a block diagram are shown in (a) and (b). We assign each p-bit a separate retention time τ_N, with a normal distribution shown in the inset. (c-d) When the inputs are clamped to A = 10 and B = 13, the output S is 23. (e-f) In the inverted mode the output is clamped to S = 23, and A and B fluctuate through all 8 combinations of 4-bit binary inputs that satisfy A+B=S=23.
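As a functional reference for the forward mode described in the caption, a conventional 4-bit ripple carry adder built from one half adder and three full adders can be sketched in plain Python. This is only the deterministic Boolean equivalent of the block diagram in Fig. 7(b), not a model of the p-bit hardware:

```python
# Deterministic sketch of the RCA topology in Fig. 7(b): one half adder
# for the least significant bit, then three full adders in a carry chain.

def half_adder(a, b):
    return a ^ b, a & b            # sum bit, carry bit

def full_adder(a, b, cin):
    s1, c1 = half_adder(a, b)
    s2, c2 = half_adder(s1, cin)
    return s2, c1 | c2             # sum bit, carry-out

def rca4(a, b):
    bits_a = [(a >> i) & 1 for i in range(4)]
    bits_b = [(b >> i) & 1 for i in range(4)]
    s0, c = half_adder(bits_a[0], bits_b[0])   # stage 0: half adder
    out = [s0]
    for i in range(1, 4):                      # stages 1-3: full adders
        si, c = full_adder(bits_a[i], bits_b[i], c)
        out.append(si)
    out.append(c)                              # final carry is the 5th sum bit
    return sum(bit << i for i, bit in enumerate(out))

print(rca4(10, 13))  # 23, matching panels (c-d)
```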


Source publication
Article
Full-text available
The common feature of nearly all logic and memory devices is that they make use of stable units to represent 0's and 1's. A completely different paradigm is based on three-terminal stochastic units which could be called "p-bits", where the output is a random telegraphic signal continuously fluctuating between 0 and 1 with a tunable mean. p-bits can...

Contexts in source publication

Context 1
... build more complex systems, one possible approach is to design the entire system as a single Boltzmann Machine, but the reversible nature of Boltzmann Machines can hinder the correct operation of such systems [5]. A more practical alternative is to interconnect simpler Boltzmann Machines with directed connections to build up more complex systems such as a 4-bit Ripple Carry Adder (RCA) (Fig. 7(a)) or a 4-bit multiplier/factorizer (Fig. 8(a)). ...
Context 2
... to disconnecting the input voltage of p-bit "i" from its native weight logic and connecting to it the output voltage of p-bit "j" from a different Boltzmann Machine, so that J_ij = 1 and J_ji = 0. Consider the case of a 4-bit adder built using a Half Adder and 3 Full Adders. In this case there are 3 directed connections, as shown in Fig. 7(a). Each connection takes the output voltage C_OUT of the (n−1)th adder and connects it to the input terminal C_IN of the nth adder. Due to this connection scheme, no information can flow from the nth adder to the (n−1)th adder, which makes the system no longer bidirectional. However, as noted in [5], bidirectional connections ...
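The asymmetric coupling described above can be made concrete with a toy weight matrix. The p-bit indices below are purely illustrative, not the paper's actual wiring:

```python
# Toy sketch of a directed connection: p-bit j's output feeds p-bit i's
# input (J[i][j] = 1) while nothing flows back (J[j][i] = 0).
n = 4                        # hypothetical number of p-bits
J = [[0] * n for _ in range(n)]
cout_prev, cin_next = 1, 2   # hypothetical indices for C_OUT and C_IN p-bits
J[cin_next][cout_prev] = 1   # information flows (n-1)th adder -> nth adder
# J[cout_prev][cin_next] stays 0: the link is one-way, so the pair of
# Boltzmann Machines is no longer bidirectional.
print(J[cin_next][cout_prev], J[cout_prev][cin_next])  # 1 0
```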
Context 3
... Adder: We next demonstrate the correct operation of a 4-bit RCA composed of 48 p-bits, each having a different τ_N as shown in the inset of Fig. 7(d). The values of τ_N are normally distributed around an average of 200 ms, with a minimum of 137 ms and a maximum of 263 ms. 4-bit binary addition is performed by clamping the input p-bits of each adder, as demonstrated by the time evolution of the sum shown in Fig. 7(c) with A=10 and B=13, resulting in the sum being 23 when converted to ...
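The retention-time assignment stated above can be sketched as a sampling step. The standard deviation and the clipping to the quoted minimum/maximum are our assumptions (the excerpt only gives the mean and the observed range):

```python
import random

# Hedged sketch: 48 retention times tau_N drawn from a normal distribution
# with mean 200 ms. The sigma of ~21 ms is an assumption chosen so that the
# quoted extremes (137 ms, 263 ms) sit near +/-3 sigma; the clipping below
# is also our assumption, not stated in the excerpt.
random.seed(1)
taus = [min(max(random.gauss(200, 21), 137), 263) for _ in range(48)]
print(round(min(taus)), round(max(taus)), round(sum(taus) / len(taus)))
```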
Context 5
... of each of the adders being clamped to S=23, with A and B left floating. In this case, A and B fluctuate among the 8 possible integer combinations that satisfy A+B=23. Note that since A and B are 4-digit binary numbers, not all integer combinations can be probed by the system, for example A=22 and B=1. This can be seen from the histogram presented in Fig. 7(f). Although there are 8 peaks in the histogram, the heights of the peaks are not equal, since the statistics presented in Fig. 7(f) are not exactly at steady state. With 48 p-bits in the system, the number of samples needed for steady-state statistics is prohibitively ...
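The count of 8 combinations follows directly from the 4-bit range constraint, which a one-line enumeration confirms:

```python
# With 4-bit A and B, the pairs satisfying A + B = 23 are exactly those
# with A in 8..15, so that B = 23 - A also fits in 4 bits: 8 combinations.
# Pairs like (22, 1) are excluded because 22 does not fit in 4 bits.
S = 23
pairs = [(a, S - a) for a in range(16) if 0 <= S - a <= 15]
print(pairs)       # [(8, 15), (9, 14), ..., (15, 8)]
print(len(pairs))  # 8
```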

Citations

... The third is the full adder, which takes two 1-bit numbers and a carry bit and outputs the resulting sum plus a carry bit. The truth tables associated with the three gates are the possible spin configurations of the corresponding Ising Hamiltonian's ground state, with three, four, and five spins respectively [38][39][40]. In each case, both the bits on the input and output sides of the binary logic truth table are represented by OPOs in the same network. ...
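For the five-spin case, the ground-state set is simply the full-adder truth table over (a, b, cin, sum, cout). The enumeration below is an illustrative sketch of that set, not the cited Hamiltonian itself:

```python
from itertools import product

# The 8 valid full-adder configurations out of 2^5 = 32 five-bit states:
# an invertible-logic Ising Hamiltonian is designed so that exactly these
# configurations form its ground state.
truth_table = [
    (a, b, cin, (a + b + cin) & 1, (a + b + cin) >> 1)
    for a, b, cin in product((0, 1), repeat=3)
]
print(len(truth_table))  # 8
```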
Preprint
Full-text available
Optical computing often employs tailor-made hardware to implement specific algorithms, trading generality for improved performance in key aspects like speed and power efficiency. An important computing approach that is still missing its corresponding optical hardware is probabilistic computing, used e.g. for solving difficult combinatorial optimization problems. In this study, we propose an experimentally viable photonic approach to solve arbitrary probabilistic computing problems. Our method relies on the insight that coherent Ising machines composed of coupled and biased optical parametric oscillators can emulate stochastic logic. We demonstrate the feasibility of our approach by using numerical simulations equivalent to the full density matrix formulation of coupled optical parametric oscillators.
... [1][2][3][4][5] Such systems promise excellent energy efficiency with real-time processing capabilities and advanced cognitive abilities, surpassing conventional digital computing. [6][7][8][9][10][11][12] Consequently, the development of specialized hardware and algorithms for neuromorphic computing has gained significant momentum, with Stochastic Magnetic Tunnel Junctions (SMTJs) emerging as a particularly promising candidate. SMTJs, composed of two ferromagnetic layers separated by an insulating layer, display stochastic resistance states owing to the rapid switching of the free layer, [13][14][15][16] which makes them ideal for implementing stochastic synapses or neurons in neuromorphic computing. ...
... SMTJs, composed of two ferromagnetic layers separated by an insulating layer, display stochastic resistance states owing to the rapid switching of the free layer, [13][14][15][16] which makes them ideal for implementing stochastic synapses or neurons in neuromorphic computing. [6][7][8][9][10][11][12] When compared to other forms of artificial synapses or neurons, SMTJs excel in terms of speed, 15 power consumption, 11,17 and scalability. 18 Their implementation in various neuromorphic computing systems, including spiking neural networks and Boltzmann machines, [19][20][21] has demonstrated efficacy in tasks such as invertible logic, [6][7][8][9] image classification, 12 accelerating Monte Carlo simulation, 11 and solving Ising models. ...
... [6][7][8][9][10][11][12] When compared to other forms of artificial synapses or neurons, SMTJs excel in terms of speed, 15 power consumption, 11,17 and scalability. 18 Their implementation in various neuromorphic computing systems, including spiking neural networks and Boltzmann machines, [19][20][21] has demonstrated efficacy in tasks such as invertible logic, [6][7][8][9] image classification, 12 accelerating Monte Carlo simulation, 11 and solving Ising models. 10 A distinctive advantage of SMTJs lies in their suitability for probability programming, a powerful framework for reasoning about uncertainty and making predictions based on incomplete or noisy data. ...
Article
Stochastic Magnetic Tunnel Junctions (SMTJs) emerge as a promising candidate for neuromorphic computing. The inherent stochasticity of SMTJs makes them ideal for implementing stochastic synapses or neurons in neuromorphic computing. However, the stochasticity of SMTJs may impair the performance of neuromorphic systems. In this study, we conduct a systematic examination of the influence of three stochastic effects (shift, change of slope, and broadening) on the sigmoid activation function. We further explore the implications of these effects on the reconstruction performance of Restricted Boltzmann Machines (RBMs). We find that the trainability of RBMs is robust against the three stochastic effects. However, reconstruction error is strongly related to the three stochastic effects in SMTJs-based RBMs. Significant reconstruction error is found when the stochastic effect is strong. Last, we identify the correlation of the reconstruction error with each stochastic factor. Our results might help develop more robust neuromorphic systems based on SMTJs.
... In the graph, the weight associated with each edge, which can be either −1 or +1, is symbolized by the variable J. This variable is crucial in the MAX-CUT problem. A combinatorial optimization problem is represented by an Ising model that corresponds to an energy (Hamiltonian). ...
Article
Full-text available
This article critically investigates the limitations of the simulated annealing algorithm using probabilistic bits (pSA) in solving large-scale combinatorial optimization problems. The study begins with an in-depth analysis of the pSA process, focusing on the issues resulting from unexpected oscillations among p-bits. These oscillations hinder the energy reduction of the Ising model and thus obstruct the successful execution of pSA in complex tasks. Through detailed simulations, we unravel the root cause of this energy stagnation, identifying the feedback mechanism inherent to the pSA operation as the primary contributor to these disruptive oscillations. To address this challenge, we propose two novel algorithms, time average pSA (TApSA) and stalled pSA (SpSA). These algorithms are designed based on partial deactivation of p-bits and are thoroughly tested using Python simulations on maximum cut benchmarks that are typical combinatorial optimization problems. On the 16 benchmarks from 800 to 5000 nodes, the proposed methods improve the normalized cut value from 0.8 to 98.4% on average in comparison with the conventional pSA.
... Various p-bit designs are available for constructing probabilistic computers suited to different computational problems [28]-[30], including microcontrollers [31], MTJ-based [1], [5], [10], [32], CMOS-based [2], FPGA-based [22], and other emerging probabilistic devices [33], [34]. The general behavior of a p-bit is characterized by a sigmoidal relation [1]:
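The equation itself is cut off by the excerpt. The form commonly quoted in the p-bit literature (e.g. by Camsari et al.) is m_i = sgn[tanh(I_i) + rand(−1, 1)], whose time average follows tanh(I_i); the sketch below assumes that form:

```python
import math
import random

# Hedged sketch of the widely quoted p-bit update rule,
#     m_i = sgn( tanh(I_i) + rand(-1, 1) ),
# assumed here because the excerpt truncates the equation. Averaged over
# many samples, the output traces the sigmoid tanh(I_i).

def p_bit(I):
    """One stochastic sample (+1 or -1) of a p-bit with input I."""
    return 1 if math.tanh(I) + random.uniform(-1, 1) > 0 else -1

samples = [p_bit(1.0) for _ in range(100_000)]
print(sum(samples) / len(samples))  # close to tanh(1.0) ≈ 0.762
```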
Article
Full-text available
Inspired by many-body effects, we propose a novel design for Boltzmann machine-based invertible logic using probabilistic bits. A CMOS-based XNOR gate is derived to serve as the hardware implementation of many-body interactions and an invertible logic family is built based on this design. Compared to the conventional two-body-based design framework, the many-body-based design enables compact configuration and provides the simplest binarized energy landscape for fundamental invertible logic gates. Furthermore, we demonstrate the composability of the many-body-based invertible logic circuit by merging modular building blocks into large-scale integer factorizers. To optimize the energy landscape of large-scale combinatorial invertible logic circuits, we introduce degeneracy in energy levels which enlarges the probabilities for the lowest states. Circuit simulations of our integer factorizers reveal a significant boost in factorization accuracy. An example of a 2-bit × 2-bit integer factorizer demonstrated an increment of factorization accuracy from 64.99% to 91.44% with a reduction in the number of energy levels from 32 to 9. Similarly, our 6-bit × 6-bit integer factorizer increases the accuracy from 4.430% to 83.65% with the many-body design. Overall, the many-body-based design scheme provides promising results for future invertible logic circuit designs.
... Therefore, previous Ising machines consumed significant area, hardware design time, and routing resources. Moreover, these machines require reprogramming using an additional deterministic computer that formulates the hardware connections of the Ising machine before solving problems [10][11][12][13][14][15][16][17][18][19][20][21][22][23][24]. Furthermore, because these machines use hardware in one-to-one correspondence with their graph models, the size of the problem that can be solved is strictly limited by the number of p-bits. ...
... The probabilistic annealing consists of performing parallel updates and controlling dynamic system-significant p-bit (SSPB). In previous works, the Ising machines reached the global minimum with sequential updating 10,[14][15][16][17][18][19][20][21][22][23][24] or parallel updating [11][12][13] , as shown in Fig. 3a. When a fully connected Ising machine is operated with the simulated annealing process, a single p-bit is updated sequentially due to its hardware connection. ...
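The sequential-update constraint mentioned above can be sketched as one Gibbs sweep over a toy Ising model. The three-spin size, couplings, and β value are illustrative assumptions, not the cited machine's parameters:

```python
import math
import random

# Toy sketch of sequential updating: spins are visited one at a time, so
# each update sees the latest state of its neighbours. This is the ordering
# requirement that limits parallelism in fully connected Ising machines.

def sequential_sweep(spins, J, beta=1.0):
    n = len(spins)
    for i in range(n):                       # one p-bit at a time
        field = sum(J[i][j] * spins[j] for j in range(n) if j != i)
        p_up = 1.0 / (1.0 + math.exp(-2.0 * beta * field))
        spins[i] = 1 if random.random() < p_up else -1
    return spins

random.seed(0)
J = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]        # toy ferromagnetic couplings
spins = [random.choice((-1, 1)) for _ in range(3)]
print(sequential_sweep(spins, J))
```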
Article
Full-text available
Probabilistic computing has been introduced to operate functional networks using a probabilistic bit (p-bit), broadening the computational abilities in non-deterministic polynomial searching operations. However, previous developments have focused on emulating the operation of quantum computers similarly, implementing every p-bit with large weight-sum matrix multiplication blocks and requiring tens of times more p-bits than semiprime bits. In addition, operations based on a conventional simulated annealing scheme required a large number of sampling operations, which deteriorated the performance of the Ising machines. Here we introduce a prime factorization machine with a virtually connected Boltzmann machine and probabilistic annealing method, which are designed to reduce the hardware complexity and number of sampling operations. From 10-bit to 64-bit prime factorizations were performed, and the machine offers up to 1.2 × 10⁸ times improvement in the number of sampling operations compared with previous factorization machines, with a 22-fold smaller hardware resource.
... There is also the potential for hardware cost reduction through multiple reuse of invertible logic circuits for different purposes. More importantly, invertible logic circuits are of immense interest in fields such as cryptography and computer graphics, where reversible computing has demonstrated serious promise [1][2][3][4][5][6][7][8]. ...
Article
Full-text available
Invertible logic is a powerful new unconventional computing paradigm, providing bidirectional operations between inputs and outputs. It has found applications in important critical problems, such as integer factorization and machine learning. Here we propose a network of interconnected nonlinear systems that serve as our probabilistic bits (“p-bits”) to implement invertible logic in the presence of a noise floor. In the forward (or directed) mode, the inputs are fixed in our network, yielding outputs in accordance with and, or, nand, and nor logic functions. In the reverse (inverted) mode the output is clamped in the network, and the input nodes fluctuate among all possible logical input values consistent with the different logic functions. So the system acts as a unique invertible logic circuit by exploiting the probabilistic transitions between the dynamical states of the coupled noisy nonlinear systems. Interestingly, both the directed and the inverted mode are most robust and reliable in an optimal band of moderate noise, reminiscent of stochastic resonance. The concept is verified in proof-of-principle electronic circuit experiments, demonstrating the robustness of the architecture and the potential of this idea to be realized in a wide range of physical situations.
... With the culmination of Moore's Law and the advent of the big data era, traditional deterministic computing is facing challenges, particularly the memory wall. Stochastic p-bits have emerged as powerful tools for addressing Non-deterministic Polynomial-hard (NP-hard) problems, reversible reasoning, and neural network computing, and are poised to become the next generation of intelligent computing devices [1][2][3][4][5][6][7][8][9][10][11][12][13]. ...
Preprint
Stochastic p-Bit devices play a pivotal role in solving NP-hard problems, neural network computing, and hardware accelerators for algorithms such as simulated annealing. In this work, we focus on Stochastic p-Bits based on high-barrier magnetic tunnel junctions (HB-MTJs) with identical stack structure and cell geometry, but employing different spin-orbit torque (SOT) switching schemes. We conducted a comparative study of their switching probability as a function of pulse amplitude and width of the applied voltage. Through experimental and theoretical investigations, we have observed that the Y-type SOT-MTJs exhibit the gentlest dependence of the switching probability on the external voltage. This characteristic indicates superior tunability in randomness and enhanced robustness against external disturbances when Y-type SOT-MTJs are employed as stochastic p-Bits. Furthermore, the random numbers generated by these Y-type SOT-MTJs, following XOR pretreatment, have successfully passed the National Institute of Standards and Technology (NIST) SP800-22 test. This comprehensive study demonstrates the high performance and immense potential of Y-type SOT-MTJs for the implementation of stochastic p-Bits.
... Even though the example shown here starts from a low-density graph, the sparsification algorithm we give is general and applicable to any graph. t_synapse ≪ ⟨T_p-bit⟩ to avoid information loss and reach the correct steady-state distribution [101, 54]. ...
Article
Full-text available
The transistor celebrated its 75th birthday in 2022. The continued scaling of the transistor defined by Moore’s Law continues, albeit at a slower pace. Meanwhile, computing demands and energy consumption required by modern artificial intelligence (AI) algorithms have skyrocketed. As an alternative to scaling transistors for general-purpose computing, the integration of transistors with unconventional technologies has emerged as a promising path for domain-specific computing. In this article, we provide a full-stack review of probabilistic computing with p-bits as a representative example of the energy-efficient and domain-specific computing movement. We argue that p-bits could be used to build energy-efficient probabilistic systems, tailored for probabilistic algorithms and applications. From hardware, architecture, and algorithmic perspectives, we outline the main applications of probabilistic computers ranging from probabilistic machine learning and AI to combinatorial optimization and quantum simulation. Combining emerging nanodevices with the existing CMOS ecosystem will lead to probabilistic computers with orders of magnitude improvements in energy efficiency and probabilistic sampling, potentially unlocking previously unexplored regimes for powerful probabilistic algorithms.
... The primary difficulty is the serial updating requirement of connected p-bits, prohibiting the parallelization of updates in dense networks. The second difficulty is to ensure p-bits receive all the latest information from their neighbors before updating, otherwise, the network does not sample from the true Boltzmann distribution [19,20]. ...
Preprint
Full-text available
The slowing down of Moore's law has driven the development of unconventional computing paradigms, such as specialized Ising machines tailored to solve combinatorial optimization problems. In this paper, we show a new application domain for probabilistic bit (p-bit) based Ising machines by training deep generative AI models with them. Using sparse, asynchronous, and massively parallel Ising machines we train deep Boltzmann networks in a hybrid probabilistic-classical computing setup. We use the full MNIST dataset without any downsampling or reduction in hardware-aware network topologies implemented in moderately sized Field Programmable Gate Arrays (FPGA). Our machine, which uses only 4,264 nodes (p-bits) and about 30,000 parameters, achieves the same classification accuracy (90%) as an optimized software-based restricted Boltzmann Machine (RBM) with approximately 3.25 million parameters. Additionally, the sparse deep Boltzmann network can generate new handwritten digits, a task the 3.25 million parameter RBM fails at despite achieving the same accuracy. Our hybrid computer takes a measured 50 to 64 billion probabilistic flips per second, which is at least an order of magnitude faster than superficially similar Graphics and Tensor Processing Unit (GPU/TPU) based implementations. The massively parallel architecture can comfortably perform the contrastive divergence algorithm (CD-n) with up to n = 10 million sweeps per update, beyond the capabilities of existing software implementations. These results demonstrate the potential of using Ising machines for traditionally hard-to-train deep generative Boltzmann networks, with further possible improvement in nanodevice-based realizations.
... Commonly used small building blocks are invertible NOT, invertible AND, and invertible OR gates, invertible half adders, and invertible full adders. Complicated networks consisting of these building blocks have been demonstrated to find applications in solving IF [17,84,161,165,168] and SAT [168,169], training neural networks [170], and machine learning [171,172]. The equal footing [173] of every p-bit in invertible logic is the underlying reason for the bidirectional operations of BM-based invertible logic. Consequently, when designing small BM-based invertible logic, careful design of J and h is necessary. ...
... Apart from stochastic MTJ-based p-bits, CMOS-based p-bit designs have also been extensively studied. Before the MTJ-based implementation of p-bits, Pervaiz et al. [173] used microcontrollers to emulate p-bits. The sigmoidal electrical response of p-bits is programmed into the microcontrollers. ...
Article
Full-text available
The conventional computing method based on the von Neumann architecture is limited by a series of problems such as high energy consumption, finite data exchange bandwidth between processors and storage media, etc., and it is difficult to achieve higher computing efficiency. A more efficient unconventional computing architecture is urgently needed to overcome these problems. Neuromorphic computing and stochastic computing have been considered to be two competitive candidates for unconventional computing, due to their extraordinary potential for energy-efficient and high-performance computing. Although conventional electronic devices can mimic the topology of the human brain, these require high power consumption and large area. Spintronic devices represented by magnetic tunnel junctions (MTJs) exhibit remarkable high-energy efficiency, non-volatility, and similarity to biological nervous systems, making them one of the promising candidates for unconventional computing. In this work, we review the fundamentals of MTJs as well as the development of MTJ-based neurons, synapses, and probabilistic-bit. In the section on neuromorphic computing, we review a variety of neural networks composed of MTJ-based neurons and synapses, including multilayer perceptrons, convolutional neural networks, recurrent neural networks, and spiking neural networks, which are the closest to the biological neural system. In the section on stochastic computing, we review the applications of MTJ-based p-bits, including Boltzmann machines, Ising machines, and Bayesian networks. Furthermore, the challenges to developing these novel technologies are briefly discussed at the end of each section.