(a) Classical structure of the conventional von Neumann architecture. The von Neumann bottleneck is caused by frequent data transport between the CPU and memory. (b) Typical structure of a memristor-based in-memory computing architecture, which combines computation and storage. (c) Operation mode in the new architecture.


Source publication
Article
Full-text available
Memristive stateful logic has emerged as a promising next-generation in-memory computing paradigm to address escalating computing-performance pressures in traditional von Neumann architecture. Here, we present a nonvolatile reprogrammable logic method that can process data between different rows and columns in a memristive crossbar array based on m...

Similar publications

Article
Full-text available
In‐memory computing enabled by advanced nonvolatile memory technologies, such as memristors and memristive devices, emerges as a promising approach to accelerate certain data‐intensive algorithms, and thus outperforms the von Neumann computing in terms of processing latency and energy efficiency. In this work, an efficient method to calculate the H...

Citations

... literature [3,26,27,40], typically a 1-bit FA operation can be implemented using from 6 up to 9 RRAM devices and sequences of IMPLY and FALSE operations whose length varies from as low as 23 steps, when the inputs are overwritten, up to 136 steps when the sequence is not optimized. To further reduce the number of computing steps, the multi-input stateful material implication framework was proposed by Siemon et al. in [39], showing that by using three-input IMPLY operations the number of computing steps can be reduced to 19 when using 8 RRAM devices. ...
Chapter
Full-text available
In-memory computing hardware accelerators for binarized neural networks based on resistive RAM (RRAM) memory technologies represent a promising solution for enabling the execution of deep neural network algorithms on resource-constrained devices at the edge of the network. However, the intrinsic stochasticity and nonidealities of RRAM devices can easily lead to unreliable circuit operations if not appropriately considered during the design phase. In this chapter, analysis and design methodologies enabled by RRAM physics-based compact models of LIM and mixed-signal BNN inference accelerators are discussed. As a use case example, the UNIMORE RRAM physics-based compact model calibrated on an RRAM technology from the literature, is used to determine the performance vs. reliability trade-offs of different in-memory computing accelerators: i) a logic-in-memory accelerator based on the material implication logic, ii) a mixed-signal BNN accelerator, and iii) a hybrid accelerator enabling both computing paradigms on the same array. Finally, the performance of the three accelerators on a BNN inference task is compared and benchmarked with the state of the art.
... In the proposed CIM scheme, the computed result is directly stored in the 3T1M cell, effectively eliminating the need for a writeback operation. This reduces the delay compared with the previous works in [9], [17], [18], [42], [51], [52], and [53]. A related study [26] demonstrated faster calculation speeds during logic operations. ...
Article
Full-text available
Silicon-based semiconductor transistors are approaching their physical limits due to shrinking feature sizes. Simultaneously, traditional silicon-based von Neumann architectures exhibit significant latency and power consumption issues in data-centric applications, such as the Internet of Things and artificial intelligence. To tackle these challenges, this study introduces a novel approach: Magnetoresistance Random Access Memory (MRAM) computing in-memory (CIM) using gate-all-around carbon nanotube field-effect transistors (GAA-CNTFET). The proposed MRAM array comprises three-transistor, one perpendicular magnetic anisotropy spin-orbit torque magnetic tunnel junction (p-SOT-MTJ) (3T1M) cells and achieves full-array Boolean logic operations and half/full-adder operations. The calculated results can be stored in situ during the computing phase without requiring additional peripheral circuits. A 16 Kb MRAM was simulated in both GAA-CNTFET/p-SOT-MTJ and 14-nm FinFET/p-SOT-MTJ technologies to examine the effectiveness of the proposed design. Compared to its 14-nm FinFET/p-SOT-MTJ counterpart, the write and computing latencies of the GAA-CNTFET/p-SOT-MTJ CIM macro were reduced by approximately 21% and 20.6%, respectively, while the read and computing energy consumption were reduced by approximately 45.3% and 24.7%, respectively. Moreover, the proposed in-memory Boolean logic throughput was 8192 GOPS, which was approximately 160–250 times higher than that of existing CIM solutions, in which only two rows of word lines can be activated.
... Considering the worst-case energy contribution for each 1-bit FA operation results in an energy consumption overestimation that, as a first-order approximation, provides enough room to account for additional energy dissipated by components in the peripheral circuitry that were not included in the circuit simulations (i.e., the analog tri-state buffers). The proposed FA outperforms all existing IMPLY-based LIM solutions (both simulation and experimental works [12,20,36–38]) in terms of the number of steps (delay) and energy consumption, using few devices and lowering the energy-delay product (EDP) by a factor >10³, bringing it much closer to the CMOS one. We compare the performance of the proposed solution vs. CMOS (with and without considering the VNB energy and time overhead) in Table 5 considering the parallel execution of 512 32-bit FA operations (simple ripple carry architecture), which entails 4 kB data, which is the common memory page size [39]. ...
Article
Full-text available
Logic-in-memory (LIM) circuits based on the material implication logic (IMPLY) and resistive random access memory (RRAM) technologies are a candidate solution for the development of ultra-low power non-von Neumann computing architectures. Such architectures could enable the energy-efficient implementation of hardware accelerators for novel edge computing paradigms such as binarized neural networks (BNNs), which rely on the execution of logic operations. In this work, we present the multi-input IMPLY operation implemented on a recently developed smart IMPLY architecture, SIMPLY, which improves the circuit reliability, reduces energy consumption, and breaks the strict design trade-offs of conventional architectures. We show that the generalization of the typical logic schemes used in LIM circuits to multi-input operations strongly reduces the execution time of complex functions needed for BNN inference tasks (e.g., the 1-bit Full Addition, XNOR, Popcount). The performance of four different RRAM technologies is compared using circuit simulations leveraging a physics-based RRAM compact model. The proposed solution approaches the performance of its CMOS equivalent while bypassing the von Neumann bottleneck, which gives a huge improvement in bit error rate (by a factor of at least 10⁸) and energy-delay product (projected up to a factor of 10¹⁰).
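The abstract above names XNOR and Popcount as the logic operations behind BNN inference. As a minimal software sketch of why those two operations are enough (not the SIMPLY circuit itself), the following Python assumes the usual encoding in which bit 1 stands for +1 and bit 0 for -1; the function names are hypothetical.

```python
def xnor_popcount_dot(x_bits, w_bits):
    """Dot product of two {-1, +1} vectors given as bits (1 -> +1, 0 -> -1).

    XNOR marks the positions where the two bits agree, popcount counts
    them, and 2 * matches - n converts that count to the signed sum.
    """
    if len(x_bits) != len(w_bits):
        raise ValueError("vectors must have the same length")
    matches = sum(1 for x, w in zip(x_bits, w_bits) if not (x ^ w))  # XNOR + popcount
    return 2 * matches - len(x_bits)


def reference_dot(x_bits, w_bits):
    """Same dot product computed directly on the +1/-1 values, for checking."""
    def to_pm1(bit):
        return 1 if bit else -1
    return sum(to_pm1(x) * to_pm1(w) for x, w in zip(x_bits, w_bits))


if __name__ == "__main__":
    x = [1, 0, 1, 1]
    w = [1, 1, 0, 1]
    print(xnor_popcount_dot(x, w), reference_dot(x, w))
```

Both functions return the same value for any pair of equal-length bit vectors, which is why a BNN layer can be evaluated with bitwise logic and a counter instead of multipliers.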
... However, the von Neumann bottleneck caused by the separation of processor and memory severely hinders the development of high-performance and energy-efficient computer hardware [1][2][3]. In-memory computing, which enables computation directly inside the memory, is considered a promising solution to overcome this limitation [4][5][6]. Such a computing system is built of nonvolatile and computational memories such as phase-change memory [7], resistive random access memory [8][9][10][11][12], and magnetic random access memory (MRAM) [13][14][15][16][17][18]. Out of these choices, MRAM based on the spin-orbit torque (SOT) [19][20][21][22][23] has attracted considerable attention due to its high speed and low power consumption. ...
Article
Full-text available
The non-volatile logic gates that can perform both storage and computing functions are the elementary components for in-memory computing. In this work, we propose a spin-orbit torque based non-volatile reconfigurable logic device. The spatial-dependent spin current generated by a Y-shaped heavy metal layer is utilized to break the symmetry of a perpendicular magnetic tunnel junction (MTJ) with Dzyaloshinskii-Moriya interaction, which leads to a symmetry-dependent magnetization switching. Both bipolar and unipolar magnetization switching can be achieved based on the proposed scheme. The feasibility of the prototype device is demonstrated through the micromagnetic simulations. Moreover, we assign these symmetry-dependent switching characteristics with different Boolean logic operations. Three logic gates including Majority, XOR, and AND/XOR are reconfigured. Furthermore, we offer a different approach to perform an N-bit full subtractor/adder function using a single MTJ. The proposed device paves the way toward a programmable and highly parallel logic-in-memory architecture in the near future.
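The Majority and XOR gates listed above are the two functions a full adder needs: the sum bit is the three-input XOR and the carry-out is the three-input Majority. The Python below is only a Boolean-level sketch of that standard identity, not the paper's single-MTJ scheme; all function names are hypothetical.

```python
def majority(a, b, c):
    """Three-input Majority: 1 when at least two inputs are 1."""
    return int(a + b + c >= 2)


def xor3(a, b, c):
    """Three-input XOR (parity of the inputs)."""
    return a ^ b ^ c


def full_adder(a, b, carry_in):
    """1-bit full adder: sum = XOR3, carry-out = Majority."""
    return xor3(a, b, carry_in), majority(a, b, carry_in)


def ripple_add(x, y, n_bits):
    """Add two n-bit unsigned integers by chaining 1-bit full adders."""
    carry, total = 0, 0
    for i in range(n_bits):
        s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        total |= s << i
    return total, carry  # n-bit sum and final carry-out


if __name__ == "__main__":
    print(ripple_add(5, 3, 4))
```

A subtractor follows from the same two gates by inverting one operand and setting the initial carry to 1 (two's complement), which is left out here to keep the sketch short.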
... The feasibility of using these two schemes to achieve important Boolean functions has been proven experimentally. In one notable case, a 2 × 2 Ti/HfO2/W RRAM array [157] was used to execute logic computation and apply logic inputs (i.e., VCOND and VSET in either positive or negative polarity) to two working RRAM devices, followed by a read process to detect the correct logic output (see Figure 17d). This method enables the 16 binary Boolean logic functions to be reprogrammed in one small single cell with superb performance. ...
... Various RRAM configurations: (a) a hybrid device structure with one switch and one unipolar RRAM cell; (b) the I-V characteristics of a typical TaOx-based RRAM; (c) the logic operation of two anti-serial RRAM devices; (d) experimental results for "WL-IMP" operations on M1 and M2 ((a) is reprinted with permission from [154]; (b) is reprinted with permission from [155]; (c) is reprinted with permission from [156]; (d) is reprinted with permission from [157].) ...
Article
Full-text available
Recent progress in the development of artificial intelligence technologies, aided by deep learning algorithms, has led to an unprecedented revolution in neuromorphic circuits, bringing us ever closer to brain-like computers. However, the vast majority of advanced algorithms still have to run on conventional computers. Thus, their capacities are limited by what is known as the von-Neumann bottleneck, where the central processing unit for data computation and the main memory for data storage are separated. Emerging forms of non-volatile random access memory, such as ferroelectric random access memory, phase-change random access memory, magnetic random access memory, and resistive random access memory, are widely considered to offer the best prospect of circumventing the von-Neumann bottleneck. This is due to their ability to merge storage and computational operations, such as Boolean logic. This paper reviews the most common kinds of non-volatile random access memory and their physical principles, together with their relative pros and cons when compared with conventional CMOS-based circuits (Complementary Metal Oxide Semiconductor). Their potential application to Boolean logic computation is then considered in terms of their working mechanism, circuit design and performance metrics. The paper concludes by envisaging the prospects offered by non-volatile devices for future brain-inspired and neuromorphic computation.
... switching between resistance states, respectively. Such resistive-switching elements have been also proposed as building blocks in neuromorphic computing, where they aim to emulate the cognitive computing functions of the human brain [7][8][9][10]. The race for materials with excellent resistive-switching characteristics is thus underway. ...
Article
Full-text available
We study the resistive switching in tunnel junctions with single-crystal La2NiO4 electrodes. Such electro-resistive devices are promising candidates for future nonvolatile memory and reconfigurable logic applications thanks to their simple structure, excellent scalability and endurance. Our tunnel junctions were prepared by painting a spot of conductive silver epoxy on the surface of a La2NiO4 single crystal. The interface between the silver and the semiconducting crystal served as a natural barrier forming planar normal metal/insulator/semiconductor (N–I–S) tunnel junctions with resistances ranging from a few ohms to more than a hundred thousand ohms. The current–voltage (I–V) measurements performed on such junctions at room temperature demonstrated a bias-driven switching between high and low resistance states with ratios close to 100% and high endurance. A combination of 2- and 3-probe I–V measurements unambiguously demonstrated that the resistive switching is associated with the interfaces between the La2NiO4 crystal and the silver-contact electrodes, with negligible contribution from the bulk of the crystal. Similar resistive-switching phenomena in other oxide materials were previously associated with crystal-lattice distortions produced by an applied voltage/electric field. Here, we use an ultra-sensitive capacitive displacement meter to monitor the field-induced lattice distortions in situ. We observe that the crystal contraction/expansion is strongly correlated with the resistive switching. We also note that the Joule heating from dc bias may contribute to the crystal size changes. Our results provide new insight into the origin of lattice distortions/resistive switching in transition metal oxides, while the observed interfacial nature of the switching phenomenon is promising for the fabrication of thin-film planar devices to be used in nonvolatile memory and logic.
... The switching time, calculated by applying a voltage pulse of 0.7 V amplitude and 1 μs width, is found to be 700 ns (Fig. 3g). With the trade-off existing between programming voltage and switching time 44, we ... [Comparison entries: Ti/HfO2/W [46]; SiO2 [53]; nanopore graphene/HfO2 [52]; AlOx/WOx [54]; MoTe2 [38]; Zr0.5Hf0.5O2/graphene oxide [49]; HfO2 [51]; graphene/HfOx/TiN [50]; Ta2O5 (CRS) [47]; HfOx/AlOy [48]] ... Table 5). We suspect that the low switching energy is promoted by the excess defects and grain boundaries in our printed WSe2 layer. ...
Article
Full-text available
3D monolithic integration of logic and memory has been the most sought-after solution to surpass the von Neumann bottleneck, for which a low-temperature processed material system becomes inevitable. Two-dimensional materials, with their excellent electrical properties and low thermal budget, are potential candidates. Here, we demonstrate a low-temperature hybrid co-integration of a one-transistor-one-resistor memory cell, comprising a surface-functionalized 2D WSe2 p-FET and a solution-processed WSe2 Resistive Random Access Memory. The employed plasma oxidation technique results in a low Schottky barrier height of 25 meV with a mobility of 230 cm² V⁻¹ s⁻¹, leading to a 100x performance-enhanced WSe2 p-FET, while the defective WSe2 Resistive Random Access Memory exhibits a switching energy of 2.6 pJ per bit. Furthermore, guided by our device-circuit modelling, we propose vertically stacked channel FETs for high-density sub-0.01 μm² memory cells, offering a new beyond-Si solution to enable 3D embedded memories for future computing systems.
... To overcome this obstacle, many efforts have been devoted to future computing systems based on logic-in-memory architecture, which performs the computation at the same physical location as where the data are stored. Various physical devices, such as the magnetic tunnel junction [2], memristor [3,4], ferroelectric tunnel junction [5], phase change memory [6], etc, have been implemented for the nonvolatile logic-in-memory applications in the past decade. ...
Article
Full-text available
The biphase magnetic heterostructure (Ni/Fe20Ni45Co25Si2B8/micro-planar coil/Fe20Ni45Co25Si2B8/micro-planar coil/Fe20Ni45Co25Si2B8/Ni) bonded with a pair of thin magnets shows an improved asymmetric and hysteretic giant magnetoimpedance (GMI) effect with respect to an external dc magnetic field. This is attributed to the modification of the permeability of the soft amorphous magnetic alloy (Fe20Ni45Co25Si2B8) by the hysteretic magnetostrictive stress from the neighboring magnetostrictive Ni layer, in addition to the magnetostatic coupling. More importantly, nonvolatile logic gates, such as NOR and NAND, can be implemented with either a single GMI laminate or arrays. On the one hand, after applying two sequential magnetic field pulses as the logic inputs to a single proposed laminate, the nonvolatile output voltages are positive high (logic 1) or positive low (logic 0) depending on the signs and magnitudes of the pulses. On the other hand, after applying the magnetic field pulses in parallel to the GMI laminate arrays, different logic functions can be programmed and realized at run-time. The integration of memory and logic functions makes the proposed GMI laminate a possible candidate for future computing systems beyond the von Neumann architecture.
... The functionality of a MAGIC adder has been shown with organic unipolar switching devices 36. In addition to the demonstration of the IMPLY logic in the proposing publication 17, further publications presented experimental studies for this approach 37,38. All of these adders require a certain number of devices and steps to perform a certain operation, e.g. a 1-bit addition. ...
Article
Full-text available
Memristive switches are able to act as both storage and computing elements, which makes them an excellent candidate for beyond-CMOS computing. In this paper, multi-input memristive switch logic is proposed, which enables the function X OR (Y NOR Z) to be performed in a single step with three memristive switches. This ORNOR logic gate increases the capabilities of memristive switches, improving the overall system efficiency of a memristive switch-based computing architecture. Additionally, a computing system architecture and clocking scheme are proposed to further utilize memristive switching for computation. The system architecture is based on a design where multiple computational function blocks are interconnected and controlled by a master clock that synchronizes system data processing and transfer. The clocking steps to perform a full adder with the ORNOR gate are presented along with simulation results using a physics-based model. The full adder function block is integrated into the system architecture to realize a 64-bit full adder, which is also demonstrated through simulation.
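The ORNOR function named above, X OR (Y NOR Z), can be tabulated directly. The short Python sketch below only evaluates the Boolean function and two special cases that follow from it algebraically; it does not model the memristive switches or the clocking scheme, and the function names are hypothetical.

```python
from itertools import product


def ornor(x, y, z):
    """ORNOR gate: X OR (Y NOR Z), with inputs and output as 0/1."""
    return x | (1 - (y | z))


def truth_table():
    """All eight input combinations with their ORNOR output."""
    return [(x, y, z, ornor(x, y, z)) for x, y, z in product((0, 1), repeat=3)]


if __name__ == "__main__":
    for row in truth_table():
        print(*row)
```

Two reductions hold for any inputs: fixing X = 0 gives a plain NOR of Y and Z, and tying Z to Y gives X OR (NOT Y), which is the material implication Y IMP X.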
... Since then, memristors have received explosive attention from both academic and industrial communities. In recent years, in-depth investigation in the resistive switching mechanisms and continuous improvement in device performances have not only led to the breakthrough in the development of digital non-volatile memory [3][4][5], but also led to other prospective non von Neumann computing paradigms such as in-memory computing [6][7][8][9][10][11][12][13][14][15][16][17][18][19] and neuromorphic computing [20][21][22][23][24]. ...
... At the device level, the fusion of memory and computing is realized by implementing primitive Boolean logic functions in the memristors. More specifically, in various proposed memristive logic families, such as material implication (IMP) [7][8][9]17], sequential logic [10][11][12]16], and MAGIC logic [18,19], the common key feature is utilizing the high and low resistance states rather than the volatile voltages as logic variables (0 and 1), i.e. logic inputs and output, hence defined as 'stateful' logic. In other words, the logic results can be stored in situ within the nonvolatile device and used directly for the next logic stage, facilitating the normally-off characteristic and a significant reduction of data movement during computing, especially for data-intensive tasks. ...
Article
Full-text available
Owing to the capability of integrating information storage and computing in the same physical location, in-memory computing with memristors has become a research hotspot as a promising route for non von Neumann architecture. However, it is still a challenge to develop high-performance devices as well as optimized logic methodologies to realize energy-efficient computing. Herein, a filamentary Cu/GeTe/TiN memristor is reported to show satisfactory properties with nanosecond switching speed (< 60 ns), low voltage operation (< 2 V), high endurance (>10⁴ cycles) and good retention (>10⁴ s at 85 °C). It is revealed that the charge carrier conduction mechanisms in the high resistance and low resistance states are Schottky emission and hopping transport between adjacent Cu clusters, respectively, based on the analysis of current-voltage behaviors and resistance-temperature characteristics. An intuitive picture is given to describe the dynamic processes of resistive switching. Moreover, based on the basic material implication (IMP) logic circuit, we propose a reconfigurable logic method and experimentally implement IMP, NOT, OR, and COPY logic functions. The design of a one-bit full adder with a reduced computational sequence and its validation in simulation further demonstrate the potential for practical application. The results provide important progress towards understanding the resistive switching mechanism and realizing energy-efficient in-memory computing architecture.
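The abstract above lists IMP, NOT, OR, and COPY as functions built on the material implication circuit. The Python sketch below shows one textbook way to derive NOT, OR, and COPY from IMP plus a FALSE (reset) step, with each result overwriting a device state as in stateful logic. It is an assumption-laden toy model, not the reconfigurable method of the paper: the class, the function names, the device labels, and the step sequences are all hypothetical.

```python
class StatefulArray:
    """Toy model: each named memristor stores one logic bit as its state."""

    def __init__(self, **bits):
        self.state = dict(bits)

    def false(self, q):
        """FALSE: unconditionally reset device q to logic 0."""
        self.state[q] = 0

    def imp(self, p, q):
        """IMP: q <- (NOT p) OR q; the result overwrites device q."""
        self.state[q] = (1 - self.state[p]) | self.state[q]


def op_not(arr, p, s):
    """NOT p, left in work device s (2 steps)."""
    arr.false(s)
    arr.imp(p, s)      # s = NOT p
    return arr.state[s]


def op_or(arr, p, q, s):
    """p OR q, left in device q; s is a work device (3 steps)."""
    arr.false(s)
    arr.imp(p, s)      # s = NOT p
    arr.imp(s, q)      # q = p OR q
    return arr.state[q]


def op_copy(arr, p, s1, s2):
    """Copy p into s2 through a double negation (4 steps)."""
    arr.false(s1)
    arr.imp(p, s1)     # s1 = NOT p
    arr.false(s2)
    arr.imp(s1, s2)    # s2 = p
    return arr.state[s2]


if __name__ == "__main__":
    arr = StatefulArray(p=1, q=0, s=1)
    print(op_or(arr, "p", "q", "s"), arr.state)
```

Note that `op_or` destroys the original value of q, which is the kind of trade-off between step count and input preservation that full-adder sequences in this logic family have to manage.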