Fig 1 - uploaded by Alex Kirichenko
Decoder block diagram. 


Source publication
Article
We designed, fabricated, and demonstrated an energy-efficient ERSFQ 4-bit decoder. The first version of the decoder is designed and fabricated using the HYPRES legacy 1.0-μm four-layer 4.5-kA/cm² process. It occupies an area of 700 μm × 1800 μm, which is to be reduced to 160 μm × 400 μm once fabricated using HYPRES's RIPPLE-2 0.25-μm six-layer process....

Context in source publication

Context 1
... typical RAM consists of a memory cell array controlled by periphery circuits including address decoders, line drivers and sense circuits. The periphery circuits can be responsible for a significant fraction of RAM energy consumption and latency [17]-[21]. If not addressed, inefficiencies in RAM would eliminate most of the benefits of processor improvements [1]. Address decoders are one of the key circuits defining the RAM performance. Many different superconducting designs of n-to-2^n decoders were developed including tree decoders, loop decoders, NOR and NAND decoders [18]-[29]. This paper reports on the detailed design and testing of a new energy-efficient decoder satisfying requirements for prospective high density fast memories and compatible with energy-efficient SFQ processors. Fig. 1 shows a block diagram of the n-bit decoder. It consists of n address line drivers and a decoder matrix comprising an n × 2^n array of decoder cells. When an address arrives at the decoder, its address line drivers produce complementary control dc currents for each input address bit. For each bit (a column) of the decoder, two complementary superconductive lines (L_left and L_right) go throughout the entire decoder matrix column with one of two carrying the control current. Each cell of the decoder matrix is magnetically coupled to one of two complementary lines. By alternating (swapping) pairs of lines in the proper order, a unique combination of control currents is achieved for every decoder row. Only one row in the matrix, in which all n cells are magnetically biased (i.e., all control currents flow through the L_right lines), can propagate the select signal to its output, thus performing address decoding. We used a balanced Gray code in the swapping scheme to minimize the number of swaps and provide inductance uniformity. The important detail is that once the address is set, the decoder settings do not change until a new address is ...
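The row-selection rule described in this context can be illustrated with a small behavioral model. This is only a sketch under stated assumptions, not the authors' circuit: each cell is reduced to a single bit saying which of the two complementary lines it is coupled to, the function names (`gray_order`, `build_coupling`, `select_rows`, `count_swaps`) are hypothetical, and the standard reflected Gray code is used in place of the balanced Gray code mentioned in the text.

```python
def gray_order(n):
    """Reflected Gray code sequence of length 2**n (assumption: the source
    uses a *balanced* Gray code, which is not reproduced here)."""
    return [i ^ (i >> 1) for i in range(2 ** n)]


def binary_order(n):
    """Plain binary counting order, used only for comparison."""
    return list(range(2 ** n))


def build_coupling(n, order):
    """coupling[row][bit] is 1 if the cell in that row and column is coupled
    to the line that carries current when the address bit is 1, else 0."""
    return [[(code >> bit) & 1 for bit in range(n)] for code in order]


def select_rows(address, n, coupling):
    """Return the rows in which all n cells are biased for this address."""
    # Address line drivers: exactly one of the two lines per bit carries current.
    driven = [(address >> bit) & 1 for bit in range(n)]
    # A row passes the select signal only if every cell sees the driven line.
    return [row for row, cells in enumerate(coupling) if cells == driven]


def count_swaps(coupling):
    """Number of line-pair swaps between vertically adjacent rows."""
    return sum(
        sum(a != b for a, b in zip(upper, lower))
        for upper, lower in zip(coupling, coupling[1:])
    )


if __name__ == "__main__":
    n = 4
    gray = build_coupling(n, gray_order(n))
    binary = build_coupling(n, binary_order(n))
    for address in range(2 ** n):
        print(address, select_rows(address, n, gray))
    print("swaps, Gray order:", count_swaps(gray))
    print("swaps, binary order:", count_swaps(binary))
```

In this model every address selects exactly one row, and ordering the rows by a Gray code needs one swap per row transition, which is fewer than a binary ordering requires. The selected row index is the position of the address in the Gray sequence, not the address value itself.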

Similar publications

Article
A bidirectional logic gate has been designed based on the backhopping phenomenon observed in magnetic tunnel junctions (MTJ) at high bias. The magnetization dynamics of each magnetic layer of the MTJ—having materials and geometry of a standard spin-transfer torque magnetic random access memory device—is calculated using the coupled Landau–Lifshitz–...

Citations

... These RSFQ successors are now considered to be a basis for the next-generation low-power circuit technology needed for future high energy-efficient data centers, supercomputers [10]- [12], and embedded classical control modules for quantum computers [13]. One of the most promising and practical energy efficient SFQ logic is ERSFQ, which retains all advantages of RSFQ including the well-developed circuit libraries [3], [14], [15]. ERSFQ logic is one of two integrated circuit technologies chosen for the implementation of superconducting processors in C3 project [16]. ...
Article
We have designed and tested a parallel 8-bit ERSFQ arithmetic logic unit (ALU). The ALU design employs wave-pipelined instruction execution and features modular bit-slice architecture that is easily extendable to any number of bits and adaptable to current recycling. A carry signal synchronized with an asynchronous instruction propagation provides the wave-pipeline operation of the ALU. The ALU instruction set consists of 14 arithmetical and logical instructions. It has been designed and simulated for operation up to a 10 GHz clock rate at the 10-kA/cm² fabrication process. The ALU is embedded into a shift-register-based high-frequency testbed with on-chip clock generator to allow for comprehensive high frequency testing for all possible operands. The 8-bit ERSFQ ALU, comprising 6840 Josephson junctions, has been fabricated with MIT Lincoln Lab's 10-kA/cm² SFQ5ee fabrication process featuring eight Nb wiring layers and a high-kinetic inductance layer needed for ERSFQ technology. We evaluated the bias margins for all instructions and various operands at both low and high frequency clock. At low frequency, clock and all instruction propagation through ALU were observed with bias margins of +/-11% and +/-9%, respectively. Also at low speed, the ALU exhibited correct functionality for all arithmetical and logical instructions with +/-6% bias margins. We tested the 8-bit ALU for all instructions up to 2.8 GHz clock frequency.
... New energy-efficient versions of RSFQ, such as ERSFQ, eSFQ [14] [15] and LV-SFQ [16], have further reduced power consumption for systems built on this technology. These new additions to the SFQ family show great promise, and success has already been demonstrated with these energy-efficient technologies as well [17] [18] [19] [20]. ...
Article
New tools have been created to allow a superconducting design flow for schematic design, verification, and optimization. These tools integrate with the Cadence design environment. In Single Flux Quantum (SFQ) superconducting electronics, individual component values, such as wire inductances, Josephson junction critical currents, and bias currents must be optimized to allow for maximum deviance from the designer value, which is also known as the device margin. One tool is used to create a description of the proper circuit behavior. Included with this tool is the ability to automatically create the description from a Cadence netlist. The other tool is an automated device margin circuit schematic verification and optimization tool, which widens device margins while maintaining proper circuit behavior derived from the first tool. Additionally, this optimization tool can automatically correct the circuit schematic using the proper circuit behavior description. In this paper, the functionality of the language used to create the description of the proper circuit behavior is presented. Several circuits are then verified and optimized based on their correct behavior.
... These decoders can be SFQ-based (e.g. [34], [35]) and located on the periphery of the FPGA fabric. ...
Article
Field-programmable gate arrays (FPGAs) provide a significantly cheaper solution for various applications in traditional semiconductor electronics. Single flux quantum (SFQ) technologies are developing rapidly, and the availability of an SFQ-specific FPGA will be very useful. Towards developing such an SFQ-specific FPGA, new designs of FPGA subcircuits for both synchronous and asynchronous operation of SFQ circuits are presented in this paper. Magnetic Josephson junctions (MJJs) are used as bias limiting junctions in ERSFQ biasing to implement programmable switches in various subcircuits of the proposed FPGA fabric. Designs of all FPGA subcircuits are developed and verified through circuit simulation. Verilog HDL models are also developed for all FPGA circuit blocks to facilitate large-scale FPGA simulations for the implementation of the desired circuit on the proposed FPGA fabric. Designs of a few subcircuits with switches based on the non-destructive readout (NDRO) cell are also given in this paper for better comparison with their MJJ-switch-based counterparts. Programming of MJJ-based switches is based on the ability to control the critical current of MJJs externally. Recent implementations of SFQ decoders are proposed for accessing individual MJJs through the current lines in a crossbar structure. Estimations for the area and power consumption are much better in comparison to previous attempts at designing an SFQ-specific FPGA.
... A solution of this problem can be found in the utilization of magnetic control over cells by using a current control line. This approach can be realized with SFQ-to-current loop converters [80,81]. A similar technique can be used in merging of multiple outputs [82]. ...
Article
The predictions of Moore’s law are considered by experts to be valid until 2020, giving rise to “post-Moore’s” technologies afterwards. Energy efficiency is one of the major challenges in high-performance computing that should be answered. Superconductor digital technology is a promising post-Moore’s alternative for the development of supercomputers. In this paper, we consider operation principles of energy-efficient superconductor logic and memory circuits with a short retrospective review of their evolution. We analyze their shortcomings with respect to computer circuit design. Possible ways of further research are outlined.
... There are several known decoder design types implemented using superconducting circuits: loop decoders, tree decoders, NOR and NAND decoders [11]-[22]. The first decoder based on energy-efficient ERSFQ logic [1] was described in [23]. This 4-to-16 bit decoder was implemented using Hypres 4-layer process with 4.5 kA/cm² critical current density, occupied 0.7 × 1.8 mm² and dissipated ∼70 aJ per one address decode operation. ...
... The decoder consists of n address line drivers. In contrast to our previous decoder design with n·2^n decoder cells [23], the binary tree architecture requires only 2^n − 1 decoder cells. With energy efficiency being one of the primary goals of this design, the binary tree approach has a significant power scaling advantage (a factor of ∼n) compared to the previous matrix design. ...
... A_{n−1}) arrives at the decoder, its address line drivers produce complementary control dc currents for each input address bit. An address line driver [23] comprises an ERSFQ D flip-flop with complementary outputs (DFFC) that generates SFQ control signals for the dc-powered current steering loop driver [22]. For each column of the decoder, two complementary superconductive lines (A_n and Ā_n) traverse the entire vertical dimension of the decoder (Fig. 1). ...
Article
We report on the development of energy-efficient decoders for cryogenic random access memory and register file. To reduce the pitch, area, and energy, our decoder employs a scalable binary tree architecture. We implemented these decoders using ERSFQ logic controlled by magnetically coupled address lines. These lines are driven by energy-efficient drivers based on the current-steering technique. A 4-to-16 version of the decoder was laid out and fabricated in HYPRES 6-layer 10 kA/cm² and MIT LL 8-layer 10 kA/cm² processes with 15 μm and 28 μm decoder row pitch, respectively. The decoders were designed to have ~30 ps latency and dissipate ~40 aJ per clock. We experimentally confirmed the functionality of the circuits with ±8% dc bias margins and verified their operation up to 13 GHz clock.
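The cell counts quoted in the citing context for this publication (n·2^n cells for the matrix decoder against 2^n − 1 for the binary tree) can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function names are hypothetical, and it counts cells only, which stands in for power solely under the assumption that each cell dissipates a comparable amount.

```python
def matrix_cells(n):
    """Cells in the matrix decoder: an n x 2**n array."""
    return n * 2 ** n


def tree_cells(n):
    """Cells in a binary tree decoder: one per internal node, level k
    holding 2**k nodes, so the total is 2**n - 1."""
    return sum(2 ** level for level in range(n))


def cell_ratio(n):
    """Matrix-to-tree cell count ratio; approaches n as n grows."""
    return matrix_cells(n) / tree_cells(n)


if __name__ == "__main__":
    for n in (4, 8, 12):
        print(n, matrix_cells(n), tree_cells(n), round(cell_ratio(n), 3))
```

For the 4-bit case this gives 64 cells for the matrix against 15 for the tree, and the ratio settles toward n for larger address widths, which matches the "factor of ∼n" scaling stated in the context.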
Article
The artificial neuron proposed earlier for use in superconducting neural networks is experimentally studied. The fabricated sample is a single-junction interferometer, part of the circuit of which is shunted by an additional inductance, which is also used to generate an output signal. A technological process has been developed and tested to fabricate a neuron in the form of a multilayer thin-film structure over a thick superconducting screen. The transfer function of the fabricated sample, which contains sigmoid and linear components, is experimentally measured. A theoretical model is developed to describe the relation between input and output signals in a practical superconducting neuron. The derived equations are shown to approximate experimental curves at a high level of accuracy. The linear component of the transfer function is shown to be related to the direct transmission of an input signal to a measuring circuit. Possible ways for improving the design of the sigma neuron are considered.
Article
We investigated local magnetic flux biasing (LFB) that induces a phase shift in superconductor circuits by locally applying a magnetic field through the superconductor loop with Josephson junctions. The arbitrary phase shift can be achieved using LFB without modifying the circuit fabrication process. To quantitatively evaluate the effects of introducing LFB for practical superconductor circuit applications, we designed a single flux quantum (SFQ) based non-destructive read-out flip-flop with complementary outputs (NDROC) and a delay flip-flop with complementary outputs (DFFC). The circuit area and static power consumption of the NDROC based on LFB architecture (LFB-NDROC) are approximately 67% and 36% of a conventional NDROC, respectively. The measured bias margin of the LFB-NDROC was in the range of 69%–129%. Using LFB, we were able to reduce the circuit area and power consumption for the DFFC by 67% and 83%, respectively. The measured bias margin of the DFFC with LFB was between 115% and 128%. LFB enabled us to implement a 5-to-32 SFQ decoder which comprises NDROC trees with a reduced circuit area of approximately 60% of a conventional decoder. The results obtained in this study can be applied to not just SFQ circuits but other superconductor circuits also, as they improve the area and power efficiency of such circuits.