Fig 1 - uploaded by Nam-Seog Kim
Trend of SRAM cell size reported.

Source publication
Article
A 1.2-V 72-Mb double data rate 3 (DDR3) SRAM achieves a data rate of 1.5 Gb/s using dynamic self-resetting circuits. Single-ended main data lines halve the data line precharging power dissipation and the number of data lines. Clocks phase shifted by 0°, 90°, and 270° are generated through the proposed clock adjustment circuits. The latter circuits...

Contexts in source publication

Context 1
... systems. The 72-Mb SRAM is fabricated using a 0.1-µm CMOS technology with the implemented SRAM, breaking the 1-µm² barrier for the cell size. Some papers have been published including six-transistor (6T) SRAM cells near the 1-µm² barrier [1]-[3]. One paper described a 6T embedded SRAM cell breaking the 1-µm² barrier with the cell size of 0.998 µm² [4]. Fig. 1 shows the trend in SRAM size recently reported. The implemented cell size of this work is 0.845 µm², which is the smallest size to date. The chip size is 151.1 mm². To halve the data line precharging power dissipation and the number of data lines, single-ended main data lines (SMDLs) are designed using dynamic self-resetting circuits ...
Context 2
... the change in unit delay, resulting in a phase shift of 90°. The jitter of the clock shifted by 0° is 13 ps and that of the clock shifted by 90° is 40 ps. In addition, the amount of phase shift can be controlled by JTAG_CONT to trim the setup/hold time and the data valid window. The covered range of the phase shift by JTAG_CONT is 20 with 7 steps. Fig. 10 shows the test result of the proposed CAC. Two clocks phase shifted by 90° from the rising and falling edge of external clock are ...
Context 3
... stub causes relatively large reflections compared with on-chip termination due to the impedance mismatch. Therefore, on-chip termination is developed to remove the effect of the unterminated stub and improve signal integrity. An on-chip input termination of the center-tapped-termination (CTT) type is designed by using CMOS transistors. Fig. 11(a) shows the input termination scheme of data pad. The off-chip driver (OCD) is activated during the read operation and the terminator is activated during the write operation. The impedance of the OCD and the terminator is controlled by digital codes generated by two PICs with reference resistors RQ and RT [10]. In 72-Mb DDR3 SRAM, RQ ...
Context 4
... equal to the board characteristic impedance, and the input impedance of terminators in the data pads is 75 Ω. Termination impedance is not matched to the board characteristic impedance to reduce the dc current dissipated in the terminator. To reduce input capacitance, 1/5 of the OCD transistors are used as terminators during nonread operation. Fig. 11(b) and (c) shows the simplified schematic of the terminator for the data pad and the address/control/clock pad. The terminator for the data pads consists of transistor arrays of nMOS and diode-connected nMOS pairs for pulldown, and pMOS and diode-connected pMOS pairs for pullup. The terminator for address/control/clock pads is composed of ...
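The trade-off described in Context 4 can be illustrated with a small calculation. This is only a sketch under stated assumptions: the terminator is modeled as an ideal resistive center-tapped termination (two legs of twice the Thevenin impedance between the I/O supply and ground), whereas the source describes transistor arrays; the 25 Ω board impedance is the value quoted in Context 10; the function names are hypothetical.

```python
def reflection_coefficient(z_load_ohm: float, z0_ohm: float) -> float:
    """Voltage reflection coefficient at a load terminating a line of impedance z0."""
    return (z_load_ohm - z0_ohm) / (z_load_ohm + z0_ohm)


def ctt_standing_current(vddq_v: float, z_thevenin_ohm: float) -> float:
    """DC current through an idealized center-tapped terminator.

    Assumption: pullup and pulldown legs are each 2 * z_thevenin, so the
    series path from the supply to ground is 4 * z_thevenin.
    """
    return vddq_v / (4.0 * z_thevenin_ohm)


Z_BOARD_OHM = 25.0       # board characteristic impedance (Context 10)
Z_TERMINATOR_OHM = 75.0  # data-pad terminator impedance (Context 4)

gamma = reflection_coefficient(Z_TERMINATOR_OHM, Z_BOARD_OHM)

# The supply voltage cancels in the ratio, so any value works here.
current_ratio = (ctt_standing_current(1.0, Z_TERMINATOR_OHM)
                 / ctt_standing_current(1.0, Z_BOARD_OHM))

print(f"reflection coefficient: {gamma:.2f}")
print(f"dc current relative to a matched terminator: {current_ratio:.3f}")
```

Under this idealized model the 75 Ω terminator draws one third of the standing current of a matched 25 Ω terminator, at the cost of a nonzero reflection at the pad.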
Context 5
... [10]. In the previously published approaches, the pulldown code is generated from the pullup code which has a quantization error [10]. As a result, the accuracy of the pulldown code is dependent upon the pullup code. To eliminate the dependency of the pulldown code on the pullup code, a new impedance code generation scheme is proposed. Fig. 12 shows the block diagram of the PIC. By feedback operation of AMP1 and a pMOS (M0), VZQ becomes VREF ( VDDQ ). The reference current of VREF/RQ flows through M0 and RQ. M3 copies the current in M0 to make pulldown impedance code and M1, M2, M4, and AMP2 copy the current in M0 to make pullup impedance code. Therefore, pulldown impedance ...
Context 6
... input impedance of terminator should be linear to maintain equal channel environment for changing pad voltage and to improve the signal integrity and the input data valid windows in a system. Fig. 13(a) shows the linearity of an ideal terminator. In the ideal case, the relationship between pad voltage and input current is perfectly linear. But due to the nonlinear characteristics in transistors, there exists linearity error. Fig. 13(b) shows the linearity error of the terminator. The impedance of terminator is evaluated either by ...
Context 7
... for changing pad voltage and to improve the signal integrity and the input data valid windows in a system. Fig. 13(a) shows the linearity of an ideal terminator. In the ideal case, the relationship between pad voltage and input current is perfectly linear. But due to the nonlinear characteristics in transistors, there exists linearity error. Fig. 13(b) shows the linearity error of the terminator. The impedance of terminator is evaluated either by forcing a voltage to a pad and measuring the current flowing into a pad, or by measuring pullup impedance and pulldown impedance, respectively. Measuring pullup impedance and pulldown impedance, respectively, is just executed in test ...
Context 8
... 13(b) shows the linearity error of the terminator. The impedance of terminator is evaluated either by forcing a voltage to a pad and measuring the current flowing into a pad, or by measuring pullup impedance and pulldown impedance, respectively. Measuring pullup impedance and pulldown impedance, respectively, is just executed in test mode. Fig. 13(b) shows the result of forcing a voltage and measuring the current. The total linearity error of the terminator is 4.1% over PVT variations. The linearity error is measured between 0.3 and 1.2 V. Fig. 14 shows the eye diagram of the input data at a data rate of 1.5 Gb/s and a power supply of 1.5 V. In the case of no termination, the ...
Context 9
... impedance and pulldown impedance, respectively. Measuring pullup impedance and pulldown impedance, respectively, is just executed in test mode. Fig. 13(b) shows the result of forcing a voltage and measuring the current. The total linearity error of the terminator is 4.1% over PVT variations. The linearity error is measured between 0.3 and 1.2 V. Fig. 14 shows the eye diagram of the input data at a data rate of 1.5 Gb/s and a power supply of 1.5 V. In the case of no termination, the signal swing is larger than that in on-chip termination. But the noise and the reflections increase jitter and reduce the input data valid window. In the case of on-chip termination, the signal swing is ...
Context 10
... wider input data valid window is obtained. The data input valid window with on-chip termination is 480 ps at 750 mV ± 200 mV when the terminator impedance and the board characteristic impedance is 75 Ω and 25 Ω, respectively. Included in the results are 10% PVT variations, 10% termination impedance variations, and system models. VI. HARDWARE RESULTS Fig. 15 shows the hardware results at a data rate of 1.5 Gb/s and 750-MHz core frequencies. K is the external clock of 750 MHz. Data (DQ0, DQ1) of 1.5 Gb/s is well aligned with the echo clock (CQ, CQb). DQ latency is 2.1 ns from the time that an address is captured. Fig. 16 shows the chip micrograph of the 72-Mb ...
Context 11
... 10% termination impedance variations, and system models. VI. HARDWARE RESULTS Fig. 15 shows the hardware results at a data rate of 1.5 Gb/s and 750-MHz core frequencies. K is the external clock of 750 MHz. Data (DQ0, DQ1) of 1.5 Gb/s is well aligned with the echo clock (CQ, CQb). DQ latency is 2.1 ns from the time that an address is captured. Fig. 16 shows the chip micrograph of the 72-Mb ...
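The timing figures quoted in the contexts above are related by simple arithmetic, sketched below. The constants are taken from the quoted text (750 MHz external clock, 1.5 Gb/s per-pin data rate, 480 ps valid window, 2.1 ns DQ latency); the function and variable names are illustrative and do not come from the source publication.

```python
CLOCK_HZ = 750e6         # external clock K
DATA_RATE_BPS = 1.5e9    # per-pin data rate
VALID_WINDOW_PS = 480.0  # input data valid window with on-chip termination
DQ_LATENCY_NS = 2.1      # latency from address capture to data out


def clock_period_ps(clock_hz: float) -> float:
    """Clock period in picoseconds."""
    return 1e12 / clock_hz


def unit_interval_ps(data_rate_bps: float) -> float:
    """Duration of one bit on a pin, in picoseconds."""
    return 1e12 / data_rate_bps


def phase_to_delay_ps(phase_deg: float, clock_hz: float) -> float:
    """Time delay corresponding to a phase shift of the clock."""
    return clock_period_ps(clock_hz) * phase_deg / 360.0


# Double data rate: two bits per pin per clock cycle.
bits_per_clock = DATA_RATE_BPS / CLOCK_HZ

ui_ps = unit_interval_ps(DATA_RATE_BPS)
window_fraction = VALID_WINDOW_PS / ui_ps
quarter_cycle_ps = phase_to_delay_ps(90.0, CLOCK_HZ)
latency_cycles = DQ_LATENCY_NS * 1e-9 * CLOCK_HZ

print(f"bits per clock cycle: {bits_per_clock:.0f}")
print(f"unit interval: {ui_ps:.1f} ps")
print(f"valid window as a fraction of the unit interval: {window_fraction:.2f}")
print(f"90-degree phase shift: {quarter_cycle_ps:.1f} ps")
print(f"DQ latency: {latency_cycles:.3f} clock cycles")
```

Read this way, the 480 ps window covers roughly 72% of the 666.7 ps bit time, and a 90° shift of the 750 MHz clock places a sampling edge about half a bit time after each data transition.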

Similar publications

Article
A physical unclonable function (PUF), like a fingerprint, exploits manufacturing randomness to endow each physical item with a unique identifier. One primary PUF application is the secure derivation of volatile cryptographic keys using a fuzzy extractor comprising: i) a secure sketch; and ii) an entropy extractor. Although the entropy extractor...

Citations

... The BS needs to provide the computation result of the l-th service if and only if |K_l| ≥ 1, ∀ s_l ∈ S. We adopt a commonly used computation model [13], in which the total number of CPU cycles required for performing one computation task is linearly proportional to its task input bit length. ...
Footnote 1: The delay of proactive offloading and executing the input data is sufficiently small compared with the deadline, and therefore the computation results can be reused for a long period of time in the future.
Footnote 2: We assume a type of on-chip caching facilities that incurs negligible accessing delay, e.g., SRAM, with reading/writing speed of 1.5 Gb/s and S around 72 Mbits [25].
Preprint
With the growing demand for latency-critical and computation-intensive Internet of Things (IoT) services, mobile edge computing (MEC) has emerged as a promising technique to reinforce the computation capability of the resource-constrained mobile devices. To exploit the cloud-like functions at the network edge, service caching has been implemented to (partially) reuse the computation tasks, thus effectively reducing the delay incurred by data retransmissions and/or the computation burden due to repeated execution of the same task. In a multiuser cache-assisted MEC system, designs for service caching depend on users' preference for different types of services, which is at times highly correlated to the locations where the requests are made. In this paper, we exploit users' location-dependent service preference profiles to formulate a cache placement optimization problem in a multiuser MEC system. Specifically, we consider multiple representative locations, where users at the same location share the same preference profile for a given set of services. In a frequency-division multiple access (FDMA) setup, we jointly optimize the binary cache placement, edge computation resources and bandwidth allocation to minimize the expected weighted-sum energy of the edge server and the users with respect to the users' preference profile, subject to the bandwidth and the computation limitations, and the latency constraints. To effectively solve the mixed-integer non-convex problem, we propose a deep learning based offline cache placement scheme using a novel stochastic quantization based discrete-action generation method. In special cases, we also attain suboptimal caching decisions with low complexity leveraging the structure of the optimal solution. The simulations verify the performance of the proposed scheme and the effectiveness of service caching in general.
... A programmable impedance controller (PIC) creates the impedance codes, having as reference the external resistors RT and RQ for ODT and OCD, respectively. In [11] a 1.2-V 1.5-Gb/s 72-Mb DDR3 SRAM is described, which satisfies the need for high data rate and area density required in recent ultra fast systems together with the corresponding on-chip termination. In [12] the implementation of a third-generation 1.1-GHz 64-bit microprocessor is presented whereas it provides some details on the DDR1 SSTL I/O circuit solutions at the memory interface. ...
Article
A 1 GHz Double Data Rate 2/3 (DDR2/3) combo Stub Series Terminated Logic (SSTL) driver has been developed for the first time to our knowledge using a 90 nm CMOS process. To satisfy the signal integrity requirements, the driver strength is dynamically calibrated and the input/output port is efficiently terminated by on-die resistors. Furthermore, the slew-rate can be sufficiently controlled by selecting an appropriate external resistor. The proposed driver design provides all the required output and termination impedances specified by both the DDR2 and DDR3 standards and occupies a small die area of 0.032 mm² (differential). Experimental results demonstrate its robustness over process, voltage, and temperature variations.
Article
A clock-shared differential signaling (CSDS) transmitter is fabricated in 0.13 μm CMOS for 120 Hz 10-bit Full HD TVs. The proposed Tx driver takes advantage of PVT-insensitive tunable termination resistance with double feedback loops, and small reference voltage fluctuation. Moreover, a fully-digital duty cycle corrector is proposed, and compared to non-clock spreading, the relative near-field EMI level of multi-phase clock spreading is enhanced by 4.4 dB at the operating frequency of 500 MHz. The CSDS Tx with 34 channels consumes 300 mW at a 2.5 V power supply and 1.0 Gb/s/ch.
Article
The effective design of semiconductor memory pertaining to the power consumption, speed and area penalty has always been the crucial task in embedded computing applications. The work presented in this paper is an exact and innovative mathematical-model-based implementation of a 32 kb SRAM optimized for power and speed. The model has been developed for a cell, array, pre-charge, I/Os and periphery devices for their exact behavior, and then an effective design is obtained by running the model through a computing engine. The supply and pre-charge to an array of SRAM are swept and the optimized combination is found for minimum power dissipation and highest achievable access time. The SRAM array rows are controlled by the Gating Transistor Power Saving Technique (GTPST). Redundant columns have been found to make the memory fault tolerant. Similarly, the bitline passive leakage sensing and compensation scheme has also been presented. The experimental result shows 0.25 µW dissipation at VDD of 620 mV and pre-charge of 300 mV. The minimum attainable bit line swing is 200 µV/ns at VDD of 620 mV and precharge of 500 mV, both of which are state-of-art of its kind. The power saving of 13% is reported. The design by mathematical model, schematic and layout of the 32 kb memory chip and simulation are carried out for development of codebook memory that finds application in embedded signal processing.
Conference Paper
A 72Mb 6T SRAM is designed with 2×144 separate-I/O and random R/W in parallel per cycle running at 875MHz DDR to achieve 504Gb/s bandwidth. It is fabricated in a 90nm CMOS process. Dual R/W self-timed clocks with core emulators are multiplexed to operate the SRAM core at 875MHz. On-chip DLL, programmable I/O skews, and programmable input termination and output driver impedance with precise linearity are essential for this 504Gb/s interface.
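The 504 Gb/s figure in the abstract above follows from the interface parameters it quotes. The sketch below assumes that the separate-I/O description means two 144-bit ports (one read, one write) active in parallel, and that DDR means two transfers per clock cycle per pin; the function name is illustrative.

```python
def ddr_bandwidth_gbps(ports: int, bits_per_port: int, clock_mhz: float,
                       transfers_per_clock: int = 2) -> float:
    """Aggregate interface bandwidth in Gb/s.

    Assumption: every pin of every port transfers `transfers_per_clock`
    bits per clock cycle (2 for double data rate).
    """
    megabits_per_second = ports * bits_per_port * clock_mhz * transfers_per_clock
    return megabits_per_second / 1000.0


# Two 144-bit ports at 875 MHz DDR, as quoted in the abstract.
total_gbps = ddr_bandwidth_gbps(ports=2, bits_per_port=144, clock_mhz=875)
per_pin_gbps = ddr_bandwidth_gbps(ports=1, bits_per_port=1, clock_mhz=875)

print(f"per-pin rate: {per_pin_gbps} Gb/s")
print(f"aggregate bandwidth: {total_gbps} Gb/s")
```

With these assumptions each pin runs at 1.75 Gb/s, and 288 pins give the quoted 504 Gb/s.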
Conference Paper
A versatile I/O buffer is proposed to interface DDR/DDR2/GDDR3 memory types. A new robust impedance calibration scheme which fills the role of off-chip driver (OCD) and on-die terminator (ODT) for improving signal integrity is introduced. The proposed calibration scheme minimizes quantization error and maintains 30~300 Ω impedance within 3% variations.
Conference Paper
A novel differential pulse-width control loop circuit based on high speed frequency-to-voltage converters is proposed. To demonstrate its functionality, a circuit has been designed and simulated in 0.18 µm CMOS technology. Results show that the proposed circuit can correct a clock signal's duty cycle even for frequencies as high as 5 GHz. This design can be used to correct clock signal distortion due to process variations in high speed applications such as half-rate clock and data recovery systems.
Conference Paper
Effective design of cache SRAM has always been a challenging task in embedded systems dedicated to image processing applications such as the vector quantizer (VQ). The VQ needs a low-power, high-speed SRAM array. The mathematical model and simulation results for a low-power, high-speed, fault-tolerant codebook SRAM are presented in this paper. The cell, precharge, transmission logic, sense amplifier, redundant bits and I/Os are modeled and SPICE simulated. Since the codebook has a rhythmic nature, successive multiple read cycles are more important than writes. The implementation is done in 0.25 µm technology. The results show that the least precharge is at 300 mV. The array operates at a minimum of 600 mV. The dissipation of the 256 b array is 1.8 mW at a read speed of 5 Gbits/sec at a precharge of 1.25 V and a supply of 2.5 V.