Kai Ni's research while affiliated with Notre Dame College and other places

Publications (203)

Article
This article presents a comprehensive physics-based model for back-end-of-line (BEOL)-compatible oxide–semiconductor-based ferroelectric field-effect transistors (FeFETs). The proposed model describes the polarization switching behavior and enables bidirectional bias sweeps for the hysteretic $I_D$-$V_G$ curv...
Article
Deep random forest (DRF), which combines deep learning and random forest, exhibits accuracy comparable to deep neural networks (DNNs) in edge intelligence tasks, along with interpretability and low memory and computational overhead. However, efficient DRF accelerators lag behind their DNN counterparts. The key to DRF acceleration lies in realizing the branch...
Article
Full-text available
The 6G network, the next‐generation communication system, is envisaged to provide unprecedented experience through hyperconnectivity involving everything. The communication should hold artificial intelligence‐centric network infrastructures as interconnecting a swarm of machines. However, existing network systems use orthogonal modulation and costl...
Article
Reliability issues stemming from device level nonidealities of nonvolatile emerging technologies like ferroelectric field-effect transistors (FeFETs), especially at scaled dimensions, cause substantial degradation in the accuracy of in-memory crossbar-based AI systems. In this work, we present a variation-aware design technique to characterize the...
Article
In this article, we report a comprehensive investigation and a deep understanding of the impact of hydrogen evolution on the reliability of indium-gallium-zinc-oxide (IGZO) field-effect transistors (FETs) with HfO$_2$ as the gate dielectric. Our findings reveal that the source/drain (S/D) regions play a pivotal role in the threshold volt...
Article
Full-text available
Computationally hard combinatorial optimization problems (COPs) are ubiquitous in many applications. Various digital annealers, dynamical Ising machines, and quantum/photonic systems have been developed for solving COPs, but they still suffer from the memory access issue, scalability, restricted applicability to certain types of COPs, and VLSI-inco...
Preprint
Full-text available
Physical unclonable functions (PUFs) are of immense potential in authentication applications for numerous Internet of Things (IoT) devices. For creditable and lightweight PUF applications, high reconfigurability, ultra-low power, and large challenge-response pair (CRP) space are highly desirable. Here we report the first demonstration of ferroelect...
Article
Full-text available
Field programmable gate array (FPGA) is widely used in the acceleration of deep learning applications because of its reconfigurability, flexibility, and fast time-to-market. However, conventional FPGA suffers from the trade-off between chip area and reconfiguration latency, making efficient FPGA accelerations that require switching between multiple...
Article
Logic camouflage is a widely adopted technique that mitigates the threat of intellectual property (IP) piracy and overproduction in the integrated circuit (IC) supply chain. Camouflaged logic achieves functional obfuscation through physical-level ambiguity and post-manufacturing programmability. However, discussions on programmability are confined...
Article
The recent progress in quantum computing and space exploration led to a surge in interest in cryogenic electronics. Superconducting devices such as Josephson junction, Josephson field effect transistor, cryotron, and superconducting quantum interference device (SQUID) are traditionally used to build cryogenic logic gates. However, due to the superc...
Article
In this work, we identify the potential challenges of ambipolar ferroelectric field effect transistor (FeFET) in building a single transistor CAM array to perform parallel Hamming distance (HD) computations. The asymmetry in the two current branches of an ambipolar FeFET, such as different subthreshold swing (SS) and ON state current I...
Article
Content addressable memory (CAM) has been employed in various data-intensive tasks for its parallel pattern-matching capability. To enhance the density and efficiency of CAMs, emerging nonvolatile memory (NVM) technologies have been exploited in the CAM designs. Recently, the multilevel cell (MLC) characteristics of NVMs have been utilized in sever...
Article
Based on experiments and simulation, we perform a systematic study to compare and understand the interplay between polarization switching and charge trapping (CT) for both ferroelectric (FE) metal–ferroelectric–insulator–semiconductor (MFIS) stack and antiferroelectric (AFE) MFIS stack. In order to conduct the comparative study, a unified modeling...
Article
Full-text available
Non-volatile memories (NVMs) have the potential to reshape next-generation memory systems because of their promising properties of near-zero leakage power consumption, high density and non-volatility. However, NVMs also face critical security threats that exploit the non-volatile property. Compared to volatile memory, the capability of retaining da...
Conference Paper
In this work, we present a lightweight in-situ encryption/decryption technique for high-density NAND memory, aiming to meet the growing need for data privacy and security in storage and computing applications. Using ferroelectric FET (FeFET) as a technology platform for demonstration, we show that: i) using a XOR-based cipher, the encryption/decryp...
Conference Paper
In this work, we study ferroelectric capacitor memories and demonstrate comparative advantages of 2T-nC (Two transistors-n metal-ferroelectric-metal (MFM) capacitors) in scalability, reliability, and feasibility of dense 3D integration and operation. We show that: i) the sensing and scalability issues of conventional 1T-1C FeRAM rooted in its charg...
Article
Full-text available
This study investigates the electrical characteristics observed in n-channel and p-channel ferroelectric field effect transistor (FeFET) devices fabricated through a similar process flow with 10 nm of ferroelectric hafnium zirconium oxide (HZO) as the gate dielectric. The n-FeFETs demonstrate a faster complete polarization switching compared to the...
Article
The rapid development of edge artificial intelligence (AI) raises high requirements for data-intensive neural network (NN) computing and storage of edge devices, under a limited chip footprint and energy supply source. As a promising approach for energy-efficient processing, computing-in-memory (CiM) has been widely explored in recent efforts to mi...
Article
Full-text available
This paper presents a novel, simulation-based study of the long-term impact of X-ray irradiation on the ferroelectric field effect transistor (FeFET). The analysis is conducted through accurate multi-physics technology CAD (TCAD) simulations and radiation impact on the two FeFET memory states—HVT and LVT—is studied. For both the states, we investig...
Article
The data explosion of Internet of Things (IoT) and machine learning tasks raises a great demand on highly efficient computing hardware and paradigms. Brain-inspired hyperdimensional computing (HDC) is becoming a promising computing paradigm, which encodes data as hypervectors with homogeneous elements instead of numbers, and can perform learning/cl...
Preprint
This study investigates the electrical characteristics observed in n-channel and p-channel ferroelectric field effect transistor (FeFET) devices fabricated through a similar process flow with 10 nm of ferroelectric hafnium zirconium oxide (HZO) as the gate dielectric. The n-FeFETs demonstrate a faster complete polarization switching compared to th...
Preprint
Full-text available
Computationally hard combinatorial optimization problems (COPs) are ubiquitous in many applications, including logistical planning, resource allocation, chip design, drug explorations, and more. Due to their critical significance and the inability of conventional hardware to efficiently handle scaled COPs, there is a growing interest in developi...
Conference Paper
In this work we introduce reconfigurable multifinger ferroelectric field effect transistors (FeFETs) which were fabricated using 28 nm CMOS technology. By switching the threshold voltage, the FeFETs can be utilized as reconfigurable devices for RF circuits, functioning at $V_{GS} = 0$, thereby reducing energy losses during operation. The devices were r...
Article
In this work, we exploit a 2TnC ferroelectric random access memory (FeRAM) cell design to realize the quasi-nondestructive readout (QNRO) of ferroelectric polarization ($P_{FE}$) in a capacitor, which can relax the endurance requirement of the ferroe...
Preprint
Full-text available
To reduce system complexity and bridge the interface between electronic and photonic circuits, there is a high demand for a non-volatile memory that can be accessed both electrically and optically. However, practical solutions are still lacking when considering the potential for large-scale CMOS compatible integration. Here, we present an experimen...
Article
Ferroelectric Field Effect Transistors (FeFETs) have spurred increasing interest in both memories and computing applications, thanks to their CMOS compatibility, low-power operation, and high scalability. However, new security threats to the FeFET-based memories also arise. A major threat is the power analysis side-channel attack (P-SCA), which exp...
Preprint
Full-text available
Non-volatile memories (NVMs) have the potential to reshape next-generation memory systems because of their promising properties of near-zero leakage power consumption, high density and non-volatility. However, NVMs also face critical security threats that exploit the non-volatile property. Compared to volatile memory, the capability of retaining da...
Article
Bitwise logic-in-memory (BLiM) is a promising approach to efficient computing in data-intensive applications by reducing data movement between memory and processing units. However, existing BLiM techniques have challenges towards higher energy efficiency and speed: (i) DC power in computing and result sensing is significant in most existing RRAM an...
Preprint
Full-text available
In this work, a thorough assessment of the robustness of complementary channel HfO2 ferroelectric FET (FeFET) against total ionizing dose (TID) radiation is conducted, with the goal of determining its suitability for use as high-performance and energy-efficient embedded nonvolatile memory (eNVM) for space applications. We demonstrate that: i) fer...
Article
This paper proposes C2FeRAM, a 2T2C/cell ferroelectric compute-in-memory (CiM) scheme for energy-efficient and high-reliability edge inference and transfer learning. With certain area overhead, C2FeRAM achieves the following highlights: (i) compared with FeFET/FeMFET, it achieves disturb-free CiM and much higher write endurance (equal to FeRAM)...
Preprint
Full-text available
Single-port ferroelectric FET (FeFET) that performs write and read operations on the same electrical gate prevents its wide application in tunable analog electronics and suffers from read disturb, especially to the high-threshold voltage (VTH) state as the retention energy barrier is reduced by the applied read bias. To address both issues, we prop...
Preprint
Full-text available
We have developed a comprehensive modeling framework to explain the switching characteristics of BEOL-compatible FeFET with an amorphous IGZO channel. Our TCAD-based modeling framework, calibrated against measurement data, jointly incorporates a) the distributed channel, b) a physics-based nucleation-limited switching dynamics model for multi-d...
Preprint
Full-text available
In this work, we propose a 2TnC ferroelectric random access memory (FeRAM) cell design to realize the quasi-nondestructive readout (QNRO) of ferroelectric polarization ($P_{FE}$) in a capacitor, which can relax the endurance requirement of the ferroelectric thin film and exploits the benefits of both FeRAM and ferroelectric FET (FeFET). We demonstrat...
Article
Full-text available
Content addressable memory (CAM) is widely used in associative search tasks due to its parallel pattern matching capability. As more complex and data-intensive tasks emerge, it is becoming increasingly important to enhance CAM density for improved performance and better area efficiency. To reduce the area overheads, various nonvolatile memory (NVM)...
Preprint
Full-text available
Cache serves as a temporary data memory module in many general-purpose processors and domain-specific accelerators. Its density, power, speed, and reliability play a critical role in enhancing the overall system performance and quality of service. Conventional volatile memories, including static random-access memory (SRAM) and embedded dynamic rand...
Article
Achieving brain-like density and performance in neuromorphic computers necessitates scaling down the size of nanodevices emulating neuro-synaptic functionalities. However, scaling nanodevices results in reduction of programming resolution and emergence of stochastic non-idealities. While prior work has mainly focused on binary transitions, in this...
Article
Full-text available
Realizing compact and scalable Ising machines that are compatible with CMOS-process technology is crucial to the effectiveness and practicality of using such hardware platforms for accelerating computationally intractable problems. Besides the need for realizing compact Ising spins, the implementation of the coupling network, which describes the sp...
Article
In this article, we have developed a comprehensive modeling framework to explain the switching characteristics of BEOL-compatible ferroelectric field-effect transistor (FeFET) with an amorphous IGZO channel. Our TCAD-based modeling framework, calibrated against measurement data, jointly incorporates: 1) the distributed channel; 2) a physics-based n...
Article
Through careful design of the area ratio (AR) of the back-end-of-line (BEOL)-compatible metal–ferroelectric–metal–insulator–semiconductor (MFMIS) ferroelectric field-effect transistor (FeFET), we are able to modulate the charge injection in the gate-stack and successfully extend the memory window (MW) to $\sim$8 V, far beyond the theoretical lim...
Article
In this work, a thorough assessment of the robustness of complementary channel HfO$_2$ ferroelectric FET (FeFET) against total ionizing dose (TID) radiation is conducted, with the goal of determining its suitability for use as high-performance and ene...
Preprint
Full-text available
The recent progress in quantum computing and space exploration led to a surge in interest in cryogenic electronics. Superconducting devices such as Josephson junction, Josephson field effect transistor, cryotron, and superconducting quantum interference device (SQUID) are traditionally used to build cryogenic logic gates. However, due to the superc...
Article
To fully exploit the ferroelectric field effect transistor (FeFET) as compact embedded nonvolatile memory for various computing and storage applications, it is desirable to use a single FeFET (1T) as a unit cell and arrange the cells into an array. However, many write mechanisms for a 1T FeFET array reported in the literature are yet to be validat...
Cover Page
Massively Scalable Ternary Content‐Addressable Memories. In article number 2200643, Xunzhao Yin, Xiao Gong, and co‐workers present high‐performance floating‐gate transistors with an amorphous‐InGaZnO channel and their application in ternary content addressable memories (TCAMs). The two‐transistor configuration and the extremely low OFF current of the...
Preprint
Field Programmable Gate Array (FPGA) is widely used in acceleration of deep learning applications because of its reconfigurability, flexibility, and fast time-to-market. However, conventional FPGA suffers from the tradeoff between chip area and reconfiguration latency, making efficient FPGA accelerations that require switching between multiple conf...
Article
As one type of associative memory, content-addressable memory (CAM) has become a critical component in several applications, including caches, routers, and pattern matching. Compared with the conventional CAM that could only deliver a “matched or not-matched” result, emerging multilevel CAM (ML-CAM) is capable of delivering “the degree of match” wi...
Preprint
Full-text available
Achieving brain-like density and performance in neuromorphic computers necessitates scaling down the size of nanodevices emulating neuro-synaptic functionalities. However, scaling nanodevices results in reduction of programming resolution and emergence of stochastic non-idealities. While prior work has mainly focused on binary transitions, in this...
Preprint
Full-text available
In this work, we propose a ferroelectric FET (FeFET) time-domain compute-in-memory (TD-CiM) array as a homogeneous processing fabric for binary multiplication-accumulation (MAC) and content addressable memory (CAM). We demonstrate that: i) the XOR(XNOR)/AND logic function can be realized using a single cell composed of 2 FeFETs connected in series; i...
Article
Full-text available
3D NAND has been enabling continuous NAND density and cost scaling beyond conventional 2D NAND since sub‐20‐nm nodes. However, its poly‐Si channel suffers from low mobility, instability caused by grain boundaries, and large device‐to‐device variations in electrical characteristics at highly scaled device dimensions. These drawbacks can be overcome...
Article
HfO$_2$-based FeFET is a remarkably promising candidate among emerging memory technologies. Its manifold applications range from nonvolatile memory to neuromorphic computing. However, the memory window (MW) is limited, since the ferroelectric propert...
Article
We demonstrate nonvolatile and area-efficient ternary content-addressable memories (TCAMs) featuring amorphous indium–gallium–zinc–oxide (a-IGZO) ferroelectric field-effect transistors (FeFETs) with excellent electrical characteristics. An extremely large sensing margin of the TCAM array is achieved due to the large current ON/OFF ratio ( $I_{ \mat...
Preprint
Full-text available
Hardware security has been a key concern in modern information technologies. In particular, as the number of Internet-of-Things (IoT) devices grows rapidly, protecting device security with low-cost security primitives becomes essential, among which the Physical Unclonable Function (PUF) is a widely used solution. In this paper, we propose the first Fe...
Preprint
In a number of machine learning models, an input query is searched across the trained class vectors to find the closest feature class vector under the cosine similarity metric. However, computing the cosine similarities between the vectors on von Neumann machines involves a large number of multiplications, Euclidean normalizations and division operations...
Preprint
Full-text available
Realizing compact and scalable Ising machines that are compatible with CMOS-process technology is crucial to the effectiveness and practicality of using such hardware platforms for accelerating computationally intractable problems. Besides the need for realizing compact Ising spins, the implementation of the coupling network, which describes the sp...
Article
Full-text available
Existing circuit camouflaging techniques to prevent reverse engineering increase circuit-complexity with significant area, energy, and delay penalty. In this paper, we propose an efficient hardware encryption technique with minimal complexity and overheads based on ferroelectric field-effect transistor (FeFET) active interconnects. By utilizing the...
Article
The Ising-physical unclonable function (PUF) is a recent PUF structure formed of a network of APUFs inspired by the Ising model. A large challenge–response pair (CRP) space with high resilience against machine learning modeling attacks can be attained due to the unique arrangement. These advantages, however, are achieved at the cost of a large area...
Preprint
Full-text available
Content addressable memory (CAM) is widely used in associative search tasks for its highly parallel pattern matching capability. To accommodate the increasingly complex and data-intensive pattern matching tasks, it is critical to keep improving the CAM density to enhance the performance and area efficiency. In this work, we demonstrate: i) a novel...
Article
In this work, a comprehensive study of charge trapping and de-trapping dynamics is performed on n-channel ferroelectric field-effect transistors (nFeFETs) and pFeFETs. It is discovered that: 1) the degree of charge trapping depends on the substrate: nFeFETs exhibit significant electron trapping but negligible hole trapping during memory write w...
Article
While the theoretical maximum of the memory window $\Delta V_t$ in a ferroelectric field-effect transistor (FEFET) is $2E_C t_F$, $E_C$ and $t_F$ being coercive field and FE thickness, respectively, experimentally $\Delta V_t$ is observed to be much less than that, even in the case of complete polarization switchin...
Article
Silicon channel ferroelectric field-effect transistors (FeFETs) with low-k interfacial layer (IL) between ferroelectric and silicon channel suffer from high write voltage, limited write endurance and long read-after-write latency. This is due to early IL breakdown and mobile charge injection at the ferroelectric-IL interface. Here, we demonstrate...
Article
Content addressable memories (CAMs) are a promising category of computing-in-memory (CiM) elements that can perform highly parallel and efficient search operations for routers, pattern matching, and other data-intensive applications. Various magnetic tunnel junction (MTJ)-based CAM designs have been proposed to realize zero standby power and high-p...
Article
A fast and efficient search function across the database has been a core component for a number of data-intensive tasks in machine learning, IoT applications, and inference. However, the conventional digital machines implementing the search functionality with repetitive arithmetic operations suffer from the energy efficiency and performance degrada...
Preprint
3D NAND enables continuous NAND density and cost scaling beyond conventional 2D NAND. However, its poly-Si channel suffers from low mobility, large device variations, and instability caused by grain boundaries. Here, we overcome these drawbacks by introducing an amorphous indium-gallium-zinc-oxide (a-IGZO) channel, which has the advantages of ultra...
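One abstract in the list above quotes the theoretical memory-window bound $\Delta V_t = 2E_C t_F$. As a rough worked example only: the 10 nm thickness is borrowed from the HZO devices mentioned in another abstract, and the coercive field of 1 MV/cm is an assumed round number, not a value reported in any of the listed works.

```latex
% Illustrative arithmetic only; E_C = 1 MV/cm is an assumed round value.
\Delta V_t^{\max} = 2 E_C t_F
  = 2 \times \left(10^{6}\,\mathrm{V/cm}\right) \times \left(10\,\mathrm{nm}\right)
  = 2 \times \left(10^{6}\,\mathrm{V/cm}\right) \times \left(10^{-6}\,\mathrm{cm}\right)
  = 2\,\mathrm{V}
```

Under these assumed numbers the ideal bound would be about 2 V, which gives a sense of scale for abstracts that report measured windows below or well above it.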

Citations

... The HDC operators (binding, bundling, and permutation) construct sets, associations, and sequences respectively, facilitating the interpretable creation and manipulation of complex objects for data representation, learning, and processing. For learning, an HDC model makes decisions by evaluating the similarity between query and model hypervectors [11,26,12,18,16]; for cognitive processing, an HDC model retrieves information directly over the hyperspace with HDC operators; it then decodes the information with similarity functions and the atomic hypervectors [9,22,30]. Recent work has shown great advantages of HDC in enhancing the cognitive capability of neural networks in an explainable fashion [7]: a neural network learns to encode and perform HDC-like composition of the data over Raven's Progressive Matrix, a visual reasoning task over the symbolic attributes of the objects, and significantly outperforms state-of-the-art pure DNN and neuro-symbolic AI solutions in both accuracy and efficiency scaling. While HDC serves as a natural bridge between neural models and symbolic reasoning, it faces some scalability issues when it comes to decoding, a crucial process for information retrieval. ...
... COPs are a class of problems where the decision maker is required to make a series of choices from a limited set of options, and the goal is to find the optimal set of combinations that achieves the best outcomes. Combinatorial optimization plays a central role in tackling some of the most challenging problems across diverse domains [2,3,4,5,6], from logistics and transportation to resource allocation and scheduling. For example, logistics companies use combinatorial optimization to optimize delivery routes and schedules. ...
... However, there are applications that need to apply the $V_{READ}$ constantly, which brings serious concern over the state stability. A noteworthy example is FeFET based reconfigurable computing, where the FeFET is exploited as a compact nonvolatile switch as shown in Fig. 1a [13,14]. By combining the configuration memory and a switch together, a FeFET based nonvolatile switch can implement the routing network in a field programmable gate array (FPGA) by constructing the connection box (CB) and switch box (SB). ...
... Although JJ and SQUID have been fundamental components of superconducting circuits and systems [7,11], circuits based on these devices encounter several challenges, including limited cascadability, fabrication complexities, and poor scalability [2,18]. Three-terminal cryotron devices, demonstrating channel switching between superconducting and resistive states controlled by gate current, were proposed in response to these challenges [18]. ...
... AM has been deployed in a variety of scenarios such as HDC [23], [25], [26], MANN [14], few-shot learning [18], and so on. Table I summarizes existing AMs based on single-level cell/multi-level cell (SLC/MLC) NVMs with different distance functions. ...
... The latter has very high density, but requires several steps to carry out a search operation. Recent advancements include single FeFET CAM designs that exploit the ambipolar transport of devices, such as Schottky barrier FeFET or ambipolar ferroelectric tunnel FET (FeTFET), allowing pattern searching in just one step [44]. However, the asymmetry in ambipolar FeFETs impacts the CAM functionality and necessitates careful tuning [44]. ...
... The design incorporates the recently proposed dual-port FeFET, which offers an increased memory window (MW) and separate read/write paths for read-disturb-free operation [13], [14], [15], [16]. In Fig. 1, we illustrate how the dual-port FeFET functions as a PDE, providing variable delay based on the applied write voltage (WV). ...
... In contrast, thin Zr-doped HfO$_2$ (HZO) exhibits excellent ferroelectric performance at low driving voltages even at a thickness of merely 5 nm [27-29], which is crucial for scalable integrated circuits incorporating ultra-thin oxide semiconductors. Currently, whether it is two-terminal devices like FRAM or three-terminal transistors integrated with IGZO channels, the annealing temperature of HZO has been shown to be compatible with BEOL conditions [19,30-32]. In addition, FeFETs based on HZO have demonstrated synaptic characteristics [33]. ...
... Therefore, there will be no mismatch between the logic and memory blocks in terms of speed, power consumption, and fabrication process which will also reduce the "memory wall" bottleneck [33-37]. The major contributions of this work are outlined below. ...
... FeRAM has been in the market for some time and has niche usage in low-power-embedded systems, thanks to its speed, low power access, and low voltage operation. However, it has not reached widespread adoption in CAMs mainly due to its moderate area footprint and its destructive read operation [37], [38]. To address this issue, a novel approach has been developed utilizing quasi-nondestructive readout (QNRO) of the capacitor polarization. ...
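The first citation snippet above names the three HDC operators (binding, bundling, permutation) and similarity-based decoding. The following is a minimal sketch of one common formulation, assuming bipolar (+1/-1) hypervectors, elementwise multiplication for binding, majority vote for bundling, and cyclic rotation for permutation. The dimensionality, function names, and the toy record are illustrative assumptions, not taken from any of the cited works.

```python
import random

DIM = 10_000  # hypervector dimensionality (assumed for illustration)


def random_hv(rng):
    """Draw a random bipolar hypervector with elements in {-1, +1}."""
    return [rng.choice((-1, 1)) for _ in range(DIM)]


def bind(a, b):
    """Binding: elementwise product; associates two hypervectors."""
    return [x * y for x, y in zip(a, b)]


def bundle(*hvs):
    """Bundling: elementwise majority vote; forms a set (ties go to +1)."""
    return [1 if sum(col) >= 0 else -1 for col in zip(*hvs)]


def permute(a, shift=1):
    """Permutation: cyclic rotation; encodes position in a sequence."""
    shift %= len(a)
    return a[-shift:] + a[:-shift]


def similarity(a, b):
    """Normalized dot product in [-1, 1]; near 0 for unrelated vectors."""
    return sum(x * y for x, y in zip(a, b)) / len(a)


def demo():
    """Encode a toy record of three key/value pairs and decode one value."""
    rng = random.Random(42)
    color, shape, size = (random_hv(rng) for _ in range(3))   # keys
    red, circle, large = (random_hv(rng) for _ in range(3))   # values
    record = bundle(bind(color, red), bind(shape, circle), bind(size, large))
    # Binding is its own inverse for bipolar vectors, so binding the record
    # with a key yields a noisy copy of the value stored under that key.
    noisy_value = bind(record, color)
    return similarity(noisy_value, red), similarity(noisy_value, circle)


if __name__ == "__main__":
    match, other = demo()
    print(f"similarity to stored value: {match:.2f}, to unrelated value: {other:.2f}")
```

Decoding here is a similarity search of the noisy result against the known atomic hypervectors, which is the step the snippet identifies as the scalability bottleneck.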