Block diagram of the hardware monitor [39].

Source publication
Article
Full-text available
Soft-core processors implemented in SRAM-based FPGAs are an attractive option for applications deployed in radiation environments due to their flexibility, relatively low application-development costs, and reconfigurability, which enables them to adapt to evolving mission needs. Despite the advantages soft-core processors possess, they...

Similar publications

Preprint
Full-text available
High-Level Synthesis has introduced reconfigurable logic to a new world -- that of software development. The newest wave of HLS tools has been successful, and the future looks bright. But is HLS the end-all-be-all to FPGA acceleration? Is it enough to allow non-experts to program FPGAs successfully, even when dealing with troublesome data structure...
Article
Full-text available
This paper presents a power-oriented monitoring of clock signals that is designed to avoid synchronization failure in computer systems such as FPGAs. The proposed design reduces power consumption and increases the power-oriented checkability in FPGA systems. These advantages are due to improvements in the evaluation and measurement of corresponding...
Chapter
Full-text available
Approximate computing is a design paradigm for error-tolerant applications. By relaxing the accuracy requirement, it can significantly reduce circuit area and power consumption. There are many approximate logic synthesis (ALS) methods, but few of them target FPGA designs. In this work, we propose an ALS method for FPGAs based on decomposition. I...
Article
Full-text available
Field Programmable Gate Arrays (FPGAs) are a special type of processor that the end user can configure directly. This paper investigates the design of low-power reconfigurable asynchronous FPGA cells. The proposed design combines four-phase dual-rail encoding and LEDR (Level-Encoded Dual-Rail) encoding with a sleep controller. Four-phase dual-rail en...
Preprint
Full-text available
The online reconstruction of muon tracks in High Energy Physics experiments is a highly demanding task, typically performed with programmable logic boards, such as FPGAs. Complex analytical algorithms are executed in a quasi-real-time environment to identify, select and reconstruct local tracks in often noise-rich environments. A novel approach to...

Citations

... These radiation effects are well known, as are solutions for mitigating them [1]-[4]. An example of such radiation-hardening mitigation techniques are those applied, on an FPGA, to the LEON3 processor [5], which is widely used in space. Such mitigations focus on single-bit flips. ...
... Adopting fault mitigation or fault tolerance techniques is vital if FPGAs are used in radiation environments. Fault tolerance techniques that enhance embedded processor reliability can be categorized as hardware-, software- and hybrid-based techniques [8]. ...
Preprint
Full-text available
The emergence of new nanoscale technologies has imposed significant challenges on designing reliable electronic systems in radiation environments. A few types of radiation effects, like Total Ionizing Dose (TID), often cause permanent damage to such nanoscale electronic devices, and current state-of-the-art approaches to tackling TID rely on expensive radiation-hardened devices. This paper focuses on a novel and different approach: using machine learning algorithms on consumer-grade Field Programmable Gate Array (FPGA) boards to monitor TID effects so the boards can be replaced before they stop working. The research challenge is anticipating when a board will fail completely due to TID effects. We observed internal measurements of the FPGA boards under gamma radiation and used three different anomaly detection machine learning (ML) algorithms to detect anomalies in the sensor measurements in a gamma-irradiated environment. The statistical results show a highly significant relationship between the gamma radiation exposure levels and the board measurements. Moreover, our anomaly detection results show that a One-Class Support Vector Machine with a Radial Basis Function kernel has an average recall score of 0.95. Also, all anomalies can be detected before the boards stop working.
... Over the years, many efficient techniques have been used to mitigate radiation effects [2], often making use of some notion of spatial or temporal redundancy [3]- [7]. Triple Modular Redundancy (TMR), one of the most commonly employed solutions, is a technique that employs three instances of a module and adds a majority voter at their outputs. ...
Preprint
The interplay between security and reliability is poorly understood. This paper shows how triple modular redundancy affects a side-channel attack (SCA). Our counterintuitive findings show that modular redundancy can increase SCA resiliency.
... AC has been explored for processor components, graphical processing units, and field-programmable gate arrays (FPGAs) [3]. Based on the implementation level, the work of [8] grouped approximate computing techniques into software, architecture, and hardware techniques. ...
... The processing power of computers is exhausted by the incessant increase in the data to be processed, and shrinking technology has added its own intrinsic challenges. Computer systems are susceptible to soft errors caused by high-energy particles or other types of radiation, leading to bit flips in logic values [3,4,11]. Typically, these soft errors are rectified through the redundancy technique of fault masking. ...
... Typically, these soft errors are rectified through the redundancy technique of fault masking. In triple modular redundancy (TMR), the original circuit is replicated three times, and the output of each replica is fed as an input to a majority voter that provides the final output [3,4]. The triplication of the original module leads to an area overhead of 200% and related overheads. ...
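The TMR scheme described in the excerpt above — three replicas feeding a majority voter — can be sketched in a few lines. This is a minimal software illustration (the `module` callable and sample values are hypothetical); in hardware the replicas and voter are parallel logic, not sequential calls:

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise majority of three replica outputs: any bit flipped
    in a single replica is outvoted by the other two."""
    return (a & b) | (a & c) | (b & c)

def tmr(module, x):
    """Run three copies of `module` on the same input and vote."""
    return majority_vote(module(x), module(x), module(x))

# A single-replica upset is masked by the voter:
good = 0b1011
assert majority_vote(good, good, good ^ 0b0100) == good
```

The `(a & b) | (a & c) | (b & c)` expression is the standard two-out-of-three voter equation applied bitwise, which is why the area cost of TMR is roughly the tripled module plus a small voter.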
Article
Full-text available
In recent years, approximate computing (AC) has attracted attention owing to its tradeoff between the exactness of computations and performance gains. AC has also been explored for triple modular redundancy (TMR). TMR is a well-known fault-masking methodology, with associated overheads, widely used in systems of different natures and at different levels, e.g., layout level, gate level, hardware-module level, and software. At the hardware level, exploiting AC can reduce the 200% area overhead caused by triplicating the original module in TMR. By approximating the modules of TMR while ensuring that, for every input vector, at least two of the approximate modules do not differ from the original module, fault masking is preserved and the overhead can be reduced. Hence, approximate TMR (ATMR) aims to achieve cost-effective reliability. Nevertheless, due to the extensive search space, computational complexity, and the fault-masking requirement of ATMR, designing an ATMR is a challenging task. An ATMR technique must be scalable so that it can be easily adopted by circuits with a large number of inputs while keeping the extraction of ATMR modules computationally inexpensive. Compared with TMR, the inclusion of approximations makes ATMR more vulnerable to errors, and hence the design technique must be aware of input criticality. To the best of the authors' knowledge, none of the existing survey articles on AC has reported on ATMR. Therefore, in this work, ATMR design techniques are thoroughly surveyed and qualitatively compared. Moreover, design considerations and challenges for designing ATMR are discussed.
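The ATMR validity condition stated in the abstract — for every input vector, at least two of the three approximate modules must match the exact module — can be checked exhaustively for small circuits. A sketch under assumed names (the `exact`/`approx` functions here are toy examples, not any surveyed technique):

```python
from itertools import product

def atmr_valid(exact, approx_modules, n_inputs):
    """Check the ATMR fault-masking condition: on every input
    vector, at least two of the three approximate modules must
    produce the exact output, so a majority voter over the
    approximate replicas still yields the exact result."""
    for bits in product([0, 1], repeat=n_inputs):
        matches = sum(m(*bits) == exact(*bits) for m in approx_modules)
        if matches < 2:
            return False
    return True

# Toy 2-input example: exact AND, with two replicas approximated
# so that each is wrong on a *different* input vector.
exact_and = lambda a, b: a & b
replicas = [lambda a, b: a & b,  # exact replica
            lambda a, b: a,      # wrong only on (1, 0)
            lambda a, b: b]      # wrong only on (0, 1)
assert atmr_valid(exact_and, replicas, 2)
```

The exhaustive loop also shows why the abstract stresses scalability: this check is exponential in the number of inputs, so practical ATMR extraction needs cheaper analysis for large circuits.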
Article
The limitations of scaling in CMOS technology pose challenges in meeting the requirements of future applications. To address these challenges, researchers are exploring various design techniques, including Approximate Computing (AC), which leverages the inherent error resilience of applications to achieve high performance and energy gains with desired quality. AC has gained popularity as a computer paradigm for error-resilient applications, and many researchers have studied AC across computing layers and developed tools for implementing these techniques. This paper provides a comprehensive survey of AC techniques at the abstraction levels of software and hardware and discusses the tools to implement AC in hardware and software, quality evaluation tools and comparison points. The paper also covers existing frameworks for AC, potential applications, future research directions, challenges and limitations. This information can guide researchers in identifying promising avenues for further advancements and innovations in this domain. Additionally, this paper compares state-of-the-art surveys of AC and highlights the unique features and contributions of this work that distinguish our work from previous surveys.
Article
Full-text available
The emergence of new nanoscale technologies has imposed significant challenges on designing reliable electronic systems in radiation environments. A few types of radiation effects, like Total Ionizing Dose (TID), can cause permanent damage to such nanoscale electronic devices, and current state-of-the-art approaches to tackling TID rely on expensive radiation-hardened devices. This paper focuses on a novel and different approach: using machine learning algorithms on consumer-grade Field Programmable Gate Array (FPGA) boards to monitor TID effects so the boards can be replaced before they stop working. The research challenge is anticipating when a board will fail completely due to TID effects. We observed internal measurements of FPGA boards under gamma radiation and used three different anomaly detection machine learning (ML) algorithms to detect anomalies in the sensor measurements in a gamma-irradiated environment. The statistical results show a highly significant relationship between the gamma radiation exposure levels and the board measurements. Moreover, our anomaly detection results show that a One-Class SVM with a Radial Basis Function kernel has an average recall score of 0.95. Also, all anomalies can be detected before the boards become entirely inoperative, i.e. voltages drop to zero, as confirmed with a sanity check.
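The paper's detector is a One-Class SVM with an RBF kernel trained on sensor telemetry. As a dependency-free sketch of the same pipeline shape — fit a model of nominal sensor behavior, then flag readings that fall outside it — here is a simple z-score detector; the sensor values and the 3-sigma threshold are illustrative assumptions, not the paper's method:

```python
from statistics import mean, stdev

def fit_baseline(nominal_readings):
    """Learn a per-sensor (mean, std) model from pre-irradiation data."""
    return mean(nominal_readings), stdev(nominal_readings)

def is_anomaly(reading, baseline, k=3.0):
    """Flag a reading more than k standard deviations from the
    nominal baseline (a stand-in for the one-class decision)."""
    mu, sigma = baseline
    return abs(reading - mu) > k * sigma

# Hypothetical core-voltage samples (volts) before irradiation:
nominal = [1.00, 1.01, 0.99, 1.00, 1.02, 0.98, 1.00, 1.01]
base = fit_baseline(nominal)
assert not is_anomaly(1.01, base)   # within nominal spread
assert is_anomaly(0.80, base)       # voltage droop flagged
```

A one-class SVM replaces the (mean, std) model with a learned boundary in a kernel feature space, which handles multi-sensor, non-Gaussian telemetry that a per-sensor threshold cannot.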
Article
This paper proposes a novel one-port passive circuit topology consisting of a two-dimensional network of resistors and capacitors, which can be used as a fault-tolerant building block for analog circuit design. Through an analytical procedure, the network is shown to follow simple first-order admittance dynamics. A Monte Carlo method is employed to describe the effect of simultaneous faults (short or open circuit) in random network elements in terms of confidence bounds on the frequency-domain admittance profile. Faults in 10% of the elements resulted in only minor changes to the frequency response (up to 3.9 dB in magnitude and 12.5° in phase in 95% of the cases). An example is presented to illustrate the use of the proposed RC network in the fault-tolerant design of a low-pass filter.
Article
Full-text available
All-Programmable System-on-Chips (APSoCs) constitute a compelling option for deploying applications in radiation environments thanks to their high-performance computing and power-efficiency merits. Despite these advantages, APSoCs are sensitive to radiation like any other electronic device. Processors embedded in APSoCs therefore have to be adequately hardened against ionizing radiation to make them a viable design choice for harsh environments. This paper proposes a novel lockstep-based approach to harden the dual-core ARM Cortex-A9 processor in the Xilinx Zynq-7000 APSoC against radiation-induced soft errors by coupling it with a MicroBlaze TMR subsystem in the programmable logic (PL) layer of the Zynq. The proposed technique uses checkpointing along with roll-back and roll-forward mechanisms at the software level (i.e. software redundancy), as well as processor replication and checker circuits at the hardware level (i.e. hardware redundancy). Results of fault injection experiments show that the proposed approach achieves high levels of protection against soft errors, mitigating around 98% of bit-flips injected into the register files of both ARM cores while keeping the timing performance overhead as low as 25% if block and application sizes are adjusted appropriately. Furthermore, incorporating the roll-forward recovery operation in addition to roll-back improves the Mean Workload Between Failures (MWBF) of the system by up to ≈19%, depending on the nature of the running application: when a fault occurs, the application proceeds faster if treated with the roll-forward operation rather than the roll-back operation, so relatively more data can be processed before the next error occurs in the system.
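The checkpointing with roll-back and roll-forward described in this abstract can be sketched conceptually. This is a simplified software model under assumed interfaces (the block structure, fault detector, and recompute hook are illustrative, not the paper's Zynq implementation, where a MicroBlaze TMR subsystem performs the checking in hardware):

```python
import copy

def run_with_checkpoints(blocks, state, detect_fault, recompute):
    """Process work blocks, saving a checkpoint of the architectural
    state before each one. On a detected fault, either roll forward
    (adopt a recomputed correct state and continue) or roll back
    (restore the checkpoint and re-execute the block)."""
    for block in blocks:
        checkpoint = copy.deepcopy(state)
        state = block(state)
        if detect_fault(state):
            forward = recompute(checkpoint, block)
            if forward is not None:
                state = forward            # roll-forward: keep going
            else:
                state = block(checkpoint)  # roll-back: retry the block
    return state

# Hypothetical demo: three blocks each double a counter; a checker
# that never flags a fault leaves the result untouched.
final = run_with_checkpoints(
    blocks=[lambda s: {"x": s["x"] * 2}] * 3,
    state={"x": 1},
    detect_fault=lambda s: False,
    recompute=lambda cp, blk: None,
)
assert final == {"x": 8}
```

The MWBF gain reported above follows from this structure: roll-forward skips the re-execution of the faulty block, so the workload resumes sooner than with roll-back alone.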