Figure 3 - uploaded by Christian Plessl
LUT and FF-based pipeline

Source publication
Article
Due to the continuously shrinking device structures and increasing densities of FPGAs, thermal aspects have become the new focus for many research projects over the last years. Most researchers rely on temperature simulations to evaluate their novel thermal management techniques. However, these temperature simulations require a high computational e...

Context in source publication

Context 1
... order to combine LUTs with FFs, we have designed a pipeline similar to the FF pipeline (Figure 2). The difference is that a LUT is inserted between each pair of FFs, as can be seen in Figure 3. In this design, all look-up tables are implemented as an exclusive OR (XOR) of the input signals I0 and I1. ...
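
The pipeline structure described in this excerpt can be illustrated with a small behavioral model. This is only a sketch under stated assumptions, not the authors' VHDL: the excerpt does not say what drives I0 and I1, so the model assumes I0 is the output of the preceding FF and I1 is a shared control input. The names `lut_xor`, `step`, and `simulate` are hypothetical.

```python
from typing import List


def lut_xor(i0: int, i1: int) -> int:
    """Model a 2-input LUT configured as XOR of I0 and I1."""
    return i0 ^ i1


def step(ffs: List[int], data_in: int, i1: int) -> List[int]:
    """Advance the pipeline by one clock cycle.

    ffs[k] is the current value of flip-flop k. Assumption: the LUT between
    FF k-1 and FF k takes the output of FF k-1 as I0 and a shared control
    signal as I1.
    """
    nxt = [data_in]  # the first FF samples the pipeline input
    for k in range(1, len(ffs)):
        nxt.append(lut_xor(ffs[k - 1], i1))
    return nxt


def simulate(n_ffs: int, inputs: List[int], i1: int = 0) -> List[List[int]]:
    """Return the FF values after each clock cycle, starting from all zeros."""
    ffs = [0] * n_ffs
    history = []
    for bit in inputs:
        ffs = step(ffs, bit, i1)
        history.append(list(ffs))
    return history


if __name__ == "__main__":
    # With I1 = 0 each LUT passes its input through, so a single 1 shifts along.
    for state in simulate(3, [1, 0, 0], i1=0):
        print(state)
```

Under these assumptions, holding I1 at 1 makes every LUT act as an inverter, so each stage flips the bit it receives from the stage before it.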

Citations

... As the final algorithm firmware is under development, special power-inefficient firmware, the Look-Up Table (LUT) oscillator algorithm [7], is used to test the power and thermal limits of the Apollo blade. The amount of power the FPGA draws can be adjusted dynamically. ...
... Further investigation suggests that large temperature gradients can also occur within a single SLR. Similar observations have been reported previously [7]. ...
... We would like to thank the authors of ref. [7] for the original VHDL code of the LUT oscillator used in our power tests. This work was supported by the National Science Foundation under Grant Nos. ...
Article
The challenging conditions of the High-Luminosity LHC require tailored hardware designs for the trigger and data acquisition systems. The Apollo platform features a “Service Module” with a powerful system-on-module computer that provides standard ATCA communications and application-specific “Command Modules” with large FPGAs and high-speed optical fiber links. The CMS version of Apollo will be used for the track finder and the pixel readout. It features up to two large FPGAs and more than 100 optical links with speeds up to 25 Gb/s. We carefully study the design and performance of the board by using customized firmware to test power consumption, heat dissipation, and optical link integrity. This paper presents the results of these performance tests, design updates, and future plans.
... Special power-inefficient firmware, the Look-Up Table (LUT) oscillator algorithm (developed by the authors of Ref. [7]), is used to make the FPGA draw a large amount of power. The amount of power the FPGA draws can be adjusted dynamically. ...
... Further investigation suggests that large temperature gradients can also occur within a single SLR. Similar observations have been reported previously [7]. ...
... We would like to thank the authors of Ref. [7] for the original VHDL code of the LUT oscillator used in our power tests. This work was supported by the National Science Foundation under grant no. ...
Preprint
The challenging conditions of the High-Luminosity LHC require tailored hardware designs for the trigger and data acquisition systems. The Apollo platform features a "Service Module" with a powerful system-on-module computer that provides standard ATCA communications and application-specific "Command Modules" with large FPGAs and high-speed optical fiber links. The CMS version of Apollo will be used for the track finder and the pixel readout. It features up to two large FPGAs and more than 100 optical links with speeds up to 25 Gb/s. We carefully study the design and performance of the board by using customized firmware to test power consumption, heat dissipation, and optical link integrity. This paper presents the results of these performance tests, design updates, and future plans.
... Another defense strategy would be to heat the DRAM and keep it at a certain temperature. For example, circuits such as Ring Oscillators can be used as heaters [19] and could be added to the DRAM chip. As long as the SoC could not heat the DRAM even higher, the DRAM would not be affected by the operations of the SoC. ...
... With proper calibration, ROs can be used as temperature sensors; comparing oscillation counts of an RO at a fixed location on an FPGA board at different times allows one to observe relative changes in the temperature of the FPGA board over time. Meanwhile, for controlling temperature, a free running RO can be used to generate heat, by constantly toggling transistors and maximizing dynamic power [1]. Use of a sensor diode for measuring temperature is also possible, and potentially more accurate. ...
... Existing work has shown that it is possible to heat up different regions of the FPGA [1]. However, observing the differences in temperature of different parts of the FPGA over time is not trivial. ...
... However, observing the differences in temperature of different parts of the FPGA over time is not trivial. The existing work [1] used high-resolution thermal cameras and observed the FPGA at what is equivalent to 0s of the idle period, i.e. at the moment when the FPGA is being heated, and not afterwards. when RO sensor is active at the same time as the heater. ...
Conference Paper
With increasing interest in Cloud FPGAs, such as Amazon's EC2 F1 instances or Microsoft's Azure with Catapult servers, FPGAs in cloud computing infrastructures can become targets for information leakage via covert channel communication. Cloud FPGAs leverage temporal sharing of the FPGA resources between users. This paper shows that heat generated by one user can be observed by another user who later uses the same FPGA. The covert data transfer can be achieved through simple on-off keying (OOK), and the use of multiple FPGA boards in parallel significantly improves data throughput. The new temporal thermal covert channel is demonstrated on Microsoft's Catapult servers with FPGAs running remotely in the Texas Advanced Computing Center (TACC). A number of defenses against the new temporal thermal covert channel are presented at the end of the paper.
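
On-off keying, as mentioned in this abstract, can be sketched in a few lines. This is a generic illustration, not the paper's implementation: it assumes one bit per time slot, a heater that is either on or off for the whole slot, and a receiver that already has one temperature estimate per slot. The function names, the midpoint threshold, and the sample readings are all hypothetical.

```python
from typing import List, Optional


def ook_encode(bits: List[int]) -> List[int]:
    """Map each bit to a heater state for one time slot (1 = on, 0 = off)."""
    return [1 if b else 0 for b in bits]


def ook_decode(readings: List[float], threshold: Optional[float] = None) -> List[int]:
    """Recover bits by thresholding one temperature estimate per slot.

    Assumption: if no threshold is given, the midpoint between the lowest
    and highest reading is used.
    """
    if threshold is None:
        threshold = (min(readings) + max(readings)) / 2
    return [1 if r > threshold else 0 for r in readings]


if __name__ == "__main__":
    schedule = ook_encode([1, 0, 1, 0])
    # Made-up readings in degrees Celsius, one per slot, for illustration only.
    readings = [40.2, 35.1, 39.8, 35.0]
    print(schedule, ook_decode(readings))
```

A midpoint threshold only works when the message contains both hot and cold slots; a real receiver would need a calibrated threshold.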
... One of the most important challenges to measuring the on-chip temperature is the restriction on the temperature ranges. Agne et al. [20] introduce seven ways to make the high-tech FPGAs heat using available internal resources. In this paper, we use the heat generator circuit that has been designed in [20], i.e., the one-level LUT-based heater. ...
... The processor local bus (PLB) connects the heater circuit to the MicroBlaze. The heat generator circuit consists of 10,000 one-level LUT-based oscillators, which enable the maximum toggling frequency [20]. Fig. 4 shows the schematic of the one-level LUT-based heater. ...
Conference Paper
During the last decades, technology scaling in reconfigurable logic devices has enabled the implementation of complicated designs, which results in higher power density and on-chip temperature. Since higher operating temperature of chips is a critical problem in electronic devices, thermal management techniques are highly required. To provide a thermal map of reconfigurable logic devices, a network of sensors is needed. In this work, a ring-oscillator-based temperature sensor is used to create a sensor network. Then, a design space exploration is done among several sensor networks with various sensor configurations, including different ring oscillator lengths, numbers of sensors in the examined network, and sampling times. We propose three criteria for exploring and comparing the efficiency of sensor networks based on the thermal overhead as well as measurement accuracy and precision across many configurations on the Virtex-6 FPGA.
... The limitation on the temperature ranges of sensors is the major drawback of the latest approach because modern FPGAs can operate in a wide range of temperatures (e.g. up to 120°C), but the evaluation boards suffer and may fail at these operating temperatures. To address this issue, in [24] a systematic study of heat-generating cores is performed and then seven ways are introduced to generate heat on modern FPGAs by utilizing different available resources of the device. More recently, Weber et al. [8] present a calibration effort for RO-based temperature sensors in FPGAs that employs a mixed approach to overcome the intra-sensor variation. ...
... In fact, in FPGA-based designs, each application has its own thermal behavior and increases the die temperature by a certain amount. To measure the thermal overhead of a sensor network, a fixed heat generator circuit (heater) is designed and developed on the FPGA [24] in which the generated heat is controllable. We tune the temperature generated by the heat generator circuit according to the results obtained from the study of the temperature of several benchmarks implemented on an FPGA [31]. ...
... The proposed temperature sensor, several designs of the RO-based temperature sensor, and the heat generator circuit are synthesized using HDL and connected to MicroBlaze via PLB, allowing the MicroBlaze to control them and access the measured data. The heat generator circuit is composed of one-level LUT-based oscillators, which enable the maximum toggling frequency [24]. Fig. 9 shows the LUT-based oscillator with a single LUT. ...
Article
The availability of FPGAs in cloud data centers offers rapid, on-demand access to reconfigurable hardware compute resources that users can adapt to their own needs. However, the low-level access to the FPGA hardware and associated resources such as the PCIe bus, SSD drives, or DRAM modules also opens up threats of malicious attackers uploading designs that are able to infer information about other users or about the cloud infrastructure itself. In particular, this work presents a new, fast PCIe-contention-based channel that is able to transmit data between FPGA-accelerated virtual machines by modulating the PCIe bus usage. This channel further works with different operating systems, and achieves bandwidths reaching 20 kbps with 99% accuracy. This is the first cross-FPGA covert channel demonstrated on commercial clouds, and has a bandwidth which is over 2000 × larger than prior voltage- or temperature-based cross-board attacks. This paper further demonstrates that the PCIe receivers are able to not just receive covert transmissions, but can also perform fine-grained monitoring of the PCIe bus, including detecting when co-located VMs are initialized, even prior to their associated FPGAs being used. Moreover, the proposed mechanism can be used to infer the activities of other users, or even slow down the programming of the co-located FPGAs as well as other data transfers between the host and the FPGA. Beyond leaking information across different virtual machines, the ability to monitor the PCIe bandwidth over hours or days can be used to estimate the data center utilization and map the behavior of the other users. The paper also introduces further novel threats in FPGA-accelerated instances, including contention due to network traffic, contention due to shared NVMe SSDs, as well as thermal monitoring to identify FPGA co-location using the DRAM modules attached to the FPGA boards. 
This is the first work to demonstrate that it is possible to break the separation of privilege in FPGA-accelerated cloud environments, and highlights that defenses for public clouds using FPGAs need to consider PCIe, SSD, and DRAM resources as part of the attack surface that should be protected.
Article
The impact of aging and process variations on the performance of VLSI systems is increasing with each process generation. The conventional way to counteract them is to use extensive guard bands, which are calculated at system design time. Hence, they are necessarily worst-case guard bands, i.e., most often too pessimistic. Current research tries to mitigate this by means of adaptive voltage scaling (AVS) based on in-situ performance measurement. The performance is typically measured by means of dedicated sensors or canary logic. The parametrization of such AVS systems relies on assumptions regarding the relative behavior of the sensor and the application logic. Most published approaches use manually gathered empirical data for this purpose. However, an automatic calibration procedure is needed for the practical application of these approaches. We propose such an automated calibration procedure and evaluate it on multiple FPGAs to consider the effects of aging and process variation. Furthermore, we use two designs to cover leakage-power-dominated and dynamic-power-dominated scenarios. We achieve average power savings of 67% for a leakage-dominated design and 48% for a test case with dominant dynamic power. Furthermore, we investigate the limitations of AVS systems regarding their capability to counteract fast disturbances, e.g., voltage drop.