Dirk Koch's research while affiliated with Universität Heidelberg and other places


Analysis of Process Variation Within Clock Regions of AMD-Xilinx UltraScale+ Devices

Chapter · March 2024 · 7 Reads

As semiconductor technology advances and transistor feature sizes shrink, the increasing significance of process variation poses critical challenges to the reliability of semiconductor devices. This paper thoroughly explores the impact of process variation within the Clock Regions (CRs) of AMD-Xilinx UltraScale+ devices. We employ a novel method to characterize process variation with significantly higher precision than conventional ring oscillator (RO)-based sensors. Our experimental findings on ZYNQ XCZU9EG reveal that the latency of resources during rising and falling transitions may differ. Additionally, the proximity of Interconnect (INT) tiles to various tile types can influence the latency of resources within a column in a given CR. Moreover, we demonstrate that specific segments within CRs consistently exhibit faster performance compared to other areas within the same CR.

Memory-Aware Scheduling for a Resource-Elastic FPGA Operating System

Conference Paper · September 2023 · 4 Reads

Shaden Alismail

The memory subsystem is often the main performance bottleneck in an FPGA acceleration system. This paper presents two memory-aware runtime schedulers that decide the order of running tasks to improve the system’s performance: memory model-aware (MMA) and memory access pattern-aware (MAPA) schedulers. The proposed approaches consider memory characteristics in scheduling decisions to alleviate the memory overhead and enhance the system’s performance. MMA considers the accessed memory regions when scheduling the tasks in a way that reduces the memory page miss rates. On the other hand, MAPA alleviates the pressure on the memory subsystem by scheduling the tasks mainly based on their memory intensity and access patterns. The proposed runtime schedulers are evaluated and implemented on an Ultra96 FPGA board. The presented approaches show (on average) approximately 10%, 22%, 12%, and 9% improvements in memory throughput, task execution time, makespan time, and job throughput, respectively, over an existing state-of-the-art memory-agnostic scheduler.

Efficient Resource Scheduling for Runtime Reconfigurable Systems on FPGAs

Conference Paper · September 2023 · 4 Reads

Shaden Alismail

Automated Generation and Orchestration of Stream Processing Pipelines on FPGAs

Conference Paper · December 2022 · 16 Reads · 1 Citation

Kristiyan Manev

byteman: A Bitstream Manipulation Framework

November 2022 · 775 Reads · 3 Citations

Kristiyan Manev

From better resource pooling for FPGA cloud providers to building dynamic execution pipelines at runtime, the capabilities of partial reconfiguration (PR) are waiting to be fully explored. However, the community still fails to materialize PR at scale, and FPGAs are only used as updatable ASICs, hence omitting the opportunities offered by dynamically reconfiguring FPGAs at runtime. This work proposes a resourceful FPGA bitstream manipulation framework. The proposed tool provides means for parsing, modification, and generation of bitstream files, and it has been open-sourced and demonstrated in a working system. As a distinguishing feature, it supports multi-die FPGAs (among the 106 Xilinx 7 Series, UltraScale, and UltraScale+ devices), and enables datacenter FPGAs to be used for relocatable PR. The versatile tool's built-in (dis)assembler allows for manual bitstream manipulations. Bundled with an efficient bitstream manipulation core, the efficacy is demonstrated by two case studies where we observe 58-377x higher bitstream merging throughput than a current state-of-the-art tool.

Automated Generation and Orchestration of Stream Processing Pipelines on FPGAs

November 2022 · 202 Reads

Kristiyan Manev

FPGAs have demonstrated substantial performance and energy efficiency advantages for workloads that fit a stream processing model with direct module-to-module communications. However, when the dataflow processing system is required to adapt to runtime conditions, current static acceleration solutions are limited in how efficiently the FPGA can be utilized due to the inability to switch out idling modules. To better use FPGAs in dynamic scenarios, this paper proposes using partial reconfiguration to stitch together different physically implemented operator modules on-the-fly. Rather than using designated module slots, our system places all modules and routing wires into a shared region with more placement options to minimize fragmentation. Furthermore, we use a module library that provides different resource and performance trade-offs for faster execution while considering the configuration cost. Then our system finds the optimal set of modules while scheduling multiple acceleration requests and managing all constraints transparently to the end-user. We demonstrate that the overheads of the middleware are insignificant enough to form accelerators with end-to-end execution times equal to hand-crafted static systems with small datasets while being 7.2× faster when streaming large datasets. We exemplified our approach for database acceleration, where the whole operation is abstracted to execute SQL queries directly.

byteman: A Bitstream Manipulation Framework

Preprint · October 2022 · 704 Reads

Kristiyan Manev

See published version: https://www.researchgate.net/publication/365276026

FPL Demo: Runtime Stream Processing with Resource-Elastic Pipelines on FPGAs

August 2022 · 181 Reads

Kristiyan Manev

FPGAs are efficient at dataflow applications, as demonstrated in various application domains, including machine learning, communication, and image processing. In this demo, we accelerate database management operations transparently to the user by stitching together partially reconfigurable stream processing modules that implement database operators. Our runtime system orchestrates this by building custom pipelines according to runtime conditions. This demo will showcase an acceleration of SQL queries using our dynamic stream processing system running on a ZCU102 FPGA board.

FPL Demo: FPGA Bitstream Virus Scanning

Conference Paper · August 2022 · 7 Reads · 1 Citation

Kristiyan Manev

Tunable Fine-grained Clock Phase-shifting for FPGAs

Conference Paper · August 2022 · 3 Reads · 1 Citation

... It also enables runtime bitstream manipulation by being 220 − 377× faster than related work. byteman has also been demonstrated in a database acceleration system that utilizes dynamic execution pipelines at runtime [40]. ...
Reference:
byteman: A Bitstream Manipulation Framework

FPL Demo: Runtime Stream Processing with Resource-Elastic Pipelines on FPGAs

Citing Conference Paper · August 2022

Kristiyan Manev

... There is a moderate amount of research on database acceleration focusing on analytics [20]. Similar to earlier works that accelerated joins [2], [21], there are tradeoffs when selecting between a sort and a hash-based solution [22], [23], each with separate adaptation challenges [24], [25]. For instance, even with a sorter-based pipeline, supporting arbitrary data in a join accelerator caused stalls and increased its memory requirements [21]. ...
Reference:
Efficient Adaptable Streaming Aggregation Engine

Automated Generation and Orchestration of Stream Processing Pipelines on FPGAs

Citing Conference Paper · December 2022

Kristiyan Manev

... AMD/Xilinx's dynamic function exchange (DFX) introduces a technology for setting up PR areas within a static system, allowing users to allocate modules to these specific areas on FPGA fabrics [38]. However, the Vivado toolchain has several weaknesses, such as being too slow for real-time applications and the lack of support for bitstream relocation [18]. On the other hand, the opensource tools, such as Byteman [18], improved the efficiency and speed of performing PR from an embedded processor, making it suitable for real-time applications, such as countermeasures against side-channel attacks during runtime. ...
Reference:
LaserEscape: Detecting and Mitigating Optical Probing Attacks

byteman: A Bitstream Manipulation Framework

Citing Conference Paper · Full-text available · November 2022

Kristiyan Manev

... Academic studies on the virtualization of FPGA resources are abundant in the research community [7], [8], covering a wide range of FPGA types (including PCIe, network, and SoC) and methodologies. ...
Reference:
SVFF: An Automated Framework for SR-IOV Virtual Function Management in FPGA Accelerated Virtualized Environments

The Future of FPGA Acceleration in Datacenters and the Cloud

Citing Article · February 2022 · ACM Transactions on Reconfigurable Technology and Systems

Christophe Bobda · Joel Mandebi Mbongue · [...] · Russell Tessier

... In addition to the safety and security concerns raised by multi-tenant FPGAs, timely availability of resources to legitimate tenants is also of utmost importance. A malicious tenant can hide behind the facade of legitimacy, waiting to initiate DoS for requests generated by legitimate tenants or may try to damage the PDN of multi-tenant FPGAs in order to cause long-term damage [86,87]. In this section, we present a scenario in which a malicious tenant threatens the availability of resources to legitimate tenants, followed by a defense mechanism that can fend against such attempts. ...
Reference:
Enabling Secure and Efficient Sharing of Accelerators in Expeditionary Systems

Denial-of-Service on FPGA-based Cloud Infrastructures — Attack and Defense

Citing Article · Full-text available · July 2021 · IACR Transactions on Cryptographic Hardware and Embedded Systems

... TEEOD [69] used reconfigurable FPGA technology to customize a secure enclave based on programmable logic for each application instance. In addition, there are also some studies on protecting FPGA cloud security and privacy based on TEE, such as TruFPGA [70]. ...
Reference:
Survey of research on confidential computing

Trusted Configuration in Cloud FPGAs

Citing Conference Paper · May 2021

Tommaso Frassetto · [...]

... Emerging memory technologies including resistive random access memory (ReRAM/RRAM) [1]- [4] provide potential solutions to the challenges faced by current memories due to their lower power consumption, scalability, non-volatility and high-speed operation. Apart from the typical use case as computer memory, such devices have other uses in applications such as in-memory computing [5], [6] and FPGAs [7], [8]. A common criteria of these applications is the use of large arrays with millions, billions or more of devices in a single chip [9]. ...
Reference:
A CMOS-based Characterisation Platform for Emerging RRAM Technologies

Memristor-based Pass Gate for FPGA Programmable Routing Switch

Citing Conference Paper · May 2021

Nguyen Cong Dao

... The virtual streams/channels are differentiated by StreamID and ChannelID signals of DSPI. 9,48,50,51,59,87,95,102 Abstract While FPGAs are becoming mainstream in the deployment of datacenters and cloud systems, they are mostly used as updatable ASICs. This thesis shows that it is feasible to achieve acceleration for runtime-only known problems using dynamically built stream processing pipelines if we efficiently exploit the given FPGA resources and utilize additional techniques such as resource elasticity. ...
Reference:
Resource Elastic Dynamic Stream Processing on FPGAs Exemplified on Database Acceleration

Moving Compute towards Data in Heterogeneous multi-FPGA Clusters using Partial Reconfiguration and I/O Virtualisation

Citing Conference Paper · December 2020

[...] · Iakovos Mavroidis

... Field-Programmable System-on-Chips (FPSoC) achieves even tighter integration by integrating processors and FPGAs on the same chip, similar to Duet at a high level. Many commodity FPSoCs [26], [36], [37], [42], [55] and academic FPSoCs [28], [45], [53] support full or partial cache coherence. For example, the Xilinx Zynq-7000 employs the AXI4 ACP interface [3] which supports uni-directional cache coherence (I/O coherency). ...
Reference:
Duet: Creating Harmony between Processors and Embedded FPGAs

FABulous: An Embedded FPGA Framework

Citing Conference Paper · February 2021

Nguyen Cong Dao · [...]

... 9, 62, 100, 116 module resource footprint variant Implemented module for different FPGA resource footprint to maximize placement options. 37,66,119,120,125,143 module stitching Building an execution pipeline at runtime by placing module bitstreams. 9,21,40,59,120,122 partial (re)configuration Changing the loaded FPGA bitstream for a partial region of the FPGA. ...
Reference:
Resource Elastic Dynamic Stream Processing on FPGAs Exemplified on Database Acceleration

Transparent Integration of a Dynamic FPGA Database Acceleration System

Citing Conference Paper · August 2020

Citations: 1273
