Ulrich Drepper's research while affiliated with Red Hat and other places

Publications (21)

Article
A key design decision for data systems is whether they follow the row-store or the column-store paradigm. The former supports transactional workloads, while the latter is better for analytical queries. This decision has a significant impact on the entire data system architecture. The multiple-decade-long journey of these two designs has led to a new...
Article
Transactional and analytical database management systems (DBMS) typically employ different data layouts: row-stores for the former and column-stores for the latter. In order to bridge the requirements of the two without maintaining two systems and two (or more) copies of the data, our proposed system Relational Memory employs specialized hardware th...
Conference Paper
Analytical database systems are typically designed to use a column-first data layout to access only the desired fields. On the other hand, storing data row-first works great for accessing, inserting, or updating entire rows. Transforming rows to columns at run-time is expensive, hence, many analytical systems ingest data in row-first form and trans...
Article
The proliferation of multi-core, accelerator-enabled embedded systems has introduced new opportunities to consolidate real-time systems of increasing complexity. But the road to build confidence on the temporal behavior of co-running applications has presented formidable challenges. Most prominently, the main memory subsystem represents a performan...
Preprint
We explore if unikernel techniques can be integrated into a general-purpose OS while preserving its battle-tested code, development community, and ecosystem of tools, applications, and hardware support. Our prototype demonstrates both a path to integrate unikernel techniques in Linux and that such techniques can result in significant performance ad...
Preprint
Analytical database systems are typically designed to use a column-first data layout that maps better to the access patterns of analytical queries. This choice is premised on the assumption that storing data in a row-first format leads to accessing unwanted fields; moreover, transforming rows to columns at runtime is expensive. On the other hand, new d...
Conference Paper
The proliferation of multi-core, accelerator-enabled embedded systems has introduced new opportunities to consolidate real-time systems of increasing complexity. But the road to build confidence on the temporal behavior of co-running applications has presented formidable challenges. Most prominently, the main memory subsystem represents a performan...
Conference Paper
Unikernels have demonstrated enormous advantages over Linux in many important domains, causing some to propose that the days of Linux's dominance may be coming to an end. On the contrary, we believe that unikernels' advantages represent the next natural evolution for Linux, as it can adopt the best ideas from the unikernel approach and, along with...
Article
CPU manufacturers are looking to increase the number of execution units, or CPU cores, and programs must be parallelized to use such cores. Splitting a program into multiple pieces that can be executed in parallel adds a whole dimension of additional problems, like dealing with multiple memory locations in traditional parallel programming. The concept of...
Article
Programming Language APIs and their libraries are often not designed with safety in mind. The API of the C language and the Unix API (which is defined using C) are especially weak in this respect. It is not necessarily the case that all programs developed using these programming languages and APIs are unsafe. In this document we will describe possi...
Article
Virtualization can be implemented in many different ways. It can be done with and without hardware support. The virtualized operating system can be expected to be changed in preparation for virtualization, or it can be expected to work unchanged. Regardless, software developers must strive to meet the three goals of virtualization spelled out by Ge...
Article
As CPU cores become both faster and more numerous, the limiting factor for most programs is now, and will be for some time, memory access. Hardware designers have come up with ever more sophisticated memory handling and acceleration techniques, such as CPU caches, but these cannot work optimally without some help from the programmer. Unfortunately, n...
Article
This document is completely, utterly out of date when it comes to descriptions of the limitations of the current implementation. Everybody referring to this document to document shortcomings of NPTL is either a moron who hasn't done her/his homework and researched the issue, or is deliberately misleading people as it happens too often with publicat...
Article
Starting with early versions of the 2.5 series, the Linux kernel contains a light-weight method for process synchronization. It is used in the modern thread library implementation but is also useful when used directly. This article introduces the concept and the user-level code needed to use it.
Article
Today, shared libraries are ubiquitous. Developers use them for multiple reasons and create them just as they would create application code. This is a problem, though, since on many platforms some additional techniques must be applied even to generate decent code. Even more knowledge is needed to generate optimized code. This paper introduces the r...
Article
Today's demands for threads can hardly be satisfied by the LinuxThreads library implementing POSIX threads which is currently part of the standard runtime environment. It was not written with the kernel extensions we have now and in the near future available, it does not scale, and it does not take modern architectures into account. A completely...

Citations

... They implement a two-level cache structure that dispatches the concurrent write threads efficiently and significantly mitigates the cache block competition problem with a flexible merging scheme. Tarikul et al. [25] proposed Relational Fabric. It is a near-data vertical partitioner that allows memory or storage components to perform on-the-fly transparent data transformation. ...
... [21] OS Noise Analysis on Azalea-Unikernel [22] Demo: On-The-Fly Generation of Unikernels for Software-Defined Security in Cloud Infrastructures [23] A TOSCA-Oriented Software-Defined Security Approach for Unikernel-Based Protection Clouds [24] Unikernel-based Approach for Software-Defined Security in Cloud Infrastructures [11] A Survey on Security Isolation of Virtualization, Containers, and Unikernels [5] Cloud Cyber Security: Finding an Effective Approach with Unikernels [16] Exokernel: An Operating System Architecture for Application-Level Resource Management [25] USETL: Unikernels for the Serverless Extract Transform and Load; Why Should You Settle for Less? [26] Want More Unikernels? Inflate Them! [7] Understanding and Hardening Linux Containers [27] MirageOS Unikernel with Network Acceleration for IoT Cloud Environments [28] Azalea Unikernel IO Offload Acceleration [29] Unikernel Network Functions: A Journey Beyond the Containers [30] Unikraft and the Coming Age of Unikernels [31] fASLR: Function-Based ASLR Via TrustZone-M and MPU for Resource-Constrained IoT Systems [32] Unikernels: Library Operating Systems for the Cloud [10] NCC Group Assessing Unikernel Security [14] A Syscall-Level Binary-Compatible Unikernel [33] Rethinking the Library OS from the Top Down [8] Unikernel Linux (UKL) [34] Virtualization: A Survey on Concepts, Taxonomy And Associated Security Issues [1] Virtualization and Containerization of Application Infrastructure: A Comparison [2] Uniguard Protecting Unikernels using Intel SGX [20] Occlum: Secure and Efficient Multitasking Inside a Single Enclave of Intel SGX [35] Panoply: Low-TCB Linux Applications with SGX Enclaves [36] A Design and Verification Methodology for Secure Isolated Regions [37] Container Security: Issues, Challenges, and the Road Ahead [38] Intra-unikernel Isolation with Intel Memory Protection Keys [13] A Security Perspective on Unikernels [15] A Fresh Look at the Architecture and Performance of Contemporary Isolation Platforms [12] Unikernels as Processes [39] Accelerating Disaggregated Data Centers Using Unikernel [40] To address RQ3, Figure 1 shows how many papers in the corpus were published, per year, between 1995 and 2023. Beginning in 2014 the number of papers steadily increased to five a year in both 2018 and 2019. ...
... Overall, the VirtIO standard has become a de facto standard for virtualized I/O in the industry, and is widely adopted by hypervisors and operating systems. Its flexibility and portability make it an ideal choice for embedded systems that require virtualization capabilities [31]. ...
... More specifically, in-network processing unlocks higher application performance by reducing inter- and intra-node communication and bypassing MPI software layers. As new classes of devices including programmable NICs/switches [11], [12], Data Processing Units (DPUs) [13], and accelerators (FPGAs, GPUs) [14]-[16] are emerging in the datacenters [17], [18], we posit that there is an unrevealed opportunity to further improve the performance by extending in-network collective processing to a new class of complex collectives. ...
... Existing proposals (e.g., [25], [27], [29], [32], [33], [34]) aim to reduce the complexity of hardware design by providing a library of building blocks to synthesize query evaluation pipelines. However, in [25], their proposal is restricted to single query execution pipeline synthesis and does not allow the synthesis of multiple query execution pipelines, while [27], [29], [32], [34] do not support the data stream query processing model with sliding window semantics. ...
... Finally, in order to implement through software the mechanisms for isolation needed by these heterogeneous architectures, virtualization is foreseen as an interesting solution (Sohal et al. 2022; Modica et al. 2018) ...
... Worse yet, it is challenging to define complex software regulation policies that account for more than a single performance metric. This contrasts with the wide range of performance metrics exported by modern platforms at multiple levels of their complex memory hierarchy, e.g., at the level of PE (ARM 2016a; Xilinx 2024b), interconnect (ARM 2016b), and memory controller (Sohal et al. 2020; Saeed et al. 2022). Third, it forces the integration of additional system-level software components at the OS (Yun et al. 2013) or hypervisor level (Modica et al. 2018; Sohal et al. 2020), with the corresponding engineering and performance overheads. ...
... While this approach achieves near-native performance due to its tight coupling with the host kernel, it does not provide adequate isolation or allow applications to customize kernel configurations or policies. To avoid the cost of request indirection, such as context switches, unikernels [17], [18], [19] are proposed to run a container and the guest kernel in the same address space. Although this approach helps mitigate the overhead, it requires significant engineering efforts to port legacy applications to unikernels. ...
... On the contrary, in-transaction page faults are rare in parallel programs that employ fine-grained synchronization, such as many programs from HTMBench [11] in which fine-grained locks were replaced by transactions. From these results, we observe that without adequate hardware support, page faults can be an important performance bottleneck in those programs that have been written from scratch keeping the TM paradigm in mind (its promise of making parallel programming easier through more coarse-grained transactions [12]). Other authors have also found page faults to be a limiting factor [13] [14] [15]. ...
... These events are recorded, analyzed, and displayed to the user for further exploration. We chose to follow the recommendation given by Ulrich Drepper [14,15], to inspect the memory subsystem effects by using a few event ratios rather than absolute values because it makes the interpretation easier. The following are brief descriptions of the event ratios we used in our low-level measurements: ...