Article

Improving Processor Utilization with a Task Classification Model based Application Specific Hard Real-Time Architecture


Abstract

Modern microprocessors with caches and pipelines show increasing performance, but at the price of decreasing predictability of execution times. The design of hard real-time systems, however, has to be based on worst-case considerations. Consequently, real-time systems are generally oversized and fail to profit from developments in the standard processor field. This paper presents an approach where real-time systems are analyzed and built according to a task classification model. Each class of tasks corresponds to a type of processor best suited in terms of performance and deterministic execution times. The resulting target architecture framework is a tightly coupled heterogeneous multiprocessor system based on templates using off-the-shelf components. The described real-time system design process includes a schedulability analysis method that supports the partitioning and allocation process and provides the necessary real-time guarantees. The result is an event-driven hard real-ti...


... Our target architecture REAR (Rapid Prototyping Environment for Advanced Real-Time Systems) was built after the multiprocessor architecture framework presented in [3]. In this approach, real-time systems are analyzed and partitioned according to a task classification model. ...
... The individual tasks to be performed can be classified according to the task classification model presented in [3]. In this model, two attributes, the deadline of the task and the complexity of the function to be performed, are used to allocate each task to the most suitable type of processing unit (here: HPU, RTU and CIOP). ...
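The allocation rule described in this excerpt can be illustrated with a minimal sketch. The excerpt names only the two attributes (deadline and complexity) and the three unit types. The numeric thresholds, the complexity scale, the direction of the mapping (short deadlines with simple functions to the CIOP, complex functions with relaxed deadlines to the HPU, everything else to the RTU) and all identifiers below are assumptions made for illustration, not values taken from [3].

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    deadline_us: float  # relative deadline in microseconds
    complexity: int     # hypothetical abstract measure of functional complexity


# Hypothetical thresholds; the source gives no numeric values.
SHORT_DEADLINE_US = 100.0
HIGH_COMPLEXITY = 1000


def classify(task: Task) -> str:
    """Return the assumed processing unit type for a task."""
    short = task.deadline_us <= SHORT_DEADLINE_US
    complex_fn = task.complexity >= HIGH_COMPLEXITY
    if short and not complex_fn:
        return "CIOP"  # assumed: tight deadline, simple function
    if complex_fn and not short:
        return "HPU"   # assumed: complex function, relaxed deadline
    return "RTU"       # assumed: everything in between


if __name__ == "__main__":
    for t in (Task("can_rx", 50.0, 10),
              Task("control_loop", 2000.0, 200),
              Task("logging_gui", 500000.0, 5000)):
        print(t.name, "->", classify(t))
```

A task that is both complex and tightly constrained falls through to the RTU in this sketch; a real design process would more likely flag such a task for further partitioning.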
... For the application threads on the RTU approximately 2500 lines of code (including application-specific header files and comments) were written, which compiled to 14 KByte program code (text segment) and less than 4 KByte initialized and uninitialized data (data and bss segments). Linked with the necessary RTEMS modules (52 KByte text) and the C library (43 KByte) ... Taking into account the size of the RTU's SRAM (512 KByte), this means that already this hand-coded, medium-size application almost fills the available fast memory. The planned automated code generation usually results in even larger code and data sizes. ...
Article
Rapid Prototyping is used in embedded systems design as a means to reduce development time and costs. At an early stage in the development cycle, the specification is implemented in a working prototype, which can be used to test the specification and, in real-time systems, also the timing constraints. The REAR Rapid Prototyping Environment was built as an adaptable target platform for embedded real-time systems. It supports both the proof that the system meets all its deadlines and the automated translation of a system specification into an executable prototype. This paper presents a CAN controller and monitor application, which was implemented and evaluated on REAR as a first non-trivial real-world application. This application represented a wide range of timing and coordination requirements towards the target architecture. The fact that it was possible to implement it successfully in reasonably short time on REAR is a proof of the soundness of the concept behind the REAR rapid proto...
... Our target architecture REAR (cf. Figure 1) was designed to support rapid prototyping of real-time systems. The basis for this is the task classification model presented in [6], where each type of real-time task corresponds to a best suited processor type, in terms of performance and deterministic execution times. It is a configurable and scalable heterogeneous multiprocessor system consisting of standard off-the-shelf components, which are tightly coupled by a global PCI-Bus. ...
Conference Paper
In the development of embedded hard real-time systems the ability to guarantee worst case execution times is gaining importance and complexity. The fast evolving processor acceleration techniques constantly increase the gap between a processor which uses the accelerations and one which does not. Modeling cache, pipelines and other parts of the processor gets increasingly difficult and time consuming. To circumvent this problem especially in the domain of rapid prototyping of embedded hard real-time systems, we propose to lay more weight on measurement and less on modeling. By analyzing the control flow graph the compiler uses for optimization, a reduced control flow graph can be generated, which limits the paths to be measured. Using this information the object code of the program is being instrumented and then measured. By measuring all paths the reduced control flow graph indicates, predictability is achieved without using too pessimistic estimations.
... The specification in SDL is annotated with a specification of the timing requirements using deadlines and a temporal description of the embedding system with event streams ([7]). Currently, the SDL model is partitioned manually by allocating SDL processes on the target architecture's HW and SW processing units according to the task classification model ([5]). For the software part, C code is generated using SDT's code generator CAdvanced, automatically including functions from the underlying real-time operating system RTEMS and from a scalable IPC library for the inter-unit communication. ...
Article
The specification of an embedded system at system level together with co-joint hardware/software synthesis is a goal of many rapid prototyping projects. SDL has been proposed as a formal and abstract specification language well suited for this purpose. In the automated generation of hardware, however, SDL's asynchronous communication model (directly implemented in the so-called server model) can lead to a large overhead in area and response time. The activity thread implementation model, on the other hand, is closer to hardware description language concepts and to execution in hardware, due to its synchronous communication and execution scheme. This paper compares VHDL code generation from SDL using these two models regarding implementation architectures, resource usage, throughput and response time. The integration in an existing rapid prototyping design process is presented as well as results gained from several application examples. 1 Introduction Embedded hard real-...
... REAR's target architecture (8 in Fig. 1.2, [FKMF97]) consists of a heterogeneous multiprocessor system complemented with additional field programmable gate arrays, tightly coupled with the microprocessor-based units. Its processing nodes (High Performance Unit (HPU), Real-Time Unit (RTU), and Configurable I/O Processor (CIOP)) are specialized according to the "task classification model" [FFKM97], where each type of real-time task corresponds to a best suited type of processing unit (PU), in terms of performance and deterministic execution times. Therefore, a straightforward allocation of the application tasks to the single PUs can be derived from a task grading according to both computing complexity and deadline. ...
Article
new deadline now have to be inserted into a task's only queue in a deadline sorted order. Deadline inheritance (DIP), respectively deadline ceiling (DCP) applied to message queues raise the dynamic priority of the receiving task and assure the avoidance of priority inversion effects caused by server tasks responding to multiple requests with unequal urgencies. Since MEDF scheduling implies an earliest deadline first processing sequence for all tasks, Gresser's schedulability analysis methodology for event-driven real-time systems [Gre93a] can be applied to prove the timeliness of a MEDF system. With these presumptions, a real-time system's implementation and its worst case timing behaviour can be automatically derived on the basis of its SDL system specification in one step, if the system model has been extended by timing constraint annotations (deadlines, execution times, timing of system stimuli). Code generation that preserves MEDF processing sequence on the one hand, and ma
... The rapid prototyping target architecture was designed to support real-time analysis in guaranteeing realistic, not-too-pessimistic worst-case execution times. The basis for this is the task classification model presented in [FFKM97], where each type of real-time task corresponds to a best suited type of processing unit, in terms of performance and deterministic execution times. It is a configurable and scalable heterogeneous multiprocessor system consisting of standard off-the-shelf components, which are tightly coupled by a global PCI-bus (figure 6.1). ...
Article
Specification languages and automated design methods are increasingly being used to master the growing complexity in the development of embedded electronic systems. The work presented here uses the "Specification and Description Language" SDL as basis of an automated design process targeting application specific hardware particularly for hard real-time systems.
Conference Paper
It is undoubtedly true that the use of a formal specification methodology in software design will reduce the development effort, particularly as embedded hard real-time systems show increasing functional complexity. We suggest the use of the language SDL even for the design of real-time systems with hard timing constraints. Emerging problems, caused by the non-deterministic semantics of SDL, can be solved by adding EDF process activation to the SDL system model. This paper describes the different steps necessary to map an SDL system specification to an analyzable task network. Considering an SDL process as a typical server process, the mapping rules resolve the resulting interdependencies and delays caused by possible priority inversion and blocking. Finally, the study of an application example, the “Mine Control System”, proves the usability of the introduced methods.
Conference Paper
The sustained performance of fast processors is critically dependent on cache performance. Cache performance in turn depends on locality of reference. When an operating system switches contexts, the assumption of locality may be violated because the instructions and data of the newly-scheduled process may no longer be in the cache(s). Context-switching thus has a cost beyond that of the operations performed by the kernel. We fed address traces of the processes running on a multi-tasking operating system through a cache simulator, to compute accurate cache-hit rates over short intervals. By marking the output of such a simulation whenever a context switch occurs, and then aggregating the post-context-switch results of a large number of context switches, it is possible to estimate the cache performance reduction caused by a switch. Depending on cache parameters, the net cost of a context switch appears to be in the thousands of cycles, or tens to hundreds of microseconds.
Article
In this paper, we give a comprehensive review of a number of practical problems associated with the use of static priority scheduling. We first present a new approach to stabilize the rate-monotonic algorithm in the presence of transient processor overloads. We also present a new class of algorithms to handle aperiodic tasks which improve the response times to aperiodic tasks while guaranteeing the deadlines of periodic tasks. We then study the problem of integrated processor and data I/O scheduling. Finally we review the problem of scheduling of messages over a bus with insufficient priority levels but with multiple buffers.
Article
Any real time computer control system must have a capability to measure the duration between events in the metric of real time and must respond to a stimulus within a given real time interval. This paper discusses some of the implications which result from the inclusion of this real time metric on the specification, communication and error detection in real time distributed systems.
Conference Paper
A mandatory condition for event driven hard real-time systems to meet deadlines under all circumstances is the bounded number of events in time. Traditionally the verification of hard deadlines is achieved for periodic events or sporadic events with given interarrival times only. A new event model presented in this paper provides the means to describe the occurrence of events bounded in time intervals more precisely to achieve a better off-line schedulability test. New deadline verification techniques for a basic task model are outlined in this paper. These techniques have been extended to verify deadlines in real process control environments.
Conference Paper
Cache-partitioning techniques have been invented to make modern processors with an extensive cache structure useful in real-time systems where task switches disrupt cache working sets and hence make execution times unpredictable. This paper describes an OS-controlled application-transparent cache-partitioning technique. The resulting partitions can be transparently assigned to tasks for their exclusive use. The major drawbacks found in other cache-partitioning techniques, namely waste of memory and additions on the critical performance path within CPUs, are avoided using memory coloring techniques that do not require changes within the chips of modern CPUs or on the critical path for performance. A simple filter algorithm commonly used in real-time systems, a matrix-multiplication algorithm and the interaction of both are analysed with regard to cache-induced worst case penalties. Worst-case penalties are determined for different widely-used cache architectures. Some insights regarding the impact of cache architectures on worst-case execution are described.
Conference Paper
The problem of scheduling a set of sporadic tasks that share a set of serially reusable, single unit software resources on a single processor is considered. The correctness conditions are that each invocation of each task completes execution at or before a well-defined deadline, and that a resource is never accessed by more than one task simultaneously. An optimal online algorithm for scheduling a set of sporadic tasks is presented. The algorithm results from the integration of a synchronization scheme for access to shared resources with the earliest deadline first algorithm. A set of relations on task parameters that are necessary and sufficient for a set of tasks to be schedulable is also derived. The proposed model for the analysis of processor scheduling policies is novel in that it incorporates minimum as well as maximum processing time requirements of tasks. The scheduling algorithm and the sporadic tasking model have been incorporated into an operating system kernel and used to implement several real-time systems.
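For orientation, the simplest form of an earliest deadline first schedulability condition can be sketched as follows. This is not the set of relations derived in the paper above: it is the classical utilization test for independent, preemptable tasks whose relative deadlines equal their minimum interarrival times, and it ignores shared resources entirely. The function names and the task representation are assumptions made for illustration.

```python
from fractions import Fraction


def edf_utilization(tasks):
    """Sum of wcet / min_interarrival over all tasks.

    tasks: iterable of (wcet, min_interarrival) pairs in the same time unit.
    Fractions keep the comparison against 1 exact for integer inputs.
    """
    return sum(Fraction(c) / Fraction(t) for c, t in tasks)


def edf_feasible(tasks):
    """Classical EDF test: feasible iff total utilization does not exceed 1.

    Valid only under the simplifying assumptions of independent, fully
    preemptable tasks with deadlines equal to their minimum interarrival times.
    """
    return edf_utilization(tasks) <= 1


if __name__ == "__main__":
    light = [(1, 4), (2, 6), (3, 12)]  # utilization 5/6
    heavy = [(2, 4), (3, 6), (1, 12)]  # utilization 13/12
    print(edf_utilization(light), edf_feasible(light))
    print(edf_utilization(heavy), edf_feasible(heavy))
```

Once tasks share resources, as in the abstract above, blocking has to enter the condition and the plain utilization bound is no longer sufficient on its own.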
Article
The controller area network (CAN) is a well-established networking system specifically designed with real-time requirements in mind. Developed in the 1980s by Robert Bosch, its ease of use and low cost have led to its wide adoption throughout the automotive and automation industries. To the beginner, however, using CAN may seem somewhat bewildering. This article goes some way towards explaining how CAN is used at both the hardware and the software levels.
Article
The problem of scheduling a set of sporadic tasks that share a set of serially reusable, single unit software resources on a single processor is considered. The correctness conditions are that (1) each invocation of each task completes execution at or before a well-defined deadline, and (2) a resource is never accessed by more than one task simultaneously. We present an optimal on-line algorithm for scheduling a set of sporadic tasks. The algorithm results from the integration of a synchronization scheme for access to shared resources with the earliest deadline first algorithm. A set of relations on task parameters that are necessary and sufficient for a set of tasks to be schedulable is also derived. Our model for the analysis of processor scheduling policies is novel in that it incorporates minimum as well as maximum processing time requirements of tasks. The scheduling algorithm and the sporadic tasking model have been incorporated into an operating system kernel and used to impleme...
Article
Cache memories have become an essential part of modern processors to bridge the increasing gap between fast processors and slower main memory. Until recently, cache memories were thought to impose unpredictable execution time behavior for hard real-time systems. But recent results show that the speedup of caches can be exploited without a significant sacrifice of predictability. These results were obtained under the assumption that real-time tasks be scheduled non-preemptively. This paper introduces a method to maintain predictability of execution time within preemptive, cached real-time systems and discusses the impact on compilation support for such a system. Preemptive systems with caches are made predictable via software-based cache partitioning. With this approach, the cache is divided into distinct portions, each associated with a real-time task, such that a task may only use its portion. The compiler has to support instruction and data partitioning for each task. Instruction partitionin...
Article
When designing distributed computer control systems, there is a great variety of possibilities for assigning system parameters, e.g. the task allocation. For a chosen parameter set, it is indispensable to prove that the system meets all deadlines even in the worst case. Under this condition, finding a feasible and cheap solution is a lengthy process. Therefore it is highly desirable to support the system designer in this process. In this paper, a proposal is presented to automate some parts of the design process in order to find a realization as inexpensive as possible by using stochastic optimization methods. Keywords. Real time computer systems, parameter optimization, minimal realization, computer selection and evaluation, stochastic optimization, computer aided system design. 1. INTRODUCTION A substantial problem during the development of computer control systems with hard real-time constraints is the proof that the system meets all deadlines even in the worst case. In time driv...
Philips Semiconductors, Eindhoven, The Netherlands. PCA82C200, Stand-alone CAN Controller, Product Specification, 1992.
B. Smith. The HEP supercomputer and its applications. In J. S. Kowalik, editor, Parallel MIMD Computation, pages 41-55. The MIT Press, 1985.
K. Etschberger et al. CAN Controller-Area-Network, Grundlagen, Protokolle, Bausteine, Anwendungen. Hanser Verlag, 1994.