Processor architecture

Source publication

High Performance Control System Processor

Article

Jan 2001

This paper describes a compact, high-speed special-purpose processor, which offers a low-cost solution to implement linear time-invariant controllers. The controller has been reformulated into a modified state-space representation based on the δ operator, which is optimised for numerical efficiency. This Control System Processor (CSP) has been imple...

Context 1

... that x_temp is used to store the sum of the old values of x_1 to x_4, which avoids having to retain old values for the states while the new values are calculated; the state variables are then simply overwritten at each calculation. Using this standard controller formulation, the requirements for coefficients and controller state variables become relatively standardised across a wide range of applications, and these are illustrated in figure 2. The variables are 27-bit fixed point, with the input values brought in as integers, a small 3-bit allowance for overflow (although this is a nominal requirement because of the good scaling properties mentioned previously), and 12 fractional bits for underflow. The coefficients are held in a simple low-precision floating-point form, with 6 bits for the mantissa and 5 bits for the exponent. In general the coefficients are fractional, with values which become progressively smaller as the sample frequency is increased, but a positive exponent is provided to implement greater-than-unity gains, a few of which are associated with most controllers. This numerical specification will successfully implement the vast majority of LTI controller examples, and allows for the sample frequency being at least three orders of magnitude higher than the lowest pole in the controller. Of course, if there are exceptional requirements it is always possible to reprogram the CSP hardware in the FPGA, maintaining the essential principles but extending the hardware precision as required. An example of implementing extremely high sample frequency digital filters using the modified canonic δ approach can be found in [6]. Figure 3 shows a block diagram for the CSP system. The core of the CSP comprises a simplified datapath with storage and computation capabilities. The register bank stores all the constants, coefficients, state variables, input data, output data and partial products needed to perform the calculations.
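The 27-bit variable format described above can be illustrated with a minimal Python sketch. The excerpt gives only the word width and the number of fractional bits; the two's-complement signed encoding, the use of Python's built-in round, and saturation on overflow are assumptions made here for illustration, and the names to_fixed and from_fixed are hypothetical.

```python
# Sketch of the 27-bit fixed-point variable format with 12 fractional bits.
# Assumptions (not stated in the source): two's-complement signed words,
# rounding via Python's built-in round(), and saturation on overflow.

WORD_BITS = 27   # total width of a state variable
FRAC_BITS = 12   # fractional bits kept for underflow

MAX_RAW = (1 << (WORD_BITS - 1)) - 1    # largest representable raw value
MIN_RAW = -(1 << (WORD_BITS - 1))       # smallest representable raw value


def to_fixed(value: float) -> int:
    """Convert a real value to a raw fixed-point integer, saturating at the word limits."""
    raw = round(value * (1 << FRAC_BITS))
    return max(MIN_RAW, min(MAX_RAW, raw))


def from_fixed(raw: int) -> float:
    """Convert a raw fixed-point integer back to a real value."""
    return raw / (1 << FRAC_BITS)
```

With 12 fractional bits the resolution is 2^-12, so a value such as 0.5 is held exactly, while values outside the word range are clamped under the saturation assumption.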
The computation of the output values is done iteratively by executing multiply-accumulate (MAC) operations. These operations are specified by the instructions fetched from an internal program ROM and decoded by the instruction handler. The instruction format contains the source and destination addresses of the operands used in the MAC operation. The program counter addresses the next instruction in program memory to be executed. An internal data ROM contains the coefficients and the initial values for the state variables and program counter registers. A group of analogue-digital and digital-analogue converters provides the interface to the physical system being controlled. Figure 4 shows the processor interface; grey lines indicate data values while black lines indicate control signals. The processor will be embedded within the complete control system and will normally be programmed in a separate programming system. Important features of this architecture are:
• reduced precision of the variables when compared to full IEEE floating-point representation
• different numerical representations of coefficients and state variables, which are satisfactory for a wide range of controllers
• targeted MAC unit optimised for calculating sums of products
This novel architecture, combined with the use of a small and specialised instruction set, presents cost and performance benefits for control applications over traditional architectures. The core of the CSP includes a special-purpose multiply-accumulator (MAC) unit and a 4-port register bank (3 read, 1 write). The MAC unit executes the multiply-accumulate operations required to perform the control algorithm, i.e. D=A*B+C (see figure 5). The A input is in coefficient format (12 bits) and the B and C values are in variable format (27 bits). A detailed low-level design has been used to speed up the MAC operation.
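The idea of computing a sum of products by iterating D=A*B+C over a register bank can be sketched in Python as follows. This is an illustration only: the field names and their order in MacInstr, the register layout, and the use of ordinary Python floats instead of the CSP number formats are assumptions, and pipeline latency is ignored.

```python
from typing import List, NamedTuple


class MacInstr(NamedTuple):
    """Hypothetical MAC instruction: three source addresses and one destination."""
    a: int    # register holding the coefficient (A input)
    b: int    # register holding the variable to be multiplied (B input)
    c: int    # register holding the value to be added (C input)
    dst: int  # register that receives D = A*B + C


def run(program: List[MacInstr], regs: List[float]) -> List[float]:
    """Execute each MAC instruction in order: three reads, one write per instruction."""
    for ins in program:
        regs[ins.dst] = regs[ins.a] * regs[ins.b] + regs[ins.c]
    return regs


# Assumed register layout: 0 = constant zero, 1-2 = coefficients k1, k2,
# 3-4 = state variables x1, x2, 5 = accumulator for y = k1*x1 + k2*x2.
regs = run(
    [MacInstr(a=1, b=3, c=0, dst=5),   # acc = k1*x1 + 0
     MacInstr(a=2, b=4, c=5, dst=5)],  # acc = k2*x2 + acc
    [0.0, 0.5, 0.25, 2.0, 4.0, 0.0],
)
```

Each instruction reads three registers and writes one, which matches the 3-read, 1-write port arrangement of the register bank; chaining the destination of one instruction into the C input of the next accumulates the sum of products.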
The system is pipelined such that there is a latency of 4 clock cycles between an instruction being issued and the result being written back to the register bank. The compiler ensures that instruction dependencies are observed through an appropriate series of instruction issues. The coefficient is split into its mantissa and exponent sections. The multiplier block multiplies a state variable by the mantissa; the product is then shifted by a number of bits determined by the coefficient exponent. Finally, the result is added to another state variable to produce the output. Table 1 shows the CSP complexity in terms of ProASIC tiles and equivalent gates. Everything except the program and data ROM is fixed in size; these are hardwired, and their size and speed depend upon the control algorithm being implemented. The figures shown are for the controller specified in section 2.2. The synthesis of the CSP results in an overall gate count of fewer than 21,000 gates and a delay of 20 ns, which allows a clock frequency of 50 MHz. The register bank is implemented using 9 embedded RAM blocks provided by ProASIC devices. Each block contains a 256-word deep by 9-bit wide memory, with 2 ports (1 read, 1 write). As such, the CSP is a compact low-cost core capable of implementing the most demanding real-time systems. The maximum sampling frequency for a specific control system is determined by its complexity, i.e. the number of instructions needed to calculate the next state and output values. The relatively small size of the processor core leaves much of the FPGA free, such that it can be used to carry out other functions typically associated with real-time control: logical interlocking functions, background tasks such as gain-scheduling etc. Now we describe the operation of the CSP from the programmer's point of view. The CSP instruction set is very simple and specialised; it is targeted towards high-speed computation.
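The MAC datapath described above (multiply by the mantissa, shift by the exponent, then add) can be sketched with integer arithmetic in Python. Several details are assumptions rather than facts from the excerpt: the coefficient is taken to mean mantissa × 2^exponent with a signed integer mantissa and a signed exponent, right shifts truncate, and the result saturates to the 27-bit word. The names mac and saturate are hypothetical.

```python
# Sketch of D = A*B + C with the coefficient A split into mantissa and exponent.
# Assumptions: A = mantissa * 2**exponent, both fields signed integers;
# B, C and D are raw 27-bit two's-complement values; the result saturates.

WORD_BITS = 27
MAX_RAW = (1 << (WORD_BITS - 1)) - 1
MIN_RAW = -(1 << (WORD_BITS - 1))


def saturate(raw: int) -> int:
    """Clamp a raw value to the 27-bit signed range (assumed overflow behaviour)."""
    return max(MIN_RAW, min(MAX_RAW, raw))


def mac(mantissa: int, exponent: int, b: int, c: int) -> int:
    """Multiply b by the mantissa, shift by the exponent, then add c."""
    product = b * mantissa
    if exponent >= 0:
        shifted = product << exponent        # gain greater than unity
    else:
        shifted = product >> -exponent       # fractional gain, truncating shift
    return saturate(shifted + c)
```

For example, with 12 fractional bits a coefficient of 0.75 can be written as mantissa 3 and exponent -2, so mac(3, -2, 4096, 2048) computes 0.75 × 1.0 + 0.5 and yields the raw value 5120, which represents 1.25.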
Because the MAC unit contains one pipeline stage, multiple instructions can be overlapped in execution. While the operands specified by one instruction are being read from the register bank and copied to the MAC unit inputs, the result of the previous instruction is available at the output of the MAC unit and is copied back to the register bank. The processor only has four instructions (see table 2). The MAC instruction executes a multiply-accumulation operation on the operands indicated by the source addresses and stores the result in the destination address. This instruction allows performing ... There are no conditional jumps in the system. Unconditional jumps are supported for user programming. However, the code generator flattens all but the exterior loop. The program counter starts at zero and increments until it reaches the value stored in the ‘jump1 register’, the ...


BFS Enumeration for Breaking Symmetries in Graphs

Article

Apr 2018

There are numerous NP-hard combinatorial problems which involve searching for an undirected graph satisfying a certain property. One way to solve such problems is to translate a problem into an instance of the boolean satisfiability (SAT) or constraint satisfaction (CSP) problem. Such reduction usually can give rise to numerous isomorphic represent...

Compositional and Local Livelock Analysis for CSP

Article

Jan 2018

The success of component-based techniques for software construction relies on trust in the emergent behaviour of the compositions. Here, we propose an efficient correct-by-construction technique for building livelock-free CSP models. Its verification conditions are based on a local analysis of the shortest event sequences (traces) that represent a...

Characterising the Complexity of Constraint Satisfaction Problems Defined by 2-Constraint Forbidden Patterns

Article

Mar 2015

Although the CSP (constraint satisfaction problem) is NP-complete, even in the case when all constraints are binary, certain classes of instances are tractable. We study classes of binary CSP instances defined by excluding subproblems. This approach has recently led to the discovery of novel tractable classes. The complete characterisation of all t...

Computing Skypattern Cubes

Conference Paper

Aug 2014

We introduce skypattern cubes and propose an efficient bottom-up approach to compute them. Our approach relies on derivation rules collecting skypatterns of a parent node from its child nodes without any dominance test. Non-derivable skypatterns are computed on the fly thanks to Dynamic CSP. The bottom-up principle enables to provide a concise repres...

Design and Implementation of Digital Linear Control Systems on Reconfigurable Hardware

Article

May 2003
EURASIP J ADV SIG PR

The implementation of large linear control systems requires a high amount of digital signal processing. Here, we show that reconfigurable hardware allows the design of fast yet flexible control systems. After discussing the basic concepts for the design and implementation of digital controllers for mechatronic systems, a new general and automated design flow starting from a system of differential equations to application-specific hardware implementation is presented. The advances of reconfigurable hardware as a target technology for linear controllers are discussed. In a case study, we compare the new hardware approach for implementing linear controllers with a software implementation.

Design and Realization of Distributed Real-Time Controller for Mechatronic Systems.

Conference Paper

Jan 2002

Developing distributed embedded control systems increases the need for a consistent design approach. Our example is taken from the mechatronic design in the automotive industry and illustrates our structuring concept for a modular realization of real-time-critical controllers. In our consistent design approach we employ the structured modelling of mechatronic systems, a modular integration platform for real-time software implementation and a modular hardware platform based on FPGAs and microcontrollers.

Rapid prototyping of real-time control laws for complex mechatronic systems

Conference Paper

Feb 2001

The rapid prototyping of complex systems embedded in even more complex environments raises the need for a multi-level design approach. Our example is taken from mechatronic design in the automotive industry and illustrates the rapid prototyping procedure of real-time critical control laws. Our approach is based on an object-oriented structuring, not only allowing central control units but also supporting distributed control units, as needed in today's designs. The implementation of control laws is a stepwise-refined hardware-in-the-loop simulation, reducing the simulation part in each step. At the lower level, common platforms (such as FPGAs, microcontrollers or specialized platforms) can be instantiated. This is illustrated by an asynchronous data-flow processor for the high-performance rapid prototyping of cyclic iterated control laws.

Rapid Prototyping of Real-Time Control Laws for Complex Mechatronic Systems.

Conference Paper

Jan 2001

Rapid prototyping of real-time controllers for humanoid robotics: A case study

Conference Paper

Dec 2012

This paper presents a rapid prototyping of realtime controllers for humanoid robotics based on standard off-the-shelf hardware and software. The proposed scheme allows control of a wide class of robotic systems in hard real-time. To take advantage of Simulink graphic programming interfaces, robotic programming environment, middleware and library are also presented based on Matlab/Simulink/RTW toolchain. Experiments are presented to show the performance on current computing hardware.

The Utilization of Reconfigurable Hardware to Implement Digital Controllers: a Review

Conference Paper

Jul 2007

This paper reviews the impact of Reconfigurable Hardware (RH) on the design of digital controllers. It starts by showing the application areas in which this technology has more influence. The reasons of the technology migration are then analyzed, pointing specific examples from the literature. Finally, run-time reconfiguration (RTR) of Field Programmable Gate Arrays (FPGAs) is revised and its utilization for designing FPGA-based controllers is presented. The research trends are shown, giving an insight on the potential benefits of using this technology.

Rapid prototyping of real-time control laws for complex mechatronic systems: A case study

Article

Mar 2004
J SYST SOFTWARE

Rapid prototyping of complex systems embedded in even more complex environments raises the need for a layered design approach. Our example is a mechatronic design taken from the automotive industry and illustrates the rapid-prototyping procedure of real-time-critical control laws. The approach is based on an object-oriented structuring allowing not only central control units but also distributed control units as needed by today’s designs. The implementation of control laws is a hardware-in-the-loop simulation, refined in steps and reducing the simulation part at every one of these. On the lower level, common platforms, such as FPGAs, microcontrollers or specialized platforms, can be instantiated.
