Figure 3 - uploaded by Roger M. Goodall
Processor architecture 

Source publication
Article
This paper describes a compact, high-speed special-purpose processor, which offers a low-cost solution for implementing linear time-invariant controllers. The controller has been reformulated into a modified state-space representation based on the δ operator, which is optimised for numerical efficiency. This Control System Processor (CSP) has been imple...

Context in source publication

Context 1
... that x_temp is used to store the sum of the old values of x_1 to x_4, which avoids having to retain old values of the states while the new values are calculated; the state variables are then simply overwritten at each calculation. Using this standard controller formulation, the requirements for coefficients and controller state variables become relatively standardised across a wide range of applications, and these are illustrated in figure 2. The variables are 27-bit fixed point, with the input values brought in as integers, a small 3-bit allowance for overflow (although this is a nominal requirement because of the good scaling properties mentioned previously), and 12 fractional bits for underflow. The coefficients are held in a simple low-precision floating-point form, with 6 bits for the mantissa and 5 bits for the exponent. In general the coefficients are fractional, with values which become progressively smaller as the sample frequency is increased, but a positive exponent is provided to implement greater-than-unity gains, a few of which are associated with most controllers. This numerical specification will successfully implement the vast majority of LTI controller examples, and allows for the sample frequency being at least three orders of magnitude higher than the lowest pole in the controller. Of course, if there are exceptional requirements it is always possible to reprogram the CSP hardware in the FPGA, maintaining the essential principles but extending the hardware precision as required. An example of implementing extremely high sample frequency digital filters using the modified canonic δ approach can be found in [6]. Figure 3 shows a block diagram for the CSP system. The core of the CSP comprises a simplified datapath with storage and computation capabilities. The register bank stores all the constants, coefficients, state variables, input data, output data and partial products needed to perform the calculations.
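The two number formats described above can be sketched in a few lines of Python. This is only an illustration, not the paper's implementation: the bit widths come from the text, but the two's-complement variable encoding, the saturation on overflow, the separate coefficient sign bit, the fractional-mantissa convention (value = mantissa/64 × 2^exponent) and the signed exponent range are all assumptions, as are the function names.

```python
import math

VAR_BITS = 27    # state-variable width (from the text)
FRAC_BITS = 12   # fractional bits of a state variable (from the text)
MANT_BITS = 6    # coefficient mantissa width (from the text)
EXP_BITS = 5     # coefficient exponent width (from the text)

def to_fixed(x):
    """Quantise x to the variable format (assumed two's complement, saturating)."""
    raw = round(x * (1 << FRAC_BITS))
    lo = -(1 << (VAR_BITS - 1))
    hi = (1 << (VAR_BITS - 1)) - 1
    return max(lo, min(hi, raw))

def from_fixed(raw):
    """Convert a raw fixed-point integer back to a real value."""
    return raw / (1 << FRAC_BITS)

def encode_coeff(c):
    """Return (sign, mantissa, exponent) under the assumed coefficient layout."""
    if c == 0:
        return (0, 0, 0)
    sign = 1 if c < 0 else 0
    frac, exp = math.frexp(abs(c))           # abs(c) == frac * 2**exp, 0.5 <= frac < 1
    mant = round(frac * (1 << MANT_BITS))    # 32..64 after rounding
    if mant == (1 << MANT_BITS):             # rounding carried out of the mantissa
        mant >>= 1
        exp += 1
    exp_min = -(1 << (EXP_BITS - 1))         # assumed signed exponent: -16..15
    exp_max = (1 << (EXP_BITS - 1)) - 1
    if exp > exp_max:
        raise OverflowError("coefficient too large for the assumed exponent range")
    if exp < exp_min:                        # too small: lose mantissa bits instead
        mant >>= (exp_min - exp)
        exp = exp_min
    return (sign, mant, exp)

def decode_coeff(sign, mant, exp):
    """Real value represented by an encoded coefficient."""
    value = mant / (1 << MANT_BITS) * 2.0 ** exp
    return -value if sign else value
```

Under these assumptions a 6-bit normalised mantissa keeps the relative rounding error of a coefficient below 2^-6, which gives a feel for what "low-precision" means here.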
The computation of the output values is done iteratively by executing multiply-accumulate (MAC) operations. These operations are specified by the instructions fetched from an internal program ROM and decoded by the instruction handler. The instruction format contains the source and destination addresses of the operands used in the MAC operation. The program counter addresses the next instruction in program memory to be executed. An internal data ROM contains the coefficients and the initial values for the state variables and program counter registers. A group of analogue-to-digital and digital-to-analogue converters provides the interface to the physical system being controlled. Figure 4 shows the processor interface; grey lines indicate data values while black lines indicate control signals. The processor will be embedded within the complete control system and will normally be programmed in a separate programming system. Important features of this architecture are:
• reduced precision of the variables when compared to full IEEE floating-point representation
• different numerical representations of coefficients and state variables, which are satisfactory for a wide range of controllers
• a targeted MAC unit optimised for calculating sums of products
This novel architecture, combined with the use of a small and specialised instruction set, presents cost and performance benefits for control applications over traditional architectures. The core of the CSP includes a special-purpose multiply-accumulator unit (MAC) and a 4-port register bank (3 read, 1 write). The MAC unit executes the multiply-accumulate operations required to perform the control algorithm, i.e. D=A*B+C (see figure 5). The A input is in coefficient format (12 bits) and the B and C values are in variable format (27 bits). A detailed low-level design has been used to speed up the MAC operation.
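As a rough behavioural sketch of the execution model just described, the following Python treats the program ROM as a list of address tuples and the register bank as a list, and applies D=A*B+C once per instruction. The `Mac` tuple layout, the `run_program` name, the register assignments and the use of ordinary floats are all hypothetical; the real instruction encoding and number formats are not modelled, and pipelining is ignored.

```python
from collections import namedtuple

# Hypothetical instruction layout: one destination and three source addresses.
Mac = namedtuple("Mac", ["dest", "a", "b", "c"])

def run_program(program, regs):
    """Execute each MAC instruction in order: regs[dest] = regs[a] * regs[b] + regs[c]."""
    for ins in program:
        regs[ins.dest] = regs[ins.a] * regs[ins.b] + regs[ins.c]
    return regs

# Register bank holding constants, coefficients, an input, a state and an output.
# Addresses: 0 = constant zero, 1 and 2 = coefficients, 3 = input u,
#            4 = state x, 5 = output y (all assignments are illustrative).
regs = [0.0, 0.5, 0.25, 4.0, 8.0, 0.0]

# y = 0.5*u + 0.25*x built up as a sum of products, one MAC per term.
program = [
    Mac(dest=5, a=1, b=3, c=0),   # y = 0.5 * u + 0
    Mac(dest=5, a=2, b=4, c=5),   # y = 0.25 * x + y
]

run_program(program, regs)
print(regs[5])   # 4.0
```

The second instruction reads the result of the first, so on pipelined hardware the two could not be issued back to back without the instruction scheduling that the source attributes to the compiler.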
The system is pipelined such that there is a latency of 4 clock cycles between an instruction being issued and its result being written back to the register bank. The compiler ensures that instruction dependencies are observed through an appropriate series of instruction issues. The coefficient is split into its mantissa and exponent sections. The multiplier block multiplies a state variable by the mantissa; the product is then shifted by a number of bits determined by the coefficient exponent. Finally, the result is added to another state variable to produce the output. Table 1 shows the CSP complexity in terms of ProASIC tiles and equivalent gates. Everything except the program and data ROM is fixed in size; these are hardwired, and their size and speed depend upon the control algorithm being implemented. The figures shown are for the controller specified in section 2.2. The synthesis of the CSP results in an overall gate count of fewer than 21,000 gates and a delay of 20 ns, which allows a clock frequency of 50 MHz. The register bank is implemented using 9 embedded RAM blocks provided by ProASIC devices. Each block contains a 256-word deep by 9-bit wide memory, with 2 ports (1 read, 1 write). As such, the CSP is a compact, low-cost core capable of implementing the most demanding real-time systems. The maximum sampling frequency for a specific control system is determined by its complexity, i.e. the number of instructions needed to calculate the next state and output values. The relatively small size of the processor core leaves much of the FPGA free, so that it can be used to carry out other functions typically associated with real-time control: logical interlocking functions, background tasks such as gain scheduling, etc. Now we describe the operation of the CSP from the programmer's point of view. The CSP instruction set is very simple and specialised; it is targeted towards high-speed computation.
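The multiply, shift and add sequence described above can be modelled on raw integers as follows. This is a sketch under stated assumptions, not the paper's datapath: the coefficient is assumed to be (sign, 6-bit mantissa, exponent) with value = mantissa/64 × 2^exponent, shifts are assumed to truncate, and the result is assumed to saturate at 27 bits. The names `mac_datapath` and `saturate` are illustrative.

```python
VAR_BITS = 27    # state-variable width (from the text)
FRAC_BITS = 12   # fractional bits of a state variable (from the text)
MANT_BITS = 6    # coefficient mantissa width (from the text)

def saturate(raw):
    """Clamp to the 27-bit signed range (saturation is an assumption)."""
    lo = -(1 << (VAR_BITS - 1))
    hi = (1 << (VAR_BITS - 1)) - 1
    return max(lo, min(hi, raw))

def mac_datapath(sign, mant, exp, b_raw, c_raw):
    """Compute D = A*B + C on raw fixed-point integers.

    A is given as (sign, mant, exp) with the assumed value mant/64 * 2**exp;
    b_raw and c_raw are fixed-point integers with 12 fractional bits.
    """
    product = b_raw * mant              # multiplier: state variable times mantissa
    if sign:
        product = -product
    shift = exp - MANT_BITS             # exponent, minus the mantissa scaling
    if shift >= 0:
        scaled = product << shift
    else:
        scaled = product >> -shift      # arithmetic right shift, truncating
    return saturate(scaled + c_raw)     # adder: accumulate with C

# 0.75 * 2.0 + 1.0, with 0.75 encoded as mantissa 48, exponent 0.
d = mac_datapath(0, 48, 0, 2 * (1 << FRAC_BITS), 1 * (1 << FRAC_BITS))
print(d / (1 << FRAC_BITS))   # 2.5
```

Splitting the coefficient this way means the multiplier only has to be as wide as the mantissa, with the exponent handled by a shifter, which is consistent with the compact gate count reported in the text.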
Because the MAC unit contains one pipeline stage, multiple instructions can be overlapped in execution. While the operands specified by one instruction are being read from the register bank and copied to the MAC unit inputs, the result of the previous instruction appears at the output of the MAC unit and is copied back to the register bank. The processor has only four instructions (see table 2). The MAC instruction executes a multiply-accumulate operation on the operands indicated by the source addresses and stores the result in the destination address. This instruction allows performing ... There are no conditional jumps in the system. Unconditional jumps are supported for user programming. However, the code generator flattens all but the exterior loop. The program counter starts at zero and increments until it reaches the value stored in the ‘jump1 register’, the ...

Citations

... We develop a generic hardware structure which can be easily adapted to new applications. In difference to [4], where a special instruction set processor for implementing digital control algorithms is described, our approach implements all parts of the controller in hardware. Important issues for using reconfigurable hardware are: ...
Article
The implementation of large linear control systems requires a high amount of digital signal processing. Here, we show that reconfigurable hardware allows the design of fast yet flexible control systems. After discussing the basic concepts for the design and implementation of digital controllers for mechatronic systems, a new general and automated design flow starting from a system of differential equations to application-specific hardware implementation is presented. The advances of reconfigurable hardware as a target technology for linear controllers is discussed. In a case study, we compare the new hardware approach for implementing linear controllers with a software implementation.
... This approach is inappropriate for applications with high sample rates (fs> 20 kHz) as well as for highly modular applications consisting of cheap processing nodes. With high-level design tools, such as VHDL, and logic-synthesis CAD tools designed for large low-cost reprogrammable FPGAs, a rapid prototyping of complex modular control laws has become possible [9]. ...
Conference Paper
Developing distributed embedded control systems increases the need for a consistent design approach. Our example is taken from the mechatronic design in the automotive industry and illustrates our structuring concept for a modular realization of real-time-critical controllers. In our consistent design approach we employ the structured modelling of mechatronic systems, a modular integration platform for real-time software implementation and a modular hardware platform based on FPGAs and microcontrollers.
... Our concept allows to design and test actual mechatronic aggregates with local control. In addition and as an alternative to microcontrollers, FPGAs [1,2] and asynchronous architectures (FLYSIG dataflow processors) [3] can be employed as hardware for the controller realisation. This reconfigurable hardware is the basis for a subsequent use in mass production in the shape of an ASIC. ...
Conference Paper
The rapid prototyping of complex systems embedded in even more complex environments raises the need for a multi-level design approach. Our example is taken from mechatronic design in the automotive industry and illustrates the rapid prototyping procedure of real-time critical control laws. Our approach is based on an object-oriented structuring, not only allowing central control units but also supporting distributed control units, as needed in today's designs. The implementation of control laws is a stepwise-refined hardware-in-the-loop simulation, reducing the simulation part in each step. At the lower level, common platforms (such as FPGAs, microcontrollers or specialized platforms) can be instantiated. This is illustrated by an asynchronous data-flow processor for the high-performance rapid prototyping of cyclic iterated control laws
... Our concept allows to design and test actual mechatronic aggregates with local control. In addition and as an alternative to microcontrollers, FPGAs [1,2] and asynchronous architectures (FLYSIG dataflow processors) [3] can be employed as hardware for the controller realisation. This reconfigurable hardware is the basis for a subsequent use in mass production in the shape of an ASIC. ...
Conference Paper
This paper presents a rapid prototyping of realtime controllers for humanoid robotics based on standard off-the-shelf hardware and software. The proposed scheme allows control of a wide class of robotic systems in hard real-time. To take advantage of Simulink graphic programming interfaces, robotic programming environment, middleware and library are also presented based on Matlab/Simulink/RTW toolchain. Experiments are presented to show the performance on current computing hardware.
Conference Paper
This paper reviews the impact of Reconfigurable Hardware (RH) on the design of digital controllers. It starts by showing the application areas in which this technology has more influence. The reasons of the technology migration are then analyzed, pointing specific examples from the literature. Finally, run-time reconfiguration (RTR) of Field Programmable Gate Arrays (FPGAs) is revised and its utilization for designing FPGA-based controllers is presented. The research trends are shown, giving an insight on the potential benefits of using this technology.
Article
Rapid prototyping of complex systems embedded in even more complex environments raises the need for a layered design approach. Our example is a mechatronic design taken from the automotive industry and illustrates the rapid-prototyping procedure of real-time-critical control laws. The approach is based on an object-oriented structuring allowing not only central control units but also distributed control units as needed by today’s designs. The implementation of control laws is a hardware-in-the-loop simulation, refined in steps and reducing the simulation part at every one of these. On the lower level, common platforms, such as FPGAs, microcontrollers or specialized platforms, can be instantiated.