Abstract— A new complex model of human motor control
has been developed, combining brain internal models and
neural network mechanisms. Based on nervous system
structures and operating principles, a feedforward block, a
feedback controller and a cerebellum-like learning module
have been integrated and tested with an anthropometric
robotic arm. A simulated sequence of 8-like tracking tasks
showed the contributions of these main loops over time.
Different external dynamics were introduced. The role of
feedback corrections, intrinsically imprecise due to
sensorimotor delays, decreases, while the output of the
cerebellum, which has been learning, increases; the movement
becomes more accurate. Moreover, an experimental session was
carried out in which a subject repeated the task using a haptic
device while upper limb kinematics were recorded.
I. INTRODUCTION
THE biological motor system is a high-performance
control engine. Compared with artificial control systems, it
exhibits much higher performance with great flexibility
and versatility in spite of the nonlinearities, uncertainties and
large number of Degrees of Freedom (DoF) of animal bodies.
Sensorimotor function is created from a highly distributed
circuit that includes different neural centers, such as cerebral
cortex, cerebellum, and spinal cord.
The kinematic planning of a movement to achieve a
particular task is assigned to the premotor and
somatosensory cortical areas; they generate the optimal
trajectory and transform it from external-space Cartesian
coordinates into internal-space joint coordinates through
inverse kinematics processing. It was shown that
somatosensory cortex cells encode joint-centered
kinematics; their activity is correlated with position, velocity
and acceleration parameters [1].
Then, the motor commands needed to achieve the
desired kinematics are defined. The brain must construct
internal models of the plant, objects and environment only
through learning by experience, and memorize them in its
neural networks in a format usable for motor control. The
primary motor cortex (M1) is considered the site where basic
inverse dynamic models are stored, thus behaving as a
nonlinear feedforward controller able to compute torque
values. Because of possible joint miscalibrations, context
changes, noise and other uncertainties, this structure cannot
guarantee accurate control on its own.
The cerebellum is able to fine-tune motor skills by
processing incomplete or approximate commands issued by
higher levels of the motor system [2]. It is in charge of temporal
and spatial movement coordination. Its structure, made up of
microzones acting as functional units, fits well with the
learning mechanisms. Patients suffering from cerebellar
dysfunctions (e.g. cerebellar ataxias) are almost unable to
deal with disturbances, as they can rely only on the imprecise
and unstable feedback control to enhance the basic inverse
model of the motor cortex [3].
The action of a feedback controller in motor control is
well accepted. The role of M1 in this loop is proved,
neurophysiologically, by a dense projection from M1 to the
spinal cord, often directly onto motor neurons, and by a
number of correlations between M1 firing and end-effector
kinematics [4]. Ito [5] showed that the feedback controller
generates a command in the motor cortex, which can tune the
viscoelastic properties of the musculoskeletal system
(tension-length and tension-velocity relationships). Adaptive
feedback controllers have been proposed [6], which means
that the pre-programmed arm impedance changes in
response to feedback information. For instance, it has been
shown that impedance increases around the task constraints
[7]. Feedback gains, which convert sensory state variables
into motor signals, are optimized based on specific goals of
a particular behavior, by following the ‘minimum
intervention principle’ [8], [9]. Thus, the optimal feedback
control consists of two main steps: state/error estimation and
control laws.
No model of human-like motor control including this
overall complexity has yet been built. In biological systems,
in which all the different parts have evolved together
towards global aims, the whole is more complex than the
sum of its parts. Kawato [10] proposed the adaptive
nonlinear feedforward controller, based on a feedback-error
learning architecture; that is, error signals from a linear
feedback controller tune the feedforward inverse model
parameters. Schweighofer and colleagues [11] showed how
the cerebellum may increase the accuracy in target reaching
movements by compensating for the interaction torques, thus
by learning a portion of the inverse dynamics model that
refines a previously stored basic inverse model in the motor
cortex. Other models were proposed and tested on planar
movements of a robotic arm, e.g. [12], [13].
Starting from functional/anatomical schemes and from
these previous important steps, the present study integrates
control models [14], learning models, neural network
dynamics and behavioral observations, by using both
modeling/computational and experimental approaches on
multi-joint 3D movements.
An integrated motor control loop of a human-like robotic arm:
feedforward, feedback and cerebellum-based learning
C. Casellato, A. Pedrocchi, J.A. Garrido, N.R. Luque, G. Ferrigno,
E. D’Angelo, E. Ros
II. MODELING APPROACH
A. Control system
Fig.1 shows the main neural structures and
mutual connections involved in motor control, highlighting
the ones implemented in our model.
Premotor and sensory cortex blocks compute the
kinematic planning. First, the desired trajectory is generated
following a minimum-jerk criterion in external space, thereby
resolving the kinematic redundancy [15]. Then, closed-form
inverse kinematics is carried out to compute displacement,
velocity and acceleration for each of the three joints [16].
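As an illustration of the minimum-jerk criterion mentioned above, the following sketch generates a point-to-point minimum-jerk profile in Cartesian space using the standard fifth-order polynomial. The function name, the start and end points and the restriction to a single point-to-point segment are assumptions made for illustration; the 8-like trajectory would require via-points that are not specified in the text.

```python
import numpy as np

def minimum_jerk(x0, xf, duration, dt):
    """Point-to-point minimum-jerk position, velocity and acceleration.

    x0, xf: start and end points (any dimension); duration, dt in seconds.
    """
    x0 = np.asarray(x0, dtype=float)
    xf = np.asarray(xf, dtype=float)
    n_samples = int(round(duration / dt)) + 1
    t = np.arange(n_samples) * dt
    s = (t / duration)[:, None]            # normalized time in [0, 1]
    delta = xf - x0
    pos = x0 + delta * (10 * s**3 - 15 * s**4 + 6 * s**5)
    vel = delta * (30 * s**2 - 60 * s**3 + 30 * s**4) / duration
    acc = delta * (60 * s - 180 * s**2 + 120 * s**3) / duration**2
    return t, pos, vel, acc

# Hypothetical start/end points (metres); 4 s trial sampled every 2 ms.
t, pos, vel, acc = minimum_jerk([0.0, 0.0, 0.0], [0.2, 0.1, 0.3],
                                duration=4.0, dt=0.002)
```

The resulting Cartesian samples would then be passed to the inverse kinematics stage to obtain the joint-space references.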
The nonlinear feedforward controller is placed in the M1
block; it is made up of an inaccurate inverse dynamic model
of the arm based on recursive Newton-Euler dynamic
equations computing joint torques. These dynamic equations
do not take into account friction torques, inertial interaction
torques (i.e. the off-diagonal elements of the inertia tensor
matrix are set to zero), and internal neural noise. The latter
here consists of both sensory noise on actual kinematics
and signal-dependent noise on total torques, i.e. proportional
to motor command amplitudes [17], [18]. Moreover, the
inverse model does not include unexpected external force
changes; it embeds only the well-known action of gravity.
The linear feedback controller, receiving somatosensory
information from the periphery, is sited within M1. By
exploiting muscular viscoelasticity, an additive
torque value is produced depending on the ongoing error, as
an online correction. Its performance is limited by the
system nonlinearities and the inevitable feedback
sensorimotor delays. Because muscle spindles do not
carry a significant amount of acceleration information, only
position and velocity are present in the feedback controller.
Position and velocity errors (ep, ev) are weighted by gains
(Kp and Kv: elasticity and viscosity features, respectively);
this arm impedance is selected depending on the task
requirements, keeping in mind that high feedback gains
enhance robustness to external perturbations but, at the same
time, increase noise (signal-dependent noise) and metabolic
cost. Excessively high gains would therefore imply
non-compliant and unstable movements [19], [20].
The plasticity mechanisms are implemented by the
cerebellum and inferior olive blocks. The cerebellum learns
to provide corrective torques that reduce the kinematic
errors in upcoming trials; thus, it acts as a predictor.
This biological adaptation takes place at the parallel fiber
to Purkinje cell synapses, driven by the activity from the
Inferior Olive, which here encodes a teaching signal (dependent
on the accuracy of the movement execution compared to the
desired movement trajectory). This system implements a
look-up table which associates each parallel fiber state [27]
with a Purkinje cell output. This association is iteratively
modified during the learning process.
Figure 1. Model
The scheme includes the main neural structures and functional interconnections involved in motor control. In red the blocks and the
connections implemented in our control model.
The adaptation mechanism is based on LTD/LTP (Long-Term
Depression and Potentiation) processes validated in
previous approaches [21] with a linear firing rate cerebellar
model. This cerebellum-like model delivers add-on output
corrective torque terms based on the received feedback error
along previously executed trials. The cerebellar module
torque action is defined following the equation:
where Gainsout allows a rescaling based on the torque ranges,
MFgain represents the activity coming through the mossy fibers
(this input activity has been fixed to 1 in order to normalize
the output activity between 0 and 1) and PCi(t) is the
Purkinje cell firing rate associated with the currently active
parallel fiber. This activity is iteratively modified following
the equation:
where LTPMax and LTDMax are parameters which regulate
the speed of the plasticity mechanism (both have been
fixed to 0.2), e represents the error signal (a linear
combination of joint position and velocity errors, normalized
between 0 and 1) and α regulates the LTP/LTD interaction
(it has been set to 1000 to reduce the LTP action in the presence
of a significant error). Finally, PCi(t) is constrained to
always remain in the range [0, 1].
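The two equations referred to above are not reproduced in this text, so the sketch below should be read only as one possible form consistent with the stated parameter definitions, not as the authors' equations. It assumes (i) a cerebellar torque equal to Gainsout times the difference between the mossy fiber activity and the Purkinje cell activity, and (ii) an update made of an LTP term decaying as 1/(1+e)^α and an LTD term proportional to e. Both forms, the function names, the constant error value and the one-state-per-time-step look-up table are assumptions.

```python
import numpy as np

LTP_MAX = 0.2    # from the text
LTD_MAX = 0.2    # from the text
ALPHA = 1000.0   # from the text
MF_GAIN = 1.0    # mossy fiber activity, fixed to 1 in the text

def update_pc(pc, error):
    """ASSUMED rule: LTP fades quickly as the error grows, LTD scales with it."""
    e = float(np.clip(error, 0.0, 1.0))          # error normalized to [0, 1]
    delta = LTP_MAX / (1.0 + e) ** ALPHA - LTD_MAX * e
    return float(np.clip(pc + delta, 0.0, 1.0))  # PC activity kept in [0, 1]

def cerebellar_torque(pc, gains_out):
    """ASSUMED output: rescaled mossy fiber drive minus Purkinje activity."""
    return gains_out * (MF_GAIN - pc)

# Look-up table: one Purkinje value per parallel fiber state.
# Here one state per time step is assumed (4 s trial at 2 ms).
n_states = 2000
pc_table = np.ones(n_states)
errors = np.full(n_states, 0.3)   # hypothetical constant normalized error
for trial in range(3):
    for i in range(n_states):
        pc_table[i] = update_pc(pc_table[i], errors[i])
```

Under these assumptions, a constant normalized error of 0.3 lowers each stored Purkinje value by about 0.06 per trial, so the corrective torque grows from one repetition to the next, while a zero error lets the LTP term restore the Purkinje activity.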
The expected behavior of the whole system is
that the feedback controller is progressively driven out, since
the cerebellum progressively adjusts the internal models. Thus,
the desired motions will be mainly predicted and only small
correction forces will be required, thereby increasing the
compliance of the system's control.
B. Plant
A robotic arm is built with 3 rotational DoFs: q1
represents shoulder abduction/adduction, q2 shoulder
lowering/elevation, and q3 elbow extension/flexion, as
reported in Fig.2-a. The kinematic parameters are defined
according to the Denavit-Hartenberg convention (Fig.2-c),
setting the link lengths according to the subject's
anthropometric measurements. The inertial parameters for
each link, such as mass, CoM position, and inertia tensor, are
set according to the subject's anthropometric measurements as well
[22], [23], [24], [25].
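To make the role of the Denavit-Hartenberg parameters concrete, the sketch below chains the link transforms of a 3-DoF revolute arm to obtain the end-effector position. It assumes the standard (distal) variant of the convention, and the parameter table and link lengths are hypothetical placeholders: they are not the values of Fig.2-c, which are not available in the text.

```python
import numpy as np

def dh_transform(theta, d, a, alpha):
    """Homogeneous transform of one link (standard DH convention assumed)."""
    ct, st = np.cos(theta), np.sin(theta)
    ca, sa = np.cos(alpha), np.sin(alpha)
    return np.array([
        [ct, -st * ca,  st * sa, a * ct],
        [st,  ct * ca, -ct * sa, a * st],
        [0.0,      sa,       ca,      d],
        [0.0,     0.0,      0.0,    1.0],
    ])

def forward_kinematics(q, dh_rows):
    """End-effector position; dh_rows holds (d, a, alpha) for each joint."""
    T = np.eye(4)
    for theta, (d, a, alpha) in zip(q, dh_rows):
        T = T @ dh_transform(theta, d, a, alpha)
    return T[:3, 3]

# HYPOTHETICAL link lengths (metres) and DH rows, for illustration only.
UPPER_ARM, FOREARM = 0.30, 0.25
DH_ROWS = [
    (0.0, 0.0, np.pi / 2),   # joint 1
    (0.0, UPPER_ARM, 0.0),   # joint 2
    (0.0, FOREARM, 0.0),     # joint 3
]

p_rest = forward_kinematics([0.0, 0.0, 0.0], DH_ROWS)
p_raised = forward_kinematics([0.0, np.pi / 2, 0.0], DH_ROWS)
```

In a subject-specific model, the two link lengths would be replaced by the measured upper arm and forearm lengths.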
C. Simulations
The control loop and the robot plant were built
in Simulink (MathWorks®), using the Robotics Toolbox [26].
A simulation of an 8-like trajectory tracking task in 3D
was carried out, where one trial lasted 4 s and 20 repetitions
were performed. The unexpected external force,
perpendicular to the end-effector, was a step from 0.6 N to 2.3 N
at the midpoint of each trial. The time resolution was 2 ms.
The signal-dependent noise was a white noise with
amplitude equal to 2% of the torque amplitude.
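The signal-dependent noise described above can be sketched as follows. The text does not state the noise distribution, so a zero-mean Gaussian white noise whose standard deviation equals 2% of the instantaneous torque magnitude is assumed here; the function name and its arguments are illustrative.

```python
import numpy as np

def add_signal_dependent_noise(torque, fraction=0.02, rng=None):
    """Add white noise scaled by the torque magnitude.

    ASSUMPTION: Gaussian noise with standard deviation fraction * |torque|.
    """
    rng = np.random.default_rng() if rng is None else rng
    torque = np.asarray(torque, dtype=float)
    noise = fraction * np.abs(torque) * rng.standard_normal(torque.shape)
    return torque + noise

# Larger commands receive proportionally larger noise; a zero command stays exact.
noisy_tau = add_signal_dependent_noise(np.array([1.0, -2.0, 0.0]),
                                       rng=np.random.default_rng(0))
```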
The feedback controller was characterized by Kp and Kv
proportional to the external force modulus: Kp = 3·|Fext|
[Nm/rad]; Kv = 1·|Fext| [Nm/(rad/s)]. The delay was 50 ms.
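A minimal sketch of this feedback law for a single joint is given below, using the force step, gains, delay and time resolution stated above. It assumes that the gains follow the current force value while the position and velocity errors are delayed by 50 ms, and that the delayed error is zero before the first delayed sample arrives; these choices and all names are assumptions.

```python
import numpy as np

DT = 0.002                              # s, time resolution from the text
TRIAL = 4.0                             # s, trial duration from the text
DELAY = 0.050                           # s, feedback delay from the text
N = int(round(TRIAL / DT))              # 2000 samples per trial
DELAY_STEPS = int(round(DELAY / DT))    # 25 samples

# External force step at mid-trial (N), and force-dependent gains.
f_ext = np.where(np.arange(N) < N // 2, 0.6, 2.3)
kp = 3.0 * np.abs(f_ext)                # Nm/rad
kv = 1.0 * np.abs(f_ext)                # Nm/(rad/s)

def delayed(signal, steps):
    """Shift a sampled signal by `steps` samples, zero-filled at the start."""
    out = np.zeros_like(signal)
    out[steps:] = signal[:len(signal) - steps]
    return out

def feedback_torque(q_des, q_act, dq_des, dq_act):
    """Delayed proportional-derivative correction for one joint."""
    e_p = delayed(q_des - q_act, DELAY_STEPS)
    e_v = delayed(dq_des - dq_act, DELAY_STEPS)
    return kp * e_p + kv * e_v

# Hypothetical constant position error of 0.1 rad and no velocity error.
tau_fb = feedback_torque(np.full(N, 0.1), np.zeros(N), np.zeros(N), np.zeros(N))
```

With these values the stiffness gain rises from 1.8 to 6.9 Nm/rad when the external force steps up, so the same position error produces a larger correction in the second half of the trial.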
The cerebellum module was implemented by a linear firing
rate model of the cerebellum which includes some of the
traditional working hypotheses about the cerebellum, such as
the generation of non-recurrent states at the granular layer
[27], synaptic plasticity at the parallel fibers driven by the
climbing fiber activity and synaptic integration at the
Purkinje layer [28], [29].
For each task repetition, multiple variables were
recorded. The contributions of the different controllers to the
total torque (τtot) were computed as ratios of Root Mean
Square (RMS) values:
• τcerebellum/τtot = RMS(τcerebellum) / [ RMS(τcerebellum) +
RMS(τfeedback) + RMS(τfeedforward) ]
• τfeedback/τtot = RMS(τfeedback) / [ RMS(τcerebellum) +
RMS(τfeedback) + RMS(τfeedforward) ]
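The two ratios above can be computed as in the following sketch, where the denominator is the sum of the three RMS values exactly as written in the formulas. The function names and the example signals are illustrative.

```python
import numpy as np

def rms(x):
    """Root mean square of a sampled signal."""
    return float(np.sqrt(np.mean(np.square(x))))

def torque_contributions(tau_cerebellum, tau_feedback, tau_feedforward):
    """Share of each controller, as RMS over the sum of the three RMS values.

    Note: the sum must be nonzero, otherwise the ratio is undefined.
    """
    parts = {
        "cerebellum": rms(tau_cerebellum),
        "feedback": rms(tau_feedback),
        "feedforward": rms(tau_feedforward),
    }
    total = sum(parts.values())
    return {name: value / total for name, value in parts.items()}

# Hypothetical torques for one joint over one repetition.
shares = torque_contributions(np.full(10, 1.0), np.full(10, -1.0), np.full(10, 2.0))
```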
The Cartesian error of the end-effector was evaluated by
using two main parameters: RMS-Error and the correlation
between the desired and the actual 3D trajectories.
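Both indexes can be sketched as below. The text does not specify how the correlation between two 3D trajectories is obtained, so the mean of the Pearson coefficients computed separately on the x, y and z coordinates is assumed here; the RMS-Error is taken as the RMS of the Euclidean distance between corresponding samples, which is also an assumption.

```python
import numpy as np

def rms_error_3d(desired, actual):
    """RMS of the Euclidean distance between corresponding 3D samples."""
    diff = np.asarray(desired, dtype=float) - np.asarray(actual, dtype=float)
    return float(np.sqrt(np.mean(np.sum(diff**2, axis=1))))

def trajectory_correlation(desired, actual):
    """ASSUMED definition: mean Pearson correlation over the three axes."""
    desired = np.asarray(desired, dtype=float)
    actual = np.asarray(actual, dtype=float)
    r = [np.corrcoef(desired[:, k], actual[:, k])[0, 1] for k in range(3)]
    return float(np.mean(r))

# Hypothetical trajectories: the actual one is shifted by 1 cm along x.
s = np.linspace(0.0, 2.0 * np.pi, 200)
desired = np.column_stack([np.sin(s), np.sin(2.0 * s), 0.1 * s])
actual = desired + np.array([0.01, 0.0, 0.0])
err = rms_error_3d(desired, actual)
corr = trajectory_correlation(desired, actual)
```

The example shows why the two indexes are complementary: a constant offset produces a nonzero RMS-Error while leaving the correlation unchanged.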
D. Results
The main simulation results are reported in Fig.3. It is
evident, from the joint angles (3-a), that the distance
between the desired and the actual movements decreases
over time. This is also supported by the Cartesian trajectories
(3-c) and the performance indexes (3-d and 3-e).
Panel 3-b shows how the different controllers contribute
to the whole motor commands over time; the feedback
component is predominant in the very first trials, while the
cerebellum is still learning.
Figure 2. Human-like robotic arm
(a) Robotic arm built for our model, with 3 DoF. Green (q1):
shoulder abduction/adduction. Red (q2): shoulder
lowering/elevation. Blue (q3): elbow extension/flexion.
(b) Experimental set-up: the 3-marker tools on the involved joints,
the haptic device and the graphical interface with the required task.
(c) The conventional Denavit-Hartenberg parameters which define
the kinematic features of the anthropometric robotic arm.
The latest repetitions show a cerebellar correction activity
larger than the feedback one for all joints, with quite stable
trends. It is worth noting that the q3 curves (green) are
higher than those of the other joints even though the error is
smaller (as shown by the green curves in panel 3-a), since the
feedforward torque of this joint does not include the gravity
component; q3 indeed moves on the horizontal plane.
III. EXPERIMENTAL APPROACH
A. Set-up
Preliminary experimental sessions on one healthy subject
were carried out in order to qualitatively compare the
simulated kinematics with real ones. The
subject's upper limb segments were tracked by a motion
capture system (VICRA, Polaris™) by placing a
3-passive-marker tool on each joint (shoulder, elbow and
wrist). A haptic device (PHANToM OMNI, SensAble™)
was used to perform the task; a task-specific visual interface
and a control algorithm providing the subject with the external
force changes were developed in Visual C++. By displaying a
countdown, the subject was made aware of the required trial
duration. Start and end points were marked through
touchable spheres within the task environment. A picture of
the set-up is reported in Fig.2-b.
In order to constrain the movement to the selected 3 DoFs,
the subject wore a wrist plaster cast, so that the haptic device
pen acted as a forearm extension. The subject was instructed
to avoid as much as possible the use of finger DoFs, the
shoulder rotation and any translation.
After a few familiarization trials, the subject was asked to
perform 5 trials with a low constant external vertical force
field (0.6 N) and 5 trials with a force field change at the
midpoint of each trial (from 0.6 N to 2.3 N).
Figure 3. Simulation with external perturbation at half of each task repetition
(a) The 3 DoF angles (q1, q2, and q3, as in Fig.2-a), with time (80 s, i.e. 20 repetitions). Solid curves: the actual joint
angle; dashed curves: the desired joint angle.
(b) The % contributions, with respect to the total generated torque, of cerebellum torque (star curves) and of feedback
torque (circle curves). These values are reported for each joint and for each task repetition.
(c) The 3D Cartesian end-effector trajectories. In violet: the desired one. In black: the first repetition. In thick grey: the
last repetition. In thin grey: the intermediate repetitions.
(d) The Root Mean Square Error between the 3D desired trajectory and the actual one, for each repetition.
(e) The correlation coefficient between the 3D desired trajectory and the actual one, for each repetition.
B. Results
Fig.4 depicts the main representative results for
the 5 trials with external disturbance. In Fig.4-a, the
experimental angles are overlaid on the desired joint angles
from the simulator planning. It is evident that,
whereas the shoulder DoFs (q1 and q2) fit quite well with
the desired ones, the elbow flexion (q3) is significantly
smaller in the experimental data than in the simulation.
This could be because the subject also used other
DoFs, such as the fingers, to achieve the task.
Fig.4-b shows the Cartesian end-effector trajectories, and
Fig.4-c and 4-d report performance indexes, analogous to the ones
computed for the simulation, along the 5
repetitions. Both parameters show values similar to
those achieved in simulation after the first trials. This could
be explained by the fact that the subject performed some
familiarization trials before the recordings, even though the
training time was not enough to reach a stable behavior. Future
experiments will include more repetitions, so as to achieve a
convergent trend of the performance indexes.
IV. DISCUSSION
The model presented here for motor control proved to be
neurophysiologically plausible and comparable with
experimentally-based modeling. It successfully combines
different flexible controllers and predictors, including both
control-based and neural-network blocks, in a single complex
system.
The model behavior can be explored for other tasks (e.g.
first tests on a reaching task have been carried out) and for any
dynamic environment. Multiple factors can be set, for
instance the inverse dynamic model inaccuracies, the
task-dependent optimal feedback law and the time constant of the
cerebellar learning rules.
Several enhancements will be implemented within the
model. The analog model of the cerebellum can be
replaced with a spiking network version similar to the one
presented in [13] and [21] (EDLUT: Event-Driven Look-Up
Tables), which can naturally include more realistic plasticity
mechanisms [30], starting from the most recent dualism
between neurophysiological evidence and neural network
modeling, e.g. [27]. Furthermore, the cerebro-cerebellar loop
could be exploited within the model. In this direction, a first
attempt was carried out through a recurrent architecture
model, where the cerebellum output modified the motor
cortex input, i.e. the kinematic planning, thereby solving the
motor error problem [31].
Finally, neurophysiological studies have demonstrated that, after
learning, the inferior olive response decreases significantly
[12], suggesting that once cerebellar learning has
been completed, learning consolidation occurs by
transferring this information directly to the motor cortex, i.e.
by making the feedforward motor commands
more accurate.
In conclusion, this model uses a control scheme
consistent with the motor-learning theory, in which the
motor error is pre-computed and sent to the parallel fiber to
Purkinje cell connection of the cerebellum, in order to
generate LTD and LTP through a supervised learning rule.
Thus, the model now provides the basis for testing more
biologically plausible architectures and computational
solutions, including vector coding in the motor cortex,
implicit learning in the cerebellar granular layer, and various
signal transformations in the different nuclei involved. In
particular, the expansion of the cerebellum into a detailed
neuronal network using the EDLUT simulator will allow
testing the impact of biological circuit and cellular properties on
the control capabilities of the cerebro-cerebellar loop.
Figure 4. Experimental data
(a) The 3 DoF angles (q1, q2, and q3, as in Fig.2-a), with time (20
s, i.e. 5 repetitions). The vertical lines bound each repetition. Solid
curves: experimental data; dashed curves: the desired joint angles
from simulator planning.
(b) The 3D Cartesian end-effector trajectories. In violet: the desired
one. In black: the first repetition. In thick grey: the last repetition.
In thin grey: the intermediate repetitions.
(c) The Root Mean Square Error between the 3D desired trajectory
and the actual one, for each repetition.
(d) The correlation coefficient between the 3D desired trajectory
and the actual one, for each repetition.
ACKNOWLEDGMENT
This work has been supported by the EU grant REALNET
(FP7-ICT-270434).
REFERENCES
[1] J. F. Kalaska, D. A. Cohen, M. Prud'homme, and M. L. Hyde, "Parietal
area 5 neuronal activity encodes movement kinematics, not movement
dynamics," Exp Brain Res, vol. 80, pp. 351-364, 1990.
[2] M. Ito, "Cerebellar circuitry as a neuronal machine," Prog Neurobiol,
vol. 78, pp. 272-303, 2006.
[3] A. J. Bastian, "Cerebellar limb ataxia: abnormal control of
self-generated and external forces," Ann N Y Acad Sci, vol. 978,
pp. 16-27, 2002.
[4] E. Todorov, "Direct cortical control of muscle activation in voluntary
arm movements: a model," Nat Neurosci, vol. 3, pp. 391-398, 2000.
[5] M. Ito, "Control of mental activities by internal models in the
cerebellum," Nat Rev Neurosci, vol. 9, pp. 304-313, 2008.
[6] J. Nakanishi and S. Schaal, "Feedback error learning and nonlinear
adaptive control," Neural Netw, vol. 17, pp. 1453-1465, 2004.
[7] R. Osu, K. Morishige, H. Miyamoto, and M. Kawato, "Feedforward
impedance control efficiently reduce motor variability," Neurosci Res,
vol. 65, pp. 6-10, 2009.
[8] S. H. Scott, "Optimal feedback control and the neural basis of volitional
motor control," Nat Rev Neurosci, vol. 5, pp. 532-546, 2004.
[9] D. Mitrovic, S. Klanke, R. Osu, M. Kawato, and S. Vijayakumar, "A
computational model of limb impedance control based on principles of
internal model uncertainty," PLoS One, vol. 5, e13601, 2010.
[10] M. Kawato, K. Furukawa, and R. Suzuki, "A hierarchical
neural-network model for control and learning of voluntary movement,"
Biol Cybern, vol. 57, pp. 169-185, 1987.
[11] N. Schweighofer, M. A. Arbib, and M. Kawato, "Role of the
cerebellum in reaching movements in humans. I. Distributed inverse
dynamics control," Eur J Neurosci, vol. 10, pp. 86-94, 1998.
[12] N. Schweighofer, J. Spoelstra, M. A. Arbib, and M. Kawato, "Role of
the cerebellum in reaching movements in humans. II. A neural model
of the intermediate cerebellum," Eur J Neurosci, vol. 10, pp. 95-105,
1998.
[13] R. R. Carrillo, E. Ros, C. Boucheny, and O. J. Coenen, "A real-time
spiking cerebellum model for learning robot control," Biosystems,
vol. 94, pp. 18-27, 2008.
[14] J. J. Craig, Introduction to robotics: mechanics and control, 2005.
[15] E. Todorov, "Optimality principles in sensorimotor control,"
Nat Neurosci, vol. 7, pp. 907-915, 2004.
[16] D. Kostic, M. Bram de, and R. Hensen, "Modeling and identification
for high-performance robot control: an RRR-robotic arm case study,"
IEEE Transactions on Control Systems Technology, vol. 12,
pp. 904-919, 2004.
[17] C. M. Harris and D. M. Wolpert, "Signal-dependent noise determines
motor planning," Nature, vol. 394, pp. 780-784, 1998.
[18] K. E. Jones, A. F. Hamilton, and D. M. Wolpert, "Sources of
signal-dependent noise during isometric force production,"
J Neurophysiol, vol. 88, pp. 1533-1544, 2002.
[19] E. Burdet, R. Osu, D. W. Franklin, T. E. Milner, and M. Kawato, "The
central nervous system stabilizes unstable dynamics by learning
optimal impedance," Nature, vol. 414, pp. 446-449, 2001.
[20] J. Porrill and P. Dean, "Recurrent cerebellar loops simplify adaptive
control of redundant and nonlinear motor systems," Neural Comput,
vol. 19, pp. 170-193, 2007.
[21] N. R. Luque, J. A. Garrido, R. R. Carrillo, O. J. Coenen, and E. Ros,
"Cerebellarlike corrective model inference engine for manipulation
tasks," IEEE Trans Syst Man Cybern B Cybern, vol. 41,
pp. 1299-1312, 2011.
[22] D. A. Winter, Biomechanics and Motor Control of Human Movement.
John Wiley & Sons, 1990.
[23] P. de Leva, "Adjustments to Zatsiorsky-Seluyanov's segment inertia
parameters," J Biomech, vol. 29, pp. 1223-1230, 1996.
[24] H. A. Abdullah, C. Tarry, R. Datta, G. S. Mittal, and M. Abderrahim,
"Dynamic biomechanical model for assessing and monitoring
robot-assisted upper-limb therapy," J Rehabil Res Dev, vol. 44,
pp. 43-62, 2007.
[25] A. H. Vette, T. Yoshida, T. A. Thrasher, K. Masani, and M. R. Popovic,
"A complete, non-lumped, and verifiable set of upper body segment
parameters for three-dimensional dynamic modeling," Med Eng Phys,
vol. 33, pp. 70-79, 2011.
[26] P. Corke, "Robotics Toolbox for Matlab," 2008.
[27] T. Yamazaki and S. Tanaka, "The cerebellum as a liquid state
machine," Neural Netw, vol. 20, pp. 290-297, 2007.
[28] D. Marr, "A theory of cerebellar cortex," J Physiol, vol. 202,
pp. 437-470, 1969.
[29] J. S. Albus, "A theory of cerebellar function," Math Biosci, vol. 10,
pp. 25-61, 1971.
[30] E. D'Angelo, "Neural circuits of the cerebellum: hypothesis for
function," J Integr Neurosci, vol. 10, pp. 317-352, 2011.
[31] N. R. Luque, J. A. Garrido, R. R. Carrillo, S. Tolu, and E. Ros,
"Adaptive cerebellar spiking model embedded in the control loop:
context switching and robustness against noise," Int J Neural Syst,
vol. 21, pp. 385-401, 2011.