
Control of a Speech Robot via an Optimum Neural-Network-Based Internal Model With Constraints

February 2010
IEEE Transactions on Robotics vol. 26(1):142 - 159


DOI:10.1109/TRO.2009.2033331

Source
IEEE Xplore

Authors:

Iaroslav V. Blagouchine

Université de Toulon

Eric Moreau

Université de Toulon



142 IEEE TRANSACTIONS ON ROBOTICS, VOL. 26, NO. 1, FEBRUARY 2010

Control of a Speech Robot via an Optimum Neural-Network-Based Internal Model With Constraints

Iaroslav V. Blagouchine and Eric Moreau, Senior Member, IEEE

Abstract—An optimum internal model with constraints is proposed and discussed for the control of a speech robot, which is based on human-like behavior. The main idea of the study is that the robot movements are carried out in such a way that the length of the path traveled in the internal space, under external acoustical and mechanical constraints, is minimized. This optimum strategy defines the designed internal model, which is responsible for the robot task planning. First, an exact analytical way to deal with the problem is proposed. Next, by using some empirical findings, an approximate solution for the designed internal model is developed. Finally, the implementation of this solution, which is applied to the control of a speech robot, yields interesting results in the field of task-planning strategies, task anticipation (namely, speech coarticulation), and the influence of force on the accuracy of executed tasks.

Index Terms—Artificial neural networks (ANNs), constrained optimization, Lagrange's multipliers method, mathematical and computational issues in robotics control, mathematical physics, models and theories of speech production, λ-model [equilibrium-point hypothesis (EPH)], optimum control, optimum task planning, path and trajectory planning, robotics of speech production, robot-motion planning, variational calculus.

I. INTRODUCTION

Robotics of speech production is a challenging subject in modern design and engineering. Since the tongue is well known to be one of the principal elements responsible for speech production, many speech robots are based on modeling the movements of the tongue. To ensure the quality of these movements and, consequently, that of the produced speech (both have to be as close as possible to the real ones), the control of such robots is extremely important. To this end, the principles of control of such artificial-intelligence devices are often borrowed from human beings, and many robots are based on human-like behavior and are modeled in close conjunction with the motor-control theories [1]–[5].

Manuscript received April 14, 2009; revised July 7, 2009 and September 17, 2009. First published November 13, 2009; current version published February 9, 2010. This paper was recommended for publication by Associate Editor T. Kanda and Editor J.-P. Laumond upon evaluation of the reviewers' comments.

I. V. Blagouchine is with the Department of Telecommunication, Institute of Engineering Sciences of Toulon-Var-School of Engineering, University of Toulon, Toulon F-83162, France, and also with the Department of Mobile Communication, Eurécom, F-06904 Sophia-Antipolis, France (e-mail: iaroslav.blagouchine@univ-tln.fr).

E. Moreau is with the Department of Telecommunication, Institute of Engineering Sciences of Toulon-Var-School of Engineering, University of Toulon, Toulon F-83162, France (e-mail: moreau@univ-tln.fr).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TRO.2009.2033331

Currently, there is no single unified theory in the field of motor control and task planning. Over the past 80 years, many different approaches have been developed and are currently in competition; these include the electromyographic approach [1]–[3], the information-channel approach [3], [6]–[8], the global economy of the diverse mechanical factors approaches [9]–[15], the equilibrium-point hypothesis (EPH, which is also known as the λ-model) [1], [5], [16]–[27], the internal-model approaches [28]–[30], etc.

Among these approaches, the EPH, the economy approaches, and the internal-model approaches have received special interest in speech robotics.

The EPH is a development of the classic linear damped spring model of muscle [31], which is completed by central nervous system (CNS) influence. According to the EPH, the muscle can be modeled as a nonlinear spring, which is controlled by a special motor command λ that descends from the CNS. The force F generated by such a muscle depends on the difference between its actual length l and the CNS motor command λ, as well as on several other physical parameters associated with the muscle. In other words, F(l, λ) = f(s)H(s), where f(·) is the transfer function of the muscle, H(·) is the Heaviside step function, and the parameter s, which is called activation, is defined as s = l − λ for the static case and as s = l − λ + κν for the dynamic case, where the parameters κ and ν are, respectively, the damping factor and the speed of muscle lengthening. As to the transfer function of the muscle f(·), it has been noted that it is a nonlinear function that is quite well approximated by an exponential function. Thus

$$F(\lambda) = \rho \left( e^{cs} - 1 \right) H(s) \tag{1}$$

where c is the form parameter, and ρ is a parameter related to the force-generating capability of the muscle [19], [32], [33]. By expanding the latter expression in the Maclaurin series for s > 0, it is straightforward to see that for 0 < s ≪ 1, the muscle, in first approximation, behaves as a classic linear spring

$$F(\lambda) = \rho \sum_{n=1}^{\infty} \frac{(cs)^n}{n!} = \rho c s + O(s^2) \tag{2}$$

which is another argument in favor of this model. From the point of view of robotics and cybernetics, the λ-model is especially attractive, because it provides a simple mathematical means to conceive artificial-intelligence devices based on human-like behavior, without going into details about the underlying principles of motor control. The EPH has also become quite popular in the articulatory speech-production field, for which correct modeling of the tongue and jaw movements is of great importance, because these are physically responsible for speech production. Owing to the increasing interest of researchers and to the constantly growing computational capacities, many such works have appeared over the past 15 years (e.g., see [33]–[40]).

BLAGOUCHINE AND MOREAU: CONTROL OF A SPEECH ROBOT VIA AN OPTIMUM NEURAL-NETWORK-BASED INTERNAL MODEL 143

Fig. 1. Simplified anatomical structure of tongue (from [43]).

One of these examples is the articulatory-based speech robot that we used in our study (see, e.g., [33], [41], and [42]).* This robot represents an artificial tongue, which is modeled by six main muscles that are responsible for shaping and moving the tongue in the sagittal plane: posterior and anterior parts of genioglossus, styloglossus, hyoglossus, inferior and superior longitudinalis, and verticalis (see Fig. 1) [43]. Each of these muscles is controlled by its own motor command $\lambda_i$, i = 1, ..., n, n = 6, according to the EPH.¹ Their forces are generated according to (1), with a different ρ for each muscle, and with constant c = 1 cm⁻¹ [19], [32]. The initial vocal-tract geometry is reconstructed from anatomical cineradiographic data. By means of the finite-element method [44], the tongue is divided into small volumes connected by 221 nodes, each of which anatomically belongs to the defined muscle(s). The motion of each node is then described by a second-order ordinary differential equation (ODE) with damping and external terms, due to viscosity, gravity, and contact reaction forces. The stiffness matrix, which determines the distribution of the internal forces within the finite-element structure, is calculated by the finite-element algorithm. Such a complex system of ODEs is solved numerically by means of the Runge–Kutta method using MATLAB software, which finally gives the trajectory of motion of each node and, by further interpolation, the motion of the tongue body. To complete the vocal-tract reconstruction, lips, palate, and pharynx are also added to model mechanical contacts with the tongue (see Fig. 2). The jaw is represented by static rigid structures to which the tongue is attached. Note, finally, that there are also other articulatory-based speech robots that might be of interest [45]–[59], especially because the optimum internal model that we will introduce is meant as a general model and can be used with many other speech robots using similar principles of control.
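As a minimal sketch (not the BTM implementation), the muscle force law (1) and its small-activation limit (2) can be written in a few lines of Python. The function names and the default value of `rho` are assumptions made for illustration; only c = 1 cm⁻¹ comes from the text.

```python
import math


def muscle_force(length, lam, rho=1.0, c=1.0, kappa=0.0, velocity=0.0):
    """Force of (1): rho * (exp(c*s) - 1) * H(s), with activation s."""
    # Activation: s = l - lambda (static) or l - lambda + kappa*nu (dynamic).
    s = length - lam + kappa * velocity
    if s <= 0.0:
        # Heaviside step: an inactive muscle produces no force.
        return 0.0
    return rho * math.expm1(c * s)


def linear_spring_force(length, lam, rho=1.0, c=1.0):
    """Leading term of (2): rho * c * s for s > 0."""
    s = length - lam
    return rho * c * s if s > 0.0 else 0.0


# Compare the exponential spring with its linear approximation.
for s in (0.01, 0.1, 1.0):
    print(s, muscle_force(s, 0.0), linear_spring_force(s, 0.0))
```

For small activations the two forces nearly coincide, while for s close to 1 the exponential spring is markedly stiffer, which is the behavior that (2) describes.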

The diverse-economy approaches consider that the movements are defined by some economy principle, that is to say, the movements are always carried out in such a way that some criterion is optimized. These approaches are basically inspired by analytical mechanics, namely, by the principle of least action [60]–[69], which is one of the most universal principles of physics (many fundamental equations of physics can be deduced from it). This principle states that motions are always carried out in such a way that the action² is minimum. However, because of the high complexity of biosystems, direct applications of this principle in motor-control theories are quite limited. Under these circumstances, since an exact mathematical description is almost impossible, one possible solution is to describe them more globally, i.e., by supposing that some global criterion is optimized during the movement. This criterion, which is often known as a cost, may be defined in many different ways, e.g., time cost, energy cost, force cost, impulse cost, accuracy cost, etc. In this field, the concept that appears to be the most frequent and interesting is that of the minimum of the jerk cost³ [9]–[11]. The idea of the economy principles also affected robotics and, in particular, the speech-motor-control community, and several studies using economy principles appeared. Basically, these works propose to search for the shortest trajectories between the steady-state positions in the command internal space [32], [36], [39], [70]–[73]. In some of these works, the shortest distance principle is explicitly stated, and thus, the authors suggest using straight lines as solutions (the straight line is the shortest trajectory between two points if there are no constraints). In others, it is implicitly formulated by constant-rate transitions between steady-state positions, i.e., again by straight-line transitions. These works reported that by shifting motor-control commands λ at constant speed, realistic articulatory movements and speech signals may be produced. It is interesting to note that the minimization of the trajectory length (3), as well as that of the jerk cost, is quite similar, at least mathematically, to that of the action. In fact, in all three cases, one seeks to optimize a trajectory-dependent functional, which is given as a definite integral over a time interval. Finally, the exploration of optimum principles may also be interesting in the field of inverse problems. By optimizing a cost functional (or just a function) under constraints on outputs, one can find the corresponding inputs (as we will show later). This idea is not novel in the speech field. For more details, see [74]–[76].⁴

¹ For simplicity, we will write these motor commands as components of the vector $\boldsymbol{\lambda} \equiv (\lambda_1, \ldots, \lambda_n)$.

* This articulatory-based speech robot is called the biomechanical tongue model (BTM) throughout the paper.

² The action is the definite integral over a time interval of the Lagrangian, the latter being the difference between kinetic and potential energies.

³ Jerk, also known as jolt, is the rate of change of acceleration, i.e., the third derivative of displacement with respect to time. The jerk cost is, therefore, defined as the definite integral over a time interval of the square of jerk.

Fig. 2. Modeled vocal tract and its further cutting by an acoustical-tube model for the computation of formants F. The upper contour is the palate, and the lower one is the tongue dorsum. The lips' area is variable and ranges from 0.5 to 3.0 cm², depending on the vowel.

Finally, there is another approach that should be mentioned in the human-like robotics context: the internal models [28]–[30]. This approach supposes that any living creature has an internal representation of all the external tasks he/she can do.⁵ Thus, typically, the learning of a new task implies, inter alia, the determination of the corresponding place in the internal space. It is also important to note that there is no bijection between internal and external spaces, since the same task can be achieved differently. The notion of tasks is closely related to that of targets, and it is very important for the planning theories. In speech motor control, there are different interpretations of targets, which are also known as reference frames. For example, these may be vocal-tract configurations (e.g., tongue shape, constriction position, and lips area) and, consequently, the output acoustic patterns, which may be expressed in terms of formants. Since there is great variability of the formants for the same vowel, the auditory system normalizes them in order to recognize the vowel, which is the so-called target-normalization theory. The other theory is more complex and takes into account not only the static acousticophonetic parameters but also the dynamic ones, such as transitions and their character⁶ (e.g., linear, different nonlinear forms, etc.). These transitions are often associated with formant transitions [79], [80], because it has been empirically established that this dynamic acoustic information also contributes to vowel identification [81] (see also various coarticulation references given in Section III-A2), which is the so-called dynamic-target-specification theory [82], [83]. However, these two basic interpretations of targets are far from being exhaustive, and the reader might particularly appreciate the theoretical study in [78], where one can find muscle-length targets, articulator targets, constriction-position targets, acoustic targets, auditory perceptual targets, etc.

The aim of our study is to propose an optimum internal model for the control of the speech robots based on the EPH. The motion planning of the robot is performed in its internal space, whose coordinates are the λ-motor commands of the EPH (the internal space λ is, therefore, an n-dimensional space). Being inspired by the principle of least action, it is proposed that the robot task planning is based on a global optimum principle, which is related to the aforementioned internal space, with external constraints related to the execution of tasks (e.g., the quality of the executed tasks), or in motor-control words, to the targets. Namely, it is proposed that all the movements, including those of the tongue, which are mainly responsible for speech production and are controlled by the λ commands according to the EPH, are carried out in such a way that the length of the path traveled in the internal space λ is minimized under external physical constraints, namely, acoustical and mechanical ones. The robot's behavior is, therefore, completely determined by this optimum principle, which permits finding the corresponding optimum commands λ, which are sent to the robot. Therefore, the originality of our work consists in two important differences with respect to previously referenced works in the EPH-based robotics field. First, previous works do not perform the minimization of length in order to find the corresponding motor commands λ. They just use the fact that the straight line is the shortest path between two points. However, the latter fact is true only if there are no constraints,⁷ and with constraints, their approaches cannot provide solutions. Second, the optimization that we carry out is, in addition, a constrained one. We first perform it under one constraint (the acoustical one) and then under two constraints (the acoustical and the mechanical ones).

⁴ This work seems to be misinterpreted in [77], where the cost function from [76] was called "length," while it is clearly called "variation" by its author, and in addition, the formula provided in [76] does not represent the length.

⁵ It has even been suggested that, as in human beings, the most probable site for the latter may be the cerebellar cortex [26].

⁶ The question of where the transitions are planned and how they are controlled is also a subject of controversial discussion; some works suggested that it may be in spatial reference frames, while others reported that it may be more closely related to physical levels (e.g., joints and muscles) [78].

II. OPTIMUM INTERNAL MODEL

A. Preliminaries

First of all, we specify what exactly we mean by external physical constraints. The acoustical constraints consist in the specification of the sound that we wish to produce. Its specification is made in terms of the spectrum, and since the optimum internal model is mainly designed for vowels, the latter can be roughly approximated via the first k formants of the vowel, which are denoted by the vector $\mathbf{F} \equiv (F_1, \ldots, F_k)$. In practice, the formants are obtained via the BTM, which is followed by an acoustical tube model (see Fig. 3). First, the BTM provides the vocal-tract geometry (x, y), and then, the acoustical tube model cuts the vocal tract in cross sections (see Fig. 2) and approximates it by a tube of variable cross section. This yields the area function of the vocal tract, from which the formants are computed [84]–[86]. The mechanical constraint consists in the requirement to keep the prescribed mean force level contained in the tongue during speech production. This level is calculated as the arithmetic (or sample) mean of the absolute values of the forces at each node of the BTM. Physically, this level may be interpreted as mean muscular tongue effort, or a measure of global tongue stiffness, and phonetically, it helps to account for lax and tense vowels.
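The mean force level described above can be sketched as follows. This is only an illustration under stated assumptions: the nodal forces are taken to be planar vectors (the tongue is modeled in the sagittal plane), "absolute value" is read as the Euclidean magnitude, and the three sample nodes are hypothetical rather than BTM data.

```python
import math


def mean_force_level(node_forces):
    """Arithmetic mean of the force magnitudes over all nodes."""
    if not node_forces:
        raise ValueError("at least one node is required")
    return sum(math.hypot(fx, fy) for fx, fy in node_forces) / len(node_forces)


# Three hypothetical nodes with planar force vectors (arbitrary units).
forces = [(3.0, 4.0), (0.0, 0.0), (-6.0, 8.0)]
print(mean_force_level(forces))
```

In the BTM the same average would run over all 221 nodes of the finite-element mesh.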

B. Mathematical Formalization of the Formants–Commands and Force–Commands Relationships and Learning of the Artificial Neural Networks

As we saw before, the BTM is not a fully analytical robot. In other words, there are no explicit analytical relationships between its inputs and outputs. However, in order to mathematically implement the minimization algorithm of the optimum internal model, we need the analytical relationships between the formants $\mathbf{F}$ and the motor commands λ, which are denoted by the vector field $\mathbf{F}(\boldsymbol{\lambda})$, and between the global mean force level $F$ and λ, which are denoted by the scalar field $F(\boldsymbol{\lambda})$, as well as their derivatives. For this reason, we approximated the BTM, followed by an acoustical tube model, by two artificial neural networks (ANNs) [5], [87]–[91] (see Fig. 3). The choice of ANNs for similar problems was already suggested by several authors [92]–[97]; moreover, ANNs are known precisely for their good properties for multidimensional approximations. Besides, the replacement of a particular BTM by ANNs has potentially another application: the generalization of the proposed internal model for its use with other speech robots based on the EPH or using similar principles of control, whose input–output relationships may be approximated by ANNs.

⁷ Or they are trivial, e.g., a straight line or plane passing through the endpoints.

Fig. 3. (Top) Direct or real model and its replacement by the (bottom) approximate one.

The learned ANNs (for details, see the Appendix) reveal the general nonlinear character of the dependencies $\mathbf{F}(\boldsymbol{\lambda})$ and $F(\boldsymbol{\lambda})$, as shown in Fig. 4. Nevertheless, one can note that the dependencies $\mathbf{F}(\boldsymbol{\lambda})$ and, especially, $F(\boldsymbol{\lambda})$ are not highly nonlinear; this suggests that it would also be reasonable to try multidimensional polynomials instead of the ANNs, since the former are computationally "lighter."
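One reason an ANN is convenient here is that, once trained, it gives both the mapping and its derivatives in closed form. The sketch below illustrates this with a toy network; its architecture (one tanh hidden layer), size, and weights are hypothetical and untrained, since the networks actually used are described in the Appendix, which is not reproduced here.

```python
import math

# Hypothetical, untrained weights: 2 inputs, 3 tanh hidden units, 1 output.
W = [[0.5, -0.3], [0.1, 0.8], [-0.7, 0.2]]
B = [0.0, 0.1, -0.2]
V = [1.0, -0.5, 0.3]
C = 0.05


def hidden_activations(x):
    return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(W, B)]


def ann(x):
    """Network output y(x) = sum_h V[h] * tanh(W[h] . x + B[h]) + C."""
    return sum(v * h for v, h in zip(V, hidden_activations(x))) + C


def ann_gradient(x):
    """Analytic gradient: dy/dx_i = sum_h V[h] * (1 - tanh^2) * W[h][i]."""
    hidden = hidden_activations(x)
    return [sum(v * (1.0 - h * h) * row[i] for v, h, row in zip(V, hidden, W))
            for i in range(len(x))]


def numeric_gradient(f, x, eps=1e-6):
    """Central finite differences, used only to cross-check ann_gradient."""
    grad = []
    for i in range(len(x)):
        up, down = list(x), list(x)
        up[i] += eps
        down[i] -= eps
        grad.append((f(up) - f(down)) / (2.0 * eps))
    return grad


point = [0.4, -1.2]
print(ann_gradient(point))
print(numeric_gradient(ann, point))
```

The two printed gradients should agree to many digits, which is the property needed when the derivatives of the approximated dependencies enter the minimization.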

C. Model Itself

The optimum internal model, which is designed according to the principle of the shortest path in the internal space under constraints, logically leads to the calculus of variations [98]–[105]. In fact, the problem of finding a curve whose length is least under constraints is one of the typical problems of variational calculus, which is known as the geodesic problem. The length of a curve, which is given in parametric form $\lambda_i \equiv \lambda_i(t)$, in the n-dimensional space λ, can be written as [99], [100], [102]–[105]

$$L[\boldsymbol{\lambda}(t)] = \int_{t_1}^{t_2} \sqrt{\dot{\lambda}_1^2 + \cdots + \dot{\lambda}_n^2}\; dt = \int_{t_1}^{t_2} \left\| \dot{\boldsymbol{\lambda}}(t) \right\| dt \tag{3}$$

where $t_1$ and $t_2$ are, respectively, the initial and final times of movement, and $\boldsymbol{\lambda}(t_1)$ and $\boldsymbol{\lambda}(t_2)$ are their corresponding positions in the internal space λ. We will now seek the vector-valued function $\boldsymbol{\lambda}(t)$, i.e., the set of functions $\{\lambda_i(t)\}_{i=1}^{n}$, which minimizes this integral under two constraints: acoustical and mechanical.

Fig. 4. Dependencies (upper six subfigures) $F_1(\boldsymbol{\lambda})$ and (lower six subfigures) $F(\boldsymbol{\lambda})$ approximated by the corresponding ANN. Since both the functions $F_1(\boldsymbol{\lambda})$ and $F(\boldsymbol{\lambda})$ depend on six variables (n = 6), they are shown in the following way: We fix five variables out of six and show the dependency solely on the remaining sixth variable. Six panels for $F_1$ and for $F$ show these dependencies, where the sixth variable switches from $\lambda_1$ to $\lambda_6$, respectively. Three different cases are presented in each small panel; they are obtained by setting the five fixed variables to their minimal, mean, and maximal values, respectively.
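To make the length functional (3) concrete, the sketch below approximates it by summing the distances between successive samples of a path in a six-dimensional command space. The endpoints and the sinusoidal detour are hypothetical choices made for illustration only.

```python
import math


def path_length(points):
    """Polygonal approximation of (3): sum of distances between samples."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))


def straight_path(start, end, steps=200):
    """Constant-rate transition between two command vectors."""
    return [tuple(a + (b - a) * k / steps for a, b in zip(start, end))
            for k in range(steps + 1)]


def bulged_path(start, end, bulge, steps=200):
    """Same endpoints, but the first command takes a sinusoidal detour."""
    path = []
    for k, point in enumerate(straight_path(start, end, steps)):
        detour = bulge * math.sin(math.pi * k / steps)
        path.append((point[0] + detour,) + point[1:])
    return path


start = (0.0,) * 6   # hypothetical command vectors for n = 6 muscles
end = (1.0,) * 6
print(path_length(straight_path(start, end)))
print(path_length(bulged_path(start, end, bulge=0.5)))
```

Without constraints the straight path is the shorter of the two; the constraints introduced next are what can force the optimum path away from the straight line.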

The acoustical constraint consists in the specification of the initial and final phonetic targets, which correspond, respectively, to the initial $t_1$ and final $t_2$ moments. These targets are zones in the formant space $\mathbf{F}$. Formally, the constraint is defined as the belonging of the first k formants of each produced vowel to its own specific formant zone, which is defined by a k-dimensional ellipsoidorectangle in the formant space $\mathbf{F}$. In other words, mathematically, the formants of the jth produced vowel, which are denoted by $\mathbf{F}_j \equiv (F_{1,j}, \ldots, F_{k,j})$, must satisfy $H(G_{a,j}) = 0$, where

$$G_{a,j}(\mathbf{F}_j) = \sum_{l=1}^{k} \frac{\left( F_{l,j}(\boldsymbol{\lambda}_j) - \mathring{F}_{l,j} \right)^{2\eta_j}}{\Delta_{l,j}^{2\eta_j}} - 1 \tag{4}$$

where $\mathring{F}_{l,j}$ are the prescribed formants, $F_{l,j}$ are the produced ones, the parameters $\Delta_{l,j}$ define the axes of the formant ellipsoidorectangle (the symbol of these parameters did not survive extraction; $\Delta$ is used here as a stand-in), $\eta_j$ defines its shape (rounded or rectangular), $\boldsymbol{\lambda}_j \equiv \boldsymbol{\lambda}(t_j)$ is the vector motor command that is responsible for the production of the jth vowel, and j = 1 and 2, since we have only two targets (vowels): the initial one and the final one. By an ellipsoidorectangle, we actually mean a voluminous k-dimensional figure, which is obtained from the previous equation by setting $G_{a,j} = 0$. For $\eta_j = 1$, it is a k-dimensional ellipse; then, by increasing the parameter $\eta_j$, it becomes more and more rectangular, and finally, for large $\eta_j$, it becomes a hyperrectangle. The equality $H(G_{a,j}) = 0$ means that we wish the formants of the produced sounds to be inside their ellipsoidorectangles; this can be viewed as a weak constraint, because we do not specify where exactly we want the formants to be; they must just be somewhere inside the ellipsoidorectangles. We could pose $G_{a,j} = -1$, i.e., the strict belonging of the formants of the produced vowels to the given set of formants; however, this constraint is too rigid and, in practice, does not seem very realistic, since a slight fluctuation of the phonetic targets is always present (actually, the parameters $\Delta_{l,j}$ were introduced precisely for this purpose, i.e., in order to define the size of the formant zones),⁸ or, alternatively, $G_{a,j} = 0$, i.e., the strict belonging to the surface of the formant ellipsoidorectangle (which is another weak constraint, because we do not specify where exactly on the ellipsoidorectangle's surface the formants must be, but it is stronger than $H(G_{a,j}) = 0$). Furthermore, the constraints of the type $H(G_{a,j}) = 0$ will be called the constraints of the first kind; those of the type $G_{a,j} = 0$ will be called the constraints of the second kind.
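The role of the shape parameter in (4) can be checked numerically. The following sketch evaluates the constraint for a hypothetical two-formant target zone; the target values, axis lengths, and function names are illustrative assumptions, and the shape parameter is restricted to positive integers so that the even power is well defined.

```python
def acoustic_constraint(formants, targets, axes, eta=1):
    """G_a of (4): sum_l ((F_l - target_l) / axis_l)**(2*eta) - 1."""
    return sum(((f - t) / a) ** (2 * eta)
               for f, t, a in zip(formants, targets, axes)) - 1.0


def inside_zone(formants, targets, axes, eta=1):
    """Constraint of the first kind, H(G_a) = 0, taken here as G_a <= 0."""
    return acoustic_constraint(formants, targets, axes, eta) <= 0.0


# Hypothetical two-formant target zone (values in Hz, for illustration only).
targets = (300.0, 2200.0)
axes = (50.0, 200.0)
corner = (340.0, 2360.0)   # 0.8 of each axis away from the center

print(acoustic_constraint(targets, targets, axes))   # center of the zone
print(inside_zone(corner, targets, axes, eta=1))
print(inside_zone(corner, targets, axes, eta=4))
```

The same point near a corner lies outside the ellipse (shape parameter 1) but inside the more rectangular zone (shape parameter 4), while the center of the zone always gives the value −1.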

The mechanical constraint simply consists in the equality of the global mean force level $F$ to the prescribed value $\mathring{F}(t)$ (time-dependent in general), which must be kept during the whole transition between $t_1$ and $t_2$:

$$G_m(\boldsymbol{\lambda}, t) = F(\boldsymbol{\lambda}) - \mathring{F}(t) = 0 \tag{5}$$

i.e., this constraint has to be satisfied at every time and everywhere, and not only at $t_1$ and $t_2$, as is the case for the acoustical one. Note that the introduction of the constraints in the model aims precisely to mathematically formalize the targets (see Section I). The acoustical constraints actually represent a sort of static targets, which are in accordance with the target-normalization theory (especially that of the first kind). In contrast, the mechanical constraint, which is a dynamic one, corresponds to the dynamic-target-specification theory (since it must be satisfied during the transition and not only at the static endpoints belonging to some zone) and aims to better represent the reality of the system.

⁸ For instance, it is well known that the distribution of formants about its mean $\bar{\mathbf{F}}_j$ is near-normal. Thus, for the particular case $\eta_j = 1$, the axis parameters $\Delta_j$ (their original symbol did not survive extraction; $\Delta$ is used here as a stand-in) define the formant zone of the constant probability level $a^{-1}$, $a > 1$, with respect to the maximum level at $\bar{\mathbf{F}}_j$, if we pose the referents $\mathring{\mathbf{F}}_j = \bar{\mathbf{F}}_j$ and $\Delta_j$ equal to $\sqrt{2 \ln a}$ × the standard deviation of the aforementioned normal distribution; in other words, $\Delta_j$ define the formant equiprobability ellipses.

As we may recall from variational calculus, the function $\boldsymbol{\lambda}(t)$ that minimizes the functional (3) is the solution of the corresponding system of the Euler–Lagrange differential equations. For the ordinary variational problem, which requires the stationarity of the functional

$$Y[\boldsymbol{\lambda}(t)] = \int_{t_1}^{t_2} f(\boldsymbol{\lambda}, \dot{\boldsymbol{\lambda}}, t)\, dt \tag{6}$$

with given fixed boundary conditions $\boldsymbol{\lambda}(t_1) = \boldsymbol{\lambda}_1$ and $\boldsymbol{\lambda}(t_2) = \boldsymbol{\lambda}_2$, under the m constraints $G_j(\boldsymbol{\lambda}, t) = 0$, for j = 1, ..., m, the solution can be found from the following system of n Euler–Lagrange equations:

$$\frac{\partial}{\partial \lambda_i} \left( f + \sum_{j=1}^{m} \mu_j G_j \right) - \frac{d}{dt} \frac{\partial}{\partial \dot{\lambda}_i} \left( f + \sum_{j=1}^{m} \mu_j G_j \right) = 0 \tag{7}$$

where $\mu_j \equiv \mu_j(t)$ are the Lagrange undetermined multipliers. The latter equation may be reduced to the following one, which is represented in vector form as

$$\frac{\partial f}{\partial \boldsymbol{\lambda}} + \sum_{j=1}^{m} \mu_j \frac{\partial G_j}{\partial \boldsymbol{\lambda}} - \frac{d}{dt} \frac{\partial f}{\partial \dot{\boldsymbol{\lambda}}} = 0 \tag{8}$$

where $\partial / \partial \boldsymbol{\lambda}$ is the operator of partial differentiation with respect to each component of the vector λ. Note that since we have n partial differential equations and m equations of constraint, we can find all n components of $\boldsymbol{\lambda}(t)$ and the m Lagrange multipliers; the remaining 2n unknowns, due to the n differential equations of second order, can be found from the 2n boundary or initial conditions. Note also that the constraints $G_j$ may be static or dynamic; this does not change the previously mentioned differential equation, since they do not contain $\dot{\boldsymbol{\lambda}}$ (for more information, see [100] and [104], where one can also find the cases of constraints given as ODEs). Obviously, similar reasoning also applies to the mechanical constraint (5).

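For the functional (3), the Euler–Lagrange equations are particularly simple, because the integrand contains only the derivative of $\boldsymbol{\lambda}(t)$, which is a particularity of all geodesic problems, i.e., we have the following system:

$$\frac{d}{dt} \frac{\dot{\lambda}_i}{\sqrt{\dot{\lambda}_1^2 + \cdots + \dot{\lambda}_n^2}} - \mu \frac{\partial F}{\partial \lambda_i} = 0 \quad \forall i = 1, \ldots, n \tag{9}$$

with the additional equation (5) to find $\mu(t)$. The latter expression can also be written as

$$\frac{d}{dt} \frac{\dot{\boldsymbol{\lambda}}(t)}{\left\| \dot{\boldsymbol{\lambda}}(t) \right\|} - \mu \frac{\partial F}{\partial \boldsymbol{\lambda}} = 0. \tag{10}$$

After the total differentiation with respect to time, it becomes

$$\frac{\ddot{\boldsymbol{\lambda}} \left\langle \dot{\boldsymbol{\lambda}}, \dot{\boldsymbol{\lambda}} \right\rangle - \dot{\boldsymbol{\lambda}} \left\langle \dot{\boldsymbol{\lambda}}, \ddot{\boldsymbol{\lambda}} \right\rangle}{\left\| \dot{\boldsymbol{\lambda}} \right\|^3} - \mu \frac{\partial F}{\partial \boldsymbol{\lambda}} = 0 \tag{11}$$

where $\langle \cdot, \cdot \rangle$ denotes the scalar product. Moreover, for the particular case n = 3, by using the well-known rule of linear algebra transforming the scalar products into the vector ones, we may simplify (11) as follows:

$$\frac{\dot{\boldsymbol{\lambda}} \times \left( \ddot{\boldsymbol{\lambda}} \times \dot{\boldsymbol{\lambda}} \right)}{\left\| \dot{\boldsymbol{\lambda}} \right\|^3} - \mu \frac{\partial F}{\partial \boldsymbol{\lambda}} = 0. \tag{12}$$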
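The step from (10) to (11), and the vector form (12), can be checked numerically. The sketch below compares the first terms of (11) and (12) with a finite-difference derivative of the unit tangent along a test curve; the helix used as the test path is an arbitrary choice for illustration, not a trajectory from the study.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def first_term_11(vel, acc):
    """(acc*<vel,vel> - vel*<vel,acc>) / |vel|^3, any dimension n."""
    vv, va = dot(vel, vel), dot(vel, acc)
    return tuple((a * vv - v * va) / vv ** 1.5 for v, a in zip(vel, acc))


def first_term_12(vel, acc):
    """vel x (acc x vel) / |vel|^3, only for n = 3."""
    vv = dot(vel, vel)
    return tuple(c / vv ** 1.5 for c in cross(vel, cross(acc, vel)))


def unit_tangent(t):
    """Unit tangent of the test helix lambda(t) = (cos t, sin t, t)."""
    vel = (-math.sin(t), math.cos(t), 1.0)
    speed = math.sqrt(dot(vel, vel))
    return tuple(v / speed for v in vel)


t, eps = 0.3, 1e-5
vel = (-math.sin(t), math.cos(t), 1.0)
acc = (-math.cos(t), -math.sin(t), 0.0)
# Central difference of the unit tangent, i.e., the first term of (10).
numeric = tuple((p - m) / (2 * eps)
                for p, m in zip(unit_tangent(t + eps), unit_tangent(t - eps)))
print(first_term_11(vel, acc))
print(first_term_12(vel, acc))
print(numeric)
# Straight line lambda(t) = alpha*t + beta: zero acceleration, zero term.
print(first_term_11((1.0, 2.0, 3.0), (0.0, 0.0, 0.0)))
```

The three printed vectors for the helix should coincide up to numerical error, and the term is identically zero on a straight line traversed at constant rate.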

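We can even generalize (9)–(12) for the cases when there is more than one constraint that must be fulfilled along the optimum solution $\boldsymbol{\lambda}(t)$, i.e., at each point of $\boldsymbol{\lambda}(t)$. By using (7) or (8), we can generalize (11) as follows:

$$\frac{\ddot{\boldsymbol{\lambda}} \left\langle \dot{\boldsymbol{\lambda}}, \dot{\boldsymbol{\lambda}} \right\rangle - \dot{\boldsymbol{\lambda}} \left\langle \dot{\boldsymbol{\lambda}}, \ddot{\boldsymbol{\lambda}} \right\rangle}{\left\| \dot{\boldsymbol{\lambda}} \right\|^3} - \sum_{j=1}^{m} \mu_j \frac{\partial G_j}{\partial \boldsymbol{\lambda}} = 0. \tag{13}$$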

Note that the acoustical constraints [see (4)] are not present in any of these equations. This is because they are related only to the boundary conditions but not to the whole path $\boldsymbol{\lambda}(t)$, which is precisely the complexity of our case. In fact, (9)–(13) and (5) are only the necessary conditions that the optimum solution $\boldsymbol{\lambda}(t)$ must satisfy, and they are not sufficient for its complete determination. Generally, the latter is carried out with the help of the boundary conditions. However, in our case, these conditions are not given explicitly but implicitly via the acoustical constraints (4), i.e., $t_1$ is related to $G_{a,1}$, and $t_2$ to $G_{a,2}$, as follows:

$$t_j \Rightarrow \boldsymbol{\lambda}_j \Rightarrow \mathbf{F}_j \Rightarrow G_{a,j}, \quad j = 1, 2. \tag{14}$$

In this case, which is often called in the literature the undetermined endpoints case [100], the optimum function $\boldsymbol{\lambda}(t)$ must also satisfy a supplementary system of differential equations,⁹ involving the derivatives of the acoustical constraints, and this condition is sufficient to completely determine the optimum solution $\boldsymbol{\lambda}(t)$, thereby giving the trajectory of motion in the internal space.

The solution of such a system of partial differential equations represents a quite complicated problem of mathematical physics (we recall that the dependencies $\mathbf{F}(\boldsymbol{\lambda})$ and $F(\boldsymbol{\lambda})$ are given by two nonlinear ANNs). Thus, we would first like to discuss some mathematical issues related to (9)–(13), which raise serious questions about the validity of some variants of the EPH, and then propose a solution for the problem.

D. Discussion of the Drawbacks of the Linearized λ-Model

1) Brief Description of the Linearized λ-Model: The classic EPH does not imply any dynamic description of the applied motor commands λ. In order to bring some dynamics to the system, the so-called linearized λ-model was proposed. This variant of the EPH is basically the classic λ-model, with the supplementary supposition that the time transitions between static positions in the internal space λ may be effected only at a constant rate. In other words, during the change from one posture to another, the commands λ are modified linearly with time, i.e., λ(t) = αt + β, where α and β are constant coefficients (vectors). We have already mentioned this model in Section I; it is the simplest implementation of the shortest distance principle in the internal motor-command space. However, we will show that this model works only in trivial cases, and it does not support dynamic systems well.

⁹ The so-called left-hand and right-hand endpoint requirements [100].

2) Contradiction With the Principle of the Shortest Path: We will now show that if the constraints on the form of the path, i.e., $G_j(\lambda,t)$, are nonlinear¹⁰ in λ, the principle of the shortest path and the linearized λ-model are in mathematical contradiction.

Reductio ad absurdum: The only solutions that the linearized λ-model allows are the linear ones, λ(t) = αt + β. If we substitute these linear solutions into (13), the first term on the left-hand side of this equation is always equal to zero, which means that the remaining term must also always be zero; however, the dependencies $G_j(\lambda,t)$, $j=1,\ldots,m$, are, in general, nonlinear in λ, and therefore, the sum of their derivatives cannot be null in all cases. Furthermore, if at least one of these dependencies is nonlinear in λ, the sum of their derivatives cannot be null, which means that the first term on the left-hand side of (9)–(13) cannot always be null, and thus, the optimum solutions cannot be the linear functions λ(t) = αt + β. It is simple to show that this first term vanishes only when λ(t) is a linear function. An extremely simple proof, based on geometry, can be given for the particular case of a 3-D internal space. From (12), we have $\dot{\lambda}\times(\ddot{\lambda}\times\dot{\lambda})=0$, which means that either $\dot{\lambda}$ is parallel to $\ddot{\lambda}\times\dot{\lambda}$, or one of them is zero. The parallelism is impossible because $\ddot{\lambda}\times\dot{\lambda}$ is orthogonal to both of its arguments, and one of them is precisely $\dot{\lambda}$. Thus, one of these vectors is null. In the most general case, $\ddot{\lambda}=0$, and therefore, λ(t) = αt + β.
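The substitution step of this argument can be written out explicitly. The block below is only a worked restatement, assuming that the first term of (13) has the triple-product numerator used in the geometric proof and that there are m constraints on the form of the path:

```latex
% BAC-CAB identity in 3-D, linking the vector product of the proof to the numerator of (13)
\dot{\lambda}\times(\ddot{\lambda}\times\dot{\lambda})
  = \ddot{\lambda}\,\langle\dot{\lambda},\dot{\lambda}\rangle
  - \dot{\lambda}\,\langle\dot{\lambda},\ddot{\lambda}\rangle

% Substituting the linear command of the linearized lambda-model
\lambda(t)=\alpha t+\beta
  \;\Rightarrow\; \dot{\lambda}=\alpha,\quad \ddot{\lambda}=0
  \;\Rightarrow\;
  \frac{\ddot{\lambda}\,\langle\dot{\lambda},\dot{\lambda}\rangle
        -\dot{\lambda}\,\langle\dot{\lambda},\ddot{\lambda}\rangle}
       {\|\dot{\lambda}\|^{3}} = 0
  \;\Rightarrow\;
  \sum_{j=1}^{m}\mu_j\,\frac{\partial G_j}{\partial\lambda}=0
```

The last condition is the one that generally fails when at least one $G_j$ is nonlinear in λ, which is the contradiction stated in the text.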

It is important to note that the impossibility of linear solutions is not due to any particular formulation of the constraints but to the nature of the geodesic problem itself, for which the solutions are, in fact, always determined by the constraints (e.g., for the original geodesic problem, the Earth’s surface determines the corresponding solutions). If the constraints are nonlinear, the optimum solutions λ(t) cannot be linear. Thus, only trivial constraints related to the whole traveled path λ(t) [e.g., those described in footnote 7] can be compatible with the linearized λ-model. These findings cast doubt on the linearized λ-model for task planning and on its general use in motor-control theories.

¹⁰ Or even linear, but in contexts other than those described in footnote 7.

Fig. 5. The behaviors of the classic linear spring, the exponential spring, and the exponential springs of Feldman’s linearized λ-model (static and dynamic variants) for positive and negative α.

3) Contradiction With the Finite-Energy and Finite-Power Principles: Another contradiction is that with the finite-energy and finite-power principles. Generally, processes having finite energy and, especially, finite power are said to be physically stable (note that some processes can have infinite energy but finite power, for instance, classic small undamped harmonic oscillations). It is not complicated to show that Feldman’s linearized λ-model (both static and dynamic variants; see Section I) may lead to a mechanical process of infinitely growing energy and power. We decided to compare four spring models: the classic linear damped spring model, the exponential damped spring model, the exponential damped spring model controlled by a linear λ command (i.e., the static Feldman model with λ(t) = αt + β), and the exponential damped spring model controlled by a linear λ command with the adjustment of feedback related to the current velocity¹¹ (i.e., the dynamic Feldman model with λ(t) = αt + β). The first model is classically described by a second-order ODE of motion, $\ddot{x}+\gamma\dot{x}+\rho x=0$. The second one can be described by

$$\ddot{x}+\gamma\dot{x}+\rho\,(e^{\eta x}-1)=0 \tag{15}$$

the third model by

$$\ddot{x}+\gamma\dot{x}+\rho\,(e^{\eta x-\alpha t-\beta}-1)=0 \tag{16}$$

and the fourth one by

$$\ddot{x}+\gamma\dot{x}+\rho\,(e^{\eta x+\kappa\dot{x}-\alpha t-\beta}-1)=0 \tag{17}$$

the displacement being a function of time, i.e., x ≡ x(t), and γ, ρ, η, κ, α, and β being constant parameters. These parameters are (for all models) given by γ = 0.16, ρ = 1, η = 1, κ = 0.05, α = +0.25 (case α > 0), α = −0.50 (case α < 0), and β = 5. The initial conditions, fixed for all differential equations, are x(0) = 0 and $\dot{x}(0)=1$. Unfortunately, only the first equation has an exact analytical solution. We therefore had to resort to numerical methods to obtain the corresponding solutions (namely, we used the MATLAB function ode45, which is based on the Runge–Kutta method). The results are shown in Fig. 5. From this figure, we can ascertain that both the linear and the exponential noncontrolled spring models have stable, exponentially damped oscillating solutions. On the contrary, for the exponential spring models controlled by a linear λ command, the solutions increase without bound, tending to +∞ for α > 0, and decrease without bound, tending to −∞ for α < 0; therefore, in both latter cases, they are neither in L1 nor in L2 for t ∈ [0, ∞), and the energy and power of such a process tend to infinity.
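The comparison of the four spring models can be sketched numerically. The block below is a minimal illustration, not the authors' MATLAB code: it swaps ode45 for a fixed-step classical fourth-order Runge–Kutta scheme written in pure Python, and the function names (`accel`, `simulate`), the step size h = 0.01, and the horizon t_end = 100 are assumptions made only for this example. The parameter values and initial conditions are those quoted in the text.

```python
import math

# Parameter values quoted in the text for the four models.
GAMMA, RHO, ETA, KAPPA, BETA = 0.16, 1.0, 1.0, 0.05, 5.0


def accel(model, t, x, v, alpha):
    """Return the acceleration x'' prescribed by the chosen spring model."""
    if model == "linear":          # x'' + gamma*x' + rho*x = 0
        spring = RHO * x
    elif model == "exponential":   # equation (15)
        spring = RHO * (math.exp(ETA * x) - 1.0)
    elif model == "static":        # equation (16), linear lambda command
        spring = RHO * (math.exp(ETA * x - alpha * t - BETA) - 1.0)
    elif model == "dynamic":       # equation (17), velocity feedback added
        spring = RHO * (math.exp(ETA * x + KAPPA * v - alpha * t - BETA) - 1.0)
    else:
        raise ValueError(f"unknown model: {model}")
    return -GAMMA * v - spring


def simulate(model, alpha=0.0, t_end=100.0, h=0.01, x0=0.0, v0=1.0):
    """Integrate from x(0)=x0, x'(0)=v0 with fixed-step RK4; return x(t_end)."""
    x, v = x0, v0
    for step in range(round(t_end / h)):
        t = step * h
        k1x, k1v = v, accel(model, t, x, v, alpha)
        x2, v2 = x + 0.5 * h * k1x, v + 0.5 * h * k1v
        k2x, k2v = v2, accel(model, t + 0.5 * h, x2, v2, alpha)
        x3, v3 = x + 0.5 * h * k2x, v + 0.5 * h * k2v
        k3x, k3v = v3, accel(model, t + 0.5 * h, x3, v3, alpha)
        x4, v4 = x + h * k3x, v + h * k3v
        k4x, k4v = v4, accel(model, t + h, x4, v4, alpha)
        x += h * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0
        v += h * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
    return x


# Final displacement for each case considered in the text.
final = {
    "linear": simulate("linear"),
    "exponential": simulate("exponential"),
    "static, alpha=+0.25": simulate("static", 0.25),
    "static, alpha=-0.50": simulate("static", -0.50),
    "dynamic, alpha=+0.25": simulate("dynamic", 0.25),
    "dynamic, alpha=-0.50": simulate("dynamic", -0.50),
}
for name, x_end in final.items():
    print(f"{name:22s} x(t_end) = {x_end:+.3f}")
```

Under these assumptions, one would expect the two uncontrolled models to settle near x = 0 and the two controlled models to follow the drifting command αt + β, which is the unbounded growth discussed in the text.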

4) Potentially Limited System Dynamics: Finally, the third drawback of linearized λ-models lies in the fact that prescribing linear variations of λ(t) may potentially limit the dynamics of the system. Taking into account that the dependency F(λ) is associated with the physics of the system (its biomechanics) and cannot be changed without relearning, the prescription of a linear λ(t) directly affects the dynamics of the system. In this case, it is clear that if F(λ) is linear in λ, so is F(t) ≡ F(λ(t)) in t; if F(λ) is quadratic in λ, so is F(t) in t, etc. In this context, it is difficult to imagine how, with the linearized λ-model, one can truly implement dynamic-target-specification theory, thus allowing, for example, some variability of transitions (see Section I), if the system dynamics are fixed by its biomechanics and the CNS influence is taken into account only in a sketchy linear form.

¹¹ It is sometimes called the proprioceptive feedback.

E. Solution in First Approximation

The exact analytical solution of the optimum internal model problem, which is described in Section II-C, is not simple. The solution of a system of partial differential equations that involve neural networks, and whose boundary conditions are not given explicitly, is a quite complicated problem of mathematical physics. On the other hand, one can easily see from Fig. 4 that the force dependencies F(λ) are not strongly nonlinear, and therefore, they may be replaced by straight lines in first approximation. As to the formant dependencies F(λ), they are strongly nonlinear; however, as follows from Section II-C, these constraints do not affect the form of the solution but only its endpoints. Thus, based on Section II-D2, the optimum solutions become the straight lines λ(t) = αt + β, and the undetermined coefficients α and β are the variables by which the mechanical and acoustical constraints must be satisfied. In other words, in our case, where the constraints on the form of the path are not far from linear, the linearized λ-model can actually be viewed as the first approximation to the global problem of finding the optimal path. We can no longer “play” with the form of this optimum path λ(t) (that was precisely the main role of variational calculus) but only with the limits of the integral (3), which are written in implicit form, i.e., $\lambda_1\equiv\lambda(t_1)$ and $\lambda_2\equiv\lambda(t_2)$. Since the dependency F(λ) is now linear and the optimum solution λ(t) is a straight line, it is sufficient that the constraint (5) be satisfied at only two points (e.g., at the ends $\lambda_1$ and $\lambda_2$) in order to be satisfied at every point of the optimum solution λ(t). Mathematically, this means that the mechanical constraint (5), which was initially on the form of the path, now becomes a constraint on the boundary conditions. As to the acoustical constraints (4), they remain, as before, on the boundary conditions $\lambda_1$ and $\lambda_2$. By substituting the linear solutions λ(t) = αt + β into the functional (3), we obtain the following well-known formula for the length of a straight line in n-dimensional space:

L(λ1,λ2)=t2

t1

α

dt =

α

t2−t1=

λ2−λ1

.

Moreover, we extend this approach to more than two vowels between which the commands are linear, say, p vowels. In this case, the total length becomes

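$$L(\lambda_1,\ldots,\lambda_p)=\sum_{j=1}^{p-1}\int_{t_j}^{t_{j+1}}\|\alpha\|\,dt=\sum_{j=1}^{p-1}\|\lambda_{j+1}-\lambda_j\|=\sum_{j=1}^{p-1}\sqrt{\sum_{i=1}^{n}(\lambda_{i,j+1}-\lambda_{i,j})^{2}}. \tag{18}$$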

Obviously, in this case, the boundary conditions have to be fulfilled at the points $\lambda_1,\lambda_2,\ldots,\lambda_p$, instead of $\lambda_1$ and $\lambda_2$ only.

We now state the exact mathematical formulation of the problem: we seek to extremize the function $L(\lambda_1,\ldots,\lambda_p)$ with respect to the variables $(\lambda_1,\ldots,\lambda_p)$, under p acoustical¹² and p mechanical constraints [see also (4) and (5)]

$$\begin{aligned} G_{a,j}(\lambda_j)&=0 \quad \forall j=1,\ldots,p\\ G_{m,j}(\lambda_j)&=0 \quad \forall j=1,\ldots,p. \end{aligned} \tag{19}$$

Thus, instead of the optimization with respect to the form of the traveled path and its ends (defined implicitly via the boundary conditions), we now carry out the optimization only with respect to the ends $(\lambda_1,\ldots,\lambda_p)$. In other words, the initial problem of the constrained optimization of a functional becomes that of the constrained optimization of a function, which is usually simpler to solve.

The latter optimization problem is classically solved by means of Lagrange’s method of undetermined multipliers for functions. This method consists in introducing a composite function $U(\lambda_1,\ldots,\lambda_p,\mu_a,\mu_m)$ of n×p+2p variables, which is the sum of the function $L(\lambda_1,\ldots,\lambda_p)$ of n×p variables and of the 2p constraints (19), weighted by the corresponding Lagrange multipliers $\mu_a\equiv(\mu_{a,1},\ldots,\mu_{a,p})$ and $\mu_m\equiv(\mu_{m,1},\ldots,\mu_{m,p})$:

$$U=L+\sum_{h=1}^{p}\mu_{a,h}\,G_{a,h}(\lambda_h)+\sum_{h=1}^{p}\mu_{m,h}\,G_{m,h}(\lambda_h). \tag{20}$$

Here, the index j was replaced by h in order to avoid confusion in the derivatives that follow. The optimization itself consists in finding the optimum set $(\hat{\lambda}_1,\ldots,\hat{\lambda}_p,\hat{\mu}_a,\hat{\mu}_m)$ such that

$$\left.\frac{\partial U}{\partial\lambda_j}\right|_{\lambda_j=\hat{\lambda}_j}=0,\qquad \left.\frac{\partial U}{\partial\mu_a}\right|_{\mu_a=\hat{\mu}_a}=0,\qquad \left.\frac{\partial U}{\partial\mu_m}\right|_{\mu_m=\hat{\mu}_m}=0,\qquad j=1,\ldots,p \tag{21}$$

or, in short, grad U = 0. We recall that setting the derivatives of U with respect to the Lagrange multipliers to zero actually gives the constraints (19), which is precisely the interest of Lagrange’s method: the constrained optimization of the function L of n×p variables reduces to the unconstrained one of the function U of n×p+2p variables, where the last 2p variables are Lagrange’s undetermined multipliers.
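The reduction described above can be illustrated on a toy problem. The sketch below is hypothetical and much simpler than the speech-robot case: the internal space is two-dimensional, there are p = 2 endpoints, and each endpoint is constrained to the border of a circle (a stand-in for a formant zone), with no mechanical constraint. The names `constraint`, `U`, and `grad`, and the circle centers and radii, are assumptions of this example. It checks by finite differences that grad U = 0 at the geometrically obvious solution, where the endpoints lie on the segment joining the two centers.

```python
import math

# Hypothetical 2-D toy: two circular zones standing in for the formant zones.
C1, R1 = (0.0, 0.0), 1.0
C2, R2 = (5.0, 0.0), 2.0


def constraint(point, center, radius):
    """G(lambda) = |lambda - c|^2 / r^2 - 1, which is zero on the circle border."""
    dx, dy = point[0] - center[0], point[1] - center[1]
    return (dx * dx + dy * dy) / radius**2 - 1.0


def U(z):
    """Composite function U = L + mu1*G1 + mu2*G2 of the six variables in z."""
    x1, y1, x2, y2, mu1, mu2 = z
    length = math.hypot(x2 - x1, y2 - y1)
    return (length
            + mu1 * constraint((x1, y1), C1, R1)
            + mu2 * constraint((x2, y2), C2, R2))


def grad(f, z, h=1e-6):
    """Central finite-difference gradient of f at z."""
    g = []
    for i in range(len(z)):
        zp, zm = list(z), list(z)
        zp[i] += h
        zm[i] -= h
        g.append((f(zp) - f(zm)) / (2.0 * h))
    return g


# Candidate stationary point: endpoints (1, 0) and (3, 0), multipliers 0.5 and 1.
z_star = [1.0, 0.0, 3.0, 0.0, 0.5, 1.0]
g_star = grad(U, z_star)
print("U at the candidate point:", U(z_star))
print("largest gradient component:", max(abs(c) for c in g_star))
```

At this point both constraints vanish, so U coincides with the path length, and the derivatives of U with respect to the two multipliers are just the constraint values, as recalled in the text.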

12Note that use of the acoustical constraints of the second kind instead of

the ﬁrst one cannot be considered too restrictive for our problem from an

acousticophonetic point of view. On the one hand, the position of the borders

is variable and can be adjusted via the parameters l,j and ◦

Fl,j . On the other

hand, as we shall see later, almost always, the solution for the constraints of the

second kind is also that for the constraints of the ﬁrst kind.

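The procedure of differentiation is somewhat particular for the length function $L(\lambda_1,\ldots,\lambda_p)$. The derivative with respect to $\lambda_j$ is not always calculated in the same way, because the first term $\lambda_1$ and the last term $\lambda_p$ appear only once in the sum (18), while all the intermediate terms $\lambda_2,\ldots,\lambda_{p-1}$ appear twice. Thus, by differentiating with respect to each component of $\lambda_1$, we obtain the following system:

$$\frac{\partial L}{\partial\lambda_{i,1}}=-\frac{\lambda_{i,2}-\lambda_{i,1}}{\sqrt{\sum_{i=1}^{n}(\lambda_{i,2}-\lambda_{i,1})^{2}}},\qquad i=1,\ldots,n \tag{22}$$

or, in vector form,

$$\frac{\partial L}{\partial\lambda_1}=-\frac{\lambda_2-\lambda_1}{\|\lambda_2-\lambda_1\|}. \tag{23}$$

With respect to $\lambda_2,\ldots,\lambda_{p-1}$, we obtain

$$\frac{\partial L}{\partial\lambda_j}=\frac{\lambda_j-\lambda_{j-1}}{\|\lambda_j-\lambda_{j-1}\|}-\frac{\lambda_{j+1}-\lambda_j}{\|\lambda_{j+1}-\lambda_j\|} \tag{24}$$

where $j=2,\ldots,p-1$. Finally, for $j=p$, it yields

$$\frac{\partial L}{\partial\lambda_p}=\frac{\lambda_p-\lambda_{p-1}}{\|\lambda_p-\lambda_{p-1}\|}. \tag{25}$$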

The differentiation of the constraints is less sophisticated. For

the acoustical one, it becomes

∂Ga,h(λh)

∂λj

=









2ηj



l=1 Fl,j(λj)−◦

Fl,j 2ηj−1

2ηj

l,j ·∂Fl,j

∂λj

,h=j

0,h=j

for $j=1,\ldots,p$, where the derivatives of the lth formant of the jth vowel with respect to the motor commands $\lambda_j$ of this vowel are calculated according to (36), i.e.,

∂Fl,j

∂λj



s=1

1vs,l(ws−λj)e−(ws−λjb1)2(26)

for $l=1,\ldots,k$, $j=1,\ldots,p$, and $\dim\lambda_j=n$ (i.e., $k\times p\times n$ derivatives in all). It may be noted that since the constraints on the boundary conditions of the hth vowel are independent of the jth vowel for h ≠ j, these derivatives are all null. The derivatives of the mechanical constraint are simply given by

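$$\frac{\partial G_{m,h}(\lambda_h)}{\partial\lambda_j}=\begin{cases}\dfrac{\partial F(\lambda_j)}{\partial\lambda_j}, & h=j\\[4pt] 0, & h\neq j\end{cases} \tag{27}$$

where the last derivatives are calculated similarly to the formant ones (without the index l and with the ANN weights corresponding to the λ ⇔ F network).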

The system (21) cannot be solved analytically. The optimization of U was, therefore, performed numerically by means of the gradient descent method (also known as the method of steepest descent), implemented in the MATLAB programming language.


TABLE I

FREQUENCIES (IN HERTZ) WE USED FOR THE DEFINITION OF THE PHONETIC TARGETS FOR THE ACOUSTICAL CONSTRAINT

III. RESULTS

A. Only Acoustical Constraints Are Applied

1) Simulations With the Model and Effect of Different Task-Planning Strategies: We first considered a slightly simpler case, namely the one without the third term in (20), i.e., without the mechanical constraints (5) at all. First of all, we fixed the static phonetic targets (i.e., prescribed formants) according to Table I. In addition, the formant frequencies $\mathring{F}_j$ are also used for the initialization of the optimization algorithm: in the initial database used for the ANN learning, we search for the pairs of data λ ⇔ F having F as close as possible to the formant zones’ centers $\mathring{F}_j$ (these frequencies are generally found within ±10 Hz for the first formant and ±30 Hz for the second one). The corresponding motor commands λ are taken as the initial points for the gradient descent method. Note also that since there is no bijection between the spaces λ and F, there is a finite (n−k)-dimensional volume in λ, any point of which maps to the same point in F, e.g., to the projection of the initial point $\mathring{F}_j$. Thus, the initial point in λ cannot be uniquely determined. Indeed, since the numerical optimization method is gradient descent, the optimum solution may potentially be influenced by the choice of this initial point λ. However, the projection of the initial point (i.e., the ellipsoidorectangle’s center), regardless of the initial point itself in λ, is quite close to the solution (i.e., to the ellipsoidorectangle’s border); moreover, the function $L(\lambda_1,\ldots,\lambda_p)$ given by (18) is “sufficiently” convex, and the constraints (19) can be considered locally monotonic¹³ (see Fig. 4). Therefore, the local-minima problem does not really affect the solution. On the other hand, we also tested this potential dependency empirically, and the optimization algorithm always returned the same final solution (within ANN accuracy), regardless of the initial point.

¹³ One may also note that the projections of the traveled paths onto the formant space F (shown in dots in the figures) are just slightly curved in all experiments.

Fig. 6. Optimization of the sequence [i a O]. Formant zones are defined by the corresponding formant ellipses. Notation: “Optimum ends F(λj)” ≡ $(\hat{F}_1,\ldots,\hat{F}_p)$.

The optimization results given by our algorithm for the sequence of three vowels [i a O] are shown in Fig. 6. As we can observe from this figure, the formant zones were defined as ellipses, i.e., all $\eta_j=1$. The set of Lagrange’s multipliers found by our optimization algorithm is $\hat{\mu}_a=(0.0670,\,0.1700,\,0.0880)$ mm, and that of the optimal motor commands $\hat{\lambda}_1,\ldots,\hat{\lambda}_p$ (in millimeters) is given by

$$\begin{aligned}\hat{\lambda}_1&=(33.76,\,48.12,\,46.84,\,74.92,\,18.38,\,63.78), && \text{for [i]}\\ \hat{\lambda}_2&=(46.96,\,48.02,\,41.78,\,74.04,\,18.70,\,63.90), && \text{for [a]}\\ \hat{\lambda}_3&=(51.04,\,49.44,\,41.21,\,72.02,\,18.74,\,64.02), && \text{for [O]}\end{aligned}$$

where the gradient step was set to 0.0125. The set of the corresponding formants $(\hat{F}_1,\ldots,\hat{F}_p)\equiv(F(\hat{\lambda}_1),\ldots,F(\hat{\lambda}_p))$, which is the projection of the optimal commands found onto the formant space, is given by (in hertz, first and second formants)

$$\begin{aligned}\hat{F}_1&=(353.3,\,2117.2), && \text{for [i]}\\ \hat{F}_2&=(569.1,\,1284.4), && \text{for [a]}\\ \hat{F}_3&=(570.9,\,1054.5), && \text{for [O].}\end{aligned} \tag{28}$$

We can clearly observe that when the optimization is finished, all three acoustical constraints vanish, and the function U reaches its minimum, which is designated by $\hat{U}$ and is equal to 18.962 mm. A total of 253 iterations of the method of steepest descent for Lagrange’s multipliers method were necessary to find this optimal solution. Note that the path traveled in the formant space F is shown in dots, in order to emphasize the fact that the real path is actually traveled in the internal space λ, and the path shown in the formant space is only its projection onto F. Analogously, the optimum endpoints are found in the internal space λ, and we showed their projections onto the formant space. Note that they are exactly on the ellipses’ borders, which means that the acoustical constraints from (19) are fulfilled (the same can actually be observed directly in the lower left panel of Fig. 6).
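Since the constraints vanish at the optimum, U reduces there to the path length (18), so the reported minimum can be cross-checked from the reported commands. The snippet below is only such a sanity check, using the motor commands as printed (rounded to two decimals); the variable names are arbitrary.

```python
import math

# Optimal motor commands (mm) as reported in the text, rounded to two decimals.
lam = [
    (33.76, 48.12, 46.84, 74.92, 18.38, 63.78),  # [i]
    (46.96, 48.02, 41.78, 74.04, 18.70, 63.90),  # [a]
    (51.04, 49.44, 41.21, 72.02, 18.74, 64.02),  # [O]
]

# Euclidean length of each straight segment, then the total length (18).
segments = [math.dist(a, b) for a, b in zip(lam, lam[1:])]
total = sum(segments)
print("segment lengths (mm):", [round(s, 3) for s in segments])
print("total length (mm):", round(total, 3))
```

The total obtained this way is about 18.97 mm, within roughly 0.01 mm of the reported 18.962 mm; the small gap is plausibly due to the rounding of the printed commands.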

Fig. 7. Optimization of the sequence [i a O]. Formant zones of [i] and [O] are defined by formant ellipses (i.e., $\eta_1=\eta_3=1$) and that of [a] by an ellipsoidorectangle with parameter $\eta_2=3$.

We now present a different case of optimization: all initial parameters are the same, except for the definition of the formant zone for the vowel [a], which is defined by an ellipsoidorectangle with parameter $\eta_2=3$ (see Fig. 7). For this case, the optimization algorithm found $\hat{\mu}_a=(0.0645,\,0.0580,\,0.0850)$ mm and $\hat{U}=18.693$ mm. The set of formants $(\hat{F}_1,\ldots,\hat{F}_p)$ that corresponds to the optimal commands found is given by (in hertz, first and second formants)

$$\begin{aligned}\hat{F}_1&=(352.7,\,2115.4), && \text{for [i]}\\ \hat{F}_2&=(568.1,\,1262.5), && \text{for [a]}\\ \hat{F}_3&=(568.9,\,1057.3), && \text{for [O].}\end{aligned} \tag{29}$$

By comparing the two latter sets (28) and (29), or Figs. 6 and 7, we note that the main difference lies in the formants of [a], especially the second formant; in the first case, it is 21.9 Hz (1.7%) greater than in the second one. This actually means that by accenting the same vowel differently in the same sequence, we can obtain its different acoustical variants. Note that the parameter η = 3 means, on the one hand, a different geometrical form of the formant zone, and on the other hand, that when the solution reaches the border and starts to leave the formant zone, the acoustical constraint increases much more strongly than for the normal ellipse given by η = 1. Thus, the acoustical constraint defines not only the formant zone but also the degree of accentuation of this zone (note that we employ the word “accentuation” specifically in this sense). Therefore, even small, simple changes in the task-planning strategy can have an impact on the formant space F. Note also that the character of the impact is mostly local, i.e., the other vowels of the sequence, whose strategy remained unchanged, showed relatively small modifications of their positions in the formant space F.

2) Phenomenon of Task Anticipation: We found that our model is in accordance with the phenomenon of task anticipation or, more precisely, with its acoustic variant, which has long been observed by phoneticians and is known in phonetics as coarticulation or the effect of phonetic environment [80]–[82], [108]–[112]. In a sequence of several vowels, this phenomenon represents the influence of the following vowel(s) on the previous one(s); in particular, it concerns two vowels following one after another, i.e., in a sequence of p vowels, the jth vowel is especially influenced by the (j+1)th vowel ∀ j = 1,...,p−1. Acoustically, this phenomenon can be observed in terms of formants. Usually, this phenomenon is present in real speech to different degrees (depending on the context and other conditions, e.g., accentuation), which is why we wanted to find out whether the proposed model was able to reproduce it.

Fig. 8. Optimization of the sequence [i a œ O]. Formant zones are defined by the corresponding formant ellipses.

Fig. 9. Optimization of the sequence [i a O œ]. Formant zones are defined by the corresponding formant ellipses.

Fig. 10. Optimization of the sequence [i a œ O]. Formant zone of [a] is defined by an ellipsoidorectangle of $\eta_2=3$.

For demonstration, we choose an example where a triple phenomenon of task anticipation is produced: the sequences [i a œ O] versus [i a O œ]. Thus, the acoustical anticipation will be studied on the vowels [a], [O], and [œ]. In addition, two different planning strategies will be compared. The optimizations of the sequences with elliptic planning strategies related to the constraints are shown in Figs. 8 and 9. For the former, the optimization algorithm returns $\hat{\mu}_a=(0.0673,\,0.1770,\,0.1200,\,0.0780)$ mm, $\hat{U}=23.989$ mm, and the optimal formant set (in hertz)

$$\begin{aligned}\hat{F}_1&=(357.4,\,2117.2), && \text{for [i]}\\ \hat{F}_2&=(571.1,\,1296.0), && \text{for [a]}\\ \hat{F}_3&=(528.6,\,1422.1), && \text{for [œ]}\\ \hat{F}_4&=(551.0,\,1062.8), && \text{for [O].}\end{aligned} \tag{30}$$

For the latter, the optimization algorithm returns $\hat{\mu}_a=(0.0658,\,0.1980,\,0.1419,\,0.061)$ mm, $\hat{U}=25.010$ mm, and the optimal formant set (in hertz)

$$\begin{aligned}\hat{F}_1&=(354.0,\,2119.9), && \text{for [i]}\\ \hat{F}_2&=(567.6,\,1285.1), && \text{for [a]}\\ \hat{F}_3&=(556.9,\,1066.5), && \text{for [O]}\\ \hat{F}_4&=(515.8,\,1388.0), && \text{for [œ].}\end{aligned} \tag{31}$$

Thus, we can note a slight anticipation on each vowel; however, taking the precision into account, we mainly observe the anticipation on the vowel [œ]; the formant difference between the two cases is ΔF = (12.76, 34.12) Hz, which represents 2.4% on each formant, or 3.4% of total difference.¹⁴
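The percentages quoted for [œ] can be reproduced from the printed formants. The sketch below rests on two assumptions that are not stated explicitly in the text: that the "total difference" of footnote 14 is the square root of the sum of the squared relative deviations (this reading reproduces the quoted 3.4%), and that the relative deviations are taken with respect to the first sequence. The function name is hypothetical.

```python
import math


def percent_total_difference(reference, other):
    """100 * sqrt(sum((dF_l / F_l)^2)), with F_l taken from `reference`."""
    return 100.0 * math.sqrt(
        sum(((r - o) / r) ** 2 for r, o in zip(reference, other))
    )


# Formants of [œ] in Hz as printed in (30) and (31), rounded to 0.1 Hz.
oe_in_first = (528.6, 1422.1)    # sequence [i a œ O]
oe_in_second = (515.8, 1388.0)   # sequence [i a O œ]

# Relative deviation of each formant, then the combined figure.
per_formant = [100.0 * (a - b) / a for a, b in zip(oe_in_first, oe_in_second)]
total = percent_total_difference(oe_in_first, oe_in_second)
print("per-formant difference (%):", [round(v, 2) for v in per_formant])
print("total difference (%):", round(total, 2))
```

With the rounded formants, each per-formant deviation comes out at about 2.4% and the combined figure at about 3.4%, in line with the values in the text.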

¹⁴ The latter is $\%\mathrm{td}=100\sqrt{(\Delta F_1/F_1)^2+\cdots+(\Delta F_k/F_k)^2}$.

We will now show that, according to our model, the chosen strategy can be the reason for greater or smaller anticipation. Once again, we will change the strategy for the vowel [a] by defining its formant zone by an ellipsoidorectangle of $\eta_2=3$, and we will compare the two previous sequences under these conditions. The results are presented in Figs. 10 and 11. For the sequence [i a œ O], our optimization algorithm returned $\hat{\mu}_a=(0.0674,\,0.052,\,0.123,\,0.078)$ mm, $\hat{U}=23.885$ mm, and the optimal formant set (in hertz)

$$\begin{aligned}\hat{F}_1&=(358.0,\,2122.9), && \text{for [i]}\\ \hat{F}_2&=(572.0,\,1320.3), && \text{for [a]}\\ \hat{F}_3&=(529.7,\,1431.6), && \text{for [œ]}\\ \hat{F}_4&=(551.8,\,1063.5), && \text{for [O].}\end{aligned} \tag{32}$$

For the sequence [i a O œ], it gave $\hat{\mu}_a=(0.065,\,0.080,\,0.1421,\,0.061)$ mm, $\hat{U}=25.028$ mm, and the optimal formant set (in hertz)

$$\begin{aligned}\hat{F}_1&=(353.3,\,2118.9), && \text{for [i]}\\ \hat{F}_2&=(568.3,\,1263.6), && \text{for [a]}\\ \hat{F}_3&=(557.0,\,1065.2), && \text{for [O]}\\ \hat{F}_4&=(515.4,\,1387.4), && \text{for [œ].}\end{aligned} \tag{33}$$

Fig. 11. Optimization of the sequence [i a O œ]. Formant zone of [a] is defined by an ellipsoidorectangle of $\eta_2=3$.

We obtain an anticipation on [a] of 56.7 Hz for its second formant, which represents 4.3% of its initial value. Note that the previous vowel [i] was also affected by this modification of strategy for [a]; its first formant decreased by 4.7 Hz (1.3%). Thus, our model confirms that the degree of anticipation may depend on the chosen strategy. Moreover, since the anticipation on [a] was very small (practically null) in the case of the elliptic formant strategy, one can suppose that the anticipation itself may actually be one of the consequences of the chosen strategy. Thus, we think that the acoustical anticipation, or coarticulation, is probably due not to the muscular mechanics and body dynamics, as was claimed in several previous studies [32], [72], but to centrally planned mechanisms.¹⁵

¹⁵ Moreover, these works reported not only that context sensitivity arises from biomechanics but also that it need not be represented in the CNS control commands, while we obtained this effect precisely from the CNS-planning mechanism.

Finally, it would also be interesting to compare the obtained results with the real ones obtained by phoneticians. Unfortunately, a quantitative analysis of this kind seems quite difficult and inconsistent for several reasons. First, the proposed solution is only a first-approximation solution. Second, the BTM is not ideal, and neither is the ANN. Third, it is very difficult to compare our results with those found in the phonetic literature, because the measurement techniques are very different, and even basic acousticophonetic data differ strongly (e.g., compare [107] with [112], although both studies are on American English vowels). Moreover, in most coarticulation works, the extent of anticipation is measured not by frequency deviation (as we did) but by human-perception error rate, i.e., by means of the so-called discriminant analysis (percentage of correct versus incorrect identification by listeners) [82], [107]. This analysis is based on the fact that nearly all errors involve confusions between adjacent vowels [107], [112]. However, such an error rate is difficult to interpret in terms of the absolute formant values that we reported. Another problem is that the discriminant analysis is quite subjective (speech is produced by some speakers and perceived by some listeners), and therefore, it is individual-dependent.

Fourth, in our model, there are some individual-dependent pa-

rameters, ηjand i,j , which are meant to represent concrete

speaker; in other words, the set of these parameters is meant to represent age, sex, accent, effort, concentration, weariness, etc. However, of course, their concrete values are difficult to estimate numerically. This is why we think that it is better to compare the obtained results qualitatively, rather than quantitatively. If, nevertheless, we agree to compare our results quantitatively with those provided by the small number of works in which coarticulation frequency deviations were reported, e.g., [80] and [108], we find that they are of the same order (for example, anticipation on [A] is up to 10 Hz for F1 and up to 50 Hz for F2, depending on the context). However, the overall analysis of these results is more interesting. In particular, it shows that the extent of anticipation depends on the size of the formant zones, as well as on the position of the vowel inside the formant space. Generally, vowels having larger formant zones and greater opening angles to the neighboring vowels possess greater anticipation, e.g., the cases of [æ], [E], [œ], [I], [U], [u], but not [i], which has a small opening angle, or [A] and [a], which have small opening angles and zones (see, e.g., [80], [107], and [112]). We obtained similar results: quite small anticipation on [i], [a] (case of a small formant zone, η = 1), or [O] (which was limited to only two close neighbors), and greater anticipation on [œ] and on [a] (case of a larger formant zone, η = 3).

Fig. 12. Correct optimization of the sequence [i a] by using the constraints of the second kind; those of the first kind are, therefore, also satisfied and the minimum of U is reached.

Fig. 13. Incorrect optimization of the sequence [i a] by using the constraints of the first kind; those of the second kind are also satisfied, but the minimum of U is not reached.

Fig. 14. Optimization of the sequence [i O] with corresponding mean force’s level. Formant zones are defined by the corresponding formant ellipses. Only acoustical constraints are applied.

3) A Few Words About the Two Kinds of Acoustical Constraints From a Practical Point of View and Their Roles in the Previous Optimizations: One can also note that, although we gave the solution for the constrained problem with the acoustical constraints of the second kind (see Section II-C), it will, in fact, also be the solution for the problem with the weaker constraints, those of the first kind, if we suppose the local monotonicity of the dependencies F(λ) inside the ellipsoidorectangles.¹⁶ This can generally be done because, as we saw previously (see Fig. 4; Section II-B), these relationships can, depending on the interval, be considered locally monotonic. Furthermore, not only does it give the solution for the problem with constraints of the first kind, but it is also better to use the constraints of the second kind to find a solution to the problem with the constraints of the first kind. In fact, as we may recall from the previous figures, during the optimization process (over different iterations), the solution may temporarily leave the authorized formant zone. This is not so important for the constraints of the second kind but is very important for those of the first kind, because the Heaviside function does not permit leaving the authorized zone even temporarily, and its sharp increase wrongfully stops the gradient algorithm (this mostly remains true even if we approximate the Heaviside function by its continuous variants; see the formula given later). We present a comparison of such cases in Figs. 12 and 13. Note that the acoustical constraints of the first and second kinds are fulfilled in both cases (the formant projections of the optimal solutions are on the ellipses’ borders); however, the correct solution is given only by the model with the constraints of the second kind, i.e., we have $\hat{U}=14.084$ mm in the case of the constraints of the second kind, while in the case of the constraints of the first kind, $\hat{U}=21.105$ mm, which is the wrong solution because the length is not minimum. Thus, the gradient algorithm was unable to find the correct solution by directly using the constraints of the first kind; therefore, it is better to circumvent these difficulties by using the constraints of the second kind to find the optimal solution for the constraints of the first kind. Finally, in order to improve the quality of convergence of the optimization algorithm, the Heaviside function was replaced by its continuous approximation H(z) ≈ (1 + tanh(az))/2, with the parameter a taken in the range 60–1000.

¹⁶ See also the short discussion devoted to the choice of the initial point in Section III-A.1.
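The continuous approximation of the Heaviside function mentioned above can be sketched as follows. This is only an illustration of the formula H(z) ≈ (1 + tanh(az))/2; the function name and the sample points are arbitrary, and the two values of a are simply the ends of the quoted range.

```python
import math


def heaviside_smooth(z, a=60.0):
    """Continuous approximation of the Heaviside step: (1 + tanh(a*z)) / 2."""
    return 0.5 * (1.0 + math.tanh(a * z))


# The transition around z = 0 becomes sharper as the parameter a grows.
for a in (60.0, 1000.0):
    samples = [round(heaviside_smooth(z, a), 4) for z in (-0.05, -0.01, 0.0, 0.01, 0.05)]
    print(f"a = {a:g}: {samples}")
```

A larger a gives a closer match to the ideal step but also a steeper slope near z = 0, which is the kind of sharp increase that the text identifies as problematic for the gradient algorithm.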

B. Both Acoustical and Mechanical Constraints Are Applied

The main drawback of our model without mechanical constraints is that the optimal solution gives formants that are always on the borders of their ellipsoidorectangles, for both kinds of acoustical constraints. Moreover, for many vowels, only some parts of these borders can be covered (e.g., for the vowel [a], only the upper part of its formant zone can be covered), and the covered zone depends essentially on the position of the vowel inside the vocalic triangle. This is actually the consequence of the chosen planning strategy (i.e., the minimization of the traveled path under constraints) and of the fact that the dependency F(λ) is often near-monotonic. Therefore, one can suppose that the addition of the mechanical constraints may permit reaching some uncovered zones.

First of all, in all previous mechanically unconstrained simulations, the mean force's level corresponding to the optimal ends of the traveled path was different. Now, we run another mechanically unconstrained simulation, namely, the sequence [i O].17 The optimization results are shown in Fig. 14; the numerical ones are given as

$$
\begin{aligned}
U &= 22.23\ \text{mm} \\
\mu_a &= (0.0647,\ 0.1124)\ \text{mm} \\
\mathbf{F}_1 &= (299.7,\ 2196.3)\ \text{Hz, for [i]} \\
\mathbf{F}_2 &= (535.9,\ 901.0)\ \text{Hz, for [O]} \\
F_1 &= 0.65\ \text{N, for [i]} \\
F_2 &= 0.25\ \text{N, for [O]}
\end{aligned}
\tag{34}
$$

where the two-component $\mathbf{F}_j$ are the formant vectors, the scalar $F_j$ are the mean force's levels, and $F_j \equiv F(\lambda_j)$, $j = 1, \ldots, p$; in other words, $j$ corresponds to the sound in the sequence. We will now try to prescribe the mean force's level with both kinds of acoustical constraints and to prescribe an equal global mean force's level to each vowel in the sequence.

17 We recall that in this section, [i] and [O] are those with ∗ from Table I.

Fig. 15. Optimization of the sequence [i O] with mechanical and acoustical constraints. All formant zones are defined by ellipses. Case of equal mean force's levels (low levels).

In the first example, we try to force both vowels to decrease their mean force's level to the value 0.20 N. The results of this optimization are given as

$$
\begin{aligned}
U &= 26.90\ \text{mm} \\
\mu_a &= (0.36,\ 0.13)\ \text{mm} \\
\mu_m &= (6.8,\ -1.0)\ \text{mm/N} \\
\mathbf{F}_1 &= (292.5,\ 2194.4)\ \text{Hz, for [i]} \\
\mathbf{F}_2 &= (531.8,\ 897.0)\ \text{Hz, for [O]} \\
F_1 &= 0.20\ \text{N, for [i]} \\
F_2 &= 0.20\ \text{N, for [O].}
\end{aligned}
$$

Fig. 15 shows the optimization. Mainly, we note that [O] did not significantly change its position on the border of the formant zone, while [i] was slightly influenced. This is understandable: the prescribed mean force's level is much closer to the unconstrained level of [O], which is why it mainly affects [i], whose unconstrained level was high. On the other hand, the modification of its position was not very great, notwithstanding the quite different muscular commands. This is normal because, as we previously said, there is no bijection between the internal and external spaces; thus, a whole domain in the internal space may correspond to one point in the external space. In other words, the most significant changes were produced within this domain of the internal space; more precisely, for [i], the total difference in the internal space $\lambda$ is %td = 32.8%, while that in the external space $\mathbf{F}$ is only %td = 2.4%. This is also related to the fact that the length of the path traveled in the internal space increased: 26.90 mm versus 22.23 mm.

Fig. 16. Optimization of the sequence [i O] with mechanical and acoustical constraints. All formant zones are defined by ellipses. Case of equal mean force's levels (high levels).

In the second example, we try to obtain a high prescribed mean force's level, say $\mathring{F} = 0.77$ N, for all vowels in the same sequence. The optimization returns the following:

$$
\begin{aligned}
U &= 29.65\ \text{mm} \\
\mu_a &= (0.077,\ 0.191)\ \text{mm} \\
\mu_m &= (0.89,\ -5.50)\ \text{mm/N} \\
\mathbf{F}_1 &= (272.6,\ 2191.4)\ \text{Hz, for [i]} \\
\mathbf{F}_2 &= (562.5,\ 915.4)\ \text{Hz, for [O]} \\
F_1 &= 0.77\ \text{N, for [i]} \\
F_2 &= 0.77\ \text{N, for [O].}
\end{aligned}
$$

Fig. 16 shows the optimization. In this case, both vowels were influenced: the formants of [i] by %td = 9.1% and those of [O] by %td = 5.2%; the corresponding values in the internal space $\lambda$ are much greater: %td = 19.9% and %td = 43.7%, respectively.

Fig. 17. Optimization of the sequence [i O] with mechanical and acoustical constraints. All formant zones are defined by ellipses. Case of different mean force's levels (low and high levels).

The two previous simulations showed that the model allows the prescription of equal mean force's levels to each vowel in the sequence. However, this case is not very realistic, and the prescription of different mean force's levels to the vowels would better represent reality. In the next example, for the sequence [i O], we try to prescribe a low mean force's level to the first vowel (whose level was high in the mechanically unconstrained simulation), given by $\mathring{F}_{i} = 0.3$ N, and, conversely, a high level to the second one, given by $\mathring{F}_{O} = 0.6$ N. This simulation, which is shown in Fig. 17, returned the following optimization results:

$$
\begin{aligned}
U &= 34.70\ \text{mm} \\
\mu_a &= (0.139,\ 0.211)\ \text{mm} \\
\mu_m &= (2.78,\ -4.17)\ \text{mm/N} \\
\mathbf{F}_1 &= (308.1,\ 2200.6)\ \text{Hz, for [i]} \\
\mathbf{F}_2 &= (536.8,\ 901.7)\ \text{Hz, for [O]} \\
F_1 &= 0.30\ \text{N, for [i]} \\
F_2 &= 0.60\ \text{N, for [O].}
\end{aligned}
$$

Both vowels were influenced, as well as the character of the projection of the traveled path. As to the formant deviation, the formants of [i] changed by %td = 2.8% and those of [O] by %td = 1.9%; the corresponding values in the internal space $\lambda$ are much larger: %td = 24.7% and %td = 32.4%, respectively, with respect to the mechanically unconstrained simulation (see (34) and Fig. 14).

These results show that, regardless of the mean force's level, it is possible to keep the vowels in their formant zones (on the border); thus, there was no impact on the fulfillment of the acoustical constraints, and mathematically, the problem could always be solved. The prescribed mean force's level has a variable impact on the external formant space $\mathbf{F}$, sometimes giving access to formant zones that were not covered before. In contrast, the prescribed mean force's level strongly impacts the internal command space $\lambda$, which is always influenced to a great degree; in addition, the distance between the optimum ends always increases, which suggests that a change of the external tasks may often be compensated by the CNS.
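The increase of the traveled path can be quantified directly from the values of $U$ reported above. The following short Python sketch (variable names and case labels are ours) computes the relative increase of each mechanically constrained case with respect to the mechanically unconstrained one given in (34).

```python
# Path lengths U in mm, copied from the optimization results in the text.
baseline_mm = 22.23  # mechanically unconstrained sequence [i O], see (34)
constrained_mm = {
    "equal levels, 0.20 N": 26.90,
    "equal levels, 0.77 N": 29.65,
    "different levels, 0.30 N and 0.60 N": 34.70,
}

def relative_increase(value, reference):
    """Relative increase of value over reference, in percent."""
    return 100.0 * (value - reference) / reference

increases = {
    label: relative_increase(u, baseline_mm) for label, u in constrained_mm.items()
}
for label, pct in increases.items():
    print(f"{label}: U is {pct:.1f}% longer than without mechanical constraints")
```

The increases are about 21.0%, 33.4%, and 56.1%, respectively, so the longest path is obtained when different mean force's levels are prescribed to the two vowels.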

IV. CONCLUSION

We presented an optimum neural-network-based internal model for the control of speech robots governed by the equilibrium-point hypothesis (EPH). The internal model was designed according to the principle of the shortest path in the internal command space, under acoustical and mechanical constraints, which were meant to represent static acoustic and dynamic mechanical targets.

We first derived the exact analytical solution, which is based on the calculus of variations. We proved that this solution cannot, in general, be a linear function, which casts doubt on the so-called linearized λ-model; this model has received some interest in artificial-intelligence device modeling and supposes that the time transitions in the internal space may only be linear. Moreover, it was also shown that the linearized λ-model may have potential instability issues because of infinitely growing energy and power, and that it may not be fully compatible with the dynamic-specification target theory. We therefore suggest reconsidering the linearized λ-model.

Then, by using some empirical findings, we developed a first-approximation solution for the proposed optimum internal model. Experimental tests showed that this model is in accordance with the phenomenon of acoustico-phonetic anticipation (i.e., coarticulation) and that the degree of the latter is closely related to the chosen task-planning strategy. Moreover, it was also suggested that the anticipation itself may be due to the internal CNS strategy and not to the dynamics or biomechanics of the physical system (such as the human body), as was previously suggested in several works [32], [72].

As to the influence of the mean muscular tongue effort, which helps to account for lax and tense vowels, its strict prescription via mechanical constraints did not make the problem mathematically over-posed, and we always succeeded in obtaining optimum solutions. Thus, the model did not allow us to answer questions about the relationships between the forces and hypo/hyper-speech (see, e.g., [113]); namely, we could not affirm that a greater or smaller mean force's level leads to the restriction or enlargement of the formant zones. We could reach small levels as well as great ones. The strict prescription of the mean force's level has a variable impact on the external formant space. In contrast, it strongly impacts the internal command space: the length of the traveled path increased, and the motor commands became quite different, even if the impact on the formant space was small. In other words, the robot controlled by the proposed internal model was able to compensate for the change of one of the external tasks, without significantly changing the quality of the other one, by the corresponding shift in the internal space, which was determined by the proposed optimum algorithm.

APPENDIX

ANN: CONSTRUCTION, LEARNING, AND VALIDATION

To construct the approximate model, we employed a two-layer ANN using radial basis neurons for the first layer (also called the hidden or radial layer) and linear neurons for the second one (also called the output layer) [5], [87]–[91]. The chosen radial basis transfer function is the Gaussian curve $e^{-(\cdot)^2}$. The choice of radial basis networks is motivated by the fact that they are usually considered among the best for most nonlinear approximations, thus giving a good tradeoff between the complexity and the precision of the network.

Based on the architecture of such a network [91], the formalization of the input–output relationships is not complicated. The output $a_s$ of the $s$th radial layer neuron is given by

$$
a_s = e^{-(\|w_s - \lambda\|\, b_1)^2}, \qquad s = 1, \ldots, S_1 \tag{35}
$$

where $w_s \equiv (w_{1,s}, \ldots, w_{n,s})$ is the input weight vector of the $s$th radial neuron, $\lambda \equiv (\lambda_1, \ldots, \lambda_n)$ is the input training vector, $b_1$ is the narrowness parameter for the Gaussian function of the $s$th hidden neuron, $S_1$ is the number of hidden layer neurons, and $\|\cdot\|$ denotes the Euclidean distance. The $l$th output $\alpha_l$ of the linear layer can be written as

$$
\alpha_l = v_l a + b_{2,l} = \sum_{s=1}^{S_1} v_{s,l}\, e^{-(\|w_s - \lambda\|\, b_1)^2} + b_{2,l} \tag{36}
$$

for $l = 1, \ldots, S_2$, where $a$ is the column vector $(a_1, \ldots, a_{S_1})^t$, $v_l$ is the output weight row vector $(v_{1,l}, \ldots, v_{S_1,l})$, $S_2$ is the number of output linear neurons, and $b_{2,l}$ is the bias of the $l$th linear neuron ("$t$" stands for transpose). The learning procedure consists of determining the weight vectors $w_s$, $v_l$, and the bias vector $b_2 \equiv (b_{2,1}, \ldots, b_{2,S_2})$ by minimizing the sum of squared output errors (SSE) on the set of input–output data, where the parameters $b_1$ are often left to the user.
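As an illustration of (35) and (36), the forward pass of such a two-layer radial basis network can be sketched in a few lines of Python with NumPy. This is only a sketch under our own conventions: the function name, the array layout (one row of `W` per hidden neuron, one column of `V` per output), and the random numbers in the usage example are assumptions and are not taken from the original implementation.

```python
import numpy as np

def rbf_forward(lam, W, b1, V, b2):
    """Forward pass of a two-layer radial basis network.

    lam : (n,)      input vector (motor command)
    W   : (S1, n)   input weights, row s is w_s
    b1  : float     narrowness parameter of the Gaussian
    V   : (S1, S2)  output weights, entry [s, l] is v_{s,l}
    b2  : (S2,)     biases of the linear output neurons
    """
    dist = np.linalg.norm(W - lam, axis=1)  # Euclidean distances ||w_s - lam||
    a = np.exp(-(dist * b1) ** 2)           # hidden activations, eq. (35)
    return a @ V + b2                       # linear output layer, eq. (36)

# Usage with small hypothetical sizes (n = 6 inputs, S1 = 5, S2 = 2).
rng = np.random.default_rng(0)
W = rng.uniform(-1.0, 1.0, size=(5, 6))
V = rng.normal(size=(5, 2))
b2 = np.zeros(2)
print(rbf_forward(rng.uniform(-1.0, 1.0, size=6), W, 0.5, V, b2))
```

When the input coincides with one of the centers $w_s$ and all other centers are far away, the corresponding activation is 1 and the others are close to 0, so the output reduces to the $s$th row of the output weights plus the biases.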

Practical implementation of the ANNs was carried out with the Neural Network Toolbox of MATLAB [91], by means of the newrb function. To train the network, we first generated 17 000 random motor commands $\lambda$, distributed uniformly and occupying a six-orthotope in the internal space $\lambda$. Then, these motor commands were applied to the BTM one after another (see Fig. 3). This produced 17 000 output formant vectors $\mathbf{F}$ (see Fig. 18) and output global mean force's levels $F$. For $\mathbf{F}$, only the first two formants are taken into account, i.e., $k = 2$ (the first two formants are, in general, sufficient to distinguish the vowel [86], [107]).18 Then, the 17 000 data were split into two parts: 6000 data were given to the ANNs in order to choose 340 optimal ones for the construction of the latter (the hidden layer of the ANN is, therefore, composed of 340 radial layer neurons, i.e., $S_1 = 340$; for more details, see [91]), and the remaining 11 000 were used to test the accuracy of the obtained ANNs, which is about 1% on each output.

18 Despite the fact that the values of $n$ and $k$ were fixed in the experiments, all the formulas are written for arbitrary values of $n$ and $k$, in order to keep the generality of the model.

Fig. 18. Distribution of the 17 000 output vectors $\mathbf{F}$, i.e., the first two formants $F_1$ and $F_2$.
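The overall procedure (uniform sampling of the commands, a split into a construction set and a test set, and a fit of the linear layer by least squares) can be sketched as follows in Python. Several simplifications are made here and should be read as assumptions: the sizes are scaled down by a factor of ten, a hypothetical smooth function `toy_plant` stands in for the BTM, the centers are drawn at random from the construction set rather than selected incrementally as a toolbox routine would do, and the value of $b_1$ is chosen by hand.

```python
import numpy as np

rng = np.random.default_rng(0)

# Scaled-down stand-ins for n = 6, 17 000 samples, 6000 construction data, S1 = 340.
n, n_samples, n_select, S1 = 6, 1700, 600, 34
b1 = 0.5  # narrowness parameter, chosen by hand for this sketch

def toy_plant(lam):
    """Hypothetical smooth stand-in for the BTM (NOT the real model)."""
    return np.sin(lam[:, 0]) + 0.5 * lam[:, 1] * lam[:, 2] + 0.2 * lam[:, 3:].sum(axis=1)

def hidden_layer(lam, W, b1):
    """Gaussian activations exp(-(||w_s - lam|| b1)^2) for a batch of inputs."""
    dist = np.linalg.norm(lam[:, None, :] - W[None, :, :], axis=2)
    return np.exp(-(dist * b1) ** 2)

# Uniform random commands in an n-orthotope, here the cube [-1, 1]^n.
lam = rng.uniform(-1.0, 1.0, size=(n_samples, n))
y = toy_plant(lam)

# Split: the first part builds the network, the rest tests it.
lam_fit, y_fit = lam[:n_select], y[:n_select]
lam_test, y_test = lam[n_select:], y[n_select:]

# Centers: a random subset of the construction inputs (a simplification).
W = lam_fit[rng.choice(n_select, size=S1, replace=False)]

# Linear output layer by least squares; the column of ones carries the bias b2.
A_fit = np.hstack([hidden_layer(lam_fit, W, b1), np.ones((n_select, 1))])
coef, *_ = np.linalg.lstsq(A_fit, y_fit, rcond=None)

A_test = np.hstack([hidden_layer(lam_test, W, b1), np.ones((len(lam_test), 1))])
fit_sse = float(np.sum((A_fit @ coef - y_fit) ** 2))
test_rmse = float(np.sqrt(np.mean((A_test @ coef - y_test) ** 2)))
print(f"construction-set SSE = {fit_sse:.4f}, test RMSE = {test_rmse:.4f}")
```

Evaluating the error on data that were not used for the construction, as in the last lines, is what allows an accuracy figure such as the one quoted above to be interpreted as a generalization error rather than a fitting error.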

REFERENCES

[1] M. L. Latash, Neurophysiological Basis of Movement. Champaign, IL:

Hum. Kinet., 1998.

[2] M. L. Latash and F. Lestienne, Motor Control and Learning. New York: Springer Sci.–Bus. Media, 2006.

[3] R. A. Schmidt and T. D. Lee, Motor Control and Learning, 4th ed.

Champaign, IL: Hum. Kinet., 2005.

[4] P. Cordo and S. R. Harnad, Movement Control. Cambridge, U.K:

Cambridge Univ., 1994.

[5] M. A. Arbib, The Handbook of Brain Theory and Neural Networks, 2nd

ed. Cambridge, MA: MIT, 2003.

[6] P. M. Fitts, “The information capacity of the human motor system in

controlling the amplitude of movement,” J. Exp. Psychol., vol. 47, no. 6,

pp. 381–391, 1954.

[7] P. M. Fitts and J. R. Peterson, “Information capacity of discrete motor

responses,” J. Exp. Psychol., vol. 67, no. 2, pp. 103–112, 1964.

[8] P. M. Fitts and B. K. Radford, “Information capacity of discrete motor

responses under different cognitive sets,” J. Exp. Psychol., vol. 71,

pp. 475–482, 1966.

[9] N. Hogan, “An organizing principle for a class of voluntary movements,”

J. Neurosci., vol. 4, no. 11, pp. 2745–2764, 1984.

[10] N. Hogan, “Moving gracefully: Quantitative theories of motor coordina-

tion,” Trends Neurosci., vol. 10, pp. 170–174, 1985.

[11] W. L. Nelson, “Physical principles for economies of skilled movements,”

Biol. Cybern., vol. 46, pp. 135–147, 1983.

[12] M. Dornay, M. K. Y. Uno, and R. Suzuki, “Minimum muscle–tension

change trajectories predicted by using a 17-muscle model of the monkey’s

arm,” J. Motor Behav., vol. 28, no. 2, pp. 83–100, 1996.

[13] Y. Uno, M. Kawato, and R. Suzuki, “Formation and control of optimal

trajectory in human multijoint arm movement,” Biol. Cybern, vol. 61,

no. 2, pp. 89–101, 1989.

[14] H. Hatze and J. D. Buys, “Energy-optimal controls in the mammalian

neuromuscular system,” Biol. Cybern., vol. 27, pp. 9–20, 1977.

[15] P. Zukofsky, “Arm movements in skilled violin playing,” presented at

the 22nd Annu. Meeting Psychon. Soc., Philadelphia, PA, 1981.

[16] D. G. Asatryan and A. G. Feldman, “Functional tuning of the nervous

system with control of movement or maintenance of a steady posture.

I: Mechanographic analysis of the work of the joint or execution of a

postural task,” Biophys. J., vol. 10, pp. 925–935, 1965.

[17] A. G. Feldman, “Functional tuning of the nervous system with control of

movement or maintenance of a steady posture. II: Controllable parame-

ters of the muscles,” Biophys. J., vol. 11, pp. 565–578, 1966.

[18] A. G. Feldman, “Functional tuning of the nervous system with control

of movement or maintenance of a steady posture. III: Mechanographic

analysis of execution by man of the simplest of motor tasks,” Biophys.

J., vol. 11, pp. 766–775, 1966.

[19] A. G. Feldman and G. N. Orlovsky, “The influence of different descending systems on the tonic stretch reflex in the cat,” Exp. Neurol., vol. 37, pp. 481–494, 1972.

[20] A. G. Feldman, “Once more on the equilibrium point hypothesis (λ

model) for motor control,” J. Motor Behav., vol. 18, pp. 17–54, 1986.

[21] A. G. Feldman, “Change of muscle length as a consequence of a shift in

an equilibrium of muscle load system,” Biophys., vol. 19, pp. 544–548,

1974.

[22] A. G. Feldman, “Control of the length of a muscle,” Biophys., vol. 19, pp. 766–771, 1974.

[23] E. Bizzi, N. Hogan, F. A. Mussa-Ivaldi, and S. F. Giszter, “Does the

nervous system use equilibrium-point control to guide single and multiple

joint movements?” Behav. Brain Sci., vol. 15, pp. 603–613, 1992.

[24] X. Gu and D. H. Ballard, “An equilibrium point based model

unifying movement control in humanoids,” in Proc. Robot. Sci. Syst.,

2006, pp. 1–7.

[25] E. Bizzi, N. Hogan, F. A. Mussa-Ivaldi, and S. F. Giszter, “The

equilibrium-point framework: A point of departure,” Behav. Brain Sci.,

vol. 15, pp. 808–815, 1992.

[26] X. Gu and D. H. Ballard, “Robot movement planning and control based

on equilibrium point hypothesis,” in Proc. IEEE Conf. Robot., Autom.

Mechatron., 2006, pp. 1–6.

[27] S. A. Migliore and S. DeWeerth, “Control of robotic joints using prin-

ciples from the equilibrium point hypothesis of animal motor control,”

M.S. thesis, Georgia Inst. Technol., Atlanta, GA, 2004.

[28] M. Kawato, “Internal models for motor control and trajectory planning,”

Curr. Opin. Neurobiol., vol. 9, pp. 718–727, 1999.

[29] J. R. Flanagan and A. M. Wing, “The role of internal models in mo-

tion planning and control: evidence from grip force adjustments during

movements of hand-held loads,” J. Neurosci., vol. 17, pp. 1519–1528,

1997.

[30] M. Kawato, M. Isobe, Y. Maeda, and R. Suzuki, “Coordinates trans-

formation and learning control for visually-guided voluntary movement

with iteration: A Newton-like method in a function space,” Biol. Cybern.,

vol. 59, no. 3, pp. 161–177, 1988.

[31] J. D. Cooke, “The organization of simple skilled movements,” in Ad-

vances in Psychology: Tutorials in Motor Behavior, G. E. Stelmach and

J. Requin, Eds. Amsterdam, The Netherlands: North-Holland, 1980.

[32] P. L. Gribble, D. J. Ostry, V. Sanguineti, and R. Laboissière, “Are complex control signals required for human arm movement?” J. Neurophysiol., vol. 79, no. 3, pp. 1409–1424, 1998.

[33] V. Sanguineti, R. Laboissière, and D. J. Ostry, “A dynamic biomechanical model for neural control of speech production,” J. Acoust. Soc. Amer., vol. 103, no. 3, pp. 1615–1627, 1998.

[34] J. R. Flanagan, D. J. Ostry, and A. G. Feldman, “Control of human

jaw and multi-joint arm movements,” in Cerebral Control of Speech

and Limb Movements, G. Hammond, Ed. New York: Springer-Verlag,

1990, pp. 29–58.

[35] D. J. Ostry, J. R. Flanagan, A. G. Feldman, and K. G. Munhall, “Human jaw movement kinematics and control,” in Tutorials in Motor Behavior II, G. E. Stelmach and J. Requin, Eds. Amsterdam, The Netherlands: Elsevier, 1992, pp. 646–660.

[36] “Special issue on the equilibrium point hypothesis and its applications in speech,” Bull. Commun. Parlée, vol. 4, pp. 5–110, 1998.

[37] R. Laboissière, D. J. Ostry, and A. G. Feldman, “The control of multimuscle systems: Human jaw and hyoid movements,” Biol. Cybern., vol. 74, no. 3, pp. 373–384, 1996.

[38] R. Wilhelms-Tricarico, “Physiological modeling of speech production:

Methods for modeling soft-tissue articulators,” J. Acoust. Soc. Amer.,

vol. 97, no. 5, pp. 3085–3098, 1995.

[39] R. Wilhelms-Tricarico and C.-M. Wu, “A biomechanical model of the

tongue,” in Proc. Bioeng. Conf., vol. 35, K. B. Chandran, R. Vanderby,

Jr., and M. S. Hefzy, Eds. New York: ASME, 1997, pp. 69–70.

[40] P. Buscemi, M. Carlson, and R. W. Tricarico, “A computational approach

to muscle modeling of the human tongue via the ﬁnite element method

along with motion control correlations with MRI tracking data for simple

speech patterns,” J. Med. Devices, vol. 2, no. 2, p. 027548, 2008.

[41] H. Ranc, C. Servais, P.-F. Chauvy, S. Debaud, and S. Mischler, “Effect of surface structure on frictional behaviour of a tongue/palate tribological system,” Tribol. Int., vol. 39, no. 12, pp. 1518–1526, 2006.

[42] W. S. Levine, C. E. Torcaso, and M. Stone, “Controlling the shape of a

muscular hydrostat: A tongue or tentacle,” Lect. Notes Control Inf. Sci.,

vol. 321, pp. 20–222, 2005.

[43] W. H. Lewis, Ed., Gray’s Anatomy of the Human Body, 20th U.S. ed.

Philadelphia, PA: Lea & Febiger, 1918.


[44] O. C. Zienkiewicz and R. L. Taylor, The Finite Element Method. Basic

Formulation and Linear Problems. New York: McGraw-Hill, 1989.

[45] P. E. Rubin, T. Baer, and P. Mermelstein, “An articulatory synthesizer

for perceptual research,” J. Acoust. Soc. Amer., vol. 70, pp. 321–328,

1981.

[46] P. E. Rubin, E. Saltzman, L. Goldstein, R. McGowan, M. Tiede, and

C. Browman, “CASY and extensions to the task-dynamic model,” in

Proc. 1st ESCA Tutorial Res. Workshop Speech Producing Model., 4th

Speech Prod. Semin., 1996, pp. 125–128.

[47] J. S. Perkell and K. N. Stevens, “A physiologically-oriented model of

tongue activity in speech production,” Ph.D. dissertation, Speech Com-

mun. Group, Dept. Electr. Eng., Mass. Inst. Technol., Cambridge, MA,

1974.

[48] S. Kiritani, K. Miyawaki, O. Fujimura, and J. E. Miller, “Computational

model of the tongue,” J. Acoust. Soc. Amer., vol. 57, no. S1, pp. S3–S3,

1975.

[49] Y. Kakita and O. Fujimura, “Computational model of the tongue: A

revised version,” J. Acoust. Soc. Amer., vol. 62, no. S1, pp. S15–S16,

1977.

[50] M. M. Sondhi and J. Schroeter, “A nonlinear articulatory speech synthesizer using both time- and frequency-domain elements,” in Proc. ICASSP, Tokyo, Japan, 1986, pp. 1999–2002.

[51] M. M. Sondhi and J. Schroeter, “A hybrid time-frequency domain articu-

latory speech synthesizer,” IEEE Trans. Acoust. Speech Signal Process.,

vol. ASSP-35, no. 7, pp. 955–967, Jul. 1986.

[52] S. Maeda, “An articulatory model of the tongue based on a statistical

analysis,” J. Acoust. Soc. Amer., vol. 65, no. s1, p. S22, 1988.

[53] S. Maeda, “Improved articulatory model,” J. Acoust. Soc. Amer., vol. 84,

no. s1, p. S146, 1988.

[54] F. Vogt, “Finite element modeling of the tongue,” in Proc. Int. Workshop

Audio Vis. Speech Process., 2005, pp. 143–144.

[55] O. Engwall, “A 3D tongue model based on MRI data,” in Proc. 6th Int.

Conf. Spoken Lang. Process., Beijing, China, 2000, pp. 901–904.

[56] K. van den Doel, F. Vogt, R. E. English, and S. Fels, “Towards articulatory speech synthesis with a dynamic 3D finite element tongue model,” in Proc. ISSP, 2006, pp. 59–66.

[57] [Online]. Available: http://www.takanishi.mech.waseda.ac.jp/top/research/voice/index.htm

[58] K. Nishikawa, K. Asama, K. Hayashi, H. Takanobu, and A. Takanishi,

“Development of a talking robot,” in Proc. IEEE/RSJ Int. Conf. Intell.

Robots Syst., 2000, pp. 1760–1765.

[59] K. Fukui, K. Nishikawa, S. Ikeo, E. Shintaku, K. Takada, H. Takanobu,

M. Honda, and A. Takanishi, “Development of a talking robot with

vocal cords and lips having human-like biological structures,” in Proc.

IEEE/RSJ Int. Conf. Intell. Robots Syst., 2005, pp. 2023–2028.

[60] P. L. M. de Maupertuis, “Accord de différentes lois de la nature qui avaient jusqu’ici paru incompatibles (Eng. trans.: “Accord between different laws of nature that seemed incompatible”),” presented at the Mémoires l’Acad. Sci. Paris, Paris, France, 1744, p. 417.

[61] P. L. M. de Maupertuis, “Les lois du mouvement et du repos, déduites d’un principe de métaphysique (Eng. trans.: “Derivation of the laws of motion and equilibrium from a metaphysical principle”),” presented at the Mémoires l’Acad. Sci. Paris, Berlin, Germany, 1746, p. 267.

[62] J.-L. Lagrange, Œuvres de Lagrange. Tome Onzième: Mécanique Analytique, Tome Premier, Quatrième éd. Paris, France: Gauthier-Villars et fils, imprimeurs-libraires, 1888.

[63] W. R. Hamilton, “On a general method in dynamics,” Philos. Trans. R.

Soc., 1834, pp. 247–308.

[64] W. R. Hamilton, “On a general method in dynamics,” Philos. Trans. R.

Soc., 1835, pp. 95–144.

[65] H. Goldstein, Classical Mechanics. Reading, MA: Addison-Wesley,

1953.

[66] V. I. Arnold, Mathematical Methods of Classical Mechanics, 2nd ed.

Berlin, Germany: Springer-Verlag, 1989.

[67] L. D. Landau and E. M. Lifshitz, Course of Theoretical Physics. Vol. I:

Mechanics, 3rd ed. Oxford, U.K.: Elsevier, 2003.

[68] L. N. Hand and J. D. Finch, Analytical Mechanics. Cambridge, U.K.:

Cambridge Univ., 1998.

[69] C. Lanczos, The Variational Principles of Mechanics. Toronto, ON,

Canada: Univ. Toronto, 1970.

[70] D. J. Ostry and K. G. Munhall, “Control of jaw orientation and posi-

tion in mastication and speech,” J. Neurosci., vol. 71, pp. 1528–1545,

1994.

[71] P. L. Gribble, R. Laboissière, and D. J. Ostry, “Control of human arm and jaw motion: issues related to musculo-skeletal geometry,” in Self-Organization, Computational Maps and Motor Control, vol. 118. Amsterdam, The Netherlands: North-Holland/Elsevier, 1997, pp. 483–506.

[72] D. J. Ostry, P. L. Gribble, and V. L. Gracco, “Coarticulation of jaw move-

ments in speech production: is context sensitivity in speech kinematics

centrally planned?” J. Neurosci., vol. 16, pp. 1570–1579, 1996.

[73] A. G. Feldman, S. V. Adamovich, D. J. Ostry, and J. R. Flanagan, “The

origin of electromyograms—Explanations based on the equilibrium point

hypothesis,” in Multiple Muscle Systems: Biomechanics and Movement

Organization, J. Winters and S. Woo, Eds. Berlin, Germany: Springer-

Verlag, 1990, pp. 195–213.

[74] M. R. Schroeder, “Determination of the geometry of the human vocal

tract by acoustic measurements,” J. Acoust. Soc. Amer., vol. 41, no. 4B,

pp. 1002–1010, 1967.

[75] S. Hiroya and M. Honda, “Estimation of articulatory movements from

speech acoustics using an hmm-based speech production model,” IEEE

Trans. Speech Audio Process., vol. 12, no. 2, pp. 175–185, Mar.

2004.

[76] R. Marret, “Apprentissage des relations entre commandes musculaires et géométrie de la langue,” Master’s thesis, Inst. Nat. Polytech. de Grenoble, Grenoble, France, 2002.

[77] P. Perrier, L. Ma, and Y. Payan, “Modeling the production of VCV

sequences via the inversion of a biomechanical model of the tongue,”

in Proc. 9th Eur. Conf. Speech Commun. Technol., 2005, pp. 1041–

1044.

[78] F. H. Guenther, M. Hampson, and D. Johnson, “A theoretical investiga-

tion of reference frames for the planning of speech movements,” Psychol.

Rev., vol. 105, pp. 611–633, 1998.

[79] D. Kewley-Port and S. S. Goodman, “Thresholds for second formant

transitions in front vowels,” J. Acoust. Soc. Amer., vol. 118, no. 5,

pp. 3252–3260, 2005.

[80] J. Hillenbrand, M. J. Clark, and T. M. Nearey, “Effects of consonant environment on vowel formant patterns,” J. Acoust. Soc. Amer., vol. 109, no. 2, pp. 748–763, 2001.

[81] T. L. Gottfried and W. Strange, “Identiﬁcation of coarticulated vowels,”

J. Acoust. Soc. Amer., vol. 68, no. 6, pp. 1626–1635, 1980.

[82] W. Strange, J. J. Jenkins, and T. L. Johnson, “Dynamic speciﬁcation of

coarticulated vowels,” J. Acoust. Soc. Amer., vol. 74, no. 3, pp. 695–705,

1983.

[83] J. E. Andruski and T. M. Nearey, “On the sufﬁciency of compound target

speciﬁcation of isolated vowels and vowels in /bvb/ syllables,” J. Acoust.

Soc. Amer., vol. 91, pp. 390–410, 1992.

[84] H. Dudley and T. H. Tarnoczy, “The calculation of vowel resonances,

and an electrical vocal tract,” J. Acoust. Soc. Amer., vol. 22, no. 6,

pp. 740–753, 1950.

[85] G. Fant, Acoustic Theory of Speech Production. Hague, The

Netherlands: Mouton, 1960.

[86] L. R. Rabiner and R. W. Schafer, Digital Processing of Speech Signals.

Englewood Cliffs, NJ: Prentice-Hall, 1978.

[87] Y. H. Hu and J. N. Hwang, Eds., Handbook of Neural Network Signal

Processing. Boca Raton, FL: CRC, 2002.

[88] L. Jain, Recent Advances in Artiﬁcial Neural Networks. Design and

Applications (Int. Ser. Comput. Intell.), A. M. Fanelli, Ed. Boca Raton,

FL: CRC, 2000.

[89] S. Chen, C. Cowan, and P. Grant, “Orthogonal least squares learning

algorithm for radial basis function networks,” IEEE Trans. Neural Netw.,

vol. 2, no. 2, pp. 302–309, Mar. 1991.

[90] T. Poggio and F. Girosi, “Networks for approximation and learning,”

Proc. IEEE, vol. 78, no. 9, pp. 1481–1497, Sep. 1990.

[91] H. Demuth, M. Beale, and M. Hagan, Neural Network Toolbox for Use

With MATLAB (Version 5). Natick, MA: The MathWorks, 2006.

[92] M. Kawato, K. Furukawa, and R. Suzuki, “A hierarchical neural-network

model for control and learning of voluntary movement,” Biol. Cybern.,

vol. 57, no. 3, pp. 169–185, 1987.

[93] F. H. Guenther, “A neural network model of speech acquisition and motor

equivalent speech production,” Biol. Cybern., no. 72, pp. 43–53, 1994.

[94] T. D. Sanger, “Neural network learning control of robot manipulators

using gradually increasing task difﬁculty,” IEEE Trans. Robot. Autom.,

vol. 10, no. 3, pp. 323–333, Jun. 1994.

[95] F. H. Guenther, “Speech sound acquisition, coarticulation, and rate ef-

fects in a neural network model of speech production,” Psychol. Rev.,

vol. 102, pp. 594–621, 1995.

[96] F. H. Guenther and J. W. Bohland, “Learning sound categories: A neural

model and supporting experiments. Acoustical science and technology,”

Acoust. Sci. Technol., vol. 23, no. 4, pp. 213–220, 2002.


[97] M. G. Rahim, C. C. Goodyear, W. B. Kleijn, J. Schroeter, and

M. M. Sondhi, “On the use of neural networks in articulatory speech

synthesis,” J. Acoust. Soc. Amer., vol. 93, no. 2, pp. 1109–1121, 1993.

[98] L. Eulero. (1744). Methodus inveniendi lineas curvas maximi minimive proprietate gaudentes, sive solutio problematis isoperimetrici latissimo sensu accepti. Lausannæ/Genevæ, Switzerland: Apud Marcum-Michaelem Bousquet/Socios [Online]. Available: http://math.dartmouth.edu/~euler/pages/E065.html

[99] M. J. Forray, Variational Calculus in Science and Engineering. New York: McGraw-Hill, 1967.

[100] R. Weinstock, Calculus of Variations With Applications to Physics and

Engineering. New York: McGraw-Hill, 1952.

[101] G. A. Bliss, Lectures on the Calculus of Variations. Chicago, IL: Univ.

Chicago, 1947.

[102] V. I. Smirnov, A Course of Higher Mathematics, vol. I–V, Oxford, U.K.:

Pergamon, 1964.

[103] R. Courant and D. Hilbert, Methods of Mathematical Physics, vol. I. New York: Interscience, 1966.

[104] G. A. Korn and T. M. Korn, Mathematical Handbook for Scientists

and Engineers. Deﬁnitions, Theorems, and Formulas for Reference and

Review, 2nd ed. enlarged and revised ed. New York: McGraw-Hill,

1968.

[105] I. N. Bronshtein and K. A. Semendyayev, Handbook of Mathematics,

3rd ed. Berlin, Germany: Springer-Verlag, 1998.

[106] Calliope, La Parole et Son Traitement Automatique. Paris, France:

Dunod, 1989.

[107] G. E. Peterson and H. L. Barney, “Control methods used in a study of

the vowels,” J. Acoust. Soc. Amer., vol. 24, no. 2, pp. 175–184, 1952.

[108] K. N. Stevens and A. S. House, “Perturbation of vowel articulations by

consonantal context,” J. Speech Hear. Res., vol. 6, pp. 111–128, 1963.

[109] M. J. Macchi, “Identiﬁcation of vowels spoken in isolation versus vowels

spoken in consonantal context,” J. Acoust. Soc. Amer., vol. 68, no. 6,

pp. 1636–1642, 1980.

[110] T. M. Nearey, “Static, dynamic, and relational properties in vowel per-

ception,” J. Acoust. Soc. Amer., vol. 85, no. 5, pp. 2088–2113, 1989.

[111] J. Talley, “Vowel perception in varied symmetric CVC contexts,” J.

Acoust. Soc. Amer., vol. 108, no. 5, pp. 2601–2601, 2000.

[112] J. Hillenbrand, L. A. Getty, K. Wheeler, and M. J. Clark, “Acoustic char-

acteristics of american english vowels,” J. Acoust. Soc. Amer., vol. 95,

no. 5, pp. 2875–2875, 1995.

[113] B. Lindblom, “Explaining phonetic variation: a sketch of the H&H the-

ory,” in Speech Production and Speech Modelling, W.J. Hardcastle and

A. Marchal, Eds. Dordrecht, The Netherlands: Kluwer, 1990, pp. 403–

439.

Iaroslav V. Blagouchine was born in St. Petersburg,

Russia, on December 22, 1979. He received the B.S.

degree in physics from the St. Petersburg State Uni-

versity in 2000, the M.S. degree in electronic engi-

neering from the Grenoble Institute of Technology,

and the Ph.D. degree in signal processing and applied mathematics from the École Centrale, France,

in 2001 and 2009, respectively.

From 2001 to 2002, he was with the Department Señales, Sistemas y Radiocomunicaciones, Universidad Politécnica de Madrid, Madrid, Spain. During

2003, he was a Research Engineer with the Grenoble Institute of Technology,

Grenoble, France, where he was also a Teacher Assistant from 2004 to 2007.

From 2007 to 2009, he was a Postdoctoral Researcher and a Teacher Assis-

tant with the Telecommunication Department, University of Toulon, Toulon,

France. Since September 2009, he has been a Research Engineer with the Department of Mobile Communication, Eurécom, Sophia Antipolis, France.

His current research interests include biologically inspired robotics (especially

equilibrium-point-hypothesis-based), speech robotics, constraint optimization

techniques, variational calculus, and statistical signal processing.

Eric Moreau (M’96–SM’08) was born in Lille, France. He graduated from the Ecole Nationale Supérieure des Arts et Métiers, Paris, France, in 1989. He received the Agrégation de Physique degree from the Ecole Normale Supérieure de Cachan, Cachan Cedex, France, in 1990 and the DEA and Ph.D. degrees in signal processing from the Université Paris-Sud in 1991 and 1995, respectively.

From 1995 to 2001, he was an Assistant Professor with the Department of Telecommunications, Institute of Engineering Sciences of Toulon-Var-School of Engineering, University of Toulon, Toulon, France, where he is currently a Professor. His current research interests include constraint optimization, neural networks applications, and statistical signal processing.
