
Copyright © IFAC Artificial Intelligence in Real Time Control, Valencia, Spain, 1994

NEURAL NETWORK BASED ADAPTIVE CONTROL

E.F. CAMACHO and M.R. ARAHAL

Dpto. Ingeniería de Sistemas y Automática, Univ. of Seville, Spain.

Abstract. This paper presents different ways of using artificial neural networks in adaptive control. A classification of architectures for control using neural networks is presented, showing the existing parallelism with Adaptive Control techniques.

Key Words. Adaptive control; Automatic control; Neural nets; Nonlinear control systems.

1. INTRODUCTION

The Control Theory for linear processes has for some time been considered a well established scientific discipline with powerful techniques for analyzing and designing controllers. The main problems in process control when applying the Linear Control Theory are caused by the fact that:

a) A linear mathematical model of the plant is needed and finding one is not a trivial problem in many cases.

b) Mathematical models of real processes cannot take all aspects of reality into account. Simplifying assumptions have to be made and models are only approximations of reality.

c) Most processes are nonlinear, having nonlinear dynamics and nonlinearities caused by actuators that have a limited range of action and a limited slew rate, as in the case of control valves, which are limited by fully closed and fully open positions and a maximum slew rate. Constructive and/or safety reasons, as well as sensor ranges, cause limits in process variables, as in the case of tank levels, pipe flows and pressures in deposits.

d) Because of changing environmental conditions, such as ambient temperature, humidity, etc., most processes are not time invariant.

These problems have been extensively treated in literature and some new disciplines have appeared to address them. Some of the disciplines have evolved around the Linear Systems Control Community, as is the case of Robust or Adaptive Control, while other disciplines have developed around the Artificial Intelligence Control Community, as is the case of expert, fuzzy, or Neural Control.

In Robust Control the process is usually modeled by a linear model and some of the problems mentioned above are treated by considering uncertainties about the model. The main assumption in most cases is that the underlying process is linear. In Adaptive Control the main idea is that, by an appropriate adaptation mechanism, the controller and/or model of the process, linear in most cases, will cope with unknown, changing and possibly nonlinear dynamics. Advanced control strategies, normally based on an exact cancellation of the nonlinear dynamics (Craig, 1988), have to be used for nonlinear processes such as robots. The uncertainties on the dynamic parameters of the processes, such as inertias and payload conditions in robots, have motivated the design of adaptive controllers (Slotine and Li, 1990; Kelly, Carelli and Ortega, 1989; Ortega and Spong, 1988). This type of controller is designed assuming an exact knowledge of the model structure and does not include aspects, such as nonlinear frictions, elasticity in the joints and links, backlash and torque perturbations, which can be found in robots.

The AI type of approaches try somehow to reproduce the behavior of human controllers that are able to use natural intelligence to control processes exhibiting all the problems described. A further difference in both approaches has been that, while the Linear Control Community seemed to be more interested in demonstrating results about the stability of proposed control schemes, the AI Control Community seemed more interested in showing that the technique worked on particular processes. This is however changing lately and there are a number of works relating both types of disciplines. Stability analysis is one of the converging fields and some results have appeared in literature establishing conditions to ensure the stability of AI controllers (Aracil, Ollero and Garcia-Cerezo, 1989). Adaptive Control is another field where there is a strong confluence with AI controllers. The idea of adaptation is strongly linked to the idea of learning, which is fundamental to NN.

This paper deals with adaptive NN controllers for nonlinear processes. Neural networks, as was mentioned before, have the ability of learning a nonlinear model without a priori knowledge of its structure (Lightbody and Irwin, 1992) and are adequate for working in real time because of their high parallelism. NN seem to be an alternative way of solving some of the problems mentioned above, that is, nonlinear processes with no necessarily known and/or changing dynamics.

The main objective of this paper is to show how Neural Networks can be used for adaptive control and to explore the parallelism found in Neural Network Control and Adaptive Control.

2. NEURAL NETWORK BASED CONTROL

The history of NN can be traced back to the 40s with the first models of biological neural cells. A first step in connectionism was taken by McCulloch and Pitts (1943), who modeled an artificial neuron and studied the properties of the resulting network. Hebb observed that a strengthening of the connections between neurons occurs when one cell stimulates another while the latter is firing. This observation was used by Grossberg in the 60s to model neural learning. Hebbian type rules for learning were used by Rosenblatt's Perceptron (1958), which was later studied in depth by Minsky and Papert (1969). A gradient descent method called the delta rule was used by Widrow and Hoff (1960) to train a NN whose nodes are called ADALINE (Adaptive Linear Element). The backpropagation training algorithm, developed in the 70s and 80s, is another milestone in the history of artificial NN. It allowed networks with hidden layers to be trained, overcoming the problems that the perceptron had in representing certain types of functions, such as the exclusive-OR function (Minsky and Papert, 1969). The introduction of feedback in NN produced dynamical systems with various equilibrium points that were used as associative memories: Hopfield (1982) devised a dynamic structure that has been widely used for the solving of optimization problems.

Neural Network based controllers have received much attention in recent years. This type of controller exploits the possibilities of neural networks for learning nonlinear functions and/or the possibilities of neural networks to solve certain types of problems where massive parallel computation is required. The learning capability of NN is used to make the controller map a certain function, highly nonlinear most of the time, representing direct dynamics, inverse dynamics or any other characteristics of the process. This is usually done during a, normally long, training period when commissioning the controller, in a supervised or unsupervised manner (Psaltis, Sideris and Yamura, 1988). If the learning capability of the NN is not switched off after the training period, once the controller is commissioned, the NN based controller works as an adaptive controller. The ability of NN for parallel computation has been exploited to implement controllers which require a substantial amount of computation, such as long range predictive controllers where a quadratic optimization problem has to be solved (Quero and Camacho, 1990).
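As an illustration of the first of these uses (a NN learning the process dynamics during a training period), the minimal sketch below fits a one-hidden-layer network to input/output data by plain backpropagation. The plant model, network size and learning rate are assumptions made for the example, not values taken from the paper.

```python
import numpy as np

# Hypothetical nonlinear plant used only to generate training data (an assumption):
# y(k+1) = 0.8*y(k) + 0.5*tanh(u(k))
def plant(y, u):
    return 0.8 * y + 0.5 * np.tanh(u)

rng = np.random.default_rng(0)
u = rng.uniform(-2, 2, 500)               # excitation signal
y = np.zeros(501)
for k in range(500):
    y[k + 1] = plant(y[k], u[k])

X = np.column_stack([y[:-1], u])          # network inputs: y(k), u(k)
T = y[1:]                                 # targets: y(k+1)

# One-hidden-layer network trained with plain gradient descent.
W1 = rng.normal(0, 0.5, (2, 10)); b1 = np.zeros(10)
W2 = rng.normal(0, 0.5, 10);      b2 = 0.0
lr = 0.01
for epoch in range(200):
    h = np.tanh(X @ W1 + b1)              # hidden layer
    yhat = h @ W2 + b2                    # predicted y(k+1)
    e = yhat - T
    # backpropagate the prediction error to all weights
    gW2 = h.T @ e / len(T); gb2 = e.mean()
    dh = np.outer(e, W2) * (1 - h ** 2)
    gW1 = X.T @ dh / len(T); gb1 = dh.mean(axis=0)
    W2 -= lr * gW2; b2 -= lr * gb2
    W1 -= lr * gW1; b1 -= lr * gb1

print("final mean squared prediction error:", np.mean(e ** 2))
```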

NN abilities were soon applied to challenging control problems. Barto, Sutton and Anderson (1983) solved the well-known problem of balancing a pole on a cart. They discussed an important aspect of training NN: the credit assignment problem. The backpropagation rule needs to be told the error made at any time, but in some cases it is only known that an error has been made. Nguyen and Widrow (1990) exploited the ability of NN to learn nonlinear functions in the problem of the dock and the trailer truck. Backing a trailer truck to a loading dock is a hard task even for humans. The control signal was generated by a NN previously trained using backpropagation with the help of an emulator. The emulator consists of another NN that identifies the plant; it has the same inputs as the plant plus the state of the plant. The output of the net is an estimation of the plant's next state (using a discrete-time representation). Backpropagating the error made in the prediction, the net learns the behavior of the plant. Once the emulator knows the plant's dynamics with a certain accuracy, the training of the controller begins. It is commissioned by using backpropagation of the error at the final state. The aim is to find the weights that minimize a measure of the state's error at each time step. But, as the error is only available in the final state, it has to be backpropagated through the plant emulator in order to estimate the controller's error at each step. From trial to trial, each one having different initial conditions, the controller is driven by the emulator to give the correct control law. The fact that the real plant cannot be used to backpropagate the errors made by the controller explains the need for an emulator.

Another way of viewing NN in control is as look-up tables. The NN stores control signals, given the state of the plant and the next-step desired state. In (Kraft and Campagna, 1990) a NN was used to control three types of systems: linear, linear + noise and nonlinear. The performance of the NN controller was compared with a couple of adaptive controllers, STR and MRAC, showing good characteristics.
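The following sketch illustrates the Nguyen and Widrow idea of tuning a controller with the error at the final state propagated through the plant emulator. To stay compact it uses a fixed smooth function as a stand-in for the trained emulator, a two-gain controller instead of a NN, and computes the gradient by forward sensitivity propagation, which yields the same gradient as backpropagating through the emulator; all numerical values are illustrative assumptions.

```python
import numpy as np

def emulator(y, u):                 # stand-in for the trained NN emulator
    return 0.9 * y + 0.4 * np.tanh(u)

def d_emulator(y, u):               # its partial derivatives dy'/dy, dy'/du
    return 0.9, 0.4 * (1 - np.tanh(u) ** 2)

target, horizon, lr = 1.0, 20, 0.05
w = np.array([0.0, 0.0])            # controller gains: u = w[0]*y + w[1]*target

for trial in range(300):            # "from trial to trial" the controller improves
    y = 0.0
    dy_dw = np.zeros(2)             # sensitivity of the state w.r.t. the gains
    for k in range(horizon):
        u = w[0] * y + w[1] * target
        du_dw = np.array([y, target]) + w[0] * dy_dw
        fy, fu = d_emulator(y, u)
        dy_dw = fy * dy_dw + fu * du_dw    # chain rule through the emulator
        y = emulator(y, u)
    grad = 2 * (y - target) * dy_dw        # error only known at the final state
    w -= lr * grad

print("trained gains:", w, "final state:", y)
```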

NN have been used to design controllers for highly nonlinear processes such as robot manipulators (Kawato et al., 1987). An adaptive feedback controller has been proposed by Guez and Bar-Kana (1990).

In 1990 the first issue of the new journal IEEE Transactions on Neural Networks appeared, and its first paper (Narendra and Parthasarathy, 1990) was dedicated to the application of NN to control and identification of nonlinear systems. They suggested structures for identification and control of nonlinear systems with unknown dynamics using NN. Based on simple operations: 1) time delay, 2) summation and multiplication by a constant and 3) the nonlinear activation function, they treated recurrent and multilayer networks in a unified fashion. They used the term generalized NN to name the nets resulting from the combination of the above listed building blocks. A method that allows the parameters of the NN to be dynamically changed was presented as an extension of static backpropagation. By simulation studies they revealed the effectiveness of such structures to identify and control nonlinear plants with unknown structure and parameters.
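A minimal sketch of the tapped-delay (series-parallel) identification structure implied by these building blocks is given below; the delay orders, the toy plant and the use of a linear least-squares fit in place of a multilayer network are assumptions made only for illustration.

```python
import numpy as np

# Series-parallel identification sketch: the identifier input is built from
# tapped delay lines of past plant inputs and *measured* outputs.
def tapped_delay_regressor(u, y, nu=2, ny=2):
    """Rows: [y(k-1)..y(k-ny), u(k-1)..u(k-nu)]; targets: y(k)."""
    n = max(nu, ny)
    rows, targets = [], []
    for k in range(n, len(y)):
        rows.append(np.r_[y[k - ny:k][::-1], u[k - nu:k][::-1]])
        targets.append(y[k])
    return np.array(rows), np.array(targets)

rng = np.random.default_rng(1)
u = rng.uniform(-1, 1, 300)
y = np.zeros(300)
for k in range(1, 300):                       # toy nonlinear plant (assumed)
    y[k] = 0.6 * y[k - 1] + u[k - 1] / (1 + y[k - 1] ** 2)

X, T = tapped_delay_regressor(u, y)
# Any multilayer network (such as the one trained earlier) can now be fitted to
# map X -> T; here a linear least-squares fit stands in for the network.
theta, *_ = np.linalg.lstsq(X, T, rcond=None)
print("one-step-ahead model parameters:", theta)
```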

NN have also been used to implement long range receding horizon predictive controllers. Long-range predictive controllers (LRPC), or Model Predictive Controllers (MPC) as they are called in the domain of the process industry, have received a lot of attention in recent years. All these controllers are based on the fact that the process output can be predicted over a horizon from the past process input and output and the potential future control sequence, if a suitable parameterized model of the system is known. The name of these types of controllers comes from the way in which the control law is computed: at the present time t the future sequence of manipulated variables is selected in such a way that the predicted response of the process has certain desirable characteristics. Only the first computed manipulated variable is implemented and the process is repeated at time t + 1. There have been many LRPC or MPC algorithms proposed in literature (Garcia, Prett and Morari, 1989; Tan and De Keyser, 1993; Cutler and Ramaker, 1980); Model Algorithmic Control (MAC) (Rouhani and Mehra, 1982) and Generalized Predictive Control (GPC) (Clarke, Mohtadi and Tuffs, 1987a and 1987b) can be mentioned among the most popular. The basic idea of GPC is to calculate a sequence of future control signals which minimizes a multistage cost function defined over a receding control horizon. The index to be optimized is the expectation of a quadratic function measuring the control effort and the distance between the predicted system output and some predicted reference sequence over the receding horizon. The GPC involves the solution of an unconstrained quadratic problem (QP) with N variables, which can easily be obtained by using any standard method for unconstrained QP optimization. These methods cannot, however, solve the constrained problem and, although the amount of computation needed is not very high, it can be a drawback for real time applications. When process variables are bounded, a QP problem with linear constraints has to be solved (Camacho, 1993), which requires a substantial amount of computation for real time problems. Hopfield NN have been used to implement GPC for processes with unbounded (Quero and Camacho, 1990) and bounded signals (Quero, Camacho and Franquelo, 1993).
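The receding horizon idea can be summarized with the small sketch below, which assumes a first-order linear model and an unconstrained quadratic cost; it is not the GPC formulation of any of the cited papers, only the bare mechanism of predicting over a horizon, solving the quadratic problem and applying the first move.

```python
import numpy as np

# Assumed first-order linear model y(k+1) = a*y(k) + b*u(k); horizon, control
# weight and set-point are illustrative choices.
a, b = 0.9, 0.2
N, lam, ref = 10, 0.05, 1.0            # prediction horizon, control weight, set-point

def gpc_step(y0):
    # Predicted outputs over the horizon are affine in the control sequence:
    # y = F*y0 + G*u, with F[i] = a**(i+1) and G lower triangular.
    F = np.array([a ** (i + 1) for i in range(N)])
    G = np.zeros((N, N))
    for i in range(N):
        for j in range(i + 1):
            G[i, j] = a ** (i - j) * b
    # Unconstrained quadratic cost: ||ref - F*y0 - G*u||^2 + lam*||u||^2
    H = G.T @ G + lam * np.eye(N)
    f = G.T @ (ref - F * y0)
    u_seq = np.linalg.solve(H, f)
    return u_seq[0]                    # receding horizon: apply only the first move

y = 0.0
for k in range(30):
    u = gpc_step(y)
    y = a * y + b * u                  # the plant (here identical to the model)
print("output after 30 steps:", y)
```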

Sanner and Slotine (1992) proposed a new architecture for adaptive control using Gaussian NN. A Gaussian network uses Gaussian radial basis functions (RBF) in its nodes to approximate a nonlinear function. Provided that such a function has some degree of smoothness, it was shown that the system formed by the plant and a controller using the NN is stable and that the tracking error will converge to a neighborhood of zero. The aim of the authors was to develop stable adaptive architectures capable of exploiting analog designs for the control of continuous-time nonlinear dynamic systems. Let us consider a plant whose dynamics have a nonlinear expression relating the n-th derivative of the state with the state and its n - 1 first derivatives. The role of the NN, consisting of a single layer of nodes possessing radial Gaussian characteristics, is to provide an estimation of such a function at any time. That is, the net has to uniformly approximate a continuous function with a pre-specified accuracy on a compact subset of R^n using a finite number of nodes. It is necessary to prove that such a function can be represented as a linear combination of a set of continuous, known basis functions. Sanner and Slotine showed that this approximation can be done using Gaussian radial basis functions. The resulting system adjusts the network's weights while controlling the plant. No prior learning is needed. A sliding mode control is set up to prevent the tracking from degrading when the state of the plant is outside the region in which the NN has good performance.
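A minimal sketch of such a Gaussian RBF approximator is shown below; the grid of centres, the common width and the target function are assumptions chosen only to illustrate the structure.

```python
import numpy as np

# Gaussian radial basis function network: fixed centres on a grid, Gaussian
# activations, and output weights fitted to a target function.
centres = np.linspace(-2, 2, 15)          # node centres on a compact set
width = 0.4                               # common Gaussian width

def rbf_features(x):
    x = np.atleast_1d(x)
    return np.exp(-((x[:, None] - centres[None, :]) ** 2) / (2 * width ** 2))

# Target nonlinearity to approximate (stand-in for the unknown plant function).
f = lambda x: np.sin(2 * x) + 0.3 * x ** 2

x_train = np.linspace(-2, 2, 200)
Phi = rbf_features(x_train)
w, *_ = np.linalg.lstsq(Phi, f(x_train), rcond=None)   # output weights

x_test = np.array([-1.5, 0.0, 1.2])
print("network:", rbf_features(x_test) @ w)
print("target :", f(x_test))
```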


When the process is nonlinear the problem gets more complex, as the implementation of a GPC requires the optimization of a nonlinear function. NN have also been used in this context to implement GPC. Gómez-Ortega and Camacho (1994) use NN to implement a GPC for path tracking of mobile robots. The NN is trained in a supervised way, in an off-line manner, using the output of a numerical optimization algorithm that computed the best control action. Tan and De Keyser (1993) use a NN predictor to implement GPC for nonlinear processes. An interesting fast learning algorithm is also proposed by these authors.

3. LEARNING AND ADAPTATION

Learning and adaptation are fundamental concepts associated to NN and adaptive control that, although related, are not quite the same. It can be considered that, with the adaptive mechanism, an adaptive controller learns the process parameters or a set of adequate controller parameters. On the other hand, the learning phase of a NN can be considered as the adaptation of the NN weights to adequate values. There are, however, some differences when considering the way in which the adaptation mechanism works in adaptive control and how the learning mechanism operates when a NN is learning. These differences are illustrated by Fig. 1.

FIG. 1. Adaptation (a) and learning (b) processes.

In adaptive control, the adaptation is performed in a single trajectory. Normally at the beginning of the trajectory, while the controller is not properly tuned, the process trajectory differs substantially from the reference trajectory. Once the parameters are properly tuned, the process follows the desired trajectory with greater accuracy.

Learning is performed by modifying the NN parameters during repeated performance trials of the desired trajectory. It is like practicing the same stroke of, let us say, tennis a number of times until success has been achieved. This idea is illustrated by Fig. 1b, where the different trajectories obtained at different learning stages are shown. It can be seen that process trajectories reproduce the reference trajectory with more accuracy as learning progresses. A practice strategy has been suggested in which, instead of using the reference trajectory in each learning period, a sequence of trajectories is used. The first element of the sequence is a previously learned trajectory and the last element is the desired trajectory. This approach could solve some problems found in NN learning when the desired trajectory is far from any of the previously learned ones.

The training signals used for adaptation and learning are of great importance and some parallelism can be established between NN learning and adaptive control regarding this issue.

Persistent Excitation. The concept of persistent excitation is crucial to adaptive control; it refers to the need for using a signal for identification purposes which is dynamically representative of the entire class of inputs that the process may be subjected to. Let us consider a process with a transfer function characterized by a set of true parameters θ*. Consider the set of adjustable parameters θ and an appropriate identification algorithm. A signal is persistently exciting if θ → θ* when the error between process and model tends to zero asymptotically. When using NN for the identification or control of nonlinear processes, a similar type of concept would be of great interest for answering questions such as: Is the chosen training set adequate? What sort of training pattern should be used to train the NN? What we are looking for are signals which are dynamically representative of the entire class of inputs, and techniques to determine this. Unfortunately, there are no known methods to generate a persistently exciting signal for NN training and only good judgment can be used for this purpose at present.
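Since no test guaranteeing persistent excitation of a NN training set is known, the sketch below is only a crude illustrative heuristic, not a method from the literature: it scores candidate training signals by how well they cover the expected amplitude range and by their spectral flatness.

```python
import numpy as np

def coverage_score(u, n_bins=10, lo=-1.0, hi=1.0):
    """Fraction of amplitude bins visited, and spectral flatness of the signal."""
    hist, _ = np.histogram(u, bins=n_bins, range=(lo, hi))
    amplitude_cov = np.count_nonzero(hist) / n_bins
    spectrum = np.abs(np.fft.rfft(u - u.mean())) + 1e-12
    flatness = np.exp(np.mean(np.log(spectrum))) / spectrum.mean()
    return amplitude_cov, flatness

rng = np.random.default_rng(2)
step = np.where(np.arange(500) < 250, -0.5, 0.5)        # a single step: poor coverage
rich = rng.uniform(-1.0, 1.0, 500)                       # random multilevel signal
print("step :", coverage_score(step))
print("rich :", coverage_score(rich))
```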

Adaptation Speed. One of the fundamental problems of adaptive control is the adaptation speed of the controller. In self tuning control theory, two different time scales are assumed for process dynamics and for process parameter changes. In practice, although process parameters tend to change more slowly than process variables, the time scales are not so far apart and a quick tuning is required in most cases. In adaptive control there are two factors that determine adaptation speed. The first one is the election of appropriate adaptation gains or forgetting factors and the second one is the number of parameters being identified or adjusted. Small adaptation gains will result in a slow adaptation speed, while high adaptation gains tend to produce oscillations and convergence problems. The adaptation speed decreases considerably with the number of parameters chosen.

When NN are used for adaptation, the same factors dominate the learning or adaptation speed. But while in adaptive controllers the number of adapting parameters tends to be kept small, when using NN there is a substantial number of parameters if all the weights are to be adapted. Some mechanisms to obtain faster adaptation speed for NN based adaptive controllers have been proposed in literature (Tan and De Keyser, 1993; Carelli, Camacho and Patiño, 1993, 1994).

4. NN BASED ADAPTIVE CONTROLLERS CLASSIFICATION

Many controllers have been proposed including NN as a part of them. Most of them are neural versions of classical adaptive controllers (Åström and Wittenmark, 1989). The way the NN is incorporated in the system differs from one to the others. The most frequently used architectures will be classified according to the role played by the network.

4.1. NN as a controller.

Direct Inverse Control. An adaptive control scheme using direct inverse control is shown in Fig. 2. If the NN is trained to produce the signal u_ff, it is possible to substitute the feedforward controller with the NN. In this way, the NN produces the inverse dynamics of the plant, while the feedback controller accounts for non-perfect learning and perturbations. Classic adaptive controllers make use of two elements: the adaptive controller and the adaptation law. In the neural context the controller is performed by a NN and the adaptation law is the rule used to adjust the parameters of the net (usually connection weights). Kawato's proposal for robot control (Kawato, Uno, Isobe and Suzuki, 1987) matches this structure. The NN is trained to make the feedback control signal zero. In the first stages of the training of the NN the system is stable thanks to the feedback controller.

FIG. 2. Controller for a nonlinear plant.
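The structure of Fig. 2 can be sketched as follows, with a known analytic inverse standing in for the trained NN and a proportional feedback term covering the residual error; the plant, the inverse and the gains are assumptions for illustration only.

```python
import numpy as np

def plant(y, u):
    return 0.8 * y + 0.3 * np.tanh(u)

def approx_inverse(y, y_ref_next):
    # "NN" giving the u that would drive y to y_ref_next under the nominal model
    return np.arctanh(np.clip((y_ref_next - 0.8 * y) / 0.3, -0.99, 0.99))

Kp = 2.0
y, ref = 0.0, 0.8
for k in range(40):
    u_ff = approx_inverse(y, ref)      # inverse-dynamics feedforward
    u_fb = Kp * (ref - y)              # feedback accounts for model error
    y = plant(y, u_ff + u_fb)
print("output after 40 steps:", y)
```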

In (Sanner and Slotine, 1992) a Radial Basis Function NN is used to provide the inverse dynamics of a nonlinear plant. The controller incorporates sliding mode control and adaptive control blended through a modulation function. There are many other architectures that make use of the learning capability of NN to identify the inverse dynamics of a plant. Notice that an inverse of the process dynamics must exist for this scheme to work.

Classic Model Reference Adaptive Control. MRAC can be extended to the neural case. In (Narendra and Parthasarathy, 1990) a NN is used to identify the plant while another NN produces the input to the plant. The objective is to track the output of a reference model. In Fig. 3 network N_i has to be previously trained to identify the input-output behavior of the plant. Later the controller's parameters can be adjusted by backpropagating the tracking errors through the plant identifier. A direct adaptation of the controller is not possible since the plant, whose dynamics are unknown, lies between the controller and the tracking error.
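A minimal sketch of this indirect scheme is given below: a simple linear identifier plays the role of the network N_i, and the single controller gain is adjusted with the tracking error propagated through that identified model; the plant, the reference model and the adaptation gains are illustrative assumptions.

```python
import numpy as np

def plant(y, u):
    return 0.7 * y + 0.25 * u + 0.05 * np.sin(y)

theta = np.zeros(2)          # identifier: y_hat(k+1) = theta[0]*y + theta[1]*u
w = 0.1                      # controller: u = w * (r - y)
a_m, r = 0.5, 1.0            # reference model y_m(k+1) = a_m*y_m + (1-a_m)*r
y, y_m = 0.0, 0.0
eta_id, eta_c = 0.1, 0.05

for k in range(500):
    u = w * (r - y)
    y_next = plant(y, u)
    # identification step (series-parallel, LMS update)
    y_hat = theta @ np.array([y, u])
    theta += eta_id * (y_next - y_hat) * np.array([y, u])
    # controller step: tracking error propagated through the identifier
    y_m = a_m * y_m + (1 - a_m) * r
    e = y_next - y_m
    dy_du = theta[1]                       # sensitivity from the identified model
    w -= eta_c * e * dy_du * (r - y)       # chain rule: de/dw = dy/du * du/dw
    y = y_next

print("controller gain:", w, "tracking error:", e)
```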

FIG. 3. Indirect adaptive control.

In Fig. 3, the TDL blocks represent tapped delay lines whose function is to provide delayed values of the plant's inputs and outputs.

In some structures, the NN is given the task of generating a small part of the control signal. In these cases the network accounts for structured and non-structured uncertainties of a model. The main part of the command signal is produced by a conventional controller based on the model. Examples can be found in (Iiguni, Sakai and Tokumaru, 1991) and (Zomaya, 1993).

Other models. To overcome the problem of the lack of previous information about the plant, a learning method has been used that enables one to control processes without the identification stage. Reinforcement learning (RL) uses a critic instead of a teacher. The critic gets a measurement of the performance of the system from the environment. The objective of the adaptation system is to improve the reinforcement signal produced by the critic. Examples of this type of architecture are found in (Barto, Sutton and Anderson, 1983) and in (Zomaya, 1994).

The first example deals with balancing a pole on a cart that can move between two stops. The goal of the controller is to move the cart in such a way as to keep the pole vertical. An error is made when the pole falls or the cart hits the track's bounds. To solve this problem two adaptive elements were used (see Fig. 4): 1) The associative search element (ASE) has as input a codification of the state of the plant, and gives as output a control signal depending on that state (i.e. position and velocity of the cart and pole). 2) The adaptive critic element (ACE) predicts the reinforcement from the environment that corresponds to the control action generated by the associative element. Both elements need to be adjusted. The associative search element is constantly modifying its weights thanks to a signal generated by the critic element. This means that the critic drives the decision that the ASE has to take by means of an internal reinforcement. The critic learns from trial to trial to predict the future action of the pole in terms of reinforcement. It receives the reinforcement signal supplied from the exterior. The results showed a better performance of the ASE-ACE system than the classical box system.

FIG. 4. Associative and critic elements in a RL architecture.

4.2. NN as estimator.

Internal Model Control. The Internal Model Control scheme proposed in (Economou, Morari and Palsson, 1986) uses a system forward and an inverse model. The system model's output is compared to the plant's output and the difference is fed back to a controller. This structure can incorporate NN for the identification of nonlinear plants (Hunt and Sbarbaro, 1991).

Predictive Control. A predictive controller produces a command signal that minimizes the squared error between the predicted output of the plant and the reference over a certain temporal horizon at every time step k. A prediction can be computed for linear plants by using a Diophantine equation (Clarke, Mohtadi and Tuffs, 1987). To extend the idea to nonlinear plants a predictor has to be developed. In (Takahashi, 1993) a NN is used to produce a prediction of a nonlinear plant's output.

Inferential Control. In some industrial processes control is difficult due to the fact that the plant's output is not measurable at a proper frequency. This is the case of quality measurements in chemical processes. Inferential control uses secondary measurements to estimate the plant's output. The mapping from secondary to primary variables can be nonlinear and difficult to determine, so a NN is likely to produce good results. Inferential nonlinear control was studied by Morari and Fung (1982) and by Parrish and Brosilow (1988). An implementation of NN to inferential control is given in (Luo, Shao and Zhang, 1993).
4.3. NN as adjustment element.

A NN can be used to adjust a classic controller. Most controllers currently in use in industry are PID, due to their simplicity and robustness. However, the tuning of this type of controller is often a burdensome task. In (Akhyar and Omatu, 1993) a NN is used to automatically tune a PID (Fig. 5). The NN's outputs are the parameters of the PID. The NN uses a gradient descent algorithm to learn the adequate mapping using the control error.
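The idea can be sketched as follows, with the network reduced to three directly adapted gains updated by gradient descent on the control error; the plant model and rates are assumptions, and the sketch is not the algorithm of Akhyar and Omatu.

```python
import numpy as np

def plant(y, u):
    return 0.85 * y + 0.1 * u

gains = np.array([1.0, 0.1, 0.05])     # Kp, Ki, Kd (the "network outputs")
lr, ref = 1e-3, 1.0
y, integ, prev_e = 0.0, 0.0, 0.0

for k in range(2000):
    e = ref - y
    integ += e
    deriv = e - prev_e
    features = np.array([e, integ, deriv])
    u = gains @ features               # PID law with the adapted gains
    y = plant(y, u)
    # gradient of 0.5*e(k+1)^2 w.r.t. the gains, using dy/du = 0.1 of the model
    e_next = ref - y
    gains += lr * e_next * 0.1 * features
    prev_e = e

print("tuned PID gains:", gains)
```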

FIG. 5. The neural net adjusts the PID controller.

It has been shown by Funahashi (1989), Cybenko (1989) and Hornik et al. (1989) that two layer NN can approximate any well-behaved nonlinear function to any desired accuracy.

5. NN ADAPTIVE CONTROL WITH FAST ADAPTATION SPEED

This section describes a structure for the adaptive control of manipulators, proposed by Carelli, Camacho and Patiño (1993, 1994), which does not require a long adaptation period (see Fig. 6). Although the control structure has been designed for robot manipulators, it can be applied to any nonlinear process that can be reparameterized in the way described. The controller uses a set of fixed feedforward NN which are trained in an off-line manner. This structure allows the adaptation of the controller to deal with dynamic uncertainties, such as link inertias or payloads, minimizing the amount of computation that has to be performed on-line. As the number of parameters to be adapted is small, the adaptation to changes in robot parameters is faster than when using the learning capabilities of the NN to adapt.

FIG. 6. Adaptive controller using fixed feedforward NN.

5.1. NN Inverse Robot Dynamics.

The inverse dynamics of a robot can be expressed as

τ(t) = M(q) q̈ + C(q, q̇) q̇ + g(q)    (5.1)

The dynamic structure can be expressed (Khosla and Kanade, 1985) as a linear function of a suitably selected set of robot and load parameters:

τ = W(q, q̇, q̈) θ    (5.2)

Consider a set {Φ_i} of N neural networks, each representing the inverse robot dynamics for a determined payload condition characterized by a value θ_i of the parameters. If we take into account the linear parameterization property of the robot model and assume that each element of the NN set represents the inverse robot dynamics for each load condition,

Φ_i(x) = W(x) θ_i    (5.3)

where θ_i = [θ_i1, θ_i2, ..., θ_ir]^T, i = 1, 2, ..., N, and x = [q, q̇, q̈]^T.

Now consider a particular robot payload condition characterized by a value of the parameter θ and assume that it can be expressed as a linear combination of the values θ_i,

θ = a_1 θ_1 + a_2 θ_2 + ... + a_N θ_N    (5.4)

The inverse robot dynamics for a payload condition characterized by θ can then be expressed as

y(x) = W(x) θ    (5.5)

Thus, the inverse robot dynamics can be approximated for any payload condition by

y(x) = a_1 Φ_1(x) + a_2 Φ_2(x) + ... + a_N Φ_N(x)    (5.6)

Equation (5.4) can be written as θ = Θ a, with Θ = [θ_1 θ_2 ... θ_N]. If N = n and Θ is non singular, there is a unique solution given by a = Θ^{-1} θ.

If the columns of Θ do not form a basis, because N < n or because the training conditions have been chosen so that some columns of Θ are linearly dependent on the rest, only an approximate value can be found for a using any fitting algorithm such as least squares. If the columns of Θ form a basis, the whole θ space can be generated and a set of parameters a can be found for every θ. That is, the inverse robot dynamics can be approximated by the set of NN for any values of the uncertainties, such as payloads, parameterized in θ.
5.2. Adaptive Control.

Let us consider a set of N trained NN that model the inverse dynamics of the robot for different payload conditions with a desired accuracy Δ, that is

y(x) = a_1 Φ_1(x) + a_2 Φ_2(x) + ... + a_N Φ_N(x) + Δ(x)

where x = [q, q̇, q̈]^T and Δ(x) is the learning error, or difference between the real robot inverse dynamics y(x) and the inverse dynamics generated by the set of neural networks.

The proposed control law is

y(t) = ŷ_p(x_d) + f_c(t)

with x_d = [q_d, q̇_d, q̈_d]^T the desired state vector,

f_c(t) = −K_p q̃ − K_v q̃̇    (5.12)

and

q̃ = q − q_d    (5.13)

The adaptive law is given by

â̇ = −Γ W^T(x_d) q̃(t)    (5.14)

where Γ is a positive definite matrix and

W(x_d) = [Φ_1(x_d) Φ_2(x_d) ... Φ_N(x_d)]    (5.15)
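The following sketch illustrates the principle of Section 5 in a scalar setting: several fixed, off-line trained approximators are mixed by a small number of coefficients that are the only quantities adapted on line; the stand-in "networks", the payloads and the adaptation gain are assumptions for illustration.

```python
import numpy as np

def phi_factory(load):
    # stand-in for a NN trained for one payload: torque = (1+load)*qdd + load*qd
    return lambda qd, qdd: (1.0 + load) * qdd + load * qd

nets = [phi_factory(m) for m in (0.0, 0.5, 1.0)]     # N = 3 trained networks
a = np.ones(3) / 3                                   # mixing coefficients to adapt
gamma = 0.05                                         # adaptation gain

true_load = 0.7                                      # unknown payload of the real arm
def true_torque(qd, qdd):
    return (1.0 + true_load) * qdd + true_load * qd

rng = np.random.default_rng(3)
for k in range(2000):
    qd, qdd = rng.uniform(-1, 1, 2)                  # desired velocity/acceleration
    W = np.array([n(qd, qdd) for n in nets])         # outputs of the fixed nets
    err = a @ W - true_torque(qd, qdd)               # feedforward torque error
    a -= gamma * err * W                             # only N coefficients adapted

print("adapted coefficients:", a)
print("combined vs true torque at (0.5, 1.0):",
      a @ np.array([n(0.5, 1.0) for n in nets]), true_torque(0.5, 1.0))
```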

6. STABILITY ANALYSIS

Stability is a crucial problem in control. Most results in the field of stability (Narendra and Annaswamy, 1989) are restricted to linear time invariant systems. Since NN are nonlinear systems and are normally employed for identification and control of nonlinear systems, there is a need for stability results on nonlinear systems. Narendra and Parthasarathy (1990) pointed out that assumptions have to be made concerning the plant characteristics in order to obtain stable systems. Yabuta and Yamada (1990) concluded that the stability of a direct NN controller using BP depends on the initial value of the net's parameters as well as on the learning gain used in the adaptation law. Because of the nonlinear nature of the problem, there are no general stability results for NN based adaptive controllers. However, some stability conditions have been obtained for particular control architectures. The following conditions for practical stability are given in (Carelli, Camacho and Patiño, 1993) for the NN adaptive control described above.
Theorem: Suppose that y_p(q_d, q̇_d, q̈_d) can be written as

y_p(q_d, q̇_d, q̈_d) = W(q_d, q̇_d, q̈_d) α    (6.16)

where W(·) is completely known and α is an unknown but constant vector. Suppose the adaptive control law

y(t) = −K_p q̃ − K_v q̃̇ + W(q_d, q̇_d, q̈_d) α̂    (6.17)

with the parameter update law

α̂̇ = −Γ W(x_d) q̃(t)    (6.18)

Assume that y_p(·) satisfies bounds of the form ≤ a_1‖q̃‖ + a_2‖q̃̇‖² + a_3‖q̃‖‖q̃̇‖ + a_4‖q̃̇‖ (6.19)–(6.20), where a_i and b_i, i = 1, ..., 4, are functions of time only and are independent of q̃ and q̃̇. Furthermore, assume that a_i and b_i are all L∞ functions. If μ_p and μ_v (with μ_s = λ_min(K_s), s = p, v, and λ_min denoting the minimum singular value) are sufficiently large, q̃(t_0), q̃̇(t_0) and α̃(t_0) = α̂(t_0) − α(t_0) are sufficiently small, and the remaining positivity conditions on K_p and Ḣ(q) stated in (Carelli, Camacho and Patiño, 1993) hold, then q̃(t) and α̃(t) are uniformly bounded for all t ≥ t_0 and the system has strong practical stability. Furthermore, if ‖Δ(x_d)‖_∞ = 0, that is, a_1 = b_1 = 0, then q̃(t), q̃̇(t) → 0 as t → ∞.

Proof: given in (Carelli, Camacho and Patiño, 1993).

Lyapunov theory was used by Sanner and Slotine (1992) to derive a stable adaptive system for the control of nonlinear plants. Let us briefly describe the network architecture: for a SISO plant of order n whose dynamics are

x^(n)(t) = f(x(t), ẋ(t), ..., x^(n−1)(t)) + b u(t)    (6.21)

where u(t) is the control input and the functions f and b are nonlinear and unknown. If x = [x, ẋ, ..., x^(n−1)] is the state of the plant, the objective of the control is to track a desired state trajectory included in a set A. The role of the NN, consisting of a single layer of nodes possessing radial Gaussian characteristics, is to provide an estimation of the functions f and b at any time. That is, the net has to uniformly approximate a continuous function with a pre-specified accuracy on a compact subset of R^n using a finite number of nodes. The controller proposed by Sanner and Slotine has three parts: a linear combination of the tracking error, a feedforward component of the n-th derivative of the desired trajectory and an adaptive control law that attempts to cancel the unknown nonlinear function that governs the plant. Every node in the Gaussian (RBF) network provides an output that is a Gaussian function of the distance from the current state to the input weight of the node in the input space. The summation of all nodes gives an approximation of the function to be estimated, provided that the output weights are correctly set.

The conditions that the function f has to meet in order to be approximated by a Gaussian NN are relative to local smoothness in the set A. Similar considerations can be made about the function b.

A sliding mode control is set up to prevent the tracking from degrading when the state of the plant is outside the region in which the NN has good performance. The integration of sliding mode and adaptive control leads to a control law of the form

u(t) = u_pd(t) + (1 − m(t)) u_ad(t) + m(t) u_sl(t)    (6.22)

where u_pd is a PD-like control term, u_ad is an adaptive component provided by the network to account for the nonlinear functions and u_sl(t) is the contribution of the sliding mode controller. Notice that m(t) is a modulation function that mixes adaptive and sliding control modes depending upon the situation of the state of the plant in the set A. When the operating point is next to the border, the sliding component is preferred, so that the state is driven back to the set A.
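A sketch of how such a modulation function blends the three components of (6.22) is shown below; the set A, the shape of m(t) and the individual control terms are illustrative assumptions, not those of Sanner and Slotine.

```python
import numpy as np

A_radius = 1.0                      # region where the NN approximation is trusted

def modulation(x):
    """m(t): 0 well inside A, rising smoothly to 1 at and beyond the border."""
    return float(np.clip((abs(x) - 0.8 * A_radius) / (0.2 * A_radius), 0.0, 1.0))

def u_pd(x, x_ref):        return -2.0 * (x - x_ref)          # PD-like term
def u_ad(x):               return -0.5 * np.tanh(2 * x)       # stand-in for the NN term
def u_sl(x, s_gain=3.0):   return -s_gain * np.sign(x)        # sliding-mode term

for x in (-1.5, -0.9, 0.2, 1.2):
    m = modulation(x)
    u = u_pd(x, 0.0) + (1 - m) * u_ad(x) + m * u_sl(x)
    print(f"x = {x:+.1f}  m = {m:.2f}  u = {u:+.2f}")
```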

The resulting system (see Fig. 7) adjusts the network's weights while controlling the plant. No prior learning is needed. It is possible to prove that all states in the system remain bounded and that the tracking errors asymptotically converge to a neighborhood of zero. See (Sanner and Slotine, 1992) for a complete proof.

FIG. 7. Control scheme proposed by Sanner and Slotine (1992).

7. IMPLEMENTATIONS

Most of the control applications of NN are implemented by programs in digital computers which simulate the behavior of the neural nets. One of the potential advantages of NN, the inherent parallelism, is therefore lost. Hardware implementations are consequently convenient in order to use NN to their full potential. In order to use NN for adaptive control, implementations must perform some kind of parameter adjustment. This is more easily achieved when using some types of circuits, but at the cost of bigger silicon surfaces.


Implementations of NN can be classified in the following categories.

7.1. Neural Processors.

They exhibit a flexibility which is an advantage over other types. Feild and Navlakha (1988) proposed an architecture consisting of two INMOS boards having five transputers connected to an IBM/XT computer. Beynon (1988) developed a network of transputers to simulate the BP training algorithm. In the field of pattern recognition we find the Graph Search Machine (Glinski et al., 1987).

7.2. Specific Digital Circuits.

Digital implementations are more robust than analog ones against dispersion in the characteristics of the components. Rasure et al. designed a feedforward net that was able to classify hand-written digits. A 3-D structure of NETSIM boards was proposed in (Garth, 1987). Each board contains communication buses, control circuits and neural coprocessors. To reduce the surface needed in digital circuits, synchronous stochastic implementations can be used (Janer, 1994; Janer, Quero and Franquelo, 1993).

7.3. Specific Analog Circuits.

They use less surface than the other types of implementation but need a more careful design. At Caltech a group of researchers directed by Mead has developed a number of implementations that use architectures based on biological models (Mead and Mahowald, 1988).

7.4. Hybrid Implementations.

The use of hybrid (digital-analog) circuits aims at obtaining a mix of the good traits of both types of implementations while avoiding the bad ones. Murray's group has published many papers dealing with this type of implementation (Murray et al., 1987).

8. CONCLUSIONS

Neural networks have the ability of learning a nonlinear model without a priori knowledge of its structure and are adequate for working in real time because of their high parallelism. The use of NN seems therefore to be a way of implementing adaptive controllers for processes where standard adaptive control is not adequate, that is, nonlinear processes with no necessarily known model structure and/or changing dynamics.

Although the potentials of NN for adaptive control have been demonstrated in literature with different processes, there are still a number of open research issues in the field, such as stability, characterization of persistently exciting patterns, adaptation speed and hardware implementation of NN with programmable weights.

ACKNOWLEDGEMENT

The authors would like to acknowledge CICYT for funding the work under grant TAP93-0804 and project CYTED-D.

REFERENCES

Akhyar, S. and S. Omatu (1993). Neuromorphic self-tuning PID controller. In IEEE International Joint Conference on Neural Networks, IJCNN'93, pp. 552-557.

Aracil, J., A. Ollero and A. Garcia-Cerezo (1989). Stability indices for the global analysis of expert control systems. IEEE Trans. on Systems, Man and Cybernetics, 19, 998-1007.

Åström, K. J. and B. Wittenmark (1989). Adaptive Control. Addison-Wesley, New York.

Barto, A. G., R. S. Sutton and C. W. Anderson (1983). Neuronlike adaptive elements that can solve difficult learning control problems. IEEE Trans. on Systems, Man and Cybernetics, 13, 834-846.

Beynon, T. (1988). A parallel implementation of the backpropagation algorithm on a neural network of transputers. IEEE International Conference on Neural Networks.

Camacho, E.F. (1993). Constrained generalized predictive control. IEEE Trans. on Aut. Control, 30, 327-332.

Camacho, E.F. and J.M. Quero (1991). Precomputation of generalized predictive controllers. IEEE Trans. on Aut. Control, 36, 852-859.

Carelli, R., E. F. Camacho and D. Patiño (1993). Neural network based adaptive control for robots. Proc. of the 2nd European Control Conference, Groningen, 475-480.

Carelli, R., E. F. Camacho and D. Patiño (1994). A neural network based adaptive controller for robots. IEEE Trans. on SMC. To appear.

Clarke, D.W., C. Mohtadi and P.S. Tuffs (1987a). Generalized predictive control. Part I: The basic algorithm. Automatica, 23, 137-148.


Clarke, D.W., C. Mohtadi and P.S. Tuffs (1987b). Generalized predictive control. Part II: Extensions and interpretations. Automatica, 23, 149-160.

Clarke, D.W. and C. Mohtadi (1989). Properties of generalized predictive control. Automatica, 25, 859-875.

Craig, J.J. (1988). Adaptive Control of Mechanical Manipulators. Addison-Wesley Publishing Co.

Cutler, C.R. and B.L. Ramaker (1980). Dynamic matrix control. A computer control algorithm. Proc. JACC'80.

Cybenko, G. (1989). Approximation by superpositions of a sigmoidal function. Math. Control Signals Systems, 2, 303-314.

Economou, C. G., M. Morari and B. O. Palsson (1986). Internal model control. 5. Extension to nonlinear systems. Ind. Eng. Chem. Process Des. Dev., 25, 403-411.

Feild, W. B. and J. K. Navlakha (1988). Transputer implementation of Hopfield NN. IEEE Int. Conf. on NN, ICNN'88.

Funahashi, K. I. (1989). On the approximate realization of continuous mappings by neural networks. Neural Networks, 2, 183-192.

Garcia, C.E., D.M. Prett and M. Morari (1989). Model predictive control: theory and practice - a survey. Automatica, 25, 335-348.

Garth, S. (1987). A chipset for high speed simulation of neural network systems. IEEE Int. Conf. on NN, ICNN'87, 443-452.

Glinski, S., T. Lalumia, D. Cassiday, T. Koh, C. Gerveshi, G. Wilson and J. Kumar (1987). The graph search machine: A VLSI architecture for connected word speech recognition and other applications. Proc. IEEE, 75, 1172-1184.

Gómez-Ortega, J. and E. F. Camacho (1994). Neural network GPC for mobile robots path tracking. EURISCON'94. To appear.

Gómez-Ortega, J., E. F. Camacho and J. Quero (1994). Neural network local navigation of mobile robots in a moving obstacles environment. Preprints of Intelligent Components and Instruments for Control Applications, SICICA'94, 263-268.

Guez, A. and I. Bar-Kana (1990). Two-degree-of-freedom robot neurocontroller. Proc. 29th Conference on Decision and Control, pp. 3260-3264.

Hopfield, J. J. (1982). Neural networks and physical systems with emergent collective computational abilities. Proc. of the National Academy of Sciences, 79, 2554-2558.

Hornik, K., M. Stinchcombe and H. White (1989). Multilayer feedforward networks are universal approximators. Neural Networks, 2, 359-366.
Hunt, K. J. and D. Sbarbaro (1991). Neural networks for non-linear internal model control. Proc. IEE Pt. D, 138, 431-438.

Iiguni, Y., H. Sakai and H. Tokumaru (1991). A nonlinear regulator design in the presence of system uncertainties using multilayered neural networks. IEEE Trans. on NN, 2, 410-417.

Janer, C. L. (1994). Parallel stochastic architectures for the microelectronic implementation of neural networks (in Spanish). Ph.D. Thesis, University of Seville, Spain.

Janer, C. L., J. Quero and L. G. Franquelo (1993). Fully parallel summation in a new stochastic neural network architecture. IEEE Int. Conf. on Neural Networks, ICNN'93, 1498-1503.

Kawato, M., Y. Uno, M. Isobe and R. Suzuki (1987). A hierarchical model for voluntary movement and its application to robotics. IEEE Control Systems Magazine, 8, 8-17.

Kelly, R., R. Carelli and R. Ortega (1989). Adaptive motion control design of robot manipulators: An input-output approach. Int. J. Control, 50, 2563-2581.

Kraft, L. G. and D. P. Campagna (1990). A comparison between CMAC neural network and two traditional adaptive control systems. IEEE Control Systems Magazine, pp. 36-43.

Lightbody, G. and G. Irwin (1992). Neural networks for nonlinear adaptive control. IFAC Workshop on Algorithms and Architectures for Real Time Control, pp. 1-13.

Luo, R. F., H. H. Shao and Z. J. Zhang (1993). Fuzzy neural nets based inferential control for a high purity distillation column. Automatica. To appear.

McCulloch, W. S. and W. Pitts (1943). A logical calculus of the ideas immanent in nervous activity. Bulletin of Mathematical Biophysics, 9, 127-147.

Mead, C. A. and M. A. Mahowald (1988). A silicon model of early visual processing. Neural Networks, 1, 91-97.

Minsky, M. L. and S. A. Papert (1969). Perceptrons. The MIT Press, Cambridge, MA.

Morari, A. J. and K. W. Fung (1982). Nonlinear inferential control. Comput. Chem. Eng., 6, 271-281.

Murray, A. F., D. Del Corso and L. Tarassenko (1991). Pulse-stream VLSI neural networks mixing analog and digital techniques. IEEE Transactions on Neural Networks, 2.

Narendra, K. S. and A. M. Annaswamy (1989). Stable Adaptive Systems. Prentice-Hall, Englewood Cliffs, NJ.

Narendra, K. S. and K. Parthasarathy (1990). Identification and control of dynamical systems using neural networks. IEEE Transactions on Neural Networks, 1, 4-27.

Nguyen, D. H. and B. Widrow (1990). Neural networks for self-learning control systems. IEEE Control Systems Magazine, 10, 18-23.

Ollero, A., J. Aracil, A. Garcia-Cerezo and A. Barreiro (1993). Stability of fuzzy control systems. In D. Driankov and H. Hellendoorn (Eds.), Introduction to Fuzzy Control. Springer-Verlag, Berlin.

Ortega, R. and M. Spong (1988). Adaptive motion control of rigid robots: A tutorial. Automatica, 25, 877-888.

Parrish, J. R. and C. B. Brosilow (1988). Nonlinear inferential control. AIChE J., 34, 633-644.

Psaltis, D., A. Sideris and A.A. Yamura (1988). A multilayered neural network controller. IEEE Control Systems Magazine, 8, 17-21.

Quero, J.M. and E.F. Camacho (1990). Neural generalized predictive self-tuning controllers. Proc. of the IEEE International Conference on System Engineering, 160-163.

Quero, J.M., E.F. Camacho and L. G. Franquelo (1993). Neural network for constrained predictive control. IEEE Transactions on Circuits and Systems, 40, 621-626.

Rosenblatt, F. (1958). The perceptron: a probabilistic model for information storage and organization in the brain. Psychological Review, 65, 386-408.

Rouhani, R. and R. K. Mehra (1982). Model algorithmic control: Basic theoretic perspectives. Automatica, 10, 401-414.

Saerens, M., A. Soquet, J.-M. Renders and H. Bersini (1991). Some preliminary comparisons between a neural adaptive controller and a model reference controller. In G. A. Bekey and K. Y. Goldberg (Eds.), NN in Robotics, 131-146.

Sanner, R. M. and J.-J. E. Slotine (1992). Gaussian networks for direct adaptive control. IEEE Trans. on Neural Networks, 3, 833-863.

Slotine, J.-J. E. and W. Li (1990). Adaptive manipulator control: A case study. IEEE Trans. on Automatic Control, AC-33.

Takahashi, Y. (1993). Adaptive predictive control of nonlinear time-varying systems using neural networks. Int. Joint Conference on NN, IJCNN'93, 1464-1468.

Tan, Y. and R. De Keyser (1993). Neural network based adaptive predictive control. CEG/ESPRIT/CIME/CIDIG Conf. Advances in MBPC, 77-88.

Widrow, B. and M. E. Hoff (1960). Adaptive switching circuits. 1960 IRE WESCON Convention Record, New York: IRE, pp. 96-104.

Yabuta, T. and T. Yamada (1990). Possibility of neural networks controller for robot manipulators. Proc. Int. Conf. Robotics Automat., 1686-1691.

Zomaya, A. Y. (1994). Reinforcement learning for the adaptive control of nonlinear systems. IEEE Trans. Syst., Man, Cybern., 24, 357-363.

Zomaya, A. Y. and T. M. Nabhan (1993). Centralized and decentralized neuro-adaptive robot controllers. Neural Networks, 6, 223-244.