We will want to find functions $u$ which are extrema of $\mathcal{F}$, usually subject to end conditions of the form
\[
u(a) = c \quad\text{and}\quad u(b) = d.
\]
The methods we will use come from calculus. The theory of minima and maxima for functions of one and several variables provides both a model for what we will do in the calculus of variations and one of our most powerful methods. Therefore the first section will be devoted to that study in some detail.
2. VARIATIONAL PROBLEMS INVOLVING SCALAR FUNCTIONS IN ONE VARIABLE
There is a basic difference between parts (1) and (2). In part (1) of Theorem 2.2, under the stronger assumption that $f \in C^2(I)$, we get a stronger necessary condition for $x_0$ to be a local minimum than we found in Theorem 2.1. In part (2), we assume a condition on $f$ that is slightly stronger than the necessary condition found in part (1), and we find that this condition is sufficient to ensure that $x_0$ is a local minimum.
This difference between necessary and sufficient conditions will also be found in what we do in the calculus of variations. It is rare that we can find conditions that are simultaneously necessary and sufficient. To see this for functions, consider the following example.

Example 2.3. The examples $f(x) = x^3$ and $g(x) = x^4$ show that there is no condition involving only the first and second derivatives which is simultaneously necessary and sufficient. We have $f'(0) = f''(0) = 0$, but $0$ is not a local extremum for $f$. On the other hand, we have $g'(0) = g''(0) = 0$, yet $0$ is a local minimum for $g$.
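The claims in Example 2.3 are easy to confirm numerically. The following sketch (hypothetical finite-difference helpers, not from the text) checks that the first and second derivatives of both functions vanish at $0$, while only $g$ has a local minimum there.

```python
# A quick numerical check of Example 2.3: f(x) = x^3 and g(x) = x^4 both
# satisfy f'(0) = f''(0) = 0, yet 0 is an extremum only for g.

def f(x):
    return x ** 3

def g(x):
    return x ** 4

def d1(fn, x, h=1e-5):
    # central-difference first derivative
    return (fn(x + h) - fn(x - h)) / (2 * h)

def d2(fn, x, h=1e-4):
    # central-difference second derivative
    return (fn(x + h) - 2 * fn(x) + fn(x - h)) / h ** 2

first_second_vanish = all(abs(v) < 1e-6 for v in
                          (d1(f, 0.0), d2(f, 0.0), d1(g, 0.0), d2(g, 0.0)))

# f takes values below and above f(0) arbitrarily close to 0 ...
f_not_extremum = f(-0.01) < f(0.0) < f(0.01)
# ... while g(0) is the smallest value of g near 0.
g_local_min = g(-0.01) > g(0.0) and g(0.01) > g(0.0)
```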
It should also be pointed out that while we have put our emphasis on minimum points for the function $f$, there are corresponding results for maximum points. We need only point out that if $x_0$ is a (local) maximum for $f$, then it is a (local) minimum for $-f$. We will continue to state results for minima, and it will be up to the reader to derive the corresponding results for maxima.
Since these theorems are so important in what will follow, let’s prove them.
PROOF. (Of Theorem 2.1) Suppose the theorem is not true. Then there is a point $x_0$ which is an extremum for $f$ with $f'(x_0) \neq 0$. We may as well assume that $f'(x_0) > 0$. Since $f \in C^1(I)$, by Taylor's theorem, as explained in A.3,

(2.1.3) $f(x) = f(x_0) + f'(x_0)(x - x_0) + R_1(x)$,

where $R_1(x) = r_1(x)(x - x_0)$ and $\lim_{x \to x_0} r_1(x) = 0$. Thus there is an $\epsilon > 0$ such that $|x - x_0| < \epsilon$ implies that $|r_1(x)| < f'(x_0)$. This means that $f'(x_0) + r_1(x) > 0$ if $|x - x_0| < \epsilon$. Hence, from (2.1.3),
\[
\begin{aligned}
f(x) &= f(x_0) + f'(x_0)(x - x_0) + R_1(x) \\
     &= f(x_0) + \bigl( f'(x_0) + r_1(x) \bigr)(x - x_0)
\end{aligned}
\qquad\text{for } |x - x_0| < \epsilon,
\]
so that
\[
f(x) \begin{cases} > f(x_0) & \text{if } 0 < x - x_0 < \epsilon, \\ < f(x_0) & \text{if } -\epsilon < x - x_0 < 0. \end{cases}
\]
Consequently, $x_0$ is neither a local minimum nor a local maximum for $f$, contradicting our assumption that $x_0$ is an extremum.
PROOF. (Of Theorem 2.2) First let's prove the sufficient condition in part (2). By condition (2.1.2), we have $f'(x_0) = 0$ and $f''(x_0) > 0$. Since $f \in C^2(I)$, by Taylor's theorem, as explained in A.3,

(2.1.4) $f(x) = f(x_0) + f'(x_0)(x - x_0) + \tfrac{1}{2} f''(x_0)(x - x_0)^2 + R_2(x)$,

where $R_2(x) = r_2(x)(x - x_0)^2$ and $\lim_{x \to x_0} r_2(x) = 0$. Thus there is an $\epsilon > 0$ such that $|r_2(x)| < \tfrac{1}{2} f''(x_0)$ if $|x - x_0| < \epsilon$. This means that $\tfrac{1}{2} f''(x_0) + r_2(x) > 0$ if $|x - x_0| < \epsilon$. Hence, from (2.1.4),
\[
\begin{aligned}
f(x) &= f(x_0) + f'(x_0)(x - x_0) + \tfrac{1}{2} f''(x_0)(x - x_0)^2 + R_2(x) \\
     &= f(x_0) + \bigl( \tfrac{1}{2} f''(x_0) + r_2(x) \bigr)(x - x_0)^2 \quad\text{since } f'(x_0) = 0 \\
     &> f(x_0) \quad\text{for } 0 < |x - x_0| < \epsilon.
\end{aligned}
\]
Hence $x_0$ is a local minimum for $f$.

To prove the necessary condition in part (1), we assume that the theorem is not true. Then there is a point $x_0$ which is a local minimum for $f$, but the condition in (2.1.1) fails. By Theorem 2.1 we know that $f'(x_0) = 0$, so we must have $f''(x_0) < 0$. But then, applying part (2) to $-f$, we see that $x_0$ is a local maximum for $f$ and not a local minimum, a contradiction.
Maximum and minimum results in several variables are similar. We will state results
for functions of two variables.
Theorem 2.4. Suppose that $f \in C^1(\Omega)$, where $\Omega \subset \mathbf{R}^2$ is an open set, and that $f$ has a local extremum at $(x_0, y_0) \in \Omega$. Then

(2.1.5) $\dfrac{\partial f}{\partial x}(x_0, y_0) = 0$ and $\dfrac{\partial f}{\partial y}(x_0, y_0) = 0$.

Using the gradient of $f$, equation (2.1.5) can be written as
\[
\nabla f(x_0, y_0) = \Bigl( \frac{\partial f}{\partial x}(x_0, y_0),\ \frac{\partial f}{\partial y}(x_0, y_0) \Bigr) = (0, 0).
\]
This provides a necessary condition for a local extremum analogous to that in Theorem 2.1. To state the corresponding condition on second derivatives for the analog of Theorem 2.2 we need the concept of the Hessian of $f$. The Hessian of $f$ is the matrix
\[
Hf(x, y) = \begin{pmatrix}
\dfrac{\partial^2 f}{\partial x^2}(x, y) & \dfrac{\partial^2 f}{\partial x \partial y}(x, y) \\[2ex]
\dfrac{\partial^2 f}{\partial x \partial y}(x, y) & \dfrac{\partial^2 f}{\partial y^2}(x, y)
\end{pmatrix}.
\]
Remember that a matrix $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ is positive semidefinite if
\[
a\xi^2 + (b + c)\xi\eta + d\eta^2 \ge 0 \quad\text{for all } (\xi, \eta),
\]
and is positive definite if
\[
a\xi^2 + (b + c)\xi\eta + d\eta^2 > 0 \quad\text{whenever } (\xi, \eta) \neq (0, 0).
\]
We will write $A \ge 0$ if $A$ is positive semidefinite, and $A > 0$ if $A$ is positive definite.
Theorem 2.5. Suppose that $f \in C^2(\Omega)$.

(1) If $(x_0, y_0) \in \Omega$ is a local minimum for $f$, then

(2.1.6) $\nabla f(x_0, y_0) = (0, 0)$ and $Hf(x_0, y_0) \ge 0$.

(2) If

(2.1.7) $\nabla f(x_0, y_0) = (0, 0)$ and $Hf(x_0, y_0) > 0$,

then $(x_0, y_0)$ is a local minimum for $f$.
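The conditions in Theorem 2.5 are straightforward to check in concrete cases. The following sketch uses the hypothetical function $f(x, y) = x^2 - xy + y^2$ (not one of the text's examples), whose gradient and Hessian are computed by hand, and tests positive definiteness via the quadratic form in the definition above.

```python
# A sketch of checking Theorem 2.5 for the hypothetical function
# f(x, y) = x^2 - x*y + y^2, which has a critical point at (0, 0).

def grad_f(x, y):
    # gradient of f, computed by hand
    return (2 * x - y, -x + 2 * y)

# Constant Hessian of f, in the notation of the text: a = d = 2, b = c = -1.
a, b, c, d = 2.0, -1.0, -1.0, 2.0

gx, gy = grad_f(0.0, 0.0)

# Sample the quadratic form a*xi^2 + (b + c)*xi*eta + d*eta^2 from the
# definition of positive definiteness at a few nonzero points.
qvals = [a * xi ** 2 + (b + c) * xi * eta + d * eta ** 2
         for xi, eta in ((1, 0), (0, 1), (1, 1), (1, -1), (2, -3))]

# For a symmetric 2x2 matrix, positive definiteness is equivalent to
# a > 0 and ad - bc > 0 (Sylvester's criterion).
hessian_pos_def = a > 0 and a * d - b * c > 0
```

Since the gradient vanishes and the Hessian is positive definite, part (2) of Theorem 2.5 guarantees that $(0, 0)$ is a local minimum of this $f$.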
PROOF. (Of Theorem 2.4) It is interesting to notice, for what follows in the next chapter, that we can easily use Theorem 2.1 to prove Theorem 2.4. Let $v = (\xi, \eta)$, and consider the function
\[
\Phi(t) = f\bigl( (x_0, y_0) + tv \bigr) = f(x_0 + t\xi,\ y_0 + t\eta).
\]
If $f$ has a local minimum at $(x_0, y_0)$, then $\Phi$ has a local minimum at $t = 0$. According to Theorem 2.1, $\Phi'(0) = 0$. Then
\[
0 = \Phi'(0) = \xi \,\frac{\partial f}{\partial x}(x_0, y_0) + \eta \,\frac{\partial f}{\partial y}(x_0, y_0) = \nabla f(x_0, y_0) \cdot v = \frac{\partial f}{\partial v}(x_0, y_0).
\]
The last expression is the derivative of $f$ in the direction $v$. This is true for all vectors $v$. Hence all directional derivatives of $f$ vanish at $(x_0, y_0)$. Since this includes the directions of the $x$- and $y$-axes, equation (2.1.5) follows.
Theorem 2.5 can be proved in the same way, and we will leave it as an exercise.
and the problem is to find functions $u$ which minimize or maximize the functional $\mathcal{F}$ subject to the boundary conditions

(2.2.2) $u(a) = c$ and $u(b) = d$.

In this section we will look at this problem in more detail to discover exactly what must be done.
Remember that the function $F$ is called the Lagrangian of the functional $\mathcal{F}$, and that the integral in (2.2.1) is called a variational integral.

The Lagrangians in (2.2.3) and (2.2.5) are defined for $x \in I$ and $(u, p) \in \mathbf{R}^2$. However, the Lagrangian for the brachistochrone in (2.2.4) requires that $(u, p) \in U = \{ (u, p) \in \mathbf{R}^2 \mid u < c \}$. The Lagrangian $F$ must be continuous in all three variables in order that the variational integral in (2.2.1) be defined. However, we will usually require at least that $F \in C^1(I \times U)$, or even that $F \in C^2(I \times U)$.

In order for the integral in (2.2.1) to be defined it will be necessary for the function $u$ to be continuously differentiable on $I$. That can be stretched to $u \in PC^1(I)$. In addition, $F(x, u(x), u'(x))$ must be defined for all $x \in I$. This means that $(u(x), u'(x)) \in U$ for all $x \in I$. Such a function which also satisfies the boundary conditions in (1.1.18) will be said to be admissible for the functional $\mathcal{F}$.
2.2.2. Extrema
Definition 2.6. A function $u$ which is admissible for the functional $\mathcal{F}$ is a global minimum for $\mathcal{F}$ if
\[
\mathcal{F}(u) \le \mathcal{F}(v) \quad\text{for every admissible } v.
\]
This is not much different from the definition of a global minimum for a function
of one variable (see Section 2.1). There is a similar definition for a global maximum.
A function which is either a global minimum or a global maximum is called a global
extremum.
Things get a little more complicated when we talk about local extrema. Remember that a point $a$ is a local minimum for the function $f$ if there is an $\epsilon > 0$ such that

(2.2.6) $f(a) \le f(x)$ for all $x$ such that $|x - a| < \epsilon$.

Roughly speaking, this means that $f(a) \le f(x)$ for all $x$ close to $a$, where "close" is measured by $|x - a|$: the distance between $x$ and $a$ is measured by the absolute value of the difference.

To come up with a corresponding definition for functions we need to have a measure of the distance between two functions $u$ and $v$. Since $u$ and $v$ are real valued, the quantity $|u(x) - v(x)|$ measures the distance between the values of $u$ and $v$ at the point $x$. Hence a good definition of the distance between $u$ and $v$ is $\|u - v\|_0$, where the zero norm of a function $\phi$ is defined by

(2.2.7) $\displaystyle \|\phi\|_0 = \sup_{a \le x \le b} |\phi(x)|$.

The zero norm will also be referred to as the sup norm or the $C^0$ norm. One way of measuring the distance between $u$ and $v$ is to use $\|u - v\|_0$.
However, this will not be sufficient. Since the functional $\mathcal{F}$ involves the derivative of $u$, we will have to take that into consideration as well. We can do so with the next definition.

Definition 2.8. For a function $\phi \in C^1(I)$ we define the one norm of $\phi$ by

(2.2.8) $\displaystyle \|\phi\|_1 = \sup_{a \le x \le b} |\phi(x)| + \sup_{a \le x \le b} |\phi'(x)|$.

The one norm will also be referred to as the $C^1$ norm. Clearly, $\|u - v\|_1$ provides another way of measuring the distance between $u$ and $v$. From (2.2.7) and (2.2.8) we see that
\[
\|\phi\|_1 = \|\phi\|_0 + \|\phi'\|_0,
\]
and, in particular, that

(2.2.9) $\|\phi\|_1 \ge \|\phi\|_0$.

We can now define what we mean by a local minimum.

Definition 2.9. A function $u \in C^1(I)$ which is admissible for the functional $\mathcal{F}$ is a (weak) local minimum for $\mathcal{F}$ if there is an $\epsilon > 0$ such that

(2.2.10) $\mathcal{F}(u) \le \mathcal{F}(v)$ for all admissible $v \in C^1(I)$ such that $\|u - v\|_1 < \epsilon$.
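The two norms really are different. The hypothetical pair $u(x) = 0$ and $v(x) = \epsilon \sin(x/\epsilon)$ on $[0, 1]$ (not an example from the text) is close in the zero norm but not in the one norm, since $u - v$ is tiny while $(u - v)' = -\cos(x/\epsilon)$ oscillates with amplitude $1$. The sketch below approximates both norms by sampling.

```python
import math

# u(x) = 0 and v(x) = eps*sin(x/eps) on [0, 1]: close in the C^0 norm,
# far apart in the C^1 norm, because (u - v)' = -cos(x/eps) has size 1.

eps = 1e-3
n = 100001
xs = [i / (n - 1) for i in range(n)]

# sampled approximations of ||u - v||_0 and ||u - v||_1
norm0 = max(abs(-eps * math.sin(x / eps)) for x in xs)
norm1 = norm0 + max(abs(-math.cos(x / eps)) for x in xs)
```

This is exactly why Definition 2.9 uses the one norm: a "weak" neighborhood of $u$ excludes wildly oscillating competitors that the zero norm alone would admit.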
Certainly Lemma 2.11 guarantees a lot of functions in $C_0^1(I)$ for any interval $I$. We need only make sure that the closed interval $[a - \epsilon, a + \epsilon] \subset I$. Our next result approaches this question from another direction.

Lemma 2.13 (Fundamental Lemma of the Calculus of Variations). Let $f$ be a continuous function on the interval $I = (a, b)$. If

(2.2.13) $\displaystyle \int_a^b f(x)\phi(x)\,dx = 0$ for all $\phi \in C_0^1(I)$,

then $f(x) = 0$ for all $x \in I$.

Remark 2.14. We will find many uses for this lemma. It is not difficult to see that the lemma remains true for functions $f$ which are piecewise continuous on $I$. In fact, if we were using the Lebesgue theory of integration, the lemma would remain true assuming only that $f$ is integrable on $I$.
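The mechanism behind Lemma 2.13 can be seen in its contrapositive: if $f$ is positive somewhere, a bump function concentrated there makes the integral in (2.2.13) nonzero. The sketch below uses the hypothetical data $f(x) = x - \tfrac12$ on $I = (0, 1)$ and a $C^1$ bump supported where $f > 0$.

```python
# Contrapositive of the Fundamental Lemma: if the continuous function f is
# positive somewhere, some phi in C_0^1(I) makes integral (2.2.13) nonzero.
# Hypothetical data: f(x) = x - 1/2 on I = (0, 1), bump centered at x0.

x0, delta = 0.75, 0.1

def f(x):
    return x - 0.5

def phi(x):
    # phi(x) = (delta^2 - (x - x0)^2)^2 on its support, 0 elsewhere; this
    # is C^1 and vanishes outside (x0 - delta, x0 + delta), where f > 0
    t = delta ** 2 - (x - x0) ** 2
    return t * t if t > 0 else 0.0

# trapezoidal rule for the integral of f * phi over [0, 1]
n = 20000
h = 1.0 / n
integral = sum(0.5 * h * (f(i * h) * phi(i * h)
                          + f((i + 1) * h) * phi((i + 1) * h))
               for i in range(n))
```

Since $f > 0$ on the bump's support, the integrand is nonnegative and not identically zero, so the integral is strictly positive.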
2.3. The first variation and a necessary condition for a local extremum

Consider once more the functional

(2.3.1) $\displaystyle \mathcal{F}(u) = \int_a^b F\bigl(x, u(x), u'(x)\bigr)\,dx$.

For a fixed $\phi \in C_0^1(I)$,

(2.3.2) $\Phi(t) = \mathcal{F}(u + t\phi)$

is a well defined function of $t$ for $|t| < \epsilon$. Since $u$ is a local minimum for $\mathcal{F}$, we have
\[
\Phi(t) = \mathcal{F}(u + t\phi) \ge \mathcal{F}(u) = \Phi(0)
\]
for all small values of $|t|$. Thus $0$ is a local minimum for $\Phi$. Since $F$ is continuously differentiable, $\Phi$ is also continuously differentiable. By Theorem 2.1, we must have

(2.3.3) $\Phi'(0) = 0$.
We can calculate $\Phi'$ by differentiating (2.3.2) with respect to $t$ under the integral sign in (2.3.2).¹ Doing so and using the chain rule we get
\[
\begin{aligned}
\Phi'(t) &= \frac{d}{dt} \int_a^b F\bigl(x, u(x) + t\phi(x), u'(x) + t\phi'(x)\bigr)\,dx \\
&= \int_a^b \frac{d}{dt} F\bigl(x, u(x) + t\phi(x), u'(x) + t\phi'(x)\bigr)\,dx \\
&= \int_a^b \Bigl[ \frac{\partial F}{\partial u}\bigl(x, u(x) + t\phi(x), u'(x) + t\phi'(x)\bigr)\,\phi(x) \\
&\qquad\qquad + \frac{\partial F}{\partial p}\bigl(x, u(x) + t\phi(x), u'(x) + t\phi'(x)\bigr)\,\phi'(x) \Bigr]\,dx.
\end{aligned}
\]
In particular, at $t = 0$ we get

(2.3.4) $\displaystyle \Phi'(0) = \int_a^b \Bigl[ \frac{\partial F}{\partial u}\bigl(x, u(x), u'(x)\bigr)\,\phi(x) + \frac{\partial F}{\partial p}\bigl(x, u(x), u'(x)\bigr)\,\phi'(x) \Bigr]\,dx$.

We clearly need some better notation. We will denote partial derivatives by subscripts. Thus, for example, we will set
\[
F_u = \frac{\partial F}{\partial u}, \qquad F_p = \frac{\partial F}{\partial p}, \qquad\text{and}\qquad F_{up} = \frac{\partial^2 F}{\partial u \,\partial p}.
\]
Then (2.3.4) becomes

(2.3.5) $\displaystyle \Phi'(0) = \int_a^b \bigl[ F_u\bigl(x, u(x), u'(x)\bigr)\,\phi(x) + F_p\bigl(x, u(x), u'(x)\bigr)\,\phi'(x) \bigr]\,dx$.

This becomes even simpler if we suppress the dependence of $u$ and $\phi$ on $x$:

(2.3.6) $\displaystyle \Phi'(0) = \int_a^b \bigl[ F_u(x, u, u')\,\phi + F_p(x, u, u')\,\phi' \bigr]\,dx$.

If it does not cause confusion, we can even drop the variables:

(2.3.7) $\displaystyle \Phi'(0) = \int_a^b \bigl( F_u\,\phi + F_p\,\phi' \bigr)\,dx$.
Let’s give this a name.
Definition 2.15. The first variation of $\mathcal{F}$ at $u$ in the direction $\phi$ is defined to be

(2.3.8) $\displaystyle \delta\mathcal{F}(u; \phi) = \frac{d}{dt}\,\mathcal{F}(u + t\phi)\Big|_{t=0}$.
1It is the continuity of the derivatives of F which makes this possible. Without this condition, it is not al-
ways possible to compute the derivative of an integral by interchanging the order of integration and differentiation.
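The integral formula (2.3.5) and the derivative definition (2.3.8) must agree, and this is easy to check numerically. The sketch below uses the hypothetical data $F(x, u, p) = p^2$ on $[0, 1]$, $u(x) = x^2$, and $\phi(x) = x(1 - x) \in C_0^1$; by hand, both routes give $\int_0^1 2 \cdot 2x \cdot (1 - 2x)\,dx = -2/3$.

```python
# The first variation computed two ways for F(x, u, p) = p^2 on [0, 1],
# u(x) = x^2, phi(x) = x*(1 - x): (i) the integral formula (2.3.5) with
# F_u = 0, F_p = 2p, and (ii) a centered difference of Phi(t) = F(u + t*phi).

n = 20000
h = 1.0 / n

def up(x):            # u'(x)
    return 2.0 * x

def phip(x):          # phi'(x)
    return 1.0 - 2.0 * x

def trapz(g):
    # trapezoidal rule on [0, 1]
    return sum(0.5 * h * (g(i * h) + g((i + 1) * h)) for i in range(n))

# (i) formula (2.3.5): integral of F_p(x, u, u') * phi', since F_u = 0
delta_formula = trapz(lambda x: 2.0 * up(x) * phip(x))

# (ii) difference quotient of Phi(t), as in definition (2.3.8)
def Phi(t):
    return trapz(lambda x: (up(x) + t * phip(x)) ** 2)

t = 1e-4
delta_diff = (Phi(t) - Phi(-t)) / (2.0 * t)
```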
Notice that the first variation at $u$ is defined in Definition 2.15 whether $u$ is a local extremum or not. However, if $u$ is a local extremum of $\mathcal{F}$, then by (2.3.3) and (2.3.9), $\delta\mathcal{F}(u; \phi) = 0$. We have proved our first necessary condition on a local extremum of $\mathcal{F}$.

Proposition 2.16. Suppose that $F \in C^1(I \times U)$, and suppose that $u \in C^1(I)$ is a local extremum for the functional $\mathcal{F}(u) = \int_a^b F(x, u, u')\,dx$. Then

(2.3.11) $\delta\mathcal{F}(u; \phi) = 0$ for all $\phi \in C_0^1(I)$.
Remark 2.17. It is interesting to compare our derivation of (2.3.11) with the proof of Theorem 2.4. The first variation of a functional is comparable to a directional derivative of a function. From the vanishing of all directional derivatives of a function we can conclude that the gradient of the function must vanish. No such easy conclusion is possible for functionals. The comparable result is the subject of the next section.

Definition 2.18. The condition in (2.3.11) is called the weak form of the Euler-Lagrange equation. A function $u$ which satisfies (2.3.11) is called a weak extremal of $\mathcal{F}$.
From this equation we see that the Euler-Lagrange equation is a second order differential equation for $u$. The theory of ordinary differential equations tells us that the general solution involves two arbitrary constants. Since we will want the solution to satisfy the end conditions $u(a) = c$ and $u(b) = d$, there will usually be a solution. However, this will not always be the case.
2.5. Examples

We are ready to solve some variational problems. Let's start with the easiest one.

2.5.1. The shortest graph

We looked at this problem in Section 1.1.1.1. We want to join the points $(a, c)$ and $(b, d)$, where $a < b$, with the graph of a function $u$ defined on $I$ which has the shortest length. The length of $\gamma_u$, the graph of $u$, is given by the length functional
\[
L(u) = \int_a^b \sqrt{1 + u'(x)^2}\,dx.
\]
Thus we want to find minimizers of $L$ subject to the end conditions $u(a) = c$ and $u(b) = d$.
According to Theorem 2.19, any local minimum $u \in C^2(I)$ of $L$ must satisfy the corresponding Euler-Lagrange equation. The Lagrangian of the functional $L$ is $L(x, u, p) = \sqrt{1 + p^2}$. Since $L$ does not depend on $u$, $L_u = 0$, and the Euler-Lagrange equation reduces to
\[
\frac{d}{dx} L_p = 0.
\]
From this we conclude that $L_p = p/\sqrt{1 + p^2}$ is a constant. Hence there is a constant $C$ such that
\[
L_p\bigl(x, u(x), u'(x)\bigr) = \frac{u'(x)}{\sqrt{1 + u'(x)^2}} = C.
\]
This can happen only if $u'(x) = m$ for some constant $m$, and we conclude that

(2.5.1) $u(x) = mx + B$, where $m$ and $B$ are constants.

Thus the extremals of $L$ are the affine functions in (2.5.1). We can easily find constants $m$ and $B$ so that $u(x) = mx + B$ satisfies the end conditions. The graphs of these affine functions are straight lines.

Since any local extremum is an extremal, we have shown that any local minimum of $L$ must be an affine function. Notice that we have not shown that the straight line joining $(a, c)$ and $(b, d)$ is actually a minimum.
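While we have not proved minimality, a numerical comparison is reassuring. The sketch below (hypothetical end conditions $u(0) = 0$, $u(1) = 1$, not from the text) compares the length of the straight line with the lengths of perturbed admissible graphs $u(x) = x + s\sin(\pi x)$, which satisfy the same end conditions.

```python
import math

# Compare the length functional L(u) for the straight line joining (0, 0)
# and (1, 1) with L(u) for perturbed graphs u(x) = x + s*sin(pi*x).

n = 20000
h = 1.0 / n

def length(up):
    # trapezoidal approximation of L(u), given the derivative u'
    g = lambda x: math.sqrt(1.0 + up(x) ** 2)
    return sum(0.5 * h * (g(i * h) + g((i + 1) * h)) for i in range(n))

L_line = length(lambda x: 1.0)                       # u(x) = x, u' = 1
L_perturbed = [length(lambda x, s=s: 1.0 + s * math.pi * math.cos(math.pi * x))
               for s in (0.1, -0.1, 0.3)]
```

Every perturbed graph comes out longer than the straight line, consistent with the line being the global minimum.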
This example illustrates a situation in which the solution of the Euler-Lagrange equation can be reduced to the solution of a first order differential equation. If the Lagrangian $F$ is not explicitly a function of $u$, then $F_u = 0$ and the Euler-Lagrange equation reduces to
\[
\frac{d}{dx} F_p = 0.
\]
This means that $F_p$ must be a constant function. Thus, we have

Proposition 2.21 (Special case #1). Suppose that $u \in C^2(I)$ is an extremal of $\mathcal{F}$. If $F_u = 0$, then there is a constant $C$ such that $F_p(x, u(x), u'(x)) = C$.
There is another special case that we might as well state and prove now.

Proposition 2.22 (Special case #2). Let $E(u, p) = pF_p - F$. Suppose that $u \in C^2(I)$ is an extremal of $\mathcal{F}$. If $F_x = 0$, then there is a constant $C$ such that $E(u(x), u'(x)) = C$.²

PROOF. We will show that the derivative of $E(u, u')$ is equal to zero, from which the proposition follows.
\[
\begin{aligned}
\frac{d}{dx} E(u, u') &= \frac{d}{dx}\Bigl[ u' F_p(u, u') - F(u, u') \Bigr] \\
&= u'' F_p + u' \frac{d}{dx} F_p - F_u u' - F_p u'' \\
&= u' \Bigl( \frac{d}{dx} F_p - F_u \Bigr) \\
&= 0.
\end{aligned}
\]
The last step follows because $u$ is an extremal and must satisfy the Euler-Lagrange equation (see Definition 2.20).
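Proposition 2.22 can be checked numerically on a concrete case. The sketch below uses the hypothetical Lagrangian $F(x, u, p) = u\sqrt{1 + p^2}$ (which has $F_x = 0$); one verifies by hand that $u(x) = \cosh x$ satisfies its Euler-Lagrange equation, and along this extremal $E(u, u') = u'F_p - F = -u/\sqrt{1 + u'^2}$ should be constant.

```python
import math

# Check Proposition 2.22 for the hypothetical Lagrangian
# F(x, u, p) = u*sqrt(1 + p^2), with the extremal u(x) = cosh(x).

def E(u, p):
    # E = p*F_p - F for F = u*sqrt(1 + p^2)
    Fp = u * p / math.sqrt(1.0 + p ** 2)
    F = u * math.sqrt(1.0 + p ** 2)
    return p * Fp - F

# evaluate E along the extremal at several points x
values = [E(math.cosh(x), math.sinh(x)) for x in (-1.0, 0.0, 0.5, 2.0)]
spread = max(values) - min(values)
```

Here $E = -\cosh x / \sqrt{1 + \sinh^2 x} = -1$ at every $x$, so the computed values agree to machine precision.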
Propositions 2.21 and 2.22 illustrate a phenomenon that we will be pursuing. First a definition.

Definition 2.23. A function $G(x, u, p)$ is a conserved quantity for the functional $\mathcal{F}$ if for any extremal $u$ of $\mathcal{F}$ there is a constant $C$ such that
\[
G\bigl(x, u(x), u'(x)\bigr) = C \quad\text{for } a \le x \le b.
\]
Thus, we can rephrase Proposition 2.21 by saying that $F_p$ is a conserved quantity if $F_u = 0$, and we can rephrase Proposition 2.22 by saying that $E = pF_p - F$ is a conserved quantity if $F_x = 0$.

²Where does the mysterious quantity $E$ come from? Some light on this question will come from Exercise . The quantity $E$ will arise frequently in what follows.
solutions, but if $c/a$ is large, there are two. The dividing point is the straight line through the origin which is tangent to the graph of $z = \cosh y$. Let the slope of this line be $\alpha$. Then we have three cases.

Case 1. $c/a < \alpha$: The two curves in (2.5.4) do not intersect, so there are no solutions to (2.5.3). The functional $\mathcal{F}$ has no extremals, and hence there are no local minima.

Case 2. $c/a = \alpha$: The two curves in (2.5.4) intersect in one point, so there is one solution to (2.5.3). The functional $\mathcal{F}$ has one extremal. We do not know if this solution is a local minimum.

Case 3. $c/a > \alpha$: The two curves in (2.5.4) intersect in two points, so there are two solutions $y_1 < y_2$ to (2.5.3). Set $M_i = y_i/a$ for $i = 1, 2$. Then the functional $\mathcal{F}$ has two extremals
\[
u_i(x) = \frac{1}{M_i} \cosh M_i x, \quad\text{for } i = 1, 2, \text{ where } M_1 < M_2.
\]
At this point, we do not know if either of these solutions is a local minimum. However, when we study sufficient conditions we will be able to demonstrate that $u_1$ is a minimum, while $u_2$ is neither a minimum nor a maximum.
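The dividing slope $\alpha$ can be computed numerically. Tangency of the line $z = \alpha y$ to $z = \cosh y$ requires equal values and equal slopes, which combine into $\cosh y = y \sinh y$; the sketch below (a simple bisection, not from the text) solves this and recovers $\alpha = \sinh y$.

```python
import math

# Tangency of z = alpha*y to z = cosh(y): cosh(y) = alpha*y and
# sinh(y) = alpha together force cosh(y) = y*sinh(y).  Solve by bisection.

def tangency(y):
    return math.cosh(y) - y * math.sinh(y)

lo, hi = 1.0, 2.0            # tangency(1) > 0 > tangency(2)
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if tangency(mid) > 0:    # tangency is decreasing for y > 0
        lo = mid
    else:
        hi = mid

y_star = 0.5 * (lo + hi)
alpha = math.sinh(y_star)    # approximately 1.509
```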
2.5.3. The brachistochrone

We studied this problem in Section 1.1.2. We want to find the minimum of the functional

(2.5.5) $\displaystyle \mathcal{F}(u) = \int_a^b \sqrt{\frac{1 + u'(x)^2}{C - u(x)}}\,dx, \quad\text{with } C = c + v_0^2/2g,$

subject to the end conditions
\[
u(a) = c \quad\text{and}\quad u(b) = d.
\]
Notice that when the initial speed $v_0 = 0$ we have $C = c$, and the Lagrangian

(2.5.6) $\displaystyle F(x, u, p) = \sqrt{\frac{1 + p^2}{C - u}} = \sqrt{\frac{1 + p^2}{c - u}}$

becomes infinite when $u = c$. Our results require that $F \in C^2(I \times U)$, which is not the case when $v_0 = 0$. Therefore we will first assume that $v_0 \neq 0$ and then handle the case $v_0 = 0$ as a limiting case.

Since $F_x = 0$, we again apply Proposition 2.22 to conclude that $E(u, p) = pF_p - F$ is constant along any extremal. A computation shows that
\[
E(u, p) = -\frac{1}{\sqrt{(1 + p^2)(C - u)}}.
\]
Hence $C - u = 2k\sin^2(\theta/2)$, and (2.5.7) using the plus sign becomes

(2.5.9) $\displaystyle \frac{du}{dx} = -\frac{\cos(\theta/2)}{\sin(\theta/2)}$.

Using equation (2.5.8) we find the following alternative to (2.5.9):

(2.5.10) $\displaystyle \frac{du}{dx} = \frac{du}{d\theta}\,\frac{d\theta}{dx} = -2k\sin(\theta/2)\cos(\theta/2)\,\frac{d\theta}{dx}$.

Using (2.5.9) and (2.5.10) we can solve for $dx/d\theta$:
\[
\frac{dx}{d\theta} = 2k\sin^2(\theta/2).
\]
Integrating this equation yields
\[
x(\theta) = x_0 + k(\theta - \sin\theta).
\]
This equation together with (2.5.8) gives us a parametrized solution to the Euler-Lagrange equation:

(2.5.11) $x(\theta) = x_0 + k(\theta - \sin\theta)$ and $u(\theta) = C - k(1 - \cos\theta)$.

Notice that $x(0) = x_0$ and $u(0) = C = c + v_0^2/2g$. To find a solution which satisfies our end conditions we must find values of the parameters $k$, $x_0$, $\theta_1$, and $\theta_2$ such that
\[
x(\theta_1) = a,\ u(\theta_1) = c \quad\text{and}\quad x(\theta_2) = b,\ u(\theta_2) = d.
\]
Assuming that $a < b$ and $c > d$, it is possible to do that. Thus, given the end points $(a, c)$ and $(b, d)$ and the initial speed $v_0 > 0$, there is precisely one extremal of $\mathcal{F}$. This extremal can be shown to be a global minimum.
The family of curves parametrized by (2.5.11) are called cycloids. To visualize what a cycloid looks like, consider a circle of radius $k$ which at $\theta = 0$ is tangent to the line $y = C$ at the point $P = (x_0, C)$ and lies underneath the line. As $\theta$ increases, the circle rolls under this line with its center moving to the right with velocity $k$. The point $P$ moves with the circle, and $P(\theta) = (x(\theta), u(\theta))$, where the components are given by equations (2.5.11).
Now let's examine the case when $v_0 = 0$. This means that the weight is dropped from $(a, c)$ with initial speed equal to zero. The Lagrangian is given in (2.5.6), and, as we already pointed out, it has a singularity at $u = c$, so Theorem 2.19 does not apply. However, if we let $v_0 \to 0$ in (2.5.11) we get the parametrization
\[
x(\theta) = x_0 + k(\theta - \sin\theta) \quad\text{and}\quad u(\theta) = c - k(1 - \cos\theta).
\]
Notice that now $u(0) = c$, so to satisfy the left-hand end condition we need $a = x(0) = x_0$. Thus the parametrization becomes

(2.5.12) $x(\theta) = a + k(\theta - \sin\theta)$ and $u(\theta) = c - k(1 - \cos\theta)$.
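Fitting the parameters in (2.5.12) to given endpoints is a one-dimensional root-finding problem. Eliminating $k$ from $b - a = k(\theta - \sin\theta)$ and $c - d = k(1 - \cos\theta)$ leaves the monotone equation $(\theta - \sin\theta)/(1 - \cos\theta) = (b - a)/(c - d)$ for the endpoint parameter. The sketch below solves it by bisection for the hypothetical endpoints $(a, c) = (0, 1)$ and $(b, d) = (2, 0)$.

```python
import math

# Fit the v0 = 0 cycloid (2.5.12) through the endpoints (0, 1) and (2, 0).

a, c, b, d = 0.0, 1.0, 2.0, 0.0
target = (b - a) / (c - d)

def ratio(t):
    # (t - sin t)/(1 - cos t): increases from 0 to infinity on (0, 2*pi)
    return (t - math.sin(t)) / (1.0 - math.cos(t))

lo, hi = 1e-6, 2 * math.pi - 1e-6
for _ in range(80):
    mid = 0.5 * (lo + hi)
    if ratio(mid) < target:
        lo = mid
    else:
        hi = mid

theta_b = 0.5 * (lo + hi)
k = (c - d) / (1.0 - math.cos(theta_b))

# the cycloid should now pass through (b, d)
x_end = a + k * (theta_b - math.sin(theta_b))
u_end = c - k * (1.0 - math.cos(theta_b))
```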
You might suspect that this is the minimum, and you would be right. However, the minimum must be defined in the correct way, since the Lagrangian is not continuous. Another problem is that when $\theta$ is eliminated from the two equations in (2.5.12) and $u$ is expressed as a function of $x$, we have

(2.5.13) $\displaystyle \lim_{x \to a^+} \frac{u(x) - c}{(x - a)^{2/3}} = -\Bigl(\frac{9k}{2}\Bigr)^{1/3}$.
\[
f(y) = T(u) = \frac{PA(y)}{v_1} + \frac{PB(y)}{v_2}.
\]
We now want to find $y$ which minimizes $f(y)$. We calculate that
\[
f'(y) = \frac{y - c}{v_1\,PA(y)} + \frac{y - d}{v_2\,PB(y)}.
\]
At a minimum we must have $f'(y) = 0$, or
\[
\frac{c - y}{v_1\,PA(y)} = \frac{y - d}{v_2\,PB(y)}.
\]
If we let $\alpha_1$ be the angle that the line $AP$ makes with the $x$-axis, and $\alpha_2$ the angle that $PB$ makes with the $x$-axis, then this equation reads

(2.5.15) $\displaystyle \frac{\sin\alpha_1}{v_1} = \frac{\sin\alpha_2}{v_2}$.

This equation is known as Snell's law.
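Snell's law can be recovered numerically by minimizing the travel time directly. The sketch below uses hypothetical data (not from the text): $A = (-1, 1)$ and $B = (1, -1)$ with the interface along the line $x = 0$, so $c = 1$, $d = -1$, and the crossing point is $P = (0, y)$; speeds are $v_1 = 1$ and $v_2 = 0.5$. Since $f$ is strictly convex, a ternary search finds the minimizer, and the two sides of (2.5.15) should then agree.

```python
import math

# Minimize travel time f(y) = PA(y)/v1 + PB(y)/v2 for A = (-1, 1),
# B = (1, -1), crossing the interface x = 0 at P = (0, y), and verify
# sin(alpha_1)/v1 = sin(alpha_2)/v2 at the minimizer.

v1, v2 = 1.0, 0.5
c, d = 1.0, -1.0

def PA(y):
    return math.hypot(1.0, y - c)

def PB(y):
    return math.hypot(1.0, y - d)

def f(y):
    return PA(y) / v1 + PB(y) / v2

lo, hi = -1.0, 1.0
for _ in range(200):          # ternary search; f is strictly convex
    m1 = lo + (hi - lo) / 3
    m2 = hi - (hi - lo) / 3
    if f(m1) < f(m2):
        hi = m2
    else:
        lo = m1

y = 0.5 * (lo + hi)
lhs = (c - y) / (v1 * PA(y))  # sin(alpha_1) / v1
rhs = (y - d) / (v2 * PB(y))  # sin(alpha_2) / v2
```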
Finally, let's consider the case when the velocity of light $v$ depends only on $x$. Then (2.5.14) becomes
\[
T(u) = \int_a^b \frac{\sqrt{1 + u'(x)^2}}{v(x)}\,dx.
\]
In this case the Lagrangian
\[
T(x, p) = \frac{\sqrt{1 + p^2}}{v(x)}
\]
does not depend on $u$. This is special case #1, so we know that $T_p = p/\bigl(v(x)\sqrt{1 + p^2}\bigr)$ is constant along an extremal. Hence, there is a constant $C$ such that
\[
\frac{1}{v(x)} \cdot \frac{u'(x)}{\sqrt{1 + u'(x)^2}} = C, \quad\text{for } a \le x \le b.
\]
Let $\alpha(x)$ denote the angle that $\gamma_u$, the graph of $u$, makes with the $x$-axis at the point $(x, u(x))$. Then $u'(x) = \tan(\alpha(x))$, from which it follows that $\sin(\alpha(x)) = u'(x)/\sqrt{1 + u'(x)^2}$. Hence the previous equation becomes

(2.5.16) $\displaystyle \frac{\sin(\alpha(x))}{v(x)} = C, \quad\text{for } a \le x \le b$.

This equation is also called Snell's law, providing as it does a generalization of (2.5.15).