
CALCULUS

SECOND EDITION

EDWIN E. MOISE

Harvard University

ADDISON-WESLEY PUBLISHING COMPANY


Reading, Massachusetts

Menlo Park, California

London

Don Mills, Ontario

This book is in the


ADDISON-WESLEY SERIES IN MATHEMATICS

Consulting Editor:

LYNN H. LOOMIS

Copyright 1972 by Addison-Wesley Publishing Company, Inc. Philippines copyright 1972
by Addison-Wesley Publishing Company, Inc.
All rights reserved. No part of this publication may be reproduced, stored in a retrieval
system, or transmitted, in any form or by any means, electronic, mechanical, photocopying,
recording, or otherwise, without the prior written permission of the publisher. Printed in the
United States of America. Published simultaneously in Canada. Library of Congress Catalog
Card No. 76-150576.

Author's Note on the Second Edition

The preface to the first edition was an explanation of the author's intentions, and
since these intentions have not changed, the original preface is reprinted after this one.
But the present edition is a thorough revision of the first, with many major changes
and even more minor ones. Some of these are as follows.
1. The use of language has been simplified throughout. Excessive colloquialisms
have been eliminated.
2. Many problems have been added, most of them being easy. In cases where
several problems form a sequence, they have been combined into a single problem,
with parts (a), (b), (c), .... Thus it is now safe to assign all odd-numbered problems,
without checking to make sure that problem-sequences are not being broken up.
3. Various long sections have been divided into two parts.
4. The classical definition of a limit has been restored. Exploratory problems,
dealing with limits and continuity in terms of "boxes," have been eliminated, and
this material has been inserted in later portions of the text.

5. Section 5.8, on the derivative of one function with respect to another, has been
completely recast, following suggestions of Professor Hugh Thurston, of the University
of British Columbia. The new version is mathematically straightforward, and it
bridges the gap between the modern concepts of function and derivative and the
"fractional" notation du/dv commonly used in physics.

6. Chapter 8, on the conic sections, has been shortened and simplified, by
omitting various topics not ordinarily covered in a first course in calculus. In
particular, the section on the geometry of the ellipse has been omitted. In a way it is
a pity to leave this out, because it is very good mathematics, but in a first course in
calculus we barely have time for essentials.
7. In the chapters on vector spaces, the standard use of the terms "vector space"
and "inner product space" has been restored.
8. The old Chapter 10, on number theory and partial fractions, has been omitted.
The above remarks about the geometry of the ellipse also apply here.
9. The chapter on infinite series has been completely recast. In the first edition,
the idea of uniform convergence was built into the presentation, almost from the
outset; it was used, in various special cases, to justify term-wise integration, long
before the general definition of uniform convergence was stated. This treatment had
advantages, for some students, but it had a serious disadvantage: it meant that the
hardest part of the study of infinite series could not be skipped, or even postponed.
The chapter has now been arranged in such a way that the hardest parts of it come
last. Term-wise integration and differentiation of power series are introduced early,
and play a central part throughout the chapter; but the justification of these processes
is saved for the end.
The construction of the complex numbers (using congruence classes of real
polynomials modulo 1 + x^2) has been moved to an appendix.
10. The chapter on linear transformations, matrices, and determinants has been
recast and simplified in various ways. For example, the idea of isometries between
subspaces has been omitted (from the text and therefore from the problems).

11. In the chapter on functions of several variables, the Leibniz notation for
partial derivatives has been introduced, in parallel with the subscript notation
f_x, f_y, .... The former notation is of course universal in physics, and it cannot be
denied that it makes the chain rule easier to remember.


These examples should make it plain that this is not a perfunctory revision.
The intent of the revision is to make the book more teachable and more flexible,
without weakening its mathematical content. Some sections (as indicated above)
have been omitted outright. Some chapters have been recast in such a way that more
topics can be omitted at the teacher's discretion. But the main substance of the book,
and the conception of calculus that it attempts to teach, have not been changed. All
the hard problems in the first edition have been retained (except for one very
embarrassing case, in which I asked the student to prove a false theorem).
Most of the calculus books now in print are of one of the following three types:

1) Some are written on a high plateau of austerity and rigor, and the Devil take
the hindmost.
2) Some are "quick calculus" books. A typical device, in this sort of book, is to
use the Fundamental Theorem of Integral Calculus as a definition of the definite
integral. This enables the student to imitate the behavior of mathematicians, in
calculating definite integrals, without sharing the mathematicians' conception of
what the problem meant in the first place.
3) Some are a combination of types 1 and 2, with exact theoretical material
included in the text but not in the problems. A course based on a book like this is
evidently intended to play a double game, in which some of the students learn the
text, but most of them study, in effect, a "quick calculus" course based on the same
sort of problems used in books of type 2.

The present book is of none of these three types. It is addressed to all students
who ordinarily take calculus courses and pass them, and it is designed to teach the
ideas of calculus, at least in some form, to these students. In a sense we are playing
a double game, because the teacher has a great deal of choice. If the entire text is
taught, then the course is logically complete, and nothing that it includes needs to be
taught again later. But many of the theoretical sections, especially at the ends of
chapters, giving proofs of "foundational" theorems, can easily be omitted. For
example, if we omit Sections 5.6, 5.7, and 5.8, then something is subtracted from the
course, but nothing is disrupted or confused. And in Chapter 10 we can stop at almost
any point. But even if such omissions are made on a large scale, the remaining hard
core of the book conveys the ideas of calculus in conceptual forms.

The rock-bottom minimum, in any good mathematics course, is for the student
to attach conceptual meanings to the problems that he solves and to the "answers"
that he computes. If we settle for less than this, we are making a bad bargain. The
hard fact is that "practical" calculus courses are not practical. In real life it seldom,
if ever, happens that a mathematical problem takes the form of a homework exercise
which can be solved by copying the pattern of the "solved problems" that immediately
precede it. In physics it is the conceptual definite integral that is crucial, and numerical
evaluations are often done by computers. Thus the art of setting up integrals is often
more useful than the art of calculating them by elementary methods. The same
principle applies very widely. When people put their mathematical training to
practical use, they seldom need the logical refinements that appear in a thorough
treatise, but they nearly always use their conceptual grasp, at some level, of
mathematical ideas. Obviously, many techniques are needed, and in this book we have
worked hard to teach them. But at the same time we have tried to produce the sort
of conceptual grasp of mathematics that can be put to work in real life.


New York City, N. Y.
October 1971

E.E.M.

Preface

The mathematical content of the first ten chapters of this book is familiar and easy
to describe. These chapters present, more thoroughly than is customary, the material
normally covered in one-year introductions to college calculus, and end with a chapter
on infinite series. (This portion of the book is being published separately, under
the title Elements of Calculus.) In the last four chapters of the complete edition, the
choice of material is nowhere nearly so traditional. In particular, we have laid heavy
stress on the methods of linear algebra.
In the latter portion of this preface, we explain the considerations on which the
selection of topics in the last few chapters is based. Most of the novelties in the
Elements are in the style of treatment; and the ideas underlying them may best be
explained by means of numerous examples.

1. THE SPIRAL PROCESS

The central concepts of the calculus are deep. It is not to be expected that they can
be learned all at once, in the forms in which a modern mathematician thinks of them.
Therefore, in this book, the more difficult ideas are presented in a series of different
forms, in ascending order of difficulty, generality, and exactitude. Thus the idea of
the definite integral makes its first and simplest appearance in Section 2.10; it is
generalized in Section 3.7; and it is not presented in final form (using Riemann sums)
until Section 7.1, where Riemann sums are needed, in the calculation of arc length.
Similarly, the chain rule for derivatives appears first in Section 3.6, for powers
and square roots of functions; it is proposed, in more general forms, in Problem
Sets 3.6, 3.8, 4.3, and 4.5; and it appears in final form only in Section 4.6.
The mean-value theorem is first stated, in geometric terms, in Section 3.2, before
any formal definition of the derivative. It is used freely thereafter. Finally, in Section
5.7, it is proved, after the ideas needed in the proof have been used and motivated
in other ways.
The idea of the limit of a function appears first in Section 2.7. The formal definition
is in Section 3.3. Earlier sections include a lengthy preparation for the
formal definition, designed to eliminate in advance as many of its difficulties as
possible. This purpose is served by the text of Sections 1.4 and 2.5. Thus the style of
treatment is such that an inspection of isolated sections of the book is likely to lead
to an overestimate of the difficulty of the course. The point is that the sections are
not isolated: difficult discussions have been provided with elaborate foundations, in
the text and especially in the problems.

The spiral treatment, in which concepts appear in various forms as the theory
develops, is intended to make the concepts easier to learn. But this is not its only
purpose. The processes by which special ideas are generalized, and heuristic ideas
are made concrete and exact, are part of the substance of what we ought to be
teaching. Thus the heuristic treatment of exponentials and logarithms, in Section
4.9, is not given merely in order to make the student's life easier. The transition from
Section 4.9 to Sections 4.10 and 4.11 (in which the theory is based on the definition
$\ln x = \int_1^x (dt/t)$) is valuable in itself, as an illustration of a recasting process which
is essential both in the growth of mathematics and in the growth of the people who
use it.

2. MOTIVATION

The desire to solve interesting puzzles is very strong; there is no maturity level at
which it disappears; and we should appeal to it continually. Most of the time,
however, when new ideas are introduced, they ought to be motivated by a sense of
power, and by the light that they throw on ideas already regarded as significant.
For example, if we present Riemann sums, in full generality, long before we deal
with problems in which they are needed, it is not reasonable to expect the student to
master their complications. Similarly, the completeness of the real number system,
in the sense of Dedekind, is not needed at all in the theory of pointwise limits: this
theory takes exactly the same form in the rational domain as in the real domain.
If we postpone the idea of completeness until the point where it is needed, in the
study of functions continuous on an interval, it is more likely to be understood,
partly because it is more likely to get the student's attention.
The problem of motivating the idea of the limit of a function involves a peculiar
difficulty. The only cases in which $\lim_{x \to a} f(x)$ is easy to calculate are those in which
f is a continuous function, described by a simple formula. In these cases, the formula
works just as well for x = a as for other values of x; in practice, it turns out that
the limit is f(a); and the student is likely to get the idea that the expression
$\lim_{x \to a} f(x)$ is merely a devious and pretentious description of f(a). If we avoid this
trouble by starting with significant cases, such as

    $\lim_{x \to 0} \frac{\sin x}{x}$,

then the technical difficulties are formidable, and workable problem material is hard
to come by. If we choose, instead, to discuss limits of sequences, then we have
evaded the issue by changing the subject: in the differential calculus, limits of
functions are what we need.


But there is a fourth alternative: we can introduce the idea of a limit not as a
subject in its own right, but as a device for solving a problem. In Section 2.7 we
mention limits for the first time, in finding the limit of a linear function. To pass
to the limit, in this case, we merely plug the hole in a punctured line. This process
has no intrinsic significance. But in the context of Section 2.7, it has an extrinsic
significance, because it is used to solve a nontrivial problem, namely, the problem
of finding the slope of the tangent to a parabola. Similarly, in Section 2.10, we use
the idea of the limit of a sequence, in a technically simple case, in order to find the
area of a parabolic segment. (A formal definition of the limit of a sequence finally
appears in Section 10.1.) There are many other points at which ideas are introduced,
in simple forms, in connection with a discussion of something else.

3. BLACK BOXES

It is generally agreed that in a physics laboratory the student should build as much as
possible of his own equipment. Nobody learns very much by watching the
performance of the proverbial "black box." In mathematics the situation is similar:
we do not learn mathematical principles by hearing them mentioned once, no matter
how elegantly; we need to live with them and use them. Therefore, in this book
certain extremely powerful theorems have been proved long before being stated.
That is, the proof has been presented, in the form of a method of solving a certain
class of problems; and after the student has learned the idea by using it on many
problems, we have summed up the situation by stating the general theorem that the
proof proves. This scheme costs very little time, even in the short run; and in the
long run it is likely to save a great deal of time. The point is that if we allow recipes
to take the place of ideas, in a first course, then the ideas need to be taught all over
again later; and the second attempt may be harder, because the problem-solving
motivation for these particular ideas has already been used up.
There are good reasons for not giving examples of this technique.
It should be understood that the avoidance of black boxes has no particular
connection with the pursuit of logical rigor. Indeed, if we have to choose, it is better
to master an idea in an heuristic form, by using it repeatedly, than to listen once to a
rigorous exposition, and then forget it.

4. PROBLEMS

In a quick examination of a textbook, it is not a good idea to read the text and skip
the problems; it is better to read the problems and skip the text. The problems
represent the life that the student leads when he studies the course; and any ideas
that do not appear in them are unlikely to be learned, no matter how much preachment
may be devoted to them.
In this book, a variety of problems are used for a variety of purposes. There are:

1) Technical problems, as, for example, in the chapter on the technique of integration.
These are carefully graded, and often they form sequences, in which the answer
to one problem can be used in the solution of the next.
2) Theoretical problems, some easy, some hard. Vigorous attempts have been
made to find easy ones, so as to avoid a dichotomy between techniques (which the
student really uses) and "theory" (of which he is intermittently a spectator).
3) Puzzle problems.
4) Sketching exercises, in which the student is asked to translate back and forth
between analytic ideas and visual images.
5) Discovery problems, which anticipate, in special cases, ideas which will later
be explained in the text.

There is wide general agreement on the content of the first year course in college
calculus; and in writing the Elements, the author was in the happy position of
working on the basis of a consensus with which he was fully in sympathy. But there is
no such general agreement on the content of a course in intermediate calculus. In
the past decade, calculus courses have tended to grow, by including various topics
from advanced calculus and linear algebra. But it is not easy to decide which of these
topics should be included, and what relative stress should be placed on them; and in
fact there is no reason to suppose that such questions have unique answers.
On the other hand, every book and every course must make some choices, and
then stick to them long enough to permit a valid learning process. If the pursuit of
flexibility turns an intermediate calculus book into an anthology, then its little pieces
are unlikely to have any lasting effect. For example, if the treatment of infinite series
is sketchy, then its residuum in the mind of the student may include hardly more than
the ratio test. And the dangers presented by brief treatments of linear algebra are
worse.
Modern algebra is modern because its motivations and its applications came late.
Today, there are very good reasons for studying groups, rings, fields, vector spaces,
normed vector spaces, inner product spaces, linear transformations, matrices, and so
on. But the logical simplicity of the rudiments of these theories is misleading. For
example, the manipulative process of multiplying matrices can be taught to almost
anybody, at almost any level; but the significant applications of this process are
another matter entirely. In a short treatment of axiomatic and linear algebra, at the
freshman or sophomore level, we cannot presuppose knowledge of the significant
applications, and we have no time in which to present them. Thus we may fall into a
peculiar form of use-mention confusion: the reader hopes that the ideas of modern
algebra are going to be used, but in the end he sees that they have merely been
mentioned.
For these reasons we have tried, throughout, never to state an algebraic definition
until the reader already knows at least one important instance of the idea that the
definition describes; and once an algebraic idea has been introduced, we have tried,
throughout, to put it to work for the purposes that it is good for. Thus, for example,
matrices are introduced as a shorthand for handling linear transformations; and
thereafter the treatment of the two is closely tied together. The Schwarz inequality is
first introduced (on page 521) as a theorem in Cartesian three-space, and for this
case it is proved by the trivial observation that $\cos^2 \theta \le 1$ for every $\theta$. Later, on
page 536, it is proved in the general case, and thereafter it is used in a great variety
of ways, to trivialize problems which would not otherwise be trivial. It appears in
disguised forms in many problems (which should not be listed here). These examples
are typical of the style of Chapters 11 through 13. It appears to the author that the
nature, the purposes, and the power of algebraic methods are not likely to be
understood unless they are conveyed to the student by some such extended experience.
The most impressive, but also the most difficult, of these applications occurs in
Chapter 12, on Fourier series. This topic is not ordinarily included in intermediate
courses; and if something must be omitted, in teaching a course from this book,
Chapter 12 is an excellent candidate for omission. (None of the material in it is used
later.)
Chapters 1 through 13 amount to more than 600 pages; something had to be
shortened; and so the treatment of functions of many variables is shorter than might
have been expected, and there is no separate chapter on differential equations. It
should be noted, however, that there is a substantial treatment of linear differential
equations at the end of Chapter 13, and that the viewpoint of differential equations
has been stressed throughout. (Recall, for example, the treatment of the fundamental
theorem of integral calculus, and of the elementary functions, in the Elements.) In
Chapter 10, the standard method of showing that a given series converges to a given
function is first to show that the series and the function satisfy the same differential
equation, and then to show that the differential equation (with initial condition) has
only one solution. Usually, the series is derived from the differential equation, and so
the student is not likely to be surprised when the same process is applied later to
equations whose solutions were not previously known. For this sort of reason, the
book conveys much more of the spirit and methodology of differential equations
than the table of contents would suggest.
Moreover, it appeared to the author that the natural sequels of the material in
Chapter 14 would grow exponentially more difficult, and that they rightly belong in
an advanced calculus course. The hard fact is that multivariate calculus, once we get
past its beginnings, is not an elementary subject; and if we try to make it seem
elementary, we are likely to give up both intuition and logic in favor of a bewildering
formalism. Thus it appeared, at the end of Chapter 14, that we should say either
much more, or no more at all; and since every book, even a calculus book, has got
to end somewhere, the choice was clear.
The above discussion is an attempt to indicate some of the author's objectives,
and some of the methods used in pursuing them. Obviously no such discussion can
prove anything about the extent of the contribution that the text makes to the
achievement of these objectives. A great deal has happened, in the teaching of
calculus, in the past decade, and it remains to be seen how much more can be
accomplished, and how.

New York City, N. Y.
October 1971

E. E. M.

Contents

Chapter 1  Inequalities

1.1   Introduction
1.2   Products which are equal to zero
1.3   Order
1.4   Absolute values. Intervals on the number line

Chapter 2  Analytic Geometry

2.1   Introduction
2.2   Coordinate systems. The distance formula
2.3   The graph of a condition. Equations for circles
2.4   Equations of lines. Slopes, parallelism, and perpendicularity
2.5   Graphs of inequalities. And, or, and if ... then
2.6   Parabolas
2.7   Tangents
2.8   A shorthand for sums
2.9   The induction principle and the well-ordering principle
2.10  Solution of the area problem for parabolas

Chapter 3  Functions, Derivatives, and Integrals

3.1   The idea of a function
3.2   The derivative of a function, intuitively considered
3.3   Continuity and limits
3.4   Theorems on limits
3.5   The process of differentiation
3.6   The process of differentiation: roots and powers of functions
3.7   The integral of a nonnegative function
3.8   The derivative of the integral
3.9   Uniformly accelerated motion
*3.10 Proof of the formula for the derivative of the integral

Chapter 4  Trigonometric and Exponential Functions

4.1   Directed angles. Trigonometric functions of angles and numbers
4.2   The law of cosines and the addition formulas
4.3   The derivatives of the trigonometric functions; the differences Δx and
      Δf; the squeeze principle
4.4   The approximation of differences by differentials
4.5   Composition of functions
4.6   The chain rule
4.7   Invertible functions. The inverse trigonometric functions
4.8   Simpson's rule. The computation of π
4.9   Exponentials and logarithms
4.10  The functions ln and exp
4.11  Exponentials and logarithms. The existence of


Chapter 5  The Variation of Continuous Functions

5.1   Intervals on which a function increases, or decreases
5.2   Local maxima and minima, direction of concavity, inflection points
5.3   The behavior of functions at infinity
5.4   The introduction of functions into geometric problems; the use of
      existence theorems as shortcuts
5.5   The use of functional equations as shortcuts
5.6   The completeness of R and the existence of maxima
5.7   The mean-value theorem and the no-jump theorem
5.8   The derivative of one function with respect to another

Chapter 6  The Technique of Integration

6.1   Introduction
6.2   Independent variables and indefinite integrals
6.3   Integrals leading to the logarithm and the inverse secant. Algebraic
      devices
6.4   Integration by parts
6.5   Integration of powers of trigonometric functions
6.6   Integration by substitution
6.7   Algebraic substitutions
6.8   Algebraic devices: completing the square and partial fractions

Chapter 7  The Definite Integral

7.1   The problem of arc length
7.2   The definite integral, defined as a limit of sample sums
7.3   The calculation of volumes, by the method of disks
7.4   The general method of cross sections, and the method of shells
7.5   The area of a surface of revolution
7.6   Moments and centroids. The theorems of Pappus
7.7   Improper integrals
*7.8  The integrability of continuous functions

Chapter 8  The Conic Sections

8.1   Translation of axes
8.2   The ellipse
8.3   The hyperbola
8.4   The general equation of the second degree. Rotation of axes

Chapter 9  Paths and Vectors in a Plane

9.1   Motion of a particle in a plane
9.2   The parametric mean-value theorem; l'Hôpital's rule
9.3   Other forms of l'Hôpital's rule
9.4   Polar coordinates
9.5   Areas in polar coordinates
9.6   The length of a path
9.7   Vectors in a plane
9.8   Free vectors
9.9   Velocity, acceleration, and curvature
9.10  Concluding remarks on vector spaces and inner product spaces

Chapter 10  Infinite Series

10.1   Limits of sequences
10.2   Infinite series. Convergence. Comparison tests
10.3   Absolute convergence. Alternating series
10.4   Estimates of remainders
10.5   Termwise integration of series. Power series for Tan^-1 and ln
10.6   The ratio test for absolute convergence. Applications to power series
10.7   Power series for exp, sin, and cos
10.8   The binomial series
10.9   Taylor series
10.10  Taylor's theorem: Estimates of remainders
10.11  The complex number system
10.12  Sequences and series of complex numbers. The complex exponential
       function
10.13  De Moivre's theorem
*10.14 The radius of convergence. Differentiation of complex power series
*10.15 Integration and differentiation of real power series

Chapter 11  Vector Spaces and Inner Products

11.1   Cartesian coordinate systems in three-dimensional space
11.2   Direction cosines. The directed normal form
11.3   Three-dimensional space, regarded as an inner-product space
11.4   The dimension of a vector space. Various ways to form a basis
11.5   Orthonormal bases
11.6   The Schwarz inequality. More general concepts of norm and distance

Chapter 12  Fourier Series

12.1   Projections into a subspace, trigonometric polynomials and Fourier
       series
12.2   Uniform approximations by trigonometric polynomials
12.3   Integration of Fourier series. The uniform convergence theorem

Chapter 13  Linear Transformations, Matrices, and Determinants

13.1   Linear transformations
13.2   Composition of linear transformations and multiplication of matrices
13.3   Formal properties of the algebra of matrices. Groups and rings
13.4   The determinant function
13.5   Expansions by minors. Cramer's rule and inversion of matrices
13.6   Row and column operations. Linear independence of sets of functions
13.7   Linear differential equations
13.8   The dimension theorem for the space of solutions. The nonhomogeneous
       case

Chapter 14  Functions of Several Variables

14.1   Surfaces and solids in R^3
14.2   The quadric surfaces
14.3   Functions of two variables. Slice functions and partial derivatives
14.4   Directional derivatives and differentiable functions
14.5   The chain rule for paths
14.6   Differentiable functions of many variables. The chain rule
14.7   Directional derivatives and gradients. Level curves
14.8   Interior local maxima and minima, for functions of two variables
14.9   Line integrals
14.10  Double integrals, intuitively considered
14.11  Cylindrical coordinates in space. The definition of the integral
14.12  Moments and centroids of nonhomogeneous bodies

Appendix A  The Shorthand of Logic and Set Theory
Appendix B  Algebraic Operations with Limits of Functions
Appendix C  Algebraic Operations with Limits of Sequences
Appendix D  The Error in the Approximation Δf ≈ df
Appendix E  The Continuity of Composite Functions
Appendix F  The Error in Simpson's Rule
Appendix G  The Idea of a Measurable Set
Appendix H  Proof of the Northeast Theorem
Appendix I  Proof of the Formula for Path Length
Appendix J  A Method for Constructing the Complex Numbers
Appendix K  Iterated Limits. Mixed Partial Derivatives
Appendix L  Possible Peculiarities of Functions of Two Variables
Appendix M  Maxima and Minima for Functions of Two Variables
Appendix N  An Exact Definition of the Idea of a Function

Selected Answers
Index


Inequalities

1.1 INTRODUCTION

In this book it is assumed that you know elementary geometry and the algebra of the
real number system. Theorems of plane geometry will be used only occasionally, and
there is no need to reexamine the subject as a whole.
Inequalities, however, are another matter. We shall be using them constantly,
and they are tricky. We shall therefore handle them with care. To derive the laws that
govern them we first need to recall the elementary laws of the number system. These
are as follows.
We have given the set R of real numbers, with the operations of addition and
multiplication. Thus the number system is a triplet

    [R, +, ·].

Addition and multiplication are subject to the following laws:

Closure. For every a and b in R, a + b and ab are in R.

Associativity. For every a, b, and c,

    a + (b + c) = (a + b) + c,
and
    a(bc) = (ab)c.

Commutativity. For every a and b,

    a + b = b + a    and    ab = ba.

Distributive Law. For every a, b, and c,

    a(b + c) = ab + ac.

Existence of 0 and 1. There are two different numbers 0 and 1 such that

    a + 0 = a    and    a · 1 = a

for every a.

Existence of Negatives. For every a there is a number -a such that a + (-a) = 0.

Existence of Reciprocals. For every a ≠ 0 there is a number 1/a such that

    a · (1/a) = 1.

These laws are called the field postulates; and any number system which satisfies
them is called a field. There are many such number systems: the real numbers form a
field, and so do the complex numbers. For a long time to come, however, we shall be
working only with the real numbers. Therefore, when we speak of numbers, we mean
real numbers, unless the contrary is stated.

We shall assume not only the field postulates but also the familiar laws based on
them.

For example, we know that (a - b)(a + b) = a^2 - b^2, and that a · 0 = 0 for every a.

1.2 PRODUCTS WHICH ARE EQUAL TO ZERO

When we perform calculations, we shall not stop to justify them on the basis of the
field postulates. But the following principle is worth special mention, because it is
used in reasoning processes which don't involve calculations:


Theorem 1. If ab = 0, then either a = 0 or b = 0.

Proof.
1) If a = 0, there is nothing to prove.
2) If a ≠ 0, then a has a reciprocal. Therefore

(1/a)(ab) = (1/a) · 0,
1 · b = 0,

and
b = 0.

Thus either a = 0 or b = 0.

Obviously it is possible that a and b are both = 0. In Theorem 1 (and everywhere
else in mathematics) when we say either ... or ..., we allow the possibility of both.

PROBLEM SET 1.2

1. Show that if x^2 = 0, then x = 0.

2. a) Obviously the numbers 1 and -1 are roots of the equation

(x - 1)(x + 1) = 0.

How do you know that no other number is a root of the equation?
b) Show that 2 and 3 are the only roots of the equation

x^2 - 5x + 6 = 0.

3. If 0 had a reciprocal, then its reciprocal would be a root of the equation

0 · x = 1.

Show that this equation has no root.

4. a) If ab = ac, does it follow that b = c? Why or why not?
b) If ab = ac, and a ≠ 0, does it follow that b = c? Why or why not?

5. a) Show that if abc = 0, then a = 0 or b = 0 or c = 0.
b) Show that 1, 2, and 3 are the only roots of the equation

x^3 - 6x^2 + 11x - 6 = 0.

6. a) If a^2 = b^2, does it follow that a = b? Why or why not?
b) If a^2 = b^2, what can you conclude about the relation between a and b? Why?

7. Under what conditions (if any) is it true that

1/x + 1/a = 1/(x + a)?

8. a) Under what conditions (if any) is it true that

(a + b)^2 = a^2 + b^2?

b) Under what conditions (if any) is it true that

(a + b)^3 = a^3 + b^3?

*9. Consider the "number system" which has only two elements 0 and 1, with addition and
multiplication defined by the following tables:

[Tables: addition and multiplication tables for the elements 0 and 1]

Which of the field postulates hold true, in this system? Which, if any, fail to hold?
(The answer to this question suggests that the field postulates are not, in themselves,
a very adequate description of the real number system.)
*10. Consider the number system in which the "numbers" are 0, 1, 2, and 3, with addition
and multiplication defined by the following tables:

[Tables: addition and multiplication tables for the elements 0, 1, 2, and 3]

Exactly one of the field postulates fails to hold in this number system. Find out which
one. [Hint: Don't bother to test the Associative and Distributive Laws; in fact, they
hold true in this system, although the verifications are extremely tedious.]
Does Theorem 1 hold true in this system? Why or why not?

1.3 ORDER

We think of the real numbers as being arranged on a line, like this:

[Figure: the number line, with -√3, -1, and π among the marked points]

When we write a < b, this means (roughly speaking) that a lies to the left of b on the
number line. Thus what we have in mind is a system

[R, +, ·, <],

where < is a relation having the following properties:

O.1. (Trichotomy) For every a and b in R, one and only one of the following
conditions holds:

a < b, or a = b, or b < a.

O.2. (Transitivity) If a < b and b < c, then a < c.

A relation satisfying O.1 and O.2 is called an order relation, and an expression of
the form a < b is called an inequality. We write b > a to mean a < b; a ≤ b means
that either a < b or a = b; and a ≥ b means that either a > b or a = b. A number
a is positive if a > 0; a is negative if a < 0. Zero is neither positive nor negative.

But O.1 and O.2 do not, by themselves, enable us to handle inequalities. We
need to know how < is related to + and ·. The laws are the following:

MO. If a > 0 and b > 0, then ab > 0.

AO. If a < b, then a + c < b + c for every c.

These four laws, in combination, tell the whole story: all of the elementary laws
of inequalities can be derived from them. You will carry out this process, in the
following problem set. Meanwhile we state the theorems without proof.

Theorem 1. If a > 0, then -a < 0.

Theorem 2. If a < 0, then -a > 0.

Theorem 3. If a < b, and c < d, then a + c < b + d.

Theorem 4. An inequality is preserved if both sides are multiplied by the same positive
number. That is, if a < b and c > 0, then ac < bc.

Similarly,

Theorem 5. An inequality is preserved if both sides are divided by the same positive
number. That is, if ac < bc and c > 0, then a < b.

Theorem 6. An inequality is reversed if both sides are multiplied by the same negative
number. That is, if a < b and c < 0, then ac > bc.

Theorem 7. An inequality is reversed if both sides are divided by the same negative
number. That is, if bc < ac, and c < 0, then b > a.
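The theorems above are stated without proof, but they are easy to spot-check numerically. The following is only a sketch, not a proof: it tests Theorems 3, 4, and 6 on a small grid of exact rational numbers. The grid SAMPLES and the function name check_order_theorems are hypothetical choices made for this illustration.

```python
from fractions import Fraction
from itertools import product

# A small grid of exact rational test values (a hypothetical sample, not from the text).
SAMPLES = [Fraction(n, 2) for n in range(-6, 7)]


def check_order_theorems(values):
    """Return True if Theorems 3, 4, and 6 hold for every choice of a, b, c, d from values."""
    for a, b, c, d in product(values, repeat=4):
        # Theorem 3: a < b and c < d  =>  a + c < b + d
        if a < b and c < d and not (a + c < b + d):
            return False
        # Theorem 4: a < b and c > 0  =>  ac < bc
        if a < b and c > 0 and not (a * c < b * c):
            return False
        # Theorem 6: a < b and c < 0  =>  ac > bc
        if a < b and c < 0 and not (a * c > b * c):
            return False
    return True


print(check_order_theorems(SAMPLES))
```

Exact fractions are used instead of floating-point numbers so that rounding cannot blur a strict inequality. A check of this kind can expose a misstated law, but it cannot replace the derivations from O.1, O.2, MO, and AO.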

Consider now an inequality involving an unknown number x, for example,

3x + 4 < 5x + 7.

An expression like this, involving a variable, is called an open sentence; in an open
sentence, x marks the spot where numbers are to be inserted. Some numbers, when
substituted for x, may give true statements, and other numbers may give false
statements. For example,

3 · 2 + 4 < 5 · 2 + 7

is true, because 10 < 17; but

3(-5) + 4 < 5(-5) + 7

is false, because -11 > -18.

In simple cases like this, it is easy to find out what numbers satisfy the inequality.
If

3x + 4 < 5x + 7,     (1)

then

4 < 2x + 7,     (2)

by AO. (We have added -3x to each side of the inequality.) Therefore

-3 < 2x,     (3)

by AO; and so

x > -3/2,     (4)

by Theorem 4. (We have multiplied, on each side, by 1/2, and then written the inequality
backwards, to put x on the left.)
Thus every number which satisfies (1) also satisfies (4). And all of our steps can
be reversed. If

x > -3/2,     (4)

then

-3 < 2x,     (3)

by Theorem 4; therefore

4 < 2x + 7,     (2)

by AO; and so

3x + 4 < 5x + 7,     (1)

by AO. Therefore every number which satisfies (4) also satisfies (1). We can sum all
this up briefly by writing

3x + 4 < 5x + 7  <=>  x > -3/2.

Here the symbol <=> is pronounced "is equivalent to." When we write <=> between
two inequalities (or any two open sentences of any kind) we mean that whenever one
of them is satisfied, so is the other.
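Equivalence of two open sentences means that they give the same truth value for every x. As a rough illustration (a finite sample, so evidence rather than proof), the sketch below compares the two sentences on a grid of exact rationals that includes the boundary value -3/2. The function names and the sample grid are assumptions made for this example.

```python
from fractions import Fraction


def satisfies_original(x):
    """Truth value of the open sentence 3x + 4 < 5x + 7."""
    return 3 * x + 4 < 5 * x + 7


def satisfies_solved(x):
    """Truth value of the open sentence x > -3/2."""
    return x > Fraction(-3, 2)


# Quarter-steps from -10 to 10; the boundary value -3/2 is one of the samples.
samples = [Fraction(n, 4) for n in range(-40, 41)]
agree = all(satisfies_original(x) == satisfies_solved(x) for x in samples)
print(agree)
```

At x = 2 both sentences are true, and at x = -5 both are false, which matches the two substitutions worked out in the text.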

We use a single-headed arrow to indicate that one condition implies another.
For example,

x > 0  =>  x^2 > 0.

This is true. (Why?) But

x > 0  <=>  x^2 > 0     (?)

is false, because -1 satisfies the second inequality but not the first. Similarly,

a = b  =>  a^2 = b^2

is true, but

a = b  <=>  a^2 = b^2     (?)

is false, because if a ≠ 0 and b = -a, then the second equation holds, but the first
does not.
The shorthand symbols <=> and => are worth learning and using. The reason is
that when we write down strings of formulas, in solving a problem, we ought to
indicate what the connection between them is supposed to be. We are more likely to
do this if we have a way of doing it briefly.
Using the symbols => and <=>, we can restate some of the theorems of this section
in a more efficient way. For example, AO says that

a < b  =>  a + c < b + c.     (5)

And given a + c < b + c, we can add -c to both sides, preserving the inequality.
Therefore

a + c < b + c  =>  a < b.     (6)

These fit together to give:

The Addition Law of Order. a < b  <=>  a + c < b + c.

We shall refer to this, for short, as ALO. Similarly, Theorem 4 says that

for c > 0,  a < b  =>  ac < bc.

Theorem 5 says that

for c > 0,  ac < bc  =>  a < b.

These fit together to give:

The Multiplication Law of Order. For c > 0, a < b  <=>  ac < bc.

This will be referred to as MLO. Theorems 6 and 7 say that

for c < 0,  a < b  =>  ac > bc,     (7)

and

for c < 0,  bc < ac  =>  b > a.     (8)

And (8) can be rewritten in the form

for c < 0,  ac > bc  =>  a < b.     (8')

Thus Theorems 6 and 7 fit together to give:

Reversal of Order. For c < 0, a < b  <=>  ac > bc.

We sum all this up in the short form below. The meanings of the
abbreviations should be plain.

Trich. For every a and b in R, one and only one of the following
conditions holds: a < b, or a = b, or b < a.

Trans. a < b and b < c  =>  a < c.

MO. a > 0 and b > 0  =>  ab > 0.

AO. a < b  =>  a + c < b + c.

Theorem 1. a > 0  =>  -a < 0.

Theorem 2. a < 0  =>  -a > 0.

Theorem 3. a < b and c < d  =>  a + c < b + d.

ALO. a < b  <=>  a + c < b + c.

MLO. For c > 0, a < b  <=>  ac < bc.

RO. For c < 0, a < b  <=>  ac > bc.

The last three of these are convenient in solving inequalities; they enable us to
use <=> at each stage, instead of working first forward and then backward.
For example, the solution of the illustrative problem above can now be written like this:

3x + 4 < 5x + 7
<=>  4 < 2x + 7      by ALO
<=>  -3 < 2x         by ALO
<=>  -3/2 < x        by MLO
<=>  x > -3/2        by definition of >.

A linear inequality is said to be solved when we find an equivalent inequality of the
form x < a or x > a.
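The chain of equivalences above is mechanical enough to sketch in code. The function below is an illustrative assumption, not the text's own procedure: it solves a*x + b < c*x + d by collecting the x terms on the left with ALO and then dividing by the coefficient, using MLO when the coefficient is positive and RO when it is negative. The text instead keeps the coefficient of x positive, but both routes reach the same solved form.

```python
from fractions import Fraction


def solve_linear_inequality(a, b, c, d):
    """Solve a*x + b < c*x + d for x.

    Returns ('<', bound) meaning x < bound, or ('>', bound) meaning x > bound.
    """
    a, b, c, d = map(Fraction, (a, b, c, d))
    # ALO, used twice: add -c*x and -b to both sides, giving (a - c) * x < d - b.
    coeff = a - c
    rhs = d - b
    if coeff == 0:
        raise ValueError("the x terms cancel; this is not a linear inequality in x")
    bound = rhs / coeff
    # MLO: multiplying by 1/coeff > 0 preserves the direction.
    # RO: multiplying by 1/coeff < 0 reverses the direction.
    return ("<", bound) if coeff > 0 else (">", bound)


# The illustrative problem from the text: 3x + 4 < 5x + 7.
print(solve_linear_inequality(3, 4, 5, 7))
```

For the illustrative problem the coefficient is 3 - 5 = -2, so RO applies and the result is x > -3/2, in agreement with the chain above.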

PROBLEM SET 1.3

Solve the following inequalities, by writing a chain of equivalent inequalities, and giving
on the right the reason for each step, as in the text.

1. 5 - 3x > 17 + x
2. 5x - 3 < 17x + 1
3. 5x + 3 > 17x + 1
4. 5 + 3x < 17 + x
5. -3x - 7 < x + 5
6. -4x - 8 < 2x + 6
7. 6x - 10 > 5x + 3
8. 3 - 2x < 4 - 3x
9. 2x - 6 < 2 - 2x
10. 6x - 2 < 3 + x
11. 2x + 6 < 3 + x
12. 6(x - 2) > x - 3

In the following problems, we develop the theory in which all of the results of
this section are derived from Trich., Trans., MO, and AO. Therefore, at the start, these
are the only statements that can be given as reasons in proofs. In each problem, however,
you may assume that the results given in the preceding problems are known and you may
cite them as reasons.

13. Following are the steps in the proof of Theorem 1. Complete the proof by giving a
reason for each step.

a) a > 0  =>  0 < a
b) 0 < a  =>  -a + 0 < -a + a
c) -a + 0 < -a + a  =>  -a < 0
d) a > 0  =>  -a < 0

14. Following is an outline of the proof of Theorem 2. Complete the proof by giving a
reason for each =>.

a < 0  =>  -a + a < -a + 0
       =>  0 < -a
       =>  -a > 0.

15. a) Give a reason for the statement

a < b  =>  a + c < b + c.

b) Similarly, for

c < d  =>  b + c < b + d.

c) Prove Theorem 3.

16. a) Show that

a < b  <=>  b - a > 0.

(More than one step is needed here.)

b) Give a reason for the statement

b - a > 0 and c > 0  =>  (b - a)c > 0.

c) Prove Theorem 4.

17. Show that

x^2 ≥ 0, for every x.

[Hint: By Trich., there are three cases to be considered: x > 0, or x = 0, or x < 0.
Show that in each of these cases we have either x^2 > 0 or x^2 = 0.]

18. Show that

y^2 - 2y + 1 ≥ 0, for every y.

19. a) Everybody knows that 1 > 0. Prove it, on the basis of the theory that we have
developed so far. (You may assume, of course, that 1 ≠ 0.)

b) Show that

a > 0  =>  1/a > 0.

That is, the reciprocal of every positive number is positive. [Hint: By Trich., it
will be sufficient to show that the conditions 1/a = 0 and 1/a < 0 are impossible.
Remember that a · (1/a) = 1.]

20. Show that

c > 0 and ac < bc  =>  a < b.

(This is Theorem 5.)

21. Give the reason for each step in the following proof of Theorem 6.

a < b and c < 0
=>  b - a > 0 and c < 0
=>  b - a > 0 and -c > 0
=>  (b - a)(-c) > 0
=>  ac - bc > 0
=>  ac > bc.

22. Give the reason for each step in the following proof of Theorem 7.

bc < ac and c < 0
=>  bc < ac and -c > 0
=>  bc < ac and 1/(-c) > 0
=>  ac - bc > 0 and 1/(-c) > 0
=>  (1/(-c))(ac - bc) > 0
=>  b - a > 0
=>  b > a.

23. Is there a positive number which is smaller than all other positive numbers? Why or
why not?

24. Is there a negative number which is larger than all other negative numbers? Why
or why not?

*25. Is it possible to define, for the complex numbers, a relation < which obeys the laws
O.1 and O.2? (That is, can an order relation be defined for the complex numbers?)
Why or why not?

*26. Is it possible to define, for the complex numbers, a relation < which satisfies not only
O.1 and O.2 but also MO and AO? [Hint: Since i ≠ 0, we must have i > 0 or -i > 0.]

The language in which these problems are stated ought to suggest what the answers are.
The answer to Problem 26 indicates why it is that arranging the complex numbers in an
order is not a useful proceeding. In the complex number system, no theory of inequalities
can be made to work.

1.4 ABSOLUTE VALUES. INTERVALS ON THE NUMBER LINE

The absolute value |x| of a number x is defined by the following two conditions:

1) If x ≥ 0, then |x| = x.
2) If x < 0, then |x| = -x.

Thus under Condition (1) we have

|2| = 2,

and under Condition (2) we have

|-2| = -(-2) = 2.

Thus the operation | | leaves positive numbers unchanged, and replaces each negative
number by the corresponding positive number. On this basis it is easy to see that the
following theorem holds.
Theorem 1. For every x, |x| ≥ 0.

Proof. There are two cases to consider.

Case 1. x ≥ 0. Here |x| = x, by definition of |x|. Therefore |x| ≥ 0 in Case 1.

Case 2. x < 0. Here |x| = -x, by definition of |x|; and -x > 0, by Theorem 2
of the preceding section. Therefore |x| > 0 in Case 2.

Thus in each case we have |x| ≥ 0.


Theorem 2. For every x, |x|^2 = x^2.

This is true because |x| is either x or -x, and (-x)^2 = x^2.

A number x is a square root of a number a if x^2 = a. For each a > 0, √a is the
positive square root of a. Thus, for example, 9 has two square roots, 3 and -3;
and √9 is 3, which is the positive square root. We define √0 = 0. Here and hereafter,
we are assuming that positive numbers have roots of all orders: square roots, cube
roots, and so on.

Theorem 3. For every x, |x| = √(x^2).

Proof. By Theorem 2, |x|^2 = x^2, and so |x| is a square root of x^2. By Theorem 1,
|x| ≥ 0. Therefore |x| = √(x^2), by definition of √.

Theorem 4. For every x, |-x| = |x|.

This is true because |-x| = √((-x)^2) = √(x^2) = |x|.

Theorem 5. For every x and y, |xy| = |x| · |y|.

This is true because |xy| = √((xy)^2) = √(x^2 y^2) = √(x^2) √(y^2) = |x| · |y|.

Theorem 6. For every x, x ≤ |x|.

Here, as in the proof of Theorem 1, we need to consider two cases.

Case 1. x ≥ 0. Here x ≤ |x|, because |x| = x.

Case 2. x < 0. Here |x| = -x, and -x > 0. (Why?) Thus x < 0 < -x = |x|,
and so x < |x|.

Theorem 7. (The triangular inequality) For every x and y,

|x + y| ≤ |x| + |y|.

The trouble with this theorem, if we try to prove it by brute force, is that there
are too many cases to consider: each of the numbers x and y may or may not be
negative; and if x and y have different signs, x + y may or may not be negative. It
turns out, however, that we can get a proof by examining only two cases:

Case 1. x + y ≥ 0. In this case

|x + y| = x + y.

Since x ≤ |x| and y ≤ |y|, we have

x + y ≤ |x| + |y|,

and so

|x + y| ≤ |x| + |y|.

Case 2. Suppose that x + y < 0. Then -x - y > 0. Therefore

|x + y| = |-x - y| = |(-x) + (-y)| ≤ |-x| + |-y|,

by the result of Case 1. Since |-x| = |x| and |-y| = |y|, we have

|x + y| ≤ |x| + |y|,

which was to be proved.
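The two-condition definition of |x| translates directly into code, and the theorems of this section can then be spot-checked on sample values. This is only a sketch under stated assumptions: the name absolute_value and the sample grid are hypothetical, and a finite check illustrates the theorems without proving them.

```python
from fractions import Fraction
from itertools import product


def absolute_value(x):
    """|x| by the two defining conditions: x itself if x >= 0, and -x if x < 0."""
    return x if x >= 0 else -x


def theorems_hold(values):
    """Check Theorems 1, 2, 4, 5, 6, and 7 for every pair (x, y) from values."""
    for x, y in product(values, repeat=2):
        ok = (
            absolute_value(x) >= 0                                             # Theorem 1
            and absolute_value(x) ** 2 == x ** 2                               # Theorem 2
            and absolute_value(-x) == absolute_value(x)                        # Theorem 4
            and absolute_value(x * y) == absolute_value(x) * absolute_value(y) # Theorem 5
            and x <= absolute_value(x)                                         # Theorem 6
            and absolute_value(x + y) <= absolute_value(x) + absolute_value(y) # Theorem 7
        )
        if not ok:
            return False
    return True


# Thirds from -3 to 3, so that both signs and zero are covered (a hypothetical sample).
samples = [Fraction(n, 3) for n in range(-9, 10)]
print(theorems_hold(samples))
```

Theorem 3 is left out on purpose: computing √(x^2) with floating-point arithmetic would introduce rounding, while the checks above stay exact.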


Theorem 8. Given d > 0. Then

|x| < d  <=>  -d < x < d.

[Figure: the points x with |x| < d, between -d and d on the number line]

This is geometrically obvious: |x| is "the distance between 0 and x, on the number
line"; and the points that lie within a distance d of the origin are the numbers between
-d and d. We get a more general result by using any given point a instead of the
origin.

Theorem 9. Given d > 0, and any number a. Then

|x - a| < d  <=>  a - d < x < a + d.

[Figure: the points x with |x - a| < d, between a - d and a + d on the number line]

Proof. In Theorem 8, substitute x - a for x. This gives

|x - a| < d  <=>  -d < x - a < d.

And

-d < x - a < d  <=>  a - d < x < a + d.

(Reason?)
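Theorem 9 says that two different-looking conditions pick out the same numbers. The sketch below compares them on sample values; the particular center a = 3, the distance d = 1/2, and the function names are assumptions chosen for illustration.

```python
from fractions import Fraction


def within_distance(x, a, d):
    """The condition |x - a| < d."""
    return abs(x - a) < d


def between_endpoints(x, a, d):
    """The condition a - d < x < a + d."""
    return a - d < x < a + d


a, d = Fraction(3), Fraction(1, 2)                 # hypothetical center and distance
samples = [Fraction(n, 8) for n in range(0, 49)]   # eighths from 0 to 6
same_solution_set = all(
    within_distance(x, a, d) == between_endpoints(x, a, d) for x in samples
)
print(same_solution_set)
```

The endpoints a - d = 5/2 and a + d = 7/2 are among the samples, and both conditions reject them, since the inequalities are strict.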
If a < b, then the set of all numbers between a and b is called an open interval,
and is denoted by (a, b).

[Figure: the open interval (a, b) on the number line]

There is a shorthand for this sort of statement:

(a, b) = {x | a < x < b}.

The expression on the right denotes the set of all objects that satisfy the condition
following the vertical bar. This is called the solution set of the open sentence a < x < b.
Similarly, the set of all positive numbers is the solution set of the open sentence
x > 0; this is denoted by {x | x > 0}. Thus two open sentences are equivalent if
they have the same solution set.
Sometimes it turns out that an open sentence never gives a true statement, no
matter what we substitute for x. In such cases, the solution set is empty. The empty
set is denoted by { }. For example,

{x | √(x^2) = x - 1} = { }.

The notation { } is designed to suggest its meaning: we describe sets in the brace
notation; and when there is nothing written between the braces, this means that the
set has nothing in it.
If we add to the open interval (a, b) the endpoints a and b, we get a closed interval,
denoted by [a, b].

[Figure: the closed interval [a, b] on the number line]

Thus

[a, b] = {x | a ≤ x ≤ b}.

We shall also be dealing with "infinite intervals." In the first figure below,
the "infinite interval" is

(a, ∞) = {x | a < x}.

Similarly,

(-∞, a) = {x | x < a},

as shown in the second figure below.

[Figure: the interval (a, ∞) on the number line]
[Figure: the interval (-∞, a) on the number line]

This notation, in which "∞" is used as if it denoted a number, is not very logical,
but it is convenient. To keep track of the notation, you should think of fictitious
"numbers" -∞ and ∞ as the "ends" of the number line, as shown below.

[Figure: the number line with -∞ and ∞ marked at its ends]

We also use "half-open" intervals:

[a, b) = {x | a ≤ x < b} and (a, b] = {x | a < x ≤ b},

[Figure: the intervals [a, b) and (a, b] on the number line]

and "closed infinite" intervals:

[a, ∞) = {x | x ≥ a}, and (-∞, a] = {x | x ≤ a}.

[Figure: the intervals [a, ∞) and (-∞, a] on the number line]
Finally, we may refer to the whole real number system R as the interval (-∞, ∞).
Thus we have a total of nine kinds of interval:

(a, b), (a, b], [a, b), [a, b],
(a, ∞), (-∞, a), [a, ∞), (-∞, a],
(-∞, ∞).
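All nine kinds of interval can be described by two endpoints together with a note on whether each endpoint is included. The class below is a minimal sketch of that idea; the name Interval and its field names are hypothetical, and Python's floating-point infinity stands in for the fictitious "numbers" -∞ and ∞.

```python
from dataclasses import dataclass
from math import inf


@dataclass(frozen=True)
class Interval:
    """An interval with endpoints low and high, each either included or excluded."""
    low: float
    high: float
    low_closed: bool = False
    high_closed: bool = False

    def __contains__(self, x):
        # Membership is exactly the defining condition in the brace notation.
        above = x >= self.low if self.low_closed else x > self.low
        below = x <= self.high if self.high_closed else x < self.high
        return above and below


open_ab = Interval(1, 3)                          # (1, 3)
closed_ab = Interval(1, 3, True, True)            # [1, 3]
half_open = Interval(1, 3, low_closed=True)       # [1, 3)
closed_ray = Interval(1, inf, low_closed=True)    # [1, ∞)
whole_line = Interval(-inf, inf)                  # (-∞, ∞)

print(1 in open_ab, 1 in closed_ab, 3 in half_open, 10**6 in closed_ray, 0 in whole_line)
```

The infinite endpoints are always treated as excluded, which matches the text's remark that ∞ only behaves as if it denoted a number.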

In some of the problems below, you may find it convenient to use the following:

Theorem 10. If |x| = |y|, then x = y or x = -y.

Proof.

|x| = |y|  =>  |x|^2 = |y|^2
           =>  x^2 = y^2
           =>  x^2 - y^2 = 0
           =>  (x - y)(x + y) = 0
           =>  x = y or x = -y.

(The converse is obvious.)


PROBLEM SET 1.4
Describe each of the following sets in the interval notation. Your answers should be in

a form

like the following:

{xi 3x+4

<

2. {x I 3 - x > x - 3}
4. {x I Ix - 31 2}
6. {x I 2x+3 6x - 4}
8. {x I 12+xi < O}

{x J 3x+4 > 4x+5}


{x J lxl < 1}
{x I Ix - 51 < 5}
{x I 11 - xi 2}

9. a) Is it true that

Yx2

oo .

1.3 of the text.)

(This is the example discussed in Section

1.
3.
5.
7.

5x+7} = (-f,

x for every x?
.

Why or why not? Describe the set

{x I Yx2 = x},

in the interval notation.


b) Describe the set

{xi Y(x+1)2=x+1},
in the interval notation.
Find out for what numbers

(if any) each of the following conditions holds. In each

case in which the solution set is an interval, the answer should be given in the interval
notation.

10.
12.
14.
16.

Yx2- 2x+1 = x - 1
lx2 - 5x+61=x2 - 5x+6
Ix+11 =11 - xi
Yx2-l=x

11.
13.
15 .
17.

lx2- 5x+61 = Ix -31 Ix - 21


Ix - 51 = l2x - 31
vx2+1 = x
l2x -lj +Ix+31 l3x+21

19. ,12x - x2 = 1
21. Ix + ll +l2x +31

18. 1 7x +31 +1 3 - xi 6 Ix + ll
20. l2x - x21
22. Ix

x +2x2

>

ll lx2 +xi = lx3 - xi

Indicate graphically, on a number scale, the places where the following conditions hold;
describe the graphs in the interval notation if possible.

23. lxl

25. l2x - 31 ;;;; i


27. 13 - 2xl ;;;; i
29. 1 1 - xi ;;;; 2
3 1. Ix 11 < 2

33.

24. Ix - 21 <1
26. Ix - 11 < -
28. Ix 21 < :l
30. l2x - 41 < 1
32. 1 2x - 1 1 1

<2

lxl -

and

a) Show that if

0, then

lI 11
=

lbl is the number y


lbl 1 1/bl
l.)

b) Show that if

34.

(There is a short proof.)


b) Show that for every

l,bl
a

and

and

b,
b,

x 2

Ix - 21 ;;;; 1

l2x - 11 ;;;; 1

and

such that

lbl

y = I. Therefore

0, then

a) Show that for every

and

(By definition, the reciprocal of


it is sufficient to show that

and (also)

lal

fbl.

la - bl lal - lbl.
la +bl lal - lbl.

(The proof is short.)

35. For what numbers a is the fraction a/|a| defined? What is this fraction equal to, for
various values of a?

36.

Sketch

{x I Ix - 21 + 1 7 - xi

5}

on the number line, and describe this set in the interval notation.

2 Analytic Geometry

2.1 INTRODUCTION

This chapter includes various topics which serve as a preparation for calculus. Some
of these topics are familiar to you, at least in some form. In such cases you should
still read the text carefully, in order to learn the terminology that will be used hereafter.
2.2 COORDINATE SYSTEMS. THE DISTANCE FORMULA

We shall now apply algebra to the study of geometry. We start with a plane, in the
usual sense of Euclidean geometry; and we suppose that a unit of distance has been
chosen, once for all, so that the distance between two points P and Q is a well-defined
nonnegative number. The distance between the points P and Q is denoted by PQ.
(We say merely that PQ is nonnegative, rather than PQ > 0, because we are allowing
the case P = Q, and in this case PQ = 0.)

To set up a coordinate system in a plane, we first need to assign number-labels to
the points of a line. We choose a point O as the origin; it is given the label 0.
Each point P1 to the right of O is labeled with the distance x1 = OP1, which is positive.
And each point P2 to the left of O is labeled with the number x2 = -OP2, which is
negative. Thus we have a matching scheme, under which each point of the line is
matched with exactly one real number.

[Figure: a line with the points P, Q, R, S, T, U marked at -2, -1, 1, √2, 2, π]

For the points marked in the figure, the matching pairs are

P ↔ -2,    Q ↔ -1,    R ↔ 1,
S ↔ √2,    T ↔ 2,     U ↔ π.

Here the double arrow is pronounced "is matched with." Every such pair has the
form P ↔ x, where P is a point and x is a number. A one-to-one matching scheme,
between the elements of one set and the elements of another, is called a one-to-one
correspondence between the two sets.


If the correspondence is set up in the way that we have just described, then we
can compute the distance between any two points by means of the formula

P1P2 = |x2 - x1|.

Here P1 ↔ x1 and P2 ↔ x2. This distance formula holds no matter how the points
P1 and P2 are situated on the line:

[Figure: several arrangements of the points O, P1, and P2 on the line]

and so on; in every case, P1P2 = |x2 - x1|.

Thus we have a one-to-one correspondence P ↔ x, between the points of the line
and the real numbers, such that the distance formula holds for every pair of points.
Such a correspondence is called a coordinate system for the line. If P ↔ x, then x is
called the coordinate of P.

These ideas are summed up in the following postulate.

The Ruler Postulate. Every line has a coordinate system. And given any two points
O and P of the line, there is a coordinate system in which the coordinate of O is 0
and the coordinate of P is positive.

[Figure: a line with O at coordinate 0 and P at a coordinate x > 0]

On the basis of the ruler postulate, it is easy to set up a coordinate system in the
plane. We take two perpendicular lines X and Y, intersecting in a point O. On each
of the two lines we set up a coordinate system, in such a way that O ↔ 0; that is, the
coordinate of O is zero on each of the lines X and Y. X is called the x-axis, Y is
called the y-axis, and the point O is called the origin.

Given any point P of the plane, we drop a perpendicular from P to the x-axis,
ending at a point M. The point M has a coordinate x, on the line X. If M ↔ x,
then x is called the x-coordinate of P.

[Figure: a point P with perpendiculars dropped to M on the x-axis and to N on the y-axis]

Similarly, we drop a perpendicular from P to the y-axis, ending at a point N.
If N ↔ y, then y is called the y-coordinate of P. Thus we have a matching scheme

P ↔ (x, y)

between the points P of the plane and the ordered pairs (x, y) of real numbers. The
order in which we write the numbers makes a difference. In the left-hand figure below,

P ↔ (1, 2) and Q ↔ (2, 1).

We may speak of "the point (1, 2)" or "the point (x, y)," meaning "the point
which is matched with (1, 2)" or "the point which is matched with (x, y)." Thus we
may write P = (x, y), meaning P ↔ (x, y).

[Figure: the points P and Q plotted in the plane (left); a point P with its projections M and N on the axes (right)]

Obviously x and y are determined when P is known. And P is determined when
x and y are known, because the vertical line through M and the horizontal line
through N intersect in exactly one point. Thus we have a one-to-one correspondence

P ↔ (x, y)

between the points of the plane and the ordered pairs of real numbers. Such a
correspondence is called a coordinate system for the plane. We need to see how the algebra
in this situation is related to the geometry.

Consider first the question of distance. If we know the coordinates (x1, y1) and
(x2, y2) of two points P and Q, then the points are determined, and so the distance
between them is determined. The following theorem gives a formula for the distance.

Theorem 1. If

P = (x1, y1) and Q = (x2, y2),

then

PQ = √((x2 - x1)^2 + (y2 - y1)^2).

Proof. Draw the vertical line through Q and the horizontal line through P, meeting
at the point R. Let S and T be the feet of the perpendiculars to X, from P and Q
respectively. Then

PR = ST,

because opposite sides of a rectangle have the same length. And

ST = |x2 - x1|,

by definition of a coordinate system on a line. Therefore

PR = |x2 - x1|.

For the same sort of reason,

RQ = UV = |y2 - y1|.

But △PQR is a right triangle, with its right angle at R. Therefore, by the Pythagorean
theorem,

PQ^2 = PR^2 + RQ^2
     = |x2 - x1|^2 + |y2 - y1|^2.

Therefore

PQ = √(|x2 - x1|^2 + |y2 - y1|^2).

This is not quite the formula given in Theorem 1, because it uses absolute-value
signs instead of parentheses. But this makes no difference, because

|x2 - x1|^2 = (x2 - x1)^2,

and

|y2 - y1|^2 = (y2 - y1)^2.

(Why? We need a theorem from Section 1.4.)
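The distance formula of Theorem 1 can be written as a short function. This is an illustrative sketch; the function name distance and the sample points are assumptions, chosen so that the arithmetic is easy to follow by hand.

```python
from math import sqrt


def distance(p, q):
    """Distance PQ between p = (x1, y1) and q = (x2, y2), by the distance formula."""
    (x1, y1), (x2, y2) = p, q
    return sqrt((x2 - x1) ** 2 + (y2 - y1) ** 2)


# Legs of length 3 and 4 give a hypotenuse of length 5.
print(distance((0, 0), (3, 4)))
# Two points on the same horizontal line: the formula reduces to |x2 - x1|.
print(distance((1, 2), (4, 2)))
```

Because the differences are squared, interchanging the two points leaves the result unchanged, and no absolute-value signs are needed inside the square root.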


In the previous figures, we have shown the x-axis going positively from left to
right, and the y-axis going positively from bottom to top. Logically speaking, we
could equally well have put the axes in any of a number of other positions:

[Figure: several possible placements of the x- and y-axes; the usual placement is on the right]

But the axes are usually drawn as shown on the right above. This figure shows the
minimum that must be indicated when graph paper is used for drawing pictures of
coordinate systems. That is, the axes must be labeled, and the number scale must be
shown on each axis, by indicating the coordinate of at least one point.
The two axes separate the plane into four parts, called quadrants. The quadrants
are numbered I, II, III, IV. That is, the first quadrant is the set of all points (x, y)
of the plane for which x > 0 and y > 0; the second quadrant is the set of all points
(x, y) for which x < 0 and y > 0; and so on.

[Figure: the four quadrants I, II, III, IV (left); the axes labeled x and y (right)]

We have used the letters X and Y in order to have convenient names for the x-
and y-axes. The axes are more commonly labeled as on the right above.
PROBLEM SET 2.2
Calculate the distances between the following pairs of points. Then plot the points and
check the plausibility of your answers.

1. a) (1, 2) and (3, 4)
b) (-2, -4) and (4, 2)
c) (7, -5) and (5, -7)
d) (1, 0) and (0, 1)

2. a) (3, 7) and (-3, -7)
b) (1, 3) and (-2, 7)

3. Obviously, PQ = QP for every pair of points P, Q; the distance between two points
does not depend on the order in which the points are named. Therefore any correct
distance formula has the property that when we interchange the two points, the formula
gives the same answer. Check algebraically that our distance formula has this property.
4. Find out whether or not the points (-10, 10), (14, 3), and (38, -4) are vertices of an
isosceles triangle. Is the triangle equilateral?

5. Find all points (x, y) such that (0, 0), (2, 2), and (x, y) are the vertices of an equilateral
triangle.

6. Find out whether the points (-2, 3), (0, 1), and (3, 4) are the vertices of a right triangle.
Then plot the points and check for plausibility. (This problem can and should be
worked by the use of distances alone. The use of slopes is not necessary.)

7. Find the coordinates of the point which is equidistant from (0, 0), (1, 2), and (3, -1).
Find the radius of the circle which passes through the three given points.

8. What point on the y-axis, if any, is equidistant from (-1, -2) and (2, 3)?

9. a) Give a formula for the perpendicular distance between (x, y) and the x-axis.
b) Give a formula for the perpendicular distance between (x, y) and the y-axis.

10. Find out whether the points (-1, -1), (0, 1), and (2, 5) are collinear. Then plot and
check for plausibility. (The remarks following Problem 6 also apply here.)

11. Find a point on the x-axis which is collinear with the points (1, 2) and (0, 3). (The
remarks following Problem 6 also apply here.)

The following problems are a review of the main theorems of elementary geometry that
we have been using so far.

12. Show that an exterior angle of a triangle is greater than either of its remote interior
angles.

[Figure: a triangle with an exterior angle ACD (left); the construction used in the proof (right)]

That is, show that in the left-hand figure we have ∠ACD > ∠A. The proof is
based on the figure on the right. [Query: If you know that ∠ACD > ∠A, how do
you infer that ∠ACD > ∠B?]

13. Show that there is only one perpendicular to a given line, from a given external point.
That is, show that the left-hand figure below is impossible for A ≠ B. (We needed this
in order to explain what was meant by the x-coordinate of a point; A must be determined
when P is known.)

[Figure: two perpendiculars from P to a line, meeting it at A and B (left); a figure suggesting a proof of the Pythagorean theorem (right)]

14. Write the proof of the Pythagorean theorem suggested by the figure on the right above.

15. The proof of Theorem 1 of this section was incomplete: it discussed only the most
significant case and neglected to mention two other cases. The point is that if P and Q
lie on the same horizontal line, or the same vertical line, then there is no such thing as
△PQR, and so the Pythagorean theorem cannot be used.

Show that the distance formula holds in the case x1 = x2, and also in the case
y1 = y2.

2.3 THE GRAPH OF A CONDITION. EQUATIONS FOR CIRCLES

Given a point P and a positive number r, the circle with center P and radius r is the
set of all points of the plane whose distance from P is equal to r. That is, a point Q
is on the circle if PQ = r.

This is the first and simplest example of the idea of the graph of a condition. If
we state a condition which every point of the plane either satisfies or doesn't satisfy,
then the graph of the condition is the set of all points of the plane that satisfy it. (Thus
the graph is simply the solution set of an open sentence; we use the word graph
when the solution set is a set of points.) In this language, we say that the graph of the
condition OQ = r is the circle with center at the origin and radius r.

[Figure: the circle with center at the origin and radius r]

The interior of the circle with center P and radius r (r > 0) is the set of all points
Q such that PQ < r. Thus the interior is the graph of the inequality PQ < r. We
indicate such graphs in figures by means of shading or cross-hatching.

Sometimes the condition takes the form of an algebraic equation. For example,
if Q ↔ (x, y), then the distance formula tells us that

OQ = √(x^2 + y^2).

Therefore the condition

OQ = r     (1)

can be written in the equivalent form

√(x^2 + y^2) = r,     (2)

or

x^2 + y^2 = r^2.     (3)

The point (x, y) is on the circle if and only if x and y satisfy (2). And

√(x^2 + y^2) = r  <=>  x^2 + y^2 = r^2     (r > 0).

Thus the circle with center at the origin and radius 2 is the graph of

√(x^2 + y^2) = 2  <=>  x^2 + y^2 = 4;

and the interior of this circle is the graph of

√(x^2 + y^2) < 2  <=>  x^2 + y^2 < 4.

Similarly, the first quadrant is the graph of the condition

x > 0 and y > 0.
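A condition on points can be modeled as a function that returns True or False for each pair (x, y); its graph is then the set of points for which it returns True. The predicates below are a sketch with hypothetical names. They use the squared forms x^2 + y^2 = r^2 and x^2 + y^2 < r^2, which the text shows to be equivalent to the conditions OQ = r and OQ < r for r > 0.

```python
def on_circle(x, y, r):
    """The condition OQ = r, written in the equivalent form x^2 + y^2 = r^2."""
    return x * x + y * y == r * r


def in_interior(x, y, r):
    """The condition OQ < r, written in the equivalent form x^2 + y^2 < r^2."""
    return x * x + y * y < r * r


def in_first_quadrant(x, y):
    """The condition x > 0 and y > 0."""
    return x > 0 and y > 0


# A few integer sample points, tested against the circle of radius 2.
for point in [(0, 2), (1, 1), (2, 2), (-1, 1)]:
    x, y = point
    print(point, on_circle(x, y, 2), in_interior(x, y, 2), in_first_quadrant(x, y))
```

Working with the squared form avoids square roots entirely, so the tests are exact for integer or rational coordinates.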

[Figure: the first quadrant (x > 0, y > 0) and the fourth quadrant (x > 0, y < 0)]

The fourth quadrant is the graph of the condition x > 0 and y < 0.

We found that the circle with center at the origin and radius $r$ is the graph of the equation

$$x^2 + y^2 = r^2.$$

Consider, more generally, the circle with center at $(a, b)$ and radius $r$. By definition, the circle is

$$\{P \mid QP = r\}.$$

[Figure: the circle with center $Q(a, b)$ and radius $r$, with a point $P(x, y)$ on it.]

Using the distance formula to express $QP$ algebraically, we get

$$QP = r \iff \sqrt{(x - a)^2 + (y - b)^2} = r \iff (x - a)^2 + (y - b)^2 = r^2.$$

Thus:

Theorem 1. The circle with center at $(a, b)$ and radius $r$ is the graph of the equation

$$(x - a)^2 + (y - b)^2 = r^2.$$

An equation written in the above form is easy to interpret. For example, given

$$(x + 2)^2 + (y - 5)^2 = 4,$$

we see by Theorem 1 what the graph is. On the other hand, if such an equation is "simplified" algebraically, it may look like this:

$$x^2 + y^2 + 4x - 10y + 25 = 0.$$


To find out what the graph is, we first "unsimplify" by completing the square:

$$\begin{aligned}
& x^2 + 4x + y^2 - 10y = -25 \\
\iff\; & x^2 + 4x + 4 + y^2 - 10y + 25 = -25 + 4 + 25 \\
\iff\; & (x + 2)^2 + (y - 5)^2 = 4.
\end{aligned}$$

In the general case, for equations of the form

$$x^2 + y^2 + Dx + Ey + F = 0,$$

there are three possibilities for the graph. In some cases, the graph is a circle. But

$$x^2 + y^2 = 0$$

is also an equation of this form, and its graph is not a circle, but a single point, namely the origin. And the equation

$$x^2 + y^2 + 1 = 0$$

is never satisfied, for any $x$ and $y$. Its graph is therefore the empty set { }.

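By completing the square, starting with the general form, we shall show that these three possibilities (a circle, a point, and the empty set) are in fact the only ones:

$$\begin{aligned}
& x^2 + y^2 + Dx + Ey + F = 0 \\
\iff\; & x^2 + Dx + \left(\frac{D}{2}\right)^2 + y^2 + Ey + \left(\frac{E}{2}\right)^2 = -F + \left(\frac{D}{2}\right)^2 + \left(\frac{E}{2}\right)^2 \\
\iff\; & \left(x + \frac{D}{2}\right)^2 + \left(y + \frac{E}{2}\right)^2 = \frac{D^2 + E^2 - 4F}{4}.
\end{aligned}$$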

If the fraction on the right, in the last equation, is positive, then it is $= r^2$ for some positive number $r$, and so the graph is the circle with center at $(-D/2, -E/2)$ and radius $r$. If the fraction on the right is 0, then the equation takes the form

$$\left(x + \frac{D}{2}\right)^2 + \left(y + \frac{E}{2}\right)^2 = 0,$$

and the graph contains only the point $(-D/2, -E/2)$. Finally, if the fraction on the right is negative, then the equation is never satisfied, for any $x$ and $y$, and so the graph is the empty set { }.

To sum up:

Theorem 2. The graph of an equation of the form

$$x^2 + y^2 + Dx + Ey + F = 0$$

is a circle, a point, or the empty set.
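The three cases of Theorem 2 can be checked numerically. The following is a minimal sketch, not part of the text: the function name `classify_graph` and its return shape are assumptions made for illustration. It completes the square exactly as above and inspects the sign of $(D^2 + E^2 - 4F)/4$.

```python
from fractions import Fraction
import math


def classify_graph(D, E, F):
    """Classify the graph of x^2 + y^2 + D*x + E*y + F = 0.

    Returns (kind, center, radius); center and radius are None when
    they do not apply. The name and return shape are illustrative only.
    """
    D, E, F = Fraction(D), Fraction(E), Fraction(F)
    # Completing the square gives (x + D/2)^2 + (y + E/2)^2 = rhs.
    rhs = (D * D + E * E - 4 * F) / 4
    center = (-D / 2, -E / 2)
    if rhs > 0:
        return ("circle", center, math.sqrt(rhs))
    if rhs == 0:
        return ("point", center, None)
    return ("empty set", None, None)


# The worked example x^2 + y^2 + 4x - 10y + 25 = 0 from the text.
print(classify_graph(4, -10, 25))
# The two degenerate examples x^2 + y^2 = 0 and x^2 + y^2 + 1 = 0.
print(classify_graph(0, 0, 0))
print(classify_graph(0, 0, 1))
```

Exact fractions are used for the sign test so that the borderline case (a single point) is not lost to rounding.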


PROBLEM SET 2.3
Problems 1 through 6.

In the illustration below six figures are drawn. For each of these figures, state a condition which has the given figure as its graph. In the figure, the arrowheads merely indicate that the line is supposed to go infinitely far in the indicated direction. Thus (1) and (2) are entire lines; (3) is a ray, going infinitely far on the right, but stopping at the point (0, 4) on the left; and (6) is a segment, with endpoints (1, -3) and (4, -3).

[Figure: six figures, numbered (1) through (6).]

Problems 7 through 10.


Follow the same directions as in the previous problems for the illustration below.


Sketch the graphs of the following conditions, using cross-hatching to indicate regions.
11. x2

y2 = 1

14. x = 2
18. x2

and

12. x2
0 y 2

y2

<

15. x = - 3

y2 1

13. x2

1
16. y

19. x y 0

y2

>

17. y = x/lxl, x
20. x

>

0,

>=

y = 3

21. a) Sketch the graph of the condition "(x, y) is equidistant from the points (0, 1) and
(1, 0)."
b) Write this condition in the simplest possible algebraic form.
22. Write the simplest equation that you can get, for the set of all points that are equidistant
from (1, 2) and (0, 3). What sort of a figure is this graph? How is it related to the
segment from (1, 2) to (0, 3)?
23. Same problem, for the set of all points that are equidistant from (1, 2) and (2, 2).
24. Same problem, for the set of all points that are equidistant from $P_1 = (x_1, y_1)$ and $P_2 = (x_2, y_2)$.
*25. Describe and sketch the graph of the equation

$$\sqrt{(x - 1)^2 + (y - 2)^2} + \sqrt{(x - 4)^2 + (y - 7)^2} = \sqrt{34}.$$

[Hint: If you do a lot of algebra, you will probably get the wrong answer; the graph is not an ellipse.]

*26. Describe the graph of the equation


v x2

(y - 1)2

v (x - 2)2

y2 = 1.

[The same hint as for Problem 25 applies here.]


27. Draw the graph of the equation
x3y

y3x - xy = 0.

xy2 - xy = 0.

28. Draw the graph of the equation


x2y

29. Consider the set of all points that are twice as far from the origin as from the point
(3, 0). Find an equation for this graph, and sketch.
2.4 EQUATIONS OF LINES. SLOPES, PARALLELISM, AND PERPENDICULARITY

Every line is the graph of an equation of the form

$$Ax + By + C = 0,$$

where $A$ and $B$ are not both 0. The proof is as follows.

Every line is the perpendicular bisector of some segment. If $L$ is the perpendicular bisector of the segment from $Q \leftrightarrow (a_1, b_1)$ to $R \leftrightarrow (a_2, b_2)$, then

$$L = \{P \mid PQ = PR\}.$$

(Remember your geometry.) Therefore $L$ is the graph of the equation

$$\begin{aligned}
& \sqrt{(x - a_1)^2 + (y - b_1)^2} = \sqrt{(x - a_2)^2 + (y - b_2)^2} \\
\iff\; & x^2 - 2a_1 x + a_1^2 + y^2 - 2b_1 y + b_1^2 = x^2 - 2a_2 x + a_2^2 + y^2 - 2b_2 y + b_2^2 \\
\iff\; & 2(a_2 - a_1)x + 2(b_2 - b_1)y + a_1^2 + b_1^2 - a_2^2 - b_2^2 = 0.
\end{aligned}$$

This has the desired form

$$Ax + By + C = 0,$$

with

$$A = 2(a_2 - a_1), \qquad B = 2(b_2 - b_1), \qquad \text{and} \qquad C = a_1^2 + b_1^2 - a_2^2 - b_2^2.$$

The numbers $A$ and $B$ cannot both be 0, because $a_2 - a_1$ and $b_2 - b_1$ cannot both be 0; the number pairs $(a_1, b_1)$ and $(a_2, b_2)$ are the coordinates of $Q$ and $R$, and $Q \ne R$, because $Q$ and $R$ are the endpoints of a segment.

An equation of this type is called a linear equation in $x$ and $y$. Thus we have

Theorem 1. Every line is the graph of a linear equation in $x$ and $y$.
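The coefficients found in this proof are easy to compute for a concrete segment. The sketch below is only an illustration; the helper names `perpendicular_bisector` and `on_line` are hypothetical and do not come from the text. It builds $A$, $B$, $C$ from the endpoints $Q$ and $R$ by the formulas derived above.

```python
def perpendicular_bisector(q, r):
    """Return (A, B, C) so that A*x + B*y + C = 0 is the perpendicular
    bisector of the segment from q = (a1, b1) to r = (a2, b2)."""
    (a1, b1), (a2, b2) = q, r
    if (a1, b1) == (a2, b2):
        # The endpoints of a segment are always different.
        raise ValueError("Q and R must be different points")
    A = 2 * (a2 - a1)
    B = 2 * (b2 - b1)
    C = a1**2 + b1**2 - a2**2 - b2**2
    return A, B, C


def on_line(coeffs, point):
    """True if the point satisfies A*x + B*y + C = 0."""
    A, B, C = coeffs
    x, y = point
    return A * x + B * y + C == 0


# The points equidistant from (0, 1) and (1, 0) form the line y = x.
line = perpendicular_bisector((0, 1), (1, 0))
print(line, on_line(line, (3, 3)))
```

With integer endpoints the arithmetic is exact, so the equality test in `on_line` is safe here; with floating-point inputs a tolerance would be needed.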

If the line is not vertical, we can say more. In this case, the perpendicular segment from $Q$ to $R$ is not horizontal, and this means that

$$b_2 - b_1 \ne 0.$$

Therefore

$$B \ne 0,$$

and we can divide by $B$ and solve for $y$. This gives

$$y = -\frac{A}{B}x - \frac{C}{B},$$

which has the form

$$y = mx + k,$$

where

$$m = -\frac{A}{B} = -\frac{a_2 - a_1}{b_2 - b_1}.$$

In the figure, the label $k$ on the y-axis is correct, because $k$ is the y-coordinate of the point where $L$ crosses the y-axis ($m \cdot 0 + k = k$). The number $k$ is called the y-intercept of the line.

The number $m$ also has a geometric meaning, as we shall soon see.

If $P_1 \leftrightarrow (x_1, y_1)$ and $P_2 \leftrightarrow (x_2, y_2)$ are any two points of a nonvertical line, then the slope of the segment from $P_1$ to $P_2$ is defined to be the fraction

$$\frac{y_2 - y_1}{x_2 - x_1}.$$

The denominator $x_2 - x_1$ is marked $\Delta x$ in the figures below; it is pronounced "delta x," and stands for the difference in $x$. Similarly, $y_2 - y_1$ is marked $\Delta y$, which stands for the difference in $y$. Here $\Delta y$ and $\Delta x$ are not necessarily distances in the sense of elementary geometry, because they may be negative.


[Figure: two segments from $P_1$ to $P_2$, with $\Delta x$ and $\Delta y$ marked; in each case, slope $= \dfrac{y_2 - y_1}{x_2 - x_1} = \dfrac{\Delta y}{\Delta x}$.]

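We shall show that all segments of the same line have the same slope, and that this slope is the number $m$ which appears in the equation $y = mx + k$.

Given two points $P_1 \leftrightarrow (x_1, y_1)$ and $P_2 \leftrightarrow (x_2, y_2)$ on the line

$$y = mx + k,$$

then

$$y_2 = mx_2 + k \qquad \text{and} \qquad y_1 = mx_1 + k.$$

Therefore

$$y_2 - y_1 = m(x_2 - x_1),$$

and

$$\frac{y_2 - y_1}{x_2 - x_1} = m.$$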

(In this calculation we do not care whether $y_2 - y_1$ and $x_2 - x_1$ are positive or negative. The algebra takes care of all cases at once.)

The number $m$ is called the slope of the line. And we have proved the following theorem:

Theorem 2. The graph of the equation

$$y = mx + k$$

is the nonvertical line with slope $m$ and y-intercept $k$. All segments of this line have slope $= m$.

The equation given in this theorem is called the slope-intercept form of the equation of the line.

[Figure: the line $y = x$.]

A line can be described by many different equations. For example, the bisector of the first and third quadrants above is the graph of each of the following equations:

$$\begin{aligned}
& y = x \\
\iff\; & x - y = 0 \\
\iff\; & 3x - 3y = 0 \\
\iff\; & (x - y)^2 = 0 \\
\iff\; & (x - y)^{177} = 0,
\end{aligned}$$

and so on. But there is only one equation, in the slope-intercept form, for every nonvertical line, because when the line is named, its slope and its y-intercept are determined.
Often a line will be described by its slope $m$ and the coordinates $x_1$, $y_1$ of one of its points. We can then find an equation for it in the following way. If $(x, y)$ is any other point of the line, then

$$\frac{y - y_1}{x - x_1} = m,$$

because all segments of the line have the same slope $m$. Therefore

$$y - y_1 = m(x - x_1).$$

The graph of this equation contains $(x_1, y_1)$, because $0 = m \cdot 0$. And the graph is a line with slope $= m$, because the equation has the form

$$y = mx + (y_1 - mx_1) = mx + k.$$

Thus:

Theorem 3. The graph of the equation $y - y_1 = m(x - x_1)$ is the line which has slope $m$ and contains the point $(x_1, y_1)$.

For example, the graph of the equation $y - 3 = -2(x + 1)$ is the line which has slope $-2$ and passes through the point $(-1, 3)$. Solving for $y$, we get the slope-intercept form

$$y = -2x + 1.$$
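The passage from the point-slope form to the slope-intercept form is the single computation $k = y_1 - mx_1$. The following sketch, with hypothetical helper names `slope` and `slope_intercept`, reproduces the example above using exact fractions.

```python
from fractions import Fraction


def slope(p1, p2):
    """Slope of the segment from p1 to p2 (the segment must not be vertical)."""
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2:
        raise ValueError("a vertical segment has no slope")
    return Fraction(y2 - y1) / Fraction(x2 - x1)


def slope_intercept(m, point):
    """Given slope m and a point (x1, y1), return (m, k) with y = m*x + k."""
    x1, y1 = point
    # From y - y1 = m(x - x1): y = m*x + (y1 - m*x1).
    return m, y1 - m * x1


# The example from the text: slope -2 through (-1, 3) gives y = -2x + 1.
print(slope_intercept(-2, (-1, 3)))
# Any two points of that line give the same slope.
print(slope((-1, 3), (0, 1)), slope((0, 1), (5, -9)))
```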

Theorem 4. Two nonvertical lines are parallel if and only if they have the same slope.

Given:

$$L_1\colon\ y = m_1 x + k_1, \qquad L_2\colon\ y = m_2 x + k_2,$$

we need to prove two things: (1) If the slopes are the same, and the lines are different, then the lines are parallel. (2) If the slopes are different, then the lines are not parallel.

1) If $m_1 = m_2$, then $k_1 \ne k_2$, because the lines are different. Therefore the lines are parallel, because the two equations are inconsistent: they take the form

$$y = m_1 x + k_1, \qquad y = m_1 x + k_2.$$

Since $k_1 \ne k_2$, these equations have no common solution.

2) If $m_1 \ne m_2$, the lines cannot be parallel, because the equations always have a common solution. By subtraction we get

$$0 = (m_1 - m_2)x + (k_1 - k_2), \qquad x = \frac{k_2 - k_1}{m_1 - m_2},$$

and we now find the y-coordinate of the point of intersection by substituting in either of the original equations.
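Part (2) of this argument is constructive, and can be sketched as a short routine. The function name `intersection` and the convention of returning `None` when the slopes are equal are assumptions made for this illustration only.

```python
from fractions import Fraction


def intersection(m1, k1, m2, k2):
    """Common point of y = m1*x + k1 and y = m2*x + k2.

    Returns None when the slopes are equal: the lines are then either
    parallel (k1 != k2) or the same line (k1 == k2).
    """
    if m1 == m2:
        return None
    # Subtracting the equations: 0 = (m1 - m2)*x + (k1 - k2).
    x = Fraction(k2 - k1) / Fraction(m1 - m2)
    # Substitute in either of the original equations.
    y = m1 * x + k1
    return x, y


print(intersection(1, 0, -1, 2))   # the lines y = x and y = -x + 2
print(intersection(2, 1, 2, 3))    # equal slopes, different intercepts
```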
Theorem 5. If two nonvertical lines are perpendicular, then their slopes are negative reciprocals of each other.

Proof. Given $L_1$ with slope $m_1$ and $L_2$ with slope $m_2$, intersecting at right angles at $T$. Let $(a_1, b_1)$ and $(a_2, b_2)$ be points of $L_2$ which are equidistant from $T$. Then $L_1$ is the perpendicular bisector of the segment between these points. As we found earlier, the slope of $L_1$ is

$$m_1 = -\frac{a_2 - a_1}{b_2 - b_1}.$$

But we can calculate the slope $m_2$ of $L_2$ by the slope formula, using the points $(a_1, b_1)$ and $(a_2, b_2)$. This gives

$$m_2 = \frac{b_2 - b_1}{a_2 - a_1}.$$

Obviously

$$m_2 = -\frac{1}{m_1}.$$

This also works the other way around:

Theorem 6. Given two lines $L_1$, $L_2$, with slopes $m_1$, $m_2$. If

$$m_2 = -\frac{1}{m_1},$$

then $L_1$ and $L_2$ are perpendicular.

[Figure: the lines $L_1$, $L_2$, and $L$.]

Proof. First we observe that the lines cannot be parallel, because $m_2$ cannot be $= m_1$. (Why?) Let $T$ be the point where they intersect. Let $L$ be the line through $T$, perpendicular to $L_1$. Then $L$ has slope $m_2 = -1/m_1$. (Why?) But through a given point there is only one line with a given slope. Therefore $L$ is $L_2$, and $L_2$ is perpendicular to $L_1$.
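Theorems 5 and 6 together say that two nonvertical lines are perpendicular exactly when $m_1 m_2 = -1$. As an independent check, the sketch below uses a technique that is not the one used in the text: it takes $(1, m_1)$ and $(1, m_2)$ as direction vectors of the two lines and tests whether their dot product $1 + m_1 m_2$ is zero. The function name is hypothetical.

```python
from fractions import Fraction


def are_perpendicular(m1, m2):
    """True if nonvertical lines with slopes m1 and m2 are perpendicular.

    (1, m1) and (1, m2) are direction vectors of the two lines; the
    lines are perpendicular exactly when the dot product is zero.
    """
    return 1 + m1 * m2 == 0


m1 = Fraction(2)
m2 = -1 / m1                      # the negative reciprocal of m1
print(are_perpendicular(m1, m2))  # slopes 2 and -1/2
print(are_perpendicular(1, 1))    # two lines with the same slope
```

Exact fractions are used so that the test for zero is not disturbed by rounding.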

Probably you have seen these theorems proved before, in different ways. The treatment given above is intended to avoid repetitions and also to furnish some practice in drawing geometric conclusions by algebraic methods.

PROBLEM SET 2.4

Find point-slope equations, and slope-intercept equations, for the lines containing the following pairs of points.

1. (-3, 2), (2, 1)
2. (3, -4), (1, 2)
3. (1, 0), (3, 3)
4. (-1, 1), (2, -2)

5. Find an equation for the tangent to the graph of

$$x^2 + y^2 = 25,$$

at the point (3, 4).

Given thatP1

(x1, y1) lies on the circle


x2 + y2

with

a2,

LetP2 be the point where the tangent atP1 crosses the x-axis.

[Warning: Geometric distances are never negative.]

Find the distanceP1P2

-a

7. Find the points $P$ on the circle $x^2 + y^2 = 2$ so that the tangent line to the circle at $P$ passes through the point (2, 0). (You may use the fact that, at any point $P$ on a circle, the tangent and the radius are perpendicular.)


8.

Sketch the graph of the equation


x2 + y2 + I +

9.

2x

2y

0.

Sketch the graph of the equation

x2
10.

2xy

+ 4y2 + I

4xy +

2x

4y

0.

Sketch the graphs of the following equations.

a) y

lxl

b) y

-l2xl

c) y

- Ix - II

For this problem we offer a hint which applies equally well to a very large number of other problems. If you didn't know the meaning of the symbol $|x|$, you would have no hope of sketching the graph. This suggests that you should recall the definition of $|x|$, and use it.

11. Sketch the graph of

$$|x| + |y| = 1.$$

[Hint: As a first step, sketch the portion of the graph that lies in the first quadrant.]

12. Sketch the graph of the equation

$$y = x + |x| + 1.$$

13. Sketch the graph of the equation

$$|x| - |y| = 1.$$

14. Sketch the graph of the equation

$$\sqrt{(x - 1)^2 + (y - 3)^2} + \sqrt{(x - 4)^2 + (y - 2)^2} = \sqrt{10}.$$

15. Let $C$ be the set of all points $P$ such that the segment from (-1, 0) to $P$ is perpendicular to the segment from $P$ to (2, 1). What sort of figure is $C$? Sketch. (In answering this one you should bear in mind that the endpoints of a segment are always different. That is, there is no such thing as the segment from $P$ to $P$.)

*16. Let $A = (-2, 0)$, let $B = (2, 0)$, and let $G$ be the set of all points $P$ such that $\angle APB$ is an angle of 60°. What sort of figure is $G$? Sketch. (You will have to remember and use some plane geometry, to do this one. If you have suitable drawing instruments, you ought to be able to do a good sketch.)

2.5 GRAPHS OF INEQUALITIES. AND, OR, AND IF ... THEN

We have found that the graph of the equation

$$(x - 1)^2 + (y - 1)^2 = 1$$

is the circle with center at $A = (1, 1)$ and radius 1. The interior of the circle is the graph of the condition $AP < 1$. This is the region marked $R_1$ in the figure. It is the graph of the inequality

$$(x - 1)^2 + (y - 1)^2 < 1.$$

Similarly, the exterior $R_2$ is the graph of the condition $AP > 1$, so that

$$R_2 = \{(x, y) \mid (x - 1)^2 + (y - 1)^2 > 1\}.$$


The graph of the equation

is a line L. The points lying above L form a set H1, called a

halfplane.

Evidently H1

is the graph of the inequality


y

>

x.

The points lying below L form a half-plane H2; and H2 is the graph of the inequality

y< 1

x.

Consider now the double inequality

t< x < %.
The graph is an infinite vertical strip R1, lying between the lines
X

.l..

and

Ji

Similarly, the graph of

t<y< 1
is an infinite horizontal strip, as shown on the left below.
y
I
.I
I
I

y
1

___

--

__

----

--

Consider next the condition


or

I
I
I
I
I

t<y<l.

--- -


The graph of the condition using or is an infinite cross-shaped region. This region R'
is the union of an infinite vertical strip R1 and an infinite horizontal strip R2; it contains
all points of the plane that belong to R1 or to R2.
(In mathematics, when we say that one condition holds or another condition
holds, we allow the possibility that both conditions hold. If we mean " ... but not
both," we have to say so.)
Similarly, the graphs of the conditions

$$y > x, \qquad y > -x$$

are two half-planes $H_1$ and $H_2$. They are respectively to the left of the line $y = x$ and to the right of the line $y = -x$, as shown in the figure on the left below. The graph of the condition

$$y > x \quad \text{and} \quad y > -x$$

is the intersection of these two half-planes. This is the interior $R_1$ of $\angle AOB$.

[Figure: the lines $y = x$ and $y = -x$, with the half-planes $H_1$ and $H_2$.]

The graph of the condition

y >x

or

y > -x

is the union of the two half-planes.
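The correspondence between "and" and intersection, and between "or" and union, can be tested point by point. In this sketch the predicate names are hypothetical; each function answers whether a given point belongs to the graph of the condition.

```python
def in_h1(x, y):
    """The half-plane H1: the graph of y > x."""
    return y > x


def in_h2(x, y):
    """The half-plane H2: the graph of y > -x."""
    return y > -x


def in_intersection(x, y):
    """Graph of 'y > x and y > -x'."""
    return in_h1(x, y) and in_h2(x, y)


def in_union(x, y):
    """Graph of 'y > x or y > -x' ('or' allows both to hold)."""
    return in_h1(x, y) or in_h2(x, y)


for point in [(0, 1), (2, 1), (-2, 1), (0, -1)]:
    print(point, in_intersection(*point), in_union(*point))
```

Python's `or`, like the mathematical "or" described above, is inclusive: the point (0, 1) lies in both half-planes and is counted in the union.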


[Figure: left, the union of the two half-planes; right, a region $R$, with two vertical strips marked $S_1$ and $S_2$.]

Let us now see what sort of graph we get when we combine two inequalities by
"if ... then." Consider the condition

i<x<i
This says that

t<x< },

if

i<y<l.

=>

then

i<y<l.

Let R be the graph. We assert that R looks like the drawing on the right above. That
is, R contains all points that do

not

lie in either of the two vertical strips marked

S1 and S2. The reason is as follows:

1)

If

(x, y)

is a point of R, and

t < x< !,

fore the part of R that lies between the lines

then we must have


=

t and x

t <y< I.

There

i must be the interior of a

rectangle, as indicated by the dashed lines in the figure.


2)

On the other hand, if xis

not

between

t and i,

then the condition for the graph

imposes no restriction on y at all. Therefore R contains all points to the left of the
line

t and

all points to the right of the line x

R also contains these two

vertical lines, for the same reason.


The reasoning in (2) may seem a little tricky, but may be clarified by an analogy from everyday life. The law in most places requires that if a person has seriously defective vision, then he must wear corrective glasses when driving a car. A person with normal vision automatically obeys this law; its restrictive clause does not apply to him. In the same way, the "law"

i<x<imposes a restriction only on points

=>

(x, y)

t<y<l
for which

l <x< -;

all other points

automatically obey the "law," because its restrictive clause does not apply to them.
Thus the "law" holds under each of the following three conditions:
1)
2)

t <x< !
x t,

and

t<y<

1,

3) x !.
The graph of (1) is the rectangular region in the middle of the figure; the graph of (2)
is the infinite region to the left of the line x
region to the right of the line x

t;

and the graph of

i.
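[Figure: the four quadrants. The quadrants where $x < 0$, $y > 0$; where $x > 0$, $y > 0$; and where $x < 0$, $y < 0$ are marked "Yes"; the quadrant where $x > 0$, $y < 0$ is marked "No."]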
(3)

is the infinite


Similarly, the graph of the condition

$$x > 0 \implies y \ge 0$$

contains all of the plane except for the fourth quadrant. It is only in the fourth quadrant that $x > 0$ holds and $y \ge 0$ does not hold; and the possibility

$$x > 0, \qquad y < 0$$

is the only possibility that is ruled out by the condition

$$x > 0 \implies y \ge 0.$$
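A condition of the form "if ... then" fails only where its hypothesis holds and its conclusion does not. The following sketch encodes that reading; the helper names `implies` and `in_graph` are illustrative, and the condition tested is $x > 0 \Rightarrow y \ge 0$.

```python
def implies(p, q):
    """Truth value of 'if p then q': false only when p is true and q is false."""
    return (not p) or q


def in_graph(x, y):
    """True if (x, y) lies on the graph of the condition x > 0 => y >= 0."""
    return implies(x > 0, y >= 0)


# One sample point in each quadrant, in the order I, II, III, IV.
for point in [(1, 1), (-1, 1), (-1, -1), (1, -1)]:
    print(point, in_graph(*point))
```

Only the sample point in the fourth quadrant is rejected; points with $x \le 0$ satisfy the condition automatically, because the hypothesis does not apply to them.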

In each of the following cases, the shaded region is the graph of the condition appearing below it.

[Figures: shaded regions, each with its condition below it.]

There is no need to use graph paper in the following problem set. Reasonably neat freehand sketches, with cross-hatching used to indicate regions, are sufficient.

PROBLEM SET 2.5

Sketch the graphs of the following conditions:

1. !

< x <

3. } g
5.

<

y < ii

Ix - 11 <

7. !y - 2i <
9.

2. Ix - 1 1 < l

4.

t and ly - 21

/0

=>-

fyl ;2 ix!

11. x2 + y2 ;2 1

< /0

fx - 11 <

ly - 21 <

6. lx - lj <

8.

'

110

( x - 1)2 + y2 ;2 1

ly - 21 < /0

;;;: lxl

10. x2 + y2 ;2 1
or

=>-

and

(x - 1)2 + y2 ;2 1


12. x2+ y2 1

(x - 1)2+ y2 1

=>

x2+ y2 1

13. (x - 1)2+ y2 1

=>

14. x2+ y2 4

and

(x+ 1)2+ y2 1

15. x2+ y2 4

or

(x+ 1)2+ y2 1

16. x2+ y2 4

=>

x2+ y2 1

18. x2 + y2 1

and

y ;;;; x

=>

x2+ y2 4

and

x lyl

-x - y 1

25. !xi - ly l 1
21. Ix - yl 1

26. Ix+ yl 1
=

19. x2+ y2 1
23.

-x+ y 1

24. Ix!+ lyl 1


28. x

=>

21. x - y 1

20. x+ y 1
22.

17. x2+ y2 1

30. Ix - 31 < t

=>

29. Ix - 21 <Yo

=>

ly - 11 < t

l y - 21 < t

31. Suppose you know that (a) $P \Rightarrow Q$ and (b) $P$ is false. What, if anything, can you infer about $Q$?

32. Suppose you know that (a) $P \Rightarrow Q$ and (b) $Q$ is true. What, if anything, can you infer about $P$?

33. Suppose you know that (a) $P \Rightarrow Q$ and (b) $Q$ is false. What, if anything, can you infer about $P$?

34. Suppose you know that $P \Rightarrow Q$. Which of the following are possible?

a) $P$ is true and $Q$ is true.

b) $P$ is true and $Q$ is false.

c) $P$ is false and $Q$ is true.

d) $P$ is false and $Q$ is false.

35. Suppose you know that $P \Leftrightarrow Q$. Which of the combinations (a), (b), (c), and (d) in Problem 34 are possible?

2.6 PARABOLAS

The distance from a point to a line is the length of the perpendicular from the point to the line. Given a point $F$ and a line $D$ not containing $F$, the parabola with focus $F$ and directrix $D$ is the set of all points of the plane that are equidistant from $F$ and $D$. The parabola is the graph of the condition

$$FP = MP,$$

where $M$ is the foot of the perpendicular from $P$ to $D$. The perpendicular line to $D$ through $F$ is called the axis of the parabola. The point where the axis crosses the parabola is called the vertex. (There is only one such point, because any such point is midway between the focus and the directrix.)


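The first step in the study of parabolas is to get equations for them.

[Figure: a parabola with its vertex at the origin, the focus $F$ above the x-axis, and the directrix $D$, the line $y = -p/2$, below it; $M$ is the foot of the perpendicular from $P$ to $D$.]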

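In setting up our axes, we take the vertex as the origin, and the x-axis parallel to the directrix, in such a way that $D$ is below the x-axis and the focus is above it. The number $p$ is the distance from the focus to the directrix. Now let $P \leftrightarrow (x, y)$ be a point of the parabola. Then

$$FP = \sqrt{x^2 + \left(y - \frac{p}{2}\right)^2}$$

and

$$MP = \sqrt{\left(y + \frac{p}{2}\right)^2}.$$

Therefore

$$\begin{aligned}
FP = MP \iff\; & \sqrt{x^2 + \left(y - \frac{p}{2}\right)^2} = \sqrt{\left(y + \frac{p}{2}\right)^2} \\
\iff\; & x^2 + \left(y - \frac{p}{2}\right)^2 = \left(y + \frac{p}{2}\right)^2 \\
\iff\; & x^2 + y^2 - py + \frac{p^2}{4} = y^2 + py + \frac{p^2}{4} \\
\iff\; & x^2 = 2py \\
\iff\; & y = \frac{1}{2p}\,x^2.
\end{aligned}$$

This has the form

$$y = ax^2,$$

where

$$a = \frac{1}{2p}, \qquad \text{and} \qquad p = \frac{1}{2a}.$$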

Thus we have proved the following theorem:

Theorem 1. The graph of the equation

$$y = ax^2$$

is a parabola, with focus at $(0, 1/4a)$ and directrix $y = -1/4a$.
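The theorem can be spot-checked numerically: for any point $P$ on $y = ax^2$, the distance to the focus should equal the distance to the directrix. The function names below are hypothetical, and the check uses floating-point arithmetic, so the two distances are compared with a tolerance rather than for exact equality.

```python
import math


def focus_and_directrix(a):
    """Focus (0, 1/(4a)) and directrix height -1/(4a) for y = a*x^2, a != 0."""
    if a == 0:
        raise ValueError("a must be nonzero")
    return (0.0, 1 / (4 * a)), -1 / (4 * a)


def distances(a, x):
    """Return (FP, MP) for the point P = (x, a*x^2) of the parabola."""
    (fx, fy), d = focus_and_directrix(a)
    y = a * x * x
    fp = math.hypot(x - fx, y - fy)   # distance from P to the focus F
    mp = abs(y - d)                   # distance from P to the line y = d
    return fp, mp


for a, x in [(1, 3), (0.5, -2), (3, 0.1)]:
    fp, mp = distances(a, x)
    print(a, x, math.isclose(fp, mp))
```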
[Figure: the parabola $y = ax^2$, with focus $F$ and directrix $D$; $FP = MP \iff y = ax^2$.]

If a parabola is situated like this, relative to the axes, then the parabola is said to be in standard position. The use of standard position simplifies the equation considerably. For example, if $F$ is the point $(2, -1)$ and $D$ is the line $y = 3$, then the parabola is the graph of the equations

$$\begin{aligned}
FP = MP \iff\; & \sqrt{(x - 2)^2 + (y + 1)^2} = \sqrt{(y - 3)^2} \\
\iff\; & x^2 - 4x + 4 + y^2 + 2y + 1 = y^2 - 6y + 9 \\
\iff\; & x^2 - 4x - 4 = -8y \\
\iff\; & y = -\tfrac{1}{8}x^2 + \tfrac{1}{2}x + \tfrac{1}{2}.
\end{aligned}$$

It is not hard to check, in general, that if the directrix is horizontal, then the equation always takes the form

$$y = Ax^2 + Bx + C, \qquad A \ne 0.$$

And if the directrix is vertical, we get

$$x = Ay^2 + By + C, \qquad A \ne 0.$$

If the directrix is neither horizontal nor vertical, then the equation involves, in

general, terms in x2,

y2, and xy, as well as linear terms and a constant. In this case it

is hard to derive the equation when the focus and directrix are given; and it is even

harder, when the equation is given, to see that the graph is a parabola. This case will
be discussed in Chapter

8.

For a long time to come, however, we shall deal only with the simplest case, in

which the directrix is horizontal.

Parabolas arise in a variety of contexts which appear at first to be unrelated.

Following are a few.

1) If a right circular cone is cut by a plane parallel to an element of the cone, the

resulting curve is a parabola. This was the viewpoint from which the Greeks studied
parabolas; and it is for this reason that a parabola is one of the conic sections. There
are other kinds of conic sections, obtained by slicing cones by planes in various
positions.
[Figure: left, a cone cut by a plane; right, the path of a projectile.]

2) If a theoretical projectile is fired from the surface of the earth, in any direction other than straight upward, the path that it moves along is a portion of a parabola. In the figure on the right above, the x-axis lies along the surface of the earth, the y-axis is vertical, $\angle\alpha$ represents the angle at which the gun is aimed, and $T$ is the
point where the projectile hits the ground. We say, "a theoretical projectile," because
to get this result you must assume both that the weight of the projectile is independent
of its altitude and that the air makes no resistance. These assumptions are false, but
they are good approximations to the truth, if the projectile is not going very fast or
very high.

For high-speed, long-range projectiles, both assumptions are quite

unrealistic, and the situation is more complicated.


3) If you rotate a parabola around its axis, you get a surface which is called a paraboloid of revolution. The mirror in a reflecting telescope is a paraboloid of revolution, as is the reflector in an automobile headlight. The reason is that if a ray of light travels along a line parallel to the axis, and is reflected in the usual way, it always hits the focus. And conversely, if a ray of light starts at the focus, hits the surface and is reflected, it always continues along a line parallel to the axis. The first of these principles is used in telescopes, and the second in headlights.

[Figure: rays of light parallel to the axis of a parabola, and the focus $F$.]

4) Suppose that you fire a "theoretical projectile" vertically upward. It moves up a vertical line, for a certain distance $h$, and then comes down again along the same line. Thus the path of motion is simply a segment. Suppose now that we label our horizontal axis as the t-axis; we measure time starting at the moment of firing; and we plot, for each time $t$, the height of the projectile at time $t$.
[Figure: the height $y$ of the projectile, plotted against the time $t$.]

The resulting graph is a portion of a parabola. In the figure above, $a$ is the time at which the projectile hits the ground. Note that the graph that we have been discussing is not all of the parabola: one minute before firing, the projectile was in the gun; it was not underground. And at $t = a$ the projectile hits the ground, and the motion stops. Therefore, in the figure there is a solid arc, which indicates the portion of the parabola that is related to the physical problem; the irrelevant part of the curve is indicated by dashed arcs to the left and right.
This example indicates that geometric ideas come up in physics in unexpected
ways; the uses of geometry are not limited to the study of figures in space.

PROBLEM SET 2.6


1. Take a full-size sheet of graph paper; draw the y-axis in the center; and draw the x-axis
near the bottom of the paper.

Then choose the largest uniform scale that you can,

on the axes, in such a way that x ranges from - 2 to 2 and y ranges from -! to 4.
Now sketch the graph of $y = x^2$. First plot the points corresponding to the following values of $x$:

$x = 0$, $x = 0.1$, $x = 0.2$, ..., $x = 0.9$, $x = 1$,

$x = 1.2$, $x = 1.4$, ..., $x = 1.8$, $x = 2$.

Then draw the curve, freehand, as smoothly as you can. If this is done carefully, it will
really look as if FP = MP at every point of the curve.
One of the reasons for doing this is that it will give you an accurate idea of what a
parabola really looks like.
2. Show that

$$0 < x_1 < x_2 \implies x_1^2 < x_2^2.$$

[Figure: the graph of $y = ax^2$, $a > 0$, with $x_1 < x_2$ and $y_1 < y_2$.]

This means that the right-hand half of a parabola in standard position rises as we go from left to right along the curve.

3. Show that
What does this tell us about parabolas in standard position?
4. Find the focus and the directrix of the graph of the equation y = x2
5. Same problem, for the equation y = 3x2

6. Same problem, for the equation y = tx2

7. Show that the graph of the equation y = x2 + 1 is a parabola. To show this, you must
find its focus F and directrix D.

You can then check by deriving the equation of the

parabola with focus F and directrix D.

8. Show that the graph of the equation y = (x - 2)2 is a parabola.


9. Same problem, for the equation y = (x - 2)2 + 1.

10. Show that the graph of y = x2 - 2x is a parabola.

11. Show that the graph of the equation $y = (x + 1)^2$ is a parabola. (Find the focus $F$ and directrix $D$.)

12. Same problem, for $y = (x + 1)^2 - 1$.

13. Same problem, for $y = (2x + 1)^2$.

2.7 TANGENTS

In geometry, tangent lines to circles are defined as follows.


Definition. A tangent to a circle is a line (in the same plane) which intersects the
circle in one and only one point. This point is called the point of contact.
It is then shown that a line is tangent to the circle if and only if the line is perpendicular to the radius drawn to the point of contact. (In fact, the latter condition is probably the one that you used to find the slopes of tangent lines to circles, in Problem Set 2.4.)

[Figure: left, a tangent to a circle; right, a tangent to the ellipse $\dfrac{x^2}{a^2} + \dfrac{y^2}{b^2} = 1$.]

Tangency can be defined in the same way for an ellipse. Ellipses will be studied in Chapter 8. Meanwhile we observe that an ellipse is an oval curve, of the sort shown in the right-hand figure above, and the tangents to it are the lines that intersect it in one and only one point.

But for some curves, tangents cannot be described by the definition that we use for circles. Consider, for example, a parabola, as shown in the figure below. The tangent to the parabola, at the point $(x_1, y_1)$, intersects the curve only at $(x_1, y_1)$. But the vertical line through $(x_1, y_1)$ has the same property; and the vertical line is not a tangent.
[Figure: a parabola, with the tangent and the vertical line through $(x_1, y_1)$.]

We may try to get around this trouble by providing that the tangent line must only touch the curve, without crossing it. But for many curves, this won't work either. The graph of $y = x^3$ is shown below. The tangent to this curve at the origin turns out to be the x-axis; and the x-axis crosses the curve, at the point of tangency. In other cases a tangent line may cross a curve in many points.

[Figure: the graph of $y = x^3$.]

The geometric idea of tangency is obvious in all these cases. But the above examples indicate that the mathematical definition that works for circles does not work in general. To find the tangents to other curves, we need a better definition.

Consider first the graph of $y = x^2$, and the fixed point $(1, 1)$ at which we want to find the slope of the tangent. For every other point $(x, x^2)$ of the curve, let $L_x$ be the secant line through $(1, 1)$ and $(x, x^2)$. Then the slope of $L_x$ is

$$m_x = \frac{x^2 - 1}{x - 1} \qquad (x \ne 1).$$

Here the restriction $x \ne 1$ reflects the geometric fact that it takes two different points to determine a line. It also refers to the algebraic fact that fractions with denominator 0 have no meaning.


[Figure: the secant line $L_x$, and the graph of $y = m_x$.]

We shall now draw the graph of $y = m_x$ $(x \ne 1)$. We have

$$y = m_x = \frac{(x - 1)(x + 1)}{x - 1} = x + 1 \qquad (x \ne 1).$$

The graph is a line from which one point has been deleted. For $x = 1$, there is no such thing as "the secant line through $(1, 1)$ and $(1, 1^2)$"; and for $x = 1$, there is no such thing as the "fraction" 0/0. But this causes no trouble, because it is easy to see that $m_x$ is very close to 2 when $x$ is very close to 1. We express this by writing

$$\lim_{x \to 1} m_x = 2.$$

This is read: "The limit of $m_x$, as $x$ approaches 1, is equal to 2." Later we shall give a general definition of the idea of a limit. But in the present case, the meaning of the limit is clear, and so we use it in the definition of the tangent to the parabola.
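The statement that $m_x$ is very close to 2 when $x$ is very close to 1 can be observed by tabulating secant slopes. This is only a numerical illustration, not a proof; the helper names `secant_slope` and `square` are assumptions made for the sketch.

```python
def secant_slope(f, x0, x):
    """Slope of the secant line through (x0, f(x0)) and (x, f(x)), x != x0."""
    if x == x0:
        # Two different points are needed to determine a line.
        raise ValueError("x must differ from x0")
    return (f(x) - f(x0)) / (x - x0)


def square(x):
    return x * x


# Secant slopes of y = x^2 through (1, 1), approaching from both sides.
for h in (0.1, 0.01, 0.001):
    print(h, secant_slope(square, 1.0, 1.0 + h), secant_slope(square, 1.0, 1.0 - h))
```

Up to rounding, each printed slope equals $x + 1$, in agreement with the formula $m_x = x + 1$ $(x \ne 1)$.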
Definition. The tangent to the graph of

$$y = ax^2 + bx + c,$$

at a point $(x_0, y_0)$ of the graph, is the line through $(x_0, y_0)$ with slope

$$S_{x_0} = \lim_{x \to x_0} m_x,$$

where $m_x$ is the slope of the secant line passing through the points $(x_0, y_0)$ and $(x, ax^2 + bx + c)$ $(x \ne x_0)$.

Even in the general case, the slope is easy to calculate on the basis of this definition. We have

$$\begin{aligned}
m_x &= \frac{ax^2 + bx + c - ax_0^2 - bx_0 - c}{x - x_0} && (x \ne x_0) \\
&= \frac{a(x^2 - x_0^2) + b(x - x_0)}{x - x_0} && (x \ne x_0) \\
&= a(x + x_0) + b && (x \ne x_0) \\
&= ax + (ax_0 + b) && (x \ne x_0).
\end{aligned}$$

The graph of $y = m_x$ is a line with one point missing. The line from which the point is missing is shown on the left below. The graph of $y = m_x$ is on the right.

[Figure: left, the line $y = ax + (ax_0 + b)$; right, the graph of $y = m_x$, with the point at $x = x_0$ missing.]

Here again, the limit of $m_x$ is simply the y-coordinate of the point that is missing from the graph. Thus we have:


Theorem 1. Let $(x, y)$ be a point of the graph of

$$y = ax^2 + bx + c.$$

Then the slope of the tangent to the graph, at $(x, y)$, is

$$S_x = 2ax + b.$$

For some curves, there is no tangent. Consider, for example, the graph of $y = |x|$, at the point $(0, 0)$. For each $x \ne 0$,

$$m_x = \frac{|x| - |0|}{x - 0} = \frac{|x|}{x}.$$

Thus:

$$m_x = 1 \ \text{for}\ x > 0, \qquad m_x = -1 \ \text{for}\ x < 0.$$

(Remember the definition of $|x|$.)

[Figure: left, the graph of $y = |x|$; right, the graph of $y = m_x$, which is $y = 1$ for $x > 0$ and $y = -1$ for $x < 0$.]

Therefore the graph of $y = m_x$ looks like the drawing on the right above. For this graph there is no one number that $y$ is close to, whenever $x$ is close to 0. Therefore, there is no such thing as

$$\lim_{x \to 0} m_x \qquad (?);$$

for every number $k$, the statement

$$\lim_{x \to 0} m_x = k \qquad (?)$$

is false.

Geometrically it is obvious that the origin is the only point of the curve at which things go wrong; at every other point $(x, |x|)$, the curve has a tangent; the slope of the tangent is 1 for $x > 0$ and $-1$ for $x < 0$.
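The behavior of the secant slopes of $y = |x|$ at the origin can be tabulated directly. In this sketch the function name `secant_slope_abs` is hypothetical; it evaluates $m_x = |x|/x$ for values of $x$ on both sides of 0.

```python
def secant_slope_abs(x):
    """Slope of the secant line through (0, 0) and (x, |x|), for x != 0."""
    if x == 0:
        raise ValueError("x must be nonzero")
    return abs(x) / x


# However close x is to 0, the slope is 1 on the right and -1 on the left,
# so there is no one number that the slopes are close to.
for h in (0.1, 0.001, 0.00001):
    print(h, secant_slope_abs(h), secant_slope_abs(-h))
```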

PROBLEM SET 2.7


1.

You already have a carefully drawn graph of the equation J

= x2 At each point (x, J)

of the graph, the slope of the tangent ought to be 2x. Check this graphically by drawing
lines of the proper slope at the points where

2. Given J = x2

x = 0.2, 0.4, 0.6, 0.8, and 1.

4x + 4. Find the slopes of the tangents at the p oints where x =

2,

x = 0, and x = 2 and sketch, showing all three of these tangents.


,

= x2 + x + I, using the points where x

3.

Same problem, for J

4.

By completing the square, show that

0, x = L and x = 1.

$$y = ax^2 + bx + c$$

can be expressed in the form

$$y = a(x - A)^2 + B.$$

For $a > 0$, this means that the point where $x = A$ is the lowest point on the curve. Find the slope of the tangent at this point.

5. a) Given the graph of $y = ax^2$ and a point $(x_0, y_0)$ of the curve. Show that the tangent at $(x_0, y_0)$ is the only nonvertical line which passes through $(x_0, y_0)$ and has no other point in common with the parabola. That is, show that, if the graph of

$$(y - y_0) = m(x - x_0)$$

intersects the parabola only at $(x_0, y_0)$, then $m = 2ax_0$.

b) Prove the corresponding theorem for the graph of $y = ax^2 + bx + c$.

6. a) Get a plausible answer for the slope of the tangent to the graph of $y = x^3$, at the point $(1, 1)$. Sketch the graph of $y = m_x$, explain what sort of graph it is, and explain as well as you can why your value for the slope is plausible.

b) Do the same for $y = x^3$, at an arbitrary point $(x_0, x_0^3)$.

7. a) Show that, if $m < 0$, then the line through the origin with slope $m$ meets the graph of $y = x^3$ at precisely one point.

b) Show that, if $m > 0$, then the line through the origin with slope $m$ meets the graph of $y = x^3$ at precisely three points.

8. Sketch the graph of $y = x|x|$, and describe this curve in terms of types of curve that we already know about. At which points does this graph have a tangent? What is $S_2$? What is $S_{-2}$? Give, if possible, a general formula for $S_x$. Is there such a thing as $S_0$?

9. Consider the graph of $y = x^3 - 4x$. Where does this cross the $x$-axis? At which points is the tangent horizontal? What is the slope of the tangent at $(0, 0)$? For what values of $x$ is $y > 0$? For what values of $x$ is $y < 0$? Use this information to draw a reasonable sketch of the graph, plotting only five points.

10. Carry out the steps of Problem 9 for the equation $y = 2x^3 - 6x$.

11. Show that every parabola has the reflecting property. In the figure, $T$ is the tangent at $P$, and you need to show that $\alpha = \beta$. The key to the proof is that the quadrilateral $FPRQ$ is a rhombus. (That is, all four sides have the same length.)

2.8

A SHORTHAND FOR SUMS

An arithmetic series is a sum of the form

$$S_n = a + (a + d) + (a + 2d) + \cdots + (a + [n - 1]d).$$

A geometric series is a sum of the form

$$T_n = a + ar + ar^2 + ar^3 + \cdots + ar^{n-1}.$$

There is a shorthand for sums, which makes them easier to handle. Given a sum

$$S_n = a_1 + a_2 + \cdots + a_n,$$

we write

$$S_n = \sum_{i=1}^{n} a_i.$$

(This is pronounced: "The summation from 1 to $n$, of $a_i$.") That is, when we write $\sum_{i=1}^{n}$ and follow it with an expression involving $i$, this means that we are to substitute all (integral) values of $i$, from 1 to $n$, and add the results.

1) The geometric series

$$S_n = a + ar + ar^2 + \cdots + ar^{n-1}$$

can be written as

$$S_n = \sum_{i=1}^{n} ar^{i-1}.$$

The shorthand can be checked by substituting the values of $i$ from 1 to $n$:

    i      ar^(i-1)
    1      ar^(1-1) = a
    2      ar^(2-1) = ar
    3      ar^(3-1) = ar^2
    4      ar^(4-1) = ar^3
    ...    ...
    n      ar^(n-1)

When we add these, we get the geometric series.

2) Consider the sum

$$R_n = 1^2 + 2^2 + 3^2 + \cdots + n^2.$$

In the short form,

$$R_n = \sum_{i=1}^{n} i^2.$$

3) An arithmetic series can be written in the form

$$S_n = \sum_{i=1}^{n} [a + (i - 1)d].$$

This can be checked by means of a table of the sort that we gave above for the case of the geometric series.

In each case, the formula after $\sum$ gives the $i$th term; $i = 1$ gives the first term, $i = 2$ gives the second term, and so on. This will always be true so long as we are taking the sum from $i = 1$ to $n$. However, we also write such sums as

$$\sum_{i=2}^{5} i^3.$$

Here we take all values of $i$ from $i = 2$ to $i = 5$ inclusive and add the results. Therefore

$$\sum_{i=2}^{5} i^3 = 2^3 + 3^3 + 4^3 + 5^3 = 8 + 27 + 64 + 125 = 224.$$

In general, for $m \leq n$,

$$\sum_{i=m}^{n} a_i = a_m + a_{m+1} + \cdots + a_n.$$

Thus

$$
\begin{aligned}
\sum_{i=2}^{4} (a_i + 1) &= (a_2 + 1) + (a_3 + 1) + (a_4 + 1) \\
                         &= a_2 + a_3 + a_4 + 3 \\
                         &= \sum_{i=2}^{4} a_i + 3.
\end{aligned}
$$

Note that $\sum$ applies only to the expression immediately after it; in the last line, we are told to add the numbers $a_i$ (from $i = 2$ to $i = 4$) and add 3 to the result. The parentheses in the formula $\sum_{i=2}^{4} (a_i + 1)$ indicate that 1 is part of every term of the sum.
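The summation shorthand translates almost word for word into a short program. The sketch below is only an illustration: the helper name `sigma` and the sample values of `a`, `r`, `d`, `n`, and the sequence `seq` are assumptions, not anything from the text.

```python
def sigma(term, m, n):
    """Substitute every integer i with m <= i <= n into term(i) and add the results."""
    return sum(term(i) for i in range(m, n + 1))

# Hypothetical sample values, chosen only for illustration.
a, r, d, n = 3, 2, 5, 6

geometric = sigma(lambda i: a * r ** (i - 1), 1, n)    # a + ar + ... + ar^(n-1)
arithmetic = sigma(lambda i: a + (i - 1) * d, 1, n)    # a + (a+d) + ... + (a+[n-1]d)
cubes = sigma(lambda i: i ** 3, 2, 5)                  # 2^3 + 3^3 + 4^3 + 5^3

# The role of the parentheses: adding 1 inside the sum adds it once per term.
seq = {2: 10, 3: 20, 4: 30}                            # a hypothetical a_2, a_3, a_4
inside = sigma(lambda i: seq[i] + 1, 2, 4)
outside = sigma(lambda i: seq[i], 2, 4) + 3

print(geometric, arithmetic, cubes, inside, outside)
```

Here `inside` and `outside` agree because the sum from 2 to 4 has exactly three terms.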

PROBLEM SET 2.8


Find each of the following sums numerically:
3

1. >2
i=l

Each of the sums below is of the form

am + Gm+l +

3. I u2 - 1)
i=2

2. I u-1)2
i=l

+ an.

Lf=m a;.

s.

i=3

4. "2i2
i=l

s.

I u3

i=2

Write each of them in the long form

3
I (3b + d)

i=2

Convert each of the indicated sums to the short form:

10. 12 + 22 + 32 +
12. k2 + (k

n2

1)2 + (k + 2)2 +

13. 21 + .13 + ... +


14.

k-1

--

Is is true that

(n + 1)2

11. 32 + 42 + ... + k2

+ (n - 1)2

1
k

+-

n
I (a;

i=l
Why or why not?

1)

b;) = I a; + I b;?
i=l
i=l

9.

!i7
i=m

15. Is it true that

$$\sum_{i=1}^{n} k a_i = k \sum_{i=1}^{n} a_i?$$

Why or why not?


16.

Is it true that
n

L..
i=1

( -) hi 2
n

h3
-a
n

L..

iz?

i=l

Why or why not?


17.

()is the number of subsets with exactly k elements, in a given set with
<mis the number of possible 13-card bridge hands; (552) is
the number of possible 5-card draw poker hands. Show that ()
<D
Show that
GD G).
For 0

k b,

elements.

For example,

*18.

*19. Show that

2.9

THE INDUCTION PRINCIPLE AND THE WELL-ORDERING PRINCIPLE

Consider the following game. We have three spindles, of the sort used as targets in quoits. On the first spindle is a stack of wooden disks, diminishing in size from bottom to top. (See the figure.) The disks are numbered $1, 2, 3, \ldots, n$, from top to bottom; in the figure, $n = 5$.

A legal move consists in taking the topmost disk from one spindle and placing it on one of the other spindles, provided that we never, at any stage, place a disk above a smaller disk.

At the start, all the disks are on spindle A. The object of the game is to get all the disks onto spindle B, by a series of legal moves.

For example, we might begin by taking disk 1 off spindle A and putting it on spindle B. There would then be three possibilities for the second move: (1) put disk 1 back on spindle A, (2) put disk 1 on spindle C, and (3) put disk 2 on spindle C. It would not be legal to put disk 2 on spindle B, because disk 2 would then be above disk 1, which is smaller.

We shall see that the game can always be completed, no matter how large the positive integer $n$ may be. For each positive integer $n$, let $P_n$ be the proposition that the game can be completed, starting with $n$ disks. What we need to show is that all of the propositions $P_n$ are true.

Lemma 1. $P_1$ is true.

(A lemma is a sort of subtheorem, used as a step in the proof of a harder theorem.)

Proof of Lemma 1. Move the one and only disk from spindle A to spindle B. Then the game is over.

Lemma 2. $P_2$ is true.

Proof of Lemma 2. (1) Move disk 1 to spindle C. (2) Move disk 2 to spindle B. (3) Move disk 1 to spindle B. Then the game is over.

Lemma 3. $P_3$ is true.

Proof of Lemma 3. By Lemma 2, disks 1 and 2 can be moved to spindle C. (Lemma 2 really means that any two disks at the top of a stack can be moved to any other spindle.) Do this. Then move disk 3 to spindle B. By Lemma 2, disks 1 and 2 can then be moved to spindle B, whereupon the game is over.

A pattern is now appearing, suggesting the following lemma. This lemma states that if the game with $n$ disks can be completed, then the game with $n + 1$ disks can also be completed.

Lemma 4. For each $n$, $P_n \Rightarrow P_{n+1}$.

Proof of Lemma 4. We are given $n + 1$ disks on spindle A, and we are given by hypothesis that $P_n$ is true. Therefore the stack consisting of disks $1, 2, \ldots, n$ can be moved to spindle C by legal moves. Do this. (Disk $n + 1$ causes no trouble; it can be regarded as the base of the spindle on which it lies, because it is larger than any of the disks being moved.) Then move disk $n + 1$ from spindle A to spindle B. By $P_n$, we know that disks $1, 2, \ldots, n$ can be moved from spindle C to spindle B. Then the game is over.
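The proof of Lemma 4 is, in effect, a recipe, and it can be written out as a recursive procedure. The following is a minimal sketch under that reading; the function names `hanoi` and `is_legal` and the representation of a move as a triple `(disk, from_spindle, to_spindle)` are assumptions made for the illustration.

```python
def hanoi(n, source="A", target="B", spare="C"):
    """List of moves (disk, from, to) carrying disks 1..n from source to target."""
    if n == 0:
        return []
    return (
        hanoi(n - 1, source, spare, target)    # move disks 1..n-1 out of the way
        + [(n, source, target)]                # move the largest disk
        + hanoi(n - 1, spare, target, source)  # bring disks 1..n-1 back on top
    )

def is_legal(moves, n):
    """Replay the moves and check the rules of the game, starting with n disks on A."""
    pegs = {"A": list(range(n, 0, -1)), "B": [], "C": []}  # each list runs bottom to top
    for disk, src, dst in moves:
        if not pegs[src] or pegs[src][-1] != disk:
            return False                       # only a topmost disk may be taken
        if pegs[dst] and pegs[dst][-1] < disk:
            return False                       # never place a disk above a smaller one
        pegs[dst].append(pegs[src].pop())
    return pegs["B"] == list(range(n, 0, -1))  # everything ends on spindle B

print(hanoi(2))
print(len(hanoi(5)), is_legal(hanoi(5), 5))
```

For `n = 2` the procedure reproduces the three moves in the proof of Lemma 2. The recursion terminates for the same reason that the induction works: each call reduces the number of disks by one.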
Lemma 4 gives us an infinite chain of implications:

$$P_1 \Rightarrow P_2 \Rightarrow P_3 \Rightarrow \cdots \Rightarrow P_n \Rightarrow P_{n+1} \Rightarrow \cdots$$

And Lemma 1 tells us that the first statement in the chain is true. Therefore all of the statements $P_1, P_2, \ldots$ are true. This idea is conveyed mathematically as follows:

The Induction Principle. Let $P_1, P_2, \ldots$ be a sequence of propositions (one for every positive integer). If

a) $P_1$ is true, and

b) $P_n \Rightarrow P_{n+1}$ for every $n$,

then all of the propositions $P_1, P_2, \ldots$ are true.

The problem of the disks is probably the clearest illustration of what the induction
principle means.

The principle is used continually, in all branches of mathematics.

In this section, we shall use it to get short formulas for certain sums.

Theorem 1. For every $n$,

$$\sum_{i=1}^{n} i = \frac{n}{2}(n + 1).$$

Proof. For each $n$, let $P_n$ be the proposition that

$$\sum_{i=1}^{n} i = \frac{n}{2}(n + 1).$$

a) $P_1$ is true, because

$$\sum_{i=1}^{1} i = 1 = \tfrac{1}{2}(1 + 1).$$

b) $P_n \Rightarrow P_{n+1}$ for every $n$, because

$$
\begin{aligned}
\sum_{i=1}^{n} i = \frac{n}{2}(n + 1)
&\Rightarrow \sum_{i=1}^{n} i + (n + 1) = \frac{n}{2}(n + 1) + (n + 1) \\
&\Rightarrow \sum_{i=1}^{n+1} i = \left(\frac{n}{2} + 1\right)(n + 1) \\
&\Rightarrow \sum_{i=1}^{n+1} i = \frac{n + 1}{2}(n + 2).
\end{aligned}
$$

In this chain of implications, the first equation is $P_n$ and the last is $P_{n+1}$. Therefore $P_n \Rightarrow P_{n+1}$. By the induction principle, $P_n$ is true for every $n$, which was to be proved.
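A finite check is no substitute for the induction proof, but it is a useful guard against algebra slips. The sketch below, with hypothetical helper names `direct_sum` and `closed_form`, compares the two sides of Theorem 1 and also tests the identity used in step (b).

```python
def direct_sum(n):
    """1 + 2 + ... + n, added term by term."""
    return sum(range(1, n + 1))

def closed_form(n):
    """The formula (n/2)(n + 1) of Theorem 1, in exact integer arithmetic."""
    return n * (n + 1) // 2

# Step (a): the case n = 1.
print(direct_sum(1), closed_form(1))

# Step (b), checked for the first few hundred n: adding n + 1 to the formula
# for n gives the formula for n + 1.
step_holds = all(closed_form(n) + (n + 1) == closed_form(n + 1) for n in range(1, 300))
formula_holds = all(direct_sum(n) == closed_form(n) for n in range(1, 300))
print(step_holds, formula_holds)
```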

In fact, there is a simpler way of getting this result. If

$$S_n = 1 + 2 + 3 + \cdots + (n - 1) + n,$$

then

$$S_n = n + (n - 1) + (n - 2) + \cdots + 2 + 1;$$

and adding terms in pairs, we get

$$2S_n = (1 + n) + (1 + n) + \cdots + (1 + n) + (1 + n),$$

to $n$ terms. Therefore

$$2S_n = n(n + 1) \quad \text{and} \quad S_n = \frac{n}{2}(n + 1),$$

as before. This device is neat but very special. Consider now the problem of calculating

$$S_n = \sum_{i=1}^{n} i^2 = 1^2 + 2^2 + 3^2 + \cdots + n^2.$$

We have just found that the sum of the first $n$ positive integers is a polynomial in $n$, of degree 2. This suggests that $S_n$ is a polynomial of degree 3. That is, we conjecture that

$$S_n = An^3 + Bn^2 + Cn + D,$$

for some numbers $A$, $B$, $C$, $D$. The problem is to find $A$, $B$, $C$, and $D$, and prove by induction that they work. Let $P_n$ be the proposition that

$$P_n: \quad \sum_{i=1}^{n} i^2 = An^3 + Bn^2 + Cn + D.$$

Then $P_{n+1}$ asserts that

$$P_{n+1}: \quad \sum_{i=1}^{n} i^2 + (n + 1)^2 = A(n + 1)^3 + B(n + 1)^2 + C(n + 1) + D.$$

We want $P_n \Rightarrow P_{n+1}$ to make the induction proof work. This means that

$$An^3 + Bn^2 + Cn + D + (n + 1)^2 = A(n + 1)^3 + B(n + 1)^2 + C(n + 1) + D.$$

If this equation holds, then $P_n \Rightarrow P_{n+1}$. (Check the algebra.) Collecting coefficients, we get the equivalent equation

$$An^3 + (B + 1)n^2 + (C + 2)n + D + 1 = An^3 + (3A + B)n^2 + (3A + 2B + C)n + A + B + C + D,$$

or

$$(1 - 3A)n^2 + (2 - 3A - 2B)n + 1 - A - B - C = 0.$$

This holds if

$$A = \tfrac{1}{3}, \qquad B = \tfrac{1}{2}(2 - 3A) = \tfrac{1}{2}, \qquad C = 1 - A - B = \tfrac{1}{6}.$$

This gives

$$P_n: \quad \sum_{i=1}^{n} i^2 = \tfrac{1}{3}n^3 + \tfrac{1}{2}n^2 + \tfrac{1}{6}n + D.$$

Thus, for any $D$, $P_n \Rightarrow P_{n+1}$. For $D = 0$, $P_1$ is true. We take $D = 0$; and we know by the induction principle that

$$\sum_{i=1}^{n} i^2 = \tfrac{1}{3}n^3 + \tfrac{1}{2}n^2 + \tfrac{1}{6}n,$$

for every $n$. Taking a common denominator on the right and factoring, we get:

Theorem 2. For every $n$,

$$\sum_{i=1}^{n} i^2 = \frac{n}{6}(n + 1)(2n + 1).$$
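The coefficient hunt can be retraced in exact arithmetic. The following sketch assumes nothing beyond the three conditions $1 - 3A = 0$, $2 - 3A - 2B = 0$, $1 - A - B - C = 0$ and the requirement that $P_1$ hold; the function names `cubic` and `factored` are hypothetical.

```python
from fractions import Fraction

# Solve the three coefficient conditions in the order used in the text.
A = Fraction(1, 3)        # from 1 - 3A = 0
B = (2 - 3 * A) / 2       # from 2 - 3A - 2B = 0
C = 1 - A - B             # from 1 - A - B - C = 0
D = 1 - (A + B + C)       # from P_1: 1^2 = A + B + C + D

def cubic(n):
    """The conjectured polynomial An^3 + Bn^2 + Cn + D."""
    return A * n ** 3 + B * n ** 2 + C * n + D

def factored(n):
    """The factored form (n/6)(n + 1)(2n + 1) of Theorem 2."""
    return Fraction(n * (n + 1) * (2 * n + 1), 6)

print(A, B, C, D)
agrees = all(
    cubic(n) == factored(n) == sum(i * i for i in range(1, n + 1))
    for n in range(1, 200)
)
print(agrees)
```

Using `Fraction` rather than floating-point numbers keeps the comparison exact, so the agreement is not an artifact of rounding.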
i=l

For some purposes, the following idea is easier to use than the Induction Principle.

The Well-Ordering Principle. Every nonempty set of positive integers has a least element.

(See, for example, Problems 10 and 12 below.) The Well-Ordering Principle and the Induction Principle are equivalent. (See Problems 14 and 15 below.)

PROBLEM SET 2.9

1. Prove by any method that for every $n$, the sum of the first $n$ odd numbers is $n^2$. That is,

$$\sum_{i=1}^{n} (2i - 1) = n^2.$$

This can be shown by induction, but there are at least two other ways.

2. Prove by induction that

$$1 + r + r^2 + \cdots + r^n = \frac{r^{n+1} - 1}{r - 1} \qquad (r \neq 1).$$

3. Prove by induction that

4. Find by any method a formula for

$$\sum_{i=1}^{n} (3i - 1).$$

5. Find by any method a formula for

$$\sum_{i=1}^{n} (4i - 2).$$

6. Find a formula for

$$\sum_{i=1}^{n} (i^2 + i + 1).$$

7. Find a formula for

$$\sum_{i=1}^{n} (i^2 - i).$$

8. Assume that if $A_1, A_2, A_3$ are points, then

$$A_1A_2 + A_2A_3 \geq A_1A_3.$$

Prove that for every $n \geq 3$ we have

$$A_1A_2 + A_2A_3 + \cdots + A_{n-1}A_n \geq A_1A_n.$$

This is known as the polygonal inequality.


9. a) Let Pn be the number of moves required to complete the game with n disks. Show
that for every 11,

Pn+l

2pn + 1.

b) Let Pn be as in (a). Show that for each

p,,
(Since 210
moves.

2"

11,

1.

1024, this means that the game with 20 disks requires over a million

Thus, if you want to verify that P20 is true, the easiest way to do it is to

show by induction that Pn is true for every

11,

and then set

11

20.)

*10. Throughout this problem, the numbers under discussion are positive integers. If $a = bc$ for some $c$, then $b$ is called a factor of $a$ (or a divisor of $a$). If $p > 1$, and the only positive factors of $p$ are $p$ and 1, then $p$ is a prime. Obviously every prime has a prime factor, namely, itself. Prove that every number greater than 1 has a prime factor. [Beginning of the proof: "Let $K$ be the set of all numbers which are greater than 1 and have no prime factors. We need to show that $K$ is empty. If $K$ is not empty, then ..."]

*11. Following is the beginning of Euclid's proof that there are infinitely many primes. Suppose that there are only a finite number of primes, say $p_1, p_2, \ldots, p_n$. Consider the number

$$N = p_1 p_2 p_3 \cdots p_n + 1.$$

Complete Euclid's proof, by showing that this situation is impossible.

*12. Show that every rational number can be expressed as a fraction in lowest terms. [Hint:
Try the Well-Ordering Principle.]
13. In the song "The Twelve Days of Christmas," gifts are sent on successive days according to the following scheme:

First day: a partridge in a pear tree.
Second day: another partridge, and two turtledoves.

For each $i$, let $G_i$ be the number of gifts sent on the $i$th day. Then

$$G_i = G_{i-1} + i.$$

(Which we have just observed for $i = 2$.) Let $T_n$ be the total number of gifts sent on the first $n$ days of Christmas. Get a formula for $T_n$, in the form

$$T_n = \frac{?(? + ?)(? + ?)}{?}.$$

As a check, the final value is $T_{12} = 364$. (I am indebted, for this problem, to Professor Thomas F. Banchoff.)

*14. Show that, if the Well-Ordering Principle is taken as a postulate, then the Induction Principle can be proved as a theorem. [Start of the proof: Suppose that not all of the propositions $P_n$ are true, and let

$$K = \{n \mid P_n \text{ is false}\}.$$

Then $K \neq \{\ \}$. Therefore ...]

*15. Show conversely that, if the Induction Principle is taken as a postulate, then the Well-Ordering Principle can be proved as a theorem. [Start of the proof: For each $n$, let $P_n$ be the proposition that none of the integers $1, 2, \ldots, n$ belongs to $K$. ...]
The diagram below is related to one of the problems in this section.

2.10

SOLUTION OF THE AREA PROBLEM FOR PARABOLAS

If a line intersects a parabola in two points, then it cuts off a region called a parabolic
sector. In the left-hand figure below, the sector is the region lying above the parabola
and below the line. In the third century B.C., Archimedes discovered a method for
finding the area of a parabolic sector. In this section we shall give an easier solution
of the problem.

The problem will be solved if we can find the area of a "curvilinear triangle" of
the type shown on the right above. If we can do this, then we can find the area of the
trapezoid in the other figure, and subtract the areas of the two curvilinear triangles.
The result will be the area of the sector.

We shall attack the area problem, for the graph of $y = x^2$, by approximating the region with rectangles, like this:

[Figure: the region under $y = x^2$ from 0 to $h$, covered by $n$ rectangles.]

We cut the closed interval $[0, h]$ into $n$ little intervals of equal length, using the division points

$$0, \ \frac{h}{n}, \ \frac{2h}{n}, \ \ldots, \ \frac{(i - 1)h}{n}, \ \frac{ih}{n}, \ \ldots, \ \frac{(n - 1)h}{n}, \ h.$$

This gives a sequence of closed intervals

$$\left[0, \frac{h}{n}\right], \ \left[\frac{h}{n}, \frac{2h}{n}\right], \ \ldots, \ \left[\frac{(i - 1)h}{n}, \frac{ih}{n}\right], \ \ldots, \ \left[\frac{(n - 1)h}{n}, h\right].$$

With each of these intervals as base, we construct a rectangle, using as altitude the height of the parabola at the right-hand endpoint. The right-hand endpoint of the $i$th interval is $ih/n$. Therefore the altitude of the $i$th rectangle is $(ih/n)^2$. Therefore the area of the $i$th rectangular region is

$$a_i = \left(\frac{ih}{n}\right)^2 \cdot \frac{h}{n} = \frac{h^3 i^2}{n^3}.$$

Let $R_n$ be the union of all these rectangular regions. Then the area of $R_n$ is

$$A_n = \sum_{i=1}^{n} a_i = \sum_{i=1}^{n} \frac{h^3 i^2}{n^3} = \frac{h^3}{n^3} \sum_{i=1}^{n} i^2.$$

We want to find out what limit $A_n$ approaches as $n$ becomes very large. If we find this limit, then our problem is solved, because the limit is the area of the region $R$ that we started with.

We found, in Theorem 2 of Section 2.9, that

$$\sum_{i=1}^{n} i^2 = \frac{n}{6}(n + 1)(2n + 1).$$

Therefore

$$A_n = \frac{h^3}{n^3} \cdot \frac{n}{6}(n + 1)(2n + 1) = \frac{h^3}{3}\left(1 + \frac{1}{n}\right)\left(1 + \frac{1}{2n}\right).$$

As $n$ becomes large without limit, it is easy to see that

$$\frac{1}{n} \to 0 \quad \text{and} \quad \frac{1}{2n} \to 0,$$

so that

$$1 + \frac{1}{n} \to 1 \quad \text{and} \quad 1 + \frac{1}{2n} \to 1,$$

and

$$A_n = \frac{h^3}{3}\left(1 + \frac{1}{n}\right)\left(1 + \frac{1}{2n}\right) \to \frac{h^3}{3}.$$

Therefore the area under the parabola, from 0 to $h$, is

$$A = \frac{h^3}{3}.$$

It would have been equally natural to approximate the area from the inside. We shall see that this procedure leads to the same answer as before. Here we have cut up the interval $[0, h]$ into the same little intervals as before; but on each little interval we have set up a rectangle whose altitude is the height of the parabola at the left-hand endpoint.

[Figure: the region under $y = x^2$ from 0 to $h$, with $n$ inscribed rectangles on the division points $h/n, 2h/n, \ldots, (i-1)h/n, ih/n, \ldots, (n-1)h/n, nh/n$.]

Therefore, on $[0, h/n]$ our "rectangle" is merely the base interval, with area 0; and thereafter the area of the $i$th rectangle is

$$a_i' = \left[\frac{(i - 1)h}{n}\right]^2 \cdot \frac{h}{n} = \frac{h^3}{n^3}(i - 1)^2.$$

Let $R_n'$ be the union of these rectangular regions. Then the area of $R_n'$ is

$$A_n' = \sum_{i=1}^{n} \frac{h^3}{n^3}(i - 1)^2 = \frac{h^3}{n^3} \sum_{i=2}^{n} (i - 1)^2 = \frac{h^3}{n^3} \sum_{i=1}^{n-1} i^2.$$

To see why the last equation holds, observe that each of the indicated sums is the sum of the squares of the integers from 1 to $n - 1$. Therefore

$$A_n' = \frac{h^3}{n^3}\left[\frac{n}{6}(n + 1)(2n + 1) - n^2\right] = A_n - \frac{h^3}{n}.$$

As $n$ increases, $A_n \to h^3/3$ and $h^3/n \to 0$. Therefore $A_n' \to h^3/3$, and we get the same limit as before. To sum up:

Theorem 1. Let

$$R = \{(x, y) \mid 0 \leq x \leq h \text{ and } 0 \leq y \leq x^2\}.$$

Then the area of $R$ is $h^3/3$.
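The two approximations can be computed directly for a particular $h$. The sketch below uses exact fractions; the function names `outer_sum` and `inner_sum` and the sample value `h = 3` are assumptions for the illustration. It checks the closed form for the outer sum and the fact that the outer and inner sums differ by exactly $h^3/n$.

```python
from fractions import Fraction

def outer_sum(h, n):
    """Total area of n rectangles with altitude taken at right-hand endpoints."""
    width = Fraction(h) / n
    return sum((i * width) ** 2 * width for i in range(1, n + 1))

def inner_sum(h, n):
    """Total area of n rectangles with altitude taken at left-hand endpoints."""
    width = Fraction(h) / n
    return sum(((i - 1) * width) ** 2 * width for i in range(1, n + 1))

h = 3  # hypothetical sample value; the limiting area is h^3/3 = 9
for n in (10, 100, 1000):
    print(n, float(inner_sum(h, n)), float(outer_sum(h, n)))
```

For `h = 3` the printed inner values increase toward 9 and the outer values decrease toward 9, so the area is squeezed between them.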

It is easy to extend this result to the case in which the parabola is the graph of $y = kx^2$, $k > 0$.

[Figure: the graph of $y = kx^2$, with the $i$th approximating rectangle, whose area is $b_i = k(ih/n)^2 \cdot (h/n)$.]

When we multiply $y$ by $k$, this multiplies the area of each approximating rectangle by $k$. Thus, if

$$A_n = \sum_{i=1}^{n} a_i,$$

as before, and

$$B_n = \sum_{i=1}^{n} b_i,$$

we have

$$B_n = \sum_{i=1}^{n} k a_i = k \sum_{i=1}^{n} a_i = kA_n.$$

Since $A_n \to h^3/3$, we have

$$B_n \to \frac{kh^3}{3}.$$

Therefore we have the following theorem:

Theorem 2. Let

$$R = \{(x, y) \mid 0 \leq x \leq h \text{ and } 0 \leq y \leq kx^2\},$$

with $k > 0$. Then the area of $R$ is $kh^3/3$.

In general, for $a < b$ let $A_a^b\, kx^2$ be the area of the region under the graph of $y = kx^2$, from $a$ to $b$. Then we have the following:

Theorem 3.

$$A_a^b\, kx^2 = \frac{k}{3}(b^3 - a^3).$$

(Proof? There are three cases to consider: $a < b \leq 0$, $a < 0 \leq b$, $0 \leq a < b$.)
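The formula of Theorem 3 can be compared with a direct numerical approximation in each of the three cases. This is a sketch only: the names `area_formula` and `midpoint_sum`, the value `k = 5`, and the use of midpoint rectangles (instead of the right-hand or left-hand rectangles of the text) are assumptions made for convenience.

```python
def area_formula(k, a, b):
    """Area under y = k*x^2 from a to b, as given by Theorem 3."""
    return k * (b ** 3 - a ** 3) / 3

def midpoint_sum(k, a, b, n=10_000):
    """Approximate the same area with n rectangles whose altitude is taken at midpoints."""
    width = (b - a) / n
    return sum(k * (a + (i - 0.5) * width) ** 2 * width for i in range(1, n + 1))

k = 5.0  # hypothetical sample value
cases = [(-3.0, -1.0), (-2.0, 2.0), (1.0, 4.0)]  # a < b <= 0, a < 0 <= b, 0 <= a < b
for a, b in cases:
    print((a, b), area_formula(k, a, b), midpoint_sum(k, a, b))
```

Midpoint rectangles are used here only because they converge quickly; the endpoint rectangles of the text give the same limit.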

PROBLEM SET 2.10
Find the area under the graph of $y = 5x^2$, between the following limits.

1. From 0 to 4

2. From 0 to 2

3. From $-2$ to 0

4. From 2 to 4

5. From $-2$ to 2

Find the area under the graph of $y = 2x^2 + 1$, between the following limits.

6. From 0 to 4

7. From $-1$ to 0

8. From $-1$ to 3

9. Find the area of the parabolic sector between the graphs of y. = 2x2 and y
10.

Same problem, for y

11.

Find the area of the sector between the graphs of y = x2 - 1 and y

12. Same problem, for y


13.

x2 and y

x2 and y

x + 1.

x.
=

-x2 + 1.

2x2 - 1.

Solve, for the general case, the problem of Archimedes, stated at the beginning of
this section.

14. a) For each n, let


An = 1 +

1
----=

v'n
Obviously An > 1 for every n.

Under what condition for n can you be sure that

b) Under what condition for n can you be sure that

1
An - l <
?
10 '000 '000
c) Let

be any positive number.

An - 1 < E?

15.

a)

For each n, let

Under what condition for n can you be sure that

2n - 2
En= --.
3
n -1

Under what condition for n can you be sure that En < lo?

b) Under what condition for n can you be sure that

c) Given any positive number

E,,. < E?

16.

E,

under what condition for n can you be sure that

For each n, let

Obviously C,,. > 4 for every n.


whenever n is sufficiently large.

Given a positive number

E,

show that Cn

4 <

62

Analytic Geometry

17. a) For each n, let Dn

that Dn <

2.10

n2

3n

1
+ 2 .

Under what condition for n can you be sure

102 ?

b) Given any positive number E, under what condition for n can you be sure that

Dn < E?

18.

Given an ellipse, find its area.


y

-b
-

x2
y2
-+- = 1
a2
b2

This can be done by a method somewhat similar to one used in the preceding section of
the text.

[Hint: In the figures, what is the relation between y and k?]

19. In the discussion preceding Theorem 1, we fo und that A

A n - h3/n.

Verify this

statement geometrically, without using a formula for either An or A.

Hint: Draw a figure showing both the inner and outer rectangles, and explain why
An - A
*20. a) Find a formula for

h
=

. 1z2.

""'

.., l 3

i=l

b) Find the area of the region under the graph of y = x3, from 0 to 1.
*21. a) Let

as in the text, and let

Thus En is the error in the approximation An

""'

'13/3, and En > 0 for each n.

Calculate En and show that En < h3/n for each n.


b) Show that for every E > 0, En < E when n is sufficiently large. That is, find a number
N such that En < E whenever n > N.

Functions,
Derivatives, and Integrals

3.1

THE IDEA OF A FUNCTION

Roughly speaking, a function is a law of correspondence under which to each element of one set there corresponds one and only one element of another set. Consider some examples.

1) Suppose that we have set up a coordinate system in a plane $E$. Then to each point $P$ of $E$ there corresponds a number $x$ which is the $x$-coordinate of $P$. Thus we have a function

$$E \to R$$

which matches points $P$ of $E$ with elements $x$ of $R$.

2) Similarly, every point $P$ has a unique $y$-coordinate $y$. Thus we have another function

$$E \to R.$$

To distinguish these two functions, we give them different names, say, $X$ and $Y$. Thus

$$X: E \to R, \qquad : P \mapsto x$$

is the "$x$-coordinate function," and

$$Y: E \to R, \qquad : P \mapsto y$$

is the "$y$-coordinate function." When we write $P \mapsto x$ (with the vertical bar on the left-hand end of the arrow), this means that each point $P$ is matched with its $x$-coordinate $x$. Thus we write $\to$ between sets and $\mapsto$ between elements of the sets.

3) If the real number $x$ is known, then $x^2$ is determined. Thus we have a function

$$f: R \to R, \qquad : x \mapsto x^2.$$

4) Every nonnegative real number has one and only one nonnegative square root. Thus we have a function

$$g: R^+ \to R, \qquad : x \mapsto \sqrt{x},$$

where $R^+$ denotes, as usual, the set of all nonnegative real numbers.



5) If $x \geq 2$, then $x - 2$ is nonnegative, and so has one and only one nonnegative square root. Thus we have a function

$$h: [2, \infty) \to R, \qquad : x \mapsto \sqrt{x - 2}.$$

6) The absolute value $|x|$ of $x$ is defined by the conditions

$$|x| = x \quad \text{for } x \geq 0$$

and

$$|x| = -x \quad \text{for } x < 0.$$

In either case, if $x$ is known, then $|x|$ is determined. Thus we have a function

$$i: R \to R, \qquad : x \mapsto |x|.$$

In each of these six cases we have a function

$$f: A \to B,$$

where $A$ and $B$ are sets of some kind. The elements of $A$ are the objects to which things are going to correspond. The set $A$ is called the domain of the function $f$. In each case, $B$ is a set which contains all of the objects which correspond to elements of $A$. The set $B$ is called the range of the function $f$. Finally, to have a function $f$, we must have a rule under which to each element of $A$ there corresponds a unique element of $B$. Under these conditions, we have a function of $A$ into $B$.

We can sum up the preceding examples in the following table.

    Example   Function   Domain       Range   Rule
    1         X          E            R       P |-> x
    2         Y          E            R       P |-> y
    3         f          R            R       x |-> x^2
    4         g          R+           R       x |-> sqrt(x)
    5         h          [2, oo)      R       x |-> sqrt(x - 2)
    6         i          R            R       x |-> |x|
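The three ingredients of a function (domain, range, rule) can be mirrored in a short program. The sketch below models Examples 4, 5, and 6; the Python names `g`, `h`, and `absolute` are stand-ins chosen for the illustration, and raising an error outside the domain is merely one way of recording that the rule does not apply there.

```python
import math

def g(x):
    """Example 4: x |-> sqrt(x), with domain the nonnegative numbers."""
    if x < 0:
        raise ValueError("x is outside the domain R+")
    return math.sqrt(x)

def h(x):
    """Example 5: x |-> sqrt(x - 2), with domain [2, oo)."""
    if x < 2:
        raise ValueError("x is outside the domain [2, oo)")
    return math.sqrt(x - 2)

def absolute(x):
    """Example 6: x |-> |x|, defined by cases exactly as in the text."""
    return x if x >= 0 else -x

print(g(9), h(6), absolute(-3), absolute(3))
```

Each function returns one and only one value for each element of its domain, which is the defining property described above.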

It is not required that all the elements of the range actually get used. Thus, in Example 3, $x^2 \geq 0$ for every $x$, and so we could equally well write

$$f: R \to R^+, \qquad : x \mapsto x^2,$$

using $R^+$ as the range instead of $R$.

Often functions are defined by algebraic formulas, but some of the most important functions are defined in other ways. Consider the following example.

7) Given the parabola, shown below, which is the graph of the equation $y = x^2$. For each point $P$ of the parabola, the arc of the curve from the origin $O$ to $P$ has a certain length. If to each $x$ we let correspond the length of the arc from $O = (0, 0)$ to $P = (x, x^2)$, then we have a function

$$f_1: R \to R^+.$$

(Here we are talking about simple geometric length, independent of direction, and so the length of the arc is never negative.) Later we shall find that this function can be described by a formula. But we don't need to know this, let alone find the formula, to know that we are dealing with a function.
[Figure: left, the parabola $y = x^2$ with the arc from $O$ to $P$; right, the shaded region under the parabola from 0 to $k$.]

8) Given the same parabola. To each number $k \geq 0$ there corresponds a number $A$ which measures the area of the shaded region in the right-hand figure. To be exact, the region is

$$R_2 = \{(x, y) \mid 0 \leq x \leq k, \ 0 \leq y \leq x^2\}.$$

Thus we have a function

$$f_2: R^+ \to R^+, \qquad : k \mapsto A.$$

In Chapter 2 we got a formula for this function:

$$k \mapsto \tfrac{1}{3}k^3$$

for every $k \geq 0$.

9) Given the graph of $y = x^n$, for $x \geq 0$. (The rest of the graph goes upward when $n$ is even and downward when $n$ is odd.)

[Figure: the graph of $y = x^n$ for $x \geq 0$, with the region under it from 0 to $k$ shaded.]

To each $k \geq 0$ there corresponds a number $A$ which measures the area of the shaded region. Thus for each $n$ we have a function

$$f_n: R^+ \to R^+, \qquad : k \mapsto A.$$

Only for the cases $n = 1$ and $n = 2$ do we know how to calculate the values of $A$. But for $n = 3$, we nevertheless have a well-defined function $f_3$. Later in this chapter, you will see how this function can be calculated.

Given a function $f: A \to B$, for each $a$ in $A$ we denote by $f(a)$ the element of $B$ which corresponds to $a$. For example, if $f$ is the function which squares things ($x \mapsto x^2$), then

$$f(1) = 1, \quad f(2) = 4, \quad f(3) = 9, \quad f(\sqrt{2}) = 2;$$

and

$$f(\sqrt{x}) = x \quad \text{for every } x \geq 0.$$

In Example 9 above, $f_3(1)$ is the area under the graph of $y = x^3$, from 0 to 1; and so on.

If the domain $A$ and the range $B$ are sets of real numbers, then we can draw pictures of the function. The graph of a function $f: A \to B$ is the set of all points of the coordinate plane that have the form $(x, f(x))$. In other words, to draw the graph of the function, we plot the point $(x, f(x))$ for each $x$ in $A$.
[Figure: left, the graph of a function $f$ whose domain is a closed interval $[a, b]$; right, the graph of $y = \sqrt{x}$.]

In the case shown in the left-hand figure above, the domain is a closed interval $[a, b]$. Consider, next, the function $g$ in Example 4, which extracts nonnegative square roots:

$$g: R^+ \to R, \qquad : x \mapsto \sqrt{x}.$$

The graph of $g$ (the right-hand figure above) is the graph of the equation $y = \sqrt{x}$. To see that this graph is approximately right, observe that

$$y = \sqrt{x} \iff x \geq 0, \quad y \geq 0, \quad x = y^2.$$

We get $x = y^2$ by interchanging $x$ and $y$ in the equation $y = x^2$. Therefore the graph of $x = y^2$ is a parabola with directrix $x = -\tfrac{1}{4}$ and focus $(\tfrac{1}{4}, 0)$. And the graph of $y = \sqrt{x}$ is the upper half of this graph.

A curve which is the graph of a function is called a function-graph. It is easy to see what sort of curve is a function-graph: A set of points in a coordinate plane is a function-graph if it intersects every vertical line in at most one point.
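For a finite set of sample points, the vertical-line condition can be tested mechanically. The sketch below is an illustration for finite samples only, not a test for whole curves; the function name `is_function_graph` and the sample points are assumptions.

```python
def is_function_graph(points):
    """True if no vertical line x = const meets the point set in more than one point."""
    height_at = {}
    for x, y in points:
        if x in height_at and height_at[x] != y:
            return False        # two different points on the same vertical line
        height_at[x] = y
    return True

# Sample points on x = y^2: the upper half alone, and then the whole parabola.
upper_half = [(y * y, y) for y in range(0, 4)]
whole_parabola = [(y * y, y) for y in range(-3, 4)]

print(is_function_graph(upper_half), is_function_graph(whole_parabola))
```

The whole parabola fails the test because, for example, the points $(1, 1)$ and $(1, -1)$ lie on the same vertical line.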


[Figure: sample curves, each marked "Yes" or "No" according to whether it is a function-graph.]

Ordinarily, we make no distinction between a function-graph in a coordinate plane and the corresponding function.

[Figure: the graph of a function $f$ over the interval $[-1, 7]$.]

For example, in the figure above, $f$ is a set of points, and is a function-graph. We use the same symbol $f$ for the corresponding function. Thus we say that the domain of $f$ is the closed interval $[-1, 7]$, and the range of $f$ is $R$. (Obviously some smaller set could be used as the range, but it is not obvious from the figure just what the smallest possible range is.) We write $f(0) = 2$, $f(1) = 1$, $f(2) = 3$, and so on, because $0 \mapsto 2$, $1 \mapsto 1$, $2 \mapsto 3$, under the action of the function $f$.

Given a function

$$f: A \to B.$$

If $b$ is $f(a)$ for some $a$ in $A$, we say that $b$ is a value of the function. For example, 4 is a value of the function $x \mapsto x^2$, but $-1$ is not. The set of all values of a function is called the image. If you reexamine Examples 1 through 6 above, you will find that in Examples 1 and 2 the image is all of $R$, and in the remaining cases the image is $R^+$. (You should check these cases.)

Similarly, for

$$f: [0, 1] \to R, \qquad : x \mapsto \sqrt{1 - x^2}.$$

Here the graph is a quadrant of a circle, as shown on the left below, and the image is the closed interval $[0, 1]$.
y

A word of caution: it often happens that a figure looks like a function-graph if you look at it sidewise. In the right-hand figure above, $C$ is not a function-graph; but sidewise it looks like one. More precisely, if you reflect $C$ across the line $y = x$ you get a curve which really is a function-graph. We often use this device, to study various curves $C$ for which the reflection $C'$ is a function-graph. But this does not mean that $C$ was a function-graph in the first place. Therefore, in the following problem set, when you are asked whether certain curves are function-graphs, you must look at the curves right side up. For the curve $C$ shown in the above figure, the answer is "No," even though for $C'$ the answer is "Yes."

In some of the problems below, you are asked to find the image. In some cases, the image is not an interval; and you may find it convenient to use the notation

$$\{a, b, c, \ldots\}$$

for the set whose elements are $a, b, c, \ldots$. Thus

$$N = \{1, 2, 3, \ldots\}$$

is the set of all positive integers;

$$Z = \{\ldots, -2, -1, 0, 1, 2, \ldots\}$$

is the set of all integers; and

$$\{0, 1\}$$

is the set whose only elements are 0 and 1.

PROBLEM SET 3.1


1. Given $f(x) = x^2 + x + 1$, for every $x$. Find $f(0)$, $f(1)$, and $f(2)$.

2. Given $f(x) = 2x^2 - x + 3$. Find $f(-1)$, $f(0)$, and $f(2)$.

3. Given $f$ as in Problem 2. Get a general formula for $f(2 + h)$.

4. For what positive integers $n$ (if any) is the graph of $y = x^n$ a function-graph? For each such case (if any), what are the domain and the range?

5. Same question, for the graphs of the equations $x = y^n$.


6. For what positive integers $n$ (if any) is the graph of $y = |x|^n$ a function-graph? For each such case (if any), what are the domain and range?

7. Same problem, for the graphs of the equations $|y|^n = x$.

*8. Same question as 6, for $y^3 + ny = x$.

9. Is the graph of $x = \sqrt{y}$ a function-graph? If so, what are the domain and the image? Sketch.

10. Same question, for $y = |x|/x$.

11. Same question, for $|y| = x$.

12. Same question, for $y = |x| + x$.

13. Same question, for $y = x^2 + x + 1$. (Here, of course, the only trouble is in finding the image. The image is an interval, and should therefore be described in the interval notation.)

14. The postage rate for airmail letters within the United States is now (1971) ten cents
per ounce or fraction thereof Thus we have a function
amp:

R+

___,.

R+,

where amp xis the airmail postage (in cents) for a letter of weight x (in ounces.) Thus
amp t = 10, amp 1 = 10, amp 1T = 40, amp 0 = 0, and so on. Sketch the graph of
this function. What is the image?
15. The roundoff function r: R ---+ R assigns to each number the nearest integer (with a
half-integer assigned to the next highest integer). Thus r(2) = 2, r(2!) = 2, r(2t) = 3,
r (2! ) = 3. Sketch the graph of this function from 0 to 3. What is its image?
16. Under what conditions is a semicircle a function-graph?
17. Under what conditions is a parabola a function-graph? (To solve this one, you will
need a theorem from a problem in Chapter 2.)
3.2

THE DERIVATIVE OF A FUNCTION, INTUITIVELY CONSIDERED

In Section 2.7 we solved the tangent problem for parab olas. Given the graph of
y

we found that for each


was

x0,

ax2 + bx +

c,

the slope of the tangent at the point

(x0, y0)

of the graph

70

3.2

Functions, Derivatives, and Integrals

Obviously a parabola with its axis vertical is a function-graph; its equation expresses
in terms of x. Thus we have a function

f:

R-R

ax2

bx

c.

Now at each point of the graph off there is one and only one tangent; and this
tangent has a certain slope. Thus we have another function
f':

R-R

: X

S"'

2ax

+ b.

For each x, $f'(x)$ is the slope of the tangent to the graph of f at the point $(x, f(x))$.
To see how this works, consider the simplest example, in which

$$f(x) = x^2.$$

Here the parabola is the graph of the function

$$f: x \mapsto x^2,$$

and the line is the graph of the function

$$f': x \mapsto 2x.$$

For each x, the value of $f'$ is the slope of the tangent to the graph of f. For example,
at the point where $x = 1$ the slope of the tangent to f is 2; and $f'(1) = 2 \cdot 1 = 2$.
At $x = \tfrac14$, we get $f'(\tfrac14) = \tfrac12$; and $\tfrac12$ is the slope of the tangent to the parabola. Where
$x = -1$, the slope of the tangent to the parabola is $-2$; and $f'(-1) = 2(-1) = -2$.
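The values quoted for the parabola can be checked numerically. The following is only a sketch: the helper name `secant_slope` and the step sizes are our own choices, and the slopes of short chords merely suggest, rather than prove, what the slope of the tangent is.

```python
def f(x):
    return x * x

def secant_slope(f, x0, h):
    """Slope of the chord through (x0, f(x0)) and (x0 + h, f(x0 + h))."""
    return (f(x0 + h) - f(x0)) / h

# As h shrinks, the chord slopes settle near 2*x0, the value of f'(x0).
for x0 in (1.0, 0.25, -1.0):
    slopes = [secant_slope(f, x0, h) for h in (0.1, 0.001, 0.00001)]
    print(x0, slopes, "compare with 2*x0 =", 2 * x0)
```

For this particular function the chord slope works out to exactly 2x0 + h, which is why the printed values approach 2x0 so regularly.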


In general, suppose that we have a function

$$f: \mathbf{R} \to \mathbf{R}.$$

If the graph of f has a nonvertical tangent at each point $(x, f(x))$, we let $f'(x)$ be the
slope of this tangent. This gives a new function

$$f': \mathbf{R} \to \mathbf{R}.$$

The new function $f'$ is called the derivative of f. Consider another example.

[Figure: graphs of a function f and of f'.]

A careful inspection of the figure above indicates that $f'$ is (at least approximately)
the derivative of f. Thus, at x = 0, the tangent to f is horizontal; and $f'(0) = 0$, as it
should be. At x = 1, the tangent to f seems to have slope $-1$; and $f'(1) = -1$.
At x = -1, the tangent to f has slope 1; and $f'(-1) = 1$. For x > 0, the
tangent to f has negative slope; and $f'(x) < 0$ for x > 0. For x < 0, the tangent
to f has positive slope; and $f'(x) > 0$ for x < 0.
It may be that at some points f has no tangent. At such points, $f'$ is not defined.
Thus, in some cases, the domain of $f'$ is a smaller set than the domain of f. Consider,
for example, the function $f: x \mapsto |x|$.

[Figure: left, the graph of $f(x) = |x|$; right, the graph of $f'$, with values 1 and -1.]

For every x > 0, the slope of the tangent is 1; and for every x < 0, the slope of the
tangent is -1. Therefore the graph of $f'$ looks like the figure on the right above.

Drawing both f and $f'$ on the same set of axes, we get the left-hand figure below.
You should carefully inspect the figure on the right below, to convince yourself that
$f'$ is the derivative of f, at least approximately.

[Figure: left, f and $f'$ for $f(x) = |x|$ on one set of axes; right, a second function f and its derivative $f'$.]

Here f has a tangent at x = 0, but the tangent is vertical, and therefore there is no such
thing as $f'(0)$. When x > 0 and x is small, then $f'(x)$ is large, because f is rising steeply.
When x > 1, $f'(x)$ is small. It looks as if $f'(2) = 0$; and the graph of f has a horizontal
tangent at the point $(2, f(2))$.

A function which has a derivative at every point of its domain is called differentiable.
The following theorem describes a fundamental property of differentiable functions:

The Mean-Value Theorem. Every chord of a differentiable function is parallel to the
tangent at some intermediate point.

[Figure: two graphs, each showing a chord and a tangent parallel to it.]

Here by a chord we mean a segment joining two points of the graph. The theorem
says that if f is differentiable on [a, b], then there is some $\bar{x}$ between a and b at which
the slope of the tangent is the same as the slope of the chord. As indicated in the
right-hand figure above, there may be more than one such point.
The situation with regard to this theorem is awkward. It is geometrically
obvious. Also it is important and we shall need it soon. On the other hand, the
proof of the theorem is hard, and involves ideas which belong in the later portion of a
calculus course. We shall therefore postpone the proof, but use the theorem whenever
we need it.

The theorem can be stated in a form which looks more algebraic. If f is defined
on an interval [a, b], then the slope of the chord joining the endpoints is

$$\frac{f(b) - f(a)}{b - a},$$

and the slope of the tangent at $\bar{x}$ is $f'(\bar{x})$. Thus the theorem states that

$$f'(\bar{x}) = \frac{f(b) - f(a)}{b - a}$$

for some $\bar{x}$ between a and b. In this style we can restate the theorem as follows:

The Mean-Value Theorem. Suppose that f is differentiable on the closed interval
[a, b]. Then for some $\bar{x}$ between a and b we have

$$f'(\bar{x}) = \frac{f(b) - f(a)}{b - a}.$$
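For a concrete function, a point of the kind promised by the mean-value theorem can be located numerically. The sketch below uses $f(x) = x^2$ with $f'(x) = 2x$; the names `chord_slope` and `mvt_point` are hypothetical, and the bisection search assumes that $f'(x)$ minus the chord slope changes sign between a and b, which the theorem itself does not guarantee.

```python
def chord_slope(f, a, b):
    """Slope of the chord joining (a, f(a)) and (b, f(b))."""
    return (f(b) - f(a)) / (b - a)

def mvt_point(f, fprime, a, b, tol=1e-12):
    """Bisection search for a point where fprime equals the chord slope.

    Assumption: fprime(x) - slope has opposite signs at a and b.
    """
    m = chord_slope(f, a, b)
    lo, hi = a, b
    g_lo = fprime(lo) - m
    if g_lo * (fprime(hi) - m) > 0:
        raise ValueError("no sign change; bisection does not apply")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        g_mid = fprime(mid) - m
        if g_lo * g_mid <= 0:
            hi = mid              # a solution lies in [lo, mid]
        else:
            lo, g_lo = mid, g_mid  # a solution lies in [mid, hi]
    return (lo + hi) / 2

def f(x):
    return x * x

def fprime(x):
    return 2 * x

print(mvt_point(f, fprime, 0.0, 3.0))
```

For the parabola the chord slope on [a, b] is a + b, so the point found should be the midpoint of the interval.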

Note that, if we merely required that the graph have a tangent at every point, the
theorem would become false. The graph shown below has a tangent at every point,
but one of these tangents is vertical. Therefore the function f is not differentiable on
[a, b]. And no tangent line is parallel to the chord from P to Q.

[Figure: a graph from P to Q with a vertical tangent at one point, together with the chord PQ.]

Hereafter, the mean-value theorem will be referred to as MVT.


PROBLEM SET 3.2

In each of the figures below, a function-graph f is given. Do a tracing of each graph on a
sheet of writing paper, and then draw a plausible sketch of the graph of $f'$. Obviously your
sketch of $f'$ cannot be exact. But $f'$ should be = 0 at points where the tangent to f is
horizontal; $f'(x)$ should be > 0 where the original graph slopes upward; $f'(x)$ should be < 0
where f slopes downward; and so on. In some cases, you may find that the values of $f'$ are
so large that there is no room for them on the paper. In such cases, draw as much of the
graph of $f'$ as space permits.
Some but not all of the functions shown below satisfy the conditions of the mean-value
theorem. For each such function, draw the chord between the endpoints of the graph,
draw a tangent line which is parallel to this chord, and drop a dashed line from the point of
tangency to the point $\bar{x}$ on the x-axis. (See the figures in the text, illustrating MVT.)

[Figures for Problems 1 through 21: graphs of functions f, not reproduced here.]

3.3

CONTINUITY AND LIMITS

Let f be a function defined on an interval, or defined for every x. Roughly speaking,
f is continuous if we can draw the graph without lifting the pencil from the paper.
For example, the function $f(x) = \sqrt{1 - x^2}$ is continuous, because its graph is the upper
half of the circle $x^2 + y^2 = 1$. Most functions that arise naturally are of this type,
and in this book we will rarely deal with any other kind of function. But some very
simple functions are not continuous. Consider, for example, the airmail postage
function defined in Problem 14 of Problem Set 3.1. The graph looks like this:

[Figure: graph of y = amp x, a step function rising through the values 10, 20, 30, 40.]

Here y = amp x, where amp x is the airmail postage on a letter weighing x ounces.
The values of this function make sudden jumps at integral values: the graph cannot be
drawn without lifting the pencil from the paper, and so the function is discontinuous.
Functions of this kind are used in physics. For example, the so-called Heaviside
function is defined by the conditions

$$h(x) = \begin{cases} 0 & \text{if } x < 0, \\ 1 & \text{if } x \geq 0. \end{cases}$$

The graph looks like the figure below. It makes a sudden jump at x = 0.
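Both step functions are easy to write down as small programs, which makes the jumps visible in a table of values. This is a sketch under stated assumptions: postage is taken to be ten cents times the weight rounded up to a whole number of ounces, and the Heaviside function is taken to have the values 0 and 1, the usual convention.

```python
import math

def amp(x):
    """Airmail postage in cents: ten cents per ounce or fraction thereof."""
    if x < 0:
        raise ValueError("weight must be nonnegative")
    return 10 * math.ceil(x)

def h(x):
    """Heaviside function (assumed values): 0 for x < 0, 1 for x >= 0."""
    return 0 if x < 0 else 1

# Values just to the left and right of a jump differ by a fixed amount,
# no matter how close together the two inputs are.
for x in (0.999, 1.0, 1.001):
    print("amp", x, amp(x))
for x in (-0.001, 0.0, 0.001):
    print("h", x, h(x))
```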

We shall now make the idea of continuity more exact, in several stages. Given
a point $x_0$ in the domain of f, we want to explain what it means to say that f is
continuous at $x_0$. First we try the following:

1) Whenever x is close to $x_0$, $f(x)$ is close to $f(x_0)$. In symbols,

$$x \approx x_0 \;\Rightarrow\; f(x) \approx f(x_0).$$

This is the idea, but it is not good enough; the question is how close things are
supposed to be to each other. As x gets very close to $x_0$, $f(x)$ is supposed to become
very close to $f(x_0)$. This suggests:

2) We can make $f(x)$ as close as we please to $f(x_0)$, by taking x sufficiently close to $x_0$.

This is better, but it can be improved. We measure the closeness of two numbers
by taking the absolute value of their difference. Thus if $\epsilon$ is a positive number, and

$$|f(x) - f(x_0)| < \epsilon,$$

then we say that $f(x)$ is $\epsilon$-close to $f(x_0)$. In these terms, we can restate (2) as
follows:

3) For each $\epsilon > 0$, $f(x)$ is $\epsilon$-close to $f(x_0)$ whenever x is sufficiently close to $x_0$.

If $\delta > 0$ and $|x - x_0| < \delta$, then we say that x is $\delta$-close to $x_0$. The idea of
"sufficiently close" can be described by taking a positive number $\delta$. This gives:

4) For each $\epsilon > 0$, there is a $\delta > 0$ such that $f(x)$ is $\epsilon$-close to $f(x_0)$ whenever x is
$\delta$-close to $x_0$.

We can now draw a picture.

[Figure: graph of f with a rectangle centered at $(x_0, f(x_0))$, of height from $f(x_0) - \epsilon$ to $f(x_0) + \epsilon$.]

In the figure, the solid rectangular region is called an $\epsilon\delta$-box for the function f at the
point $(x_0, f(x_0))$. When we call it a box for the function, we mean that no point of the
graph lies above the box or below it. If the function is continuous, then for every
positive number $\epsilon$, no matter how small, we can find a $\delta > 0$ that gives an $\epsilon\delta$-box.
We now restate (4) as follows.

Definition. Let $x_0$ be a point in the domain of the function f. Suppose that for
every $\epsilon > 0$ there is a $\delta > 0$ such that

$$|x - x_0| < \delta \;\Rightarrow\; |f(x) - f(x_0)| < \epsilon.$$

Then f is continuous at $x_0$.

This definition applies very simply to the function $f(x) = 2x$, at the point (1, 2).
Given any $\epsilon > 0$, we can find an $\epsilon\delta$-box, as shown in the figure; we simply take
$\delta = \epsilon/2$. Then, algebraically,

$$|x - 1| < \delta \;\Rightarrow\; |x - 1| < \epsilon/2 \;\Rightarrow\; |2x - 2| < \epsilon \;\Rightarrow\; |f(x) - f(1)| < \epsilon;$$

and this is what we need, to show that the function f is continuous at the point x = 1.
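The choice of half of epsilon as delta for $f(x) = 2x$ can be spot-checked by sampling. The function `is_box` below is a hypothetical helper of our own; passing its test is evidence only, since it tries finitely many values of x, whereas the algebra above covers all of them.

```python
def f(x):
    return 2 * x

def is_box(f, x0, eps, delta, samples=10001):
    """Sample x with |x - x0| < delta and test |f(x) - f(x0)| < eps.

    Returns False as soon as one sampled point of the graph lies
    above or below the box; returns True if none does.
    """
    for i in range(1, samples):
        x = x0 - delta + 2 * delta * i / samples
        if abs(f(x) - f(x0)) >= eps:
            return False
    return True

eps = 0.1
print(is_box(f, 1.0, eps, eps / 2))  # delta = eps/2, as in the text
print(is_box(f, 1.0, eps, eps))      # a delta that is too large
```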

Of course, the function is also continuous at all other points, and we can show in
exactly the same way that the definition of continuity applies, taking $\delta = \epsilon/2$. (What
would we do for $f(x) = 3x$, at any point $x_0$?)

We shall now apply the definition of continuity to the function $f(x) = x^2$, at
the point (1, 1).

[Figure: graph of $f(x) = x^2$ near (1, 1), with a dotted rectangle between x = d and x = d'.]

Given an $\epsilon > 0$, we want to find an $\epsilon\delta$-box for f, at (1, 1), so that

$$|x - 1| < \delta \;\Rightarrow\; |x^2 - 1| < \epsilon.$$

To find the desired number $\delta$ directly requires clumsy calculations, but there is an
easier way. Let d and d' be the numbers such that

$$d < 1 < d', \qquad f(d) = d^2 = 1 - \epsilon; \qquad f(d') = d'^2 = 1 + \epsilon.$$

(See the lower figure on page 78.) The graph rises from left to right. Therefore,

$$d < x < d' \;\Rightarrow\; f(d) < f(x) < f(d') \;\Leftrightarrow\; d^2 < x^2 < d'^2 \;\Leftrightarrow\; 1 - \epsilon < x^2 < 1 + \epsilon \;\Leftrightarrow\; |x^2 - 1| < \epsilon.$$

Thus the dotted rectangle in the figure boxes in the graph, in the same way that an
$\epsilon\delta$-box does. We call such a rectangle a dd'-box. Obviously a dd'-box is just as
good as an $\epsilon\delta$-box. And, in fact, given a dd'-box, we can always get an $\epsilon\delta$-box that
lies in it. Let $\delta$ be the smaller of the positive numbers $1 - d$ and $d' - 1$. (In
fact, $d' - 1$ is the smaller, but we don't need to use this.) Then

$$|x - 1| < \delta \;\Rightarrow\; d < x < d',$$

and so

$$|x - 1| < \delta \;\Rightarrow\; |x^2 - 1| < \epsilon,$$

which is what we wanted. We are going to use this method again, and so we record
it as a theorem.
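The passage from a dd'-box to a delta can be carried out with actual numbers. The sketch below assumes 0 < epsilon < 1, so that the square root of 1 minus epsilon exists, and takes d to be the positive root; the function name is our own.

```python
import math

def delta_for_square_at_one(eps):
    """Return (d, d_prime, delta) for f(x) = x**2 at x0 = 1.

    Assumes 0 < eps < 1. Here d**2 = 1 - eps and d_prime**2 = 1 + eps,
    and delta is the smaller of 1 - d and d_prime - 1.
    """
    d = math.sqrt(1 - eps)
    d_prime = math.sqrt(1 + eps)
    delta = min(1 - d, d_prime - 1)
    return d, d_prime, delta

d, d_prime, delta = delta_for_square_at_one(0.19)
print(d, d_prime, delta)

# Points within delta of 1 should have squares within eps of 1.
for x in (1 - 0.99 * delta, 1 + 0.99 * delta):
    print(x, abs(x * x - 1) < 0.19)
```

In this run the smaller of the two distances is d' - 1, in agreement with the parenthetical remark in the text.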
Theorem 1. Let $x_0$ be a point in the domain of the function f. Suppose that for every
$\epsilon > 0$ there are numbers d and d' such that $d < x_0 < d'$ and

$$d < x < d' \;\Rightarrow\; f(x_0) - \epsilon < f(x) < f(x_0) + \epsilon.$$

Then f is continuous at $x_0$.

[Figure: graph of f with a dd'-box at $(x_0, f(x_0))$, between the heights $f(x_0) - \epsilon$ and $f(x_0) + \epsilon$.]

Proof. Let $\delta$ be the smaller of the numbers $x_0 - d$ and $d' - x_0$. (In the figure,
$\delta = d' - x_0$.) Then

$$|x - x_0| < \delta \;\Rightarrow\; d < x < d' \;\Rightarrow\; f(x_0) - \epsilon < f(x) < f(x_0) + \epsilon \;\Leftrightarrow\; |f(x) - f(x_0)| < \epsilon.$$

We shall now reexamine the idea of a limit, which we used in defining the slope
of the tangent to the graph of a function.
[Figure: graph of f with the secant line through $(x_0, f(x_0))$ and $(x, f(x))$.]

To find the slope of the tangent at the point $(x_0, f(x_0))$, we let $m(x)$ be the slope of the
secant line through the points $(x_0, f(x_0))$ and $(x, f(x))$, where $x \neq x_0$. Thus the slope
of the secant is a function, and we are now describing it in functional notation. By
definition, the slope of the tangent is

$$f'(x_0) = \lim_{x \to x_0} m(x),$$

if such a limit exists. We shall now give a definition of the limit. The idea is that
$\lim_{x \to x_0} m(x) = L$ if the function m becomes continuous at $x_0$ when we insert the
value L as the value $m(x_0)$. Thus we want to use L as $m(x_0)$ in the definition of
continuity. This gives the following:

Definition. Let m be a function defined at each point of an interval I, except at the
point $x_0$. Suppose that for every $\epsilon > 0$ there is a $\delta > 0$ such that

$$0 < |x - x_0| < \delta \;\Rightarrow\; |m(x) - L| < \epsilon.$$

Then

$$\lim_{x \to x_0} m(x) = L.$$

For example, for $f(x) = x^2$, $x_0 = 1$, we have

$$m(x) = \frac{x^2 - 1}{x - 1} = x + 1 \qquad (x \neq 1).$$

When we insert the point (1, 2) on the graph of the function m, we get a continuous
function (which is equal to x + 1 for every x). Thus $\lim_{x \to x_0} m(x) = 2$, not just
intuitively but also in terms of our definition of a limit.
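The secant-slope function for $f(x) = x^2$ at $x_0 = 1$ can be tabulated to watch it approach 2. In the sketch below, the choice of delta equal to epsilon is our own observation, based on the identity m(x) = x + 1 for x different from 1; the function names are hypothetical.

```python
def m(x, x0=1.0):
    """Secant slope of f(x) = x**2 through (x0, x0**2) and (x, x**2).

    Not defined at x = x0, exactly as in the definition of the limit.
    """
    if x == x0:
        raise ZeroDivisionError("m is not defined at x0")
    return (x * x - x0 * x0) / (x - x0)

def delta_for(eps):
    # Since m(x) = x + 1 for x != 1, |m(x) - 2| = |x - 1|,
    # so delta = eps satisfies the definition with L = 2.
    return eps

for eps in (0.1, 0.001):
    delta = delta_for(eps)
    for x in (1 - delta / 2, 1 + delta / 2):
        print(eps, x, m(x), abs(m(x) - 2) < eps)
```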
There is one more problem to consider. What if $f(x_0)$ is defined? In this case,
what do we mean by $\lim_{x \to x_0} f(x)$? The answer is that we ignore the value of f at $x_0$,
and investigate how the rest of the graph behaves. To be exact:

Definition. Let f be a function defined on an interval I, except perhaps at the point $x_0$.
Suppose that for every $\epsilon > 0$ there is a $\delta > 0$ such that

$$0 < |x - x_0| < \delta \;\Rightarrow\; |f(x) - L| < \epsilon.$$

Then

$$\lim_{x \to x_0} f(x) = L.$$

Note that here we have simply copied the preceding definition, using f for m: all
along, the value $x = x_0$ was ruled out by the condition $0 < |x - x_0|$. The left-hand
figure, showing the $\epsilon\delta$-box, looks the same as before, except that there is no point in
plotting $f(x_0)$ (which may not be defined, and which will not in any case be used).

[Figure: left, an $\epsilon\delta$-box at $(x_0, L)$; right, a graph for which $f(x_0)$ lies off the rest of the curve.]

The definition of a limit applies in peculiar ways to certain peculiar functions.
In the right-hand figure above, $\lim_{x \to x_0} f(x) = L$, but $L \neq f(x_0)$. The following
theorem shows that the strange situation shown in the figure cannot happen if the
function is continuous.

Theorem 2. f is continuous at $x_0$ if and only if

$$\lim_{x \to x_0} f(x) = f(x_0).$$

Here the displayed formula says three things at once:

1) $f(x_0)$ is defined. That is, $x_0$ is in the domain of f.
2) $\lim_{x \to x_0} f(x)$ exists. That is, f approaches a limit, as $x \to x_0$.
3) $f(x_0)$ and $\lim_{x \to x_0} f(x)$ are the same number.
PROBLEM SET 3.3

1. How close to 3 does x need to be, for 2x to be within 0.001 of 6? (Answer in the form
$|x - 3| < \delta \Rightarrow |2x - 6| < 0.001$. Sketch the graph of $f(x) = 2x$, and sketch your
$\epsilon\delta$-box ($\epsilon = 0.001$).)

2. Find numbers d and d' such that $d < x < d' \Rightarrow |x^2 - 3^2| < 0.0001$. Sketch the graph
of $f(x) = x^2$ and sketch your dd'-box. In your sketch, you will have to distort the scale
grossly, because of the small size of your $\epsilon$.

3. Show that the function $f(x) = x^2$ is continuous at the point $x_0 = 3$. Use the method
that was used in the text for the same function at the point $x_0 = 1$ and apply Theorem 1.
Thus your answer will include statements in the form: "Let d = ..., and let d' = ....
Then $d < x < d' \Rightarrow 3^2 - \epsilon < x^2 < 3^2 + \epsilon$." Sketch the function, showing your
dd'-box.

Answer as in Problem 3 for the following functions, at the following points.

4. $f(x) = 2x^2$, $x_0 = 1$.

5. $f(x) = \sqrt{x}$, $x_0 = 4$.

6. $f(x) = \sqrt[3]{x}$, $x_0 = 8$.

7. $f(x) = x^3$, $x_0 = 2$.

8. $f(x) = x^n$, where n is any positive integer and $x_0$ is any number.

9. $f(x) = \sqrt{x}$, $x_0$ is any positive number.

10. A function f is called Lipschitzian if there is a number m > 0 such that for every two
points x, $x_0$ we have

$$|f(x) - f(x_0)| \leq m\,|x - x_0|.$$

Show that every Lipschitzian function is continuous.

3.4

THEOREMS ON LIMITS

In this section we shall give the elementary rules that we use in dealing with limits.
These rules are much easier to learn and to use than they are to prove, and so many of
the proofs are omitted from this section. (You will find the missing proofs in Appendix
B.) But some of the proofs are easy, and they throw some light on the idea of a limit.
Theorem 1. If $\lim_{x \to x_0} f(x) = L$, then $\lim_{x \to x_0} [-f(x)] = -L$.

Proof. To get $-f$ from f, we flip the graph of f across the x-axis. We know that for
every $\epsilon > 0$, f has an $\epsilon\delta$-box at $(x_0, L)$. If we flip the box across the x-axis, in the
same way that we flipped the graph, this gives us a box for $-f$ at $(x_0, -L)$.
This theorem can also be proved algebraically. The hypothesis means that:

(1) for every $\epsilon > 0$ there is a $\delta > 0$ such that

$$0 < |x - x_0| < \delta \;\Rightarrow\; |f(x) - L| < \epsilon;$$

the conclusion means that:

(2) for every $\epsilon > 0$ there is a $\delta > 0$ such that

$$0 < |x - x_0| < \delta \;\Rightarrow\; |-f(x) - (-L)| < \epsilon.$$

Since $|-f(x) - (-L)| = |f(x) - L|$, it is obvious that (1) $\Rightarrow$ (2), and thus the
theorem holds.

Theorem 2. If $\lim_{x \to x_0} f(x) = L$, then $\lim_{x \to x_0} [f(x) - L] = 0$.

[Figure: the graph of f with its box, and the graph of f - L with the box moved along with it.]

Proof. Given f, we get $f - L$ by moving the graph up or down a certain distance
(down or up, according as L is positive or negative). We move the box along with the
graph; and this gives a box for the function $f - L$.

Theorem 3. If $\lim_{x \to x_0} [f(x) - L] = 0$, then $\lim_{x \to x_0} f(x) = L$.

To prove this, merely use the previous proof in reverse; move the box along
with the graph.

Theorem 4. If $\lim_{x \to x_0} f(x) = L$, and k is any number, then $\lim_{x \to x_0} kf(x) = kL$.

That is, the limit of a constant times a function is the same constant times the
limit of the function.

Proof.
1) For k = 0, this is easy: $kf(x) = 0$ for every x, and so $\lim_{x \to x_0} kf(x) = 0 = 0 \cdot L$.

2) Suppose that k > 0. For every $\epsilon > 0$, the graph of f has an $\epsilon\delta$-box at $(x_0, L)$.
Therefore, for every $\epsilon > 0$, the graph of f has an $(\epsilon/k)\delta$-box at $(x_0, L)$. Thus

$$0 < |x - x_0| < \delta \;\Rightarrow\; |f(x) - L| < \epsilon/k.$$

Multiplying by k, in the inequality on the right, we get:

$$0 < |x - x_0| < \delta \;\Rightarrow\; |kf(x) - kL| < \epsilon,$$

and so $\lim_{x \to x_0} kf(x) = kL$, which was to be proved.

3) Proof for k < 0?

These proofs illustrate the way we work with boxes, to prove things about limits.
The following theorems will be proved later (unless you find a way to prove them
for yourself).

Theorem 5. If $\lim_{x \to x_0} f(x) = L$ and $\lim_{x \to x_0} g(x) = L'$, then

$$\lim_{x \to x_0} [f(x) + g(x)] = L + L'.$$

That is, if each of the functions f and g has a limit, as $x \to x_0$, then the sum also
has a limit, and the limit of the sum is the sum of the limits.

Theorem 6. If $\lim_{x \to x_0} f(x) = L$ and $\lim_{x \to x_0} g(x) = L'$, then

$$\lim_{x \to x_0} [f(x)g(x)] = LL'.$$

Theorem 7. If $\lim_{x \to x_0} f(x) = L$ and $\lim_{x \to x_0} g(x) = L'$, and $L' \neq 0$, then

$$\lim_{x \to x_0} \frac{f(x)}{g(x)} = \frac{L}{L'}.$$

Caution: The preceding theorem says nothing about what happens when $L' = 0$.
And in fact, for $L' = 0$ anything can happen, even in very simple cases. If

$$f(x) = 2x \quad \text{and} \quad g(x) = x,$$

then

$$\frac{f(x)}{g(x)} = \frac{2x}{x} \quad (x \neq 0), \qquad \lim_{x \to 0} \frac{f(x)}{g(x)} = 2.$$

And any number k can be used in place of 2. Therefore, if $f(x) \to 0$ and $g(x) \to 0$,
the quotient f/g can approach any number whatever as a limit. This should not
surprise us, because every time we calculate a derivative we are finding the limit of a
quotient

$$\frac{f(x) - f(x_0)}{x - x_0}$$

whose numerator and denominator are approaching 0.
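The caution about a denominator with limit 0 can be illustrated numerically. In the sketch below the numerator is kx and the denominator is x, so both tend to 0 as x tends to 0, while the quotient stays near k; the function name `quotient` and the sample values of k are our own.

```python
def quotient(k, x):
    """f(x)/g(x) with f(x) = k*x and g(x) = x; not defined at x = 0."""
    f_value = k * x
    g_value = x
    return f_value / g_value

# Numerator and denominator both shrink toward 0,
# yet the quotient stays at (or extremely near) k.
for k in (2, -7, 0.5):
    print(k, [quotient(k, x) for x in (0.1, 0.001, 0.00001)])
```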

In the preceding section we showed that f is continuous at $x_0$ if and only if

$$\lim_{x \to x_0} f(x) = f(x_0).$$

(This was Theorem 2.) Hereafter, we shall regard the above formula as
interchangeable with the definition of continuity. Thus every theorem on limits automatically
gives us a theorem on continuous functions. Some of these are as follows:

Theorem 8. If f is continuous at $x_0$, and k is any number, then kf is continuous at $x_0$.

Proof. We are given that

$$\lim_{x \to x_0} f(x) = f(x_0).$$

By Theorem 4,

$$\lim_{x \to x_0} kf(x) = kf(x_0),$$

and this means that kf is continuous at $x_0$.

Theorem 9. If f and g are continuous at $x_0$, then so also are f + g and fg.

Proof? (Use Theorems 5 and 6.)

Theorem 10. If f and g are continuous at $x_0$, and $g(x_0) \neq 0$, then f/g is continuous
at $x_0$. (Use Theorem 7.)

Most of the time we shall apply these results not just at one point $x_0$ but
throughout the domain of the functions f and g. For these cases, we can state our theorems
more briefly as follows:

Theorem 11. Let f and g be functions with the same domain. If f and g are continuous,
then so also are kf, f + g, and fg. And f/g is continuous at every point $x_0$ where
$g(x_0) \neq 0$.

Thus, for example, given that $f(x) = x^2 + 1$ and $g(x) = x^4 + 4$ are continuous
on the entire real number system, we can infer immediately that kf, f + g, fg, and f/g
have the same property. Here $g(x) \neq 0$ for every x. Given

$$h(x) = x^2 - 1,$$

we can infer that f + h and fh are continuous everywhere, and that f/h is continuous
except at 1 and -1. Of course, at x = 1 and x = -1 it is not just continuity that
breaks down: the quotient function is not even defined at these points, because the
denominator of f/h becomes 0.
Finally, a trivial observation.

Theorem 12. Every constant function is continuous.

Proof. Given $f(x) = k$, for every x in a certain domain. We need to show that for
every $\epsilon > 0$ there is a $\delta > 0$ such that

$$|x - x_0| < \delta \;\Rightarrow\; |f(x) - k| < \epsilon.$$

Obviously any positive number can be used as $\delta$.


PROBLEM SET 3.4

In solving the following problems, you need not base your work directly on the definition
of continuity or the definition of a limit; you are free to use all the theorems stated in this
section. Note that the later problems are not based on this section at all; they are extensions
of the theory.
1. Show that if $f(x) = kx$, then f is continuous.

2. Same, for $f(x) = kx^2$.

3. Same, for $f(x) = x^n$ (n a positive integer).

4. Same, for $f(x) = x^3 - 3x^2 + x - 4$.

5. A polynomial of degree n is a function of the form

$$f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0,$$

where $a_n \neq 0$. (The function which is = 0 for every x is a polynomial of degree 0.)
Show that every polynomial is continuous.

6. Show that if

$$f(x) = \frac{x^n}{1 + x^2},$$

then f is continuous.

A function f is bounded if there is a number M such that $-M \leq f(x) \leq M$ for every x
in the domain of f. If the above inequalities hold, then we say that M is a bound of f.
Obviously, to show that a function f is bounded, you have to name a bound of f; and to
show that a function is unbounded, you have to show that no number M is a bound of f.

Find out which of the following functions are bounded, on the given domains, and justify
your answers:
7. f(x)

9. f(x)

11. f(x)

13. f(x)

15. f(x)

17. f(x)

19. f(x)

1
=

--

1 + x2

10. f(x) =

x3, 0 ;;:; x ;;:; 2

12. f(x)

1 + x2
x

--

1 + x3

1 + x2
1
x + 1

8. f(x)

x2, 0 ;;:; x ;;:; 1

x
=

, - oo < x < oo

--

1 + x3

x2, - oo < x < oo


x2

--2

1 + x

x
=

--

1 + x2'
x

, 1;;:; x < oo

14. f(x)

, - oo < x < oo

16. f(x)

, -1 < x < 1

18. f(x)

--

1 + x2

< x <

0 x 1
-

, - oo < x < -1

1
-- , 1 < x < oo
1 + x3
x4

- w

--

1 + x3

, 0 < x < oo

, -1 < x < 1

20. Show that if f is bounded, then so also is kf for every k.

21. Show that if f and g are bounded (on the same domain), then so also is f + g.

*22. Show that if f and g are bounded (on the same domain), then so also is fg. You may
find it convenient to write the condition for boundedness in the form $|f(x)| \leq M$.
Can you infer also that f/g is bounded? Why or why not?

*23. a) Show that if f is bounded and

$$\lim_{x \to x_0} g(x) = 0,$$

then

$$\lim_{x \to x_0} [f(x)g(x)] = 0.$$

(First try proving this for the case M = 1.)

b) Show that if

$$\lim_{x \to x_0} f(x) = 0,$$

then

$$\lim_{x \to x_0} [f(x) \sin x] = 0.$$

(Here it makes no difference whether degrees or radians are being used.)

c) In Problem 23(a), can we get along without the hypothesis that f is bounded? That
is, if

$$\lim_{x \to x_0} g(x) = 0,$$

does it follow that

$$\lim_{x \to x_0} [f(x)g(x)] = 0,$$

no matter what kind of function f may be? Why or why not?

d) Show that if

$$\lim_{x \to 0} f(x) = 0,$$

then

$$\lim_{x \to 0} \left[ f(x) \sin \frac{1}{x} \right] = 0.$$

(Query: Can Theorem 6 be applied to this problem?)

e) Show that

$$\lim_{x \to 0} x^2 \cos \frac{1}{x} = 0.$$

*24. a) A function f is locally bounded at $x_0$ if there are positive numbers M and $\delta$ such that

$$0 < |x - x_0| < \delta \;\Rightarrow\; |f(x)| < M.$$

b) If f is bounded, does it follow that f is locally bounded at each point of its domain?

c) Conversely, if f is locally bounded at each point of its domain, does it follow that f
is bounded?

d) If f is locally bounded at each point of the open interval (0, 1), does it follow that f
is bounded on (0, 1)? Why or why not?

e) Show that if

$$\lim_{x \to x_0} f(x) = L,$$

then f is locally bounded at $x_0$. (This result does not require that $x_0$ be in the domain
of f. If you draw a picture of what you have, and a picture of what you want, and
compare the two, this proof may become obvious.)

f) Show that if f is locally bounded at $x_0$, and

$$\lim_{x \to x_0} g(x) = 0,$$

then

$$\lim_{x \to x_0} f(x)g(x) = 0.$$

3.5

THE PROCESS OF DIFFERENTIATION

The theorems in the preceding section tell us enough about limits to give us some
information about derivatives.
To make some formulas easier to write, we introduce an alternative notation for
the derivative: we write Df to mean the derivative of f. Thus

$$Df = f',$$

by definition. Similarly, if

$$h(x) = f(x) + g(x)$$

for every x in a certain domain, then

$$D(f + g) = h'.$$

Similarly, when we write

$$D(x^2 + 2x + 5)$$

we mean the derivative of the function

$$x \mapsto x^2 + 2x + 5.$$

We know already what this derivative is:

$$D(x^2 + 2x + 5) = 2x + 2.$$

Here we are merely rewriting the result which we got quite a while ago: for each x,
the slope of the tangent to the graph of

$$y = ax^2 + bx + c$$

is given by the formula

$$S_x = 2ax + b.$$

We recall that a function f is differentiable at a point $x_0$ if it has a derivative at $x_0$.
When we say that f is differentiable, we mean that it has a derivative at every point of
its domain. For example, if $f(x) = |x|$, then f is differentiable at 1, but not at 0.
But if $f(x) = x^2$, then f is differentiable (without qualification).
Theorem 1. The derivative of a constant function is 0.

Here by a constant function we mean a function f for which $f(x) = k$ for every x.
Obviously the slope of the tangent is 0 everywhere. Algebraically,

$$f'(x_0) = \lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0} = \lim_{x \to x_0} \frac{k - k}{x - x_0} = \lim_{x \to x_0} 0 = 0.$$

Theorem 2. If f is differentiable, then so also is kf for every k, and

$$D(kf) = kDf.$$

Proof. Take any $x_0$ in the domain of f. Then

$$\lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0} = f'(x_0),$$

by the definition of $f'(x_0)$. Therefore, by Theorem 4 of Section 3.4,

$$\lim_{x \to x_0} \frac{kf(x) - kf(x_0)}{x - x_0} = kf'(x_0).$$

Therefore, at each point $x_0$, the derivative of kf is k times the derivative of f. We
have seen an example of this:

$$Dx^2 = 2x,$$

and

$$D(kx^2) = 2kx,$$

as it should be.
Theorem 3. If f and g are differentiable, then so also is f + g, and

$$D(f + g) = Df + Dg.$$

Proof. Given an $x_0$ in the domain of f and g, we have

$$\lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0} = f'(x_0)$$

and

$$\lim_{x \to x_0} \frac{g(x) - g(x_0)}{x - x_0} = g'(x_0).$$

We want to prove that

$$\lim_{x \to x_0} \frac{[f(x) + g(x)] - [f(x_0) + g(x_0)]}{x - x_0} = f'(x_0) + g'(x_0).$$

Since we know by Theorem 5 of Section 3.4 that the limit of the sum is the sum of the
limits, the result follows immediately; the big fraction in the third formula is the sum
of the two fractions in the preceding two formulas.
We shall now show that

$$Dx^3 = 3x^2.$$

For each x, let

$$f(x) = x^3;$$

and take any $x_0$. Then

$$f'(x_0) = \lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0} = \lim_{x \to x_0} \frac{x^3 - x_0^3}{x - x_0} = \lim_{x \to x_0} \frac{(x - x_0)(x^2 + x_0 x + x_0^2)}{x - x_0} = \lim_{x \to x_0} (x^2 + x_0 x + x_0^2).$$

(Why?) Therefore

$$f'(x_0) = x_0^2 + x_0 \cdot x_0 + x_0^2 = 3x_0^2,$$

by Theorems 6, 4, and 5 of Section 3.4. Thus $f'(x_0) = 3x_0^2$ for every $x_0$, and

$$Dx^3 = 3x^2,$$

which is what we wanted.

To extend this result to $f(x) = x^n$, for every positive integer n, we merely need
to know the general factorization formula

$$x^n - x_0^n = (x - x_0)(x^{n-1} + x^{n-2}x_0 + \cdots + x x_0^{n-2} + x_0^{n-1}).$$

This is easy to check by multiplication.


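Theorem 4. $Dx^n = nx^{n-1}$, for every positive integer n.

Proof. Let $f(x) = x^n$ for every x, and take any $x_0$. Then

$$\lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0} = \lim_{x \to x_0} \frac{x^n - x_0^n}{x - x_0} = \lim_{x \to x_0} \frac{(x - x_0)(x^{n-1} + x^{n-2}x_0 + \cdots + x x_0^{n-2} + x_0^{n-1})}{x - x_0}$$

$$= \lim_{x \to x_0} (x^{n-1} + x^{n-2}x_0 + \cdots + x x_0^{n-2} + x_0^{n-1}) = x_0^{n-1} + x_0^{n-1} + \cdots + x_0^{n-1} \quad (\text{to } n \text{ terms}).$$

Thus $f'(x_0) = nx_0^{n-1}$ for every $x_0$, and so

$$Dx^n = nx^{n-1},$$

which was to be proved.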
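Theorems 1 through 4 can be applied term by term to a polynomial such as $17x^{29} + \pi x^{17} - 7x^5$, and the bookkeeping is mechanical enough to hand to a short program. The representation below, a dictionary from exponent to coefficient, is an illustrative choice of ours, as are the names `differentiate` and `evaluate`.

```python
import math

def differentiate(poly):
    """Differentiate a polynomial given as {exponent: coefficient}.

    Each term a*x**n becomes (a*n)*x**(n-1); constant terms (n = 0)
    have derivative 0 and are dropped. Terms are handled one at a
    time because the derivative of a sum is the sum of the derivatives.
    """
    return {n - 1: a * n for n, a in poly.items() if n != 0}

def evaluate(poly, x):
    """Value of the polynomial at x."""
    return sum(a * x ** n for n, a in poly.items())

p = {29: 17, 17: math.pi, 5: -7}  # 17x^29 + pi*x^17 - 7x^5
dp = differentiate(p)
print(dp)                 # coefficients of the derivative
print(evaluate(dp, 1.0))  # slope of the tangent at x = 1
```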

The preceding theorems, in combination, enable us to differentiate any
polynomial. For example,

$$D(17x^{29} + \pi x^{17} - 7x^5) = D17x^{29} + D\pi x^{17} - D7x^5,$$

because the derivative of the sum is the sum of the derivatives. This is

$$= 17 \cdot 29x^{28} + 17\pi x^{16} - 35x^4.$$

Theorem 5. If f is differentiable at $x_0$, then f is continuous at $x_0$.


This is easy to see, as a matter of common sense. By definition, $f'(x_0)$ is the
slope of the tangent. Therefore f must have a nonvertical tangent at $x_0$. But if f
"made a jump" at $x_0$, there surely couldn't be a nonvertical tangent. The secant
lines would get too steep for their slopes to approach a limit. Pictures are useful, but
it is hard to be sure that the pictures we draw allow for all possibilities. In any case,
the theorem is easy to prove algebraically. Given

$$\lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0} = f'(x_0),$$

then

$$\lim_{x \to x_0} [f(x) - f(x_0)] = \lim_{x \to x_0} \left[ \frac{f(x) - f(x_0)}{x - x_0} \cdot (x - x_0) \right] = f'(x_0) \cdot 0 = 0.$$

(Theorem needed, for the second step?) By Theorem 3 of Section 3.4,
$\lim_{x \to x_0} f(x) = f(x_0)$, which was to be proved.

The differential calculus would be simpler if the derivative of the product were
equal to the product of the derivatives; but this is not so. For example, take

$$f(x) = x^3, \qquad g(x) = x^2.$$

Then

$$f'(x)g'(x) = 3x^2 \cdot 2x = 6x^3,$$

which is not the same as

$$Dx^5 = 5x^4.$$

A correct formula can be derived as follows. Take any $x_0$, and suppose, as usual,
that f and g are differentiable. Then

$$\lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0} = f'(x_0) \tag{1}$$

and

$$\lim_{x \to x_0} \frac{g(x) - g(x_0)}{x - x_0} = g'(x_0). \tag{2}$$

To find the derivative of the product, we need to find

$$\lim_{x \to x_0} \frac{f(x)g(x) - f(x_0)g(x_0)}{x - x_0}. \tag{3}$$

In a similar situation, when we were finding the derivative of f + g, there was no
problem: we looked at the fraction whose limit we wanted to find, and observed
that it was the sum of the two fractions

$$\frac{f(x) - f(x_0)}{x - x_0}, \qquad \frac{g(x) - g(x_0)}{x - x_0},$$

whose limits we knew. If these fractions appeared in (3), then we could use the fact
that their limits are given. Since neither of them appears, we use a trick: we simply
put one of them there, fix up the rest of the fraction so that its value is unchanged,
and hope for the best:

$$\frac{f(x)g(x) - f(x_0)g(x_0)}{x - x_0} = \frac{f(x)g(x) - f(x_0)g(x) + f(x_0)g(x) - f(x_0)g(x_0)}{x - x_0} = \frac{f(x) - f(x_0)}{x - x_0}\, g(x) + \frac{g(x) - g(x_0)}{x - x_0}\, f(x_0).$$

Now we can see what happens as $x \to x_0$:

$$\lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0} = f'(x_0), \qquad \lim_{x \to x_0} g(x) = g(x_0), \qquad \lim_{x \to x_0} \frac{g(x) - g(x_0)}{x - x_0} = g'(x_0).$$

(The middle formula holds because g, being differentiable at $x_0$, is continuous there,
by Theorem 5.) Therefore, by our theorems on limits of sums and products, we have

$$\lim_{x \to x_0} \frac{f(x)g(x) - f(x_0)g(x_0)}{x - x_0} = f'(x_0)g(x_0) + f(x_0)g'(x_0).$$

In words:

Theorem 6. The derivative of the product of two differentiable functions is the
derivative of the first, times the second, plus the first times the derivative of the
second.

More briefly:

$$D(fg) = f'g + g'f.$$

Let us try this one out for

$$f(x) = x^3, \qquad g(x) = x^2.$$

Now

$$f'(x) = 3x^2, \qquad g'(x) = 2x.$$

Therefore

$$f'(x)g(x) + f(x)g'(x) = 3x^2 \cdot x^2 + x^3 \cdot 2x = 5x^4,$$

as it should be.
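The product formula and the tempting wrong guess can be compared against a numerical estimate of the derivative of $x^5$. The sketch below uses a symmetric difference quotient as a stand-in for the limit; that quotient, the step size, and the helper names are our own assumptions, not part of the text's method.

```python
def diff_quotient(F, x0, h=1e-6):
    """Symmetric difference quotient, a numerical stand-in for F'(x0)."""
    return (F(x0 + h) - F(x0 - h)) / (2 * h)

def f(x):
    return x ** 3

def g(x):
    return x ** 2

def fprime(x):
    return 3 * x ** 2

def gprime(x):
    return 2 * x

def product_rule(x):
    """f'g + g'f, the formula of Theorem 6."""
    return fprime(x) * g(x) + f(x) * gprime(x)

def naive_product(x):
    """f'g', the incorrect guess; for these functions it equals 6x^3."""
    return fprime(x) * gprime(x)

x0 = 2.0
print(diff_quotient(lambda x: f(x) * g(x), x0))  # estimate of D(x^5) at 2
print(product_rule(x0))                          # 5 * 2^4
print(naive_product(x0))                         # 6 * 2^3
```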
Next we want to find the derivative of the reciprocal $g = 1/f$ of a function f.
As usual, we take a fixed $x_0$; and we assume that f is differentiable at $x_0$. We must
have $f(x_0) \neq 0$, or $g(x_0)$ would not be defined. Now

$$g'(x_0) = \lim_{x \to x_0} \frac{g(x) - g(x_0)}{x - x_0} = \lim_{x \to x_0} \frac{1/f(x) - 1/f(x_0)}{x - x_0},$$

if such a limit exists. Algebraically,

$$\frac{1/f(x) - 1/f(x_0)}{x - x_0} = \frac{f(x_0) - f(x)}{f(x)f(x_0)(x - x_0)} = \frac{f(x) - f(x_0)}{x - x_0} \cdot \frac{-1}{f(x)f(x_0)}.$$

As $x \to x_0$, the first fraction approaches $f'(x_0)$, and the second fraction approaches
$-1/[f(x_0)]^2$, because $f(x_0) \neq 0$. Therefore

$$g'(x_0) = \frac{-f'(x_0)}{[f(x_0)]^2},$$

and so

$$D\left(\frac{1}{f}\right) = -\frac{f'}{f^2}$$

at every point where $f(x) \neq 0$. In words:

Theorem 7. The derivative of the reciprocal of a differentiable function is equal to
minus the derivative of the function, divided by the square of the function (wherever
the function is different from zero).
Using the preceding two theorems, we get

\[ D\left(\frac{f}{g}\right) = D\left(f \cdot \frac{1}{g}\right) = f' \cdot \frac{1}{g} + f \cdot D\left(\frac{1}{g}\right) = \frac{f'}{g} + f \cdot \frac{-g'}{g^2} = \frac{f'g - g'f}{g^2}. \]

In words:
Theorem 8. The derivative of the quotient of two differentiable functions is equal to the denominator times the derivative of the numerator, minus the numerator times the derivative of the denominator, all divided by the square of the denominator (wherever the denominator is not 0).
More briefly, at every point where $g(x) \ne 0$, we have

\[ D\left(\frac{f}{g}\right) = \frac{gf' - fg'}{g^2}. \]

Let us try this one out in the case

\[ f(x) = x^4, \qquad g(x) = x^2, \]

where we already know the answer. By Theorem 8 we get

\[ D\left(\frac{x^4}{x^2}\right) = \frac{x^2 \cdot 4x^3 - x^4 \cdot 2x}{(x^2)^2} = \frac{2x^5}{x^4} = 2x. \]

This is right, because

\[ Dx^2 = 2x. \]
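As a hedged illustration of the quotient formula, the sketch below evaluates $(gf' - fg')/g^2$ for the same pair $f(x) = x^4$, $g(x) = x^2$. The helper names `derivative` and `quotient_rule` and the sample point are assumptions made for this example; the derivatives are central-difference estimates rather than exact values.

```python
def derivative(func, x, h=1e-6):
    """Central-difference estimate of func'(x); h is an assumed step size."""
    return (func(x + h) - func(x - h)) / (2 * h)

def quotient_rule(f, g, x):
    """Evaluate (g f' - f g') / g^2 at x, assuming g(x) is not 0."""
    return (g(x) * derivative(f, x) - f(x) * derivative(g, x)) / g(x) ** 2

def f(x):
    return x**4

def g(x):
    return x**2

x0 = 3.0  # arbitrary nonzero sample point (an assumption)
value = quotient_rule(f, g, x0)
expected = 2 * x0  # since x^4 / x^2 = x^2 and D x^2 = 2x

print(value, expected)
```

The point $x_0$ must avoid $0$, where the denominator $g$ vanishes and the formula does not apply.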

For convenience of reference, we list our differentiation theorems as short formulas:

(i) $Dk = 0$,
(ii) $D(kf) = kDf$,
(iii) $D(f + g) = Df + Dg$,
(iv) $Dx^n = nx^{n-1}$ ($n$ a positive integer),
(v) $D(fg) = f'g + g'f$,
(vi) $D\left(\dfrac{1}{f}\right) = -\dfrac{f'}{f^2}$ (wherever $f \ne 0$),
(vii) $D\left(\dfrac{f}{g}\right) = \dfrac{gf' - fg'}{g^2}$ (wherever $g \ne 0$).

Theorem 5 did not involve a formula. It said that if a function is differentiable at a point, then it is continuous at the same point.


Finally, some remarks on the notation used for derivatives. When a function is named by a letter, such as $f$, then the notation $Df$ is unambiguous: it means $f'$. But when we describe functions by formulas, it is not always obvious what function we mean. When we speak of "the function $x^2 - x + 1$," it is obvious that we mean $x \mapsto x^2 - x + 1$; and when we speak of "the function $t^2 + 2t - 3$," we mean $t \mapsto t^2 + 2t - 3$. But if we speak of the "function"

\[ t^2 - tx^2 + x, \]

we might have either of two things in mind:

a) $x$ is regarded as a constant, and our function is

\[ f: t \mapsto t^2 - tx^2 + x. \]

b) $t$ is regarded as a constant, and our function is

\[ g: x \mapsto t^2 - tx^2 + x. \]

In such a case, it would hardly do to indicate a derivative by writing

\[ D(t^2 - tx^2 + x) \quad (?), \]

because nobody could tell whether we meant $f'$ or $g'$. To eliminate the ambiguity, we write $D_t$ or $D_x$, to indicate which letter does not represent a constant. Thus

\[ D_t(t^2 - tx^2 + x) = f'(t) = 2t - x^2, \]

while

\[ D_x(t^2 - tx^2 + x) = g'(x) = -2tx + 1. \]

Similarly,

\[ D_x(ax^3z + z^2) = 3ax^2z, \]
\[ D_z(ax^3z + z^2) = ax^3 + 2z, \]
\[ D_a(ax^3z + z^2) = x^3z. \]
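The subscript notation can be mimicked in code by saying which argument is allowed to vary. This is only a sketch under stated assumptions: the helper `partial`, its `index` parameter, and the sample values of $t$ and $x$ are hypothetical, and the derivatives are finite-difference estimates.

```python
def partial(func, args, index, h=1e-6):
    """Estimate the derivative of func with respect to args[index],
    holding every other argument constant (central difference)."""
    up = list(args)
    down = list(args)
    up[index] += h
    down[index] -= h
    return (func(*up) - func(*down)) / (2 * h)

def expr(t, x):
    return t**2 - t * x**2 + x

t0, x0 = 2.0, 3.0  # arbitrary sample values (assumptions)

d_t = partial(expr, (t0, x0), 0)  # plays the role of D_t: x held constant
d_x = partial(expr, (t0, x0), 1)  # plays the role of D_x: t held constant

print(d_t, 2 * t0 - x0**2)       # formula 2t - x^2
print(d_x, -2 * t0 * x0 + 1)     # formula -2tx + 1
```

The same expression gives two different answers, which is exactly the ambiguity that the subscripts remove.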

PROBLEM SET 3.5

All of the following are differentiation problems. Most of them can be worked by the
standard formulas that we have just derived. But in some cases you will need to start with
the definition of $f'(x_0)$ and then use various algebraic stratagems.
2.

0
1. D(7x1 - x8)
4. D -x2 + 1
n

3. D --

x + 1

x + 1

y
5. D -y3 - 3

7.

D--

()

8.

x)

b) Da(3axy + x2 + a3)

11. a) Dx(3axy + x2 + a3)

12. D[(x2 - x + l)(x2 + x + 1)]

D(?y4 _ y2 + 7T)

9. D(l + x)3

b) Dv (x3y + ay3 + xy 2)

10. a) Dx(x3y + ay3 + xy 2)

x + 1

n 2

6.

1 3. D

c) Da (x3y + ay3 + xy2)

x2 - x + 1

x2 + x + 1

15. D(x2 + x)2

14. D-x3 - x

16. If you worked Problem 9 of Section 3.3, you know that $f(x) = \sqrt{x}$ is continuous in its entire domain $R^+$. Assuming, in any case, that this is true, find $f'$. [Hint: Set up the fraction whose limit is $f'(x_0)$, rationalize the numerator and hope for the best.]

17. Given $f(x) = \sqrt{x + 1}$ ($x \ge -1$), find $f'$. Here you may assume that $f$ is continuous. But you should mention this fact, at the stage where you need it.

18. Given $f(x) = \sqrt{x^2 + 1}$, find $f'$. (Assume that $f$ is continuous.)

19. Given $f(x) = \sqrt{1 - x^2}$ ($-1 \le x \le 1$), find $f'$.

20. Given $g(x) = x^2\sqrt{1 - x^2}$, find $g'$.

21. Find $D(1/\sqrt{x})$ ($x > 0$).

22. Find $D(1/\sqrt{1 - x^2})$.

23. Find $D(x/\sqrt{1 - x^2})$ ($-1 < x < 1$).

24. Now solve Problem 19 by the methods of Chapter 2, without using limits or differentiation formulas.

Find out whether the following formulas are correct, and give your reasons.

25. $D(x^2 + 1)^2 = 2(x^2 + 1)$ (?)

26. $D(x^2 + 1)^2 = 2(x^2 + 1) \cdot 2x$ (?)

27. $D(x^2 + 1)^3 = 3(x^2 + 1)^2$ (?)

28. $D(x^2 + 1)^3 = 3(x^2 + 1)^2 \cdot 2x$ (?)

29. $D(x^2 + 1)^{500} = 500(x^2 + 1)^{499}$ (?) [Hint: In fact, this formula is wrong. And it is possible to prove that it is wrong, without finding out what the derivative of the given function really is.]

30. Prove by induction that $Dx^n = nx^{n-1}$, without using either a factorization formula or the binomial theorem.

31. Same problem, for $Dx^{-n} = -nx^{-n-1}$.

32. Find $D\sqrt{x^2 + x + 1}$.

3.6 THE PROCESS OF DIFFERENTIATION: ROOTS AND POWERS OF FUNCTIONS

Some of the answers that you got in the preceding problem set deserve to be regarded as standard differentiation formulas. For example, you found that for

\[ f(x) = \sqrt{x} \]

we have

\[ f'(x) = \frac{1}{2\sqrt{x}}. \]

This problem is going to come up again. We had, therefore, better add it to the list of formulas at the end of Section 3.5:

(viii) $D\sqrt{x} = \dfrac{1}{2\sqrt{x}}$.

You found also that

\[ D\sqrt{x + 1} = \frac{1}{2\sqrt{x + 1}} \qquad (x > -1), \]
\[ D\sqrt{x^2 + 1} = \frac{x}{\sqrt{x^2 + 1}}, \]
\[ D\sqrt{1 - x^2} = \frac{-x}{\sqrt{1 - x^2}} \qquad (-1 < x < 1). \]

In each of these cases, we have the problem of finding the derivative of the positive square root $\sqrt{f}$ of the differentiable positive function $f$. If we can solve this problem in the general case, then we can get a formula

\[ D\sqrt{f} = \; ?; \]

and we can then apply the formula hereafter.

Suppose, then, that we are given

\[ g(x) = \sqrt{f(x)}, \]

on a domain where $f(x) > 0$; and suppose that $f$ has a derivative at a certain point $x_0$ of the domain. We want to find $g'(x_0)$. By definition,

\[ g'(x_0) = \lim_{x \to x_0} \frac{g(x) - g(x_0)}{x - x_0} = \lim_{x \to x_0} \frac{\sqrt{f(x)} - \sqrt{f(x_0)}}{x - x_0} \]
\[ = \lim_{x \to x_0} \left[ \frac{\sqrt{f(x)} - \sqrt{f(x_0)}}{x - x_0} \cdot \frac{\sqrt{f(x)} + \sqrt{f(x_0)}}{\sqrt{f(x)} + \sqrt{f(x_0)}} \right] \]
\[ = \lim_{x \to x_0} \left[ \frac{f(x) - f(x_0)}{x - x_0} \cdot \frac{1}{\sqrt{f(x)} + \sqrt{f(x_0)}} \right] \]
\[ = f'(x_0) \lim_{x \to x_0} \frac{1}{\sqrt{f(x)} + \sqrt{f(x_0)}}, \]

provided that the latter limit exists. It is easy to see that this limit exists, provided that

\[ \lim_{x \to x_0} \sqrt{f(x)} = \sqrt{f(x_0)}. \tag{1} \]

Since the limit of the sum is the sum of the limits, it then follows that

\[ \lim_{x \to x_0} \left[ \sqrt{f(x)} + \sqrt{f(x_0)} \right] = 2\sqrt{f(x_0)}; \]

and since the limit of the quotient is the quotient of the limits, we get:
Theorem 1. If $f$ is positive and differentiable, then

\[ D\sqrt{f} = \frac{f'}{2\sqrt{f}}. \]
Let us try this on the function $x \mapsto \sqrt{x + 1}$. Here $f(x) = x + 1$, and $f'(x) = 1$ for every $x$. Therefore

\[ D\sqrt{x + 1} = \frac{1}{2\sqrt{x + 1}}, \]

which is the right answer. For $x \mapsto \sqrt{x^2 + 1}$, we have

\[ f(x) = x^2 + 1, \qquad f'(x) = 2x, \]
\[ D\sqrt{x^2 + 1} = D\sqrt{f(x)} = \frac{f'(x)}{2\sqrt{f(x)}} = \frac{2x}{2\sqrt{x^2 + 1}} = \frac{x}{\sqrt{x^2 + 1}}, \]

which is the right answer.
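The square-root formula $D\sqrt{f} = f'/(2\sqrt{f})$ can be spot-checked numerically. The sketch below is illustrative only: the helpers `derivative` and `sqrt_rule` and the sample point are assumptions, and every derivative is a central-difference estimate.

```python
import math

def derivative(func, x, h=1e-6):
    """Central-difference estimate of func'(x); h is an assumed step size."""
    return (func(x + h) - func(x - h)) / (2 * h)

def sqrt_rule(f, x):
    """Evaluate f'(x) / (2 sqrt(f(x))), assuming f(x) > 0."""
    return derivative(f, x) / (2 * math.sqrt(f(x)))

def f(x):
    return x**2 + 1

x0 = 2.0  # arbitrary sample point (an assumption)

by_rule = sqrt_rule(f, x0)
direct = derivative(lambda x: math.sqrt(f(x)), x0)
exact = x0 / math.sqrt(x0**2 + 1)  # the closed form x / sqrt(x^2 + 1)

print(by_rule, direct, exact)
```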

Formula (viii), of course, depends on

\[ \lim_{x \to x_0} \sqrt{f(x)} = \sqrt{f(x_0)}, \tag{1} \]

which we haven't proved. We postpone the proof, observing meanwhile that (1) is reasonable. Since $f$ is continuous, we have

\[ x \approx x_0 \;\Rightarrow\; f(x) \approx f(x_0). \]

Since $\sqrt{\phantom{x}}$ is continuous, we have

\[ f(x) \approx f(x_0) \;\Rightarrow\; \sqrt{f(x)} \approx \sqrt{f(x_0)}. \]

Fitting these two statements together, we get

\[ x \approx x_0 \;\Rightarrow\; \sqrt{f(x)} \approx \sqrt{f(x_0)}, \]

which is what we want.

Consider now the following function:

\[ g(x) = (x^{50} - x^{17} + 1)^{247}. \]

If you want to find $g'(x)$, it is not helpful to observe that $g$ is a polynomial, of degree 12,350, or to recall the binomial theorem. In fact, the right approach is to solve first a more general problem. Given a function $g$ which is a power of a function $f$. Thus

\[ g(x) = f^n(x), \]

where by $f^n(x)$ we mean $[f(x)]^n$. Then


\[ g'(x_0) = \lim_{x \to x_0} \frac{f^n(x) - f^n(x_0)}{x - x_0} = \lim_{x \to x_0} \frac{[f(x) - f(x_0)][f^{n-1}(x) + f^{n-2}(x)f(x_0) + \cdots + f^{n-1}(x_0)]}{x - x_0}. \]

We see that this limit is

\[ g'(x_0) = nf^{n-1}(x_0)f'(x_0), \tag{2} \]

provided that in the brackets on the right we have

\[ \lim_{x \to x_0} f^k(x) = f^k(x_0) \tag{3} \]

for each positive integer $k$. Equation (3) states that the $k$th power of a continuous function is always continuous. And this is true: given that

\[ \lim_{x \to x_0} f(x) = f(x_0), \]

it follows that

\[ \lim_{x \to x_0} f^2(x) = \lim_{x \to x_0} [f(x)f(x)] = f(x_0)f(x_0) = f^2(x_0), \]

because the limit of the product is the product of the limits. For the same reason,

\[ \lim_{x \to x_0} f^3(x) = \lim_{x \to x_0} [f^2(x)f(x)] = f^2(x_0)f(x_0) = f^3(x_0). \]

In $k - 1$ such steps, we get Eq. (3). Therefore Eq. (2) is correct; and we have:
Theorem 2. If $f$ is differentiable, and $n$ is a positive integer, then

\[ Df^n = nf^{n-1}f'. \]

Let us try this on our polynomial of degree 12,350:

\[ D(x^{50} - x^{17} + 1)^{247} = Df^{247} = 247f^{246}f' = 247(x^{50} - x^{17} + 1)^{246}(50x^{49} - 17x^{16}). \]

Note that our use of the shortcut formula $Df^n = nf^{n-1}f'$ has two advantages
over the method based on the binomial expansion. First, the calculation is possible,
as a practical matter. Second, it gives the answer not merely in a correct form, but
also in a factored form, which is easier to handle than the binomial expansion of the
derivative.
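The factored answer is easy to evaluate even though the expanded polynomial is not. As a hedged sketch, the code below compares the formula $247f^{246}f'$ with a finite-difference estimate at the single point $x = 1$, where $f(1) = 1$ and $f'(1) = 50 - 17 = 33$. The helper `derivative`, the step size, and the choice of sample point are assumptions made for this illustration.

```python
import math

def derivative(func, x, h=1e-7):
    """Central-difference estimate of func'(x); h is an assumed step size."""
    return (func(x + h) - func(x - h)) / (2 * h)

def inner(x):
    return x**50 - x**17 + 1

def inner_prime(x):
    return 50 * x**49 - 17 * x**16

def g(x):
    return inner(x) ** 247

x0 = 1.0  # sample point chosen so the arithmetic stays small (an assumption)

by_formula = 247 * inner(x0) ** 246 * inner_prime(x0)  # 247 * 1 * 33
estimate = derivative(g, x0)

print(by_formula, estimate)
```

Because $g$ changes very rapidly near $x = 1$, the estimate agrees with the formula only to a relative accuracy set by the step size.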
Since we know how to differentiate fractions, we know how to differentiate functions of the form

\[ f(x) = \frac{1}{x^k}, \]

where $k$ is a positive integer. We have

\[ f'(x) = \frac{-Dx^k}{(x^k)^2} = \frac{-kx^{k-1}}{x^{2k}} = \frac{-k}{x^{k+1}}, \]

where $k + 1 = 2k - (k - 1)$. If we express $1/x^k$ as $x^{-k}$, and make the same change in the formula for $f'$, then we get

\[ Dx^{-k} = -kx^{-k-1}. \]

This has the same form as our previous formula $Dx^n = nx^{n-1}$, with $n = -k$. What is needed, to take care of all such cases, is the following:
Theorem 3. If $n$ is a positive integer, and $f$ is differentiable, then

\[ Df^n = nf^{n-1}f'. \]

If $n$ is a negative integer, then the same formula holds at every point $x$ where $f(x) \ne 0$.
The last condition is necessary: if $n < 0$, then $f(x)$ appears in the denominator of $f^n(x) = 1/f^{-n}(x)$, and $f^n$ is therefore not defined at points where $f(x) = 0$.
Theorem 3 has already been proved for the case in which $n$ is a positive integer. For the case in which $n$ is a negative integer, $n = -k$, the proof is as follows:

\[ Df^n = Df^{-k} = D\left(\frac{1}{f^k}\right) = \frac{-kf^{k-1}f'}{f^{2k}} = -kf^{-k-1}f' = nf^{n-1}f'. \]
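The negative-exponent case can be checked the same way as the positive one. This is a minimal sketch under assumptions: the function $f(x) = x^2 + 1$, the exponent $n = -3$, the sample point, and the helper `derivative` are all chosen for illustration and do not come from the text.

```python
def derivative(func, x, h=1e-6):
    """Central-difference estimate of func'(x); h is an assumed step size."""
    return (func(x + h) - func(x - h)) / (2 * h)

def f(x):
    return x**2 + 1  # never zero, so negative powers are defined everywhere

n = -3
x0 = 1.0  # arbitrary sample point (an assumption)

# Theorem 3: D f^n = n f^(n-1) f'
by_rule = n * f(x0) ** (n - 1) * derivative(f, x0)
# Differentiate f^n directly.
direct = derivative(lambda x: f(x) ** n, x0)
# Closed form: -3 * 2x / (x^2 + 1)^4
exact = -3 * 2 * x0 / (x0**2 + 1) ** 4

print(by_rule, direct, exact)
```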

For convenience of reference, we list all the differentiation formulas that we have so far:

(i) $Dk = 0$,
(ii) $D(kf) = kDf$,
(iii) $D(f + g) = Df + Dg$,
(iv) $Dx^n = nx^{n-1}$ ($n \ne 0$),
(v) $D(fg) = f'g + g'f$,
(vi) $D\left(\dfrac{1}{f}\right) = -\dfrac{f'}{f^2}$,
(vii) $D\left(\dfrac{f}{g}\right) = \dfrac{gf' - fg'}{g^2}$,
(viii) $D\sqrt{x} = \dfrac{1}{2\sqrt{x}}$,
(ix) $D\sqrt{f} = \dfrac{f'}{2\sqrt{f}}$,
(x) $Df^n = nf^{n-1}f'$ ($n \ne 0$).

Each of these formulas holds for every $x$ for which its right-hand member is defined.

PROBLEM SET 3.6


Find, by any method:

I.

2. D

Dv'(x +l)(x +2)

8. DVx3 +2x +1

7. DVx(x - 2)
x2
1
10. D- -
x2 +1
-

11. D

DVx2 (Warning:

-J
1

Recall that for

x
(x2 +2x+1)2

x2 +1
6. D- -x2 - 1
9.

v'x - I
--
x2 + 1

12. DYv'

+x

Don't "simplify" yourself into a wrong answer.)

15. Dv(x3y2 - x2y3)3

14. D,, v' x2 +a2


17.

3.

5. D(x3 +x2 - x +7)2

4. DVx4 +5x2 +2

13.

1
(x2 +x +1)2

>

16 Dx

0,

x3y2
(x2 +y2)2

by definition. Show that

[Hint:

_
.a,

By definition, v

.{;xv

xv is

( :X)1'

(x

>

o) .

the number which, when raised to the qth power, gives

x'l>.

Therefore you need to show that the qth power of the right-hand side of the above
equation is

xv.]

This is an instance of a frequent phenomenon: often, a problem becomes easy if we rewrite it, using the definitions of the ideas that the problem involves.

18. Find $Dx^{3/2}$, and write the answer in a form which brings out the analogy with the formula $Dx^n = nx^{n-1}$. You may assume that $x^{3/2}$ is continuous.


19. Find $D_x[\sqrt{x^3 + x}\,(x^3 + x)]$.

20. Find $D_x(x^4 + 2)^{3/2}$.

*21. Get a general formula for $Df^{3/2}$, where $f$ is any positive function (differentiable, of course). The answer should be written in such a form as to bring out the analogy with the formula $Df^n = nf^{n-1}f'$.

*22. Find a formula for $Df^{5/2}$, where $f$ is differentiable and positive.

23. Find $D_x(x^2 + 3x + 1)^{5/2}$.

24. Find a formula for $Dx^{-3/2}$ and write it in a form which brings out the analogy with the formula $Dx^n = nx^{n-1}$.

25. a) Simplify

\[ \frac{a - b}{a^3 - b^3}. \]

(Obviously, the word "simplify" can mean many different things. In this case, it means to get rid of the numerator, so as to get a formula which will be useful to you in solving the next part of the problem.)
b) Find $D\sqrt[3]{x}$, assuming that $\sqrt[3]{x}$ is continuous.

26. a) Simplify, as in Problem 25,

\[ \frac{a - b}{a^q - b^q}. \]

(Here $q$ is a positive integer.)
b) Find $D\sqrt[q]{x}$, assuming that $\sqrt[q]{x}$ is continuous.

*27. Given positive integers $p$, $q$. Find a formula for $Dx^{p/q}$, valid for $x > 0$, and write the answer in a form which brings out the analogy with the formula $Dx^n = nx^{n-1}$.

3.7 THE INTEGRAL OF A NONNEGATIVE FUNCTION

Given the graph of $y = kx^2$. In Section 2.10, we calculated the area $A_h$ of the shaded region

\[ \{(x, y) \mid 0 \le x \le h, \; 0 \le y \le kx^2\}. \]

We found that

\[ A_h = \frac{k}{3}h^3. \]

In this situation, we regarded $h$ as a fixed number. On the other hand, it is plain that $h$ can be any positive number, and that when $h$ is named, $A_h$ is determined. Thus $A_h$ can be regarded as a function.

To discuss $A_h$ as a function, without confusing the notation, let us relabel our horizontal axis as the $t$-axis, as shown on the right. Thus our parabola becomes the graph of the equation $y = kt^2$. (Here $t$ is acting like $x$.) For each $x \ge 0$, let $F(x)$ be the area of the region under the parabola, from 0 to $x$. Thus $F(1)$ is the area under the parabola from 0 to 1; this is

\[ F(1) = \frac{k}{3} \cdot 1^3 = \frac{k}{3}. \]

$F(3)$ is the area under the parabola from 0 to 3; this is

\[ F(3) = \frac{k}{3} \cdot 3^3 = 9k. \]

And so on. Thus we have a function $F: R^+ \to R^+$. And we have a formula giving the values of $F$:

\[ F(x) = \frac{k}{3}x^3. \]

Here we have replaced $h$ by $x$ in the area formula

\[ A_h = \frac{k}{3}h^3. \]

Thus, starting with the nonnegative function

\[ f: t \mapsto kt^2, \]

we have defined a new function

\[ F: x \mapsto \frac{k}{3}x^3. \]

For each $x$, the value of the new function is the area under the graph of the old one.
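The passage from $f$ to $F$ can be imitated numerically by approximating the area with thin rectangles. The following sketch is not the method of Section 2.10: it swaps in a midpoint Riemann sum, and the helper `area_under`, the number of rectangles `n`, and the value $k = 2$ are assumptions made for illustration.

```python
def area_under(f, a, x, n=10_000):
    """Approximate the area under f from a to x with n midpoint rectangles."""
    width = (x - a) / n
    return sum(f(a + (i + 0.5) * width) for i in range(n)) * width

k = 2.0  # hypothetical value of the constant k

def f(t):
    return k * t**2

def F(x):
    return k * x**3 / 3  # the area formula (k/3) x^3

for x in (1.0, 3.0):
    print(x, area_under(f, 0.0, x), F(x))

approx_at_3 = area_under(f, 0.0, 3.0)
```

For each $x$ the two printed values should agree closely, which is what it means for $F(x)$ to be the area under the graph of $f$ from 0 to $x$.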

We can generalize this scheme in the following way.

[Figure: the region $R_x$ under the graph of $y = f(t)$, from $t = a$ to $t = x$.]

Given a continuous nonnegative function $f$, defined on an interval $[a, b]$. As before, we label the horizontal axis as the $t$-axis, because we want to use the symbol $x$ for another purpose. For each number $x$ on the interval $[a, b]$, let $R_x$ be the shaded region. Thus

\[ R_x = \{(t, y) \mid a \le t \le x, \; 0 \le y \le f(t)\}. \]

And let $F(x)$ be the area of the region $R_x$.
This is the scheme that we used for the parabola. For the parabola, we had $f(t) = kt^2$, and we used $a = 0$. But we can go through the same proceeding starting with any number $a$ and any continuous function $f$. Let us look at some more examples.
Consider

\[ f(t) = t + 1, \qquad t \ge 1. \]

This is a function

\[ f: [1, \infty) \to [2, \infty). \]

[Figure: graph of $y = t + 1 = f(t)$, with $a = 1$ and the point $(x, x + 1)$ marked.]

For every $x$ on the infinite interval $[1, \infty)$, let $F(x)$ be the area under the graph, from 1 to $x$. We now have a new function

\[ F: [1, \infty) \to [0, \infty). \]

In this case, it is easy to write a formula for $F$. For $x = 1$, the area is 0. Therefore $F(1) = 0$. For $x > 1$, $F(x)$ is the area of a trapezoid lying on its side, with its "bases" vertical. The altitude is $h = x - 1$, and the lengths of the bases are $b_1 = 2$ and $b_2 = x + 1$. Therefore

\[ F(x) = \tfrac{1}{2}(b_1 + b_2)h = \tfrac{1}{2}(2 + x + 1)(x - 1) = \tfrac{1}{2}(x + 3)(x - 1) = \tfrac{1}{2}(x^2 + 2x - 3). \]

Let us now try the parabola $y = t^2$, taking $a = -2$:

\[ y = f(t) = t^2, \qquad t \ge -2. \]

Here we have

\[ f: [-2, \infty) \to [0, \infty). \]

For each $x \ge -2$, $F(x)$ is the area under this graph, from $-2$ to $x$. By our old formula,

\[ F(x) = \tfrac{1}{3}x^3 - \tfrac{1}{3}(-2)^3 = \tfrac{1}{3}x^3 + \tfrac{8}{3}. \]
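Both area functions found above, $\tfrac{1}{2}(x^2 + 2x - 3)$ for $f(t) = t + 1$ with $a = 1$ and $\tfrac{1}{3}x^3 + \tfrac{8}{3}$ for $f(t) = t^2$ with $a = -2$, can be compared with a direct numerical estimate of the area. This is a sketch under assumptions: the midpoint-rectangle helper `midpoint_area` and the sample right endpoints are hypothetical choices, not part of the text.

```python
def midpoint_area(f, a, x, n=10_000):
    """Approximate the area under f from a to x with n midpoint rectangles."""
    width = (x - a) / n
    return sum(f(a + (i + 0.5) * width) for i in range(n)) * width

# f(t) = t + 1, starting at a = 1; sample right endpoint x = 4 (an assumption).
line_area = midpoint_area(lambda t: t + 1, 1.0, 4.0)
line_formula = 0.5 * (4.0**2 + 2 * 4.0 - 3)

# f(t) = t^2, starting at a = -2; sample right endpoint x = 1 (an assumption).
parabola_area = midpoint_area(lambda t: t**2, -2.0, 1.0)
parabola_formula = 1.0**3 / 3 + 8 / 3

print(line_area, line_formula)
print(parabola_area, parabola_formula)
```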

In all these situations, $F(x)$ is determined by (a) the given function $f$, (b) the number $a$, and (c) the number $x$. All this is conveyed by the notation

\[ F(x) = \int_a^x f(t)\,dt. \]

That is, if $f$ is continuous and nonnegative, and $a \le x$, then

\[ \int_a^x f(t)\,dt \]

is the area under the graph of $f$, from $t = a$ to $t = x$.

Here it should be understood that $f$ is continuous on an interval containing $a$ and $x$.
The expression

\[ \int_a^x f(t)\,dt \]

is called the integral from $a$ to $x$ of the function $f$. The number $a$ is called the lower limit of integration (or, briefly, the lower limit) and $x$ is called the upper limit of integration (or simply the upper limit). The function $f$ is called the integrand. The notation for the integral may look formidable at first, but it is not hard to learn and is convenient: it shows at a glance that we are taking the integral of a certain function, between certain limits.
We proceed to generalize these ideas in two ways.

a) Suppose that $f$ is negative, for some values of $t$. In this case, areas below the $t$-axis are counted negatively. For example, in the left-hand figure below, $A_1$ and $A_2$ are positive numbers, representing the areas of the two shaded regions. We count $A_1$ positively; it is the area of a region above the $t$-axis. We count $A_2$ negatively; it is the area of a region below the $t$-axis. Thus we have

\[ \int_a^x f(t)\,dt = A_1 - A_2. \]

Similarly, in the figure on the right, the integral is the total area above the $t$-axis minus the total area below it.

b) So far, we have required $a < x$. If $a > x$, we first find

\[ \int_x^a f(t)\,dt, \]

and then reverse the sign. Thus, in the figure on the left below, $\int_x^a f(t)\,dt$ is the area of the shaded region, under our old definition. And $\int_a^x f(t)\,dt$ is the negative of that area, under our new definition.

Consider, for example, $f(t) = 2 - t$, defined for every $t$ (in the right-hand figure above). Take $a = -1$. Then, for $-1 \le x \le 2$ we have

\[ \int_{-1}^x f(t)\,dt = \int_{-1}^x (2 - t)\,dt = \tfrac{1}{2}(x + 1)(3 + 2 - x) = \tfrac{1}{2}(x + 1)(5 - x), \]

by the formula for the area of a trapezoid.

For $x \ge 2$ we have

\[ \int_{-1}^x (2 - t)\,dt = \tfrac{9}{2} - \tfrac{1}{2}(x - 2)(x - 2). \]

Here $\tfrac{9}{2}$ is the area of the triangle above the $t$-axis, from $-1$ to 2, and the second term is the area of the triangle below the $t$-axis, from 2 to $x$.

For $x \le -1$ we have

\[ \int_{-1}^x (2 - t)\,dt = -\int_x^{-1} (2 - t)\,dt = -\left[\tfrac{1}{2}(-1 - x)(2 - x + 3)\right] = \tfrac{1}{2}(x + 1)(5 - x). \]

You should check that these are the right answers for the three cases. In each case, we have computed areas of triangles and trapezoids by elementary area formulas, and then attached the correct sign to the area of each region.
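The three cases can be checked together, because a signed Riemann sum attaches the signs automatically: function values below the axis are negative, and the step width is negative when $x < a$. The sketch below is illustrative only; the helper `integral`, the closed form written as one expression $\tfrac{1}{2}(x + 1)(5 - x)$, and the three sample values of $x$ are assumptions made for this check.

```python
def integral(f, a, x, n=10_000):
    """Signed midpoint Riemann sum of f from a to x.
    The width is negative when x < a, which reverses the sign."""
    width = (x - a) / n
    return sum(f(a + (i + 0.5) * width) for i in range(n)) * width

def f(t):
    return 2 - t

def closed_form(x):
    return 0.5 * (x + 1) * (5 - x)

# One sample point per case: -1 <= x <= 2, x >= 2, and x <= -1.
samples = [0.0, 4.0, -3.0]
values = [integral(f, -1.0, x) for x in samples]

for x, value in zip(samples, values):
    print(x, value, closed_form(x))
```

Note that $\tfrac{9}{2} - \tfrac{1}{2}(x - 2)^2$ expands to the same quadratic as $\tfrac{1}{2}(x + 1)(5 - x)$, so a single expression covers all three cases.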

PROBLEM SET 3.7


1.

a) Consider f(t)

It!. Get a formula for

f
valid when

;;;;

0. Sketch.

ltldt,


b) Now get a formula for

valid when x 0. Sketch.


c) Now get

one

formula for the same integral, valid for every

d) Let

F(x)
Get a formula for F'(x), valid when

f'

t
l l dt.

> 0.

e) Now get a formula for F'(x), valid when x < 0.


f) Finally, get a formula for F'(x), valid for every x.
2.

Do the same six things for

f'(t2
3.

+ 1) dt.

Consider the function defined by the graph below.


y

This function is called the signum.

Algebraically,
when

< 0,

when

sigt

when

0,

> 0.

Obviously sig is not continuous at 0. But we define

f'

sig t dt

in the same way as for continuous functions. For example,

i1

sig t dt

I;

here the integral is the area of a square of edge I. Similarly,

1-i
13

sigtdt

sigtdt

3,

!)

1,

x.

3.8

f1 sig t dt
J_l

-1 + 1

O;

and so on.
sig t that you did forf (t)
Do the same six things for f (t)
Do the same six things, for f(t)
t It!.
a) Explain why
=

4.
5.

109

The Derivative of the Integral

It/ and f (f)

t2

+ 1.

( t3 dt
J_l
1

0.

b) Explain why

(1 t273 dt
J_l

0.

c) Let f be a cubic polynomial, and suppose that


Show that

6.

I (-a)

Explain why
0

7.

Explain why
0

3.8

<

I (0)

fa f(t) cit
J3 -dt
<

I (a)
=

0.

o.

14
< 12

3 1---+t2 dt
J
1

<

.
3

5.

3.8 THE DERIVATIVE OF THE INTEGRAL

In Problems 1 and 2 above, you found that if

\[ F(x) = \int_a^x f(t)\,dt, \]

then

\[ F'(x) = f(x) \]

for every $x$. That is, at each point the derivative of the integral function $F$ is simply the value of the integrand function $f$.

In fact, it is not hard to convince ourselves that if $f$ is a continuous function, this is what always happens.

Consider first the case in which $f$ is positive.

[Figure: graphs of $y = f(t)$ and $y = F(x)$.]

Take a fixed $x_0$. By definition,

\[ F'(x_0) = \lim_{x \to x_0} \frac{F(x) - F(x_0)}{x - x_0}. \]

Now

\[ F(x) = \int_a^x f(t)\,dt \qquad \text{and} \qquad F(x_0) = \int_a^{x_0} f(t)\,dt. \]

Therefore

\[ F(x) - F(x_0) = \int_a^x f(t)\,dt - \int_a^{x_0} f(t)\,dt. \]

For $x_0 < x$, as in the figure below,

\[ F(x) - F(x_0) = \int_{x_0}^x f(t)\,dt. \]

Since $f$ is continuous,

\[ F(x) - F(x_0) \approx (x - x_0)f(x_0). \]

Here "$\approx$" means "is approximately equal to." We are claiming that the area under the curve, from $x_0$ to $x$, is closely approximated by the area of a rectangle with base $x - x_0$ and altitude $f(x_0)$. Therefore

\[ \frac{F(x) - F(x_0)}{x - x_0} \approx f(x_0), \]

and the approximation gets better as $x$ gets closer to $x_0$.

For the situation shown in the figure, in which the graph rises to the right of $x_0$, it is easy to see why the approximation is good. Here we have

\[ F(x) - F(x_0) = (x - x_0)f(x_0) + E, \]

where $E$ is the area of the little curvilinear triangle at the upper right. Now

\[ E < e(x - x_0) \qquad \text{and} \qquad \frac{E}{x - x_0} < e, \]

where $e$ is the altitude of the curvilinear triangle. Thus, when we write

\[ F(x) - F(x_0) \approx (x - x_0)f(x_0), \]

the error in the approximation is $E$; and when we write

\[ \frac{F(x) - F(x_0)}{x - x_0} \approx f(x_0), \]

the error in the approximation is $E/(x - x_0)$, which is less than $e$.

If $x < x_0$, the same approximation formula holds, although the reasons are slightly different. Here the area under the curve from $x$ to $x_0$ is

\[ F(x_0) - F(x); \]

the area of the rectangle is

\[ (x_0 - x)f(x_0); \]

these are approximately equal. Changing the sign of each, we get

\[ F(x) - F(x_0) \approx (x - x_0)f(x_0), \]

and

\[ \frac{F(x) - F(x_0)}{x - x_0} \approx f(x_0), \]

as before.

But the fraction on the left is the slope of the secant line to the graph of $F$. The limit of this fraction is $F'(x_0)$; and since the fraction is close to $f(x_0)$ when $x$ is close to $x_0$, we ought to have

\[ F'(x_0) = f(x_0). \]

If this is true, then we have:

Theorem 1. If $f$ is continuous on an interval containing $a$, then

\[ D_x \int_a^x f(t)\,dt = f(x) \]

at each point $x$ of the interval.

In fact this is true, and can be proved by a more careful use of the ideas that we have just been describing informally. But let us postpone the proof until the end of this chapter, and see, in the meantime, what the theorem is good for.
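Theorem 1 can be made plausible by a numerical experiment: approximate $F$ by a Riemann sum, estimate the slope of $F$ at a point, and compare with the value of the integrand there. This is only a sketch and in no way a proof; the integrand $\sqrt{1 + t^2}$, the lower limit $a = 0$, the point $x_0 = 1$, and the helper `integral` are all assumptions chosen for illustration.

```python
import math

def integral(f, a, x, n=2_000):
    """Signed midpoint Riemann sum of f from a to x."""
    width = (x - a) / n
    return sum(f(a + (i + 0.5) * width) for i in range(n)) * width

def f(t):
    return math.sqrt(1 + t * t)  # a continuous integrand (an assumption)

a, x0, h = 0.0, 1.0, 1e-3

def F(x):
    return integral(f, a, x)

# Slope of the secant line to the graph of F through x0 - h and x0 + h.
slope = (F(x0 + h) - F(x0 - h)) / (2 * h)

print(slope, f(x0))
```

The secant slope of $F$ comes out close to $f(x_0)$, as the informal argument above suggests.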

Consider the following problem:

Problem 1. Calculate the area under the graph of $y = x^4$, from $x = 0$ to $x = 1$.

To solve this problem, the first step is to realize that whoever proposed the problem has asked the wrong question: the answer to his question is a number, and there is nothing about this number that is easy to see.

[Figure: graphs of $y = x^4$ and $y = f(t) = t^4$.]

The easiest way to solve Problem 1 is to consider instead the following:

Problem 2. Find a formula for the function

\[ F(x) = \int_0^x t^4\,dt. \]

It might seem that Problem 2 must be harder, but this is not true. The point is that, while information about the number $F(1)$ is hard to come by, Theorem 1 tells us something about the function $F$, namely,

\[ F'(x) = x^4. \]

We now ask ourselves what sort of function has $x^4$ as its derivative. We get powers of $x$ by differentiating powers of $x$, using the formula

\[ Dx^n = nx^{n-1}. \]

Thus

\[ Dx^5 = 5x^4. \]

This is like $F'(x)$, except for the factor 5. But the 5 is easy to get rid of: we divide $x^5$ by 5, getting

\[ D(\tfrac{1}{5}x^5) = \tfrac{1}{5} \cdot 5x^4 = x^4. \]

Thus we have found a function

\[ G(x) = \tfrac{1}{5}x^5, \]

which resembles the given function

\[ F(x) = \int_0^x t^4\,dt. \]

To be exact:

\[ G'(x) = F'(x) \quad \text{for every } x, \tag{1} \]
\[ G(0) = F(0). \tag{2} \]

To see why (2) holds, we observe that

\[ G(0) = \tfrac{1}{5} \cdot 0^5 = 0 \]

and that

\[ F(0) = \int_0^0 t^4\,dt = 0. \]

Equations (1) and (2) ought to guarantee that

\[ G(x) = F(x) \quad \text{for every } x; \]

that is,

\[ G = F. \]

The functions $F$ and $G$ start with the same value, at $x = 0$. And (1) tells us that $F$ and $G$ always change at the same rate. This suggests the following:
Theorem 2. (The uniqueness theorem.) Let $F$ and $G$ be differentiable functions, defined on the same interval $I$, and let $a$ be a point of $I$. If

\[ F(a) = G(a) \tag{3} \]

and

\[ F'(x) = G'(x) \quad \text{for every } x \text{ in } I, \tag{4} \]

then

\[ F(x) = G(x) \quad \text{for every } x \text{ in } I. \tag{5} \]

Here we call the interval $I$ because we want to allow intervals of all kinds, including $[a, b]$, $[a, b)$, $[a, \infty)$, $(-\infty, a]$, and so on. We also allow the case $I = (-\infty, \infty)$. This is the case for the functions

\[ F(x) = \int_0^x t^4\,dt, \qquad G(x) = \tfrac{1}{5}x^5 \]

that we have been discussing in the last few pages.

The uniqueness theorem is a consequence of the mean-value theorem (MVT) of Section 3.2. The proof is as follows.

Suppose that $F(b) \ne G(b)$ for some $b$ on the interval $I$. For each $x$ on $[a, b]$, let

\[ H(x) = F(x) - G(x). \]

Then $H(a) = 0$, but $H(b) \ne 0$.

The slope of the chord joining the endpoints of the graph of $H$ is

\[ \frac{H(b) - H(a)}{b - a} \ne 0. \]

By MVT there is an $x$ between $a$ and $b$ such that

\[ H'(x) = \frac{H(b) - H(a)}{b - a}. \]

Therefore $H'(x) \ne 0$. But this is impossible: for every $x$ we have

\[ H'(x) = F'(x) - G'(x) = 0. \]

Theorems 1 and 2, in combination, enable us to solve some difficult area problems.

Example 1. Find the area of the region above the $x$-axis and below the graph of $f(x) = 1 - x^2$.

Obviously this area is

\[ \int_{-1}^{1} (1 - t^2)\,dt. \]

Let

\[ F(x) = \int_{-1}^{x} (1 - t^2)\,dt. \]

Then

\[ F'(x) = 1 - x^2, \]

by Theorem 1. It is easy to find another function with this derivative, namely,

\[ G(x) = x - \tfrac{1}{3}x^3 \quad (?) \]

Now

\[ F(-1) = 0, \]

and

\[ G(-1) = -1 + \tfrac{1}{3} = -\tfrac{2}{3}. \]

But this is easy to fix: we change our minds and write

\[ G(x) = x - \tfrac{1}{3}x^3 + \tfrac{2}{3}. \]

Then

\[ G(-1) = 0, \]

as it should be; and by the uniqueness theorem it follows that

\[ G(x) = F(x) \]

for every $x$. Therefore

\[ \int_{-1}^{1} (1 - t^2)\,dt = F(1) = G(1) = 1 - \tfrac{1}{3} + \tfrac{2}{3} = \tfrac{4}{3}. \]

Example 2. Find the area under the graph of $y = x^2 + x + 2$, from $x = -1$ to $x = 2$.

Here the area is

\[ \int_{-1}^{2} (t^2 + t + 2)\,dt. \]

Let

\[ F(x) = \int_{-1}^{x} (t^2 + t + 2)\,dt. \]

Then

\[ F'(x) = x^2 + x + 2. \]

As our first guess, let

\[ G(x) = \tfrac{1}{3}x^3 + \tfrac{1}{2}x^2 + 2x, \]

so that

\[ G'(x) = F'(x). \]

We find that

\[ G(-1) = -\tfrac{1}{3} + \tfrac{1}{2} - 2 = -\tfrac{11}{6}. \]

Therefore we really want

\[ G(x) = \tfrac{1}{3}x^3 + \tfrac{1}{2}x^2 + 2x + \tfrac{11}{6}, \]

so that $G(-1) = 0$. This gives the answer in the form

\[ \int_{-1}^{2} (t^2 + t + 2)\,dt = F(2) = G(2) = \tfrac{1}{3} \cdot 8 + \tfrac{1}{2} \cdot 4 + 2 \cdot 2 + \tfrac{11}{6} = \tfrac{21}{2}. \]

The same scheme can be used to calculate integrals in which the integrand is negative and hence does not represent an area. Given

\[ F(x) = \int_a^x f(t)\,dt, \]

we first find (if we can) another function $G$ such that

\[ G'(x) = F'(x) = f(x). \]

We then arrange for $G(a)$ to be 0, by adding a suitable constant to the first $G$ that we tried. We then know that $G(x) = F(x)$ for every $x$. Therefore

\[ \int_a^b f(t)\,dt = F(b) = G(b), \]

in the same way as for positive functions, and for the same reasons.
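The scheme of guessing a function $G$ with the right derivative and then shifting it by a constant so that it vanishes at $a$ can be written as a short routine. The following is a minimal sketch using exact rational arithmetic; the function name `integral_via_antiderivative` is hypothetical, and the routine assumes that the caller has already found a correct first guess $G$ with $G' = f$.

```python
from fractions import Fraction

def integral_via_antiderivative(G, a, b):
    """Given a first guess G with G' = f, return the integral of f from a to b.
    The constant -G(a) is added so that the shifted function vanishes at a."""
    shift = -G(a)
    return G(b) + shift

# Integrand 1 - t^2 on [-1, 1]; first guess G(x) = x - x^3/3.
def G1(x):
    return x - Fraction(1, 3) * x**3

area1 = integral_via_antiderivative(G1, Fraction(-1), Fraction(1))

# Integrand t^2 + t + 2 on [-1, 2]; first guess G(x) = x^3/3 + x^2/2 + 2x.
def G2(x):
    return Fraction(1, 3) * x**3 + Fraction(1, 2) * x**2 + 2 * x

area2 = integral_via_antiderivative(G2, Fraction(-1), Fraction(2))

print(area1, area2)
```

In the first case the shift is $\tfrac{2}{3}$ and in the second it is $\tfrac{11}{6}$; the uniqueness theorem is what justifies identifying the shifted $G$ with $F$.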
PROBLEM SET 3.8
1. By the methods of this section, find the area under the graph of $y = x - x^3$, from $x = 0$ to $x = 1$, and sketch. (Here, and hereafter in this problem set, you should explain what functions you are using as the functions $f$, $F$, and $G$.)

2. Find the area of the region lying below the $x$-axis and above the graph of $y = x^4 - 1$. Note that here the function $f$ is negative, so that the area and the integral are different; the area $A$ is positive, and

\[ \int_{-1}^{1} f(t)\,dt = -A. \]

3. a) Find the area under the graph of $y = x^{10}$, from 0 to $b$.
b) Find the area under the graph of $y = x^{10}$, from $a$ to $b$. (Query: Do you need to give separate discussions for the cases $0 < a < b$, $a < 0 < b$, and so on?)
c) Same as Problem 3(b), for the graph of $y = x^{100}$.

d) Now find a general formula for

valid for every positive integer


4.

Find
a)

fu2

2t

5)dt

b)

f'

tndt,

11.

cx2 + 2x + 5)dx

c)

(z2

+ 2z +

5)dz


d) A general formula for

F(x) J:(t2 + 2t + 5) dt.


=

5.

a) Find

(t3 + t

1) dt.

b) Get a general formula for

F(x) f + t - 1) dt.
(t3

6.

Get a general formula for

f'(t5 - 2t3 + 1) dt.


7. a) Find the area under the graph of

x
v' x2 + 1 '

y=

1 2.
x/ v'x2 + 1,

from

to

Unless you happen to remember a function whose derivative is


you are going to have to figure out how this function might arise as

the answer to a differentiation problem.

The radical in the denominator suggests

that somebody has been using the formula

Dv'fb) Find

f'
2v'J'
-

Jl t2t + 1 dt.
__

-1

Then sketch the graph of

v'

__

y = /(!)

v'

t2 + 1 '

as well as you can, and explain how the numerical value that you got for the integral
could have been predicted, without any calculations at all.
8. Let

F(x) f(1 +
=

Express

F'(x)

formula.

not

being asked to express

Same as Problem 8, for

F(x) r(1 + v'()500 dt.


=

10.

dt.

by an elementary formula (that is, by a formula not involving integrals

or differentiations). Note that you are


9.

v'()0

Same as Problem 8, for

F(x)

by an elementary

118

11. Find the area under the graph of y =


12. Same, for the function y =
13.

3.8

Functions, Derivatives, and Integrals

1
_

1 __

x + 1

(and above the x-axis) from 0 to 1.

.
vz -x

--

Find

[Hint: This problem is easier if you forget the binomial theorem.]


14.

Find

15.

Find

16.

For n 7" 0, we have

x2

(1

dx.
xa)2

Since n 7" 0, the function


f(x) = x-1 =

1
-

1
never appears as the derivative of a power of x. If we allowed n = 0, then xn = xo
for x > O; the derivative is O; and l/x still does not appear. Thusf(x)
l/x is not the
derivative of any integral power of x.
Question: ls there any function at all which has f(x) = 1/x as its derivative, say,
for x > 0? If so, what function?
=

* 1 7.

Consider

If you attempt to evaluate this integral by applying the methods of this section in a
mechanical sort of way, you will get an "answer." If you try to interpret your answer
geometrically, you will see that your answer cannot possibly be right. What went wrong?
(Evidently we must have been trying to apply a theorem in a case in which its hypothesis
is not satisfied. The question is what theorem and what hypothesis.)
*18. In Theorem 1, suppose that we had omitted the hypothesis that $f$ is continuous. Give an example to show that the resulting theorem would not have been true. [Hint: You have already seen cases in which a function of the type

\[ F(x) = \int_a^x f(t)\,dt \]

fails to have a derivative at some point $x_0$; and surely we cannot have

\[ F'(x_0) = f(x_0) \]

if there is no such thing as $F'(x_0)$.]

3.9 UNIFORMLY ACCELERATED MOTION

Suppose that a particle is moving, according to some given law, along a line. If we think of the line of motion as the $y$-axis, then the motion can be described by a function

\[ f: I \to R; \]

for each time $t$ on the interval, $f(t)$ is the $y$-coordinate of the moving particle at time $t$. Thus, for example, in the figure below, the total time interval $I$ is the closed interval $[t_1, t_4]$. The figure tells us that, at the start of the motion, the time is $t_1$ and the particle is at the point $y = 1$; in the time interval $[t_1, t_2]$, the particle rises from 1 to 3; in the time interval $[t_2, t_3]$, the particle falls from 3 to $-1$; and in the time interval $[t_3, t_4]$, the particle rises from $-1$ to 4.

[Figure: graph of $y = f(t)$ on the time interval $[t_1, t_4]$, with the $y$-axis marked from $-1$ to 4.]

The figure shows a finite time interval $I = [t_1, t_4]$. More generally, the function $f$ may be defined on an infinite time interval $I = [t_1, \infty)$ or $I = R = (-\infty, \infty)$. But most of the time, on or near the earth, the motion begins at some time $t_0$, and eventually the motion stops. The velocity is the function

\[ v = f': I \to R, \]

provided that $f$ is differentiable. The acceleration is the function

\[ a = v': I \to R, \]

provided that $v$ is differentiable. Thus the acceleration is the derivative of the derivative of $f$. We call this the second derivative of $f$, and denote it by $f''$. Thus we can sum up:

\[ v = f', \qquad a = v' = f'' \qquad \text{(by definition)}. \]

Finally, there is a fourth function associated with the motion. This is the function

\[ F: I \to R \]

which gives, for each time $t$, the force $F(t)$ acting on the body at time $t$.


We shall now see what form these functions take when/ describes the motion of a
freely falling body. Before we can work mathematically on the problem, we have to
state our physical assumptions in mathematical form.

1)

Newton's second law asserts that acceleration is proportional to force divided by

mass; that is,

F(t)
a(t) = k1-

(k1

const).

2)

For a freely falling body (or a body projected vertically upward), the force is the

resultant of the

weight (which acts downward) and the air resistance (which acts up

ward when the body is falling and downward when the body is rising). If the speed is
moderate, then the air resistance can be neglected. Hereafter, we shall assume that the
weight is the only force, so that

F(t)
where

3)

W(t) < 0

for every t on

[c, d],

W(t) is the weight at time t. W(t) < 0 because weight pulls things downward.

Evidently, the weight will not change merely with the passage of time, but it will

depend on the altitude; the greater the altitude, the less the force of gravitation. But
if the altitude is not very great, then the weight will be very nearly constant. We shall
assume hereafter that the weight of a freely falling body is constant.

F(t)

W(t)

Therefore

k2 < 0. Therefore
a(t) = k1

k2

for every

< 0

t;

and

a(t)

k3

< 0.

This last equation says that

for each falling body there is a constant which is equal to


the acceleration, independently of the time.

4)

There remains, however, a question: is there one constant which works for all

falling bodies, or does the constant acceleration depend on what sort of body is
falling? Conceivably, the law governing the free fall of heavy bodies (such as cannon
balls) might be different from the law governing the fall of light bodies (such as BB
shots).

In fact, until the time of Galileo, everybody thought that heavy bodies fell faster. The story goes that Galileo proved them wrong by dropping two iron balls of different sizes off the leaning tower of Pisa: they hit the ground at the same time. Since $k_3/m$ is independent of $m$, there is a constant $-g = k_3/m$ which gives the acceleration of every freely falling body, regardless of its mass. The number

\[ g = -\frac{k_3}{m} \]

is called the acceleration of gravity. If distance is measured in feet and time in seconds, then numerically

\[ g \approx 32, \]

measured in ft/sec$^2$.


The above discussion can be summed up as follows:

If we neglect air resistance and neglect the variation of weight with altitude, then the acceleration of a freely falling body is given by the formula

\[ a(t) = -g, \]

where $g$ is a constant and $g \approx 32$ ft/sec$^2$.

We now consider the problem of finding the functions that satisfy the equation

\[ a(t) = f''(t) = -g. \]

Problem. The function

\[ f: \mathbf{R} \to \mathbf{R} \]

has the following properties:

a) $f''(t) = -g$ for every $t$,
b) $f'(0)$ is a given number $v_0$,
c) $f(0)$ is a given number $y_0$.

What is $f$?

Using the notation $v$ for $f'$ and $a$ for $v' = f''$, we write these conditions in the form:

a) $a(t) = -g$ for every $t$,
b) $v(0) = v_0$,
c) $f(0) = y_0$.

a)

a(t)

b)

Thus our data consist of (a) the constant acceleration $-g$, (b) the initial velocity $v_0 = v(0)$, and (c) the initial position $y_0 = f(0)$. The solution is as follows:

a) We know that

\[ v'(t) = a(t) = -g \]

for every $t$. The function

\[ u(t) = -gt \quad (?) \]

has $-g$ as its derivative; the only trouble is that $u(0)$ is $0$ instead of $v_0$. But this is easy to fix: we change our minds and let

\[ u(t) = -gt + v_0. \]

Our function $u$ then has the same derivative as $v$, and has the same value at $t = 0$. By the uniqueness theorem, $u$ and $v$ are the same function, and so

\[ v(t) = -gt + v_0. \]

b) We know now that

\[ f'(t) = v(t) = -gt + v_0. \]


We want to find $f$. Now the function

\[ z(t) = -\frac{g}{2} t^2 + v_0 t \quad (?) \]

has $-gt + v_0$ as its derivative; the only trouble is that $z(0)$ is $0$ instead of $y_0$. But this is easy to fix: we change our minds and let

\[ z(t) = -\frac{g}{2} t^2 + v_0 t + y_0. \]

The function $z$ then has the same derivative as $f$, and has the same value at $0$. By the uniqueness theorem, $f$ and $z$ are the same function, and so

\[ f(t) = -\frac{g}{2} t^2 + v_0 t + y_0. \]

This completes the solution. We sum up in the following theorem:


Theorem 1. Let $f$ be a function $\mathbf{R} \to \mathbf{R}$. If

a) $f''(t) = -g$ for every $t$,
b) $f'(0) = v_0$, and
c) $f(0) = y_0$,

then

d) $f(t) = (-g/2)t^2 + v_0 t + y_0$ for every $t$.
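The conclusion of Theorem 1 can be checked numerically. The following is only a sketch: the helper names (`position`, `first_difference`, `second_difference`) and the sample values $v_0 = 10$, $y_0 = 5$ are chosen here for illustration, and the derivatives are estimated by difference quotients rather than computed exactly.

```python
G = 32.0  # ft/sec^2, the approximate value of g quoted in the text


def position(t, v0, y0, g=G):
    """f(t) = (-g/2) t^2 + v0 t + y0, the formula of Theorem 1."""
    return -0.5 * g * t * t + v0 * t + y0


def first_difference(f, t, h=1e-6):
    """Central-difference estimate of f'(t)."""
    return (f(t + h) - f(t - h)) / (2.0 * h)


def second_difference(f, t, h=1e-3):
    """Central-difference estimate of f''(t)."""
    return (f(t + h) - 2.0 * f(t) + f(t - h)) / (h * h)


# Illustrative initial data (an assumption of this sketch).
v0, y0 = 10.0, 5.0


def f(t):
    return position(t, v0, y0)


print(f(0.0))                     # the initial position y0
print(first_difference(f, 0.0))   # close to v0
print(second_difference(f, 1.7))  # close to -G, at an arbitrary time
```

The three printed values correspond to conditions (c), (b), and (a) respectively.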


Thus the mathematical problem defined by (a), (b), and (c) has only one solution. This fact is important in applications, because, if our mathematical problem had two solutions, we would have to find out which of the two solutions applied to the physical situation that we started out to investigate. But, if $f''(t) = -g$, $f'(0) = v_0 = 10$, and $f(0) = y_0 = 5$, then $f$ must be the function $f(t) = (-g/2)t^2 + 10t + 5$.

Theorem 1 can be stated in a more general form. If $I$ is any time interval whatever (finite or infinite) and $t_0$ is any point of $I$, then we can consider a function

\[ f: I \to \mathbf{R}, \]

such that

a) $f''(t) = -g$ for every $t$,
b) $f'(t_0) = v_0$,
c) $f(t_0) = y_0$.

The uniqueness theorem applies to our problem in exactly the same way as before, and the algebra is only slightly more complicated. We are given

\[ v'(t) = -g. \]

We try

\[ u(t) = -gt \quad (?); \]

we observe that $u(t_0) = -gt_0$ instead of $v_0$; to fix this, we let

\[ u(t) = -gt + gt_0 + v_0. \]

Now $u$ has the same derivative as the unknown function $v$, and has the same value at $t = t_0$. By the uniqueness theorem, $u(t) = v(t)$ for every $t$, and so

\[ v(t) = -gt + gt_0 + v_0. \]

This solves half of our problem.

We know that $f'(t) = v(t)$. We therefore try

\[ z(t) = -\frac{g}{2} t^2 + gt_0 t + v_0 t \quad (?); \]

we observe that

\[ z(t_0) = \frac{g}{2} t_0^2 + v_0 t_0 \]

instead of $y_0$; and we fix this by letting

\[ z(t) = -\frac{g}{2} t^2 + gt_0 t + v_0 t - \frac{g}{2} t_0^2 - v_0 t_0 + y_0 = -\frac{g}{2}(t - t_0)^2 + v_0(t - t_0) + y_0. \]

Then $z$ has the same derivative as $f$, and has the same value at $t_0$. By the uniqueness theorem it follows that $z(t) = f(t)$ for every $t$. Therefore

\[ f(t) = -\frac{g}{2}(t - t_0)^2 + v_0(t - t_0) + y_0. \]

None of these formulas should be learned. What you need to learn is the process by which they were derived; if you remember the method, you can use it. For example:

Problem. Given $f''(t) = 3$, $f'(3) = 1$, and $f(3) = 2$, what is $f$?

Solution. Let $v = f'$. Then $v'(t) = 3$. This suggests that $v(t) = 3t$. Adding the appropriate constant, to get $v(3) = 1$, we obtain

\[ v(t) = 3t - 8. \]

Now

\[ f'(t) = 3t - 8. \]

This suggests $f(t) = \tfrac{3}{2}t^2 - 8t$. Adding the appropriate constant, to get $f(3) = 2$, we have

\[ f(t) = \tfrac{3}{2}t^2 - 8t - \tfrac{3}{2}\cdot 3^2 + 8\cdot 3 + 2 = \tfrac{3}{2}t^2 - 8t + \tfrac{25}{2}. \]

This is the answer. (Two differentiations verify that it is an answer; and two applications of the uniqueness theorem tell us that it is the only answer.)
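The two-step method (find $v$, then find $f$) can be sketched in code for any constant second derivative. This is an illustration only: the function name `solve_constant_acceleration` is hypothetical, and exact rational arithmetic is used so that the comparison with the closed form $\tfrac{3}{2}t^2 - 8t + \tfrac{25}{2}$ involves no rounding.

```python
from fractions import Fraction


def solve_constant_acceleration(a, t0, v0, y0):
    """Return (v, f) with f'' = a, f'(t0) = v0, f(t0) = y0.

    Step 1: v(t) = a*t plus the constant that makes v(t0) = v0.
    Step 2: f(t) = (a/2)*t^2 + ... plus the constant that makes f(t0) = y0.
    Both are written here in the shifted form, in powers of (t - t0).
    """
    def v(t):
        return a * (t - t0) + v0

    def f(t):
        return Fraction(a, 2) * (t - t0) ** 2 + v0 * (t - t0) + y0

    return v, f


# The worked problem: f''(t) = 3, f'(3) = 1, f(3) = 2.
v, f = solve_constant_acceleration(a=3, t0=3, v0=1, y0=2)


def expanded_answer(t):
    """(3/2) t^2 - 8 t + 25/2, the expanded form of the same function."""
    return Fraction(3, 2) * t ** 2 - 8 * t + Fraction(25, 2)


print(v(3), f(3))                 # the two prescribed values
print(f(0), expanded_answer(0))   # both forms agree
```

The shifted form makes the two conditions at $t_0$ visible at a glance; expanding it gives the polynomial in $t$.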
PROBLEM SET 3.9
Find formulas for the unknown functions, under each of the following sets of conditions.
In all but one of these problems, the conditions are enough to determine the function.

In three cases, however, there are infinitely many possibilities; and in these cases you should
try to explain what the possible functions are.

1. /'(t)
3. f"(t)

3t + 4,/(0) 4
-1,/'(0) 2,/(0)
=

2. f'(x)
4. f"(x)

x3 - 7x + 5,/(0)
3x2,f' (1) O,f(1)

-1


5. J"(t)

7. g (x) =

9.

I (t) =

3.10
1
,
6. f (x) = 2 ,/(1)

3
t ,f'(O) = 1,/(1) = 0
_

x
1

,g(O)
x2

t ,/(2) =

8. $g'(x) = x(x^2 + 1)^2$, $g(3) = 1$

-1

10. $f'(t) = t^2(1 + t^3)^{10}$, $f(0) = 2$ (By all means, do not use the binomial theorem on this one.)

11. j'(t) = t2 + 1,/(1)


13. f'(t) =
15.

t2

(I

+ 3

t )2

,/(1)

12. f"(x) = x,/(1) = O,f'(I)

2
=

14. g"(t) =

+ l 3,g(O) = l,g(I)= 1.
(t
)

A "theoretical projectile" is fired vertically upward, from the surface of the earth, at
time 0, with initial velocity 10 ft/sec. When will it hit the ground again? For what time
interval is its motion described by the condition $a(t) = -g$? (Following the advice given at the end of this section, you should solve this problem with your book closed, using the methods but not the results given in the text.)

16. A "theoretical projectile" is fired vertically upward, from the surface of the earth, and
hits the ground again ten seconds later. What was the initial velocity?

17. A "theoretical projectile" is fired vertically downward from the top of a 200-foot
building and hits the ground 2 seconds later. What was the initial velocity?

18. We state this problem in a nonmilitary form. A billiard ball is raised to a certain height
y0 and simply dropped, so that it begins its free fall at velocity v0 = 0. Five seconds
later it hits the ground. What was y0?
19. Free fall near the surface of the moon works the same way as free fall near the surface of earth, except that the constant acceleration $-g_L$ (L for lunar) is different; the smaller mass of the moon makes the difference. Suppose you went to the moon, dropped a billiard ball as in Problem 18, and found that it dropped 3 feet in one second. What could you conclude about $g_L$?
*3.10
PROOF OF THE FORMULA
FOR THE DERIVATIVE OF THE INTEGRAL

We shall now prove Theorem 1 of Section 3.8. We have a continuous function $f$; we let

\[ F(x) = \int_a^x f(t)\,dt; \]

we take a point $x_0$; and we want to show that $F'(x_0) = f(x_0)$.

Let $\epsilon$ be any positive number. Since $f$ is continuous, we know that the graph of $f$ has an $\epsilon\delta$-box at the point $(x_0, f(x_0))$.

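[Figure: the graph of $f$ with an $\epsilon\delta$-box at $(x_0, f(x_0))$, and the region under the graph between $x_0$ and $x$ shaded.]

Thus

\[ |x - x_0| < \delta \;\Rightarrow\; |f(x) - f(x_0)| < \epsilon \;\Leftrightarrow\; f(x_0) - \epsilon < f(x) < f(x_0) + \epsilon. \]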

We are going to use these inequalities to get information about the function

\[ m(x) = \frac{1}{x - x_0}\,[F(x) - F(x_0)]. \]

Here $m$ is the slope function for the function $F$, so that $\lim_{x \to x_0} m(x) = F'(x_0)$. Evidently

\[ F(x) - F(x_0) = \int_a^x f(t)\,dt - \int_a^{x_0} f(t)\,dt = \int_{x_0}^x f(t)\,dt, \]

and so

\[ m(x) = \frac{1}{x - x_0}\,[F(x) - F(x_0)] = \frac{1}{x - x_0} \int_{x_0}^x f(t)\,dt. \]

If $f$ is positive and $x_0 < x$, as in the figure, then $F(x) - F(x_0)$ is the area of the shaded region.
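Case 1. Suppose that $x_0 < x < x_0 + \delta$, as in the figure. Then

\[ f(x_0) - \epsilon < f(t) < f(x_0) + \epsilon \qquad (x_0 \le t \le x). \]

Therefore

\[ \int_{x_0}^x [f(x_0) - \epsilon]\,dt < \int_{x_0}^x f(t)\,dt < \int_{x_0}^x [f(x_0) + \epsilon]\,dt, \]

and so

\[ [f(x_0) - \epsilon](x - x_0) < \int_{x_0}^x f(t)\,dt < [f(x_0) + \epsilon](x - x_0). \]

Dividing by the positive number $x - x_0$, we get

\[ f(x_0) - \epsilon < m(x) < f(x_0) + \epsilon. \]

Thus we have shown that

\[ x_0 < x < x_0 + \delta \;\Rightarrow\; |f(x_0) - m(x)| < \epsilon. \tag{1} \]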

Case 2. Suppose that $x_0 - \delta < x < x_0$. Then

\[ f(x_0) - \epsilon < f(t) < f(x_0) + \epsilon \qquad (x \le t \le x_0), \]

just as in Case 1. Therefore

\[ \int_x^{x_0} [f(x_0) - \epsilon]\,dt < \int_x^{x_0} f(t)\,dt < \int_x^{x_0} [f(x_0) + \epsilon]\,dt. \]

(We are integrating from left to right.) Therefore

\[ [f(x_0) - \epsilon](x_0 - x) < \int_x^{x_0} f(t)\,dt < [f(x_0) + \epsilon](x_0 - x). \]

Since $x_0 - x > 0$ in Case 2, we can divide by $x_0 - x$, preserving these inequalities. This gives

\[ f(x_0) - \epsilon < \frac{1}{x_0 - x} \int_x^{x_0} f(t)\,dt < f(x_0) + \epsilon. \]

When we interchange $x$ and $x_0$, this changes the sign of each of the factors in the middle of this expression. Therefore

\[ f(x_0) - \epsilon < \frac{1}{x - x_0} \int_{x_0}^x f(t)\,dt < f(x_0) + \epsilon. \]

To sum up:

\[ x_0 - \delta < x < x_0 \;\Rightarrow\; f(x_0) - \epsilon < m(x) < f(x_0) + \epsilon \;\Rightarrow\; |m(x) - f(x_0)| < \epsilon, \tag{2} \]

exactly as in Case 1.

Fitting together our results in Cases 1 and 2, we get

\[ 0 < |x - x_0| < \delta \;\Rightarrow\; |m(x) - f(x_0)| < \epsilon. \]

Therefore

\[ \lim_{x \to x_0} m(x) = \lim_{x \to x_0} \frac{1}{x - x_0}\,[F(x) - F(x_0)] = f(x_0), \]

which was to be proved.

This proof is not easy, but it might have been worse. It was made simpler by the fact that for each $\epsilon > 0$, the $\delta > 0$ that we get from the hypothesis $\lim_{x \to x_0} f(x) = f(x_0)$ is precisely the $\delta$ that we need, to conclude that $\lim_{x \to x_0} m(x) = f(x_0)$.
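The behavior of the slope function $m(x)$ can be observed numerically. The sketch below is an illustration under stated assumptions, not part of the proof: the integral is approximated by a midpoint rule, the names `integral` and `slope` are chosen here, and $f = \cos$, $x_0 = 1$ are arbitrary sample choices.

```python
import math


def integral(f, a, b, n=2000):
    """Midpoint-rule approximation of the integral of f from a to b.

    The step h is negative when b < a, so the result is signed.
    """
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))


def slope(f, x0, x):
    """m(x) = (1/(x - x0)) * (integral of f from x0 to x), for x != x0."""
    return integral(f, x0, x) / (x - x0)


f = math.cos
x0 = 1.0
for dx in (0.1, -0.1, 0.001, -0.001):
    # m(x) should approach f(x0) from either side as dx shrinks.
    print(dx, slope(f, x0, x0 + dx), f(x0))
```

Positive and negative values of `dx` correspond to Cases 1 and 2 of the proof.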


PROBLEM SET 3.10

Find the first and second derivatives of the following functions.


1.

f(x)

3.

h(x)

5.

g(x)

7.

f(x)

9.

h(x)

11.

g(x)

13.
15.
17.

f' (t3 1) dt
{"' dt
f(x)
Jo t
h(x)
i v t2 + 1dt
f (1 + t3)100dt
8.
10. f(x) J: v2 + tdt
2
12. h( )
1 t 4dt
"'
f
-x
14. J1x t3 dt+ 1
Jx -1t2++t110dt
2
18. {"' J 1 + t dt
Ja 1 + t4

Vl

v(

4.

6.

v!

l
+

16.

1T

If you know that

D.,

for every continuous function

f,

f /(t)dt

f(x)

f(x),

this does not immediately enable you to find the

derivative of

*20.

2. g(x)

g(x)

dt
iJ,,x 4dt
2 t +1
J"' J 1 + tdt
t
-1

19.

r t2dt
J("'2,, (t4 - t)dt
r4 + t8dt
J("'4,, 1dt
{"'
dt
Jo 1 + t2
f (t2 + 1) dt

rJo2x

Vl

t8dt.

vl

t8 dt.

But find the answer f', by any method.


Findg'(x), given
g(x)

"'

(
Jo

'

Trigonometric and
Exponential Functions

4.1
DIRECTED ANGLES. TRIGONOMETRIC
FUNCTIONS OF ANGLES AND NUMBERS

In elementary geometry, when we speak of an angle we simply mean a geometric figure, that is, a set of points: If $\overrightarrow{AB}$ and $\overrightarrow{AC}$ are rays which have the same endpoint $A$, but do not lie on the same line, then their union is the angle $\angle BAC$. (In the figure, the arrowheads remind us that the sides of an angle are rays rather than segments.) Some authors define the word angle in such a way as to allow "zero angles" and "straight angles."

In any case, in elementary geometry the idea of an angle does not include the idea
of order; the sides of an angle are not arranged in an order, any more than the sides
of a triangle are.
[Figure: the directed angles $\angle AOB$ and $\angle BOA$, with initial and terminal sides labeled.]

In trigonometry, however, the order of the sides of an angle makes a difference. Henceforth, whenever we speak of an angle we shall mean a directed angle. Thus, in the figures above, $\angle AOB$ is an ordered pair of rays $(\overrightarrow{OA}, \overrightarrow{OB})$; the ray $\overrightarrow{OA}$ is the initial side, and $\overrightarrow{OB}$ is the terminal side. Thus $\angle AOB$ is different from $\angle BOA$.

We suppose that a coordinate system is given in the plane. The counterclockwise direction is the direction from the positive $x$-axis to the positive $y$-axis, as shown below. The counterclockwise direction in a coordinate plane is regarded as positive; and the clockwise direction (running the other way) is regarded as the negative direction.

A new coordinate system is called right-handed if it gives the same counterclockwise direction; otherwise it is called left-handed. In the figure below, the right-handed coordinate systems are marked R, and the left-handed ones are marked L.

[Figure: several coordinate systems, each marked R or L.]

We can now define the trigonometric functions of an angle $\angle AOB$. The procedure is as follows. We set up a right-handed coordinate system, in which the initial side $\overrightarrow{OA}$ is the positive half of the $x$-axis. On the terminal side $\overrightarrow{OB}$ we choose a point $P \ne O$. $P$ has coordinates $(x, y)$, in the coordinate system that we have set up, and the distance $OP$ is a positive number $r$.

[Figure: the point $P = (x, y)$ on the terminal side, at distance $r$ from $O$.]

It is easy to show (by similar triangles) that the ratios

\[ x/r, \quad y/r, \quad y/x, \quad x/y, \quad r/x, \quad r/y \]

are independent of the choice of $P$; they depend only on the angle that we started with. Thus we can define the trigonometric functions of $\angle AOB$ as follows:

\[ \sin \angle AOB = y/r, \qquad \cos \angle AOB = x/r, \]
\[ \tan \angle AOB = y/x \quad (\text{for } x \ne 0), \qquad \cot \angle AOB = x/y \quad (\text{for } y \ne 0), \]
\[ \sec \angle AOB = r/x \quad (\text{for } x \ne 0), \qquad \csc \angle AOB = r/y \quad (\text{for } y \ne 0). \]
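The claim that the six ratios do not depend on the choice of $P$ can be illustrated with a short computation. This is a sketch only; the function name `trig_ratios` and the sample points are assumptions made here, with both points taken on the same ray from the origin.

```python
import math


def trig_ratios(x, y):
    """Six ratios for a point P = (x, y) != (0, 0) on the terminal side.

    A ratio is omitted when its denominator is zero, matching the
    restrictions x != 0 and y != 0 in the definitions.
    """
    r = math.hypot(x, y)  # the distance OP
    ratios = {"sin": y / r, "cos": x / r}
    if x != 0:
        ratios["tan"] = y / x
        ratios["sec"] = r / x
    if y != 0:
        ratios["cot"] = x / y
        ratios["csc"] = r / y
    return ratios


p = trig_ratios(3.0, 4.0)    # P at distance 5 from O
q = trig_ratios(7.5, 10.0)   # another point on the same ray

for name in p:
    print(name, p[name], q[name])  # the two columns agree
```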

We have defined six functions. Note that the domains of these functions are not sets of numbers, but sets of angles.


Consider now the unit circle $C$, with center at the origin, in the $xy$-plane.

[Figure: the unit circle $C$, with the points $P_0 = (1, 0)$ and $P_\theta$.]

Let $P_0$ be the point $(1, 0)$, as in the figure. To each real number $\theta$ there corresponds a point $P_\theta$ of $C$, under the following rules:

1) Given $\theta > 0$, we start at $P_0$ and move around $C$ in the counterclockwise direction until we have traced out a path whose total length is $\theta$. The point where our path ends is $P_\theta$.

2) Given $\theta < 0$, we start at $P_0$ and move around $C$ in the clockwise direction, until we have traced out a path whose total length is $|\theta|$. The point where our path ends is $P_\theta$.

These rules define a function

\[ w: \mathbf{R} \to C, \qquad \theta \mapsto P_\theta = w(\theta), \]

under which to each real number $\theta$ there corresponds a point of $C$. The function $w$ is called the winding function. Note that the values of the function are points rather than numbers. Note also that

\[ P_{\theta + 2\pi} = P_\theta \]

for every $\theta$. The reason is that when we add $2\pi$ to $\theta$, this merely means that we take another round trip around the circle, ending at the same point $P_\theta$ where we began. Similarly,

\[ P_{\theta - 2\pi} = P_\theta \]

and

\[ P_{\theta + 2n\pi} = P_\theta \]

for every integer $n$, positive, negative, or zero.
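The periodicity of the winding function can be seen in a small numerical experiment. One caveat should be stated loudly: the sketch below computes $P_\theta$ with the library functions `math.cos` and `math.sin`, which is circular from the point of view of the text (where the sine and cosine are defined from the winding function, not the other way round). It is meant only as an illustration; the names `winding` and `same_point` are chosen here.

```python
import math


def winding(theta):
    """Return the point P_theta of the unit circle.

    Numerical shortcut (an assumption of this sketch): the library
    cosine and sine give the coordinates of P_theta directly.
    """
    return (math.cos(theta), math.sin(theta))


def same_point(p, q, tol=1e-9):
    """True if p and q agree up to rounding error."""
    return math.dist(p, q) < tol


theta = 0.7
for n in (-3, -1, 0, 1, 2):
    # Adding whole round trips of length 2*pi does not move the point.
    print(n, same_point(winding(theta), winding(theta + 2 * n * math.pi)))
```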


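We shall use the winding function to define trigonometric functions of numbers, in the following way. For each number $\theta$, let

\[ \angle\theta = \angle P_0 O P_\theta. \]

The symbol $\angle\theta$ is pronounced "angle $\theta$"; $\angle\theta$ is the angle which corresponds to the number $\theta$. We now define

\[ \sin\theta = \sin\angle\theta = \sin\angle P_0 O P_\theta, \]
\[ \cos\theta = \cos\angle\theta = \cos\angle P_0 O P_\theta, \]

and so on.

[Figure: the angle $\angle\theta = \angle P_0 O P_\theta$ on the unit circle.]

We have defined these functions in terms of $\angle\theta$ because we want to emphasize their geometric meaning. But for some purposes, it is simpler to forget about angles, and merely use the coordinates of $P_\theta$. If

\[ P_\theta = (x_\theta, y_\theta), \]

then

\[ \sin\theta = y_\theta, \qquad \cos\theta = x_\theta, \]
\[ \tan\theta = \frac{y_\theta}{x_\theta} \quad (\text{whenever } x_\theta \ne 0), \qquad \cot\theta = \frac{x_\theta}{y_\theta} \quad (\text{whenever } y_\theta \ne 0), \]
\[ \sec\theta = \frac{1}{x_\theta} \quad (\text{whenever } x_\theta \ne 0), \qquad \csc\theta = \frac{1}{y_\theta} \quad (\text{whenever } y_\theta \ne 0). \]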

Using these definitions, we can derive the usual formulas. Since $P_\theta$ is on the unit circle $C$, we know that

\[ OP_\theta = 1. \]

Therefore

\[ x_\theta^2 + y_\theta^2 = 1, \]

and we have:

Theorem 1. For every $\theta$,

\[ \cos^2\theta + \sin^2\theta = 1. \]

If the sign of $\theta$ is changed, this sends us around the circle $C$ in the opposite direction. Therefore the points $P_\theta$ and $P_{-\theta}$ are symmetric across the $x$-axis, as in the figure.

[Figure: the points $P_\theta$ and $P_{-\theta}$, symmetric across the $x$-axis.]

Therefore

\[ x_{-\theta} = x_\theta, \qquad y_{-\theta} = -y_\theta. \]

This gives:

Theorem 2. For every $\theta$,

\[ \sin(-\theta) = -\sin\theta, \qquad \cos(-\theta) = \cos\theta. \]

Plotting the points $P_0$, $P_{\pi/2}$, and $P_\pi$, we get the following:

Theorem 3.

\[ \sin 0 = 0, \qquad \cos 0 = 1, \]
\[ \sin\frac{\pi}{2} = 1, \qquad \cos\frac{\pi}{2} = 0, \]
\[ \sin\pi = 0, \qquad \cos\pi = -1. \]

Theorem 4. For each $\theta$,

\[ \sin(\pi + \theta) = -\sin\theta, \qquad \cos(\pi + \theta) = -\cos\theta. \]

Proof. For each $\theta$, the points $P_\theta$ and $P_{\pi+\theta}$ are symmetric across the origin. This holds in all quadrants. Therefore

\[ x_{\pi+\theta} = -x_\theta, \qquad y_{\pi+\theta} = -y_\theta, \]

and the theorem follows, by definition of the sine and cosine.


In the kind of trigonometry that we are dealing with now, the relation between angles and numbers is a little tricky. If $\theta$ is known, then $P_\theta$ is determined, and so $\angle\theta$ is determined; $\angle\theta$ is $\angle P_0 O P_\theta$.

[Figure: on the left, the angle $\angle\theta$; on the right, the angle $\angle P_0 O Q$.]

But if the angle is known, the number $\theta$ is not determined. In the figure on the right, $\angle P_0 O Q$ is given, but for this angle we may have

\[ \theta = \tfrac{1}{4}\pi, \quad \text{or} \quad \theta = 2\pi + \tfrac{1}{4}\pi = \tfrac{9}{4}\pi, \quad \text{or} \quad \theta = \tfrac{1}{4}\pi - 2\pi = -\tfrac{7}{4}\pi. \]

In fact, for every integer $n$, positive, negative, or zero, we may have

\[ \theta = \tfrac{1}{4}\pi + 2n\pi. \]

If an angle $\angle AOB$ corresponds to a number $\theta$, under the rules that we have been giving, then we shall say that $\angle AOB$ has measure $\theta$, and we shall write, for short,

\[ \angle AOB = \angle\theta. \]

(We have seen that every angle $\angle AOB$ has infinitely many measures $\theta$. For this reason, it would be misleading to speak of "the measure of an angle.")

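So far, we have used the notation $\angle\theta$ only for angles "in standard position," that is, angles with the positive half of the $x$-axis as initial side. But it will be convenient to use the same shorthand for angles in general. Thus

\[ \angle P_0 O Q = \angle\tfrac{1}{4}\pi \quad \text{and} \quad \angle P_0 O S = \angle\tfrac{3}{4}\pi. \]

[Figure: the points $P_0$, $Q$, $S$, and $T$ on the unit circle, with new axes $x'$, $y'$.]

But if we set up new axes $x'$, $y'$, we can also say that

\[ \angle QOS = \angle\tfrac{1}{2}\pi \quad \text{and} \quad \angle QOT = \angle\pi. \]

Using other axes, not shown in the figure, we see that

\[ \angle SOP_0 = \angle\left(-\tfrac{3}{4}\pi\right), \]

and so on.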
PROBLEM SET 4.1

Derive the trigonometric identities given or suggested below.

The derivations should

be based on the definitions and theorems given in this section of the text.

1.
6.

sin

2.

7.

csc z

23. sin(7T - 6)

27. sec

( 7T - 6)

cos

16. cot

19. tan(7T + 8)

sin

8.

4.

tan x
cos

sin

12. cot2 8 + 1

11. 1 + tan2 8
1 5. tan( -8)

cos

3.

(-8)

9.
13.

20. cot(7T + 6)
24. cos(7T

8)

28. CSC(7T - 6)

cotx

sec

csc

secy
csc

sec

17. sec(-6)

18. csc (-8)

21. sec(7T + 6)

22. csc(7T + 8)

2 5. tan(7T - 6)

sin

801;:;; 10 - 001.

b) Show that the sine is a continuous function.

30. a) Show that for every 8, 80, we have


jcos fJ

--

14. sec2 8

sec x

10.

cscx

29. a) Show that for every 8, 6 0, we.have


!sin

5.

- cos fJ01 ;:;; lfJ - e01.

b) Show that the cosine is a continuous function.

26. cot(7T - 8)

='

4.2 THE LAW OF COSINES AND THE ADDITION FORMULAS

In the figure on the left below, we have $x_\theta = \cos\theta$ and $y_\theta = \sin\theta$, by definition of the sine and cosine.
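[Figure: on the left, the point $P_\theta$ on the unit circle; on the right, a point $P(x, y)$ on the ray $\overrightarrow{OP_\theta}$.]

More generally, we have:

Theorem 1. Let $P$ be any point of $\overrightarrow{OP_\theta}$, and let $OP = a$. Then the coordinates of $P$ are

\[ x = a\cos\theta, \qquad y = a\sin\theta. \]

Proof. By similar triangles,

\[ \frac{|x|}{a} = \frac{|x_\theta|}{1} \quad \text{and} \quad \frac{|y|}{a} = \frac{|y_\theta|}{1}. \]

Therefore

\[ |x| = a|x_\theta| \quad \text{and} \quad |y| = a|y_\theta|. \]

In these equations $x$ and $x_\theta$ also agree in sign, and similarly for $y$ and $y_\theta$. Therefore

\[ x = a x_\theta = a\cos\theta \quad \text{and} \quad y = a y_\theta = a\sin\theta, \]

which was to be proved.

From this we get immediately:

Theorem 2 (The law of cosines). If $\angle ACB = \angle\theta$, then

\[ c^2 = a^2 + b^2 - 2ab\cos\theta. \]

(The notation is that of the following figure.)

[Figure: the triangle $ABC$, with $C$ at the origin and $A$ on the positive $x$-axis.]

Proof. By the preceding theorem,

\[ B = (a\cos\theta, a\sin\theta). \]

And obviously

\[ A = (b, 0). \]

Therefore, by the distance formula,

\[
\begin{aligned}
c^2 &= (a\cos\theta - b)^2 + (a\sin\theta - 0)^2 \\
    &= a^2\cos^2\theta - 2ab\cos\theta + b^2 + a^2\sin^2\theta \\
    &= a^2(\cos^2\theta + \sin^2\theta) + b^2 - 2ab\cos\theta \\
    &= a^2 + b^2 - 2ab\cos\theta,
\end{aligned}
\]

which was to be proved.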
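The proof of the law of cosines translates directly into a numerical check: place $A$ and $B$ by coordinates, measure the distance, and compare with the formula. The following is a sketch with hypothetical function names and sample values chosen here.

```python
import math


def third_side(a, b, theta):
    """c computed from c^2 = a^2 + b^2 - 2ab cos(theta)."""
    return math.sqrt(a * a + b * b - 2.0 * a * b * math.cos(theta))


def third_side_by_coordinates(a, b, theta):
    """Distance from A = (b, 0) to B = (a cos(theta), a sin(theta))."""
    A = (b, 0.0)
    B = (a * math.cos(theta), a * math.sin(theta))
    return math.dist(A, B)


for a, b, theta in [(3.0, 4.0, math.pi / 2), (2.0, 5.0, 0.6), (1.0, 1.0, 2.5)]:
    print(third_side(a, b, theta), third_side_by_coordinates(a, b, theta))
```

For $\theta = \pi/2$ the formula reduces to the Pythagorean theorem, which is the first sample case.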
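Theorem 3. For every $\theta$ and $\phi$,

\[ \cos(\theta + \phi) = \cos\theta\cos\phi - \sin\theta\sin\phi. \]

Proof. Let $A = P_0$, $B = P_\theta$, and $C = P_{\theta+\phi}$.

[Figure: the points $A = P_0$, $B = P_\theta$, and $C = P_{\theta+\phi}$ on the unit circle.]

Then

\[ A = (1, 0), \qquad C = (\cos(\theta + \phi), \sin(\theta + \phi)), \]

and so by the distance formula

\[
\begin{aligned}
AC^2 &= [\cos(\theta + \phi) - 1]^2 + \sin^2(\theta + \phi) \\
     &= \cos^2(\theta + \phi) - 2\cos(\theta + \phi) + 1 + \sin^2(\theta + \phi) \\
     &= 2 - 2\cos(\theta + \phi).
\end{aligned}
\]

We now set up a new coordinate system, with $\overrightarrow{OP_\theta}$ as the positive $x'$-axis.

[Figure: the same points, with the new axes $x'$, $y'$.]

In the new coordinate system,

\[ A = P_{-\theta} = (\cos(-\theta), \sin(-\theta)) = (\cos\theta, -\sin\theta), \]
\[ C = P_\phi = (\cos\phi, \sin\phi). \]

Therefore, by the distance formula,

\[
\begin{aligned}
AC^2 &= (\cos\theta - \cos\phi)^2 + (-\sin\theta - \sin\phi)^2 \\
     &= \cos^2\theta - 2\cos\theta\cos\phi + \cos^2\phi + \sin^2\theta + 2\sin\theta\sin\phi + \sin^2\phi \\
     &= 2 - 2(\cos\theta\cos\phi - \sin\theta\sin\phi).
\end{aligned}
\]

But the distance $AC$ is independent of the coordinate system. Therefore

\[ 2 - 2\cos(\theta + \phi) = 2 - 2(\cos\theta\cos\phi - \sin\theta\sin\phi), \]

and

\[ \cos(\theta + \phi) = \cos\theta\cos\phi - \sin\theta\sin\phi, \]

which was to be proved.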
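The key step of the proof, that $AC^2$ is the same in both coordinate systems, can be checked numerically. This is only a sketch; the function names and the sample values of $\theta$ and $\phi$ are assumptions made here.

```python
import math


def ac_squared_original(theta, phi):
    """AC^2 with A = (1, 0) and C = (cos(theta+phi), sin(theta+phi))."""
    ax, ay = 1.0, 0.0
    cx, cy = math.cos(theta + phi), math.sin(theta + phi)
    return (cx - ax) ** 2 + (cy - ay) ** 2


def ac_squared_rotated(theta, phi):
    """AC^2 in the new axes: A = (cos theta, -sin theta), C = (cos phi, sin phi)."""
    ax, ay = math.cos(theta), -math.sin(theta)
    cx, cy = math.cos(phi), math.sin(phi)
    return (cx - ax) ** 2 + (cy - ay) ** 2


def cos_sum(theta, phi):
    """Right-hand side of the addition formula for the cosine."""
    return math.cos(theta) * math.cos(phi) - math.sin(theta) * math.sin(phi)


theta, phi = 0.4, 1.1
print(ac_squared_original(theta, phi), ac_squared_rotated(theta, phi))
print(math.cos(theta + phi), cos_sum(theta, phi))
```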


Once we have the addition formula for the cosine, it is easy to get similar formulas for the other trigonometric functions.

Theorem 4. For every $\theta$ and $\phi$,

\[ \cos(\theta - \phi) = \cos\theta\cos\phi + \sin\theta\sin\phi. \]

Proof. Using $-\phi$ for $\phi$ in the preceding theorem, we get

\[ \cos(\theta - \phi) = \cos\theta\cos(-\phi) - \sin\theta\sin(-\phi). \]

But we know that

\[ \cos(-\phi) = \cos\phi \quad \text{and} \quad \sin(-\phi) = -\sin\phi. \]

Using these, we get the desired formula for $\cos(\theta - \phi)$.

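Theorem 5. For every $\theta$,

\[ \cos\left(\frac{\pi}{2} - \theta\right) = \sin\theta \quad \text{and} \quad \sin\left(\frac{\pi}{2} - \theta\right) = \cos\theta. \]

Proof. By Theorem 4,

\[ \cos\left(\frac{\pi}{2} - \theta\right) = \cos\frac{\pi}{2}\cos\theta + \sin\frac{\pi}{2}\sin\theta = 0\cdot\cos\theta + 1\cdot\sin\theta, \]

by Theorem 3 of Section 4.1. Therefore

\[ \cos\left(\frac{\pi}{2} - \theta\right) = \sin\theta. \]

Using $\pi/2 - \theta$ for $\theta$, we get

\[ \cos\left[\frac{\pi}{2} - \left(\frac{\pi}{2} - \theta\right)\right] = \sin\left(\frac{\pi}{2} - \theta\right). \]

Therefore

\[ \sin\left(\frac{\pi}{2} - \theta\right) = \cos\theta. \]

(The name of the cosine is a reference to this theorem; the word cosine is from the Latin complementi sinus, meaning sine of the complement.)

Theorem 6. For every $\theta$ and $\phi$,

\[ \sin(\theta + \phi) = \sin\theta\cos\phi + \cos\theta\sin\phi. \]

Proof.

\[
\begin{aligned}
\sin(\theta + \phi) &= \cos\left[\frac{\pi}{2} - (\theta + \phi)\right] \\
 &= \cos\left[\left(\frac{\pi}{2} - \theta\right) - \phi\right] \\
 &= \cos\left(\frac{\pi}{2} - \theta\right)\cos\phi + \sin\left(\frac{\pi}{2} - \theta\right)\sin\phi \\
 &= \sin\theta\cos\phi + \cos\theta\sin\phi.
\end{aligned}
\]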


PROBLEM SET 4.2
1. tan (A + B)

3. cot (8 + </>)
5. sin 28
7.

cos 2li

tan A +tan B
=

l - tan A tan B

cote cot</> - 1
=

cote + cot</>

2 sine cose
1 - 2 sin2 IJ

cosine

sine of the complement.)

2. tan (A - B)

4. cot (A - B)

6.

cos 28

8. cot (Ii

2 cos2 e
</>)

is from


31T
9. a) sin-=
2
31T
11. a) tan-=
2
12.
15.
18.

o
(}
2 sin - cos- =
2
2

+ cos(}
2

(}
tan 2

31T
10. a) cos2 =

b) sin (3 7T + o)
2

b) tan (3 7T + e
2

13.
16.

sin(}

+ cos(}

(} 1 - cos(}
19. tan - =
sin(}
2

2 cos2

0
2

=
14.

- 1

- cos 20
=
2

17.

l
J

b) cos (3 7T + o
2

139

+ cos 20 =
2

[Hint: Let <P = (}/2, so that 8


in terms of <fi. Then prove it.]

2</i, and rewrite the formula

*20. Show geometrically (without using any of the theory developed in this section) that the formula in Problem 18 holds whenever $\theta$ is between $0$ and $\pi$. Discuss the problem of extending the formula from this special case to the general case.

21. Show that there is no formula which expresses $\sin(\theta/2)$ in terms of $\sin\theta$. That is, show that $\sin(\theta/2)$ is not determined if only $\sin\theta$ is known.

22. Find a formula which expresses $|\sin(\theta/2)|$ in terms of $\cos\theta$.

23. Show that there is no formula which expresses $\sin\theta$ in terms of $\tan\theta$. That is, show that $\tan\theta$ does not determine $\sin\theta$.

24. Show that there is no formula which expresses $\sin(\theta/2)$ in terms of $\sin\theta$ and $\cos\theta$.

25. Show that if $P_\theta$ is known, then $P_{3\theta}$ is determined. [Hint: If $P_\theta = P_\phi$, what is the relation between $\theta$ and $\phi$? In this case, what is the relation between $3\theta$ and $3\phi$? Between $P_{3\theta}$ and $P_{3\phi}$?]

26. It is a consequence of Problem 25 that, if $\sin\theta$ and $\cos\theta$ are known, then $P_{3\theta}$ is determined, and therefore $\sin 3\theta$ is determined. How? That is, find a formula which expresses $\sin 3\theta$ in terms of $\sin\theta$ and $\cos\theta$.

27. Can $\cos 3\theta$ be expressed in terms of $\cos\theta$? If so, derive such a formula. If not, explain how you know that no such formula exists.

4.3 THE DERIVATIVES OF THE TRIGONOMETRIC FUNCTIONS; THE DIFFERENCES $\Delta x$ AND $\Delta f$; THE SQUEEZE PRINCIPLE

If we try, in a straightforward way, to find the derivative of the sine, we get into trouble. By definition,

\[ f'(x_0) = \lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0}, \]

if the indicated limit exists. For $f(x) = \sin x$, this definition says that

\[ \sin' x_0 = \lim_{x \to x_0} \frac{\sin x - \sin x_0}{x - x_0} \]

if the indicated limit exists. In fact, the limit does exist. But it is not obvious what we ought to do to this expression

\[ \frac{\sin x - \sin x_0}{x - x_0} \]

in order to find its limit. For functions $f$ which were defined algebraically, we found ways to cancel out $x - x_0$ in fractions of the form

\[ \frac{f(x) - f(x_0)}{x - x_0} \]

using various algebraic tricks. Evidently some new device is needed for the sine.

It is as follows. Let

\[ \Delta x = x - x_0. \]

The symbol $\Delta x$ is all one symbol. It is pronounced "delta $x$," and the Greek delta is supposed to remind us that $\Delta x$ is the difference in $x$. Obviously, $x = x_0 + \Delta x$. Similarly, let

\[ \Delta f = f(x) - f(x_0). \]

Here $\Delta f$ is the difference in $f$, as we pass from $x_0$ to $x$. Geometrically, the use of the differences $\Delta x$ and $\Delta f$ is indicated by a new set of axes, with the new origin at the point $(x_0, f(x_0))$.

[Figure: the graph of $f$, with the differences $\Delta x$ and $\Delta f = f(x) - f(x_0)$ marked between $x_0$ and $x = x_0 + \Delta x$.]

Geometrically, it is easy to see that the expressions

\[ \lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0} = f'(x_0) \]

and

\[ \lim_{\Delta x \to 0} \frac{\Delta f}{\Delta x} = \lim_{\Delta x \to 0} \frac{f(x_0 + \Delta x) - f(x_0)}{\Delta x} = f'(x_0) \]

are merely two different ways of describing the same limit $f'(x_0)$.

The point of this procedure, in finding the derivative of the sine, is that it enables us to apply the addition formula for the sine. For $f(x) = \sin x$, we have

\[ f(x_0 + \Delta x) = \sin(x_0 + \Delta x) = \sin x_0\cos\Delta x + \cos x_0\sin\Delta x. \]

Thus we have

\[
\begin{aligned}
f'(x_0) = \sin' x_0 &= \lim_{\Delta x \to 0} \frac{\sin(x_0 + \Delta x) - \sin x_0}{\Delta x} \\
 &= \lim_{\Delta x \to 0} \frac{\sin x_0\cos\Delta x + \cos x_0\sin\Delta x - \sin x_0}{\Delta x} \\
 &= \lim_{\Delta x \to 0} \left[ \sin x_0\,\frac{\cos\Delta x - 1}{\Delta x} + \cos x_0\,\frac{\sin\Delta x}{\Delta x} \right].
\end{aligned}
\]

We are going to show that

\[ \lim_{\Delta x \to 0} \frac{\cos\Delta x - 1}{\Delta x} = 0 \tag{1} \]

and

\[ \lim_{\Delta x \to 0} \frac{\sin\Delta x}{\Delta x} = 1. \tag{2} \]

It will then follow that

\[ \sin' x_0 = \cos x_0, \]

and

\[ D\sin x = \cos x. \]

The unknown limits (1) and (2) have curious forms. Since $\cos 0 = 1$, the first limit has the form

\[ \lim_{\Delta x \to 0} \frac{\cos(0 + \Delta x) - \cos 0}{\Delta x} = \cos' 0. \tag{3} \]

And since $\sin 0 = 0$, the second limit is

\[ \lim_{\Delta x \to 0} \frac{\sin(0 + \Delta x) - \sin 0}{\Delta x} = \sin' 0. \tag{4} \]

Thus we have found that if $\cos' 0 = 0$ and $\sin' 0 = 1$, then $\sin' x = \cos x$ for every $x$.

To simplify the notation, in the theorems that follow, we use $\theta$ in place of $\Delta x$, and state the theorems that we need in the following way:

Theorem 1.

\[ \lim_{\theta \to 0} \frac{\sin\theta}{\theta} = 1. \]

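Theorem 2.

\[ \lim_{\theta \to 0} \frac{\cos\theta - 1}{\theta} = 0. \]

Theorem 1 is the hard part; given that Theorem 1 holds, Theorem 2 follows from it. To see this, we first observe that

\[ \lim_{\theta \to 0} \frac{\cos\theta - 1}{\theta} = \lim_{\theta \to 0} \left[ \frac{\cos\theta - 1}{\theta} \cdot \frac{\cos\theta + 1}{\cos\theta + 1} \right]. \]

Simplifying on the right, we express this as

\[ \lim_{\theta \to 0} \frac{\cos^2\theta - 1}{\theta(\cos\theta + 1)} = \lim_{\theta \to 0} \frac{-\sin^2\theta}{\theta(\cos\theta + 1)}. \]

The last formula can be factored into three parts, giving

\[ \lim_{\theta \to 0} \frac{\cos\theta - 1}{\theta} = -\left[ \lim_{\theta \to 0} \frac{\sin\theta}{\theta} \right] \left[ \lim_{\theta \to 0} \sin\theta \right] \left[ \lim_{\theta \to 0} \frac{1}{\cos\theta + 1} \right]. \]

Given that Theorem 1 holds, this gives

\[ \lim_{\theta \to 0} \frac{\cos\theta - 1}{\theta} = -1 \cdot 0 \cdot \tfrac{1}{2} = 0. \]

(Query: How do we know that $\lim_{\theta \to 0} \sin\theta = 0$, and that $\lim_{\theta \to 0} \cos\theta = 1$?)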
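A table of values makes both limits plausible, and also confirms the factorization used to pass from Theorem 1 to Theorem 2. The sketch below is numerical evidence only, not a proof; the function names are chosen here for illustration.

```python
import math


def sin_ratio(theta):
    """(sin theta) / theta, for theta != 0."""
    return math.sin(theta) / theta


def cos_ratio(theta):
    """(cos theta - 1) / theta, for theta != 0."""
    return (math.cos(theta) - 1.0) / theta


def factored(theta):
    """-(sin theta / theta) * (sin theta) * (1 / (cos theta + 1))."""
    return -sin_ratio(theta) * math.sin(theta) * (1.0 / (math.cos(theta) + 1.0))


for theta in (0.5, 0.1, 0.01, 0.001):
    # First column tends to 1; second and third agree and tend to 0.
    print(theta, sin_ratio(theta), cos_ratio(theta), factored(theta))
```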
It remains to prove Theorem 1. First we observe that only positive values of $\theta$ need to be considered, because when we replace $\theta$ by $-\theta$, the value of the fraction $(\sin\theta)/\theta$ is unchanged. Thus if $(\sin\theta)/\theta \to 1$ as $\theta$ approaches $0$ through positive values, it follows that $(\sin\theta)/\theta \to 1$ as $\theta \to 0$ through negative values.

We shall show that for $0 < \theta < \pi/2$ we have

\[ \sin\theta \le \theta \le \tan\theta. \]

[Figure: the unit circle, with the arc from $Q$ to $P$ and the segments $RP$ and $QS$.]

In the figure, $\theta$ is the length of the arc from $Q$ to $P$. Since

\[ RP = \sin\theta \quad \text{and} \quad QS = \tan\theta, \]

the inequalities that we want take the form

\[ RP \le \theta \le QS. \]

To prove this, we have to go back to the definition of arc length.

The figure below shows a broken line inscribed in the arc from $Q$ to $P$, with segments of equal length

\[ a_1 = a_2 = \cdots = a_n. \]

(In the figure, $n = 3$.) Thus the length of the broken line is

\[ A_n = a_1 + a_2 + \cdots + a_n. \]

We extend the radii of the circle until they intersect the vertical line through $Q$; and for each segment of our broken line we let $b_i$ be the length of the corresponding segment on the vertical line through $Q$.

[Figure: the inscribed broken line, with the radii extended to the vertical line through $Q$.]

It is a matter of elementary geometry to check that

\[ RP < A_n \]

and that

\[ a_i < b_i \]

for each $i$. Therefore

\[ RP < A_n < QS, \]

and so

\[ \sin\theta < A_n < \tan\theta. \]

As $n \to \infty$, $A_n \to \theta$. In fact, this is the definition of the length of a circular arc. Therefore

\[ \sin\theta \le \theta \le \tan\theta. \]
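The inscribed broken lines can be computed explicitly. The sketch below rests on one fact that is not derived in the text and is assumed here: on the unit circle, a chord subtending an arc of length $s$ has length $2\sin(s/2)$. With $n$ equal chords in an arc of length $\theta$, the broken line then has total length $2n\sin(\theta/2n)$. The function name `inscribed_length` is chosen for illustration.

```python
import math


def inscribed_length(theta, n):
    """Total length A_n of n equal chords inscribed in an arc of length theta.

    Assumes the chord-length fact: a chord subtending an arc of length s
    on the unit circle has length 2*sin(s/2).
    """
    return 2.0 * n * math.sin(theta / (2.0 * n))


theta = 1.0  # any value with 0 < theta < pi/2
for n in (1, 3, 10, 100, 1000):
    A_n = inscribed_length(theta, n)
    # sin(theta) < A_n < tan(theta) for each n, and A_n increases toward theta.
    print(n, math.sin(theta), A_n, math.tan(theta))
```

Each $A_n$ lies strictly between $\sin\theta$ and $\tan\theta$, while the limit $\theta$ is only guaranteed to satisfy the weak inequalities.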

(When we pass to a limit, a "weak inequality" $A_n \le b$ or $A_n \ge b$ is always preserved, but a "strong inequality" $A_n < b$ or $A_n > b$ is not necessarily preserved. For example,

\[ \frac{n + 1}{n} > 1 \quad \text{for every } n, \]

but we cannot conclude that

\[ (?) \qquad \lim_{n \to \infty} \frac{n + 1}{n} > 1. \qquad (?) \]

In fact, the limit is $1$, which is $\ge 1$, but not $> 1$. Hence the overcautious weak inequalities that we have written above. The strong inequalities $\sin\theta < \theta < \tan\theta$ always hold for $0 < \theta < \pi/2$, but we are not stopping to prove it.) Therefore

\[ 1 \le \frac{\theta}{\sin\theta} \le \frac{1}{\cos\theta}. \]

As $\theta \to 0$, $\cos\theta \to \cos 0 = 1$. (You proved this in Problem 30(b) of Section 4.1.) Therefore $1/\cos\theta \to 1$, because the limit of the reciprocal is the reciprocal of the limit. Thus the picture must look something like the figure below.

[Figure: the graphs of $y = 1/\cos\theta$ and $y = \theta/\sin\theta$ near $\theta = 0$.]

That is, the graph of $y = \theta/\sin\theta$ is "squeezed into 1," and

\[ \lim_{\theta \to 0} \frac{\theta}{\sin\theta} = 1. \]

(This is an instance of a general "squeeze principle," to be discussed further at the end of this section.) Therefore

\[ \lim_{\theta \to 0} \frac{\sin\theta}{\theta} = 1, \]

because the limit of the reciprocal is the reciprocal of the limit. As we have seen, this means that:

Theorem 3. $D\sin x = \cos x$.

Once we know how to find $D\sin x$, the derivatives of the other trigonometric functions are easy.

\[
\begin{aligned}
\cos' x_0 &= \lim_{\Delta x \to 0} \frac{\cos(x_0 + \Delta x) - \cos x_0}{\Delta x} \\
 &= \lim_{\Delta x \to 0} \frac{\cos x_0\cos\Delta x - \sin x_0\sin\Delta x - \cos x_0}{\Delta x} \\
 &= (\cos x_0) \lim_{\Delta x \to 0} \frac{\cos\Delta x - 1}{\Delta x} - (\sin x_0) \lim_{\Delta x \to 0} \frac{\sin\Delta x}{\Delta x} \\
 &= (\cos x_0)\cdot 0 - (\sin x_0)\cdot 1 = -\sin x_0.
\end{aligned}
\]

Thus:

Theorem 4. $D\cos x = -\sin x$.

By simpler methods, we get

\[ D\tan x = \sec^2 x, \qquad D\cot x = -\csc^2 x, \]
\[ D\sec x = \sec x\tan x, \qquad D\csc x = -\csc x\cot x. \]

You will be asked to derive these, in the problem set below.
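Before deriving these four formulas, it may be reassuring to test them numerically. The sketch below compares a central difference quotient with each claimed derivative at sample points; it does not replace the derivations asked for in the problem set. The helper names (`derivative`, `FORMULAS`, `max_error`) are chosen here for illustration.

```python
import math


def derivative(f, x, h=1e-6):
    """Central-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def sec(x):
    return 1.0 / math.cos(x)


def csc(x):
    return 1.0 / math.sin(x)


def cot(x):
    return math.cos(x) / math.sin(x)


# Each entry pairs a function with the derivative claimed in the text.
FORMULAS = {
    "tan": (math.tan, lambda x: sec(x) ** 2),
    "cot": (cot, lambda x: -(csc(x) ** 2)),
    "sec": (sec, lambda x: sec(x) * math.tan(x)),
    "csc": (csc, lambda x: -csc(x) * cot(x)),
}


def max_error(x):
    """Largest gap between the difference quotient and the claimed formula."""
    return max(abs(derivative(f, x) - claimed(x)) for f, claimed in FORMULAS.values())


for x in (0.7, 2.0, -1.0):
    print(x, max_error(x))  # small at each sample point
```

The sample points avoid multiples of $\pi/2$, where some of the functions are undefined.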


In finding the limit of $\theta/\sin\theta$, we used the following idea:

Theorem 5 (The squeeze principle). Let $f$ and $g$ be functions defined at every point of the interval $I$, except perhaps at the point $x_0$. If

\[ \lim_{x \to x_0} f(x) = L, \]

and for each $x$, $g(x)$ is between $f(x)$ and $L$, then

\[ \lim_{x \to x_0} g(x) = L. \]

[Figure: two illustrations of the theorem, with the graph of $g$ lying between the graph of $f$ and the line $y = L$.]

Two illustrations of the theorem are shown above. The theorem is geometrically clear, and is also easy to prove. The point is that since $g(x)$ is between $f(x)$ and $L$, any box for $f$ at $(x_0, L)$ is automatically a box for $g$ at $(x_0, L)$. Since $f$ has an $\epsilon\delta$-box at $(x_0, L)$ for every $\epsilon > 0$, it follows that $g$ does also. Therefore

\[ \lim_{x \to x_0} g(x) = L, \]

by definition of a limit.

[Figure: the graphs of $f$ and $g$ inside a box at $(x_0, L)$.]

The same idea also works when two functions approach the same limit, and a third function lies between them. If

\[ g(x) \le h(x) \le f(x), \]

and

\[ \lim_{x \to x_0} f(x) = \lim_{x \to x_0} g(x) = L, \]

then it follows that

\[ \lim_{x \to x_0} h(x) = L. \]

Similarly for the following situation:

[Figure: a further illustration of the squeeze principle.]

All of these ideas are very closely related, and we shall refer to all of them as the squeeze principle.
PROBLEM SET 4.3

Derive formulas for the following:


2. D cot x

1. D tan x
5. DVl - sin2 x
6.

9.

:2

4. D csc x

[Warning: It is very easy to get a wrong answer to this one.]


[Same warning.]

DVl - cos2 x

7. D cos2 x

3. D sec x

ce>J"

D 2 sin x cos x

8. D(cos2 8 + sin2 8)
10. Dv'l + tan2 8
sin x

11. D(csc2 8 - cot2 8)

12. D

cos x
13. D l
.
+ smx

14. D(x2 sin x)

1 + cos x

Show that the following differentiation formulas are correct:


15. D sin 2x =(cos 2x)2

1 6. D cos 2x =(-sin 2x)2

17. D tan 2x =2 sec2 2x

18. D sin ( -x) = [cos ( -x)](-1)

19. D cos (-x) = [-sin ( -x)]( -1)

20. D cot 2x = -2 csc 2x cot 2x

21. $D \tan(-x) = [\sec^2(-x)](-1)$

22. $D \sin 3x = (\cos 3x)\,3$

23. $D \cos 3x = (-\sin 3x)\,3$

24. $D \tan 3x = 3 \sec^2 3x$

*25. Make a plausible guess for $D_x \sin ax$, and verify it if you can.

*26. Same, for $D_x \sin \sqrt{x}$.

*27. If $f(x) = \sin x$ and $g(x) = \cos x$, then

(a) $f' = g$, (b) $g' = -f$, (c) $f(0) = 0$, and (d) $g(0) = 1$.

Is it possible that there is another pair of functions satisfying the same four conditions? [Hint: Suppose that the pairs $f_1, g_1$ and $f_2, g_2$ satisfy (a) through (d). Consider the function
$$F = (f_1 - f_2)^2 + (g_1 - g_2)^2.$$
What sort of function is $F$? From what you learn about $F$, what can you conclude about $f_1$, $f_2$, $g_1$, and $g_2$?]


The answer to this problem has a rather curious significance: it means that all
properties of the sine and cosine are contained, implicitly, in conditions (a) through (d).
That is, the sine and cosine are completely described by the conditions

$$\sin' = \cos, \qquad \cos' = -\sin, \qquad \sin 0 = 0, \qquad \cos 0 = 1.$$

4.4 THE APPROXIMATION OF DIFFERENCES BY DIFFERENTIALS

We recall, from the preceding section, the apparatus which we set up in order to calculate the derivative of the sine. We are given a function
$$f: I \to \mathbf{R},$$
where $I$ is an interval, and a fixed point $x_0$ of $I$. For each point $x$ of $I$, we let
$$\Delta x = x - x_0,$$
so that
$$x = x_0 + \Delta x.$$

[Figure: graph of $f$ with $\Delta x$ and $\Delta f$ marked.]

We let
$$\Delta f = f(x) - f(x_0) = f(x_0 + \Delta x) - f(x_0).$$
In the old notation,
$$f'(x_0) = \lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0},$$
by definition. In the new notation, this takes the form
$$f'(x_0) = \lim_{\Delta x \to 0} \frac{f(x_0 + \Delta x) - f(x_0)}{\Delta x} = \lim_{\Delta x \to 0} \frac{\Delta f}{\Delta x}.$$

When $\Delta x$ is small, $\Delta f/\Delta x$ is close to $f'(x_0)$. Thus
$$\frac{\Delta f}{\Delta x} \approx f'(x_0) \quad \text{when } \Delta x \approx 0,$$
where $\approx$ stands for the phrase "is approximately equal to." This ought to mean that
$$\Delta f \approx f'(x_0)\,\Delta x \quad \text{when } \Delta x \approx 0.$$

Let us interpret this last statement geometrically.

[Figure: graph of $f$ with its tangent line $T$ at $(x_0, f(x_0))$; the quantities $\Delta x$, $\Delta f$, and $df$ are labeled.]

In the figure, the line $T$ is the tangent to the graph of $f$ at the point $(x_0, f(x_0))$. Thus the slope of $T$ is $f'(x_0)$. If $(x, y)$ is any other point of $T$, then
$$\frac{y - f(x_0)}{x - x_0} = f'(x_0),$$
because the slope of the segment from $P$ to $S$ is the slope of the line $T$. This gives
$$y - f(x_0) = f'(x_0)\,\Delta x.$$
This quantity is called the differential of $f$ at $x_0$, and is denoted by $df$. (See the label in the figure.) To repeat:
$$df = f'(x_0)\,\Delta x,$$
by definition. Since $x_0$ is regarded as fixed, throughout this discussion, $df$ is a function, whose value is determined when $\Delta x$ is named. The differential is often convenient for purposes of numerical approximation. We have observed that
$$\Delta f \approx f'(x_0)\,\Delta x \quad \text{when } \Delta x \approx 0.$$
In our new notation, this says that
$$\Delta f \approx df \quad \text{when } \Delta x \approx 0.$$

Let us try this on some numerical examples, and see how good the approximation
looks.

Example 1. Let
$$f(x) = \sqrt{x} \quad (x \ge 0);$$
and take
$$x_0 = 25, \qquad \Delta x = 0.4.$$

[Figure: graph of $y = \sqrt{x}$ for $0 \le x \le 26$, with $x_0 = 25$ marked.]

Then
$$f(x_0) = \sqrt{25} = 5, \qquad f'(x) = \frac{1}{2\sqrt{x}} \quad (x > 0),$$
and
$$df = \tfrac{1}{10}\,\Delta x = \tfrac{1}{10}(0.4) = 0.04.$$
The approximation formula
$$\Delta f \approx df$$
suggests that
$$\sqrt{25.4} = f(x_0 + \Delta x) = f(x_0) + \Delta f \approx f(x_0) + df = \sqrt{25} + 0.04;$$
$$\sqrt{25.4} \approx 5.04.$$
The actual value of $\sqrt{25.4}$, correct to six decimal places, is
$$\sqrt{25.4} = 5.039841.$$
Thus the error in our approximation is 0.000159, which is not bad. Note also that the approximation $\Delta f \approx df$ wasn't supposed to be good except when $\Delta x$ is small; and $\Delta x = 0.4$ is not very small. Using $\Delta x = 0.1$, we get
$$df = \frac{1}{2\sqrt{25}}(0.1) = 0.01; \qquad \sqrt{25.1} \approx 5 + df = 5.01.$$
The correct value is
$$\sqrt{25.1} = 5.00999,$$
so that our error is 0.00001, which looks better. Using $\Delta x = 0.01$, we get
$$df = \tfrac{1}{10}(0.01) = 0.001; \qquad \sqrt{25.01} \approx 5.001.$$
Using five-place common logarithms, we get
$$\sqrt{25.01} \approx 5.0010.$$
Thus, in this case, the differential is as accurate as five-place tables.
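
The arithmetic of Example 1 can be repeated mechanically. The following is a minimal sketch, assuming Python's `math.sqrt` as the "actual value"; the helper names `differential_estimate` and `sqrt_prime` are made up for this illustration.

```python
import math

def differential_estimate(f, fprime, x0, dx):
    """Approximate f(x0 + dx) by f(x0) + df, where df = f'(x0) * dx."""
    return f(x0) + fprime(x0) * dx

def sqrt_prime(x):
    """Derivative of the square root, valid for x > 0."""
    return 1.0 / (2.0 * math.sqrt(x))

results = []
for dx in (0.4, 0.1, 0.01):
    approx = differential_estimate(math.sqrt, sqrt_prime, 25.0, dx)
    exact = math.sqrt(25.0 + dx)
    results.append((dx, approx, exact, approx - exact))
    print(f"dx={dx:<5} approx={approx:.6f} exact={exact:.6f} error={approx - exact:.6f}")
```

Each row compares $f(x_0) + df$ with the computed square root, so the shrinking error can be read off directly.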


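It is natural to ask why the approximation
$$\Delta f \approx df = f'(x_0)\,\Delta x$$
should be as good as it is. The reason is as follows. We know that
$$\lim_{\Delta x \to 0} \frac{\Delta f}{\Delta x} = f'(x_0).$$
On this basis, we wrote
$$\frac{\Delta f}{\Delta x} \approx f'(x_0) \quad \text{when } \Delta x \approx 0. \qquad (1)$$
Multiplying by $\Delta x$, we got
$$\Delta f \approx f'(x_0)\,\Delta x \quad \text{when } \Delta x \approx 0. \qquad (2)$$
The second of these approximations is much better than the first. The point is that if
$$\Delta x \approx 0 \quad \text{and} \quad \frac{\Delta f}{\Delta x} - f'(x_0) \approx 0,$$
then the product
$$\left[\frac{\Delta f}{\Delta x} - f'(x_0)\right]\Delta x \approx 0;$$
when you multiply two numbers each of which is small, the product is even smaller. We shall now express these ideas in a more exact form.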

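For each $\Delta x$, let
$$E(\Delta x) = \frac{f(x_0 + \Delta x) - f(x_0)}{\Delta x} - f'(x_0) \qquad (\Delta x \ne 0).$$
Then
$$\lim_{\Delta x \to 0} E(\Delta x) = 0,$$
because
$$\lim_{\Delta x \to 0} \frac{f(x_0 + \Delta x) - f(x_0)}{\Delta x} = f'(x_0).$$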

Thus the graph of the function $E$ looks like the figure on the left below.

[Figure: left, the graph of $y = E(\Delta x)$ $(\Delta x \ne 0)$; right, the same graph with the origin added.]

To this graph we add the origin. That is, we define
$$E(0) = 0.$$
The graph of the extended function $E$ is shown on the right above. We now have
$$\lim_{\Delta x \to 0} E(\Delta x) = E(0) = 0.$$
Note that $E$ is defined on some open interval containing 0.

[Figure: the interval $(x_0 - a, x_0 + a)$ marked on the $x$-axis.]

The reason is that if the open interval $(x_0 - a, x_0 + a)$ lies in the domain of $f$, then the open interval $(-a, a)$ lies in the domain of $E$. An open interval containing a given point will be called a neighborhood of the given point. In this language, we can sum up the above discussion in the following theorem.


Theorem 1. Let $f$ be a function defined in a neighborhood of $x_0$, and suppose that $f$ is differentiable at $x_0$. Then there is a function $E$, defined in a neighborhood of 0, such that

i) $\Delta f = f'(x_0)\,\Delta x + E(\Delta x)\,\Delta x$, and

ii) $\lim_{\Delta x \to 0} E(\Delta x) = E(0) = 0$.

(Proof. Use the function $E$ which we have just defined.)

This theorem explains why
$$df = f'(x_0)\,\Delta x$$
is a good approximation of
$$\Delta f = f(x_0 + \Delta x) - f(x_0)$$
when $\Delta x \approx 0$. The reason is that
$$\Delta f - df = \Delta f - f'(x_0)\,\Delta x = E(\Delta x)\,\Delta x.$$
Thus when $\Delta x \approx 0$, the error in the approximation $\Delta f \approx df$ is doubly small, being a small multiple of the small number $\Delta x$.

In many cases, it is possible to estimate the largest possible error that can result when you use $df$ as an approximation for $\Delta f$. For a discussion of this problem, see Appendix D.
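
The phrase "doubly small" can be made concrete with a short computation. This is only a sketch for the particular case $f(x) = \sqrt{x}$, $x_0 = 25$; the helper name `error_factor` is hypothetical. It tabulates $E(\Delta x)$ and the error $\Delta f - df = E(\Delta x)\,\Delta x$ as $\Delta x$ is divided by 10.

```python
import math

def error_factor(f, fprime, x0, dx):
    """E(dx) = (f(x0 + dx) - f(x0)) / dx - f'(x0), defined for dx != 0."""
    return (f(x0 + dx) - f(x0)) / dx - fprime(x0)

x0 = 25.0
steps = [0.4, 0.04, 0.004]
factors = [error_factor(math.sqrt, lambda x: 0.5 / math.sqrt(x), x0, dx) for dx in steps]
errors = [E * dx for E, dx in zip(factors, steps)]   # Delta f - df = E(dx) * dx

for dx, E, err in zip(steps, factors, errors):
    print(f"dx={dx:<6} E(dx)={E:.3e} error={err:.3e}")
```

In this case each tenfold reduction of $\Delta x$ shrinks $E(\Delta x)$ about tenfold, so the error itself shrinks about a hundredfold.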

PROBLEM SET 4.4

Following is a partial table of the sine and cosine functions, for ready reference in solving some of the following problems:

x         sin x     cos x
0.0814    0.0814    0.9967
0.1222    0.1219    0.9925
0.2094    0.2079    0.9781
0.3840    0.3746    0.9272

1. Find sin 0.1251 approximately (sin 0.1251 = 0.1248, correct to four decimal places).

2. Find cos 0.0844 approximately (cos 0.0844 = 0.9964).

3. Find sin 0.2123 approximately (sin 0.2123 = 0.2108).

4. Find cos 0.3869 approximately (cos 0.3869 = 0.9261).

5. How do you account for the first two entries in the first line of the above table?
6. Without the use of tables of any kind, get the best approximation you can for
sin 0.5235988. (This is a trick question.)

7. Same question, for cos ( -6.2832).


Without using tables of any kind, get numerical approximations for the following. The answers given are the "right answers," correct to the indicated number of decimal places; it is not to be expected that an approximation process based on the differential will give them exactly.

8. $\sqrt[3]{27.1}$ [Answer: 3.004]

9. $\sqrt{25.2}$ [Answer: 5.0200]

10. $\sqrt[4]{16.3}$

11. $\sqrt[3]{-7.9}$

12. One of the standard approximation formulas used in mathematical physics says that
$$\sin x \approx x \quad \text{when } x \approx 0.$$
Explain how this formula is related to the ideas in this section of the text. [Hint: Consider the general approximation formula
$$\Delta f \approx df \quad \text{when } \Delta x \approx 0.$$
What form does this take, for $f(x) = \sin x$, $x_0 = 0$?]

13. Same question, for
$$\cos(1 - x) \approx 1 \quad \text{when } x \approx 1.$$

14. Another standard approximation formula says that
$$(1 + x)^n \approx 1 + nx \quad \text{when } x \approx 0.$$
Interpret this in terms of the theory that we have been developing, and justify it. [Hint: Surely the given formula is equivalent to
$$(1 + \Delta x)^n - 1 \approx n\,\Delta x \quad \text{when } \Delta x \approx 0.$$
Here, what is $f$? What is $x_0$? What are $\Delta f$ and $df$?]

15. Same question, for the formula
$$\sqrt{1 + x} \approx 1 + \frac{x}{2} \quad \text{when } x \approx 0.$$

16. Same question, for the formula
$$\frac{1}{\sqrt[3]{1 + x}} \approx 1 - \frac{x}{3} \quad \text{when } x \approx 0.$$

17. Without using calculus at all, justify the approximation formula
$$\frac{1}{1 + x} \approx 1 - x \quad \text{when } x \approx 0.$$
Is this a "doubly good" approximation in the same sense in which $\Delta f \approx df$ is "doubly good"? Why or why not?

4.5 COMPOSITION OF FUNCTIONS

In calculating derivatives, we have often found it convenient to regard one function as a power of another. For example, given
$$\phi(x) = (x^2 + 3x + 5)^5,$$
we let
$$g(x) = x^2 + 3x + 5,$$
so that
$$\phi = g^5.$$
We can then get $\phi'$ in the form
$$\phi' = 5g^4 g', \qquad \phi'(x) = 5(x^2 + 3x + 5)^4(2x + 3).$$

Similarly, we have found it convenient to regard one function as the positive square root of another. For example, given
$$\phi(x) = \sqrt{x^2 + 1},$$
we let
$$g(x) = x^2 + 1.$$
We then get $\phi'$ in the form
$$\phi'(x) = \frac{1}{2\sqrt{x^2 + 1}} \cdot 2x = \frac{x}{\sqrt{x^2 + 1}}.$$

The idea that we have been using is that of composition of functions. In the first case, the action of $\phi$ is described by
$$\phi: x \mapsto (x^2 + 3x + 5)^5.$$
We split this operation into two steps, like this:
$$x \mapsto x^2 + 3x + 5 \mapsto (x^2 + 3x + 5)^5.$$
The first of these steps represents the action of the function
$$g: x \mapsto x^2 + 3x + 5.$$
The second step raises things to the fifth power. It can thus be described by the function
$$f: u \mapsto u^5.$$
In this situation, $g$ is called the inside function; it represents the first step. The function $f$ is called the outside function; it represents the second step. And $\phi$ is called the composition of $f$ and $g$. The reason for the use of the terms inside and outside is that we can write
$$\phi(x) = f(g(x)).$$
To get $\phi(x)$, we should substitute $g(x)$ in the formula for $f$. Diagrammatically:
$$x \overset{g}{\mapsto} x^2 + 3x + 5 \overset{f}{\mapsto} (x^2 + 3x + 5)^5.$$

Our second example fits the same pattern. We have
$$\phi(x) = \sqrt{x^2 + 1}, \qquad g(x) = x^2 + 1, \qquad f(u) = \sqrt{u},$$
$$\phi(x) = f(g(x)) = \sqrt{g(x)} = \sqrt{x^2 + 1}.$$
Diagrammatically:
$$x \overset{g}{\mapsto} x^2 + 1 \overset{f}{\mapsto} \sqrt{x^2 + 1}.$$

Algebraically, to get the values of the composite function $\phi = f(g)$, we substitute $g(x)$ for $u$ in the formula for $f(u)$. This is why we described the "square-root function" $f$ by the formula
$$f(u) = \sqrt{u}$$
instead of the equally logical formula $f(x) = \sqrt{x}$. We want to form the composite function by setting
$$u = g(x) = x^2 + 1,$$
and it would hardly make sense to set (?) $x = g(x) = x^2 + 1$ (?).

We sum all this up in the following definition:

Definition. Given two functions
$$g: A \to B, \qquad f: B \to C,$$
the composition
$$f(g): A \to C$$
is the function whose values are given by the formula
$$f(g)(x) = f(g(x)).$$
Here, for each $x$, $f(g)(x)$ denotes the value of the function $f(g)$ at the point $x$. Diagrammatically:
$$A \overset{g}{\to} B \overset{f}{\to} C.$$
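
The definition translates almost word for word into code. The sketch below is an illustration only; the helper name `compose` is hypothetical, and Python functions stand in for the functions $f$ and $g$ of the definition.

```python
import math

def compose(f, g):
    """Return the composite function f(g), whose value at x is f(g(x))."""
    return lambda x: f(g(x))

# First example of this section: inside function g, outside function f.
g = lambda x: x * x + 3 * x + 5
f = lambda u: u ** 5
phi = compose(f, g)

# Second example: the square-root function applied to x^2 + 1.
psi = compose(math.sqrt, lambda x: x * x + 1)

print(phi(1), psi(0))
```

Note that `compose(f, g)` applies `g` first and `f` second, matching the order "inside, then outside."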

Let us consider some more examples.


Example 1. Let
$$f(u) = \sin u, \qquad g(x) = 2x + 1.$$
Then
$$f(g(x)) = \sin g(x) = \sin(2x + 1).$$
(In this example, what is $A$? What are $B$ and $C$?)

Example 2. Let
$$f(u) = u^2 + u + 1, \qquad g(x) = \sqrt{x}.$$
Then
$$f(g(x)) = (\sqrt{x})^2 + \sqrt{x} + 1 = x + \sqrt{x} + 1.$$
(What are $A$, $B$, and $C$?)

Example 3. Let
$$f(u) = \sin u, \qquad g(x) = x^2 - 5.$$
Then
$$f(g(x)) = \sin(g(x)) = \sin(x^2 - 5).$$

Example 4. Let
$$f(u) = \int_0^u (t^4 - 1)\,dt, \qquad g(x) = x^2.$$
Then
$$f(g(x)) = \int_0^{g(x)} (t^4 - 1)\,dt = \int_0^{x^2} (t^4 - 1)\,dt.$$
Thus, for example,
$$f(g(3)) = \int_0^{3^2} (t^4 - 1)\,dt = \int_0^{9} (t^4 - 1)\,dt.$$

In Examples 1 through 4 above, we supposed that $f$ and $g$ were given, and we then proceeded to form the composite function $\phi = f(g)$. More often, however, we are given a function $\phi$, and in order to investigate the function $\phi$, we express it as the composition of two other functions, each of which is simpler than $\phi$. For example, to investigate the function
$$\phi(x) = \int_0^{x^2} (t^4 - 1)\,dt,$$
we first observe that it has the form
$$\phi(x) = \int_0^{g(x)} (t^4 - 1)\,dt,$$
where
$$g(x) = x^2.$$
Thus
$$\phi = f(g),$$
where
$$f(u) = \int_0^u (t^4 - 1)\,dt, \qquad g(x) = x^2.$$
Similarly, in the preceding three examples, if $\phi$ is given by the final formula, we shall for many purposes need to set up a pair of functions $f$ and $g$ in such a way that $\phi = f(g)$.

The derivative of a function is also a function; and so we can form composite functions of the type $f'(g)$ and $f(g')$. Consider, for example,
$$f(u) = u^3, \qquad g(x) = \sin x.$$
Then
$$f'(u) = 3u^2, \qquad g'(x) = \cos x.$$
Therefore
$$f(g(x)) = \sin^3 x, \qquad f'(g(x)) = 3\sin^2 x, \qquad f'(g(x))g'(x) = 3\sin^2 x \cos x.$$
These formulas are significant, because it will turn out that
$$Df(g(x)) = D\sin^3 x = f'(g(x))g'(x) = 3\sin^2 x \cos x.$$
Similarly,
$$g(f(u)) = \sin u^3, \qquad g'(f(u)) = \cos u^3, \qquad g'(f(u))f'(u) = (\cos u^3)\,3u^2,$$
which will turn out to be the derivative of $\sin u^3$. (Here $\cos u^3$ is the cosine of $u^3$, not the cube of $\cos u$.)

Let us try one more example:
$$f(u) = \cos u, \qquad g(x) = \sqrt{x}, \qquad f(g(x)) = \cos\sqrt{x},$$
$$f'(g(x)) = -\sin\sqrt{x}, \qquad g'(x) = \frac{1}{2\sqrt{x}},$$
$$f'(g(x))g'(x) = (-\sin\sqrt{x})\,\frac{1}{2\sqrt{x}}.$$
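
Since the text only announces that these products "will turn out" to be derivatives, a numerical spot check may be reassuring. The sketch below is not a proof: it compares a symmetric difference quotient (the hypothetical helper `central_difference`) with the product $f'(g(x))g'(x)$ at one arbitrarily chosen point.

```python
import math

def central_difference(phi, x, h=1e-6):
    """Symmetric difference quotient, a numerical stand-in for D phi at x."""
    return (phi(x + h) - phi(x - h)) / (2 * h)

x = 0.7   # arbitrary test point with x > 0

# f(u) = u^3, g(x) = sin x: compare D sin^3 x with 3 sin^2 x cos x.
numeric_1 = central_difference(lambda t: math.sin(t) ** 3, x)
formula_1 = 3 * math.sin(x) ** 2 * math.cos(x)

# f(u) = cos u, g(x) = sqrt(x): compare D cos sqrt(x) with (-sin sqrt(x)) / (2 sqrt(x)).
numeric_2 = central_difference(lambda t: math.cos(math.sqrt(t)), x)
formula_2 = -math.sin(math.sqrt(x)) / (2 * math.sqrt(x))

print(numeric_1, formula_1)
print(numeric_2, formula_2)
```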

In dealing with composite functions, we shall need the following:

Theorem 1. The composition of two continuous functions is continuous. That is, if
$$\lim_{x \to x_0} g(x) = g(x_0) = u_0$$
and
$$\lim_{u \to u_0} f(u) = f(u_0),$$
then
$$\lim_{x \to x_0} f(g(x)) = f(g(x_0)).$$
The idea here is that
$$x \approx x_0 \;\Rightarrow\; g(x) \approx g(x_0) \;\Rightarrow\; f(g(x)) \approx f(g(x_0)).$$
In Appendix E it is shown that this idea can be used to get a proof.

PROBLEM SET 4.5

For each of the functions $\phi$, given in the problems below, find formulas for functions $f$ and $g$, such that $\phi = f(g)$. Then get formulas for $f'$, $g'$, $f'(g)$, and $\phi'$.

1. $\phi(x) = \sin^2 x$
2. $\phi(x) = \cos^2 x$
3. $\phi(x) = (\sin x + \cos x)^2$
4. $\phi(x) = \sin 2x$
5. $\phi(x) = \tan 2x$
6. $\phi(x) = \cos 2x$
7. $\phi(x) = \sqrt{1 - x^2}$
8. $\phi(x) = \sin^6 x$
9. $\phi(x) = \int \cdots\, dt$
10. $\phi(x) = \int_0^{\sin x} (t^2 + 1)\,dt$
11. $\phi(x) = \int_0^{\cos x} (t^2 + 1)\,dt$

(Note that the function
$$f(u) = \int_0^u (t^2 + 1)\,dt$$
can be expressed without the use of integral signs; $f$ can be calculated as a polynomial.)
12. a) Find $\displaystyle\lim_{u \to u_0} \frac{\sin u - \sin u_0}{u - u_0}$.

b) Find $\displaystyle\lim_{x \to x_0} \frac{\sin x^2 - \sin x_0^2}{x^2 - x_0^2}$.

(It is not hard to see a very plausible answer to Problem 12(b). To prove, in an orderly way, that your answer is right, you should express the function
$$\phi(x) = \frac{\sin x^2 - \sin x_0^2}{x^2 - x_0^2}$$
as the composition $f(g)$ of two functions $f$ and $g$, and then apply Theorem 1.)

13. Find $\displaystyle\lim_{x \to x_0} \frac{\sin x^3 - \sin x_0^3}{x - x_0}$.

14. Given $\phi(x) = \sin x^2$, proceed as in Problems 1 through 11.

15. Do the same, for $\phi(x) = \sin x^3$.

16. Do the same, for $\phi(x) = \sin \sqrt{x}$.

17. Given
$$\phi(x) = \int_0^{\sin x} \sqrt{1 + t^2}\,dt.$$
On the basis of the theory that you know so far, you are in no position to calculate
$$f(u) = \int_0^u \sqrt{1 + t^2}\,dt.$$
And you have, so far, no general formula for $D[f(g)]$. On the other hand, you ought by this time to be able to make a good guess about $D[f(g)]$, and then use your guess to write some kind of formula for
$$\phi'(x) = D\int_0^{\sin x} \sqrt{1 + t^2}\,dt.$$
[Hint: As a start, what is $f'(u)$?]

18. Find $\displaystyle\lim_{x \to \pi/2} \frac{\sin x - 1}{x - \pi/2}$. [Hint: If you can figure out what the geometric meaning of this limit is, it will then be easy to find its numerical value.]

19. Find $\displaystyle\lim_{x \to \pi} \frac{\cos x + 1}{x - \pi}$. [Same hint.]

20. Find $\displaystyle\lim_{x \to \pi/4} \frac{\tan x - 1}{x - \pi/4}$.

21. Find $\displaystyle\lim_{x \to \pi/4} \frac{\sin 2x - 1}{x - \pi/4}$.

22. Find $\displaystyle\lim_{x \to 0} \frac{\sec x - 1}{x}$.

4.6 THE CHAIN RULE

You may have observed, in the preceding problem set, that the formula
$$Df(g) = f'(g)g'$$
held in a number of cases. For example, if
$$f(u) = u^n,$$
then
$$f'(u) = nu^{n-1};$$
and
$$Df(g) = Dg^n = ng^{n-1}g' = f'(g)g'.$$
Similarly, if
$$f(u) = \sqrt{u},$$
then
$$f'(u) = \frac{1}{2\sqrt{u}}, \qquad Df(g) = D\sqrt{g} = \frac{1}{2\sqrt{g}}\,g' = f'(g)g'.$$

The same formula seems to hold for
$$f(u) = \sin u,$$
at least in the cases where we can test the formula by calculating $Df(g) = D\sin g$. For example, it turns out that
$$D\sin 2x = 2\cos 2x,$$
and this has the form
$$Df(g) = f'(g)g',$$
where
$$f(u) = \sin u, \qquad f'(u) = \cos u, \qquad g(x) = 2x, \qquad g'(x) = 2,$$
$$f'(g)g' = (\cos 2x)\,2 = 2\cos 2x.$$
The formula
$$Df(g) = f'(g)g'$$
is called the chain rule. In fact, it always holds, whenever the right-hand side has a meaning, that is, whenever $f'(g)$ and $g'$ are defined. We shall prove this at the end of this section. First, we give some illustrations of its use.
Example 1. Consider
$$\phi(x) = \sin(3x + 1).$$
This is a composite function
$$\phi(x) = f(g(x)),$$
with
$$f(u) = \sin u, \qquad f'(u) = \cos u,$$
$$g(x) = 3x + 1, \qquad g'(x) = 3.$$
By the chain rule,
$$\phi'(x) = D\sin(3x + 1) = [\cos(3x + 1)]\,D(3x + 1) = 3\cos(3x + 1).$$

Example 2. Consider
$$\phi(x) = \sin(k + x).$$
By the chain rule,
$$\phi'(x) = [\cos(k + x)]\,D(k + x) = \cos(k + x).$$
Note that if the chain rule is known, and the formula
$$D\sin = \cos$$
is known, we can find $D\sin(k + x)$ without using the addition formula.

To give new applications of the chain rule, we should not be talking about cases where the outside function $f$ is $u^n$ or $\sqrt{u}$. For these outside functions, we have known for a long time that the chain rule held. After the trigonometric functions, the next outside functions to consider are integrals:

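Example 3. Consider
$$\phi(x) = \int_1^{kx} \frac{1}{t}\,dt \qquad (k, x > 0).$$
Here
$$\phi(x) = f(g(x)), \qquad f(u) = \int_1^u \frac{1}{t}\,dt \quad (u > 0), \qquad g(x) = kx,$$
$$f'(u) = \frac{1}{u}, \qquad f'(g(x)) = \frac{1}{kx}, \qquad g'(x) = k,$$
$$D\phi = Df(g) = f'(g)g' = \frac{1}{kx}\,k = \frac{1}{x}.$$
This is a curious result:
$$D\int_1^{kx} \frac{1}{t}\,dt = \frac{1}{x} = D\int_1^{x} \frac{1}{t}\,dt.$$
What does it tell us about the functions?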
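
The "curious result" can be observed numerically without knowing any closed form for the integral. The sketch below assumes a plain midpoint-rule approximation (the hypothetical helper `integral_of_reciprocal`) and a symmetric difference quotient; it is an experiment, not an answer to the question just asked.

```python
def integral_of_reciprocal(upper, n=20000):
    """Midpoint-rule approximation of the integral of 1/t from 1 to upper (upper > 0)."""
    h = (upper - 1.0) / n
    return sum(h / (1.0 + (i + 0.5) * h) for i in range(n))

def phi(x, k):
    """phi(x) = integral of 1/t from 1 to k*x."""
    return integral_of_reciprocal(k * x)

def slope(k, x, dx=1e-4):
    """Symmetric difference quotient approximating D phi at x."""
    return (phi(x + dx, k) - phi(x - dx, k)) / (2 * dx)

x = 2.0
slopes = [slope(k, x) for k in (1.0, 3.0, 10.0)]
print(slopes)   # each entry should be close to 1/x, whatever k is
```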


Example 4. The chain rule can be applied several times in the same problem. For
example, we know that

$$D\sin g = (\cos g)\,g',$$
whatever $g$ may be. We can then apply the formula in cases where $g'$ itself needs to be calculated by the chain rule:
$$D\sin\sin x = (\cos\sin x)\,D\sin x = (\cos\sin x)\cos x.$$
Here $\sin\sin x$ is the sine of the sine of $x$, which is different from $\sin^2 x$. Therefore
$$D\sin\sin\sin x = (\cos\sin\sin x)\,D\sin\sin x = (\cos\sin\sin x)(\cos\sin x)\cos x.$$


Example 5. Similarly,
$$D\{[(x^3 + 1)^2 + 1]^2 + 1\}^3 = 3\{[(x^3 + 1)^2 + 1]^2 + 1\}^2\, D\{[(x^3 + 1)^2 + 1]^2 + 1\}$$
$$= 3\{\ \}^2 \cdot 2[(x^3 + 1)^2 + 1]\, D[(x^3 + 1)^2 + 1]$$
$$= 3\{\ \}^2 \cdot 2[\ ] \cdot 2(x^3 + 1)\, D(x^3 + 1)$$
$$= 3\{\ \}^2 \cdot 2[\ ] \cdot 2(\ ) \cdot 3x^2.$$
Here we have left braces, brackets, and parentheses empty, in the intermediate stages, to make the steps easier to follow. The final answer is
$$3\{[(x^3 + 1)^2 + 1]^2 + 1\}^2 \cdot 2[(x^3 + 1)^2 + 1] \cdot 2(x^3 + 1) \cdot 3x^2,$$
which can be simplified slightly by collecting constants.
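
A long chain of this kind is easy to get wrong, so a numerical comparison is a reasonable safeguard. The sketch below (function names are illustrative only) builds the final answer from the same three nested pieces and compares it with a symmetric difference quotient at one test point.

```python
def phi(x):
    """{[(x^3 + 1)^2 + 1]^2 + 1}^3"""
    return (((x**3 + 1)**2 + 1)**2 + 1)**3

def phi_prime(x):
    """The final answer of Example 5, assembled from the nested pieces."""
    inner = x**3 + 1          # contents of ( )
    middle = inner**2 + 1     # contents of [ ]
    outer = middle**2 + 1     # contents of { }
    return 3 * outer**2 * 2 * middle * 2 * inner * 3 * x**2

x, h = 0.5, 1e-6
numeric = (phi(x + h) - phi(x - h)) / (2 * h)
print(numeric, phi_prime(x))
```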

We shall now prove the chain rule. Given
$$\phi(x) = f(g(x)),$$
we want to show that for each $x_0$ we have
$$\phi'(x_0) = f'(g(x_0))\,g'(x_0).$$
Obviously, we must assume that

a) $g$ has a derivative $g'(x_0)$ at $x_0$, and

b) $f$ has a derivative $f'(g(x_0))$ at $g(x_0)$.

A differentiable function is continuous. Therefore
$$\lim_{x \to x_0} g(x) = g(x_0).$$
For convenience of reference later, we write this in the form

c) $\displaystyle\lim_{\Delta x \to 0} [g(x_0 + \Delta x) - g(x_0)] = 0.$

By definition,
$$\phi'(x_0) = \lim_{x \to x_0} \frac{\phi(x) - \phi(x_0)}{x - x_0} = \lim_{\Delta x \to 0} \frac{\phi(x_0 + \Delta x) - \phi(x_0)}{\Delta x} = \lim_{\Delta x \to 0} \frac{f(g(x_0 + \Delta x)) - f(g(x_0))}{\Delta x}.$$
Let
$$u_0 = g(x_0), \qquad \Delta u = g(x_0 + \Delta x) - g(x_0) = g(x_0 + \Delta x) - u_0,$$
so that
$$g(x_0 + \Delta x) = u_0 + \Delta u.$$
Then
$$\phi'(x_0) = \lim_{\Delta x \to 0} \frac{f(u_0 + \Delta u) - f(u_0)}{\Delta x}.$$
Here the numerator is a difference
$$\Delta f = f(u_0 + \Delta u) - f(u_0)$$
between two values of the function $f$.

Now comes the crucial idea: we apply to $f$ the theorem stated at the end of Section 4.4. We need to change the notation of the theorem, using $u$ in place of $x$, to fit the notation of the present discussion. The theorem then says that there is a function $E$, defined in a neighborhood of 0, such that
$$\Delta f = f'(u_0)\,\Delta u + E(\Delta u)\,\Delta u,$$
and
$$\lim_{\Delta u \to 0} E(\Delta u) = E(0) = 0.$$

Therefore
$$\frac{\Delta f}{\Delta x} = f'(u_0)\,\frac{\Delta u}{\Delta x} + E(\Delta u)\,\frac{\Delta u}{\Delta x}$$
$$= f'(g(x_0))\,\frac{g(x_0 + \Delta x) - g(x_0)}{\Delta x} + E(\Delta u)\,\frac{g(x_0 + \Delta x) - g(x_0)}{\Delta x}.$$
It is now easy to see what the limit is. By definition of $g'(x_0)$, we have
$$\lim_{\Delta x \to 0} \frac{g(x_0 + \Delta x) - g(x_0)}{\Delta x} = g'(x_0).$$
As $\Delta x \to 0$, $\Delta u \to 0$. (Remember the definition of $\Delta u$, and recall condition (c), at the beginning of the proof.) Therefore, by Theorem 1 of Section 4.5, we have
$$\lim_{\Delta x \to 0} E(\Delta u) = 0.$$
This gives
$$\phi'(x_0) = \lim_{\Delta x \to 0} \frac{\Delta f}{\Delta x} = f'(g(x_0))\,g'(x_0) + 0 \cdot g'(x_0).$$

We therefore have:

Theorem 1. Let $f$ and $g$ be functions. Then
$$Df(g) = f'(g)g',$$
at every point $x_0$ at which the right-hand member has a meaning.

That is, the formula holds at every point $x_0$ such that (a) $g$ is differentiable at $x_0$ and (b) $f$ is differentiable at $g(x_0)$. These conditions illustrate the normal pattern of theorems involving differentiation formulas: the equation holds whenever the quantities mentioned in the right-hand member exist.
PROBLEM SET 4.6

In this problem set, your main job is to learn to use the chain rule. In each odd-numbered problem, from 1 to 19, you should indicate the logic of your work by writing formulas for $f$, $g$, $f'$, $f'(g)$, and $g'$, before writing the answer in the form $D[f(g)] = f'(g)g'$. For example, given the function
$$\phi(x) = \sin(x^2 + 1),$$
your solution should be written in the form
$$f(u) = \sin u, \qquad g(x) = x^2 + 1,$$
$$f'(u) = \cos u, \qquad f'(g(x)) = \cos(x^2 + 1),$$
$$g'(x) = 2x, \qquad \phi'(x) = [\cos(x^2 + 1)]\,2x = 2x\cos(x^2 + 1).$$
If you go through this routine for one day, you are less likely hereafter to omit the factor $g'$ following $f'(g)$ in calculating $Df(g)$.

The parentheses and brackets in the expression $[\cos(x^2 + 1)]\,2x$ look clumsy, but to eliminate the brackets we have to change the order of the factors, as in the last expression above. It would have been simpler to write

(?) $\phi'(x) = \cos(x^2 + 1)2x$ (?)

but this is the wrong answer: the function on the right is the function whose value, for each $x$, is the cosine of $2x^3 + 2x$. If you write this formula for $\phi'$, you are relying on the reader to remember what the problem was and to realize that you must not mean what you are saying. In some cases you may not feel sure whether brackets are necessary. When in doubt, use them.

1.

7.

12.

16.

20.
24.
27.
30.
34.
37.

Now find the derivatives of the following functions:

2.

sinx2

8.

sin(x3 + x)
tanx -

sec 2x
sec

vx2

a) sin

(vx)2

b) tanx2

a) tan2 x

3-

21.

b) \sinx

(I

,i:x

[x[2

cos4x - sin4x

cos

vx2

sin sinx

cos:c

f,k:J; -1 dt

5.

tan(t2 +

10.

,1 cosx

15.

t3Jt

f,x 1
- dt
t

1)

I I.

a) sec2 x

18.

22.

6.

19.

x
tan a) '\!tanx

23.

-1
2

tan t2 +

x
tan --

b) secx2

cos 2x

26.
29.
32.

sin cosx

35. L

+ sinx)

cos3x

cos

14.

17.

25.
28.
31.

,,--

4.

9.

sin x cos2x + sin3

cos cosx
sin

cosx3

sinx3 +x

13.

3.

sin2 x

cos2

- sin2

sinx

+ cosx
b) tan ,;:;.:

33.

sin2 sinx + cos2 sinx


sin sin sin

36.

ix'
f,k - dt.
0

cos f

dt

tan sinx

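37. Let $k$ be any positive number; and for each positive number $x$, let
$$\phi(x) = \int_1^{kx} \frac{1}{t}\,dt.$$
Find the simplest possible formula for $\phi'(x)$. Then do the same, for the functions $\phi(x)$ defined by the following formulas.

38. $\displaystyle\int_1^{x^2} \frac{1}{t}\,dt \quad (x > 0)$

39. $\displaystyle\int_1^{x^3} \frac{1}{t}\,dt - 3\int_1^{x} \frac{1}{t}\,dt$

40. $\displaystyle\int_1^{x^3} \frac{1}{t}\,dt$

41. $\displaystyle\int_1^{x^2} \frac{1}{t}\,dt - 2\int_1^{x} \frac{1}{t}\,dt$

42. $\displaystyle\int_1^{\sqrt{x}} \frac{1}{t}\,dt$

43. $\displaystyle\int_1^{\sqrt{x}} \frac{1}{t}\,dt - \frac{1}{2}\int_1^{x} \frac{1}{t}\,dt$

44. $\displaystyle\int_1^{\sin x} \frac{1}{t}\,dt \quad (0 < x < \pi)$

45. For each $x > 0$, let
$$f(x) = \int_1^{x} \frac{1}{t}\,dt.$$
Show that for every pair of positive numbers $a$ and $b$, we have
$$f(ab) = f(a) + f(b).$$
[Hint: When we try to attack this problem by the methods of calculus, the obvious trouble is that the problem does not appear to involve any functions. Therefore our first step should be to introduce a function into the problem.]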

46. Let $\phi(x) = f(x^n)$, where $x > 0$ and $f$ is as in the preceding problem. Find $\phi'(x)$.

*47. Given
$$D\sin = \cos, \qquad D\cos = -\sin, \qquad \sin 0 = 0, \qquad \cos 0 = 1,$$
and given no other information whatever about the sine and cosine, prove that
$$\sin(k + x) = \sin k\cos x + \cos k\sin x,$$
$$\cos(k + x) = \cos k\cos x - \sin k\sin x,$$
for every $k$ and $x$. [Hint: Let $f$ be the function which is $\equiv 0$ if the first equation holds; let $g$ be the function which is $\equiv 0$ if the second equation holds, and investigate the function $F = f^2 + g^2$.]

This result tends to confirm a claim that was made in Problem *27 of Problem Set 4.3. The claim was that all properties of the sine and cosine are contained, implicitly, in the properties that we have just used to prove the addition formulas. Later we shall find further confirmation of this.

*48. Let $f$ be a function, defined for every $x$, such that

(a) $f'' = -f$, (b) $f(0) = 0$, (c) $f'(0) = 1$.

Show that $f(x) = \sin x$ for every $x$.

*49. Let $g$ be a function, defined for every $x$, such that

(a) $g'' = -g$, (b) $g(0) = 1$, (c) $g'(0) = 0$.

Show that $g(x) = \cos x$ for every $x$.

4.7 INVERTIBLE FUNCTIONS. THE INVERSE TRIGONOMETRIC FUNCTIONS

A function $f$ is called invertible if its graph intersects every horizontal line in at most one point. Thus $f(x) = x^3$ is invertible, but $f(x) = x^2$ is not.

[Figure: graphs of $y = f(x) = x^3$ and $y = f(x) = x^2$.]

If $f$ is invertible, then for each number $y$ in the image of $f$ there is exactly one number $x$ in the domain of $f$ such that $f(x) = y$. Thus to every invertible function $f$ there corresponds a new function $f^{-1}$, called the inverse of $f$. (This is pronounced $f$ inverse. The symbol $-1$ is not an exponent, really; and $f^{-1}$ is not $1/f$.) The inverse is defined by the condition that
$$f^{-1}(x) = y \quad \text{if} \quad f(y) = x.$$
If $f$ is invertible, this condition defines a function, because for each $x$ in the image of $f$ there is exactly one such $y$. It is not hard to see what this relation between $f$ and $f^{-1}$ means geometrically. The point $(x, y)$ is on the graph of $f^{-1}$ if the point $(y, x)$ is on the graph of $f$. Therefore, to get the graph of $f^{-1}$ from the graph of $f$, we should reflect the graph of $f$ across the line $y = x$.

Let us see what this means algebraically. Consider
$$f(x) = x^3.$$
The graph of $f$ is the graph of the equation
$$y = x^3. \qquad (1)$$
The graph of $f^{-1}$ is the graph of the equation
$$x = y^3. \qquad (2)$$
Here we have simply interchanged $x$ and $y$ in Eq. (1). Now (2) is equivalent to
$$y = \sqrt[3]{x}.$$
Thus
$$f^{-1}(x) = \sqrt[3]{x},$$
as we would expect: the inverse of cubing is the extraction of cube roots.
y

Theorem 1. Let $f$ be an invertible function. Then
$$f(f^{-1}(x)) = x,$$
for every $x$.

Proof. For each $x$, let $y = f^{-1}(x)$. Then $f(y) = x$, by definition of $f^{-1}$. Therefore,
$$f(f^{-1}(x)) = f(y) = x.$$

We can use this idea to calculate the derivatives of inverse functions, assuming that the inverse function has a derivative.

Example 1. The function $f(x) = x^3$ is invertible, and its inverse is $f^{-1}(x) = \sqrt[3]{x}$. Thus
$$(\sqrt[3]{x})^3 = x.$$
We take the derivative on each side of this equation, using the chain rule for the composite function on the left. This gives:
$$3(\sqrt[3]{x})^2\, D\sqrt[3]{x} = 1,$$
$$D\sqrt[3]{x} = \frac{1}{3\sqrt[3]{x^2}} \qquad (x \ne 0).$$
You may have calculated this by another method, in Problem Set 3.6, but the present method is easier.
Example 2. A function of the form $f(x) = x^q$ (where $q$ is a positive integer) is not necessarily invertible; in fact, it never is when $q$ is even. We therefore restrict $x$ to positive values. This gives an inverse function
$$f^{-1}(x) = \sqrt[q]{x}.$$
We calculate $D\sqrt[q]{x}$ in three steps, as follows:

1) $(\sqrt[q]{x})^q = x$,

2) $q(\sqrt[q]{x})^{q-1}\, D\sqrt[q]{x} = 1$,

3) $D\sqrt[q]{x} = \dfrac{1}{q\sqrt[q]{x^{q-1}}} \qquad (x > 0)$.

When we use this method, the equations that we write have the following general form:

1) $f(f^{-1}(x)) = x$,

2) $f'(f^{-1}(x))\, Df^{-1}(x) = 1$,

3) $Df^{-1}(x) = \dfrac{1}{f'(f^{-1}(x))} \qquad (f'(f^{-1}(x)) \ne 0)$.

(You should check this against the preceding examples.) The method assumes that our problem has an answer, that is, that $f^{-1}$ has a derivative. Thus we need to show that this holds, in every case in which the fraction at the last stage has a meaning.
This is easy to see. Consider $f$, $f^{-1}$, as in the figure below, with
$$y_1 = f^{-1}(x_1), \qquad x_1 = f(y_1),$$
as the labels indicate. If $f$ has a tangent line $L$, at $(y_1, x_1)$, then $f^{-1}$ has a tangent line $L'$ at $(x_1, y_1)$: to get this, we reflect both the graph and the tangent line across the line $y = x$. The slope of $L$ is
$$m = f'(y_1) = f'(f^{-1}(x_1)).$$
If $m \ne 0$, then $L$ is not horizontal. Therefore $L'$ is not vertical, and $f^{-1}$ has a derivative at $x_1$. Thus we have completed the proof of the following theorem.

Theorem 2.
$$Df^{-1}(x) = \frac{1}{f'(f^{-1}(x))},$$
wherever the fraction on the right has a meaning.
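
The formula of Theorem 2 can be tested numerically even when no formula for the inverse is at hand. The sketch below assumes $f(y) = y^3$ on $[0, 10]$ and computes the inverse by bisection; the helper name `inverse_by_bisection` is hypothetical, and the bisection step is an illustration device, not a method used in the text.

```python
def inverse_by_bisection(f, x, lo, hi, iterations=200):
    """Solve f(y) = x for y in [lo, hi], assuming f is increasing there."""
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid) < x:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

f = lambda y: y ** 3
f_prime = lambda y: 3 * y ** 2

x = 8.0
y = inverse_by_bisection(f, x, 0.0, 10.0)            # numerical value of f^{-1}(8)
h = 1e-5
numeric = (inverse_by_bisection(f, x + h, 0.0, 10.0)
           - inverse_by_bisection(f, x - h, 0.0, 10.0)) / (2 * h)
formula = 1 / f_prime(y)                              # 1 / f'(f^{-1}(x))
print(y, numeric, formula)
```

The difference quotient of the numerically inverted function and the value $1/f'(f^{-1}(x))$ should agree to several decimal places.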


In most cases, the method used in deriving this formula is easier to use than the formula itself. To find $Df^{-1}$, we write $f(f^{-1}(x)) = x$, differentiate, and solve for $Df^{-1}$, as in Examples 1 and 2.

We shall now discuss the so-called "inverse trigonometric functions." This involves a slight difficulty, because the fact is that no trigonometric function is invertible. The reason is that every trigonometric function satisfies the identity
$$f(x + 2\pi) = f(x),$$
for every $x$ for which the trigonometric function $f(x)$ is defined at all. Therefore

every value that a trigonometric function takes on at all is taken on for infinitely many values of $x$. For example, the graph of $f(x) = \sin x$ looks something like this:

[Figure: graph of $y = f(x) = \sin x$.]

If we restrict $x$ to the interval $[-\pi/2, \pi/2]$, then we get a new function whose graph includes some, but not all, of the original graph. This new function is denoted by Sin, and the graph of $y = \operatorname{Sin} x$ looks like the left-hand figure below.

[Figure: left, graph of $y = \operatorname{Sin} x$; right, a semicircle with the angle $\theta$ marked.]

The graph looks as if Sin ought to be invertible; and in fact this is not hard to see. In the right-hand figure above, we have switched the notation to fit the definition of the sine, so that $y = \sin\theta$. Every point of the semicircle corresponds to exactly one $\theta$ on the interval $[-\pi/2, \pi/2]$; and every horizontal line intersects the semicircle in exactly one point.

As always for inverse functions, we get the graph of $\operatorname{Sin}^{-1}$ by reflecting the graph of Sin across the line $y = x$. Therefore the graph of $\operatorname{Sin}^{-1}$ looks like this:

[Figure: graph of $y = \operatorname{Sin}^{-1} x$.]
y

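Similarly, we define $\operatorname{Cos} x$ to be equal to $\cos x$, on the interval $[0, \pi]$, and we show that Cos is invertible. The graphs of Cos and $\operatorname{Cos}^{-1}$ look like this:

[Figure: graphs of $y = \operatorname{Cos} x$ and $y = \operatorname{Cos}^{-1} x$.]

To find the derivative of $\operatorname{Sin}^{-1}$, we write
$$\sin\operatorname{Sin}^{-1} x = x,$$
$$(\cos\operatorname{Sin}^{-1} x)\, D\operatorname{Sin}^{-1} x = 1,$$
$$D\operatorname{Sin}^{-1} x = \frac{1}{\cos\operatorname{Sin}^{-1} x}.$$
We want to simplify the expression $\cos\operatorname{Sin}^{-1} x$ on the right, and, while we are at it, we shall get a formula for $\sin\operatorname{Cos}^{-1} x$. Since
$$\cos^2 u + \sin^2 u = 1,$$
we can now solve, getting
$$\cos u = \pm\sqrt{1 - \sin^2 u}, \qquad (1)$$
$$\sin u = \pm\sqrt{1 - \cos^2 u}. \qquad (2)$$
For
$$u = \operatorname{Sin}^{-1} x,$$
this gives
$$\sin u = \sin\operatorname{Sin}^{-1} x = x,$$
and so from (1) we get
$$\cos\operatorname{Sin}^{-1} x = \pm\sqrt{1 - x^2}. \qquad (3)$$
Similarly, for
$$u = \operatorname{Cos}^{-1} x$$
we have
$$\cos u = \cos\operatorname{Cos}^{-1} x = x,$$
and so from (2) we get
$$\sin\operatorname{Cos}^{-1} x = \pm\sqrt{1 - x^2}. \qquad (4)$$
Formulas (3) and (4) are correct, but they are not good enough for our purposes. In fact, the double signs can be omitted, and the formulas still hold: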

Theorem 3.
$$\cos\operatorname{Sin}^{-1} x = \sqrt{1 - x^2}, \qquad \sin\operatorname{Cos}^{-1} x = \sqrt{1 - x^2}.$$

To see this, we merely need to remember that
$$-\frac{\pi}{2} \le \operatorname{Sin}^{-1} x \le \frac{\pi}{2}.$$
On this interval, the cosine is $\ge 0$. Therefore, in (3), it must be the plus sign that applies. Similarly,
$$0 \le \operatorname{Cos}^{-1} x \le \pi.$$
On this interval, the sine is $\ge 0$. Therefore, in (4), it must be the plus sign that applies.

We now substitute $\sqrt{1 - x^2}$ for $\cos\operatorname{Sin}^{-1} x$, in the formula that we got for $D\operatorname{Sin}^{-1} x$. This gives:

Theorem 4.
$$D\operatorname{Sin}^{-1} x = \frac{1}{\sqrt{1 - x^2}} \qquad (-1 < x < 1).$$

Note that $D\operatorname{Sin}^{-1} x$ is always $> 0$, just as the graph suggests that it ought to be. At the endpoints of the graph, the tangents are vertical.

The proof of the following theorem is like that of the preceding one:

Theorem 5.
$$D\operatorname{Cos}^{-1} x = -\frac{1}{\sqrt{1 - x^2}} \qquad (-1 < x < 1).$$

Note that $D\operatorname{Cos}^{-1} x$ is always $< 0$, as it should be.
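
Theorems 3, 4, and 5 can be spot-checked numerically. The sketch below assumes that Python's `math.asin` and `math.acos` return values in $[-\pi/2, \pi/2]$ and $[0, \pi]$ respectively, so that they play the roles of $\operatorname{Sin}^{-1}$ and $\operatorname{Cos}^{-1}$; derivatives are replaced by symmetric difference quotients.

```python
import math

h = 1e-6
checks = []
for x in (-0.9, -0.3, 0.0, 0.5, 0.8):
    root = math.sqrt(1 - x * x)
    d_asin = (math.asin(x + h) - math.asin(x - h)) / (2 * h)
    d_acos = (math.acos(x + h) - math.acos(x - h)) / (2 * h)
    checks.append((
        abs(math.cos(math.asin(x)) - root),   # Theorem 3, first formula
        abs(math.sin(math.acos(x)) - root),   # Theorem 3, second formula
        abs(d_asin - 1 / root),               # Theorem 4
        abs(d_acos + 1 / root),               # Theorem 5
    ))

for row in checks:
    print(["%.1e" % value for value in row])
```

Every printed discrepancy should be tiny; none of this replaces the sign argument given above.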


For $\tan x$, the process is simpler. The graph of $y = f(x) = \tan x$ looks something like this:

[Figure: graph of $y = \tan x$.]

To get an invertible function Tan, we take the portion of the graph that lies between $x = -\pi/2$ and $x = \pi/2$. We could verify by brute force that Tan is invertible, but it is easier to prove first the following theorem:

Theorem 6. Let $f$ be a differentiable function on an interval $I$. If $f'(x) \ne 0$ for every $x$ in $I$, then $f$ is invertible.

The proof is based on the mean-value theorem. If $f$ is not invertible, then the graph intersects some horizontal line in more than one point. Thus $f(a) = f(b)$ for some $a$ and $b$ in $I$. Therefore the graph has a horizontal chord. By MVT, this means that the graph has a horizontal tangent; that is, $f'(x) = 0$ for some $x$, which contradicts the hypothesis for $f$.

Now the domain of Tan is an open interval $(-\pi/2, \pi/2)$. On this interval,

$$\operatorname{Tan}' x = \sec^2 x > 0.$$

Therefore Tan is invertible.

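The graphs of Tan and $\operatorname{Tan}^{-1}$ look like this:

[Figure: graphs of y = Tan x and y = Tan^{-1} x; the latter lies between the horizontal lines y = -π/2 and y = π/2]

Theorem 7.

$$D \operatorname{Tan}^{-1} x = \frac{1}{1 + x^2}.$$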

The derivation is easier than the preceding ones, because it turns out that there
are no double signs to be eliminated.
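The derivation is not written out in the text. A sketch of how it might go, following the same pattern that was used for $\operatorname{Sin}^{-1}$ (the intermediate steps here are our reconstruction, not the author's wording):

```latex
\begin{align*}
\operatorname{Tan}\bigl(\operatorname{Tan}^{-1} x\bigr) &= x
  && \text{(definition of the inverse)}\\
\sec^2\bigl(\operatorname{Tan}^{-1} x\bigr)\cdot D\operatorname{Tan}^{-1} x &= 1
  && \text{(chain rule, since } \operatorname{Tan}' u = \sec^2 u\text{)}\\
\sec^2\bigl(\operatorname{Tan}^{-1} x\bigr) &= 1 + \tan^2\bigl(\operatorname{Tan}^{-1} x\bigr) = 1 + x^2
  && \text{(no square root, hence no double sign)}\\
D\operatorname{Tan}^{-1} x &= \frac{1}{1 + x^2}.
\end{align*}
```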
For the secant, the situation is trickier, and some handbooks contain formulas
that are wrong.

The reason is that the graph of the secant looks like this:

[Figure: graph of y = sec x = 1/cos x, with vertical asymptotes at the odd multiples of π/2]

(Remember that $\sec x = 1/\cos x$ wherever $\cos x \ne 0$.) This graph consists of infinitely many connected pieces, but none of these connected pieces is the graph of an invertible function. We therefore cannot use all of any one of the pieces. Everybody agrees that we ought to use the part of the graph where $0 \le x < \pi/2$, but there is no general agreement on what else we ought to use. To be safe, we define Sec x only for $0 \le x < \pi/2$. (See the graphs below.)

[Figure: graphs of y = Sec x and y = Sec^{-1} x]

(Query: How do you know that the secant never takes on the same value twice, on the interval $[0, \pi/2)$?)

On the basis of the definition of $\operatorname{Sec}^{-1}$, it is plain that the equation

$$y = \operatorname{Sec}^{-1} x$$

means two things:

$$x = \sec y \tag{5}$$

and

$$0 \le y < \frac{\pi}{2}. \tag{6}$$

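We now calculate the derivative. We have

$$\operatorname{Sec} \operatorname{Sec}^{-1} x = x, \tag{7}$$

and

$$\operatorname{Sec}' u = \operatorname{Sec} u \operatorname{Tan} u,$$

for every $u$ from 0 to $\pi/2$. (Why are we justified in using capital letters on the right?) Therefore, by the chain rule,

$$(\operatorname{Sec} \operatorname{Sec}^{-1} x)(\operatorname{Tan} \operatorname{Sec}^{-1} x)\, D \operatorname{Sec}^{-1} x = 1,$$

and

$$x\,(\operatorname{Tan} \operatorname{Sec}^{-1} x)\, D \operatorname{Sec}^{-1} x = 1. \tag{8}$$

Therefore

$$D \operatorname{Sec}^{-1} x = \frac{1}{x \operatorname{Tan} \operatorname{Sec}^{-1} x}. \tag{9}$$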
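We now need a formula for $\operatorname{Tan} \operatorname{Sec}^{-1} x$, analogous to the formulas for $\sin \operatorname{Cos}^{-1} x$ and $\cos \operatorname{Sin}^{-1} x$. We know that

$$1 + \tan^2 u = \sec^2 u$$

for every $u$. Therefore

$$\tan u = \pm\sqrt{\sec^2 u - 1}.$$

For $u = \operatorname{Sec}^{-1} x$, this says that

$$\tan \operatorname{Sec}^{-1} x = \pm\sqrt{\sec^2 \operatorname{Sec}^{-1} x - 1} = \pm\sqrt{x^2 - 1}.$$

Since $0 \le \operatorname{Sec}^{-1} x < \pi/2$, we have

$$\tan \operatorname{Sec}^{-1} x = \operatorname{Tan} \operatorname{Sec}^{-1} x \ge 0,$$

and so it must be the plus sign that applies on the right. Therefore

$$\operatorname{Tan} \operatorname{Sec}^{-1} x = \sqrt{x^2 - 1},$$

and we have:

Theorem 8.

$$D \operatorname{Sec}^{-1} x = \frac{1}{x\sqrt{x^2 - 1}}.$$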

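For convenience of reference, we repeat these differentiation formulas:

$$D \operatorname{Sin}^{-1} x = \frac{1}{\sqrt{1 - x^2}}, \qquad D \operatorname{Cos}^{-1} x = -\frac{1}{\sqrt{1 - x^2}},$$

$$D \operatorname{Tan}^{-1} x = \frac{1}{1 + x^2}, \qquad D \operatorname{Sec}^{-1} x = \frac{1}{x\sqrt{x^2 - 1}}.$$
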
We now have a new set of functions arising as derivatives: none of these four functions
has appeared before as a result of differentiation. This means, for one thing, that we
can use our new functions to solve certain area problems that we couldn't solve
before. Later we shall see that the process by which we find a function whose derivative is a given function has many other applications.

You will also need to remember

$$\cos \operatorname{Sin}^{-1} x = \sqrt{1 - x^2}, \qquad \sin \operatorname{Cos}^{-1} x = \sqrt{1 - x^2}.$$
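The four differentiation formulas can be checked numerically. The sketch below compares each formula with a symmetric difference quotient at one sample point. The helper names `central_diff` and `arcsec` are ours, and `arcsec` assumes the identity Sec⁻¹ x = Cos⁻¹(1/x) for x ≥ 1, which matches the convention 0 ≤ Sec⁻¹ x < π/2 adopted above.

```python
import math

def central_diff(f, x, h=1e-6):
    """Symmetric difference quotient, an approximation to f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

def arcsec(x):
    """Sec^-1 x for x >= 1, with values in [0, pi/2) (assumed convention)."""
    return math.acos(1.0 / x)

# (inverse function, derivative formula from the text, sample point)
cases = [
    (math.asin, lambda x: 1 / math.sqrt(1 - x * x), 0.3),
    (math.acos, lambda x: -1 / math.sqrt(1 - x * x), 0.3),
    (math.atan, lambda x: 1 / (1 + x * x), 2.0),
    (arcsec, lambda x: 1 / (x * math.sqrt(x * x - 1)), 2.0),
]

# Gap between the difference quotient and the formula, case by case.
errors = [abs(central_diff(f, x) - df(x)) for f, df, x in cases]

# The companion identity cos(Sin^-1 x) = sqrt(1 - x^2), at x = 0.3.
identity_gap = abs(math.cos(math.asin(0.3)) - math.sqrt(1 - 0.3 ** 2))

for (f, df, x), err in zip(cases, errors):
    print(f"x = {x}: formula gives {df(x):.6f}, gap {err:.1e}")
```

Agreement at a few sample points proves nothing, of course; it merely catches a misremembered sign or a misplaced square root.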


PROBLEM SET 4.7

For each of the following functions, calculate the derivative.


2. Cos-1 (x - 1)

3. Tan-1 (x + 1)

4. Sec-1 (x + 1)

5. Sin Sin-1 (x + 1)

6. Cos Sin-1 x

7. Sin Sin-1 x2

8. Cos Sin-1 (2x)

9. Sin-1

1. Sin-1 (x

10. Cos-1

1)

Vl

- x2

13. Sec-1 x2
I
16.

Sec-1

19. Sec Tan-1 x

\1 I

11. Tan-1 (x2 + 1)

12. Tan-1 (sec2 x

14. Sec-1 (1 + tan2 x)

15. Tan-1

1
17. Cos-1

18. Sin-1x

20. Tan Sec-1 x

21. Sin Tan-1 x

x2
-

v1 - x2

J)

22. Cos Tan-1 x

23. Tan Sin-1 x

24. Tan Cot-1 x

25. Sin-1 (2x + 1)

26. Tan-1 (1 - x)

27. Sin-1 x2

28. Show that

$$\operatorname{Sec}^{-1} x = \operatorname{Cos}^{-1} \frac{1}{x},$$

for every x on a certain interval. What interval?

29. Show that

$$\operatorname{Sin}^{-1} x + \operatorname{Cos}^{-1} x = \frac{\pi}{2}$$

for every x on the interval [-1, 1]. (A very short proof is possible. Remember the uniqueness theorem of Section 3.8.)

30. Find

f -1 - ? dt.
+ rI

Sketch.

33. Find

-1

32. Find

f/2
0

VI

t2

dt.

'/

31. Find

2dt.
l-+ t
1v2 r
I

v1

t2

Sketch.

dt.

34. Try to get the right answer for the area under the graph of $y = 1/(1 + x^2)$ on the whole interval $(-\infty, \infty)$. You need not justify your answer, so long as it is right.

35. Given
0 x l,

find a formula for f-1(x). Then explain how your answer might have been predicted
without a calculation.
36. Find

(2
1
dt.
J 21v3 iVi2 1
11v3 t
dt.
1 --v t2 + 1
-

38. Find

37. Find

J xVx2
1

_
_
_
_

dx.

39. In Theorem 6 we required that $f'(x)$ be different from 0 everywhere on the interval $I$. This hypothesis was satisfied by Tan on the open interval $(-\pi/2, \pi/2)$, and so we could conclude that Tan is invertible. But Theorem 6, as it stands, does not apply to Sin on $[-\pi/2, \pi/2]$ or to Sec on $[0, \pi/2)$, because the derivatives of these functions vanish at the endpoints $-\pi/2$, $\pi/2$, and 0. To take care of such cases, we need the following:

Theorem. If $f$ is differentiable on an interval $I$, and $f'(x) \ne 0$ at every interior point of $I$, then $f$ is invertible.
Here by an interior point of I we mean a point of I which is not an endpoint.
Reread the proof of Theorem 6 and see whether it proves this more general theorem.

If so, say so and explain. If not, furnish whatever additional reasoning is necessary.
40. It might also be convenient to have the following generalized form of the uniqueness theorem (of Section 3.8). Here we require that $F'(x) = G'(x)$ at all interior points of the interval $I$.

Theorem (?). Let $F$ and $G$ be differentiable functions, defined on the same interval $I$, and let $a$ be a point of $I$. If (i) $F(a) = G(a)$ and (ii) $F'(x) = G'(x)$ for every interior point $x$ of $I$, then (iii) $F(x) = G(x)$ for every point $x$ of $I$.

Reexamine the proof of Theorem 2 of Section 3.8, and see whether it proves the more general theorem above. (If not, complete the proof.) Then name a case in which the more general theorem is more convenient to use.

4.8

SIMPSON'S RULE. THE COMPUTATION OF π

In Section 3.8 we developed a method for evaluating definite integrals. To find

$$\int_a^b f(x)\,dx,$$

where $f$ is continuous, we first set up the function

$$F(x) = \int_a^x f(t)\,dt.$$

Then $F'(x) = f(x)$, for every $x$. We find another function $G$, such that $G' = f$. Then $F$ and $G$ have the same derivative $f$; and by adding a constant to $G$, we get a function, say $H$, such that $H' = G' = f$ and $H(a) = 0$. Since $F(a) = 0$, we know by the uniqueness theorem that $F(x) = H(x)$ for every $x$. Therefore

$$\int_a^b f(t)\,dt = H(b).$$

It is possible to write a theorem which sums this up very briefly:

Theorem 1. If $f$ is continuous, and

$$G'(x) = f(x) \qquad (a \le x \le b),$$

then

$$\int_a^b f(x)\,dx = G(b) - G(a).$$

Proof. For each $x$, let

$$F(x) = \int_a^x f(t)\,dt.$$

Then

$$F'(x) = f(x), \qquad \text{and} \qquad F(a) = 0.$$

Let

$$H(x) = G(x) - G(a).$$

Then

$$H'(x) = G'(x) = f(x), \qquad \text{and} \qquad H(a) = 0.$$

Therefore $F(x) = H(x)$ for every $x$, and so $F(b) = H(b)$. Therefore

$$\int_a^b f(t)\,dt = G(b) - G(a).$$

The proof reproduces the procedure that we have been using all along. $G$ is the first $G$ that we try, with $G' = f$; and $H$ is the function that we get when we adjust the constant.
But in many cases it is hard to find a known function which has a given function $f$ as its derivative. For example, if we had never heard of tan, Tan, or $\operatorname{Tan}^{-1}$, then we would have had no chance at all of finding a known function $G$ such that

$$G'(x) = \frac{1}{1 + x^2}.$$


Later, we shall learn more and better methods for attacking such problems.

But no

method, and no system of methods, works all the time. Therefore we often need to
use numerical methods, to calculate definite integrals approximately.
One way is the following. Suppose that we didn't know anything about derivatives, but we needed to find $\int_0^1 (1 - x^3)\,dx$ approximately. We might divide the interval [0, 1] into 10 subintervals of length 0.1, and add the areas of the circumscribed rectangles.

    i     x_i     y_i      a_i
    0     0.0     1.000    0.1000
    1     0.1     0.999    0.0999
    2     0.2     0.992    0.0992
    3     0.3     0.973    0.0973
    4     0.4     0.936    0.0936
    5     0.5     0.875    0.0875
    6     0.6     0.784    0.0784
    7     0.7     0.657    0.0657
    8     0.8     0.488    0.0488
    9     0.9     0.271    0.0271
                           ------
                           0.7975

Here the areas of the ten circumscribed rectangles are the numbers $a_i$, and their total area is 0.7975. This gives

$$A = \int_0^1 (1 - x^3)\,dx \approx 0.7975 = A_1.$$

The approximation $A \approx A_1$ is not very good: by an easy calculation based on Theorem 1, we get the exact answer

$$\int_0^1 (1 - x^3)\,dx = 0.7500.$$

We might also have used inscribed rectangles. Their total area would be

$$A_2 = 0.7975 - 0.1000 = 0.6975.$$

(Why?) The approximation $A \approx A_2$ is not very good either. But their average is considerably better:

$$A_3 = \tfrac{1}{2}(A_1 + A_2) = 0.7475 \approx 0.7500.$$

The sum $A_3$ has a geometric meaning: it is the sum of the areas of the inscribed trapezoids.
[Figure: the graph of f, with an inscribed trapezoid over each of the little intervals]

Over each of the little intervals, the area of the trapezoid is the average of the areas of the inscribed and circumscribed rectangles; and it is not hard to check that the same is true of the sums. This helps to explain why the approximation $A \approx A_3$ is reasonably good; we have approximated the graph of $f$ by an inscribed broken line, and used the area under the broken line as an approximation of the integral.
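The three approximations above are easy to reproduce by machine. The following is a minimal sketch for $f(x) = 1 - x^3$ on [0, 1] with ten subintervals; the names `A1`, `A2`, `A3` follow the text, and everything else is our own choice.

```python
def f(x):
    return 1 - x ** 3

n = 10                      # number of subintervals
width = 1 / n               # each of length 0.1
xs = [i * width for i in range(n + 1)]   # division points 0.0, 0.1, ..., 1.0

# f is decreasing on [0, 1], so the circumscribed rectangles take their
# heights at the left endpoints and the inscribed ones at the right endpoints.
A1 = sum(f(x) * width for x in xs[:-1])  # circumscribed rectangles
A2 = sum(f(x) * width for x in xs[1:])   # inscribed rectangles
A3 = (A1 + A2) / 2                       # inscribed trapezoids

print(round(A1, 4), round(A2, 4), round(A3, 4))
```

The difference $A_1 - A_2$ collapses to $0.1\,(f(0) - f(1))$, which is one way to answer the "(Why?)" above.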
In practice, however, nobody uses the approximation $A \approx A_3$, because there is another method which gives better results without any extra work. This method is Simpson's rule. The scheme is as follows.

Suppose that we have a function $f$, whose values we can compute, on an interval [a, b]. We find a quadratic function

$$g(x) = Ax^2 + Bx + C$$

which agrees with $f$ at $a$, at $b$, and at the midpoint of [a, b]; and we use the approximation

$$\int_a^b f(x)\,dx \approx \int_a^b g(x)\,dx.$$

Here by a quadratic function we mean a function given by a formula $Ax^2 + Bx + C$. We allow the case $A = 0$, and so the graph may turn out to be a line instead of a parabola. In any case, the integral on the right is easy to calculate: if

$$G(x) = \frac{A}{3}x^3 + \frac{B}{2}x^2 + Cx,$$

then $G' = g$, and so

$$\int_a^b g(x)\,dx = G(b) - G(a).$$

In the figure above, the approximation looks good, because the errors on the two halves of [a, b] seem to cancel each other out. Most of the time, we cut [a, b] into a certain number of little intervals $[a_i, a_{i+1}]$; we then use Simpson's rule on each of the little intervals, and add the results.

We shall now develop a shortcut formula for Simpson's rule, in a special case.

Theorem 2. Let $g$ be a quadratic function, and let $k$ be a positive number. Then

$$\int_{-k}^{k} g(x)\,dx = \frac{k}{3}(y_0 + 4y_1 + y_2),$$

where $y_0 = g(-k)$, $y_1 = g(0)$, and $y_2 = g(k)$.

Before proving that this formula is true, let us first check it, in a simple case, to make sure that it is not absurd. One of the possibilities is that $g(x) = 1$ for every $x$. In this case, the integral on the left is equal to $2k$. Thus our formula says that

$$2k = \frac{k}{3}(1 + 4 + 1),$$

which is correct. Any time you wonder whether you have remembered Simpson's rule correctly, you should check by this method; the check uncovers the most common errors in recollection.
We proceed to the proof. We have

$$g(x) = Ax^2 + Bx + C.$$

Let

$$G(x) = \frac{A}{3}x^3 + \frac{B}{2}x^2 + Cx,$$

so that $G' = g$. Then

$$\int_{-k}^{k} g(x)\,dx = G(k) - G(-k) = \tfrac{2}{3}Ak^3 + 2Ck.$$

(The algebra here is straightforward.) We need to express $A$ and $C$ in terms of $y_0$, $y_1$, $y_2$, and $k$. Evidently $C$ is no problem:

$$C = g(0) = y_1.$$

To find $A$, we use

$$y_0 = Ak^2 - Bk + C, \qquad y_2 = Ak^2 + Bk + C,$$

$$y_0 + y_2 = 2Ak^2 + 2C = 2Ak^2 + 2y_1.$$

We can now solve for $A$:

$$A = \frac{y_0 - 2y_1 + y_2}{2k^2}.$$

Our expressions for $A$ and $C$ now give

$$\int_{-k}^{k} g(x)\,dx = \tfrac{2}{3}Ak^3 + 2Ck = \frac{k}{3}(y_0 - 2y_1 + y_2) + 2ky_1 = \frac{k}{3}(y_0 + 4y_1 + y_2),$$

which was to be proved.

Let us try Simpson's rule on the function

$$f(x) = \frac{1}{x + 2}, \qquad -1 \le x \le 1.$$

Here we have

$$k = 1, \qquad y_0 = 1, \qquad y_1 = \tfrac{1}{2}, \qquad y_2 = \tfrac{1}{3}.$$

[Figure: graph of f(x) = 1/(x + 2) on [-1, 1] with its approximating parabola]

The rule gives

$$\int_{-1}^{1} \frac{dx}{x + 2} \approx \tfrac{1}{3}\left(1 + 2 + \tfrac{1}{3}\right) \approx 1.11.$$

Later, we shall find ways to calculate this integral as exactly as we please. It will then turn out that the right answer, correct to four decimal places, is 1.0986. In this case, the approximation is good, in spite of the length of the interval [-1, 1], because the portion of the graph of $f$ that we are dealing with is very close to its approximating parabola.
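Both claims made so far can be checked with exact rational arithmetic: that the three-point formula of Theorem 2 is exact for a quadratic, and that it gives 10/9 for $1/(x + 2)$. In this sketch the function name `simpson_symmetric` and the test quadratic $3x^2 - 2x + 5$ are our own illustrative choices.

```python
from fractions import Fraction

def simpson_symmetric(f, k):
    """The formula of Theorem 2: (k/3)(y0 + 4*y1 + y2) on the interval [-k, k]."""
    y0, y1, y2 = f(-k), f(0 * k), f(k)
    return k / 3 * (y0 + 4 * y1 + y2)

# A sample quadratic g and an antiderivative G of it (G' = g).
def g(x):
    return 3 * x ** 2 - 2 * x + 5

def G(x):
    return x ** 3 - x ** 2 + 5 * x

k = Fraction(2)
exact = G(k) - G(-k)               # the integral of g over [-2, 2], by Theorem 1
by_rule = simpson_symmetric(g, k)  # the same integral, by Theorem 2

# The example from the text: f(x) = 1/(x + 2) on [-1, 1], with k = 1.
approx = simpson_symmetric(lambda x: 1 / (x + 2), Fraction(1))

print(exact, by_rule, approx, float(approx))
```

Using `Fraction` rather than floating point means that the agreement for the quadratic is an exact equality, not an equality up to rounding.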

Let us now try

$$f(x) = \frac{1}{1 + x^2}, \qquad -1 \le x \le 1.$$

Here we have

$$k = 1, \qquad y_0 = \tfrac{1}{2}, \qquad y_1 = 1, \qquad y_2 = \tfrac{1}{2}.$$

The rule gives

$$\int_{-1}^{1} \frac{dx}{1 + x^2} \approx \tfrac{1}{3}\left(\tfrac{1}{2} + 4 + \tfrac{1}{2}\right) = \tfrac{5}{3} \approx 1.67.$$

Since

$$D \operatorname{Tan}^{-1} x = \frac{1}{1 + x^2},$$

the right answer is

$$\int_{-1}^{1} \frac{dx}{1 + x^2} = \operatorname{Tan}^{-1} 1 - \operatorname{Tan}^{-1}(-1) = \frac{\pi}{2} \approx 1.57.$$

Here the error is about 0.10, which is not very bad. To get better results, we need to cut up our intervals into smaller pieces. The first step in deriving the necessary formulas is to generalize Theorem 2, to take care of the case in which the origin is not necessarily the midpoint of the interval over which we are integrating.


Theorem 3. Let $g$ be a quadratic function, and let $k$ be a positive number. Then

$$\int_a^{a+2k} g(x)\,dx = \frac{k}{3}(y_0 + 4y_1 + y_2),$$

where

$$y_0 = g(a), \qquad y_1 = g(a + k), \qquad y_2 = g(a + 2k).$$

The easiest way to see this is to move the graph $a + k$ units to the left, so that the point $(a + k, 0)$ falls on the origin. When a parabola (or a line) is moved in this way, it is still a parabola (or a line); the integral does not change, and neither do the numbers $k$, $y_0$, $y_1$, and $y_2$. Therefore Theorem 3 is a consequence of Theorem 2.

Consider now a function $f$, on an interval [a, b]. We cut up the interval [a, b] into an even number $2n$ of little intervals, each of length

$$k = \frac{b - a}{2n}.$$

The division points are $x_0, x_1, \ldots, x_{2n}$, as shown in the figure for $n = 2$.

[Figure: graph of f over [a, b], with division points and ordinates y_0, y_1, ...]

On the interval $[x_0, x_2] = [a, a + 2k]$, Simpson's rule gives

$$\int_a^{a+2k} f(x)\,dx \approx \frac{k}{3}(y_0 + 4y_1 + y_2),$$

where $y_i = f(x_i)$ for each $i$. On the interval $[x_2, x_4] = [a + 2k, a + 4k]$ we get

$$\int_{a+2k}^{a+4k} f(x)\,dx \approx \frac{k}{3}(y_2 + 4y_3 + y_4).$$

For the $2n$ little intervals we have

$$\int_a^b f(x)\,dx \approx \frac{k}{3}(y_0 + 4y_1 + 2y_2 + 4y_3 + 2y_4 + \cdots + 4y_{2n-1} + y_{2n}).$$

This formula is the final form of Simpson's rule. Let us try it, with $k = 0.2$, to get a better approximation of

$$\int_{-1}^{1} \frac{dx}{x + 2}.$$

The computation looks like this:

     i     x_i      y_i
     0    -1.0    1.0000     1     1.0000
     1    -0.8    0.8333     4     3.3332
     2    -0.6    0.7143     2     1.4286
     3    -0.4    0.6250     4     2.5000
     4    -0.2    0.5555     2     1.1110
     5     0.0    0.5000     4     2.0000
     6     0.2    0.4545     2     0.9090
     7     0.4    0.4167     4     1.6668
     8     0.6    0.3846     2     0.7692
     9     0.8    0.3571     4     1.4284
    10     1.0    0.3333     1     0.3333
                                  -------
                                  16.4795

This gives

$$\int_{-1}^{1} \frac{dx}{x + 2} \approx \frac{0.2}{3}(16.4795) \approx 1.0986.$$

This answer is actually correct, to the fourth decimal place.

Obviously, however,

we must have been lucky: Simpson's rule is not supposed to be exact, and, besides,
we were carrying only four decimal places in the calculation.
When you use Simpson's rule, it is a good idea to use a table like the one shown
above. Make sure that the last entry in the fourth column of your table is 1 and not 2.
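The table can also be delegated to a short program. The sketch below implements the final form of Simpson's rule for $2n$ subintervals; the function name `simpson` and its signature are our own. It is applied first to the integral just computed, and then to $4\int_0^1 dx/(1 + x^2)$, whose exact value is π (since $\operatorname{Tan}^{-1} 1 = \pi/4$).

```python
import math

def simpson(f, a, b, n):
    """Composite Simpson's rule with 2n subintervals of length k = (b - a)/(2n)."""
    k = (b - a) / (2 * n)
    ys = [f(a + i * k) for i in range(2 * n + 1)]   # y_0, y_1, ..., y_2n
    # Coefficient pattern 1, 4, 2, 4, ..., 2, 4, 1: the first and last are 1.
    total = ys[0] + ys[-1]
    total += 4 * sum(ys[1:-1:2])    # y_1, y_3, ..., y_(2n-1)
    total += 2 * sum(ys[2:-1:2])    # y_2, y_4, ..., y_(2n-2)
    return k / 3 * total

# The worked example: 1/(x + 2) on [-1, 1] with k = 0.2, that is, n = 5.
log_estimate = simpson(lambda x: 1 / (x + 2), -1.0, 1.0, 5)

# An approximation of pi, with k = 0.1 on [0, 1].
pi_estimate = 4 * simpson(lambda x: 1 / (1 + x * x), 0.0, 1.0, 5)

print(log_estimate, pi_estimate, math.pi)
```

Because the program carries full floating-point precision rather than four decimal places, its value for the first integral may differ from the table's in the last digit shown.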
We have postponed until now the presentation of Simpson's rule, because this is the first point at which we can do something interesting with it. The interesting thing is as follows. We know that

$$\int_0^1 \frac{dx}{1 + x^2} = \operatorname{Tan}^{-1} 1 - \operatorname{Tan}^{-1} 0 = \frac{\pi}{4}.$$

Therefore

$$\pi = 4\int_0^1 \frac{dx}{1 + x^2}.$$

Applying Simpson's rule, we can thus get a numerical approximation of π. This is Problem 1 below.

PROBLEM SET 4.8

1. Apply Simpson's rule to the function

$$f(x) = \frac{1}{1 + x^2} \qquad (0 \le x \le 1),$$

with k = !. Check your answer against what people have been telling you about π. If you want to use k = 0.1, to get a more exact approximation, it might occur to you to use a slide rule to calculate the $y_i$'s. Would this be a good idea? Why or why not? How about five-place log tables?

2. Apply Simpson's rule to the function

$$f(x) = 3x^3 - 5x^2 + 1 \qquad (-2 \le x \le 2),$$

with k = 2. Then calculate $\int_{-2}^{2} f(x)\,dx$ exactly, and compute the error in your approximation.

3. Apply Simpson's rule to the function

$$f(x) = x^3 + x^2 - 17 \qquad (-100 \le x \le 100),$$

with k = 100. Then calculate the integral exactly, and compute the error in the approximation.

4. Apply Simpson's rule to

$$f(x) = x^3 - 2x + 3 \qquad (-1 \le x \le 1),$$

with k = 1. Compute the error.

5. Apply Simpson's rule to

$$f(x) = x^4 - 2x + 3,$$

over the same interval as in Problem 4, using the same k, and compute the error.

6. There ought to be a theorem which accounts for some of the results that you have been getting. State and prove the theorem.

7. Apply Simpson's rule to the function

$$f(x) = 1 - x^3,$$

on the interval [0, 1], using k = 0.1. (This is the integral which we investigated in the text above, using inscribed rectangles, circumscribed rectangles, and finally trapezoids.)
*8. Given a positive number $k$ and numbers $y_0$, $y_1$, and $y_2$, write an explicit formula for a quadratic function $g$ such that $g(-k) = y_0$, $g(0) = y_1$, and $g(k) = y_2$. That is, write an expression of the form

$$g(x) = Ax^2 + Bx + C,$$

in which the coefficients $A$, $B$, and $C$ are expressed algebraically in terms of $k$, $y_0$, $y_1$, and $y_2$.

*9. Does the theorem that you proved in Problem 6 hold only on intervals of the type [-k, k], or does it hold on any interval [a, b]? Proof or refutation?

After finishing Problem 1, you may want to try a smaller k, to get a better approximation of π. As a check,

$$\pi = 3.14159265,$$

correct to eight decimal places.

In Appendix F, at the end of the book, you will find a theorem which enables us, under some conditions, to set a limit on the error in Simpson's rule.

4.9

EXPONENTIALS AND LOGARITHMS

For the case in which the exponents are positive integers, exponentials are part of elementary algebra. We begin with:

Definition. For each positive integer $n$,

$$x^n = x \cdot x \cdots x \qquad (\text{to } n \text{ factors}).$$

It is then easy to see, simply by counting factors on the left and on the right, that the familiar laws of exponents hold:

$$x^m x^n = x^{m+n}, \tag{A}$$

$$(x^m)^n = x^{mn}. \tag{B}$$

If $n$ is a negative integer, then $-n > 0$, and for $x \ne 0$ we define

$$x^n = \frac{1}{x^{-n}}.$$

Thus, for example, for $n = -3$ we have

$$x^{-3} = \frac{1}{x^{-(-3)}} = \frac{1}{x^3}.$$

For $x \ne 0$, we define $x^0 = 1$.

It can be shown that, if $x \ne 0$, then formulas (A) and (B) hold for all integers $m$ and $n$.
When the exponents are allowed to range over all real numbers, exponentials cease to be part of elementary algebra. In this section we shall state the facts about exponentials and logarithms, but will make no attempt to verify them. (In the following two sections, we shall see how these facts fit together to make a logical theory.) We begin with a positive base and a rational exponent.

1) Suppose that $a > 0$, and that $x$ is a rational number $p/q$ (where $p$ and $q$ are integers and $q > 0$). We want to define $a^x = a^{p/q}$ in such a way that (A) and (B) will continue to hold. For (B) to hold, we must have

$$(a^{p/q})^q = a^p.$$

That is, $a^{p/q}$ must be the $q$th root of $a^p$. Hence the following:

Definition. If $a > 0$ and $q > 0$, then

$$a^{p/q} = \sqrt[q]{a^p}.$$

Here we cannot allow the case $a < 0$. For $a = -1$ we would get

$$(-1)^{1/3} = \sqrt[3]{-1} = -1, \qquad (-1)^{2/6} = \sqrt[6]{(-1)^2} = \sqrt[6]{1} = 1.$$

Thus, for $a < 0$, $a^{p/q}$ would depend not merely on the number that we use as an exponent but also on the notation in which the number is expressed. This would lead to nothing but trouble.
2) It is a fact that for $a > 0$, and $x$ and $y$ rational, the following laws hold:

$$a^x \cdot a^y = a^{x+y}, \tag{A}$$

$$(a^x)^y = a^{xy}, \tag{B}$$

$$a^0 = 1. \tag{C}$$

3) The rational numbers on the x-axis do not fill up the x-axis, because every interval, however short, contains irrational numbers. Therefore the set $Q$ of all rational numbers forms a sort of infinitely dotted line. So far, the function $f(x) = a^x$ has been defined only for rational values of $x$. Therefore $f$ is a function $Q \to R^+$, and the graph is an infinitely dotted curve, as in the figure below. Note that $f(x) > 0$ for every $x$, because $a^x = a^{p/q}$, which is the positive $q$th root of the positive number $a^p$.

[Figure: dotted graphs of f(x) = a^x, x in Q, for a > 1 and for a < 1; f: Q → R+]

It is a fact that the definition of this function can be extended so as to give a new function

$$f: R \to (0, \infty), \qquad a^x > 0,$$

defined for every $x$, such that $f$ is continuous and satisfies (A), (B), and (C). For $a = 1$, we have $f(x) = 1^x = 1$ for every $x$. But for $a > 0$ and $a \ne 1$, $f$ is invertible.

4) We now define $\log_a$ as the inverse of $f(x) = a^x$ $(a \ne 1)$. That is,

$$y = \log_a x \iff a^y = x,$$

by definition. The image of the exponential function includes all positive numbers.

Therefore the domain of its inverse includes all positive numbers, and we have a function

$$\log_a: (0, \infty) \to R.$$

Logarithms to the base $a$ obey the following laws:

$$\log_a xy = \log_a x + \log_a y, \tag{A$'$}$$

$$\log_a b^x = x \log_a b \qquad (b > 0,\ b \ne 1), \tag{B$'$}$$

$$\log_a 1 = 0. \tag{C$'$}$$

In fact, these are derivable from (A), (B), and (C).

Since the logarithm and exponential are inverses of each other, we have

$$\log_a a^x = x, \qquad a^{\log_a x} = x.$$

And the graph of either of these functions is the reflection of the graph of the other across the line $y = x$.

[Figure: graphs of y = f(x) = a^x, a > 1, and of its inverse log_a, reflected across the line y = x]

5) We now want to calculate the derivative of the logarithm. By definition,

$$\log_a' x_0 = \lim_{\Delta x \to 0} \frac{\log_a (x_0 + \Delta x) - \log_a x_0}{\Delta x}.$$

Using the laws (A'), (B'), and (C'), we express this in the form

$$\lim_{\Delta x \to 0} \frac{1}{\Delta x} \log_a \frac{x_0 + \Delta x}{x_0} = \lim_{\Delta x \to 0} \log_a \left(1 + \frac{\Delta x}{x_0}\right)^{1/\Delta x} = \lim_{\Delta x \to 0} \frac{1}{x_0} \log_a \left[\left(1 + \frac{\Delta x}{x_0}\right)^{x_0/\Delta x}\right].$$

Let

$$h = \frac{\Delta x}{x_0}.$$

Since $x_0$ is fixed, $h \to 0$ as $\Delta x \to 0$. This gives

$$\log_a' x_0 = \frac{1}{x_0} \lim_{h \to 0} \log_a (1 + h)^{1/h}.$$

If $\log_a$ has a derivative, then the limit on the right-hand side exists, and conversely. Suppose that $(1 + h)^{1/h}$ approaches a limit, and call this limit $e$. Thus

$$e = \lim_{h \to 0} (1 + h)^{1/h}.$$

Suppose that $\log_a$ is continuous, so that the limit of the logarithm is the logarithm of the limit. Then

$$\log_a' x_0 = \frac{1}{x_0} \log_a \lim_{h \to 0} (1 + h)^{1/h} = \frac{1}{x_0} \log_a e.$$

Thus

$$D \log_a x = \frac{1}{x} \log_a e.$$

Since $e^1 = e$, we have $\log_e e = 1$; and so for $a = e$ our differentiation formula takes the form

$$D \log_e x = \frac{1}{x}.$$

Considering the complications of the preceding discussion, the simplicity of this formula is surprising; it is also important (see the next section).
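Although nothing in this section is proved, the two key statements of step 5) can at least be made plausible numerically. The sketch below tabulates $(1 + h)^{1/h}$ for small $h$ and compares a difference quotient of $\log_{10}$ at $x_0 = 2$ with the predicted value $(1/x_0)\log_{10} e$. The base 10, the point 2, and the step sizes are arbitrary choices of ours.

```python
import math

# (1 + h)**(1/h) for h approaching 0 from both sides.
hs = [0.1, 0.01, 0.001, 1e-6, -1e-6, -0.001, -0.01, -0.1]
values = [(1 + h) ** (1 / h) for h in hs]
for h, v in zip(hs, values):
    print(f"h = {h:>8}: (1 + h)^(1/h) = {v:.6f}")

# Difference quotient of log_a at x0, against (1/x0) * log_a e.
a, x0, dx = 10.0, 2.0, 1e-6
quotient = (math.log(x0 + dx, a) - math.log(x0, a)) / dx
predicted = (1 / x0) * math.log(math.e, a)
print(quotient, predicted)
```

The tabulated values approach 2.71828... from below for positive $h$ and from above for negative $h$, which is what one would expect if the limit exists.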
6) On the basis of the preceding formulas, it is easy to find $Da^x$, for $a \ne 1$. (For $a = 1$, we have no problem.) Let

$$f(x) = \log_a x,$$

so that

$$f^{-1}(x) = a^x.$$

The general formula

$$f(f^{-1}(x)) = x$$

thus takes the form

$$\log_a a^x = x.$$

Since

$$D_u \log_a u = \frac{1}{u} \log_a e,$$

the chain rule gives:

$$\frac{1}{a^x} (\log_a e)\, Da^x = 1.$$

Therefore

$$Da^x = \frac{1}{\log_a e}\, a^x.$$

In particular, for $a = e$ we have

$$De^x = e^x.$$
A final simplification: we assert that

$$\frac{1}{\log_a e} = \log_e a.$$

Proof. Let

$$x = \log_a e, \qquad y = \log_e a.$$

Then

$$a^x = e, \qquad e^y = a,$$

by the definition of the logarithm. Therefore

$$(a^x)^y = e^y, \qquad \text{and} \qquad a^{xy} = a.$$

This holds when $xy = 1$. Since the exponential function is invertible, it cannot take on the same value twice. Therefore the equation can hold only when $xy = 1$. Therefore

$$\frac{1}{x} = y,$$

which was to be proved. Thus we can write

$$Da^x = a^x \log_e a.$$

This is better, not just because it avoids a fraction, but also because $e$ is one of the two bases for which tables of logarithms are published.
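Both the identity $1/\log_a e = \log_e a$ and the formula $Da^x = a^x \log_e a$ lend themselves to a quick numerical check. In the sketch below the base $a = 2$ and the point $x = 1.5$ are arbitrary choices of ours, and `math.log(t, a)` is used for $\log_a t$.

```python
import math

a, x, h = 2.0, 1.5, 1e-6

# Symmetric difference quotient for the derivative of a**x at x.
quotient = (a ** (x + h) - a ** (x - h)) / (2 * h)

# The formula from the text: D a^x = a^x * log_e a.
formula = a ** x * math.log(a)

# The identity 1 / log_a e = log_e a.
reciprocal = 1 / math.log(math.e, a)

print(quotient, formula)
print(reciprocal, math.log(a))
```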

Throughout the following problem set you may assume that the statements

made in this section are true. (They will be proved in the following two sections.)
For convenience of reference, we give a summary.

Laws of Exponentials ($a > 0$)

a) $a^x a^y = a^{x+y}$,
b) $(a^x)^y = a^{xy}$,
c) $a^0 = 1$,
d) $a^x > 0$ for every $x$,
e) $Da^x = a^x \log_e a$,
f) $e = \lim_{h \to 0} (1 + h)^{1/h}$.

Laws of Logarithms ($a > 0$, $a \ne 1$)

g) $\log_a xy = \log_a x + \log_a y$,
h) $\log_a b^x = x \log_a b$ $(b > 0)$,
i) $\log_a 1 = 0$,
j) $D \log_a x = 1/(x \log_e a)$,
k) $\log_a a^x = x$,
l) $a^{\log_a x} = x$.

PROBLEM SET 4.9

Find the first and second derivatives of the following functions.

x
e"' cos x

3.

5.
8.

[loge x]2

9. log. xsoo

2.

4.

7. log. x2

xe2"'
iex(sin x

xex
e'" sin x

1. x loge

6.

+cos x)

0),


10. esin a:
13. [log. e]"'
16. e "'
19. log. (1 - x)
22. el-X

!h

25. log. sin x


28. log.

12. log. e"'


15. x2e "'

11. 10"'
14. elog, "'
2
17. xe"'

1 -x

20. log. (e"'

18. log. (x3)


21. e"'-1

1)

23. log. sec x

24. e"'

26. log. cos x

27. $\log_e (\sec x + \tan x)$

29. $\log_e (\csc x + \cot x)$

30. log. (x v x2 1)

31. Show that for every $x > 0$,

$$\log_e x = \int_1^x \frac{1}{t}\,dt.$$

32. Show that if $a$ and $b$ are positive and different from 1, then

$$\log_b a = \frac{1}{\log_a b}.$$

33. Show that, under the same conditions,

$$\log_b x = (\log_a x)(\log_b a).$$

34. Show that, if a and b are positive and different from 1, then
[Hint: What is a10g b, and why?]
35. The function

$$f(x) = e^x$$

has the property of being its own derivative. But it is not the only such function, because for every $k$, $g(x) = ke^x$ has the same property. We have, however, the following theorem.

Theorem. If $g'(x) = g(x)$, on $(-\infty, \infty)$, then there is a constant $k$ such that

$$g(x) = ke^x$$

for every $x$. That is, $g(x)/e^x$ is a constant. Prove this.

36. Show that the function $f(x) = e^x$ is completely described by the conditions

$$f'(x) = f(x) \qquad (-\infty < x < \infty), \tag{1}$$

$$f(0) = 1. \tag{2}$$

That is, show that (1) and (2) imply that $f(x) = e^x$ for every $x$.

37. Show that the function $f(x) = e^{-x}$ is completely determined by the conditions

$$f'(x) = -f(x) \qquad (-\infty < x < \infty), \tag{1}$$

$$f(0) = 1. \tag{2}$$

That is, show that (1) and (2) imply that $f(x) = e^{-x}$ for every $x$.

4.10

THE FUNCTIONS ln AND exp

In the preceding section, we gave a sketch of the way that logarithms and exponentials ought to behave, postponing both the proofs and also the basic definitions. We shall now fill these gaps.

If you review the formulas of the preceding section, you will see that after considerable complications in the middle, we got a formula that looked simple:

$$D \log_e x = \frac{1}{x}.$$

This enables us to write a formula for $\log_e$:

$$\log_e x = \int_1^x \frac{1}{t}\,dt.$$

If the theory works, then this formula must be right: the functions on the two sides of the equation have the same derivative (namely, $1/x$), and they have the same value at $x = 1$ (namely, 0); and so it follows by the uniqueness theorem that they are the same function.

We shall use the function $\int_1^x (1/t)\,dt$ as the foundation of the theory of exponentials and logarithms. The scheme is to investigate the function $\int_1^x (1/t)\,dt$, learn its properties, and then define all our other functions in terms of it. Thus, at the beginning, we shall investigate $\int_1^x (1/t)\,dt$ without assuming that we know anything about logarithms, or about exponentials, or about the number $e$. To emphasize that we are starting afresh, we give the function a new name ln. (Here ln is suggested by natural logarithm.) And the official theory begins with the definition of ln in terms of an integral:

Definition. For each $x > 0$,

$$\ln x = \int_1^x \frac{dt}{t}.$$

Soon we shall show that every real number $y$ is equal to $\ln x$ for some $x$. For this purpose we shall need:

Theorem 1 (The no-jump theorem). If $f$ is continuous on $[x_1, x_2]$, then $f$ takes on every value between $f(x_1)$ and $f(x_2)$.

That is, if $f(x_1) < k < f(x_2)$, then there is an $x$, between $x_1$ and $x_2$, such that $f(x) = k$. And if $f(x_2) < k < f(x_1)$, then the same conclusion follows. This theorem will be proved in the next chapter.


Our first few theorems on the function ln are easy.

Theorem 2. $D \ln x = 1/x$.

Proof. This follows from the definition of ln and the formula for the derivative of the integral.

Theorem 3. $\ln 1 = 0$.

This is obvious.

Theorem 4. For every $k, x > 0$, $D_x \ln kx = 1/x$.

Proof. By the chain rule,

$$D_x \ln kx = \frac{1}{kx} \cdot k = \frac{1}{x}.$$

Theorem 5. For every $a, b > 0$, $\ln ab = \ln a + \ln b$.

Proof. The trouble with this theorem is that it does not appear to involve any functions. To prove it, we first restate it, using $k$ for $a$ and $x$ for $b$. It then says that for every $k, x > 0$,

$$\ln kx = \ln k + \ln x.$$

The proof is now as follows. Let

$$f(x) = \ln kx, \qquad g(x) = \ln k + \ln x.$$

Then

$$f'(x) = \frac{1}{x} = g'(x), \qquad f(1) = \ln k,$$

and

$$g(1) = \ln k + \ln 1 = \ln k + 0 = \ln k.$$

By the uniqueness theorem, $f(x) = g(x)$ for every $x$.
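Since ln is now defined purely as an integral, its values can be approximated with Simpson's rule from Section 4.8, without any appeal to logarithm tables. The sketch below does this and then tests Theorems 3 and 5 on sample numbers. The function name `ln`, the default of 1000 pairs of subintervals, and the sample values 2 and 3 are our own choices; `math.log` is used only as an outside check.

```python
import math

def ln(x, n=1000):
    """Approximate the integral of 1/t from 1 to x (x > 0) by Simpson's rule
    with 2n subintervals. For x < 1 the step k is negative, so the result is
    negative, as it should be for an integral taken from right to left."""
    k = (x - 1) / (2 * n)
    ys = [1 / (1 + i * k) for i in range(2 * n + 1)]
    total = ys[0] + ys[-1] + 4 * sum(ys[1:-1:2]) + 2 * sum(ys[2:-1:2])
    return k / 3 * total

print(ln(1))                   # Theorem 3: ln 1 = 0
print(ln(6), ln(2) + ln(3))    # Theorem 5: ln ab = ln a + ln b
print(ln(0.5), -ln(2))         # negative values for x < 1
print(ln(2), math.log(2))      # comparison with the library logarithm
```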


[Figure: graph of y = ln x]

We now want to show that the graph of ln looks approximately like the drawing above. The figure suggests that $\ln 1 = 0$ and $\ln' 1 = 1$. These things we already know. Other things suggested by the figure are conveyed by some of the following theorems.

Theorem 6. ln is invertible.

Proof. We know that $\ln' x = 1/x$; and $1/x \ne 0$ for every $x$. Therefore $\ln' x \ne 0$ for every $x$. By Theorem 6 of Section 4.7, ln is invertible.

Theorem 7. For every $x > 0$, and every positive integer $n$,

$$\ln x^n = n \ln x.$$

Proof. Obviously this formula holds when $n = 1$. And if it holds for any particular integer $n$, then it holds for the next integer $n + 1$. Proof:

$$\ln x^{n+1} = \ln (x \cdot x^n) = \ln x + \ln x^n = \ln x + n \ln x = (n + 1) \ln x.$$

Therefore, by induction, we have $\ln x^n = n \ln x$ for every $n$, which was to be proved.

A number $M$ is called an upper bound for a function $f$ if

$$f(x) \le M \qquad \text{for every } x.$$

If there is such a number $M$, then we say that $f$ is bounded above. (For example, the sine is bounded above, because $\sin x \le 1$ for every $x$.) If no such number $M$ exists, then we say that $f$ is unbounded above. (For example, if $f(x) = x$, for every $x$, then $f$ is unbounded above.)

Theorem 8. ln is unbounded above.

Proof. In Theorem 7, take $x = 2$. Then

$$\ln 2^n = n \ln 2,$$

for every $n$. And $\ln 2 > 0$, because $\ln 2$ is the area of a region. Therefore ln cannot have an upper bound; no number $M$ is greater than or equal to all of the numbers $n \ln 2$, because

$$n \ln 2 > M \qquad \text{whenever} \qquad n > \frac{M}{\ln 2}.$$

Theorem 9. $\ln (1/x) = -\ln x$.

Proof.

$$\ln \frac{1}{x} + \ln x = \ln \left(\frac{1}{x} \cdot x\right) = \ln 1 = 0;$$

and from this the theorem follows.

Theorem 10. $\ln 2^{-n} = -n \ln 2$.

Proof. By Theorems 7 and 9.

A number $m$ is called a lower bound of a function $f$ if $m \le f(x)$ for every $x$. If there is such a number $m$, then we say that $f$ is bounded below. (For example, the sine is bounded below, because $-1 \le \sin x$ for every $x$.) If no such number $m$ exists, then we say that $f$ is unbounded below. (For example, if $f(x) = x$, then $f$ is unbounded below.)

Theorem 11. The function ln is unbounded below.

Because no number $m$ is less than or equal to all the numbers $\ln 2^{-n} = -n \ln 2$.

Theorem 12. Every real number is a value of the function ln. That is, every number $y$ is equal to $\ln x$ for some $x > 0$.

Proof. Since ln is unbounded both above and below, it follows that every number $y$ lies between two values of ln. If $\ln x_1 < y < \ln x_2$, then it follows by Theorem 1 that $y = \ln x$ for some $x$.

Thus the image of the function ln is the entire interval $R = (-\infty, \infty)$.

We know by Theorem 6 that ln is invertible. Its inverse will be denoted by exp. That is:

Definition. $\exp = \ln^{-1}$.

Since $\ln x$ will turn out to be $\log_e x$, this means that $\exp x$ will turn out to be $e^x$. But we should not use the notation $e^x$, at this stage, because we have not yet defined $e$ in the present treatment.


The graphs of ln and exp are shown in the figure opposite. Since exp and ln are inverses of each other, we have:

Theorem 13. $\exp \ln x = x$.

Theorem 14. $\ln \exp x = x$.

These are instances of the general rule

$$f^{-1}(f(x)) = f(f^{-1}(x)) = x.$$

As always, for functions which are inverses of one another, the image of exp is the domain of ln. Therefore

Theorem 15. $\exp x > 0$ for every $x$.

[Figure: graphs of exp and ln, reflections of each other across the line y = x]

This theorem is also easy to see graphically, in the figure above. The graph of ln lies to the right of the y-axis. Reflecting this graph across the line $y = x$, we get the graph of exp. Therefore the graph of exp lies above the x-axis.

Theorem 16. $\exp 0 = 1$.

Because $\ln 1 = 0$.

Theorem 17. $\exp (k + x) = (\exp k)(\exp x)$ for every $k$ and $x$.

Proof. Both sides of the equation have the same ln:

$$\ln \exp (k + x) = k + x,$$

because ln and exp are inverses of each other. And

$$\ln [(\exp k)(\exp x)] = \ln \exp k + \ln \exp x = k + x.$$

Since ln never takes on the same value twice, the theorem follows.

Theorem 18. $\exp' = \exp$.

Proof. We know that $\ln \exp x = x$ for every $x$. Since $\ln' u = 1/u$, the chain rule gives
$$(\ln' \exp x) \exp' x = 1, \qquad \frac{1}{\exp x} \exp' x = 1.$$
Therefore $\exp' x = \exp x$ for every $x$; that is, $\exp' = \exp$, which was to be proved.

We now have functions $\ln x$ and $\exp x$ which have the properties that the functions $\log_e x$ and $e^x$ are expected to have. A natural next step is to find a number $e$, and define the exponential function $e^x$, in such a way that $\exp x = e^x$. We shall do this in the next section. Meanwhile we give a quick summary of this one.

Definitions

a) $\ln x = \int_1^x \frac{dt}{t}$ $(x > 0)$,
b) $\exp = \ln^{-1}$.


Laws for ln

c) $\ln 1 = 0$,
d) $\ln x^n = n \ln x$ $(x > 0)$,
e) $\ln kx = \ln k + \ln x$ $(k, x > 0)$,
f) $\ln' x = 1/x$ $(x > 0)$.

Laws for exp

g) $\exp 0 = 1$,
h) $(\exp k)(\exp x) = \exp (k + x)$,
i) $\exp x > 0$ for every $x$,
j) $\exp \ln x = x$ $(x > 0)$,
k) $\ln \exp x = x$ for every $x$,
l) $\exp' = \exp$.
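The definitions in this section can be tried out numerically. The following is only a rough sketch, not part of the theory: it approximates ln by Simpson's rule applied to the integral of 1/t, and it computes exp by bisection, using nothing except the fact that ln is increasing and unbounded in both directions. The names `ln_integral` and `exp_inverse`, the number of subintervals, and the tolerance are illustrative assumptions.

```python
import math

def ln_integral(x, n=1000):
    """Approximate ln x = integral from 1 to x of dt/t by Simpson's rule (n even)."""
    if x <= 0:
        raise ValueError("ln is defined only for x > 0")
    h = (x - 1.0) / n               # h is negative when 0 < x < 1
    total = 1.0 + 1.0 / x           # values of 1/t at the two endpoints
    for i in range(1, n):
        t = 1.0 + i * h
        total += (4.0 if i % 2 else 2.0) / t
    return total * h / 3.0

def exp_inverse(y, tol=1e-10):
    """Solve ln x = y for x by bisection (ln is increasing, so the root is unique)."""
    lo, hi = 1.0, 1.0
    while ln_integral(hi) < y:      # possible because ln is unbounded above
        hi *= 2.0
    while ln_integral(lo) > y:      # possible because ln is unbounded below
        lo /= 2.0
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if ln_integral(mid) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Law (h): (exp k)(exp x) = exp(k + x), checked at one pair of values.
product = exp_inverse(0.5) * exp_inverse(0.7)
direct = exp_inverse(1.2)
```

Comparing `ln_integral(2)` with a library logarithm, or `product` with `direct`, gives a quick check on laws (a) through (l); agreement to several decimal places is all that such a sketch can be expected to show.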

PROBLEM SET 4.10


Some of the problems below are to be solved by any method that works, including methods based on the unproved results of Section 4.9. Some, however, are supposed to be worked strictly on the basis of the theory developed in this section; and these are stated in the notation of ln and exp. Thus, if the problem uses the notation $a^x$, $\log_a x$, then the solution may use the theory in Section 4.9; but if the problem uses ln, exp, then the solution should also.

Find the derivatives of the following functions:

1.

ln2

5. exp

x2

9.

exp sin

13.

In sin x

2. In In

3.

1)

4.

In

6.

x]2

7. exp (2 In

x)

8.

In (exp

(x In x)

1 2.

e"' log,

16.

(sin

[exp

10.

sin (exp

14.

sin In

x)

In

(x2

1 1.

exp

15.

X"'

(X >

0)

x2

+ 1

x2)

x)sin x

(sin x

>

0)

17. We found that the function
$$\ln x = \int_1^x \frac{dt}{t}$$
was unbounded above. Is this true also for
$$f(x) = \int_1^x \frac{dt}{\sqrt{t}}\,?$$
Why or why not?

18. How about the function
$$g(x) = \int_1^x \frac{dt}{t^2}\,?$$

19. Given
$$h(x) = \int_1^{x^2} \frac{dt}{\sqrt{t}} \qquad (0 < x < \infty),$$
find $h'(x)$, by any method. Note, however, that you are not being asked to calculate $h$; you are being asked only to calculate $h'$.

20. Given
$$f(x) = \int_0^{\sin x} \sqrt{1 + t^2}\, dt,$$
find $f'(x)$.

Exponentials and Logarithms.

4.11
1

2 . Given

f(x)

fanx
0

find f'(x).

22. Given

g(x)

findg'(x).

23. Given

h(x)

=exp

find h'(x).
24. Find
lim

v1 +

The Existence of

197

t2dt,

fxdtt'
i

(f' dt),

sinx + 1

x3"/2 X-37TI2

(By far the easiest way to solve this problem is to think of a geometric meaning for it.)
25. Find
28\ Find

Jim

tanx-1
tanx + 1

x-rr/4 X +

I4 .

1T

--

--

X-+1if4 X-7T14 .

Jim

In x2
lnx
27. Find Jim
26. Find Jim
.
1.
x-..1X x1X - 1
exp ( 2x)- l
29. Find Jim
xo

30. Using Simpson's rule, compute an approximation of ln 2. To four decimal places, the right answer is 0.6931; and if you cut up the interval [1, 2] into ten parts, you get a good approximation.

31. Show that for $x \ge 1$, $\ln x \le x - 1$. Show that for $0 < x < 1$, the same inequality holds.

32. Given $f(x) = x - 1$, find a formula for $f^{-1}(x)$, and sketch both functions on the same set of axes. Show that $\exp x \ge x + 1$, for every $x$. [Hint: Try to use a known property of ln.]

33. Let $k = \ln^{-1} 1$. Show that $k > 2$.

34. Show that $k < 4$.


4.11 EXPONENTIALS AND LOGARITHMS. THE EXISTENCE OF e

In Section 4.9, we wanted to define $e$ as the limit of the function
$$f(h) = (1 + h)^{1/h},$$
as $h \to 0$. To investigate this limit, we first need a proper definition of the function $(1 + h)^{1/h}$. Since the exponent $1/h$ varies continuously through real values, we need a definition of the exponential $a^x$, where $a > 0$ and $x$ is not necessarily rational.

The right definition is not hard to find. We know that if $n$ is a positive integer, then
$$\ln a^n = n \ln a.$$


Therefore
$$a^n = \exp (n \ln a).$$
If the laws of logarithms hold for all exponents, then
$$\ln a^x = x \ln a,$$
which gives
$$a^x = \exp (x \ln a).$$
We take this last equation as our definition of the exponential function $a^x$. Thus:

Definition. For $a > 0$, $a^x = \exp (x \ln a)$.

This gives:

Theorem 1. $\ln a^x = x \ln a$.
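As an illustration only, the definition can be written out as a short Python function. The name `power` is a hypothetical one chosen here, and the library functions `math.exp` and `math.log` stand in for exp and ln.

```python
import math

def power(a, x):
    """Return a**x computed as exp(x ln a), following the definition for a > 0."""
    if a <= 0:
        raise ValueError("the definition requires a > 0")
    return math.exp(x * math.log(a))

# For a positive integer exponent the result agrees (up to rounding)
# with repeated multiplication, and for x = 1/2 it agrees with the square root.
integer_case = power(2.0, 10)     # close to 1024
root_case = power(9.0, 0.5)       # close to 3
```

The point of the sketch is that a single formula covers integer, rational, and irrational exponents alike.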

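We are now ready to show that $(1 + h)^{1/h}$ approaches a limit, as $h \to 0$. Let
$$f(x) = (1 + x)^{1/x}.$$
Then
$$\ln f(x) = \frac{1}{x} \ln (1 + x).$$
We now replace $x$ by $\Delta x$, and observe that
$$\ln f(\Delta x) = \frac{1}{\Delta x} \ln (1 + \Delta x) = \frac{\ln (1 + \Delta x) - \ln 1}{\Delta x}.$$
This last fraction is the fraction whose limit is $\ln' 1$, by definition of the derivative. Therefore
$$\lim_{\Delta x \to 0} \ln f(\Delta x) = \ln' 1 = \frac{1}{1} = 1,$$
and
$$\lim_{\Delta x \to 0} f(\Delta x) = \lim_{\Delta x \to 0} \exp \ln f(\Delta x) = \exp \lim_{\Delta x \to 0} \ln f(\Delta x) = \exp 1 = \ln^{-1} 1.$$
Replacing $\Delta x$ by $h$ again, we have
$$\lim_{h \to 0} (1 + h)^{1/h} = \exp 1 = \ln^{-1} 1.$$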
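The convergence just proved can be watched numerically. The sketch below simply evaluates $(1 + h)^{1/h}$ for a few small values of $h$ on both sides of 0; the function name `f` follows the text, while the sample values of $h$ are arbitrary choices.

```python
import math

def f(h):
    """Evaluate (1 + h)**(1/h) for h != 0 with 1 + h > 0."""
    return (1.0 + h) ** (1.0 / h)

# Values from the right increase toward the limit; values from the left decrease toward it.
from_right = [f(h) for h in (0.1, 0.01, 0.001)]
from_left = [f(-h) for h in (0.1, 0.01, 0.001)]
```

Both lists close in on the same number, which is exp 1. A table of this kind suggests the limit but does not prove that it exists; the proof is the argument given above.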

Now that we know the limit exists, we can use it as a definition of $e$:

Definition. $e = \lim_{h \to 0} (1 + h)^{1/h}$.

And we know:

Theorem 2. $e = \exp 1 = \ln^{-1} 1$.

This theorem has a geometric interpretation.


That is, $e$ is the number such that
$$\int_1^e \frac{1}{t}\, dt = 1.$$
Later we will find an efficient method of calculating $e$. In fact,
$$e = 2.7182818,$$
correct to seven decimal places. It will turn out that
$$e = 1 + \frac{1}{1!} + \frac{1}{2!} + \cdots + \frac{1}{n!} + \cdots,$$
where $n! = 1 \cdot 2 \cdots n$. The series on the right is infinite, but the terms diminish so rapidly as $n$ increases that we get good numerical approximations by using the first few terms.
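To see how quickly the terms diminish, one can add up the first few of them. This is a small sketch; the function name `e_partial_sum` is a hypothetical one, and the series itself is taken on trust from the statement above, since it has not yet been proved at this point.

```python
import math

def e_partial_sum(n):
    """Return 1 + 1/1! + 1/2! + ... + 1/n!."""
    total, term = 1.0, 1.0
    for k in range(1, n + 1):
        term /= k            # term is now 1/k!
        total += term
    return total

approximations = [e_partial_sum(n) for n in (2, 5, 10)]
```

With ten terms after the leading 1, the partial sum already agrees with $e$ to seven decimal places.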
We expected $\exp x$ to be $e^x$. We can now show that this is true:
$$e^x = \exp (x \ln e),$$
by definition of $e^x$; and since $\ln e = 1$, we have:

Theorem 3. $e^x = \exp x$, for every $x$.

As before, we define the logarithm as the inverse of the exponential. That is,
$$y = \log_a x \iff a^y = x.$$
Since $e^x$ and $\exp x$ are the same function, they have the same inverse. Therefore we have:

Theorem 4. $\log_e x = \ln x$, for every $x > 0$.

Thus exp really is an exponential, and ln really is a logarithm. Once we know the laws governing $e^x$ and $\log_e x$, it is easy to derive the laws governing other positive bases. The first step is to express $\log_a$ in terms of ln. We recall that for $a > 0$,
$$a^y = \exp (y \ln a),$$


by definition. Therefore
$$\ln a^y = \ln \exp (y \ln a),$$
and
$$\ln a^y = y \ln a.$$
Since $a^x$ and $\log_a x$ are inverses of each other,
$$a^{\log_a x} = x.$$
In this equation, we take the ln of each side, getting
$$(\log_a x) \ln a = \ln x.$$
This gives:

Theorem 5. For every $a > 0$, $a \ne 1$,
$$\log_a x = \frac{\ln x}{\ln a}.$$
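Theorem 5 is the rule usually used to compute logarithms to an arbitrary base. A minimal sketch follows, with `log_base` as a hypothetical name and `math.log` standing in for ln.

```python
import math

def log_base(a, x):
    """Return log_a x = ln x / ln a, for a > 0, a != 1, x > 0."""
    if a <= 0 or a == 1 or x <= 0:
        raise ValueError("need a > 0, a != 1, and x > 0")
    return math.log(x) / math.log(a)

three = log_base(2.0, 8.0)                  # close to 3, since 2**3 = 8
round_trip = 2.0 ** log_base(2.0, 5.0)      # close to 5, since a**(log_a x) = x
```

The case $a = 1$ is excluded because $\ln 1 = 0$, so that the quotient would not be defined.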

Thus the function $\log_a$ is a constant times the function ln; and this means that the extension of the theory from ln to $\log_a$ is easy.

Theorem 6. For every $a > 0$, $a \ne 1$,

(g) $\log_a xy = \log_a x + \log_a y$ $(x, y > 0)$,
(h) $\log_a b^x = x \log_a b$,
(i) $\log_a 1 = 0$,
(j) $\log_a' x = \frac{1}{x} \log_a e$,
(k) $\log_a a^x = x$, for every $x$,
(l) $a^{\log_a x} = x$, for $x > 0$.

Here the formula designations are those of the summary at the end of Section 4.9. The proofs are as follows.

Proof. For $a = e$, the first three formulas are known to hold, because in this case $\log_a = \log_e = \ln$. If we divide every term by $\ln a$, then we get $\log_a$ throughout, and the equations still hold. To get Eq. (j), we observe that
$$\log_a x = \frac{\ln x}{\ln a}, \qquad D \log_a x = \frac{1}{x \ln a} = \frac{1}{x} \log_a e,$$
because $\log_a e = \ln e / \ln a = 1/\ln a$. Equations (k) and (l) merely remind us that $\log_a x$ and $a^x$ are inverses of one another.

Similarly, we get the laws governing the exponential $a^x$ by using the fact that
$$a^x = \exp (x \ln a) = e^{x \ln a}.$$

Theorem 7. For every $a > 0$,

(a) $a^x a^y = a^{x+y}$,
(b) $(a^x)^y = a^{xy}$,
(c) $a^0 = 1$,
(d) $a^x > 0$ for every $x$,
(e) $D a^x = a^x \ln a$.

The proofs are as follows.

a) We have
$$\ln (a^x a^y) = \ln a^x + \ln a^y = x \ln a + y \ln a = (x + y) \ln a = \ln a^{x+y}.$$
Since $a^x a^y$ and $a^{x+y}$ have the same ln, they must be the same; ln is invertible, and so ln never takes on the same value twice.

b) By definition, $b^y = \exp (y \ln b)$. Therefore
$$(a^x)^y = \exp (y \ln a^x) = \exp (yx \ln a) = \exp (xy \ln a) = a^{xy}.$$

c) $a^0 = \exp (0 \ln a) = \exp 0 = 1$.

d) $a^x = \exp (x \ln a) > 0$.

e) $D a^x = D \exp (x \ln a) = [\exp (x \ln a)] \ln a$, by the chain rule. Therefore $D a^x = a^x \ln a$.

This completes the program that was sketched in Section 4.9. There are, however, some things that we still need to check. In the elementary theory, we stated:

Definition 1. For every positive integer $n$, and every real number $x$,
$$(x^n) = x \cdot x \cdots x \quad \text{(to } n \text{ factors)}.$$

In the new theory, we stated:

Definition 2. If $x$ is positive, and $n$ is any real number, then
$$[x^n] = \exp (n \ln x).$$

We have used ( ) and [ ] to distinguish the two definitions. If $n$ is a positive integer, and $x > 0$, then both definitions apply; and we need to know that they give the same answer. In fact, this is true:
$$\ln (x^n) = n \ln x,$$
by a theorem of Section 4.10 (law (d) in the summary of that section). Therefore
$$(x^n) = \exp [\ln (x^n)] = \exp (n \ln x) = [x^n],$$
by definition of $[x^n]$.

Similarly, we now have two definitions of $a^{p/q}$.


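Definition 1. If $a > 0$, and $x$ is a rational number $p/q$, then
$$a^x = \sqrt[q]{a^p}.$$

Definition 2. If $a > 0$, and $x = p/q$, then
$$a^x = \exp (x \ln a) = \exp \left( \frac{p}{q} \ln a \right).$$

These definitions agree if it is true that
$$\sqrt[q]{a^p} = \exp \left( \frac{p}{q} \ln a \right).$$
The proof is as follows. Let
$$y = \sqrt[q]{a^p}.$$
Then $y^q = a^p$, and
$$q \ln y = p \ln a.$$
Therefore
$$\ln y = \frac{p}{q} \ln a,$$
and
$$y = \exp \left( \frac{p}{q} \ln a \right),$$
which was to be proved.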

We have found that the differentiation formula
$$D x^k = k x^{k-1}$$
holds true in certain cases. We first proved it for the case in which $k$ was a positive integer. Later we found that it held true when $k$ was a negative integer. For $k = \frac{1}{2}$, and $x > 0$, it says that
$$D x^{1/2} = \tfrac{1}{2} x^{1/2 - 1} = \tfrac{1}{2} x^{-1/2} = \frac{1}{2\sqrt{x}},$$
which is correct. We can now prove the following:

Theorem 8. For every $x > 0$, and every real number $k$,
$$D x^k = k x^{k-1}.$$

Proof. By definition,
$$x^k = \exp (k \ln x).$$
Therefore
$$D x^k = D[\exp (k \ln x)] = [\exp (k \ln x)]\, D[k \ln x] = x^k \cdot \frac{k}{x} = k x^{k-1}.$$
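Both differentiation formulas of this section, $D a^x = a^x \ln a$ and $D x^k = k x^{k-1}$, can be spot-checked with a difference quotient. The sketch below is only a numerical sanity check under arbitrary sample values ($a = 3$, $k = \pi$); the helper name `central_difference` and the step size are assumptions made here.

```python
import math

def central_difference(g, x, h=1e-6):
    """Approximate g'(x) by the symmetric difference quotient."""
    return (g(x + h) - g(x - h)) / (2.0 * h)

a, k = 3.0, math.pi

# Theorem 7(e): D a^x = a^x ln a, checked at x = 1.5.
d_exponential = central_difference(lambda t: math.exp(t * math.log(a)), 1.5)
exact_exponential = a ** 1.5 * math.log(a)

# Theorem 8: D x^k = k x^(k-1) for an irrational exponent k, checked at x = 2.
d_power = central_difference(lambda t: math.exp(k * math.log(t)), 2.0)
exact_power = k * 2.0 ** (k - 1.0)
```

In each case the difference quotient and the formula agree to about six decimal places, which is as much as a step of this size can be expected to give.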

In this section we have presented no new results, except for Theorem 8; we have merely furnished proofs for the theory sketched in Section 4.9. You therefore have no new material for problem work. Hence we give the definitions of a new set of functions, the hyperbolic functions, and list various identities which they satisfy. In the following problem set you will be asked to derive these. The theory is simpler than the theory of trigonometric functions. In fact, once you know about the exponential function, most of the following formulas have straightforward derivations.

The functions are called the hyperbolic sine, hyperbolic cosine, hyperbolic tangent, and so on.

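Definitions

$$\sinh x = \frac{e^x - e^{-x}}{2}, \qquad \cosh x = \frac{e^x + e^{-x}}{2},$$
$$\tanh x = \frac{e^x - e^{-x}}{e^x + e^{-x}} = \frac{\sinh x}{\cosh x}, \qquad \coth x = \frac{e^x + e^{-x}}{e^x - e^{-x}} = \frac{\cosh x}{\sinh x},$$
$$\operatorname{sech} x = \frac{2}{e^x + e^{-x}} = \frac{1}{\cosh x}, \qquad \operatorname{csch} x = \frac{2}{e^x - e^{-x}} = \frac{1}{\sinh x}.$$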

Identities

(1) $\sinh (-x) = -\sinh x$.
(2) $\cosh (-x) = \cosh x$.
(3) $\tanh (-x) = -\tanh x$.
(4) $\cosh^2 x - \sinh^2 x = 1$.
(5) $1 - \tanh^2 x = \operatorname{sech}^2 x$.
(6) $\coth^2 x - 1 = \operatorname{csch}^2 x$.
(7) $\sinh (x + y) = \sinh x \cosh y + \cosh x \sinh y$.
(8) $\cosh (x + y) = \cosh x \cosh y + \sinh x \sinh y$.
(9) $\tanh (x + y) = \dfrac{\tanh x + \tanh y}{1 + \tanh x \tanh y}$.
(10) $\sinh 2x = 2 \sinh x \cosh x$.
(11) $\cosh 2x = \cosh^2 x + \sinh^2 x$.
(12) $e^x = \cosh x + \sinh x$.
(13) $e^{-x} = \cosh x - \sinh x$.


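Derivatives

(14) $\sinh' x = \cosh x$.
(15) $\cosh' x = \sinh x$.
(16) $\tanh' x = \operatorname{sech}^2 x$.
(17) $\coth' x = -\operatorname{csch}^2 x$.
(18) $\operatorname{sech}' x = -\operatorname{sech} x \tanh x$.
(19) $\operatorname{csch}' x = -\operatorname{csch} x \coth x$.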
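Before deriving these formulas, it may be reassuring to test a few of them numerically. The following sketch builds sinh and cosh directly from the definitions and checks Identities (4), (7), and (11) and Derivative (14) at arbitrarily chosen sample points; the helper `central_difference` and all variable names are assumptions made for the illustration. A numerical check is not a derivation, and the problems below still ask for one.

```python
import math

def sinh(x):
    return (math.exp(x) - math.exp(-x)) / 2.0

def cosh(x):
    return (math.exp(x) + math.exp(-x)) / 2.0

def central_difference(g, x, h=1e-6):
    """Approximate g'(x) by the symmetric difference quotient."""
    return (g(x + h) - g(x - h)) / (2.0 * h)

x, y = 0.7, -1.3   # arbitrary sample points

identity_4 = cosh(x) ** 2 - sinh(x) ** 2                                   # should be 1
gap_7 = sinh(x + y) - (sinh(x) * cosh(y) + cosh(x) * sinh(y))              # should be 0
gap_11 = cosh(2 * x) - (cosh(x) ** 2 + sinh(x) ** 2)                       # should be 0
gap_14 = central_difference(sinh, x) - cosh(x)                             # should be near 0
```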

PROBLEM SET 4.11

Verify the following.

(The numbers in parentheses refer to the numbered formulas

above in the text.)

1. (12)

2. (13)

3. (1)

4. (2)

5. (3)

6. (14)

7. (15)

8. (16)

9. (17)

10. (18)

11. (19)

12. Find the derivative of $F(x) = \cosh^2 x - \sinh^2 x$.

13. Verify (4).

14. Verify (5).

15. Verify (6).

16. Let
$$A = \sinh (x + y) - \sinh x \cosh y - \cosh x \sinh y,$$
$$B = \cosh (x + y) - \cosh x \cosh y - \sinh x \sinh y.$$
Show that $A + B = 0$. (It is not necessary to go back to the definitions to show this. Try Identities (12) and (13).)

17. Let $A$ and $B$ be as in Problem 16. Show that $A - B = 0$.

Now verify the following.

18. (7)

19. (8)

20. (9)

21. (10)

22. (11)

23. Express $\cosh 3x$ in terms of $\cosh x$.

24. Show that $x > 0 \Rightarrow \sinh x > 0$.

25. Show that $x < 0 \Rightarrow \sinh x < 0$.

26. Show that, on the interval $[0, \infty)$, cosh is increasing.

27. Show that, on the interval $(-\infty, 0]$, cosh is decreasing.

28. Show that $\cosh x \ge 1$ for every $x$.

29. Show that sinh is invertible.

30. Show that
$$\cosh x = \sqrt{1 + \sinh^2 x},$$
for every $x$. Note that there is no double sign in this formula; if your derivation leads to a "$\pm$" sign, you must find a way to get rid of it.

31. Find $\cosh \sinh^{-1} x$.

32. Find $D \sinh^{-1} x$.

33. Find $D \sinh^{-1} 2x$.

34. Show that cosh is not an invertible function.

35. Let
$$\operatorname{Cosh} x = \cosh x \qquad (0 \le x).$$
(Compare with the definition of Cos: $\operatorname{Cos} x = \cos x$ $(0 \le x \le \pi)$.) Show that Cosh is invertible.

36. Show that
$$\sinh x = \begin{cases} \sqrt{\cosh^2 x - 1} & \text{for } x \ge 0, \\ -\sqrt{\cosh^2 x - 1} & \text{for } x < 0. \end{cases}$$

37. Show that
$$\sinh \operatorname{Cosh}^{-1} x = \sqrt{x^2 - 1}.$$

38. Find $D \operatorname{Cosh}^{-1} x$.

39. Find $D \operatorname{Cosh}^{-1} x^2$.

40. Show that tanh is invertible.

41. Show that
$$\operatorname{sech} x = \sqrt{1 - \tanh^2 x},$$
without any double sign in front of the radical.

42.

Find D tanh-1 x.

43.

Solve for x:

e2"'

e"'

- 6 =

0,

and explain why this equation has only one root.


44. Solve for x:
45.

Solve for x, in terms of y:

e"'

e"'

+ y - .6y2e-"'

\
I

2 - 35e-"'

0.
0.

46. Find a formula which expresses $\sinh^{-1} x$ as the logarithm of an algebraic expression. Hint: The graph of sinh is the graph of the equation
$$y = \tfrac{1}{2}(e^x - e^{-x}). \qquad (1)$$
Therefore the graph of $\sinh^{-1}$ is the graph of the equation
$$x = \tfrac{1}{2}(e^y - e^{-y}). \qquad (2)$$
Here we have reflected the graph across the line $y = x$, by interchanging $x$ and $y$ in Eq. (1). Now solve for $y$ in (2), getting

Then

= (

"

sinh-1 x =
47.

Analogously, get a formula for Cosh-1 x.

48. Analogously, get a formula for $\tanh^{-1} x$.

5 The Variation of Continuous Functions

5.1 INTERVALS ON WHICH A FUNCTION INCREASES, OR DECREASES

The function $f$ is increasing if
$$x < x' \Rightarrow f(x) < f(x').$$
Similarly, $f$ is decreasing if
$$x < x' \Rightarrow f(x) > f(x').$$

[Figure: graphs of an increasing function and of a decreasing function.]

Here $x$ and $x'$ are any points in the domain of $f$. Some simple functions are neither increasing nor decreasing. For example, $f(x) = x^2$ satisfies neither of the above conditions.

Often, however, we can get a good description of a function by cutting up its domain into subintervals, in such a way that on each subinterval the function is either increasing or decreasing. For example, the domain might be a closed interval $[a, b]$, and the graph might look like this:

[Figure: graph of a function on a closed interval, with subdivision points marked on the x-axis.]

This function is increasing on the interval $[x_2, x_3]$. Similarly, $f$ is increasing on $[x_0, x_1]$ and $[x_4, x_5]$, and is decreasing on the interval $[x_1, x_2]$. Similarly, $f$ is decreasing on $[x_3, x_4]$.

If a function is differentiable, then we can find out where it is increasing or decreasing by examining the derivative.

Theorem 1. If $f'(x) > 0$ at every interior point of $I$, then $f$ is increasing on $I$.

We recall that an interior point of an interval is any point which is not an endpoint. Theorem 1 is a consequence of the mean-value theorem (MVT).

[Figure: a graph with two chords; on the left a chord of negative slope, on the right a horizontal chord.]

If we had
$$(?) \quad a, b \text{ in } I, \quad a < b, \quad f(a) > f(b) \quad (?),$$
as on the left of the graph above, then the slope of the chord would be
$$\frac{f(b) - f(a)}{b - a} < 0,$$
and this would give $f'(x) < 0$ for some $x$ between $a$ and $b$. This is impossible, because such an $x$ would be an interior point of $I$. If we had
$$(?) \quad c, d \text{ in } I, \quad c < d, \quad f(c) = f(d) \quad (?),$$
as on the right, then the chord would be horizontal, and we would have $f'(x) = 0$ at an interior point $x$ of $I$.

In Theorem 1, we allow the possibility that $I$ is an infinite interval. Consider
$$f(x) = x^2, \qquad I = [0, \infty).$$
Here
$$f'(x) = 2x,$$
and so $f'(x) > 0$ at every interior point of $I$. Therefore $f$ is increasing on $I$. Note that when we allow the possibility that $f'(x) = 0$ at endpoints of $I$, we are not splitting hairs; if we required that $f'(x)$ be $> 0$ everywhere in $I$, then the theorem would not apply in the simple case $f(x) = x^2$, $I = [0, \infty)$, or to the case $f(x) = \cos x$, $I = [\pi, 2\pi]$. We don't need theorems to be as general as possible, but we want them to be general enough to be usable. And it is not unusual to find that $f'(x) = 0$ at an endpoint; in fact, this is what usually happens, when we break up the domain of our function into the largest possible subintervals on which the derivative does not change sign.

Here $f$ is increasing on $I_1$ and $I_3$, and decreasing on $I_2$; and the derivative vanishes at the endpoints $x_1$ and $x_2$.

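Consider another example:
$$f(x) = x^2 - x \qquad (0 \le x \le 2).$$
Here
$$f'(x) = 2x - 1, \qquad f'(\tfrac{1}{2}) = 0,$$
and $f'(x) > 0$ on $(\tfrac{1}{2}, 2]$. In the left-hand figure below, we have used this information, and have plotted $f(\tfrac{1}{2})$ and $f(2)$. Obviously $f(1) = 0$, and so we have plotted this point exactly also.

[Figure: two sketches of the graph of f(x) = x^2 - x; the left-hand sketch shows the part on [1/2, 2], the right-hand sketch the completed graph on [0, 2].]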

The same principle works in the opposite case:

Theorem 2. If $f'(x) < 0$ at every interior point of $I$, then $f$ is decreasing on $I$.

(Proof. Apply Theorem 1 to the function $g = -f$. By Theorem 1, $g$ is increasing on $I$. Therefore $f$ is decreasing on $I$.)

To apply this theorem to the same function $f(x) = x^2 - x$, on the interval $I = [0, \tfrac{1}{2}]$, we observe that $f$ is decreasing on this interval, because
$$f'(x) = 2x - 1 < 0 \qquad \text{for } 0 < x < \tfrac{1}{2}.$$
We use this information to complete our sketch.


This example doesn't look impressive, because we already knew how to sketch parabolas. It is not so obvious, however, how to sketch the graph of a cubic function taken at random, say,
$$f(x) = x^3 + 2x^2 - 3x - 4, \qquad -2 \le x \le 2.$$
This is not a put-up job; it is a "real-life" problem, and nothing is going to come out even. We need to find out where $f' > 0$ and where $f' < 0$. Now
$$f'(x) = 3x^2 + 4x - 3,$$
so that
$$f'(x) = 0 \quad \text{when} \quad x = \frac{-2 \pm \sqrt{13}}{3}.$$
Since $\sqrt{13} \approx 3.6$, the roots of the equation $f'(x) = 0$ are
$$x_1 = \frac{-2 + \sqrt{13}}{3} \approx 0.53, \qquad x_2 = \frac{-2 - \sqrt{13}}{3} \approx -1.87.$$
Since the graph of $f'$ is a parabola opening upward, it must look like the drawing on the left above. Thus

$f'(x) > 0$ when $x < x_2$,
$f'(x) < 0$ when $x_2 < x < x_1$,
$f'(x) > 0$ when $x > x_1$.


Therefore $f$ is increasing on $[-2, x_2]$; $f$ is decreasing on $[x_2, x_1]$; and $f$ is increasing on $[x_1, 2]$. We calculate
$$f(-2) = 2, \qquad f(x_2) \approx 2.1, \qquad f(x_1) \approx -4.9, \qquad f(2) = 6.$$
This gives us our sketch on the right. (The problems in the following problem set are not this awkward.)
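The arithmetic in this example is awkward by hand, so it is worth checking. The short sketch below recomputes the roots of $f'$ by the quadratic formula and evaluates $f$ at the four points used for the sketch; the names `f`, `fprime`, `x1`, and `x2` follow the text, and everything else is an illustrative choice.

```python
import math

def f(x):
    return x**3 + 2 * x**2 - 3 * x - 4

def fprime(x):
    return 3 * x**2 + 4 * x - 3

# Roots of 3x^2 + 4x - 3 = 0 by the quadratic formula.
discriminant = 4**2 - 4 * 3 * (-3)            # = 52 = 4 * 13
x1 = (-4 + math.sqrt(discriminant)) / 6       # the larger root, about 0.535
x2 = (-4 - math.sqrt(discriminant)) / 6       # the smaller root, about -1.869

values = {"f(-2)": f(-2), "f(x2)": f(x2), "f(x1)": f(x1), "f(2)": f(2)}
```

The sign pattern of `fprime` on either side of `x2` and `x1` confirms that $f$ rises, falls, and rises again on $[-2, 2]$.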
To apply this method, you need to know how the derivative behaves; and we may use the same method in investigating the derivative. For example, in the preceding problem we had
$$f'(x) = 3x^2 + 4x - 3.$$
If we let
$$g(x) = f'(x) = 3x^2 + 4x - 3,$$
then
$$g'(x) = 6x + 4.$$
Therefore $g$ is increasing for $x > -\tfrac{2}{3}$, and is decreasing for $x < -\tfrac{2}{3}$. Plotting $g$ exactly, at the points $-2$, $x_2$, $0$, and $x_1$, we get the sketch of $f'$ which is given above. We know that $f'(x) > 0$ for $x_1 < x < 2$, because $f'$ increases, starting at the value $f'(x_1) = 0$. Similarly, $f'(x) > 0$ for $-2 < x < x_2$, because on the interval $[-2, x_2]$, $f'$ decreases toward $f'(x_2) = 0$. Similarly in the middle interval $[x_2, x_1]$. This idea is simple enough, but it is so useful that we had better record it as a theorem:

Theorem 3. If $f$ is increasing on $[x_1, x_2]$, then $f(x) > f(x_1)$ for every $x$ on $(x_1, x_2]$.

[Figure: graph of a function increasing on an interval.]

We recall that $(x_1, x_2]$ is a half-open interval;
$$(x_1, x_2] = \{x \mid x_1 < x \le x_2\}.$$
We have been talking about the case $f'(x_1) = 0$.

Theorem 4. If $f$ is decreasing on $[x_1, x_2]$, then $f(x) < f(x_1)$ for every $x$ on $(x_1, x_2]$.

PROBLEM SET 5.1

For each function given, state on what intervals the function is increasing, and on what intervals it is decreasing; and sketch the graph.

1.

f(x)

2. f(x)
3.

f(x)

4.

f(x)

5.

f(x)

6.

f(x)

7.

f(x)

8. f(x)
9.

f(x)

sin x,

-1;:;:;

Sin-1 x,

1
- --2
1
+x

-2;:;:; x;:;:; 2

'

x3 - 3x,

(sin x + cos x)2

'

x 2
x ;:;:; 3

-1 ;:;:;

x;:;:; 2

1,

x3 + 6x2 + 9x + 3,
ew

2;:;:;

-1;:;:;

x3 + 3x2 - 2,

x;:;:; 1

-2;:;:; x;:;:; 2

'

-1 --2
+x

0;:;:; x ;:;:; 1

1 0. f(x) = x In x,

1 ;:;:; x;:;:; 5

11 .

0 ;:;:; x;:;:; 27T

f(x) =cos x,

12. f(x)
13.

f(x)

14.

f(x)

15.

f(x)

6
1 .

f(x)

7
1 .

f(x)

18. f(x)
19.

f(x)

211

0 ;:;:; x ;:;:;

sin 2x
-x

+2x2

1
=

-1 +x 4

-2;:;:; x;:;:; 2
-1;:;:;

xe-w

---4
1 +x

1T

x;:;:; 2

-1

;;;; x ;;;; 1

-1

;;;; x ;:;:; 1

x cos x - sin x
x/2 + sin x,
e"'

2x,

0 ;;=;x;:;:; h
0;:;:; x;:;:; 2

(Here you are not going to be able to get answers in an exact numerical form. The figure should indicate plausible approximations.)

20. Investigate the converse of Theorem 1. That is, find out whether the following statement is true:
Theorem(?). If (i) $f$ is continuous on $[a, b]$, (ii) $f$ is differentiable on $(a, b)$, and (iii) $f$ is increasing on $[a, b]$, then (iv) $f'(x) > 0$ for every $x$ of $(a, b)$.

21. Is the following true?
Theorem(?). If $f$ is differentiable at $x_0$ and $f'(x_0) > 0$, then some chord of the graph of $f$ has a positive slope.

22. Investigate:
Theorem(?). Let $f$ be a function satisfying (i), (ii), and (iii) of Problem 20. Then (iv') $f'(x) \ge 0$ for every $x$ of $(a, b)$.
5.2 LOCAL MAXIMA AND MINIMA,
DIRECTION OF CONCAVITY, INFLECTION POINTS

Again we consider a continuous function $f$, defined on a closed interval $[a, b]$. In the figure, $f(x_2) = M$; and $M$ is the largest value of $f$.

[Figure: graph of f on [a, b], with the maximum value M taken at x2 and the minimum value m taken at x3.]

We say that $f$ has a maximum at $x_2$; and we say that $M$ is the maximum value of $f$. Similarly, $f(x_3) = m$; and $m$ is the smallest value of $f$. We say that $f$ has a minimum at $x_3$; and we say that $m$ is the minimum value of $f$.

Here when we speak of maxima and minima, we mean maxima and minima on the whole domain of the function $f$; in this case the domain is $[a, b]$. Before you know what is a maximum or minimum, you must first know the domain of the function.

In the figure above, $f(x_1)$ is not a minimum value, because $f(x_3) < f(x_1)$. But $f(x_1)$ is the smallest value that $f$ takes on when $x$ is close to $x_1$. We say that $f$ has a local minimum at $x_1$. This is abbreviated as LMin. Local minima can occur in three ways:

[Figure: three sketches showing a local minimum at an interior point x1, at a left-hand endpoint x1, and at a right-hand endpoint x1.]

1) $x_1$ may lie on an open interval $(x_1 - \delta, x_1 + \delta)$, in the domain of $f$; and $f(x_1)$ may be the smallest value of the function on the interval $(x_1 - \delta, x_1 + \delta)$. In this case, we say that $f$ has an interior local minimum at $x_1$. This is abbreviated as ILMin.

2) $x_1$ may be the left-hand endpoint of the domain of $f$; and $f(x_1)$ may be the smallest value of $f$ on an interval $[x_1, x_1 + \delta)$.

3) $x_1$ may be the right-hand endpoint of the domain of $f$; and $f(x_1)$ may be the smallest value of $f$ on an interval $(x_1 - \delta, x_1]$.

Thus, for the function $f$ whose graph is sketched at the beginning of this section, we have local minima at $x_1$ and $x_3$. Note that every minimum is automatically a local minimum, just as the tallest man in the world is automatically the tallest in his own neighborhood.

Local maxima are defined similarly. Local maximum is abbreviated as LMax. A local maximum can occur in three ways:

[Figure: three sketches showing a local maximum at an interior point x1, at a left-hand endpoint x1, and at a right-hand endpoint x1.]

In the figure on the left, $f$ has an interior local maximum at $x_1$. This is abbreviated ILMax.
There are simple conditions under which a function has an ILMax or an ILMin at a given point.

Theorem 1. If $f$ is increasing on an interval $[x_1 - \delta, x_1]$ and decreasing on an interval $[x_1, x_1 + \delta]$, then $f$ has an ILMax at $x_1$.

Theorem 2. If $f$ is decreasing on an interval $[x_1 - \delta, x_1]$, and increasing on an interval $[x_1, x_1 + \delta]$, then $f$ has an ILMin at $x_1$.

In applying these, we use the derivative. If $f' > 0$ on $(x_1 - \delta, x_1)$ and $f' < 0$ on $(x_1, x_1 + \delta)$, we can apply Theorem 1. Similarly for Theorem 2. In fact, if you find out where a function is increasing and where it is decreasing, it is always obvious where the interior local maxima and minima are; they are at the turning points, where the graph stops behaving in one way and starts behaving in the other way.
Most of the time, for functions defined on a closed interval, the endpoints of the interval give either local maxima or local minima. Therefore, if we are investigating a function for local maxima and minima, we always investigate the endpoints. Of course, interior local maxima and minima may occur anywhere in the interior of the interval. In searching for them, we use the theorem suggested by the figure below. If the function is differentiable, then at an interior local maximum the derivative must be 0.

[Figure: a graph with a horizontal tangent at an interior local maximum x1, where f'(x1) = 0.]

Theorem 3. If $f$ has an ILMax at $x_1$, and $f$ is differentiable at $x_1$, then $f'(x_1) = 0$.

This is geometrically obvious, and a logical proof is also easy. Let
$$m(x) = \frac{f(x) - f(x_1)}{x - x_1},$$
so that
$$\lim_{x \to x_1} m(x) = f'(x_1).$$

1) Suppose that $f'(x_1) > 0$. Then the function $m(x)$ must be $> 0$ when $x \approx x_1$. Take $x \approx x_1$, with $x > x_1$. Then
$$x \approx x_1 \text{ and } x > x_1 \Rightarrow m(x) > 0 \text{ and } x - x_1 > 0 \Rightarrow m(x)(x - x_1) > 0 \Rightarrow f(x) - f(x_1) > 0 \Rightarrow f(x) > f(x_1),$$
which is impossible, because $f$ has an ILMax at $x_1$.

2) Suppose that $f'(x_1) < 0$. Then the function $m(x)$ must be $< 0$ when $x \approx x_1$. Take $x \approx x_1$, with $x < x_1$. Then
$$x \approx x_1 \text{ and } x < x_1 \Rightarrow m(x) < 0 \text{ and } x - x_1 < 0 \Rightarrow m(x)(x - x_1) > 0 \Rightarrow f(x) - f(x_1) > 0 \Rightarrow f(x) > f(x_1),$$
which is impossible.

Since (1) and (2) are both impossible, it follows that $f'(x_1) = 0$, which was to be proved.

This is the standard method for finding an ILMax. Given a differentiable function $f$, we find the points $x$ where $f'(x) = 0$. Usually there are only a finite number of such points. These are the only possible places where interior local maxima can occur. Therefore we have only a finite number of values of $x$ to investigate; and when we are done, our list of interior local maxima is complete.

Note, however, that the converse of Theorem 3 is false: if $f'(x_1) = 0$, it does not follow that $f$ has a local maximum (or a local minimum) at $x_1$. For example, if
$$f(x) = x^3, \qquad -1 \le x \le 1,$$
then $f'(0) = 0$, but $f$ is increasing on the whole interval $[-1, 1]$.

[Figure: graph of f(x) = x^3 on [-1, 1].]
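The difference between a point where $f' = 0$ and a genuine interior local maximum or minimum can be illustrated by looking at the sign of $f'$ on each side of the point. The sketch below is a heuristic only: it samples $f'$ at one point on each side, so it assumes that $f'$ does not change sign again within the small distance `delta`. The function name `classify`, the value of `delta`, and the sample function $x^3 - 3x$ are choices made for the illustration.

```python
def classify(fprime, c, delta=1e-3):
    """Classify a point c with fprime(c) = 0 by the sign of fprime on each side."""
    left, right = fprime(c - delta), fprime(c + delta)
    if left > 0 and right < 0:
        return "ILMax"      # increasing, then decreasing (Theorem 1)
    if left < 0 and right > 0:
        return "ILMin"      # decreasing, then increasing (Theorem 2)
    return "neither"

# f(x) = x^3: the derivative 3x^2 vanishes at 0 but does not change sign there.
cube_at_zero = classify(lambda x: 3 * x * x, 0.0)

# f(x) = x^3 - 3x: the derivative 3x^2 - 3 vanishes at -1 and at 1.
at_minus_one = classify(lambda x: 3 * x * x - 3, -1.0)
at_plus_one = classify(lambda x: 3 * x * x - 3, 1.0)
```

For $x^3$ the result is "neither", in agreement with the remark that the converse of Theorem 3 is false.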

We have a similar theorem for interior local minima:

Theorem 4. If $f$ has an ILMin at $x_1$, and $f$ is differentiable at $x_1$, then $f'(x_1) = 0$.

Proof. Let
$$g(x) = -f(x).$$
Then $g$ has an ILMax at $x_1$. Therefore $g'(x_1) = 0$. Therefore $f'(x_1) = -g'(x_1) = -0 = 0$, which was to be proved.


[Figure: on the left, a graph that is concave upward on [x1, x2]; on the right, a graph that is concave upward on [x1, x2] and concave downward on [x2, x3].]

If $f'$ is increasing on an interval $[x_1, x_2]$, then $f$ is concave upward on $[x_1, x_2]$. (You ought to be able to convince yourself that this is a reasonable use of language.) If $f'$ is decreasing on $[x_2, x_3]$, then $f$ is concave downward on $[x_2, x_3]$. In the figure on the right, $x_2$ is the point at which the direction of concavity changes. Such a point is called an inflection point. Of course, the direction of concavity can change from up to down or from down to up. Hence:

Definition. An inflection point of a function $f$ is a point at which $f'$ has either an ILMax or an ILMin.

Note the way in which these definitions fit together. If you know how to investigate (a) increasing, (b) decreasing, (c) interior local maxima, and (d) interior local minima, then automatically you know how to investigate direction of concavity and inflection points. The reason is that $f'$, once you get it, is a function, and can be investigated in the same way as any other function, with the aid of its derivative $f''$. Where $f'$ increases, $f$ is concave upward; where $f'$ decreases, $f$ is concave downward; and where $f'$ has an interior local maximum or minimum, $f$ has an inflection point.

Most of the time, we investigate local maxima and local minima because we want to find the maxima and minima. We find the maxima and minima, on the whole domain, by looking to see which local maximum value is the largest and which local minimum value is the smallest.

Finally, we observe that a function may easily have a local maximum or minimum at an endpoint at which it is not differentiable. For example, the function $f(x) = x^{2/3}$ $(0 \le x < \infty)$ has a minimum (and hence a local minimum) at $x = 0$. The theory takes care of this case. Since the derivative $\tfrac{2}{3} x^{-1/3}$ is positive in the interior of the interval $[0, \infty)$, it follows that the function is increasing, and so it has a minimum at the left-hand endpoint.

PROBLEM SET 5.2

1 through 19. For each of the functions described in Problems 1 through 19 of the preceding problem set, find the local maxima, the local minima, the maximum, the minimum, the inflection points (if any), and the image. (The image will always turn out to be a closed interval.) Tell where each of the functions is concave upward and where it is concave downward.
20. Consider the function defined by the following conditions:
1

f(x)

= x sin

/(0)

for 0 <

'TT

0.

An exact sketch is not practical, because the ILMax and ILMin points are hard to
calculate. Give a rough sketch, however, indicating as well as you can how the function
behaves. Is it continuous at O? Does it have a local maximum or minimum at O? Is it
differentiable at 0?
*21. Suppose that $f$ is both continuous and differentiable on $[0, 1]$. Does it follow that $f$ has an LMax or an LMin at 0? Why or why not?


5.3

THE BEHAVIOR OF FUNCTIONS AT INFINITY

So far, we have been discussing functions on closed intervals. In this section, we shall consider larger domains, including infinite intervals, such as $(-\infty, \infty)$, $[0, \infty)$, and so on, and also intervals with holes in them. For example, the domain of the tangent is
$$D = \{x \mid x \ne \pi/2 + n\pi\};$$
and this is an infinite interval $(-\infty, \infty)$ with infinitely many holes in it.

Most of the ideas that we shall be investigating are illustrated by a simple function, whose domain has a hole in it at 0.

[Figure: on the left, the graph of f(x) = 1/x (x ≠ 0); on the right, the graph of f(x) = 1/(x^2 - 1) (|x| ≠ 1).]

A careful inspection of the left-hand graph above will give you an idea of the meanings of the following statements:
$$\lim_{x \to \infty} f(x) = 0, \qquad (1)$$
$$\lim_{x \to -\infty} f(x) = 0, \qquad (2)$$
$$\lim_{x \to 0^+} f(x) = \infty, \qquad (3)$$
$$\lim_{x \to 0^-} f(x) = -\infty. \qquad (4)$$
Definitions will come later.

Meanwhile, let us look at another example. The function whose graph is shown on the right above has the following properties:

i) $f$ has an interior local maximum at 0. (At $x = 0$, the denominator is $-1$. Everywhere else near 0, $x^2 > 0$, $x^2 - 1 > -1$, and $1/(x^2 - 1) < -1$.)
ii) $\lim_{x \to 1^-} f(x) = -\infty$.
iii) $\lim_{x \to 1^+} f(x) = \infty$.
iv) $\lim_{x \to -1^+} f(x) = -\infty$.
v) $\lim_{x \to -1^-} f(x) = \infty$.

Here statements (ii) through (v) mean the things that the figure suggests. An examination of the formula shows why the figure is right. For example, if $x > 1$,


and $x \approx 1$, then $x^2 - 1 > 0$, and $x^2 - 1 \approx 0$. Therefore $1/(x^2 - 1)$ is positive and very large. This is shown in the figure and stated by (iii). Similarly, if $x < 1$ and $x \approx 1$, then $x^2 - 1 < 0$ and $x^2 - 1 \approx 0$. Therefore $1/(x^2 - 1)$ is negative and is numerically very large. This is shown in the figure and stated by (ii).
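A few numerical values make the behavior near $x = 1$ concrete. This is only an illustration of statements (ii) and (iii), with arbitrarily chosen distances from 1; the list names are hypothetical.

```python
def f(x):
    """The function 1/(x^2 - 1), defined for |x| != 1."""
    return 1.0 / (x * x - 1.0)

distances = (1e-2, 1e-4, 1e-6)
right_of_one = [f(1.0 + d) for d in distances]   # positive and growing
left_of_one = [f(1.0 - d) for d in distances]    # negative and growing in size

value_at_zero = f(0.0)      # the interior local maximum value, -1
value_near_zero = f(0.1)    # less than -1, as in statement (i)
```

Each time the distance from 1 is divided by 100, the size of the value is multiplied by about 100, on either side.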
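Let us now make this precise, by stating definitions that we can work with.

Definition. $\lim_{x \to \infty} f(x) = L$ means that for every $\epsilon > 0$ there is an $M$ such that
$$x > M \Rightarrow L - \epsilon < f(x) < L + \epsilon.$$

[Figure: for x > M, the graph of f lies between the horizontal lines y = L - ε and y = L + ε.]

This is like the definition of $\lim_{x \to x_0} f(x) = L$. Roughly,
$$\lim_{x \to x_0} f(x) = L \quad \text{means} \quad x \approx x_0 \Rightarrow f(x) \approx L,$$
and
$$\lim_{x \to \infty} f(x) = L \quad \text{means} \quad x \approx \infty \Rightarrow f(x) \approx L.$$
In the definitions, the condition $x \approx x_0$ is expressed by $0 < |x - x_0| < \delta$, and the condition $x \approx \infty$ is expressed by $x > M$.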
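Let us see how our definition of $\lim_{x \to \infty}$ applies to the function

$$f(x) = \frac{1}{x} \quad (x \neq 0).$$

We claim that

$$\lim_{x \to \infty} \frac{1}{x} = 0.$$

Under the definition, given $\epsilon > 0$, we are supposed to find an $M$ such that

$$-\epsilon < \frac{1}{x} < \epsilon \quad \text{whenever} \quad x > M.$$

This is trivial: take $M = 1/\epsilon$. When $x > 1/\epsilon$, obviously $0 < 1/x < \epsilon$. Similarly:

Definition. $\lim_{x \to -\infty} f(x) = L$ means that for every $\epsilon > 0$ there is an $M$ such that

$$x < M \quad \Rightarrow \quad L - \epsilon < f(x) < L + \epsilon.$$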
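The choice $M = 1/\epsilon$ made above for $f(x) = 1/x$ can be spot-checked numerically. The following Python sketch is only an illustration, not a proof; the helper names `M_for` and `check`, and the sample points, are assumptions introduced here.

```python
def M_for(eps):
    """Return M = 1/eps, the bound chosen in the text for f(x) = 1/x."""
    return 1.0 / eps


def check(eps, scales=(1.001, 2.0, 10.0, 1e3, 1e6)):
    """Test -eps < 1/x < eps at several sample points x = M * s > M."""
    M = M_for(eps)
    return all(-eps < 1.0 / (M * s) < eps for s in scales)


# One True/False result per tolerance eps.
results = {eps: check(eps) for eps in (1.0, 0.1, 1e-3, 1e-6)}
print(results)
```

A finite sample cannot establish the limit, of course; it only shows the definition at work for a few values of $\epsilon$.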


In the same spirit:


Definition. $\lim_{x \to x_0^+} f(x) = \infty$ means that for every $M$ there is a $\delta > 0$ such that

$$x_0 < x < x_0 + \delta \quad \Rightarrow \quad f(x) > M.$$

[Figure: the graph of $f$ over the interval from $x_0$ to $x_0 + \delta$, lying above the level $M$.]

That is, you can make $f(x)$ as big as you want (i.e., $> M$) by taking $x$ to the right of $x_0$ and very close to $x_0$ (i.e., between $x_0$ and $x_0 + \delta$).

We need to talk about one-sided limits (as $x \to x_0^+$ or $x \to x_0^-$) because these often turn out to be different. In some cases, however, the one-sided limits have the same value. In such cases, $\lim_{x \to x_0} f(x)$ must exist, and must be their common value. Thus

$$\lim_{x \to 0} \frac{1}{x^2} = \infty.$$

The following two theorems justify the remarks that were made above about $f(x) = 1/(x^2 - 1)$.

Theorem 1. Suppose that $f(x) > 0$ on an interval $(x_0, x_1)$. If $\lim_{x \to x_0^+} f(x) = 0$, then

$$\lim_{x \to x_0^+} \frac{1}{f(x)} = \infty.$$

Proof. Given $M > 0$. We need to find a $\delta > 0$ such that $1/f(x) > M$ whenever $x_0 < x < x_0 + \delta$. Let $\epsilon = 1/M$. By the definition of the statement $\lim_{x \to x_0^+} f(x) = 0$, we know that there is a $\delta > 0$ such that

$$f(x) < \epsilon \quad \text{whenever} \quad x_0 < x < x_0 + \delta.$$

This is the $\delta$ that we want: when $x_0 < x < x_0 + \delta$, we have

$$f(x) < \frac{1}{M},$$

and hence

$$\frac{1}{f(x)} > M.$$

(Remember that $M > 0$, and $f(x) > 0$ for the values of $x$ that we are interested in.)

Similarly, we have:

Theorem 2. Suppose that $f(x) < 0$ on an interval $(x_0, x_1)$. If $\lim_{x \to x_0^+} f(x) = 0$, then

$$\lim_{x \to x_0^+} \frac{1}{f(x)} = -\infty.$$

Proof. Given $M < 0$. Let $\epsilon = -1/M > 0$. Let $\delta$ be a positive number such that

$$-\epsilon < f(x) \quad \text{whenever} \quad x_0 < x < x_0 + \delta.$$

When $x_0 < x < x_0 + \delta$, we have

$$-\epsilon < f(x), \qquad -\frac{1}{\epsilon} > \frac{1}{f(x)}, \qquad \frac{1}{f(x)} < M.$$

(Here we have been reversing inequalities, because we have been dividing by negative numbers.)
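The proof of Theorem 1 can be traced on the example $f(x) = x^2 - 1$ with $x_0 = 1$. The sketch below is an illustration under stated assumptions: the explicit choice $\delta = \min(1, \epsilon/3)$ is not in the text; it rests on the observation that $x^2 - 1 = (x - 1)(x + 1) < 3(x - 1)$ for $1 < x < 2$. The function names are likewise introduced here.

```python
def f(x):
    """The example function f(x) = x^2 - 1, positive on (1, 2)."""
    return x * x - 1.0


def delta_for(M):
    """Assumed delta: with eps = 1/M, take delta = min(1, eps / 3).

    On (1, 2) we have f(x) < 3 (x - 1), so f(x) < eps on (1, 1 + delta).
    """
    eps = 1.0 / M
    return min(1.0, eps / 3.0)


def holds(M, fractions=(0.999, 0.5, 0.1, 1e-3)):
    """Check 1/f(x) > M at sample points x = 1 + t * delta, 0 < t < 1."""
    d = delta_for(M)
    return all(1.0 / f(1.0 + t * d) > M for t in fractions)


checks = [holds(M) for M in (1.0, 10.0, 1e3, 1e6)]
print(checks)
```

This agrees with statement (iii) above: to the right of 1 and close to 1, $1/(x^2 - 1)$ exceeds any prescribed $M$.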
Following the analogy of the above definitions, you ought to be able to write your own definitions of the following statements:

$$\lim_{x \to \infty} f(x) = \infty, \qquad \lim_{x \to \infty} f(x) = -\infty,$$

$$\lim_{x \to -\infty} f(x) = \infty, \qquad \lim_{x \to -\infty} f(x) = -\infty.$$

Consider now the question of

$$\text{(?)} \qquad \lim_{x \to \infty} \left(1 + \frac{1}{x}\right)^x \qquad \text{(?)}$$

We use question marks, because it is not obvious that the indicated limit exists at all: as $x \to \infty$, $1/x \to 0$, and $1 + 1/x \to 1$. Therefore we have an "indeterminate form of the type $1^\infty$." We recall, however, a similar situation before where we got an answer, like this:

$$\lim_{h \to 0} (1 + h)^{1/h} = e = \ln^{-1} 1.$$

This was also of the form "$1^\infty$." And the two are related: if we let

$$f(u) = (1 + u)^{1/u} \quad \text{and} \quad g(x) = \frac{1}{x},$$

then

$$\left(1 + \frac{1}{x}\right)^x = f(g(x)),$$

and we want to find

$$\lim_{x \to \infty} f(g(x)).$$

This is like the situation in Section 4.5. There we found:

Theorem. If $\lim_{x \to x_0} g(x) = u_0 = g(x_0)$ and $\lim_{u \to u_0} f(u) = f(u_0)$, then

$$\lim_{x \to x_0} f(g(x)) = f(u_0).$$

For the case in which $x \to \infty$, instead of $x \to x_0$, this theorem is still true, and the proof is virtually the same. That is:

Theorem 3. If $\lim_{x \to \infty} g(x) = u_0$ and $\lim_{u \to u_0} f(u) = L$, then

$$\lim_{x \to \infty} f(g(x)) = L.$$

Roughly speaking, the reason is that

$$x \approx \infty \quad \Rightarrow \quad g(x) \approx u_0 \quad \Rightarrow \quad f(g(x)) \approx L.$$

In fact, the same result holds if $\lim_{x \to \infty} g(x) = \infty$.

Theorem 4. If $\lim_{x \to \infty} g(x) = \infty$ and $\lim_{u \to \infty} f(u) = L$, then

$$\lim_{x \to \infty} f(g(x)) = L.$$

These theorems give quick answers to some rather hard-looking problems. Returning to our discussion of

$$f(u) = (1 + u)^{1/u}, \qquad g(x) = \frac{1}{x}, \qquad f(g(x)) = \left(1 + \frac{1}{x}\right)^x,$$

we get immediately:

Theorem 5. $\lim_{x \to \infty} (1 + 1/x)^x = e$.

This limit is used as a definition of $e$, in some treatments of exponentials and logarithms. In such a treatment, the formula $e = \lim_{h \to 0} (1 + h)^{1/h}$ appears as a theorem.
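Theorem 5 can be watched numerically. The short Python sketch below is an illustration only; the name `compound` and the chosen sample values of $x$ are assumptions made here. It tabulates $(1 + 1/x)^x$ for increasing $x$ and compares each value with `math.e`.

```python
import math


def compound(x):
    """Evaluate (1 + 1/x)^x for a positive number x."""
    return (1.0 + 1.0 / x) ** x


xs = [10.0 ** k for k in range(1, 7)]            # 10, 100, ..., 1,000,000
errors = [abs(compound(x) - math.e) for x in xs]  # distance from e

for x, err in zip(xs, errors):
    print(f"x = {x:>9.0f}   (1 + 1/x)^x = {compound(x):.8f}   error = {err:.2e}")
```

The errors shrink roughly in proportion to $1/x$, which is consistent with the limit being $e$ but says nothing about why; the reason is Theorem 3.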

PROBLEM SET 5.3

Investigate the following functions for maxima, minima, local maxima, local minima,
direction of concavity, and inflection points. Then investigate for limits of the sort defined
in this section.
1. $f(x) = \dfrac{1}{x(x - 2)}$ $(x \neq 0,\ x \neq 2)$


2 f(x)

3.

f(x)

(x - l)(x -

4. f(x)

8. f(x)

IO. f(x)

x-+oo

3)

(x -2, x

x
-2
x + 1
1

x + 1
3

x -x

)"'2

( -1 x)

1
1 + 2
X

( Ir
( r
3. 3r2
(

13.
16.

lim

x-,,12

0)

xz

7. f(x)

9. f(x)

-2
x + 1

x + 1

-3

x +
1-

x3

-3

+x

(-1 x)

(-1

x)

x-o+

Jim (x)1f(x-ll

x-1

1 +-

18. f(x)

1 +

21. /(x)

2x

(x

14. Jim o + v':X)11v'x

+ cosx)secx

(1

11 . f (x)

(x 0, x 1, x -1)

Investigate the following, for lim x-oo .

20. f(x)

5. f(x)

15. Jim (1 + x4)11x'


x-o

17. f(x)

3)

-2
x + 1

Investigate:

12. lim

( x ;= l, x ;=

2
x -x -6

6. I <x)

3)

5.3

(1

2
1 +x
+

(1 L)"'
( -rx

19. f(x)

22.

e-xy'

l +

ln x

1 + x

24. Discuss as in Problems 1 through 11:

$$f(x) = (\ln x)/x \quad (x > 0).$$

(Here the sticky point is $\lim_{x \to \infty}$. You ought to be able to figure out what this limit is, and convince yourself that your answer must be right. But to prove that the answer is right is an unreasonably hard problem, at this stage.)

25. Find $\lim_{x \to 0^+} x \ln (1/x)$. You need not prove that your answer is right.
26. Is there such a thing as $\lim_{x \to \infty} \sin x$? Why or why not?
27. Is there such a thing as $\lim_{x \to \infty} (1/x) \sin x$? Why or why not?
28. Prove the following:
Theorem (The squeeze principle). If

$$f(x) \leq g(x) \leq h(x) \quad (x \geq a),$$

and

$$\lim_{x \to \infty} f(x) = \lim_{x \to \infty} h(x) = L,$$

then $\lim_{x \to \infty} g(x) = L$.


29. If you borrow a dollar for a year, at 100% simple interest, then at the end of the year you owe 2 dollars. (A certain Marcus Junius Brutus lent money at this rate, in the first century B.C. He was also an assassin.) If interest is compounded semiannually, then at the end of the year you owe $(1 + \tfrac{1}{2})^2 = 2.25$ dollars. If the interest is compounded $n$ times a year, then you owe $(1 + 1/n)^n$ dollars. Suppose now that interest is compounded continuously: the bank passes to the limit, as $n$ increases without limit, and at the end of the year they charge you the limit. How much do you owe?

30. Suppose that the basic interest rate is 6%, but interest is compounded continuously, as in Problem 29. How much do you owe? (To get a numerical answer to this one, you will need to use one of the tables at the end of the book.)

5.4 THE INTRODUCTION OF FUNCTIONS INTO GEOMETRIC PROBLEMS;


THE USE OF EXISTENCE THEOREMS AS SHORTCUTS

On several occasions already we have been confronted with problems which did not appear to involve functions, and have solved them by introducing functions.
For example, in Section 3.7 we wanted to find the area under the graph of $y = x^4$, from 0 to 1.

[Figure: the region under the graph of $y = t^4$, whose area from 0 to $x$ is $F(x)$.]

We solved this problem by attacking the more general problem of calculating the function

$$F(x) = \int_0^x t^4 \, dt.$$

We found that

$$F(x) = \frac{x^5}{5},$$

and then set $x = 1$ to get the answer $\tfrac{1}{5}$.

Similarly, in Section 4.10 we wanted to show that

$$\ln (ab) = \ln a + \ln b,$$

for every pair of positive numbers $a$, $b$. To use the methods of calculus, we had to introduce functions into the problem. Given $k > 0$, we set

$$f(x) = \ln kx \quad (x > 0),$$
$$g(x) = \ln k + \ln x \quad (x > 0).$$

We then found that $f'(x) = g'(x)$ for every $x$, and $f(1) = g(1)$. It followed that $f = g$; and this proved our theorem.
We use the same kind of method to attack problems in maxima and minima which may be stated in geometric or physical terms. Consider some examples.

Problem 1. A segment of length 1 has its endpoints on the sides of a right angle. What position for the segment gives maximum area for the resulting triangle?

[Figure: the segment and the right angle, without axes (left) and with coordinate axes (right).]

The first step is to introduce a coordinate system, as shown on the right above. The endpoints of the segment now lie on the positive ends of the axes.
Let $x$ be the $x$-coordinate of the endpoint that lies on the $x$-axis; and let the other endpoint be $(0, y)$. When $x$ is named, $y$ is determined. Thus there is a function $f$ which gives $y$ in terms of $x$. Since

$$x^2 + y^2 = 1,$$

we have

$$f(x) = \sqrt{1 - x^2} \quad (0 \leq x \leq 1).$$

And for each $x$, the area enclosed is

$$A(x) = \tfrac{1}{2} x f(x) = \tfrac{1}{2} x \sqrt{1 - x^2}.$$

We need to investigate the function $A$ for maxima. Now

$$A'(x) = \frac{1}{2} \left[ x \cdot \frac{-x}{\sqrt{1 - x^2}} + \sqrt{1 - x^2} \right] = \frac{1}{2} \cdot \frac{-x^2 + (1 - x^2)}{\sqrt{1 - x^2}} = -\frac{1}{2} \cdot \frac{2x^2 - 1}{\sqrt{1 - x^2}} \quad (0 < x < 1).$$

Therefore $A'(x) = 0$ when $x = \pm\sqrt{2}/2$. Since we are concerned only with numbers on the interval $[0, 1]$, only $x = \sqrt{2}/2$ is of interest to us. Here $A = \tfrac{1}{4}$. Any maximum of $A$ is surely an ILMax, because $A(0) = A(1) = 0$, and $A(x) > 0$ for $0 < x < 1$.

Therefore our problem is solved, with $A = \tfrac{1}{4}$, if we know the following theorem:

Theorem 1 (Existence of maxima). If $f$ is continuous on $[a, b]$, then $f$ has a maximum value on $[a, b]$.

[Figure: two graphs over $[a, b]$; on the left the maximum is an interior local maximum, on the right it is attained at the endpoint $x = b$.]

The maximum may be an ILMax, as on the left above, or it may be at an endpoint, as on the right. But in many cases, like the one we have just been discussing, it is plain that the second of these possibilities does not arise. In such cases, we can infer immediately that the maximum is an ILMax. If the derivative vanishes at only one point, then this point must be the maximum.
We shall prove Theorem 1 in Section 5.6. Meanwhile, let us look at some more applications of it. In the preceding example, there are other functions that we might equally well have introduced.

[Figure: the segment, with the angle at $P$.]

If the angle at $P$ has measure $\theta$ $(0 \leq \theta \leq \pi/2)$, then

$$y = \sin \theta, \qquad x = \cos \theta,$$

and

$$A(\theta) = \tfrac{1}{2} xy = \tfrac{1}{2} \sin \theta \cos \theta = \tfrac{1}{4} \sin 2\theta.$$

Therefore

$$A'(\theta) = \tfrac{1}{4} (\cos 2\theta) \cdot 2 = \tfrac{1}{2} \cos 2\theta.$$

The only point $\theta$ on the interval $[0, \pi/2]$ where $A'(\theta) = 0$ is the point where

$$\theta = \frac{\pi}{4}.$$

We claim, without further investigation of derivatives, that this must be where the maximum occurs. (As in the previous discussion, there must be a maximum somewhere; this is not at an endpoint $0$ or $\pi/2$; it is therefore an interior local maximum; at an ILMax, $A'(\theta) = 0$; and $\theta = \pi/4$ is the only point of the interval at which $A'(\theta) = 0$.)
Setting $\theta = \pi/4$, we get the maximum value of $A$ as

$$A\left(\frac{\pi}{4}\right) = \frac{1}{4} \sin \left(2 \cdot \frac{\pi}{4}\right) = \frac{1}{4} \sin \frac{\pi}{2} = \frac{1}{4},$$

as before.
On reflection, you may find a way to solve this problem by purely geometrical methods, without taking any derivatives or even introducing any functions. The geometric method is easier if you think of it. Even in cases where elementary methods can be made to work, however, calculus does the same job methodically.
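Both descriptions of the area in Problem 1, $A(x) = \tfrac{1}{2} x \sqrt{1 - x^2}$ and $A(\theta) = \tfrac{1}{4} \sin 2\theta$, can be compared by a crude grid search. The Python sketch below is an illustration only; the grid size `N` and the function names are assumptions made here, and a grid search is a numerical check, not a replacement for the existence argument.

```python
import math


def area_x(x):
    """Area of the triangle when the lower endpoint is (x, 0), 0 <= x <= 1."""
    return 0.5 * x * math.sqrt(1.0 - x * x)


def area_theta(theta):
    """Area of the triangle when the angle at P measures theta."""
    return 0.25 * math.sin(2.0 * theta)


N = 100_000  # number of grid steps (an arbitrary choice)

# Grid points on [0, 1] and on [0, pi/2]; pick the ones with largest area.
best_x = max((i / N for i in range(N + 1)), key=area_x)
best_theta = max((i * (math.pi / 2) / N for i in range(N + 1)), key=area_theta)

print(best_x, area_x(best_x))              # near sqrt(2)/2 and 1/4
print(best_theta, area_theta(best_theta))  # near pi/4 and 1/4
```

Both searches land, up to the grid spacing, on the isosceles position, with area close to $\tfrac{1}{4}$.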
Problem 2. In a coordinate plane, let $A = (0, 1)$ and $B = (3, 2)$, as shown in the figure. What is the length of the shortest path from $A$ to the $x$-axis to $B$? And where should the path touch the $x$-axis, for this minimum to be attained?
In other words, for what choice of $P = (x, 0)$ is the sum of the distances $AP$ and $PB$ as small as possible?

[Figure: the points $A$ and $B$, and a path from $A$ to $P$ on the $x$-axis to $B$.]

Solution. Let

$$f(x) = AP + PB = \sqrt{1^2 + x^2} + \sqrt{(3 - x)^2 + 2^2} = \sqrt{1 + x^2} + \sqrt{x^2 - 6x + 13}.$$

Then

$$f'(x) = \frac{x}{\sqrt{1 + x^2}} + \frac{x - 3}{\sqrt{x^2 - 6x + 13}} = \frac{x \sqrt{x^2 - 6x + 13} + (x - 3) \sqrt{1 + x^2}}{\sqrt{1 + x^2} \, \sqrt{x^2 - 6x + 13}}.$$

Therefore $f'(x) = 0$ only when

$$x^2 (x^2 - 6x + 13) = (x^2 - 6x + 9)(x^2 + 1),$$

or

$$x^4 - 6x^3 + 13x^2 = x^4 - 6x^3 + 9x^2 + x^2 - 6x + 9,$$

or

$$3x^2 + 6x - 9 = 0,$$

or

$$x^2 + 2x - 3 = 0,$$

or

$$(x + 3)(x - 1) = 0.$$

To examine second derivatives looks hard. Let us try to use reasoning instead.

1) When $x$ decreases past 0, $AP$ increases, and so does $PB$. The same is true when $x$ increases past 3. Therefore, in searching for a minimum, we can restrict the search to, say, the interval $[-1, 4]$.

2) Suppose that we know that the function has a minimum, somewhere on the interval $[-1, 4]$. Then the minimum must be an ILMin, at which $f'(x) = 0$. There is only one such point on our interval, namely, $x = 1$. Therefore the minimum must be at $x = 1$.
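The conclusion of Problem 2 can be checked numerically. The Python sketch below is an illustration under assumptions: the names `path_length` and `d_path` and the grid size are introduced here. It also evaluates $f'$ at $x = -3$, the other root of $(x + 3)(x - 1) = 0$; that root enters only through the squaring step, and $f'$ does not vanish there.

```python
import math


def path_length(x):
    """f(x) = AP + PB for A = (0, 1), B = (3, 2), P = (x, 0)."""
    return math.sqrt(1.0 + x * x) + math.sqrt((3.0 - x) ** 2 + 4.0)


def d_path(x):
    """The derivative f'(x), written out as in the text."""
    return x / math.sqrt(1.0 + x * x) + (x - 3.0) / math.sqrt((3.0 - x) ** 2 + 4.0)


N = 50_000                                       # grid steps on [-1, 4]
grid = [-1.0 + 5.0 * i / N for i in range(N + 1)]
best = min(grid, key=path_length)                # grid point of least length

print(best, path_length(best))    # near x = 1 and 3*sqrt(2)
print(d_path(1.0), d_path(-3.0))  # about 0 at x = 1, clearly negative at x = -3
```

The minimum length found agrees with $f(1) = \sqrt{2} + \sqrt{8} = 3\sqrt{2}$.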

Obviously, to complete this discussion, we need the following theorem:

Theorem 2 (Existence of minima). If $f$ is continuous on $[a, b]$, then $f$ has a minimum value on $[a, b]$.

The proof is easy, granted that Theorem 1 is true. Since $-f$ is continuous, it has a maximum; and any maximum of $-f$ is a minimum of $f$.

[Figure: the graphs of $f$ and $-f$ over $[a, b]$.]

Here again, once the problem is solved, you may be able to think of a simpler attack on it. But the methods of calculus work in any case.

Problem 3.


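Find the right circular cylinder of largest volume, inscribed in a sphere of radius 1.

[Figure: a plane cross section of the sphere and the inscribed cylinder.]

To avoid a difficult drawing problem, we show not the three-dimensional figure but merely a plane cross section of it. One way to introduce a function into this problem is to express the volume of the cylinder as

$$V(x) = \pi x^2 \cdot 2\sqrt{1 - x^2} = 2\pi x^2 \sqrt{1 - x^2} \quad (0 \leq x \leq 1).$$

This gives

$$V'(x) = 2\pi \left[ 2x \sqrt{1 - x^2} + x^2 \cdot \frac{-x}{\sqrt{1 - x^2}} \right] = 2\pi \, \frac{2x(1 - x^2) - x^3}{\sqrt{1 - x^2}} = 2\pi \, \frac{2x - 3x^3}{\sqrt{1 - x^2}}.$$

Therefore

$$V'(x) = 0 \quad \Rightarrow \quad 2x - 3x^3 = 0 \quad \Rightarrow \quad x(3x^2 - 2) = 0.$$

Since $x$ must lie on $[0, 1]$, we find that $V'(x) = 0$ only when $x = 0$ or $x = \sqrt{2/3}$. Now $V$ has a maximum, because $V$ is continuous on $[0, 1]$. And this must be an ILMax, because $V(0) = 0 = V(1)$, and $V(x) > 0$ everywhere else on $[0, 1]$. Therefore $V'(x) = 0$ at the maximum. Therefore the maximum occurs at $x = \sqrt{2/3}$. Hence the maximum volume is

$$V\left(\sqrt{\tfrac{2}{3}}\right) = 2\pi \cdot \frac{2}{3} \cdot \sqrt{\frac{1}{3}} = \frac{4\pi}{3\sqrt{3}}.$$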

There is another function that we might have used to solve the same problem. We might have written

$$V(y) = \pi r^2 h = \pi x^2 \cdot 2y = \pi (1 - y^2) 2y = 2\pi (y - y^3).$$

This would give

$$V'(y) = 2\pi (1 - 3y^2),$$

so that

$$V'(y) = 0 \quad \Leftrightarrow \quad 3y^2 = 1 \quad \Leftrightarrow \quad y = \sqrt{\tfrac{1}{3}} \ \text{ or } \ y = -\sqrt{\tfrac{1}{3}}.$$

Here again only the positive number applies, because $y$ must be on the interval $[0, 1]$. As before, we conclude that the maximum value occurs at $y = \sqrt{1/3}$.

The second method is simpler. This sort of thing happens often. It is therefore a
good idea to have a quick look at all of the functions that it seems natural to try,
before doing any hard work with any one of them. If the first function that you try
looks simple, there is no point in examining others.
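The two functions used for Problem 3 can be compared numerically. The Python sketch below is an illustration only; the names `vol_x` and `vol_y` and the grid size are assumptions made here. It maximizes each function over a grid on $[0, 1]$ and checks that the two maxima describe the same cylinder.

```python
import math


def vol_x(x):
    """Volume in terms of the radius x of the base: 2*pi*x^2*sqrt(1 - x^2)."""
    return 2.0 * math.pi * x * x * math.sqrt(1.0 - x * x)


def vol_y(y):
    """Volume in terms of half the altitude y: 2*pi*(y - y^3)."""
    return 2.0 * math.pi * (y - y ** 3)


N = 100_000
grid = [i / N for i in range(N + 1)]  # grid on [0, 1]

best_x = max(grid, key=vol_x)
best_y = max(grid, key=vol_y)

print(best_x, vol_x(best_x))  # near sqrt(2/3)
print(best_y, vol_y(best_y))  # near sqrt(1/3)
print(best_x ** 2 + best_y ** 2)  # near 1: the same cylinder
```

The second function is a polynomial, so its derivative is found at sight; the first requires the product and chain rules, yet both searches agree on the maximal cylinder.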
Our third problem shows a danger which should be remembered hereafter. We might have supposed that the inscribed cylinder attains its maximum volume at the stage where the inscribed rectangle (in the cross section) attains its maximum area. But this is false: it is easy to show that the inscribed rectangle of maximum area is a square; and the cross section of the maximal cylinder is a rectangle of base $2\sqrt{2/3}$ and altitude $2\sqrt{1/3}$. Therefore we should never assume without proof that two maximum-minimum problems are equivalent.
A further word of caution: In establishing that a certain $x_0$ gives a maximum or minimum, you may use the theorems of the preceding sections. Under certain conditions, you may avoid these theorems (and the calculations that they require) by the sort of reasoning that we have used in the problems above. But in any case, you must use either the theorems of the preceding sections or a reasoning process which justifies your conclusions. To find a point $x_0$ where a derivative vanishes and hence infer that your problem is solved is a mistake. For one thing, $x_0$ may give a minimum when you were looking for a maximum, or vice versa. For another thing, $x_0$ may give a point of inflection.
PROBLEM SET 5.4
1. Find the area of the largest rectangle that can be inscribed in a semicircle of radius $a$.

2. Find the area of the largest rectangle that can be inscribed in an equilateral triangle whose sides have length $a$.

3. Find the area of the triangle with the smallest area which contains a square with side $a$.

4. Find the perimeter of the triangle with the smallest perimeter which contains a square with side $a$.

5. A rectangular field has one side along a river and a fence along the other three sides.
If the total length of the fence is k, what is the maximum possible area of the field?
6. Given a rectangular field with one side along a river, as in Problem 5. If the area of the field is $A$, what is the minimum possible length of the fence?

7. If a rectangular wooden beam is supported horizontally at its ends, then the maximum weight that it can support at its midpoint is proportional (at least approximately) to its width, and to the square of its thickness. That is, $W = kxy^2$, where $x$ is the width, $y$ is the thickness, and $k$ is a constant depending on the wood (and on the units of length and weight). Suppose that such a beam is to be cut from a cylindrical log of radius $a$, in such a way as to maximize $W$. What should be the width and the thickness?


8. An open pan is to be made out of a square metal sheet, by cutting out the square pieces from the corners of the sheet and folding up the sides of the metal that is left. (The square pieces are to be thrown away.) If the sheet has edges of length $a$, what is the volume of the pan of largest volume that can be made in this way?

9. An open pan, of the sort described in the preceding problem, has a total surface area of 128 sq. in. What is the largest possible volume?

10. Find the closed circular cylinder with volume 10 cu. in. and surface area as small as possible.

11. Solve the same problem, given that the cylinder is open at one end.

12. Solve the same problem, given that the cylinder is open at both ends. (It sits on a flat table and holds flour.)

13.

A piece of sheet metal,

feet long and

feet wide, is to be bent so as to form a trough

n feet long, with open top, open ends, and triangular cross sections. What is the greatest
possible cross sectional area?

14. A trough is to be made with isosceles right triangles as endpieces and congruent rectangles as sides, as shown in the figure. If the total surface area is to be 100 sq. in., what is the maximum volume?

15.

In a rectangular parallelepiped, with a square base, the total length of the edges is k.
What is the largest possible volume?

16.

A rectangle is to be inscribed in the region above the x-axis and below the graph of

y =

x2 Find the area of the rectangle of maximum area.

y =

17.

Same problem, for

18.

Find the rectangle of maximum area contained in the region above the line
to the right of the line x

19.

x4

1,

and under the graph of y

A rectangle is inscribed in the region R

{(x,

y) I ixl

= 1/x.

[y\ 1},

y = t,

in such a way as to

maximize the area. Find the area of the rectangle.


Find the values of x at which the following functions take on their maximum values,

and j ustify your answers. You need not find the maximum values of the functions.

20. f(x)
22.

h(x)

x
=

+ x2

J"'

Problems

-1

Sin-1 t dt

24 through

21.
23.

g(x)

</>(X)

=
=

x
--

1 + x4

finx

V1

+ t8 dt

27. Investigate the preceding four functions for minimum values.

5.4

The Introduction of Functions into Geometric Problems

231

28. An isosceles triangle has base $d$ and altitude $h$. Find the area of the rectangle of largest area that can be inscribed in it.

29. Given a triangle with angles of 30°, 60°, and 90°, there are three plausible ways of inscribing in it a rectangle of maximum area; the rectangle may have a side lying along any one of the three sides of the triangle. Show that all three of these "maximal" rectangles are really maximal; that is, show that they all have the same area.

30. Show that there are some triangles for which the conclusion in Problem 29 does not hold.

31. Show, however, that the conclusion of Problem 29 holds for a class of triangles which includes more than the 30°-60°-90° triangles.

32. Consider the curve which is the graph of the equation $x^2 + 4y^2 = 4$. Find the area of the rectangle of largest area that can be inscribed in this curve.

33. A right circular cone has a base of diameter $d$, and altitude $h$. Find the volume of the largest right circular cylinder that can be inscribed in it.

34. Find the area of the isosceles triangle of maximum area that can be inscribed in a circle of radius $r$.

35. Find the volume of the right circular cylinder of maximum volume that can be inscribed in a sphere of radius $r$.

36. Suppose that in Problem 34 the word "isosceles" is omitted. Is the solution of the resulting problem the same as before?

37. Similarly, discuss the problem obtained by omitting the word "right" in Problem 35.

38. Find the length of the longest ladder that can be carried (in a horizontal position) around the corner shown on the left below. The segment from $P$ to $Q$ shows a possible position of the ladder.

[Figure: on the left, the corner of Problem 38, with the segment from $P$ to $Q$; on the right, the circle inscribed in the right angle $\angle BAC$ of Problem 39.]

39. In the right-hand figure above, the circle (of radius $r$) is inscribed in the right angle $\angle BAC$. What is the minimum possible area of $\triangle ADE$?

**40. Suppose that in Problem 39 we do not require that $\angle BAC$ be a right angle. Given that $\angle BAC$ has measure $\alpha$, find the minimum possible area of $\triangle ADE$, in terms of $r$ and $\alpha$. (This is much harder than Problem 39.)

5.5 THE USE OF FUNCTIONAL EQUATIONS AS SHORTCUTS

In the preceding section, we found that under some conditions we could locate maximum and minimum values merely by finding a point where the derivative vanishes. We shall now see that in some cases we can locate maximum and minimum values without calculating the function. Consider first a simple problem, from Section 5.4.

Problem 1. A segment of length 1 has its endpoints on the sides of a right angle. What position for the segment gives maximum area for the resulting triangle?

[Figure: the segment, with its endpoints on the positive ends of the axes.]

As in Section 5.4, we set up the axes as shown. Let $x$ be the $x$-coordinate of the lower endpoint of the segment; and for each $x$ from 0 to 1, let $f(x)$ be the $y$-coordinate of the other endpoint. Note that we are entitled to use functional notation: $f(x)$ really is determined when $x$ is named. And for each $x$, we have

$$x^2 + [f(x)]^2 = 1^2,$$

because $x^2 + [f(x)]^2$ is the square of the length of the segment. Therefore the function $f$ satisfies the equation

$$x^2 + f^2 = 1 \quad (0 \leq x \leq 1). \tag{1}$$

The area of the triangle is

$$A(x) = \tfrac{1}{2} x f(x). \tag{2}$$

Now in (1), the left-hand member is a function, whose derivative is $2x + 2ff'$. But this function is known to be a constant, equal to 1. Therefore

$$x + ff' = 0 \quad (0 < x < 1). \tag{1'}$$

Here, of course, we are assuming that $f$ has a derivative, for every $x$ from 0 to 1, $0 < x < 1$, but this must be true, because the graph of $f$ is a quadrant of a circle. Obviously

$$A'(x) = \tfrac{1}{2} x f'(x) + \tfrac{1}{2} f(x). \tag{2'}$$

The maximum of $A(x)$ must be an ILMax; and so, at the maximum of $A(x)$, we have

$$x f' + f = 0.$$

We now know:

$$f' = -\frac{x}{f} \quad \text{on } (0, 1),$$

$$f' = -\frac{f}{x} \quad \text{at the maximum.} \tag{2''}$$

Therefore, at the maximum, both of these equations hold, and

$$-\frac{x}{f} = -\frac{f}{x} \quad \text{and} \quad x = f(x).$$

That is, the maximum is achieved when the triangle is isosceles.
This discussion has been long, because ideas needed to be explained; but once the ideas are understood, the calculations are simple:

$$x^2 + f^2 = 1, \qquad 2x + 2ff' = 0, \qquad f' = -\frac{x}{f};$$

$$A(x) = \tfrac{1}{2} x f(x), \qquad A'(x) = \tfrac{1}{2} x f'(x) + \tfrac{1}{2} f(x);$$

and hence

$$A' = 0 \quad \Leftrightarrow \quad f' = -\frac{f}{x}.$$

Therefore, at the maximum,

$$-\frac{x}{f} = -\frac{f}{x} \quad \text{and} \quad x = f(x).$$

In this case, of course, it was not much trouble to find a formula for $f$ and use it. But in many cases, equations like

$$x^2 + f^2 = 1$$

are more convenient than formulas for the function $f$. These are called functional equations. Obviously every trigonometric identity is a functional equation. Usually, however, we use the word identity when the function is known, and the term functional equation when the equation itself is being used as a working definition of the function.

Consider another example, Problem 3 in Section 5.4.

Problem 3. Find the right circular cylinder of largest volume, inscribed in a sphere of radius $a$.

[Figure: a vertical cross section of the sphere of radius $a$ and the inscribed cylinder.]

As before, we show a vertical cross section of the figure. Let $x$ be the radius of the inscribed cylinder, and let $f(x)$ be half the altitude. Then

$$x^2 + f^2 = a^2, \qquad 2x + 2ff' = 0,$$

and

$$f' = -\frac{x}{f} \quad (0 < x < a). \tag{3}$$

Now the volume is

$$V(x) = \pi x^2 \cdot 2f(x),$$

so that

$$V' = 2\pi (x^2 f' + 2xf).$$

At the maximum, $V' = 0$, and so

$$f' = -\frac{2xf}{x^2} = -\frac{2f}{x} \quad \text{(at Max)}. \tag{4}$$

Therefore, at the maximum, both our formulas for $f'$ must hold, and so

$$-\frac{x}{f} = -\frac{2f}{x} \quad \text{and} \quad f = \frac{\sqrt{2}}{2} x. \tag{5}$$

For $a = 1$, this tells us that

$$x = \sqrt{\tfrac{2}{3}},$$

as before.
Note, however, that in a way the most natural answer to a problem like this is a shape, rather than a size. And the solution based on the functional equation ordinarily gives the answer in the form of a shape, that is, in the form of a ratio between two measurements. For example, in the preceding problem the constant $a$, which determines the size of the whole configuration, disappeared immediately when we differentiated in the equation $x^2 + f^2 = a^2$. Our final equation (5) means that at the maximum,

$$2f(x) = \sqrt{2} \, x,$$

that is, the altitude of the maximum cylinder is equal to $\sqrt{2}$ times the radius of its base.
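That the answer is a shape, independent of the radius $a$ of the sphere, can be checked numerically. The Python sketch below is an illustration only; the name `best_shape`, the grid size, and the sample radii are assumptions made here. For each $a$ it maximizes $x^2 \sqrt{a^2 - x^2}$ (the volume apart from the constant factor $2\pi$) over a grid and reports the ratio of altitude to base radius.

```python
import math


def best_shape(a, N=200_000):
    """Return altitude/radius for the largest cylinder in a sphere of radius a.

    The radius x of the base runs over a grid on (0, a); half the altitude
    is f = sqrt(a^2 - x^2), and the volume is proportional to x^2 * f.
    """
    candidates = (a * i / N for i in range(1, N))
    x = max(candidates, key=lambda x: x * x * math.sqrt(a * a - x * x))
    f = math.sqrt(a * a - x * x)
    return 2.0 * f / x


ratios = {a: best_shape(a) for a in (0.5, 1.0, 3.0)}
print(ratios)  # each value near sqrt(2)
```

The same ratio appears for every radius tried, as equation (5) predicts.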

The answer is also a shape when the problem is to find the rectangle of maximum area in a given circle:

$$x^2 + f^2 = a^2, \qquad 2x + 2ff' = 0, \qquad f' = -\frac{x}{f} \quad (0 < x < a);$$

$$A(x) = (2x) \cdot 2f(x) = 4xf, \qquad A'(x) = 4(xf' + f),$$

$$A'(x) = 0 \quad \Leftrightarrow \quad f' = -\frac{f}{x}.$$

Therefore at the maximum,

$$-\frac{x}{f} = -\frac{f}{x} \quad \text{and} \quad x = f,$$

because $x$ and $f$ are both positive. This is a qualitative answer, as it should be: it says that the maximum rectangle is a square. The constant $a$ has disappeared, because the shape of the maximum rectangle is the same for all circles.
In the following problem set, you will find more cases in which maxima and minima can most conveniently be found by using functional equations. Meanwhile let us look carefully at what happens when we take the derivative on each side of a functional equation. The ideas here are illustrated by a simple case. When we write

$$x^2 + f^2 = a^2 \tag{6}$$

$$\Rightarrow \quad 2x + 2ff' = 0, \tag{7}$$

we are claiming that every differentiable function which satisfies Eq. (6) also satisfies Eq. (7). It often happens that there is more than one such function $f$. For example, consider

$$f_1(x) = \sqrt{a^2 - x^2} \quad \text{and} \quad f_2(x) = -\sqrt{a^2 - x^2}.$$

Here

$$f_1'(x) = \frac{-x}{\sqrt{a^2 - x^2}} = \frac{-x}{f_1(x)},$$

and

$$f_2'(x) = \frac{x}{\sqrt{a^2 - x^2}} = \frac{-x}{-\sqrt{a^2 - x^2}} = \frac{-x}{f_2(x)}.$$

Therefore

$$f_1(x) f_1'(x) = -x \quad \text{and} \quad f_2(x) f_2'(x) = -x.$$

Therefore

$$2x + 2f_1 f_1' = 0, \qquad 2x + 2f_2 f_2' = 0. \tag{8}$$

That is, both $f_1$ and $f_2$ satisfy (7). A figure makes it obvious what is going on here.

[Figure: the circle of radius $a$, with labeled points on the graphs of $f_1$ and $f_2$, and the radius and tangent at each.]

At each of the labeled points, we have

$$f_i'(x) = \frac{-1}{f_i(x)/x} = \frac{-x}{f_i(x)},$$

because the tangent is perpendicular to the radius.
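The claim that both semicircle functions satisfy $2x + 2ff' = 0$ can be tested with difference quotients. The Python sketch below is an illustration under assumptions: the radius $a = 2$, the step `h`, the sample points, and the helper names are all choices made here, and a central difference only approximates $f'$.

```python
import math

a = 2.0  # radius of the circle (an arbitrary sample value)


def f1(x):
    """Upper semicircle: f1(x) = sqrt(a^2 - x^2)."""
    return math.sqrt(a * a - x * x)


def f2(x):
    """Lower semicircle: f2(x) = -sqrt(a^2 - x^2)."""
    return -math.sqrt(a * a - x * x)


def deriv(f, x, h=1e-6):
    """Central difference approximation to f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def residual(f, x):
    """Left-hand side of 2x + 2 f f' = 0, with f' approximated."""
    return 2.0 * x + 2.0 * f(x) * deriv(f, x)


points = [-1.5, -0.5, 0.0, 0.7, 1.9]
worst = max(abs(residual(f, x)) for f in (f1, f2) for x in points)
print(worst)  # small: both branches satisfy the differentiated equation
```

Neither branch had to be singled out in advance; the differentiated equation holds on each.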


The same sort of thing goes on in more complicated cases. The graph of

$$y = x^3 - x \tag{9}$$

looks like the left-hand figure below. Therefore the graph of

$$x = y^3 - y \tag{10}$$

looks like the right-hand drawing below.

[Figure: on the left, the graph of $y = x^3 - x$; on the right, the curve $C$, the graph of $x = y^3 - y$, with the graphs of $f_1$, $f_2$, $f_3$ indicated.]

We have interchanged $x$ and $y$ in Eq. (9), and reflected the graph across the line $y = x$. This gives the curve $C$ which is the graph of (10). $C$ is not a function-graph. But $C$ is the union of the graphs of three functions $f_1$, $f_2$, $f_3$, as indicated in the figure. And each of the functions $f_1$, $f_2$, and $f_3$ satisfies the functional equation

$$x = f^3 - f.$$

Therefore each of these functions satisfies the differential equation

$$1 = 3f^2 f' - f'.$$

This is what we are claiming when we differentiate the functional equation, and write

$$x = f^3 - f \quad \Rightarrow \quad 1 = 3f^2 f' - f'.$$

PROBLEM SET 5.5

In Problems 1 through 10 below, the notation 5.4.n refers to Problem n of Problem Set 5.4. In each of these cases, the indicated ratio is to be found by the method based on functional equations.

1. In 5.4.1, find altitude/base, at the maximum.
2. In 5.4.2, same.
3. In 5.4.5, same, using the side parallel to the river as base.
4. In 5.4.7, find $y/x$, at the maximum.
5. In 5.4.14, let $l$ be the length of the rectangular side and let $w$ be the width. Find $w/l$ at the maximum.
6. In 5.4.15, let $h$ be the altitude and let $e$ be the length of each edge of the base. Find $h/e$, at the maximum.
7. In 5.4.16, find altitude/base, at the maximum.
8. In 5.4.28, same.
9. In 5.4.33, let $a$ be the altitude of the cylinder, and let $r$ be the radius of the base. Find $a/r$, at the maximum.
10. In 5.4.34, let $h$ be the altitude, and let $a$ be half the length of the base. Find $h/a$, at the maximum.
11. In 5.4.35, let $h$ be the altitude and let $a$ be the radius of the base. Find $h/a$, at the maximum.
12. We know of a function $f$, with domain $[-1, 1]$, which is a solution of the functional equation $\sin f(x) = x$. (Our "known function," of course, is $f(x) = \operatorname{Sin}^{-1} x$.) What other continuous solutions of the equation have the entire interval $[-1, 1]$ as domain? Draw a figure.
13. Write a differential equation which is satisfied by all solutions of the functional equation $x^4 + [f(x)]^4 = 1$.
14. a) Let $n = 10^{10^{10}}$. Sketch the graph of $x^n + y^n = 1$. [Hint: A commonly used drawing instrument will give you an excellent sketch.]
b) Let $n = 10^{10^{10}} + 1$. Sketch the graph of $x^n + y^n = 1$. [Same hint.]

15. Find the functions $f$ which satisfy the differential equation

$$x + ff' = 0.$$

(You need not show that the solutions that you describe are the only ones.)

16. Given that $f$ and $f'$ are continuous, let

$$F(x) = \int_0^x f(t) f'(t) \, dt.$$

Calculate $F(x)$, in terms of $f$.

*17. Now show that your list of solutions, in Problem 15, is complete.

18. Let $f$ be the function whose graph is the union of (a) the lower left-hand quadrant of the circle with center at $(0, 1)$ and radius 1 and (b) the upper right-hand quadrant of the circle with center at $(0, -1)$ and radius 1. Show that $f$ is a solution of the differential equation

$$[f'(x)]^2 = [x + f(x) f'(x)]^2,$$

except, of course, at the endpoints $x = \pm 1$, where the tangent lines are vertical and the function has no derivative. As a start, observe that at $x = 0$, the tangent to the graph is horizontal and the equation is satisfied: $0^2 = [0 + 0 \cdot 0]^2$.

*19. Consider the family of quadratic functions represented by the formula

$$f(x) = (x - a)^2. \tag{1}$$

Differentiating, we get

$$f'(x) = 2(x - a),$$

and squaring, we get

$$[f'(x)]^2 = 4(x - a)^2,$$

and

$$[f'(x)]^2 = 4 f(x). \tag{2}$$

Evidently (1) $\Rightarrow$ (2). But the converse is false.

a) Show that one of the solutions of (2) is a linear function $f$.

b) Show that (2) has some solutions which are neither quadratic nor linear; that is, the differential equation has solutions whose total graphs are neither lines nor parabolas.

5.6 THE COMPLETENESS OF R AND THE EXISTENCE OF MAXIMA

In Section 5.4 and later, we have used the fact that, if $f$ is continuous on $[a, b]$, then $f$ has a maximum value on $[a, b]$. In Section 5.4 this theorem was used as a shortcut in finding maximum values, but this is only one of the uses of the theorem. In fact, the theorem is part of the foundation of the calculus, as we shall see.
In proving it, we shall need to use, for the first time, the fact that the number line has no holes in it. As a guide in giving an exact description of this property of the number system, let us consider what happens when you remove a point from the number line, thus getting a system which really does have a hole in it.
Let $A$ be the set of all negative numbers, and let $B$ be the set of all positive numbers. We mean strictly positive and strictly negative, so that 0 belongs neither to $A$ nor to $B$. Then

1) $B$ has no least element.

The reason is that if $x$ is a positive number, then so is $x/2$, and $x/2 < x$. Therefore no positive number $x$ is less than all other positive numbers, and so (1) holds. Similarly,

2) $A$ has no greatest element.

For if $x < 0$, then $x/2 < 0$, and $x/2 > x$.
Now let

$$K = A \cup B = \{x \mid x \neq 0\}.$$

Then obviously:

3) $K$ is the union of two nonempty sets $A$ and $B$, such that (a) every number in $A$ is less than every number in $B$, but (b) $A$ has no greatest element, and (c) $B$ has no least element.

Evidently this situation could not have arisen if we had not excluded 0: if we put 0 in $A$, then 0 would be the greatest element of $A$; and if we put 0 in $B$, then 0 would be the least element of $B$. Thus the situation described in (3) can arise only in a number system with a hole in it, and so the following statement conveys the idea that there are no holes in R:
The Dedekind Cut Postulate (DCP).

Suppose R is expressed as the union of two

nonempty sets A and B, such that every element of A is less than every element of B.
Then either A has a greatest element or B has a least element.
B

Xo

In the figure, x0 must belong either to A or to B.

Therefore x0 is either the

greatest element of A or the least element of B.


We have stated DCP as our first description of the completeness of R, because
it is the best known description, and in some ways the most natural. But for some
purposes, the following idea is easier to use. Given a sequence

[a_1, b_1], [a_2, b_2], ...

of closed intervals. If every interval in the sequence contains the next, then we say
that the sequence is nested. Algebraically, this means that

a_i ≤ a_(i+1) ≤ b_(i+1) ≤ b_i     for every i.

For example, if

[a_i, b_i] = [-1/i, 1/i]     for every i,

then the sequence is nested. This sequence "closes down on 0." That is, 0 lies in each
of the intervals in the sequence, and 0 is the only number that lies in all of them.
A more important example is as follows. Given a circle of radius 1, let p_n be the
perimeter of an inscribed regular (n + 2)-gon, and let q_n be the perimeter of a
circumscribed regular (n + 2)-gon. Evidently

p_1 < p_2 < p_3 < ...,     q_1 > q_2 > q_3 > ...,

and

p_i < q_i     for each i.

Thus we have a nested sequence

[p_1, q_1], [p_2, q_2], ...

of closed intervals. And this sequence "closes down on 2π." That is, 2π lies in all of
the intervals in the sequence, and no other number lies in all of them.
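The polygon example can be checked numerically. The sketch below is only an illustration, not part of the argument: it assumes the standard formulas 2m sin(π/m) and 2m tan(π/m) for the perimeters of the inscribed and circumscribed regular m-gons of a circle of radius 1, and the function names are our own.

```python
import math


def inscribed_perimeter(n):
    """Perimeter p_n of the regular (n + 2)-gon inscribed in a circle of radius 1."""
    sides = n + 2
    return 2 * sides * math.sin(math.pi / sides)


def circumscribed_perimeter(n):
    """Perimeter q_n of the regular (n + 2)-gon circumscribed about the same circle."""
    sides = n + 2
    return 2 * sides * math.tan(math.pi / sides)


def polygon_intervals(count):
    """Return the first `count` intervals [p_n, q_n] as (p_n, q_n) pairs."""
    return [(inscribed_perimeter(n), circumscribed_perimeter(n))
            for n in range(1, count + 1)]


if __name__ == "__main__":
    for n, (p, q) in enumerate(polygon_intervals(8), start=1):
        # Each interval should contain 2*pi and sit inside the previous one.
        print(f"n={n}:  p_n={p:.6f}  q_n={q:.6f}  contains 2*pi: {p < 2 * math.pi < q}")
```

Running the script prints a few of the intervals; the left endpoints rise and the right endpoints fall, with 2π trapped between them.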
The following postulate says that every nested sequence of intervals closes down
on at least one point.

The Nested Interval Postulate (NIP). For every nested sequence of closed intervals
there is a number x which lies in every interval in the sequence.

This conveys the idea that the number system is complete. Suppose, for example,
that 2π were missing, so that the number system had a hole in it where 2π ought to be.
Then no number at all would lie in all of the intervals [p_1, q_1], [p_2, q_2], ... that we
have just discussed. Similarly, if √2 were missing, then there would be a nested
sequence of closed intervals closing down on no number whatever. (We could use
[√2 - 1/i, √2 + 1/i] as the ith interval in the sequence.)

Using the nested interval postulate (NIP), we shall prove the following theorem:

Theorem 1. If f is continuous on [a, b], then f has an upper bound on [a, b].
That is, there is a number M such that f(x) ≤ M for each x of [a, b].

Lemma. If f is unbounded above on an interval [c, d], then f is unbounded above on
at least one of the halves of [c, d].

By the halves of [c, d] we mean the intervals

[c, (c + d)/2]     and     [(c + d)/2, d].

The proof of the lemma is immediate: if f has an upper bound M_1 on [c, (c + d)/2]
and has an upper bound M_2 on [(c + d)/2, d], then f has an upper bound on [c, d].
We merely use the larger of the bounds M_1 and M_2.

[Figure: the interval [c, d] on the x-axis, with its midpoint (c + d)/2 marked.]

We proceed to prove the theorem. For short, we say that an interval is good
if f is bounded above on the interval; and we say that an interval is bad if it is not
good. Thus we need to prove that [a, b] is good. We start by supposing that [a, b]
is bad, and we shall show this assumption leads to a contradiction.

If [a, b] is bad, then it follows that at least one of the halves of [a, b] must be bad.
Let [a_1, b_1] be a bad half of [a, b]. For the same reason, [a_1, b_1] must have a bad half.
Let [a_2, b_2] be a bad half of [a_1, b_1]. Continuing this process to infinity, we get a
sequence

[a_1, b_1], [a_2, b_2], ...

of closed intervals, all of which are bad, and each of which is a half of the preceding
one. Therefore

b_i - a_i = (1/2^i)(b - a)     for each i.

By NIP, there is an x̄ such that

a_i ≤ x̄ ≤ b_i     for each i.

But f is continuous at x̄. Thus, for every ε > 0, f has an εδ-box at the point
(x̄, f(x̄)).

[Figure: an εδ-box for f at the point (x̄, f(x̄)), with upper edge at y = M = f(x̄) + ε.]

Thus

|x - x̄| < δ   =>   f(x̄) - ε < f(x) < f(x̄) + ε,

and so f(x̄) + ε is an upper bound for f on the interval (x̄ - δ, x̄ + δ). But since

lim (b_i - a_i) = 0     (as i → ∞),

we have

b_i - a_i < δ

for some i. For such an i, the closed interval [a_i, b_i] lies inside the open interval
(x̄ - δ, x̄ + δ). That is,

x̄ - δ < a_i ≤ x̄ ≤ b_i < x̄ + δ,

as shown in the figure. (This is easy to see geometrically, because [a_i, b_i] contains
the midpoint x̄ of the open interval, and is less than half as long.)

But this situation is impossible, because f is bounded above on (x̄ - δ, x̄ + δ)
and is not bounded above on the smaller interval [a_i, b_i]. This contradiction completes
the proof of the theorem.

One of the ideas that we have just used is going to be useful later. We therefore
record it as a theorem:

Theorem 2. Suppose that

a_i ≤ x̄ ≤ b_i     for each i, and     lim (b_i - a_i) = 0     (as i → ∞).

Then every interval (x̄ - δ, x̄ + δ) contains some interval [a_i, b_i].

This was proved in the preceding discussion. If (x̄ - δ, x̄ + δ) contains [a_i, b_i],
and the sequence is nested, then of course it follows that (x̄ - δ, x̄ + δ) contains all
of the later intervals.

Given that a function f is bounded, it does not follow that f has a maximum or a
minimum. Consider, for example,

f = Tan^-1.

[Figure: graph of y = Tan^-1 x, lying between the lines y = -π/2 and y = π/2.]

When x is far to the right, Tan^-1 x is close to π/2, but Tan^-1 x is never actually equal
to π/2 for any x. Similarly, when x is far to the left, Tan^-1 x is close to -π/2, but
-π/2 is not one of the values of the function. On the other hand, it is easy to see that
the numbers π/2 and -π/2 are related to the function Tan^-1 in a special way: π/2
is an upper bound of the function; and of all upper bounds of the function, π/2 is the
smallest. We express this by writing

π/2 = sup Tan^-1.

Here sup is pronounced supremum. To be exact:

Definition. If k is an upper bound of a function f, and k is smaller than every other
upper bound of f, then k is called the supremum of f, and we write

k = sup f.

More generally, we define the supremum for any set of numbers:

Definition. Let B be a set of numbers. If x ≤ k, for every x in B, then k is an upper
bound of B. If k is smaller than every other upper bound of B, then k is called the
supremum of B, and we write

k = sup B.

Consider, for example, the case where B is an open interval (a, b). Every number
k ≥ b is an upper bound of B. Thus the upper bounds of B form an interval [b, ∞).
Here b is an upper bound of B, and b is smaller than all other upper bounds of B.
Therefore

b = sup B.

Consider now

B = {1/2, 2/3, 3/4, ...}.

Here the upper bounds of B are the points of the interval [1, ∞), and sup B = 1.

[Figure: the points 1/2, 2/3, 3/4, 4/5, 5/6, ... on the number line, crowding up toward 1.]

In each of these cases, starting with a nonempty set B which is bounded above,
we have found that the upper bounds form an interval of the type [k, ∞), and k =
sup B. The following postulate says that this is what always happens:

The Least Upper Bound Postulate (LUBP). Let B be a nonempty set of numbers.
If B has an upper bound, then B has a supremum.
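To see concretely why no number smaller than 1 can be an upper bound of a set such as {1/2, 2/3, 3/4, ...}, we can search for an element that exceeds any proposed bound k < 1. The following is a minimal sketch using exact rational arithmetic; the set is assumed to consist of the numbers n/(n + 1), and the function names are hypothetical.

```python
from fractions import Fraction


def element(n):
    """The n-th element n/(n + 1) of the set B = {1/2, 2/3, 3/4, ...}."""
    return Fraction(n, n + 1)


def element_exceeding(k):
    """Given a candidate bound k < 1, return the first element of B larger than k.

    Such an element always exists, which is why no k < 1 is an upper bound of B.
    """
    if k >= 1:
        raise ValueError("every k >= 1 is an upper bound of B")
    n = 1
    while element(n) <= k:
        n += 1
    return element(n)


if __name__ == "__main__":
    for k in (Fraction(1, 2), Fraction(9, 10), Fraction(999, 1000)):
        print(f"k = {k}: the element {element_exceeding(k)} of B is larger")
```

Every element n/(n + 1) is less than 1, so 1 is an upper bound; the search shows that nothing smaller will do, which is exactly the statement sup B = 1.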
Using the least upper bound postulate, we shall show that no continuous function
can behave like Tan^-1 if its domain is a closed interval:

Theorem 3 (Existence of maxima). If f is continuous on [a, b], then f has a maximum
value on [a, b]. That is, for some x̄ on [a, b], the number M = f(x̄) satisfies
f(x) ≤ M for every x on [a, b].

Proof. We know by Theorem 1 that f is bounded. Let

k = sup f.

Then f(x) ≤ k for every x on [a, b]. We need to show that f(x̄) = k for some x̄.
Suppose not, and let

g(x) = 1/(k - f(x))     (a ≤ x ≤ b).

Then g is continuous. But g is unbounded. For suppose that

g(x) ≤ M     for a ≤ x ≤ b.

Then

1/(k - f(x)) ≤ M,     1/M ≤ k - f(x),

and

f(x) ≤ k - 1/M     for a ≤ x ≤ b.

This is impossible, because k is the least of the upper bounds of f.

Thus, if f has no maximum, there is a continuous function g which is unbounded
on [a, b]. This contradicts Theorem 1, and so completes the proof of Theorem 3.
We have already observed, in Section 5.4, that the existence of maxima implies
the existence of minima. Therefore

Theorem 4 (Existence of minima). If f is continuous on [a, b], then f has a minimum
value on [a, b].

(This was stated as a theorem in Section 5.4.)

PROBLEM SET 5.6

1. Let B be the set of all rational numbers p/q for which p^2/q^2 < 2. What is sup B?

2. Consider a circle of radius 1. For each polygon P inscribed in the circle, let k(P) be the
perimeter of P. Let B be the set of all numbers k(P). What is sup B?

3. Consider the graph of f(x) = sin x, 0 ≤ x ≤ π. Suppose that we cut up the interval
[0, π] into little intervals, in any way, using subdivision points 0 = x_1 < x_2 < ... <
x_i < x_(i+1) < ... < x_n = π. Over each little interval [x_i, x_(i+1)] we set up the tallest
possible inscribed rectangle with [x_i, x_(i+1)] as base. Let s be the sum of the areas of the
rectangles. Let B be the set of all numbers s which are obtainable in this way. What is
sup B? (A numerical answer is called for here.)

4. Let B be any set of numbers. If b ∈ B, and b is larger than every other element of B,
then b is called the greatest element of B, and we write b = Max B. Question: If B
has an upper bound, does it follow that B has a Max?

5. Suppose that we had defined bounds and suprema in the following way:
"Let B be a set of numbers, and let k be a number. If x < k, for every x in B,
then k is a strict upper bound of B. If k is a strict upper bound of B, and is smaller
than every other strict upper bound of B, then k = sup B."
a) What is the difference between this "definition" and the usual definition of upper
bounds and suprema?
Under the new "definition" of "supremum," which if any of the following statements
are true?
b) Every finite set has a "supremum."
c) No finite set has a "supremum."
d) Every open interval has a "supremum."
e) No open interval has a "supremum."
f) Every closed interval has a "supremum."
g) No closed interval has a "supremum."
6. If B is a set of numbers, then -B denotes the set obtained when we replace every
element x of B by its negative -x. That is,

-B = {-x | x ∈ B}.

For example, if B = [1, 2], then -B = [-2, -1]; if B = [-1, ∞), then -B = (-∞, 1],
and so on. Prove the following:

Theorem. If (a) k is an upper bound of B, then (b) -k is a lower bound of -B. And
conversely, (b) implies (a).

(This is easy; don't try to make it hard.)

7. If k is a lower bound of the set B, and k is greater than every other lower bound of B,
then k is called the infimum of B, and we write k = inf B. Show that if a set B is bounded
below, then B has an infimum.

8. Let B be a set which is bounded below, and let K be the set of all lower bounds of B.
Describe K in the interval notation.
*9. Let [a_1, b_1], [a_2, b_2], ... be a nested sequence, and let A = {a_1, a_2, ...},
B = {b_1, b_2, ...}. Show that (a) every number b_i is an upper bound of A. Let
x = sup A. Then show that (b) a_i ≤ x ≤ b_i for every i.
This result means that the least upper bound postulate (LUBP) implies the nested
interval postulate (NIP).

*10. Let K be a (nonempty) set of numbers, bounded above. Let A be the set of all numbers
a which are not upper bounds of K. That is, a ∈ A if a < k for some k in K.
Show that A cannot contain a greatest element.

*11. Show that the Dedekind cut postulate (DCP) implies the least upper bound postulate
(LUBP).

The results of Problems 9 and 11 mean that

DCP  =>  LUBP  =>  NIP.

Thus our only really new assumption, in this section, is DCP.


5.7

THE MEAN-VALUE THEOREM AND THE NO-JUMP THEOREM

The mean-value theorem was stated in Chapter 3, and we have been using it ever
since. We are now finally in a position to prove it. We need one preliminary result.

Rolle's Theorem. If f is continuous on the closed interval [a, b] and differentiable on
the open interval (a, b), and f(a) = f(b) = 0, then f'(x̄) = 0 for some x̄ between
a and b.

Proof. There are three cases to consider:

1) Suppose that f(x) = 0 for every x on [a, b]. Then any number x̄ between a and b
gives f'(x̄) = 0.

2) Suppose that f(x) > 0 for some x on [a, b]. Now f has a maximum at some x̄,
and x̄ is not a or b. Therefore f has an ILMax at x̄. By Theorem 3 of Section 5.2 it
follows that f'(x̄) = 0.

3) If f(x) < 0 for some x, then the minimum of f is an ILMin. By Theorem 4 of
Section 5.2 we know that at an ILMin the derivative vanishes.

The proof of MVT is now easy.

[Figure: graph of f over [a, b], together with the line through (a, f(a)) and (b, f(b)).]

Given that f is continuous on [a, b] and differentiable on (a, b), let g be the linear
function which agrees with f at a and at b. Thus

g(a) = f(a),     g(b) = f(b).

We could write a formula for g, in the form g(x) = mx + k, if we needed to, but
we don't need to. Since the derivative of a linear function is simply the slope of the
line which is its graph, we know that

g'(x) = (f(b) - f(a))/(b - a)

for every x. For each x of [a, b], let

φ(x) = f(x) - g(x).

Then φ is continuous on [a, b] (because f and g are), and φ is differentiable on (a, b),
with

φ'(x) = f'(x) - g'(x) = f'(x) - (f(b) - f(a))/(b - a).

Since φ(a) = φ(b) = 0, we can apply Rolle's theorem. Therefore φ'(x̄) = 0 for
some x̄. Thus

f'(x̄) - (f(b) - f(a))/(b - a) = 0,

and

f'(x̄) = (f(b) - f(a))/(b - a)

for some x̄, which was to be proved.
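The theorem asserts only that a suitable point exists; in a particular case we can locate one numerically. The sketch below is an illustration under an extra assumption which the theorem does not need: that φ' = f' - slope takes opposite signs at a and b, so that repeated halving applies. The function name and the sample function f(x) = x^3 are our own choices.

```python
def mean_value_point(f, fprime, a, b, tol=1e-12):
    """Find x in (a, b) with f'(x) = (f(b) - f(a)) / (b - a).

    Works by halving [a, b] while tracking a sign change of
    phi'(x) = f'(x) - slope. Assumes phi' has opposite signs at a and b.
    """
    slope = (f(b) - f(a)) / (b - a)

    def dphi(x):
        return fprime(x) - slope

    lo, hi = a, b
    if dphi(lo) * dphi(hi) > 0:
        raise ValueError("phi' does not change sign at the endpoints")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if dphi(lo) * dphi(mid) <= 0:
            hi = mid   # the sign change (or a zero) is in the left half
        else:
            lo = mid   # otherwise it is in the right half
    return (lo + hi) / 2


if __name__ == "__main__":
    # For f(x) = x^3 on [0, 2] the chord has slope 4, and 3x^2 = 4 at x = 2/sqrt(3).
    x_bar = mean_value_point(lambda x: x**3, lambda x: 3 * x**2, 0.0, 2.0)
    print(x_bar)
```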


The no-jump theorem is harder. To prove it, we need to go back to first principles,
and we need some preliminary results.

Lemma 1. Let f be a continuous function, on an open interval containing x_0. If
f(x_0) > 0, then there is a δ > 0 such that

x_0 - δ < x < x_0 + δ   =>   f(x) > 0.

Proof. Since f is continuous, we know that

lim f(x) = f(x_0)     (as x → x_0).

In the definition of a limit, we take ε = f(x_0) > 0. There is a δ > 0 such that

x_0 - δ < x < x_0 + δ   =>   f(x_0) - ε < f(x) < f(x_0) + ε
                         =>   0 < f(x),

because f(x_0) - ε = 0. Therefore the δ that we have is the δ that we wanted.

Lemma 2. Let f be a continuous function, on an interval containing x_0. If f(x_0) < 0,
then there is a δ > 0 such that

x_0 - δ < x < x_0 + δ   =>   f(x) < 0.

Proof? (The proof of Lemma 1 can be adapted, to give a proof of Lemma 2. But
it is quicker to derive Lemma 2 from the statement of Lemma 1.)

A function f changes sign, on an interval I, if f(x) > 0 for some x in I and f(x') < 0
for some x' in I.

Lemma 3. If f is continuous, on an interval containing x_0, and f(x_0) ≠ 0, then there
is a δ > 0 such that f does not change sign on the interval (x_0 - δ, x_0 + δ).

Proof. For f(x_0) > 0, this follows from Lemma 1. For f(x_0) < 0, it follows from
Lemma 2.
We are now ready to prove the following convenient special case of the no-jump
theorem.

Theorem 1. If f is continuous on [a, b], and f changes sign on [a, b], then
f(x_0) = 0 for some x_0 in [a, b].

The proof is based on Lemma 3 and the nested interval postulate (NIP). We
suppose that f(x) ≠ 0 for every x in [a, b]. We shall show that this assumption leads
to a contradiction.
Given that f changes sign on [a, b] and that f(x) is never 0, it follows that f
changes sign on one of the halves of [a, b]. We recall, from Section 5.6, that the
halves of [a, b] are [a, (a + b)/2] and [(a + b)/2, b]. Let [a_1, b_1] be half of [a, b],
such that f changes sign on [a_1, b_1]. Similarly, let [a_2, b_2] be half of [a_1, b_1], such
that f changes sign on [a_2, b_2]. Proceeding to infinity in this way, we get a nested
sequence

[a_1, b_1], [a_2, b_2], ...

of closed intervals, such that f changes sign on each of them. Evidently

b_i - a_i = (1/2^i)(b - a),

and so

lim (b_i - a_i) = 0     (as i → ∞),

as in the proof of Theorem 1 of Section 5.6. By NIP, there is an x_0 which lies on all
of the intervals in the nested sequence. That is,

a_i ≤ x_0 ≤ b_i     for every i.

By Lemma 3 there is a δ > 0 such that f does not change sign on the interval
(x_0 - δ, x_0 + δ). By Theorem 2 of Section 5.6, there is an i for which [a_i, b_i] lies in
(x_0 - δ, x_0 + δ), as indicated in the figure.

[Figure: the closed interval [a_i, b_i] lying inside the open interval (x_0 - δ, x_0 + δ).]

This is impossible, because f changes sign on [a_i, b_i], but does not change sign on
(x_0 - δ, x_0 + δ). This contradiction completes the proof of Theorem 1.
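The halving process in this proof is also a practical way of closing down on a zero of a function. The sketch below is a computational analogue, not the proof itself: it assumes the stronger hypothesis that f(a) and f(b) have opposite signs, so that a sign change can be detected at the endpoints of each half. The function name is hypothetical.

```python
def nested_halves(f, a, b, steps=40):
    """Return nested intervals [a_i, b_i], each a half of the preceding one,
    with f taking opposite signs at the two endpoints of every interval.

    Assumes f is continuous on [a, b] and f(a), f(b) have opposite signs.
    Stops early if a midpoint happens to be an exact zero of f.
    """
    if f(a) * f(b) >= 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    intervals = []
    for _ in range(steps):
        mid = (a + b) / 2
        if f(mid) == 0:
            intervals.append((mid, mid))  # the zero itself has been found
            break
        if f(a) * f(mid) < 0:
            b = mid   # f changes sign on the left half
        else:
            a = mid   # otherwise f changes sign on the right half
        intervals.append((a, b))
    return intervals


if __name__ == "__main__":
    # f(x) = x^2 - 2 changes sign on [1, 2]; the intervals close down on sqrt(2).
    for a_i, b_i in nested_halves(lambda x: x * x - 2, 1.0, 2.0, steps=10):
        print(f"[{a_i:.6f}, {b_i:.6f}]")
```

After i steps the interval has length (b - a)/2^i, in agreement with the formula used in the proof.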
It is now easy to prove the no-jump theorem.

Theorem 2 (The no-jump theorem). If f is continuous on [x_1, x_2], then f takes on
every value between f(x_1) and f(x_2).

Proof. Suppose first that

f(x_1) < k < f(x_2),

and let

g(x) = f(x) - k.

Then g changes sign on [x_1, x_2]. Therefore g(x_0) = 0 for some x_0 on [x_1, x_2]. This
gives f(x_0) - k = 0, and f(x_0) = k.
If f(x_2) < k < f(x_1), then the same function g still changes sign, and so the proof
is exactly the same.

This completes our reexamination of the foundations of calculus. It now appears
that the idea of a continuous function is adequately described by the εδ-definition
of a limit and that the completeness of the number system R, in the sense of "no
holes," is adequately described by the Dedekind cut postulate (which implies the least
upper bound postulate and the nested interval postulate).
The theorems in this section and the preceding one are not news; it was obvious
at the outset that these theorems ought to be true. But the fact that these theorems
can be proved, on the basis of a single simple assumption DCP, is significant. It
means that mathematics hangs together in a special way.
Nobody expects that a doctor will write down a definition of the word man
and then write a few assumptions about men, in such a way that all medical science
can be derived by logical reasoning from the definition and from the assumptions.
Medicine is an empirical science: it depends on observations of fact, not just at the
outset but continually. Mathematics is different.
Moreover, in your study of mathematics you have already passed the point where
the truth can be relied upon to be obvious and where obvious things can be relied
on to be true. From now on, logic is going to be an important part of your
mathematical equipment. This is partly due to recent developments. As late as 1800,
calculus was illogical, and very few people cared. In the last century, however,
mathematical ideas which require careful logical analysis have become more
important, in pure research and also in applications.

5.8

THE DERIVATIVE OF ONE FUNCTION WITH RESPECT TO ANOTHER

Let f and g be differentiable functions. Take a point x_0, and form the differences

Δf = f(x_0 + Δx) - f(x_0),
Δg = g(x_0 + Δx) - g(x_0).

If Δf/Δg approaches a limit, as Δx → 0, then this limit is called the derivative
of f with respect to g, and is denoted by df/dg. That is,

lim Δf/Δg = df/dg     (as Δx → 0),

by definition. In fact, the limit always exists, whenever g'(x_0) ≠ 0.

Theorem 1.

df/dg = f'/g',

wherever g'(x) ≠ 0.

Proof.

lim Δf/Δg = lim (Δf/Δx)/(Δg/Δx) = f'(x_0)/g'(x_0)     (as Δx → 0).

For the case in which g(x) = x for every x, the derivative of f with respect to g
reduces to an ordinary derivative:

Theorem 2. If g(x) = x for every x, then

df/dg = df/dx = f'(x).

Obviously,

df/dg = lim Δf/Δx = f'(x_0)     (as Δx → 0),

for each x_0.

Some examples are as follows:

d sin x / d cos x = cos x / (-sin x) = -cot x     (wherever sin x ≠ 0),

d sin x / dx = cos x,

d e^x / d x^2 = e^x / (2x)     (wherever x ≠ 0).
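The definition of df/dg as a limit of Δf/Δg can be watched at work numerically. The following sketch, with a hypothetical helper name, forms the quotient of differences for f = sin and g = cos at x_0 = 1 and compares it with the value -cot 1 given by Theorem 1.

```python
import math


def quotient_of_differences(f, g, x0, dx):
    """Return (f(x0 + dx) - f(x0)) / (g(x0 + dx) - g(x0)), that is, delta f / delta g."""
    delta_f = f(x0 + dx) - f(x0)
    delta_g = g(x0 + dx) - g(x0)
    return delta_f / delta_g


if __name__ == "__main__":
    x0 = 1.0
    expected = -1 / math.tan(x0)  # f'/g' = cos x / (-sin x) = -cot x
    for dx in (1e-1, 1e-2, 1e-3, 1e-4):
        q = quotient_of_differences(math.sin, math.cos, x0, dx)
        print(f"dx = {dx:g}:  quotient = {q:.8f}   (-cot 1 = {expected:.8f})")
```

The point x_0 = 1 is chosen so that g'(x_0) = -sin 1 is not 0, as Theorem 1 requires.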

We often write

(d/dx) f(x)     for     df/dx.

Thus every derivative can be written in the form

f'(x) = f' = (d/dx) f(x) = df/dx.

The notation df/dx for derivatives is widely used, especially in physics, and it is
natural to use it when you are continually dealing with the derivative df/dg of one
function with respect to another. It has a disadvantage, however: there is no
convenient way to write the value of the derivative at a particular point x_0. Sometimes
we denote this by

df/dx |_(x = x_0),

but the notation f'(x_0) is more convenient.

We now want to prove a sort of cancellation law

(df/dg) · (dg/dh) = df/dh.

We can derive this from the equation

(Δf/Δg) · (Δg/Δh) = Δf/Δh,

taking the limit as Δx → 0. Thus we need

Δf/Δg → df/dg,     Δg/Δh → dg/dh,

as Δx → 0. This requires

g'(x_0) ≠ 0,     h'(x_0) ≠ 0,

as in Theorem 1. Hence the conditions in the following theorem:

Theorem 3. If f, g, and h are differentiable, then

(df/dg) · (dg/dh) = df/dh,

wherever g' ≠ 0 and h' ≠ 0.

Theorem 4.

dg/df = 1/(df/dg),

wherever df/dg ≠ 0.

(The limit of the quotient is the quotient of the limits.)


We shall now find a short-cut for calculating derivatives of the type df/dg.
Consider

d sin^2 x / d sin x = (2 sin x cos x)/cos x = 2 sin x     (cos x ≠ 0).

This has the form

du^2/du = 2u.

This is like

df/dx = f'(x) = 2x,     where f(x) = x^2.

That is, to find du^2/du (where u is a function), we treat u as if it were a dummy variable
x and differentiate in one step. This is an example of the following situation.

Definition. Let f and g be functions. If there is a function φ such that f = φ(g),
then we say that f is a function of g.

For example, sin^2 x is a function of sin x, with φ(u) = u^2. And cos^2 x - 2 cos x
is a function of cos x, with φ(u) = u^2 - 2u. The easiest way to calculate df/du, in
each of these cases, is to write

d sin^2 x / d sin x = du^2/du = 2u = 2 sin x,

d(cos^2 x - 2 cos x) / d cos x = d(u^2 - 2u)/du = 2u - 2 = 2 cos x - 2.

This procedure is justified by the following theorem.

Theorem 5. Let f be a function of g, f = φ(g), where all the functions are differentiable.
Then

df/dg = φ'(g),

wherever g' ≠ 0.

Proof.

df/dg = f'/g' = φ'(g)g'/g' = φ'(g).

Using this theorem, we can write immediately

d tan^2 x / d tan x = 2 tan x,

instead of using Theorem 1 and writing

d tan^2 x / d tan x = (2 tan x sec^2 x)/sec^2 x = 2 tan x;

there is no point in writing sec^2 x = D tan x in both numerator and denominator,
since we are about to cancel it out in any case.
PROBLEM SET 5.8

Calculate df/dg, given:

1. /(x) =e"' , g(x) = x2

2. /(x) = e'" ,g(x) =2x

3. f(x) =e"', g(x) =Tan x

4. f(x) =exsinz,g(x) =x

5. f(x) =exsinx,g(x) = x3

__

7. /(x) =e'"3, g(x) =x4

6. f(x) =e"' ,g(x) = x2


8. f(x) = sin x,g(x) = cos x

9. f(x) = x3,g(x) =Tan x


Problems 10 through 14. In Problems 1 through 5, first calculate the function φ such
that f = φ(g). (Answer in the form φ(u) = ____.) Then calculate φ'(u) = ____. Finally,
calculate φ'(g), and compare it with your previous formula for df/dg. (Or, if you worked
the problems this way in the first place, work them by the other method, and check.)

Calculate as for Problems 1 through 9.
15. f(t) =sin t,g(t) =et

16. f(t) =cost, g(t) =Tan t

17. f(t) =t6,g(t) =t3

18. /(t) = t6,g(t) =Tan t

19. f(x) = ln x, g(x) =e"'


Problems 20 through 24. Solve the preceding five problems by another method.

25. Given f^2 + t^2 + 1 = 0, find df/dt.

26. Given f^3 + t^3 = 1, find df/dt. Then calculate f = f(t), find f'(t), and compare the
result with df/dt.

27. Same, for f^4 + t^4 = 1. (Here there are two functions f = f(t) to be considered.)

28. Now try to check your answer to Problem 25 in the same way that you checked your
answers to Problems 26 and 27. (It often happens that a formal process gives "answers"
in cases where there never was a question.)

6

The Technique of Integration

6.1

INTRODUCTION

In Section 3.7 we found a way to solve certain types of area problem. To find the
area under the graph of a continuous function f, from a to b, we introduce the area
function

F(x) = ∫_a^x f(t) dt.

[Figure: the region under the graph of f, from a to b, with area A = ∫_a^b f(t) dt.]

We know that

F'(x) = f(x)     for every x.

To calculate the area function, we find another function G such that

G' = f.

We then know that

G' = F'.

If it happens that G(a) = 0, then we have

G(x) = F(x)     for every x.

If not, we let

H(x) = G(x) - G(a).

Then

H'(x) = G'(x) = F'(x)     and     H(a) = 0.

Therefore

H(x) = F(x)     for every x,

and so

∫_a^b f(t) dt = F(b) = H(b) = G(b) - G(a).

To sum up:

G' = f   =>   ∫_a^b f(t) dt = G(b) - G(a).

The notations ∫ f(t) dt and G were introduced for the sake of the derivation. Once we have
the answer, it is natural to use ∫ f(x) dx and F, and write:

F' = f   =>   ∫_a^b f(x) dx = F(b) - F(a).

More formally:

Theorem 1 (The fundamental theorem of integral calculus). If f is continuous on
[a, b], and F' = f, then

∫_a^b f(x) dx = F(b) - F(a).
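As a numerical illustration of the theorem, we can approximate the area under a graph by a sum of thin rectangles and compare the result with F(b) - F(a). The sketch below uses a midpoint sum, which is our own choice of approximation and is not taken from the text; the sample function f(x) = x^2 with F(x) = x^3/3 is likewise only an example.

```python
def midpoint_sum(f, a, b, n=100_000):
    """Approximate the area under the graph of f from a to b with n rectangles,
    each of width (b - a)/n and height f(midpoint of its base)."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def f(x):
    return x ** 2


def F(x):
    # An antiderivative of f: F'(x) = x^2.
    return x ** 3 / 3


if __name__ == "__main__":
    a, b = 0.0, 3.0
    print("rectangle sum:", midpoint_sum(f, a, b))
    print("F(b) - F(a):  ", F(b) - F(a))
```

The two printed numbers agree to many decimal places, as the theorem leads us to expect.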

To apply the theorem, of course, we need to find F when f is given. This process
is called antidifferentiation. We shall see later that the method of antidifferentiation
enables us to solve not only the sort of area problems that we have used it on so far,
but also a variety of problems which, offhand, don't look like area problems at all.
But these applications should be postponed. The point is that, to apply the method,
we need to know how to calculate a function F whose derivative is a given function f;
up to now we have been finding such functions F only by hit-or-miss procedures, in
simple cases; and it would not be good to reduce various problems to problems in
antidifferentiation, when we are unable to solve the antidifferentiation problems.
We should therefore first learn better methods for calculating functions when their
derivatives are given.

6.2

INDEPENDENT VARIABLES AND INDEFINITE INTEGRALS

The usual way of defining a function is to write an expression which gives the value
of the function for every number in the domain. For example, we may define functions
f and g by writing

f(x) = x^2   (-∞ < x < ∞),     g(x) = √x   (x ≥ 0).

In these formulas, the letter "x" is called the independent variable. It is simply a
dummy letter, marking the places where numbers are to be inserted. Logically
speaking, it makes no difference what letter we use as a dummy. For example, we
could have defined exactly the same functions by writing

f(t) = t^2   (-∞ < t < ∞),     g(t) = √t   (t ≥ 0).

If we have decided to use, say, x as the dummy, then we say that f is a function of x.
Thus, when we write g(t) = √t, we are describing g as a function of t; a formula which
gives h(α) in terms of α describes h as a function of α; and so on.

We return now to the problem of antidifferentiation. We found long ago, from
the uniqueness theorem, that if two functions have the same derivative, on an interval,
then they differ by a constant. Thus, if

f'(x) = x^2     (-∞ < x < ∞),

then f must be a function of the form

f(x) = x^3/3 + C,

where C is a constant. The converse is trivial: for every C, D(x^3/3 + C) = x^2.
Therefore

{F | F' = x^2} = {x^3/3 + C}.

The set of all functions F for which F' = f is commonly denoted by

∫ f(x) dx.

This is called the indefinite integral of f. Thus

∫ x^4 dx = {F | F'(x) = x^4} = {x^5/5 + C},

∫ cos x dx = {F | F'(x) = cos x} = {sin x + C},

and so on. Any other dummy letter would have done as well:

∫ t^4 dt = {F | F'(t) = t^4},

and

∫ cos t dt = {F | F'(t) = cos t}.


In each case, the braces on the right indicate that we are talking about the set of all
functions of the form given inside. The symbol dx (or dt) merely reminds us that
x (or t) is the dummy letter used in describing the function. In the examples above,
the reminder may seem unnecessary. Similarly, when we write

∫ (3x^3 + 2x^4) dx = {3x^4/4 + 2x^5/5 + C},

we might have gotten along without the "dx," because the only constants involved
are the numerical constants 2 and 3. On the other hand, if we write

∫ (αx^3 y + βx^2 y^2) dx,

the "dx" is needed; it tells us that α, β, and y are to be regarded as constants, and that
the function which we are dealing with is

f(x) = αx^3 y + βx^2 y^2.

When the problem is understood in this sense, it is plain that the answer is

∫ (αx^3 y + βx^2 y^2) dx = {αx^4 y/4 + βx^3 y^2/3 + C}.          (i)

This should be compared with

∫ (αx^3 y + βx^2 y^2) dy = {αx^3 y^2/2 + βx^2 y^3/3 + C},        (ii)

∫ (αx^3 y + βx^2 y^2) dα = {α^2 x^3 y/2 + βx^2 y^2 α + C},       (iii)

∫ (αx^3 y + βx^2 y^2) dβ = {αx^3 y β + β^2 x^2 y^2/2 + C}.       (iv)

In (ii), α, β, and x are constants, and the function is

g(y) = αx^3 y + βx^2 y^2.

In (iii), x, y, and β are constants, and the function is

h(α) = αx^3 y + βx^2 y^2.

Similarly for (iv).


The process of calculating indefinite integrals is called indefinite integration, or
briefly, integration. Given a differentiation formula, we can get a corresponding
integration formula merely by writing the given formula "backwards," with minor
adjustments in some cases to take care of constants. In each case below, the formula
on the right follows from the formula or formulas on the left.

D x^n = n x^(n-1)  (n ≠ 0),
D x^(n+1) = (n + 1) x^n,
D (x^(n+1)/(n + 1)) = x^n  (n ≠ -1)      =>   ∫ x^n dx = {x^(n+1)/(n + 1) + C}   (n ≠ -1),

D √x = 1/(2√x),
D (2√x) = 1/√x                           =>   ∫ (1/√x) dx = {2√x + C},

D sin x = cos x                          =>   ∫ cos x dx = {sin x + C},

D cos x = -sin x,
D (-cos x) = sin x                       =>   ∫ sin x dx = {-cos x + C},

D ln x = 1/x  (x > 0)                    =>   ∫ (1/x) dx = {ln x + C}   (x > 0),

D e^x = e^x                              =>   ∫ e^x dx = {e^x + C}.
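Any integration formula obtained in this way can be checked by differentiating its right-hand side. The sketch below does this numerically with a symmetric difference quotient, which is our own choice of approximation; the table pairs each integrand with the antiderivative given above (taking C = 0 and n = 3 in the power formula), and the names are hypothetical.

```python
import math


def central_difference(F, x, h=1e-6):
    """Approximate F'(x) by the symmetric quotient (F(x + h) - F(x - h)) / (2h)."""
    return (F(x + h) - F(x - h)) / (2 * h)


# (description, integrand f, antiderivative F with C = 0)
TABLE = [
    ("x^3",       lambda x: x ** 3,           lambda x: x ** 4 / 4),
    ("1/sqrt(x)", lambda x: 1 / math.sqrt(x), lambda x: 2 * math.sqrt(x)),
    ("cos x",     math.cos,                   math.sin),
    ("sin x",     math.sin,                   lambda x: -math.cos(x)),
    ("1/x",       lambda x: 1 / x,            math.log),
    ("e^x",       math.exp,                   math.exp),
]


def max_error(x=1.7):
    """Largest gap between the numerical derivative of F and f, at the point x > 0."""
    return max(abs(central_difference(F, x) - f(x)) for _, f, F in TABLE)


if __name__ == "__main__":
    for name, f, F in TABLE:
        print(f"{name:>10}:  D F = {central_difference(F, 1.7):.6f}   f = {f(1.7):.6f}")
```

The test point must be positive, since √x and ln x are involved.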

We know many more differentiation formulas than this, and so we could have
written many more integration formulas. But we postpone the complete list until we
can write it in a better form, which we shall now explain.
Given a function f, if u is another function, then f(u) is a composite function.
By the chain rule,

D f(u) = f'(u)u'.

It follows that

∫ f'(u(x))u'(x) dx = {f(u(x)) + C}.

For example, if

f(u) = sin u,     u(x) = x^2 + 1,

then

D[f(u(x))] = D[sin (x^2 + 1)] = f'(u)u'(x) = (cos u)2x = [cos (x^2 + 1)]2x.

Therefore

∫ [cos (x^2 + 1)]2x dx = {sin (x^2 + 1) + C}.

More generally,

D sin u(x) = [cos u(x)]u'(x),

and so

∫ [cos u(x)]u'(x) dx = {sin u(x) + C}.

This works for any functions. If

F' = f,

so that

∫ f(x) dx = {F(x) + C},

then

D[F(u(x))] = F'(u(x))u'(x) = f(u(x))u'(x),

so that

∫ f(u(x))u'(x) dx = {F(u(x)) + C}.

In such formulas, we abbreviate u'(x) dx by the symbol du. Thus we write

∫ cos u du = {sin u + C},

which means that for every differentiable function u(x), we have

∫ [cos u(x)]u'(x) dx = {sin u(x) + C}.

Similarly, we write

∫ e^u du = {e^u + C},

which means that if u is any differentiable function, then

∫ e^(u(x)) u'(x) dx = {e^(u(x)) + C}.

This is true, because

D e^(u(x)) = e^(u(x)) u'(x).

Using different dummy letters, we can convert the above formula to any of the forms

∫ e^(u(t)) u'(t) dt,

and so on.

More often, however, we start with an integral described in the long
notation and observe that it is convertible to a short form. For example,

∫ e^(x^2+1) 2x dx

has the form

∫ e^u du,

where u(x) = x^2 + 1. Therefore

∫ e^(x^2+1) 2x dx = ∫ e^u du = {e^u + C} = {e^(x^2+1) + C}.

Similarly, ∫ [sin (t^2 + 1)]2t dt has the form ∫ sin u du. Therefore

∫ [sin (t^2 + 1)]2t dt = ∫ sin u du = {-cos u + C} = {-cos (t^2 + 1) + C}.

Note that the solution is not finished in the third formula above, because u is a
function. To complete the solution, we need to express the function in terms of
the dummy letter t.

To sum up:

F' = f   =>   D[F(u)] = f(u)u'.

Therefore

∫ f(x) dx = {F(x) + C}   =>   ∫ f(u(x))u'(x) dx = {F(u(x)) + C}.

In the abbreviated form, using du for u'(x) dx, we have

∫ f(x) dx = {F(x) + C}   =>   ∫ f(u) du = {F(u) + C}.

Using this general idea, we can write all of our old integration formulas in the more
general form. The first few look like this:

∫ u^n du = {u^(n+1)/(n + 1) + C}   (n ≠ -1),     ∫ (1/√u) du = {2√u + C},

∫ cos u du = {sin u + C},                        ∫ sin u du = {-cos u + C},

∫ (1/u) du = {ln u + C}   (u > 0),               ∫ e^u du = {e^u + C}.

And of course we have

∫ [f(x) + g(x)] dx = ∫ f(x) dx + ∫ g(x) dx,

∫ kf(x) dx = k ∫ f(x) dx,   k ≠ 0,

because

D[f + g] = Df + Dg     and     D(kf) = kDf.

Let us now consider how to apply such formulas as these, as a practical matter.

Example 1. Consider

∫ (x^2 + 1)^7 x dx.

This is almost, but not quite, in the form

∫ u^7 du.

If we take

u(x) = x^2 + 1,

then

du = u'(x) dx = 2x dx.

We therefore have

∫ (x^2 + 1)^7 x dx = ∫ (1/2)(x^2 + 1)^7 2x dx = ∫ (1/2)u^7 du
                   = {(1/2)(1/8)u^8 + C} = {(1/16)(x^2 + 1)^8 + C}.

This checks:

D[(1/16)(x^2 + 1)^8] = (1/16) · 8(x^2 + 1)^7 · 2x = x(x^2 + 1)^7.
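The check at the end of Example 1 can also be carried out mechanically, by multiplying out the polynomials and differentiating term by term. The sketch below does this with exact fractions; polynomials are stored as coefficient lists, lowest degree first, and all the helper names are our own.

```python
from fractions import Fraction


def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (lowest degree first)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def poly_pow(p, n):
    """Raise a polynomial to the n-th power by repeated multiplication."""
    out = [Fraction(1)]
    for _ in range(n):
        out = poly_mul(out, p)
    return out


def poly_derivative(p):
    """Differentiate term by term: the coefficient of x^k becomes k times itself at x^(k-1)."""
    return [k * c for k, c in enumerate(p)][1:]


u = [Fraction(1), Fraction(0), Fraction(1)]                        # u(x) = x^2 + 1
antiderivative = [c / 16 for c in poly_pow(u, 8)]                  # (1/16)(x^2 + 1)^8
integrand = poly_mul([Fraction(0), Fraction(1)], poly_pow(u, 7))   # x (x^2 + 1)^7

if __name__ == "__main__":
    print(poly_derivative(antiderivative) == integrand)
```

The derivative of the expanded antiderivative and the expanded integrand are the same list of coefficients, which confirms the answer without any approximation.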

Example 2. Consider

$$\int \frac{\cos\sqrt{x}}{\sqrt{x}}\,dx \quad (x > 0).$$

The only form that might fit this integral is the form $\int \cos u\,du$. Thus we would have

$$u(x) = \sqrt{x}, \qquad du = u'(x)\,dx = \frac{1}{2\sqrt{x}}\,dx.$$

The only difference between what we have and what we want is a multiplicative constant. Therefore

$$\int \frac{\cos\sqrt{x}}{\sqrt{x}}\,dx = \int 2(\cos\sqrt{x})\frac{1}{2\sqrt{x}}\,dx = 2\int (\cos\sqrt{x})\frac{1}{2\sqrt{x}}\,dx = 2\int \cos u\,du = \{2\sin u + C\} = \{2\sin\sqrt{x} + C\}.$$

Example 3. Consider

$$\int e^{\cos x}\sin x\,dx.$$

So far we have only one integration formula involving the exponential function:

$$\int e^{u}\,du = \{e^{u} + C\}.$$

If our problem fits this form, we must have

$$u(x) = \cos x, \qquad du = u'(x)\,dx = -\sin x\,dx.$$

Here again the multiplicative constant causes no trouble:

$$\int e^{\cos x}\sin x\,dx = -\int e^{\cos x}(-\sin x)\,dx = -\int e^{u}\,du = \{-e^{u} + C\} = \{-e^{\cos x} + C\}.$$

Below we shall give a list of all the integration formulas that we can write, at this stage, on the basis of the differentiation formulas that we know. Special explanations are needed, however, in connection with the formula for $\int (1/u)\,du$. Given a function $u$, defined on a domain where $u(x) > 0$ for every $x$, we know that

$$D \ln u(x) = \frac{1}{u(x)}\,Du(x).$$

We need to know that $u(x) > 0$ on the domain under consideration, because only positive numbers have logarithms. Therefore we write

$$\int \frac{1}{u}\,du = \{\ln u + C\} \quad (u > 0).$$

But even where $u(x) < 0$, it makes sense to write

$$\int \frac{1}{u}\,du;$$

that is, it makes sense to ask what functions have $(1/u)u'$ as their derivatives. The answer is easy: if $u(x) < 0$, then $-u(x) > 0$. Therefore $-u(x)$ has a logarithm, and

$$D[\ln (-u(x))] = \frac{1}{-u(x)}\,D[-u(x)] = \frac{1}{-u(x)}\,(-u'(x)) = \frac{1}{u(x)}\,u'(x).$$

This gives us

$$\int \frac{1}{u}\,du = \{\ln (-u) + C\} \quad (u < 0).$$

Hence the two formulas for $\int (1/u)\,du$ in the list below.

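$$\int kf(x)\,dx = k\int f(x)\,dx \quad (k \ne 0) \qquad (1)$$

$$\int [f(x) + g(x)]\,dx = \int f(x)\,dx + \int g(x)\,dx \qquad (2)$$

$$\int u^{n}\,du = \left\{\frac{u^{n+1}}{n+1} + C\right\} \quad (n \ne -1) \qquad (3)$$

$$\int \frac{1}{u}\,du = \{\ln u + C\} \quad (u > 0) \qquad (4)$$

$$\int \frac{1}{u}\,du = \{\ln (-u) + C\} \quad (u < 0) \qquad (5)$$

$$\int \sin u\,du = \{-\cos u + C\} \qquad (6)$$

$$\int \cos u\,du = \{\sin u + C\} \qquad (7)$$

$$\int \sec^2 u\,du = \{\tan u + C\} \qquad (8)$$

$$\int \csc^2 u\,du = \{-\cot u + C\} \qquad (9)$$

$$\int \sec u \tan u\,du = \{\sec u + C\} \qquad (10)$$

$$\int \csc u \cot u\,du = \{-\csc u + C\} \qquad (11)$$

$$\int e^{u}\,du = \{e^{u} + C\} \qquad (12)$$

$$\int a^{u}\,du = \left\{\frac{a^{u}}{\ln a} + C\right\} \quad (a > 0,\ a \ne 1) \qquad (13)$$

$$\int \frac{du}{\sqrt{1 - u^2}} = \{\operatorname{Sin}^{-1} u + C\} \quad (|u| < 1) \qquad (14)$$

$$\int \frac{du}{1 + u^2} = \{\operatorname{Tan}^{-1} u + C\} \qquad (15)$$

$$\int \frac{du}{u\sqrt{u^2 - 1}} = \{\operatorname{Sec}^{-1} u + C\} \quad (u > 1) \qquad (16)$$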

To solve the following problems, you will start by expressing the given integral in the form $\int f(u)\,du$. In each such case, you should (a) say what $u$ and $du$ are and (b) state the general formula that you are applying. It is natural to write down the original integral first, and after this it would be awkward to interrupt the solution with the formulas for $u(x)$ and $du = u'(x)\,dx$. But $u$ and $du$ can be filled in on the right, like this, for example:

$$\begin{array}{ll}
\displaystyle\int (x^3 + 1)^{10} x^2\,dx = \int \tfrac{1}{3}(x^3 + 1)^{10}\,3x^2\,dx & \qquad u(x) = x^3 + 1 \\[4pt]
\displaystyle\qquad = \int \tfrac{1}{3}u^{10}\,du = \{\tfrac{1}{3}\cdot\tfrac{1}{11}u^{11} + C\} & \qquad du = 3x^2\,dx \\[4pt]
\displaystyle\qquad = \{\tfrac{1}{33}(x^3 + 1)^{11} + C\}. &
\end{array}$$

This form of the solution shows what we have in mind; writing formulas of the type $u(x) = x^3 + 1$, $du = 3x^2\,dx$, $\int \tfrac{1}{3}u^{10}\,du = \{\tfrac{1}{3}\cdot\tfrac{1}{11}u^{11} + C\}$ will help you to avoid mistakes. For example, you might write hastily

$$(?) \qquad \int (x^3 + 1)^{10} x^2\,dx = \{\tfrac{1}{11}(x^3 + 1)^{11} + C\},$$

as if it were true that for $u(x) = x^3 + 1$, $du = x^2\,dx$. When we write formulas for $u$ and $du$, we uncover such errors.

Similarly for the following wrong solution:

$$(?) \qquad \int (x^2 + 1)^2\,dx = \{\tfrac{1}{3}(x^2 + 1)^3 + C\}.$$

In full, the solution would begin like this:

$$(?) \qquad \int (x^2 + 1)^2\,dx \qquad\qquad u = x^2 + 1, \quad du = dx \ (?!)$$

The error is obvious, and so we start over again:

$$\int (x^2 + 1)^2\,dx = \int (x^4 + 2x^2 + 1)\,dx = \{\tfrac{1}{5}x^5 + \tfrac{2}{3}x^3 + x + C\}.$$
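The difference between the hasty answer and the corrected one shows up at once when each is checked by differentiation. The following is a minimal numerical sketch of that check, assuming a central-difference derivative; the names `hasty`, `expanded`, and `derivative_gap` are illustrative and not taken from the text.

```python
def integrand(x):
    # The integrand (x^2 + 1)^2
    return (x * x + 1) ** 2


def hasty(x):
    # The wrong answer marked (?) in the text: (1/3)(x^2 + 1)^3
    return (x * x + 1) ** 3 / 3


def expanded(x):
    # The corrected answer: x^5/5 + 2x^3/3 + x
    return x ** 5 / 5 + 2 * x ** 3 / 3 + x


def central_difference(f, x, h=1e-6):
    # Symmetric difference quotient approximating f'(x)
    return (f(x + h) - f(x - h)) / (2 * h)


def derivative_gap(candidate, x):
    # How far D[candidate] is from the integrand at x
    return abs(central_difference(candidate, x) - integrand(x))


for x in (0.5, 1.0, 2.0):
    print(x, derivative_gap(hasty, x), derivative_gap(expanded, x))
```

The gap for the hasty answer is large because its derivative carries the extra factor $2x$ that the missing $du = 2x\,dx$ would have supplied; the gap for the expanded answer is at the level of rounding error.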

PROBLEM SET 6.2

Calculate the following integrals, and check by differentiation in each case. Some of
these problems fit together in sequences, in which the answer to one problem helps in the
solution of another; you should watch for such patterns.

1.
4.

f
I

(1 + x2)3x dx

2.

(t4 + l)t8dt

5.

f
f (x2

(1 + t 3)t2dt

3.

t2)3 dx

6.

f
f (x2

(2 + u2)3u du
+

t2}8tx dx

264

7.

10.

13 .

16.

19.

21.

22.

24.

27.

30 .

33.

3 6.

39.

42.

45.

48.

51.

54.

57.

The Technique of Integration

f
J
f
f
f
f
fl
f
J
J
J
J
J
J
J
J
J
J
f

(x2 + t2)3 txd t

(rs/2

l)rs/2dt

11.

v'cos x sin xdx

14.

(e"'

-l- e-"')2 e"'

x2

(I

+ x3)3

f
f
f

8.

e-"')dx

6.2

(l

1
+ y1x)3-=dx
v'x

(I

+ sin x)2 cos xdx

+ 2)4 e"' dx

(e"'

17.

x2

(e"'

12.

15.

f :

20.

dx

dx
1 + xs

9.

f
f
f

e-"')3 dx

x2

(t3f2 + 5)10.,/(dt

(1 + tan x)3i2 sec2 x dx

(e"' - 2)3e-"' dx

f :

18.

(1

dx
x2)2

dx

(There are two intervals to be considered in this problem.)

23. f

in xdx

sin x cos xdx

25.

sin101 x cos xdx

28.

cos57 x sin xdx

31.

(cot2 0 + 1) dO

34.

cos 0

-- dO

37.

(cos2 0 - sin2 0)dO

40.

cos2 0dO

43.

sin2 0

sin2 0dO

46.

cos2 0 sin 0dO

49.

cos (0/2) dO

52.

x e-"'2 dx

55.

e2"'dx

58.

J
J
J
J
J
J
J
J
J
fJl
J
J

In (x2)dx

(Same comment.)

sin2 x cos xdx

26.

cos2 x sin x dx

29.

(1 + tan2 0)dO

3 2.

cot2 0dO

3 5.

sin 20d0

38.

(cos2 0 + sin2 0)dO

41.

(1 - 2 sin2 0)dO

44.

sin2 20dO

47.

sin 0

50.

(1 - sin2 0)dO

- cos 0
dO
2

5 3.

t2et3 dt

56.

e5t dt

59.

J
J
J
J
J
J
J
J
J
J
J
J

sin3 x cos xdx

cos3 x sin xdx

tan2 OdO
sin 0
--do
cos2 0
cos 20dO

(2 cos2 0 - 1)dO

(2 sin2 0 - 1)dO

sin2 0 cos2 0dO

sin3 0dO

v' 1 - cos 0 sin 0 dO

xe"'2 dx

e1

tdt

Integrals Leading to the Logarithm and the Inverse Secant,

6.3

60.

63.

66.

69.

72.
75.
78.

J dt
J sin t dt
I dx
J (2
dt
dt
I
et'+3t

61.

e008 t

64.

(10x)2

67.

+i-312)

70.

( .Y1

I
I

t3

1 +t

73.

12)s

7 6.

4 dt

ex

Yl - e2x

dx

osx
c--dx
I x
2 I x x x xdx

f dx
J2x+idx
dt
I (2
ein sec' x

62.

65.

+ o-3/2

68.

tdt

I
dt
I
I exdx

71.

.Y1 - t2
t2

74.

\o/1 +t3
ex

77.

1 +

xdx
-x
I 2
csc x x xdx
I x x

Algebraic Devices

f cosxdx
J xdx
J
dt
dt dt
I
dt
I - (2t)2
I -=dx
esin x
10x

t(2 +t2)-312

(There are different intervals to consider in Problems 79 through

79.

80.

sin

sec 2

8 .

+sec

tan

83.

secx +tan

.y 1 - t2

.y 4

ex

Y] _ex

84.)

sin

cos

+ csc

csc

265

cot

+cot

81.
84.

J xdx
J xdx
tan

sec

6.3 INTEGRALS LEADING TO THE LOGARITHM AND THE INVERSE SECANT. ALGEBRAIC DEVICES

In the preceding section, we got two formulas for $\int du/u$, for the intervals $(0, \infty)$ and $(-\infty, 0)$:

$$\int \frac{du}{u} = \{\ln u + C\} \quad (u > 0), \qquad (4)$$

$$\int \frac{du}{u} = \{\ln (-u) + C\} \quad (u < 0). \qquad (5)$$

Since $|u| = u$ when $u > 0$ and $|u| = -u$ when $u < 0$, these two formulas can be combined into one:

$$\int \frac{du}{u} = \{\ln |u| + C\} \quad (\text{on } (0, \infty) \text{ or } (-\infty, 0)). \qquad (17)$$

Here the expression in parentheses on the right reminds us that the formula can be used on an interval where $u > 0$, or on an interval where $u < 0$; it cannot be used on an interval where $u$ takes on the value 0. When $u = 0$, there is no such thing as the "$1/u$" on the left or the "$\ln |u|$" on the right. Thus, whenever we apply formula (17), we might have used formula (4) or (5). The advantage of (17) is that it is easier to use.

Consider

$$\int_{-2}^{-1} \frac{dx}{x}.$$

In the fundamental theorem of integral calculus, we take

$$f(x) = \frac{1}{x}, \qquad F(x) = \ln |x|.$$

Then $F' = f$. Therefore

$$\int_{-2}^{-1} \frac{dx}{x} = F(-1) - F(-2) = \ln |-1| - \ln |-2| = 0 - \ln 2 = -\ln 2.$$

This is negative, as it should be; the integrand is negative, and we are integrating from left to right. The calculation might be confusing if we used formula (5):

$$\int_{-2}^{-1} \frac{dx}{x} = [\ln (-x)]_{-2}^{-1} = \ln [-(-1)] - \ln [-(-2)] = 0 - \ln 2 = -\ln 2.$$

Hereafter, we shall use the following shorthand for this kind of calculation:

$$\int_{-2}^{-1} \frac{dx}{x} = [\ln |x|]_{-2}^{-1} = \ln |-1| - \ln |-2|.$$

In general

$$[F(x)]_{a}^{b} = F(b) - F(a),$$

by definition. Sometimes, where no confusion could result, we may omit the opening
bracket on the left. Thus

3I= -
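As a rough numerical cross-check of the value $-\ln 2$ obtained above from formula (17), the following sketch compares $F(-1) - F(-2)$, with $F(x) = \ln |x|$, against a midpoint-rule approximation of the integral. The midpoint rule and the helper name `midpoint_integral` are assumptions made for illustration; they are not part of the text.

```python
import math


def midpoint_integral(f, a, b, n=10000):
    # Midpoint-rule approximation of the integral of f from a to b
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def F(x):
    # Antiderivative of 1/x valid on (0, oo) and on (-oo, 0): ln|x|
    return math.log(abs(x))


numeric = midpoint_integral(lambda x: 1.0 / x, -2.0, -1.0)
by_formula = F(-1.0) - F(-2.0)

print(numeric, by_formula, -math.log(2.0))
```

The interval $[-2, -1]$ lies entirely in $(-\infty, 0)$, so the condition attached to formula (17) is satisfied; the same code applied to an interval containing 0 would not be a valid use of the formula.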

We can convert various integrals to the form $\int du/u$. For example,

$$\int \tan u\,du = \int \frac{\sin u}{\cos u}\,du.$$

Except for sign, this has the form $\int dv/v$, where

$$v = \cos u, \qquad dv = -\sin u\,du.$$

Since

$$\int \frac{dv}{v} = \{\ln |v| + C\} \quad (v > 0 \text{ or } v < 0),$$

we have

$$\int \tan u\,du = -\int \frac{-\sin u}{\cos u}\,du = -\int \frac{dv}{v} = \{-\ln |v| + C\} = \{-\ln |\cos u| + C\};$$

$$\int \tan u\,du = \{\ln |\sec u| + C\} \quad (\sec u > 0 \text{ or } \sec u < 0). \qquad (18)$$

This is a standard formula.

Similarly,

$$\int \cot u\,du = \int \frac{\cos u}{\sin u}\,du.$$

This gives

$$\int \cot u\,du = \{\ln |\sin u| + C\} \quad (\sin u > 0 \text{ or } \sin u < 0). \qquad (19)$$

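By an ingenious device, we can find $\int \sec x\,dx$. We multiply and divide by $\sec x + \tan x$, getting

$$\int \sec x\,dx = \int \frac{\sec^2 x + \sec x \tan x}{\sec x + \tan x}\,dx.$$

Since

$$D \sec x = \sec x \tan x \quad \text{and} \quad D \tan x = \sec^2 x,$$

the integral has the form $\int du/u$, where

$$u = \sec x + \tan x, \qquad du = (\sec x \tan x + \sec^2 x)\,dx.$$

Therefore

$$\int \sec x\,dx = \{\ln |u| + C\} = \{\ln |\sec x + \tan x| + C\}.$$

As always, the chain rule gives us a more general formula for $\int \sec u\,du$:

$$\int \sec u\,du = \{\ln |\sec u + \tan u| + C\} \quad (\sec u + \tan u > 0 \text{ or } < 0). \qquad (20)$$

Similarly,

$$\int \csc x\,dx = \int \frac{\csc x(\csc x + \cot x)}{\csc x + \cot x}\,dx = \{-\ln |\csc x + \cot x| + C\};$$

and this gives

$$\int \csc u\,du = \{-\ln |\csc u + \cot u| + C\} \quad (\csc u + \cot u > 0 \text{ or } < 0). \qquad (21)$$

Consider now the formula

$$D \operatorname{Sec}^{-1} x = \frac{1}{x\sqrt{x^2 - 1}} \quad (x > 1).$$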

The graph of $\operatorname{Sec}^{-1}$ looks like this:

[Figure: graph of $y = \operatorname{Sec}^{-1} x$, with the horizontal line $y = \pi/2$ shown dashed.]

(See Section 4.7.) Thus $\operatorname{Sec}^{-1}$ is defined on the interval $[1, \infty)$. But at 1 its tangent is vertical; and so the differentiation formula holds only for $x > 1$. It gives

$$\int \frac{dx}{x\sqrt{x^2 - 1}} = \{\operatorname{Sec}^{-1} x + C\} \quad (x > 1),$$

and more generally

$$\int \frac{du}{u\sqrt{u^2 - 1}} = \{\operatorname{Sec}^{-1} u + C\} \quad (u > 1). \qquad (16)$$

Notice, however, that the integral

$$\int_{-3}^{-2} \frac{1}{x\sqrt{x^2 - 1}}\,dx$$

makes sense. We therefore need an integration formula which will apply to this integrand on the interval $(-\infty, -1)$. On this domain,

$$\int \frac{1}{x\sqrt{x^2 - 1}}\,dx = \int \frac{1}{-x\sqrt{(-x)^2 - 1}}\,(-1)\,dx = \int \frac{1}{u\sqrt{u^2 - 1}}\,du,$$

where

$$u(x) = -x, \qquad du = (-1)\,dx.$$

Therefore, for $x < -1$,

$$\int \frac{dx}{x\sqrt{x^2 - 1}} = \{\operatorname{Sec}^{-1} u + C\} = \{\operatorname{Sec}^{-1} (-x) + C\} = \{\operatorname{Sec}^{-1} |x| + C\} \quad (x < -1),$$

because $|x| = -x$ when $x < 0$. Fitting our two formulas together, and passing to the general case (with a function $u$ instead of $x$), we get

$$\int \frac{du}{u\sqrt{u^2 - 1}} = \{\operatorname{Sec}^{-1} |u| + C\} \quad (u > 1 \text{ or } u < -1). \qquad (22)$$
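Formula (22) can be spot-checked on both intervals by differentiating $\operatorname{Sec}^{-1} |u|$ numerically. The sketch below assumes the identity $\operatorname{Sec}^{-1} w = \operatorname{Cos}^{-1}(1/w)$ for $w \ge 1$ (so that values lie in $[0, \pi/2)$, matching the graph described above) and a central-difference derivative; the helper names are illustrative and not from the text.

```python
import math


def arcsec_abs(x):
    # Sec^{-1}|x| for |x| >= 1, computed as Cos^{-1}(1/|x|)
    return math.acos(1.0 / abs(x))


def integrand(x):
    # 1 / (x * sqrt(x^2 - 1)), defined for x > 1 or x < -1
    return 1.0 / (x * math.sqrt(x * x - 1.0))


def central_difference(f, x, h=1e-6):
    # Symmetric difference quotient approximating f'(x)
    return (f(x + h) - f(x - h)) / (2 * h)


def worst_gap(points):
    # Largest gap between D[Sec^{-1}|x|] and the integrand
    return max(abs(central_difference(arcsec_abs, x) - integrand(x))
               for x in points)


print(worst_gap([-3.0, -1.5, 1.5, 3.0]))
```

The sample points include values on both sides, $x < -1$ and $x > 1$, since the point of (22) is that a single expression serves on either interval.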

There is a rough rule to help you decide which of our present list of formulas to apply to a given problem: look in the integrand for functions which are the derivatives of other functions. The point is that all our formulas have left-hand members of the form $\int f(u)\,du$; and we need to decide, in each case, what $u$ is.

Example 1.

$$\int \frac{\ln^3 x}{x}\,dx.$$

Is there anything here that is the derivative of something else? Yes:

$$D \ln x = \frac{1}{x}.$$

Taking $u(x) = \ln x$, we have

$$du = u'(x)\,dx = \frac{1}{x}\,dx.$$

Thus our integral has the form $\int u^3\,du$:

$$\int \frac{\ln^3 x}{x}\,dx = \int (\ln^3 x)\,\frac{1}{x}\,dx = \int u^3\,du = \{\tfrac{1}{4}u^4 + C\} = \{\tfrac{1}{4}\ln^4 x + C\}.$$

Example 2.

$$\int \frac{x\,dx}{(1 + x^2)^7}.$$

Looking for functions which are derivatives of other functions, we observe that

$$D(1 + x^2) = 2x.$$

Multiplicative constants are no trouble:

$$\int \frac{x\,dx}{(1 + x^2)^7} = \frac{1}{2}\int \frac{2x\,dx}{(1 + x^2)^7} = \frac{1}{2}\int u^{-7}\,du,$$

where

$$u(x) = 1 + x^2, \qquad du = u'(x)\,dx = 2x\,dx.$$

Therefore the answer is

$$\left\{\frac{1}{2}\cdot\frac{1}{-7 + 1}\,u^{-7+1} + C\right\} = \{-\tfrac{1}{12}u^{-6} + C\} = \{-\tfrac{1}{12}(1 + x^2)^{-6} + C\}.$$

Example 3. Sometimes we have to hunt harder:

$$\int \frac{x\,dx}{\sqrt{1 - x^4}}.$$

There is no hope that $1/\sqrt{1 - x^4}$ is part of $du$. Either the problem is hard or $du$ must be $x\,dx$, or a constant multiple of $x\,dx$. Now $2x = Dx^2$; and $x^2$ is what gets squared under the radical sign in the denominator. This suggests $u = x^2$:

$$\int \frac{x\,dx}{\sqrt{1 - x^4}} = \frac{1}{2}\int \frac{2x\,dx}{\sqrt{1 - (x^2)^2}} = \frac{1}{2}\int \frac{du}{\sqrt{1 - u^2}} = \{\tfrac{1}{2}\operatorname{Sin}^{-1} u + C\} = \{\tfrac{1}{2}\operatorname{Sin}^{-1} x^2 + C\}.$$

Example 4. Some obscure-looking integrals may be calculated algebraically:

$$\int \frac{x\,dx}{x - 1} = \int \left(1 + \frac{1}{x - 1}\right) dx.$$

(Here we have divided the denominator into the numerator, getting a quotient and a remainder.) Therefore

$$\int \frac{x\,dx}{x - 1} = \{x + \ln |x - 1| + C\}.$$

Example 5. Sometimes we need to find other algebraic devices, for such problems as this:

$$\int \frac{dx}{1 + e^{-x}}.$$

As it stands, this is hopeless: nothing in the integrand is the derivative of anything else. But

$$\int \frac{dx}{1 + e^{-x}} = \int \frac{e^{x}\,dx}{e^{x} + 1} = \int \frac{du}{u} \quad (u = e^{x} + 1)$$

$$= \{\ln |u| + C\} = \{\ln (1 + e^{x}) + C\}.$$

(No absolute-value signs are needed, because $1 + e^{x} > 1$ for every $x$.)

Example 6. Sometimes the same devices appear in more complicated forms:

$$\int \frac{dx}{e^{x} + e^{-x}} = \int \frac{e^{x}\,dx}{1 + e^{2x}} = \int \frac{e^{x}\,dx}{1 + (e^{x})^2} = \int \frac{du}{1 + u^2} \quad (u = e^{x},\ du = e^{x}\,dx)$$

$$= \{\operatorname{Tan}^{-1} u + C\} = \{\operatorname{Tan}^{-1} e^{x} + C\}.$$

Here we have used, in combination, the methods that worked in Examples 3 and 5.

Example 7. Often we need routine algebra and arithmetic:

$$\int \frac{dx}{4 + x^2} = \frac{1}{4}\int \frac{dx}{1 + (x/2)^2} = \frac{1}{4}\cdot 2\int \frac{\tfrac{1}{2}\,dx}{1 + (x/2)^2}.$$

Here $u = x/2$, $du = \tfrac{1}{2}\,dx$. This gives

$$\left\{\tfrac{1}{2}\operatorname{Tan}^{-1}\frac{x}{2} + C\right\}.$$

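Similarly,

$$\int \frac{dx}{\sqrt{3 - x^2}} = \frac{1}{\sqrt{3}}\int \frac{dx}{\sqrt{1 - (x/\sqrt{3})^2}} = \int \frac{(1/\sqrt{3})\,dx}{\sqrt{1 - (x/\sqrt{3})^2}} = \left\{\operatorname{Sin}^{-1}\frac{x}{\sqrt{3}} + C\right\}.$$

There was nothing special about the numbers $2^2$ and $(\sqrt{3})^2$. In the same way, we get

$$\int \frac{dx}{a^2 + x^2} = \left\{\frac{1}{a}\operatorname{Tan}^{-1}\frac{x}{a} + C\right\} \quad (a > 0),$$

$$\int \frac{dx}{\sqrt{a^2 - x^2}} = \left\{\operatorname{Sin}^{-1}\frac{x}{a} + C\right\} \quad (a > 0).$$

Passing from $x$ to any differentiable function $u$, we get two more standard formulas:

$$\int \frac{du}{a^2 + u^2} = \left\{\frac{1}{a}\operatorname{Tan}^{-1}\frac{u}{a} + C\right\} \quad (a > 0), \qquad (23)$$

$$\int \frac{du}{\sqrt{a^2 - u^2}} = \left\{\operatorname{Sin}^{-1}\frac{u}{a} + C\right\} \quad (a > 0). \qquad (24)$$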
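Formulas (23) and (24) can be checked by differentiation for particular values of $a$. The following is a minimal numerical sketch, assuming a central-difference derivative; the function names and the chosen values of $a$ and $u$ are illustrative assumptions, not taken from the text.

```python
import math


def central_difference(f, x, h=1e-6):
    # Symmetric difference quotient approximating f'(x)
    return (f(x + h) - f(x - h)) / (2 * h)


def gap_formula_23(a, u):
    # Compare D[(1/a) Tan^{-1}(u/a)] with 1/(a^2 + u^2), for a > 0
    antiderivative = lambda t: math.atan(t / a) / a
    return abs(central_difference(antiderivative, u) - 1.0 / (a * a + u * u))


def gap_formula_24(a, u):
    # Compare D[Sin^{-1}(u/a)] with 1/sqrt(a^2 - u^2), for a > 0 and |u| < a
    antiderivative = lambda t: math.asin(t / a)
    return abs(central_difference(antiderivative, u)
               - 1.0 / math.sqrt(a * a - u * u))


# a = 2 and a = sqrt(3) are the two cases worked in the text
print(gap_formula_23(2.0, 0.7), gap_formula_24(math.sqrt(3.0), -1.0))
```

Note that (24) only makes sense for $|u| < a$, so the sample value of $u$ is kept strictly inside that interval.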

PROBLEM SET 6.3

Calculate the following integrals, and check by differentiation in each case.

1. I
4. J
7. I
10. I
13. I
J
J : e
J e"'e"' - e-x
e25. J (e"' e-')(e2"'
v' 1

x4

v'4 - y4

dx

v'1 -

dy

5.

9 :sx4 dx

16.
19.

22.

zs

zs

dt

dt

2zs

dz

dz
1 :55zs

dz

xs

dx
dt

dz

,.dx

t4

11.

dz

t2

x
dx
1 + 9x 4

x
3
dx
Vl - x8

2. J 4
J
8. J
J 1
14. J
17. J 1
20. J 1 :t e2t
23. I e"'e"' - e-x
e-x
J

e-2"') dx 26.

dx

2
dx
v'2 - x3

3. J
J
J
12. J 1
J
18. J
21. J e"' e-x
24. J (e"' e- )(
2 7. J

4y4

v'1

6.

15.

:39 4 dx
x
z6

dz

6dz
+ z

x
7
dx
Vl - x8

xs

dx

dx

"'

e"' - e-"') dx
2

2
dx
v'2 - x6


2 dx
28. J v1 x-2x3
J v1Sin-1x
-x2dx
34. J xe"' dx
37. J (xcosx
sinx) dx+
J x sinxdx
43. J (3x+2 Inxx2
) dx
1
46. J x( l + 1n2x) dx
49. I x2 V; - 1dx
J V 1 x2dx
J Cos-Ixdx
f
1 + e4u du
6 .1 I 1 :x2dx
64. We know that
31.

40.

52.

55.

58.

Consider

6.3

2 dx
29. J v1x-2x6
32. J Tan-Ix
-1-+x2 dx
35. J In e"'' dx.
38. fxcosxdx
J (2x Inx+ x) dx
44. J x21nxdx
47. J (x + x1)2In+(x2x2 +2x) dx
f Ve2x dx
f Sin-Ixdx
J Cos-I (2x) dx
f du
59. l :2u
62. f Tan-1 xdx
41.

50.

1
30. fxln-2xdx
33. f (xe"' + e"') dx
36. J ln2 e"' dx
f (x sinx -cosx) dx
42. J xlnxdx
J ln3x
--dx
x
48. I tVt21 dt
J (sin-Ix+ Vl x2) dx
J (Cos-Ix - Vl x_ x2) dx
f du
1 + e2u
f(Tan-Ix+ 1 :x2) dx
Jv1 -z2dz
39.

45.

51 .

53.

54.

56.

57.

D-x Dx-I -ix-2


JI-Ix21 dx.
=

60.
63.

x2

In the fundamental theorem of integral calculus, we take

$$f(x) = \frac{1}{x^2}, \qquad F(x) = -\frac{1}{x}.$$

Then $F' = f$. Therefore

$$\int_{-1}^{1} \frac{dx}{x^2} = F(1) - F(-1) = -2.$$

Now we interpret the problem geometrically. We seem to have proved that the region under a positive function has negative area.
a) What went wrong?
b) Show that the area in question is not only positive but infinite. (This does not follow from the mere fact that the region is unbounded. Some unbounded regions have finite areas.)

65. Let $R$ be the region under the graph of $f(x) = 1/\sqrt{x}$, from 0 to 1. Show that $R$ has finite area.

6.4 INTEGRATION BY PARTS

By differentiation, we get

$$D[x \sin x] = x \cos x + \sin x.$$

Since

$$D \cos x = -\sin x,$$

we have

$$D[x \sin x + \cos x] = x \cos x + \sin x - \sin x = x \cos x.$$

Therefore

$$\int x \cos x\,dx = \{x \sin x + \cos x + C\}.$$

Thus, working backward, we have found the solution of an integration problem which might have looked hard if we had approached it forward, starting with the unknown integral $\int x \cos x\,dx$. We shall now describe a general method of solving problems of this kind.

The formula for the derivative of a product is

$$D[u(x)v(x)] = u(x)v'(x) + u'(x)v(x).$$

Therefore

$$\int [u(x)v'(x) + u'(x)v(x)]\,dx = \{u(x)v(x) + C\}.$$

Therefore

$$\int u(x)v'(x)\,dx = u(x)v(x) - \int v(x)u'(x)\,dx.$$

Here we have dropped the constant, $C$, because each of the indefinite integrals on the two sides of the equation carries its own constant with it; what the equation says is that the two sides of the equation represent the same class of functions. Using the short notation

$$du = u'(x)\,dx, \qquad dv = v'(x)\,dx,$$

we get the formula

$$\int u\,dv = uv - \int v\,du.$$
This is the formula for integration by parts; the word parts refers to the functions $u(x)$ and $v'(x)$ in the integral on the left. Any time we apply the formula, we replace one integral by another. The method is useful when the new integral is easier to calculate than the old one.

Let us first try the method on

$$\int x \cos x\,dx.$$

Let

$$u = x, \qquad dv = \cos x\,dx,$$

so that

$$du = dx \quad \text{and} \quad v = \sin x.$$

(We need not allow for a constant here; any function $v$ whose derivative is $\cos x$ will work. We will return to this point in a moment.) By the basic formula, we get

$$\int x \cos x\,dx = \int u\,dv = uv - \int v\,du = x \sin x - \int \sin x\,dx = \{x \sin x + \cos x + C\}.$$

If we had used the seemingly more general

$$v = \sin x + c,$$

we would have

$$\int u\,dv = x(\sin x + c) - \int (\sin x + c)\,dx = \{x \sin x + cx + \cos x - cx + C\} = \{x \sin x + \cos x + C\},$$

exactly as before. The same happens in general:

$$u(v + c) - \int (v + c)\,du = uv + uc - \int v\,du - uc = uv - \int v\,du.$$

In applying the basic formula, we made what may seem to be an arbitrary choice of $u$ and $dv$. We might have taken

$$u = \cos x, \qquad dv = x\,dx, \qquad du = -\sin x\,dx, \qquad v = \frac{x^2}{2}.$$

This would have given

$$\int x \cos x\,dx = \int u\,dv = uv - \int v\,du = \frac{x^2}{2}\cos x + \int \frac{x^2}{2}\sin x\,dx.$$

This is true, but is worthless as a method of finding $\int x \cos x\,dx$, because the new integral is harder to calculate than the old one.

An equally bad choice would be

$$u = x \cos x, \qquad dv = dx, \qquad du = (\cos x - x \sin x)\,dx, \qquad v = x,$$

which gives

$$\int x \cos x\,dx = \int u\,dv = uv - \int v\,du = x^2 \cos x - \int (x \cos x - x^2 \sin x)\,dx.$$

Here again the new integral is harder than the old one. We remember also that no term of the form $x^2 \cos x$ appears in the right answer. Therefore the term $x^2 \cos x$ cannot be the beginning of the solution, as we might hope: it must be a blind alley.

These examples indicate that integration by parts can be either a good or a bad method, according to the skill with which we choose the parts. Practice is a help, but there are general rules which help us to decide what choices are promising:

1) $dv$ has got to be something that we know how to integrate. (If it isn't, we can't apply the method at all.)

2) We want $\int v\,du$ to be an easier integral than $\int u\,dv$. Therefore we want $du$ to be simpler than $u$. At least, we don't want $du$ to be more complicated than $u$.

3) For the same reason, we want $v$ to be simpler than $dv$; at least we don't want it to look worse than $dv$.

These rules are not infallible, but they are a help. Let us try them on

$$\int xe^{x}\,dx.$$

We can integrate both $x$ and $e^{x}$. Therefore (1) gives us no guidance. Rule (2) suggests that $u = x$ and $u = e^{x}$ are both acceptable, but that $u = x$ is to be preferred. ($De^{x} = e^{x}$, which is no worse than $e^{x}$, but $Dx = 1$, which looks good.) We therefore use

$$u = x, \qquad dv = e^{x}\,dx, \qquad du = dx, \qquad v = e^{x}.$$

This gives

$$\int xe^{x}\,dx = \int u\,dv = uv - \int v\,du = xe^{x} - \int e^{x}\,dx = \{xe^{x} - e^{x} + C\}.$$

Note that (3) advises us not to try

$$u = e^{x}, \qquad dv = x\,dx,$$

because we would then get $v = x^2/2$, which looks worse than $dv$. In fact, this choice won't work.

Consider next

$$\int x^2 e^{x}\,dx.$$

Rule (3) tells us that we had better take $dv = e^{x}\,dx$. We therefore take

$$u = x^2, \qquad dv = e^{x}\,dx, \qquad du = 2x\,dx, \qquad v = e^{x}.$$

This looks good under rule (2), and acceptable under rule (3). We get

$$\int x^2 e^{x}\,dx = \int u\,dv = uv - \int v\,du = x^2 e^{x} - 2\int xe^{x}\,dx = \{x^2 e^{x} - 2xe^{x} + 2e^{x} + C\},$$

by the result of the previous problem. If we hadn't known the answer to the previous problem, it would still be easy to see that we had made progress, in replacing $\int x^2 e^{x}\,dx$ by $\int xe^{x}\,dx$; we would then attack the new problem by the same method.

It sometimes happens that integration by parts gives us not an expression for the integral that we started with, but an equation that can be solved for this integral. Consider

$$\int e^{x} \sin x\,dx.$$

We take

$$u = e^{x}, \qquad dv = \sin x\,dx, \qquad du = e^{x}\,dx, \qquad v = -\cos x,$$

which gives

$$\int e^{x} \sin x\,dx = -e^{x} \cos x + \int e^{x} \cos x\,dx.$$

We repeat the process, taking

$$u = e^{x}, \qquad dv = \cos x\,dx, \qquad du = e^{x}\,dx, \qquad v = \sin x.$$

For short, we write

$$I = \int e^{x} \sin x\,dx.$$

We then have

$$I = -e^{x} \cos x + \int e^{x} \cos x\,dx = -e^{x} \cos x + e^{x} \sin x - \int e^{x} \sin x\,dx.$$

Here the last integral is simply the one we started with. Therefore

$$2I = \{e^{x} \sin x - e^{x} \cos x + C\},$$

and

$$\int e^{x} \sin x\,dx = \{\tfrac{1}{2}[e^{x} \sin x - e^{x} \cos x] + C\}.$$

Sometimes we need to make a strange choice, in which $u$ is the whole integrand and $dv$ is merely $dx$. This is what we need, to find

$$\int \ln x\,dx.$$

Here we use

$$u = \ln x, \qquad dv = dx, \qquad du = \frac{1}{x}\,dx, \qquad v = x.$$

When we replace 1 by $x$, we seem to have lost somewhat, but the profit in passing from $\ln x$ to $1/x$ more than makes up for it. In fact, this scheme works:

$$\int \ln x\,dx = uv - \int v\,du = x \ln x - \int x \cdot \frac{1}{x}\,dx = x \ln x - \int dx = \{x \ln x - x + C\}.$$
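The three answers obtained by parts in this section can each be checked by differentiation. The sketch below does this numerically, assuming a central-difference derivative; the table `CASES` and the helper names are illustrative and do not appear in the text.

```python
import math


def central_difference(f, x, h=1e-6):
    # Symmetric difference quotient approximating f'(x)
    return (f(x + h) - f(x - h)) / (2 * h)


# Each entry pairs an integrand with the antiderivative found by parts.
CASES = {
    "x e^x": (lambda x: x * math.exp(x),
              lambda x: x * math.exp(x) - math.exp(x)),
    "e^x sin x": (lambda x: math.exp(x) * math.sin(x),
                  lambda x: 0.5 * (math.exp(x) * math.sin(x)
                                   - math.exp(x) * math.cos(x))),
    "ln x": (lambda x: math.log(x),
             lambda x: x * math.log(x) - x),
}


def worst_gap(name, points=(0.5, 1.0, 2.0)):
    # Largest gap between D[antiderivative] and the integrand (x > 0 for ln x)
    integrand, antiderivative = CASES[name]
    return max(abs(central_difference(antiderivative, x) - integrand(x))
               for x in points)


for name in CASES:
    print(name, worst_gap(name))
```

The sample points are all positive so that the $\ln x$ case stays inside its domain.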

PROBLEM SET 6.4

Evaluate the following integrals. Each of them can be calculated by the method of integration by parts. You should try to work these problems with the smallest possible number of false starts. In each case, survey the situation and try to arrive at a conclusion on the question of what choice of $u$ and $dv$ is most promising. If you do this carefully, you ought to be able to solve each of the problems below on the first try. Each answer should be checked by differentiation.
1.

4.

7.

10.

13.

16.

19.
22.

J ln2xdx
J
J
J
J
J
J Tan1xdx

2.

xe axdx

5.

ea'" sin x dx

8.

eax cos bxdx

11.

x3e"'dx

14.

x3 ln x dx

17.

20.

J
J
J
J
J
J
J xTan-1xdx
In (x2) dx

3.

x sin axdx

6.

eax

9.

COS

xdx

12 .

x2sin xdx

J
J xcosaxdx
J
J
J2
J
J
(ax)e"'dx

eax sin bxdx

X " COS X dx
9

x2 ln xdx

15.

Sin-1xdx

18.

21.

'

x In2 xdx

Sin-1(2x) dx

xex sin x dx

Derive and check:

In" xdx

= x

In" x

lnn-l xdx.

Formulas of this kind are called reduction formulas.


formula, we can calculate the integral on the left.

23.

Find

24.

Derive a reduction formula for J xn sin xdx.

ln3 xdx.

By n

1 applications of the


25. Derive a reduction formula for J xnex dx.


26. Derive a reduction formula which reduces
27.

in J x" Inn x dx.

J sin In x dx. (Here you should survey the situation, decide on the most promising
procedure, and then proceed with faith.)

28. J cos In x dx. (Same comment as for the preceding problem.)


6.5 INTEGRATION OF POWERS OF TRIGONOMETRIC FUNCTIONS

We shall find how to calculate integrals of the form

$$\int \sin^{n} x \cos^{m} x\,dx,$$

where $n$ and $m$ are any integers, positive, negative, or zero, and integrals of the forms

$$\int \sec^{n} x \tan^{m} x\,dx \quad \text{and} \quad \int \csc^{n} x \cot^{m} x\,dx,$$

where $n \ge 0$ and $m \ge 0$. We shall discuss the various cases in the order of increasing difficulty.

$$\int \sin^{n} x \cos^{m} x\,dx, \quad m \text{ odd and positive.} \qquad (1)$$

For example, we might have

$$\int \sin^2 x \cos^3 x\,dx.$$

The method in such cases is as follows. Since

$$\cos^2 x + \sin^2 x = 1 \quad \text{and} \quad \cos^2 x = 1 - \sin^2 x,$$

we have

$$\int \sin^2 x \cos^3 x\,dx = \int \sin^2 x(1 - \sin^2 x) \cos x\,dx = \int \sin^2 x \cos x\,dx - \int \sin^4 x \cos x\,dx = \{\tfrac{1}{3}\sin^3 x - \tfrac{1}{5}\sin^5 x + C\}.$$

This method works whenever $m$ is odd. (In this case $n$ need not be an integer; it may be any real number.) For $m = 2k + 1$, our integral has the form

$$\int \sin^{n} x \cos^{2k+1} x\,dx = \int \sin^{n} x(\cos^2 x)^{k} \cos x\,dx = \int \sin^{n} x(1 - \sin^2 x)^{k} \cos x\,dx.$$

We expand $(1 - \sin^2 x)^{k}$ by the binomial theorem. This gives us a sum of integrals of the form

$$\int u^{j}\,du \quad (u = \sin x,\ du = \cos x\,dx).$$

We integrate these one at a time and add the results.

$$\int \sin^{n} x \cos^{m} x\,dx, \quad n \text{ odd and positive.} \qquad (2)$$

This is like the preceding case. Here $n = 2k + 1$, and the integral has the form

$$\int \cos^{m} x(\sin^2 x)^{k} \sin x\,dx = -\int \cos^{m} x(1 - \cos^2 x)^{k}(-\sin x)\,dx.$$

Expanding by the binomial formula, we get a sum of integrals of the type

$$\int \cos^{j} x(-\sin x)\,dx;$$

we evaluate each of these by the formula for $\int u^{j}\,du$, and add.

$$\int \sin^{n} x \cos^{m} x\,dx, \quad m \text{ and } n \text{ even.} \qquad (3)$$

To handle this one, we recall that

$$\cos 2x = \cos^2 x - \sin^2 x = \cos^2 x - (1 - \cos^2 x) = 2\cos^2 x - 1.$$

Solving for $\cos^2 x$, we get

$$\cos^2 x = \frac{1 + \cos 2x}{2}.$$

Similarly,

$$\cos 2x = (1 - \sin^2 x) - \sin^2 x = 1 - 2\sin^2 x,$$

and

$$\sin^2 x = \frac{1 - \cos 2x}{2}.$$

Making these substitutions in the integrand, we get a form in which the exponents are divided by 2. For example,

$$\int \sin^2 x \cos^2 x\,dx = \int \frac{1 - \cos 2x}{2}\cdot\frac{1 + \cos 2x}{2}\,dx = \frac{1}{4}\int (1 - \cos^2 2x)\,dx.$$

We now make the same sort of substitution again, getting

$$\frac{1}{4}\int \frac{1 - \cos 4x}{2}\,dx = \int (\tfrac{1}{8} - \tfrac{1}{8}\cos 4x)\,dx = \{\tfrac{1}{8}x - \tfrac{1}{32}\sin 4x + C\}.$$

When the exponents are large, this method is tedious, but at least we know that it will work.
$$\int \tan^{n} x\,dx \quad (n \text{ positive}). \qquad (4)$$

For $n = 1$, we know that

$$\int \tan x\,dx = \{-\ln |\cos x| + C\} \quad (\cos x > 0 \text{ or } \cos x < 0). \qquad (4a)$$

For $n = 2$,

$$\int \tan^2 x\,dx = \int (\sec^2 x - 1)\,dx.$$

(Remember that $1 + \tan^2 x = \sec^2 x$.) This gives

$$\int \tan^2 x\,dx = \{\tan x - x + C\}. \qquad (4b)$$

For $n > 2$, we have

$$\int \tan^{n} x\,dx = \int \tan^{n-2} x(\sec^2 x - 1)\,dx,$$

and so

$$\int \tan^{n} x\,dx = \frac{1}{n - 1}\tan^{n-1} x - \int \tan^{n-2} x\,dx. \qquad (4c)$$

This is called a reduction formula. By repeated applications of it, we can reduce the integral to one of the forms (4a) and (4b).
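The repeated use of (4c) down to (4a) or (4b) is naturally expressed as a recursion. The following sketch, offered only as an illustration, builds one antiderivative of $\tan^{n} x$ on $(-\pi/2, \pi/2)$ in this way and compares the resulting definite integral over $[0, 1]$ with a midpoint-rule approximation. The function names, the interval, and the midpoint rule are assumptions for the example, not part of the text.

```python
import math


def tan_power_antiderivative(n, x):
    # One antiderivative of tan^n x on (-pi/2, pi/2), n a positive integer
    if n == 1:
        return -math.log(abs(math.cos(x)))      # formula (4a)
    if n == 2:
        return math.tan(x) - x                  # formula (4b)
    # reduction formula (4c)
    return (math.tan(x) ** (n - 1) / (n - 1)
            - tan_power_antiderivative(n - 2, x))


def midpoint_integral(f, a, b, steps=20000):
    # Midpoint-rule approximation of the integral of f from a to b
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))


def reduction_gap(n, a=0.0, b=1.0):
    # Difference between the reduction-formula value and the numerical value
    by_reduction = (tan_power_antiderivative(n, b)
                    - tan_power_antiderivative(n, a))
    numeric = midpoint_integral(lambda x: math.tan(x) ** n, a, b)
    return abs(by_reduction - numeric)


for n in range(1, 7):
    print(n, reduction_gap(n))
```

Each call with $n > 2$ lowers the exponent by 2, so odd $n$ ends at (4a) and even $n$ ends at (4b), exactly as described above.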

$$\int \cot^{n} x\,dx \quad (n \text{ positive}). \qquad (5)$$

This is like (4). For $n = 1$,

$$\int \cot x\,dx = \int \frac{\cos x}{\sin x}\,dx = \{\ln |\sin x| + C\} \quad (\sin x > 0 \text{ or } \sin x < 0). \qquad (5a)$$

For $n = 2$,

$$\int \cot^2 x\,dx = \int (\csc^2 x - 1)\,dx.$$

(Remember that $\cot^2 x + 1 = \csc^2 x$.) Thus

$$\int \cot^2 x\,dx = \{-\cot x - x + C\}. \qquad (5b)$$

For $n > 2$,

$$\int \cot^{n} x\,dx = \int \cot^{n-2} x \cot^2 x\,dx = \int \cot^{n-2} x(\csc^2 x - 1)\,dx;$$

and so

$$\int \cot^{n} x\,dx = -\frac{1}{n - 1}\cot^{n-1} x - \int \cot^{n-2} x\,dx. \qquad (5c)$$

By repeated applications of (5c), we can reduce our integral to one of the forms (5a) and (5b).

$$\int \sec^{n} x\,dx, \quad n \text{ even and positive.} \qquad (6)$$

For $n = 2$, $\int \sec^2 x\,dx = \{\tan x + C\}$. For $n = 2k$, $k > 1$,

$$\int \sec^{n} x\,dx = \int \sec^{2k} x\,dx = \int \sec^{2k-2} x \sec^2 x\,dx = \int (1 + \tan^2 x)^{k-1} \sec^2 x\,dx.$$

When we expand $(1 + \tan^2 x)^{k-1}$ by the binomial formula, we get a sum of integrals of the form

$$\int \tan^{j} x \sec^2 x\,dx = \int u^{j}\,du.$$

We integrate each of these by the power formula and add the results. For example,

$$\int \sec^6 x\,dx = \int (\sec^2 x)^2 \sec^2 x\,dx = \int (1 + \tan^2 x)^2 \sec^2 x\,dx = \int (1 + 2\tan^2 x + \tan^4 x) \sec^2 x\,dx = \{\tan x + \tfrac{2}{3}\tan^3 x + \tfrac{1}{5}\tan^5 x + C\}.$$

$$\int \csc^{n} x\,dx, \quad n \text{ even and positive.} \qquad (7)$$

This is like (6). For $n = 2$,

$$\int \csc^2 x\,dx = \{-\cot x + C\}.$$

For $n = 2k$, $k > 1$,

$$\int \csc^{2k} x\,dx = \int (\csc^2 x)^{k-1} \csc^2 x\,dx = \int (\cot^2 x + 1)^{k-1} \csc^2 x\,dx = -\int (\cot^2 x + 1)^{k-1}(-\csc^2 x\,dx).$$

When we expand the binomial, we get a sum of integrals of the form

$$\int \cot^{j} x(-\csc^2 x)\,dx = \int u^{j}\,du.$$

$$\int \sec^{n} x\,dx, \quad n \text{ odd and positive.} \qquad (8)$$

For $n = 1$, we found that

$$\int \sec x\,dx = \int \frac{\sec x(\sec x + \tan x)}{\sec x + \tan x}\,dx = \int \frac{du}{u},$$

where

$$u = \sec x + \tan x, \qquad du = (\sec x \tan x + \sec^2 x)\,dx.$$

Therefore

$$\int \sec x\,dx = \{\ln |\sec x + \tan x| + C\}.$$

For $n$ odd and greater than 1, we have a problem. For example, in $\int \sec^3 x\,dx$ it does no good to write

$$\sec^2 x \sec x\,dx = (1 + \tan^2 x) \sec x\,dx,$$

because the second term fits no standard form. The solution is obtained by integrating by parts. We have

$$\int \sec^{n} x\,dx = \int \sec^{n-2} x \sec^2 x\,dx.$$

Let

$$u = \sec^{n-2} x, \qquad dv = \sec^2 x\,dx,$$

$$du = (n - 2) \sec^{n-3} x \sec x \tan x\,dx = (n - 2) \sec^{n-2} x \tan x\,dx, \qquad v = \tan x.$$

This gives

$$\int \sec^{n} x\,dx = \int u\,dv = uv - \int v\,du$$

$$= \sec^{n-2} x \tan x - (n - 2)\int \sec^{n-2} x \tan^2 x\,dx$$

$$= \sec^{n-2} x \tan x - (n - 2)\int \sec^{n-2} x(\sec^2 x - 1)\,dx$$

$$= \sec^{n-2} x \tan x - (n - 2)\int \sec^{n} x\,dx + (n - 2)\int \sec^{n-2} x\,dx.$$

Thus, if $I$ is the original integral, we have

$$I = \sec^{n-2} x \tan x - (n - 2)I + (n - 2)\int \sec^{n-2} x\,dx,$$

and

$$\int \sec^{n} x\,dx = \frac{1}{n - 1}\sec^{n-2} x \tan x + \frac{n - 2}{n - 1}\int \sec^{n-2} x\,dx.$$

There is a similar reduction formula which works for odd powers of the cosecant:

$$\int \csc^{n} x\,dx = -\frac{1}{n - 1}\csc^{n-2} x \cot x + \frac{n - 2}{n - 1}\int \csc^{n-2} x\,dx. \qquad (9)$$

To derive this, we integrate by parts, taking $u = \csc^{n-2} x$, $dv = \csc^2 x\,dx$, and proceed as in the previous derivation to solve for $\int \csc^{n} x\,dx$.
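The reduction formula for odd powers of the secant can be applied repeatedly until the exponent reaches 1. The sketch below implements that recursion for odd $n$ only and compares the result over $[0, 1]$ with a midpoint-rule approximation. It is a minimal illustration under stated assumptions: the function names, the interval $[0, 1]$ (where $\sec x + \tan x > 0$), and the midpoint rule are all choices made for this example and are not from the text.

```python
import math


def sec(x):
    return 1.0 / math.cos(x)


def sec_power_antiderivative(n, x):
    # One antiderivative of sec^n x for odd positive n, on (-pi/2, pi/2)
    if n < 1 or n % 2 == 0:
        raise ValueError("this sketch handles odd positive n only")
    if n == 1:
        return math.log(abs(sec(x) + math.tan(x)))
    # reduction formula derived by parts in the text
    return (sec(x) ** (n - 2) * math.tan(x) / (n - 1)
            + (n - 2) / (n - 1) * sec_power_antiderivative(n - 2, x))


def midpoint_integral(f, a, b, steps=20000):
    # Midpoint-rule approximation of the integral of f from a to b
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))


def reduction_gap(n, a=0.0, b=1.0):
    # Difference between the reduction-formula value and the numerical value
    by_reduction = (sec_power_antiderivative(n, b)
                    - sec_power_antiderivative(n, a))
    numeric = midpoint_integral(lambda x: sec(x) ** n, a, b)
    return abs(by_reduction - numeric)


for n in (1, 3, 5):
    print(n, reduction_gap(n))
```

For $n = 3$ a single application of the formula is needed, which corresponds to the integration by parts that the text carries out by hand.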

By these formulas and methods we can integrate products of powers of trigonometric functions.

There is absolutely no need to memorize the formulas which we have just finished developing. You can handle simple cases by remembering the methods. If you need to compute an integral in one of the difficult cases (and this almost never happens to people, in real life), then you look up the appropriate reduction formula. Moreover, it isn't even safe to try to memorize complicated formulas: you are very likely to misremember them and get wrong answers.

PROBLEM SET 6.5

Before starting to work on these problems, you should read Section 6.5 carefully, until
you understand what the methods are and why they work.

In working the problems, you

should refer to the text as seldom as possible. You should try to avoid looking up even the
reduction formulas (8) and (9), unless a problem requires you to apply one of them more
than once. If only one reduction is required, you should integrate by parts, instead of using
the reduction formula. As you will see, the first few problems below are designed to remind
you of the methods that we have been using. Check by differentiation in each case.
1.

4.

7.

10.

J
J
J
J

sin2 x cos3 xdx

2.

cos2xdx

5.

cot4xdx;

8.

sec4 xdx

11.

J
J
J
J

sin3 x cos2 xdx

3.

sin2 x cos2 xdx

6.

tan5 xdx

9.

csc4xdx

12.

J
J
J
J

sin2 x dx

tan4xdx

cot5 xdx

sec3 xdx

13.
x dx
6. x x dx
x 2x dx
19.
22. 2x 2x dx
25. --dx
x
2 . J x dx
30.
J sin

sec

J sin2

cos

sin2
sec

f sec5

17.

J cos

csc

f csc

sec

tan

14.

J cos3

sin

f tan

x dx
x x dx
21.
2x dx
24. I --dx
x
1
27. I --dx
x

x dx
x x dx
20.
2x dx
23. x x dx
1
26. I --dx
x
29. x1 x dx

f csc3


x dx

J sin

sec3

cos2

sin4

cot

A sinn-l

18.

sinx

tan

There is a reduction formula of the form


J sinn

J csc5

f cos2

cos4

csc

15.

x x
cos

+BJ sinn-2

x dx,

where A and Bare constants, expressed, of course, in terms of n. Derive such a formula.

31.

[Hint: It is no use trying to do this merely by the use of the elementary trigonometric

identities relating the sine and the cosine.]

There is a reduction formula of the form

$$\int \cos^{n} x\,dx = A \cos^{n-1} x \sin x + B\int \cos^{n-2} x\,dx.$$

Derive it.

6.6 INTEGRATION BY SUBSTITUTION

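In Section 6.2 we found that there was a close connection between certain simple integrals and some more complicated ones. For example, if we know that

$$\int x^2\,dx = \{\tfrac{1}{3}x^3 + C\},$$

then we know that

$$\int \sin^2\theta \cos\theta\,d\theta = \{\tfrac{1}{3}\sin^3\theta + C\}.$$

(We are using a different dummy letter in the second problem, for reasons which will soon be clear.) Thus we have two related integration problems:

$$\begin{array}{ccc}
\displaystyle\int x^2\,dx & = & \{\tfrac{1}{3}x^3 + C\} \\[4pt]
\downarrow {\scriptstyle\, x \to \sin\theta} & & \downarrow {\scriptstyle\, x \to \sin\theta} \\[4pt]
\displaystyle\int \sin^2\theta \cos\theta\,d\theta & = & \{\tfrac{1}{3}\sin^3\theta + C\}. \quad (!)
\end{array}$$

The (!) at the bottom indicates that the equation in the bottom line is the final conclusion. The pattern here is the following:

$$\begin{array}{ccc}
\displaystyle\int f(x)\,dx & = & \{F(x) + C\} \\[4pt]
\downarrow {\scriptstyle\, x \to u(\theta)} & & \downarrow {\scriptstyle\, x \to u(\theta)} \\[4pt]
\displaystyle\int f(u)\,du = \int f(u(\theta))u'(\theta)\,d\theta & = & \{F(u(\theta)) + C\}. \quad (!)
\end{array}$$

Thus, if we know how to find $\int f(x)\,dx$, we can use the result to find $\int f(u)\,du$.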

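It sometimes happens, however, that we want to move in the opposite direction; sometimes we can see how to calculate

$$\int f(u(\theta))u'(\theta)\,d\theta,$$

and we want to use the result to calculate $\int f(x)\,dx$. So as to give ourselves a simple example to work with at first, let us suppose that we know about the functions $\operatorname{Sin}$ and $\operatorname{Sin}^{-1}$, but do not know that $1/\sqrt{1 - x^2}$ is the derivative of $\operatorname{Sin}^{-1} x$. We then consider

$$\int \frac{dx}{\sqrt{1 - x^2}}.$$

We observe that it does not fit any form that we know. But perhaps it would be manageable if we could extract the indicated square root. For $x = \operatorname{Sin}\theta$, the square root can be extracted. (See below.) If we replace the dummy letter $x$ by the function $\operatorname{Sin}\theta$, then $dx$ becomes $\operatorname{Sin}'\theta\,d\theta$, and we get the related integrals on the left in the following diagram:

$$\begin{array}{ccc}
\displaystyle\int \frac{dx}{\sqrt{1 - x^2}} & = & ? \\[4pt]
\downarrow {\scriptstyle\, x \to \operatorname{Sin}\theta} & & \\[4pt]
\displaystyle\int \frac{\cos\theta\,d\theta}{\sqrt{1 - \operatorname{Sin}^2\theta}} & &
\end{array}$$

The trigonometric integral is easy:

$$\int \frac{\cos\theta\,d\theta}{\sqrt{1 - \operatorname{Sin}^2\theta}} = \int \frac{\cos\theta\,d\theta}{\sqrt{\cos^2\theta}} = \int 1\,d\theta = \{\theta + C\}.$$

(Query: Why is it true that $\sqrt{\cos^2\theta} = \cos\theta$, for the values of $\theta$ that we need to consider? What values of $\theta$ do we need to consider?)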

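The above calculation enables us to complete our diagram:

$$\begin{array}{ccc}
\displaystyle\int \frac{dx}{\sqrt{1 - x^2}} & = & \{\operatorname{Sin}^{-1} x + C\} \quad (!) \\[4pt]
\downarrow {\scriptstyle\, x \to \operatorname{Sin}\theta} & & \uparrow {\scriptstyle\, \theta \to \operatorname{Sin}^{-1} x} \\[4pt]
\displaystyle\int \frac{\cos\theta\,d\theta}{\sqrt{1 - \operatorname{Sin}^2\theta}} & = & \{\theta + C\}
\end{array}$$

In this case, of course, the solution in the top line was known before we started. But the same scheme works in general, whenever we can calculate the new integral on the lower left:

$$\begin{array}{ccc}
\displaystyle\int f(x)\,dx & = & \{G(u^{-1}(x)) + C\} \quad (!) \\[4pt]
\downarrow {\scriptstyle\, x \to u(\theta)} & & \uparrow {\scriptstyle\, \theta \to u^{-1}(x)} \\[4pt]
\displaystyle\int f(u)\,du = \int f(u(\theta))u'(\theta)\,d\theta & = & \{G(\theta) + C\}
\end{array}$$

We shall prove, at the end of this section, that this procedure is valid, whenever the symbols $u'$ and $u^{-1}$ have a meaning; that is, whenever $u$ has both a derivative and an inverse. Meanwhile, we shall show how the scheme is used to solve problems which would otherwise be hard.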


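Example 1.

$$\int \frac{dx}{x^2\sqrt{1 - x^2}} = \;? \qquad (-1 < x < 1,\ x \ne 0).$$

As in the preceding case, it seems to be the radical that is causing the trouble; and so we get rid of it by the substitution

$$x \to \operatorname{Sin}\theta, \qquad dx \to \operatorname{Sin}'\theta\,d\theta = \cos\theta\,d\theta \qquad \left(-\frac{\pi}{2} < \theta < \frac{\pi}{2}\right).$$

This gives

$$\int \frac{dx}{x^2\sqrt{1 - x^2}} \to \int \frac{\cos\theta\,d\theta}{\operatorname{Sin}^2\theta\,\cos\theta} = \int \csc^2\theta\,d\theta = \{-\cot\theta + C\}.$$

(Throughout, $-\pi/2 < \theta < \pi/2$; on this interval, $\sin x = \operatorname{Sin} x$, and the usual identities hold automatically.)

We now reverse the substitution, using $\theta \to \operatorname{Sin}^{-1}x$. This gives

$$\int \frac{dx}{x^2\sqrt{1 - x^2}} = \{-\cot \operatorname{Sin}^{-1}x + C\}.$$

A formula for $\cot \operatorname{Sin}^{-1}x$ is easy to read off from a figure.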

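Here

$$\operatorname{Sin}\theta = x, \qquad \theta = \operatorname{Sin}^{-1}x; \qquad \cot \operatorname{Sin}^{-1}x = \cot\theta = \frac{\sqrt{1 - x^2}}{x}.$$

Therefore

$$\int \frac{dx}{x^2\sqrt{1 - x^2}} = \left\{-\frac{\sqrt{1 - x^2}}{x} + C\right\}.$$

Note that all the trigonometry has cancelled out of the problem. Our answer checks:

$$D\left(-\frac{\sqrt{1 - x^2}}{x}\right) = -\frac{1}{x^2}\left[x\cdot\frac{-x}{\sqrt{1 - x^2}} - \sqrt{1 - x^2}\right] = \frac{-1}{x^2\sqrt{1 - x^2}}\left[-x^2 - \left(\sqrt{1 - x^2}\right)^2\right] = \frac{1}{x^2\sqrt{1 - x^2}}.$$

We can sum this up in a diagram as follows:

$$\begin{array}{ccc}
\int \dfrac{dx}{x^2\sqrt{1 - x^2}} & = & \left\{-\dfrac{\sqrt{1 - x^2}}{x} + C\right\} \qquad (!) \\
\big\downarrow\ {\scriptstyle x \to \operatorname{Sin}\theta} & & \big\uparrow\ {\scriptstyle \theta \to \operatorname{Sin}^{-1}x} \\
\int \csc^2\theta\,d\theta & = & \{-\cot\theta + C\}
\end{array}$$

The substitution $x \to \operatorname{Sin}\theta$ is the usual one to try, if the troublesome part of the integrand is $\sqrt{1 - x^2}$. In other cases, $x \to \operatorname{Tan}\theta$ works in much the same way.

Example 2. Consider

$$\int \sqrt{1 + x^2}\,dx.$$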

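To get rid of the radical, we use

$$x \to \operatorname{Tan}\theta \quad \left(-\frac{\pi}{2} < \theta < \frac{\pi}{2}\right), \qquad dx \to \sec^2\theta\,d\theta.$$

This gives

$$\int \sqrt{1 + x^2}\,dx \to \int \sqrt{1 + \operatorname{Tan}^2\theta}\,\sec^2\theta\,d\theta.$$

The domain of Tan is the interval $(-\pi/2, \pi/2)$, on which $\sec\theta > 0$. Therefore $\sec\theta = \sqrt{1 + \operatorname{Tan}^2\theta}$, and

$$\int \sqrt{1 + \operatorname{Tan}^2\theta}\,\sec^2\theta\,d\theta = \int \sec^3\theta\,d\theta.$$

We now use one of the reduction formulas of Section 6.5:

$$\begin{aligned}
\int \sec^3\theta\,d\theta &= \int \sec^n\theta\,d\theta \qquad (n = 3) \\
&= \frac{1}{n-1}\sec^{n-2}\theta\,\tan\theta + \frac{n-2}{n-1}\int \sec^{n-2}\theta\,d\theta \\
&= \tfrac{1}{2}\sec\theta\,\tan\theta + \tfrac{1}{2}\int \sec\theta\,d\theta \\
&= \{\tfrac{1}{2}\sec\theta\,\tan\theta + \tfrac{1}{2}\ln|\sec\theta + \tan\theta| + C\} \\
&= \{G(\theta) + C\}.
\end{aligned}$$

We complete the solution by letting $\theta \to \operatorname{Tan}^{-1}x$. This gives

$$\begin{aligned}
\int \sqrt{1 + x^2}\,dx &= \{G(\operatorname{Tan}^{-1}x) + C\} \\
&= \{\tfrac{1}{2}(\sec \operatorname{Tan}^{-1}x)(\tan \operatorname{Tan}^{-1}x) + \tfrac{1}{2}\ln|\sec \operatorname{Tan}^{-1}x + \tan \operatorname{Tan}^{-1}x| + C\}.
\end{aligned}$$

Obviously $\tan \operatorname{Tan}^{-1}x = x$. The formula for $\sec \operatorname{Tan}^{-1}x$ can be read off from a figure.

[Figure: the angle $\theta$, with $x = \operatorname{Tan}\theta$ and $r = \sec\theta$.]

In the figure, $-\pi/2 < \theta < \pi/2$, but $\theta$ may be positive or negative. We take $OP = 1$. This gives

$$x = \operatorname{Tan}\theta, \qquad \theta = \operatorname{Tan}^{-1}x, \qquad r = \sec\theta = \sec \operatorname{Tan}^{-1}x,$$

so that

$$\sec \operatorname{Tan}^{-1}x = \sqrt{1 + x^2}.$$

Therefore the answer is

$$\int \sqrt{1 + x^2}\,dx = \{\tfrac{1}{2}x\sqrt{1 + x^2} + \tfrac{1}{2}\ln|\sqrt{1 + x^2} + x| + C\}.$$

This can be simplified slightly: since $\sqrt{1 + x^2} + x > 0$ for every $x$, we can omit the absolute value bars, getting:

$$\int \sqrt{1 + x^2}\,dx = \{\tfrac{1}{2}x\sqrt{1 + x^2} + \tfrac{1}{2}\ln(\sqrt{1 + x^2} + x) + C\}.$$

As before, we sum up in a diagram the process by which the problem was solved:

$$\begin{array}{ccc}
\int \sqrt{1 + x^2}\,dx & = & \{\tfrac{1}{2}x\sqrt{1 + x^2} + \tfrac{1}{2}\ln(\sqrt{1 + x^2} + x) + C\} \qquad (!) \\
\big\downarrow\ {\scriptstyle x \to \operatorname{Tan}\theta} & & \big\uparrow\ {\scriptstyle \theta \to \operatorname{Tan}^{-1}x} \\
\int \sec^3\theta\,d\theta & = & \{\tfrac{1}{2}\sec\theta\,\tan\theta + \tfrac{1}{2}\ln|\sec\theta + \tan\theta| + C\}
\end{array}$$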

Such diagrams are worth drawing, especially the first few times you use the
substitution process; often the calculations are long, and it is easy to lose track of
what the process means.
The answer in Example 2 suggests that no method would have made the problem
seem easy. Note that the formulas of Section 6.5 are turning out to be useful in solving
problems which do not appear, at first, to involve trigonometry at all.
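The answers to Examples 1 and 2 can also be checked numerically, in the spirit of the check by differentiation in Example 1. The following is a minimal sketch, not part of the text: it approximates the derivative of each candidate antiderivative by a central difference and compares it with the integrand at a few sample points. The helper names (`derivative`, `max_error`) and the sample points are illustrative assumptions.

```python
import math

def derivative(F, x, h=1e-6):
    """Central-difference approximation to F'(x)."""
    return (F(x + h) - F(x - h)) / (2 * h)

# Example 1: candidate antiderivative of 1/(x^2 sqrt(1 - x^2)), for 0 < |x| < 1.
def F1(x):
    return -math.sqrt(1 - x * x) / x

def f1(x):
    return 1 / (x * x * math.sqrt(1 - x * x))

# Example 2: candidate antiderivative of sqrt(1 + x^2).
def F2(x):
    r = math.sqrt(1 + x * x)
    return 0.5 * x * r + 0.5 * math.log(r + x)

def f2(x):
    return math.sqrt(1 + x * x)

def max_error(F, f, points):
    """Largest gap between the numerical derivative of F and f on the given points."""
    return max(abs(derivative(F, x) - f(x)) for x in points)

err1 = max_error(F1, f1, [-0.8, -0.3, 0.2, 0.5, 0.9])
err2 = max_error(F2, f2, [-3.0, -1.0, 0.0, 0.7, 2.5])
print(err1, err2)  # both should be small
```

A small error at a handful of points is only evidence, not a proof; the proof is the differentiation carried out by hand.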
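We return to the general theory, to see why this method works. The pattern of our work is described by the diagram:

$$\begin{array}{ccc}
\int f(x)\,dx & = & \{G(u^{-1}(x)) + C\} \qquad (!) \\
\big\downarrow\ {\scriptstyle x \to u(\theta)} & & \big\uparrow\ {\scriptstyle \theta \to u^{-1}(x)} \\
\int f(u(\theta))\,u'(\theta)\,d\theta & = & \{G(\theta) + C\}
\end{array}$$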

What we are claiming, when we use the method of substitution, is that, if the second
equation holds, so does the first. In terms of the definition of the indefinite integral,
this means the following:

Theorem 1. If $u$ is differentiable and invertible, then

$$G' = f(u)\,u' \quad\Rightarrow\quad D[G(u^{-1})] = f.$$

The proof is as follows. By the chain rule,

$$D[G(u^{-1})] = G'(u^{-1})\,Du^{-1}.$$

By hypothesis, $G' = f(u)\,u'$. Therefore

$$G'(x) = f(u(x))\,u'(x) \quad \text{for every } x.$$

Therefore

$$G'(u^{-1}) = f(u(u^{-1}))\,u'(u^{-1}),$$

and

$$D[G(u^{-1})] = G'(u^{-1})\,Du^{-1} = f(u(u^{-1}))\,u'(u^{-1})\,Du^{-1}.$$

Now $f(u(u^{-1})) = f$, because $u(u^{-1}(x)) = x$ for every $x$. Therefore

$$f(u(u^{-1}))\,u'(u^{-1})\,Du^{-1} = f \cdot u'(u^{-1})\,Du^{-1}.$$

Now

$$u'(u^{-1})\,Du^{-1} = \frac{u'(u^{-1})}{u'(u^{-1})},$$

by the general formula for the derivative of the inverse of a function. Now $u'(u^{-1})$ cancels, and gives us

$$D[G(u^{-1})] = f,$$

which was to be proved.

PROBLEM SET 6.6


Calculate each of the following integrals, by any method. In most cases, but not all, the easiest method is to use a substitution of the form $x \to \operatorname{Sin}\theta$, $x \to \operatorname{Tan}\theta$, or $x \to \operatorname{Sec}\theta$.

In each case where you do use the method of substitution, you should sum up the process

of solution in a diagram as in Examples


differentiation.

1.
4.

7.
10.

13.

16.

19.

22.

J
J
J
J
J
J
J
J

(1 - x2)-3!2dx

2.

dx
dx
x(l + x2)

5.

dx
x2(1 + x2)

8.

dx
x2v' x2 - 1
x2dx

14.

Vl - x2
x2v'l
xv'l

11.

x2 dx
x2dx

x2v' x2 - 1dx

17.
20.

and 2 in the text. Finally, check in each case by

dx
v' x2

J
J
J
J
J

dx

dx
xv' x2 - 1

xdx
+ x2
xdx
x3dx

v'l - x2dx
+

12.

15.

v'1 - x2

x3 v' 1

6.

9.

v' x2 - 1

3.

x2dx

18.

21.

J
J
J
J
J
J
J

dx
v' x2 - 1
x(l - x2)-3f2dx
Vl - x2dx
x2v'1 - x2dx
x2dx
x2

1 +

(1

: x2)3dx
x3

dx
x2

23. a) Show that

$$\int_a^b f(u(\theta))\,u'(\theta)\,d\theta = \int_{u(a)}^{u(b)} f(x)\,dx,$$

whether $u$ is invertible or not.

b) Show that if $u$ is invertible, then

$$\int_c^d f(x)\,dx = \int_{u^{-1}(c)}^{u^{-1}(d)} f(u(\theta))\,u'(\theta)\,d\theta.$$

24. Obviously there is no point in writing this on a paper which is to be turned in and graded; but for your own benefit, reproduce the proof of the following, without reference to the text:

$$G' = f(u)\,u' \quad\Rightarrow\quad D[G(u^{-1})] = f.$$

6.7 ALGEBRAIC SUBSTITUTIONS

It is a good rule, if you have a problem which you don't see how to solve, to try to think of an easier problem that resembles it. If you can solve the easier problem, and bridge the gap between the two, then you have solved the problem which you started with. For example, consider

$$\int \frac{x^2}{\sqrt{2x + 1}}\,dx.$$

This does not fit any of the standard forms that we know about. There is no reason to suppose that a trigonometric substitution would help; and in fact, none of them would. We note, however, that if the denominator were of the form $\sqrt{x}$, or $\sqrt{t}$, then the problem would become easier. Now

$$2x + 1 = t \quad\Leftrightarrow\quad x = \tfrac{1}{2}(t - 1).$$

We therefore try the substitution

$$x \to u(t) = \tfrac{1}{2}(t - 1), \qquad dx \to u'(t)\,dt = \tfrac{1}{2}\,dt.$$

Under this substitution,

$$\int \frac{x^2\,dx}{\sqrt{2x + 1}} \to \int \frac{\tfrac{1}{4}(t - 1)^2}{\sqrt{t}}\cdot\tfrac{1}{2}\,dt.$$

The latter integral is easy to calculate. It is

$$\begin{aligned}
\tfrac{1}{8}\int \frac{t^2 - 2t + 1}{\sqrt{t}}\,dt &= \tfrac{1}{8}\int (t^{3/2} - 2t^{1/2} + t^{-1/2})\,dt \\
&= \{\tfrac{1}{8}(\tfrac{2}{5}t^{5/2} - 2\cdot\tfrac{2}{3}t^{3/2} + 2t^{1/2}) + C\} \\
&= \{\tfrac{1}{20}t^{5/2} - \tfrac{1}{6}t^{3/2} + \tfrac{1}{4}t^{1/2} + C\} \\
&= \{G(t) + C\}.
\end{aligned}$$

To get the answer to the problem which we started with, we use the inverse substitution

$$t \to u^{-1}(x) = 2x + 1.$$

This gives

$$\begin{aligned}
\int \frac{x^2\,dx}{\sqrt{2x + 1}} &= \{\tfrac{1}{20}(2x + 1)^{5/2} - \tfrac{1}{6}(2x + 1)^{3/2} + \tfrac{1}{4}(2x + 1)^{1/2} + C\} \\
&= \{G(u^{-1}(x)) + C\}.
\end{aligned}$$

The scheme here is the same as in the preceding section:

$$\begin{array}{ccc}
\int \dfrac{x^2}{\sqrt{2x + 1}}\,dx & = & \{G(u^{-1}(x)) + C\} \qquad (!) \\
\big\downarrow\ {\scriptstyle x \to u(t)} & & \big\uparrow\ {\scriptstyle t \to u^{-1}(x)} \\
\int \dfrac{t^2 - 2t + 1}{8\sqrt{t}}\,dt & = & \{G(t) + C\}
\end{array}$$

The only differences are that (a) the functions $u$ and $u^{-1}$ are described algebraically, and (b) the formulas for $G(t)$ and $G(u^{-1}(x))$ are too long to be conveniently written in the diagram. In any case, we know that the method works: this follows from Theorem 1 of Section 6.6.
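As a sanity check on the algebra, the antiderivative found above can be compared with a direct numerical integration. The sketch below is an illustration only: the interval $[0, 4]$, the composite Simpson rule, and the function names are assumptions chosen for the check, not part of the text.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def integrand(x):
    return x * x / math.sqrt(2 * x + 1)

def G(t):
    # Antiderivative in the variable t, as computed in the text.
    return t ** 2.5 / 20 - t ** 1.5 / 6 + math.sqrt(t) / 4

def u_inverse(x):
    # Inverse substitution t = 2x + 1.
    return 2 * x + 1

a, b = 0.0, 4.0
numeric = simpson(integrand, a, b)
from_formula = G(u_inverse(b)) - G(u_inverse(a))
print(numeric, from_formula)  # the two values should agree closely
```

On this interval the formula gives $G(9) - G(1) = 124/15$, which the numerical value should match to many decimal places.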

Often we can tell that a substitution is going to work, long before we know what the answer is. As soon as we wrote

$$\int \frac{x^2\,dx}{\sqrt{2x + 1}} \to \int \frac{\tfrac{1}{4}(t - 1)^2\cdot\tfrac{1}{2}\,dt}{\sqrt{t}},$$

it was evident that the numerator was a polynomial.

We can integrate the quotient

of a polynomial and a power. Similarly, we know that we can integrate

f (x3 - 3x/ +4)2 dx'.

x2 3

the calculation will be tedious, but the outcome is not in doubt.


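If one algebraic substitution works, there are usually others that also work. In the preceding problem, we might have used

$$y^2 = 2x + 1, \qquad x = \tfrac{1}{2}(y^2 - 1).$$

Thus we use

$$x \to u(y) = \tfrac{1}{2}(y^2 - 1), \qquad dx \to u'(y)\,dy = y\,dy, \qquad x^2 \to \tfrac{1}{4}(y^2 - 1)^2, \qquad \sqrt{2x + 1} \to y,$$

which gives

$$\int \frac{\tfrac{1}{4}(y^2 - 1)^2\,y\,dy}{y} = \tfrac{1}{4}\int (y^4 - 2y^2 + 1)\,dy = \{\tfrac{1}{20}y^5 - \tfrac{1}{6}y^3 + \tfrac{1}{4}y + C\}.$$

As usual, we now reverse the substitution, using

$$y \to u^{-1}(x) = \sqrt{2x + 1}.$$

This gives the final answer

$$\{\tfrac{1}{20}(2x + 1)^{5/2} - \tfrac{1}{6}(2x + 1)^{3/2} + \tfrac{1}{4}(2x + 1)^{1/2} + C\},$$

as before.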

There are no rules which tell us the best substitution to try in every case. The best approach is to look at the integrand, ask ourselves what feature of it is most troublesome, and then choose a substitution which seems likely to remove the troublesome feature. For example, if we want to calculate

$$\int \frac{dx}{\sqrt{x^2 + 1}},$$

we want to extract the square root; we can do this if

$$x \to \operatorname{Tan}\theta \quad \left(-\frac{\pi}{2} < \theta < \frac{\pi}{2}\right), \qquad x^2 + 1 \to \sec^2\theta.$$

This works:

$$\int \frac{dx}{\sqrt{x^2 + 1}} \to \int \frac{\sec^2\theta\,d\theta}{\sec\theta} = \int \sec\theta\,d\theta,$$

which leads to a solution, as you found in Problem 2 of Problem Set 6.6.

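We might also have tried

$$z^2 = x^2 + 1, \qquad x \to u(z) = \sqrt{z^2 - 1}, \qquad dx \to \frac{z}{\sqrt{z^2 - 1}}\,dz.$$

This gives

$$\int \frac{dx}{\sqrt{x^2 + 1}} \to \int \frac{1}{z}\cdot\frac{z}{\sqrt{z^2 - 1}}\,dz = \int \frac{dz}{\sqrt{z^2 - 1}},$$

which gets us nowhere, unless we happen to remember the solution of Problem 3 of Problem Set 6.6.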

Usually, to find out what algebraic substitution is going to work, we need to solve an algebraic equation. For example, given

$$\int \frac{dz}{1 + \sqrt{z}},$$

we wish that the denominator had merely the form $t$. To get $1 + \sqrt{z} \to t$, we need $z \to (t - 1)^2$. We usually write this with "=" signs:

$$1 + \sqrt{z} = t, \qquad \sqrt{z} = t - 1, \qquad z = (t - 1)^2.$$

This gives

$$z \to u(t) = (t - 1)^2, \qquad dz \to u'(t)\,dt = 2(t - 1)\,dt,$$

$$\int \frac{dz}{1 + \sqrt{z}} \to \int \frac{2(t - 1)\,dt}{t} = \int \left(2 - \frac{2}{t}\right)dt = \{2t - 2\ln|t| + C\}.$$

The reverse substitution

$$t \to u^{-1}(z) = 1 + \sqrt{z}$$

gives the final answer

$$\{2(1 + \sqrt{z}) - 2\ln(1 + \sqrt{z}) + C\}.$$

(Query: Would it be all right to delete the "1" in the first parenthesis?)

This is probably the most efficient solution. If we hadn't thought of it, we might have tried $\sqrt{z} = t$, which gives the substitution

$$z \to u(t) = t^2, \qquad dz \to u'(t)\,dt = 2t\,dt,$$

$$\int \frac{dz}{1 + \sqrt{z}} \to \int \frac{2t\,dt}{1 + t}.$$

Dividing $1 + t$ into $t$, we get

$$\frac{t}{1 + t} = 1 - \frac{1}{1 + t}.$$

Therefore

$$2\int \frac{t\,dt}{1 + t} = 2\int \left(1 - \frac{1}{1 + t}\right)dt = \{2t - 2\ln|1 + t| + C\}.$$

Finally we apply the inverse substitution

$$t \to u^{-1}(z) = \sqrt{z},$$

getting

$$\int \frac{dz}{1 + \sqrt{z}} = \{2\sqrt{z} - 2\ln(1 + \sqrt{z}) + C\}.$$

(Is this really the same as the previous answer? Why or why not?)
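We have used the substitutions $x \to \operatorname{Sin}\theta$, $x \to \operatorname{Tan}\theta$, and $x \to \operatorname{Sec}\theta$ to handle integrands involving the radicals $\sqrt{1 - x^2}$, $\sqrt{1 + x^2}$, and $\sqrt{x^2 - 1}$. Slight variations enable us to take care of more general cases, involving $\sqrt{a^2 - x^2}$, $\sqrt{a^2 + x^2}$, and $\sqrt{x^2 - a^2}$. For example, to find

$$\int \sqrt{a^2 - x^2}\,dx \qquad (a > 0),$$

we use $x \to a\operatorname{Sin}\theta = u(\theta)$, so that

$$a^2 - x^2 \to a^2(1 - \operatorname{Sin}^2\theta), \qquad \sqrt{a^2 - x^2} \to a\cos\theta.$$

Here $x/a = \operatorname{Sin}\theta$, so that

$$\theta = \operatorname{Sin}^{-1}\frac{x}{a}.$$

Thus

$$\begin{aligned}
\int \sqrt{a^2 - x^2}\,dx &\to \int (a\cos\theta)\,a\cos\theta\,d\theta = \int a^2\cos^2\theta\,d\theta \\
&= a^2\int \tfrac{1}{2}(\cos 2\theta + 1)\,d\theta \\
&= \left\{\frac{a^2}{4}\sin 2\theta + \frac{a^2}{2}\theta + C\right\}.
\end{aligned}$$

Now

$$\sin 2\theta = 2\sin\theta\cos\theta = 2\cdot\frac{x}{a}\cdot\frac{1}{a}\sqrt{a^2 - x^2}.$$

This gives

$$\int \sqrt{a^2 - x^2}\,dx = \left\{\tfrac{1}{2}x\sqrt{a^2 - x^2} + \frac{a^2}{2}\operatorname{Sin}^{-1}\frac{x}{a} + C\right\}.$$

In the same way, we use $x \to a\operatorname{Tan}\theta$ to get rid of $\sqrt{a^2 + x^2}$; and we use $x \to a\operatorname{Sec}\theta$ to get rid of $\sqrt{x^2 - a^2}$.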
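The formula for $\int \sqrt{a^2 - x^2}\,dx$ has a familiar geometric check: taken from $0$ to $a$, the integral is the area of a quarter of a disc of radius $a$. The following sketch is an illustration under stated assumptions (the value $a = 3$, the midpoint rule, and the function name `F` are choices made here, not taken from the text).

```python
import math

def F(x, a):
    """Antiderivative of sqrt(a^2 - x^2) obtained from x -> a Sin(theta)."""
    return 0.5 * x * math.sqrt(a * a - x * x) + 0.5 * a * a * math.asin(x / a)

a = 3.0

# Integral from 0 to a: should equal a quarter of the area pi * a^2.
quarter_disc = F(a, a) - F(0.0, a)
print(quarter_disc, math.pi * a * a / 4)

# Midpoint-rule estimate over [0, a/2], compared with the formula.
n = 20000
h = (a / 2) / n
midpoint = sum(math.sqrt(a * a - ((i + 0.5) * h) ** 2) for i in range(n)) * h
exact_half = F(a / 2, a) - F(0.0, a)
print(midpoint, exact_half)
```

The second comparison stays away from $x = a$, where the integrand has a vertical tangent and the midpoint rule converges more slowly.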

There are miscellaneous substitutions which work on miscellaneous problems. For example, in

$$\int \frac{dx}{x^2(x^2 + 1)},$$

the trouble seems to be that the integrand is concentrated in its own denominator. We ought to be able to correct this by letting

$$x \to u(t) = \frac{1}{t}, \qquad dx \to -\frac{1}{t^2}\,dt.$$

This gives

$$\int \frac{dx}{x^2(x^2 + 1)} \to \int \left(-1 + \frac{1}{1 + t^2}\right)dt = \{-t + \operatorname{Tan}^{-1}t + C\} \to \left\{-\frac{1}{x} + \operatorname{Tan}^{-1}\frac{1}{x} + C\right\}.$$

Here, in the last step, we have applied the inverse substitution

$$t \to u^{-1}(x) = \frac{1}{x}.$$

In writing up solutions of problems, in the following problem set, you need not draw diagrams of the form:

$$\begin{array}{ccc}
\int f(x)\,dx & = & \{G(u^{-1}(x)) + C\} \qquad (!) \\
\big\downarrow\ {\scriptstyle x \to u(t)} & & \big\uparrow\ {\scriptstyle t \to u^{-1}(x)} \\
\int f(u(t))\,u'(t)\,dt & = & \{G(t) + C\}
\end{array}$$

But whenever you use a substitution, you should explain what you are doing, by writing formulas of the type

$$x \to u(t).$$

PROBLEM SET 6.7

Calculate the following, by any method.


1.
4.

7.
10.

13.
16.
19.
22.
25.
28.
31.

J
J
J
J
J
J
J
J
J
J
J

dx

2.

V:x-)3

(I +

(a2 + x2)-af2dx

5.

dx

8.

l + x
z3dz

11.

vz + I
dx

14.

l + vx
(I +

vx)3vxdx

17.

dx

20.

Yl + e2"'
x Sin-1xdx

23.

l
dx
(1 + vx)2

26.

29.

dx
v1 +fix
dx

32.

x3(1 - x)

J
J
J
J

vx
(1 + Vx)3

3.

dx

dx

6.

v1 + e"'
dx

9.

(1 - x)2
z3dz
Vz2

12.

dx

J
J
J
J
J
J :
J
x4(x

15.

I)

dx

18.

Sin-1xdx

21.

Tan-1 xdx

24.

1
dx
vx(l + vx)2

27.

dx
e"')2

30.

(l

33.

x2 ln xdx

J
J
J Yvx
J
J
J
J
J
J
J
J
(a2

x2)-af2dx

dx

v1 - e"'
dx
+ 1

dx

YI + 'x

(l - x2)4dx
dx

(1 + e"')4

xln xdx
xTan-1 xdx
1

l + fix

dx

dx

v 1 + e3X

x2Tan-1xdx

6.8 ALGEBRAIC DEVICES: COMPLETING THE SQUARE AND PARTIAL FRACTIONS

In Section 6.6 we used trigonometric substitutions to calculate integrals involving $\sqrt{a^2 \pm x^2}$ and $\sqrt{x^2 - a^2}$. By completing the square, we can extend these methods so as to take care of expressions of the form $\sqrt{ax^2 + bx + c}$. For example,

$$x^2 + x + 1 = x^2 + x + \tfrac{1}{4} + \tfrac{3}{4} = (x + \tfrac{1}{2})^2 + \left(\frac{\sqrt{3}}{2}\right)^2.$$

Therefore

$$\int \frac{dx}{\sqrt{x^2 + x + 1}} = \int \frac{dx}{\sqrt{(x + \tfrac{1}{2})^2 + (\sqrt{3}/2)^2}},$$

which has the form

$$\int \frac{du}{\sqrt{u^2 + a^2}}.$$

We can calculate this by the substitution $u \to a\operatorname{Tan}\theta$. This gives

$$\{\ln|\sqrt{u^2 + a^2} + u| + C\} = \{\ln|\sqrt{x^2 + x + 1} + x + \tfrac{1}{2}| + C\}.$$

Similarly,

$$x^2 + x = x^2 + x + \tfrac{1}{4} - \tfrac{1}{4} = (x + \tfrac{1}{2})^2 - (\tfrac{1}{2})^2,$$

and so

$$\int \frac{dx}{\sqrt{x^2 + x}} = \int \frac{dx}{\sqrt{(x + \tfrac{1}{2})^2 - (\tfrac{1}{2})^2}},$$

which has the form

$$\int \frac{du}{\sqrt{u^2 - a^2}}.$$

Here we would use $u \to a\operatorname{Sec}\theta$, and proceed as in Section 6.6.
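The following simple-looking problem has a curious solution:

$$\int \frac{dx}{x^2 - 1} = \;?$$

We try

$$x \to \operatorname{Sec}\theta \qquad (x > 1),$$

so that $dx \to \sec\theta\,\tan\theta\,d\theta$, giving

$$\int \frac{dx}{x^2 - 1} \to \int \frac{\sec\theta\,\tan\theta\,d\theta}{\tan^2\theta} = \int \csc\theta\,d\theta = \{-\ln|\csc\theta + \cot\theta| + C\}.$$

Now

$$\csc\theta + \cot\theta \to \frac{x}{\sqrt{x^2 - 1}} + \frac{1}{\sqrt{x^2 - 1}} = \frac{x + 1}{\sqrt{x^2 - 1}} = \frac{x + 1}{\sqrt{(x + 1)(x - 1)}} = \sqrt{\frac{x + 1}{x - 1}} \qquad (x > 1).$$

This gives the answer

$$\int \frac{dx}{x^2 - 1} = \left\{\tfrac{1}{2}\ln\left|\frac{x - 1}{x + 1}\right| + C\right\} \qquad (x > 1).$$

We check by differentiation:

$$\begin{aligned}
D\left(\tfrac{1}{2}\ln\left|\frac{x - 1}{x + 1}\right|\right) &= D\left(\tfrac{1}{2}\ln|x - 1| - \tfrac{1}{2}\ln|x + 1|\right) \\
&= \frac{1}{2}\cdot\frac{1}{x - 1} - \frac{1}{2}\cdot\frac{1}{x + 1} = \frac{1}{2}\cdot\frac{(x + 1) - (x - 1)}{(x - 1)(x + 1)} = \frac{1}{x^2 - 1}.
\end{aligned}$$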

This shows that our answer was right. But it also shows that our use of trigonometry was unnecessary; the solution depends merely on the algebraic identity

$$\frac{1}{x^2 - 1} = \frac{\tfrac{1}{2}}{x - 1} + \frac{-\tfrac{1}{2}}{x + 1}.$$

This suggests that we should have a systematic method of breaking up rational functions into sums of simpler functions. We call this the method of partial fractions.

Theorem 1. If $a \ne b$, then there are numbers $A$ and $B$ such that

$$\frac{cx + d}{(x - a)(x - b)} = \frac{A}{x - a} + \frac{B}{x - b} \qquad (x \ne a, b).$$

Proof. The obvious method works:

$$\begin{aligned}
\frac{cx + d}{(x - a)(x - b)} = \frac{A}{x - a} + \frac{B}{x - b}
&\Leftrightarrow\ cx + d = A(x - b) + B(x - a) \\
&\Leftrightarrow\ A + B = c \ \text{ and } \ Ab + Ba = -d.
\end{aligned}$$

We solve for $A$ and $B$ by any method, getting the solution

$$A = \frac{ac + d}{a - b}, \qquad B = \frac{bc + d}{b - a}.$$

These values satisfy both equations.

Note that since $a - b$ appears in the denominators, we really needed the hypothesis $a \ne b$. And for $a = b$, the theorem is false. That is, you cannot express $1/(x - a)^2$ in the form

$$\frac{A}{x - a} + \frac{B}{x - a} = \frac{A + B}{x - a}.$$

It might seem that we should have stated a stronger theorem, as follows:

Theorem 1'. If $a \ne b$, then

$$\frac{cx + d}{(x - a)(x - b)} = \frac{ac + d}{a - b}\cdot\frac{1}{x - a} + \frac{bc + d}{b - a}\cdot\frac{1}{x - b}.$$

But nobody could remember this formula. The efficient way to handle such problems is the following. Given

$$\int \frac{dx}{(x - 2)(x - 5)} = \;?$$

we know by Theorem 1 that there are numbers $A$ and $B$ such that

$$\frac{1}{(x - 2)(x - 5)} = \frac{A}{x - 2} + \frac{B}{x - 5}.$$

The only problem is to find out what they are, numerically. We first write

$$1 = A(x - 5) + B(x - 2).$$

Since this equation holds for every $x$, it must hold for $x = 2$ and for $x = 5$. Therefore

$$1 = A\cdot(-3), \qquad 1 = B\cdot 3; \qquad A = -\tfrac{1}{3}, \qquad B = \tfrac{1}{3}.$$
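The shortcut of substituting the roots of the denominator is mechanical enough to be written as a short program. The sketch below is an illustration, not the author's method: the function name `partial_fraction_coefficients` is hypothetical, and exact rational arithmetic from Python's standard `fractions` module is used so that the results can be compared with the hand calculation without rounding error.

```python
from fractions import Fraction

def partial_fraction_coefficients(numerator, roots):
    """Coefficients A_k in numerator(x) / prod(x - r_k) = sum of A_k / (x - r_k).

    Assumes the roots are all different and that the degree of the numerator
    is less than the number of roots, so that a decomposition exists.
    """
    coefficients = []
    for k, r in enumerate(roots):
        # Substitute x = r: every term except the k-th one vanishes.
        denominator = Fraction(1)
        for j, s in enumerate(roots):
            if j != k:
                denominator *= Fraction(r) - Fraction(s)
        coefficients.append(Fraction(numerator(r)) / denominator)
    return coefficients

# 1 / ((x - 2)(x - 5)): substitute x = 2 and x = 5.
A, B = partial_fraction_coefficients(lambda x: 1, [2, 5])
print(A, B)

# Spot check of the identity at a point that is not a root.
x = Fraction(7)
lhs = 1 / ((x - 2) * (x - 5))
rhs = A / (x - 2) + B / (x - 5)
print(lhs == rhs)
```

The program relies on the same existence theorem as the hand calculation: substituting the roots only identifies the coefficients once we know that a decomposition of this form exists.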

This is another example of the efficiency of existence theorems: often, if you know in advance that a problem has a solution, you can use a simple procedure to find out what the answer is. Without Theorem 1, the shortcut calculation of

$$\frac{1}{(x - 2)(x - 5)} = -\frac{1}{3}\cdot\frac{1}{x - 2} + \frac{1}{3}\cdot\frac{1}{x - 5}$$

would not have been valid. To see this, consider the following analogous procedure:

"Problem." Find the numbers $a$ and $b$ such that

$$\sin x = ax + b.$$

"Solution." Letting $x = 0$, we get

$$0 = a\cdot 0 + b.$$

Therefore $b = 0$, and

$$\sin x = ax.$$

Letting $x = \pi/2$, we get

$$1 = a\cdot\frac{\pi}{2},$$

and $a = 2/\pi$. Therefore

$$\text{(?)} \qquad \sin x = \frac{2x}{\pi} \quad \text{for every } x. \qquad \text{(?)}$$

This is wrong: in fact, our formula is correct for only three values of $x$, namely, $x = 0$ and $x = \pm\pi/2$. The fallacy was in assuming at the outset that the problem had a solution, when in fact it has none. What the above line of reasoning really proves is the following:

(1) sin is a linear function $\Rightarrow$ (2) sin is the linear function $2x/\pi$.

The statement (1) $\Rightarrow$ (2) is true, but it is not useful, because (1) is false.

The method that we used for quadratic denominators also works whenever the denominator can be factored into linear factors all of which are different.

$$\frac{2x^2 + 1}{(x + 1)(x + 2)(x + 3)} = \frac{A}{x + 1} + \frac{B}{x + 2} + \frac{C}{x + 3},$$

$$2x^2 + 1 = A(x + 2)(x + 3) + B(x + 1)(x + 3) + C(x + 1)(x + 2),$$

$$\begin{array}{lll}
x = -1, & 3 = 2A, & A = \tfrac{3}{2}; \\
x = -2, & 9 = -B, & B = -9; \\
x = -3, & 19 = 2C, & C = \tfrac{19}{2}.
\end{array}$$

This solution depends on the following existence theorem.

Theorem 2. If $p(x)$ is of degree 2, and $a$, $b$, and $c$ are all different, then there are numbers $A$, $B$, and $C$ such that

$$\frac{p(x)}{(x - a)(x - b)(x - c)} = \frac{A}{x - a} + \frac{B}{x - b} + \frac{C}{x - c}.$$

(3)

With our present equipment, we could give only a brute-force proof. But we know
how to handle simple cases, and in the following problem set you will see how various
more difficult problems of the same type can be solved.
PROBLEM SET 6.8

Find:
1.

4.

7.

8.

9.

12.

15.

J
J
J
J
J
J
J

dx

2.

x2+2x+5
dx

5.

x2+x- 4

J
J

dx
v'x2+2x+5
dx
Yx2+x -

J
J

3.

6.

dx
x2 - 4
dx

v2 - x2

dx
>12 - 2x - x2
dx

(Is this an impossible problem at the outset?)

v -2x- x2
dx

10.

x2+ 6x + IO
dx
(x - l)(x -

13.

2)

J
J

dx

v-x2-

6x +10

x dx
(x - l)(x

11.

14

- 2)

J
J

d
x2

6 +10
dx

x(x - l)(x - 2)

x dx
x(x- l)(x-

2)

Find the unknown coefficients A, B, C, . . . which satisfy the following equations:


16.

18.

20.

1
(x - I)2 (x - 2)
1
x2 (x

23.

Find

25.

Find

(x - 1)2

B
C
+ -+
x- 1
x - 2
--

A
x+ I

--

dx
(x+ l)(xdx
x(xz + I)
dx
x(xz+1)2

2)3

B
(x-

22.
24.

2)3

17.

Find

19. Find

--

(x+ l)(x - 2)3

J
J
J

A
B
C
D
2 +x
- + (x - J)2 +x- I
l )2 = x

21. Find

c
(x -

2)2

J
J

(x

1x

x2(x

1)2

D
(x - 2)

---

A
Bx+ C
=-+
2
x2+ I
x
x(x + I)
I

x(x2+1)2

26. Find

Dx+ E
Bx+ C
A
=-+
+
x
x2+1
(x2+ 1)2

sine
d8
1 + cos 8

2)

302

The Technique of Integration

27.

Given 8

28.

Find J d8/(1 +cos 8).

6.8

2 Tan-1 x, calculate sin 8 and cos 8, in terms of x.

->-

(One way to do this is to use the substitution

u(x)

2 Tan-1 x,

sin 8

_,.

d8
?

_,.

u'(x) dx

cos 8

->-

2 dx
1 + x 2'

But when you see the answer, you may be able to think of a simpler method of solution.)
Find:
29.

32.

35.

f
f
f

d8
1 +sin 8
d8

sin 8 +cos 8
d8
1

sin 8

30.

33.

36.

f
f
J

d8

31.

cos 8
d8

sec 8 +tan 8
d8
2 +cos 8

34.

37.

f
J
J

dx
x2 +6x +9

sec3 8 cot 8 d8
d8

cos 8

sin 8

7 The Definite Integral

7.1 THE PROBLEM OF ARC LENGTH

Consider the parabolic arc which is the graph of $y = x^2$, $0 \le x \le 1$. We shall calculate its length. The ideas that we use to do this will apply to other curves, but the general problem is no harder than the special one; if we were interested in arc length only for parabolas, the ideas in this section would all be needed.

The length of an arc of a circle is defined (as in Section 4.3) as the limit of the lengths of the inscribed broken lines. More generally, suppose that $f$ is a continuous function on a closed interval $[a, b]$. By a net over $[a, b]$ we mean an ascending sequence $N$ of numbers

$$a = x_0 < x_1 < \cdots < x_i < x_{i+1} < \cdots < x_n = b.$$

For each $i$, let

$$P_i = (x_i, f(x_i)).$$

We join successive points $P_{i-1}$, $P_i$ with segments, getting a broken line as in the figure.

[Figure: a broken line inscribed in the graph of $f$.]

Such a broken line is said to be inscribed in the graph of $f$. Its length is

$$P_0P_1 + P_1P_2 + \cdots + P_{n-1}P_n.$$

We denote this by $p(N)$. That is,

$$p(N) = \sum_{i=1}^{n} P_{i-1}P_i.$$

We use the functional notation $p(N)$, because when the net $N$ is named, the broken line is determined, and so also is its length.

The graph of a continuous function on a closed interval may have infinite length. But if the length is finite, we ought to be able to approximate it by using a net $N$ which cuts up $[a, b]$ into very small pieces. This idea is the basis of the following definitions.
Definition. Let

$$N\colon\ a = x_0 < x_1 < \cdots < x_n = b$$

be a net over $[a, b]$. The mesh of $N$ is the largest of the numbers $\Delta x_i = x_i - x_{i-1}$. The mesh of $N$ is denoted by $|N|$.

Definition. If $p(N)$ approaches a limit $L$, as $|N|$ approaches 0, then $f$ is said to be rectifiable, and the number $L$ is called its length.

We need, of course, to explain what is meant by the statement

$$\lim_{|N| \to 0} p(N) = L.$$

Intuitively, this means that $p(N) \approx L$ when $|N| \approx 0$. We define this idea by the same method that we used to define the limit of a function at a point. To make the analogy clearer, we write the old and new definitions in parallel.

Definition (old). Let $f$ be a function, on an interval $[a, b]$. Let $x_0$ be a point of $[a, b]$. Suppose that for every $\epsilon > 0$ there is a $\delta > 0$ such that if $x$ is a point of $[a, b]$, then

$$0 < |x - x_0| < \delta \ \Rightarrow\ |f(x) - L| < \epsilon.$$

Then

$$\lim_{x \to x_0} f(x) = L.$$

Definition (new). Let $f$ be a function, on an interval $[a, b]$. Suppose that for every $\epsilon > 0$ there is a $\delta > 0$ such that if $N$ is a net over $[a, b]$, then

$$|N| < \delta \ \Rightarrow\ |p(N) - L| < \epsilon.$$

Then

$$\lim_{|N| \to 0} p(N) = L.$$

To calculate the arc length, we first express the length $p(N)$ (of the inscribed broken line) in terms of things that we know how to handle. By definition,

$$p(N) = \sum_{i=1}^{n} P_{i-1}P_i.$$

(See the preceding figure.) The segment from $P_{i-1}$ to $P_i$ looks like this:

[Figure: the segment from $P_{i-1}$ to $P_i$ over the interval $[x_{i-1}, x_i]$, with $y_{i-1} = f(x_{i-1})$ and $y_i = f(x_i)$.]

Thus

$$(P_{i-1}P_i)^2 = \Delta x_i^2 + \Delta y_i^2 = \left[1 + \left(\frac{\Delta y_i}{\Delta x_i}\right)^2\right]\Delta x_i^2.$$

Here the fraction $\Delta y_i/\Delta x_i$ is the slope of the chord from $P_{i-1}$ to $P_i$; and the mean-value theorem says that this is the slope of the tangent line at some intermediate point. Thus we have

$$\frac{\Delta y_i}{\Delta x_i} = f'(\bar{x}_i) \qquad (x_{i-1} < \bar{x}_i < x_i).$$

Making this substitution, and extracting the square root, we get

$$P_{i-1}P_i = \sqrt{1 + [f'(\bar{x}_i)]^2}\,\Delta x_i.$$

Taking the sum from $i = 1$ to $n$, we get

$$p(N) = \sum_{i=1}^{n} P_{i-1}P_i = \sum_{i=1}^{n} \sqrt{1 + [f'(\bar{x}_i)]^2}\,\Delta x_i.$$

The problem is to find out what happens to the sum on the right as $|N| \to 0$. We can find this by giving a geometric interpretation to the sum.

[Figure: the graph of $g$, with a rectangle of altitude $g(\bar{x}_i)$ set up on the interval $[x_{i-1}, x_i]$.]

For each $x$ on $[a, b]$, let

$$g(x) = \sqrt{1 + [f'(x)]^2}.$$

In the figure, $x_{i-1} < \bar{x}_i < x_i$ for each $i$ from 1 to $n$. On each little interval $[x_{i-1}, x_i]$, of length $\Delta x_i = x_i - x_{i-1}$, we have set up a rectangle with $[x_{i-1}, x_i]$ as base, and altitude $g(\bar{x}_i)$. The area of this rectangle is then $g(\bar{x}_i)\,\Delta x_i$. The sum of these areas is

$$\sum_{i=1}^{n} g(\bar{x}_i)\,\Delta x_i = \sum_{i=1}^{n} \sqrt{1 + [f'(\bar{x}_i)]^2}\,\Delta x_i = p(N).$$

If $f'$ is continuous, then so is $g$; and $\sum g(\bar{x}_i)\,\Delta x_i$ ought to be close to the area under the graph of $g$, when the mesh of the net $N$ is small. That is, we ought to have

$$|N| \approx 0 \ \Rightarrow\ \sum_{i=1}^{n} g(\bar{x}_i)\,\Delta x_i \approx \int_a^b g(x)\,dx,$$

which means that

$$\lim_{|N| \to 0} \sum_{i=1}^{n} g(\bar{x}_i)\,\Delta x_i = \int_a^b g(x)\,dx.$$

This gives us a formula for arc length:

$$L = \int_a^b \sqrt{1 + [f'(x)]^2}\,dx.$$

This holds whenever $f'$ is continuous; and we will complete the proof later in this chapter. Meanwhile, consider some examples.
Example 1. Let

$$f(x) = 1, \qquad 0 \le x \le 1.$$

Then $f'(x) = 0$, and

$$L = \int_0^1 \sqrt{1 + 0^2}\,dx = 1,$$

which is the right answer.

Example 2. Let

$$f(x) = kx, \qquad 0 \le x \le 1.$$

Then $f'(x) = k$, and

$$L = \int_0^1 \sqrt{1 + k^2}\,dx = \sqrt{1 + k^2},$$

which is the right answer, by the Pythagorean theorem.

Example 3. Let

$$f(x) = \sqrt{1 - x^2}, \qquad 0 \le x \le \frac{\sqrt{2}}{2}.$$

Then

$$f'(x) = \frac{-x}{\sqrt{1 - x^2}}, \qquad [f'(x)]^2 = \frac{x^2}{1 - x^2}, \qquad 1 + [f'(x)]^2 = \frac{1 - x^2 + x^2}{1 - x^2} = \frac{1}{1 - x^2},$$

and

$$L = \int_0^{\sqrt{2}/2} \sqrt{1 + [f'(x)]^2}\,dx = \int_0^{\sqrt{2}/2} \frac{dx}{\sqrt{1 - x^2}} = \left[\operatorname{Sin}^{-1}x\right]_0^{\sqrt{2}/2} = \frac{\pi}{4}.$$

This is the right answer, because $L$ is one-eighth of the circumference of a circle of radius 1.

Example 4. We return to $f(x) = x^2$, $0 \le x \le 1$. Here

$$f'(x) = 2x, \qquad 1 + [f'(x)]^2 = 1 + 4x^2, \qquad L = \int_0^1 \sqrt{1 + 4x^2}\,dx.$$

Now $\int \sqrt{1 + 4x^2}\,dx$ is

$$\{\tfrac{1}{2}x\sqrt{1 + 4x^2} + \tfrac{1}{4}\ln|2x + \sqrt{1 + 4x^2}| + C\}.$$

Therefore, by the fundamental theorem of integral calculus, we have

$$L = \frac{\sqrt{5}}{2} + \tfrac{1}{4}\ln(2 + \sqrt{5}).$$

The answer suggests that no method would have made the problem look easy.
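The definition of arc length as a limit of lengths of inscribed broken lines can be tried out directly on the parabola of Example 4. The sketch below is an illustration only: it uses uniform nets (a convenience assumed here, since the definition allows any net), and the function name `polygon_length` is hypothetical. It prints $p(N)$ for nets of 1, 10, 100, and 1000 pieces alongside the gap to the value of $L$ found above.

```python
import math

def polygon_length(f, a, b, n):
    """Length p(N) of the broken line inscribed over the uniform net with n pieces."""
    xs = [a + (b - a) * i / n for i in range(n + 1)]
    return sum(math.hypot(xs[i] - xs[i - 1], f(xs[i]) - f(xs[i - 1]))
               for i in range(1, n + 1))

def parabola(x):
    return x * x

# Value of L from Example 4.
exact = math.sqrt(5) / 2 + 0.25 * math.log(2 + math.sqrt(5))

sizes = (1, 10, 100, 1000)
lengths = [polygon_length(parabola, 0.0, 1.0, n) for n in sizes]
for n, p in zip(sizes, lengths):
    print(n, p, exact - p)
```

With one piece the broken line is the chord from $(0, 0)$ to $(1, 1)$, of length $\sqrt{2}$; as the mesh shrinks, the lengths increase toward $L$ from below, as the triangle inequality suggests they should for nets that refine one another.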

PROBLEM SET 7.1


Find the lengths of the graphs of the following functions, between the indicated limits.

I.

f(x) = x312,

x, 0 x 7T/4
x, 7r/4 x 7r/2
*4. f (x) = Jn x, 1 ;;; x ;;; 3
5. f (x)
1 + tx3i2, 0 ;;; x 4
2.

3.

f(x) =
f(x)

x ;;; 2

In cos

In sin

*6. f (x)
7. a)

x
= e ,

/(x)

t(ex

e-

"'

) , 0 x 1 (You can solve this one, by an algebraic trick,

without using any of the standard formulas for hyperbolic functions.


problem is a little easier if you remember that sinh

}Ce"

e-x .

He"' -

e-" ,

But the

) cosh x

For the definitions of the hyperbolic functions sinh and cosh, and the

formulas governing them, see the end of Section

4.11.)

8. Let $f$ be a function with a continuous derivative, on an interval containing $x_0$.

[Figure: the graph of $f$, with the points $P_0$ and $P_x$, the arc between them, and the chord $P_0P_x$.]

Let $r(x) = \overset{\frown}{P_0P_x}/P_0P_x$, that is, the ratio of the arc length to the length of the chord. Show that $\lim_{x \to x_0} r(x) = 1$. To prove this, you will need to use the formula which expresses $\overset{\frown}{P_0P_x}$ as an integral.

In many books you will see a "proof" of the integral formula for arc length, based on the assumption that $r(x) \to 1$. This is an example of a proof of infinite thinness: the hole in it is as big as the proof, because to fill the hole you must first prove the theorem itself by another method.

9. Consider the sequence of broken lines suggested by the figure below. Each broken line forms a stairway from $P$ to $Q$. The $n$th stairway has $n$ vertical segments and $n$ horizontal segments. For each $n$, let $B_n$ be the length of the $n$th broken line. Find $\lim_{n \to \infty} B_n$.

[Figure: stairways from $P$ to $Q$.]

10. Let $f$ be any function on $[a, b]$, and let $x$ be any point of $(a, b)$. Let $m_1$, $m_2$, and $m$ be the slopes of the chords over the intervals $[a, x]$, $[x, b]$, and $[a, b]$. Show that $m$ is between $m_1$ and $m_2$. (Unless, of course, $m_1 = m_2 = m$.) More precisely,

$$m_1 = \frac{f(x) - f(a)}{x - a}, \qquad m_2 = \frac{f(b) - f(x)}{b - x}, \qquad \text{and} \qquad m = \frac{f(b) - f(a)}{b - a}.$$

The theorem says that either (a) $m_1 \le m \le m_2$ or (b) $m_2 \le m \le m_1$.

7.2 THE DEFINITE INTEGRAL, DEFINED AS A LIMIT OF SAMPLE SUMS

In Section 3.7, we defined the definite integral in terms of area, with areas above the x-axis counted positively and those below counted negatively. In the preceding section, however, we regarded the integral as the limit of a sum:

$$\int_a^b g(x)\,dx = \lim_{|N| \to 0} \sum_{i=1}^{n} g(\bar{x}_i)\,\Delta x_i.$$

Most of the time hereafter, the definite integral will be used in this way, and so we shall redefine the integral, using the above formula as a definition. For this purpose, we need to investigate nets, and sums of the type

$$\sum_{i=1}^{n} g(\bar{x}_i)\,\Delta x_i.$$
Consider first an increasing continuous function $f$, on an interval $[a, b]$.

[Figure: two graphs of an increasing function $f$ on $[a, b]$, with the net points $x_0 = a, x_1, x_2, x_3, \ldots, x_n = b$ marked and the levels $f(a)$ and $f(b)$ indicated; part (a) shows a positive function, part (b) a function which takes negative values.]

On the interval $[a, b]$ we form a net

$$N\colon\ a = x_0 < x_1 < \cdots < x_{i-1} < x_i < \cdots < x_n = b.$$

The points of the net $N$ cut up the interval $[a, b]$ into little intervals $[x_{i-1}, x_i]$. For each $i$ from 1 to $n$, let $m_i$ be the minimum value of $f$ on the $i$th interval $[x_{i-1}, x_i]$, and let $M_i$ be the maximum value. Since $f$ is increasing, we have $m_i = f(x_{i-1})$, $M_i = f(x_i)$. As usual, $\Delta x_i = x_i - x_{i-1}$, so that $\Delta x_i$ is the length of the $i$th interval $[x_{i-1}, x_i]$. If $f$ is positive, as in part (a) of the figure above, then the sum

$$s(N) = \sum_{i=1}^{n} m_i\,\Delta x_i$$

is the sum of the areas of the inscribed rectangles, and the sum

$$S(N) = \sum_{i=1}^{n} M_i\,\Delta x_i$$

is the sum of the areas of the circumscribed rectangles. For functions which may be negative, as in part (b) of the figure, $s(N)$ and $S(N)$ are sums of signed areas. In either case, $s(N)$ is called the lower sum of $f$ over the net $N$, and $S(N)$ is called the upper sum of $f$ over $N$.

On each interval $[x_{i-1}, x_i]$ we choose a sample point $\bar{x}_i$. Thus

$$x_{i-1} \le \bar{x}_i \le x_i \qquad (1 \le i \le n).$$

The sequence

$$X\colon\ \bar{x}_1, \bar{x}_2, \ldots, \bar{x}_n$$

is called a sample of the net $N$. The sample gives a sum

$$\sum(X) = \sum_{i=1}^{n} f(\bar{x}_i)\,\Delta x_i.$$

This is called the sample sum of $f$ over the sample $X$.

As in the preceding section, the mesh of $N$ is the largest of the numbers $\Delta x_i$. The mesh is denoted by $|N|$. Thus

$$|N| = \max\{\Delta x_i\}.$$

Consider the region between the graph of $f$ and the x-axis, from $a$ to $b$.

[Figure: the region between the graph of $f$ and the x-axis.]

Theorem 1. If $f$ is continuous, and $N_1$ and $N_2$ are any nets over $[a, b]$, then

$$s(N_1) \le S(N_2).$$

That is, every lower sum of $f$ is less than or equal to every upper sum of $f$. For positive functions this is obvious, because in this case $s(N_1)$ is the area of an inscribed polygonal region (lying under the curve) and $S(N_2)$ is the area of a circumscribed polygonal region. In general,

$$s(N_1) = A - B, \qquad S(N_2) = C - D,$$

where $A$ and $D$ are areas of inscribed regions, and $B$ and $C$ are areas of circumscribed regions. (To see how this works, see the figure (b) above.) Therefore,

$$A \le C, \qquad B \ge D, \qquad -B \le -D, \qquad A - B \le C - D,$$

and $s(N_1) \le S(N_2)$, as before.

Note that in Theorem 1 it is not required that $f$ be an increasing function.

Theorem 2. If $f$ is continuous and increasing, then

$$\lim_{|N| \to 0} [S(N) - s(N)] = 0.$$

That is, the upper sums are close to the corresponding lower sums, when the mesh is small. To prove this, we observe that the difference $S(N) - s(N)$ has a geometric interpretation.

[Figure: the graph of an increasing function between the levels $f(a)$ and $f(b)$, with the rectangles representing $S(N) - s(N)$ drawn solid.]

This difference is

$$S(N) - s(N) = \sum_{i=1}^{n} M_i\,\Delta x_i - \sum_{i=1}^{n} m_i\,\Delta x_i = \sum_{i=1}^{n} (M_i - m_i)\,\Delta x_i;$$

and this is the sum of the areas of the rectangles drawn solid in the figure. These rectangles can all be moved to the left and stacked up inside a rectangle of altitude $f(b) - f(a)$ and base $|N|$. (Remember that $|N|$ is the largest of the numbers $\Delta x_i$.) Therefore

$$S(N) - s(N) \le [f(b) - f(a)]\,|N|,$$

and so

$$S(N) - s(N) \to 0 \quad \text{as} \quad |N| \to 0.$$

Theorem 3.

If $f$ is continuous, and

$$\lim_{|N| \to 0} [S(N) - s(N)] = 0,$$

then the sample sums $\sum(X)$ approach a limit, as $|N| \to 0$. That is, there is a number $k$ such that

$$\lim_{|N| \to 0} \sum(X) = k.$$

Proof. We have to start the proof by naming the number $k$. The numbers $s(N)$ are bounded above. (By Theorem 1, every upper sum is an upper bound of the lower sums.) Let

$$k = \sup\{s(N)\}.$$

Consider an interval $[x_{i-1}, x_i]$. For each $i$,

$$m_i \le f(\bar{x}_i) \le M_i.$$

Since $\Delta x_i > 0$, this gives

$$m_i\,\Delta x_i \le f(\bar{x}_i)\,\Delta x_i \le M_i\,\Delta x_i.$$

Therefore the sums from 1 to $n$ rank in the same order:

$$\sum_{i=1}^{n} m_i\,\Delta x_i \le \sum_{i=1}^{n} f(\bar{x}_i)\,\Delta x_i \le \sum_{i=1}^{n} M_i\,\Delta x_i,$$

so that

$$s(N) \le \sum(X) \le S(N).$$

That is, every sample sum lies between the lower sum and the upper sum.

We are now almost done. Given $\epsilon > 0$, we want a $\delta > 0$ such that

$$|N| < \delta \ \Rightarrow\ \left|\sum(X) - k\right| < \epsilon.$$

By the hypothesis of Theorem 3, there is a $\delta > 0$ such that

$$|N| < \delta \ \Rightarrow\ S(N) - s(N) < \epsilon.$$

Thus when $|N| < \delta$, the interval from $s(N)$ to $S(N)$ has length less than $\epsilon$. This interval contains $\sum(X)$. (See the inequalities above.) And it also contains $k$: $s(N) \le k$, because $k$ is an upper bound for the lower sums; and $k \le S(N)$, because $k$ is the least upper bound of the lower sums, and $S(N)$ is an upper bound of them. Therefore $|\sum(X) - k| < \epsilon$, because $\sum(X)$ and $k$ are squeezed together: they both lie on the same short interval.
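The inequalities used in Theorems 2 and 3 can be watched at work on a concrete case. The sketch below is an illustration under assumptions made here: the increasing function $f(x) = x^3$ on $[0, 2]$, a randomly chosen net and sample, and the function name `sums_over_net` are all choices made for the example. It computes the lower sum, a sample sum, and the upper sum, together with the bound $[f(b) - f(a)]\,|N|$ from the proof of Theorem 2.

```python
import random

def sums_over_net(f, net, sample):
    """Lower sum s(N), sample sum, and upper sum S(N) for an increasing f."""
    lower = middle = upper = 0.0
    for i in range(1, len(net)):
        dx = net[i] - net[i - 1]
        lower += f(net[i - 1]) * dx       # m_i = f(x_{i-1}), since f is increasing
        middle += f(sample[i - 1]) * dx   # f at the chosen sample point
        upper += f(net[i]) * dx           # M_i = f(x_i)
    return lower, middle, upper

def f(x):
    return x * x * x  # an increasing continuous function, chosen for illustration

random.seed(0)
a, b, n = 0.0, 2.0, 500
interior = sorted(random.uniform(a, b) for _ in range(n - 1))
net = [a] + interior + [b]
sample = [random.uniform(net[i - 1], net[i]) for i in range(1, len(net))]
mesh = max(net[i] - net[i - 1] for i in range(1, len(net)))

s, sigma, S = sums_over_net(f, net, sample)
bound = (f(b) - f(a)) * mesh
print(s, sigma, S)
print(S - s, bound)
```

For this function the integral from 0 to 2 is 4, so the three sums should bracket 4, and the gap $S(N) - s(N)$ should not exceed the printed bound.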

We can now give the new definition of the integral.


Definition.

fa f(x) dx

lim1N1-?0

I (X), if the indicated limit on the right exists.


f is integrable on [a, b ].

If the limit exists, then we say that

Theorems 2 and 3 fit together to give:


Theorem 4.

If f is continuous and increasing on

[a, b], then f is integrable on [a, b ].

Later we shall see that all continuous functions are integrable, whether or not
they are increasing.
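
The squeeze in the proof of Theorem 3 can be checked numerically. The following Python sketch is only an illustration and is not part of the text: the function name `sums`, the choice $f(x) = x^2$ on $[0, 1]$, and the random choice of sample points are assumptions made for the example.

```python
import random

def sums(f, a, b, n, seed=0):
    """Return (lower, sample, upper) sums of an increasing f over an equal net."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    dx = (b - a) / n
    lower = sample = upper = 0.0
    for i in range(1, n + 1):
        left, right = a + (i - 1) * dx, a + i * dx
        lower += f(left) * dx          # m_i = f(x_{i-1}) because f is increasing
        upper += f(right) * dx         # M_i = f(x_i)   because f is increasing
        sample += f(rng.uniform(left, right)) * dx   # an arbitrary sample point
    return lower, sample, upper

def square(x):
    return x * x                       # increasing on [0, 1]

for n in (10, 100, 1000):
    s, sigma, S = sums(square, 0.0, 1.0, n)
    bound = (square(1.0) - square(0.0)) * (1.0 / n)   # [f(b) - f(a)] |N|
    print(n, s, sigma, S, S - s <= bound + 1e-12)
```

As $n$ grows, the three sums close in on the same number, which is the behavior that the definition of the integral as a limit of sample sums relies on.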
Our calculations of definite integrals have been based on the differentiation
formula

$$D \int_a^x f(t)\,dt = f(x),$$

where $f$ is continuous. We need to know that this differentiation formula still holds,
under the new definition of the integral. This is the purpose of the following theorem.
Theorem 5 (The betweenness theorem for integrals). If $f$ is integrable on $[a, b]$, and

$$m \le f(x) \le M \qquad (a \le x \le b),$$

then

$$m(b - a) \le \int_a^b f(x)\,dx \le M(b - a).$$

Proof.

Let $N$ be any net over $[a, b]$, and let $X$ be any sample of $N$. Then

$$m \le f(\bar{x}_i) \le M$$

for every $i$. Therefore

$$m\,\Delta x_i \le f(\bar{x}_i)\,\Delta x_i \le M\,\Delta x_i.$$

Forming the sample sum $\sum(X)$ by addition, we get

$$\sum_{i=1}^{n} m\,\Delta x_i \le \sum(X) \le \sum_{i=1}^{n} M\,\Delta x_i.$$

But

$$\sum_{i=1}^{n} m\,\Delta x_i = m \sum_{i=1}^{n} \Delta x_i; \qquad \sum_{i=1}^{n} M\,\Delta x_i = M \sum_{i=1}^{n} \Delta x_i,$$

and

$$\sum_{i=1}^{n} \Delta x_i = b - a.$$

(Why?) Therefore

$$m(b - a) \le \sum(X) \le M(b - a);$$

and this holds for every sample sum, over every net $N$. Therefore the same inequalities
hold for $\lim \sum(X)$, and the integral lies between $m(b - a)$ and $M(b - a)$, which
was to be proved.
If you review the proof of the formula

$$D \int_a^x f(t)\,dt = f(x),$$

in Section 3.10, you will find that in this proof, all that we needed to know about the
integral was the information conveyed by the betweenness theorem for integrals.
Therefore the differentiation formula continues to hold, under our new definition,
wherever the integrand is continuous. It follows that the fundamental theorem of
integral calculus still holds true.

At the end of this chapter we shall prove that every continuous function is integrable.
*PROBLEM SET 7.2
1. In Theorem 2 it was assumed that the function $f$ is increasing. Does the same scheme
of proof work, for a decreasing function? If so, draw a figure illustrating the proof for
decreasing functions. If not, explain how the scheme breaks down, for the case of a
decreasing function.

2. In Theorem 2 it was assumed that $f$ is both continuous and increasing. Suppose we
assume that $f$ is increasing, but not that $f$ is continuous. What changes (if any) do we
then need to make in
a) the definitions of $m_i$ and $M_i$,
b) the definitions of $s(N)$ and $S(N)$, and
c) the proof of Theorem 2?

3. Prove the following:

Theorem A (The mean-value theorem for integrals). If $f$ is continuous on $[a, b]$, then
there is a point $\bar{x}$, between $a$ and $b$, such that

$$\int_a^b f(x)\,dx = f(\bar{x})(b - a).$$

4. Consider the function $f$ defined by the following conditions: $f(\ldots) = \ldots$, $f(\ldots) = \ldots$,
$f(\ldots) = \ldots$; $f(x) = 0$ for every other $x$ on $[0, 1]$. Is this function integrable? Why
or why not?

*5. Consider the following function, on $[0, 1]$. If $x$ is irrational, then $f(x) = 0$. If $x = p/q$,
in lowest terms, then $f(x) = 1/q$. At what points (if any) is this function continuous?
Is the function integrable? Why or why not?

6. Given a continuous function $g$, on $[a, b]$, and a net $N$ over $[a, b]$. Show that there is a
sample $X$ of $N$ such that $\sum(X) = \int_a^b g(x)\,dx$.

7. In Section 7.1 we showed that for every net $N$ we could choose a sample $X$ in such a
way that the length of the inscribed broken line is equal to the sample sum $\sum(X)$,
not just approximately but exactly. Is it always possible to choose a sample $X'$ such
that $\sum(X')$ is exactly equal to the arc length? (Here we are assuming, as usual, that
$f'$ is continuous.)

*8. The following remarks are a very sketchy indication of an amusing proof of an important
theorem which is known to you in a slightly weaker form. Fill in the gaps, and state the
theorem which is proved.
$F' = f$. $f$ is known to be integrable on $[a, b]$, but is not necessarily continuous.

$$\sum_{i=1}^{n} [F(x_i) - F(x_{i-1})] = \sum_{i=1}^{n} f(\bar{x}_i)\,\Delta x_i.$$

As $|N| \to 0$, $\sum_{i=1}^{n} f(\bar{x}_i)\,\Delta x_i \to\ ?$; but $\sum_{i=1}^{n} [F(x_i) - F(x_{i-1})]$ was something simple, all
along.

*9. Let $f$ be differentiable on $[a, b]$. Show that if

$$f'(a) < k < \frac{f(b) - f(a)}{b - a},$$

then $k = f'(x)$, for some $x$ between $a$ and $b$.
[Hint: Remember the definition of $f'(a)$. Do a sketch illustrating the definition.]

*10. Theorem (The no-jump theorem for derivatives). If $f$ is differentiable on $[a, b]$, and $k$
is between $f'(a)$ and $f'(b)$, then $k = f'(x)$ for some $x$ between $a$ and $b$.
Thus, for example, the function

    ...    for $x \ne 0$,
    ...    for $x = 0$

cannot be the derivative of any other function $f$.

*11. Theorem. If $f$ is differentiable at $a$, then

$$\lim_{\substack{x_1 \to a^- \\ x_2 \to a^+}} \frac{f(x_2) - f(x_1)}{x_2 - x_1} = f'(a).$$

More precisely, for every $\epsilon > 0$ there is a $\delta > 0$ such that

$$a - \delta < x_1 < a < x_2 < a + \delta \;\Rightarrow\; \left| \frac{f(x_2) - f(x_1)}{x_2 - x_1} - f'(a) \right| < \epsilon.$$

*12. a) A function $f$ is Lipschitzian on $[a, b]$ if there is a number $k > 0$ such that for every
$x_1$ and $x_2$ of $[a, b]$,

$$|f(x_1) - f(x_2)| \le k\,|x_1 - x_2|.$$

Show that $f(x) = \sin x$ is Lipschitzian on the interval $(-\infty, \infty)$.

b) Show that every Lipschitzian function is continuous.

c) Give an example to show that a continuous function is not necessarily Lipschitzian.

d) Show that if $f'$ is continuous on $[a, b]$, then $f$ is Lipschitzian on $[a, b]$.

e) Show that if $f$ is Lipschitzian on $[a, b]$, then $f$ is integrable.
Since Theorem 3 is known, it will be sufficient to show that if $f$ is Lipschitzian,
then

$$\lim_{|N| \to 0} [S(N) - s(N)] = 0.$$
13. a) A function $f$ is uniformly continuous on $[a, b]$ if for every $\epsilon > 0$ there is a $\delta > 0$
such that

$$|x - x'| < \delta \;\Rightarrow\; |f(x) - f(x')| < \epsilon.$$

(Here $x$ and $x'$ are any points of $[a, b]$.) Show that if $f'$ is continuous, then $f$ is
uniformly continuous on $[a, b]$.
b) Show that every uniformly continuous function is integrable.
7.3

THE CALCULATION OF VOLUMES, BY THE METHOD OF DISKS

The volumes of various solids can be expressed as definite integrals. In this process,
we shall assume that the following volume formulas are known.

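[Figure: a rectangular parallelepiped with edges $a$, $b$, $c$; a right circular cylinder with base radius $r$ and altitude $h$; a cylindrical shell with altitude $h$, outer radius $r$, and inner radius $s$.]

$$V = abc. \qquad V = \pi r^2 h. \qquad V = \pi h(r^2 - s^2).$$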

The first of these solids is a rectangular parallelepiped; the second is a right circular
cylinder; and the third is a cylindrical shell, that is, the portion of the larger cylinder
that lies outside the smaller cylinder.
We get a coordinate system in space by setting up a z-axis, perpendicular to the
xy-plane at the origin. Here, and throughout this chapter, we shall indicate only the
positive half of each axis, thus getting a picture of only the "first octant," in which
the points have nonnegative coordinates.
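[Figure: the coordinate axes in space, showing the first octant.]

Consider now the function

$$f(x) = 1/x$$

on the interval $[1, 2]$. Let $R$ be the region under the graph, in the $xy$-plane. We
rotate the region $R$ about the $x$-axis. This gives a solid $S$.

[Figure: the region $R$ and the solid $S$.]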

Let $vS$ be the volume of $S$. We shall express $vS$ as a definite integral. First we form a
net

$$N\colon x_0, x_1, \ldots, x_n$$

over the interval $[1, 2]$. For convenience, we use equally spaced points, so that

$$x_i - x_{i-1} = \Delta x = 1/n$$

for each $i$. Over the intervals $[x_{i-1}, x_i]$ we set up the circumscribed rectangles. These
form a region $R_n$ which is an approximation of the region $R$. Then we rotate $R_n$
about the $x$-axis. Each of our rectangles then gives a cylinder (lying on its side), and
the cylinders form a solid $S_n$ which is an approximation of $S$. In the figure on the
right below we show only the $i$th cylinder. Its altitude is $\Delta x = x_i - x_{i-1}$, and the
radius of its base is

$$r = f(x_{i-1}) = \frac{1}{x_{i-1}}.$$

[Figure: the circumscribed rectangles, and the $i$th cylinder.]

Therefore its volume is

$$v_i = \pi r^2\,\Delta x = \pi \left( \frac{1}{x_{i-1}} \right)^2 \Delta x,$$

and so the total volume of the circumscribed solid $S_n$ is

$$vS_n = \sum_{i=1}^{n} v_i = \sum_{i=1}^{n} \pi \left( \frac{1}{x_{i-1}} \right)^2 \Delta x.$$

This is a sample sum of the function

$$g(x) = \frac{\pi}{x^2},$$

over the net $N$.
(In fact, it is the upper sum of $g$ over the net $N$, because $g(x_{i-1})$ is the maximum
value of $g$ on the interval $[x_{i-1}, x_i]$.) The mesh of $N$ is

$$|N| = \frac{1}{n},$$

and so $|N| \to 0$ as $n \to \infty$. Therefore

$$\lim_{n \to \infty} vS_n = \int_1^2 \frac{\pi}{x^2}\,dx. \tag{1}$$

If we use inscribed rectangles, and rotate them about the $x$-axis, then we get an
inscribed solid $S_n'$, with volume

$$vS_n' = \sum_{i=1}^{n} \pi \left( \frac{1}{x_i} \right)^2 \Delta x.$$

This is also a sample sum, of the same function $g(x) = \pi/x^2$. Therefore

$$\lim_{n \to \infty} vS_n' = \int_1^2 \frac{\pi}{x^2}\,dx. \tag{2}$$

Therefore the volume $vS$ of $S$ is squeezed between the volumes of the inscribed and
circumscribed solids:

$$vS_n' \le vS \le vS_n$$

for every $n$, and so

$$vS = \int_1^2 \frac{\pi}{x^2}\,dx = \frac{\pi}{2}. \tag{3}$$
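
The squeeze between the inscribed and circumscribed solids can be watched numerically. The sketch below is an illustration only; the function name `disk_sums` is an assumption, and the code simply evaluates the two sums written above for $f(x) = 1/x$ on $[1, 2]$ with $\Delta x = 1/n$.

```python
import math

def disk_sums(n):
    """Volumes (inscribed, circumscribed) for f(x) = 1/x on [1, 2], n equal parts."""
    dx = 1.0 / n
    xs = [1.0 + i * dx for i in range(n + 1)]
    # Circumscribed cylinders use the radius 1/x_{i-1} (the maximum of f on the interval).
    outer = sum(math.pi * (1.0 / xs[i - 1]) ** 2 * dx for i in range(1, n + 1))
    # Inscribed cylinders use the radius 1/x_i (the minimum of f on the interval).
    inner = sum(math.pi * (1.0 / xs[i]) ** 2 * dx for i in range(1, n + 1))
    return inner, outer

for n in (10, 100, 1000):
    inner, outer = disk_sums(n)
    print(n, inner, outer)
print("pi/2 =", math.pi / 2)
```

Both columns approach $\pi/2$, one from below and one from above, in agreement with (1), (2), and (3).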

We shall now review this process and state the assumptions on which it is based.
Not all solids are measurable, in the sense that they have volumes; but the solids that
you are likely to encounter soon are measurable, and their volumes are governed by
the following laws.
By an elementary solid we mean a right parallelepiped, cylinder, or cylindrical
shell, as at the beginning of this section. We have been assuming that:

V.1. Elementary solids are measurable, and their volumes are given by the formulas
$v = abc$, $v = \pi r^2 h$, $v = \pi h(r^2 - s^2)$.
Two solids are nonoverlapping if they have no solid in common. (They may have
surfaces in common.)

V.2. If $s_1, s_2, \ldots, s_n$ are nonoverlapping elementary solids, and $S_n$ is their union,
then $S_n$ is measurable, and

$$vS_n = \sum_{i=1}^{n} vs_i.$$

V.3. If $S$ and $S'$ are measurable, and $S'$ lies in $S$, then $vS' \le vS$.

V.4 (The squeeze principle). If (a) $S_1, S_2, \ldots$ are measurable solids containing $S$,
(b) $S_1', S_2', \ldots$ are measurable solids lying in $S$, and (c)

$$\lim_{n \to \infty} vS_n = L = \lim_{n \to \infty} vS_n',$$

then $S$ is measurable, and

$$vS = L.$$

Using V.1 through V.4, we can show that the method of disks, which we have
used for the function

$$f(x) = 1/x,$$

works for every function $f$ which is $\ge 0$ and continuous.

[Figure: the graph of $f$, with $m_i$ and $M_i$ marked over the $i$th interval $[x_{i-1}, x_i]$.]

Given such a function $f$, on a closed interval $[a, b]$, let $R$ be the region under the graph of $f$, and let $S$ be the solid of revolution
of $R$, about the $x$-axis. Take a net $N$ over $[a, b]$, with equal spacing. As usual, let $m_i$
and $M_i$ be the minimum and maximum values of $f$ on the $i$th interval $[x_{i-1}, x_i]$.

If we rotate the inscribed rectangles about the $x$-axis, we get an inscribed solid $S_n'$,
of volume

$$vS_n' = \sum_{i=1}^{n} \pi m_i^2\,\Delta x.$$

If we rotate the circumscribed rectangles about the $x$-axis, we get a circumscribed
solid $S_n$, of volume

$$vS_n = \sum_{i=1}^{n} \pi M_i^2\,\Delta x.$$

V.3 says that

$$vS_n' \le vS_n$$

for every $n$. But $vS_n$ is an upper sum of the function

$$g(x) = \pi f(x)^2,$$

and $vS_n'$ is a lower sum of the same $g$. As $n \to \infty$, $|N| \to 0$;

$$vS_n \to \int_a^b \pi f(x)^2\,dx, \qquad vS_n' \to \int_a^b \pi f(x)^2\,dx.$$

By the squeeze principle it follows that $S$ is measurable, and

$$vS = \int_a^b \pi f(x)^2\,dx.$$

We also use this formula sidewise.

[Figure: the region $R$ between the $y$-axis and the curve $y = \sqrt{x}$, for $0 \le y \le 1$.]

Suppose that the region $R$ on the left is rotated about the $y$-axis. Sidewise, $R$ can be
regarded as the region under the graph of a function

$$x = f(y) = y^2, \qquad 0 \le y \le 1.$$

Therefore the volume is

$$\int_0^1 \pi [f(y)]^2\,dy = \int_0^1 \pi y^4\,dy = \frac{\pi}{5}.$$

PROBLEM SET 7.3

1. Obviously a right circular cone can be regarded as the solid of revolution of a right
triangle about one of its legs. If we place the triangle in the $xy$-plane as shown in the
figure, then the hypotenuse becomes the graph of a function $f$. Calculate $f$, and find the
volume of the cone by the methods of this section.

[Figure: a right triangle placed in the $xy$-plane.]

2. Similarly, a round ball of radius $r$ can be regarded as the solid of revolution of a
semicircular region about its diameter. Find the volume, by the methods of this section.

3. The region under the graph of $f(x) = \sqrt{x}$ $(0 \le x \le 1)$ is rotated about the $x$-axis.
Find the volume of the resulting solid.

4. Same, for $f(x) = \sin x$, $0 \le x \le \pi$.

5. Same, for $f(x) = x^{3/2}$, $0 \le x \le 1$.

6. Same, for $f(x) = \cos x$, $-\pi/2 \le x \le \pi/2$.

7. Let $R = \{(x, y) \mid 0 \le x \le 1,\ x^2 \le y \le 1\}$ be rotated about the $y$-axis. Find the
volume.

8. Same, for $R = \{(x, y) \mid 0 \le x \le 1,\ \operatorname{Sin}^{-1} x \le y \le \pi/2\}$.

9. Same, for $R = \{(x, y) \mid 0 \le x \le \pi/2,\ \sin x \le y \le 1\}$.

10. Same, for $\{(x, y) \mid 0 \le x \le 1,\ x^3 \le y \le 1\}$.

11. Same, for $\{(x, y) \mid 0 \le x \le \sqrt{2}/2,\ x \le y \le \sqrt{1 - x^2}\}$.

12. Find out whether the following is true:

Theorem (?). Let $T$ and $T'$ be triangles each of which has a side on the $x$-axis. If $T$
and $T'$ have the same area, then when they are rotated about the $x$-axis, they give solids
with the same volume.

13. a) For each $x$ from 0 to 1, let $T_x$ be the triangle whose vertices are $(0, 0)$, $(1, 0)$, and
$(x, 1)$. What value or values of $x$ give maximum volume, when $T_x$ is rotated about
the $x$-axis?
b) Suppose that the triangles $T_x$ are rotated about the $y$-axis (instead of the $x$-axis).
Which value or values of $x$ give maximum volume?

14. For each $k$ from 0 to 1, let $T_k$ be the triangle whose vertices are $(0, 0)$, $(k, 0)$, and
$(0, \sqrt{1 - k^2})$. (Thus the hypotenuse of $T_k$ has length 1.) $T_k$ is rotated about the $x$-axis.
What value of $k$ gives maximum volume? What is the maximum volume?

15. a) Given $f(x) = 1/x$. Let $R$ be the region under the graph of $f$, from 1 to $\infty$. Give a
reasonable definition of the area of $R$. Is this area finite?
b) The region $R$ is rotated about the $x$-axis, giving a solid $S$. Give a reasonable definition
of the volume of $S$. Is this volume finite?

16. a) The region $R$ under the graph of $f(x) = 1/x^2$ from 1 to $\infty$ is rotated about the $x$-axis
giving a solid $S$. Does $S$ have finite volume?
b) If the region $R$ is rotated about the $y$-axis, do we obtain a solid with finite volume?

7.4 THE GENERAL METHOD OF CROSS


SECTIONS, AND THE METHOD OF SHELLS

The method of disks can be generalized in the following way. Given a solid $S$ in
space. Suppose that we can calculate the areas of the cross sections perpendicular to
the $x$-axis.

[Figure: the solid $S$ with a cross section perpendicular to the $x$-axis; at the right, the $i$th approximating cylinder.]

For each $x$ from $a$ to $b$ we let $A(x)$ be the area of the cross section. This gives a
function $A$ which expresses the cross-sectional area in terms of $x$. (In our previous
examples, the cross sections were all circular.) As before, we divide the interval
$[a, b]$ into $n$ equal parts, and we approximate the volume by cylinders. In the figure
at the right, we show only the $i$th cylinder.

We then have

$$v_n = \sum_{i=1}^{n} A(x_i)\,\Delta x,$$

and the sum on the right-hand side is a sample sum of the function $A$. Therefore,
as the mesh goes to 0,

$$v_n \to \int_a^b A(x)\,dx.$$

It is plausible to suppose that as the mesh approaches 0, $v_n$ approaches the
volume of $S$; and in fact this is true, for measurable solids, although we are not in a
position to prove it. Thus

$$vS = \int_a^b A(x)\,dx.$$

By this method we can calculate volumes.


For example, take the parabola

$$y = x^2, \quad \text{for } 0 \le x \le 1.$$

For each $y$ from 0 to 1, we take the horizontal segment
from $(0, y)$ to the point $(x, y)$ of the parabola; and using this segment as an edge,
we construct a horizontal square. Thus we get a solid, as shown in the figure.

[Figure: the solid built from horizontal squares on the parabola.]

The cross-sectional areas perpendicular to the $y$-axis are given by the formula

$$A(y) = x^2 = y.$$

Therefore the volume is

$$V = \int_0^1 A(y)\,dy = \int_0^1 y\,dy = \left[ \tfrac{1}{2} y^2 \right]_0^1 = \tfrac{1}{2}.$$
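
The cross-section formula can be tried directly as a sample sum. The following sketch is an illustration only; the helper name `cross_section_volume` and the use of right endpoints as sample points are assumptions made for the example.

```python
def cross_section_volume(area, a, b, n):
    """Sample sum of the cross-sectional area function over [a, b], n equal parts."""
    step = (b - a) / n
    # The right endpoint of each subinterval is used as the sample point.
    return sum(area(a + i * step) * step for i in range(1, n + 1))

def square_area(y):
    return y          # A(y) = x^2 = y for the solid built on the parabola y = x^2

for n in (10, 100, 1000):
    print(n, cross_section_volume(square_area, 0.0, 1.0, n))
```

The printed values approach $1/2$ as the mesh goes to 0, which agrees with the integral computed above.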

The general method of cross sections applies, in a sense, to every volume problem.
That is, it is always true that

$$vS = \int_a^b A(x)\,dx.$$

But often this formula leads to difficult calculations.

[Figure: the front half of the solid of revolution described below.]

Consider, for example, the region $R$ under the graph of

$$f(x) = \cos x, \qquad 0 \le x \le \pi/2.$$

We rotate $R$ about the $y$-axis, getting a solid of revolution, of which only the front
half is shown in the figure. We can find the volume by the cross-section method. We
have

$$A(y) = \pi x^2 = \pi (\operatorname{Cos}^{-1} y)^2.$$

Therefore

$$V = \int_0^1 A(y)\,dy = \int_0^1 \pi (\operatorname{Cos}^{-1} y)^2\,dy.$$

We can calculate this by integrating by parts twice, but there is a better way.
Instead of approximating the solid by thin cylinders, we approximate it by thin
cylindrical shells.
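[Figure: at the left, the approximating rectangles; at the right, the corresponding cylindrical shells.]

First we approximate the region $R$ by rectangles with equal bases $\Delta x = \pi/2n$, as
shown on the left. Then we rotate each of these rectangles about the $y$-axis, getting
cylindrical shells, as shown on the right. The altitude of the $i$th shell is

$$f(x_i) = \cos x_i;$$

the outer radius is $x_i$; and the inner radius is $x_{i-1} = x_i - \Delta x$. Therefore the volume
of the $i$th shell is

$$\pi x_i^2 \cos x_i - \pi (x_i - \Delta x)^2 \cos x_i = \pi (x_i^2 - x_i^2 + 2x_i\,\Delta x - \Delta x^2) \cos x_i = 2\pi x_i \cos x_i\,\Delta x - \pi (\cos x_i)(\Delta x)^2.$$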

Therefore the total volume of the inscribed solid $S_n$ is

$$vS_n = \sum_{i=1}^{n} 2\pi x_i \cos x_i\,\Delta x - \pi\,\Delta x \sum_{i=1}^{n} \cos x_i\,\Delta x.$$

We need to find out what happens as the mesh goes to 0. The first sum is a sample
sum of the function $2\pi x \cos x$. Therefore

$$\sum_{i=1}^{n} 2\pi x_i \cos x_i\,\Delta x \to \int_0^{\pi/2} 2\pi x \cos x\,dx.$$

For the same reason,

$$\sum_{i=1}^{n} \cos x_i\,\Delta x \to \int_0^{\pi/2} \cos x\,dx.$$

Therefore

$$\pi\,\Delta x \sum_{i=1}^{n} \cos x_i\,\Delta x \to \pi \cdot 0 \cdot \int_0^{\pi/2} \cos x\,dx = 0.$$

Thus the entire second sum, in the expression for $vS_n$, drops out when we pass to the
limit. Therefore

$$vS_n \to \int_0^{\pi/2} 2\pi x \cos x\,dx,$$

and

$$V = \int_0^{\pi/2} 2\pi x \cos x\,dx = 2\pi \left[ x \sin x + \cos x \right]_0^{\pi/2} = 2\pi \left[ \frac{\pi}{2} + 0 \right] - 2\pi [1] = \pi^2 - 2\pi.$$
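
The way the second sum drops out can be seen numerically. The sketch below is an illustration only; the name `shell_volume` is an assumption, and the code adds up the exact shell volumes $\pi x_i^2 h(x_i) - \pi x_{i-1}^2 h(x_i)$ for $h(x) = \cos x$ on $[0, \pi/2]$.

```python
import math

def shell_volume(h, a, b, n):
    """Total volume of n cylindrical shells of altitude h(x_i) about the y-axis."""
    dx = (b - a) / n
    total = 0.0
    for i in range(1, n + 1):
        outer = a + i * dx              # outer radius x_i
        inner = outer - dx              # inner radius x_{i-1}
        total += math.pi * (outer ** 2 - inner ** 2) * h(outer)
    return total

exact = math.pi ** 2 - 2 * math.pi      # value of the integral of 2*pi*x*cos x
for n in (10, 100, 1000):
    approx = shell_volume(math.cos, 0.0, math.pi / 2, n)
    print(n, approx, exact - approx)
```

The difference in the last column shrinks roughly in proportion to $\Delta x$, which is the size of the term $\pi\,\Delta x \sum \cos x_i\,\Delta x$ that vanishes in the limit.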

The same method applies if we rotate a region lying to the right of the $y$-axis.

[Figure: a region lying to the right of the $y$-axis.]

If the width of the region is given by a function $h(x)$, then the volume of the $i$th
cylindrical shell is

$$v_i = \pi x_i^2 h(x_i) - \pi (x_i - \Delta x)^2 h(x_i) = \pi h(x_i)(x_i^2 - x_i^2 + 2x_i\,\Delta x - \Delta x^2) = 2\pi x_i h(x_i)\,\Delta x - \pi h(x_i)\,\Delta x^2.$$

Therefore

$$v_n = \sum_{i=1}^{n} v_i = 2\pi \sum_{i=1}^{n} x_i h(x_i)\,\Delta x - \pi\,\Delta x \sum_{i=1}^{n} h(x_i)\,\Delta x.$$

Therefore

$$V = 2\pi \int_a^b x h(x)\,dx.$$

Here again the second part of the sum drops out when we pass to the limit.
Thus the sums behave as if $v_i$ were given by the formula

$$v_i = 2\pi x_i h(x_i)\,\Delta x.$$

There is a simple reason why $v_i$ is well approximated by $2\pi x_i h(x_i)\,\Delta x$.

[Figure: the $i$th shell, cut and flattened into a rectangular prism of length $2\pi x_i$, altitude $h(x_i)$, and thickness $\Delta x$.]

If we make a vertical cut in the $i$th cylindrical shell, and flatten it out, we get a
rectangular prism. The length of the prism is the circumference of the outer circle in the
base of the shell. This is $2\pi x_i$. The altitude and the thickness of the prism are the
same as the altitude and the thickness of the shell; these are $h(x_i)$ and $\Delta x$. Therefore
the volume of the prism is exactly $2\pi x_i h(x_i)\,\Delta x$; and this ought to be a good
approximation to $v_i$ when $\Delta x$ is small, because when the shell is thin, we can flatten it out
without distorting it very much. As we have seen, the error goes to zero as the mesh
goes to zero.
The method of shells applies to the problem that we were discussing above.
We know that the volume is

$$V = \int_0^{\pi/2} 2\pi x \cos x\,dx.$$

Integrating by parts, we get

$$\int x \cos x\,dx = \{x \sin x + \cos x + C\}.$$

Therefore

$$V = \left[ 2\pi (x \sin x + \cos x) \right]_0^{\pi/2} = \pi^2 - 2\pi.$$

The same method applies more generally. Consider the region

$$R = \{(x, y) \mid 2 \le x \le 3,\ 0 \le y \le -x^2 + 5x - 6\}.$$

[Figure: the region $R$ and the line $x = -1$.]

If $R$ is rotated about the line $x = -1$, then by the shell method the volume of the
resulting solid is

$$\int_2^3 2\pi (x + 1)(-x^2 + 5x - 6)\,dx = \int_2^3 2\pi (-x^3 + 4x^2 - x - 6)\,dx = 2\pi \left[ -\frac{x^4}{4} + \frac{4x^3}{3} - \frac{x^2}{2} - 6x \right]_2^3 = \frac{7\pi}{6}.$$

If the same region is rotated about the line $x = 6$, the volume is

$$\int_2^3 2\pi (6 - x)(-x^2 + 5x - 6)\,dx,$$

which is also equal to $7\pi/6$. (Why?)
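
Both shell integrals can be checked numerically. The following sketch is an illustration only: the helper `shell_integral` and its use of the midpoint rule are assumptions made for the example, not a method taken from the text.

```python
import math

def shell_integral(radius, height, a, b, n=10000):
    """Midpoint-rule approximation of the integral of 2*pi*radius(x)*height(x) on [a, b]."""
    dx = (b - a) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * dx          # midpoint of the ith subinterval
        total += 2 * math.pi * radius(x) * height(x) * dx
    return total

def height(x):
    return -x * x + 5 * x - 6           # upper boundary of the region R

v_left = shell_integral(lambda x: x + 1, height, 2.0, 3.0)    # axis x = -1
v_right = shell_integral(lambda x: 6 - x, height, 2.0, 3.0)   # axis x = 6
print(v_left, v_right, 7 * math.pi / 6)
```

The radius function is the distance from $x$ to the axis of rotation, which is $x + 1$ for the line $x = -1$ and $6 - x$ for the line $x = 6$.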

In some cases there is little to choose between the cross-section method and the
shell method. For example, suppose we take the region below the graph of $y = x^2$,
$0 \le x \le 1$, and rotate it about the $y$-axis.

[Figure: the region below the graph of $y = x^2$, rotated about the $y$-axis.]

By the shell method,

$$V = \int_0^1 2\pi x \cdot x^2\,dx = \frac{\pi}{2}.$$

The horizontal cross section at height $y$ is the region between a circle of radius 1
and a circle of radius $x = \sqrt{y}$. Therefore

$$V = \int_0^1 A(y)\,dy = \int_0^1 [\pi \cdot 1^2 - \pi y]\,dy = \pi - \pi \int_0^1 y\,dy = \frac{\pi}{2},$$

as before.
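
The agreement of the two methods can be confirmed with a short computation. This is a sketch under stated assumptions: the helper `midpoint_integral` is hypothetical, and the midpoint rule is used only as a convenient way to approximate the two integrals.

```python
import math

def midpoint_integral(g, a, b, n=10000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    dx = (b - a) / n
    return sum(g(a + (i + 0.5) * dx) * dx for i in range(n))

# Shell method: radius x, altitude x^2.
by_shells = midpoint_integral(lambda x: 2 * math.pi * x * x ** 2, 0.0, 1.0)
# Cross sections: a disk of radius 1 with a disk of radius sqrt(y) removed.
by_sections = midpoint_integral(lambda y: math.pi * 1 ** 2 - math.pi * y, 0.0, 1.0)
print(by_shells, by_sections, math.pi / 2)
```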

PROBLEM SET 7.4

1. Let $R$ be the circular region with center at $(5, 0)$ and radius 2. $R$ is rotated about the
$y$-axis. Find the volume of the resulting solid.

2. A solid of the sort described in Problem 1 is called a solid torus. More generally, suppose
we have given a circular region $R$ of radius $a$, and a line $L$ in the same plane, such that
the perpendicular distance from $L$ to the center of $R$ is $b$, with $b > a$. When $R$ is rotated
about the line $L$, the result is a solid torus. Find its volume, in terms of $a$ and $b$.

3. Let $R$ be the square region with center at $(4, 0)$ and sides of length 2, parallel to the
coordinate axes. $R$ is rotated about the $y$-axis. Find the volume of the resulting solid.

4. Let $T$ be the square region with center at $(4, 0)$ and sides of length 2, with diagonals
parallel to the coordinate axes. Find the volume of the solid which results when $T$ is
rotated about the $y$-axis.

5. a) The region under the graph of

$$y = \ln x, \qquad 1 \le x \le e,$$

is rotated about the $x$-axis. Find the volume, by the method of disks.
b) Now solve this problem by the method of shells.

6. For each $x$ from 0 to 1, let $R_x$ be the circular region perpendicular to the $xy$-plane, with
center at the point $(x, x^2)$ and radius 1. Let $S$ be the solid formed by the regions $R_x$.
Find the volume of $S$.

7. a) The region described in Problem 5a is rotated about the $y$-axis. Find the volume, by
the shell method.
b) Now solve Problem 7a by the method of cross sections.

8. a) The region under the graph of $y = e^x$, $0 \le x \le 1$, is rotated about the $y$-axis. Find
the volume by the method of shells.
b) Now solve Problem 8a by the cross-section method.

9. Let $C$ be the cylinder with the $y$-axis as its axis of symmetry, and radius 1. Let $S$ be the
sphere with center at the origin and radius 2. Find the volume of the solid which lies
inside the sphere and outside the cylinder.

10. Let $C_x$ be the cylinder of radius 1, with the $x$-axis as its axis of symmetry; and let $C_y$
be the cylinder of radius 1 with the $y$-axis as its axis of symmetry. Find the volume of the
solid which lies in both $C_x$ and $C_y$.

11. Let $S$ be the sphere of radius $\sqrt{2}$ with center at the origin. Let $C$ be the cone with vertex
at the origin, axis along the $y$-axis, and passing through the point $(1, 1)$. Find the
volume of the solid which lies inside the sphere and inside the cone.

7.5

THE AREA OF A SURFACE OF REVOLUTION

Given a line and a curve, lying in the same plane and lying on one side of the given
line. If the curve is rotated about the line, the resulting surface is called a surface of
revolution. The area of such a surface can be expressed as an integral. We begin with
the simplest case, in which a function-graph is rotated about the $y$-axis. Here the
function $f$ is defined on a closed interval $[a, b]$ on the positive half of the $x$-axis. We
assume that $f$ has a continuous derivative.

[Figure: the graph of $f$ over $[a, b]$, rotated about the $y$-axis.]

To calculate the area of the surface of revolution, we need the formula for the
lateral surface of a right circular cone. Let $s$ be the slant height of the cone, and let
$r$ be the radius of the base, so that the circumference of the base is $2\pi r$. We assert
that the lateral surface is the same as the area of a circular sector of radius $s$, with
boundary arc of length $2\pi r$. The reason is that we can make a straight cut in the cone,
starting at the vertex, and then flatten out the surface, without changing its area, so
that the resulting surface lies in a plane. The plane surface thus obtained is the sector
shown below.

[Figure: a circular sector of radius $s$, with boundary arc of length $2\pi r$.]

But the area of a circular sector is half the product of its radius and the length of its
boundary arc. Therefore, for cones, we have

$$A = \pi r s.$$

(Note that for a "cone of altitude 0," that is, a disk, this formula gives the right
answer $\pi r s = \pi r^2$.) From this we can get a formula for the lateral area of a frustum
of a cone. If the larger cone (with slant height $s_2$) has area $A_2$, and the smaller cone
has area $A_1$, then the area of the frustum is

$$\Delta A = A_2 - A_1 = \pi r_2 s_2 - \pi r_1 s_1.$$

Evidently

$$\frac{s_1}{r_1} = \frac{s_2}{r_2}.$$

If $s_1 = k r_1$, $s_2 = k r_2$, then

$$\Delta A = \pi k r_2^2 - \pi k r_1^2 = \pi k (r_2 - r_1)(r_2 + r_1) = \pi (s_2 - s_1)(r_2 + r_1) = 2\pi\,\frac{r_1 + r_2}{2}\,\Delta s,$$

and so

$$\Delta A = 2\pi \bar{r}\,\Delta s,$$

where

$$\bar{r} = \tfrac{1}{2}(r_1 + r_2).$$

That is, the area of the frustum is equal to its "average circumference" $2\pi \bar{r}$ times its
slant height $\Delta s$.
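
The two expressions for the area of a frustum can be compared numerically. The sketch below is an illustration only; the function names and the sample values $r_1 = 1$, $r_2 = 3$, $k = 2.5$ are assumptions made for the example.

```python
import math

def frustum_area_by_subtraction(r1, r2, k):
    """Area of the larger cone minus the smaller one, with slant heights s = k*r."""
    return math.pi * r2 * (k * r2) - math.pi * r1 * (k * r1)

def frustum_area_by_average(r1, r2, k):
    """Average circumference 2*pi*r_bar times the slant height of the frustum."""
    r_bar = 0.5 * (r1 + r2)
    delta_s = k * r2 - k * r1
    return 2 * math.pi * r_bar * delta_s

print(frustum_area_by_subtraction(1.0, 3.0, 2.5))
print(frustum_area_by_average(1.0, 3.0, 2.5))
```

The two printed values agree up to rounding, as the algebra above predicts.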

Consider now the surface of revolution obtained by rotating the graph of $f$
about the $y$-axis. We take a net

$$N\colon x_0, x_1, \ldots, x_{i-1}, x_i, \ldots, x_n,$$

over the interval $[a, b]$, with equal subdivisions, so that

$$x_i - x_{i-1} = \Delta x = \frac{b - a}{n} = |N|$$

for each $i$. For each $i$, let $P_i$ be the point $(x_i, f(x_i))$. These points determine a broken
line $B_n$ which is an approximation of the graph of $f$. When $B_n$ is rotated about the
$y$-axis, we get a surface $S_n$, with area $A_n$. By definition, the area of the surface of
revolution of $f$ is

$$A = \lim_{|N| \to 0} A_n,$$

if the limit exists. (This is like the definition of arc length.)

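We shall now calculate $A_n$, and find its limit as $|N| \to 0$. Consider the $i$th segment,
from $P_{i-1}$ to $P_i$. When this segment is rotated, it gives a frustum whose area is

$$a_i = 2\pi \bar{x}_i \cdot P_{i-1}P_i.$$

[Figure: the $i$th segment, from $P_{i-1}$ to $P_i$, with the points $x_{i-1}$, $\bar{x}_i$, $x_i'$, and $x_i$ marked on the $x$-axis.]

As in the calculation of arc length,

$$P_{i-1}P_i = \sqrt{\Delta x^2 + [f(x_i) - f(x_{i-1})]^2} = \sqrt{1 + \left( \frac{f(x_i) - f(x_{i-1})}{x_i - x_{i-1}} \right)^2}\,\Delta x = \sqrt{1 + f'(x_i')^2}\,\Delta x,$$

where $x_{i-1} < x_i' < x_i$, as shown in the last figure.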

We now have a formula for the area $A_n$ of the approximating surface:

$$A_n = \sum_{i=1}^{n} a_i = \sum_{i=1}^{n} 2\pi \bar{x}_i\,P_{i-1}P_i = \sum_{i=1}^{n} 2\pi \bar{x}_i \sqrt{1 + f'(x_i')^2}\,\Delta x.$$

Here $\bar{x}_i$ is the midpoint of $[x_{i-1}, x_i]$, and $x_i'$ is somewhere on the same interval.
If it were true that $x_i' = \bar{x}_i$ for each $i$, then $A_n$ would be a sample sum of the function

$$g(x) = 2\pi x \sqrt{1 + f'(x)^2},$$

and we would have no problem, because for

$$\sum_{i=1}^{n} a_i' = \sum_{i=1}^{n} 2\pi x_i' \sqrt{1 + f'(x_i')^2}\,\Delta x,$$

we know that

$$\lim_{n \to \infty} \sum_{i=1}^{n} a_i' = \int_a^b g(x)\,dx = \int_a^b 2\pi x \sqrt{1 + f'(x)^2}\,dx.$$

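What we need to show, therefore, is that

$$\lim_{|N| \to 0} \left| \sum_{i=1}^{n} a_i - \sum_{i=1}^{n} a_i' \right| = 0.$$

It will then follow that

$$\lim_{|N| \to 0} \sum_{i=1}^{n} a_i = \lim_{|N| \to 0} \sum_{i=1}^{n} a_i' = \int_a^b 2\pi x \sqrt{1 + f'(x)^2}\,dx.$$

Now

$$|\bar{x}_i - x_i'| < \frac{\Delta x}{2},$$

because $x_i'$ lies on the interval $[x_{i-1}, x_i]$ whose midpoint is $\bar{x}_i$. Therefore

$$|a_i - a_i'| = \left| 2\pi \bar{x}_i \sqrt{1 + f'(x_i')^2}\,\Delta x - 2\pi x_i' \sqrt{1 + f'(x_i')^2}\,\Delta x \right| = 2\pi \sqrt{1 + f'(x_i')^2}\,\Delta x\,|\bar{x}_i - x_i'| < \pi \sqrt{1 + f'(x_i')^2}\,\Delta x^2.$$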

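Therefore

$$\left| \sum_{i=1}^{n} a_i - \sum_{i=1}^{n} a_i' \right| \le \sum_{i=1}^{n} |a_i - a_i'| < \sum_{i=1}^{n} \pi \sqrt{1 + f'(x_i')^2}\,\Delta x^2 = \Delta x \sum_{i=1}^{n} \pi \sqrt{1 + f'(x_i')^2}\,\Delta x.$$

Now

$$\lim_{|N| \to 0} \Delta x \sum_{i=1}^{n} \pi \sqrt{1 + f'(x_i')^2}\,\Delta x = 0 \cdot \int_a^b \pi \sqrt{1 + f'(x)^2}\,dx = 0.$$

Therefore $\left| \sum a_i - \sum a_i' \right|$ is squeezed to 0, which was to be proved.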

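Let us try this formula on some problems to which we already know the answers.

[Figure: a segment from $(0, b)$ to $(a, 0)$, and the cone obtained by rotating it about the $y$-axis.]

A cone is the surface of revolution of a segment. Here

$$f(x) = b - (b/a)x, \qquad f'(x) = -b/a,$$

and

$$A = \int_0^a 2\pi x \sqrt{1 + \frac{b^2}{a^2}}\,dx = \frac{2\pi}{a} \sqrt{a^2 + b^2} \int_0^a x\,dx = \pi a \sqrt{a^2 + b^2},$$

which is the right answer.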
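
The definition of the area as a limit of the areas $A_n$ can be tried out directly. The following sketch is an illustration only: the function name `area_about_y_axis` and the sample values $a = 3$, $b = 4$ are assumptions. It adds up the frustum areas $2\pi \bar{x}_i \cdot P_{i-1}P_i$ for the broken line $B_n$.

```python
import math

def area_about_y_axis(f, a, b, n):
    """Area A_n of the surface obtained by rotating the broken line B_n about the y-axis."""
    dx = (b - a) / n
    total = 0.0
    for i in range(1, n + 1):
        x0, x1 = a + (i - 1) * dx, a + i * dx
        chord = math.hypot(x1 - x0, f(x1) - f(x0))     # length of the segment P_{i-1}P_i
        total += 2 * math.pi * 0.5 * (x0 + x1) * chord  # average circumference times chord
    return total

# Cone from the segment f(x) = b - (b/a)x with a = 3, b = 4; the slant height is 5.
cone = area_about_y_axis(lambda x: 4.0 - (4.0 / 3.0) * x, 0.0, 3.0, 50)
print(cone, math.pi * 3.0 * 5.0)

# A curved graph, f(x) = x^2/2 on [0, 1], compared with the integral formula.
curved = area_about_y_axis(lambda x: 0.5 * x * x, 0.0, 1.0, 1000)
print(curved, (2 * math.pi / 3) * (2 * math.sqrt(2) - 1))
```

For the cone the broken line coincides with the segment, so $A_n$ is already the exact area for every $n$. For the curved graph the comparison value comes from $\int_0^1 2\pi x \sqrt{1 + x^2}\,dx = \tfrac{2\pi}{3}(2\sqrt{2} - 1)$.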

Consider next a quadrant of a circle of radius $a$, rotated
about the $y$-axis. In this case,

$$f(x) = \sqrt{a^2 - x^2} \quad (0 \le x \le a), \qquad f'(x) = \frac{-x}{\sqrt{a^2 - x^2}},$$

$$1 + f'(x)^2 = 1 + \frac{x^2}{a^2 - x^2} = \frac{a^2}{a^2 - x^2}.$$

Therefore the area is

$$A = \int_0^a 2\pi x\,\frac{a}{\sqrt{a^2 - x^2}}\,dx = 2\pi a \left[ -\sqrt{a^2 - x^2} \right]_0^a = 2\pi a^2.$$

It follows that the total area of a sphere of radius $a$ is $4\pi a^2$. This is the standard
formula.
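It is harder to find the area when we rotate a function-graph about the $x$-axis
instead of the $y$-axis.

[Figure: the graph of $f$ over $[a, b]$, with the $i$th frustum obtained by rotation about the $x$-axis.]

Given a nonnegative function $f$, on an interval $[a, b]$. As before, take a net

$$N\colon x_0, x_1, \ldots, x_n$$

over $[a, b]$, with equal subdivisions, so that for each $i$,

$$x_i - x_{i-1} = \Delta x = \frac{b - a}{n} = |N|.$$

As before, we approximate the graph by a broken line $B_n$. Then we rotate $B_n$ about
the $x$-axis, getting a surface $S_n$, with area $A_n$. We define the area of the surface of
revolution to be $\lim_{|N| \to 0} A_n$, if such a limit exists. We proceed to calculate:

$$A_n = \sum_{i=1}^{n} a_i,$$

where $a_i$ is the area of the $i$th frustum, shown in the figure. As before,

$$P_{i-1}P_i = \sqrt{1 + f'(x_i')^2}\,\Delta x,$$

where $x_{i-1} < x_i' < x_i$.
But when we rotate the chord from $P_{i-1}$ to $P_i$ about the $x$-axis, the "average
circumference" is

$$2\pi \bar{y}_i, \qquad \text{where} \quad \bar{y}_i = \tfrac{1}{2}[f(x_{i-1}) + f(x_i)].$$

Obviously $\bar{y}_i$ is between $f(x_{i-1})$ and $f(x_i)$, because $\bar{y}_i$ is their average. By the no-jump
theorem of Section 5.7,

$$\bar{y}_i = f(\bar{x}_i) \quad \text{for some } \bar{x}_i \text{ between } x_{i-1} \text{ and } x_i.$$

Therefore

$$A_n = \sum_{i=1}^{n} a_i = \sum_{i=1}^{n} 2\pi f(\bar{x}_i) \sqrt{1 + f'(x_i')^2}\,\Delta x.$$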

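If it were true that $\bar{x}_i = x_i'$ for each $i$, then the sum on the right-hand side would be a
sample sum of the function

$$g(x) = 2\pi f(x) \sqrt{1 + f'(x)^2}.$$

As it stands, it is very close to being a sample sum. The idea is that

$$|N| = \Delta x \to 0$$
$$\Rightarrow \quad \bar{x}_i - x_i' \to 0 \quad \text{for each } i$$
$$\Rightarrow \quad f(\bar{x}_i) - f(x_i') \to 0 \quad \text{for each } i$$
$$\Rightarrow \quad \sum_{i=1}^{n} 2\pi f(\bar{x}_i) \sqrt{1 + f'(x_i')^2}\,\Delta x - \sum_{i=1}^{n} 2\pi f(x_i') \sqrt{1 + f'(x_i')^2}\,\Delta x \to 0$$
$$\Rightarrow \quad A_n \to \int_a^b 2\pi f(x) \sqrt{1 + f'(x)^2}\,dx.$$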

At the end of the chapter, these ideas will be turned into a proof. Meanwhile let us
look at some applications of the formula

$$A = \int_a^b 2\pi f(x) \sqrt{1 + f'(x)^2}\,dx.$$

1) If $f(x) = k$, on $[a, b]$, then the surface of revolution is a cylinder. By the
integral formula,

$$A = \int_a^b 2\pi k \sqrt{1 + 0^2}\,dx = 2\pi k (b - a),$$

which is the right answer.

2) A sphere of radius $a$ is the surface of revolution of a semicircle of radius $a$.
Here

$$f(x) = \sqrt{a^2 - x^2}, \qquad x^2 + f^2 = a^2.$$

Hence

$$2x + 2ff' = 0, \qquad f' = -\frac{x}{f}, \qquad \text{and} \qquad f'^2 = \frac{x^2}{f^2}.$$

Therefore

$$1 + f'^2 = 1 + \frac{x^2}{f^2} = \frac{f^2 + x^2}{f^2} = \frac{a^2}{f^2},$$

and

$$A = \int_{-a}^{a} 2\pi f(x) \sqrt{\frac{a^2}{f(x)^2}}\,dx = \int_{-a}^{a} 2\pi a\,dx = 4\pi a^2.$$
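
Both applications can be checked against the definition of the area as a limit of frustum sums. The sketch below is an illustration only: the function name `area_about_x_axis` and the sample values are assumptions. Each frustum contributes its average circumference $2\pi \cdot \tfrac{1}{2}[f(x_{i-1}) + f(x_i)]$ times the length of the chord $P_{i-1}P_i$.

```python
import math

def area_about_x_axis(f, a, b, n):
    """Area A_n of the surface obtained by rotating the broken line B_n about the x-axis."""
    dx = (b - a) / n
    total = 0.0
    for i in range(1, n + 1):
        x0, x1 = a + (i - 1) * dx, a + i * dx
        y0, y1 = f(x0), f(x1)
        chord = math.hypot(x1 - x0, y1 - y0)           # length of P_{i-1}P_i
        total += 2 * math.pi * 0.5 * (y0 + y1) * chord  # average circumference times chord
    return total

def semicircle(x, a=1.0):
    # max(..., 0.0) guards against a tiny negative value caused by rounding at x = a.
    return math.sqrt(max(a * a - x * x, 0.0))

cylinder = area_about_x_axis(lambda x: 2.0, 1.0, 4.0, 10)   # k = 2 on [1, 4]
sphere = area_about_x_axis(semicircle, -1.0, 1.0, 2000)      # radius a = 1
print(cylinder, 2 * math.pi * 2.0 * 3.0)
print(sphere, 4 * math.pi)
```

The cylinder comes out exactly, since its broken line is the graph itself, while the sphere value approaches $4\pi a^2$ from below as $n$ increases.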

PROBLEM SET 7.5


1. Let $C_a$ be the circle with center at the origin and radius $a$, and let $A$ be the arc of $C_a$
lying above the interval $[-a/2, a/2]$. $A$ is rotated about the $x$-axis. Find the area of the
resulting surface. What proportion is this, of the total area of the sphere?

2. The entire circle $C_a$ is rotated about the $x$-axis, giving a sphere of radius $a$. $E_b$ and $E_c$
are two planes, perpendicular to the $x$-axis, at $x = b$ and $x = c$; and $S$ is the part of
the sphere that lies between them. Find the area of $S$, in terms of $a$, $b$, and $c$. The
form of your answer ought to suggest a somewhat surprising theorem which can be
stated without the use of formulas. What is the theorem?

3. The circle with center at $(b, 0)$ and radius $a$, $a < b$, is rotated about the $y$-axis. The
resulting surface is called a torus. Find its area.

4. The square with corners at the points $(a, 0)$, $(a + k, k)$, $(a + k, -k)$, and $(a + 2k, 0)$
is rotated about the $y$-axis. (Here $0 < k < a$.) Find the area of the resulting surface.

5. Find the volume of the solid obtained when the corresponding square region is rotated
about the $y$-axis.

6. The same square is rotated about the line $x = a + 2k$. Find the surface area.

7. The square region is rotated about the line $x = a + 2k$. Find the volume.

8. The square with center at $(a, 0)$ and sides of length $2k$ parallel to the coordinate axes is
rotated about the line $x = 2a$. Find the area of the resulting surface. (Here $0 < k < a$.)

9. Find the volume when the corresponding square region is rotated about the line $y = k/2$.

10. Consider the curve consisting of (a) the segment from $(0, 0)$ to $(a, 0)$, (b) the segment
from $(0, 1)$ to $(a, 1)$, and (c) the semicircle, pointing outward, with endpoints at $(a, 0)$
and $(a, 1)$. This curve is to be rotated about the $y$-axis. For what value of $a$ is it true
that the total area of the resulting surface is equal to 15?

11. For each $a$, let $S_a$ be the area of the surface described in Problem 10, and let $V_a$ be the
volume of the solid that it encloses. What value of $a$ maximizes the ratio $V_a/S_a$?

12. The circle with center at $(b, b)$ and radius $a$ is rotated about the line

$$x + y = 1.$$

Here $a$ and $b$ are both positive, and the circle does not intersect the line. Find (a) the
area of the resulting surface, and (b) the volume of the solid that it encloses.

13. Same question, for the circle with center at $(2, 10)$ and radius 1 and the line $x + y = 2$.
(The only natural solutions of this, on the basis of the theory that we have so far, are
rather clumsy. This suggests that some new ideas are needed.)
14. The graph of
y = 2x2,
from x = -1 to x = 1, is rotated about the line x = 5. Find the area of the resulting
surface.
15. If the same surface is rotated about the line x = 4, would the area of the resulting
surface be greater, or would it be less, than the answer to Problem 14? Get a plausible
answer to this, and justify it as well as you can.
16. The graph of $y = \tfrac{1}{2}(e^x + e^{-x})$, $0 \le x \le 1$, is rotated about the $x$-axis. Find the area of
the resulting surface.

17. The same graph is rotated about the y-axis. Find the area of the resulting surface.
18. Let $G$ be the graph of $f(x) = \sin x$, from $x = 0$ to $x = \pi/2$. $G$ is first rotated about the
line $x + y = 4$, and then about the line $x + y = 5$. Which of the resulting surfaces
has the larger area? Why? (A right answer, with a plausibility argument, is acceptable
as an answer to this one. It is possible, however, to give a proof of the right answer,
without calculating the area of either of the surfaces. That is, you can prove an
inequality of the form $A < B$, without calculating either $A$ or $B$.)

7.6

MOMENTS AND CENTROIDS. THE THEOREMS OF PAPPUS

The ideas in this section are mathematical descriptions of physical ideas. Given a
finite set of "point masses" $m_i$, at the points $P_i = (x_i, y_i)$ in a coordinate plane, the
moment (of the system) about the $y$-axis is

$$M_y = \sum_{i=1}^{n} x_i m_i.$$

The left-hand figure below shows the general case.

[Figure: at the left, the general case, with point masses at $P_1, P_2, P_3, \ldots, P_n$; at the right, an example with two point masses, one of them labeled $m_2 = 1$.]

In the example on the right, we have:

$$M_y = x_1 m_1 + x_2 m_2 = 2 + (-2) = 0.$$

Physically speaking, this means that if the plane is horizontal, resting on a knife-edge
along the y-axis, it will balance. The formula .L mix; for M v makes it plain that the
effect of each point mass depends only on the product m;X;; if we divide m; by 2,
and double xi, then the moment Mv is unchanged.
Similarly, the moment about the x-axis, of our finite system of point masses, is defined to be

$$M_x = \sum_{i=1}^{n} y_i m_i.$$

The total mass of all the particles in the system is denoted by

$$m = \sum_{i=1}^{n} m_i.$$

The centroid of the system is defined to be the point $P = (\bar{x}, \bar{y})$ such that

$$M_y = \bar{x} m \quad \text{and} \quad M_x = \bar{y} m.$$

That is,

$$\bar{x} = \frac{M_y}{m}, \qquad \bar{y} = \frac{M_x}{m}.$$

Thus if we concentrate the entire mass of the system at $P$, the moments about the x-axis and y-axis are unchanged.
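For example, if we have $m_1 = 2$ at $P_1 = (1, 2)$ and $m_2 = 3$ at $P_2 = (2, 5)$, then

$$M_y = 2 + 6 = 8, \qquad M_x = 4 + 15 = 19, \qquad m = \sum m_i = 5,$$

$$8 = \bar{x} \cdot 5, \qquad 19 = \bar{y} \cdot 5, \qquad \bar{x} = \tfrac{8}{5}, \qquad \bar{y} = \tfrac{19}{5}.$$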
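The arithmetic in this example can be checked with a short computation. The sketch below is only an illustration: the function name `centroid` and the use of exact fractions are choices made here, not part of the text.

```python
from fractions import Fraction

def centroid(masses, points):
    """Return (x_bar, y_bar) for point masses m_i located at points (x_i, y_i)."""
    m = sum(masses)                                           # total mass
    m_y = sum(mi * x for mi, (x, _) in zip(masses, points))   # moment about the y-axis
    m_x = sum(mi * y for mi, (_, y) in zip(masses, points))   # moment about the x-axis
    return Fraction(m_y, m), Fraction(m_x, m)

# The example from the text: m1 = 2 at (1, 2), m2 = 3 at (2, 5).
x_bar, y_bar = centroid([2, 3], [(1, 2), (2, 5)])
print(x_bar, y_bar)  # 8/5 19/5
```

Exact fractions are used so that the result can be compared directly with the values found by hand.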

The above discussion does not prove that $M_y$, $M_x$, and $P = (\bar{x}, \bar{y})$ have any physical significance; only experiments can prove this. The fact, however, is that the physical conditions for equilibrium are described by moments and centroids.
Let us now consider how these ideas can be applied to a region $R$ in the xy-plane. We shall think of $R$ as a very thin sheet of homogeneous material, so that the mass per unit area is constant, say, = 1.
[Figure: the region $R$ over the interval $[a, b]$, with a narrow rectangle over $[x_{i-1}, x_i]$.]

Suppose that we take a net over the interval $[a, b]$, as in the figure; for each $x$, we let $h(x)$ be the height of the cross section of $R$ at $x$. We use equal subdivisions, so that

$$x_i - x_{i-1} = \Delta x_i = \Delta x \quad \text{for each } i.$$

Then $h(x_i)\,\Delta x$ is the area of the rectangle in the figure. The rectangle is narrow, and so its moment about the y-axis should be approximately $x_i h(x_i)\,\Delta x$.

If we approximate the region $R$ by a finite set of such narrow rectangles, then the moment of $R$ about the y-axis ought to be approximately

$$\sum_{i=1}^{n} x_i h(x_i)\,\Delta x,$$

and the approximation ought to get better as the mesh $\Delta x$ decreases. This is the idea of the following definition.

Definition. Let $R$ be the region lying between the graphs of two continuous functions $f_1$ and $f_2$, on an interval $[a, b]$, with $f_1 \le f_2$, and let

$$h(x) = f_2(x) - f_1(x).$$

Then the moment of $R$ about the y-axis is

$$M_y = \int_a^b x h(x)\,dx.$$

The definition of $M_x$ is similar. Here (see the right-hand figure)

$$R = \{(x, y) \mid c \le y \le d,\ g_1(y) \le x \le g_2(y)\}, \qquad w(y) = g_2(y) - g_1(y),$$

and by definition,

$$M_x = \int_c^d y w(y)\,dy.$$

Since the total mass of $R$ is its area

$$A = \int_a^b h(x)\,dx = \int_c^d w(y)\,dy,$$

it is natural to define the centroid of $R$ as the point $P = (\bar{x}, \bar{y})$ such that

$$M_y = \bar{x} A, \qquad M_x = \bar{y} A.$$

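For example, consider a quadrant of a circle of radius $a$:

$$y = \sqrt{a^2 - x^2}, \qquad 0 \le x \le a.$$

Here

$$M_y = \int_0^a x\sqrt{a^2 - x^2}\,dx = \left[-\tfrac{1}{3}(a^2 - x^2)^{3/2}\right]_0^a = \tfrac{1}{3}a^3.$$

Obviously $A = \tfrac{1}{4}\pi a^2$, and so $\tfrac{1}{3}a^3 = \bar{x} \cdot \tfrac{1}{4}\pi a^2$. Therefore

$$\bar{x} = \frac{4}{3\pi}a.$$

By symmetry, interchanging $x$ and $y$, we get:

$$\bar{y} = \bar{x} = \frac{4}{3\pi}a.$$

The moment about the line $x = x_0$ is defined to be

$$M_{x=x_0} = \int_a^b (x - x_0)h(x)\,dx,$$

and the moment about the line $y = y_0$ is

$$M_{y=y_0} = \int_c^d (y - y_0)w(y)\,dy.$$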

It is now easy to see that:

Theorem 1. $M_{x=\bar{x}} = 0 = M_{y=\bar{y}}$.

Proof. For any $x_0$,

$$M_{x=x_0} = \int_a^b (x - x_0)h(x)\,dx = \int_a^b xh(x)\,dx - x_0\int_a^b h(x)\,dx = M_y - x_0 A.$$

This is 0 for $x_0 = \bar{x}$. The proof of the other half of the theorem is the same. In fact, the equation

$$M_{x=x_0} = M_y - x_0 A$$

shows that the converse of Theorem 1 is also true.

Theorem 2. If $M_{x=x_0} = 0$, then $x_0 = \bar{x}$; and if $M_{y=y_0} = 0$, then $y_0 = \bar{y}$.
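The centroid of a quadrant of a circle, and the statement of Theorem 1 for that region, can be checked numerically. The following is a rough sketch only: the midpoint rule, the helper names, and the radius $a = 3$ are all assumptions made for this illustration.

```python
import math

def midpoint_integral(g, lo, hi, n=100_000):
    """Approximate the integral of g over [lo, hi] by the midpoint rule."""
    dx = (hi - lo) / n
    return sum(g(lo + (i + 0.5) * dx) for i in range(n)) * dx

def quadrant_centroid_x(a):
    """x-coordinate of the centroid of the region under y = sqrt(a^2 - x^2), 0 <= x <= a."""
    h = lambda x: math.sqrt(a * a - x * x)                 # height of the cross section
    m_y = midpoint_integral(lambda x: x * h(x), 0.0, a)    # moment about the y-axis
    area = midpoint_integral(h, 0.0, a)
    return m_y / area

def quadrant_moment_about_line(a, x0):
    """Moment of the same quadrant about the vertical line x = x0."""
    h = lambda x: math.sqrt(a * a - x * x)
    return midpoint_integral(lambda x: (x - x0) * h(x), 0.0, a)

a = 3.0
print(quadrant_centroid_x(a), 4 * a / (3 * math.pi))            # these should nearly agree
print(quadrant_moment_about_line(a, 4 * a / (3 * math.pi)))     # this should be close to 0
```

The second printed value illustrates Theorem 1: the moment about the vertical line through the centroid vanishes, up to the error of the numerical integration.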

Centroids are easy to find for regions which are symmetric, in the sense now to be defined. Two points $P$ and $P'$ are symmetric across the line $L$ if $L$ is the perpendicular bisector of the segment between them. In the left-hand figure below, we say that $P'$ is the point symmetrically across $L$ from $P$. A figure is symmetric about a line $L$ if for each point $P$ of the figure, $P'$ also lies in the figure. The figure may be either a region or a curve. For example, a circle is symmetric about any line through its center, and the interior of a circle has the same property.

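[Figure: a point $P$ and the point $P'$ symmetrically across the line $L$ from it.]

It is easy to see that if $R$ is symmetric about the y-axis, then $\bar{x} = 0$. In the figure on the left, $h(x)$ is an even function, with $h(-x) = h(x)$. Therefore $xh(x)$ is an odd function, with $(-x)h(-x) = -[xh(x)]$. Therefore

$$\int_{-a}^{a} xh(x)\,dx = 0, \qquad M_y = 0, \qquad \text{and} \qquad \bar{x} = 0.$$

[Figure: left, a region symmetric about the y-axis; right, a region over the interval from $x_0 - k$ to $x_0 + k$, symmetric about the line $x = x_0$, with the points $x_0 - t$ and $x_0 + t$ marked.]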

More generally, as in the right-hand figure, we have:

Theorem 3. If $R$ is symmetric about the line $x = x_0$, then $\bar{x} = x_0$.

Proof.

$$M_{x=x_0} = \int_{x_0-k}^{x_0+k} (x - x_0)h(x)\,dx = \int_{x_0-k}^{x_0+k} \phi(x)\,dx,$$

where $\phi(x) = (x - x_0)h(x)$. By symmetry, $h(x_0 - t) = h(x_0 + t)$ for every $t$; and so

$$\phi(x_0 - t) = [(x_0 - t) - x_0]h(x_0 - t) = -th(x_0 + t) = -\phi(x_0 + t).$$

Therefore the graph of $\phi$ must be like the graph shown below.

[Figure: the graph of $\phi$ on the interval from $x_0 - k$ to $x_0 + k$, odd about the point $x_0$.]

Therefore

$$\int_{x_0-k}^{x_0} \phi(x)\,dx = -\int_{x_0}^{x_0+k} \phi(x)\,dx,$$

and

$$M_{x=x_0} = \int_{x_0-k}^{x_0+k} \phi(x)\,dx = 0.$$

It follows that $x_0 = \bar{x}$.

In this proof, all that we have used is the assumption that

$$h(x_0 - t) = h(x_0 + t).$$

This condition may hold for regions which are not symmetric, as below.

[Figure: a region that is not symmetric about $x = x_0$ but satisfies $h(x_0 - t) = h(x_0 + t)$.]

And interchanging $x$ and $y$, we get the following theorem.

Theorem 4. If $R$ is symmetric about the line $y = y_0$, then $\bar{y} = y_0$.

These ideas have the following geometric consequence:

Theorem 5 (Pappus' theorem, for volumes). If a region is rotated about a line not
intersecting it, then the volume of the resulting solid is equal to the area of the region
times the circumference of the circle described by the centroid.
That is, if the region below is rotated about the y-axis, then

$$V = 2\pi\bar{x}A.$$

Proof. By the method of shells,

$$V = \int_a^b 2\pi xh(x)\,dx.$$

Therefore

$$V = 2\pi M_y = 2\pi\bar{x}A,$$

because $\bar{x}$ was defined by the equation $M_y = \bar{x}A$.

Pappus' theorem can be applied in two ways. If we know $\bar{x}$ and $A$, we can compute $V = 2\pi\bar{x}A$; and if we know $V$ and $A$, we can solve for $\bar{x} = V/2\pi A$. For example, consider a circular region $R$, of radius $a$, with center at the point $(b, 0)$, $b > a$. When $R$ is rotated about the y-axis, we get a solid which is called a solid torus. (The surface of the solid is called a torus.) By Pappus' theorem, we get

$$V = 2\pi b \cdot \pi a^2 = 2\pi^2 a^2 b.$$

[Figure: a semicircular region of radius $a$, between $y = -a$ and $y = a$, with its centroid $P$ on the x-axis at an unknown distance from the origin.]
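The torus volume given by Pappus' theorem can be compared with a direct evaluation of the shell integral $\int 2\pi x h(x)\,dx$. The sketch below is an illustration under stated assumptions: the function names, the midpoint rule, and the sample values $a = 1$, $b = 3$ are chosen here and do not come from the text.

```python
import math

def torus_volume_by_shells(a, b, n=200_000):
    """Approximate V = integral of 2*pi*x*h(x) dx over [b - a, b + a] by the midpoint rule."""
    dx = 2.0 * a / n
    total = 0.0
    for i in range(n):
        x = (b - a) + (i + 0.5) * dx
        h = 2.0 * math.sqrt(a * a - (x - b) ** 2)   # height of the disk's cross section at x
        total += 2.0 * math.pi * x * h * dx
    return total

def torus_volume_by_pappus(a, b):
    """V = (circumference described by the centroid) * (area) = 2*pi*b * pi*a^2."""
    return 2.0 * math.pi * b * math.pi * a * a

print(torus_volume_by_shells(1.0, 3.0))
print(torus_volume_by_pappus(1.0, 3.0))   # the two values should nearly agree
```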

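We can use the theorem in reverse to find the centroid of a semicircular region. If the region is rotated about the y-axis, we get a sphere of radius $a$, with volume

$$V = \tfrac{4}{3}\pi a^3.$$

Obviously $A = \tfrac{1}{2}\pi a^2$. Therefore $\tfrac{4}{3}\pi a^3 = 2\pi\bar{x} \cdot \tfrac{1}{2}\pi a^2$, and

$$\bar{x} = \frac{4}{3\pi}a.$$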
These ideas apply also to arcs. We shall think of an arc as a thin homogeneous wire whose mass per unit length is constant, say, = 1. Suppose that the arc is the graph of a function $f$, on an interval $[a, b]$. As usual, we take a net over $[a, b]$, with equal subdivisions. The arc length over the interval $[x_{i-1}, x_i]$ is

$$s_i = \int_{x_{i-1}}^{x_i} \sqrt{1 + f'(x)^2}\,dx.$$

Now

$$s_i \approx \sqrt{1 + f'(x_i)^2}\,\Delta x;$$

the moment of this little arc about the y-axis ought to be approximately $x_i s_i$; and so the moment of the whole graph ought to be

$$M_y \approx \sum_{i=1}^{n} x_i\sqrt{1 + f'(x_i)^2}\,\Delta x.$$

Definitions. Given the function $f$, with a continuous derivative $f'$, on $[a, b]$, the moment of the graph about the y-axis is

$$M_y = \int_a^b x\sqrt{1 + f'(x)^2}\,dx;$$

and the moment about the line $x = x_0$ is

$$M_{x=x_0} = \int_a^b (x - x_0)\sqrt{1 + f'(x)^2}\,dx.$$

Similarly, we state the following:

Definitions. The moment of the graph of $f$ about the x-axis is

$$M_x = \int_a^b f(x)\sqrt{1 + f'(x)^2}\,dx,$$

the moment about the line $y = y_0$ is

$$M_{y=y_0} = \int_a^b (f(x) - y_0)\sqrt{1 + f'(x)^2}\,dx;$$

and the centroid of the graph is the point $P = (\bar{x}, \bar{y})$ for which

$$M_y = \bar{x}L, \qquad M_x = \bar{y}L,$$

where $L$ is the total arc length.


Our previous theorems for regions now have analogous forms for arcs, as follows.

Theorem 6. If the graph of $f$ is symmetric about a line $x = x_0$ (or $y = y_0$) then this line contains the centroid.

Theorem 7. If the graph of $f$ is rotated about a horizontal or vertical line not intersecting the graph, then the area of the resulting surface of revolution is equal to the length of the arc times the circumference of the circle described by the centroid.
For example, if we rotate about the y-axis, then the area of the resulting surface is

$$S = \int_a^b 2\pi x\sqrt{1 + f'(x)^2}\,dx = 2\pi M_y = 2\pi\bar{x}L,$$

by definition of $\bar{x}$. The proof of the theorem in the other cases is similar.
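As a small check of Theorem 7, one can take the segment $y = x$, $1 \le x \le 2$, rotate it about the y-axis, and compare $2\pi\bar{x}L$ with the lateral area of the resulting frustum of a cone. The example, the helper names, and the use of the midpoint rule are assumptions made for this sketch; the frustum formula $\pi(r_1 + r_2)s$ is quoted from elementary geometry.

```python
import math

def midpoint_integral(g, lo, hi, n=10_000):
    """Approximate the integral of g over [lo, hi] by the midpoint rule."""
    dx = (hi - lo) / n
    return sum(g(lo + (i + 0.5) * dx) for i in range(n)) * dx

def arc_moment_and_length(fprime, lo, hi):
    """Return (M_y, L) for the graph of f on [lo, hi], given its derivative fprime."""
    ds = lambda x: math.sqrt(1.0 + fprime(x) ** 2)      # arc length per unit x
    m_y = midpoint_integral(lambda x: x * ds(x), lo, hi)
    length = midpoint_integral(ds, lo, hi)
    return m_y, length

# Segment y = x from x = 1 to x = 2, so f'(x) = 1.
m_y, length = arc_moment_and_length(lambda x: 1.0, 1.0, 2.0)
x_bar = m_y / length
pappus_area = 2.0 * math.pi * x_bar * length

# Lateral area of a frustum: pi * (r1 + r2) * slant height, with r1 = 1, r2 = 2.
frustum_area = math.pi * (1.0 + 2.0) * math.sqrt(2.0)
print(pappus_area, frustum_area)   # the two values should nearly agree
```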


Throughout this section, we have used a fixed coordinate system to define and investigate moments and centroids of regions and arcs. It is a fact, however, that moments and centroids do not depend on the choice of a coordinate system; they depend only on the regions and the arcs. In particular, any line of symmetry (horizontal, vertical, or sloping) must contain the centroid. You may use this fact in the problem set below.

PROBLEM SET 7.6


1. Let $A$, $B$, and $C$ be the points $(0, 0)$, $(a, 0)$, and $(b, c)$, $a, c > 0$. At each of these points there is a particle of mass 1. Find the centroid of the resulting system.
2. Suppose at the points $A$, $B$, and $C$ of Problem 1 there are particles of mass 1, 2, and 3 respectively. Find the centroid of the resulting system.
3. A median of a triangle is a segment between a vertex and the midpoint of the opposite side. Show that every median of the triangle described in Problem 1 passes through the centroid.
4. Now consider the triangular region $R$ determined by the same points $A$, $B$, and $C$. Find the centroid of $R$.
5. The region $R$ is rotated about the x-axis. Find the volume.
6. $R$ is rotated about the y-axis. Find the volume.
7. The figure formed by the sloping sides of the triangle is rotated about the x-axis. Find the area of the resulting surface.
8. Same question, for rotation about the y-axis, assuming $b \ge 0$.
9. A trapezoid has vertices $(0, 0)$, $(a, c)$, $(b - a, c)$, $(b, 0)$ (with $b > a > 0$ and $c > 0$). Find the centroid of the region $T$ bounded by this trapezoid.
10. The region $T$ is rotated about the x-axis. Find the volume.
11. The region $T$ is rotated about the y-axis. Find the volume.
12. The figure formed by the four sides of the trapezoid is rotated about the y-axis. Find the surface area.
13. The circle with center at $(b, 0)$ and radius $a$, with $0 < a < b$, is rotated about the y-axis. Find the area of the resulting torus.
14. Let the arc $A$ be the portion of the circle with center at the origin and radius $a$ which lies in the first quadrant. Find the centroid of $A$.
15. The square with corners at $(a, 0)$, $(a + k, k)$, $(a + k, -k)$, $(a + 2k, 0)$ is rotated about the y-axis. (Here $0 < k < a$.) Find the area of the resulting surface.
16. Find the volume of the solid obtained if the corresponding square region is rotated.
17. The same square is rotated about the line $x = a + 1k$. Find the surface area.
18. The square region is rotated about the line $x = a + 2k$. Find the volume.
19. Consider the curve consisting of: (a) the segment from $(0, 0)$ to $(a, 0)$; (b) the segment from $(0, 1)$ to $(a, 1)$; and (c) a semicircle, pointing outward, with endpoints at $(a, 0)$ and $(a, 1)$. This curve is to be rotated about the y-axis. For what value of $a$ is it true that the total area of the resulting surface is equal to 15?
20. For each $a$, let $S_a$ be the area of the surface described in Problem 19, and let $V_a$ be the volume of the solid that it encloses. What value of $a$ maximizes the ratio $V_a/S_a$?

7.7 IMPROPER INTEGRALS

The definite integral is defined as a limit of sample sums as the mesh of the net approaches 0. This limit exists if the integrand $f$ is continuous. But this definition of the integral does not apply to the function

$$f(x) = \frac{1}{\sqrt{x}}$$

on the half-open interval from 0 to 1. It really is the half-open interval

$$(0, 1] = \{x \mid 0 < x \le 1\}$$

that we are dealing with, because at $x = 0$ the function is not defined. On this half-open interval the function is unbounded. Therefore, for every net over $(0, 1]$ we can form a sample sum as large as we please, by taking the first sample point close to 0. Thus, the sample sum is large when the first sample point $x$ is small, and so the sample sums do not approach a limit as the mesh approaches 0.

Nevertheless, we can extend the definition of the integral in such a way that our problem has an answer. The function $f(x) = 1/\sqrt{x}$ is defined and continuous on every closed interval $[a, 1]$, where $a > 0$. Therefore $\int_a^1 (1/\sqrt{x})\,dx$ is well defined.

[Figure: the graph of $f(x) = 1/\sqrt{x}$.]

We define a new kind of integral by saying that

$$\int_0^1 \frac{dx}{\sqrt{x}} = \lim_{a \to 0^+} \int_a^1 \frac{dx}{\sqrt{x}},$$

if the indicated limit on the right exists. (We write $a \to 0^+$, because $a$ takes on only positive values.) In the present case, the limit exists and is finite:

$$\int_a^1 x^{-1/2}\,dx = \left[2x^{1/2}\right]_a^1 = 2 - 2\sqrt{a}.$$

Therefore

$$\int_0^1 \frac{dx}{\sqrt{x}} = \lim_{a \to 0^+} \left[2 - 2\sqrt{a}\right] = 2.$$

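There are similar-looking problems for which the limit is infinite. For example,

$$\int_0^1 \frac{dx}{x^2} = \lim_{a \to 0^+} \int_a^1 \frac{dx}{x^2},$$

if the limit exists. In this case,

$$\int_a^1 \frac{dx}{x^2} = \left[-\frac{1}{x}\right]_a^1 = -1 + \frac{1}{a} \qquad (a > 0),$$

and so

$$\lim_{a \to 0^+} \int_a^1 \frac{dx}{x^2} = \infty,$$

in the sense defined in Section 5.3. We abbreviate this by writing

$$\int_0^1 \frac{dx}{x^2} = \infty.$$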
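The contrast between the two integrals can be seen by tabulating the closed forms $2 - 2\sqrt{a}$ and $-1 + 1/a$ for small positive $a$. The function names in this sketch are chosen for illustration; the formulas are the ones obtained from the antiderivatives $2x^{1/2}$ and $-1/x$.

```python
import math

def integral_inv_sqrt(a):
    """Integral of x**(-1/2) over [a, 1], for 0 < a <= 1: equals 2 - 2*sqrt(a)."""
    return 2.0 - 2.0 * math.sqrt(a)

def integral_inv_square(a):
    """Integral of x**(-2) over [a, 1], for 0 < a <= 1: equals -1 + 1/a."""
    return -1.0 + 1.0 / a

# As a decreases toward 0, the first column of values approaches 2,
# while the second grows without bound.
for a in (1e-2, 1e-4, 1e-6):
    print(a, integral_inv_sqrt(a), integral_inv_square(a))
```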

The same test can be applied at any point where the function "blows up," as long as there is only one such point, at an endpoint of the interval. For example, consider

$$\int_0^{\pi/2} \sec x\,dx = \lim_{a \to \pi/2^-} \int_0^a \sec x\,dx,$$

where the minus sign means that $a \to \pi/2$ through values less than $\pi/2$.

[Figure: the graph of $\sec x$ for $0 \le x < \pi/2$.]

Now

$$\int_0^a \sec x\,dx = \left[\ln|\sec x + \tan x|\right]_0^a = \ln|\sec a + \tan a| - \ln|1 + 0| = \ln|\sec a + \tan a|.$$

As $a \to \pi/2^-$,

$$\sec a = \frac{1}{\cos a} \to \infty \qquad \text{and} \qquad \tan a = \frac{\sin a}{\cos a} \to \infty.$$

Therefore $\ln[\sec a + \tan a] \to \infty$ and

$$\lim_{a \to \pi/2^-} \int_0^a \sec x\,dx = \infty.$$

We use the same method to define and evaluate such integrals as

$$\int_1^\infty \frac{dx}{x^2}.$$

Here the integration is supposed to be carried out all the way to the right, starting at $x = 1$. Again our definition of the definite integral (as the limit of the sample sums) does not apply, and so we define the improper integral as a limit:

$$\int_1^\infty \frac{dx}{x^2} = \lim_{a \to \infty} \int_1^a \frac{dx}{x^2},$$

if the limit on the right exists. For the function

$$g(a) = \int_1^a \frac{dx}{x^2},$$

the limit exists and is finite:

$$\int_1^a \frac{dx}{x^2} = \left[-\frac{1}{x}\right]_1^a = -\frac{1}{a} + 1;$$

and so

$$\lim_{a \to \infty} \int_1^a \frac{dx}{x^2} = \lim_{a \to \infty} \left[1 - \frac{1}{a}\right] = 1.$$

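Here we have a second example of an unbounded figure with a finite area.

[Figure: the graph of $f(x) = 1/x^2$ for $x \ge 1$.]

For very similar-looking functions, the integral from 1 to $\infty$ may be infinite. For example,

$$\int_1^\infty \frac{dx}{x} = \lim_{a \to \infty} \int_1^a \frac{dx}{x} = \lim_{a \to \infty} \left[\ln x\right]_1^a = \lim_{a \to \infty} \ln a = \infty.$$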

Integrals of the type that we have been discussing here are called improper integrals, of the first and second kind respectively. The two kinds can occur in combination. For example,

$$\int_0^\infty \frac{dx}{\sqrt{x}(1 + x)}$$

is improper in two ways: the function blows up at the lower limit, and also the upper limit is $\infty$. Thus, for the integral to be finite, the two limits

$$\lim_{a \to 0^+} \int_a^1 \frac{dx}{\sqrt{x}(1 + x)}, \qquad \lim_{b \to \infty} \int_1^b \frac{dx}{\sqrt{x}(1 + x)}$$

must both be finite; if they are, then the integral from 0 to $\infty$ is their sum. (Any positive number $k$ would have done equally well in place of 1.) In this case, both limits are finite.

Improper integrals may appear in a disguised form, and so we need to be careful. For example, a careless calculation gives

$$\int_{-1}^{1} \frac{dx}{x^2} = \left[-\frac{1}{x}\right]_{-1}^{1} = -1 - 1 = -2.$$

This is impossible, because the integrand is positive. The trouble here is that the function blows up at $x = 0$, and so we need to make separate investigations of the integrals

$$\int_{-1}^{0} \frac{dx}{x^2}, \qquad \int_0^1 \frac{dx}{x^2}.$$

We find that both these limits are equal to $\infty$. Therefore the original integral is equal to $\infty$.

Note that when we got the "answer" -2 we had no right to complain that the
theory was wrong, because the Fundamental Theorem of Integral Calculus does not
apply to functions whose domains have holes in them; the theorem applies only
to functions which are defined and continuous on the interval over which we are
integrating.
An even worse example of the same kind is

$$\int_{-1}^{1} \frac{dx}{x} = \left[\ln|x|\right]_{-1}^{1} = \ln 1 - \ln 1 = 0 \quad (?).$$

Splitting this into two parts, we get

$$\int_{-1}^{0} \frac{dx}{x} = \lim_{a \to 0^-} \left[\ln|x|\right]_{-1}^{a} = \lim_{a \to 0^-} \ln|a| = -\infty,$$

$$\int_0^1 \frac{dx}{x} = \lim_{a \to 0^+} \left[\ln|x|\right]_a^1 = \lim_{a \to 0^+} [-\ln a] = \infty.$$

The limits $-\infty$ and $\infty$ do not combine to give a well-defined limit, either finite or infinite, and therefore the original integral is not defined at all. We also get no answer for

$$\int_0^\infty \sin x\,dx.$$

Here

$$\int_0^a \sin x\,dx = \left[-\cos x\right]_0^a = -\cos a + 1,$$

which oscillates forever between 0 and 2, and therefore does not approach a limit.
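The oscillation can be seen by evaluating $-\cos a + 1$ at successive multiples of $\pi$. The function name below is chosen for this sketch only.

```python
import math

def partial_integral_sin(a):
    """Integral of sin x over [0, a], computed from the antiderivative: 1 - cos(a)."""
    return 1.0 - math.cos(a)

# At a = pi, 2*pi, 3*pi, ... the values alternate between (nearly) 2 and (nearly) 0,
# so they cannot approach a single limit as a grows.
for n in range(1, 7):
    print(n, round(partial_integral_sin(n * math.pi), 12))
```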
Thus, when we investigate an improper integral, there are three situations that
we may encounter.

1) The integral may exist, as a finite limit. For example,

$$\int_1^\infty \frac{dx}{x^2} = \lim_{a \to \infty} \int_1^a \frac{dx}{x^2} = \lim_{a \to \infty} \left[1 - \frac{1}{a}\right] = 1.$$

2) The integral may exist, as an infinite limit $\infty$ or $-\infty$. For example,

$$\int_1^\infty \frac{dx}{x} = \lim_{a \to \infty} \int_1^a \frac{dx}{x} = \lim_{a \to \infty} \ln a = \infty.$$

3) Finally, the limit may not exist as any limit, finite or infinite. This is what we find for

$$\int_{-1}^{1} \frac{dx}{x}.$$

In the following problem set, when you are asked to "investigate" an improper
integral, you should find out which of the above three cases it represents. If it is
Case 1, you should find the limit, unless the contrary is stated.


PROBLEM SET 7.7

loo dx
x2
oo
dx
J x3
loo dx
1 x1.0001
f dx
1x x
loo
2 (x - l)s dx
J000 xe-xdx
loo xdx
1 x4
f;' xn dx

loo dx
1 x2 x
f dx
loo dx
1 x0.9999
f dx1 0001
0 x.
f
1 (x - l)s dx
Loox2e-xdx

Investigate:
1.

4.

2.

1 +

5.

7.

10.

13.

16.

19.
20.

8.

11.

In

3.

6.

-v:x

9.

12.

14.

17.

15.

18.

-v:x

v'x(l

x)

1 +

Show that

is never finite, for any value of n, positive, negative, or zero. (The

l/x,

point is that something always goes wrong, either at 0 or at


21.

loo dx
1 xs
loo dx
loo dx
xlnx
f2 dx
0 xo.9999
fo00e-xdx
Joo dx

Consider the graph of/(x)

< oo. Let

oo.)

R be the region under the graph;

let S be the solid of revolution (about the x-axis); and let Tbe the

surface of revolution.

Investigate the improper integrals which represent (a) the area of R, (b) the volume of S,
and (c) the area of T.
Investigate the following for existence. (That is, find out whether the integral represents a finite limit; but in the cases where it does, you need not calculate the limit.)
22.

25.

27.

30.

33.

*36.

loo-1--.,
dx dx
1 e
loo--dx
x2e-x
x
+

1 1 +

J00 e-xdx
-00
J00 e-x dx
-00/2
f0 xdx
x
Joo --dx
x2
csc
sin

..

23.

26.

28.

31.

34.

loo dx
2 x x
f-00oo e-x dx
1 +

loo

24.

In

1 +

v'

dx

1 +

Tan-1

< oo. It will

then follow by symmetry that

e-x

< oo. And obviously

because

lsin

dx
x(l x)
f;" e-x dx f= e-"'2
f:_1 e-x-dx1,
loo dx
2 v'xinx
/2tanxdx
f0
xi
Joo --dx
x2

(To show that this is finite, it will be sufficient to show that

x4 dx
i00 G - x) dx
i00 x2 xi dx
1

Joo

is continuous on
29.

32.

!sin

35.

1T

< oo,
1].)


**37. a) Show that for each

n,

IJv';;;(n+iJ.-.
f 00 v

I 'Jv;;;

sm x2 dx <

b) Investigate
38.

v'..

__

sin x2 dx .

v' (n-1>.-

sin x2 dx.

Let f be a decreasing function, with a continuous derivative, on the interval [a,


The graph (and the region under it) are rotated about the x-axis.

oo ) .

Show that if the

surface area is finite, then so also is the volume.


*7.8 THE INTEGRABILITY OF CONTINUOUS FUNCTIONS

Let $f$ be a function which is continuous on an interval $I$. The function $f$ is continuous if for each $x$, and each $\epsilon > 0$, the graph has an $\epsilon\delta$-box at the point $(x, f(x))$. The total height of an $\epsilon\delta$-box is $h = 2\epsilon$. The box is then called an $h$-box of every point of the graph that lies in it.

[Figure: an $h$-box on the graph of $f$.]

The definition of continuity applies to the points $x$ of the interval, one at a time. It may appear, therefore, that if $f$ is continuous at each point $x$ of the interval, then we have to use infinitely many boxes (one for each $x$) in order to exhibit the fact. But if $I$ is a closed interval, this is not so:

Theorem 1 (The finite covering theorem). Let $f$ be a continuous function on the closed interval $[a, b]$, and let $h$ be a positive number. Then there is a finite collection of $h$-boxes, covering the entire graph of $f$.


[Figure: a finite collection of $h$-boxes covering the graph of $f$.]

Proof. Let $[c, d]$ be a subinterval of $[a, b]$. If there is a finite collection of $h$-boxes, covering the part of the graph determined by $[c, d]$, then we shall say that $[c, d]$ is good. If no such finite collection exists, then we say that $[c, d]$ is bad. We allow the case in which $[c, d]$ is all of $[a, b]$. Thus what we need to show is that $[a, b]$ is good. Suppose, then, that $[a, b]$ is bad. We shall show that this leads to a contradiction.

Let $[a_1, b_1] = [a, b]$. If the left-hand half of $[a_1, b_1]$ and the right-hand half of $[a_1, b_1]$ are both good, then it follows that $[a_1, b_1]$ is good; we can fit together two finite collections of boxes, getting another finite collection of boxes that covers the whole graph. Therefore one or both of the halves of $[a_1, b_1]$ must be bad. Let $[a_2, b_2]$ be a bad half of $[a_1, b_1]$. Similarly, one of the halves of $[a_2, b_2]$ must be bad. Let $[a_3, b_3]$ be a bad half of $[a_2, b_2]$. Proceeding to infinity in this way, we get a nested sequence

$$[a_1, b_1], [a_2, b_2], \ldots, [a_n, b_n], \ldots$$

of closed intervals, each of which is bad. By the nested interval postulate, there is a number $\bar{x}$ which lies on all of these intervals.

Now $f$ is continuous at $\bar{x}$. Therefore $f$ has an $h$-box at $(\bar{x}, f(\bar{x}))$. Suppose that the box is

$$\{(x, y) \mid x_0 < x < x_1,\ y_0 < y < y_1\}.$$

Since

$$b_n - a_n = \frac{b - a}{2^{n-1}},$$

the length of the $n$th interval $[a_n, b_n]$ approaches 0. Therefore we must have

$$x_0 < a_n < b_n < x_1$$

for some $n$. This means that $[a_n, b_n]$ is good after all: one $h$-box covers the part of the graph that lies above it; and 1 is finite.

We continue now at the point where we left off in Section 7.2. There we defined net, mesh (of a net), upper sum $S(N)$, lower sum $s(N)$, and sample sum $\Sigma(X)$. Theorem 3 of Section 7.2 was as follows:

Theorem. If $f$ is continuous, and

$$\lim_{|N| \to 0} [S(N) - s(N)] = 0,$$

then the sample sums approach a limit, as $|N| \to 0$.

Now $\int_a^b f(x)\,dx$ is defined to be $\lim_{|N| \to 0} \Sigma(X)$. Therefore what we need, to complete the proof that continuous functions are integrable, is the following theorem:

Theorem 2. If $f$ is continuous on $[a, b]$, then

$$\lim_{|N| \to 0} [S(N) - s(N)] = 0.$$

Proof. Let $\epsilon > 0$ be given. We need to show that there is a $\delta > 0$ such that

$$|N| < \delta \quad \Rightarrow \quad S(N) - s(N) < \epsilon.$$

We know by the finite covering theorem that for every $h > 0$ there is a finite collection of $h$-boxes, covering the graph. (See the left-hand figure below. We have not yet decided what $h$ we want to use.) The x-coordinates of the vertical sides of the boxes, together with $a$ and $b$, form a net $N_0$ over $[a, b]$. Let $\delta$ be the length of the shortest interval in $N_0$. We assert that if $N$ is any other net over $[a, b]$, with $|N| < \delta$, then every little interval $[x_{i-1}, x_i]$ in $N$ lies under some one of our boxes. We illustrate with the simpler figure on the right.

[Figure: left, a finite collection of $h$-boxes covering the graph; right, a simpler case.]

If $[x_{i-1}, x_i]$ contains no point of $N_0$ (as on the right) this is evident. If $[x_{i-1}, x_i]$ contains a point $y_i$ of $N_0$ (as on the left), then $y_i$ lies on the open interval under one of our boxes, and so $[x_{i-1}, x_i]$ lies under the same box.
Now take a net $N$, with $|N| < \delta$. The difference $S(N) - s(N)$ is the sum of the areas of a collection of rectangles, like this:

[Figure: the rectangles whose areas add up to $S(N) - s(N)$.]

Each of these rectangles lies in one of our $h$-boxes. (Why?) Therefore each of them has height $\le h$. Hence

$$S(N) - s(N) \le h(b - a).$$

Thus we want

$$h(b - a) < \epsilon,$$

and this will hold if

$$h < \frac{\epsilon}{b - a}.$$

This is the way we should choose $h$ at the beginning of the proof. The resulting $\delta$ is the $\delta$ that we need.
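The shrinking of $S(N) - s(N)$ with the mesh can be observed directly in a simple case. The sketch below assumes an increasing function, so that on each subinterval the largest value is taken at the right endpoint and the smallest at the left endpoint; the function name and the choice $f(x) = x^2$ on $[0, 1]$ are illustrative assumptions.

```python
def upper_minus_lower(f, a, b, n):
    """S(N) - s(N) for an increasing function f on [a, b], with n equal subintervals."""
    dx = (b - a) / n
    xs = [a + i * dx for i in range(n + 1)]
    upper = sum(f(xs[i]) * dx for i in range(1, n + 1))      # right endpoints give the maxima
    lower = sum(f(xs[i - 1]) * dx for i in range(1, n + 1))  # left endpoints give the minima
    return upper - lower

square = lambda x: x * x
for n in (10, 100, 1000):
    # For an increasing f the difference equals (f(b) - f(a)) * dx, which tends to 0.
    print(n, upper_minus_lower(square, 0.0, 1.0, n))
```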

Theorem 3. Every continuous function on a closed interval is integrable.

Proof. By the preceding theorem, $\lim [S(N) - s(N)] = 0$. By Theorem 3 of Section 7.2, $f$ is integrable.

Some of the ideas in this proof are worth examining further. In Problem Set 7.2, we gave the following definition.

Definition. Let $f$ be a function on an interval $I$. Suppose that for every $\epsilon > 0$ there is a $\delta > 0$ such that

$$|x - x'| < \delta \quad \Rightarrow \quad |f(x) - f(x')| < \epsilon,$$

where $x$ and $x'$ are any two points of $I$. Then $f$ is uniformly continuous on $I$.

Note that while continuity is defined for one point $x$ at a time, uniform continuity is defined for the graph as a whole. The difference between these ideas may be clarified by an analogy:
1) A man is literate if there is a language that he can read and write.
2) A group of men is literate if each of its members is literate.
3) A group of men is uniformly literate if there is one language that every member of the group can read and write.

Thus, uniform literacy is a property not of the individuals in a group but of the group as a whole; if each of the members of the group is literate, it follows that the group is literate [see (2)], but it does not follow that the group is uniformly literate.
The difference between continuity of a function $f$ on an interval $I$ and uniform continuity of $f$ on $I$ is analogous. For example, $f(x) = 1/x$ is continuous on the open interval $I = (0, 1)$, because $f$ is continuous at every point $x$ of $I$. But $f$ is not uniformly continuous on $I$. (For every $\epsilon > 0$, we can find two points $x$, $x'$, as close together as we please, such that $|f(x) - f(x')| > \epsilon$.)
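One way to exhibit such pairs of points is to take $x = 1/n$ and $x' = 1/(n + k)$: the distance between them is $k/(n(n + k))$, which is small when $n$ is large, while $|f(x) - f(x')| = k$. This particular choice of points, and the function name below, are assumptions made for the sketch; the text does not specify which points to use.

```python
from fractions import Fraction

def close_pair(n, k):
    """For f(x) = 1/x, return (|x - x'|, |f(x) - f(x')|) with x = 1/n, x' = 1/(n + k)."""
    x, x_prime = Fraction(1, n), Fraction(1, n + k)
    distance = abs(x - x_prime)          # equals k / (n * (n + k))
    jump = abs(1 / x - 1 / x_prime)      # equals |n - (n + k)| = k
    return distance, jump

# The points get closer together as n grows, but the jump in f stays equal to k.
for n in (10, 1000, 100000):
    print(n, close_pair(n, 5))
```

Taking $k$ larger than a given $\epsilon$ and then $n$ large gives points as close together as we please whose function values differ by more than $\epsilon$.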
But continuity implies uniform continuity when the domain of the function is a closed interval.

Theorem 4 (the uniform continuity theorem). If $f$ is continuous on $[a, b]$, then $f$ is uniformly continuous on $[a, b]$.

Proof. Let $\epsilon > 0$ be given. By the finite covering theorem there is a finite collection of boxes of height $\epsilon$, covering the graph. (We are using $\epsilon$ as the $h$ of Theorem 1.) Let $N_0$ be the corresponding net over $[a, b]$, as in the proof of Theorem 2. As before, let $\delta$ be the length of the shortest interval in $N_0$. It follows that if $|x - x'| < \delta$, then $x$ and $x'$ lie under some one of our boxes. Therefore $|f(x) - f(x')| < \epsilon$, which was to be proved.
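This is the idea that we need, to complete the proof of the formula

$$A = \int_a^b 2\pi f(x)\sqrt{1 + f'(x)^2}\,dx$$

for the area of a surface of revolution about the x-axis. In Section 7.5, we knew that as $|N| \to 0$, the sample sum

$$\Sigma' = \sum_{i=1}^{n} 2\pi f(x_i')\sqrt{1 + f'(x_i')^2}\,\Delta x$$

approaches the integral. But the area of the approximating surface was

$$\Sigma = \sum_{i=1}^{n} 2\pi f(\bar{x}_i)\sqrt{1 + f'(x_i')^2}\,\Delta x,$$

with two different sample points $\bar{x}_i$, $x_i'$ used on each interval $[x_{i-1}, x_i]$. Thus we need to show that

$$\lim_{|N| \to 0} |\Sigma' - \Sigma| = 0.$$

We are assuming that $f'$ is continuous. Therefore so also is $2\pi\sqrt{1 + f'(x)^2}$. Therefore the latter function is bounded. Let $M$ be such that

$$2\pi\sqrt{1 + f'(x)^2} < M, \quad \text{for every } x.$$

Let $\epsilon$ be any positive number. Then

$$\frac{\epsilon}{M(b - a)} > 0.$$

By the uniform continuity theorem, there is a $\delta > 0$ such that

$$|x - x'| < \delta \quad \Rightarrow \quad |f(x) - f(x')| < \frac{\epsilon}{M(b - a)}.$$

This is the $\delta$ that we need.

Proof.

$$|N| < \delta \quad \Rightarrow \quad |\bar{x}_i - x_i'| < \delta \ \text{ for each } i \quad \Rightarrow \quad |f(\bar{x}_i) - f(x_i')| < \frac{\epsilon}{M(b - a)} \ \text{ for each } i.$$

Now

$$\Sigma - \Sigma' = \sum_{i=1}^{n} 2\pi[f(\bar{x}_i) - f(x_i')]\sqrt{1 + f'(x_i')^2}\,\Delta x,$$

and so

$$|\Sigma - \Sigma'| \le \sum_{i=1}^{n} |f(\bar{x}_i) - f(x_i')| \cdot 2\pi\sqrt{1 + f'(x_i')^2}\,\Delta x < \sum_{i=1}^{n} M|f(\bar{x}_i) - f(x_i')|\,\Delta x.$$

Therefore

$$|N| < \delta \quad \Rightarrow \quad |\Sigma - \Sigma'| < \sum_{i=1}^{n} M \cdot \frac{\epsilon}{M(b - a)}\,\Delta x = M \cdot \frac{\epsilon}{M(b - a)}\sum_{i=1}^{n} \Delta x = \frac{\epsilon}{b - a} \cdot (b - a) = \epsilon,$$

which was to be proved.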


'PROBLEM SET 7.8

Most of the questions below can be answered on the basis of a careful reexamination of
the theorems and proofs in Sections 7.2 and 7.8. Some of them, however, require independent
investigation. Naturally, all answers should be explained.
I.

Suppose that/is known to be increasing on

[a, b],

but is not known to be continuous.

Does it follow that f is integrable?


2.

Show that Tan-1 is uniformly continuous on ( -

oo

).

3.

Same, for f(x)

4.

Is it possible for a function to be uniformly continuous on an open interval

oo,

on

( -co,

oo

).

(a,

b)?

Why or why not?


5.

If a function is uniformly continuous on an interval I, does it follow thatf is continuous


on I? Why or why not?

6.

Suppose that/is (a) continuous at


on

*7.

(a, b) .

a,

(b) continuous at b, and (c) uniformly continuous

Does it follow that f is uniformly continuous on

[a, b]?

Suppose that f is bounded and integrable, but not necessarily continuous, on


For each

of

[a, b],

let

F(x)

[a, b].

f' f

(t) dt.

Show that Fis continuous. (The betweenness theorem for integrals, which is Theorem 5
of Section 7.2, may be useful here.)
*8.

For the definition of

Lipschitzian,

see Problem 12a of Problem Set 7.2.

is Lipschitzian on I, then/ is uniformly continuous on I.


open or closed, finite or infinite.)
9.

10.

Let

a1, a2,

an

be any finite sequence of numbers. Show that

Let f be continuous on

[a, b].

Show that

If

f(x) dx

I f

If(x)I dx.

Show that if f

(Here I may be any interval,

The Conic Sections

8.1 TRANSLATION OF AXES

In Section 2.2 we stated the definition of a coordinate system on a line $L$.

[Figure: points on the line $L$, labeled with their coordinates.]

A coordinate system on $L$ is a one-to-one correspondence

$$L \leftrightarrow \mathbf{R}, \qquad P \leftrightarrow x,$$

between the points $P$ of $L$ and the real numbers $x$, such that the distance between any two points is the absolute value of the difference of the corresponding numbers. That is,

$$P_1P_2 = |x_2 - x_1|.$$

If we subtract the same number from the coordinate of every point, we obtain another coordinate system on the line. If we subtract $h$ from every $x$, then

$$x_1' = x_1 - h, \qquad x_2' = x_2 - h,$$

and so

$$|x_2' - x_1'| = |x_2 - x_1|.$$

Therefore the distance formula works, for the new coordinates $x' = x - h$.

This process is called translation. The origin is moved to the point $h$, and all the other number labels are moved with it.

[Figure: the same line labeled with the old coordinates $x$ and the new coordinates $x' = x - h$.]

Thus the old and new coordinates are related by the formulas

$$x' = x - h, \qquad x = x' + h.$$

Consider now a plane with a coordinate system.

[Figure: the old axes $x$, $y$ and the new axes $x'$, $y'$, which lie along the lines $y = k$ and $x = h$.]

Suppose that we translate the coordinate system on the x-axis, subtracting $h$ from every x-coordinate, and then translate the coordinate system on the y-axis, subtracting $k$ from every y-coordinate. The effect is to move the origin to the point $(h, k)$. Every point $P$ now has a new pair of coordinates $x'$, $y'$, and these are related to the old ones by the formulas

$$x = x' + h, \qquad x' = x - h,$$

$$y = y' + k, \qquad y' = y - k.$$

These formulas are easy to remember; the only way you are likely to go wrong is to get them backwards (by writing $x' = x + h$ (?) or $y' = y + k$ (?)). It is easy to see, however, that the new origin must have old coordinates $h$, $k$ and new coordinates 0, 0; and from this we can tell which way the formulas ought to go.

As usual, $(x, y)$ denotes the point whose old coordinates are $x$ and $y$. Thus the old origin is $(0, 0)$, and the new origin is $(h, k)$. When we write $(a, b)'$ (with a prime outside the parentheses) we mean the point whose new coordinates are $a$ and $b$. Thus the new origin is $(0, 0)'$, and the old origin is $(-h, -k)'$. More examples are given below:

[Figure: old and new axes with $h = 3$, $k = 1$, showing the points $P = (5, 2) = (2, 1)'$ and $Q = (-1, -1) = (-4, -2)'$.]

In the figure, $h = 3$ and $k = 1$. Two points have been labeled both ways. At the point $P$, we have

$$x = 5, \qquad y = 2, \qquad x' = 2, \qquad y' = 1,$$

so that the label

$$P = (5, 2) = (2, 1)'$$

is correct. Similarly, at $Q$ we have

$$x = -1, \qquad y = -1, \qquad x' = -4, \qquad y' = -2,$$

so that

$$Q = (-1, -1) = (-4, -2)'.$$

When we write an equation to describe a figure in the plane, the equation depends
on the choice of axes; and often one choice of axes gives a simpler equation than any
other. If we didn't start with the axes in the best position, then we can simplify the
equation by translation of axes.

For example, consider the parabola with directrix

y = -1 and focus F = (3, 3).


y
F=(3, 3)

x, y)
---- P=(
--\

"'-.__

-+---- x
0
1
2
4
5
16
3
D---1

________________

n_
M

__

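The parabola is the graph of the condition

$$FP = MP.$$

Algebraically, this says that

$$\begin{aligned}
&\sqrt{(x - 3)^2 + (y - 3)^2} = \sqrt{(y + 1)^2} \\
\Leftrightarrow\quad & x^2 - 6x + 9 + y^2 - 6y + 9 = y^2 + 2y + 1 \\
\Leftrightarrow\quad & 8y = x^2 - 6x + 17 \\
\Leftrightarrow\quad & y = \tfrac{1}{8}x^2 - \tfrac{3}{4}x + \tfrac{17}{8}.
\end{aligned}$$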

We know, however, that a parabola with a horizontal directrix and vertex at the
origin always has an equation of the form

$$y = ax^2.$$

The vertex of our parabola is halfway between the focus and the directrix, at the point $V = (3, 1)$. This means that we should translate the axes so that the new origin becomes the point

$$O' = (h, k) = (3, 1).$$

Relative to the new axes, the equation becomes

$$8(y' + 1) = (x' + 3)^2 - 6(x' + 3) + 17;$$

here we have replaced $x$ by $x' + h = x' + 3$ and $y$ by $y' + k = y' + 1$. This gives

$$8y' + 8 = x'^2 + 6x' + 9 - 6x' - 18 + 17,$$

or

$$8y' = x'^2, \qquad\text{or}\qquad y' = \tfrac{1}{8}x'^2.$$


This is in the standard form $y' = ax'^2$, where $a = 1/(2p)$ and $p$ is the distance between the focus and the directrix.

Thus, by a translation of axes, we have eliminated the linear term in $x$ and the constant term. Here we knew in advance where the origin ought to be for the equation to appear in a simple form. If we hadn't known this, we could still have investigated algebraically, to find out what sort of simplifications a translation could accomplish. To do this, we would regard $h$ and $k$ as unknown quantities, and make the substitution

$$x = x' + h, \qquad y = y' + k$$

in general form. This gives:

$$8(y' + k) = (x' + h)^2 - 6(x' + h) + 17,$$

or

$$x'^2 + (2h - 6)x' - 8y' + h^2 - 6h + 17 - 8k = 0.$$

Certain facts are now obvious: (1) We can't get rid of the term $x'^2$, by any choice of $h$ and $k$, because $h$ and $k$ do not appear in the coefficient of $x'^2$. (2) For the same reason, we can't get rid of the linear term in $y'$. (3) The total coefficient of $x'$ is $2h - 6$, and the total constant term is

$$h^2 - 6h + 17 - 8k.$$

We can therefore get rid of the $x'$ term, by using $h = 3$. The constant term then becomes

$$9 - 18 + 17 - 8k, \qquad\text{or}\qquad 8 - 8k,$$

which is 0 when $k = 1$. Thus, translating the origin to the point

$$(h, k) = (3, 1),$$

we get the equation in the form

$$8y' = x'^2 \qquad\text{or}\qquad y' = \tfrac{1}{8}x'^2,$$

as before. This is the process that you follow if you don't know the answer in advance.
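
The substitution above can be checked mechanically. The following Python sketch is an illustration added here, not part of the text; the helper name `translate_quadratic` and the coefficient ordering (A, B, C, D, E, F) for $Ax^2 + Bxy + Cy^2 + Dx + Ey + F = 0$ are assumptions made for the example. It replaces $x$ by $x' + h$ and $y$ by $y' + k$ and collects the new coefficients.

```python
def translate_quadratic(A, B, C, D, E, F, h, k):
    """Coefficients of the equation obtained from
    A x^2 + B xy + C y^2 + D x + E y + F = 0
    by substituting x = x' + h, y = y' + k (hypothetical helper)."""
    # The second-degree coefficients A, B, C do not involve h or k.
    new_D = 2 * A * h + B * k + D          # total coefficient of x'
    new_E = B * h + 2 * C * k + E          # total coefficient of y'
    new_F = A * h * h + B * h * k + C * k * k + D * h + E * k + F
    return (A, B, C, new_D, new_E, new_F)


# The parabola 8y = x^2 - 6x + 17, written as x^2 - 6x - 8y + 17 = 0,
# with the origin moved to (h, k) = (3, 1).
print(translate_quadratic(1, 0, 0, -6, -8, 17, 3, 1))
```

For the parabola of this section the linear coefficient of $x'$ and the constant term both come out 0, leaving $x'^2 - 8y' = 0$, which agrees with $8y' = x'^2$.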
PROBLEM SET 8.1

1. Find a translation which eliminates both of the linear terms from the equation
$$xy - 5y - 6x - 30 = 0.$$
Then sketch the graph, showing both sets of axes.

2. Is there a translation which eliminates the xy-term from the above equation? Why or why not? How about the possibility of removing the constant term?

3. Find a translation which removes both linear terms from the equation
$$x^2 + y^2 + x + y - 2 = 0.$$
Then sketch the graph, showing both sets of axes.

4. Find a translation which removes both linear terms from the equation
$$2xy - x + 3y - 2 = 0.$$

5. Find a translation which removes both linear terms from the equation
$$x^2 + y^2 + 4x + 2y + 1 = 0.$$

6. Find a translation which removes both linear terms from the equation
$$x^2 + xy - 3x + 2 = 0.$$

7. Find a translation that eliminates both linear terms in the equation
$$x^2 + xy + y^2 + x + y + 5 = 0.$$

8. Find a translation which removes both linear terms from the equation
$$x^2 + xy + y^2 + x + y + 1 = 0.$$

9. Show that there is no translation which removes both linear terms from the equation
$$x^2 + 2xy + y^2 + x - y + 1 = 0.$$

10. Show that there is no translation which removes both linear terms from the equation
$$4x^2 + 4xy + y^2 + 2x + y + 8 = 0.$$

11. Consider the equation $x^2 + y^2 + x + y - 2 = 0$. Under what conditions for $h$ and $k$ does this equation take the form
$$x'^2 + y'^2 + Ax' + By' = 0,$$
with possible linear terms but no constant term? (You may be able to think of a way to answer this question without doing any calculations at all.)

12. Show that if $ad - bc \ne 0$, then the linear system
$$ah + bk = e, \qquad ch + dk = f$$
always has a solution. (Simply start solving it; at some point, you will need to assume that $ad - bc \ne 0$.)

13. Consider an equation of the form
$$Ax^2 + Bxy + Cy^2 + Dx + Ey + F = 0.$$
Show that if $B^2 - 4AC \ne 0$, then there is always a translation that eliminates both of the linear terms. (The converse is not true; there are simple examples of equations where $B^2 - 4AC = 0$, but where the linear terms are absent to begin with. Examples?)

14. Sketch the graph of the equation $y^2 = x(x + 1)(x - 1)$.

15. Let $C$ be the graph of an equation of the form
$$b_2 y^2 + b_1 y + b_0 = a_3 x^3 + a_2 x^2 + a_1 x + a_0.$$
Show that if the axes are translated, then $C$ is the graph of an equation of the same form, in the new coordinates $x'$ and $y'$.

8.2

THE ELLIPSE

Let F and F' be two points, let c be half the distance between them, so that
FF'= 2c,
and let a be a number greater than c. Let C be the graph of the equation
FP + F'P = 2a.
The curve C is called the ellipse with foci F, F' and focal sum 2a.

To draw an ellipse, we put two thumbtacks in a drawing board, at the foci $F$ and $F'$. We tie the ends of a string to the thumbtacks, in such a way that the length of the string left free between the thumbtacks is $2a$. Then we put a pencil in the loop of string, placing the point so that the string is taut, and move the pencil around, keeping the string taut all the way. (We need to do this in two steps, on the two sides of the line through $F$ and $F'$.)

In the definition of an ellipse, we really mean that $F$ and $F'$ are two points; that is, $F \ne F'$. (Thus a circle is not an ellipse.) Also, we really mean $a > c$. (For $a = c$, the graph of the condition $FP + F'P = 2a$ is the segment from $F$ to $F'$; and for $a < c$, the graph is empty.)

Some things about ellipses are easily seen from the definition. For the definition
of symmetry of a figure, about a line or a point, see Section 7.6.
Theorem 1. An ellipse is symmetric about the line through its foci.

Proof. In the left-hand figure below, $P$ is on the ellipse, so that
$$FP + F'P = 2a.$$
And $L$ is the perpendicular bisector of the segment from $P$ to $P'$. By elementary geometry,
$$FP = FP' \qquad\text{and}\qquad F'P = F'P'.$$
Therefore $FP' + F'P' = 2a$, and $P'$ is on the ellipse.

Theorem 2. An ellipse is symmetric about the perpendicular bisector of the segment between its foci.

Proof? (This is not quite as simple as the preceding theorem. See the right-hand figure above.)

Theorem 3. If a curve is symmetric about each of two perpendicular lines, then it is symmetric about their point of intersection.

Proof? (We need to show that $OP = OP''$.)

[Figure: the perpendicular lines L and L', with the points P and P'.]

For ellipses, this gives us:

Theorem 4. Every ellipse is symmetric about the point midway between its foci.

P0 is called the center of the ellipse. (See the right-hand figure above.)
These symmetry theorems convey nearly all that is easy to see about ellipses
merely from the definition. Our next step is to set up a coordinate system, and describe
our ellipses by equations. We take the origin at the center of the ellipse, and the foci

on the x-axis. The ellipse is then said to be in standard position, relative to the axes.

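[Figure: an ellipse in standard position, with a point P(x, y) on the curve.]

As indicated in the figure above, let $F$ and $F'$ be the points $(-c, 0)$ and $(c, 0)$. Then

$$\begin{aligned}
& FP + F'P = 2a \\
\Leftrightarrow\quad & \sqrt{(x + c)^2 + y^2} + \sqrt{(x - c)^2 + y^2} = 2a \\
\Leftrightarrow\quad & \sqrt{(x + c)^2 + y^2} = 2a - \sqrt{(x - c)^2 + y^2} \\
\Rightarrow\quad & x^2 + 2cx + c^2 + y^2 = 4a^2 - 4a\sqrt{(x - c)^2 + y^2} + x^2 - 2cx + c^2 + y^2 \\
\Leftrightarrow\quad & a\sqrt{(x - c)^2 + y^2} = a^2 - cx \\
\Rightarrow\quad & a^2x^2 - 2a^2cx + a^2c^2 + a^2y^2 = a^4 - 2a^2cx + c^2x^2 \\
\Leftrightarrow\quad & (a^2 - c^2)x^2 + a^2y^2 = a^2(a^2 - c^2) \\
\Leftrightarrow\quad & \frac{x^2}{a^2} + \frac{y^2}{a^2 - c^2} = 1.
\end{aligned}$$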

Thus every point on the ellipse satisfies the final equation. It is possible to show, conversely, that every point $(x, y)$ that satisfies the final equation lies on the ellipse. (See Problem 22 below.) Thus we have:

Theorem 5. The ellipse with foci at $(\pm c, 0)$ and focal sum $2a$ is the graph of the equation
$$\frac{x^2}{a^2} + \frac{y^2}{a^2 - c^2} = 1.$$

For example, for $a = 3$ and $c = 2$ we get
$$\frac{x^2}{9} + \frac{y^2}{5} = 1.$$
To sketch, we observe that for $y = 0$, $x = \pm 3$, and for $x = 0$, $y = \pm\sqrt{5}$. We then sketch an oval with these as its extreme points.

[Figure: sketch of the ellipse $x^2/9 + y^2/5 = 1$.]

Given an equation
$$\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1$$
with $a > b > 0$, the graph is always an ellipse. Since $a^2 - b^2 > 0$, it follows that $a^2 - b^2 = c^2$ for some $c > 0$. The graph is therefore the ellipse described in Theorem 5. Thus we have proved half of the following theorem.


Theorem 6. Given the equation
$$\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1.$$
For $b^2 < a^2$, the graph is the ellipse with focal sum $2a$ and foci at $(\pm c, 0)$, where $c = \sqrt{a^2 - b^2}$. For $a^2 < b^2$, the graph is the ellipse with focal sum $2b$ and foci at $(0, \pm c)$, where $c = \sqrt{b^2 - a^2}$.

Proof of the second half of the theorem?

[Figures: left, the case $b < a$, with $c = \sqrt{a^2 - b^2}$ and $FP + F'P = 2a$; right, the case $a < b$, with $c = \sqrt{b^2 - a^2}$ and $FP + F'P = 2b$.]

If the foci are not in either of the two positions shown above, then the equation
of the ellipse is more complicated. In some cases, when the equation is given, we can
simplify the equation by a translation of axes. Consider

$$4x^2 + 9y^2 - 8x + 18y - 23 = 0.$$

Making the substitutions $x = x' + h$, $y = y' + k$, we get

$$4x'^2 + 9y'^2 + (8h - 8)x' + (18k + 18)y' + 4h^2 + 9k^2 - 8h + 18k - 23 = 0.$$

Evidently we want $h = 1$, $k = -1$; and this gives the equation in the form

$$4x'^2 + 9y'^2 - 36 = 0, \qquad\text{or}\qquad \frac{x'^2}{9} + \frac{y'^2}{4} = 1.$$

The graph is the ellipse with foci at $(\pm\sqrt{5}, 0)'$ and focal sum 6; it intersects the x'- and y'-axes at the points $(\pm 3, 0)'$, $(0, \pm 2)'$. We can now sketch, showing both sets of axes.

[Figure: the ellipse drawn relative to the new x'- and y'-axes, with the old axes also shown.]

In doing such sketches, we start by drawing the new axes and the curve, in a convenient position on the paper, and then draw the old axes, in the position where they must have been.
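
The procedure just described (translate to remove the linear terms, then read off the foci and focal sum from Theorem 6) can be sketched in Python. This is an illustration added here under stated assumptions: the helper name `ellipse_from_equation` is hypothetical, the equation is assumed to have no xy-term, and both second-degree coefficients are assumed positive and unequal.

```python
import math


def ellipse_from_equation(A, C, D, E, F):
    """Center, squared semi-axes, foci and focal sum of
    A x^2 + C y^2 + D x + E y + F = 0, assuming A > 0, C > 0, A != C
    and that the graph really is an ellipse (hypothetical helper)."""
    # Translation (h, k) that removes both linear terms.
    h = -D / (2 * A)
    k = -E / (2 * C)
    # After translating, the equation reads A x'^2 + C y'^2 = rhs.
    rhs = A * h * h + C * k * k - F
    if rhs <= 0 or A == C:
        raise ValueError("graph is not an ellipse")
    p2 = rhs / A   # x'^2 / p2 + y'^2 / q2 = 1
    q2 = rhs / C
    c = math.sqrt(abs(p2 - q2))
    if p2 > q2:    # foci on the new x'-axis
        foci = [(h - c, k), (h + c, k)]
    else:          # foci on the new y'-axis
        foci = [(h, k - c), (h, k + c)]
    focal_sum = 2 * math.sqrt(max(p2, q2))
    return (h, k), (p2, q2), foci, focal_sum


# The example of the text: 4x^2 + 9y^2 - 8x + 18y - 23 = 0.
center, axes, foci, focal_sum = ellipse_from_equation(4, 9, -8, 18, -23)
print(center, axes, foci, focal_sum)
```

Note that the foci are returned in the old coordinates, so for the example they are $(1 \pm \sqrt{5}, -1)$, which are the points $(\pm\sqrt{5}, 0)'$ of the text.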

PROBLEM SET 8.2

Write equations for the ellipses described by the following conditions and sketch.

1. Foci at $(\pm 1, 0)$; focal sum 4.
2. Foci at $(0, \pm 1)$; focal sum 4.
3. Foci at $(1, 2)$, $(1, 4)$; focal sum 4.
4. Foci at $(-1, -1)$, $(1, 1)$; focal sum 4.
5. Foci at $(-1, 1)$, $(1, -1)$; focal sum 4.
6. Foci at $(\pm 2, 0)$; focal sum 6.
7. Foci at $(0, \pm 2)$; focal sum 6.
8. Foci at $(-1, 1)$ and $(1, -1)$; focal sum 6.

Find the foci and the focal sum, and sketch, showing both sets of axes, in cases where more than one set is used.

9. $x^2/4 + y^2 = 1$
10. $x^2 + 9y^2 - 2x + 36y + 28 = 0$
11. $9x^2 + y^2 + 36x - 2y + 28 = 0$
12. $x^2 + 2y^2 + 3x + 4y - 6 = 0$ (This one does not "come out even.")
13. $4x^2 + y^2 = 1$
14. $x^2 + x + y^2 - 2y + 1 = 0$

15. Given an equation of the form
$$Ax^2 + By^2 + Cx + Ey + F = 0,$$
where $A$ and $B$ are both positive, show that the graph is (a) an ellipse, (b) a point, or (c) the empty set. (The same conclusion follows if $A$ and $B$ are both negative.)

16. A function $f$ is odd if $f(-x) = -f(x)$ for every $x$. Show that the graph of an odd function is symmetric about the origin.

17. a) Let $C$ be the graph of the sine function. Show that $C$ is symmetric about the origin.
b) Now show that $C$ is also symmetric about infinitely many other points. (Thus it may happen that an unbounded figure has more than one "center." In fact, there is a simpler example: a line is symmetric about each of its points, and so every point of a line is "a center" of the line. For this reason, we ordinarily use the word center only for bounded figures.)

18. a) Show that the graph of the cosine is symmetric about infinitely many points.
b) Show that the graph of the sine is symmetric about infinitely many lines.

19. Consider the infinite strip $R$ between the lines $y = 1$ and $y = -1$. That is,
$$R = \{(x, y) \mid -\infty < x < \infty,\ -1 \le y \le 1\}.$$
Show that $R$ is symmetric about infinitely many points, and find a simple description of the set $C$ consisting of all points which are "centers" of $R$.

20. Show that every cubic curve is symmetric about its point of inflection. Here by a cubic curve we mean the graph of an equation $y = ax^3 + bx^2 + cx + d$, with $a \ne 0$.

21. Suppose that in Theorem 3 of this section we drop the hypothesis that the two lines of symmetry are perpendicular. Would the resulting theorem be true? Why or why not?

*22. Given $0 < c < a$, as in the definition of an ellipse. Let $P = (x, y)$ be a point satisfying the equation
$$\frac{x^2}{a^2} + \frac{y^2}{a^2 - c^2} = 1.$$
Let $F = (-c, 0)$, $F' = (c, 0)$.

a) Show that
$$y^2 = \frac{a^2 - c^2}{a^2}(a^2 - x^2).$$

b) Show that
$$FP + F'P = \frac{1}{a}\sqrt{(a^2 + cx)^2} + \frac{1}{a}\sqrt{(a^2 - cx)^2}.$$

c) Show that $a^2 + cx > 0$. (There are two cases to consider, $x \ge 0$ and $x < 0$. Remember that $0 < c < a$, and use the fact that $|x| \le a$.)

d) Show that $a^2 - cx > 0$.

e) Show that $FP + F'P = 2a$. (This completes the proof of Theorem 5.)

8.3

THE HYPERBOLA

Given $0 < a < c$, and the points $F$ and $F'$, with $FF' = 2c$. Let $C$ be the graph of the condition

$$FP - F'P = \pm 2a \qquad (a < c).$$

The curve $C$ is called the hyperbola with foci $F$, $F'$ and focal difference $2a$. The figure shows what a hyperbola looks like, but the reasons for this appearance of the graph are not obvious; the only thing that is easy to see, on the basis of the definition, is that the hyperbola is symmetric about each of the two perpendicular lines. The first step in our investigation of hyperbolas is to take the axes in a convenient position, as shown above, with $F = (-c, 0)$ and $F' = (c, 0)$, and get an equation for the curve.

$$\begin{aligned}
& FP - F'P = \pm 2a \\
\Leftrightarrow\quad & FP = F'P \pm 2a \\
\Leftrightarrow\quad & \sqrt{(x + c)^2 + y^2} = \sqrt{(x - c)^2 + y^2} \pm 2a \\
\Rightarrow\quad & x^2 + 2cx + c^2 + y^2 = x^2 - 2cx + c^2 + y^2 \pm 4a\sqrt{(x - c)^2 + y^2} + 4a^2 \\
\Leftrightarrow\quad & cx - a^2 = \pm a\sqrt{(x - c)^2 + y^2} \\
\Rightarrow\quad & c^2x^2 - 2a^2cx + a^4 = a^2x^2 - 2a^2cx + a^2c^2 + a^2y^2 \\
\Leftrightarrow\quad & (c^2 - a^2)x^2 - a^2y^2 = a^2(c^2 - a^2) \\
\Leftrightarrow\quad & \frac{x^2}{a^2} - \frac{y^2}{c^2 - a^2} = 1.
\end{aligned}$$
---

Thus every point $P = (x, y)$ of the hyperbola satisfies the final equation. As in the case of the ellipse, it can be shown conversely that every point on the graph of the final equation is on the hyperbola. (See Problem 32 below.) Since $c^2 > a^2$, we may let
$$b^2 = c^2 - a^2.$$
This substitution gives the standard form of the equation:
$$\frac{x^2}{a^2} - \frac{y^2}{b^2} = 1.$$
And we can sum up as follows:

Theorem 1. The graph of the equation
$$\frac{x^2}{a^2} - \frac{y^2}{b^2} = 1$$
is the hyperbola with foci at $(\pm c, 0)$ (where $c = \sqrt{a^2 + b^2}$) and focal difference $2a$.

We shall use our equation to justify the sketch which we gave at the outset.

1) No point of the hyperbola lies between the lines $x = -a$ and $x = a$. The reason is as follows. Solving for $y$, we get
$$y = \pm\frac{b}{a}\sqrt{x^2 - a^2}.$$
Therefore the hyperbola is the union of the graphs of two functions
$$f(x) = \frac{b}{a}\sqrt{x^2 - a^2} \qquad\text{and}\qquad g(x) = -\frac{b}{a}\sqrt{x^2 - a^2},$$
and neither of these functions is defined for $-a < x < a$.

2) The curve is symmetric about each of the coordinate axes. This is easy to see
algebraically. For each point (x, y), the symmetric point across the x-axis is (x, -y);
and if (x, y) is on the hyperbola, then so also is (x, -y). Similarly, if (x, y) is on the
curve, then so also is (-x, y); and so the hyperbola is symmetric about the y-axis.

3) The hyperbola is unbounded in both the x- and y-directions. Obviously $f(x)$ and $g(x)$ are defined whenever $|x| \ge a$. As $x \to \infty$, $f(x) \to \infty$ and $g(x) \to -\infty$. And as $x \to -\infty$, $f(x) \to \infty$ and $g(x) \to -\infty$.

It remains to discuss the two lines which the curve seems to be getting close to when both $x$ and $y$ become numerically large. The behavior of the hyperbola relative to these lines seems to be similar to that of the curve
$$y = f(x) = \frac{1}{x}$$
relative to the coordinate axes. The coordinate axes are called asymptotes of this curve. By this we mean, roughly speaking, that points of the curve, far from the origin, in the appropriate directions, are close to the axes. We want to extend this idea to cases in which the asymptote is neither horizontal nor vertical.

As $x \to \infty$, the distance from the point $P = (x, y)$ to the x-axis approaches 0. (This distance is $MP = |y| = |1/x|$.) We shall take this property as our definition of an asymptote. That is, a line $L$ is an asymptote of a function-graph if the distance from the line to the point $(x, f(x))$ approaches 0 as $x \to \infty$, or as $x \to -\infty$.

It is evident that the x-axis is an asymptote of the graph of $f(x) = 1/x$ under this definition. In fact, the x-axis is an asymptote in both the positive and negative directions. We also say that a line $L$ is an asymptote of a curve $C$ if $C$ contains a function-graph which has $L$ as an asymptote. In the case of $y = 1/x$, we also have
$$x = g(y) = 1/y;$$
thus the curve, looked at sidewise, is still a function-graph, and has the y-axis as an asymptote, in both the positive and negative directions. This is shown in the left-hand figure below.

[Figure: the graph of $y = f(x) = 1/x$ ($x \ne 0$), also regarded as $x = g(y) = 1/y$ ($y \ne 0$), with $\lim MP = 0$ as $y \to \infty$ and as $y \to -\infty$.]

We return to our hyperbola. In the last figure on the preceding page, the slope of the segment from the origin to the point $P = (x, y)$ is
$$m(x) = \frac{y}{x} = \frac{1}{x}\cdot\frac{b}{a}\sqrt{x^2 - a^2} = \frac{b}{a}\sqrt{1 - \frac{a^2}{x^2}}.$$
Obviously
$$\lim_{x \to \infty} m(x) = \frac{b}{a},$$
and this suggests that the line $y = bx/a$, or $x/a - y/b = 0$, is an asymptote of the part of the curve that lies in the first quadrant. If we show this, then it will follow by symmetry that the lines $x/a \pm y/b = 0$ are asymptotes of the curve in all four quadrants.

Thus we need to show that $\lim_{x \to \infty} MP = 0$. Since $MP < NP$, it will be sufficient to show that $\lim NP = 0$. This can be done by an algebraic trick.
$$\begin{aligned}
NP &= \frac{b}{a}x - \frac{b}{a}\sqrt{x^2 - a^2} = \frac{b}{a}\left(x - \sqrt{x^2 - a^2}\right) \\
&= \frac{b}{a}\left(x - \sqrt{x^2 - a^2}\right)\cdot\frac{x + \sqrt{x^2 - a^2}}{x + \sqrt{x^2 - a^2}} \\
&= \frac{b}{a}\cdot\frac{a^2}{x + \sqrt{x^2 - a^2}}.
\end{aligned}$$
Obviously $NP \to 0$ as $x \to \infty$. Therefore $MP \to 0$, which was to be proved. This gives the following theorem.


Theorem 2. The lines
$$\frac{x}{a} \pm \frac{y}{b} = 0$$
are asymptotes of the hyperbola
$$\frac{x^2}{a^2} - \frac{y^2}{b^2} = 1.$$
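
The limit used in this proof can be watched numerically. The following Python sketch is an added illustration, not part of the text; the function names are hypothetical and the values a = 3, b = 2 are sample choices. It evaluates the vertical gap $NP$ between the line $y = bx/a$ and the upper branch of the hyperbola, both directly and in the rationalized form obtained by the algebraic trick.

```python
import math


def vertical_gap(a, b, x):
    """Vertical distance NP between the line y = (b/a) x and the branch
    y = (b/a) sqrt(x^2 - a^2) of the hyperbola, for x >= a."""
    return (b / a) * (x - math.sqrt(x * x - a * a))


def vertical_gap_rationalized(a, b, x):
    """The same quantity after the algebraic trick:
    (b/a) * a^2 / (x + sqrt(x^2 - a^2)) = a*b / (x + sqrt(x^2 - a^2))."""
    return a * b / (x + math.sqrt(x * x - a * a))


a, b = 3.0, 2.0   # sample values chosen for illustration
for x in (3.0, 10.0, 100.0, 1000.0):
    print(x, vertical_gap(a, b, x), vertical_gap_rationalized(a, b, x))
```

The two columns should agree up to rounding and shrink toward 0 as $x$ grows; the rationalized form makes the reason visible, since its denominator grows without bound while its numerator stays fixed.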

You can sketch a hyperbola by drawing in the asymptotes and x-intercepts exactly, and then filling in the curve freehand.

For example, consider
$$\frac{x^2}{9} - \frac{y^2}{4} = 1.$$
The x-intercepts are at $x = \pm 3$, and the asymptotes are the lines
$$\frac{x}{3} \pm \frac{y}{2} = 0, \qquad\text{or}\qquad y = \pm\tfrac{2}{3}x.$$

A hyperbola whose asymptotes are perpendicular is called rectangular.

[Figure: a rectangular hyperbola with the asymptotes $y = x$ and $y = -x$.]

If such a hyperbola is in standard position, then the asymptotes must be the lines $x \pm y = 0$, and the equation must have the form
$$\frac{x^2}{a^2} - \frac{y^2}{a^2} = 1, \qquad\text{or}\qquad x^2 - y^2 = a^2.$$

If the foci are on the y-axis, at the points $(0, \pm c)$, then the equation of the hyperbola takes the form
$$\frac{y^2}{a^2} - \frac{x^2}{c^2 - a^2} = 1.$$
It follows that the graph of the condition
$$\frac{x^2}{a^2} - \frac{y^2}{b^2} = \pm 1$$
is the union of two hyperbolas with the same asymptotes. These are called conjugate hyperbolas.


PROBLEM SET 8.3


Sketch the graphs of the following equations.
l.

x2 - 4y2

4. y2 - 4x2

7. -x2 + 9y2

2. y2 - 4x2

-4

10. 25 y2 - 4x2

5. 9x2 - 4y2

8.

6. 9x2 - 4y2

36

-9x2 + y2

3. x2 - 4y2

9. 25x2 - 4y2

-4

-36

100

100

Derive equations for the hyperbolas determined by the following conditions, and sketch.

11. Foci at $(\pm 2, 0)$; focal difference 3.
12. Foci at (2, 2); focal difference 3.
13. Foci at $(0, 0)$ and $(0, 4)$; focal difference 3.
14. Foci at $(0, \pm 2)$; focal difference 3.
15. Foci at (1, 1); focal difference 2.
16. Foci at $(\pm 2, 0)$; passing through the point $(3, 4)$.
17. Foci at $(\pm 2, 0)$; focal difference 2.
18. Foci at $(\pm 3, 0)$; focal difference 4.
19. Foci at $(0, \pm 3)$; focal difference 4.
20. Foci at $(\pm 3, 0)$; passing through the point $(5, 5)$.

21. Given $F$, $F'$, and $a$, as for a hyperbola in standard position. What is the graph of the condition $FP - F'P = 2a$? How about the graph of $FP - F'P = -2a$?

22. Find a rectangular hyperbola in standard position (with asymptotes $x + y = 0$ and $x - y = 0$) passing through the point $(5, 3)$.

Investigate the graphs of the following equations. In each case, find all asymptotes.
23. (x2 - y2 - 1)2
25.

x2y2 - xy +

=0
=

24. (x2 - y2)2


26. y

27. Let D be the line x

I
------

(x - l)(x - 2)

-1, let F be the origin, and for each point P let DP be the

perpendicular distance between D and P. Let C be the graph of the condition


FP

DP=
What sort of curve is this?

Sketch.

28. Let F and D be as in the preceding problem, and let C' be the graph of the condition
FP
DP
What sort of curve is this?

Sketch.

29. Let $G$ be the set of points $P$ such that $CP = 2DP$, where $C$ is the circle $x^2 + y^2 = 1$ and $D$ is the line $x = 4$. What sort of figure is $G$? Discuss and sketch.

30. The following passage occurs in the U.S. Internal Revenue Act of 1964.
"... There shall be allowed as a deduction moving expenses paid ... in connection with the commencement of work by the taxpayer ... at a new principal place of work ... [However,] no deduction shall be allowed ... unless ... the taxpayer's new principal place of work ... is at least 20 miles farther from his former residence than was his former principal place of work ..."
Give a sketch, showing what this means. Your sketch should show (a) the former residence, (b) the former place of work, and (c) the region in which the new place of work must lie, for the moving expenses to be deductible. (The author is indebted, for this problem, to Dr. Henry Pollak, of the Bell Telephone Laboratories.)
31. The region between two conjugate hyperbolas stretches out infinitely far, in each of four directions. Find out whether the area of such a region is finite.

*32. Given $0 < a < c$, as in the definition of a hyperbola. Let $P = (x, y)$ be a point satisfying the equation
$$\frac{x^2}{a^2} - \frac{y^2}{c^2 - a^2} = 1.$$
Let $F = (-c, 0)$, $F' = (c, 0)$.

a) Show that
$$y^2 = \frac{c^2 - a^2}{a^2}(x^2 - a^2).$$

b) Show that
$$FP - F'P = \frac{1}{a}\sqrt{(cx + a^2)^2} - \frac{1}{a}\sqrt{(cx - a^2)^2}.$$

c) Show that, if $x \ge a$, then
$$cx + a^2 > 0, \qquad cx - a^2 > 0, \qquad\text{and}\qquad FP - F'P = 2a.$$

d) Show that if $x \le -a$, then
$$cx + a^2 < 0, \qquad cx - a^2 < 0, \qquad\text{and}\qquad FP - F'P = -2a.$$

(This completes the proof of Theorem 1.)


8.4 THE GENERAL EQUATION OF
THE SECOND DEGREE. ROTATION OF AXES

An equation of the second degree in $x$ and $y$ is an equation of the form

$$Ax^2 + Bxy + Cy^2 + Dx + Ey + F = 0,$$

where at least one of the coefficients $A$, $B$, and $C$ is different from zero. The latter condition is to guarantee that the degree of the equation really is 2, rather than 1 or 0. We have found that all conic sections are graphs of equations of this type; and we shall now investigate the converse. That is, we propose to find out what sort of figure can be the graph of a second-degree equation. The possibilities that we have already found are

a) a circle,
b) a parabola,
c) an ellipse,
d) a hyperbola.

There are other possibilities, which we noted as exceptional cases when we were studying the equation

$$x^2 + y^2 + Dx + Ey + F = 0,$$

in connection with the circle. The graph of

$$x^2 + y^2 = 0$$

is a point; and the graph of

$$x^2 + y^2 + 1 = 0$$

is empty. (See Theorem 2 of Section 2.3.) Our list of possible graphs of second-degree equations must therefore include

e) a point, and
f) the empty set.

And this is not all. The graph of

$$y^2 = 0$$

is a line, namely, the x-axis. And the graph of

$$xy = 0$$

is the union of two lines, namely, the two axes. Similarly, the graph of

$$x^2 - y^2 = 0$$

is the union of two lines. The reason is that $x^2 - y^2 = (x - y)(x + y)$. This is $= 0$ if and only if either $x - y = 0$ or $x + y = 0$. Therefore a point $P = (x, y)$ is on the graph of $x^2 - y^2 = 0$ if and only if (i) $P$ is on the line $y = x$ or (ii) $P$ is on the line $y = -x$.

In this example, the lines intersect, but we may get the union of two parallel lines. The equation

$$x^2 - x = 0$$

is equivalent to

$$x(x - 1) = 0;$$

and the graph is therefore the union of the two parallel lines $x = 0$ and $x = 1$. Thus the graphs, for the general equation of the second degree, include

g) a line, and
h) the union of two lines, either parallel or intersecting.

We shall show that the eight possibilities that we have just listed are the only possibilities. The method will be to reduce the equation to a recognizable form by moving the axes. In some cases, this cannot be done by translation; we may also have to use rotation of the axes.

Suppose that we rotate the axes through an angle of measure $\theta$, getting a new pair of axes.

[Figure: the old axes and the new x'- and y'-axes, rotated through the angle $\theta$, with a point P.]

In the figure, $r$ is the distance $OP$; $P$ has coordinates $x$, $y$ in the old coordinate system, and coordinates $x'$, $y'$ in the new coordinate system. Evidently

$$x = r\cos\varphi, \qquad y = r\sin\varphi,$$
$$x' = r\cos(\varphi - \theta) = r\cos\varphi\cos\theta + r\sin\varphi\sin\theta,$$
$$y' = r\sin(\varphi - \theta) = r\sin\varphi\cos\theta - r\cos\varphi\sin\theta.$$

Therefore the new coordinates are given in terms of the old ones by the formulas

$$x' = x\cos\theta + y\sin\theta, \qquad y' = -x\sin\theta + y\cos\theta.$$

If we rotate the new axes through an angle of measure $-\theta$ we are back where we started. Therefore the old coordinates are expressed in terms of the new ones by the formulas

$$x = x'\cos(-\theta) + y'\sin(-\theta), \qquad y = -x'\sin(-\theta) + y'\cos(-\theta).$$

These give

$$x = x'\cos\theta - y'\sin\theta, \qquad y = x'\sin\theta + y'\cos\theta.$$

Theorem 1. In any second-degree equation, the xy-term can be eliminated by a rotation of axes.
Before going into the proof, let us try a simple example:

$$xy = 1.$$

To rotate the axes through an angle $\theta$, we should substitute

$$x = x'\cos\theta - y'\sin\theta, \qquad y = x'\sin\theta + y'\cos\theta. \tag{1}$$

The equation then becomes

$$(x'\cos\theta - y'\sin\theta)(x'\sin\theta + y'\cos\theta) = 1,$$

or

$$x'^2\sin\theta\cos\theta + x'y'(\cos^2\theta - \sin^2\theta) - y'^2\sin\theta\cos\theta = 1.$$

We want the x'y'-term to vanish. Thus we want

$$\cos^2\theta - \sin^2\theta = 0, \qquad\text{or}\qquad \cos 2\theta = 0;$$

and this will happen when

$$2\theta = \frac{\pi}{2} + n\pi, \qquad \theta = \frac{\pi}{4} + \frac{n\pi}{2}.$$

One value of $\theta$ is all we need, and so we take $\theta = \pi/4$, which gives

$$\sin\theta = \cos\theta = \frac{1}{\sqrt{2}} \qquad\text{and}\qquad \sin\theta\cos\theta = \tfrac{1}{2}.$$

Thus our new equation is

$$\frac{x'^2}{2} - \frac{y'^2}{2} = 1.$$

This is the equation of a rectangular hyperbola.

Let us now return to our general equation

$$Ax^2 + Bxy + Cy^2 + Dx + Ey + F = 0.$$

Making the usual substitution, to rotate the axes through $\theta$, we get

$$\begin{aligned}
& A(x'\cos\theta - y'\sin\theta)^2 + B(x'\cos\theta - y'\sin\theta)(x'\sin\theta + y'\cos\theta) \\
&\quad + C(x'\sin\theta + y'\cos\theta)^2 + D(x'\cos\theta - y'\sin\theta) \\
&\quad + E(x'\sin\theta + y'\cos\theta) + F = 0.
\end{aligned}$$

When we collect coefficients for the terms of various types, we get a new equation of the same form, like this:

$$A'x'^2 + B'x'y' + C'y'^2 + D'x' + E'y' + F' = 0.$$

Algebraically,

$$\begin{aligned}
A' &= A\cos^2\theta + B\sin\theta\cos\theta + C\sin^2\theta, \\
B' &= -2A\sin\theta\cos\theta + B(\cos^2\theta - \sin^2\theta) + 2C\sin\theta\cos\theta, \\
C' &= A\sin^2\theta - B\sin\theta\cos\theta + C\cos^2\theta, \\
D' &= D\cos\theta + E\sin\theta, \\
E' &= -D\sin\theta + E\cos\theta, \\
F' &= F.
\end{aligned}$$

For future reference, we have written down all of these, but for the moment, all we are interested in is $B'$: we want to find a $\theta$ that makes $B' = 0$. Simplifying trigonometrically, we get

$$B' = (C - A)\sin 2\theta + B\cos 2\theta.$$

There are now two cases:

1) If $A = C$, then $B' = B\cos 2\theta$. We must have $B \ne 0$, or there wouldn't be any xy-term in the original equation. Therefore $B' = 0$ when $\cos 2\theta = 0$, and $\cos 2\theta = 0$ when
$$\theta = \frac{\pi}{4}.$$
Thus a rotation through $\pi/4$ eliminates the xy-term whenever $A = C$.

2) If $A \ne C$, then we can divide by $A - C$. Therefore $B' = 0$ when
$$B\cos 2\theta = (A - C)\sin 2\theta, \qquad\text{or}\qquad \frac{B}{A - C} = \tan 2\theta.$$
Thus, to get $B' = 0$, we take
$$\theta = \frac{1}{2}\,\mathrm{Tan}^{-1}\frac{B}{A - C}.$$
This proves the theorem. (The theorem did not say that the coefficients in the new equation were easy to compute.)
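
The two cases of the proof, together with the formulas for $A', \ldots, F'$, translate directly into a short computation. The Python sketch below is an added illustration; the function names `elimination_angle` and `rotate_quadratic` are hypothetical, and `math.atan` is used for the principal value written $\mathrm{Tan}^{-1}$ in the text.

```python
import math


def elimination_angle(A, B, C):
    """An angle theta for which the rotated equation has no x'y'-term,
    following the two cases in the proof (B is assumed nonzero)."""
    if A == C:
        return math.pi / 4
    return 0.5 * math.atan(B / (A - C))


def rotate_quadratic(A, B, C, D, E, F, theta):
    """Coefficients A', B', C', D', E', F' after rotating the axes through theta."""
    s, c = math.sin(theta), math.cos(theta)
    A2 = A * c * c + B * s * c + C * s * s
    B2 = -2 * A * s * c + B * (c * c - s * s) + 2 * C * s * c
    C2 = A * s * s - B * s * c + C * c * c
    D2 = D * c + E * s
    E2 = -D * s + E * c
    return (A2, B2, C2, D2, E2, F)


# The example xy = 1, written as xy - 1 = 0.
theta = elimination_angle(0, 1, 0)
print(rotate_quadratic(0, 1, 0, 0, 0, -1, theta))
```

Up to rounding error in the $x'y'$ coefficient, the example should reproduce $x'^2/2 - y'^2/2 - 1 = 0$, the rectangular hyperbola found above.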


Theorem 2. The graph of a second-degree equation is (a) a circle, (b) a parabola, (c) an ellipse, (d) a hyperbola, (e) a point, (f) the empty set, (g) a line, or (h) the union of two lines (either parallel or intersecting).

Proof. By the preceding theorem, we can assume that there is no xy-term. The equation then has the form

$$Ax^2 + Cy^2 + Dx + Ey + F = 0.$$

We now need to discuss various cases.

1) Suppose that neither $A$ nor $C$ is 0. We can then write

$$A\left(x^2 + \frac{D}{A}x\right) + C\left(y^2 + \frac{E}{C}y\right) = -F,$$

and complete the square twice to get

$$A\left(x + \frac{D}{2A}\right)^2 + C\left(y + \frac{E}{2C}\right)^2 = -F + \frac{D^2}{4A} + \frac{E^2}{4C},$$

which has the form

$$Ax'^2 + Cy'^2 = F'.$$

Here we have translated the axes, letting

$$x' = x + \frac{D}{2A}, \qquad y' = y + \frac{E}{2C}.$$

Since $A \ne 0$, we can divide by $A$, getting

$$x'^2 + C'y'^2 = F'' \qquad (C' = C/A \ne 0).$$

There are six possibilities to be considered. For each of these cases, we have indicated on the right what sort of figure the graph is.

$C' > 0$, $F'' > 0$: a circle or ellipse
$C' > 0$, $F'' = 0$: a point
$C' > 0$, $F'' < 0$: the empty set
$C' < 0$, $F'' > 0$: a hyperbola
$C' < 0$, $F'' = 0$: two intersecting lines
$C' < 0$, $F'' < 0$: a hyperbola (with foci on the new y-axis)

2) Suppose that $C = 0$. The equation then has the form

$$Ax^2 + Dx + Ey + F = 0,$$

where $A \ne 0$, because the degree is 2. We divide by $A$, getting

$$x^2 + D'x + E'y + F' = 0;$$

and then we complete the square in $x$, so as to eliminate the linear term in $x$. This gives an equation of the form

$$x'^2 + F'' = E'y.$$

For $E' \ne 0$, this is a parabola. For $E' = 0$, the equation $x'^2 = -F''$ gives one line, two lines, or the empty set.

3) Suppose that $A = 0$. This is exactly like Case 2; we interchange $x$ and $y$, and proceed as before. This completes the proof of the theorem.
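
The six possibilities of Case 1 amount to a lookup on two signs. As an added illustration (the function name `classify_central` and the returned strings are assumptions, not the author's notation), the table can be written as:

```python
def classify_central(C1, F2):
    """Graph of x'^2 + C1 * y'^2 = F2 with C1 != 0, following the six
    possibilities listed in Case 1 of the proof (hypothetical helper)."""
    if C1 == 0:
        raise ValueError("Case 1 requires C' != 0")
    if C1 > 0:
        if F2 > 0:
            return "circle or ellipse"
        return "point" if F2 == 0 else "empty set"
    # From here on C1 < 0.
    if F2 > 0:
        return "hyperbola"
    if F2 == 0:
        return "two intersecting lines"
    return "hyperbola (foci on the new y-axis)"


# x'^2 - y'^2 = 0 is the pair of lines y' = x' and y' = -x'.
print(classify_central(-1, 0))
```

Cases 2 and 3 (one of the squared terms missing) are not covered by this sketch; they lead to a parabola, one line, two parallel lines, or the empty set, as in the proof.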


It is easy to compute the new coefficients produced by a translation of axes. For a rotation, the new coefficients are expressed in terms of $\sin\theta$ and $\cos\theta$, and $\theta$ is defined by the equation
$$\theta = \frac{1}{2}\,\mathrm{Tan}^{-1}\frac{B}{A - C}.$$
Thus we want to express $\sin\theta$ and $\cos\theta$ in terms of $\tan 2\theta$ $(= B/(A - C))$ for the case where
$$-\frac{\pi}{2} < 2\theta < \frac{\pi}{2}.$$
When $2\theta$ is in the first or fourth quadrant, $\cos 2\theta > 0$, and $\sin 2\theta$ has the same sign as $\tan 2\theta$.

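[Figure: right triangles with legs 1 and $k$ and hypotenuse $\sqrt{1 + k^2}$, drawn in the first and fourth quadrants.]

In the figure,
$$k = \tan 2\theta = \frac{B}{A - C}.$$
Therefore
$$\cos 2\theta = \frac{1}{\sqrt{1 + k^2}}.$$
The half-angle formulas are
$$\cos\frac{x}{2} = \pm\sqrt{\frac{1 + \cos x}{2}}, \qquad \sin\frac{x}{2} = \pm\sqrt{\frac{1 - \cos x}{2}}.$$
For the present case, these give
$$\cos\theta = \sqrt{\frac{1 + \cos 2\theta}{2}}, \qquad \sin\theta = \pm\sqrt{\frac{1 - \cos 2\theta}{2}},$$
where
$$\cos 2\theta = \frac{1}{\sqrt{1 + k^2}}$$
and where the sign in the formula for $\sin\theta$ is the same as the sign of $k = \tan 2\theta$.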
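For example, consider
$$3x^2 + 2xy + y^2 = 1.$$
Here
$$A = 3, \qquad B = 2, \qquad C = 1,$$
and
$$k = \frac{B}{A - C} = \frac{2}{3 - 1} = 1.$$
Therefore
$$\cos 2\theta = \frac{1}{\sqrt{1 + k^2}} = \frac{1}{\sqrt{2}}.$$
Hence
$$\cos\theta = \sqrt{\frac{1 + 1/\sqrt{2}}{2}} = \sqrt{\frac{2 + \sqrt{2}}{4}}$$
and
$$\sin\theta = \sqrt{\frac{1 - 1/\sqrt{2}}{2}} = \sqrt{\frac{2 - \sqrt{2}}{4}}.$$
(In the second formula, $\sin\theta > 0$ because $k > 0$.) Therefore
$$\cos^2\theta = \frac{2 + \sqrt{2}}{4}, \qquad \sin^2\theta = \frac{2 - \sqrt{2}}{4}, \qquad \sin\theta\cos\theta = \frac{\sqrt{2}}{4}.$$
The new equation is
$$A'x'^2 + C'y'^2 = 1,$$
where
$$A' = A\cos^2\theta + B\sin\theta\cos\theta + C\sin^2\theta = 3\cdot\frac{2 + \sqrt{2}}{4} + 2\cdot\frac{\sqrt{2}}{4} + \frac{2 - \sqrt{2}}{4} = \sqrt{2} + 2$$
and
$$C' = A\sin^2\theta - B\sin\theta\cos\theta + C\cos^2\theta = 3\cdot\frac{2 - \sqrt{2}}{4} - 2\cdot\frac{\sqrt{2}}{4} + \frac{2 + \sqrt{2}}{4} = -\sqrt{2} + 2.$$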
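
The half-angle route avoids computing $\theta$ itself. The following Python sketch is an added illustration with a hypothetical helper name `sin_cos_from_k`; it assumes $A \ne C$ and $-\pi/2 < 2\theta < \pi/2$, as in the text, and recomputes the coefficients of the example $3x^2 + 2xy + y^2 = 1$.

```python
import math


def sin_cos_from_k(k):
    """sin(theta) and cos(theta) from k = tan(2 theta) = B / (A - C),
    using the half-angle formulas with -pi/2 < 2 theta < pi/2."""
    cos2t = 1.0 / math.sqrt(1.0 + k * k)
    cos_t = math.sqrt((1.0 + cos2t) / 2.0)
    # sin(theta) takes the same sign as k = tan(2 theta).
    sin_t = math.copysign(math.sqrt((1.0 - cos2t) / 2.0), k)
    return sin_t, cos_t


# The example 3x^2 + 2xy + y^2 = 1.
A, B, C = 3.0, 2.0, 1.0
s, c = sin_cos_from_k(B / (A - C))
A_new = A * c * c + B * s * c + C * s * s
C_new = A * s * s - B * s * c + C * c * c
print(A_new, C_new)
```

The printed values should agree, up to rounding, with $2 + \sqrt{2}$ and $2 - \sqrt{2}$; their sum is $4 = A + C$.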

PROBLEM SET 8.4

In these problems, when you are asked to investigate an equation, you should find out what sort of figure the graph is, and sketch. If the graph is a conic section, you should also find the coefficients in the standard form.

1. Investigate

x - xy

(Here it is easier to translate first and rotate second.

Sketch, showing all three sets

of axes.)
2.

Investigate

3.

Investigate

x2 - xy

4.

xy -

1.

- 2y

0.

Investigate

5 . . Investigate
6. Investigate
7.

2xy - y2 + 2

0.

x2 + 2xy + y2 + 2x + 2y + l = 0.

x2 + 4xy + 4y2 + 4x + 8y + 3

0.

Show that, under a rotation of axes,
$$A' + C' = A + C \qquad\text{and}\qquad F' = F.$$
We express this by saying that $A + C$ and $F$ are invariant under rotation.

8. a) Given the general equation of the second degree. Let $A_\theta$, $B_\theta$, $C_\theta$ be the new coefficients, when the axes are rotated through an angle of measure $\theta$. Thus $A_\theta$, $B_\theta$, $C_\theta$ are the $A'$, $B'$, $C'$, ... of the text; and so
$$\begin{aligned}
A_\theta &= A\cos^2\theta + B\sin\theta\cos\theta + C\sin^2\theta, \\
B_\theta &= (C - A)\sin 2\theta + B\cos 2\theta, \\
C_\theta &= A\sin^2\theta - B\sin\theta\cos\theta + C\cos^2\theta.
\end{aligned}$$
Show that the derivatives $A_\theta'$, $B_\theta'$, $C_\theta'$ satisfy the differential equations
$$A_\theta' = B_\theta, \qquad B_\theta' = 2(C_\theta - A_\theta), \qquad C_\theta' = -B_\theta.$$

b) Show that the function
$$f(\theta) = B_\theta^2 - 4A_\theta C_\theta$$
is a constant. Thus we say that $B^2 - 4AC$ is invariant under rotation of axes. It may be of some interest to check this, in the cases where we have computed the new coefficients.
9. Given $x^2 + 2xy + 3y^2 + 4x + 5y + 6 = 0$, the axes are rotated so as to eliminate the xy-term. What are the possibilities for the coefficients of $x^2$ and $y^2$, in the new equation?
10.

Same question, for the equation


x2 + 2xy + 5y2

11.

10

0.

Same question for

\[ 4x^2 + \sqrt{3}\,xy + y^2 + 2x + 3 = 0. \]

12. a) Let $D$ be a line, let $F$ be a point not on $D$, and let $e$ be a positive number. Let $G$ be the set of all points $P$ such that

\[ \frac{FP}{DP} = e, \]

where $DP$ is the perpendicular distance from $D$ to $P$. $G$ is called the conic section with directrix $D$, focus $F$, and eccentricity $e$. Show that $G$ is (a) an ellipse if $e < 1$, (b) a parabola if $e = 1$, and (c) a hyperbola if $e > 1$.

b) Is a circle a conic section, in the sense defined in Problem 12(a)? Why or why not?
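The invariance statements in Problems 7 and 8 can be spot-checked numerically before they are proved. The following is only a sketch: the helper name `rotate_coefficients` and the sample coefficients are assumptions made for illustration, and agreement at a few angles is evidence, not a proof.

```python
import math

def rotate_coefficients(A, B, C, theta):
    """Quadratic coefficients after rotating the axes through theta,
    using the formulas quoted in Problem 8."""
    s, c = math.sin(theta), math.cos(theta)
    A_t = A * c * c + B * s * c + C * s * s
    B_t = (C - A) * math.sin(2 * theta) + B * math.cos(2 * theta)
    C_t = A * s * s - B * s * c + C * c * c
    return A_t, B_t, C_t

# Sample coefficients (an arbitrary choice for illustration).
A, B, C = 1.0, 2.0, 3.0
results = []
for theta in (0.0, 0.3, 1.0, 2.5):
    A_t, B_t, C_t = rotate_coefficients(A, B, C, theta)
    trace = A_t + C_t                      # should stay equal to A + C
    discriminant = B_t ** 2 - 4 * A_t * C_t  # should stay equal to B^2 - 4AC
    results.append((trace, discriminant))
    print(f"theta={theta}: A+C={trace:.12f}, B^2-4AC={discriminant:.12f}")
```

Both printed columns should agree with $A + C$ and $B^2 - 4AC$ for the unrotated equation, up to rounding.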

Paths and Vectors in a Plane

9.1 MOTION OF A PARTICLE IN A PLANE

To describe the motion of a particle in a plane $E$, we need to explain where the particle is, at each time $t$ on a certain interval $I$. Thus to each time $t$ on the given interval $I$ there corresponds a point $P(t)$; and the motion is described by a function

\[ P: I \to E. \]

For the motion shown in the figure, $I$ is the infinite interval $[0, \infty)$, and the initial point is the point $P(0) = (1, 1)$.

In general:

Definition. A plane path is a function

\[ P: I \to E, \]

where $I$ is an interval and $E$ is a plane.

The same idea applies more generally: a path in space is a function

\[ P: I \to S, \]

where $I$ is an interval and $S$ is space. In this chapter we shall be dealing only with plane paths, and so we shall refer to them for short simply as paths.
The locus of a path is the curve which is traced out by the moving point. More precisely:

Definition. Given a path

\[ P: I \to E, \]

the locus of $P$ is the set of all points $Q$ which are $= P(t)$ for some $t$ in $I$.

Briefly, the locus of $P$ is the image of $I$ under the function $P$. The locus is determined when the path is named, but given a locus, the path is not determined: the same curve can be traced out by a moving point in infinitely many ways.
We describe a path in a coordinate plane by defining two functions which give the coordinates of the moving point for each time $t$. For example, we might take

\[ x = f(t) = 4t, \qquad y = g(t) = 8t^2. \]

Here $I = (-\infty, \infty)$, and $P(t) = (4t, 8t^2)$. At $t = 0$, $P(t) = (0, 0)$. As $t$ increases, starting from 0, both $x$ and $y$ increase, but $y$ increases faster. In fact, the locus of this path is a parabola. To see this, we observe that from the first equation, $t = x/4$. Substituting in the second equation, we get

\[ y = 8\left(\frac{x}{4}\right)^2 = \tfrac{1}{2}x^2. \]

Thus every point of the path lies on the graph of the equation $y = \tfrac{1}{2}x^2$. And it is easy to check, conversely, that every point $(x, y)$ of the parabola is on the path.

When a path is described by a pair of functions $x = f(t)$, $y = g(t)$, the two functions are called the coordinate functions, and $t$ is called a parameter. Sometimes we can get a simple description of the locus of a path by writing an equation in $x$ and $y$. We then say that we have eliminated the parameter, getting a rectangular equation of the locus. Often this process is useful: a path may trace out a simple figure, such as a segment or a circle, in a complicated way; and when this happens we want to know it.

The process of getting rectangular equations for loci is often tricky. Consider, for example, the path $P$ described by the equations

\[ x = f(t) = t^2, \qquad y = g(t) = t^4. \]

Every point of this path lies on the parabola $y = x^2$. But the converse is not true. On the path, we always have $x \ge 0$, because $t^2 \ge 0$. Therefore the locus of $P$ is only half of the parabola.
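Eliminating the parameter can be checked by sampling. The sketch below, with hypothetical helper names `path_linear` and `path_even`, samples the two example paths and confirms that the first stays on $y = \tfrac12 x^2$ with $x$ of both signs, while the second stays on $y = x^2$ but never reaches negative $x$.

```python
def path_linear(t):
    """The path x = 4t, y = 8t^2."""
    return 4 * t, 8 * t * t

def path_even(t):
    """The path x = t^2, y = t^4."""
    return t * t, t ** 4

# Sample parameter values on [-3, 3].
ts = [k / 10 for k in range(-30, 31)]

on_half_x_squared = all(abs(y - 0.5 * x * x) < 1e-9 for x, y in map(path_linear, ts))
on_x_squared = all(abs(y - x * x) < 1e-9 for x, y in map(path_even, ts))

x_values_linear = [x for x, _ in map(path_linear, ts)]
x_values_even = [x for x, _ in map(path_even, ts)]

print("first path on y = x^2/2:", on_half_x_squared, "smallest x:", min(x_values_linear))
print("second path on y = x^2:", on_x_squared, "smallest x:", min(x_values_even))
```

A rectangular equation therefore tells us which curve contains the locus, but the range of the coordinate functions still has to be examined separately.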

[Figure: the locus of P in the xy-plane, with the point P(0) marked.]

We are always free to regard a parameter as representing time, and in many physical problems, this is what the parameter means. But it often makes equally good sense to regard the parameter as the measure of an angle. Consider, for example,

\[ x = \cos t, \qquad y = \sin t. \]

These functions describe uniform motion around a circle. Here we may regard the parameter as the measure of an angle, and write

\[ x = \cos\theta, \qquad y = \sin\theta \]

to describe the same path.

Somewhat similar looking paths have ellipses as their loci. For example, consider

\[ x = a\cos\theta, \qquad y = b\sin\theta. \]

Here the locus of $P$ is an ellipse:

\[ \frac{x}{a} = \cos\theta, \qquad \frac{y}{b} = \sin\theta, \qquad \frac{x^2}{a^2} + \frac{y^2}{b^2} = \cos^2\theta + \sin^2\theta = 1. \]

Investigating further, we see what values of $\theta$ correspond to what points of the elliptical locus. We draw circles of radii $a$ and $b$, with centers at the origin, and construct $\angle\theta$ in standard position.

In the figure,

\[ Q = (a\cos\theta, a\sin\theta), \qquad R = (b\cos\theta, b\sin\theta). \]

Therefore

\[ P = P(\theta) = (a\cos\theta, b\sin\theta). \]

Following the scheme of the above figure, using drawing instruments, you can plot as many points of the ellipse as you want to, without making any numerical calculations. The same idea is used in the construction of a drawing instrument called the ellipsograph, which can be adjusted so as to draw the ellipse with any pair of semiaxes $a$, $b$.
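The two-circle construction translates directly into a computation: take the $x$-coordinate from the point $Q$ on the circle of radius $a$ and the $y$-coordinate from the point $R$ on the circle of radius $b$. This is a minimal sketch; the function name `ellipse_point` and the sample semiaxes are assumptions for illustration.

```python
import math

def ellipse_point(a, b, theta):
    """Point P(theta) built from the circles of radii a and b."""
    q = (a * math.cos(theta), a * math.sin(theta))  # Q on the circle of radius a
    r = (b * math.cos(theta), b * math.sin(theta))  # R on the circle of radius b
    return q[0], r[1]  # x from Q, y from R

a, b = 3.0, 2.0
points = [ellipse_point(a, b, k * math.pi / 6) for k in range(12)]

# Each point should satisfy x^2/a^2 + y^2/b^2 = 1 up to rounding.
residuals = [abs(x * x / a ** 2 + y * y / b ** 2 - 1) for x, y in points]
print("largest residual:", max(residuals))
```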
PROBLEM SET 9.1
Investigate the paths described by the following pairs of coordinate functions, sketch the loci, and label a few points as $P(0)$, $P(\pi/4)$, and so on, so as to indicate the way in which the moving point traverses the locus.


1. $x = \sec\theta$, $y = \tan\theta$
2. $x = \cos\theta$, $y = \cos^2\theta$
3. $x = 2\cos\theta$, $y = \sin\theta$
4. $x = \cos^2\theta$, $y = \sin^2\theta$
5. $x = t^3$, $y = |t^3|$
(Check that not only $f(t) = t^3$ but also $g(t) = |t^3|$ have continuous derivatives. Thus a moving point can go smoothly around a sharp corner, if only it does so slowly enough.)
6. $x = \sec^2\theta$, $y = \tan^2\theta$
7. $x = \sec\theta$, $y = \cos\theta$
8. $x = \csc\theta$, $y = \cot\theta$
9. $x = \sin\theta$, $y = |\sin\theta|$
10. $x = t^6$, $y = t^4$

[Figures for Problems 11 and 12: a left-hand and a right-hand figure.]

11. In the left-hand figure above, $\theta$ ranges over the open interval $(0, \pi)$, $OR = b$, and $QP$ is a constant $a$. Find a parametric description of the path, and sketch the locus.

12. In the right-hand figure above, $OR = b$ as before, and $QP$ is a constant $a$. Find a parametric description of the path, and sketch the loci, showing the three cases $a < b$, $a = b$, $a > b$.

13. A circle of radius $a$ rolls without slipping inside a circle of radius $2a$. The initial position is shown on the left below; a later position is shown on the right. Observe that $\widehat{RQ} = 2a\theta$, $\widehat{PQ} = 2a\theta$, $\widehat{PQ} = a\phi$. Therefore $\phi = 2\theta$. Let

\[ S = (h, k) = (a\cos\theta, a\sin\theta). \]

[Figures: left, the initial position; right, a later position, with the moving axes x', y'.]

Then

\[ x' = a\cos(\theta - \phi), \qquad y' = a\sin(\theta - \phi), \]
\[ x = x' + h, \qquad y = y' + k. \]

Complete the discussion to get a parametric description of the path, and find out what the locus is. It will turn out that the figure on the right above is slightly misleading.

**14. If you solved the preceding problem correctly, you found that some of the machinery that you used was not necessary after all. But consider the case where the outer circle has radius $a$ and the inner circle has radius $b = a/4$. Find parametric equations for the path, and eliminate the parameter to get the rectangular equation

\[ x^{2/3} + y^{2/3} = a^{2/3}. \]

This curve is called a four-cusped hypocycloid.

*15. Show that if $f$ is differentiable for $x \ne a$, and $\lim_{x\to a} f'(x) = k$, then $f$ is differentiable at $a$, and $f'(a) = k$. [Hint: The theorem is conceptual, and the proof goes back to first principles. Start by writing out the hypothesis and conclusion in terms of the basic definitions of the statements (a) $\lim_{x\to a} f'(x) = k$ and (b) $f'(a) = k$.]

9.2 THE PARAMETRIC MEAN-VALUE THEOREM; L'HOPITAL'S RULE

Given a path described parametrically by a pair of coordinate functions

\[ x = f(t), \qquad y = g(t), \]

we may want to find the slope of the tangent line at the point corresponding to a particular $t$.

In the figure, we see the path; we want to find the slope of the tangent at $P$, if such a tangent exists.

Suppose that $P$ is the point corresponding to a certain $t$; and let $Q$ be the neighboring point corresponding to $t + \Delta t$. Let

\[ \Delta x = f(t + \Delta t) - f(t) \quad\text{and}\quad \Delta y = g(t + \Delta t) - g(t), \]

as indicated in the figure. Then the slope of the tangent at $P$ is

\[ m = \lim_{\Delta t \to 0} \frac{\Delta y}{\Delta x}, \]

if such a limit exists. Suppose now that $f$ and $g$ are differentiable, and that $f'(t) \ne 0$. Then we can write

\[ m = \lim_{\Delta t\to 0} \frac{\Delta y}{\Delta x} = \lim_{\Delta t\to 0} \frac{\Delta y/\Delta t}{\Delta x/\Delta t} = \lim_{\Delta t\to 0} \frac{[g(t+\Delta t) - g(t)]/\Delta t}{[f(t+\Delta t) - f(t)]/\Delta t} \]
\[ = \frac{\lim_{\Delta t\to 0} \{[g(t+\Delta t) - g(t)]/\Delta t\}}{\lim_{\Delta t\to 0} \{[f(t+\Delta t) - f(t)]/\Delta t\}} = \frac{g'(t)}{f'(t)}. \]

Thus we get the formula

\[ m = \frac{g'(t)}{f'(t)}. \]

This will be called the parametric slope formula. We have shown:

Theorem 1. Given a path defined by functions

\[ x = f(t), \qquad y = g(t). \]

If $f$ and $g$ are differentiable at $t$, and $f'(t) \ne 0$, then the path has a tangent at the corresponding point $P$, and the slope of the tangent is given by the formula

\[ m = m(t) = \frac{g'(t)}{f'(t)}. \]

An important case is the one in which $f'(t) \ne 0$ for $a < t < b$. Here $x = f(t)$ can never take on the same value twice, and so the locus of the path is the graph of a function $\phi$.
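The parametric slope formula is easy to try numerically. The sketch below approximates $f'(t)$ and $g'(t)$ by central differences (a numerical stand-in, not part of the theorem) for the earlier example $x = 4t$, $y = 8t^2$; the function name `parametric_slope` and the step size `h` are assumptions.

```python
def parametric_slope(f, g, t, h=1e-6):
    """Approximate m(t) = g'(t) / f'(t) using central differences."""
    df = (f(t + h) - f(t - h)) / (2 * h)
    dg = (g(t + h) - g(t - h)) / (2 * h)
    if df == 0:
        # The formula requires f'(t) != 0.
        raise ValueError("f'(t) is numerically zero; the slope formula does not apply")
    return dg / df

def f(t):
    return 4 * t

def g(t):
    return 8 * t * t

slope = parametric_slope(f, g, 0.5)
print("slope at t = 0.5:", slope)
```

At $t = 0.5$ the point is $(2, 2)$ on $y = \tfrac12 x^2$, where $dy/dx = x = 2$, so the printed value should be close to 2.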

[Figure: the graph of φ from P (t = a) to Q (t = b), with a tangent line of slope m = φ'(x̄).]

If $P$ and $Q$ are the endpoints of the graph, as in the figure, then the slope of the secant line through $P$ and $Q$ is

\[ \frac{g(b) - g(a)}{f(b) - f(a)}. \]

By the mean-value theorem, there is a point $\bar{x}$ where the derivative $\phi'(\bar{x})$ is the slope of the secant line. Thus

\[ \frac{g(b) - g(a)}{f(b) - f(a)} = \phi'(\bar{x}). \]

This number $\bar{x}$ must have come from somewhere. That is, there must be a $\bar{t}$ between $a$ and $b$ such that

\[ \bar{x} = f(\bar{t}). \]

It follows that

\[ \phi'(\bar{x}) = \frac{g'(\bar{t})}{f'(\bar{t})}. \]

Therefore

\[ \frac{g(b) - g(a)}{f(b) - f(a)} = \frac{g'(\bar{t})}{f'(\bar{t})} \qquad (a < \bar{t} < b). \]

What we have just proved is a parametric form of the mean-value theorem. The idea is that, if a function-graph is presented parametrically, then we can rewrite the mean-value theorem parametrically, expressing both the slope of the secant and the slope of the tangent in terms of the parameter.

Theorem 2 (The parametric mean-value theorem). Given two continuous functions $f$ and $g$, for $a \le t \le b$. If both functions are differentiable for $a < t < b$, and $f'(t) \ne 0$ for $a < t < b$, then

\[ \frac{g(b) - g(a)}{f(b) - f(a)} = \frac{g'(\bar{t})}{f'(\bar{t})} \]

for some $\bar{t}$ between $a$ and $b$.
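To see the parametric mean-value theorem at work, one can locate $\bar{t}$ numerically in a concrete case. The sketch below uses the assumed example $f(t) = t^2$, $g(t) = t^3$ on $[1, 2]$ (not taken from the text), where $f'(t) = 2t \ne 0$, and finds $\bar{t}$ by bisection.

```python
def f(t):
    return t * t

def g(t):
    return t ** 3

def f_prime(t):
    return 2 * t

def g_prime(t):
    return 3 * t * t

a, b = 1.0, 2.0
secant = (g(b) - g(a)) / (f(b) - f(a))  # slope of the secant line

def gap(t):
    """Tangent slope g'(t)/f'(t) minus the secant slope."""
    return g_prime(t) / f_prime(t) - secant

# For this example gap(a) < 0 < gap(b), so bisection applies.
lo, hi = a, b
for _ in range(60):
    mid = (lo + hi) / 2
    if gap(mid) < 0:
        lo = mid
    else:
        hi = mid
t_bar = (lo + hi) / 2

print("secant slope:", secant)
print("t_bar:", t_bar, "tangent slope there:", g_prime(t_bar) / f_prime(t_bar))
```

Here the tangent slope is $3t/2$, so $\bar{t}$ can also be found by hand as $14/9$; the bisection is only meant to show that such a $\bar{t}$ lies strictly between $a$ and $b$.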

This theorem takes a simple form when

\[ f(a) = g(a) = 0. \]

In this case, the theorem says that

\[ \frac{g(b)}{f(b)} = \frac{g'(\bar{t})}{f'(\bar{t})}, \]

for some $\bar{t}$ between $a$ and $b$. This has the following consequence: if $g'(t)/f'(t)$ approaches a limit $L$, as $t \to a$, then $g(t)/f(t)$ approaches the same limit $L$. That is, if

\[ f(a) = g(a) = 0 \quad\text{and}\quad \lim_{t\to a} \frac{g'(t)}{f'(t)} = L, \]

then

\[ \lim_{t\to a} \frac{g(t)}{f(t)} = L. \]

This is called l'Hopital's rule. Roughly, the reason why it holds true is as follows. Since $\bar{t}$ is between $t$ and $a$, we know that

\[ t \approx a \;\Rightarrow\; \bar{t} \approx a. \]

But

\[ \bar{t} \approx a \;\Rightarrow\; \frac{g'(\bar{t})}{f'(\bar{t})} \approx L, \]

because $g'(t)/f'(t) \to L$. Therefore

\[ t \approx a \;\Rightarrow\; \bar{t} \approx a \;\Rightarrow\; \frac{g(t)}{f(t)} = \frac{g'(\bar{t})}{f'(\bar{t})} \approx L. \]

Therefore

\[ t \approx a \;\Rightarrow\; \frac{g(t)}{f(t)} \approx L, \]

and so

\[ \lim_{t\to a} \frac{g(t)}{f(t)} = L. \]

It is very easy to express this idea in the form of an $\epsilon$-$\delta$ proof; all we do is to formalize our statements involving "$\approx$".

1) Hypothesis. For every $\epsilon > 0$ there is a $\delta > 0$ such that

\[ 0 < |t - a| < \delta \;\Rightarrow\; \left| \frac{g'(t)}{f'(t)} - L \right| < \epsilon. \]

2) Conclusion. For every $\epsilon > 0$ there is a $\delta > 0$ such that

\[ 0 < |t - a| < \delta \;\Rightarrow\; \left| \frac{g(t)}{f(t)} - L \right| < \epsilon. \]

We need to show that (1) $\Rightarrow$ (2). Given $\epsilon > 0$, as in (2), we take the $\delta > 0$ furnished by (1). For each $t$, let $\bar{t}$ be the $\bar{t}$ furnished by the parametric mean-value theorem. This is the $\delta$ that we need, in the following way:

\[ 0 < |t - a| < \delta \;\Rightarrow\; 0 < |\bar{t} - a| < \delta \;\Rightarrow\; \left| \frac{g'(\bar{t})}{f'(\bar{t})} - L \right| < \epsilon \;\Rightarrow\; \left| \frac{g(t)}{f(t)} - L \right| < \epsilon. \]

These fit together to give the desired conclusion.

In the above discussion, we assumed that $f(a)$ and $g(a)$ were both defined, and were $= 0$. It would have been sufficient, however, to suppose that

\[ \lim_{t\to a} f(t) = \lim_{t\to a} g(t) = 0. \]

If these relations hold, and $f$ and $g$ are not defined at $a$, then we define $f(a)$ and $g(a)$ to be 0; $f$ and $g$ are then continuous, and the discussion of the limit of $g/f$ goes through exactly as before.


Using $x$ instead of $t$, we get our theorem in the following form:

Theorem 3 (l'Hopital's rule, first form). If

\[ \lim_{x\to a} f(x) = \lim_{x\to a} g(x) = 0 \quad\text{and}\quad \lim_{x\to a} \frac{g'(x)}{f'(x)} = L, \]

then

\[ \lim_{x\to a} \frac{g(x)}{f(x)} = L. \]
Let us now look at some applications. Consider

\[ \lim_{x\to 0} \frac{\sin x}{x}. \]

This satisfies the conditions of l'Hopital's rule, with

\[ g(x) = \sin x, \qquad f(x) = x. \]

We investigate the quotient of the derivatives:

\[ \lim_{x\to 0} \frac{\cos x}{1} = 1. \]

Therefore

\[ \lim_{x\to 0} \frac{\sin x}{x} = 1. \]

This discussion does not supersede the geometric proof of the same statement, given in Section 4.2. The reason is that to apply l'Hopital's rule, we had to know the derivative of the sine, and to find the derivative of the sine we needed to know that $\lim_{x\to 0} [(\sin x)/x] = 1$. Moreover, if you know the derivative of the sine, you can remind yourself of what $\lim_{x\to 0} [(\sin x)/x]$ is, without using l'Hopital's rule. The point is that

\[ \lim_{x\to 0} \frac{\sin x}{x} = \lim_{x\to 0} \frac{\sin x - \sin 0}{x - 0} = \sin' 0, \]

by definition of $\sin' 0$. Since $\sin' = \cos$, and $\cos 0 = 1$, we get the answer immediately.

It is not an accident that in applying the first form of l'Hopital's rule, we sometimes find that we are merely solving a differentiation problem. The reason is that the formula used in the definition of the derivative is always an instance of the rule, whenever the function is differentiable. By definition,

\[ F'(x_0) = \lim_{x\to x_0} \frac{F(x) - F(x_0)}{x - x_0}. \]

The indicated limit on the right satisfies the conditions of Theorem 3, with

\[ g(x) = F(x) - F(x_0) \to 0, \qquad f(x) = x - x_0 \to 0, \]

as $x \to x_0$. Thus every differentiation problem is a problem of the sort that l'Hopital's rule deals with. The rule, of course, applies in many other cases; and it is the other cases that make it significant.

For example,

\[ \lim_{x\to 0} \frac{\sin^2 x + x}{e^x - 1} = \lim_{x\to 0} \frac{2\sin x\cos x + 1}{e^x} = 1 \]

by the rule; and here the rule is needed.

Often, the application of l'Hopital's rule requires the use of some preliminary device. For example, consider the possible limit

\[ \lim_{x\to 0} x\cot x. \]

Here we should start by writing

\[ \lim_{x\to 0} \frac{x\cos x}{\sin x} \]

and then use Theorem 3 (unless we can think of something simpler).
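A quick numerical look at these quotients near 0 is a useful sanity check on an application of the rule, though it proves nothing. In this sketch the function names are hypothetical, and `math.expm1` is used for $e^x - 1$ only to reduce rounding error for small $x$.

```python
import math

def quotient(x):
    """(sin^2 x + x) / (e^x - 1), the original quotient."""
    return (math.sin(x) ** 2 + x) / math.expm1(x)

def derivative_quotient(x):
    """(2 sin x cos x + 1) / e^x, the quotient of the derivatives."""
    return (2 * math.sin(x) * math.cos(x) + 1) / math.exp(x)

def x_cot_x(x):
    """x cot x, rewritten as (x cos x) / sin x."""
    return x * math.cos(x) / math.sin(x)

for x in (1e-1, 1e-2, 1e-3, 1e-4):
    print(x, quotient(x), derivative_quotient(x), x_cot_x(x))
```

All three columns should drift toward 1 as $x$ shrinks, in agreement with the limits obtained from Theorem 3.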


PROBLEM SET 9.2

Investigate the following indicated limits. (That is, calculate the ones which exist.)

1. limx cotx
x-o

cos2x - 1
4. lim
x2
x -o

7.
10.

e"'

- 1

Jim
.,_0 In (x + 1)
Jim x2 sec2 x
00-+rr/2

2.

sin2 e
lim
82

3.

o-o

o-o

5.

Jim
x-1

lnx

6.

-X -

. y2 - 2y + 4
8. ltm
y-2
y- 3
1

11. limx2 sin 2


x-o

sin3 e
lim
82

9.

x- I
lim -.,-e
- 1
x-1
Jim
.,-,,/2

12. Jim
x-rr

cos3x

x3 - 1

x4 - 1
. 2X - 1

Sill

The Parametric Mean-Value Theorem;

9.2

1 3.

1 6.

1 9.

21.

23.

*25.

ln2 x - 1

Jim

x2

x-e

Jim
e-rr/4
Jim

x-l

Jim
t-e

Jim

x-1

1 4.

sine - cose

1 7.

e - Tr/4

Jim

Jim x2 csc2 x
x-o

(t)
t

22.

e13

v( dt

24.

Jim x In x
CC-+O+

26.

*27. Jim x In (sin x cos x)

*28.

a;_.o+

29.

lim

x2 - 4x + 3

x-+1 X

2 - 3x + 2

- 1

[ J,"'
X

3x
.,. 20. Jim -1
x-oe

---

In

--

x-o

18.

391

tan x

Jim

1 5.

v:X

x-o

x2 - 1

In

sin x cos x - tan x

L'Hopital's Rule

[ f
[ l"'
sect

Jim
t-rr/2
Jim

csc x

x-rr

rr/2

]
]

Vl + sin3 t dt

rr/2

Vl + sin3 t dt

Jim x In sin x
X-+0+
Jim x In (cos2 x sin2 x)

x-o+

A circle starts off tangent to the $x$-axis at the origin. The circle then rolls, without slipping, along the $x$-axis. The point $P$ which started at the origin then traces out a path; this path is called a cycloid. (The same term cycloid is applied also to the locus of the path.) The parameter in the coordinate functions is the $\theta$ indicated in the figure. Sketch the locus, and calculate the coordinate functions of the path.

[Figure: the rolling circle, with the moving axes x', y' and the point P = (x, y) = (x', y')'.]

As the figure suggests, the easiest method is to use a "moving coordinate system," as in Problem 13 of Problem Set 9.1; we need to calculate the coordinates $(h, k)$ of the "moving origin" $O'$, and calculate $x'$ and $y'$ in terms of $\cos\phi$ and $\sin\phi$. (What is $\phi$?)

30. a) When a circle rolls on the inside of another circle, we get a hypocycloid. In the figure, the fixed circle has radius $a$ and the rolling circle has radius $b$. Calculate the coordinate functions, using $\theta$ as the parameter. The answer is

\[ x = f(\theta) = (a - b)\cos\theta + b\cos\frac{b - a}{b}\theta, \]
\[ y = g(\theta) = (a - b)\sin\theta + b\sin\frac{b - a}{b}\theta. \]

b) Get a rectangular equation for the locus, for the case $b = a/4$. Sketch.

31. When one circle rolls around the outside of another, the figure traced out is called an

epicycloid. Derive parametric equations for the epicycloid, using radius a for the fixed
circle and radius b for the moving circle. Use the same parameter e as in Problem 30(a).
32. Suppose that a railroad wheel rolls (without slipping) along a flat track. Find coordinate
functions for the path traced out by a point at the outer edge of the flange on the wheel.
In the figure below, the outer radius is b and the inner radius is a. Sketch the locus,
bearing in mind that it is not a function-graph; it has loops in it.
y

33.

Make the same modification in the definition of the epicycloid, as suggested by the
figure below, and sketch the curve. The fixed circle has radius a; the rolling wheel
has inner radius b and outer radius c.
y

*34. A path is regular if the coordinate functions $f$, $g$ are differentiable, and we never have $f'(t) = g'(t) = 0$ for the same $t$. Show that every chord of a regular path is parallel to the tangent at some intermediate point. Examples of this are as follows:

[Figures: regular paths with a chord and a parallel tangent.]

(Evidently the locus need not be a function-graph, and the chord may be vertical.)

*35. Given a path, with differentiable coordinate functions $f$, $g$. Show that, if the axes are rotated, then the coordinate functions $F$, $G$ that work for the new set of axes are also differentiable. Show that if $f'$ and $g'$ never vanish simultaneously, then $F'$ and $G'$ never vanish simultaneously.

9.3

OTHER FORMS OF L'HOPITAL'S RULE

The first form of l'Hopital's rule says that if

\[ \lim_{x\to a} f(x) = \lim_{x\to a} g(x) = 0, \quad\text{and}\quad \lim_{x\to a} \frac{g'(x)}{f'(x)} = L, \]

then

\[ \lim_{x\to a} \frac{g(x)}{f(x)} = L. \]

This can be generalized in three ways.

1) If $x \to \infty$ or $x \to -\infty$, instead of $x \to a$, it doesn't matter; the rule still holds.
2) If $g'(x)/f'(x) \to \infty$ or $\to -\infty$, as $x \to a$, the rule still holds.
3) If $f(x) \to \infty$ and $g(x) \to \infty$, instead of $f(x) \to 0$, $g(x) \to 0$, the rule still holds. Similarly if $f(x) \to -\infty$ and $g(x) \to -\infty$.

Thus, in the most general form of l'Hopital's rule, we have: (1) $x \to a$, $x \to \infty$, or $x \to -\infty$; (2) $g'(x)/f'(x) \to L$, $g'(x)/f'(x) \to \infty$, or $g'(x)/f'(x) \to -\infty$; and (3) $f(x), g(x) \to 0$, or $\to \infty$, or $\to -\infty$. Thus we have a grand total of 27 theorems, all of which are true. One of these has already been proved, and the only hard one among the others is the following.
Theorem 1 (The Northeast Theorem). If

\[ \lim_{x\to\infty} f(x) = \lim_{x\to\infty} g(x) = \infty, \quad\text{and}\quad \lim_{x\to\infty} \frac{g'(x)}{f'(x)} = L, \]

then

\[ \lim_{x\to\infty} \frac{g(x)}{f(x)} = L. \]

This is proved in Appendix H. Meanwhile we shall use it.

Example 1. To find

\[ \lim_{x\to\infty} \frac{\ln x}{x}, \]

we take derivatives and find

\[ \lim_{x\to\infty} \frac{1/x}{1} = 0. \]

By the Northeast theorem,

\[ \lim_{x\to\infty} \frac{\ln x}{x} = 0. \]

The case in which $g'(x)/f'(x) \to \infty$ causes no trouble.

Example 2. To find

\[ \lim_{x\to\infty} \frac{e^x}{x}, \]

we investigate

\[ \lim_{x\to\infty} \frac{e^x}{1} = \infty. \]

It follows, by one form of l'Hopital's rule, that

\[ \lim_{x\to\infty} \frac{e^x}{x} = \infty. \]

The theorem being applied here is the following.
Theorem 2. If

\[ \lim_{x\to\infty} f(x) = \lim_{x\to\infty} g(x) = \infty, \quad\text{and}\quad \lim_{x\to\infty} \frac{g'(x)}{f'(x)} = \infty, \]

then

\[ \lim_{x\to\infty} \frac{g(x)}{f(x)} = \infty. \]

The proof is easy, on the basis of the Northeast theorem; we merely investigate reciprocals. Since $g'(x)/f'(x) \to \infty$, we have

\[ \lim_{x\to\infty} \frac{f'(x)}{g'(x)} = 0. \]

By the Northeast theorem,

\[ \lim_{x\to\infty} \frac{f(x)}{g(x)} = 0. \]

And $f(x)/g(x) > 0$ when $x$ is large. Therefore

\[ \lim_{x\to\infty} \frac{g(x)}{f(x)} = \infty. \]

Example 3. Consider

\[ \lim_{x\to 0^+} \frac{\ln x}{1/x} = -\lim_{x\to 0^+} \frac{-\ln x}{1/x}. \]

Here the limit on the right takes the form $\infty/\infty$. Taking derivatives, we find

\[ \lim_{x\to 0^+} \frac{-1/x}{-1/x^2} = 0. \]

By one form of l'Hopital's rule, it follows that

\[ \lim_{x\to 0^+} \frac{-\ln x}{1/x} = 0, \]

and so the answer to the original problem is $-0 = 0$. The theorem being used here is the following.
Theorem 3. If

\[ \lim_{x\to a^+} f(x) = \lim_{x\to a^+} g(x) = \infty, \quad\text{and}\quad \lim_{x\to a^+} \frac{g'(x)}{f'(x)} = L, \]

then

\[ \lim_{x\to a^+} \frac{g(x)}{f(x)} = L. \]

This is not quite as easy as Theorem 2. Let

\[ y = \frac{1}{x - a}, \qquad x = a + \frac{1}{y}, \]

so that $y \to \infty$ as $x \to a^+$ and $x \to a^+$ as $y \to \infty$. Then

\[ \lim_{x\to a^+} \frac{g(x)}{f(x)} = \lim_{y\to\infty} \frac{g(a + 1/y)}{f(a + 1/y)}. \]

Taking derivatives, we find

\[ \lim_{y\to\infty} \frac{g'(a + 1/y)(-1/y^2)}{f'(a + 1/y)(-1/y^2)} = \lim_{y\to\infty} \frac{g'(a + 1/y)}{f'(a + 1/y)} = \lim_{x\to a^+} \frac{g'(x)}{f'(x)} = L. \]

The Northeast theorem now applies to

\[ \lim_{y\to\infty} \frac{g(a + 1/y)}{f(a + 1/y)}, \]

and tells us that this limit is $L$. Therefore

\[ \lim_{x\to a^+} \frac{g(x)}{f(x)} = L. \]

We have now discussed all the troublesome cases of l'Hopital's rule; once we have gotten this far, the rest of the derivations are routine. Hereafter, we shall use all forms of the rule without comment.
Sometimes we can apply the rule by taking logarithms. Consider

\[ \lim_{x\to 0^+} x^x = \;? \]

Let $\phi(x) = x^x$. Then

\[ \ln\phi(x) = x\ln x = \frac{\ln x}{1/x} = \frac{g(x)}{f(x)}. \]

Now

\[ \lim_{x\to 0^+} \frac{g'(x)}{f'(x)} = \lim_{x\to 0^+} \frac{1/x}{-1/x^2} = \lim_{x\to 0^+} (-x) = 0. \]

Therefore

\[ \lim_{x\to 0^+} \ln\phi(x) = 0, \quad\text{and}\quad \lim_{x\to 0^+} x^x = e^0 = 1. \]
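The logarithm device can be watched numerically: as $x \to 0^+$, the product $x \ln x$ should approach 0 and $x^x$ should approach 1. This is only an illustrative sketch with hypothetical function names; floating-point evaluation at small $x$ suggests the limit but does not establish it.

```python
import math

def x_ln_x(x):
    """ln(x^x) = x ln x, for x > 0."""
    return x * math.log(x)

def x_to_the_x(x):
    """x^x, for x > 0."""
    return x ** x

for x in (1e-1, 1e-3, 1e-6, 1e-9):
    print(x, x_ln_x(x), x_to_the_x(x))
```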

PROBLEM SET 9.3

2. Jim y (Tan-1 y - 7T/2)

3. Jim (In Jn x)/ v:X

5. Jim e-1/x'

6. Jim [(l/x)e-1/x2]

8. Jim (1 + Tan-1 o:)Tau-ia

9. Jim ( l + csc x)in' x

x-o+

x-o+

7.

Jim [(sin x)(In x)/x]

a--+ co

10. Jim (I + ax)1fx


x-o

11. Jim x3e-x

13. Jim (l - 2x)1fx

14. Jim
x-'"12

16. Jim (tan fJ)lano

17.

19. Jim (e-lfx'/x2)

20. Jim (e-lfx' /x3)


x---+O
23. Jim (l/x + n Jn x)

X-+00

o-o+

Jim

t-oo

))Tan-'
(
()1

x--o+

x)sin

x-o

30. Jim (I - sin

kx)csc

Tf'-o+

18.

Jim x2 ln x

x-o+

21. Jim x Jn x
x-o+
24. Jim ( l/x2 +
x-o+

/1

ln x)

0.

27.

Jim (1 - COS y)l-COS y

v-o+

29. Jim (1 + sin kx)csc x


x-o

31. Jim (v2)"


v-o+

x-o

32. Jim

15. Jim x2e-11x


a:- o+

28. Jim (1 + COS y)l-COS Y


v-o+

X---+00

1 + Tan 1 (x

22. Jim (1/x + In x)


x-o+
x-o+
25. Show that for every n, Jim (e-1/x' /xn)
26. Jim (sin

x-o+

12. lim xe-1/x

'
ww

33. Jim (I + tan 3 fJ)-CSC 9


o-o

*34. The $n$th derivative of a function $f$ is denoted by $f^{(n)}$. Let

\[ f(x) = \begin{cases} e^{-1/x^2} & \text{for } x \ne 0, \\ 0 & \text{for } x = 0. \end{cases} \]

Show that for each $n > 0$, $f$ has an $n$th derivative, for every $x$; and show that $f^{(n)}(0) = 0$ for every $n$. [Hint: You are not likely to find a manageable general formula for $f^{(n)}(x)$. But you ought to be able to show that for $x \ne 0$, $f^{(n)}(x)$ is always given by a formula of a certain form, involving certain constant coefficients; and you may be able to use this form, to show that $f^{(n)}(0) = 0$, without needing to determine the coefficients.]

9.4 POLAR COORDINATES

When we set up a rectangular coordinate system in a plane $E$, to every ordered pair $(x, y)$ of numbers there corresponds a point $P$ of $E$. Thus we have a function

\[ \{(x, y)\} \to E, \qquad (x, y) \mapsto P. \]

And the correspondence also works in reverse: when $P$ is named, $x$ and $y$ are determined. Thus a rectangular coordinate system gives us a one-to-one correspondence

\[ \{(x, y)\} \leftrightarrow E \]

between the ordered pairs of real numbers and the points of $E$.

We now consider another way of labeling points with pairs of numbers. Given two numbers $r$ and $\theta$, we first draw the ray which starts at the origin and has direction $\theta$. On the line containing this ray we set up a coordinate system, with the direction $\theta$ as the positive direction; and we let $P$ be the point with coordinate $r$. (This is equivalent to saying that the directed distance $OP$ is $r$.) We then say that $P$ has polar coordinates $(r, \theta)$.

For example, in the left-hand figure it looks as if $P_1$ has polar coordinates $(2, \pi/3)$, and $P_2$ has polar coordinates $(-2, \pi/3)$.
[Figures: points labeled with polar coordinates, including a point P_2(r, θ) with r < 0.]

Thus to every pair $(r, \theta)$ of numbers there corresponds a point $P$. But the correspondence does not work uniquely the other way: every point $P$ corresponds to infinitely many number pairs $(r, \theta)$. Thus, in the right-hand figure, the point $P$ with rectangular coordinates $(1, 1)$ has polar coordinates $(\sqrt{2}, \pi/4)$. But $P$ also has polar coordinates $(-\sqrt{2}, 5\pi/4)$. And this is not all; the possible polar coordinates for $P$ are

\[ (\sqrt{2}, \pi/4 + 2n\pi), \qquad (-\sqrt{2}, 5\pi/4 + 2n\pi), \]

where $n$ is any integer (positive, negative, or zero).
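The non-uniqueness of polar coordinates is easy to confirm by computation. The sketch below assumes the conversion $x = r\cos\theta$, $y = r\sin\theta$ (derived later in this section) and a hypothetical helper `polar_to_rect`; every listed pair should land on the same point $(1, 1)$.

```python
import math

def polar_to_rect(r, theta):
    """Rectangular coordinates of the point with polar coordinates (r, theta)."""
    return r * math.cos(theta), r * math.sin(theta)

root2 = math.sqrt(2)
candidates = []
for n in range(-2, 3):
    candidates.append((root2, math.pi / 4 + 2 * n * math.pi))
    candidates.append((-root2, 5 * math.pi / 4 + 2 * n * math.pi))

points = [polar_to_rect(r, theta) for r, theta in candidates]
for (r, theta), (x, y) in zip(candidates, points):
    print(f"r={r:+.4f}, theta={theta:+.4f} -> ({x:.6f}, {y:.6f})")
```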


Thus, when we set up a polar coordinate system we have a function

\[ \{(r, \theta)\} \to E, \]

but we do not have a one-to-one correspondence; the polar coordinates of a point are not determined when the point is named. For this reason, graphs in polar coordinates can naturally be thought of as paths. Let us look at some examples.

1) Consider the graph of

\[ r = \cos\theta \qquad (0 \le \theta \le 2\pi). \]

Since the cosine is periodic, with period $2\pi$, we can get all of the locus by restricting $\theta$ to the interval $[0, 2\pi]$.

As an aid in sketching the polar graph, we first sketch the rectangular graph of the equation $r = \cos\theta$. We cut the curve into four parts, as indicated, and then sketch the portion of the polar graph corresponding to each of them. As $\theta$ increases from 0 to $\pi/2$, $r$ decreases from 1 to 0. As $\theta$ increases from $\pi/2$ to $\pi$, $r$ decreases from 0 to $-1$. Therefore the second part of the curve, in the fourth quadrant, comes from values of $\theta$ in the second quadrant. (See the figure on the left.)


[Figures: left, the polar graph of r = cos θ for 0 ≤ θ ≤ π; right, the curve traced for π ≤ θ ≤ 2π.]
As $\theta$ continues to increase, from $\pi$ to $2\pi$, we trace out a curve shown on the right. This looks like the curve that we had already. And in fact it is exactly the same curve as before, because

\[ \cos(\theta + \pi) = -\cos\theta. \]

Further investigation shows that the graph is a circle. If $P$ has polar coordinates $(r, \theta)$, then the rectangular coordinates of $P$ are

\[ x = r\cos\theta, \qquad y = r\sin\theta. \]
(There are two cases to check. If $r > 0$, then these formulas follow from the definitions of the sine and cosine. Verification for $r = 0$ and $r < 0$?) Therefore

\[ x^2 + y^2 = r^2\cos^2\theta + r^2\sin^2\theta = r^2. \]

This gives the three conversion formulas

\[ x = r\cos\theta, \qquad y = r\sin\theta, \qquad x^2 + y^2 = r^2. \]

We note that the equation

\[ r = \cos\theta \]

does not involve any of the three expressions $r\cos\theta$, $r\sin\theta$, $r^2$ which we know how to convert into rectangular coordinates. But we can multiply by $r$, on both sides, getting

\[ r^2 = r\cos\theta; \]

and this means that

\[ x^2 + y^2 = x. \]

This is the circle with center at the point $(\tfrac12, 0)$ and radius $\tfrac12$.

2) Consider

\[ r = \sec\theta. \]

Here we might sketch the graph without using rectangular coordinates. It is easier, however, to multiply both sides of the equation by $\cos\theta$. This gives

\[ r\cos\theta = 1, \quad\text{or}\quad x = 1. \]

As $\theta$ increases from 0 to $2\pi$ (skipping $\pi/2$ and $3\pi/2$), this line is traversed twice. (It is worthwhile to figure out how.)
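Both conversions can be checked by sampling the polar graphs. In this sketch (variable names are illustrative assumptions) points of $r = \cos\theta$ are tested against $x^2 + y^2 = x$, and points of $r = \sec\theta$ against $x = 1$, skipping the angles where $\cos\theta$ vanishes.

```python
import math

thetas = [k * math.pi / 12 for k in range(24)]  # samples in [0, 2*pi)

# Points on r = cos(theta) should satisfy x^2 + y^2 = x.
circle_residuals = []
for theta in thetas:
    r = math.cos(theta)
    x, y = r * math.cos(theta), r * math.sin(theta)
    circle_residuals.append(abs(x * x + y * y - x))

# Points on r = sec(theta) should satisfy x = 1.
line_x = []
for theta in thetas:
    c = math.cos(theta)
    if abs(c) < 1e-9:
        continue  # sec(theta) is undefined at pi/2 and 3*pi/2
    r = 1 / c
    line_x.append(r * math.cos(theta))

print("largest circle residual:", max(circle_residuals))
print("x-values on r = sec(theta):", min(line_x), "to", max(line_x))
```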


3) In these two examples, it was easy to work back to a rectangular equation. Consider, however,

\[ r = 1 - \sin\theta \qquad (0 \le \theta \le 2\pi). \]

(As for $r = \cos\theta$, the interval $[0, 2\pi]$ gives us the entire locus of the path.) First we do a rectangular sketch as on the left. We then sketch the polar graph in four parts:

[Figures: left, the rectangular graph of r = 1 − sin θ; right, the polar graph, sketched in four parts.]

This curve is called, for obvious reasons, a cardioid.

It is possible to write a rectangular equation for the cardioid. First we observe that since

\[ 1 - \sin\theta \ge 0, \tag{1} \]

we always have (on this particular curve)

\[ r = \sqrt{r^2} = \sqrt{x^2 + y^2}. \]

We can therefore write

\[ r^2 = r - r\sin\theta, \tag{2} \]
\[ x^2 + y^2 = \sqrt{x^2 + y^2} - y. \tag{3} \]

4)

To get the equation of a line $L$, in polar coordinates, we proceed as follows. Let $N$ be the perpendicular to $L$ through the origin. Rotate the axes through an angle of measure $\phi$, choosing $\phi$ so that $N$ becomes the $x'$-axis.

Then $L$ is the graph of an equation

\[ x' - p = 0, \]

where $p$ is a constant. Since $x' = x\cos\phi + y\sin\phi$, this gives

\[ x\cos\phi + y\sin\phi - p = 0. \]

Converting to polar form, we get

\[ r\cos\theta\cos\phi + r\sin\theta\sin\phi - p = 0, \]
\[ r\cos(\theta - \phi) - p = 0. \]

This is the standard form of the equation of a line in polar coordinates.


Some geometric problems are most conveniently attacked by introducing polar
coordinates at the outset. To do this, we need a distance formula.
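Theorem 1. Let $d$ be the distance between the points with polar coordinates $(r_1, \theta_1)$ and $(r_2, \theta_2)$. Then

\[ d^2 = r_1^2 + r_2^2 - 2r_1r_2\cos(\theta_1 - \theta_2). \]

Proof. The rectangular coordinates of the two points are

\[ x_i = r_i\cos\theta_i, \qquad y_i = r_i\sin\theta_i \qquad\text{for } i = 1, 2. \]

Therefore

\[ d^2 = (x_1 - x_2)^2 + (y_1 - y_2)^2 \]
\[ = (r_1\cos\theta_1 - r_2\cos\theta_2)^2 + (r_1\sin\theta_1 - r_2\sin\theta_2)^2 \]
\[ = r_1^2(\cos^2\theta_1 + \sin^2\theta_1) + r_2^2(\cos^2\theta_2 + \sin^2\theta_2) - 2r_1r_2(\cos\theta_1\cos\theta_2 + \sin\theta_1\sin\theta_2) \]
\[ = r_1^2 + r_2^2 - 2r_1r_2\cos(\theta_1 - \theta_2), \]

which was to be proved.

For $r_1 > 0$, $r_2 > 0$, $0 < \theta_1 - \theta_2 < \pi$, this is simply the law of cosines. But the polar distance formula applies for any values of $r_1$, $r_2$, $\theta_1$, and $\theta_2$.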
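The claim that the formula holds for all values, including negative $r$, can be spot-checked against the rectangular distance formula. The helper names and the test cases below are assumptions chosen for illustration.

```python
import math

def polar_distance_squared(r1, t1, r2, t2):
    """d^2 from the polar distance formula."""
    return r1 * r1 + r2 * r2 - 2 * r1 * r2 * math.cos(t1 - t2)

def rect_distance_squared(r1, t1, r2, t2):
    """d^2 computed after converting both points to rectangular coordinates."""
    x1, y1 = r1 * math.cos(t1), r1 * math.sin(t1)
    x2, y2 = r2 * math.cos(t2), r2 * math.sin(t2)
    return (x1 - x2) ** 2 + (y1 - y2) ** 2

# Cases include negative and zero r, and angles outside [0, 2*pi).
cases = [
    (2.0, math.pi / 3, 1.0, math.pi / 6),
    (-2.0, math.pi / 3, 3.0, -1.0),
    (0.0, 0.7, 1.5, 4.0),
    (-1.0, 5.0, -2.5, -3.0),
]
differences = [
    abs(polar_distance_squared(*case) - rect_distance_squared(*case))
    for case in cases
]
print("largest disagreement:", max(differences))
```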


PROBLEM SET 9.4

Sketch the following, and convert to rectangular coordinates if possible.


I.

5.

7. r2 =sin 8 sec3 8

8. r =

13. r =

=1 - cos 8

11.

sin 30

36

1 - sin0

19.

r=
sin 8 - cos 8

20. r2 =sin2 8

22.

23.

2 +cos 8

15.

4 cos2 8 +9 sin2 8

17.

r
1 +r cos 0

r=sin 28

12. r =1 + r cos 8

=sin 48

16. r sin 0

9.

sin 8 +cos 8

14. r2=

1 +cos0

6. r=sin 8 sec2 8

4. r =1 +sin 8

10. r

3. r cos (fJ - 7T/4)

2. r = -2 sec 8

r=2 csc0

= 1 +csc 8

18. r=cos 38

= 2.

21.

r2

24.

=sin0

=eBI

25. The figure given in the text suggests that at the origin, the two sides of the cardioid have the same tangent, namely, the line $\theta = \pi/2$. Show that this is correct.
Discuss, as in Problems 1 through 24.
26. r2

27. r2

cos2 0

cos 8

28. r2 = cos 28

(This curve is called a lemniscate.)

29. r2 = a2 cos 28

r2

30.

=a2 sin 20

Find polar equations for the curves defined by the following conditions, and sketch.
Identify the curve if possible.
31. The set of all points which are equidistant from the origin and the line $r = \csc\theta$.

32. The set of all points which are equidistant from the origin and the point $(2\sqrt{2}, \pi/4)$.

33. The set of all points $P$ such that $PA = 2PB$, where $A$ is the origin and $B = (2\sqrt{2}, \pi/4)$.

34. The circle with center at $(2, \pi/4)$ and radius 2.


Sketch:
35. r=2 - sin 8
38.

9.5

r
1 +r cos 0

36.

= 3 - 2 sin 0

37.

=3 + 2 cos 8

1
=- (What sort of curve is this, and why?)

AREAS IN POLAR COORDINATES

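Given

\[ r = f(\theta) \ge 0, \]

where $f$ is continuous, and the length of the interval $[\alpha, \beta]$ is $\le 2\pi$. Let $R$ be the region between the polar graph and the origin. That is,

\[ R = \{(r, \theta) \mid \alpha \le \theta \le \beta \text{ and } 0 \le r \le f(\theta)\}. \]

Consider a subinterval $[\theta_{i-1}, \theta_i]$ of the interval $[\alpha, \beta]$. Let $m_i$ be the minimum value of $f$ on $[\theta_{i-1}, \theta_i]$, let $M_i$ be the maximum value, and let $\Delta A_i$ be the area of the region between the origin and the part of the curve from $\theta = \theta_{i-1}$ to $\theta = \theta_i$.

The area of the inner circular sector, with radius $m_i$, is $\tfrac12 m_i^2\,\Delta\theta_i$, and the area of the outer sector, with radius $M_i$, is $\tfrac12 M_i^2\,\Delta\theta_i$, where $\Delta\theta_i = \theta_i - \theta_{i-1}$. Therefore

\[ \tfrac12 m_i^2\,\Delta\theta_i \le \Delta A_i \le \tfrac12 M_i^2\,\Delta\theta_i. \]

Now take a net

\[ N: \alpha = \theta_0 < \theta_1 < \cdots < \theta_n = \beta \]

over the interval $[\alpha, \beta]$. The area of $R$ is

\[ A = \sum_{i=1}^{n} \Delta A_i. \]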

The above inequalities hold for every $i$; and so by addition we get

\[ \sum_{i=1}^{n} \tfrac12 m_i^2\,\Delta\theta_i \le A \le \sum_{i=1}^{n} \tfrac12 M_i^2\,\Delta\theta_i. \]

But the sum on the left is the lower sum $s(N)$ of the function

\[ F(\theta) = \tfrac12 f(\theta)^2 \]

over the net $N$, and the sum on the right is the upper sum $S(N)$, of the same function $F$, over the net $N$. Thus

\[ s(N) \le A \le S(N); \]

and so

\[ \lim_{|N|\to 0} s(N) \le A \le \lim_{|N|\to 0} S(N). \]

Since

\[ \lim_{|N|\to 0} s(N) = \int_\alpha^\beta \tfrac12 f(\theta)^2\,d\theta = \lim_{|N|\to 0} S(N), \]

it follows that $A$ is squeezed, and

\[ A = \int_\alpha^\beta \tfrac12 f(\theta)^2\,d\theta. \]

Thus we have:

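Theorem 1. Let $f$ be continuous and $\ge 0$ on $[\alpha, \beta]$, with $\beta - \alpha \le 2\pi$, and let $R$ be the region between the origin and the polar graph of $f$. Then the area of $R$ is
$$A = \int_\alpha^\beta \tfrac12 f(\theta)^2\,d\theta.$$

Let us try this in some simple cases. For the circular region with radius $a$ and center at the origin, the formula gives
$$A = \int_0^{2\pi} \tfrac12 a^2\,d\theta = \tfrac12 a^2 \cdot 2\pi = \pi a^2,$$
which is the right answer. For
$$r = \frac{1}{\cos\theta + \sin\theta}, \qquad 0 \le \theta \le \pi/2,$$
we get
$$A = \int_0^{\pi/2} \frac12 \cdot \frac{1}{(\cos\theta + \sin\theta)^2}\,d\theta = \frac12 \int_0^{\pi/2} \frac{d\theta}{1 + \sin 2\theta} = \frac12 \int_0^{\pi/2} \frac{1 - \sin 2\theta}{\cos^2 2\theta}\,d\theta$$
$$= \tfrac12 \left[\tfrac12 \tan 2\theta - \tfrac12 \sec 2\theta\right]_0^{\pi/2} = \tfrac12\left(0 + \tfrac12\right) - \tfrac12\left(0 - \tfrac12\right) = \tfrac12.$$
This is correct, because the region is a right triangle with legs of length 1:
$$r\cos\theta + r\sin\theta = 1; \qquad x + y = 1; \qquad A = \tfrac12.$$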
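The two worked cases above can also be checked numerically. The following is a minimal sketch, assuming a plain midpoint rule is accurate enough; the name `polar_area` and the sample radius 3 are illustrative choices, not taken from the text.

```python
import math

def polar_area(f, alpha, beta, n=100_000):
    """Midpoint-rule estimate of the integral of (1/2) f(theta)^2 over [alpha, beta]."""
    h = (beta - alpha) / n
    total = 0.0
    for i in range(n):
        theta = alpha + (i + 0.5) * h   # midpoint of the i-th subinterval
        total += 0.5 * f(theta) ** 2 * h
    return total

# Circle of radius 3 centered at the origin: the area should be 9*pi.
circle = polar_area(lambda theta: 3.0, 0.0, 2 * math.pi)

# r = 1/(cos(theta) + sin(theta)) on [0, pi/2]: a right triangle with legs 1.
triangle = polar_area(lambda theta: 1.0 / (math.cos(theta) + math.sin(theta)),
                      0.0, math.pi / 2)

print(circle, 9 * math.pi)
print(triangle)
```

Both estimates should agree with $\pi a^2$ and $\tfrac12$ to many decimal places, since the integrands are smooth on the intervals used.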

PROBLEM SET 9.5


Find the areas of the regions enclosed by the following curves, and sketch.
1.

r =

4.

r2 =

7.

r =

- sin 0

r = 1 - cos()
r2 = cos 20

2.

cos2()

5.

sec fJVtan fJ, 0

()

j4

1T

8. r

Find the area of the inside loop of the graph of

11.

r =

1
lcos

fJI +

lsin

12.

01

= e0, 0

r =

6.

r =sec() tan fJ, 0

9.

r2

= sin() cos()

10.

r =

0 21T

() 1T/4

4 cos 2()

1 - 2 sin fJ, and sketch.


13.

14. Given a polar graph defined by a differentiable function


a formula for the slope of the tangent, at a point

cos()

3.

(r0, fJ 0)

= e28, 0

() 21T

r = f(O) ( <X () {3), derive


(r0 =f(fJ0)). Here we really

mean the slope, relative to a rectangular coordinate system superimposed on the polar
coordinate system.

9.6

THE LENGTH OF A PATH

Roughly speaking, the length of a path is the total distance traversed by the moving
point.

For example, consider the path defined by the coordinate functions
$$x = f(t) = a\cos t, \qquad y = g(t) = a\sin t \qquad (0 \le t \le 4\pi).$$
The locus of this path is a circle with radius $a$ and circumference $2\pi a$. But as $t$ increases from $0$ to $4\pi$, this locus is traversed twice. Therefore the length of the path is
$$2 \cdot 2\pi a = 4\pi a.$$
Lengths of paths are undirected; they are always positive (or zero, in trivial cases). Thus the length of the path
$$x = f(t) = \cos t, \qquad y = g(t) = 0 \qquad (0 \le t \le 2\pi)$$
is four, not zero; the two halves of the path do not cancel each other out.


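To be exact, path length is defined as follows. Given a path
$$x = f(t), \qquad y = g(t) \qquad (a \le t \le b),$$
let
$$N: a = t_0 < t_1 < \cdots < t_n = b$$
be a net over $[a, b]$; for each $i$ from $0$ to $n$, let
$$x_i = f(t_i), \qquad y_i = g(t_i), \qquad P_i = (x_i, y_i).$$
Then
$$\sum_{i=1}^{n} P_{i-1}P_i$$
is the length of an inscribed broken line.

[Figure: a path with an inscribed broken line through $P_0, P_1, \ldots, P_n$.]

The path length, by definition, is
$$s = \lim_{|N| \to 0} \sum_{i=1}^{n} P_{i-1}P_i.$$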

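We can express the path length as an integral. For each $i$, let
$$x_i = f(t_i), \qquad y_i = g(t_i), \qquad \Delta x_i = x_i - x_{i-1}, \qquad \Delta y_i = y_i - y_{i-1},$$
as indicated in the figure. Then
$$P_{i-1}P_i = \sqrt{\Delta x_i^2 + \Delta y_i^2}.$$
Since
$$\Delta x_i = f(t_i) - f(t_{i-1}) \qquad \text{and} \qquad \Delta y_i = g(t_i) - g(t_{i-1}),$$
we know by the mean-value theorem that
$$\Delta x_i = f'(\bar t_i)\,\Delta t_i$$
for some $\bar t_i$ between $t_{i-1}$ and $t_i$; and
$$\Delta y_i = g'(t_i')\,\Delta t_i$$
for some $t_i'$ between $t_{i-1}$ and $t_i$. (We do not know that $\bar t_i = t_i'$, and this leads to trouble, as we shall see.) Therefore
$$P_{i-1}P_i = \sqrt{f'(\bar t_i)^2 + g'(t_i')^2}\;\Delta t_i,$$
and so
$$\sum_{i=1}^{n} P_{i-1}P_i = \sum_{i=1}^{n} \sqrt{f'(\bar t_i)^2 + g'(t_i')^2}\;\Delta t_i.$$
This is almost, but not quite, a sample sum of the function
$$\alpha(t) = \sqrt{f'(t)^2 + g'(t)^2};$$
it differs from a sample sum in that we have used two different sample points $\bar t_i$, $t_i'$ on each interval $[t_{i-1}, t_i]$ of our net $N$. Since $|N| \to 0$, and $g'$ is continuous, we ought to have
$$\sum_{i=1}^{n} \sqrt{f'(\bar t_i)^2 + g'(t_i')^2}\;\Delta t_i \approx \sum_{i=1}^{n} \sqrt{f'(\bar t_i)^2 + g'(\bar t_i)^2}\;\Delta t_i = \sum_{i=1}^{n} \alpha(\bar t_i)\,\Delta t_i \approx \int_a^b \sqrt{f'(t)^2 + g'(t)^2}\,dt$$
when $|N| \approx 0$. For a proof of this, see Appendix I. Meanwhile we shall state the following theorem and use it.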
Theorem 1. If the coordinate functions $f$, $g$ of a path have continuous derivatives $f'$, $g'$, then the length of the path is
$$s = \int_a^b \sqrt{f'(t)^2 + g'(t)^2}\,dt.$$
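The definition by inscribed broken lines and the integral formula can be compared numerically. The sketch below is an illustration only: the names `polygon_length` and `integral_length`, the uniform net, and the sample radius $a = 2$ are assumptions made here, not part of the text.

```python
import math

def polygon_length(f, g, a, b, n=20_000):
    """Length of the broken line inscribed at the points of a uniform net over [a, b]."""
    pts = [(f(a + (b - a) * i / n), g(a + (b - a) * i / n)) for i in range(n + 1)]
    return sum(math.hypot(q[0] - p[0], q[1] - p[1]) for p, q in zip(pts, pts[1:]))

def integral_length(df, dg, a, b, n=20_000):
    """Midpoint-rule estimate of the integral of sqrt(f'(t)^2 + g'(t)^2) over [a, b]."""
    h = (b - a) / n
    return sum(math.hypot(df(a + (i + 0.5) * h), dg(a + (i + 0.5) * h)) * h
               for i in range(n))

# Circle of radius 2 traversed twice (0 <= t <= 4*pi): the length should be 8*pi.
circle_poly = polygon_length(lambda t: 2 * math.cos(t), lambda t: 2 * math.sin(t),
                             0.0, 4 * math.pi)
circle_int = integral_length(lambda t: -2 * math.sin(t), lambda t: 2 * math.cos(t),
                             0.0, 4 * math.pi)

# x = cos t, y = 0 (0 <= t <= 2*pi): the length should be 4, not 0.
segment_poly = polygon_length(math.cos, lambda t: 0.0, 0.0, 2 * math.pi)
segment_int = integral_length(lambda t: -math.sin(t), lambda t: 0.0, 0.0, 2 * math.pi)

print(circle_poly, circle_int, 8 * math.pi)
print(segment_poly, segment_int)
```

The second case illustrates why path length is undirected: the integrand is $|\sin t|$, so the two halves of the motion add rather than cancel.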

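This formula can be converted to polar coordinates in the following way. Suppose that a polar path is described by a function
$$r = \phi(\theta) \qquad (a \le \theta \le b).$$
The rectangular coordinate functions of the path are then
$$x = f(\theta) = \phi(\theta)\cos\theta, \qquad y = g(\theta) = \phi(\theta)\sin\theta.$$
This gives
$$f'(\theta) = \phi'(\theta)\cos\theta - \phi(\theta)\sin\theta, \qquad g'(\theta) = \phi'(\theta)\sin\theta + \phi(\theta)\cos\theta.$$
For short, let us write $\phi'$ for $\phi'(\theta)$, $\phi$ for $\phi(\theta)$, $C$ for $\cos\theta$, and $S$ for $\sin\theta$. This gives
$$f'^2 = \phi'^2 C^2 - 2\phi\phi' CS + \phi^2 S^2, \qquad g'^2 = \phi'^2 S^2 + 2\phi\phi' CS + \phi^2 C^2,$$
and
$$f'^2 + g'^2 = \phi'^2 + \phi^2.$$
Thus we have the following theorem.

Theorem 2. Given a path defined in polar coordinates by a function
$$r = \phi(\theta) \qquad (a \le \theta \le b),$$
where $\phi'$ is continuous, the length of the path is
$$s = \int_a^b \sqrt{\phi(\theta)^2 + \phi'(\theta)^2}\,d\theta.$$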
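The polar length formula $s = \int_a^b \sqrt{\phi(\theta)^2 + \phi'(\theta)^2}\,d\theta$ can be tried on two paths whose lengths are known in closed form. This is a sketch under stated assumptions: `polar_length` is a hypothetical helper, the derivative $\phi'$ is supplied by hand, and a midpoint rule stands in for the integral.

```python
import math

def polar_length(phi, dphi, a, b, n=100_000):
    """Midpoint-rule estimate of the integral of sqrt(phi^2 + phi'^2) over [a, b]."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        theta = a + (i + 0.5) * h
        total += math.hypot(phi(theta), dphi(theta)) * h
    return total

# r = 2 sin(theta), 0 <= theta <= pi: a circle of radius 1, so the length is 2*pi.
circle = polar_length(lambda th: 2 * math.sin(th), lambda th: 2 * math.cos(th),
                      0.0, math.pi)

# r = e^theta, 0 <= theta <= 2*pi: the integrand is sqrt(2) e^theta,
# so the length is sqrt(2) * (e^(2*pi) - 1).
spiral = polar_length(math.exp, math.exp, 0.0, 2 * math.pi)
spiral_exact = math.sqrt(2) * (math.exp(2 * math.pi) - 1)

print(circle, 2 * math.pi)
print(spiral, spiral_exact)
```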

PROBLEM SET 9.6

It is hard to propose reasonable problems in the calculation of path length; sometimes the integral takes a troublesome but manageable form such as $\int \sqrt{1 + x^2}\,dx$, but most of the time, path length problems are either easy or impossible. Therefore, if some of the problems below look impossible, you should try to think of an approach that might make them easy.
Find the lengths of the following paths.
I.

2. r = eO, 0 () 2n
1
, 0 0 n/2
4. r =
.
cos 0 + sm 0

r = cos(), 0 () n

3. x = 0, y = cost, 0 t h/2
5.

9.

r = 2 sin 8, 0 () r.

r =

8.

,.

2
cos() - sine '

1
= 0

n/2 0 n

0 () 2r.

r = sec fJ tan 0, 0 0 7T/4 (Remember the above remarks.)

10. x = 1 + sint, y
1 1.

6.

x =

cost, 0 t 7T/4

cos3 t, y = sin3t, 0 t 1Tj4 (What sort of curve is the locus of this path?)

12. $x = t^3$, $y = |t^3|$, $-1 \le t \le 1$ (Do these coordinate functions satisfy the conditions of Theorem 1? That is, does $g(t) = |t^3|$ have a continuous derivative?)

*13. The proof of Theorem 1 would have been much easier if we had been able to use the following:

Theorem (?). For each $i$, there is a single point $\bar t_i$, between $t_{i-1}$ and $t_i$, such that
$$P_{i-1}P_i = \sqrt{f'(\bar t_i)^2 + g'(\bar t_i)^2}\;\Delta t_i.$$

We could then have expressed $\sum P_{i-1}P_i$ as a sample sum, and passed to the limit, as in Section 7.1. But the above theorem is false. Give an example of a path (with $f'$ and $g'$ continuous) for which the theorem fails. There is a very simple example of this kind.

9.7

VECTORS IN A PLANE

In Section 3.8, we found that the motion of a particle on a line could be described by a single function $f$, with real numbers as values, and that the velocity and acceleration functions were the first and second derivatives
$$v = f' \qquad \text{and} \qquad a = v' = f''.$$
As we remarked at the time, these ideas are not adequate to describe the motion of a particle in a plane (or in space). The motion of a particle in a plane $E$ is described by a path, which is a function
$$P: I \to E, \qquad t \mapsto P(t),$$
where $I$ is an interval, and $P(t)$ is the location of the moving particle at time $t$. Velocity in this case is a "vector quantity," with both a magnitude and a direction, conveniently pictured by an arrow. At each point $P(t)$, the direction of the velocity vector is the direction of the motion, so that the arrow always lies on the tangent line, pointing in the appropriate direction on the tangent line; and the length of the velocity vector is the speed.

This is the idea. We need to express it in a mathematical form in which it can be used. The idea of a vector appears in a variety of forms. The simplest of these is as follows.

With each point $P$ of the plane we associate the directed segment $\overrightarrow{OP}$, starting at the origin and ending at $P$. Such a directed segment $\overrightarrow{OP}$ will be called a vector.

We allow the "degenerate segment" $\overrightarrow{OO}$; this is called the zero vector, and may be denoted simply by $\vec{0}$. Moreover, since all our directed segments, in this section, are going to start at the origin, we can denote the directed segment $\overrightarrow{OP}$ by the shorter symbol $\vec{P}$.

Three operations can be performed, in this system:

Addition. Given $\vec{P}_1$, $\vec{P}_2$, with $P_1 = (x_1, y_1)$ and $P_2 = (x_2, y_2)$, the sum is defined to be
$$\vec{P}_1 + \vec{P}_2 = \vec{Q},$$
where
$$Q = (x_1 + x_2,\; y_1 + y_2).$$
Vector addition is governed by the same formal laws that govern addition of real numbers, as follows.

A.1 Associativity. $(\vec{P}_1 + \vec{P}_2) + \vec{P}_3 = \vec{P}_1 + (\vec{P}_2 + \vec{P}_3)$.

A.2 Existence of $\vec{0}$. There is a vector $\vec{0}$ such that $\vec{0} + \vec{P} = \vec{P} + \vec{0} = \vec{P}$ for every vector $\vec{P}$.

A.3 Existence of negatives. For each vector $\vec{P}$ there is a vector $-\vec{P}$ such that $\vec{P} + (-\vec{P}) = (-\vec{P}) + \vec{P} = \vec{0}$.

A.4 Commutativity. $\vec{P}_1 + \vec{P}_2 = \vec{P}_2 + \vec{P}_1$.

These follow from the corresponding laws for real numbers. For example, if
$$(\vec{P}_1 + \vec{P}_2) + \vec{P}_3 = \vec{Q} \qquad \text{and} \qquad \vec{P}_1 + (\vec{P}_2 + \vec{P}_3) = \vec{Q}',$$
then
$$Q = ((x_1 + x_2) + x_3,\; (y_1 + y_2) + y_3) = (x_1 + (x_2 + x_3),\; y_1 + (y_2 + y_3)) = Q',$$
and so $\vec{Q} = \vec{Q}'$. The existence of $\vec{0}$ is obvious: $\vec{0}$ is $\overrightarrow{OO}$. If $P = (x, y)$, then $-\vec{P} = \vec{Q}$, where $Q = (-x, -y)$. Similarly for A.4.


Scalar multiplication. When we are discussing vectors, we refer to real numbers as scalars. To multiply a vector $\vec{P}$ by a scalar $\alpha$, we multiply the coordinates of $P$ by $\alpha$. That is, if $P = (x, y)$, then
$$\alpha\vec{P} = \vec{Q}, \qquad \text{where} \qquad Q = (\alpha x, \alpha y).$$
We then have a kind of associative law.

M.1. $(\alpha\beta)\vec{P} = \alpha(\beta\vec{P})$.

Because $(\alpha\beta)x = \alpha(\beta x)$, and $(\alpha\beta)y = \alpha(\beta y)$. Multiplication is connected with vector addition by two distributive laws.

M.2. $(\alpha + \beta)\vec{P} = \alpha\vec{P} + \beta\vec{P}$.

M.3. $\alpha(\vec{P}_1 + \vec{P}_2) = \alpha\vec{P}_1 + \alpha\vec{P}_2$.

Zero and 1 work in the usual way:

M.4. $0\vec{P} = \vec{0}$, for every $\vec{P}$.

M.5. $1\vec{P} = \vec{P}$, for every $\vec{P}$.

M.6. $\alpha\vec{0} = \vec{0}$, for every $\alpha$.

Let $\mathcal{V}$ be the set of all vectors $\vec{P}$. In $\mathcal{V}$ we have defined two operations (addition and scalar multiplication), and shown that they satisfy the laws A.1 through A.4 and M.1 through M.6; $\mathcal{V}$ is called a vector space (relative to these two operations). More generally, any collection $\mathcal{V}$ of objects is called a vector space if it is provided with two operations satisfying the above formal laws. There are many important vector spaces other than the one which we are now discussing. For example, we may consider the directed segments $\overrightarrow{OP}$, starting from the origin in three-dimensional space, with the two operations defined in an analogous way.


Finally, we introduce another kind of multiplication for vectors, called the dot product or inner product. If $P_1 = (x_1, y_1)$ and $P_2 = (x_2, y_2)$, as before, then the inner product is a scalar, namely,
$$\vec{P}_1 \cdot \vec{P}_2 = x_1 x_2 + y_1 y_2.$$
The following properties of this operation are easy to check:

S.1. $\vec{P}_1 \cdot \vec{P}_2 = \vec{P}_2 \cdot \vec{P}_1$.

S.2. $(\alpha\vec{P}_1) \cdot \vec{P}_2 = \alpha(\vec{P}_1 \cdot \vec{P}_2)$.

S.3. $\vec{P}_1 \cdot (\vec{P}_2 + \vec{P}_3) = \vec{P}_1 \cdot \vec{P}_2 + \vec{P}_1 \cdot \vec{P}_3$.

S.4. $\vec{P} \cdot \vec{P} \ge 0$, for every $\vec{P}$.

S.5. If $\vec{P} \cdot \vec{P} = 0$, then $\vec{P} = \vec{0}$.

(The last condition rules out trivial "dot products" for which $\vec{P}_1 \cdot \vec{P}_2$ is always 0, for every $\vec{P}_1$, $\vec{P}_2$.)

Thus, $\mathcal{V}$ is called an inner product space (relative to the three operations which have now been defined). More generally, any collection $\mathcal{V}$ is called an inner product space if it is provided with three operations (addition, scalar multiplication, and inner product) satisfying all the above laws.
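The algebraic definitions lend themselves to a direct check on sample values. The following is a minimal sketch, assuming vectors are stored as Python tuples of coordinates; the function names `add`, `scale`, and `dot` are illustrative and do not come from the text. Integer coordinates are used so that the comparisons are exact.

```python
def add(p, q):
    """Vector addition: add corresponding coordinates."""
    return (p[0] + q[0], p[1] + q[1])

def scale(a, p):
    """Scalar multiplication: multiply both coordinates by the scalar a."""
    return (a * p[0], a * p[1])

def dot(p, q):
    """Inner product: x1*x2 + y1*y2, a scalar."""
    return p[0] * q[0] + p[1] * q[1]

p1, p2, p3 = (1, 2), (3, -1), (-2, 5)

print(add(add(p1, p2), p3) == add(p1, add(p2, p3)))               # A.1
print(add(p1, p2) == add(p2, p1))                                 # A.4
print(scale(2, add(p1, p2)) == add(scale(2, p1), scale(2, p2)))   # M.3
print(dot(p1, add(p2, p3)) == dot(p1, p2) + dot(p1, p3))          # S.3
print(dot(p1, p1))                                                # S.4: never negative
```

Checking a law on a few vectors is of course not a proof; the proofs are the coordinate computations indicated in the text.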
As a matter of convenience, we have defined our three operations algebraically,
using the coordinates (x, y) of the terminal points P of the vectors. But it is important
to understand that all three of them have geometric meanings.

We can add two vectors, geometrically, by completing a parallelogram, as shown on the left. To do this, we don't need to know the directions of the $x$- and $y$-axes. Therefore the axes can be, and have been, omitted from the figure. If $\vec{P}_1$ and $\vec{P}_2$ are collinear, then the parallelogram collapses, but the idea is the same.

Geometrically, $-\vec{P}$ is the vector $\vec{Q}$ which has the same length as $\vec{P}$, but has the opposite direction.

To multiply a vector $\vec{P}$ by a positive scalar $\alpha$, we draw a vector with the same direction as $\vec{P}$, but multiply the length by $\alpha$. If $\alpha < 0$, we go in the opposite direction, and multiply the length by $|\alpha|$.

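The geometric meaning of the inner product is less obvious. Algebraically,
$$\vec{P}_1 \cdot \vec{P}_2 = x_1 x_2 + y_1 y_2.$$
Under the conditions given in the figure,
$$\cos\theta = \cos(\theta_2 - \theta_1) = \cos\theta_1\cos\theta_2 + \sin\theta_1\sin\theta_2.$$
Substituting
$$\cos\theta_1 = x_1/OP_1, \quad \sin\theta_1 = y_1/OP_1, \quad \cos\theta_2 = x_2/OP_2, \quad \sin\theta_2 = y_2/OP_2,$$
we get
$$\cos\theta = \frac{x_1 x_2 + y_1 y_2}{OP_1 \cdot OP_2},$$
so that
$$\vec{P}_1 \cdot \vec{P}_2 = OP_1 \cdot OP_2 \cos\theta.$$
Obviously $\cos\theta$ is independent of the directions of the axes, because $\theta$ measures the angle between the two vectors. Note that the length of the vector $\vec{P}$ can be expressed in terms of the dot product:
$$\vec{P} \cdot \vec{P} = x^2 + y^2 = OP^2.$$
The length of a vector $\vec{P}$ may also be denoted by $|\vec{P}|$. Thus
$$|\vec{P}| = \sqrt{\vec{P} \cdot \vec{P}}.$$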
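The relation between the algebraic dot product and the product of the lengths times $\cos\theta$ can be tested on a pair of sample vectors. This is a sketch only: the vectors $(3, 1)$ and $(1, 2)$ are arbitrary choices, and the directions $\theta_1$, $\theta_2$ are obtained with `math.atan2`, which is one convenient way (assumed here) of measuring direction from the positive $x$-axis.

```python
import math

def dot(p, q):
    """Algebraic inner product x1*x2 + y1*y2."""
    return p[0] * q[0] + p[1] * q[1]

def length(p):
    """Length of the vector, the square root of p . p."""
    return math.sqrt(dot(p, p))

p1 = (3.0, 1.0)
p2 = (1.0, 2.0)

theta1 = math.atan2(p1[1], p1[0])   # direction of p1
theta2 = math.atan2(p2[1], p2[0])   # direction of p2
theta = theta2 - theta1             # angle between the two vectors

algebraic = dot(p1, p2)
geometric = length(p1) * length(p2) * math.cos(theta)

print(algebraic, geometric)
```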

By a linear combination of two vectors $\vec{P}_1$, $\vec{P}_2$ we mean a vector $\vec{Q}$ which can be expressed in the form
$$\vec{Q} = \alpha\vec{P}_1 + \beta\vec{P}_2,$$
where $\alpha$ and $\beta$ are scalars. In a coordinate plane, it is easy to find two vectors $\mathbf{i}$ and $\mathbf{j}$ such that every vector is a linear combination of them. If the vectors $\mathbf{i}$ and $\mathbf{j}$ are as in the left-hand figure below, and $P = (x, y)$, then
$$\vec{P} = x\mathbf{i} + y\mathbf{j} \qquad (\mathbf{i} = (1, 0),\ \mathbf{j} = (0, 1)).$$
This is an equation between vectors, not numbers. On the right, we have multiplied the vectors $\mathbf{i}$ and $\mathbf{j}$ by the scalars $x$ and $y$, and added the resulting vectors.

This section contains no new information, but quite a lot of new language. Learning a language takes practice. Therefore, while some of the problems below are genuine problems, many of them are merely exercises in the process of translation from the language of coordinate systems to the language of vectors and back again.

PROBLEM SET 9.7

Sketch the set of all points P satisfying the following conditions.


I. P

3. P

od ( -

<

oo

r.ti + r.tj (-w

5. p . i
0
.
7. p (i + j)
9. p . p
1

2. P

< w)

C'I.

<

4. P

< w)

C'I.

10. p. p

r.ti + 2et.j

13. p. j
15. P

19. P

co

C'I.

14. P

2r.ti - C'l.j (-w <

C'I.

<

w)

16. p
18. p

r.tj + r.t2i

21. p . (2i + j)
22. Let c

12. p. i

v3)

17. P(i + 2j)

(-w

<

r.t

20. p .

< co)

(i

C'I.

<

)
)

8. p . (i + j)

<

r.ti- r.tj ( - w <et. <

6. p . j

11. J3

r.tj ( -

cxi + r.t2j

( - O'J <

C'I.

< oo)

et.i + r.t3j

( - O'J <

C'I.

< O'J)

C'l.2i + r.t3j
+ 2j)

( - Ct:) <

C'I.

< (/J)

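22. Let $\mathbf{c} = \mathbf{i} + \mathbf{j}$, $\mathbf{d} = \mathbf{i} - \mathbf{j}$. Express $\mathbf{i}$ as a linear combination of $\mathbf{c}$ and $\mathbf{d}$. (To do this you will need to calculate with vectors, by the same processes that you use with real numbers. This can be done; and this is why we stated and verified the laws A.1 through A.4 and M.1 through M.6.)

23. Express $\mathbf{j}$ as a linear combination of $\mathbf{c}$ and $\mathbf{d}$.

24. Now show how any vector $\vec{P}$ can be expressed as a linear combination of $\mathbf{c}$ and $\mathbf{d}$.

25. Let $\mathbf{e} = \mathbf{i} + 2\mathbf{j}$, $\mathbf{f} = 2\mathbf{i} - \mathbf{j}$.
a) Express $\mathbf{i}$ as a linear combination of $\mathbf{e}$ and $\mathbf{f}$.
b) Express $\mathbf{j}$ as such a linear combination.
c) Show how every vector $\vec{P}$ can be so expressed.

26. Same problem, for $\mathbf{e} = \mathbf{i} - 2\mathbf{j}$, $\mathbf{f} = 3\mathbf{i} + 2\mathbf{j}$.

27. The vectors $\mathbf{g}$ and $\mathbf{h}$ span the vector space $\mathcal{V}$ if every vector in $\mathcal{V}$ is a linear combination of $\mathbf{g}$ and $\mathbf{h}$. (Thus in Problem 25 (c) you showed that $\mathbf{e}$ and $\mathbf{f}$ span $\mathcal{V}$.) Is it true that every pair of vectors in $\mathcal{V}$ span $\mathcal{V}$? Why or why not?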

28. Let $P_1 = (2, 1)$, $P_2 = (1, 2)$. Sketch the set of all points $P$ such that
$$\vec{P} = \alpha\vec{P}_1 + (1 - \alpha)\vec{P}_2 \qquad (0 \le \alpha \le 1).$$

29. Let $\vec{P}_1$ and $\vec{P}_2$ be any two vectors (by which we mean two different vectors). Sketch the set of all points $P$ such that
$$\vec{P} = \alpha\vec{P}_1 + (1 - \alpha)\vec{P}_2 \qquad (0 \le \alpha \le 1).$$

Sketch the set of all points P satisfying the following conditions:

30. p

. i

32. p .
34. p
36. p
*33.

(i +
=

31. p . j 0

33. p .

j) 0

et:i +

{i'j ( et: 0, fi' 0)

. (i + 2j)

35. p
37. p

j) 0

(i -

et:i + {i'(i

. (i -

2j)

<

(o: 0, /! 0)

+ j)
0

*38. Let $\mathcal{W}$ be the set of all continuous functions on the interval $[-1, 1]$. State, for the functions in $\mathcal{W}$, definitions of (a) addition, (b) scalar multiplication, and (c) inner product, in such a way that $\mathcal{W}$ forms an inner product space. Verify that under your definitions, the inner product space laws are all satisfied. (There is only one reasonable definition for (a), and similarly for (b); but the "right" definition of the inner product is less obvious. Hint: The "$\cdot$" operation is supposed to assign a number $f \cdot g$ to each pair of functions $f$, $g$. Under what significant operation does a number correspond to one function? As a check on your definition, it should turn out that if $f(x) = x^3$, $g(x) = 1$, $h(x) = x$, then $f \cdot g = 0$, $g \cdot h = 0$, and $f \cdot h = \tfrac25$.)

This inner product space has important uses, later in the theory of functions.

9.8

FREE VECTORS

In the last section, we defined a vector to be a directed segment OP, starting at the
origin. We shall now introduce a different form of the vector concept, which for
some purposes is better.
By a translation of a coordinate plane we mean a correspondence of the form
$$x \mapsto x + h, \qquad y \mapsto y + k,$$
where $h$ and $k$ are constants. This is different from the idea of translation of axes, which we used in Chapter 8. Then, we were moving the axes, while now we are moving the points $(x, y)$, with $(x, y) \mapsto (x + h, y + k)$.

Suppose that we have given two directed segments $\overrightarrow{PQ}$, $\overrightarrow{P'Q'}$, in a coordinate plane. If there is a translation under which $P \mapsto P'$ and $Q \mapsto Q'$, then we say that $\overrightarrow{PQ}$ and $\overrightarrow{P'Q'}$ are equivalent.

This idea is easy to describe in terms of coordinates. Let
$$P = (x_1, y_1), \qquad Q = (x_2, y_2), \qquad P' = (x_1', y_1'), \qquad Q' = (x_2', y_2').$$
We can always move $P$ onto $P'$ by a translation
$$x \mapsto x + h, \qquad y \mapsto y + k,$$
where
$$h = x_1' - x_1, \qquad k = y_1' - y_1.$$
If it is true that
$$x_1' - x_1 = x_2' - x_2, \qquad y_1' - y_1 = y_2' - y_2,$$
then this translation also moves $Q$ onto $Q'$, and $\overrightarrow{PQ}$ and $\overrightarrow{P'Q'}$ are equivalent.

For each pair $P$, $Q$, the symbol $\overset{\rightharpoonup}{PQ}$ denotes the set of all directed segments $\overrightarrow{P'Q'}$ that are equivalent to $\overrightarrow{PQ}$. Such a set of equivalent directed segments is called a free vector (or simply a vector, if the context makes it obvious what meaning is intended). Thus the figure on the left below is a partial picture of exactly one free vector. A free vector is called an equivalence class of directed segments; and any directed segment which belongs to such an equivalence class is called a representative of the class. Thus each of the arrows in the figure is a representative of the free vector $\overset{\rightharpoonup}{PQ}$.
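If two directed segments $\overrightarrow{PQ}$, $\overrightarrow{P'Q'}$ are equivalent, then they determine the same free vector, and $\overset{\rightharpoonup}{PQ} = \overset{\rightharpoonup}{P'Q'}$. And if $\overset{\rightharpoonup}{PQ} = \overset{\rightharpoonup}{P'Q'}$, then the segments $\overrightarrow{PQ}$ and $\overrightarrow{P'Q'}$ are equivalent. Therefore, when we write an equation of the form
$$\overset{\rightharpoonup}{PQ} = \overset{\rightharpoonup}{P'Q'},$$
we are saying that the segments $\overrightarrow{PQ}$, $\overrightarrow{P'Q'}$ are equivalent under a translation.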
It is now easy to define, for free vectors, the operations of addition, scalar multiplication, and dot product. If
$$\overrightarrow{OP} + \overrightarrow{OQ} = \overrightarrow{OR}$$
in the sense defined in the preceding section, then
$$\overset{\rightharpoonup}{OP} + \overset{\rightharpoonup}{OQ} = \overset{\rightharpoonup}{OR},$$
by definition. This definition is complete, because every free vector $\overset{\rightharpoonup}{ST}$ has exactly one representative segment which starts at the origin. Similarly, if $\alpha\overrightarrow{OP} = \overrightarrow{OQ}$, then
$$\alpha\overset{\rightharpoonup}{OP} = \overset{\rightharpoonup}{OQ},$$
by definition; and
$$\overset{\rightharpoonup}{OP} \cdot \overset{\rightharpoonup}{OQ} = \overrightarrow{OP} \cdot \overrightarrow{OQ},$$
by definition. The form of these definitions makes it clear that all the vector laws and inner product laws of the preceding section also hold true for free vectors. Since all we would need to do is rewrite them in the new notation (using $\overset{\rightharpoonup}{OP}$ for $\vec{P}$), it is not worth while to do so. The length of a free vector is the length of each of its representatives. That is,
$$|\overset{\rightharpoonup}{OP}| = |\overrightarrow{OP}| = OP.$$
The free vector of length 0 is denoted by $\mathbf{0}$.

It is convenient, in figures, to use the label $\overset{\rightharpoonup}{PQ}$ for any representative of the free vector $\overset{\rightharpoonup}{PQ}$. Thus different segments may have the same label, as in the left-hand figure below; and when they do, this means that the segments are equivalent.
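It is easy to see that the right-hand figure above is correctly labeled. Therefore:

Theorem 1. $\overset{\rightharpoonup}{PQ} + \overset{\rightharpoonup}{QP} = \mathbf{0}$, for every $P$, $Q$.

Similarly, the labels are correct in the parallelogram below. Since
$$\overset{\rightharpoonup}{OP} + \overset{\rightharpoonup}{OR} = \overset{\rightharpoonup}{OQ},$$
we have
$$\overset{\rightharpoonup}{OP} + \overset{\rightharpoonup}{PQ} = \overset{\rightharpoonup}{OQ}.$$
This has a geometric meaning: we can add free vectors by laying representative segments end to end. Solving for $\overset{\rightharpoonup}{PQ}$, we get $\overset{\rightharpoonup}{PQ} = \overset{\rightharpoonup}{OQ} - \overset{\rightharpoonup}{OP}$. And this gives:

Theorem 2. $\overset{\rightharpoonup}{PQ} + \overset{\rightharpoonup}{QR} + \overset{\rightharpoonup}{RP} = \mathbf{0}$, for every $P$, $Q$, $R$.

Proof.
$$\overset{\rightharpoonup}{PQ} + \overset{\rightharpoonup}{QR} + \overset{\rightharpoonup}{RP} = \overset{\rightharpoonup}{OQ} - \overset{\rightharpoonup}{OP} + \overset{\rightharpoonup}{OR} - \overset{\rightharpoonup}{OQ} + \overset{\rightharpoonup}{OP} - \overset{\rightharpoonup}{OR} = \mathbf{0}.$$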

As a matter of convenience, we have defined equivalence of directed segments in


terms of a coordinate system. But in fact this relation of equivalence is independent
of the choice of the coordinate system.

The directed segments $\overrightarrow{PQ}$ and $\overrightarrow{P'Q'}$ are equivalent under translation if (a) their lengths $PQ$ and $P'Q'$ are the same and (b) the directions $\theta$ and $\theta'$ are the same.

Note that while the directions $\theta$, $\theta'$ depend on the directions of the axes, the equation $\theta = \theta'$ does not; if the equation holds, and the axes are rotated, then the equation continues to hold.

Thus we say that the relation of equivalence between directed segments, used in defining free vectors, is invariant under changes in the coordinate system.

It very often happens that we use coordinate systems in the study of things which
are invariant under changes of coordinates. Thus the distance between two points is
invariant, and so also is the question whether a given curve is a parabola. But we use
coordinate systems in the study of parabolas, and similarly we use coordinate systems
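in the study of vectors. If $P = (x, y)$, then $x$ and $y$ are called the $x$- and $y$-components of $\overrightarrow{OP}$. In this case
$$\vec{P} = \overrightarrow{OP} = x\mathbf{i} + y\mathbf{j},$$
where $\mathbf{i}$ and $\mathbf{j}$ are as in the preceding section. Corresponding to the vectors $\mathbf{i}$, $\mathbf{j}$ we have free vectors $\overset{\rightharpoonup}{\mathbf{i}}$, $\overset{\rightharpoonup}{\mathbf{j}}$; and $\overset{\rightharpoonup}{OP}$ is a linear combination of these free vectors:
$$\overset{\rightharpoonup}{OP} = x\overset{\rightharpoonup}{\mathbf{i}} + y\overset{\rightharpoonup}{\mathbf{j}},$$
as shown on the left below. And of course pictures of the new $\overset{\rightharpoonup}{\mathbf{i}}$ and $\overset{\rightharpoonup}{\mathbf{j}}$ can be drawn starting at any point that we want. In the right-hand figure, $\overset{\rightharpoonup}{PQ}$, $\overset{\rightharpoonup}{\mathbf{i}}$, and $\overset{\rightharpoonup}{\mathbf{j}}$ are all free vectors, and $\overset{\rightharpoonup}{PQ} = \overset{\rightharpoonup}{\mathbf{i}} + 2\overset{\rightharpoonup}{\mathbf{j}}$.

In general, if $\mathbf{V}$ and $\mathbf{T}$ are any vectors, with $\mathbf{T} \ne \mathbf{0}$, then the $\mathbf{T}$-component of $\mathbf{V}$ is the number
$$V_T = |\mathbf{V}| \cos\theta,$$
where $\theta$ measures the angle between the direction of $\mathbf{T}$ and the direction of $\mathbf{V}$. Thus, in the figure below, $V_T$ is the directed distance $PQ$, relative to the given positive direction on the line that contains $\overrightarrow{PR}$.

Since
$$\mathbf{V} \cdot \mathbf{T} = |\mathbf{V}|\,|\mathbf{T}| \cos\theta,$$
it is easy to express the $\mathbf{T}$-component in terms of the dot product:
$$V_T = \frac{\mathbf{V} \cdot \mathbf{T}}{|\mathbf{T}|}.$$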
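The formula for the $\mathbf{T}$-component as a dot product divided by $|\mathbf{T}|$ is easy to try out. The sketch below assumes free vectors are represented by coordinate pairs; `component` is a hypothetical helper name, and the sample vectors are arbitrary.

```python
import math

def dot(v, t):
    """Inner product of two vectors given by their components."""
    return v[0] * t[0] + v[1] * t[1]

def component(v, t):
    """T-component of v, computed as (v . t)/|t|; t must not be the zero vector."""
    return dot(v, t) / math.hypot(t[0], t[1])

v = (1.0, 2.0)

x_component = component(v, (1.0, 0.0))   # component along i
y_component = component(v, (0.0, 1.0))   # component along j
t_component = component(v, (3.0, 4.0))   # (3 + 8)/5

# Compare with |V| cos(theta), theta being the angle from T to V.
theta = math.atan2(v[1], v[0]) - math.atan2(4.0, 3.0)
by_angle = math.hypot(v[0], v[1]) * math.cos(theta)

print(x_component, y_component, t_component, by_angle)
```

Note that the component depends only on the direction of $\mathbf{T}$, not on its length: replacing $(3, 4)$ by $(6, 8)$ gives the same number.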

420

9.8

Paths and Vectors in a Plane

PROBLEM SET 9.8

In the figures below, we use tick marks to indicate that segments have the same length.
Thus the tick marks in the figure below say that AB

AC.

1.

-+

--+

--+

a) Calculate OS as a linear combination of OR and OP. (The figure is a parallelogram.)

0
--+

--+

--+

b) Calculate OT as a linear combination of OR and OP, shown on the right above.


(These two answers, in combination, give a vector proof that the diagonals of a
parallelogram bisect each other.)
2.

--+

-+

--+

--+

a) Calculate SR and OT as linear combinations of OR and OS.


rhombus, so \OS\

(The figure is a

--+

-+

\OR\.)
T

0
--+

--+

b) Show that, in a rhombus, SR

OT

0.

(These two answers, in combination, give

a vector proof that the diagonals of a rhombus are perpendicular.)

Free Vectors

9.8

421

-+

3. a) Calculate OS as a linear combination of OP and OR in the left-hand figure below.


R

--+

b) Calculate OT as a linear combination of OP and OR, in the right-hand figure.


4. Do Problems 3a and 3b give a vector proof that the three medians of a triangle are

concurrent? Or do you need to carry out a third calculation of the same kind, to
complete the proof?
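5. a) Show that $|\mathbf{V} \cdot \mathbf{T}| \le |\mathbf{V}|\,|\mathbf{T}|$, for every two free vectors $\mathbf{V}$ and $\mathbf{T}$.
b) Show that for any real numbers $a$, $b$, $x$, $y$, we have
$$|ax + by| \le \sqrt{a^2 + b^2}\,\sqrt{x^2 + y^2}.$$

6. Show that if $P$, $Q$, $R$, and $S$ are any four points of the plane, then
$$\overset{\rightharpoonup}{PQ} + \overset{\rightharpoonup}{QR} + \overset{\rightharpoonup}{RS} + \overset{\rightharpoonup}{SP} = \mathbf{0}.$$

7. Let $\mathbf{V}_0$ be a fixed (free) vector. Show that if $\mathbf{V}_0 \cdot \mathbf{V} = 0$, for every $\mathbf{V}$, then $\mathbf{V}_0 = \mathbf{0}$.

8. Suppose that $\mathbf{V} \cdot \mathbf{i} = 0$, and $|\mathbf{V}| = 1$. Is this information enough to determine $\mathbf{V}$? If so, what is $\mathbf{V}$? If not, give a figure, showing the possibilities for $\mathbf{V}$.

9. Given that $\mathbf{V} \cdot \mathbf{i} = 0$ and $\mathbf{V} \cdot \mathbf{j} = 0$, discuss as in Problem 8.

10. Given $\mathbf{V} \cdot \mathbf{i} = 1$ and $\mathbf{V} \cdot \mathbf{j} = 1$, discuss as in Problem 8.

11. a) A set of vectors $\mathbf{V}_1, \mathbf{V}_2, \ldots, \mathbf{V}_n$ are linearly dependent if there are scalars $\alpha_1, \alpha_2, \ldots, \alpha_n$, not all $= 0$, such that
$$\sum_{i=1}^{n} \alpha_i \mathbf{V}_i = \mathbf{0}.$$
Show that for any $\mathbf{V}$, the vectors $\mathbf{i}$, $\mathbf{j}$, and $\mathbf{V}$ are linearly dependent.
b) Show that if one of the vectors $\mathbf{V}_i$ is $= \mathbf{0}$, then the vectors $\mathbf{V}_1, \mathbf{V}_2, \ldots, \mathbf{V}_n$ are linearly dependent.
c) Find a number $a$ such that $2\mathbf{i} + \mathbf{j}$ and $7\mathbf{i} + a\mathbf{j}$ are linearly dependent.

12. a) A set of vectors $\mathbf{V}_1, \mathbf{V}_2, \ldots, \mathbf{V}_n$ are linearly independent if they are not linearly dependent. Thus the $\mathbf{V}_i$'s are linearly independent if
$$\sum_{i=1}^{n} \alpha_i \mathbf{V}_i = \mathbf{0} \implies \alpha_1 = \alpha_2 = \cdots = \alpha_n = 0.$$
Show that $\mathbf{i}$ and $\mathbf{j}$ are linearly independent.
b) Are $\mathbf{i}$ and $\mathbf{i} + \mathbf{j}$ linearly independent? Why or why not?
c) Given that $\mathbf{i}$ and $\overset{\rightharpoonup}{OP}$ are linearly dependent, what are the possibilities for $P$?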

Paths and Vectors in a Plane

422
13.

9.9

Show that if
and
then
IV1 - V2I

IW1 - W2I

(Remember that IVl2


V V, for every V.) Then draw a figure, and restate the
theorem in the language of elementary geometry.
=

14. Explain how Problems 5a and 5b can be regarded as the same problem.

15. a) Consider the vector space which you were asked to define in the last problem of the preceding problem set. Let $\mathbf{1}$ be the constant function which is $= 1$ for each $x$ on $[-1, 1]$. Find ten nonconstant functions $f_1, f_2, \ldots, f_{10}$ such that $\mathbf{1} \cdot f_i = 0$ for each $i$.
b) Show that in the same vector space, $f \cdot f_0 = 0$ for every $f$ $\implies$ $f_0 = 0$.

9.9

VELOCITY, ACCELERATION, AND CURVATURE

We return to the discussion of moving particles in a plane. Suppose that the motion is described by a path
$$P: I \to E, \qquad t \mapsto P(t),$$
where $I$ is a time interval. Let the coordinate functions of the path be $f$ and $g$, so that
$$P(t) = (f(t), g(t)) \qquad (t \text{ on } I).$$
We now regard the path as a function whose values are the vectors
$$\vec{P}_t = \overrightarrow{OP_t},$$
where $\vec{P}_t$ is a vector in the sense of Section 9.7, and $P(t)$ is denoted by $P_t$, to fit it into the vector notation. We then have
$$\vec{P}_t = f(t)\mathbf{i} + g(t)\mathbf{j}.$$
We can now define the velocity and acceleration. These are the free vectors
$$\mathbf{V}_t = f'(t)\overset{\rightharpoonup}{\mathbf{i}} + g'(t)\overset{\rightharpoonup}{\mathbf{j}}, \qquad \mathbf{A}_t = f''(t)\overset{\rightharpoonup}{\mathbf{i}} + g''(t)\overset{\rightharpoonup}{\mathbf{j}},$$
where $\overset{\rightharpoonup}{\mathbf{i}}$ and $\overset{\rightharpoonup}{\mathbf{j}}$ are the free vectors corresponding to $\mathbf{i}$ and $\mathbf{j}$. Since $\mathbf{V}_t$ and $\mathbf{A}_t$ are free vectors, we can draw pictures of them in any position we want; and so we picture them by drawing arrows starting at the point $P_t$.

The picture then says that at time $t$, the moving particle is at the point $P_t$ and has the indicated velocity and acceleration vectors $\mathbf{V}_t$ and $\mathbf{A}_t$. Note that $\mathbf{V}_t$ lies along the tangent line; and this is right. (This should be checked, for the various possible cases. (a) If $f'(t)$ and $g'(t)$ are both 0, then $\mathbf{V}_t = \mathbf{0}$, and there is nothing to prove. (b) If $f'(t) \ne 0$, then $\mathbf{V}_t$ and the tangent line both have slope $g'(t)/f'(t)$. (c) If $f'(t) = 0$ and $g'(t) \ne 0$, then $\mathbf{V}_t$ and the tangent line are both vertical.)
When we write $\mathbf{V}_t = f'(t)\overset{\rightharpoonup}{\mathbf{i}} + g'(t)\overset{\rightharpoonup}{\mathbf{j}}$, $\mathbf{A}_t = f''(t)\overset{\rightharpoonup}{\mathbf{i}} + g''(t)\overset{\rightharpoonup}{\mathbf{j}}$, we are describing each of the vectors $\mathbf{V}_t$ and $\mathbf{A}_t$ by a pair of numbers. Unfortunately, the numbers $f'(t)$, $g'(t)$, $f''(t)$, $g''(t)$ have no physical meaning, because they depend on the coordinate system. It is possible, however, to describe the acceleration by a pair of numbers which do have physical meanings. This is done in the following way. First we take a free vector $\mathbf{T}$, with the same direction as $\mathbf{V}_t$, but with length 1. $\mathbf{T}$ is called the unit tangent vector at the point $P_t$. (Here, and throughout the following discussion, we are assuming that the speed $|\mathbf{V}_t|$ is not zero. If the length of $\mathbf{V}_t$ is 0, then its direction is not determined, and so $\mathbf{T}$ is not determined either.)

Next we take a free vector $\mathbf{N}$, with length 1, perpendicular to $\mathbf{T}$, and lying on the same side of $\mathbf{T}$ as $\mathbf{A}_t$. Then $\mathbf{A}_t$ must be expressible as a linear combination
$$\mathbf{A}_t = \alpha\mathbf{T} + \beta\mathbf{N}$$
of $\mathbf{T}$ and $\mathbf{N}$. Here $\alpha$ is the $\mathbf{T}$-component of $\mathbf{A}_t$, and $\beta$ is the $\mathbf{N}$-component. These numbers are called the tangential and normal components of the acceleration. We shall now compute them.

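In the right-hand figure above, $\theta$ is the direction of $\mathbf{V}_t$. Since $\mathbf{V}_t = f'(t)\overset{\rightharpoonup}{\mathbf{i}} + g'(t)\overset{\rightharpoonup}{\mathbf{j}}$, we have
$$\cos\theta = \frac{f'(t)}{|\mathbf{V}_t|}, \qquad \sin\theta = \frac{g'(t)}{|\mathbf{V}_t|},$$
where
$$|\mathbf{V}_t| = \sqrt{f'(t)^2 + g'(t)^2}.$$
Similarly, $\phi$ is the direction of the acceleration, so that
$$\cos\phi = \frac{f''(t)}{|\mathbf{A}_t|}, \qquad \sin\phi = \frac{g''(t)}{|\mathbf{A}_t|}.$$
By definition of the $\mathbf{T}$-component $\alpha$ of $\mathbf{A}_t$, we have
$$\alpha = |\mathbf{A}_t| \cos(\phi - \theta) = |\mathbf{A}_t| \cos\phi\cos\theta + |\mathbf{A}_t| \sin\phi\sin\theta = f''(t)\cos\theta + g''(t)\sin\theta$$
$$= \frac{f''(t)f'(t)}{|\mathbf{V}_t|} + \frac{g''(t)g'(t)}{|\mathbf{V}_t|} = \frac{f'(t)f''(t) + g'(t)g''(t)}{\sqrt{f'(t)^2 + g'(t)^2}}.$$

Theorem 1. The tangential component of acceleration is the derivative of the speed. That is,
$$\alpha = A_T = \frac{d}{dt}|\mathbf{V}_t| = |\mathbf{V}_t|'.$$

Once this has been observed, it is easy to check it, by differentiating the function
$$|\mathbf{V}_t| = \sqrt{f'(t)^2 + g'(t)^2}.$$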

The normal component $\beta$ is computed as follows. If $\mathbf{N}$ is counterclockwise from $\mathbf{T}$, as in the figure on p. 423, then
$$\beta = |\mathbf{A}_t| \cos[(\theta + \pi/2) - \phi] = |\mathbf{A}_t| \cos[(\theta - \phi) + \pi/2] = -|\mathbf{A}_t| \sin(\theta - \phi).$$
If the direction of $\mathbf{N}$ is reversed, the sign of $\sin(\theta - \phi)$ is also reversed. In any case, we want $\beta \ge 0$, because $\mathbf{N}$ is taken on the same side of the tangent as $\mathbf{A}_t$. Therefore we must have
$$\beta = |\mathbf{A}_t|\,|\sin(\theta - \phi)|,$$
in all cases. Therefore
$$\beta = |\mathbf{A}_t|\,|\sin\theta\cos\phi - \cos\theta\sin\phi| = \bigl|\,|\mathbf{A}_t| \sin\theta\cos\phi - |\mathbf{A}_t| \cos\theta\sin\phi\,\bigr| = |f''(t)\sin\theta - g''(t)\cos\theta|$$
$$= \frac{|f''(t)g'(t) - g''(t)f'(t)|}{|\mathbf{V}_t|} = \frac{|f''(t)g'(t) - g''(t)f'(t)|}{\sqrt{f'(t)^2 + g'(t)^2}}.$$

This formula for fJ also has an interpretation, but its interpretation is harder to
see, and requires the idea of the curvature of a path at a point.
For the sake of simplicity, we start with the idea of the curvature of the graph of a
twice differentiable function at a point.
For each $x$, let $s(x)$ be the length of the graph from $t = a$ to $t = x$. Then
$$s(x) = \int_a^x \sqrt{1 + f'(t)^2}\,dt, \qquad \text{and} \qquad s'(x) = \sqrt{1 + f'(x)^2}.$$
For each $x$, let $\theta(x)$ be the direction of the tangent line, with
$$-\pi/2 < \theta(x) < \pi/2.$$
Since $s$ is an increasing function, $\theta(x)$ is determined when $s(x)$ is known. Thus there is a function $h$ such that
$$\theta(x) = h(s(x)),$$
and (in the language of Section 5.8)
$$\frac{d\theta}{ds} = h'(s(x)) = \frac{d\theta/dx}{ds/dx}.$$
The curvature is defined to be
$$\kappa = \left|\frac{d\theta}{ds}\right|.$$

This is easy to calculate. Since
$$\theta(x) = \operatorname{Tan}^{-1} f'(x),$$
we have
$$\theta'(x) = \frac{d\theta}{dx} = \frac{1}{1 + f'(x)^2}\,f''(x).$$
Therefore
$$\kappa = \left|\frac{d\theta}{ds}\right| = \left|\frac{d\theta/dx}{ds/dx}\right| = \left|\frac{f''(x)}{1 + f'(x)^2}\right| \cdot \frac{1}{\sqrt{1 + f'(x)^2}} = \frac{|f''(x)|}{[1 + f'(x)^2]^{3/2}}.$$
For future reference:

Theorem 2. The curvature of the graph of a twice differentiable function is given by the formula
$$\kappa = \frac{|f''(x)|}{[1 + f'(x)^2]^{3/2}}.$$
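The curvature formula $\kappa = |f''(x)| / [1 + f'(x)^2]^{3/2}$ can be evaluated directly once $f'$ and $f''$ are known. The sketch below is illustrative: the helper name `graph_curvature` and the two sample graphs (the parabola $y = x^2$ and the upper half of a circle of radius 2) are choices made here, and the derivatives are supplied by hand.

```python
import math

def graph_curvature(d1, d2, x):
    """Curvature |f''(x)| / (1 + f'(x)^2)^(3/2), given functions d1 = f' and d2 = f''."""
    return abs(d2(x)) / (1.0 + d1(x) ** 2) ** 1.5

# Parabola y = x^2: f'(x) = 2x, f''(x) = 2.
parabola_at_0 = graph_curvature(lambda x: 2 * x, lambda x: 2.0, 0.0)
parabola_at_1 = graph_curvature(lambda x: 2 * x, lambda x: 2.0, 1.0)

# Upper semicircle y = sqrt(a^2 - x^2) with a = 2:
# f'(x) = -x / sqrt(a^2 - x^2), f''(x) = -a^2 / (a^2 - x^2)^(3/2).
a = 2.0
d1 = lambda x: -x / math.sqrt(a * a - x * x)
d2 = lambda x: -a * a / (a * a - x * x) ** 1.5
circle_at_0 = graph_curvature(d1, d2, 0.0)
circle_at_1 = graph_curvature(d1, d2, 1.0)

print(parabola_at_0, parabola_at_1)
print(circle_at_0, circle_at_1)
```

For the semicircle the value is the same at both sample points, as one would expect of a curve that bends at a constant rate; for the parabola the curvature falls off away from the vertex.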

For paths, the idea is similar. Take a fixed $t_0$, and for each $t$, let $s(t)$ be the length of the path, from $t_0$ to $t$. Then
$$s(t) = \int_{t_0}^{t} \sqrt{f'(u)^2 + g'(u)^2}\,du, \qquad \text{and} \qquad s'(t) = \sqrt{f'(t)^2 + g'(t)^2}.$$
For each $t$, let $\theta(t)$ be the direction of the velocity vector at time $t$. We are working on a portion of the path where $|\mathbf{V}_t| = \sqrt{f'(t)^2 + g'(t)^2} > 0$. On such a portion of the path, $s$ is an increasing function, and so $\theta(t)$ is determined when $s(t)$ is known. Thus there is a function $h$ such that
$$\theta(t) = h(s(t)).$$
Therefore
$$\frac{d\theta}{ds} = h'(s(t)) = \frac{d\theta/dt}{ds/dt}.$$
But according to the definition of the curvature $\kappa$ of the path,
$$\kappa = \left|\frac{d\theta}{ds}\right|.$$
In order to calculate $\kappa$, we take first the case in which $\mathbf{V}_t$ is not vertical, so that
$$\tan\theta(t) = \frac{g'(t)}{f'(t)}.$$
Taking the derivative, we get
$$[\sec^2\theta(t)]\,\theta'(t) = \frac{f'(t)g''(t) - g'(t)f''(t)}{f'(t)^2}.$$

Now
$$\sec^2\theta(t) = 1 + \tan^2\theta(t) = 1 + \frac{g'(t)^2}{f'(t)^2} = \frac{f'(t)^2 + g'(t)^2}{f'(t)^2}.$$
Therefore
$$\theta'(t) = \frac{f'(t)g''(t) - g'(t)f''(t)}{f'(t)^2 + g'(t)^2}.$$
This derivation works whenever the velocity vector is nonvertical. (Query: How would you derive the same formula, in the case where the velocity vector is vertical?) This gives
$$\kappa = \left|\frac{d\theta}{ds}\right| = \left|\frac{d\theta/dt}{ds/dt}\right| = \left|\frac{\theta'(t)}{s'(t)}\right| = \left|\frac{f'(t)g''(t) - g'(t)f''(t)}{f'(t)^2 + g'(t)^2}\right| \cdot \frac{1}{\sqrt{f'(t)^2 + g'(t)^2}} = \frac{|f'(t)g''(t) - g'(t)f''(t)|}{[f'(t)^2 + g'(t)^2]^{3/2}}.$$
Thus we have:

Theorem 3. The curvature of a twice differentiable path, at any point where the speed is not 0, is given by the formula
$$\kappa = \frac{|f'(t)g''(t) - g'(t)f''(t)|}{[f'(t)^2 + g'(t)^2]^{3/2}}.$$

Comparing this with the formula
$$\beta = \frac{|f''(t)g'(t) - g''(t)f'(t)|}{\sqrt{f'(t)^2 + g'(t)^2}},$$
we get:

Theorem 4. At any point where the speed is not zero, the normal component of acceleration is given by the formula
$$\beta = A_N = \kappa\,|\mathbf{V}_t|^2.$$
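The formulas for the tangential component, the normal component, and the curvature can be checked against one another on a concrete path. The following is a sketch, assuming the sample path $P_t = t\,\mathbf{i} + t^2\,\mathbf{j}$ and a hypothetical helper `acceleration_components`; the derivative of the speed is estimated by a central difference with an arbitrarily chosen step.

```python
import math

def acceleration_components(fp, gp, fpp, gpp, t):
    """Return (speed, alpha, beta, kappa) at time t from f', g', f'', g''.

    alpha is the tangential component, beta the normal component, and kappa
    the curvature; the speed is assumed to be nonzero.
    """
    speed = math.hypot(fp(t), gp(t))
    alpha = (fp(t) * fpp(t) + gp(t) * gpp(t)) / speed
    cross = fp(t) * gpp(t) - gp(t) * fpp(t)
    beta = abs(cross) / speed
    kappa = abs(cross) / speed ** 3
    return speed, alpha, beta, kappa

# Sample path: f(t) = t, g(t) = t^2.
fp = lambda t: 1.0
gp = lambda t: 2.0 * t
fpp = lambda t: 0.0
gpp = lambda t: 2.0

speed, alpha, beta, kappa = acceleration_components(fp, gp, fpp, gpp, 1.0)

# Theorem 1: alpha should equal the derivative of the speed.
h = 1e-5
speed_rate = (math.hypot(fp(1.0 + h), gp(1.0 + h))
              - math.hypot(fp(1.0 - h), gp(1.0 - h))) / (2 * h)

print(alpha, speed_rate)            # tangential component vs. d|V|/dt
print(beta, kappa * speed ** 2)     # Theorem 4: beta = kappa |V|^2
print(alpha ** 2 + beta ** 2)       # equals |A|^2 = 4 for this path
```

Since $\mathbf{T}$ and $\mathbf{N}$ are perpendicular unit vectors, $\alpha^2 + \beta^2$ must equal $|\mathbf{A}_t|^2$, which gives one more consistency check.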
In our discussion, we used the notation f',

g',

... for derivatives, most of the

time, in order to connect our work with the preceding theory. We used the notation

de/dt, de/ds,

. only when we really needed to talk about the derivative of one

function with respect to another, in defining and calculating curvature. In the litera
ture of physics, however, the notation f',

'
g,

. is hardly used at all. The following

notations are far more common:

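\[ V = \frac{df}{dt}\,\mathbf{i} + \frac{dg}{dt}\,\mathbf{j}, \qquad V = \frac{dx}{dt}\,\mathbf{i} + \frac{dy}{dt}\,\mathbf{j}, \qquad V = \dot{x}\,\mathbf{i} + \dot{y}\,\mathbf{j}. \]

In the last expression the dots over $x$ and $y$ indicate differentiation with respect to time. Similarly,

\[ A = \frac{d^2 f}{dt^2}\,\mathbf{i} + \frac{d^2 g}{dt^2}\,\mathbf{j} = \frac{d^2 x}{dt^2}\,\mathbf{i} + \frac{d^2 y}{dt^2}\,\mathbf{j} = \ddot{x}\,\mathbf{i} + \ddot{y}\,\mathbf{j}. \]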


In these notations,

\[ K = \frac{|\dot{x}\ddot{y} - \dot{y}\ddot{x}|}{[\dot{x}^2 + \dot{y}^2]^{3/2}} \]

and

\[ K = \frac{|(dx/dt)(d^2y/dt^2) - (dy/dt)(d^2x/dt^2)|}{[(dx/dt)^2 + (dy/dt)^2]^{3/2}}. \]

There is a good reason, in physics, for the use of the "fractional" notation $dy/dt$, $df/dx$, ... for derivatives. Most of the time, physical problems involve a large number of interrelated functions, and physicists need to talk about the derivative of one of these with respect to another. Therefore, the rest of the time, they use the same "fractional" notation $df/dx$ for ordinary derivatives $f'$.

PROBLEM SET 9.9


1.

Find the point of maximum curvature of the parabola y =


value of

2.

and find the maximum

Find the point of maximum curvature of the parabola y =2 +


maximum value of

3.

x2,

K.

+ x2, and find the

K.

Find the points of maximum and minimum curvature of the graph of


calculate the values of

x3,

and

at these points.

4.

Calculate the curvature of a circle of radius

5.

Calculate the curvature at the points

a.

(a, 0) and (0, b) for the ellipse

x2

y2

+
=1.
/;2

6. Sketch the path

P_t = i cos t + j sin t,

showing the velocity and acceleration vectors at several points.
7.

Discuss (as in Problem

6)
t

Pt = i cos 2 + j sin 2 .
8. Discuss
9. Discuss
10. Discuss

Pt =2i cost + j sin t.

Pt = it + jt2 .
Pt

ti +

(t - t2)j (0 t

11. Discuss

Pt =i cos t2 + j sin t2.

12.

Pt

Discuss

it + jt3.

13. Discuss

Pt = it3 + jt2

14. Discuss

Pt =i(l - t)2 + j(t3 - t).

15.

Pt =(t cos ()()i + ( -!gt2 + t cos ()()j.


0, and find the direction of this vector.

Discuss

1).

In the sketch , show the velocity at


16. For a certain path, the velocity at time 0 has direction I)( and length 1. The initial
point P0 is the origin. For each t, At
gj Express the path in the form Pt
f(t )i + g(t)j.
=

17.

Discuss as in Problems 6 through 15, and express the tangential and normal com
ponents of acceleration as functions of the time:

Pt

18. Discuss and sketch

P0

3i cos 2t + 3j sin 2t.

i cos2 8 + j sin 8 cos 8.

Describe this as a path in polar coordinates; find a rectangular equation for its locus,
and identify the locus.
19. Discuss and sketch

Po

(cos 8 - cos2 8)i + (sin 8 - sin 8 cos 8)j.

20. Discuss and sketch

21. Discuss and sketch

Po

(3 cos 8 + cos 38)i + (3 sin 8 - sin 38)j.

P0

22. Discuss and sketch


po

i(cos2 8 - sin2 8)i + sin 8 cos 8j.

(cos 8 - cos 8 sin 8)i + (sin 8 - sin2 8)j.

23. Is the following statement true? (Why or why not?)

Theorem (?). Given a path with coordinate functions f and g, on an interval [a, b], such that f and g are differentiable, and the velocity is nowhere 0, then there is a time t such that V_t has the same direction as P_aP_b.

The figure indicates that in some cases, at least, there is such a time t.

24. Given a path which has curvature K at time t_0, suppose that the axes are rotated. Does the curvature change? Why or why not? [Hint: This problem does not require a calculation.]

25. Let a = i + j, b = j. Suppose we define an "inner product" V_1 * V_2, by agreeing that for

V_1 = x_1 a + y_1 b,
V_2 = x_2 a + y_2 b,

the * product is

V_1 * V_2 = x_1 x_2 + y_1 y_2.

a) Does * obey the same formal laws as the old inner product?
b) Is it true that V_1 * V_2 = V_1 · V_2 for every V_1 and V_2? Why or why not? In any case, express the new operation * in terms of the old.

9.10 CONCLUDING REMARKS ON VECTOR SPACES AND INNER PRODUCT SPACES

The treatment of vectors in this chapter has been brief, because so far we are working in a plane, and the main advantages of a vector approach appear in three-dimensional space, and in spaces of higher dimensions. Meanwhile we must bear in mind that vector ideas appear in many different forms.

1) Free vectors. Velocity and acceleration are vectors in this sense, as in Sections 9.8 and 9.9.

2) Bound vectors. These have not only length and direction, but also position. For example, if two forces act in opposite directions on the ends of a spring, then they may be regarded as bound vectors.

[Figure: a spring with two forces F_1 and F_2 acting on its ends in opposite directions]

In the figure, the two forces have the same length and opposite directions, but they do not cancel each other out, as free vectors would; on the contrary, they compress the spring.
3) Sequences of numbers, regarded as vectors. For example, ordered quadruplets $(w, x, y, z)$ can be regarded as vectors. We make the natural definitions

\[ (w_1, x_1, y_1, z_1) + (w_2, x_2, y_2, z_2) = (w_1 + w_2,\; x_1 + x_2,\; y_1 + y_2,\; z_1 + z_2), \]
\[ \alpha(w, x, y, z) = (\alpha w, \alpha x, \alpha y, \alpha z), \]
\[ (w_1, x_1, y_1, z_1) \cdot (w_2, x_2, y_2, z_2) = w_1 w_2 + x_1 x_2 + y_1 y_2 + z_1 z_2. \]

In fact, this is the usual way of describing a space of four dimensions.
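The three definitions for quadruplets translate directly into code. This is only an illustrative sketch; the function names `add`, `scale` and `inner` are hypothetical, and tuples of any common length are accepted, although the text speaks of quadruplets.

```python
def add(u, v):
    """Componentwise sum of two sequences of numbers of the same length."""
    return tuple(a + b for a, b in zip(u, v))

def scale(alpha, u):
    """Scalar multiple alpha * u, taken component by component."""
    return tuple(alpha * a for a in u)

def inner(u, v):
    """Inner product: the sum of the products of corresponding components."""
    return sum(a * b for a, b in zip(u, v))
```

For instance, `inner((1, 2, 3, 4), (4, 3, 2, 1))` adds the four products 4, 6, 6 and 4.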

4) Systems of other kinds, regarded as vector spaces and inner product spaces. Some of these are unexpected, but turn out to be useful. See, for example, Problem 33 of Problem Set 9.7, in which it appeared that a set of functions can be regarded as an inner product space, although functions may not seem like vectors when we look at them one at a time.

For this reason, when people speak of "vectors," we need to find out what kind of vectors they are talking about.

10 Infinite Series

10.1 LIMITS OF SEQUENCES

Most of the time, so far, we have dealt with limits of functions, as $x \to a$ or as $x \to \infty$. But often we have dealt with limits of sequences, as $n \to \infty$. For example, in Section 2.10, we wanted to find the area $A$, under the graph of $y = x^2$, from $x = 0$ to $x = h$. We expressed $A$ as the limit of a sequence $A_1, A_2, \ldots$, where $A_n$ is the area of a circumscribed polygonal region $R_n$. We calculated

\[ A_n = \frac{h^3}{3}\left(1 + \frac{1}{n}\right)\left(1 + \frac{1}{2n}\right), \]

and we found that

\[ \lim_{n\to\infty} A_n = \frac{h^3}{3}. \]
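The closed form for $A_n$ can be compared with a direct sum of rectangle areas. The sketch below is an illustration only; it assumes that the region $R_n$ is built from $n$ rectangles of equal width whose heights are taken at the right-hand endpoints, and the function names are hypothetical.

```python
def circumscribed_area(h, n):
    """Total area of n circumscribed rectangles for y = x^2 on [0, h] (equal widths assumed)."""
    width = h / n
    return sum((i * width) ** 2 * width for i in range(1, n + 1))

def closed_form(h, n):
    """The formula A_n = (h^3 / 3)(1 + 1/n)(1 + 1/(2n))."""
    return (h ** 3 / 3) * (1 + 1 / n) * (1 + 1 / (2 * n))
```

As `n` grows, both quantities approach `h ** 3 / 3`.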

We are now going to use limits of sequences more extensively, as a way of dealing with infinite series. Given an infinite sum

\[ \sum_{i=1}^{\infty} a_i = a_1 + a_2 + \cdots, \]

we define

\[ A_n = \sum_{i=1}^{n} a_i = a_1 + a_2 + \cdots + a_n, \]

and we call the $A_n$'s the partial sums of $\sum_{i=1}^{\infty} a_i$. Thus the $A_n$'s form a sequence $A_1, A_2, \ldots$ If

\[ \lim_{n\to\infty} A_n = A, \]

then we say that the infinite sum is convergent, and we write

\[ \sum_{i=1}^{\infty} a_i = A. \]

We shall now examine limits of sequences more carefully, starting with the definition of the limit, and building up the theory that is needed.

Definition. Given a sequence $A_1, A_2, \ldots$ of numbers, and a number $L$. Suppose that for every $\epsilon > 0$ there is an integer $N$ such that

\[ n > N \;\Rightarrow\; |A_n - L| < \epsilon. \]

Then

\[ \lim_{n\to\infty} A_n = L. \]

Note that this is like the definition of $\lim_{x\to\infty} f(x)$. A sequence which has a limit is called convergent. Here, as always, when we speak of a limit we mean a finite limit (unless the contrary is stated).


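Theorem 1. $\lim_{n\to\infty} \dfrac{1}{n} = 0$.

Proof. Here $L = 0$, and $|A_n - L| = |1/n - 0| = 1/n$. Thus we need to show that for every $\epsilon > 0$ there is an $N$ such that

\[ n > N \;\Rightarrow\; \frac{1}{n} < \epsilon. \]

Now

\[ \frac{1}{n} < \epsilon \;\Leftrightarrow\; n > \frac{1}{\epsilon}. \]

If $1/\epsilon$ is an integer, let $N = 1/\epsilon$. In any case, there is an integer $N \geq 1/\epsilon$. Then

\[ n > N \;\Rightarrow\; n > 1/\epsilon \;\Rightarrow\; 1/n < \epsilon, \]

which is what we wanted.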
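The proof is constructive: it tells us how to pick $N$ once $\epsilon$ is given. A small sketch of that choice follows; the function name `n_for_epsilon` is hypothetical, and taking the ceiling of $1/\epsilon$ is one convenient way (an assumption, not the only one) of getting an integer $N \geq 1/\epsilon$.

```python
import math

def n_for_epsilon(eps):
    """Return an integer N with N >= 1/eps, so that n > N implies 1/n < eps."""
    if eps <= 0:
        raise ValueError("eps must be positive")
    return math.ceil(1 / eps)
```

Given the returned `N`, every `n > N` satisfies `1 / n < eps`, which is exactly the condition in the definition of the limit.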


On the basis of the definition of $\lim_{n\to\infty} A_n$, we can prove the expected theorems on sums, products, and quotients. These are much like the corresponding theorems for limits of functions. In Appendix C they are listed in such an order that they become easy to prove. Meanwhile we shall state the main results and use them.

Theorem 2. If $\lim_{n\to\infty} A_n = A$ and $\lim_{n\to\infty} B_n = B$, then

\[ \lim_{n\to\infty} (A_n + B_n) = A + B \]

and

\[ \lim_{n\to\infty} A_n B_n = AB. \]

If $B \neq 0$, and $B_n \neq 0$ for each $n$, then

\[ \lim_{n\to\infty} \frac{A_n}{B_n} = \frac{A}{B}. \]

These theorems justify the procedures that we have been using informally. For example, they give a proof that

\[ \lim_{n\to\infty} \frac{h^3}{3}\left(1 + \frac{1}{n}\right)\left(1 + \frac{1}{2n}\right) = \frac{h^3}{3}. \]

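The steps are as follows:

\[ \lim_{n\to\infty} \frac{1}{n} = 0, \]
\[ \lim_{n\to\infty} \left(1 + \frac{1}{n}\right) = 1, \]
\[ \lim_{n\to\infty} \frac{1}{2n} = \frac{1}{2}\lim_{n\to\infty} \frac{1}{n} = 0, \]
\[ \lim_{n\to\infty} \left(1 + \frac{1}{2n}\right) = 1, \]
\[ \lim_{n\to\infty} \frac{h^3}{3}\left(1 + \frac{1}{n}\right)\left(1 + \frac{1}{2n}\right) = \frac{h^3}{3}. \]

(Justification for each of these steps?)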


If we start with convergent sequences $A_1, A_2, \ldots$, $B_1, B_2, \ldots$, and so on, then Theorem 2 tells us that certain other sequences are convergent. But often we deal with sequences which are not built up out of convergent sequences as in Theorem 2. We then need the following ideas.

Definition. A sequence $A_1, A_2, \ldots$ is increasing if $A_n \leq A_{n+1}$ for every $n$. The sequence is decreasing if $A_n \geq A_{n+1}$ for every $n$. (If $A_n < A_{n+1}$ for every $n$, then the sequence is strictly increasing; and if $A_{n+1} < A_n$ for every $n$, then the sequence is strictly decreasing.)

Definition. If there is a number $M$ such that $A_n \leq M$ for every $n$, then $M$ is called an upper bound of the sequence $A_1, A_2, \ldots$, and we say that the sequence is bounded above. If there is a number $m$ such that $m \leq A_n$ for every $n$, then $m$ is called a lower bound of the sequence, and we say that the sequence is bounded below. If there is a $K > 0$ such that $|A_n| \leq K$ for every $n$, then the sequence is bounded.
Example: (1) If $A_n = \sqrt{n}$ for every $n$, then the sequence is increasing, and is bounded below but not above. (2) If $A_n = e^{-n}$, then the sequence is decreasing, and is bounded. (3) If $A_n = \sin n$, then the sequence is neither increasing nor decreasing, but is bounded, with $|\sin n| \leq 1$ for each $n$.


It is easy to see that if a sequence is bounded both above and below, then it is bounded. Given $m \leq A_n \leq M$ for every $n$, let $K$ be the larger of the numbers $|m|$ and $|M|$.
Theorem 3. If a sequence is increasing, and is bounded above, then it is convergent. That is, if

\[ A_1 \leq A_2 \leq \cdots \leq A_n \leq A_{n+1} \leq \cdots \leq M, \]

then the sequence has a limit.

The first application of this principle that you may have seen is in geometry. Given a circle of diameter 1, we inscribe in it a regular polygon of $2^n$ sides. For each $n$, let $A_n$ be the perimeter of our $2^n$-gon. (Note that we had better start with $n = 2$.) It is a matter of elementary geometry to show that the sequence $A_2, A_3, \ldots$ is increasing. Also, $A_n < 4$ for every $n$, because the perimeter of every inscribed polygon is less than the perimeter of the circumscribed square. (Draw a figure.) Therefore the sequence is convergent. Its limit, of course, is $\pi$.
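The polygon example can be explored numerically. The sketch below reads the polygons as having $2^n$ sides (an assumption about the garbled source) and uses the half-angle relation between the side of a regular polygon and the side of the polygon with twice as many sides, which is a standard fact not derived in the text. The function name is hypothetical.

```python
import math

def inscribed_perimeters(n_max):
    """Perimeters A_2, ..., A_{n_max} of regular 2^n-gons inscribed in a circle of diameter 1."""
    side = math.sqrt(2) / 2  # side of the inscribed square, the case n = 2
    perimeters = []
    for n in range(2, n_max + 1):
        perimeters.append((2 ** n) * side)
        # Side of the polygon with twice as many sides (half-angle identity, radius 1/2).
        side = side / math.sqrt(2 * (1 + math.sqrt(1 - side * side)))
    return perimeters
```

The returned list is increasing and stays below 4, in agreement with the two facts used in the text to conclude convergence.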

We proceed to the proof. Let $S$ be the set of all numbers $A_n$. That is,

\[ S = \{A_1, A_2, \ldots\}. \]

Then $S$ has an upper bound. By the Least Upper Bound Postulate (LUBP), $S$ has a least upper bound. (See Section 5.6.) This is called the supremum of $S$, and is denoted by $\sup S$. Let

\[ A = \sup S. \]

We shall show that

\[ \lim_{n\to\infty} A_n = A. \]

Let $\epsilon$ be any positive number. Then $A - \epsilon < A$. Therefore $A - \epsilon$ is not an upper bound of $S$. Therefore $A_N > A - \epsilon$ for some $N$. Since the sequence is increasing, this means that

\[ n > N \;\Rightarrow\; A_n > A - \epsilon. \]

Since $A$ is an upper bound of $S$, and $A + \epsilon > A$, it follows that $A + \epsilon$ is an upper bound of $S$. Therefore

\[ A_n < A + \epsilon \quad \text{for every } n. \]

Therefore

\[ n > N \;\Rightarrow\; |A_n - A| < \epsilon, \]

and $\lim_{n\to\infty} A_n = A$, which was to be proved.

We have a similar theorem for decreasing sequences:

Theorem 4. If a sequence is decreasing, and is bounded below, then it is convergent. That is, if $A_1, A_2, \ldots$ is decreasing, and $A_n \geq K$ for every $n$, then the sequence has a limit.

Proof. For each $n$, let $B_n = -A_n$. Then $B_1, B_2, \ldots$ is increasing, and is bounded above. Therefore it is convergent. Let $\lim_{n\to\infty} B_n = B$. Then $\lim_{n\to\infty} A_n = -B$.

Some simple sequences converge for reasons which are not covered by the
preceding theorems. For example, given that
lim

n-.0 __

_!

0'

it is obvious that
lim
1.V"n-co
n + 2 n

0,

because the second sequence is smaller, term by term. This is the idea of the following
theorem.

Theorem 5 (The squeeze principle). If $\lim_{n\to\infty} A_n = L$, $\lim_{n\to\infty} C_n = L$, and $A_n \leq B_n \leq C_n$ for every $n$, then $\lim_{n\to\infty} B_n = L$.

In many cases, it is easier to use this theorem than to do awkward calculations.


Similarly, it ought to be true that

\[ \lim_{n\to\infty} \frac{\cos n}{n} = 0, \]

because $|\cos n| \leq 1$, and $1/n \to 0$. But we can't get this result from Theorem 2, because $\cos n$ does not approach a limit as $n \to \infty$. Hence we need the following:

Theorem 6 (The annihilation theorem). If $\lim_{n\to\infty} A_n = 0$, and $B_1, B_2, \ldots$ is bounded, then $\lim_{n\to\infty} A_n B_n = 0$.
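A numerical illustration of the annihilation theorem, with $A_n = 1/n$ and the bounded sequence $B_n = \cos n$. This is a sketch for intuition, not a proof, and the function name is hypothetical.

```python
import math

def annihilated_terms(n_max):
    """Terms A_n * B_n = (1/n) * cos(n) for n = 1, ..., n_max."""
    return [math.cos(n) / n for n in range(1, n_max + 1)]
```

Each term is at most `1 / n` in absolute value, so the products are forced toward 0 even though `cos(n)` itself keeps oscillating.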
Theorem 7. Every convergent sequence is bounded.

For increasing sequences, this is trivial: if $A_1, A_2, \ldots$ is increasing, and $\lim_{n\to\infty} A_n = A$, then $A_1 \leq A_n \leq A$ for every $n$. Similarly for decreasing sequences. For a proof in the general case, see Appendix C. This theorem has simple applications: if we show that a sequence is not bounded, then it follows that the sequence is not convergent.
The statements

\[ \lim_{n\to\infty} A_n = \infty, \qquad \lim_{n\to\infty} A_n = -\infty \]

mean what you would expect. You should be able to state your own definitions of them, following, if you need to, the models of Section 5.3. Sequences like this are not called convergent. If $\lim_{n\to\infty} A_n = \infty$, then we say that the sequence diverges to infinity. And if $\lim_{n\to\infty} A_n = -\infty$, we say that the sequence diverges to minus infinity. We have to be careful about this: if convergence allowed the limits $\infty$ and $-\infty$, then Theorem 7 would become false, and Theorem 2 would be meaningless in many cases. (You can't perform algebraic operations on the "numbers" $\infty$ and $-\infty$.)
PROBLEM SET 10.1


Investigate the following indicated limits. That is, find out whether they exist, and find
out, if possible, what they are.

1.

lim 2
noo

2.

Jim

2 + 311

3.
4.

6.

Jim

3n

n-oo n- +
Jim

(Try dividing the numerator and denominator by

---

n-Cf)

n-oo 2 ll

ll

(Try using one of the last theorems in this section.)

5.

+ n

.
sin (2n +
hm

n.)

1)

7.

lim

n->00

3
ll

ll

+ n2 +

.
cos (n
Jim

n->ro

7T

- 1)

n + 1

436

8.

9.

Infinite Series

Jim

n->OO

lim

n_,.oo

(
(

n
)

1 +

10.1

-l/n
)

[Hint: Surely you know $\lim_{x\to 0} (1 + x)^{1/x}$. Now find $\lim_{y\to\infty} (1 + 1/y)^{y}$, and apply the result to the problem in hand.]

10. lim,H"' Bn. where Bn is the perimeter of a regular 2n-gon circumscribed about a circle

1.

of radius

1.

13.

15.
17.

Jim Inn

n-oo

1/n

Jim In

n-oo

"dx

n-oo

i
f

Jim

.L

Jim

n-+CO

Jim

n---+CO z=l

19.

Jim

21.

14.

n---+CO

(n2)

Jim In (

Jim

n-c.o

)
ll

"dx
3
1 x

(Investigate existence only. A geometric interpretation is useful.)

"dx

n--oo J1

Jim

n-oo

dx
312

"

Jim In

12.

16.

X2

18.

20.

(You need not prove that your answer to this one is right.)

n--i-oo i=1

X
1

(Investigate existence only.)

:a
l

n
1
L .31
n-+OO 1.=l l 2

lim

Jim

Jim

(Investigate existence only.)

22.

n->OO t=l l

23.

n--co i=l

sin

24.

Jim

cos

-:

(Investigate existence only.)

( hr)n n
I ( 2n) n
:

hr

noo i=l

(Geometric interpretation?)
n

!!__

25.

I)

_L --- (
n-oo i=l I + (i/n) ;
Jim

10.2

26.

Infinite Series.

lim
n-co

"

i=l

1 + (i/211)

28.

l n
lim - L e-i/n
n-co n i=l

30.

Jim n2 sin 2
n
n--+CO

32.

Jim
n-.oo

34.

Jim
n--+CO

(1)-

27.

31.

437

L eifn

n-co n i=l

-n1
( 1n

Jim

sin

Jim 11 1
n-+CO
Jim

33.

tan

Comparison Tests

1 n

Jim -

29.

Convergence.

sec

cos

[I - ]
In

1.=l l

(In fact, this limit exists; if you can find a geometric interpretation of the problem, you can prove it. The limit is known as Euler's constant. Nobody knows whether it is rational.)


n

35.

Jim
n->OO

10.2

i=l

36.

(Investigate existence only.)

(21 + 1)

n
1
um I
.2
)2
i=l
(31
+ l
n-.oo

INFINITE SERIES. CONVERGENCE. COMPARISON TESTS

By an infinite series we mean an indicated sum of the form

\[ \sum_{i=1}^{\infty} a_i = a_1 + a_2 + \cdots + a_n + \cdots \]

We say "an indicated sum" because in many cases there is no such thing as the sum of infinitely many terms. For example, the series

\[ 1 + 1 + 1 + \cdots \quad \text{(to infinity)} \]

has no sum; and neither does the series

\[ 1 - 1 + 1 - 1 + \cdots \quad \text{(to infinity)}. \]

In many cases, however, the "sum of infinitely many terms" can be defined, by a passage to a limit, in the following way. Given the series $\sum_{i=1}^{\infty} a_i$, for each $n$, let

\[ A_n = \sum_{i=1}^{n} a_i = a_1 + a_2 + \cdots + a_n. \]

Then $A_n$ is called the $n$th partial sum of the series. If

\[ \lim_{n\to\infty} A_n = A, \]

where $A$ is a (finite) number, then we say that the series is convergent and that $A$ is its sum. We also say that the series converges to $A$. If the sequence $A_1, A_2, \ldots$ has no limit, then we say that the series is divergent. If

\[ \lim_{n\to\infty} A_n = \infty, \]

then the series diverges to infinity; and if

\[ \lim_{n\to\infty} A_n = -\infty, \]

then the series diverges to minus infinity. We may write these statements briefly as

\[ \sum_{i=1}^{\infty} a_i = A, \qquad \sum_{i=1}^{\infty} a_i = \infty, \qquad \sum_{i=1}^{\infty} a_i = -\infty. \]

Probably the first example that you have seen of a convergent series is the geometric series

\[ 1 + r + r^2 + \cdots + r^n + \cdots \qquad (0 < r < 1). \]

Here

\[ A_n = 1 + r + r^2 + \cdots + r^n = \frac{1 - r^{n+1}}{1 - r} = \frac{1}{1 - r} - \frac{r^{n+1}}{1 - r}. \]

If we know that

\[ \lim_{n\to\infty} r^n = 0 \qquad (0 < r < 1), \tag{1} \]

then it follows that

\[ \lim_{n\to\infty} A_n = \frac{1}{1 - r} \qquad (0 < r < 1); \tag{2} \]

and this means that

\[ \sum_{i=0}^{\infty} r^i = \frac{1}{1 - r} \qquad (0 < r < 1). \tag{3} \]

There are many ways of proving (1). The following proof is the easiest.
Since $0 < r < 1$, we have

\[ r^{n+1} < r^n \quad \text{for every } n. \]

Therefore the sequence $r, r^2, r^3, \ldots$ is decreasing. And it has a lower bound, namely 0. Therefore the sequence is convergent, to some limit $L$. Thus

\[ \lim_{n\to\infty} r^n = L, \quad \text{and} \quad \lim_{n\to\infty} r^{n+1} = L. \]

(Why? What happens to the limit of a sequence, if you omit the first term?) Therefore

\[ L = \lim_{n\to\infty} r^{n+1} = r \lim_{n\to\infty} r^n = rL, \]

and so

\[ L = rL, \qquad (1 - r)L = 0. \]

Since $1 - r \neq 0$, it follows that $L = 0$. Therefore

\[ \lim_{n\to\infty} r^n = 0 \qquad (0 < r < 1). \]

In fact, the same conclusion holds for $-1 < r < 0$. We get this from the following observation:

Theorem 1. If $\lim_{n\to\infty} |a_n| = 0$, then $\lim_{n\to\infty} a_n = 0$, and conversely.

Proof? (If you rewrite these two statements, using the definitions of the statements $\lim_{n\to\infty} |a_n| = 0$ and $\lim_{n\to\infty} a_n = 0$, they hardly even look different.) Thus we get the following theorem.

Theorem 2. If $-1 < r < 1$, then

\[ \lim_{n\to\infty} r^n = 0. \]

Algebraically, the formula

\[ 1 + r + r^2 + \cdots + r^n = \frac{1 - r^{n+1}}{1 - r} \]

holds for every $r \neq 1$. We therefore have a more general result for geometric series:

Theorem 3. If $-1 < r < 1$, then

\[ \sum_{i=0}^{\infty} r^i = \frac{1}{1 - r}. \]

This holds because $\lim_{n\to\infty} [-r^{n+1}/(1 - r)] = 0/(1 - r) = 0$.

If the first term is any number $a$, rather than 1, then we have:

Theorem 4. If $-1 < r < 1$, then

\[ \sum_{i=0}^{\infty} a r^i = \frac{a}{1 - r}. \]

The following theorem often makes it easy to see that a series diverges.

Theorem 5. If $\sum_{i=1}^{\infty} a_i$ is convergent, then $\lim_{n\to\infty} a_n = 0$.

Proof. For each $n$, let

\[ A_n = \sum_{i=1}^{n} a_i. \]

Let $\lim_{n\to\infty} A_n = A$. Then $\lim_{n\to\infty} A_{n-1} = A$, where $n > 1$. Therefore

\[ \lim_{n\to\infty} (A_n - A_{n-1}) = A - A = 0. \]

But $A_n - A_{n-1} = a_n$. Therefore $\lim_{n\to\infty} a_n = 0$, which was to be proved.
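The geometric series results (Theorems 3 and 4 of this section) are easy to check numerically. The sketch below sums the terms one at a time and compares the result with $a/(1 - r)$; the function names are hypothetical and the sample values in the usage note are arbitrary.

```python
def geometric_partial_sum(a, r, n):
    """a + a*r + ... + a*r**n, summed term by term."""
    total, term = 0.0, a
    for _ in range(n + 1):
        total += term
        term *= r
    return total

def geometric_limit(a, r):
    """The sum a / (1 - r), valid only for -1 < r < 1."""
    if not -1 < r < 1:
        raise ValueError("the geometric series converges only for -1 < r < 1")
    return a / (1 - r)
```

With `a = 3.0` and `r = -0.5`, the partial sums settle on `geometric_limit(3.0, -0.5)` after a few dozen terms, since the leftover term `r ** (n + 1)` shrinks geometrically.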

For example, the geometric series $\sum_{i=0}^{\infty} a r^i$ is divergent for $a \neq 0$ and $|r| \geq 1$. In this case, $|a_n| = |a|\,|r|^n \geq |a|$, and so $a_n$ does not approach 0.

Warning. The converse of Theorem 5 is false. That is, the $n$th term of a series may approach 0, and the series may still diverge. The simplest example of this is the series

\[ 1 + \tfrac12 + \tfrac12 + \tfrac13 + \tfrac13 + \tfrac13 + \tfrac14 + \tfrac14 + \tfrac14 + \tfrac14 + \cdots \]

The next five terms are each equal to $\tfrac15$; and so on. Here $a_n \to 0$, but the series diverges to infinity.
A more natural example of the same phenomenon is the harmonic series

\[ \sum_{i=1}^{\infty} \frac{1}{i} = 1 + \frac12 + \frac13 + \cdots + \frac1n + \cdots \]

In fact, this diverges. The easiest way to see this is to draw a picture: For each $n$, the area under the graph from $x = 1$ to $x = n + 1$ is less than the total area of the circumscribed rectangles. Therefore

\[ A_n = 1 + \frac12 + \frac13 + \cdots + \frac1n > \int_1^{n+1} \frac{dx}{x}. \]

But this integral is $\ln (n + 1)$; and

\[ \lim_{n\to\infty} \ln (n + 1) = \infty. \]

Therefore the partial sums $A_n$ form an unbounded sequence, and the series must diverge to infinity. Briefly:

Theorem 6. $\sum_{i=1}^{\infty} (1/i) = \infty$.
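The inequality $A_n > \ln(n + 1)$ behind the divergence of the harmonic series can be observed directly. A minimal sketch, with a hypothetical function name:

```python
import math

def harmonic_partial_sum(n):
    """A_n = 1 + 1/2 + ... + 1/n."""
    return sum(1 / i for i in range(1, n + 1))
```

Comparing `harmonic_partial_sum(n)` with `math.log(n + 1)` for several values of `n` shows the partial sums staying above a quantity that grows without bound, although the growth is very slow.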

The same sort of comparison scheme can be used for other series, to show that they converge. Consider, for example,

\[ \sum_{i=1}^{\infty} \frac{1}{i^2}. \]

Here $a_i = 1/i^2$, and so $\lim_{i\to\infty} a_i = 0$. This does not, in itself, show that the series converges. But the algebraic pattern suggests that the series is related to the improper integral

\[ \int_1^{\infty} \frac{dx}{x^2} = \lim_{a\to\infty} \int_1^{a} \frac{dx}{x^2} = \lim_{a\to\infty} \left[-\frac{1}{x}\right]_1^a = \lim_{a\to\infty} \left(-\frac{1}{a} + 1\right) = 1. \]

Since $1/x^2 > 0$ for every $x$, the integral approaches its limit from below, and

\[ \int_1^{n} \frac{dx}{x^2} < 1 \quad \text{for every } n. \]

And since the function $1/x^2$ is decreasing,

\[ \frac{1}{n^2} < \int_{n-1}^{n} \frac{dx}{x^2} \qquad (n > 1). \]

(Here the area of the rectangle is $1/n^2$.) Therefore

\[ \frac{1}{2^2} < \int_1^2 \frac{dx}{x^2}, \qquad \frac{1}{3^2} < \int_2^3 \frac{dx}{x^2}, \qquad \ldots, \]

and

\[ A_n = \sum_{i=1}^{n} \frac{1}{i^2} < 1 + \int_1^n \frac{dx}{x^2} < 2. \]

Obviously the sequence $A_1, A_2, \ldots$ of partial sums is increasing; and we have just seen that it is bounded above. Therefore:

Theorem 7. $\sum_{i=1}^{\infty} (1/i^2)$ is convergent.
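The bound used for Theorem 7 can be checked for particular values of $n$. In the sketch below the integral $\int_1^n dx/x^2$ is evaluated in closed form as $1 - 1/n$; the function names are hypothetical.

```python
def inverse_square_partial_sum(n):
    """A_n = 1 + 1/2^2 + ... + 1/n^2."""
    return sum(1 / (i * i) for i in range(1, n + 1))

def integral_bound(n):
    """1 + (integral of dx/x^2 from 1 to n), which equals 2 - 1/n."""
    return 1 + (1 - 1 / n)
```

For every `n > 1` the partial sum lies strictly below `integral_bound(n)`, which in turn lies below 2.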

Some of the ideas that we have been using to get these results are useful in so many connections that they are worth recording as theorems.

Theorem 8 (The comparison theorem). Let $\sum_{i=1}^{\infty} a_i$ and $\sum_{i=1}^{\infty} b_i$ be series, with

\[ 0 \leq a_i \leq b_i \quad \text{for each } i. \]

Then (1) if $\sum_{i=1}^{\infty} b_i$ is convergent, then so also is $\sum_{i=1}^{\infty} a_i$; and (2) if $\sum_{i=1}^{\infty} a_i$ is divergent, then so also is $\sum_{i=1}^{\infty} b_i$.

Proof. For each $n$, let

\[ A_n = \sum_{i=1}^{n} a_i, \qquad B_n = \sum_{i=1}^{n} b_i. \]

Then

\[ A_n \leq B_n \quad \text{for every } n. \]

(Why?) And each of the sequences $A_1, A_2, \ldots$ and $B_1, B_2, \ldots$ is increasing. An increasing sequence is convergent if it is bounded, and conversely. We can therefore prove (1) in the following steps:

$\sum_{i=1}^{\infty} b_i$ is convergent
⇒ $B_1, B_2, \ldots$ is convergent
⇒ $B_1, B_2, \ldots$ is bounded
⇒ $A_1, A_2, \ldots$ is bounded
⇒ $A_1, A_2, \ldots$ is convergent
⇒ $\sum_{i=1}^{\infty} a_i$ is convergent.

(Reason for each of these implication signs?) Similarly, we can prove (2) in the following steps:

$\sum_{i=1}^{\infty} a_i$ is divergent
⇒ $A_1, A_2, \ldots$ is divergent
⇒ $A_1, A_2, \ldots$ is unbounded
⇒ $B_1, B_2, \ldots$ is unbounded
⇒ $B_1, B_2, \ldots$ is divergent
⇒ $\sum_{i=1}^{\infty} b_i$ is divergent.

The comparison theorem gives us easy tests for some series. Consider for example,

\[ \sum_{i=0}^{\infty} \frac{1}{i!} = 1 + \frac{1}{1!} + \frac{1}{2!} + \frac{1}{3!} + \cdots \]

Here

\[ n! = n(n-1)\cdots 1 \]

and $0! = 1$, by definition. For each $i$, let

\[ a_i = \frac{1}{i!}, \qquad b_i = \left(\frac12\right)^{i-1}. \]

Then

\[ a_i \leq b_i \quad \text{for each } i; \]

\[ \frac{1}{0!} \leq \left(\frac12\right)^{-1} \quad (i = 0), \qquad \frac{1}{1!} \leq \left(\frac12\right)^{0} \quad (i = 1), \]

and the strict inequality $1/n! < (1/2)^{n-1}$ holds for $n > 2$. Therefore our series is term by term less than or equal to the geometric series

\[ \sum_{i=0}^{\infty} \left(\frac12\right)^{i-1}, \]

which is known to converge. Therefore:

Theorem 9. $\sum_{i=0}^{\infty} (1/i!)$ is convergent.

In fact,

\[ \sum_{i=0}^{\infty} \frac{1}{i!} = e = \lim_{x\to 0} (1 + x)^{1/x}. \]

But we won't be able to prove this until we have developed the theory much further. The situation here is peculiar: the easiest way to get this special result is first to show that

\[ e^x = \sum_{i=0}^{\infty} \frac{x^i}{i!}, \]

and then to set $x = 1$. (You have seen a situation like this before. The easiest way to find $\int_0^1 x^4\,dx$ is first to calculate the function $\int_0^x t^4\,dt$, and then to set $x = 1$.)

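Consider next

\[ \sum_{n=1}^{\infty} \frac{1}{\sqrt{n}}. \]

Since

\[ \frac{1}{\sqrt{n}} \geq \frac{1}{n} \quad \text{for every } n, \quad \text{and} \quad \sum_{n=1}^{\infty} \frac{1}{n} = \infty, \]

it follows that the given series diverges:

\[ \sum_{n=1}^{\infty} \frac{1}{\sqrt{n}} = \infty. \]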

While the comparison theorem tells us, under some conditions, that a series converges, it never tells us what the sum is. But such partial information may be useful. In fact, some of the most important uses of series are in cases where a number (or a function) can best be described by a series; in such cases, we use $\sum_{i=1}^{n} a_i$ (for some large $n$) to get an approximation of $\sum_{i=1}^{\infty} a_i$. For example, the approximation

\[ e \approx \sum_{i=0}^{n} \frac{1}{i!} \]

is excellent, even for fairly small values of $n$; it gives by far the best way of computing $e$; and in fact, the series approaches its infinite sum so fast that $e$ is much easier to compute than $\sqrt{2}$.

Therefore, when you are asked to show that a series has a sum, without finding out what the sum is, you should not consider that the problem is artificial.
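The remark that partial sums of $\sum 1/i!$ approximate $e$ very well can be tried out in a few lines. This is an illustrative sketch; the function name is hypothetical, and `math.e` is used only as a reference value for comparison.

```python
import math

def exp_series_partial_sum(n):
    """1/0! + 1/1! + ... + 1/n!, with each term computed from the previous one."""
    total, term = 0.0, 1.0
    for i in range(n + 1):
        total += term
        term /= (i + 1)  # turns 1/i! into 1/(i+1)!
    return total
```

Already `exp_series_partial_sum(10)` agrees with `math.e` to better than seven decimal places.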
PROBLEM SET 10.2

Find out which of the following series are convergent. If the series is geometric, calculate
the sum.
I.

oo

2 Vi:
i=l ----:

oo

3.
6.

1
4.
i j3/2 + 2
oo

00

.2 (-l)i7T-i

i=l

10.

00

cos3

(2i)
j2

oo

13.
16.
19

14.

il j l-1

2 l 1n I .
i=2

17.

.,

i=O
00

25.
2 8.

(i!)3

---

2
i=l i
.,

i=l

(i + l)(i + 2)
+ 1

20

i l Vi + 1
2i
i=l

sin2

0 9
i j.
1

1
i=2 l:a-n

18.

_2

z
i=2 1-:--1n

co

1
21.
iO (i !)2
co
1
24. 2
i=2 i(i - 1)

2 i(i+1)
i=l -
00

.2 -.-
i=2 1 n
00

2
.
i=2 z:---1n2 z
.2

ii'
00

15.

(2i - 1)
j2

i =l
00

12.

oo

23.

9.

i
i=l j

00

i2 ln2 i
(i ! - 1)

22.

L ( -2)ie2i

i=l

00

00

11.

.2
= --

00

00

oo

8.

)
i(
00

i j 3 /2

26. 2
(i - l)(i)(i + 2)
i=2 ---

27.

21_

30.

i_
i -._
+ 1

i=l

i:

i=l j2 - 1

31. If you think of Theorem 3 backwards, it says that

\[ \frac{1}{1 - r} = 1 + r + r^2 + \cdots \]

That is, $1/(1 - r)$ can be expressed as the sum of an infinite series. Express $1/(1 + x)$ as the sum of an infinite series. For what numbers $x$ does your series converge?

32. Express $1/(1 + x^2)$ as an infinite series. For what numbers $x$ does the series converge?

33. Same question, for $1/(1 + x^4)$.

*34. Suppose that $\sum_{i=0}^{\infty} a_i x^i$ converges for every $x$. The series then defines a function

\[ f(x) = \sum_{i=0}^{\infty} a_i x^i. \]

It will turn out that functions which can be defined in this way are always differentiable, and that their derivatives can be calculated by differentiating the series a term at a time. That is,

\[ f'(x) = \sum_{i=1}^{\infty} i a_i x^{i-1}. \]

(Don't try to prove this; you haven't got a chance.) Granted that all this is true, what must the $a_i$'s be, if $f(0) = 1$ and $f'(x) = f(x)$ for every $x$? Comment on your result.

00

.3
:Li=l I + 1

35.
*37.

00

For which numbers

ct.

is the series

2i2

1 I1
i=l
+

36.

:L:1 (l/n") convergent?

*38. Prove the following.

Theorem A (The Integral Test). Let $f$ be a positive decreasing continuous function, on the interval $[1, \infty)$. If

\[ \int_1^{\infty} f(x)\,dx < \infty, \tag{1} \]

then

\[ \sum_{i=1}^{\infty} f(i) < \infty, \tag{2} \]

and conversely.
10.3 ABSOLUTE CONVERGENCE. ALTERNATING SERIES

Given a series $\sum_{i=0}^{\infty} a_i$ (in which the terms may be positive, negative, or zero), we can form a new series $\sum_{i=0}^{\infty} |a_i|$ by taking the absolute value $|a_i|$ of each term $a_i$. For example, if

\[ \sum_{i=0}^{\infty} a_i = \sum_{i=0}^{\infty} (-1)^i r^i = 1 - r + r^2 - \cdots, \]

then

\[ \sum_{i=0}^{\infty} |a_i| = \sum_{i=0}^{\infty} |r|^i = 1 + |r| + |r|^2 + \cdots \]

Given that $\sum a_i$ converges, it does not follow that $\sum |a_i|$ converges. For example, the series

\[ \sum a_i = 1 - 1 + \tfrac12 - \tfrac12 + \tfrac13 - \tfrac13 + \cdots \]

is convergent, but the series

\[ \sum |a_i| = 1 + 1 + \tfrac12 + \tfrac12 + \tfrac13 + \tfrac13 + \cdots \]

is not, because the harmonic series is not. The same sort of thing happens if we take absolute values in the series

\[ \sum_{i=1}^{\infty} a_i = \sum_{i=1}^{\infty} (-1)^{i+1}\frac{1}{i} = 1 - \frac12 + \frac13 - \frac14 + \cdots \]

Here it is plain that $\sum |a_i|$ diverges, but it is not quite so easy to see that $\sum a_i$ is convergent. This is worth proving, however, because the idea used in the proof is useful in other connections.

Theorem 1. $\sum_{i=1}^{\infty} (-1)^{i+1}(1/i)$ is convergent.

Proof. Let

\[ A_n = \sum_{i=1}^{n} (-1)^{i+1}\frac{1}{i}. \]

If $n$ is even, with $n = 2k$, then

\[ A_n = A_{2k} = \left(1 - \frac12\right) + \left(\frac13 - \frac14\right) + \cdots + \left(\frac{1}{2k-1} - \frac{1}{2k}\right). \]

Therefore the sequence $A_2, A_4, A_6, \ldots, A_{2k}, \ldots$ is increasing. And it has an upper bound, because

\[ A_{2k} = 1 - \left(\frac12 - \frac13\right) - \left(\frac14 - \frac15\right) - \cdots - \left(\frac{1}{2k-2} - \frac{1}{2k-1}\right) - \frac{1}{2k} < 1. \]

Therefore the sequence $A_2, A_4, \ldots$ has a limit. Let

\[ A = \lim_{k\to\infty} A_{2k}. \tag{1} \]

We shall show that $A$ is the sum of the series. First we observe that

\[ \lim_{k\to\infty} A_{2k+1} = \lim_{k\to\infty} [A_{2k} + a_{2k+1}] = \lim_{k\to\infty} A_{2k} + \lim_{k\to\infty} \frac{1}{2k+1}, \]

so that

\[ \lim_{k\to\infty} A_{2k+1} = A + 0 = A. \tag{2} \]

Thus we see that (1) as $n \to \infty$ through even values, $A_n \to A$ and (2) as $n \to \infty$ through odd values, $A_n \to A$. It follows that (3) $\lim_{n\to\infty} A_n = A$.

Proof? (You need to show that for every $\epsilon > 0$ there is an $N$ such that $|A_n - A| < \epsilon$ for every $n > N$. Given such an $\epsilon$, you know from (1) that there is an $N_1$ such that $|A_{2k} - A| < \epsilon$ for every $k > N_1$; and you know from (2) that there is an $N_2$ such that $|A_{2k+1} - A| < \epsilon$ for every $k > N_2$. How can $N$ be defined in terms of $N_1$ and $N_2$?)

The scheme that we used to prove Theorem 1 applies more generally. If you reexamine the proof, you will see that the only facts about the series that were used were the following:

1) The series is alternating. That is, successive terms $a_i$, $a_{i+1}$ have opposite signs.

2) $\lim_{n\to\infty} a_n = 0$.

3) The sequence $|a_1|, |a_2|, \ldots$ is decreasing.

We have therefore proved the following theorem.

Theorem 2 (The alternating series test). Given an alternating series $\sum_{i=1}^{\infty} a_i$. If $\lim_{n\to\infty} a_n = 0$, and the sequence $|a_1|, |a_2|, \ldots$ is decreasing, then the series converges.

(Strictly speaking, some of our formulas in the proof of Theorem 1 used the fact that the first term was positive instead of negative. If you know that the theorem holds in this case, how would you show that it also holds when $a_1 < 0$?)
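The behavior used in the proof of Theorem 1 (even-numbered partial sums increasing and bounded above by 1, odd-numbered partial sums approaching the same limit) can be watched numerically for the series $1 - \tfrac12 + \tfrac13 - \cdots$. This is a sketch with a hypothetical function name. The comparison with $\ln 2$ in the usage note relies on a standard fact that is not established in this passage.

```python
def alternating_harmonic_partial_sums(n_max):
    """Return [A_1, A_2, ..., A_{n_max}] for the series 1 - 1/2 + 1/3 - 1/4 + ..."""
    sums, total = [], 0.0
    for i in range(1, n_max + 1):
        total += (-1) ** (i + 1) / i
        sums.append(total)
    return sums
```

The even-indexed and odd-indexed partial sums close in on a common value from below and from above; that value is known to be `math.log(2)`.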
We have seen that if $\sum a_i$ converges, it does not follow that $\sum |a_i|$ converges. But the reverse implication does hold:

Theorem 3. If $\sum_{i=1}^{\infty} |a_i|$ is convergent, then so also is $\sum_{i=1}^{\infty} a_i$.

If $\sum |a_i|$ is convergent, then $\sum a_i$ is said to be absolutely convergent. In this language, we can restate Theorem 3 as follows:

Every absolutely convergent series is convergent.


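To prove this, we break up each partial sum

\[ A_n = \sum_{i=1}^{n} a_i \]

into a sum of positive terms and a sum of negative terms. To do this, we let

\[ a_i^+ = \begin{cases} a_i & \text{if } a_i \geq 0, \\ 0 & \text{if } a_i < 0, \end{cases} \]

and let

\[ a_i^- = \begin{cases} a_i & \text{if } a_i \leq 0, \\ 0 & \text{if } a_i > 0. \end{cases} \]

Let

\[ A_n^+ = \sum_{i=1}^{n} a_i^+, \qquad A_n^- = \sum_{i=1}^{n} a_i^-. \]

Then

\[ A_n = A_n^+ + A_n^- \quad \text{for each } n, \]

because

\[ a_i = a_i^+ + a_i^- \quad \text{for each } i. \]

Obviously $A_1^+, A_2^+, \ldots$ is an increasing sequence, and $A_1^-, A_2^-, \ldots$ is a decreasing sequence. Let

\[ k = \sum_{i=1}^{\infty} |a_i|. \]

Then

\[ \sum_{i=1}^{n} |a_i| \leq k \quad \text{for every } n. \]

Also

\[ A_n^+ \leq \sum_{i=1}^{n} |a_i|, \]

because $A_n^+$ is the sum of some (perhaps all) of the terms on the right-hand side. Therefore $A_1^+, A_2^+, \ldots$ is convergent. Let

\[ A^+ = \lim_{n\to\infty} A_n^+. \]

Similarly,

\[ A_n^- \geq \sum_{i=1}^{n} (-|a_i|), \]

because $A_n^-$ is the sum of some (perhaps all) of the terms on the right-hand side; and if you omit negative terms, the sum becomes larger. Therefore the decreasing sequence $A_1^-, A_2^-, \ldots$ is bounded below. Therefore it has a limit. Let $A^- = \lim_{n\to\infty} A_n^-$. Then

\[ \lim_{n\to\infty} A_n = \lim_{n\to\infty} (A_n^+ + A_n^-) = A^+ + A^-, \]

and $\sum_{i=1}^{\infty} a_i$ is convergent, which was to be proved. In fact, we can say a little more:

Theorem 4. If $\sum_{i=1}^{\infty} |a_i|$ is convergent, then

\[ \left| \sum_{i=1}^{\infty} a_i \right| \leq \sum_{i=1}^{\infty} |a_i|. \]

Proof. We know that

\[ |a + b| \leq |a| + |b|. \]

By induction it follows that

\[ \left| \sum_{i=1}^{n} a_i \right| \leq \sum_{i=1}^{n} |a_i| \quad \text{for every } n. \]

Passing to the limit, we get the inequality that we wanted.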
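The splitting of a partial sum into its positive and negative parts, and the inequalities used in the proofs of Theorems 3 and 4, can be illustrated on a finite list of terms. The sketch below is only an illustration; the function name is hypothetical, and zero terms are counted with the positive part (they contribute nothing either way).

```python
def split_partial_sums(terms):
    """Return (A_n, A_n_plus, A_n_minus, sum of |a_i|) for a finite list of terms."""
    a_plus = sum(a for a in terms if a >= 0)    # sum of the a_i^+
    a_minus = sum(a for a in terms if a < 0)    # sum of the a_i^-
    abs_sum = sum(abs(a) for a in terms)
    return a_plus + a_minus, a_plus, a_minus, abs_sum
```

For example, with the terms `(-1) ** i / 2 ** i` for `i` from 0 to 19, the positive part lies between 0 and the sum of absolute values, the negative part lies between minus that sum and 0, and the absolute value of the total does not exceed it.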


PROBLEM SET 10.3

Find out which of the following series are alternating, which are convergent, and which
are absolutely convergent.
"'

i=l

4.
1.

10.
10.4

"'

1
<-1)i -:{I

i=l"'
i1
i=lI c-2))2i-:-i
(i=lf
l

l.

s.

i=l"' 2
1
I 2
i=lI c ni1
(
i=lI ::.!_2
-

i=l

11.

'
..

i=l

( -1)' -. + I

+ i

"'

s.

3.

9.

sin

cos

rri)
2
l

12.

ESTIMATES OF REMAINDERS

Given that a series converges, we often want to use a partial sum

00

:L

i=2 c

t)i ' (i

00

i=l

c-i)-i

1)


as an approximation of the limit

\[ A = \lim_{n\to\infty} A_n = \sum_{i=1}^{\infty} a_i. \]

The approximation $A_n \approx A$ is used in some of the most important applications, and in all applications that use computers. As in all approximation processes, we are better off if we can set a limit on the error. We shall now find ways to do this.

Given that $\sum_{i=1}^{\infty} a_i$ converges to a sum $A$, let $R_n = A - A_n$. Then

\[ R_n = \sum_{i=n+1}^{\infty} a_i, \]

and obviously

\[ \lim_{n\to\infty} R_n = 0. \]

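For alternating series, of the type treated in Theorem 2 of the preceding section, it is easy to get an estimate of $R_n$. Let the series be

\[ \sum_{i=1}^{\infty} a_i = \sum_{i=1}^{\infty} (-1)^{i+1} b_i = b_1 - b_2 + b_3 - \cdots, \]

where $b_i = |a_i|$. Then

\[ R_n = \sum_{i=n+1}^{\infty} a_i = \sum_{i=n+1}^{\infty} (-1)^{i+1} b_i. \]

If $n$ is even, then

\[ R_n = b_{n+1} - b_{n+2} + b_{n+3} - \cdots = (b_{n+1} - b_{n+2}) + (b_{n+3} - b_{n+4}) + \cdots \geq 0. \]

But we can also write

\[ R_n = b_{n+1} - (b_{n+2} - b_{n+3}) - (b_{n+4} - b_{n+5}) - \cdots \leq b_{n+1}. \]

Therefore

1) $0 \leq R_n \leq b_{n+1}$ when $n$ is even.

If $n$ is odd, then

\[ R_n = -b_{n+1} + b_{n+2} - b_{n+3} + \cdots = -(b_{n+1} - b_{n+2}) - (b_{n+3} - b_{n+4}) - \cdots \leq 0; \]
\[ R_n = -b_{n+1} + (b_{n+2} - b_{n+3}) + (b_{n+4} - b_{n+5}) + \cdots \geq -b_{n+1}. \]

Thus

2) $-b_{n+1} \leq R_n \leq 0$, when $n$ is odd.

Therefore

\[ |R_n| \leq b_{n+1} \quad \text{for every } n. \]

Since $b_{n+1} = |a_{n+1}|$, we have proved the following theorem.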

Theorem 1. Given $\sum_{i=1}^{\infty} a_i$. If (1) the series is alternating, (2) $\lim_{n\to\infty} a_n = 0$, and (3) the sequence $|a_1|, |a_2|, \ldots$ is decreasing, then

\[ |R_n| \leq |a_{n+1}| \quad \text{for every } n. \]

That is, when you stop after a finite number of terms, the error is numerically no larger than the first term that you omit. For example, take

$$\sum_{i=1}^{\infty} (-1)^{i+1} \frac{1}{i^2} = 1 - \frac{1}{2^2} + \frac{1}{3^2} - \cdots.$$

By the alternating series test, this series converges. Let $A$ be its sum. Then

$$A \approx 1 - \frac{1}{2^2} + \frac{1}{3^2} - \cdots + \frac{1}{9^2},$$

and the error in the approximation is at most $1/10^2 = 0.01$. This series does not converge very fast. Next consider

$$\sum_{i=0}^{\infty} (-1)^i \frac{1}{i!} = 1 - 1 + \frac{1}{2!} - \frac{1}{3!} + \cdots.$$

This series converges to a sum $A$. (It will turn out that $A = 1/e$.) We have

$$A \approx 1 - 1 + \frac{1}{2!} - \frac{1}{3!} + \cdots + \frac{1}{10!},$$

and the error is less than $1/11!$. This series converges very rapidly:

$$11! = 39{,}916{,}800,$$

and

$$\frac{1}{11!} \approx 2.5052 \times 10^{-8} = 0.000000025052.$$
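The bound of Theorem 1 can be checked numerically for the second example. The following is only a sketch: the function name `alternating_partial_sum` is ours, and the comparison value `math.exp(-1)` relies on the fact, stated above without proof, that the sum is $1/e$.

```python
import math

def alternating_partial_sum(n_terms):
    """Sum of (-1)**i / i! for i = 0, 1, ..., n_terms - 1."""
    return sum((-1) ** i / math.factorial(i) for i in range(n_terms))

# Use the terms i = 0..10; the first omitted term is 1/11!.
approx = alternating_partial_sum(11)
bound = 1 / math.factorial(11)

# Assumption: the exact sum is 1/e (asserted, not proved, in the text above).
error = abs(math.exp(-1) - approx)

print("partial sum :", approx)
print("actual error:", error)
print("bound 1/11! :", bound)
```

The printed error should come out somewhat smaller than the bound, as Theorem 1 predicts.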

If you reexamine the proof of Theorem 1, you will see that the method that we
used to get an estimate of the error was very much like the method that we used to
establish convergence in the first place, in the proof of the alternating series test.
This happens most of the time: that is, a proof of convergence usually gives an
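estimate of $R_n$. Consider, for example,

$$\sum_{i=1}^{\infty} \frac{1}{i^2}.$$

We let

$$A_n = \sum_{i=1}^{n} \frac{1}{i^2},$$

and we observe that the sequence $A_1, A_2, \ldots$ is increasing. To show that it is bounded above, we draw a picture and observe that

$$A_n = \sum_{i=1}^{n} \frac{1}{i^2} < 1 + \int_1^n \frac{dx}{x^2}.$$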


The same sort of reasoning tells us that

$$R_n = \sum_{i=n+1}^{\infty} \frac{1}{i^2} < \int_n^{\infty} \frac{dx}{x^2}.$$

Since

$$\int_n^{\infty} \frac{dx}{x^2} = \frac{1}{n},$$

we conclude that

$$R_n < \frac{1}{n} \quad \text{for every } n.$$

This is nowhere nearly so small as the estimate of error for the corresponding alternating series. In fact, the positive series $\sum (1/i^2)$ converges very slowly.
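To see how slowly this series converges, we can compare the actual remainder with the bound $1/n$. This is a rough sketch: the exact sum $\pi^2/6$ is a known fact that is not derived in this section, and the name `remainder` is ours.

```python
import math

def remainder(n):
    """R_n = (sum of 1/i**2 over all i) - (sum over i = 1..n).

    Assumption: the full sum equals pi**2 / 6, a classical result
    that this section does not prove.
    """
    partial = sum(1.0 / (i * i) for i in range(1, n + 1))
    return math.pi ** 2 / 6 - partial

for n in (10, 100, 1000):
    print(n, remainder(n), 1 / n)
```

In each row the remainder is just under $1/n$, so the integral estimate is nearly the best possible here: a thousand terms give only about three decimal places.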

Similarly, Theorem 4 of Section 10.3 gives an estimate of the error for series which are absolutely convergent.

Theorem 2. Suppose that $\sum |a_i|$ is convergent. Let

$$R_n = \sum_{i=n+1}^{\infty} a_i.$$

Then

$$|R_n| \le \sum_{i=n+1}^{\infty} |a_i|.$$

That is, the error in $\sum a_i$ is numerically no greater than the error in $\sum |a_i|$. To prove this, we apply Theorem 4 to the series

$$\sum_{i=n+1}^{\infty} a_i.$$
If we use the comparison theorem of Section 10.2 to establish the convergence of a positive series, then any estimate of the remainder of the larger series automatically is an estimate of the remainder of the smaller one. For example, we have found that

$$\sum_{i=1}^{\infty} \frac{1}{i^2} < \infty,$$

with $R_n < 1/n$ for every $n$. Since $0 < 1/(i^2 + 1) < 1/i^2$ for every $i$, it follows by the comparison theorem that

$$\sum_{i=1}^{\infty} \frac{1}{i^2 + 1} < \infty.$$

It also follows, for the remainder $R_n'$ in the new series, that

$$R_n' = \sum_{i=n+1}^{\infty} \frac{1}{i^2 + 1} < \sum_{i=n+1}^{\infty} \frac{1}{i^2} = R_n,$$

and so

$$R_n' < \frac{1}{n}.$$

This scheme always works, whenever we establish convergence by means of the comparison theorem.

PROBLEM SET 10.4

Each of the following series is convergent. In each case, get an estimate of the remainder
Rn, in the form \Rn\
I.

i; ( ym
i=l 3}

2.

00

3. I ( -l)i7T-i
i=l
00
sin2 (2i - 1)
5. L
i2
i=l
00 ( - l)i+l
7. I
i=l i-4
w

9.

L
i=l

( - l)i

i0.9

4.
6.

i: (- 4)
i=l
i; co 7Ti
l

i=l
00

il f3

8.
w

10.

L -:--12.
1 n 1
i=2

Termwise Integration of Series.

10.5

1 1.

co

12.

i O (i!)2

13.
15.

co

1 i(i

14.

+ 1)

co

16.

[a, b].

Let

( --r

co
2
i=l
co

i 2 i2(1 + i)
co

fv fz,

453

i l i(i + l)(i + 2)

*17. Let

Power Series for Tan-1 and In

I c -o-i
i=

..
f

. be a sequence of continuous functions defined on the same interval


be a function such that
Jim

n-co

fn(x)

f(x),

for each x on [a, b]. Questions: (1) Does it follow that $f$ is continuous? (2) If $f$ is known to be continuous, does it follow that

$$\lim_{n\to\infty} \int_a^b f_n(x)\,dx = \int_a^b f(x)\,dx\,?$$

*18. Consider $\sum_{i=1}^{\infty} (-1)^{i+1}(1/i)$. Show that by writing the terms of this series in a different order (using each term once and only once) you can get a series $\sum a_i'$ whose sum is 10.

19. Now reexamine your solutions of Problems 1 through 16. If you used any method other than Theorem 1, in estimating the remainder in an alternating series, try using Theorem 1, and compare the new estimate with the old one. (The alternating series test usually gives a good estimate, in the cases where it applies at all.)

10.5 TERMWISE INTEGRATION OF SERIES. POWER SERIES FOR Tan^{-1} AND ln

A power series is a series of the form

$$\sum_{i=0}^{\infty} a_i x^i = a_0 + a_1 x + a_2 x^2 + \cdots.$$

(Here, as a matter of convenience, we are defining $x^0 = 1$, so that $a_0 x^0 = a_0$ for every $x$, including $x = 0$.) Thus every geometric series is a power series; writing $x$ for $r$ in the old formula, we get

$$\sum_{i=0}^{\infty} x^i = \frac{1}{1 - x} \quad (-1 < x < 1).$$

If a given series is convergent, for every $x$ on an open interval $(-r, r)$, then the series defines a function $f$, on the same interval, and we write

$$f(x) = \sum_{i=0}^{\infty} a_i x^i \quad (-r < x < r).$$

The following theorem is fundamental:

Theorem A. Given

$$f(x) = \sum_{i=0}^{\infty} a_i x^i \quad (-r < x < r).$$

Then $f$ is continuous and differentiable on $(-r, r)$, and the derivative of the sum is the sum of the derivatives. That is,

$$f'(x) = \sum_{i=1}^{\infty} i a_i x^{i-1} \quad (-r < x < r).$$

The same idea applies to the integral.

Theorem B. Given

$$f(x) = \sum_{i=0}^{\infty} a_i x^i \quad (-r < x < r).$$

Then the integral of the sum is the sum of the integrals. That is,

$$\int_0^x f(t)\,dt = \sum_{i=0}^{\infty} \int_0^x a_i t^i\,dt = \sum_{i=0}^{\infty} \frac{a_i}{i+1}\, x^{i+1} \quad (-r < x < r).$$

As you might expect, the proofs are hard; they will be postponed until the end of
this chapter. But the theorems are easy to apply, and Theorem B gives the best method
of finding series for many functions. The method is as follows.
We know that

$$1 + x + x^2 + \cdots + x^n + \cdots = \frac{1}{1 - x} \quad (-1 < x < 1).$$

Writing this backwards, we can express the function $1/(1 - x)$ as a power series:

$$\frac{1}{1 - x} = 1 + x + x^2 + \cdots \quad (-1 < x < 1).$$

Replacing $x$ by $-x$, we get

$$\frac{1}{1 + x} = 1 - x + x^2 - \cdots + (-1)^n x^n + \cdots \quad (-1 < x < 1);$$

and then, replacing $x$ by $x^2$, we get

$$\frac{1}{1 + x^2} = 1 - x^2 + x^4 - x^6 + \cdots + (-1)^n x^{2n} + \cdots \quad (-1 < x < 1).$$

Theorem B says that the series on the right can be integrated a term at a time. Thus

$$\int_0^x \frac{dt}{1 + t^2} = \int_0^x dt - \int_0^x t^2\,dt + \cdots + (-1)^n \int_0^x t^{2n}\,dt + \cdots \quad (-1 < x < 1),$$

and this gives

$$\int_0^x \frac{dt}{1 + t^2} = x - \frac{x^3}{3} + \cdots + (-1)^n \frac{x^{2n+1}}{2n + 1} + \cdots = \sum_{i=0}^{\infty} (-1)^i \frac{x^{2i+1}}{2i + 1} \quad (-1 < x < 1).$$

The integral on the left is equal to $\mathrm{Tan}^{-1} x$. Thus we have:

Theorem 1.

$$\mathrm{Tan}^{-1} x = \sum_{i=0}^{\infty} (-1)^i \frac{x^{2i+1}}{2i + 1} \quad (-1 < x < 1).$$

Granted that Theorem B is true, there is no need to test the convergence of the series on the right; Theorem B tells us not only that the series has a sum, but also that its sum is $\mathrm{Tan}^{-1} x$.

Note that the series includes only terms of odd degree. This could have been predicted, because $\mathrm{Tan}^{-1}$ is an odd function, with $\mathrm{Tan}^{-1}(-x) = -\mathrm{Tan}^{-1} x$ for every $x$.
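A short numerical check of Theorem 1 may be reassuring. The sketch below sums the first few terms of the series and compares the result with the library arctangent; the name `arctan_series` is ours, and `math.atan` is used only as an independent reference value.

```python
import math

def arctan_series(x, n_terms):
    """Partial sum of (-1)**i * x**(2i+1) / (2i+1) for i = 0..n_terms-1.

    Intended only for -1 < x < 1, where Theorem 1 applies.
    """
    return sum((-1) ** i * x ** (2 * i + 1) / (2 * i + 1)
               for i in range(n_terms))

x = 0.5
print(arctan_series(x, 20), math.atan(x))

# The partial sums are odd functions of x, like Tan^{-1} itself.
print(arctan_series(-0.3, 10), -arctan_series(0.3, 10))
```

For $0 < x < 1$ the series is alternating with decreasing terms, so the error after $n$ terms is at most the first omitted term, $x^{2n+1}/(2n+1)$.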

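The same method can be used to get a series for the natural logarithm.

Theorem 2.

$$\ln (1 + x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \cdots = \sum_{i=1}^{\infty} (-1)^{i+1} \frac{x^i}{i} \quad (-1 < x < 1).$$

Proof. We know that

$$\ln (1 + x) = \int_0^x \frac{dt}{1 + t},$$

and we know that

$$\frac{1}{1 + t} = 1 - t + t^2 - t^3 + \cdots + (-1)^i t^i + \cdots \quad (-1 < t < 1).$$

By Theorem B,

$$\int_0^x \frac{dt}{1 + t} = \int_0^x dt - \int_0^x t\,dt + \int_0^x t^2\,dt - \cdots + (-1)^i \int_0^x t^i\,dt + \cdots$$

$$= x - \frac{x^2}{2} + \frac{x^3}{3} - \cdots + (-1)^i \frac{x^{i+1}}{i + 1} + \cdots = \sum_{i=1}^{\infty} (-1)^{i+1} \frac{x^i}{i} \quad (-1 < x < 1).$$

Note that this method cannot be used to calculate the integral from 0 to 2, because the series for $1/(1 + t)$ converges only for $|t| < 1$.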
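The series of Theorem 2 can be tried out in the same way. This is a minimal sketch, with `log_series` a name of our own choosing and `math.log` serving as the reference value.

```python
import math

def log_series(x, n_terms):
    """Partial sum of (-1)**(i+1) * x**i / i for i = 1..n_terms.

    Approximates ln(1 + x) for -1 < x < 1.
    """
    return sum((-1) ** (i + 1) * x ** i / i for i in range(1, n_terms + 1))

print(log_series(0.5, 60), math.log(1.5))

# Outside the interval the terms do not even approach 0:
# at x = 2 the i-th term has absolute value 2**i / i.
print([2 ** i / i for i in (1, 5, 10, 20)])
```

The second line of output illustrates the closing remark of the proof: for $x = 2$ the terms grow without bound, so the series cannot converge there.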

The method that we have just been using can be applied so as to give answers, in the form of series, for problems which up to now we could not have solved. Consider

$$\int_0^{1/2} \frac{dx}{1 + x^4}.$$

In Chapter 6, this would have been an impossible problem. But now we can solve it, by expressing the integrand as a power series, and integrating a term at a time. In the series for $1/(1 + x)$, we replace $x$ by $x^4$. This gives

$$\frac{1}{1 + x^4} = 1 - x^4 + x^8 - \cdots + (-1)^i x^{4i} + \cdots = \sum_{i=0}^{\infty} (-1)^i x^{4i}.$$

Therefore

$$\int_0^{1/2} \frac{dx}{1 + x^4} = \sum_{i=0}^{\infty} \int_0^{1/2} (-1)^i x^{4i}\,dx = \sum_{i=0}^{\infty} (-1)^i \frac{(1/2)^{4i+1}}{4i + 1}$$

$$= \frac{1}{2} - \frac{1}{5}\left(\frac{1}{2}\right)^5 + \frac{1}{9}\left(\frac{1}{2}\right)^9 - \cdots = \frac{1}{2} - \frac{1}{5 \cdot 2^5} + \frac{1}{9 \cdot 2^9} - \cdots.$$

This is an alternating series; the terms diminish numerically, and approach 0 as $n \to \infty$. Therefore, if we use the first three terms as an approximation of the integral, the error $E$ is less than the fourth term. This is

$$\frac{1}{13 \cdot 2^{13}},$$

which is quite small: $2^{10} = 1024$, $2^{13} = 8192$, and so

$$E < 10^{-5}.$$
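The arithmetic in this example is easy to confirm exactly. The sketch below uses rational arithmetic so that no rounding enters; the helper name `term` is ours.

```python
from fractions import Fraction

def term(i):
    """i-th term of the series: (-1)**i * (1/2)**(4i+1) / (4i+1)."""
    return Fraction((-1) ** i, (4 * i + 1) * 2 ** (4 * i + 1))

approx = sum(term(i) for i in range(3))   # first three terms
bound = abs(term(3))                      # fourth term, 1/(13 * 2**13)

print("three-term approximation:", approx, "=", float(approx))
print("error bound             :", bound, "=", float(bound))
```

Since $13 \cdot 8192 = 106{,}496 > 10^5$, the bound is indeed below $10^{-5}$.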

PROBLEM SET 10.5


1. Calculate $\mathrm{Tan}^{-1}\, 0.02$ to six decimal places, and explain how you know that the error in your approximation is less than $5 \times 10^{-7}$.

2. Calculate $\int_0^{1/10} \frac{1}{1 + x^4}\,dx$ to five decimal places, and explain how you know that the error in your approximation is less than $5 \times 10^{-6}$.

3. Using the first term only, in the series for $\mathrm{Tan}^{-1}$, we get the approximation formula

$$\mathrm{Tan}^{-1} x \approx x \quad \text{for } x \approx 0.$$

How might you explain and justify this approximation formula if you knew nothing about infinite series?

4. Given

f(x)

1 +

a) Express f(x) as an infinite series.


b) Express

0.6
J
o

as an infinite series.

1 +

6
x

c) Calculate numerically the sum of the first three terms of your series.
d) Get (by any method) an estimate of the error in the resulting approximation of the
integral.
5.

Do the same four things, starting with f(x )


(Your infinite series will use powers of

same reasons.)

v,

1/(1 +

vi),

on the interval

[O, 0.49}

but the same methods will apply, for the


6. Do the same four things, starting with/(x)


7. Do the same, starting with f (x)

8. Do the same, starting with/(x)

457

1/(1 + x), on the interval [O, 0.2].

1
5 2 , on the interval [O, 0.25].
1 + x 1
1

3, on [O, 1/2].
1 + x

9. Express in the form of a series:

rk [ f--=:_J

Jo i=oi +

dx

(0 < k < 1).

10. Using the first term only, in the series for In, we get the approximation formula
In (1 + x)

R:J

for x

R:J

0.

How might you explain and justify this formula, if you knew nothing about infinite
series?
11. Consider the function f(x) defined by the series
1 + x +

x2

x3

+
3!

n
x
n!

a) Express/'(x) as a series.
b) Express H f(t) dt as a series.
The results that you get ought to enable you to guess what the function is.
*12. For each n, let

(0 x 1).
a) Find limn-->oo f fn(X) dx.
b) For each x on [O, 1], let f(x)
c) Find

rl

)o

limn-->oo fn(x). Get a formula for the function f(x).

f(x) dx

J1
0

[limfn(x)] dx.
n-co

*13. Your answers in Problem 12 suggest that the functions fn behave rather peculiarly.
Investigate as follows:
a) For each n, let .Xn be the point at whichfn takes on its maximum value. Get a formula
for Xn , and find limn_,00 Xn .
fn(.Xn ). Get a formula for Ym and find liffin_,00 Yn
b) For each n, let Yn

c) Draw a sketch showing what the graph of fn looks like for n


1, n
2, and
n R:i oo. Your sketch will throw some light on the results that you got in Prob
lem 12.
=

10.6 THE RATIO TEST FOR ABSOLUTE CONVERGENCE. APPLICATIONS TO POWER SERIES

Consider a series $\sum a_i$, in which the terms may be positive or negative, but not equal to 0. For each $i$, let

$$r_i = \frac{|a_{i+1}|}{|a_i|},$$

so that

$$|a_{i+1}| = |a_i|\, r_i.$$

An examination of the sequence $r_1, r_2, \ldots$ gives us a convergence test which works very quickly, in the cases where it applies.


Theorem 1 (The ratio test). If

$$\lim_{i\to\infty} r_i = r < 1,$$

then $\sum_{i=0}^{\infty} a_i$ is absolutely convergent.

Proof. Let $s$ be any number such that $r < s < 1$. Then there is an $N$ such that

$$i \ge N \;\Rightarrow\; r_i < s.$$

(In the definition of $\lim_{i\to\infty} r_i$, take $\epsilon = s - r$, so that $r + \epsilon = s$.) It follows that

$$|a_{N+1}| = |a_N|\, r_N < |a_N|\, s,$$

$$|a_{N+2}| = |a_{N+1}|\, r_{N+1} < |a_N|\, s \cdot s = |a_N|\, s^2;$$

and in general, given

$$|a_{N+j}| < |a_N|\, s^j,$$

it follows that

$$|a_{N+j+1}| = |a_{N+j}|\, r_{N+j} < |a_N|\, s^{j+1}.$$

By induction,

$$|a_{N+j}| < |a_N|\, s^j \quad \text{for every } j \ge 1.$$

Therefore

$$\sum_{j=0}^{\infty} |a_{N+j}| \le \sum_{j=0}^{\infty} |a_N|\, s^j = |a_N| \sum_{j=0}^{\infty} s^j = |a_N| \frac{1}{1 - s} < \infty.$$

It follows that

$$\sum_{i=N}^{\infty} |a_i| < \infty,$$

and so

$$\sum_{i=0}^{\infty} |a_i| = \sum_{i=0}^{N-1} |a_i| + \sum_{i=N}^{\infty} |a_i| < \infty,$$

which was to be proved.

What we are really using here is a comparison test between the series $\sum |a_i|$ and a geometric series; the comparison does not necessarily work for the first few terms, but it does start working after a certain point; and this is good enough to tell us what we want to know.

In Section 10.2, Theorem 9, we showed by a comparison test that $\sum_{i=0}^{\infty} (1/i!)$ is convergent. The ratio test gives this result very quickly. We have

$$a_i = \frac{1}{i!}, \qquad r_i = \frac{a_{i+1}}{a_i} = \frac{i!}{(i + 1)!} = \frac{1}{i + 1},$$

and so $\lim_{i\to\infty} r_i = 0$. It follows that the series converges.
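The ratios $r_i$ are easy to tabulate by machine. The following sketch (the name `ratio_sequence` is ours) computes them in exact rational arithmetic for $a_i = 1/i!$, so that the values $1/(i+1)$ appear without rounding.

```python
from fractions import Fraction
import math

def ratio_sequence(term, count):
    """Return [r_0, ..., r_{count-1}], where r_i = |a_{i+1}| / |a_i|.

    `term` is any function i -> a_i with nonzero values.
    """
    return [abs(term(i + 1)) / abs(term(i)) for i in range(count)]

# a_i = 1/i!, kept as exact fractions.
factorial_ratios = ratio_sequence(lambda i: Fraction(1, math.factorial(i)), 8)
print(factorial_ratios)
```

The same helper can be pointed at other series; a table of ratios suggests what the limit is, but of course it proves nothing by itself.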

There are simple cases in which a series converges, but in which convergence cannot be established by the ratio test. Consider

$$\sum_{i=1}^{\infty} \frac{1}{i^2},$$

which is known to converge. Here

$$r_i = \frac{a_{i+1}}{a_i} = \frac{i^2}{(i + 1)^2}.$$

Therefore, while $r_i < 1$ for each $i$, we have

$$\lim_{i\to\infty} r_i = \lim_{i\to\infty} \frac{1}{[1 + (1/i)]^2} = 1,$$

and so the ratio test does not apply. And Theorem 1 cannot be generalized to take care of these cases, because if $r_i \to 1$ it may easily happen that the series diverges. This happens for

$$\sum_{i=1}^{\infty} \frac{1}{i} = \infty, \qquad r_i = \frac{i}{i + 1} \to 1.$$

An even simpler case is

$$\sum_{i=0}^{\infty} (-1)^i = 1 - 1 + 1 - 1 + \cdots.$$

Here

$$r_i = |(-1)^{i+1}/(-1)^i| = 1 \quad \text{for every } i,$$

so that $r_i \to 1$ automatically, but the series diverges.

On the basis of the ratio test, we can derive a more general result for power series:

Theorem 2. Given the series

$$\sum_{i=0}^{\infty} a_i x^i,$$

where $a_i \ne 0$ for every $i$. Suppose that

$$\lim_{i\to\infty} \left| \frac{a_{i+1}}{a_i} \right| = L.$$

If $L = 0$, then the series is absolutely convergent for every $x$. If $L > 0$, then the series is absolutely convergent for $|x| < 1/L$.

Proof. For $x = 0$, there is nothing to prove. For each $x \ne 0$, we have

$$r_i = \frac{|a_{i+1} x^{i+1}|}{|a_i x^i|} = |x| \cdot \left| \frac{a_{i+1}}{a_i} \right|.$$

Therefore

$$\lim_{i\to\infty} r_i = |x| \cdot L.$$

If $L = 0$, then $r_i \to 0$, no matter what $x$ may be. If $L > 0$, then $\lim_{i\to\infty} r_i < 1$ whenever $|x| < 1/L$. In either case, the series converges absolutely.

By the first half of Theorem 2, we conclude that $\sum_{i=0}^{\infty} (x^i/i!)$ converges absolutely for every $x$. By the second half of Theorem 2, we see that $\sum_{i=1}^{\infty} (x^i/i^2)$ converges absolutely for $|x| < 1$.

In each of these cases, the sum of the coefficients forms a convergent series. But the theorem also applies in cases where the sum of the coefficients diverges. Consider

$$\sum_{i=1}^{\infty} i \pi^i x^i.$$

Here

$$\lim_{i\to\infty} \left| \frac{a_{i+1}}{a_i} \right| = \lim_{i\to\infty} \frac{(i + 1)\pi^{i+1}}{i \pi^i} = \pi.$$

Therefore the series converges absolutely whenever $|x| < 1/\pi$.
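For this last series the coefficient ratios can be watched approaching $\pi$. The sketch below is only illustrative; the names `coefficient`, `coefficient_ratio`, and `radius_estimate` are ours, and the index is kept moderate so that $\pi^i$ stays within floating-point range.

```python
import math

def coefficient(i):
    """a_i = i * pi**i, the coefficient of x**i (for i >= 1)."""
    return i * math.pi ** i

def coefficient_ratio(i):
    """|a_{i+1} / a_i|, which equals pi * (i + 1) / i."""
    return abs(coefficient(i + 1) / coefficient(i))

for i in (1, 10, 100):
    print(i, coefficient_ratio(i))

# A crude estimate of 1/L, to be compared with 1/pi.
radius_estimate = 1 / coefficient_ratio(100)
print(radius_estimate, 1 / math.pi)
```

The ratios decrease toward $\pi$, so the estimates of $1/L$ increase toward $1/\pi$.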


If the ratio $r_i$ approaches a limit which is greater than 1, then the series $\sum a_i$ always diverges. The reason is that in this case we have an $N$ such that

$$i \ge N \;\Rightarrow\; r_i > 1.$$

Therefore $|a_{i+1}| > |a_i|$ for $i \ge N$, and so after a certain point the sequence $|a_1|, |a_2|, \ldots$ becomes an increasing sequence. Therefore $a_i$ cannot approach 0. This observation enables us to add something to the conclusion of Theorem 2.


Theorem 3. Given the series $\sum_{i=0}^{\infty} a_i x^i$, with

$$\lim_{i\to\infty} \left| \frac{a_{i+1}}{a_i} \right| = L.$$

If $L = 0$, then the series converges absolutely for every $x$. If $L > 0$, then the series converges absolutely for $|x| < 1/L$ and diverges for $|x| > 1/L$.
This theorem can be adapted to take care of cases in which some terms of the series are equal to 0. For example,

$$\sum_{i=0}^{\infty} (-1)^i x^{2i}.$$

Setting $x^2 = y$, we get

$$\sum_{i=0}^{\infty} (-1)^i y^i,$$

which converges absolutely for $|y| < 1$ and diverges for $|y| > 1$. Therefore the given series converges absolutely for $|x| < 1$ and diverges for $|x| > 1$. Similarly,

$$\sum_{i=0}^{\infty} \frac{x^{2i+1}}{2^i} = x \sum_{i=0}^{\infty} \frac{x^{2i}}{2^i}.$$

Here, in the series on the right regarded as a series in powers of $x^2$,

$$\lim_{i\to\infty} \left| \frac{a_{i+1}}{a_i} \right| = \frac{1}{2}.$$

Therefore the series converges absolutely for $x^2 < 2$ and diverges for $x^2 > 2$.

Some more observations about Theorem 3 are in order.

1) The theorem applies only to the case in which $|a_{i+1}/a_i|$ approaches a limit. This usually happens for series which are describable by simple formulas. But for series in general it should be regarded as a remarkable accident. Suppose, for example, that we start with

$$\sum_{i=0}^{\infty} x^i = 1 + x + x^2 + \cdots.$$

Here $a_i = 1$ for every $i$, and so $r_i = 1$ for every $i$. We now divide $x^i$ by $i!$ for every even $i$. This gives

$$\sum_{i=0}^{\infty} b_i x^i = 1 + x + \frac{x^2}{2!} + x^3 + \frac{x^4}{4!} + \cdots.$$

The series still converges, for $|x| < 1$, but the ratio approaches no limit at all.

2) The theorem tells us that the series converges everywhere on the open interval $(-1/L, 1/L)$, but it tells us nothing about what happens at the endpoints of the interval. In fact, at the endpoints anything can happen. For example, $\sum_{i=1}^{\infty} (x^i/i^2)$ converges on $(-1, 1)$, and converges at both the endpoints. The series $\sum_{i=1}^{\infty} i x^i$ converges on the same interval, but converges at neither of the endpoints. The series $\sum_{i=1}^{\infty} (x^i/i)$ converges on $(-1, 1)$, and converges at $x = -1$, but diverges at $x = 1$. The series $\sum_{i=1}^{\infty} (-1)^i (x^i/i)$ converges on $(-1, 1)$, and converges at $x = 1$, but diverges at $x = -1$. For this reason, to tell where the series converges, we have to make separate tests at the endpoints.

3) Obviously every power series $\sum a_i x^i$ converges for $x = 0$; the sum is $a_0$. But sometimes 0 is the only value of $x$ that gives convergence. Consider $\sum_{i=0}^{\infty} i!\, x^i$. For every $x \ne 0$, we have

$$r_i = \left| \frac{(i + 1)!\, x^{i+1}}{i!\, x^i} \right| = (i + 1)\,|x| \to \infty.$$

Therefore the series converges only for $x = 0$.

4) Finally, the results that we have been getting for power series suggest a conjecture. In every case that we have investigated, the domain of convergence of $\sum a_i x^i$ has turned out to be of one of the following types:

i) The entire interval $(-\infty, \infty)$.

ii) An open interval $(-a, a)$, plus, perhaps, one or both of the endpoints.

iii) The point 0 alone.

The question arises whether these are the only possibilities. For example, is there a series $\sum a_i x^i$ whose domain of convergence is an interval whose midpoint is not the origin? We shall see, as the theory develops, that the domain of convergence of $\sum a_i x^i$ is always a set of one of the forms

$$(-\infty, \infty), \quad (-a, a), \quad [-a, a), \quad (-a, a], \quad [-a, a], \quad \{0\}.$$

PROBLEM SET 10.6


For each of the following series, find the domain of convergence, remembering, of course,
to test the endpoints.
ro

ro

1.

i=1

ro

4.

_L ix2i
i=l
oo

7.
10.

13.

oo

5.

xi

3. _Li2x2i-1
i=l
x2i

xi

ro

6. .L
i=l VI

.L

i=l v'l
(-x)i

co

ro

; v2i + 1

8.

00

I -=
i= Vi - I

9.

11. L (3i)2x2i-1
i=l
x2 i
14. IC-l)i
(
2'I )'
t=O

.L (3i)x2i

i=l

x2i-1

ro

.L (-l)i+l ( . 21
i=l

ro

15.

17. _Li-ixi

i=l

x 4i

L ( - l)i (2 I')I

t=O

ro

00

16. _Lhi

i=l

12. L (3i)4x3i
i=l

oo

1)!

.L(3i)3xi
co

ro

00

19.

ro

2. _Li3xi

_Li2xi
i=l

18.

i=l

,L (Tan-1 i)xi
i=l

I (-l)i
i2 + 1

i=l
00

20. .L i(2x - l)i


i=l

(Does the answer to this one contradict Theorem

ei

oo

21. .L-:--- (x - 4)i

(Same query as for Problem

i=ll3
oo

(x - 2 )i

22. .L

i=l

(Same query as for Problem

Show that

24.

Prove the following theorem:

,L:1

Theorem. If

,L;':1 a;

then ,L:1 a;b;

20.)

20.)

(sin i)xi is absolutely convergent when

23.

3 ?)

Jxl

< 1.

is absolutely convergent, and b1b2, ... is a bounded sequence,

is absolutely convergent.

*25. Show that there are infinitely many integers $i$ for which $\sin i > \tfrac{1}{2}$.

*26. Show that $\sum_{i=1}^{\infty} (\sin i)x^i$ is divergent when $|x| > 1$. (The results of Problems 23 and 26 show that for this very irregular series, the domain of convergence is still of one of the types described by Theorem 3.)

*27. You may have noticed that the number 1 has come up very often as an endpoint of our domains of convergence. The following theorem helps to account for this:

Theorem. Let $p(i)$ and $q(i)$ be polynomials in $i$, of any degree, with $q(i)$ never equal to 0. If $a_i = p(i)/q(i)$, then $\sum a_i x^i$ converges absolutely for $|x| < 1$, and diverges for $|x| > 1$.

Prove this theorem.
10.7 POWER SERIES FOR exp, sin, AND cos

Theorem A of Section 10.5 asserts that power series can be differentiated a term at a time. That is, if

$$f(x) = \sum_{i=0}^{\infty} a_i x^i \quad (-r < x < r),$$

then

$$f'(x) = \sum_{i=1}^{\infty} i a_i x^{i-1} \quad (-r < x < r).$$

We shall use this to find a series for the exponential function. We start by assuming that $e^x$ can be expressed in some way as a power series, so that

$$f(x) = e^x = \sum_{i=0}^{\infty} a_i x^i = a_0 + a_1 x + \cdots + a_i x^i + \cdots$$

for some sequence of coefficients $a_0, a_1, \ldots$. On any open interval where this works, we have

$$f'(x) = \sum_{i=1}^{\infty} i a_i x^{i-1} = a_1 + 2a_2 x + \cdots + i a_i x^{i-1} + (i + 1)a_{i+1} x^i + \cdots.$$

It must be true that $f'(x) = f(x)$, and $f(0) = 1$; and so we want to find a sequence of coefficients $a_0, a_1, a_2, \ldots$ which gives these results for the series. This is easy: we want

$$(i + 1)a_{i+1} = a_i \quad \text{for every } i,$$

which gives $f'(x) = f(x)$; and we want $a_0 = 1$, which gives $f(0) = 1$. Thus

$$a_0 = 1, \quad a_1 = a_0/1 = 1, \quad a_2 = a_1/2 = \tfrac{1}{2}, \quad a_3 = a_2/3 = 1/(2 \cdot 3);$$

and, in general,

$$a_i = 1/i!.$$

This can be checked by induction. For $i = 0, 1, 2$, the formula $a_i = 1/i!$ holds true. And

$$a_i = \frac{1}{i!} \;\Rightarrow\; a_{i+1} = \frac{a_i}{i + 1} = \frac{1}{(i + 1)\, i!} = \frac{1}{(i + 1)!}.$$

This proceeding does not prove that

$$e^x = \sum_{i=0}^{\infty} \frac{x^i}{i!}, \tag{1}$$

because we started off with an unproved assumption that $e^x$ had some power series expansion. But now that we know what series to examine, it is very easy to show that Eq. (1) holds.

By the ratio test, the series on the right-hand side converges for every $x$. It therefore defines a function $g$. Thus

$$g(x) = \sum_{i=0}^{\infty} \frac{x^i}{i!} \quad (-\infty < x < \infty).$$

We chose the coefficients in such a way that $g' = g$ and $g(0) = 1$. We need to show that $g(x) = e^x$ for every $x$. For each $x$, let $\phi(x) = g(x)/e^x$. Then

$$\phi'(x) = \frac{e^x g'(x) - g(x) e^x}{e^{2x}} = \frac{1}{e^x}\,[g'(x) - g(x)] = 0.$$

Therefore $\phi$ is a constant, and $\phi(x) = \phi(0)$ for every $x$. But $\phi(0) = 1$. Therefore

$$g(x)/e^x = 1, \quad \text{and} \quad g(x) = e^x,$$

which was to be proved.

What makes this scheme work is the fact that the function $f(x) = e^x$ is completely described by the conditions $f' = f$, $f(0) = 1$; no other function satisfies these conditions. Thus we have

Theorem 1. $e^x = \sum_{i=0}^{\infty} (x^i/i!)$.

Setting $x = 1$ we get

Theorem 2. $e = \sum_{i=0}^{\infty} (1/i!)$.

This series converges so fast that some people enjoy using it to calculate $e \approx 2.7182818$, correct in the seventh decimal place.
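Those who enjoy the calculation can let a machine do it. The sketch below builds each term from the previous one by the recurrence $a_{i+1} = a_i/(i+1)$ used in the derivation; the name `exp_series` is ours, and `math.e` and `math.exp` are used only for comparison.

```python
import math

def exp_series(x, n_terms):
    """Partial sum of x**i / i! for i = 0..n_terms-1.

    Each term is obtained from the one before it by multiplying
    by x / (i + 1), which mirrors a_{i+1} = a_i / (i + 1).
    """
    total = 0.0
    term = 1.0                      # the i = 0 term
    for i in range(n_terms):
        total += term
        term *= x / (i + 1)
    return total

for n in (4, 8, 12):
    print(n, exp_series(1.0, n))
print("library value:", math.e)
```

Twelve terms already agree with the library value of $e$ to better than seven decimal places.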

We now want to get a series for the sine. As before, we start by assuming that our problem has a solution, and then we try to find out what form the solution must take. For $f(x) = \sin x$ we have

$$f'(x) = \cos x, \qquad f''(x) = -\sin x.$$

Therefore

$$f(0) = 0, \qquad f'(0) = 1,$$

and

$$f''(x) = -f(x) \quad \text{for every } x.$$

Thus if

$$\sin x = \sum_{i=0}^{\infty} a_i x^i = a_0 + a_1 x + \cdots,$$

we must have $a_0 = 0$, and $a_1 = 1$. Now

$$f'(x) = \sum_{i=1}^{\infty} i a_i x^{i-1},$$

$$f''(x) = \sum_{i=2}^{\infty} i(i - 1)a_i x^{i-2} = 2a_2 + 3 \cdot 2\, a_3 x + \cdots + i(i - 1)a_i x^{i-2} + (i + 1)i\, a_{i+1} x^{i-1} + (i + 2)(i + 1)a_{i+2} x^i + \cdots.$$

To get $f'' = -f$, we want

$$(i + 2)(i + 1)a_{i+2} = -a_i, \quad \text{or} \quad a_{i+2} = -\frac{a_i}{(i + 1)(i + 2)}.$$

Since $a_0 = 0$, it follows that every even-numbered coefficient $a_{2i}$ is also equal to 0. The odd-numbered coefficients are

$$a_1 = 1,$$

$$a_3 = -\frac{a_1}{(1 + 1)(1 + 2)} = -\frac{1}{2 \cdot 3} = -\frac{1}{3!},$$

$$a_5 = -\frac{a_3}{(3 + 1)(3 + 2)} = \frac{1}{3!} \cdot \frac{1}{4 \cdot 5} = \frac{1}{5!},$$

and in general

$$a_{2i+1} = \frac{(-1)^i}{(2i + 1)!}.$$

To check this by induction, we note that

$$a_{2i+1} = \frac{(-1)^i}{(2i + 1)!} \;\Rightarrow\; a_{(2i+1)+2} = -\frac{a_{2i+1}}{[(2i + 1) + 1]\,[(2i + 1) + 2]}$$

$$\Rightarrow\; a_{2(i+1)+1} = (-1)^i(-1) \cdot \frac{1}{(2i + 1)!} \cdot \frac{1}{(2i + 2)(2i + 3)} = \frac{(-1)^{i+1}}{(2i + 3)!} = \frac{(-1)^{i+1}}{[2(i + 1) + 1]!}.$$

Therefore, if there is a series for the sine, the series must have the form

$$g(x) = \sum_{i=0}^{\infty} (-1)^i \frac{x^{2i+1}}{(2i + 1)!}.$$

We need to show that $g(x) = \sin x$ for every $x$. Let $h(x) = g'(x)$. Then we know that

1) $g' = h$,

2) $h' = -g$,

3) $g(0) = 0$,

4) $h(0) = 1$.

It ought to be true that

$$g(x) = \sin x, \qquad h(x) = \cos x.$$

If so, the function

$$\phi(x) = [g(x) - \sin x]^2 + [h(x) - \cos x]^2$$

must be equal to 0 for every $x$. And conversely, if $\phi(x) = 0$ for every $x$, it follows that $g(x) = \sin x$ and $h(x) = \cos x$. Now

$$\phi'(x) = 2[g(x) - \sin x][g'(x) - \cos x] + 2[h(x) - \cos x][h'(x) + \sin x]$$

$$= 2[g(x) - \sin x][h(x) - \cos x] + 2[h(x) - \cos x][-g(x) + \sin x] = 0$$

for every $x$. Therefore $\phi$ is a constant. But

$$\phi(0) = [0 - 0]^2 + [1 - 1]^2 = 0.$$

Therefore $\phi(x) = 0$ for every $x$, which was to be proved. Thus we have:

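Theorem 3.

$$\sin x = \sum_{i=0}^{\infty} (-1)^i \frac{x^{2i+1}}{(2i + 1)!} = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots.$$

By differentiation,

$$\cos x = \sum_{i=0}^{\infty} (-1)^i \frac{(2i + 1)x^{2i}}{(2i + 1)!} = \sum_{i=0}^{\infty} (-1)^i \frac{x^{2i}}{(2i)!}.$$

Thus:

Theorem 4.

$$\cos x = \sum_{i=0}^{\infty} (-1)^i \frac{x^{2i}}{(2i)!} = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \cdots.$$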
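The recurrence that produced the sine coefficients can be run mechanically. The sketch below starts from $a_0 = 0$, $a_1 = 1$ and applies $a_{i+2} = -a_i/((i+1)(i+2))$ in exact arithmetic; the names `sine_coefficients` and `evaluate` are ours, and `math.sin` is used only as a reference value.

```python
from fractions import Fraction
import math

def sine_coefficients(n):
    """First n coefficients a_0..a_{n-1} (n >= 2), from a_0 = 0, a_1 = 1
    and the recurrence a_{i+2} = -a_i / ((i + 1)(i + 2))."""
    a = [Fraction(0), Fraction(1)]
    for i in range(n - 2):
        a.append(-a[i] / ((i + 1) * (i + 2)))
    return a[:n]

def evaluate(coeffs, x):
    """Value of the polynomial sum(coeffs[i] * x**i)."""
    return sum(float(c) * x ** i for i, c in enumerate(coeffs))

coeffs = sine_coefficients(20)
print(coeffs[:8])
print(evaluate(coeffs, 1.0), math.sin(1.0))
```

The even-numbered coefficients come out 0 and the odd-numbered ones come out $\pm 1/(2i+1)!$, in agreement with Theorem 3.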

Obviously, the series that we have been developing in this section can be used for calculating the values of the corresponding functions. In fact, this is the way people arrived at the values that you find in the tables of exp, sin, and cos. And the series can be adapted, in simple ways, to handle a variety of related problems. For example, consider

$$\int_0^{0.5} e^{x^2}\,dx.$$

If we could get a simple formula for a function $F$ such that

$$F'(x) = e^{x^2},$$

then the integral could be expressed as $F(0.5) - F(0)$. There is no such simple formula. But we can express such an $F$ as an infinite series, in the following way. We know that

$$e^x = \sum_{i=0}^{\infty} \frac{x^i}{i!} = 1 + x + \frac{x^2}{2!} + \cdots.$$

Therefore

$$e^{x^2} = \sum_{i=0}^{\infty} \frac{x^{2i}}{i!} = 1 + x^2 + \frac{x^4}{2!} + \cdots.$$

Integrating a term at a time, we get a function

$$F(x) = \int_0^x e^{t^2}\,dt = \sum_{i=0}^{\infty} \frac{x^{2i+1}}{i!\,(2i + 1)} = x + \frac{x^3}{3} + \frac{x^5}{5 \cdot 2!} + \cdots.$$

Evidently

$$F(0) = 0, \quad \text{and} \quad F'(x) = e^{x^2}.$$

Therefore

$$\int_0^{0.5} e^{x^2}\,dx = F(\tfrac{1}{2}),$$

and so, using the series for $F$, we can calculate $F(\tfrac{1}{2})$ approximately, with an error as small as we please.
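As a rough illustration, the series for $F$ can be summed numerically and compared with a direct numerical integration. In the sketch below the names `F_series` and `simpson` are ours, and Simpson's rule is brought in only as an independent check; it is not part of the method of this section.

```python
import math

def F_series(x, n_terms):
    """Partial sum of x**(2i+1) / (i! * (2i+1)) for i = 0..n_terms-1."""
    return sum(x ** (2 * i + 1) / (math.factorial(i) * (2 * i + 1))
               for i in range(n_terms))

def simpson(f, a, b, n):
    """Composite Simpson rule with n subintervals (n even)."""
    h = (b - a) / n
    total = f(a) + f(b)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * f(a + j * h)
    return total * h / 3

series_value = F_series(0.5, 10)
check_value = simpson(lambda t: math.exp(t * t), 0.0, 0.5, 1000)
print(series_value, check_value)
```

Ten terms of the series are far more than needed here, since the terms shrink roughly like $(1/4)^i / i!$.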

PROBLEM SET 10.7


Find a series for each of the following functions.

1. f(x)

x In (x

In each case, name the interval on

2. f(x) x2 In (x2 + I).


4. rxf(t) dt, where f is as in

which you know that your series converges to the given function.

3. f(x)

5. f(x)
7. f(x)
11. f(x)
9. /(x)

13. f(x)

15. F(x)
17.

F(x)

2
x ln (x

I).

2x.
sin (x/2).

sin

cos

Jo

10. f(x)

G).

3 "'3
x e

l"'

6. f(x)
8. f(x)

+ I).

3 t3
t e

12. f(x)

dt.

lxf(t)dt,

wheref is as in Problem

14.

i"'f(t)dt, where/is as in Problem 16.

14. f(x)

16. f(x)

18. f(x)

x sin x.
cos 2x.

2.

sin x cos x.

'n-'x

"'
xe - x.

(e"'

Problem

for x

for x

for x

;e 0
0

;e 0
0
=

x ;e 0
for x
0
for x

for

468

Infinite Series

19.

F(x)

21.

f(x)

23. f(x)

10.8

f'f(t)dt,

where/ is as in Problem 18.

x3 cos2 x.

20.

f(x)

22. F(x)

cos2 x - sin2 x

24. F(x)

cos2 x.

{ f(t ) dt where f i s
J as in Problem21.
o

x cos2 x + x sin2 x.

tf and (2) f(0)


I. Either before or
25. Find a series for a function f such that (1) f'
after finding the series, find an elementary formula for such a function f
=

26. Find a series for a function f such that (1) /' (x)
/(0)
0.

2/ (x)/x for every x 0 and (2)

27. Is there only one function satisfying the conditions of Problem26? Why or why not?
28. Get a formula for $D^i x^i$, where $D^i f$ denotes the $i$th derivative of the function $f$.

29. Get a formula for $D^i x^j$, valid for $i < j$.

30. Do the same, for the case $i > j$.

*31. Given $f(x) = \sum_{i=0}^{\infty} a_i x^i$. Get a formula for $f^{(i)}(0)$. (Here $f^{(i)}$ denotes the $i$th derivative of $f$.)

*32. Is it possible that there are two different power series for the same function, valid on the same open interval $I$? That is, given

$$f(x) = \sum_{i=0}^{\infty} a_i x^i = \sum_{i=0}^{\infty} b_i x^i \quad \text{on } I,$$

does it follow that $a_i = b_i$ for each $i$? Why or why not?

*33. A function $f$ is called real-analytic on an interval $I$ if $f$ can be expressed as a power series $\sum_{i=0}^{\infty} a_i x^i$. Does there exist a real-analytic function $f$, on an interval $(-a, a)$, such that $f^{(i)}(0) = (i!)^2$ for each $i$? Why or why not?

10.8 THE BINOMIAL SERIES

It is possible to show, by induction, that if $n$ is a positive integer, then

$$(a + b)^n = a^n + n a^{n-1} b + \frac{n(n - 1)}{2}\, a^{n-2} b^2 + \cdots + \frac{n(n - 1) \cdots (n - i + 1)}{i!}\, a^{n-i} b^i + \cdots + b^n.$$

Here the coefficient of $a^{n-i} b^i$ can be written more briefly as

$$\binom{n}{i} = \frac{n!}{i!\,(n - i)!} = \frac{n(n - 1) \cdots (n - i + 1)}{i!}.$$

The induction proof of the binomial theorem depends on the identity

$$\binom{n}{i - 1} + \binom{n}{i} = \binom{n + 1}{i}.$$

You may have seen this proved. In any case, we shall not stop to prove it now, because the elementary form of the binomial theorem is a corollary of a more general result which we shall prove presently.

We would like to generalize the familiar binomial formula

$$(a + b)^n = \sum_{i=0}^{n} \binom{n}{i} a^{n-i} b^i$$

to take care of the case in which $n$ is not an integer. That is, we want a formula for $(a + b)^k$, where $k$ is any real number. The following observations are obvious:

1) For $k = 0$, we have $(a + b)^k = 1$, and our problem is solved. We may therefore assume that $k \ne 0$.

2) For the case of interest, in which $k$ is not an integer, the exponential $c^k$ is defined only for $c > 0$. (See Section 4.9.) Therefore we must assume that $a + b > 0$.

3) For $a = b$, the problem has an immediate solution: $(a + b)^k = (2a)^k = 2^k a^k$. We therefore may assume hereafter that $a \ne b$. And we want to assume this, because the case $a = b$ does not fit the pattern that is going to emerge.

4) It is now a matter of notation to suppose that $a > b$.

We let $x = b/a$, so that

$$a + b = a(1 + x), \quad \text{and} \quad (a + b)^k = a^k (1 + x)^k.$$

If we had $|x| = |b/a| \ge 1$, then either $b \ge a$ or $b \le -a$; and these possibilities are ruled out by conditions (2) and (4). Therefore $|x| < 1$, and our problem takes the following form:

Problem. Given $k \ne 0$, and

$$f(x) = (1 + x)^k \quad (|x| < 1).$$

Find a formula for $f(x)$, analogous to the binomial formula.


Our past experience with sin, cos, and exp suggests that we should investigate the relation between $f(x) = (1 + x)^k$ and its derivatives, and use the results in the investigation of the series. Now

$$f'(x) = k(1 + x)^{k-1}.$$

Therefore

$$(1 + x) f'(x) = k f(x).$$

If $f(x) = \sum_{i=0}^{\infty} a_i x^i$, then

$$f'(x) = \sum_{i=0}^{\infty} i a_i x^{i-1} = \sum_{i=1}^{\infty} i a_i x^{i-1}.$$

Therefore

$$x f'(x) = \sum_{i=0}^{\infty} i a_i x^i.$$

We want to express $(1 + x) f'(x)$ as a series, and so we need to express $f'(x)$ in the form $\sum b_i x^i$. For this purpose we use a trick. Let $j = i - 1$, so that $i = j + 1$. This gives

$$f'(x) = \sum_{j=0}^{\infty} (j + 1)a_{j+1} x^j = \sum_{i=0}^{\infty} (i + 1)a_{i+1} x^i.$$

The equation $(1 + x) f'(x) = k f(x)$ now takes the form

$$\sum_{i=0}^{\infty} [(i + 1)a_{i+1} + i a_i]\, x^i = \sum_{i=0}^{\infty} k a_i x^i.$$

Comparing coefficients of $x^i$, we get

$$(i + 1)a_{i+1} + i a_i = k a_i \;\Leftrightarrow\; (i + 1)a_{i+1} = (k - i)a_i \;\Leftrightarrow\; a_{i+1} = \frac{k - i}{i + 1}\, a_i.$$

Obviously

$$a_0 = f(0) = (1 + 0)^k = 1.$$

Therefore

$$a_0 = 1;$$

$$a_1 = \frac{k - 0}{0 + 1}\, a_0 = k,$$

$$a_2 = \frac{k - 1}{1 + 1}\, a_1 = \frac{k(k - 1)}{2},$$

$$a_3 = \frac{k - 2}{2 + 1}\, a_2 = \frac{k(k - 1)(k - 2)}{3 \cdot 2} = \frac{k(k - 1)(k - 2)}{3!},$$

and in general, for $i \ge 1$,

$$a_i = \frac{k(k - 1) \cdots (k - i + 1)}{i!}.$$

We denote the fraction on the right by the symbol $\binom{k}{i}$, just as in the case where $k$ is a positive integer. The above formula then takes the form

$$a_i = \binom{k}{i} \quad \text{for each } i > 0,$$

and the net result of the above discussion is that

$$\sum_{i=0}^{\infty} a_i x^i = (1 + x)^k \;\Rightarrow\; a_i = \binom{k}{i} \quad \text{for each } i.$$

That is, the series that we have found is the only series that might work. To know that our series does work, we need the following two theorems.
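Before turning to the proofs, it may help to see the recurrence at work numerically. The sketch below generates the coefficients from $a_0 = 1$ and $a_{i+1} = \frac{k - i}{i + 1}\, a_i$, and compares a partial sum with $(1 + x)^k$; the function names are ours, and the agreement shown is an experiment, not a proof.

```python
def binomial_coefficients(k, n):
    """First n coefficients a_0..a_{n-1} of the series for (1 + x)**k,
    from a_0 = 1 and a_{i+1} = (k - i) / (i + 1) * a_i."""
    coeffs = [1.0]
    for i in range(n - 1):
        coeffs.append(coeffs[-1] * (k - i) / (i + 1))
    return coeffs

def binomial_series(k, x, n):
    """Partial sum of the first n terms, intended for |x| < 1."""
    return sum(c * x ** i
               for i, c in enumerate(binomial_coefficients(k, n)))

print(binomial_coefficients(0.5, 4))          # k = 1/2
print(binomial_series(0.5, 0.2, 40), 1.2 ** 0.5)
print(binomial_coefficients(3, 6))            # positive integer k
```

When $k$ is a positive integer the coefficients vanish after $a_k$, and the series reduces to the elementary binomial formula.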

Theorem 1. The series $\sum_{i=0}^{\infty} \binom{k}{i} x^i$ is convergent for $|x| < 1$.

Proof. As in Section 10.6, let

$$r_i = \frac{|a_{i+1} x^{i+1}|}{|a_i x^i|}.$$

Then

$$r_i = \left| \frac{k(k - 1) \cdots (k - i + 1)(k - i)}{(i + 1)!} \cdot \frac{i!}{k(k - 1) \cdots (k - i + 1)} \cdot x \right| = \left| \frac{k - i}{i + 1}\, x \right|.$$

Evidently

$$\lim_{i\to\infty} r_i = |x|.$$

Therefore, by the ratio test, the series converges absolutely for $|x| < 1$, and diverges for $|x| > 1$.

Theorem 2. For every $k \ne 0$, and every $x$ between $-1$ and $1$,

$$\sum_{i=0}^{\infty} \binom{k}{i} x^i = (1 + x)^k.$$

Proof. Let $g$ be the function which is the sum of the series. We determined the coefficients in such a way that

1) $g(0) = 1$;

2) $(1 + x) g'(x) = k g(x)$.

We need to prove that

$$g(x) = f(x) = (1 + x)^k \quad (-1 < x < 1).$$

For this purpose, we use the same device that worked for the exponential. Let

$$\phi(x) = \frac{g(x)}{f(x)} = \frac{g(x)}{(1 + x)^k}.$$

Then

$$\phi'(x) = \frac{(1 + x)^k g'(x) - g(x)\, k (1 + x)^{k-1}}{(1 + x)^{2k}} = \frac{(1 + x) g'(x) - k g(x)}{(1 + x)^{k+1}} = 0,$$

so that $\phi$ is a constant. And the constant is 1, because $g(0) = f(0) = 1$. Therefore $g(x) = f(x)$ for every $x$, which was to be proved.

PROBLEM SET 10.8

1.

Write a series for v', and find out how many terms of the series you would need
to use, to calculate v'u, correct to three decimal places.


2.

.a;
x +
Do the same, for v

3.

Do the same, for V1 x

4.

Do the same, for

5. Let

1.

+ 1.

V2x + 1.

be a positive integer. Using the definition

n!
(n - i)!i!'
write formulas for

cn;1) and (i.:'.1).

6.

Using the apparatus of Problem 5, show that

7.

Using the result of Problem

<1 + xr =

6,

show that

i ( ) xi

i=O

"""

(The first and last terms on the right-hand side require a separate discussion.
note that ()
<nt1), because both are equal to 1 ; and similarly that ()
(!})
=

Since obviously (1

+ x)1

() + ()x1,

But

= 1.)

this gives an induction proof of the elementary

form of the binomial theorem.


Find a series for each of the following functions, and discuss for convergence. You need
not test for convergence at the endpoints.

f(x)

8.

v'I + x

14. f(x)

Sa"' Vl + t10 dt
f'2 VI + tlo dt

f(x) = (2 + x2)"

20. f(x)
22.

x
=

15.

V 1 + x2

--

19.

x2
.I
v

1 +x

.a;
v

x
1 + x2

(1

+ x)312

f(x) = {/2 + x

f' {/2 + t2 dt
f(x) = r (2 t2)k dt
=

21. f(x) = ( "'

J v'1 + t2
0

dt

Find a function $f$ such that (1) $(1 + x^2)f'(x) = f(x)$ and (2) $f(0) = 1$. Then show that the function that you found is the only function satisfying conditions (1) and (2).

23.

(k ;6-0)

13. f(x)

17. f(x)

16. f(x) = v'3 + x


18.

j(X) =

11. f(x)

10. f(x) = xVI + x


12. f(x) =

Same question, for the conditions (1) $f'(x)\sec x = f(x)$ and (2) $f(0) = 1$.

10.9 TAYLOR SERIES

Obviously we cannot get a series of the type $\sum a_i x^i$, converging to $f(x) = 1/x$ on an open interval containing 0, because any such series is continuous at $x = 0$, while $\lim_{x \to 0^+} 1/x = \infty$. On the other hand, there is a series
\[ \frac{1}{1 + x} = \sum_{i=0}^{\infty} (-1)^i x^i \qquad (|x| < 1). \]
If we let $x' = 1 + x$, $x = x' - 1$, then the above equation takes the form
\[ \frac{1}{x'} = \sum_{i=0}^{\infty} (-1)^i (x' - 1)^i \qquad (|x' - 1| < 1); \]
and dropping the prime we get
\[ f(x) = \frac{1}{x} = \sum_{i=0}^{\infty} (-1)^i (x - 1)^i \qquad (|x - 1| < 1). \]
The series on the right-hand side is called a Taylor series, or a Taylor expansion of the function $f$ about the point 1. A power series $\sum a_i x^i$, of the type that we have been discussing so far, is called a Maclaurin series. Thus every Maclaurin series is a Taylor series:
\[ \sum_{i=0}^{\infty} a_i x^i = \sum_{i=0}^{\infty} a_i (x - 0)^i , \]
which is a Taylor series, with $a = 0$. In this language, we may say that $f(x) = 1/x$ has no Maclaurin series, but it does have a Taylor expansion about the point 1.

Similarly, $f(x) = \ln x$ cannot have a Maclaurin series, because at $x = 0$ the function approaches $-\infty$. But we do have a series for
\[ g(x) = \ln (1 + x) = \sum_{i=1}^{\infty} (-1)^{i+1} \frac{x^i}{i} \qquad (|x| < 1). \]
Setting $x' = 1 + x$, $x = x' - 1$, we get
\[ \ln x' = \sum_{i=1}^{\infty} (-1)^{i+1} \frac{(x' - 1)^i}{i} \qquad (|x' - 1| < 1); \]
and dropping the prime we get
\[ \ln x = \sum_{i=1}^{\infty} (-1)^{i+1} \frac{(x - 1)^i}{i} \qquad (|x - 1| < 1). \]
This is a Taylor expansion of $\ln$, about the point 1.


With the obvious modifications, all our theorems for Maclaurin series hold also for Taylor series; and to prove them in the general case, we merely translate the axes by the substitution $x = x' + a$, $x' = x - a$. For example, Theorems A and B of Section 10.5 take the following forms:

Theorem A'. Given
\[ f(x) = \sum_{i=0}^{\infty} a_i (x - a)^i \qquad (a - r < x < a + r), \]
then $f$ is continuous and differentiable on the interval $(a - r, a + r)$, and the derivative of the sum is the sum of the derivatives. That is,
\[ f'(x) = \sum_{i=1}^{\infty} i a_i (x - a)^{i-1} \qquad (a - r < x < a + r). \]

Theorem B'. Given
\[ f(x) = \sum_{i=0}^{\infty} a_i (x - a)^i \qquad (a - r < x < a + r). \]
Then the integral of the sum is the sum of the integrals. That is,
\[ \int_a^x f(t)\,dt = \sum_{i=0}^{\infty} \int_a^x a_i (t - a)^i\,dt = \sum_{i=0}^{\infty} \frac{a_i}{i + 1} (x - a)^{i+1} . \]

Our other theorems can be generalized in the same style; we treat $x - a$ in exactly the same way that we used to treat $x$. Another example is as follows. You found, in Problem 31 of Problem Set 10.7, that if $f(x) = \sum_{i=0}^{\infty} a_i x^i$, on an open interval containing 0, then
\[ f^{(n)}(0) = n!\,a_n , \]
so that
\[ a_n = \frac{f^{(n)}(0)}{n!} . \]
An analogous formula holds for Taylor series:

Theorem 1. If $f(x) = \sum_{i=0}^{\infty} a_i (x - a)^i$, on an interval $(a - r, a + r)$, then
\[ a_n = \frac{f^{(n)}(a)}{n!} \]
for every $n$.

Proof. The $n$th derivative of a function described by a formula $\phi(x)$ will be denoted by $D^n \phi(x)$. We observe that
\[ D^n (x - a)^i = 0 \quad \text{for } i < n, \]
\[ D^n (x - a)^n = n! , \]
\[ D^n (x - a)^i = i(i - 1) \cdots (i - n + 1)(x - a)^{i-n} \quad \text{for } i > n . \]
Therefore, for $f(x) = \sum a_i (x - a)^i$, we have
\[ f^{(n)}(x) = a_n\,n! + \sum_{i=n+1}^{\infty} b_i (x - a)^{i-n} . \]
We don't care what form the $b_i$'s have, because every term of the sum on the right-hand side has $(x - a)$ raised to a positive power, and we are about to set $x = a$. This gives
\[ f^{(n)}(a) = a_n\,n! , \]
so that
\[ a_n = \frac{f^{(n)}(a)}{n!} , \]
which was to be proved.
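The coefficient formula $a_n = f^{(n)}(a)/n!$ can be tried out on $f(x) = 1/x$ about the point 1. The sketch below assumes the standard closed form $f^{(n)}(x) = (-1)^n n!\,x^{-(n+1)}$ for the derivatives of $1/x$ (this formula is supplied here, not taken from the text) and uses exact rational arithmetic; all function names are hypothetical.

```python
from fractions import Fraction
from math import factorial


def nth_derivative_of_reciprocal(n: int, a: Fraction) -> Fraction:
    """f^(n)(a) for f(x) = 1/x, assuming f^(n)(x) = (-1)^n n! x^-(n+1)."""
    return Fraction((-1) ** n * factorial(n)) / a ** (n + 1)


def taylor_coefficient(n: int, a: Fraction) -> Fraction:
    """a_n = f^(n)(a) / n! for f(x) = 1/x."""
    return nth_derivative_of_reciprocal(n, a) / factorial(n)


def taylor_partial_sum(x: Fraction, a: Fraction, terms: int) -> Fraction:
    """Sum of a_i (x - a)^i for i = 0, ..., terms - 1."""
    return sum(taylor_coefficient(i, a) * (x - a) ** i for i in range(terms))


if __name__ == "__main__":
    one = Fraction(1)
    # About a = 1 the coefficients alternate 1, -1, 1, -1, ...
    print([taylor_coefficient(n, one) for n in range(6)])
    # Partial sums at x = 3/2 approach 1/x = 2/3.
    print(float(taylor_partial_sum(Fraction(3, 2), one, 40)))
```

The alternating coefficients agree with the series $\sum (-1)^i (x - 1)^i$ obtained earlier in this section by substitution.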

We have found that for some functions the use of Taylor series in place of Maclaurin series is a necessity. For example, $1/x$ and $\ln x$ don't have any Maclaurin expansions. In other cases a Taylor series may be preferable, even though the Maclaurin expansion exists. The point is that $\sum a_i x^i$ usually converges rapidly when $x$ is close to 0, and more slowly when $x$ is larger. To take an extreme case, we know that $\sin 10{,}000\pi = 0$, because 10,000 is even. Therefore it must be true that
\[ \sum_{i=0}^{\infty} (-1)^i \frac{1}{(2i + 1)!} (10{,}000\pi)^{2i+1} = 0 . \]
But in waiting for the partial sums to get close to 0, we had better not be impatient. In general, if we want to use a series to calculate a function numerically, we should choose the "base point" $a$ as close as possible to the value of $x$ that we want to substitute.

Suppose, for example, that we have calculated $\ln 1.5 = 0.4055$. One way to calculate $\ln 1.6$ would be to take $x = 1.6$ in the series
\[ \ln x = \sum_{i=1}^{\infty} (-1)^{i+1} \frac{(x - 1)^i}{i} . \]
This gives
\[ \ln 1.6 = \sum_{i=1}^{\infty} (-1)^{i+1} \frac{(0.6)^i}{i} . \]
But the convergence of the series is slow. We therefore use the base point 1.5. Thus
\[ \ln x = \sum_{i=0}^{\infty} a_i (x - 1.5)^i , \qquad a_i = \frac{f^{(i)}(1.5)}{i!} . \]
For $f(x) = \ln x$, we have
\[ f'(x) = 1/x = x^{-1}, \qquad f''(x) = (-1)x^{-2}, \qquad \ldots, \qquad f^{(i)}(x) = (-1)^{i+1}(i - 1)!\,x^{-i} , \]
\[ f^{(0)}(1.5) = \ln 1.5 = 0.4055 . \]
Therefore
\[ a_i = \frac{(-1)^{i+1}}{i} (1.5)^{-i} \qquad (i > 0), \]
and
\[ \ln x = 0.4055 + \sum_{i=1}^{\infty} \frac{(-1)^{i+1}}{i} \left( \frac{x - 1.5}{1.5} \right)^i . \]
For $x = 1.6$, this gives
\[ \ln 1.6 = 0.4055 + \sum_{i=1}^{\infty} \frac{(-1)^{i+1}}{i} \left( \frac{1}{15} \right)^i , \]
which converges much more rapidly.

Note that the above derivation tacitly assumes that $\ln$ has a Taylor expansion about the point 1.5; if it has, then the coefficients must be given by the formula
\[ a_i = \frac{f^{(i)}(a)}{i!} . \]
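The difference in speed between the two base points can be checked numerically. This is a rough sketch: the function names are hypothetical, and `math.log` is used only as a reference value against which the partial sums are measured.

```python
import math

LN_1_5 = 0.4055  # value of ln 1.5 quoted in the text, to four decimal places


def ln_series_about_1(x: float, terms: int) -> float:
    """Partial sum of sum_{i>=1} (-1)^(i+1) (x - 1)^i / i."""
    return sum((-1) ** (i + 1) * (x - 1) ** i / i for i in range(1, terms + 1))


def ln_series_about_1_5(x: float, terms: int, ln_base: float = LN_1_5) -> float:
    """ln_base + partial sum of sum_{i>=1} ((-1)^(i+1) / i) ((x - 1.5)/1.5)^i."""
    u = (x - 1.5) / 1.5
    return ln_base + sum((-1) ** (i + 1) * u ** i / i for i in range(1, terms + 1))


def terms_needed(partial_sum, target: float, tol: float) -> int:
    """Smallest n for which |partial_sum(n) - target| < tol."""
    n = 1
    while abs(partial_sum(n) - target) >= tol:
        n += 1
    return n


if __name__ == "__main__":
    target = math.log(1.6)
    slow = terms_needed(lambda n: ln_series_about_1(1.6, n), target, 1e-4)
    # Use the exact ln 1.5 here so that only the truncation error is measured.
    fast = terms_needed(
        lambda n: ln_series_about_1_5(1.6, n, math.log(1.5)), target, 1e-4
    )
    print("terms needed about 1:  ", slow)
    print("terms needed about 1.5:", fast)
    print("with the rounded constant:", ln_series_about_1_5(1.6, 5))
```

When the rounded constant 0.4055 is used, the final answer can be no more accurate than that constant, however many terms are taken.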

It is a fact that if a function $f$ has a Taylor expansion
\[ f(x) = \sum_{i=0}^{\infty} a_i (x - a)^i , \]
converging on an interval $|x - a| < r$, then any other point $b$ of the interval can also be used as a base point, giving an expansion
\[ f(x) = \sum_{i=0}^{\infty} b_i (x - b)^i , \]
which converges on some interval containing $b$. But the proof would be hard, in the present context, and should be postponed until we can use the theory of functions of a complex variable.
PROBLEM SET 10.9

For some of the functions in the first twelve problems below, it is a practical proceeding to derive a general formula for $f^{(n)}(a)$, and use the formula to calculate the coefficients $a_i$ in the series $\sum a_i (x - a)^i$. In each such case, calculate the coefficients by this method. In cases where the derivation of the general formula seems unreasonably difficult, merely calculate the first three terms of the series.

1. $f(x) = \sin x$, $a = 0$.
2. $f(x) = \tan x$, $a = 0$.
3. $f(x) = \tan x$, $a = \pi$.
4. $f(x) = \cos x$, $a = 2\pi$.
5. $f(x) = \operatorname{Tan}^{-1} x$, $a = 0$.
6. $f(x) = \operatorname{Tan}^{-1} x$, $a = 1$.
7. $f(x) = e^x$, $a = 0$.
8. $f(x) = e^x$, $a = 1$.
9. $f(x) = \ln (2 + x)$, $a = 0$.
10. $f(x) = e^x$, $a$ = any number.
11. $f(x) = \ln (1 + x^2)$, $a = 0$.
12. $f(x) = \sin x$, $a = 1$.

13. This is a separate problem, and it requires you to think of a trick. Given that $\ln 1.4 = 0.3365$, find a way, using series, to calculate $\ln 2$, correct to three decimal places. (To four decimal places, $\ln 2 = 0.6931$.)

14. Let $\sum_{i=1}^{\infty} a_i$ be any series. For each $i$, let
\[ b_i = \tfrac{1}{2}(a_i + |a_i|), \qquad c_i = \tfrac{1}{2}(a_i - |a_i|). \]
Show that if $\sum b_i$ and $\sum c_i$ both converge, then $\sum a_i$ converges absolutely.

15. Let $\sum a_i$, $\sum b_i$, $\sum c_i$ be as in Problem 14. Show that if $\sum a_i$ converges and $\sum |a_i| = \infty$, then
\[ \sum b_i = \infty \qquad \text{and} \qquad \sum c_i = -\infty . \]

*16. Let $n_1, n_2, \ldots$ be a sequence of positive integers in which each positive integer appears exactly once. That is, the numbers $n_1, n_2, \ldots$ are the integers $1, 2, 3, \ldots$ arranged in some order. For each series $\sum_{i=1}^{\infty} a_i$, we can then form a "rearranged series" $\sum_{i=1}^{\infty} a_{n_i}$ in which the same terms appear in some order. The following theorem is a sort of "commutative law of addition" for positive series.

Theorem. If $a_i > 0$ for each $i$, and $\sum a_i = A$, then every rearrangement of $\sum a_i$ converges to the same sum.

Prove this.

*17. Show that

has a rearrangement which converges to 0. (Thus the "commutative law for infinite sums" does not hold in general.)

*18. Show that for every number $k$ there is a rearrangement of the above series which converges to $k$.

10.10 TAYLOR'S THEOREM. ESTIMATES OF REMAINDERS

In the preceding section, we showed that, if a function is expressible by a Taylor series, with
\[ f(x) = \sum_{i=0}^{\infty} a_i (x - a)^i \qquad (a - r < x < a + r), \]
then the coefficients $a_i$ are given by the formula
\[ a_i = \frac{f^{(i)}(a)}{i!} . \]
Using the formula, we can write down a series. But there are three questions which it is natural to ask:

1) For what values of $x$ does the series converge? (We recall that $\operatorname{Tan}^{-1} x$ is defined for every $x$, but its series converges only for $-1 \le x \le 1$.)

2) Does the series converge to the function $f$ that we started with?

3) If we use a partial sum
\[ S_n(x) = \sum_{i=0}^{n} \frac{f^{(i)}(a)}{i!} (x - a)^i \]
as an approximation of $f(x)$, what is the error? For this, we need an estimate of the "remainder function"
\[ R_n(x) = f(x) - S_n(x) = f(x) - \sum_{i=0}^{n} \frac{f^{(i)}(a)}{i!} (x - a)^i . \]

Partial answers to these questions are given by the following theorem.

Theorem 1 (Taylor's theorem). If $f$ has $n + 1$ derivatives, on the interval $[a, x]$ or the interval $[x, a]$, then
\[ R_n(x) = \frac{f^{(n+1)}(\bar{x})}{(n + 1)!} (x - a)^{n+1} , \]
for some $\bar{x}$ between $a$ and $x$.

The proof is artificial, and hard to remember. We regard $x$ as a constant; and for each $t$ we let
\[ F(t) = f(x) - \sum_{i=0}^{n} \frac{f^{(i)}(t)}{i!} (x - t)^i . \]
Here we have simply replaced $a$ by $t$ in the formula for $R_n(x)$. For $t = a$ we have $F(a) = R_n(x)$. For $t = x$ we have
\[ F(x) = f(x) - \frac{f^{(0)}(x)}{0!} = f(x) - f(x) = 0 . \]
Since
\[ F(t) = f(x) - \frac{f(t)}{0!} - \frac{f'(t)}{1!}(x - t) - \frac{f''(t)}{2!}(x - t)^2 - \cdots - \frac{f^{(n)}(t)}{n!}(x - t)^n , \]
we have
\[ \begin{aligned} F'(t) = {} & -f'(t) - \left[ f''(t)(x - t) - f'(t) \right] - \left[ \frac{f'''(t)}{2!}(x - t)^2 - \frac{f''(t)}{2!} \cdot 2(x - t) \right] \\ & - \cdots - \left[ \frac{f^{(n+1)}(t)}{n!}(x - t)^n - \frac{f^{(n)}(t)}{n!} \cdot n(x - t)^{n-1} \right] . \end{aligned} \]
Here all terms cancel out, telescopically, except the first term in the last bracket; and so
\[ F'(t) = -\frac{f^{(n+1)}(t)}{n!}(x - t)^n . \]
Now let
\[ G(t) = \frac{(x - t)^{n+1}}{(n + 1)!} , \]
so that
\[ G'(t) = -\frac{(x - t)^n}{n!} . \]
To the functions $F$ and $G$, on the interval between $a$ and $x$, we apply the parametric mean-value theorem. (This is Theorem 2 of Section 9.2.) It gives
\[ \frac{F(x) - F(a)}{G(x) - G(a)} = \frac{F'(\bar{x})}{G'(\bar{x})} , \]
for some $\bar{x}$ between $a$ and $x$. Since $F(x) = G(x) = 0$, this means that
\[ \frac{-F(a)}{-G(a)} = \frac{F'(\bar{x})}{G'(\bar{x})} . \]
And
\[ \frac{F'(t)}{G'(t)} = f^{(n+1)}(t) \]
for every $t$. Therefore
\[ \frac{F(a)}{G(a)} = f^{(n+1)}(\bar{x}) . \]
By definition of $G(a)$ and $F(a)$, we have
\[ R_n(x) = F(a) = f^{(n+1)}(\bar{x})\,G(a) = \frac{f^{(n+1)}(\bar{x})}{(n + 1)!}(x - a)^{n+1} , \]
which was to be proved.


In some cases we can use this theorem to prove that a formal power series converges to the expected function. For example, we may be able to find a number $M$ such that
\[ |f^{(n+1)}(\bar{x})| \le M \]
for every $n$ and every $\bar{x}$ between $a$ and $x$. In such a case it follows that
\[ |R_n(x)| \le \frac{M\,|x - a|^{n+1}}{(n + 1)!} , \]
where the right-hand side approaches 0 as $n \to \infty$. Therefore $R_n(x) \to 0$, and $f(x)$ is the sum of its formal Taylor series. Most of the time, however, estimates of $(n + 1)$st derivatives are hard to come by. For example, the calculation of $f^{(n)}(x)$ is unmanageable for the function
\[ f(x) = \frac{1}{1 + x^3} , \]
even though we can easily see what the Maclaurin series is:
\[ \frac{1}{1 + x^3} = \sum_{i=0}^{\infty} (-1)^i x^{3i} \qquad (|x| < 1). \]
It follows, of course, that
\[ f^{(3i)}(0) = (-1)^i (3i)! , \]
and that $f^{(n)}(0) = 0$ if $n$ is not divisible by 3. But this does not give us any information about $f^{(n)}(x)$ for other values of $x$.
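For a function whose derivatives are easy to bound, the remainder estimate can be checked directly. The sketch below takes $f(x) = e^x$ with base point $a = 0$ and assumes the standard fact that every derivative of $e^x$ is $e^x$, so that $M = e^{\max(x, 0)}$ bounds $|f^{(n+1)}|$ between 0 and $x$. The function names are hypothetical.

```python
import math


def taylor_partial_sum_exp(x: float, n: int) -> float:
    """S_n(x) = sum_{i=0}^{n} x^i / i!, the Maclaurin partial sum for e^x."""
    return sum(x ** i / math.factorial(i) for i in range(n + 1))


def lagrange_bound_exp(x: float, n: int) -> float:
    """M |x|^(n+1) / (n+1)!, with M = e^max(x, 0) bounding f^(n+1) between 0 and x."""
    m = math.exp(max(x, 0.0))
    return m * abs(x) ** (n + 1) / math.factorial(n + 1)


if __name__ == "__main__":
    x = 3.0
    for n in (2, 5, 10, 15):
        actual = abs(math.exp(x) - taylor_partial_sum_exp(x, n))
        # The actual error should never exceed the bound from Taylor's theorem.
        print(n, actual, lagrange_bound_exp(x, n))
```

The bound is usually pessimistic, since it replaces $f^{(n+1)}(\bar{x})$ by its largest possible value on the interval.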

PROBLEM SET 10.10


1 through 6. In at least six of the first twelve problems in Problem Set 10.9, it is easy to get an estimate of $f^{(n)}(x)$, and then show by Taylor's Theorem that the series converges to the given function. Identify these six cases, and carry out the process.

10.11 THE COMPLEX NUMBER SYSTEM

Formally speaking, complex numbers are numbers of the type
\[ z = a + bi , \]
where $a$ and $b$ are real numbers, and where $i$ is some sort of number such that $i^2 = -1$. Granted that there is such a number system, and that it obeys the same manipulative rules as the real number system, the equation $i^2 = -1$ gives all that we need to perform calculations. For example,
\[ (a + bi)^2 = a^2 + 2abi + b^2 i^2 = (a^2 - b^2) + 2abi ; \]
\[ (a + bi)(c + di) = (ac - bd) + (ad + bc)i ; \]
\[ \left( \frac{1}{\sqrt{2}} + \frac{1}{\sqrt{2}}\,i \right)^2 = i ; \]
\[ i^4 = (i^2)^2 = 1 ; \qquad i^{10{,}001} = i ; \]
\[ \left( \frac{1}{2} + \frac{\sqrt{3}}{2}\,i \right)^3 = \tfrac{1}{8}(1 + \sqrt{3}\,i)^3 = \tfrac{1}{8}(1 + 3\sqrt{3}\,i + 3 \cdot 3i^2 + 3\sqrt{3}\,i^3) = \tfrac{1}{8}(1 + 3\sqrt{3}\,i - 9 - 3\sqrt{3}\,i) = -1 . \]

Obviously $0 = 0 + 0i$ has no reciprocal. But if $a + bi \ne 0$, then $a + bi$ has a reciprocal in the complex number system. To see this, note that if $a + bi \ne 0$, then $a$ and $b$ are not both $= 0$. Therefore $a - bi \ne 0$, and
\[ \frac{1}{a + bi} = \frac{1}{a + bi} \cdot \frac{a - bi}{a - bi} = \frac{a - bi}{a^2 - (bi)^2} = \frac{a - bi}{a^2 + b^2} = \frac{a}{a^2 + b^2} + \frac{-b}{a^2 + b^2}\,i = A + Bi . \]
This calculation begins with the assumption that $a + bi$ and $a - bi$ have reciprocals, but once we know the answer, it is easy to check:
\[ (a + bi)(A + Bi) = (a + bi)\left( \frac{a}{a^2 + b^2} + \frac{-b}{a^2 + b^2}\,i \right) = \frac{(a + bi)(a - bi)}{a^2 + b^2} = \frac{a^2 + b^2}{a^2 + b^2} = 1 . \]
Therefore $A + Bi$ is the reciprocal of $a + bi$.

There are several ways to define the set of complex numbers, as a mathematical system, and check their properties. One such method is explained in Appendix J. Meanwhile we shall regard the complex numbers as known, and calculate with them, using the familiar laws of algebra and the fact that $i^2 = -1$.

The conjugate of the complex number
\[ z = a + bi \]
is the number
\[ \bar{z} = a - bi . \]
The absolute value of $z$ is
\[ |z| = \sqrt{a^2 + b^2} . \]
By straightforward calculations, we get the following.

Theorem 1. For all complex numbers $z$, $z_1$, $z_2$, we have:
\[ \bar{\bar{z}} = z , \]
\[ z + \bar{z} \text{ is a real number,} \]
\[ z\bar{z} \text{ is a real number,} \]
\[ |z|^2 = z\bar{z} , \]
\[ \overline{z_1 + z_2} = \bar{z}_1 + \bar{z}_2 , \]
\[ \overline{z_1 z_2} = \bar{z}_1 \bar{z}_2 , \]
\[ |z_1 z_2| = |z_1| \cdot |z_2| . \]

Proofs.
\[ z = a + bi , \qquad \bar{z} = a - bi , \qquad \bar{\bar{z}} = a - (-b)i = a + bi = z ; \]
\[ z + \bar{z} = a + bi + a - bi = 2a ; \]
\[ z\bar{z} = (a + bi)(a - bi) = a^2 + b^2 = |z|^2 ; \]
\[ \overline{z_1 + z_2} = \overline{a_1 + b_1 i + a_2 + b_2 i} = (a_1 + a_2) - (b_1 + b_2)i = a_1 - b_1 i + a_2 - b_2 i = \bar{z}_1 + \bar{z}_2 ; \]
\[ z_1 z_2 = (a_1 + b_1 i)(a_2 + b_2 i) = a_1 a_2 - b_1 b_2 + (a_1 b_2 + a_2 b_1)i ; \]
\[ \bar{z}_1 \bar{z}_2 = (a_1 - b_1 i)(a_2 - b_2 i) = a_1 a_2 - b_1 b_2 - (a_1 b_2 + a_2 b_1)i = \overline{z_1 z_2} ; \]
\[ |z_1 z_2| = \sqrt{z_1 z_2\,\overline{z_1 z_2}} = \sqrt{z_1 z_2\,\bar{z}_1 \bar{z}_2} = \sqrt{z_1 \bar{z}_1}\,\sqrt{z_2 \bar{z}_2} = |z_1| \cdot |z_2| . \]
(Note that in the last of these calculations all the radicands are real and $\ge 0$, as they should be.)
So far, we have been treating all these ideas algebraically. We shall now interpret them geometrically, plotting each complex number $z = x + yi$ as a point $(x, y)$ in a coordinate plane.

[Figure: the point $z = x + yi$ plotted at $(x, y)$.]

Thus real numbers $z = x$ fall on the $x$-axis; we shall think of this as the real axis. And "pure imaginary numbers," of the form $z = iy$, fall on the $y$-axis; we shall think of this as the imaginary axis. This explains the labels on the axes in the figure below. Evidently $\bar{z}$ is the reflection of $z$ across the $x$-axis. If you reflect twice, you get back where you started; and this is the geometric meaning of the equation $\bar{\bar{z}} = z$.

[Figure: $z = x + yi$ and its conjugate $\bar{z} = x - yi$, plotted against the real and imaginary axes.]

As the figure suggests, $|z|$ is the distance to $z$ from the origin; the reason is that
\[ |z| = \sqrt{x^2 + y^2} , \]
which gives the distance. More generally, $|z_1 - z_2|$ is the distance between $z_1$ and $z_2$.

[Figure: the points $z_1$ and $z_2$ and the segment joining them.]

For $z_1 = x_1 + y_1 i$ and $z_2 = x_2 + y_2 i$, we have
\[ z_1 - z_2 = (x_1 - x_2) + (y_1 - y_2)i , \]
so that
\[ |z_1 - z_2| = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2} , \]
which gives the distance.

If $z = x + yi$, then $x$ is called the real part of $z$, and is denoted by $\operatorname{Re} z$. The number $y$ is called the imaginary part of $z$, and is denoted by $\operatorname{Im} z$. Thus
\[ z = x + yi = \operatorname{Re} z + i \operatorname{Im} z . \]
It is easy to check that
\[ \operatorname{Re} z = \tfrac{1}{2}(z + \bar{z}) , \qquad \operatorname{Im} z = \frac{1}{2i}(z - \bar{z}) . \]
Since $i^2 = -1$, we have
\[ \frac{1}{i} = -i , \qquad \text{and} \qquad \operatorname{Im} z = -\frac{i}{2}(z - \bar{z}) . \]

These formulas enable us to connect complex numbers with the geometry of our coordinate plane. For example, the graph of the equation $|z| = 1$ is the circle with center at the origin and radius 1; and the graph of the equation $|z - z_0| = a$ is the circle with center at $z_0$ and radius $a$. The vertical line through the point $(1, 0) = 1 + 0i$ is the graph of the equations
\[ x = 1 \iff \operatorname{Re} z = 1 \iff \tfrac{1}{2}(z + \bar{z}) = 1 \iff z + \bar{z} = 2 . \]

In the following problem set, you will be asked to carry out a variety of such processes. For short, we shall use the term C-equation to describe an equation in which complex numbers are the only variables. Thus the vertical line discussed above is the graph of the C-equation $z + \bar{z} = 2$; and a certain circle is the graph of the C-equation $z\bar{z} = 4$.
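The sample calculations of this section, and the formula $A + Bi$ for the reciprocal, can be checked with Python's built-in `complex` type. This is only a sketch; `reciprocal` and `power_of_i` are hypothetical helper names, and floating-point results are compared up to rounding error.

```python
import math


def reciprocal(a: float, b: float) -> complex:
    """Reciprocal of a + bi, computed as A + Bi with A = a/(a^2+b^2), B = -b/(a^2+b^2)."""
    d = a * a + b * b
    if d == 0:
        raise ZeroDivisionError("0 + 0i has no reciprocal")
    return complex(a / d, -b / d)


def power_of_i(n: int) -> complex:
    """i^n for an integer n, using i^4 = 1."""
    return (1, 1j, -1, -1j)[n % 4]


if __name__ == "__main__":
    z = complex(3, -4)
    print(reciprocal(3, -4) * z)                            # close to 1
    print(complex(1 / math.sqrt(2), 1 / math.sqrt(2)) ** 2)  # close to i
    print(complex(0.5, math.sqrt(3) / 2) ** 3)               # close to -1
    print(power_of_i(10001))                                 # i
    # |z| is the distance from the origin, and z * conj(z) = |z|^2.
    print(abs(z), z * z.conjugate())
```

Python writes the imaginary unit as `1j`; `abs` and `.conjugate()` correspond to $|z|$ and $\bar{z}$.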

PROBLEM SET 10.11

Reduce each of the following expressions to the form

I. (1
4. (-1 - _1 r
1
J
+ i)4

v2

7.

10.

v2

v3
- +21
(2

2
8. (3 - iJ
11. (1
14. (1
.
20. 2i 1
+ i)2

(v2 - v2 i)8

(2 + i)2

2i +

25.

v3 + 2i

28.

i3 + j2 + i

31. 11
34.

+i

v3

-1

32.

+I

Show that zn

37.
39.
41.
43. -

27.

2i

zn, for every

Re z +Im z

Re z =Im z.

lz

II < I.

I.

z.

40.

+ i)3

v2

_1

I + 3i

v3 i

i2 + i + 1
1

(i +

2i

Show that lz"I = lzl", for every z.


Show that l/z

Sketch the graphs of the following C-equations.


=

---

1 +

36.
38.

v2

30. 1)3
33. 1 -- 3i

2 +

i4 + i3 + i2 + i + I

1 r

v2i

- i + 1

- i

is + i4 + i3 +

35.

26. 29. i2 1
3i
2i

G( _1
( -1

2 .

2 + v3 i

23.

v2

2 - i

3i

(1

18.
1
24. -

i +2

17

3.
6.
9.
12.

15. (v3 + i)4

- 2i)2

- v3 i)3

- i)4

v3
- +-1
2
J

5.

13.
16. (1
19. 1
1

22.

(1
(1

2.

+ yi.

Re z

42.

I=

44.

lz

11 = I.
Im

II > I.

(l/z), for every

= I.

0.

484
45.

\z - i\ = t.

49.

\z\2 = 4.

47.

10.12

Infinite Series

46. \z+i\ < t.


48. z+i>l.

z+i = 0.

50. \i\2 = 4.

51. z2 = 4.

52. z4 = 4.

55. \z - i\ = 1.

56. \i - z\ + \z+i\ 2.

53. z-i=l.

57

< l_
\z+i\ =
2

54.

Ii - zl + \z+ii = 2.

The following two problems require the theory developed in Appendix J.

*58. Find a polynomial $q(x)$ such that $(x^5 + x^4 + x^3 + x^2 + x + 1)q(x) \equiv 1 \bmod 1 + x^2$.

*59. We found that if $z \ne 0$, then $z$ has a reciprocal in C. In the language of congruences, this says that if $p(x) \not\equiv 0 \bmod 1 + x^2$, then there is a polynomial $q(x)$ such that
\[ p(x)q(x) \equiv 1 \bmod 1 + x^2 . \]
Similarly, if
\[ p(x) - p'(x) = r(x)(x^2 + x + 1) , \]
then we write
\[ p(x) \equiv p'(x) \bmod x^2 + x + 1 . \]
Show that if $p(x) \not\equiv 0 \bmod x^2 + x + 1$, then there is a polynomial $q(x)$ such that $p(x)q(x) \equiv 1 \bmod x^2 + x + 1$. (In fact, the congruence classes of polynomials modulo $x^2 + x + 1$ form a field. All the conditions for a field can be checked, in the same way as for $1 + x^2$, except for the existence of reciprocals.) Is it true that every polynomial is congruent mod $x^2 + x + 1$ to some linear polynomial?

*60. In Section 10.8 we deduced the binomial formula
\[ (a + b)^n = \sum_{i=0}^{n} \binom{n}{i} a^{n-i} b^i \]
from a more general result, using the methods of calculus. But the methods of Section 10.8 do not, as they stand, apply in the complex domain. Show, by induction, that
\[ (1 + z)^n = \sum_{i=0}^{n} \binom{n}{i} z^i = 1 + \binom{n}{1} z + \binom{n}{2} z^2 + \cdots + \binom{n}{n-1} z^{n-1} + z^n , \]
for every positive integer $n$ and every complex number $z$. Then show that for all complex numbers $u$ and $v$,
\[ (u + v)^n = \sum_{i=0}^{n} \binom{n}{i} u^{n-i} v^i . \]

10.12 SEQUENCES AND SERIES OF COMPLEX NUMBERS. THE COMPLEX EXPONENTIAL FUNCTION

We know that for sequences $x_1, x_2, \ldots$ of real numbers,
\[ \lim_{n \to \infty} x_n = x \iff \lim_{n \to \infty} |x_n - x| = 0 . \]
The second of these conditions also has a meaning if the $x_n$'s and $x$ are complex numbers, because the absolute values $|x_n - x|$ are real in any case. We use this idea to define limits for sequences of complex numbers.

Definition. Let $z_1, z_2, \ldots$ and $z$ be complex numbers. If
\[ \lim_{n \to \infty} |z_n - z| = 0 , \]
then
\[ \lim_{n \to \infty} z_n = z . \]

We can test for limits by examining the real and imaginary parts of the sequence separately.

Theorem 1. For each $n$, let
\[ z_n = x_n + y_n i . \]
If the sequences $x_1, x_2, \ldots$ and $y_1, y_2, \ldots$ are convergent, then $z_1, z_2, \ldots$ is convergent, and
\[ \lim_{n \to \infty} z_n = \lim_{n \to \infty} x_n + i \lim_{n \to \infty} y_n . \]

Proof. Let
\[ x = \lim_{n \to \infty} x_n , \qquad y = \lim_{n \to \infty} y_n , \]
so that
\[ \lim_{n \to \infty} (x_n - x) = \lim_{n \to \infty} (y_n - y) = 0 . \]
We need to show that
\[ \lim_{n \to \infty} (x_n + y_n i) = x + yi , \]
which means, by definition, that
\[ \lim_{n \to \infty} |(x_n + y_n i) - (x + yi)| = 0 . \]
This is trivial:
\[ \lim_{n \to \infty} |x_n + y_n i - x - yi| = \lim_{n \to \infty} \left[ (x_n - x)^2 + (y_n - y)^2 \right]^{1/2} = (0^2 + 0^2)^{1/2} = 0 . \]

The converse is also true.

Theorem 2. If $\lim_{n \to \infty} (x_n + y_n i) = x + yi$, then $\lim_{n \to \infty} x_n = x$ and $\lim_{n \to \infty} y_n = y$.

(Proof?) Once we have Theorems 1 and 2, it is a routine matter to verify that the limit of a sum, product, or quotient is the sum, product, or quotient of the limits, just as for real sequences. We shall use these rules without comment.
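Just as for real numbers, we define convergence of infinite series in terms of convergence of sequences.

Definition. Given a series $\sum_{j=0}^{\infty} z_j$. For each $n$, let
\[ S_n = \sum_{j=0}^{n} z_j . \]
If
\[ \lim_{n \to \infty} S_n = S , \]
then we say that $\sum_{j=0}^{\infty} z_j$ converges (to $S$), and we write
\[ \sum_{j=0}^{\infty} z_j = S . \]

For real series, we found that if $\sum |x_i|$ converges, then $\sum x_i$ also converges. The same is true in the complex domain.

Theorem 3. If $\sum_{j=0}^{\infty} |z_j|$ converges, then $\sum_{j=0}^{\infty} z_j$ converges.

Proof. For each $j$, let
\[ z_j = x_j + y_j i . \]
Then
\[ |x_j| \le |z_j| , \]
and similarly
\[ |y_j| \le |z_j| \]
for each $j$. Therefore
\[ \sum |x_j| < \infty \qquad \text{and} \qquad \sum |y_j| < \infty . \]
Therefore $\sum x_j$ and $\sum y_j$ converge, to sums $A$ and $B$. Therefore
\[ \lim_{n \to \infty} \sum_{j=0}^{n} z_j = \lim_{n \to \infty} \sum_{j=0}^{n} (x_j + y_j i) = \lim_{n \to \infty} \sum_{j=0}^{n} x_j + i \lim_{n \to \infty} \sum_{j=0}^{n} y_j = A + Bi . \]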

The simplicity of this theorem, and of its proof, are misleading: the theorem is powerful. It gives immediately:

Theorem 4. $\sum_{j=0}^{\infty} (z^j/j!)$ is convergent for every $z$.

This is so because
\[ \sum_{j=0}^{\infty} \left| \frac{z^j}{j!} \right| = \sum_{j=0}^{\infty} \frac{|z|^j}{j!} < \infty \]
for every real number $|z|$. This enables us to extend the domain of the exponential $e^x = \exp x$ to the entire complex plane:

Definition. $e^z = \exp z = \sum_{j=0}^{\infty} (z^j/j!)$.

For the case in which $z$ is a pure imaginary number $i\theta$, we can express $e^{i\theta}$ in another form:
\[ e^{i\theta} = \sum_{j=0}^{\infty} \frac{(i\theta)^j}{j!} = \sum_{k=0}^{\infty} \frac{(i\theta)^{2k}}{(2k)!} + \sum_{k=0}^{\infty} \frac{(i\theta)^{2k+1}}{(2k + 1)!} . \]
Now
\[ (i\theta)^{2k} = (-1)^k \theta^{2k} \qquad \text{and} \qquad (i\theta)^{2k+1} = i(-1)^k \theta^{2k+1} . \]
Therefore
\[ e^{i\theta} = \sum_{k=0}^{\infty} \frac{(-1)^k \theta^{2k}}{(2k)!} + i \sum_{k=0}^{\infty} \frac{(-1)^k \theta^{2k+1}}{(2k + 1)!} . \]
This gives:

Theorem 5. For every real number $\theta$,
\[ e^{i\theta} = \cos \theta + i \sin \theta . \]

Setting $\theta = \pi$, we get a famous equation due to Leonhard Euler:
\[ e^{i\pi} = -1 . \]

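It is easy to see that if $z = e^{i\theta}$, then $|z| = 1$. The reason is that
\[ |e^{i\theta}| = |\cos \theta + i \sin \theta| = \sqrt{\cos^2 \theta + \sin^2 \theta} = 1 . \]
Conversely, if $|z| = 1$, then $z = e^{i\theta}$ for some $\theta$. Here $z = x + yi$, $x = \cos \theta$, and $y = \sin \theta$.

More generally, every complex number can be expressed in the form
\[ z = re^{i\theta} , \qquad r = |z| \ge 0 . \]
To see this (for $z \ne 0$), we let
\[ w = \frac{z}{|z|} , \]
so that $|w| = 1$. Therefore
\[ w = e^{i\theta} \]
for some $\theta$, and
\[ z = |z|(\cos \theta + i \sin \theta) = |z| e^{i\theta} = re^{i\theta} = r(\cos \theta + i \sin \theta) . \]
The expression $re^{i\theta}$ (or $r(\cos \theta + i \sin \theta)$) is called the polar form for $z$, because it describes $z$ in terms of the polar coordinates $r$, $\theta$ of the corresponding point.

For example, consider
\[ z = 1 + 2i . \]
Here $|z| = \sqrt{5}$, and so
\[ w = \frac{z}{|z|} = \frac{1 + 2i}{\sqrt{5}} = \frac{1}{\sqrt{5}} + \frac{2}{\sqrt{5}}\,i . \]
Let $\theta$ be any number such that
\[ \cos \theta = \frac{1}{\sqrt{5}} , \qquad \sin \theta = \frac{2}{\sqrt{5}} . \]
Then
\[ w = \cos \theta + i \sin \theta = e^{i\theta} \]
and
\[ z = rw = \sqrt{5}\,e^{i\theta} = r(\cos \theta + i \sin \theta) \]
in polar form.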
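Theorem 6. For every $\theta$ and $\alpha$,
\[ e^{i\theta} e^{i\alpha} = e^{i(\theta + \alpha)} . \]
(The same is true for all complex numbers $w$, $z$. That is, $e^w e^z = e^{w+z}$. But we are not yet in a position to prove it.)

Proof.
\[ \begin{aligned} e^{i\theta} e^{i\alpha} &= (\cos \theta + i \sin \theta)(\cos \alpha + i \sin \alpha) \\ &= \cos \theta \cos \alpha - \sin \theta \sin \alpha + (\sin \theta \cos \alpha + \cos \theta \sin \alpha)i \\ &= \cos (\theta + \alpha) + i \sin (\theta + \alpha) \\ &= e^{i(\theta + \alpha)} . \end{aligned} \]

In the polar form $re^{i\theta}$, $r$ is called the modulus and $\theta$ is called the amplitude. It is a slight abuse of language to speak of $\theta$ as the amplitude of $re^{i\theta}$, because while the modulus is determined when the number is named, the amplitude $\theta$ is not determined. In fact, when we apply the exponent $i\theta$ to $e$, we get a periodic function $f(\theta)$.

Theorem 7. For every integer $n$ (positive, negative, or zero),
\[ e^{i(\theta + 2n\pi)} = e^{i\theta} . \]
This is so because $\cos (\theta + 2n\pi) = \cos \theta$ and $\sin (\theta + 2n\pi) = \sin \theta$.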
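The series definition of $e^z$ and the formula $e^{i\theta} = \cos \theta + i \sin \theta$ can be compared numerically. In this sketch `exp_series` is a hypothetical name, and the cutoff of 40 terms is an arbitrary choice that is ample for the small values of $|z|$ used here.

```python
import math


def exp_series(z: complex, terms: int = 40) -> complex:
    """Partial sum of sum_{j>=0} z^j / j!, using the first `terms` terms."""
    total = 0j
    term = 1 + 0j  # z^0 / 0!
    for j in range(terms):
        total += term
        term *= z / (j + 1)  # turns z^j / j! into z^(j+1) / (j+1)!
    return total


if __name__ == "__main__":
    for theta in (0.0, 1.0, math.pi, -2.5):
        lhs = exp_series(1j * theta)
        rhs = complex(math.cos(theta), math.sin(theta))
        print(theta, lhs, rhs)
    # Theorem 6: multiplying e^{i theta} and e^{i alpha} adds the amplitudes.
    print(exp_series(0.7j) * exp_series(1.1j), exp_series(1.8j))
```

The printed pairs agree up to rounding error, and the value at $\theta = \pi$ is close to $-1$.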

PROBLEM SET 10.12

Express the following complex numbers in polar form.


I.

1 + i

4.

v3

2.
i

s.

3i

v3

+ i

3.

-7

6.

-4 - 4i

7. In the complex domain, the sine and cosine are defined by the series
\[ \sin z = \sum_{j=0}^{\infty} (-1)^j \frac{z^{2j+1}}{(2j + 1)!} , \qquad \cos z = \sum_{j=0}^{\infty} (-1)^j \frac{z^{2j}}{(2j)!} . \]
How do we know that these series converge for every $z$?

8. In the text, we expressed $e^{i\theta}$ in terms of $\sin \theta$ and $\cos \theta$. More generally, express $e^{iz}$ in terms of $\sin z$ and $\cos z$.

9. Show that for every $z$,
\[ \sin (-z) = -\sin z , \qquad \cos (-z) = \cos z . \]

10. Express $e^{-iz}$ in terms of $\sin z$ and $\cos z$.

11. Express $\sin z$ and $\cos z$ in terms of the complex exponential function.

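10.13 DE MOIVRE'S THEOREM

In Section 10.12 we found that every complex number could be expressed in polar form, with
\[ z = re^{i\theta} , \]
where $r = |z|$. And Theorem 6 said that for every $\theta$ and $\alpha$,
\[ e^{i\theta} e^{i\alpha} = e^{i(\theta + \alpha)} . \]
This gives us a rule for multiplying complex numbers in polar form: we multiply the moduli and add the amplitudes. For
\[ z_1 = r_1 e^{i\theta_1} , \qquad z_2 = r_2 e^{i\theta_2} , \]
we have
\[ z_1 z_2 = r_1 r_2 e^{i(\theta_1 + \theta_2)} . \]
To divide, we divide by the modulus of the divisor and subtract the amplitude:
\[ z_1 / z_2 = (r_1 / r_2) e^{i(\theta_1 - \theta_2)} \qquad (r_2 \ne 0). \]
(This is easy to check by multiplication.) By induction based on the multiplication rule, for $z = re^{i\theta}$ we have
\[ z^n = r^n e^{in\theta} . \]

These ideas give us a method for extracting roots of any order. We shall now see that the number 1 has three cube roots in the complex domain. If $z^3 = 1$, then $|z|^3 = 1$, and so $|z| = 1$. Therefore
\[ z = e^{i\theta} \]
for some $\theta$. Therefore
\[ z^3 = e^{3i\theta} = 1 \]
and
\[ 3\theta = 0 + 2n\pi \]
for some $n$. For three successive values of $n$ we get:
\[ n = 0, \quad \theta = 0, \quad z = z_1 = 1 ; \]
\[ n = 1, \quad \theta = \tfrac{2}{3}\pi , \quad z = z_2 = \cos \tfrac{2}{3}\pi + i \sin \tfrac{2}{3}\pi ; \]
\[ n = 2, \quad \theta = \tfrac{4}{3}\pi , \quad z = z_3 = \cos \tfrac{4}{3}\pi + i \sin \tfrac{4}{3}\pi . \]
Using other values of $n$, we would get repetitions of the same cube roots. Thus the roots are
\[ z_1 = 1 , \qquad z_2 = -\frac{1}{2} + \frac{\sqrt{3}}{2}\,i , \qquad z_3 = -\frac{1}{2} - \frac{\sqrt{3}}{2}\,i . \]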

These cube roots could have been found by elementary methods, because
\[ z^3 - 1 = (z - 1)(z^2 + z + 1) , \]
and the quadratic formula gives $z_2$ and $z_3$ as the roots of the equation $z^2 + z + 1 = 0$. But for roots of higher order, and for numbers less simple than 1, the elementary methods break down, and our new method still works. For example, $i = e^{i\pi/2}$. Therefore the fourth roots of $i$ are the numbers $e^{i\theta}$ for which
\[ 4\theta = \frac{\pi}{2} + 2n\pi , \qquad \text{or} \qquad \theta = \frac{4n + 1}{8}\pi . \]
Four successive values of $n$ give us
\[ \theta_0 = \frac{\pi}{8} , \qquad \theta_1 = \frac{5\pi}{8} , \qquad \theta_2 = \frac{9\pi}{8} , \qquad \theta_3 = \frac{13\pi}{8} , \]
from which the roots $z_j = e^{i\theta_j}$ can be computed, by repeated applications of the half-angle formulas for the sine and cosine.

In general, every complex number $z \ne 0$ has exactly $n$ $n$th roots in the complex domain, and the roots can be expressed in the form that we have been using.

Theorem 1 (De Moivre's theorem). Every complex number
\[ z = re^{i\theta} \ne 0 \]
has exactly $n$ $n$th roots. These are the numbers
\[ z_j = \sqrt[n]{r}\,e^{i\theta_j} \qquad (j = 0, 1, \ldots, n - 1), \]
where
\[ \theta_j = (1/n)(\theta + 2j\pi) \qquad (j = 0, 1, \ldots, n - 1). \]

To prove this, we need to investigate two things.

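1) The modulus of the roots. If $w^n = z$, then
\[ |w|^n = |z| = r , \qquad \text{and} \qquad |w| = \sqrt[n]{r} . \]

2) The amplitudes of the roots. If
\[ w = \sqrt[n]{r}\,e^{i\alpha} , \qquad \text{and} \qquad w^n = z , \]
then
\[ n\alpha = \theta + 2j\pi \]
for some $j$, and
\[ \alpha = \frac{\theta}{n} + \frac{j}{n}\,2\pi . \]
Any $n$ successive values of $j$ give us $n$ different values of $e^{i\alpha}$, but thereafter the values of $e^{i\alpha}$ repeat themselves.

For example, consider
\[ z = 1 + 2i , \qquad n = 5 . \]
Then
\[ z = re^{i\theta} , \]
where
\[ r = \sqrt{5} , \qquad \theta = \operatorname{Sin}^{-1} (2/\sqrt{5}) . \]
The fifth roots of $z$ are the numbers
\[ z_j = \sqrt[5]{r}\,e^{i\theta_j} = \sqrt[5]{\sqrt{5}}\,e^{i\theta_j} = 5^{1/10} e^{i\theta_j} \qquad (j = 0, 1, 2, 3, 4), \]
where
\[ \theta_j = \tfrac{1}{5}(\theta + 2j\pi) = \frac{\theta}{5} + \frac{2j\pi}{5} \qquad (j = 0, 1, 2, 3, 4). \]
Thus
\[ z_0 = 5^{1/10} e^{i\theta/5} , \qquad z_1 = 5^{1/10} e^{i(\theta + 2\pi)/5} , \qquad z_2 = 5^{1/10} e^{i(\theta + 4\pi)/5} , \]
and so on. Note that it is not easy to express these numbers in the form $a + bi$.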
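Although the roots are awkward to write exactly in the form $a + bi$, they are easy to approximate. The following sketch computes the $n$ $n$th roots from the polar form; `nth_roots` is a hypothetical name, and the amplitude returned by `cmath.polar` is the one lying in $(-\pi, \pi]$, which by Theorem 7 of Section 10.12 is as good as any other choice.

```python
import cmath
import math


def nth_roots(z: complex, n: int) -> list:
    """The n nth roots r^(1/n) e^{i(theta + 2 j pi)/n}, j = 0, ..., n - 1, of z != 0."""
    if z == 0 or n < 1:
        raise ValueError("need z != 0 and n >= 1")
    r, theta = cmath.polar(z)  # z = r e^{i theta}
    root_r = r ** (1.0 / n)
    return [cmath.rect(root_r, (theta + 2 * j * math.pi) / n) for j in range(n)]


if __name__ == "__main__":
    for w in nth_roots(1 + 2j, 5):
        # Each root has modulus 5^(1/10), and its fifth power is close to 1 + 2i.
        print(w, abs(w), w ** 5)
    print(nth_roots(1, 3))  # the three cube roots of 1
```

Each printed root has modulus $5^{1/10}$, and raising it to the fifth power recovers $1 + 2i$ up to rounding error.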

De Moivre's theorem shows that C not only contains roots of all orders for all real numbers, but also roots of all orders for all complex numbers. This means, in particular, that any quadratic equation with coefficients in C can be solved in C. The method follows the derivation of the familiar quadratic formula.
\[ az^2 + bz + c = 0 \qquad (a \ne 0) \]
\[ \iff z^2 + \frac{b}{a} z = -\frac{c}{a} \]
\[ \iff z^2 + \frac{b}{a} z + \frac{b^2}{4a^2} = -\frac{c}{a} + \frac{b^2}{4a^2} \]
\[ \iff \left( z + \frac{b}{2a} \right)^2 = \frac{b^2 - 4ac}{4a^2} . \]
If $b^2 - 4ac = 0$, then the equation takes the form
\[ \left( z + \frac{b}{2a} \right)^2 = 0 \iff z + \frac{b}{2a} = 0 , \]
and $z = -b/2a$ is the only root. If $b^2 - 4ac \ne 0$, then the complex number $(b^2 - 4ac)/4a^2$ has two square roots $z_1$, $z_2$, and
\[ az^2 + bz + c = 0 \iff z + b/2a = z_1 \quad \text{or} \quad z + b/2a = z_2 . \]
Therefore the roots are the numbers
\[ -\frac{b}{2a} + z_1 , \qquad -\frac{b}{2a} + z_2 . \]

In fact, a much more general result holds: every polynomial equation
\[ a_n z^n + a_{n-1} z^{n-1} + \cdots + a_1 z + a_0 = 0 \qquad (n \ge 1, \ a_n \ne 0), \]
with coefficients in C, has a root in C. We express this by saying that C is algebraically closed. An easy proof appears eventually, in the theory of functions of a complex variable.
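The derivation of the quadratic formula over C translates directly into a short routine. This is a sketch under the assumption that `cmath.sqrt` supplies one square root $z_1$ of $(b^2 - 4ac)/4a^2$, the other being $z_2 = -z_1$; the name `solve_quadratic` is hypothetical.

```python
import cmath


def solve_quadratic(a: complex, b: complex, c: complex) -> tuple:
    """Roots -b/(2a) + z1 and -b/(2a) + z2 of a z^2 + b z + c = 0, with a != 0."""
    if a == 0:
        raise ValueError("the leading coefficient a must be nonzero")
    shift = -b / (2 * a)
    z1 = cmath.sqrt((b * b - 4 * a * c) / (4 * a * a))  # one square root
    z2 = -z1                                            # the other square root
    return shift + z1, shift + z2


if __name__ == "__main__":
    # z^2 + z + 1 = 0 gives the two nonreal cube roots of 1.
    print(solve_quadratic(1, 1, 1))
    # A quadratic with nonreal coefficients.
    a, b, c = 1j, 2, 1 - 1j
    for z in solve_quadratic(a, b, c):
        print(z, a * z * z + b * z + c)  # residual close to 0
```

When $b^2 - 4ac = 0$ the two returned values coincide, matching the single root $-b/2a$.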

PROBLEM SET 10.13

Solve the following equations, algebraically in terms of radicals.


1.

z4 + 1

3. z3 + 8
5. z3 +

z2

0.

2. z6 + i

0.

+ z + 1

4. z2 + 2z
=

8. z5 +
9.

0.

i + 1

6. z2 + z + i +

0.

7. z7 + z6 + z5 + z4 + z3 +
4

+ 2z3 + 2z2 + z + 1

+ 1

0.

0.

0.

0.

We know that for each $n$, the number 1 has exactly $n$ $n$th roots. Show that we can always find one of these, say, $z_0$, so that the complete set of $n$th roots are the powers $z_0, z_0^2, z_0^3, \ldots, z_0^n$ of $z_0$.
10. List, in polar form, the fifth roots of i

1.

11. If $z_0$ is as in Problem 9, we say that $z_0$ is a generator of the $n$th roots of 1. Can $z_0$ be chosen at random? That is, if $z_0^n = 1$, and $z_0 \ne 1$, does it follow that $z_0$ generates the $n$th roots of 1? Why or why not?

12. For each $n$, let $Z_n$ be the set of all $n$th roots of 1. Show that (a) $Z_n$ is closed under multiplication and (b) $Z_n$ contains the reciprocal of each of its elements.

*13. Let $p$ be a prime, and let $z_0$ be any element of $Z_p$, other than 1. Show that the numbers $z_0, z_0^2, \ldots, z_0^{p-1}$ are all different from 1. Then show that they are all different (from each other). Explain how this result is related to one of the preceding problems.

*10.14 THE RADIUS OF CONVERGENCE. DIFFERENTIATION OF COMPLEX POWER SERIES

In dealing with real series, we observed the following phenomena.

1) The domain of convergence of $\sum a_i x^i$ was always symmetric about zero, except perhaps for the endpoints of an interval. That is, the domain of convergence always turned out to be (a) 0 alone, (b) $(-\infty, \infty)$, or (c) an interval of one of the types $(-r, r)$, $[-r, r]$, $[-r, r)$, $(-r, r]$.

2) The function $f(x) = 1/(1 + x^2)$ is defined for every $x$, and has derivatives of all orders. Nevertheless, its series $\sum_{i=0}^{\infty} (-1)^i x^{2i}$ converges only on the interval $(-1, 1)$. Here the series goes bad for reasons which seem unrelated to the properties of the function which it represents.

We shall now find out why these things happen. The next two theorems are modeled on theorems which are known in the real domain.

Theorem 1. If $\sum z_i$ is convergent, then $\lim_{n \to \infty} z_n = 0$.

Proof. For each $n$, let $S_n = \sum_{i=0}^{n} z_i$, so that $\lim_{n \to \infty} S_n = S = \sum_{i=0}^{\infty} z_i$. Then
\[ \lim_{n \to \infty} z_n = \lim_{n \to \infty} (S_n - S_{n-1}) = S - S = 0 . \]
(Note that this is exactly like the old proof.)

Theorem 2. Every convergent sequence of complex numbers is bounded.

Proof. Let $z_n = x_n + iy_n$. Since $z_1, z_2, \ldots$ converges, it follows that the sequences $x_1, x_2, \ldots$ and $y_1, y_2, \ldots$ also converge. Therefore they are bounded, and we have
\[ |x_n| \le a , \qquad |y_n| \le b , \]
for every $n$. Therefore
\[ |z_n| = \sqrt{x_n^2 + y_n^2} \le \sqrt{a^2 + b^2} . \]

Throughout this section, for each $r > 0$, $D_r$ denotes the interior of the circle with center at the origin and radius $r$, in the complex plane. Here $D$ stands for disk: the interior of a circle is called an open disk. If we include the boundary circle, we get the closed disk
\[ \bar{D}_r = \{ z \mid |z| \le r \} . \]

Theorem 3. If a series $\sum_{i=0}^{\infty} a_i z^i$ converges for $z = z_0$, with $z_0 \ne 0$, and $0 < s < |z_0|$, then the series converges at every point of $D_s$.

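The first step in the proof is to show that $\sum |a_i| s^i$ is convergent. For this purpose we use a comparison test. We have
\[ |a_i| s^i = |a_i|\,|z_0|^i \frac{s^i}{|z_0|^i} = |a_i z_0^i| \left| \frac{s}{z_0} \right|^i . \]
But we know that $\sum a_i z_0^i$ is convergent. Therefore $a_i z_0^i \to 0$, and so the numbers $a_i z_0^i$ form a bounded sequence, with
\[ |a_i z_0^i| \le b \]
for every $i$. And
\[ \sum_{i=0}^{\infty} \left| \frac{s}{z_0} \right|^i = \frac{1}{1 - |s/z_0|} < \infty , \]
because
\[ \left| \frac{s}{z_0} \right| = \frac{s}{|z_0|} < 1 . \]
Therefore
\[ \sum_{i=0}^{\infty} |a_i| s^i = \sum_{i=0}^{\infty} |a_i z_0^i| \left| \frac{s}{z_0} \right|^i \le b \sum_{i=0}^{\infty} \left| \frac{s}{z_0} \right|^i = \frac{b}{1 - |s/z_0|} < \infty . \]
Now for each point $z$ of $D_s$, we have $|z| < s$. Therefore
\[ \sum_{i=0}^{\infty} |a_i z^i| \le \sum_{i=0}^{\infty} |a_i| s^i < \infty \qquad (z \text{ in } D_s). \]
Therefore $\sum a_i z^i$ converges in $D_s$, which was to be proved.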
Suppose now that we have given a series $\sum a_i z^i$ which converges for some $z \ne 0$. It follows that the series converges on some disk $D_s$. Let $S$ be the set of all such numbers $s$. That is,
\[ S = \left\{ s \;\middle|\; \sum_{j=0}^{\infty} a_j z^j \text{ converges on } D_s \right\} . \]
If $S$ is unbounded, then the series is convergent for every $z$. In this case, we say that the radius of convergence is $\infty$. If $S$ is bounded, then $S$ has a least upper bound $\sup S$. (See page 243.) Let
\[ r = \sup S . \]
In this case, we say that the radius of convergence is $r$.

Theorem 4. If a series $\sum_{i=0}^{\infty} a_i z^i$ has radius of convergence $r < \infty$, then the series converges for $|z| < r$ and diverges for $|z| > r$.

Proof.

1) Given a number $z_0$, with $|z_0| < r$. Since $r$ is the least upper bound of the set $S$, $|z_0|$ is not an upper bound of $S$. Therefore there is an $s > |z_0|$ such that the series converges on $D_s$. Therefore the series converges at $z = z_0$.

2) Given $z_0$, with $|z_0| > r$. Suppose that the series converges at $z = z_0$, and let $s$ be a number such that
\[ r < s < |z_0| . \]
Then the series converges on $D_s$; and this is impossible, because $r$ is an upper bound for such numbers $s$.

10.14

The Radius of Convergence.

Differentiation of Complex Power Series

Note that while this theorem tells us what happens inside the circle $|z| = r$, and what happens outside the circle, it tells us nothing about what happens on the circle.

This theorem clarifies the situation for real series $\sum a_j x^j$. Suppose that the series converges for some $x \ne 0$. Then the complex series $\sum_{j=0}^{\infty} a_j z^j$ converges for some $z \ne 0$ (namely, the same $x$). Let the radius of convergence be $r$. Then $\sum a_j z^j$ converges for $|z| < r$ and diverges for $|z| > r$. Therefore, for real values of $z$, $\sum a_j x^j$ converges for $|x| < r$ and diverges for $|x| > r$.

The circular domain of convergence for complex power series also accounts for the behavior of the series

$$\sum_{i=0}^{\infty} (-1)^i x^{2i} = \frac{1}{1 + x^2}.$$

If this series converged for some $x$ for which $|x| > 1$, then for the complex series

$$\sum_{i=0}^{\infty} (-1)^i z^{2i} = \frac{1}{1 + z^2}$$

we would have a radius of convergence $> 1$. This is impossible, because the function $1/(1 + z^2)$ itself blows up at a point of the unit circle: $|i| = 1$, and for $z = i$ the denominator on the right-hand side becomes 0. [Query: how do we know that the series is equal to $1/(1 + z^2)$, for complex values of $z$?]
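The contrast between the inside and the outside of the unit circle can be seen numerically. The following is only a rough sketch, not part of the text: the names `partial_sum` and `closed_form` and the two sample points are illustrative choices.

```python
def partial_sum(z: complex, n: int) -> complex:
    """Return sum_{i=0}^{n} (-1)^i z^(2i), built up term by term."""
    total = 0j
    term = 1 + 0j              # the i = 0 term
    for _ in range(n + 1):
        total += term
        term *= -(z * z)       # next term: multiply by -z^2
    return total


def closed_form(z: complex) -> complex:
    """The function 1/(1 + z^2) that the series represents for |z| < 1."""
    return 1 / (1 + z * z)


inside = 0.5 + 0.5j            # |z| < 1 (sample point, chosen arbitrarily)
outside = 1.1 + 0j             # |z| > 1 (sample point, chosen arbitrarily)

err_inside = abs(partial_sum(inside, 200) - closed_form(inside))
size_outside = abs(partial_sum(outside, 200))

print("error inside the unit circle:", err_inside)
print("size of partial sum outside:", size_outside)
```

Inside the circle the partial sums settle down on $1/(1 + z^2)$; outside, they grow without bound, as Theorem 4 predicts for a series whose radius of convergence is 1.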

The derivative of a complex-valued function is defined by an obvious analogy with the derivative of a real-valued function. That is,

$$f'(z_0) = \lim_{z \to z_0} \frac{f(z) - f(z_0)}{z - z_0}.$$

This is a complicated limiting process, because $z$ may approach $z_0$ from any direction in the complex plane.


To be exact, the indicated limit means that for every $\epsilon > 0$ there is a $\delta > 0$ such that

$$0 < |z - z_0| < \delta \;\Rightarrow\; \left| \frac{f(z) - f(z_0)}{z - z_0} - f'(z_0) \right| < \epsilon.$$

Here the inequality $|z - z_0| < \delta$ allows $z$ to lie anywhere in the interior of a circle with center at $z_0$ and radius $\delta$. If $f'(z_0)$ exists, then we say that $f$ is differentiable at $z_0$. If $f$ is differentiable at every point of an open disk containing $z_0$, then we say that $f$ is analytic at $z_0$. It is easy to see that if $f(z) = z^n$, then $f$ is differentiable everywhere, and therefore analytic everywhere, with

$$f'(z) = n z^{n-1}.$$

The proof is exactly like the proof for $f(x) = x^n$. Similarly:

Theorem 5. If $f$ is a polynomial, with

$$f(z) = \sum_{j=0}^{n} a_j z^j,$$

then $f$ is analytic everywhere, and

$$f'(z) = \sum_{j=1}^{n} j a_j z^{j-1}.$$
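The remark that $z$ may approach $z_0$ from any direction can be checked numerically for a polynomial. The sketch below is illustrative only: the polynomial, the point `z0`, the step size `h` and the helper names are all assumptions made for the example.

```python
import cmath


def poly(coeffs, z):
    """Evaluate sum_j a_j z^j by Horner's rule; coeffs[j] is a_j."""
    result = 0j
    for a in reversed(coeffs):
        result = result * z + a
    return result


def poly_derivative(coeffs, z):
    """Evaluate sum_{j>=1} j a_j z^(j-1), the formula of Theorem 5."""
    derived = [j * a for j, a in enumerate(coeffs)][1:]
    return poly(derived, z)


coeffs = [1, -2, 0, 3]         # f(z) = 1 - 2z + 3z^3 (arbitrary example)
z0 = 0.3 + 0.4j
exact = poly_derivative(coeffs, z0)

h = 1e-6
quotients = []
for k in range(8):             # eight directions, 45 degrees apart
    step = h * cmath.exp(1j * cmath.pi * k / 4)
    quotients.append((poly(coeffs, z0 + step) - poly(coeffs, z0)) / step)

max_error = max(abs(q - exact) for q in quotients)
print("largest deviation from f'(z0):", max_error)
```

Every direction of approach gives a difference quotient close to the same number, which is what differentiability at $z_0$ requires.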

But the proof of the following theorem is hard.

Theorem 6. If $f(z)$ has a power series $\sum_{j=0}^{\infty} a_j z^j$, converging in the open disk $D_r$, then $f$ is analytic in $D_r$, and

$$f'(z) = \sum_{j=1}^{\infty} j a_j z^{j-1} \qquad (|z| < r).$$

Proof. Take a fixed $z_0$ in $D_r$. To compute $f'(z_0)$, we need some preliminary results.

Let $s$ be any real number such that

$$|z_0| < s < r.$$

Then

$$\sum_{j=0}^{\infty} |a_j| s^j < \infty \qquad (0 < s < r).$$

(See the proof of Theorem 4, where we used the same apparatus.)


Therefore the series defines a function

$$g(s) = \sum_{j=0}^{\infty} |a_j| s^j,$$

and by Theorem A of Section 10.5 it follows that

$$g'(s) = \sum_{j=1}^{\infty} j |a_j| s^{j-1},$$

where the series on the right is convergent. Therefore

$$\sum_{j=1}^{\infty} j |a_j| s^{j-1} < \infty,$$

and so

$$\lim_{n\to\infty} \sum_{j=n+1}^{\infty} j |a_j| s^{j-1} = 0.$$

Now consider what we are trying to prove. The theorem says that

$$\lim_{z \to z_0} \left[ \frac{1}{z - z_0} \left( \sum_{j=0}^{\infty} a_j z^j - \sum_{j=0}^{\infty} a_j z_0^j \right) \right] = \sum_{j=1}^{\infty} j a_j z_0^{j-1},$$

for every $z_0$ in $D_r$. Simplifying the expression in brackets, and transposing the sum on the right, we get the equivalent form

$$\lim_{z \to z_0} \sum_{j=1}^{\infty} a_j \left( z^{j-1} + z^{j-2} z_0 + z^{j-3} z_0^2 + \cdots + z z_0^{j-2} + z_0^{j-1} - j z_0^{j-1} \right) = 0.$$

This is not as bad as it looks, because the limit of the $j$th term is 0 for each $j$: each of the first $j$ terms in the parentheses approaches the limit $z_0^{j-1}$, as $z \to z_0$, and so the total expression in the parentheses approaches 0. Therefore if the sum were finite our conclusion would follow: if

$$S_n(z) = \sum_{j=1}^{n} a_j \left( z^{j-1} + z^{j-2} z_0 + \cdots + z_0^{j-1} - j z_0^{j-1} \right),$$

then

$$\lim_{z \to z_0} S_n(z) = 0$$

for each $n$. We need to find out what happens to the remainder

$$R_n(z) = \sum_{j=n+1}^{\infty} a_j \left( z^{j-1} + z^{j-2} z_0 + \cdots + z_0^{j-1} - j z_0^{j-1} \right).$$

We know that $|z_0| < s$; and since we are discussing $\lim_{z \to z_0}$, we may assume that $|z| < s$. Under these conditions,

$$\left| a_j \left( z^{j-1} + z^{j-2} z_0 + \cdots + z_0^{j-1} - j z_0^{j-1} \right) \right| \le |a_j| \cdot \left( |z^{j-1}| + |z^{j-2} z_0| + \cdots + |z_0^{j-1}| + j |z_0^{j-1}| \right) < |a_j| \cdot 2j s^{j-1}.$$

Therefore, for $|z| < s$ we have

$$|R_n(z)| \le \sum_{j=n+1}^{\infty} |a_j| \, 2j s^{j-1}.$$

The sum on the right approaches 0 as $n \to \infty$.


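Now let $\epsilon > 0$ be given. There is an $n$ such that

$$|R_n(z)| < \epsilon/2 \qquad (|z| < s).$$

There is a $\delta > 0$ such that

$$0 < |z - z_0| < \delta \;\Rightarrow\; |S_n(z)| < \epsilon/2.$$

Since

$$\sum_{j=1}^{\infty} a_j(\;\cdots\;) = S_n(z) + R_n(z),$$

we have

$$0 < |z - z_0| < \delta \;\Rightarrow\; \left| \sum_{j=1}^{\infty} a_j(\;\cdots\;) \right| \le |S_n(z)| + |R_n(z)| < \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon,$$

and so

$$\lim_{z \to z_0} \sum_{j=1}^{\infty} a_j(\;\cdots\;) = 0,$$

which was to be proved.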

It is worthwhile to remember the device (used in this proof) of taking $n$ large enough to make $|R_n(z)|$ small, and then taking $|z - z_0|$ small enough to make $|S_n(z)|$ small. This is not a special or isolated device, but a sample of a standard method.

Once we know how to differentiate a series, we can apply in the complex domain a variety of techniques which we have been using in the real domain. The generalizations are not hard, as you will see. In the following problem set, $D$ is used to indicate differentiation. Thus

$$Df(z) = f'(z),$$

and

$$D \sum_{j=0}^{\infty} a_j z^j = \sum_{j=1}^{\infty} j a_j z^{j-1},$$

on any open disk $D_r$ where the series on the left converges.


PROBLEM SET 10.14
1. Find $De^z$.

2. Find $De^{-z}$.

3. Find $D \sin z$.

4. Find $D \cos z$.

5. Find $De^{2z}$.

6. Find $D \sin 2z$.

7. Show that if $f$ is analytic in $D_r$, and $g = f^n$, then $g$ is analytic in $D_r$, and $g' = n f^{n-1} f'$. (The pattern of proof for the real domain works equally well here.)

8. Let $f$ be analytic for every $z$, and let $g(z) = f(a + z)$. Show that $g'(z) = f'(a + z)$. (Evidently this is another simple special case of the chain rule.)

9. Given $f(z) = \sum_{j=0}^{\infty} a_j z^j$ in $D_r$. Show that $f$ has not only a first derivative in $D_r$, but also derivatives $f''$, $f^{(3)}$, $\ldots$ of all orders.

10. Given $f(z) = \sum_{j=0}^{\infty} a_j z^j = 0$ for every $z$ in $D_r$. Show that $a_j = 0$ for every $j$. Obviously $a_0 = 0$, because $f(0) = a_0$.

11. Given $f(z) = \sum_{j=0}^{\infty} a_j z^j$ in $D_r$. Show that if $f'(z) = 0$ for every $z$, then $f$ is a constant function and is equal to $a_0$ for each $z$.

12. Given that $f(z) = \sum a_j z^j$, $g(z) = \sum b_j z^j$ in $D_r$. Show that if $f' = g'$, then $f - g$ is a constant.

13. Let $f(z) = \cos^2 z + \sin^2 z$. Show that $f(z) = 1$ for every $z$.

14. Given that $\cos^2 z + \sin^2 z = 1$ for every $z$, does it follow that the complex functions $\cos z$ and $\sin z$ are bounded? Why or why not?

15. Show that $e^a e^z = e^{a+z}$, for every $a$ and $z$.

16. Does the addition formula for the cosine hold in the complex domain? That is, is it true that
$$\cos (a + z) = \cos a \cos z - \sin a \sin z,$$
for every $a$ and $z$?

17. Answer the same question, for the proposed identity
$$\sin (a + z) = \sin a \cos z + \cos a \sin z.$$

18. Show that $\sin 2z = 2 \sin z \cos z$.

19. Show that $\cos 2z = \cos^2 z - \sin^2 z$.

*10.15 INTEGRATION AND DIFFERENTIATION OF REAL POWER SERIES

In Section 10.5 we stated (in Theorems A and B) that a Maclaurin series could be differentiated and integrated a term at a time, on any interval $(-r, r)$ on which the series converges. We shall now prove these theorems. The ideas that are needed to do this may be easier to understand if we first show how they apply to the geometric series

$$f(x) = \frac{1}{1 - x} = \sum_{i=0}^{\infty} x^i \qquad (-1 < x < 1).$$

We want to show that

$$\int_0^k f(x)\,dx = \sum_{i=0}^{\infty} \int_0^k x^i\,dx \qquad (0 < k < 1). \tag{1}$$

Let

$$S_n(x) = \sum_{i=0}^{n} x^i, \qquad R_n(x) = \sum_{i=n+1}^{\infty} x^i,$$

so that

$$f(x) = S_n(x) + R_n(x), \qquad f(x) - S_n(x) = R_n(x),$$

and

$$\int_0^k f(x)\,dx - \int_0^k S_n(x)\,dx = \int_0^k R_n(x)\,dx. \tag{2}$$

Since $S_n(x)$ is a finite sum, we know that

$$\int_0^k S_n(x)\,dx = \sum_{i=0}^{n} \int_0^k x^i\,dx,$$

and

$$\lim_{n\to\infty} \sum_{i=0}^{n} \int_0^k x^i\,dx = \sum_{i=0}^{\infty} \int_0^k x^i\,dx,$$

by definition of the infinite sum. We shall show that

$$\lim_{n\to\infty} \int_0^k R_n(x)\,dx = 0. \tag{3}$$

If (3) holds, then we can take the limit in formula (2), getting

$$\lim_{n\to\infty} \left[ \int_0^k f(x)\,dx - \int_0^k S_n(x)\,dx \right] = \int_0^k f(x)\,dx - \sum_{i=0}^{\infty} \int_0^k x^i\,dx = 0.$$

This means that (1) holds.

The proof of (3) is as follows. We have

$$R_n(x) = \sum_{i=n+1}^{\infty} x^i = x^{n+1} + x^{n+2} + \cdots = x^{n+1} \sum_{i=0}^{\infty} x^i = \frac{x^{n+1}}{1 - x}.$$

We are integrating from 0 to $k$. If $0 \le x \le k$, then

$$x^{n+1} \le k^{n+1} \quad \text{and} \quad \frac{1}{1 - x} \le \frac{1}{1 - k}.$$

Therefore

$$0 \le R_n(x) \le \frac{k^{n+1}}{1 - k}.$$

Therefore

$$0 \le \int_0^k R_n(x)\,dx \le \int_0^k \frac{k^{n+1}}{1 - k}\,dx = \frac{k^{n+2}}{1 - k}.$$

Therefore

$$\lim_{n\to\infty} \int_0^k R_n(x)\,dx = 0,$$

which is what we wanted.
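The squeeze on the integrated remainder can be watched numerically. This is a sketch under stated assumptions, not part of the text: the value $k = 0.5$ and the function names are arbitrary, and the exact integral is taken from the standard antiderivative $-\ln(1 - x)$ of $1/(1 - x)$.

```python
import math


def integral_of_partial_sum(k: float, n: int) -> float:
    """Integral from 0 to k of S_n(x): sum_{i=0}^{n} k^(i+1)/(i+1)."""
    return sum(k ** (i + 1) / (i + 1) for i in range(n + 1))


def remainder_bound(k: float, n: int) -> float:
    """The bound k^(n+2)/(1-k) on the integral of R_n from 0 to k."""
    return k ** (n + 2) / (1 - k)


k = 0.5                              # any k with 0 < k < 1 would do
exact = -math.log(1 - k)             # integral of 1/(1-x) from 0 to k

rows = []
for n in (1, 5, 10, 20):
    gap = exact - integral_of_partial_sum(k, n)   # integral of R_n
    rows.append((n, gap, remainder_bound(k, n)))

for n, gap, bound in rows:
    print(n, gap, bound)
```

For each $n$ the gap lies between 0 and the bound, and both shrink toward 0 as $n$ grows.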


What made this work was the fact that the functions $R_1, R_2, \ldots$ were squeezed to 0 by a sequence of constants. We had

$$0 \le R_n(x) \le M_n \qquad (0 \le x \le k),$$

$$M_n = \frac{k^{n+1}}{1 - k},$$

and

$$\lim_{n\to\infty} M_n = 0.$$

It followed that

$$0 \le \int_0^k R_n(x)\,dx \le \int_0^k M_n\,dx = k M_n \to 0.$$

This method applies in general, to prove the integration theorem of Section 10.5. To do this, we need some preliminaries.

Theorem 1. Every convergent sequence is bounded.

This is Theorem 7 of Section 10.1. It gives us the following result for series.

Theorem 2. If $\sum a_i x^i$ is convergent on the interval $(-r, r)$, and $0 < k < r$, then $\sum a_i k^i$ is absolutely convergent.

Proof. Let $x_1$ be any number between $k$ and $r$. Then $\sum a_i x_1^i$ is convergent, and so $\lim_{i\to\infty} a_i x_1^i = 0$. By Theorem 1, the sequence $a_0, a_1 x_1, a_2 x_1^2, \ldots$ is bounded. Let $b$ be a bound for this sequence, so that

$$|a_i x_1^i| \le b \quad \text{for every } i.$$

Now consider the series $\sum a_i k^i$. Let $s = k/|x_1|$. Then

$$k = |x_1| s,$$

and so

$$|a_i k^i| = |a_i x_1^i| s^i \le b s^i.$$

But $0 < s < 1$, and so

$$\sum_{i=0}^{\infty} s^i = \frac{1}{1 - s} < \infty.$$

Therefore

$$\sum_{i=0}^{\infty} |a_i k^i| = \sum_{i=0}^{\infty} |a_i x_1^i| s^i \le \sum_{i=0}^{\infty} b s^i = \frac{b}{1 - s} < \infty.$$

Therefore $\sum a_i k^i$ is absolutely convergent, which was to be proved.

Carrying the same ideas a little further, we get better information.

Theorem 3. If $\sum a_i x^i$ is convergent on the interval $(-r, r)$, and $0 < k < r$, then the remainders

$$R_n(x) = \sum_{i=n+1}^{\infty} a_i x^i$$

are squeezed to 0, on the interval $[-k, k]$, by a sequence of constants. That is, there is a sequence $M_1, M_2, \ldots$ of constants, such that

$$\lim_{n\to\infty} M_n = 0,$$

and

$$|R_n(x)| \le M_n \qquad (-k \le x \le k).$$

Proof. We know by the preceding theorem that $\sum a_i k^i$ is absolutely convergent. For each $n$, let

$$M_n = \sum_{i=n+1}^{\infty} |a_i k^i|.$$

Then

$$\lim_{n\to\infty} M_n = 0,$$

because

$$M_n = \sum_{i=0}^{\infty} |a_i k^i| - \sum_{i=0}^{n} |a_i k^i|.$$

On the interval $[-k, k]$, we have $|x| \le k$. Therefore

$$|a_i x^i| \le |a_i k^i|,$$

and

$$|R_n(x)| \le \sum_{i=n+1}^{\infty} |a_i x^i| \le \sum_{i=n+1}^{\infty} |a_i k^i| = M_n.$$

Therefore

$$|R_n(x)| \le M_n \quad \text{for every } n,$$

and so the remainders $R_n(x)$ are squeezed to 0, on the interval $[-k, k]$, by the constants $M_1, M_2, \ldots$.

The ideas in this theorem are going to come up again, and so we need a briefer language in which to describe them.

Definition. Let $R_1, R_2, \ldots$ be a sequence of functions on the interval $[a, b]$. If there is a sequence $M_1, M_2, \ldots$ of positive constants, approaching 0, such that

$$|R_n(x)| \le M_n,$$

for every $x$ on $[a, b]$ and for every $n$, then we say that the functions $R_1, R_2, \ldots$ approach 0 uniformly on $[a, b]$, and we write

$$\operatorname*{U\,lim}_{n\to\infty} R_n(x) = 0 \quad \text{on } [a, b].$$

Definition. Let $S_1, S_2, \ldots$, and $S$ be functions on $[a, b]$. If

$$\operatorname*{U\,lim}_{n\to\infty} [S(x) - S_n(x)] = 0 \quad \text{on } [a, b],$$

then we say that the functions $S_1, S_2, \ldots$ approach $S$ uniformly on $[a, b]$, and we write

$$\operatorname*{U\,lim}_{n\to\infty} S_n(x) = S(x) \quad \text{on } [a, b].$$

In the case covered by Theorem 3, we had

$$[a, b] = [-k, k],$$
$$R_n(x) = \sum_{i=n+1}^{\infty} a_i x^i,$$
$$S_n(x) = \sum_{i=0}^{n} a_i x^i,$$
$$S(x) = \sum_{i=0}^{\infty} a_i x^i,$$
$$R_n(x) = S(x) - S_n(x).$$

In our new terminology, Theorem 3 takes the following form:

Theorem 3'. If $\sum a_i x^i$ is convergent on $(-r, r)$, and $0 < k < r$, then

$$\operatorname*{U\,lim}_{n\to\infty} \sum_{i=0}^{n} a_i x^i = \sum_{i=0}^{\infty} a_i x^i \quad \text{on } [-k, k].$$
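The meaning of "squeezed to 0 by constants" can be illustrated with a concrete series. The sketch below assumes the familiar series $e^x = \sum x^i / i!$ as the example, takes $k = 2$ arbitrarily, and truncates the infinite tail defining $M_n$ after 60 terms; the names `partial_sum` and `tail_constant` are illustrative.

```python
import math


def partial_sum(x: float, n: int) -> float:
    """S_n(x) = sum_{i=0}^{n} x^i / i! (partial sum of the series for e^x)."""
    return sum(x ** i / math.factorial(i) for i in range(n + 1))


def tail_constant(k: float, n: int, terms: int = 60) -> float:
    """M_n = sum_{i>n} |a_i k^i|, approximated by its first `terms` terms."""
    return sum(k ** i / math.factorial(i) for i in range(n + 1, n + 1 + terms))


k = 2.0
grid = [-k + 2 * k * j / 400 for j in range(401)]   # sample points of [-k, k]

report = []
for n in (2, 5, 10, 15):
    worst = max(abs(math.exp(x) - partial_sum(x, n)) for x in grid)
    report.append((n, worst, tail_constant(k, n)))

for n, worst, m_n in report:
    print(n, worst, m_n)
```

One constant $M_n$ bounds the remainder at every sampled point of $[-k, k]$ at once, and the constants themselves approach 0; this is the content of uniform convergence.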


At the beginning of this section, we used these ideas to integrate $\sum x^i$ a term at a time. We want to use the same idea for all power series, but we have a new problem. The series $\sum x^i$ represented a known function $f(x) = 1/(1 - x)$, which was continuous. But when $f(x)$ is given only by a series $\sum a_i x^i$, we first need to show that $f$ is continuous, in order to conclude that the sum has an integral. Thus we need the following two theorems.

Theorem 4. If $f_n$ is continuous for each $n$, and $\operatorname{U\,lim}_{n\to\infty} f_n(x) = f(x)$ on $[a, b]$, then $f$ is continuous on $[a, b]$.

Proof. Take a fixed $x_0$, and let $\epsilon$ be any positive number. Let $M_1, M_2, \ldots$ be as in the definition of U lim. Then there is an $n$ such that $M_n < \epsilon/3$. Hereafter in the proof, $n$ is fixed. Since $f_n$ is continuous at $x_0$, there is a $\delta > 0$ such that

$$|x - x_0| < \delta \;\Rightarrow\; |f_n(x) - f_n(x_0)| < \epsilon/3.$$

Since

$$|f(x) - f_n(x)| \le M_n \quad \text{for every } x,$$

we have

$$|x - x_0| < \delta \;\Rightarrow\; |f(x) - f_n(x)| < \epsilon/3, \quad |f_n(x) - f_n(x_0)| < \epsilon/3, \quad |f_n(x_0) - f(x_0)| < \epsilon/3.$$

By the triangular inequality,

$$|a + b + c| \le |a| + |b| + |c|.$$

Therefore

$$|x - x_0| < \delta \;\Rightarrow\; |f(x) - f(x_0)| < \epsilon/3 + \epsilon/3 + \epsilon/3 = \epsilon,$$

which was to be proved.


To conclude that $f$ is continuous, it is not enough to know that $\lim_{n\to\infty} f_n(x) = f(x)$ for each $x$ of $[a, b]$; we really need to know that $\operatorname{U\,lim}_{n\to\infty} f_n(x) = f(x)$ on $[a, b]$. (See the following problem set, for an example showing this.) For power series, of course, we know that $S_n(x)$ is continuous for each $n$, because $S_n(x)$ is a polynomial; and we know that the sum of the series is not merely $\lim S_n(x)$ but also $\operatorname{U\,lim} S_n(x)$, on every closed interval lying in the interval of convergence. This gives the following theorem.

Theorem 5. If $\sum a_i x^i$ is convergent on $(-r, r)$, then $\sum a_i x^i$ is a continuous function on $(-r, r)$.

Therefore there is such a thing as the integral of $\sum a_i x^i$; and we can show that it is the sum of the integrals.

Theorem 6. If $\sum_{i=0}^{\infty} a_i x^i$ converges on $(-r, r)$, and $|x| < r$, then

$$\int_0^x \left[ \sum_{i=0}^{\infty} a_i t^i \right] dt = \sum_{i=0}^{\infty} \int_0^x a_i t^i\,dt.$$

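Since

$$\int_0^x t^i\,dt = \frac{x^{i+1}}{i + 1},$$

the theorem tells us that

$$\int_0^x \left[ \sum_{i=0}^{\infty} a_i t^i \right] dt = \sum_{i=0}^{\infty} \frac{a_i}{i + 1}\, x^{i+1}.$$

To prove the theorem, we let $S_n(t) = \sum_{i=0}^{n} a_i t^i$. Then for $0 < x < r$ we have

$$\operatorname*{U\,lim}_{n\to\infty} S_n(t) = \sum_{i=0}^{\infty} a_i t^i \quad \text{on } [0, x].$$

(For $-r < x < 0$, the same condition holds on $[x, 0]$, and the rest of the proof is exactly the same.) Let

$$R_n(t) = \sum_{i=n+1}^{\infty} a_i t^i = \sum_{i=0}^{\infty} a_i t^i - S_n(t),$$

and let $M_1, M_2, \ldots$ be as in the definition of U lim. Then

$$|R_n(t)| = \left| \sum_{i=0}^{\infty} a_i t^i - S_n(t) \right| \le M_n$$

for every $n$. Therefore

$$\left| \int_0^x \left[ \sum_{i=0}^{\infty} a_i t^i \right] dt - \int_0^x S_n(t)\,dt \right| = \left| \int_0^x R_n(t)\,dt \right| \le \int_0^x M_n\,dt = M_n |x|.$$

Since $\lim_{n\to\infty} M_n |x| = 0$, it follows that

$$\int_0^x \left[ \sum_{i=0}^{\infty} a_i t^i \right] dt = \lim_{n\to\infty} \int_0^x S_n(t)\,dt.$$

But $S_n(t)$ is a finite sum, and can be integrated a term at a time. Therefore

$$\int_0^x S_n(t)\,dt = \sum_{i=0}^{n} \int_0^x a_i t^i\,dt,$$

and

$$\int_0^x \left[ \sum_{i=0}^{\infty} a_i t^i \right] dt = \lim_{n\to\infty} \sum_{i=0}^{n} \int_0^x a_i t^i\,dt = \sum_{i=0}^{\infty} \int_0^x a_i t^i\,dt,$$

which was to be proved.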

This theorem shows that if $\sum a_i x^i$ converges on $(-r, r)$, then the integral series $\sum [a_i/(i + 1)] x^{i+1}$ also converges on $(-r, r)$. The same is true for the derivative series $\sum_{i=1}^{\infty} i a_i x^{i-1}$.

Theorem 7. If $\sum_{i=0}^{\infty} a_i x^i$ converges on $(-r, r)$, then $\sum_{i=1}^{\infty} i a_i x^{i-1}$ converges on $(-r, r)$.

Proof. This is going to be very similar to the proof of Theorem 2. Let $x_1$ be any number such that $|x| < x_1 < r$. Then $\sum a_i x_1^i$ is convergent; $\lim_{i\to\infty} a_i x_1^i = 0$; and there is a bound $b$ such that $|a_i x_1^i| \le b$ for every $i$. Let $s = |x|/x_1$, so that

$$|x| = x_1 s,$$

$$|i a_i x^{i-1}| = |i a_i x_1^{i-1} s^{i-1}| = i s^{i-1} |a_i x_1^{i-1}| \le \frac{b}{x_1}\, i s^{i-1}.$$

Therefore

$$\sum_{i=1}^{\infty} |i a_i x^{i-1}| \le \frac{b}{x_1} \sum_{i=1}^{\infty} i s^{i-1}.$$

But $s < 1$, and so the series on the right-hand side converges, by the ratio test. Therefore the series on the left-hand side converges. Therefore $\sum i a_i x^{i-1}$ converges.

It remains to show that the "derivative series" $\sum i a_i x^{i-1}$ really gives the derivative, but this is easy.
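Theorem 8. If

$$f(x) = \sum_{i=0}^{\infty} a_i x^i \quad \text{on } (-r, r),$$

then $f$ is differentiable on $(-r, r)$, and

$$f'(x) = \sum_{i=1}^{\infty} i a_i x^{i-1} \qquad (-r < x < r).$$

Proof. Let

$$g(x) = \sum_{i=1}^{\infty} i a_i x^{i-1}.$$

Then

$$\int_0^x g(t)\,dt = \int_0^x \left[ \sum_{i=1}^{\infty} i a_i t^{i-1} \right] dt = \sum_{i=1}^{\infty} \int_0^x i a_i t^{i-1}\,dt = \sum_{i=1}^{\infty} a_i x^i = f(x) - a_0.$$

Therefore

$$\int_0^x g(t)\,dt = f(x) - a_0.$$

But the integral on the left-hand side is a differentiable function, and

$$D \int_0^x g(t)\,dt = g(x).$$

Therefore $f$ is differentiable, and $f' = g$, which was to be proved.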
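Term-by-term differentiation is easy to try out on the geometric series, whose sum $1/(1 - x)$ has the known derivatives $1/(1 - x)^2$ and $2/(1 - x)^3$. The following sketch is illustrative only: the truncation at 200 terms, the sample point $x = 0.5$ and the helper names are assumptions.

```python
def derivative_series(coeffs):
    """Coefficients of sum i*a_i*x^(i-1), given the coefficients a_i."""
    return [i * a for i, a in enumerate(coeffs)][1:]


def evaluate(coeffs, x):
    """Evaluate the (truncated) series sum a_i x^i by Horner's rule."""
    total = 0.0
    for a in reversed(coeffs):
        total = total * x + a
    return total


N = 200                                  # truncation point (arbitrary)
geometric = [1.0] * (N + 1)              # a_i = 1: the series for 1/(1-x)
first = derivative_series(geometric)     # series for f'
second = derivative_series(first)        # series for f''

x = 0.5
err1 = abs(evaluate(first, x) - 1 / (1 - x) ** 2)
err2 = abs(evaluate(second, x) - 2 / (1 - x) ** 3)
print(err1, err2)
```

Applying `derivative_series` twice mirrors the remark that Theorem 8 can be applied again to the derivative series.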

Obviously Theorem 8 can be applied again, to the derivative series, and so

$$f''(x) = \sum_{i=2}^{\infty} i(i - 1) a_i x^{i-2},$$

$$f^{(3)}(x) = \sum_{i=3}^{\infty} i(i - 1)(i - 2) a_i x^{i-3},$$

and so on. Thus if $f(x)$ is represented by a power series, then $f$ has an $n$th derivative for every $n$. In a way this is good; it means that functions given by series are in some respects manageable. But it also means that if a function $f$ does not have infinitely many derivatives, then $f$ cannot be represented by a power series. Later you will see that many such "irregular" functions can be represented by series of other kinds, notably by so-called Fourier series, of the form

$$a_0 + \sum_{i=1}^{\infty} [a_i \cos ix + b_i \sin ix].$$

PROBLEM SET 10.15

1. Let
$$f(x) = \sum_{i=0}^{\infty} (-1)^i \frac{x^{2i}}{(2i)!}.$$
Calculate the series for $f''(x)$, and verify that $f''(x) = -f(x)$. [This must be true, because $f(x) = \cos x$.]

2. Find a simple formula for $g''(x)$, given
$$g(x) = \sum_{i=0}^{\infty} (-1)^i \frac{x^{2i+1}}{(2i + 1)!}.$$

3. Do the same for
$$h(x) = \sum_{i=0}^{\infty} (-1)^i \frac{x^i}{i!}.$$

4. For each $n$, let $f_n(x) = x^n$ $(0 \le x \le 1)$. Let $f(x) = \lim_{n\to\infty} f_n(x)$. Sketch the graph of $f$, and find out whether it is true that $\operatorname{U\,lim}_{n\to\infty} f_n(x) = f(x)$ on $[0, 1]$. (This throws some light on Theorem 4.)

5. Find, by any method, a power series for $f(x) = x e^x + e^x$.

6. Same question, for $g(x) = x \cos x + \sin x$.

7. Same question, for $h(x) = x \sin x - \cos x$.

8. Find a series for a positive function $f$ such that $f'(x) = 2x f(x)$ for every $x$, and $f(0) = 1$. (There is a short-cut. Since $f(x) > 0$ for every $x$, $f'(x) = 2x f(x) \Leftrightarrow f'(x)/f(x) = 2x$. Now what?)

In each of the following cases, discuss limn__.oof n(x) and U limn__.00/ n(x).
9.

fn(x)

10. fn(x)
11. fn(x)
12. fn(X)
13. fn(X)
14. fn(x)

nxn,

(1/n) sin nx,

t.

-1 x 1.

(1/n) cos 2nx,


-1 x 1.
n
(1/n)e "',
0 X 1.
(1/n2)en"',

n sin (x/n)2,

0 x 1.
0 x 1.

*15. Find out whether the following is true:

Theorem (?). If $\operatorname{U\,lim}_{n\to\infty} f_n(x) = f(x)$ on $[-k, k]$, and each of the functions $f_n$ is differentiable, then $f$ is differentiable, and $f'(x) = \lim_{n\to\infty} f_n'(x)$ for every $x$ between $-k$ and $k$.

(If this is true, then it furnishes a straightforward proof of Theorem 8, replacing the proof using integrals.)
**16. Here we return to complex power series, as in Section 10.14. It is evident that if $\sum a_j z^j$ converges on $D_r$, then $\sum |a_j z^j|$ converges on $D_r$; in fact, every time we have proved convergence for a complex power series, we have first proved absolute convergence and then used Theorem 3 of Section 10.12. It remains, however, to consider the question of uniform convergence. Just as for sequences of real functions,

$$\operatorname*{U\,lim}_{n\to\infty} f_n(z) = f(z) \quad \text{on } \overline{D}_s$$

if $|f_n(z) - f(z)|$ is squeezed to 0 by a sequence of constants. That is, the above relation holds if there is a sequence $M_1, M_2, \ldots$ of positive constants such that

$$\lim_{n\to\infty} M_n = 0,$$

and

$$|f_n(z) - f(z)| \le M_n, \quad \text{for every } z \text{ in } \overline{D}_s.$$

Prove the following:

Theorem. If $\sum a_j z^j$ has $r > 0$ as its radius of convergence, and $0 < s < r$, then

$$\operatorname*{U\,lim}_{n\to\infty} \sum_{j=0}^{n} a_j z^j = \sum_{j=0}^{\infty} a_j z^j \quad \text{on } \overline{D}_s.$$

11 Vector Spaces and Inner Products

11.1 CARTESIAN COORDINATE SYSTEMS IN THREE-DIMENSIONAL SPACE

To set up a coordinate system in three-dimensional space, we use the same scheme that we used in a plane; the only difference is that we use three mutually perpendicular lines instead of two. These are the x-, y-, and z-axes. On each of the axes we take a coordinate system, in such a way that the origin $O$ has coordinate 0. The plane containing the x- and y-axes is called the xy-plane. Similarly for the yz- and xz-planes.

[Figure: the coordinate axes, with the point $P_1$.]

These are called the coordinate planes. In the figure we have indicated the position of the point $P_1$ by drawing the rectangular parallelepiped which has the origin $O$ and the point $P_1$ as opposite corners, and sides parallel to the coordinate planes. We get the coordinates of a point $P_1$, as before, by dropping perpendiculars to the coordinate axes. Here $M_1$, $M_2$, and $M_3$ are the feet of the perpendiculars from $P_1$ to the three axes. If these points have coordinates $x_1$, $y_1$, $z_1$, on the respective axes, then $P_1$ is matched with the triplet $(x_1, y_1, z_1)$, and we write

$$P_1 \leftrightarrow (x_1, y_1, z_1).$$

As in the plane, we often fail to distinguish between a point and the number triplet with which it is matched; and so we may write

$$P_1 = (x_1, y_1, z_1).$$


In figures, we may label a point as $P_1(x_1, y_1, z_1)$, to indicate that the given point has the given coordinates.

The coordinate planes divide space into eight parts, called octants. The figure above shows the first octant, consisting of all points of space for which all three coordinates are $\ge 0$.

By two applications of the Pythagorean theorem, we see that each diagonal of a rectangular parallelepiped has length $\sqrt{a^2 + b^2 + c^2}$. This means that for each point $P_1(x_1, y_1, z_1)$ we have

$$OP_1 = \sqrt{x_1^2 + y_1^2 + z_1^2}.$$

More generally, for any two points $P_1$, $P_2$, we have the distance formula given in the following theorem.

[Figure: a rectangular parallelepiped, with edges labeled $a$, $b$, $c$.]

Theorem 1. If $P_1 = (x_1, y_1, z_1)$ and $P_2 = (x_2, y_2, z_2)$, then

$$P_1 P_2 = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2}.$$

Proof. Suppose first that $x_1 \ne x_2$, $y_1 \ne y_2$, and $z_1 \ne z_2$. Then $P_1$ and $P_2$ are opposite corners of a rectangular parallelepiped. In the figure on the left below,

$$a = |x_2 - x_1|, \qquad b = |y_2 - y_1|, \qquad c = |z_2 - z_1|.$$

Therefore

$$P_1 P_2 = \sqrt{a^2 + b^2 + c^2} = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2}.$$

If some of the inequalities $x_1 \ne x_2$, $y_1 \ne y_2$, and $z_1 \ne z_2$ do not hold, then our parallelepiped reduces to a rectangle, a segment, or a point, and the same distance formula holds for simpler reasons.
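As a small illustration, the distance formula translates directly into code. This is a minimal sketch; the function name `distance` and the sample points are chosen only for the example.

```python
import math


def distance(p1, p2):
    """Distance between two points of space, by the formula of Theorem 1."""
    return math.sqrt(sum((b - a) ** 2 for a, b in zip(p1, p2)))


origin = (0, 0, 0)
corner = (1, 2, 2)

print(distance(origin, corner))          # diagonal of a 1 x 2 x 2 box
print(distance(corner, corner))          # degenerate case: a single point
```

The degenerate cases mentioned in the proof (a rectangle, a segment, or a point) need no special handling, since the corresponding squared differences are simply 0.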
We shall use this result to describe planes by equations. (See the figure on the right below.) Given a plane $E$, suppose first that $E$ does not pass through the origin,

[Figure: $P_0 \leftrightarrow (a, b, c)$, $P_1 \leftrightarrow (2a, 2b, 2c)$.]

and let $P_0 = (a, b, c)$ be the foot of the perpendicular from the origin to $E$. Let $P_1$ be the point $(2a, 2b, 2c)$. Then $OP_0 = P_0 P_1$; $P_0$ is the midpoint of the segment from $O$ to $P_1$; and $E$ is the perpendicular bisecting plane of the segment. Therefore $E$ is the set of all points of space that are equidistant from $O$ and $P_1$. That is, $E$ is the graph of the condition

$$OP = P_1 P$$
$$\Leftrightarrow\; x^2 + y^2 + z^2 = (x - 2a)^2 + (y - 2b)^2 + (z - 2c)^2$$
$$\Leftrightarrow\; 0 = -4ax + 4a^2 - 4by + 4b^2 - 4cz + 4c^2$$
$$\Leftrightarrow\; ax + by + cz - (a^2 + b^2 + c^2) = 0.$$

This has the form

$$Ax + By + Cz + D = 0, \qquad (A, B, C) \ne (0, 0, 0).$$

The condition on the right says that the numbers $A$, $B$, and $C$ are not all equal to zero; and this is correct, because the point $P_0 = (a, b, c)$ is not the origin. An equation of the above type is called a linear equation in $x$, $y$, and $z$. Thus we have shown that every plane that does not pass through the origin is the graph of a linear equation in $x$, $y$, and $z$.

For planes through the origin, the same result holds. Let $P_0(a, b, c)$ be any point of the line perpendicular to $E$ through $O$, other than $O$ itself. Let $P_1$ be the point $(-a, -b, -c)$.

[Figure: $P_0(a, b, c)$, $P_1(-a, -b, -c)$.]

Then $E$ is the perpendicular bisecting plane of the segment from $P_1$ to $P_0$; and so $E$ is the graph of the equation

$$P_1 P = P_0 P$$
$$\Leftrightarrow\; (x + a)^2 + (y + b)^2 + (z + c)^2 = (x - a)^2 + (y - b)^2 + (z - c)^2$$
$$\Leftrightarrow\; 2ax + a^2 + 2by + b^2 + 2cz + c^2 = -2ax + a^2 - 2by + b^2 - 2cz + c^2$$
$$\Leftrightarrow\; ax + by + cz = 0.$$

This has the same form $Ax + By + Cz + D = 0$, with $D = 0$, as it must be: the origin lies in the plane $E$. In general:

Theorem 2. Every plane is the graph of a linear equation in $x$, $y$, and $z$.

This theorem can be applied in a variety of ways. For example, let $E$ be the plane through the points $(3, 3, 4)$, $(2, 4, 4)$, and $(2, 3, 6)$. Now $E$ has an equation of the form

$$Ax + By + Cz + D = 0,$$

and the equation must be satisfied by the coordinates of the three given points. Therefore the coefficients $A$, $B$, $C$, and $D$ must satisfy the equations

$$3A + 3B + 4C + D = 0, \tag{1}$$
$$2A + 4B + 4C + D = 0, \tag{2}$$
$$2A + 3B + 6C + D = 0. \tag{3}$$

Subtracting (2) from (1), we get

$$A - B = 0.$$

Setting $B = A$ in (2) and (3), we get

$$6A + 4C + D = 0, \tag{2'}$$
$$5A + 6C + D = 0. \tag{3'}$$

Subtracting (3') from (2'), we get

$$A - 2C = 0,$$

and so $C = A/2$. Therefore $D = -8A$. Now, to avoid fractions, we set $A = 2$. This gives

$$A = B = 2, \qquad C = 1, \qquad D = -16,$$

and the equation is

$$2x + 2y + z - 16 = 0.$$

This checks.
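The same coefficients can be found mechanically. The sketch below does not follow the elimination used in the text; it swaps in a different technique, the cross product of two edge vectors, which the text has not introduced at this point. The function name `plane_through` and the sign convention for the result are assumptions made for the example.

```python
from math import gcd


def plane_through(p0, p1, p2):
    """Return integers (A, B, C, D) with Ax + By + Cz + D = 0 on all three points.

    Uses the cross product of the edge vectors p1 - p0 and p2 - p0
    (not the elimination carried out in the text).
    """
    u = [b - a for a, b in zip(p0, p1)]
    v = [b - a for a, b in zip(p0, p2)]
    A = u[1] * v[2] - u[2] * v[1]
    B = u[2] * v[0] - u[0] * v[2]
    C = u[0] * v[1] - u[1] * v[0]
    if (A, B, C) == (0, 0, 0):
        raise ValueError("the three points are collinear")
    D = -(A * p0[0] + B * p0[1] + C * p0[2])
    # Divide out common factors and make the first nonzero coefficient positive.
    g = gcd(gcd(abs(A), abs(B)), gcd(abs(C), abs(D)))
    A, B, C, D = (t // g for t in (A, B, C, D))
    if next(t for t in (A, B, C) if t != 0) < 0:
        A, B, C, D = -A, -B, -C, -D
    return A, B, C, D


points = [(3, 3, 4), (2, 4, 4), (2, 3, 6)]
A, B, C, D = plane_through(*points)
print(A, B, C, D)
for x, y, z in points:
    print(A * x + B * y + C * z + D)     # each value should be 0
```

The final loop is the same check as in the text: each of the three given points must satisfy the equation.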
Note that any number different from 0 could have been used as $A$. There are some cases, however, when this is not so. For example, the graph of the equation

$$y + z = 1$$

is a plane, parallel to the x-axis. This plane is the graph of infinitely many different equations, of the form

$$ky + kz - k = 0 \qquad (k \ne 0);$$

but $x$ does not appear (with nonzero coefficient) in any of these equations.
PROBLEM SET

1.

point P0

2.

11.1

Find the equation of the plane containing all points equidistant from the origin and the
=

(2,6, 4).

Find the equation of the plane containing all points equidistant from P0

Pi

(0,2,3).

(1, 0,0) and

3.

Find the equation of the plane containing all points equidistant from the planes

4.

Find the equation of the plane through the points

P2

5.

z =

and

+y +

( -1,2, 0).

Find the equation of the plane through the points P0

P2
6.

+y +

P0
=

(0,0,3).

z =
=

-1.

(1,0, 1), Pi

(2,1,1),Pi

(1,1, 1), and

( -1, -1,0), and

Find the equation of the plane through

P0

(1,3,0)

P1

and

(2, 2, 3),

which is parallel to the z-axis.

7.

The point

(1, -1, 2) is the foot of the perpendicular from the origin to a plane E.

Find the equation for this plane.

8.

Find a plane throughP0

(1,0,2),Pi

(2,2,1),andP2

(3,4,0). Is there more

than one such plane? Why?

9.

Let

(1,0,0), let B

(4,0,0), and let K

{PI 2AP

BP}. Find an equation

whose graph is K. What sort of figure is this?

11.2 DIRECTION COSINES. THE DIRECTED NORMAL FORM

Linear equations for planes are more useful if we know the geometric significance of the coefficients in the equations. For this purpose, we need the idea of directed distance on a line. Given a line $L$ with a coordinate system, and two points $P$ and $Q$ of $L$, with coordinates $x_1$ and $x_2$, we define the directed distance from $P$ to $Q$ as

$$\overline{PQ} = x_2 - x_1.$$

Since $x_1 - x_2 = -(x_2 - x_1)$, we have

$$\overline{QP} = -\overline{PQ} \quad \text{for every } P \text{ and } Q,$$

$$\overline{PQ} + \overline{QR} = \overline{PR} \quad \text{for every } P, Q, R.$$

And since the distance $PQ$ is equal to $|x_2 - x_1|$, it follows that

$$\overline{PQ} = PQ \quad \text{if } Q \text{ is in the positive direction from } P,$$

$$\overline{PQ} = -PQ \quad \text{if } Q \text{ is in the negative direction from } P.$$

This means that directed distances are determined if the positive direction on the line
L is known; we do not care where the origin is in the coordinate system.
[Figure: $P \leftrightarrow (a, b, c)$, $O(0, 0, 0)$.]

Consider now a directed line $L$, through the origin. Let $P = (a, b, c)$ be a point on the positive end of $L$, with $OP = 1$. Consider the angle between the positive end of the x-axis and the positive end of $L$. In its own plane, this angle looks like this:

[Figure: the angle between the x-axis and $L$.]

Note that the foot of the perpendicular really does have coordinate $a$ on the x-axis; in fact, this is the definition of the x-coordinate of $P$. If the angle has measure $\alpha$, then

$$\cos \alpha = a.$$

In the preceding three-dimensional figure, $\beta$ and $\gamma$ are defined similarly. They are called the direction angles of the directed line $L$. (This is an abbreviation: what we really mean is that they are the measures of the angles between the positive end of $L$ and the positive ends of the axes.) The numbers

$$\cos \alpha, \qquad \cos \beta, \qquad \cos \gamma$$

are called the direction cosines of $L$. Note that they determine not merely a line through the origin but also a positive direction on the line. If the direction on the line is reversed, then

$$\alpha \to \pi - \alpha, \qquad \beta \to \pi - \beta, \qquad \gamma \to \pi - \gamma,$$

and each of the direction cosines changes in sign.

Suppose now that $P$ is any point $(x_1, y_1, z_1)$ of $L$, other than the origin. Let $p$ be the directed distance from $O$ to $P$, relative to the given positive direction on $L$. If $p > 0$, as above, then

$$x_1/p = \cos \alpha,$$

by definition of the cosine. If $p < 0$, as on the facing page, then

$$\cos (\pi - \alpha) = x_1/(-p),$$

by definition of the cosine. Therefore

$$-\cos \alpha = -x_1/p,$$

and $x_1/p = \cos \alpha$, as before. In the same way, we get

$$x_1 = p \cos \alpha, \qquad y_1 = p \cos \beta, \qquad z_1 = p \cos \gamma.$$

We now return to our equations for planes. First we choose a direction on the normal line $N$, in such a way that the directed distance $\overline{OP_0} = p$ is $\ge 0$. Then the plane is the graph of the equation

$$ax + by + cz - (a^2 + b^2 + c^2) = 0.$$

Now $p = \sqrt{a^2 + b^2 + c^2}$. The equation therefore has the form

$$\frac{a}{p}\,x + \frac{b}{p}\,y + \frac{c}{p}\,z - p = 0,$$

or

$$x \cos \alpha + y \cos \beta + z \cos \gamma - p = 0.$$

If the plane passes through the origin, then $p = 0$, but the same result holds: the equation $ax + by + cz = 0$ takes the form $x \cos \alpha + y \cos \beta + z \cos \gamma - 0 = 0$.

Here we have chosen the direction so as to make $p \ge 0$. If we choose the opposite direction on the normal, then

$$\alpha \to \pi - \alpha, \qquad \beta \to \pi - \beta, \qquad \gamma \to \pi - \gamma,$$

$$\cos \alpha \to -\cos \alpha, \qquad \cos \beta \to -\cos \beta, \qquad \cos \gamma \to -\cos \gamma,$$

and $p \to -p$. This changes all the signs in the equation

$$x \cos \alpha + y \cos \beta + z \cos \gamma - p = 0.$$

Therefore the equation still holds; and we have the following:

Theorem 3. Let $E$ be a plane, and let $N$ be a directed normal to $E$ through the origin, with direction angles $\alpha$, $\beta$, and $\gamma$. Then $E$ is the graph of the equation

$$x \cos \alpha + y \cos \beta + z \cos \gamma - p = 0,$$

where $p$ is the directed distance from the origin to $E$, relative to the given direction on $N$.
So far, we seem to have been talking about numbers which are hard to compute. But it is easy to bring the discussion down to earth. Consider the following equation:

$$x + 2y + 3z + 4 = 0. \tag{1}$$

It is not reasonable to expect that an equation taken at random will be in the directed normal form; after all, if a plane $E$ is the graph of the equation

$$Ax + By + Cz + D = 0,$$

then $E$ is also the graph of the equation

$$kAx + kBy + kCz + kD = 0,$$

for every $k \ne 0$. In fact, Eq. (1) cannot be in the directed normal form, because of the following theorem:

Theorem 4. If $\alpha$, $\beta$, $\gamma$ are the direction angles of a directed line $L$ through the origin, then

$$\cos^2 \alpha + \cos^2 \beta + \cos^2 \gamma = 1.$$

Proof. Let $P$ be the point of $L$ for which the directed distance $\overline{OP}$ is 1. If $P = (a, b, c)$ then

$$a = \cos \alpha, \qquad b = \cos \beta, \qquad c = \cos \gamma.$$

Since

$$\overline{OP} = OP = 1 = \sqrt{a^2 + b^2 + c^2},$$

we have $a^2 + b^2 + c^2 = 1$; and the theorem follows. We also have a converse:

Theorem 5. If $a^2 + b^2 + c^2 = 1$, then there is a directed line whose direction cosines are $a$, $b$, and $c$.

Proof. Let $L$ be the line from the origin through the point $P = (a, b, c)$, directed positively from $O$ to $P$. This does it.
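This suggests that the equation

$$x + 2y + 3z + 4 = 0 \tag{1}$$

has the form

$$xk \cos \alpha + yk \cos \beta + zk \cos \gamma - pk = 0,$$

for some $\alpha$, $\beta$, $\gamma$, $p$, and $k$. If so, we must have

$$k \cos \alpha = 1, \qquad k \cos \beta = 2, \qquad k \cos \gamma = 3,$$

$$k^2 (\cos^2 \alpha + \cos^2 \beta + \cos^2 \gamma) = 1^2 + 2^2 + 3^2 = 14,$$

and so $k = \pm\sqrt{14}$. Taking $k = \sqrt{14}$, we get

$$\frac{1}{\sqrt{14}}\,x + \frac{2}{\sqrt{14}}\,y + \frac{3}{\sqrt{14}}\,z + \frac{4}{\sqrt{14}} = 0. \tag{2}$$

Here

$$\cos \alpha = 1/\sqrt{14}, \qquad \cos \beta = 2/\sqrt{14}, \qquad \cos \gamma = 3/\sqrt{14}, \qquad p = -4/\sqrt{14}.$$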

We have sketched the graph by plotting the intercepts, on the axes, and then completed a triangle just as we did in the case where the intercepts were in the first octant. The graph of (2) is the plane whose normal has the given direction cosines $1/\sqrt{14}$, $2/\sqrt{14}$, $3/\sqrt{14}$, and which lies at a directed distance $p = -4/\sqrt{14}$ from the origin.

Taking $k = -\sqrt{14}$, we get

$$-\frac{1}{\sqrt{14}}\,x - \frac{2}{\sqrt{14}}\,y - \frac{3}{\sqrt{14}}\,z - \frac{4}{\sqrt{14}} = 0, \tag{3}$$

which gives the opposite direction on the normal;

$$\cos \alpha' = -\frac{1}{\sqrt{14}} = -\cos \alpha, \qquad \cos \beta' = -\frac{2}{\sqrt{14}} = -\cos \beta,$$

$$\cos \gamma' = -\frac{3}{\sqrt{14}} = -\cos \gamma, \qquad p' = \frac{4}{\sqrt{14}} = -p.$$

The same scheme works for any linear equation in $x$, $y$, $z$. This gives a converse of Theorem 2:

Theorem 6. The graph of every linear equation in $x$, $y$, and $z$ is a plane.

Proof. Given

$$Ax + By + Cz + D = 0, \tag{1}$$

with $A$, $B$, and $C$ not all $= 0$. Then $A^2 + B^2 + C^2 > 0$. Let

$$k = \sqrt{A^2 + B^2 + C^2},$$

and

$$a = \frac{A}{k}, \quad b = \frac{B}{k}, \quad c = \frac{C}{k}, \quad p = -\frac{D}{k}.$$

Then Eq. (1) is equivalent to

$$ax + by + cz - p = 0. \tag{2}$$

The graph of Eq. (2) is a plane $E$: the direction cosines of a directed normal to $E$ are $a$, $b$, and $c$; and the directed distance from the origin to $E$ is $p$.
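
The normalization in the proof of Theorem 6 can be sketched in a few lines of Python. This is only an illustration: the function name `directed_normal_forms` and the tuple layout `(a, b, c, p)` are assumptions made here, not notation from the text. It returns both directed normal forms, one for $k = \sqrt{A^2+B^2+C^2}$ and one for $k = -\sqrt{A^2+B^2+C^2}$.

```python
import math

def directed_normal_forms(A, B, C, D):
    """Return the two directed normal forms (a, b, c, p) of Ax + By + Cz + D = 0.

    Each tuple describes the equation a*x + b*y + c*z - p = 0, where
    (a, b, c) are direction cosines of a directed normal and p is the
    directed distance from the origin to the plane.
    """
    k = math.sqrt(A * A + B * B + C * C)
    if k == 0:
        raise ValueError("A, B, C must not all be zero")
    forms = []
    for s in (k, -k):  # the two possible directions on the normal
        forms.append((A / s, B / s, C / s, -D / s))
    return forms

# The worked example of this section: x + 2y + 3z + 4 = 0.
first, second = directed_normal_forms(1, 2, 3, 4)
```

For the example plane, `first` corresponds to Eq. (2) above and `second` to the form with the opposite direction on the normal.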

PROBLEM SET 11.2

1. Given … + y + z - 1 = 0. Write the two directed normal forms for the same plane, and sketch.
2. Do the same, for … + 2z - 3 = 0.
3. Do the same, for … + 2y + 2z = 0.
4. Do the same, for … + 2y + 4z - 4 = 0.
5. Do the same, for … + 2y + 4z + 4 = 0. (This is going to look awkward; the foot of the normal does not lie in the first octant.)


6. The normal to E from the origin contains the point (2, 4, 6). The plane E contains the
point (1, 1, 1). Find the two directed normal forms of the equation of E. How far is E
from the origin?
7. Let $K$ be the set of all points which are equidistant from $A = (1, 0, 0)$, $B = (0, 1, 0)$, and $C = (0, 0, 1)$. That is,

$$K = \{P \mid AP = BP = CP\}.$$

Prove that $K$ is a line through the origin, and find a set of direction cosines for $K$ (more precisely, a set of direction cosines for a direction on $K$).

8. Let $A$, $B$, and $C$ be three points which are all different, but collinear. Let

$$K = \{P \mid AP = BP = CP\}.$$

Show that $K$ is the empty set $\{\ \}$.

9. Let $A$, $B$, and $C$ be any points, not necessarily different. Let $K$ be as in Problem 8. Show that $K$ is (a) a line, (b) a plane, (c) all of space, or (d) the empty set $\{\ \}$. Give examples to show that all of these possibilities can actually arise.

10. A plane $E$ contains the points $(a, 0, 0)$, $(0, b, 0)$, $(0, 0, c)$, with $a, b, c \neq 0$. Find the two directed normal forms of the equation, and sketch, showing the case $a, b, c > 0$.

11. The normal to E from the origin lies in the xy-plane; and E contains the points (1, 1, 1)
and (-1, 2, 1). Discuss as in Problem 10.
12. The normal to E lies in the yz-plane; and E contains the points (2, 2, 1) and (1, 1, 2).
Discuss as in Problem 10.
13. Let E be the plane z = -1, and let K be the set of all points which are equidistant
from E and the point A = (1, 0, 0). What sort of figure is this? Sketch.
14. Let E be the plane z = -2 and let K be the set of all points which are equidistant from
E and the x-axis. What sort of figure is this? Sketch.

11.3

THREE-DIMENSIONAL SPACE, REGARDED AS AN INNER-PRODUCT SPACE

Following the pattern of Section 9.7, we identify the point $P = (x, y, z)$ with the directed segment $\overrightarrow{OP}$ from the origin to $P$; we denote the resulting vector as $\vec{P}$; and for $\vec{P}_1 = (x_1, y_1, z_1)$, $\vec{P}_2 = (x_2, y_2, z_2)$, we define addition, scalar multiplication, and inner product by the formulas

$$\vec{P}_1 + \vec{P}_2 = (x_1 + x_2,\ y_1 + y_2,\ z_1 + z_2),$$
$$\alpha\vec{P}_1 = (\alpha x_1,\ \alpha y_1,\ \alpha z_1),$$
$$\vec{P}_1 \cdot \vec{P}_2 = x_1x_2 + y_1y_2 + z_1z_2.$$

To simplify the notation, however, we drop the arrows, and write $P$ for $\vec{P}$. Thus we have an inner-product space

$$\mathcal{V} = \{P\} = \{(x, y, z)\},$$

with addition, scalar multiplication, and inner product defined by the formulas

$$P_1 + P_2 = (x_1 + x_2,\ y_1 + y_2,\ z_1 + z_2),$$
$$\alpha P_1 = (\alpha x_1,\ \alpha y_1,\ \alpha z_1),$$
$$P_1 \cdot P_2 = x_1x_2 + y_1y_2 + z_1z_2.$$

The resulting system satisfies all the vector and inner-product laws of Section 9.7. In the new notation (without the arrows) these are as follows.

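A.2. $\mathcal{V}$ contains a vector $\mathbf{0}$ such that $\mathbf{0} + P = P + \mathbf{0} = P$ for every $P$.

A.3. For every $P$ in $\mathcal{V}$ there is a vector $-P$ in $\mathcal{V}$ such that $P + (-P) = (-P) + P = \mathbf{0}$.

M.1. $(\alpha\beta)P = \alpha(\beta P)$.

M.2. $(\alpha + \beta)P = \alpha P + \beta P$.

M.3. $\alpha(P_1 + P_2) = \alpha P_1 + \alpha P_2$.

M.4. $0P = \mathbf{0}$, for every $P$.

M.5. $1P = P$, for every $P$.

M.6. $\alpha\mathbf{0} = \mathbf{0}$, for every $\alpha$.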

In M.4 and M.6, $0$ denotes the real number zero, and $\mathbf{0}$ denotes the zero vector. Thus M.4 says that the scalar product of the number $0$ and any vector $P$ is the zero vector; and M.6 says that the scalar product of any number $\alpha$ and the zero vector is the zero vector.
It is understood that all sums and scalar products are also vectors, that is, elements of $\mathcal{V}$. But we had better make this explicit:

CA (Closure under addition). For every $V_1$, $V_2$ in $\mathcal{V}$, $V_1 + V_2$ belongs to $\mathcal{V}$.

CSM (Closure under scalar multiplication). For every $V$ in $\mathcal{V}$ and every real number $\alpha$, $\alpha V$ belongs to $\mathcal{V}$.

As in Section 9.7, any set $\mathcal{V}$, with operations satisfying the above laws, is called a vector space. The space that we are dealing with at the moment, in which the vectors are the triplets $P = (x, y, z)$ of real numbers, is denoted by $\mathbf{R}^3$, and is called Cartesian three-space. Thus

$$\mathbf{R}^3 = \{(x, y, z) \mid x, y, z \text{ in } \mathbf{R}\}.$$


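The inner product that we have defined in $\mathbf{R}^3$ has the following properties:

S.1. $P_1 \cdot P_2 = P_2 \cdot P_1$.

S.2. $(\alpha P_1) \cdot P_2 = \alpha(P_1 \cdot P_2)$.

S.3. $P_1 \cdot (P_2 + P_3) = P_1 \cdot P_2 + P_1 \cdot P_3$.

S.4. $P \cdot P \geq 0$, for every $P$.

S.5. If $P \cdot P = 0$, then $P = \mathbf{0}$.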

As in Section 9.7, we define an inner-product space to be a vector space in which an


inner product is defined, satisfying S.1 through S.5.
The distance from $O = (0, 0, 0)$ to $P = (x, y, z)$ is

$$\sqrt{x^2 + y^2 + z^2} = \sqrt{P \cdot P}.$$

Hereafter this number will be called the norm of $P$, and will be denoted by $\|P\|$. (The double bars are a reminder that we are performing an operation on a vector rather than a number.) For inner-product spaces in general, the formula $\sqrt{x^2 + y^2 + z^2}$ may not apply, but the expression $\sqrt{P \cdot P}$ always has a meaning, and so we use it as our definition of the norm.

Definition. In any inner-product space,

$$\|P\| = \sqrt{P \cdot P}.$$

$\|P\|$ is called the norm of the vector $P$.

Just as in the plane, the inner product has a geometric interpretation in $\mathbf{R}^3$.

Theorem 1. In $\mathbf{R}^3$,

$$P_1 \cdot P_2 = \|P_1\|\,\|P_2\|\cos\theta,$$

where $\theta$ is the measure of the angle between the directed segments $\overrightarrow{OP_1}$ and $\overrightarrow{OP_2}$.

The proof is by definition of the norm, together with the law of cosines:

$$(P_1P_2)^2 = OP_1^2 + OP_2^2 - 2\,OP_1\,OP_2\cos\theta;$$
$$(x_1 - x_2)^2 + (y_1 - y_2)^2 + (z_1 - z_2)^2 = x_1^2 + y_1^2 + z_1^2 + x_2^2 + y_2^2 + z_2^2 - 2\|P_1\|\,\|P_2\|\cos\theta;$$
$$-2(x_1x_2 + y_1y_2 + z_1z_2) = -2\|P_1\|\,\|P_2\|\cos\theta;$$
$$P_1 \cdot P_2 = \|P_1\|\,\|P_2\|\cos\theta.$$

From this we get immediately:

Theorem 2 (The Schwarz inequality). In $\mathbf{R}^3$,

$$(P_1 \cdot P_2)^2 \leq \|P_1\|^2\,\|P_2\|^2.$$

This is true because $\cos^2\theta \leq 1$.
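
As a small numerical illustration of the norm, of Theorem 1, and of the Schwarz inequality, here is a Python sketch. The helper names `dot`, `norm`, and `angle` and the sample vectors are assumptions chosen for this example, not part of the text.

```python
import math

def dot(p, q):
    """Inner product x1*x2 + y1*y2 + z1*z2 (works for any equal-length tuples)."""
    return sum(a * b for a, b in zip(p, q))

def norm(p):
    """Norm of p, defined as the square root of p . p."""
    return math.sqrt(dot(p, p))

def angle(p, q):
    """Angle in radians between two nonzero vectors, solved from Theorem 1."""
    c = dot(p, q) / (norm(p) * norm(q))
    # Clamp to [-1, 1] to guard against floating-point round-off.
    return math.acos(max(-1.0, min(1.0, c)))

P1 = (1.0, 2.0, 2.0)
P2 = (2.0, 0.0, 1.0)
theta = angle(P1, P2)

# Schwarz inequality: (P1 . P2)^2 <= ||P1||^2 ||P2||^2.
schwarz_holds = dot(P1, P2) ** 2 <= norm(P1) ** 2 * norm(P2) ** 2
```

Here $P_1 \cdot P_2 = 4$, $\|P_1\| = 3$, and $\|P_2\| = \sqrt{5}$, so the inequality reads $16 \leq 45$.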
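Following the pattern of Section 9.7, we let

$$\mathbf{i} = (1, 0, 0), \quad \mathbf{j} = (0, 1, 0), \quad \mathbf{k} = (0, 0, 1),$$

so that for each $P = (x, y, z)$ we have

$$P = x\mathbf{i} + y\mathbf{j} + z\mathbf{k}.$$

In general, if $V = \alpha_1V_1 + \alpha_2V_2 + \cdots + \alpha_nV_n$, for some scalars $\alpha_1, \alpha_2, \ldots, \alpha_n$, then $V$ is a linear combination of the vectors $V_1, V_2, \ldots, V_n$. Thus every vector in $\mathbf{R}^3$ is a linear combination of $\mathbf{i}$, $\mathbf{j}$, and $\mathbf{k}$.

Definition. A set $\{V_1, V_2, \ldots, V_n\}$ of vectors spans the vector space $\mathcal{V}$ if every $V$ in $\mathcal{V}$ is a linear combination of the $V_i$'s. (Thus $\{\mathbf{i}, \mathbf{j}, \mathbf{k}\}$ spans $\mathbf{R}^3$.)
A set $\{V_1, V_2, \ldots, V_n\}$ is linearly dependent if there are scalars $\alpha_1, \alpha_2, \ldots, \alpha_n$, not all equal to zero, such that

$$\alpha_1V_1 + \alpha_2V_2 + \cdots + \alpha_nV_n = \mathbf{0}.$$

Thus, in $\mathbf{R}^3$, every set of vectors of the form $\{P, \mathbf{i}, \mathbf{j}, \mathbf{k}\}$ is linearly dependent, because for $P = (x, y, z)$, we have $P = x\mathbf{i} + y\mathbf{j} + z\mathbf{k}$, and so

$$P - x\mathbf{i} - y\mathbf{j} - z\mathbf{k} = \mathbf{0}.$$

Here $\alpha_1 = 1$, $\alpha_2 = -x$, $\alpha_3 = -y$, and $\alpha_4 = -z$; and the numbers $\alpha_i$ are not all $= 0$, because $\alpha_1 = 1$.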
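A set of vectors is linearly independent if it is not linearly dependent. Thus $\{V_1, V_2, \ldots, V_n\}$ is linearly independent if

$$\alpha_1V_1 + \alpha_2V_2 + \cdots + \alpha_nV_n = \mathbf{0} \ \Rightarrow\ \alpha_1 = \alpha_2 = \cdots = \alpha_n = 0.$$

For example, $\{\mathbf{i}, \mathbf{j}, \mathbf{k}\}$ is linearly independent. The reason is that in $\mathbf{R}^3$,

$$\alpha_1\mathbf{i} + \alpha_2\mathbf{j} + \alpha_3\mathbf{k} = (\alpha_1, \alpha_2, \alpha_3)$$

and

$$\mathbf{0} = (0, 0, 0).$$

Therefore

$$\alpha_1\mathbf{i} + \alpha_2\mathbf{j} + \alpha_3\mathbf{k} = \mathbf{0} \ \Rightarrow\ \alpha_1 = \alpha_2 = \alpha_3 = 0.$$

A set of vectors $\{V_1, V_2, \ldots, V_n\}$ forms a basis for a vector space $\mathcal{V}$ if (1) the set spans $\mathcal{V}$ and (2) the set is linearly independent. Thus we have:

Theorem 3. $\{\mathbf{i}, \mathbf{j}, \mathbf{k}\}$ is a basis for $\mathbf{R}^3$.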
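
For three vectors in $\mathbf{R}^3$, linear independence can be checked mechanically. The sketch below uses the determinant criterion (three vectors are linearly independent exactly when the determinant of their coordinates is nonzero), which is a standard fact assumed here rather than something established in this section; the function names are likewise illustrative. Integer coordinates keep the arithmetic exact.

```python
def det3(u, v, w):
    """Determinant of the 3x3 array whose rows are u, v, w."""
    return (u[0] * (v[1] * w[2] - v[2] * w[1])
            - u[1] * (v[0] * w[2] - v[2] * w[0])
            + u[2] * (v[0] * w[1] - v[1] * w[0]))

def is_linearly_independent(u, v, w):
    """True when the only combination a*u + b*v + c*w = 0 has a = b = c = 0."""
    return det3(u, v, w) != 0

i, j, k = (1, 0, 0), (0, 1, 0), (0, 0, 1)

independent_ijk = is_linearly_independent(i, j, k)
# (1, 2, 1) = (1, 1, 0) + (0, 1, 1), so this set is linearly dependent.
dependent_example = is_linearly_independent((1, 1, 0), (0, 1, 1), (1, 2, 1))
```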

Obviously the points of the $xy$-plane form a vector space in themselves; in fact, this is the vector space that was discussed in Section 9.7. In fact, all three of the coordinate planes

$$E_{xy} = \{x\mathbf{i} + y\mathbf{j} + 0\mathbf{k}\}, \quad E_{xz} = \{x\mathbf{i} + 0\mathbf{j} + z\mathbf{k}\}, \quad E_{yz} = \{0\mathbf{i} + y\mathbf{j} + z\mathbf{k}\},$$

form vector spaces. Such sets are called subspaces of $\mathbf{R}^3$. More generally:

Definition. Given a vector space $\mathcal{V}$ and a subset $\mathcal{V}'$. If $\mathcal{V}'$ also forms a vector space (under the same definitions of addition and scalar multiplication) then $\mathcal{V}'$ is called a subspace of $\mathcal{V}$.

Thus a subspace must satisfy all of the vector laws. But this is not as tedious to check as one might think, because of the following theorem.

Theorem 4. Let $\mathcal{V}'$ be a subset of the vector space $\mathcal{V}$. If $\mathcal{V}'$ is closed under addition and scalar multiplication, then $\mathcal{V}'$ is a subspace of $\mathcal{V}$.

Proof. Many of the laws can be checked all at once. Since A.1 and A.4 hold for all vectors in $\mathcal{V}$, they automatically hold in $\mathcal{V}'$. The same is true for M.1 through M.6 and S.1 through S.5. Therefore the only things remaining to verify are A.2 and A.3.

1) By M.4, $0P = \mathbf{0}$, for every $P$ in $\mathcal{V}'$; and by CSM, $0P$ belongs to $\mathcal{V}'$. Therefore $\mathbf{0}$ belongs to $\mathcal{V}'$, and so A.2 holds.

2) Given $P$ in $\mathcal{V}'$. By M.2,

$$[1 + (-1)]P = 1P + (-1)P.$$

By M.4 and M.5, this gives

$$\mathbf{0} = P + (-1)P.$$

Therefore $(-1)P = -P$, and by CSM, $-P$ belongs to $\mathcal{V}'$, so A.3 holds.

On the basis of Theorem 4, it is easy to see that each of the three coordinate planes is a subspace of $\mathbf{R}^3$. The point is that if

$$P_1 = x_1\mathbf{i} + y_1\mathbf{j}, \quad P_2 = x_2\mathbf{i} + y_2\mathbf{j},$$

then

$$P_1 + P_2 = (x_1 + x_2)\mathbf{i} + (y_1 + y_2)\mathbf{j},$$

so that the set $E_{xy}$ of all linear combinations of $\mathbf{i}$ and $\mathbf{j}$ is closed under addition. Similarly, $\alpha P_1 = \alpha x_1\mathbf{i} + \alpha y_1\mathbf{j}$, and so $E_{xy}$ is closed under scalar multiplication. Similarly for $E_{xz}$ and $E_{yz}$. In fact, a more general result holds:

Theorem 5. Every plane through $O$ forms a subspace of $\mathbf{R}^3$.

Proof. Let $E$ be such a plane. Then $E$ is the graph of an equation of the form

$$Ax + By + Cz = 0.$$

If $P_1 = (x_1, y_1, z_1)$ and $P_2 = (x_2, y_2, z_2)$ belong to $E$, then

$$Ax_1 + By_1 + Cz_1 = 0 \quad\text{and}\quad Ax_2 + By_2 + Cz_2 = 0.$$

By addition,

$$A(x_1 + x_2) + B(y_1 + y_2) + C(z_1 + z_2) = 0;$$

and this means that $P_1 + P_2$ is in $E$. Similarly, for every real number $\alpha$,

$$A\alpha x_1 + B\alpha y_1 + C\alpha z_1 = \alpha \cdot 0 = 0,$$

and $\alpha P_1$ is in $E$. Therefore $E$ is closed under scalar multiplication. By Theorem 4, $E$ forms a subspace.

There is, however, a much better way to get this result, using vector-space methods instead of using the results of the preceding section. Given the plane $E$ through $O$, let $P_0 = (A, B, C)$ be any vector such that the line through $O$ and $P_0$ is perpendicular to $E$. Then for each $P \neq \mathbf{0}$ in $E$, $\overrightarrow{OP}$ and $\overrightarrow{OP_0}$ are perpendicular, and so

$$P_0 \cdot P = \|P_0\|\,\|P\|\cos(\pi/2) = 0.$$

For $P = \mathbf{0}$ the same equation holds. Conversely, if $P_0 \cdot P = 0$, then $P$ lies in $E$. Therefore

$$E = \{P \mid P_0 \cdot P = 0\}.$$

In terms of coordinates, this tells us that $E$ is the graph of the equation

$$Ax + By + Cz = 0,$$

which we already knew. But when we describe $E$ by a vector equation, using the inner product, this suggests the following theorem:
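Theorem 6. Let $\mathcal{V}$ be any inner-product space; let $V_0$ be any vector in $\mathcal{V}$, different from $\mathbf{0}$; and let

$$\mathcal{V}' = \{V \mid V_0 \cdot V = 0\}.$$

Then $\mathcal{V}'$ is a subspace of $\mathcal{V}$.

Proof. We need to show that $\mathcal{V}'$ satisfies CA and CSM. If $V_1$ and $V_2$ are in $\mathcal{V}'$, then

$$V_0 \cdot V_1 = V_0 \cdot V_2 = 0.$$

By S.3,

$$V_0 \cdot (V_1 + V_2) = V_0 \cdot V_1 + V_0 \cdot V_2.$$

Therefore $V_0 \cdot (V_1 + V_2) = 0$, and $V_1 + V_2$ is in $\mathcal{V}'$. Similarly,

$$V_0 \cdot \alpha V_1 = \alpha V_1 \cdot V_0 = \alpha(V_1 \cdot V_0) = \alpha(V_0 \cdot V_1) = \alpha \cdot 0 = 0.$$

(Reasons for these steps?)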
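
The closure argument in this proof can be spot-checked numerically. The Python sketch below is only an illustration in $\mathbf{R}^3$: the vector `V0` and the two sample members of the subspace are arbitrary choices made here, and the helper names are assumptions. A check on sample vectors does not replace the proof; it just shows the two closure properties at work.

```python
def dot(u, v):
    """Inner product of two equal-length tuples."""
    return sum(a * b for a, b in zip(u, v))

def add(u, v):
    """Componentwise sum of two vectors."""
    return tuple(a + b for a, b in zip(u, v))

def scale(alpha, v):
    """Scalar product alpha * v."""
    return tuple(alpha * a for a in v)

V0 = (1, 2, 3)

def in_subspace(v):
    """Membership test for the set {V | V0 . V = 0}."""
    return dot(V0, v) == 0

V1 = (3, 0, -1)   # V0 . V1 = 3 + 0 - 3 = 0
V2 = (2, -1, 0)   # V0 . V2 = 2 - 2 + 0 = 0

closed_under_addition = in_subspace(add(V1, V2))
closed_under_scaling = in_subspace(scale(5, V1))
```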

If you rewrite these formulas, in the forms that they take when $V_0$, $V_1$, and $V_2$ are vectors in $\mathbf{R}^3$, with

$$V_0 = (A, B, C),$$

you will find that you are simply copying the proof of Theorem 5. (This is worth going through, to see how it works.) Thus it may seem that nothing is new in Theorem 6 except the notation. But this is not true, because Theorem 6 and its proof work in every vector space, including spaces of four dimensions, spaces of functions, and so on. Thus when we proved Theorem 6, we found that the method used in proving Theorem 5 had nothing to do with any special properties of $\mathbf{R}^3$; it depended only on the inner-product space laws. From now on, easy generalizations of this kind will occur often. We shall treat the vector laws (or the inner-product space laws) as basic assumptions, like postulates in geometry, and any theorems that we derive from them will be known to hold in every vector space (or any inner-product space).

If a plane $E$ does not contain the origin, then it never forms a subspace of $\mathbf{R}^3$, because it does not contain $\mathbf{0}$. But we can still write a vector equation for $E$, because

$$Ax + By + Cz + D = 0 \iff P_0 \cdot P = -D,$$

where $P_0 = (A, B, C)$. Thus we have:

Theorem 7. Every plane in $\mathbf{R}^3$ is the graph of an equation of the form

$$P_0 \cdot P = a.$$

PROBLEM SET 11.3

Each of the following is the equation of a plane. Convert each of them to the form $P_0 \cdot P = a$, giving the value of $a$ that you are using and the coordinates $A$, $B$, $C$ of $P_0$.

1. $z = x + y$
2. $z = x - y$
3. $z = -x - y$
4. $x = 3y - 4z$
5. $y = 4z - 3x$
6. $z = 4x + 3y$
7. $z = 1$
8. $x = 4$
9. $\dfrac{x}{1} + \dfrac{y}{2} + \dfrac{z}{3} = 4$
10. $\dfrac{x}{2} - \dfrac{y}{4} + \dfrac{z}{3} = 2$

11. Let $V_1 = \mathbf{i} + \mathbf{j}$, $V_2 = \mathbf{j} + \mathbf{k}$, $V_3 = \mathbf{k}$. Show that each of the basis vectors $\mathbf{i}$, $\mathbf{j}$, $\mathbf{k}$ is a linear combination of $V_1$, $V_2$, $V_3$.

12. Now show that $\{V_1, V_2, V_3\}$ spans $\mathbf{R}^3$.

13. Now show that $\{V_1, V_2, V_3\}$ is linearly independent. (By definition, this means that $\alpha_1V_1 + \alpha_2V_2 + \alpha_3V_3 = \mathbf{0} \Rightarrow \alpha_1 = \alpha_2 = \alpha_3 = 0$. Problems 12 and 13, in combination, tell us that $\{V_1, V_2, V_3\}$ is a basis for $\mathbf{R}^3$.)

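14. Let $V_1 = \mathbf{i} + \mathbf{j}$, $V_2 = \mathbf{j} + \mathbf{k}$, $V_3 = \mathbf{i} + \mathbf{k}$. Proceed as in Problems 11 through 13.

15. Let $V_1 = \mathbf{i} - \mathbf{j} + \mathbf{k}$, $V_2 = \mathbf{i} + \mathbf{j} - \mathbf{k}$, $V_3 = -\mathbf{i} + \mathbf{j} + \mathbf{k}$. Proceed as in Problems 11 through 13.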


Show that the following hold, in any inner-product space. (Each of them should be derived from the inner-product space laws, with a reason given for each step.)

16. $V_1 \cdot (\alpha V_2) = \alpha(V_1 \cdot V_2)$
17. $(\alpha V_1) \cdot (\beta V_2) = \alpha\beta(V_1 \cdot V_2)$
18. $\alpha(V_1 + V_2 + V_3) = \alpha V_1 + \alpha V_2 + \alpha V_3$
19. $\|\alpha V\| = |\alpha|\,\|V\|$
20. $\|-\alpha V\| = |\alpha|\,\|V\|$
21. $(P + Q) \cdot (R + S) = P \cdot R + Q \cdot R + P \cdot S + Q \cdot S$ (Here $P$, $Q$, $R$, $S$ are vectors, of course.)
22. $(P + Q) \cdot (P - R) = P \cdot P + Q \cdot P - R \cdot P - Q \cdot R$
23. $(P + Q) \cdot (P - Q) = P \cdot P - Q \cdot Q$
24. $(P + Q) \cdot (P + Q) = P \cdot P + 2(P \cdot Q) + Q \cdot Q$
25. $(P - Q) \cdot (P - Q) = P \cdot P - 2(P \cdot Q) + Q \cdot Q$
26. $(P - Q) \cdot (Q - P) = -P \cdot P + 2(P \cdot Q) - Q \cdot Q$
27. Is it true that in $\mathbf{R}^3$, $(P \cdot Q)R = (Q \cdot R)P$? (Each side of this equation has a meaning, because $P \cdot Q$ and $Q \cdot R$ are scalars.)
28. Is the following true in any inner-product space? Theorem (?) If $P \cdot Q = 0$ for every $Q$, then $P = \mathbf{0}$. Why or why not?

29. Let the plane $E$ be the graph of the equation $z = 2x + 3y$. Show that $E$ contains the vectors $V_1 = \mathbf{i} + 2\mathbf{k}$ and $V_2 = \mathbf{j} + 3\mathbf{k}$.
30. Show that $E$ contains every vector $V$ of the form $xV_1 + yV_2$. Then show, conversely, that every vector of this form lies in $E$. [Hint: Express $V$ in the form $(\ )\mathbf{i} + (\ )\mathbf{j} + (\ )\mathbf{k}$.]
31. Show that the $V_1$ and $V_2$ of Problem 29 are linearly independent.
32. Show that $V_1$ and $V_2$ span $E$. (Problems 30 through 32 tell us that $\{V_1, V_2\}$ is a basis for $E$.)
33. Let the plane $E$ be the graph of the equation $x + y + 2z = 0$, and let $V_1 = \mathbf{i} - \mathbf{j}$ and $V_2 = 2\mathbf{j} - \mathbf{k}$. Proceed as in Problems 29 through 32.
34. Let $V_1 = \mathbf{i} + \mathbf{j}$, $V_2 = \mathbf{j} + \mathbf{k}$; and let $E$ be the set of all vectors of the form $V = xV_1 + yV_2$. Write an equation for $E$, in the form $Ax + By + Cz = 0$.
35. Let $V_1 = \mathbf{i} + 2\mathbf{j}$, $V_2 = \mathbf{j} + 2\mathbf{k}$. Proceed as in Problem 34.


36. Two vectors $V$, $V'$ are orthogonal if $V \cdot V' = 0$. A set $\{V_1, V_2, \ldots, V_n\}$ of vectors is orthogonal if every two (different) vectors in the set are orthogonal. Verify that $\{\mathbf{i}, \mathbf{j}, \mathbf{k}\}$ is an orthogonal set.
37. A set $\{V_1, V_2, \ldots, V_n\}$ is orthonormal if (1) the set is orthogonal, and (2) $\|V_i\| = 1$ for each $i$. (Thus the basis $\{\mathbf{i}, \mathbf{j}, \mathbf{k}\}$ for $\mathbf{R}^3$ is orthonormal.) Given that $\{V_1, V_2, V_3\}$ is orthonormal, express the inner product

$$(\alpha_1V_1 + \alpha_2V_2 + \alpha_3V_3) \cdot (\beta_1V_1 + \beta_2V_2 + \beta_3V_3)$$

in the simplest possible form.


*38. Let $\mathbf{R}^4$ be the set of all quadruplets $P = (w, x, y, z)$ of real numbers. The set $\mathbf{R}^4$ forms an inner-product space, under the obvious definitions of sum, scalar product, and inner product. As always, $\|P\| = \sqrt{P \cdot P}$. Show that

$$(P_1 \cdot P_2)^2 \leq \|P_1\|^2\,\|P_2\|^2.$$

*39. Let $\mathcal{P}$ be the set of all polynomials with real coefficients. For

$$V = \sum_{i=0}^{n} a_ix^i, \quad W = \sum_{i=0}^{n} b_ix^i,$$

we define

$$V + W = \sum_{i=0}^{n} (a_i + b_i)x^i, \quad \alpha V = \sum_{i=0}^{n} \alpha a_ix^i, \quad V \cdot W = \sum_{i=0}^{n} a_ib_i.$$

Show that

$$(V \cdot W)^2 \leq \|V\|^2\,\|W\|^2.$$

40. Does the vector space $\mathcal{P}$ of Problem 39 have a finite basis? If so, describe such a basis. If not, explain why no such basis exists.

11.4

THE DIMENSION OF A VECTOR SPACE.


VARIOUS WAYS TO FORM A BASIS

For each positive integer $n$, let $\mathbf{R}^n$ be the set of all $n$-tuples of real numbers. Thus

$$\mathbf{R}^n = \{(x_1, x_2, \ldots, x_n) \mid x_i \in \mathbf{R}\},$$

and $\mathbf{R}^n$ forms an inner-product space, under the obvious definitions of sum, scalar product, and inner product. $\mathbf{R}^n$ is called Cartesian $n$-space. Let

$$B_n = \{E_1, E_2, \ldots, E_n\},$$

where

$$E_1 = (1, 0, 0, \ldots, 0), \quad E_2 = (0, 1, 0, \ldots, 0), \quad \ldots, \quad E_{n-1} = (0, 0, \ldots, 1, 0), \quad E_n = (0, 0, \ldots, 0, 1).$$

(In general, $E_i$ has 1 in the $i$th position, and 0's everywhere else.) The vectors $E_i$ span the space $\mathbf{R}^n$, with

$$(x_1, x_2, \ldots, x_n) = \sum_{i=1}^{n} x_iE_i.$$

They are linearly independent, since

$$\sum_{i=1}^{n} x_iE_i = \mathbf{0} \ \Rightarrow\ (x_1, x_2, \ldots, x_n) = (0, 0, \ldots, 0).$$

Therefore $B_n$ forms a basis.

Every subset of $B_n$ gives a subspace of $\mathbf{R}^n$. For example,

$$\mathcal{V}_1 = \{\alpha E_1 \mid \alpha \in \mathbf{R}\}$$

is a subspace, and forms a line through the origin $(0, 0, \ldots, 0)$;

$$\mathcal{V}_2 = \{\alpha E_1 + \beta E_2 \mid \alpha, \beta \in \mathbf{R}\}$$

is a subspace, and forms a plane, and so on, for any subset of $B_n$. But $\mathbf{R}^n$ has many subspaces which are not obtainable in this way. For example, we have found that in $\mathbf{R}^3$, any line or plane through the origin forms a subspace. To investigate these other subspaces, we need to use bases other than the obvious basis $B_n$. Our investigation of other bases will also be useful in other connections.


The key to the theory of bases is the following theorem.

Theorem 1. Let $\mathcal{V}$ be a vector space. Let $A$ be a set of $m$ vectors, and let $B$ be a set of $n$ vectors, such that (1) $A$ is linearly independent, and (2) $B$ spans $\mathcal{V}$. Then $\mathcal{V}$ is spanned by a set $C$ consisting of (a) all the elements of $A$, and (b) exactly $n - m$ of the elements of $B$.

For example, we might have

$$\mathcal{V} = \mathbf{R}^3, \quad A = \{E_1 + E_2,\ E_2 + E_3\}, \quad B = B_3 = \{E_1, E_2, E_3\}.$$

Here we can take

$$C = \{E_1 + E_2,\ E_2 + E_3,\ E_3\}.$$

This has the desired form, using all elements of $A$ and $3 - 2 = 1$ element of $B$. To see that $C$ spans $\mathbf{R}^3$, we observe that

$$E_3 = E_3, \quad E_2 = (E_2 + E_3) - E_3, \quad E_1 = (E_1 + E_2) - (E_2 + E_3) + E_3.$$

Therefore every basis element $E_1$, $E_2$, $E_3$ is a linear combination of the elements of $C$. Therefore every vector in $\mathbf{R}^3$ is a linear combination of the elements of $C$, and $C$ spans $\mathbf{R}^3$.
We proceed to the general proof. First we list the elements of $A$ in such a way that the vectors which also belong to $B$ come first. Thus

$$A = \{A_1, A_2, \ldots, A_i, A_{i+1}, \ldots, A_m\},$$

where $A_1, A_2, \ldots, A_i$ belong also to $B$, but $A_{i+1}, \ldots, A_m$ do not. Then $B$ can be described in the form

$$B = \{A_1, A_2, \ldots, A_i, B_1, B_2, \ldots, B_{n-i}\}.$$

One of the possibilities, of course, is that $i = 0$, as in the above example. Another possibility is that $i = m$; in this case there is nothing to prove. We shall show that given a set $B$, as above, with $0 \leq i < m$, we can always delete one of the vectors $B_j$, and replace it by $A_{i+1}$, getting a new set which also spans $\mathcal{V}$. In a finite number of such steps, we get a set $C$ of the sort that we want.

Since $B$ spans $\mathcal{V}$, it follows that $A_{i+1}$ is a linear combination of the form

$$A_{i+1} = \alpha_1A_1 + \alpha_2A_2 + \cdots + \alpha_iA_i + \beta_1B_1 + \beta_2B_2 + \cdots + \beta_{n-i}B_{n-i}.$$

Here it cannot be true that all the numbers $\beta_j$ are equal to zero, because if so it would follow that $A$ is linearly dependent. It is a matter of notation, therefore, to suppose that $\beta_1 \neq 0$. We can therefore solve for $B_1$, in the above equation, getting

$$B_1 = \frac{1}{\beta_1}(A_{i+1} - \alpha_1A_1 - \alpha_2A_2 - \cdots - \alpha_iA_i - \beta_2B_2 - \cdots - \beta_{n-i}B_{n-i}).$$

Now let

$$B' = \{A_1, A_2, \ldots, A_i, A_{i+1}, B_2, \ldots, B_{n-i}\}.$$

Every element of $B$ (including $B_1$) is a linear combination of the elements of $B'$; and $B$ spans $\mathcal{V}$. Therefore $B'$ spans $\mathcal{V}$. In $m - i$ steps of this type, we get the desired set $C$.

Let us now check to see how this general scheme of proof applies to the above example. We had

$$A = \{A_1, A_2\} = \{E_1 + E_2,\ E_2 + E_3\}, \quad B = \{B_1, B_2, B_3\} = \{E_1, E_2, E_3\}.$$

Here $m = 2$, $n = 3$, and at the outset, $i = 0$. Also $A_1$ is a linear combination of the elements of $B$, with

$$A_1 = E_1 + E_2.$$

This equation can be solved for $E_1$, giving $E_1$ as a linear combination of $A_1$ and $E_2$. Therefore we can replace $E_1$ by $A_1$ in $B$, getting

$$B' = \{A_1, B_2, B_3\} = \{E_1 + E_2,\ E_2,\ E_3\}.$$

This completes step 1. Next we express $A_2$ as a linear combination

$$A_2 = E_2 + E_3.$$

This equation can be solved for $E_2$, giving $E_2$ as a linear combination of $A_2$ and $E_3$. Therefore we can replace $E_2$ by $A_2$ in $B'$, getting

$$B'' = \{E_1 + E_2,\ E_2 + E_3,\ E_3\};$$

and now we are done.
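
The replacement scheme of this proof can be sketched in Python for vectors in $\mathbf{R}^n$. This is an illustration under stated assumptions, not the text's own procedure in every detail: it assumes that no element of $A$ already belongs to $B$ (the case $i = 0$), that $A$ is linearly independent, and that $B$ spans; the helper `solve_combination` (exact Gaussian elimination with free coefficients set to zero) and the name `exchange` are choices made here.

```python
from fractions import Fraction

def solve_combination(vectors, target):
    """Return coefficients c with sum(c[j] * vectors[j]) == target, or None."""
    rows, cols = len(target), len(vectors)
    # Augmented matrix: column j holds vectors[j]; the last column holds target.
    M = [[Fraction(vectors[j][r]) for j in range(cols)] + [Fraction(target[r])]
         for r in range(rows)]
    pivots, r = [], 0
    for c in range(cols):
        p = next((k for k in range(r, rows) if M[k][c] != 0), None)
        if p is None:
            continue
        M[r], M[p] = M[p], M[r]
        M[r] = [x / M[r][c] for x in M[r]]
        for k in range(rows):
            if k != r and M[k][c] != 0:
                f = M[k][c]
                M[k] = [a - f * b for a, b in zip(M[k], M[r])]
        pivots.append(c)
        r += 1
        if r == rows:
            break
    if any(M[k][cols] != 0 for k in range(r, rows)):
        return None  # target is not in the span
    coeffs = [Fraction(0)] * cols
    for row, c in enumerate(pivots):
        coeffs[c] = M[row][cols]
    return coeffs

def exchange(A, B):
    """Replace len(A) elements of the spanning list B by the elements of A."""
    current = [(tuple(b), False) for b in B]  # (vector, came_from_A)
    for a in A:
        coeffs = solve_combination([v for v, _ in current], a)
        # Some vector that came from B must carry a nonzero coefficient,
        # since otherwise A would be linearly dependent.
        j = next(idx for idx, ((vec, from_a), c)
                 in enumerate(zip(current, coeffs))
                 if not from_a and c != 0)
        current[j] = (tuple(a), True)
    return [v for v, _ in current]

A = [(1, 1, 0), (0, 1, 1)]             # E1 + E2, E2 + E3
B = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]  # E1, E2, E3
C = exchange(A, B)
```

On the example above, the sketch replaces $E_1$ by $E_1 + E_2$ and then $E_2$ by $E_2 + E_3$, matching the two steps just described.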

We shall now pursue some of the consequences of Theorem 1.

Theorem 2. Let $\mathcal{V}$ be a vector space. Let $A$ be a set of $m$ vectors, and let $B$ be a set of $n$ vectors, such that (1) $A$ is linearly independent, and (2) $B$ spans $\mathcal{V}$. Then $m \leq n$.

This follows from Theorem 1, because $C$ has $n$ elements, and contains all of $A$.

Theorem 3. If a vector space $\mathcal{V}$ has a basis with $n$ elements, then every basis for $\mathcal{V}$ has exactly $n$ elements.

Proof. Let $B$ be a basis with $n$ elements, and let $A$ be any other basis, with $m$ elements. Then $A$ is linearly independent, and $B$ spans $\mathcal{V}$. By Theorem 2, $m \leq n$. But we also know that $B$ is linearly independent, and $A$ spans $\mathcal{V}$. By Theorem 2, $n \leq m$. Therefore $n = m$, which was to be proved.

Thus the number of elements in a basis is independent of the choice of basis. This justifies the following definitions.

Definitions. A vector space $\mathcal{V}$ is finite-dimensional if it has a finite basis. If $\mathcal{V}$ is finite-dimensional, then the dimension of $\mathcal{V}$ is the number $n$ which is the number of elements in every basis. The dimension of $\mathcal{V}$ is denoted by $\dim \mathcal{V}$.

If you review the conditions for a vector space, you will see that they are all satisfied in the trivial case where $\mathcal{V}$ contains a zero vector $\mathbf{0}$ and nothing else. In this case we define $\dim \mathcal{V} = 0$. That is,

$$\dim \{\mathbf{0}\} = 0.$$

(Here the empty set is being regarded as a "basis" for $\{\mathbf{0}\}$.) In a way it is a nuisance to allow this case, but to rule it out would lead to worse nuisances in the long run.
Theorem 4. In an $n$-dimensional vector space, every set of more than $n$ vectors is linearly dependent.

Proof. Let $B$ be a basis, with $n$ elements, and let $A$ be any set of vectors, with $m$ elements, with $m > n$. If $A$ were linearly independent, this would contradict Theorem 2. Therefore $A$ is linearly dependent, which was to be proved.

Theorem 5. In an $n$-dimensional vector space, every linearly independent set with $n$ elements is a basis.

Proof. Given a linearly independent set $B = \{V_1, V_2, \ldots, V_n\}$. If $B$ spans $\mathcal{V}$, then $B$ is a basis. If not, there is a vector $V_{n+1}$ which is not a linear combination of elements of $B$. It follows that the larger set

$$B' = \{V_1, V_2, \ldots, V_n, V_{n+1}\}$$

is linearly independent; and this contradicts Theorem 4.


Theorem 6. Let $\mathcal{V}$ be a vector space, and let $B = \{V_1, V_2, \ldots, V_m\}$ be a set which spans $\mathcal{V}$. Then $B$ contains a basis for $\mathcal{V}$.

Proof. If $B$ is linearly independent, there is nothing to prove. If not, some $V_i$ is a linear combination of the others. Suppose that this is $V_1$. Then $\{V_2, \ldots, V_m\}$ also spans $\mathcal{V}$. Repeating this process, removing superfluous vectors one at a time, we get a basis.
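
For vectors in $\mathbf{R}^n$, the idea of Theorem 6 can be sketched in Python. Note the swapped-in technique: instead of deleting superfluous vectors one at a time as in the proof, this variant walks through the spanning set once and keeps a vector only if it is not a linear combination of those already kept, which yields a basis contained in the original set. The names `rank` and `extract_basis` and the sample spanning set are assumptions made for this illustration.

```python
from fractions import Fraction

def rank(vectors):
    """Rank of a list of equal-length vectors, by exact Gaussian elimination."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0
    for c in range(ncols):
        pivot = next((k for k in range(r, len(rows)) if rows[k][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for k in range(r + 1, len(rows)):
            factor = rows[k][c] / rows[r][c]
            rows[k] = [a - factor * b for a, b in zip(rows[k], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r

def extract_basis(spanning_set):
    """Keep each vector that is not a linear combination of those kept so far."""
    basis = []
    for v in spanning_set:
        if rank(basis + [v]) > len(basis):
            basis.append(v)
    return basis

# (2, 1, 0) = (1, 0, 0) + (1, 1, 0) is superfluous and is dropped.
B = [(1, 0, 0), (1, 1, 0), (2, 1, 0), (0, 0, 1)]
basis = extract_basis(B)
```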

Theorem 7. If $\dim \mathcal{V} = n$, then no set of fewer than $n$ elements spans $\mathcal{V}$.

Proof. Any set which spans $\mathcal{V}$ contains a basis, and every basis has $n$ elements.

Theorem 8. Let $\mathcal{V}$ be an $n$-dimensional vector space, and let $\mathcal{V}'$ be a subspace of $\mathcal{V}$. Then $\mathcal{V}'$ is finite-dimensional, and $\dim \mathcal{V}' \leq n$.

Proof. Let $m$ be the largest number for which it is true that $\mathcal{V}'$ contains a linearly independent set of $m$ vectors. (By Theorem 4, there is such a largest number $m$, and $m \leq n$.) Let

$$B = \{V_1, V_2, \ldots, V_m\}$$

be a linearly independent set in $\mathcal{V}'$. We assert that $B$ spans $\mathcal{V}'$. (Proof: If not, there is a vector $V_{m+1}$ in $\mathcal{V}'$ which is not a linear combination of elements of $B$, and it follows, as in the proof of Theorem 5, that the larger set

$$B' = \{V_1, V_2, \ldots, V_m, V_{m+1}\}$$

is linearly independent.) Therefore $B$ is a basis for $\mathcal{V}'$, and $\dim \mathcal{V}' = m \leq n$.

Thus every subspace $\mathcal{V}'$ of $\mathbf{R}^n$ has a basis.

PROBLEM SET 11.4

1. Given $V_1 = E_1 + 2E_2$, $V_2 = 2E_1 + 3E_2$, $V_3 = 3E_1 + 4E_2$, in $\mathbf{R}^2$. Theorem 4 predicts that $\{V_1, V_2, V_3\}$ is linearly dependent. Exhibit the linear dependence, by finding numbers $\alpha_1, \alpha_2, \alpha_3$, not all $= 0$, such that $\sum \alpha_iV_i = \mathbf{0}$.
2. Given $V_1 = E_1 + 2E_2$, $V_2 = 2E_2 + 3E_3$, $V_3 = 3E_2 + 4E_3$, $V_4 = 4E_1 - 5E_2$, in $\mathbf{R}^3$, proceed as in Problem 1.
3. Given $V_1 = E_1 + E_2$, $V_2 = E_2 + E_3$, $V_3 = E_2 + 3E_3$, $V_4 = 4E_1 + E_2 + E_3$, in $\mathbf{R}^3$, proceed as in Problem 1.
4. In $\mathbf{R}^3$, let $\mathcal{W} = \{V \mid V \cdot (E_1 + E_2) = 0\}$. Find a basis for $\mathcal{W}$.
5. Same question for $\mathcal{W} = \{V \mid V \cdot (E_1 + E_2 - 2E_3) = 0\}$.
6. In $\mathbf{R}^4$, let $\mathcal{W} = \{V \mid V \cdot (E_1 + E_2 + E_3) = 0\}$. Find a basis for $\mathcal{W}$.
7. Find a basis for $\mathbf{R}^4$, using $E_1 + E_2$ and $E_3 + E_4$ as basis elements.
8. Find a basis for $\mathbf{R}^3$ from among the vectors of $B = \{E_1 + E_2 + 2E_3,\ E_1 + E_3,\ E_3 + E_2,\ E_1 - E_2,\ E_1 + E_2 + E_3\}$.
9. In $\mathbf{R}^3$, let $\mathcal{W} = \{V \mid V \cdot (E_1 + E_2) = 0,\ V \cdot (E_1 + 2E_3) = 0\}$. Find a basis for $\mathcal{W}$.
10. In $\mathbf{R}^4$, let $\mathcal{W} = \{V \mid V \cdot (E_1 + E_2 + E_3) = 0,\ V \cdot (E_2 + 2E_3 + E_4) = 0\}$. Find a basis for $\mathcal{W}$.
11.5

ORTHONORMAL BASES

In the preceding section, we found for the space $\mathbf{R}^n$ a basis

$$B_n = \{E_1, E_2, \ldots, E_n\},$$

where the $i$th coordinate of $E_i$ is 1 and the other coordinates of $E_i$ are 0. Thus

$$V = (x_1, x_2, \ldots, x_n) = \sum_{i=1}^{n} x_iE_i.$$

If

$$W = (y_1, y_2, \ldots, y_n) = \sum_{i=1}^{n} y_iE_i,$$

then

$$V \cdot W = \sum_{i=1}^{n} x_iy_i.$$

Thus, for linear combinations of the $E_i$'s, we have a simple formula for the inner product:

$$\Big(\sum_{i=1}^{n} x_iE_i\Big) \cdot \Big(\sum_{i=1}^{n} y_iE_i\Big) = \sum_{i=1}^{n} x_iy_i.$$

This formula does not hold for all bases. For example, the set

$$\{V_1, V_2\} = \{E_1,\ E_1 + E_2\}$$

forms a basis for $\mathbf{R}^2$, but it is not true that

$$(?) \quad (\alpha_1V_1 + \alpha_2V_2) \cdot (\beta_1V_1 + \beta_2V_2) = \alpha_1\beta_1 + \alpha_2\beta_2.$$

(Try taking $\alpha_1 = \alpha_2 = \beta_1 = \beta_2 = 1$.) But the above formula for the inner product does hold for a certain kind of basis, now to be defined.


Two vectors $V_1$ and $V_2$ are called orthogonal if $V_1 \cdot V_2 = 0$. (Thus $E_1$ and $E_2$ are orthogonal in $\mathbf{R}^n$.) More generally, a set

$$B = \{V_1, V_2, \ldots, V_n\}$$

is orthogonal if

$$V_i \cdot V_j = 0 \quad \text{for } i \neq j.$$

Thus $B_n = \{E_1, E_2, \ldots, E_n\}$ is an orthogonal set, but $\{E_1,\ E_1 + E_2\}$ is not, because

$$E_1 \cdot (E_1 + E_2) = 1 + 0 = 1 \neq 0.$$

If $\|V_i\| = 1$ for each $i$, then $B = \{V_1, V_2, \ldots, V_n\}$ is normal. If $B$ is both orthogonal and normal, then $B$ is orthonormal. Thus $B_n$ is orthonormal. Since $\|V_i\|^2 = V_i \cdot V_i$, we note that $B$ is orthonormal if and only if

$$V_i \cdot V_j = \begin{cases} 1 & \text{for } i = j, \\ 0 & \text{for } i \neq j. \end{cases}$$

Theorem 1.

Every finite-dimensional inner-product space has an orthonormal basis.

Proof. We shall show, by induction, that every $n$-dimensional inner-product space has an orthonormal basis. For $\dim \mathcal{V} = 1$, this is obvious: Given a basis $\{V_1\}$, we let $W_1 = V_1/\|V_1\|$. Then $\{W_1\}$ is an orthonormal basis. We suppose, then, that every $n$-dimensional inner-product space has an orthonormal basis; and we need to show that every $(n + 1)$-dimensional space has the same property.

Given $\dim \mathcal{V} = n + 1$, let

$$B = \{V_1, V_2, \ldots, V_n, V_{n+1}\}$$

be a basis for $\mathcal{V}$. Let $\mathcal{V}'$ be the subspace spanned by $\{V_1, V_2, \ldots, V_n\}$. Then $\dim \mathcal{V}' = n$, and so $\mathcal{V}'$ has an orthonormal basis

$$W = \{W_1, W_2, \ldots, W_n\}.$$

Then the set $C = \{W_1, W_2, \ldots, W_n, V_{n+1}\}$ is a basis for $\mathcal{V}$. (Check that $C$ is linearly independent, and spans $\mathcal{V}$.) We shall now find a vector $V'_{n+1}$ such that the set

$$D = \{W_1, W_2, \ldots, W_n, V'_{n+1}\}$$

spans $\mathcal{V}$ and is orthogonal. For each $i$, let $a_i = W_i \cdot V_{n+1}$. Let

$$V'_{n+1} = V_{n+1} - \sum_{i=1}^{n} a_iW_i.$$

Then for each $k$ from 1 to $n$ we have

$$W_k \cdot V'_{n+1} = W_k \cdot V_{n+1} - \sum_{i=1}^{n} a_i\,W_k \cdot W_i.$$

In the sum on the right, the only nonzero term is $a_k\,W_k \cdot W_k$, because $W$ is an orthogonal set; and $W_k \cdot W_k = 1$, because $W$ is orthonormal. Therefore

$$W_k \cdot V'_{n+1} = W_k \cdot V_{n+1} - a_k = a_k - a_k = 0.$$

Therefore $D$ is an orthogonal set. The last step is trivial. Let

$$\alpha = \|V'_{n+1}\|,$$

and let

$$W_{n+1} = \frac{1}{\alpha}V'_{n+1}.$$

Then

$$\|W_{n+1}\| = 1$$

and

$$W_k \cdot W_{n+1} = 0 \quad \text{for each } k \text{ from 1 to } n.$$

Therefore the set $\{W_1, W_2, \ldots, W_n, W_{n+1}\}$ forms an orthonormal basis.

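Note that the pattern of this proof supplies us with a method of actually finding an orthonormal basis, starting with a basis which is not necessarily orthonormal. The proof gives a scheme for "orthonormalizing" a given basis, a step at a time. For example, in $\mathbf{R}^3$ let $\mathcal{V}$ be the subspace spanned by

$$B = \{V_1, V_2\} = \{E_1 + E_2,\ E_2 + E_3\}.$$

Then $B$ is a basis for $\mathcal{V}$, but is neither orthogonal nor normal. We can get an orthonormal basis for $\mathcal{V}$ by following the pattern of the proof of Theorem 1.

1) Let

$$W_1 = \frac{V_1}{\|V_1\|} = \frac{1}{\sqrt{2}}(E_1 + E_2).$$

Then $\|W_1\| = 1$.

2) We shall now treat $V_2$ as the $V_{n+1}$ of the above proof. Let

$$a_1 = W_1 \cdot V_2 = \frac{1}{\sqrt{2}}(E_1 + E_2) \cdot (E_2 + E_3) = \frac{1}{\sqrt{2}}.$$

Let

$$V_2' = V_2 - a_1W_1 = (E_2 + E_3) - \tfrac{1}{2}(E_1 + E_2) = -\tfrac{1}{2}E_1 + \tfrac{1}{2}E_2 + E_3.$$

The theory predicts that $W_1 \cdot V_2' = 0$; and this checks, because

$$W_1 \cdot V_2' = \frac{1}{\sqrt{2}}(E_1 + E_2) \cdot \left(-\tfrac{1}{2}E_1 + \tfrac{1}{2}E_2 + E_3\right) = \frac{1}{\sqrt{2}}\left(-\tfrac{1}{2} + \tfrac{1}{2} + 0\right) = 0.$$

3) Now we normalize $V_2'$, by letting $W_2 = V_2'/\|V_2'\|$. Since

$$\|V_2'\|^2 = V_2' \cdot V_2' = \tfrac{1}{4} + \tfrac{1}{4} + 1 = \tfrac{3}{2},$$

we have

$$\frac{1}{\|V_2'\|} = \sqrt{\tfrac{2}{3}},$$

and

$$W_2 = -\sqrt{\tfrac{1}{6}}\,E_1 + \sqrt{\tfrac{1}{6}}\,E_2 + \sqrt{\tfrac{2}{3}}\,E_3,$$

so that

$$\|W_2\|^2 = \tfrac{1}{6} + \tfrac{1}{6} + \tfrac{2}{3} = 1,$$

and $\|W_2\| = 1$, as it should be. Now $\{W_1, W_2\}$ is an orthonormal basis.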
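
The orthonormalizing scheme just carried out by hand can be sketched in Python. This is a minimal sketch assuming vectors in $\mathbf{R}^n$ with the usual inner product and a linearly independent input list; the function name `orthonormalize` is an illustrative choice. Each step subtracts $\sum a_iW_i$ with $a_i = W_i \cdot V_{n+1}$, then divides by the norm, as in the proof of Theorem 1.

```python
import math

def dot(u, v):
    """Usual inner product on R^n."""
    return sum(a * b for a, b in zip(u, v))

def orthonormalize(basis):
    """Return an orthonormal basis for the span of a linearly independent list."""
    result = []
    for v in basis:
        w = list(v)
        for u in result:
            a = dot(u, v)                       # a_i = W_i . V_{n+1}
            w = [wi - a * ui for wi, ui in zip(w, u)]
        length = math.sqrt(dot(w, w))           # nonzero when the input is independent
        result.append([wi / length for wi in w])
    return result

# The example of this section: V1 = E1 + E2, V2 = E2 + E3.
W1, W2 = orthonormalize([(1.0, 1.0, 0.0), (0.0, 1.0, 1.0)])
```

For this input, `W1` and `W2` agree (up to rounding) with the vectors $W_1$ and $W_2$ found above.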

Orthonormal bases are what we need to get a simple formula for the inner product:

Theorem 2. If $\{V_1, V_2, \ldots, V_n\}$ is orthonormal, then

$$\Big(\sum_{i=1}^{n} \alpha_iV_i\Big) \cdot \Big(\sum_{i=1}^{n} \beta_iV_i\Big) = \sum_{i=1}^{n} \alpha_i\beta_i.$$

Proof. We know that

$$V_i \cdot V_j = 1 \quad (i = j),$$

and

$$V_i \cdot V_j = 0 \quad (i \neq j).$$

Therefore

$$\alpha_iV_i \cdot \sum_{j=1}^{n} \beta_jV_j = \alpha_i\beta_i \quad \text{for each } i,$$

and so

$$\Big(\sum_{i=1}^{n} \alpha_iV_i\Big) \cdot \Big(\sum_{j=1}^{n} \beta_jV_j\Big) = \sum_{i=1}^{n} \alpha_i\beta_i.$$

Of course, for inner products of the form $V \cdot V = \|V\|^2$, we have

$$\Big\|\sum_{i=1}^{n} \alpha_iV_i\Big\|^2 = \sum_{i=1}^{n} \alpha_i^2.$$

In $\mathbf{R}^2$ and $\mathbf{R}^3$, it is easy to see by the distance formula that the distance between any two points $P$ and $Q$ is $\|P - Q\|$. For inner-product spaces in general, we use the latter formula as the definition of distance. Often we shall think of inner-product spaces geometrically, and so we may refer to their elements as points rather than vectors, as in the following definition.

Definition. In any inner-product space, the distance between two points $P$ and $Q$ is $\|P - Q\|$. The distance between $P$ and $Q$ may be denoted by $d(P, Q)$, or simply $PQ$.

An orthonormal basis gives us a distance formula.

Theorem 3. Let $B = \{V_1, V_2, \ldots, V_n\}$ be an orthonormal basis for the inner-product space $\mathcal{V}$, and let

$$P = \sum_{i=1}^{n} \alpha_iV_i, \quad Q = \sum_{i=1}^{n} \beta_iV_i.$$

Then

$$d(P, Q) = \sqrt{\sum_{i=1}^{n} (\alpha_i - \beta_i)^2}.$$

Proof. We have

$$d(P, Q) = \|P - Q\|, \quad P - Q = \sum_{i=1}^{n} (\alpha_i - \beta_i)V_i,$$

and

$$\|P - Q\|^2 = (P - Q) \cdot (P - Q).$$

Since the basis is orthonormal,

$$\|P - Q\|^2 = \sum_{i=1}^{n} (\alpha_i - \beta_i)^2,$$

and the theorem follows.
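
Theorems 2 and 3 can be checked numerically on the orthonormal basis $\{W_1, W_2\}$ found earlier in this section. The Python sketch below is only an illustration: the coefficient values and the helper names `dot` and `combine` are assumptions chosen here. It compares the inner product and distance computed from coordinates in $\mathbf{R}^3$ with the values given by the coefficient formulas.

```python
import math

def dot(u, v):
    """Usual inner product on R^n."""
    return sum(a * b for a, b in zip(u, v))

def combine(coeffs, basis):
    """Linear combination sum(coeffs[i] * basis[i])."""
    n = len(basis[0])
    return tuple(sum(c * b[k] for c, b in zip(coeffs, basis)) for k in range(n))

s2, s6 = math.sqrt(2), math.sqrt(6)
W1 = (1 / s2, 1 / s2, 0.0)
W2 = (-1 / s6, 1 / s6, 2 / s6)

alpha = (2.0, 3.0)    # P = 2 W1 + 3 W2
beta = (-1.0, 4.0)    # Q = -W1 + 4 W2
P = combine(alpha, [W1, W2])
Q = combine(beta, [W1, W2])

# Theorem 2: P . Q equals the sum of alpha_i * beta_i.
inner_direct = dot(P, Q)
inner_formula = sum(a * b for a, b in zip(alpha, beta))

# Theorem 3: d(P, Q) equals the square root of the sum of (alpha_i - beta_i)^2.
diff = tuple(p - q for p, q in zip(P, Q))
dist_direct = math.sqrt(dot(diff, diff))
dist_formula = math.sqrt(sum((a - b) ** 2 for a, b in zip(alpha, beta)))
```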


PROBLEM SET 11.5

1. Let $\mathcal{V}$ be the subspace of $\mathbf{R}^3$ spanned by the basis $B = \{V_1, V_2\} = \{E_1 + E_2 + E_3,\ E_1 + E_2\}$. Find an orthonormal basis for $\mathcal{V}$.
2. Same question for $B = \{V_1, V_2\} = \{E_1 + E_2 + E_3,\ E_1 - E_2\}$.
3. Use the orthonormalization scheme described in the proof of Theorem 1 to find an orthonormal basis for the subspace spanned by the basis $B = \{E_1 + E_2 + E_3,\ E_2 + E_3,\ E_3\} = \{V_1, V_2, V_3\}$.
4. Same question for $B = \{E_2 + E_3,\ E_3,\ E_1 + E_2 + E_3\} = \{V_1, V_2, V_3\}$.
5. Same question for $B = \{E_3,\ E_1 + E_2 + E_3,\ E_2 + E_3\} = \{V_1, V_2, V_3\}$.
6. In $\mathbf{R}^3$, find an orthonormal basis for $\mathcal{W} = \{V \mid V \cdot (E_1 + E_2 + E_3) = 0\}$.
7. Same question for $\mathcal{W} = \{V \mid V \cdot (E_1 + 2E_2) = 0\}$.
8. Let $V_1 = 2E_2 + E_3$, $V_2 = 4E_1 + E_4$. Find vectors $V_3$, $V_4$ so that $\{V_1, V_2, V_3, V_4\}$ is an orthogonal basis of $\mathbf{R}^4$.
9. Given $V_1 = E_1 - E_3 + E_4$, $V_2 = E_1 + E_2 + E_3$, proceed as in Problem 8.
10. In $\mathbf{R}^4$, find an orthonormal basis of the subspace $\mathcal{W} = \{V \mid V \cdot (E_1 + E_2) = 0,\ V \cdot (E_2 + E_4) = 0\}$.

11.6

The Schwarz Inequality.

More General Concepts of Norm and Distance

={VI V (1

11. Same question forr

2) =

0,

E3

0,

535

V E4 = O}.

V1 is a fixed nonzero vector in an n-dimensional vector space,


= {VI V V1 = O} is an (n - 1)-dimensional subspace.
Suppose that {V1, V2,
, Vn} is orthogonal, but not necessarily orthonormal.

12. Show that if

then

"//'
13.

Find

a formula for

14. Let $\{V_1, V_2, \ldots, V_n\}$ be an orthogonal set of nonzero vectors. Let $V = \sum \alpha_i V_i$ be any linear combination of them. Show that

$$V \cdot V_i = 0 \text{ for every } i \quad\Rightarrow\quad V = 0.$$

15. Let $\{V_1, V_2, \ldots, V_n\}$ be as in Problem 14. Show that the set is linearly independent. (This means that in an $n$-dimensional vector space, no orthogonal set of nonzero vectors can have more than $n$ elements. Thus, for example, in $\mathbf{R}^3$ there is no set of four concurrent lines, every two of which are perpendicular.)


16. Let $\mathcal{W}$ and $\mathcal{X}$ be subspaces of a vector space $\mathcal{V}$. If every vector in $\mathcal{W}$ is orthogonal to every vector in $\mathcal{X}$, then $\mathcal{W}$ and $\mathcal{X}$ are orthogonal subspaces, and we write $\mathcal{W} \perp \mathcal{X}$. Show that if $\mathcal{W} \perp \mathcal{X}$, then $\dim \mathcal{W} + \dim \mathcal{X} \le \dim \mathcal{V}$. Give an example to show that the equality does not necessarily hold.


17. The following is a converse of Theorem 2.

Theorem (?) Let $B = \{V_1, V_2, \ldots, V_n\}$ be a basis for $\mathcal{V}$. If for all vectors $V = \sum \alpha_i V_i$, $W = \sum \beta_i V_i$, we have

$$V \cdot W = \sum \alpha_i \beta_i,$$

then $B$ is orthonormal.

Is this true? Why or why not?
18. Show that if $\{V_1, V_2, \ldots, V_n\}$ is an orthonormal basis for $\mathcal{V}$, then for every $V$ in $\mathcal{V}$,

$$V = \sum_{i=1}^{n} (V \cdot V_i)V_i.$$

That is, for $V = \sum \alpha_i V_i$, we always have $\alpha_i = V \cdot V_i$.

11.6 THE SCHWARZ INEQUALITY. MORE GENERAL CONCEPTS OF NORM AND DISTANCE

The "triangular inequality" for points in a plane (or in three-dimensional space) asserts that for any points $P$, $Q$, $R$ we have

$$PR \le PQ + QR.$$

The equality holds if the points are collinear, and $Q$ is between $P$ and $R$; and in every other case, the strict inequality holds.

We propose to show that in any inner-product space, the same inequality holds for distances. That is,

$$d(P, R) \le d(P, Q) + d(Q, R), \tag{1}$$

for every $P$, $Q$, and $R$. Since distance was defined by the formula

$$d(P, Q) = \|P - Q\|,$$

the proposed inequality means that

$$\|P - R\| \le \|P - Q\| + \|Q - R\|.$$

This has the form

$$\|A + B\| \le \|A\| + \|B\|, \tag{2}$$

where $A = P - Q$ and $B = Q - R$. Note that this is analogous to the inequality

$$|x + y| \le |x| + |y|,$$

which is known to hold for both real and complex numbers. Obviously any general proof of (2), for all inner-product spaces, must appeal to the definition of the norm:

$$\|A\| = \sqrt{A \cdot A}, \qquad \|A\|^2 = A \cdot A. \tag{3}$$

Therefore the natural first step, in proving (2), is to restate it in terms of the definition given in (3). In these terms,

$$\|A + B\| \le \|A\| + \|B\| \tag{2}$$

$$\iff \|A + B\|^2 \le \|A\|^2 + 2\|A\|\,\|B\| + \|B\|^2$$

$$\iff (A + B) \cdot (A + B) \le A \cdot A + 2\|A\|\,\|B\| + B \cdot B$$

$$\iff A \cdot A + 2A \cdot B + B \cdot B \le A \cdot A + 2\|A\|\,\|B\| + B \cdot B$$

$$\iff A \cdot B \le \|A\|\,\|B\|. \tag{4}$$

Here (4) automatically holds whenever $A \cdot B < 0$. But if (4) always holds, then we must have

$$|A \cdot B| \le \|A\|\,\|B\|. \tag{5}$$

(If (5) were false, for some $A$, $B$, then (4) would also be false, either for $A$, $B$ or for $-A$, $B$.) And Eq. (5) is obviously equivalent to

$$(A \cdot B)^2 \le \|A\|^2\,\|B\|^2. \tag{6}$$


Formula (6) is called the Schwarz inequality. We shall now prove it.

Theorem 1 (The Schwarz inequality). In any inner-product space,

$$(A \cdot B)^2 \le \|A\|^2\,\|B\|^2, \tag{6}$$

for every $A$ and $B$.

Any proof must use, at some stage, the fact that $P \cdot P \ge 0$ for every $P$, because this is the only inequality that is given by the inner-product space laws. But to use this law to prove the theorem, we first need to reduce the theorem to a manageable special case. This is done as follows.

If $A = 0$ or $B = 0$, then the inequality (6) takes the form $0 \le 0$, which is true. We may therefore suppose that $A \ne 0$ and $B \ne 0$. For the same reason, we may suppose that $A \cdot B \ne 0$.

Suppose now that we replace $A$ by $\alpha A$ and $B$ by $\beta B$, with $\alpha \ne 0$ and $\beta \ne 0$. The inequality (6) then takes the form

$$(\alpha A \cdot \beta B)^2 \le \|\alpha A\|^2\,\|\beta B\|^2 \tag{7}$$

$$\iff \alpha^2\beta^2 (A \cdot B)^2 \le \alpha^2\|A\|^2 \cdot \beta^2\|B\|^2$$

$$\iff (A \cdot B)^2 \le \|A\|^2\,\|B\|^2. \tag{6}$$

Since (7) $\iff$ (6), for every $\alpha, \beta \ne 0$, we are free to choose $\alpha$ and $\beta$ as we please. We take

$$\alpha = \frac{1}{\|A\|}, \quad\text{so that}\quad \|\alpha A\| = 1.$$

We then choose

$$\beta = \frac{1}{\alpha A \cdot B}, \quad\text{so that}\quad \alpha A \cdot \beta B = 1.$$

We let $P = \alpha A$, $Q = \beta B$.

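Our theorem now takes the form

$$\|P\| = 1 \text{ and } P \cdot Q = 1 \quad\Rightarrow\quad \|Q\|^2 \ge 1.$$

This is easy to prove: for $\|P\| = 1$, $P \cdot Q = 1$, we have

$$(P - Q) \cdot (P - Q) \ge 0$$

$$\Rightarrow\quad P \cdot P - 2P \cdot Q + Q \cdot Q \ge 0$$

$$\Rightarrow\quad \|P\|^2 - 2P \cdot Q + \|Q\|^2 \ge 0$$

$$\Rightarrow\quad 1 - 2 + \|Q\|^2 \ge 0$$

$$\Rightarrow\quad \|Q\|^2 \ge 1.$$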

The theorem follows. In the light of the discussion which led to the Schwarz inequality, we also have the following:

Theorem 2. In any inner-product space,

$$\|P + Q\| \le \|P\| + \|Q\|. \tag{2'}$$

(We sometimes express this by saying that the norm is subadditive. In general, a real-valued function $f$ is subadditive if $f(x + y) \le f(x) + f(y)$, for every $x$ and $y$.)

Theorem 3. In any inner-product space, distance is triangular. That is,

$$d(P, R) \le d(P, Q) + d(Q, R).$$

Equivalently,

$$\|P - R\| \le \|P - Q\| + \|Q - R\|.$$
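A numerical spot check of the Schwarz inequality and of subadditivity may be helpful here. This is only a sketch under stated assumptions: the space is taken to be $\mathbf{R}^4$ with the usual dot product, the helper names are ours, and a small tolerance `tol` absorbs rounding error. A check of this kind illustrates the theorems; it does not prove them.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

def satisfies_both(A, B, tol=1e-12):
    """Check (A.B)^2 <= |A|^2 |B|^2 and |A+B| <= |A| + |B| up to rounding."""
    schwarz = dot(A, B) ** 2 <= dot(A, A) * dot(B, B) + tol
    total = [x + y for x, y in zip(A, B)]
    subadditive = norm(total) <= norm(A) + norm(B) + tol
    return schwarz and subadditive

random.seed(0)
samples = [([random.uniform(-1, 1) for _ in range(4)],
            [random.uniform(-1, 1) for _ in range(4)]) for _ in range(1000)]
all_hold = all(satisfies_both(A, B) for A, B in samples)

# When B is a scalar multiple of A, the two sides of (6) coincide.
A = [1.0, 2.0, -1.0, 0.5]
B = [2.0 * x for x in A]
gap = dot(A, A) * dot(B, B) - dot(A, B) ** 2
```

Here `all_hold` records whether both inequalities held on every random pair, and `gap` is the difference between the two sides of (6) for a pair of parallel vectors.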


Let us now review what we know about norms and distance. For norms, we have

N.1. $\|A\| \ge 0$ for every $A$.
N.2. $\|A\| = 0 \Rightarrow A = 0$.
N.3. $\|\alpha A\| = |\alpha|\,\|A\|$. (Homogeneity)
N.4. $\|A + B\| \le \|A\| + \|B\|$. (Subadditivity)

All these are easy to check, on the basis of the definition $\|A\| = \sqrt{A \cdot A}$, the vector laws, and the Schwarz inequality. For distance, we have

D.1. $d(P, Q) \ge 0$.
D.2. $d(P, Q) = 0 \Rightarrow P = Q$.
D.3. $d(P, Q) = d(Q, P)$.
D.4. $d(P, R) \le d(P, Q) + d(Q, R)$.

On this basis, we shall define various types of mathematical systems which are
more general than inner-product spaces.

Definition. A normed vector space is a vector space in which a norm is defined, satisfying N.1 through N.4.

Thus a normed vector space is a quadruplet

$$[\mathcal{V}, +, \text{sm}, \|\ \|],$$

where $[\mathcal{V}, +, \text{sm}]$ is a vector space, and the "norm operation" $P \mapsto \|P\|$ satisfies N.1 through N.4.

Example 1. Let $\mathcal{L}_1$ be the set of all continuous functions $f$, on the interval $[-1, 1]$. Addition and scalar multiplication are defined in the obvious way. We define

$$\|f\|_u = \max |f|,$$

where $\max |f|$ is the largest of the numbers $|f(x)|$ $(-1 \le x \le 1)$. It is easy to check that N.1, N.2, and N.3 hold. To verify N.4, we observe that for each $x$,

$$|f(x) + g(x)| \le |f(x)| + |g(x)| \le \|f\|_u + \|g\|_u.$$

Therefore $\max\,\{|f(x) + g(x)|\} \le \|f\|_u + \|g\|_u$. Thus we have a normed vector space. The norm that we have just defined is called the uniform norm. (Hence the notation $\|\ \|_u$.)

Example 2. Let $\mathcal{L}_1$ be as in Example 1; and let $+$ and sm be as before. We can define another norm by the formula

$$\|f\|_1 = \int_{-1}^{1} |f(x)|\,dx.$$

As before, the verifications of N.1, N.2, and N.3 are easy. To verify N.4, we observe that

$$\|f + g\|_1 = \int_{-1}^{1} |f(x) + g(x)|\,dx \le \int_{-1}^{1} \bigl[|f(x)| + |g(x)|\bigr]\,dx$$

$$= \int_{-1}^{1} |f(x)|\,dx + \int_{-1}^{1} |g(x)|\,dx = \|f\|_1 + \|g\|_1.$$
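The two norms of Examples 1 and 2 can be approximated on a grid. The sketch below is an illustration only: the grid size `n`, the midpoint rule used for the integral, and the sample functions `f` and `g` are our own choices, and the computed values are approximations to the true norms.

```python
def uniform_norm(f, n=2000):
    """Approximate max |f(x)| on [-1, 1] by sampling n + 1 equally spaced points."""
    return max(abs(f(-1 + 2 * k / n)) for k in range(n + 1))

def one_norm(f, n=2000):
    """Approximate the integral of |f| over [-1, 1] by the midpoint rule."""
    h = 2 / n
    return h * sum(abs(f(-1 + (k + 0.5) * h)) for k in range(n))

def f(x):
    return x

def g(x):
    return x * x

def f_plus_g(x):
    return f(x) + g(x)

# The same function has different sizes in the two norms:
# for g, the uniform norm is 1 and the integral norm is 2/3.
sizes_of_g = (uniform_norm(g), one_norm(g))

# Subadditivity (N.4) in each norm, for this particular pair.
n4_uniform = uniform_norm(f_plus_g) <= uniform_norm(f) + uniform_norm(g)
n4_one = one_norm(f_plus_g) <= one_norm(f) + one_norm(g)
```

The pair `sizes_of_g` shows concretely that the two norms measure the same function differently, which is why the two examples give different normed vector spaces.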

It should be emphasized that a normed vector space is not merely a vector space in which a norm can be defined, but rather a linear space in which a norm has been defined. Thus, in Examples 1 and 2 we defined two different norms $\|\ \|_u$ and $\|\ \|_1$ in the same vector space $[\mathcal{L}_1, +, \text{sm}]$; and this gave us two different normed vector spaces

$$[\mathcal{L}_1, +, \text{sm}, \|\ \|_u], \qquad [\mathcal{L}_1, +, \text{sm}, \|\ \|_1].$$

In any normed vector space, we can define distance by means of the formula

$$d(A, B) = \|A - B\|.$$

It then follows that the distance $d$ satisfies D.1 through D.4. (This should be checked.) More generally, we state the following.

Definition. A metric space is a set $S$ which is provided with a distance $d$, satisfying D.1 through D.4. The distance $d$ is called the metric.

Thus a metric space is a pair $[S, d]$, where $d$ is a metric for $S$. It is evident that metric spaces can arise in ways that have very little to do with vector spaces or with norms. For example, $S$ may be the surface of a sphere in $\mathbf{R}^3$; that is,

$$S = \{(x, y, z) \mid x^2 + y^2 + z^2 = 1\},$$

and for each pair of points $P$, $Q$ on $S$, the distance $d(P, Q)$ may be the length of the shortest arc on $S$, joining $P$ and $Q$. It is not hard to see that this system forms a metric space, that is, $d$ satisfies D.1 through D.4. In fact, this is the metric space used in navigation on the open sea, with arc length measured in nautical miles.
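The arc-length metric on the unit sphere can be sketched in a few lines. The function below assumes the standard fact that the shortest arc between two unit vectors has length equal to the angle between them, so that $d(P, Q) = \arccos(P \cdot Q)$; the name `arc_distance` and the sample points are ours.

```python
import math

def arc_distance(P, Q):
    """Length of the shortest arc on the unit sphere joining unit vectors P and Q."""
    c = sum(p * q for p, q in zip(P, Q))
    # Clamp to [-1, 1] so that rounding error cannot push acos out of its domain.
    return math.acos(max(-1.0, min(1.0, c)))

north = (0.0, 0.0, 1.0)
e1 = (1.0, 0.0, 0.0)
e2 = (0.0, 1.0, 0.0)

quarter_circle = arc_distance(north, e1)          # a quarter of a great circle
symmetric = arc_distance(e1, e2) == arc_distance(e2, e1)              # D.3
triangular = arc_distance(e1, e2) <= arc_distance(e1, north) + arc_distance(north, e2)  # D.4
```

For these three points each pairwise distance is $\pi/2$, so the triangular inequality D.4 holds with room to spare.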
PROBLEM SET 11.6

1. Show, by any method, that for any pair of pairs of real numbers $(x_1, x_2)$ and $(y_1, y_2)$ we have $(x_1^2 + x_2^2)(y_1^2 + y_2^2) \ge (x_1y_1 + x_2y_2)^2$.

2. Show, by any method, that for every pair of finite sequences $x_1, x_2, \ldots, x_n$, $y_1, y_2, \ldots, y_n$ of real numbers,

$$\Bigl(\sum_{i=1}^{n} x_i^2\Bigr)\Bigl(\sum_{i=1}^{n} y_i^2\Bigr) \ge \Bigl(\sum_{i=1}^{n} x_iy_i\Bigr)^2.$$

3. Let $E$ be a coordinate plane and, for each $P = (x, y)$ and $Q = (a, b)$, let

$$d(P, Q) = (x - a)^2 + (y - b)^2.$$

Thus $d$ is the square of the usual distance. Does $[E, d]$ form a metric space?

4. Same question, for $d(P, Q) = |x - a| + |y - b|$.

5. Same question, for $d(P, Q) =$ maximum of $|x - a|$ and $|y - b|$.

6. If $[S, d]$ is a metric space and $k > 0$, does it follow that $[S, kd]$ is a metric space?

7. The real number system $\mathbf{R}$ clearly forms a vector space. For each $x$ in $\mathbf{R}$, let $\|x\| = \sqrt{|x|}$. Does this give a normed vector space?

8. Let $S$ be the set of all airline passenger terminals in the world; and for each $P$, $Q$ in $S$ let $d(P, Q)$ be the minimum number of hours required to get from $P$ to $Q$ by a combination of regularly scheduled flights. Is $[S, d]$ a metric space?

9. Let $\mathcal{L}_1$ be the set of all continuous functions on the interval $[-1, 1]$, as in Example 1 of the text. For each $f$, $g$ in $\mathcal{L}_1$, let

$$d(f, g) = \int_{-1}^{1} \frac{|f(x) - g(x)|}{1 + |f(x) + g(x)|}\,dx.$$

Does this give a metric space?

*10. For the definition of U lim, see p. 502. Let $\mathcal{L}_1$ be the normed vector space, with the "uniform norm," defined in Example 1 of the text. Show that for any sequence $f_1, f_2, \ldots$ of functions in $\mathcal{L}_1$,

$$\lim_{n\to\infty} \|f_n\|_u = 0 \iff \operatorname{U\,lim}_{n\to\infty} f_n = 0.$$

(This is why the norm defined in Example 1 is called the uniform norm.)

11. Let $\mathcal{L}_1$ be the same as in Problem 10, with the norm

$$\|f\|_1 = \int_{-1}^{1} |f(x)|\,dx,$$

as in Example 2 of the text. Is it true that

$$\lim_{n\to\infty} \|f_n\|_1 = 0 \quad\Rightarrow\quad \operatorname{U\,lim}_{n\to\infty} f_n = 0?$$

Is it true that

$$\operatorname{U\,lim}_{n\to\infty} f_n = 0 \quad\Rightarrow\quad \lim_{n\to\infty} \|f_n\|_1 = 0?$$

12. Let $C^0[-\pi, \pi]$ be the set of all continuous functions $f$ on the interval $[-\pi, \pi]$, with $+$ and sm defined as usual. Set $f \cdot g = \int_{-\pi}^{\pi} f(x)g(x)\,dx$, and verify that $C^0[-\pi, \pi]$, with this inner product, forms an inner-product space.

13. Find a polynomial $f(x) = a_0 + a_1x + x^2$ which is orthogonal to $g(x) = 1$ and $h(x) = x$, in the inner-product space defined in Problem 12.

14. Find an orthonormal basis for the subspace of $C^0[-\pi, \pi]$ spanned by the elements $\{1, x, x^2\}$.

*15. For each $n$, let $T_n$ be the set of all functions of the form

$$f(x) = a_0 + \sum_{i=1}^{n} [a_i \cos ix + b_i \sin ix],$$

on the interval $[-\pi, \pi]$. (Such functions are called trigonometric polynomials.) Evidently $T_n$ forms a subspace of $C^0[-\pi, \pi]$, and the set

$$B = \{1;\ \cos x, \cos 2x, \ldots, \cos nx;\ \sin x, \sin 2x, \ldots, \sin nx\}$$

spans $T_n$. Show that (a) $B$ is orthogonal, and (b) $B$ is a basis for $T_n$. Then find an orthonormal basis for $T_n$. [Warning: This one is long. It is easier if you note that in (a) you need not necessarily compute indefinite integrals; what you need, in each case, is the definite integral, from $-\pi$ to $\pi$. The identities

$$\cos (A + B) - \cos (A - B) = -2 \sin A \sin B,$$

$$\cos (A + B) + \cos (A - B) = 2 \cos A \cos B$$

are also useful.] Problem 15 of Problem Set 11.5 is useful at one stage.

12 Fourier Series

12.1 PROJECTIONS INTO A SUBSPACE. TRIGONOMETRIC POLYNOMIALS AND FOURIER SERIES

The idea of a projection is taken from elementary geometry. Let $E$ be a plane in $\mathbf{R}^3$, and let $P$ be a point. To suit the terms of our later discussion, suppose that $E$ passes through the origin. Then the projection of $P$ into $E$ is the point $Q$ which is the foot of the perpendicular from $P$ to $E$. (If $P$ is in $E$, then the projection of $P$ is $P$.) The following facts are well known:

1) The line $\overleftrightarrow{PQ}$, through $P$ and $Q$, is perpendicular to every line in $E$ that contains $Q$. (In fact, this is the definition of the statement that $\overleftrightarrow{PQ} \perp E$.)

2) If $P$ is not in $E$, then there is one and only one point $Q$ in $E$, satisfying (1).

3) If $R$ is in $E$, then

$$(PQ)^2 + (QR)^2 = (PR)^2.$$

It follows immediately that:

4) The distance from $P$ to $Q$ is the minimum distance from $P$ to $E$.

We shall now regard $\mathbf{R}^3$ as an inner-product space, regard the points $P$, $Q$, ... as vectors, and restate these ideas in vectorial form.

In the figure below, we have completed a rectangle, by inserting the point $P - Q$. (Check that in the figure, $(P - Q) + Q = P$, as it should be.) In elementary geometric terms, $O$ is the foot of the perpendicular from $P - Q$ to $E$.

[Figure: the rectangle with vertices $O$, $Q$, $P$, and $P - Q$.]

Therefore (1) through (4) take the following forms:

1) $(P - Q) \cdot S = 0$, for every $S$ in $E$.

2) There is one and only one $Q$ in $E$, satisfying (1).

3) If $R$ is in $E$, then

$$\|P - Q\|^2 + \|Q - R\|^2 = \|P - R\|^2.$$

Therefore

4) $\|P - Q\|$ is the minimum distance from $P$ to $E$.

We shall now show that these ideas apply under far more general conditions, as follows.

Theorem 1. Let $\mathcal{V}$ be any vector space, let $\mathcal{W}$ be any finite-dimensional subspace, and let $P$ be any vector in $\mathcal{V}$. Then there is one and only one vector $Q$ in $\mathcal{W}$ such that

$$(P - Q) \cdot S = 0$$

for every $S$ in $\mathcal{W}$.

Proof. The easy part is to show that there is only one such $Q$. If

$$(P - Q) \cdot S = 0 \quad\text{and}\quad (P - Q') \cdot S = 0,$$

for every $S$, then

$$[(P - Q) - (P - Q')] \cdot S = 0,$$

for every $S$. Therefore

$$(Q' - Q) \cdot S = 0,$$

for every $S$. In particular, for $S = Q' - Q$, we have $(Q' - Q) \cdot (Q' - Q) = 0$, and so $Q' = Q$.

To show that there is one such $Q$, we need to find one; and as a guide, we look at a simple case. For

$$\mathcal{V} = \mathbf{R}^3 = \{x_1E_1 + x_2E_2 + x_3E_3\},$$

$$\mathcal{W} = \mathbf{R}^2 = \{x_1E_1 + x_2E_2\},$$

$$P = \alpha_1E_1 + \alpha_2E_2 + \alpha_3E_3,$$

we ought to have

$$Q = \alpha_1E_1 + \alpha_2E_2;$$

here $Q$ is the projection of $P$ into the $xy$-plane. This works:

$$P - Q = 0 \cdot E_1 + 0 \cdot E_2 + \alpha_3E_3,$$

$$(P - Q) \cdot E_1 = (P - Q) \cdot E_2 = 0,$$

and so

$$(P - Q) \cdot S = (P - Q) \cdot (x_1E_1 + x_2E_2) = 0$$

for every $S$ in $\mathcal{W}$. In our formula for $Q$, we have

$$Q \cdot E_1 = \alpha_1E_1 \cdot E_1 + \alpha_2E_2 \cdot E_1 = \alpha_1,$$

$$Q \cdot E_2 = \alpha_1E_1 \cdot E_2 + \alpha_2E_2 \cdot E_2 = \alpha_2;$$

and so

$$Q = (Q \cdot E_1)E_1 + (Q \cdot E_2)E_2.$$

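We shall see that this pattern carries over to the general case. Let

$$B = \{W_1, W_2, \ldots, W_n\}$$

be an orthonormal basis for $\mathcal{W}$; for each $i$ from 1 to $n$, let

$$\alpha_i = P \cdot W_i,$$

and let

$$Q = \sum_{i=1}^{n} \alpha_iW_i.$$

Then

$$Q \cdot W_j = \sum_{i=1}^{n} \alpha_iW_i \cdot W_j = \alpha_j\|W_j\|^2 = \alpha_j,$$

and so

$$(P - Q) \cdot W_j = P \cdot W_j - Q \cdot W_j = \alpha_j - \alpha_j = 0,$$

for each $j$. Therefore for every

$$S = \sum_{j=1}^{n} \beta_jW_j \in \mathcal{W},$$

we have

$$(P - Q) \cdot S = \sum_{j=1}^{n} \beta_j(P - Q) \cdot W_j = 0,$$

which was to be proved.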


The point $Q$ is called the projection of $P$ into $\mathcal{W}$, and is denoted by $\operatorname{Pr} P$ (or by $\operatorname{Pr}_{\mathcal{W}} P$, if there is any doubt about the subspace into which we are projecting). To repeat:

Definition. Let $\mathcal{V}$ be any vector space, and let $\mathcal{W}$ be any finite-dimensional subspace of $\mathcal{V}$. For each $P$ in $\mathcal{V}$, $\operatorname{Pr} P$ (or $\operatorname{Pr}_{\mathcal{W}} P$) is the point $Q$ of $\mathcal{W}$ such that $(P - Q) \cdot S = 0$ for each $S$ in $\mathcal{W}$.

Theorem 1 tells us that this definition defines something. And one of the ideas in the proof of Theorem 1 is worth noting for future reference:

Theorem 2. If $\operatorname{Pr} P$ is the projection of $P$ into $\mathcal{W}$, and $\{W_1, W_2, \ldots, W_n\}$ is an orthonormal basis for $\mathcal{W}$, then

$$\operatorname{Pr} P = \sum_{i=1}^{n} (P \cdot W_i)W_i.$$

(In the proof of Theorem 1, we found that this sum satisfied the conditions for $Q$, and that there is only one such $Q$.)

It remains to show that conditions (3) and (4), stated at the beginning of this section, hold on the basis of our general definition.

Theorem 3. If $Q = \operatorname{Pr} P$, and $R$ is in $\mathcal{W}$, then

$$\|P - Q\|^2 + \|Q - R\|^2 = \|P - R\|^2.$$

Proof. Obviously $(P - Q) + (Q - R) = P - R$; and $(P - Q) \cdot (Q - R) = 0$, because $Q - R$ is in $\mathcal{W}$. Therefore Theorem 3 is a consequence of the following.

Theorem 4 (The Pythagorean theorem). In any inner-product space,

$$A \cdot B = 0 \quad\Rightarrow\quad \|A + B\|^2 = \|A\|^2 + \|B\|^2.$$

Proof.

$$\|A + B\|^2 = (A + B) \cdot (A + B) = A \cdot A + 2A \cdot B + B \cdot B = \|A\|^2 + 0 + \|B\|^2.$$

As in the special case of $\mathbf{R}^3$, this immediately gives:

Theorem 5. The projection of a point $P$ into a finite-dimensional subspace $\mathcal{W}$ is the point of $\mathcal{W}$ which is closest to $P$.
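The projection formula of Theorem 2, together with Theorems 3 and 5, can be tried out in a small case. The following sketch assumes that the space is $\mathbf{R}^3$ with the usual dot product; the orthonormal pair `W1`, `W2`, the point `P`, and the comparison point `R` are chosen only for illustration.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def sub(u, v):
    return [a - b for a, b in zip(u, v)]

def project(P, basis):
    """Sum of (P . W) W over an orthonormal basis of the subspace (Theorem 2)."""
    Q = [0.0] * len(P)
    for W in basis:
        c = dot(P, W)
        Q = [q + c * w for q, w in zip(Q, W)]
    return Q

s = 1 / math.sqrt(2)
W1 = [s, s, 0.0]          # orthonormal basis of a plane through the origin
W2 = [0.0, 0.0, 1.0]
P = [3.0, 1.0, 2.0]

Q = project(P, [W1, W2])                      # approximately (2, 2, 2)
residual = sub(P, Q)                          # P - Q, orthogonal to the plane

# Another point R of the plane, for comparison with Theorems 3 and 5.
R = [1 * a + 5 * b for a, b in zip(W1, W2)]
lhs = dot(sub(P, Q), sub(P, Q)) + dot(sub(Q, R), sub(Q, R))
rhs = dot(sub(P, R), sub(P, R))
```

Up to rounding, `residual` is orthogonal to both basis vectors, `lhs` equals `rhs`, and hence $R$ is farther from $P$ than $Q$ is.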

This idea has the following unexpected application. Let

$$\mathcal{V} = C^0[-\pi, \pi],$$

where $C^0[-\pi, \pi]$ is the set of all continuous functions on the closed interval $[-\pi, \pi]$; and consider the inner-product space

$$[\mathcal{V}, +, \text{sm}, \cdot\,],$$

where $+$ and sm are defined as usual for spaces of functions, and the inner product is defined by the formula

$$f \cdot g = \int_{-\pi}^{\pi} f(x)g(x)\,dx.$$

For each positive integer $n$, let $T_n$ be the set of all trigonometric polynomials of order $n$, that is, the set of all functions of the form

$$g(x) = a_0 + \sum_{i=1}^{n} [a_i \cos ix + b_i \sin ix].$$

Obviously $T_n$ forms a subspace of $\mathcal{V}$. Consider the set

$$\{1;\ \cos x, \cos 2x, \ldots, \cos nx;\ \sin x, \sin 2x, \ldots, \sin nx\}.$$

This set spans $T_n$. To verify that the set is orthogonal, we need to show that

$$\int_{-\pi}^{\pi} \cos ix\,dx = \int_{-\pi}^{\pi} \sin ix\,dx = 0 \quad\text{for every } i,$$

$$\int_{-\pi}^{\pi} \cos ix \sin jx\,dx = 0 \quad\text{for every } i, j,$$

and

$$\int_{-\pi}^{\pi} \sin ix \sin jx\,dx = \int_{-\pi}^{\pi} \cos ix \cos jx\,dx = 0 \quad\text{for } i \ne j.$$
,,
11
All of these answers can be calculated by brute force, but there are tricks that help. By more straightforward calculations, we get

$$\|1\|^2 = \int_{-\pi}^{\pi} 1^2\,dx = 2\pi, \qquad \|1\| = \sqrt{2\pi};$$

$$\|\cos ix\|^2 = \|\sin ix\|^2 = \pi \qquad (i > 0).$$

To get an orthonormal basis, we divide each basis element by its norm. This gives an orthonormal basis of the form $B = \{C_0, C_1, \ldots, C_n;\ S_1, S_2, \ldots, S_n\}$, where

$$C_0 = \frac{1}{\sqrt{2\pi}}, \qquad C_i = \frac{\cos ix}{\sqrt{\pi}} \quad (i > 0), \qquad S_i = \frac{\sin ix}{\sqrt{\pi}} \quad (i > 0).$$

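Now let $f$ be a function in $C^0[-\pi, \pi]$. By Theorem 2, the projection of $f$ into the finite-dimensional subspace $\mathcal{W} = T_n$ is the vector

$$\operatorname{Pr}_n f = \sum_{i=0}^{n} (f \cdot C_i)C_i + \sum_{i=1}^{n} (f \cdot S_i)S_i.$$

Here

$$f \cdot C_0 = \int_{-\pi}^{\pi} f(x)\,\frac{1}{\sqrt{2\pi}}\,dx = \frac{1}{\sqrt{2\pi}} \int_{-\pi}^{\pi} f(x)\,dx,$$

$$f \cdot C_i = \frac{1}{\sqrt{\pi}} \int_{-\pi}^{\pi} f(x) \cos ix\,dx \quad (i > 0),$$

$$f \cdot S_i = \frac{1}{\sqrt{\pi}} \int_{-\pi}^{\pi} f(x) \sin ix\,dx.$$

Therefore

$$\operatorname{Pr}_n f = a_0 + \sum_{i=1}^{n} [a_i \cos ix + b_i \sin ix],$$

where

$$a_0 = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x)\,dx,$$

$$a_i = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \cos ix\,dx \quad (i > 0),$$

$$b_i = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \sin ix\,dx \quad (i > 0).$$

[Check that $(f \cdot C_0)C_0 = a_0$, $(f \cdot C_i)C_i = a_i \cos ix$, and $(f \cdot S_i)S_i = b_i \sin ix$ $(i > 0)$.]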
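The coefficient formulas above lend themselves to a numerical experiment. The sketch below is an illustration under stated assumptions: the integrals are approximated by the midpoint rule with `n` subintervals, the helper names are ours, and the test function is itself a trigonometric polynomial, so that the coefficients to be recovered are known in advance.

```python
import math

def integrate(g, a=-math.pi, b=math.pi, n=4000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (k + 0.5) * h) for k in range(n))

def fourier_coefficients(f, n):
    """Approximate a_0, (a_1..a_n), (b_1..b_n) from the formulas in the text."""
    a0 = integrate(f) / (2 * math.pi)
    a = [integrate(lambda x, i=i: f(x) * math.cos(i * x)) / math.pi
         for i in range(1, n + 1)]
    b = [integrate(lambda x, i=i: f(x) * math.sin(i * x)) / math.pi
         for i in range(1, n + 1)]
    return a0, a, b

def f(x):
    # A trigonometric polynomial with a_0 = 1, a_1 = 2, b_2 = -3.
    return 1.0 + 2.0 * math.cos(x) - 3.0 * math.sin(2 * x)

a0, a, b = fourier_coefficients(f, 3)
```

The computed `a0`, `a[0]`, and `b[1]` come out close to 1, 2, and -3, and the remaining coefficients close to 0, as the orthogonality of the basis leads us to expect.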
It now seems reasonable to hope that $\operatorname{Pr}_n f$ is in some sense an approximation of $f$ when $n$ is large. That is,

$$(?) \qquad n \approx \infty \quad\Rightarrow\quad \operatorname{Pr}_n f \approx f \qquad (?).$$

If we judge the approximation by observing $\|f - \operatorname{Pr}_n f\|$, it is clear, at least, that we have done our best: Theorem 5 tells us that $\operatorname{Pr}_n f$ is the element $g$ of $T_n$ which minimizes $\|f - g\|$. If our best was good enough, then we should have

$$(?) \qquad \lim_{n\to\infty} \|f - \operatorname{Pr}_n f\| = 0 \qquad (?). \tag{1}$$

A stronger conjecture is that the approximation is good even in the uniform norm:

$$(?) \qquad \lim_{n\to\infty} \|f - \operatorname{Pr}_n f\|_u = 0 \qquad (?). \tag{2}$$

It may be disturbing, at this stage, to observe that (2) cannot be true as stated: $\operatorname{Pr}_n f(-\pi) = \operatorname{Pr}_n f(\pi)$, because all trigonometric polynomials have period $2\pi$. Therefore (2) cannot be true unless $f$ has the same property. We shall see, however, that this is the only way that (2) can fail to hold:

Theorem A. If $f$ has period $2\pi$, and $f'$ is continuous, then $\lim_{n\to\infty} \|f - \operatorname{Pr}_n f\|_u = 0$.

This implies, of course, that

$$f(x) = \lim_{n\to\infty} \Bigl[a_0 + \sum_{i=1}^{n} (a_i \cos ix + b_i \sin ix)\Bigr] = a_0 + \sum_{i=1}^{\infty} (a_i \cos ix + b_i \sin ix),$$

where the limit is a pointwise limit in the elementary sense.

The last of these expressions is called the Fourier series for $f$, and the numbers $a_0, a_1, \ldots, b_1, b_2, \ldots$ are called the Fourier coefficients of $f$. Obviously every continuous function has a set of Fourier coefficients, and therefore has a Fourier series. The question is under what conditions we can conclude that the Fourier series converges to the function. This question is complicated, and the situation is not yet thoroughly understood by anybody; Theorem A, above, is the best of the simple results.

Meanwhile, the successive projections $\operatorname{Pr}_1 f, \operatorname{Pr}_2 f, \ldots$ have two encouraging properties. First, each projection $\operatorname{Pr}_{n+1} f$ is simply a continuation of the preceding one $\operatorname{Pr}_n f$; to get $\operatorname{Pr}_{n+1} f$ from $\operatorname{Pr}_n f$ we merely add a term of the form

$$a_{n+1} \cos (n + 1)x + b_{n+1} \sin (n + 1)x,$$

leaving the preceding terms unchanged. Second, the error in the approximation $\operatorname{Pr}_n f \approx f$, as measured by $\|f - \operatorname{Pr}_n f\|$, is nonincreasing.

Theorem 6. If $f$ is continuous on $[-\pi, \pi]$, then

$$\|f - \operatorname{Pr}_{n+1} f\| \le \|f - \operatorname{Pr}_n f\|.$$

(Proof. Since $T_{n+1}$ contains $T_n$, the minimum distance from $f$ to $T_n$ cannot be less than the minimum distance from $f$ to $T_{n+1}$.)
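Theorem 6 can be observed numerically. In the sketch below, the integrals are approximated by the midpoint rule, the helper names are ours, and the function $f(x) = x$ is chosen only as a convenient example; the printed errors are therefore approximations to $\|f - \operatorname{Pr}_n f\|$, with $\operatorname{Pr}_0 f$ understood as the constant $a_0$.

```python
import math

def integrate(g, a=-math.pi, b=math.pi, n=4000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (k + 0.5) * h) for k in range(n))

def projection(f, n):
    """Return the function Pr_n f, built from approximate Fourier coefficients."""
    a0 = integrate(f) / (2 * math.pi)
    a = [integrate(lambda x, i=i: f(x) * math.cos(i * x)) / math.pi
         for i in range(1, n + 1)]
    b = [integrate(lambda x, i=i: f(x) * math.sin(i * x)) / math.pi
         for i in range(1, n + 1)]

    def pr(x):
        return a0 + sum(a[i - 1] * math.cos(i * x) + b[i - 1] * math.sin(i * x)
                        for i in range(1, n + 1))
    return pr

def error_norm(f, n):
    """Approximate ||f - Pr_n f|| in the norm given by the inner product."""
    pr = projection(f, n)
    return math.sqrt(integrate(lambda x: (f(x) - pr(x)) ** 2))

def f(x):
    return x

errors = [error_norm(f, n) for n in range(0, 6)]
nonincreasing = all(errors[k + 1] <= errors[k] + 1e-9 for k in range(len(errors) - 1))
```

For this $f$ the list `errors` decreases at every step, in agreement with the theorem.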
Note that the Fourier series for a function $f$ depends only on the values of $f$ on the interval $[-\pi, \pi]$. Therefore, when we set up the series for $f(x) = x$, what we are really dealing with is a discontinuous function, with period $2\pi$, whose graph looks like this:

[Figure: graph of the periodic function obtained from $f(x) = x$.]

Similarly, when we set up the series for $f(x) = x^2$, the series turns out to represent a periodic function whose graph is obtained by fitting together infinitely many parabolic arcs, like this:

[Figure: graph of the periodic function obtained from $f(x) = x^2$, with the $x$-axis marked at $-5\pi, -3\pi, -\pi, \pi, 3\pi, 5\pi$.]

Of course the periodic function that we get from $f(x) = x$ is not continuous. But it turns out that this doesn't matter: if the graph of $f$ is obtained by fitting together a finite number of continuous functions with continuous derivatives, then the Fourier series always converges to a function $F$; and $F(x) = f(x)$ at every point where $f$ is continuous.

At points where the "continuous pieces" of the function fail to fit together, as at $x = 0$ in the figure above, the series makes a compromise, and converges to the average of the lefthand value and the righthand value. Similarly, if $f(-\pi) \ne f(\pi)$, then

$$F(-\pi) = F(\pi) = \tfrac{1}{2}[f(-\pi) + f(\pi)].$$

Thus, for the function $f$ in the preceding figure, the graph of the function $F$ given by the series looks like the figure below. Here

$$F(-\pi) = F(0) = F(\pi) = 0.$$

PROBLEM SET 12.1

Throughout this problem set, it should be understood that $a_0, a_1, \ldots, b_1, b_2, \ldots$ are the Fourier coefficients of the function $f$, and that $F$ is the function to which the series converges. In each case, the graph of $f$ should be sketched; and $F$ should be sketched also, in those cases in which $F$ is different from $f$.

Compute the Fourier coefficients for each of the following functions.

1. f(x) = x
4. /(x) = x3
7. /(x) = x

on [O, rr],

8. /(x) = -x

2. f(x) = x2

3. /(x) = x + x2

5. f(x) = x - x3/rr2

6. /(x) =!xi

/(x) = 0

on [O, rr]

on [O, rr],

/(x) = 1

10. /(x) = 1

on [O, rr],

f(x) = -1

on [-rr, O]

/(x) =x

on [-TT, 0]

on [O, rr],

on [-rr, O]

9. /(x) = x
11. f(x) = 2x

on [ -rr, O]

/(x) = 0

on [ -rr, O],

12. For the odd functions $x$ and $x^3$, you found that the series used only sines, with $a_i = 0$ for each $i$. Show that this happens for every odd function ($f$ is odd if $f(-x) = -f(x)$ for every $x$).

13. Similarly, show that if $f$ is even (with $f(-x) = f(x)$), then the series uses cosines only.

14. Show that for each $f$ in $C^0[-\pi, \pi]$, $\|f\|^2 \le 2\pi\|f\|_u^2$.

15. In Theorem A, $f$ and $f'$ are continuous, and since

$$\lim_{n\to\infty} \|f - \operatorname{Pr}_n f\|_u = 0, \tag{1}$$

we have

$$\operatorname{U\,lim}_{n\to\infty} \operatorname{Pr}_n f = f \quad\text{on } [-\pi, \pi]. \tag{2}$$

(See Problem 10 of Problem Set 11.6.) Can (2) hold if $f$ is not continuous?

16. Let $f$ be as in Theorem A. Assuming that Theorem A is true, show that

$$\lim_{n\to\infty} \|f - \operatorname{Pr}_n f\| = 0.$$

17. Working merely on the basis of the formulas for the Fourier coefficients of a continuous function $f$, give a geometric plausibility argument for the statement

$$\lim_{n\to\infty} a_n = \lim_{n\to\infty} b_n = 0.$$

(What does the graph of $y = \cos nx$ look like, when $n$ is very large? How about the graph of $y = f(x) \cos nx$?)

18. In Theorem 6, under what simple conditions does the equality hold?

12.2 UNIFORM APPROXIMATIONS BY TRIGONOMETRIC POLYNOMIALS

The purpose of this section is to show that for every continuous function $f$, with period $2\pi$, and every $\epsilon > 0$, there is a trigonometric polynomial

$$\phi(x) = a_0 + \sum_{i=1}^{n} (a_i \cos ix + b_i \sin ix)$$

such that

$$\|f - \phi\|_u < \epsilon.$$

That is,

$$|f(x) - \phi(x)| < \epsilon \quad\text{for every } x.$$

Here we are not claiming that the coefficients in $\phi$ are Fourier coefficients. In fact, if we have a $\phi$ which makes $\|f - \phi\|_u < \epsilon$, and we want to improve the approximation, using an $\epsilon' < \epsilon$, we cannot always do this merely by adding new terms to the old $\phi$; we may have to start afresh, with new coefficients even in the first few terms.

[Figure: the graphs of $f - \epsilon$, $f$, and $f + \epsilon$.]

The first clue to this situation is that the trigonometric polynomials form a bigger system than one might think:

Theorem 1. The set of all trigonometric polynomials is closed under multiplication.

From this it follows immediately that $\sin^2 x, \cos^2 x, \ldots, \sin^n x, \cos^n x$ are trigonometric polynomials. For example,

$$\cos^2 x = \frac{1 + \cos 2x}{2} = \frac{1}{2} + \frac{1}{2} \cos 2x.$$

The general proof depends on the trigonometric identities

$$\cos A \cos B = \tfrac{1}{2}[\cos (A + B) + \cos (A - B)],$$

$$\sin A \sin B = -\tfrac{1}{2}[\cos (A + B) - \cos (A - B)],$$

$$\sin A \cos B = \tfrac{1}{2}[\sin (A + B) + \sin (A - B)].$$

These are easy consequences of the addition formulas for the sine and cosine. Consider now two trigonometric polynomials

$$a_0 + \sum_{i=1}^{n} (a_i \cos ix + b_i \sin ix), \qquad A_0 + \sum_{j=1}^{m} (A_j \cos jx + B_j \sin jx).$$

Every term of the product has one of the forms

$$a_iA_0 \cos ix, \qquad b_iA_0 \sin ix,$$

$$a_iA_j \cos ix \cos jx, \qquad a_iB_j \cos ix \sin jx,$$

$$b_iA_j \sin ix \cos jx, \qquad b_iB_j \sin ix \sin jx.$$

Each such term is a trigonometric polynomial. Therefore so also is their sum.
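The proof of Theorem 1 is effectively an algorithm, and it can be sketched as one. In the code below, a trigonometric polynomial is represented (this representation is our own assumption, made for illustration) as a pair of dictionaries mapping a frequency $k$ to the coefficient of $\cos kx$ and of $\sin kx$ respectively, with the constant term stored as the cosine coefficient at frequency 0.

```python
import math
from collections import defaultdict

def add_sin(sin_out, k, c):
    """Add c*sin(kx); sin(kx) = -sin(|k|x) for k < 0, and sin(0x) = 0."""
    if k > 0:
        sin_out[k] += c
    elif k < 0:
        sin_out[-k] -= c

def multiply(p, q):
    """Product of two trigonometric polynomials, via the product identities."""
    (pc, ps), (qc, qs) = p, q
    cos_out, sin_out = defaultdict(float), defaultdict(float)
    for i, a in pc.items():
        for j, A in qc.items():      # cos ix cos jx
            cos_out[i + j] += a * A / 2
            cos_out[abs(i - j)] += a * A / 2
        for j, B in qs.items():      # cos ix sin jx
            add_sin(sin_out, j + i, a * B / 2)
            add_sin(sin_out, j - i, a * B / 2)
    for i, b in ps.items():
        for j, A in qc.items():      # sin ix cos jx
            add_sin(sin_out, i + j, b * A / 2)
            add_sin(sin_out, i - j, b * A / 2)
        for j, B in qs.items():      # sin ix sin jx
            cos_out[i + j] -= b * B / 2
            cos_out[abs(i - j)] += b * B / 2
    return dict(cos_out), dict(sin_out)

def evaluate(p, x):
    pc, ps = p
    return (sum(c * math.cos(k * x) for k, c in pc.items())
            + sum(s * math.sin(k * x) for k, s in ps.items()))

cos_x = ({1: 1.0}, {})
cos_squared = multiply(cos_x, cos_x)   # expected: 1/2 + (1/2) cos 2x

p = ({0: 1.0, 1: 2.0}, {2: -1.0})      # 1 + 2 cos x - sin 2x
q = ({1: 0.5}, {1: 3.0})               # (1/2) cos x + 3 sin x
pq = multiply(p, q)
```

Since every product of two basis functions is rewritten as a sum of at most two basis functions, the result is again a pair of coefficient dictionaries, which is the content of the theorem.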


[Figure: graph of $y = \cos^{2n} t$ on the interval $[-\pi, \pi]$.]

Consider now the function $g(t) = \cos^{2n} t$, which we now know to be a trigonometric polynomial. If $n$ is large, then $\cos^{2n} t \approx 1$ only when $t \approx 0$; everywhere else, $\cos^{2n} t \approx 0$. Thus the graph looks something like the figure above, on the interval $[-\pi, \pi]$. Let $\delta$ be any number between 0 and $\pi$, and let

$$I_n = \int_{-\pi}^{\pi} \cos^{2n} t\,dt,$$

$$J_n = \int_{-\pi}^{-\delta} \cos^{2n} t\,dt = \int_{\delta}^{\pi} \cos^{2n} t\,dt,$$

$$K_n = \int_{-\delta}^{\delta} \cos^{2n} t\,dt.$$

We shall show that when $n$ is large, $I_n \approx K_n$. That is:

Theorem 2. For each $\delta > 0$, $\lim_{n\to\infty} J_n/I_n = 0$.

This does not follow from the fact that $J_n \to 0$, because $I_n \to 0$ also. To prove the theorem, we need to get good estimates of $J_n$ and $I_n$. The first of these is easy:

$$J_n = \int_{\delta}^{\pi} \cos^{2n} t\,dt < (\pi - \delta) \cos^{2n} \delta.$$

To estimate $I_n$, we use the reduction formula

$$\int \cos^n x\,dx = \frac{1}{n} \cos^{n-1} x \sin x + \frac{n - 1}{n} \int \cos^{n-2} x\,dx.$$

(See Problem 31 of Problem Set 6.5.) The formula can be derived by integration by parts. This gives a recursion formula for the definite integral:

$$\int_{-\pi}^{\pi} \cos^n x\,dx = \frac{n - 1}{n} \int_{-\pi}^{\pi} \cos^{n-2} x\,dx.$$

Using $2n$ for $n$, we get

$$I_n = \int_{-\pi}^{\pi} \cos^{2n} x\,dx = \frac{2n - 1}{2n} \int_{-\pi}^{\pi} \cos^{2(n-1)} x\,dx = \frac{2n - 1}{2n}\,I_{n-1}.$$

Therefore

$$I_n = \frac{2n - 1}{2n} \cdot \frac{2n - 3}{2n - 2} \cdot \frac{2n - 5}{2n - 4} \cdots \frac{1}{2} \int_{-\pi}^{\pi} \cos^0 x\,dx$$

$$= \frac{1 \cdot 3 \cdot 5 \cdots (2n - 3)(2n - 1)}{2 \cdot 4 \cdot 6 \cdots (2n - 2)\,2n} \cdot 2\pi$$

$$= 1 \cdot \frac{3}{2} \cdot \frac{5}{4} \cdot \frac{7}{6} \cdots \frac{2n - 1}{2n - 2} \cdot \frac{1}{2n} \cdot 2\pi > \frac{\pi}{n} \qquad (n > 1).$$

Therefore

$$\frac{J_n}{I_n} < \frac{(\pi - \delta) \cos^{2n} \delta}{\pi/n} = \frac{\pi - \delta}{\pi}\,n \cos^{2n} \delta.$$

Since $0 < \cos \delta < 1$, it follows that $J_n/I_n \to 0$. [In fact, $\sum_{i=1}^{\infty} (J_i/I_i)$ converges; the easiest way to see this is to recall that $\sum k\,i\,(x^2)^i$ converges for $0 < x < 1$.]
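The closed form for $I_n$ and the lower bound $\pi/n$ can be checked by direct computation. The sketch below covers only the estimate of $I_n$; the function names are ours, and the numerical integral uses the midpoint rule with `steps` subintervals, so it is an approximation.

```python
import math

def I_closed(n):
    """2*pi times the product of (2k - 1)/(2k) for k = 1, ..., n."""
    value = 2 * math.pi
    for k in range(1, n + 1):
        value *= (2 * k - 1) / (2 * k)
    return value

def I_numeric(n, steps=4000):
    """Midpoint-rule approximation of the integral of cos^(2n) over [-pi, pi]."""
    h = 2 * math.pi / steps
    return h * sum(math.cos(-math.pi + (k + 0.5) * h) ** (2 * n)
                   for k in range(steps))

agreement = abs(I_closed(5) - I_numeric(5))
bound_holds = all(I_closed(n) > math.pi / n for n in range(2, 60))
```

For $n = 1$ the closed form gives exactly $\pi$, so the bound holds there with equality; for larger $n$ the inequality is strict.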
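Now let $f$ be any continuous function with period $2\pi$, and for each $n$ let

$$\phi_n(x) = \frac{\int_{-\pi}^{\pi} f(x + t) \cos^{2n} t\,dt}{\int_{-\pi}^{\pi} \cos^{2n} t\,dt}.$$

It is plausible to suppose that

$$n \approx \infty \quad\Rightarrow\quad \phi_n(x) \approx f(x).$$

Roughly speaking, the reason is that for each $\delta > 0$,

$$n \approx \infty \quad\Rightarrow\quad \phi_n(x) \approx \frac{1}{I_n} \int_{-\delta}^{\delta} f(x + t) \cos^{2n} t\,dt;$$

if $\delta \approx 0$, then $f(x + t) \approx f(x)$ for $-\delta \le t \le \delta$, and so

$$n \approx \infty \quad\Rightarrow\quad \phi_n(x) \approx \frac{1}{I_n} \int_{-\delta}^{\delta} f(x) \cos^{2n} t\,dt \approx \frac{1}{I_n}\,f(x)\,I_n = f(x).$$

As we shall see, these ideas can be built up into a proof that

$$\lim_{n\to\infty} \|f - \phi_n\|_u = 0.$$

First, however, we need the following: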

First, however, we need the following:

Theorem 3. For each $n$, $\phi_n$ is a trigonometric polynomial.

Proof. First we observe that

$$\phi_n(x) = \frac{1}{I_n} \int_{-\pi}^{\pi} f(x + t) \cos^{2n} t\,dt = \frac{1}{I_n} \int_{-\pi}^{\pi} f(t) \cos^{2n} (t - x)\,dt.$$

(The integrand has period $2\pi$, and so $\int_{-\pi}^{\pi}$ is unchanged if we slide the graph back and forth horizontally.) We know that $\cos^{2n} t$ is a trigonometric polynomial; say,

$$\cos^{2n} t = a_0 + \sum_{i=1}^{2n} (a_i \cos it + b_i \sin it).$$

Therefore

$$\cos^{2n} (t - x) = a_0 + \sum_{i=1}^{2n} \bigl[a_i (\cos it \cos ix + \sin it \sin ix) + b_i (\sin it \cos ix - \cos it \sin ix)\bigr].$$

Therefore

$$\int_{-\pi}^{\pi} f(t) \cos^{2n} (t - x)\,dt = \int_{-\pi}^{\pi} a_0 f(t)\,dt + \sum_{i=1}^{2n} \Bigl[\int_{-\pi}^{\pi} (a_i \cos it + b_i \sin it) f(t)\,dt\Bigr] \cos ix$$

$$+ \sum_{i=1}^{2n} \Bigl[\int_{-\pi}^{\pi} (a_i \sin it - b_i \cos it) f(t)\,dt\Bigr] \sin ix.$$

The coefficients here are complicated, but they are constants, and so the indicated integral is a trigonometric polynomial.

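Theorem 4. If $f$ is continuous, and has period $2\pi$, and

$$\phi_n(x) = \frac{\int_{-\pi}^{\pi} f(x + t) \cos^{2n} t\,dt}{\int_{-\pi}^{\pi} \cos^{2n} t\,dt},$$

then

$$\lim_{n\to\infty} \|f - \phi_n\|_u = 0.$$

Proof. Given $\epsilon > 0$, we need an $N$ such that

$$n \ge N \quad\Rightarrow\quad \|f - \phi_n\|_u < \epsilon.$$

By definition of the uniform norm, this means that

$$n \ge N \quad\Rightarrow\quad |f(x) - \phi_n(x)| < \epsilon \quad\text{for } -\pi \le x \le \pi.$$

Step 1. Since $f$ is continuous on $[-\pi, \pi]$, $f$ is bounded on $[-\pi, \pi]$. Also $f$ is periodic. Therefore there is an $M$ such that

$$|f(t)| < M \quad\text{for every } t. \tag{1}$$

Step 2. Since $f$ is continuous on $[-2\pi, 2\pi]$, it follows that $f$ is uniformly continuous on $[-2\pi, 2\pi]$. And $f$ is periodic. Therefore there is a $\delta > 0$ such that

$$|x - x'| < \delta \quad\Rightarrow\quad |f(x) - f(x')| < \epsilon/2.$$

This means that

$$|t| < \delta \quad\Rightarrow\quad |f(x) - f(x + t)| < \epsilon/2. \tag{2}$$

We are now ready to calculate. For $M$ as in Step 1, and $\delta$ as in Step 2, we have

$$|f(x) - \phi_n(x)| = \Bigl|f(x) - \frac{\int_{-\pi}^{\pi} f(x + t) \cos^{2n} t\,dt}{\int_{-\pi}^{\pi} \cos^{2n} t\,dt}\Bigr|$$

$$= \frac{1}{I_n} \Bigl|f(x) \int_{-\pi}^{\pi} \cos^{2n} t\,dt - \int_{-\pi}^{\pi} f(x + t) \cos^{2n} t\,dt\Bigr|$$

$$= \frac{1}{I_n} \Bigl|\int_{-\pi}^{\pi} [f(x) - f(x + t)] \cos^{2n} t\,dt\Bigr|$$

$$\le \frac{1}{I_n} \int_{-\pi}^{\pi} |f(x) - f(x + t)| \cos^{2n} t\,dt$$

$$= \frac{1}{I_n} \Bigl[\int_{-\pi}^{-\delta} |f(x) - f(x + t)| \cos^{2n} t\,dt + \int_{-\delta}^{\delta} |f(x) - f(x + t)| \cos^{2n} t\,dt + \int_{\delta}^{\pi} |f(x) - f(x + t)| \cos^{2n} t\,dt\Bigr]$$

$$\le \frac{1}{I_n} \Bigl[2M \int_{-\pi}^{-\delta} \cos^{2n} t\,dt + \frac{\epsilon}{2} \int_{-\delta}^{\delta} \cos^{2n} t\,dt + 2M \int_{\delta}^{\pi} \cos^{2n} t\,dt\Bigr]$$

$$= \frac{1}{I_n} \Bigl[4MJ_n + \frac{\epsilon}{2}\,K_n\Bigr] \le 4M\,\frac{J_n}{I_n} + \frac{\epsilon}{2} \cdot \frac{I_n}{I_n} = 4M\,\frac{J_n}{I_n} + \frac{\epsilon}{2}.$$

Therefore there is an $N$ such that

$$n \ge N \quad\Rightarrow\quad 4M\,\frac{J_n}{I_n} < \frac{\epsilon}{2}.$$

We now have

$$n \ge N \quad\Rightarrow\quad \|f - \phi_n\|_u < \epsilon,$$

which was to be proved.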

This theorem has an immediate consequence for Fourier series.

Theorem 5. If $f$ is continuous, and has period $2\pi$, and

$$\operatorname{Pr}_n f = a_0 + \sum_{i=1}^{n} (a_i \cos ix + b_i \sin ix)$$

is the $n$th partial sum of the Fourier series of $f$, then

$$\lim_{n\to\infty} \|f - \operatorname{Pr}_n f\| = 0.$$

Proof. Let $\epsilon > 0$ be given, and let

$$\phi(x) = A_0 + \sum_{i=1}^{N} (A_i \cos ix + B_i \sin ix)$$

be a trigonometric polynomial of order $N$. Then

$$\|f - \operatorname{Pr}_N f\| \le \|f - \phi\|,$$

because $\operatorname{Pr}_N f$ is the element of $T_N$ which is closest to $f$. And

$$\|f - \phi\|^2 \le 2\pi\|f - \phi\|_u^2.$$

(See Problem 14 of Problem Set 12.1.) Therefore

$$\|f - \operatorname{Pr}_N f\|^2 \le 2\pi\|f - \phi\|_u^2.$$

We take $\phi$ so that

$$2\pi\|f - \phi\|_u^2 < \epsilon^2.$$

(This will hold whenever $\|f - \phi\|_u < \epsilon/\sqrt{2\pi}$.) It follows that

$$\|f - \operatorname{Pr}_N f\| < \epsilon.$$

Since the norm of this difference is nonincreasing (Theorem 6 of Section 12.1), it follows that $\|f - \operatorname{Pr}_n f\| < \epsilon$ for every $n \ge N$. Therefore $\lim_{n\to\infty} \|f - \operatorname{Pr}_n f\| = 0$.

Finally, we observe that for each $n$ there is a point $x_n$ at which the error in the proposed approximation $f \approx \operatorname{Pr}_n f$ is actually equal to 0.

Theorem 6. If $f$ is continuous, then for each $n$ there is an $x_n$ such that $f(x_n) = \operatorname{Pr}_n f(x_n)$.

Proof. By definition of the projection, $f - \operatorname{Pr}_n f$ must be orthogonal to every vector in $T_n$. In particular, the inner product $(f - \operatorname{Pr}_n f) \cdot 1$ must be 0, because 1 is in $T_n$. Therefore

$$\int_{-\pi}^{\pi} [f(x) - \operatorname{Pr}_n f(x)] \cdot 1\,dx = 0.$$

Here the integrand must vanish at some point $x_n$.

At first, this theorem may seem almost like a joke, but it isn't. See the following section.
PROBLEM SET 12.2

Theorem 1 implies that each of the following functions is a trigonometric polynomial. Compute these functions in the form of trigonometric polynomials.

1. $\sin^3 x$
2. $\sin^2 x$
3. $\cos^3 x$
4. $\cos^2 (2x)$
5. $\sin^2 x \cos x$
6. $\cos^2 x \sin x$
7. $\cos x \sin 2x$
8. $\cos^4 x$
9. $\sin^4 x$
10. $\sin^2 x \cos^2 x$

11. Suppose that in Theorem 3 we had used

$$\phi_n(x) = \frac{\int_{-\pi}^{\pi} f(x + t) \cos^n t\,dt}{\int_{-\pi}^{\pi} \cos^n t\,dt}.$$

Would Theorem 3 still have been true? (Either prove the theorem in the more general form, with odd exponents allowed, or give an example to show that the more general theorem is false.)

12. Show, by any method, that

$$\Bigl[\int_{-\pi}^{\pi} f(x)\,dx\Bigr]^2 \le 2\pi \int_{-\pi}^{\pi} [f(x)]^2\,dx.$$

[Hint: There is a quick method, on the basis of what you know now.]

13. Show that if $f$ is as in Theorem 5, then

$$\lim_{n\to\infty} \int_{-\pi}^{\pi} |f(x) - \operatorname{Pr}_n f(x)|\,dx = 0.$$

14. Show that if $f$ is as in Theorem 5, then

$$\lim_{n\to\infty} \int_{-\pi}^{\pi} \operatorname{Pr}_n f(x)\,dx = \int_{-\pi}^{\pi} f(x)\,dx.$$

15. Now show that the same result holds on every interval $[a, b]$.

*16. In Section 10.8 we proved the binomial theorem

\[ (1 + x)^n = \sum_{i=0}^{n} \binom{n}{i} x^i \]

by the methods of calculus, in the real domain. Thus the proof in Section 10.8 does not show that

\[ (1 + z)^n = \sum_{i=0}^{n} \binom{n}{i} z^i \]

for every complex number $z$. Prove the latter theorem, by induction.

*17. Now use the result of Problem 16 to get an explicit formula for $\cos^4 x$ in the form of a trigonometric polynomial, with coefficients given numerically.


*18. Now get a general formula for $\cos^n x$ as a trigonometric polynomial.

[Note that in the text, we did not need the full force of Theorem 1; all we needed was the result of Problem 18; and the proof of the special result is neater. But Problem 18 is misleadingly special; and Theorem 1 ought to be regarded as the "real reason" why $\cos^{2n} x$ is a trigonometric polynomial.]

19. In the proof of Theorem 3, we used the function

\[ \phi_n(x) = \frac{\int_{-\pi}^{\pi} f(x + t) \cos^{2n} t \, dt}{\int_{-\pi}^{\pi} \cos^{2n} t \, dt} = \frac{\int_{-\pi}^{\pi} f(t) \cos^{2n} (t - x) \, dt}{\int_{-\pi}^{\pi} \cos^{2n} t \, dt}. \]

Let $f(x) = x^2$, on the interval $[-\pi, \pi]$; and extend the graph so as to get a function of period $2\pi$. (See the figure on p. 547.) For this function $f$, compute the function $\phi_2(x)$ in the form of a trigonometric polynomial, using definite integrals as coefficients. You need not compute the integrals numerically.

20. Given a trigonometric polynomial

\[ \phi(x) = a_0 + \sum_{i=1}^{n} (a_i \cos ix + b_i \sin ix). \]

Is it always true that $\phi$ is its own Fourier series? That is, do we have $\mathrm{Pr}_m \phi = \phi$ for $m \ge n$? Why or why not?

*21. Let $f$ be a continuous function on $[0, 1]$. Show that for every $\epsilon > 0$ there is a polynomial $p(x) = \sum_{i=0}^{n} a_i x^i$ such that

\[ |f(x) - p(x)| < \epsilon \qquad (0 \le x \le 1). \]

(This is a celebrated theorem due to Karl Weierstrass.)


*22. Given that the theorem proposed in Problem 21 is true, show that if $f$ is continuous on $[a, b]$, and $\epsilon > 0$, then there is a polynomial $p(x)$ such that

\[ |f(x) - p(x)| < \epsilon \qquad \text{for } a \le x \le b. \]

12.3 INTEGRATION OF FOURIER SERIES. THE UNIFORM CONVERGENCE THEOREM

We defined the Fourier series

\[ a_0 + \sum_{i=1}^{\infty} (a_i \cos ix + b_i \sin ix) \]

by using the projections

\[ \mathrm{Pr}_n f = a_0 + \sum_{i=1}^{n} (a_i \cos ix + b_i \sin ix) \]

of $f$ into the subspace $T_n$ of trigonometric polynomials of order $n$. We now want to show that if $f$ has period $2\pi$, and $f'$ is continuous, then:

1) the Fourier series of $f$ converges;
2) the sum of the Fourier series is the function $f$;
3) the series can be integrated a term at a time.


The ideas suggested by (3) are the key to the situation: to prove convergence, we
first need to find out how the operations Prn are related to differentiation and
integration.
Theorem 1. If $f$ has period $2\pi$, and $f'$ is continuous, then

\[ \mathrm{Pr}_n f' = (\mathrm{Pr}_n f)'. \]

That is, the projection of the derivative is the derivative of the projection.

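Proof. Let

\[ \mathrm{Pr}_n f = a_0 + \sum_{i=1}^{n} (a_i \cos ix + b_i \sin ix), \qquad \mathrm{Pr}_n f' = A_0 + \sum_{i=1}^{n} (A_i \cos ix + B_i \sin ix). \]

We need to show that $A_0 = 0$, $A_i = i b_i$, $B_i = -i a_i$. The Fourier coefficients for $f$ are given by the formulas

\[ a_0 = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x)\,dx, \qquad a_i = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \cos ix \,dx \quad (i > 0), \qquad b_i = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \sin ix \,dx \quad (i > 0). \]

Similarly,

\[ A_0 = \frac{1}{2\pi} \int_{-\pi}^{\pi} f'(x)\,dx, \qquad A_i = \frac{1}{\pi} \int_{-\pi}^{\pi} f'(x) \cos ix \,dx, \qquad B_i = \frac{1}{\pi} \int_{-\pi}^{\pi} f'(x) \sin ix \,dx. \]

By the fundamental theorem of integral calculus,

\[ A_0 = \frac{1}{2\pi} [f(\pi) - f(-\pi)] = 0. \]

In $A_i$, for $i > 0$, we integrate by parts, using

\[ u = \cos ix, \qquad dv = f'(x)\,dx, \qquad du = -i \sin ix \,dx, \qquad v = f(x). \]

This gives

\[ A_i = \frac{1}{\pi} \Big[ f(x) \cos ix \Big]_{-\pi}^{\pi} + \frac{1}{\pi} \int_{-\pi}^{\pi} i f(x) \sin ix \,dx = 0 + i b_i = i b_i. \]

Similarly, using $u = \sin ix$, $dv = f'(x)\,dx$, $du = i \cos ix \,dx$, $v = f(x)$, we get

\[ B_i = \frac{1}{\pi} \Big[ f(x) \sin ix \Big]_{-\pi}^{\pi} - \frac{1}{\pi} \int_{-\pi}^{\pi} i f(x) \cos ix \,dx = 0 - i a_i = -i a_i. \]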
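The coefficient relations in Theorem 1 can be checked numerically. The following is only a sketch: the test function f(x) = exp(sin x), the helper names, and the sample count are assumptions chosen for illustration, and the integrals are approximated by a rectangle rule rather than computed exactly.

```python
import math

def integrate_periodic(h, samples=2048):
    """Approximate the integral of h over [-pi, pi] by the rectangle rule.
    For smooth functions of period 2*pi this simple rule is very accurate."""
    step = 2 * math.pi / samples
    return step * sum(h(-math.pi + k * step) for k in range(samples))

def fourier_coefficients(h, i):
    """Return (a_i, b_i) for i > 0, using the integral formulas of the text."""
    a = integrate_periodic(lambda x: h(x) * math.cos(i * x)) / math.pi
    b = integrate_periodic(lambda x: h(x) * math.sin(i * x)) / math.pi
    return a, b

# Hypothetical test function (not from the text), with its derivative.
def f(x):
    return math.exp(math.sin(x))

def f_prime(x):
    return math.cos(x) * math.exp(math.sin(x))

def check_theorem_1(max_order=4, tolerance=1e-9):
    """Check A_0 = 0, A_i = i*b_i and B_i = -i*a_i for i = 1..max_order."""
    A0 = integrate_periodic(f_prime) / (2 * math.pi)
    ok = abs(A0) < tolerance
    for i in range(1, max_order + 1):
        a, b = fourier_coefficients(f, i)
        A, B = fourier_coefficients(f_prime, i)
        ok = ok and abs(A - i * b) < tolerance and abs(B + i * a) < tolerance
    return ok

print(check_theorem_1())
```

The check compares coefficients only; it says nothing about convergence of either series, which is the point of the remark that follows.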


Note that Theorem 1, as it stands, does not tell us that

\[ D\left[ a_0 + \sum_{i=1}^{\infty} (a_i \cos ix + b_i \sin ix) \right] = \sum_{i=1}^{\infty} (-i a_i \sin ix + i b_i \cos ix). \]

In fact, at this stage, we don't know that either of the indicated series converges to any function at all.

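We now propose to find out how $\mathrm{Pr}_n$ is related to integration. Theorem 5 of Section 12.2 says that

\[ \lim_{n\to\infty} \|f - \mathrm{Pr}_n f\| = 0, \]

which means that

\[ \lim_{n\to\infty} \int_{-\pi}^{\pi} [f(x) - \mathrm{Pr}_n f(x)]^2\,dx = 0. \]

What we want is

\[ \lim_{n\to\infty} \int_{-\pi}^{\pi} |f(x) - \mathrm{Pr}_n f(x)|\,dx = 0. \]

For this purpose we need the following: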


Theorem 2. If $g$ is continuous on $[a, b]$, then

\[ \left[ \int_a^b g(x)\,dx \right]^2 \le (b - a) \int_a^b [g(x)]^2\,dx. \]

Proof. Let $C_0[a, b]$ be the set of all continuous functions on $[a, b]$. Under the usual definitions of addition, scalar multiplication, and inner product, $C_0[a, b]$ forms an inner-product space, and so the Schwarz inequality holds. In the inequality

\[ (A \cdot B)^2 \le \|A\|^2 \|B\|^2, \]

we take $A = g$, $B = 1$. This gives

\[ \left[ \int_a^b g(x) \cdot 1 \,dx \right]^2 \le \left[ \int_a^b g(x) \cdot g(x)\,dx \right] \left[ \int_a^b (1 \cdot 1)\,dx \right] = (b - a) \int_a^b [g(x)]^2\,dx. \]

This tells us that the integral of a function is small if the norm of the function is small.
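As a quick check of Theorem 2, take the hypothetical example $g(x) = x$ on $[0, 2]$ (this example is an assumption chosen for illustration, not one from the text):

```latex
\[
\left[ \int_0^2 x \, dx \right]^2 = 2^2 = 4,
\qquad
(2 - 0) \int_0^2 x^2 \, dx = 2 \cdot \frac{8}{3} = \frac{16}{3},
\qquad
4 \le \frac{16}{3}.
\]
```

For a constant function $g(x) = c$ both sides equal $c^2 (b - a)^2$, so the inequality cannot be improved in general.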

Applying this principle to the function $|f - \mathrm{Pr}_n f|$, we obtain the following theorem.

Theorem 3. Let $f$ be a continuous function, of period $2\pi$, and let

\[ M_n = \int_{-\pi}^{\pi} |f(x) - \mathrm{Pr}_n f(x)|\,dx. \]

Then

\[ \lim_{n\to\infty} M_n = 0. \]

Proof.

\[ M_n^2 = \left[ \int_{-\pi}^{\pi} |f(x) - \mathrm{Pr}_n f(x)|\,dx \right]^2 \le 2\pi \int_{-\pi}^{\pi} [f(x) - \mathrm{Pr}_n f(x)]^2\,dx = 2\pi \|f - \mathrm{Pr}_n f\|^2. \]

By Theorem 5 of Section 12.2, the last of these expressions approaches 0. Therefore $M_n^2 \to 0$, and $M_n \to 0$.

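We are now ready to prove a convergence theorem. If $f$ is as in Theorem 3, and $f'$ is continuous, then $f'$ satisfies the conditions of Theorem 3. Let

\[ M_n = \int_{-\pi}^{\pi} |f'(x) - \mathrm{Pr}_n f'(x)|\,dx. \]

Then

\[ \lim_{n\to\infty} M_n = 0. \]

Let $x_n$ and $x$ be any points of $[-\pi, \pi]$. Since $\mathrm{Pr}_n f' = (\mathrm{Pr}_n f)'$, we have

\[ \int_{x_n}^{x} [f'(t) - \mathrm{Pr}_n f'(t)]\,dt = \Big[ f(t) - \mathrm{Pr}_n f(t) \Big]_{x_n}^{x} = f(x) - \mathrm{Pr}_n f(x) - [f(x_n) - \mathrm{Pr}_n f(x_n)]. \]

By Theorem 6 of Section 12.2, we can choose each $x_n$ so that $f(x_n) = \mathrm{Pr}_n f(x_n)$. This gives

\[ |f(x) - \mathrm{Pr}_n f(x)| = \left| \int_{x_n}^{x} [f'(t) - \mathrm{Pr}_n f'(t)]\,dt \right| \le \left| \int_{x_n}^{x} |f'(t) - \mathrm{Pr}_n f'(t)|\,dt \right| \le M_n. \]

Therefore $\|f - \mathrm{Pr}_n f\|_u \le M_n$. This gives: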


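Theorem 4. If $f$ has period $2\pi$, and $f'$ is continuous, then

\[ \lim_{n\to\infty} \|f - \mathrm{Pr}_n f\|_u = 0. \]

In the language of Section 10.15, this takes the form:

Theorem 4'. If $f$ has period $2\pi$, and $f'$ is continuous, then

\[ \operatorname*{U\,lim}_{n\to\infty} \mathrm{Pr}_n f = f \quad \text{on } (-\infty, \infty). \]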

In each case, the reason is that the differences $f(x) - \mathrm{Pr}_n f(x)$ are squeezed to 0 by a sequence of positive constants. Finally, all this can be restated in terms of the formula

\[ \mathrm{Pr}_n f(x) = a_0 + \sum_{i=1}^{n} (a_i \cos ix + b_i \sin ix), \]

and the formulas for the Fourier coefficients $a_i$ and $b_i$. This gives a third form of the theorem:

Theorem 4''. Let $f$ be a function with period $2\pi$ and a continuous derivative. Let

\[ a_0 = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x)\,dx, \qquad a_i = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \cos ix \,dx, \qquad b_i = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \sin ix \,dx. \]

Then

\[ \operatorname*{U\,lim}_{n\to\infty} \left[ a_0 + \sum_{i=1}^{n} (a_i \cos ix + b_i \sin ix) \right] = f(x) \quad \text{on } (-\infty, \infty). \]

Just as we found for power series, in Chapter 10, uniform convergence enables us to integrate a term at a time. In general:

Theorem 5. If the functions $f_n$ are continuous, and $\operatorname*{U\,lim}_{n\to\infty} f_n = f$ on $[a, b]$, then

\[ \lim_{n\to\infty} \int_a^b f_n(x)\,dx = \int_a^b f(x)\,dx. \]

This gives:

Theorem 6. If $f$ has period $2\pi$ and $f'$ is continuous, then the Fourier series for $f$ can be integrated a term at a time, on any interval.


We found, at the beginning of this section, that $\mathrm{Pr}_n f' = (\mathrm{Pr}_n f)'$. If

\[ f'(x) = \lim_{n\to\infty} \mathrm{Pr}_n f'(x), \]

it follows that the series for $f$ can be differentiated a term at a time. But we have proved Theorem 4 only for functions with continuous derivatives; and so our convergence theorem applies to $f'$ only when $f''$ is continuous. Hence the heavy hypotheses in the following theorem:

Theorem 7. If $f$ has period $2\pi$ and $f'$ and $f''$ are continuous, then the Fourier series for $f$ can be differentiated a term at a time.

That is, for $f(x) = \lim_{n\to\infty} \mathrm{Pr}_n f(x)$, we have

\[ f'(x) = \lim_{n\to\infty} \mathrm{Pr}_n f'(x) = \lim_{n\to\infty} [\mathrm{Pr}_n f(x)]'. \]

(We get the first of these equations from Theorem 4, and the second from Theorem 1.)

In the further development of the theory, in which we allow discontinuous functions, the status of Theorem 7 is very different from that of Theorems 4 and 6, in that the latter two theorems have quite satisfactory generalizations, but Theorem 7 does not.

Suppose that we form a function of period $2\pi$ by fitting together the graphs of a finite set of continuously differentiable functions, end to end. We reconcile the values at common endpoints by defining

\[ f(x_1) = \tfrac{1}{2} [f_1(x_1) + f_2(x_1)], \]

and so on; and we reconcile the values at $-\pi$ and $\pi$ by defining

\[ f(\pi) = f(-\pi) = \tfrac{1}{2} [f_1(-\pi) + f_3(\pi)]. \]

A function obtained in this way will be called a function of the Fourier type. Evidently functions of this type are integrable, and so every such $f$ has a Fourier series, with coefficients given by the same integral formulas as for continuous functions. The following turns out to be true:

Theorem B. Let $f$ be a function of the Fourier type, with Fourier series

\[ L(x) = a_0 + \sum_{i=1}^{\infty} (a_i \cos ix + b_i \sin ix). \]

Then

1) For every $x$, $L(x)$ converges to $f(x)$.
2) The convergence is uniform on every closed interval on which $f$ is continuous.
3) The series can be integrated a term at a time on any closed interval (even if the interval contains points of discontinuity).
4) $\lim_{n\to\infty} \|f - \mathrm{Pr}_n f\| = 0$.

But it is not necessarily true that the series for a function of the Fourier type can be differentiated a term at a time, even on an interval on which $f$, $f'$, and $f''$ are continuous. This will be brought out in the problem set below. In working on these problems, you should regard Theorem B as given.

PROBLEM SET 12.3

1. Show that if $f$ is a function of the Fourier type, then

\[ \left[ \int_{-\pi}^{\pi} f(x)\,dx \right]^2 \le 2\pi \int_{-\pi}^{\pi} [f(x)]^2\,dx. \]

(A simple proof is possible, if you use the right ideas.)

2. Let $f$ be the function defined by the conditions

(1) $f(x) = 1$ on $(0, \pi)$,
(2) $f(0) = f(\pi) = 0$,
(3) $f(x) = -1$ on $(-\pi, 0)$,
(4) $f(x + 2n\pi) = f(x)$.

Calculate the series for $f$, and discuss the series obtained by termwise differentiation.

3. Let $f$ be the function defined by the conditions

(1) $f(x) = e^x$ on $(-\pi, \pi)$,
(2) $f(-\pi) = f(\pi) = \tfrac{1}{2}(e^{-\pi} + e^{\pi})$,
(3) $f(x + 2n\pi) = f(x)$, for every $x$.

Evidently $f$ has a Fourier series

\[ a_0 + \sum_{i=1}^{\infty} (a_i \cos ix + b_i \sin ix). \]

Show that termwise differentiation of the series for $f$ cannot give the Fourier series for $f'$.

4. Now calculate the Fourier series of $f$. (You did not need to do this, to solve Problem 3.)
5. Now verify, by a calculation, that for this series,

\[ \int_0^x \left[ a_0 + \sum_{i=1}^{\infty} (a_i \cos it + b_i \sin it) \right] dt = \int_0^x a_0 \,dt + \sum_{i=1}^{\infty} \int_0^x (a_i \cos it + b_i \sin it)\,dt. \]

Problems 6 through 8. Let $f$ be the function defined by the conditions

(1) $f(x) = e^x$ on $(0, \pi)$,
(2) $f(0) = 0$, $f(\pi) = \tfrac{1}{2}(e^{\pi} - e^{-\pi})$,
(3) $f(x) = -e^x$ on $(-\pi, 0)$,
(4) $f(x + 2n\pi) = f(x)$, for every $x$.

Proceed as in Problems 3 through 5.

*9. Let $f$ and $g$ be functions of the Fourier type, and let $L(x)$ be the series for $f$, as in the text. Show that

\[ \int_{-\pi}^{\pi} f(x) g(x)\,dx = \int_{-\pi}^{\pi} a_0 g(x)\,dx + \sum_{i=1}^{\infty} \int_{-\pi}^{\pi} [a_i \cos ix + b_i \sin ix] g(x)\,dx. \]

*10. Show that if / is as in Problem 1, then

111112

=Ia+ Ib.
00

00

i=O

i=l

Linear Transformations,
Matrices, and Determinants

13

13.1

LINEAR TRANSFORMATIONS

Let $R^n$ and $R^m$ be Cartesian spaces of any dimension. A function

\[ f: R^n \to R^m \]

is called a linear function, or a linear transformation, if it preserves sums and scalar products. That is, $f$ is linear if

\[ f(P + Q) = f(P) + f(Q) \quad \text{for every } P, Q, \tag{1} \]

and

\[ f(\alpha P) = \alpha f(P) \quad \text{for every } \alpha, P. \tag{2} \]

If $f$ is linear, then

\[ f(\alpha P + \beta Q) = f(\alpha P) + f(\beta Q) = \alpha f(P) + \beta f(Q); \tag{3} \]

and conversely, if (3) holds, then (1) and (2) both hold.

To see how linear transformations work, let us examine some special cases in low dimensions. Suppose that

\[ f: R^3 \to R^3 \]

is linear. In $R^3$ we use the "standard basis"

\[ B_3 = \{E_1, E_2, E_3\}, \]

where

\[ E_1 = (1, 0, 0), \qquad E_2 = (0, 1, 0), \qquad E_3 = (0, 0, 1). \]

Now $f(E_1)$ must be a vector in $R^3$; and $B_3$ is a basis for $R^3$. Therefore we have

\[ f(E_1) = a_{11} E_1 + a_{21} E_2 + a_{31} E_3, \]

for some set of scalars $a_{11}, a_{21}, a_{31}$. [Here we are using double subscripts; $a_{i1}$ is the coefficient of $E_i$ in the expression for $f(E_1)$.] Similarly, $f(E_2)$ and $f(E_3)$ have the forms

\[ f(E_2) = a_{12} E_1 + a_{22} E_2 + a_{32} E_3, \qquad f(E_3) = a_{13} E_1 + a_{23} E_2 + a_{33} E_3. \]

If $f(E_1)$, $f(E_2)$, and $f(E_3)$ are known, then this determines $f(P)$ for every $P$ in $R^3$. The reason is that for

\[ P = x_1 E_1 + x_2 E_2 + x_3 E_3 \]

we must have

\[ f(P) = x_1 f(E_1) + x_2 f(E_2) + x_3 f(E_3). \]

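This enables us to write a formula for $f(P)$ in terms of the numbers $a_{ij}$:

\[
\begin{aligned}
f(x_1 E_1 + x_2 E_2 + x_3 E_3) ={}& a_{11} x_1 E_1 + a_{21} x_1 E_2 + a_{31} x_1 E_3 \\
&+ a_{12} x_2 E_1 + a_{22} x_2 E_2 + a_{32} x_2 E_3 \\
&+ a_{13} x_3 E_1 + a_{23} x_3 E_2 + a_{33} x_3 E_3.
\end{aligned}
\]

Collecting the coefficients of $E_1$, $E_2$, and $E_3$, we get:

\[
\begin{aligned}
f(P) ={}& (a_{11} x_1 + a_{12} x_2 + a_{13} x_3) E_1 \\
&+ (a_{21} x_1 + a_{22} x_2 + a_{23} x_3) E_2 \\
&+ (a_{31} x_1 + a_{32} x_2 + a_{33} x_3) E_3 \\
={}& y_1 E_1 + y_2 E_2 + y_3 E_3.
\end{aligned}
\]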

There is a special apparatus which enables us to write such formulas more compactly, and deal with them more conveniently. Evidently $f$ is completely described by the $3 \cdot 3 = 9$ numbers $a_{ij}$. We write these in a rectangular array, like this:

\[ M = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}. \]

Such an array $M$ is called a 3 by 3 matrix. (In general, an $m$ by $n$ matrix is a rectangular array of numbers, with $m$ rows and $n$ columns.)

At the beginning of Section 11.3, before we introduced a basis in $R^3$, we represented the vector $P = x_1 E_1 + x_2 E_2 + x_3 E_3$ by the ordered triplet $(x_1, x_2, x_3)$. We now write this triplet vertically instead of horizontally, in the form of a "3 by 1 matrix"

\[ P = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}. \]

When $P$ is described in this notation, we call $P$ a column vector. Similarly, for

\[ f(P) = y_1 E_1 + y_2 E_2 + y_3 E_3 = (y_1, y_2, y_3), \]

we write

\[ f(P) = \begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix} = \begin{bmatrix} a_{11} x_1 + a_{12} x_2 + a_{13} x_3 \\ a_{21} x_1 + a_{22} x_2 + a_{23} x_3 \\ a_{31} x_1 + a_{32} x_2 + a_{33} x_3 \end{bmatrix}. \]

Note that the array on the right is a column vector; once the indicated additions are performed, there is only one entry in each row.

Now we define the operation of multiplication, of the column vector $P$ by the matrix $M$, in such a way that $f(P) = MP$. That is:
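Definition.

\[ \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} a_{11} x_1 + a_{12} x_2 + a_{13} x_3 \\ a_{21} x_1 + a_{22} x_2 + a_{23} x_3 \\ a_{31} x_1 + a_{32} x_2 + a_{33} x_3 \end{bmatrix}. \]

Under this definition of the "product" $MP$, we have

\[ MP = \begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix} = f(P). \]

There is a usable pattern in this multiplication: to get the entry $y_1$ in the first row of the product, we regard the first row of $M$ as a vector, and form its inner product with the column vector $P$. Similarly for the other rows.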

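Let us examine an example. The matrix

\[ M = \begin{bmatrix} 1 & 2 & 1 \\ -2 & 1 & 2 \\ 1 & -1 & 2 \end{bmatrix} \]

describes a linear transformation $f$, with

\[ f(x_1, x_2, x_3) = \begin{bmatrix} 1 & 2 & 1 \\ -2 & 1 & 2 \\ 1 & -1 & 2 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} x_1 + 2x_2 + x_3 \\ -2x_1 + x_2 + 2x_3 \\ x_1 - x_2 + 2x_3 \end{bmatrix} = \begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix}. \]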

Two questions arise naturally here.

Problem 1. Given a particular point $(y_1, y_2, y_3)$, for what points $P$, if any, do we have $f(P) = (y_1, y_2, y_3)$? For example, for what points $P$ is it true that

\[ f(P) = (1, 2, 3)? \]

To answer this question, we need to solve the system

\[
\begin{aligned}
x_1 + 2x_2 + x_3 &= 1, & (1) \\
-2x_1 + x_2 + 2x_3 &= 2, & (2) \\
x_1 - x_2 + 2x_3 &= 3 & (3)
\end{aligned}
\]

of linear equations in the unknowns $x_1, x_2, x_3$. Almost any method will do. We shall use a method which will be of theoretical importance later.

Step 1. Eliminate $x_1$ from (2) and (3), by adding twice (1) to (2) and subtracting (1) from (3). This gives a new system which is equivalent to the original system, in the sense that it has exactly the same solutions:

\[
\begin{aligned}
x_1 + 2x_2 + x_3 &= 1, & (1) & \\
5x_2 + 4x_3 &= 4, & (2') & \quad (2) + 2(1) \\
-3x_2 + x_3 &= 2. & (3') & \quad (3) - (1)
\end{aligned}
\]

(The notations on the right indicate where the new equations come from.)

Step 2. Eliminate $x_2$ from (3'), by adding $\tfrac{3}{5}$ of (2') to (3'). This gives the equivalent system

\[
\begin{aligned}
x_1 + 2x_2 + x_3 &= 1, & (1) & \\
5x_2 + 4x_3 &= 4, & (2') & \\
\tfrac{17}{5} x_3 &= \tfrac{22}{5}. & (3'') & \quad (3') + \tfrac{3}{5}(2')
\end{aligned}
\]

We now say that the system is in triangular form. In general, a system of $n$ linear equations in $n$ unknowns is in triangular form if the $n$th equation involves only $x_n$, the $(n - 1)$st equation involves only $x_{n-1}$ and $x_n$, and so on. (This means that in the matrix of coefficients, $a_{ij} = 0$ for $j < i$.)

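Step 3. Solve, by successive substitutions, working from bottom to top, getting

\[ x_3 = \tfrac{22}{17}, \qquad x_2 = -\tfrac{4}{17}, \qquad x_1 = \tfrac{3}{17}. \]

This means that

\[ \begin{bmatrix} 1 & 2 & 1 \\ -2 & 1 & 2 \\ 1 & -1 & 2 \end{bmatrix} \begin{bmatrix} \tfrac{3}{17} \\ -\tfrac{4}{17} \\ \tfrac{22}{17} \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}. \]

This should be checked.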
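Steps 1 through 3 can be sketched in code. This is a minimal illustration rather than a general-purpose solver: the function name is hypothetical, exact fractions are used to avoid rounding, and the routine assumes that every diagonal entry it divides by is nonzero (true for this example), so it performs no row exchanges.

```python
from fractions import Fraction

def solve_by_triangularization(matrix, rhs):
    """Reduce a square system to triangular form, then substitute from
    bottom to top. Assumes all pivots are nonzero; no row exchanges."""
    n = len(matrix)
    # Augmented rows: coefficients followed by the right-hand side.
    rows = [[Fraction(v) for v in row] + [Fraction(r)]
            for row, r in zip(matrix, rhs)]
    # Elimination: clear the entries below each diagonal position.
    for col in range(n):
        for r in range(col + 1, n):
            factor = rows[r][col] / rows[col][col]
            rows[r] = [x - factor * p for x, p in zip(rows[r], rows[col])]
    # Back substitution, working from the last equation upward.
    solution = [Fraction(0)] * n
    for i in range(n - 1, -1, -1):
        known = sum(rows[i][j] * solution[j] for j in range(i + 1, n))
        solution[i] = (rows[i][n] - known) / rows[i][i]
    return solution

M = [[1, 2, 1], [-2, 1, 2], [1, -1, 2]]
x = solve_by_triangularization(M, [1, 2, 3])
print(x)  # [Fraction(3, 17), Fraction(-4, 17), Fraction(22, 17)]
```

Substituting the result back into equations (1), (2), and (3) is the check that the text asks for.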

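Problem 2. Find, if possible, a set of formulas expressing $P = (x_1, x_2, x_3)$ in terms of $(y_1, y_2, y_3)$.

To do this, we need to get a "general solution" of the equation

\[ \begin{bmatrix} 1 & 2 & 1 \\ -2 & 1 & 2 \\ 1 & -1 & 2 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix}, \]

in which the $x_i$'s are expressed in terms of the $y_i$'s. This is only slightly more troublesome than Problem 1; we treat $(y_1, y_2, y_3)$ in exactly the same way as we treated $(1, 2, 3)$ in Problem 1. The solution is

\[
\begin{aligned}
x_1 &= \tfrac{4}{17} y_1 - \tfrac{5}{17} y_2 + \tfrac{3}{17} y_3, \\
x_2 &= \tfrac{6}{17} y_1 + \tfrac{1}{17} y_2 - \tfrac{4}{17} y_3, \\
x_3 &= \tfrac{1}{17} y_1 + \tfrac{3}{17} y_2 + \tfrac{5}{17} y_3.
\end{aligned}
\]

The coefficients in these equations give us a new matrix

\[ M^{-1} = \begin{bmatrix} \tfrac{4}{17} & -\tfrac{5}{17} & \tfrac{3}{17} \\ \tfrac{6}{17} & \tfrac{1}{17} & -\tfrac{4}{17} \\ \tfrac{1}{17} & \tfrac{3}{17} & \tfrac{5}{17} \end{bmatrix}. \]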

We call this matrix $M^{-1}$ because it is the matrix of the transformation

\[ f^{-1}: R^3 \to R^3. \]

That is, $M^{-1}$ reverses the action of $M$, in the sense that

\[ M^{-1}(MP) = P, \quad \text{for every } P = (x_1, x_2, x_3). \]

It may easily happen, however, that a linear transformation

\[ f: R^n \to R^m \]

does not have an inverse

\[ (?) \qquad f^{-1}: R^m \to R^n. \qquad (?) \]

There are two things that may go wrong: (1) Some points $Q$ of $R^m$ may not be values of the function at all. In such cases, there is no such thing as $f^{-1}(Q)$. (2) Some points $Q$ of $R^m$ may be equal to $f(P)$ for more than one point $P$. In such cases, $f$ is not invertible. An example of both these phenomena is furnished by the matrix

\[ M = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}. \]

Here

\[ M \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} x \\ y \\ 0 \end{bmatrix}. \]

This function projects $R^3$ into the $xy$-plane; no point outside the $xy$-plane is a value of the function, and $f(P) = f(P')$ whenever $P$ and $P'$ lie on the same vertical line.
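The statement that $M^{-1}$ reverses the action of $M$ can be tested directly on sample points. In the sketch below, the helper names `apply` and `round_trip` and the sample points are assumptions made for illustration; the entries of $M^{-1}$ are the coefficients in the formulas for $x_1$, $x_2$, $x_3$.

```python
from fractions import Fraction

M = [[1, 2, 1], [-2, 1, 2], [1, -1, 2]]
# Numerators of the entries of M^{-1}; every entry has denominator 17.
M_inv = [[Fraction(n, 17) for n in row]
         for row in ([4, -5, 3], [6, 1, -4], [1, 3, 5])]

def apply(matrix, vector):
    """Matrix times column vector: each entry is the inner product
    of one row of the matrix with the vector."""
    return [sum(a * v for a, v in zip(row, vector)) for row in matrix]

def round_trip(P):
    """Compute M^{-1}(M P), which should give back P."""
    return apply(M_inv, apply(M, P))

print(round_trip([1, 0, 0]) == [1, 0, 0])
print(round_trip([2, -7, 5]) == [2, -7, 5])
```

Checking the three standard basis vectors is enough, by linearity, to conclude that $M^{-1}(MP) = P$ for every $P$.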

To find out how a given linear transformation behaves, we try to compute its inverse, by the method used in the above example. If there is an inverse, the method gives it to us; and if there isn't, the method still tells us what is going on. Consider, for example, the transformation $f$ described by the matrix

\[ M = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 5 & 8 \\ 1 & 4 & 7 \end{bmatrix}. \]

To simplify the notation, we use $(x, y, z)$ for $(x_1, x_2, x_3)$ and $(a, b, c)$ for $(y_1, y_2, y_3)$. We then have

\[ MP = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 5 & 8 \\ 1 & 4 & 7 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} x + 2y + 3z \\ 2x + 5y + 8z \\ x + 4y + 7z \end{bmatrix} = \begin{bmatrix} a \\ b \\ c \end{bmatrix}. \]

This gives the linear system

\[
\begin{aligned}
x + 2y + 3z &= a, \\
2x + 5y + 8z &= b, \\
x + 4y + 7z &= c.
\end{aligned}
\]

Reducing this to triangular form, we get the equivalent system

\[
\begin{aligned}
x + 2y + 3z &= a, & (1) \\
y + 2z &= b - 2a, & (2) \\
0 &= 3a - 2b + c. & (3)
\end{aligned}
\]

Thus the equation

\[ f(P) = MP = (a, b, c) \]

cannot have a solution unless $3a - 2b + c = 0$. Let

\[ E = \{(a, b, c) \mid 3a - 2b + c = 0\}. \]

Then $E$ is a plane, and every point of $E$ is $= f(P)$ for some $P$. The reason is that as long as (3) is satisfied, we can solve (2) and (1) in the forms

\[
\begin{aligned}
y &= -2z + b - 2a, & (2') \\
x &= -2y - 3z + a = z - 2b + 5a. & (1')
\end{aligned}
\]

Here we can choose any $z$; if

\[ P = (z - 2b + 5a, \; -2z + b - 2a, \; z), \]

then

\[ f(P) = (a, b, c). \]

As in Section 3.1, the image of a function is defined to be the set of all values of the function. If $f$ is a function $A \to B$, then the image is denoted by $f(A)$. More generally, if $A'$ is any subset of $A$, then

\[ f(A') = \{b \mid b = f(a') \text{ for some } a' \text{ in } A'\}. \]

Thus, in the example above, the image is

\[ f(R^3) = E = \{(a, b, c) \mid 3a - 2b + c = 0\}. \]

The kernel of a linear transformation $f$ is

\[ \operatorname{Ker} f = \{P \mid f(P) = O\}. \]

Obviously the kernel contains $O$, no matter what $f$ may be, because

\[ f(O) = O. \]

(Proof. $f(O) = f(P - P) = f(P) - f(P) = O$.)

If $f$ is not one-to-one, however, then the kernel $\operatorname{Ker} f$ contains vectors other than $O$. (Proof. If $f(P) = f(Q)$, for some $P \ne Q$, then $f(P - Q) = f(P) - f(Q) = O$, and so $P - Q$ belongs to $\operatorname{Ker} f$.)

In the above example, the kernel is the solution set of the equation

\[ MP = O. \]

To get the solution, we set $a = b = 0$ in equations (2') and (1'). Therefore $P$ is in the kernel if $P$ has the form

\[ P = (z, -2z, z). \]

Using $\alpha$ for $z$ (to fit the usual notation of scalar multiplication), we get

\[ \operatorname{Ker} f = \{\alpha V_0\}, \qquad V_0 = (1, -2, 1). \]

Therefore $\operatorname{Ker} f$ is a line. In other cases, the image may be an even smaller set, and the kernel even larger. For

\[ M = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \]

the image is the $x$-axis, because

\[ \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} x \\ 0 \\ 0 \end{bmatrix}. \]

The kernel is the $yz$-plane, because the equation

\[ \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} \]

gives the system

\[ x = 0, \qquad 0 = 0, \qquad 0 = 0, \]

whose solution set is obvious.
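The image and kernel found for the matrix with rows (1, 2, 3), (2, 5, 8), (1, 4, 7) can be spot-checked as follows. This is a sketch under stated assumptions: the helper names and the sample points are hypothetical, and testing a few points illustrates the claims without proving them.

```python
def apply(matrix, vector):
    """Matrix times column vector, by inner products of rows with the vector."""
    return [sum(a * v for a, v in zip(row, vector)) for row in matrix]

M = [[1, 2, 3], [2, 5, 8], [1, 4, 7]]

def in_plane_E(point):
    """Test the condition 3a - 2b + c = 0 that defines the plane E."""
    a, b, c = point
    return 3 * a - 2 * b + c == 0

def preimage(a, b, z):
    """A point P with f(P) = (a, b, 2b - 3a), from formulas (1') and (2')."""
    return [z - 2 * b + 5 * a, -2 * z + b - 2 * a, z]

samples = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [4, -3, 9]]
print(all(in_plane_E(apply(M, P)) for P in samples))  # values of f lie in E
print(apply(M, [1, -2, 1]))                            # V_0 is in the kernel
print(apply(M, preimage(2, 5, 7)))                     # a point of E is a value of f
```

Because the columns of M are the images of the standard basis vectors, checking those three vectors already shows that every value of f lies in E.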


Finally, a few remarks on questions which the above discussion may suggest.

1) As far as the ideas in this section are concerned, there is nothing special about $R^3$. We have discussed $R^3$ merely to avoid tedious notation, large matrices, and pointlessly long computations. Exactly the same methods apply, in exactly the same ways, when we deal with linear transformations $R^n \to R^n$.

2) In the examples that we have discussed, the image and kernel of a linear transformation have turned out to be subspaces. This is what always happens. (The proof is easy, on the basis of Theorem 4 of Section 11.3. You will find it hardly more trouble to write it yourself than to consult the following section.)

3) If you are acquainted with determinants, and with the process of solving linear systems by Cramer's rule, then you may suspect that the method used above, by which we convert the system to a triangular system, is naive or inefficient or both. But this is not true. In computation, triangularization is about as good a method as any.
PROBLEM SET 13.1

In each of the following problems, you are given a matrix $M$, describing a linear transformation $f$. If $f$ turns out to be invertible, compute $f^{-1}$. If not, find the image and the kernel. Thus the answer to each of the first ten problems below should be in one of the forms
M-1 = [

(a)

or
(b) /(R3) ={(a, b, c)

[H ]
. [HJ
[H :J

I
.

},

and

[[ !]
[: ; ]

[H i]

-3

+2

I}.

, [H ] ]
[- - -;
[h i] [! g ]

Ker/= {(x,y, z)

6.

10.

-2

570

ll.

Suppose that the matrix of/has the form


M=

12.

ll 2
a,
aa1 Olll 2 Olll3
f3a1 f3a2 flaa

"]
u
[T ]
["" g l

What sort of subspace is f(R3)? How about Ker/?


Same question, for matrices of the form

where a11a22a33
13.

13.2

Linear Transformations, Matrices, and Determinants

ll12
ll22 ll23 ,
ll33
0

0.

Same question, for

M=

14.

with a11a22a33 0.
Same question, for
M

15.

a21 ll22
ll31 ll32 ll33

[g

ll31

with a12a23a31 0.
Same question, for
M

13.2

ll12
0

ll12

rn

"

aa
J

"

aa

COMPOSITION OF LINEAR TRANSFORMATIONS


AND MULTIPLICATION OF MATRICES

The preceding section was devoted almost entirely to investigation of examples of linear transformations. We shall now develop some of the theory.

Theorem 1. Let $f$ be a linear transformation $R^m \to R^n$. Then the image $f(R^m)$ is a subspace of $R^n$, and the kernel $\operatorname{Ker} f$ is a subspace of $R^m$.

Proof. By Theorem 4 of Section 11.3, we merely need to show that these sets are closed under addition and scalar multiplication. For each $f(P)$, $f(Q)$, we have

\[ f(P) + f(Q) = f(P + Q); \]

and for each $f(P)$, $\alpha f(P) = f(\alpha P)$. Therefore $f(R^m)$ is a subspace. If $P$ and $Q$ belong to $\operatorname{Ker} f$, then

\[ f(P) = f(Q) = O, \]

and so

\[ f(P + Q) = f(P) + f(Q) = O + O = O, \]

and

\[ f(\alpha P) = \alpha f(P) = \alpha O = O. \]

Therefore $\operatorname{Ker} f$ is a subspace.

Theorem 2. If $f$ and $g$ are linear transformations $R^m \to R^n$, then so also are $f + g$ and $\alpha f$.

Here the sum and scalar product are defined by the obvious conditions

\[ (f + g)(P) = f(P) + g(P), \qquad (\alpha f)(P) = \alpha (f(P)). \]

The verification of linearity is trivial:

\[
\begin{aligned}
(f + g)(P + Q) &= f(P + Q) + g(P + Q) \\
&= f(P) + f(Q) + g(P) + g(Q) \\
&= (f + g)(P) + (f + g)(Q); \\
(f + g)(\alpha P) &= f(\alpha P) + g(\alpha P) \\
&= \alpha f(P) + \alpha g(P) \\
&= \alpha (f(P) + g(P)) \\
&= \alpha (f + g)(P).
\end{aligned}
\]

This gives:

Theorem 3. For each $m$ and $n$, the linear transformations $f: R^m \to R^n$ form a vector space.

Theorem 4. If $g: R^m \to R^n$ and $f: R^n \to R^p$ are linear, then the composite function

\[ f(g): R^m \to R^p \]

is also linear.

The verification is straightforward:

\[
\begin{aligned}
f[g(P + Q)] &= f[g(P) + g(Q)] = f[g(P)] + f[g(Q)] = f(g)(P) + f(g)(Q); \\
f[g(\alpha P)] &= f[\alpha g(P)] = \alpha f[g(P)].
\end{aligned}
\]

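We found, in the preceding section, that the action of a linear transformation $R^3 \to R^3$ could be described by a 3 by 3 matrix. Similarly, the action of a linear transformation $g: R^m \to R^n$ can be described by an $n$ by $m$ matrix. The scheme is as follows. If $E_1, E_2, \ldots, E_m$ are the standard basis vectors of $R^m$, then for each $j$ we have

\[ g(E_j) = b_{1j} E_1 + b_{2j} E_2 + \cdots + b_{nj} E_n, \]

where the vectors $E_i$ on the right are the standard basis vectors of $R^n$. Therefore, for $P = (x_1, x_2, \ldots, x_m)$ in $R^m$, we have

\[ g(P) = x_1 g(E_1) + x_2 g(E_2) + \cdots + x_m g(E_m). \]

Adding by columns, to get the total coefficient of each $E_i$ on the right, we get

\[ g(P) = \sum_{i=1}^{n} \left( \sum_{j=1}^{m} b_{ij} x_j \right) E_i = y_1 E_1 + y_2 E_2 + \cdots + y_n E_n. \]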

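Describing $P$ and $g(P)$ as column vectors, and representing $g$ by the matrix with $b_{ij}$ in the $i$th row and $j$th column, we get

\[ \begin{bmatrix} b_{11} & b_{12} & \cdots & b_{1m} \\ b_{21} & b_{22} & \cdots & b_{2m} \\ \vdots & \vdots & & \vdots \\ b_{n1} & b_{n2} & \cdots & b_{nm} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_m \end{bmatrix} = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}, \]

which has the form

\[ M_g \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_m \end{bmatrix} = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}. \]

Here the pattern of the operation is the same as in the case $m = n = 3$: to get $y_i$ in the column vector on the right, we regard the $i$th row of the matrix $M_g$ as a vector, and form its inner product with the column vector.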
Now if $g$ and $f$ are as in the preceding theorem, then each of the transformations $g$, $f$, and $f(g)$ can be described by a matrix. Let these matrices be

\[ M_g = [b_{ij}] \quad (n \text{ by } m), \qquad M_f = [a_{ij}] \quad (p \text{ by } n), \qquad M_{f(g)} = [c_{ij}] \quad (p \text{ by } m). \]

Here $[a_{ij}]$ is a shorthand for the matrix with the number $a_{ij}$ in the $i$th row and the $j$th column, and similarly for $[b_{ij}]$ and $[c_{ij}]$. We define the product of two matrices, in this case, to be the matrix of the composite function. That is,

Definition. Given linear transformations

\[ g: R^m \to R^n, \qquad f: R^n \to R^p, \qquad f(g): R^m \to R^p, \]

with associated matrices $M_f$, $M_g$, $M_{f(g)}$. By definition,

\[ M_f M_g = M_{f(g)}. \]

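We shall now get a formula for the product $M_f M_g$ of two matrices. The general formula looks complicated, but its pattern is easy to see by an examination of the case $m = n = p = 2$. Let the matrices of the transformations $f$, $g$, and $f(g)$ be

\[ M_f = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}, \qquad M_g = \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix}, \qquad M_{f(g)} = \begin{bmatrix} c_{11} & c_{12} \\ c_{21} & c_{22} \end{bmatrix}. \]

Then

\[ M_g \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} b_{11} x_1 + b_{12} x_2 \\ b_{21} x_1 + b_{22} x_2 \end{bmatrix} = \begin{bmatrix} y_1 \\ y_2 \end{bmatrix}, \]

and so

\[
\begin{aligned}
M_{f(g)} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = M_f M_g \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = M_f \begin{bmatrix} y_1 \\ y_2 \end{bmatrix}
&= \begin{bmatrix} a_{11} y_1 + a_{12} y_2 \\ a_{21} y_1 + a_{22} y_2 \end{bmatrix} \\
&= \begin{bmatrix} a_{11}(b_{11} x_1 + b_{12} x_2) + a_{12}(b_{21} x_1 + b_{22} x_2) \\ a_{21}(b_{11} x_1 + b_{12} x_2) + a_{22}(b_{21} x_1 + b_{22} x_2) \end{bmatrix} \\
&= \begin{bmatrix} (a_{11} b_{11} + a_{12} b_{21}) x_1 + (a_{11} b_{12} + a_{12} b_{22}) x_2 \\ (a_{21} b_{11} + a_{22} b_{21}) x_1 + (a_{21} b_{12} + a_{22} b_{22}) x_2 \end{bmatrix} \\
&= \begin{bmatrix} a_{11} b_{11} + a_{12} b_{21} & a_{11} b_{12} + a_{12} b_{22} \\ a_{21} b_{11} + a_{22} b_{21} & a_{21} b_{12} + a_{22} b_{22} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}.
\end{aligned}
\]

Therefore $M_f M_g$ is the 2 by 2 matrix in the last formula. The pattern of the operation is clear: to get the number $c_{ij}$, in the $i$th row and $j$th column of the product, we regard the $i$th row of $M_f$ and the $j$th column of $M_g$ as vectors, and compute their inner product. That is,

\[ c_{ij} = a_{i1} b_{1j} + a_{i2} b_{2j}. \]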

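This is called the row by column rule of matrix multiplication. The same rule applies in the general case, and the only problem is that the formulas are complicated to write down. We have

\[ M_g \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_m \end{bmatrix} = \begin{bmatrix} b_{11} & b_{12} & \cdots & b_{1m} \\ b_{21} & b_{22} & \cdots & b_{2m} \\ \vdots & \vdots & & \vdots \\ b_{n1} & b_{n2} & \cdots & b_{nm} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_m \end{bmatrix} = \begin{bmatrix} \sum_{j=1}^{m} b_{1j} x_j \\ \sum_{j=1}^{m} b_{2j} x_j \\ \vdots \\ \sum_{j=1}^{m} b_{nj} x_j \end{bmatrix} = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}. \]

Here $M_g(P) = (y_1, y_2, \ldots, y_n)$, where

\[ y_i = \sum_{j=1}^{m} b_{ij} x_j \qquad (i = 1, 2, \ldots, n). \]

Therefore

\[ M_f \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{p1} & a_{p2} & \cdots & a_{pn} \end{bmatrix} \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix} = \begin{bmatrix} \sum_{i=1}^{n} a_{1i} y_i \\ \sum_{i=1}^{n} a_{2i} y_i \\ \vdots \\ \sum_{i=1}^{n} a_{pi} y_i \end{bmatrix} = \begin{bmatrix} \sum_{i=1}^{n} a_{1i} \left( \sum_{j=1}^{m} b_{ij} x_j \right) \\ \sum_{i=1}^{n} a_{2i} \left( \sum_{j=1}^{m} b_{ij} x_j \right) \\ \vdots \\ \sum_{i=1}^{n} a_{pi} \left( \sum_{j=1}^{m} b_{ij} x_j \right) \end{bmatrix} = \begin{bmatrix} z_1 \\ z_2 \\ \vdots \\ z_p \end{bmatrix}. \]

Here, for each $k$ from 1 to $p$, we have

\[ z_k = \sum_{j=1}^{m} c_{kj} x_j \qquad (k = 1, 2, \ldots, p), \]

where

\[ c_{kj} = \sum_{i=1}^{n} a_{ki} b_{ij}. \]

Thus $M_f M_g = [c_{kj}]$. But the above formula for $c_{kj}$ says that $c_{kj}$ is the inner product of the $k$th row of $M_f$ and the $j$th column of $M_g$; these are the row vector and column vector

\[ (a_{k1}, a_{k2}, \ldots, a_{kn}), \qquad \begin{bmatrix} b_{1j} \\ b_{2j} \\ \vdots \\ b_{nj} \end{bmatrix}, \]

and their inner product is

\[ c_{kj} = \sum_{i=1}^{n} a_{ki} b_{ij}. \]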
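The row by column rule, and the fact that the product matrix describes the composite function, can be sketched as follows. The function names and the two small matrices are hypothetical examples chosen for illustration; they are not taken from the text.

```python
def mat_vec(matrix, vector):
    """Apply a matrix to a column vector (inner product of each row with it)."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]

def mat_mul(A, B):
    """Row by column rule: c_kj is the sum over i of a_ki * b_ij."""
    inner = len(B)
    assert all(len(row) == inner for row in A), "columns of A must match rows of B"
    return [[sum(A[k][i] * B[i][j] for i in range(inner))
             for j in range(len(B[0]))]
            for k in range(len(A))]

# Hypothetical matrices: M_g is 3 by 2 (g: R^2 -> R^3),
# M_f is 2 by 3 (f: R^3 -> R^2), so M_f M_g is 2 by 2.
M_g = [[1, 2], [0, -1], [3, 1]]
M_f = [[2, 0, 1], [-1, 4, 0]]
P = [5, -2]

product = mat_mul(M_f, M_g)
print(product)  # [[5, 5], [-1, -6]]
# Applying the product to P agrees with applying g first and then f.
print(mat_vec(product, P) == mat_vec(M_f, mat_vec(M_g, P)))
```

The size check inside `mat_mul` reflects the requirement that the image space of g be the domain of f: an n by m matrix can only be multiplied on the left by a matrix with n columns.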

For example, in the product

[matrix display missing]

$c_{22}$ is the inner product of $(-2, 0, 2)$ and $(3, 2, 1)$. Therefore $c_{22} = -4$. The complete calculation gives the answer

[matrix display missing]

(This should be checked.)

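Similarly, we define the sum of two matrices to be the matrix of the sum:

Definition. Given the linear transformations $f: R^m \to R^n$ and $g: R^m \to R^n$, with matrices $M_f$ and $M_g$. Then

\[ M_f + M_g = M_{f+g}. \]

The sum is easy to calculate; the simplest possible idea works. Let

\[ M_f = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1m} \\ a_{21} & a_{22} & \cdots & a_{2m} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nm} \end{bmatrix}, \qquad M_g = \begin{bmatrix} b_{11} & b_{12} & \cdots & b_{1m} \\ b_{21} & b_{22} & \cdots & b_{2m} \\ \vdots & \vdots & & \vdots \\ b_{n1} & b_{n2} & \cdots & b_{nm} \end{bmatrix}. \]

Then for $P = (x_1, x_2, \ldots, x_m)$, we have

\[ f(P) = \begin{bmatrix} \sum_{j=1}^{m} a_{1j} x_j \\ \sum_{j=1}^{m} a_{2j} x_j \\ \vdots \\ \sum_{j=1}^{m} a_{nj} x_j \end{bmatrix}, \qquad g(P) = \begin{bmatrix} \sum_{j=1}^{m} b_{1j} x_j \\ \sum_{j=1}^{m} b_{2j} x_j \\ \vdots \\ \sum_{j=1}^{m} b_{nj} x_j \end{bmatrix}. \]

Therefore

\[ (f + g)(P) = f(P) + g(P) = \begin{bmatrix} \sum_{j=1}^{m} (a_{1j} + b_{1j}) x_j \\ \sum_{j=1}^{m} (a_{2j} + b_{2j}) x_j \\ \vdots \\ \sum_{j=1}^{m} (a_{nj} + b_{nj}) x_j \end{bmatrix}. \]

The matrix which gives this result in one step is

\[ [c_{ij}] = [a_{ij} + b_{ij}] = [a_{ij}] + [b_{ij}]; \]

\[ \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1m} \\ a_{21} & a_{22} & \cdots & a_{2m} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nm} \end{bmatrix} + \begin{bmatrix} b_{11} & b_{12} & \cdots & b_{1m} \\ b_{21} & b_{22} & \cdots & b_{2m} \\ \vdots & \vdots & & \vdots \\ b_{n1} & b_{n2} & \cdots & b_{nm} \end{bmatrix} = \begin{bmatrix} a_{11} + b_{11} & a_{12} + b_{12} & \cdots & a_{1m} + b_{1m} \\ a_{21} + b_{21} & a_{22} + b_{22} & \cdots & a_{2m} + b_{2m} \\ \vdots & \vdots & & \vdots \\ a_{n1} + b_{n1} & a_{n2} + b_{n2} & \cdots & a_{nm} + b_{nm} \end{bmatrix}. \]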

PROBLEM SET 13.2

Carry out the indicated operations, expressing each answer as a matrix (which may, of

course, turn out to be an

1.

4.

7.

10.

13.

16.

19.

[ iJ[: :J
[
rn
[!
n
[
[l

2
5
8
0
2
0

0
0
4
5
6
1
0
0
1

3
0

by 1 matrix, that is, a column vector).

[ m: ]
[! m i]
f]
[ mi i]
f
ff f
[! ff""
[ m: ]
2

2.

m !]
urn
!][]
m !]
n
m ]
0
1

5.

8.

11.

14.

10

3.

[ m !J
[! !J[l]
[ mf J
G f
[l m ]
[' i][i ]
1
1

9.

12.

0
0
l

15.

18.

20.

[; ][ ]
0

6.

17.

0
0

[3

21.

15

18

14

Formal Properties of the Algebra of Matrices.

13.3

22.

24.

26.

[l mf i l]
[ f 1J
4

23.

1
0

25.

1
0

27.

[ _l

][

2
I

-6

Groups and Rings

-6

577

-2
4

-1
I

-1

[ J
[ T
0

1
0

For each of the following four matrices, find the inverse $M^{-1}$ if there is an inverse; if not, give the simplest reason that you can for concluding that no inverse exists.
28.

[
2

*32. Let

29.

10

11

For how many 2 by 2 matrices

is it true that MM

12

30.

12

10

11

6
0

[ l

4]
8

12
0

31.

[ ]

[ ]

12? (It is easy to find two such "square roots" of 12 The question

is whether there are others, and if so, what they are.)

33. Let

\[ O_2 = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}. \]

This matrix acts like 0, in that for every 2 by 2 matrix $M$, we have

\[ O_2 + M = M + O_2 = M. \]

Question: If $A$ and $B$ are 2 by 2 matrices, and $AB = O_2$, does it follow that $A = O_2$ or $B = O_2$?

34. If $A$ and $B$ are 2 by 2 matrices and $AB = O_2$, does it follow that $BA = O_2$?
13.3 FORMAL PROPERTIES OF THE
ALGEBRA OF MATRICES. GROUPS AND RINGS

In the preceding problem set, you found that addition and multiplication of matrices were analogous in some ways, but not in others, to addition and multiplication of real and complex numbers. We shall now investigate the algebra of matrices systematically, and find out how far the analogy goes.

Throughout this section we shall be concerned only with square matrices. The set of all $n$ by $n$ matrices is denoted by $\mathcal{M}_n$. In our investigation of the formal properties of $\mathcal{M}_n$, under addition and multiplication, it will not be very useful to think about square arrays of numbers; the ideas are much easier to see if we work with the linear transformations $f$ that the matrices represent.
Definition. $\mathcal{L}_n$ is the set of all linear transformations $f: R^n \to R^n$.

We found, in Theorem 3 of Section 13.2, that $\mathcal{L}_n$ forms a vector space, under addition and scalar multiplication. Therefore, in particular, we have:

C.1. $\mathcal{L}_n$ is closed under addition.

A.1. Addition in $\mathcal{L}_n$ is associative.

A.2 (Existence of zero). There is an element $f_0$ of $\mathcal{L}_n$ such that $f_0 + g = g + f_0 = g$ for every $g$.

(Obviously $f_0$ is the linear transformation such that $f_0(P) = (0, 0, \ldots, 0)$ for every $P$ in $R^n$.)

A.3 (Existence of negatives). For each $f$ in $\mathcal{L}_n$ there is a $-f$ such that $f + (-f) = (-f) + f = f_0$.

A.4. $f + g = g + f$, for every $f$ and $g$.

A pair $[\mathcal{L}, +]$ is called a group if the operation $+$ satisfies C.1, A.1, A.2, and A.3. If A.4 is also satisfied, then $[\mathcal{L}, +]$ is called a commutative group. We can therefore sum up as follows:

Theorem 1. For each $n$, $[\mathcal{L}_n, +]$ is a commutative group.

We defined multiplication for matrices by composition of functions. That is,


M1Mu

M1tu>,

by definition. We therefore need to investigate composition of functions in !l'".


For the sake of convenience, we shall denote the composite function f (g) by the
notation f g . Theorem 4 of Section 13.2 tells us that iff and g are linear: R" ---* Rn,
then so also is f o g. This gives
o

C.2. The set !l'n is closed under the operation

o.

For each $f$, $g$, $h$ in $\mathscr{L}_n$, $f \circ (g \circ h)$ is the composition of $f$ and $g(h)$; and $(f \circ g) \circ h$ is the composition of $f(g)$ and $h$. These give the same answer:

\[ f \circ (g \circ h) = (f \circ g) \circ h. \]

The reason is shown by a diagram:

[Figure: diagram of the composite functions, with arrows labeled $f \circ g$ and $h$.]

Starting at any point $w$, we get to the same point $z$, no matter how the functions $f$, $g$, $h$ are grouped. Therefore we have:

M.1. In $\mathscr{L}_n$, the operation $\circ$ is associative.

(Obviously this has nothing to do with linearity; composition of functions is always associative, regardless of the domains, or the ranges, or the nature of the functions.)

M.2 (Existence of unity). There is an $f_1$ in $\mathscr{L}_n$ such that

\[ f_1 \circ g = g \circ f_1 = g \]

for every $g$.

This $f_1$ acts like the number 1, under our "multiplication." Obviously $f_1$ is the "identity" function, such that $f_1(P) = P$ for every $P$. We call $f_1$ the unit element.

So far, C.2, M.1, and M.2 are precisely analogous to C.1, A.1, and A.2; these conditions say the same things, about addition in one case and "multiplication" in the other. But the analogy now breaks down: not every linear function $f$ has an inverse $f^{-1}$, and composition of functions is not, in general, commutative. (We have seen many examples of both of these.) But we do have:
D.1 (The distributive law). In $\mathscr{L}_n$, $f \circ (g + h) = f \circ g + f \circ h$.

The reason is that $f \circ g$ is $f(g)$; and $f(g + h) = f(g) + f(h)$ because $f$ is linear.

A system satisfying all the conditions that we have mentioned so far is called a ring. More precisely:

Definition. Given a set $\mathscr{L}$, with two operations $+$ and $\circ$. The system $[\mathscr{L}, +, \circ]$ is a ring if the following conditions are satisfied:

R.1. The pair $[\mathscr{L}, +]$ is a commutative group.

R.2. The set $\mathscr{L}$ is closed under $\circ$; $\circ$ is associative; and $\mathscr{L}$ contains a unit element.

R.3. The operation $\circ$ is distributive over $+$. (That is, $f \circ (g + h) = f \circ g + f \circ h$.)

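All this discussion carries over immediately to the set $\mathscr{M}_n$ of $n$ by $n$ matrices, since $M_f + M_g = M_{f+g}$ and $M_f M_g = M_{f \circ g}$. This gives:

Theorem 2. For each $n$, the system $[\mathscr{M}_n, +, \cdot]$ is a ring.

Here $\cdot$ is used to denote matrix multiplication. It is easy to see that the zero element of $\mathscr{M}_n$ is

\[ \begin{bmatrix} 0 & 0 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 0 \end{bmatrix} \]

and that the "unit element" of $\mathscr{M}_n$ is the matrix

\[ I_n = \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix} \]

with 1's on the main diagonal and 0's everywhere else. There is a shorthand for this: we define

\[ \delta_{ij} = \begin{cases} 1 & \text{for } i = j, \\ 0 & \text{for } i \ne j. \end{cases} \]

We then have

\[ I_n = [\delta_{ij}]. \]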

If you try proving Theorem 2 directly, carrying out calculations with $n$ by $n$ arrays of numbers, you will see that the use of the system $\mathscr{L}$ of linear transformations offered great advantages; most of the proofs were easier to write down than even one $n$ by $n$ matrix. In particular, a direct verification of the associativity of matrix multiplication, using the formula for the product of two matrices, would be extremely tedious.
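The ring laws for matrices can at least be spot-checked numerically. The following is a minimal sketch of our own (the helper names `mat_add`, `mat_mul`, `identity`, and `random_matrix` are hypothetical, and a check on a few random integer matrices is of course not a proof). It tests associativity, the distributive law, and the unit element, and it exhibits two matrices that do not commute.

```python
import random

def mat_add(A, B):
    """Entrywise sum of two square matrices given as lists of rows."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def mat_mul(A, B):
    """Matrix product, computed from the row-by-column formula."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def identity(n):
    """The matrix with 1's on the main diagonal and 0's elsewhere."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def random_matrix(n, rng):
    """An n by n matrix of small integers (exact arithmetic, no rounding)."""
    return [[rng.randint(-5, 5) for _ in range(n)] for _ in range(n)]

rng = random.Random(0)          # fixed seed, so the run is repeatable
n = 3
F, G, H = (random_matrix(n, rng) for _ in range(3))
I = identity(n)

assoc = mat_mul(F, mat_mul(G, H)) == mat_mul(mat_mul(F, G), H)
distrib = mat_mul(F, mat_add(G, H)) == mat_add(mat_mul(F, G), mat_mul(F, H))
unit = mat_mul(I, G) == G and mat_mul(G, I) == G

# Multiplication is not commutative: these two matrices do not commute.
A = [[0, 1], [0, 0]]
B = [[0, 0], [1, 0]]
commute = mat_mul(A, B) == mat_mul(B, A)

print(assoc, distrib, unit, commute)
```

Integer entries are used so that every comparison is exact.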
A final remark, on the notation used in describing a group. In this section, the
group operation is denoted by +. This is partly because addition was what we meant,
in the case that we were discussing. Also it is customary to use the symbol + when the
operation is commutative. More generally, however, we can state the conditions for a
group as follows:

Definition. Given a set $G$ and an operation $*$. The system $[G, *]$ is a group if the following conditions are satisfied:

Cl (Closure). $G$ is closed under $*$.

(Associativity). $a * (b * c) = (a * b) * c$, always.

EU (Existence of Unity). There is an element $e$ of $G$ such that $e * a = a * e = a$, for every $a$.

EI (Existence of Inverses). For each $a$ in $G$ there is an element $a^{-1}$ of $G$ such that $a * a^{-1} = a^{-1} * a = e$.

As before, the group is commutative if it satisfies:

Com (Commutativity). $a * b = b * a$, always.

A field is a system $[F, +, \cdot]$ which satisfies all the conditions which were stated for the real number system in Section 1.1. Thus $[F, +, \cdot]$ is a field if (1) $[F, +, \cdot]$ is a ring, (2) multiplication is commutative, and (3) every $x \ne 0$ has an inverse $x^{-1}$, such that $x \cdot x^{-1} = 1$.

Obviously the real-number system furnishes examples of all the ideas that we have been talking about in this section: $[R, +, \cdot]$ is both a ring and a field, and $[R, +]$ is a group. But if the real-number system were the only algebraic system that we were concerned with, there would be no advantage in using the terms group, ring, and field. The advantage is in other connections: already we have been dealing with vector spaces, which form groups (under addition), but do not form rings or fields; and from now on, we shall be dealing with (a) groups which are not rings, (b) rings which are not commutative, (c) commutative rings which are not fields, and so on.

To find our way around in this variety of algebraic systems, we need a language in
which we can explain briefly and clearly what sort of system we are dealing with at a
given moment.
PROBLEM SET 13.3

1. Let $G$ be the set of all complex numbers of the form

\[ z = \cos\theta + i \sin\theta \qquad (\theta \text{ in } R). \]

Which, if any, of the following statements are true, and why, in each case?
a) $[G, +]$ is a group.
b) $[G, \cdot]$ is a group.
c) $[G, +, \cdot]$ is a field.

2. Let $G$ be the set of all "pure imaginary" numbers, of the form $z = iy$ ($y$ in $R$). Discuss as in Problem 1.

3. Let $G$ be the set of all complex numbers of the form $m + in$, where $m$ and $n$ are integers. Discuss as in Problem 1.

*4. Same problem, where $m$ and $n$ are rational numbers.

5. Show that if $a$ and $b$ are rational, and $a + b\sqrt{2} = 0$, then $a = b = 0$.

6. Let $G$ be the set of all real numbers of the form $a + b\sqrt{2}$, where $a$ and $b$ are rational. Discuss as in Problem 1.

7. In this problem, you may regard it as known that $\pi$ is not a root of any linear or quadratic equation with rational coefficients. Let $G$ be the set of all real numbers of the form $a + b\pi$, where $a$ and $b$ are rational. Is $[G, +, \cdot]$ a ring?

8. A permutation matrix is a square matrix with exactly one 1 in each row, exactly one 1 in each column, and 0's everywhere else. The set of all $n$ by $n$ permutation matrices is denoted by $P_n$. Show that $[P_2, \circ]$ is a group, and write a multiplication table for the group, in the form

\[ \begin{array}{c|cc} \circ & I_2 & A \\ \hline I_2 & & \\ A & & \end{array} \]

What is the effect of a matrix in $P_2$ on a vector $P = (x_1, x_2)$?

9. Now carry out the same process for $[P_3, \circ]$. (You need not write a complete multiplication table, but you should find out whether the group is commutative.)

*10. An upper triangular matrix is a matrix $[a_{ij}]$ with $a_{ij} = 0$ for $j < i$. Denote the set of all $n$ by $n$ upper triangular matrices by $\mathscr{U}_n$. Show $[\mathscr{U}_n, +]$ is a group, but $[\mathscr{U}_n, \circ]$ is not.

11. Show that if $f$ and $g$ are in $\mathscr{L}_n$, and $f$ and $g$ are invertible, then $f \circ g$ is also invertible. (The proof should be direct: you should produce a function which is the inverse of $f \circ g$.)

12. Let $GL(n)$ be the set of all invertible transformations in $\mathscr{L}_n$. Show that $[GL(n), \circ]$ is a group. (This is called the general linear group.) Then show that $[GL(n), +, \circ]$ is not a ring.

*13. Let $GLU(n)$ be the set of all upper triangular $n \times n$ matrices $[a_{ij}]$, with $a_{11} a_{22} \cdots a_{nn} \ne 0$. Show that $[GLU(n), \circ]$ is a group, but $[GLU(n), +, \circ]$ is not a ring.

13.4  THE DETERMINANT FUNCTION

The determinant function assigns, to every square matrix, a real number. The
definition of this function begins as follows.

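Given an $n$ by $n$ matrix

\[ M = [a_{ij}] = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix}, \]

we take all possible products of the form

\[ a_{1j_1} a_{2j_2} \cdots a_{nj_n}, \]

using exactly one element from each row and exactly one element from each column. Thus the numbers

\[ j_1, j_2, \ldots, j_n \]

are all different. To each of these products we attach a $+$ or $-$ sign, according to a rule which will be stated presently. We then take the sum

\[ \sum \pm a_{1j_1} a_{2j_2} \cdots a_{nj_n} \]

of all terms which can be formed according to the above rules. This sum is called the determinant of the matrix, and is denoted by $\det M$, or $\det [a_{ij}]$. Thus, when we have explained how the sign is to be chosen for each term, we shall have a function

\[ \det: \mathscr{M}_n \to R. \]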

The rule for the signs takes time to explain and justify. With each term

\[ \pm a_{1j_1} a_{2j_2} \cdots a_{nj_n} \]

there is associated a function

\[ p: I_n \to I_n, \]

where

\[ I_n = \{1, 2, 3, \ldots, n\}. \]

Here $p(i) = j_i$; that is, $p(i)$ is the column number of the element $a_{ij_i}$ that we chose from the $i$th row. We can describe such a function $p$ by a diagram in the following form:

\[ p = \begin{pmatrix} 1 & 2 & 3 & \cdots & n \\ j_1 & j_2 & j_3 & \cdots & j_n \end{pmatrix}, \]

with the numbers $i$ in the top line and the numbers $p(i)$ below them. For example,

\[ \begin{pmatrix} 1 & 2 & 3 & 4 \\ 3 & 1 & 4 & 2 \end{pmatrix} \]

is the function under whose action

\[ 1 \mapsto 3, \quad 2 \mapsto 1, \quad 3 \mapsto 4, \quad 4 \mapsto 2. \]

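A one-to-one function $p: I_n \to I_n$ is called a permutation. When permutations are described in the two-line notation, the order of the columns does not matter; all that matters is what is under what. For example,

\[ \begin{pmatrix} 1 & 2 & 3 & 4 \\ 3 & 1 & 4 & 2 \end{pmatrix} = \begin{pmatrix} 4 & 3 & 2 & 1 \\ 2 & 4 & 1 & 3 \end{pmatrix} = \begin{pmatrix} 1 & 3 & 2 & 4 \\ 3 & 4 & 1 & 2 \end{pmatrix}, \]

and so on.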
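If $p$ interchanges two integers $a$ and $b$, and leaves every other integer in $I_n$ fixed, then $p$ is called a transposition. For example,

\[ p = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 3 & 2 & 1 & 4 \end{pmatrix} \]

is a transposition; it interchanges 1 and 3. We denote this permutation by the shorthand $(13)$. In general, $(ab)$ is the permutation which interchanges $a$ and $b$.

Theorem 1. Every permutation can be expressed as a product of transpositions.

Here the word product is used in the sense of composition of functions. For example, for

\[ p = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 3 & 1 & 4 & 2 \end{pmatrix}, \]

we can use the following transpositions:

\[ \begin{pmatrix} 1 & 2 & 3 & 4 \\ 1 & 2 & 3 & 4 \end{pmatrix} \xrightarrow{(13)} \begin{pmatrix} 1 & 2 & 3 & 4 \\ 3 & 2 & 1 & 4 \end{pmatrix} \xrightarrow{(21)} \begin{pmatrix} 1 & 2 & 3 & 4 \\ 3 & 1 & 2 & 4 \end{pmatrix} \xrightarrow{(24)} \begin{pmatrix} 1 & 2 & 3 & 4 \\ 3 & 1 & 4 & 2 \end{pmatrix}. \]

We now have

\[ p = (24)(21)(13). \]

As always, for composition of functions, the operations are performed in the order from right to left. Thus

\[ 1 \xrightarrow{(13)} 3 \xrightarrow{(21)} 3 \xrightarrow{(24)} 3, \]
\[ 2 \xrightarrow{(13)} 2 \xrightarrow{(21)} 1 \xrightarrow{(24)} 1, \]
\[ 3 \xrightarrow{(13)} 1 \xrightarrow{(21)} 2 \xrightarrow{(24)} 4, \]
\[ 4 \xrightarrow{(13)} 4 \xrightarrow{(21)} 4 \xrightarrow{(24)} 2. \]

This checks, with

\[ 1 \mapsto 3, \quad 2 \mapsto 1, \quad 3 \mapsto 4, \quad 4 \mapsto 2. \]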

The scheme that we used on this example always works. Given

\[ p = \begin{pmatrix} 1 & 2 & 3 & \cdots & n \\ j_1 & j_2 & j_3 & \cdots & j_n \end{pmatrix}, \]

we first take the transposition $(1\, j_1)$; this puts $j_1$ under 1, where we want it to be. To the resulting sequence, we apply a transposition which puts $j_2$ in the second position, and so on. A further example:

\[ p = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 \\ 2 & 5 & 4 & 1 & 7 & 3 & 6 \end{pmatrix}. \]

Here we could use the following stages (each line shows the bottom row after the transposition on the left has been applied):

\[ \begin{array}{cl} & 1\ 2\ 3\ 4\ 5\ 6\ 7 \\ (12) & 2\ 1\ 3\ 4\ 5\ 6\ 7 \\ (15) & 2\ 5\ 3\ 4\ 1\ 6\ 7 \\ (34) & 2\ 5\ 4\ 3\ 1\ 6\ 7 \\ (31) & 2\ 5\ 4\ 1\ 3\ 6\ 7 \\ (37) & 2\ 5\ 4\ 1\ 7\ 6\ 3 \\ (36) & 2\ 5\ 4\ 1\ 7\ 3\ 6 \end{array} \]

It is not claimed, in Theorem 1, that every $p$ can be expressed in only one way as a product of transpositions; and in fact this is not true. For example, the above diagram gives

\[ p = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 \\ 2 & 5 & 4 & 1 & 7 & 3 & 6 \end{pmatrix} = (36)(37)(31)(34)(15)(12). \]

But it is also true that

\[ p = (36)(73)(16)(43)(57)(24)(16)(27)(14)(57), \]

which looks different, and uses ten transpositions instead of six. Nevertheless all such expressions for a given $p$ have a common property, now to be described.

A permutation $p$ is called even if it can be expressed as the product of an even number of transpositions; $p$ is odd if $p$ is the product of an odd number of transpositions.

Theorem 2. No permutation is both odd and even.

Proof. The alternating function on $n$ variables is defined by the formula

\[ f(x_1, x_2, \ldots, x_n) = \prod_{i<j} (x_i - x_j), \]

where the expression on the right is the product of all differences $x_i - x_j$ for which $i < j$. (Analogously, we might use $\sum_i x_i$ to denote $\sum_{i=1}^{n} x_i$.) For example,

\[ f(x_1, x_2, x_3) = (x_1 - x_2)(x_1 - x_3)(x_2 - x_3), \]

and

\[ f(x_1, x_2, x_3, x_4) = (x_1 - x_2)(x_1 - x_3)(x_1 - x_4)(x_2 - x_3)(x_2 - x_4)(x_3 - x_4). \]

Now consider what happens to $f$ when we apply the transposition $(ij)$, with $i < j$, thus interchanging $x_i$ and $x_j$. Factors that involve neither $x_i$ nor $x_j$ are not affected. The remaining factors of $f$ are of the following types:

1) $(x_r - x_i)$, $(x_r - x_j)$ $\quad (r < i)$,
2) $(x_i - x_s)$, $(x_s - x_j)$ $\quad (i < s < j)$,
3) $(x_i - x_t)$, $(x_j - x_t)$ $\quad (j < t)$,
4) $(x_i - x_j)$.

When we apply the transposition $(ij)$, interchanging $x_i$ and $x_j$, the effect is

1) $(x_r - x_i) \mapsto (x_r - x_j)$, $\quad (x_r - x_j) \mapsto (x_r - x_i)$;
2) $(x_i - x_s) \mapsto (x_j - x_s) = -(x_s - x_j)$, $\quad (x_s - x_j) \mapsto (x_s - x_i) = -(x_i - x_s)$;
3) $(x_i - x_t) \mapsto (x_j - x_t)$, $\quad (x_j - x_t) \mapsto (x_i - x_t)$;
4) $(x_i - x_j) \mapsto (x_j - x_i) = -(x_i - x_j)$.

Thus the factors of the first three types fit together in pairs, and in each case, the products of the pairs are left unchanged. But the sign of $x_i - x_j$ is changed, and so the sign of $f$ is changed. Briefly,

\[ (ij) f(x_1, x_2, \ldots, x_n) = -f(x_1, x_2, \ldots, x_n), \]

or, more briefly still,

\[ (ij) f = -f. \]

It follows that $pf = f$ if $p$ is an even permutation, and $pf = -f$ if $p$ is an odd permutation. No permutation can have both these effects, and so the theorem follows.
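Both the decomposition scheme and the sign of the alternating function are easy to mechanize. The sketch below is ours, not part of the text's development: a permutation is stored as a tuple whose $i$th entry is $p(i)$, the hypothetical function `decompose` follows the scheme described above (fix position 1, then position 2, and so on), and `alternating_sign` counts the factors $x_i - x_j$ whose sign is reversed, which is one standard way to evaluate the effect of $p$ on $f$.

```python
from itertools import combinations

def decompose(p):
    """Transpositions, in the order applied, that build p from the identity.

    p is a tuple with p[i-1] = p(i). Since the first transposition listed
    is applied first, p is the product of the list read from right to left.
    """
    n = len(p)
    current = list(range(1, n + 1))      # bottom row of the identity
    steps = []
    for pos in range(n):
        a, b = current[pos], p[pos]
        if a != b:
            # Applying (a b) after the work so far swaps the values a and b.
            current = [b if v == a else a if v == b else v for v in current]
            steps.append((a, b))
    return steps

def apply_steps(steps, n):
    """Send each of 1..n through the transpositions in the order given."""
    result = []
    for i in range(1, n + 1):
        v = i
        for a, b in steps:
            if v == a:
                v = b
            elif v == b:
                v = a
        result.append(v)
    return tuple(result)

def alternating_sign(p):
    """+1 if p leaves the alternating function unchanged, -1 if it reverses it."""
    inversions = sum(1 for i, j in combinations(range(len(p)), 2) if p[i] > p[j])
    return -1 if inversions % 2 else 1

p4 = (3, 1, 4, 2)
p7 = (2, 5, 4, 1, 7, 3, 6)
print(decompose(p4), alternating_sign(p4))
print(decompose(p7), alternating_sign(p7))
```

For the two examples of the text this reproduces the transpositions (13), (21), (24) and the six transpositions of the second example, and in each case the parity of the number of transpositions agrees with the sign computed from the alternating function.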
On this basis, we can finally define the determinant function:
\[ \det [a_{ij}] = \sum \pm a_{1j_1} a_{2j_2} \cdots a_{nj_n}, \]

where we use $+$ if the permutation

\[ p = \begin{pmatrix} 1 & 2 & 3 & \cdots & n \\ j_1 & j_2 & j_3 & \cdots & j_n \end{pmatrix} \]

is even, and $-$ if $p$ is odd.


In practice, when we have developed some of the theory, we shall never have to
make direct use of the above definition; and this is fortunate, because the definition
is even more tedious to handle than one might think. In order to form a term of
$\det [a_{ij}]$, for an $n$ by $n$ matrix, we have to choose an element from each row, in such a way as never to use the same column twice. Thus we have $n$ possibilities to choose from in the first row; there are then $n - 1$ possibilities in the second row; and so on. Therefore the total number of terms is

\[ n(n - 1)(n - 2) \cdots 3 \cdot 2 \cdot 1 = n!. \]

Therefore the determinant of an $n$ by $n$ matrix is the sum of $n!$ terms. In particular, for $n = 20$, the number of terms of $\det M$ is

\[ 20! = 2{,}432{,}902{,}008{,}176{,}640{,}000. \]

The number of seconds in a year is only 31,536,000. This is why nobody asks even an electronic computer to calculate the determinants of large matrices by brute force.
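For small matrices the definition can nevertheless be carried out literally. The following sketch is our own illustration (the names `sign` and `det_by_definition` are hypothetical): it forms one signed term for each permutation of the column numbers and adds them, and it also confirms the two counts quoted above.

```python
from itertools import permutations
from math import factorial

def sign(p):
    """+1 for an even permutation, -1 for an odd one (parity of inversions)."""
    n = len(p)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
    return -1 if inversions % 2 else 1

def det_by_definition(M):
    """Sum of n! signed products, one entry from each row and each column."""
    n = len(M)
    total = 0
    for p in permutations(range(n)):     # p[i] is the column used in row i
        term = sign(p)
        for i in range(n):
            term *= M[i][p[i]]
        total += term
    return total

print(det_by_definition([[1, 2], [3, 4]]))
print(det_by_definition([[2, 0, 1], [1, 3, 2], [1, 1, 4]]))
print(factorial(20), 365 * 24 * 60 * 60)
```

The loop runs through all $n!$ permutations, so this is usable only for very small $n$; that is exactly the point of the remark above.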


Nevertheless, the definition of the function det is usable conceptually, as the
basis of a theory which leads quickly to efficient techniques. In the rest of this section,
we shall begin to develop the portion of the theory which makes direct use of the idea
of odd and even permutations.
Theorem 3.

Every permutation is invertible.

Obviously, since every permutation is one-to-one.


Theorem 4. For each $n$, let $S_n$ be the set of all permutations $p: I_n \to I_n$. Then $[S_n, \circ]$ is a group.

Proof.

1) $S_n$ is closed under $\circ$, because the product $p \circ q$ of two one-to-one functions $I_n \to I_n$ is another such function.

2) The operation $\circ$ is associative; composition of functions always is.

3) There is an identity

\[ e = \begin{pmatrix} 1 & 2 & 3 & \cdots & n \\ 1 & 2 & 3 & \cdots & n \end{pmatrix}. \]

4) By Theorem 3, every $p$ in $S_n$ has an inverse.

Theorem 5. In any group, the inverse of a product is the product of the inverses, in reverse order.

Thus

\[ (p \circ q)^{-1} = q^{-1} \circ p^{-1}, \]

because

\[ (p \circ q) \circ (q^{-1} \circ p^{-1}) = p \circ (q \circ q^{-1}) \circ p^{-1} = p \circ e \circ p^{-1} = p \circ p^{-1} = e. \]

Hereafter, we shall omit the operation sign $\circ$. For products of $k$ factors, the theorem says that

\[ (p_1 p_2 \cdots p_{k-1} p_k)^{-1} = p_k^{-1} p_{k-1}^{-1} \cdots p_2^{-1} p_1^{-1}, \]

and this is true, because in the product

\[ p_1 p_2 \cdots p_{k-1} p_k \, p_k^{-1} p_{k-1}^{-1} \cdots p_2^{-1} p_1^{-1} \]

all the factors cancel each other in pairs, starting in the middle.
Note that if the group is commutative, then the order doesn't matter, and the
theorem takes a simpler form. But the group that we are working with at the moment
is not commutative.
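A small computation in $S_3$ illustrates both remarks. This is a sketch under our own conventions (a permutation is a tuple whose $i$th entry is $p(i)$, and the helper names `compose` and `inverse` are hypothetical); it checks the reverse-order rule for one pair and shows that the two factors do not commute.

```python
def compose(p, q):
    """The product p∘q: apply q first, then p."""
    return tuple(p[q[i] - 1] for i in range(len(p)))

def inverse(p):
    """The permutation that undoes p."""
    inv = [0] * len(p)
    for i, v in enumerate(p, start=1):
        inv[v - 1] = i
    return tuple(inv)

p = (2, 1, 3)      # the transposition (12)
q = (1, 3, 2)      # the transposition (23)
e = (1, 2, 3)

pq = compose(p, q)
qp = compose(q, p)

print(pq, qp)                                    # two different permutations
print(inverse(pq) == compose(inverse(q), inverse(p)))
print(compose(pq, inverse(pq)) == e)
```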

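Theorem 6. For each $p$ in $S_n$, $p$ and $p^{-1}$ are either both even or both odd.

Proof. We express $p$ as a product of transpositions:

\[ p = p_1 p_2 \cdots p_{k-1} p_k, \]

where each $p_i$ is a transposition. Every transposition is its own inverse. Therefore

\[ p^{-1} = p_k p_{k-1} \cdots p_2 p_1. \]

If $k$ is even, then $p$ and $p^{-1}$ are both even. If not, $p$ and $p^{-1}$ are both odd.

The transpose of a matrix $M$ is the matrix obtained by reflecting $M$ across its main diagonal. The transpose is denoted by $M^t$. Thus

\[ \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}^t = \begin{bmatrix} a_{11} & a_{21} \\ a_{12} & a_{22} \end{bmatrix}, \]

and in general

\[ [a_{ij}]^t = [a_{ji}]. \]

Theorem 7. For each $M$ in $\mathscr{M}_n$,

\[ \det M = \det M^t. \]

Proof. The terms of $\det M$ are of the form

\[ \pm a_{1j_1} a_{2j_2} \cdots a_{nj_n}, \]

with $+$ or $-$ according as the permutation

\[ p = \begin{pmatrix} 1 & 2 & 3 & \cdots & n \\ j_1 & j_2 & j_3 & \cdots & j_n \end{pmatrix} \]

is even or odd. The corresponding term of $\det M^t$ is

\[ \pm a_{1j_1} a_{2j_2} \cdots a_{nj_n}, \]

with $+$ or $-$ according as the permutation

\[ q = \begin{pmatrix} j_1 & j_2 & j_3 & \cdots & j_n \\ 1 & 2 & 3 & \cdots & n \end{pmatrix} \]

is even or odd. Obviously $q = p^{-1}$, and so $p$ and $q$ are both even or both odd. Therefore the terms of $\det M$ and $\det M^t$ have the same signs, and $\det M = \det M^t$.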

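Theorem 8. If two rows of $M$ are interchanged, then the determinant of the resulting matrix is $-\det M$.

For example,

\[ \det \begin{bmatrix} a_{21} & a_{22} \\ a_{11} & a_{12} \end{bmatrix} = -\det \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}. \]

The reason is that when two rows are interchanged, this contributes exactly one transposition to the permutation

\[ p = \begin{pmatrix} 1 & 2 & 3 & \cdots & n \\ j_1 & j_2 & j_3 & \cdots & j_n \end{pmatrix}. \]

For example, if the first and third rows are interchanged, then the sign of the term

\[ a_{1j_1} a_{2j_2} \cdots a_{nj_n} \]

in the new determinant is determined by the permutation

\[ q = \begin{pmatrix} 1 & 2 & 3 & \cdots & n \\ j_3 & j_2 & j_1 & \cdots & j_n \end{pmatrix} = (j_1 j_3)\, p. \]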

PROBLEM SET 13.4

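1. Working directly from the definition of det, get an explicit formula for

\[ \det \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}. \]

2. Similarly, get a formula for

\[ \det \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}. \]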

3. Similarly, get a formula for

a,, ""
ll23 ll24
ll33 ll34
ll44

[001 010 ]
000
0
0
[
0
5. 0 001 ]
00 01
0
[ 000 007 400 !]
<let

Calculate the following.

0
OJ
[1
0
0
1
0
4. 00 00 01 010
7. [ 001 00
0 4 ]
<let

6. det

8.

det

00 00
0
[ 000 007 400 !]
00 000 401
[! 00 07 00 ]
2

9. det

13.4

10. det

12

det

[i i i]
[H H ]
[! : ]
o

14 . det

-14

-2

15. det

17 . det

-6

0
-4
0

0
0

3
0

6
16. det

-6

0
0

589

The Determinant Function

0
0

0
0

18. Let $A_n$ be the set of all even permutations in $S_n$. Is $[A_n, \circ]$ a group?

19. Let $B_n$ be the set of all odd permutations in $S_n$. Is $[B_n, \circ]$ a group?

20. Let $C_n$ denote the set of permutations in $S_n$ for which $j_1 = 1$. Is $[C_n, \circ]$ a group?

21. Let $D_n$ denote the set of permutations in $S_n$ such that either $j_1 = 1$ and $j_2 = 2$ or $j_1 = 2$ and $j_2 = 1$. Is $[D_n, \circ]$ a group?

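22. Can any general statement be made about the evenness or oddness of the following permutation?

\[ p = \begin{pmatrix} 1 & 2 & 3 & \cdots & n-2 & n-1 & n \\ n & n-1 & n-2 & \cdots & 3 & 2 & 1 \end{pmatrix} \]

23. What can you say about the sign of the following permutation?

\[ q = \begin{pmatrix} 1 & 2 & 3 & \cdots & n-2 & n-1 & n \\ n & 1 & 2 & \cdots & n-3 & n-2 & n-1 \end{pmatrix} \]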

24. Suppose that $[G, \circ]$ satisfies all the conditions for a group, except that some elements of $G$ may not have inverses. Let $H$ be the set of all elements of $G$ that have inverses. Does it follow that $[H, \circ]$ is a group?

25. Find the roots of the equation

det

[: f: fJ

0.

Express the left-hand member as a quadratic expression in factored form.


26. Find the roots of the equation


x

det

X1

a
2
x2
2

x2
3

x2

X2

X3

xa
3

13.5

x3

0.

Then express the lefthand member as a cubic equation in factored form.


*27. Find a matrix $M$ whose determinant is the alternating function $f(x_1, x_2, \ldots, x_n)$.

13.5  EXPANSIONS BY MINORS. CRAMER'S RULE AND INVERSION OF MATRICES

In the preceding section we showed that

\[ \det M^t = \det M \]

for every square matrix $M$; and we showed that if two rows of $M$ are interchanged, the effect is to change the sign of $\det M$. These statements in combination give us the following:

Theorem 1. If two rows of $M$ are interchanged, or two columns are interchanged, then the determinant of the resulting matrix is $-\det M$.

To get the second half of this theorem, we take the transpose, perform the appropriate interchange of two rows, and take the transpose of the resulting matrix. The first and third of these operations leave the determinant unchanged, and the second one reverses the sign.

In fact, since $\det M^t = \det M$, every theorem about rows automatically gives us a theorem about columns.

Theorem 2. If $M$ has two identical rows, then $\det M = 0$. Similarly for columns.

The reason is that when the two identical rows are interchanged, nothing happens to the determinant (or even to the matrix). Therefore $\det M = -\det M$, and $\det M = 0$.

The minor of an element $a_{ij}$, in a square matrix $M$, is the matrix that we get by deleting the $i$th row and the $j$th column of $M$. The minor is denoted by $M_{ij}$, and its determinant $\det M_{ij}$ is denoted by $D_{ij}$.
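It is easy to see that the sum of all terms of $\det M$ that include $a_{11}$ is $a_{11} D_{11}$:

\[ M = \begin{bmatrix} a_{11} & a_{12} & a_{13} & \cdots & a_{1n} \\ a_{21} & a_{22} & a_{23} & \cdots & a_{2n} \\ a_{31} & a_{32} & a_{33} & \cdots & a_{3n} \\ \vdots & \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & a_{n3} & \cdots & a_{nn} \end{bmatrix}. \]

Every term that involves $a_{11}$ has the form

\[ \pm a_{11} a_{2j_2} a_{3j_3} \cdots a_{nj_n}; \]

here $a_{2j_2} a_{3j_3} \cdots a_{nj_n}$ is (except possibly for sign) a term of $D_{11} = \det M_{11}$. And these two corresponding terms of $\det M$ and $\det M_{11}$ have the same sign, because the permutations

\[ \begin{pmatrix} 1 & 2 & 3 & \cdots & n \\ 1 & j_2 & j_3 & \cdots & j_n \end{pmatrix}, \qquad \begin{pmatrix} 2 & 3 & \cdots & n \\ j_2 & j_3 & \cdots & j_n \end{pmatrix} \]

are either both even or both odd.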


This leads to a more general result:

Theorem 3. The sum of all terms of $\det M$ that involve $a_{ij}$ is $(-1)^{i+j} a_{ij} D_{ij}$.

Proof. This is known for the case $i = j = 1$. We shall reduce the theorem to this case.

By a simple row transposition we mean an operation which interchanges two consecutive rows of a matrix. Similarly for simple column transpositions. We assert that $a_{ij}$ can be moved into the first column by $j - 1$ simple column transpositions:

\[ [c_1, c_2, c_3, c_4, c_5, \ldots, c_n]. \]

Here $c_j$ denotes the $j$th column. For $j = 5$, the transpositions are $(c_4 c_5)$, $(c_3 c_5)$, $(c_2 c_5)$, $(c_1 c_5)$. The new order of columns is

\[ [c_5, c_1, c_2, c_3, c_4, c_6, \ldots, c_n]. \]

Thus the fifth column becomes the first, and the other columns are in the same order, among themselves, as they were before.

Similarly, we can then move $a_{ij}$ into the first row, by $i - 1$ simple row transpositions. Let the new matrix be $M'$. Then

\[ \det M' = (-1)^{(i-1)+(j-1)} \det M = (-1)^{i+j} \det M, \]

and so

\[ \det M = (-1)^{-i-j} \det M' = (-1)^{i+j} \det M'. \]

But the sum of all terms of $\det M'$ that involve $a_{ij}$ is $a_{ij} \det M'_{11}$, where $M'_{11}$ is the minor of $a_{ij}$ in $M'$; and $M'_{11}$ is $M_{ij}$, because our total operations on the rows and columns of $M$ did not disturb the order of the rows and columns of $M_{ij}$. Therefore the sum of all terms of $\det M$ that involve $a_{ij}$ is

\[ (-1)^{i+j} a_{ij} \det M_{ij} = (-1)^{i+j} a_{ij} D_{ij}, \]

which was to be proved.


Theorem 4 (Expansion about the minors of a row). For each $i$,

\[ \det M = \sum_{j=1}^{n} (-1)^{i+j} a_{ij} D_{ij}. \]

This is true because every term of $\det M$ involves exactly one element in the $i$th row. Thus the above formula separates the $n!$ terms of $\det M$ into $n$ classes, with $(n - 1)!$ terms in each class. Similarly:

Theorem 5 (Expansion about the minors of a column). For each $j$,

\[ \det M = \sum_{i=1}^{n} (-1)^{i+j} a_{ij} D_{ij}. \]

(At this stage you should check to see how these formulas apply to a 3 by 3 matrix, using, say, the second row and the second column.)
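The check suggested here can also be done by machine. The following is a minimal sketch of our own (the names `minor`, `det`, `expand_row`, and `expand_col` are hypothetical, and indices in the code start at 0 rather than 1, which does not change the parity of $i + j$). It expands a 3 by 3 matrix about each row and each column and compares the results.

```python
def minor(M, i, j):
    """The matrix obtained by deleting row i and column j (0-based)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(M) if k != i]

def det(M):
    """Determinant, by expansion about the minors of the first row."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det(minor(M, 0, j)) for j in range(len(M)))

def expand_row(M, i):
    """Expansion about the minors of row i."""
    return sum((-1) ** (i + j) * M[i][j] * det(minor(M, i, j))
               for j in range(len(M)))

def expand_col(M, j):
    """Expansion about the minors of column j."""
    return sum((-1) ** (i + j) * M[i][j] * det(minor(M, i, j))
               for i in range(len(M)))

M = [[2, 0, 1],
     [1, 3, 2],
     [1, 1, 4]]

print([expand_row(M, i) for i in range(3)])
print([expand_col(M, j) for j in range(3)])
```

Every expansion gives the same number, as Theorems 4 and 5 assert.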
If we multiply the elements of one column by the determinants of the minors of some other column, with the appropriate signs, and add, we get 0:

Theorem 6. If $k \ne j$, then

\[ \sum_{i=1}^{n} (-1)^{i+j} a_{ik} D_{ij} = 0. \]

The reason is this. Let $M'$ be the matrix obtained by changing the $j$th column so as to make it identical with the $k$th column of the given matrix $M$. Then the above sum is the expansion of $\det M'$ about the minors of its $j$th column. Therefore the sum is $\det M'$. But $\det M'$ is 0, because $M'$ has two identical columns. Similarly for rows:

Theorem 7. If $k \ne i$, then

\[ \sum_{j=1}^{n} (-1)^{i+j} a_{kj} D_{ij} = 0. \]

Theorems 6 and 7 may seem, at first glance, to be merely descriptions of what


happens if somebody makes a mistake, but in fact they are very pointed statements.
They enable us to write explicit formulas for the solution of a set of $n$ linear equations in $n$ unknowns, in the case in which the solution exists and is unique. To avoid tedious notation, we show how the method applies in the case $n = 3$. (The general case is exactly the same in principle.) Given the system

\[ \begin{aligned} a_{11}x_1 + a_{12}x_2 + a_{13}x_3 &= b_1, \\ a_{21}x_1 + a_{22}x_2 + a_{23}x_3 &= b_2, \\ a_{31}x_1 + a_{32}x_2 + a_{33}x_3 &= b_3. \end{aligned} \]

Let $M$ be the matrix $[a_{ij}]$ of the system; let $D = \det M$, and suppose that $D \ne 0$. In the first equation, we multiply by $D_{11}$, in the second by $-D_{21}$, and in the third by $D_{31}$. Then we add:

\[ \begin{aligned} a_{11}D_{11}x_1 + a_{12}D_{11}x_2 + a_{13}D_{11}x_3 &= b_1 D_{11} \\ -a_{21}D_{21}x_1 - a_{22}D_{21}x_2 - a_{23}D_{21}x_3 &= -b_2 D_{21} \\ a_{31}D_{31}x_1 + a_{32}D_{31}x_2 + a_{33}D_{31}x_3 &= b_3 D_{31} \\ \hline D x_1 + 0 \cdot x_2 + 0 \cdot x_3 &= \sum_{i=1}^{3} (-1)^{i+1} b_i D_{i1}. \end{aligned} \]

On the left-hand side, the coefficient of $x_1$ is $D$, by Theorem 5, and the coefficients of $x_2$ and $x_3$ are 0, by Theorem 6. Thus the use of the minors as multipliers has given us an equation in which $x_1$ is the only unknown. The sum on the right-hand side, in the last equation, is easy to describe: it is the determinant $D_1$ of the matrix

\[ \begin{bmatrix} b_1 & a_{12} & a_{13} \\ b_2 & a_{22} & a_{23} \\ b_3 & a_{32} & a_{33} \end{bmatrix} \]

obtained by replacing the first column of $M$ by the $b_i$'s.

To solve for $x_2$, we multiply in the three equations by $-D_{12}$, $D_{22}$, and $-D_{32}$ respectively, and add. This gives

\[ 0 \cdot x_1 + D x_2 + 0 \cdot x_3 = \sum_{i=1}^{3} (-1)^{i+2} b_i D_{i2}. \]

Here the sum on the right-hand side is the determinant $D_2$ of the matrix obtained by replacing the second column of $M$ by the $b_i$'s. The same scheme works for $x_3$. Therefore, if $D \ne 0$, the system has one and only one solution, namely,

\[ x_1 = D_1/D, \qquad x_2 = D_2/D, \qquad x_3 = D_3/D. \]

Obviously none of the above discussion depended on the condition $n = 3$. In general, we have:
Theorem 8 (Cramer's rule). Given a linear system of the form

\[ M \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix}. \]

Let $D = \det M$. If $D \ne 0$, then the system has one and only one solution; and the solution is given by the formula

\[ x_j = D_j / D, \]

where $D_j$ is the determinant of the matrix $M_j$ obtained by replacing the $j$th column of $M$ by the vector $(b_1, b_2, \ldots, b_n)$.

Cramer's rule has the following consequence. A square matrix $M$ is called nonsingular if $M$ has an inverse.

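Theorem 9. If $M$ is a square matrix, and $\det M \ne 0$, then $M$ is nonsingular, and its inverse is given by the formula

\[ M^{-1} = [c_{ij}], \]

where for each $i$ and $j$,

\[ c_{ij} = \frac{1}{D} (-1)^{i+j} D_{ji}. \]

That is, $M^{-1}$ is $1/D$ times the transpose of the matrix $[(-1)^{i+j} D_{ij}]$.

To see why this is true, consider the matrix equation

\[ M \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}. \]

Here

\[ D x_j = \sum_{i=1}^{n} (-1)^{i+j} y_i D_{ij}, \]

by Cramer's rule. Therefore

\[ x_j = \sum_{i=1}^{n} (-1)^{i+j} (1/D) D_{ij} y_i. \]

What we want is a matrix $M' = M^{-1}$, such that

\[ \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = M' \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}. \]

Here the $y$'s form a column vector, and if $M' = [c_{ij}]$, then

\[ x_i = \sum_{j=1}^{n} c_{ij} y_j. \]

To convert our previous formula for $x_j$ to this form, we interchange $i$ and $j$ on the right, getting

\[ x_i = \sum_{j=1}^{n} (-1)^{i+j} (1/D) D_{ji} y_j. \]

The value of the sum on the right is unchanged when we use $j$ as an index of summation. Therefore

\[ M^{-1} = M' = [c_{ij}] = \left[ \frac{1}{D} (-1)^{i+j} D_{ji} \right], \]

which was to be proved.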
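Cramer's rule and the inversion formula of Theorem 9 translate directly into a short program. The sketch below is our own illustration, not an efficient method: the function names (`minor`, `det`, `cramer_solve`, `inverse`, `mat_vec`) are hypothetical, indices start at 0, and exact rational arithmetic from the standard `fractions` module is used so that the checks involve no rounding.

```python
from fractions import Fraction

def minor(M, i, j):
    """Delete row i and column j (0-based)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(M) if k != i]

def det(M):
    """Determinant, by expansion about the minors of the first row."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det(minor(M, 0, j)) for j in range(len(M)))

def cramer_solve(M, b):
    """Solve M x = b by x_j = D_j / D; requires D != 0."""
    D = det(M)
    if D == 0:
        raise ValueError("det M = 0: Cramer's rule does not apply")
    xs = []
    for j in range(len(M)):
        # M_j: replace the j-th column of M by the vector b.
        Mj = [row[:j] + [b[i]] + row[j + 1:] for i, row in enumerate(M)]
        xs.append(Fraction(det(Mj), D))
    return xs

def inverse(M):
    """c_ij = (1/D) (-1)^(i+j) D_ji, as in Theorem 9; requires D != 0."""
    D = det(M)
    if D == 0:
        raise ValueError("det M = 0: M is singular")
    n = len(M)
    return [[Fraction((-1) ** (i + j) * det(minor(M, j, i)), D)
             for j in range(n)] for i in range(n)]

def mat_vec(M, x):
    """Product of a matrix and a column vector."""
    return [sum(a * t for a, t in zip(row, x)) for row in M]

M = [[2, 1, 1],
     [1, 3, 2],
     [1, 0, 0]]
b = [4, 5, 6]

x = cramer_solve(M, b)
print(x)
print(mat_vec(M, x) == b)              # the solution satisfies the system
print(mat_vec(inverse(M), b) == x)     # and agrees with M^{-1} b
```

Note that the minor used for the entry in row $i$ and column $j$ of the inverse is the one obtained by deleting row $j$ and column $i$; this is the transposition in the statement of the theorem.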

PROBLEM SET 13.5


1. Find multipliers which eliminate $y$ and $z$ from the system

\[ \begin{aligned} x + 2y + 3z &= 4, \\ 2x + 3y + 4z &= 5, \\ 3x + 4y + 5z &= 6. \end{aligned} \]

Carry out the multiplications, add, and solve for $x$. (Here you are not supposed to use Cramer's rule; you should use the scheme used in deriving Cramer's rule.)

2. Similarly, solve for $y$ in the system

\[ \begin{aligned} x - 2y + 3z &= -4, \\ 2x + 3y + 4z &= -5, \\ 3x - 4y + 5z &= -6. \end{aligned} \]

3. Similarly, solve for z in the system


x - 2y - 3z
2x + 3y - 4z
-3x + 4y

5z

4,
-5,

0=

6.

Find the inverses of the following matrices, by a direct application of Theorem 9, and
check your answers by matrix multiplication.
4.

7.

[ ].
G ].

5.

[ !l

9.

[ l
1

11.

[! ]
-2

[! l
1

6.

8.

10.

[ ].

0
0

[ l

Find, by any method, the inverses of the following matrices.

(You need not calculate

the determinants unless you need to, as a step in finding the inverse.)

12.

15.

[ l

13.

-1

-2

-5

[ !l
0

14.

16.

[ !l
0

[ ]
1

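17. Suppose we form a 4 by 4 matrix by fitting together four 2 by 2 matrices, like this:

\[ M = \begin{bmatrix} M_{11} & M_{12} \\ M_{21} & M_{22} \end{bmatrix}. \]

Let $D = \det M$, and let $D_{ij} = \det M_{ij}$ for each $i$, $j$. Is it true that

\[ D = \det \begin{bmatrix} D_{11} & D_{12} \\ D_{21} & D_{22} \end{bmatrix}? \]

If so, verify it. If not, give an example of a case in which it fails.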


18. Similarly, discuss the case in which a 2n by 2n matrix is formed by fitting together

four n by n matrices.
19. Similarly, discuss the case in which a 6 by 6 matrix is formed by fitting together nine
2 by 2 matrices.

13.6  ROW AND COLUMN OPERATIONS. LINEAR INDEPENDENCE OF SETS OF FUNCTIONS

We shall now show that when we apply to a square matrix the "triangularization"
process that we applied to systems of linear equations in Section 13.1, the determinant
of the matrix is unchanged.
Theorem 1. If one row of a square matrix is multiplied by a scalar, and the resulting
vector added to another row, the determinant of the matrix is unchanged. Similarly
for columns.

Proof. Suppose that the $k$th row of the matrix $M = [a_{ij}]$ is multiplied by $\alpha$ and added to the $i$th row, giving a matrix $M'$. Expanding $\det M'$ about the minors of the $i$th row, we get

\[ \begin{aligned} \det M' &= \sum_{j=1}^{n} (-1)^{i+j} (a_{ij} + \alpha a_{kj}) D_{ij} \\ &= \sum_{j=1}^{n} (-1)^{i+j} a_{ij} D_{ij} + \alpha \sum_{j=1}^{n} (-1)^{i+j} a_{kj} D_{ij} \\ &= \det M + \alpha \cdot 0, \end{aligned} \]

by Theorems 4 and 7 of Section 13.5. We get the other half of the theorem by taking transposes, as in the proof of Theorem 1 of Section 13.5.

Iterations of this procedure constitute the most efficient scheme for computing determinants; by appropriate row (or column) operations, we can introduce 0's into a particular row (or column), so that when we use an expansion by minors, only one of the minors needs to be computed. Note that without these preliminaries, an expansion by minors is not a short cut in computation, but merely a device for systematizing our work; in an expansion by minors, the same number of terms appear as under the original definition of the determinant; they have merely been sorted into $n$ sets of $(n - 1)!$ terms each.
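The procedure can be sketched in a few lines. This is our own illustration under stated assumptions, not the text's algorithm verbatim: the hypothetical function `det_by_row_ops` uses the operation of Theorem 1 to put 0's below the diagonal, one column at a time; it interchanges two rows when a diagonal entry is 0, which reverses the sign (Theorem 1 of Section 13.5); and at the end it multiplies the diagonal entries, which is what repeated expansion about the first column gives for a triangular matrix. Exact fractions are used to avoid rounding.

```python
from fractions import Fraction

def det_by_row_ops(M):
    """Determinant by reduction to upper triangular form."""
    A = [[Fraction(x) for x in row] for row in M]
    n = len(A)
    sign = 1
    for c in range(n):
        # Find a row at or below the diagonal with a nonzero entry in column c.
        pivot = next((r for r in range(c, n) if A[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)           # whole column is 0 below: det is 0
        if pivot != c:
            A[c], A[pivot] = A[pivot], A[c]
            sign = -sign                 # a row interchange reverses the sign
        for r in range(c + 1, n):
            factor = A[r][c] / A[c][c]
            # Add (-factor) times row c to row r: determinant unchanged.
            A[r] = [x - factor * y for x, y in zip(A[r], A[c])]
    result = Fraction(sign)
    for i in range(n):
        result *= A[i][i]
    return result

print(det_by_row_ops([[2, 0, 1], [1, 3, 2], [1, 1, 4]]))
print(det_by_row_ops([[0, 1], [1, 0]]))
print(det_by_row_ops([[1, 2], [2, 4]]))
```

The work grows roughly like $n^3$ arithmetic operations, in contrast with the $n!$ terms of the definition.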

Theorem 2. If the rows of a matrix $M$ form a linearly dependent set, then $\det M = 0$. Similarly for the columns.

Proof. Let $M = [a_{ij}]$; let the rows be

\[ r_i = (a_{i1}, a_{i2}, \ldots, a_{in}); \]

suppose that

\[ \sum_{i=1}^{n} \alpha_i r_i = 0, \]

for some set of numbers $\alpha_i$, not all equal to 0. Then some $r_i$, say, $r_1$, is a linear combination of the others:

\[ r_1 = \sum_{i=2}^{n} \beta_i r_i. \]

By $n - 1$ row operations as in Theorem 1, we get a new matrix $M'$ in which $r_1$ is replaced by a row of 0's. Therefore $\det M' = 0$, and so $\det M = \det M' = 0$.

The converse is a little harder.

Theorem 3. If $\det M = 0$, then the rows of $M$ form a linearly dependent set (and so also do the columns).

Proof. The proof is by induction. Obviously the theorem holds for 1 by 1 matrices. We need to show that if it holds for $n - 1$ by $n - 1$ matrices, then it also holds for $n$ by $n$ matrices.

Given an $n$ by $n$ matrix $M = [a_{ij}]$, with rows $r_1, r_2, \ldots, r_n$. If any row $r_i$ is the zero vector, then the linear dependence of the set

\[ R = \{r_1, r_2, \ldots, r_n\} \]

is obvious. Therefore we may assume that $r_1 \ne 0$. We may also assume that $a_{11} \ne 0$, since the linear dependence or independence of the rows is unaffected by permutations of the columns.

Now consider the matrix $M'$ whose rows form the set

\[ R' = \{r_1', r_2', \ldots, r_n'\} = \{r_1, r_2 - \alpha_2 r_1, \ldots, r_n - \alpha_n r_1\}. \]

The $r_i'$'s are linear combinations of the $r_i$'s, and vice versa. Therefore $R$ and $R'$ span the same subspace of $R^n$. This gives:

If $R'$ is linearly dependent, then $R$ is linearly dependent, and conversely.

But the $\alpha_i$'s can be chosen so that $a_{11}$ is the only nonzero element in the first column of $M'$. Therefore $\det M' = a_{11} D_{11}'$, where $D_{11}'$ is the determinant of the minor $M_{11}'$ of $a_{11}$ in $M'$. Since

1) $\det M' = \det M = 0$ and $a_{11} \ne 0$,

we have

2) $\det M_{11}' = 0$.

But $M_{11}'$ is an $n - 1$ by $n - 1$ matrix. Therefore the rows of $M_{11}'$ form a linearly dependent set. Therefore $R'$ is linearly dependent. Therefore $R$ is linearly dependent, which was to be proved.

This theorem easily gives:

Theorem 4. If the rows of a matrix $M$ are linearly dependent, then so also are the columns, and conversely.

Proof. By the preceding two theorems, each of the following conditions is equivalent to the next:

1) The rows of $M$ are linearly dependent.
2) $\det M = 0$.
3) The columns of $M$ are linearly dependent.

Here, and throughout this chapter so far, we have been talking about matrices of numbers. We shall now discuss linear independence of sets of functions; for this purpose, we shall use matrices of functions; and the first thing that we need to understand is that for matrices of functions, the analogue of Theorem 4 is false. This can be shown by a very simple example, as follows:

M(x)

2
2x
2
2x

[l

x
2
x

Here the columns are linearly dependent, obviously; but the rows are not: if a linear combination of the rows, with real coefficients $\alpha_1$, $\alpha_2$, $\alpha_3$, vanishes for every $x$, then $\alpha_1 = \alpha_2 = \alpha_3 = 0$, because the only polynomial that vanishes for every $x$ is the zero polynomial.

The simplest general test for linear independence of functions uses the determinant of a matrix of functions.

Let $f_1, f_2, \ldots, f_n$ be functions on an interval $I$. The Wronskian of the sequence $f_1, f_2, \ldots, f_n$ is the function

\[ W = \det \begin{bmatrix} f_1 & f_1' & \cdots & f_1^{(n-1)} \\ f_2 & f_2' & \cdots & f_2^{(n-1)} \\ \vdots & \vdots & & \vdots \\ f_n & f_n' & \cdots & f_n^{(n-1)} \end{bmatrix}. \]

Thus

\[ W(x) = \det \left[ f_i^{(j-1)}(x) \right]; \]

the Wronskian matrix has the $(j - 1)$st derivative of $f_i$ in the $i$th row and $j$th column. Note that the Wronskian really depends on a sequence of functions, and not merely on a set of functions; if the same functions are taken in a different order, the sign of $W$ may change. The notation $W(x)$ is meant to emphasize that the Wronskian is a function and not a number. The following is easy:


Theorem 5. If $\{f_1, f_2, \ldots, f_n\}$ is linearly dependent, then

\[ W(f_1, f_2, \ldots, f_n) = 0. \]

That is, $W(x) = 0$ for every $x$.

The reason is that for each $x$, the rows of the Wronskian matrix are linearly dependent: if

\[ \sum_{i=1}^{n} \alpha_i f_i(x) = 0 \]

for every $x$, for some numbers $\alpha_i$ which are not all 0, then automatically

\[ D^j \left[ \sum_{i=1}^{n} \alpha_i f_i(x) \right] = 0 \qquad (j = 1, 2, \ldots, n - 1). \]

(Here $D^j[\ ]$ denotes the $j$th derivative.) Therefore

\[ \sum_{i=1}^{n} \alpha_i f_i^{(j)}(x) = 0 \qquad (j = 1, 2, \ldots, n - 1), \]

for every $x$, and the rows are linearly dependent.

13.6

Row and Column Operations

599

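Ordinarily, we apply this theorem backwards, using the following equivalent form:

Theorem 5'. If $W(x_0) \ne 0$ for some $x_0$, then the set $\{f_1, f_2, \ldots, f_n\}$ is linearly independent.

For example, consider

$$f_1(x) = \sin x, \qquad f_2(x) = \cos x.$$

Here

$$W(x) = \det \begin{bmatrix} \sin x & \cos x \\ \cos x & -\sin x \end{bmatrix} = -\sin^2 x - \cos^2 x = -1 \quad \text{for every } x.$$

Here $W(x) \ne 0$ for every $x$, and it follows that sin and cos are linearly independent. But sometimes we have to choose a particular $x_0$. For example, consider

$$f_1(x) = \sin x, \qquad f_2(x) = \sin 2x.$$

Here

$$W(x) = \det \begin{bmatrix} \sin x & \cos x \\ \sin 2x & 2\cos 2x \end{bmatrix}.$$

Here $W(0) = 0$, but

$$W\!\left(\frac{\pi}{2}\right) = \det \begin{bmatrix} 1 & 0 \\ 0 & -2 \end{bmatrix} = -2.$$

By Theorem 5', $\{\sin x, \sin 2x\}$ is linearly independent. Similarly for

$$f_1(x) = xe^x, \qquad f_2(x) = x^2 e^x.$$

Here

$$W(x) = \det \begin{bmatrix} xe^x & xe^x + e^x \\ x^2 e^x & x^2 e^x + 2xe^x \end{bmatrix} = \det \begin{bmatrix} xe^x & e^x \\ x^2 e^x & 2xe^x \end{bmatrix} = x^2 e^{2x},$$

and so $W(x) \ne 0$ for every $x \ne 0$.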
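The use of Theorem 5' in the second example can be checked numerically. The following is only a sketch: the helper name `wronskian2` is ours, not the text's, and it handles just the 2 by 2 case, with one row per function as in the definition of the Wronskian matrix.

```python
import math


def wronskian2(f1, df1, f2, df2, x):
    """W(x) = det [[f1, f1'], [f2, f2']], with one row per function."""
    return f1(x) * df2(x) - df1(x) * f2(x)


def sin2x(x):
    return math.sin(2 * x)


def dsin2x(x):
    # Derivative of sin 2x.
    return 2 * math.cos(2 * x)


# W vanishes at 0, so x0 = 0 tells us nothing; x0 = pi/2 settles the question.
w_at_0 = wronskian2(math.sin, math.cos, sin2x, dsin2x, 0.0)
w_at_half_pi = wronskian2(math.sin, math.cos, sin2x, dsin2x, math.pi / 2)
print(w_at_0, w_at_half_pi)
```

Up to rounding error, the second value is $-2$; a single point at which $W$ does not vanish is all that Theorem 5' requires.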


Elaborations of these techniques will build up gradually, in the following problem set. Meanwhile, it is natural to inquire about the converse of Theorem 5.

(?)Theorem(?). If $W(x) = 0$ for every $x$, then $\{f_1, f_2, \ldots, f_n\}$ is linearly dependent.

As it stands, this is false; and strong hypotheses need to be added to make it a true theorem. For example, let

$$f_1(x) = x^2, \qquad f_2(x) = \begin{cases} 2x^2 & \text{for } x \ge 0, \\ 3x^2 & \text{for } x \le 0. \end{cases}$$

Then $W(x) = 0$ on the interval $[0, \infty)$, because $f_1$ and $f_2$ are linearly dependent on $[0, \infty)$; and for the same reason, $W(x) = 0$ on $(-\infty, 0]$. Therefore $W(x) = 0$ for every $x$. Nevertheless, $\{f_1, f_2\}$ is linearly independent, because there is no pair of constants $\alpha_1, \alpha_2$, not both zero, for which

$$\alpha_1 f_1(x) + \alpha_2 f_2(x) = 0 \quad \text{for every } x.$$

If the above equation holds for every $x$, then setting $x = -1$ and $x = 1$ we get

$$\alpha_1 + 3\alpha_2 = 0, \qquad \alpha_1 + 2\alpha_2 = 0.$$

By subtraction, $\alpha_2 = 0$. It follows that $\alpha_1 = 0$.

This indicates that the Wronskian can give proofs of linear dependence only for special types of functions. Fortunately, these functions are of special interest and importance, as we shall see.
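The counterexample can also be examined numerically. The sketch below assumes that the two pieces of $f_2$ are $2x^2$ on $[0, \infty)$ and $3x^2$ on $(-\infty, 0]$; the function names are ours. It evaluates the Wronskian at a few sample points, and then the determinant of the 2 by 2 system obtained by setting $x = -1$ and $x = 1$.

```python
def f1(x):
    return x * x


def df1(x):
    return 2 * x


def f2(x):
    # Assumed pieces: 2x^2 for x >= 0, 3x^2 for x <= 0 (they agree at 0).
    return 2 * x * x if x >= 0 else 3 * x * x


def df2(x):
    return 4 * x if x >= 0 else 6 * x


def wronskian(x):
    return f1(x) * df2(x) - df1(x) * f2(x)


samples = [-2.0, -1.0, -0.5, 0.0, 0.5, 1.0, 2.0]
wronskian_values = [wronskian(x) for x in samples]

# Coefficient determinant of a1*f1(x) + a2*f2(x) = 0 at x = -1 and x = 1.
# If it is not 0, the only solution is a1 = a2 = 0.
system_det = f1(-1.0) * f2(1.0) - f2(-1.0) * f1(1.0)
print(wronskian_values, system_det)
```

The Wronskian vanishes at every sample point, while the system determinant does not; this is the situation described above, in which $W = 0$ everywhere but the pair is linearly independent.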


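In the following problem set, you will be writing long strings of equations between determinants, and it will be convenient to use $|\ \ |$ as an abbreviation for $\det\,[\ \ ]$. For example,

$$\begin{vmatrix} a & b \\ c & d \end{vmatrix} = \det \begin{bmatrix} a & b \\ c & d \end{bmatrix} = ad - bc.$$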

PROBLEM SET 13.6

Investigate the following sets of functions for linear dependence.

1. $\{e^x, e^{2x}\}$
2. $\{e^x, e^{2x}, e^{3x}\}$
3. $\{e^{-x}, e^x\}$
4. $\{e^x, e^{2x}, xe^x\}$
5. $\{\sin x, \sin^2 x\}$
6. $\{\cos 2x, \sin^2 x\}$
7. $\{\sin x, \cos x, \sin^2 x\}$
8. $\{x, \sin x\}$
9. $\{e^{ax}, e^{bx}\}$ $(a \ne b)$
10. $\{e^{ax}, e^{bx}, xe^{ax}\}$ $(a \ne b)$
11. $\{e^{ax}, e^{bx}, xe^{ax}, xe^{bx}\}$ $(a \ne b)$
12. $\{e^x, xe^x, x^2 e^x\}$
13. $\{\sin x, \sin 2x, \sin 3x\}$
14. $\{\cos x, \cos 2x, \cos 3x\}$
15. $\{\cos x, x\cos x\}$
16. $\{\cos x, \cos 2x, \cos^2 x\}$
17. $\{\cos^2 x, \sin^2 x\}$
18. $\{\cos x, \sin^4 x, \sin^2 (2x)\}$
19. $\{e^x, \sin x, \cos x, e^x \sin x\}$
20. $\{e^x, e^x \sin x, e^x \cos x\}$
21. $\{x, e^x, xe^x\}$

22. Since a polynomial equation of degree $n$ has at most $n$ roots, it follows that for each $n$, $\{1, x, x^2, \ldots, x^n\}$ is linearly independent. Get an alternative proof of this statement by calculating the Wronskian.

23. Find the roots of the equation

$$D_4 = \det \begin{bmatrix} 1 & x_1 & x_1^2 & x_1^3 \\ 1 & x_2 & x_2^2 & x_2^3 \\ 1 & x_3 & x_3^2 & x_3^3 \\ 1 & x & x^2 & x^3 \end{bmatrix} = 0.$$

Here the numbers $x_1$, $x_2$, and $x_3$ are all different. How do you know that $D_4$ is a polynomial of degree three? Express $D_4$ as a product of linear factors, not involving determinants.

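*24. Express the following determinant as a product of linear factors:

$$\det \begin{bmatrix} 1 & x_1 & x_1^2 & \cdots & x_1^{n-1} \\ 1 & x_2 & x_2^2 & \cdots & x_2^{n-1} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & x_n & x_n^2 & \cdots & x_n^{n-1} \end{bmatrix}.$$

*25. Investigate for linear dependence: $\{e^{a_1 x}, e^{a_2 x}, e^{a_3 x}, \ldots, e^{a_n x}\}$, where the $a_i$'s are all different.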

13.7 LINEAR DIFFERENTIAL EQUATIONS

Consider a differential equation of the form

$$f'' + bf' + cf = 0, \tag{1}$$

where $b$ and $c$ are constants. As always, when we study a differential equation, we want to answer the following three questions:


i) Does the equation have any solutions?
ii) If so, what are they?
iii) How do we know that the solutions that we found are the only ones?
For Eq. (1), and for many others like it, we can give complete answers to all these
questions. (In fact, the only hard one is the third.) As a guide to what to try, we look
first at cases in which some of the solutions are obvious. The equation

$$f'' - f = 0$$

has the solutions $e^x$ and $e^{-x}$, because $D^2 e^x = e^x$ and $D^2 e^{-x} = e^{-x}$. And it is easy to see that if $f_1$ and $f_2$ are solutions, then so also is $\alpha_1 f_1 + \alpha_2 f_2$, for every pair of scalars $\alpha_1$ and $\alpha_2$. We had better make a note of this, more generally:

Theorem 1. Given a differential equation

$$f^{(n)} + a_{n-1} f^{(n-1)} + \cdots + a_1 f' + a_0 f = 0,$$

where the $a_i$'s are constants. Let $\mathcal{V}$ be the set of all solutions of the equation. Then $\mathcal{V}$ forms a vector space.

That is, $\mathcal{V}$ is closed under addition and scalar multiplication. This is trivial to check. Note, however, that if the zero on the right is replaced by a nonzero function, or even a nonzero constant $k$, the solutions of the resulting equation never form a vector space. (If the sum of two solutions is a solution, then $2k = k$, and $k = 0$.)

The above example suggests that we try solutions of the form

$$f(x) = e^{mx}, \qquad f'(x) = me^{mx}, \qquad f''(x) = m^2 e^{mx}.$$

If such a function is a solution, then

$$m^2 e^{mx} + bme^{mx} + ce^{mx} = 0,$$

and since $e^{mx} \ne 0$ for every $x$, this is equivalent to the equation

$$m^2 + bm + c = 0. \tag{2}$$

Equation (2) is called the auxiliary equation. There are three possibilities for its solutions.

I. If $b^2 - 4c > 0$, then there are two roots

$$m_1 = \frac{-b + \sqrt{b^2 - 4c}}{2}, \qquad m_2 = \frac{-b - \sqrt{b^2 - 4c}}{2},$$

and both these roots are real.

II. If $b^2 - 4c = 0$, then there is only one root

$$m_1 = -b/2,$$

and this is a real root of multiplicity 2, with

$$m^2 + bm + c = (m - m_1)^2.$$

III. If $b^2 - 4c < 0$, then the roots are two conjugate complex numbers

$$m_1 = \alpha + \beta i, \qquad m_2 = \alpha - \beta i,$$

where $\alpha = -b/2$ and $\beta = \tfrac{1}{2}\sqrt{4c - b^2}$.

In case I, the functions

$$f_1(x) = e^{m_1 x}, \qquad f_2(x) = e^{m_2 x}$$

are solutions, and so also is every linear combination

$$f = \alpha_1 f_1 + \alpha_2 f_2.$$

Also $\{f_1, f_2\}$ is linearly independent, because

$$W(f_1, f_2) = \det \begin{bmatrix} e^{m_1 x} & m_1 e^{m_1 x} \\ e^{m_2 x} & m_2 e^{m_2 x} \end{bmatrix} = e^{m_1 x} e^{m_2 x} \det \begin{bmatrix} 1 & m_1 \\ 1 & m_2 \end{bmatrix} = e^{m_1 x} e^{m_2 x} (m_2 - m_1) \ne 0.$$

Therefore the solutions that we have found for case I include all the solutions, if the following theorem is true:

Theorem 2. Let $\mathcal{V}$ be the solution space of the equation

$$f'' + bf' + cf = 0.$$

Then $\dim \mathcal{V} = 2$.

This is true, and will be proved in the following section. Meanwhile we shall use it.
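In case II, we seem to have only one solution

$$f_1(x) = e^{m_1 x};$$

and on the basis of Theorem 2, we need to find another one, $f_2$, such that $\{f_1, f_2\}$ is linearly independent. We do not know how somebody first thought of trying

$$f_2(x) = xe^{m_1 x},$$

but at any rate, it works:

$$f_2(x) = xe^{m_1 x}, \qquad f_2'(x) = m_1 x e^{m_1 x} + e^{m_1 x}, \qquad f_2''(x) = m_1^2 x e^{m_1 x} + 2m_1 e^{m_1 x};$$

$$f_2''(x) + bf_2'(x) + cf_2(x) = xe^{m_1 x}[m_1^2 + bm_1 + c] + e^{m_1 x}[2m_1 + b] = 0,$$

because both the expressions in the brackets are equal to 0. Also, $\{f_1, f_2\}$ is linearly independent, because

$$W(f_1, f_2) = \det \begin{bmatrix} e^{m_1 x} & m_1 e^{m_1 x} \\ xe^{m_1 x} & m_1 x e^{m_1 x} + e^{m_1 x} \end{bmatrix} = e^{2m_1 x} \det \begin{bmatrix} 1 & m_1 \\ x & m_1 x + 1 \end{bmatrix} = e^{2m_1 x} \ne 0.$$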

Case III looks peculiar. Taken at face value, the roots of the auxiliary equation give us

$$f_1(x) = e^{(\alpha + \beta i)x} = e^{\alpha x}(\cos \beta x + i \sin \beta x),$$

$$f_2(x) = e^{(\alpha - \beta i)x} = e^{\alpha x}(\cos \beta x - i \sin \beta x).$$

At the outset, we did not intend to get into the complex domain; but in the complex domain, our formulas still make sense: if $m$ is complex, then the function $f(x) = e^{mx}$ is well defined; in fact, we have a function

$$\varphi(z) = e^{mz},$$

with

$$\varphi'(z) = me^{mz}.$$

In particular, when $z$ is real, $z = x$, we have

$$\varphi(z) = f(x) = e^{mx}, \qquad \varphi'(z) = f'(x) = me^{mx}, \qquad \varphi''(z) = f''(x) = m^2 e^{mx}.$$

Therefore $f_1$ and $f_2$ really are solutions. But at the moment we are interested only in real solutions (in another sense), and so we take the real and imaginary parts separately, getting

$$g_1(x) = e^{\alpha x} \cos \beta x, \qquad g_2(x) = e^{\alpha x} \sin \beta x.$$
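(Check that if a complex-valued function $f$ is a solution, then its real and imaginary parts are also solutions. This is easier than checking $g_1$ and $g_2$ by a brute-force calculation.) It remains to verify that $g_1$ and $g_2$ are linearly independent. We have

$$g_1'(x) = \alpha e^{\alpha x} \cos \beta x - \beta e^{\alpha x} \sin \beta x, \qquad g_2'(x) = \alpha e^{\alpha x} \sin \beta x + \beta e^{\alpha x} \cos \beta x.$$

Therefore

$$W(g_1, g_2) = \det \begin{bmatrix} e^{\alpha x} \cos \beta x & \alpha e^{\alpha x} \cos \beta x - \beta e^{\alpha x} \sin \beta x \\ e^{\alpha x} \sin \beta x & \alpha e^{\alpha x} \sin \beta x + \beta e^{\alpha x} \cos \beta x \end{bmatrix}$$

$$= e^{2\alpha x} \det \begin{bmatrix} \cos \beta x & \alpha \cos \beta x - \beta \sin \beta x \\ \sin \beta x & \alpha \sin \beta x + \beta \cos \beta x \end{bmatrix} = e^{2\alpha x} \det \begin{bmatrix} \cos \beta x & -\beta \sin \beta x \\ \sin \beta x & \beta \cos \beta x \end{bmatrix} = \beta e^{2\alpha x} \ne 0,$$

because $\beta \ne 0$.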

In case III, the linear combinations

$$f(x) = k_1 g_1(x) + k_2 g_2(x) = k_1 e^{\alpha x} \cos \beta x + k_2 e^{\alpha x} \sin \beta x$$

can be described in a better form. We have

$$f(x) = e^{\alpha x}(k_1 \cos \beta x + k_2 \sin \beta x) = \sqrt{k_1^2 + k_2^2}\; e^{\alpha x} \left( \frac{k_1}{\sqrt{k_1^2 + k_2^2}} \cos \beta x + \frac{k_2}{\sqrt{k_1^2 + k_2^2}} \sin \beta x \right) = ke^{\alpha x} \cos \beta(x - x_0),$$

where

$$k = \sqrt{k_1^2 + k_2^2}$$

and $x_0$ is any number such that $\cos \beta x_0 = k_1/k$ and $\sin \beta x_0 = k_2/k$. Using $t$ for $x$, we get

$$f(t) = ke^{\alpha t} \cos \beta(t - t_0),$$

which describes the motion of a particle along a line, with the position given as a function of the time. This kind of motion is called damped oscillation. To get the graph of $f$, we start with the "simple oscillating function" $\cos \beta t$; we move the graph $t_0$ units to the right, so that $t_0$ acts like 0; and then we damp the function by multiplying each value by $ke^{\alpha t}$. (For $\alpha < 0$, this damps the oscillations as $t \to \infty$; for $\alpha > 0$, the oscillations are damped as $t \to -\infty$.)
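The passage from $k_1, k_2$ to $k, x_0$ can be carried out mechanically. The sketch below is only an illustration: the function name `amplitude_phase` and the sample values of $\alpha$, $\beta$, $k_1$, $k_2$ are ours, and we pick one of the many admissible values of $x_0$ by means of the two-argument arctangent.

```python
import math


def amplitude_phase(k1, k2, beta):
    """Return (k, x0) with cos(beta*x0) = k1/k and sin(beta*x0) = k2/k.

    Assumes k1 and k2 are not both 0, and beta != 0.
    """
    k = math.hypot(k1, k2)             # k = sqrt(k1^2 + k2^2)
    x0 = math.atan2(k2, k1) / beta     # one admissible choice of x0
    return k, x0


alpha, beta, k1, k2 = -0.5, 3.0, 2.0, -1.0   # hypothetical sample values
k, x0 = amplitude_phase(k1, k2, beta)


def f_sum(x):
    return math.exp(alpha * x) * (k1 * math.cos(beta * x) + k2 * math.sin(beta * x))


def f_single(x):
    return k * math.exp(alpha * x) * math.cos(beta * (x - x0))


# The two descriptions of f should agree at every point.
max_gap = max(abs(f_sum(i / 10) - f_single(i / 10)) for i in range(-20, 21))
print(k, x0, max_gap)
```

The printed gap is of the order of rounding error, which is what the identity $\cos \beta(x - x_0) = \cos \beta x \cos \beta x_0 + \sin \beta x \sin \beta x_0$ leads us to expect.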
Note that in our formula for $f$, the constants $\alpha$ and $\beta$ play a very different part from $k$ and $t_0$: $\alpha$ and $\beta$ are determined by the coefficients $b$ and $c$ in the differential equation, while $k$ and $t_0$ range arbitrarily.

It is a fact that a solution $f$ of the equation

$$f'' + bf' + cf = 0$$

is completely determined if $f(x_0)$ and $f'(x_0)$ are known, for some $x_0$. This can be verified by a calculation, for the three types of solutions that we have found, but the theorem is best postponed until the next section, where we can give the "right proof." Meanwhile, in the following problem set, you will find that such initial conditions always determine an answer.
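The three cases of this section can be collected into one short routine. This is a sketch under our own naming (`basis_solutions`, `residual`); it classifies the auxiliary equation $m^2 + bm + c = 0$ by the sign of $b^2 - 4c$, returns the pair of solutions found above, and checks them against the differential equation by a rough finite-difference estimate rather than by an exact calculation.

```python
import math


def basis_solutions(b, c):
    """Case label and two independent real solutions of f'' + b f' + c f = 0."""
    disc = b * b - 4 * c
    if disc > 0:                       # Case I: two real roots
        r = math.sqrt(disc)
        m1, m2 = (-b + r) / 2, (-b - r) / 2
        return "I", (lambda x: math.exp(m1 * x)), (lambda x: math.exp(m2 * x))
    if disc == 0:                      # Case II: one real root of multiplicity 2
        m1 = -b / 2
        return "II", (lambda x: math.exp(m1 * x)), (lambda x: x * math.exp(m1 * x))
    alpha, beta = -b / 2, math.sqrt(-disc) / 2   # Case III: roots alpha +/- beta*i
    return ("III",
            (lambda x: math.exp(alpha * x) * math.cos(beta * x)),
            (lambda x: math.exp(alpha * x) * math.sin(beta * x)))


def residual(f, b, c, x, h=1e-3):
    """Central-difference estimate of f'' + b f' + c f at x (approximate)."""
    d1 = (f(x + h) - f(x - h)) / (2 * h)
    d2 = (f(x + h) - 2 * f(x) + f(x - h)) / (h * h)
    return d2 + b * d1 + c * f(x)


# Hypothetical sample coefficients, one for each case.
samples = [(-3.0, 2.0), (2.0, 1.0), (2.0, 5.0)]
cases = [basis_solutions(b, c)[0] for (b, c) in samples]
print(cases)
```

The exact comparison `disc == 0` is adequate for coefficients like these; for coefficients obtained by measurement, a tolerance would be needed, and the choice of tolerance is a matter of judgment.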

Case III, in which $b^2 - 4c < 0$, and we get real solutions by making a detour into complex variables, may seem peculiar, but it is case III that has the most elementary application in physics: it describes the behavior of a vibrating spring. This problem is as follows. Suppose that you hang a coiled steel spring from a rigid support, like this:

[Figure: a coiled spring hanging from a rigid support.]

The spring has a certain natural length $L$. If you hang an object of weight $w$ at the bottom end, the spring will be stretched by a distance $s$. It turns out experimentally that if the weight is not too great, then the ratio $w/s$ is a constant $k$; that is, $s = w/k$; the stretch is proportional to the weight. This statement is called Hooke's law. The proportionality constant $k$ depends on the physical properties of the spring; the thicker and stiffer the spring, the larger $k$ will be. This law, of course, applies only within certain limits: if you hang a brick on the hairspring of a watch, the result will not be an illustration of the law. Note, however, that the validity of the law for a given spring and a given range of weights is capable of being tested by static experiments; and this is important, because we are about to deduce from Hooke's law first a differential equation and then a law of motion.

If the spring is in equilibrium, when stretched to a length $L + s$, with a weight $w$ at the bottom, then the spring must be exerting a force of magnitude $w = ks$, upward, to balance the force $w$ exerted downward by gravity. Let us now set up a coordinate system on the line which is the axis of the spring, in such a way that the origin is at the equilibrium point for the given weight. In the figure below, we omit the spring itself, to clarify the labeling.

[Figure: the coordinate axis along the spring, with the origin 0 at the equilibrium point and a point with coordinate x.]

Suppose that the spring has been stretched to a point with coordinate $x$. Then two forces are acting:

1) The force $F_1$ exerted by the spring. This is

$$F_1 = -k(x + s),$$

because $x + s$ is the total stretch. We use the minus sign because the $x$-axis is directed downward.

2) The weight $w$. This counts positively, because weight acts downward, in the positive direction on the $x$-axis. Therefore the total force is

$$F = -k(x + s) + w = -kx.$$

Now suppose that the weight is pulled down to a certain point $x_0$ and then released. Then the weight will bob up and down, with its position given as a function of the time. For $x = f(t)$, the velocity and acceleration are

$$v(t) = f'(t), \qquad a(t) = v'(t) = f''(t).$$

Newton's second law says that

$$F(t) = ma(t),$$

where $m$ is the mass. The force represented by the weight is equal to the mass $m$ times the acceleration $g$ of gravity. Thus $w = mg$, and $m = w/g$. This gives

$$F(t) = \frac{w}{g}\, a(t).$$

But we know that

$$F(t) = -kx.$$

Therefore the function $f$ which describes the motion must satisfy the differential equation

$$\frac{w}{g}\, a(t) = -kf(t),$$

which can be written in the form

$$f''(t) + \frac{kg}{w}\, f(t) = 0.$$

Since $k$, $g$, and $w$ are all positive, this has the form

$$f'' + 0f' + cf = 0 \qquad (c > 0),$$

where $b^2 - 4c = -4c < 0$.
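The derivation above leads to a short calculation. The following sketch uses hypothetical data (a 3-lb weight that stretches the spring 0.5 ft, the same weight then pulled 0.25 ft below equilibrium and released from rest) and assumes $g = 32$ ft/sec$^2$; the name `spring_motion` is ours. Since $f'' + cf = 0$ with $c = kg/w$, case III gives $f(t) = x_0 \cos \sqrt{c}\, t$ for these initial conditions.

```python
import math

G_FT_PER_S2 = 32.0   # assumed value of g, in ft/sec^2


def spring_motion(stretch_ft, weight_lb, hung_weight_lb, x0_ft):
    """Period and position function for a weight released from rest at x0."""
    k = weight_lb / stretch_ft                 # Hooke's law: k = w / s
    c = k * G_FT_PER_S2 / hung_weight_lb       # coefficient in f'' + c f = 0
    omega = math.sqrt(c)
    period = 2 * math.pi / omega

    def position(t):
        # Solution with f(0) = x0 and f'(0) = 0.
        return x0_ft * math.cos(omega * t)

    return period, position


period, position = spring_motion(0.5, 3.0, 3.0, 0.25)
print(period, position(0.0), position(period / 2))
```

Here $c = 64$, so the period is $\pi/4$ seconds, and half a period after release the weight is 0.25 ft above the equilibrium point.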

PROBLEM SET 13.7

In each of the following eight problems, find the solution space of the given differential equation, and then find the scalars which give the solution satisfying the initial conditions on the right. In each of these cases, you should use the methods but not the results of this section of the text. That is, set up the auxiliary equation, solve it, and then use the root(s) to get two solutions which form a linearly independent set.

f" - 5f' + 6/ = O; f(O) = 1,


= O; f(O) = 2, f'(O)
3. f" - 6f' + 9f = O; f(O) = 1,
4. f" + 4[ = O.; f(l) = 1, J'(l)
5. f" + f' + f = O; f(O) = f'(O)
6. f" - 7/' + 6f = O; f(O) = 0;
7. f" - 7/ = O; /(1)
0, /'(1)
8. 4f" - 4f' + f = O; f(O) = 1,
1.

2. f" - f'

j'(O)
= 3.
f'(O)
= 2.
= 0.
f'(O)
= 2.
f'(O)

= 2.
= 1.

= 1.
= 0.

Solve by any method:

9. A spring is such that an 8-lb weight stretches it 6 in. A 4-lb weight is attached, allowed to reach equilibrium, then pulled 2 in. below the equilibrium point and released. What happens? What is the period?

10. A spring is such that a 10-lb weight stretches it 18 in. A 1-lb weight is attached, allowed to reach equilibrium, pushed 6 in. above the equilibrium point, and released. What happens? What is the period?

11. You found, in Problem 27 of Problem Set 4.3, that the sine and cosine are the only functions $f$ and $g$ for which it is true that

$$f' = g, \qquad g' = -f, \qquad f(0) = 0, \qquad \text{and} \qquad g(0) = 1.$$

Show that there is only one function $f$ for which it is true that

$$f'' = -f, \qquad f(0) = 0, \qquad \text{and} \qquad f'(0) = 1.$$

*12. We know that the set of all infinite sequences of real numbers forms a vector space.
Letf1,f2,f3 be solutions of the differential equation
f"
and for

bf'

cf

= 0;

= 1, 2, 3, and every j, let

This gives three "vectors"

Y2 = (J21. Y22

. ),

Ya = (y31, Ys2,

),

which form a "3 by infinity matrix." Show that the rows of this matrix form a linearly
dependent set.
13.8 THE DIMENSION THEOREM FOR THE SPACE OF SOLUTIONS. THE NONHOMOGENEOUS CASE

In the preceding section, we found that for every equation of the form

$$f'' + bf' + cf = 0, \tag{1}$$

the solutions formed a linear space $\mathcal{V}$. In each case, we found solutions $f_1, f_2$ such that $\{f_1, f_2\}$ is linearly independent. We shall now show that the linear combinations

$$f = \alpha_1 f_1 + \alpha_2 f_2$$

are the only solutions.

Lemma. Every solution of (1) has derivatives of all orders.

Proof. Obviously every solution of (1) has a first and a second derivative, and

$$f'' = -bf' - cf.$$

Here the righthand side is differentiable, and so also is the lefthand side. Therefore

$$f^{(3)} = -bf^{(2)} - cf^{(1)}.$$

Similarly,

$$f^{(4)} = -bf^{(3)} - cf^{(2)};$$

and by induction,

$$f^{(n)} = -bf^{(n-1)} - cf^{(n-2)},$$

for every $n$, which proves the lemma.

It follows that for every solution of (1) we can write a "formal Taylor series"

$$\sum_{i=0}^{\infty} a_i (x - a)^i = \sum_{i=0}^{\infty} \frac{f^{(i)}(a)}{i!} (x - a)^i.$$

We shall now show that $f$ is real-analytic; that is, the Taylor series converges, for every $x$, and its sum is the function $f$ that we started with.

Theorem 1. If $f$ is a solution of (1), defined in a neighborhood of a point $a$, then

$$f(x) = \sum_{i=0}^{\infty} \frac{f^{(i)}(a)}{i!} (x - a)^i$$

for every $x$.

In the proof, we shall use Taylor's theorem. (This is Theorem

of Section

10.10.

Note that we are now using it for the first time.) The theorem says that for each $x$, the remainder

$$R_n(x) = f(x) - \sum_{i=0}^{n} \frac{f^{(i)}(a)}{i!} (x - a)^i$$

is given by the formula

$$R_n(x) = \frac{f^{(n+1)}(\bar{x})}{(n + 1)!} (x - a)^{n+1},$$

for some $\bar{x}$ between $a$ and $x$. We want to conclude that $\lim_{n \to \infty} R_n(x) = 0$; and to do this, we need to show that the numbers $f^{(n+1)}(\bar{x})$ cannot increase fast enough to overcome the effect of the $(n + 1)!$ in the denominator. It should be understood that $x$ is fixed, throughout the following discussion.

Let $k$ be any number such that

$$k \ge |b|, \qquad k \ge |c|, \qquad \text{and} \qquad k \ge 1;$$

and let $M$ be a number which is an upper bound for both $|f(t)|$ and $|f'(t)|$, on the interval

$$I = \{t \mid |t - a| \le |x - a|\}.$$

For each $t$ on $I$, we then have

$$|f''(t)| = |-bf'(t) - cf(t)| \le |b|\,|f'(t)| + |c|\,|f(t)| \le kM + kM = 2kM.$$

Similarly,

$$|f^{(3)}(t)| = |-bf''(t) - cf'(t)| \le 2k^2 M + kM < (2k)^2 M.$$

We now claim that

$$|f^{(n)}(t)| \le (2k)^{n-1} M$$

for every $n$. This is known for $n = 1$ and $n = 2$. And if it holds for two successive integers $n - 1$ and $n$, then it holds for the next integer $n + 1$:

$$\left\{ \begin{aligned} |f^{(n-1)}(t)| &\le (2k)^{n-2} M \\ |f^{(n)}(t)| &\le (2k)^{n-1} M \end{aligned} \right\} \Rightarrow \begin{aligned}[t] |f^{(n+1)}(t)| &= |-bf^{(n)}(t) - cf^{(n-1)}(t)| \\ &\le k\,|f^{(n)}(t)| + k\,|f^{(n-1)}(t)| \\ &\le k(2k)^{n-1} M + k(2k)^{n-2} M \\ &= (2k)^{n-2}(2k^2 + k)M < (2k)^n M, \end{aligned}$$

because $k \ge 1$. From this the theorem follows, because it gives

$$|R_n(x)| \le \frac{(2k)^n M}{(n + 1)!}\, |x - a|^{n+1},$$

which obviously approaches 0; in fact, the expression on the right is the $(n + 1)$st term of the series for

$$\frac{M}{2k}\, e^{2k|x - a|}.$$
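The bound on the remainder can be watched in a concrete case. The sketch below takes the equation $f'' + f = 0$ (so $b = 0$, $c = 1$, and we may use $k = 1$), the solution $f = \sin$ with $a = 0$, and $M = 1$ as a bound for $|\sin|$ and $|\cos|$; these choices, and the helper names, are ours. It compares the actual remainder with $(2k)^n M |x - a|^{n+1}/(n + 1)!$.

```python
import math


def taylor_partial_sum_sin(x, n):
    """Sum of f^(i)(0)/i! * x^i for i = 0..n, where f = sin."""
    derivs_at_0 = [0.0, 1.0, 0.0, -1.0]   # sin, cos, -sin, -cos at 0, repeating
    return sum(derivs_at_0[i % 4] * x**i / math.factorial(i) for i in range(n + 1))


def remainder_bound(x, n, k=1.0, M=1.0, a=0.0):
    """The bound (2k)^n * M * |x - a|^(n+1) / (n+1)! from the proof."""
    return (2 * k) ** n * M * abs(x - a) ** (n + 1) / math.factorial(n + 1)


x = 3.0
rows = [
    (n, abs(math.sin(x) - taylor_partial_sum_sin(x, n)), remainder_bound(x, n))
    for n in range(1, 31)
]
for n, actual, bound in rows[::5]:
    print(n, actual, bound)
```

The bound is far from sharp for this particular solution, but it tends to 0 as $n$ increases, and that is all that the proof requires.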

It is not an accident that the series for $f$ converges with the rapidity of an exponential series: the solutions that we have found, so far, for our differential equation have been combinations of exponentials, sines, and cosines; and we are about to find that these are the only solutions.
Theorem 2. If $f'' + bf' + cf = 0$, and $f(a) = f'(a) = 0$ for some $a$, then $f(x) = 0$ for every $x$.

Proof. Since $f^{(n)} = -bf^{(n-1)} - cf^{(n-2)}$, it follows by induction that $f^{(n)}(a) = 0$ for every $n$, and so all the coefficients in the Taylor series are equal to 0. This gives the result which was used without proof in the last problem set:

Theorem 3. If $f_1$ and $f_2$ are solutions of the equation $f'' + bf' + cf = 0$, and

$$f_1(a) = f_2(a) \qquad \text{and} \qquad f_1'(a) = f_2'(a)$$

for some $a$, then $f_1(x) = f_2(x)$ for every $x$.

This is because the difference function $f = f_1 - f_2$ is also a solution, and $f(a) = f'(a) = 0$.

The dimension theorem is now easy:

Theorem 4 (The dimension theorem). Let $\mathcal{V}$ be the space of solutions of the equation $f'' + bf' + cf = 0$. Then $\dim \mathcal{V} = 2$.

Proof. We found, in the preceding section, that every equation of this form has two linearly independent solutions. Therefore $\dim \mathcal{V} \ge 2$. It remains to show that every three solutions $f_1, f_2, f_3$ form a linearly dependent set. Consider the matrix

$$M = \begin{bmatrix} f_1(0) & f_1'(0) & f_1''(0) \\ f_2(0) & f_2'(0) & f_2''(0) \\ f_3(0) & f_3'(0) & f_3''(0) \end{bmatrix}.$$

This is not the Wronskian; it is not a matrix of functions but a matrix of numbers. Therefore Theorem 4 of Section 13.6 can be applied to it: since the columns of $M$ are linearly dependent (the third column is $-b$ times the second plus $-c$ times the first, because $f_i'' = -bf_i' - cf_i$), so also are the rows. Thus

$$\alpha_1 \bigl(f_1(0), f_1'(0), f_1''(0)\bigr) + \alpha_2 \bigl(f_2(0), f_2'(0), f_2''(0)\bigr) + \alpha_3 \bigl(f_3(0), f_3'(0), f_3''(0)\bigr) = (0, 0, 0)$$

for some scalars $\alpha_1, \alpha_2, \alpha_3$, not all equal to 0. Let

$$f = \alpha_1 f_1 + \alpha_2 f_2 + \alpha_3 f_3.$$

Then $f(0) = f'(0) = 0$, and so $f(x) = 0$ for every $x$. Therefore $\{f_1, f_2, f_3\}$ is linearly dependent, which was to be proved.
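The matrix $M$ in this proof can be written down for particular numbers. In the sketch below we take $b = 5$, $c = 6$, and three hypothetical pairs of initial values $f_i(0), f_i'(0)$; the third entry of each row is then forced by $f_i'' = -bf_i' - cf_i$. The helper names are ours, and the code works only with the initial values, not with the solutions themselves.

```python
def det3(m):
    """Determinant of a 3 by 3 matrix given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def initial_row(p, q, b, c):
    """(f(0), f'(0), f''(0)) for a solution with f(0) = p, f'(0) = q."""
    return (p, q, -b * q - c * p)


b_coef, c_coef = 5, 6
M = [
    initial_row(1, 0, b_coef, c_coef),
    initial_row(0, 1, b_coef, c_coef),
    initial_row(2, -7, b_coef, c_coef),   # hypothetical third solution
]
det_M = det3(M)
print(M, det_M)
```

The determinant is 0 whatever initial values are chosen, because the third column is always $-5$ times the second plus $-6$ times the first; here the third row is $2$ times the first plus $-7$ times the second.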
All the results that we have been getting for equations of order 2 can be generalized, in a straightforward way, to $n$th-order equations, of the form

$$f^{(n)} + \sum_{i=0}^{n-1} b_i f^{(i)} = 0.$$

As before, we try $f(x) = e^{mx}$, and form the auxiliary equation

$$m^n + \sum_{i=0}^{n-1} b_i m^i = 0.$$

If $m_1$ is a root of the equation, with multiplicity $k_1$ (so that $(m - m_1)^{k_1}$ divides the lefthand member), then the functions

$$e^{m_1 x},\ xe^{m_1 x},\ \ldots,\ x^{k_1 - 1} e^{m_1 x}$$

are solutions, and form a linearly independent set. If $\alpha + \beta i$, $\alpha - \beta i$ are a pair of conjugate complex roots, with multiplicity $k_2$, then the functions

$$e^{\alpha x} \cos \beta x,\ xe^{\alpha x} \cos \beta x,\ \ldots,\ x^{k_2 - 1} e^{\alpha x} \cos \beta x$$

and

$$e^{\alpha x} \sin \beta x,\ xe^{\alpha x} \sin \beta x,\ \ldots,\ x^{k_2 - 1} e^{\alpha x} \sin \beta x$$

are solutions, and are linearly independent. Moreover, the total set of functions obtained in this way forms a linearly independent set, and the number of elements in the set is $n$, because $n$ is the sum of the multiplicities of the roots of the auxiliary equation. Therefore the dimension of the solution space is at least equal to $n$. As for equations of order 2, it can be shown that all solutions of the equation are real-analytic; and matrix theory then furnishes a proof that every set of $n + 1$ solutions forms a linearly dependent set. It follows that the dimension of the solution space is exactly $n$.

Thus the results follow the pattern that we found for equations of order 2. However, to derive them, in a reasonably efficient and natural way, requires new theoretical ideas, and, in particular, a new kind of algebraic formalism. This theory is best postponed to a systematic course in differential equations.

Meanwhile we consider what happens, in a linear differential equation with constant coefficients, when the 0 on the right is replaced by a function. Consider, for example,

$$f''(x) + 5f'(x) + 6f(x) = e^x. \tag{1}$$

We know how to find all solutions of the equation

$$f'' + 5f' + 6f = 0; \tag{2}$$

the solution space is

$$\mathcal{V} = \{\alpha_1 e^{-2x} + \alpha_2 e^{-3x}\}.$$

Equation (2) is called the reduced equation. In general, a linear differential equation with 0 as its righthand member is called homogeneous.

If $f$ and $f_0$ are solutions of (1), then

$$f''(x) + 5f'(x) + 6f(x) = e^x$$

and

$$f_0''(x) + 5f_0'(x) + 6f_0(x) = e^x,$$

and so by subtraction we get

$$(f - f_0)'' + 5(f - f_0)' + 6(f - f_0) = 0.$$

Therefore the function $f - f_0$ is a solution of the reduced equation (2). This means that if we can find one solution $f_0$ of (1), then we can express all solutions of (1) in the form

$$f(x) = \alpha_1 e^{-2x} + \alpha_2 e^{-3x} + f_0(x);$$

every solution of (1) has this form, because $f - f_0$ is of the form

$$\alpha_1 e^{-2x} + \alpha_2 e^{-3x}.$$

If the function on the right is real-analytic, then there is a systematic scheme for looking for solutions of a nonhomogeneous equation: we assume that

$$f(x) = \sum_{i=0}^{\infty} a_i x^i,$$

and solve for the coefficients $a_i$ one at a time. But if the function on the right is simple, then the method of trial and error may work faster and lead to a simpler formula. In the example above, we try

$$f(x) = Ae^x, \qquad f'(x) = Ae^x, \qquad f''(x) = Ae^x,$$

and try to find $A$ so as to make $f$ a solution:

$$Ae^x + 5Ae^x + 6Ae^x = e^x, \qquad 12A = 1, \qquad A = \tfrac{1}{12}, \qquad f_0(x) = \tfrac{1}{12} e^x.$$

Therefore the solutions of (1) form a set

$$H = \{\alpha_1 e^{-2x} + \alpha_2 e^{-3x} + \tfrac{1}{12} e^x\}.$$

A set of this kind is called a hyperplane. The set $H$ does not form a subspace. In general, if $\mathcal{W}$ is a subspace of a linear space $\mathcal{V}$, and $f_0$ is any point of $\mathcal{V}$, then the set

$$H = \{f + f_0 \mid f \text{ in } \mathcal{W}\}$$

is called a hyperplane. Note that every subspace is automatically a hyperplane, because $f_0$ may be zero. The term hyperplane is suggested by the language of geometry in Cartesian 3-space. If $E$ is a plane through the origin, and $P_0$ is any point of $\mathbf{R}^3$, then the set

$$H = \{P + P_0 \mid P \text{ in } E\}$$

is a plane. (We use the prefix hyper because in vector spaces of higher dimension, the dimension of a hyperplane may easily be greater than 2. The set $H$ may, of course, be of dimension 1 or 0; every line in $\mathbf{R}^3$ forms a hyperplane, under the above definition, because every line through the origin forms a subspace. The same applies for a point, although we rarely have any occasion to say so.)
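It is easy to confirm by machine that every member of the set $H$ found above satisfies $f'' + 5f' + 6f = e^x$. In the following sketch the derivatives are written out by hand, and the sample values of $\alpha_1$, $\alpha_2$, and $x$ are arbitrary choices of ours.

```python
import math


def general_solution(a1, a2):
    """Return (f, f', f'') for f = a1*e^(-2x) + a2*e^(-3x) + e^x/12."""
    def f(x):
        return a1 * math.exp(-2 * x) + a2 * math.exp(-3 * x) + math.exp(x) / 12

    def df(x):
        return -2 * a1 * math.exp(-2 * x) - 3 * a2 * math.exp(-3 * x) + math.exp(x) / 12

    def d2f(x):
        return 4 * a1 * math.exp(-2 * x) + 9 * a2 * math.exp(-3 * x) + math.exp(x) / 12

    return f, df, d2f


def residual(a1, a2, x):
    """f'' + 5f' + 6f - e^x; this should vanish up to rounding error."""
    f, df, d2f = general_solution(a1, a2)
    return d2f(x) + 5 * df(x) + 6 * f(x) - math.exp(x)


worst = max(
    abs(residual(a1, a2, x))
    for a1 in (0.0, 1.5, -2.0)
    for a2 in (0.0, 0.7)
    for x in (-1.0, 0.0, 1.0)
)
print(worst)
```

The case $\alpha_1 = \alpha_2 = 0$ checks the particular solution $f_0$ by itself; the other cases illustrate that adding a solution of the reduced equation does not disturb the righthand side.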
Similar devices work for various other functions on the right in a nonhomogeneous equation. For example, consider

$$f''(x) + 5f'(x) + 6f(x) = \sin x.$$

First we try

$$(?) \qquad f(x) = A \sin x \qquad (?).$$

This gives

$$(?) \qquad -A \sin x + 5A \cos x + 6A \sin x = \sin x \qquad (?),$$

which is impossible, because $\{\sin, \cos\}$ is linearly independent. Next we try

$$(?) \qquad f(x) = A \sin x + B \cos x \qquad (?),$$

which gives

$$-A \sin x - B \cos x + 5(A \cos x - B \sin x) + 6A \sin x + 6B \cos x = \sin x$$

$$\Leftrightarrow \quad (-A - 5B + 6A) \sin x + (-B + 5A + 6B) \cos x = \sin x$$

$$\Leftrightarrow \quad \begin{cases} 5A - 5B = 1, \\ 5A + 5B = 0. \end{cases}$$

This gives

$$A = \tfrac{1}{10}, \qquad B = -\tfrac{1}{10}, \qquad f_0(x) = \tfrac{1}{10}(\sin x - \cos x),$$

so that the hyperplane of solutions is

$$H = \{\alpha_1 e^{-2x} + \alpha_2 e^{-3x} + \tfrac{1}{10}(\sin x - \cos x)\}.$$
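The last step, from the pair of linear equations to $A$ and $B$, and the check that the resulting $f_0$ really is a solution, can be sketched as follows. The solver `solve_2x2` is our own small helper (Cramer's rule with exact fractions), not a routine from the text.

```python
import math
from fractions import Fraction


def solve_2x2(a11, a12, b1, a21, a22, b2):
    """Solve a11*A + a12*B = b1, a21*A + a22*B = b2 by Cramer's rule."""
    det = a11 * a22 - a12 * a21
    return Fraction(b1 * a22 - a12 * b2, det), Fraction(a11 * b2 - b1 * a21, det)


# 5A - 5B = 1 and 5A + 5B = 0, from matching the coefficients of sin and cos.
A, B = solve_2x2(5, -5, 1, 5, 5, 0)


def residual(x):
    """f0'' + 5 f0' + 6 f0 - sin x for f0 = A sin x + B cos x."""
    a, b = float(A), float(B)
    f0 = a * math.sin(x) + b * math.cos(x)
    df0 = a * math.cos(x) - b * math.sin(x)
    d2f0 = -a * math.sin(x) - b * math.cos(x)
    return d2f0 + 5 * df0 + 6 * f0 - math.sin(x)


worst = max(abs(residual(i / 4)) for i in range(-12, 13))
print(A, B, worst)
```

Exact fractions are used for $A$ and $B$ so that the values $\tfrac{1}{10}$ and $-\tfrac{1}{10}$ appear as such; the residual is then evaluated in ordinary floating-point arithmetic.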

PROBLEM SET 13.8

For each of the following equations, find the space $\mathcal{V}$ of solutions. Answers should be in the form

$$\mathcal{V} = \{\alpha_1 f_1 + \alpha_2 f_2 + \cdots + \alpha_n f_n\}.$$

1. $f^{(4)} + 4f^{(3)} + 6f'' + 4f' + f = 0$.
2. $f^{(4)} + f^{(3)} - 3f'' - 5f' - 2f = 0$.
3. $f^{(4)} + 2f^{(3)} - 3f'' - 4f' + 4f = 0$.
4. $f^{(4)} + 2f'' + f = 0$.
5. $f^{(3)} - f'' + f' - f = 0$.
6. $f^{(4)} + f^{(3)} - f' - f = 0$.
7. $f^{(5)} - 2f^{(3)} + f' = 0$.

For each of the following, find (a) the space $\mathcal{V}$ of solutions of the reduced equation, in the same form as in the preceding problems, and (b) the hyperplane $H$ of solutions of the given equation, in the form

$$H = \{\alpha_1 f_1 + \alpha_2 f_2 + \cdots + \alpha_n f_n + f_0\}.$$

8. $f'' + f = e^x$
9. $f'' + f = \sin x$
10. $f'' + f = 1$
11. $f'' + f = \cos x$
12. $f'' + f = x + 1$
13. $f'' + f = x$
14. $f'' + f = x^3$
15. $f'' + f = e^x \sin x$
16. $f'' + f = x^2 + 1$
17. $f'' + f = e^x \cos x$
18. $f'' + f = xe^x$
19. $f'' + f = xe^x + e^x$
20. $f'' + f = x \sin x$
21. $f'' + f = x \cos x$
22. $f'' + f = \sin x + 2x \cos x$

14

Functions of Several Variables

14.1 SURFACES AND SOLIDS IN $\mathbf{R}^3$

We recall, from elementary geometry, the definition of a right cylinder. Given a set $B$ in a plane $E$, the right cylinder with base $B$ is the union of all lines that intersect $B$ and are perpendicular to $E$.

[Figure: two right cylinders, one with a curve as base and one with a region as base, each drawn with a lower base $B_1$ and an upper base $B_2$.]

Hereafter in this chapter, when we speak of a cylinder we shall always mean a right cylinder. Some other remarks are in order here.

1) The figures above suggest that a cylinder is a bounded figure, with a lower base $B_1$ and an upper base $B_2$. But this is merely a device for clarifying the meaning of the pictures; according to our definition, cylinders are of infinite extent, in each of two directions. In the same way, planes are unbounded, although we indicate them in pictures by drawing parallelograms.

2) The base may be any set of points in a plane. If the base is a curve, as on the left above, then the cylinder is a surface. If the base is a region, as on the right above, then the cylinder is a solid. The definition applies to each of these cases in exactly the same way. In each case, if the plane of the base is regarded as horizontal, then the cylinder is the union of all vertical lines that intersect the base.

To avoid possible confusion, we may distinguish these cases by speaking of cylindrical surfaces and cylindrical solids.

If the base is in the $xy$-plane, and is described by an equation in $x$ and $y$, then the same equation can be regarded as a description of the cylinder.

[Figure: the unit circle in the $xy$-plane (left), the same circle in perspective (center), and the portion of the cylinder in the first octant (right), with dotted guide lines.]

On the left above we show the unit circle in the $xy$-plane; this is the graph of the equation $x^2 + y^2 = 1$. In the center above, we see the same figure in perspective, as it appears when we are about to draw in a $z$-axis. In the three-dimensional figure on the right above we see the portion of the cylinder that lies in the first octant. The cylinder is the graph of the same equation $x^2 + y^2 = 1$; since the equation imposes no restriction on $z$, the graph includes the vertical line through each of its points. To be more precise, the circle is

$$\{(x, y) \mid x^2 + y^2 = 1\},$$

and the cylinder is

$$\{(x, y, z) \mid x^2 + y^2 = 1\}.$$

The relations among these figures deserve careful examination. At the left above, the tangents to the circle at the $y$-intercepts are horizontal, that is, parallel to the $x$-axis. This should be true also in the perspective drawings at the center and right. Hence the dotted guide lines. Similarly, the tangents to the circle at the $x$-intercepts are vertical, that is, parallel to the $y$-axis. This should also be true in the perspective drawings; it is indicated by the dotted guide lines. Often a correctly drawn figure looks peculiar, unless you analyze it in this way. For example:

[Figure: the curve $\{(x, y) \mid y = x^2,\ 0 \le x \le 1\}$ in the $xy$-plane, and the cylindrical surface $\{(x, y, z) \mid y = x^2,\ 0 \le x \le 1\}$ in perspective.]

To get a circular disk, instead of a circle, we use the inequality $x^2 + y^2 \le 1$ instead of the corresponding equation. Using this as base we get the solid cylinder, which is

$$\{(x, y, z) \mid x^2 + y^2 \le 1\}.$$

If we use the $xz$-plane or the $yz$-plane as the plane of the base, then the same scheme works, in a similar way. For example:

[Figure: the segment $x + z = 1$, $0 \le x \le 1$ in the $xz$-plane, and the cylinder $\{(x, y, z) \mid x + z = 1,\ 0 \le x \le 1\}$.]

As usual, the figure is cut off at the ends, to clarify it in a pictorial sense. In its own plane, the cylinder is an infinite strip, of width $\sqrt{2}$.

[Figure: the infinite strip of width $\sqrt{2}$.]

If we had used the entire line $x + z = 1$, $y = 0$ as base, then the cylinder would have been the entire plane

$$\{(x, y, z) \mid x + z = 1\}.$$

This plane is parallel to the $y$-axis. Thus any plane parallel to one of the coordinate axes can be described as a cylinder. In fact, for appropriate choice of the base plane, any plane whatever can be regarded as a cylinder.
We have seen that cylindrical surfaces with their bases in the coordinate planes are easy to describe by equations (if their bases are so describable). The next simplest surfaces are the surfaces of revolution, whose areas we learned to compute in Section 7.5. Given a curve in, say, the $yz$-plane, we may rotate the curve about the $y$-axis. This generates a surface.

[Figure: a surface of revolution about the $y$-axis.]

The cross sections of the surface, in planes parallel to the $xz$-plane, are all circles, with their centers on the $y$-axis. If the generating curve is described by a function, say

$$z = f(y) \ge 0,$$

then for each $y_0$, the cross section in the plane $y = y_0$ is the circle with center at $(0, y_0, 0)$ and radius $f(y_0)$. Thus the cross section is the graph of the condition

$$y = y_0, \qquad x^2 + z^2 = [f(y_0)]^2,$$

and the surface of revolution is the graph of the equation

$$x^2 + z^2 = [f(y)]^2.$$

Two important special cases are as follows:

1) Consider the generating curve

$$z = \sqrt{a^2 - y^2}, \qquad x = 0.$$

This is a semicircle. We rotate about the $y$-axis. The surface of revolution is the graph of the equation

$$x^2 + z^2 = \left[\sqrt{a^2 - y^2}\right]^2 = a^2 - y^2 \quad \Leftrightarrow \quad x^2 + y^2 + z^2 = a^2.$$

This is as it should be, because the surface of revolution is the sphere with center at the origin and radius $a$; it is easy to see by the distance formula that the sphere must be the graph of the equation

$$\sqrt{(x - 0)^2 + (y - 0)^2 + (z - 0)^2} = a \quad \Leftrightarrow \quad x^2 + y^2 + z^2 = a^2.$$

2) Consider the line

$$z = my, \qquad x = 0,$$

in the $yz$-plane. When we rotate about the $y$-axis, we get a cone (that is, a conical surface).

[Figure: the portion of the cone in the first octant.]

As usual, the figure shows only the first octant. The conical surface is the graph of the equation

$$x^2 + z^2 = (my)^2 \quad \Leftrightarrow \quad x^2 - m^2 y^2 + z^2 = 0.$$
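Both special cases can be tested by rotating sample points of the generating curve and substituting them in the equations obtained. The sketch below does this; the function name `rotate_about_y`, the radius $a = 2$, the slope $m = 0.75$, and the sample angles are arbitrary choices of ours.

```python
import math


def rotate_about_y(y, z, angle):
    """Rotate the point (0, y, z) of the yz-plane about the y-axis by `angle`."""
    return (z * math.sin(angle), y, z * math.cos(angle))


# Case 1: the semicircle z = sqrt(a^2 - y^2) should sweep out the sphere.
a = 2.0
sphere_points = [
    rotate_about_y(y, math.sqrt(a * a - y * y), t)
    for y in (-1.5, 0.0, 1.0)
    for t in (0.0, 1.0, 2.5, 4.0)
]
sphere_error = max(abs(x * x + y * y + z * z - a * a) for (x, y, z) in sphere_points)

# Case 2: the line z = m*y should sweep out the cone x^2 - m^2 y^2 + z^2 = 0.
m = 0.75
cone_points = [
    rotate_about_y(y, m * y, t)
    for y in (0.5, 1.0, 3.0)
    for t in (0.0, 1.0, 2.5)
]
cone_error = max(abs(x * x - m * m * y * y + z * z) for (x, y, z) in cone_points)
print(sphere_error, cone_error)
```

Each rotated point keeps its $y$-coordinate and its distance from the $y$-axis, which is why $x^2 + z^2 = [f(y)]^2$ holds at every one of them, up to rounding error.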

If we had taken a line through the origin and rotated it about one of the other coordinate axes, we would have gotten an equation of one of the forms

(a) $-m^2 x^2 + y^2 + z^2 = 0$,

(b) $x^2 + y^2 - m^2 z^2 = 0$.

[Figure: the cones (a) and (b).]

Each of the surfaces that we have investigated so far has been the graph of an equation of the second degree in $x$, $y$, and $z$, that is, an equation of the form

$$Ax^2 + By^2 + Cz^2 + Dxy + Exz + Fyz + Gx + Hy + Iz + J = 0,$$

where the first six coefficients are not all equal to 0. Using the method of rotation of axes in a plane, as in Section 8.4, we can find out what the plane cross sections of such surfaces are like. Let $E_0$ be any plane, and let $N$ be the normal line to $E_0$ through the origin. Let $F_0$ be the plane which contains $N$ and the $z$-axis, and let $L$ be the line in which $F_0$ intersects the $xy$-plane. By a rotation of axes in the $xy$-plane, we can make $L$ the $x'$-axis. The equations of the rotation are of the form

$$x = x' \cos \theta - y' \sin \theta \qquad \text{and} \qquad y = x' \sin \theta + y' \cos \theta.$$

In the new coordinate system, the equation of the surface that we started with has the form

$$A'x'^2 + B'y'^2 + C'z^2 + D'x'y' + E'x'z + F'y'z + G'x' + H'y' + I'z + J = 0.$$

(Query: How do we know that the constant term is unchanged? And how do we know that the first six coefficients are not all equal to 0?) In the $x'z$-plane, we now perform another rotation of axes, in such a way that $N$ becomes the new $x''$-axis. The equations for this rotation are of the form

$$x' = x'' \cos \varphi - z' \sin \varphi, \qquad z = x'' \sin \varphi + z' \cos \varphi;$$

and in the $x''$-, $y'$-, $z'$-coordinate system, the equation of our surface is still of the second degree, for the same reason as before. The plane $E_0$ is perpendicular to $N$, and so it is the graph of an equation of the form

$$x'' = k,$$

where $|k|$ is the distance between the origin and $E_0$. To get the equation of the intersection of the surface with $E_0$, we should set $x'' = k$ in the equation of the surface. This gives an equation of the second degree in $y'$ and $z'$. By Theorem 2 of Section 8.4, this means that every plane cross section of a second-degree surface is (a) a circle, (b) a parabola, (c) an ellipse, (d) a hyperbola, (e) a point, (f) the empty set, (g) a line, or (h) the union of two lines (either parallel or intersecting).

In particular, every plane cross section of a cone is a "conic section" of the sort that we investigated in Chapter 8.


PROBLEM SET 14.1

Sketch the graphs of the following, in the first octant only. All the equations are to be
regarded as equations in (x, y, z). For example, $x + y = 1$ is the equation of a plane,
$x^2 + y^2 - 1 = 0$ is the equation of a cylindrical surface, and so on.

1. $x + y = 1$
2. $x + z = 1$
3. $y - z = 1$
4. $x^2 + z^2 = 1$
5. $x^2 + y^2$
6. $y^2 + z^2 = 4$
7. $z = y^2$, $0 \le y \le 1$
8. $x = z^2$, $0 \le z \le 1$
9. $x = 4y^2$, $0 \le y \le 1$
10. $(x^2/4) + y^2 = 1$, $x, y \ge 0$
11. $(y^2/4) + z^2 = 1$, $y, z \ge 0$
12. $(x^2/4) + (z^2/9) = 1$, $x, z \ge 0$
13. $x^2 + (z^2/4) = 1$, $x, z \ge 0$
14. $x = |y|$
15. $|x| = y + 1$

Find equations for the surfaces described as follows, and sketch in the first octant.

16. The graph of $z = \sin y$, $0 \le y \le \pi$, is rotated about the y-axis.
17. The graph of $y + z = 1$ is rotated about the y-axis.
18. The same graph is rotated about the z-axis.
19. The line which passes through the origin and the point (1, 3, 1) is rotated about the
y-axis.
20. The same line is rotated about the z-axis.
21. The same line is rotated about the x-axis.
22. The graph of $y = e^x$, $0 \le x \le 1$, is rotated about the y-axis.
23. The same graph is rotated about the x-axis.
24. The graph of $y = |x| + 1$ is rotated about the x-axis.
25. The graph of $y = \cos z$, $0 \le z \le \pi/2$, is rotated about the z-axis.
26. The same graph is rotated about the y-axis.

14.2

THE QUADRIC SURFACES

A quadric surface is a surface which is the graph of an equation of the second
degree in x, y, and z. Thus all the surfaces discussed in the preceding section are
quadric surfaces. We now consider less simple cases.

1) Spheres and ellipsoids. Obviously the graph of the equation

$$x^2 + y^2 + z^2 = a^2$$

is a sphere of radius a. More generally, the graph of the equation

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1$$

is called an ellipsoid.

The sketch on the left shows the entire surface. On the right we show only the
part of the surface that lies in the first octant. Such partial sketches are much easier
to draw, and sometimes they are actually easier to interpret and to use.

It is easy to see, algebraically, that every cross section of an ellipsoid, parallel to
one of the coordinate axes, is an ellipse, a circle, a point, or the empty set. For
example, setting $x = x_0$ we get

$$\frac{y^2}{b^2} + \frac{z^2}{c^2} = 1 - \frac{x_0^2}{a^2} = k.$$

For $k > 0$ the cross section is an ellipse or a circle; for $k = 0$ the graph is the point
$(x_0, 0, 0)$; and for $k < 0$ the graph is empty. Similarly for $y = y_0$ or $z = z_0$.
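The three cases can be sketched in a few lines of code. This is only an illustration, assuming Python; the function name `ellipsoid_slice` and its return format are hypothetical. For $k > 0$ the equation can be rewritten as $y^2/(b^2k) + z^2/(c^2k) = 1$, so the semi-axes of the cross section are $b\sqrt{k}$ and $c\sqrt{k}$.

```python
import math

def ellipsoid_slice(x0, a, b, c):
    """Classify the cross section of x^2/a^2 + y^2/b^2 + z^2/c^2 = 1 in the plane x = x0.

    Returns (kind, semi_axis_y, semi_axis_z).
    """
    k = 1.0 - (x0 / a) ** 2
    if k > 0:
        # y^2/(b^2 k) + z^2/(c^2 k) = 1
        return ("ellipse or circle", b * math.sqrt(k), c * math.sqrt(k))
    if k == 0:
        return ("point", 0.0, 0.0)
    return ("empty", None, None)

for x0 in (0.0, 1.0, 2.0, 3.0):
    print(x0, ellipsoid_slice(x0, a=2.0, b=3.0, c=4.0))
```

With floating-point input the case $k = 0$ is hit exactly only for special values such as $x_0 = a$; a tolerance would be needed in any serious use.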
2) Elliptic and circular cones. We found in the last section that the graph of the
equation

$$x^2 - m^2y^2 + z^2 = 0$$

is a circular cone. More generally, the graph of

$$\frac{x^2}{a^2} - y^2 + \frac{z^2}{b^2} = 0$$

is a cone, either elliptic or circular.


[Figure: the cone, drawn in full and in the first octant, with a cross section at $y = y_0$.]

The figure on the left shows the entire cone, and the one on the right shows only the
portion that lies in the first octant. The cross section in the yz-plane is obviously a
pair of lines, because it is the graph of

$$x = 0, \qquad y^2 = \frac{z^2}{b^2}.$$

Similarly, the cross section in the xy-plane is a pair of lines, because it is the graph of

$$z = 0, \qquad y^2 = \frac{x^2}{a^2}.$$

The cross section in the plane $y = y_0$ is the graph of

$$y = y_0, \qquad \frac{x^2}{a^2} + \frac{z^2}{b^2} = y_0^2.$$

This is a point for $y_0 = 0$, and is a circle or an ellipse for $y_0 \ne 0$. Note that the cross
section in the plane $x = x_0$ ($x_0 \ne 0$) is the graph of

$$x = x_0, \qquad y^2 - \frac{z^2}{b^2} = \frac{x_0^2}{a^2},$$

which is a hyperbola.

3) The hyperboloid of one sheet. Consider the equation

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} - \frac{z^2}{c^2} = 1.$$

Setting $z = z_0$ and transposing, we get

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1 + \frac{z_0^2}{c^2} = k \ge 1.$$

Therefore all horizontal cross sections of the graph are ellipses. Rewriting in the form

$$\frac{x^2}{a^2k} + \frac{y^2}{b^2k} = 1,$$

we see that as $|z_0|$ increases the ellipses get bigger, but their shape does not change.
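The claim about size and shape can be illustrated numerically. The sketch below assumes Python, and the helper name `horizontal_slice_axes` is made up for this illustration. From the rewritten equation, the semi-axes of the cross section at height $z_0$ are $a\sqrt{k}$ and $b\sqrt{k}$, so their ratio is $a/b$ for every $z_0$.

```python
import math

def horizontal_slice_axes(z0, a, b, c):
    """Semi-axes of the ellipse cut from x^2/a^2 + y^2/b^2 - z^2/c^2 = 1 by the plane z = z0."""
    k = 1.0 + (z0 / c) ** 2  # k >= 1 for every z0
    return a * math.sqrt(k), b * math.sqrt(k)

a, b, c = 2.0, 1.0, 3.0
for z0 in (0.0, 1.0, 3.0, 10.0):
    p, q = horizontal_slice_axes(z0, a, b, c)
    # The axes grow with |z0|, while the ratio p/q stays equal to a/b.
    print(z0, p, q, p / q)
```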

[Figure: the hyperboloid of one sheet.]

The cross sections in the other coordinate planes are hyperbolas; they are the graphs
of the conditions

$$x = 0, \quad \frac{y^2}{b^2} - \frac{z^2}{c^2} = 1; \qquad y = 0, \quad \frac{x^2}{a^2} - \frac{z^2}{c^2} = 1.$$

4) The hyperboloid of two sheets. Consider the equation

$$\frac{x^2}{a^2} - \frac{y^2}{b^2} - \frac{z^2}{c^2} = 1 \iff \frac{y^2}{b^2} + \frac{z^2}{c^2} = \frac{x^2}{a^2} - 1.$$

Again we investigate cross sections. For $|x_0| < a$, the cross section in the plane
$x = x_0$ is empty; for $|x_0| = a$, the graph is a point; and for $|x_0| > a$, the graph is an
ellipse (or a circle), being the graph of

$$x = x_0, \qquad \frac{y^2}{b^2} + \frac{z^2}{c^2} = \frac{x_0^2}{a^2} - 1 > 0.$$

The cross sections in the xz-plane and the xy-plane are obviously hyperbolas.

5) The hyperbolic paraboloid. This one is hard to visualize and hard to sketch. It is
the graph of the equation

$$cz = \frac{y^2}{b^2} - \frac{x^2}{a^2} \qquad (c \ne 0).$$

We give the sketch for the case $a = b = c = 1$. Thus the equation of the surface
becomes

$$z = y^2 - x^2.$$

To sketch, we use the cross sections in the planes $z = -1$, $z = 1$, $x = 0$, and $y = 0$.
(Rather oddly, it is a bad idea to draw the cross section in the plane $z = 0$; this is a
pair of intersecting lines, and when we indicate it correctly, the result is very hard to
interpret pictorially.)

[Figure: the cross sections $x = 0$, $z = y^2$ and $y = 0$, $z = -x^2$.]

Using these cross sections in a perspective drawing, we get the result shown
below.
For other values of a and b, we get hyperbolas of different shapes in the horizontal
cross sections. And when the sign of c is changed, the effect is to reflect the
surface across the xy-plane.

[Figure: perspective drawing of the hyperbolic paraboloid.]

PROBLEM SET 14.2

Sketch the graphs of the following equations, and identify the surfaces.

1. x2+-+-= l
4
9

2. x2--+ - = 1
4
9

2
z2
y
3. x2+--- =1
9
4

2
z2
y
4. x2----= 1
4
9

2
z2
y
5. x2 - - - - =0
9
4

2
z2
y
6. x2+---=0
9
4

z2
2
y
7. -x2+---=0
9
4

8. x2

2
y
9. z=x2-4

z2

____

10. z2 =x2+t
4

-1


y2

y2

12. z2 = - - x

11. z =- - x2
4

13. z = xy [Hint: This is easier to sketch if you first identify it.]


14. z2 = x2 + 2xy + y2
15. Consider the hyperbolic paraboloid which is the graph of the equation

$$cz = \frac{y^2}{b^2} - \frac{x^2}{a^2} \qquad (c \ne 0).$$

Let $(x_0, y_0)$ be any point of the xy-plane. For each $\alpha$, consider the path whose coordinate
functions are

$$x = x_0 + t \cos\alpha, \qquad y = y_0 + t \sin\alpha.$$

(In effect, we have set up a coordinate system on a line L through $(x_0, y_0)$, in such a
way that L becomes the t-axis.) Now consider the path in space defined by the coordinate
functions

$$x = x_0 + t \cos\alpha, \qquad y = y_0 + t \sin\alpha,$$

$$z = \frac{1}{c}\left(\frac{y^2}{b^2} - \frac{x^2}{a^2}\right) = \frac{1}{b^2c}(y_0 + t \sin\alpha)^2 - \frac{1}{a^2c}(x_0 + t \cos\alpha)^2.$$

What kinds of curves can be the loci of such paths?

16. A surface S is said to be ruled if for each point P of S there is a line L which contains P
and lies entirely in S.

(Thus every cone is automatically a ruled surface, and so also

is every cylinder.) Show that every hyperbolic paraboloid is a ruled surface.


* 17. Show that any hyperboloid of one sheet is a ruled surface (see Problem 16 for definition).
18. Find the volume of the region inside the graph of the equation

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1.$$

19. Find the volume of the solid defined by the inequalities

$$0 \le y \le 1, \qquad -y \le x \le y, \qquad 0 \le z \le y^2 - x^2.$$

[This is the solid which lies (a) above the xy-plane, (b) below the hyperbolic paraboloid
$z = y^2 - x^2$, and (c) between the planes $y = 0$ and $y = 1$.]


Find the volume of the solid which lies between the planes z = 0 and z = 1, and
inside the one-sheeted hyperboloid

20.

2
y

2
= z + 1.

Same question for

21.

2
x

= z

+ 1.

22. Find the volume of the solid which lies between the planes $z = 1$ and $z = 2$ and the
two-sheeted hyperboloid

$$z^2 = x^2 + y^2 + 1.$$

14.3

FUNCTIONS OF TWO VARIABLES. SLICE FUNCTIONS AND PARTIAL DERIVATIVES

So far, most of the functions that we have been studying have been of the following
types.

1) Functions whose domains are sets of real numbers. In these cases, the domain
D was usually an interval.

2) Functions of one vector space into another. These were always linear, and were
referred to as linear transformations.

3) Sequences. (Every sequence $a_1, a_2, \ldots$ can be regarded as a function, with
$f(i) = a_i$, and sometimes it is useful to think of sequences in this way. You may be
able to recall such an occasion.)

We shall now investigate functions of the type

$$f: D \to \mathbf{R},$$

where D is a region in an inner-product space, and f is not necessarily linear.

[Figure: the graph $z = f(P)$ over a region D containing the point P.]

Suppose that a rule is given under which to each point P of D there corresponds a
real number z. We then say that we have a function

$$f: D \to \mathbf{R}.$$

As usual, the number corresponding to the point P of D is denoted by f(P). If D is
a region in $\mathbf{R}^2$, and $P = (x, y)$, then we may write $f(x, y)$ for $f(P)$; in this case, f is
called a function of two variables, by which we mean two real variables. In this case,
the graph of f is

$$\{(x, y, z) \mid (x, y) \text{ in } D \text{ and } z = f(x, y)\},$$

or, equivalently,

$$\{(P, z) \mid P \text{ in } D \text{ and } z = f(P)\},$$

as in the figure above.


Roughly speaking, f is continuous at a point $P_0$ of D if

$$P \approx P_0 \implies f(P) \approx f(P_0).$$

As for a function defined on an interval, the limit of f is defined, more generally,
for the case in which $P_0$ does not necessarily lie in D. In this case,

$$\lim_{P \to P_0} f(P) = L$$

means that

$$P \approx P_0 \text{ and } P \ne P_0 \implies f(P) \approx L.$$

To make these ideas precise, we interpret $P \approx P_0$ to mean that $\|P - P_0\|$ is small,
and we interpret $f(P) \approx L$ to mean that $|f(P) - L|$ is small. This gives the following
definition:

Definition. Let D be a region in an inner-product space $\mathcal{V}$, let f be a function $D \to \mathbf{R}$,
let $P_0$ be a point of $\mathcal{V}$ (not necessarily lying in D), and let L be a real number. Suppose
that for every $\epsilon > 0$ there is a $\delta > 0$ such that

$$0 < \|P - P_0\| < \delta \implies |f(P) - L| < \epsilon.$$

Then

$$\lim_{P \to P_0} f(P) = L.$$

Note that this is a quite straightforward generalization of the definition of the
statement $\lim_{x \to x_0} f(x) = L$, with $|x - x_0|$ replaced by $\|P - P_0\|$. Following, in the
same way, the pattern of our earlier work, we state:

Definition. Let f be a function $D \to \mathbf{R}$, and let $P_0$ be a point of D. If

$$\lim_{P \to P_0} f(P) = f(P_0),$$

then f is continuous at $P_0$.

The elementary theory of limits can now be generalized very easily.

Theorem 1. Let D be a region in a vector space $\mathcal{V}$, and let f and g be functions
$D \to \mathbf{R}$. If

$$\lim_{P \to P_0} f(P) = L \qquad \text{and} \qquad \lim_{P \to P_0} g(P) = L',$$

then

$$\lim_{P \to P_0} [f(P) + g(P)] = L + L' \qquad \text{and} \qquad \lim_{P \to P_0} [f(P)g(P)] = LL'.$$

If $L' \ne 0$, then

$$\lim_{P \to P_0} \frac{f(P)}{g(P)} = \frac{L}{L'}.$$

The proofs of these results are exactly the same as in Appendix B; we merely
need to translate the old proofs into the new language in the same way that we have
just translated the definitions. In this process, there is no advantage in using
coordinates, writing $f(x, y)$ for $f(P)$, $g(x, y)$ for $g(P)$, and so on, even if domains in $\mathbf{R}^2$ are
the only ones that we are concerned with. In fact, the use of coordinates in these
particular proofs merely complicates the notation and obscures the ideas. But for
other purposes, we need to know how continuity is related to coordinates, as in the
following:

Theorem 2. Let D be a region in $\mathbf{R}^2$, let f be a function $D \to \mathbf{R}$, and let $P_0 = (x_0, y_0)$
be a point of $\mathbf{R}^2$. Suppose that for every $\epsilon > 0$ there is a $\delta > 0$ such that

$$|x - x_0| < \delta \text{ and } |y - y_0| < \delta \implies |f(x, y) - L| < \epsilon.$$

Then

$$\lim_{P \to P_0} f(P) = L.$$

To see this, we merely need to examine the geometric meaning of the inequalities
on the left (preceding the =>).
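[Figure: the square $|x - x_0| < \delta$, $|y - y_0| < \delta$ and its inscribed circle.]

The inequalities $|x - x_0| < \delta$, $|y - y_0| < \delta$ hold inside the square, and the
inequality $\|P - P_0\| < \delta$ holds inside the inscribed circle. Therefore

$$\|P - P_0\| < \delta \implies |x - x_0| < \delta \text{ and } |y - y_0| < \delta \implies |f(x, y) - L| < \epsilon;$$

and so

$$\lim_{P \to P_0} f(P) = L.$$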

From this it follows, as for functions of one variable, that for functions of two
variables, continuity is preserved under composition of functions:

Theorem 3. Let f, g, and h be continuous functions of two variables. For each
(x, y), let

$$\phi(x, y) = f[g(x, y), h(x, y)].$$

(Equivalently, $\phi(P) = f[g(P), h(P)]$.) Then $\phi$ is continuous.

We give the proof in outline:

$$P \approx P_0$$
$$\implies (x, y) \approx (x_0, y_0) \qquad (1)$$
$$\implies x \approx x_0 \text{ and } y \approx y_0 \qquad (2)$$
$$\implies g(x, y) \approx g(x_0, y_0) \text{ and } h(x, y) \approx h(x_0, y_0) \qquad (3)$$
$$\implies f[g(x, y), h(x, y)] \approx f[g(x_0, y_0), h(x_0, y_0)]. \qquad (4)$$

The reasons for these implications are as follows:

1) $P = (x, y)$ and $P_0 = (x_0, y_0)$.
2) Any circle $\|P - P_0\| < \delta$ lies in its circumscribed square, as in the figure above.
3) g and h are continuous.
4) f is continuous, and Theorem 2 holds.

To verify that various functions defined by algebraic formulas are continuous,
we need the following:

Theorem 4. Let $\phi$ be a function of one variable; and for each x and y (in a domain D)
let

$$f(x, y) = \phi(x).$$

If $\phi$ is continuous, then f is continuous.

Note that the graph of such an f is always a portion of a cylinder, like this:

[Figure: a cylindrical surface over the xy-plane.]

Geometrically it is obvious that the cylinder must be a continuous surface, and an
algebraic proof is almost equally easy. Given $(x_0, y_0)$ in D, and $\epsilon > 0$. We know
that there is a $\delta > 0$ such that

$$|x - x_0| < \delta \implies |\phi(x) - \phi(x_0)| < \epsilon.$$

Therefore

$$|x - x_0| < \delta \text{ and } |y - y_0| < \delta \implies |\phi(x) - \phi(x_0)| < \epsilon \iff |f(x, y) - f(x_0, y_0)| < \epsilon.$$

Therefore, by Theorem 2, f is continuous.

These theorems enable us to infer that various simple functions, which obviously
ought to be continuous, really are. Consider, for example,

$$f(x, y) = \cos \frac{x^2y^2 - x^3y}{1 + x^2 + y^2}.$$

By Theorem 4 the functions

$$x^2, \quad y^2, \quad x^3, \quad y, \quad 1$$

are all continuous, considered as functions of two variables. By Theorem 1 it follows
that

$$x^2 + y^2, \quad x^2y^2, \quad \text{and} \quad x^3y$$

are continuous. Therefore

$$x^2y^2 - x^3y \quad \text{and} \quad 1 + x^2 + y^2$$

are continuous (Theorem 1 again). By Theorem 1, $(x^2y^2 - x^3y)/(1 + x^2 + y^2)$ is
continuous. By Theorem 3, so also is f.

We investigated the quadric surfaces by examining cross sections. We do the same
for the graphs of functions of two variables. Given a region D in $\mathbf{R}^2$, and a function

$$f: D \to \mathbf{R},$$

take a fixed $y_0$, and consider the intersection of the graph with the plane $y = y_0$.
The cross section is a curve, and is the graph of a function $\phi_{y_0}$: for each x for
which $(x, y_0)$ is in the domain of f, we have

$$\phi_{y_0}(x) = f(x, y_0).$$

Such a function is called a slice function of f. If the slice function is differentiable at
$x_0$, then its derivative is denoted by

$$f_x(x_0, y_0).$$

Geometrically, $f_x(x_0, y_0)$ is the slope of the surface in the x-direction at the point
$(x_0, y_0)$.

Naturally, we can restate this in purely analytic terms, without any reference to
the geometry, and without mentioning a slice function. We can state:

Definition.

$$f_x(x_0, y_0) = \lim_{x \to x_0} \frac{f(x, y_0) - f(x_0, y_0)}{x - x_0},$$

if such a limit exists. The function $f_x$ is called the partial derivative of f with respect to x.

Standard differentiation formulas give us partial derivatives very easily. For
example, given

$$f(x, y) = x^3 + xy^2 + x^2y + y^4,$$

we set $y = y_0$, to get

$$\phi_{y_0}(x) = x^3 + xy_0^2 + x^2y_0 + y_0^4.$$

This gives

$$\phi_{y_0}'(x) = 3x^2 + y_0^2 + 2xy_0 = f_x(x, y_0).$$

Dropping the subscript 0, we get

$$f_x(x, y) = 3x^2 + y^2 + 2xy.$$

The formal rule is simple: regard y as a constant, regard x as the dummy letter
defining the function, and differentiate.
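The formal rule can be compared with the definition numerically. The following is a minimal sketch, assuming Python; the helper `partial_x` is an illustrative name, and it approximates the limit in the definition by a central difference quotient of the slice function rather than computing the limit itself.

```python
def f(x, y):
    return x**3 + x * y**2 + x**2 * y + y**4

def f_x_formula(x, y):
    """Partial derivative obtained by treating y as a constant."""
    return 3 * x**2 + y**2 + 2 * x * y

def partial_x(func, x0, y0, h=1e-5):
    """Central difference quotient of the slice function x -> func(x, y0) at x0."""
    return (func(x0 + h, y0) - func(x0 - h, y0)) / (2 * h)

x0, y0 = 1.5, -0.5
# The two printed values should agree to several decimal places.
print(partial_x(f, x0, y0), f_x_formula(x0, y0))
```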

We can equally well consider slice functions in the y-direction, setting $x = x_0$.
The derivative of the slice function is now the partial derivative of f with respect to y.
More precisely:

Definition.

$$f_y(x_0, y_0) = \lim_{y \to y_0} \frac{f(x_0, y) - f(x_0, y_0)}{y - y_0},$$

if such a limit exists.

The partial derivatives $f_x$ and $f_y$, once we get them, are also functions; and their
partial derivatives are defined in the same way. For example, consider

$$f(x, y) = x^3 + xy^2 + x^2y + y^4.$$

Here

$$f_x(x, y) = 3x^2 + y^2 + 2xy, \qquad f_y(x, y) = 2xy + x^2 + 4y^3.$$

The partial derivative of $f_x$ with respect to x is denoted by $f_{xx}$. Thus

$$f_{xx}(x, y) = 6x + 2y.$$

Similarly

$$f_{yy}(x, y) = 2x + 12y^2.$$

Now $f_{xy}$ is the partial derivative of $f_x$ with respect to y. We have

$$f_{xy} = D_y(3x^2 + y^2 + 2xy) = 2y + 2x.$$

And similarly

$$f_{yx} = D_x(2xy + x^2 + 4y^3) = 2y + 2x.$$

Note that while $f_{xy}$ and $f_{yx}$ turned out to be the same function, they were not defined
in the same way, and they were not arrived at by the same process. Therefore the fact
that $f_{xy} = f_{yx}$, for this particular function f, must be due either to an accident or to a
nontrivial theorem.
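As a rough numerical illustration (not a proof), the two mixed partial derivatives can be approximated by nesting difference quotients in the two possible orders. This sketch assumes Python; `d_dx` and `d_dy` are hypothetical helper names that approximate partial differentiation by central differences.

```python
def f(x, y):
    return x**3 + x * y**2 + x**2 * y + y**4

def d_dx(g, h=1e-4):
    """Return a function approximating the partial derivative of g with respect to x."""
    return lambda x, y: (g(x + h, y) - g(x - h, y)) / (2 * h)

def d_dy(g, h=1e-4):
    """Return a function approximating the partial derivative of g with respect to y."""
    return lambda x, y: (g(x, y + h) - g(x, y - h)) / (2 * h)

f_xy = d_dy(d_dx(f))  # differentiate with respect to x first, then y
f_yx = d_dx(d_dy(f))  # differentiate with respect to y first, then x

x0, y0 = 1.0, 2.0
# Both values should be close to 2*y0 + 2*x0.
print(f_xy(x0, y0), f_yx(x0, y0), 2 * y0 + 2 * x0)
```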

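Finally, some remarks on notation. Often, people write

$$\frac{\partial f}{\partial x} \text{ for } f_x, \qquad \frac{\partial f}{\partial y} \text{ for } f_y,$$

$$\frac{\partial^2 f}{\partial x^2} \text{ for } f_{xx}, \qquad \frac{\partial^2 f}{\partial y^2} \text{ for } f_{yy},$$

$$\frac{\partial^2 f}{\partial y\,\partial x} \text{ for } f_{xy}, \qquad \frac{\partial^2 f}{\partial x\,\partial y} \text{ for } f_{yx},$$

and so on. Note, in the last line, that in the symbols $f_{xy}$ and $f_{yx}$, the letters indicating
partial differentiation accumulate on the right; while in the symbols

$$\frac{\partial^2 f}{\partial y\,\partial x} \qquad \text{and} \qquad \frac{\partial^2 f}{\partial x\,\partial y}$$

the letters indicating partial differentiation accumulate on the left. Thus

$$\frac{\partial^3 f}{\partial x\,\partial y\,\partial y} = f_{yyx} \qquad \text{and} \qquad \frac{\partial^3 f}{\partial y\,\partial y\,\partial x} = f_{xyy}.$$

Note that in the $\partial$-notation, the symbols for higher derivatives look like "products"
of "factors" of the types

$$\frac{\partial}{\partial x}, \qquad \frac{\partial}{\partial y}.$$

Thus

$$\frac{\partial}{\partial y}\,\frac{\partial}{\partial y}\,\frac{\partial}{\partial x} f(x, y) = \frac{\partial^3 f}{\partial y\,\partial y\,\partial x} = f_{xyy}.$$

This is why the symbols accumulate on the left instead of the right.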

PROBLEM SET 14.3

Citing the theorems of this section, at the points where you need them, show that each
of the following functions is continuous.
1.

f(x,y)

4. f(x,y)

vx2 + y2
xy

x 2 + y2
y2
x2
2
x + y2
_

6. f(x,y)
8.

f(x,y)

10. f(x,y)

sin x
=

(y

sin xy
-2x
+ y2

2. f(x,y)

Vx4 + y4 + 1

[(x,y) (0, O)]

5. f(x, y)

[(x,y) (0, O)]

7.

f(x,y)

0)

9.

f(x,y)

[(x,y) (0, O)]

11. f(x,y)

3. f(x,y)

-v'x2 + y2

sin (x2y + y2x)

xy
2
x + y2 + 1
cosy - 1

cosy

2
x-2-+ y

(x 0)
[(x,y) (0, 0)]

Problems 12 through 22. For each of the functions f given in Problems 1 through 11,
find $f_x$, $f_y$, $f_{xy}$, and $f_{yx}$.

23. Obviously the definition of f, in Problem 4, is valid only for $(x, y) \ne (0, 0)$. Is it
possible to give a separate definition of $f(0, 0)$, in such a way that the resulting function
is continuous? That is, is there any such thing as

$$\lim_{(x, y) \to (0, 0)} \frac{xy}{x^2 + y^2}\,?$$

Why or why not?


24. Same question, for the function defined in Problem 10.
25. Same question, for the function defined in Problem 11.
26. By a polynomial in x and y we mean a function f which is the sum of a finite number
of terms of the form $a_{ij}x^iy^j$, where the $a_{ij}$'s are constants. Thus

$$f(x, y) = \sum_{i=0}^{m} \sum_{j=0}^{n} a_{ij}x^iy^j.$$

Show that if f is a polynomial, then $f_{xy} = f_{yx}$.

27. Let us say, for short, that a function f is regular if $f_{xy} = f_{yx}$. Show that if f is regular
then so also is $f^2$.

28. Show that if f is regular and positive, then $\sqrt{f}$ is regular.

29. Show that if f is regular, then so also is $f^3$.

30. Show that if f is regular and never zero, then $1/f$ is regular.

31. Show that if f and g are regular, then so also is $f/g$, at every point (x, y) where
$g(x, y) \ne 0$.

32. Given a function f, and a point $(x_0, y_0)$. Let

$$\Delta f = f(x, y) - f(x_0, y_0).$$

For the function

$$f(x, y) = x^2y + y^2x,$$

show that $\Delta f$ can be expressed in the form

$$\Delta f = f_x(x_0, y_0)\,\Delta x + f_y(x_0, y_0)\,\Delta y + E(\Delta x, \Delta y)\,\Delta x + F(\Delta x, \Delta y)\,\Delta y,$$

where E and F are functions such that

$$\lim_{(\Delta x, \Delta y) \to (0, 0)} E(\Delta x, \Delta y) = \lim_{(\Delta x, \Delta y) \to (0, 0)} F(\Delta x, \Delta y) = 0.$$

*33. Write a complete proof of Theorem 3, showing that for every $\epsilon > 0$ there is a $\delta > 0$
such that ...

14.4

DIRECTIONAL DERIVATIVES AND DIFFERENTIABLE FUNCTIONS

The partial derivatives of a function $f: D \to \mathbf{R}$, where D is a region in $\mathbf{R}^2$, were
defined as the derivatives of the slice functions $z = f(x, y_0)$ and $z = f(x_0, y)$; and we
got the slice functions by taking cross sections of the graph of f, parallel to the
yz-plane and the xz-plane. We proceed to consider more general slice functions, obtained
by taking cross sections in any vertical plane whatever.

[Figure: a vertical cross section of the graph of f through the point $(x_0, y_0)$.]

Given a function

$$f: D \to \mathbf{R}$$

with continuous partial derivatives $f_x$ and $f_y$. Let $(x_0, y_0)$ be a point of D, let L be a
directed line through $(x_0, y_0)$, and let $\alpha$ be the angle between the positive direction on
L and the positive direction on the x-axis. Then L is the locus of the path

$$x = x_0 + t \cos\alpha, \qquad y = y_0 + t \sin\alpha.$$

We now form the composite function

$$\phi_\alpha(t) = f(x_0 + t \cos\alpha,\; y_0 + t \sin\alpha).$$

We call $\phi_\alpha$ the slice function in the direction $\alpha$. If $\phi_\alpha$ has a derivative at $t = 0$, then
$\phi_\alpha'(0)$ is called the derivative of f in the direction $\alpha$, and is denoted by $f_\alpha(x_0, y_0)$.
That is,

$$f_\alpha(x_0, y_0) = \lim_{t \to 0} \frac{f(x_0 + t \cos\alpha,\; y_0 + t \sin\alpha) - f(x_0, y_0)}{t}.$$

For $\alpha = 0$, the slice is parallel to the x-axis, and so it ought to be true that $f_0 = f_x$.
And this is true: for $\alpha = 0$ we have $\cos\alpha = 1$, $\sin\alpha = 0$, and

$$f_0(x_0, y_0) = \lim_{t \to 0} \frac{f(x_0 + t, y_0) - f(x_0, y_0)}{t} = \lim_{\Delta x \to 0} \frac{f(x_0 + \Delta x, y_0) - f(x_0, y_0)}{\Delta x} = f_x(x_0, y_0).$$

Similarly, for $\alpha = \pi/2$ we have

$$\cos\alpha = 0, \qquad \sin\alpha = 1,$$

and

$$f_{\pi/2}(x_0, y_0) = f_y(x_0, y_0).$$
We now want a general formula for $f_\alpha$. As a guide to what we should be aiming at,
we consider first the simplest case, in which f is linear, with

$$f(x, y) = Ax + By + C.$$

For each point $(x_0, y_0)$ and each $\alpha$, we have

$$\phi_\alpha(t) = f(x_0 + t \cos\alpha,\; y_0 + t \sin\alpha) = A(x_0 + t \cos\alpha) + B(y_0 + t \sin\alpha) + C,$$

$$\phi_\alpha'(t) = A \cos\alpha + B \sin\alpha.$$

Since

$$f_x(x, y) = A, \qquad f_y(x, y) = B,$$

for each x and y, it follows that these equations hold at the particular point $(x_0, y_0)$;
and so we have the following:

Theorem A. If f is a linear function of two variables, then

$$f_\alpha(x_0, y_0) = f_x(x_0, y_0) \cos\alpha + f_y(x_0, y_0) \sin\alpha$$

for each $\alpha$.

We shall now see that this formula holds under much more general conditions,
when f is not necessarily linear, but is "approximately linear near $(x_0, y_0)$," in a sense
which we shall define presently.


We recall that if f is a differentiable function of one variable, the difference

$$\Delta f = f(x_0 + \Delta x) - f(x_0)$$

can be expressed in the form

$$\Delta f = f'(x_0)\,\Delta x + E(\Delta x)\,\Delta x,$$

where

$$\lim_{\Delta x \to 0} E(\Delta x) = 0.$$

We want to get an analogous expression for the difference

$$\Delta f = f(x_0 + \Delta x, y_0 + \Delta y) - f(x_0, y_0).$$

It is fairly easy to find out what form the formula has to take if it exists at all. If f is
linear, with

$$f(x, y) = Ax + By + C,$$

then

$$\Delta f = Ax + By + C - (Ax_0 + By_0 + C) = A(x - x_0) + B(y - y_0) = A\,\Delta x + B\,\Delta y.$$

Here, as before,

$$A = f_x(x, y), \qquad B = f_y(x, y);$$

for a linear function, the partial derivatives are simply the coefficients of x and y.
This suggests that our expression for $\Delta f$ ought to take the form

$$\Delta f = f_x(x_0, y_0)\,\Delta x + f_y(x_0, y_0)\,\Delta y + [\,\text{- - -}\,],$$

where $[\,\text{- - -}\,]$ is, we hope, a function which approaches 0 very rapidly as
$(\Delta x, \Delta y) \to (0, 0)$. Consider, for example,

$$f(x, y) = x^2 + xy + y^2.$$

Here

$$\Delta f = (x_0 + \Delta x)^2 + (x_0 + \Delta x)(y_0 + \Delta y) + (y_0 + \Delta y)^2 - x_0^2 - x_0y_0 - y_0^2$$
$$= 2x_0\,\Delta x + \Delta x^2 + x_0\,\Delta y + y_0\,\Delta x + \Delta x\,\Delta y + 2y_0\,\Delta y + \Delta y^2$$
$$= (2x_0 + y_0)\,\Delta x + (x_0 + 2y_0)\,\Delta y + [\Delta x^2 + \Delta x\,\Delta y + \Delta y^2]$$
$$= f_x(x_0, y_0)\,\Delta x + f_y(x_0, y_0)\,\Delta y + [\,\text{- - -}\,],$$

where $[\,\text{- - -}\,] \to 0$ rapidly, as $\Delta x, \Delta y \to 0$.
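To see what "rapidly" means here, one can compute the bracketed remainder for $f(x, y) = x^2 + xy + y^2$ directly. The sketch below assumes Python, and the name `remainder` is illustrative only. It subtracts the linear part $(2x_0 + y_0)\,\Delta x + (x_0 + 2y_0)\,\Delta y$ from $\Delta f$ and compares the result with the size of the displacement.

```python
import math

def f(x, y):
    return x * x + x * y + y * y

def remainder(x0, y0, dx, dy):
    """Delta f minus its linear part f_x(x0, y0) dx + f_y(x0, y0) dy."""
    delta_f = f(x0 + dx, y0 + dy) - f(x0, y0)
    linear_part = (2 * x0 + y0) * dx + (x0 + 2 * y0) * dy
    return delta_f - linear_part

x0, y0 = 1.0, 2.0
for t in (1e-1, 1e-2, 1e-3):
    r = remainder(x0, y0, t, t)
    # r should be close to dx^2 + dx*dy + dy^2 = 3 t^2,
    # so r divided by the length of (dx, dy) shrinks along with t.
    print(t, r, r / math.hypot(t, t))
```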

We now attack the general problem, for functions with continuous partial
derivatives $f_x$, $f_y$. We can get from $(x_0, y_0)$ to $(x_0 + \Delta x, y_0 + \Delta y)$ by moving first
vertically and then horizontally. Algebraically,

$$\Delta f = f(x_0 + \Delta x, y_0 + \Delta y) - f(x_0, y_0 + \Delta y) + f(x_0, y_0 + \Delta y) - f(x_0, y_0).$$

[Figure: the path from $(x_0, y_0)$ up to $(x_0, y_0 + \Delta y)$ and across to $(x_0 + \Delta x, y_0 + \Delta y)$.]

We apply the mean-value theorem MVT to the function

$$\phi(x) = f(x, y_0 + \Delta y),$$

on the interval from $x_0$ to $x_0 + \Delta x$.

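[Figure: the graph $z = \phi(x)$ on the interval from $x_0$ to $x_0 + \Delta x$, with the point $\bar{x}$.]

The mean-value theorem says that there is an $\bar{x}$, between $x_0$ and $x_0 + \Delta x$, such
that

$$\phi(x_0 + \Delta x) - \phi(x_0) = \phi'(\bar{x})\,\Delta x.$$

This means that

$$f(x_0 + \Delta x, y_0 + \Delta y) - f(x_0, y_0 + \Delta y) = f_x(\bar{x}, y_0 + \Delta y)\,\Delta x.$$

[Figure: the graph $z = \psi(y)$ on the interval from $y_0$ to $y_0 + \Delta y$, with the point $\bar{y}$.]

Similarly, we apply MVT to the function

$$\psi(y) = f(x_0, y),$$

on the interval from $y_0$ to $y_0 + \Delta y$. Thus

$$\psi(y_0 + \Delta y) - \psi(y_0) = \psi'(\bar{y})\,\Delta y,$$

for some $\bar{y}$ between $y_0$ and $y_0 + \Delta y$; and so

$$f(x_0, y_0 + \Delta y) - f(x_0, y_0) = f_y(x_0, \bar{y})\,\Delta y.$$

Fitting these two results together, we get

$$\Delta f = f_x(\bar{x}, y_0 + \Delta y)\,\Delta x + f_y(x_0, \bar{y})\,\Delta y,$$

where $\bar{x}$ is between $x_0$ and $x_0 + \Delta x$, and $\bar{y}$ is between $y_0$ and $y_0 + \Delta y$.

We are now almost finished. For each $\Delta x$, $\Delta y$, let

$$E_1(\Delta x, \Delta y) = f_x(\bar{x}, y_0 + \Delta y) - f_x(x_0, y_0),$$

$$E_2(\Delta x, \Delta y) = f_y(x_0, \bar{y}) - f_y(x_0, y_0).$$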

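Then

$$f_x(\bar{x}, y_0 + \Delta y) = f_x(x_0, y_0) + E_1(\Delta x, \Delta y)$$

and

$$f_y(x_0, \bar{y}) = f_y(x_0, y_0) + E_2(\Delta x, \Delta y).$$

Therefore

$$\Delta f = f_x(x_0, y_0)\,\Delta x + f_y(x_0, y_0)\,\Delta y + E_1(\Delta x, \Delta y)\,\Delta x + E_2(\Delta x, \Delta y)\,\Delta y.$$

Note that

$$\lim_{(\Delta x, \Delta y) \to (0, 0)} E_1(\Delta x, \Delta y) = 0, \qquad \lim_{(\Delta x, \Delta y) \to (0, 0)} E_2(\Delta x, \Delta y) = 0,$$

because $f_x$ and $f_y$ are continuous. Thus, if the partial derivatives of f are continuous,
then $\Delta f$ is well approximated by the linear function

$$f_x(x_0, y_0)\,\Delta x + f_y(x_0, y_0)\,\Delta y.$$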

For functions of two or more variables, the idea of approximation by linear functions
is used as the definition of differentiability. To be exact:

Definition. Let D be a domain in $\mathbf{R}^2$, let f be a function $D \to \mathbf{R}$, and let $P_0 = (x_0, y_0)$
be a point of D. Suppose that there is a linear function

$$L(\Delta x, \Delta y) = A\,\Delta x + B\,\Delta y,$$

and functions $E_1$ and $E_2$, defined in a neighborhood of (0, 0), such that

$$\Delta f = A\,\Delta x + B\,\Delta y + E_1(\Delta x, \Delta y)\,\Delta x + E_2(\Delta x, \Delta y)\,\Delta y \qquad (1)$$

and

$$\lim_{(\Delta x, \Delta y) \to (0, 0)} E_1(\Delta x, \Delta y) = \lim_{(\Delta x, \Delta y) \to (0, 0)} E_2(\Delta x, \Delta y) = 0. \qquad (2)$$

Then f is said to be differentiable at $(x_0, y_0)$.

This definition was modeled on the preceding discussion, and so we have already
proved the following theorem.

Theorem 1. If $f_x$ and $f_y$ are continuous at $(x_0, y_0)$, then f is differentiable at $(x_0, y_0)$,
with

$$\Delta f \approx A\,\Delta x + B\,\Delta y = f_x(x_0, y_0)\,\Delta x + f_y(x_0, y_0)\,\Delta y.$$

For functions of one variable, we defined the differential to be the linear function
df,

$$df(\Delta x) = f'(x_0)\,\Delta x,$$

which gives good approximations of $\Delta f = f(x_0 + \Delta x) - f(x_0)$. For functions of
two variables, the differential is defined analogously:

Definition. If f is differentiable at $(x_0, y_0)$, then df is the function

$$df(\Delta x, \Delta y) = f_x(x_0, y_0)\,\Delta x + f_y(x_0, y_0)\,\Delta y.$$

The definition of differentiability is complicated to state, but it is easy to use.
It gives us the formula that we wanted, for directional derivatives:

Theorem 2. If f is differentiable at $(x_0, y_0)$, then for every direction $\alpha$,

$$f_\alpha(x_0, y_0) = f_x(x_0, y_0) \cos\alpha + f_y(x_0, y_0) \sin\alpha.$$

Proof. By definition,

$$f_\alpha(x_0, y_0) = \lim_{t \to 0} \frac{f(x_0 + t \cos\alpha,\; y_0 + t \sin\alpha) - f(x_0, y_0)}{t}.$$

This has the form

$$\lim_{t \to 0} \frac{f(x_0 + \Delta x, y_0 + \Delta y) - f(x_0, y_0)}{t} = \lim_{t \to 0} \frac{\Delta f}{t},$$

where

$$\Delta x = t \cos\alpha, \qquad \Delta y = t \sin\alpha.$$

Since f is differentiable,

$$\Delta f = A\,\Delta x + B\,\Delta y + E_1\,\Delta x + E_2\,\Delta y$$
$$= f_x(x_0, y_0)\,\Delta x + f_y(x_0, y_0)\,\Delta y + E_1\,\Delta x + E_2\,\Delta y$$
$$= f_x(x_0, y_0)\,t \cos\alpha + f_y(x_0, y_0)\,t \sin\alpha + E_1\,t \cos\alpha + E_2\,t \sin\alpha.$$

Therefore

$$\frac{\Delta f}{t} = f_x(x_0, y_0) \cos\alpha + f_y(x_0, y_0) \sin\alpha + E_1(t \cos\alpha, t \sin\alpha) \cos\alpha + E_2(t \cos\alpha, t \sin\alpha) \sin\alpha,$$

and so

$$f_\alpha(x_0, y_0) = \lim_{t \to 0} \frac{\Delta f}{t} = f_x(x_0, y_0) \cos\alpha + f_y(x_0, y_0) \sin\alpha + 0 \cdot \cos\alpha + 0 \cdot \sin\alpha,$$

which was to be proved.
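The formula of Theorem 2 can be compared with a direct difference quotient of the slice function. This is a minimal sketch, assuming Python; the test function $f(x, y) = x^2y$ at the point (1, 1) and the helper names are choices made for this illustration. The numeric version uses a symmetric difference quotient at $t = 0$, which is only an approximation of the limit.

```python
import math

def f(x, y):
    return x * x * y

def directional_numeric(func, x0, y0, alpha, t=1e-6):
    """Symmetric difference quotient of the slice function in the direction alpha, at t = 0."""
    dx, dy = t * math.cos(alpha), t * math.sin(alpha)
    return (func(x0 + dx, y0 + dy) - func(x0 - dx, y0 - dy)) / (2 * t)

def directional_formula(fx, fy, alpha):
    """f_x cos(alpha) + f_y sin(alpha)."""
    return fx * math.cos(alpha) + fy * math.sin(alpha)

# At (1, 1): f_x = 2xy = 2 and f_y = x^2 = 1.
for alpha in (0.0, math.pi / 6, math.pi / 2, 2.5):
    print(alpha, directional_numeric(f, 1.0, 1.0, alpha), directional_formula(2.0, 1.0, alpha))
```

For $\alpha = 0$ and $\alpha = \pi/2$ the two columns reduce to $f_x$ and $f_y$, as expected.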


In the first nine problems in the following problem set, you are asked to
"verify directly" that certain functions are differentiable at certain points. In each
of these cases, you should go through an elementary calculation to express $\Delta f$ in the
form

$$\Delta f = A\,\Delta x + B\,\Delta y + E_1\,\Delta x + E_2\,\Delta y.$$

For example, given

$$f(x, y) = x^2y, \qquad (x_0, y_0) = (1, 1),$$

you would proceed as follows:

$$\Delta f = f(x_0 + \Delta x, y_0 + \Delta y) - f(x_0, y_0)$$
$$= (1 + \Delta x)^2(1 + \Delta y) - 1$$
$$= 1 + 2\,\Delta x + \Delta x^2 + \Delta y + 2\,\Delta x\,\Delta y + \Delta x^2\,\Delta y - 1$$
$$= 2\,\Delta x + \Delta y + (\Delta x + 2\,\Delta y)\,\Delta x + (\Delta x^2)\,\Delta y.$$

The answer can now be written in the form above, with $A = 2$, $B = 1$,
$E_1(\Delta x, \Delta y) = \Delta x + 2\,\Delta y$, and $E_2(\Delta x, \Delta y) = \Delta x^2$.

Note that other choices of $E_1$ and $E_2$ would have worked just as well. For example,

$$\Delta f = 2\,\Delta x + \Delta y + (\Delta x)\,\Delta x + (\Delta x^2 + 2\,\Delta x)\,\Delta y.$$

Therefore each of the first nine problems below has more than one right answer.
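Both decompositions of $\Delta f$ for $f(x, y) = x^2y$ at (1, 1) can be checked against the definition. The following is a small sketch, assuming Python; the function names are illustrative and not part of the text.

```python
def delta_f(dx, dy):
    """f(1 + dx, 1 + dy) - f(1, 1) for f(x, y) = x^2 y."""
    return (1 + dx) ** 2 * (1 + dy) - 1

def decomposition_1(dx, dy):
    # E1 = dx + 2 dy, E2 = dx^2
    return 2 * dx + dy + (dx + 2 * dy) * dx + (dx ** 2) * dy

def decomposition_2(dx, dy):
    # E1 = dx, E2 = dx^2 + 2 dx
    return 2 * dx + dy + (dx) * dx + (dx ** 2 + 2 * dx) * dy

for dx, dy in ((0.1, 0.2), (-0.3, 0.05), (1e-3, -1e-3)):
    # All three values should agree up to rounding error.
    print(delta_f(dx, dy), decomposition_1(dx, dy), decomposition_2(dx, dy))
```

In both choices, $E_1$ and $E_2$ approach 0 as $(\Delta x, \Delta y) \to (0, 0)$, which is all that the definition requires.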

PROBLEM SET 14.4

Verify directly that each of the following nine functions is differentiable at the indicated
point.

1. f(x, y) =xy,

(2, 1)

2. f(x, y) =x2y2,

3.

f(x, y) =xy2,

(1,1)

4. f(x, y) =x3,

5.

f(x, y) =x2 - y2,

(1,1)

6. f(x, y) =y4,

7.

f(x, y) =x2

( -1, 1)

8.

9.

f(x, y) =x2 - 4y2,

y2,

f(x, y) =4x2

(-1, -1)

(0,0)
(
+

y2,

1 1)
,

(1, 1)

(1, - 1 )

10. Given $f(x, y) = \sqrt{x^2 + y^2}$, $(x_0, y_0) = (1, 1)$, get a general formula for $f_\alpha(x_0, y_0)$.
For which $\alpha$ does $f_\alpha(x_0, y_0)$ take on its maximum value? For which $\alpha$ do we get the
minimum value?

11. Same question, for $f(x, y) = x^2 - y^2$, $(x_0, y_0) = (1, 1)$.

12. Same question for $f(x, y) = xy$, $(x_0, y_0) = (1, 1)$.

13. Suppose that f has a directional derivative $f_\alpha$ in every direction $\alpha$, at a point $(x_0, y_0)$.
Is it possible that $f_\alpha(x_0, y_0) > 0$ for every $\alpha$? Why or why not? (Try to answer this
one merely on the basis of the definition of $f_\alpha$, without appealing to Theorem 2.)

14. Show that if $f_\alpha(x_0, y_0) = 0$ for every $\alpha$, then $f_x(x_0, y_0) = f_y(x_0, y_0) = 0$.

15. Give an example to show that the following "Theorem" is false:

Theorem (?). "Given $f: D \to \mathbf{R}$. If $f_x(x, y) = f_y(x, y) = 0$, for every (x, y) in D, then
f is a constant."

16. Show that the following theorem is true:

Theorem A. Given

$$z = f(x, y) \qquad (a < x < b, \; c < y < d).$$

If $f_x(x, y) = f_y(x, y) = 0$, for every (x, y) in the given domain, then f is a constant.
Here we are requiring that the domain be a rectangular region with sides parallel to
the x- and y-axes.

[Figure: the rectangle $a < x < b$, $c < y < d$.]

17. Theorem A, stated in Problem 16, is artificially special; it does not apply, as it stands,
to domains like the following:

[Figure: a non-rectangular domain D.]

Find a way of describing the property of D that is really needed in the proof of Theorem
A, and prove a theorem which uses your more general hypothesis.

14.5

THE CHAIN RULE FOR PATHS

In the preceding section, we defined the directional derivative $f_\alpha$ as the derivative of f
along a linear path

$$x = g(t) = x_0 + t \cos\alpha, \qquad y = h(t) = y_0 + t \sin\alpha,$$

and we found that if f is differentiable, then

$$f_\alpha(x_0, y_0) = f_x(x_0, y_0) \cos\alpha + f_y(x_0, y_0) \sin\alpha.$$

This result can be generalized, so as to apply to derivatives along paths which are not
necessarily linear. Suppose that a path P is defined by a pair of coordinate functions.
Strictly speaking, we should write

$$x = g(t), \qquad y = h(t), \qquad a < t < b.$$

But it is easier to keep track if we use the letters x and y as the names of the
coordinate functions. Thus we write

$$x = x(t), \qquad y = y(t), \qquad a < t < b.$$

Let $F: D \to \mathbf{R}$ be a differentiable function, and suppose that the locus of the
path P lies in D. We can then form the composite function

$$\phi(t) = F(x(t), y(t)),$$

and we have the following theorem:

Theorem 1 (The chain rule for paths). Let $\phi(t) = F(x(t), y(t))$. If F, x(t), and y(t) are
differentiable, then $\phi$ is differentiable, and

$$\phi'(t) = F_x(x(t), y(t))\,x'(t) + F_y(x(t), y(t))\,y'(t).$$

For example, consider

$$F(x, y) = x^2y + y^2x, \qquad x(t) = \cos t, \qquad y(t) = \sin t.$$

Here the locus of the path P is a circle. And

$$\phi(t) = \cos^2 t \sin t + \sin^2 t \cos t,$$

$$\phi'(t) = -2 \cos t \sin^2 t + \cos^3 t + 2 \sin t \cos^2 t - \sin^3 t.$$

Since

$$F_x(x, y) = 2xy + y^2 \qquad \text{and} \qquad F_y(x, y) = x^2 + 2xy,$$

Theorem 3 gives us

$$\phi'(t) = (2 \cos t \sin t + \sin^2 t)(-\sin t) + (\cos^2 t + 2 \sin t \cos t) \cos t,$$

which is the right answer.


We proceed to the proof. Take a fixed

Xo
Lix
Liy

x(to),

t)

+ (cos2

2 sin t cost ) cost,

t0, and let


Yo

YUo),

x(t0

Lit) - x(t0),

y(t0

Lit) - YUo)

In this notation,
"-'
't' (t0)

1.
Im

Mo

(Re-examine the definition of

F(x0

Lix) - F(x0, Yo)

F(x0

cf>.)
=

Lix,Yo

Now

LiF
F,ix0, y0) Lix

where
as
Therefore

Liy) - F(x0,y0)
Lit

Fy(x0, y0) Liy

(Lix, Liy)---+ (0, 0).

E1 Lix

E Liy,
2


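Since $x$ and $y$ are differentiable, they are continuous, so that $(\Delta x, \Delta y) \to (0, 0)$, and hence $E_1 \to 0$ and $E_2 \to 0$, as $\Delta t \to 0$. Therefore
\[ \phi'(t_0) = \lim_{\Delta t \to 0} \frac{\Delta F}{\Delta t} = F_x(x_0, y_0)x'(t_0) + F_y(x_0, y_0)y'(t_0) + 0 \cdot x'(t_0) + 0 \cdot y'(t_0) \]
\[ = F_x(x_0, y_0)x'(t_0) + F_y(x_0, y_0)y'(t_0). \]
This holds for every $t_0$. Therefore
\[ \phi'(t) = F_x(x(t), y(t))\,x'(t) + F_y(x(t), y(t))\,y'(t), \]
which was to be proved.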


Briefly,
\[ \phi' = F_x x' + F_y y'. \]
In the "fractional" notation,
\[ \frac{d\phi}{dt} = \frac{\partial F}{\partial x}\frac{dx}{dt} + \frac{\partial F}{\partial y}\frac{dy}{dt}. \]
Both of these short formulas should be regarded as abbreviations of the longer
formula preceding them. The last of these three formulas is the easiest to remember,
especially if we give it an intuitive interpretation, as follows. The derivative $d\phi/dt$
measures the rate of change of $\phi$, when $t$ changes slightly. The change in $\phi$ is due to
(1) the change in $x$ and (2) the change in $y$. These two effects are measured by the
quantities
\[ \frac{\partial F}{\partial x}\frac{dx}{dt}, \qquad \frac{\partial F}{\partial y}\frac{dy}{dt}. \]
The formula says that to combine these two effects, we simply add them.
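The worked example of this section can be checked numerically. The following is a minimal sketch, assuming the example $F(x, y) = x^2 y + y^2 x$ on the circle $x = \cos t$, $y = \sin t$; the function names and the step size `h` are illustrative choices, not part of the text. It compares the chain-rule value of $\phi'(t)$ with a central difference quotient.

```python
import math

def F(x, y):
    # The example function F(x, y) = x^2 y + y^2 x.
    return x**2 * y + y**2 * x

def F_x(x, y):
    # Partial derivative of F with respect to x.
    return 2 * x * y + y**2

def F_y(x, y):
    # Partial derivative of F with respect to y.
    return x**2 + 2 * x * y

def phi(t):
    # Composite function along the circular path.
    return F(math.cos(t), math.sin(t))

def phi_prime_chain_rule(t):
    # phi'(t) = F_x * x'(t) + F_y * y'(t), with x' = -sin t, y' = cos t.
    x, y = math.cos(t), math.sin(t)
    return F_x(x, y) * (-math.sin(t)) + F_y(x, y) * math.cos(t)

def phi_prime_numeric(t, h=1e-6):
    # Central difference approximation to phi'(t).
    return (phi(t + h) - phi(t - h)) / (2 * h)

for t in (0.0, 0.7, 2.0):
    print(t, phi_prime_chain_rule(t), phi_prime_numeric(t))
```

The two printed columns should agree to several decimal places; the small discrepancy comes from the difference quotient, not from the chain rule.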

PROBLEM SET

14.5

1. Given $F(x, y) = \cos xy$, $x(t) = t^2 + 1$, $y(t) = t^3$, and $\phi(t) = F(x(t), y(t))$, find $\phi'$.
Did you need to use Theorem 1?

2. Same question, for F(x,y) =sinxy,x(t) =t2 + 1, y(t) =t2

1, <f>(t) =F(x(t),y(t)).

3. Same question, for $F(x, y) = 2xy$, $x(t) = \cos t$, $y(t) = \sin t$, $\phi(t) = F(x(t), y(t))$.
4. Same question, for $F(x, y) = x^2 + y^2$, $x(t) = \cos t$, $y(t) = \sin t$, $\phi(t) = F(x(t), y(t))$.
5. Same question, for $F(x, y) = xy$, $x(t) = t^2$, $y(t) = t^3$, $\phi(t) = F(x(t), y(t))$.
6. Same question, for $F(x, y) = x^2 + y^2$, $x(t) = \cos t$, $y(t) = \sin t$, $\phi(t) = F(x(t), y(t))$.

7. Given $F(x, y) = xy$, $x = x(t)$, $y = y(t)$, and $\phi(t) = F(x(t), y(t))$, find $\phi'$, using
Theorem 1. (This will give you a circuitous derivation of the formula for the derivative
of a product.)

8. Same question, for $F(x, y) = x/y$. (This will give you an equally circuitous derivation
of the formula for the derivative of a quotient.)


9. Same question, for $F(x, y) = x^y$, $x > 0$. This will give you the formula
\[ D[g^h] = h g^{h-1} g' + (g^h \ln g)\,h'. \]
This formula is easy to remember: first we differentiate as though the exponent were
a constant, then we differentiate as though the base were a constant, and then we add
the results.

10. Now derive the same differentiation formula, without using the theory developed in
this chapter, appealing only to the basic definition

(a > 0).

14.6

DIFFERENTIABLE FUNCTIONS OF MANY VARIABLES. THE CHAIN RULE

All the ideas which we developed in the last section, for functions of two variables,
can be generalized immediately for functions of any number of variables. Limits
and continuity have already been defined in the general case, in Section 14.3.
Following the pattern of Section 14.4, we say that a function of $n$ variables is
differentiable at a point if the difference function is well approximated by a linear function,
in small neighborhoods of the point. The definition is as follows:

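Definition. Let $D$ be a region in $R^n$, let $f$ be a function $D \to R$, and let
\[ P_0 = (a_1, a_2, \ldots, a_n) \]
be a point of $D$. For each point $P = (x_1, x_2, \ldots, x_n)$ of $D$, let
\[ \Delta x_i = x_i - a_i \qquad (i = 1, 2, \ldots, n), \]
let
\[ \Delta P = (\Delta x_1, \Delta x_2, \ldots, \Delta x_n), \]
and let
\[ \Delta f = f(P) - f(P_0). \]
Suppose that there is a linear function
\[ L(\Delta P) = L(\Delta x_1, \Delta x_2, \ldots, \Delta x_n) = A_1\,\Delta x_1 + A_2\,\Delta x_2 + \cdots + A_n\,\Delta x_n, \]
and a set of functions $E_1, E_2, \ldots, E_n$, defined in a neighborhood of 0, such that
\[ \Delta f = L(\Delta x_1, \Delta x_2, \ldots, \Delta x_n) + E_1(\Delta P)\,\Delta x_1 + E_2(\Delta P)\,\Delta x_2 + \cdots + E_n(\Delta P)\,\Delta x_n, \qquad (1) \]
and
\[ \lim_{\Delta P \to 0} E_i(\Delta P) = 0 \qquad (2) \]
for each $i$. Then $f$ is differentiable at $P_0$.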


Partial derivatives are defined in exactly the same way as for functions of two
variables. For example,
\[ f_{x_1}(P_0) = \lim_{\Delta x_1 \to 0} \frac{f(a_1 + \Delta x_1, a_2, \ldots, a_n) - f(a_1, a_2, \ldots, a_n)}{\Delta x_1}, \]
and so on. Sometimes it is convenient to write $f_1(P_0)$ for $f_{x_1}(P_0)$, and in general
$f_i(P_0)$ for $f_{x_i}(P_0)$; that is, $f_i$ is the derivative of $f$ with respect to the $i$th coordinate
in $R^n$. Thus, for $R^n = R^2$, $P = (x, y)$, we may write $f_1$ for $f_x$ and $f_2$ for $f_y$.
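Just as in the preceding section, if a function is differentiable, then it has all its
first partial derivatives, and these are the coefficients in the linear approximation
$L(\Delta P)$:

Theorem 1. If $f$ is differentiable at $P_0$, with $\Delta f \approx L(\Delta P)$, then $f_i(P_0)$ is defined for
each $i$, and
\[ L(\Delta P) = f_1(P_0)\,\Delta x_1 + f_2(P_0)\,\Delta x_2 + \cdots + f_n(P_0)\,\Delta x_n. \]

The proof is just the same as for two variables: we take a fixed integer $k$, and set
$\Delta x_i = 0$ for $i \ne k$, so that $\|\Delta P\| = |\Delta x_k|$. Then
\[ \Delta f = L(\Delta P) + E_k(\Delta P)\,\Delta x_k, \]
because all other error terms on the right get multiplied by 0. Since $\Delta x_i = 0$ for $i \ne k$,
we have
\[ L(\Delta P) = A_k\,\Delta x_k. \]
Therefore
\[ \frac{\Delta f}{\Delta x_k} = A_k + E_k(\Delta P), \]
and
\[ f_k(P_0) = \lim_{\Delta x_k \to 0} \frac{\Delta f}{\Delta x_k} = A_k. \]

As before, the differential of $f$ at $P_0$ is defined to be the linear function which gives
good approximations of $\Delta f$. Thus
\[ df = f_1(P_0)\,\Delta x_1 + f_2(P_0)\,\Delta x_2 + \cdots + f_n(P_0)\,\Delta x_n. \]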

By now it should be clear that we are in much the same situation as we were
when dealing with $n$ by $n$ matrices, in Chapter 13: the ideas are adequately conveyed
by the case $n = 3$, but the notation of the general case is tedious. For this reason,
we shall often deal hereafter with only three variables, using
\[ P = (w, x, y), \qquad \Delta P = (\Delta w, \Delta x, \Delta y). \]
You will probably find it easier to generalize the ideas for yourself, in your head, than
to read a generalized version.
Theorem 2. Let $D$ be a domain in $R^3$, and let $f$ be a function $D \to R$. If $f$ has all
three of its first partial derivatives in $D$, and these are continuous at the point $P_0$,
then $f$ is differentiable at $P_0$.

Proof. Let
\[ \Delta f = f(P) - f(P_0) = f(w, x, y) - f(w_0, x_0, y_0). \]
By a slight extension of the device that we used in the proof of the same theorem for
two variables, we write
\[ \Delta f = [f(w, x, y) - f(w_0, x, y)] + [f(w_0, x, y) - f(w_0, x_0, y)] + [f(w_0, x_0, y) - f(w_0, x_0, y_0)]. \]
In the first bracket, we regard $x$ and $y$ as constants, and apply the mean-value theorem
to the function
\[ \phi(w) = f(w, x, y). \]
Then
\[ \phi(w) - \phi(w_0) = \phi'(\bar w)\,\Delta w, \]
where $\bar w$ is between $w_0$ and $w$. Since $\phi'(w) = f_w(w, x, y)$, we have
\[ f(w, x, y) - f(w_0, x, y) = f_w(\bar w, x, y)\,\Delta w. \]
By two more such applications of the mean-value theorem, we get
\[ \Delta f = f_w(\bar w, x, y)\,\Delta w + f_x(w_0, \bar x, y)\,\Delta x + f_y(w_0, x_0, \bar y)\,\Delta y. \qquad (1) \]
Let
\[ E_1(\Delta P) = f_w(\bar w, x, y) - f_w(w_0, x_0, y_0), \]
\[ E_2(\Delta P) = f_x(w_0, \bar x, y) - f_x(w_0, x_0, y_0), \]
\[ E_3(\Delta P) = f_y(w_0, x_0, \bar y) - f_y(w_0, x_0, y_0). \]
Then
\[ \Delta f = f_w(P_0)\,\Delta w + f_x(P_0)\,\Delta x + f_y(P_0)\,\Delta y + E_1(\Delta P)\,\Delta w + E_2(\Delta P)\,\Delta x + E_3(\Delta P)\,\Delta y, \]
as in the definition of differentiability. Since $\bar w$, $\bar x$, and $\bar y$ lie between the coordinates of $P_0$ and those of $P$, and the partial derivatives are continuous at $P_0$, each $E_i(\Delta P) \to 0$ as $\Delta P \to 0$.


The chain rule for paths takes the same form as for two variables, and has the
same proof.

Theorem 3 (The chain rule for paths). Let $P$ be a path, with coordinate functions
$w(t)$, $x(t)$, $y(t)$, and with locus lying in the domain $D$ in $R^3$. Let $f$ be a function
$D \to R$, and for each $t$, let
\[ \phi(t) = f(w(t), x(t), y(t)). \]
If $f$ and the three coordinate functions are differentiable, then $\phi$ is differentiable, and
\[ \phi' = f_w(w, x, y)\,w' + f_x(w, x, y)\,x' + f_y(w, x, y)\,y'. \]
That is,
\[ \phi'(t) = f_w(w(t), x(t), y(t))\,w'(t) + f_x(w(t), x(t), y(t))\,x'(t) + f_y(w(t), x(t), y(t))\,y'(t) \]
for every $t$.

This automatically gives us a chain rule for composite functions in which $w$, $x$,
and $y$ are functions of several variables. As in Theorem 3, let $D$ be a domain in $R^3$,
and let $f$ be a function $D \to R$. But now let $w$, $x$, and $y$ be functions of three variables,
defined in a domain $D'$. Suppose that for each point $(t, u, v)$ of $D'$, the point
$(w(t, u, v), x(t, u, v), y(t, u, v))$ lies in $D$. We then have a composite function
\[ \phi: D' \to R, \]
defined by the formula
\[ \phi(t, u, v) = f(w(t, u, v), x(t, u, v), y(t, u, v)), \]

and we want to find the partial derivatives $\phi_t$, $\phi_u$, and $\phi_v$. But this is not a new
problem, really: in calculating $\phi_t$, we regard $u$ and $v$ as constants, and this means that
we can calculate $\phi_t$ by means of the chain rule for paths. The only difference is that the
derivatives $w'(t)$, $x'(t)$, $y'(t)$ in Theorem 3 are now the partial derivatives $w_t(t, u, v)$,
$x_t(t, u, v)$, $y_t(t, u, v)$, and the final answer $\phi'(t)$ in Theorem 3 now becomes $\phi_t(t, u, v)$.
This gives the formula
\[ \phi_t(t, u, v) = f_w(w, x, y)\,w_t(t, u, v) + f_x(w, x, y)\,x_t(t, u, v) + f_y(w, x, y)\,y_t(t, u, v). \]
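This may be easier to remember in the $\partial$-notation. In this notation,
\[ \frac{\partial \phi}{\partial t} = \frac{\partial f}{\partial w}\frac{\partial w}{\partial t} + \frac{\partial f}{\partial x}\frac{\partial x}{\partial t} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial t}. \]
Similarly for $\phi_u$ and $\phi_v$. Thus we have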


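Theorem 4 (The chain rule). Let $f$ be a differentiable function of $w$, $x$, and $y$; and let
$w$, $x$, and $y$ be differentiable functions of $t$, $u$, and $v$. Let
\[ \phi(t, u, v) = f(w(t, u, v), x(t, u, v), y(t, u, v)). \]
Then the partial derivatives of $\phi$ are given by the formulas
\[ \frac{\partial \phi}{\partial t} = \frac{\partial f}{\partial w}\frac{\partial w}{\partial t} + \frac{\partial f}{\partial x}\frac{\partial x}{\partial t} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial t}, \]
\[ \frac{\partial \phi}{\partial u} = \frac{\partial f}{\partial w}\frac{\partial w}{\partial u} + \frac{\partial f}{\partial x}\frac{\partial x}{\partial u} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial u}, \]
\[ \frac{\partial \phi}{\partial v} = \frac{\partial f}{\partial w}\frac{\partial w}{\partial v} + \frac{\partial f}{\partial x}\frac{\partial x}{\partial v} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial v}. \]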

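For example, we might have
\[ f(w, x, y) = w^2 + x^2 + y^2, \]
\[ w = t + 2u + 3v, \qquad x = 2t + 3u + 4v, \qquad y = 3t + 4u + 5v. \]
Here
\[ \frac{\partial f}{\partial w} = 2w = 2(t + 2u + 3v), \qquad \frac{\partial f}{\partial x} = 2x = 2(2t + 3u + 4v), \qquad \frac{\partial f}{\partial y} = 2y = 2(3t + 4u + 5v), \]
\[ \frac{\partial w}{\partial u} = 2, \qquad \frac{\partial x}{\partial u} = 3, \qquad \frac{\partial y}{\partial u} = 4. \]
Therefore, by the chain rule,
\[ \frac{\partial \phi}{\partial u} = 2(t + 2u + 3v) \cdot 2 + 2(2t + 3u + 4v) \cdot 3 + 2(3t + 4u + 5v) \cdot 4 = 40t + 58u + 76v. \]
This is the right answer; by a direct calculation, we get
\[ \phi(t, u, v) = 14t^2 + 29u^2 + 50v^2 + 40tu + 76uv + 52tv, \]
so that
\[ \frac{\partial \phi}{\partial u} = \phi_u(t, u, v) = 58u + 40t + 76v, \]
as before.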
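The example above can also be checked by machine. The sketch below assumes the same $f$, $w$, $x$, $y$ as in the example; the function names and the sample point are illustrative. It computes $\partial\phi/\partial u$ three ways: by the chain rule of Theorem 4, by the closed formula $40t + 58u + 76v$, and by a central difference quotient.

```python
def f(w, x, y):
    # Outer function f(w, x, y) = w^2 + x^2 + y^2.
    return w**2 + x**2 + y**2

def inner(t, u, v):
    # The three inner functions w, x, y of (t, u, v).
    return (t + 2*u + 3*v, 2*t + 3*u + 4*v, 3*t + 4*u + 5*v)

def phi(t, u, v):
    # Composite function phi(t, u, v) = f(w, x, y).
    return f(*inner(t, u, v))

def phi_u_chain_rule(t, u, v):
    # f_w * w_u + f_x * x_u + f_y * y_u, with w_u = 2, x_u = 3, y_u = 4.
    w, x, y = inner(t, u, v)
    return 2*w*2 + 2*x*3 + 2*y*4

def phi_u_formula(t, u, v):
    # Closed form obtained in the text.
    return 40*t + 58*u + 76*v

def phi_u_numeric(t, u, v, h=1e-5):
    # Central difference in the u-direction, holding t and v fixed.
    return (phi(t, u + h, v) - phi(t, u - h, v)) / (2*h)

point = (1.0, -2.0, 0.5)
print(phi_u_chain_rule(*point), phi_u_formula(*point), phi_u_numeric(*point))
```

Because $\phi$ is a quadratic polynomial, the central difference is exact up to rounding, so all three values should agree very closely.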

PROBLEM SET
1. Given/(t,u)

14.6

t 2u,g(t,u)

t +u2,and <f>(t,u)

2. Given/(t,u) =t3 - u,g(t,u)


3. Given/(s,t,u,v)
sin f cos g; find

s2 +t3 +u4 +v5,g(s,t,u,v)

and

cos u cost, g(t,u)


g + h, find </>1 and u=

cos u cost, g(t,u)


g2 + h2, find 1 and u=

/2 +g2,find </>1 and </>,,.

cos u sin t, h(t,u) =sin u, and </>(t,u)

cos u sin t, h(t,u) =sin u, and </>(t,u)

f(t,u,v)
t cos u cos v, g(t,u,v)
t sin u cos v, h(t,u,v)
/2 + g2 - h,find </>t, </>,., and <f>v
<f>(t,u,v)

8. Given

<f>v

5. Given f(t,u) =t cos u, g(t,u) = t sin u,and <f>(t,u)

7. Given f(t,u)

s +t +u +v,and </>(s,t,u, v)

<f>u-

4. Under the conditions of Problem 3, find </>sand


6. Given f(t,u)

J2 +g2,find <f>t and <f>u

t - u2,and <f>(t,11) =/2g2,find <f>t and <f>u-

f+

/2 +

t sin v, and

9. Given $f(s, t, u, v, w) = stuvw$, verify by a direct calculation that $f$ is differentiable at
$(1, 0, 0, 1, 0)$. That is, find error functions $E_1(\Delta P), \ldots, E_5(\Delta P)$ as in the definition
of differentiability of a function at a point.

10. Same question, for $f(s, t, u, v) = s^2 + tu + v^3$, at $(0, 1, 1, 1)$.

14.7

DIRECTIONAL DERIVATIVES AND GRADIENTS

We shall now generalize the idea of the directional derivative $f_\alpha$, in such a way that it
applies to functions of any number of variables. Given a region $D$ in $R^n$, and a
differentiable function $f: D \to R$, let $P_0$ be any point of $D$, and let $V$ be any vector in
$R^n$, with $\|V\| = 1$. (A vector of norm 1 will be called a direction in $R^n$.) The derivative
of $f$ in the direction $V$, at the point $P_0$, is defined to be
\[ f_V(P_0) = \lim_{t \to 0} \frac{f(P_0 + tV) - f(P_0)}{t}. \]

To compute $f_V(P_0)$, we let
\[ P_0 = (a_1, a_2, \ldots, a_n), \qquad V = (c_1, c_2, \ldots, c_n). \]
We now have a path, defined by the coordinate functions
\[ x_i(t) = a_i + c_i t \qquad (i = 1, 2, \ldots, n), \]
and a composite function
\[ \phi(t) = f(x_1(t), x_2(t), \ldots, x_n(t)); \]
and
\[ f_V(P_0) = \phi'(0). \]
We can now calculate $\phi'(0)$ by the chain rule for paths:
\[ \phi'(0) = f_{x_1}(P_0)\,x_1'(0) + f_{x_2}(P_0)\,x_2'(0) + \cdots + f_{x_n}(P_0)\,x_n'(0) = f_{x_1}(P_0)\,c_1 + f_{x_2}(P_0)\,c_2 + \cdots + f_{x_n}(P_0)\,c_n. \]
Thus we have

Theorem 1. If $f$ is differentiable in $D$, then $f$ has a directional derivative at every
point of $D$, in every direction. If $V$ is a unit vector $(c_1, c_2, \ldots, c_n)$, then
\[ f_V(P_0) = f_{x_1}(P_0)c_1 + f_{x_2}(P_0)c_2 + \cdots + f_{x_n}(P_0)c_n = f_1(P_0)c_1 + f_2(P_0)c_2 + \cdots + f_n(P_0)c_n. \]

(You should check that for the case $n = 2$, our definition of the directional
derivative, and the formula given in Theorem 1, agree with the definition and formula
given in Section 14.4.)

The gradient of a differentiable function, at a point $P_0$, is the vector whose
components are the partial derivatives of $f$ at $P_0$. The gradient vector is denoted by
grad $f$. Thus if $f$ is a differentiable function $D \to R$, with real numbers as its values,
then grad $f$ is a vector-valued function $D \to R^n$, with
\[ \operatorname{grad} f = (f_1, f_2, \ldots, f_n), \]
where the $f_i$'s are the first partial derivatives of $f$. That is,
\[ \operatorname{grad} f(P) = (f_{x_1}(P), f_{x_2}(P), \ldots, f_{x_n}(P)), \]
for each $P$ in $D$. For example, if
\[ f(P) = f(x, y) = x^2 + xy + y^3, \]
then
\[ f_1(x, y) = f_x(x, y) = 2x + y, \qquad f_2(x, y) = f_y(x, y) = x + 3y^2, \]
and
\[ \operatorname{grad} f(x, y) = (2x + y, x + 3y^2), \]
which is a vector in $R^2$, as it should be, for each point $P = (x, y)$.
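The gradient in this example can be compared with a numerical approximation. The following is a minimal sketch, assuming the example function $f(x, y) = x^2 + xy + y^3$; the helper `numeric_gradient` and its step size are illustrative and are not taken from the text. It approximates each partial derivative by a central difference quotient in one coordinate at a time.

```python
def f(x, y):
    # Example function f(x, y) = x^2 + xy + y^3.
    return x**2 + x*y + y**3

def grad_f(x, y):
    # Gradient from the formulas f_x = 2x + y, f_y = x + 3y^2.
    return (2*x + y, x + 3*y**2)

def numeric_gradient(func, point, h=1e-6):
    # Approximate each partial derivative by a central difference,
    # changing one coordinate and holding the others fixed.
    grad = []
    for i in range(len(point)):
        forward = list(point)
        backward = list(point)
        forward[i] += h
        backward[i] -= h
        grad.append((func(*forward) - func(*backward)) / (2*h))
    return tuple(grad)

point = (1.0, 2.0)
print(grad_f(*point))
print(numeric_gradient(f, point))
```

The same helper works for any number of variables, since it only uses the length of `point`.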


The definition of the gradient may seem arbitrary, but it is not; the gradient has
a geometric meaning, now to be explained. First we observe that for each unit vector
$V$, with
\[ V = (c_1, c_2, \ldots, c_n), \qquad \|V\|^2 = c_1^2 + c_2^2 + \cdots + c_n^2 = 1, \]
the directional derivative $f_V$ can be expressed as an inner product:
\[ f_V = f_1 c_1 + f_2 c_2 + \cdots + f_n c_n, \]
where the $f_i$'s are the first partial derivatives of $f$, and so
\[ f_V = (\operatorname{grad} f) \cdot V. \]

We shall prove the following:


Theorem 2. If $f$ is differentiable at $P$, then (1) the direction of grad $f(P)$ is the direction
which gives the maximum value of the directional derivative $f_V(P)$, and (2) the norm
of grad $f(P)$ is the maximum value of $f_V(P)$.

Proof. Let $G = \operatorname{grad} f(P)$, so that
\[ G = (f_1(P), f_2(P), \ldots, f_n(P)), \]
and let $V$ be the unit vector with the same direction as $G$; that is,
\[ V = \frac{1}{\|G\|}\,G, \]
so that $V = (c_1, c_2, \ldots, c_n)$, where
\[ c_i = \frac{f_i(P)}{\|G\|}. \]
(Here we assume $G \ne 0$; if $G = 0$, every directional derivative at $P$ is 0.) Then
\[ f_V(P) = (\operatorname{grad} f(P)) \cdot V = G \cdot V = G \cdot \left(\frac{G}{\|G\|}\right) = \frac{1}{\|G\|}(G \cdot G) = \|G\| = \|\operatorname{grad} f(P)\|. \]
Thus the directional derivative, in the direction of the gradient, is the norm of the
gradient. And this is the direction which maximizes the directional derivative: if $W$
is any unit vector, then we know by the Schwarz inequality (Theorem 1 of Section 11.6)
that
\[ (G \cdot W)^2 \le \|G\|^2 \|W\|^2 = \|G\|^2, \]
and so
\[ f_W(P) = G \cdot W \le \|G\| = f_V(P). \]
This completes the proof.
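Theorem 2 can be illustrated numerically in the plane. The sketch below is an illustration under stated assumptions, not part of the proof: it reuses the gradient $(2x + y, x + 3y^2)$ of the earlier example at the arbitrarily chosen point $(1, 2)$, samples unit vectors $V = (\cos\theta, \sin\theta)$ on a grid of angles, and records the largest value of $G \cdot V$. The names `best_direction` and `samples` are hypothetical.

```python
import math

def grad_f(x, y):
    # Gradient of f(x, y) = x^2 + xy + y^3.
    return (2*x + y, x + 3*y**2)

def directional_derivative(grad, direction):
    # f_V = (grad f) . V for a unit vector V.
    return sum(g * c for g, c in zip(grad, direction))

def best_direction(grad, samples=3600):
    # Try unit vectors at equally spaced angles and keep the one
    # giving the largest directional derivative.
    best_value, best_dir = -math.inf, None
    for k in range(samples):
        angle = 2 * math.pi * k / samples
        V = (math.cos(angle), math.sin(angle))
        value = directional_derivative(grad, V)
        if value > best_value:
            best_value, best_dir = value, V
    return best_value, best_dir

G = grad_f(1.0, 2.0)
norm_G = math.hypot(*G)
best_value, best_dir = best_direction(G)
unit_G = (G[0] / norm_G, G[1] / norm_G)
print(best_value, norm_G)
print(best_dir, unit_G)
```

Up to the resolution of the angle grid, the best sampled value should match $\|G\|$ and the best sampled direction should match $G/\|G\|$, and no sampled value should exceed $\|G\|$.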

A continuous function $D \to \mathcal{V}$, where $D$ is a region in a Cartesian space $R^n$ and
$\mathcal{V}$ is a vector space, is called a vector field. We ordinarily draw the graphs of
vector fields by using free vectors. For example, for
\[ f(x, y) = x^2 + y^2, \qquad \operatorname{grad} f(x, y) = (2x, 2y), \]
we can indicate the vector field grad $f$ by drawing sample vectors in the xy-plane,
like this:

[Figure: sample vectors of the field grad $f$ in the xy-plane, pointing away from the origin.]

At each point $P = (x, y)$, the direction of grad $f(P)$ is the direction of the ray from
the origin through $P$, and the length $\|\operatorname{grad} f(P)\|$ is twice the distance from the origin
to $P$. At the origin, the gradient vector vanishes. Such a point is called a singularity
of a vector field.
PROBLEM SET

14.7

For

1.

each of the following functions f, sketch the graph of grad f


Vx2 + y2
f(x,y)
x + y
3. f(x,y)
2. f(x,y)
=

4. f(x, y)

f(x,y)

7.
9.

f(x,y)

11.

f(x,y)

14.8

y2

x2 - y2
y2
x_
2__
1
4
4

5.

f(x,y)

8.

f(x,y)

10.

f(x,y)

12.

f(x,y)

xy
Vl - x2 - y2

6.

f(x, y)

13.

f(x,y)

x
x3

__

(x2

y2)2

x2

y
1

x2

y2

+ 1

4y2 - x2

INTERIOR LOCAL MAXIMA AND MINIMA,


FOR FUNCTIONS OF TWO VARIABLES. LEVEL CURVES

For functions of one variable, defined on a closed interval, we had two kinds of
maxima. In the figure on the left below, the maximum occurs at the endpoint $b$; at
$x_1$ the function has a local maximum, but not a maximum, because $f(x_1) < f(b)$.
In the figure on the right, the function has a maximum at $x_1$; this is an interior
maximum, and so $f'(x_1)$ must be 0.

[Figure: left, a graph with its maximum at the endpoint $b$ and a local maximum at $x_1$; right, a graph with an interior maximum at $x_1$.]

One of the simplest theorems for functions of one variable was the following:

Theorem 1. Suppose that $f$, $f'$, and $f''$ are continuous, in a neighborhood of $x_0$.
If $f'(x_0) = 0$ and $f''(x_0) < 0$, then $f$ has a local maximum at $x_0$.

[Figure: $f'(x_0) = 0$, $f''(x_0) < 0$, $f$ has a local maximum at $x_0$.]

The proof is simple. Since $f''(x_0) < 0$, and $f''$ is continuous, it follows that there
is a neighborhood $(x_0 - \delta, x_0 + \delta)$ of $x_0$ such that
\[ f''(x) < 0 \qquad \text{for} \qquad x_0 - \delta < x < x_0 + \delta, \]
as indicated in the figure. Therefore $f'$ is decreasing on the interval
$(x_0 - \delta, x_0 + \delta)$. Therefore
\[ f'(x) > 0 \qquad \text{for} \qquad x_0 - \delta < x < x_0 \]
and
\[ f'(x) < 0 \qquad \text{for} \qquad x_0 < x < x_0 + \delta. \]
Therefore $f$ is increasing, from $x_0 - \delta$ to $x_0$, and $f$ is decreasing, from $x_0$ to $x_0 + \delta$.
Therefore $f(x_0)$ is the maximum value of $f$ on the interval from $x_0 - \delta$ to $x_0 + \delta$.

The same proof proves the following theorem, which is going to be more useful:

Theorem 2. Given $f$, $f'$, and $f''$, on an interval $(x_0 - \delta, x_0 + \delta)$. If $f'(x_0) = 0$, and
\[ f''(x) < 0 \qquad \text{for} \qquad x_0 - \delta < x < x_0 + \delta, \]
then $f(x_0)$ is the maximum value of $f$ on the interval $(x_0 - \delta, x_0 + \delta)$.

We return now to functions of two variables. Let $P_0 = (x_0, y_0)$ be a point of the
xy-plane. For each $\delta > 0$, the $\delta$-neighborhood of $P_0$ is the interior of the circle with
center at $P_0$ and radius $\delta$. This is denoted by $N(P_0, \delta)$. Thus
\[ N(P_0, \delta) = \{P \mid P_0P < \delta\} = \{(x, y) \mid (x - x_0)^2 + (y - y_0)^2 < \delta^2\}. \]

Let $D$ be a set of points in the xy-plane. If $P_0$ is a point of $D$, and $D$ contains a
neighborhood of $P_0$ (for some $\delta$), then $P_0$ is called an interior point of $D$.
Thus $P_0$ is an interior point of $D$ if $P_0$ lies in $D$, with at least a little room to spare.
Consider, for example,
\[ D = \{(x, y) \mid x^2 + y^2 \le 1\}. \]
Here $D$ consists of the unit circle, plus its interior. If $OP_0 < 1$, as in the figure, then
$P_0$ is an interior point; if we let
\[ \delta = 1 - OP_0, \]
then $N(P_0, \delta)$ lies in $D$.

[Figure: the unit disk $D$, with a neighborhood $N(P_0, \delta)$ of an interior point $P_0$.]

This works, no matter how close $P_0$ may be to the circle, as long as $P_0$ isn't actually
on the circle; no matter how small the positive number $1 - OP_0$ may be, we can use it
as our positive $\delta$. On the other hand, if $OP_1 = 1$, so that $P_1$ is on the circle, then $P_1$
is not an interior point of $D$; no matter how small we take $\delta$, the neighborhood
$N(P_1, \delta)$ contains points outside of $D$.

[Figure: $f$ has an ILMax at $P_0$.]

Consider now a function of two variables
\[ f: D \to R, \]
and let $P_0$ be a point of $D$. If $f(P) \le f(P_0)$ for every $P$ in $D$, then we say that $f$ has a
maximum at $P_0$. If $P_0$ is an interior point of $D$, and $P_0$ has a neighborhood $N(P_0, \delta)$
such that
\[ f(P) \le f(P_0) \]
for every $P$ in $N(P_0, \delta)$, then we say that $f$ has an interior local maximum (ILMax) at $P_0$.

This discussion has been rather lengthy, but if you review the figures which have

been given in this section so far, you will find that they convey, by themselves, most
of the ideas that we have been talking about.

Our purpose at this stage is to find conditions under which we can conclude that a
function of two variables has an ILMax at a given point. At an ILMax, we must have
\[ f_x(x_0, y_0) = f_y(x_0, y_0) = 0, \]
because an ILMax of $f$ must be an ILMax of both the slice functions
\[ \phi_0(t) = f(x_0 + t, y_0), \qquad \phi_{\pi/2}(t) = f(x_0, y_0 + t). \]
(At this point you may want to review the definition of slice functions, at the
beginning of Section 14.3.) Obviously, however, the vanishing of the partial
derivatives $f_x$ and $f_y$ is not enough to guarantee an ILMax; we might have a minimum or
a saddle point. For example, for
\[ f(x, y) = x^2 - y^2, \]
we have
\[ f_x(0, 0) = f_y(0, 0) = 0, \]
but the point $(0, 0)$ is a saddle point; the slice function
\[ \phi_0(t) = f(t, 0) = t^2 \]
has a minimum at 0, and the slice function
\[ \phi_{\pi/2}(t) = f(0, t) = -t^2 \]
has a maximum at 0. One way to see the difference between the behavior of a function
at an ILMax or ILMin and its behavior at a saddle point is to consider the so-called
level curves, in the xy-plane, on which the function takes on various constant values.

For the function
\[ f(x, y) = x^2 + y^2, \]
the level curves are circles with center at the origin, as shown above. For each
$k > 0$, the level curve on which $f(x, y) = k$ is the circle with center at the origin and
radius $\sqrt{k}$. The origin is a singular point of this family of curves; and this is the point
at which the function takes on its obvious minimum value 0.

For the function
\[ f(x, y) = x^2 - y^2, \]
there is no maximum and no minimum. Since $f$ is defined for every $x$ and $y$, any Max
or Min would have to be an ILMax or ILMin; at any such point, both the partial
derivatives $f_x$ and $f_y$ would have to vanish; $f_x$ and $f_y$ vanish simultaneously only at
$(0, 0)$, and at $(0, 0)$ the function has a saddle point. The level curves for this function
look like this:

[Figure: level curves of $x^2 - y^2$: hyperbolas with the lines $y = x$ and $y = -x$ as asymptotes.]

For $k > 0$, the level curve on which $f(x, y) = k$ is a hyperbola in standard position,
and the level curve on which $f(x, y) = -k < 0$ is conjugate to it. The level curve
on which $f(x, y) = 0$ is the union of two lines, which intersect each other at the
origin, where $f$ has a saddle point. These examples are typical of the way level curves
behave in simple cases.

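Even if each of the slice functions in the x- and y-directions has an ILMax at a
point $(x_0, y_0)$, we may still have a saddle point, on which a man in the saddle would
be facing in some third direction. Consider
\[ f(x, y) = -xy - \tfrac{1}{4}x^2 - \tfrac{1}{4}y^2. \]
Here
\[ \phi_0(t) = -\tfrac{1}{4}t^2, \]
so that $\phi_0$ has an ILMax at 0; and similarly for
\[ \phi_{\pi/2}(t) = -\tfrac{1}{4}t^2. \]
But for $\alpha = 3\pi/4$ we have
\[ \cos\alpha = -\frac{1}{\sqrt{2}}, \qquad \sin\alpha = \frac{1}{\sqrt{2}}, \]
\[ \phi_\alpha(t) = f(t\cos\alpha, t\sin\alpha) = \frac{t^2}{2} - \frac{t^2}{4 \cdot 2} - \frac{t^2}{4 \cdot 2} = \frac{t^2}{2} - \frac{t^2}{4} = \frac{1}{4}t^2, \]
which has a minimum at 0.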
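The behavior of the slice functions in different directions can be tabulated directly. The following is a small sketch, assuming the function is $f(x, y) = -xy - \frac{1}{4}x^2 - \frac{1}{4}y^2$ with the slices taken through the origin; the helper names are illustrative. Since this $f$ is a homogeneous quadratic, each slice has the form $\phi_\alpha(t) = k(\alpha)\,t^2$ with $k(\alpha) = f(\cos\alpha, \sin\alpha)$, so the second derivative of the slice is $2k(\alpha)$.

```python
import math

def f(x, y):
    # f(x, y) = -xy - x^2/4 - y^2/4.
    return -x*y - 0.25*x**2 - 0.25*y**2

def slice_value(alpha, t):
    # Slice function through the origin in the direction alpha.
    return f(t * math.cos(alpha), t * math.sin(alpha))

def slice_second_derivative(alpha):
    # For a homogeneous quadratic, slice(t) = k t^2 with k = f(cos a, sin a),
    # so the second derivative of the slice is the constant 2k.
    return 2 * slice_value(alpha, 1.0)

for name, alpha in (("0", 0.0), ("pi/2", math.pi / 2), ("3pi/4", 3 * math.pi / 4)):
    print(name, slice_second_derivative(alpha))
```

The slices in the x- and y-directions are concave downward, while the slice in the direction $3\pi/4$ is concave upward, which is the saddle behavior described in the text.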


Thus, if we want to infer that $f$ has an ILMax at $(x_0, y_0)$, we need to consider
every direction $\alpha$, and examine all the slice functions
\[ \phi_\alpha(t) = f(x_0 + t\cos\alpha, y_0 + t\sin\alpha). \]
This is the basis on which we shall attack the problem. Now
\[ \phi_\alpha'(t) = f_x(x_0 + t\cos\alpha, y_0 + t\sin\alpha)\cos\alpha + f_y(x_0 + t\cos\alpha, y_0 + t\sin\alpha)\sin\alpha; \]
here we are using the chain rule. Applying the chain rule again, to each term, we get
\[ \phi_\alpha''(t) = f_{xx}(x_0 + t\cos\alpha, y_0 + t\sin\alpha)\cos^2\alpha + f_{xy}(x_0 + t\cos\alpha, y_0 + t\sin\alpha)\cos\alpha\sin\alpha \]
\[ \qquad + f_{yx}(x_0 + t\cos\alpha, y_0 + t\sin\alpha)\sin\alpha\cos\alpha + f_{yy}(x_0 + t\cos\alpha, y_0 + t\sin\alpha)\sin^2\alpha \]
\[ = c^2 f_{xx} + 2cs f_{xy} + s^2 f_{yy}. \]
Here we are using the abbreviations
\[ c = \cos\alpha, \qquad s = \sin\alpha, \qquad f_{xx} = f_{xx}(x_0 + t\cos\alpha, y_0 + t\sin\alpha), \]
and so on. We are also assuming that all our derivatives are continuous; in this case
it is a fact that $f_{yx} = f_{xy}$. (See Appendix K, where this is proved.)
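The following is easy to see:

Theorem A. Suppose that
\[ \phi_\alpha'(0) = 0 \qquad \text{for every } \alpha. \]
Suppose also that there is a number $\delta > 0$ such that for $|t| < \delta$ we have
\[ \phi_\alpha''(t) < 0 \qquad \text{for every } \alpha. \]
Then $f$ has an ILMax at $(x_0, y_0)$.

[Figure: the $\delta$-neighborhood of $(x_0, y_0)$, with a slice in the direction $\alpha$.]

The reason is that for every $\alpha$, $\phi_\alpha(0)$ is the maximum value of $\phi_\alpha$ on the interval
$(-\delta, \delta)$. It follows that $f(x_0, y_0)$ is the maximum value of $f$ in the $\delta$-neighborhood
of $(x_0, y_0)$.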


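We therefore need to find conditions under which such a $\delta$ exists. We have
\[ \phi_\alpha''(t) = c^2 f_{xx} + 2cs f_{xy} + s^2 f_{yy}. \]
We now use an "ingenious device." We write
\[ \phi_\alpha''(t) = f_{xx}\left[c^2 + 2\frac{cs f_{xy}}{f_{xx}} + \frac{s^2 f_{yy}}{f_{xx}}\right] \]
\[ = f_{xx}\left[\left(c + s\frac{f_{xy}}{f_{xx}}\right)^2 + \frac{s^2 f_{yy}}{f_{xx}} - \frac{s^2 f_{xy}^2}{f_{xx}^2}\right] \]
\[ = f_{xx}\left[\left(c + s\frac{f_{xy}}{f_{xx}}\right)^2 + \frac{f_{xx}f_{yy} - f_{xy}^2}{f_{xx}^2}\,s^2\right]. \]
Suppose now that at the point $(x_0, y_0)$ we have
\[ f_{xx} < 0, \qquad f_{xy}^2 - f_{xx}f_{yy} < 0. \]
We are assuming that all the partial derivatives that we are dealing with are
continuous. It follows that the same inequalities hold in the $\delta$-neighborhood of $(x_0, y_0)$,
for some $\delta > 0$. In that neighborhood the expression in brackets is positive: its two terms are nonnegative, and they cannot both vanish, because $s = 0$ gives $c^2 = 1$. Thus for $|t| < \delta$ we have
\[ \phi_\alpha''(t) < 0 \qquad \text{for every } \alpha. \]
If we also know that
\[ f_x(x_0, y_0) = f_y(x_0, y_0) = 0, \]
then we have
\[ \phi_\alpha'(0) = 0 \qquad \text{for every } \alpha. \]
Therefore, by Theorem A, $f$ has an ILMax at $(x_0, y_0)$. We sum all this up in the
following theorem: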

Theorem 3. Suppose that $f$ has continuous second partial derivatives in a neighborhood
of $(x_0, y_0)$. If
\[ f_x(x_0, y_0) = f_y(x_0, y_0) = 0, \qquad (1) \]
\[ f_{xx}(x_0, y_0) < 0, \qquad (2) \]
and
\[ f_{xy}^2(x_0, y_0) - f_{xx}(x_0, y_0)f_{yy}(x_0, y_0) < 0, \qquad (3) \]
then $f$ has an ILMax at $(x_0, y_0)$.

Not only the proof of this theorem, but also the theorem itself, are hard to read
and hard to remember. This is typical of what you can expect from now on: when we
pass from one variable to two or more, the calculus takes on a higher order of

difficulty.

The following is a corollary of Theorem 3:

Theorem 4. Suppose that $f$ has continuous second partial derivatives in a neighborhood
of $(x_0, y_0)$. If
\[ f_x(x_0, y_0) = f_y(x_0, y_0) = 0, \qquad (1) \]
\[ f_{xx}(x_0, y_0) > 0, \qquad (2) \]
and
\[ f_{xy}^2(x_0, y_0) - f_{xx}(x_0, y_0)f_{yy}(x_0, y_0) < 0, \qquad (3) \]
then $f$ has an ILMin at $(x_0, y_0)$.

Proof. If $f$ satisfies the hypothesis of Theorem 4, then $-f$ satisfies the hypothesis
of Theorem 3. Therefore $-f$ has an ILMax at $(x_0, y_0)$. Therefore $f$ has an ILMin
at $(x_0, y_0)$.

When we were studying functions of one variable, we found simple cases in
which a function had an ILMax or an ILMin, but the second derivative test failed to
reveal the fact. For example, $f(x) = x^4$ has an ILMin at $x = 0$, but $f''(0) = 0$.
Here the trouble seems to be that the function approaches its ILMin value "very
flatly." The same sort of thing can happen for functions of two variables. For
example, the function $f(x, y) = x^4 + y^4$ has an ILMin at $(0, 0)$, but Theorem 4
does not apply, because at the point $(0, 0)$, all the second partial derivatives $f_{xx}$, $f_{yy}$,
and $f_{xy}$ are equal to 0.

Sometimes, however, we can get negative information by examining the quantity
\[ f_{xy}^2(x_0, y_0) - f_{xx}(x_0, y_0)f_{yy}(x_0, y_0). \]
If this quantity is positive, then we can infer that $f$ has neither an ILMax nor an
ILMin at $(x_0, y_0)$. The proof is as follows. We found that
\[ \phi_\alpha''(t) = c^2 f_{xx} + 2cs f_{xy} + s^2 f_{yy}. \]
We set $t = 0$. Let
\[ A = f_{xx}(x_0, y_0), \qquad B = f_{xy}(x_0, y_0), \qquad C = f_{yy}(x_0, y_0). \]
Then
\[ \phi_\alpha''(0) = A\cos^2\alpha + 2B\sin\alpha\cos\alpha + C\sin^2\alpha, \]
and we are assuming that
\[ B^2 - AC > 0. \]
Consider the function
\[ \psi(u) = A + 2Bu + Cu^2. \]
The graph is a parabola; the discriminant is
\[ (2B)^2 - 4AC = 4(B^2 - AC) > 0, \]
and so $\psi(u_1) > 0$ for some $u_1$ and $\psi(u_2) < 0$ for some $u_2$. But for $\cos\alpha \ne 0$ we have
\[ \phi_\alpha''(0) = \cos^2\alpha\,[A + 2B\tan\alpha + C\tan^2\alpha]; \]
for $\tan\alpha = u_1$, we have $\phi_\alpha''(0) > 0$; and for $\tan\alpha = u_2$, we have $\phi_\alpha''(0) < 0$. Therefore
the direction of concavity of the slice functions $\phi_\alpha$ is different for different values of $\alpha$,
and we cannot have an ILMax or an ILMin. To sum up:

Theorem 5. If $f_{xy}^2 - f_{xx}f_{yy} > 0$ at $P_0$, then $f$ has neither an ILMax nor an ILMin
at $P_0$.
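Theorems 3, 4, and 5 together give a test that is easy to mechanize. The following is a minimal sketch, assuming the first partial derivatives already vanish at the point and that the caller supplies the values of $f_{xx}$, $f_{xy}$, $f_{yy}$ there; the function name `classify_critical_point` and the returned labels are illustrative choices, not notation from the text.

```python
def classify_critical_point(fxx, fxy, fyy):
    # Assumes f_x = f_y = 0 at the point and continuous second partials.
    # The quantity examined in Theorems 3, 4, and 5:
    quantity = fxy**2 - fxx * fyy
    if quantity > 0:
        return "neither"        # Theorem 5: no ILMax and no ILMin
    if quantity < 0:
        # Here fxx cannot be 0, since fxx = 0 would make quantity >= 0.
        if fxx < 0:
            return "ILMax"      # Theorem 3
        return "ILMin"          # Theorem 4
    return "no conclusion"      # the theorems do not apply

# Second partials at (0, 0) for some functions discussed in this section.
examples = {
    "x^2 + y^2": (2.0, 0.0, 2.0),
    "-x^2 - y^2": (-2.0, 0.0, -2.0),
    "x^2 - y^2": (2.0, 0.0, -2.0),
    "-xy - x^2/4 - y^2/4": (-0.5, -1.0, -0.5),
    "x^4 + y^4": (0.0, 0.0, 0.0),
}
for name, partials in examples.items():
    print(name, "->", classify_critical_point(*partials))
```

The last example returns "no conclusion" even though $x^4 + y^4$ does have an ILMin at the origin; in such cases one falls back on slice functions or other elementary methods.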


PROBLEM SET

14.8

Investigate the following functions for interior local maxima and minima. Not all of
these problems can be worked by straightforward applications of the theorems in Section
14.8; you may need to examine slice functions, or use other elementary methods.
2. f (x,y) =x2 + xy

1. f (x,y) = xy
3.

f(x,y) = x2 - y2

5.

f(x,y) =x2 + y2 + x + y +

7.

f(x,y) = x2 + xy + y2 + x + y + 1

4.

f(x,y) = (x + y + 1)2 + (x - y + 1)2

6.

f (x,y) =x2 + y2 + 2x +
f (x,y) =x2 + 2xy + y2

8.

10. f(x,y) =x4 - 2x2 - y2

9. f(x,y) = 1 - x4 - y2

11. At what point does the function $f(x, y) = x^2(1 - x^2 - y^2)$ take on its maximum
value? What is the maximum value of the function?

12. At what point or points does the function
\[ f(x, y) = \sqrt{(x - 1)^2 + (y - 2)^2} + \sqrt{x^2 + y^2} \]
take on its minimum value? What is the minimum value of the function?

13. Consider the ellipsoid
\[ x^2 + y^2/4 + z^2/9 = 1. \]
In this surface we are to inscribe a rectangular parallelepiped, with sides parallel to
the coordinate planes. What is the maximum possible volume of such a parallelepiped?
Give the coordinates of its corner in the first octant.
14.

Same question, for the ellipsoid


x2 +

15.

z2
y2
+ -;;=l.
4

Same question, for the ellipsoid


z2
y2
x2 + 4 + 4 = 1.

16. Let $A_1 = (0, 0)$, $A_2 = (1, 2)$, and $A_3 = (2, 1)$. For each $P = (x, y)$, let
\[ f(P) = (A_1P)^2 + (A_2P)^2 + (A_3P)^2. \]
At what point $P_0$ does $f(P)$ take on its minimum value?

17. Same question, for $A_i = (a_i, b_i)$ $(i = 1, 2, \ldots, n)$, and $f(P) = \sum_{i=1}^{n} (A_iP)^2$.

14.9

DOUBLE INTEGRALS, INTUITIVELY CONSIDERED

You recall that in Section 3.7 we gave a preliminary intuitive definition of the definite
integral of a continuous function over a closed interval. Here the $A_i$'s are areas, in
the elementary geometric sense, so that $A_i \ge 0$ for every $i$. To get the integral, we
count areas above the x-axis positively, and areas below the x-axis negatively.


Later, in Section 7.2, we gave a new definition of the integral, as the limit of the
sample sums of the function as the mesh of the net approaches 0:

provided, of course, that such a limit exists.

The new definition was necessary for two reasons. First, we needed it to clarify the
underlying theory. Second, we wanted to use the definite integral to solve problems
which did not, at the outset, look like area problems at all. For example, to calculate
arc lengths, surface areas, volumes, and moments, we regarded them as limits of sample
sums, as the mesh approaches zero. Thus our second definition of the definite integral
was not only more exact but also more widely applicable.

We shall follow the same scheme with multiple integrals, first giving an intuitive
definition, and then reformulating it when the need arises (which will be soon).
Suppose that we have given a nonnegative continuous function
\[ f: D \to R, \]
defined in a domain $D$ in the xy-plane. (See figure on the left below.)

[Figure: left, the solid lying above the domain $D$ and below the graph of $f$; right, a solid between the planes $x = a$ and $x = b$, with its cross section at $x$.]

The expression
\[ \iint_D f(P)\,dA \]
denotes the volume of the region lying above the xy-plane and below the graph of $f$.
This is called the integral of $f$ over $D$. Thus the integral is the volume of the solid
\[ S = \{(x, y, z) \mid (x, y) \in D \text{ and } 0 \le z \le f(x, y)\}. \]

and

ED

In "reasonable" cases, double integrals can be calculated by the method of cross
sections, developed in Section 7.4. The scheme is shown on the right above.
Given a solid $S$, in space, lying between the planes $x = a$ and $x = b$. Suppose that
for each $x_0$ from $a$ to $b$ we can compute, somehow, the area of the cross section in
the plane $x = x_0$. If for each such $x_0$ we let $A(x_0)$ be the area of this cross section,
then the volume of our solid $S$ is
\[ vS = \int_a^b A(x)\,dx. \]

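This method works for many solids whose volumes are not given by standard
formulas. Consider the following.

[Figure: left, the region under the curve $y = \sqrt{x}$; right, an isosceles right triangle erected on a vertical segment of the region.]

We start with the region
\[ R = \{(x, y) \mid 0 \le x \le 1,\ 0 \le y \le \sqrt{x}\}, \]
in the xy-plane. For each $x$, we join the point $(x, 0)$ to the point $(x, \sqrt{x})$, by a segment.
On each such segment we set up an isosceles right triangle, as shown above on the
right. Let $S$ be the union of all these triangles (including, of course, their interiors).
For each $x$, the area of the triangle at $x$ is
\[ A(x) = \tfrac{1}{2}(\sqrt{x})^2 = \tfrac{1}{2}x. \]
Therefore the volume is
\[ \int_0^1 A(x)\,dx = \int_0^1 \tfrac{1}{2}x\,dx = \tfrac{1}{2}\left[\tfrac{1}{2}x^2\right]_0^1 = \tfrac{1}{4}. \]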

Here $A(x)$ was computable by an elementary formula, because the cross sections
for constant $x$ were triangular. But no matter what method you use to compute
$A(x)$, you can still find the volume by integrating $A(x)$ between the appropriate
limits. In particular, you can use the method when $A(x)$ is itself computed as a definite
integral.


y

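Consider the following. Let
\[ D = \{(x, y) \mid 0 \le x \le 1,\ 0 \le y \le 1 - x\}. \]
For each $(x, y)$ in $D$, let
\[ f(x, y) = x^2 + y^3. \]
We want to find the volume of the solid lying above $D$ and below the graph of $f$.
Now for each $x_0$, the cross section in the plane $x = x_0$ looks like the drawing on the
left below. Therefore the area of the cross section at $x_0$ is
\[ A(x_0) = \int_0^{1 - x_0} (x_0^2 + y^3)\,dy = \left[x_0^2 y + \tfrac{1}{4}y^4\right]_0^{1 - x_0} = x_0^2(1 - x_0) + \tfrac{1}{4}(1 - x_0)^4 = x_0^2 - x_0^3 + \tfrac{1}{4}(1 - x_0)^4. \]
Dropping the subscript we get
\[ A(x) = x^2 - x^3 + \tfrac{1}{4}(1 - x)^4. \]
Therefore the volume is
\[ \iint_D f(P)\,dA = \int_0^1 A(x)\,dx = \left[\tfrac{1}{3}x^3 - \tfrac{1}{4}x^4 - \tfrac{1}{4} \cdot \tfrac{1}{5}(1 - x)^5\right]_0^1 = \left[\tfrac{1}{3} - \tfrac{1}{4}\right] - \left[-\tfrac{1}{20}\right] = \tfrac{2}{15}. \]
The method works more generally. Suppose that we have a region $D$ in the
xy-plane, lying between the graphs of two functions, as on the right below.

[Figure: left, the cross section of the solid in the plane $x = x_0$; right, a region $D$ lying between the graphs of two functions.]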

We have given $F(x, y) \ge 0$ on $D$, and we want to find
\[ \iint_D F(x, y)\,dA. \]
This is the volume of the solid lying above $D$ and below the graph of $F$. For each $x$,
the cross-sectional area is
\[ A(x) = \int_{f(x)}^{g(x)} F(x, y)\,dy. \]
Here $x$ is being held constant, and we are integrating from $f(x)$ to $g(x)$. But $A(x)$,
once you get it, is a function. Therefore the total volume is
\[ \iint_D F(x, y)\,dA = \int_a^b A(x)\,dx = \int_a^b \left[\int_{f(x)}^{g(x)} F(x, y)\,dy\right] dx. \]

This takes a very simple form when D is a rectangular region defined by inequalities of
the form

$$a \le x \le b, \qquad c \le y \le d.$$

Here

$$\iint_D F(x, y)\,dA = \int_a^b \int_c^d F(x, y)\,dy\,dx.$$

Here it is to be understood that the "inside integration" is to be performed first,
giving the cross-sectional area

$$A(x) = \int_c^d F(x, y)\,dy,$$

and the resulting function is to be integrated from a to b.

[Figure: the rectangular region D.]

Of course, we could equally well have used cross sections for constant y. This
would give a different cross-sectional area function

$$B(y) = \int_a^b F(x, y)\,dx;$$

and we would have

$$\iint_D F(x, y)\,dA = \int_c^d B(y)\,dy = \int_c^d \int_a^b F(x, y)\,dx\,dy.$$

The expressions

$$\int_a^b \int_c^d F(x, y)\,dy\,dx, \qquad \int_c^d \int_a^b F(x, y)\,dx\,dy$$

are called iterated integrals. If the general assumptions that we are making in this
section are correct, and in fact they are, then it follows that the two iterated integrals
are equal; that is, the order of integration does not matter. The reason is that each
of the two iterated integrals is equal to the double integral.

[Figure: a nonrectangular domain D.]

The same phenomenon occurs, in a less simple form, if the domain is not
rectangular. In the figure above, the domain D can equally well be described by the
inequalities

$$0 \le x \le 1, \qquad 0 \le y \le 1 - x^2, \tag{1}$$

or

$$0 \le y \le 1, \qquad 0 \le x \le \sqrt{1 - y}. \tag{2}$$

Thus, for any continuous nonnegative function $F\colon D \to \mathbf{R}$ we have

$$\iint_D F(x, y)\,dA = \int_0^1 \int_0^{1-x^2} F(x, y)\,dy\,dx,$$

and we also have

$$\iint_D F(x, y)\,dA = \int_0^1 \int_0^{\sqrt{1-y}} F(x, y)\,dx\,dy.$$

Therefore the two iterated integrals must have the same value.
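
The equality of the two iterated integrals can be checked numerically. The following is a minimal sketch, not part of the text: it assumes the domain is the region $0 \le x \le 1$, $0 \le y \le 1 - x^2$, picks the illustrative integrand $F(x, y) = x + y$ (an assumption chosen only for the check), and approximates each one-variable integral by the midpoint rule. The helper names `midpoint`, `dy_dx`, and `dx_dy` are hypothetical.

```python
import math


def midpoint(g, lo, hi, n=400):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (k + 0.5) * h) for k in range(n))


def F(x, y):
    # Illustrative integrand; an assumption made for this check only.
    return x + y


def dy_dx():
    # Inside integration over y (0 <= y <= 1 - x^2), then over x.
    return midpoint(lambda x: midpoint(lambda y: F(x, y), 0.0, 1.0 - x * x), 0.0, 1.0)


def dx_dy():
    # Inside integration over x (0 <= x <= sqrt(1 - y)), then over y.
    return midpoint(lambda y: midpoint(lambda x: F(x, y), 0.0, math.sqrt(1.0 - y)), 0.0, 1.0)


if __name__ == "__main__":
    # For this F, a hand calculation gives 31/60 in either order.
    print(dy_dx(), dx_dy(), 31 / 60)
```

Both orders should agree with each other, and with the hand-computed value 31/60, up to the error of the midpoint rule.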

PROBLEM SET

14.9

In each problem below, we have a domain D described by a pair of inequalities, and a
function defined by a formula. In each case, express the double integral

$$\iint_D F(x, y)\,dA$$

as an iterated integral in two different ways, evaluate both of your iterated integrals, and
check by observing that they ought to have the same value.
1. D: 0 x 2,

0 y x3;

2. D: 0 x 2,

x2
0:5y:5-
'
-

3. D: 0 x 2,

4. D: 0 x 1,

5. D: 0 x 1,
6. D: 0 y 1,

7. D: -1 x 1,

F(x, y)
F(x,y)

- 4
x3 y 8;

F(x,y)

x2 y x;

F(x,y)

x y 1;

F(x,y)

y x 1;

F(x,y)

0 y 1 - x2;

8. D: 0 x 1,

-Vl

9. D: 0 x 1,

0 y x2;

x + y

x - y

x2 + y

x + y

x3y3

x2 + y2

F(x,y)

x2 y V l
F(x,y)

x2;

xy
F(x,y)

(x2 + y2)2

.Yxy

10. Let

Find / ( 0<:) .
'

11. Let </> be positive and continuous, and let

/(0<:)

ff

</>(x,y)dxdy.

Find the simplest formula that you can for[' (0<:).

12. Same question for

/(<X)

rld

</>(x, y) dxdy.

13. Let </> be a positive function, with continuous first and second partial derivatives. Get
the simplest formula that you can for

fld

tl

</>xy(X,y) dxdy.

14.10

CYLINDRICAL COORDINATES IN SPACE.


THE DEFINITION OF THE INTEGRAL

[Figure: the cylindrical surface $r = 1$.]

To set up a system of cylindrical coordinates in space, we use polar coordinates in
the xy-plane, and leave the z-coordinate unchanged.
For some solids and surfaces, this leads to a considerable simplification. For
example, the cylindrical surface of radius 1, with the z-axis as its axis of symmetry,
is the graph of the equation

$$r = 1.$$

(See the figure above.) The unit sphere with center at the origin is the graph of the equation

$$r^2 + z^2 = 1.$$

(See the figure on the left below.)


Recalling the familiar formulas giving x and y in terms of r and $\theta$, we see that
rectangular and cylindrical coordinates are related by the formulas

$$x = r\cos\theta, \qquad y = r\sin\theta, \qquad z = z, \qquad x^2 + y^2 = r^2.$$

[Figures: left, the unit sphere $r^2 + z^2 = 1$; right, a point with cylindrical coordinates $(r, \theta, z)$.]

As for polar coordinates in the plane, these formulas work in only one direction:
when r and $\theta$ are named, x and y are determined, but when x and y are named, there
are two possibilities for r and infinitely many possibilities for $\theta$. (See figure on the
right above.)
Suppose now that we have given a domain D in the plane $z = 0$. The plane
$z = 0$ may be regarded as the xy-plane or the $r\theta$-plane; sometimes we shall refer to
it simply as the base plane. Suppose that we have given a continuous function
$f\colon D \to \mathbf{R}$. If we describe a point P of D by its polar coordinates $(r, \theta)$, then we have

$$z = f(P) = f(r, \theta).$$

We now want to compute

$$\iint_D f(P)\,dA,$$

and we want to do this without transforming to rectangular coordinates. In some
cases we might not be able to transform; and in other cases we wouldn't want to,
because the rectangular form would turn out to be unmanageable. Therefore we
need to know how to deal with cylindrical coordinates in their own terms. This can
be done as follows.
Given a domain D in the base plane. By a net over D we mean a finite collection

$$N\colon D_1, D_2, \ldots, D_n$$

of regions such that (1) D is the union of the $D_i$'s, (2) each $D_i$ has an area (i.e., is
measurable, in the sense defined in Appendix G), and (3) if $D_i$ intersects $D_j$ ($i \ne j$), then the
area of the intersection is 0. The sets $D_i$ are called the cells of the net. The figure
indicates, at long last, why we use the word net in integration theory.

[Figure: a net over a domain D.]

By the diameter of a set $D_i$ we mean the supremum of the distances between its
points. The diameter is denoted by $\delta D_i$. Thus

$$\delta D_i = \sup\,\{PQ \mid P, Q \text{ in } D_i\}.$$

Note that if $D_i$ is a circular region, then $\delta D_i$ is the diameter of $D_i$ in the elementary
sense. The mesh of the net N is the greatest of the diameters of the cells of the net.
The mesh is denoted by $|N|$. Thus

$$|N| = \max\,\{\delta D_i\}.$$

A sample of the net N is a sequence

$$P_1, P_2, \ldots, P_n$$

of points, where $P_i$ belongs to $D_i$ for each i (see figure at the right above).
For each i, let $\Delta A_i$ be the area of $D_i$. A sample sum of f over the net N is a sum
of the form

$$\sum_{i=1}^{n} f(P_i)\,\Delta A_i.$$

We are now finally ready to give our definition of the double integral. By definition,

$$\iint_D f(P)\,dA = \lim_{|N| \to 0} \sum_{i=1}^{n} f(P_i)\,\Delta A_i,$$

if such a limit exists. If the limit exists, then f is said to be integrable on D, or simply
integrable. In this definition, $\lim_{|N| \to 0}$ means the same thing that it meant in the
definition of the integral for functions of one variable; when we write

$$\lim_{|N| \to 0} \sum_{i=1}^{n} f(P_i)\,\Delta A_i = L,$$

this means that for every $\epsilon > 0$ there is a $\delta > 0$ such that

$$|N| < \delta \quad\Rightarrow\quad \left|\sum_{i=1}^{n} f(P_i)\,\Delta A_i - L\right| < \epsilon.$$

We recall that if f is continuous on the closed interval [a, b], then f is integrable
on [a, b]. We want to state an analogous theorem for functions of two variables.
It would hardly do to restrict ourselves to "two-dimensional closed intervals"
$a \le x \le b$, $c \le y \le d$. On the other hand, we cannot allow all sets D in the
xy-plane as domains, because continuous functions on some domains may not even be
bounded. (Examples?) What is needed here is the following:

Definition. A point P is a limit point of a set D if every neighborhood $U(P, \delta)$ of P
contains a point of D other than P.

Definition. A set D is closed if it contains all its limit points.

Thus a closed interval is closed, but an open interval is not; the region

$$D = \{(x, y) \mid x^2 + y^2 \le 1\}$$

is closed, but the region

$$D' = \{(x, y) \mid x^2 + y^2 < 1\}$$

is not.
We recall that a set D in a plane is bounded if it lies in the interior of some circle
(or, equivalently, if it lies in the interior of some rectangle). We can now finally
state our theorem:

Theorem 1. Let D be a closed, bounded, measurable set in the xy-plane, and let f be
a function which is continuous on D. Then f is integrable on D.

You may be able to convince yourself of this, for positive functions, by thinking of
the integral as a volume, and thinking of the sample sums as approximations of the
volume; the idea is that we can approximate the volume as closely as we please,
by cutting up the base domain into sufficiently small pieces. If the function is negative
somewhere, then we need to use volumes with signs attached, but the idea is much
the same. But a mathematical proof that all this works is far beyond the scope of
this book, and we make no attempt to present one.


Meanwhile we assume that the theorem is true, and return to the problem of
integration in cylindrical coordinates. For the sake of simplicity, we consider first a
domain of the type

$$D = \{(r, \theta) \mid a \le r \le b,\ \alpha \le \theta \le \beta\}.$$

This is the polar equivalent of a rectangular region.

[Figure: the domain D in polar coordinates.]

The first step in the calculation of the integral is to set up a net $N_r$ on the interval
$[a, b]$ and a net $N_\theta$ on the interval $[\alpha, \beta]$. Thus we have

$$N_r\colon r_0, r_1, \ldots, r_n; \qquad N_\theta\colon \theta_0, \theta_1, \ldots, \theta_m.$$

The circles $r = r_i$ and the rays $\theta = \theta_j$ now cut up the domain D into nm little pieces,
like this:

[Figure: the domain D cut into cells by circles and rays, with one cell shaded.]

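Let $D_{ij}$ be the shaded region in the figure. Thus

$$D_{ij} = \{(r, \theta) \mid r_{i-1} \le r \le r_i,\ \theta_{j-1} \le \theta \le \theta_j\}.$$

Evidently the $D_{ij}$'s form a net N over D. For each i, j, the area of $D_{ij}$ is

$$\Delta A_{ij} = \tfrac12 r_i^2\,\Delta\theta_j - \tfrac12 r_{i-1}^2\,\Delta\theta_j.$$

Then $r_{i-1} = r_i - \Delta r_i$, and

$$r_i^2 - r_{i-1}^2 = r_i^2 - (r_i^2 - 2r_i\,\Delta r_i + \Delta r_i^2) = 2r_i\,\Delta r_i - \Delta r_i^2.$$

Therefore

$$\Delta A_{ij} = \tfrac12\,\Delta\theta_j\,[r_i^2 - r_{i-1}^2] = \tfrac12\,\Delta\theta_j\,[2r_i\,\Delta r_i - \Delta r_i^2] = r_i\,\Delta r_i\,\Delta\theta_j - \tfrac12\,\Delta r_i^2\,\Delta\theta_j.$$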
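In each cell $D_{ij}$ of the net we pick the sample point $P_{ij} = (r_i, \theta_j)$. We now form the
sample sum

$$\Sigma = \sum_{i=1}^{n}\sum_{j=1}^{m} f(r_i, \theta_j)\,\Delta A_{ij} = \sum_{i=1}^{n}\sum_{j=1}^{m} f(r_i, \theta_j)\,[r_i\,\Delta r_i\,\Delta\theta_j - \tfrac12\,\Delta r_i^2\,\Delta\theta_j].$$

Evidently $\Sigma$ is a sample sum of f over D; and our problem is to find

$$\lim_{|N| \to 0} \Sigma = \iint_D f(P)\,dA.$$

But this is much easier than it looks:

$$\Sigma = \sum_{i=1}^{n}\sum_{j=1}^{m} f(r_i, \theta_j)\,r_i\,\Delta r_i\,\Delta\theta_j - \tfrac12\,\Delta r_i \sum_{i=1}^{n}\sum_{j=1}^{m} f(r_i, \theta_j)\,\Delta r_i\,\Delta\theta_j.$$

(The factor $\Delta r_i$ can be brought outside the sum in this way when all the $\Delta r_i$ are equal.)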

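We interpret each of these double sums as a sample sum in rectangular coordinates.

[Figure: the rectangular region D' in the $r\theta$-plane, with $a \le r \le b$ and $\alpha \le \theta \le \beta$, and a sample point $P_{ij}$.]

Let D' be the rectangular region shown in the figure. That is,

$$D' = \{(r, \theta) \mid a \le r \le b,\ \alpha \le \theta \le \beta\}.$$

The limits of our sums are now known:

$$\lim_{|N| \to 0} \sum_{i=1}^{n}\sum_{j=1}^{m} f(r_i, \theta_j)\,r_i\,\Delta r_i\,\Delta\theta_j = \iint_{D'} f(r, \theta)\,r\,dr\,d\theta,$$

and

$$\lim_{|N| \to 0} \sum_{i=1}^{n}\sum_{j=1}^{m} f(r_i, \theta_j)\,\Delta r_i\,\Delta\theta_j = \iint_{D'} f(r, \theta)\,dr\,d\theta.$$

Since $\lim_{|N| \to 0} \Delta r_i = 0$, we have

$$\iint_D f(r, \theta)\,dA = \iint_{D'} f(r, \theta)\,r\,dr\,d\theta + 0 \cdot \iint_{D'} f(r, \theta)\,dr\,d\theta = \iint_{D'} f(r, \theta)\,r\,dr\,d\theta.$$

Thus the second integral has dropped out.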


We usually evaluate double integrals by converting them into iterated integrals.
For the special type of domain that we have been discussing, we can sum up our
results in the following theorem:
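Theorem 2. Let

$$D = \{(r, \theta) \mid a \le r \le b,\ \alpha \le \theta \le \beta\},$$

in polar coordinates, and let f be a function which is continuous on D. Then

$$\iint_D f(r, \theta)\,dA = \int_\alpha^\beta \int_a^b f(r, \theta)\,r\,dr\,d\theta = \int_a^b \int_\alpha^\beta f(r, \theta)\,r\,d\theta\,dr.$$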

Let us try this out in a simple case in which we know the answer. Consider the
hemisphere under the graph of

$$z = f(x, y) = \sqrt{1 - x^2 - y^2}.$$

In cylindrical coordinates,

$$z = f(r, \theta) = \sqrt{1 - r^2}.$$

Let

$$D = \{(x, y) \mid x^2 + y^2 \le 1\} = \{(r, \theta) \mid r^2 \le 1\}.$$

Then the volume of the hemisphere is

$$\iint_D \sqrt{1 - r^2}\,dA = \int_0^{2\pi} \int_0^1 \sqrt{1 - r^2}\,r\,dr\,d\theta.$$

Now

$$\int \sqrt{1 - r^2}\,r\,dr = -\tfrac13(1 - r^2)^{3/2} + C.$$

Therefore

$$\int_0^1 \sqrt{1 - r^2}\,r\,dr = \tfrac13.$$

Therefore

$$\iint_D \sqrt{1 - r^2}\,dA = \int_0^{2\pi} \tfrac13\,d\theta = \frac{2\pi}{3}.$$

This is right, because the volume of the whole sphere is $\tfrac43\pi \cdot 1^3 = \tfrac43\pi$.
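
The hemisphere calculation can also be checked directly from the definition, by forming a sample sum over a polar net. The sketch below is an illustration under stated assumptions, not the text's own procedure: it uses equal steps in r and in $\theta$, takes the sample point $(r_i, \theta_j)$ at the outer corner of each cell, and uses the cell area $r_i\,\Delta r\,\Delta\theta - \tfrac12\,\Delta r^2\,\Delta\theta$. The function names are hypothetical.

```python
import math


def polar_sample_sum(f, a, b, alpha, beta, n=2000, m=8):
    """Sample sum of f over the polar rectangle a <= r <= b, alpha <= theta <= beta.

    Uses n equal steps in r, m equal steps in theta, sample points
    (r_i, theta_j), and the exact area of each polar cell.
    """
    dr = (b - a) / n
    dtheta = (beta - alpha) / m
    total = 0.0
    for i in range(1, n + 1):
        r_i = a + i * dr
        # Exact area of the cell between r_{i-1} and r_i (same for every j).
        area = r_i * dr * dtheta - 0.5 * dr * dr * dtheta
        for j in range(1, m + 1):
            theta_j = alpha + j * dtheta
            total += f(r_i, theta_j) * area
    return total


def hemisphere_height(r, theta):
    # max() guards against a tiny negative argument from rounding at r = 1.
    return math.sqrt(max(0.0, 1.0 - r * r))


if __name__ == "__main__":
    approx = polar_sample_sum(hemisphere_height, 0.0, 1.0, 0.0, 2.0 * math.pi)
    print(approx, 2.0 * math.pi / 3.0)
```

As the mesh shrinks, the sample sum should approach $2\pi/3$; with the net above it is already close.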

PROBLEM SET

14.10

1. Let $D = \{(x, y) \mid x^2 + y^2 \le 1\}$. Find

$$\iint_D (x^2 + y^2)^{7/2}\,dy\,dx.$$

2. Let $D = \{(x, y) \mid x^2 + y^2 \le 1\}$. Find

$$\iint_D \sqrt{1 + x^2 + y^2}\,dy\,dx.$$

3. Find the volume of the solid which lies under the paraboloid

the interior of the cardioid

r =

sin e.

4. Find the volume of the solid lying inside the cylinder x2 +

sphere x2 +

y2

z2

4.

5. Find the volume of the solid lying inside the cylinder

ellipsoid
x2

6. Find

y2
4

il Jyl-x

z2

ffv 2

x +

x2 +

y2 and over

y2

1 and inside the

y2

1 and inside the

1.

y2)10dydx.

_(x2 +

-1 -v1-x2

7. Let D be the circular region with center at

x2

(0, 1) and radius 1 , in the xy-plane. Find

y2

dydx.

8. Let D be as in Problem 7. Find

ffvx2

y
+

dydx.
y2

9. Let S be the part of the disk D lying in the half-plane {(x,

ff
s

x2

xy

+ y2

dydx.

y) Ix ;;; O}. Find

Functions of Several Variables

674
10.

11.

Find

f1 i'h-x sin7 cos7


-1 - y l-x
12 iy4-x 2
0 0

x
y
(X2 + 2)3/2 dydx.
Y

Find

14.11

14.10

x2 - Y.
- dydx.
x2 + y2

MOMENTS AND CENTROIDS OF NONHOMOGENEOUS BODIES

We recall, from Section 7.6, the definitions of moments and centroids for finite
systems of point masses in a coordinate plane. Suppose that we have given a set of
particles $P_1, P_2, \ldots, P_n$, with masses $m_1, m_2, \ldots, m_n$, at the points $(x_1, y_1)$,
$(x_2, y_2), \ldots, (x_n, y_n)$. The moment of the system about the y-axis is defined to be

$$M_y = \sum_{i=1}^{n} m_i x_i,$$

and the moment about the x-axis is

$$M_x = \sum_{i=1}^{n} m_i y_i.$$

More generally, the moment about the line $x = x_0$ is

$$M_{x=x_0} = \sum_{i=1}^{n} m_i (x_i - x_0),$$

and the moment about the line $y = y_0$ is

$$M_{y=y_0} = \sum_{i=1}^{n} m_i (y_i - y_0).$$

If

$$M_{x=\bar x} = M_{y=\bar y} = 0,$$

then the point $(\bar x, \bar y)$ is called the centroid of the system. By easy calculations we get

$$\bar x = \frac1m \sum_{i=1}^{n} m_i x_i, \qquad \bar y = \frac1m \sum_{i=1}^{n} m_i y_i, \qquad \text{where } m = \sum_{i=1}^{n} m_i.$$

It is easy to see that if the axes are translated, the centroid is unchanged: for

$$x = x' + h, \qquad x' = x - h, \qquad y = y' + k, \qquad y' = y - k,$$

the coordinates of the centroid in the new coordinate system are given by the formulas

$$\bar x' = \frac1m \sum_{i=1}^{n} m_i x_i' = \frac1m \sum_{i=1}^{n} m_i (x_i - h) = \bar x - \frac{h}{m} \sum_{i=1}^{n} m_i = \bar x - h,$$

$$\bar y' = \frac1m \sum_{i=1}^{n} m_i y_i' = \bar y - k,$$

so that in the new coordinate system we get the same centroid as before. Similarly,
if we reverse the direction of the x-axis, or the y-axis, or both, we get the same
centroid as before. Finally, we observe that the centroid is unchanged if we rotate
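the axes through an angle of measure $\theta$. We have

$$x = x'\cos\theta - y'\sin\theta, \qquad y = x'\sin\theta + y'\cos\theta.$$

Therefore

$$\bar x = \frac1m \sum_{i=1}^{n} m_i (x_i'\cos\theta - y_i'\sin\theta) = \left(\frac1m \sum_{i=1}^{n} m_i x_i'\right)\cos\theta - \left(\frac1m \sum_{i=1}^{n} m_i y_i'\right)\sin\theta = \bar x'\cos\theta - \bar y'\sin\theta,$$

where $\bar x'$ and $\bar y'$ are the new coordinates of the centroid. A similar calculation
gives

$$\bar y = \bar x'\sin\theta + \bar y'\cos\theta.$$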
Thus the old coordinate system and the new one give us the same point as centroid.
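
The invariance of the centroid can be illustrated numerically. In the sketch below, the masses, points, shift $(h, k)$, and angle are arbitrary illustrative values, and `centroid` and `new_coordinates` are hypothetical helper names. The centroid computed from the new coordinates of the particles should coincide with the new coordinates of the old centroid.

```python
import math


def centroid(masses, points):
    """Centroid (x-bar, y-bar) of point masses m_i at the points (x_i, y_i)."""
    m = sum(masses)
    xbar = sum(mi * x for mi, (x, y) in zip(masses, points)) / m
    ybar = sum(mi * y for mi, (x, y) in zip(masses, points)) / m
    return (xbar, ybar)


def new_coordinates(point, h, k, theta):
    """Coordinates of a point after translating the axes by (h, k)
    and then rotating them through the angle theta."""
    x, y = point[0] - h, point[1] - k
    c, s = math.cos(theta), math.sin(theta)
    # Inverse of x = x' cos(theta) - y' sin(theta), y = x' sin(theta) + y' cos(theta).
    return (x * c + y * s, -x * s + y * c)


if __name__ == "__main__":
    masses = [1.0, 2.0, 3.0]                        # illustrative values
    points = [(0.0, 0.0), (4.0, 0.0), (1.0, 5.0)]   # illustrative values
    h, k, theta = 2.0, -1.0, 0.7
    moved = [new_coordinates(p, h, k, theta) for p in points]
    print(centroid(masses, moved))
    print(new_coordinates(centroid(masses, points), h, k, theta))
```

The two printed pairs should agree up to rounding, which is the content of the calculation above.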
Suppose now that we have a thin rod, lying on an interval [a, b] on the x-axis.
We do not suppose that its mass per unit length is constant. But in any case there
is a function f which gives, for each x, the mass of the part of the rod that lies on
the interval [a, x]. If f has a continuous derivative f', then

$$\int_a^x f'(t)\,dt = f(x) - f(a) = f(x),$$

because $f(a) = 0$. More generally,

$$\int_{x_1}^{x_2} f'(t)\,dt = f(x_2) - f(x_1),$$

which is the mass of the part of the rod that lies on $[x_1, x_2]$.
A function $\rho$ which behaves in the way that we have just observed for f' is called a
density function for the rod. That is:

Definition. Given a rod on [a, b]. For $a \le x_1 < x_2 \le b$, let $m(x_1, x_2)$ be the mass
of the part of the rod that lies on $[x_1, x_2]$. A density function for the rod is a function
$\rho$ such that

$$m(x_1, x_2) = \int_{x_1}^{x_2} \rho(x)\,dx.$$

It follows, of course, that the total mass m of the rod is $m(a, b) = \int_a^b \rho(x)\,dx$.
And the definition agrees with our intuitive notion of what density at a point ought
to mean. If $\rho(x_0)$ is the density at $x_0$, then it ought to be true that

$$\frac{m(x_0, x_0 + \delta)}{\delta} \approx \rho(x_0)$$

when $\delta \approx 0$.

Here the left-hand side is the average mass per unit length on the interval
$[x_0, x_0 + \delta]$, and this ought to be approximately $\rho(x_0)$ when $\delta \approx 0$. And this is true, at any point
where $\rho$ is continuous:

$$\lim_{\delta \to 0} \frac1\delta\,m(x_0, x_0 + \delta) = \lim_{\delta \to 0} \frac1\delta \int_{x_0}^{x_0+\delta} \rho(x)\,dx = \rho(x_0),$$
by the general formula for the derivative of the integral. We assume hereafter that
$\rho$ is a continuous function. Let us take a net $N\colon x_0, x_1, \ldots, x_n$ over [a, b], and form
the sum

$$\sum_{i=1}^{n} x_i\,\rho(x_i)\,\Delta x_i.$$

This sum is the moment, about the origin, of a finite system of particles of mass

$$\rho(x_1)\,\Delta x_1, \ldots, \rho(x_n)\,\Delta x_n,$$

at the points $x_1, x_2, \ldots, x_n$. The limit of the sum, as the mesh of the net approaches
0, is $\int_a^b x\rho(x)\,dx$. This integral is defined to be the moment of the rod about the origin.
Thus

$$M_0 = \int_a^b x\rho(x)\,dx.$$

More generally, the moment about the point k is

$$M_k = \int_a^b (x - k)\rho(x)\,dx.$$

We now define the centroid as the point $\bar x$ such that

$$M_{\bar x} = 0.$$

It is easy to calculate that

$$\bar x = \frac{\int_a^b x\rho(x)\,dx}{\int_a^b \rho(x)\,dx}.$$

By the definition of the density function, the integral in the denominator is the total
mass m of the system. Thus, briefly,

$$\bar x = \frac1m \int_a^b x\rho(x)\,dx.$$

Suppose, for example, that the rod lies on the interval [0, 2], and that the density
is proportional to the distance from the origin. Here we have

$$\rho(x) = kx, \qquad m = \int_0^2 kx\,dx = \left[\tfrac12 kx^2\right]_0^2 = 2k,$$

$$\bar x = \frac1m \int_0^2 x \cdot kx\,dx = \frac{1}{2k}\,k\left[\tfrac13 x^3\right]_0^2 = \tfrac12 \cdot \tfrac83 = \tfrac43.$$

This is greater than one, as it should be.
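
The formula $\bar x = \frac1m \int_a^b x\rho(x)\,dx$ is easy to evaluate numerically for any continuous density. The following sketch is an illustration only: the midpoint rule and the helper names `midpoint` and `rod_centroid` are assumptions, and the constant of proportionality k = 5 is an arbitrary choice (it cancels in the quotient).

```python
def midpoint(g, lo, hi, n=1000):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (i + 0.5) * h) for i in range(n))


def rod_centroid(rho, a, b, n=1000):
    """Centroid of a rod on [a, b] with density function rho."""
    mass = midpoint(rho, a, b, n)                       # m = integral of rho
    moment = midpoint(lambda x: x * rho(x), a, b, n)    # M_0 = integral of x * rho
    return moment / mass


if __name__ == "__main__":
    k = 5.0  # arbitrary constant of proportionality
    print(rod_centroid(lambda x: k * x, 0.0, 2.0), 4 / 3)
```

For the rod on [0, 2] with density kx, the computed value should be close to 4/3, whatever k is.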

Consider next a thin plate, occupying a region D in the xy-plane. Again we do
not suppose that the density per unit area is constant.

[Figure: a region D in the xy-plane, with a subregion $D_i$.]

For each subregion $D_i$ of D let $m(D_i)$ be the mass of the portion of the plate that
lies in $D_i$. Following the analogy of the rod, we define a density function for the
plate to be a function $\rho$ such that

$$m(D_i) = \iint_{D_i} \rho(P)\,dA$$

for every $D_i$ lying in D. In particular, $D_i$ may be all of D; and in this case the total
mass of the plate is

$$m = m(D) = \iint_D \rho(P)\,dA.$$

Hereafter, we assume that $\rho$ is continuous. We take a net

$$N\colon D_1, D_2, \ldots, D_n$$

over D; we take a sample $P_1, P_2, \ldots, P_n$ of N, with $P_i = (x_i, y_i)$; and we form the
sum

$$\sum_{i=1}^{n} x_i\,\rho(x_i, y_i)\,\Delta A_i,$$

where $\Delta A_i$ is the area of $D_i$. This sum is the moment about the y-axis of a system of
particles of mass $\rho(x_i, y_i)\,\Delta A_i$, with x-coordinates $x_i$. As the mesh of the net
approaches zero, these sums approach the limit

$$\iint_D x\rho(x, y)\,dA.$$

By definition, the moment of the plate about the y-axis is this integral. More generally,
the moment about the line $x = k$ is

$$M_{x=k} = \iint_D (x - k)\rho(x, y)\,dA;$$

and similarly,

$$M_{y=k} = \iint_D (y - k)\rho(x, y)\,dA.$$

The centroid is the point $(\bar x, \bar y)$ such that

$$M_{x=\bar x} = M_{y=\bar y} = 0.$$

Since the total mass is

$$m = \iint_D \rho(x, y)\,dA,$$

an easy calculation gives

$$\bar x = \frac1m \iint_D x\rho(x, y)\,dA, \qquad \bar y = \frac1m \iint_D y\rho(x, y)\,dA.$$

In the preceding discussion, we have assumed for the sake of simplicity that the
density is continuous, so that we don't need to worry about whether our integrals
exist. In some very simple cases, however, the density is not continuous. Suppose,
for example, that we take a rod of unit length, with constant density 1, and another
rod of unit length, with constant density 2, and lay them end to end.

[Figure: two rods laid end to end on the interval [0, 2], with $\rho = 1$ on the left half and $\rho = 2$ on the right half.]

Thus $\rho(x) = 1$ for $0 \le x < 1$, and $\rho(x) = 2$ for $1 < x \le 2$. At the midpoint 1,
we split the difference, and take $\rho(1) = \tfrac32$. We now have a discontinuous density
function whose graph looks like this:

[Figure: the graph of $y = \rho(x)$.]

Note, however, that $\rho$ is integrable, and that

$$\int_0^2 \rho(x)\,dx = 1 + 2 = 3,$$

which is equal to the mass, as it should be. The function $x\rho(x)$ looks like this:

[Figure: the graph of $y = x\rho(x)$.]

and

$$M_0 = \int_0^2 x\rho(x)\,dx = \tfrac12 + \tfrac12(2 + 4) = \tfrac72.$$

Therefore

$$\bar x = \frac1m M_0 = \tfrac13 \cdot \tfrac72 = \tfrac76.$$

This is the right answer. If we assume that the masses of the two halves of the rod are
concentrated at their centroids, then we get two particles, of masses 1 and 2, at the
points $\tfrac12$ and $\tfrac32$. Here

$$M_0 = \tfrac12 \cdot 1 + \tfrac32 \cdot 2 = \tfrac72, \qquad m = 3,$$

and

$$\bar x = \tfrac13 \cdot \tfrac72 = \tfrac76,$$

as before.
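
The same two answers can be reproduced numerically. The sketch below is an illustration under stated assumptions: it uses a midpoint rule with an even number of steps, so that the jump at x = 1 falls on a subdivision point and no sample point lands on it; the names `rho`, `midpoint`, and `two_particle_centroid` are hypothetical.

```python
def rho(x):
    """Density of the composite rod: 1 on [0, 1), 2 on (1, 2], and 3/2 at x = 1."""
    if x < 1.0:
        return 1.0
    if x > 1.0:
        return 2.0
    return 1.5


def midpoint(g, lo, hi, n=1000):
    """Midpoint rule; with n even on [0, 2], the point x = 1 is a subdivision point."""
    h = (hi - lo) / n
    return h * sum(g(lo + (i + 0.5) * h) for i in range(n))


def two_particle_centroid():
    """Centroid of masses 1 and 2 placed at the centroids 1/2 and 3/2 of the halves."""
    return (1.0 * 0.5 + 2.0 * 1.5) / (1.0 + 2.0)


if __name__ == "__main__":
    mass = midpoint(rho, 0.0, 2.0)
    moment = midpoint(lambda x: x * rho(x), 0.0, 2.0)
    print(mass, moment, moment / mass, two_particle_centroid())
```

Both routes should give a centroid of 7/6, in agreement with the calculation above.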
This illustrates the way in which our formulas work, for discontinuous density
functions. The general theory, however, is hard, and we make no attempt to discuss
it here. Meanwhile the above example shows that some very simple physical situations
lead naturally to discontinuous functions.
PROBLEM SET

14.11

1. A thin rod occupies the interval [2, 4]. Its density is proportional to the distance from
the origin. Find the centroid.

2. A thin rod occupies the interval [1, 2]. Its density is proportional to the square root
of the distance from the origin. Find the centroid.


3. A thin plate occupies the unit disk with center at the origin. Its density is proportional

to e-<"+Y'>'. Find the centroid.

4. A thin plate occupies the unit disk with center at the origin. Its density is proportional
to $\sqrt{1 + x^2 + y^2}$. Find the centroid.

5. A thin plate occupies the right-hand half ($x \ge 0$) of the unit disk with center at the
origin. Its density is proportional to the distance from the origin. Find the centroid.

6. Same question, where the density is proportional to the square of the distance from the
origin.

7. A thin plate occupies the interior of the cardioid $r = 1 - \sin\theta$. Its density is constant,
say, 1. Find the centroid. (The computation is long, even if the appropriate
short-cuts are used.)


8. A function f is defined by the conditions

f(x)
Find

for 0 x 1 ,

x2
(x -

2) 2

for 1 x

2.

Sf(x)dx.

9. A function/ is defined by the condition

f(x)
Find

Sf(x)dx.

{; = 2

for 0 x 1 ,
for 1

x 2.

10. A thin plate occupies the square region whose corners are (0, 0), (1, 1 ), (2, 0), and
(1, - 1 ). Its density is proportional to the distance from the y-axis. Find the centroid.
11. A thin plate occupies a triangular region with vertices (0, 0), ( 1 , 1 ), and (1, -1). Its
density is proportional to the distance from the x-axis. Find the centroid.

12. Given a thin plate, occupying a region D, with density function $\rho$. The moment of
inertia of the plate about the point $P_0 = (x_0, y_0)$ is defined to be

$$I_{P_0} = \iint_D (P_0P)^2 \rho(P)\,dA = \iint_D [(x - x_0)^2 + (y - y_0)^2]\,\rho(x, y)\,dA.$$

Suppose that the plate occupies the unit circle with center at the origin, and that the
density is constant. Find the moment of inertia about the origin.

13. Under the conditions of Problem 12, find out which point P0 gives the minimum value
of the moment of inertia.

14. The moment of inertia of a thin plate about the line $x = x_0$ is

$$I_{x=x_0} = \iint_D (x - x_0)^2 \rho(x, y)\,dA.$$

The moment of inertia $I_{y=y_0}$ about the line $y = y_0$ is defined similarly. A thin plate
occupies the right-hand half of the unit disk with center at the origin, and its density is
proportional to the distance from the origin. Find $I_{x=0}$ and $I_{y=0}$.

15. Given a thin plate, with density function $\rho$, on a domain D. For what point $P_0$ does the
moment of inertia $I_{P_0}$ take on its minimum value?
14.12

LINE INTEGRALS

Suppose that we have given a path $P\colon I \to D$, where I is a closed interval [a, b] and
D is a region in a coordinate plane.

[Figure: a path in D from P(a) to P(b).]

Let f and g be the coordinate functions of the path P, so that

$$P(t) = (f(t), g(t))$$

for each t, and suppose that f and g have continuous derivatives. Let F and G be
continuous functions defined on D. The line integral $\int_P F\,dx + G\,dy$, of F and G
over the path P, is defined as follows. Let

$$N\colon t_0, t_1, \ldots, t_n$$

be any net over $I = [a, b]$. For each i, let

$$x_i = f(t_i), \qquad y_i = g(t_i), \qquad \Delta x_i = x_i - x_{i-1}, \qquad \Delta y_i = y_i - y_{i-1}.$$

Then

$$\int_P F\,dx + G\,dy = \lim_{|N| \to 0} \sum_{i=1}^{n} [F(x_i, y_i)\,\Delta x_i + G(x_i, y_i)\,\Delta y_i].$$

We need, of course, to show that the limit exists. We shall do this by deriving a
formula for the limit, as follows:

Theorem 1. If F and G are continuous, and the coordinate functions f and g of the
path P have continuous derivatives, then

$$\int_P F\,dx + G\,dy = \int_a^b F(f(t), g(t))\,f'(t)\,dt + \int_a^b G(f(t), g(t))\,g'(t)\,dt.$$

Proof. On each little interval $[t_{i-1}, t_i]$ of the net, we take a $\bar t_i$ such that

$$\Delta x_i = f'(\bar t_i)\,\Delta t_i.$$

(Such a $\bar t_i$ exists, by the mean-value theorem.) Then

$$\sum_{i=1}^{n} F(x_i, y_i)\,\Delta x_i = \sum_{i=1}^{n} F(f(t_i), g(t_i))\,f'(\bar t_i)\,\Delta t_i.$$

This is almost, but not quite, a sample sum of the function

$$\phi(t) = F(f(t), g(t))\,f'(t)$$

over the net N; the only trouble is that we have substituted two different sample
points in two different places in the formula for $\phi$. But in the limit, this does not
matter. (See Appendix I, where a very similar case is discussed in detail.) Therefore

$$\lim_{|N| \to 0} \sum_{i=1}^{n} F(x_i, y_i)\,\Delta x_i = \int_a^b \phi(t)\,dt = \int_a^b F(f(t), g(t))\,f'(t)\,dt.$$

In exactly the same way, we get

$$\lim_{|N| \to 0} \sum_{i=1}^{n} G(x_i, y_i)\,\Delta y_i = \int_a^b G(f(t), g(t))\,g'(t)\,dt;$$

and from this the theorem follows.


For example, we might have

$$P(t) = (f(t), g(t)) = (t + 1, t^2) \qquad (0 \le t \le 1),$$

$$F(x, y) = x + y, \qquad G(x, y) = x + y^2.$$

Then

$$\int_P F\,dx + G\,dy = \int_0^1 [(t + 1 + t^2) + (t + 1 + t^4)\,2t]\,dt = \int_0^1 (2t^5 + 3t^2 + 3t + 1)\,dt = \tfrac{23}{6}.$$
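
This example can be checked against the definition itself, by forming the sum $\sum [F(x_i, y_i)\,\Delta x_i + G(x_i, y_i)\,\Delta y_i]$ over a fine net. The sketch below is an illustration only; it assumes a net with equal steps, and the function name `line_integral_sample_sum` is hypothetical.

```python
def line_integral_sample_sum(F, G, f, g, a, b, n=20000):
    """Sum of F(x_i, y_i)*dx_i + G(x_i, y_i)*dy_i over a net of n equal steps on [a, b]."""
    h = (b - a) / n
    total = 0.0
    x_prev, y_prev = f(a), g(a)
    for i in range(1, n + 1):
        t = a + i * h
        x, y = f(t), g(t)
        total += F(x, y) * (x - x_prev) + G(x, y) * (y - y_prev)
        x_prev, y_prev = x, y
    return total


if __name__ == "__main__":
    approx = line_integral_sample_sum(
        F=lambda x, y: x + y,
        G=lambda x, y: x + y * y,
        f=lambda t: t + 1.0,   # x-coordinate of the path
        g=lambda t: t * t,     # y-coordinate of the path
        a=0.0,
        b=1.0,
    )
    print(approx, 23 / 6)
```

The sum should be close to 23/6, the value given by the formula of Theorem 1, and should get closer as n grows.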

Line integrals have the following quite natural physical interpretation. We regard
the path $P\colon I \to D$ as a description of the motion of a particle in the plane, during
the time interval $a \le t \le b$. Suppose that at each point (x, y) of D there is a resisting
force

$$R(x, y) = F(x, y)\mathbf{i} + G(x, y)\mathbf{j}.$$

Here R is a vector, with components F and G in the x- and y-directions, and the
indicated addition is vector addition. As the particle moves from $P(t_{i-1})$ to $P(t_i)$, the work
should be approximately

$$W_i \approx F(x_i, y_i)\,\Delta x_i + G(x_i, y_i)\,\Delta y_i,$$

where the first term is the "work in the x-direction" and the second term is the "work
in the y-direction." Therefore the total work W should be

$$W \approx \sum_{i=1}^{n} [F(x_i, y_i)\,\Delta x_i + G(x_i, y_i)\,\Delta y_i].$$

Passing to the limit, as $|N| \to 0$, we get

$$W = \int_P F\,dx + G\,dy,$$

which is the formula ordinarily used as the definition of work.


In the above discussion we have described the path P and the resistance R in
terms of the coordinate functions f, g (for P) and F, G (for R). Thus, in effect, we
have been using a particular basis $\{\mathbf{i}, \mathbf{j}\}$ for the base plane $\mathbf{R}^2$, with

$$P(t) = f(t)\mathbf{i} + g(t)\mathbf{j}, \qquad R(Q) = F(Q)\mathbf{i} + G(Q)\mathbf{j}.$$

Note, however, that

$$F(x_i, y_i)\,\Delta x_i + G(x_i, y_i)\,\Delta y_i = R(Q_i) \cdot \Delta P_i,$$

so that the sum

$$\sum_{i=1}^{n} [F(x_i, y_i)\,\Delta x_i + G(x_i, y_i)\,\Delta y_i],$$

whose limit is the line integral, is really a sum of inner products:

$$\sum_{i=1}^{n} R(Q_i) \cdot \Delta P_i,$$

where $Q_i = P(t_i) = (x_i, y_i)$ and $\Delta P_i = P(t_i) - P(t_{i-1})$. Therefore the line integral depends merely on the
vector-valued functions P and R; it is independent of the coordinate system in the
base plane. Of course this must be true, for any mathematical concept which has a
physical meaning. In vector notation, the line integral is denoted by

$$\int_P R \cdot dP.$$

A very important special case is the one in which the function

$$R\colon D \to \mathbf{R}^2, \qquad (x, y) \mapsto F(x, y)\mathbf{i} + G(x, y)\mathbf{j}$$

is an exact differential. This is the case in which there is a function $\phi$ such that

$$\phi_x = F, \qquad \phi_y = G,$$

so that the line integral takes the form

$$\int_P R \cdot dP = \int_P F\,dx + G\,dy = \int_P \phi_x\,dx + \phi_y\,dy = \int_a^b [\phi_x(f(t), g(t))\,f'(t) + \phi_y(f(t), g(t))\,g'(t)]\,dt.$$

Here, by the chain rule for paths, the integrand is the derivative of the function

$$\Phi(t) = \phi(f(t), g(t)).$$

It follows that

$$\int_P R \cdot dP = \phi(f(b), g(b)) - \phi(f(a), g(a)).$$

The answer here depends only on the endpoints

$$P(a) = (f(a), g(a)), \qquad P(b) = (f(b), g(b))$$

of the path; it is independent of the way in which the path proceeds from the initial
point P(a) to the terminal point P(b).
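
Independence of path can be seen in a small numerical experiment. The sketch below is an illustration under stated assumptions: the potential $\phi(x, y) = x^2 y$ is a hypothetical choice, so that $F = \phi_x = 2xy$ and $G = \phi_y = x^2$; the two paths both run from (0, 0) to (1, 1); and each line integral is computed from the parametric formula of Theorem 1 by the midpoint rule. All function names are hypothetical.

```python
def midpoint(h_fn, lo, hi, n=2000):
    """Midpoint-rule approximation of the integral of h_fn over [lo, hi]."""
    step = (hi - lo) / n
    return step * sum(h_fn(lo + (i + 0.5) * step) for i in range(n))


def line_integral(F, G, f, g, df, dg, a, b):
    """Integral of F dx + G dy over the path (f(t), g(t)), a <= t <= b,
    computed as the integral of F(f, g) f' + G(f, g) g' with respect to t."""
    return midpoint(lambda t: F(f(t), g(t)) * df(t) + G(f(t), g(t)) * dg(t), a, b)


def phi(x, y):
    return x * x * y          # hypothetical potential function


def F(x, y):
    return 2.0 * x * y        # phi_x


def G(x, y):
    return x * x              # phi_y


if __name__ == "__main__":
    # Path 1: the straight segment (t, t).  Path 2: the parabola (t, t^2).
    straight = line_integral(F, G, lambda t: t, lambda t: t,
                             lambda t: 1.0, lambda t: 1.0, 0.0, 1.0)
    parabola = line_integral(F, G, lambda t: t, lambda t: t * t,
                             lambda t: 1.0, lambda t: 2.0 * t, 0.0, 1.0)
    print(straight, parabola, phi(1.0, 1.0) - phi(0.0, 0.0))
```

Both paths should give a value close to $\phi(1, 1) - \phi(0, 0) = 1$, as the formula above predicts.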

In such a case, we may describe the line integral merely by using the endpoints of
the path as limits of integration, writing

$$\int_{(c,d)}^{(c',d')} \phi_x\,dx + \phi_y\,dy \qquad \text{for} \qquad \int_P \phi_x\,dx + \phi_y\,dy.$$

In these formulas, we use $\phi_x$, $\phi_y$ for F and G to emphasize that the notation $\int_{(c,d)}^{(c',d')}$
can be used only in the case where the integrand is an exact differential.

PROBLEM SET

14.12

Calculate the line integral $\int_P R \cdot dP$, by any method, for the following functions R
and P.

1. R(x,y) =x2i

2. R(x,y) = x2 i

y2j;

P(t) = i sint + j cost

y2j;

P(t) = it + jt2

(0 t I)

P(t) = i sint + j cost

(0 t

1T

P (t) = i sin11t + j cos11t

(0 t

1T

P(t) = it + jta

(0 t l)

3. R(x,y) = yi

xj;

4. R(x,y) = yi

xj ;

5.

R(x,y) =xyj;

6.

R(x,y) = i y e"'Y

7. R(x,y) = i y exy

P(t) = it3

j xexy;
jxexy;

(j l - ts)

(0 t 2rr)
)
)

(-1 t l)

P(t) = it

(-1 t l)

8.

R(x,y) =xi - yj ,

P(t) = i sin2t + j cos2t

(0 t 21T)

9.

R(x,y) =xi - yj ,

P(t) = it 2 + jt3

(0 t l)

10. R(x,y) =xi - yj ,

P (t) = i sint + j cost

(0 t 1T/2 )

11. R(x,y) = yi - xi,

P(t) = i sint + j cost

(4 t 7)

P(t) = i cost + j sint

(0 t

12. R(x,y) =(x2 - y2)i

2xyj ,

1T

Appendix A

The Shorthand of Logic and Set Theory

In this book, the use of logical symbols is held to a minimum, on the ground that
words are usually easier to read. But the symbolism explained below will at some
points be convenient in the text, and is even more useful in notebooks and on
blackboards.
We explained in Chapter 1 that $\Leftrightarrow$ means "is equivalent to." Thus

$$x < 1 \quad\Leftrightarrow\quad 1 > x.$$

And $\Rightarrow$ means "implies." Thus

$$x > 2 \quad\Rightarrow\quad x^2 > 4.$$

Occasionally we write this symbol backwards:

$$x^2 > 4 \quad\Leftarrow\quad x > 2,$$

which means the same thing. We also recall from Chapter 1 that

$$\{x \mid P(x)\}$$

denotes the set of all objects x such that P(x) is true. That is, $\{x \mid P(x)\}$ is the solution
set of the open sentence P(x). Thus the closed interval from 0 to 4 is

$$[0, 4] = \{x \mid 0 \le x \le 4\},$$

and the open interval from 1 to 2 is

$$(1, 2) = \{x \mid 1 < x < 2\}.$$

If A and B are sets, of any kind whatever, then

$$A \subset B$$

means that A is a subset of B; that is, every element of A is also an element of B.
For example, if $A = (1, 2)$ and $B = [0, 4]$, then $A \subset B$.
We allow the possibility that $A = B$, so that $A \subset A$ for every set A. Thus $\subset$ is
like $\le$, not like $<$. If $A \subset B$, we can also write

$$B \supset A.$$

If x is an element of the set A, then we write

$$x \in A.$$

This is read "x belongs to A." The denial of this statement is indicated by a diagonal
stroke. That is,

$$x \notin A$$

means that x does not belong to A.

The union of A and B is denoted by $A \cup B$. Thus

$$A \cup B = \{x \mid x \in A \text{ or } x \in B\}.$$

The formula $A \cup B$ is read "A cup B." The intersection of A and B is denoted by $A \cap B$. Thus

$$A \cap B = \{x \mid x \in A \text{ and } x \in B\}.$$

The formula $A \cap B$ is read "A cap B." The difference

$$A - B$$

of A and B is the set of all elements of A that are not elements of B. To write $A - B$,
we need not suppose that $B \subset A$. For example, if $A = [0, 2]$ and $B = [1, 4]$, then

$$A - B = [0, 1).$$

These sets are intervals, of course, described in the notation of Chapter 1. The
empty set is denoted by $\{\ \}$. Thus

$$A \cap B = \{\ \}$$

means that A and B have no element in common. And

$$A - B = \{\ \}$$

means that $A \subset B$.

When we write

$$(x - 1)^2 = x^2 - 2x + 1 \qquad \forall_x,$$

we mean that the equation on the left holds true for every x. Similarly,

$$x^2 - y^2 = (x - y)(x + y) \qquad \forall_{x,y}$$

means that the equation holds true for every x and y. The symbol "$\forall$" is read "for
every." More informally, we may write

$$x^3 - y^3 = (x - y)(x^2 + xy + y^2) \qquad \forall,$$

where the symbol "$\forall$" means that the preceding equation holds true for all values
of all the variables that appear in it. When "$\forall$" stands alone, it may be pronounced
always.
The symbol $\exists$ stands for "there exists," and the symbol $\ni$ stands for "such that."

Used in moderation, this symbolism is a convenience. But its use can be
overdone; and it takes practice to read formulas like

$$(x \in \mathbf{R} \text{ and } y \in \mathbf{R} \text{ and } x < y) \quad\Rightarrow\quad \exists z \ni z \in \mathbf{R} \text{ and } x < z < y.$$

This says that between any two real numbers there is a third.
This symbolism is introduced merely as a scheme of abbreviations of English
words and phrases. This book makes no attempt to deal with symbolic logic; and its
use of the "theory of sets" is entirely intuitive.

But the shorthand of logic is useful

simply as a shorthand; the point is that we are more likely to say what we mean if we
have a quick and easy way to do so.

Appendix B

Algebraic Operations with Limits of Functions

In Section 3.4 a number of theorems on limits were stated without proofs. Here we
give the proofs. First we recall some of the results of Section 3.4.

Theorem 1. If $\lim_{x \to x_0} f(x) = L$, then $\lim_{x \to x_0} [f(x) - L] = 0$.

Theorem 2. If $\lim_{x \to x_0} [f(x) - L] = 0$, then $\lim_{x \to x_0} f(x) = L$.

(These were Theorems 2 and 3 of Section 3.4.)

Theorem 3. If $\lim_{x \to x_0} f(x) = 0$ and $\lim_{x \to x_0} g(x) = 0$, then

$$\lim_{x \to x_0} [f(x) + g(x)] = 0.$$

Proof. Let $\epsilon > 0$ be given. There is a $\delta_1 > 0$ such that

$$0 < |x - x_0| < \delta_1 \quad\Rightarrow\quad |f(x)| < \epsilon/2.$$

(Here we are using $\epsilon/2$ for $\epsilon$ in the definition of the statement $\lim_{x \to x_0} f(x) = 0$.)
Similarly, there is a $\delta_2 > 0$ such that

$$0 < |x - x_0| < \delta_2 \quad\Rightarrow\quad |g(x)| < \epsilon/2.$$

Let $\delta$ be the smaller of the numbers $\delta_1$ and $\delta_2$. Briefly, $\delta = \min(\delta_1, \delta_2)$. Then

$$0 < |x - x_0| < \delta \quad\Rightarrow\quad |f(x)| < \epsilon/2 \ \text{ and } \ |g(x)| < \epsilon/2.$$

Since

$$|f(x) + g(x)| \le |f(x)| + |g(x)|,$$

we have

$$0 < |x - x_0| < \delta \quad\Rightarrow\quad |f(x) + g(x)| \le |f(x)| + |g(x)| < \epsilon/2 + \epsilon/2 = \epsilon,$$

which is what we wanted.

Theorem 4.

If $\lim_{x \to x_0} f(x) = L$ and $\lim_{x \to x_0} g(x) = L'$, then

$$\lim_{x \to x_0} [f(x) + g(x)] = L + L'.$$

Proof. By Theorem 1 we have

$$\lim_{x \to x_0} [f(x) - L] = 0, \qquad \lim_{x \to x_0} [g(x) - L'] = 0.$$

By Theorem 3,

$$\lim_{x \to x_0} [f(x) + g(x) - (L + L')] = 0.$$

By Theorem 2,

$$\lim_{x \to x_0} [f(x) + g(x)] = L + L'.$$

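A function $f$ is locally bounded at a point $x_0$ if there are positive numbers $M$ and $\delta$ such that

$$0 < |x - x_0| < \delta \;\Rightarrow\; |f(x)| < M.$$

Theorem 5. If $f$ approaches a limit, as $x \to x_0$, then $f$ is locally bounded at $x_0$.

Proof. Let

$$\lim_{x \to x_0} f(x) = L.$$

Let $\epsilon$ be any positive number, and let $\delta$ be as in the definition of a limit.

[Figure: the graph of $f$ near $x_0$, lying between the horizontal lines $y = L - \epsilon$ and $y = L + \epsilon$ for $x_0 - \delta < x < x_0 + \delta$.]

Thus

$$0 < |x - x_0| < \delta \;\Rightarrow\; L - \epsilon < f(x) < L + \epsilon.$$

The $\delta$ that we now have is the $\delta$ that we wanted. We let $M$ be the larger of the numbers $|L + \epsilon|$, $|L - \epsilon|$.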

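Theorem 6. If $\lim_{x \to x_0} f(x) = 0$, and $g$ is locally bounded at $x_0$, then

$$\lim_{x \to x_0} [f(x) g(x)] = 0.$$

Proof. Let a positive number $\epsilon$ be given. Take $\delta_1 > 0$ and $M > 0$ such that

$$0 < |x - x_0| < \delta_1 \tag{1}$$

$$\Rightarrow\quad |g(x)| < M. \tag{2}$$

Next, using $\epsilon/M$ in place of $\epsilon$ in the definition of a limit, take $\delta_2 > 0$ such that

$$0 < |x - x_0| < \delta_2 \tag{3}$$

$$\Rightarrow\quad |f(x)| < \frac{\epsilon}{M}. \tag{4}$$

Let $\delta$ be the smaller of $\delta_1$ and $\delta_2$. If

$$0 < |x - x_0| < \delta, \tag{5}$$

then (1) and (3) both hold, and so (2) and (4) both hold. Therefore

$$0 < |x - x_0| < \delta \;\Rightarrow\; |g(x)| < M, \quad |f(x)| < \frac{\epsilon}{M}.$$

Since

$$|f(x) g(x)| = |f(x)|\,|g(x)|,$$

we have

$$0 < |x - x_0| < \delta \;\Rightarrow\; |f(x) g(x)| < \frac{\epsilon}{M} \cdot M = \epsilon.$$

Thus, given an $\epsilon > 0$ we have found a $\delta > 0$ such that

$$0 < |x - x_0| < \delta \;\Rightarrow\; |f(x) g(x)| < \epsilon.$$

This means that

$$\lim_{x \to x_0} [f(x) g(x)] = 0,$$

which was to be proved.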


Theorem 7. If $\lim_{x \to x_0} f(x) = L$ and $\lim_{x \to x_0} g(x) = L'$, then

$$\lim_{x \to x_0} [f(x) g(x)] = LL'.$$

Proof. By Theorem 2, we need to show that

$$\lim_{x \to x_0} [f(x) g(x) - LL'] = 0;$$

Theorem 7 will then follow. Now

$$f(x) g(x) - LL' = f(x) g(x) - L g(x) + L g(x) - LL' = [(f(x) - L) g(x)] + [(g(x) - L') L].$$

In each of these brackets, the first factor approaches 0 as $x \to x_0$, and the second factor is locally bounded at $x_0$. Therefore

$$\lim_{x \to x_0} [(f(x) - L) g(x)] = 0,$$

and

$$\lim_{x \to x_0} [(g(x) - L') L] = 0.$$

Therefore the sum of the two bracketed expressions approaches 0, which was to be proved.
Roughly speaking, a function $f$ is locally bounded away from 0 at $x_0$ if $f(x)$ is not very close to 0 when $x$ is close to $x_0$ and different from $x_0$.

Definition. Suppose that there are numbers $\epsilon > 0$ and $\delta > 0$ such that

$$0 < |x - x_0| < \delta \;\Rightarrow\; |f(x)| > \epsilon.$$

Then $f$ is locally bounded away from 0 at $x_0$.

[Figures: on the left, a graph that stays outside the band $|y| \leq \epsilon$ for $x_0 - \delta < x < x_0 + \delta$; on the right, a graph that is never 0 near $x_0$ but approaches 0 as $x \to x_0$.]

Note that, if $f(x)$ is never 0 when $x \neq x_0$ and $x$ is close to $x_0$, it does not follow that $f$ is locally bounded away from 0 at $x_0$. The situation shown in the figure on the right above can easily occur. Here $f$ is undefined at $x_0$, $f(x) \neq 0$ for $x \neq x_0$, and $\lim_{x \to x_0} f(x) = 0$. In this case $f$ is not locally bounded away from 0 at $x_0$.

Theorem 8. If $\lim_{x \to x_0} f(x) = L$, and $L \neq 0$, then $f$ is locally bounded away from 0 at $x_0$.

Proof. Suppose first that $L > 0$. Let $\epsilon = L/2$, and let $\delta$ be as in the definition of a limit. Then

$$0 < |x - x_0| < \delta \;\Rightarrow\; f(x) > L - \epsilon = \epsilon.$$

Thus

$$0 < |x - x_0| < \delta \;\Rightarrow\; |f(x)| > \epsilon.$$

Suppose now that $L < 0$. Then $-L > 0$, and $\lim_{x \to x_0} [-f(x)] = -L$. Therefore $-f$ is locally bounded away from 0 at $x_0$. Therefore so also is $f$. (Look again at the definition: it is a statement about $|f|$.)

Theorem 9. If $f$ is locally bounded away from 0 at $x_0$, then $1/f$ is locally bounded at $x_0$.

Proof. We know that there are positive numbers $\epsilon$ and $\delta$ such that

$$0 < |x - x_0| < \delta \;\Rightarrow\; |f(x)| > \epsilon.$$

Therefore

$$0 < |x - x_0| < \delta \;\Rightarrow\; \left|\frac{1}{f(x)}\right| < \frac{1}{\epsilon}.$$

Therefore $1/f$ is locally bounded at $x_0$; we use the $\delta$ that was given for $f$, and we use the bound $M = 1/\epsilon$.

Theorem 10. If $\lim_{x \to x_0} f(x) = L$, and $L \neq 0$, then

$$\lim_{x \to x_0} \frac{1}{f(x)} = \frac{1}{L}.$$

Proof. We need to show that

$$\lim_{x \to x_0} \left[\frac{1}{f(x)} - \frac{1}{L}\right] = 0.$$

Now

$$\frac{1}{f(x)} - \frac{1}{L} = \frac{L - f(x)}{L f(x)} = [L - f(x)] \cdot \frac{1}{L f(x)}.$$

The bracket on the left approaches 0. The fraction on the right is locally bounded at $x_0$. Therefore the product approaches 0.

Theorem 11. If $\lim_{x \to x_0} f(x) = L$, $\lim_{x \to x_0} g(x) = L'$, and $L' \neq 0$, then

$$\lim_{x \to x_0} \frac{f(x)}{g(x)} = \frac{L}{L'}.$$

Proof by Theorems 7 and 10.

Appendix C

Algebraic Operations with Limits of Sequences

Nearly everything in this appendix is analogous to something in the preceding one. The analogy begins with the definition of the limit of a sequence.

Definition. Given a sequence $a_1, a_2, \ldots$ and a number $L$. Suppose that for every $\epsilon > 0$ there is an integer $N$ such that

$$n > N \;\Rightarrow\; |a_n - L| < \epsilon.$$

Then

$$\lim_{n \to \infty} a_n = L.$$

The following theorems are modeled on the theorems of Appendix B. We therefore give only a few of the proofs; you ought to be able to supply the rest of the proofs yourself.
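The definition can be tried out numerically. The following is only a sketch under stated assumptions: the sequence $a_n = (n+1)/n$ with $L = 1$ and the helper name `find_N` are illustrative choices, not taken from the text. Since $|a_n - 1| = 1/n$, any integer $N \geq 1/\epsilon$ works, and the loop spot-checks the implication on a finite range of $n$ (which of course is not a proof).

```python
import math

def find_N(epsilon: float) -> int:
    """Return an N that works for a_n = (n + 1) / n and L = 1.

    Since |a_n - 1| = 1/n, any integer N >= 1/epsilon will do.
    (Hypothetical helper, chosen for this illustration only.)
    """
    return math.ceil(1.0 / epsilon)

def a(n: int) -> float:
    """The illustrative sequence a_n = (n + 1) / n."""
    return (n + 1) / n

for eps in (0.1, 0.01, 0.001):
    N = find_N(eps)
    # Spot-check the implication n > N  =>  |a_n - L| < eps on a finite range.
    ok = all(abs(a(n) - 1.0) < eps for n in range(N + 1, N + 1000))
    print(f"epsilon = {eps}: N = {N}, implication holds on sample: {ok}")
```

The point of the sketch is the order of the quantifiers: $\epsilon$ is given first, and $N$ is then chosen in terms of it.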

Theorem 1. If $\lim_{n \to \infty} a_n = L$, then $\lim_{n \to \infty} (a_n - L) = 0$.

Proof. Let $\epsilon > 0$ be given. There is an $N$ such that

$$n > N \;\Rightarrow\; |a_n - L| < \epsilon.$$

Therefore

$$n > N \;\Rightarrow\; |(a_n - L) - 0| < \epsilon.$$

Since for every $\epsilon > 0$ there is such an $N$, it follows that $\lim_{n \to \infty} (a_n - L) = 0$.

Note that statements of the form $n > N$ are playing exactly the same part as statements of the form $0 < |x - x_0| < \delta$.

Theorem 2. If $\lim_{n \to \infty} (a_n - L) = 0$, then $\lim_{n \to \infty} a_n = L$.

Theorem 3. If $\lim_{n \to \infty} a_n = 0$ and $\lim_{n \to \infty} b_n = 0$, then

$$\lim_{n \to \infty} (a_n + b_n) = 0.$$

Theorem 4. If $\lim_{n \to \infty} a_n = L$ and $\lim_{n \to \infty} b_n = L'$, then

$$\lim_{n \to \infty} (a_n + b_n) = L + L'.$$

Definition. A sequence $a_1, a_2, \ldots$ is bounded if there is a number $M$ such that $|a_n| \leq M$ for every $n$.

(For sequences, the question of local boundedness does not arise.)

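Theorem 5. If $\lim_{n \to \infty} a_n = L$, then $a_1, a_2, \ldots$ is bounded.

Proof. Take any $\epsilon > 0$. There is an $N$ such that

$$n > N \;\Rightarrow\; L - \epsilon < a_n < L + \epsilon.$$

Therefore the sequence $a_{N+1}, a_{N+2}, \ldots$ is bounded. (The larger of the numbers $|L - \epsilon|$, $|L + \epsilon|$ is a bound.) And the finite sequence $a_1, a_2, \ldots, a_N$ is bounded. We now get a bound $M$ for the entire sequence $a_1, a_2, \ldots$: let $M$ be the largest of the numbers

$$|L - \epsilon|, \quad |L + \epsilon|, \quad |a_1|, \quad |a_2|, \quad \ldots, \quad |a_N|.$$

Theorem 6. If $\lim_{n \to \infty} a_n = 0$, and $b_1, b_2, \ldots$ is bounded, then

$$\lim_{n \to \infty} a_n b_n = 0.$$

Theorem 7. If $\lim_{n \to \infty} a_n = L$ and $\lim_{n \to \infty} b_n = L'$, then

$$\lim_{n \to \infty} a_n b_n = LL'.$$

Definition. Suppose that there are numbers $\epsilon > 0$ and $N$ such that

$$n > N \;\Rightarrow\; |a_n| > \epsilon.$$

Then $a_1, a_2, \ldots$ is bounded away from 0.

Theorem 8. If $\lim_{n \to \infty} a_n = L \neq 0$, then $a_1, a_2, \ldots$ is bounded away from 0.

Theorem 9. If $a_1, a_2, \ldots$ is bounded away from 0, and $a_n \neq 0$ for each $n$, then the sequence $1/a_1, 1/a_2, \ldots$ is bounded.

Theorem 10. If $\lim_{n \to \infty} a_n = L$, and $L \neq 0$, and $a_n \neq 0$ for each $n$, then

$$\lim_{n \to \infty} \frac{1}{a_n} = \frac{1}{L}.$$

Note that we must require that $a_n \neq 0$ for each $n$; otherwise the sequence of reciprocals is not defined.

Theorem 11. If $\lim_{n \to \infty} a_n = L$, $\lim_{n \to \infty} b_n = L' \neq 0$, and $b_n \neq 0$ for every $n$, then

$$\lim_{n \to \infty} \frac{a_n}{b_n} = \frac{L}{L'}.$$

Theorem 12 (The squeeze principle). If $\lim_{n \to \infty} a_n = L$, $\lim_{n \to \infty} c_n = L$, and $a_n \leq b_n \leq c_n$ for every $n$, then $\lim_{n \to \infty} b_n = L$.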

Appendix D

The Error in the Approximation $\Delta f \approx df$

At the beginning of Section 4.4, we gave numerical examples of the use of the approximation $\Delta f \approx df$, and we found that when we checked our approximate answers against the exact answers, the approximations looked good. But numerical approximation methods are important precisely in those cases where their accuracy cannot be checked in this way: if you can find the exact answer, then you use it; you don't get an inexact answer to compare with it. This brings up the problem of setting a limit on the error that results when you use $df$ in place of $\Delta f$. The solution of this problem is as follows. We have

$$\Delta f = f(x_0 + \Delta x) - f(x_0), \qquad df = f'(x_0)\,\Delta x.$$

[Figures: the graph of $f$ over the interval from $x_0$ to $x_0 + \Delta x$.]

Applying the mean-value theorem (MVT) to the function $f$, on the interval from $x_0$ to $x_0 + \Delta x$, we conclude that

$$\frac{f(x_0 + \Delta x) - f(x_0)}{\Delta x} = f'(\bar{x}),$$

for some $\bar{x}$ between $x_0$ and $x_0 + \Delta x$. (For $x_0 < x_0 + \Delta x$, as in the figure, this result follows. You should check that it also follows when $x_0 + \Delta x < x_0$. Hereafter we shall assume that $\Delta x > 0$. The case $\Delta x < 0$ needs to be checked separately.) Therefore

$$\Delta f = f'(\bar{x})\,\Delta x, \qquad x_0 < \bar{x} < x_0 + \Delta x,$$

and

$$\Delta f - df = f'(\bar{x})\,\Delta x - f'(x_0)\,\Delta x = [f'(\bar{x}) - f'(x_0)]\,\Delta x.$$

We now apply MVT to the function $f'$, on the interval $[x_0, \bar{x}]$. MVT tells us that

$$\frac{f'(\bar{x}) - f'(x_0)}{\bar{x} - x_0} = f''(x'),$$

for some $x'$ between $x_0$ and $\bar{x}$. Therefore

$$f'(\bar{x}) - f'(x_0) = f''(x')(\bar{x} - x_0),$$

and

$$\Delta f - df = f''(x')(\bar{x} - x_0)\,\Delta x.$$

Now

$$|\bar{x} - x_0| \leq |\Delta x|.$$

Therefore

$$|\Delta f - df| \leq |f''(x')|\,\Delta x^2,$$

where $x'$ is between $x_0$ and $x_0 + \Delta x$.

It often happens that we can find a bound $M$ for the numbers $|f''(x)|$, for $x$ between $x_0$ and $x_0 + \Delta x$. If so, we can conclude that

$$|\Delta f - df| \leq M\,\Delta x^2.$$

For example, if $f(x) = \sin x$, then $f''(x) = -\sin x$, and $|f''(x)| \leq 1$, no matter what $x$ may be. Before giving further examples, let us write down the theorem that we have proved:

Theorem 1. Suppose that $f$ has a second derivative $f''$, and that $|f''(x)| \leq M$, for every $x$ between $x_0$ and $x_0 + \Delta x$. Then

$$|\Delta f - df| \leq M\,\Delta x^2.$$

Let us see how this applies to Example 1 of Section 4.4. Here we had

$$f(x) = \sqrt{x}, \qquad x_0 = 25, \qquad \Delta x = 0.4.$$

Now

$$f(x) = x^{1/2}, \qquad f'(x) = \tfrac{1}{2} x^{-1/2}, \qquad f''(x) = -\tfrac{1}{4} x^{-3/2} = \frac{-1}{4\sqrt{x^3}}.$$

We want to find a bound for this function, on the interval $[25, 25.4]$. As $x$ increases, so also does $x^3$, and therefore so also does $4\sqrt{x^3}$. Hence the maximum value of $|f''(x)|$ on the interval $[25, \infty)$ is $|f''(25)|$. Now

$$|f''(25)| = \frac{1}{4\sqrt{25^3}} = \frac{1}{500}.$$

We can therefore take $M = \frac{1}{500}$. This gives

$$|\Delta f - df| \leq \tfrac{1}{500}\,\Delta x^2 = \tfrac{1}{500}(0.4)^2 = 0.00032.$$

In fact, we had

$$\Delta f = 0.039841, \qquad df = 0.04, \qquad |\Delta f - df| = 0.000159.$$

Thus the error was considerably smaller than Theorem 1 predicted that it had to be.
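The arithmetic of this example is easy to recheck by machine. The sketch below assumes nothing beyond the formulas already stated; the helper name `linear_error` is a hypothetical one introduced only for the illustration. It computes $\Delta f$, $df$, the actual error, and the bound $M\,\Delta x^2$ with $M = 1/500$.

```python
import math

def linear_error(f, fprime, x0, dx):
    """Return (delta_f, df, |delta_f - df|) for the approximation delta_f ~ df."""
    delta_f = f(x0 + dx) - f(x0)
    df = fprime(x0) * dx
    return delta_f, df, abs(delta_f - df)

x0, dx = 25.0, 0.4
delta_f, df, err = linear_error(math.sqrt, lambda x: 0.5 / math.sqrt(x), x0, dx)

# Bound from Theorem 1: M is the maximum of |f''| on [25, 25.4], attained at x = 25.
M = 1.0 / (4.0 * math.sqrt(x0 ** 3))   # equals 1/500
bound = M * dx ** 2                     # equals 0.00032

print(f"delta_f = {delta_f:.6f}, df = {df:.6f}, error = {err:.6f}, bound = {bound:.5f}")
```

The observed error comes out at roughly half of the bound, which matches the remark that the theorem guarantees more room than this example actually uses.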

Appendix E

The Continuity of Composite Functions

In Section 4.5 we stated the following theorem, with only rough indications of proof.

Theorem. The composition of two continuous functions is continuous. That is, if

$$\lim_{x \to x_0} g(x) = g(x_0) = u_0, \tag{1}$$

and

$$\lim_{u \to u_0} f(u) = f(u_0), \tag{2}$$

then

$$\lim_{x \to x_0} f(g(x)) = f(g(x_0)). \tag{3}$$

Proof. Let $\epsilon$ be any positive number. By (2), there is a number $\delta_1 > 0$ such that

$$|u - u_0| < \delta_1 \;\Rightarrow\; |f(u) - f(u_0)| < \epsilon.$$

Now take $\delta_1$ as $\epsilon$, in the definition of hypothesis (1). By (1), there is a $\delta > 0$ such that

$$|x - x_0| < \delta \;\Rightarrow\; |g(x) - g(x_0)| < \delta_1.$$

This is the $\delta$ that we wanted: we have

$$|x - x_0| < \delta \;\Rightarrow\; |g(x) - g(x_0)| < \delta_1 \;\Rightarrow\; |f(g(x)) - f(g(x_0))| < \epsilon,$$

and so (3) holds.


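Limits of composite functions are trickier than one might think. For example, the following "theorem" is false.

Theorem (?). If $\lim_{x \to x_0} g(x) = u_0$, and $\lim_{u \to u_0} f(u) = L$, then $\lim_{x \to x_0} f(g(x)) = L$.

To see that this is false, consider the following example. Let

$$f(x) = \begin{cases} 1 & \text{for } x \neq 1, \\ 2 & \text{for } x = 1. \end{cases}$$

[Figure: the graph of $f$, the line $y = 1$ with a hole at $x = 1$ and an isolated point at $(1, 2)$.]

Let $g$ be the same as $f$, and let $x_0 = u_0 = 1$. Then

$$\lim_{x \to x_0} g(x) = \lim_{x \to 1} g(x) = 1 = u_0,$$

and

$$\lim_{u \to u_0} f(u) = \lim_{u \to 1} f(u) = 1,$$

but it does not follow that

$$\lim_{x \to x_0} f(g(x)) = 1. \tag{?}$$

In fact,

$$x \neq 1 \;\Rightarrow\; g(x) = 1 \;\Rightarrow\; f(g(x)) = 2,$$

and so

$$\lim_{x \to 1} f(g(x)) = 2.$$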
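The counterexample can be watched in action. This is a small sketch, assuming only the definition of $f$ given in the example; the sample points `xs` are an arbitrary choice of inputs approaching 1 from both sides without ever equalling 1.

```python
def f(x: float) -> float:
    """The function from the example: 1 everywhere except f(1) = 2."""
    return 2.0 if x == 1.0 else 1.0

g = f  # g is the same function as f

# Approach x0 = 1 through points x != 1, from both sides.
xs = [1.0 + s * 10.0 ** (-k) for k in range(1, 8) for s in (-1, 1)]

inner = {g(x) for x in xs}        # values taken by g(x) near x0
outer = {f(g(x)) for x in xs}     # values taken by f(g(x)) near x0
print("g(x) takes the values", inner)
print("f(g(x)) takes the values", outer)
```

Every sampled $g(x)$ lands exactly on $u_0 = 1$, the one point where $f$ disagrees with its own limit, which is why the composite settles on 2 rather than on $L = 1$.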

Appendix F

The Error in Simpson's Rule

The results of our calculations in Section 4.8 and in later sections suggest two questions:

1) Why does Simpson's rule give such good approximations?

2) In a particular computation, how can we tell how good the approximation is? That is, how can we determine a bound for the error?

These questions have the following answer. In the theorem below, $f^{(4)}$ denotes the fourth derivative of $f$. Thus $f^{(1)}$ is $f'$, $f^{(2)} = f''$, $f^{(3)} = Df'' = f'''$, and $f^{(4)} = Df^{(3)}$. As usual,

$$y_0 = f(-k), \qquad y_1 = f(0), \qquad y_2 = f(k).$$

Theorem 1. If $f$ has a fourth derivative, on the interval $[-k, k]$, then the error in Simpson's rule is equal to

$$E(k) = -\frac{k^5}{90} f^{(4)}(\bar{x}),$$

where $\bar{x}$ is some number between $-k$ and $k$. That is,

$$\int_{-k}^{k} f(x)\,dx - \frac{k}{3}(y_0 + 4y_1 + y_2) = -\frac{k^5}{90} f^{(4)}(\bar{x}) \qquad (-k < \bar{x} < k).$$

It follows, of course, that if

$$|f^{(4)}(x)| \leq M \qquad (-k < x < k),$$

then

$$|E(k)| \leq \tfrac{1}{90} k^5 M.$$

The latter is the statement which is most convenient to apply. Before proceeding to the proof of the theorem, let us look at an application of it.

Example. Suppose that we want to compute

$$\int_1^2 \frac{dx}{x}$$

to five decimal places. Here

$$f(x) = \frac{1}{x} = x^{-1}, \quad f'(x) = -x^{-2}, \quad f''(x) = 2x^{-3}, \quad f'''(x) = -6x^{-4}, \quad f^{(4)}(x) = 24x^{-5}.$$

Since $f^{(4)}$ decreases as $x$ increases, its maximum value on the interval $[1, 2]$ is $f^{(4)}(1) = 24$. Therefore $|f^{(4)}(x)| \leq 24$ $(1 \leq x \leq 2)$. Therefore, if we cut up the interval $[1, 2]$ into $2n$ parts, each of length $k$, we have

$$|E(k)| \leq \tfrac{1}{90} k^5 \cdot 24.$$

We want

$$|E(k)| < 5 \cdot 10^{-6},$$

for the fifth decimal place in our approximation to be correct. Thus we want to take $k$ such that

$$\tfrac{1}{90} k^5 \cdot 24 < 5 \cdot 10^{-6},$$

or

$$k^5 < \tfrac{90}{24} \cdot 5 \cdot 10^{-6}.$$

Arithmetically, this reduces to

$$k^5 < \tfrac{75}{4} \cdot 10^{-6},$$

which surely holds if $k = 0.05$. Therefore

$$|E(0.05)| < 5 \cdot 10^{-6}.$$

This example was selected for its simplicity. For most functions, the calculation of fourth derivatives is tedious.
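The example can be checked numerically. The following is a sketch, not the book's own computation: the function name `simpson` and the choice to sum the rule over $n = 10$ pairs of subintervals of length $k = 0.05$ are assumptions made for the illustration, and the exact value $\ln 2$ is taken from the standard library for comparison.

```python
import math

def simpson(f, a: float, b: float, n: int) -> float:
    """Composite Simpson's rule on [a, b] with 2n subintervals of length k."""
    k = (b - a) / (2 * n)
    total = 0.0
    for j in range(n):
        x0 = a + 2 * j * k
        # (k/3)(y0 + 4*y1 + y2) on the pair of subintervals [x0, x0 + 2k]
        total += (k / 3.0) * (f(x0) + 4.0 * f(x0 + k) + f(x0 + 2 * k))
    return total

approx = simpson(lambda x: 1.0 / x, 1.0, 2.0, 10)   # k = 0.05
error = abs(approx - math.log(2.0))
print(f"approx = {approx:.7f}, |error| = {error:.2e}")
```

Note that the bound in the text applies to each pair of adjacent subintervals; the code adds up ten such pairs, and the total observed error still falls below $5 \cdot 10^{-6}$.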


We proceed to the proof of Theorem 1. Let $F$ be any function such that

$$F' = f.$$

(How do we know that there is such a function?) Then

$$\int_{-k}^{k} f(x)\,dx = F(k) - F(-k),$$

so that

$$E(k) = \int_{-k}^{k} f(x)\,dx - \frac{k}{3}(y_0 + 4y_1 + y_2) = F(k) - F(-k) - \frac{k}{3}[f(-k) + 4f(0) + f(k)].$$

Therefore

$$E'(k) = f(k) + f(-k) - \frac{k}{3}[-f'(-k) + f'(k)] - \tfrac{1}{3}[f(-k) + 4f(0) + f(k)]$$

$$= \tfrac{2}{3} f(k) + \tfrac{2}{3} f(-k) - \tfrac{4}{3} f(0) + \frac{k}{3} f'(-k) - \frac{k}{3} f'(k),$$

$$E''(k) = \tfrac{2}{3} f'(k) - \tfrac{2}{3} f'(-k) + \tfrac{1}{3} f'(-k) - \frac{k}{3} f''(-k) - \tfrac{1}{3} f'(k) - \frac{k}{3} f''(k)$$

$$= \tfrac{1}{3} f'(k) - \tfrac{1}{3} f'(-k) - \frac{k}{3} f''(k) - \frac{k}{3} f''(-k),$$

$$E'''(k) = \tfrac{1}{3} f''(k) + \tfrac{1}{3} f''(-k) - \tfrac{1}{3} f''(k) - \frac{k}{3} f'''(k) - \tfrac{1}{3} f''(-k) + \frac{k}{3} f'''(-k)$$

$$= -\frac{k}{3}[f'''(k) - f'''(-k)].$$

We need all of these formulas, not just the last. It is easy to check that

$$E(0) = E'(0) = E''(0) = E'''(0) = 0.$$

From now on, $k$ is going to be regarded as constant. For each $t$ on $[-k, k]$, let

$$G(t) = E(t) - \frac{t^5}{k^5} E(k).$$

It is easy to check that

$$G(0) = G'(0) = G''(0) = G'''(0) = 0,$$

and

$$G(k) = 0.$$

On the interval $[0, k]$ we apply the mean-value theorem (MVT) to the function $G$. This gives

$$G'(x_1) = 0, \qquad 0 < x_1 < k.$$

We next apply MVT to the function $G'$, on the interval $[0, x_1]$. This gives

$$G''(x_2) = 0, \qquad 0 < x_2 < x_1.$$

Applying MVT to $G''$, on $[0, x_2]$, we get

$$G'''(x_3) = 0, \qquad 0 < x_3 < x_2.$$

By a straightforward calculation,

$$G'''(t) = E'''(t) - \frac{E(k)}{k^5}(60t^2) = -\frac{t}{3}[f'''(t) - f'''(-t)] - 60\,\frac{E(k)}{k^5}\,t^2.$$

Setting $t = x_3$, and solving for $E(k)$, we get

$$E(k) = -\frac{k^5}{180} \cdot \frac{f'''(x_3) - f'''(-x_3)}{x_3} = -\frac{k^5}{90} \cdot \frac{f'''(x_3) - f'''(-x_3)}{2x_3}.$$

We now apply MVT for the last time. By MVT there is an $\bar{x}$, between $-x_3$ and $x_3$, such that the second fraction on the right is equal to $f^{(4)}(\bar{x})$. This gives

$$E(k) = -\frac{k^5}{90} f^{(4)}(\bar{x}) \qquad (-k < \bar{x} < k),$$

which was to be proved.

Appendix G

The Idea of a Measurable Set

If you reexamine Section 2.10, you will see that at the end of the section we were in a peculiar position: we had gotten an answer for the area under the graph of $y = kx^2$, from $x = a$ to $x = b$, but we were not in a position to prove it, because we had no definition of area. The trouble, however, is easy to remedy. For the sake of simplicity, consider first the case in which $R$ is the region under the graph of $y = x^2$, from $x = 0$ to $x = h$. In Section 2.10, we proved the following two things:

1) There is a sequence $R_1, R_2, \ldots$ of polygonal regions containing $R$, with areas $A_1, A_2, \ldots$, such that

$$\lim_{n \to \infty} A_n = L.$$

(Here $R_n$ was the union of the outer rectangles, $A_n$ was $(h^3/3)(1 + 1/n)(1 + 1/2n)$, and $L$ was $h^3/3$.)

2) There is a sequence $R_1', R_2', \ldots$ of polygonal regions lying in $R$, with areas $A_1', A_2', \ldots$, such that $\lim_{n \to \infty} A_n'$ is the same number $L$.

(Here $R_n'$ was the union of the inner rectangles, $A_n'$ was $(h^3/n^3) \sum_{i=1}^{n} (i - 1)^2$, and $L$ was $h^3/3$, as in condition 1.)

These ideas can be used to give a definition of area, in the following way.

Definition. Let $R$ be a region in the plane. If $R$ satisfies conditions (1) and (2), then $R$ is said to be measurable, and the number $L$ is called its area.

Under this definition, the plane regions discussed in Chapter 2 are measurable, and their areas are the numbers that we computed. The same conclusion follows whenever we compute an area by means of a definite integral. In Section 7.8 we showed that every continuous function is integrable. This gives the following:

Theorem. Let $f$ be continuous and nonnegative on $[a, b]$, and let $R$ be the region under the graph of $f$. Then $R$ is measurable, and the area of $R$ is

$$A = \int_a^b f(x)\,dx.$$

Proof. Take a sequence of nets

$$N_1, N_2, \ldots$$

over $[a, b]$, with $|N_i| \to 0$. For each $i$, let $A_i$ be the upper sum $S(N_i)$ and let $A_i'$ be the lower sum $s(N_i)$. Then $A_i$ is the area of a polygonal region containing $R$, and $A_i'$ is the area of a polygonal region lying in $R$, as in the definition of a measurable set. And

$$A_i' \leq \int_a^b f(x)\,dx \leq A_i.$$

The sequences $A_1, A_2, \ldots$ and $A_1', A_2', \ldots$ have the same limit, namely, the integral. Therefore $R$ is measurable, and its area is the integral.

This theorem can be extended so as to apply to the region between the graphs of two continuous functions.

It might seem that we could simplify the preceding discussion by defining the area to be the integral, in the first place. But this will not work. The point is that some regions can be represented in many different ways as the regions between the graphs of two continuous functions. Different directions for the axes give different limits of integration, and also different integrands, even for so simple a figure as an ellipse. In the theory that we have just developed, we know that all the resulting integrals give the same answer, because they all give the right answer for the area of the region. But if we defined the area to be the integral, we would have the problem of showing, by the methods of calculus, that all the integrals have the same value, and this would be hard.

Appendix H

Proof of the Northeast Theorem

The Northeast theorem asserts that if

$$\lim_{t \to \infty} f(t) = \lim_{t \to \infty} g(t) = \infty, \tag{1}$$

and

$$\lim_{t \to \infty} \frac{g'(t)}{f'(t)} = L, \tag{2}$$

then

$$\lim_{t \to \infty} \frac{g(t)}{f(t)} = L.$$

To start the proof, we first observe that since $g'(t)/f'(t) \to L$ as $t \to \infty$, we must have $f'(t) \neq 0$ when $t$ is sufficiently large, say, for $t \geq t_0$. Since we are taking the limit as $t \to \infty$, we may regard $t_0$ as the initial point of the path. Since $g'(t)/f'(t) \to L$, the function $g'/f'$ must be bounded on some interval $[t_1, \infty)$ $(t_1 \geq t_0)$. The reason is that for every $\epsilon > 0$, we have

$$L - \epsilon < \frac{g'(t)}{f'(t)} < L + \epsilon$$

for $t$ greater than or equal to a certain $t_1$. Therefore $g'/f'$ is bounded on the interval $[t_1, \infty)$. We now take $t_1$ as the initial point of the path.

As a further simplification, we translate the point $(f(t_1), g(t_1))$ to the origin, replacing $f(t)$ and $g(t)$ by $F(t) = f(t) - f(t_1)$, $G(t) = g(t) - g(t_1)$, where $t \geq t_1$. Obviously $G'/F'$ is bounded, and

$$\lim_{t \to \infty} \frac{G'(t)}{F'(t)} = L,$$

because $F' = f'$ and $G' = g'$. And if we can prove that

$$\lim_{t \to \infty} \frac{G(t)}{F(t)} = L,$$

then it will follow immediately that

$$\lim_{t \to \infty} \frac{g(t)}{f(t)} = L.$$

The reason is that

$$\frac{g(t)}{f(t)} = \frac{G(t) + g(t_1)}{F(t) + f(t_1)} = \frac{G(t)/F(t) + g(t_1)/F(t)}{1 + f(t_1)/F(t)}.$$

Therefore it will be sufficient to prove the theorem in the following special form.

Theorem A. Let $F$ and $G$ be differentiable functions on the interval $[t_1, \infty)$, such that

$$\lim_{t \to \infty} F(t) = \lim_{t \to \infty} G(t) = \infty, \tag{3}$$

$$\lim_{t \to \infty} \frac{G'(t)}{F'(t)} = L, \tag{4}$$

$$F'(t) \neq 0 \quad \text{for } t \geq t_1, \tag{5}$$

$$G'/F' \text{ is bounded,} \tag{6}$$

and

$$G(t_1) = F(t_1) = 0. \tag{7}$$

Then

$$\lim_{t \to \infty} \frac{G(t)}{F(t)} = L. \tag{8}$$

We now make a final simplification: under the conditions of Theorem A, the locus of the path must be the graph of a function $\phi$, defined on the interval $[0, \infty)$, with

$$\phi(0) = 0, \qquad \text{and} \qquad \phi'(x) = \frac{G'(t)}{F'(t)} \quad (x = F(t)).$$

[Figure: the graph of $y = \phi(x)$, with $x = F(t)$ and slope $\phi'(x) = G'(t)/F'(t)$.]

Evidently

$$\frac{G(t)}{F(t)} = \frac{\phi(x)}{x} \qquad (x = F(t)).$$

Therefore, rewriting Theorem A in terms of $\phi$, we get the following:

Theorem B (The Northeast theorem, rectangular form). Let $\phi$ be a function on the interval $[0, \infty)$ such that

$$\phi(0) = 0, \tag{9}$$

$$\phi' \text{ is bounded,} \tag{10}$$

$$\lim_{x \to \infty} \phi(x) = \infty, \tag{11}$$

and

$$\lim_{x \to \infty} \phi'(x) = L. \tag{12}$$

Then

$$\lim_{x \to \infty} \frac{\phi(x)}{x} = L. \tag{13}$$

The proof is in several steps.

Step 1.

$$\lim_{x \to \infty} \frac{\phi(x^2) - \phi(x)}{x^2 - x} = L.$$

Proof. By the mean-value theorem (MVT), for each $x$ there is an $\bar{x}$, between $x$ and $x^2$, such that

$$\frac{\phi(x^2) - \phi(x)}{x^2 - x} = \phi'(\bar{x}).$$

As $x \to \infty$, $\bar{x} \to \infty$, and so $\phi'(\bar{x}) \to L$. Therefore the fraction on the left also $\to L$.

Step 2.

$$\lim_{x \to \infty} \left[\frac{\phi(x^2) - \phi(x)}{x^2 - x} - \frac{\phi(x^2)}{x^2}\right] = 0.$$

Proof.

$$\frac{\phi(x^2) - \phi(x)}{x^2 - x} - \frac{\phi(x^2)}{x^2} = \frac{x\phi(x^2) - x\phi(x) - x\phi(x^2) + \phi(x^2)}{x^2(x - 1)} = -\frac{1}{x - 1}\left[\frac{\phi(x)}{x} - \frac{\phi(x^2)}{x^2}\right].$$

By MVT, there is an $\bar{x}$ between 0 and $x$ such that

$$\frac{\phi(x)}{x} = \phi'(\bar{x}).$$

Since $\phi'$ is bounded, it follows that $\phi(x)/x$ is bounded. Therefore $\phi(x^2)/x^2$ is bounded. Since $-1/(x - 1) \to 0$, it follows that

$$-\frac{1}{x - 1}\left[\frac{\phi(x)}{x} - \frac{\phi(x^2)}{x^2}\right] \to 0.$$

Step 3.

$$\lim_{x \to \infty} \frac{\phi(x^2)}{x^2} = L,$$

by Steps 1 and 2. From Step 3 it follows immediately that $\phi(x)/x \to L$, which was to be proved.

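Appendix I

Proof of the Formula for Path Length

Here we complete the proof of Theorem 1 of Section 9.6, which asserts that the length of a path is given by the formula

$$s = \int_a^b \sqrt{f'(t)^2 + g'(t)^2}\,dt,$$

where $f$ and $g$ are the coordinate functions and $f'$ and $g'$ are continuous. The notation is that of Section 9.6. By definition,

$$s = \lim_{|N| \to 0} \sum_{i=1}^{n} P_{i-1}P_i;$$

we know that

$$\sum_{i=1}^{n} P_{i-1}P_i = \sum_{i=1}^{n} \sqrt{f'(\bar{t}_i)^2 + g'(\bar{t}_i')^2}\,\Delta t_i,$$

where $\bar{t}_i$ and $\bar{t}_i'$ are between $t_{i-1}$ and $t_i$; and

$$\lim_{|N| \to 0} \sum_{i=1}^{n} \sqrt{f'(\bar{t}_i)^2 + g'(\bar{t}_i)^2}\,\Delta t_i = \int_a^b \sqrt{f'(t)^2 + g'(t)^2}\,dt,$$

by definition of the integral. Therefore what remains to be proved is that

$$\lim_{|N| \to 0} \left[\sum_{i=1}^{n} \sqrt{f'(\bar{t}_i)^2 + g'(\bar{t}_i')^2}\,\Delta t_i - \sum_{i=1}^{n} \sqrt{f'(\bar{t}_i)^2 + g'(\bar{t}_i)^2}\,\Delta t_i\right] = 0.$$

In terms of the definition of $\lim_{|N| \to 0}$, this can be restated as follows:

Lemma. For every $\epsilon > 0$ there is a $\delta > 0$ such that

$$|N| < \delta \;\Rightarrow\; \left|\sum_{i=1}^{n} \left[\sqrt{f'(\bar{t}_i)^2 + g'(\bar{t}_i')^2} - \sqrt{f'(\bar{t}_i)^2 + g'(\bar{t}_i)^2}\right]\Delta t_i\right| < \epsilon.$$

Proof. 1) Since $f'$ and $g'$ are continuous on $[a, b]$, so also is the function $f'^2 + g'^2$. Let $M$ be such that

$$f'(t)^2 + g'(t)^2 \leq M \quad \text{for } a \leq t \leq b.$$

2) The function $z \mapsto \sqrt{z}$ is continuous on the interval $[0, M]$. Therefore this function is uniformly continuous on $[0, M]$ (Section 7.8, Theorem 4). Therefore, given any $\epsilon > 0$, there is an $\eta > 0$ such that

$$|z - z'| < \eta \;\Rightarrow\; |\sqrt{z} - \sqrt{z'}| < \frac{\epsilon}{b - a}.$$

3) The function $g'^2$ is continuous, and therefore uniformly continuous, on $[a, b]$. Therefore, given $\eta > 0$ there is a $\delta > 0$ such that

$$|t' - t| < \delta \;\Rightarrow\; |g'(t')^2 - g'(t)^2| < \eta.$$

4) The $\delta$ given by (3) is the $\delta$ that we need:

$$|N| < \delta \;\Rightarrow\; |\bar{t}_i' - \bar{t}_i| < \delta \quad \text{(for every } i\text{)}$$

$$\Rightarrow\; |g'(\bar{t}_i')^2 - g'(\bar{t}_i)^2| < \eta \quad \text{(for every } i\text{)}$$

$$\Rightarrow\; \left|\sqrt{f'(\bar{t}_i)^2 + g'(\bar{t}_i')^2} - \sqrt{f'(\bar{t}_i)^2 + g'(\bar{t}_i)^2}\right| < \frac{\epsilon}{b - a}$$

$$\Rightarrow\; \left|\left[\sqrt{f'(\bar{t}_i)^2 + g'(\bar{t}_i')^2} - \sqrt{f'(\bar{t}_i)^2 + g'(\bar{t}_i)^2}\right]\Delta t_i\right| < \frac{\epsilon}{b - a}\,\Delta t_i$$

$$\Rightarrow\; \sum_{i=1}^{n} \left|\left[\sqrt{f'(\bar{t}_i)^2 + g'(\bar{t}_i')^2} - \sqrt{f'(\bar{t}_i)^2 + g'(\bar{t}_i)^2}\right]\Delta t_i\right| < \frac{\epsilon}{b - a} \sum_{i=1}^{n} \Delta t_i = \frac{\epsilon}{b - a}(b - a) = \epsilon.$$

Since the absolute value of the sum is less than or equal to the sum of the absolute values, it follows that $\delta$ satisfies the conditions of the lemma.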

Appendix J

A Method for Constructing the Complex Numbers

In Section 10.11 the complex numbers were presented as a formal system of symbols $a + bi$, with $i^2 = -1$. We shall now define a mathematical system of this kind, and show that it has the properties that we want. There are various ways to do this. The following method has the advantage of copying the pattern of the manipulative processes that we would be using anyway. It has the further advantage of introducing ideas that will be useful later, in modern algebra.

Let $P(x)$ be the set of all polynomials $p(x) = \sum_{i=0}^{n} a_i x^i$. In $P(x)$ we can add and multiply. We know that in $P(x)$, these operations obey the CAD laws, that is, they are commutative, associative, and distributive:

$$pq = qp, \qquad p + q = q + p,$$

$$p(qr) = (pq)r, \qquad p + (q + r) = (p + q) + r,$$

$$p(q + r) = pq + pr.$$

These follow immediately from the corresponding laws for the real numbers $p(x)$ which are the values of our polynomial functions.

Two polynomials $p$, $q$ will be called congruent modulo $1 + x^2$ if their difference is a multiple of $1 + x^2$. We then write

$$p(x) \equiv q(x) \mod 1 + x^2,$$

or briefly $p \equiv q$. Thus

$$p(x) \equiv q(x) \quad \text{if} \quad p(x) - q(x) = r(x)(1 + x^2),$$

for some polynomial $r(x)$. For example, $x^3 \equiv -x$, because $x^3 - (-x) = x^3 + x = x(1 + x^2)$; and $x^2 \equiv -1$, because $x^2 - (-1) = x^2 + 1 = 1 \cdot (1 + x^2)$.

In the following theorem, the primes do not indicate differentiation; $p$, $q$, $p'$, and $q'$ are supposed to be any polynomials.

Theorem 1. If $p \equiv p'$ and $q \equiv q'$, then $p + q \equiv p' + q'$, and $pq \equiv p'q'$.

Proof. This is a straightforward calculation. Given

$$p' = p + r(x^2 + 1), \qquad q' = q + s(x^2 + 1),$$

we get

$$p' + q' = p + q + (r + s)(x^2 + 1),$$

so that $p' + q' \equiv p + q$. And

$$p'q' = pq + (ps + qr)(x^2 + 1) + rs(x^2 + 1)^2 = pq + (ps + qr + rsx^2 + rs)(x^2 + 1),$$

so that $p'q' \equiv pq$.


For each $p = p(x)$ in $P(x)$, let $\bar{p}$ be the set of all polynomials that are congruent to $p$. That is,

$$\bar{p} = \{p' \mid p' \equiv p\}.$$

These sets are called congruence classes. Let $C$ be the set of all such congruence classes $\bar{p}$. In $C$ we define addition and multiplication by the conditions

$$\bar{p} + \bar{q} = \overline{p + q}, \qquad \text{and} \qquad \bar{p} \cdot \bar{q} = \overline{pq}.$$

These definitions make sense, because the congruence class that contains $p + q$ does not depend on the choice of $p$ and $q$; it depends only on the congruence classes $\bar{p}$ and $\bar{q}$. The same applies for the product.

Theorem 2. The CAD laws hold in $C$.

Proof. We know that these laws hold in $P(x)$. Therefore, under our definitions of addition and multiplication in $C$, we have

$$\bar{p} \cdot \bar{q} = \overline{pq} = \overline{qp} = \bar{q} \cdot \bar{p};$$

$$\bar{p} + \bar{q} = \overline{p + q} = \overline{q + p} = \bar{q} + \bar{p};$$

$$\bar{p}(\bar{q} + \bar{r}) = \bar{p} \cdot \overline{q + r} = \overline{p(q + r)} = \overline{pq + pr} = \overline{pq} + \overline{pr} = \bar{p}\bar{q} + \bar{p}\bar{r};$$

and similarly for the other laws.

We now observe that

$$x^{2n} = (x^2)^n \equiv (-1)^n$$

and

$$x^{2n+1} = (x^2)^n x \equiv (-1)^n x.$$

Therefore every power of $x$ is congruent to a linear polynomial. Therefore:

Theorem 3. Every polynomial $p(x)$ is congruent to a linear polynomial $a + bx$.

For example,

$$p(x) = 7x^7 - 5x^3 + 6x^2 - 3 = 7(x^2)^3 \cdot x - 5x^2 \cdot x + 6x^2 - 3 \equiv -7x + 5x - 6 - 3 = -2x - 9.$$
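The reduction in this example is mechanical enough to sketch in code. The function name `reduce_mod_x2_plus_1` and the coefficient-list representation are assumptions made for the illustration; the rule applied is just that even powers of $x$ are congruent to $\pm 1$ and odd powers to $\pm x$, with the sign alternating every two degrees.

```python
def reduce_mod_x2_plus_1(coeffs):
    """Reduce a polynomial modulo 1 + x^2.

    coeffs[j] is the coefficient of x^j.  Returns (a, b) with
    p(x) congruent to a + b*x, using x^(2m) = (-1)^m and
    x^(2m+1) = (-1)^m * x modulo 1 + x^2.
    """
    a = b = 0
    for j, c in enumerate(coeffs):
        sign = -1 if (j // 2) % 2 else 1
        if j % 2 == 0:
            a += sign * c      # even power contributes to the constant term
        else:
            b += sign * c      # odd power contributes to the x term
    return a, b

# p(x) = 7x^7 - 5x^3 + 6x^2 - 3, listed from the constant term upward.
p = [-3, 0, 6, -5, 0, 0, 0, 7]
a, b = reduce_mod_x2_plus_1(p)
print(f"p(x) is congruent to {a} + ({b})x")
```

As a cross-check, evaluating the same polynomial at Python's built-in complex unit `1j` should give the complex number with real part `a` and imaginary part `b`, which is the identification the text goes on to make.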
In fact, the system $C$ that we have just defined has all the properties of the number system that we wanted. To describe it in the familiar notation, we denote each congruence class $\overline{p(x)}$ by the formal expression $p(i)$, in which $x$ is replaced by $i$. Thus, if $p(x)$ is as in the preceding example, we have

$$\overline{p(x)} = p(i) = 7i^7 - 5i^3 + 6i^2 - 3 = 7(i^2)^3 \cdot i - 5i^2 \cdot i + 6i^2 - 3 = -7i + 5i - 6 - 3 = -9 - 2i.$$

Here we have simplified by substituting $-1$ for $i^2$, and this is right; since

$$x^2 \equiv -1,$$

we have

$$\overline{x^2} = i^2 = -1;$$

any congruence between two polynomials $p(x)$ and $q(x)$ gives an equation between their congruence classes $p(i)$ and $q(i)$. And our number system satisfies the conditions for a field, given in Chapter 1: the CAD laws hold; there are numbers 0 and 1, such that

$$0 + z = z \qquad \text{and} \qquad 1 \cdot z = z;$$

and for each $z$ there is a number $-z$ such that

$$z + (-z) = 0.$$

(If $z = a + bi$, then $-z = -a - bi$.)

Finally, every $z \neq 0$ has a reciprocal. To prove this, we first observe that

$$a + bi = 0 \;\Rightarrow\; a = b = 0.$$

The reason is that

$$a + bi = 0 \;\Leftrightarrow\; a + bx \equiv 0 \mod 1 + x^2 \;\Leftrightarrow\; a + bx = r(x)(1 + x^2).$$

Here $r(x)$ must be 0, because otherwise $r(x)(1 + x^2)$ would be of degree $\geq 2$. Since $r = 0$, $a + bx$ is the zero polynomial, and so $a = b = 0$. Similarly,

$$a - bi = 0 \;\Rightarrow\; a = b = 0,$$

and so

$$a + bi \neq 0 \;\Rightarrow\; a - bi \neq 0 \quad \text{and} \quad a^2 + b^2 > 0.$$

We shall now find a reciprocal for any $a + bi \neq 0$, by assuming, experimentally, that fractions with complex denominators make sense, and then checking that our answer works:

$$\frac{1}{a + bi} = \frac{1}{a + bi} \cdot \frac{a - bi}{a - bi} = \frac{a - bi}{a^2 + b^2} = \frac{a}{a^2 + b^2} - \frac{bi}{a^2 + b^2}.$$

The last expression really is the reciprocal of $a + bi$, because

$$(a + bi)\left[\frac{a}{a^2 + b^2} - \frac{bi}{a^2 + b^2}\right] = \frac{1}{a^2 + b^2}(a + bi)(a - bi) = \frac{a^2 + b^2}{a^2 + b^2} = 1.$$

To sum up:

Theorem 4. $C$ is a field.

Note that when we passed from $P(x)$ to $C$, by forming congruence classes modulo $1 + x^2$, the algebraic character of the system changed: in $P(x)$, only the constant polynomials $p(x) = a \neq 0$ have reciprocals; but every congruence class $p(i) = \overline{p(x)} \neq 0$ has a reciprocal.

To set up this number system, and check its properties, we need to use equivalence classes of polynomials (or some equivalent device). But now that we have such a system, we need not remember where we got it. In the future, we shall never have occasion to refer to the fact that $a + bi$ was defined to be

$$\{p(x) \mid p(x) \equiv a + bx\}.$$

APPENDIX K

Iterated Limits. Mixed Partial Derivatives

In discussing double limits of the type

$$\lim_{(x,y) \to (x_0,y_0)} f(x, y),$$

for functions of two variables, we assume that $f$ is defined in a neighborhood of $(x_0, y_0)$, except, perhaps, at the point $(x_0, y_0)$ itself. Under the same conditions, we can discuss limits of the type

$$\lim_{x \to x_0} \lim_{y \to y_0} f(x, y) \qquad \text{and} \qquad \lim_{y \to y_0} \lim_{x \to x_0} f(x, y).$$

These are called iterated limits.

There are simple examples in which the iterated limits both exist but have different values. That is, the order in which we take the limits may make a difference. Consider

$$f(x, y) = \frac{x^2 - y^2}{x^2 + y^2}.$$

Here

$$\lim_{x \to 0} \lim_{y \to 0} \frac{x^2 - y^2}{x^2 + y^2} = \lim_{x \to 0} \frac{x^2}{x^2} = 1,$$

and

$$\lim_{y \to 0} \lim_{x \to 0} \frac{x^2 - y^2}{x^2 + y^2} = \lim_{y \to 0} \frac{-y^2}{y^2} = -1.$$
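A few lines of code make the disagreement visible. This is only a numerical sketch: it uses the fact, evident from the formula, that for fixed $x \neq 0$ the inner limit as $y \to 0$ is $f(x, 0)$, and similarly with the roles exchanged. The sample step sizes and the extra look along the diagonal $y = x$ are choices made for the illustration.

```python
def f(x, y):
    """f(x, y) = (x^2 - y^2) / (x^2 + y^2), defined for (x, y) != (0, 0)."""
    return (x * x - y * y) / (x * x + y * y)

steps = [10.0 ** (-k) for k in range(1, 8)]

# For fixed x != 0 the inner limit as y -> 0 is f(x, 0), by continuity in y.
x_outer = [f(x, 0.0) for x in steps]   # then let x -> 0
# For fixed y != 0 the inner limit as x -> 0 is f(0, y).
y_outer = [f(0.0, y) for y in steps]   # then let y -> 0
# Along the diagonal y = x the function is identically 0.
diagonal = [f(t, t) for t in steps]

print("y first, then x:", x_outer[-1])
print("x first, then y:", y_outer[-1])
print("along y = x:    ", diagonal[-1])
```

Three different approaches to the origin give three different values, so the double limit cannot exist here.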

This sort of thing cannot happen, however, if $f$ is continuous in $D$ and the double limit exists. That is, we have the following theorem:

Theorem 1. If $f$ is continuous in $D$, and

$$\lim_{(x,y) \to (x_0,y_0)} f(x, y) = L,$$

then

$$\lim_{x \to x_0} \lim_{y \to y_0} f(x, y) = \lim_{y \to y_0} \lim_{x \to x_0} f(x, y) = L.$$

Proof. If $f(x_0, y_0)$ is defined at all, then $f(x_0, y_0)$ must be $L$. If $f(x_0, y_0)$ is not defined, we define it to be $L$. Thus we may assume that $f$ is continuous in a neighborhood of $(x_0, y_0)$, including $(x_0, y_0)$.

The proof is now trivial: For each $x$, we have

$$\lim_{y \to y_0} f(x, y) = f(x, y_0), \tag{1}$$

because $f$ is continuous. Therefore, for the same reason we have

$$\lim_{x \to x_0} \lim_{y \to y_0} f(x, y) = \lim_{x \to x_0} f(x, y_0) = f(x_0, y_0). \tag{2}$$

Similarly,

$$\lim_{y \to y_0} \lim_{x \to x_0} f(x, y) = \lim_{y \to y_0} f(x_0, y) = f(x_0, y_0). \tag{3}$$

In (1), all we are saying is that if $f$ is continuous (as a function of two variables) then the slice functions, for each fixed $x$, are also continuous. This may be easier to keep straight if we rewrite (1) in the form

$$\lim_{y \to y_0} f(a, y) = f(a, y_0), \tag{1'}$$

which reminds us that $x$ is fixed as $y \to y_0$. Equations (2) and (3) follow by repeated applications of the same principle.

Since in some cases the two iterated limits are different, we always have to investigate, in the cases where we need to know that they are the same. One such case comes up when we consider the "mixed partial derivatives" $f_{xy}$ and $f_{yx}$. If we write in full the definitions of $f_{xy}(x_0, y_0)$ and $f_{yx}(x_0, y_0)$, we see that they are iterated limits of the same function:

$$f_{xy}(x_0, y_0) = \lim_{\Delta y \to 0} \frac{f_x(x_0, y_0 + \Delta y) - f_x(x_0, y_0)}{\Delta y}$$

$$= \lim_{\Delta y \to 0} \frac{1}{\Delta y}\left[\lim_{\Delta x \to 0} \frac{f(x_0 + \Delta x, y_0 + \Delta y) - f(x_0, y_0 + \Delta y)}{\Delta x} - \lim_{\Delta x \to 0} \frac{f(x_0 + \Delta x, y_0) - f(x_0, y_0)}{\Delta x}\right]$$

$$= \lim_{\Delta y \to 0} \lim_{\Delta x \to 0} \frac{1}{\Delta y\,\Delta x}\left\{[f(x_0 + \Delta x, y_0 + \Delta y) - f(x_0 + \Delta x, y_0)] - [f(x_0, y_0 + \Delta y) - f(x_0, y_0)]\right\}.$$

Note that in the last step we have changed the order of two of the terms. The reason for this will soon be clear. Thus we have

$$f_{xy}(x_0, y_0) = \lim_{\Delta y \to 0} \lim_{\Delta x \to 0} \frac{1}{\Delta y\,\Delta x} F(\Delta x, \Delta y),$$

where $F(\Delta x, \Delta y)$ is the function defined by the expression in the braces. An entirely analogous calculation tells us that $f_{yx}(x_0, y_0)$ is the iterated limit, in reverse order, of exactly the same function:

$$f_{yx}(x_0, y_0) = \lim_{\Delta x \to 0} \lim_{\Delta y \to 0} \frac{1}{\Delta y\,\Delta x} F(\Delta x, \Delta y).$$

Iterated Limits.

Let us now investigate the function

F.

Mixed Partial Derivatives

719

This function can be regarded as the

difference of two values of the function

cp(x) =f(x,Yo + y) - f(x, Yo)


That is,

F(x, y) = cp(x0 + x) - cp(x0).

Applying the mean-value theorem MVT to the function <fa, on the interval from

x0 to x0 + x, we get

F(x,y)
where xis between

<f'(x) x = [fx(x,y0 + y) - fx(x,y0)] x,

x0 and x0 + x. Now the quantity in brackets can be regarded

as the difference of two values of the function

ip(y) =f.,(x,y);

we have

F(x, y) = [VJ(Yo + y) - VJ(Yo)J x.

By MVT,

where

VJ(Yo + y) - VJ(Yo) = ip'(y) y,

y is between Yo and y0 + y. This gives


F(x,y) =fxy(x, y) Y x.

Suppose now that fxy is continuous. We then have

Ix - x01 < x and IY - Yol < y, we must have (x,y)


x
( , y) (0, 0).)
[Since

(x0,y0) as

Let us now take stock. We had

1) \[ f_{xy}(x_0, y_0) = \lim_{\Delta y \to 0} \lim_{\Delta x \to 0} \frac{1}{\Delta x \, \Delta y} F(\Delta x, \Delta y) \] (by definition).

If $f_{xy}$ is continuous at $(x_0, y_0)$, then the double limit

2) \[ \lim_{(\Delta x, \Delta y) \to (0, 0)} \frac{1}{\Delta x \, \Delta y} F(\Delta x, \Delta y) \]

exists, and is equal to $f_{xy}(x_0, y_0)$. This is what we have just proved. Suppose now that $f_{yx}$ is also defined in a neighborhood of $(x_0, y_0)$, and is continuous. Then

3) \[ f_{yx}(x_0, y_0) = \lim_{\Delta x \to 0} \lim_{\Delta y \to 0} \frac{1}{\Delta x \, \Delta y} F(\Delta x, \Delta y). \]

Since the double limit in (2) exists, the iterated limit in (3) must be equal to it. Therefore $f_{xy}(x_0, y_0) = f_{yx}(x_0, y_0)$. Thus we have proved the following theorem:

Theorem 2. If $f_{xy}$ and $f_{yx}$ exist and are continuous in a domain $D$, then $f_{xy} = f_{yx}$.
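The role of the function $F$ in this proof can be checked numerically. The following is a minimal sketch, not part of the text: the test function `f` and the point `(0.7, -0.4)` are arbitrary choices made here for illustration, and the mixed partial `f_xy_exact` was computed by hand for that function. The sketch forms the quotient $F(\Delta x, \Delta y)/(\Delta x \, \Delta y)$ and compares it with $f_{xy}$ at the point.

```python
import math


def f(x, y):
    # Hypothetical smooth test function (chosen for illustration only).
    return x**3 * y**2 + math.sin(x * y)


def f_xy_exact(x, y):
    # Mixed partial of the test function above, differentiated by hand:
    # f_x = 3x^2 y^2 + y cos(xy), so f_xy = 6x^2 y + cos(xy) - xy sin(xy).
    return 6 * x**2 * y + math.cos(x * y) - x * y * math.sin(x * y)


def second_difference(func, x0, y0, dx, dy):
    # F(dx, dy) = [f(x0+dx, y0+dy) - f(x0+dx, y0)] - [f(x0, y0+dy) - f(x0, y0)]
    return (func(x0 + dx, y0 + dy) - func(x0 + dx, y0)) - (
        func(x0, y0 + dy) - func(x0, y0)
    )


def mixed_quotient(func, x0, y0, dx, dy):
    # The quantity whose double limit as (dx, dy) -> (0, 0) appears in the proof.
    return second_difference(func, x0, y0, dx, dy) / (dx * dy)


if __name__ == "__main__":
    x0, y0 = 0.7, -0.4
    for h in (1e-1, 1e-2, 1e-3):
        # dx and dy are deliberately unequal: the double limit does not
        # depend on how (dx, dy) approaches (0, 0).
        print(h, mixed_quotient(f, x0, y0, h, 2 * h), f_xy_exact(x0, y0))
```

Because $F$ is symmetric in the roles of $x$ and $y$, the same quotient serves for both iterated limits, which is the point of the proof. A finite-difference check of this kind is only suggestive; it proves nothing about the limit.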


By repeated applications of this theorem, we can draw a more general conclusion about partial derivatives of higher order: if these exist and are continuous, then all that matters is the number of times we differentiate with respect to each variable. Thus, for example,

\[
f_{xyxy} = f_{xxyy}.
\]

Proof. Applying Theorem 2 to the function $f_x$, we get $f_{xyx} = (f_x)_{yx} = (f_x)_{xy} = f_{xxy}$. Therefore

\[
(f_{xyx})_y = (f_{xxy})_y,
\]

and $f_{xyxy} = f_{xxyy}$, which was to be proved.

Warning: It is not true that if $f_{xy}$ exists and is continuous, then $f_{yx}$ also exists and is the same. To see this, let

\[
f(x, y) = \varphi(y),
\]

where $\varphi$ is any function which is not differentiable. Then

\[
f_x(x, y) = 0 \quad \text{for every } x, y,
\]

because the slice functions for constant $y$ are constant. Therefore, trivially, $f_{xy}$ exists, and $f_{xy}(x, y) = 0$ for every $x, y$, so that $f_{xy}$ is continuous. But $f_y$ is not defined, because $\varphi'(y)$ is not defined. Therefore $f_{yx}$ is not defined either.
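A concrete instance of the warning may help. The choice $\varphi(y) = |y|$ below is ours, made only for illustration; it fails to be differentiable at a single point, which is already enough.

```latex
\[
f(x, y) = |y|, \qquad f_x(x, y) = 0, \qquad f_{xy}(x, y) = 0 \quad \text{for every } x, y,
\]
but at a point $(x, 0)$ the one-sided difference quotients in $y$ disagree:
\[
\lim_{\Delta y \to 0^+} \frac{f(x, \Delta y) - f(x, 0)}{\Delta y} = 1,
\qquad
\lim_{\Delta y \to 0^-} \frac{f(x, \Delta y) - f(x, 0)}{\Delta y} = -1 .
\]
Hence $f_y(x, 0)$ is not defined, and so $f_{yx}(x, 0)$ is not defined either.
```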

APPENDIX L

Possible Peculiarities of Functions of Two Variables

Some of the definitions that we have used in Chapter 14, and the hypotheses of some of our theorems, may seem needlessly strong. In fact, they are not. The theory of functions of two variables includes some rather odd and unexpected phenomena; and if we want to draw simple conclusions, we need to use hypotheses sufficiently strong to rule out the oddities. Some of these are as follows.
Example 1. There is a function $f$ such that

1) all the slice functions $f(x, y_0)$ (with $y$ held constant) are continuous, and
2) all the slice functions $f(x_0, y)$ (with $x$ held constant) are continuous, but
3) $f$ is not continuous.

Proof. In the first quadrant of the xy-plane we take an infinite sequence $D_1, D_2, \ldots$ of circular disks, not intersecting each other, with radii approaching 0, and approaching the origin as a limit.

[Figure: on the left, the disks $D_1, D_2, \ldots$ in the xy-plane; on the right, the graph of $f$ over one disk.]

As indicated in the figure on the left, we take these disks with their centers on the line $y = x$, in such a way that no horizontal or vertical line intersects more than one of them. This is easy to arrange, because we can make the disks as small as we want. If $(x, y)$ lies in none of the disks $D_i$, then we define $f(x, y)$ to be 0. Over each disk $D_i$, the graph of $f$ is a "blister" of height 1, shown on the right.

Obviously $f$ is not continuous at $(0, 0)$. But all the slice functions $\varphi(x) = f(x, y_0)$ are continuous. Since no horizontal line intersects more than one of the disks $D_i$, it follows that the graph of $\varphi$ looks, at worst, like the graph shown below.
[Figure: graphs of the slice function $\varphi$ against $x$.]

On the left we see what happens if the line $y = y_0$ passes through the center of a disk. If this doesn't happen, then the maximum of $\varphi$ is smaller; and of course $\varphi(x)$ may be 0 for every $x$. Similarly for the slice functions for constant $x$.

Here, of course, the slice function

\[
\varphi_{\pi/4}(t) = f\left(\frac{t}{\sqrt{2}}, \frac{t}{\sqrt{2}}\right)
\]

is not continuous. Its graph is shown below. But even if a function $f$ has slice functions

\[
\varphi_\alpha(t) = f(x_0 + t \cos \alpha, \; y_0 + t \sin \alpha)
\]

which are continuous, for every $(x_0, y_0)$ and every $\alpha$, we still cannot conclude that $f$ is continuous.

[Figure: graph of the slice function $\varphi_{\pi/4}$.]

Example 2. There is a function $f$ such that

1) every slice function $\varphi_\alpha$ is continuous, but
2) $f$ is not continuous.

Proof. Consider the parabola $y = x^2$, in the xy-plane. Between the parabola and the x-axis we take a sequence $D_1, D_2, \ldots$ of circular disks, with radii approaching 0, approaching the origin as a limit. On each disk, the graph of $f$ is a blister of height 1, as in Example 1; everywhere else, $f(x, y) = 0$.

[Figure: the disks $D_i$ lying between the parabola and the x-axis.]

As before, $f$ is not continuous. But all the slice functions are. The reason is that no line $L$ intersects more than a finite number of the disks $D_i$:

[Figure: three positions of a line $L$ relative to the parabola and the disks.]

If $L$ does not pass through the origin (as on the left above) or if $L$ passes through the origin and has negative slope (as in the center), this conclusion is trivial. The interesting case is shown on the right. Here $L$ passes through $(0, 0)$ and has positive slope. Near the origin, in the first quadrant, the line lies above the parabola and the disks lie below it. Therefore $L$ cannot intersect infinitely many disks. Therefore the slice functions defined along any line $L$ are continuous.
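The construction can be made concrete. The following is a minimal sketch under assumptions that are ours, not the text's: the disk $D_i$ is given center $(1/i, 1/(2i^2))$ and radius $1/(10i^2)$, which keeps the disks disjoint and strictly between the x-axis and the parabola, and the blister over $D_i$ is the paraboloid cap $1 - d^2/r^2$ of height 1. Inputs with $x < 10^{-100}$ are treated as lying outside every disk, a floating-point cutoff only.

```python
import math


def blister(x, y):
    """Value of a hypothetical function f of the kind described in Example 2."""
    if x < 1e-100 or y <= 0.0:
        return 0.0  # left of, or below, every disk (also the numerical cutoff)
    # A point of D_i has 1/x within about 0.12 of i, so only the integers
    # near 1/x need to be examined.
    i0 = round(1.0 / x)
    for i in (i0 - 1, i0, i0 + 1):
        if i < 1:
            continue
        fi = float(i)
        cx, cy, r = 1.0 / fi, 0.5 / (fi * fi), 0.1 / (fi * fi)
        d2 = (x - cx) ** 2 + (y - cy) ** 2
        if d2 < r * r:
            return 1.0 - d2 / (r * r)  # cap of height 1, zero on the rim
    return 0.0


def slice_value(alpha, t):
    # The slice function through the origin in direction alpha.
    return blister(t * math.cos(alpha), t * math.sin(alpha))


if __name__ == "__main__":
    # f equals 1 at disk centers arbitrarily close to the origin ...
    print([blister(1.0 / i, 0.5 / i**2) for i in (1, 10, 100)])
    # ... yet along the line of slope 1/2 it vanishes near the origin.
    alpha = math.atan(0.5)
    print([slice_value(alpha, t) for t in (0.4, 0.04, 0.004)])
```

With these choices every point of every disk satisfies $y < x^2$, so a line $y = mx$ with $m > 0$ can meet a disk only where $x > m$; this is the "line above the parabola" argument in numerical form.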
Our next peculiar function is going to be continuous. We recall that in Section 14.8 we proved the following theorem:

Theorem A. Given a function $f$, defined in a neighborhood of $(x_0, y_0)$. For each $\alpha$, let

\[
\varphi_\alpha(t) = f(x_0 + t \cos \alpha, \; y_0 + t \sin \alpha).
\]

Suppose that $\varphi_\alpha'(0) = 0$ for every $\alpha$; and suppose that there is a number $\delta > 0$ such that for $|t| < \delta$ we have $\varphi_\alpha''(t) < 0$ for every $\alpha$. Then $f$ has an ILMax at $(x_0, y_0)$.

The reason is that for each $\alpha$, $\varphi_\alpha(0)$ is the maximum value of $\varphi_\alpha$ on the interval $(-\delta, \delta)$. See the figure below.

[Figure: graph of $\varphi_\alpha$, with $\varphi_\alpha'(0) = 0$ and $\varphi_\alpha''(t) < 0$ for $-\delta < t < \delta$.]

Here it is essential that there be a single number $\delta > 0$ which works for every $\alpha$. The following plausible-looking variation on Theorem A is false.

? Theorem B? Given a function $f$, defined in a neighborhood of $(x_0, y_0)$. For each $\alpha$, let

\[
\varphi_\alpha(t) = f(x_0 + t \cos \alpha, \; y_0 + t \sin \alpha).
\]

If (1) each function $\varphi_\alpha$ has an ILMax at 0, then (2) $f$ has an ILMax at $(x_0, y_0)$.

The falsity of this "theorem" is demonstrated by:

Example 3. There is a continuous function $f$ such that (1) every slice function through the origin has an ILMax at the origin, but (2) $f$ does not have an ILMax at the origin.

This is similar to Example 2. As before, we take a sequence of disks lying under a parabola. We define $f(x, y)$ to be 0 everywhere except on the disks. But this time, we take the blister over the $i$th disk $D_i$ in such a way that its height is $1/i$. Now our function $f$ is continuous.

[Figure: the disks $D_i$ under the parabola.]

As before, no line $L$ in the xy-plane intersects more than a finite number of the disks. Therefore every slice function

\[
\varphi_\alpha(t) = f(t \cos \alpha, \; t \sin \alpha)
\]

is equal to 0 in a neighborhood of 0. That is, for each $\alpha$ there is a $\delta_\alpha > 0$ such that $\varphi_\alpha(t) = 0$ for $|t| < \delta_\alpha$. Therefore every $\varphi_\alpha$ has an ILMax at 0. But obviously $f$ does not have an ILMax at $(0, 0)$; $f(0, 0) = 0$, but every neighborhood of $(0, 0)$ contains a disk $D_i$, on which $f(x, y) > 0$.


[Figure: graph of $f$ near the origin.]

The trouble here is that while for every $\alpha$ there is a $\delta_\alpha$ with the desired property, there is no one $\delta$ which works for every $\alpha$. If $\alpha > 0$ and $\alpha \approx 0$, then $\delta_\alpha \approx 0$; and so $\inf \{\delta_\alpha\} = 0$. If you reread the proof of Theorem 3, Section 14.8, you will see how this trouble was avoided: using the continuity of $f_{xx}$, $f_{xy}$, and $f_{yy}$, we found a single $\delta > 0$ which worked for every $\alpha$. Thus the proof of Theorem 3 was not merely complicated in a technical way but was also subtle, in a way which is not likely to be understood unless we re-examine the proof in the light of Example 3.

APPENDIX M

Maxima and Minima, for Functions of Two Variables

Here we give a brief sample of the way the theory of continuous functions of one variable can be extended so as to apply to functions $f: D \to \mathbf{R}$, where $D$ is a domain in a Cartesian space $\mathbf{R}^n$ $(n > 1)$.

Theorem 1. Let $D$ be a closed rectangular region in $\mathbf{R}^2$, defined by the inequalities

\[
a \le x \le b \quad \text{and} \quad c \le y \le d.
\]

Let $f$ be a continuous function $D \to \mathbf{R}$. Then $f$ is bounded above.

Proof. Suppose that $f$ is not bounded above. We shall show that this leads to a contradiction.
The region $D$ is the union of four closed rectangular regions, shown in the figure on the left below. These will be called quarters of $D$. These are like the "halves" of an interval $[a, b]$, as defined in Section 5.6; and they are going to be used in exactly the same way.

[Figure: on the left, the region $D$ divided into four quarters; on the right, a region $D_i$ with its projections $I_i$ and $J_i$ onto the x- and y-axes.]

Following the pattern of Section 5.6, we say that a closed rectangular region $D'$ is good if $f$ is bounded on $D'$; and $D'$ is bad if $f$ is unbounded on $D'$. We are assuming that the given $D$ is bad. It follows that one of the quarters of $D$ must be bad. (Why? See Lemma on page 240.) Let $D_1$ be a bad quarter of $D$. Similarly, let $D_2$ be a bad quarter of $D_1$. Proceeding in this way, we get a sequence

\[
D_1, D_2, \ldots
\]

of closed rectangular regions, each of which is bad, such that for each $i$, $D_{i+1}$ is a quarter of $D_i$. As indicated in the figure on the right above, let $I_i$ and $J_i$ be the closed intervals which are the projections of $D_i$ onto the x- and y-axes. Then $I_1, I_2, \ldots$ is a nested sequence. By the Nested Interval Postulate (NIP) there is an $\bar{x}$ which lies on each interval $I_i$. Similarly, there is a $\bar{y}$ which lies on every interval $J_i$. It follows that the point $\bar{P} = (\bar{x}, \bar{y})$ lies in every region $D_i$.
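The subdivision process can be sketched as follows. This is only an illustration under assumptions of ours: "badness" of a rectangle cannot be tested by a program in general, so the hypothetical predicate `contains(p)` stands in for it, declaring a rectangle bad exactly when it contains a fixed point `p` (as would happen for a function that is unbounded only near `p`). Rectangles are represented as tuples `(a, b, c, d)` meaning $a \le x \le b$, $c \le y \le d$.

```python
def quarters(rect):
    """Split the closed rectangle (a, b, c, d) into its four quarters."""
    a, b, c, d = rect
    mx, my = (a + b) / 2.0, (c + d) / 2.0
    return [(a, mx, c, my), (mx, b, c, my), (a, mx, my, d), (mx, b, my, d)]


def contains(p):
    """Hypothetical stand-in for 'bad': the rectangle contains the point p."""
    px, py = p
    return lambda rect: rect[0] <= px <= rect[1] and rect[2] <= py <= rect[3]


def nested_bad_quarters(rect, is_bad, steps):
    """Return D_1, D_2, ..., where each D_{i+1} is a bad quarter of D_i.

    If the starting rectangle is bad, some quarter is always bad, so the
    search below succeeds at every step.
    """
    sequence = []
    for _ in range(steps):
        rect = next(q for q in quarters(rect) if is_bad(q))
        sequence.append(rect)
    return sequence


if __name__ == "__main__":
    for a, b, c, d in nested_bad_quarters((0.0, 1.0, 0.0, 1.0), contains((0.3, 0.7)), 5):
        # The projections [a, b] and [c, d] are the intervals I_i and J_i.
        print((a, b), (c, d))
```

The printed intervals are nested and halve in length at each step, which is what allows the Nested Interval Postulate to be applied to each coordinate separately.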

But $f$ is continuous at $\bar{P}$. Thus for every $\epsilon > 0$ there is a $\delta > 0$ such that

\[
P\bar{P} < \delta \;\Rightarrow\; |f(P) - f(\bar{P})| < \epsilon;
\]

and this means that

\[
P\bar{P} < \delta \;\Rightarrow\; f(P) < f(\bar{P}) + \epsilon;
\]

that is, $f$ is bounded on the circular disk with center at $\bar{P}$ and radius $\delta$. But this circular disk contains some $D_i$, because $\bar{P}$ lies in all the $D_i$'s, and the height and width of $D_i$ both $\to 0$ as $i \to \infty$. Therefore $D_i$ must be good for some $i$, which contradicts our hypothesis.
Theorem 2. If $f$ is continuous on a closed rectangular region $D$, then $f$ has a maximum value on $D$.

The proof is exactly like the proof of Theorem 3 of Section 5.6. Let $k = \sup f$. If $k = f(P)$ for some $P$, then $f$ has its maximum value at $P$. If $f(P) < k$ for every $P$ in $D$, let

\[
g(P) = \frac{1}{k - f(P)}.
\]

Then $g$ is continuous on $D$, but is not bounded above; and this contradicts Theorem 1.
As before, the existence of maxima gives, as a corollary, the existence of minima:
Theorem 3. If $f$ is continuous on a closed rectangular region $D$, then $f$ has a minimum value on $D$.

(Proof. Any maximum value of $-f$ is a minimum value of $f$.)


The same scheme works for continuous functions defined on an "n-dimensional interval"

\[
D = \{(x_1, x_2, \ldots, x_n) \mid a_i \le x_i \le b_i \text{ for } i = 1, 2, \ldots, n\}.
\]

We use a subdivision process just as in $\mathbf{R}^1$ and $\mathbf{R}^2$, dividing our "interval," at each stage, into $2^n$ parts.

APPENDIX N

An Exact Definition of the Idea of a Function

In Chapter 3 we explained that a function $f: A \to B$ is defined if a rule is given under which for each element of the set $A$, there exists one and only one corresponding element of the set $B$.

This formulation of the idea of a function is adequate for the purposes of elementary calculus, and so, in the text, we have let it stand. But eventually we need a more exact definition, now to be explained. To see what we are driving at, in the new definition, consider the function

\[
f: \mathbf{R} \to \mathbf{R}, \quad x \mapsto x^2 \quad \text{for every } x.
\]

The graph is a parabola.

[Figure: the parabola $y = x^2$.]

We shall now approach our new definition in the following two steps.

Step 1. We regard the function as being indistinguishable from its graph, so that the function $f$ becomes a set of points $P$, in a coordinate plane. (In this case, the function is a parabola.)

Step 2. We regard a point $P$, in a coordinate plane, as indistinguishable from the ordered pair $(x, y)$ which gives its coordinates.

[Figure: the parabola $f$, with a typical point $P = (x, x^2)$.]

The graph now becomes a collection of ordered pairs of real numbers, namely,

\[
f = \{(x, x^2)\}.
\]

This collection of ordered pairs has the property that each real number $x$ is the first term of exactly one ordered pair $(x, y)$ in the set. (This is because the graph intersects every vertical line in exactly one point.)

This final description of $f$, as a collection of ordered pairs $\{(x, x^2)\}$, can be generalized to apply to functions of any kind, on any domain. The final definition is as follows.
Definition. Let $A$ and $B$ be sets. Let $f$ be a collection of ordered pairs $(a, b)$. Suppose that

1) if $(a, b)$ belongs to $f$, then $a$ belongs to $A$ and $b$ belongs to $B$, and
2) every element $a$ of $A$ is the first term of exactly one pair belonging to $f$.

Then $f$ is a function of $A$ into $B$, and we write

\[
f: A \to B.
\]

For each $a$ in $A$, $f(a)$ denotes the second term of the ordered pair whose first term is $a$.
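For finite sets, the two conditions of the definition can be checked mechanically. The following is a minimal sketch; the names `is_function` and `value` are ours, introduced only for illustration, and the sets are small finite stand-ins for the sets $A$ and $B$ of the definition.

```python
def is_function(pairs, A, B):
    """Check conditions 1) and 2) of the definition for finite sets A and B."""
    pairs = set(pairs)
    # Condition 1: every pair (a, b) has a in A and b in B.
    if any(a not in A or b not in B for a, b in pairs):
        return False
    # Condition 2: every a in A is the first term of exactly one pair.
    first_terms = [a for a, _ in pairs]
    return all(first_terms.count(a) == 1 for a in A)


def value(pairs, a):
    """f(a): the second term of the unique pair whose first term is a."""
    matches = [b for first, b in set(pairs) if first == a]
    if len(matches) != 1:
        raise ValueError("a is not the first term of exactly one pair")
    return matches[0]


if __name__ == "__main__":
    A = {-2, -1, 0, 1, 2}
    B = {0, 1, 4}
    square = {(x, x * x) for x in A}   # a finite piece of {(x, x^2)}
    print(is_function(square, A, B), value(square, -2))
    # x = y^2 fails condition 2: the first term 1 occurs in two pairs.
    sideways = {(0, 0), (1, 1), (1, -1)}
    print(is_function(sideways, {0, 1}, {-1, 0, 1}))
```

Note that the check never mentions a rule or formula: only the set of pairs matters, which is the point of the definition.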

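For the function whose graph was the parabola, we had

\[
A = B = \mathbf{R}, \qquad f = \{(x, x^2)\}.
\]

Similarly, for the function $\operatorname{Sin}^{-1}$, we have

\[
A = [-1, 1], \qquad B = \mathbf{R} \quad (\text{or } B = [-\pi/2, \pi/2]),
\]

\[
f = \operatorname{Sin}^{-1} = \{(x, y) \mid -1 \le x \le 1, \; -\pi/2 \le y \le \pi/2, \; x = \sin y\}.
\]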

Here again the idea is that the function is defined to be its graph, and the graph is regarded as a set of ordered pairs of real numbers. Note, however, that our general definition of a function applies in a variety of other contexts, in which $A$ and $B$ may be sets of quite different kinds. For example, $A$ may be a vector space, or a region in a vector space, or the set of all positive integers. In this book, $B$ has usually, but not always, been a set of numbers.

Eventually, the exact definition of a function becomes useful, as a matter of technique. A little reflection will convince us, however, that it cannot be quite right, at any stage, to define a function to be a rule. The point is that rules are formed with words (or with a combination of words and symbols). Therefore there is such a thing as a fifteen-word rule. But surely there is no such thing as a fifteen-word function. Analogously, there is such a thing as a three-syllable name, but there is no such thing as a three-syllable man. As a matter of common sense, a man is different from his name; and in the same way, a function is different from the phrases and formulas that we use to describe it. Therefore functions ought to be defined in such a way as to be mathematical objects.

Table 1
Natural Trigonometric Functions

Angle

fu

Angle

46
47

0.803

0.719

0.695

0.820

0.731

0 682

1.072

0.838

0.743

0 669

1.111

ful

n t_1
. l _g_e_
0

0.000

0.000

1.000

1
2

0.017

0.017

1.000

0. 035

0.035

0 999

0.035

3
4

0.052

0. 052

0 999

0.052

0.070

0. 070

0.998

0. 070

48
49

0.855

0.755

0.656

1.150

0.087

0 087

0.996

0.087

50

0.873

0.766

0.643

1.192

0.105

0.105

0.995

0.105

0.890

0.777

0 ()29

l.2:l5

7
8

0.122

0.122

o.993

o.12:i

51
52

0.908

0.616

1. 280

0.140

0.139

0.990

O.Hl

53

0.788

0.925

0.799

0.602

1.327

9
10

0.157

0.156

0.988

0.158

0.942

0.809

0 588

1.376

0.175

0.174

0.985

0.176

54
55

0.960

0.819

0 574

1.428

11
12
0
i:i
0
14
15

0.000
0.017

0.192

0. 1\Jl

0.982

0.194

0.209

0.208

0.978

0 2J:l

0.227

0.225

0 974

0.231

0.244

0.242

0.970

0.262

0.259

0.279
0.297

0
5()
57

1.o:l6

0.977

0.829

0.559

1.483

0.995

0.839

0.545

1.540

1.012

0.848

0 530

I.GOO

0.249

58
59

1.030

0.857

0.515

l.GG4

0.966

0.268

60

1.047

0.866

0.500

1.732

0.276

0.961

0 287

0.875

0.485

1.804

0.956

0.306

61
62

1.065

0.292

0.883

0.469

1.881

rno

0.314

0.309

0.951

0.325

1 100

0.891

0.454

1.963

0 332

0.326

0.946

0.344

0
()3
64

1.082
1 117

0 899

0.438

2.050

20

0.349

0.342

0.940

0.364

65

1 134

0.906

0 423

21
22

0.367

0.358

0.934

0.384

1 152

0.914

0.407

0.384

0.375

0.927

0.404

1 169

0.921

0 391

2.356

23

0.401

0.391

0.921

0.424

1 187

0.927

0.375

2.475

24
25

0.419

0.407

0.914

0445

66
0
()7
0
()8
69

l 204

0.934

0 358

2.605

0.436

0.423

0.906

0 466

70

1.222

0.940

0.342

2.748

71
72

1.239

0.946

0.326

2.904

1.257

0.951

0 309

3.078

1.274

0.956

0.292

3.271

1()0
o
n

18

0.454

0.438

0.899

0.488

0.471

0 454

0.891

0.510

28

0.48\J

0.469

0.883

0.5:32

29
30

0.506

0485

0.875

0.554

7:3
740

1.292

0.961

0.276

3.487

0.524

0.500

0.866

0.577

75

1.309

0.966

0.259

3.732

31
32

0.541

0.515

0.857

0.601

1.326

0.970

0.242

4. 011

0.559

0.530

0.848

0.625

76
77

1.344

0.974

0.225

4.332

33

0.576

0.545

0.839

0.649

1. 361

0 978

0.208

4.705
5.145

34

0.593

0.559

0 829

0.675

78
79

1.379

0.982

0.191

35

0 611

0.574

0.819

0 700

80

1.:396

0.985

0.174

5.671

:50

0.628

0.588

0.809

0.727

1.414

0.988

0.156

6.314

37
38

0 646

0.602

0.799

0.754

81
82

1. 431

0 990

0.139

7.115

0.663

0.616

0.788

0.781

1.449

0 993

0.122

8.144

39
40

0.681

0.629

0.777

0.810

83
84

1.466

0.995

0.105

9.514

0.698

0.643

0.766

o.s:io

85

1.484

0.996

0.087

41

0.716

0.656

0.755

0.86!)

1.501

0.998

0 070

14.30

42

0.733

0.669

0.743

0. 900

86
87

1.518

0.999

0.052

19.0

43

0.750

0.682

0.731

0.933

1.536

0 999

0.035

28.64

44

0.768

0.695

0.719

0.966

57.29

45

0.785

0. 707

0.707

i. ooo

I
I

ggo

0
39
90

1 553

1.000

0.017

1.571

1 t. ooo

0 000

: :: I

2G0
27

1143

i"



Table 2
Exponential Functions

e"

e-x

0.00

1.0000

1.0000

2.5

12.182

0.0821

0.05

1.0513

0.9512

2.6

13.464

0.0743

0.10

1.1052

0.9048

2.7

14.880

0.0672

0.15

1.1618

0.8607

2.8

16.445

0.0608

0.20

1. 2214

0.8187

2.9

18.174

0.0550

0.25

1.2840

0.7788

3.0

20.086

0.0498

0.30

1.3499

0.7408

3.1

22.198

0.0450

0.35

1.4191

0.7047

3.2

24.533

0.0408

0.40

1.4918

0.6703

3.3

27.113

0.0369

0.45

1.5683

0.6376

3.4

29.964

0.0334

0.50

1.6487

0.6065

3.5

33.115

0.0302

0.55

1.7333

0.5769

3.6

36.598

0.0273

0.60

1.8221

0.5488

3.7

40.447

0.0247

e"

e-x

0.65

1.9155

0.5220

3.8

44.701

0.0224

0.70

2.0138

0.4966

3.9

49.402

0.0202

0.75

2.1170

0.4724

4.0

54.598

0.0183

0.80

2.2255

0.4493

4.1

60.340

0.0166

0.85

2.3396

0.4274

4.2

66.686

0.0150

0.90

2.4596

0.4066

4.3

73.700

0.0136

0.95

2.5857

0.3867

4.4

81.451

0.0123

1.0

2.7183

0.3679

4.5

90.017

0.0111

1.1

3.0042

0.3329

4.6

99.484

0.0101

1.2

3.3201

0.3012

4.7

109.95

0.0091

1. 3

3.6693

0.2725

4.8

121.51

0.0082

1.4

4.0552

0.2466

4.9

134.29

0.0074

1.5

4.4817

0.2231

148.41

0.0067

1.6

4.9530

0.2019

403.43

0.0025

1.7

5.4739

0.1827

1096.6

0.0009

1.8

6.0496

0.1653

2981.0

0.0003

1.9

6.6859

0.1496

8103.1

0.0001

2.0

7.3891

0.1353

10

2.1

8.1662

0.1225

2.2

9.0250

0.1108

2.3

9.9742

0.1003

2.4

11.023

0.0907

22026

0.00005


Table 3
Natural Logarithms of Numbers
n

log,n
*

log,n

log,n

---

4.5

1.5041

9.0

2.1972

4.6

1.5261

9.1

2.2083

4 7

1 .5476

9.2

2 2192

4.8

1 .5686

9 3

2.2300

4.9

I 5892

9.4

2.2407

9.3069

5.0

1.6094

9.5

2.2513

0 6

9.4892

.5.1

l .6292

9.6

2.2618

0.7

9.6433

5.2

I 6487

9.7

2.2721

0.8

9.7769

5.3

I 6677

9.8

2.2824

0.9

9.8946

5.4

1 .6864

9.9

2.2925

JO

0 0000

5.5

I. 7047

10

2.3026

11

0.0953

5.6

1.7228

11

2.3979

12

0.1823

5.7

1.7405

12

2.4849

I. 3
I. 4

0.2624

5.8

I.7579

13

0.3365

5.9

1 7i50

14

I. 5

0.4055

6.0

1 7918

15

1.6

0.4700

6. I

l. 8083

16

2.7726

I. 7

0.5306

6.2

1.8245

17

2 8332

I. 8
I. 9

0.5878

6.3

1.8405

18

2.8904

0.6419

6.4

1.8563

19

2.9444

2.0

0.6931

6.5

I. 8718

20

2.9957

0.0

0.1

7.6974

0.2

8.3906

0.3

8.7960

0 4

9.0837

0.5

2.5649
2.6391
2.7081

2.1

0.7419

6.6

1.8871

25

3.2189

2.2

0.7885

6.7

I 9021

30

3.4012

2.3

0.8329

6.8

1.9169

35

3.5553

2..4

0.8755

6.9

1.9315

40

3.6889

2.5

0.9163

7.0

1 .9459

45

3.8067

2.6

0.9555

7. I

1 .9601

50

3.9120

2.7

0.9933

7.2

I 9741

55

4.0073

2 8

1.0296

7.3

1.9879

60

4.0943

2.9

I. 0647

7.4

2.0015

(i5

4.1744

70

4.2485

3 0

1 0986

7.5

2.0149

3. l

1.1314

7.6

2.0281

75

4.3175

3.2

I 1632

7.7

2.0412

80

4.3820

3.3

I. 1939

7.8

2.0541

85

4.4427

3.4

1 2238

7.9

2.0669

90

4.4998

3.5

1.2528

8.0

2.0794

95

4.5539

3.6

8. l

2.0919

JOO

4.6052

3 7

I. 2809
I. 3083

8.2

2.1041

3.8

1.3350

8.3

2.1163

3.9

!. 3610

8.4

2.1282

4.0

I. 3863
I. 4110

8.5

2.1401

4. 1

8.6

2. 1518

4.2

1.43.51

8.7

2.1633

4.3

1.4586

8.8

2. 1748

4.4

14816

8.9

2.1861

Selected Answers

PROBLEM SET

1. x
7. x

<

-3

>

13

1.3

3. x
9. x

PROBLEM SET

1. (
5. (0, 10)
-a:J,

9.

a) (0,

<
<

t
2

) b) [-1, a:J)

a:J

25. [t. tJ
[-1, 3)

29.

PROBLEM SET

c) 2v'2

v3, 1 - v3) and (1 - v'3, 1

v3)

2.3

3. y

4, x 0
7. y 2x, x - 1
21. b) y x

PROBLEM SET

1. y - 1
5. 4y

11. (3, 0)

1. x 3
5. x2 + y2 + 4x - 4y + 4
0
9. x2 - 2x + y2 - 3
=

a:J

d) v2

7. (i!, 154), 154 v'26


9. a) lyl b) lxl

23. x

-3

2.2

b) 6v'2

PROBLEM SET

-3

<

3. (-1, 1)
7. (- a:J, -1) and [3,
11. ( -a:J, a:J)
15. { }
19. {1}
23. (-2, 2)
27. [f, tJ
31. [t. 3)

13. {-2, t}
17. ( -a:J, a:J)
21. (- a:J, -2) and(-!-, a:J)

5. (1

>

1.4

-1)

1. a) 2v'2

5. x
11. x

3x

2.4

l
-6(x
- 2), y

25

x
5

7
S

3. y

f(x - 1), y

7. (1, 1) and (1, -1)



fx - f


PROBLEM SET

2.6

5. F= (0, -12), D:y= -/2

7. F= (0, f),D:y= t
11. F= (-1, t),D:y= -t

9. F= (2,f),D:y=f
13. F= (-t, 11s-), D:y = --16
PROBLEM SET

2.7

9. crosses x-axis at (0, 0), (2, 0), ( -2, 0); tangent horizontal where x =
(0, O) =

-4;

PROBLEM SET

y > 0 if -2

<

<

0 or 2

<

x; y

3. 50

5. 33

7. b3
+

(m

1)7

PROBLEM SET

n7

7. t

15. a) n 4 b) n 2 1010 c)

>

<

2c4

2c5

<

slope at

b4

b5

11. >2
i=3

l)(n

- 1)
5

_4,Q
3

9. t

-2 or 0

2.10

>

2c3

2.9

itXQ

17. a)

7. in(n

(n

<

v3

5. 2n2
PROBLEM SET

0 if x

----2 --:

2.8

1. 14

9. m7

<

> -2 +

. - 3)

!lQ
3

11. t

1 I
v E(8
2

also works. Note that you were not asked to find the smallest possible n.

98

b) n

PROBLEM SET

>

1
-2

3.4

7. bounded with M = 1

9. bounded with M = 1

11. bounded with M = 8

13. bounded with M = t

15. bounded with M = t

17. unbounded

19. bounded with

M= 1

PROBLEM SET

3.5

1. 70x9 - 8x7

3.

+ 1)2
-1
7
. (x - 1)2
11. a) 3ay + 2x b) 3xy

-2y3 - 3
5.
(y3 - 3)2
9.

13.

3(1

x)2

2(x2 - 1)
(x2 + x + 1)2

23.

x2 - x

-3x2

4x

----

2x

l)(x3

3x

---:====
v(1 - x2)s

-3x + 1
3
. (x + l)s
7)711(3x2

2x - 1)

2Vx - l(x2 + 1)2


13. 1 if x > 0, -1 if x
+

3.6

5. 712(x3

6x2

3a 2

-x

2x + 3
1. --;=====
2Y(x + l)(x + 2)

23. f(x2

1 9. ---===
Vl - x2

PROBLEM SET

19. f(3x2

(x

15. 4x3

1
17. --2.Yx + 1
-1
21.
2.Y

9.

7.

11.
<

+ x)l/2
1)312(2x

x -1

-:===

.Yx(x - 2)
-1

----:===-----:===
v1 - x Y(l + x)s

15. 3(2x3y - 3x2_y2)(x3y2 - x2y3)2


21. y11211
25. b) tx-213

3)

27. f. x<PfqJ-1
q

PROBLEM SET

1. a)

x2
2

3. a) x

x2
2

b) - b) -x

PROBLEM SET

1. t

bll

3.7

c) tx lxl
c) JxJ

d) x

d) 1

e) -x

e) -1

f) Ix!

f) sig x

3.8

1
11

3. a) U

b) - (b11 - a11)

5. a) -!

b)-

x4
4

x2
- -x
2

c)

1 b101
(
101

.
_

alOl)

d) -- (x n+l - an+l)
n
+

7. a) vs - v2

b) 0


9. (1+ xlO)lOO

13.

11. 2(V2 - 1)

15.

PROBLEM SET

674

3.9

t2

-2 +2t+3

5.

1. ft2+4t+4

3.

7. -Vl- x2

9. 2v/ - 2v2+ 5

7
-1
l3. 3(1 + t3) +
6

17.

Vo

15.

-100+g

PROBLEM SET

20
g

, a(t)

19. gL

9.

I3.
I7.

--.

I5.
I
' -;::==-;:==:::::;

VI+x V(I - x)3

-4x3

' (x4+I)2

I9. 2VI+(2x)8

4.1

I. csc 0

3. cot x

5. cosy

7. tan 0

9. tan 0

I1. sec2 O

13. tan x

I5. -tan 0

I7. sec O

I9. tan O

21. -sec O

23. sin 0

25. -tan 0

27. -sec 0

PROBLEM SET

9. a) -I

I I. a)

ro

0,

20
g

PROBLEM SET

[ ]

I
5
II. -vx(x2+I), -2x
vx 2vx

(1 +x2)2

x x
-- I+ x

I- x

-gin the time interval

7. v:;,

-2x

--

t3
2
1i.3+t+3

3. x4 - x, 4x3 - I

VI+ x8

2 1
- 20

3.10

4x7

I + x2'

+t

6 ft/sec

I. x2, 2x
5. VI+x8,

t5
20

4.2

b) -cos 0
b) -cot O

13. cos 0
27. 4 cos3 0 - 3 cos 0

PROBLEM SET
I.

4.3

5. -sinx

3. sec x tan x

sec2 x

11.

9. 2 cos 2x
PROBLEM SET

13.

-
cos x
lcos xi


7. -2 sin x cos x

-1
.
+smx

4.4

7. 1
PROBLEM SET

4.5

I. g(x) =sin x,f(u) =u2,f'(u) =2u,g'(x) =cos x,f'(g) =2 sinx, q/(x)


2 sin x cos x
3. g(x) =sin x +cos x,f(u) = u2,j'(u) =2u, g'(x) =cos x - sin x,j'(g) =2(sin x +
cos x), cp' (x) =2 cos 2x
=

5. g(x) =2x,f(u) =tan u,f'(u) =sec2 u,g'(x) =2,f'(g) =sec2 2x, cp'(x) =2 sec2 2x
1
1
/7. g(x) = 1 - x2,f(u) =vu,f (u) = r ,g (x) = -2x,f (g) = . i
,
2vl- x2
2vu
-x
cp'(x) =
v1 - x2

9. g(x) =1 +x, f(u) =u113, f'(u) =tu-213, g'(x) =1, j'(g) =t(l +x)-213, cp'(x) =
!(1 +x)-213
cos x,f(u) =Jg (t2 +1) dt,f'(u )=u2 +1,g'(x)
11. g(x)
cp'(x) = -sin x(cos2 x + 1)
13. -3x5 cos (x&)
=

-sinx,f'(g) =cos2 x +1,

15. g(x) =x3, f(u) =sin u, f'(u) =cos u, g'(x) =3x2,f'(g) =cos x3, cp'(x) =3x2 cos x3
17.

cp'(x) =cos xVl +sin2 x

19.

21.

PROBLEM SET
1.

4.6

2x cos x2

3. -3x2 sin x3

7. (3x2 +1) cos (x3 +x)

-1
sin vx
9.
2v:X

5. 2t sec2 (t2 +1)


11. t sec2

x- 1
-2

13. 1
15. a) 2 sec2 x tan x
19.

-2 sin 2x

25.

-x
sin Vx2 +1
vx2 +1

--

b) 2x sec x2 tan x2

17. -2 sin 2x

21. cos x

23.

27.

a) tx-213 cos x113

b)

---

1 +cos x
!(sin x)-2!3 cos x

31. -sin x cos cos x

33. cos x sec2 sin x

35. -sin x cos3 x

31.

39.

41.

43.

29. cos x
0


PROBLEM SET

1.

4.7

3.

----

Y2x - x2

9.

7. 2x
13.

19.
25.

2
xYx4 - 1
x

21 .

.Y1 +x2
1

27.

Y -x - x2

33. 1
37.

15.

--===

1
-

1
-x

11.

lxl Yl - x2
-1

17.

.Y1 - x2

23.

(1 + x2)a/2
2x

1.

.Y1 - x4

35. J-1(x)

--=

.Y2

5. 1

2+2x+x 2

2x
2 +2x2+x4
1
x.Yx2 - 1
1
(1

x2)a;2

77"

.Y 1 - x2, 0 ;2; x ;2; 1

77"

PROBLEM

SET

1. log.x+ 1,

4.9

1
x

5. (sin x+cos x)e"',

3. (I +2x)e2"', (4+4x)e2"'
2e"' cos x

500 -500
9. -,- x2
x

7. 2/x, -2/x2

11. (log. 10) 10"', (log.10)2 10"'

13. 0, 0

15. (2x + x2)e"', (2+ 4x+x2)e"'

17. (2x2+ l)e"'2, (4x3+ 6x)e"'2

19.

-1
1

-1
x ' (1 - x)2

21. e"'-1, e"'-1

23. tan x, sec2 x

25. cot x, -csc2 x

27. sec x, sec x tan x

29. -csc x, csc x cot x

PROBLEM SET

4.10

2
1. - In x
x

2x.
3. - -x2+ 1

7. 2x

9. cos x exp sin x

5. 2x exp x2
11. (In x+ 1) exp (x In x)

13. cot x

15. (In x+ l)x"'

19. 2

21. sec3 x

23. 1

25. 2

27. 2

29. 2


PROBLEM SET

4.11

23. 4 cosh3 x - 3 cash x


39.

2x

47. Cosh-1(x) =In (x

1. increasing on

31. v'1

x2

33

43. x = In 2

Vx4-1

PROBLEM SET


2
v'1

4x 2

45. x = In (2y)

v'x2 - 1 )

5.1

[ - tr/2, tr/2], decreasing

on [- tr, -tr/2] and [tr/2, tr]

3. increasing on [-2, OJ, decreasing on


[O, 2]

0.,
y

-2

5. increasing on [-2, -1] and [1, 2],


decreasing on [-1, 1]

9. increasing on [O, 1]

-1

7. increasing on [- tr, -3tr/4], [-tr/4, tr/4],


and [h/4, 5tr/4], decreasing on
[-3tr/4, -tr/4] and [tr/4, 37T/4]

11. increasing on [tr, 27T], decreasing on

[0, tr]

.. x

13. increasing on [-2, -1] and [O, 1],


decreasing on [-1, O] and [1, 2]
y

15. increasing on [-1, O], decreasing on


[0, 1]
y


17. increasing on[-Tr, -Tr/2J and ['"/2, '"J,


decreasing on[-'"/2, '"/2J

19. increasing on [In 2, 2J, decreasing on


[O, In 2J

y
y

PROBLEM SET

5.2

1. local maxima -π, π/2; local minima -π/2, π; maximum π/2; minimum -π/2; inflection point 0; image [-1, 1]; concave upward [-π, 0]; concave downward [0, π]
3. local maximum 1; local minima -2, 2; maximum 1; minima -2, 2; inflection points

-l/v'3, l/v'3; image [i-, lJ;


[ -l/v'3, l/v'3]

concave upward

[-2, -l/v'3] and [l/v'3, 2];

concave

downward

5. local maxima -1, 2; local minima -2, l ; maxima -1, 2; minima -2, l; inflection
point O; image[-2, 2J; concave upward [O, 2J; concave downward [-2, OJ
7. local maxima -3Tr/4, '"/4, '"; local minima -Tr, -Tr/4, 3Tr/4; maxima -3Tr/4, '"/4;
minima -Tr/4, 3'"/4; inflection points -Tr/2, 0, '"/2; image [-1, lJ; concave upward
[-Tr/2, OJ and [7T/2, '"J; concave downward[ -1T, -7T/2] and [0, 7T/2]
9. local maximum 1; local minimum O; maximum 1; minimum O; inflection points none;
image[l, e - 2]; concave upward [O, 1]; concave downward {}
11. local maxima 0, 2'"; local minimum 1T; maxima 0, 21T; minimum '"; inflection points
7r/2, 37T/2; image [-1, 1]; concave upward [7r/2, 37T/2]; concave downward [O, 7r/2]
and[3'"/2, 27T]
13. local maxima -1, 1; local minima -2, 0, 2; maxima -1, 1; minima -2, 2; inflection

-v'i, + v'k; image[-8, 1]; concave upward [-v'i, v'l]; concave downward
[-2, -v'l] and [v'i, 2]

points

15. local maximum O; local minima -1, 1; maximum O; minima -1, l; inflection points

{fl; image [t,


downward [- v-'i, {Ii]
_

{! %

l]; concave upward

[-1,

{ff]

and

[{If, l];

concave

17. local maxima -7T/2, '"; local minima -Tr, 7r/2; maximum '"; minimum -1T; inflection
point O; image[-7T, 7r]; convex upward[O, 7r]; concave downward[-7T, OJ
19. local maxima 0, 2; local minimum In 2; maximum 2; minimum In 2; inflection points
none; image [2 - 2 In 2, e2 - 4]; concave upward [l, 2]; concave downward {}
21. No

PROBLEM SET


5.3

1. maxima none; minima none; local maximum 1; concave upward (- oo, 0) and (2, oo) ;
concave downward (0, 2); inflection points none;
Iim f(x) = oo,
Iim f(x) = oo,
lim f(x) = -oo,
limf(x) = 0,
limf(x) = 0,
X-+oo

lim f(x)
x--2-

x--..- oo

x-...O+

x-o-

x--2+

= - oo

3. maxima none; minima none; local maximum t; concave upward (-oo, -2) and
(3, oo); concave downward (-2, 3); inflection points none;
limf(x) = 0, lim f(x) = 0, Jim f(x) = -oo, Jim f(x) = oo,
:2:-+00

lim f(x)
X-+3+

X--..-2+

x---oo

= oo, Iim f(x)

X-+3-

X-+-2-

- oo

5. maxima none; minima none; local maxima none; concave upward (-oo, 0) and (0,
concave downward { }; inflection points none;
limf(x) = 0, Jim f(x) = 0, Jim f(x) = oo, Jim f(x) = oo
X--+00

a'-+0+

x--co

oo) ;

X--+0-

7. maxima none; minimum O; local maxima none; concave upward [-1/v'3, l/v'J];
concave downward (-oo, -l/v'3] and [l/v'3, oo); inflection points -l/v'3, l /v'3;
lim f(x) = 1, Jim f(x) = 1
x--oo

X-+00

9. maxima t; minima none; local maxima t; no local minima; concave upward (-oo, O]
and [l, oo); concave downward [O, 1]; inflection points 0, 1;
Iim f(x) = 0, lim/(x) = 0, Jim f(x) = !; Jim f(x) = !
X---+--00

X-+CO

X--+-1-

x.-.-1+

11. no maxima; no minima; no local maxima; no local minima; concave upward ( - oo, -1)
and [o, -fit]; concave downward ( -1, O] and [-fit, oo); inflection points 0, -fit;
lim /(x) = 1, Iim f(x) = 1, Jim f(x) = oo, Jim f(x) = - oo
X-+-co

13.
19.
25.

x_.co

e
e112
0

PROBLEM SET

1. a

2
7 x =-a
v3

19.

35.

17. e
23. e213
29. $e

5.4

2v'2
= -a 9 128 in.3
v3

k3
15
1728
1
21. {1 3

w2

8
2

25. -

x--..-1 +

3. 2a2

13.

X-+-1-

15. e
21. e
27. 0

{13

4 '17'r3

3v'3

39. (3 +2v'2)r2

17'

23. 2 +217'k, k an integer


PROBLEM SET

5.5

1. t

5. 1

3. t

11. v2

9. h/d
PROBLEM SET

13. 4x3 + 4.ff'

9.

13.

15.

5.8

2 5. cp (u)

v' cp'(u)

27. cp(u)
33. df/dt
35. df/dt
/ (!)
2

5.
13.
17.
19.
21.
23.

25.
31.
37.

, cp'(u)

(sin x + x cos x)e"' sin'"

, cp'(g)

v'1 - u

cos x
=

sin x
1

23. xe"'

-u

'
cp (g)
-tan t cos3 t
-sin t cos2 t
(1 + u2)3/2
6
6t5
-- (Tan-1 u)5 , cp' (g)
(Tan-1 u)6, cp' (u)
+ u2
1 + (tan t)5
-t!Jfor any function/(!) such that/ 2 + t2 + 1 0
= -t3/f3,J1 (t) (1 - t4)114,j(t) !(-4t3)(1 - t4)-3/4
-t3!f'f,
-(1 - 14)114,f;(t) !:(4t4) (1 - t4)-3/4
-t3/f{

v1

+ u2

----

PROBLEM SET

9.

e"' cos2 x)

21. 2t3

19. cos (t)/e1

1.

1
3. exp Tan-1 g -- (
1 + g2
3
7.
e"'
4x

(sin x + x cos x)e"' sm"'


3x2
3x2 cos2 x
cp(u) eu214, cp'(u) (u/2)eu2/4, cp' (g) xe"'2
'
cp() eu sin u, cp' (u) (sin u + u cos u)eu sin u, cp (g)
cp(u) = e"312, cp'(u)
fu112eu312, cp'(g)
i x e"'3

17. cp(u)

3. 2

PROBLEM SET

11.

l/v'3

5.6

1. v'2

5.

7.

6.2

{i(l + x2)4 + C}
3. {!(2 + u2)2 + C}
6
{x7/7 + !-x5t2 + x3t4 + xt + C}
7. {(x/8)(x2 + t2)4 + C}
{a2a(t3/2 + 5)11 + C}
11. {t(l + sin x)3 + C}
{-{(cos x)312 + C}
15. {te2"' - 6e"' + 12x + se-x + C}
{te3"' + 3e"' - 3e-x - !e-3"' + C}
{ -W + x3)-2 + C1}, x > -1; { -l(l + x3)-2 + C },x < -1
2
{t In (1 + x3) + C1}, x > -1; {t In (-1 - x3) + C },x < -1
2
{ln2(x2) + C1},x > O; {ln2(x2) + C2},x < 0
27 . h-h- sin102 x + C}
H sin3 x + C}
29. { -i cos4 x + C}
35. {sec 8 + C}
{tan 8 + C}
33. {-cot 8 + C}
39. {t sin 28 + C}
41. {t sin 28 + C}
{ -t cos 28 + C}


43. {t sin 2fJ + C}

45.

{ - isin2fJ + c}

{: - a\ sin40 +c}

47.

51. {2 sin fJ/2 + C}

53. {i(l - cos 0)3/2 + C}

55. {tet3 + C}

57. {te2"' + C}

59. {tet + C}

61. {tan x + c}

63. { -ecost + c}

65.

49. { -l cos3

+ C}

fJ

+ t)-112 +
.

67. { -2(2

69. {2t - 2t-112

C}


{21

10

2
10"' + c

C}

71. {Sin-1 (t) + C}, ltl < 1

73. {t(l + r3)2/3 + C}

75. {iIn (l +t4) + C}

77. { -2v'1 - e"' + c}

79. {ln |sin x| + C} on any interval where sin x ≠ 0

81. {ln |sec x| + C} on any interval where cos x ≠ 0

83. {-ln |csc x + cot x| + C} on any interval where csc x + cot x ≠ 0

6.3

PROBLEM SET

1. {t Sin-1 (x2)

3. {! Sin-1 (2y2) + C}

C}
5. {i Tan-1 (3x2) + C}
+

7. {i In (9 +x4) + C}

11. { (1/3 v'z) Tan-1 ( v'z z 3) + c}

9. {t Tan-1 (z3) + C}
13. {in (5 +z6)

15. { -tv'I - x8

C}

c}

17. {l In (1 +x8) + C}

19. {In (1 + ez) + C}

21. {Tan^(-1) e^x + C}

23. {In le"' + e-"'I + C} on ( - oo, 0) or (0,

25. {(e"' - e-"')3 - e"' + e-x + C}

27. { (1/3) Sin-1 (x3/ v'z) + c}

29. { (1/3 v'z) Sin-1 ( v'z x3) + c}

31. {t(Sin-1 x)2

33. {xe"'

35.

C}

C}

C}

{ c}
+

37. {x sin x + C}

39. { - x cos x

41.

43. {x3 ln x + c}

{ In x + c}

oo

45. {tln4 x + C}

47. {i ln2 lx2

2xl + C} on any interval where x2 + 2x

49. {t Sec-1 (x2) + C}

51. {x Sin-1 x + C}

53. {x Sin^(-1) x + √(1 - x^2) + C}

55.

57. {t In (1 + e2u) + C}

- v'l - x2 + c}
59. {t (e2u - In (1 + e2u)) + C}

61. {t In (1 + x2) + C}

63. {(1/2)(Sin^(-1) z + z√(1 - z^2)) + C}

PROBLEM SET

6.4

1. {x ln2 x - 2x In x + 2x
v

{x Cos-1 x

C}

5. {-(x/a) cos ax + (1/a^2) sin ax + C}

3. {a(x - l)e" + C}
7. {[1/(1 + a^2)] e^(ax) (-cos x + a sin x) + C}

9. {[1/(a^2 + b^2)] e^(ax) (a sin bx - b cos bx) + C}


11. {-x^2 cos x + 2x sin x + 2 cos x + C}

13. {x^3 e^x - 3x^2 e^x + 6x e^x - 6e^x + C}

15. {(x^3/3)(ln^2 x - (2/3) ln x + 2/9) + C}

17. {x Sin^(-1) x + √(1 - x^2) + C}

19. {x Tan^(-1) x - (1/2) ln (1 + x^2) + C}

21. {(1/2)(x e^x sin x + e^x cos x - x e^x cos x) + C}

23. {x ln^3 x - 3x ln^2 x + 6x ln x - 6x + C}


25. ∫ x^n e^x dx = x^n e^x - n ∫ x^(n-1) e^x dx

27. {(x/2)(sin ln x - cos ln x) + C}
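
The reduction formula in answer 25 of this problem set can be applied repeatedly, and for n = 3 it should reproduce answer 13. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, the constant of integration is taken to be 0, and the derivative is checked with a simple central difference rather than symbolically.

```python
import math


def antiderivative_xn_exp(n, x):
    """Antiderivative of x**n * e**x (constant of integration 0).

    Built from the reduction formula I_n = x**n * e**x - n * I_(n-1),
    starting from I_0 = e**x.
    """
    if n == 0:
        return math.exp(x)
    return x**n * math.exp(x) - n * antiderivative_xn_exp(n - 1, x)


def closed_form_n3(x):
    """Closed form for n = 3: x^3 e^x - 3x^2 e^x + 6x e^x - 6e^x."""
    return (x**3 - 3 * x**2 + 6 * x - 6) * math.exp(x)


def numeric_derivative(f, x, h=1e-5):
    """Central-difference estimate of f'(x); accuracy is limited by h."""
    return (f(x + h) - f(x - h)) / (2 * h)


if __name__ == "__main__":
    for x in (0.0, 0.5, 1.3):
        # The recursion and the closed form should agree up to rounding.
        print(x, antiderivative_xn_exp(3, x), closed_form_n3(x))
```

Differentiating the recursive result numerically and comparing it with x^n e^x gives an independent check that does not rely on the closed form.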


PROBLEM SET

6.5

3. {ix - l sin 2x+ C}


1. { t sin 3 x - t sin5 x+ C}
5. {tx -i2sin 4x+ C}
7. {x+cotx - t cot3 x+ C}
9. { -! cot4 x+t cot2 x -In Icos xi+ C}, cos x ,= 0
11. {-t cot3 x =- cotx+ C}
13. {-1/2(csc x cot x + ln |csc x + cot x|) + C}, cot x + csc x ≠ 0
15. { -! csc3 x cotx+ i cscx cotx+ i In Iesex+cotxi+ C}, cotx
19. {t sin 4 x+ C}
17. {ln |sin x| + C}
21.

{i+i sin (4x)+c}

cscx

23. {ln |sec x + tan x| + C}

25. {-csc x + C}

27. {-cotx -i cot3 x+ C}

29. {+ In I sec x+ tan xi - sin x+ C}

31.

PROBLEM SET
1.

>= 0

1
=

-,

n
=

-1

--

6.6

{x/√(1 - x^2) + C}

3. {ln |x + √(x^2 - 1)| + C}

5. {Sec^(-1) x + C}

7. {-1/x - Tan^(-1) x + C}

9. {(1/2)(Sin^(-1) x + x√(1 - x^2)) + C}

11. {√(x^2 - 1) + C}

13. {(1/2)(Sin^(-1) x - x√(1 - x^2)) + C}

15. {x - Tan^(-1) x + C}

17. {-(1/3)(1 - x^2)^(3/2) + C}

19. {(1/3)(1 + x^2)^(3/2) + C}

21. {(1/2)x^2 - (1/2) ln (1 + x^2) + C}


PROBLEM SET

6.7

{ x.
{( -1 -2v) + C
3.
a2va2
(1+v x)2
}
5. {x-2 In (ve"+l +1)+c}
7. {t(1+ \o/)2 -6 d+ \o/)+3 In Jl+ \o/;I+c}
1.

9. {t(v'vx+1) 3 -4Vv1x+1+c}

.;._

+c
x2

11. {vz2 -1+l(vz2-1)3+c}


13. {!('711+ v'x)5-3(\o/1+ v'x)2+c} 15. {x-tx3+ tx" fx7+ tx9+ C}


19. {2x -In ( v'1+e4"'+ 1)+c}
17. {x-In(Yl+ e2"'+I)+ C}
2 3. {x Tan-1x ! In(I+x2)+ C}
21. {tx2 lnlxl -tx2+ C}
25. {2 In1(1+ v')+2(1+ v')-1+c}
27. {4(tx3'4 - tx1'2+x1'4 - ln (1+x1'4))+c}

29. {x-In (1+e"')+I e"'+ c}

1
- -In I 1-xi +c}
31. {1nlxl-x - 2x2
33. {tx3Tan-1x

- +iln(l+x2)+c}

PROBLEM SET

1. GTan-1

6.8

() +c}

3. {In

1: :I+c}

x 1
x-4 2(x+
5. {In 12 v'x2+
v'17 + 17 + C} 7. {sin-1 ( ;3 ) +c}
11. {Tan-1(x- 3)+ C}
9. {Tan-1(x+ 3)+ C}
15. {In = +c}
13. {1In1: +c}
17. {xl -lnlx-11 +lnlx-21 +c}
1
19. {-x+2lnx-x--1-2Inlx-ll+C}
21. {111n Ix-21 -27lnIx+II-6(x 2)2+9(x 1_ 2)+c}
23. {In lxl-tIn(x2+1)+ C}
25. {1nlxl+2(x/+ 1) -tin(x2+ 1)+c}
2x ,case=-1 -x2
27. sine= 1-l+x2
+x2
29. {-sec e+tane+ C}
3i. {-x+ 3+c}
33. {In (1+sinx)+ C}, sinx -1
35. {tanx+sec x+ C}
37. { 2.In Isec (e+ 77/4)+ tan (e+ 77/4)1+c}
_

.i

f)I

1: I

=I

1-

716


PROBLEM SET

7.1

3.
1
2a

7. a) i-(e - 1/e)

b) -(ea - e-a)

PROBLEM SET

1. f(x)

5. j(5312 - 1)

ln(v'2+1)

7.3

volume

(r/h)x,

trr2h/3

3. π/2

5. π/4

7. π/2
13.

(b)

ii.
x =

PROBLEM SET

7.4

7.

5.

3. 32tr

1. 40tr2
1T

a) 2: (e2

2tr (
3 1-

9. 2 v31T

+ 1)

PROBLEM SET

11.

a) tr(e - 2)

4tr ( v'2 - 1)
T

7.5

5. 4tr(a

+ k)k2

13. 10 v2 1T2
PROBLEM SET

1. (x' y)

(a
=

7. trc(Vb2
11.

b
; '

c2

7.6

5.

v'(a - b)2

c2)

19. a

1
}-tr
4

PROBLEM SET

1. tr/4
7. 10,000
13. t
19. tr/8

27. finite
33. not finite

ac2

9 (x, y)
_

13. 4tr2ab

trCb(b - a)

15. sv2 trk(a

1T

(b c (3b - 4a))
2' 6 (b - a)

11. sv2 trk2

b)

+ v' tr4 - 8tr2

120tr)

7.7

3. t
9. 00

15. 1
not finite
not finite
35. finite

23.
29.

5. 2

11.
17. 2
25. finite
31. not finite
00

1 )
v2


PROBLEM SET


8.1

3. x' =x +t, y' = y

1. x' =x - 5, y' =y - 6
5. x' = x +2, y' = y +1
PROBLEM SET

7. x' = x +t, y' = y +t

8.2

1.x2/4 +y2/3=1

3. (x - 1)2/3 +(y - 3)2/4 =1

5. 3x2 +2xy +3y2 =8

7. x2/5 + y2/9

9. foci (±√3, 0); focal sum 4

vs),

11. foci (-2, 1 +

(-2, 1

vs);

focal sum

13. foci (0, ±√3/2); focal sum 2


PROBLEM SET

8.3

11. 4x2/9 - 4y2/7 =1

13. -4x2/7 +4(y - 2)2/9 =1

15. xy=t

17. x2 - y2/3 =1
23. x +y =1, x - y =1

19. -x2/4 +y2/5 =1


25. x =0, y = 0
PROBLEM SET

8.4

1. hyperbola
1.

A' =2

3.hyperbola

v2,

PROBLEM SET
11.

c'

=2

:r:

v2

9. A' =t

13. x =2acos

3. 1

5.1

7. 1

11. 0

13.0

15.1

17. 1

1
21. -

23.

25. 0

27. 0

29.x = ae +asin

8,

y =a - acos

31.x=(a+b)coslJ+bcos

PROBLEM SET

1. 0
13.
23.

-2

- co

c' =t :r:

v3

8,

y=0

9.2

1. 1

v3,

9.1

bcot e +acos e, y= b +asin 8

PROBLEM SET

5. straightline

9. -8/ (7r3
19. 2

a+b
a+b
- - e,y=(a+b)sin!J+bsin - - e
b
b

9.3

3. 0

5. 0

9. 1

11. 0

15. 0

17.e

19. 0

21. 0

27. 1

29.

31. 1

33.

3
e-

8)


PROBLEM SET

9.4

3. x +y = v'2
9. (x 2 +y2)2 = 2xy

1. y = 2
7. y = .x3
13. 2x +y2 -1 = 0
29. (x2 +y 2)2 = a2(x2 _ y2)
33. 3r^2 - 16r cos θ - 16r sin θ + 32 = 0
PROBLEM SET

9.5

1. 3π/2

3. π/4

1.
7.

5. t

11. 2

9. 2
PROBLEM SET

+ i-)3/2 - 1)

PROBLEM SET

23. j = ic

1. a) OS
-

5. 2rr

9. tv's +t1n(2 +1v's)

1t. !

9.7

25. a) i

te +if b) j = -!e - -tf

9.8

-+

1)

3. 3

id

PROBLEM SET
-+

13. i(e8"

9.6

7T

i((l

19. x - y = 1
31. r = 1/(1 + sin θ)

---+

-+

-+

fOR + tOP b) OT
-+

-+

-+

OP + f(OR -OP)
-

-+

-+

-+

-+

tOP + toR

-+

-+

---+

3. a) OS = !OR + !OP b) OT = !OP + !{OR -!OP) = !OP + !OR


11.c)a=f
PROBLEM SET

9.9

1. maximum at x = 0, K = 2
3. maximum at x = (45)-114, K = 5312 (45)-114 6-112; minimum at x
_53/2 (45)-1/4 6-1/2
K =

-(45)-114,

1 1
5 a'
'b
PROBLEM SET

1. 0
7. 0
13.
19.

10.1

3. 0

5. 0

11.

9. -

- 00

15. 1

00

00

17. 2

21. converges

23. 2

25. In 2

27.

29. 1

31.

33.

00

35. converges


PROBLEM SET


10.2

3. not convergent

1. not convergent
1T

7. -1 + 1T

11. convergent

9. convergent

13. convergent

15. not convergent

19. not convergent

21. convergent

17. convergent
23. convergent

25. convergent

27. not convergent

29. not convergent

00

33. I <-1)ix4i, -1

<

35. convergent

x 1

i=O

PROBLEM SET

10.3

1. alternating, absolutely convergent

3. alternating, absolutely convergent

5. alternating, absolutely convergent

7. alternating, absolutely convergent

9. not alternating, not absolutely convergent


PROBLEM SET

10.4

1
5. IRnl n
1

7. IR..I
<n
11. IRnl

1)4

9. !Rn!
(n

1
+

1)9

(This is the estimate which is easiest to derive. Much better estimates are possible.)

1
13. !Rn! n

1
15. IR..I
2n2

PROBLEM SET
1.

0.019997

5.

a) /(x)

7. a)/(x)

10.5

00

(-lY( v)i

oo

L (-l)i x5if2

i=O

PROBLEM SET

b) IRn(x)I (0.49)< n+i)/2


b) !Rn(x)I (0.25)5<n+l>/2

+)

2 /2
il (-1) 2(0 49)<i
i + 2
00

c)

ro

2(0.25)<5i+2)/2
51 + 2

c) L (-1)' -.
.

i=l

--

10.6

1. (-1, 1)

3. (-1, 1)

5. [-1, 1)

7. [-1, 1)

9. (-1, 1)

11. (-1, 1)

13. (-∞, ∞)

15. (-∞, ∞)

17. (-∞, ∞)

19. [-1, 1]

21. [4 - 1/e, 4 + 1/e]


PROBLEM SET
oo

I (-1Y

i.

i =O

(-l)
oo

5.

9.

oo

10.7

xi+ 3
. - , c-1, 11
l +1
i=O
oo
x2i+l
.
, (- oo, oo
-l)i 2i+1
7
(
2
(2i +1)!
i
co x3i +3
11. I-.- c- oo , oo
i=O ,I.
x2i+1
co
15. Ic-1)i .
[-1, 11
c 2l + n2,
i=O
x2i+l
(-l)i
19.
+
(2i
1)(2i + l)! ' - oo, oo)
i

xi+ 2
. - , ( -1, 1 1
l +1
22i+ix2H1

co

3.

(2i + 1) ! ' - oo' oo)


x2i
( 1)
(- oo, oo
3)2i(2i)!'
-

I c-1)i

oo

21.
25.

i .I (-l)i
co

i=O

f(x) = e"'l2

22ix2i+3 x3
. 1 + -, (- oo,
2
(2I )

9.
13.

n+ 1

17. 2113
21.

oo

J.

00

i=O

(n + 1)!

=(n+l- i)!i!'

xi

.
l

on

on

( )
i- 1

I c-1Y

xi+2

+2

-:-- on
l

on

( - v2, v2)

(-1, 1)

19. 2k

co

I.

an = 0 if n is

3.

a0 = 0, a1 = 1, a2 = 0

5.

an = 0 if n is

even,

an

(-l)m
if n= 2m + 1
(2m + i) !

( -l)m .
1f n= 2m +1
a11 = (
Zm + l)

on

(-1, 1)

on

(-2, 2)

x2i +l
on -v'2, v'2)
2i
2 (2i +1)

23. f(x) = ein(x)

10.9

even,

x2i+I

( -:t)
;o (;) (J
()

i=O

15. 211a

(-1, 1)
x2i+1

n!

(n- i +1)! ( i - 1)!


11.

(-1, 1)

2 i(2i + 1)

PROBLEM SET

1
7. an= f
n.

22ix2i
. ,, c- oo , oo
c2l )
i=O
29. j(j- 1)
(j- i +l)xi-1
23.

10.8

-:i xi+ 2

i=O

co

( )
( )
i (;)
(; )
-i
_2 ( )
i

oo

PROBLEM SET


9.
11.

a0 =In 2, a1 = !, a2 = -!
an

0 1f n

IS

PROBLEM SET

1.

IRn(x)j

odd,

an =

(-l)m-1.

if

1m

n = 1m, m > 0 (a0 = 0)

10.10

lxln+l

(n + l ) !

PROBLEM SET

10.11

5. -1

3. -1

1. -4

11. 2i

9. 1
15. -9 +sv'3;

13. 3 +4i
19.

2i
1
5 -5

25.

v'3

21.
2i

1
10

3i
10

27. -i

33.

31. i

PROBLEM SET

23.

3.

z =

5.

z=

z=

1
--=

v'2

v'3 i
- - 7

29. i

i
+5

35. 1

1ei"16

10.13

----='

----=

v'2 v'2

i
--

v'2

-2, 1 + v3 i, 1
-1, i, -i
-

10.12

PROBLEM SET

z=

5-5

5.

1.

17.

l , i, -i,

v2

1
'

- v3

- ----=

v'2

--

v'2

1
'

----=

v'2

i
----=

v'2

i
1

v'2
v'2 v2 - v2'
'

v2'

1
v'2

- v2


PROBLEM SET

1.

3.

PROBLEM SET
5.
9.

10.14

GO (i + l)Xi

.L

7.

.,
I.

i=O

Jim /n(O) = 1,Iim fn(x) =

PROBLEM SET

if 0 <

U Jim does not exist

n-+co

x 1,U lim does not exist

5. 5x - Sy + z = 3
9. The figure is a sphere with equation x^2 + y^2 + z^2 = 4.

11.2

+L +_:_
3

__l_

v'3 v' v'3 v'3=0


2y

(2i +l)x2i
(2 I")I

'+l

( -1)'

3. x + y +z = t

7. x - y +2z = 6

2ze"2

11.1

1. x + 3y + 2z = 14

PROBLEM SET

.-

11. Iim fn(x) = 0,

oo

n--+ co

i=O

lim/n(x) = 0 = U lim/n(X)

n--+ oo

i.

5.

10.15

n-+co

13.

cos

2z

-
'

v'3

2y

_l'_

1
_:_ + - -

v'3 v'3 v'3=0


_

2z

3. - + - +- - 1 = 0 ' - - - - - - + 1 = 0
3
3
3
3
3
3
2
2
5 __!__ + Y + + = 0 - __!__ - Y - - = 0
l
1
1
1
1
l

'
l
7

(:3' :3' :3)


- 3 =0 11.
v'5 v'5 v'5 ' Vs
x
2y
- +-

PROBLEM SET

-x

2y
3
-- +- =0

Vs

Vs

11.3

1. a= 0,P0 = (1,1, -1)

3. a= 0, P0= (1, 1 , 1)

5. a=0,P0=(-3,-1,4)

7. a= 1 , P0=(0,0, 1)

9.

a= 4,P0= (1,!, !)

13.
=>
=>
=>

15.

11. i = Vi

cx1V1 + cx2V2 + cx3V3 = 0


CXl (i +
j) + <X2(j +k) +CX3(k) = 0
cx1i +(cx1 + cx2)j +(cx1 + cx2 + cx3)k = 0
CX1 = 0, CX1 +CX 2 = 0 so CX2 = 0,and
CXl + CX 2 + CX3 = 0 SO CX3 = 0

i = t(V1

+V2), j = t(V2 +Va), k =!(Vi +Vs)

V2 + V3, j = V2 - V3, k =V3


=:>

xi

+yj+zk =2 (V1 +V2) + 2

z
(V2+ V3)+2. (Vi


+ V3)

=!(x+z)V1 +!(x +y)V2+t(y +z)V3,


'1V1 + '2 Vi + '3V3 = 0
'1(i - j +k)+ '2(i+j - k)+ <X3(-i +j+k) = 0

'1
=> a:1
=:>

+ '2 - <X3 =0, -<Xi+ '2+ 'a = 0,


=0, a:2 =0, 'a =0 by elimination

'1 - '2+ <X3 = 0

33. V1 = i - j = (1, -1, 0) satisfies x + y + 2z = 0, as does V2 = 2j - k = (0, 2, -1).
Any vector of the form aV1 + bV2 can be written a(i - j) + b(2j - k) = ai + (2b - a)j - bk, and this satisfies the equation x + y + 2z = 0. If α1V1 + α2V2 = 0, then α1(i - j) + α2(2j - k) = 0; therefore α1 i + (2α2 - α1)j - α2 k = 0; therefore α1 = 0 and α2 = 0, since i, j, and k are linearly independent. Any vector (-y - 2z, y, z) in E can be written as (-y - 2z)V1 - zV2, and so V1 and V2 span E.
35. 4x - 2y+ z =0
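
The argument in answer 33 can be spot-checked numerically. The sketch below assumes, as that answer states, that E is the plane x + y + 2z = 0 and that V1 = (1, -1, 0) and V2 = (0, 2, -1); the helper names are hypothetical and integer arithmetic is used so the comparisons are exact. It illustrates the claims on sample values and is not a proof.

```python
V1 = (1, -1, 0)   # i - j
V2 = (0, 2, -1)   # 2j - k


def in_plane(v):
    """True when v = (x, y, z) satisfies x + y + 2z = 0."""
    x, y, z = v
    return x + y + 2 * z == 0


def combo(a, b):
    """The linear combination a*V1 + b*V2, computed componentwise."""
    return tuple(a * p + b * q for p, q in zip(V1, V2))


def coefficients(y, z):
    """Coefficients (a, b) with a*V1 + b*V2 = (-y - 2z, y, z)."""
    return (-y - 2 * z, -z)


if __name__ == "__main__":
    print(in_plane(V1), in_plane(V2))       # both vectors lie in E
    print(combo(*coefficients(5, -3)))      # rebuilds the point with y = 5, z = -3
```

Only the zero coefficients give the zero vector, because the first component of a*V1 + b*V2 is a and the third is -b.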
PROBLEM SET
1.

'1

=1,

'2

11.4

= -2,

'a

=1

5. {E1 - E2, 2E2+ Ea}


9.

3. '1 =4, '2 = -5, 'a = 2, <X4= -1


7. {E1+E2, E2, Ea+E4, E4}

{2E1 - 2E2 - Ea}

PROBLEM SET

11.5

(E1+E2+E3), -1- (-E1 - E2+2E3)}


1. {
v'3
(2E1 -E2 -Ea), (E2 - Ea)}
{ 1 (E1 E2+Ea),
v'6
v'2
v'6

3.

9.

11.

v'3

Va= E1 - E2 - E4, V4

=-E2+Ea+E4

{ (E1 - EJ}

PROBLEM SET

11.6

1. [Hint: Expand (x1y2 - x2y1)^2.]
3. No. D.4 fails for an obtuse triangle.
5.

D.1  d(P, Q) = max {|x - a|, |y - b|} ≥ 0, since both are ≥ 0.
D.2  d(P, Q) = 0 ⇒ max {|x - a|, |y - b|} = 0 ⇒ |x - a| = 0, |y - b| = 0 ⇒ P = Q
D.3  d(P, Q) = max {|x - a|, |y - b|} = max {|y - b|, |x - a|} = d(Q, P)
D.4  d(P, R) = max {|x - u|, |y - v|} = max {|x - a + a - u|, |y - b + b - v|}
     ≤ max {|x - a| + |a - u|, |y - b| + |b - v|}
     ≤ max {|x - a|, |y - b|} + max {|a - u|, |b - v|}
     = d(P, Q) + d(Q, R)
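
The distance d(P, Q) = max {|x - a|, |y - b|} in answer 5 can also be tested by brute force against the four postulates D.1 to D.4. The sketch below is a sanity check on a small grid of integer points, not a proof; the function names and the choice of grid are assumptions made for illustration.

```python
import itertools


def d(P, Q):
    """Max-distance between P = (x, y) and Q = (a, b)."""
    return max(abs(P[0] - Q[0]), abs(P[1] - Q[1]))


def satisfies_metric_postulates(points):
    """Check D.1-D.4 for every pair and triple drawn from points."""
    for P, Q in itertools.product(points, repeat=2):
        if d(P, Q) < 0:                       # D.1: nonnegativity
            return False
        if (d(P, Q) == 0) != (P == Q):        # D.2: zero only for P = Q
            return False
        if d(P, Q) != d(Q, P):                # D.3: symmetry
            return False
    for P, Q, R in itertools.product(points, repeat=3):
        if d(P, R) > d(P, Q) + d(Q, R):       # D.4: triangle inequality
            return False
    return True


if __name__ == "__main__":
    grid = [(x, y) for x in range(-2, 3) for y in range(-2, 3)]
    print(satisfies_metric_postulates(grid))
```

Integer coordinates keep every comparison exact, so a failure would point to the distance function rather than to rounding.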


1T2

13. /(x) = x2 - 3

PROBLEM SET

12.1

7T2

2
.
1. a0= 0, ai =0, bi= -: (-1)'+1
5

(i37T'2)

'

(-l)i4

j2

--

'

2
bi = - (-l)i+l

E_ (-l)Hl
a=
Ob=
i
'
i

1T

7. ao =-4, ai= -;zl 1T


1T 1

9.

3 . a0 = - ai =
3

a0= -4 + -2, ai

1T

11. a0=-4

'

.
1
((-l)i - 1), bi=-: (-l)i+1

( 1 ) ((-1)'. - 1), bi=-:1 (-1)+. 1 +,1 ((-1)'.-1)


-;z-

l 1T

l1T

1
.
3
1
+
a=i
i
j ((-1)'- 1) ' b=-(-1)'

21T

PROBLEM SET

12.2

3. ! cos 3x +!cos x
7. -i cos 3x + t cos x

1. -! sin 3x + ! sin x
5. -! cos 3x + ! cos x
9. i - t cos 2x + i cos 4x
PROBLEM SET

12.3

1
2

7. a0= 1T ( -2 + e" + e-")


ai=

1
. (-2 + (e" + e-")(-l)i)
7r(l + 12)
i

( +2 + (e" + e-")(-I)i+l)
b=
'
7r(l + i2)

[l -u

PROBLEM SET

13.1

-4
3
-2

1.

M-

5.

/(R3) ={(a, b, c) I b =3a, c = -2a}

Ker/={(x,y, z) I 2x + y

3.

M-

[ ]
0

3
-1

z = O}

7. /(R3) = {(a, b, c) I 3b = 2c}


Ker f = {(x , y, z) I x + 2y = 0, z = o}

9. M-

-3 -I
3 -1
2
0

756

!
15
15
LP ]

PROBLEM SET

13.2

3. [ ;]
7 UJ
11. [l 64 !]
15. [ 001 ]
19. [ 130 ]
23. [l J
00 10
27. [! 10 00

5. [4741 5825 ]
9. LJ
13. [i 231 ]
17. [ 00 !]
21. [21327 158 2i]32
01 00
25. [ 00 10
]
29.

31.

no inverse

PROBLEM SET

7.13. 112-2.
I. D 11

-1,

2, D31

-1

l
[; :1
17. -2-2
5. 1

G33 G44

13.5
-D21

M-'

3.9. 56
15. -720

13.4

1. G11G22 - G12G21

PROBLEM SET


11.

3.

z =

5047


5.

[ :]

7.

[! H]

11.

[ _!]

[i !]
0
5

2
0

PROBLEM SET

1
0

0
15.

-1

-1

0
1

1
5
0
0

13.6

1. independent

3. independent

5. independent

7. independent

9. independent

11. independent

13. independent

15. independent

19. independent

21. independent

PROBLEM SET

1. {e

2"'

, e

} , f(x)

PROBLEM SET

1. "fl/'= {

5. "fl/' = { IX1e"'
9. H
13. H
15. H
17. H
19. H
21. H

dependent

13.7

3"'

"'
1X1e-

17.

2"'

3. {e3"',

xe

} , f(x)

3"'

(1

2x)e3"'

13.8
+

1X2

{ IX1 sin x
{ IX1 sin x
{ 1X1 sin x
{ 1X1 sin x
{ 1X1 sin x
{ IX1 sin x

2
x
IX3 X e-

X
IX2Xe-

+
+
+
+
+
+

cos

+
x +

IX2

cos

IX2

cos

IX2

cos

IX2

cos

IX2

cos

IX2

cos

} 3. "fl/'= { 1X1e"' 1X2Xe"' + 1X3e-2"' + IX4Xe-2"'}


+
7. "fl/' = { IX1 + 1X2e"' + IX3Xe"' + 1X4e-X + IX5Xe- "'}
IX3 sin x}
- tx cos x}
11. H = { IX1 sin x + IX2 cos x + ix sin x}
x}
+
+ ie"' sin x - -}e"' cos x}
-}e"' sin x ie"' cos x}
+
+
!xe"')
2
+
ix cos x + !x sin x}
+

3
"'
IX4X e-

PROBLEM SET

14.1

19. 9(x2 + z2) = 2y2

17. x2 + z2= (1 - y)2

23. y2 + z2

21. z2 + y 2 = 10x2

25. x2 + y2 = cos2 z, 0
PROBLEM SET

7T/2

14.2

1. ellipsoid

3. one-sheeted hyperboloid

7. elliptic cone

9. hyperbolic paraboloid
19.

13. hyperbolic paraboloid

PROBLEM SET

21. 87T

2y3

fx=

15.

/.,

17.

fx= (x2

19.

,/y= --2 - ,/xy = --2- = fyx


fx = -y
y
y

21

f"'

/v =
1 ,

v'x 4 + y4 +
_

x2)

(x2 + y2)2

x(x2

.fv = (x2

+ y2)2

'fv= (x2

/xv=

y2)

+ y2)2

(x2 + y2)y cos xy

+ y2)2

.fxv=

'/xv=

2x sin xy

(x2 + y2)2

+ y 4 + 1 )312

.,...y4 + 6x2y2

x4

(x2 + y2) 3

8xy(x2 - y2)
(x2 + y 2)3

f'

>Jv

=/vx

=fvx

=f vx

(x2 + y2)x cos xy - 2y sin xy


(x2 + y 2)2

(-cos xy - xy sin xy)(x2 + y2)2 + Sxy sin xy


(x2 + y2)3

PROBLEM SET

-4x3y3

,/xv= (x4
1

-cos x

-sin x

cos x

v'x 4 + y 4 +

-4yx 2

4xy2

11. hyperbolic paraboloid

14.3

2x 3

y(y2

5. elliptic cone

13.

e2'"

=fyx

14.4

= 1, B = 2, E1 = Ay, E2= 0
3. A = 1, B
2, E1 = Ay2, E2 = Ay + 2Ax
5. A = 2, B = -2, E1 = Ax, E2 = -y
7. A = 2, B
2, E1 = Ax, E2 = Ay
. 9. A = 2, B
-8, E1 = Ax, E2 = 4Ay
11. f_α(1, 1) = 2 cos α - 2 sin α, maximum at α = -π/4, minimum at α = 3π/4
1. A

PROBLEM SET 14.5

1. φ'(t) = -(5t^4 + 3t^2) sin (t^5 + t^3)

3. φ'(t) = 2 cos 2t

5. φ'(t) = 5t^4


PROBLEM SET
l.

<Pt= 4t3u2

14.6

2u2, <Pu= 2t4u

+ 2t +

4ut + 4u3

3. <Pt= 3t2 cos/cosg - sin/sing, <Pu= 4u3 cos/cosg - sin/ sing


5. <Pt= 2t,
9
. E1
t:it
=

<Pu= 0

7. <Pt= 0, <Pu= 0

fiu fiw(l + fiv), E2= fiu tiw(l + fiv), E3 = 0, E4 = 0, E5= 0

PROBLEM SET

14.8

1. no ILMax, no ILMin

3. no ILMax, no ILMin

5. no ILMax, ILMin at ( !
9
. ILMax at (0, 0), no ILMin
-

7. no ILMax, ILMin at(-!, -!)

!)

11. maximum value at x= (1/v2, 0) and x = (-1/v2, 0), maximum value= t


13. maximum volume

16

v3

15. maximum volume=

PROBLEM SET

'coordinates

v3' v3 ' v3

32-V3

9- 'coordinates=

v3

1376
3.
21

7. 0

544

27

14.10

1. 27r/l 1

3. 151T/16

5. ct

7. 0

14.11

28
1. 9
x = 3/27r,

9.

-11/6

11. 0

PROBLEM SET

5.

2
2
,
v3 v3

14.9

1. 35

PROBLEM SET

3.
ji = 0

PROBLEM SET

x = 0,

ji = 0

7. x = 0, ji = - t
3

11. x =- y-=o
4'

13. (0, 0)

14.12

1. 0

3.

7. 0

5. 3/7
11. 3

v3)27r

Index

absolute convergence, 447
absolute value, of a real number, 9 ff
of a complex number, 481
acceleration, 119, 422
of gravity, 120
addition formula, for the cosine, 136
for the sine, 138
algebraic substitutions, 291 ff
ALO, 7
alternating function, 585
alternating series, 446
alternating series test, 447
amplitude, of a complex number, 488
annihilation theorem, 435
antidifferentiation, 255
AO, 4
apocryphal anecdote, 120
approximations by trigonometric polynomials, 549
arc length, 303 ff
formula for, 306
Archimedes, 57
area, of a parabolic sector, 57 ff
of a surface, 327 ff
areas in polar coordinates, 402 ff
arithmetic series, 49
associativity, 1
asymptote, 368 ff
axis, of a parabola, 38

Banchoff, Thomas F., 56
base plane, 667
basis, for a vector space, 521
betweenness theorem for integrals, 312
binomial series, 468 ff
binomial theorem, 468
induction proof of, 472, 484
bound, of a function, 86
bounded above, 193
bounded away from 0, 696
sequence, 517
bounded below, 194
bounded function, 86
bounded sequence, 433, 695
set, 669
box, for a function, 77

cardioid, 400
Cartesian n-space, 526
center, of an ellipse, 362
of any point-symmetric set, 365
centroid, 335 ff
of nonhomogeneous bodies, 674
chain rule, 159 ff
proof of, 162
for paths, 642, 646
for functions of many variables, 647
chord, of a function, 72
circle, 23
circular cone, 621
closed interval, 12
disk, 493
set, 669
commutative group, 578
commutative law, for real numbers, 1
for positive series, 476
comparison theorem, for positive series, 441
completeness, of the real number system, 238 ff
completing the square, in integration, 297 ff
complex numbers, 479 ff
construction of, 535, 713
components of a vector, 418
composition, of functions, 154 ff
for functions of two variables, 629
concave, 215
cone, 41
conical surface, 618
conjugate, of a complex number, 480
conjugate hyperbolas, 370
constant function, 89
continuity, of a function, 75 ff
of a function of two variables, 627
of composite functions, 520
continuous, definition of, 77
continuous function, 75
elementary theorems on, 85
continuous interest, 223
convergent, 431
coordinate functions, 382
coordinate system, 16 ff
Cos, Cos-1, 170
cosh, 203
cosine, series for, 466


coth, 203
Cramer's rule, 593
cross section, 321
csch, 203
curvature, 425 ff
cycloid, 391
cylinder, 316, 614
cylindrical coordinates, 666
cylindrical shell, 316
solid, 614
surface, 614
D, 95
Dx, 96
damped oscillation, 604
DCP, 239
εδ-box, 79
decreasing function, 206
sequence, 433
Dedekind cut postulate, 239
definite integral, definition of, 308 ff, 312
De Moivre's theorem, 489 ff
density function, 675, 677
derivative, of a function, 69 ff
of an arbitrary power of x, 202
of one function with respect to another, 250 ff
of the integral, 109 ff
of the integral, proof of formula for, 124 ff
determinant function, 582
df/dg, 250
diameter, of a plane set, 668
differences, Δx and Δf, 140
differentiability, for functions of two variables, 638
differentiable function, 89
differentials, 148 ff
differentiation, 89 ff
of complex power series, 496
of series, 453, 505
for functions of many variables, 644
dimension, of a vector space, 529
dimension theorem, 610
directed angles, 128
distance, 512
normal form, 512
segments, 415
direction angles, 513
cosines, 514
directional derivatives, 634, 648
directrix, of a parabola, 38
disk, 493
distance formula, 19
from a point to a line, 38
in an inner-product space, 534
in polar coordinates, 401
distributive law, 1
diverges to infinity, 435
divisor, 55
domain, of a function, 64
of convergence, 462

dot product, 411


double integrals, 660, 666

e, definition of, 198
existence of, 198
as a limit as x → ∞, 221
series for, 464
eccentricity of a conic section, 380
ellipse, 360 ff
area of, 62
ellipsoid, 620
elliptic cone, 621
epicycloid, 392
epsilon-delta box, 77
equivalence, 5
of directed segments, 415
equivalence class, of directed segments, 416
equivalent to, 509
error in Simpson's rule, 702
estimates of remainders, 448 ff
even function, 339
permutation, 584
exact differential, 683
existence, of maxima, 225, 244
of minima, 227, 244
existence theorems, use in finding maxima and minima, 223 ff
exp, 194
laws for, 196
series for, 464
exponential function, in the complex domain, 484 ff
exponentials and logarithms, 185 ff

factor, 55
falling body, 120
field postulates, 1, 580
finite covering theorem, 350
finite dimensional vector space, 529
first octant, 509
focal difference, 366
sum, 360
focus, of a hyperbola, 366
of a parabola, 38
of an ellipse, 360
force, 119
Fourier coefficients, 546
type, 561
fractional exponents, 101
free vectors, 415 ff
frustum, 329
function, 63
of g, 252
of x, 255
of two variables, 627
function-graph, 67
functional equations, 232 ff
differentiation of, 235
fundamental theorem of integral calculus, 255

Galileo, 120
general equation of the second degree, 372 ff
geometric series, 49
gradient, 649
graph, of a condition, 21 ff
of a function, 66
of an inequality, 33 ff
group, 578
half, of an interval, 241
half-open interval, 13
half-plane, 34
harmonic series, 439
Heaviside function, 76
homogeneous differential equation, 611
Hooke's law, 605
hyperbola, 366 ff
hyperbolic functions, 203 ff
hyperboloid of one sheet, 622
of two sheets, 623
hypocycloid, 385, 391
ILMax, 213
ILMin, 212
Im z, 482
image, of a function, 67
implies, 6, 507
improper integrals, 344 ff
increasing function, 206
sequence, 433
indefinite integral, 256
integration, 257
independent variable, 255
induction principle, 51 ff
inequalities, 4 ff
infinite interval, 13
infinity, 13
inflection point, 215
initial side, 128
inner product,411
space, 412, 518
inscribed broken line,303
inside function, 155
integers modulo 2, 3
modulo 4, 3
integrability of continuous functions, 353
integrable function, 312
integral, of a nonnegative function, 102 ff
integral test for infinite series, 445
integrand, 106
integration by parts, 273 ff
integration, of power series, 454, 503
of Fourier series, 556
interior local maximum, 213
local minimum, 212
interior of a circle, 22
interior point, of an interval, 176, 207
of a set in a plane, 653
interval, 12
inverse, of an invertible function, 165

invertible function, 165


derivative of, 168
iterated integral, 665
limits, 717
kernel, 568
law of cosines, 135
Least Upper Bound Postulate, 243
left-handed coordinate system, 129
length of a path, 405 ff
formula for, 407
level curves, 655
l'Hôpital's rule, 388, 389, 393 ff
lim x→∞, lim x→-∞, lim x→a+, lim x→a-, 217 ff
limit, 45
definition of, 80, 81
of a function of two variables, 627
of a sequence, definition, 431
of integration, 105
limit point of a set, 669
limits, theorems on, 82 ff
line integrals, 680
linear combination, 413, 521
differential equation, 601
equation in x, y, and z, 510
function, 563
transformation, 563
linearly dependent, 421, 521
independent, 421, 521, 596
Lipschitzian function, 82, 315, 355
LMax, 212
LMin, 212
ln, 191 ff
laws for, 196
series for, 455
local maximum, 212
minimum, 212
locally bounded function, 88, 511, 691
locally bounded away from 0, 693
locus of a path, 381
logarithms, 186 ff
laws of, 189
logic and set theory, 687
lower bound, 194
lower limit of integration, 105
lower sum, 309
LUBP, 243

Maclaurin series, 473


matrix, 564
maxima, 211 ff
for functions of several variables, 651
mean-value theorem, 72 ff, 246
measurable set, in a plane, 527, 668, 705
measurable solid, 318
"measure," of a directed angle, 133
mesh of a net over an interval, 304
over a plane region, 668


metric space, 539


minima, 211 ff
minor, of an element of a matrix, 590
mixed partial derivatives, 717
MLO, 7
MO, 4

modulus, of a complex number, 488


moments, 335 ff
of nonhomogeneous bodies, 674
multiplication of matrices, 573
MVT, 73, 246

natural logarithm, 191


neighborhood of a point, on a line, 152
in a plane, 653
Nested Interval Postulate, 240
net, over an interval, 303
over a plane region, 668
Newton's second law, 120
NIP, 240
no-jump theorem, 191, 249
for derivatives, 314
nonoverlapping solids, 318
nonsingular matrix, 593
norm, 520
normal component, 423
normal set of vectors, 531
normal vector space, 538
Northeast theorem, 393
proof of, 527, 707
octant, 509
odd function, 339
odd permutation, 584
open disk, 493
interval, 12
sentence, 5
order, 3
orthogonal vectors, 531
orthonormal basis, 531
outside function, 155
Pappus' theorem, for surfaces, 342
for volumes, 340
parabola, 38 ff
paraboloid of revolution, 41
parallel, 30
parallelepiped, 316
parallelogram law, 412
parameter, 382
parametric mean-value theorem, 387
parametric slope formula, 386
partial derivatives, 631
partial fractions, 298 ff
partial sum, 431
path, 381
path length, formulas for, 407, 408
proof of formula for, 531, 711
path of motion, 41

permutation, 583
perpendicular, 30
plane path, 381
p(N), 303
point of contact, 43
point-slope form, 30
polar coordinates, 397 ff
distance formula in, 401
polar form, of a complex number, 487
polygonal inequality, 55
polynomial, 86
in x and y, 633
power series, 453
powers of functions, 99
prime number, 55
projection into a subspace, 541
Pythagorean theorem, 19, 22
quadrant, 20
quadratic function, 179
quadric surface, 620
radius of convergence, 494
range, of a function, 64
ratio test, 457 ff
ray, 25, 128
real-analytic, 468
rectangular equation of a locus, 382
rectangular hyperbola, 370
rectifiable, 304
reduction formula, 277, 284
reflecting property, of a parabola, 48
regular path, 392
Re z, 482
right cylinder, 614
right-handed coordinate system, 129
ring, 579
RO, 7
Rolle's theorem, 246
roots of functions, 97
rotation of axes, 374 ff
ruler postulate, 17
saddle point, 655
sample, of a net, 310
of a net over a plane region, 669
sample sum, 310
scalar, 411
Schwarz inequality, 521, 536
Sec, Sec-1, 173
secant line, 45
sech, 203
second derivative, 119
segment, 25
signum, 108
Simpson's rule, 176 ff
error in, 524
Sin, Sin-1, 169
sine, series for, 466
sinh, 203
slice function, 630, 634


slope, of a segment, 28
of a line, 29
slope-intercept form, 29
S(N), 309
s(N), 309
solid torus, 327
solution set, 12
span, 414, 521
sphere, 620
square root, 10
squeeze principle, for functions, 145, 222
for sequences, 435, 518, 696
for volumes of solids, 318
standard position, for an angle, 133
for an ellipse, 362
for a parabola, 40
subadditive function, 537
subspace, 522
substitution, integration by, 284 ff, 291 ff
justification of, 290
summation notation, 49
sup, 243
supremum, 243
surface of revolution, 327, 616
symmetric, 338

Tan, Tan-1, 171-172
Tan-1, series for, 455
tangent, 43 ff
to a parabola, 45
tangent vector, 423
tangential component, 423
tanh, 203
Taylor series, 473 ff
Taylor's theorem, 477 ff
terminal side, 128
term-wise integration of series, 453
transitivity, 4
translation, of axes, 356 ff
of a coordinate plane, 415
transpose, of a matrix, 587
transposition, 583
triangular inequality, 11
trichotomy, 4
trigonometric functions, of angles, 129
of numbers, 131

U lim, 502
unbounded, 193-194
uniform convergence, 502
norm, 538
uniformly accelerated motion, 119
uniformly continuous, 315, 353
uniqueness theorem, 113
unit element, 579
unit tangent vector, 423
upper bound, of a function, 193
of a set, 243
upper limit of integration, 105
upper sum, 309

value, of a function, 67
vector space, 411
vectors, in a plane, 409 ff
operations on, 410 ff
velocity, 119, 422
vertex, of a parabola, 39

weight, 120
well-ordering principle, 54
winding function, 130
