
Foundations of Applied Mathematics

Volume 1

Mathematical Analysis

JEFFREY HUMPHERYS
TYLER J. JARVIS
EMILY J. EVANS

BRIGHAM YOUNG UNIVERSITY


"
SOCIETY FOR INDUSTRIAL
AND APPLIED MATHEMATICS

PHILADELPHIA
Copyright © 2017 by the Society for Industrial and Applied Mathematics

10 9 8 7 6 5 4 3 2 1

All rights reserved. Printed in the United States of America. No part of this book may be reproduced, stored, or
transmitted in any manner without the written permission of the publisher. For information, write to the Society
for Industrial and Applied Mathematics, 3600 Market Street, 6th Floor, Philadelphia, PA 19104-2688 USA.

No warranties, express or implied, are made by the publisher, authors, and their employers that the programs
contained in this volume are free of error. They should not be relied on as the sole basis to solve a problem
whose incorrect solution could result in injury to person or property. If the programs are employed in such
a manner, it is at the user's own risk, and the publisher, authors, and their employers disclaim all liability for
such misuse.

Python is a registered trademark of Python Software Foundation.

PUBLISHER David Marshall
EXECUTIVE EDITOR Elizabeth Greenspan
DEVELOPMENTAL EDITOR Gina Rinelli Harris
MANAGING EDITOR Kelly Thomas
PRODUCTION EDITOR Louis R. Primus
COPY EDITOR Louis R. Primus
PRODUCTION MANAGER Donna Witzleben
PRODUCTION COORDINATOR Cally Shrader
COMPOSITOR Lumina Datamatics
GRAPHIC DESIGNER Lois Sellers
COVER DESIGNER Sarah Kay Miller

Library of Congress Cataloging-in-Publication Data

Names: Humpherys, Jeffrey, author. | Jarvis, Tyler Jamison, author. | Evans, Emily J., author.
Title: Foundations of applied mathematics / Jeffrey Humpherys, Tyler J. Jarvis, Emily J. Evans, Brigham Young University, Provo, Utah.
Description: Philadelphia : Society for Industrial and Applied Mathematics, [2017]- | Series: Other titles in applied mathematics ; 152 | Includes bibliographical references and index.
Identifiers: LCCN 2017012783 | ISBN 9781611974898 (v. 1)
Subjects: LCSH: Calculus. | Mathematical analysis. | Matrices.
Classification: LCC QA303.2 .H86 2017 | DDC 515--dc23
LC record available at https://lccn.loc.gov/2017012783


SIAM is a registered trademark.
Contents

List of Notation ix

Preface xiii

I Linear Analysis I 1

1 Abstract Vector Spaces 3


1.1 Vector Algebra . . . . . . . . . . . . 3
1.2 Spans and Linear Independence .. 10
1.3 Products, Sums, and Complements 14
1.4 Dimension, Replacement, and Extension 17
1.5 Quotient Spaces 21
Exercises . . . . . . . . . . . . . . . . .. . 27

2 Linear Transformations and Matrices 31


2.1 Basics of Linear Transformations I . 32
2.2 Basics of Linear Transformations II 36
2.3 Rank, Nullity, and the First Isomorphism Theorem 40
2.4 Matrix Representations .. . . . . . . . . . . . 46
2.5 Composition, Change of Basis, and Similarity 51
2.6 Important Example: Bernstein Polynomials 54
2.7 Linear Systems 58
2.8 Determinants I . 65
2.9 Determinants II 70
Exercises . . . . . . . . 78

3 Inner Product Spaces 87


3.1 Introduction to Inner Products. 88
3.2 Orthonormal Sets and Orthogonal Projections 94
3.3 Gram-Schmidt Orthonormalization .. 99
3.4 QR with Householder Transformations 105
3.5 Normed Linear Spaces . . . 110
3.6 Important Norm Inequalities . . . . . . 117
3.7 Adjoints . . . . . . . . . . . . . . . . . 120
3.8 Fundamental Subspaces of a Linear Transformation 123
3.9 Least Squares 127
Exercises . . . . . . . . . . . . . 131


4 Spectral Theory 139


4.1 Eigenvalues and Eigenvectors . 140
4.2 Invariant Subspaces 147
4.3 Diagonalization . . . . . . . . 150
4.4 Schur's Lemma . . . . . . . . 155
4.5 The Singular Value Decomposition 159
4.6 Consequences of the SVD . 165
Exercises . .. . .. . 171

II Nonlinear Analysis I 177

5 Metric Space Topology 179


5.1 Metric Spaces and Continuous Functions 180
5.2 Continuous Functions and Limits . . . . 185
5.3 Closed Sets, Sequences, and Convergence 190
5.4 Completeness and Uniform Continuity . 195
5.5 Compactness . .. . . . . . . . . . . . . . 203
5.6 Uniform Convergence and Banach Spaces . 210
5.7 The Continuous Linear Extension Theorem . 213
5.8 Topologically Equivalent Metrics . 219
5.9 Topological Properties .. 222
5.10 Banach-Valued Integration 227
Exercises . . . . . . . . .. . .. . 233

6 Differentiation 241
6.1 The Directional Derivative 241
6.2 The Fréchet Derivative in ℝ^n . . 246
6.3 The General Fréchet Derivative 252
6.4 Properties of Derivatives . . . . 256
6.5 Mean Value Theorem and Fundamental Theorem of Calculus 260
6.6 Taylor's Theorem 265
Exercises . . . . . . . . . . . . . . .. . . . . . 272

7 Contraction Mappings and Applications 277


7.1 Contraction Mapping Principle . .. .. 278
7.2 Uniform Contraction Mapping Principle 281
7.3 Newton's Method . . . . . . . . . . . . . 285
7.4 The Implicit and Inverse Function Theorems 293
7.5 Conditioning 301
Exercises . .. . . . . . .. . . . . . . .. . . . . . . 310

III Nonlinear Analysis II 317

8 Integration I 319
8.1 Multivariable Integration . . . . . . . . . . 320
8.2 Overview of Daniell-Lebesgue Integration 326
8.3 Measure Zero and Measurability . . . . . . 331

8.4 Monotone Convergence and Integration on


Unbounded Domains . . . . . . . . . . . . 335
8.5 Fatou's Lemma and the Dominated Convergence Theorem 340
8.6 Fubini's Theorem and Leibniz's Integral Rule 344
8.7 Change of Variables 349
Exercises . .. .. 356

9 * Integration II 361
9.1 Every Normed Space Has a Unique Completion 361
9.2 More about Measure Zero . . 364
9.3 Lebesgue-Integrable Functions .. . . . . . 367
9.4 Proof of Fubini's Theorem . . . . . . . . . 372
9.5 Proof of the Change of Variables Theorem 374
Exercises . . . . . . . . . . . . . . . . . . . . .. . 378

10 Calculus on Manifolds 381


10.1 Curves and Arclength . 381
10.2 Line Integrals . . . . . 386
10.3 Parametrized Manifolds . 389
10.4 * Integration on Manifolds 393
10.5 Green's Theorem 396
Exercises .. . . . . . 403

11 Complex Analysis 407


11.1 Holomorphic Functions 407
11.2 Properties and Examples 411
11.3 Contour Integrals . . . . 416
11.4 Cauchy's Integral Formula 424
11.5 Consequences of Cauchy's Integral Formula . 429
11.6 Power Series and Laurent Series . . . . . . . 433
11.7 The Residue Theorem . . . . . . . . . . . . . 438
11.8 *The Argument Principle and Its Consequences . 445
Exercises .. . .. . .. .. . . . . . . . . . . . . . . . . . 451

IV Linear Analysis II 457

12 Spectral Calculus 459


12.1 Projections . . . . . . . 460
12.2 Generalized Eigenvectors 465
12.3 The Resolvent . . . . . . 470
12.4 Spectral Resolution . . . 475
12.5 Spectral Decomposition I 480
12.6 Spectral Decomposition II 483
12.7 Spectral Mapping Theorem . 489
12.8 The Perron-Frobenius Theorem 494
12.9 The Drazin Inverse . . . . 500
12.10 * Jordan Canonical Form . 506
Exercises . . . . . . . . . . . . . . 511

13 Iterative Methods 519


13.1 Methods for Linear Systems . . . . . . . . . 520
13.2 Minimal Polynomials and Krylov Subspaces 526
13.3 The Arnoldi Iteration and GMRES Methods 530
13.4 * Computing Eigenvalues I . 538
13.5 * Computing Eigenvalues II 543
Exercises . . . . . . . . . . . . . 548

14 Spectra and Pseudospectra 553


14.1 The Pseudospectrum . . . . . . . . . . 554
14.2 Asymptotic and Transient Behavior .. 561
14.3 * Proof of the Kreiss Matrix Theorem . 566
Exercises . . . . .. . . . . 570

15 Rings and Polynomials 573


15.1 Definition and Examples . . . . . . . . . 574
15.2 Euclidean Domains . . . . . . . . . .. . . 583
15.3 The Fundamental Theorem of Arithmetic . 588
15.4 Homomorphisms . . . . . . . .. . . . . . . 592
15.5 Quotients and the First Isomorphism Theorem . 598
15.6 The Chinese Remainder Theorem .. .. . . . . 601
15.7 Polynomial Interpolation and Spectral Decomposition . 610
Exercises . . . . . . . . . . . . . . . . . . . . . . . . . .. .. . 618

V Appendices 625

A Foundations of Abstract Mathematics 627


A.1 Sets and Relations . 627
A.2 Functions . . . . . . . . . . . . . .. . 635
A.3 Orderings . . . . . . . . . . . . . . . . 643
A.4 Zorn's Lemma, the Axiom of Choice, and Well Ordering 647
A.5 Cardinality . . . . . . . . . . . . . . . . . . . . . . . . . . 648

B The Complex Numbers and Other Fields 653


B.1 Complex Numbers. 653
B.2 Fields . . . . . . . . . . . . . . . . . .. . . 659

C Topics in Matrix Analysis 663


C.1 Matrix Algebra 663
C.2 Block Matrices . 665
C.3 Cross Products 667

D The Greek Alphabet 669

Bibliography 671

Index 679
List of Notation

†  indicates harder exercises  xvi
*  indicates advanced material that can be skipped  xvi
≅  isomorphism  39, 595
⊕  direct sum  16
⊕  addition in a quotient  24, 575, 599, 641
⊙  multiplication in a quotient  24, 575, 599, 641
> 0  (for matrices) positive definite  159
≥ 0  (for matrices) positive semidefinite  159
≻  componentwise inequality  494, 513
⟨·, ·⟩  inner product  89
⟨·⟩  the ideal generated by ·  581
⊥  orthogonal complement  123
×  Cartesian product  4, 14, 596, 630
|  divides  585
|·|  absolute value or modulus of a number; componentwise modulus of a matrix  4, 513, 654
‖·‖  a norm  111
‖·‖_F  the Frobenius norm  113, 115
‖·‖_{L^1}  the L^1-norm  134, 327
‖·‖_{L^2}  the L^2-norm  134
‖·‖_{L^∞}  the L^∞-norm, also known as the sup norm  113, 134
‖·‖_{V,W}  the induced norm on ℬ(V; W)  114
‖·‖_p  the p-norm, with p ∈ [1, ∞], either for vectors or operators  111, 112, 115, 116
[[·]]  an equivalence class  22
[·, ·, ·]  a permutation  66
𝟙  the all-ones vector 𝟙 = [1 ⋯ 1]^T  499
𝟙_E  the indicator function of E  228, 323
2^S  the power set of S  628

[a, b]  a closed interval in ℝ^n  321

a.e.  almost everywhere  329, 332
argmin_x f(x)  the value of x that minimizes f  533


B(x_0, r)  the open ball with center at x_0 and radius r  182
B_j^n  the jth Bernstein polynomial of degree n  55
∂E  the boundary of E  192
ℬ(V; W)  the space of bounded linear transformations from V to W  114
ℬ^k(V; W)  the space of bounded linear transformations from V to ℬ^{k-1}(V; W)  266
ℬ(V)  the space of bounded linear operators on V  114
ℬ(X; 𝔽)  the space of bounded linear functionals on X  214

ℂ  the complex numbers  4, 407, 627
C(X; Y)  the set of continuous functions from X to Y  5, 185
C_0([a, b]; 𝔽)  the space of continuous 𝔽-valued functions that vanish at the endpoints a and b  9
C_b(X; ℝ)  the space of continuous functions on X with bounded L^∞-norm  311
C^n(X; Y)  the space of Y-valued functions whose nth derivative is continuous on X  9, 253, 266
C^∞(X; Y)  the space of smooth Y-valued functions on X  266
C_{ST}  the transition matrix from T to S  48

D(x_0, r)  the closed ball with center at x_0 and radius r  191
Df(x)  the Fréchet derivative of f at x  246
D_i f  the ith partial derivative of f  245
D^k f(x)  the kth Fréchet derivative of f at x  266
D_v f(x)  the directional derivative of f at x in the direction v  244
D_v^k f(x)  the kth directional derivative of f at x in the direction v  268
D^k f(x)v^{(k)}  same as D_v^k f(x)  268
D_λ  the eigennilpotent associated to the eigenvalue λ  483
diag(λ_1, ..., λ_n)  the diagonal matrix with (i, i) entry equal to λ_i  152
d(x, y)  a metric  180
δ_{ij}  the Kronecker delta  95

E°  the set of interior points of E  182
Ē  the closure of E  192
ℰ_λ  the generalized eigenspace corresponding to eigenvalue λ  468
e_i  the ith standard basis vector  13, 51
e_p  the evaluation map at p  33, 594

𝔽  a field, always either ℂ or ℝ  4
𝔽^n  n-dimensional Euclidean space  5
𝔽[A]  the ring of matrices that are polynomials in A  576

𝔽[x]  the space of polynomials with coefficients in 𝔽  6
𝔽[x, y]  the space of polynomials in x and y with coefficients in 𝔽  576
𝔽[x; n]  the space of polynomials with coefficients in 𝔽 of degree at most n  9
f^{-1}(U)  the preimage {x | f(x) ∈ U} of f  186, 187

gcd(a, b)  the greatest common divisor  586

graph(f)  the graph of f  635

H_x  the Householder transformation of x  106

I(γ, z_0)  the winding number of γ with respect to z_0  441
ind(B)  the index of the matrix B  466
ℑ(z)  the imaginary part of z  450, 654

K(A)  the Kreiss matrix constant of A  562
𝒦_k(A, b)  the kth Krylov subspace of A generated by b  506, 527
κ(A)  the matrix condition number of A  307
κ  the relative condition number  303
κ̂  the absolute condition number  302
κ_spect(A)  the spectral condition number of A  560

L^1(A; X)  the space of integrable functions on A  329, 338, 363
L^p([a, b]; 𝔽)  the space of p-integrable functions  6
L_i  the ith Lagrange interpolant  603
L^∞(S; X)  the set of bounded functions from S to X, with the sup norm  216
ℒ(V; W)  the set of linear transformations from V to W  37
len(α)  the arclength of the curve α  383
ℓ(a, b)  the line segment from a to b  260
ℓ^p  the space of infinite sequences (x_j)_{j=1}^∞ such that Σ_{j=1}^∞ |x_j|^p converges  6
λ(R)  the measure of an interval R ⊂ 𝔽^n  323

M_n(𝔽)  the space of square n × n matrices with entries in 𝔽  5
M_{m×n}(𝔽)  the space of m × n matrices with entries in 𝔽  5

ℕ  the natural numbers {0, 1, 2, ...}  577, 628
N_i  the ith Newton interpolant  611
N  the unit normal  392
𝒩(L)  the kernel of L  35, 594

the eigenprojection associated to all the nonzero eigenvalues  500
P_λ  the eigenprojection associated to the eigenvalue λ  475
p_A(z)  the characteristic polynomial of A  143
proj_X(v)  the orthogonal projection of v onto the subspace X  96

proj_x(v)  the orthogonal projection of v onto span({x})  93

a subdivision  323
π_i  the ith projection map from X^n to X  33

ℚ  the rational numbers  628
Q_k  the points of ℝ^n with dyadic rational coordinates  374

ℝ  the real numbers  4, 627
R[[x]]  the ring of formal power series in x with coefficients in R  576
R(A, z)  the resolvent of A, sometimes denoted R(z)  470
Res(f, z_0)  the residue of f at z_0  440
r(A)  the spectral radius of A  474
r_ε(A)  the ε-pseudospectral radius of A  561
ℛ([a, b]; X)  the space of regulated integrable functions  324
ℛ(L)  the range of L  35
ℜ(z)  the real part of z  450, 654
ρ(L)  the resolvent set of L  141

S([a, b]; X)  the set of all step functions mapping [a, b] into X  228, 323
sign(z)  the complex sign z/|z| of z  656
Skew_n(𝔽)  the space of skew-symmetric n × n matrices  5, 29
Sym_n(𝔽)  the space of symmetric n × n matrices  5, 29
the λ-eigenspace of L  141
σ(L)  the spectrum of L  141
σ_ε(A)  the pseudospectrum of A  554

T  the unit tangent vector  382
T_pM  the tangent space to M at p  390
the set of compact n-cubes with corners in Q_k  374

v(a)  a valuation function  583

ℤ  the integers {..., -2, -1, 0, 1, 2, ...}  628
ℤ⁺  the positive integers {1, 2, ...}  628
ℤ_n  the set of equivalence classes in ℤ modulo n  575, 633, 660
Preface

Overview
Why Mathematical Analysis?
Mathematical analysis is the foundation upon which nearly every area of applied
mathematics is built. It is the language and intellectual framework for studying
optimization, probability theory, stochastic processes, statistics, machine learning,
differential equations, and control theory. It is also essential for rigorously describing
the theoretical concepts of many quantitative fields, including computer science,
economics, physics, and several areas of engineering.
Beyond its importance in these disciplines, mathematical analysis is also fun-
damental in the design, analysis, and optimization of algorithms. In addition to
allowing us to make objectively true statements about the performance, complex-
ity, and accuracy of algorithms, mathematical analysis has inspired many of the
key insights needed to create, understand, and contextualize the fastest and most
important algorithms discovered to date.
In recent years, the size, speed, and scale of computing has had a profound
impact on nearly every area of science and technology. As future discoveries and in-
novations become more algorithmic, and therefore more computational, there will be
tremendous opportunities for those who understand mathematical analysis. Those
who can peer beyond the jargon-filled barriers of various quantitative disciplines
and abstract out their fundamental algorithmic concepts will be able to move fluidly
across quantitative disciplines and innovate at their crossroads. In short, mathemat-
ical analysis gives solutions to quantitative problems, and the future is promising
for those who master this material.

To the Instructor
About this Text
This text modernizes and integrates a semester of advanced linear algebra with
a semester of multivariable real analysis to give a new and redesigned year-long
curriculum in linear and nonlinear analysis. The mathematical prerequisites are


vector calculus, linear algebra, and a semester of undergraduate-level, single-variable


real analysis. 1
The content in this volume could be reasonably described as upper-division
undergraduate or first-year graduate-level mathematics. It can be taught as a stand-
alone two-semester sequence or in parallel with the second volume, Foundations of
Applied Mathematics, Volume 2: Algorithms, Approximation, and Optimization, as
part of a larger curriculum in applied and computational mathematics.
There is also a supplementary computer lab manual, containing over 25 com-
puter labs that support this text. This text focuses on the theory, while the labs
cover application and computation. Although we recommend that the manual be
used in a computer lab setting with a teaching assistant, the material can be used
without instruction. The concepts are developed slowly and thoroughly, with nu-
merous examples and figures as pedagogical breadcrumbs, so that students can learn
this material on their own, verifying their progress along the way. The labs and
other classroom resources are open content and are available at

http://www.siam.org/books/ot152

The intent of this text and the computer labs is to attract and retain students
into the mathematical sciences by modernizing the curriculum and connecting the-
ory to application in a way that makes the students want to understand the theory,
rather than just tolerate it. In short, a major goal of this text is to entice them to
hunger for more.

Topics and Focus


In addition to standard material one would normally expect from linear and
nonlinear analysis, this text also includes several key concepts of modern applied
mathematical analysis which are not typically taught in a traditional applied math
curriculum (see the Detailed Description, below, for more information).
We focus on both rigor and relevance to give the students mathematical ma-
turity and an understanding of the most important ideas in mathematical analysis.

Detailed Description
Chapters 1-3 We give a rigorous treatment of the basics of linear algebra over both
ℝ and ℂ, including abstract vector spaces, linear transformations, matrices,
the LU decomposition, inner product spaces, the QR decomposition, and least
squares. As much as possible, we try to frame things in a way that does not
require vector spaces to be finite dimensional, and we give many infinite-
dimensional examples.

Chapter 4 We treat the spectral theory of matrices, including the spectral theorem
for normal matrices. We give special attention to the singular value decom-
position and its applications.
¹Specifically, the reader should have had exposure to a rigorous treatment of continuity, convergence, differentiation, and Riemann-Darboux integration in one dimension, as covered, for example, in [Abb15].

Chapter 5 We present the basics of metric topology, including the ideas of com-
pleteness and compactness. We define and give many examples of Banach
spaces. Throughout the rest of the text we formulate results in terms of Ba-
nach spaces, wherever possible. A highlight of this chapter is the continuous
linear extension theorem (sometimes called the bounded linear transforma-
tion theorem), which we use to give a very slick construction of Riemann (or
rather regulated) Banach-valued integration (single-variable in this chapter
and multivariable in Chapter 8).

Chapters 6-7 We discuss calculus on Banach spaces, including Fréchet derivatives


and Taylor's theorem. We then present the uniform contraction mapping
theorem, which we use to prove convergence results for Newton's method and
also to give nice proofs of the inverse and implicit function theorems.

Chapters 8-9 We use the same basic ideas to develop Lebesgue integration as we
used in the development of the regulated integral in Chapter 5. This approach
could be called the Riesz or Daniell approach. Instead of developing measure
theory and creating integrals from simple functions, we define what it means
for a set to have zero measure and create integrals from step functions. This
is a very clean way to do integration, which has the additional benefit of
reinforcing many of the functional-analytic ideas developed earlier in the text.

Chapters 10-11 We give an introduction to the fundamental tools of complex anal-


ysis, briefly covering first the main ideas of parametrized curves, surfaces, and
manifolds as well as line integrals, and Green's theorem, to provide a solid
foundation for contour integration and Cauchy's theorem.
Throughout both of these chapters, we express the main ideas and results in
terms of Banach-valued functions whenever possible, so that we can use these
powerful tools to study spectral theory of operators later in the book.

Chapters 12-13 One of the biggest innovations in the book is our treatment of
spectral theory. We take the Dunford-Schwartz approach via resolvents. This
approach is usually only developed from an advanced functional-analytic point
of view, but we break it down to the level of an undergraduate math major,
using the tools and ideas developed earlier in this text.
In this setting, we put a strong emphasis on eigenprojections, providing in-
sights into the spectral resolution theorem. This allows for easy proofs of
the spectral mapping theorem, the Perron-Frobenius theorem, the Cayley-
Hamilton theorem, and convergence of the power method. This also allows
for a nice presentation of the Drazin inverse and matrix perturbation theory.
These ideas are used again in Volume 4 with dynamical systems, where we
prove the stable and center manifold theorems using spectral projections and
corresponding semigroup estimates.

Chapter 14 The pseudospectrum is a fundamental tool in modern linear algebra.


We use the pseudospectrum to study sequences of the form ‖A^k‖, their asymp-
totic and transient behavior, an understanding of which is important both for
Markov chains and for the many iterative methods based on such sequences,
such as successive overrelaxation.

Chapter 15 We conclude the book with a chapter on applied ring theory, focused
on the algebraic structure of polynomials and matrices. A major focus of this
chapter is the Chinese remainder theorem, which we use in many ways, includ-
ing to prove results about partial fractions and Lagrange interpolation. The
highlight of the chapter is Section 15.7.3, which describes a striking connection
between Lagrange interpolation and the spectral decomposition of a matrix.

Teaching from the Text


In our courses we teach each section in a fifty-minute lecture. We require students
read the section carefully before each class so that class time can be used to focus
on the parts they find most confusing, rather than on just repeating to them the
material already written in the book.
There are roughly five to seven exercises per section. We believe that students
can realistically be expected to do all of the exercises in the text, but some are
difficult and will require time, effort, and perhaps an occasional hint. Exercises
that are unusually hard are marked with the symbol †. Some of the exercises are
marked with * to indicate that they cover advanced material. Although these are
valuable, they are not essential for understanding the rest of the text, so they may
safely be skipped, if necessary.
Throughout this book the exercises, examples, and concepts are tightly in-
tegrated and build upon each other in a way that reinforces previous ideas and
prepares students for upcoming ideas. We find this helps students better retain and
understand the concepts learned, and helps achieve greater depth. Students are
encouraged to do all of the exercises, as they reinforce new ideas and also revisit
the core ideas taught earlier in the text.

Courses Taught from This Book


Full Sequence

At BYU we teach a year-long advanced undergraduate-level course from this book,


proceeding straight through the book, skipping only the advanced sections and
chapters (marked with *), and ending in Chapter 13. But this would also make a
very good course at the beginning graduate level as well.
Graduate students who are well prepared could be further challenged either
by covering the material more rapidly, so as to get to the very rewarding material
at the end of the book, or by covering some or all of the advanced sections along
the way.

Advanced Linear Algebra

Alternatively, Chapters 1-4 (linear analysis part I), Section 7.5 (conditioning), and
Chapters 12-14 (linear analysis part II), as well as parts of Chapter 15, as time
permits, make up a very good one-semester advanced linear algebra course for
students who have already completed undergraduate-level courses in linear algebra,
complex analysis, and multivariate real analysis.

Advanced Analysis
This book can also be used to teach a one-semester advanced analysis course for stu-
dents who have already had a semester of basic undergraduate analysis (say, at the
level of [Abb15]). One possible path through the book for this course would be to
briefly review Chapter 1 (vector spaces), Sections 2.1-2.2 (basics of linear transfor-
mations), and Sections 3.1 and 3.5 (inner product spaces and norms), in order to set
notation and to remind the students of necessary background from linear algebra,
and then proceed through Part II (Chapters 5-7) and Part III (Chapters 8-11).
Figure 1 indicates the dependencies among the chapters.

Advanced Sections
A few problems, sections, and even chapters are marked with the symbol * to
indicate that they cover more advanced topics. Although this material is valuable,
it is not essential for understanding the rest of the text, so it may safely be skipped,
if necessary.

Instructors New to the Material


We've taken a tactical approach that combines professional development for faculty
with instruction for the students. Specifically, the class instruction is where the the-
ory lies and the supporting media (labs, etc.) are provided so that faculty need not
be computer experts nor familiar with the applications in order to run the course.
The professor can teach the theoretical material in the text and use teaching
assistants, who may be better versed in the latest technology, to cover the applica-
tions and computation in the labs, where the "hands-on" part of the course takes
place. In this way the professor can gradually become acquainted with the appli-
cations and technology over time, by working through the labs on his or her own
time without the pressures of staying ahead of the students.
A more technologically experienced applied mathematician could flip the class
if she wanted to, or change it in other ways. But we feel the current format is
most versatile and allows instructors of all backgrounds to gracefully learn and
adapt to the program. Over time, instructors will become familiar enough with the
content that they can experiment with various pedagogical approaches and make
the program theirs.

To the Student
Examples
Although some of the topics in this book may seem familiar to you, especially
many of the linear algebra topics, we have taken a very different approach to these
topics by integrating many different topics together in our presentation, so examples
treated in a discussion of vector spaces will appear again in sections on nonlinear
analysis and other places throughout the text. Also, notation introduced in the
examples is often used again later in the text.
Because of this, we recommend that you read all the examples in each section,
even if the definitions, theorems, and other results look familiar.

[Figure 1 is a diagram of chapter dependencies. The Linear Analysis track contains 1: Vector Spaces, 2: Linear Transformations, 4: Spectral Theory, 12: Spectral Decomposition, 13: Iterative Methods, and 14: Spectra and Pseudospectra; the Nonlinear Analysis track contains 5: Metric Spaces, 6: Differentiation, 7: Contraction Mappings, 8: Integration I, 9: *Integration II, 10: Calculus on Manifolds, and 11: Complex Analysis.]

Figure 1. Diagram of the dependencies among the chapters of this book.


Although we usually proceed straight through the book in order, it could also be used
for either an advanced linear algebra course or a course in real and complex analysis.
The linear analysis half (left side) provides a course in advanced linear algebra
for students who have had complex analysis and multivariate real analysis. The
nonlinear analysis half (right side) could be used for a course in real and complex
analysis for students who have already had linear algebra and a semester of real
analysis. For that track, we recommend briefly reviewing the material of Chapter 1
and Sections 2.1-2.2, 3.1, and 3.5, before proceeding to the nonlinear material, in
order to fix notation and ensure students remember the necessary background.

Exercises
Each section of the book has several exercises, all collected at the end of each
chapter. Horizontal lines separate the exercises for each section from the exer-
cises for the other sections. We have carefully selected these exercises. You should
work them all (but your instructor may choose to let you skip some of the advanced
exercises marked with *); each is important for your ability to understand
subsequent material.
Although the exercises are gathered together at the end of the chapter, we
strongly recommend that you do the exercises for each section as soon as you have
completed the section, rather than saving them until you have finished the entire
chapter. Learning mathematics is like developing physical strength. It is much
easier to improve, and improvement is greater, when exercises are done daily, in
measured amounts, rather than doing long, intense bouts of exercise separated by
long rests.

Origins
This curriculum evolved as an outgrowth of lecture notes and computer labs that
were developed for a 6-credit summer course in computational mathematics and
statistics. This was designed to introduce groups of undergraduate researchers to
a number of core concepts in mathematics, statistics, and computation as part of
a National Science Foundation (NSF) funded mentoring program called CSUMS:
Computational Science Training for Undergraduates in the Mathematical Sciences.
This NSF program sought out new undergraduate mentoring models in the
mathematical sciences, with particular attention paid to computational science
training through genuine research experiences. Our answer was the Interdisciplinary
Mentoring Program in Analysis, Computation, and Theory (IMPACT), which took
cohorts of mathematics and statistics undergraduates and inserted them into an in-
tense summer "boot camp" program designed to prepare them for interdisciplinary
research during the school year. This effort required a great deal of experimenta-
tion, and when the dust finally settled, the list of topics that we wanted to teach
blossomed into 8 semesters of material--essentially an entire curriculum.
After we explained the boot camp concept to one visitor, he quipped, "It's
the minimum number of instructions needed to create an applied mathematician."
Our goal, however, is much broader than this. We don't want to train or create
a specific type of applied mathematician; we want a curriculum that supports all
types, simultaneously. In other words, our goal is to take in students with diverse
and evolving interests and backgrounds and provide them with a common corpus of
mathematical, statistical, and computational content so that they can emerge well
prepared to work in their own chosen areas of specialization. We also want to draw
their attention to the core ideas that are ubiquitous across various applications so
that they can navigate fluidly across fields.

Acknowledgments
We thank the National Science Foundation for their support through the TUES
Phase II grant DUE-1323785. We especially thank Ron Buckmire at the National

Science Foundation for taking a chance on us and providing much-needed advice and
guidance along the way. Without the NSF, this book would not have been possible.
We also thank the Department of Mathematics at Brigham Young University for
their generous support and for providing a stimulating environment in which to
work.
Many colleagues and friends have helped shape the ideas that led to this
text, especially Randy Beard, Rick Evans, Shane Reese, Dennis Tolley, and Sean
Warnick, as well as Bryant Angelos, Jonathan Baker, Blake Barker, Mylan Cook,
Casey Dougal, Abe Frandsen, Ryan Grout, McKay Heasley, Amelia Henricksen, Ian
Henricksen, Brent Kerby, Steven Lutz, Shane McQuarrie, Ryan Murray, Spencer
Patty, Jared Webb, Matthew Webb, Jeremy West, and Alexander Zaitzeff, who
were all instrumental in helping to organize this material.
We also thank the students of the BYU Applied and Computational Mathe-
matics Emphasis (ACME) cohorts of 2013-2015, 2014-2016, 2015-2017, and 2016-
2018, who suffered through our mistakes, corrected many errors, and never hesitated
to tell us what they thought of our work.
We are deeply grateful to Chris Grant, Todd Kapitula, Zach Boyd, Rachel
Webb, Jared McBride, and M.A. Averill, who read various drafts of this volume very
carefully, corrected many errors, and gave us a tremendous amount of helpful feed-
back. Of course, all remaining errors are entirely our fault. We also thank Amelia
Henricksen, Sierra Horst, and Michael Hansen for their help illustrating the text
and Sarah Kay Miller for her outstanding graphic design work, including her beau-
tifully designed book covers. We also appreciate the patience, support, and expert
editorial work of Elizabeth Greenspan and the other editors and staff at SIAM.
Finally, we thank the folks at Savvysherpa, Inc., for corporate sponsorship
that greatly helped make the transition from IMPACT to ACME and their help
nourishing and strengthening the ACME development team.
Part I

Linear Analysis I
1 Abstract Vector Spaces

Mathematics is the art of reducing any problem to linear algebra.


-William Stein

In this chapter we develop the theory of abstract vector spaces. Although it is


sometimes sufficient to think of vectors as arrows or arrays in ℝ^n, this point of
view can be too limiting, particularly when it comes to non-Cartesian coordinates,
infinite-dimensional vector spaces, or vectors over other fields, such as the complex
numbers ℂ. Hence, we begin this journey by stripping away the simple geomet-
ric interpretation of a vector space and replacing it with rigorous mathematical
abstraction, from which we build the theory of mathematical analysis. We then
carefully rebuild the geometric properties of vector spaces in subsequent chapters.
This chapter contains a description of vector spaces, subspaces, and the rules of
vector algebra. In addition, we discuss linear combinations of vectors, the spans
of sets, and the consequences of linear independence. We then build new vector
spaces and subspaces out of existing ones by taking Cartesian products
of vector spaces and direct sums of subspaces. The chapter concludes with an
important foray into quotient spaces and their algebraic properties.
Even if you are already familiar with linear algebra, we recommend that you at
least read the examples and work the exercises, because they are used and referred
to repeatedly throughout this text.

1.1 Vector Algebra


Vector spaces are a fundamental concept in mathematics, science, and engineering.
The key properties of a vector space are that its elements (called vectors) can be
added, subtracted, and rescaled.
We can describe many real-world phenomena as vectors in a vector space.
Examples include the position and momentum of particles, sound traveling in a
medium, and electrical or optical signals from an Internet connection. Once we
describe something as a vector, the powerful tools of linear algebra can be used to
study it.


Remark 1.1.1. Many of the properties of vector spaces described in this text hold
over arbitrary fields;² however, we restrict our discussion here to vector spaces over
the real field ℝ or the complex³ field ℂ. We denote the field by 𝔽 when a statement
is true for both ℝ and ℂ.

Remark 1.1.2. We use the notation |·| to denote the absolute value of a real
number and the modulus of a complex number; that is, if z = a + bi ∈ ℂ, where a,
b ∈ ℝ, then |z| = √(a² + b²). The reader should verify that |z|² = z z̄ = z̄ z, where
z̄ = a − bi is the complex conjugate.
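A quick numerical check of this identity, as a minimal Python sketch of ours (not part of the text or its lab materials), for one sample complex number:

z = 3 + 4j                          # a sample complex number
print(abs(z) ** 2)                  # |z|^2 = 25.0
print((z * z.conjugate()).real)     # z times its conjugate is 25 + 0j, real part 25.0

Both quantities agree for this sample, as the remark asks the reader to verify in general.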


1.1.1 Vector Spaces
We begin by carefully defining a vector space.

Definition 1.1.3. A vector space over a field 𝔽 is a set V with two operations:
addition, mapping the Cartesian product⁴ V × V to V, and denoted by (x, y) ↦ x + y;
and scalar multiplication, mapping 𝔽 × V to V, and denoted by (a, x) ↦ ax.
Elements of the vector space V are called vectors, and the elements of the field
𝔽 are called scalars. These operations must satisfy the following properties for all
x, y, z ∈ V and all a, b ∈ 𝔽:

(i) Commutativity of vector addition: x + y = y + x.

(ii) Associativity of vector addition: (x + y) + z = x + (y + z).

(iii) Existence of an additive identity: There exists an element 0 ∈ V such that 0 + x = x.

(iv) Existence of an additive inverse: For each x ∈ V there exists an element −x ∈ V such that x + (−x) = 0.

(v) First distributive law: a(x + y) = ax + ay.

(vi) Second distributive law: (a + b)x = ax + bx.

(vii) Multiplicative identity: 1x = x.

(viii) Relation to ordinary multiplication: (ab)x = a(bx).

Nota Bene 1.1.4. A subtle point that is sometimes missed because it is not
included in the numbered list of properties is that the definition of a vector
space requires the operations of vector addition and scalar multiplication to
take their values in V. More precisely, if we add two vectors together in V, the
result must be a vector in V, and if we multiply a vector in V by a scalar in
𝔽, the result must also be a vector in V. When these properties hold, we say
that V is closed under vector addition and scalar multiplication. If V is not
closed under an operation, then, strictly speaking, we don't have an operation
on V at all. Checking closure under operations is often the hardest part of
verifying that a given set is a vector space.

²For more information on general fields, see Appendix B.2.
³For more information on complex numbers, see Appendix B.1.
⁴See Definition A.1.10(viii) in Appendix A.
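To make the closure check concrete, here is a minimal NumPy sketch of ours (not from the text or its labs); the specific matrices are arbitrary illustrative choices. It tests closure under addition and scalar multiplication for the set of symmetric 2 × 2 real matrices, and contrasts it with a set that is not closed under addition.

import numpy as np

# Candidate set W = symmetric 2x2 matrices inside V = M_2(R).
A = np.array([[1.0, 2.0], [2.0, 3.0]])
B = np.array([[0.0, -1.0], [-1.0, 5.0]])
C = 2.0 * A + 3.0 * B               # an arbitrary linear combination of A and B
print(np.allclose(C, C.T))          # True: the combination is again symmetric

# By contrast, the set of invertible matrices is not closed under addition:
I = np.eye(2)
print(np.linalg.det(I + (-I)))      # 0.0, so the sum of two invertible matrices
                                    # need not be invertible

A single numerical test is only evidence, not a proof; proving closure in general is a short calculation of the kind described above.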

Remark 1.1.5. Throughout this text we typically write vectors in boldface in an


effort to distinguish them from scalars and other mathematical objects. This may
cause confusion, however, when the vectors are also functions, polynomials, or other
objects that are normally written in regular (not bold) fonts. In such cases, we
simply write these vectors in the regular font and expect the reader to recognize
the objects as vectors.

Example 1.1.6. Each of the following examples of vector spaces is important
in applied mathematics and appears repeatedly throughout this book. The
naive descriptions of vectors as "arrows" with "magnitude" and "direction" are
inadequate for describing many of these examples. The reader should check
that all the axioms are satisfied for each of these:

(i) The n-tuples 𝔽^n forming the usual Euclidean space. Vector addition is
given by (a_1, ..., a_n) + (b_1, ..., b_n) = (a_1 + b_1, ..., a_n + b_n), and scalar
multiplication is given by c(a_1, ..., a_n) = (ca_1, ..., ca_n).

(ii) The space of m × n matrices, denoted M_{m×n}(𝔽), where each entry is
an element of 𝔽. Vector addition is usual matrix addition, and scalar
multiplication is given by multiplying every entry in the matrix by the
scalar; entrywise, (A + B)_{ij} = A_{ij} + B_{ij} and (cA)_{ij} = c A_{ij}.

We often write M_n(𝔽) to denote square n × n matrices. There is another
important operation called multiplication on this set, namely, matrix
multiplication. Matrix multiplication is important in many contexts, but
it is not part of the vector space structure; that is, M_n(𝔽) and M_{m×n}(𝔽)
are vector spaces regardless of whether we can multiply matrices.

(iii) For [a, b] ⊂ ℝ, the space C([a, b]; 𝔽) of continuous 𝔽-valued functions.
Vector addition is given by defining the function f + g as (f + g)(x) =
f(x) + g(x), and scalar multiplication is given by defining the function
cf by (cf)(x) = c · f(x). Note that C([a, b]; 𝔽) is closed under vector
addition and scalar multiplication because sums and scalar products of
continuous functions are continuous.

(iv) For [a, b] ⊂ ℝ and 1 ≤ p < ∞, the space L^p([a, b]; 𝔽) of p-integrableᵃ
functions f : [a, b] → 𝔽, that is, satisfying ∫_a^b |f(x)|^p dx < ∞.
For p = ∞, let L^∞([a, b]; 𝔽) be the set of all functions f : [a, b] → 𝔽 such
that sup_{x∈[a,b]} |f(x)| < ∞.

If 1 ≤ p ≤ ∞, the space L^p([a, b]; 𝔽) is a vector space. To prove that L^p
is closed under vector addition, notice that for a, b ∈ ℝ we have

|a + b|^p ≤ (|a| + |b|)^p ≤ (2 max(|a|, |b|))^p ≤ 2^p (|a|^p + |b|^p).

Thus, it follows that

∫_a^b |f(x) + g(x)|^p dx ≤ 2^p ∫_a^b |f(x)|^p dx + 2^p ∫_a^b |g(x)|^p dx < ∞.

(v) The space 𝔽[x] of polynomials with coefficients in 𝔽.

(vi) The space ℓ^p of infinite sequences (x_j)_{j=1}^∞ with each x_j ∈ 𝔽 and satisfying
Σ_{j=1}^∞ |x_j|^p < ∞ for 1 ≤ p < ∞, and satisfying sup_{j∈ℕ} |x_j| < ∞ for p = ∞.
Vector addition and scalar multiplication are defined componentwise.
It is not immediately obvious that ℓ^p is closed under vector addition,
but this can be shown with an argument similar to the one used for
L^p([a, b]; 𝔽) in (iv). We see another proof when we discuss Minkowski's
inequality in Section 3.6.

ᵃFor now we use the Riemann integral in the definition of L^p, but this is an abuse of notation,
since the usual definition of L^p assumes a more advanced definition of integration. In
Chapters 8-9 we generalize the integral to include a broader class of integrable functions
consistent with the usual convention.

Proposition 1.1.7. Let V be a vector space. If x, y ∈ V, then the following hold:
(i) x + y = x implies y = 0. In particular, the additive identity is unique.
(ii) x + y = 0 implies y = −x. In particular, additive inverses are unique.
(iii) 0x = 0.
(iv) (−1)x = −x.

Proof.
(i) If x + y = x, then 0 = −x + x = −x + (x + y) = (−x + x) + y = 0 + y = y.
(ii) If x + y = 0, then −x = −x + 0 = −x + (x + y) = (−x + x) + y = 0 + y = y.
(iii) For each x, we have that x = 1x = (1 + 0)x = 1x + 0x = x + 0x. Hence, (i)
implies that 0x = 0.
(iv) For each x, we have that 0 = 0x = (1 + (−1))x = 1x + (−1)x = x + (−1)x.
Hence, (ii) implies that (−1)x = −x. □

Remark 1.1.8. Subtraction in a vector space is really just addition of the negative.
Thus, the expression x - y is just shorthand for x + (-y).

Proposition 1.1.9. Let V be a vector space. If x, y, z ∈ V and a, b ∈ 𝔽, then the
following hold:

(i) a0 = 0.

(ii) If ax = ay and a ≠ 0, then x = y.

(iii) If x + y = x + z, then y = z.

(iv) If ax = bx and x ≠ 0, then a = b.

Proof.

(i) For each x ∈ V and a ∈ 𝔽, we have that ax = a(x + 0) = ax + a0. Since the
additive identity is unique by Proposition 1.1.7, it follows that a0 = 0.

(ii) Since a ≠ 0, we have that a⁻¹ ∈ 𝔽. Thus, x = (1x) = (a⁻¹a)x = a⁻¹(ax) =
a⁻¹(ay) = (a⁻¹a)y = 1y = y.

(iii) Note that y = 0 + y = (−x + x) + y = −x + (x + y) = −x + (x + z) =
(−x + x) + z = 0 + z = z.

(iv) We prove an equivalent statement: If cx = 0 and x ≠ 0, then c = 0. By
way of contradiction, assume c ≠ 0. It follows that x = 1x = (c⁻¹c)x =
c⁻¹(cx) = c⁻¹0 = 0, which is a contradiction. Hence, c = 0. □

Nota Bene 1.1.10. It is important to understand that Proposition 1.1.9(iv)
is not saying that we can divide one vector by another vector. That is abso-
lutely not the case. Indeed, this cancellation only works in the special case
that both sides of the equation are scalar multiples of the same vector x.

1.1.2 Subspaces
We conclude this section by defining subspaces of a vector space and then providing
several examples.

Definition 1.1.11. Let V be a vector space. A nonempty subset W ⊂ V is a


subspace of V if W itself is a vector space under the same operations of vector
addition and scalar multiplication as V.

An immediate consequence of the definition is that if a subset W ⊂ V is a
subspace, then it is closed under vector addition and scalar multiplication as defined
by V. In other words, if W is a subspace, then whenever x, y ∈ W and a, b ∈ 𝔽, we
have that ax + by ∈ W. Surprisingly, this is all we need to check to show that a
subset is a subspace.

Theorem 1.1.12. If W is a nonempty subset of a vector space V such that for
any x, y ∈ W and for any a, b ∈ 𝔽 the vector ax + by is also in W, then W is a
subspace of V.

Proof. The hypothesis of this theorem shows that W is closed under vector addition
and scalar multiplication, so vector addition does indeed map W × W to W and
scalar multiplication does indeed map 𝔽 × W to W, as required. Properties (i)-(ii)
and (v)-(viii) of Definition 1.1.3 follow because they hold for the space V. The
proofs of the remaining two properties (iii) and (iv) are left as an exercise; see
Exercise 1.2. □

The following corollary is immediate.

Corollary 1.1.13. If V is a vector space and W is a nonempty subset of V such
that for any x, y ∈ W and for any a ∈ 𝔽 the vectors x + y and ax are also in W,
then W is a subspace of V.

Unexample 1.1.14.

(i) If W is the union of the x- and y-axes in the plane ℝ², then it is not a
subspace of ℝ². Indeed, the sum of the vectors (1, 0) (in the x-axis) and
(0, 1) (in the y-axis) is (1, 1), which is not in W, so W is not closed with
respect to vector addition.

(ii) Any line or plane in ℝ³ that does not contain the origin 0 is not a
subspace of ℝ³ because all subspaces are required to contain 0.

Example 1.1.15. The following are examples of subspaces:

(i) Any line that passes through the origin in ℝ³ is a subspace of ℝ³ and
can be written as {tx | t ∈ ℝ} for some x ∈ ℝ³. Thus, any scalar
multiple a(tx) is on the line, as is any sum tx + sx = (t + s)x. More
generally, for any vector space V over a field 𝔽 and any x ∈ V, the set
W = {tx | t ∈ 𝔽} is a subspace of V.

(ii) Any plane that passes through the origin in ℝ³ is a subspace of ℝ³
and can be written as the set {sx + ty | s, t ∈ ℝ} for some pairᵃ of
vectors x and y. It is straightforward to check that this set is closed
under vector addition and scalar multiplication. More generally, for any
vector space V over a field 𝔽 and any two vectors x, y ∈ V, the set
P = {sx + ty | s, t ∈ 𝔽} is a subspace of V.

(iii) For [a, b] ⊂ ℝ, the space C_0([a, b]; 𝔽) of all functions f ∈ C([a, b]; 𝔽) such
that f(a) = f(b) = 0 is a subspace of C([a, b]; 𝔽). To see this, we must
check that if f and g both vanish at the endpoints, then so does cf + dg
for any c, d ∈ 𝔽.

(iv) For n ∈ ℕ, the set 𝔽[x; n] of all polynomials of degree at most n is a
subspace of 𝔽[x]. To see this, we must check that for any f and g of
degree at most n and any a, b ∈ 𝔽, the combination af + bg is again a
polynomial of degree at most n.

(v) For any vector space V, both {0} and V are subspaces of V. The
subspace {0} is called the trivial subspace of V. Any subspace of V
that is not equal to V itself is called a proper subspace.

(vi) For [a, b] ⊂ ℝ, the set C([a, b]; 𝔽) is a subspace of L^p([a, b]; 𝔽) when
1 ≤ p < ∞.

(vii) The space C^n((a, b); 𝔽) of 𝔽-valued functions whose nth derivative is
continuous on (a, b) is a subspace of C((a, b); 𝔽).

(viii) The property of being a subspace is transitive. If X is a subspace of a
vector space W, and if W is a subspace of a vector space V, then X
is a subspace of V. For example, when n ∈ ℕ, the space C^{n+1}((a, b); 𝔽)
is a subspace of C^n((a, b); 𝔽). It follows that C^{n+1}((a, b); 𝔽) is a subspace
of C((a, b); 𝔽).

ᵃIf the two vectors are scalar multiples of each other, then this set is a line, not a plane.

Application 1.1.16 (Linear Maps and the Superposition Principle).


Many physical phenomena satisfy the superposition principle, which is just
another way of saying that they are vectors in a vector space. Waves, both
electromagnetic and acoustic, provide common examples of this phenomenon.
Radio signals are examples of electromagnetic waves. Two radio signals of dif-
ferent frequencies correspond to two different vectors in a common vector
space. When these are broadcast simultaneously, the result is a new wave
(also a vector) that is the sum of the two original signals.

Sound waves also behave like vectors in a vector space. This allows the
construction of technologies like noise-canceling headphones. The idea is very
simple. When an undesired noise n is approaching the ear, produce the signal
that is the additive inverse −n, and play it at the same time. The signal heard
is the sum n + (−n) = 0, which is silent.
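A minimal NumPy sketch of this cancellation (ours, for illustration only; the sampled sine wave and its parameters are arbitrary choices, not from the text):

import numpy as np

t = np.linspace(0, 1, 1000)         # sample times
n = np.sin(2 * np.pi * 440 * t)     # a sampled 440 Hz "noise" signal
residual = n + (-n)                 # superpose the noise with its additive inverse
print(np.max(np.abs(residual)))     # 0.0: the two signals cancel exactly

Real noise-canceling hardware must estimate n and produce −n in real time, so the cancellation is only approximate, but the vector space identity n + (−n) = 0 is the principle behind it.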

1.2 Spans and Linear Independence


In this section we discuss two big ideas from linear algebra. The first is the span of
a collection of vectors in a given vector space V. The span of a nonempty set S is
the set of all linear combinations of elements from S, that is, the set of all vectors
obtained by taking finite sums of scalar multiples of the elements of S. This set of
linear combinations forms a subspace of V, which can also be obtained by taking
the intersection of all subspaces of V that contain S.
The second idea concerns whether it is possible to remove a vector from S
and still span the same subspace. If no vectors can be removed, then the set S is
linearly independent, and any vector in the span can be uniquely represented as a
linear combination of vectors from S.
If S is linearly independent and the span of S is the entire vector space V, then
every vector in V has a unique representation as a linear combination of elements
from S. In this case S defines a coordinate system for the vector space, and we say
that S is a basis of V. Such a representation is extremely useful and forms a very
powerful set of tools with incredibly broad application.
Throughout the rest of this chapter, we assume that S is a subset of a given
vector space V, unless otherwise specified.

1.2.1 The Span of a Set


The span of a collection of vectors is an important building block in linear
algebra. The main result of this subsection is that the span of a set can be de-
fined two ways: either as the intersection of all subspaces that contain the set, or
as the set of all possible linear combinations of the set.

Definition 1.2.1. The span of S, denoted span(S), is the set of linear combinations
of elements of S, that is, the set of all finite sums of the form

a_1 x_1 + a_2 x_2 + ⋯ + a_m x_m,  where each x_i ∈ S and each a_i ∈ 𝔽.    (1.1)

If S is empty, then we define span(S) to be the set {0}. If span(S) = W for some
subspace W of V, then we say that S spans W.

Proposition 1.2.2. The set span(S) is a subspace of V.

Proof. If S is empty, the result is clear, so we may assume S is nonempty. If


x,y E span(S), then there exists a finite subset {x 1 , . .. , x m} CS, such that x =
:z:=:,
1 Ci X i and y = :z:=:,
1 dixi for some coefficients c1, . . . , Cm and di, ... , dm (some

possibly zero). Since ax + by = Σ_{i=1}^m (ac_i + bd_i) x_i is of the form given in (1.1), and
is thus a linear combination of elements of S, it follows that ax + by is contained
in span(S); hence span(S) is a subspace of V. □

Lemma 1.2.3. If W is a subspace of V, then span(W) = W.

Proof. It is immediate that W ⊂ span(W), so it suffices to show span(W) ⊂ W,
which we prove by induction. By definition, any element v ∈ span(W) can be
written as a linear combination v = a_1 x_1 + ⋯ + a_n x_n with x_i ∈ W. If n = 1,
then v ∈ W since subspaces are closed under scalar multiplication. Now suppose
that all linear combinations of W of length k or less are in W, and consider a linear
combination of length k + 1, say v = a_1 x_1 + a_2 x_2 + ⋯ + a_k x_k + a_{k+1} x_{k+1}. This
can be rewritten as v = w + a_{k+1} x_{k+1}, where w is the sum of the first k terms in
the linear combination. By the inductive hypothesis, w ∈ W, and by the definition
of a subspace, w + a_{k+1} x_{k+1} ∈ W. Thus, all linear combinations of k + 1 elements
are also in W. Therefore, by induction, all finite linear combinations of elements of
W are in W, or in other words span(W) ⊂ W. □

Proposition 1.2.4. The intersection of a collection {W_α}_{α∈J} of subspaces of V is
a subspace of V.

Proof. The intersection is nonempty because it contains 0. Assume W_α is a
subspace of V for each α ∈ J. If x, y ∈ ⋂_{α∈J} W_α, then x, y ∈ W_α for each α ∈ J.
Since each W_α is a subspace, then for every a, b ∈ 𝔽 we have ax + by ∈ W_α. Hence,
ax + by ∈ ⋂_{α∈J} W_α, and, by Theorem 1.1.12, we conclude that ⋂_{α∈J} W_α is a
subspace of V. □

Proposition 1.2.5. If S ⊂ S′ ⊂ V, then span(S) ⊂ span(S′).

Proof. See Exercise 1.10. □

Theorem 1.2.6. The span of a set S ⊂ V is the smallest subspace of V that
contains S, meaning that for any subspace W ⊂ V containing S we have span(S) ⊂
W. It is also equal to the intersection of all subspaces that contain S.

Proof. If W is any subspace of V with S ⊂ W, then span(S) ⊂ span(W) = W by
Lemma 1.2.3. Moreover, the intersection of all subspaces containing S is a subspace
containing S, by Proposition 1.2.4, and hence it contains span(S). Conversely,
span(S) is a subspace of V containing S, so it must contain the intersection of all
such subspaces. □

Proposition 1.2.7. If v ∈ span(S), then span(S) = span(S ∪ {v}).

Proof. By the previous proposition, we have that span(S) ⊂ span(S ∪ {v}). Thus,
assuming the hypothesis, it suffices to show that span(S ∪ {v}) ⊂ span(S).

Given x ∈ span(S ∪ {v}), we have x = Σ_{i=1}^n a_i s_i + cv for some {s_i}_{i=1}^n ⊂ S.
Since v ∈ span(S), we also have that v = Σ_{j=1}^m b_j t_j for some {t_j}_{j=1}^m ⊂ S. Thus,
we have that x = Σ_{i=1}^n a_i s_i + c Σ_{j=1}^m b_j t_j, which is a linear combination of elements
of S. Therefore, x ∈ span(S). □

1.2.2 Linear Independence


As mentioned in the introduction to this section, a set S is linearly independent if
it is impossible to remove a vector from S and still span the same subspace.
The first result in this subsection establishes that any vector in the span of
a linearly independent set S can be uniquely expressed as a finite sum of scalar
multiples of vectors from S. We then prove that subsets of linearly independent
sets are also linearly independent. One might say that this result is dual to
Proposition 1.2.5, which states that supersets of sets that span also span. These
two dual results intersect when S both spans V and is linearly independent: if we
add a vector to S, then it still spans V but is no longer linearly independent; if we
remove a vector from S, then it is still linearly independent but no longer spans V.
When S is both linearly independent and spans the space, we say that it is a basis
for V.

Definition 1.2.8. The set S is linearly dependent if there exists a nontrivial linear
combination of elements of S that equals zero; that is, for some nonempty subset
{x_1, ..., x_m} ⊂ S we have

a_1 x_1 + a_2 x_2 + ⋯ + a_m x_m = 0,

where the elements x_i are distinct, and not all of the coefficients a_i are zero. If no
such linear combination exists, then the set S is linearly independent.

Remark 1.2.9. The empty set is vacuously linearly independent. Because there
are no vectors in ∅, there is no nontrivial linear combination of vectors.

Example 1.2.10. Let V = 𝔽² and S = {x, y, z} ⊂ V, where x = (1, 1),
y = (−1, 1), and z = (0, 2). The set S is linearly dependent, since x + y − z = 0.
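This dependence can also be detected numerically: stack the vectors as the columns of a matrix and compare its rank with the number of columns. A minimal NumPy sketch of ours (not from the text or its labs):

import numpy as np

# Columns are x = (1, 1), y = (-1, 1), z = (0, 2) from Example 1.2.10.
A = np.array([[1, -1, 0],
              [1,  1, 2]])
print(np.linalg.matrix_rank(A))     # 2, which is less than the 3 columns,
                                    # so {x, y, z} is linearly dependent
x, y, z = A[:, 0], A[:, 1], A[:, 2]
print(x + y - z)                    # [0 0], exhibiting the relation x + y - z = 0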

Proposition 1.2.11. If S is linearly independent, then any vector v ∈ span(S) can
be written uniquely as a (finite) linear combination of elements of S. More precisely,
for any distinct elements x_1, ..., x_m of S, if v = Σ_{i=1}^m a_i x_i and v = Σ_{i=1}^m b_i x_i,
then a_i = b_i for every i = 1, ..., m.

Proof. Suppose that v = Σ_{i=1}^m a_i x_i and v = Σ_{i=1}^m b_i x_i. Subtracting gives 0 =
Σ_{i=1}^m (a_i − b_i) x_i. Since S is linearly independent, each term is equal to zero, which
implies a_i = b_i for each i = 1, ..., m. □

The next lemma is an important tool for two theorems in a later section (the
replacement theorem (Theorem 1.4.1) and the extension theorem (Corollary 1.4.5)).

It states that linear independence is inherited by subsets. This is sort of a dual result
to Proposition 1.2.5, which states that supersets of sets that span also span.

Lemma 1.2.12. If S is linearly independent and S′ ⊂ S, then S′ is also linearly
independent.

Proof. See Exercise 1.13. □

Lemma 1.2.12 and Proposition 1.2.5 suggest that linear independence and the
span are complementary in some sense. There are some special sets that have both
properties. These important sets are called bases of the vector space, and they act
as a coordinate system for the vector space.

Definition 1.2.13. If S is linearly independent and spans V, then it is called a basis of V.

Example 1.2.14.

(i) The vectors e₁ = (1, 0, 0), e₂ = (0, 1, 0), and e₃ = (0, 0, 1) in 𝔽³ form a basis for 𝔽³.

(ii) More generally, the vectors e₁ = (1, 0, ..., 0), e₂ = (0, 1, 0, ..., 0), ..., eₙ = (0, 0, ..., 0, 1) ∈ 𝔽ⁿ form a basis of 𝔽ⁿ called the standard basis.

(iii) The vectors (1, 1, 1), (2, 1, 1), and (1, 0, 1) also form a basis for 𝔽³ (a numerical check appears after this example). To show that the vectors are linearly independent, set

a(1, 1, 1) + b(2, 1, 1) + c(1, 0, 1) = (0, 0, 0).

Solving gives a = b = c = 0. To show that the vectors span 𝔽³, set

a(1, 1, 1) + b(2, 1, 1) + c(1, 0, 1) = (x, y, z).

Solving gives

a = y − x + z,
b = x − z,
c = z − y.

(iv) The monomials {1, x, x², x³, ...} ⊂ 𝔽[x] form a basis of 𝔽[x] (the reader should prove this). But there are infinitely many other bases as well, for example, {1, x + 1, x² + x + 1, x³ + x² + x + 1, ...}.
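The computation in part (iii) can also be carried out numerically. The sketch below is ours (not from the text) and assumes NumPy; it solves for the coordinates (a, b, c) of an arbitrary vector (x, y, z) in the basis {(1,1,1), (2,1,1), (1,0,1)} and compares them with the formulas derived above.

```python
import numpy as np

# Basis vectors of Example 1.2.14(iii) as the columns of B, so that
# B @ (a, b, c) = a(1,1,1) + b(2,1,1) + c(1,0,1).
B = np.array([[1.0, 2.0, 1.0],
              [1.0, 1.0, 0.0],
              [1.0, 1.0, 1.0]])

rng = np.random.default_rng(0)
v = rng.standard_normal(3)          # an arbitrary target vector (x, y, z)
a, b, c = np.linalg.solve(B, v)     # coordinates of v in this basis

x, y, z = v
print(np.allclose([a, b, c], [y - x + z, x - z, z - y]))   # True
print(np.allclose(B @ np.array([a, b, c]), v))             # True
```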

Corollary 1.2.15. If S is a basis for V, then any nonzero vector x ∈ V can be uniquely expressed as a linear combination of elements of S.

Proof. Since S spans V, each x ∈ V can be expressed as a linear combination of elements of S. Uniqueness follows from Proposition 1.2.11. □

This corollary means that if we have a basis B = {x₁, ..., xₙ} of V, we can write any element v ∈ V uniquely as a linear combination $v = \sum_{i=1}^{n} a_i x_i$, and thus v can be identified with the n-tuple (a₁, ..., aₙ) ∈ 𝔽ⁿ. That is to say, the basis B has given us a coordinate system for V. A different basis would, of course, give us a different coordinate system. In the next chapter, we show how to transform vectors from one coordinate system to another.

1.3 Products, Sums, and Complements


In this section we describe two ways of constructing new vector spaces and subspaces
from existing ones: the Cartesian product of vector spaces, and the sum of subspaces
of a given vector space.
The Cartesian product of a finite collection of vector spaces (over the same
field) is a vector space, where vector addition and scalar multiplication are defined
componentwise. This allows us to combine vector spaces into a single larger vector
space. For example, suppose we have two three-dimensional vector spaces, the first
describing the position of a given particle, and the second describing the velocity
of the particle. The Cartesian product of those two vector spaces forms a new, six-
dimensional vector space describing both the position and velocity of the particle.
The sum of nontrivial subspaces of a given vector space is also a subspace.
Moreover, if a pair of subspaces intersects only at zero, then their sum has no
redundancy and is called a direct sum. A direct sum decomposes a subspace into the
sum of other smaller subspaces. If a vector space is the direct sum of two subspaces,
then we call these subspaces complementary. Decomposing a space into the "right"
collection of complementary subspaces often simplifies a problem substantially.

1.3.1 Products

Proposition 1.3.1. Let V₁, V₂, ..., Vₙ be a collection of vector spaces over the field 𝔽. The Cartesian product

$V = \prod_{i=1}^{n} V_i = V_1 \times V_2 \times \cdots \times V_n = \{(v_1, v_2, \ldots, v_n) \mid v_i \in V_i\}$

forms a vector space over 𝔽 with additive identity (0, 0, ..., 0), and vector addition and scalar multiplication defined componentwise as

(i) (x₁, x₂, ..., xₙ) + (y₁, y₂, ..., yₙ) = (x₁ + y₁, x₂ + y₂, ..., xₙ + yₙ),

(ii) a(x₁, x₂, ..., xₙ) = (ax₁, ax₂, ..., axₙ)

for all (x₁, x₂, ..., xₙ), (y₁, y₂, ..., yₙ) ∈ V and a ∈ 𝔽.

Proof. See Exercise 1.17. □



Figure 1.1. A depiction of ℝ² × ℝ. Beware that ℝ² ∪ ℝ is not a vector space. The vector (x, y) in this diagram (red) lies in the product ℝ² × ℝ, but it does not lie in ℝ² ∪ ℝ.

Example 1.3.2. The product space $\prod_{i=1}^{n} \mathbb{F}$ is exactly the vector space 𝔽ⁿ.

Example 1.3.3. The space ℝ² × ℝ can be written as the set of pairs (x, y), where x = (x₁, x₂) ∈ ℝ² and y = y₁ ∈ ℝ. The points of ℝ² × ℝ are in a natural bijective correspondence with the points of ℝ³ by sending ((x₁, x₂), y₁) to (x₁, x₂, y₁) as in Figure 1.1.
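A rough model of these componentwise operations (a sketch of ours, not from the text; the helpers add, scale, and flatten are our own names) represents elements of ℝ² × ℝ as nested Python tuples:

```python
# Elements of R^2 x R are pairs (x, y) with x in R^2 and y in R; the
# operations act componentwise, as in Proposition 1.3.1.

def add(u, v):
    (x1, y1), (x2, y2) = u, v
    return ((x1[0] + x2[0], x1[1] + x2[1]), y1 + y2)

def scale(a, u):
    x, y = u
    return ((a * x[0], a * x[1]), a * y)

def flatten(u):
    """The natural bijection ((x1, x2), y1) -> (x1, x2, y1) with R^3."""
    (x1, x2), y1 = u
    return (x1, x2, y1)

u = ((1.0, 2.0), 3.0)
v = ((4.0, 5.0), 6.0)
print(add(u, v))           # ((5.0, 7.0), 9.0)
print(scale(2.0, u))       # ((2.0, 4.0), 6.0)
print(flatten(add(u, v)))  # (5.0, 7.0, 9.0)
```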

Remark 1.3.4. It is straightforward to check that the previous proposition remains true for infinite sequences of vector spaces. For example, if $(V_i)_{i=1}^{\infty}$ is a sequence of vector spaces, the Cartesian product

$\prod_{i=1}^{\infty} V_i = \{(x_1, x_2, \ldots) \mid x_i \in V_i \text{ for all } i = 1, 2, \ldots\}$

is a vector space with componentwise vector addition and scalar multiplication defined as follows:

(i) (x₁, x₂, ...) + (y₁, y₂, ...) = (x₁ + y₁, x₂ + y₂, ...),

(ii) a(x₁, x₂, ...) = (ax₁, ax₂, ...).

1.3.2 Sums and Complements

Proposition 1.3.5. Let W₁, W₂, ..., Wₙ be a collection of subspaces of the vector space V. The sum $\sum_{i=1}^{n} W_i = \{w_1 + w_2 + \cdots + w_n \mid w_i \in W_i\}$ is a subspace of V.

Proof. By Corollary 1.1.13 it suffices to show that $W = \sum_{i=1}^{n} W_i$ is nonempty and closed under vector addition and scalar multiplication. Since 0 ∈ Wᵢ for every i, we have 0 ∈ $\sum_{i=1}^{n} W_i$, so W is nonempty. If x₁ + x₂ + ⋯ + xₙ and y₁ + y₂ + ⋯ + yₙ are in W, then so is their sum, since (x₁ + x₂ + ⋯ + xₙ) + (y₁ + y₂ + ⋯ + yₙ) = (x₁ + y₁) + (x₂ + y₂) + ⋯ + (xₙ + yₙ) ∈ W. Moreover, we have that a(w₁ + w₂ + ⋯ + wₙ) = aw₁ + aw₂ + ⋯ + awₙ ∈ W. □

In the special case that subspaces W₁, W₂ satisfy W₁ ∩ W₂ = {0}, their sum behaves like a Cartesian product, as described in the following definition and theorem.

Definition 1.3.6. Let W₁, W₂ be subspaces of the vector space V. If W₁ ∩ W₂ = {0}, then the sum W₁ + W₂ is called a direct sum and is denoted W₁ ⊕ W₂. More generally, if $(W_i)_{i=1}^{n}$ is a collection of subspaces of V, then the sum $W = \sum_{i=1}^{n} W_i$ is a direct sum if

$W_i \cap \Big(\sum_{j \ne i} W_j\Big) = \{0\}$ for all i = 1, 2, ..., n.    (1.2)

In this case we write W = W₁ ⊕ W₂ ⊕ ⋯ ⊕ Wₙ or $W = \bigoplus_{i=1}^{n} W_i$.

Theorem 1.3.7. Let $\{W_i\}_{i=1}^{n}$ be a collection of subspaces of a vector space V with bases $\{S_i\}_{i=1}^{n}$, respectively, and let $W = \sum_{i=1}^{n} W_i$. The following are equivalent:

(i) $W = \bigoplus_{i=1}^{n} W_i$.

(ii) For each w ∈ W, there exists a unique n-tuple (w₁, ..., wₙ), with each wᵢ ∈ Wᵢ, such that w = w₁ + w₂ + ⋯ + wₙ.

(iii) $S = \bigcup_{i=1}^{n} S_i$ is a basis for W, and Sᵢ ∩ Sⱼ = ∅ for every pair i ≠ j.

Proof.
(i)⇒(ii): If w = x₁ + x₂ + ⋯ + xₙ and w = y₁ + y₂ + ⋯ + yₙ, each xᵢ, yᵢ ∈ Wᵢ, then for each i, we have $x_i - y_i = \sum_{j \ne i} (y_j - x_j) \in W_i \cap \sum_{j \ne i} W_j = \{0\}$. Hence, uniqueness holds.

(ii)⇒(iii): Because each Sᵢ is a basis of Wᵢ, it is linearly independent; hence, 0 ∉ Sᵢ. Since this holds for every i, we also have 0 ∉ S. If S is linearly dependent, then there exists some nontrivial linear combination of elements of S that equals zero. This contradicts the uniqueness of the representation in (ii), since the unique representation of zero is 0 = 0 + 0 + ⋯ + 0.
Moreover, the set S spans W, since every element of W can be expressed as a linear combination of elements from $\{W_i\}_{i=1}^{n}$, which in turn can be expressed as a linear combination of elements of Sᵢ.
Finally, if s ∈ Sᵢ ∩ Sⱼ for some i ≠ j, then the uniqueness of the representation in (ii), applied to s = s + 0 = 0 + s, implies that s = 0. But we have already seen that 0 ∉ S, so Sᵢ ∩ Sⱼ = ∅.

(iii)⇒(i): Suppose that for some i we have a nonzero element $w \in W_i \cap \sum_{j \ne i} W_j = \mathrm{span}(S_i) \cap \mathrm{span}(\bigcup_{j \ne i} S_j)$. Thus, w is a nontrivial linear combination of elements of Sᵢ ⊂ S and a nontrivial linear combination of elements of $\bigcup_{j \ne i} S_j \subset S$. Since $S_i \cap (\bigcup_{j \ne i} S_j) = \emptyset$, this contradicts the uniqueness of linear combinations in the basis S. Thus, $W_i \cap \sum_{j \ne i} W_j = \{0\}$ for all i. □

Definition 1.3.8. Two subspaces W₁ and W₂ of V are complementary if V = W₁ ⊕ W₂.

Example 1.3.9. Consider the space 𝔽[x; 2n]. If W₁ = span{1, x², x⁴, ..., x²ⁿ} and W₂ = span{x, x³, x⁵, ..., x²ⁿ⁻¹}, then 𝔽[x; 2n] = W₁ ⊕ W₂. In other words, W₁ and W₂ are complementary.
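This decomposition is easy to see on coefficient vectors. The sketch below is ours (not from the text), assumes NumPy, and takes n = 2 purely for illustration: a polynomial in 𝔽[x; 4], stored as (a₀, ..., a₄), splits uniquely into an even-degree part in W₁ and an odd-degree part in W₂.

```python
import numpy as np

p = np.array([3.0, -1.0, 4.0, 1.0, -5.0])   # 3 - x + 4x^2 + x^3 - 5x^4

even = p.copy(); even[1::2] = 0.0   # component in W1 = span{1, x^2, x^4}
odd = p.copy();  odd[0::2] = 0.0    # component in W2 = span{x, x^3}

print(even, odd)
print(np.allclose(even + odd, p))   # True: p = (even part) + (odd part)
# The only coefficient vector lying in both W1 and W2 is zero, so the
# decomposition is unique and the sum W1 + W2 is direct.
```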

1.4 Dimension, Replacement, and Extension


In this section, we prove the replacement theorem, which says that if V has a finite
basis of m elements, then a linearly independent subset of V has at most m elements.
An immediate corollary of this is that if V has two bases and one of them is finite ,
then they both have the same cardinality. In other words, the number of elements
in a given basis, if finite, is a meaningful constant, which we call the dimension of
the vector space.
Another corollary of the replacement theorem is the extension theorem, which
for a finite-dimensional vector space V, allows us to create a basis out of any linearly
independent subset by adding vectors to the set. This is a powerful tool because it
enables us to construct a basis from a certain subset of vectors that are desirable
for the problem at hand.
We conclude this section with the difficult but important theorem that every
vector space has a basis. The proof of this theorem uses Zorn's lemma, which is
equivalent to the axiom of choice (see Appendix A for more details).

1.4.1 The Replacement and Extension Theorems

Theorem 1.4.1 (Replacement Theorem). If V is spanned by S = {s₁, s₂, ..., sₘ} and T = {t₁, t₂, ..., tₙ} (where each tᵢ is distinct) is a linearly independent subset of V, then n ≤ m, and there exists a subset S′ ⊂ S having m − n elements such that T ∪ S′ spans V.

Proof. We prove this by induction on n. If n = 0, then the result follows trivially (set S′ = S). Now assume that the result holds for some n ∈ ℕ. It suffices to show that the result holds for n + 1. If T = {t₁, t₂, ..., tₙ₊₁} is a linearly independent subset of V, then we know that T′ = {t₁, t₂, ..., tₙ} is also linearly independent by Lemma 1.2.12. Thus, by the inductive hypothesis, we have n ≤ m, and there is a subset of S with m − n elements, say S′ = {s₁, ..., sₘ₋ₙ}, such that T′ ∪ S′ spans V. Hence, we can write tₙ₊₁ as a linear combination:

$t_{n+1} = a_1 t_1 + \cdots + a_n t_n + b_1 s_1 + \cdots + b_{m-n} s_{m-n}.$    (1.3)

If n = m, then S′ = ∅ and tₙ₊₁ ∈ span(T′), which contradicts the linear independence of T. It follows that n < m (hence n + 1 ≤ m), and there exists at least one nonzero bᵢ in (1.3). Without loss of generality, assume b₁ ≠ 0. Thus,

$s_1 = -\frac{1}{b_1}\,(a_1 t_1 + \cdots + a_n t_n - t_{n+1} + b_2 s_2 + \cdots + b_{m-n} s_{m-n}).$    (1.4)

It follows that s₁ ∈ span(T ∪ S″), where S″ = {s₂, ..., sₘ₋ₙ}. By Proposition 1.2.7, we have that span(T ∪ S″) = span(T ∪ S″ ∪ {s₁}) = span(T ∪ S′). But since T′ ∪ S′ ⊂ T ∪ S′, then by Proposition 1.2.5 we have span(T ∪ S′) = V. □

Corollary 1.4.2. If V has a basis of n elements, then all other bases of V have n
elements.

Proof. Let S be a basis of V having n elements. If T is any finite basis of V, then since S and T are both linearly independent and span V, the replacement theorem guarantees that both |S| ≤ |T| and |T| ≤ |S|.
Suppose now that T is an infinite basis of V. Choose a subset T′ ⊂ T such that |T′| = n + 1. By Lemma 1.2.12 every subset of a linearly independent set is linearly independent, so by the replacement theorem, we have |T′| ≤ |S| = n. This is a contradiction. □

Definition 1.4.3. A vector space V with a finite basis of n elements is n-dimensional


or, equivalently, has dimension n. We denote this by dim(V) = n. If V does not
have a finite basis, then it is infinite dimensional.

Corollary 1.4.4. Let V be an n-dimensional vector space and S ⊂ V.

(i) If S spans V, then S has at least n elements.

(ii) If S is linearly independent, then S has at most n elements.

Proof. These statements follow immediately from the replacement theorem. See Exercise 1.20. □

Corollary 1.4.5 (Extension Theorem). Let W be a subspace of the vector space V. If S = {s₁, s₂, ..., sₘ} with each sᵢ distinct and T = {t₁, t₂, ..., tₙ} with each tᵢ distinct are bases for V and W, respectively, then there exists S′ ⊂ S having m − n elements such that T ∪ S′ is a basis for V.

Proof. By the replacement theorem, there exists S′ ⊂ S such that T ∪ S′ spans V. It suffices to show that T ∪ S′ is linearly independent. Assume without loss of generality that S′ = {s₁, ..., sₘ₋ₙ} and suppose, by way of contradiction, there exists a nontrivial linear combination of elements of T ∪ S′ satisfying

$a_1 t_1 + \cdots + a_n t_n + b_1 s_1 + \cdots + b_{m-n} s_{m-n} = 0.$

Since T is linearly independent, this implies that at least one of the bᵢ is nonzero. We assume without loss of generality that b₁ ≠ 0. Thus, we have

$s_1 = -\frac{1}{b_1}\,(a_1 t_1 + \cdots + a_n t_n + b_2 s_2 + \cdots + b_{m-n} s_{m-n}) \in \mathrm{span}(T \cup S'').$

Hence, T ∪ S″ spans V, where S″ = {s₂, ..., sₘ₋ₙ}. This is a contradiction since T ∪ S″ has only m − 1 elements, and Corollary 1.4.4 requires any set that spans the m-dimensional space V to have at least m elements. Thus, T ∪ S′ is linearly independent. □

Example 1.4.6. Consider the linearly independent polynomials

f₁ = x² − 1,  f₂ = x³ − x,  and  f₃ = x³ − 2x² − x + 1

in the space 𝔽[x; 3]. The set of monomials {1, x, x², x³} forms a basis for 𝔽[x; 3]. Let T = {f₁, f₂, f₃} and S = {1, x, x², x³}. By the extension theorem there exists a subset S′ of S such that S′ ∪ T forms a basis for 𝔽[x; 3]. One option is S′ = {x}. A straightforward calculation shows

1 = f₂ − 2f₁ − f₃,  x² = f₂ − f₁ − f₃,  and  x³ = f₂ + x,

so {f₁, f₂, f₃, x} forms a basis for 𝔽[x; 3].
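The three identities above can be verified symbolically. This sketch is ours (not part of the text) and assumes SymPy is available.

```python
import sympy as sp

x = sp.symbols('x')
f1 = x**2 - 1
f2 = x**3 - x
f3 = x**3 - 2*x**2 - x + 1

print(sp.simplify(f2 - 2*f1 - f3))   # 1,    so 1   = f2 - 2 f1 - f3
print(sp.simplify(f2 - f1 - f3))     # x**2, so x^2 = f2 - f1 - f3
print(sp.simplify(f2 + x))           # x**3, so x^3 = f2 + x

# Since 1, x, x^2, and x^3 all lie in span{f1, f2, f3, x}, that set spans
# F[x; 3]; four spanning vectors in a 4-dimensional space form a basis.
```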

Corollary 1.4.7. Let V be a finite-dimensional vector space. If W is a subspace of V and dim W = dim V, then W = V.

Proof. This follows trivially from the extension theorem (Corollary 1.4.5). It is also proved in Exercise 1.21. □

Vista 1.4.8. Both the replacement and extension theorems require that the underlying vector space be finite dimensional. This raises the question: Are infinite-dimensional vector spaces important? Or can we get away with narrowing our focus to the finite-dimensional case? Although you may have come pretty far in life just using finite-dimensional vector spaces, many important areas of mathematics rely heavily on infinite-dimensional vector spaces, such as C([a, b]; 𝔽) and L^∞([a, b]; 𝔽). One example where such a vector space occurs is the study of differential equations, which includes most, if not all, of the laws of physics. Infinite-dimensional vector spaces are also widely used in other fields such as finance, economics, geology, climate science, biology, and nearly every area of engineering.

We conclude this section with a simple but useful result on dimension of direct
sums and Cartesian products.

Proposition 1.4.9. Let $\{W_i\}_{i=1}^{n}$ be a collection of finite-dimensional subspaces of the vector space V such that the direct-sum condition (1.2) holds. If $\dim\big(\bigoplus_{i=1}^{n} W_i\big)$ exists, then $\dim\big(\bigoplus_{i=1}^{n} W_i\big) = \sum_{i=1}^{n} \dim(W_i)$.
Similarly, if V₁, V₂, ..., Vₙ is a collection of finite-dimensional vector spaces over the field 𝔽, the Cartesian product

$V = \prod_{i=1}^{n} V_i = V_1 \times V_2 \times \cdots \times V_n = \{(v_1, v_2, \ldots, v_n) \mid v_i \in V_i\}$

satisfies

$\dim(V_1 \times V_2 \times \cdots \times V_n) = \sum_{i=1}^{n} \dim(V_i).$

Proof. The proof is Exercise 1.25. □

1.4.2 * Every Vector Space Has a Basis


We now prove that every vector space has a basis; this is a major theorem in linear algebra. As an immediate consequence, this theorem says that the dimension of a vector space is well defined. Beyond that, an ordered basis is useful because it allows a vector to be represented as an array where each entry is the coefficient of the corresponding basis element, that is, the coefficient that appears when the vector is written as a linear combination in that basis. By only keeping track of the array, we can think of the basis as a coordinate system for a vector space. Thus, in addition to showing that a basis exists, this theorem also tells us that we always have a coordinate system for a given vector space.
The proof follows from Zorn's lemma, which is equivalent to the axiom of
choice. 5 We take these as axioms of set theory; for more details, see the discussion
in Appendix A.4.

Theorem 1.4.10 (Zorn's Lemma). Let (X, ≤) be a nonempty partially ordered set.⁶ If every chain in X has an upper bound in X, then X contains a maximal element.

By chain we mean any subset C ⊂ X such that C is totally ordered; that is, for every α, β ∈ C we have either α ≤ β or β ≤ α. A chain C is said to have an upper bound in X if there is an element γ ∈ X such that α ≤ γ for every α ∈ C.

Theorem 1.4.11. Every vector space has a basis.

Although this theorem is about bases of vector spaces, the idea of its proof is
useful in many settings where one needs to prove that a maximal set having certain
properties exists.

Proof. Let V be any vector space. If V = {0}, then the empty set is a basis. Hence, we may assume that V is nontrivial.
Let 𝒮 = $\{S_\alpha\}_{\alpha \in I}$ be the set of all linearly independent subsets $S_\alpha$ of V. Since V is nontrivial, 𝒮 is not empty. The set 𝒮 is partially ordered by set inclusion ⊂. To use Zorn's lemma, we must show that every chain in 𝒮 has an upper bound.

5 Proofs that rely on the axiom of choice are usually only proofs of existence. Specifically, the
theorem here about existence of a basis doesn't say anything about how to construct a basis.
6 See Definition A.3.15.

A nontrivial chain in 𝒮 is a nonempty subset $C = \{S_\alpha\}_{\alpha \in J}$ ⊂ 𝒮 that is totally ordered by set inclusion. We claim that the set $S' = \bigcup_{\alpha \in J} S_\alpha$ is an upper bound for C; that is, S′ ∈ 𝒮 and $S_\alpha \subset S'$ for all α ∈ J. The inclusion $S_\alpha \subset S'$ is immediate, so we need only show that S′ is linearly independent.
If S′ were not linearly independent, then there would exist a nontrivial linear combination a₁x₁ + ⋯ + aₘxₘ = 0 where xᵢ ∈ S′. By definition of S′, we have, for each 1 ≤ i ≤ m, that there is some αᵢ ∈ J such that $x_i \in S_{\alpha_i}$. Without loss of generality assume that $S_{\alpha_1} \subset S_{\alpha_2} \subset \cdots \subset S_{\alpha_m}$. Hence, $\{x_1, \ldots, x_m\} \subset S_{\alpha_m}$, which is a linearly independent set. This is a contradiction. Therefore, S′ must be linearly independent, and every chain in 𝒮 has an upper bound.
By Zorn's lemma, there exists a maximal element S ∈ 𝒮. We claim that S is a basis for V. It suffices to show that S spans V. If not, then there exists v ∈ V which is not in the span of S. Thus, {v} ∪ S is linearly independent and properly contains S, which contradicts the maximality of S. Hence, S must span V, as desired, and is therefore a basis for V. □

1.5 Quotient Spaces


This section describes yet another way to construct a vector space from a subspace.
Specifically, we form a vector space by considering the set of all the translates
of a given subspace. There is a natural way to define vector addition and scalar
multiplication of these translates that makes the set of translates into a vector space
called the quotient space.
As an analogy, it may be helpful to think of quotient spaces as a vector-space
analogue of modular arithmetic. Recall that in modular arithmetic we start with
a set S of all integers divisible by a certain number. So, for example, we could let
S = 5ℤ = {..., −15, −10, −5, 0, 5, 10, ...} be the set of numbers divisible by 5. The translates of S are the sets

0 + S = S = {..., −15, −10, −5, 0, 5, 10, ...},
1 + S = {..., −14, −9, −4, 1, 6, 11, ...},
2 + S = {..., −13, −8, −3, 2, 7, 12, ...},
3 + S = {..., −12, −7, −2, 3, 8, 13, ...},
4 + S = {..., −11, −6, −1, 4, 9, 14, ...}.

The set of these translates is usually written ℤ/5ℤ, and it has only 5 elements in it:

ℤ/5ℤ = {0 + S, 1 + S, 2 + S, 3 + S, 4 + S}.
There are only 5 translates of S because 0 + S = 5 + S = 10 + S = ⋯, and 2 + S = 12 + S = −8 + S = ⋯, and so forth. When we say 6 is equivalent to 1 modulo 5 or write 6 ≡ 1 (mod 5), we mean that the number 6 lies in the translate 1 + S, or equivalently 6 − 1 ∈ S, or equivalently 1 + S = 6 + S.
It is a very useful fact that addition, subtraction, and multiplication all make sense modulo S; that is, we can add, subtract, or multiply the translates by adding, subtracting, or multiplying any of their corresponding elements. So when we write 3 + 4 ≡ 2 (mod 5) we mean that if we take any element in 3 + S and add it to any element in 4 + S, we get an element in 2 + S.
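A short computation makes this concrete. The sketch below is ours (not from the text); the helper translate and the finite display window are our own devices for printing slices of these infinite sets.

```python
def translate(r, lo=-20, hi=20):
    """The elements of r + S, with S = 5Z, that fall in the window [lo, hi]."""
    return {n for n in range(lo, hi + 1) if (n - r) % 5 == 0}

print(sorted(translate(2)))           # ..., -13, -8, -3, 2, 7, 12, ...
print(translate(6) == translate(1))   # True: 6 + S = 1 + S, i.e., 6 ≡ 1 (mod 5)

# Adding any element of 3 + S to any element of 4 + S lands in 2 + S.
print(all((a + b) % 5 == 2 for a in translate(3) for b in translate(4)))  # True
```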

The construction of a vector-space quotient is very similar. Instead of a set


of multiples of a given number, we take a subspace W of a given vector space V. The translates of W are sets of the form x + W = {x + w | w ∈ W}, where x ∈ V. The set of all translates is written V/W and is called the quotient of V by W. It follows that x₁ + W = x₂ + W if and only if x₁ − x₂ ∈ W. Thus, we say that x₁ is equivalent to x₂ if x₁ + W = x₂ + W, or equivalently if x₁ − x₂ ∈ W.
Quotient spaces provide useful tools for better understanding several im-
portant concepts, including linear transformations, systems of linear equations,
Lebesgue integration, and the Chinese remainder theorem. The key fact that makes
these quotient spaces so useful is the fact that the set V /W of these translates is it-
self a vector space. That is, we can add, subtract, and scalar multiply the translates,
in direct analogy to what we did in modular arithmetic.
In this section, we define quotient spaces and the corresponding operations of
vector addition and scalar multiplication on quotient spaces. We also give several
examples.
Throughout this section, we assume that W is a subspace of a given vector
space V.

1.5.1 Cosets

Definition 1.5.1. We say that x is equivalent to y modulo W if x − y ∈ W; this is denoted $x \sim_W y$ or just by x ∼ y.

Proposition 1.5.2. The relation ∼ is an equivalence relation.⁷

Proof. It suffices to show that the relation ∼ is (i) reflexive, (ii) symmetric, and (iii) transitive.

(i) Note that x − x = 0 ∈ W. Thus, x ∼ x.

(ii) If x − y ∈ W, then y − x = (−1)(x − y) ∈ W. Hence, x ∼ y implies y ∼ x.

(iii) If x − y ∈ W and y − z ∈ W, then x − z = (x − y) + (y − z) ∈ W. Thus, x ∼ y and y ∼ z implies x ∼ z. □

Recall that an equivalence relation defines a partition. More precisely, for each x ∈ V, we define the equivalence class of x to be the set [[x]] = {y | y ∼ x}. Thus, every element x ∈ V is in exactly one equivalence class.
The equivalence classes defined by ∼ have a particularly useful structure. Note that

{y | y ∼ x} = {y | y − x ∈ W}
            = {y | y = x + w for some w ∈ W}
            = {x + w | w ∈ W}.
7 The definition of an equivalence relation is given in Appendix A.1.

Figure 1.2. A representation of the cosets of a subspace W as translates of W. Here W (red) is the subspace span({(1, −1)}) = {(x, y) | y = −x}, and each of the other lines represents a coset of W. If b = (b₁, b₂), then the coset b + W is the line {b + (x, y) | y = −x} = {(x + b₁, y + b₂) | y = −x}.

Hence, the equivalence classes of ∼ are just translates of the subspace W, and we often denote them by x + W = {x + w | w ∈ W} = [[x]]; see Figure 1.2. The equivalence classes of W are called the cosets of W. Cosets are either identical or disjoint, and we have x + W = x′ + W if and only if x − x′ ∈ W. As we show in Theorem 1.5.7 below, the set of all cosets of W is a vector space.

1.5.2 Quotient Vector Spaces

Definition 1.5.3. The set {x + W | x ∈ V} (or equivalently {[[x]] | x ∈ V}) of all cosets of W in V is denoted V/W and is called the quotient of V modulo W.

Example 1.5.4.

(i) Let V = ℝ³ and let W = span((0, 1, 0)) be the y-axis. We show that there is a natural bijective correspondence between the elements (cosets) of the quotient V/W and the elements of ℝ².
Note that any (a, b, c) ∈ V can be written as (a, b, c) = (a, 0, c) + (0, b, 0), and (0, b, 0) ∈ W. Therefore, the coset (a, b, c) + W in the quotient V/W is equal to the coset (a, 0, c) + W, and we have a surjection φ : ℝ² → V/W, defined by sending (a, c) ∈ ℝ² to (a, 0, c) + W ∈ V/W.
If φ(a, c) = φ(a′, c′), then (a, 0, c) ∼ (a′, 0, c′), implying that (a, 0, c) − (a′, 0, c′) = (a − a′, 0, c − c′) ∈ W = {(0, b, 0) | b ∈ ℝ}. It follows that a − a′ = 0 = c − c′, and so a = a′ and c = c′. Thus, the map φ is injective. This gives a bijection from ℝ² to V/W. Below we show that V/W has a natural vector-space structure, and the bijection φ preserves all the properties of a vector space.

(ii) Let V = C([0, 1]; ℝ) be the vector space of real-valued functions defined and continuous on the interval [0, 1], and let W = {f ∈ V | f(0) = 0} be the subspace of all functions that vanish at 0. We show that there is a natural bijective correspondence between the elements (cosets) of V/W and the real numbers.
Given any function f ∈ V, let $\tilde{f}(x) = f(x) - f(0)$, and let f₀ be the constant function f₀(x) = f(0). We can check that $\tilde{f} \in W$ and that f₀ ∈ V. Thus, $f = f_0 + \tilde{f}$. This shows that f(x) ∼ f₀(x) modulo W, and the coset f + W in V/W can be written as f₀ + W.
Now we can proceed as in the previous example. Given any a ∈ ℝ there is a corresponding constant function in V, which we also denote by fₐ(x) = a. Let ψ : ℝ → V/W be defined as ψ(a) = fₐ + W. This map is surjective, since any f + W ∈ V/W can be written as f₀ + W = ψ(f(0)). Also, given any a, a′ ∈ ℝ, if ψ(a) = ψ(a′), then $f_a - f_{a'} \in W$. But $f_a - f_{a'}$ is constant, and since it vanishes at 0, it must vanish everywhere; that is, a = a′. It follows that ψ is injective and thus also a bijection.
(iii) Let V = 𝔽[x] and let W = span({x³, x⁴, ...}) be the subspace of 𝔽[x] consisting of all polynomials with no nonzero terms of degree less than 3. Any f = a₀ + a₁x + ⋯ + aₙxⁿ ∈ 𝔽[x] is equivalent mod W to a₀ + a₁x + a₂x², and so the coset f + W can be written as a₀ + a₁x + a₂x² + W. An argument similar to the previous example shows that the quotient V/W is in bijective correspondence with the set {(a₀, a₁, a₂) | aᵢ ∈ 𝔽} = 𝔽³ via the map (a₀, a₁, a₂) ↦ a₀ + a₁x + a₂x² + W.

Definition 1.5.5. The operations of vector addition ⊞ : V/W × V/W → V/W and scalar multiplication ⊡ : 𝔽 × V/W → V/W on the quotient space V/W are defined to be

(i) (x + W) ⊞ (y + W) = (x + y) + W, and

(ii) a ⊡ (x + W) = (ax) + W.

Lemma 1.5.6. The operations ⊞ : V/W × V/W → V/W and ⊡ : 𝔽 × V/W → V/W are well defined for all x, y ∈ V and a ∈ 𝔽.

Figure 1.3 shows the additive part of this lemma in ℝ²: different representatives of each coset can sum to different vectors, but those different sums still lie in the same coset.

Proof.

(i) We must show that the definition of ⊞ does not depend on the choice of representative of the coset; that is, if x + W = x′ + W and y + W = y′ + W, then we must show (x + W) ⊞ (y + W) = (x′ + W) ⊞ (y′ + W).

Figure 1.3. For any two vectors x₁ and x₂ in the coset x + W and any other two vectors y₁ and y₂ in the coset y + W, the sums x₁ + y₁ and x₂ + y₂ both lie in the coset (x + y) + W.

Because x + W = x′ + W, we have that x − x′ ∈ W. And similarly, because y + W = y′ + W, we have y − y′ ∈ W. Adding yields (x − x′) + (y − y′) ∈ W, which can be rewritten as (x + y) − (x′ + y′) ∈ W. Thus, we have that (x + y) + W = (x′ + y′) + W, and so (x + W) ⊞ (y + W) = (x′ + W) ⊞ (y′ + W).

(ii) Similarly, we must show that the operation ⊡ does not depend on the choice of representative of the coset; that is, if x + W = x′ + W and a ∈ 𝔽, we show that a ⊡ (x + W) = a ⊡ (x′ + W). Since x + W = x′ + W, we have that x − x′ ∈ W, which implies that a(x − x′) ∈ W, or equivalently ax − ax′ ∈ W. Thus, we see ax + W = ax′ + W. □
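The following sketch (ours, not from the text; it assumes NumPy, and the helper same_coset is our own) illustrates this well-definedness for V = ℝ³ and W = span((0, 1, 0)): different representatives of x + W and y + W give different sums, but those sums always lie in the same coset (x + y) + W.

```python
import numpy as np

def same_coset(u, v):
    """True when u - v lies in W = span((0, 1, 0)), i.e., u + W = v + W."""
    d = np.asarray(u) - np.asarray(v)
    return bool(np.isclose(d[0], 0.0) and np.isclose(d[2], 0.0))

x, x_alt = np.array([1.0, 2.0, 3.0]), np.array([1.0, -7.0, 3.0])  # same coset x + W
y, y_alt = np.array([4.0, 0.0, 5.0]), np.array([4.0, 9.0, 5.0])   # same coset y + W

print(same_coset(x, x_alt), same_coset(y, y_alt))   # True True
print(x + y, x_alt + y_alt)                         # different vectors ...
print(same_coset(x + y, x_alt + y_alt))             # ... but the same coset: True
```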

Theorem 1.5.7. The quotient space V/W is a vector space when endowed with the operations ⊞ and ⊡ of vector addition and scalar multiplication, respectively.

Proof. The operation ⊞ is commutative because + is commutative. That is, for any cosets (x + W) ∈ V/W and (y + W) ∈ V/W, we have (x + W) ⊞ (y + W) = (x + y) + W = (y + x) + W = (y + W) ⊞ (x + W). The proof of associativity is similar.
We note that the additive identity in V/W is 0 + W, since for any x + W ∈ V/W we have (0 + W) ⊞ (x + W) = (0 + x) + W = x + W. Similarly, the additive inverse of the coset x + W is (−x) + W, since (x + W) ⊞ ((−x) + W) = (x + (−x)) + W = 0 + W.
The remaining details are left for the reader; see Exercise 1.26. □

Example 1.5.8.
(i) Consider again the example of V = ℝ³ and W = span((0, 1, 0)), as in Example 1.5.4(i). We saw already that the map φ is a bijection from ℝ² to V/W, but now V/W also has a vector-space structure. We now show that φ has the special property that it preserves both vector addition and scalar multiplication. (Note: Not every bijection from ℝ² to V/W has this property.)

The operation ⊞ is given by ((a, b, c) + W) ⊞ ((a′, b′, c′) + W) = (a + a′, b + b′, c + c′) + W. The map φ satisfies φ(a, c) ⊞ φ(a′, c′) = ((a, 0, c) + W) ⊞ ((a′, 0, c′) + W) = (a + a′, 0, c + c′) + W = φ((a, c) + (a′, c′)). Similarly, for any scalar d ∈ ℝ we have d ⊡ φ(a, c) = d ⊡ ((a, 0, c) + W) = (da, 0, dc) + W = φ(d(a, c)).
(ii) In the case of V = C([0, 1]; ℝ) and W = {f ∈ V | f(0) = 0}, we have already seen in Example 1.5.4(ii) that any coset can be uniquely written as a + W, with a ∈ ℝ (identifying the scalar a with the constant function fₐ), giving the bijection ψ : ℝ → V/W. Again the map ψ is especially nice because for any a, b ∈ ℝ we have ψ⁻¹((a + W) ⊞ (b + W)) = ψ⁻¹((a + b) + W) = a + b = ψ⁻¹(a + W) + ψ⁻¹(b + W), and for any d ∈ ℝ we have ψ⁻¹(d ⊡ (a + W)) = ψ⁻¹(da + W) = da = d ψ⁻¹(a + W).

(iii) In the case of V = 𝔽[x] and W = span({x³, x⁴, ...}), any two cosets can be written in the form (a₀ + a₁x + a₂x²) + W and (b₀ + b₁x + b₂x²) + W, and the operation ⊞ is given by ((a₀ + a₁x + a₂x²) + W) ⊞ ((b₀ + b₁x + b₂x²) + W) = ((a₀ + b₀) + (a₁ + b₁)x + (a₂ + b₂)x²) + W. Similarly, the operation ⊡ is given by d ⊡ ((a₀ + a₁x + a₂x²) + W) = (da₀ + da₁x + da₂x²) + W.
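Since every coset here has a canonical representative of degree at most 2, the quotient operations can be carried out on length-3 coefficient vectors. The sketch below is ours (not from the text), assumes NumPy, and uses our own helper coset_rep.

```python
import numpy as np

def coset_rep(coeffs):
    """Truncate a coefficient list (a0, a1, a2, a3, ...) to the representative
    a0 + a1*x + a2*x^2 of its coset modulo W = span{x^3, x^4, ...}."""
    c = np.zeros(3)
    n = min(3, len(coeffs))
    c[:n] = coeffs[:n]
    return c

p = [1.0, 2.0, 0.0, 7.0, -4.0]   # 1 + 2x + 7x^3 - 4x^4
q = [0.0, 1.0, 5.0, -1.0]        # x + 5x^2 - x^3

# The coset sum is represented by adding the representatives ...
print(coset_rep(p) + coset_rep(q))   # [1. 3. 5.]

# ... and this agrees with truncating the full polynomial sum p + q.
full_sum = np.zeros(max(len(p), len(q)))
full_sum[:len(p)] += p
full_sum[:len(q)] += q
print(coset_rep(full_sum))           # [1. 3. 5.]
```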

Example 1.5.9.

(i) In the case of V = ℝ³ and W = span((0, 1, 0)), we saw in the previous example that the operations ⊞ and ⊡ on the space ℝ³/W are identical (via φ) to the usual operations of vector addition + and scalar multiplication · on the vector space ℝ². So the vector space V/W and the vector space ℝ² are equivalent or "the same" in a sense that is defined more precisely in the next chapter.

(ii) Similarly, in the case of V = C([0, 1]; ℝ) and W = {f ∈ V | f(0) = 0} the map ψ preserves vector addition and scalar multiplication, so the space C([0, 1]; ℝ)/W is "the same as" the vector space ℝ.

(iii) Finally, in the case that V = 𝔽[x] and W = span({x³, x⁴, ...}), the space 𝔽[x]/W is "the same as" the vector space 𝔽³. Of course, these are different sets, and they have other structure beyond the vector addition and scalar multiplication, but as vector spaces they are the same.

Vista 1.5.10 (Quotient Spaces and Integration). Quotient spaces play


an important role in the theory of integration. If two functions are the same
on all but a very small set, they have the same integral. For example, the
function
$h(x) = \begin{cases} 1, & x \neq 1/2, \\ 9, & x = 1/2, \end{cases}$

has integral 1 on the interval [0, 1]. Similarly the function

f(x) = 1

also has integral 1 on the interval [0, 1]. When studying integration, it is helpful to think of these two functions as being the same.
A set Z has measure zero if it is so small that any two functions that
differ only on Z must have the same integral. The set of all integrable functions
that are supported only on a set of measure zero is a subspace of the space of
all integrable functions.
The precise way to make sense of the idea that functions like f and h
should be "the same" is to work with the quotient of the space of all integrable
functions by the subspace of functions supported on a set of measure zero. We
treat these ideas much more carefully in Chapters 8 and 9.

Exercises
Note to the student: Each section of this chapter has several corresponding
exercises, all collected here at the end of the chapter. The exercises between the
first and second line are for Section 1, the exercises between the second and third
lines are for Section 2, and so forth .
You should work every exercise (your instructor may choose to let you skip
some of the advanced exercises marked with *). We have carefully selected them,
and each is important for your ability to understand subsequent material. Many of
the examples and results proved in the exercises are used again later in the text.
Exercises marked with & are especially important and are likely to be used later
in this book and beyond. Those marked with t are harder than average, but should
still be done.
Although they are gathered together at the end of the chapter, we strongly
recommend you do the exercises for each section as soon as you have completed
the section, rather than saving them until you have finished the entire chapter. The
exercises for each section are separated from those for other sections by a
horizontal line.

1.1. Show that the set V = (0, ∞) is a vector space over ℝ with vector addition (⊕) and scalar multiplication (⊙) defined as

x ⊕ y = xy,
a ⊙ x = xᵃ,

where a ∈ ℝ and x, y ∈ V.


1.2. Finish the proof of Theorem 1.1.12 by proving properties (iii) and (iv). Also
explain why the proof of the multiplicative identity property (vii) follows
immediately from the fact that V is a vector space, but the additive identity
property (iii) does not.

1.3. Do the following sets form subspaces of 𝔽³? Prove or disprove.

(i) {(x₁, x₂, x₃) | 3x₁ + 4x₃ = 1}.
(ii) {(x₁, x₂, x₃) | x₁ = 2x₂ = x₃}.
(iii) {(x₁, x₂, x₃) | x₃ = x₁ + 4x₂}.
(iv) {(x₁, x₂, x₃) | x₃ = 2x₁ or x₃ = 3x₂}.

1.4. Do the following sets form subspaces of 𝔽[x; 4]? (See Example 1.1.15(iv).) Prove or disprove.

(i) The set of polynomials in 𝔽[x; 4] of even degree.
(ii) The set of all polynomials of degree 3.
(iii) The set of all polynomials p(x) in 𝔽[x; 4] such that p(0) = 0.
(iv) The set of all polynomials p(x) in 𝔽[x; 4] such that p′(0) = 0.
(v) The set of all polynomials in 𝔽[x; 4] having at least one real root.

1.5. Assume that 1 ≤ p < q < ∞. Is it true that $\ell^p$ is a subspace of $\ell^q$? (See Example 1.1.6(vi).) Justify your answer.

1.6. Let x₁ = (−1, 2, 3) and x₂ = (3, 4, 2).

(i) Is x = (2, 6, 6) in span({x₁, x₂})? Justify your answer.
(ii) Is y = (−9, −2, 5) in span({x₁, x₂})? Justify your answer.

1.7. Which of the following sets span 𝔽[x; 2]? Justify your answer.

(i) {1, x², x² − 3}.
(ii) {1, x − 1, x² − 1}.
(iii) {x + 2, x − 2, x² − 2}.
(iv) {x + 4, x² − 4}.
(v) {x − 1, x + 1, 2x + 3, 5x − 7}.
1.8. Which of the sets in Exercise 1.7 are linearly independent? Justify your
answer.
1.9. Assume that {v₁, v₂, ..., vₙ} spans V, and let v be any other vector in V. Show that the set {v, v₁, v₂, ..., vₙ} is linearly dependent.
1.10. Prove Proposition 1.2.5.
1.11. Prove that the monomials {1, x, x², x³, ...} ⊂ 𝔽[x] form a basis of 𝔽[x].
1.12. Let {v₁, v₂, ..., vₙ} be a linearly independent set of distinct vectors in V. Show that {v₂, ..., vₙ} does not span V.
1.13. Prove Lemma 1.2.12.
1.14. Prove that the converse of Corollary 1.2.15 is also true; that is, if any vector
in V can be uniquely expressed as a linear combination of elements of a set
S, then S is a basis.

1.15. Prove that there is a basis for 𝔽[x; n] consisting only of polynomials of degree n ∈ ℕ.
1.16. For every positive integer n, write 𝔽[x] as the direct sum of n subspaces.
1.17. Prove Proposition 1.3.1.
1.18. Let

Symₙ(𝔽) = {A ∈ Mₙ(𝔽) | Aᵀ = A},
Skewₙ(𝔽) = {A ∈ Mₙ(𝔽) | Aᵀ = −A}.

Show that

(i) Symₙ(𝔽) is a subspace of Mₙ(𝔽),

(ii) Skewₙ(𝔽) is a subspace of Mₙ(𝔽), and

(iii) Mₙ(𝔽) = Symₙ(𝔽) ⊕ Skewₙ(𝔽).

1.19. Show that any function f ∈ C([−1, 1]; ℝ) can be uniquely expressed as an even continuous function on [−1, 1] plus an odd continuous function on [−1, 1]. Then show that the spaces of even and odd continuous functions of C([−1, 1]; ℝ), respectively, form complementary subspaces of C([−1, 1]; ℝ).

1.20. Prove Corollary 1.4.4.


1.21. Let W be a subspace of the finite-dimensional vector space V . Prove, not
using any results after Section 1.4, that if dim(W) = dim(V), then W = V .
Hint: Prove the contrapositive.
1.22. Let W be a subspace of the n-dimensional vector space V. Prove that dim(W) ≤ n.
1.23. Prove: Let V be an n-dimensional vector space. If S ⊂ V is a linearly independent subset of V with n elements, then S is a basis.
1.24. Let W be a subspace of the finite-dimensional vector space V. Use the extension theorem (Corollary 1.4.5) to show that there is a subspace X such that V = W ⊕ X.
1.25. Prove Proposition 1.4.9. Hint: Consider using Theorem 1.3.7.

1.26. Using the operations defined in Definition 1.5.5, prove the remaining details
in Theorem 1.5.7.
1.27. Prove that the quotient V/W satisfies

(a ⊡ (x + W)) ⊞ (b ⊡ (y + W)) = (ax + by) + W.

1.28. Show that the quotient V /V consists of a single element.


1.29. Show that the quotient V/{0} has an obvious bijection φ : V/{0} → V which satisfies φ((x + {0}) ⊞ (y + {0})) = φ(x + {0}) + φ(y + {0}) and φ(c ⊡ (x + {0})) = cφ(x + {0}) for all x, y ∈ V and c ∈ 𝔽.

1.30. Let V = 𝔽[x] and W = span({x, x³, x⁵, ...}) be the span of the set of all odd-degree monomials. Prove that there is a bijective map ψ : V/W → 𝔽[y] satisfying ψ((p + W) ⊞ (q + W)) = ψ(p + W) + ψ(q + W) and ψ(c ⊡ (p + W)) = cψ(p + W).

Notes
For a friendly description of many modern applications of linear algebra, we recom-
mend Tim Chartier's little book [Cha15] .
The reader who wants to review elementary linear algebra may find some of
the following books useful [Lay02, Leo80, 0806, Str80]. G. Strang also has some
very clear video explanations of many important linear algebra concepts available
through the MIT Open Courseware page [StrlO].
Linear Transformations and Matrices

The Matrix is everywhere. It is all around us. Even now, in this very room. You can
see it when you look out your window, or when you turn on your television. You
can feel it when you go to work, when you go to church, when you pay your taxes.
It is the world that has been pulled over your eyes to blind you from the truth.
-Morpheus

A linear transformation (or linear map) is a function between two vector spaces
that preserves all the linear structures, that is, lines map into lines, the origin maps
to the origin, and subspaces map into subspaces.
The study of linear transformations of vector spaces can be broken down
into roughly three areas, namely, algebraic properties, geometric properties, and
operator-theoretic properties. In this chapter, we explore the algebraic properties
of linear transformations. We discuss the geometric properties in Chapter 3 and
the operator-theoretic properties in Chapter 4.
We begin by defining linear transformations and describing their attributes.
For example, when a linear transformation has an inverse, we say that it is an
isomorphism, and that the domain and codomain are isomorphic. This is a math-
ematical way of saying that, as vector spaces, the domain and codomain are the
same. One of the big results in this chapter is that all n -dimensional vector spaces
over the field lF are isomorphic to lFn .
We also consider basic features of linear transformations such as the kernel
and range. Of particular interest is the first isomorphism theorem, which states that
the quotient space of the domain of a linear transformation, modulo its kernel, is
isomorphic to its range. This is used to prove several results about the dimensions
of various spaces. In particular, an important consequence of the first isomorphism
theorem is the rank-nullity theorem, which tells us that for a given linear map, the
dimension of the domain is equal to the dimension of its range (called rank) plus
the dimension of the kernel (called nullity). The rank-nullity theorem is a major
result in linear algebra and is frequently used in applications.
Another useful consequence of the first isomorphism theorem is the second
isomorphism theorem, which provides an identity involving sums and intersections


of subspaces. Although it may appear to be of little importance initially, the second


isomorphism theorem is used to prove the dimension formula for subspaces, which
says that the dimension of the sum of two subspaces is equal to the sum of the
dimensions of each subspace minus the dimension of their intersection. This is an
intuitive counting formula that we use later on.
An important branch of linear algebra is matrix theory. Matrices appear
as representations of linear transformations between two finite-dimensional spaces,
where matrix-vector multiplication represents the mapping of a vector by the linear
transformation. This matrix representation is unique once a basis for the domain
and a basis for the codomain are chosen. This raises the question of how the matrix
representation of a linear transformation changes when the underlying bases change.
For each linear transformation, there are certain canonical bases that are natural
to use.
In an elementary linear algebra class, a great deal of attention is given to solv-
ing linear systems; that is, given an m × n matrix A and the vector b ∈ 𝔽ᵐ, find the vector x ∈ 𝔽ⁿ satisfying Ax = b. This problem breaks down into three possible outcomes: no solution, exactly one solution, and infinitely many solutions. The principal method for solving linear systems is row reduction. For small matrices, we can carry this out by hand; for large systems, computer software is required.
Although we assume the reader is already familiar with the mechanics of row reduc-
tion, we review this topic briefly because we need to have a thorough understanding
of the theory behind this method.
The last two sections of this chapter are devoted to determinants. Determi-
nants have a mixed reputation in mathematics. On the one hand, they give us
powerful tools for proving theorems in both geometry and linear algebra, and on
the other hand they are costly to compute and thus give way to other more effi-
cient methods for solving most linear algebra problems. One place where they have
substantial value, however, is in understanding how linear transformations map
volumes from one space to another. Indeed, determinants are used to define the
Jacobians used in multidimensional integration; see Section 8.7.
In this chapter, we briefly study the classical theory of determinants and prove
two major theorems: first, that the determinant of a matrix can be computed via
row reduction to the simpler problem of computing a determinant of an upper-
triangular matrix and, second, that the determinant of an upper-triangular matrix
is just the product of its diagonal elements. These two results allow us to compute
determinants fairly efficiently.

2.1 Basics of Linear Transformations I


Maps between vector spaces that preserve the vector-space structure are called linear
transformations. Linear transformations are the main objects of study in
linear algebra and are a key tool for understanding much of mathematics. Any linear
transformation has two important subspaces associated with it, namely, the kernel,
which consists of all the vectors in the domain that map to zero, and the range or
image of the linear transformation. In this section we study the basic properties of
a linear transformation and the associated kernel and range.

2.1.1 Definition and Examples

Definition 2.1.1. Let V and W be vector spaces over a common field 𝔽. A map L : V → W is a linear transformation from V into W if

L(ax₁ + bx₂) = aL(x₁) + bL(x₂)    (2.1)

for all vectors x₁, x₂ ∈ V and scalars a, b ∈ 𝔽. We say that a linear transformation is a linear operator if it maps a vector space into itself.

Remark 2.1.2. As mentioned in the definition, linear transformations are always between vector spaces with a common scalar field. Maps between vector spaces with different scalar fields are rare. It is occasionally useful to think of the complex numbers as a two-dimensional real vector space, but in that case, we would state very explicitly which scalar field we are using, and even then the linear transformations we study would still have the same scalar field (whether ℝ or ℂ) for both domain and codomain.

Example 2.1.3.

(i) For any positive integer n, the projection map πᵢ : 𝔽ⁿ → 𝔽 given by (a₁, ..., aₙ) ↦ aᵢ is linear for each i = 1, ..., n.

(ii) For any positive integer n, the n-tuple (a₁, ..., aₙ) of scalars defines a linear map (x₁, ..., xₙ) ↦ a₁x₁ + ⋯ + aₙxₙ from 𝔽ⁿ to 𝔽.

(iii) More generally, any m × n matrix A with entries in 𝔽 defines a linear transformation 𝔽ⁿ → 𝔽ᵐ by the rule x ↦ Ax.

(iv) For any positive integer n, the map Cⁿ((a, b); 𝔽) → Cⁿ⁻¹((a, b); 𝔽) defined by $f \mapsto \frac{d}{dx} f$ is linear.

(v) The map C([a, b]; 𝔽) → 𝔽 defined by $f \mapsto \int_a^b f(x)\, dx$ is linear.

(vi) For any interval [a, b] ⊂ ℝ, the map C([a, b]; 𝔽) → C([a, b]; 𝔽) given by $f \mapsto \int_a^x f(t)\, dt$ is linear.

(vii) For any interval [a, b] ⊂ ℝ and any p ∈ [a, b], the evaluation map eₚ : C([a, b]; 𝔽) → 𝔽 defined by f(x) ↦ f(p) is linear.

(viii) For any interval [a, b] ⊂ ℝ, a polynomial p ∈ 𝔽[x] is continuous on [a, b], and thus the identity map 𝔽[x] → C([a, b]; 𝔽) is linear.

(ix) For any positive integer n, the projection map ℓᵖ → 𝔽ⁿ given by

(a₁, a₂, a₃, ...) ↦ (a₁, a₂, ..., aₙ)

is a linear transformation.

(x) The left-shift map ℓᵖ → ℓᵖ given by

(a₁, a₂, a₃, ...) ↦ (a₂, a₃, a₄, ...)

is a linear operator.

(xi) For any vector space V defined over the field 𝔽 and any scalar α ∈ 𝔽, the scaling operator $h_\alpha : V \to V$ given by $h_\alpha(x) = \alpha x$ is linear.

(xii) For any θ ∈ [0, 2π), the map $\rho_\theta : \mathbb{R}^2 \to \mathbb{R}^2$ given by rotating around the origin counterclockwise by angle θ is a linear operator (a numerical check of its linearity appears after this example). This can be written as

$\rho_\theta(x, y) = (x\cos\theta - y\sin\theta,\ x\sin\theta + y\cos\theta).$

(xiii) For A ∈ Mₘ(𝔽) and B ∈ Mₙ(𝔽), the Sylvester map L : $M_{m \times n}(\mathbb{F}) \to M_{m \times n}(\mathbb{F})$ given by X ↦ AX + XB is a linear operator. Similarly, the Stein map K : $M_{m \times n}(\mathbb{F}) \to M_{m \times n}(\mathbb{F})$ given by X ↦ AXB − X is also a linear operator.
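As a numerical spot-check of the defining property (2.1), the rotation map from item (xii) can be written as a 2 × 2 matrix and tested on random vectors. This sketch is ours (not from the text) and assumes NumPy.

```python
import numpy as np

theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])   # the rotation rho_theta

rng = np.random.default_rng(1)
x1, x2 = rng.standard_normal(2), rng.standard_normal(2)
a, b = 2.5, -0.75

lhs = R @ (a * x1 + b * x2)          # L(a x1 + b x2)
rhs = a * (R @ x1) + b * (R @ x2)    # a L(x1) + b L(x2)
print(np.allclose(lhs, rhs))         # True, as (2.1) requires
```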

Unexample 2.1.4.

(i) The map φ : 𝔽 → 𝔽 given by φ(x) = x² is not linear because (x + y)² is not equal to x² + y², and therefore φ(x + y) ≠ φ(x) + φ(y).

(ii) Most functions 𝔽 → 𝔽 are not linear operators; indeed, a continuous function f : 𝔽 → 𝔽 is linear if and only if f(x) = ax. Thus, most functions in the standard arsenal of useful functions, including cos(x), sin(x), xⁿ, eˣ, and most other functions you can think of, are not linear. However, do not confuse this with the fact that the set C(𝔽; 𝔽) of all continuous functions from 𝔽 to 𝔽 is a vector space, and there are many linear operators on this vector space. Such operators are not functions from 𝔽 to 𝔽, but rather are functions from C(𝔽; 𝔽) to itself. In other words, the vectors in C(𝔽; 𝔽) are continuous functions and the operators act on those continuous functions and return other continuous functions. For example, if h ∈ C(𝔽; 𝔽) is any continuous function, we may define a linear operator Lₕ : C(𝔽; 𝔽) → C(𝔽; 𝔽) by right multiplication; that is, Lₕ[f] = h · f.

The fundamental properties of linear transformations are given in the following


proposition.

Proposition 2.1.5. Let V and W be vector spaces. A linear transformation L : V → W

(i) maps lines into lines; that is, L(tx + (1 − t)y) = tL(x) + (1 − t)L(y) for all t ∈ 𝔽;

(ii) maps the origin to the origin; that is, L(0) = 0; and

(iii) maps subspaces to subspaces; that is, if X is a subspace of V, then the set L(X) is a subspace of W.

Proof.

(i) This is an immediate consequence of (2.1).

(ii) Since L(x) = L(x + 0) = L(x) + L(0), it follows by the uniqueness of the additive identity that L(0) = 0 by Proposition 1.1.7.

(iii) Assume that y₁, y₂ ∈ L(X). Hence, y₁ = L(x₁) and y₂ = L(x₂) for some x₁, x₂ ∈ X. Since ax₁ + bx₂ ∈ X, we have that ay₁ + by₂ = aL(x₁) + bL(x₂) = L(ax₁ + bx₂) ∈ L(X). □

Remark 2.1.6. In Proposition 2.l.5(i), the line tx + (1 - t)y maps to the origin if
L(x) = L(y) = 0. If we consider a point as a degenerate line, we can still say that
linear transformations map lines to lines.

2.1.2 The Kernel and Range


A fundamental problem in linear algebra is to solve systems of linear equations.
This problem can be recast as finding all solutions x to the equation L(x) = b,
where L is a linear map from the vector space V into W and b is an element of W.
The special case when b = 0 is called a homogeneous linear system. The
solution set 𝒩(L) = {x ∈ V | L(x) = 0} is called the kernel of the transformation L. The system L(x) = 0 always has at least one solution (x = 0), and so 𝒩(L) is always nonempty. In fact, it turns out that 𝒩(L) is a subspace of the domain V.
On the other hand, if b ≠ 0, it is possible that the system has no solutions at all. It all depends on whether b is in the range ℛ(L) of the linear transformation L. As it turns out, the range also forms a subspace, but of W instead of V.
These two subspaces, the kernel and the range, tell us a lot about the linear transformation L and a lot about the solutions of linear systems of the form L(x) = b. For example, for each b ∈ ℛ(L) the set of all solutions of L(x) = b is a coset v + 𝒩(L) for some v ∈ V.

Definition 2.1.7. Let V and W be vector spaces. The kernel (or null space) of a linear transformation L : V → W is the set 𝒩(L) = {x ∈ V | L(x) = 0}. The range (or image) of L is the set ℛ(L) = {L(x) ∈ W | x ∈ V}.

Proposition 2.1.8. Let V and W be vector spaces, and let L : V → W be a linear map. We have the following:

(i) 𝒩(L) is a subspace of V.

(ii) ℛ(L) is a subspace of W.

Proof. Note that both 𝒩(L) and ℛ(L) are nonempty since L(0) = 0.

(i) If x₁, x₂ ∈ 𝒩(L), then L(x₁) = L(x₂) = 0, which implies that L(ax₁ + bx₂) = aL(x₁) + bL(x₂) = 0. Therefore ax₁ + bx₂ ∈ 𝒩(L).

(ii) This follows immediately from Proposition 2.1.5(iii). □

Example 2.1.9. We have the following:

(i) Let L : ℝ³ → ℝ² be a projection given by (x, y, z) ↦ (x, y). It follows that ℛ(L) = ℝ² and 𝒩(L) = {(0, 0, z) | z ∈ ℝ} (see the numerical sketch after this example).

(ii) Let L : ℝ² → ℝ³ be an identity map given by (x, y) ↦ (x, y, 0). It follows that ℛ(L) = {(x, y, 0) | x, y ∈ ℝ} and 𝒩(L) = {(0, 0)}.

(iii) Let L : ℝ³ → ℝ³ be given by L(x) = x. It follows that 𝒩(L) = {0} and ℛ(L) = ℝ³.

(iv) Let L : C¹([0, 1]; 𝔽) → C([0, 1]; 𝔽) be given by L[f] = f′ + f. It is easy to show that L is linear. Note that 𝒩(L) = span{e⁻ˣ}. To prove that L is surjective, let g(x) ∈ C([0, 1]; 𝔽), and for 0 ≤ x ≤ 1, define

$f(x) = e^{-x}\Big(C + \int_0^x e^t g(t)\, dt\Big)$

for any C ∈ 𝔽. It is straightforward to verify that L[f] = g. Thus, ℛ(L) = C([0, 1]; 𝔽).
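For matrix maps such as the projection in item (i), the kernel can be computed numerically. The sketch below is ours (not from the text) and assumes NumPy and SciPy; scipy.linalg.null_space returns an orthonormal basis of the kernel.

```python
import numpy as np
from scipy.linalg import null_space

# The projection (x, y, z) -> (x, y) of Example 2.1.9(i) as a matrix.
L = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])

N = null_space(L)                 # columns form an orthonormal basis of the kernel
print(N.shape)                    # (3, 1): the kernel is one-dimensional
print(np.allclose(np.abs(N[:, 0]), [0.0, 0.0, 1.0]))   # True: spanned by (0, 0, 1)
print(np.allclose(L @ N, 0.0))                          # True
```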

Remark 2.1.10. Given a linear map L from one function space to another, we
typically write L[f] as the image of f. To evaluate the function at a point x ,
we write L[f](x).

2.2 Basics of Linear Transformations II


In this section, we examine two algebraic properties that allow us to build new
linear transformations from existing ones: the composition of two linear maps is
again a linear map, and a linear combination of linear maps is a linear map.
We also define formally what it means for two vector spaces to be "the same,"
or isomorphic. Maps identifying isomorphic vector spaces are called isomorphisms.
Isomorphisms are useful when one vector space is difficult or confusing to work with,
but we can construct an isomorphism to another vector space where the structure
is easier to visualize or work with. Any results we find in the new vector space also
apply to the original vector space.

2.2.1 Two Algebraic Properties of Linear Transformations

Proposition 2.2.1.

(i) If L : V → W and K : W → X are linear transformations, then the composition K ∘ L : V → X is also a linear transformation.

(ii) If L : V → W and K : V → W are linear transformations and r, s ∈ 𝔽, then the map rL + sK, defined by (rL + sK)(x) = rL(x) + sK(x), is also a linear transformation.

Proof.
(i) If a, b ∈ 𝔽 and x, y ∈ V, then

(K ∘ L)(ax + by) = K(aL(x) + bL(y)) = aK(L(x)) + bK(L(y))
                 = a(K ∘ L)(x) + b(K ∘ L)(y).

Thus, the composition of two linear transformations is linear.

(ii) If a, b, r, s ∈ 𝔽 and x, y ∈ V, then

(rL + sK)(ax + by) = rL(ax + by) + sK(ax + by)
                   = r(aL(x) + bL(y)) + s(aK(x) + bK(y))
                   = a(rL + sK)(x) + b(rL + sK)(y).

Thus, the linear combination rL + sK is itself a linear transformation. □

Definition 2.2.2. Let V and W be vector spaces over the same field 𝔽. Let ℒ(V; W) be the set of linear transformations L mapping V into W with the pointwise operations of vector addition and scalar multiplication:

(i) If f, g ∈ ℒ(V; W), then f + g is the map defined by (f + g)(x) = f(x) + g(x).

(ii) If a ∈ 𝔽, then af is the map defined by (af)(x) = af(x).

Corollary 2.2.3. Let V and W be vector spaces over the same field 𝔽. The set ℒ(V; W) with the pointwise operations of vector addition and scalar multiplication forms a vector space over 𝔽.

Proof. The proof is Exercise 2.5. □

Remark 2.2.4. For notational convenience, we denote the composition of two linear transformations K and L as the product KL instead of K ∘ L. When expressing the repeated composition of a linear operator K : V → V, we write it as powers of K; for example, we write K² instead of K ∘ K.

Nota Bene 2.2.5. The distributive laws hold for compositions of sums and sums of compositions; that is, K(L₁ + L₂) = KL₁ + KL₂ and (L₁ + L₂)K = L₁K + L₂K. However, commutativity generally fails. In fact, if L : V → W and K : W → X and X ≠ V, then the composition LK doesn't even make sense. Even when X = V, we seldom have commutativity.

2.2.2 Invertibility
We finish this section by developing the main ideas of invertibility and defining the
concept of an isomorphism, which tells us when two vector spaces are essentially
the same.

Definition 2.2.6. A linear transformation L : V → W is called invertible if it has an inverse that is also a linear transformation.

A function has an inverse if and only if it is bijective; see Theorem A.2.19(iii) in the appendix. One might think that there could be a linear transformation whose inverse function is not linear, but the next proposition shows that if a linear transformation has an inverse, then that inverse must also be linear.

Proposition 2.2. 7. If a linear transformation is bijective, then the inverse function


is also a linear transformation.

Proof. Assume that L : V → W is a bijective linear transformation with inverse function L⁻¹. Given w₁, w₂ ∈ W there exist x₁, x₂ ∈ V such that L(x₁) = w₁ and L(x₂) = w₂. Thus, for a, b ∈ 𝔽, we have L⁻¹(aw₁ + bw₂) = L⁻¹(aL(x₁) + bL(x₂)) = L⁻¹(L(ax₁ + bx₂)) = ax₁ + bx₂ = aL⁻¹(w₁) + bL⁻¹(w₂). Thus, L⁻¹ is linear. □

Corollary 2.2.8. A linear transformation is invertible if and only if it is bijective.

Definition 2.2.9. An invertible linear transformation is called an isomorphism.⁸ Two spaces V and W are isomorphic, denoted V ≅ W, if there exists an isomorphism L : V → W. If an isomorphism is a linear operator (that is, if W = V), then it is called an automorphism.⁹

Example 2.2.10.

(i) For any positive integer n, let W = {(0, a₂, a₃, ..., aₙ) | aᵢ ∈ 𝔽} be a subspace of 𝔽ⁿ. We claim 𝔽ⁿ⁻¹ is isomorphic to W. The isomorphism between them is the linear map (b₁, b₂, ..., bₙ₋₁) ↦ (0, b₁, b₂, ..., bₙ₋₁) with the obvious inverse.

8 For experts, we note that it is common to define an isomorphism to be a bijective linear transformation, and although that is equivalent to our definition, it is the "wrong" definition. What we really care about is that all the properties of the two vector spaces match up; that is, any relation in one vector space maps to (by the isomorphism or by its inverse) the same relation in the other vector space. The importance of requiring the inverse to preserve all the important properties is evident in categories where bijectivity is not sufficient. For example, an isomorphism of topological spaces (a homeomorphism) is a continuous map that has a continuous inverse, but a continuous bijective map need not have a continuous inverse.
9 It is worth noting the Greek roots of these words: iso means equal, morph means shape or form, and auto means self.

(ii) The vector space 𝔽[x; n] of polynomials of degree at most n is isomorphic to 𝔽ⁿ⁺¹. The isomorphism is given by the map a₀ + a₁x + ⋯ + aₙxⁿ ↦ (a₀, a₁, ..., aₙ) ∈ 𝔽ⁿ⁺¹, which is readily seen to be linear and invertible.
Note that while these two vector spaces are isomorphic, the domain has additional structure that is not necessarily preserved by the vector-space isomorphism. For example, multiplication of polynomials is an interesting and useful operation, but it is not part of the vector-space structure, and so it is not guaranteed to be preserved by isomorphism.

Remark 2.2.11. The inverse of an invertible linear transformation L is also invertible. Specifically, (L⁻¹)⁻¹ = L. Also, the composition KL of two invertible linear transformations is itself invertible with inverse (KL)⁻¹ = L⁻¹K⁻¹ (see Exercise 2.6).

2.2.3 Isomorphisms

Theorem 2.2.12. The relation ≅ is an equivalence relation on the collection of all vector spaces.

Proof. Since the identity map I_V : V → V is an isomorphism, it follows that ≅ is reflexive. If V ≅ W, then there exists an isomorphism L : V → W. By Remark 2.2.11, we know that L⁻¹ is also an isomorphism, and thus ≅ is symmetric. Finally, if V ≅ W and W ≅ X, then there exist isomorphisms L : V → W and K : W → X. By Remark 2.2.11 the composition of isomorphisms is an isomorphism, which shows that KL is an isomorphism and ≅ is transitive. □

Remark 2.2.13. Of course not every operator is invertible, but even when an
operator is not invertible, many of the results of invertibility can still be used. In
Sections 4.6.1 and 12.9, we examine certain generalized inverses of linear operators.
These are operators that are not quite inverses but do have some of the properties
enjoyed by inverses.

Two vector spaces that are isomorphic are essentially the same with respect to
the linear structure. The following shows that every property of vector spaces that
we have discussed so far in this book is preserved by isomorphism . We use this often
because many vector spaces that look complicated can be shown to be isomorphic
to a simpler-looking vector space. In particular, we show in Corollary 2.3.12 that
all vector spaces of dimension n are isomorphic.

Proposition 2.2.14. If V and W are isomorphic vector spaces, with isomorphism L : V → W, then the following hold:

(i) A linear equation holds in V if and only if it also holds in W; that is, ∑_{i=1}^n a_i x_i = 0 holds in V if and only if ∑_{i=1}^n a_i L(x_i) = 0 holds in W.

(ii) A set B = {v₁, …, v_n} is a basis of V if and only if LB = {L(v₁), …, L(v_n)} is a basis for W. Moreover, the dimension of V is equal to the dimension of W.

(iii) The set of all subspaces of V is in bijective correspondence with the set of all subspaces of W.

(iv) If K : W → X is any linear transformation, then the composition KL : V → X is also a linear transformation, and we have

𝒩(KL) = L⁻¹𝒩(K) = {v | L(v) ∈ 𝒩(K)}

and

ℛ(KL) = ℛ(K).

Proof.

(i) If ∑_{i=1}^n a_i x_i = 0 in V, then 0 = L(0) = L(∑_{i=1}^n a_i x_i) = ∑_{i=1}^n a_i L(x_i) in W. The converse follows by applying L⁻¹ to ∑_{i=1}^n a_i L(x_i) = 0.

(ii) Since L is surjective, any element w ∈ W can be written as L(v) for some v ∈ V. Because B is a basis, it spans V, and we can write v = ∑_{i=1}^n a_i v_i for some choice of a₁, …, a_n ∈ 𝔽. Applying L gives w = L(v) = L(∑_{i=1}^n a_i v_i) = ∑_{i=1}^n a_i L(v_i), so the set LB = {L(v₁), …, L(v_n)} spans W. It is also linearly independent since ∑_{i=1}^n c_i L(v_i) = 0 implies ∑_{i=1}^n c_i L⁻¹L(v_i) = ∑_{i=1}^n c_i v_i = 0 by part (i). Since B is linearly independent, we must have c_i = 0 for all i.
The converse follows by applying the same argument with L⁻¹.

(iii) Let 𝒮_V be the set of subspaces of V and 𝒮_W be the set of subspaces of W. The linear transformation L induces a map ℒ : 𝒮_V → 𝒮_W given by sending X ∈ 𝒮_V to the subspace LX = {L(v) | v ∈ X} ∈ 𝒮_W. Similarly, L⁻¹ induces a map ℒ⁻¹ : 𝒮_W → 𝒮_V. The composition ℒ⁻¹ℒ is the identity I_{𝒮_V} because for any X ∈ 𝒮_V we have

ℒ⁻¹ℒ(X) = ℒ⁻¹{L(v) | v ∈ X} = {L⁻¹Lv | v ∈ X} = {v | v ∈ X} = X.

Similarly, ℒℒ⁻¹ = I_{𝒮_W}, so the maps ℒ and ℒ⁻¹ are inverses and thus bijective.

(iv) See Exercise 2.7. D

2.3 Rank, Nullity, and the First Isomorphism Theorem


As we saw in the previous section, the kernel is the solution set to the homogeneous linear system L(x) = 0, and it forms a subspace of the domain of L. When we want to consider the more general equation L(x) = b with b ≠ 0, the solution set is no longer a subspace, but rather a coset of the kernel. That is, if x₀ is any one solution to the system L(x) = b, then the set of all possible solutions is the coset x₀ + 𝒩(L). This illustrates the importance of the quotient space V/𝒩(L) in the study of linear transformations and linear systems of equations.
In this section, we study these relationships more carefully. In particular, we show that the quotient V/𝒩(L) is isomorphic to ℛ(L). This is called the first isomorphism theorem, and it has powerful consequences. We use it to give a

formula relating the dimension of the image (called the rank) and the dimension of
the kernel (called the nullity).
We also use the first isomorphism theorem to prove the extremely important
result that all n-dimensional vector spaces over 𝔽 are isomorphic to 𝔽ⁿ.
Finally, we use the first isomorphism theorem to prove another theorem, called
the second isomorphism theorem, about the relation between quotients of sums and
intersections of subspaces. This theorem provides a formula, called the dimension
formula, relating the dimensions of sums and intersections of subspaces.

2.3 .1 Quotients, Rank, and Nullity


Proposition 2.3.1. Let W be a subspace of the vector space V and denote by V/W the quotient space. The mapping π : V → V/W defined by π(v) = v + W is a surjective linear transformation. We call π the canonical epimorphism.¹⁰

Proof. Since π is clearly surjective, it suffices to show that π is linear. We check

π(ax + by) = (ax + by) + W = a⊙(x + W) ⊕ b⊙(y + W) = a⊙π(x) ⊕ b⊙π(y). □

Example 2.3.2. Exercise 1.29 shows for a vector space V that V/{0} ≅ V. Alternatively, this follows from the proposition as follows. Define a linear map π : V → V/{0} by π(x) = x + {0}. By the proposition, π is surjective. Thus, it suffices to show that π is injective. If π(x) = π(y), then x + {0} = y + {0}, which implies that x − y ∈ {0}. Therefore, x = y and π is injective.

The following lemma is a very handy tool.

Lemma 2.3.3. A linear map L is injective if and only if 𝒩(L) = {0}.

Proof. L is injective if and only if L(x) = L(y) implies x = y, which (setting z = x − y and using linearity) holds if and only if L(z) = 0 implies z = 0, which holds if and only if 𝒩(L) = {0}. □

Lemma 2.3.4. Let L : V → Z be a linear transformation between vector spaces V and Z. Assume that W and Y are subspaces of V and Z, respectively. If L(W) ⊂ Y, then L induces a linear transformation L̄ : V/W → Z/Y defined by L̄(x + W) = L(x) + Y.

Proof. We show that L̄ is well defined (see Appendix A.2.4) and linear. If x + W = y + W, then x − y ∈ W, which implies that L(x) − L(y) ∈ L(W) ⊂ Y. Hence, L̄(x + W) = L(x) + Y = L(y) + Y = L̄(y + W). Thus, L̄ is well defined. To show

10 An epimorphism is a surjective linear transformat ion. The name comes from the Greek root
epi-, meaning on, upon, or above.

that L̄ is linear, we note that

L̄(a⊙(x + W) ⊕ b⊙(y + W)) = L̄(ax + by + W)
                          = L(ax + by) + Y
                          = aL(x) + bL(y) + Y
                          = a⊙(L(x) + Y) ⊕ b⊙(L(y) + Y)
                          = a⊙L̄(x + W) ⊕ b⊙L̄(y + W). □

Theorem 2.3.5 (First Isomorphism Theorem). If V and X are vector spaces and L : V → X is a linear transformation, then V/𝒩(L) ≅ ℛ(L). In particular, if L is surjective, then V/𝒩(L) ≅ X.

Proof. Apply the previous lemma with W = 𝒩(L), with Y = {0}, and with Z = ℛ(L) to get an induced linear transformation L̄ : V/𝒩(L) → ℛ(L)/{0} ≅ ℛ(L) that is clearly surjective since x + 𝒩(L) maps to L(x) + {0}, which then maps to L(x) ∈ ℛ(L). Thus, it suffices to show that L̄ is injective. If L̄(x + 𝒩(L)) = L̄(y + 𝒩(L)), then L(x) + {0} = L(y) + {0}. Equivalently, L(x − y) ∈ {0}, which implies that x − y ∈ 𝒩(L), and so x + 𝒩(L) = y + 𝒩(L). Thus, L̄ is injective. □

The above theorem is incredibly useful because we can use it to prove that a
quotient V/W is isomorphic to another space X. The standard way to do this is
to construct a surjective linear transformation V-+ X that has kernel equal to W.
The first isomorphism theorem then gives the desired isomorphism.

Example 2.3.6.

(i) If V is a vector space, then we can show that V/V ≅ {0}. Define the linear map L : V → {0} by L(x) = 0. The kernel of L is all of V, and so the result follows from the first isomorphism theorem.

(ii) Let V = {(0, a₂, a₃, …, a_n) | a_i ∈ 𝔽} ⊂ 𝔽ⁿ for some integer n ≥ 2. The quotient 𝔽ⁿ/V is isomorphic to 𝔽. To see this use the map π₁ : 𝔽ⁿ → 𝔽 (see Example 2.1.3(i)) defined by (a₁, …, a_n) ↦ a₁. One can readily check that this is linear and surjective with kernel equal to V, and so the first isomorphism theorem gives the desired result.

(iii) The set of constant functions Const_𝔽 forms a subspace of Cⁿ([a, b]; 𝔽). The quotient Cⁿ([a, b]; 𝔽)/Const_𝔽 is isomorphic to Cⁿ⁻¹([a, b]; 𝔽). To see this, define the linear map D : Cⁿ([a, b]; 𝔽) → Cⁿ⁻¹([a, b]; 𝔽) as the derivative D[f](x) = f′(x). Its kernel is precisely Const_𝔽, and so the first isomorphism theorem implies the quotient is isomorphic to ℛ(D). But D is also surjective, since, by the fundamental theorem of calculus, the map Int : f ↦ ∫ₐˣ f(t) dt is a right inverse to D (see Example A.2.22). Since D is surjective, the induced map D̄ : Cⁿ([a, b]; 𝔽)/Const_𝔽 → Cⁿ⁻¹([a, b]; 𝔽) is an isomorphism.

(iv) For any p ∈ [a, b] ⊂ ℝ let M_p = {f ∈ C([a, b]; 𝔽) | f(p) = 0}. We claim that C([a, b]; 𝔽)/M_p ≅ 𝔽. Use the linear map e_p : C([a, b]; 𝔽) → 𝔽 given by e_p(f) = f(p). The kernel of e_p is precisely M_p, and so the result follows from the first isomorphism theorem.

Theorem 2.3.7. If W is a subspace of a finite-dimensional vector space V, then

dim V = dim W + dim V/W.    (2.2)

Proof. If W = {0} or W = V, then the result follows trivially from Example 2.3.2 or Example 2.3.6(i). Thus, we assume that W is a proper, nontrivial subspace of V. Let S = {x₁, …, x_r} be a basis for W. By the extension theorem (Corollary 1.4.5) we can choose T = {y_{r+1}, …, y_n} so that S ∪ T is a basis for V and S ∩ T = ∅. We claim that the set T_W = {y_{r+1} + W, …, y_n + W} is a basis for V/W. It follows that dim V = n = r + (n − r) = dim W + dim V/W.
We show that T_W is a basis. To show that T_W is linearly independent, assume

β_{r+1}⊙(y_{r+1} + W) ⊕ ⋯ ⊕ β_n⊙(y_n + W) = 0 + W.

Thus, ∑_{j=r+1}^n β_j y_j ∈ W. Writing this element as ∑_{i=1}^r α_i x_i (since S spans W) gives a linear relation among the elements of the basis S ∪ T, so all of the coefficients vanish; in particular, each β_j = 0. Thus, T_W is linearly independent. To show that T_W spans V/W, let v + W ∈ V/W for some v ∈ V. Thus, we have that v = ∑_{i=1}^r α_i x_i + ∑_{j=r+1}^n β_j y_j. Since each x_i ∈ W, passing to the quotient gives v + W = β_{r+1}⊙(y_{r+1} + W) ⊕ ⋯ ⊕ β_n⊙(y_n + W). □

Definition 2.3.8. Let L : V → W be a linear transformation between vector spaces V and W. The rank of L is the dimension of its range, and the nullity of L is the dimension of its kernel. More precisely, rank(L) = dim ℛ(L) and nullity(L) = dim 𝒩(L).

Corollary 2.3.9 (Rank-Nullity Theorem). Let V and W be vector spaces, with V of finite dimension. If L : V → W is a linear transformation, then

dim V = dim ℛ(L) + dim 𝒩(L) = rank(L) + nullity(L).    (2.3)

Proof. Note that (2.2) implies that dim V = dim 𝒩(L) + dim V/𝒩(L). The first isomorphism theorem (Theorem 2.3.5) tells us that V/𝒩(L) ≅ ℛ(L), and dimension is preserved by isomorphism (Proposition 2.2.14(ii)), so dim V/𝒩(L) = dim ℛ(L). □
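The rank-nullity theorem is easy to check numerically for a matrix acting on 𝔽ⁿ. The following sketch (not part of the text; the matrix is an arbitrary example) uses NumPy to compare the rank against n minus the nullity:

```python
import numpy as np

# An arbitrary 3x5 matrix viewed as a linear map from F^5 to F^3.
A = np.array([[1., 2., 3., 0., 1.],
              [0., 1., 1., 1., 0.],
              [1., 3., 4., 1., 1.]])   # row 3 = row 1 + row 2, so the rank is less than 3

n = A.shape[1]                          # dimension of the domain
rank = np.linalg.matrix_rank(A)         # dim R(A)
nullity = n - rank                      # dim N(A), by the rank-nullity theorem

print(rank, nullity, rank + nullity == n)   # 2 3 True
```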

Remark 2.3.10. So far, when talking about addition and scalar multiplication of cosets we have used the notation ⊕ and ⊙ in order to help the reader see that these operations on V/W differ from the addition and scalar multiplication in V. However, most authors use + and · (or juxtaposition) for these operations on the quotient space and just expect the reader to be able to identify from context when the operators are being used in V/W and when they are being used

in V. From now on we also use this more standard, but possibly more confusing,
notation.

2.3.2 Isomorphisms of Finite-Dimensional Vector Spaces

Corollary 2.3.11. Assume that V and W are n-dimensional vector spaces. A linear map L : V → W is injective if and only if it is surjective.

Proof. By Lemma 2.3.3, L is injective if and only if dim(𝒩(L)) = 0. By the rank-nullity theorem, this holds if and only if rank(L) = n. However, by Corollary 1.4.7 we know rank(L) = n if and only if ℛ(L) = W, which means L is surjective. □

Corollary 2.3.11 implies that two vector spaces of the same finite dimension,
and over the same field lF, are isomorphic if there is either an injective or a surjective
linear mapping between the two spaces. The next corollary shows that such a map
always exists.

Corollary 2.3.12. An n-dimensional vector space V over the field 𝔽 is isomorphic to 𝔽ⁿ.

Proof. Let T = {x₁, x₂, …, x_n} be a basis for V. Define the map L : 𝔽ⁿ → V as L((a₁, a₂, …, a_n)) = a₁x₁ + a₂x₂ + ⋯ + a_nx_n. It is straightforward to check that this is a linear transformation. We wish to show that L is bijective and hence an isomorphism. By Corollary 2.3.11, it suffices to show that L is surjective. Because T is a basis, every element x ∈ V can be written as a linear combination x = ∑_{i=1}^n a_i x_i of vectors in T, so x = L(a₁, …, a_n). Hence, L is surjective, as required. □

Example 2.3.13. An immediate consequence of the corollary above is that 𝔽[x; n] ≅ 𝔽ⁿ⁺¹ and M_{m×n}(𝔽) ≅ 𝔽^{mn}.
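For the second isomorphism above, the coordinate map just lists the entries of a matrix in a fixed order. A quick sketch (not from the text; the row-major ordering chosen here is one possible choice of ordered basis) makes the correspondence concrete with NumPy's reshape:

```python
import numpy as np

A = np.array([[1., 2., 3.],
              [4., 5., 6.]])        # an element of M_{2x3}(R)

v = A.reshape(-1)                   # its coordinate vector in R^6 (row-major order)
B = v.reshape(2, 3)                 # the inverse map recovers the matrix

print(v)                            # [1. 2. 3. 4. 5. 6.]
print(np.array_equal(A, B))         # True
```

A different ordering of the entries corresponds to a different ordered basis and hence a different, but equally valid, isomorphism.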

Remark 2.3.14. Corollary 2.3.12 is a big deal. Among other things, it means
that even though there are many different descriptions of finite-dimensional vector
spaces, there is essentially (up to isomorphism) only one n-dimensional vector space
over lF for each nonnegative integer n. When combined with the results of the next
two sections, this allows essentially all of finite-dimensional linear algebra to be
reduced to matrix analysis.

Remark 2.3.15. Again, we note that many vector spaces also carry additional
structure that is not part of the vector-space structure, and the isomorphisms we
have discussed do not necessarily preserve this other structure. So, for example,
while 𝔽[x; n² − 1] is isomorphic to 𝔽^{n²} as a vector space, and these are both isomorphic to M_n(𝔽) as vector spaces, multiplication of polynomials and multiplication


of matrices in these vector spaces are not "the same." In fact, multiplication of
polynomials is commutative and multiplication of matrices is not, so we can't hope
to identify these multiplicative structures. But when we only care about the vector-
space structure, the spaces are isomorphic and can be treated as being ''the same."

Nota Bene 2.3.16. Although any n-dimensional vector space is isomorphic to 𝔽ⁿ, beware that the actual isomorphism depends on the choice of basis and, in fact, on the choice of the order of the elements in the basis. If we had chosen a different basis or a different ordering of the elements of the basis in the proof of Corollary 2.3.12, we would have had a different isomorphism.

As a final consequence of Corollary 2.3.11, we show that for finite-dimensional


vector spaces, if any operator has a left inverse, then that left inverse is also a right
inverse.

Proposition 2.3.17. Let K, L : V → V be linear operators on a finite-dimensional vector space V. If KL = I is the identity map, then LK is also the identity map, and thus L and K are isomorphisms.

Proof. The identity map is injective, so if KL = I, then L is also injective (see Proposition A.2.9), and by Corollary 2.3.11, the map L must also be surjective. Consider now the operator (I − LK) : V → V. We have

(I − LK)L = L − L(KL) = L − L = 0.

This implies that every element of ℛ(L) is in the kernel of (I − LK). But L is surjective, so 𝒩(I − LK) = ℛ(L) = V. Hence, (I − LK) = 0, and I = LK. □

2.3.3 The Dimension Formula*

Corollary 2.3.18 (Second Isomorphism Theorem). Let V be a vector space. If V₁ and V₂ are subspaces of V, then

V₁/(V₁ ∩ V₂) ≅ (V₁ + V₂)/V₂.    (2.4)

The following diagram is sometimes helpful when thinking about the second
isomorphism theorem:

    V₁    ↪    V₁ + V₂
    ↑              ↑
 V₁ ∩ V₂  ↪       V₂

Here each of the symbols ↪ denotes the inclusion of a subspace inside of another vector space. Associated with each inclusion A ↪ B we can construct
the quotient vector space B /A. The second isomorphism theorem says that the
quotient associated with the left vertical side of the square is isomorphic to
the quotient associated with the right vertical side of the square. If we relabel the
subspaces, the second isomorphism theorem also says that the bottom quotient is
isomorphic to the top quotient.

Figure 2.1. If a subspace W₁ has dimension d₁ and a subspace W₂ has dimension d₂, then the dimension formula (2.5) says that the intersection W₁ ∩ W₂ has dimension dim W₁ + dim W₂ − dim(W₁ + W₂). In this example, both W₁ and W₂ have dimension 2, and their sum has dimension 3; therefore the intersection has dimension 1.

Proof. Define L : V₁ → (V₁ + V₂)/V₂ by L(x) = x + V₂. Clearly, L is surjective. Note that 𝒩(L) = {x ∈ V₁ | x + V₂ = 0 + V₂} = {x ∈ V₁ | x ∈ V₂} = V₁ ∩ V₂. Thus, by Theorem 2.3.5, we have (2.4). □

The next corollary tells us about the dimension of sums and intersections of
subspaces. This result should seem geometrically intuitive: for example, if W1 and
W2 are two distinct planes (through t he origin) in IR 3 , then their sum is all of IR 3 ,
and their intersection is a line. Thus, we have dim W1 + dim W2 = 2 + 2 = 4, and
dim(W1 n W2) + dim(W1 + W2) = 1 + 3 = 4; see Figure 2.1.

Corollary 2.3.19 (Dimension Formula). If V₁ and V₂ are finite-dimensional subspaces of a vector space V, then

dim V₁ + dim V₂ = dim(V₁ ∩ V₂) + dim(V₁ + V₂).    (2.5)

Proof. By the second isomorphism theorem (Corollary 2.3.18), it follows that dim((V₁ + V₂)/V₂) = dim(V₁/(V₁ ∩ V₂)). Also, by Theorem 2.3.7, it follows that dim V₁ = dim(V₁ ∩ V₂) + dim(V₁/(V₁ ∩ V₂)). Therefore,

dim V₁ + dim V₂ = dim(V₁ ∩ V₂) + dim(V₁/(V₁ ∩ V₂)) + dim V₂
               = dim(V₁ ∩ V₂) + dim((V₁ + V₂)/V₂) + dim V₂
               = dim(V₁ ∩ V₂) + dim(V₁ + V₂),

where the last line again follows from Theorem 2.3.7. □

2.4 Matrix Representations


In this section, we show how to represent linear transformations between finite-
dimensional vector spaces as matrices. Given a finite ordered basis for each of the
domain and codomain, a linear transformation has a unique matrix representation.
In this representation, matrix-vector multiplication describes how the coordinates
are mapped, and matrix-matrix multiplication corresponds to composition of linear

transformations. We first define transition matrices, which correspond to the matrix


representation of the identity operator, but with two different bases for the domain
and codomain. We then extend the construction to general linear transformations.

2.4.1 Transition Matrices


We begin by introducing some notation. Suppose that S = [s₁, s₂, …, s_m] and T = [t₁, t₂, …, t_m] are both ordered¹¹ bases of the vector space V. Any x ∈ V can be uniquely represented in each basis:

x = ∑_{i=1}^m a_i s_i   and   x = ∑_{j=1}^m b_j t_j.

In matrix notation, we may write this as

x = [s₁, s₂, …, s_m][a₁, a₂, …, a_m]ᵀ   and   x = [t₁, t₂, …, t_m][b₁, b₂, …, b_m]ᵀ.

We call the a_i above the coordinates of x in the basis S and denote the m × 1 (column) matrix of these coordinates by [x]_S. We proceed similarly for T:

[x]_S = [a₁, a₂, …, a_m]ᵀ   and   [x]_T = [b₁, b₂, …, b_m]ᵀ.

Since S is a basis, then by Corollary 1.2.15, each element of T can be uniquely expressed as a linear combination of elements of S; we denote this as t_j = ∑_{i=1}^m c_{ij} s_i. In matrix notation, we write this as

[t₁, …, t_m] = [s₁, …, s_m] [ c₁₁   c₁₂   ⋯   c_{1m} ]
                            [ c₂₁   c₂₂   ⋯   c_{2m} ]
                            [  ⋮     ⋮          ⋮   ]
                            [ c_{m1} c_{m2} ⋯  c_{mm} ].

Thus, x = [t₁, …, t_m][x]_T = [s₁, …, s_m]C[x]_T.

¹¹ In what follows, we want to keep track of the order of the elements of a set. Strictly speaking, a set isn't an object that maintains order; for example, the set {x, y, z} is equal to the set {z, y, x}. So we use square brackets to mean that the set is ordered. Thus, [x, y, z] means that x is the first element, y is the second element, and so on. That said, we do not always put the word "ordered" in front of the word basis; it is implied by the use of square brackets.

In terms of coordinates [x]_S and [x]_T, this takes the form

[x]_S = [a₁]   [ c₁₁   c₁₂   ⋯   c_{1m} ] [b₁]
        [a₂] = [ c₂₁   c₂₂   ⋯   c_{2m} ] [b₂]  = C[x]_T,
        [⋮ ]   [  ⋮     ⋮          ⋮   ] [⋮ ]
        [a_m]  [ c_{m1} c_{m2} ⋯  c_{mm} ] [b_m]

where C is the matrix [c_{ij}]. Hence, C transforms the coordinates from the basis T into coordinates in the basis S. We call C the transition matrix from T into S and sometimes denote it with the subscripts C_{ST} to indicate that it provides the S-coordinates for a vector written originally in terms of T, that is,

[x]_S = C_{ST}[x]_T.

It is straightforward to verify that the transition matrix C_{TS} from S into T is given by the matrix inverse C_{TS} = (C_{ST})⁻¹; see Appendix C.1.3. This allows us to go back and forth between coordinates in S and in T, as demonstrated in the following example.

Example 2.4.1. Assume that S = [x² − 1, x + 1, x − 1] and T = [x², x, 1] are bases for 𝔽[x; 2]. If q(x) = 2(x² − 1) + 3(x + 1) − 5(x − 1), we can write q in the basis S as [q]_S = [2  3  −5]ᵀ. The transition matrix from S into T satisfies

[x² − 1, x + 1, x − 1] = [x², x, 1] [  1   0   0 ]
                                    [  0   1   1 ]
                                    [ −1   1  −1 ].

Thus, we can rewrite q in the basis T by matrix multiplication, and we find

[q]_T = C_{TS}[q]_S = [  1   0   0 ] [  2 ]   [  2 ]
                      [  0   1   1 ] [  3 ] = [ −2 ]
                      [ −1   1  −1 ] [ −5 ]   [  6 ].

This corresponds to the equality q(x) = 2(x² − 1) + 3(x + 1) − 5(x − 1) = 2x² − 2x + 6. We can invert C_{TS} to get

C_{ST} = (C_{TS})⁻¹ = [   1     0     0  ]
                      [  1/2   1/2   1/2 ]
                      [ −1/2   1/2  −1/2 ],

which implies that

[x², x, 1] = [x² − 1, x + 1, x − 1] [   1     0     0  ]
                                    [  1/2   1/2   1/2 ]
                                    [ −1/2   1/2  −1/2 ].

This gives

[q]_S = C_{ST}[q]_T = [   1     0     0  ] [  2 ]   [  2 ]
                      [  1/2   1/2   1/2 ] [ −2 ] = [  3 ]
                      [ −1/2   1/2  −1/2 ] [  6 ]   [ −5 ],

which corresponds to q(x) = 2x² − 2x + 6 = 2(x² − 1) + 3(x + 1) − 5(x − 1).
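The arithmetic in Example 2.4.1 is easy to check numerically. The following sketch (not part of the text) builds C_TS column by column from the T-coordinates of the S-basis polynomials and verifies both conversions:

```python
import numpy as np

# Columns are the T-coordinates (coefficients of x^2, x, 1) of x^2-1, x+1, x-1.
C_TS = np.array([[ 1., 0.,  0.],
                 [ 0., 1.,  1.],
                 [-1., 1., -1.]])

q_S = np.array([2., 3., -5.])          # q in the basis S
q_T = C_TS @ q_S                       # coordinates in T

C_ST = np.linalg.inv(C_TS)             # transition matrix back from T into S
print(q_T)                             # [ 2. -2.  6.]
print(C_ST @ q_T)                      # [ 2.  3. -5.], recovering [q]_S
```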

Example 2.4.2. If S = T , then the matrix Css is just the identity matrix.

Example 2.4.3. To illustrate the importance of order, assume that the vectors of S and T are the same but are ordered differently. For example, if m = 2 and if T = [s₂, s₁] is obtained from S = [s₁, s₂] by switching the basis vectors, then the transition matrix is

C_{ST} = [ 0  1 ]
         [ 1  0 ].
2.4.2 Constructing the Matrix Representation
The next theorem tells us that a given linear map between two finite-dimensional
vector spaces with ordered bases for the domain and codomain has a unique matrix
representation.

Theorem 2.4.4. Let V and W be finite-dimensional vector spaces over the field 𝔽 with bases S = [s₁, s₂, …, s_m] and T = [t₁, t₂, …, t_n], respectively. Given a linear transformation L : V → W, there exists a unique n × m matrix M_{TS} describing L in terms of the bases S and T; that is, there exists a unique matrix M_{TS} such that

[L(x)]_T = M_{TS}[x]_S

for all x ∈ V. We say that M_{TS} is the matrix representation of L from S into T.

Proof. If x = ∑_{j=1}^m a_j s_j, then L(x) = ∑_{j=1}^m a_j L(s_j). Since each L(s_j) can be written uniquely as a linear combination of elements of T, we have that L(s_j) = ∑_{i=1}^n c_{ij} t_i for some matrix M = [c_{ij}]. Thus,

L(x) = ∑_{j=1}^m a_j ∑_{i=1}^n c_{ij} t_i = ∑_{i=1}^n b_i t_i,

where b_i = ∑_{j=1}^m c_{ij} a_j. In matrix notation, we have

[L(x)]_T = [b₁]   [ c₁₁   c₁₂   ⋯   c_{1m} ] [a₁]
           [b₂] = [ c₂₁   c₂₂   ⋯   c_{2m} ] [a₂]  = M[x]_S.
           [⋮ ]   [  ⋮     ⋮          ⋮   ] [⋮ ]
           [b_n]  [ c_{n1} c_{n2} ⋯  c_{nm} ] [a_m]

To show uniqueness, we suppose that two matrix representations M and M̃ of L exist; that is, both [L(x)]_T = M[x]_S and [L(x)]_T = M̃[x]_S hold for all x. This gives (M − M̃)[x]_S = 0 for all x. Taking x = s_j shows that the jth column of M − M̃ is zero for each j, so M − M̃ = 0. □

Nota Bene 2.4.5. It is very important to remember that the matrix representation of L depends on the choices of ordered bases S and T. Different bases and different orderings almost always give different matrix representations.

Remark 2.4.6. The transition matrix we defined in the previous section is just
the matrix representation of the identity transformation Iv, but with basis S used
to represent the inputs of Iv and basis T used to represent the outputs of Iv.

Example 2.4.7. Let S = [si,s 2 ,s 3 ] and T = [t 1 ,t 2 ] be bases for JF 3 and JF 2 ,


respectively. Consider the linear map L : JF 3 -+ JF 2 that satisfies

[L(s1)]r = rnJ , [L(s2 )]r = [~] , and [L(s3)]r = [~] .


We can write

and Crs is the matrix representation of L from S into T. We compute L(x)


for x = 3s1 + 2s2 + 4s3. Note that

Crs[x]s =
o
[0
2
0 ~] m~ [i] ~ [L(x)Jr.

Example 2.4.8. Consider the differentiation operator L : 𝔽[x; 4] → 𝔽[x; 4] given by L[p](x) = p′(x). Using the basis S = [1, x, x², x³, x⁴] for both the domain and codomain, the matrix representation of L is

C_{SS} = [ 0  1  0  0  0 ]
         [ 0  0  2  0  0 ]
         [ 0  0  0  3  0 ]
         [ 0  0  0  0  4 ]
         [ 0  0  0  0  0 ].

If we write p(x) = 3x⁴ − 8x³ + 6x² − 12x + 4 in terms of the basis S, we can use C_{SS} to compute [L[p]]_S. We have [L[p]]_S = C_{SS}[p]_S, or

[L[p]]_S = [ 0  1  0  0  0 ] [   4 ]   [ −12 ]
           [ 0  0  2  0  0 ] [ −12 ]   [  12 ]
           [ 0  0  0  3  0 ] [   6 ] = [ −24 ]
           [ 0  0  0  0  4 ] [  −8 ]   [  12 ]
           [ 0  0  0  0  0 ] [   3 ]   [   0 ].

This agrees with the direct calculation of p′(x) = 12x³ − 24x² + 12x − 12, written in terms of the basis S.
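As a quick sanity check (not part of the text), the same computation can be carried out with NumPy, building the differentiation matrix from Example 2.4.8 and applying it to the coordinate vector of p:

```python
import numpy as np

# Matrix of d/dx on F[x;4] in the monomial basis [1, x, x^2, x^3, x^4]:
# column j holds the coordinates of the derivative of x^j.
C_SS = np.diag([1., 2., 3., 4.], k=1)

p_S = np.array([4., -12., 6., -8., 3.])   # p(x) = 3x^4 - 8x^3 + 6x^2 - 12x + 4
dp_S = C_SS @ p_S

print(dp_S)   # [-12.  12. -24.  12.   0.], i.e. p'(x) = 12x^3 - 24x^2 + 12x - 12
```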

Remark 2.4.9. Let V = 𝔽ⁿ. The standard basis is the set

S = [e₁, e₂, …, e_n] = [(1, 0, 0, …, 0), (0, 1, 0, …, 0), …, (0, 0, 0, …, 1)].

Hence, an n-tuple (a₁, …, a_n) can be written x = a₁e₁ + ⋯ + a_ne_n, or in matrix form as [x]_S = [a₁, a₂, …, a_n]ᵀ. We often write vectors in column form and suppress the notation [·]_S if the choice of basis is clear.

Remark 2.4.10. Unless the vector space and basis are already otherwise defined,
when using matrices to represent linear operators, we assume that a matrix defines
a linear operator on lFn with the standard basis {e 1,e 2, . .. , e n} ·

2.5 Composition, Change of Basis, and Similarity


In the previous section, we showed how to represent a linear transformation as a
matrix, once ordered bases for the domain and codomain are chosen. In this section,
we first show that the composition of two linear transformations is represented by
the matrix product, and then we use this to show how a matrix representation
changes as a result of a change of basis.

2.5.1 Composition of Linear Transformations


We have seen that matrix-vector multiplication describes how linear transformations
map vectors, and thus it should come as no surprise that composition of linear
transformations corresponds to matrix-matrix multiplication. In this subsection
we prove this result . A key consequence of this is that most questions in finite-
dimensional linear algebra can be reformulated as problems in matrix analysis.

Theorem 2.5.1 (Matrix Multiplication). Consider the vector spaces V, W, and X, with bases S = [s₁, s₂, …, s_m], T = [t₁, t₂, …, t_n], and U = [u₁, u₂, …, u_p], respectively, together with the linear maps L : V → W and K : W → X. Denote the

unique matrix representations of L and K, respectively, as the n × m matrix B_{TS} = [b_{jk}] and the p × n matrix C_{UT} = [c_{ij}]. Denote the unique matrix representation of the composition KL by D_{US} = [d_{ij}]. Matrix multiplication of the representation of K with the representation of L gives the representation of KL; that is,

D_{US} = C_{UT}B_{TS}.

Proof. We begin by writing v ∈ V as v = ∑_{k=1}^m a_k s_k, together with L(s_k) = ∑_{j=1}^n b_{jk} t_j and K(t_j) = ∑_{i=1}^p c_{ij} u_i. We have

KL(v) = K(∑_{k=1}^m a_k L(s_k)) = K(∑_{k=1}^m a_k ∑_{j=1}^n b_{jk} t_j) = ∑_{j=1}^n (∑_{k=1}^m b_{jk} a_k) K(t_j)
      = ∑_{j=1}^n (∑_{k=1}^m b_{jk} a_k)(∑_{i=1}^p c_{ij} u_i) = ∑_{i=1}^p (∑_{k=1}^m [∑_{j=1}^n c_{ij} b_{jk}] a_k) u_i.

So [KL(v)]_U = D_{US}[v]_S, and D_{US} is the matrix for the transformation KL. In matrix notation, this gives

[ d₁₁   d₁₂   ⋯   d_{1m} ]   [ c₁₁   c₁₂   ⋯   c_{1n} ] [ b₁₁   b₁₂   ⋯   b_{1m} ]
[ d₂₁   d₂₂   ⋯   d_{2m} ] = [ c₂₁   c₂₂   ⋯   c_{2n} ] [ b₂₁   b₂₂   ⋯   b_{2m} ]
[  ⋮     ⋮          ⋮   ]   [  ⋮     ⋮          ⋮   ] [  ⋮     ⋮          ⋮   ]
[ d_{p1} d_{p2} ⋯  d_{pm} ]   [ c_{p1} c_{p2} ⋯  c_{pn} ] [ b_{n1} b_{n2} ⋯  b_{nm} ],

which we may write as D_{US} = C_{UT}B_{TS}. □

Remark 2.5.2. This result tells us, among other things, that the matrix represen-
tation of a linear transformation is an invertible matrix precisely when the linear
transformation is invertible. In this case, we say that the matrix is nonsingular.
If the matrix (or the corresponding transformation) is not invertible, we say it is
singular.

Corollary 2.5.3. Matrix multiplication is associative. That is, if A ∈ M_{m×n}, B ∈ M_{n×k}, and C ∈ M_{k×s}, then we have

(AB)C = A(BC).

Proof. First observe that function composition is associative; that is, given any functions f : V → W, g : W → X, and h : X → Y, we have (h ∘ (g ∘ f))(v) = h((g ∘ f)(v)) = h(g(f(v))) = (h ∘ g)(f(v)) = ((h ∘ g) ∘ f)(v) for every v ∈ V. The corollary follows since matrix multiplication is function composition. □

2.5.2 Change of Basis and Transition Matrices


The results of the previous section make it possible for us to describe how changing
the basis of the domain and codomain changes the matrix representation of a linear
transformation.

Theorem 2.5.4. Let C_{TS} be the matrix representation of L : V → W from the basis S into the basis T. If P_{SS̃} is the transition matrix on V from the basis S̃ into S and Q_{T̃T} is the transition matrix on W from the basis T into T̃, then the matrix representation B_{T̃S̃} of L in terms of S̃ and T̃ is given by B_{T̃S̃} = Q_{T̃T}C_{TS}P_{SS̃}.

Proof. Let [y]_T = C_{TS}[x]_S. We have [x]_S = P_{SS̃}[x]_S̃ and [y]_T̃ = Q_{T̃T}[y]_T. When we combine these expressions we get [y]_T̃ = Q_{T̃T}C_{TS}P_{SS̃}[x]_S̃. By uniqueness of matrix representations (Theorem 2.4.4), we know B_{T̃S̃} = Q_{T̃T}C_{TS}P_{SS̃}. □

Remark 2.5.5. The following commutative diagram¹² illustrates the previous theorem:

              left mult. by C_TS
      [x]_S  -------------------->  [y]_T
        ^                             |
   left mult.                    left mult.
   by P_{SS̃}                     by Q_{T̃T}
        |                             v
      [x]_S̃  -------------------->  [y]_T̃
              left mult. by B_{T̃S̃}

Corollary 2.5.6. Let S and S̃ be bases for the vector space V with transition matrix P_{SS̃} from S̃ into S. Let C_{SS} be the matrix representation of the operator L : V → V in the basis S. The matrix B_{S̃S̃} = (P_{SS̃})⁻¹C_{SS}P_{SS̃} is the unique matrix representation of L in the basis S̃.

Remark 2.5. 7. When the bases we are working with are understood from the con-
text, we often drop the basis subscripts. In these cases, we usually also drop the
square brackets [·] and denote a vector as an n-tuple x = (x 1 , ... , Xn) of its coordi-
nates. However, when multiple bases are being considered in the same problem, it
is wise to be very explicit with bases and write out all the basis subscripts so as to
avoid confusion or mistakes.

Definition 2.5.8. Two square matrices A, B ∈ M_n(𝔽) are similar if there exists a nonsingular P ∈ M_n(𝔽) such that B = P⁻¹AP.

Remark 2.5.9. Any such P in the definition of similar matrices is a transition matrix defining a change of basis in 𝔽ⁿ, so two matrices are similar if and only if they correspond to the matrix representations, in different bases, of the same linear transformation L : 𝔽ⁿ → 𝔽ⁿ.
¹² Starting at the bottom left corner of the commutative diagram and following the arrows up and around and back down to the bottom right gives the same result as following the single arrow across the bottom.

Remark 2.5.10. Similarity is an equivalence relation on the set of square matrices;


see Exercise 2 .27.

Remark 2.5.11. It is common to talk about the rank and nullity of a matrix when
one really means the rank and nullity of the linear transformation represented by
the matrix. Since any two similar matrices represent the same linear transformation
(but with different bases), they must have the same rank and nullity.
Many other common properties of matrices are actually properties of the linear
transformations that the matrices describe (for example, the determinant and the
eigenvalues), both of which are discussed later. All of these properties must be
the same for any two similar matrices.

Example 2.5.12. Let B be the unique matrix representation of the linear operator L : ℝ² → ℝ² in the basis S = [s₁, s₂] given by

B = [ 1   1 ]
    [ 0  −1 ],

and let S̃ = [s̃₁, s̃₂] be another basis for ℝ², defined by s̃₁ = s₁ − 2s₂ and s̃₂ = s₁. The transition matrix P from S̃ into S and its inverse are given by

P = [  1  1 ]        and   P⁻¹ = 1/2 [ 0  −1 ]
    [ −2  0 ]                        [ 2   1 ].

Hence,

D = P⁻¹BP = 1/2 [ 0  −1 ] [ 1   1 ] [  1  1 ]  =  [ −1  0 ]
                [ 2   1 ] [ 0  −1 ] [ −2  0 ]     [  0  1 ],

which defines the linear transformation L in the basis S̃. Thus, D is similar to B.
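A one-line numerical check of this change of basis (a sketch, not part of the text; the matrix B is taken as shown in the example above) looks like this:

```python
import numpy as np

B = np.array([[1., 1.],
              [0., -1.]])            # matrix of L in the basis S (as in the example)
P = np.array([[1., 1.],
              [-2., 0.]])            # transition matrix from the new basis into S

D = np.linalg.inv(P) @ B @ P         # matrix of L in the new basis
print(D)                             # [[-1.  0.], [ 0.  1.]]

# Similar matrices share basis-independent quantities such as the determinant.
print(np.isclose(np.linalg.det(B), np.linalg.det(D)))   # True
```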

2.6 Important Example: Bernstein Polynomials


In this section we give an extended example of the concepts we have developed so far.
Specifically, we define the Bernstein polynomials and show that they form bases for
the spaces lF[x; n] of polynomials. We then describe various linear transformations
in terms of these bases and compare the results, via similarity, to the standard
polynomial bases. These polynomials provide a useful tool for approximation and
also play a key role in graphics applications, especially in the construction of Bezier
curves.
In Volume 2 we also use the Bernstein polynomials to give a constructive
proof of the Stone-Weierstrass approximation theorem, which guarantees that any
continuous function on a closed interval can be approximated as closely as desired
by polynomials.

Definition 2.6.1. Given n ∈ ℕ, the Bernstein polynomials {B_jⁿ(x)}_{j=0}^n of degree n are defined as

B_jⁿ(x) = \binom{n}{j} x^j (1 − x)^{n−j},   where   \binom{n}{j} = n!/(j!(n − j)!).    (2.6)

Remark 2.6.2. Observe that B₀ⁿ(0) = 1, and B_jⁿ(0) = 0 if j ≠ 0. Moreover, B_nⁿ(1) = 1, and B_jⁿ(1) = 0 if j ≠ n.

The first interesting property of the Bernstein polynomials is that they sum to 1, as can be seen from the binomial theorem:

∑_{j=0}^n B_jⁿ(x) = ∑_{j=0}^n \binom{n}{j} x^j (1 − x)^{n−j} = (x + (1 − x))ⁿ = 1.

We show that the set {B_jⁿ(x)}_{j=0}^n forms a basis for 𝔽[x; n] and provide the transition matrix between the Bernstein basis and the usual monomial basis [1, x, x², …, xⁿ].

Example 2.6.3. The Bernstein polynomials of degree n = 4 are

B₀⁴(x) = (1 − x)⁴,   B₁⁴(x) = 4x(1 − x)³,   B₂⁴(x) = 6x²(1 − x)²,
B₃⁴(x) = 4x³(1 − x),   B₄⁴(x) = x⁴.

These are plotted in Figure 2.2.


We compute the transition matrix P_{ST} from the Bernstein basis

T = [B₀⁴(x), B₁⁴(x), B₂⁴(x), B₃⁴(x), B₄⁴(x)]

into the standard polynomial basis S = [1, x, x², x³, x⁴], and its inverse Q_{TS} = (P_{ST})⁻¹, as

P_ST = [  1    0    0    0   0 ]              Q_TS = [ 1   0    0    0   0 ]
       [ −4    4    0    0   0 ]                     [ 1  1/4   0    0   0 ]
       [  6  −12    6    0   0 ]      and            [ 1  1/2  1/6   0   0 ]
       [ −4   12  −12    4   0 ]                     [ 1  3/4  1/2  1/4  0 ]
       [  1   −4    6   −4   1 ]                     [ 1   1    1    1   1 ].

Let p(x) = 3x⁴ − 8x³ + 6x² − 12x + 4. By computing [p(x)]_T = Q_{TS}[p(x)]_S, we see that p(x) can be written in the Bernstein basis as

p(x) = 4B₀⁴(x) + B₁⁴(x) − B₂⁴(x) − 4B₃⁴(x) − 7B₄⁴(x).
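The conversion in Example 2.6.3 is easy to reproduce numerically. The sketch below (not part of the text) builds Q_TS from the formula (2.9) given in the proof of Theorem 2.6.5 below and applies it to the monomial coordinates of p:

```python
import numpy as np
from math import comb

n = 4
# Q_TS[i, j] = C(i, j) / C(n, j) for i >= j, and 0 otherwise.
Q_TS = np.array([[comb(i, j) / comb(n, j) if i >= j else 0.0
                  for j in range(n + 1)] for i in range(n + 1)])

p_S = np.array([4., -12., 6., -8., 3.])   # p(x) = 3x^4 - 8x^3 + 6x^2 - 12x + 4
p_T = Q_TS @ p_S

print(p_T)   # [ 4.  1. -1. -4. -7.]  ->  p = 4*B0 + B1 - B2 - 4*B3 - 7*B4
```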

Lemma 2.6.4. We have the following identity for j = 0, 1, …, n:

B_jⁿ(x) = ∑_{i=j}^n (−1)^{i−j} \binom{n}{i} \binom{i}{j} x^i.    (2.7)

Figure 2.2. The five Bernstein polynomials of degree 4 on the interval [0, 1], plotted first separately and then together (bottom right). See also Example 2.6.3.

Proof. Expanding (1 − x)^{n−j} with the binomial theorem gives

B_jⁿ(x) = \binom{n}{j} x^j (1 − x)^{n−j} = \binom{n}{j} x^j ∑_{k=0}^{n−j} (−1)^k \binom{n−j}{k} x^k
        = ∑_{i=j}^n (−1)^{i−j} (n!/(j!(i − j)!(n − i)!)) x^i = ∑_{i=j}^n (−1)^{i−j} \binom{n}{i} \binom{i}{j} x^i. □

Theorem 2.6.5. For any n ∈ ℕ the set T_n = {B_jⁿ(x)}_{j=0}^n of degree-n Bernstein polynomials forms a basis for 𝔽[x; n].

Proof. To see that T_n is a basis, we construct (what will be transition) matrices P and Q such that Q = P⁻¹ and such that any linear relation ∑_{j=0}^n c_j B_jⁿ(x) = 0 can be written as

[1, x, …, xⁿ]P[c₀, …, c_n]ᵀ = 0.

Since S = [1, x, …, xⁿ] is a basis for 𝔽[x; n], this means that the vector P[c₀, …, c_n]ᵀ must vanish; that is, P[c₀, …, c_n]ᵀ = 0. Since P is invertible, we have

[c₀, …, c_n]ᵀ = (QP)[c₀, …, c_n]ᵀ = Q(P[c₀, …, c_n]ᵀ) = Q0 = 0.

Therefore the n + 1 polynomials {B₀ⁿ(x), …, B_nⁿ(x)} are linearly independent in the (n + 1)-dimensional space 𝔽[x; n], and hence form a basis for 𝔽[x; n].
We now construct the matrices P and Q. Let P = [p_{jk}] be the (n + 1) × (n + 1) lower-triangular matrix defined by

p_{jk} = (−1)^{j−k} \binom{n}{j} \binom{j}{k}   if j ≥ k,   and   p_{jk} = 0   if j < k,      for j, k ∈ {0, 1, …, n}.    (2.8)

If we already knew that the Bernstein polynomials formed a basis, then (2.7) tells us that P would be the transition matrix P_{ST} from the Bernstein basis T_n into the power basis S.
Now define the matrix Q = [q_{ij}] by

q_{ij} = \binom{i}{j} / \binom{n}{j}   if i ≥ j,   and   q_{ij} = 0   if i < j,      for i, j ∈ {0, 1, …, n}.    (2.9)

Once we see that the set T_n of Bernstein polynomials forms a basis, it will be clear that the matrix Q is the transition matrix Q_{TS} from the basis S to the basis T_n.
We verify that QP = I = PQ by direct computation. When 0 ≤ k ≤ i ≤ n, the product QP takes the form

(QP)_{ik} = ∑_{j=0}^n q_{ij} p_{jk} = ∑_{j=k}^i q_{ij} p_{jk} = ∑_{j=k}^i (−1)^{j−k} \binom{i}{j} \binom{j}{k} = B_kⁱ(1).

When i = k we have that B_kⁱ(1) = 1, and when k < i we have that B_kⁱ(1) = 0. Also, when 0 ≤ i < k ≤ n, we have that (QP)_{ik} = 0 since both matrices are lower-triangular. Hence, QP = I, and by Proposition 2.3.17 we also have PQ = I.
Thus, P and Q have the necessary properties, and the argument outlined above shows that {B_jⁿ(x)}_{j=0}^n is a basis of 𝔽[x; n]. □
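The matrices P and Q from (2.8) and (2.9) can be generated for any n, and the identity QP = I checked numerically. A small sketch (not part of the text):

```python
import numpy as np
from math import comb

def bernstein_P_Q(n):
    """Return P (Bernstein basis into monomial basis) and Q = P^{-1} from (2.8), (2.9)."""
    P = np.array([[(-1) ** (j - k) * comb(n, j) * comb(j, k) if j >= k else 0
                   for k in range(n + 1)] for j in range(n + 1)], dtype=float)
    Q = np.array([[comb(i, j) / comb(n, j) if i >= j else 0
                   for j in range(n + 1)] for i in range(n + 1)], dtype=float)
    return P, Q

P, Q = bernstein_P_Q(4)
print(np.allclose(Q @ P, np.eye(5)))   # True
```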

Example 2.6.6. Using the matrix A in Example 2.4.8, representing the derivative of a polynomial in 𝔽[x; 4], we use similarity to transform it to the Bernstein basis. Multiplying out P_{ST}⁻¹AP_{ST}, we get

P_{ST}⁻¹AP_{ST} = [ −4   4   0   0   0 ]
                  [ −1  −2   3   0   0 ]
                  [  0  −2   0   2   0 ]
                  [  0   0  −3   2   1 ]
                  [  0   0   0  −4   4 ].

Thus, using the representation of p(x) in Example 2.6.3, we express the derivative as

[ −4   4   0   0   0 ] [  4 ]   [ −12 ]
[ −1  −2   3   0   0 ] [  1 ]   [  −9 ]
[  0  −2   0   2   0 ] [ −1 ] = [ −10 ]
[  0   0  −3   2   1 ] [ −4 ]   [ −12 ]
[  0   0   0  −4   4 ] [ −7 ]   [ −12 ].

Thus, p′(x) = −12B₀⁴(x) − 9B₁⁴(x) − 10B₂⁴(x) − 12B₃⁴(x) − 12B₄⁴(x).

Application 2.6. 7 (Computer Aided Design). Over a century ago, when


Bernstein first introduced his polynomials, they were seen as being mostly a
theoretical construct with limited application. However, in the early 1960s
two French engineers, de Casteljau and Bezier, independently used Bernstein
polynomials to address a perplexing problem in the automotive design indus-
try, namely, how to specify "free-form" shapes consistently. Their solution was
to use Bernstein polynomials to describe parametric curves and surfaces with
a finite number of points that control the shape of the curve or surface. This
enabled the use of computers to sketch the curves and surfaces. The resulting curves are called Bezier curves and are of the form

r(t) = ∑_{j=0}^n p_j B_jⁿ(t),   t ∈ [0, 1],

where p₀, …, p_n are the control points and B_jⁿ(t) are the Bernstein polynomials.
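A minimal evaluation routine for such a curve (a sketch, not part of the text; the control points are an arbitrary example) sums the control points against the Bernstein polynomials at each parameter value:

```python
import numpy as np
from math import comb

def bezier(control_points, t):
    """Evaluate the Bezier curve r(t) = sum_j p_j * B_j^n(t) at parameter values t."""
    pts = np.asarray(control_points, dtype=float)   # shape (n+1, d)
    n = len(pts) - 1
    t = np.atleast_1d(t)
    # basis[j, k] = B_j^n(t_k)
    basis = np.array([comb(n, j) * t**j * (1 - t)**(n - j) for j in range(n + 1)])
    return basis.T @ pts                             # shape (len(t), d)

# A cubic curve in the plane determined by four control points.
pts = [(0, 0), (1, 2), (3, 3), (4, 0)]
print(bezier(pts, [0.0, 0.5, 1.0]))   # the curve starts at (0, 0) and ends at (4, 0)
```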

2.7 Linear Systems


In this section we discuss the theory behind solving linear systems. We develop the
theoretical foundation of row reduction, and then we show how to find bases for
the kernel and range of a linear transformation.
Consider the linear system Ax = b, where A ∈ M_{m×n}(𝔽) and b ∈ 𝔽ᵐ are given and x ∈ 𝔽ⁿ is unknown. We have already seen that a solution exists if and only if b ∈ ℛ(A). Moreover, if a solution exists and 𝒩(A) ≠ {0}, then there are infinitely many solutions.
Knowing that a solution exists is usually not sufficient; we often want to know
what that solution is. The standard way to solve a linear system by hand is row
reduction. Row reduction is the process of performing row operations to transform

the linear system into a form that is easy to solve, either directly by inspection or
through a process called back substitution. In this section, we explore row reduction
in greater depth.

2.7.1 Elementary Matrices


A row operation is performed by left multiplying both sides of a linear system by
an elementary matrix, as described below. There are three types of row operations.

Type I: Swapping rows, denoted R_i ↔ R_j. The corresponding elementary matrix (called a type I matrix) is formed by interchanging rows i and j of the identity matrix. Left multiplying by this matrix performs the row operation. For example,

[ 0  1  0 ] [ a  b  c ]   [ d  e  f ]
[ 1  0  0 ] [ d  e  f ] = [ a  b  c ]
[ 0  0  1 ] [ g  h  i ]   [ g  h  i ].

A type I matrix is also called a transposition matrix.

Type II: Multiply a row by a nonzero scalar α, denoted R_j → αR_j. A type II elementary matrix is formed by multiplying the jth diagonal entry of the identity matrix by α. For example,

[ 1  0  0 ] [ a  b  c ]   [  a   b   c ]
[ 0  α  0 ] [ d  e  f ] = [ αd  αe  αf ]
[ 0  0  1 ] [ g  h  i ]   [  g   h   i ].

Type III: Add a scalar multiple of a row to another row, denoted R_i → αR_j + R_i (with i ≠ j). A type III elementary matrix is formed by inserting α in the (i, j) entry of the identity matrix. For example, if (i, j) = (2, 1), then we have

[ 1  0  0 ] [ a  b  c ]   [    a       b       c   ]
[ α  1  0 ] [ d  e  f ] = [ αa + d  αb + e  αc + f ]
[ 0  0  1 ] [ g  h  i ]   [    g       h       i   ].

Remark 2. 7.1. All three elementary matrices are found by performing the desired
row operation on the identity matrix.
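This description translates directly into code. The sketch below (not part of the text) builds one elementary matrix of each type by performing the corresponding row operation on the identity, then applies them by left multiplication:

```python
import numpy as np

def type1(n, i, j):                 # swap rows i and j
    E = np.eye(n); E[[i, j]] = E[[j, i]]; return E

def type2(n, j, alpha):             # multiply row j by alpha (alpha != 0)
    E = np.eye(n); E[j, j] = alpha; return E

def type3(n, i, j, alpha):          # add alpha * (row j) to row i
    E = np.eye(n); E[i, j] = alpha; return E

A = np.array([[2., 3.], [1., 1.]])
print(type1(2, 0, 1) @ A)                          # rows swapped
print(type3(2, 1, 0, -2.) @ type1(2, 0, 1) @ A)    # the reduction used in Example 2.7.4 below
```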

Proposition 2. 7.2. All of the elementary matrices are invertible.

Proof. This is easily verified by direct computation: If E is type I, then E² = I. If E is type II corresponding to R_j → αR_j, then its inverse corresponds to R_j → α⁻¹R_j (recall α ≠ 0). Finally, if E is type III, corresponding to R_i → αR_j + R_i, then its inverse corresponds to R_i → −αR_j + R_i. □

Remark 2. 7.3. Since the elementary matrices are invertible, row reduction can be
viewed as repeated left multiplication of both sides of a linear system by invertible
matrices. Multiplication by invertible matrices represents a change of basis, and
left multiplication by the elementary matrices can be thought of as simply changing

the basis of the range, but leaving the domain unchanged. Thus, any solution of the
system EAx = 0 is also a solution of the system Ax= 0, and conversely.
More generally, any solution of the system Ax= bis a solution of the system
EAx = Eb for any elementary matrix E, and conversely. This is equivalent to
saying that a solution to Ax = b can be found by row reducing both the matrix
A and the vector b using the same operations in the same order. The goal is to
judiciously choose these elementary matrices so that the linear system is reduced
to some nice form that is easy to solve. Typically this is either an upper-triangular
matrix or a diagonal matrix.

Example 2.7.4. Consider the linear system

[ 2  3 ] [ x₁ ]   [ 3 ]
[ 1  1 ] [ x₂ ] = [ 4 ].

To solve it using row operations, or equivalently elementary matrices, we first swap the rows. This is done by left multiplying both sides by the type I matrix corresponding to R₁ ↔ R₂. Thus, we have

[ 1  1 ] [ x₁ ]   [ 4 ]
[ 2  3 ] [ x₂ ] = [ 3 ].

Now we eliminate the (2, 1) entry by using the type III matrix corresponding to R₂ → −2R₁ + R₂. This yields

[ 1  1 ] [ x₁ ]   [  4 ]
[ 0  1 ] [ x₂ ] = [ −5 ].    (2.10)

From here we see that x₂ = −5 and x₁ + x₂ = 4, which reduces to x₁ = 9 by back substitution. Alternatively, we can continue by eliminating the (1, 2) entry using the type III matrix R₁ → −R₂ + R₁. Thus, we have

[ 1  0 ] [ x₁ ]   [  9 ]
[ 0  1 ] [ x₂ ] = [ −5 ].

Remark 2.7.5. As a shorthand in the previous example, we write the original linear system in augmented form

[ 2  3 | 3 ]
[ 1  1 | 4 ].

Thus, we can more compactly carry out the row reduction process by writing

[ 2  3 | 3 ]    [ 1  1 |  4 ]    [ 1  1 |  4 ]    [ 1  0 |  9 ]
[ 1  1 | 4 ] →  [ 2  3 |  3 ] →  [ 0  1 | −5 ] →  [ 0  1 | −5 ].

Definition 2. 7.6. The matrix B is row equivalent to the matrix A if there exists a
finite collection of elementary matrices E1, E2, ... , En such that B = E1E2 · · · EnA·

Theorem 2.7.7. Row equivalence is an equivalence relation.

Proof. See Exercise 2.40. D

Remark 2. 7.8. Sometimes we say that one matrix is row equivalent to another
matrix, but more commonly we say that one augmented matrix is row equivalent
to another augmented matrix, or, in other words, that one linear system is row
equivalent to another linear system. Thus, if two linear systems (or augmented
matrices) are row equivalent, they have the same solution.

2.7.2 Row Echelon Form

Definition 2. 7.9. A matrix A is in row echelon form (REF) if the following hold:
(i) The first nonzero entry, called the leading entry, of each nonzero row is always
strictly to the right of the leading entry of the row above it.
(ii) All nonzero rows are above any zero rows.
A matrix A is in reduced row echelon form (RREF) if the following hold:
(i) It is in REF.
(ii) The leading entry of every nonzero row is equal to one.
(iii) The leading entry of every nonzero row is the only nonzero entry in its column.

Example 2.7.10. The first two of the following matrices are in REF, the
next is in RREF, and the last two are not in REF. Can you say why?

[~ ~1
2 3
25 0I] ' [I0 01 4 ~] ,
[~
4 2 0
4 5
3 4 7 0
0 6
0 0 6 9 0 0 0 1
0 0

[~ ~1
2

[~ !]
2
4
0
0
5
0

Remark 2. 7.11. REF is the canonical form of a linear system that can then be
solved by what we call back substitution. In other words, when an augmented matrix
is in REF, we can solve for the coordinates of x and substitute them back into the
partial solution, as we did with (2 .10).

Nota Bene 2.7.12. Although it is much easier to determine the solution of a linear system when it is in RREF than REF, it usually requires twice as many row operations to get a system into RREF, and so it is faster algorithmically to reduce to REF and then use back substitution.

Proposition 2.7.13. If A is a matrix, then there exists an RREF matrix B such


that A and B are row equivalent. Moreover, if A is a square matrix, then B is upper
triangular (the (i,j) entry of Bis zero whenever i > j). Finally, if Tis a square,
upper-triangular matrix, all of whose diagonal elements are nonzero, then T is row
equivalent to the identity matrix.

Proof. See Exercise 2.39. 0

Theorem 2. 7 .14. A square matrix is nonsingular if and only if it is row equivalent


to the identity matrix.

Proof. (⇒) By the previous proposition, any square matrix A can be row reduced to an RREF upper-triangular matrix B by a sequence E₁, E₂, …, E_k of elementary matrices such that E₁E₂⋯E_kA = B. If A is nonsingular, then it has an inverse A⁻¹, and so

BA⁻¹(E₁E₂⋯E_k)⁻¹ = E₁E₂⋯E_kAA⁻¹(E₁E₂⋯E_k)⁻¹ = I.

Assume by way of contradiction that one of the diagonal elements of B is zero. Since B is in RREF, the entire bottom row of B must be zero. This implies that the bottom row of the product BA⁻¹(E₁E₂⋯E_k)⁻¹ is zero, and hence it is not equal to I, a contradiction.
Therefore B can have no zeros on the diagonal and must be row equivalent to I by Proposition 2.7.13. Since B is row equivalent to I, then A is also row equivalent to I.
(⇐) If A is row equivalent to the identity, then there exists a sequence E₁, E₂, …, E_k of elementary matrices such that E₁E₂⋯E_kA = I. It follows that A⁻¹ = E₁E₂⋯E_k. □

Application 2.7.15 (LU Decomposition). There are efficient numerical


libraries for solving linear systems of the form Ax = b. If nothing is known
about the matrix A, or if A has no specialized structure, the best known rou-
tine is called LU factorization, which produces two output matrices, specifically, an upper-triangular matrix U which corresponds to A in REF form, and a lower-triangular matrix L which is the product of type III elementary matrices. Specifically, if E_n⋯E₁A = U, then L = (E_n⋯E₁)⁻¹ = E₁⁻¹E₂⁻¹⋯E_n⁻¹. The inverse of a type III elementary matrix is a type III elementary matrix, and so L is indeed the product of type III matrices.

To solve a linear system Ax = b, first write A as A = LU, then solve for y in the system Ly = b via forward substitution (L is lower triangular). Finally, solve for x in the system Ux = y via back substitution.
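In practice one calls a library routine rather than forming L and U by hand. A sketch using SciPy (not part of the text; note that scipy.linalg.lu_factor also performs row swaps, i.e., partial pivoting, which the simplified description above omits):

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

A = np.array([[2., 3.],
              [1., 1.]])
b = np.array([3., 4.])

lu, piv = lu_factor(A)        # factor once ...
x = lu_solve((lu, piv), b)    # ... then solve by forward and back substitution
print(x)                      # [ 9. -5.], as in Example 2.7.4
```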

Corollary 2.7.16. If a matrix A is invertible, then the RREF of the augmented matrix [A | I] is [I | A⁻¹]. In other words, we can compute A⁻¹ by row reducing [A | I].

Proof. When we put the augmented matrix [A | I] in RREF using a sequence of elementary matrices E₁, E₂, …, E_k, we have E₁E₂⋯E_k[A | I] = [I | E₁E₂⋯E_k]. This shows that E₁E₂⋯E_kA = I and hence E₁E₂⋯E_k = A⁻¹. □

2.7.3 Basic and Free Variables


In the absence of special structure, row reduction is the canonical approach 13 to
computing the solution of the linear system Ax = b, where A E Mmxn(lF) and
b E JFm. If A is invertible, then row reduction is straightforward. If A is singular,
and in particular if it is rectangular, then the problem is a little more complicated.
Consider first the homogeneous linear system Ax = 0. For any finite sequence of m × m elementary matrices E₁, …, E_k, the system has the same solution set as the row-reduced system E_k⋯E₁Ax = E_k⋯E₁0 = 0. Thus, row reduction does not change the solution set, which means that the kernel of A is the same as the kernel of the RREF matrix B = E_k⋯E₁A.
Given an RREF matrix B , we associate the columns with leading entries as
the basic variables and the ones without as free variables. The number of free
variables is the dimension of the kernel and corresponds to the number of degrees
of freedom of the kernel. More precisely, the basic variables can be written as
linear combinations of the free variables, and thus the entire solution set to the
homogeneous problem can be written as a linear combination of the free variables
(one for each dimension).
Since the dimension of the kernel equals the number of free variables, the rank-nullity theorem (Corollary 2.3.9) shows that the number of basic variables is equal to the rank of the matrix.

Example 2.7.17. Consider the linear system Ax = 0, where

A = [ 1  2  3 ]
    [ 5  6  7 ].    (2.11)

13The widely used computing libraries for solving linear systems have many efficiencies and en-
hancements built in that are quite sophisticated . We present only the entry-level approach
here.

We can write this as an augmented linear system and row reduce to RREF form (using elementary matrices)

[ 1  2  3 | 0 ]    [ 1  0  −1 | 0 ]
[ 5  6  7 | 0 ] →  [ 0  1   2 | 0 ].

Thus, the system is row equivalent to the reduced system Bx = 0, where

B = [ 1  0  −1 ]
    [ 0  1   2 ].    (2.12)

The first two columns correspond to basic variables (x₁ and x₂), and the third column corresponds to the free variable x₃. The augmented system can then be written as

x₁ = x₃,
x₂ = −2x₃,

or equivalently the solution is

x = x₃ [  1 ]
       [ −2 ]
       [  1 ],

where x₃ ∈ 𝔽. It follows that 𝒩(A) = span{[1  −2  1]ᵀ}. Here it is clear that the free column [−1  2]ᵀ can be written as a linear combination of the basic columns [1  0]ᵀ and [0  1]ᵀ. Hence, the rank is 2 and the nullity is 1.

In the more general nonhomogeneous case of Ax = b, the set of all solutions is a coset of the form x′ + 𝒩(A), where x′ is any particular solution of Ax = b. Since all the elementary matrices are invertible, solving Ax = b is equivalent to solving Bx = d, where B is given above and d = E_k⋯E₁b. This is generally done by row reducing the augmented matrix [A | b]. This allows us to track the effect of multiplying elementary matrices on both A and b simultaneously.

Example 2.7.18. Consider the linear system Ax = b, where A is given in the previous example and b = [4  8]ᵀ. We can write this as an augmented linear system and row reduce to RREF form (using elementary matrices)

[ 1  2  3 | 4 ]    [ 1  2  3 | 4 ]    [ 1  0  −1 | −2 ]
[ 5  6  7 | 8 ] →  [ 0  1  2 | 3 ] →  [ 0  1   2 |  3 ].

Thus, the system is row equivalent to the reduced system Bx = d, where B is given in (2.12) and d = [−2  3]ᵀ. The augmented system can be written as

x₁ = x₃ − 2,
x₂ = −2x₃ + 3,

or equivalently as x = x′ + 𝒩(A), where x′ = [−2  3  0]ᵀ and 𝒩(A) is given in the previous example.

If we think of the matrices E_i as describing a change of basis, then they are changing the basis of the codomain, but they leave the domain unchanged. The columns of A are vectors in the codomain that span the range of A. The columns of B are those same vectors expressed in a new basis, and their span is the range of the same linear transformation, but expressed in this new basis. However, if B is in RREF, then it is easy to see that the columns with leading entries (called basic columns) span the range (again, expressed in terms of the new basis). This means that the corresponding columns of the original matrix A span the range (in terms of the original basis).

Example 2.7.19. In the previous two examples, the first and second columns are basic and obviously linearly independent, and the corresponding columns from the original matrix are also linearly independent and form a basis of the range, so we have

ℛ(A) = span{ [ 1 ] , [ 2 ] }.
             [ 5 ]   [ 6 ]

In summary, to find a basis for the kernel of a matrix A, row reduce A to RREF and write the basic variables in terms of the free variables. To find a general solution to Ax = b, row reduce the augmented matrix [A | b] to RREF and use it to find a single solution y of the system. The coset y + 𝒩(A) is the set of all solutions of Ax = b. Finally, the range of A is the span of the columns, but these columns may not be linearly independent. To get a basis for the range of A, row reduce to RREF, and then the columns of A that correspond to basic variables form a basis for ℛ(A).
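For exact (symbolic) arithmetic, SymPy carries out this entire recipe. The sketch below (not part of the text) reproduces Examples 2.7.17-2.7.19:

```python
import sympy as sp

A = sp.Matrix([[1, 2, 3],
               [5, 6, 7]])
b = sp.Matrix([4, 8])

R, pivots = A.rref()          # RREF of A and the indices of the basic columns
print(R)                      # Matrix([[1, 0, -1], [0, 1, 2]])
print(pivots)                 # (0, 1)  ->  rank 2, nullity 1

print(A.nullspace())          # [Matrix([[1], [-2], [1]])], a basis for N(A)
print(A.columnspace())        # columns 1 and 2 of A, a basis for R(A)

Rb, _ = sp.Matrix.hstack(A, b).rref()
print(Rb[:, -1])              # [-2, 3], the values of the basic variables in a particular solution
```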

2.8 Determinants I
In this section and the next, we develop the classical theory of determinants. Most linear algebra texts define the determinant using the cofactor expansion. This is a fair approach, but it is difficult to prove some of the properties of the determinant rigorously this way, and so a little hand waving is common. Instead, we develop the determinant using permutations. While this approach is initially a little more difficult, it allows us to give a rigorous derivation of the main properties of the determinant. Moreover, permutations are a powerful mathematical tool with broad applicability beyond determinants, and so we favor this approach pedagogically.

We begin by developing properties of permutations. We then prove two


key properties of the determinant, specifically that the determinant of an upper-
triangular matrix is the product of its diagonal elements and that the determinant
of a matrix is equal to the determinant of its transpose.
Neither the cofactor definition nor the permutation definition are well suited
to numerical computation. Indeed, the number of arithmetic operations required to
compute the determinant of an n x n matrix, using either definition, is proportional
to n factorial. In the following section, we introduce a much more efficient way of
computing the determinant using row reduction. This provides an algorithm that
grows in proportion to n 3 instead of n!.
Finally, we remark that while determinants are useful in developing theory,
they are seldom used in computation because there are generally faster methods
available. As a result there are schools of thought that eschew determinants al-
together. One place where determinants are essential, however, is in describing
how a linear transformation changes volumes. For example, given a parallelepiped spanned by n vectors in ℝⁿ, the absolute value of the determinant of the matrix whose columns are these vectors is the volume of the parallelepiped. And a linear transformation L takes the unit n-cube spanned by the standard basis vectors to a parallelepiped spanned by the columns of the matrix representation of L. So the transformation L changes the cube to a parallelepiped of volume |det(L)|.
This fact is essential to changing variables in multidimensional integration (see
Section 8. 7) .

2.8.1 Permutations

Throughout this section, assume that n is a positive integer and T_n = {1, 2, …, n}.

Definition 2.8.1. A permutation σ of a set T_n is a bijection σ : T_n → T_n. The set of all permutations of T_n is denoted S_n and is called the symmetric group of order n.

Remark 2 .8.2. The composition of two permutations on a given set is again a per-
mutation, and the inverse of every permutation is again a permutation. A nonempty
set of functions is called a group if it has these two properties.

Notation 2.8.3. If σ ∈ S_n is given by σ(j) = i_j for every j ∈ T_n, then we denote σ by the ordered tuple [i₁, …, i_n]. For example, if σ ∈ S₃ is such that σ(1) = 2, σ(2) = 3, and σ(3) = 1, then we write σ as the 3-tuple [2, 3, 1].

Example 2.8.4. We have

S₁ = {[1]},
S₂ = {[1, 2], [2, 1]},
S₃ = {[1, 2, 3], [2, 3, 1], [3, 1, 2], [2, 1, 3], [1, 3, 2], [3, 2, 1]}.

Note that |S₁| = 1, |S₂| = 2, and |S₃| = 6. For general values of n, we have the following theorem.

Theorem 2.8.5. The number of permutations of the set T_n is n!; that is, |S_n| = n!.

Proof. We proceed by induction on n ∈ ℕ. By the previous example, we know that the base case holds; thus we assume that |S_{n−1}| = (n − 1)!. For each permutation [σ(1), σ(2), …, σ(n − 1)] in S_{n−1}, we can create n permutations of T_n as follows:

[n, σ(1), …, σ(n − 1)], [σ(1), n, …, σ(n − 1)], …, [σ(1), …, σ(n − 1), n].

All permutations of T_n can be produced in this way, and no two are the same. Thus, there are n · (n − 1)! = n! permutations in S_n. □

Definition 2.8.6. Let σ : T_n → T_n be a permutation. An inversion in σ is a pair (σ(i), σ(j)) such that i < j and σ(i) > σ(j).

Definition 2.8.7. A permutation σ is called even if it has an even number of inversions. It is called odd if it has an odd number of inversions. The sign of the permutation is defined as

sign σ = {  1   if σ is even,
           −1   if σ is odd.

Example 2.8.8. The permutation σ = [3, 2, 4, 5, 1] on T₅ has 5 inversions, namely, (3, 2), (3, 1), (2, 1), (4, 1), and (5, 1). Thus, σ is an odd permutation.
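The count of inversions, and hence the sign, of a permutation stored as a tuple can be computed directly from Definitions 2.8.6 and 2.8.7. The following Python sketch is our own illustration (the function names are not from the text):

    def inversions(perm):
        """Count pairs of positions i < j with perm[i] > perm[j]."""
        n = len(perm)
        return sum(1 for i in range(n) for j in range(i + 1, n)
                   if perm[i] > perm[j])

    def sign(perm):
        """Sign of a permutation: +1 if even, -1 if odd."""
        return 1 if inversions(perm) % 2 == 0 else -1

    # The permutation [3, 2, 4, 5, 1] of Example 2.8.8 has 5 inversions, so it is odd.
    assert inversions((3, 2, 4, 5, 1)) == 5 and sign((3, 2, 4, 5, 1)) == -1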

Example 2.8.9. A permutation that swaps two elements, fixing the rest, is a transposition. More precisely, σ is a transposition of Tₙ if there exist i, j ∈ Tₙ, i ≠ j, such that σ(i) = j, σ(j) = i, and σ(k) = k for every k ≠ i, j. For example, the permutations [2, 1, 3, 4, 5] and [1, 2, 5, 4, 3] are both transpositions of T₅.
Assuming i < j, the inversions of the transposition are (j, i), (j, i + 1), (j, i + 2), ..., (j, j - 1) and (i + 1, i), (i + 2, i), ..., (j - 1, i). This adds up to 2(j - i) - 1, which is odd. Hence, the sign of the transposition is -1.

Lemma 2.8.10 (Equivalent Definition of sign(σ)). Let σ ∈ Sₙ. Consider the polynomial p(x_1, x_2, ..., x_n) = ∏_{i<j} (x_i - x_j). The sign of a permutation is given as

    sign(σ) = p(x_{σ(1)}, x_{σ(2)}, ..., x_{σ(n)}) / p(x_1, x_2, ..., x_n).        (2.13)

Proof. It is straightforward to see that (2.13) reduces to (-1)^m, where m is the number of inversions, and thus even (respectively, odd) permutations are positive (respectively, negative). □

Example 2.8.11. When n = 4 and σ = [2, 4, 3, 1], we have

    sign(σ) = p(x_{σ(1)}, x_{σ(2)}, x_{σ(3)}, x_{σ(4)}) / p(x_1, x_2, x_3, x_4)
            = p(x_2, x_4, x_3, x_1) / p(x_1, x_2, x_3, x_4)
            = [(x_2 - x_4)(x_2 - x_3)(x_2 - x_1)(x_4 - x_3)(x_4 - x_1)(x_3 - x_1)]
              / [(x_1 - x_2)(x_1 - x_3)(x_1 - x_4)(x_2 - x_3)(x_2 - x_4)(x_3 - x_4)]
            = 1 · 1 · (-1) · (-1) · (-1) · (-1) = 1 = (-1)^4.

Observe that there are four inversions, so sign(σ) = 1.

The next theorem is an important tool for working with signs.

Theorem 2.8.12. The sign of the composition of two permutations is equal to the product of their signs; that is, if σ, τ ∈ Sₙ, then sign(σ ∘ τ) = sign(σ) sign(τ).

Proof. Note that x_1, x_2, ..., x_n in (2.13) are dummy variables, so we could just as well have used y_n, y_{n-1}, ..., y_1 or any other set of variables in any other initial order. Thus, for any σ, τ ∈ Sₙ we have

    sign(σ) = p(x_{σ(τ(1))}, x_{σ(τ(2))}, ..., x_{σ(τ(n))}) / p(x_{τ(1)}, x_{τ(2)}, ..., x_{τ(n)}).

Computing the sign of στ using (2.13), we have

    sign(στ) = p(x_{σ(τ(1))}, x_{σ(τ(2))}, ..., x_{σ(τ(n))}) / p(x_1, x_2, ..., x_n)
             = [p(x_{σ(τ(1))}, ..., x_{σ(τ(n))}) / p(x_{τ(1)}, ..., x_{τ(n)})] · [p(x_{τ(1)}, ..., x_{τ(n)}) / p(x_1, ..., x_n)]
             = sign(σ) sign(τ). □

Corollary 2.8.13. If σ is a permutation, then sign(σ⁻¹) = sign(σ).

Proof. If e is the identity map, then 1 = sign(e) = sign(σσ⁻¹) = sign(σ) sign(σ⁻¹). Thus, if σ is even (respectively, odd), then so is σ⁻¹. □

2.8.2 Definition of the Determinant

Definition 2.8.14. Let A = [a_ij] be in Mₙ(𝔽) and σ ∈ Sₙ. The elementary product associated with σ ∈ Sₙ is the product a_{1σ(1)} a_{2σ(2)} ··· a_{nσ(n)}.

Remark 2.8.15. An elementary product contains exactly one element from each
row of A and exactly one element of each column.

Example 2.8.16. Let

    A = [ a_11  a_12  a_13 ]
        [ a_21  a_22  a_23 ]                    (2.14)
        [ a_31  a_32  a_33 ]

and let σ ∈ S₃ be given by [3, 1, 2]. The elementary product is a_13 a_21 a_32.
Table 2.1 contains a complete list of permutations and their corresponding elementary products for a 3 × 3 matrix.

Definition 2.8.17. If A = [a_ij] ∈ Mₙ(𝔽), then the determinant of A is the quantity

    det(A) = Σ_{σ∈Sₙ} sign(σ) a_{1σ(1)} a_{2σ(2)} ··· a_{nσ(n)}.

Example 2.8.18. Given n = 2, there are two permutations on T₂, namely, σ₁ = [1, 2] and σ₂ = [2, 1]. Since sign(σ₁) = 1 and sign(σ₂) = -1, it follows that the determinant of the matrix

    A = [ a_11  a_12 ]
        [ a_21  a_22 ]

is det(A) = a_11 a_22 - a_12 a_21.

    σ           sign(σ)   elementary product
    [1, 2, 3]      1       a_11 a_22 a_33
    [1, 3, 2]     -1       a_11 a_23 a_32
    [2, 3, 1]      1       a_12 a_23 a_31
    [2, 1, 3]     -1       a_12 a_21 a_33
    [3, 1, 2]      1       a_13 a_21 a_32
    [3, 2, 1]     -1       a_13 a_22 a_31

Table 2.1. A list of the permutations, their signs, and the corresponding elementary products for a generic 3 × 3 matrix.

Example 2.8.19. Let A be the arbitrary 3 × 3 matrix as given in (2.14). Using Table 2.1, we have

    det(A) = a_11 a_22 a_33 - a_11 a_23 a_32 + a_12 a_23 a_31 - a_12 a_21 a_33 + a_13 a_21 a_32 - a_13 a_22 a_31.
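Definition 2.8.17 translates directly into code. The following Python sketch is our own illustration (not from the text); it sums the n! signed elementary products and is useful only for very small matrices, because of the factorial growth discussed at the start of this section:

    from itertools import permutations

    def det_by_permutations(A):
        """Determinant as the sum over S_n of signed elementary products."""
        n = len(A)
        total = 0
        for perm in permutations(range(n)):
            # sign of the permutation from its inversion count
            inv = sum(1 for i in range(n) for j in range(i + 1, n)
                      if perm[i] > perm[j])
            sgn = -1 if inv % 2 else 1
            prod = 1
            for i in range(n):
                prod *= A[i][perm[i]]   # elementary product a_{1,sigma(1)} ... a_{n,sigma(n)}
            total += sgn * prod
        return total

    # 2 x 2 case of Example 2.8.18: a11*a22 - a12*a21
    assert det_by_permutations([[1, 2], [3, 4]]) == 1*4 - 2*3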

2.8.3 Elementary Properties of Determinants

Theorem 2.8.20. The determinant of an upper-triangular matrix is the product of its diagonal entries; that is, det(A) = ∏_{i=1}^n a_ii.

Proof. Since A = [a_ij] is upper triangular, then a_ij = 0 whenever i > j. Thus, for an elementary product a_{1σ(1)} a_{2σ(2)} ··· a_{nσ(n)} to be nonzero, it is necessary that σ(k) ≥ k for each k. But σ(n) ≥ n implies that σ(n) = n. Similarly, σ(n - 1) ≥ n - 1, but since σ(n) = n, it follows that σ(n - 1) = n - 1. Working backwards, we see that σ(k) = k for all k. Thus, σ is the identity, and all other elementary products are zero. Therefore, det(A) = a_11 a_22 ··· a_nn. □

Theorem 2.8.21. If A ∈ Mₙ(𝔽), then det(A) = det(Aᵀ).

Proof. Let B = [b_ij] = Aᵀ; that is, b_ij = a_ji. It follows that

    det(B) = Σ_{σ∈Sₙ} sign(σ) b_{1σ(1)} b_{2σ(2)} ··· b_{nσ(n)}
           = Σ_{σ∈Sₙ} sign(σ) a_{σ(1)1} a_{σ(2)2} ··· a_{σ(n)n}.

Rearranging the elements in each product, we get

    a_{σ(1)1} a_{σ(2)2} ··· a_{σ(n)n} = a_{1σ⁻¹(1)} a_{2σ⁻¹(2)} ··· a_{nσ⁻¹(n)},

and by Corollary 2.8.13 we have sign(σ) = sign(σ⁻¹). As σ runs through all the elements of Sₙ, so does σ⁻¹; thus, by writing τ = σ⁻¹, we have

    det(B) = Σ_{σ∈Sₙ} sign(σ) a_{σ(1)1} a_{σ(2)2} ··· a_{σ(n)n}
           = Σ_{τ∈Sₙ} sign(τ) a_{1τ(1)} a_{2τ(2)} ··· a_{nτ(n)} = det(A). □

2.9 Determinants II
In this section we show that determinants can be computed using row reduction.
This provides a practical approach to computing the determinant, as well as a
way to prove several important properties of determinants. We also show that
the determinant can be computed using cofactor expansions. From there we prove
Cramer's rule and define the adjugate of a matrix, which relates the determinant
to the inverse of a matrix.

2.9.1 Row Operations on Determinants

Theorem 2.9.1. If A, B ∈ Mₙ(𝔽) and if E is an elementary matrix such that B = EA, then det(B) = det(E) det(A). Specifically, if E is of

(i) type I (R_k ↔ R_ℓ), then det(B) = -det(A);

(ii) type II (R_k → αR_k), then det(B) = α det(A) (this holds even if α = 0);

(iii) type III (R_ℓ → αR_k + R_ℓ), with k ≠ ℓ, then det(B) = det(A).

Proof. Let A = [a_ij] and B = [b_ij].

(i) Let τ be the transposition k ↔ ℓ (assume k < ℓ). Note that b_ij = a_{τ(i)j} and that τ⁻¹ = τ. Thus,

    det(B) = Σ_{σ∈Sₙ} sign(σ) b_{1σ(1)} b_{2σ(2)} ··· b_{kσ(k)} ··· b_{ℓσ(ℓ)} ··· b_{nσ(n)}
           = Σ_{σ∈Sₙ} sign(σ) a_{1σ(1)} a_{2σ(2)} ··· a_{ℓσ(k)} ··· a_{kσ(ℓ)} ··· a_{nσ(n)}
           = Σ_{σ∈Sₙ} sign(τ⁻¹) sign(στ) a_{1σ(τ(1))} a_{2σ(τ(2))} ··· a_{nσ(τ(n))}
           = -Σ_{ν∈Sₙ} sign(ν) a_{1ν(1)} a_{2ν(2)} ··· a_{nν(n)} = -det(A).

Here the last equality comes from setting ν = στ and noticing that when σ ranges through every element of Sₙ, then so does ν.

(ii) Note that b_ij = a_ij when i ≠ k and b_ij = α a_ij when i = k. Thus,

    det(B) = Σ_{σ∈Sₙ} sign(σ) b_{1σ(1)} ··· b_{nσ(n)}
           = Σ_{σ∈Sₙ} α sign(σ) a_{1σ(1)} ··· a_{nσ(n)} = α det(A).

(iii) Note that b_ij = a_ij when i ≠ ℓ and b_ij = a_ij + α a_kj when i = ℓ. Thus,

    det(B) = Σ_{σ∈Sₙ} sign(σ) b_{1σ(1)} b_{2σ(2)} ··· b_{nσ(n)}
           = Σ_{σ∈Sₙ} sign(σ) a_{1σ(1)} ··· (a_{ℓσ(ℓ)} + α a_{kσ(ℓ)}) ··· a_{nσ(n)}
           = det(A) + α det(C),

where C = [c_ij] satisfies c_ij = a_ij when i ≠ ℓ and c_ij = a_kj when i = ℓ. Since two of the rows of C are identical, interchanging those two rows cannot change the determinant of C, but we know from part (i) that interchanging any two rows changes the sign of det(C). Therefore, det(C) = -det(C), which can only be true if det(C) = 0, and so we have det(B) = det(A). □

In the proof of (iii) above, we also proved the following corollary of (i) .

Corollary 2.9.2. If A E Mn(lF) has two identical rows, then det(A) = 0.



Remark 2.9.3. For notational convenience, we often write the determinant of the matrix A = [a_ij] ∈ Mₙ(𝔽) as

    det(A) = | a_11  a_12  ⋯  a_1n |
             | a_21  a_22  ⋯  a_2n |
             |  ⋮     ⋮         ⋮  |
             | a_n1  a_n2  ⋯  a_nn |.
We can now perform row operations to compute the determinant.

Example 2.9.4. Using row reduction, we compute the determinant of the matrix

    A = [ -2  -4  -6 ]
        [  4   5   6 ]
        [  7   8  10 ].

We perform the following operations:

    | -2 -4 -6 |        | 1  2  3 |        | 1  2   3 |
    |  4  5  6 | = -2 · | 4  5  6 | = -2 · | 0 -3  -6 |
    |  7  8 10 |        | 7  8 10 |        | 0 -6 -11 |

                        | 1  2   3 |       | 1  2  3 |
                  = 6 · | 0  1   2 | = 6 · | 0  1  2 | = 6.
                        | 0 -6 -11 |       | 0  0  1 |
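The row-reduction computation in Example 2.9.4 can be automated. The sketch below is our own illustration (it adds partial pivoting, a standard numerical safeguard not used in the hand computation); it tracks the sign changes from row swaps exactly as Theorem 2.9.1(i) prescribes and costs on the order of n^3 operations:

    import numpy as np

    def det_by_row_reduction(A):
        """O(n^3) determinant: reduce to upper-triangular form, tracking row swaps."""
        U = np.array(A, dtype=float)
        n = U.shape[0]
        det = 1.0
        for k in range(n):
            p = k + np.argmax(np.abs(U[k:, k]))    # pivot row (partial pivoting)
            if U[p, k] == 0:
                return 0.0                         # singular matrix
            if p != k:
                U[[k, p]] = U[[p, k]]              # type I operation: flips the sign
                det = -det
            for i in range(k + 1, n):
                U[i] -= (U[i, k] / U[k, k]) * U[k] # type III operations: no effect on det
            det *= U[k, k]                         # Theorem 2.8.20: product of the diagonal
        return det

    A = [[-2, -4, -6], [4, 5, 6], [7, 8, 10]]
    print(det_by_row_reduction(A))    # 6.0, matching Example 2.9.4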

Corollary 2.9.5. If A ∈ Mₙ(𝔽), then det(αA) = αⁿ det(A) for all α ∈ 𝔽.

Proof. The proof is Exercise 2.49. □

Corollary 2.9.6. A matrix A ∈ Mₙ(𝔽) is nonsingular if and only if det(A) ≠ 0.

Proof. The matrix A is nonsingular if and only if it is row equivalent to the identity (see Theorem 2.7.14). But by the previous theorem, row equivalence changes the determinant only by a nonzero scalar (recall that we do not allow α = 0 in the type II elementary matrices). Thus, if A is invertible, then its determinant is a nonzero scalar times det(I) = 1, so det(A) is nonzero. On the other hand, if A is singular, then it row reduces to an upper-triangular matrix with at least one zero on the diagonal, and hence (by Theorem 2.8.20) it has determinant zero. □

Theorem 2.9.7. If A, B ∈ Mₙ(𝔽), then det(AB) = det(A) det(B).

Proof. If A is singular, then AB is also singular by Exercise 2.11. Thus, both det(AB) = 0 and det(A) = 0 by Corollary 2.9.6. If A is nonsingular, then A can be written as a product of elementary matrices A = E_k E_{k-1} ··· E_1. It follows that

    det(AB) = det(E_k E_{k-1} ··· E_1 B) = det(E_k) det(E_{k-1}) ··· det(E_1) det(B)
            = det(E_k E_{k-1} ··· E_1) det(B) = det(A) det(B). □

Corollary 2.9.8. If A ∈ Mₙ(𝔽) is nonsingular, then det(A⁻¹) = 1/det(A).

Proof. Since 1 = det(I) = det(AA⁻¹) = det(A) det(A⁻¹), the desired result follows. □

Corollary 2.9.9. If A, B ∈ Mₙ(𝔽) are similar matrices, that is, B = P⁻¹AP for some invertible matrix P ∈ Mₙ(𝔽), then det(A) = det(B).

Proof. det(B) = det(P⁻¹AP) = det(P⁻¹) det(A) det(P) = det(A). □

Remark 2.9.10. Since the determinant is preserved under similarity transformations, we can define the determinant of a linear operator L on a finite-dimensional vector space as the determinant of any of its matrix representations.

2.9.2 Cofactor Expansions and the Adjugate

The cofactor expansion gives another approach for defining the determinant of a matrix. We use this approach to construct the adjugate, which can be used to give an explicit formula for the inverse of a nonsingular matrix. Although the formula is not well suited to numerical computation, it does give us some powerful theoretical tools, most notably Cramer's rule.

Definition 2.9.11. The (i, j) submatrix A_ij of A ∈ Mₙ(𝔽) is the (n - 1) × (n - 1) matrix obtained by removing row i and column j from A. The (i, j) minor M_ij of A is the determinant of A_ij. We call (-1)^{i+j} M_ij the (i, j) cofactor of A.

Example 2.9.12. Using the matrix A from Example 2.9.4, we have that

    A_23 = [ -2  -4 ]
           [  7   8 ].

Moreover, the (2, 3) minor of A is M_23 = 12, whereas the (2, 3) cofactor of A is equal to (-1)^5 (12) = -12.

Lemma 2.9.13. If A_ij is the (i, j) submatrix of A ∈ Mₙ(𝔽), then the signed elementary products associated with (-1)^{i+j} a_ij A_ij are the same as those associated with A.

Remark 2.9.14. How can the elementary products associated with (-1)^{i+j} a_ij A_ij be the same as those associated with A? Suppose we fix i and j and then sort the elementary products of the matrix into two groups: those that contain a_ij, and those that do not. We find that there are (n-1)! elementary products that contain a_ij and (n-1) · (n-1)! elementary products that do not contain a_ij. So, the number of elementary products of A_ij is the same as the number of elementary products of A that contain a_ij.
Recall also that an elementary product contains exactly one element from each row of A and each column of A (see Remark 2.8.15). The elementary products of A that contain a_ij as a factor cannot have another element that lies in row i or column j as a factor; in fact the other factors of the elementary product must come from precisely the submatrix A_ij. All that remains is to verify the correct sign of the elementary product and to formalize the proof of the lemma. This is done below.

Proof. Let i, j ∈ {1, 2, ..., n} be given. Each σ̄ ∈ S_{n-1} corresponds to an elementary product in the submatrix A_ij. We define σ ∈ Sₙ so that the resulting elementary product of A is the same as that which comes by multiplying a_ij by the elementary product of A_ij coming from σ̄. Specifically, we have

    σ(k) = { σ̄(k),          k < i and σ̄(k) < j,
             σ̄(k) + 1,      k < i and σ̄(k) ≥ j,
             j,              k = i,
             σ̄(k - 1),      k > i and σ̄(k - 1) < j,
             σ̄(k - 1) + 1,  k > i and σ̄(k - 1) ≥ j.

The number of inversions of σ equals the number of inversions of σ̄ plus the number of inversions that involve j. Thus, we need only find the number of inversions that depend on j.
Let k be the number of elements of the n-tuple σ = [σ(1), σ(2), ..., σ(n)] that are greater than j, yet whose position in the n-tuple precedes that of j. Thus, n - j - k is the number of elements of σ greater than j that follow j. Since there are n - i spots that follow j, it follows that there are (n - i) - (n - j - k) elements smaller than j that follow j. Hence, there are k + (n - i) - (n - j - k) = i + j + 2(k - i) inversions due to j. Since (-1)^{i+j+2(k-i)} = (-1)^{i+j}, the signs match the elementary products of A with those in (-1)^{i+j} a_ij A_ij. □

Example 2.9.15. Let i = 2, j = 5, and n = 7. If σ̄ = [4, 3, 2, 1, 6, 5], then

    σ(1) = σ̄(1) = 4                        since 1 < 2 and σ̄(1) < 5,
    σ(2) = 5                                since 2 = 2,
    σ(3) = σ̄(3 - 1) = σ̄(2) = 3            since 3 > 2 and σ̄(2) < 5,
    σ(4) = σ̄(4 - 1) = σ̄(3) = 2            since 4 > 2 and σ̄(3) < 5,
    σ(5) = σ̄(5 - 1) = σ̄(4) = 1            since 5 > 2 and σ̄(4) < 5,
    σ(6) = σ̄(6 - 1) + 1 = σ̄(5) + 1 = 7    since 6 > 2 and σ̄(5) ≥ 5,
    σ(7) = σ̄(7 - 1) + 1 = σ̄(6) + 1 = 6    since 7 > 2 and σ̄(6) ≥ 5.

Thus, σ = [4, 5, 3, 2, 1, 7, 6].

The next theorem gives an alternative way to compute the determinant of a matrix, called the cofactor expansion.

Theorem 2.9.16. Let M_ij be the (i, j) minor of A ∈ Mₙ(𝔽). If A = [a_ij], then

(i) for any fixed j ∈ {1, ..., n}, we have det(A) = Σ_{i=1}^n (-1)^{i+j} a_ij M_ij;

(ii) for any fixed i ∈ {1, ..., n}, we have det(A) = Σ_{j=1}^n (-1)^{i+j} a_ij M_ij.

Proof. For each (i, j) pair, there are (n - 1)! elementary products in A_ij. By taking a column sum of cofactors of A (as in (i)), or by taking a row sum (as in (ii)), we get n! unique elementary products, and by the previous lemma, each of these occurs with the correct sign. □

Example 2.9.17. Theorem 2.9.16 says that we can split the determinant of the arbitrary 3 × 3 matrix (2.14) into a sum like the one below, expanding down the first column,

    det(A) = a_11 det[ a_22 a_23; a_32 a_33 ] - a_21 det[ a_12 a_13; a_32 a_33 ] + a_31 det[ a_12 a_13; a_22 a_23 ],

or alternatively, expanding across the first row, as

    det(A) = a_11 det[ a_22 a_23; a_32 a_33 ] - a_12 det[ a_21 a_23; a_31 a_33 ] + a_13 det[ a_21 a_22; a_31 a_32 ].

Example 2.9.18. Let A ∈ M₄(𝔽) be a matrix whose fourth row is [0 2 0 0] and whose (4, 2) submatrix is

    A_42 = [ 1  3  4 ]
           [ 5  7  8 ]
           [ 1  2  3 ].

We may expand det(A) in cofactors along the bottom row (choose i = 4 in Theorem 2.9.16(ii)) as

    det(A) = Σ_{j=1}^4 (-1)^{4+j} a_{4j} M_{4j} = 2 M_42.

Next, we compute M_42 by expanding along the first column (choose j = 1 in Theorem 2.9.16(i)). We find

    M_42 = | 1 3 4 |
           | 5 7 8 | = Σ_{i=1}^3 (-1)^{i+1} a_{i1} M_{i1}
           | 1 2 3 |
         = 1·M_11 - 5·M_21 + 1·M_31
         = (21 - 16) - 5(9 - 8) + (24 - 28) = -4,

where the M_ij here correspond to the minors of the 3 × 3 matrix A_42. It follows that det(A) = 2(-4) = -8.
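Theorem 2.9.16 also translates into a short recursive routine. The sketch below is our own illustration (not the text's algorithm); it expands along the first row, and like the permutation formula it costs roughly n! operations, so it is shown only to mirror the hand computation above:

    def det_by_cofactors(A):
        """Recursive cofactor expansion along the first row."""
        n = len(A)
        if n == 1:
            return A[0][0]
        total = 0
        for j in range(n):
            # submatrix obtained by deleting row 1 and column j+1
            sub = [row[:j] + row[j+1:] for row in A[1:]]
            total += (-1) ** j * A[0][j] * det_by_cofactors(sub)
        return total

    # The 3 x 3 submatrix A_42 from Example 2.9.18:
    print(det_by_cofactors([[1, 3, 4], [5, 7, 8], [1, 2, 3]]))    # -4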

Definition 2.9.19. Assume M_ij is the (i, j) minor of A = [a_ij] ∈ Mₙ(𝔽). The adjugate adj(A) = [c_ij] of A is given by c_ij = (-1)^{i+j} M_ji (note that the indices of the cofactors are transposed (i, j) ↦ (j, i) in the adjugate).

Remark 2.9.20. Some people call this matrix the classical adjoint, but this is a
confusing name because the adjoint has another meaning that is more standard
(used in Chapter 3) . We always call the matrix in Definition 2.9.19 the adjugate.

Example 2.9.21. Consider the matrix A from Example 2.9.4. Exercise 2.48 shows that

    adj(A) = [  2   -8    6 ]
             [  2   22  -12 ]
             [ -3  -12    6 ].

The adjugate can be used to give an explicit formula for the inverse of a
matrix. Although this formula is not well suited to computation, it is a powerful
theoretical tool.

Theorem 2.9.22. If A ∈ Mₙ(𝔽), then A adj(A) = det(A) I.

Proof. Let A = [a_ij] and adj(A) = [c_ij]. Note that the (i, k) entry of A adj(A) is

    Σ_{j=1}^n a_ij c_jk = Σ_{j=1}^n (-1)^{k+j} a_ij M_kj.

Now if i = k, then the (i, i) entry of A adj(A) is Σ_{j=1}^n (-1)^{i+j} a_ij M_ij, which is equal to det(A) by Theorem 2.9.16.
Consider now the matrix B which is created by replacing row k of matrix A by row i of matrix A. By Theorem 2.9.16,

    det(B) = Σ_{j=1}^n (-1)^{k+j} b_kj B_kj = Σ_{j=1}^n (-1)^{k+j} a_ij B_kj = Σ_{j=1}^n (-1)^{k+j} a_ij M_kj,

where B_kj is the (k, j) minor of B; the last equality holds because B and A differ only in row k, so B_kj = M_kj.
However, by Corollary 2.9.2, det(B) = 0. Applying this to the theorem at hand, if i ≠ k, then the (i, k) entry of A adj(A) is Σ_{j=1}^n (-1)^{k+j} a_ij M_kj = 0. Thus,

    Σ_{j=1}^n a_ij c_jk = Σ_{j=1}^n (-1)^{k+j} a_ij M_kj = { det(A) if i = k,
                                                            0      if i ≠ k.  □

Corollary 2.9.23. If A ∈ Mₙ(𝔽) is nonsingular, then A⁻¹ = adj(A)/det(A).
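Theorem 2.9.22 and Corollary 2.9.23 are easy to check numerically. The sketch below is our own illustration: the adjugate is built as the transposed matrix of cofactors, and we verify that A adj(A) = det(A) I for the matrix of Example 2.9.4:

    import numpy as np

    def adjugate(A):
        """adj(A): the transpose of the matrix of cofactors."""
        A = np.asarray(A, dtype=float)
        n = A.shape[0]
        adj = np.zeros((n, n))
        for i in range(n):
            for j in range(n):
                sub = np.delete(np.delete(A, i, axis=0), j, axis=1)
                adj[j, i] = (-1) ** (i + j) * np.linalg.det(sub)   # transposed indices
        return adj

    A = np.array([[-2, -4, -6], [4, 5, 6], [7, 8, 10]], dtype=float)
    print(np.round(adjugate(A)))       # rows [2 -8 6], [2 22 -12], [-3 -12 6]
    print(np.round(A @ adjugate(A)))   # 6 * I, since det(A) = 6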

One of the most important applications of the adjugate is Cramer's rule, which
allows us to write an explicit formula for the solution to a system of linear equations.
Although the equations become unwieldy very rapidly, they are useful for proving
various properties of the solutions, and this can often make solving the system much
simpler.

Corollary 2.9.24 (Cramer's Rule). If A ∈ Mₙ(𝔽) is nonsingular, then the unique solution to Ax = b is

    x = A⁻¹b = adj(A) b / det(A).                 (2.15)

Moreover, if A_i(b) ∈ Mₙ(𝔽) is the matrix A with the ith column replaced by b, then the ith coordinate of x is

    x_i = det(A_i(b)) / det(A).                   (2.16)

Proof. Note that (2.15) is an immediate consequence of Corollary 2.9.23. To prove (2.16), let A = [a_ij], b = [b_1 b_2 ⋯ b_n]ᵀ, and adj(A) = [c_ij]. Thus,

    x_i = (1/det(A)) Σ_{j=1}^n c_ij b_j = (1/det(A)) Σ_{j=1}^n (-1)^{i+j} b_j M_ji = det(A_i(b)) / det(A). □

Example 2.9.25. Consider the linear system Ax = b, where

    A = [ -2  -4  -6 ]
        [  4   5   6 ]
        [  7   8  10 ],

as in Example 2.9.4, and b = [6 6 1]ᵀ. Using (2.16), we compute

    det(A_1(b)) = | 6 -4 -6 |          det(A_2(b)) = | -2  6 -6 |
                  | 6  5  6 | = -30,                 |  4  6  6 | = 132,
                  | 1  8 10 |                        |  7  1 10 |

    and det(A_3(b)) = | -2 -4  6 |
                      |  4  5  6 | = -84.
                      |  7  8  1 |

Since det(A) = 6, we have that x = [-5 22 -14]ᵀ.
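Cramer's rule is also easy to verify numerically. The following Python sketch is our own illustration; it rebuilds the computation of Example 2.9.25 by replacing one column of A at a time:

    import numpy as np

    def cramer_solve(A, b):
        """Solve Ax = b by Cramer's rule (2.16); practical only for small n."""
        A = np.asarray(A, dtype=float)
        b = np.asarray(b, dtype=float)
        d = np.linalg.det(A)
        x = np.empty(len(b))
        for i in range(len(b)):
            Ai = A.copy()
            Ai[:, i] = b               # A_i(b): replace column i by b
            x[i] = np.linalg.det(Ai) / d
        return x

    A = [[-2, -4, -6], [4, 5, 6], [7, 8, 10]]
    b = [6, 6, 1]
    print(cramer_solve(A, b))          # [-5. 22. -14.], as in Example 2.9.25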

Exercises
Note to the student: Each section of this chapter has several corresponding
exercises, all collected here at the end of the chapter. The exercises between the
first and second lines are for Section 1, the exercises between the second and third lines are for Section 2, and so forth.
You should work every exercise (your instructor may choose to let you skip some of the advanced exercises marked with *). We have carefully selected them, and each is important for your ability to understand subsequent material. Many of the examples and results proved in the exercises are used again later in the text. Exercises marked with ▲ are especially important and are likely to be used later in this book and beyond. Those marked with † are harder than average, but should still be done.
Although they are gathered together at the end of the chapter, we strongly
recommend you do the exercises for each section as soon as you have completed
the section, rather than saving them until you have finished the entire chapter. The
exercises for each section are separated from those for other sections by a
horizontal line.

2.1. Determine whether each of the following is a linear transformation from ℝ² to ℝ². If it is a linear transformation, give 𝒩(L) and ℛ(L).
    (i) L(x, y) = (y, x).
    (ii) L(x, y) = (x, 0).
    (iii) L(x, y) = (x + 1, y + 1).
    (iv) L(x, y) = (x², y²).
2.2. Recall the vector space V = (0, ∞) given in Exercise 1.1. Show that the function T(x) = log x is a linear transformation from V into ℝ.
2.3. Determine whether each of the following is a linear transformation from 𝔽[x; 2] into 𝔽[x; 4]. Prove your answers.
    (i) L maps any polynomial p(x) to x²; that is, L[p](x) = x².
    (ii) L maps any polynomial p(x) to xp(x); that is, L[p](x) = xp(x).
    (iii) L[p](x) = x⁴ + p(x).
    (iv) L[p](x) = (4x² - 3x) p′(x).
2.4. Let L : C¹([0, 1]; 𝔽) → C([0, 1]; 𝔽) be given by L[f] = f′ + f. Show that L is linear. For 0 ≤ x ≤ 1, define

Verify that L[f] = g.


2.5. Prove Corollary 2.2.3 .

2.6. Let {V_i}_{i=1}^{n+1} be a collection of vector spaces and {L_i}_{i=1}^n a collection of invertible linear maps L_i : V_i → V_{i+1}. Prove that (L_n L_{n-1} ··· L_1)⁻¹ = L_1⁻¹ L_2⁻¹ ··· L_n⁻¹.
2.7. Prove Proposition 2.2.14(iv).
2.8. Let L be an operator on the vector space V and k ∈ ℕ. Prove that
    (i) 𝒩(L^k) ⊂ 𝒩(L^{k+1}),
    (ii) ℛ(L^{k+1}) ⊂ ℛ(L^k).
Note: We can assume that L⁰ is the identity operator.
2.9. Elaborating on Example 2.1.3(xii), consider the rotation matrix

    R_θ = [ cos θ   -sin θ ]
          [ sin θ    cos θ ].                      (2.17)

Show that multiplication of column vectors [x y]ᵀ ∈ 𝔽² by R_θ rotates vectors counterclockwise. This is the same as ρ_θ(x, y) in the example, but it is written as matrix-vector multiplication.
2.10. Show that the rotation matrix in the previous exercise satisfies R_{θ+φ} = R_θ R_φ.

2.11. Let L₁ and L₂ be operators on a finite-dimensional vector space. Prove that if L₁ is not invertible, then the product L₁L₂ is not invertible.
2.12.† Let L be a linear operator on a finite-dimensional vector space. For k ∈ ℕ, prove the following:
    (i) If 𝒩(L^k) = 𝒩(L^{k+1}), then 𝒩(L^ℓ) = 𝒩(L^k) for all ℓ ≥ k.
    (ii) If ℛ(L^{k+1}) = ℛ(L^k), then ℛ(L^ℓ) = ℛ(L^k) for all ℓ ≥ k.
    (iii) If ℛ(L^{k+1}) = ℛ(L^k), then ℛ(L^k) ∩ 𝒩(L^k) = {0}.
2.13. Let V, W, X be finite-dimensional vector spaces and let L : V → W and K : W → X be linear transformations. Prove that

    rank(KL) = rank(L) - dim(𝒩(K) ∩ ℛ(L)).

Hint: Use the rank-nullity theorem (Corollary 2.3.9) on K restricted to the domain ℛ(L), that is, on K|_{ℛ(L)} : ℛ(L) → X.

2.14. Given the setup of Exercise 2.13, prove the following inequalities:
    (i) rank(KL) ≤ min(rank(L), rank(K)).
    (ii) rank(K) + rank(L) - dim(W) ≤ rank(KL).
2.15. Let {W₁, W₂, ..., Wₙ} be a collection of subspaces of the vector space V. Show that the mapping L : V → V/W₁ × V/W₂ × ··· × V/Wₙ defined by L(v) = (v + W₁, v + W₂, ..., v + Wₙ) is a linear transformation and 𝒩(L) = ∩_{i=1}^n W_i. This is the vector-space analogue of the Chinese remainder theorem. The Chinese remainder theorem is treated in more detail in Chapter 15.
2.16.* Let V be a vector space and suppose that S ⊆ T ⊆ V are subspaces of V. Prove

2.17. Let L be a linear operator on 𝔽² that maps the basis vectors e₁ and e₂ as follows:
    L(e₁) = e₁ + 21e₂  and  L(e₂) = 2e₁ - e₂.
    (i) Compute L(2e₁ - 3e₂) and L²(2e₁ - 3e₂) in terms of e₁ and e₂.
    (ii) Determine the matrix representations of L and L².
2.18. Assuming the polynomial bases [1, x, x²] and [1, x, x², x³, x⁴] for 𝔽[x; 2] and 𝔽[x; 4], respectively, find the matrix representations for each of the linear transformations in Exercise 2.3.
2.19. Let L : 𝔽[x; 2] → 𝔽[x; 2] be given by L[p] = p + 4p′. Find the matrix representation of L with respect to the basis S = [x² + 1, x - 1, 1] (for both the domain and codomain).
2.20. Let a ≠ 0 be fixed, and let V be the space of infinitely differentiable real-valued functions spanned by S = [e^{ax}, xe^{ax}, x²e^{ax}]. Let D be a linear operator on V given by D[f](x) = f′(x). Find the matrix A representing D on S. (Tip: Save your answer. You need it to solve Exercise 2.41.)
2.21. Recall Euler's identity: e^{iθ} = cos θ + i sin θ. Consider C([0, 2π]; ℂ) as a vector space over ℂ. Let V be the subspace spanned by the vectors S = [e^{iθ}, e^{-iθ}]. Given that T = [cos θ, sin θ] is also a basis for V,
    (i) find the transition matrix from S into T;
    (ii) find the transition matrix from T into S;
    (iii) verify that the transition matrices are inverses of each other.
2.22. Let V be the vector space V of the previous exercise. Let D : V → V be the derivative operator D(f) = d f(θ)/dθ. Write the matrix representation of D in terms of
    (i) the basis S on both domain and codomain;
    (ii) the basis T on both domain and codomain;
    (iii) the basis S on the domain and basis T on the codomain.
2.23. Given a linear transformation defined by a matrix A, prove that the range of
the transformation is the span of the columns of A.

2.24. Show that the matrix representation of the operator L₂ : 𝔽[x; 4] → 𝔽[x; 4] given by L₂[p](x) = p″(x) on the standard polynomial basis [1, x, x², x³, x⁴] equals (C_{SS})², as given in Example 2.4.8. In other words, the matrix representation of the second derivative operator is the square of the matrix representation of the derivative operator.
2.25. The trace of an n × n matrix A = [a_ij], denoted tr(A), is the sum of its diagonal entries; that is, tr(A) = a_11 + a_22 + ··· + a_nn. Show the following for any n × n matrices A, B ∈ Mₙ(ℝ) (where B = [b_ij]):
    (i) tr(AB) = tr(BA).
    (ii) If A and B are similar, then tr(A) = tr(B). This implies, among other things, that the trace is determined only by the linear operator and does not depend on the choice of basis we use to write the matrix representation of that operator.
    (iii) tr(ABᵀ) = Σ_{i=1}^n Σ_{j=1}^n a_ij b_ij.
2.26. Let p(z) = a_k z^k + a_{k-1} z^{k-1} + ··· + a_1 z + a_0 be a polynomial. For A ∈ Mₙ(𝔽), we define p(A) = a_k A^k + a_{k-1} A^{k-1} + ··· + a_1 A + a_0 I. Prove: If A and B are similar, then p(A) and p(B) are also similar.
2.27. Prove that similarity is an equivalence relation on the set of all n × n matrices. What are all the equivalence classes of 1 × 1 matrices?
2.28. Prove that if two matrices are similar and one is invertible, then so is the other.
2.29. Prove that similarity preserves rank and nullity of matrices.
2.30. Prove: If 𝔽^m ≅ 𝔽^n, then m = n. Hint: Consider using the rank-nullity theorem.

2.31. A higher-degree Bernstein polynomial can be written in terms of lower-degree Bernstein polynomials. Let B_{-1}^{n-1}(x) = 0 and B_n^{n-1}(x) = 0 for all n ≥ 0. Show that for any n ≥ 1 we have

    B_j^n(x) = (1 - x) B_j^{n-1}(x) + x B_{j-1}^{n-1}(x),    j = 0, 1, ..., n.

2.32. Since the obvious inclusion i_{m,n} : 𝔽[x; m] → 𝔽[x; n] is a linear transformation for every 0 ≤ m ≤ n, and since the Bernstein polynomials form a basis of 𝔽[x; n], any lower-degree Bernstein polynomial can be written as a linear combination of higher-degree Bernstein polynomials. Show that

    B_j^n(x) = ((j + 1)/(n + 1)) B_{j+1}^{n+1}(x) + ((n - j + 1)/(n + 1)) B_j^{n+1}(x),    j = 0, 1, ..., n.

Write the matrix representation of i_{n,n+1} : 𝔽[x; n] → 𝔽[x; n + 1] in terms of the Bernstein bases.
2.33. The derivative gives a linear transformation D : 𝔽[x; n] → 𝔽[x; n - 1]. Since the set of Bernstein polynomials B = {B_0^{n-1}(x), B_1^{n-1}(x), B_2^{n-1}(x), ..., B_{n-1}^{n-1}(x)} forms a basis for 𝔽[x; n - 1], one can always write the derivative of B_j^n(x) as a linear combination of the Bernstein polynomial basis B.
    (i) Show that the derivative of B_j^n(x) can be written as

    D[B_j^n](x) = n (B_{j-1}^{n-1}(x) - B_j^{n-1}(x)).

    (ii) Write the matrix representation of the derivative D : 𝔽[x; n] → 𝔽[x; n - 1] in terms of the Bernstein bases.
    (iii) The previous step is different from Example 2.6.6 because the domain and codomain of the derivative are different. Use the previous step and Exercise 2.32 to write a product of matrices that gives the matrix representation of the derivative 𝔽[x; n] → 𝔽[x; n] (as an operator) in terms of the Bernstein basis.
2.34. The map I : 𝔽[x; n] → 𝔽 given by f ↦ ∫₀¹ f(x) dx is a linear transformation. Write the matrix representation of this linear transformation in terms of the Bernstein basis for 𝔽[x; n] and the (single-element) standard basis for 𝔽.
2.35.† Define the nth Bernstein operator Bₙ : C([0, 1]; 𝔽) → 𝔽[x; n] to be the linear transformation

    Bₙ[f](x) = Σ_{j=0}^n f(j/n) B_j^n(x).

For any subspace W ⊂ C([0, 1]; 𝔽), the transformation Bₙ also defines a linear transformation W → 𝔽[x; n].
    (i) Find the matrix representation of B₁ : 𝔽[x; 2] → 𝔽[x; 1] in terms of the Bernstein bases.
    (ii) Let V ⊂ C([0, 1]; 𝔽) be the two-dimensional subspace spanned by the set S = [sin x, cos x]. For every n ≥ 0, write the matrix representation of the linear transformation Bₙ : V → 𝔽[x; n] in terms of the basis S of V and the Bernstein basis of 𝔽[x; n].

2.36. Using only type III elementary matrices, reduce the matrix

to an upper-triangular matrix U. Clearly label each elementary matrix E_i satisfying U = E_k E_{k-1} ··· E_1 A. Then compute L = (E_k E_{k-1} ··· E_1)⁻¹ and verify that A = LU.
2.37. Reduce each of the following matrices to RREF. Describe the kernel 𝒩(A) of the linear transformation defined by the matrix A. Let b = [1 1 0]ᵀ. Determine whether the system Ax = b has a solution, and if it does, describe the set of all solutions as a coset of 𝒩(A).
[l ~]
0
(i) A= 1
0

A~ [~ ~]
0
(ii) 1
0

[~ l]
2
(iii) A= 5
7

[1 ~]
2
(iv) A = 2
0

A~ [l ~]
2
(v) 25
8

2.38. ▲
    (i) Let e_i and e_j be the ith and jth standard basis elements, respectively (thought of as column vectors). Prove that the product e_i e_jᵀ is the matrix with a one in its (i, j) entry and zeros everywhere else.
    (ii) Let u, v ∈ 𝔽ⁿ, α ∈ 𝔽, and α vᵀu ≠ 1. Prove that

    (I - α u vᵀ)⁻¹ = I - (α u vᵀ)/(α vᵀu - 1).

In particular, (I - α u vᵀ) is nonsingular whenever α vᵀu ≠ 1. Hint: Multiply both sides by (I - α u vᵀ) and recall vᵀu is a scalar.
All three types of elementary matrices are of the form I - α u vᵀ. Specifically, type I matrices can be written as I - (e_i - e_j)(e_i - e_j)ᵀ, type II matrices can be written as I - (1 - α) e_i e_iᵀ, and type III matrices can be written as I + α e_i e_jᵀ.
2.39. Prove Proposition 2.7.13. Hint: Do row reduction with elementary matrices.
2.40. Prove that row equivalence is an equivalence relation (see Theorem 2.7.7). Hint: If E is an elementary matrix, then so is E⁻¹.
2.41. Let A be the matrix in Exercise 2.20. Use its inverse to find the antiderivative of f(x) = ae^x + bxe^x + cx²e^x in the space V. Antiderivatives are not usually unique. Explain why there is a unique solution in this case.
2.42. Let L : ℝ[x; 3] → ℝ[x; 3] be given by L[f](x) = (x - 1)f′(x). Find bases for both 𝒩(L) and ℛ(L).

2.43. How many inversions are there in the permutation σ = [1, 4, 3, 2, 6, 5, 9, 8, 7]?
2.44. List all the permutations in S₄.
2.45. Use the definition of determinant to compute the determinant of the matrix

2.46. Prove that if a matrix A has a row (or column) of zeros, then det(A) = 0.

2.47. Recall (see Definition C.1.3) that the Hermitian conjugate A^H of any m × n matrix is the conjugate transpose

    A^H = Āᵀ = [ā_ji].

Prove that for any A ∈ Mₙ(ℂ) we have det(A^H) = \overline{det(A)}.

2.48. Compute the adjugate for the matrix in Example 2.9.4.
2.49. Prove Corollary 2.9.5.
2.50. Prove: If a matrix can be written in block-triangular form

    M = [ A  B ]
        [ 0  D ],

then det(M) = det(A) det(D). Hint: Use row operations.
2.51. Let x, y ∈ 𝔽ⁿ, and assume y^H x ≠ 1. Show that det(I - xy^H) = 1 - y^H x. Hint: Note that

2.52. Prove: If a matrix can be written in block form

    M = [ A  B ]
        [ C  D ],

where A is invertible, then det(M) = det(A) det(D - CA⁻¹B). Hint: Use Exercise 2.50 and the fact that

    [ A  B ]   [ A  0 ] [ I  A⁻¹B        ]
    [ C  D ] = [ C  I ] [ 0  D - CA⁻¹B ].
2.53. Use Corollary 2.9.23 to compute the inverse of the matrix

A~[~ Hl
2.54. Use Cramer's rule to solve the system Ax= b in Exercise 2.37(v).
2.55.† Consider the matrix

    V_n = [ 1  x_0  x_0²  ⋯  x_0ⁿ ]
          [ 1  x_1  x_1²  ⋯  x_1ⁿ ]
          [ ⋮   ⋮     ⋮        ⋮  ]
          [ 1  x_n  x_n²  ⋯  x_nⁿ ].

Prove that

    det(V_n) = ∏_{i<j} (x_j - x_i).

Hint: Row reduce the transpose. Subtract x_0 times row k - 1 from row k for k = 1, ..., n. Then factor out all the (x_k - x_0) terms to reduce the problem and proceed recursively.

2.56.† The Wronskian gives a sufficient condition that determines whether a set of n functions in the space C^{n-1}([a, b]; 𝔽) is linearly independent. The Wronskian W(x) of a set of functions 𝒮 = {y_1(x), y_2(x), ..., y_n(x)} is defined to be

    W(x) = det [ y_1(x)          y_2(x)          ⋯  y_n(x)          ]
               [ y_1′(x)         y_2′(x)         ⋯  y_n′(x)         ]
               [   ⋮               ⋮                  ⋮              ]
               [ y_1^{(n-1)}(x)  y_2^{(n-1)}(x)  ⋯  y_n^{(n-1)}(x) ],        (2.18)

where y_j^{(k)}(x) is the kth derivative of y_j(x).

    (i) Prove: If there exists x ∈ [a, b] such that W(x) ≠ 0, then the set 𝒮 is linearly independent in C^{n-1}([a, b]; 𝔽). Hint: Prove the contrapositive.
    (ii) Use the Wronskian to determine whether the set S in Exercise 2.20 is linearly independent.

Notes
As mentioned at the end of the previous chapter, some standard references for the material of this chapter include [Lay02, Leo80, OS06, Str80, Axl15] and the video explanations of G. Strang in [Str10].
Inner Product Spaces

Without deviation from the norm, progress is not possible.


-Frank Zappa

In the first two chapters, we focused primarily on the algebraic 14 properties


of vector spaces and linear transformations (which includes matrices). While we
have provided some geometric intuition about key concepts involving linearity, sub-
spaces, spans, cosets, isomorphisms, and determinants, these mathematical ideas
and structures have largely been algebraic in nature. In this section, we introduce
geometric structure into vector spaces by way of the inner product.
Inner products provide vector spaces with two essential geometric properties,
namely, lengths and angles, 15 and in particular the concept of orthogonality-a
fancy word for saying that two vectors are at right angles. Here we define inner
products and describe their role in characterizing the geometry of linear spaces.
Chapter 1 shows that a basis of a given vector space provides a way to represent
each vector uniquely as a linear combination of those basis elements. By ordering
the basis vectors, we can also represent a given vector as a list of coordinates for
that basis, where each coordinate corresponds to the coefficient of that basis vector
in the corresponding linear combination. In this chapter, we go a step further and
consider orthonormal 16 bases, that is, bases that are also orthonormal sets, where, in
addition to providing a unique representation of each vector and a coordinatization
of the vector space, the orthonormality allows us to use the inner product to project
a given vector along each basis element to determine its coefficients. This feature
is particularly nice when only some of the coordinates are needed, as is common
in many applications, and is in stark contrast to the nonorthonormal case, where

14 By the word algebraic we mean anything related to the study of mathematical symbols and the
rules for manipulating these symbols.
15 In a real vector space, the inner product gives us angles by way of the law of cosines (see
Section 3.1), but in a complex vector space, the idea of an angle breaks down somewhat and
isn't very intuitive. However, for both real and complex vector spaces, orthogonality is well
defined by the inner product and is an extremely useful concept in theory, application, and
computation.
l6 An orthonormal set is one where each vector is orthogonal to every other vector and all of the
vectors are of unit length.


one typically has to solve a linear system in order to determine the coordinates; see
Example l.2.14(iii) and Exercise 1.6 for details.
After establishing this and several other important properties of orthonormal
bases, we show that any finite linearly independent set can be transformed into
an orthonormal one that preserves the span. This algorithm is called the Gram-
Schmidt orthonormalization method, and it is widely used in both theory and ap-
plications. This means that we can always construct an orthonormal basis for a
given finite-dimensional vector space or subspace, and then bring to bear the power
of the inner product in determining the coefficients of a given vector in this new
orthonormal basis.
More generally, the inner product also gives us the ability to compute the
lengths of vectors (or distances between vectors by computing the length of their
difference). In other words, the inner product induces a norm (or length function)
on a vector space. Armed with this induced norm, we can examine many ideas
from geometry, including perimeters, angles between vectors and subspaces, and
the Pythagorean theorem. A norm function also gives us the ability to establish
convergence properties for sequences, which in turn enables us to take limits of
vector-valued functions; this is at the heart of the theory of calculus on vector spaces.
Because of the far-reaching consequences of norm functions, we take a brief
detour from inner products and devote a section to the general properties of norms.
In particular, we consider examples of norm functions that are very useful in math-
ematical analysis but cannot be induced by an inner product. This allows us to
expand our way of thinking and understand the key properties of norm functions
apart from inner products and Euclidean geometry. Given a linear map from one
normed vector space to another, we can look at how these linear maps transform
unit vectors. In particular, we look at the maximum distortion obtained by map-
ping the set of all unit vectors. We show that this quantity is a norm on the set of
linear t ransformations from the domain to the codomain.
We return to properties of the inner product by considering how they affect
linear transformations. We introduce the adjoint map and fundamental subspaces.
These ideas help us to decompose the domains and codomains of a linear trans-
formation into complementary subspaces that are orthogonal to one another. As a
result, we are able to form orthonormal bases for each of these subspaces. One of
the biggest applications of this is least squares, which is fundamental to classical
statistics. It gives us the ability to fit lines to data and, more generally, to solve
many curve-fitting problems where the unknown coefficients can be formulated as
linear combinations of certain basis functions.

3.1 Introduction to Inner Products


In this section, we define the inner product on a vector space and describe some of
its essential properties. We also provide several examples.

3.1.1 Definition s and Examples

Definition 3.1.1. An inner product on a vector space V is a scalar-valued map ⟨·,·⟩ : V × V → 𝔽 that satisfies the following conditions for any x, y, z ∈ V and any a, b ∈ 𝔽:

(i) ⟨x, x⟩ ≥ 0, with equality holding if and only if x = 0 (positivity).

(ii) ⟨x, ay + bz⟩ = a⟨x, y⟩ + b⟨x, z⟩ (linearity).

(iii) ⟨x, y⟩ = \overline{⟨y, x⟩} (conjugate symmetry).

A vector space V together with an inner product ⟨·,·⟩ is called an inner product space and is denoted by the pair (V, ⟨·,·⟩).

Remark 3.1.2. If 𝔽 = ℝ, then the conjugate symmetry condition given in (iii) simplifies to ⟨x, y⟩ = ⟨y, x⟩.

Proposition 3.1.3. Let (V, ⟨·,·⟩) be an inner product space. For any x, y, z ∈ V and any a ∈ 𝔽, we have

(i) ⟨x + y, z⟩ = ⟨x, z⟩ + ⟨y, z⟩,

(ii) ⟨ax, y⟩ = ā⟨x, y⟩.

Proof.

(i) ⟨x + y, z⟩ = \overline{⟨z, x + y⟩} = \overline{⟨z, x⟩ + ⟨z, y⟩} = ⟨x, z⟩ + ⟨y, z⟩.

(ii) ⟨ax, y⟩ = \overline{⟨y, ax⟩} = \overline{a⟨y, x⟩} = ā⟨x, y⟩. □

Remark 3.1.4. From the proposition, we see that an inner product on a real vector space is also linear in its first entry; that is, ⟨ax + by, z⟩ = a⟨x, z⟩ + b⟨y, z⟩. Since it is linear in both entries, we say that the inner product on a real vector space is bilinear. For complex vector spaces, however, the conjugate symmetry only makes the inner product half linear in the first entry since sums can be pulled out of the inner product, but scalars come out conjugated. Thus, the inner product on a complex vector space is called sesquilinear,¹⁷ meaning one-and-a-half linear.

Example 3.1.5. In ℝⁿ, the standard inner product (or dot product) is

    ⟨x, y⟩ = xᵀy = Σ_{i=1}^n a_i b_i,              (3.1)

where x = [a_1 ⋯ a_n]ᵀ and y = [b_1 ⋯ b_n]ᵀ. In ℂⁿ the standard inner product is

    ⟨x, y⟩ = x^H y = Σ_{i=1}^n ā_i b_i.            (3.2)

17 Some books define the inner product to be linear in the first entry instead of the second, but
linearity in the second entry gives a more natural connection to complex inner products.

Example 3.1.6. In L²([a, b]; ℝ), the standard inner product is

    ⟨f, g⟩ = ∫_a^b f(x) g(x) dx,                   (3.3)

whereas in L²([a, b]; ℂ), it is

    ⟨f, g⟩ = ∫_a^b \overline{f(x)} g(x) dx.        (3.4)

Example 3.1.7. In M_{m×n}(𝔽), the standard inner product is

    ⟨A, B⟩ = tr(A^H B).                            (3.5)

This is called the Frobenius (or Hilbert-Schmidt) inner product.

Remark 3.1.8. In each of the three examples above, we give the real case and the complex case separately to emphasize the importance of the conjugate term in the inner product. Hereafter, we dispense with the distinction and simply write 𝔽ⁿ, L²([a, b]; 𝔽), and M_{m×n}(𝔽), respectively. In each case, we use the complex version of the inner product, since the complex inner product reduces to the real version if the corresponding field is real.

Remark 3.1.9. Usually when denoting the inner product we just write ⟨·,·⟩. However, if the theorem or problem we are studying has multiple vector spaces and/or inner products, we distinguish the different inner products with a subscript denoting the space, for example, ⟨·,·⟩_V.

Example 3.1.10. On the vector space ℓ² over the field 𝔽 (see Examples 1.1.6(iv) and 1.1.6(vi)), the standard inner product is given by

    ⟨x, y⟩ = Σ_{i=1}^∞ ā_i b_i,                    (3.6)

where x = (a_1, a_2, ...) and y = (b_1, b_2, ...).

The axioms of an inner product are straightforward to verify, but it is not immediately obvious why the inner product should be finite, that is, that the sum should converge. This follows from Hölder's inequality (3.32), which we prove in Section 3.6.

3.1 .2 Lengths and Angles


Throughout the remainder of this section, assume that (V, (-, ·)) is an inner product
space over the field IF.

Definition 3.1.11. The length of a vector x ∈ V induced by the inner product is ‖x‖ = √⟨x, x⟩. If ‖x‖ = 1, we say that x is a unit vector. The distance between two vectors x, y ∈ V is the length of the difference; that is, dist(x, y) = ‖x - y‖.

Remark 3.1.12. By Definition 3.1.1(i), we know that ‖x‖ ≥ 0 for all x ∈ V. We call this property positivity. Moreover, we know that ‖x‖ = 0 if and only if x = 0. We can also show that the length function preserves scale; that is, ‖ax‖ = √⟨ax, ax⟩ = √(|a|²⟨x, x⟩) = |a| ‖x‖. The function ‖·‖ also has another key property called the triangle inequality; this is examined in Section 3.5.

Remark 3.1.13. Any nonzero vector x can be normalized to a unit vector by dividing by its length ‖x‖. In other words, x/‖x‖ is a unit vector since

    ‖x/‖x‖‖ = (1/‖x‖) ‖x‖ = 1.

Definition 3.1.14. We say that the vector x ∈ V is orthogonal to the vector y ∈ V if ⟨x, y⟩ = 0. Sometimes we denote this by x ⊥ y.

Remark 3.1.15. Note that ⟨y, x⟩ = 0 if and only if ⟨x, y⟩ = 0. In other words, orthogonality is a symmetric relation between two vectors.

The zero vector 0 is orthogonal¹⁸ to every x ∈ V, since ⟨x, x⟩ + ⟨x, 0⟩ = ⟨x, x + 0⟩ = ⟨x, x⟩, which forces ⟨x, 0⟩ = 0. In the following proposition, we show that the converse is also true.

Proposition 3.1.16. If ⟨x, y⟩ = 0 for all x ∈ V, then y = 0.

Proof. If ⟨x, y⟩ = 0 holds for all x ∈ V, then it holds when x = y. Thus, we have that 0 = ⟨y, y⟩ = ‖y‖², which implies that y = 0. □

Proposition 3.1.17 (Cauchy-Schwarz Inequality). For all x, y ∈ V, we have

    |⟨x, y⟩| ≤ ‖x‖ ‖y‖,                            (3.7)

with equality holding if and only if x and y are linearly dependent.

Proof. See Exercise 3.6. □

18 There is no consensus in the literature as to whether the two vectors in the definition should
have to be nonzero. As we see in the following section, it doesn 't really matter that much since
we are usually more interested in orthonormality, which is when the vectors are orthogonal and
of unit length .

Figure 3.1. Geometric representation of the vectors and angles involved


in the law of cosines, as discussed in Remark 3.1.19.

Definition 3.1.18. Let (V, ⟨·,·⟩) be an inner product space over ℝ. We define the angle between two nonzero vectors x and y to be the unique angle θ ∈ [0, π] such that

    cos θ = ⟨x, y⟩ / (‖x‖ ‖y‖).                    (3.8)

Remark 3.1.19. In ℝ² it is straightforward to show that (3.8) follows from the law of cosines (see Figure 3.1), given by

    ‖x - y‖² = ‖x‖² + ‖y‖² - 2‖x‖‖y‖ cos θ.


Note also that this definition does not extend to complex vector spaces. In-
deed, the definition of angle breaks down in the complex case because (x, y) is a
complex number.

The Pythagorean theorem is fundamental for classical geometry, but we can


now show that it holds for any inner product space, even if the inner product (and
hence the length I · II of a vector) is very different from what we are used to in
classical geometry.

Theorem 3.1.20 (Pythagorean Law). If x, y are orthogonal vectors, then

    ‖x + y‖² = ‖x‖² + ‖y‖².

Proof. ‖x + y‖² = ⟨x + y, x + y⟩ = ‖x‖² + ⟨x, y⟩ + ⟨y, x⟩ + ‖y‖² = ‖x‖² + ‖y‖². □

Example 3.1.21. In L²([0, 1]; ℂ) the vectors f(t) = e^{2πit} and g(t) = e^{10πit} are orthogonal, since

    ⟨f, g⟩ = ∫₀¹ \overline{e^{2πit}} e^{10πit} dt = ∫₀¹ e^{8πit} dt = [e^{8πit}/(8πi)]₀¹ = 0.

A similar computation shows that ‖f‖ = ‖g‖ = 1. If h = 4e^{2πit} + 3e^{10πit}, then by the Pythagorean law, we have ‖h‖² = ‖4f‖² + ‖3g‖² = 25.
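The claims in Example 3.1.21 can also be checked by numerical quadrature. The sketch below is our own illustration (the grid size is an arbitrary choice); it approximates the L²([0, 1]; ℂ) inner product (3.4) with the trapezoid rule:

    import numpy as np

    t = np.linspace(0, 1, 20001)
    f = np.exp(2j * np.pi * t)
    g = np.exp(10j * np.pi * t)
    h = 4 * f + 3 * g

    def ip(u, v):
        """Approximate the L^2([0,1]; C) inner product (3.4)."""
        return np.trapz(np.conj(u) * v, t)

    print(abs(ip(f, g)))                   # ~ 0: f and g are orthogonal
    print(ip(f, f).real, ip(g, g).real)    # ~ 1 and ~ 1: f and g are unit vectors
    print(ip(h, h).real)                   # ~ 25, confirming the Pythagorean law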

3.1.3 Orthogonal Projections


An inner product allows us to define the orthogonal projection of any vector v onto
a unit vector u (or rather to the one-dimensional subspace spanned by u) . This
gives us the vector in span( { u}) that is closest (in terms of the norm II · II) to the
original vector v.

Definition 3.1.22. For any vector x ∈ V, x ≠ 0, and any v ∈ V, the orthogonal projection of v onto span({x}) is the vector

    proj_{span({x})}(v) = ⟨x, v⟩ x / ‖x‖².         (3.9)

Remark 3.1.23. For any nonzero α ∈ 𝔽 we have

    proj_{span({αx})}(v) = ⟨αx, v⟩ αx / ‖αx‖² = αᾱ⟨x, v⟩ x / (|α|²‖x‖²) = ⟨x, v⟩ x / ‖x‖² = proj_{span({x})}(v),

so despite the fact that the definition of proj_{span({x})} depends explicitly on x, it really is determined only by the one-dimensional subspace span({x}). Nevertheless, for convenience we usually write proj_x(v), instead of the more cumbersome proj_{span({x})}(v).

Proposition 3.1.24. For any vector x ∈ V the map proj_x : V → V is a linear operator. Moreover, the following hold:

(i) proj_x ∘ proj_x = proj_x.

(ii) For any v ∈ V the residual vector r = v - proj_x(v) is orthogonal to all the vectors in span({x}), including proj_x(v); see Figure 3.2. Thus, proj_x(r) = 0, or equivalently r ∈ 𝒩(proj_x).

(iii) The vector proj_x(v) is the unique vector in span({x}) that is nearest to v. More precisely, ‖v - proj_x(v)‖ < ‖v - x̂‖ for all x̂ ∈ span({x}) satisfying x̂ ≠ proj_x(v).

Figure 3.2. Orthogonal projection proj_x v = ⟨x, v⟩x (blue) of a vector v (black) onto a unit vector x (red). The difference r = v - proj_x v = v - ⟨x, v⟩x (green) is orthogonal to x.

Proof. It is clear from the definition that projx is a linear operator. Since the
projection depends only on the span of x , we may replace x by the unit vector
u = x/llxll
(i) For any v EV we have
proju(proju(v)) = (u, proju(v)) u = (u, (u, v) u) u
= (u, v) (u, u) u = (u, v) u = proju(v).
(ii) To show that au J.. r for any a E lF, take the inner product
(au, r) = (au, v - pro ju (v)) = (au, v - (u, v) u)
= (au, v) - (au, (u, v) u) =a (u, v) - a (u, v) (u, u) = 0.
(iii) Given any vector x E span( {u} ), the square of the distance from v to xis
2 2 2
llv - xll 2 = llr + proju(v) - x ll =llrll +11 proju(v) - x11 .

If xi= proju(v), then 11 proju(v) - xll > 2


o, and thus llv - xii is minimized
when :X = proju(v). D

Example 3.1.25. Consider the vector space ℝ[x] with inner product ⟨f, g⟩ = ∫_{-1}^{1} f(x) g(x) dx. If f(x) = 5x³ - 3x, then

    ‖f‖² = ⟨f, f⟩ = ∫_{-1}^{1} (5x³ - 3x)² dx = 8/7.

The projection of g(x) = x³ onto f is given by

    proj_f[g](x) = (∫_{-1}^{1} (5t³ - 3t) t³ dt) (7/8) f(x) = x³ - (3/5)x.

Moreover, the residual vector is r(x) = g(x) - proj_f[g](x) = (3/5)x, which by the previous proposition is orthogonal to f. We verify this by a direct check:

    ⟨f, r⟩ = ∫_{-1}^{1} (5x³ - 3x) · (3/5)x dx = 0.
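The computation in Example 3.1.25 can be reproduced symbolically. The sketch below is our own illustration using SymPy to evaluate the integrals defining this inner product on ℝ[x]:

    import sympy as sp

    x = sp.symbols('x')
    ip = lambda u, v: sp.integrate(u * v, (x, -1, 1))   # inner product on R[x]

    f = 5*x**3 - 3*x
    g = x**3

    proj = ip(f, g) / ip(f, f) * f      # projection of g onto span({f})
    r = sp.expand(g - proj)             # residual vector

    print(ip(f, f))          # 8/7
    print(sp.expand(proj))   # x**3 - 3*x/5
    print(r)                 # 3*x/5
    print(ip(f, r))          # 0: the residual is orthogonal to f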

3.2 Orthonormal Sets and Orthogonal Projections


The role of orthogonality is essential in many applications. It allows for many
problems to be broken up into individual components and solved or analyzed in
parts. Orthogonality is also a key concept in a number of computational problems
that would be difficult or prohibitive to solve otherwise. In this section, we examine
the main properties of orthonormal sets and orthogonal projections.
Throughout this section, assume that (V, (·, ·)) is an inner product space over
the field lF.

3.2.1 Properties of Orthonormal Sets

Definition 3.2.1. A collection 𝒞 = {x_i}_{i∈J} is an orthonormal set if for all i, j ∈ J we have

    ⟨x_i, x_j⟩ = δ_ij,

where

    δ_ij = { 1 if i = j,
             0 if i ≠ j

is the Kronecker delta.

Example 3.2.2. Let V = C([0, 1]; ℝ). We verify that S = {1, √12 (x - 1/2)} is an orthonormal set with the inner product

    ⟨f, g⟩ = ∫₀¹ f(x) g(x) dx.                     (3.10)

Note that ⟨1, √12(x - 1/2)⟩ = √12 ∫₀¹ (x - 1/2) dx = √12 (1/2 - 1/2) = 0. Also ⟨1, 1⟩ = ∫₀¹ 1 dx = 1 and ⟨√12(x - 1/2), √12(x - 1/2)⟩ = ∫₀¹ 12(x - 1/2)² dx = ∫_{-1/2}^{1/2} 12u² du = 1.

Theorem 3.2.3. Let {x_i}_{i=1}^m be an orthonormal set.

(i) If x = Σ_{i=1}^m a_i x_i, then a_i = ⟨x_i, x⟩ for each i.

(ii) If x = Σ_{i=1}^m a_i x_i and y = Σ_{i=1}^m b_i x_i, then ⟨x, y⟩ = Σ_{i=1}^m ā_i b_i.

(iii) If x = Σ_{i=1}^m a_i x_i, then ‖x‖² = Σ_{i=1}^m |a_i|².

Proof.

(i) ⟨x_i, x⟩ = ⟨x_i, Σ_{j=1}^m a_j x_j⟩ = Σ_{j=1}^m a_j ⟨x_i, x_j⟩ = Σ_{j=1}^m a_j δ_ij = a_i.

(ii) ⟨x, y⟩ = ⟨Σ_{i=1}^m a_i x_i, Σ_{j=1}^m b_j x_j⟩ = Σ_{i=1}^m Σ_{j=1}^m ā_i b_j ⟨x_i, x_j⟩ = Σ_{i=1}^m ā_i b_i.

(iii) This follows immediately from (ii). □

Remark 3.2.4. The values a_i = ⟨x_i, x⟩ in Theorem 3.2.3(i) are called the Fourier coefficients of x. The ability to find these coefficients by simply taking the inner product is the hallmark of orthonormal sets.

Corollary 3.2.5. Orthonormal sets are linearly independent. In particular, if 𝒞 is an orthonormal set that spans V, then 𝒞 is a basis of V.
Proof. Assume that a_1 x_1 + a_2 x_2 + ··· + a_m x_m = 0 for some orthonormal subset {x_i}_{i=1}^m of 𝒞. By Theorem 3.2.3(iii), we have that Σ_{i=1}^m |a_i|² = 0, which implies that each a_i = 0. Hence, the set 𝒞 is linearly independent. □

We can generalize vector projections from Section 3.1.3 to finite orthonormal


sets.

Definition 3.2.6. Let {x_i}_{i=1}^m be an orthonormal set that spans the subspace X ⊂ V. For any v ∈ V we define the orthogonal projection onto X to be the sum of vector projections along each x_i; that is,

    proj_X(v) = Σ_{i=1}^m ⟨x_i, v⟩ x_i.            (3.11)

Theorem 3.2.7. If {x_i}_{i=1}^m is an orthonormal set with span({x_i}_{i=1}^m) = X, then the map proj_X : V → V is a linear operator. Moreover, the following hold:

(i) proj_X ∘ proj_X = proj_X.

(ii) For every v ∈ V, the residual vector r = v - proj_X(v) is orthogonal to every x ∈ X (see Figure 3.3). Thus, proj_X(r) = 0, or equivalently r ∈ 𝒩(proj_X).

(iii) The point proj_X(v) is the unique vector in X that is nearest to v. More precisely, ‖v - proj_X(v)‖ < ‖v - x̂‖ for all x̂ ∈ X satisfying x̂ ≠ proj_X(v).

Proof. It is straightforward to check that proj_X is linear.

(i) For any v ∈ V we have (Theorem 3.2.3(i)) that ⟨x_i, proj_X(v)⟩ = ⟨x_i, v⟩, so

    proj_X(proj_X(v)) = Σ_{i=1}^m ⟨x_i, proj_X(v)⟩ x_i = Σ_{i=1}^m ⟨x_i, v⟩ x_i = proj_X(v).

Figure 3.3. The orthogonal projection of v (black) onto X = span({x_i}_{i=1}^m). The projection proj_X v = Σ_{i=1}^m ⟨x_i, v⟩ x_i (blue) lies in X and the difference r = v - proj_X v (red) is orthogonal to X.

(ii) Given any vector x = Σ_{i=1}^m c_i x_i ∈ X, we have

    ⟨x, r⟩ = ⟨x, v - Σ_i ⟨x_i, v⟩ x_i⟩ = ⟨x, v⟩ - Σ_i ⟨x, ⟨x_i, v⟩ x_i⟩
           = ⟨x, v⟩ - Σ_i ⟨x_i, v⟩ ⟨x, x_i⟩ = ⟨x, v⟩ - Σ_i c̄_i ⟨x_i, v⟩
           = ⟨x, v⟩ - ⟨Σ_i c_i x_i, v⟩ = ⟨x, v⟩ - ⟨x, v⟩ = 0.

(iii) Given any vector x̂ ∈ X, the square of the distance from x̂ to v is

    ‖v - x̂‖² = ‖r + proj_X(v) - x̂‖² = ‖r‖² + ‖proj_X(v) - x̂‖².

The last term ‖proj_X(v) - x̂‖² is always nonnegative and is zero only when proj_X(v) = x̂, and thus ‖v - x̂‖ is minimized when x̂ = proj_X(v). □
Remark 3.2.8. Theorem 3.2.7(iii) above shows that the linear map proj_X depends only on the subspace X and not on the particular choice of orthonormal basis {x_i}_{i=1}^m.
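When the orthonormal vectors x_1, ..., x_m are stacked as the columns of a matrix Q, formula (3.11) becomes proj_X(v) = Q Q^H v. The sketch below is our own illustration, on example data of our own choosing, of the properties in Theorem 3.2.7:

    import numpy as np

    # Columns form an orthonormal basis of a 2-dimensional subspace X of R^3
    Q = np.array([[1/np.sqrt(2), 0.0],
                  [1/np.sqrt(2), 0.0],
                  [0.0,          1.0]])
    v = np.array([3.0, -1.0, 2.0])

    proj = Q @ (Q.T @ v)      # proj_X(v) = sum_i <x_i, v> x_i
    r = v - proj              # residual vector

    print(proj)                                   # [1. 1. 2.]
    print(Q.T @ r)                                # ~[0 0]: r is orthogonal to X
    print(np.allclose(Q @ (Q.T @ proj), proj))    # True: projecting twice changes nothing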

Theorem 3.2.9 (Pythagorean Theorem). If {x_i}_{i=1}^m ⊂ V is an orthonormal set with span X and v ∈ V, then

    ‖v‖² = ‖proj_X(v)‖² + ‖v - proj_X(v)‖² = Σ_{i=1}^m |⟨x_i, v⟩|² + ‖v - Σ_{i=1}^m ⟨x_i, v⟩ x_i‖².    (3.12)

Proof. Since v = proj_X(v) + (v - proj_X(v)) and proj_X(v) ⊥ (v - proj_X(v)), the result follows immediately by Theorems 3.1.20 and 3.2.7(ii). □

Corollary 3.2.10 (Bessel's Inequality). If v ∈ V and {x_i}_{i=1}^m ⊂ V is an orthonormal set with span X, then

    ‖v‖² ≥ ‖proj_X(v)‖² = Σ_{i=1}^m |⟨x_i, v⟩|².   (3.13)

Equality holds only if v ∈ X.

Proof. This follows immediately from (3.12). □

3.2.2 Orthonormal Transformations

Definition 3.2.11. A linear map L from an inner product space (V, ⟨·,·⟩_V) to an inner product space (W, ⟨·,·⟩_W) is called an orthonormal transformation if for every x, y ∈ V we have

    ⟨x, y⟩_V = ⟨L(x), L(y)⟩_W.                     (3.14)

If L is an orthonormal transformation from an inner product space (V, ⟨·,·⟩) to itself, then L is called an orthonormal operator.

Proposition 3.2.12. If L is an orthonormal operator and V is finite dimensional, then L is invertible.

Proof. By Corollary 2.3.11, it suffices to show that L is injective. If x ∈ 𝒩(L), then ‖x‖² = ⟨x, x⟩ = ⟨L(x), L(x)⟩ = ‖L(x)‖² = ‖0‖² = 0, and hence x = 0. □

In the hypothesis of the previous proposition, it is important that the space V be finite dimensional. Otherwise, injectivity does not imply surjectivity. The following is an example of an orthonormal operator that is not invertible.

Example 3.2.13. Assume ℓ² is endowed with the standard inner product (3.6). Let L : ℓ² → ℓ² be the right-shift operator given by (a_1, a_2, ...) ↦ (0, a_1, a_2, ...). It is easy to see that L is an injective orthonormal operator, but it is not surjective, and therefore not invertible.

Definition 3.2.14. A matrix Q ∈ Mₙ(𝔽) is called orthonormal¹⁹ if it is the matrix representation of an orthonormal operator on 𝔽ⁿ in the standard bases (see Example 1.2.14(ii)) and with the standard inner product (see Example 3.1.5).

Theorem 3.2.15. The matrix Q ∈ Mₙ(𝔽) is orthonormal if and only if it satisfies Q^H Q = Q Q^H = I. Moreover, for any orthonormal matrix Q ∈ Mₙ(𝔽), the following hold:

(i) ‖Qx‖ = ‖x‖ for all x ∈ 𝔽ⁿ.

(ii) Q⁻¹ exists and is an orthonormal matrix.

(iii) The columns of Q are orthonormal.

(iv) |det(Q)| = 1; moreover, if Q ∈ Mₙ(ℝ), then det(Q) = ±1.

Proof. See Exercise 3.10. □
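The properties in Theorem 3.2.15 are easy to check for a concrete matrix. The sketch below is our own illustration, using the rotation matrix R_θ of Exercise 2.9, which is orthonormal for every θ:

    import numpy as np

    theta = 0.7
    Q = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])

    print(np.allclose(Q.T @ Q, np.eye(2)))    # True: Q^H Q = I
    print(np.linalg.det(Q))                   # 1.0 (real orthonormal: det = +1 or -1)

    x = np.array([2.0, -5.0])
    print(np.linalg.norm(Q @ x), np.linalg.norm(x))   # equal: lengths are preserved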

Corollary 3.2.16. If Q₁, Q₂ ∈ Mₙ(𝔽) are orthonormal matrices, then so is their product Q₁Q₂.

Remark 3.2.17. An immediate consequence of Theorem 3.2.15 and Exercise 3.4 is that orthonormal matrices preserve both lengths and angles.

19
Many textbooks use the term unitary for complex-valued orthonormal square matrices, and they
call real-valued orthonormal matrices orthogonal matrices. In this text we prefer to use the term
orthonormal for both of these because it more accurately describes the essential nature of these
matrices, and it allows us to treat the real and complex cases identically.

Nota Bene 3.2.18. One way to determine if a given linear operator on 𝔽ⁿ is orthonormal (with the usual inner product) is to represent it as a matrix in the standard basis. If that matrix is orthonormal, then the linear operator is orthonormal.

3.3 Gram-Schmidt Orthonormalization

In this section, we show that given a linearly independent set {x_1, ..., x_n} we can construct an orthonormal set {q_1, ..., q_n}, which has the same span. An important corollary of this is that every finite-dimensional inner product space is isomorphic to 𝔽ⁿ with the standard inner product ⟨x, y⟩ = x^H y. This means that many results that hold on 𝔽ⁿ with the standard inner product also hold for any n-dimensional inner product space (over the same field). We demonstrate this idea with a very brief treatment of the theory of orthogonal polynomials. We conclude this section by introducing the very important QR decomposition, which factors a matrix into a product of an orthogonal matrix and an upper-triangular matrix.

3.3.1 Gram-Schmidt Orthonormalization Process

Theorem 3.3.1 (Gram-Schmidt). Let {x 1, ... ,xn} be a linearly independent


set in the inner product space (V, (., ·)). The following algorithm, called the Gram-
Schmidt process, produces a set {q 1 , ... , qn} that is orthonormal in V with the
same span as {x1 , ... , Xn} ·
Let

and define q2, q3 , ... , qn , recursively, by

Xk - Pk-1
qk - k = 2, ... , n ,
- ll xk - Pk-1 II '
where
k- 1
Pk-1 = projQ"_ 1 (xk) = L (qi, xk ) q i (3.15)
i=l
is the projection of xk onto Qk-1 = span ({q1, ... , qk-d).

Proof. We prove inductively that {q₁, ..., qₙ} is an orthonormal set. The base case is trivial. It suffices to show that if {q₁, ..., q_{k−1}} is an orthonormal set, then {q₁, ..., q_k} is also an orthonormal set. By Theorem 3.2.7(ii), the residual vector x_k − p_{k−1} is orthogonal to every vector in Q_{k−1} = span({q₁, ..., q_{k−1}}). If x_k − p_{k−1} = 0, then x_k = p_{k−1} lies in Q_{k−1} = span({x₁, ..., x_{k−1}}), which contradicts the linear independence of {x₁, ..., x_k}. Thus, x_k − p_{k−1} is nonzero and q_k is well defined. Since q_k is a unit vector orthogonal to each of q₁, ..., q_{k−1}, it follows that {q₁, ..., q_k} is an orthonormal set, as required. □

Example 3.3.2. Consider three linearly independent vectors x₁, x₂, x₃ in ℝ³. The Gram-Schmidt process yields

    q₁ = x₁ / ‖x₁‖.

Projecting x₂ onto q₁ gives p₁ = ⟨q₁, x₂⟩ q₁ and

    q₂ = (x₂ − p₁) / ‖x₂ − p₁‖.

Now projecting x₃ onto Q₂ = span{q₁, q₂} gives p₂ = ⟨q₁, x₃⟩ q₁ + ⟨q₂, x₃⟩ q₂ and

    q₃ = (x₃ − p₂) / ‖x₃ − p₂‖.

Thus, {q₁, q₂, q₃} is an orthonormal basis for ℝ³.
Example 3.3.3. Consider the inner product space F[x; 2] with the inner product

    ⟨f, g⟩ = ∫_{−1}^{1} f(x)g(x) dx.    (3.16)

We apply the Gram-Schmidt process to the set of vectors {1, x, x²} ⊂ F[x; 2]. The first step yields

    q₁ = 1/‖1‖ = 1/√2   and   p₁ = ⟨1/√2, x⟩ (1/√2) = 0.

Thus, x is already orthogonal to q₁. The second step gives

    q₂ = x/‖x‖ = √(3/2) x   and   p₂ = ⟨q₁, x²⟩ q₁ + ⟨q₂, x²⟩ q₂ = 1/3.

It follows that

    q₃ = (x² − 1/3) / ‖x² − 1/3‖ = √(5/8) (3x² − 1).

Vista 3.3.4. In Example 3.3.3, we used the Gram-Schmidt method to derive


the orthonormal polynomials {q1, q2, q3} spanning lF[x; 2]. Repeating this for
all degrees gives the Legendre polynomials.
There are many other inner products that could be used on lF[x]. For
example, consider the weighted inner product

    ⟨f, g⟩ = ∫_a^b w(x) f(x) g(x) dx,    (3.17)
where w(x) > 0 is a weight function. By performing the Gram- Schmidt


orthogonalization process, we can construct a collection of polynomials that
are orthonormal with respect to (3.17). In the table below, we list a few of the
domains, weights, and the names of the corresponding families of orthogonal
polynomials. These are discussed in much more detail in Volume 2.

    Name         Domain      w(x)
    Chebyshev-1  (−1, 1)     (1 − x²)^{−1/2}
    Chebyshev-2  (−1, 1)     (1 − x²)^{1/2}
    Hermite      (−∞, ∞)     exp(−x²)
    Legendre     (−1, 1)     1

Remark 3.3.5. Naïve application of the Gram-Schmidt process, as described above, is numerically unstable, meaning that the round-off error in numerical computation compounds to yield unreliable output. A slight change gives the modified Gram-Schmidt routine and produces better results; see the lab on QR decomposition for details.
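To make the distinction concrete, the following sketch (our own illustration using NumPy, real vectors only, not part of the text) implements both variants; classical Gram-Schmidt projects each x_k against the q's all at once, while the modified version removes each q_k component from the remaining columns immediately.

    import numpy as np

    def classical_gram_schmidt(X):
        # columns of X assumed linearly independent; returns Q with orthonormal columns
        m, n = X.shape
        Q = np.zeros((m, n))
        for k in range(n):
            p = Q[:, :k] @ (Q[:, :k].T @ X[:, k])   # p_{k-1} = sum <q_i, x_k> q_i
            v = X[:, k] - p
            Q[:, k] = v / np.linalg.norm(v)
        return Q

    def modified_gram_schmidt(X):
        m, n = X.shape
        Q = X.astype(float)
        for k in range(n):
            Q[:, k] /= np.linalg.norm(Q[:, k])
            for j in range(k + 1, n):               # subtract the q_k component immediately
                Q[:, j] -= (Q[:, k] @ Q[:, j]) * Q[:, k]
        return Q

On well-conditioned input both routines agree; on nearly dependent columns the modified version typically returns columns that are much closer to orthonormal.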

3.3.2 Finite-Dimensional Inner Product Spaces

Corollary 3.3.6. If (V, (., · )v) is an n -dimensional inner product space over lF,
then it is isomorphic tor with the standard inner product (3.2) .

Proof. Using Theorem 3.3.1, we choose an orthonormal basis [x₁, ..., xₙ] for V and define the map L : V → F^n by

    L(a₁x₁ + · · · + aₙxₙ) = [a₁  · · ·  aₙ]^T.

This is clearly a bijective linear transformation. To see that the inner product is preserved, we note that if x = Σ_{j=1}^{n} b_j x_j and y = Σ_{k=1}^{n} c_k x_k, then

    ⟨x, y⟩_V = ⟨Σ_j b_j x_j, Σ_k c_k x_k⟩_V = Σ_j Σ_k b̄_j c_k ⟨x_j, x_k⟩_V = Σ_j b̄_j c_j = ⟨L(x), L(y)⟩,

where ⟨·, ·⟩ is the usual inner product on F^n. □

Remark 3.3. 7. Just as Corollary 2.3.12 is a big deal for vector spaces, it follows
that Corollary 3.3.6 is a big deal for inner product spaces. It means that although
there are many different descriptions of finite-dimensional inner product spaces,
there is essentially (up to isomorphism) only one n -dimensional inner product space
over IF for each n E N. Anything we can prove on IFn with the standard inner
product automatically holds for any finite-dimensional inner product space over IF.

Example 3.3.8. Recall that the orthonormal basis {q₁, q₂, q₃} computed in Example 3.3.3 is related to the power basis [1, x, x²] by the transition matrix

    [q₁  q₂  q₃] = [1  x  x²] (1/√2) [ 1    0   −√5/2 ]
                                      [ 0   √3     0   ]
                                      [ 0    0   3√5/2 ].

Consider the linear isomorphism L : F[x; 2] → F³, given by L(q_i) = e_i. Since [q₁, q₂, q₃] is orthonormal, this map also preserves the inner product, so ⟨f, g⟩_{F[x;2]} = ⟨L(f), L(g)⟩_{F³}.
To express this map in terms of the power basis [1, x, x²], first write an arbitrary element of F[x; 2] as p(x) = ax² + bx + c = [1, x, x²][c  b  a]^T, and then compute L(p) as

    L(p) = c L(1) + b L(x) + a L(x²) = (√2/3) [ 3c + a,   √3 b,   2a/√5 ]^T.

3.3.3 The QR Decomposition


The QR decomposition of a matrix is an important tool that allows any matrix
to be written as a product of an orthonormal matrix Q and an upper-triangular

matrix R. The QR decomposition is used in many applications, such as solving


least squares problems and computing eigenvalues of a matrix, and it is one of the
most fundamental matrix decompositions in applied and computational mathemat-
ics. Here we prove the existence of the decomposition using the Gram-Schmidt
orthonormalization process.

Theorem 3.3.9 (QR Decomposition). Any matrix A ∈ M_{m×n} of rank n ≤ m can be factored into a product A = QR, where Q is an m × m orthonormal matrix and R is an m × n upper-triangular matrix.

Proof. Since A has rank n, the columns of A are linearly independent, so we can apply Theorem 3.3.1. Let x₁, ..., xₙ be the columns of A. By the replacement theorem (Theorem 1.4.1) there are vectors x_{n+1}, ..., x_m such that the space F^m is spanned by x₁, ..., x_m.
Using the notation in Theorem 3.3.1, let p₀ = 0, let r_{jk} = ⟨q_j, x_k⟩ for j = 1, 2, ..., k − 1, and let r_{kk} = ‖x_k − p_{k−1}‖. We have

    x₁ = r₁₁ q₁,
    x₂ = r₁₂ q₁ + r₂₂ q₂,
    ⋮
    x_m = r_{1m} q₁ + r_{2m} q₂ + · · · + r_{mm} q_m,

or, in matrix form,

    [x₁  x₂  · · ·  x_m] = [q₁  q₂  · · ·  q_m] [r_{ij}],

where [r_{ij}] is upper triangular. Let Q be the orthonormal matrix [q₁, ..., q_m], and let R be the first n columns of the upper-triangular matrix [r_{ij}] above. Since the columns of A are x₁, ..., xₙ, this gives A = QR, as required. □

Remark 3.3.10. Along with the full QR decomposition given in Theorem 3.3.9, another useful decomposition is the reduced QR decomposition. Given a matrix A ∈ M_{m×n} of rank n ≤ m, the reduced QR decomposition of A is the factorization A = Q̂R̂, where Q̂ is an m × n matrix with orthonormal columns and R̂ is an n × n upper-triangular matrix.
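As a quick illustration (our own sketch, not from the text), the reduced QR factors can be read off directly from the Gram-Schmidt process: the q_k become the columns of Q̂ and the coefficients r_{jk} = ⟨q_j, x_k⟩ fill the upper triangle of R̂. The test matrix below is the one used in Example 3.3.11, which follows; NumPy's built-in numpy.linalg.qr computes the same factorization (possibly up to column signs) and is a convenient check.

    import numpy as np

    def reduced_qr(A):
        # reduced QR via Gram-Schmidt; assumes A has full column rank (real case)
        m, n = A.shape
        Q = np.zeros((m, n))
        R = np.zeros((n, n))
        for k in range(n):
            v = A[:, k].copy()
            for j in range(k):
                R[j, k] = Q[:, j] @ A[:, k]      # r_jk = <q_j, x_k>
                v -= R[j, k] * Q[:, j]
            R[k, k] = np.linalg.norm(v)          # r_kk = ||x_k - p_{k-1}||
            Q[:, k] = v / R[k, k]
        return Q, R

    A = np.array([[1., -2., 3.5], [1., 3., -0.5], [1., 3., 2.5], [1., -2., 0.5]])
    Q, R = reduced_qr(A)
    print(np.allclose(Q @ R, A), np.allclose(Q.T @ Q, np.eye(3)))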

Example 3.3.11. To compute the reduced QR decomposition of

    A = [ 1  −2   3.5 ]
        [ 1   3  −0.5 ]
        [ 1   3   2.5 ]
        [ 1  −2   0.5 ],

let x_i denote the ith column of the matrix A, so that

    x₁ = [1  1  1  1]^T,   x₂ = [−2  3  3  −2]^T,   and   x₃ = (1/2)[7  −1  5  1]^T.

Begin by normalizing x₁. This gives r₁₁ = ‖x₁‖ = 2 and

    q₁ = (1/2)[1  1  1  1]^T.

Next find r₁₂ = ⟨q₁, x₂⟩ = 1 and r₁₃ = ⟨q₁, x₃⟩ = 3. Now compute

    x₂ − p₁ = x₂ − r₁₂q₁ = (5/2)[−1  1  1  −1]^T.

Normalizing gives r₂₂ = ‖x₂ − p₁‖ = 5 and q₂ = (1/2)[−1  1  1  −1]^T. Further, we find r₂₃ = ⟨q₂, x₃⟩ = −1. Finally, we have

    x₃ − p₂ = x₃ − r₁₃q₁ − r₂₃q₂ = (3/2)[1  −1  1  −1]^T.

This gives r₃₃ = 3 and q₃ = (1/2)[1  −1  1  −1]^T. Therefore, Q̂ is the matrix

    Q̂ = (1/2) [ 1  −1   1 ]
              [ 1   1  −1 ]
              [ 1   1   1 ]
              [ 1  −1  −1 ].

You should verify that Q̂ has orthonormal columns, that is, Q̂^T Q̂ = I. We also have

    R̂ = [ r₁₁   ⟨q₁, x₂⟩   ⟨q₁, x₃⟩ ]   [ 2  1   3 ]
        [  0   ‖x₂ − p₁‖   ⟨q₂, x₃⟩ ] = [ 0  5  −1 ]
        [  0       0      ‖x₃ − p₂‖ ]   [ 0  0   3 ].

For the full QR decomposition, we can choose any additional x₄ such that [x₁, x₂, x₃, x₄] spans ℝ⁴, and then continue the Gram-Schmidt process. A convenient choice in this example is x₄ = e₄ = [0  0  0  1]^T. We calculate r₁₄ = ⟨q₁, x₄⟩ = 1/2 and r₂₄ = r₃₄ = −1/2. This gives p₄ = (1/2)(q₁ − q₂ − q₃) and x₄ − p₄ = (1/4)[−1  −1  1  1]^T, from which we find q₄ = (1/2)[−1  −1  1  1]^T. Thus for the full QR decomposition, we have

    Q = (1/2) [ 1  −1   1  −1 ]
              [ 1   1  −1  −1 ]
              [ 1   1   1   1 ]
              [ 1  −1  −1   1 ]

and

    R = [ 2  1   3 ]
        [ 0  5  −1 ]
        [ 0  0   3 ]
        [ 0  0   0 ].

Vista 3.3.12. The QR decomposition is an important tool for solving many


fundamental problems. Two of the most important applications of the QR
decomposition are in solving least squares problems (see Section 3.9 and
Vista 3.9.8) and in finding eigenvalues (see Section 13.4.2).

Remark 3.3.13. The QR decomposition can also be used to solve a linear system
of the form Ax = b. By writing A = QR we have QRx = b, and since Q is
orthonormal, we have
Rx = QT QRx = QTb.
Since R is upper triangular, the system R x= QTb can be backsolved to find x. This
method takes about twice as many operations as Gaussian elimination, but is more
stable. In practice, however, the cases where Gaussian elimination is unstable are
extremely rare, and so Gaussian elimination is usually preferred for solving dense
linear systems.
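A minimal sketch of this idea (our own, real case; it assumes SciPy's triangular solver, though any back-substitution routine would do):

    import numpy as np
    from scipy.linalg import solve_triangular

    rng = np.random.default_rng(0)
    A = rng.standard_normal((4, 4))      # almost surely nonsingular
    b = rng.standard_normal(4)

    Q, R = np.linalg.qr(A)               # A = QR with Q orthonormal, R upper triangular
    x = solve_triangular(R, Q.T @ b)     # backsolve R x = Q^T b
    print(np.allclose(A @ x, b))         # True up to round-off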

3.4 QR with Householder Transformations


There are a few different algorithms for computing the QR decomposition. We have already seen how the Gram-Schmidt (or modified Gram-Schmidt) process can be used to compute the QR decomposition. In this section we present an algorithm using what are called Householder transformations. This algorithm is usually faster and more accurate than the algorithms using Gram-Schmidt. Householder transformations modify A ∈ M_{m×n}(F) to produce an upper-triangular matrix R ∈ M_{m×n}(F) through a sequence of left multiplications by orthonormal matrices Q_i ∈ M_m(F). These matrices are then multiplied together to form Q^H ∈ M_m(F). In other words, R = Q_ℓ · · · Q₂Q₁A = Q^H A, which implies that A = QR.

3.4.1 The Geometry of Householder Transform ations


In order to construct Householder transformations, we first need to know how to
project vectors onto the orthogonal complement of a given nonzero vector.

Definition 3.4.1. Given a nonzero vector v ∈ F^n, the orthogonal complement of v, denoted v^⊥, is the set of all vectors x ∈ F^n that are orthogonal to v. In other words, v^⊥ = {x ∈ F^n | ⟨v, x⟩ = 0}.

Figure 3.4. Reflection H_v(x) (red) of x (black) through the orthogonal complement (dotted) of v (blue).

Nota Bene 3.4.2. Beware that although v is a vector, v^⊥ is a set (in fact a whole subspace), not a vector.

Recall that proj_v(x) is the projection of x onto the nonzero vector v, and thus the residual x − proj_v(x) is orthogonal to v; see Proposition 3.1.24(ii) for details. This means that we can project an arbitrary vector x ∈ F^n onto v^⊥ by the map

    proj_{v^⊥}(x) = x − proj_v(x) = x − (v/‖v‖²)⟨v, x⟩ = (I − vv^H/(v^H v)) x.

A Householder transformation (also called a Householder reflection) reflects the vector x across the orthogonal complement v^⊥, in essence moving twice as far as the projection; see Figure 3.4 for an illustration of the Householder reflection.

Definition 3.4.3. Fix a nonzero vector v ∈ F^n. For any x ∈ F^n the Householder transformation reflecting x across the orthogonal complement v^⊥ of v is given by

    H_v(x) = x − 2 proj_v(x) = (I − 2 vv^H/(v^H v)) x.    (3.18)

Proposition 3.4.4. Assume v ∈ F^n is nonzero. We have the following properties of Householder reflections:
(i) H_v is an orthonormal transformation.
(ii) Elements of v^⊥ are not affected by H_v. More precisely, if ⟨v, x⟩ = 0, then H_v(x) = x.

Proof. The proof is Exercise 3.20. □



3.4.2 Computing the QR Decomposition vi a Householder


We now compute the QR decomposition using Householder reflections. Given an
arbitrary vector x in lFn, we want to find the nonzero vector v E lFn whose House-
holder reflection Hv(x) resides in the span of the first standard basis vector e 1. The
following lemmata 20 tell us how to find v. We begin with real-valued matrices and
then generalize to complex-valued matrices.

Lemma 3.4.5. Given x ∈ ℝⁿ, if the vector v = x + ‖x‖e₁ is nonzero, then H_v(x) = −‖x‖e₁. Similarly, if v = x − ‖x‖e₁ is nonzero, then H_v(x) = ‖x‖e₁.

Proof. We prove the case where v = x + ‖x‖e₁ is nonzero. The proof for v = x − ‖x‖e₁ is similar. Let x₁ denote the first component of x. Expanding the definition of H_v(x), we have

    H_v(x) = x − 2 (x + ‖x‖e₁)(x^T + ‖x‖e₁^T) x / ((x^T + ‖x‖e₁^T)(x + ‖x‖e₁))
           = x − (‖x‖²x + ‖x‖³e₁ + ‖x‖x₁x + ‖x‖²x₁e₁) / (‖x‖² + x₁‖x‖)
           = x − ((‖x‖ + x₁)x + ‖x‖(x₁ + ‖x‖)e₁) / (‖x‖ + x₁)
           = x − x − ‖x‖e₁ = −‖x‖e₁. □

Remark 3.4.6. Mathematically, it is enough to require v to be nonzero when doing Householder transformations. But computationally, with floating-point arithmetic, we don't want v to be at all close to zero, since the operator H_v becomes much more sensitive to round-off error as ‖v‖ gets smaller. To reduce the numerical errors, when x ≈ ‖x‖e₁, we choose the "plus" option in Lemma 3.4.5 and set v = x + ‖x‖e₁. Similarly, when x ≈ −‖x‖e₁, we choose the "minus" option and set v = x − ‖x‖e₁. We can succinctly combine these choices into the single rule v = x + sign(x₁)‖x‖e₁, which automatically chooses the option farther from zero. Here we use the convention that sign(0) = 1.
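A small sketch of this construction (our own illustration, real case only): build the Householder vector with the sign rule above and check that the reflection sends x to a multiple of e₁.

    import numpy as np

    def householder_vector(x):
        # returns v such that H_v x = -sign(x1)||x|| e1 (real case, sign(0) = 1)
        v = x.astype(float)
        sign = 1.0 if v[0] >= 0 else -1.0
        v[0] += sign * np.linalg.norm(x)
        return v

    def apply_householder(v, x):
        return x - 2 * v * (v @ x) / (v @ v)

    x = np.array([3., 4., 0.])
    v = householder_vector(x)
    print(apply_householder(v, x))   # approximately [-5, 0, 0]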

Lemma 3.4.7. Given x ∈ ℂⁿ, let x₁ denote the first component of x. Recall²¹ that the complex sign of x₁ is sign(x₁) = x₁/|x₁| when x₁ ≠ 0, and sign(0) = 1. Choosing v = x + sign(x₁)‖x‖e₁ implies that

    H_v(x) = −sign(x₁)‖x‖e₁

has zeros in all but the first entry.

Proof. The proof is Exercise 3.21. D

20 Lemmata is the plural of the Greek word lemma.


21 See (B .3) in Appendix B .1.2.

Remark 3.4.8. The vector v = x + sign(x 1)llxlle 1 is never zero unless x = 0.


Indeed if there exists a nonzero x such that x = -sign(x1)llxlle1, then x1 =
-sign(x 1)llxll- This implies that sign(x1) = -sign(x1), which is a contradiction,
since sign(x 1) is never zero.

The algorithm for computing the QR decomposition with Householder reflec-


tions is essentially just a repeated application of Lemma 3.4.7, as we show in the
next theorem.

Theorem 3.4.9. Given a matrix A ∈ M_{m×n}(F), with m ≥ n, there is a sequence of nonzero vectors v₁, v₂, ..., v_ℓ ∈ F^m, with ℓ = min{m − 1, n}, such that R = H_{v_ℓ}H_{v_{ℓ−1}} · · · H_{v₁}A is upper triangular. Taking Q = H_{v₁}^H H_{v₂}^H · · · H_{v_ℓ}^H gives a QR decomposition of A.

Proof. Let x₁ be the first column of the matrix A. Following Lemma 3.4.7, set v₁ = x₁ + e^{iθ₁}‖x₁‖e₁, where e^{iθ₁} = sign(x₁), so that the Householder reflection H_{v₁} takes x₁ to the span of e₁. Therefore, H_{v₁}A has the form

    H_{v₁}A = [ *  *  *  · · ·  * ]
              [ 0  *  *  · · ·  * ]
              [ 0  *  *  · · ·  * ]
              [ ⋮  ⋮  ⋮          ⋮ ]
              [ 0  *  *  · · ·  * ],

where * indicates an entry of arbitrary value.
Now let x₂ be the second column of the new matrix H_{v₁}A. We decompose x₂ into x₂ = x₂′ + x₂″, where x₂′ and x₂″ are of the form

    x₂′ = [z₁  0  · · ·  0]^T   and   x₂″ = [0  z₂  · · ·  z_m]^T.

Set v₂ = x₂″ + e^{iθ₂}‖x₂″‖e₂. Since x₂″ and e₂ are both orthogonal to e₁, the vector v₂ is also orthogonal to e₁. Thus, the reflection H_{v₂} acts as the identity on the span of e₁, and H_{v₂} leaves x₂′ and the first column of H_{v₁}A unchanged.
We now have H_{v₂}x₂ = H_{v₂}x₂′ + H_{v₂}x₂″ = x₂′ + H_{v₂}x₂″. By Lemma 3.4.7 the vector H_{v₂}x₂″ lies in span{e₂}; so H_{v₂}x₂ ∈ span{e₁, e₂}, and the matrix H_{v₂}H_{v₁}A has the form

    H_{v₂}H_{v₁}A = [ *  *  *  · · ·  * ]
                    [ 0  *  *  · · ·  * ]
                    [ 0  0  *  · · ·  * ]
                    [ ⋮  ⋮  ⋮          ⋮ ]
                    [ 0  0  *  · · ·  * ].

Repeat the procedure for each k = 3, 4, ..., ℓ, choosing x_k equal to the kth column of H_{v_{k−1}} · · · H_{v₂}H_{v₁}A, and decomposing x_k = x_k′ + x_k″, where x_k′ and x_k″ are of the form

    x_k′ = [z₁  · · ·  z_{k−1}  0  · · ·  0]^T   and   x_k″ = [0  · · ·  0  z_k  · · ·  z_m]^T.

Set v_k = x_k″ + e^{iθ_k}‖x_k″‖e_k. The vectors x_k″ and e_k are both orthogonal to the subspace spanned by e₁, ..., e_{k−1}, so v_k is as well. This implies that the reflection H_{v_k} acts as the identity on the span of e₁, ..., e_{k−1}. In other words, since the first k − 1 columns of H_{v_{k−1}} · · · H_{v₂}H_{v₁}A are in upper-triangular form, they remain unchanged by H_{v_k}. Moreover, we have H_{v_k}x_k = x_k′ + H_{v_k}x_k″. By Lemma 3.4.7 we have that H_{v_k}x_k″ ∈ span{e_k}, and so H_{v_k}x_k ∈ span{e₁, e₂, ..., e_k}, as desired.
Upon termination, we set R = H_{v_ℓ} · · · H_{v₂}H_{v₁}A, noting that it is upper triangular. The QR decomposition follows by taking Q = H_{v₁}^H H_{v₂}^H · · · H_{v_ℓ}^H. By Proposition 3.4.4 each H_{v_k} is orthonormal, so the product Q is as well. □

Remark 3.4.10. Recall that at the kth reflection, the first k − 1 columns are unaffected by H_{v_k}. It turns out that the first k − 1 rows are also unaffected by H_{v_k}. This is harder to see than for columns, but it follows from the fact that the kth Householder transform is block diagonal with the (1,1) block being the (k − 1) × (k − 1) identity. The numerical algorithms for computing the QR decomposition take advantage of these facts and skip those parts of the calculations. This speeds up these algorithms considerably.
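The following sketch (our own, real matrices only) turns the proof of Theorem 3.4.9 into a short NumPy routine; a production implementation would store the vectors v_k rather than form Q explicitly and would use the complex sign of Lemma 3.4.7.

    import numpy as np

    def householder_qr(A):
        # QR via Householder reflections (real case, m >= n, full column rank)
        m, n = A.shape
        R = A.astype(float)
        Q = np.eye(m)
        for k in range(min(m - 1, n)):
            x = R[k:, k].copy()
            v = x.copy()
            v[0] += (1.0 if x[0] >= 0 else -1.0) * np.linalg.norm(x)
            v /= np.linalg.norm(v)
            # apply H_v = I - 2 v v^T to the trailing block of R and accumulate Q
            R[k:, k:] -= 2.0 * np.outer(v, v @ R[k:, k:])
            Q[:, k:] -= 2.0 * np.outer(Q[:, k:] @ v, v)
        return Q, R

    A = np.array([[1., -2., 3.5], [1., 3., -0.5], [1., 3., 2.5], [1., -2., 0.5]])
    Q, R = householder_qr(A)
    print(np.allclose(Q @ R, A), np.allclose(R, np.triu(R)))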

3.4.3 A Complete Worked Example


Consider the matrix from Example 3.3.11:

    A = [ 1  −2   3.5 ]
        [ 1   3  −0.5 ]
        [ 1   3   2.5 ]
        [ 1  −2   0.5 ].

To compute the QR decomposition using Householder reflections, let x₁ be the first column of A and set

    v₁ = x₁ − ‖x₁‖e₁ = [−1  1  1  1]^T.

This gives

    H_{v₁} = [ 1/2   1/2   1/2   1/2 ]
             [ 1/2   1/2  −1/2  −1/2 ]
             [ 1/2  −1/2   1/2  −1/2 ]
             [ 1/2  −1/2  −1/2   1/2 ],

which yields

    H_{v₁}A = [ 2   1  3 ]
              [ 0   0  0 ]
              [ 0   0  3 ]
              [ 0  −5  1 ].

Let x₂ be the second column of H_{v₁}A, noting that x₂ = x₂′ + x₂″ with x₂′ = [1  0  0  0]^T and x₂″ = [0  0  0  −5]^T. Now set

    v₂ = x₂″ − ‖x₂″‖e₂ = [0  −5  0  −5]^T.

This gives

    H_{v₂} = [ 1   0  0   0 ]
             [ 0   0  0  −1 ]
             [ 0   0  1   0 ]
             [ 0  −1  0   0 ],

which yields

    H_{v₂}H_{v₁}A = [ 2  1   3 ]
                    [ 0  5  −1 ]
                    [ 0  0   3 ]
                    [ 0  0   0 ].

Therefore,

    R = H_{v₂}H_{v₁}A = [ 2  1   3 ]
                        [ 0  5  −1 ]
                        [ 0  0   3 ]
                        [ 0  0   0 ]

and

    Q = (H_{v₂}H_{v₁})⁻¹ = H_{v₁}^T H_{v₂}^T = (1/2) [ 1  −1   1  −1 ]
                                                      [ 1   1  −1  −1 ]
                                                      [ 1   1   1   1 ]
                                                      [ 1  −1  −1   1 ],

where the second equality comes from the fact that H_{v₁} and H_{v₂} are orthonormal matrices.

3.5 Normed Linear Spaces


Up to this point we've used the notation II ·II to denote the length of a vector induced
by an inner product, and we did this without any justification or explanation. In
this section, we show that this notion of length really does have the properties that
one would expect of a length function. But there are also other ways of defining
the length, or norm, of a vector.
Norms are useful for many things, in particular, for measuring when two vec-
tors are close together (or far apart). This allows us to quantify the convergence
of sequences, to approximate vectors with other vectors, and to give geometric
meaning to properties and relations in vector spaces. Different norms give different
definitions of which vectors are "close" to one another, and it is very handy to have
many alternatives to choose from, depending on the problem we want to solve.
In this section, we describe the basic properties of norms and give examples
of various kinds of norms, some of which can be induced by an inner product and
some of which cannot.

Definition 3.5.1. A norm on a vector space V is a map ‖·‖ : V → [0, ∞) satisfying the following conditions for all x, y ∈ V and all a ∈ F:
(i) Positivity: ‖x‖ ≥ 0, with equality if and only if x = 0.
(ii) Scale preservation: ‖ax‖ = |a| ‖x‖.
(iii) Triangle inequality: ‖x + y‖ ≤ ‖x‖ + ‖y‖.
If ‖·‖ satisfies all the conditions above except that ‖x‖ = 0 need not imply that x = 0, then ‖·‖ is called a seminorm. A vector space with a norm is called a normed linear space and is often denoted by the pair (V, ‖·‖).

Theorem 3.5.2. Every inner product space is a normed linear space with norm ‖x‖ = √⟨x, x⟩.

Proof. The first two conditions follow immediately from the definition of an inner product; see Remark 3.1.12. It remains to verify the triangle inequality:

    ‖x + y‖² = ‖x‖² + ⟨x, y⟩ + ⟨y, x⟩ + ‖y‖²
             ≤ ‖x‖² + 2|⟨x, y⟩| + ‖y‖²
             ≤ ‖x‖² + 2‖x‖‖y‖ + ‖y‖²    (by Cauchy-Schwarz)
             = (‖x‖ + ‖y‖)².

Hence, ‖x + y‖ ≤ ‖x‖ + ‖y‖. □

Remark 3.5.3. In most situations, the most natural norm to use on an inner
product space is the induced norm llx ll 2 = (x,x). Whenever we are working in
an inner product space, we will almost always use the induced norm, unless we
explicitly say otherwise.

3.5.1 Examp les

Example 3.5.4. For any x = [x₁  x₂  · · ·  xₙ]^T ∈ F^n, the following are norms on F^n:

(i) 1-norm.
    ‖x‖₁ = Σ_{k=1}^{n} |x_k|.    (3.19)
The 1-norm is sometimes called the taxicab norm or the Manhattan norm because it tracks the distance traversed along streets in a rectilinear grid.

(ii) 2-norm.
    ‖x‖₂ = (Σ_{k=1}^{n} |x_k|²)^{1/2}.    (3.20)
This is the usual Euclidean norm, and it measures the distance that a crow would fly to get from one point to another.

(iii) ∞-norm.
    ‖x‖_∞ = max_{1≤k≤n} |x_k|.    (3.21)

Figure 3.5 shows the sets {x ∈ ℝ² : ‖x‖ ≤ 1} for the 1-norm, the 2-norm, and the ∞-norm, and Figure 3.6 shows the analogous sets in ℝ³. As mentioned above, the 2-norm arises from the standard inner product on F^n, but neither the 1-norm nor the ∞-norm arises from any inner product.

Figure 3.5. The closed unit ball {x ∈ ℝ² : ‖x‖ ≤ 1} for the 1-norm, the 2-norm, and the ∞-norm in ℝ², as discussed in Example 3.5.4.

Figure 3.6. The closed unit ball {x ∈ ℝ³ : ‖x‖ ≤ 1} for the 1-norm, the 2-norm, and the ∞-norm in ℝ³, as discussed in Example 3.5.4.

Example 3.5.5. The 1- and 2-norms are special cases of the p-norms (p ≥ 1) given by

    ‖x‖_p = (Σ_{j=1}^{n} |x_j|^p)^{1/p}.    (3.22)

We show in Corollary 3.6.7 that the p-norm satisfies the triangle inequality. It is not hard to show that the ∞-norm is the limit of p-norms as p → ∞; that is,

    ‖x‖_∞ = lim_{p→∞} ‖x‖_p.

It is common to say that the ∞-norm is the p-norm for p = ∞.
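All of these norms are available through numpy.linalg.norm; the short sketch below (our own) also checks numerically that ‖x‖_p approaches ‖x‖_∞ as p grows.

    import numpy as np

    x = np.array([3.0, -4.0, 1.0])
    print(np.linalg.norm(x, 1))        # 1-norm:  |3| + |-4| + |1| = 8
    print(np.linalg.norm(x, 2))        # 2-norm:  sqrt(26)
    print(np.linalg.norm(x, np.inf))   # infinity-norm: 4
    for p in (2, 4, 8, 16, 64):
        print(p, np.linalg.norm(x, p)) # decreases toward 4 as p grows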

Example 3.5.6. The Frobenius norm on the space M_{m×n}(F) is given by

    ‖A‖_F = √(tr(A^H A)).

This arises from the Frobenius (or Hilbert-Schmidt) inner product (3.5). It is worth noting that the square of the Frobenius norm is just the sum of squares of the matrix elements. Thus, if you stack the elements of A into a long vector of dimension mn × 1, the Frobenius norm of A is just the usual 2-norm of that stacked vector.
The Frobenius norm holds a prominent place in applied linear algebra, in part because it is easy to compute, but also because it is invariant under orthonormal transformations; that is, ‖UA‖_F = ‖A‖_F whenever U is an orthonormal matrix.

Example 3.5.7. For p ∈ [1, ∞) the usual norm on the space L^p([a,b]; F) (see Example 1.1.6(iv)) is

    ‖f‖_{L^p} = (∫_a^b |f|^p dx)^{1/p}.

Similarly, for p = ∞ the usual choice of norm on the space L^∞([a,b]; F) is

    ‖f‖_{L^∞} = sup_{x∈[a,b]} |f(x)|.

This last example is an especially important one that we use throughout this book. It is sometimes called the sup norm.ᵃ

ᵃ sup is short for supremum.

Definition 3.5.8. For any normed linear space Y and any set X, let L^∞(X; Y) be the set of all bounded functions from X to Y, that is, functions f : X → Y such that the L^∞-norm ‖f‖_{L^∞} = sup_{x∈X} ‖f(x)‖_Y is finite.

Proposition 3.5.9. The pair (L^∞(X; Y), ‖·‖_{L^∞}) is a normed vector space.

Proof. The proof is Exercise 3.25. □

3.5.2 Induced Norm s on Linear Transformations


In this section, we show how to construct a norm on a set of linear transformations
from one normed linear space to another. This allows us to discuss the distance
between two linear operators and the convergence of a sequence of linear operators.
We often use these norms to prove properties of various linear operators.

Definition 3.5.10. Given two normed linear spaces (V, ‖·‖_V) and (W, ‖·‖_W), let ℬ(V; W) denote the set of bounded linear transformations, that is, the set of linear maps T : V → W for which the quantity

    ‖T‖_{V,W} = sup_{x≠0} ‖T(x)‖_W / ‖x‖_V = sup_{‖x‖_V = 1} ‖T(x)‖_W    (3.23)

is finite. The quantity ‖·‖_{V,W} is called the induced norm on ℬ(V; W), that is, the norm induced by ‖·‖_V and ‖·‖_W. For convenience, if T : V → V is a linear operator, then we usually write ℬ(V) to denote ℬ(V; V), and we write ‖·‖_V to denote²² the induced norm ‖·‖_{V,V}. We call ℬ(V) the set of bounded linear operators on V, and we call the induced norm ‖·‖_V on ℬ(V) the operator norm.

Theorem 3.5.11. The set ℬ(V; W), with operations of vector addition and scalar multiplication defined pointwise, is a vector subspace²³ of ℒ(V; W), and the pair (ℬ(V; W), ‖·‖_{V,W}) is a normed linear space.

Proof. We first prove that (3.23) satisfies the properties of a norm.

(i) Positivity: ‖T‖_{V,W} ≥ 0 by definition. Moreover, if T = 0, then ‖T‖_{V,W} = 0. Conversely, if T ≠ 0, then T(x) ≠ 0 for some x ≠ 0. Hence, (3.23) is positive.

(ii) Scale preservation: Note that

    ‖aT‖_{V,W} = sup_{x≠0} ‖aT(x)‖_W/‖x‖_V = sup_{x≠0} |a|‖T(x)‖_W/‖x‖_V = |a| sup_{x≠0} ‖T(x)‖_W/‖x‖_V = |a|‖T‖_{V,W}.

(iii) Triangle inequality:

    ‖S + T‖_{V,W} = sup_{x≠0} ‖S(x) + T(x)‖_W/‖x‖_V ≤ sup_{x≠0} (‖S(x)‖_W + ‖T(x)‖_W)/‖x‖_V
                  ≤ sup_{x≠0} ‖S(x)‖_W/‖x‖_V + sup_{x≠0} ‖T(x)‖_W/‖x‖_V = ‖S‖_{V,W} + ‖T‖_{V,W}.

To prove that ℬ(V; W) is a vector space, it suffices to show that it is a subspace of ℒ(V; W). But this follows directly, since scale preservation and the triangle inequality imply that ℬ(V; W) is closed under vector addition and scalar multiplication. □

Remark 3.5.12. Let (V, II · llv) and (W, I · llw) be normed linear spaces. The
induced norm of LE @(V; W) satisfies llL(x)llw:::; llLllv,wllxllv for all x EV .

Remark 3.5.13. It is important to note that ℬ(V; W) = ℒ(V; W) whenever V and W are both finite dimensional; we prove this in Corollary 3.7.5. But for infinite-dimensional vector spaces the space ℬ(V; W) is usually a proper subspace of ℒ(V; W). This distinction is important because some of the most useful theorems about linear transformations of finite-dimensional vector spaces extend to ℬ(V; W) in the infinite-dimensional case but do not hold for ℒ(V; W).

²² Even though the notation for the vector norm and the operator norm is the same, the context should make clear which one we are referring to: when what's inside the norm is a vector, it's the vector norm, and when what's inside the norm is an operator, it's the operator norm.
²³ See Corollary 2.2.3, which guarantees ℒ(V; W) is a vector space.

Theorem 3.5.14. Let (V, ‖·‖_V), (W, ‖·‖_W), and (X, ‖·‖_X) be normed linear spaces. If T ∈ ℬ(V; W) and S ∈ ℬ(W; X), then ST ∈ ℬ(V; X) and

    ‖ST‖_{V,X} ≤ ‖S‖_{W,X} ‖T‖_{V,W}.    (3.24)

In particular, any operator norm ‖·‖_V on ℬ(V) satisfies the submultiplicative property

    ‖ST‖_V ≤ ‖S‖_V ‖T‖_V    (3.25)

for all S and T in ℬ(V).

Proof. For all v ∈ V we have

    ‖STv‖_X ≤ ‖S‖_{W,X} ‖Tv‖_W ≤ ‖S‖_{W,X} ‖T‖_{V,W} ‖v‖_V. □

Definition 3.5.15. Any norm I · II on the finite -dimensional vector space Mn(F)
that satisfies the submultiplicative property is called a matrix norm.

Remark 3.5.16. If II · I is an operator norm, then the submultiplicative property


shows that for every n :'.'.". 1 and every linear operator T, we have llTnll ::; l Tlln.
Moreover, if llTll < 1, then llTnll : : ; llTlln -t 0 as n -too, which implies that Tn -t 0
as n -t oo (see Chapter 5). This is a useful observation in many applications,
particularly in numerical analysis.

Example 3.5.17. Using the p-norm on both F^m and F^n yields the induced matrix norm on M_{m×n}(F) defined by

    ‖A‖_p = sup_{x≠0} ‖Ax‖_p / ‖x‖_p,    1 ≤ p ≤ ∞.    (3.26)

Unexample 3.5.18. The Frobenius norm (see Example 3.5 .6) is not an
induced norm, but, as shown in Exercise 4.28, it does satisfy the submul-
tiplicative property llABllF::; llAllFllBllF ·

Example 3.5.19. Let (V, ⟨·, ·⟩) be an inner product space with the usual norm ‖·‖. If the linear operator L : V → V is orthonormal, then ‖L(x)‖ = ‖x‖ for every x ∈ V, which implies that the induced norm ‖·‖ on ℬ(V) satisfies ‖L‖ = 1. A linear operator that preserves the norm of every vector is called an isometry. This is a much stronger condition than just saying that ‖L‖ = 1; in fact an isometry preserves both lengths and angles.

3.5.3 Explicit Formulas for ‖A‖₁ and ‖A‖_∞

The norms ‖·‖₁ and ‖·‖_∞ of a linear operator on a finite-dimensional vector space have very simple descriptions in terms of row and column sums. Although they may be less intuitive than the usual induced 2-norm, they are much simpler to compute and can give us useful bounds for the 2-norm and other norms. This allows us to simplify many problems where more computationally complex norms might seem more natural at first.

Theorem 3.5.20. Let A = [a_ij] ∈ M_{m×n}(F). We have the following:

    ‖A‖₁ = sup_{1≤j≤n} Σ_{i=1}^{m} |a_ij|,    (3.27)

    ‖A‖_∞ = sup_{1≤i≤m} Σ_{j=1}^{n} |a_ij|.    (3.28)

In other words, the 1-norm and the ∞-norm are, respectively, the largest column and row sums (after taking the modulus of each entry).

Proof. We prove (3.28) and leave (3.27) to the reader (Exercise 3.27). Note that

    ‖Ax‖_∞ = sup_{1≤i≤m} |Σ_{j=1}^{n} a_ij x_j| ≤ sup_{1≤i≤m} Σ_{j=1}^{n} |a_ij||x_j| ≤ (sup_{1≤i≤m} Σ_{j=1}^{n} |a_ij|) ‖x‖_∞.

Hence, for all x ≠ 0, we have

    ‖Ax‖_∞ / ‖x‖_∞ ≤ sup_{1≤i≤m} Σ_{j=1}^{n} |a_ij|.

Taking the supremum of the left side yields

    ‖A‖_∞ ≤ sup_{1≤i≤m} Σ_{j=1}^{n} |a_ij|.

It suffices now to prove the reverse inequality. Let k be the row index satisfying

    Σ_{j=1}^{n} |a_kj| = sup_{1≤i≤m} Σ_{j=1}^{n} |a_ij|.

Let x be the vector whose ith entry is 0 if a_ki = 0 and is ā_ki/|a_ki| if a_ki ≠ 0. The only way x could be 0 is if A = 0, in which case the theorem is clearly true; so we may assume that at least one of the entries of x is not zero, and thus ‖x‖_∞ = 1. We have

    ‖A‖_∞ ≥ ‖Ax‖_∞ ≥ |Σ_{j=1}^{n} a_kj x_j| = Σ_{j=1}^{n} |a_kj| = sup_{1≤i≤m} Σ_{j=1}^{n} |a_ij|. □
Example 3.5.21. If A is a matrix whose largest absolute column sum is 7 and whose largest absolute row sum is 8, then ‖A‖₁ = 7 and ‖A‖_∞ = 8.

Nota Bene 3.5.22. One way to remember that the 1-norm is the largest
column sum is to observe that the number one looks like a column. Similarly,
the symbol for infinity is more horizontal in shape, corresponding to the fact
that the infinity norm is the largest row sum.
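As a quick numerical sanity check (our own sketch, with an arbitrary example matrix), the column-sum and row-sum formulas agree with the induced norms computed by NumPy:

    import numpy as np

    A = np.array([[4., -4.], [3., -1.]])
    col_sum = np.abs(A).sum(axis=0).max()        # largest absolute column sum: 7
    row_sum = np.abs(A).sum(axis=1).max()        # largest absolute row sum: 8
    print(np.isclose(col_sum, np.linalg.norm(A, 1)))       # ||A||_1
    print(np.isclose(row_sum, np.linalg.norm(A, np.inf)))  # ||A||_inf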

3.6 Important Norm Inequalities


The arsenal of the analyst is stocked with inequalities.
-Bela Bollobas

In this section, we examine several important norm inequalities that are fun-
damental in applied analysis. In particular, we prove the Minkowski and Holder
inequalities. Before doing so, however, we prove Young's inequality, which is one of
the most widely used inequalities in all of mathematics.

3.6.1 Young's Inequality


We begin with the following lemma.

Lemma 3.6.1. If 1/p + 1/q = 1, where 1 < p, q < ∞ (meaning both 1 < p < ∞ and 1 < q < ∞), then for all real x > 0 we have

    1 ≤ x/p + x^{1−q}/q.    (3.29)

Proof. Let f(x) be the right side of (3.29); that is, f(x) = x/p + x^{1−q}/q. Note that f(1) = 1. It suffices to show that x = 1 is the minimum of f(x). Clearly, f′(x) = 1/p + x^{−q}(1 − q)/q, which simplifies to f′(x) = (1/p)(1 − x^{−q}), since 1/p = (q − 1)/q. This implies that x = 1 is the only critical point. Since f″(x) = (q/p)x^{−q−1} > 0 for all x > 0, it follows that f attains its global minimum at x = 1. □

Theorem 3.6.2 (Young's Inequality). If a, b ≥ 0 and 1/p + 1/q = 1, where 1 < p, q < ∞, then

    ab ≤ a^p/p + b^q/q.    (3.30)

Proof. First observe that multiplying 1/p + 1/q = 1 by pq yields p + q = pq, and hence p + q − pq = 0. The inequality is immediate if a = 0 or b = 0, so assume a, b > 0. Setting x = a^{p−1}/b in (3.29), we have

    1 ≤ (1/p)(a^{p−1}/b) + (1/q)(a^{p−1}/b)^{1−q} = a^{p−1}/(pb) + a^{p+q−pq−1}b^{q−1}/q
      = a^{p−1}/(pb) + b^{q−1}/(qa) = (1/(ab))(a^p/p + b^q/q).

Thus, (3.30) holds. □

Corollary 3.6.3 (Arithmetic-Geometric Mean Inequality). If x, y ≥ 0 and 0 ≤ θ ≤ 1, then

    x^θ y^{1−θ} ≤ θx + (1 − θ)y.    (3.31)

In particular, for θ = 1/2, we have √(xy) ≤ (x + y)/2.

Proof. If θ = 0 or θ = 1, then the corollary is trivial. If θ ∈ (0, 1), then by Young's inequality with p = 1/θ, q = 1/(1 − θ), a = x^θ, and b = y^{1−θ}, we have

    x^θ y^{1−θ} ≤ θ(x^θ)^{1/θ} + (1 − θ)(y^{1−θ})^{1/(1−θ)} = θx + (1 − θ)y. □

3.6.2 Resulting Norm Inequalities

Corollary 3.6.4 (Hölder's Inequality). If x, y ∈ F^n and 1/p + 1/q = 1, where 1 < p, q < ∞, then

    Σ_{k=1}^{n} |x_k y_k| ≤ ‖x‖_p ‖y‖_q.    (3.32)

The corresponding result for p = 1 and q = ∞ also holds:

    Σ_{k=1}^{n} |x_k y_k| ≤ ‖x‖₁ ‖y‖_∞.

Proof. When p = 1 and q = ∞, the result is immediate. Since |y_k| ≤ ‖y‖_∞, it follows that

    Σ_{k=1}^{n} |x_k y_k| ≤ Σ_{k=1}^{n} |x_k| ‖y‖_∞ = ‖x‖₁ ‖y‖_∞.

When 1 < p, q < ∞, the result is trivial if x = 0 or y = 0, so assume both are nonzero. Young's inequality, applied with a = |x_k|/‖x‖_p and b = |y_k|/‖y‖_q and summed over k, gives

    Σ_{k=1}^{n} |x_k y_k| / (‖x‖_p ‖y‖_q) ≤ Σ_{k=1}^{n} (|x_k|^p/(p‖x‖_p^p) + |y_k|^q/(q‖y‖_q^q)) = 1/p + 1/q = 1,

which yields (3.32). □

Remark 3.6.5. Assuming the standard inner product on F^n, we have

    |⟨x, y⟩| = |Σ_{k=1}^{n} x̄_k y_k| ≤ Σ_{k=1}^{n} |x_k y_k| ≤ ‖x‖_p ‖y‖_q.

In particular, when p = q = 2, this yields the Cauchy-Schwarz inequality.

Remark 3.6.6. Using Holder's inequality on x and y, where y = e i is the standard


basis function, we have that lxil :::; l xllp·

Corollary 3.6.7 (Minkowski's Inequality). If x, y ∈ F^n and p ∈ (1, ∞), then

    ‖x + y‖_p ≤ ‖x‖_p + ‖y‖_p.    (3.34)

Minkowski's inequality is precisely the triangle inequality for the p-norm, thus showing that the p-norm is, indeed, a norm on F^n.

Proof. Choose q = p/(p − 1) so that 1/p + 1/q = 1 and p − 1 = p/q. We have

    ‖x + y‖_p^p = Σ_{k=1}^{n} |x_k + y_k|^p
                ≤ Σ_{k=1}^{n} |x_k + y_k|^{p−1}|x_k| + Σ_{k=1}^{n} |x_k + y_k|^{p−1}|y_k|
                ≤ (Σ_{k=1}^{n} |x_k + y_k|^p)^{1/q} ‖x‖_p + (Σ_{k=1}^{n} |x_k + y_k|^p)^{1/q} ‖y‖_p
                = ‖x + y‖_p^{p/q}(‖x‖_p + ‖y‖_p)
                = ‖x + y‖_p^{p−1}(‖x‖_p + ‖y‖_p).

Hence, ‖x + y‖_p ≤ ‖x‖_p + ‖y‖_p. □
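A small numerical sanity check of Hölder's and Minkowski's inequalities on random vectors (our own sketch):

    import numpy as np

    rng = np.random.default_rng(1)
    x, y = rng.standard_normal(5), rng.standard_normal(5)
    p = 3.0
    q = p / (p - 1)    # conjugate exponent, 1/p + 1/q = 1

    print(np.sum(np.abs(x * y)) <= np.linalg.norm(x, p) * np.linalg.norm(y, q))   # Hoelder
    print(np.linalg.norm(x + y, p) <= np.linalg.norm(x, p) + np.linalg.norm(y, p))  # Minkowski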

Remark 3.6.8. Both Hölder's and Minkowski's inequalities can be generalized fairly easily to both ℓ^p and L^p([a,b]; F). Recall that in Example 1.1.6(iv) we prove L^p([a,b]; F) is closed under addition by showing that ‖f + g‖_p ≤ 2(‖f‖_p^p + ‖g‖_p^p)^{1/p}. Minkowski's inequality provides a much sharper bound.

We conclude this section by extending Young's inequality to give the following


two additional norm inequalities.

Corollary 3.6.9. If x, y ∈ F^n and 1 < p, q < ∞, with 1/p + 1/q = 1, then

    Σ_{k=1}^{n} |x_k y_k| ≤ ‖x‖_p^p/p + ‖y‖_q^q/q.    (3.35)

Proof. This follows immediately from (3.30). □



Corollary 3.6.10. For all c > 0 and all x, y E JFn, we have

(3.36)

Proof. This follows immediately from (3.46) in Exercise 3.32. D

3.7 Adjoints
Let A be an m × n matrix. For the usual inner product (3.2), we have that

    ⟨x, Ay⟩ = x^H Ay = (A^H x)^H y = ⟨A^H x, y⟩

for any x ∈ F^m and y ∈ F^n. In this section, we generalize this property of Hermitian conjugates (see Definition C.1.3) to arbitrary linear transformations and arbitrary inner products. We call the resulting map the adjoint.
Before developing the theory of the adjoint, we present the celebrated Riesz
representation theorem, which states that for each bounded linear transformation
L : V --+ JF, there exists w E V such that L(v) = (w, v) for all v E V. In
fact, there is a one-to-one correspondence between the vectors w and the linear
transformations L . In this section, we prove the finite-dimensional version of this
result. The infinite-dimensional version is beyond the scope of this text and is
typically seen in a standard functional analysis course.

3.7.1 Finite-Dimensional Riesz Representation Theorem


Let S = [x₁, ..., xₙ] be a basis of the vector space V. Given a linear transformation f : V → F, we can write its matrix representation in the basis S as the 1 × n matrix [f(x₁) · · · f(xₙ)]. In other words, if x = Σ_{i=1}^{n} a_i x_i, then f(x) can be written as

    f(x) = [f(x₁) · · · f(xₙ)][a₁ · · · aₙ]^T = Σ_{i=1}^{n} f(x_i)a_i.

Applying this to V = F^n with the standard basis S = [e₁, ..., eₙ], if x = Σ_{i=1}^{n} a_i e_i, then f(x) = Σ_{i=1}^{n} f(e_i)a_i = y^H x, where y = Σ_{i=1}^{n} \overline{f(e_i)} e_i.
This shows that every linear function f : F^n → F can be written as f(x) = ⟨y, x⟩ for some y ∈ F^n, where ⟨·, ·⟩ is the usual inner product. Moreover, we have |f(x)| = |⟨y, x⟩| ≤ ‖y‖‖x‖, which implies that ‖f‖ ≤ ‖y‖. Also, |⟨y, x⟩| = |f(x)| ≤ ‖f‖‖x‖ for all x, which implies that ‖y‖² ≤ ‖f‖‖y‖, and hence ‖y‖ ≤ ‖f‖. Therefore, we have ‖f‖ = ‖y‖. By Corollary 3.3.6 and Remark 3.3.7, these results hold for any finite-dimensional inner product space. We summarize these results in the following theorem.

Theorem 3.7.1 (Finite-Dimensional Riesz Representation Theorem). Assume that (V, ⟨·, ·⟩) is a finite-dimensional inner product space. If L is any linear transformation L : V → F, then there exists a unique y ∈ V such that L(x) = ⟨y, x⟩ for all x ∈ V. Moreover, we have that ‖L‖ = ‖y‖ = √⟨y, y⟩.

Remark 3.7.2. It is useful to explicitly find the vector y that the previous theorem promises must exist. Let [x₁, ..., xₙ] be an orthonormal basis of V. If x ∈ V, then we can write x uniquely as the linear combination x = Σ_{i=1}^{n} a_i x_i, where each a_i = ⟨x_i, x⟩. Hence,

    L(x) = Σ_{i=1}^{n} a_i L(x_i) = Σ_{i=1}^{n} ⟨x_i, x⟩ L(x_i) = ⟨ Σ_{i=1}^{n} \overline{L(x_i)} x_i, x ⟩.

If we set y = Σ_{i=1}^{n} \overline{L(x_i)} x_i, then L(x) = ⟨y, x⟩ for each x ∈ V.

Vista 3. 7 .3. Although the proof of the Riesz representation theorem given
above is very simple, it relies on the finite-dimensionality of V. If V is infinite
dimensional, then the sum used to define y becomes an infinite sum, and it
is not at all clear that it should converge. Nevertheless, the result can be
generalized to infinite dimensions, provided we restrict ourselves to bounded
linear transformations. The infinite-dimensional Riesz representation theorem
is a famous result in functional analysis that has widespread applications in
differential equations, probability theory, and optimization, but the proof of
the infinite-dimensional case would take us beyond t he scope of this book.

Definition 3.7.4. A linear functional is a linear transformation L : V → F. If L ∈ ℬ(V; F), we say that it is a bounded linear functional.

Corollary 3.7.5. If V is finite dimensional, then every linear functional on V is bounded; that is, ℬ(V; F) = ℒ(V; F). This is a special case of ℬ(V; W) = ℒ(V; W), where W = F. (Recall Remark 3.5.13.)

3.7.2 Adjoints
Since every linear functional on a finite-dimensional inner product space Vis defined
by the inner product with some vector in V, it is natural to ask how such functions
and the corresponding inner products change when a linear transformation acts on
V. Adjoints answer this question.

Definition 3. 7.6. Assume that (V, (-, ·)v) and (W, (·, ·)w) are inner product spaces
and that L : V --+ W is a linear transformation. The adjoint of L is a linear
transformation L * : W --+ V such that

(w , L (v))w = (L*(w) , v )v for allv EV andw E W. (3.37)

Remark 3. 7. 7. Theorem 3. 7 .10 shows that, for finite-dimensional inner product


spaces, the adjoint always exists and is unique; so it makes sense to speak of the
adjoint.

Example 3.7.8. Let A = [a_ij] be the matrix representation of L : F^m → F^n in the standard bases. If F^m and F^n have the standard inner product ⟨x, y⟩ = x^H y, the matrix representation of the adjoint L* is given by the Hermitian conjugate A^H = [b_ji], where b_ji = ā_ij. This is easy to verify directly:

    ⟨w, L(v)⟩ = w^H(Av) = (A^H w)^H v = ⟨L*(w), v⟩.
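A quick numerical illustration of the defining property ⟨w, Lv⟩ = ⟨L*w, v⟩ for a random complex matrix (our own sketch; np.vdot conjugates its first argument):

    import numpy as np

    rng = np.random.default_rng(2)
    A = rng.standard_normal((3, 4)) + 1j * rng.standard_normal((3, 4))
    v = rng.standard_normal(4) + 1j * rng.standard_normal(4)
    w = rng.standard_normal(3) + 1j * rng.standard_normal(3)

    lhs = np.vdot(w, A @ v)              # <w, Av> = w^H (Av)
    rhs = np.vdot(A.conj().T @ w, v)     # <A^H w, v>
    print(np.isclose(lhs, rhs))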

Example 3.7.9. Let V be the vector space of smooth (infinitely differentiable) functions f on ℝ that have compact support, that is, such that there exists some closed and bounded interval [a, b] such that f(x) = 0 if x ∉ [a, b]. Note that all these functions are Riemann integrable and satisfy ∫_{−∞}^{∞} f² dx < ∞. The space V is an inner product space with ⟨f, g⟩ = ∫_{−∞}^{∞} f g dx. If we define L : V → V by L[g](x) = g′(x), then using integration by parts gives

    ⟨f, L[g]⟩ = ∫_{−∞}^{∞} f(x)g′(x) dx = −∫_{−∞}^{∞} f′(x)g(x) dx = −⟨L[f], g⟩.

So for this inner product and linear operator, we have L* = −L.

Theorem 3. 7.10. Assume that (V, (., ·)v) and (W, (., ·)w) are finite -dimensional
inner product spaces. If L : V -t W is a linear transformation, then the adjoint L *
of L exists and is unique.

Proof. To prove existence, define for each w E W the linear map Lw : V -t lF by


Lw(v) = (w, L(v)) w· By the finite-dimensional version of the Riesz representation
theorem, there exists a unique u E V such that Lw(v) = (u, v)v for all v E V.
Thus, we define the map L * : W -t V satisfying L * ( w) = u . Since (w, L( v)) w =
(L*(w), v)v for all v EV and w E W, we only need to check that L* is linear. For
any w1 , w2 E W we have

(L*(aw1 + bw2), v)v = + bw2, L(v))w =a (w1, L(v))w + b (w2, L(v) )w


(aw1
=a (L*(w1), v)v + b (L*(w2), v)v = (aL*(wi) + bL*(w2), v)v.
Since this holds for all v EV, we have that L*(aw 1 + bw 2) = aL*(w 1) + bL*(w 2 ) .
To prove uniqueness, assume that both Li and D), are adjoints of L. We have
((Li - L:2)(w), v) = 0 for all v EV and w E W. By Proposition 3.1.16, this implies
that (Li - D2)(w) = 0 for all w E W. Hence, Li = L2,. D

Vista 3. 7 .11. As with the Riesz representation theorem, the previous propo-
sition can be extended to bounded linear transformations of infinite-dimensional
vector spaces. You should expect to be able to prove this generalization after
taking a course in functional analysis.

Proposition 3.7.12. Let V and W be finite-dimensional inner product spaces. The adjoint has the following properties:
(i) If S, T ∈ ℒ(V; W), then (S + T)* = S* + T* and (aT)* = āT* for a ∈ F.
(ii) If S ∈ ℒ(V; W), then (S*)* = S.
(iii) If S, T ∈ ℒ(V), then (ST)* = T*S*.
(iv) If T ∈ ℒ(V) and T is invertible, then (T*)⁻¹ = (T⁻¹)*.

Proof. The proof is Exercise 3.39. □

3.8 Fundamental Subspaces of a Linear Transformation


In this section we use adjoints to describe four fundamental subspaces associated
with a linear operator and prove the fundamental subspaces theorem, which gives
some very simple but powerful relations among these spaces.

3.8.1 Orthogonal Complements


Throughout this subsection, let (V, (-,· ))be an inner product space.

Definition 3.8.1. The orthogonal complement of S ⊂ V is the set

    S^⊥ = {y ∈ V | ⟨x, y⟩ = 0 for all x ∈ S}.

Example 3.8.2. For a single vector v E V , the set {v }J_ is the hyperplane
in V defined by (v, x ) = 0. Thus, if V = JFn with the standard inner product
and v = (v1 , . . . ,vn), then VJ_= {(x1, ... ,xn) I V1X1 + · · · + VnXn = O}.

Proposition 3.8.3. SJ_ is a subspace of V .

Proof. If y1, y2 E SJ_ , then for each x E S, we have (x, ay1 + by2 ) = a (x, Y1) +
b (x, Y2) = 0. Thus, ay1 + by2 E SJ_. D

Remark 3.8.4. For an alternative to the previous proof, recall that the intersection
of a collection of subspaces is a subspace (see Proposition 1.2.4) . Thus, we have
that SJ_ = nxES{x}J_. In other words, the hyperplane {x}J_ is a subspace, and so
the intersection of all these subspaces is also a subspace.

Theorem 3.8.5. If W is a finite-dimensional subspace of V, then V = W ⊕ W^⊥.

Proof. If x ∈ W ∩ W^⊥, then ⟨x, x⟩ = 0, which implies that x = 0. Thus, it suffices to prove that W + W^⊥ = V. This follows from Theorem 3.2.7, since any v ∈ V may be written as v = proj_W(v) + r, where r = v − proj_W(v) is orthogonal to W. □

Lemma 3.8.6. If W is a finite-dimensional subspace of V, then (W^⊥)^⊥ = W.

Proof. If x ∈ W, then ⟨x, y⟩ = 0 for all y ∈ W^⊥. This implies that x ∈ (W^⊥)^⊥, and so W ⊂ (W^⊥)^⊥. Suppose now that x ∈ (W^⊥)^⊥. By Theorem 3.8.5, we can write x uniquely as x = w + w^⊥, where w ∈ W and w^⊥ ∈ W^⊥. However, we also have that ⟨w^⊥, x⟩ = 0, which implies that 0 = ⟨w^⊥, w + w^⊥⟩ = ⟨w^⊥, w⟩ + ⟨w^⊥, w^⊥⟩ = ⟨w^⊥, w^⊥⟩ = ‖w^⊥‖², and hence w^⊥ = 0. This implies x ∈ W, and thus (W^⊥)^⊥ ⊂ W. □

Remark 3.8. 7. It is important to note that Theorem 3.8.5 and Lemma 3.8.6 do
not hold for all infinite-dimensional subspaces. For example, the space lF[x] of
polynomials is a proper subspace of C([O, l];lF) with the inner product (f,g) =
J01 fgdx, yet it can be shown that the orthogonal complement of lF[x] is the zero
vector.

3.8 .2 The Fundamental Subspaces


Throughout this subsection, assume that L : V -+ W is a linear transformation
from the inner product space (V, (-, ·)v) into the inner product space (W, (·, )w)·
Assume also that the adjoint of L is L *.
We now define the four fundamental subspaces.

Definition 3.8.8. The following are the four fundamental subspaces of L:

(i) The range of L, denoted by ℛ(L).

(ii) The kernel of L, denoted by 𝒩(L).

(iii) The range of L*, denoted by ℛ(L*).

(iv) The kernel of L*, denoted by 𝒩(L*).

The four fundamental subspaces are depicted in Figure 3.7.

Theorem 3.8.9 (Fundamental Subspaces Theorem). The following holds for L:

    ℛ(L)^⊥ = 𝒩(L*).    (3.38)

Moreover, if ℛ(L*) is finite dimensional, then

    𝒩(L)^⊥ = ℛ(L*).    (3.39)

Proof. Note that w ∈ ℛ(L)^⊥ if and only if ⟨L(v), w⟩ = 0 for all v ∈ V, which holds if and only if ⟨v, L*w⟩ = 0 for all v ∈ V. This occurs if and only if L*w = 0, which is equivalent to w ∈ 𝒩(L*). The proof of (3.39) follows from (3.38) and Lemma 3.8.6; see Exercise 3.43. □

Figure 3.7. The four fundamental subspaces of a linear transformation L : V → W. Any linear transformation L always maps ℛ(L*) (the black plane on the left) isomorphically to ℛ(L) (the black plane on the right) and sends 𝒩(L) (the vertical blue line on the left) to 0 (blue dot on the right) in W. Similarly, L* maps ℛ(L) isomorphically to ℛ(L*) and sends 𝒩(L*) (the vertical red line on the right) to 0 (the red dot on the left). For an alternative depiction of the fundamental subspaces, see Figure 4.1.

Corollary 3.8.10. If V and W are finite-dimensional vector spaces, then

(i) V = 𝒩(L) ⊕ ℛ(L*), and
(ii) W = ℛ(L) ⊕ 𝒩(L*).

Moreover, if n = dim V, m = dim W, and r = rank L, the four fundamental subspaces 𝒩(L), ℛ(L*), 𝒩(L*), ℛ(L) have dimensions n − r, r, m − r, and r, respectively.

Corollary 3.8.11. Let V and W be finite-dimensional vector spaces. The linear transformation L maps the subspace ℛ(L*) bijectively to the subspace ℛ(L), and L* maps ℛ(L) bijectively to ℛ(L*).

Proof. Since 𝒩(L) ∩ ℛ(L*) = {0}, the restriction of L to ℛ(L*) is injective (see Lemma 2.3.3). Since every v ∈ V can be written uniquely as v = x + r with r ∈ ℛ(L*) and x ∈ 𝒩(L), we have L(v) = L(r). Thus, ℛ(L) is the image of L restricted to ℛ(L*); that is, L maps ℛ(L*) surjectively onto ℛ(L). Taking adjoints gives the same result for L*. □

Remark 3.8.12. Let's describe the fundamental subspaces theorem in terms of matrices. Let V = F^m, W = F^n, and let A = [a_ij] be the n × m matrix representing L in the standard bases with the usual inner product. Denote the jth column of A by a_j and the ith column of A^H by b_i.
The image of a given vector v = [v₁  · · ·  v_m]^T ∈ V can be written as a linear combination of the columns of A; that is, Av = Σ_{j=1}^{m} v_j a_j. Thus, the range of A is the span of its columns. For this reason, ℛ(A) is sometimes called the column space of A, and it has dimension equal to the rank of A. Since the adjoint L* is represented by the matrix A^H, it follows that the range of A^H is the span of its columns.
The null space 𝒩(A) of A is the set of m-tuples v = [v₁  · · ·  v_m]^T in F^m such that Av = 0, or equivalently

    0 = Av = [ b₁^H ]       [ ⟨b₁, v⟩ ]
             [ b₂^H ]  v  = [ ⟨b₂, v⟩ ]
             [  ⋮   ]       [   ⋮    ]
             [ bₙ^H ]       [ ⟨bₙ, v⟩ ].

From this we see immediately that 𝒩(A) is orthogonal to the column space ℛ(A^H), or, in other words, 𝒩(A) is orthogonal to ℛ(A^H) as described in (3.39). The fundamental subspaces theorem also tells us that the direct sum of these two spaces is all of V = F^m.
Using the same argument for 𝒩(A^H), we get the dual statements that 𝒩(A^H) is orthogonal to the column space ℛ(A) and that the direct sum of these two spaces is the entire space W = F^n.

Nota Bene 3.8.13. Corollary 3.8.11 shows that L : ℛ(L*) → ℛ(L) and L* : ℛ(L) → ℛ(L*) are isomorphisms of vector spaces, but it is important to note that L* restricted to ℛ(L) is generally not the inverse of L restricted to ℛ(L*). Instead, the Moore-Penrose pseudoinverse L† : W → V is the inverse of L restricted to ℛ(L*) (see Proposition 4.6.2).

Example 3.8.14. Let A : ℝ³ → ℝ² be given by

    A = [ 1  2  3 ]
        [ 0  0  0 ].

The range of A is the span of the columns, which is ℛ(A) = span([1  0]^T). This is the x-axis in ℝ². The kernel of A is the hyperplane

    𝒩(A) = { [x₁  x₂  x₃]^T ∈ ℝ³ | x₁ + 2x₂ + 3x₃ = 0 },

which is orthogonal to [1  2  3]^T. Also, ℛ(A^H) = span([1  2  3]^T), which by the previous calculation is 𝒩(A)^⊥. Similarly, the kernel of A^H is 𝒩(A^H), which is the y-axis in ℝ², that is, ℛ(A)^⊥. The adjoint A^H, when restricted to ℛ(A), is bijective and linear, but it is not quite the inverse of A. To see this, note that

    AA^H = [ 14  0 ]
           [  0  0 ],

so when AA^H is restricted to ℛ(A) = span([1  0]^T) we have AA^H|_{ℛ(A)} = 14 I|_{ℛ(A)}, which is not the identity on ℛ(A) but is invertible there. Similarly,

    A^H A = [ 1  2  3 ]
            [ 2  4  6 ]
            [ 3  6  9 ],

so when A^H A is restricted to ℛ(A^H) = span([1  2  3]^T) it also acts as multiplication by 14. This can be seen by direct verification: for every v ∈ span([1  2  3]^T) we can write v = [t  2t  3t]^T for some t ∈ F, and we have

    [ 1  2  3 ] [  t ]   [ 14t ]
    [ 2  4  6 ] [ 2t ] = [ 28t ] = 14 v.
    [ 3  6  9 ] [ 3t ]   [ 42t ]

So again, A^H A is not the identity on ℛ(A^H), but is invertible there.

Figure 3.8. Projecting the vector b onto ℛ(A) to get p = proj_{ℛ(A)} b (blue). The best approximate solution to the overdetermined system Ax = b is the vector x̂ that solves the system Ax̂ = p, as described in Section 3.9.1. The error is the norm of the residual r = b − Ax̂ (red).

3. 9 Least Squares
Many applications involve linear systems that are overdetermined, meaning that
there are more equations than unknowns . In this section, we discuss the best
approximate solution (called the least squares solution) of an overdetermined linear
system. This technique is very powerful and is used in virtually every quantitative
discipline.

3.9.1 Formulation of Least Squares

Let A ∈ M_{m×n}(F). The system Ax = b has a solution if and only if b ∈ ℛ(A). If b is not in ℛ(A), we seek the best approximate solution; that is, we want to choose x̂ such that ‖b − Ax̂‖ is minimized. By Theorem 3.2.7, this occurs when Ax̂ = p, where p = proj_{ℛ(A)}(b). This is depicted in Figure 3.8. Thus, we want to solve

    Ax̂ = p.    (3.40)

Since A^H is a bijection when restricted to ℛ(A) (see Corollary 3.8.11), solving (3.40) is equivalent to solving the equation

    A^H Ax̂ = A^H p.

But r = b − p is orthogonal to ℛ(A), so by the fundamental subspaces theorem, we have r ∈ 𝒩(A^H). Applying the linear transformation A^H, we get

    A^H b = A^H(p + r) = A^H p.

This tells us that solving (3.40) is equivalent to solving

    A^H Ax̂ = A^H b,    (3.41)

which we call the normal equation. Any solution x̂ of either (3.41) or (3.40) is called a least squares solution of the linear system Ax = b, and it represents the best approximate solution of the linear system in the 2-norm. We summarize this discussion in the following proposition.

Proposition 3.9.1. For any A E Mmxn(lF) the system Ax= b has a least squares
solution. It is unique if and only if A is injective.

Why go to the effort of multiplying by A^H? The main advantage is that the resulting square matrix A^H A is often invertible, as we see in the next lemma. As a side benefit, multiplying by A^H also takes care of projecting to ℛ(A); that is, it automatically annihilates the part r of b that is not in the range of A.

Lemma 3.9.2. If A ∈ M_{m×n}(F) is injective, that is, of rank n, then A^H A is nonsingular.

Proof. See Exercise 3.46. □

We can now use the invertibility of AH A to summarize the results of this


section as follows.

Theorem 3.9.3. If A ∈ M_{m×n}(F) is injective, that is, of rank n, then the unique least squares solution of the system Ax = b is given by

    x̂ = (A^H A)⁻¹ A^H b.    (3.42)

Proof. Since A is injective, A^H A is invertible and, when applied to (3.41), gives the unique solution in (3.42). □

Remark 3.9.4. If the matrix A is not injective (not of rank n), then the normal
equation (3.41) always has a solution, but the solution is not unique. This gen-
erally only occurs in applications when one has collected too few data points, or
when variables that one has assumed to be linearly independent are in fact linearly
dependent. In situations where A cannot be made injective, there is still a choice
of solution that might be considered "best." We discuss this further when we treat
the singular value decomposition in Sections 4.5 and 4.6.

3.9.2 Line and Curve Fitting

Least squares solutions can be used to fit lines or other curves to data. By fit we mean that given a family of curves (for example, lines or exponential functions) we want to find the specific curve of that type that most closely matches the data in the sense described above in Section 3.9.1.
For example, suppose we want to fit the line y = mx + b to the data {(x_i, y_i)}_{i=1}^{n}. This means we want to find the slope m and y-intercept b so that mx_i + b = y_i for each i. In other words, we want to find the best approximate solution to the overdetermined matrix equation Ax = b, where

    A = [ x₁  1 ]
        [ x₂  1 ]      x = [ m ],   and   b = [ y₁ ]
        [ ⋮   ⋮ ],         [ b ]              [ y₂ ]
        [ xₙ  1 ]                             [ ⋮  ]
                                              [ yₙ ].

The least squares solution is found by solving the normal equation (3.41), which takes the form

    [ Σᵢ xᵢ²  Σᵢ xᵢ ] [ m ]   [ Σᵢ xᵢyᵢ ]
    [ Σᵢ xᵢ     n   ] [ b ] = [ Σᵢ yᵢ   ].    (3.43)

The matrix A^T A is invertible as long as the x_i terms are not all equal. Simplifying (3.42) yields

    m = (Σᵢ xᵢyᵢ − n x̄ ȳ) / (Σᵢ xᵢ² − n x̄²),   b = ȳ − m x̄,    (3.44)

where n x̄ = Σ_{i=1}^{n} x_i and n ȳ = Σ_{i=1}^{n} y_i.

Example 3.9.5. Consider the points (3.0, 7.3), (4.0, 8.8), (5.0, 11.1), and (6.0, 12.5). Using (3.43) or the explicit solution (3.44), we find the line to be m = 1.79 and b = 1.87; see Figure 3.9(a).

Figure 3.9. Least squares solution fitting (a) a line to four data points and (b) an exponential curve to four points.
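A short sketch reproducing this fit numerically (our own; np.linalg.lstsq solves the least squares problem directly):

    import numpy as np

    x = np.array([3.0, 4.0, 5.0, 6.0])
    y = np.array([7.3, 8.8, 11.1, 12.5])
    A = np.column_stack([x, np.ones_like(x)])    # rows [x_i, 1]
    (m, b), *_ = np.linalg.lstsq(A, y, rcond=None)
    print(m, b)                                   # approximately 1.79 and 1.87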

Example 3.9.6. Suppose that an amount of radioactive substance is weighed at several points in time, giving n data points {(t_i, w_i)}_{i=1}^{n}. We expect radioactive decay to be exponential, so we want to fit the curve w(t) = a e^{kt} to the data. By taking the (natural) log, we can write this as a linear equation log w(t) = log a + kt. Thus, we want to find the least squares solution for the system Ax = b, where

    A = [ t₁  1 ]
        [ t₂  1 ]      x = [   k   ],   and   b = [ log w₁ ]
        [ ⋮   ⋮ ],         [ log a ]              [ log w₂ ]
        [ tₙ  1 ]                                 [   ⋮    ]
                                                  [ log wₙ ].

If the data points are (3.0, 7.3), (4.0, 3.5), (5.0, 1.2), and (6.0, 0.8), where the first coordinate is measured in years, and the second coordinate is measured in grams, we can use (3.44) to find the exponential parameters to be k = −0.7703 and a = 71.2736. Thus, the half-life in this case is −(log 2)/k = 0.8998 years; see Figure 3.9(b).

Example 3.9.7. Suppose we have a collection of data points {(x_i, y_i)}_{i=1}^{n}, and we have reason to believe that they should lie (approximately) on a parabola of the form y = ax² + bx + c. In this case we want to find the least squares solution to the system Ax = b, with

    A = [ x₁²  x₁  1 ]
        [ x₂²  x₂  1 ]      x = [ a ],   and   b = [ y₁ ]
        [ ⋮    ⋮   ⋮ ],         [ b ]              [ y₂ ]
        [ xₙ²  xₙ  1 ]          [ c ]              [ ⋮  ]
                                                   [ yₙ ].

Again the least squares solution is obtained by solving the normal equation A^H Ax̂ = A^H b. The solution is unique if and only if the matrix A has rank 3, which occurs if there are at least three distinct values of x_i in the data set.

Vista 3.9.8. A widely used approach for computing least squares solutions of linear systems is to use the QR decomposition introduced in Section 3.3.3. If A = QR is a QR decomposition of A, then the least squares solution x̂ is found by solving the linear system Rx̂ = Q^H b (see Exercise 3.17). Typically using the QR decomposition takes about twice as long as solving the normal equation directly, but it is more stable.
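A minimal sketch of this approach (our own), using the reduced QR factorization and SciPy's triangular solver on the data of Example 3.9.5:

    import numpy as np
    from scipy.linalg import solve_triangular

    x = np.array([3.0, 4.0, 5.0, 6.0])
    y = np.array([7.3, 8.8, 11.1, 12.5])
    A = np.column_stack([x, np.ones_like(x)])

    Q, R = np.linalg.qr(A)                  # reduced QR: Q is 4x2, R is 2x2
    coeffs = solve_triangular(R, Q.T @ y)   # solve R xhat = Q^H b
    print(coeffs)                           # approximately [1.79, 1.87]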

Exercises
Note to the student: Each section of this chapter has several corresponding
exercises, all collected here at the end of the chapter. The exercises between the
first and second line are for Section 1, the exercises between the second and third
lines are for Section 2, and so forth.
You should work every exercise (your instructor may choose to let you skip
some of the advanced exercises marked with *). We have carefully selected them,
and each is important for your ability to understand subsequent material. Many of
the examples and results proved in the exercises are used again later in the text.
Exercises marked with .& are especially important and are likely to be used later
in this book and beyond. Those marked with t are harder than average, but should
still be done.
Although they are gathered together at the end of the chapter, we strongly
recommend you do the exercises for each section as soon as you have completed
the section, rather than saving them until you have finished the entire chapter. The
exercises for each section are separated from those for other sections by a
horizontal line.

3.1. Verify the polarization and parallelogram identities on a real inner product space, with the usual norm ‖x‖ = √⟨x, x⟩ arising from the inner product:
(i) ⟨x, y⟩ = (1/4)(‖x + y‖² − ‖x − y‖²).
(ii) ‖x‖² + ‖y‖² = (1/2)(‖x + y‖² + ‖x − y‖²).
It can be shown that in any normed linear space over ℝ for which (ii) holds, one can define an inner product by using (i); see [Pro08, Thm. 4.8] for details.
3.2. Verify the polarization identity on a complex inner product space, with the usual norm ‖x‖ = √⟨x, x⟩ arising from the inner product:

    ⟨x, y⟩ = (1/4)(‖x + y‖² − ‖x − y‖² + i‖x − iy‖² − i‖x + iy‖²).

A nice consequence of the polarization identity on a real or complex inner product space is that if two inner products induce the same norm, then the inner products are equal.
3.3. Let ℝ[x] have the inner product

    ⟨f, g⟩ = ∫₀¹ f(x)g(x) dx.

Using (3.8), find the angle θ between the following sets of vectors:
(i) x and x⁵.
(ii) x² and x⁴.
3.4. Let (V, ⟨·, ·⟩) be a real inner product space. A linear map T : V → V is angle preserving if for all nonzero x, y ∈ V, we have that

    ⟨Tx, Ty⟩ / (‖Tx‖‖Ty‖) = ⟨x, y⟩ / (‖x‖‖y‖).    (3.45)

Prove that T is angle preserving if and only if there exists a > 0 such that ‖Tx‖ = a‖x‖ for all x ∈ V. Hint: Use Exercise 3.1(i) for one direction. For the other direction, verify that (3.45) implies that T preserves orthogonality, and then write y as y = proj_x y + r.
3.5. Let V = C([0, 1]; ℝ) have the inner product ⟨f, g⟩ = ∫₀¹ f(x)g(x) dx. Find the projection of eˣ onto the vector x − 1. Hint: Is x − 1 a unit vector?
3.6. Prove the Cauchy-Schwarz inequality by considering the inequality

    0 ≤ ‖x − λy‖²,

where λ = ⟨y, x⟩/⟨y, y⟩.
3.7. Prove the Cauchy-Schwarz inequality using Corollary 3.2.10. Hint: Consider the orthonormal (singleton) set {x/‖x‖}.
3.8. Let V be the inner product space C([- n,n];JR) with inner product

(!, g) = ;1 !7l" f (t)g(t) dt.


- 7J"

Let X = span(S) CV, where S = {cos(t),sin(t),cos(2t),sin(2t)}.


(i) Prove that S is an orthonormal set.
(ii) Compute lltll·
(iii) Compute the projection proh(cos(3t)).
(iv) Compute the projection proh(t).
3.9. Prove that a rotation (2.17) in JR 2 is an orthonormal transformation (with
respect to the usual inner product).
3.10. & Recall the definition of an orthonormal matrix as given in Definition 3.2.14. Assume the usual inner product on $\mathbb{F}^n$. Prove the following statements:
(i) The matrix $Q \in M_n(\mathbb{F})$ is an orthonormal matrix if and only if $Q^HQ = QQ^H = I$.
(ii) If $Q \in M_n(\mathbb{F})$ is an orthonormal matrix, then $\|Qx\| = \|x\|$ for all $x \in \mathbb{F}^n$.
(iii) If $Q \in M_n(\mathbb{F})$ is an orthonormal matrix, then so is $Q^{-1}$.
(iv) The columns of an orthonormal matrix $Q \in M_n(\mathbb{F})$ are orthonormal.
(v) If $Q \in M_n(\mathbb{F})$ is an orthonormal matrix, then $|\det(Q)| = 1$. Is the converse true?
(vi) If $Q_1, Q_2 \in M_n(\mathbb{F})$ are orthonormal matrices, then the product $Q_1Q_2$ is also an orthonormal matrix.

3.11. Describe what happens when we apply the Gram-Schmidt orthonormalization process to a collection of linearly dependent vectors.
3.12. Apply the Gram-Schmidt orthonormalization process to the set $\{[1, 1]^T, [1, 0]^T\} \subset \mathbb{R}^2$; this gives a new, orthonormal basis of $\mathbb{R}^2$. Express the vector $2e_1 + 3e_2$ in terms of this basis using Theorem 3.2.3.

3.13. Apply the Gram-Schmidt orthonormalization process to the set $\{1, x, x^2, x^3\} \subset \mathbb{R}[x]$ with the Chebyshev-1 inner product
$$\langle f, g\rangle = \int_{-1}^{1} \frac{fg}{\sqrt{1-x^2}}\,dx.$$
Hint: Recall the trigonometric identity
$$\cos(u) + \cos(v) = 2\cos\left(\frac{u+v}{2}\right)\cos\left(\frac{u-v}{2}\right).$$

3.14. Prove that for any proper subspace $X \subset V$ of a finite-dimensional inner product space, the projection $\operatorname{proj}_X : V \to X$ is not an orthonormal transformation.
3.15. Let

(i) Use the Gram-Schmidt method to find the QR decomposition of $A$.
(ii) Let $b = [-1\ 6\ 5\ 7]^T$. Use (i) to solve $A^HAx = A^Hb$.

3.16. Prove the following results about the QR decomposition:
(i) The QR decomposition is not unique. Hint: Consider matrices of the form $QD$ and $D^{-1}R$, where $D$ is a diagonal matrix.
(ii) If $A$ is invertible, then there is a unique QR decomposition of $A$ such that $R$ has only positive diagonal elements.
3.17. & Let $A \in M_{m \times n}$ have rank $n \le m$, and let $A = QR$ be a reduced QR decomposition. Prove that solving the system $A^HAx = A^Hb$ is equivalent to solving the system $Rx = Q^Hb$.
3.18. Let $P$ be the plane $2x + y - z = 0$ in $\mathbb{R}^3$.
(i) Compute the orthogonal projection of the vector $v = (1, 1, 1)$ onto the plane $P$.
(ii) Compute the matrix representation (in the standard basis) of the reflection through the plane $P$.
(iii) Compute the reflection of $v$ through the plane $P$.
(iv) Add $v$ to the result of (iii). How does this compare to the result of (i)? Explain.
3.19. Find the two reflections $H_{v_1}$ and $H_{v_2}$ that put the matrix $A$

in upper-triangular form; that is, write $H_{v_2}H_{v_1}A = R$, where $R$ is upper triangular.

3.20. Prove Proposition 3.4.4.


3.21. Prove Lemma 3.4.7.
3.22.* Recall that every rotation around the origin in $\mathbb{R}^2$ can be written in the form (2.17).
(i) Show that the composition of any two Householder reflections in $\mathbb{R}^2$ is a rotation around the origin.
(ii) Prove that any single Householder reflection in $\mathbb{R}^2$ is not a rotation.

3.23. Let $(V, \|\cdot\|)$ be a normed linear space. Prove that $\big|\|x\| - \|y\|\big| \le \|x - y\|$ for all $x, y \in V$. Hint: Prove $\|x\| - \|y\| \le \|x - y\|$ and $\|y\| - \|x\| \le \|x - y\|$.
3.24. Let $C([a,b];\mathbb{F})$ be the vector space of all continuous functions from $[a,b] \subset \mathbb{R}$ to $\mathbb{F}$. Prove that each of the following is a norm on $C([a,b];\mathbb{F})$:
(i) $\|f\|_{L^1} = \int_a^b |f(t)|\,dt$.
(ii) $\|f\|_{L^2} = \left(\int_a^b |f(t)|^2\,dt\right)^{1/2}$.
(iii) $\|f\|_{L^\infty} = \sup_{x \in [a,b]} |f(x)|$.
3.25. Prove Proposition 3.5.9.
3.26. & Two norms $\|\cdot\|_a$ and $\|\cdot\|_b$ on the vector space $X$ are topologically equivalent if there exist constants $0 < m \le M$ such that
$$m\|x\|_a \le \|x\|_b \le M\|x\|_a \quad \text{for all } x \in X.$$
Prove that topological equivalence is an equivalence relation. Then prove that the $p$-norms for $p = 1, 2, \infty$ on $\mathbb{F}^n$ are topologically equivalent by establishing the following inequalities:
(i) $\|x\|_2 \le \|x\|_1 \le \sqrt{n}\,\|x\|_2$.
(ii) $\|x\|_\infty \le \|x\|_2 \le \sqrt{n}\,\|x\|_\infty$.
Hint: Use the Cauchy-Schwarz inequality.
The idea of topological equivalence is especially important in Chapter 5.
3.27. Complete the proof of Theorem 3.5.20 by showing (3.27).
3.28. Let $A$ be an $n \times n$ matrix. Prove that the operator $p$-norms are topologically equivalent for $p = 1, 2, \infty$ by establishing the following inequalities:
(i) $\frac{1}{\sqrt{n}}\|A\|_2 \le \|A\|_1 \le \sqrt{n}\,\|A\|_2$.
(ii) $\frac{1}{\sqrt{n}}\|A\|_\infty \le \|A\|_2 \le \sqrt{n}\,\|A\|_\infty$.
3.29. Take $\mathbb{F}^n$ with the 2-norm, and let the norm on $M_n(\mathbb{F})$ be the corresponding induced norm. Prove that any orthonormal matrix $Q \in M_n(\mathbb{F})$ has $\|Q\| = 1$. For any $x \in \mathbb{F}^n$, let $R_x : M_n(\mathbb{F}) \to \mathbb{F}^n$ be the linear transformation $A \mapsto Ax$. Prove that the induced norm of the transformation $R_x$ is equal to $\|x\|_2$. Hint: First prove $\|R_x\| \le \|x\|_2$. Then recall that by Gram-Schmidt, any vector $x$ with norm $\|x\|_2 = 1$ is part of an orthonormal basis, and hence is the first column of an orthonormal matrix. Use this to prove equality.
3.30.* Let $S \in M_n(\mathbb{F})$ be an invertible matrix. Given any matrix norm $\|\cdot\|$ on $M_n$, define $\|\cdot\|_S$ by $\|A\|_S = \|SAS^{-1}\|$. Prove that $\|\cdot\|_S$ is a matrix norm on $M_n$.

3.31. Prove that in Young's inequality (3.30), equality holds if and only if $a^p = b^q$.
3.32. Prove that for every $a, b \ge 0$ and every $\varepsilon > 0$, we have

(3.46)

3.33. Prove that if $\theta \ne 0, 1$, then equality holds in the arithmetic-geometric mean inequality if and only if $x = y$.
3.34. Let $(X_1, \|\cdot\|_{X_1}), \ldots, (X_n, \|\cdot\|_{X_n})$ be normed linear spaces, and let $X = X_1 \times \cdots \times X_n$ be the Cartesian product. For any $x = (x_1, \ldots, x_n) \in X$ define
$$\|x\|_p = \begin{cases} \left(\sum_{i=1}^{n} \|x_i\|_{X_i}^p\right)^{1/p} & \text{if } p \in [1, \infty), \\ \sup_i \|x_i\|_{X_i} & \text{if } p = \infty. \end{cases}$$
For every $p \in [1, \infty]$ prove that $\|\cdot\|_p$ is a norm on $X$. Hint: Adapt the proof of Minkowski's inequality.
Note that if $X_i = \mathbb{F}$ for every $i$, then $X = \mathbb{F}^n$ and $\|\cdot\|_p$ is the usual $p$-norm on $\mathbb{F}^n$.
3.35.† Suppose that $x, y \in \mathbb{R}^n$ and $p, q, r \ge 1$ are such that $\frac{1}{p} + \frac{1}{q} = \frac{1}{r}$. Prove that

Hint: Note that
$$\frac{1}{(p/r)} + \frac{1}{(q/r)} = 1.$$
3.36. Use the arithmetic-geometric mean inequality to prove that of all rectangles
with a fixed area, the square is the only rectangle with the least perimeter.

3.37. Let $V = \mathbb{R}[x; 2]$ be the space of polynomials of degree at most two, which is a subspace of the inner product space $L^2([0,1];\mathbb{R})$. Let $L : V \to \mathbb{R}$ be the linear functional given by $L[p] = p'(1)$. Find the unique $q \in V$ such that $L[p] = \langle q, p\rangle$, as guaranteed by the Riesz representation theorem. Hint: Look at the discussion just before Theorem 3.7.1.
3.38. Let $V = \mathbb{F}[x; 2]$, which is a subspace of the inner product space $L^2([0,1];\mathbb{R})$. Let $D$ be the derivative operator $D : V \to V$; that is, $D[p](x) = p'(x)$. Write the matrix representation of $D$ with respect to the power basis $[1, x, x^2]$ of $\mathbb{F}[x; 2]$. Write the matrix representation of the adjoint of $D$ with respect to this basis.
3.39. & Prove Proposition 3.7.12.
3.40. Let $M_n(\mathbb{F})$ be endowed with the Frobenius inner product (see Example 3.1.7). Any $A \in M_n(\mathbb{F})$ defines a linear operator on $M_n(\mathbb{F})$ by left multiplication: $B \mapsto AB$.
(i) Show that $A^* = A^H$.
(ii) Show that for any $A_1, A_2, A_3 \in M_n(\mathbb{F})$ we have $\langle A_2, A_3A_1\rangle = \langle A_2A_1^H, A_3\rangle$. Hint: Recall $\operatorname{tr}(AB) = \operatorname{tr}(BA)$.

(iii) Let $A \in M_n(\mathbb{F})$. Define the linear operator $T_A : M_n(\mathbb{F}) \to M_n(\mathbb{F})$ by $T_A(X) = AX - XA$, and show that $(T_A)^* = T_{A^*}$.

3.41. Let
$$A = \begin{bmatrix} 1 & 1 & 1 & 0 \\ 0 & 0 & 0 & 0 \\ 2 & 2 & 2 & 0 \end{bmatrix}.$$
What are the four fundamental subspaces of $A$?


3.42. For the linear operator in Exercise 3.38, describe the four fundamental subspaces.
3.43. Prove (3.39) in the fundamental subspaces theorem.
3.44. Given $A \in M_{m \times n}(\mathbb{F})$ and $b \in \mathbb{F}^m$, prove the Fredholm alternative: Either $Ax = b$ has a solution $x \in \mathbb{F}^n$ or there exists $y \in \mathscr{N}(A^H)$ such that $\langle y, b\rangle \ne 0$.
3.45. Consider the vector space $M_n(\mathbb{R})$ with the Frobenius inner product (3.5). Show that $\operatorname{Sym}_n(\mathbb{R})^\perp = \operatorname{Skew}_n(\mathbb{R})$. (See Exercise 1.18 for the definition of Sym and Skew.)

3.46. Prove the following for an $m \times n$ matrix $A$:
(i) If $x \in \mathscr{N}(A^HA)$, then $Ax$ is in both $\mathscr{R}(A)$ and $\mathscr{N}(A^H)$.
(ii) $\mathscr{N}(A^HA) = \mathscr{N}(A)$.
(iii) $A$ and $A^HA$ have the same rank.
(iv) If $A$ has linearly independent columns, then $A^HA$ is nonsingular.
3.47. Assume $A$ is an $m \times n$ matrix of rank $n$. Let $P = A(A^HA)^{-1}A^H$. Prove the following:
(i) $P^2 = P$.
(ii) $P^H = P$.
(iii) $\operatorname{rank}(P) = n$.
Whenever a linear operator satisfies $P^2 = P$, it is called a projection. Projections are treated in detail in Section 12.1.
3.48. Consider the vector space $M_n(\mathbb{R})$ with the Frobenius inner product (3.5). Let $P : M_n(\mathbb{R}) \to M_n(\mathbb{R})$ be the map $P(A) = \frac{A + A^T}{2}$. Prove the following:
(i) $P$ is linear.
(ii) $P^2 = P$.
(iii) $P^* = P$ (note that $*$ here means the adjoint with respect to the Frobenius inner product).
(iv) $\mathscr{N}(P) = \operatorname{Skew}_n(\mathbb{R})$.
(v) $\mathscr{R}(P) = \operatorname{Sym}_n(\mathbb{R})$.
(vi) $\|A - P(A)\|_F = \sqrt{\frac{\operatorname{tr}(A^TA) - \operatorname{tr}(A^2)}{2}}$. Here $\|\cdot\|_F$ is the norm with respect to the Frobenius inner product.
Hint: Recall that $\operatorname{tr}(AB) = \operatorname{tr}(BA)$ and $\operatorname{tr}(A) = \operatorname{tr}(A^T)$.

3.49. Show that if $A \in M_{m \times n}$, if $b \in \mathbb{F}^m$, and if $r = b - \operatorname{proj}_{\mathscr{R}(A)} b$, then $x \in \mathbb{F}^n$ is a least squares solution to the linear system $Ax = b$ if and only if $[x, r]^T$ is a solution of the equation

3.50. Let $(x_i, y_i)_{i=1}^{n}$ be a collection of data points that we have reason to believe should lie (roughly) on an ellipse of the form $rx^2 + sy^2 = 1$. We wish to find the least squares approximation for $r$ and $s$. Write $A$, $x$, and $b$ for the corresponding normal equation in terms of the data $x_i$ and $y_i$ and the unknowns $r$ and $s$.

Notes
Sources for the infinite-dimensional cases of the results of this chapter include
[Pro08, Con90, Rud91]. For more on the QR decomposition and the Householder
algorithm, see [TB97, Part II]. For details on the stability of Gaussian elimination,
as discussed in Remark 3.3.13, see [TB97, Sect. 22].
Spectral Theory

I'm definitely on the spectrum of socially awkward.


-Mayim Bialik

Spectral theory describes how to decouple the domain of a linear operator into a
direct sum of minimal components upon which the operator is invariant. Choosing a
basis that respects this direct sum results in a corresponding block-diagonal matrix
representation. This has powerful consequences in applications, as it allows many
problems to be reduced to a series of small individual parts that can be solved
independently and more easily. The key tools for constructing this decomposition
are eigenvalues and eigenvectors, which are widely used in many areas of science and
engineering and have many important physical interpretations and applications.
For example, they are used to describe the normal modes of vibration in
engineered systems such as musical instruments, electrical motors, and even static
structures, like bridges and skyscrapers. In quantum mechanics, they describe the
possible energy states of an electron or particle; in particular, the atomic orbitals
one learns about in chemistry are just eigenvectors of the Hamiltonian operator for
a hydrogen atom. Eigenvalues and eigenvectors are fundamental to control theory
applications ranging from cruise control on a car to the automated control systems
that guide missiles and fly unmanned air vehicles (UAVs) or drones.
Spectral theory is also widely used in the information sciences. For example,
eigenvalues and eigenvectors are the key to Google's PageRank algorithm. They
can be used in data compression, which in turn is essential for reducing both the
complexity and dimensionality in problems like facial recognition, intelligence and
personality testing, and machine learning. Eigenvalues and eigenvectors are also
useful for decomposing a graph into clusters, which has applications ranging from
image segmentation to identifying communities on social media. In short, spectral
theory is essential to applied mathematics.
In this chapter, we restrict ourselves to the spectral theory of linear operators
on finite-dimensional vector spaces. While it is possible to extend spectral theory
to infinite-dimensional spaces, the mathematical sophistication required is beyond
the scope of this text and is more suitable for a course in functional analysis.


Most finite-dimensional linear operators have eigenvectors that span the space
(hereafter called an eigenbasis) where the corresponding matrix representation is
diagonal. In other words, a change of basis to the eigenbasis shows that such a
matrix operator is similar to a diagonal matrix. That said, not all matrices can be
diagonalized, and some can only be made block diagonal. In the first sections of this
chapter, we develop this theory and expound on when a matrix can be diagonalized
and when we must settle for block diagonalization.
One of the most important and useful results of linear algebra and spectral
theory specifically is Schur's lemma, which states that every matrix operator is
similar to an upper-triangular matrix, and the transition matrix used to perform
the similarity transformation is an orthonormal matrix. Any such upper-triangular
matrix is called the Schur form 24 of the operator. The Schur form is well suited to
numerical computation, and, as we show in this chapter, it also has real theoretical
significance and allows for some very nice proofs of important theorems.
Some matrices have special structure that can make them easier to under-
stand, better behaved, or otherwise more useful than arbitrary matrices. Among
the most important of these special matrices are the normal matrices, which are
characterized by having an orthonormal eigenbasis. This allows a normal matrix to
be diagonalized by an orthonormal transition matrix. Among the normal matrices,
the Hermitian (self-adjoint) matrices are especially important. In this chapter we
define and discuss the properties of these and other special classes of matrices.
Finally, the last part of this chapter is devoted to the celebrated singular value
decomposition (SVD), which allows any matrix to be separated into parts that
describe explicitly how it acts on its fundamental subspaces. This is an essential
and very powerful result that you will use over and over, in many different ways.

4.1 Eigenvalues and Eigenvectors


Eigenvectors of an operator are the directions in which the operator acts by simply
rescaling; 25 the corresponding eigenvalues are the scaling factors. Together the eigen-
values and eigenvectors provide a nice way of understanding the action of the
operator; for example, they allow us to write the operator in block-diagonal form.
The eigenvalues of a finite-dimensional operator correspond to the roots of a particular polynomial, called the characteristic polynomial. For a given eigenvalue $\lambda$, the corresponding eigenvectors generate a subspace called the $\lambda$-eigenspace of the operator. When restricted to the eigenspace, the operator just acts as scalar multiplication by $\lambda$.

Definition 4.1.1. Let $L : V \to V$ be a linear operator on a finite-dimensional vector space $V$ over $\mathbb{C}$. A scalar $\lambda \in \mathbb{C}$ is an eigenvalue of $L$ if there exists a nonzero $x \in V$ such that
$$L(x) = \lambda x. \tag{4.1}$$
24 Another commonly seen reduced form is the Jordan canonical form, which is nonzero only on the diagonal and the superdiagonal. Because the Jordan canonical form has a very nice structure, it is commonly seen in textbooks, but it has only limited use in real-world applications because of inherent computational inaccuracies with floating-point arithmetic. For this reason the Schur form is preferred in most applications. Another important and powerful reduced form is the spectral decomposition, which we discuss in Chapter 12.
25 This includes reflections, in which eigenvectors are rescaled by a negative number.

Any nonzero $x$ satisfying (4.1) is called an eigenvector of $L$ corresponding to the eigenvalue $\lambda$. For each scalar $\lambda$, we define the $\lambda$-eigenspace of $L$ as
$$\Sigma_\lambda(L) = \{x \in V \mid L(x) = \lambda x\}. \tag{4.2}$$
The dimension of the $\lambda$-eigenspace $\Sigma_\lambda(L)$ is called the geometric multiplicity of $\lambda$. If $\lambda$ is an eigenvalue of $L$, then $\Sigma_\lambda(L)$ is nontrivial and contains all of its corresponding eigenvectors; otherwise, $\Sigma_\lambda(L) = \{0\}$. The set of all eigenvalues of $L$, denoted $\sigma(L)$, is called the spectrum of $L$. The complement of the spectrum, which is the set of all scalars $\lambda$ for which $\Sigma_\lambda(L) = \{0\}$, is called the resolvent set and is denoted $\rho(L)$.

Remark 4.1.2. In addition to all the eigenvectors corresponding to $\lambda$, the $\lambda$-eigenspace $\Sigma_\lambda$ always contains the zero vector $0$, which is not an eigenvector.

Remark 4.1.3. The definitions of eigenvalues and eigenvectors given above only apply to finite-dimensional operators on complex vector spaces, that is, for matrices with complex entries. For matrices with real entries, we simply think of the real entries as complex numbers. In other words, we use the obvious inclusion $M_n(\mathbb{R}) \subset M_n(\mathbb{C})$ and define eigenvalues and eigenvectors for the corresponding matrices in $M_n(\mathbb{C})$.

The next proposition follows immediately from the definition of $\Sigma_\lambda(L)$.

Proposition 4.1.4. Let $L : V \to V$ be a linear operator on a finite-dimensional vector space $V$ over $\mathbb{C}$. For any $\lambda \in \mathbb{C}$, the eigenspace $\Sigma_\lambda(L)$ satisfies
$$\Sigma_\lambda(L) = \mathscr{N}(L - \lambda I). \tag{4.3}$$

Nota Bene 4.1.5. Proposition 4.1.4 motivates the traditional method of finding eigenvectors as taught in an elementary linear algebra course. Given that $\lambda$ is an eigenvalue of the matrix $A$, you can find the corresponding eigenvectors by solving the linear system $(A - \lambda I)x = 0$. For a worked example see Example 4.1.12.

Eigenvalues depend only on the linear transformation $L$ and not on a particular choice of basis or matrix representation of $L$. In other words, they are invariant under similarity transformations. Eigenvectors also depend only on the linear transformation, but their representation changes according to the choice of basis. More precisely, let $S$ be a basis of a finite-dimensional vector space $V$, and let $L : V \to V$ be a linear operator with matrix representation $A$ on $S$. For any eigenvalue $\lambda$ of $L$ with eigenvector $x$, we have that (4.1) can be written as
$$A[x]_S = \lambda[x]_S.$$

However, if we choose a different basis $T$ on $V$ and denote the corresponding matrix representation of $L$ by $B$, then the relation between the representations is given, as always, by the transition matrix $C_{TS}$. Thus, we have $[x]_T = C_{TS}[x]_S$ and $B = C_{TS}A(C_{TS})^{-1}$. This implies that
$$B[x]_T = C_{TS}A(C_{TS})^{-1}C_{TS}[x]_S = C_{TS}A[x]_S = \lambda C_{TS}[x]_S = \lambda[x]_T.$$

Conversely, any matrix $A \in M_n(\mathbb{F})$ defines a linear operator on $\mathbb{C}^n$ with the standard basis. We say that $\lambda \in \mathbb{C}$ is an eigenvalue of the matrix $A$ and that the nonzero column vector $x = [x_1 \cdots x_n]^T \in \mathbb{C}^n$ is a corresponding eigenvector of the matrix $A$ if $Ax = \lambda x$. If $B = PAP^{-1}$ is similar to $A$, then $P$ is the transition matrix into a new basis of $\mathbb{C}^n$, and again we have $BPx = PAP^{-1}Px = \lambda Px$, so $\lambda$ is also an eigenvalue of the matrix $B$, and $Px$ is the corresponding eigenvector.

Remark 4.1.6. The previous discussion shows that the spectral theory of a linear
operator on a finite-dimensional vector space and the spectral theory of any associ-
ated matrix representation of that operator are essentially the same. But since the
representation of the eigenvectors depends on a choice of basis, we will, from this
point forth, phrase our study in terms of matrices- that is, in terms of a specific
choice of basis.
Unless otherwise stated, we assume throughout the remainder of this chapter
that L is a linear operator on a finite-dimensional vector space V and that A is its
matrix representation for a given basis S.

Example 4.1.7. Let
$$A = \begin{bmatrix} 1 & 3 \\ 4 & 2 \end{bmatrix} \quad \text{and} \quad x = \begin{bmatrix} -1 \\ 1 \end{bmatrix}.$$
We verify that $Ax = -2x$ and thus conclude that $\lambda = -2$ is an eigenvalue of $A$ with corresponding eigenvector $x$.

Theorem 4.1.8. Let $A \in M_n(\mathbb{F})$ and $\lambda \in \mathbb{C}$. The following are equivalent:
(i) $\lambda$ is an eigenvalue of $A$.
(ii) There is a nonzero $x$ such that $(\lambda I - A)x = 0$.
(iii) $\Sigma_\lambda(A) \ne \{0\}$.
(iv) $\lambda I - A$ is singular.
(v) $\det(\lambda I - A) = 0$.

Proof. That (i) is equivalent to (ii) follows immediately by subtracting $\lambda x$ from both sides of the defining equation $Ax = \lambda x$. That these are equivalent to (iii) follows from Proposition 4.1.4. The fact that these are equivalent to (iv) and (v) follows from Lemma 2.3.3 and Corollary 2.9.6, respectively. □

Definition 4.1.9. Let $A \in M_n(\mathbb{F})$. The polynomial
$$p_A(z) = \det(zI - A) \tag{4.4}$$
is called the characteristic polynomial of $A$. When it is clear from the context, we sometimes just write $p(z)$ instead of $p_A(z)$.

Remark 4.1.10. According to Theorem 4.1.8, the scalar $\lambda$ is an eigenvalue of $A$ if and only if $p_A(\lambda) = 0$. Hence, the problem of determining the spectrum of a given matrix is equivalent to locating the roots of its characteristic polynomial.

Proposition 4.1.11. The characteristic polynomial (4.4) is of degree $n$ in $z$ and can be factored over $\mathbb{C}$ as
$$p(z) = \prod_{j=1}^{r}(z - \lambda_j)^{m_j}, \tag{4.5}$$
where the $\lambda_j$ are the distinct roots of $p(z)$. The collection of positive integers $(m_j)_{j=1}^{r}$ satisfy $m_1 + \cdots + m_r = n$ and are called the algebraic multiplicities of the eigenvalues $(\lambda_j)_{j=1}^{r}$, respectively.26

Proof. Consider the matrix-valued function $B(z) = [b_{ij}] = zI - A$. Since $z$ occurs only on the diagonal of $B(z)$, the degree of an elementary product of $\det(B(z))$ is less than $n$ when the permutation is not the identity and is equal to $n$ when it is the identity. Thus, the characteristic polynomial $p(z)$, which is the sum of the signed elementary products, has degree $n$. Moreover, by the fundamental theorem of algebra (Theorem 15.3.15), a monic27 degree-$n$ polynomial can be factored into the form (4.5). □

Example 4.1.12. The matrix $A$ from Example 4.1.7 has the characteristic polynomial
$$p(\lambda) = \det(\lambda I - A) = \begin{vmatrix} \lambda - 1 & -3 \\ -4 & \lambda - 2 \end{vmatrix} = \lambda^2 - 3\lambda - 10 = (\lambda + 2)(\lambda - 5).$$
Thus, the spectrum is $\sigma(A) = \{-2, 5\}$.
Once the spectrum is known, the corresponding eigenspaces can be found. Recall from Proposition 4.1.4 that $\Sigma_\lambda(A) = \mathscr{N}(A - \lambda I)$. Hence,
$$\Sigma_5(A) = \mathscr{N}(A - 5I) = \mathscr{N}\left(\begin{bmatrix} -4 & 3 \\ 4 & -3 \end{bmatrix}\right).$$
26 Of course, the polynomial $p(z)$ does not necessarily factor into linear terms over $\mathbb{R}$, since $\mathbb{R}$ is not algebraically closed, but it always factors completely over $\mathbb{C}$.
27 A polynomial $p$ is monic if the coefficient of the term of highest degree is one. For example, $x^2 + 3$ is monic, but $7x^2 + 1$ is not monic.



To find an eigenvector in $\Sigma_5(A)$, just find the values of $x$ and $y$ such that
$$\begin{bmatrix} -4 & 3 \\ 4 & -3 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}.$$
A short calculation shows that the eigenspace is $\operatorname{span}\{[3\ 4]^T\}$.
A similar argument gives $\Sigma_{-2}(A) = \mathscr{N}(A + 2I) = \operatorname{span}\{[-1\ 1]^T\}$.
Note that the geometric and algebraic multiplicities of the eigenvalues of $A$ happen to be the same, but this is not true in general.
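The spectrum and eigenvectors in this example are easy to check numerically. The following is a minimal NumPy sketch (numpy.linalg.eig returns unit-norm eigenvectors, so the computed columns are scalar multiples of the eigenvectors above):

    import numpy as np

    A = np.array([[1., 3.],
                  [4., 2.]])
    evals, evecs = np.linalg.eig(A)
    print(evals)    # approximately [-2.  5.]
    print(evecs)    # columns proportional to [-1, 1] and [3, 4]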

Example 4.1.13. It is possible to have complex-valued eigenvalues even when the matrix is real-valued. For example, if
$$A = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix},$$
then $p(\lambda) = \lambda^2 + 1$, and so $\sigma(A) = \{i, -i\}$. By solving for the eigenvectors, we have that $\Sigma_{\pm i}(A) = \mathscr{N}(\pm iI - A) = \operatorname{span}\{[1\ \mp i]^T\}$. Note again that the geometric and algebraic multiplicities of the eigenvalues of $A$ are the same.

Example 4.1.14. Now consider the matrix
$$A = \begin{bmatrix} 3 & 0 \\ 0 & 3 \end{bmatrix}.$$
Note that $p(\lambda) = (\lambda - 3)^2$, so $\sigma(A) = \{3\}$ and $\mathscr{N}(A - 3I) = \operatorname{span}\{e_1, e_2\}$. Thus, the algebraic and geometric multiplicities of $\lambda = 3$ are equal to two.

Remark 4.1.15. Despite the previous examples, the geometric and algebraic mul-
tiplicities are not always the same, as the next example shows.

Example 4.1.16. Consider the matrix
$$A = \begin{bmatrix} 3 & 1 \\ 0 & 3 \end{bmatrix}.$$
Note that $\sigma(A) = \{3\}$, yet $\mathscr{N}(A - 3I) = \operatorname{span}\{e_1\}$. Thus, the algebraic multiplicity of $\lambda = 3$ is two, yet the geometric multiplicity is one.

Example 4.1.17. The matrix

has characteristic polynomial
$$p(\lambda) = \det(\lambda I - A) = \lambda^3 - \lambda^2 - 5\lambda - 3 = (\lambda - 3)(\lambda + 1)^2,$$
and so $\sigma(A) = \{3, -1\}$. The corresponding eigenspaces are $\Sigma_3 = \mathscr{N}(A - 3I) = \operatorname{span}([2\ 1\ 2]^T)$ and $\Sigma_{-1} = \operatorname{span}([2\ {-1}\ {-2}]^T)$. Thus, the algebraic multiplicity of $\lambda = -1$ is two, but the geometric multiplicity is only one.

All finite-dimensional operators (over $\mathbb{C}$) have eigenvalues, but not all operators on infinite-dimensional spaces do. In particular, consider the following example.

Unexample 4.1.18. The definition of eigenvalue extends in an obvious way to operators on infinite-dimensional spaces, but not all operators on infinite-dimensional spaces have eigenvalues. In $\ell^\infty$ the right-shift operator $T : \ell^\infty \to \ell^\infty$ given by $(a_1, a_2, a_3, \ldots) \mapsto (0, a_1, a_2, \ldots)$ has no eigenvalue. That is, no choice of $\lambda \in \mathbb{C}$ and $v \in \ell^\infty$ satisfy the relation $Tv = \lambda v$.

Application 4.1.19. An important partial differential equation is the Laplace equation, which (in one dimension) takes the form
$$u''(x) = f(x),$$
where $f$ is a given function. One numerical way to find $u(x)$ in this equation involves solving a linear equation $Ax = b$, where $A$ is a tridiagonal $n \times n$ matrix of the form
$$A = \begin{bmatrix} b & a & 0 & \cdots & \cdots & 0 \\ c & b & a & 0 & & \vdots \\ 0 & c & b & a & \ddots & \vdots \\ \vdots & \ddots & \ddots & \ddots & \ddots & 0 \\ \vdots & & 0 & c & b & a \\ 0 & \cdots & \cdots & 0 & c & b \end{bmatrix}. \tag{4.6}$$
The eigenvalues and eigenvectors of the matrix are also important when solving this problem.

The eigenvalues of $A$ are $\lambda_k = b + 2\sqrt{ac}\cos\omega_{1,k}$ with eigenvectors $x_k = [\rho\sin\omega_{1,k}\ \ \cdots\ \ \rho^n\sin\omega_{n,k}]^T$, where $\omega_{j,k} = \frac{jk\pi}{n+1}$ and $\rho = \left(\frac{c}{a}\right)^{1/2}$. This can be verified by setting $0 = (A - \lambda I)x_k$, which gives, for each $j$, that
$$\begin{aligned}
0 &= c\rho^{j-1}\sin\omega_{j-1,k} + (b-\lambda)\rho^{j}\sin\omega_{j,k} + a\rho^{j+1}\sin\omega_{j+1,k} \\
&= \rho^{j}\left(\sqrt{ac}\sin\omega_{j-1,k} + (b-\lambda)\sin\omega_{j,k} + \sqrt{ac}\sin\omega_{j+1,k}\right) \\
&= \rho^{j}\sin\omega_{j,k}\left(2\sqrt{ac}\cos\omega_{1,k} + (b-\lambda)\right).
\end{aligned}$$
Thus, $\lambda_k$ is an eigenvalue with eigenvector $x_k$, since these equations all hold when $\lambda = \lambda_k$.
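A quick numerical check of the eigenvalue formula is straightforward; the sketch below uses arbitrarily chosen values of a, b, c, and n (purely illustrative, not from the text) and compares the formula with the eigenvalues NumPy computes for the tridiagonal matrix:

    import numpy as np

    a, b, c, n = 1.0, 4.0, 2.0, 6   # illustrative values
    A = (np.diag(b * np.ones(n))
         + np.diag(a * np.ones(n - 1), 1)
         + np.diag(c * np.ones(n - 1), -1))

    k = np.arange(1, n + 1)
    formula = b + 2 * np.sqrt(a * c) * np.cos(k * np.pi / (n + 1))

    computed = np.linalg.eigvals(A)
    # The eigenvalues are real here (ac > 0); compare sorted values.
    print(np.allclose(np.sort(formula), np.sort(computed.real)))   # True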

Remark 4.1.20. If $A$ and $B$ are similar matrices, that is, $A = PBP^{-1}$ for some nonsingular $P$, then $A$ and $B$ define the same operator on $\mathbb{F}^n$, but in different bases related by $P$. Since the determinant and eigenvalues are determined only by a linear operator, and not by its matrix representation, the following proposition is immediate.

Proposition 4.1.21. If $A, B \in M_n(\mathbb{F})$ are similar matrices, that is, $B = P^{-1}AP$ for some nonsingular matrix $P$, then the following hold:
(i) $A$ and $B$ have the same characteristic polynomials.
(ii) $A$ and $B$ have the same eigenvalues.
(iii) If $\lambda$ is an eigenvalue of $A$ and $B$, then $P : \Sigma_\lambda(B) \to \Sigma_\lambda(A)$ is an isomorphism, and $\dim\Sigma_\lambda(A) = \dim\Sigma_\lambda(B)$.

Proof. This follows from Remark 4.1.20, but can also be seen algebraically from the following computation. For any scalar $z$ we have
$$(zI - B) = (zI - P^{-1}AP) = P^{-1}(zI - A)P.$$
This implies
$$\det(zI - B) = \det(P^{-1}(zI - A)P) = \det(P^{-1})\det(zI - A)\det(P) = \det(zI - A).$$
Moreover, for any $x \in \Sigma_\lambda(B)$ we know
$$0 = P0 = P(\lambda I - B)x = PP^{-1}(\lambda I - A)Px = (\lambda I - A)Px.$$
So the bijection $P$ maps the subspace $\Sigma_\lambda(B)$ injectively into the subspace $\Sigma_\lambda(A)$. The same argument in reverse shows that $P^{-1}$ maps $\Sigma_\lambda(A)$ into $\Sigma_\lambda(B)$, so they must also be mapped surjectively onto one another, and $P$ is an isomorphism of these subspaces, as required. □

Finally, we conclude with two observations that follow immediately from the results of this section but are nevertheless very useful in many settings.

Proposition 4.1.22. The diagonal entries of an upper-triangular (or a lower-triangular) matrix are its eigenvalues.

Proof. See Exercise 4.6. □

Proposition 4.1.23. A matrix $A \in M_n(\mathbb{F})$ and its transpose $A^T$ have the same characteristic polynomial.

Proof. See Exercise 4.7. □

4.2 Invariant Subspaces


A subspace is invariant under a given linear operator if all the vectors in the subspace
map into vectors in the subspace. More precisely, a subspace is invariant under the
linear operator if the domain and codomain of the operator can be restricted to
that subspace yielding an operator on the subspace. In this section, we examine the
properties of invariant subspaces and show that they can provide a canonical matrix
representation for a given linear operator. An important example of an invariant
subspace is an eigenspace.
Throughout this section, assume that L : V ---+ V is a linear operator on the
finite-dimensional vector space V of dimension n .

Definition 4.2.1. A subspace $W \subset V$ is invariant under $L$ if $L(W) \subset W$. Alternatively, we say that $W$ is $L$-invariant.

Nota Bene 4.2.2. Beware that when we say "$W$ is $L$-invariant," it does not mean each vector in $W$ is fixed by $L$. Rather it means that $W$, as a space, is mapped to itself. Thus, a vector in $W$ could certainly be mapped to a different vector in $W$, but it will not be sent to a vector outside of $W$.

Remark 4.2.3. For a vector space $V$ and operator $L$ on $V$, it is easy to see that $\{0\}$ and $V$ are invariant. But these are not useful; we are really only interested in proper, nontrivial invariant subspaces.

Example 4.2.4. Consider the double-derivative operator $L$ on $\mathbb{F}[x]$ given by $L[p](x) = p''(x)$. The subspace $W = \operatorname{span}\{1, x^2, x^4, x^6, \ldots\} \subset \mathbb{F}[x]$ is $L$-invariant because each basis vector is mapped to $W$ by $L$; that is,
$$L[x^{2n}] = 2n(2n-1)x^{2n-2} \in W.$$
You should convince yourself (and prove) that when determining whether a subspace is invariant, it is sufficient to check the basis vectors.

Example 4.2.5. The kernel $\mathscr{N}(L)$ of a linear operator $L$ is always $L$-invariant because $L(\mathscr{N}(L)) = \{0\} \subset \mathscr{N}(L)$.

Theorem 4.2.6. If $W$ is an $L$-invariant subspace of $V$ with basis $S = [s_1, \ldots, s_k]$, then $L$ restricted to $W$ is a linear operator on $W$. Let $A_{11}$ denote the matrix representation of $L$ restricted to $W$, expressed in terms of the basis $S$. There exists a basis $S'$ of $V$, containing $S$, such that the matrix representation $A$ of $L$ on the basis $S'$ is of the form
$$A = \begin{bmatrix} A_{11} & A_{12} \\ 0 & A_{22} \end{bmatrix}. \tag{4.7}$$

Proof. By Corollary 1.4.5, there exists a set $T = [t_{k+1}, t_{k+2}, \ldots, t_n]$ such that $S' = S \cup T$ is a basis for $V$. Let $A = [a_{ij}]$ be the unique matrix representation of $L$ on $S'$. Since $W$ is invariant, the image of each basis element of $S$ can be uniquely represented as a linear combination of elements of $S$. Specifically, we have that $L(s_j) = \sum_{i=1}^{k} a_{ij}s_i$ for $j = 1, \ldots, k$. Note that $a_{ij} = 0$ when $i > k$ and $j \le k$. The image $L(t_j)$ of each vector in $T$ can be expressed as a linear combination of elements of $S'$. Thus, we have $L(t_j) = \sum_{i=1}^{k} a_{ij}s_i + \sum_{i=k+1}^{n} a_{ij}t_i$ for $j = k+1, \ldots, n$. Thus, (4.7) holds, where
$$A_{11} = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1k} \\ a_{21} & a_{22} & \cdots & a_{2k} \\ \vdots & & & \vdots \\ a_{k1} & a_{k2} & \cdots & a_{kk} \end{bmatrix}$$
is the unique matrix representation of $L$ on $W$ in the basis $S$. □

Example 4.2.7. Let
$$A = \begin{bmatrix} 2 & 3 \\ -1 & -2 \end{bmatrix}$$
be the matrix representation of the linear transformation $L : \mathbb{R}^2 \to \mathbb{R}^2$ in the standard basis. It is easy to verify that $\operatorname{span}([3\ {-1}]^T)$ is a one-dimensional $L$-invariant subspace. Choosing any other vector, say $[1\ 0]^T$, that is not in the span of $[3\ {-1}]^T$ gives a new basis $\{[3\ {-1}]^T, [1\ 0]^T\}$. The corresponding change-of-basis matrices are given by
$$P = \begin{bmatrix} 3 & 1 \\ -1 & 0 \end{bmatrix} \quad \text{and} \quad P^{-1} = \begin{bmatrix} 0 & -1 \\ 1 & 3 \end{bmatrix}.$$
Thus, the representation of $L$ in the new basis is given by
$$P^{-1}AP = \begin{bmatrix} 1 & 1 \\ 0 & -1 \end{bmatrix},$$
which is upper triangular.

Remark 4.2.8. The span of an eigenvector is an invariant subspace. Moreover, any eigenspace is also invariant. A finite-dimensional linear operator has a one-dimensional invariant subspace (corresponding to an eigenvector), but this is not necessarily true for infinite-dimensional spaces; see Exercise 4.9.

Theorem 4.2.9. Let $W_1$ and $W_2$ be $L$-invariant complementary subspaces of $V$. If $W_1$ and $W_2$ have bases $S_1 = [x_1, \ldots, x_k]$ and $S_2 = [x_{k+1}, \ldots, x_n]$, respectively, then the matrix representation $A$ of $L$ on the combined basis $S = S_1 \cup S_2$ is block diagonal; that is,
$$A = \begin{bmatrix} A_{11} & 0 \\ 0 & A_{22} \end{bmatrix}, \tag{4.8}$$
where $A_{11}$ and $A_{22}$ are the matrix representations of $L$ restricted to $W_1$ and $W_2$ in the bases $S_1$ and $S_2$, respectively.

Proof. Let $A = [a_{ij}]$ be the matrix representation of $L$ on $S$. Since $W_1$ and $W_2$ are invariant, the map of each basis element of $S$ can be uniquely represented as a linear combination in its respective subspace. Specifically, we have $L(x_j) = \sum_{i=1}^{k} a_{ij}x_i \in W_1$ for $j = 1, \ldots, k$ and $L(x_j) = \sum_{i=k+1}^{n} a_{ij}x_i \in W_2$ for $j = k+1, \ldots, n$. Thus, the other $a_{ij}$ terms are zero, and (4.8) holds, where
$$A_{11} = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1k} \\ a_{21} & a_{22} & \cdots & a_{2k} \\ \vdots & & & \vdots \\ a_{k1} & a_{k2} & \cdots & a_{kk} \end{bmatrix} \quad \text{and} \quad A_{22} = \begin{bmatrix} a_{k+1,k+1} & a_{k+1,k+2} & \cdots & a_{k+1,n} \\ a_{k+2,k+1} & a_{k+2,k+2} & \cdots & a_{k+2,n} \\ \vdots & & & \vdots \\ a_{n,k+1} & a_{n,k+2} & \cdots & a_{n,n} \end{bmatrix}$$
are the unique matrix representations of $L$ on $W_1$ and $W_2$ with respect to the bases $S_1$ and $S_2$, respectively. □

Example 4.2.10. Let $L : \mathbb{R}^2 \to \mathbb{R}^2$ be a reflection about the line $y = 2x$. It may not be immediately obvious how to write the matrix representation of $L$ in the standard basis $S$, but Theorem 4.2.9 tells us that if we can find complementary $L$-invariant subspaces, the matrix representation of $L$ in terms of the bases of those subspaces is diagonal.
A natural choice of complementary, invariant subspaces consists of the line $y = 2x$, which we write as $\operatorname{span}([1\ 2]^T)$, and its normal $y = -x/2$, which we can write as $\operatorname{span}([2\ {-1}]^T)$. Let $T = \{[1\ 2]^T, [2\ {-1}]^T\}$ be the new basis. Since $L$ leaves $[1\ 2]^T$ fixed and takes $[2\ {-1}]^T$ to its negative, we get a very simple matrix representation in this basis:
$$D = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}.$$
Changing back to the standard basis via the matrix
$$P = \begin{bmatrix} 1 & 2 \\ 2 & -1 \end{bmatrix}$$
gives the matrix representation
$$PDP^{-1} = \frac{1}{5}\begin{bmatrix} -3 & 4 \\ 4 & 3 \end{bmatrix}$$
in the standard basis.

Corollary 4.2.11. Let $W_1, W_2, \ldots, W_r$ be a collection of $L$-invariant subspaces of $V$. If $V = W_1 \oplus W_2 \oplus \cdots \oplus W_r$, where each $W_i$ has the basis $S_i$, then the unique matrix representation of $L$ on $S = \bigcup_{i=1}^{r} S_i$ is of the form
$$A = \begin{bmatrix} A_{11} & & & \\ & A_{22} & & \\ & & \ddots & \\ & & & A_{rr} \end{bmatrix},$$
where each $A_{ii}$ is the matrix representation of $L$ restricted to $W_i$ with the basis $S_i$.

Proof. This follows by repeated use of Theorem 4.2.9. □

4.3 Diagonalization
If a matrix has a set of eigenvectors that forms a basis, then the matrix is similar
to a diagonal matrix. This is one of the most fundamental ideas in linear analysis.
In this section we describe how this works. As before, we assume throughout this
section that L is a linear operator on a finite-dimensional vector space with matrix
representation A, with respect to some given basis.

4.3.1 Simple and Semisimple Matrices

Theorem 4.3.1. If $\lambda_1, \ldots, \lambda_k$ are distinct eigenvalues of $L$ with corresponding eigenvectors $x_1, x_2, \ldots, x_k$, then these eigenvectors are linearly independent.

Proof. Suppose $\dim(\operatorname{span}(\{x_1, x_2, \ldots, x_k\})) = r < k$. By renumbering we may assume, without loss of generality, that $\{x_1, x_2, \ldots, x_r\}$ is linearly independent. Thus, each subsequent vector is a linear combination of the first $r$ vectors. Hence,
$$x_{r+1} = a_1x_1 + a_2x_2 + \cdots + a_rx_r. \tag{4.9}$$
Applying $L$ yields
$$L(x_{r+1}) = a_1L(x_1) + a_2L(x_2) + \cdots + a_rL(x_r),$$
which implies that
$$\lambda_{r+1}x_{r+1} = a_1\lambda_1x_1 + a_2\lambda_2x_2 + \cdots + a_r\lambda_rx_r. \tag{4.10}$$
Taking (4.10) and subtracting $\lambda_{r+1}$ times (4.9) yields
$$0 = a_1(\lambda_1 - \lambda_{r+1})x_1 + a_2(\lambda_2 - \lambda_{r+1})x_2 + \cdots + a_r(\lambda_r - \lambda_{r+1})x_r,$$
but since $\{x_1, x_2, \ldots, x_r\}$ is linearly independent, each $a_i(\lambda_i - \lambda_{r+1})$ is zero. Because the eigenvalues are distinct, this implies $a_i = 0$ for each $i$. Hence, $x_{r+1} = 0$, which is a contradiction (since by definition eigenvectors cannot be $0$). Therefore, $r = k$. □

Definition 4.3.2. If a set $S = \{x_1, x_2, \ldots, x_n\} \subset \mathbb{F}^n$ of eigenvectors of $L$ forms a basis of $\mathbb{F}^n$, then we say $S$ is an eigenbasis of $L$.

Example 4.3.3. The matrix
$$A = \begin{bmatrix} 1 & 3 \\ 4 & 2 \end{bmatrix}$$
from Examples 4.1.7 and 4.1.12 has distinct eigenvalues, and so the previous theorem shows that the corresponding eigenvectors are linearly independent and thus form an eigenbasis.
Having distinct eigenvalues is a sufficient condition for the existence of an eigenbasis; however, as shown in Example 4.1.14, it is not necessary: there the eigenvalues are not distinct, and yet an eigenbasis can still be found. On the other hand, there are cases where an eigenbasis does not exist for a matrix; see, for example, Example 4.1.16.

Definition 4.3.4. An operator $L$ (or the corresponding matrix $A$) on a finite-dimensional vector space is called
(i) simple if all of its eigenvalues are distinct;
(ii) semisimple if there exists an eigenbasis of $L$.

Corollary 4.3.5. A simple matrix is semisimple.

Proof. If $A \in M_n(\mathbb{F})$ is simple, then there exist $n$ distinct eigenvalues $\lambda_1, \ldots, \lambda_n$ with corresponding eigenvectors $x_1, x_2, \ldots, x_n$. Hence, by Theorem 4.3.1, these form a linearly independent set, which is an eigenbasis. □

Definition 4.3.6. The matrix $A$ is diagonalizable if it is similar to a diagonal matrix; that is, there exists a nonsingular matrix $P$ and a diagonal matrix $D$ such that
$$D = P^{-1}AP. \tag{4.11}$$

Theorem 4.3.7. A matrix is diagonalizable if and only if it is semisimple.

Proof. ($\Rightarrow$) If $A$ is diagonalizable, then there exists an invertible matrix $P$ and a diagonal matrix $D$ such that (4.11) holds. If we denote the columns of $P$ as $x_1, x_2, \ldots, x_n$, that is, $P = [x_1\ x_2\ \cdots\ x_n]$, and the diagonal elements of $D$ as $\lambda_1, \lambda_2, \ldots, \lambda_n$, that is, $D = \operatorname{diag}(\lambda_1, \ldots, \lambda_n)$, then it suffices to show that $\lambda_1, \lambda_2, \ldots, \lambda_n$ are eigenvalues of $A$ and $x_1, x_2, \ldots, x_n$ are the corresponding eigenvectors, that is, that $Ax_i = \lambda_ix_i$ for each $i$. Note, however, that this follows from matching up the columns in
$$[Ax_1\ Ax_2\ \cdots\ Ax_n] = AP = PD = [\lambda_1x_1\ \lambda_2x_2\ \cdots\ \lambda_nx_n].$$
($\Leftarrow$) Assume $A$ is semisimple. Let $P$ be the nonsingular (transition) matrix of eigenvectors $P = [x_1\ x_2\ \cdots\ x_n]$. Thus,
$$AP = [Ax_1\ Ax_2\ \cdots\ Ax_n] = [\lambda_1x_1\ \lambda_2x_2\ \cdots\ \lambda_nx_n] = [x_1\ x_2\ \cdots\ x_n]\operatorname{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n) = PD.$$
It follows that $D = P^{-1}AP$. □

Example 4.3.8. Consider again the matrix
$$A = \begin{bmatrix} 1 & 3 \\ 4 & 2 \end{bmatrix}$$
from Examples 4.1.7 and 4.1.12. The eigenvalues are $\sigma(A) = \{-2, 5\}$ with corresponding eigenvectors $[-1\ 1]^T$ and $[3\ 4]^T$, respectively. Setting
$$P = \begin{bmatrix} -1 & 3 \\ 1 & 4 \end{bmatrix}, \quad D = \begin{bmatrix} -2 & 0 \\ 0 & 5 \end{bmatrix}, \quad \text{and} \quad P^{-1} = \frac{1}{7}\begin{bmatrix} -4 & 3 \\ 1 & 1 \end{bmatrix},$$
we multiply to find that $P^{-1}AP = D$, as expected.
The matrix $A$ represents an operator $L : \mathbb{F}^2 \to \mathbb{F}^2$ in the standard basis. The previous computation shows that if we change the basis to the eigenbasis $\{[-1\ 1]^T, [3\ 4]^T\}$, then the matrix representation of $L$ is $D$, which is diagonal. This "decouples" the action of the operator $L$ into its action on the two eigenspaces $\Sigma_{-2} = \operatorname{span}([-1\ 1]^T)$ and $\Sigma_5 = \operatorname{span}([3\ 4]^T)$.

Example 4.3.9. Not every square matrix is diagonalizable. For example, the matrix
$$A = \begin{bmatrix} 3 & 1 \\ 0 & 3 \end{bmatrix}$$
in Example 4.1.16 cannot be diagonalized. In this example, $\mathscr{N}(A - 3I) = \operatorname{span}\{e_1\}$, which is not a basis for $\mathbb{R}^2$, and thus $A$ does not have an eigenbasis.

Diagonalization is useful in many settings. For example, if you diagonalize a matrix, you can compute large powers of it easily since powers of diagonal matrices are trivial to compute, and if $A = P^{-1}DP$, then $A^k = P^{-1}D^kP$ for all $k \in \mathbb{N}$, as shown in the next proposition.

Proposition 4.3.10. If matrices $A, B \in M_n(\mathbb{F})$ are similar, with $A = P^{-1}BP$, then $A^k = P^{-1}B^kP$ for all $k \in \mathbb{N}$.

Proof. For $k \in \mathbb{N}$, we have that $A^k = (P^{-1}BP)(P^{-1}BP)\cdots(P^{-1}BP)$, which reduces to $A^k = P^{-1}B^kP$. □

Application 4.3.11. Consider the familiar Fibonacci sequence
$$0, 1, 1, 2, 3, 5, 8, \ldots,$$
which can be defined recursively as $F_{n+1} = F_n + F_{n-1}$, where $F_0 = 0$ and $F_1 = 1$. We can also define the sequence in terms of matrices and vectors as follows. Define $v_k = [F_k\ F_{k-1}]^T$ and observe that
$$v_{k+1} = Av_k, \quad \text{where } A = \begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix}.$$
To find the $k$th number in the Fibonacci sequence, use the fact that $v_{k+1} = Av_k = A^2v_{k-1} = \cdots = A^kv_1$.
To calculate $A^k$, diagonalize $A$ and apply Proposition 4.3.10. A routine calculation shows that the eigenvalues of $A$ are
$$\lambda_1 = \frac{1 + \sqrt{5}}{2} \quad \text{and} \quad \lambda_2 = \frac{1 - \sqrt{5}}{2},$$
and the corresponding eigenvectors are $[\lambda_1\ 1]^T$ and $[\lambda_2\ 1]^T$, respectively (the reader should check this!). Since the eigenvectors are linearly independent, they form an eigenbasis, and $A$ can be written as $A = PDP^{-1}$, where
$$P = \begin{bmatrix} \lambda_1 & \lambda_2 \\ 1 & 1 \end{bmatrix}, \quad D = \begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{bmatrix}, \quad \text{and} \quad P^{-1} = \frac{1}{\sqrt{5}}\begin{bmatrix} 1 & -\lambda_2 \\ -1 & \lambda_1 \end{bmatrix}.$$
The $k$th Fibonacci number is the second entry in $v_{k+1}$, given by
$$v_{k+1} = PD^kP^{-1}v_1 = \frac{1}{\sqrt{5}}\begin{bmatrix} \lambda_1 & \lambda_2 \\ 1 & 1 \end{bmatrix}\begin{bmatrix} \lambda_1^k & 0 \\ 0 & \lambda_2^k \end{bmatrix}\begin{bmatrix} 1 & -\lambda_2 \\ -1 & \lambda_1 \end{bmatrix}\begin{bmatrix} 1 \\ 0 \end{bmatrix}.$$
Multiplying this out gives $F_k = \frac{\lambda_1^k - \lambda_2^k}{\sqrt{5}}$.
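As a quick sanity check, the closed-form expression obtained from the diagonalization can be compared with the recursion numerically; a minimal Python sketch:

    import numpy as np

    lam1 = (1 + np.sqrt(5)) / 2
    lam2 = (1 - np.sqrt(5)) / 2

    def fib_closed_form(k):
        # k-th Fibonacci number from the diagonalization formula
        return (lam1**k - lam2**k) / np.sqrt(5)

    F = [0, 1]
    for _ in range(2, 11):
        F.append(F[-1] + F[-2])

    print([int(round(fib_closed_form(k))) for k in range(11)])   # matches F
    print(F)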

Theorem 4.3.12 (Semisimple Spectral Mapping). If $(\lambda_i)_{i=1}^{n}$ are the eigenvalues of a semisimple matrix $A \in M_n(\mathbb{F})$ and $f(x) = a_0 + a_1x + \cdots + a_nx^n$ is a polynomial, then $(f(\lambda_i))_{i=1}^{n}$ are the eigenvalues of $f(A) = a_0I + a_1A + \cdots + a_nA^n$.

Proof. This is Exercise 4.15. □

Vista 4.3.13. In Section 12.7.1 we show that the semisimple spectral mapping theorem actually holds for all matrices, not just semisimple ones, and it holds for many functions, not just polynomials.

4.3.2 Left Eigenvectors


Recall that an eigenvalue of the matrix $A \in M_n(\mathbb{F})$ is a scalar $\lambda \in \mathbb{F}$ for which $\lambda I - A$ is singular; see Theorem 4.1.8(iv). We have established that an eigenvector of $\lambda$ is a nonzero element of the kernel $\mathscr{N}(\lambda I - A)$. However, for a given eigenvalue $\lambda$, we also have that $x^T(\lambda I - A) = 0^T$ for some nonzero row vector $x^T$; see Exercise 4.18. We refer to these row vectors as the left eigenvectors of $A$ corresponding to $\lambda$. For the remainder of this section, we examine the properties of left eigenvectors.

Definition 4.3.14. Consider a matrix $A \in M_n(\mathbb{F})$. Given an eigenvalue $\lambda \in \mathbb{F}$ we say that the nonzero row vector $x^T$ is a left eigenvector of $A$ corresponding to $\lambda$ if
$$x^TA = \lambda x^T. \tag{4.12}$$
When necessary to avoid confusion, we refer to a regular eigenvector as a right eigenvector.

Remark 4.3.15. It is easy to see by taking the transpose that (4.12) is equivalent to $A^Tx = \lambda x$. In other words, given $\lambda$, the row vector $x^T$ is a left eigenvector of $A$ if and only if $x$ is a right eigenvector of $A^T$.

Remark 4.3.16. We define the left eigenspace of $\lambda$ to be the set of row vectors that satisfy (4.12). By appealing to the rank-nullity theorem, we can match the dimensions of the left and right eigenspaces of an eigenvalue $\lambda$ since it is always true that $\dim\mathscr{N}(\lambda I - A) = \dim\mathscr{N}(\lambda I - A^T)$.

Remark 4.3.17. Sometimes, for notational convenience, we denote the right eigenvectors by the letter $r$ and the left eigenvectors by the letter $\ell^T$.

Example 4.3.18. If
$$B = \begin{bmatrix} -9 & -2 \\ 35 & 8 \end{bmatrix}$$
and $\ell_1^T = [5\ 1]$, then $\ell_1^TB = -2\ell_1^T$. Thus, $\lambda = -2$ is a left eigenvalue with $\ell_1^T$ as a corresponding left eigenvector. Let $\ell_2^T = [7\ 2]$; then $\ell_2^TB = \ell_2^T$. Thus, $\lambda = 1$ is a left eigenvalue with $\ell_2^T$ as a corresponding left eigenvector.

Remark 4.3.19. If a matrix $A$ is semisimple, then it has a basis of right eigenvectors $[r_1, r_2, \ldots, r_n]$, which form the transition matrix $P$ used to diagonalize $A$; that is, $P^{-1}AP = D$, where $D = \operatorname{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)$ is the diagonal matrix of eigenvalues (see Theorem 4.3.7). Since $DP^{-1} = P^{-1}A$, it follows for each $i$ that the $i$th row $\ell_i^T$ of $P^{-1}$ is a left eigenvector of $A$ with eigenvalue $\lambda_i$.

4.4 Schur's Lemma


Schur's lemma states that any square matrix can be transformed via similarity to
an upper-triangular matrix and that the similarity transform can be performed with
an orthonormal matrix. At first glance, this may seem unimportant, but Schur's
lemma is quite powerful, both theoretically and computationally.
Schur's lemma is an important concept in computation. Recall from
Proposition 4.1.22 that the eigenvalues of an upper-triangular matrix are given
by its diagonal elements. Hence, to find the eigenvalues of a matrix, Schur's lemma
provides an alternative to diagonalization. Moreover, the eigenvalues computed with
Schur's lemma are generally more accurate than those that are computed by diag-
onalization.
Given its significance, Schur's lemma is, without question, a theorem in its
own right , but for historical reasons it is called a lemma. This is primarily because
it nicely sets up the proof of the spectral theorem for Hermitian matrices. Recall
that an Hermitian matrix is one that is self-adjoint,28 that is, A= AH. The spectral
theorem for Hermitian matrices states that all Hermitian matrices are diagonalizable
and that their eigenvalues are real. Moreover, there exists an orthonormal eigenbasis
that diagonalizes Hermitian matrices.
The most general class of matrices to have orthonormal eigenbases is the class
of normal matrices, which include Hermitian matrices, skew-Hermitian matrices,
and orthonormal matrices. Using Schur's lemma, we show that a matrix is normal
if and only if it has an orthonormal eigenbasis.

4.4.1 Schur's Lemma and the Spectral Theorem


Recall from Theorem 3.2.15 that a matrix $Q \in M_n(\mathbb{F})$ is orthonormal if and only if it satisfies $Q^HQ = QQ^H = I$, which is equivalent to its having orthonormal columns.

Definition 4.4.1. Two matrices $A$ and $B$ are orthonormally similar29 if there exists an orthonormal matrix $U$ such that $B = U^HAU$.

Lemma 4.4.2. If A is Hermitian and orthonormally similar to B, then B is also


Hermitian.

Proof. The proof is Exercise 4.20. D

Theorem 4.4.3 (Schur's Lemma). Every matrix $A \in M_n(\mathbb{C})$ is orthonormally similar to an upper-triangular matrix.

Proof. We prove this by induction on $n$. The $n = 1$ case is trivial. Now assume that the theorem holds for $n = k$, and take $A \in M_{k+1}(\mathbb{C})$. Let $\lambda_1$ be an eigenvalue of $A$ with unit eigenvector $w_1$ (that is, rescale $w_1$ so that $\|w_1\|_2 = 1$). Using the Gram-Schmidt algorithm, construct $[w_2, \ldots, w_{k+1}]$ so that $[w_1, \ldots, w_{k+1}]$ is an
28 These form a very important class of matrices that includes all real, symmetric matrices.
29 This is also often called unitarily similar in the complex case, and orthogonally similar in the real case, but as we have explained before, we prefer to use the name orthonormal for both the real and complex situations.

orthonormal set. Setting $U = [w_1\ \cdots\ w_{k+1}]$ to be the matrix whose columns are these vectors, we have that
$$U^HAU = \begin{bmatrix} \lambda_1 & * \cdots * \\ 0 & M \end{bmatrix},$$
where $M$ is a $k \times k$ matrix. By the inductive hypothesis, there exists an orthonormal $Q_1$ and an upper-triangular $T_1$ such that $Q_1^HMQ_1 = T_1$. Hence, setting
$$Q = \begin{bmatrix} 1 & 0 \\ 0 & Q_1 \end{bmatrix}$$
yields
$$Q^HU^HAUQ = \begin{bmatrix} \lambda_1 & * \cdots * \\ 0 & T_1 \end{bmatrix},$$
which is upper triangular. Finally, since $(UQ)^H(UQ) = Q^HU^HUQ = Q^HIQ = Q^HQ = I$, we have that $UQ$ is orthonormal; see also Corollary 3.2.16. □

Remark 4.4.4. If $B = U^HAU$ is the upper-triangular matrix orthonormally similar to $A$ given by Schur's lemma, it is called the Schur form of $A$. Both $A$ and $B$ correspond to different representations of the same linear operator, and their eigenvalues have the same algebraic and geometric multiplicities. Moreover, the eigenvalues are the diagonal entries of $B$ by Proposition 4.1.22.
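Numerically, the Schur form is available directly in standard libraries; for instance, scipy.linalg.schur returns T and U with A = U T U^H. A minimal sketch on a randomly generated matrix (purely illustrative):

    import numpy as np
    from scipy.linalg import schur

    A = np.random.rand(4, 4)
    T, U = schur(A, output='complex')    # T upper triangular, U orthonormal (unitary)
    print(np.allclose(A, U @ T @ U.conj().T))   # A = U T U^H
    print(np.diag(T))                           # eigenvalues of A on the diagonal of T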

Theorem 4.4.5. Let $\lambda$ be an eigenvalue of an operator $T$ on a finite-dimensional space $V$. If $m_\lambda$ is the algebraic multiplicity of $\lambda$, then $\dim\Sigma_\lambda(T) \le m_\lambda$.

Proof. Let $v_1, \ldots, v_k$ be a basis of $\Sigma_\lambda(T)$. By the extension theorem (Corollary 1.4.5) we can choose additional vectors $v_{k+1}, \ldots, v_n$ so that $v_1, \ldots, v_n$ is a basis of $V$. The matrix representation $A$ of $T$ in this basis has the form
$$A = \begin{bmatrix} \lambda I_k & * \\ 0 & A_{22} \end{bmatrix},$$
where $I_k$ is the $k \times k$ identity matrix. Thus, the characteristic polynomial $p(z)$ of $T$ satisfies $p(z) = \det(zI - A) = (z - \lambda)^k\det(zI - A_{22})$. But since $p(z)$ factors completely as (4.5), this implies that $k \le m_\lambda$. □

Corollary 4.4.6. A matrix $A$ is semisimple if and only if $m_\lambda = \dim\Sigma_\lambda(A)$ for each eigenvalue $\lambda$.

4.4.2 Spectral Theorem for Hermitian Matrices


Hermitian matrices are self-adjoint complex-valued matrices. These include sym-
metric real-valued matrices. Self-adjoint operators and Hermitian matrices appear
naturally in many physical problems, such as the Schrodinger equation in quantum
mechanics and structural models in civil engineering. The spectral theorem tells us
that Hermitian matrices have orthonormal eigenbases and that their eigenvalues are
always real. We focus primarily on Hermitian matrices and finite-dimensional vec-
tor spaces, but the results in this section generalize nicely to self-adjoint operators
on infinite-dimensional spaces.
We call the next theorem the first spectral theorem because although it is often
called just ''the spectral theorem," there is a second, stronger version of the theorem
that also should be called "the spectral theorem" (which we have given the clever
name the second spectral theorem).
Theorem 4.4.7 (First Spectral Theorem). Every Hermitian matrix A is
orthonormally diagonalizable, that is, orthonormally similar to a diagonal matrix.
Moreover, the resulting diagonal matrix has only real entries.

Proof. By Schur's lemma $A$ is orthonormally similar to an upper-triangular matrix $T$. However, since $A$ is Hermitian, then so is $T$, by Lemma 4.4.2. This implies that $T$ is diagonal and $\overline{T} = T$; hence, $T$ is real. □

Remark 4.4.8. The converse is also true, since if $A = U^HDU$, with $D$ a real diagonal matrix, then $A^H = (U^HDU)^H = U^HDU = A$.

Corollary 4.4.9. If $A$ is an Hermitian matrix, then it has an orthonormal eigenbasis and the eigenvalues of $A$ are real.

Proof. Let $A$ be an Hermitian matrix. By the first spectral theorem, there exist an orthonormal matrix $U$ and a diagonal $D$ such that $U^HAU = D$. The eigenbasis is then given by the columns of $U$, which are orthonormal. The diagonal elements are real since $D$ is also Hermitian (and the diagonal elements of an Hermitian matrix are real). □

Example 4.4.10. Consider the Hermitian matrix
$$A = \begin{bmatrix} 1 & 2i \\ -2i & -2 \end{bmatrix}$$
with characteristic polynomial
$$p(\lambda) = (\lambda - 1)(\lambda + 2) - 4 = \lambda^2 + \lambda - 6 = (\lambda + 3)(\lambda - 2).$$
Thus, the spectrum is $\sigma(A) = \{-3, 2\}$, and so both eigenvalues are real. The corresponding eigenvectors, when scaled to be unit vectors, are
$$\frac{1}{\sqrt{5}}\begin{bmatrix} 1 \\ 2i \end{bmatrix} \quad \text{and} \quad \frac{1}{\sqrt{5}}\begin{bmatrix} 2i \\ 1 \end{bmatrix},$$
which form an orthonormal basis for $\mathbb{C}^2$.
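This computation is easy to reproduce numerically; numpy.linalg.eigh is designed for Hermitian matrices and returns real eigenvalues with orthonormal eigenvectors. A minimal sketch:

    import numpy as np

    A = np.array([[1, 2j],
                  [-2j, -2]])
    evals, evecs = np.linalg.eigh(A)    # eigh assumes A is Hermitian
    print(evals)                         # [-3.  2.], both real
    print(np.allclose(evecs.conj().T @ evecs, np.eye(2)))   # columns are orthonormal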

Remark 4.4.11. Not every eigenbasis of an Hermitian matrix is orthonormal. First, the eigenvectors need not have unit length. Second, in the case that an Hermitian matrix has an eigenvalue $\lambda$ of multiplicity two or more, any linearly independent set in the eigenspace of $\lambda$ can be used to help form an eigenbasis, and that linearly independent set need not be orthogonal. Thus, an orthonormal eigenbasis is a very special choice of eigenbasis.

4.4.3 Normal Matrices


The spectral theorem is extremely powerful, and fortunately it can be generalized
to a much larger class of matrices called normal matrices. These are matrices that
are orthonormally diagonalizable.
Definition 4.4.12. A matrix A E Mn(IF) is normal if A HA= AAH .

Example 4.4.13. The following are examples of normal matrices:

(i) Hermitian matrices: AH A= A 2 = AAH.


(ii) Skew-Hermitian matrices: AHA = -A 2 = AAH.

(iii) Orthonormal matrices: uHu =I = uuH.

Theorem 4.4.14 (Second Spectral Theorem). A matrix $A \in M_n(\mathbb{F})$ is normal if and only if it is orthonormally diagonalizable.

Proof. Assume that $A$ is normal. By Schur's lemma, there exists an orthonormal matrix $U$ and an upper-triangular matrix $T$ such that $U^HAU = T$. Hence,
$$T^HT = U^HA^HUU^HAU = U^HA^HAU = U^HAA^HU = U^HAUU^HA^HU = TT^H,$$
or, in other words, $T$ is also normal. However, we also have that
$$T^HT = \begin{bmatrix} \bar{t}_{11} & 0 & \cdots & 0 \\ \bar{t}_{12} & \bar{t}_{22} & & \vdots \\ \vdots & & \ddots & 0 \\ \bar{t}_{1n} & \bar{t}_{2n} & \cdots & \bar{t}_{nn} \end{bmatrix}\begin{bmatrix} t_{11} & t_{12} & \cdots & t_{1n} \\ 0 & t_{22} & & t_{2n} \\ \vdots & & \ddots & \vdots \\ 0 & \cdots & 0 & t_{nn} \end{bmatrix}$$
and
$$TT^H = \begin{bmatrix} t_{11} & t_{12} & \cdots & t_{1n} \\ 0 & t_{22} & & t_{2n} \\ \vdots & & \ddots & \vdots \\ 0 & \cdots & 0 & t_{nn} \end{bmatrix}\begin{bmatrix} \bar{t}_{11} & 0 & \cdots & 0 \\ \bar{t}_{12} & \bar{t}_{22} & & \vdots \\ \vdots & & \ddots & 0 \\ \bar{t}_{1n} & \bar{t}_{2n} & \cdots & \bar{t}_{nn} \end{bmatrix}.$$
Comparing the diagonals of both yields
$$|t_{11}|^2 = |t_{11}|^2 + |t_{12}|^2 + \cdots + |t_{1n}|^2,$$
$$|t_{12}|^2 + |t_{22}|^2 = |t_{22}|^2 + |t_{23}|^2 + \cdots + |t_{2n}|^2,$$
and so forth. This implies that $|t_{ij}| = 0$ whenever $i \ne j$. Hence, $T$ is diagonal.
Conversely, if $A$ is orthonormally diagonalizable, there exist an orthonormal matrix $U$ and a diagonal matrix $D$ such that $U^HAU = D$. Thus, $A = UDU^H$. Since $D^HD = DD^H$, we have that $A^HA = UD^HDU^H = UDD^HU^H = AA^H$. Therefore, $A$ is normal. □
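The characterization can be illustrated numerically. The following minimal sketch (purely illustrative) builds a normal matrix by conjugating a complex diagonal matrix with a randomly generated orthonormal matrix and then checks normality:

    import numpy as np

    rng = np.random.default_rng(0)
    # Orthonormal (unitary) Q from the QR decomposition of a random complex matrix.
    Q, _ = np.linalg.qr(rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4)))
    D = np.diag(rng.standard_normal(4) + 1j * rng.standard_normal(4))
    A = Q @ D @ Q.conj().T              # orthonormally diagonalizable by construction

    print(np.allclose(A.conj().T @ A, A @ A.conj().T))   # True: A is normal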

Vista 4.4.15. Eigenvalue and eigenvector computations are more prone to


error if the matrices involved are not normal. This means that round-off error
from floating-point arithmetic and noise in the physical problem can compound
into large errors in the final output; see Section 7.5 for details. In Chapter 14,
we examine the pseudospectrum, an important tool to better understand the
behavior of nonnormal matrices in numerical computation.

4.5 The Singular Value Decomposition


The singular value decomposition (SVD) is one of the most important ideas in ap-
plied mathematics and is ubiquitous in science and engineering. The SVD is known
by many different names and has several close cousins, each celebrated in different
applications and across disciplines. These include the Karhunen-Loeve expansion,
principal component analysis, factor analysis, empirical orthogonal decomposition,
proper orthogonal decomposition, conjoint analysis, the Hotelling transform, latent
semantic analysis, and eigenfaces.
The SVD provides orthonormal bases for the four fundamental subspaces as
described in Section 3.8. It also gives us a means to approximate a matrix with
one of lower rank. Additionally, the SVD allows us to solve least squares problems
when the linear systems do not have full column rank.

4.5.1 Positive Definite Matrices


Before we describe the SVD, we must first discuss positive definite and positive
semidefinite matrices, both of which are important for the SVD, but they are also
very important in their own right, especially in optimization and statistics.

Definition 4.5.1. A matrix $A \in M_n(\mathbb{F})$ is positive definite, denoted $A > 0$, if it is Hermitian and $\langle x, Ax\rangle > 0$ for all $x \ne 0$. It is positive semidefinite, denoted $A \ge 0$, if it is Hermitian and $\langle x, Ax\rangle \ge 0$ for all $x$.

Remark 4.5.2. If $A$ is Hermitian, then it is clear that $\langle x, Ax\rangle$ is real valued. In particular, we have that
$$\langle x, Ax\rangle = \langle A^Hx, x\rangle = \langle Ax, x\rangle = \overline{\langle x, Ax\rangle}.$$

Unexample 4.5.3. Consider the matrices

A= [~ ~] and B = [~ ~] .
Note that $A$ is not positive definite because it is not Hermitian. To show that $B$ is not positive definite, let $x = [-1\ 1]^T$, which gives $x^HBx = -4 < 0$.

Example 4.5.4. Consider the matrix
$$C = \begin{bmatrix} 3 & 1 \\ 1 & 3 \end{bmatrix}.$$
Let $x = [x_1\ x_2]^T$ and assume $x \ne 0$. Thus,
$$\begin{aligned}
x^HCx &= \bar{x}_1(3x_1 + x_2) + \bar{x}_2(x_1 + 3x_2) \\
&= 3|x_1|^2 + \bar{x}_1x_2 + x_1\bar{x}_2 + 3|x_2|^2 \\
&= 2|x_1|^2 + 2|x_2|^2 + |x_1 + x_2|^2 > 0.
\end{aligned}$$

Theorem 4.5.5. The Hermitian matrix $A \in M_n(\mathbb{F})$ is positive definite if and only if its spectrum contains only positive eigenvalues. It is positive semidefinite if and only if its spectrum contains only nonnegative eigenvalues.

Proof. ($\Rightarrow$) If $\lambda$ is an eigenvalue of a positive definite linear operator $A$ with corresponding eigenvector $x$, then $\langle x, Ax\rangle = \langle x, \lambda x\rangle = \lambda\|x\|^2$ is positive. Since $x \ne 0$, we have $\|x\| > 0$, and thus $\lambda$ is positive.
($\Leftarrow$) Let $A$ be Hermitian. If the spectrum $(\lambda_i)_{i=1}^{n}$ is positive and $(x_i)_{i=1}^{n}$ is the corresponding orthonormal eigenbasis, then $x = \sum_{i=1}^{n} a_ix_i$ satisfies
$$\langle x, Ax\rangle = \Big\langle \sum_{i=1}^{n} a_ix_i,\ \sum_{j=1}^{n} a_jAx_j \Big\rangle = \sum_{i=1}^{n}\sum_{j=1}^{n} \bar{a}_ia_j\lambda_j\langle x_i, x_j\rangle = \sum_{i=1}^{n} |a_i|^2\lambda_i,$$
which is positive for all $x \ne 0$.
The same proof works for the positive semidefinite case by replacing every occurrence of the word positive with the word nonnegative. □
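Theorem 4.5.5 suggests a simple numerical test for (semi)definiteness of a Hermitian matrix: compute its eigenvalues and check their signs. A minimal sketch, using the matrix C of Example 4.5.4:

    import numpy as np

    C = np.array([[3., 1.],
                  [1., 3.]])
    evals = np.linalg.eigvalsh(C)   # eigenvalues of a Hermitian matrix
    print(evals)                     # [2. 4.], all positive, so C > 0
    print(np.all(evals > 0))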

Proposition 4.5.6. If $A \in M_n(\mathbb{F})$ is a positive semidefinite matrix of rank $r$ whose nonzero eigenvalues are $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_r > 0$, then there exists an orthonormal matrix $Q$ such that $Q^HAQ = \operatorname{diag}(\lambda_1, \ldots, \lambda_r, 0, \ldots, 0)$. The last $n - r$ columns of $Q$ form an orthonormal basis for $\mathscr{N}(A)$, and the first $r$ columns form an orthonormal basis for $\mathscr{N}(A)^\perp$. Letting $D = \operatorname{diag}(\lambda_1, \ldots, \lambda_r)$, we have the following block form:
$$Q^HAQ = \begin{bmatrix} D & 0 \\ 0 & 0 \end{bmatrix}. \tag{4.13}$$

Proof. Let $x_1, \ldots, x_n$ be an orthonormal eigenbasis for $A$, ordered so that the corresponding eigenvalues satisfy $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_r > \lambda_{r+1} = \cdots = \lambda_n = 0$. The set $x_{r+1}, \ldots, x_n$ is an orthonormal basis for the kernel $\mathscr{N}(A)$, and $x_1, \ldots, x_r$ is an orthonormal basis for $\mathscr{N}(A)^\perp$. If $Q_1$ is defined to be the $n \times r$ matrix with columns equal to $x_1, \ldots, x_r$, and $Q_2$ to be the $n \times (n-r)$ matrix with columns equal to $x_{r+1}, \ldots, x_n$, then (4.13) follows immediately. □

Proposition 4.5.7. If $A$ is positive semidefinite, there exists a matrix $S$ such that $A = S^HS$. Moreover, if $A$ is positive definite, then the matrix $S$ is nonsingular.

Proof. Write $A = UDU^H$, where $U$ is orthonormal and $D = \operatorname{diag}(d_1, \ldots, d_n) \ge 0$ is diagonal. Let $D^{1/2} = \operatorname{diag}(\sqrt{d_1}, \ldots, \sqrt{d_n}) \ge 0$, so that $D^{1/2}D^{1/2} = D$. Setting $S = D^{1/2}U^H$ gives $S^HS = A$, as required.
If $A$ is positive definite, then each diagonal entry of $D$ is positive, and so is each corresponding diagonal entry of $D^{1/2}$. Thus, $D$ and $S$ are nonsingular. □
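The construction in this proof is easy to carry out numerically. A minimal sketch, again using the matrix C from Example 4.5.4:

    import numpy as np

    C = np.array([[3., 1.],
                  [1., 3.]])
    d, U = np.linalg.eigh(C)                  # C = U diag(d) U^H
    S = np.diag(np.sqrt(d)) @ U.conj().T      # S = D^(1/2) U^H
    print(np.allclose(S.conj().T @ S, C))     # S^H S = C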

Corollary 4.5.8. Given a positive definite matrix $A \in M_n(\mathbb{F})$, there exists an inner product $\langle\cdot,\cdot\rangle_A$ on $\mathbb{F}^n$ given by $\langle x, y\rangle_A = x^HAy$.

Proof. It is clear that $\langle\cdot,\cdot\rangle_A$ is sesquilinear. Since $A > 0$, there exists a nonsingular $S \in M_n(\mathbb{F})$ satisfying $A = S^HS$. Hence, $\langle x, x\rangle_A = x^HAx = x^HS^HSx = \|Sx\|_2^2$, which is positive if and only if $x \ne 0$. □

Proposition 4.5.9. If $A$ is an $m \times n$ matrix of rank $r$, then $A^HA$ is positive semidefinite and has rank $r$.

Proof. Note that $(A^HA)^H = A^HA$ and $\langle x, A^HAx\rangle = \langle Ax, Ax\rangle = \|Ax\|^2 \ge 0$. Thus, $A^HA$ is positive semidefinite. See Exercise 3.46(iii) to show $\operatorname{rank}(A^HA) = r$. □

4.5.2 The Singular Value Decomposition


The SVD is one of the most important results of this text. For any rank-$r$ matrix $A \in M_{m \times n}(\mathbb{F})$, the SVD gives $r$ positive real numbers $\sigma_1, \ldots, \sigma_r$, an orthonormal basis $v_1, \ldots, v_n \in \mathbb{F}^n$, and an orthonormal basis $u_1, \ldots, u_m \in \mathbb{F}^m$ such that $A$ maps the first $r$ vectors $v_i$ to $\sigma_iu_i$ and maps the rest of the $v_i$ to $0$.
In other words, once we choose the "right" bases for the domain and codomain, any linear transformation of finite-dimensional spaces just maps basis vectors of the domain to nonnegative real multiples of the basis vectors of the codomain. No matter how complicated a linear transformation may initially appear, in the right bases, it is extremely easy to describe: just a rescaling.
matter how complicated a linear transformation may initially appear, in the right
bases, it is extremely easy to describe-just a rescaling.
Note that although the SVD involves a diagonal matrix, the SVD is very
different from diagonalizing a matrix. When we talk about diagonalizing, we use
the same basis for the domain and codomain, but for the SVD we allow them to
have different bases. It is this flexibility-the ability to choose different bases for
domain and codomain-that allows us to describe the linear transformation in this
very simple yet powerful way.

Theorem 4.5.10 (Singular Value Decomposition). If $A \in M_{m\times n}(\mathbb{F})$ is of
rank $r$, then there exist orthonormal matrices $U \in M_m(\mathbb{F})$ and $V \in M_n(\mathbb{F})$ and an
$m \times n$ real diagonal matrix $\Sigma = \operatorname{diag}(\sigma_1, \sigma_2, \ldots, \sigma_r, 0, \ldots, 0)$ such that
$$A = U \Sigma V^H, \tag{4.14}$$
where $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_r > 0$ are all positive real numbers. This is called the
singular value decomposition (SVD) of $A$, and the $r$ positive values $\sigma_1, \sigma_2, \ldots, \sigma_r$
are the singular values$^{30}$ of $A$.

Proof. Let $A$ be an $m \times n$ matrix of rank $r$. By Proposition 4.5.9, the matrix
$A^H A$ is positive semidefinite of rank $r$. By Proposition 4.5.6, we can find a matrix
$V = [V_1\ V_2]$ with orthonormal columns such that
$$V^H A^H A V = \begin{bmatrix} D & 0 \\ 0 & 0 \end{bmatrix},$$
where $D = \operatorname{diag}(d_1, \ldots, d_r)$ and $d_1 \ge d_2 \ge \cdots \ge d_r > 0$. Write each $d_i$ as a square
$d_i = \sigma_i^2$, and let $\Sigma_1$ be the diagonal $r \times r$ block $\Sigma_1 = \operatorname{diag}(\sigma_1, \ldots, \sigma_r)$. In block
form, we have $D = V_1^H A^H A V_1 = \Sigma_1^2$.
As proved in Exercise 3.46, we have $\mathscr{N}(A^H A) = \mathscr{N}(A)$, which implies that
$A V_2 = 0$. Thus, the column vectors of $V_2$ form an orthonormal basis for $\mathscr{N}(A)$,
and the column vectors of $V_1$ form an orthonormal basis for $\mathscr{N}(A)^\perp = \mathscr{R}(A^H)$.
Define the matrix $U_1 = A V_1 \Sigma_1^{-1}$. It has orthonormal columns because
$$U_1^H U_1 = (\Sigma_1^{-1})^H V_1^H A^H A V_1 \Sigma_1^{-1} = I.$$
Thus, the columns of $U_1$ form an orthonormal basis for $\mathscr{R}(A)$. Let $u_{r+1}, \ldots, u_m$ be
any orthonormal basis for $\mathscr{R}(A)^\perp = \mathscr{N}(A^H)$, and let $U_2 = [u_{r+1} \cdots u_m]$ be the
matrix with these basis vectors as columns. Setting $U = [U_1\ U_2]$ yields$^{31}$
$$U \Sigma V^H = [U_1\ U_2] \begin{bmatrix} \Sigma_1 & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} V_1^H \\ V_2^H \end{bmatrix} = U_1 \Sigma_1 V_1^H = A V_1 V_1^H. \tag{4.15}$$

$^{30}$The additional zeros on the diagonal are not considered singular values.
$^{31}$Beware that $V_1 V_1^H \neq I$, despite the fact that $V_1^H V_1 = I$ and $V V^H = I$.

Figure 4.1. This is a representation of the fundamental subspaces theorem
(Theorem 3.8.9) that was popularized by Gilbert Strang [Str93]. The transformation
$A$ sends $\mathscr{R}(A^H)$ (black rectangle on the left) isomorphically to $\mathscr{R}(A)$ (black rectangle
on the right) by sending each basis element $v_i$ from the SVD to $\sigma_i u_i \in \mathscr{R}(A)$ if
$v_i \in \mathscr{R}(A^H)$. The remaining $v_i$ lie in $\mathscr{N}(A)$ (blue square on the left) and are
sent to $0$ (blue dot on the right). Similarly, the transformation $A^H$ maps $\mathscr{R}(A)$
isomorphically to $\mathscr{R}(A^H)$ by sending each $u_i \in \mathscr{R}(A)$ to $\sigma_i v_i$. The remaining $u_i$
lie in $\mathscr{N}(A^H)$ (red square on the right) and are sent to $0$ (red dot on the left). For
an alternative depiction of the fundamental subspaces, see Figure 3.7.

Note that $I = V V^H = V_1 V_1^H + V_2 V_2^H$, and thus $A = A V_1 V_1^H + A V_2 V_2^H = A V_1 V_1^H$.
Therefore, $U \Sigma V^H = A$.
Since the singular values are the positive square roots of the nonzero eigen-
values of $A^H A$, they are uniquely determined by $A$, and since they are ordered
$\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_r$, the matrix $\Sigma$ is uniquely determined by $A$. □

Remark 4.5.11. The matrix $\Sigma$ is unique in the SVD, whereas the matrices $U$ and
$V$ are not necessarily unique.
Remark 4.5.12. The SVD gives orthonormal bases for the four fundamental sub-
spaces in the fundamental subspaces theorem (Theorem 3.8.9). Specifically, the first
$r$ columns of $V$ form a basis for $\mathscr{R}(A^H)$; the last $n-r$ columns of $V$ form a basis for
$\mathscr{N}(A)$; the first $r$ columns of $U$ form a basis for $\mathscr{R}(A)$; and the last $m-r$ columns
of $U$ form a basis for $\mathscr{N}(A^H)$. This is visualized in Figure 4.1.
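Remark 4.5.12 can be checked directly with numpy.linalg.svd, which returns $U$, the singular values, and $V^H$. The sketch below uses an arbitrary illustrative matrix (not the matrix of the following example) and an ad hoc tolerance to estimate the rank.

import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
U, s, Vh = np.linalg.svd(A)                # full SVD: A = U Sigma V^H
r = int(np.sum(s > 1e-12 * s[0]))          # numerical rank
V = Vh.conj().T
range_A  = U[:, :r]                        # orthonormal basis for R(A)
null_AH  = U[:, r:]                        # orthonormal basis for N(A^H)
range_AH = V[:, :r]                        # orthonormal basis for R(A^H)
null_A   = V[:, r:]                        # orthonormal basis for N(A)
print(r, np.allclose(A @ null_A, 0))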

Example 4.5.13. We calculate the SVD of the rank-2 $4 \times 3$ matrix $A$.

The first step is to calculate $A^H A$, which is given by
$$A^H A = \begin{bmatrix} 80 & 56 & 16 \\ 56 & 68 & 40 \\ 16 & 40 & 32 \end{bmatrix}.$$
Since $\sigma(A^H A) = \{0, 36, 144\}$, the singular values are $6$ and $12$. The right
singular vectors, that is, the columns of $V$, are determined by finding a set of
orthonormal eigenvectors of $A^H A$. Specifically, let

The left singular vectors, that is, the columns of $U_1$, can be computed
by observing that $u_i = \frac{1}{\sigma_i} A v_i$ for $i = 1, 2$. The remaining columns of $U$ are
calculated by finding unit vectors orthogonal to $u_1$ and $u_2$. We let

u ~ ~ [:
2 1
1
1 -1
-1 - 1
- 1 1
1 1
:,]
1
-1
.

Thus, the SVD of A is

1 - 1
-1 -1
-1 1
1 1

Remark 4.5.14. From (4.15) we have that $U \Sigma V^H = U_1 \Sigma_1 V_1^H$. The equation
$$A = U_1 \Sigma_1 V_1^H \tag{4.16}$$
is called the compact form of the SVD. The compact form encapsulates all of the
necessary information to recalculate the matrix $A$. Moreover, $A$ can be represented
by the outer-product expansion
$$A = \sum_{i=1}^{r} \sigma_i u_i v_i^H, \tag{4.17}$$
where $u_i$ and $v_i$ are column vectors of $U_1$ and $V_1$, respectively, and $\sigma_1, \sigma_2, \ldots, \sigma_r$
are the positive singular values.
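The compact form (4.16) and the outer-product expansion (4.17) are easy to reproduce with numpy.linalg.svd; this sketch uses a random test matrix rather than the matrix of Example 4.5.13.

import numpy as np

A = np.random.default_rng(0).normal(size=(4, 3))
U, s, Vh = np.linalg.svd(A, full_matrices=False)           # thin SVD
r = int(np.sum(s > 1e-12 * s[0]))
compact = U[:, :r] @ np.diag(s[:r]) @ Vh[:r, :]            # equation (4.16)
outer = sum(s[i] * np.outer(U[:, i], Vh[i, :]) for i in range(r))   # equation (4.17)
print(np.allclose(compact, A), np.allclose(outer, A))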

Example 4.5.15. The compact form of the SVD for the matrix A from
Example 4.5.13 is

Corollary 4.5.16 (Polar Decomposition). If $A \in M_{m\times n}(\mathbb{F})$, with $m \ge n$,
then there exists a matrix $Q \in M_{m\times n}(\mathbb{F})$ with orthonormal columns and a positive
semidefinite matrix $P \in M_n(\mathbb{F})$ such that $A = QP$. This is called the right polar
decomposition of $A$.

Proof. From the SVD we have $A = U \Sigma V^H$. Write $\Sigma = \begin{bmatrix}\Sigma_n \\ 0\end{bmatrix}$ with $\Sigma_n \in M_n(\mathbb{F})$,
and let $U_n$ be the first $n$ columns of $U$, so that $A = U_n \Sigma_n V^H$. Set $Q = U_n V^H$ and
$P = V \Sigma_n V^H$; then $QP = U_n \Sigma_n V^H = A$. Since $\Sigma_n$ is positive semidefinite, it follows
that $P$ is also positive semidefinite. □
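A minimal NumPy sketch of this construction, using the thin SVD so that the dimensions work out when $m > n$; the matrix is illustrative.

import numpy as np

A = np.random.default_rng(1).normal(size=(5, 3))      # m >= n
U, s, Vh = np.linalg.svd(A, full_matrices=False)      # thin SVD: U is 5 x 3
Q = U @ Vh                                            # orthonormal columns
P = Vh.conj().T @ np.diag(s) @ Vh                     # positive semidefinite
print(np.allclose(Q @ P, A), np.allclose(Q.conj().T @ Q, np.eye(3)))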

Remark 4.5.17. The polar decomposition is a matrix generalization of writing a
complex number in polar form as $r e^{i\theta}$; see Appendix B.1. If $A$ is a square matrix,
then the determinant of $Q$ lies on the unit circle (see Exercise 3.10(v)). Also, the
determinant of $P$ is nonnegative; that is, $\det(A) = \det(Q)\det(P) = r e^{i\theta}$, where
$\det(P) = r$ and $\det(Q) = e^{i\theta}$.

Remark 4.5.18. Let A E Mmxn(lF), with m 2: n. The left polar decomposi-


tion consists of a matrix Q E Mmxn(lF) with orthonormal columns and a positive
semidefinite matrix P E Mm(lF) such that A = PQ. This can also be constructed
using the SVD.

4.6 Consequences of the SVD


The SVD has many important consequences and applications. In this section we
discuss just a few of these.

4.6.1 Least Squares and Moore-Penrose Inverse


The SVD is important in computing least squares solutions when the underlying
matrix is not of full column rank; for a reminder of the full-column-rank condition
see Theorem 3.9.3. We solve these problems by computing a certain pseudoinverse,
which behaves like an inverse in many ways.
Consider the linear system $Ax = b$, where $A \in M_{m\times n}(\mathbb{F})$ and $b \in \mathbb{F}^m$. If
$b \in \mathscr{R}(A)$, then the linear system has a solution. If not, then we find the "best"
approximate solution in the sense of the 2-norm; see Section 3.9. Recall that the
least squares solution $\widehat{x}$ is given by solving the normal equation (3.41) given by
$$A^H A \widehat{x} = A^H b. \tag{4.18}$$
If $A$ has full column rank, that is, $\operatorname{rank} A = n$, then $A^H A$ is invertible and
$\widehat{x} = (A^H A)^{-1} A^H b$ is the unique least squares solution. If $A$ is not injective, then

there are infinitely many least squares solutions. If $\widehat{x}$ is a particular solution, then
any $\widehat{x} + n$ with $n \in \mathscr{N}(A)$ also satisfies (4.18). The SVD allows us to find the
unique particular solution that is orthogonal to $\mathscr{N}(A)$.

Theorem 4.6.1 (Moore-Penrose Pseudoinverse). If $A \in M_{m\times n}(\mathbb{F})$ and $b \in \mathbb{F}^m$,
then there exists a unique $\widehat{x} \in \mathscr{N}(A)^\perp$ satisfying (4.18). Moreover, if $A = U_1 \Sigma_1 V_1^H$
is the compact form of the SVD of $A$, then $\widehat{x} = A^\dagger b$, where
$$A^\dagger = V_1 \Sigma_1^{-1} U_1^H. \tag{4.19}$$
We call $A^\dagger$ the Moore-Penrose pseudoinverse of $A$.

Proof. If $\widehat{x} = A^\dagger b = V_1 \Sigma_1^{-1} U_1^H b$, then $\widehat{x} \in \mathscr{R}(V_1) = \mathscr{R}(A^H) = \mathscr{N}(A)^\perp$ and
$$A^H A \widehat{x} = V_1 \Sigma_1 U_1^H U_1 \Sigma_1 V_1^H V_1 \Sigma_1^{-1} U_1^H b = V_1 \Sigma_1 U_1^H b = A^H b.$$
To prove uniqueness, suppose $v \in \mathscr{N}(A)^\perp$ is also a solution of the normal equation.
Subtracting, we have that $A^H A(\widehat{x} - v) = 0$, and so $\widehat{x} - v \in \mathscr{N}(A^H A) = \mathscr{N}(A)$ by
Exercise 3.46. Hence, $\widehat{x} - v \in \mathscr{N}(A)^\perp \cap \mathscr{N}(A) = \{0\}$. Therefore, $v = \widehat{x}$. □
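A sketch of Theorem 4.6.1 in NumPy: the least squares solution built from the compact SVD as in (4.19) agrees with numpy.linalg.pinv and satisfies the normal equation. The rank-deficient matrix below is an illustrative choice.

import numpy as np

A = np.array([[1.0, 2.0], [2.0, 4.0], [0.0, 0.0]])    # rank 1, not full column rank
b = np.array([1.0, 2.0, 3.0])

U, s, Vh = np.linalg.svd(A, full_matrices=False)
r = int(np.sum(s > 1e-12 * s[0]))
A_dagger = Vh[:r, :].conj().T @ np.diag(1.0 / s[:r]) @ U[:, :r].conj().T   # (4.19)
x_hat = A_dagger @ b

print(np.allclose(x_hat, np.linalg.pinv(A) @ b))
print(np.allclose(A.conj().T @ (A @ x_hat), A.conj().T @ b))   # normal equation (4.18)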

Proposition 4.6.2. If $A \in M_{m\times n}(\mathbb{F})$, then the Moore-Penrose pseudoinverse of
$A$ satisfies the following:

(i) $A A^\dagger A = A$.

(ii) $A^\dagger A A^\dagger = A^\dagger$.

(iii) $(A A^\dagger)^H = A A^\dagger$.

(iv) $(A^\dagger A)^H = A^\dagger A$.

(v) $A A^\dagger = \operatorname{proj}_{\mathscr{R}(A)}$ is the orthogonal projection onto $\mathscr{R}(A)$.

(vi) $A^\dagger A = \operatorname{proj}_{\mathscr{R}(A^H)}$ is the orthogonal projection onto $\mathscr{R}(A^H)$.

Proof. See Exercise 4.38. □
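Before working Exercise 4.38, it can be reassuring to verify properties (i)-(iv) numerically; a quick sketch with an arbitrary matrix:

import numpy as np

A = np.random.default_rng(2).normal(size=(4, 3))
Ad = np.linalg.pinv(A)                               # Moore-Penrose pseudoinverse

print(np.allclose(A @ Ad @ A, A))                    # (i)
print(np.allclose(Ad @ A @ Ad, Ad))                  # (ii)
print(np.allclose((A @ Ad).conj().T, A @ Ad))        # (iii)
print(np.allclose((Ad @ A).conj().T, Ad @ A))        # (iv)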

4.6.2 Low-Rank Approximate Behavior


Another important application of the SVD is that it can be used to construct low-
rank approximations of a matrix. These are useful for data compression, as well as
for many other applications, like facial recognition, because they can greatly reduce
the amount of data that must be stored, transmitted, or processed.
The low-rank approximation goes as follows: Consider a matrix $A \in M_{m\times n}(\mathbb{F})$
of rank $r$. Using the outer product expansion (4.17) of the SVD, and then truncating
it to include only the first $s < r$ terms, we define the approximation matrix $A_s$ of
$A$ as
$$A_s = \sum_{i=1}^{s} \sigma_i u_i v_i^H. \tag{4.20}$$

In this section, we show that $A_s$ has rank $s$ and is the "best" rank-$s$ approximation
of $A$ in the sense that the norm of the difference, that is, $\|A - A_s\|$, is minimized
against all other $\|A - B\|$ where $B$ has rank $s$. We make this more precise below.

Theorem 4.6.3 (Schmidt, Mirsky, Eckart-Young).$^{32}$ If $A \in M_{m\times n}(\mathbb{C})$ has
rank $r$, then for each $s < r$, we have
$$\sigma_{s+1} = \inf_{\operatorname{rank}(B) = s} \|A - B\|_2, \tag{4.21}$$
with minimizer
$$B = A_s = \sum_{i=1}^{s} \sigma_i u_i v_i^H, \tag{4.22}$$
where each $\sigma_j$ is the $j$th singular value of $A$ and $u_j$ and $v_j$ are, respectively, the
corresponding columns of $U_1$ and $V_1$ in the compact form (4.16) of the singular value
decomposition.

Proof. Let $W = [v_1 \cdots v_{s+1}]$ be the matrix whose columns are the first $s+1$
right singular vectors of $A$. For any $B \in M_{m\times n}(\mathbb{C})$ of rank $s$, Exercise 2.14(i)
shows that $\operatorname{rank}(BW) \le \operatorname{rank}(B) = s$. Thus, by the rank-nullity theorem, we have
$\dim \mathscr{N}(BW) = s + 1 - \operatorname{rank}(BW) \ge 1$. Hence, there exists $x \in \mathscr{N}(BW) \subset \mathbb{F}^{s+1}$
satisfying $\|x\|_2 = 1$. We compute
$$A W x = \sum_{i=1}^{r} \sigma_i u_i v_i^H W x = \sum_{i=1}^{s+1} \sigma_i x_i u_i,$$
where $x = [x_1\ x_2\ \cdots\ x_{s+1}]^T$. Since $W^H W = I$ (by Theorem 3.2.15), we have
that $\|Wx\|_2 = 1$. Thus,
$$\|A - B\|_2^2 = \|A - B\|_2^2 \|Wx\|_2^2 \ge \|(A - B) W x\|_2^2 = \|A W x\|_2^2
= \sum_{i=1}^{s+1} \sigma_i^2 |x_i|^2 \ge \sigma_{s+1}^2 \sum_{i=1}^{s+1} |x_i|^2 = \sigma_{s+1}^2.$$
This inequality is sharp$^{33}$ since
$$\|A - A_s\|_2^2 = \Big\| \sum_{i=s+1}^{r} \sigma_i u_i v_i^H \Big\|_2^2 = \sigma_{s+1}^2,$$
where the last equality is proved in Exercise 4.31(i). □

$^{32}$This theorem and its counterpart for the Frobenius norm are often just called the Eckart-Young
theorem, but there seems to be good evidence that Schmidt and Mirsky discovered these results
earlier than Eckart and Young (see [Ste98, pg. 77]), so all four names get attached to these
theorems.
$^{33}$An inequality is sharp if there is at least one case where equality holds. In other words, no
stronger inequality could hold.

A version of the previous theorem also holds in the Frobenius norm (as defined
in Example 3.5.6).

Theorem 4.6.4 (Schmidt, Mirsky, Eckart-Young). Using the notation above,
$$\left( \sum_{j=s+1}^{r} \sigma_j^2 \right)^{1/2} = \inf_{\operatorname{rank}(B) = s} \|A - B\|_F, \tag{4.23}$$
with minimizer $B = A_s$ given above in (4.22).

Proof. Let $A \in M_{m\times n}(\mathbb{F})$ with SVD $A = U \Sigma V^H$. The invertible change of variable
$Z = U^H B V$ (combined with Exercise 4.32(i)) gives
$$\inf_{\operatorname{rank}(B) = s} \|A - B\|_F = \inf_{\operatorname{rank}(Z) = s} \|\Sigma - Z\|_F. \tag{4.24}$$
From Example 3.5.6, we know that the square of the Frobenius norm of a matrix is
just the sum of the squares of the entries in the matrix. Hence, if $Z = [z_{ij}]$, then
we have
$$\|\Sigma - Z\|_F^2 = \sum_{i=1}^{m} \sum_{j=1}^{n} |\Sigma_{ij} - z_{ij}|^2
= \sum_{i=1}^{r} \sigma_i^2 - \sum_{i=1}^{r} \sigma_i (z_{ii} + \overline{z_{ii}}) + \sum_{i=1}^{m} \sum_{j=1}^{n} |z_{ij}|^2.$$
The last expression can only be minimized when $z_{ij} = 0$ for all $i \neq j$ and $z_{ii} = 0$
for $i > r$. Thus, we have that
$$\|\Sigma - Z\|_F^2 = \sum_{i=1}^{r} \sigma_i^2 - 2 \sum_{i=1}^{r} \sigma_i \operatorname{Re}(z_{ii}) + \sum_{i=1}^{r} |z_{ii}|^2
\ge \sum_{i=1}^{r} \sigma_i^2 - 2 \sum_{i=1}^{r} \sigma_i |z_{ii}| + \sum_{i=1}^{r} |z_{ii}|^2
= \sum_{i=1}^{r} (\sigma_i - |z_{ii}|)^2.$$
Imposing the condition that $\operatorname{rank}(Z) = s$ implies that exactly $s$ of the $z_{ii}$ terms
are nonzero. Therefore, the minimum occurs when $z_{ii} = \sigma_i$ for each $1 \le i \le s$ and
$z_{ii} = 0$ otherwise. This choice knocks out the largest singular values. Hence, the
minimizer is (4.22). □
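Both versions of the Schmidt-Mirsky-Eckart-Young theorem are easy to observe numerically: truncating the SVD gives the best rank-$s$ approximation, with 2-norm error $\sigma_{s+1}$ and Frobenius error $(\sigma_{s+1}^2 + \cdots + \sigma_r^2)^{1/2}$. A sketch with an arbitrary matrix:

import numpy as np

A = np.random.default_rng(3).normal(size=(8, 6))
U, sig, Vh = np.linalg.svd(A, full_matrices=False)

s = 2
A_s = U[:, :s] @ np.diag(sig[:s]) @ Vh[:s, :]        # truncated SVD, equation (4.20)

print(np.isclose(np.linalg.norm(A - A_s, 2), sig[s]))              # 2-norm error
print(np.isclose(np.linalg.norm(A - A_s, 'fro'),
                 np.sqrt(np.sum(sig[s:] ** 2))))                   # Frobenius error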

Application 4.6.5 (Data Compression and the SVD). Suppose that


you have a large amount of data to transmit or store and only limited band-
width or storage space. Low-rank approximations give a way to identify and
keep only the most important information and discard the rest.

Consider, for example, a grayscale image of dimension 250 x 250 pixels.


The picture can be represented by a 250 x 250 matrix A, where each entry
in the matrix is a number between 0 (black) and 1 (white). This amounts to
250 2 = 62,500 pixels to transmit or store.
In general the matrix $A$ has rank 250, but the Schmidt, Mirsky, Eckart-
Young theorem (Theorem 4.6.4) guarantees that for any $s < \operatorname{rank}(A)$, the
matrix
$$A_s = \sum_{i=1}^{s} \sigma_i u_i v_i^H$$
is the best rank-$s$ approximation to $A$. This matrix can be reconstructed
from just the data of the first $s$ singular values and the $2s$ vectors $u_1, \ldots, u_s$
and $v_1, \ldots, v_s$, a total of $501s$ real numbers instead of 62,500. Even with $s$
relatively small, the most important parts of the picture remain. To see an
example of the effects of the compression, see Figure 4.2.


Figure 4.2. An example of image compression as discussed in Applica-
tion 4.6.5. The original image is $250 \times 250$ pixels, and the singular value recon-
structions are shown for 100, 30, 20, 10, and 5 singular values. These are the best
rank-$s$ approximations of the original image.
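A sketch of the compression described in Application 4.6.5, using a synthetic 250 x 250 array in place of an actual image (reading real image data would require an imaging library, which we do not assume here):

import numpy as np

image = np.random.default_rng(4).random((250, 250))    # stand-in for a grayscale image
U, sig, Vh = np.linalg.svd(image, full_matrices=False)

s = 20
compressed = U[:, :s] @ np.diag(sig[:s]) @ Vh[:s, :]   # best rank-s approximation
storage_full = image.size                              # 62,500 numbers
storage_low_rank = s * (image.shape[0] + image.shape[1] + 1)   # 501 s numbers
print(storage_full, storage_low_rank, np.linalg.norm(image - compressed, 2))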

An alternative way of formulating the two theorems of Schmidt, Mirsky, and
Eckart-Young is that they put a lower bound on the size of a perturbation $\Delta$
needed for $A + \Delta$ to have a given rank smaller than the rank of $A$. So the rank of a
small perturbation $A + \Delta$, with $\Delta$ sufficiently small, must be at least as large as the
rank of $A$. Put differently, the matrix $\Delta$ has to be sufficiently large in order for
$A + \Delta$ to be smaller in rank than $A$.

Example 4.6.6. If $A = 0 \in M_n(\mathbb{F})$ is the zero matrix, then $A + \varepsilon I$ has rank $n$
for any $\varepsilon \neq 0$. Notice that by adding a small perturbation to the zero matrix,
the rank goes from $0$ to $n$. Adding even the smallest matrix to the zero matrix
increases the rank of the sum.

Corollary 4.6.7. Let $A \in M_{m\times n}(\mathbb{F})$ have SVD $A = U \Sigma V^H$. If $s < r$, then for any
$\Delta \in M_{m\times n}(\mathbb{F})$ satisfying $\operatorname{rank}(A + \Delta) = s$, we have
$$\|\Delta\|_2 \ge \sigma_{s+1}.$$
Equality holds when $\Delta = -\sum_{i=s+1}^{r} \sigma_i u_i v_i^H$.

Proof. The proof is Exercise 4.39. □

4.6.3 Multiplicative Perturbations


We conclude this section by examining multiplicative perturbations. If $A \in M_n(\mathbb{F})$
is invertible, then $I - A\Delta$ has rank zero only if $\Delta = A^{-1}$, which has 2-norm equal
to $\|A^{-1}\|_2 = \sigma_n^{-1}$; see Exercise 4.31(ii). To force $I - A\Delta$ to have rank less than $n$
(but not necessarily $0$) does not require $\Delta$ to be so large, but we still have a lower
bound for $\Delta$.

Theorem 4.6.8. Let $A \in M_{m\times n}(\mathbb{F})$ have SVD $A = U \Sigma V^H$. The infimum of $\|\Delta\|_2$
such that $\operatorname{rank}(I - A\Delta) < m$ is $\sigma_1^{-1}$, with
$$\Delta^* = \sigma_1^{-1} v_1 u_1^H. \tag{4.25}$$

Proof. To make sense of the expression $I - A\Delta$, we must have $I \in M_{m\times m}(\mathbb{F})$ and
$\Delta \in M_{n\times m}(\mathbb{F})$. If $\operatorname{rank}(I - A\Delta) < m$, then there exists $x \in \mathbb{F}^m$ with $x \neq 0$ such
that $A\Delta x = x$. Thus,
$$\|x\|_2 = \|A\Delta x\|_2 \le \|A\|_2 \|\Delta x\|_2 = \sigma_1 \|\Delta x\|_2,$$
which implies
$$\sigma_1^{-1} \le \frac{\|\Delta x\|_2}{\|x\|_2} \le \|\Delta\|_2.$$
Since $\|\Delta^*\|_2 = \|\sigma_1^{-1} v_1 u_1^H\|_2 = \sigma_1^{-1}$, it suffices to show that $\operatorname{rank}(I - A\Delta^*) <
m$. However, this follows immediately from the fact that $(I - A\Delta^*) u_1 = 0$; see
Exercise 4.40. □
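The matrix $\Delta^*$ in the proof can be checked numerically: with $\Delta^* = \sigma_1^{-1} v_1 u_1^H$, the matrix $I - A\Delta^*$ annihilates $u_1$ and so has rank less than $m$. A sketch:

import numpy as np

A = np.random.default_rng(5).normal(size=(3, 4))
U, sig, Vh = np.linalg.svd(A)
u1, v1 = U[:, 0], Vh[0, :].conj()
Delta_star = (1.0 / sig[0]) * np.outer(v1, u1.conj())    # n x m, 2-norm 1/sigma_1

M = np.eye(3) - A @ Delta_star
print(np.allclose(M @ u1, 0))                            # u1 lies in the kernel
print(np.linalg.matrix_rank(M) < 3)
print(np.isclose(np.linalg.norm(Delta_star, 2), 1.0 / sig[0]))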

As a corollary, we immediately get the small gain theorem, which is important


in control theory.

Corollary 4.6.9 (Small Gain Theorem). If $A \in M_n(\mathbb{F})$, then $I - A\Delta$ is non-
singular, provided that $\|A\|_2 \|\Delta\|_2 < 1$.

Proof. The proof is Exercise 4.41 . D

Exercises
Note to the student: Each section of this chapter has several corresponding
exercises, all collected here at the end of the chapter. The exercises between the
first and second line are for Section 1, the exercises between the second and third
lines are for Section 2, and so forth.
You should work every exercise (your instructor may choose to let you skip
some of the advanced exercises marked with *) . We have carefully selected them,
and each is important for your ability to understand subsequent material. Many of
the examples and results proved in the exercises are used again later in the text.
Exercises marked with & are especially important and are likely to be used later
in this book and beyond. Those marked with t are harder than average, but should
still be done.
Although they are gathered together at the end of the chapter, we strongly
recommend you do the exercises for each section as soon as you have completed
the section, rather than saving them until you have finished the entire chapter. The
exercises for each section are separated from those for other sections by a
horizontal line.

4.1. A matrix A E Mn(lF) is nilpotent if Ak = 0 for some k EN. Show that if>. is
an eigenvalue of a nilpotent matrix, then >. = 0. Hint: Show that if>. is an
eigenvalue of A, then >. k is an eigenvalue of A k .
4.2. Let V =span( {l, x, x 2}) be a subspace of the inner product space £ 2([0, l]; JR).
Let D be the derivative operator D: V--+ V given by D[p](x) = p'(x) . F ind
all the eigenvalues and eigenspaces of D . What are their algebraic and geo-
metric multiplicities?
4.3. Show that the characteristic polynomial of any 2 x 2 matrix has the form

p(>.) = >. 2 - tr (A)>.+ det (A).

4.4. Recall that a matrix A E Mn(lF) is Hermitian if AH = A and skew-Hermitian


if AH = - A. Using Exercise 4.3, prove that
(i) an Hermitian 2 x 2 matrix has only real eigenvalues;
(ii) a skew-Hermitian 2 x 2 matrix has only imaginary eigenvalues.
4.5. Let A, BE Mmxn(lF) and CE Mn(lF) , and assume that>. is not an eigenvalue
of C. Prove that >. is an eigenvalue of C - B HA if and only if A( C - >.I) - 1 BH
has an eigenvalue equal to l. Why is it important that >.is not an eigenvalue

of C? Hint: If x E lFn is an eigenvector of C - BH A, consider the vector


y =Ax E lFm.
4.6. Prove Proposition 4.1.22.
4.7. Prove Proposition 4.1.23. Hint: Use Exercise 2.47.

4.8. Let V be the span of the set S = {sin(x) , cos(x), sin(2x), cos(2x)} in the
vector space C 00 (JR.; JR.).
(i) Prove that S is a basis for V.
(ii) Let D be the derivative operator. Write the matrix representation of D
in the basis S.
(iii) Find two complementary D -invariant subspaces in V.
4.9. Prove that the right shift operator on f, 00 has no one-dimensional invariant
subspace (see Remark 4.2.8).
4.10. Assume that Vis a vector space and Tis a linear operator on V. Prove that
if Wis a T -invariant subspace of V, then the map T': V/W-+ V/W given
by T'(v + W) = T(v) +Wis a well-defined linear transformation.
4.11. Let W1 and W2 be complementary subspaces of the vector space V. A re-
flection through W 1 along W2 is a linear operator R : V -+ V such that
R(w1 + w2) = w1 - W2 for W1 E W1 and W2 E W2. Prove that the following
are equivalent:
(i) There exist complementary subspaces W1 and W2 of V such that R is
a reflection through W1 along W2.
(ii) R is an involution, that is, R 2 = I.
(iii) $V = \mathscr{N}(R - I) \oplus \mathscr{N}(R + I)$. Hint: We have that
$$v = \tfrac{1}{2}(I - R)v + \tfrac{1}{2}(I + R)v.$$
4.12. Let L be the linear operator on JR. 2 that reflects around the line y = 3x.
(i) Find two complementary L-invariant subspaces of V.
(ii) Choose a basis T for JR. 2 consisting of one vector from each of the two
complementary L-invariant subspaces and write the matrix representa-
tion of L in that basis.
(iii) Write the transition matrix Csr from T to the standard basis S.
(iv) Write the matrix representation of L in the standard basis.

4.13. Let
$$A = \begin{bmatrix} 0.8 & 0.4 \\ 0.2 & 0.6 \end{bmatrix}.$$

Compute the transition matrix P such that p-l AP is diagonal.


4.14. Prove that the linear transformation D of Exercise 4.2 is not semisimple.
4.15. Prove Theorem 4.3.12.

4.16. Let A be the matrix in Exercise 4.13 above.


(i) Compute limn--+oo An with respect to the 1-norm; that is, find a matrix
B such that for any€> 0 there exists an N > 0 with llAk - Bll1 < €
whenever k > N. Hint: Use Proposition 4.3.10.
(ii) Repeat part (i) for the oo-norm and the Frobenius norm. Does the
answer depend on the choice of norm? We discuss this further in
Section 5.8.
(iii) Find all the eigenvalues of the matrix 3I + 5A + A 3 . Hint: Consider
using Theorem 4.3.12.
4.17. If p(z) is the characteristic polynomial of a semisimple matrix A E Mn(lF),
then prove that p(A) = 0. We show in Chapter 12 that this theorem holds
even if the matrix is not semisimple.
4.18. Prove: If,\ is an eigenvalue of the A E Mn(lF), then there exists a nonzero
row vector x T such that x TA = ,\x T .
4.19. Let A E Mn(lF) be a semisimple matrix.
(i) Let £T be a left eigenvector of A (a row vector) with eigenvalue ,\, and
let r be a right eigenvector of A with eigenvalue p. Prove that £Tr = 0
if ,\ -/= p.
(ii) Prove that for any eigenvalue ,\ of A, there exist corresponding left and
right eigenvectors £T and r , respectively, associated with ,\ such that
f,T r = 1.

(iii) Provide an example of a semisimple matrix A showing that even if,\ =


p-/= 0, there can exist a left eigenvector £T associated with,\ and a right
eigenvector r associated with p such that £Tr = 0.

4.20. Prove Lemma 4.4.2.


4.21. Let (V, (·, ·)) be a finite-dimensional inner product space, [v 1 , . . . , vn] C Van
orthonormal basis, and T an orthonormal operator on V . Prove that there
is an orthonormal operator Q such that Qv 1 , ... , Qvn is an orthonormal
eigenbasis of T. (This shows that there is an orthonormal change of basis Q
which takes the original orthonormal basis to an orthonormal eigenbasis of
T.) Hint: By the spectral theorem, there exists an orthonormal eigenbasis
ofT.
4.22. Let Sand T be operators on a finite-dimensional inner product space (V, (·, ·))
which commute, that is, ST = TS.
(i) Show that every eigenspace of S is T -invariant; that is, for each eigen-
value,\ of S, the ,\-eigenspace ~>.(S) satisfies T~>.(S) c ~>.(S).
(ii) Show that if S is simple, then T is semisimple.
4.23. Let T be any invertible operator on a finite-dimensional inner product space
(W, (-, ·)).
(i) Show that TT* is self-adjoint and (w, TT*w) 2: 0 for all w E W.
(ii) Show that TT* has a self-adjoint square root S; that is, TT* = S 2 and
S* = S.

(iii) Show that s- 1r is orthonormal.


(iv) Show that there exist orthonormal operators U and V such that U*TV
is diagonal.
4.24. Given $A \in M_n(\mathbb{C})$, define the Rayleigh quotient as
$$\rho(x) = \frac{\langle x, Ax\rangle}{\|x\|^2},$$

where(-,·) is the usual inner product on IFn. Show that the Rayleigh quotient
can only take on real values for Hermitian matrices and only imaginary values
for skew-Hermitian matrices.
4.25. Let A E Mn(<C) be a normal matrix with eigenvalues (>..1, ... , An) and corre-
sponding orthonormal eigenvectors [x 1 , ... , Xn].
(i) Show that the identity matrix can be written I = x 1 x~ + · · · + XnX~ .
Hint: What is (xix~+···+ Xnx~)xj?
(ii) Show that A can be written as A = >.. 1 x 1 x~ + · · · + An XnX~ . This is
called an outer product expansion .
4.26. t Let A, B E Mn (IF) be Hermitian, and let IFn be equipped with the standard
inner product. If [x 1 , ... , Xn] is an orthonormal eigenbasis of B, then prove
that
$$\operatorname{tr}(AB) = \sum_{i=1}^{n} \langle x_i, A B x_i\rangle.$$

Hint: Use Exercises 2.25 and 4.25.

4.27. Assume A E Mn(IF) is positive definite. Prove that all its diagonal entries
are real and positive.
4.28. Assume A, B E Mn(IF) are positive semidefinite. Prove that

0 ::::; tr (AB) ::::; tr (A) tr (B),

and use this result to prove that II · llF is a matrix norm.


4.29. Let
1
A= 0 0 0 0 .
1 1 ol
[2 2 2 0
(i) Find the eigenvalues of AH A along with their algebraic and geometric
multiplicities.
(ii) Find the singular values of A .
(iii) Compute the entire SVD of A .
(iv) Give an orthonormal basis for each of the four fundamental subspaces
of A .
4.30. Let V =span( {1, x, x 2 }) be a subspace of the inner product space L 2 ([0, 1]; JR) .
Let D be the derivative operator D: V--+ V given by D[p](x) = p'(x). Write
the matrix representation of D with respect to the basis [1, x, x 2 ] and compute
its SVD.

4.31. & Assume $A \in M_{m\times n}(\mathbb{F})$ and $A$ is not identically zero. Prove that
(i) $\|A\|_2 = \sigma_1$, where $\sigma_1$ is the largest singular value of $A$;
(ii) if $A$ is invertible, then $\|A^{-1}\|_2 = \sigma_n^{-1}$;
(iii) $\|A^H\|_2^2 = \|A^T\|_2^2 = \|A^H A\|_2 = \|A\|_2^2$;
(iv) if $U \in M_m(\mathbb{F})$ and $V \in M_n(\mathbb{F})$ are orthonormal, then $\|U A V\|_2 = \|A\|_2$.
4.32. & Assume $A \in M_{m\times n}(\mathbb{F})$ is of rank $r$. Prove that
(i) if $U \in M_m(\mathbb{F})$ and $V \in M_n(\mathbb{F})$ are orthonormal, then $\|U A V\|_F = \|A\|_F$;
(ii) $\|A\|_F = \left(\sigma_1^2 + \sigma_2^2 + \cdots + \sigma_r^2\right)^{1/2}$, where $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_r > 0$ are the
singular values of $A$.
4.33. Assume $A \in M_n(\mathbb{F})$. Prove that
$$\|A\|_2 = \sup_{\|x\|_2 = 1,\ \|y\|_2 = 1} |y^H A x|. \tag{4.26}$$
Hint: Use Exercise 4.31.


4.34.* Let B and D be Hermitian. Prove that the matrix

c
is positive definite if and only if B > 0 and D - CH B- 1 > 0. Hint: Find a
matrix P of the form [ 6f] that makes pH AP block diagonal.

4.35. Let $A \in M_n(\mathbb{F})$ be nonsingular. Prove that the modulus of the determinant
is the product of the singular values:
$$|\det A| = \prod_{i=1}^{n} \sigma_i.$$

4.36. Give an example of a 2 x 2 matrix whose determinant is nonzero and whose


singular values are not equal to any of its eigenvalues.
4.37. Let A be the matrix in Exercise 4.29 . Find the Moore-Penrose inverse At of
A. Compare At A to AH A.
4.38. Prove Proposition 4.6.2.
4.39. Prove Corollary 4.6.7.
4.40. Finish the proof of Theorem 4.6.8 by showing that when A E Mmxn(lF) has
SVD equal to A= UL:VH, and if 6* = CT! 1v1uf , then (I -A6*)u1 = 0.
4.41.* Prove the small gain theorem (Corollary 4.6.9). Hint: What is the contra-
positive of Theorem 4.6.8?
4.42.* Let $A, \Delta \in M_n(\mathbb{F})$ with $\operatorname{rank}(A) = r$.
(i) If $\mathscr{R}(\Delta) \subset \mathscr{R}(A)$, prove that $A + \Delta = A(I + A^\dagger \Delta)$.
(ii) For any $X, Y \in M_n(\mathbb{F})$, prove that if $\operatorname{rank}(XY) < \operatorname{rank}(X)$, then
$\operatorname{rank}(Y) < n$.
(iii) Use the first parts of this exercise and Theorem 4.6.8 to prove that if
$A + \Delta$ has rank $s < r$, then $\|\Delta\|_2 \ge \sigma_r$, without using the Schmidt,
Mirsky, Eckart-Young theorems.

Notes
We have focused primarily on spectral theory of finite-dimensional operators because
the infinite-dimensional case is very different and has many subtleties. This is
normally covered in a functional analysis course. Some resources for the infinite-
dimensional case include [Pro08, Con90, Rud91] .
Part II

Nonlinear Analysis I

5 Metric Space Topology

The angel of topology and the devil of abstract algebra fight for the soul of each
individual mathematical domain.
-Hermann Weyl

In single-variable analysis, the distance between any two numbers x, y E lF is defined


to be Ix - YI, where I · I is the absolute value in IR or the modulus in C. The dis-
tance function allows us to rigorously define several essential ideas in mathematical
analysis, such as continuity and convergence. In Chapter 3, we generalized the dis-
tance function to normed linear spaces, taking the distance between two vectors to
be the norm of their difference. While normed linear spaces are central to applied
mathematics, there are many important problems that do not have the algebraic or
geometric structure needed to define a norm. In this chapter, we consider a more
general notion of distance that allows us to extend some of the most important
ideas of mathematical analysis to more abstract settings.
We begin by defining a distance function, or metric, on an abstract space
X as a nonnegative, symmetric, real-valued function on X x X that satisfies the
triangle inequality and a certain positivity condition. This gives us the necessary
framework to quantify "nearness" and define continuity, convergence, and many
other important concepts. The metric also allows us to generalize the notions of
open and closed intervals from single-variable real analysis to open and closed sets
in the space X.
The collection of all open sets in a given space is called the topology of
that space, and any properties of a space that depend only on the open sets
are called topological properties. Two of the most important topological proper-
ties are compactness and connectedness. Compactness implies that every sequence
has a convergent subsequence, and connectedness means that the set cannot be
broken apart into two or more separate pieces. Continuous functions have the very
nice property that they preserve both compactness and connectedness.
Cauchy sequences, which have the property that the terms of the sequence
get arbitrarily close to each other as the sequence progresses, are some of the most
important sequences in a metric space. If every Cauchy sequence converges in
the space, we say that the space is complete. Roughly speaking, this means any


point that we can get arbitrarily close to must actually be in the space; there
are no holes or gaps in the space. For example, $\mathbb{Q}$ is not complete because we
can approximate irrational numbers (not in $\mathbb{Q}$) as closely as we like with rational
numbers. Many of the most useful spaces in mathematical analysis are complete;
for example, $(\mathbb{F}^n, \|\cdot\|_p)$ is complete for any $n \in \mathbb{N}$ and any $p \in [1,\infty]$.
When normed vector spaces are also complete, they are called Banach spaces.
Banach spaces are very important in analysis and applied mathematics, and most
of the normed linear spaces used in applied mathematics are Banach spaces. Some
important examples of Banach spaces include $\mathbb{F}^n$, the space of matrices $M_{m\times n}(\mathbb{F})$
(which is isomorphic to $\mathbb{F}^{mn}$), the space of continuous functions $(C([a,b];\mathbb{R}), \|\cdot\|_{L^\infty})$,
the space of bounded functions, and the spaces $\ell^p$ for $1 \le p \le \infty$.
Although this chapter is a little more abstract than the rest of the book so
far , and even though we do not give as many immediate applications of the ideas
in this chapter, the material in this chapter is fundamental to applied mathematics
and provides powerful tools that you can use repeatedly throughout the rest of the
book and beyond.

5.1 Metric Spaces and Continuous Functions


In this section, we define a notion of distance, called a metric, on a set. Sets that
have a metric are called metric spaces.34 The metric allows us to define open sets
and define what it means for a function to be continuous.

5. 1.1 Metric Spaces

Definition 5.1.1. A metric on a set $X$ is a map $d : X \times X \to \mathbb{R}$ that satisfies the
following properties for all $x$, $y$, and $z$ in $X$:

(i) Positive definiteness: $d(x,y) \ge 0$, with $d(x,y) = 0$ if and only if $x = y$.

(ii) Symmetry: $d(x,y) = d(y,x)$.

(iii) Triangle inequality: $d(x,y) \le d(x,z) + d(z,y)$.

The pair $(X,d)$ is called a metric space.

Example 5.1.2. Perhaps the most common metric space is $\mathbb{F}^n$ with the Euclidean
metric given by the 2-norm, that is, $d(x,y) = \|x - y\|_2$. Unless we
specifically say otherwise, we always use this metric on $\mathbb{F}^n$.

Example 5.1.3. The reader should check that each of the examples below
satisfies the definition of a metric space.

34
Metric spaces are not necessarily vector spaces. Indeed, there need not be any binary operations
defined on a metric space.

(i) We can generalize Example 5.1.2 to more general normed linear spaces.
By Definition 3.1.11, any norm $\|\cdot\|$ on a vector space induces a natural
metric
$$d(x,y) = \|x - y\|. \tag{5.1}$$
Unless we specifically say otherwise, we always use this metric on a
normed linear space.

(ii) For $f, g \in C([a,b];\mathbb{R})$ and for any $p \in [1,\infty]$, we have the metric
$$d_p(f,g) = \begin{cases} \left( \displaystyle\int_a^b |f(t) - g(t)|^p \, dt \right)^{1/p}, & 1 \le p < \infty, \\[6pt] \displaystyle\sup_{t \in [a,b]} |f(t) - g(t)|, & p = \infty. \end{cases} \tag{5.2}$$
These are also written as $\|f - g\|_{L^p}$ and $\|f - g\|_{L^\infty}$, respectively.

(iii) The discrete metric on $X$ is
$$d(x,y) = \begin{cases} 0 & \text{if } x = y, \\ 1 & \text{if } x \neq y. \end{cases} \tag{5.3}$$
Thus, no two distinct points are close together; they are always the
same distance apart.

(iv) Let $((X_i, d_i))_{i=1}^n$ be a collection of metric spaces, and let $X = X_1 \times
X_2 \times \cdots \times X_n$ be the Cartesian product. For any two points $x =
(x_1, x_2, \ldots, x_n)$ and $y = (y_1, y_2, \ldots, y_n)$ in $X$, define
$$d_p(x,y) = \begin{cases} \left( \displaystyle\sum_{i=1}^n d_i(x_i, y_i)^p \right)^{1/p}, & 1 \le p < \infty, \\[6pt] \displaystyle\max_{1 \le i \le n} d_i(x_i, y_i), & p = \infty. \end{cases} \tag{5.4}$$
This defines a metric on $X$ called the $p$-metric. If the metric $d_i$ on each
$X_i$ is induced from a norm $\|\cdot\|_{X_i}$, as in (i) above, then the $p$-metric on
$X$ is the metric induced by the $p$-norm (see Exercise 3.34).

Example 5.1.4. Let $(X,d)$ be a metric space. We can create a new metric
on $X$:
$$\rho(x,y) = \frac{d(x,y)}{1 + d(x,y)}, \tag{5.5}$$
where no two points are farther apart than $1$. To show that (5.5) is a metric
on $X$, see Exercise 5.4.
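The metrics of Examples 5.1.3 and 5.1.4 are straightforward to implement. The sketch below, assuming NumPy, defines the discrete metric and the bounded metric $d/(1+d)$ built from the Euclidean metric, and spot-checks the triangle inequality on random points; this is a sanity check, not a proof.

import numpy as np

def discrete_metric(x, y):
    return 0.0 if np.array_equal(x, y) else 1.0

def euclidean(x, y):
    return np.linalg.norm(np.asarray(x) - np.asarray(y))

def bounded_metric(x, y):
    d = euclidean(x, y)
    return d / (1.0 + d)                       # equation (5.5); never exceeds 1

x, y, z = np.random.default_rng(6).normal(size=(3, 4))
for rho in (discrete_metric, euclidean, bounded_metric):
    assert rho(x, y) <= rho(x, z) + rho(z, y)  # triangle inequality holds here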

Remark 5.1.5. Every norm induces a metric space, but not every metric space is
a normed space. For example, a metric can be defined on sets that are not vector
spaces. And even if the underlying space is a vector space, we can install metrics
on it that are not induced by a norm.

5.1.2 Open Sets


Those familiar with single-variable analysis know that open intervals in JR are im-
portant for defining continuous functions, differentiability, and local properties of
functions. Open sets in metric spaces allow us to define similar ideas in an abstract
metric space.
Throughout the remainder of this section, let (X, d) be a metric space.

Definition 5.1.6. For each point $x_0 \in X$ and $r > 0$, define the open ball with
center at $x_0$ and radius $r > 0$ to be the set
$$B(x_0, r) = \{x \in X \mid d(x, x_0) < r\}.$$

Definition 5.1. 7. A subset E C X is a neighborhood of a point x E X if there


exists an open ball B(x, r) C E. In this case we say that x is an interior point of
E . We write E 0 to denote the set of interior points of E.

Definition 5.1.8. A subset E C X is an open set if every point x E E is an


interior point of E.

Example 5.1.9. Both X and 0 are open sets. First, Xis open since B(x, r) c
X for all x E X and for all r > 0. That 0 is open follows vacuously-every
point in 0 satisfies the condition because there are no points in 0.

Example 5.1. 10 . In the Euclidean norm, B(xo, r) looks like a ball, which
is why we call it the "open ball." However, in other metrics open balls can
take on very different shapes. For example, in JR. 3 the open ball in the metric
induced by the 1-norm is an octahedron, and the open ball in the metric
induced by the oo-norm is a cube (see Figure 3.6). In the discrete metric (see
Example 5.l.3(iii)), open balls of radius one or less are just points (singleton
sets); that is, B(x, 1) = {x} for each x EX, while B(x, r) = X for all r > 1.

Example 5.1.11. Another important example is the space $C([0,1];\mathbb{F})$ with
the metric $d(f,g) = \|f - g\|_{L^\infty}$ determined by the sup norm $\|\cdot\|_{L^\infty}$. In this
space the open ball $B(0,1)$ around the zero function is infinite dimensional,
so it is hard to draw, but it consists of all functions that only take on values
with modulus less than $1$ on the interval $[0,1]$. It contains $\sin(x)$ because
$$\|\sin(x) - 0\|_{L^\infty} = \sup_{x \in [0,1]} |\sin(x)| = \sin(1) < \sin(\pi/3) < 1.$$
But it does not contain $\cos(x)$ because
$$\|\cos(x) - 0\|_{L^\infty} = \sup_{x \in [0,1]} |\cos(x)| = 1.$$

Figure 5.1. A ball inside of a ball, as discussed in the proof of
Theorem 5.1.12. Here $d(y,x) = \varepsilon$, and the distance from $y$ to the edge of the ball
is $r - \varepsilon$.
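The membership claims in Example 5.1.11 can be approximated numerically by sampling the sup norm on a grid; this is a discretization, so it only approximates the true supremum.

import numpy as np

t = np.linspace(0.0, 1.0, 10001)
dist_sin = np.max(np.abs(np.sin(t)))      # approximates ||sin - 0||_{L^inf} = sin(1)
dist_cos = np.max(np.abs(np.cos(t)))      # approximates ||cos - 0||_{L^inf} = 1
print(dist_sin < 1.0, np.isclose(dist_cos, 1.0))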

We now prove that the balls defined in Definition 5.1.6 are, in fact, open sets,
as defined in Definition 5.1.8.

Theorem 5.1.12. If $y \in B(x,r)$ for some $x \in X$, then $B(y, r - \varepsilon) \subset B(x,r)$,
where $\varepsilon = d(x,y)$.

Proof. Assume $z \in B(y, r - \varepsilon)$. Thus, $d(z,y) < r - \varepsilon$, so $d(z,y) + \varepsilon < r$. This
implies that $d(z,y) + d(y,x) < r$, which by the triangle inequality yields $d(z,x) < r$,
or equivalently $z \in B(x,r)$ (see Figure 5.1). □

Example 5.1.13.* Consider the metric space $(\mathbb{F}^n, d)$, where $d$ is the Euclidean
metric. Let $e_i$ be the $i$th standard basis vector. Note that $d(e_i, e_j) = \sqrt{2}$
whenever $i \neq j$. We can show that each basis element $e_i$ is contained in
exactly one ball in $\mathscr{C} = \{B(e_i, \sqrt{2})\}_{i=1}^n \cup \{B(-e_i, \sqrt{2})\}_{i=1}^n$ and that each ele-
ment $x$ in the unit ball $B(0,1)$ is contained in at least one of the $2n$ balls in $\mathscr{C}$
(see Figure 5.2 and Exercise 5.7).

Figure 5.2. Illustration of Example 5.1.13 in $\mathbb{R}^2$. Each basis element $e_i$ is
contained in exactly one ball in $\mathscr{C} = \{B(e_i, \sqrt{2})\}_{i=1}^2 \cup \{B(-e_i, \sqrt{2})\}_{i=1}^2$, and each
point in the (red) unit ball $B(0,1)$ is contained in at least one of the 4 balls (dashed)
in $\mathscr{C}$.

By the definition of an open set, any open set can be written as a union of open
balls. We now show that any union of open sets is open and that the intersection
of a finite number of open sets is open.

Theorem 5.1.14. The union of any collection of open sets is open, and the inter-
section of any finite collection of open sets is open.

Proof. We first prove the result for unions. Let $(G_\alpha)_{\alpha \in J}$ be a collection of open
sets $G_\alpha$ indexed by the set $J$. If $x \in \bigcup_{\alpha \in J} G_\alpha$, then $x \in G_\alpha$ for some $\alpha \in J$. Hence,
there exists $\varepsilon > 0$ such that $B(x,\varepsilon) \subset G_\alpha$, and so $B(x,\varepsilon) \subset \bigcup_{\alpha \in J} G_\alpha$. Therefore
$\bigcup_{\alpha \in J} G_\alpha$ is open.
We now prove the result for intersections. Let $(G_k)_{k=1}^n$ be a finite collection
of open sets. If $x \in \bigcap_{k=1}^n G_k$, then for each $k$ there exists $\varepsilon_k > 0$ such that
$B(x,\varepsilon_k) \subset G_k$. Let $\varepsilon = \min\{\varepsilon_1, \ldots, \varepsilon_n\}$, which is positive. Thus, $B(x,\varepsilon) \subset G_k$ for
each $k$, and so $B(x,\varepsilon) \subset \bigcap_{k=1}^n G_k$. It follows that $\bigcap_{k=1}^n G_k$ is open. □

Example 5.1.15. Note that an infinite intersection of open sets need not be
open. As a simple example, consider the following intersection of open sets in
$\mathbb{R}$ with the usual metric:
$$\bigcap_{n=1}^{\infty} \left( -\tfrac{1}{n}, \tfrac{1}{n} \right) = \{0\}.$$
The intersection is just the single point $\{0\}$, which is not an open set in $\mathbb{R}$.

Theorem 5.1.16. The following properties hold for any subset $E$ of $X$:

(i) $(E^\circ)^\circ = E^\circ$, and hence $E^\circ$ is open.

(ii) If $G$ is an open subset of $E$, then $G \subset E^\circ$.

(iii) $E$ is open if and only if $E = E^\circ$.

(iv) $E^\circ$ is the union of all open sets contained in $E$.

Proof.
(i) By definition $(E^\circ)^\circ \subset E^\circ$. Conversely, if $x \in E^\circ$, then there exists an open
ball $B(x,\delta) \subset E$. By Theorem 5.1.12 every point $y \in B(x,\delta)$ is contained in
$B(x,\delta)^\circ \subset E$. Hence, $B(x,\delta) \subset E^\circ$, which implies $x \in (E^\circ)^\circ$.
(ii) Assume $G$ is an open subset of $E$. If $x \in G$, then there exists $\varepsilon > 0$ such that
$B(x,\varepsilon) \subset G$, which implies that $B(x,\varepsilon) \subset E$; it follows that $x \in E^\circ$. Thus,
$G \subset E^\circ$.

(iii) See Exercise 5.6.

(iv) Let $(G_\alpha)_{\alpha \in J}$ be the collection of all open sets contained in $E$. By (ii), we have
that $G_\alpha \subset E^\circ$ for all $\alpha \in J$. Thus, $\bigcup_{\alpha \in J} G_\alpha \subset E^\circ$. On the other hand, if
$x \in E^\circ$, then by definition it is contained in an open ball $B(x,\varepsilon) \subset E$, which
by Theorem 5.1.12 is an open subset of $E$. □

5.2 Continuous Functions and Limits


The concept of a continuous function is fundamental in mathematics. There are
many different ways to define continuous functions on metric spaces. We give one
of these definitions and then show that it agrees with some of the others, including
the familiar definition in terms of limits (see Theorem 5.2.9) .
Throughout t his section let (X, d) and (Y, p) be metric spaces.

5.2.1 Continuous Functions

Definition 5.2.1. A function $f : X \to Y$ is continuous at a point $x_0 \in X$ if for
all $\varepsilon > 0$ there exists $\delta > 0$ such that $\rho(f(x), f(x_0)) < \varepsilon$ whenever $d(x, x_0) < \delta$. A
function $f : X \to Y$ is continuous on a subset $E \subset X$ if it is continuous at each
$x_0 \in E$. The set of continuous functions from $X$ to $Y$ is denoted $C(X;Y)$.

Example 5.2.2. We have the following examples of continuous functions:

(i) Let $f : \mathbb{R}^2 \to \mathbb{R}$ be given by $f(x,y) = |x - y|$. We show that $f$ is
continuous at $(0,0)$. Note that
$$|f(x,y) - f(0,0)| = |x - y| \le |x| + |y| \le 2\|(x,y)\|_2 = 2\, d((x,y), 0).$$
Setting $\delta = \varepsilon/2$, we have $|f(x,y) - f(0,0)| < \varepsilon$ whenever $\|(x,y)\|_2 < \delta$.

(ii) & If $(V, \|\cdot\|_V)$ and $(W, \|\cdot\|_W)$ are normed linear spaces, then every
bounded$^a$ linear transformation $T \in \mathscr{B}(V,W)$ is continuous at each $x_0 \in
V$. Given $\varepsilon > 0$, we set $\delta = \varepsilon/(1 + \|T\|_{V,W})$. Hence,
$$\|T(x) - T(x_0)\|_W = \|T(x - x_0)\|_W \le \|T\|_{V,W} \|x - x_0\|_V < \varepsilon$$
whenever $\|x - x_0\|_V < \delta$. In other words, $\mathscr{B}(V;W) \subset C(V;W)$. On the
other hand, an unbounded linear transformation is nowhere continuous.
In other words, it is not continuous at any point of its domain (see
Exercise 5.21).

(iii) & A function $f : X \to Y$ is Lipschitz continuous (or just Lipschitz for
short) if there exists $K > 0$ such that $\rho(f(x_1), f(x_2)) \le K\, d(x_1, x_2)$ for
all $x_1, x_2 \in X$. Every Lipschitz continuous function is continuous on all
of $X$ (set $\delta = \varepsilon/K$).

(iv) & For fixed $x_0 \in X$, the map $f : X \to \mathbb{R}$ given by $f(x) = d(x, x_0)$ is
continuous at each $x \in X$. To see this, for any $\varepsilon > 0$ set $\delta = \varepsilon$, and thus
$$|f(x) - f(y)| = |d(x, x_0) - d(y, x_0)| \le d(x,y) < \varepsilon$$
whenever $d(x,y) < \delta$. Note that this is similar to the solution of
Exercise 3.16.

(v) Consider the space $(\mathbb{F}^n, d)$, where $d$ is the Euclidean metric $d(x,y) =
\|x - y\|_2$. For each $k$ the $k$th projection map $\pi_k : \mathbb{F}^n \to \mathbb{F}$, defined in
Example 2.1.3(i), is continuous at each $x = (x_1, x_2, \ldots, x_n) \in \mathbb{F}^n$. Given
any $\varepsilon$, setting $\delta = \varepsilon$ gives $|\pi_k(x) - \pi_k(y)| = |x_k - y_k| \le d(x,y) < \varepsilon$
whenever $d(x,y) < \delta$. Note that this is a bounded linear transformation
and thus is a special case of (ii).

$^a$See Definition 3.5.10.
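Example 5.2.2(ii) and (iii) suggest a simple numerical experiment: for a matrix, viewed as a bounded linear map, the ratio $\|Tx - Tx_0\|/\|x - x_0\|$ never exceeds the operator 2-norm, so the map is Lipschitz. A sketch with random data:

import numpy as np

rng = np.random.default_rng(7)
T = rng.normal(size=(3, 5))
opnorm = np.linalg.norm(T, 2)               # operator 2-norm of T

x0 = rng.normal(size=5)
ratios = []
for _ in range(1000):
    x = x0 + rng.normal(size=5)
    ratios.append(np.linalg.norm(T @ x - T @ x0) / np.linalg.norm(x - x0))
print(max(ratios) <= opnorm + 1e-12)        # Lipschitz constant bounded by ||T||_2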

5.2.2 Alternative Definitions of Continuity


The next theorem is a very important result, giving an alternative definition of
continuity on a set in terms of the funct ion (or rather its preimage) preserving
open sets. This tells us essentially that continuous functions are to the study of
open sets (the subject called topology) what linear transformations are to linear
algebra. A general philosophy in modern mathematics is that a collection of sets
with some special structure (like vector spaces, or metric spaces) is best understood
by studying the functions that preserve the special structure. Continuous functions
are those functions for spaces with open sets (called topological spaces).

Theorem 5.2.3. A function f : X ---+ Y is continuous on X if and only if the


preimage f- 1(U) of every open set Uc Y is open in X .

Nota Bene 5.2.4. Recall that the notation $f^{-1}(U)$ does not mean that $f$
has an inverse. The set $f^{-1}(U) = \{x \in X \mid f(x) \in U\}$ always exists (but may
be empty), even if $f$ has no inverse.

Proof. ($\Rightarrow$) Assume that $f$ is continuous on $X$ and $U \subset Y$ is open. For $x_0 \in
f^{-1}(U)$, choose $\varepsilon > 0$ so that $B(f(x_0), \varepsilon) \subset U$. Since $f$ is continuous at $x_0$, there
exists $\delta > 0$ such that $f(x) \in B(f(x_0), \varepsilon)$ whenever $x \in B(x_0, \delta)$, or, in other
words, $f(B(x_0,\delta)) \subset B(f(x_0), \varepsilon)$. Hence, $B(x_0, \delta) \subset f^{-1}(U)$. Since this holds for
all $x_0 \in f^{-1}(U)$, we have that $f^{-1}(U)$ is open.
($\Leftarrow$) Let $\varepsilon > 0$ be given. For any $x_0 \in X$, the ball $B(f(x_0), \varepsilon)$ is open, and
by hypothesis, so is its preimage $f^{-1}(B(f(x_0), \varepsilon))$. Thus, there exists $\delta > 0$ such
that $B(x_0, \delta) \subset f^{-1}(B(f(x_0), \varepsilon))$, and hence $f$ is continuous at $x_0 \in X$. □

The following is an easy consequence of Theorem 5.2.3.

Corollary 5.2.5. Compositions of continuous maps are continuous; that is, if


f : (X , d) ---+ (Y, p) and g : (Y, p) ---+ (Z, T) are continuous mappings on X and Y,
respectively, then h =go f: (X, d)---+ (Z, T) is continuous.

Proof. If UC Z is open, then g- 1 (U) is open in Y, which implies that f - 1 (g- 1 (U))
is open in X. Hence, h- 1 (U) is open whenever U is open. D

Theorem 5.2.3 suggests yet another alternative definition of continuity at


a point.

Proposition 5.2.6. A function f : X ---+ Y is continuous at xo E X if and only


if for every E: > 0 there exists 6 > 0 such that f (B(xo, 6)) C B(f (xo), c).

Proof. This follows from the definition of continuity by just writing out the
definition of the various open balls. More precisely, x E B(xo, 6) if and only if
d(x, x 0 ) < 6; and f(x) E B(f(xo) , c) if and only if p(f(x), f(xo)) < E:. D

Proposition 5.2.7. Consider the space $\mathbb{F}^n$ with the metric $d_p$ induced by the
$p$-norm for some fixed $p \in [1,\infty]$. Let $\|\cdot\|$ be any norm on $\mathbb{F}^n$. The function
$f : \mathbb{F}^n \to \mathbb{R}$ given by $f(x) = \|x\|$ is continuous with respect to the metric $d_p$. In
particular, it is continuous with respect to the usual metric ($p = 2$).

Proof. Let $M = \max(\|e_1\|, \ldots, \|e_n\|)$, where $e_i$ is the $i$th standard basis vector.
Let
$$q = \begin{cases} \frac{p}{p-1}, & p \in (1,\infty], \\ \infty, & p = 1. \end{cases}$$
Note that $\frac{1}{p} + \frac{1}{q} = 1$, and for any $z = (z_1, \ldots, z_n) = \sum_{i=1}^n z_i e_i$, the triangle
inequality together with Hölder's inequality (Corollary 3.6.4) gives
$$f(z) = \|z\| \le \sum_{i=1}^n |z_i| \|e_i\| \le M \sum_{i=1}^n |z_i| \le n^{1/q} M \|z\|_p.$$
Therefore, for any $\varepsilon > 0$ we have
$$d(f(x), f(y)) = \big|\, \|x\| - \|y\| \,\big| \le \|x - y\| \le n^{1/q} M \|x - y\|_p < \varepsilon$$
whenever $d_p(x,y) < \varepsilon/(n^{1/q} M)$. This shows $f$ is continuous with respect
to $d_p$. □

5.2.3 Limits of Functions


The ideas of the limit of a function and the limit of a sequence are extremely
powerful. In this section we define a limit of a function on an arbitrary metric space.

Definition 5.2.8. Let $f : X \to Y$ be a function. The point $y_0 \in Y$ is called the
limit of $f$ at $x_0 \in X$ if for all $\varepsilon > 0$ there exists $\delta > 0$ such that
$\rho(f(x), y_0) < \varepsilon$ whenever $0 < d(x, x_0) < \delta$. We denote the limit as $\lim_{x \to x_0} f(x) = y_0$.

Theorem 5.2.9. A function f : X -+ Y is continuous at xo E X if and only if


limx-+xo f(x) = f(xo).

Proof. This is immediate from Definition 5.2.1. D

Example 5.2.10. The function
$$f(x,y) = \begin{cases} \dfrac{x^2 y^3}{(x^2 + y^2)^2}, & (x,y) \neq (0,0), \\ 0, & (x,y) = (0,0), \end{cases}$$
is continuous at zero. To see this, note that $|x| \le (x^2 + y^2)^{1/2}$ and $|y| \le (x^2 +
y^2)^{1/2}$, and thus $|x^2 y^3| \le (x^2 + y^2)^{5/2}$. This gives $|f(x,y) - 0| \le (x^2 + y^2)^{1/2}$,
and thus $\lim_{(x,y) \to (0,0)} f(x,y) = 0$.

We conclude this subsection by showing that sums and products of continuous


functions are also continuous.

Proposition 5.2.11. If V is a vector space, and if f: X-+ V and g: X-+ V are


continuous, then so is the sum f + g : X -+ V. If h : X -+ lF is continuous, then
the scalar product hf : X -+ V is also continuous.

Proof. We prove the case of the product $hf$. The proof of the sum is similar and
is left to the reader.
Let $\varepsilon > 0$ be given. Since $f$ is continuous at $x_0$, there exists $\delta_1 > 0$ so that
$\|f(x) - f(x_0)\| < \frac{\varepsilon}{2(|h(x_0)| + 1)}$ whenever $d(x, x_0) < \delta_1$. Since $h$ is continuous, choose
$\delta_2 > 0$ so that $|h(x) - h(x_0)| < \min\!\big(1, \frac{\varepsilon}{2(\|f(x_0)\| + 1)}\big)$ whenever $d(x, x_0) < \delta_2$. Note
that $|h(x) - h(x_0)| < 1$ implies $|h(x)| < |h(x_0)| + 1$, so we have
$$\|(hf)(x) - (hf)(x_0)\| \le \|h(x) f(x) - h(x) f(x_0) + h(x) f(x_0) - h(x_0) f(x_0)\|
\le \|f(x) - f(x_0)\|\, |h(x)| + \|f(x_0)\|\, |h(x) - h(x_0)|
< \|f(x) - f(x_0)\| (|h(x_0)| + 1) + \|f(x_0)\|\, |h(x) - h(x_0)|
< \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon$$
whenever $d(x, x_0) < \delta = \min\{\delta_1, \delta_2\}$. □

5.2.4 Limit Points of Sets


We now define the important concept of a limit point of a set.

Definition 5.2.12. A point p E X is a limit point of a set E C X if every


neighborhood of p intersects E "\ {p} .

Nota Bene 5.2.13. Despite the misleading name, limit points of a set are
not the same as the limit of a function. In the following section we also define
limits of a sequence, which is yet another concept with almost the same name.
We would prefer to use very distinct names for these very distinct ideas, but
in order to communicate with other analysts, you need to know them by the
standard (and sometimes confusing) names.

Example 5.2.14. Any point on the disk D = {(x, y) I x 2 + y 2 :::; 1} is a limit


point of t he open ball B(O , 1) in JR 2, with the Euclidean metric.

Definition 5.2.15. If E c X and p E E is not a limit point of E, we say that p


is an isolated point of E.

Example 5.2.16.

(i) For any subset E of a space with the discrete metric, each point is an
isolated point of E.
(ii) Each element of the set Z x Z c JR x JR is an isolated point of Z x Z. For
any p = (m,n) E Z x Z, it is easy to see that B(p, 1) "\ {p} does not
intersect Z x z.

Definition 5.2.17. Let EC X . We say that Eis dense in X if every point in X


is either in E or is a limit point of E.

Example 5 .2 .18. The set Q x Q is dense in JR 2 . It is easy to see that this


generalizes: Qn is dense in ]Rn.

Theorem 5.2.19. If p is a limit point of E c X, then every neighborhood of p


contains infinitely many points of E.

Proof. Suppose there is a neighborhood $N$ of $p$ that contains only a finite number
of elements of $E \setminus \{p\}$, say $\{x_1, \ldots, x_n\}$. If $\varepsilon = \min_k d(p, x_k)$, then $B(p,\varepsilon)$ does
not intersect $E \setminus \{p\}$. This contradicts the fact that $p$ is a limit point of $E$. Thus,
$N \cap (E \setminus \{p\})$ contains infinitely many points. □

Remark 5.2 .20. An immediate consequence of the theorem is that a finite set has
no limit points.

5.3 Closed Sets, Sequences, and Convergence


In this section we generalize two more ideas from single-variable analysis to general
metric spaces. These are closed sets and convergent sequences.
Throughout this section let (X, d) and (Y, p) be metric spaces.

5.3.1 Closed Sets

Definition 5.3.1. A set F C X is closed if it contains all of its limit points.

Theorem 5 .3.2. A set U C X is open if and only if its complement uc is closed.

Proof. (===?) Assume that U is open. For every x E U there exists an E > 0 such
that B(x , E) c U, which implies that B( x , E) n uc is empty. Thus, no x E U can be
a limit point of uc. In other words, uc contains all its limit points. Therefore, uc
is closed.
(<==) Conversely, assume that uc is closed and x E U. Since x cannot be a
limit point of uc, there exists E > 0 such that B(x,E) c U. Thus, U is open. 0

Example 5.3.3. The disk D = {(x, y) I x 2 + y2 :=:: 1} is a closed subset of


JR 2 because it contains all its limit points (see Example 5.2.14). Moreover, it
is easy to see that its complement is open. The open ball B(O, 1) = {(x,y) I
x 2 + y 2 < 1} is not closed because its complement is not open.

Nota Bene 5.3.4. Open and closed are not "opposite" properties. A set can
be both open and closed at the same t ime. A set can also be neit her open
nor closed .
For example, we already know t hat 0 and X are open , and by t he previ-
ous theorem both 0 and X are also closed sets, since X = 0c and 0 = x c. In
t he discrete metric every set is b oth open and closed. The interval [O, 1) c IR
in t he usual metric is neit her open nor closed.

Corollary 5.3.5. The intersection of any collection of closed sets is closed, and
the union of a finite collection of closed sets is closed.

Proof. These follow from Theorems 5.1.14 and 5.3.2, via De Morgan's laws (see
Proposition A. l.11). D

Remark 5.3.6. Note that the rules for intersection and union of closed set in
Corollary 5.3.5 are the "opposite" of those for open sets given in Theorem 5.1.14.

Corollary 5.3. 7. Let (X, d) and (Y, p) be metric spaces. A function f : X-+ Y is
continuous if and only if for each closed set F C Y the preimage 1- 1 (F) is closed
in X.

Proof. This follows from Theorem 5.2.3 and the fact that 1- 1 (Ec) = f- 1 (E)c; see
Proposition A.2.4. D

Example 5.3.8.* Here are more examples of closed sets.

(i) The closed ball centered at $x_0$ with radius $r$ is the set $D(x_0, r) = \{x \in
X \mid d(x, x_0) \le r\}$. This set is closed by Corollary 5.3.7 because it
is the preimage of the closed interval $[0,r]$ under the continuous map
$f(x) = d(x, x_0)$; see Example 5.2.2(iv).
(ii) Singleton sets are always closed, as are finite sets by Remark 5.2.20.
Note that a singleton set $\{x\}$ can also be written as the intersection of
closed balls $\bigcap_{n=1}^{\infty} D(x, \tfrac{1}{n}) = \{x\}$.
(iii) The unit circle $S^1 \subset \mathbb{R}^2$ is closed. Note that $f(x,y) = x^2 + y^2$ is
continuous and the set $\{1\} \subset \mathbb{R}$ is closed, so the set $f^{-1}(\{1\}) = \{(x,y) \mid
x^2 + y^2 = 1\}$ is closed. In fact, for any continuous $f : X \to \mathbb{R}$, any set
of the form $f^{-1}(\{c\}) \subset X$ is closed.$^a$

$^a$The sets $f^{-1}(\{c\})$ are called level sets because if we consider the graph $\{(x,y,f(x,y)) \mid
(x,y) \in \mathbb{R}^2\}$ of a function $f : \mathbb{R}^2 \to \mathbb{R}$, then the set $f^{-1}(\{c\})$ is the set of all points of
$\mathbb{R}^2$ that map to a point of height (level) $c$ on the graph. Contour lines on a topographic
map are level sets of the function that sends each point on the surface of the earth to
its altitude.

5.3.2 The Closure of a Set


Definition 5.3.9. The closure of $E$, denoted $\overline{E}$, is the set $E$ together with its
limit points. We define the boundary of $E$, denoted $\partial E$, as the closure minus the
interior, that is, $\partial E = \overline{E} \setminus E^\circ$.

Remark 5.3.10. A set $E \subset X$ is dense in $X$ if and only if $\overline{E} = X$.

Theorem 5.3.11. The following properties hold for any $E \subset X$:

(i) $\overline{E}$ is closed.

(ii) If $F \subset X$ is closed and $E \subset F$, then $\overline{E} \subset F$; thus $\overline{E}$ is the smallest closed set
containing $E$.

(iii) $E$ is closed if and only if $E = \overline{E}$.

(iv) $\overline{E}$ is the intersection of all closed sets containing $E$.

Proof.

(i) It suffices to show that $\overline{E}^{\,c}$ is open. If we denote the set of limit points of $E$ by
$E'$, then for any $p \in \overline{E}^{\,c} = (E \cup E')^c = E^c \cap (E')^c$, there exists an $\varepsilon > 0$ such
that $B(p,\varepsilon) \setminus \{p\} \subset E^c$. Combining this with the fact that $p \in E^c$ gives
$B(p,\varepsilon) \subset E^c$. Moreover, if there exists $q \in E' \cap B(p,\varepsilon)$, then $B(p,\varepsilon)$ is a
neighborhood of $q$ and therefore must contain a point of $E$, a contradiction.
Therefore $B(p,\varepsilon) \subset (E')^c$, which implies that $\overline{E}^{\,c}$ is open.

(ii) It suffices to show that $E' \subset F'$. If $p \in E'$, then for all $\varepsilon > 0$, we have that
$B(p,\varepsilon) \cap (E \setminus \{p\}) \neq \varnothing$, which implies that $B(p,\varepsilon) \cap (F \setminus \{p\}) \neq \varnothing$ for all
$\varepsilon > 0$. Thus, $p \in F'$.

(iii) If $E$ is closed, then $\overline{E} \subset E$ by (ii), which implies that $E = \overline{E}$. Conversely, if
$E = \overline{E}$, then $E$ is closed by (i).

(iv) Let $F$ be the intersection of all closed sets containing $E$. By (ii), we have
$\overline{E} \subset F$, since every closed set that contains $E$ also contains $\overline{E}$. By (i), we
have $F \subset \overline{E}$, since $\overline{E}$ is a closed set containing $E$. Thus, $\overline{E} = F$. □

Example 5.3.12. Let $x \in X$ and $r \ge 0$. Since $D(x,r)$ is closed, we have that
$\overline{B(x,r)} \subset D(x,r)$. If $X = \mathbb{F}^n$, then $\overline{B(x,r)} = D(x,r)$. However, in the dis-
crete metric, we have that $\overline{B(x,1)} = \{x\}$, whereas $D(x,1)$ is the entire space.

5.3.3 Sequences and Convergence


Sequences are a fundamental tool in analysis. They are most useful when they
converge to a limit.

Definition 5.3.13. A sequence is a function $f : \mathbb{N} \to X$. We often write individual
elements of the sequence as $x_n = f(n)$ and the entire sequence as $(x_i)_{i=0}^\infty$.

Definition 5.3.14. We say that $x \in X$ is a limit of the sequence $(x_i)_{i=0}^\infty$ if for all
$\varepsilon > 0$ there exists $N > 0$ such that $d(x, x_n) < \varepsilon$ whenever $n \ge N$. We write $x_k \to x$
or $\lim_{k\to\infty} x_k = x$ and say that the sequence converges to $x$.

Remark 5.3.15. By definition, a sequence $(x_i)_{i=0}^\infty$ converges to $x$ if and only if
$\lim_{n\to\infty} d(x_n, x) = 0$. If the metric is defined in terms of a norm, then the sequence
converges to $x$ if and only if $\lim_{n\to\infty} \|x_n - x\| = 0$.

Example 5.3.16.

(i) Consider the sequence $\big(\big(\tfrac{1}{n}, \tfrac{n}{n+1}\big)\big)_{n=1}^\infty$ in the space $\mathbb{R}^2$. We prove that the
sequence converges to the point $(0,1)$. Given $\varepsilon > 0$, choose $N$ so that
$\frac{\sqrt{2}}{N} < \varepsilon$. Thus, whenever $n \ge N$, we have
$$\sqrt{\frac{1}{n^2} + \frac{1}{(n+1)^2}} < \frac{\sqrt{2}}{n} < \varepsilon.$$

(ii) & Let $(f_n)_{n=1}^\infty$ be the sequence of functions in $C([0,1];\mathbb{R})$ given by
$f_n(t) = t^{1/n}$, and let $f$ be the constant function $f(t) = 1$. We show that
$\|f - f_n\|_{L^2} \to 0$, so in the $L^2$-norm we have $f_n \to f$; but $\|f - f_n\|_{L^\infty} = 1$
for all $n \in \mathbb{N}$, and so $f_n$ does not converge to $f$ in the $L^\infty$-norm.
As $n \to \infty$, we have
$$\|f - f_n\|_{L^2}^2 = \int_0^1 (1 - t^{1/n})^2\,dt = 1 - \frac{2n}{n+1} + \frac{n}{n+2} \longrightarrow 0.$$
In contrast, we also have
$$\|f - f_n\|_{L^\infty} = \sup_{t \in [0,1]} |f(t) - f_n(t)| = \sup_{t \in [0,1]} |1 - t^{1/n}| = 1.$$
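The two convergence claims in Example 5.3.16(ii) can be checked numerically; the $L^2$ norm is approximated by a Riemann sum on a grid, so this is an illustration rather than a proof.

import numpy as np

t = np.linspace(0.0, 1.0, 100001)
dt = t[1] - t[0]
for n in (1, 10, 100, 1000):
    diff = 1.0 - t ** (1.0 / n)
    l2 = np.sqrt(np.sum(diff ** 2) * dt)    # approximates ||f - f_n||_{L^2}
    linf = np.max(np.abs(diff))             # approximates ||f - f_n||_{L^inf}
    print(n, l2, linf)                      # l2 -> 0 while linf stays at 1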

Proposition 5.3.17. If a sequence has a limit, it is unique.

Proof. Let $(x_n)_{n=0}^\infty$ be a sequence in $X$. Suppose that $x_n \to x$ and also that
$x_n \to y \neq x$. For $\varepsilon = d(x,y)$, there exists $N > 0$ so that $d(x_n, x) < \tfrac{\varepsilon}{2}$ whenever
$n \ge N$. Similarly, there exists $m > N$ with $d(x_m, y) < \tfrac{\varepsilon}{2}$. However, this implies
$d(x,y) \le d(x, x_m) + d(x_m, y) < \tfrac{\varepsilon}{2} + \tfrac{\varepsilon}{2} = \varepsilon = d(x,y)$, which is a contradiction. □

Nota Bene 5.3.18. Limits and limit points are fundamentally different
things. Sequences have limits, and sets have limit points. The limit of a
sequence, if it exists at all, is unique but need not be a limit point of the set
$\{x_1, x_2, \ldots\}$ of terms in the sequence. Conversely, a limit point of the set
$\{x_1, x_2, \ldots\}$ of terms in the sequence need not be the limit of the sequence.
The examples below illustrate these differences.

(i) The sequence
$$x_n = \begin{cases} 1/n & \text{if } n \text{ even}, \\ 1 - 1/n & \text{if } n \text{ odd} \end{cases}$$
does not converge to any limit. Both of the points $1$ and $0$ are limit
points of the set $\{x_0, x_1, x_2, \ldots\}$.
(ii) The sequence $2, 2, 2, \ldots$ (that is, the sequence $x_n = 2$ for every $n$) con-
verges to the limit $2$, but the set $\{x_0, x_1, x_2, \ldots\} = \{2\}$ has no limit
points.

Theorem 5.3.19. A function $f : X \to Y$ is continuous at a point $x^* \in X$ if
and only if, for each sequence $(x_k)_{k=0}^\infty \subset X$ that converges to $x^*$, the sequence
$(f(x_k))_{k=0}^\infty \subset Y$ converges to $f(x^*) \in Y$.

Proof. ($\Rightarrow$) Assume that $f$ is continuous at $x^* \in X$. Thus, given $\varepsilon > 0$ there exists
a $\delta > 0$ such that $\rho(f(x), f(x^*)) < \varepsilon$ whenever $d(x, x^*) < \delta$. Since $x_k \to x^*$, there
exists $N > 0$ such that $d(x_n, x^*) < \delta$ whenever $n \ge N$. Thus, $\rho(f(x_n), f(x^*)) < \varepsilon$
whenever $n \ge N$, and therefore $(f(x_k))_{k=0}^\infty$ converges to $f(x^*)$.
($\Leftarrow$) If $f$ is not continuous at $x^* \in X$, then there exists $\varepsilon > 0$ such that for
each $\delta > 0$ there exists $x \in B(x^*, \delta)$ with $\rho(f(x), f(x^*)) \ge \varepsilon$. For each $k \in \mathbb{Z}^+$,
choose $x_k \in B(x^*, \tfrac{1}{k})$ with $\rho(f(x_k), f(x^*)) \ge \varepsilon$. This implies that $x_k \to x^*$, but
$(f(x_k))_{k=0}^\infty$ does not converge to $f(x^*)$ because $\rho(f(x_k), f(x^*)) \ge \varepsilon$. □

Example 5.3.20. We can often use continuous functions to prove that a se-
quence converges. For example, the sequence $x_n = \sin\left(\frac{1}{n}\right)$ converges to zero,
since $\frac{1}{n} \to 0$ as $n \to \infty$ and the sine function is continuous and equal to
zero at zero.

Unexample 5.3.21.

(i) The function $f : \mathbb{R}^2 \to \mathbb{R}$ given by
$$f(x,y) = \begin{cases} \dfrac{xy}{x^2 + y^2} & \text{if } (x,y) \ne (0,0),\\[1ex] 0 & \text{if } (x,y) = (0,0) \end{cases}$$
is not continuous at the origin because if $x_n = (1/n, 1/n)$ for every
$n \in \mathbb{Z}^+$, we have $f(x_n) = 1/2$ for every $n$, but $f(\lim_{n\to\infty} x_n) = 0$.

(ii) ⚠ The derivative map $D(f) = f'$ on the set $\mathbb{F}[x]$ of polynomials is
not continuous in the $L^\infty$-norm on $C([0,1];\mathbb{F})$. For each $n \in \mathbb{Z}^+$ let
$f_n(x) = \frac{x^n}{n}$. Note that $\|f_n\|_{L^\infty} = \frac{1}{n}$; therefore, $f_n \to 0$. And yet
$\|D(f_n)\|_{L^\infty} = 1$, so $D(f_n)$ does not converge to $D(0) = 0$. Since $D$ does
not preserve limits, it is not continuous at the origin. See Exercise 5.21
for a generalization.
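The failure in (ii) is easy to see numerically. The following Python sketch (an illustration added here, not from the text) evaluates $f_n$ and its derivative on a grid:

import numpy as np

# Numerical sketch: for f_n(x) = x**n / n the sup norm of f_n shrinks to 0,
# yet the sup norm of the derivative D(f_n) = x**(n-1) stays equal to 1.
x = np.linspace(0.0, 1.0, 100_001)
for n in [1, 10, 100, 1000]:
    fn = x**n / n
    dfn = x**(n - 1)                     # derivative of x**n / n
    print(n, np.max(np.abs(fn)), np.max(np.abs(dfn)))   # first -> 0, second = 1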

Corollary 5.3.22. For any $x, y \in X$ and any sequence $(x_n)_{n=0}^{\infty}$ in $X$ converging
to $x$, we have
$$\lim_{n\to\infty} d(x_n, y) = d\left(\lim_{n\to\infty} x_n,\, y\right) = d(x,y),$$
or, equivalently, if the metric is $d(x,y) = \|x - y\|$ for some norm $\|\cdot\|$, then
$$\lim_{n\to\infty} \|x_n - y\| = \left\| \lim_{n\to\infty} x_n - y \right\| = \|x - y\|. \tag{5.6}$$

Proof. The function f(x) = d(x, y) is continuous (see Example 5.2.2 (iv)), so the
result follows from Theorem 5.3.19. D

5.4 Completeness and Uniform Continuity


Convergent sequences are very important, but in order to use the definition of
convergence, you need to know the limit of the sequence. A useful way to get around
this problem is t o observe that all convergent sequences are also Cauchy sequences.
In many situations this allows us to identify convergent sequences without knowing
their limits. In some sense, Cauchy sequences are those that should converge, even
if they have no limit. Metric spaces in which all Cauchy sequences converge are
especially useful- these are called complete spaces.
Unfortunately, not all continuous functions preserve Cauchy sequences-some
continuous functions map some Cauchy sequences into sequences that are not
Cauchy. So we need a stronger form of continuity called uniform continuity.
In this section we define and discuss Cauchy sequences, completeness, and
uniform continuity. We also show that uniformly continuous functions preserve
Cauchy sequences, whereas continuous functions may not . We conclude the section
by proving t hat lFn is complete.
Throughout this section assume that (X, d) and (Y, p) are metric spaces.

5.4.1 Cauchy Sequences and Completeness

Definition 5.4.1. A sequence $(x_i)_{i=1}^{\infty}$ in $X$ is a Cauchy sequence if for all $\varepsilon > 0$
there exists an $N > 0$ such that $d(x_m, x_n) < \varepsilon$ whenever $m, n \ge N$.

Example 5.4.2.

(i) The sequence $(x_n)_{n=1}^{\infty}$ given by $x_n = 1/n$ in $\mathbb{R}$ is a Cauchy sequence:
given $\varepsilon > 0$, choose $N > 1/\varepsilon$. If $n \ge m \ge N$, then
$$|1/m - 1/n| = \left|\frac{n-m}{mn}\right| < \left|\frac{n}{mn}\right| = \left|\frac{1}{m}\right| \le \frac{1}{N} < \varepsilon.$$

(ii) Fix $b \in (0,1) \subset \mathbb{R}$. The sequence $(f_n(x))_{n=0}^{\infty}$ given by $f_n(x) = x^n$ is
Cauchy in the space $C([0,b];\mathbb{R})$ of continuous functions on $[0,b]$ with
the metric induced by the sup norm. To see this, for any $\varepsilon > 0$ let
$N > \log_b(\varepsilon)$. If $n \ge m \ge N$, then for every $x \in [0,b]$ we have
$|f_m(x) - f_n(x)| \le |x^m| \le b^m < \varepsilon$.

Unexample 5.4.3.

(i) Even when the difference $x_n - x_{n-1}$ goes to zero, the sequence $(x_n)_{n=1}^{\infty}$
need not be Cauchy. For example, the sequence given by $x_n = \log(n)$
in $\mathbb{R}$ satisfies $x_n - x_{n-1} = \log(n/(n-1)) = \log(1 + 1/(n-1)) \to 0$,
but the sequence is not Cauchy because for any $m$ we may take $n = km$
and then $|x_n - x_m| = \log(n/m) = \log(k)$. This difference can be made
arbitrarily large, and thus the sequence does not satisfy the Cauchy criterion.

(ii) The sequence given by $f_n(x) = nx^n$ is not a Cauchy sequence in the
space $C([0,1];\mathbb{R})$ of continuous functions on $[0,1]$ with the metric
induced by the sup norm. To see this, we note that
$$\|f_m - f_n\|_{L^\infty} = \sup_{x\in[0,1]} |f_m(x) - f_n(x)| \ge |f_m(1) - f_n(1)| = |m - n|,$$
which cannot be made small by requiring $n$ and $m$ to be large.
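A brief numerical illustration of (i) (a Python sketch added here, not from the text): consecutive differences of $\log(n)$ vanish, but terms far apart stay $\log 2$ apart.

import numpy as np

# Numerical sketch: x_n = log(n) has consecutive differences tending to 0,
# yet x_{2m} - x_m = log(2) no matter how large m is, so it is not Cauchy.
for m in [10, 10**3, 10**6]:
    print(m, np.log(m + 1) - np.log(m), np.log(2 * m) - np.log(m))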

Proposition 5.4.4. Any sequence that converges is a Cauchy sequence.

Proof. Assume $(x_i)_{i=0}^{\infty}$ in $X$ converges to some $x \in X$. Given $\varepsilon > 0$, there exists
an $N > 0$ such that $d(x_n, x) < \varepsilon/2$ whenever $n \ge N$. Hence, we have
$$d(x_m, x_n) \le d(x_m, x) + d(x_n, x) < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon$$
whenever $m, n \ge N$. □

Nota Bene 5.4.5. Not all Cauchy sequences are convergent. For example,
in the space $X = \mathbb{R} \setminus \{0\}$ with the usual metric $d(x,y) = |x-y|$, the sequence
$(1/n)_{n=1}^{\infty}$ is Cauchy, and it does converge in $\mathbb{R}$, but it does not converge in $X$
because its limit 0 is not in $X$.
Similarly, in the space $\mathbb{Q}$, the sequence $((1 + 1/n)^n)_{n=1}^{\infty}$ is Cauchy and
it converges in $\mathbb{R}$ to $e$, but it does not converge in $\mathbb{Q}$ because $e \notin \mathbb{Q}$.
We show later that all Cauchy sequences in $\mathbb{R}$ converge, but there are
many other important spaces where not all Cauchy sequences converge.

Definition 5.4.6. We say that a set $E \subset X$ is bounded if for every $x \in X$ there
is a positive real number $M$ such that $d(p, x) < M$ for every $p \in E$.

Proposition 5.4.7. Cauchy sequences are bounded.

Proof. Let $(x_k)_{k=0}^{\infty}$ be a Cauchy sequence, and choose $\varepsilon = 1$. Thus, there exists
$N > 0$ such that $d(x_n, x_m) < 1$ whenever $m, n \ge N$. Hence, for any fixed $x \in X$
and any $m \ge N$, we have that
$$d(x_n, x) \le d(x_n, x_m) + d(x_m, x) \le 1 + d(x_m, x)$$
whenever $n \ge N$. Setting
$$M = \max\{d(x_0, x), \ldots, d(x_{N-1}, x), 1 + d(x_m, x)\}$$
gives $d(x_k, x) \le M$ for all $k \in \mathbb{N}$. □

We now need the idea of a subsequence, which is, as the name suggests, a
sequence consisting of some, but not necessarily all, the elements of the original
sequence. Here is a careful definition.

Definition 5.4.8. A subsequence of a sequence $(x_n)_{n=0}^{\infty}$ is a sequence of the form
$(x_{n_i})_{i=0}^{\infty}$, where $n_0 < n_1 < n_2 < \cdots$ are nonnegative integers.

Proposition 5.4.9. Any Cauchy sequence that has a convergent subsequence is
convergent.

Proof. Let $(x_n)_{n=0}^{\infty}$ be a Cauchy sequence. If $(x_{n_j})_{j=1}^{\infty}$ is a subsequence that
converges to $x$, then we claim that $(x_n)_{n=0}^{\infty}$ converges to $x$. To see this, for any
$\varepsilon > 0$ choose $N$ such that $d(x_n, x_m) < \varepsilon/2$ whenever $m, n > N$, and choose $J$ such
that $d(x_{n_j}, x) < \varepsilon/2$ whenever $j > J$. For any $m > N$, choose a $j > J$ such that
$n_j > N$. We have
$$d(x_m, x) \le d(x_m, x_{n_j}) + d(x_{n_j}, x) < \varepsilon/2 + \varepsilon/2 = \varepsilon.$$
Hence, the sequence converges to $x$. □

Example 5.4.10. Convergence is preserved under continuity, but Cauchy
sequences may not be. As an example, let $h : [0,1) \to \mathbb{R}$ be defined by
$h(x) = \frac{1}{1-x}$. Note that $x_n = 1 - \frac{1}{n}$ is Cauchy, but $(h(x_n))$ is not, since
$h(x_n) = n$.
We show in the next subsection that a stronger version of continuity
called uniform continuity does preserve Cauchy sequences.
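The effect is easy to see numerically. The following Python sketch (an added illustration, not from the text) prints the gaps between successive terms before and after applying $h$:

import numpy as np

# Numerical sketch: x_n = 1 - 1/n is Cauchy in [0, 1), but h(x) = 1/(1 - x)
# maps it to h(x_n) = n, whose consecutive terms never get close together.
n = np.arange(1, 11)
xn = 1.0 - 1.0 / n
hn = 1.0 / (1.0 - xn)
print(np.diff(xn))   # gaps between successive x_n shrink toward 0
print(np.diff(hn))   # gaps between successive h(x_n) stay at 1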

Nota Bene 5.4.5 gives some examples of Cauchy sequences that have no limit.
In some sense, these indicate a hole or gap in the space, leaving it incomplete. This
motivates the following definition.

Definition 5.4.11. A metric space (X, d) is complete if every Cauchy sequence


converges.

Unexample 5.4.12.

(i) The set $\mathbb{Q}$ of rational numbers with the usual metric $d(x,y) = |x-y|$ is
not complete. For example, let $(x_n)_{n=0}^{\infty}$ be the sequence
$$3,\ 3.1,\ 3.14,\ 3.141,\ 3.1415,\ \ldots,$$
with $x_n$ consisting of the decimal approximation of $\pi$ to the first $n+1$
places. This sequence converges to $\pi$ in $\mathbb{R}$, so it is Cauchy, but it has no
limit in $\mathbb{Q}$.

(ii) The space $\mathbb{R} \setminus \{0\}$ is not complete (see Nota Bene 5.4.5).

(iii) The vector space $C([0,2];\mathbb{R})$ with the $L^1$-norm $\|f\|_{L^1} = \int_0^2 |f(t)|\, dt$ is
not complete. To see this, consider the sequence $(g_n)_{n=0}^{\infty}$ defined by
$$g_n(t) = \begin{cases} t^n, & t \le 1,\\ 1, & t \ge 1. \end{cases}$$
Given $\varepsilon > 0$ let $N \ge 1/\varepsilon$. Since every $g_n$ is equal to every $g_m$ on the
interval $[1,2]$, if $m, n > N$, then
$$\|g_n - g_m\|_{L^1} = \int_0^1 |t^n - t^m|\, dt = \left|\frac{1}{n+1} - \frac{1}{m+1}\right| = \frac{|m-n|}{(m+1)(n+1)} < \varepsilon.$$
Therefore, this sequence is Cauchy in the $L^1$-norm, but $(g_n)_{n=0}^{\infty}$ does
not converge to any continuous function in the $L^1$-norm. In fact, in the
$L^1$-norm we have $g_n \to g$ where
$$g(t) = \begin{cases} 0, & t < 1,\\ 1, & t \ge 1. \end{cases}$$

We do not prove this here, but it follows easily from the monotone
convergence theorem (Theorem 8.4.5).
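For a numerical feel for (iii), the following Python sketch (an added illustration, not from the text) approximates $\|g_n - g\|_{L^1}$ and shows it shrinking like $1/(n+1)$:

import numpy as np

# Numerical sketch: the L^1 distance on [0, 2] between g_n and the
# discontinuous limit g (0 for t < 1, 1 for t >= 1) shrinks like 1/(n+1).
t = np.linspace(0.0, 2.0, 400_001)
g = (t >= 1.0).astype(float)
for n in [1, 10, 100, 1000]:
    gn = np.minimum(t, 1.0)**n          # equals t**n on [0,1] and 1 on [1,2]
    print(n, np.trapz(np.abs(gn - g), t))   # approximately 1/(n+1)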

Remark 5.4.13. In Section 9.1.2 we show that every metric space can be uniquely
extended to a complete metric space by creating equivalence classes of Cauchy

sequences; that is, two Cauchy sequences are equivalent if the distance between
them converges to zero. This is also the idea behind one construction of the real
numbers JR from Q. In other words, JR can be constructed as the set of all Cauchy
sequences in Q modulo this equivalence relation (see also Vista 9.1.4).

Theorem 5.4.14. The fields JR and C are complete with respect to the usual metric
d(x,y) = lx - yl.

We prove this theorem and the next one in Section 5.4.3.

Theorem 5.4.15. If $((X_i, d_i))_{i=1}^{n}$ is a finite collection of complete metric spaces,
then the Cartesian product $X = X_1 \times X_2 \times \cdots \times X_n$ is complete when endowed with
the $p$-metric (5.4) for $1 \le p \le \infty$.

The following theorem is fundamental and is an immediate corollary of the


previous two theorems.

Theorem 5.4.16. For every $n \in \mathbb{N}$ and $p \in [1, \infty]$, the linear space $\mathbb{F}^n$ with the
norm $\|\cdot\|_p$ is complete.

Remark 5.4.17. While the previous theorem shows that $\mathbb{F}^n$ is complete in the
$p$-metric, Corollary 5.4.25 shows that $\mathbb{F}^n$ is complete in any metric that is induced
by a norm (but not necessarily in metrics that are not induced by a norm, like the
discrete metric).

5.4.2 Uniform Continuity


Continuous functions are fundamental in analysis, but sometimes continuity is not
enough. For example, continuous functions don't preserve Cauchy sequences. What
we need in these cases is a stronger condition called uniform continuity. You may
have encountered this in single-variable analysis, and it has a natural generalization
to arbitrary metric spaces.
Uniform continuity requires that there be no reference point for the continuity:
the relationship between $\delta$ and $\varepsilon$ must be independent of the point in question.

Definition 5.4.18. A function $f : X \to Y$ is uniformly continuous on $E \subset X$ if
for all $\varepsilon > 0$ there exists $\delta > 0$ such that $\rho(f(x), f(y)) < \varepsilon$ whenever $x, y \in E$ and
$d(x,y) < \delta$.

Example 5.4.19. The function $f(x) = e^x$ is uniformly continuous on any
closed interval $[a,b]$. To see this, note that $e^x$ is continuous, so for any $\varepsilon > 0$
there is a $\delta > 0$ such that $|1 - e^z| < \varepsilon/e^b$ if $|z| < \delta$. Thus, for any $x, y \in [a,b]$
with $x > y$ and $|x - y| < \delta$, we have $|e^x - e^y| = |1 - e^{y-x}|\,|e^x| < (\varepsilon/e^b)|e^x| \le \varepsilon$,
as required.

Unexample 5.4.20. The function $f(x) = e^x$ is not uniformly continuous on
$\mathbb{R}$ because for any fixed difference $|x - y|$, no matter how small, we can make
$|e^x - e^y| = |e^x|\,|1 - e^{y-x}|$ as large as desired by making $x$ (and hence $e^x$) as
large as needed.
The function $g(x) = 1/x$ is not uniformly continuous on $(0,1)$ since for
any fixed value of $|x - y|$, the difference $|1/x - 1/y| = |x-y|/|xy|$ can be made
as large as desired by making $|xy|$ sufficiently small.
Roughly speaking, the functions $f$ and $g$ in this unexample curve upward
more and more as they approach one edge of the domain, and this is what
causes uniform continuity to fail for them.

Example 5.4.21. A Lipschitz continuous function $f : X \to Y$ with constant
$K$ (see Example 5.2.2(iii)) is uniformly continuous on $X$. If $\rho(f(x), f(y)) \le
K\, d(x,y)$ for all $x, y \in X$, then, given $\varepsilon > 0$, choose $\delta = \frac{\varepsilon}{K}$. Thus, $\rho(f(x), f(y))
< \varepsilon$ whenever $d(x,y) < \delta$.

An important class of uniformly continuous functions is the class of bounded
linear transformations.

Proposition 5.4.22. If f : X --+ Y is a bounded linear transformation of normed


linear spaces, then f is uniformly continuous. Conversely, any continuous linear
transformation is bounded.

Proof. The proof that bounded linear transformations are uniformly continuous is
Exercise 5.24. Conversely, Exercise 5.21 shows that any continuous linear transfor-
mation is bounded. D

Remark 5.4.23. Although not every continuous function is uniformly continuous,


we prove in the next section (Theorem 5.5.9) that if the domain is compact (or if
it is complete and totally bounded), then a continuous function must, in fact, be
uniformly continuous.

Recall that Cauchy sequences are not preserved under continuity (see
Example 5.4.10). The following theorem says that they are, however, preserved
under uniform continuity.

Theorem 5.4.24. Assume $f : X \to Y$ is uniformly continuous. If $(x_k)_{k=0}^{\infty}$ is a
Cauchy sequence, then so is $(f(x_k))_{k=0}^{\infty}$.

Proof. Let $\varepsilon > 0$ be given. Since $f$ is uniformly continuous on $X$, there exists
a $\delta > 0$ such that $\rho(f(x), f(y)) < \varepsilon$ whenever $d(x,y) < \delta$. Since $(x_k)_{k=0}^{\infty}$ is
Cauchy, there exists $N > 0$ such that $d(x_m, x_n) < \delta$ whenever $n, m \ge N$. Thus,
$\rho(f(x_m), f(x_n)) < \varepsilon$ whenever $m, n \ge N$, and so $(f(x_k))_{k=0}^{\infty}$ is Cauchy. □

Corollary 5.4.25. Every finite-dimensional normed linear space over $\mathbb{F}$ is
complete.

Proof. Given any finite-dimensional normed linear space $(Z, \|\cdot\|)$, Corollary 2.3.12
guarantees there is an isomorphism of vector spaces $f : Z \to \mathbb{F}^n$. We can make
$\mathbb{F}^n$ into a normed linear space with the Euclidean norm $\|\cdot\|_2$. By Remark 3.5.13,
every linear transformation of finite-dimensional normed linear spaces is bounded,
and thus $f$ and $f^{-1}$ are both bounded linear transformations. Moreover,
Proposition 5.4.22 guarantees that bounded linear transformations are uniformly
continuous.
Given any Cauchy sequence $(z_k)_{k=0}^{\infty}$ in $Z$, for each $k$ let $y_k = f(z_k) \in \mathbb{F}^n$. By
Theorem 5.4.24, the sequence $(y_k)_{k=0}^{\infty}$ must also be Cauchy, and thus has a limit
$y \in \mathbb{F}^n$, since $(\mathbb{F}^n, \|\cdot\|_2)$ is complete. For each $k$ we have $f^{-1}(y_k) = z_k$, and so
$\lim_{k\to\infty} z_k = \lim_{k\to\infty} f^{-1}(y_k) = f^{-1}(y)$ exists, since $f^{-1}$ is continuous. □

Finally, we conclude with a lemma that is important when we talk about


integration in Section 5.10 and again in Chapter 8.

Lemma 5.4.26. If Y is a dense subspace of a normed linear space Z such that


every Cauchy sequence in Y converges in Z , then Z is complete.

Proof. The proof is Exercise 5.25. D

5.4.3 * Proof that $\mathbb{F}^n$ is complete


In this section we prove the very important Theorem 5.4.16, which tells us $\mathbb{F}^n$ is
complete with respect to the metric induced by the $p$-norm for any $p \in [1, \infty]$. To
do this we begin with a review from single-variable calculus of the proof that $\mathbb{R}$ is
complete. Using the fact that $\mathbb{R}$ is complete, it is straightforward to show that $\mathbb{C}$ is
also complete. We then prove Theorem 5.4.15 that Cartesian products of complete
spaces with the $p$-metric are complete. Combining this with the previous results
gives Theorem 5.4.16.

The single-variable case
The completeness of JR relies upon the following fundamental property of the real
numbers called Dedekind's property or the least upper bound property. For a proof
of this property, we refer the reader to [HS75].

Theorem 5.4.27. Every nonempty subset of the real numbers that is bounded above
has a supremum.

To begin the proof of completeness, we need a lemma.

Lemma 5.4.28. Every sequence $(y_n)_{n=0}^{\infty}$ in $\mathbb{R}$ that is bounded above and is mono-
tone increasing (that is, $y_n \le y_{n+1}$ for every $n \in \mathbb{N}$) has a limit.

Proof. Since the set $\{y_n \mid n \in \mathbb{N}\}$ is bounded above, it has a supremum (least
upper bound) by Dedekind's property. Let $y = \sup\{y_n \mid n \in \mathbb{N}\}$. If $\varepsilon > 0$, then
there exists some $m > 0$ such that $|y_m - y| < \varepsilon$. If not, then $y - \varepsilon/2$ would also be
an upper bound for $\{y_n \mid n \in \mathbb{N}\}$ and $y$ would not be the supremum.
Since the sequence is monotone increasing, we must have $y_m \le y_n \le y$ for
every $n > m$. Therefore, $|y - y_n| \le |y - y_m| < \varepsilon$ for all $n > m$, so $y$ is the
desired limit. □

Corollary 5.4.29. Every sequence in JR that is bounded below and is monotone


decreasing has a limit.

Proof. Let $(z_n)_{n=0}^{\infty}$ be a monotone decreasing sequence that is bounded below by
$B$. Let $y_n = -z_n$ for every $n \in \mathbb{N}$. The sequence $(y_n)_{n=0}^{\infty}$ is monotone increasing
and bounded above by $-B$, and so it converges to a limit $y$.
The sequence $(z_n)_{n=0}^{\infty}$ converges to $z = -y$ because for every $\varepsilon > 0$, if we
choose $N$ such that $|y - y_n| < \varepsilon$ for all $n > N$, then $|z - z_n| = |-y - (-y_n)| =
|y - y_n| < \varepsilon$. □
Theorem 5.4.30. The space $\mathbb{R}$ with the usual metric $d(x,y) = |x-y|$ is complete.

Proof. Let $(x_n)_{n=0}^{\infty}$ be a Cauchy sequence in $\mathbb{R}$. Since every Cauchy sequence is
bounded, for each $n \in \mathbb{N}$ the set $E_n = \{x_i \mid i \ge n\}$ is bounded, and hence has a
supremum. For each $n$, let $y_n = \sup E_n$. The sequence $(y_n)_{n=0}^{\infty}$ is monotonically
decreasing and is bounded, so by Corollary 5.4.29 it converges to some value $y$.
For any $\varepsilon > 0$, choose $N$ such that $|x_m - x_n| \le \varepsilon/3$ whenever $n, m > N$.
Choose $k > N$ such that $|y - y_k| < \varepsilon/3$. Since $y_k$ is the supremum of $E_k$, there
must be some $m \ge k$ such that $|y_k - x_m| < \varepsilon/3$. For every $n > N$ we have
$|y - x_n| \le |y - y_k| + |y_k - x_m| + |x_m - x_n| < \varepsilon/3 + \varepsilon/3 + \varepsilon/3 = \varepsilon$. □

Proposition 5.4.31. The space $\mathbb{C}$ with the usual metric $d(z,w) = |z - w|$ is
complete.

Proof. The proof is Exercise 5.28. D

Higher dimensions
Now we prove Theorem 5.4.15, which says that if $((X_i, d_i))_{i=1}^{n}$ is a finite collection
of complete metric spaces, then the Cartesian product $X = X_1 \times X_2 \times \cdots \times X_n$ is
complete when endowed with the $p$-metric (5.4) for $1 \le p \le \infty$.

Proof. Let $(x_k)_{k=0}^{\infty} \subset X$ be a Cauchy sequence. Denote the components of $x_k$ as
$x_k = (x_1^{(k)}, x_2^{(k)}, \ldots, x_n^{(k)})$ with $x_i^{(k)} \in X_i$ for each $i$. For every $\varepsilon > 0$, choose an $N$
such that for every $\ell, m > N$ we have $\varepsilon > d_p(x_\ell, x_m)$. If $p < \infty$, then we have
$$\varepsilon > d_p(x_\ell, x_m) = \left( \sum_{i=1}^{n} d_i\bigl(x_i^{(\ell)}, x_i^{(m)}\bigr)^p \right)^{1/p} \ge \left( d_j\bigl(x_j^{(\ell)}, x_j^{(m)}\bigr)^p \right)^{1/p} = d_j\bigl(x_j^{(\ell)}, x_j^{(m)}\bigr)$$
for every $j$. If $p = \infty$, then we have
$$\varepsilon > d_\infty(x_\ell, x_m) = \sup_i d_i\bigl(x_i^{(\ell)}, x_i^{(m)}\bigr) \ge d_j\bigl(x_j^{(\ell)}, x_j^{(m)}\bigr)$$
for every $j$. In either case, each sequence $\bigl(x_j^{(k)}\bigr)_{k=0}^{\infty} \subset X_j$ is Cauchy and converges
to some $x_j \in X_j$.
Define $x = (x_1, \ldots, x_n)$. We show that $(x_k)_{k=0}^{\infty}$ converges to $x$. Given $\varepsilon > 0$,
choose $N$ so that
$$d_i\bigl(x_i^{(\ell)}, x_i^{(m)}\bigr) \le d_p(x_\ell, x_m) < \frac{\varepsilon}{n+1}$$
whenever $\ell, m \ge N$. Letting $\ell \to \infty$ gives $d_i\bigl(x_i, x_i^{(m)}\bigr) \le \frac{\varepsilon}{n+1}$. When $p = \infty$, this
gives $d_\infty(x, x_m) \le \frac{\varepsilon}{n+1} < \varepsilon$, and when $p \in [1, \infty)$, this gives
$$d_p(x, x_m) \le \left( \sum_{i=1}^{n} \left(\frac{\varepsilon}{n+1}\right)^p \right)^{1/p} \le n\left(\frac{\varepsilon}{n+1}\right) < \varepsilon.$$
The next-to-last inequality follows from the fact that $a^p + b^p \le (a+b)^p$ for any
nonnegative numbers $a, b$ and any $p \in [1, \infty)$.
For all $p$ we now have $d_p(x, x_m) < \varepsilon$ whenever $m \ge N$. It follows that the
Cauchy sequence $(x_k)_{k=0}^{\infty} \subset X$ converges and that $X$ is complete. □

Nota Bene 5.4.32. The previous proof uses a technique often seen in
analysis, especially when working with a convergent sequence $x_k \to x$. The
technique is to bound $d(x_\ell, x_m)$ and then let $\ell \to \infty$ to get a bound on
$d(x, x_m)$. This works because the function $d(\cdot, x_m)$ is continuous (see
Example 5.2.2(iv)).

5.5 Compactness
Recall from Corollary 5.3. 7 that for a continuous function f : X ---+ Y and for a
closed subset Z c Y, the preimage f- 1 (Z ) is closed. But if W c X is closed,
then the image f (W) is not necessarily closed. In this section we discuss a property
of a set called compactness, which seems just slightly stronger than being closed,
but which has many powerful consequences. One of these is that continuous func-
tions map compact sets to compact sets, and this guarantees that every real-valued
continuous function on a compact set attains both its minimum and its maximum .
These lead to many other important consequences that you will use in this book
and far beyond.
The definit ion of a compact set may seem somewhat strange-it is given in
unfamiliar terms involving various collections of open sets-but we prove the Heine-
Borel theorem (see Theorem 5.5.4 and also 5.5.12), which gives a simple description
of compact sets in !Rn as those that are both closed and bounded.
Throughout this section we assume that (X, d) and (Y, p) are metric spaces.

5.5.1 Introduction to Compactness

Definition 5.5.1. A collection $(G_\alpha)_{\alpha \in J}$ of open sets is an open cover of the set $E$
if $E \subset \bigcup_{\alpha \in J} G_\alpha$. A set $E$ is compact if every open cover has a finite subcover; that
is, for every open cover $(G_\alpha)_{\alpha \in J}$ there exists a finite subcollection $(G_\alpha)_{\alpha \in J'}$, where
$J' \subset J$ is a finite subset, such that $E \subset \bigcup_{\alpha \in J'} G_\alpha$.

Proposition 5.5.2. A closed subset of a compact set is compact.

Proof. Let $F$ be a closed subset of a compact set $K$, and let $\mathscr{C} = (G_\alpha)_{\alpha \in J}$ be an
open covering of $F$. Thus, $\mathscr{C} \cup \{F^c\}$ is an open covering of $K$, which has a finite
subcovering $\{F^c, G_{\alpha_1}, \ldots, G_{\alpha_n}\}$. Hence, $(G_{\alpha_k})_{k=1}^{n}$ is a finite subcover of $F$. □

Theorem 5.5.3. A compact subset of a metric space is closed and bounded.

Proof. Fix a point $x \in X$. For any compact subset $K \subset X$, the collection
$(B(x,k))_{k=1}^{\infty}$ is an open cover of $K$. It must have a finite subcover $(B(x,k))_{k=1}^{M}$,
since $K$ is compact. Therefore $K \subset B(x, M)$, which implies that $K$ is bounded.
To see that $K$ is closed, we may assume that $K \ne \emptyset$ and $K \ne X$ because
these two sets are closed. We show that $K^c$ is open. Assume that $x \in K^c$. For
every $y \in K$ let $\delta_y = d(x,y)/2$. The collection of balls $(B(y, \delta_y))_{y \in K}$ forms an
open cover of $K$, so there exists a finite subcover $B(y_1, \delta_{y_1}), \ldots, B(y_n, \delta_{y_n})$. Let
$\delta = \min(\delta_{y_1}, \ldots, \delta_{y_n})$. This is strictly positive, and we claim that $B(x,\delta) \cap K = \emptyset$.
To see this, observe that for any $z \in K$, we have $z \in B(y_i, \delta_{y_i})$ for some $i$. Thus, by
a variation on the triangle inequality (see Exercise 5.5), we have $d(z,x) \ge d(x,y_i) -
d(z,y_i) \ge d(x,y_i)/2 \ge \delta$. Hence, $K^c$ is open, and $K$ is closed. □

The next theorem gives a partial converse to Theorem 5.5.3.

Theorem 5.5.4 (Heine-Borel). If a subset of $\mathbb{R}^n$ (with the usual, Euclidean
metric) is closed and bounded, then it is compact.

Proof. First we show that every $n$-cell is compact, that is, every set of the form
$[a,b] = \{x \in \mathbb{R}^n \mid a \le x \le b\}$ (meaning $a_k \le x_k \le b_k$ for all $k = 1, \ldots, n$).
Suppose $(G_\alpha)_{\alpha \in J}$ is an open cover of $I_1 = [a,b]$ that contains no finite sub-
cover. Let $c = \frac{a+b}{2}$ be the midpoint of $a$ and $b$, meaning each $c_k = \frac{a_k + b_k}{2}$. The
intervals $[a_k, c_k]$ and $[c_k, b_k]$ determine $2^n$ $n$-cells, at least one of which, denoted $I_2$,
cannot be covered by a finite subcollection of $(G_\alpha)_{\alpha \in J}$. Subdivide $I_2$ and repeat. We
have a sequence $(I_k)_{k=1}^{\infty}$ of $n$-cells such that $I_{k+1} \subset I_k$, where each $I_k$ is not covered
by any finite subcollection of $(G_\alpha)_{\alpha \in J}$ and $x, y \in I_k$ implies $\|x - y\|_2 \le 2^{-k}\|b - a\|_2$.
By choosing $x_k \in I_k$, we have a Cauchy sequence $(x_k)_{k=1}^{\infty}$ that converges to
some $x$, since $\mathbb{R}^n$ is complete. However, $x \in G_\alpha$ for some $\alpha$, and since $G_\alpha$ is open,
it contains an open ball $B(x,r)$ for some $r > 0$. There exists an $N > 0$ such that
$2^{-N}\|b - a\|_2 < r$, and thus $I_k \subset B(x,r) \subset G_\alpha$ for all $k \ge N$. This gives a finite
subcover of all these $I_k$, which is a contradiction. Thus, $[a,b]$ is compact.
Now let $E$ be any closed and bounded subset of $\mathbb{R}^n$. Because $E$ is bounded,
it is contained in some $n$-cell $[a,b]$. Since $E$ is closed and $[a,b]$ is compact,
Proposition 5.5.2 guarantees that $E$ is also compact. □

Example 5.5.5. * The Heine-Borel theorem does not hold in general (infinite-
dimensional) spaces. Consider the vector space $\mathbb{F}^\infty \subset \ell^2$ (see Example 1.1.6(iv))
defined to be the set of all infinite sequences $(x_1, x_2, \ldots)$ with at most a finite
number of nonzero entries.
The vector space $\mathbb{F}^\infty$ has an infinite basis of the form
$$B = \{(1, 0, 0, \ldots),\ (0, 1, 0, \ldots),\ \ldots,\ (0, \ldots, 0, 1, 0, \ldots),\ \ldots\}.$$
The unit sphere in this space (with the Euclidean metric) is closed and bounded,
but not compact. Indeed, Example 5.1.13 and Exercise 5.7 give an example
of an open cover of the unit ball that has no finite subcover.

5.5.2 Continuity and Compactness


Compactness is especially useful when coupled with continuity. In this section
we prove that continuous functions preserve compact sets, that continuous func-
tions on compact sets attain their supremum and infimum, and that continuous
functions on compact sets are uniformly continuous.

Proposition 5.5.6. The continuous image of a compact set is compact; that is, if
$f : X \to Y$ is continuous and $K \subset X$ is compact, then $f(K) \subset Y$ is compact.

Proof. If $(G_\alpha)_{\alpha \in J}$ is an open cover of $f(K)$, then $(f^{-1}(G_\alpha))_{\alpha \in J}$ is an open
cover of $K$. Since $K$ is compact, there is a finite subcover $(f^{-1}(G_{\alpha_k}))_{k=1}^{n}$. Hence,
$(G_{\alpha_k})_{k=1}^{n}$ is a finite subcollection of $(G_\alpha)_{\alpha \in J}$ that covers $f(K)$, and thus $f(K)$ is
compact. □

Corollary 5.5.7 (Extreme Value Theorem). If $f : X \to \mathbb{R}$ is continuous and
$K \subset X$ is a nonempty compact set, then $f(K)$ contains its infimum and supremum.

Proof. The image $f(K)$ is compact, hence closed and bounded in $\mathbb{R}$. Because it
is bounded, its supremum and infimum both exist. Let $M$ be the supremum, and
let $(x_n)_{n=0}^{\infty}$ be a sequence in $K$ such that $f(x_n) \to M$. Since $K$ is compact, there is a
subsequence $(x_{n_j})_{j=0}^{\infty}$ converging to some value $x \in K$. Since it is a subsequence,
we must have $f(x_{n_j}) \to M$, but continuity of $f$ implies that $f(x_{n_j}) \to f(x)$; see
Theorem 5.3.19. Therefore, $M = f(x) \in f(K)$. A similar argument shows that the
infimum lies in $f(K)$. □

Example 5.5.8. ⚠ Recall from Example 3.5.4 and Figures 3.5 and 3.6 that
different metrics on a given space define different open balls. For example, the
1-norm unit sphere $S_1 = \{x \in \mathbb{F}^n \mid \|x\|_1 = 1\}$ is really a square when $n = 2$
and an octahedron when $n = 3$, whereas the 2-norm unit sphere is really a
circle when $n = 2$ and a sphere when $n = 3$.
The 1-norm unit sphere $S_1$ is both closed and bounded (with respect to
both the 1-norm and the 2-norm), so it is compact. Consequently, for any
continuous function $f : \mathbb{F}^n \to \mathbb{R}$, the image $f(S_1)$ contains both its maximum
and minimum.
In particular, if $\|\cdot\|$ is any norm on $\mathbb{F}^n$, then by Proposition 5.2.7 the map
$f(x) = \|x\|$ is continuous (with respect to both the 1-norm and the 2-norm),
and hence there exist $x_{\max}, x_{\min} \in S_1$ such that $\sup_{x \in S_1} \|x\| = \|x_{\max}\|$ and
$\inf_{x \in S_1} \|x\| = \|x_{\min}\|$.
This example plays an important role in the proof of the remarkable
Theorem 5.8.7, which states that the open sets defined by any norm on a
finite-dimensional vector space are the same as the open sets defined by any
other norm on the same space.
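As a small numerical illustration (a Python sketch added here, not from the text), one can sample the 1-norm unit sphere in $\mathbb{R}^2$ and evaluate the 2-norm on it; the maximum 1 and the minimum $1/\sqrt{2}$ are both attained, as the extreme value theorem guarantees:

import numpy as np

# Numerical sketch: evaluate f(x) = ||x||_2 on samples of the 1-norm unit
# sphere in R^2. The max (~1, at the vertices) and min (~0.7071, at the
# edge midpoints) are attained.
theta = np.linspace(0.0, 2 * np.pi, 100_001)
pts = np.column_stack([np.cos(theta), np.sin(theta)])
pts /= np.abs(pts).sum(axis=1, keepdims=True)   # rescale onto ||x||_1 = 1
vals = np.linalg.norm(pts, axis=1)
print(vals.max(), vals.min())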

Theorem 5.5.9. If $K \subset X$ is compact and $f : K \to Y$ is continuous, then $f$ is
uniformly continuous on $K$.

Proof. Let $\varepsilon > 0$ be given. For each $x \in K$, there exists $\delta_x > 0$ such that
$\rho(f(x), f(y)) < \varepsilon/2$ whenever $d(x,y) < \delta_x$. Let $(B(x, \frac{\delta_x}{2}))_{x \in K}$ be an open cover of
$K$. Since $K$ is compact, there exists a finite subcover $(B(x_k, \frac{\delta_{x_k}}{2}))_{k=1}^{n}$. So, given
any $y \in K$, there exists $k \in \{1, 2, \ldots, n\}$ such that $d(y, x_k) < \frac{1}{2}\delta_{x_k}$.
Let $\delta = \min\{\delta_{x_k}\}_{k=1}^{n}$. If $y, z \in K$ with $d(y,z) < \delta/2$, then we have
$$d(x_k, z) \le d(x_k, y) + d(y, z) < \frac{\delta_{x_k}}{2} + \frac{\delta}{2} \le \delta_{x_k}.$$
Hence, $\rho(f(x_k), f(z)) < \frac{\varepsilon}{2}$, and so
$$\rho(f(y), f(z)) \le \rho(f(y), f(x_k)) + \rho(f(x_k), f(z)) < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.$$
Thus, $f$ is uniformly continuous. □

5.5.3 Characterizations of Compactness


The definition of compactness is not always so easy to verify directly, but the Heine-
Borel theorem gives a nice way to check that finite-dimensional spaces are compact.
In this section we give several more useful characterizations of compactness and
generalize the Heine-Borel theorem to arbitrary metric spaces.

Definition 5.5.10.
(i) A collection $\mathscr{C}$ of sets in $X$ has the finite intersection property if every finite
subcollection of $\mathscr{C}$ has a nonempty intersection.
(ii) The space $X$ is sequentially compact if every sequence $(x_k)_{k=0}^{\infty} \subset X$ has a
convergent subsequence.
(iii) The space $X$ is totally bounded if for all $\varepsilon > 0$ the cover $\mathscr{C} = (B(x,\varepsilon))_{x \in X}$
has a finite subcover.
(iv) A real number $\varepsilon_0$ is a Lebesgue number of an open cover $(G_\alpha)_{\alpha \in J}$ if for all
$x \in X$ and for all $\varepsilon < \varepsilon_0$ the ball $B(x,\varepsilon)$ is contained in some $G_\alpha$.

Theorem 5.5.11. Let $(X,d)$ be a metric space. The following are equivalent:
(i) $X$ is compact.
(ii) Every collection $\mathscr{C}$ of closed sets in $X$ with the finite intersection property has
a nonempty intersection.
(iii) $X$ is sequentially compact.
(iv) $X$ is totally bounded and every open cover has a positive Lebesgue number
(which depends on the cover).

Proof.

(i) ⇒ (ii) Assume $X$ is compact. Let $\mathscr{C} = (F_\alpha)_{\alpha \in J}$ be a collection of closed sets
with the finite intersection property. If $\bigcap_{\alpha \in J} F_\alpha = \emptyset$, then $(F_\alpha^c)_{\alpha \in J}$ is an open
cover of $X$. Hence, there exists a finite subcover $(F_{\alpha_k}^c)_{k=1}^{n}$. But this implies
that $\bigcap_{k=1}^{n} F_{\alpha_k} = \emptyset$, which is a contradiction.

(ii) ⇒ (iii) Let $(x_k)_{k=0}^{\infty} \subset X$. For each $n \in \mathbb{N}$ let $B_n = \overline{\{x_n, x_{n+1}, \ldots\}}$. Thus
$(B_n)_{n=0}^{\infty}$ is a collection of closed sets with the finite intersection property. By
(ii), there exists $x \in \bigcap_{k=1}^{\infty} B_k$. For each $k$, choose $x_{n_k}$ so that $n_k > n_{k-1}$ and
$x_{n_k} \in B(x, 1/k)$. Thus, $x_{n_k} \to x$.

(iii) ⇒ (iv) We start by showing that every cover has a positive Lebesgue number.
Assume $(G_\alpha)_{\alpha \in J}$ is an open cover of $X$. For each $x \in X$ define
$$\varepsilon(x) = \tfrac{1}{2} \sup\{\delta > 0 \mid \exists\, \alpha \in J \text{ with } B(x,\delta) \subset G_\alpha\}.$$
Since $x$ is an interior point of some $G_\alpha$, we have $\varepsilon(x) > 0$ for all $x \in X$. Define
$\varepsilon^* = \inf_{x \in X} \varepsilon(x)$. This is clearly a Lebesgue number of the cover, and we
must show that $\varepsilon^* > 0$.
Since $\varepsilon^*$ is an infimum, there exists a sequence $(x_k)_{k=0}^{\infty} \subset X$ so that $\varepsilon(x_k) \to
\varepsilon^*$. Because $X$ is sequentially compact, there is a convergent subsequence.
Replacing the original sequence with the subsequence, we may assume$^{35}$ that
$x_k \to x$ for some $x$.
Let $G_\beta$ be a member of the open cover such that $B(x, \varepsilon(x)) \subset G_\beta$. Since
$x_k \to x$, there exists $N \ge 0$ such that $d(x, x_n) < \frac{\varepsilon(x)}{2}$ whenever $n \ge N$.
If $z \in B(x_n, \frac{\varepsilon(x)}{2})$, then $d(x,z) \le d(x, x_n) + d(x_n, z) < \frac{\varepsilon(x)}{2} + \frac{\varepsilon(x)}{2} = \varepsilon(x)$,
which implies that $z \in B(x, \varepsilon(x)) \subset G_\beta$. Thus, $B(x_n, \frac{\varepsilon(x)}{2}) \subset G_\beta$, and
$\varepsilon(x_n) \ge \frac{\varepsilon(x)}{4} > 0$ for all $n \ge N$. This implies that $\varepsilon^* \ge \frac{\varepsilon(x)}{4} > 0$.
Now suppose $X$ is not totally bounded. There is an $\varepsilon > 0$ such that the
cover $(B(x,\varepsilon))_{x \in X}$ has no finite subcover. We define a sequence $(x_k)_{k=0}^{\infty}$ as
follows: choose any $x_0$; since $B(x_0, \varepsilon)$ does not cover $X$, there must be an
$x_1 \in B(x_0, \varepsilon)^c$. Since the finite collection $(B(x_0, \varepsilon), B(x_1, \varepsilon))$ does not cover,
there must be an $x_2 \in X$ that is not in the union of these two balls. Continuing
in this manner, we construct a sequence $(x_k)_{k=0}^{\infty}$ so that $d(x_k, x_\ell) \ge \varepsilon$ for all
$k \ne \ell$. Thus, $(x_k)_{k=0}^{\infty}$ has no convergent subsequence, and $X$ is not sequentially
compact, which is a contradiction.

(iv) ⇒ (i) Let $\mathscr{C} = (G_\alpha)_{\alpha \in J}$ be an open cover of $X$ with Lebesgue number $\varepsilon^*$.
Since $X$ is totally bounded, it can be covered by a finite collection of balls
$(B(x_k, \varepsilon))_{k=1}^{n}$, where $\varepsilon < \varepsilon^*$. Each ball $B(x_k, \varepsilon)$ is contained in some $G_{\alpha_k}$ of
the open cover, so the finite collection $(G_{\alpha_k})_{k=1}^{n}$ is a subcover. Thus, $X$ is
compact. □

$^{35}$Since $\varepsilon(x)$ is not known to be continuous, we can't assume $\varepsilon(x_n) \to \varepsilon(x)$.

Theorem 5.5.12 (Generalized Heine-Borel). A metric space $X$ is compact
if and only if it is complete and totally bounded.

Proof. (⇒) Assume that $X$ is compact. By the previous theorem, it is totally
bounded. If $(x_k)_{k=0}^{\infty}$ is a Cauchy sequence in $X$, then by the previous theorem, it
has a convergent subsequence. By Proposition 5.4.9 any Cauchy sequence with a
convergent subsequence must converge. Thus, $(x_k)_{k=0}^{\infty}$ converges. Since this holds
for every Cauchy sequence, $X$ is complete.
(⇐) Assume that $X$ is complete and totally bounded. It suffices to show that
each sequence $(x_k)_{k=0}^{\infty} \subset X$ has a convergent subsequence. Let $E = \{x_k\}_{k=0}^{\infty}$ be the
set of points of the sequence. If $E$ is finite, the sequence clearly has a convergent
subsequence, since infinitely many of the $x_k$ must be equal to the same point. Thus,
we may assume that $E$ is infinite.
Since $X$ is totally bounded, we can cover it with a finite number of open balls
of radius 1. One of these open balls contains infinitely many points of $E$. Denote
this ball $B_1$. Now cover $X$ with finitely many balls of radius $\frac{1}{2}$. We can pick one
whose intersection with $B_1$ contains infinitely many points of $E$. Call this $B_2$.
Repeating this process yields a sequence of open balls $(B_k)_{k=1}^{\infty}$, where $B_k$ has radius
$\frac{1}{k}$ and $B_1 \cap \cdots \cap B_k$ contains infinitely many points of $E$. For each $k$ choose
$x_{n_k} \in B_1 \cap \cdots \cap B_k$ with $n_k > n_{k-1}$. Any two terms $x_{n_j}, x_{n_k}$ with $j, k \ge K$ lie
in $B_K$, so the sequence $(x_{n_k})_{k=1}^{\infty}$ is Cauchy, and it converges, since $X$ is complete.
Thus, the sequence $(x_n)_{n=0}^{\infty}$ has a convergent subsequence $(x_{n_k})$. □

5.5.4 * Subspaces
If $d$ is a metric on $X$, then $d$ induces a metric on every $Y \subset X$; that is, restricting
$d$ to the set $Y \times Y$ defines a function $\rho : Y \times Y \to [0,\infty)$ that itself satisfies all
the conditions for a metric on the space $Y$. We say that $Y$ inherits the metric $\rho$
from $X$.

Example 5.5.13. If $X = \mathbb{R}^2$ with the standard metric $d(a,b) = \|a - b\|_2$, and
if $Y = S^1 = \{(\cos(t), \sin(t)) \mid t \in [0, 2\pi)\}$ is the unit circle, then the induced
metric on $S^1$ is just the usual distance between the points of $S^1$, thought of
as points in the plane. So if $a = (\cos(t), \sin(t))$ and $b = (\cos(s), \sin(s))$, then
$d(a,b) = \|a - b\|_2$. In particular, we have $d((\cos(0), \sin(0)), (\cos(s), \sin(s))) \to
0$ as $s \to 2\pi$.

Definition 5.5.14. For each $x \in Y \subset X$ and $r > 0$, we write $B_X(x,r) = \{y \in X \mid
d(x,y) < r\}$ and $B_Y(x,r) = \{y \in Y \mid \rho(x,y) < r\} = B_X(x,r) \cap Y$.
A set $E$ is open in $Y$ or is open relative to $Y$ if $E$ is an open set in the metric
space $(Y, \rho)$; that is, $E$ is open in $Y$ if for every $x \in E$ there is some open ball
$B_Y(x,r) \subset E$.

Remark 5.5.15. Theorem 5.1.12 applied to the space Y shows that the ball By(x, r)
is open in Y. But it is not necessarily open in X, as we see in the next example.

Example 5.5.16. If $Y = [0,1] \subset \mathbb{R}$, then
$$B_Y(0, 1/2) = \{y \in [0,1] \mid |y - 0| < 1/2\} = [0, 1/2)$$
is open in $[0,1]$ but not open in $\mathbb{R}$.
The subspace $[0,1]$ with its induced metric (from the usual metric on $\mathbb{R}$)
has open sets that look like $[0,x)$, $(x,y)$, and $(x,1]$ and unions of such sets.

Any set that is open in Y is the intersection of Y with an open set in X, as


the next proposition shows.

Proposition 5.5.17. Let $(X,d)$ be a metric space, and let $Y \subset X$ be a subset with
the induced metric $\rho$, so that $(Y,\rho)$ is itself a metric space. A subset $E$ of $(Y,\rho)$ is
open if and only if there is an open subset $U \subset X$ such that $E = U \cap Y$. Similarly,
a subset $F$ of $(Y,\rho)$ is closed if and only if there is a closed set $C \subset X$ such that
$F = C \cap Y$.

Proof. If $E$ is an open set in $(Y,\rho)$, then around every point $x \in E$, there is a ball
$B_Y(x, r_x) \subset E$. Let $U = \bigcup_{x \in E} B_X(x, r_x) \subset X$. Since $U$ is the union of open balls
in $X$, it is open in $X$. For each $x \in E$ we certainly have $x \in U$, thus $E \subset U \cap Y$.
But we also have $B_X(x, r_x) \cap Y = B_Y(x, r_x) \subset E$, so $U \cap Y \subset E$.
Conversely, if $U$ is open in $X$, then for any $x \in U \cap Y$, we have some ball
$B_X(x, r_x) \subset U$. Therefore, $B_Y(x, r_x) = B_X(x, r_x) \cap Y \subset U \cap Y$, so $U \cap Y$ is open
in $(Y, \rho)$.
The statement about closed sets follows from taking the complement of the
open sets. □

Proposition 5.5.18. Assume that $Y$ is a subspace of $X$. If $E \subset Y$, then the
closure of $E$ in the subspace $(Y,\rho)$ is given by $\overline{E} \cap Y$, where $\overline{E}$ is the closure of $E$
in $(X,d)$.

Proof. Let $\widetilde{E}$ denote the closure of $E$ in $Y$ with respect to $\rho$. Clearly $\overline{E} \cap Y$ is
closed in $Y$ and contains $E$, so $\widetilde{E}$ is contained in $\overline{E} \cap Y$. Conversely, $\widetilde{E}$ is closed in
$Y$, and thus, by Proposition 5.5.17, there must be a set $F \subset X$ that is closed with
respect to $d$ such that $F \cap Y = \widetilde{E}$. But this implies that $E \subset F$; hence, $\overline{E} \subset F$, and
thus $\overline{E} \cap Y \subset F \cap Y = \widetilde{E}$. □

Example 5.5.19. The integers $\mathbb{Z}$ as a subspace of $\mathbb{R}$, with the usual metric,
have the property that every one-point set is open. Thus, the family $\mathscr{T}$ of all
open sets is the power set of $\mathbb{Z}$, sometimes denoted $2^{\mathbb{Z}}$. This is the most open
sets possible. As a result, every function with domain $\mathbb{Z}$ is continuous (but
continuous functions from $\mathbb{F}^n$ to $\mathbb{Z}$ must be constant). We note that these are
the same open sets as those we got with the discrete metric on $\mathbb{Z}$.

Now that we have defined an induced metric on subsets, there are actually
two different ways to define compactness of any subset Y C X . The first is as a
subset of X with the usual metric on X (that is, in terms of open sets of X). The
second is as a subspace, with the induced metric (that is, in terms of open sets of
Y) . Fortunately, these are equivalent.

Proposition 5.5.20. A subset Y C X is compact with respect to open sets of X


if and only if it is compact with respect to open sets of Y, as a subspace with the
induced metric.

Proof. The proof is Exercise 5.34. D

5.6 Uniform Convergence and Banach Spaces


5.6.1 Pointwise and Uniform Convergence
There are at least two distinct notions of convergence that are useful when consider-
ing a sequence $(f_n)_{n=0}^{\infty}$ of functions: pointwise convergence and uniform convergence.
Pointwise convergence is convergence of the sequences $(f_n(x))_{n=0}^{\infty}$ for each $x$ in the
domain, whereas uniform convergence is convergence in the space of functions with
respect to the $L^\infty$-norm.

Definition 5.6.1. For any sequence of functions $(f_n)_{n=0}^{\infty}$ from a set $X$ into a
metric space $Y$, we can evaluate all the functions at a single point $x$ of the domain,
which gives a sequence $(f_n(x))_{n=0}^{\infty} \subset Y$. If for every choice of $x \in X$, the sequence
$(f_n(x))_{n=0}^{\infty}$ converges in $Y$, then we can define a new function $f$ by setting $f(x) =
\lim_{n\to\infty} f_n(x)$. In this case we say that the sequence converges pointwise, or that $f$
is the pointwise limit of $(f_n)$.

Example 5.6.2. The sequence $(f_n)_{n=0}^{\infty}$ of functions on $(0,1)$ defined by $f_n(x)
= x^n$ converges pointwise to the zero function, since for each $x \in (0,1)$ we have
$x^n \to 0$ as $n \to \infty$.

Pointwise convergence is often not strong enough to be very useful. A much


stronger notion is that of uniform convergence, which is convergence in the
L 00 -norm.

Definition 5.6.3. Let $(f_k)_{k=0}^{\infty}$ be a sequence of bounded functions from a set $X$
into a normed space $Y$. If $(f_k)_{k=0}^{\infty}$ converges to $f$ in the $L^\infty$-norm, then we say
that the sequence $(f_k)_{k=0}^{\infty}$ converges uniformly to $f$.

Unexample 5.6.4. Although the sequence $(x^n)_{n=0}^{\infty}$ of functions on $(0,1)$
converges pointwise to 0, it does not converge uniformly to 0 on $(0,1)$. To see
this, observe that
$$\|f_n - 0\|_{L^\infty} = \sup_{x \in (0,1)} |x^n| = 1$$
for all $n$. In the next proposition we see that if there were a uniform limit, it
would have to be the same as the pointwise limit.

Proposition 5.6.5. Uniform convergence implies pointwise convergence. That is
to say, if $(f_n)_{n=0}^{\infty}$ is a uniformly convergent sequence with limit $f$, then $f_n$ converges
pointwise to $f$.

Proof. The proof is Exercise 5.35. D

5.6.2 Banach Spaces


Completeness is a very useful property of a metric space. Normed vector spaces
that are complete are especially useful.

Definition 5.6.6. A complete normed linear space is called a Banach space.

Example 5.6.7. Most of the normed linear spaces that we work with in ap-
plied mathematics are Banach spaces. We showed in Theorem 5.4.16 that
$(\mathbb{F}^n, \|\cdot\|_p)$ is a Banach space. Below we show that $(C([a,b];\mathbb{R}), \|\cdot\|_{L^\infty})$ is also
a Banach space. Additional examples include the spaces $\ell^p$ for $1 \le p \le \infty$
(see Example 1.1.6(vi)).

Theorem 5.6.8. The space $(C([a,b];\mathbb{F}), \|\cdot\|_{L^\infty})$ is a Banach space.

Proof. We need only prove that $(C([a,b];\mathbb{F}), \|\cdot\|_{L^\infty})$ is complete. We do this by
showing (i) any Cauchy sequence converges pointwise, which defines a candidate
function for the $L^\infty$ limit; (ii) the convergence is actually uniform, so the candidate
really is the $L^\infty$ limit (but might not lie in the space $C([a,b];\mathbb{F})$); and (iii) the limit
does indeed lie in $C([a,b];\mathbb{F})$.
(i) Let $(f_k)_{k=0}^{\infty} \subset C([a,b];\mathbb{F})$ be a Cauchy sequence. For a given $\varepsilon > 0$, there
exists $N > 0$ such that $\|f_m - f_n\|_{L^\infty} < \varepsilon$ whenever $m, n \ge N$. For a fixed
$x \in [a,b]$ we have that $|f_m(x) - f_n(x)| \le \sup_{t\in[a,b]} |f_m(t) - f_n(t)| < \varepsilon$, and
thus the sequence $(f_k(x))_{k=0}^{\infty}$ is a Cauchy sequence. Since $\mathbb{F}$ is complete, the
sequence converges. Define $f$ as the pointwise limit of $(f_n)_{n=0}^{\infty}$; that is, for
each $x \in [a,b]$ define $f(x) = \lim_{k\to\infty} f_k(x)$.
(ii) We show that $(f_k)_{k=0}^{\infty}$ converges to $f$ uniformly. Given $\varepsilon > 0$, there exists
$N > 0$ such that $\|f_n - f_m\|_{L^\infty} < \varepsilon/2$ whenever $m, n \ge N$. By Corollary 5.3.22
we can pass the limit through pointwise; that is, for each $x \in [a,b]$ we have
$$|f(x) - f_m(x)| = \lim_{n\to\infty} |f_n(x) - f_m(x)| \le \varepsilon/2 < \varepsilon.$$
Thus, it follows that $\|f - f_m\|_{L^\infty} < \varepsilon$.
(iii) It remains to prove that $f$ is continuous on $[a,b]$. For each $\varepsilon > 0$, choose
$N$ such that $\|f_m - f_n\|_{L^\infty} < \frac{\varepsilon}{3}$ whenever $m, n \ge N$. By taking the limit as
$n \to \infty$, we have that $\|f_m - f\|_{L^\infty} \le \frac{\varepsilon}{3}$ whenever $m \ge N$. Theorem 5.5.9
states that any continuous function on a compact set is uniformly continuous.
Therefore, for any $m \ge N$, we know $f_m$ is uniformly continuous on $[a,b]$, and
thus there is a $\delta > 0$ such that $|f_m(x) - f_m(y)| < \frac{\varepsilon}{3}$ whenever $|x - y| < \delta$.
Thus, we have that
$$|f(x) - f(y)| \le |f(x) - f_m(x)| + |f_m(x) - f_m(y)| + |f_m(y) - f(y)| < \frac{\varepsilon}{3} + \frac{\varepsilon}{3} + \frac{\varepsilon}{3} = \varepsilon.$$
Therefore, $f$ is continuous on $[a,b]$, which completes the proof. □

The next corollary is an immediate consequence of Theorem 5.6.8.

Corollary 5.6.9. If a sequence of continuous functions converges uniformly to a


function f, then f is also continuous.

5.6.3 Sums in a Banach Space


Throughout the remainder of this section, assume that $(X, \|\cdot\|)$ is a Banach space.

Definition 5.6.10. Consider a sequence $(x_k)_{k=0}^{\infty} \subset X$. We say that the series
$\sum_{k=0}^{\infty} x_k$ converges in $X$ if the sequence $(s_k)_{k=0}^{\infty}$ of partial sums, defined by $s_n =
\sum_{k=0}^{n} x_k$, converges in $X$; otherwise, we say that the series diverges.

Definition 5.6.11. Assume $(x_k)_{k=0}^{\infty}$ is a sequence in $X$. The series $\sum_{k=0}^{\infty} x_k$ is
said to converge absolutely if the series $\sum_{k=0}^{\infty} \|x_k\|$ converges in $\mathbb{R}$.

Remark 5.6.12. If a series converges absolutely in $X$, then for every $\varepsilon > 0$ there
is an $N$ such that the sum $\sum_{k=n}^{\infty} \|x_k\| < \varepsilon$ whenever $n \ge N$.

Proposition 5.6.13. Let $(x_k)_{k=0}^{\infty}$ be a sequence in $X$. If the series $\sum_{k=0}^{\infty} x_k$
converges absolutely, then it converges in $X$.

Proof. It suffices to show that the sequence of partial sums $(s_k)_{k=0}^{\infty}$ converges in
$X$. Let $\varepsilon > 0$ be given. If the series $\sum_{k=0}^{\infty} x_k$ converges absolutely, then there exists
$N$ such that $\sum_{k=n}^{\infty} \|x_k\| < \varepsilon$ whenever $n \ge N$. Thus, we have
$$\|s_n - s_m\| = \left\| \sum_{k=m+1}^{n} x_k \right\| \le \sum_{k=m+1}^{n} \|x_k\| \le \sum_{k=m+1}^{\infty} \|x_k\| < \varepsilon$$
whenever $n > m \ge N$. This implies that the sequence of partial sums is Cauchy,
and hence it converges, since $X$ is complete. □

Example 5.6.14. For each $n \in \mathbb{N}$ let $f_n \in C([0,2];\mathbb{R})$ be the function $f_n(x) =
x^n/n!$. The sum $\sum_{n=0}^{\infty} f_n$ converges absolutely because the series
$$\sum_{n=0}^{\infty} \|f_n\|_{L^\infty} = \sum_{n=0}^{\infty} \sup_{x\in[0,2]} \frac{|x^n|}{n!} = \sum_{n=0}^{\infty} \frac{2^n}{n!} = e^2$$
converges in $\mathbb{R}$.
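As a quick sanity check (a Python sketch added here, not from the text), the partial sums of $\sum_n 2^n/n!$ do indeed approach $e^2$:

import math

# Numerical sketch: partial sums of sum_n 2**n / n! converge to e^2.
partial = 0.0
for n in range(30):
    partial += 2.0**n / math.factorial(n)
print(partial, math.exp(2))   # both approximately 7.389056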

Theorem 5.6.15. If a sum $\sum_{k=0}^{\infty} x_k$ converges absolutely to $x \in X$, then for any
rearrangement of the terms, the rearranged sum converges absolutely to $x$. That is,
if $f : \mathbb{N} \to \mathbb{N}$ is a bijection, then $\sum_{k=0}^{\infty} x_{f(k)}$ also converges absolutely to $x$.

Proof. For any $\varepsilon > 0$ choose $N > 0$ such that $\sum_{n=N}^{\infty} \|x_n\| < \varepsilon/2$. Since $N$
is finite, there must be some $M \ge N$ so that the set $\{0, 1, 2, \ldots, N\}$ is a subset of
$\{f(0), f(1), \ldots, f(M)\}$. For any $n > M$ let $E_n = \{f(0), \ldots, f(n)\} \setminus \{0, 1, 2, \ldots, N\}
\subset \{N+1, N+2, \ldots\}$. We have
$$\begin{aligned}
\left\| x - \sum_{k=0}^{n} x_{f(k)} \right\| &= \left\| x - \sum_{k=0}^{N} x_k + \sum_{k=0}^{N} x_k - \sum_{k=0}^{n} x_{f(k)} \right\| \\
&\le \left\| x - \sum_{k=0}^{N} x_k \right\| + \left\| \sum_{k=0}^{N} x_k - \sum_{k=0}^{n} x_{f(k)} \right\| \\
&< \frac{\varepsilon}{2} + \sum_{k \in E_n} \|x_k\| \\
&< \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.
\end{aligned}$$
Hence, the rearranged series converges to $x$.
The fact that the rearranged sum converges absolutely follows from applying
the same argument to the series $\sum_{k=0}^{\infty} \|x_k\|$. □
Remark 5.6.16. The converse to the previous theorem is false . In any infinite-
dimensional Banach space there are series that converge, regardless of the
rearrangement, and yet are not absolutely convergent (see [DR50]).

5.7 The Continuous Linear Extension Theorem


In this section we prove several important theorems about bounded functions and
Banach spaces. We first prove that the space of bounded linear transformations into
a Banach space is itself a Banach space. We then prove that the space of bounded
functions into a Banach space is also a Banach space. This means we can use all
the tools we have developed about convergence in Banach spaces when studying
bounded linear transformations or bounded functions .
The main result of this section is a powerful theorem known as the continuous
linear extension theorem, also known as the bounded linear transformation (BLT)
theorem. This result tells us that to define a bounded linear transformation T :
Z ---+ X from a normed linear space Z to a Banach space X, it suffices to define

T on a dense subspace of Z. This is very useful because it is often easy to define


a transformation having the properties we want on some dense subspace, while the
definition on the entire space might be more difficult.
An important application of this theorem is the construction of the integral of
a Banach-valued function. We do this for single-variable functions in Section 5.10
and for multivariable functions in Section 8.1. We also use it later in Chapter 8
to define the Lebesgue integral. In every case, the idea is simple: first define the
integral on some of the simplest functions imaginable-step functions-and then
use the continuous linear extension theorem to extend to more general functions.

5.7.1 Bounded Linear Transformations


The convergence results and other tools developed previously in this chapter are
useful in many settings. The following result tells us we can also use them when
studying matrices and other bounded operators (see Definition 3.5.10) .

Theorem 5.7.1. Let $(X, \|\cdot\|_X)$ be a normed linear space, and let $(Y, \|\cdot\|_Y)$ be a
Banach space. The space $\mathscr{B}(X;Y)$ of bounded linear transformations from $X$ into
$Y$ is a Banach space when endowed with the induced norm $\|\cdot\|_{X,Y}$. In particular,
the space $X^* = \mathscr{B}(X;\mathbb{F})$ of bounded linear functionals is a Banach space.

Proof. The space $\mathscr{B}(X;Y)$ is a normed linear space by Theorem 3.5.11, so we
need only prove it is complete. Let $(A_k)_{k=0}^{\infty} \subset \mathscr{B}(X;Y)$ be a Cauchy sequence. We
prove that it converges in three steps: (i) construct a function $A : X \to Y$ that is a
candidate for the limit, (ii) show that $A$ is in $\mathscr{B}(X;Y)$, and (iii) show that $A_k \to A$
in the norm $\|\cdot\|_{X,Y}$.

(i) Define $A : X \to Y$ as follows. Given any nonzero $x \in X$ and $\varepsilon > 0$, there
exists $N > 0$ such that $\|A_m - A_n\|_{X,Y} < \varepsilon/\|x\|_X$ whenever $m, n \ge N$. Define
$y_k = A_k x$ for each $k \in \mathbb{N}$. Thus,
$$\|y_m - y_n\|_Y = \|(A_m - A_n)x\|_Y \le \|A_m - A_n\|_{X,Y}\,\|x\|_X < \varepsilon$$
whenever $m, n \ge N$. Thus, $(y_k)_{k=0}^{\infty}$ is a Cauchy sequence in the Banach space
$Y$, and so it must converge to some $y \in Y$. This procedure defines a mapping
from $X \setminus \{0\}$ to $Y$, and we can extend it to all of $X$ by sending $0$ to $0$. Call
the resulting map $A$.

(ii) We now show that $A \in \mathscr{B}(X;Y)$. First, we see that $A$ is linear since
$$A(ax_1 + bx_2) = \lim_{k\to\infty} A_k(ax_1 + bx_2) = a \lim_{k\to\infty} A_k x_1 + b \lim_{k\to\infty} A_k x_2 = aAx_1 + bAx_2.$$
To show that $A$ is a bounded linear transformation, again fix some $\varepsilon > 0$ and
choose an $N$ such that $\|A_m - A_n\|_{X,Y} < \varepsilon$ when $m, n \ge N$. Thus,
$$\|A_m x - A_n x\|_Y \le \|A_m - A_n\|_{X,Y}\,\|x\|_X < \varepsilon\,\|x\|_X.$$
Taking the limit as $m \to \infty$, we have that
$$\|Ax - A_n x\|_Y \le \varepsilon\,\|x\|_X. \tag{5.7}$$
It follows that
$$\|Ax\|_Y \le \|Ax - A_n x\|_Y + \|A_n x\|_Y \le \varepsilon\,\|x\|_X + \|A_n\|_{X,Y}\,\|x\|_X.$$
By Proposition 5.4.7, the sequence $(A_k)_{k=0}^{\infty}$ is bounded; hence, there exists
$M > 0$ such that $\|A_n\|_{X,Y} \le M$ for each $n \in \mathbb{N}$. Thus, $\|Ax\|_Y \le (\varepsilon + M)\|x\|_X$,
and
$$\|A\|_{X,Y} = \sup_{x \ne 0} \frac{\|Ax\|_Y}{\|x\|_X} \le (\varepsilon + M).$$
Therefore, $A$ is in $\mathscr{B}(X;Y)$.

(iii) We conclude the proof by showing that $A_k \to A$ with respect to the norm
$\|\cdot\|_{X,Y}$. By (5.7) we have $\frac{\|Ax - A_n x\|_Y}{\|x\|_X} \le \varepsilon$ whenever $n \ge N$ and $x \ne 0$. By
taking the supremum over all $x \ne 0$, we have that $\|A - A_n\|_{X,Y} \le \varepsilon$ whenever
$n \ge N$. □

Example 5.7.2. Let $(X, \|\cdot\|_X)$ be a Banach space. Since $\mathscr{B}(X)$ is also a
Banach space, it makes sense to define infinite series in $\mathscr{B}(X)$, provided the
series converge.
We define the exponential of a bounded operator $A \in \mathscr{B}(X)$ as $\exp(A) =
\sum_{k=0}^{\infty} \frac{A^k}{k!}$, where $A^0 = I$. If $\|\cdot\|$ is the operator norm, then the series $\exp(A)$
converges absolutely if the series $\sum_{k=0}^{\infty} \left\|\frac{A^k}{k!}\right\|$ of real numbers converges. This
is straightforward to check:
$$\sum_{k=0}^{\infty} \left\| \frac{A^k}{k!} \right\| \le \sum_{k=0}^{\infty} \frac{\|A\|^k}{k!} = e^{\|A\|} < \infty.$$
Therefore, the operator $\exp(A)$ is defined for any bounded operator $A$.
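For a concrete check (a Python sketch added here, not from the text), truncating the series for a small matrix reproduces its known exponential; for $A = \begin{pmatrix} 0 & 1\\ -1 & 0\end{pmatrix}$ the exponential is the rotation matrix with entries $\cos 1$ and $\pm\sin 1$:

import numpy as np

# Numerical sketch: truncate sum_k A**k / k! and compare with the known
# closed form exp(A) = [[cos 1, sin 1], [-sin 1, cos 1]] for this A.
A = np.array([[0.0, 1.0], [-1.0, 0.0]])
series, term = np.zeros((2, 2)), np.eye(2)
for k in range(1, 30):
    series += term
    term = term @ A / k              # next term is A**k / k!
rotation = np.array([[np.cos(1), np.sin(1)], [-np.sin(1), np.cos(1)]])
print(np.allclose(series, rotation))   # True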

Example 5.7.3. Let $(X, \|\cdot\|_X)$ be a Banach space. Let $A \in \mathscr{B}(X)$ be a
bounded operator with $\|A\| < 1$. The Neumann series of $A$ is the sum
$\sum_{k=0}^{\infty} A^k$. It is the analogue of the geometric series. This is well defined,
since the series is absolutely convergent. If $\|\cdot\|$ is the operator norm, then
$$\sum_{k=0}^{\infty} \|A^k\| \le \sum_{k=0}^{\infty} \|A\|^k = \frac{1}{1 - \|A\|} < \infty.$$

Proposition 5.7.4. Let $(X, \|\cdot\|_X)$ be a Banach space. If $A \in \mathscr{B}(X)$ satisfies
$\|A\| < 1$, then $I - A$ is invertible. Moreover, we have that $\sum_{k=0}^{\infty} A^k = (I - A)^{-1}$,
and thus $\|(I - A)^{-1}\| \le (1 - \|A\|)^{-1}$.

Proof. The proof is Exercise 5.43. D
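The following Python sketch (an added illustration, not from the text) checks the proposition numerically for a random matrix scaled so that its operator norm is below 1:

import numpy as np

# Numerical sketch: for ||A|| < 1 the partial Neumann sums of sum_k A**k
# converge to (I - A)^{-1}.
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
A *= 0.5 / np.linalg.norm(A, 2)          # rescale so the operator 2-norm is 0.5
neumann, term = np.zeros_like(A), np.eye(4)
for _ in range(200):
    neumann += term
    term = term @ A
print(np.allclose(neumann, np.linalg.inv(np.eye(4) - A)))   # True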



5.7.2 Bounded Functions


Recall from Proposition 3.5.9 that for any set $S$, the space $L^\infty(S;X)$ of all bounded
functions from $S$ into a normed linear space $(X, \|\cdot\|_X)$ with the sup norm $\|f\|_{L^\infty} =
\sup_{t \in S} \|f(t)\|$ is a normed linear space. The next theorem shows it is also complete.

Theorem 5.7.5. Let $(X, \|\cdot\|_X)$ be a Banach space. For any set $S$, the space
$(L^\infty(S;X), \|\cdot\|_{L^\infty})$ is a Banach space.

Proof. It suffices to show that the space is complete. Let $(f_k)_{k=0}^{\infty} \subset L^\infty(S;X)$ be
a Cauchy sequence. For each fixed $t \in S$, the sequence $(f_k(t))_{k=0}^{\infty}$ is Cauchy and
thus converges. Define $f(t) = \lim_{k\to\infty} f_k(t)$.
It suffices to show that $\|f - f_n\|_{L^\infty} \to 0$ as $n \to \infty$ and that $\|f\|_{L^\infty} < \infty$. As
mentioned in Corollary 5.3.22, we can pass the limit through the norm. Specifically,
given $\varepsilon > 0$, choose $N > 0$ such that $\|f_n - f_m\|_{L^\infty} < \varepsilon/2$ whenever $m, n \ge N$. Thus,
for each $s \in S$, we have
$$\|f(s) - f_m(s)\| = \lim_{n\to\infty} \|f_n(s) - f_m(s)\| \le \frac{\varepsilon}{2} < \varepsilon.$$
It follows that $\|f - f_m\|_{L^\infty} < \varepsilon$, which implies $\|f - f_n\|_{L^\infty} \to 0$.
To see that $f$ is bounded, note that since $(f_k)_{k=0}^{\infty}$ is Cauchy, it is bounded
(see Proposition 5.4.7), so there is some $M < \infty$ such that $\|f_n\|_{L^\infty} < M$. Again,
passing the limit through the norm gives $\|f\|_{L^\infty} \le M < \infty$. □

5.7.3 The Continuous Linear Exten sion Theorem


The next theorem provides the key tool for defining integration. We use this theorem
at the end of this chapter to define single-variable Banach-valued integrals, and
again when we study Lebesgue integration in Chapter 8.
This theorem says that if we know how to define a continuous linear trans-
formation on some dense subspace of a Banach space, then we can extend the
transformation uniquely to a linear transformation on the whole space. This theo-
rem is often known as the bounded linear transformation theorem because a linear
transformation is bounded if and only if it is continuous (by Proposition 5.4.22), and so the theorem
says any bounded linear transformation on a dense subspace extends uniquely to a
bounded linear transformation on the whole space.

Theorem 5.7.6 (Continuous Linear Extension Theorem). Let $(Z, \|\cdot\|_Z)$
be a normed linear space, $(X, \|\cdot\|_X)$ a Banach space, and $S \subset Z$ a dense subspace
of $Z$. If $T : S \to X$ is a bounded linear transformation, then $T$ has a unique linear
extension to $\overline{T} \in \mathscr{B}(Z;X)$ satisfying $\|\overline{T}\| = \|T\|$.

Proof. For $z \in Z$, since $S$ is dense, there exists a sequence $(s_k)_{k=0}^{\infty}$ in $S$ that
converges to $z$. Since it is a convergent sequence, it is Cauchy by Proposition 5.4.4.
Moreover, since $T$ is bounded on $S$, it is uniformly continuous there (see Proposition
5.4.22), and thus $(T(s_k))_{k=0}^{\infty}$ is also a Cauchy sequence by Theorem 5.4.24. More-
over, it converges, since $X$ is a Banach space.
For $z \in Z$, define $\overline{T}(z) = \lim_{k\to\infty} T(s_k)$. We now show this is well defined.
Let $(\hat{s}_k)_{k=0}^{\infty}$ be any other sequence in $S$ converging to $z$. For any $\varepsilon > 0$, by uniform
continuity of $T$ there is a $\delta > 0$ such that $\|T(a) - T(b)\|_X < \varepsilon/2$ whenever $\|a - b\|_Z <
\delta$. Choose $K > 0$ such that $\|s_k - z\|_Z < \delta/2$ and $\|\hat{s}_k - z\|_Z < \delta/2$ whenever $k \ge K$,
so that $\|\hat{s}_k - s_k\|_Z < \delta$; enlarging $K$ if necessary, we may also assume
$\|T(s_k) - \overline{T}(z)\|_X < \varepsilon/2$ whenever $k \ge K$. Thus, we have
$$\|T(\hat{s}_k) - \overline{T}(z)\|_X \le \|T(\hat{s}_k) - T(s_k)\|_X + \|T(s_k) - \overline{T}(z)\|_X < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon$$
whenever $k \ge K$. Therefore, $T(\hat{s}_k) \to \overline{T}(z)$, and the value of $\overline{T}(z)$ is independent
of the choice of sequence.
It remains to prove that $\overline{T}$ is linear, that $\|\overline{T}\| = \|T\|$, and that $\overline{T}$ is unique.
The linearity of $\overline{T}$ follows from the linearity of $T$. If $(s_k)_{k=0}^{\infty} \subset S$ converges to $z \in Z$
and $(\hat{s}_k)_{k=0}^{\infty} \subset S$ converges to $\hat{z} \in Z$, then for $a, b \in \mathbb{F}$ we have
$$\overline{T}(az + b\hat{z}) - a\overline{T}(z) - b\overline{T}(\hat{z}) = \lim_{k\to\infty}\left( T(as_k + b\hat{s}_k) - aT(s_k) - bT(\hat{s}_k) \right) = 0.$$
To see the norm of $\overline{T}$, note first that if $(s_k)_{k=0}^{\infty} \subset S$ converges to $z \in Z$, then
$$\|\overline{T}(z)\| = \left\| \lim_{k\to\infty} T(s_k) \right\| = \lim_{k\to\infty} \|T(s_k)\| \le \lim_{k\to\infty} \|T\|\,\|s_k\| = \|T\|\,\|z\|.$$
Therefore, we have
$$\|\overline{T}\| = \sup_{\substack{z \in Z\\ z \ne 0}} \frac{\|\overline{T}(z)\|}{\|z\|} \le \|T\|,$$
but $\overline{T}(s) = T(s)$ for all $s \in S$, so
$$\|\overline{T}\| = \sup_{\substack{z \in Z\\ z \ne 0}} \frac{\|\overline{T}(z)\|}{\|z\|} \ge \sup_{\substack{s \in S\\ s \ne 0}} \frac{\|T(s)\|}{\|s\|} = \|T\|.$$
Finally, for uniqueness, suppose there were another extension $\widetilde{T}$ of $T$ on $Z$.
If $\widetilde{T}(z) \ne \overline{T}(z)$ for some $z \in Z \setminus S$, then for some sequence $(s_k)_{k=0}^{\infty} \subset S$ that
converges to $z$, we have $0 = \lim_{k\to\infty} (\widetilde{T} - \overline{T})(s_k) = \widetilde{T}(z) - \overline{T}(z) \ne 0$, which is a
contradiction. Thus, the extension is unique. □

5.7.4 * Invertible Operators

Proposition 5.7.4 gave a way to compute the inverse of $I - A$ for $\|A\| < 1$. We
now need to study some basic properties of more general inverses. Recall that for
any matrix the determinant is nonzero if and only if the inverse exists. Moreover,
the function $\det : M_n(\mathbb{F}) \to \mathbb{F}$ is continuous, so the inverse image of 0 (the set
$\det^{-1}(0) = \{A \in M_n(\mathbb{F}) \mid \det(A) = 0\}$) is closed, which means its complement, the
set of all invertible matrices (often denoted $GL_n(\mathbb{F})$), is open.
Similarly, the adjugate of $A$ is clearly a continuous function in the entries of $A$,
so Cramer's rule (or rather Corollary 2.9.23) tells us that the inverse map $A \mapsto A^{-1}$
must be a continuous function of $A$.

The next proposition tells us that these two results hold for bounded linear
operators on an arbitrary Banach space- not just for matrices. The difficulty in
proving this comes from the fact that we have neither a determinant function nor
an analogue of Cramer's rule.

Proposition 5.7.7. Let $(X, \|\cdot\|)$ be a Banach space and define
$$GL(X) = \{A \in \mathscr{B}(X) \mid A^{-1} \in \mathscr{B}(X)\}.$$
The set $GL(X)$ is open in $\mathscr{B}(X)$, and the function $\operatorname{Inv}(A) = A^{-1}$ is continuous on
$GL(X)$.

Proof. Let $A \in GL(X)$. We claim that $B(A,r) \subset GL(X)$ for $r = \|A^{-1}\|^{-1}$. This
shows that $GL(X)$ is open.
To see the claim, choose $L \in B(A,r)$ and write $L = A(I - A^{-1}(A - L))$. Since
$\|A - L\| < r$, we have that
$$\|A^{-1}(A - L)\| \le \|A^{-1}\|\,\|A - L\| < \|A^{-1}\|\, r = 1, \tag{5.8}$$
and thus $I - A^{-1}(A - L) \in GL(X)$; see Proposition 5.7.4. By Remark 2.2.11 and
Exercise 2.6, we have that $L \in GL(X)$, so the claim holds and $GL(X)$ is open.
To see the continuity of the inverse map, first note that for any invertible $A$
and $L$, we have $L^{-1} = (A^{-1}L)^{-1}A^{-1}$, so if $\|L - A\| < \frac{1}{2\|A^{-1}\|}$, then
$$\begin{aligned}
\|L^{-1} - A^{-1}\| &= \|(I - A^{-1}(A - L))^{-1} A^{-1} - A^{-1}\| \\
&= \left\| \left( \sum_{k=0}^{\infty} (A^{-1}(A - L))^k \right) A^{-1} - A^{-1} \right\| \\
&\le \left( \sum_{k=1}^{\infty} \|A^{-1}(A - L)\|^k \right) \|A^{-1}\| \\
&= \frac{\|A^{-1}(A - L)\|}{1 - \|A^{-1}(A - L)\|}\,\|A^{-1}\| \\
&\le \frac{\|A^{-1}\|^2\,\|A - L\|}{1 - \|A^{-1}(A - L)\|}.
\end{aligned} \tag{5.9}$$
Given $\varepsilon > 0$, set $\delta = \min\left( \frac{\varepsilon}{2\|A^{-1}\|^2},\, \frac{1}{2\|A^{-1}\|} \right)$, so that whenever $\|L - A\| < \delta$ we have
$$1 - \|A^{-1}(A - L)\| > \frac{1}{2},$$
and thus
$$\|L^{-1} - A^{-1}\| \le \frac{\|A^{-1}\|^2\,\|A - L\|}{1 - \|A^{-1}(A - L)\|} < 2\|A^{-1}\|^2\,\|A - L\| < \varepsilon$$
whenever $\|L - A\| < \delta$. Hence, the map $A \mapsto A^{-1}$ is continuous. □

The proof of the previous proposition actually gives some explicit bounds on
the norm of inverses and the size of the open ball in GL(X) containing the inverse
of some operator. This is useful for studying pseudospectra in Chapter 14.

Proposition 5.7.8. Let $(X, \|\cdot\|_X)$ be a Banach space. Suppose $A \in \mathscr{B}(X)$ satisfies
$\|A^{-1}\| < M$. For any $E \in \mathscr{B}(X)$ with $\|E\| < 1/\|A^{-1}\|$, the operator $A + E$ has a
bounded inverse satisfying
$$\|(A+E)^{-1}\| \le \frac{\|A^{-1}\|}{1 - \|A^{-1}\|\,\|E\|}.$$

Proof. The proof is Exercise 5.44. D

5.8 Topologically Equ ivalent Metrics


Recall that for a metric space (X, d) the collection of all the open sets is called
the topology induced by the metric d. Changing the metric generally changes the
topology, but sometimes different metrics define the same topology. In that case,
we say that the metrics are topologically equivalent . For example, in this section we
show that any metric induced by a norm on !Fn is topologically equivalent to the
Euclidean metric.
Properties that depend only on open sets-topological properties-are the
same for all topologically equivalent metrics. Some examples of topological proper-
ties that you have already encountered are continuity and compactness. In the
following section we also discuss another important topological property called
connectedness.
One reason that all this discussion of topology is helpful is that often it is easier
to prove that a certain property holds for one metric than for another. If the metrics
are topologically equivalent, and if the property in question is a topological one, then
it is enough to prove the property holds for one metric, and that automatically
implies it holds for the other.

5.8.1 Topological Equivalence

Definition 5.8.1. Let $X$ be a set with two metrics $d_a$ and $d_b$. Each metric induces a collection of open sets, and this collection is called a topology. We denote the set of all $d_a$-open sets by $\mathscr{T}_a$ and call it the $a$-topology. Similarly, we denote the set of all $d_b$-open sets by $\mathscr{T}_b$ and call it the $b$-topology. We say that $d_a$ and $d_b$ are topologically equivalent if they define the same open sets on $X$, that is, if $\mathscr{T}_a = \mathscr{T}_b$.

Example 5.8.2.

(i) Let $\mathscr{T}$ be the topology induced on a set $X$ by the discrete metric (see Example 5.1.3(iii)). For any point $x \in X$ and for any $r < 1$, we have $B(x, r) = \{x\}$. Since arbitrary unions of open sets are open, this means that any set is open with respect to this metric, and the topology on $X$ induced by this metric is the entire power set $2^X$ of $X$.

(ii) The 2-norm on $\mathbb{F}^n$ induces the usual Euclidean metric and the Euclidean topology. The open sets in this topology are what most mathematicians usually mean when they say "open" without any other statement about the metric (or the topology). In particular, open sets in $\mathbb{R}$ with the Euclidean topology are infinite unions and finite intersections of open intervals $(a, b)$.
Example 5.3.12 shows that the Euclidean topology on $\mathbb{F}^n$ is not topologically equivalent to the discrete topology.

(iii) The space $C([0,1]; \mathbb{R})$ with the sup norm defines a topology via the metric $d_\infty(f, g) = \|f - g\|_{L^\infty} = \sup_{x \in [0,1]} |f(x) - g(x)|$.
We may also define the $L^1$-norm and its corresponding metric as $d_1(f, g) = \|f - g\|_{L^1} = \int_0^1 |f(t) - g(t)|\, dt$.
Every open set of the $L^1$-topology is also an open set of the sup-topology because for any $f \in C([0,1]; \mathbb{R})$ and for any $\varepsilon > 0$, the ball $B_\infty(f, \varepsilon/2) \subset B_1(f, \varepsilon)$. To see this, observe that for any $g \in B_\infty(f, \varepsilon/2)$ we have $\|f - g\|_{L^1} = \int_0^1 |f - g|\, dt \le \int_0^1 \varepsilon/2\, dt = \varepsilon/2 < \varepsilon$.
In Unexample 5.8.6 we show that the $L^1$-metric is not topologically equivalent to the $L^\infty$-metric.

5.8.2 Characterization of Topologically Equivalent Metrics



Theorem 5.8.3. Let $X$ be a set with two metrics, $d_a$ and $d_b$. The metrics $d_a$ and $d_b$ are topologically equivalent if and only if for all $x \in X$ and for all $\varepsilon > 0$ there exist $\delta_a, \delta_b > 0$ such that
$$B_a(x, \delta_a) \subset B_b(x, \varepsilon) \quad\text{and}\quad B_b(x, \delta_b) \subset B_a(x, \varepsilon), \qquad (5.10)$$
where $B_a$ and $B_b$ are the open balls defined by the metrics $d_a$ and $d_b$, respectively.

Proof. If $d_a$ and $d_b$ are topologically equivalent, then every ball $B_a(x, \varepsilon)$ is open with respect to $d_b$, and by the definition of open set (Definition 5.1.8) there must be a ball $B_b(x, \delta)$ contained in $B_a(x, \varepsilon)$. Similarly, every ball $B_b(x, \varepsilon)$ is open with respect to $d_a$, and there must be a ball $B_a(x, \gamma)$ contained in $B_b(x, \varepsilon)$.
Conversely, let $U \subset X$ be an open set in the metric space $(X, d_a)$. For each $x \in U$ there exists $\varepsilon > 0$ such that $B_a(x, \varepsilon) \subset U$, and by hypothesis there is a $\delta_b > 0$ such that $B_b(x, \delta_b) \subset B_a(x, \varepsilon) \subset U$. Hence, $U$ is open in the metric space $(X, d_b)$. Interchanging the roles of $a$ and $b$ in this argument shows that every set that is open in $(X, d_b)$ is also open in $(X, d_a)$. Hence, the topologies are equivalent. $\square$

5.8.3 Metrics Induced by Norms


If we restrict ourselves to normed linear spaces and only allow metrics induced by norms, then we can strengthen Theorem 5.8.3 as follows.

Theorem 5.8.4. Let $X$ be a vector space with two norms, $\|\cdot\|_a$ and $\|\cdot\|_b$. The metrics induced by these norms are topologically equivalent if and only if there exist constants $0 < m \le M$ such that
$$m\|x\|_a \le \|x\|_b \le M\|x\|_a \qquad (5.11)$$
for all $x \in X$.

Proof. ($\Rightarrow$) If (5.11) holds for some $x \in X$, then it also holds for every scalar multiple of $x$. Therefore, it suffices to prove that (5.11) holds for every $x \in B(0, 1)$. Let $d_a$ and $d_b$ be the metrics on $X$ induced by the norms $\|\cdot\|_a$ and $\|\cdot\|_b$, respectively. If $d_a$ and $d_b$ are topologically equivalent, then by Theorem 5.8.3 there exist $0 < \varepsilon' < 1$ and $0 < \varepsilon'' < 1$ such that
$$B_a(0, \varepsilon'') \subset B_b(0, \varepsilon') \subset B_a(0, 1). \qquad (5.12)$$
By Exercise 5.50 we have
$$\varepsilon'\|x\|_a \le \|x\|_b \le \frac{1}{\varepsilon''}\|x\|_a$$
for every $x \in B(0, 1)$. Setting $m = \varepsilon'$ and $M = \frac{1}{\varepsilon''}$ gives (5.11).

($\Leftarrow$) If (5.11) holds for all $x \in X$, then it also holds for all $(x - y) \in X$; that is, for all $x, y \in X$ we have
$$m\|x - y\|_a \le \|x - y\|_b \le M\|x - y\|_a.$$
Hence, $m\, d_a(x, y) \le d_b(x, y)$ and $d_b(x, y) \le M\, d_a(x, y)$, which implies that for every $\varepsilon > 0$ we have
$$B_a(x, \varepsilon/M) \subset B_b(x, \varepsilon) \quad\text{and}\quad B_b(x, m\varepsilon) \subset B_a(x, \varepsilon).$$
Therefore, by Theorem 5.8.3 (with $\delta_a = \varepsilon/M$ and $\delta_b = m\varepsilon$) the two norms are topologically equivalent. $\square$

Example 5.8.5. Recall the following inequalities on $\mathbb{F}^n$ from Exercise 3.17:
(i) $\|x\|_2 \le \|x\|_1 \le \sqrt{n}\,\|x\|_2$;
(ii) $\|x\|_\infty \le \|x\|_2 \le \sqrt{n}\,\|x\|_\infty$.
This shows that these norms are topologically equivalent on $\mathbb{F}^n$.
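A quick numerical sanity check of these inequalities is easy to write. The sketch below assumes NumPy; the dimension and random sample vectors are arbitrary choices.

    import numpy as np

    # Check the inequalities of Example 5.8.5 on randomly sampled vectors.
    rng = np.random.default_rng(0)
    n = 7
    for _ in range(1000):
        x = rng.standard_normal(n)
        one, two, inf = np.linalg.norm(x, 1), np.linalg.norm(x, 2), np.linalg.norm(x, np.inf)
        assert two <= one <= np.sqrt(n) * two
        assert inf <= two <= np.sqrt(n) * inf
    print("all sample vectors satisfy the inequalities")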

Unexample 5.8.6. The norms $L^\infty$ and $L^1$ are not topologically equivalent on $C([0,1]; \mathbb{R})$. To see this, let $f_n(x) = x^n$ for each $n \in \mathbb{N}$. It is straightforward to check that $\|f_n\|_{L^\infty} = 1$ and $\|f_n\|_{L^1} = \frac{1}{n+1}$, which shows that (5.11) cannot hold.
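The failure of (5.11) here is easy to see numerically: the ratio $\|f_n\|_{L^\infty} / \|f_n\|_{L^1} = n + 1$ is unbounded. A tiny sketch in plain Python, using the exact values computed above:

    # The ratio ||f_n||_{L^inf} / ||f_n||_{L^1} for f_n(x) = x^n on [0, 1] grows
    # without bound, so no constant m with m * ||f||_inf <= ||f||_1 can exist.
    for n in (1, 10, 100, 1000):
        sup_norm = 1.0            # max of x^n on [0, 1]
        l1_norm = 1.0 / (n + 1)   # integral of x^n on [0, 1]
        print(n, sup_norm / l1_norm)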

Theorem 5.8.7. All norms on a finite-dimensional vector space are topologically equivalent.

Proof. Let $\|\cdot\|$ be a given norm on a finite-dimensional vector space $X$. By Corollary 2.3.12 there is a vector-space isomorphism $\ell : X \to \mathbb{F}^n$ for some $n \in \mathbb{N}$. The map $\ell$ induces a norm $\|\cdot\|_\ell$ on $\mathbb{F}^n$ by $\|x\|_\ell = \|\ell^{-1}(x)\|$ (verification that this is a norm is Exercise 5.48). Therefore, it suffices to assume that $X = \mathbb{F}^n$.
Moreover, it suffices to show that $\|\cdot\|$ on $\mathbb{F}^n$ is topologically equivalent to the 1-norm $\|\cdot\|_1$. From Example 5.5.8, we know that the unit sphere in the 1-norm is compact and that the function $f(x) = \|x\|$ is continuous in the 1-norm topology, so on that sphere it attains both a maximum value $M$ and a minimum value $m > 0$. Thus, for every nonzero $x \in \mathbb{F}^n$ we have $m \le \left\|\frac{x}{\|x\|_1}\right\| \le M$, hence $m\|x\|_1 \le \|x\| \le M\|x\|_1$. $\square$

Remark 5.8.8. The previous theorem does not hold for infinite-dimensional spaces, as shown in Unexample 5.8.6.

Remark 5.8.9. Recall the Cartesian product of several metric spaces $X_1 \times \cdots \times X_n$ described in Example 5.1.3(iv). Following the approach of Theorem 5.8.7, one can show that for any $p, q \in [1, \infty)$ the $p$-metric is topologically equivalent to the $q$-metric on $X_1 \times \cdots \times X_n$.

5.9 Topological Properties


A property of metric spaces is topological if it depends only on the topology of the
space and not on the metric. In other words, a property is topological if whenever
it holds for one metric on X, it also holds for any topologically equivalent metric on
X. Any time we are interested in studying a topological property, we may switch
from one metric to any topologically equivalent metric. Depending on the situation,
some metrics may be much easier to work with than others, so this can simplify
many arguments and computations. In particular, Theorem 5.8.7 tells us that any
time we are interested in studying a topological property of a finite-dimensional
normed linear space, we can use whichever norm is the easiest to work with for that
problem.

5.9.1 Homeomorphisms


To begin our study of topological properties, we first turn to continuous functions.
Continuous functions are important in this setting because the preimage of an open
set is open (see Theorem 5.2. 3). Therefore, a function that is continuous and has
a continuous inverse preserves all topological properties. We use this fact to make
the idea of a topological property more precise.

Definition 5.9.1. Let $(X, d)$ and $(Y, \rho)$ be metric spaces. A homeomorphism $f : (X, d) \to (Y, \rho)$ is a bijective, continuous map whose inverse $f^{-1}$ is also continuous.

Example 5.9.2. The map $f : (0, 1) \to (1, \infty) \subset \mathbb{R}$ given by $f(t) = 1/t$ is a homeomorphism: it is clearly bijective and continuous, and its inverse is $f^{-1}(s) = 1/s$, which is also continuous.

Unexample 5.9.3. A bijective, continuous function need not be a homeomorphism. For example, the map $f : [0, 1) \to S^1 = \{z \in \mathbb{C} \mid |z| = 1\}$ given by $f(t) = e^{2\pi i t}$ is continuous and bijective, but its inverse is not continuous at $z = 1$. This can be seen from the fact that $S^1$ is compact, but $[0, 1)$ is not. Since continuous functions must map compact sets to compact sets (see Proposition 5.5.6), the map $f^{-1}$ cannot be continuous.

Definition 5.9.4. A topological property is one that is preserved under


homeomorphism.

Example 5.9.5. Open sets are preserved under homeomorphism, as are closed
sets. Compactness is defined only in terms of open sets, so it is also preserved
under homeomorphism. Theorem 5.3.19 guarantees that convergence of se-
quences is preserved by continuous functions, and therefore convergence is a
topological property.

Proposition 5.9.6. Two metrics d and p on X are topologically equivalent if and


only if the identity map i: (X, d)-+ (X, p) is a homeomorphism.

Proof. The proof is Exercise 5.51. D

Corollary 5.9.7. Let $(X, d)$ be a metric space. If $\rho$ is another metric on $X$ that is topologically equivalent to $d$, then a sequence $(x_n)_{n=0}^{\infty}$ in $X$ converges to $x$ in $(X, d)$ if and only if it converges to $x$ in $(X, \rho)$.

Proof. By Proposition 5.9.6 the identity map i : (X, d) -+ (X, p) is a homeomor-


phism, and the result now follows because convergence and limits are preserved by
homeomorphisms (see Theorem 5.3.19) . D

5.9.2 Topological versus Uniform Properties


Completeness is not a topological property (see Example 5.4.10), and a sequence that is Cauchy in one metric space $(X, d)$ is not necessarily Cauchy for a topologically equivalent metric $(X, \rho)$. However, these are uniform properties, meaning that they are preserved by uniformly continuous functions (see Theorem 5.4.24). Remarkably, the next proposition shows that if two metrics $d_a, d_b$ on a vector space $X$ are induced by topologically equivalent norms $\|\cdot\|_a, \|\cdot\|_b$, then the identity map $i : (X, d_a) \to (X, d_b)$ is uniformly continuous (and its inverse is also uniformly continuous), so Cauchy sequences and completeness are preserved by topologically equivalent norms.

Proposition 5.9.8. If $(X, \|\cdot\|_a)$ is a normed linear space, and if $\|\cdot\|_b$ is another norm on $X$ that is topologically equivalent to $\|\cdot\|_a$, then the identity map $i : (X, \|\cdot\|_a) \to (X, \|\cdot\|_b)$ is uniformly continuous with a uniformly continuous inverse.

Proof. It is immediate from the definition of topologically equivalent norms that the identity map is a bounded linear transformation, and hence by Proposition 5.4.22 it is uniformly continuous. The same applies to its inverse. $\square$

R e m ark 5 .9.9. It is important to note that the previous result only holds for
normed linear spaces and metrics induced by norms. It does not hold for more
general metrics.

5.9.3 Connectedness


Connectedness is an important topological property. You might think the idea is obvious: a connected space shouldn't break into separate parts, and you should be able to get from any point to any other by traveling in the space. But to actually prove anything we need to make these ideas mathematically precise. Being precise about definitions also reveals that these are actually two distinct ideas. Connectedness (not being able to break the space apart) is not quite the same as path connectedness (being able to get from any point to any other along a path in the space).
We begin this section with the careful definition of connectedness. We then
discuss some of the consequences of connectedness and conclude with a discussion
of path connectedness.

Connectedness

Definition 5.9.10. A metric space $X$ is disconnected if there are disjoint nonempty open subsets $U$ and $V$ such that $X = U \cup V$. In this case, we say the subsets $U$ and $V$ disconnect $X$. If $X$ is not disconnected, then it is connected.

Remark 5.9.11. If $X$ is disconnected, then we can choose disjoint open sets $U$ and $V$ with $X = U \cup V$. This implies that $U = V^c$ and $V = U^c$ are also closed. Hence, a space $X$ is connected if and only if the only sets that are both open and closed are $X$ and $\emptyset$.

Example 5.9.12.

(i) As we see later in this section, the line $\mathbb{R}$ is connected. But the set $\mathbb{R} \setminus \{0\}$ is disconnected because it is the union of the two disjoint open sets $(-\infty, 0)$ and $(0, \infty)$.

(ii) If $X$ is any set with at least two points and $d$ is the discrete metric on $X$, then $(X, d)$ is disconnected. This is because every set in the discrete topology is both open and closed; see Example 5.8.2(i).

Theorem 5.9.13. Let f : (X, d) --+ (Y, p) be continuous and surjective. If X is


connected, then so is Y.

Proof. Suppose Y is not connected. There must be nonempty disjoint open sets U
and V satisfying Y = U UV. Since f is continuous and surjective, the sets f- 1 (U)
and f- 1 (V) are also nonempty disjoint open sets satisfying X = f - 1 (U) U f- 1 (V).
This is a contradiction. D

Consequences of connectedness
Our first important consequence of connectedness is a corollary of Theorem 5.9.13.

Corollary 5.9.14 (Intermediate Value Theorem). Assume (X, d) is a con-


nected metric space, and let f : (X, d) --+ JR be continuous (with the usual topology
on JR) . If f(x) < f(y) and c E (f(x), f(y)), then there exists z E X such that
f(z) = c.

Proof. If no such z exists, then f- 1 ( ( -oo, c)) and f - 1 ( ( c, oo)) are nonempty and
disconnect X, which is a contradiction. D

Example 5.9.15. Similar to the proof of the intermediate value theorem, we can use connectedness to show that an odd-degree polynomial $p \in \mathbb{R}[x]$ has a real root. If not, then $p^{-1}((-\infty, 0))$ and $p^{-1}((0, \infty))$ are both nonempty and they disconnect $\mathbb{R}$, which contradicts the fact that $\mathbb{R}$ is a connected set. But if $p$ has even degree, then $p^{-1}((-\infty, 0))$ or $p^{-1}((0, \infty))$ may be empty, so this argument does not hold for even-degree polynomials.

The intermediate value theorem leads to our first example of a fixed-point the-
orem. Fixed-point theorems are used throughout mathematics and are also widely
used in economics. We encounter them again in Chapter 7.
Corollary 5.9.16 (One-Dimensional Brouwer Fixed-Point Theorem). If $f : [a, b] \to [a, b]$ is continuous, then there exists $x \in [a, b]$ such that $f(x) = x$.

Proof. We may assume that $f(a) > a$ and $f(b) < b$; otherwise $f(a) = a$ or $f(b) = b$, and the conclusion follows immediately. Hence, the function $g(x) = x - f(x)$ satisfies $g(a) < 0$ and $g(b) > 0$, and is continuous on $[a, b]$. Thus, by the intermediate value theorem, there exists $c \in (a, b)$ such that $g(c) = 0$, or in other words, $f(c) = c$. $\square$

An illustration of the Brouwer fixed-point theorem is given in Figure 5.3.
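The proof also suggests a simple numerical procedure: bisect on $g(x) = x - f(x)$, which changes sign on $[a, b]$. A minimal sketch in Python follows; the particular map used at the end ($\cos$ on $[0, 1]$) is only a hypothetical example.

    import math

    def fixed_point(f, a, b, tol=1e-12):
        """Locate a fixed point of a continuous f : [a, b] -> [a, b] by bisecting
        on g(x) = x - f(x), which is negative at a and positive at b (or zero at
        an endpoint, in which case that endpoint is already a fixed point)."""
        g = lambda x: x - f(x)
        if g(a) == 0:
            return a
        if g(b) == 0:
            return b
        lo, hi = a, b                      # g(lo) < 0 < g(hi)
        while hi - lo > tol:
            mid = 0.5 * (lo + hi)
            if g(mid) <= 0:
                lo = mid
            else:
                hi = mid
        return 0.5 * (lo + hi)

    # Hypothetical example: f(x) = cos(x) maps [0, 1] into [0, 1].
    x_star = fixed_point(math.cos, 0.0, 1.0)
    print(x_star, math.cos(x_star))        # approximately 0.739085...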


Corollary 5.9.17 (One-Dimensional Borsuk-Ulam Theorem). A continuous real-valued map $f$ on the unit circle $S^1 = \{(x, y) \in \mathbb{R}^2 : \|(x, y)\|_2 = 1\}$ has antipodal points that are equal; that is, there exists $z \in S^1$ such that $f(z) = f(-z)$.


Figure 5.3. Plot of a continuous function $f : [a, b] \to [a, b]$. The Brouwer fixed-point theorem (Corollary 5.9.16) guarantees that any such function must have at least one fixed point, where $f(x) = x$. In this particular case $f$ has three fixed points (red).

Proof. The proof is Exercise 5.54. D

Remark 5.9.18. Connectedness is a very powerful, yet subtle, property. It implies,


for example, by way of Corollary 5.9.17, that at any point in time there are antipodal
points on the earth (or on any circle on the surface of the earth) that are exactly
the same temperature!
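The same bisection idea from the Brouwer sketch locates these antipodal points numerically: the function $h(\theta) = f(\theta) - f(\theta + \pi)$ satisfies $h(\theta + \pi) = -h(\theta)$, so it changes sign and the intermediate value theorem applies. A brief sketch in Python; the "temperature" function at the end is a made-up example.

    import math

    def antipodal_point(f, tol=1e-12):
        """Find theta with f(theta) = f(theta + pi) for a continuous 2*pi-periodic f,
        by bisecting on h(theta) = f(theta) - f(theta + pi)."""
        h = lambda t: f(t) - f(t + math.pi)
        lo, hi = 0.0, math.pi
        if h(lo) == 0.0:
            return lo
        s = 1.0 if h(lo) > 0 else -1.0     # sign of h at the left endpoint
        while hi - lo > tol:
            mid = 0.5 * (lo + hi)
            if s * h(mid) > 0:
                lo = mid
            else:
                hi = mid
        return 0.5 * (lo + hi)

    # Made-up "temperature around a circle" as a function of the angle.
    temp = lambda t: 20 + 5 * math.sin(t) + 2 * math.cos(3 * t)
    t0 = antipodal_point(temp)
    print(temp(t0), temp(t0 + math.pi))    # equal up to roundoff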

Path connectedness
Connectedness is defined in terms of not being able to separate a space, but as
mentioned in the introduction to this section, it may seem intuitive that you should
be able to get from any point in a connected space to any other point by traveling
within the space. This second property is actually stronger than the definition of
connectedness. We call this stronger property path connectedness.
Definition 5.9.19. A subset $E$ of a metric space is path connected if for any $x, y \in E$ there is a continuous map $\gamma : [0, 1] \to E$ such that $\gamma(0) = x$ and $\gamma(1) = y$. Such a $\gamma$ is called a path from $x$ to $y$. See Figure 5.4.

To understand path-connected spaces, we first need to know that the interval


[O, 1] is connected.

Proposition 5.9.20. The interval $[0, 1] \subset \mathbb{R}$ is connected (in the usual topology).

Proof. Suppose that the interval has a separating pair $U \cup V = [0, 1]$. Take $u \in U$ and $v \in V$. Without loss of generality, assume $u < v$. The interval $[u, v]$ is a subset of $[0, 1]$, so it is contained in $U \cup V$. The set $A = [u, v] \cap U = [u, v] \cap V^c$ is compact, because it is a closed subset of the compact interval $[u, v]$. Therefore $A$ contains its largest element $a$. Also $a < v$ because $v \notin U$.

Figure 5.4. In a path-connected space, there is a path within the space between any two points; see Definition 5.9.19.

Similarly, the set $[a, v] \cap V$ is compact and contains its least element $b$. Since $b \notin A$ we must have $a < b$, which shows that the interval $(a, b) \subset [u, v]$ is not empty. But $(a, b) \cap U = (a, b) \cap V = \emptyset$; hence, not every element of $[0, 1]$ is contained in $U \cup V$, a contradiction. $\square$

Theorem 5.9.21. A path-connected space is connected.³⁷

Proof. Assume that $X$ is path connected but not connected. Hence, there exists a pair of nonempty disjoint open sets $U$ and $V$ such that $X = U \cup V$. Choose $x \in U$ and $y \in V$ and a continuous map $\gamma : [0, 1] \to X$ such that $\gamma(0) = x$ and $\gamma(1) = y$. Thus, the sets $\gamma^{-1}(U)$ and $\gamma^{-1}(V)$ are nonempty, disjoint, and disconnect the connected set $[0, 1]$, which is a contradiction. $\square$

Example 5.9.22. Any interval in IR is path connected, and thus connected.

5.10 Banach-Valued Integration


In this section, we use the continuous linear extension theorem to construct the integral of a single-variable Banach-valued function on a compact interval. To do this we first define the integral on some of the simplest functions imaginable: step functions. For step functions the definition of the integral is very easy and obvious. Since continuous functions are in the closure (under the sup norm $\|\cdot\|_{L^\infty}$) of the space of step functions, this means that once we show that the integral, as a linear operator, is bounded, we can use the continuous linear extension theorem to extend the integral to all continuous functions. These integrals are used extensively in Chapter 6. In particular, we use these integrals to define Taylor series in a Banach space (see Section 6.6).

37 The converse to this theorem is false. There are spaces that are connected but not path con-
nected. For an example of such a space, see [Mun75, Ex. 2, Sect. 25].

An immediate consequence of the fact that the integral is a continuous linear operator is that uniform limits commute with integration. With all the tools we now have at our disposal, it is easy to define the integral and prove many of its properties, much more easily than with the traditional definition of the Riemann integral. We use these tools again when we treat the Lebesgue construction of the integral in Chapter 8.
Throughout this section, let $(X, \|\cdot\|_X)$ be a Banach space.

Nota Bene 5.10.1. The approach we use to define the integral is unusual. Aside from Dieudonné's books [Bou76, Die60], we know of no textbook that treats the integral this way. But it has many advantages over the more common approaches, including being much simpler. We believe it is also a better preparation for the ideas of Lebesgue integration. The construction of the integral given here is called the regulated integral, and although it is similar to the Riemann integral in many ways, it is not identical to it. Nevertheless, it is straightforward to show that whenever the two constructions are both defined, they must agree.

5.10.1 Integrals of Step Functions

Definition 5.10.2. A map $f : [a, b] \to X$ is a step function if there is a (finite) subdivision $a = t_0 < t_1 < \cdots < t_{N-1} < t_N = b$ such that we may write $f$ in the form
$$f(t) = \left(\sum_{i=1}^{N-1} x_i \mathbb{1}_{[t_{i-1}, t_i)}(t)\right) + x_N \mathbb{1}_{[t_{N-1}, t_N]}(t), \qquad (5.13)$$
where each $x_i \in X$ and $\mathbb{1}_E$ is the indicator function of the set $E$:
$$\mathbb{1}_E(t) = \begin{cases} 1 & \text{if } t \in E, \\ 0 & \text{if } t \notin E. \end{cases} \qquad (5.14)$$
Let $S([a, b]; X)$ denote the set of all step functions mapping $[a, b]$ into $X$.

For an illustration of a step function see Figure 5.5.

Figure 5.5. Graph of a step function $s : [a, b] \to \mathbb{R}$, as described in Definition 5.10.2.

Proposition 5.10.3. The set $S([a, b]; X)$ of step functions is a subspace of the normed linear space of bounded functions $(L^\infty([a, b]; X), \|\cdot\|_{L^\infty})$.

Proof. To see that $S([a, b]; X)$ is a subset of $L^\infty([a, b]; X)$, note that a step function $f$ of the form (5.13) has finite sup norm since $\|f\|_{L^\infty} = \sup_k \|x_k\|_X$.
It suffices to show that $S([a, b]; X)$ is closed under linear combinations; that is, given $\alpha, \beta \in \mathbb{F}$ and $f, g \in S([a, b]; X)$, we show that $\alpha f + \beta g \in S([a, b]; X)$. Let
$$f(t) = \left(\sum_{i=1}^{N-1} x_i \mathbb{1}_{[t_{i-1}, t_i)}(t)\right) + x_N \mathbb{1}_{[t_{N-1}, t_N]}(t)
\quad\text{and}\quad
g(t) = \left(\sum_{i=1}^{M-1} y_i \mathbb{1}_{[s_{i-1}, s_i)}(t)\right) + y_M \mathbb{1}_{[s_{M-1}, s_M]}(t).$$
Write the union of the indices $\{t_0, \ldots, t_N, s_0, \ldots, s_M\}$ as an ordered list $u_0 < u_1 < \cdots < u_\ell$ (first eliminating any duplicates). The step functions $f$ and $g$ can be rewritten as
$$f(t) = \left(\sum_{i=1}^{\ell-1} x_i' \mathbb{1}_{[u_{i-1}, u_i)}(t)\right) + x_\ell' \mathbb{1}_{[u_{\ell-1}, u_\ell]}(t)
\quad\text{and}\quad
g(t) = \left(\sum_{i=1}^{\ell-1} y_i' \mathbb{1}_{[u_{i-1}, u_i)}(t)\right) + y_\ell' \mathbb{1}_{[u_{\ell-1}, u_\ell]}(t),$$
where each $x_i'$ and $y_i'$ lies in $X$. Thus, the sum takes the form
$$\alpha f(t) + \beta g(t) = \left(\sum_{i=1}^{\ell-1} (\alpha x_i' + \beta y_i') \mathbb{1}_{[u_{i-1}, u_i)}(t)\right) + (\alpha x_\ell' + \beta y_\ell') \mathbb{1}_{[u_{\ell-1}, u_\ell]}(t).$$
This is a step function on $[a, b]$. $\square$

Definition 5.10.4. The integral of a step function $f \in S([a, b]; X)$ of the form (5.13) is defined to be
$$I(f) = \sum_{i=1}^{N} x_i (t_i - t_{i-1}). \qquad (5.15)$$
This is a map from $S([a, b]; X)$ to $X$. We often write $I(f)$ as $\int_a^b f(t)\, dt$.
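In code, a step function is just a list of subdivision points together with the value taken on each piece, and (5.15) is a single sum. A minimal sketch assuming NumPy; the example data at the end are arbitrary.

    import numpy as np

    def step_integral(t, values):
        """Integral (5.15) of a step function on [t[0], t[-1]].

        t      : subdivision points a = t[0] < t[1] < ... < t[N] = b
        values : values[i] is the (possibly vector-valued) value on [t[i], t[i+1])
        """
        t = np.asarray(t, dtype=float)
        values = np.asarray(values, dtype=float)
        widths = np.diff(t)                          # t_i - t_{i-1}
        return np.tensordot(widths, values, axes=(0, 0))

    # Example: a real-valued step function on [0, 3] with values 1, -2, 4.
    print(step_integral([0.0, 1.0, 2.0, 3.0], [1.0, -2.0, 4.0]))   # 1 - 2 + 4 = 3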

Proposition 5.10.5. The integral map $I : S([a, b]; X) \to X$ is a bounded linear transformation with induced norm $\|I\| = (b - a)$.

Proof. Linearity follows from the combining of two subdivisions, as described in the proof of Proposition 5.10.3. For any step function $f$ of the form (5.13) on $[a, b]$, we have
$$\|I(f)\| = \left\|\sum_{i=1}^{N} x_i (t_i - t_{i-1})\right\| \le \sum_{i=1}^{N} \|x_i\| (t_i - t_{i-1}) \le \|f\|_{L^\infty} \sum_{i=1}^{N} (t_i - t_{i-1}) = (b - a)\|f\|_{L^\infty},$$
which gives $\|I\| \le (b - a)$. But if $f(t) = x\,\mathbb{1}_{[a, b]}(t)$ for a unit vector $x \in X$, then $\|I(f)\| = (b - a) = (b - a)\|f\|_{L^\infty}$, and hence $\|I\| = (b - a)$. $\square$

5.10.2 Single-Variable Banach-Valued Integration

Theorem 5.10.6 (Single-Variable Banach-Valued Integration). Let $L^\infty([a, b]; X)$ be given the $L^\infty$-norm. The linear map $I : S([a, b]; X) \to X$ can be extended uniquely to a bounded linear (hence uniformly continuous) transformation
$$\bar{I} : \overline{S([a, b]; X)} \to X,$$
with $\|\bar{I}\| = \|I\| = (b - a)$.

Proof. This follows immediately from the continuous linear extension theorem (Theorem 5.7.6) by setting $S = S([a, b]; X)$ and $Z = \overline{S([a, b]; X)}$, since bounded linear transformations are always uniformly continuous, by Proposition 5.4.22. $\square$

Definition 5.10.7. For any Banach space $X$ and any function $f \in \overline{S([a, b]; X)}$, we write
$$\int_a^b f(t)\, dt$$
to denote the unique linear extension $\bar{I}(f)$ of Theorem 5.10.6. In other words,
$$\int_a^b f(t)\, dt = \bar{I}(f) = \lim_{n \to \infty} I(s_n) = \lim_{n \to \infty} \int_a^b s_n(t)\, dt,$$
where $(s_n)_{n \in \mathbb{N}}$ is a sequence of step functions that converges uniformly to $f$. We also define
$$\int_b^a f(t)\, dt = -\int_a^b f(t)\, dt.$$

Theorem 5.10.8. Continuous functions lie in the closure, with respect to the uniform norm, of the space of step functions:
$$C([a, b]; X) \subset \overline{S([a, b]; X)} \subset L^\infty([a, b]; X).$$

Proof. Any $f \in C([a, b]; X)$ is uniformly continuous by Theorem 5.5.9. So, given $\varepsilon > 0$, there exists $\delta > 0$ such that $\|f(s) - f(t)\| < \varepsilon$ whenever $|s - t| < \delta$. Choose $n$ sufficiently large so that $(b - a)/n < \delta$, and define a step function $f_s \in S([a, b]; X)$ as
$$f_s(t) = f(t_0)\mathbb{1}_{[t_0, t_1)}(t) + f(t_1)\mathbb{1}_{[t_1, t_2)}(t) + \cdots + f(t_{n-1})\mathbb{1}_{[t_{n-1}, t_n]}(t), \qquad (5.16)$$
where the grid points $t_i = i(b - a)/n + a$ are equally spaced and less than $\delta$ distance apart. Since $t \in [t_i, t_{i+1}]$ implies that $\|f(t) - f(t_i)\| < \varepsilon$, we have that $\|f - f_s\|_{L^\infty} \le \varepsilon$. Since $\varepsilon > 0$ is arbitrary, we have that $f$ is a limit point of $S([a, b]; X)$. Since $f \in C([a, b]; X)$ was arbitrary, this shows that $C([a, b]; X) \subset \overline{S([a, b]; X)}$. $\square$
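The approximation (5.16) translates directly into code: sample $f$ at the left endpoint of each of $n$ equal subintervals and apply (5.15). As $n \to \infty$ the step-function integrals converge to $\int_a^b f(t)\,dt$, as Definition 5.10.7 requires. A sketch assuming NumPy; the integrand $\cos$ on $[0,1]$, with exact integral $\sin 1$, is an arbitrary example.

    import numpy as np

    def regulated_integral(f, a, b, n):
        """Integral of the step function (5.16): f sampled at the left endpoint
        of each of n equal subintervals of [a, b]."""
        t = np.linspace(a, b, n + 1)
        left = t[:-1]                          # left endpoints t_0, ..., t_{n-1}
        return np.sum(f(left) * (b - a) / n)

    f = np.cos                                 # exact integral on [0, 1] is sin(1)
    for n in (10, 100, 1000, 10000):
        print(n, regulated_integral(f, 0.0, 1.0, n) - np.sin(1.0))   # error shrinks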

Remark 5.10.9. In the proof above we approximated a continuous function $f$ with a step function that matched the values of $f$ on the leftmost point of each subinterval. However, we could just as well have chosen any point of the subinterval to approximate the function $f$.

Remark 5.10.10. The tools we have built up throughout the book so far made our construction of the integral much simpler than the traditional (Riemann or Darboux) construction of the integral. We should point out that the Riemann construction applies to more functions than just $\overline{S([a, b]; X)}$, but in Chapter 8 we describe another construction (the Lebesgue or Daniell construction) that applies to even more functions than the Riemann construction. We use the continuous linear extension theorem as the main tool in that construction as well.

Remark 5.10.11. We won't prove it here, but it is straightforward to check that the usual Riemann construction of the integral also gives a bounded linear transformation from $\overline{S([a, b]; \mathbb{R})}$ to $\mathbb{R}$, and it agrees with our regulated integral on all real-valued step functions, so by the uniqueness of linear extensions, the Riemann and regulated constructions of the integral must be the same on $\overline{S([a, b]; \mathbb{R})}$. In particular, they must agree on all continuous functions. Among other things, this means that for functions in $\overline{S([a, b]; \mathbb{R})}$ (real valued) we can use any of the usual techniques of Riemann integration to evaluate the integral, including, for example, the fundamental theorem of calculus, change of variables (substitution), and integration by parts.
Of course the integral as we have defined it here is not limited to real-valued functions; it works just as well on functions that take values in any Banach space, and in that more general setting we cannot always use the usual techniques of Riemann integration. But we prove a Banach-valued fundamental theorem of calculus in Section 6.5.

5.10.3 Properties of the Integral

The fact that the extension $\bar{I}$ is linear means that if $f, g \in \overline{S([a, b]; X)}$ and $\alpha, \beta \in \mathbb{F}$, then
$$\int_a^b \alpha f(t) + \beta g(t)\, dt = \alpha \int_a^b f(t)\, dt + \beta \int_a^b g(t)\, dt.$$
The next proposition gives several more fundamental properties of integration.

Proposition 5.10.12. If $f \in \overline{S([a, b]; X)} \subset L^\infty([a, b]; X)$ and $\alpha, \beta, \gamma \in [a, b]$, with $\alpha < \gamma < \beta$, then the following hold:
(i) $\left\|\int_a^b f(t)\, dt\right\| \le (b - a) \sup_{t \in [a, b]} \|f(t)\|$.
(ii) $\left\|\int_a^b f(t)\, dt\right\| \le \int_a^b \|f(t)\|\, dt$ (integral triangle inequality).
(iii) Restricting $f$ to a subinterval $[\alpha, \beta] \subset [a, b]$ defines a function that we also denote by $f \in \overline{S([\alpha, \beta]; X)}$. We have $\int_a^b f(t)\mathbb{1}_{[\alpha, \beta]}(t)\, dt = \int_\alpha^\beta f(t)\, dt$.
(iv) $\int_\alpha^\beta f(t)\, dt = \int_\alpha^\gamma f(t)\, dt + \int_\gamma^\beta f(t)\, dt$.
(v) The function $F(t) = \int_a^t f(s)\, ds$ is continuous on $[a, b]$.

Proof.
(i) This is just a restatement of $\|\bar{I}\| = (b - a)$ on the space $\overline{S([a, b]; X)}$; see Proposition 5.10.5 and Theorem 5.10.6.

(ii) The function $t \mapsto \|f(t)\|$ is an element of $\overline{S([a, b]; \mathbb{R})}$, so the integral $\int_a^b \|f(t)\|\, dt$ makes sense. The rest of the proof is Exercise 5.55.

(iii) Let $(s_k)_{k=0}^{\infty}$ be a sequence of step functions in $S([a, b]; X)$ that converges to $f \in \overline{S([a, b]; X)}$ in the sup norm. Let $I_{ab}$ be the integral map on $S([a, b]; X)$ as given in (5.15), and let $I_{\alpha\beta}$ be the corresponding integral map on $S([\alpha, \beta]; X)$. Since the product $s_k\mathbb{1}_{[\alpha, \beta]}$ vanishes outside of $[\alpha, \beta]$, its integral under $I_{ab}$ has the same value as the integral under $I_{\alpha\beta}$ of its restriction to $[\alpha, \beta]$. In other words, for all $k \in \mathbb{N}$ we have
$$\int_a^b s_k(t)\mathbb{1}_{[\alpha, \beta]}(t)\, dt = \int_\alpha^\beta s_k(t)\, dt.$$
Since $\bar{I}$ is continuous, $s_k \to f$ implies that $\bar{I}(s_k) \to \bar{I}(f)$, by Theorem 5.2.9.

(iv) The proof is Exercise 5.56.
(v) The proof is Exercise 5.57. $\square$
Remark 5.10.13. We can combine the results of (i) and (ii) in Proposition 5.10.12 into the following:
$$\left\|\int_a^b f(t)\, dt\right\| \le \int_a^b \|f(t)\|\, dt \le (b - a) \sup_{t \in [a, b]} \|f(t)\|.$$
In other words, the result in (ii) is a sharper inequality than that of (i).

Proposition 5.10.14. If $f \in \overline{S([a, b]; \mathbb{R}^n)} \subset L^\infty([a, b]; \mathbb{R}^n)$ is written in coordinates as $f(t) = (f_1(t), \ldots, f_n(t))$, then for each $i$ we have $f_i \in \overline{S([a, b]; \mathbb{R})}$ and
$$\int_a^b f(t)\, dt = \left(\int_a^b f_1(t)\, dt, \ldots, \int_a^b f_n(t)\, dt\right).$$

Proof. Since $\bar{I} : \overline{S([a, b]; X)} \to X$ is continuous, if $s_n \to f$, then $\bar{I}(s_n) \to \bar{I}(f)$ by Theorem 5.2.9. Thus, it suffices to prove the proposition for step functions. But for a step function $s(t) = (s_1(t), \ldots, s_n(t))$ of the form (5.13), with value $\mathbf{s}_i = (s_{1,i}, \ldots, s_{n,i})$ on the $i$th subinterval, the proof is straightforward:
$$\int_a^b s(t)\, dt = \sum_{i=1}^{N} \mathbf{s}_i (t_i - t_{i-1})
= \sum_{i=1}^{N} (s_{1,i}, \ldots, s_{n,i})(t_i - t_{i-1})
= \left(\sum_{i=1}^{N} s_{1,i}(t_i - t_{i-1}), \ldots, \sum_{i=1}^{N} s_{n,i}(t_i - t_{i-1})\right)
= \left(\int_a^b s_1(t)\, dt, \ldots, \int_a^b s_n(t)\, dt\right). \qquad \square$$



Application 5.10.15. Integrals of single-variable continuous functions show up in many applications. One common application is in the physics of particle motion. If a particle moving in $\mathbb{R}^n$ has acceleration $a(t) = (a_1(t), \ldots, a_n(t))$, then its velocity at time $t$ is $v(t) = \int_{t_0}^{t} a(\tau)\, d\tau + v(t_0)$, and its position is $p(t) = \int_{t_0}^{t} v(\tau)\, d\tau + p(t_0)$.
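Numerically, these integrals can be approximated by the step-function integrals of this section: replace $a(\tau)$ by its value at the left endpoint of each small subinterval and accumulate. A sketch assuming NumPy; the acceleration $a(t) = (\cos t, 1)$ with $v(t_0) = p(t_0) = 0$ is an arbitrary example with the known closed form $v(t) = (\sin t, t)$ and $p(t) = (1 - \cos t, t^2/2)$.

    import numpy as np

    # Recover velocity and position from acceleration by accumulating
    # left-endpoint step-function integrals on a fine grid.
    n, T = 100_000, 2.0
    t = np.linspace(0.0, T, n + 1)
    dt = T / n
    a = np.stack([np.cos(t), np.ones_like(t)], axis=1)              # a(t_i) as rows

    v = np.vstack([np.zeros(2), np.cumsum(a[:-1] * dt, axis=0)])    # v(t_i)
    p = np.vstack([np.zeros(2), np.cumsum(v[:-1] * dt, axis=0)])    # p(t_i)

    print(v[-1], [np.sin(T), T])                 # approximately (sin 2, 2)
    print(p[-1], [1 - np.cos(T), T**2 / 2])      # approximately (1 - cos 2, 2)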

Exercises
Note to the student: Each section of this chapter has several corresponding
exercises, all collected here at the end of the chapter. The exercises between the
first and second line are for Section 1, the exercises between the second and third
lines are for Section 2, and so forth.
You should work every exercise (your instructor may choose to let you skip some of the advanced exercises marked with *). We have carefully selected them, and each is important for your ability to understand subsequent material. Many of the examples and results proved in the exercises are used again later in the text. Exercises marked with ▲ are especially important and are likely to be used later in this book and beyond. Those marked with † are harder than average, but should still be done.
Although they are gathered together at the end of the chapter, we strongly
recommend you do the exercises for each section as soon as you have completed
the section, rather than saving them until you have finished the entire chapter. The
exercises for each section are separated from those for other sections by a
horizontal line.

5.1. If di and d 2 are two given metrics on X, decide which of the following are
metrics on X, and prove your answers are correct.
(i) di+ d2.
(ii) di - d2.
(iii) min( di, d2)·
(iv) max( di , d2)·
5.2. Let $d_2$ denote the usual Euclidean metric on $\mathbb{R}^2$. Let $0 = (0, 0)$ be the origin. The French railway metric $d_{\mathrm{SNCF}}$ in $\mathbb{R}^2$ is given by³⁸
$$d_{\mathrm{SNCF}}(x, y) = \begin{cases} d_2(x, y) & \text{if } x = \alpha y \text{ for some } \alpha \in \mathbb{R}, \\ d_2(x, 0) + d_2(0, y) & \text{otherwise}. \end{cases}$$
Explain why French railway might be a good name for this metric. (Hint: Think of Paris as the origin.) Prove that $d_{\mathrm{SNCF}}$ really is a metric on $\mathbb{R}^2$. Describe the open balls $B(x, \varepsilon)$ in this metric.
³⁸ The official name of the French national railway is Société Nationale des Chemins de Fer, or SNCF for short.

5.3. Give an example of a set $X$ and a function $d : X \times X \to \mathbb{R}$ that is
(i) symmetric and satisfies the triangle inequality, but is not positive definite;
(ii) positive definite and symmetric, but does not satisfy the triangle inequality.
5.4. Let $(X, d)$ be a metric space. Show that the function (5.5) of Example 5.1.4 is also a metric. Hint: The function $f(x) = \frac{x}{1 + x}$ is monotonically increasing on $[0, \infty)$.
5.5. For all $x, y, z$ in a metric space $(X, d)$, prove that $d(x, z) \ge |d(x, y) - d(y, z)|$.
5.6. Prove Theorem 5.l.16(iii).
5.7.* Consider the collection of sets in Example 5.1.13. Prove that every point in the unit ball $B(0, 1)$ (with respect to the usual Euclidean metric) is contained in one of the elements of that collection. Hint: $x \in B(0, 1)$ if and only if $x = \sum c_i e_i$, where $\sum c_i^2 < 1$. Show that $\|x - e_i\| < \sqrt{2}$ for some $e_i$.
5.8.* Let $((X_k, d_k))_{k=0}^{\infty}$ be an infinite sequence of metric spaces. Show that the function
$$d(x, y) = \sum_{k=0}^{\infty} \frac{1}{2^k} \cdot \frac{d_k(x, y)}{1 + d_k(x, y)} \qquad (5.17)$$
is a metric on the Cartesian product $\prod_{k=0}^{\infty} X_k$. Note: Remember to show that the sum always converges to a finite value.

5.9. Let (X,d) be a metric space, and let B c X be nonempty. For x EX, define

p(B,x) = inf{d(x, b) I b EB}.

Show that, for a fixed B, the function p( B, ·) : X --+ JR is a continuous


function .
5.10. Prove that multivariable polynomials over IF are continuous. Hint: Use
Proposition 5.2.11.
5.11. Let $f : \mathbb{R}^2 \to \mathbb{R}$ be
$$f(x, y) = \begin{cases} 0 & \text{if } x = y = 0, \\ \dfrac{x^2 - y^2}{\sqrt{x^2 + y^2}} & \text{otherwise}. \end{cases}$$
Prove that $f$ is continuous at $(0, 0)$ (using the Euclidean metric $d(v, w) = \|v - w\|_2$ in $\mathbb{R}^2$, and the usual metric $d(a, b) = |a - b|$ in $\mathbb{R}$).
5.12. Let
$$f(x, y) = \begin{cases} 0 & \text{if } x = y = 0, \\ \dfrac{2x^2 y}{x^4 + y^2} & \text{otherwise}. \end{cases}$$
Define $\phi(t) = (t, at)$ and $\psi(t) = (t, t^2)$. Show that
(i) $\lim_{t \to 0} f(\phi(t)) = 0$;
(ii) $\lim_{t \to 0} f(\psi(t)) = 1$.
What does this say about the continuity of $f$? Explain your results.

5.13. Prove the claim made in Example 5.2.14 that $\{(x, y) \in \mathbb{R}^2 \mid x^2 + y^2 \le 1\}$ is the set of all limit points of the ball $B(0, 1) \subset \mathbb{R}^2$.
5.14. Prove that for each integer $n > 0$, the set $\mathbb{Q}^n$ is dense in $\mathbb{R}^n$ in the usual (Euclidean) metric.

5.15. Prove that a subset $E \subset X$ is dense in $X$ if and only if $\overline{E} = X$.

5.16. Prove that $(\overline{E})^c = (E^c)^{\circ}$.
5.17. Prove that $x_0 \in \overline{E}$ if and only if $\inf_{x \in E} d(x_0, x) = 0$.
5.18. Give an example of a metric space (X,d) and two proper (nonempty and not
equal to X) subsets S, T c X such that S is both open and closed, and T is
neither open nor closed.
5.19 . .& Prove that the kernel of a bounded linear transformation is closed. Hint:
Consider using Example 5.2.2(ii).
5.20. For each of the following functions f : IR 2 "'{O} --+ IR, determine whether the
limit of f(x, y) exists at (0, 0) . Hint: If the limit exists, then the limit of
(f(zn))~=O must also exist and be the same for every sequence (zn)~=O in IR 2
with Zn --+ 0. (Why?) Consider various choices of (zn)~=O·

(i) f(x , y) = x~2 ·


(ii) f(x, y) = x -X:yc ·
(iii) f(x,y) = ~·
x2+y2

5.21.t .& Prove that unbounded linear transformations are not continuous at the
origin (see Definition 3.5.10 for the definition of a bounded linear transforma-
tion). Use this to prove that they are not continuous anywhere. Hint: If Tis
unbounded, construct a sequence of unit vectors (x k)k°=o' where llT(x k) 11 > k
for each k EN. Then modify the sequence so that Theorem 5.3.19 applies.

5.22. Let (X, d) be a (not necessarily complete) metric space and assume that
(x k)k°=o and (Yk)k°=o are Cauchy sequences. Prove that (d(xk, Yk))k°=o
converges.
5.23. Which of the following functions are uniformly continuous on the interval
(0, 1) C IR, and which are uniformly continuous on (0, oo )?
(i) x 3 .
(ii) sin(x)/x.
(iii) x log(x).
Prove that your answer to (i) is correct.
5.24 . .& Prove Proposition 5.4.22. Hint: Prove that a bounded linear transforma-
tion is Lipschitz.
5.25 . .& Prove Lemma 5.4.26.
5.26. Let B C !Rn be a bounded set.
(i) Let f : B --+ IR be uniformly continuous. Show that the set f (B) is
bounded.

(ii) Give an example to show that this does not necessarily follow if f is
merely continuous on B .
5.27. Let (X, d) be a metric space such that d(x, y) < 1 for all x, y E X , and let
f : X -t IR be uniformly continuous. Does it follow that f must be bounded
(that is, that there is an M such that f(x) < M for all x EX)? Justify your
answer with either a proof or a counterexample.
5.28.* Prove Proposition 5.4.31.

5.29. Let X = (0, 1) x (0, 1) c IR 2 , and let d(x, y) be the usual Euclidean metric.
Give an example of an open cover of X that has no finite subcover.
5.30. Let K c !Rn be a compact set and f : K -> !Rn be injective and continuous.
Prove that 1- 1 is continuous on f (K).
5.31. If Uc !Rn is open and KC U is compact, prove that there is a compact set
D such that Kc D 0 and DC U .
5.32. A function f : IR -t IR is called periodic if there exists a number T > 0 such
that f(x+T) = f(x) for all x E IR. Show that a continuous periodic function
is bounded and uniformly continuous on IR.
5.33 . .&. For any metric space (X, p) with CCX and D C X nonempty subsets,
define
d(C, D) = inf{p(c,d) I c E C,d ED}.

If C is compact and D is closed, then prove the following:


(i) d(C, D) > 0, whenever C and Dare disjoint.
(ii) If Dis also compact, there exist s c* EC and d* ED such that d(C, D) =
p(c*,d*).
5.34.* Prove Proposition 5.5.20.

5.35. Prove Proposition 5.6.5.


5.36. For each $n \in \mathbb{N}$ let $f_n \in C([0, \pi]; \mathbb{R})$ be given by $f_n(x) = \sin^n(x)$.
(i) Show that $(f_n)_{n=0}^{\infty}$ converges pointwise.
(ii) Describe the function it converges to.
(iii) Show that $(f_n)_{n=0}^{\infty}$ does not converge uniformly.
(iv) Why doesn't this contradict Theorem 5.6.8?
5.37. Prove that if $(f_n)_{n=1}^{\infty}$ and $(g_n)_{n=1}^{\infty}$ are sequences in $C([a, b]; \mathbb{R})$ that converge uniformly to $f$ and $g$, respectively, then $(f_n + g_n)_{n=1}^{\infty}$ converges uniformly to $f + g$.
5.38. Prove that if $(f_n)_{n=1}^{\infty}$ is a sequence in $C([a, b]; \mathbb{R})$ that converges uniformly to $f$, then for any $\alpha \in \mathbb{R}$ the sequence $(\alpha f_n)_{n=1}^{\infty}$ converges uniformly to $\alpha f$.
5.39. Let the sequence $(f_n)_{n=1}^{\infty}$ in $C([0, 1]; \mathbb{R})$ be given by $f_n(x) = nx/(nx + 1)$. Prove that the sequence converges pointwise, but not uniformly, on $[0, 1]$.
5.40. Let the sequence $(f_n)_{n=1}^{\infty}$ in $C([0, 1]; \mathbb{R})$ be given by $f_n(x) = x/(1 + x^n)$. Prove that the sequence converges pointwise, but not uniformly, on $[0, 1]$.

5.41. Show that the sum $\sum_{n=1}^{\infty} (-1)^n/n$ converges, and find a rearrangement of the terms that diverges. Use this to construct an example of a series in $M_2(\mathbb{R})$ that converges, but for which a rearrangement of the terms causes the series to diverge.

5.42. Let $(X, \|\cdot\|_X)$ be a Banach space, and let $A$ be a bounded linear operator on $X$. Prove that the following sums converge absolutely in $\mathscr{B}(X)$:
(i) $\cos(A) = \sum_{k=0}^{\infty} (-1)^k \frac{A^{2k}}{(2k)!}$.
(ii) $\sin(A) = \sum_{k=0}^{\infty} (-1)^k \frac{A^{2k+1}}{(2k+1)!}$.
(iii) $\log(A + I) = \sum_{k=1}^{\infty} (-1)^{k-1} \frac{A^k}{k}$ if $\|A\| < 1$.
Hint: Absolute convergence is about sums of real numbers, so you can use the ratio test from introductory calculus.
5.43. Prove Proposition 5.7.4. Hint: Proving $A = B^{-1}$ is equivalent to proving that $AB = BA = I$.
5.44. Prove Proposition 5. 7.8.
5.45. Consider the subspace $\mathbb{R}[x] \subset C([-1, 1]; \mathbb{R})$ of polynomials. The Weierstrass approximation theorem (see Volume 2) guarantees that $\mathbb{R}[x]$ is dense in $(C([-1, 1]; \mathbb{R}), \|\cdot\|_{L^\infty})$. Show that the map $D : p(x) \mapsto p'(x)$ is a linear transformation from $\mathbb{R}[x] \to \mathbb{R}[x] \subset C([-1, 1]; \mathbb{R})$.
Show that the function $\sqrt[3]{x}$ is in $C([-1, 1]; \mathbb{R})$ but does not have a derivative in $C([-1, 1]; \mathbb{R})$. This shows that there is no continuous linear extension of $D$ to $(C([-1, 1]; \mathbb{R}), \|\cdot\|_{L^\infty})$. Why doesn't this contradict the continuous linear extension theorem (Theorem 5.7.6)?

5.46. Let (X, d) be a metric space. Prove: If the function f : [O, oo) -+ [O, oo) is
strictly increasing and satisfies f (0) = 0 and f (a + b) :::; f (a) + f (b) for all
a,b E [O,oo), then p(x,y) := f(d(x,y)) is a metric on X. Moreover, if f is
continuous at 0, then (X, p) is topologically equivalent to (X, d).
5.47. Prove that the discrete metric on Fn is not topologically equivalent to the
Euclidean metric. Use this to prove that the discrete metric on Fn is not
induced by any norm.
5.48. Prove the claim in the proof of Theorem 5.8.7 that, given any norm 11 · 11
on a vector space X and any isomorphism f : X -+ Y of vector spaces, the
function 11 · ll t on Y defined by ll Yll t = ll J- 1 (y)ll is a norm on Y.
5.49. Let $X$ be a set, and let $\mathscr{M}$ be the set of all metrics on $X$. Prove that topological equivalence is an equivalence relation on $\mathscr{M}$.
5.50. Let $\|\cdot\|_a$ and $\|\cdot\|_b$ be equivalent norms on a vector space $X$, and assume that there exists $0 < c < 1$ such that $B_b(0, c) \subset B_a(0, 1)$. Prove that
$$\|x\|_a \le \frac{\|x\|_b}{c}$$
for all $x \in B_a(0, 1)$. Hint: If there exists $x \in B_a(0, 1)$ such that $\|x\|_a > \|x\|_b/c$, then choose a scalar $\alpha$ so that $y = \alpha x$ satisfies $\|y\|_b/c < 1 < \|y\|_a$. Use this to get a contradiction to the assumption $B_b(0, c) \subset B_a(0, 1)$.

5.51. Prove Proposition 5.9.6.


5.52. Prove that there is no homeomorphism from IR 2 onto IR. Hint: Remove a
point from each and consider a connectedness argument.
5.53. Consider the metric space (M2(IR),d), where the metric dis induced by the
Frobenius norm, that is, d(x, y) = llx - YllF· Let X denote the subset of all
invertible 2 x 2 matrices. Is (X, dx) connected? Prove your answer.
5.54. Prove Corollary 5.9.17. Hint: Use the continuous function f to build an-
other continuous function in the style of the proof of the intermediate value
theorem.

5.55. Prove Proposition 5.10.12(ii). Hint: First prove the result for step functions, then for the general case.
5.56. Prove Proposition 5.10.12(iv).
5.57. Prove Proposition 5.10.12(v). Hint: Use the $\varepsilon$-$\delta$ definition of continuity.
5.58. Define a function $f : [-1, 1] \to \mathbb{R}$ by
$$f(x) = \begin{cases} 1, & x = 0, \\ 0, & x \ne 0. \end{cases}$$
It can be shown that this function is Riemann integrable and has Riemann integral equal to $0$. Prove that $f(x)$ is not in the space $\overline{S([-1, 1]; \mathbb{R})}$, so the integral of Definition 5.10.7 is not defined for this function.
5.59. You cannot always assume that interchanging limits will give the expected results. For example, Exercise 5.45 shows that differentiation and infinite summation do not always commute. In this exercise you show a special case where integration and infinite summation do commute.

(i) Prove that the integration map $I : \mathbb{R}[x] \to \mathbb{R}$ given by $I(p) = \int_a^b p(x)\, dx$ is a bounded linear transformation.
(ii) Prove that for any polynomial $p(x) = \sum_{k=0}^{n} c_k x^k$, we have $I(p) = \sum_{k=0}^{n} c_k(b^{k+1} - a^{k+1})/(k + 1)$.

(iii) Let $\mathscr{A} \subset C([a, b]; \mathbb{R})$ be the set of absolutely convergent power series on $[a, b]$. Prove that $\mathscr{A}$ is a vector subspace of $C([a, b]; \mathbb{R})$.

(iv) Prove that the series $\sum_{k=0}^{\infty} c_k(b^{k+1} - a^{k+1})/(k + 1)$ is absolutely convergent if the series $\sum_{k=0}^{\infty} c_k x^k$ is absolutely convergent on $[a, b]$.

(v) Let $T : \mathscr{A} \to \mathbb{R}$ be the map given by termwise integration:
$$T\!\left(\sum_{k=0}^{\infty} c_k x^k\right) = \sum_{k=0}^{\infty} c_k(b^{k+1} - a^{k+1})/(k + 1).$$
Prove that $T$ is a linear transformation from $\mathscr{A}$ to $\mathbb{R}$.

(vi) Use the continuous linear extension theorem (Theorem 5.7.6) to prove that $I = T$ on $\mathscr{A}$. In other words, prove that if $f(x) = \sum_{k=0}^{\infty} c_k x^k \in \mathscr{A}$, then
$$I(f) = \int_a^b f(x)\, dx = \sum_{k=0}^{\infty} c_k(b^{k+1} - a^{k+1})/(k + 1) = T(f).$$

Notes
Some sources for topology include the texts [Mun75, Mor17]. For a brief comparison of the regulated, Riemann, and Lebesgue integrals (for real-valued functions), see [Ber79].
Differentiation

In the fall of 1972 President Nixon announced that the rate of increase of inflation
was decreasing. This was the first time a sitting president used the third derivative
to advance his case for re-election.
-Hugo Rossi

The derivative of a function at a point is a linear transformation describing the


best linear approximation to that function in an arbitrarily small neighborhood
of that point. This allows us to use many linear analysis tools on nonlinear func-
tions. In single-variable calculus the derivative gives us the slope of the tangent line
at a point on the graph, whereas in multidimensional calculus, the derivative gives
the individual coordinate slopes of the tangent hyperplane. This generalization of
the derivative is ubiquitous in applications. Just as single-variable derivatives are
essential to solving problems in single-variable optimization, ordinary differential
equations, and univariate probability and statistics, we find that multidimensional
derivatives provide the corresponding framework for multivariable optimization
problems, partial differential equations, and problems in multivariate probability
and statistics.

6.1 The Directional Derivative


We begin by considering differentiability on a vector space. For now we restrict our discussion to functions on $\mathbb{R}^n$, but we revisit these concepts in greater generality in later sections.

6.1.1 Tangent Vectors


Recall the definition of the derivative from single-variable calculus.

Definition 6.1.1. A function $f : (a, b) \to \mathbb{R}$ is differentiable at $x \in (a, b)$ if the following limit exists:
$$\lim_{h \to 0} \frac{f(x + h) - f(x)}{h}. \qquad (6.1)$$

The limit is called the derivative of $f$ at $x$ and is denoted by $f'(x)$. If $f(x)$ is differentiable at every point in $(a, b)$, we say that $f$ is differentiable on $(a, b)$.

Remark 6.1.2. The derivative at a point $x_0$ is important because it defines the best possible linear approximation to $f$ near $x_0$. Specifically, the linear transformation $L : \mathbb{R} \to \mathbb{R}$ given by $L(h) = f'(x_0)h$ provides the best linear approximation of $f(x_0 + h) - f(x_0)$. For every $\varepsilon > 0$ there is a neighborhood $B(0, \delta)$ such that the curve $y = f(x_0 + h) - f(x_0)$ lies between the linear functions $L(h) - \varepsilon|h|$ and $L(h) + \varepsilon|h|$ whenever $h$ is in $B(0, \delta)$.
We can easily generalize this definition to a parametrized curve.

Definition 6.1.3. A curve $\gamma : (a, b) \to \mathbb{R}^n$ is differentiable at $t_0 \in (a, b)$ if the following limit exists:
$$\lim_{h \to 0} \frac{\gamma(t_0 + h) - \gamma(t_0)}{h}. \qquad (6.2)$$
Here the limit is taken with respect to the usual metrics on $\mathbb{R}$ and $\mathbb{R}^n$ (see Definition 5.2.8). If it exists, this limit is called the derivative of $\gamma(t)$ at $t_0$ and is denoted $\gamma'(t_0)$. If $\gamma$ is differentiable at every point of $(a, b)$, we say that $\gamma$ is differentiable on $(a, b)$.

Remark 6.1.4. Again, the derivative allows us to define a linear transformation $L(h) = \gamma'(t_0)h$ that is the best linear approximation of $\gamma$ near $t_0$. Said more carefully, $\gamma(t_0 + h) - \gamma(t_0)$ is approximated by $L(h)$ for all $h$ in a sufficiently small neighborhood of $0$.

We can write $\gamma(t)$ in standard coordinates as $\gamma(t) = [\gamma_1(t)\ \cdots\ \gamma_n(t)]^{\mathsf{T}}$. Each coordinate function $\gamma_i : \mathbb{R} \to \mathbb{R}$ is a scalar map. Assuming each $\gamma_i(t)$ is differentiable, we can compute the single-variable derivatives $\gamma_i'(t_0)$ using (6.1) at $t_0$.

Proposition 6.1.5. A curve $\gamma : (a, b) \to \mathbb{R}^n$ written in standard coordinates as $\gamma(t) = [\gamma_1(t)\ \cdots\ \gamma_n(t)]^{\mathsf{T}}$ is differentiable at $t_0$ if and only if $\gamma_i(t)$ is differentiable at $t_0$ for every $i$. In this case, we have
$$\gamma'(t_0) = [\gamma_1'(t_0)\ \cdots\ \gamma_n'(t_0)]^{\mathsf{T}}.$$

Proof. All norms are topologically equivalent on $\mathbb{R}^n$, so we can use any norm to compute the limit. We use the $\infty$-norm, which is convenient for this particular proof.
If the derivatives of each coordinate exist, then for any $\varepsilon > 0$ and for each $i$ there exists a $\delta_i$ such that whenever $0 < |h| < \delta_i$ we have
$$\left|\frac{\gamma_i(t_0 + h) - \gamma_i(t_0)}{h} - \gamma_i'(t_0)\right| < \varepsilon.$$
Letting $|h| < \delta = \min(\delta_1, \ldots, \delta_n)$ gives
$$\left\|\frac{\gamma(t_0 + h) - \gamma(t_0)}{h} - [\gamma_1'(t_0)\ \cdots\ \gamma_n'(t_0)]^{\mathsf{T}}\right\|_{\infty} < \varepsilon.$$
This shows that $\gamma'(t_0)$ exists and is equal to $[\gamma_1'(t_0)\ \cdots\ \gamma_n'(t_0)]^{\mathsf{T}}$.
Conversely, if $\gamma'(t_0) = [y_1\ \cdots\ y_n]^{\mathsf{T}}$ exists, then for any $i$ and for any $\varepsilon > 0$ there exists a $\delta > 0$ such that if $0 < |h| < \delta$, then
$$\left|\frac{\gamma_i(t_0 + h) - \gamma_i(t_0)}{h} - y_i\right| \le \left\|\frac{\gamma(t_0 + h) - \gamma(t_0)}{h} - [y_1\ \cdots\ y_n]^{\mathsf{T}}\right\|_{\infty} < \varepsilon,$$
which proves that $\gamma_i'(t_0)$ exists and is equal to $y_i$. $\square$

Application 6.1.6. If a curve represents the position of a particle as a function of time, then $\gamma'(t_0)$ is the instantaneous velocity at $t_0$ and $\|\gamma'(t_0)\|_2$ is the speed at $t_0$. We can also take higher-order derivatives; for example, $\gamma''(t_0)$ is the acceleration of the particle at time $t_0$.

Figure 6.1. The derivative $\gamma'(t)$ of a parametrized curve $\gamma : [a, b] \to \mathbb{R}^n$ points in the direction of the line tangent to the curve at $\gamma(t)$. Note that the tangent vector $\gamma'(t)$ itself (see Definition 6.1.7) is a vector (blue) based at the origin, whereas the line segment (red) from $\gamma(t)$ to $\gamma(t) + \gamma'(t)$ is what is often informally called the "tangent" to the curve.

Definition 6.1.7. Given a differentiable curve $\gamma : (a, b) \to \mathbb{R}^n$, the tangent vector to the curve $\gamma$ at $t_0$ is defined to be $\gamma'(t_0)$. See Figure 6.1 for an illustration.

Example 6.1.8. The curve $\gamma(t) = [\cos t\ \ \sin t]^{\mathsf{T}}$ traces out a circle of radius one, centered at the origin. The tangent vector is $\gamma'(t) = [-\sin t\ \ \cos t]^{\mathsf{T}}$. The acceleration vector is $\gamma''(t) = [-\cos t\ \ {-\sin t}]^{\mathsf{T}}$ and satisfies $\gamma''(t) = -\gamma(t)$.

Using the product and chain rules for single-variable calculus, we can easily
derive the following rules.

Proposition 6.1.9. If the maps $f, g : \mathbb{R} \to \mathbb{R}^n$ and $\varphi : \mathbb{R} \to \mathbb{R}$ are differentiable and $\langle \cdot, \cdot \rangle$ is the standard inner product on $\mathbb{R}^n$, then
(i) $(f + g)' = f' + g'$;
(ii) $(\varphi f)' = \varphi' f + \varphi f'$;
(iii) $\langle f, g \rangle' = \langle f', g \rangle + \langle f, g' \rangle$;
(iv) $(f \circ \varphi)'(t) = \varphi'(t) f'(\varphi(t))$.

Proof. The proof follows easily from Proposition 6.1.5 and the standard differentiation rules from single-variable calculus. See Exercise 6.2. $\square$

6.1.2 Directional Derivatives

Another generalization of the single-variable derivative is the directional derivative of a function $f : \mathbb{R}^n \to \mathbb{R}^m$. This is obtained by composing $f$ with a map $\gamma : \mathbb{R} \to \mathbb{R}^n$ given by $\gamma(t) = x + tu$ to get a map $f \circ \gamma : \mathbb{R} \to \mathbb{R}^m$, which can be differentiated as in (6.2). If $u$ is a unit vector, the derivative of $f(\gamma(t))$ is interpreted to be the derivative of $f$ at $x$ in the direction of $u$. If $u$ is not a unit vector, then the derivative is scaled by a factor of $\|u\|$. To summarize, we have the following definition.

Definition 6.1.10. Let $f : \mathbb{R}^n \to \mathbb{R}^m$. Given $x, v \in \mathbb{R}^n$, the directional derivative of $f$ at $x$ with respect to $v$ is the limit
$$\lim_{t \to 0} \frac{f(x + tv) - f(x)}{t},$$
if it exists. This limit is often denoted $D_v f(x)$.

Remark 6.1.11. In the next section we prove Theorem 6.2.15, which says that for any fixed $x$, the function $\phi(v) = D_v f(x)$ is a linear transformation in $v$, so $D_{v_1 + v_2} f(x) = D_{v_1} f(x) + D_{v_2} f(x)$. This is an important property of directional derivatives.

Example 6.1.12. Let $f : \mathbb{R}^2 \to \mathbb{R}$ be defined by $f(x, y) = xy^2 + x^3 y$. We compute the directional derivative at $x = (x, y)$ in the direction $v = \left(\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}}\right)$ by computing the derivative (with respect to $t$) of $f(x + tv)$ at $t = 0$. This gives
$$D_v f(x, y) = \frac{d}{dt} f\!\left(x + \tfrac{t}{\sqrt{2}},\ y + \tfrac{t}{\sqrt{2}}\right)\Big|_{t=0} = \frac{1}{\sqrt{2}}\left(y^2 + 2xy + 3x^2 y + x^3\right).$$
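A finite-difference quotient gives a quick numerical check of this computation. The sketch below assumes NumPy; the evaluation point and step size are arbitrary choices.

    import numpy as np

    f = lambda x, y: x * y**2 + x**3 * y

    # Directional derivative of f at (x0, y0) in the unit direction (1/sqrt(2), 1/sqrt(2)).
    x0, y0 = 1.5, -0.5
    v = np.array([1.0, 1.0]) / np.sqrt(2.0)

    t = 1e-6
    numeric = (f(x0 + t * v[0], y0 + t * v[1]) - f(x0, y0)) / t
    exact = (y0**2 + 2 * x0 * y0 + 3 * x0**2 * y0 + x0**3) / np.sqrt(2.0)
    print(numeric, exact)        # should agree to several digits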

6.1.3 Partial Derivatives

Taking directional derivatives along the standard basis vectors $e_i$ for each $i$ gives what we call partial derivatives. In other words, the partial derivatives are the directional derivatives $D_{e_i} f(x)$, which are often written as $D_i f(x)$ or $\frac{\partial f}{\partial x_i}$.

Definition 6.1.13. Let $f : \mathbb{R}^n \to \mathbb{R}^m$. The $i$th partial derivative of $f$ at the point $x$ is given by the limit (if it exists)
$$D_i f(x) = \lim_{h \to 0} \frac{f(x + he_i) - f(x)}{h}.$$

Example 6.1.14. In this example we show that the function
$$f(x, y) = \begin{cases} \dfrac{xy}{x^2 + y^2}, & (x, y) \ne (0, 0), \\ 0, & (x, y) = (0, 0), \end{cases}$$
is not continuous at $(0, 0)$, but its partial derivatives do exist. The sequence $\left(\left(\frac{1}{n}, \frac{1}{n}\right)\right)_{n \in \mathbb{N}}$ converges to zero and yet $f\!\left(\frac{1}{n}, \frac{1}{n}\right) = \frac{1}{2}$ for all $n \in \mathbb{N}$; thus $f$ is not continuous at $(0, 0)$. However, we have
$$D_1 f(0, 0) = \lim_{h \to 0} \frac{f(h, 0) - f(0, 0)}{h} = 0 \quad\text{and}\quad D_2 f(0, 0) = \lim_{h \to 0} \frac{f(0, h) - f(0, 0)}{h} = 0.$$
Thus, the partial derivatives are zero despite the function's failure to be continuous there.
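Both observations in this example are easy to reproduce numerically; a short plain-Python sketch:

    # f has both partial derivatives equal to 0 at the origin, yet f(1/n, 1/n) = 1/2
    # for every n, so f is not continuous at (0, 0).
    def f(x, y):
        return 0.0 if (x, y) == (0.0, 0.0) else x * y / (x**2 + y**2)

    print([f(1.0 / n, 1.0 / n) for n in (1, 10, 100, 1000)])   # always 0.5

    h = 1e-8
    print((f(h, 0.0) - f(0.0, 0.0)) / h, (f(0.0, h) - f(0.0, 0.0)) / h)   # both 0.0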

Remark 6.1.15. In the previous definition the $i$th coordinate is the only one that varies in the limit. Thus, for real-valued functions (that is, when $m = 1$), we can think of this as a single-variable derivative of a scalar function with the only variable being the $i$th coordinate; the other coordinates can be treated as constants. For vector-valued functions (when $m > 1$), we may compute $D_i f$ using the usual rules for differentiating if we treat $x_j$ as a constant for $j \ne i$.

Example 6.1.16. Consider the function $f(x, y) = xy^2 + x^3 y$ of Example 6.1.12. We can compute the first partial derivative of $f(x, y) = xy^2 + x^3 y$ by treating $f$ as a function of $x$ only. Thus, the first partial derivative is
$$D_1 f(x, y) = y^2 + 3x^2 y.$$
Similarly, we can find the second partial derivative by treating $f$ as a function of $y$ only. Thus, the second partial derivative is
$$D_2 f(x, y) = 2xy + x^3.$$

6.2 The Fréchet Derivative in $\mathbb{R}^n$

In the next two sections we define the derivative of a general function $f : U \to Y$, where $Y$ is a Banach space and $U$ is an open set in a Banach space. This construction is called the Fréchet derivative.
In this section we define the Fréchet derivative for functions on finite-dimensional real vector spaces; this may be familiar to many students.³⁹ In the next section we generalize to arbitrary Banach spaces.

Definition 6.2.1. Let $U \subset \mathbb{R}^n$ be an open set. A function $f : U \to \mathbb{R}^m$ is differentiable at $x \in U$ if there exists a linear transformation $Df(x) : \mathbb{R}^n \to \mathbb{R}^m$ such that
$$\lim_{h \to 0} \frac{\|f(x + h) - f(x) - Df(x)h\|}{\|h\|} = 0. \qquad (6.3)$$
Here we mean the limit in the sense of Definition 5.2.8. We call $Df(x)$ the derivative of $f$ at $x$. The derivative is sometimes called the total derivative as a way to distinguish it from the directional and partial derivatives.

Example 6.2.2. Let $f : \mathbb{R}^2 \to \mathbb{R}^3$ be given by $f(x, y) = (x^3, xy, y^2)$. At the point $\mathbf{x} = (2, 3)$, the linear function in $\mathscr{B}(\mathbb{R}^2; \mathbb{R}^3) \cong M_{3,2}(\mathbb{R})$ given (in standard coordinates) by
$$\begin{bmatrix} 12 & 0 \\ 3 & 2 \\ 0 & 6 \end{bmatrix}$$
is the total derivative of $f$ because
$$\lim_{\mathbf{h} \to 0} \frac{\left\|f(\mathbf{x} + \mathbf{h}) - f(\mathbf{x}) - \begin{bmatrix} 12 & 0 \\ 3 & 2 \\ 0 & 6 \end{bmatrix}\mathbf{h}\right\|}{\|\mathbf{h}\|}
= \lim_{\mathbf{h} \to 0} \frac{\left\|\left(6h_1^2 + h_1^3,\ h_1 h_2,\ h_2^2\right)\right\|}{\|\mathbf{h}\|} = 0,$$
since every entry of the numerator is quadratic or higher order in $\mathbf{h} = (h_1, h_2)$.

³⁹ Depending on the students' background in multivariable calculus and the amount of class time available, the instructor may wish to skip directly to the general Fréchet derivative in Section 6.3.

It turns out that any norm gives the same answer for the limit, since all norms are topologically equivalent on $\mathbb{R}^n$. Remark 6.3.2 gives more details about different norms and their effect on the derivative.

Example 6.2.3. An argument similar to that in the previous example shows for arbitrary $\mathbf{x} = (x, y)$ that the linear function in $\mathscr{B}(\mathbb{R}^2; \mathbb{R}^3) \cong M_{3,2}(\mathbb{R})$ represented by the matrix
$$\begin{bmatrix} 3x^2 & 0 \\ y & x \\ 0 & 2y \end{bmatrix}$$
is the total derivative of $f(x, y) = (x^3, xy, y^2)$ at $\mathbf{x}$.
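The defining limit (6.3) can be checked numerically: for a fixed direction and shrinking step size the difference quotient should tend to $0$. A sketch assuming NumPy; the random direction is an arbitrary choice.

    import numpy as np

    f = lambda x, y: np.array([x**3, x * y, y**2])
    Df = lambda x, y: np.array([[3 * x**2, 0.0],
                                [y,        x  ],
                                [0.0,      2 * y]])

    x = np.array([2.0, 3.0])
    rng = np.random.default_rng(0)
    u = rng.standard_normal(2)
    u /= np.linalg.norm(u)                   # a fixed unit direction

    for t in (1e-1, 1e-2, 1e-3, 1e-4):
        h = t * u
        quotient = np.linalg.norm(f(*(x + h)) - f(*x) - Df(*x) @ h) / np.linalg.norm(h)
        print(t, quotient)                   # tends to 0 as t -> 0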

Nota Bene 6.2.4. Beware that $Df(\mathbf{x})\mathbf{v}$ is not a linear transformation in $\mathbf{x}$. Rather, for each choice of $\mathbf{x}$, it is a linear transformation in $\mathbf{v}$. So in the previous example, for each fixed $\mathbf{x} = (x, y)$ the transformation
$$Df(\mathbf{x})\mathbf{v} = \begin{bmatrix} 3x^2 & 0 \\ y & x \\ 0 & 2y \end{bmatrix}\mathbf{v}$$
is linear in $\mathbf{v}$. But
$$Df(a\mathbf{x}) = \begin{bmatrix} 3a^2x^2 & 0 \\ ay & ax \\ 0 & 2ay \end{bmatrix} \ne a\begin{bmatrix} 3x^2 & 0 \\ y & x \\ 0 & 2y \end{bmatrix} = a\,Df(\mathbf{x}),$$
so $Df : \mathbf{x} \mapsto Df(\mathbf{x})$ is not linear in $\mathbf{x}$.

Example 6.2.5. If $L : \mathbb{R}^n \to \mathbb{R}^m$ is a linear transformation, then for every $x \in \mathbb{R}^n$ we have
$$\lim_{h \to 0} \frac{\|L(x + h) - L(x) - L(h)\|}{\|h\|} = \lim_{h \to 0} \frac{\|0\|}{\|h\|} = 0,$$
so $DL(x)(v) = L(v)$. Note that $DL(x)$ is independent of $x$ in this case.

If $L$ is represented in the standard basis by a matrix $A$, so that $L(x) = Ax$, then $DL(x)$ is also represented by the matrix $A$. If the matrix $A$ is equal to $a^{\mathsf{T}}$, where $a \in \mathbb{R}^n$, then $L$ can also be written as $L(x) = x^{\mathsf{T}}a$, but the derivative is still $a^{\mathsf{T}}$. This is an example of a general principle: whenever the independent variable in a function is transposed, then the derivative is also transposed.

Remark 6.2.6. If $f : \mathbb{R}^n \to \mathbb{R}^m$ is given by $f(x) = Ax$ with $A \in M_{m \times n}(\mathbb{R})$, then $Df = A$. Let $(\mathbb{R}^m)^* = \mathscr{B}(\mathbb{R}^m; \mathbb{R})$ be the dual space of $\mathbb{R}^m$. We usually write elements of $(\mathbb{R}^m)^*$ as $m$-dimensional row vectors. A row vector $w^{\mathsf{T}}$ corresponds to the function $L \in (\mathbb{R}^m)^*$ given by $L(x) = w^{\mathsf{T}}x$.
The matrix $A$ also defines a function $g : \mathbb{R}^m \to (\mathbb{R}^n)^*$, given by $g(x) = x^{\mathsf{T}}A$. Exercise 6.10 shows that in this representation the derivative $Dg : \mathbb{R}^m \to (\mathbb{R}^n)^*$ is the linear transformation $v \mapsto v^{\mathsf{T}}A$.
Alternatively, elements of $(\mathbb{R}^n)^*$ can be represented by column vectors, where $u \in \mathbb{R}^n$ defines a function $L \in (\mathbb{R}^n)^*$ given by $L(x) = u^{\mathsf{T}}x$. Exercise 6.10 shows that in this form the function $g$ is given by $g(x) = A^{\mathsf{T}}x$, and $Dg$ is the linear transformation $v \mapsto A^{\mathsf{T}}v$.

Remark 6.2.7. If the domain is one dimensional, then a linear transformation $L : \mathbb{R}^1 \to \mathbb{R}^m$ is given in coordinates by an $m \times 1$ matrix $[\ell_1\ \cdots\ \ell_m]^{\mathsf{T}}$. In particular, if $m = 1$, then $L$ is just a $1 \times 1$ matrix. In traditional single-variable calculus with $f : \mathbb{R} \to \mathbb{R}$, the derivative $f'(x)$ is a scalar, but we think of $Df(x)$ as an element of $\mathscr{B}(\mathbb{R}; \mathbb{R})$, represented in standard coordinates by the $1 \times 1$ matrix $[f'(x)]$.

Example 6.2.8. Suppose $U \subset \mathbb{R}$ and $\gamma : U \to \mathbb{R}^m$ is a curve. We prove that the derivative $\gamma'(x)$ of the curve defined in Definition 6.1.3 is the total derivative $D\gamma$. Note that
$$\lim_{h \to 0} \frac{\|\gamma(x + h) - \gamma(x) - \gamma'(x)h\|}{\|h\|} = \lim_{h \to 0} \left\|\frac{\gamma(x + h) - \gamma(x) - \gamma'(x)h}{h}\right\|.$$
Division by $h$ makes sense because $h$ is a scalar. But this limit is zero if and only if
$$\lim_{h \to 0} \frac{\gamma(x + h) - \gamma(x)}{h} - \gamma'(x) = 0,$$
which holds if and only if
$$\gamma'(x) = \lim_{h \to 0} \frac{\gamma(x + h) - \gamma(x)}{h}.$$
Thus, $\gamma'(x) = D\gamma(x)$, as expected.

Remark 6.2.9. The total derivative $Df(x_0)$, if it exists, defines a linear function that approximates $f$ in a neighborhood of $x_0$. More precisely, if $L(h) = Df(x_0)h$, then for any $\varepsilon > 0$ there is a $\delta > 0$ such that the function $f(x_0 + h) - f(x_0)$ is within $\varepsilon\|h\|$ of $L(h)$ (that is, $\|f(x_0 + h) - f(x_0) - L(h)\| \le \varepsilon\|h\|$) whenever $h \in B(0, \delta)$.

Proposition 6.2.10. Let $U \subset \mathbb{R}^n$ be an open set. If $f : U \to \mathbb{R}^m$ is differentiable at $x \in U$, then $Df(x)$ is unique.

Proof. Let $L_1$ and $L_2$ be two linear transformations satisfying (6.3). For any nonzero $v \in \mathbb{R}^n$, as $t \to 0$ we have that
$$\begin{aligned}
\frac{\|L_1 v - L_2 v\|}{\|v\|} &= \frac{\|(f(x + tv) - f(x) - L_2 tv) - (f(x + tv) - f(x) - L_1 tv)\|}{|t|\,\|v\|} \\
&\le \frac{\|f(x + tv) - f(x) - L_2 tv\|}{|t|\,\|v\|} + \frac{\|f(x + tv) - f(x) - L_1 tv\|}{|t|\,\|v\|} \to 0.
\end{aligned}$$
Thus, $L_1 v = L_2 v$ for every $v$, and hence $L_1 = L_2$. $\square$

Theorem 6.2.11. Let U ⊂ ℝⁿ be an open set, and let f : U → ℝᵐ be given by f = (f₁, …, f_m). If f is differentiable on U, then the partial derivatives D_k f_i(x) exist for each x ∈ U, and the matrix representation of the linear map Df(x) in the standard basis is the Jacobian matrix
\[ J(x) = \begin{bmatrix} D_1 f_1(x) & D_2 f_1(x) & \cdots & D_n f_1(x) \\ D_1 f_2(x) & D_2 f_2(x) & \cdots & D_n f_2(x) \\ \vdots & \vdots & \ddots & \vdots \\ D_1 f_m(x) & D_2 f_m(x) & \cdots & D_n f_m(x) \end{bmatrix}. \quad (6.4) \]

Proof. Let J = [J₁ J₂ ⋯ J_n] be the matrix representation of Df(x) in the standard basis, with each J_j being a column vector of J. For h = he_j we have
\[ 0 = \lim_{h\to0} \frac{\|f(x+h) - f(x) - Df(x)h\|}{\|h\|} = \lim_{h\to0} \frac{\|f(x+he_j) - f(x) - hDf(x)e_j\|}{|h|\,\|e_j\|} = \lim_{h\to0} \frac{\|f(x_1,\dots,x_j+h,\dots,x_n) - f(x_1,\dots,x_n) - hJ_j\|}{|h|}. \]
Thus, each component also goes to zero as h → 0, and so
\[ \lim_{h\to0} \frac{f_i(x_1,\dots,x_j+h,\dots,x_n) - f_i(x_1,\dots,x_n)}{h} = (J_j)_i \quad\text{for each } i, \]
which implies
\[ J_j = \begin{bmatrix} D_j f_1(x) & D_j f_2(x) & \cdots & D_j f_m(x) \end{bmatrix}^{\mathsf T}. \]
Thus, (6.4) is the matrix representation of Df(x) in the standard basis. ∎

Remark 6.2.12. We often denote the Jacobian matrix as D f(x), even though the
Jacobian is really only the matrix representation of D f (x) in the standard basis.

Example 6.2.13. Let f : ℝ³ → ℝ² be given by
\[ f(x,y,z) = (xy + x^2 z^2,\; y^3 z^5 + x). \]
The previous theorem shows that the standard representation of Df(x) is given by
\[ Df(x,y,z) = \begin{bmatrix} y + 2xz^2 & x & 2x^2 z \\ 1 & 3y^2 z^5 & 5y^3 z^4 \end{bmatrix}. \]
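For readers who want to check such a computation numerically, the following short Python sketch (an added illustration, not part of the text; it assumes NumPy is available, and the sample point and step size are arbitrary choices) compares a forward-difference approximation of the Jacobian with the matrix above.

import numpy as np

def f(p):
    x, y, z = p
    return np.array([x*y + x**2 * z**2, y**3 * z**5 + x])

def jacobian_fd(f, p, h=1e-6):
    """Forward-difference approximation of the Jacobian of f at p."""
    p = np.asarray(p, dtype=float)
    f0 = f(p)
    J = np.zeros((f0.size, p.size))
    for j in range(p.size):
        step = np.zeros_like(p)
        step[j] = h
        J[:, j] = (f(p + step) - f0) / h
    return J

x, y, z = 1.0, 2.0, -1.0
J_exact = np.array([[y + 2*x*z**2, x,           2*x**2*z],
                    [1.0,          3*y**2*z**5, 5*y**3*z**4]])
print(np.allclose(jacobian_fd(f, [x, y, z]), J_exact, atol=1e-4))  # True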

T heorem 6 .2.14. Let U C JRn be an open set, and let f : U -t JRm be given by
f = (Ji, . .. , f m) . If each Djfi(x) exists and is continuous on U, then D f (x) exists
and is given in standard coordinates by (6.4).

Proof. Let x ∈ U, and let J(x) be the linear operator with matrix representation in the standard basis given by (6.4). To show that J(x) is Df(x), it suffices to show that (6.3) holds. Since all norms are topologically equivalent, we may use the ∞-norm. Therefore, it suffices to show that for every ε > 0 there is a δ > 0 such that for every i we have
\[ |f_i(y) - f_i(x) - [J(x)(y-x)]_i| < \varepsilon\|y - x\|, \]
whenever 0 < ‖x − y‖ < δ. Here [J(x)(y − x)]_i denotes the ith entry of the vector J(x)(y − x).
For any δ > 0 such that B(x, δ) ⊂ U, consider y ∈ B(x, δ) with y ≠ x. Note that
\[ f(y) - f(x) = \big[f(y_1,\dots,y_n) - f(x_1,y_2,\dots,y_n)\big] + \big[f(x_1,y_2,\dots,y_n) - f(x_1,x_2,y_3,\dots,y_n)\big] + \cdots + \big[f(x_1,\dots,x_{n-1},y_n) - f(x_1,\dots,x_n)\big]. \]
For each i, j, let g_{ij} : ℝ → ℝ be the function z ↦ f_i(x₁, …, x_{j−1}, z, y_{j+1}, …, y_n). By the mean value theorem in one dimension, for each i there exists ξ_{i,1} in the interval [x₁, y₁] (or in the interval [y₁, x₁] if y₁ < x₁) such that
\[ f_i(y_1,\dots,y_n) - f_i(x_1,y_2,\dots,y_n) = D_1 f_i(\xi_{i,1}, y_2, \dots, y_n)(y_1 - x_1). \]
Continuing in this manner, we have ξ_{i,j} ∈ [x_j, y_j] (or in [y_j, x_j] if y_j < x_j) such that
\[ f_i(y) - f_i(x) = D_1 f_i(\xi_{i,1}, y_2, \dots, y_n)(y_1 - x_1) + D_2 f_i(x_1, \xi_{i,2}, y_3, \dots, y_n)(y_2 - x_2) + \cdots + D_n f_i(x_1, x_2, \dots, x_{n-1}, \xi_{i,n})(y_n - x_n) \]
for every i. Since the ith entry of J(x)(y − x) is
\[ \sum_{j=1}^n D_j f_i(x_1, \dots, x_n)(y_j - x_j), \]
and since |y_j − x_j| ≤ ‖y − x‖ we have
\[ |f_i(y) - f_i(x) - [J(x)(y-x)]_i| \le \big( |D_1 f_i(\xi_{i,1}, y_2, \dots, y_n) - D_1 f_i(x_1, \dots, x_n)| + \cdots + |D_n f_i(x_1, \dots, x_{n-1}, \xi_{i,n}) - D_n f_i(x_1, \dots, x_n)| \big) \|y - x\|. \]
Since each D_j f_i is continuous, we can choose δ small enough that
\[ |f_i(y) - f_i(x) - [J(x)(y-x)]_i| < \varepsilon\|y - x\| \]
whenever 0 < ‖x − y‖ < δ. Hence, J(x) satisfies the definition of the derivative, and Df(x) = J(x) exists. ∎

The following theorem shows that the total derivative may be used to compute
directional derivatives.

Theorem 6.2.15. Let U C JRn be an open set. If f : U ---+ ]Rm is differentiable


at x E U , then the directional derivative along v E JRn at x exists and can be
computed as
Dvf(x ) = Df(x)v. (6.5)
In particular, the directional derivative is linear in the direction v .

Proof. Assume v is nonzero; otherwise the result is trivial. Let
\[ \alpha(t) = \left\| \frac{f(x+tv) - f(x)}{t} - Df(x)v \right\|. \quad (6.6) \]
It suffices to show that lim_{t→0} α(t) = 0. Choose ε > 0. Since f is differentiable at x, there exists δ > 0 such that
\[ \|f(x+h) - f(x) - Df(x)h\| < \varepsilon\|h\| \]
whenever 0 < ‖h‖ < δ. Thus, for ‖tv‖ < δ, we have
\[ \frac{\|f(x+tv) - f(x) - tDf(x)v\|}{|t|} < \varepsilon\|v\|. \]
But ‖tv‖ < δ if and only if |t| < δ‖v‖⁻¹; so when |t| < δ‖v‖⁻¹, we know α(t) < ε‖v‖. Thus, lim_{t→0} α(t) = 0 and (6.5) holds. ∎

Example 6.2.16. Let f : ℝ² → ℝ be defined by f(x,y) = xy² + x³y, as in Example 6.1.12. We have
\[ Df(x,y) = \begin{bmatrix} D_1 f(x,y) & D_2 f(x,y) \end{bmatrix} = \begin{bmatrix} y^2 + 3x^2 y & 2xy + x^3 \end{bmatrix}. \]
By Theorem 6.2.15, the directional derivative of f in the direction (1/√2, 1/√2) is
\[ D_{(\frac{1}{\sqrt2},\frac{1}{\sqrt2})} f(x,y) = \begin{bmatrix} y^2 + 3x^2 y & 2xy + x^3 \end{bmatrix} \begin{bmatrix} \tfrac{1}{\sqrt2} \\ \tfrac{1}{\sqrt2} \end{bmatrix} = \frac{1}{\sqrt2}\big( y^2 + 3x^2 y + 2xy + x^3 \big), \]
which agrees with our previous (more laborious) calculation.
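The identity D_vf(x) = Df(x)v is also easy to test numerically. The short sketch below (an added illustration, not part of the text; NumPy is assumed, and the sample point is arbitrary) compares the limit definition of the directional derivative with the product Df(x)v for this f.

import numpy as np

def f(x, y):
    return x * y**2 + x**3 * y

def Df(x, y):
    return np.array([y**2 + 3*x**2*y, 2*x*y + x**3])

x0 = np.array([1.0, 2.0])
v = np.array([1.0, 1.0]) / np.sqrt(2.0)

matrix_version = Df(*x0) @ v
t = 1e-6
limit_version = (f(*(x0 + t*v)) - f(*x0)) / t
print(matrix_version, limit_version)   # both approximately 10.607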

Remark 6.2.17. It is possible for the partial derivatives of f to exist even if f is


not differentiable. In this case (6.5) may or may not hold; see Exercises 6.7 and 6.8.

6.3 The General Frechet Derivative


In this section, we extend the idea of the Frechet derivative from functions on rr:tn
to functions on general Banach spaces. The Frechet derivative for functions on
arbitrary Banach spaces is very similar to the finite-dimensional case, except that
derivatives in the infinite-dimensional case generally have no matrix representations.
Just as the derivative in rr:tn can be used for finding local extrema of a function,
so also the Frechet derivative can be used for finding local extrema, but now we
consider functions on infinite-dimensional Banach spaces like L 00 , and the critical
values are also elements of that space. This generalization leads to the famous
Euler-Lagrange equations of motion in Lagrangian mechanics (see Volume 4).
Throughout this section let (X, II · llx) and (Y, II · llY) be Banach spaces over
lF, and let UC X be an open set.

6.3.1 The Frechet Derivative

Definition 6.3.1. A function f : U → Y is differentiable at x ∈ U if there exists a bounded linear transformation Df(x) : X → Y such that
\[ \lim_{h\to 0} \frac{\|f(x+h) - f(x) - Df(x)h\|_Y}{\|h\|_X} = 0. \quad (6.7) \]
We call Df(x) the derivative of f at x. If f is differentiable at every point x ∈ U, we say f is differentiable on U.

Remark 6.3.2. Topologically equivalent norms (on either X or Y) give the same derivative. That is, if ‖·‖_a and ‖·‖_b are two topologically equivalent norms on X and if ‖·‖_r and ‖·‖_s are two topologically equivalent norms on Y, then Theorem 5.8.4 guarantees that there is an M and an m such that
\[ \|f(x+h) - f(x) - Df(x)h\|_r \le M\|f(x+h) - f(x) - Df(x)h\|_s \quad\text{and}\quad \|h\|_a \ge m\|h\|_b. \]
Thus, we have
\[ 0 \le \frac{\|f(x+h) - f(x) - Df(x)h\|_r}{\|h\|_a} \le \frac{M\|f(x+h) - f(x) - Df(x)h\|_s}{m\|h\|_b}. \]
So if f has derivative Df(x) at x with respect to the norms ‖·‖_s and ‖·‖_b, it must also have the same derivative with respect to ‖·‖_r and ‖·‖_a.

Example 6.3.3. ▲ If L : X → Y is a bounded linear transformation, then the argument in Example 6.2.5 shows that DL(x)(v) = L(v) for every x, v ∈ X.

Example 6.3.4. For an example of a linear operator on an infinite-dimensional space, consider X = C([0,1]; ℝ) with the L^∞-norm. The function L : X → ℝ given by
\[ L(f) = \int_0^1 t f(t)\,dt \]
is linear in f, so by Example 6.3.3 we have DL(f)(g) = L(g) for every g ∈ X.

Example 6.3.5. If X = C([0,1]; ℝ) with the L^∞-norm, then the function Q : X → ℝ given by
\[ Q(f) = \int_0^1 t f^3(t)\,dt \]
is not linear. We show that for each f ∈ X the derivative DQ(f) is the linear transformation B : X → ℝ defined by B(g) = ∫₀¹ 3t f²(t) g(t) dt. To see this, compute
\[
\lim_{h\to 0} \frac{|Q(f+h) - Q(f) - B(h)|}{\|h\|_{L^\infty}}
= \lim_{h\to 0} \frac{\left| \int_0^1 t\big((f+h)^3(t) - f^3(t) - 3f^2(t)h(t)\big)\,dt \right|}{\|h\|_{L^\infty}}
= \lim_{h\to 0} \frac{\left| \int_0^1 t\big(3f(t)h^2(t) + h^3(t)\big)\,dt \right|}{\|h\|_{L^\infty}}
\le \lim_{h\to 0} \frac{\|h\|_{L^\infty}^2 \int_0^1 |t(3f(t) + h(t))|\,dt}{\|h\|_{L^\infty}} = 0.
\]
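Although X is infinite dimensional, the computation above can be checked approximately on a grid. The following Python sketch (an added illustration, not from the text; the sample functions, grid, and trapezoid quadrature are arbitrary choices) verifies that the error quotient |Q(f + sh) − Q(f) − B(sh)|/‖sh‖_{L∞} shrinks as s → 0.

import numpy as np

t = np.linspace(0.0, 1.0, 2001)

def trap(y):
    """Trapezoid rule on the fixed grid t."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(t)))

def Q(f):
    return trap(t * f**3)

def B(f, g):
    return trap(3 * t * f**2 * g)

f = np.cos(3 * t)            # a sample element of C([0, 1]; R)
h = np.sin(5 * t) + 0.5      # a sample perturbation direction

for s in [1e-1, 1e-2, 1e-3]:
    err = abs(Q(f + s * h) - Q(f) - B(f, s * h))
    print(s, err / np.max(np.abs(s * h)))   # quotient shrinks roughly like s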

Definition 6.3.6. Let f : U -+ Y be differentiable on U. If the map D f :


U-+ ~(X; Y), given by x H Df(x), is continuous, then we say f is continuously
differentiable on U . The set of continuously differentiable functions on U is denoted
C 1 (U; Y).

Proposition 6.3.7. If f : U → Y is differentiable on U, then f is locally Lipschitz⁴⁰ at every point of U; that is, for all x₀ ∈ U, there exist B(x₀, δ) ⊂ U and an L > 0 such that ‖f(x) − f(x₀)‖_Y ≤ L‖x − x₀‖_X whenever ‖x − x₀‖_X < δ.

⁴⁰This should not be confused with the property of being locally Lipschitz on U, which is a stronger condition requiring that ‖f(x) − f(z)‖_Y ≤ L‖x − z‖_X for every x, z ∈ B(x₀, δ).

Proof. Let h = x − x₀ and ε = 1. Choose δ > 0 so that whenever 0 < ‖x − x₀‖_X < δ, we have
\[ \frac{\|f(x_0 + h) - f(x_0) - Df(x_0)h\|_Y}{\|x - x_0\|_X} < 1, \]
or, alternatively,
\[ \|f(x) - f(x_0) - Df(x_0)(x - x_0)\|_Y < \|x - x_0\|_X. \quad (6.8) \]
Applying the triangle inequality to
\[ \|f(x) - f(x_0)\|_Y = \|f(x) - f(x_0) - Df(x_0)(x - x_0) + Df(x_0)(x - x_0)\|_Y \]
gives
\[ \|f(x) - f(x_0)\|_Y \le \|f(x) - f(x_0) - Df(x_0)(x - x_0)\|_Y + \|Df(x_0)(x - x_0)\|_Y. \]
Combining with (6.8) gives
\[ \|f(x) - f(x_0)\|_Y \le \|x - x_0\|_X + \|Df(x_0)(x - x_0)\|_Y \le \big(1 + \|Df(x_0)\|_{X,Y}\big)\|x - x_0\|_X, \]
where the case of x = x₀ gives equality. Thus, in the ball B(x₀, δ) the function f is locally Lipschitz at x₀ with constant L = ‖Df(x₀)‖_{X,Y} + 1. ∎

Corollary 6.3.8. If a function is differentiable on an open set U, then it is con-


tinuous on U.

Proof. The proof is Exercise 6.14. D

Remark 6.3.9. Just as in the finite-dimensional case, the derivative is unique. In


fact, the proof of Proposition 6.2 .10 uses nothing about R.n and works in the general
case exactly as written.

Proposition 6.3.10. If f : U -7 Y is differentiable at x E U and is Lipschitz with


constant L in a neighborhood of x, then llD f(x) llx,Y :::; L.

Proof. Let u be a unit vector. For any ε > 0 choose δ such that
\[ \frac{\|f(x+h) - f(x) - Df(x)h\|_Y}{\|h\|_X} < \varepsilon \]
whenever ‖h‖ < δ, as in the definition of the derivative. Let h = (δ/2)u, which gives
\[ \|Df(x)u\|_Y = \frac{\|Df(x)h\|_Y}{\|h\|_X} \le \frac{\|f(x+h) - f(x) - Df(x)h\|_Y + \|f(x+h) - f(x)\|_Y}{\|h\|_X} \le \frac{\|f(x+h) - f(x) - Df(x)h\|_Y}{\|h\|_X} + \frac{L\|h\|_X}{\|h\|_X} \le \varepsilon + L. \]
Since ε was arbitrary, we have the desired result. ∎

6.3 .2 Frechet Derivatives on Cartesian Products


Recall from Remark 5.8.9 that for p E [1, oo) all the p-norms on a Cartesian prod-
uct of a finite number of Banach spaces are topologically equivalent, and so by
Remark 6.3.2 they give the same derivative.

Proposition 6.3.11. Let f : U → Y₁ × Y₂ × ⋯ × Y_m be defined by f(x) = (f₁(x), f₂(x), …, f_m(x)), where ((Y_i, ‖·‖_{Y_i}))_{i=1}^m is a collection of Banach spaces. If f_i is differentiable at x ∈ U for each i, then so is f. Moreover, for each h ∈ X,
\[ Df(x)h = (Df_1(x)h, Df_2(x)h, \dots, Df_m(x)h). \]

Proof. The proof is similar to that for Proposition 6.1.5 and is Exercise 6.15. ∎

Definition 6.3.12. Let ((X_i, ‖·‖_i))_{i=1}^n be a collection of Banach spaces. Let f : X₁ × X₂ × ⋯ × X_n → Y, where (Y, ‖·‖_Y) is a Banach space. The ith partial derivative at (x₁, x₂, …, x_n) ∈ X₁ × X₂ × ⋯ × X_n is the derivative of the function g : X_i → Y defined by g(x_i) = f(x₁, …, x_{i−1}, x_i, x_{i+1}, …, x_n) and is denoted D_i f(x₁, x₂, …, x_n).

Example 6.3.13. In the special case that each X i is JR and Y = JRm, the
definition of partial derivative in Definition 6.3.12 is the same as we gave
before in Definition 6.1.13.

Theorem 6.3.14. Let ((X_i, ‖·‖_i))_{i=1}^n be a collection of Banach spaces. Let f : X₁ × X₂ × ⋯ × X_n → Y, where (Y, ‖·‖_Y) is a Banach space. If f is differentiable at x = (x₁, x₂, …, x_n), then its partial derivatives D_i f(x) all exist. Moreover, if h = (h₁, h₂, …, h_n) ∈ X₁ × X₂ × ⋯ × X_n, then
\[ Df(x)h = \sum_{i=1}^n D_i f(x)\,h_i. \]

Conversely, if all the partial derivatives of f exist and are continuous on the set
UC X 1 x X 2 x · · · x Xn, then f is continuously differentiable on U.

Proof. The proof is an easy generalization of Theorems 6.2. 14 and 6. 2.11 . D

6.4 Properties of Derivatives


In this section, we prove three important rules of t he derivative, namely, linearity,
the product rule, and the chain rule. We also show how to compute the derivative
of various matrix-valued functions.
Throughout this section assume that (X, II · llx) and (Y, II · llY) are Banach
spaces over lF and that U C X is an open set.
We begin with a simple observation that simplifies many of the following
proofs.

Lemma 6.4.1. Given a function f : U → Y, a point x ∈ U, and a linear transformation L : X → Y, the following are equivalent:

(i) The function f is differentiable at x with derivative L.

(ii) For every ε > 0 there is a δ > 0 with B(x, δ) ⊂ U such that
\[ \|f(x + \xi) - f(x) - L\xi\|_Y \le \varepsilon \|\xi\|_X \quad (6.9) \]
whenever ‖ξ‖_X < δ.

Proof. When ξ = 0 the relation (6.9) is automatically true. When ξ ≠ 0, dividing by ‖ξ‖_X shows that this is equivalent to the limit in the definition of the derivative, except that this inequality is not strict, whereas the definition of limit has a strict inequality. But this is remedied by choosing a δ such that (6.9) holds with ε/2 instead of ε. ∎

6.4.1 Linearity
We have already seen that the derivative D f (x)v is not linear in x , but by definition
it is linear in v . The following theorem shows that it is also linear in f.

Theorem 6.4.2 (Linearity). Assume that f: U--+ Y and g: U--+ Y. If f and


g are differentiable on U and a, b E lF, then af + bg is also differentiable on U , and
D(af(x) + bg(x)) = aDJ(x) + bDg(x ) for each x EU.

Proof. Choose ε > 0. Since f and g are differentiable at x, there exists δ > 0 such that B(x, δ) ⊂ U and such that whenever ‖ξ‖_X < δ we have
\[ \|f(x+\xi) - f(x) - Df(x)\xi\|_Y \le \frac{\varepsilon\|\xi\|_X}{2(|a|+1)} \]
and
\[ \|g(x+\xi) - g(x) - Dg(x)\xi\|_Y \le \frac{\varepsilon\|\xi\|_X}{2(|b|+1)}. \]
Thus,
\[ \|af(x+\xi) + bg(x+\xi) - af(x) - bg(x) - aDf(x)\xi - bDg(x)\xi\|_Y \le |a|\,\|f(x+\xi) - f(x) - Df(x)\xi\|_Y + |b|\,\|g(x+\xi) - g(x) - Dg(x)\xi\|_Y \le \frac{\varepsilon|a|\,\|\xi\|_X}{2(|a|+1)} + \frac{\varepsilon|b|\,\|\xi\|_X}{2(|b|+1)} \le \varepsilon\|\xi\|_X \]
whenever ‖ξ‖_X < δ. The result now follows by Lemma 6.4.1. ∎

Remark 6.4.3. Among other things, the previous proposition tells us that the set
C 1 (U; Y) of continuously differentiable functions on U is a vector space.

6.4.2 Product Rule


We cannot necessarily multiply vector-valued functions, so a general product rule
might not make sense. But if the codomain is a field, then we can talk about the
product of functions, and the Frechet derivative satisfies a product rule.

Theorem 6.4.4 (Product Rule). If f : U --t lF and g: U --t lF are differentiable


on U, then the product map h = f g is also differentiable on U, and Dh(x) =
g(x)Df(x) + f(x)Dg(x) for each x EU.

Proof. It suffices to show that
\[ \lim_{\xi \to 0} \frac{|h(x+\xi) - h(x) - g(x)Df(x)\xi - f(x)Dg(x)\xi|}{\|\xi\|} = 0. \quad (6.10) \]
Choose ε > 0. Since f and g are differentiable at x, they are locally Lipschitz at x (see Proposition 6.3.7), so there exists B(x, δ_x) ⊂ U and a constant L > 0 such that
\[ |f(x+\xi) - f(x)| \le L\|\xi\|, \quad (6.11) \]
\[ |f(x+\xi) - f(x) - Df(x)\xi| \le \frac{\varepsilon\|\xi\|}{3(|g(x)| + 1)}, \quad (6.12) \]
and
\[ |g(x+\xi) - g(x) - Dg(x)\xi| \le \frac{\varepsilon\|\xi\|}{3(|f(x)| + L)}, \quad (6.13) \]
whenever ‖ξ‖ < δ_x. If we set δ = min{ δ_x, ε/(3L(‖Dg(x)‖+1)), 1 }, then whenever ‖ξ‖ < δ, we have that
\[
|f(x+\xi)g(x+\xi) - f(x)g(x) - g(x)Df(x)\xi - f(x)Dg(x)\xi| \le |f(x+\xi)|\,|g(x+\xi) - g(x) - Dg(x)\xi| + |g(x)|\,|f(x+\xi) - f(x) - Df(x)\xi| + |f(x+\xi) - f(x)|\,\|Dg(x)\|\,\|\xi\| \le (|f(x)| + L)\,|g(x+\xi) - g(x) - Dg(x)\xi| + |g(x)|\,|f(x+\xi) - f(x) - Df(x)\xi| + \delta L \|Dg(x)\|\,\|\xi\| \le \varepsilon\|\xi\|
\]
whenever ‖ξ‖ < δ. The result now follows by Lemma 6.4.1. ∎

Example 6.4.5. Let f : ℝ³ → ℝ be defined by f(x,y,z) = x⁵y + xy² + z⁷, and let g : ℝ³ → ℝ be defined by g(x,y,z) = x³ + z¹¹. By the product rule we have
\[ D(fg)(x,y,z) = g(x,y,z)\,Df(x,y,z) + f(x,y,z)\,Dg(x,y,z) = (x^3 + z^{11})\begin{bmatrix} 5x^4 y + y^2 & x^5 + 2xy & 7z^6 \end{bmatrix} + (x^5 y + xy^2 + z^7)\begin{bmatrix} 3x^2 & 0 & 11z^{10} \end{bmatrix}. \]
For example, the total derivative at the point (x,y,z) = (0, 1, −1) ∈ ℝ³ is given by
\[ D(fg)(0, 1, -1) = -\begin{bmatrix} 1 & 0 & 7 \end{bmatrix} - \begin{bmatrix} 0 & 0 & 11 \end{bmatrix} = \begin{bmatrix} -1 & 0 & -18 \end{bmatrix}. \]
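A quick symbolic spot check of this example (an added illustration; it assumes SymPy is available) confirms the value of D(fg)(0, 1, −1).

import sympy as sp

x, y, z = sp.symbols('x y z')
f = x**5*y + x*y**2 + z**7
g = x**3 + z**11
fg = f * g

# Row vector of partial derivatives of the product.
grad = sp.Matrix([[sp.diff(fg, v) for v in (x, y, z)]])
print(grad.subs({x: 0, y: 1, z: -1}))   # Matrix([[-1, 0, -18]])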

Proposition 6.4.6. We have the following differentiation rules:

(i) If u(x), v(x) are differentiable functions from ℝⁿ to ℝᵐ and f : ℝⁿ → ℝ is given by f(x) = u(x)ᵀv(x), then
\[ Df(x) = u(x)^{\mathsf T} Dv(x) + v(x)^{\mathsf T} Du(x). \]

(ii) If g : ℝⁿ → ℝ is given by g(x) = xᵀAx, then
\[ Dg(x) = x^{\mathsf T}(A + A^{\mathsf T}). \]

(iii) Let w(x) = (w₁(x), …, w_m(x))ᵀ be a differentiable function from ℝⁿ to ℝᵐ, and let
\[ B(x) = \begin{bmatrix} b_{11}(x) & b_{12}(x) & \cdots & b_{1m}(x) \\ b_{21}(x) & b_{22}(x) & \cdots & b_{2m}(x) \\ \vdots & \vdots & & \vdots \\ b_{k1}(x) & b_{k2}(x) & \cdots & b_{km}(x) \end{bmatrix} \]
be a differentiable function from ℝⁿ to M_{k×m}(𝔽). If H : ℝⁿ → ℝᵏ is given by H(x) = B(x)w(x), then
\[ DH(x) = B(x)Dw(x) + \begin{bmatrix} w^{\mathsf T}(x)\,Db_1^{\mathsf T}(x) \\ \vdots \\ w^{\mathsf T}(x)\,Db_k^{\mathsf T}(x) \end{bmatrix}, \]
where b_i is the ith row of B.

Proof. The proof is Exercise 6.16. ∎
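Rule (ii) is easy to test numerically. The sketch below (an added illustration, not part of the text; NumPy is assumed, and the random matrix and point are arbitrary) compares a finite-difference gradient of g(x) = xᵀAx with xᵀ(A + Aᵀ).

import numpy as np

rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n))
x = rng.standard_normal(n)

def g(x):
    return x @ A @ x

h = 1e-6
# Forward-difference approximation of the gradient row vector.
fd = np.array([(g(x + h*e) - g(x)) / h for e in np.eye(n)])
print(np.allclose(fd, x @ (A + A.T), atol=1e-4))   # True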



6.4.3 Chain Rule


The chain rule also holds for Frechet derivatives.

Theorem 6.4.7 (Chain Rule). Assume that (X, ‖·‖_X), (Y, ‖·‖_Y), and (Z, ‖·‖_Z) are Banach spaces, that U ⊂ X and V ⊂ Y are open sets, and that f : U → Y and g : V → Z with f(U) ⊂ V. If f is differentiable on U and g is differentiable on V, then the composite map h = g ∘ f is also differentiable on U and Dh(x) = Dg(f(x))Df(x) for each x ∈ U.

Proof. Let x ∈ U, and let y = f(x). Choose ε > 0. Since f is differentiable at x and locally Lipschitz at x (by Proposition 6.3.7), there exists B(x, δ_x) ⊂ U and a constant L > 0 such that whenever ‖ξ‖_X < δ_x we have
\[ \|f(x+\xi) - f(x) - Df(x)\xi\|_Y \le \frac{\varepsilon\|\xi\|_X}{2(\|Dg(y)\|_{Y,Z} + 1)} \]
and
\[ \|f(x+\xi) - f(x)\|_Y \le L\|\xi\|_X. \]
Since g is differentiable at y, there exists B(y, δ_y) ⊂ V such that whenever ‖η‖_Y < δ_y we have
\[ \|g(y+\eta) - g(y) - Dg(y)\eta\|_Z \le \frac{\varepsilon\|\eta\|_Y}{2L}. \]
Note that
\[ h(x+\xi) - h(x) = g(f(x+\xi)) - g(f(x)) = g(y + \eta(\xi)) - g(y), \]
where η(ξ) := f(x+ξ) − f(x). Thus, whenever ‖ξ‖_X < min{ δ_x, δ_y/L }, we have that ‖η(ξ)‖_Y ≤ L‖ξ‖_X < δ_y. It follows that
\[
\|h(x+\xi) - h(x) - Dg(y)Df(x)\xi\|_Z = \|g(y+\eta(\xi)) - g(y) - Dg(y)\eta(\xi) + Dg(y)\eta(\xi) - Dg(y)Df(x)\xi\|_Z \le \|g(y+\eta(\xi)) - g(y) - Dg(y)\eta(\xi)\|_Z + \|Dg(y)\|_{Y,Z}\,\|\eta(\xi) - Df(x)\xi\|_Y \le \frac{\varepsilon\|\eta(\xi)\|_Y}{2L} + \|Dg(y)\|_{Y,Z}\,\frac{\varepsilon\|\xi\|_X}{2(\|Dg(y)\|_{Y,Z}+1)} \le \varepsilon\|\xi\|_X.
\]
The result now follows by Lemma 6.4.1. ∎

Example 6.4.8. Let f : ℝ² → ℝ and g : ℝ² → ℝ² both be differentiable. If g(p,q) = (x(p,q), y(p,q)) and h(p,q) = f(g(p,q)) = f(x(p,q), y(p,q)), then
\[ Df(x,y) = \begin{bmatrix} \frac{\partial f}{\partial x} & \frac{\partial f}{\partial y} \end{bmatrix} \quad\text{and}\quad Dg(p,q) = \begin{bmatrix} \frac{\partial x}{\partial p} & \frac{\partial x}{\partial q} \\ \frac{\partial y}{\partial p} & \frac{\partial y}{\partial q} \end{bmatrix}, \]
and by the chain rule we have Dh(p,q) = Df(x,y)Dg(p,q). Hence,
\[ \begin{bmatrix} \frac{\partial h}{\partial p} & \frac{\partial h}{\partial q} \end{bmatrix} = \begin{bmatrix} \frac{\partial f}{\partial x} & \frac{\partial f}{\partial y} \end{bmatrix} \begin{bmatrix} \frac{\partial x}{\partial p} & \frac{\partial x}{\partial q} \\ \frac{\partial y}{\partial p} & \frac{\partial y}{\partial q} \end{bmatrix} = \begin{bmatrix} \frac{\partial f}{\partial x}\frac{\partial x}{\partial p} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial p} & \frac{\partial f}{\partial x}\frac{\partial x}{\partial q} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial q} \end{bmatrix}. \]
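The coordinate formula above can be checked numerically for any concrete choice of f and g. In the following sketch (an added illustration; the maps f and g are arbitrary sample choices, and NumPy is assumed) a finite-difference derivative of h = f ∘ g matches the product Df(x,y)Dg(p,q).

import numpy as np

def g(p, q):                      # g : R^2 -> R^2, components x(p,q), y(p,q)
    return np.array([p*q, p + q**2])

def f(x, y):                      # f : R^2 -> R
    return np.sin(x) + x*y

def Df(x, y):
    return np.array([np.cos(x) + y, x])

def Dg(p, q):
    return np.array([[q, p],
                     [1.0, 2*q]])

p0, q0 = 0.7, -0.3
x0, y0 = g(p0, q0)
chain = Df(x0, y0) @ Dg(p0, q0)

h = 1e-6
fd = np.array([(f(*g(p0 + h, q0)) - f(*g(p0, q0))) / h,
               (f(*g(p0, q0 + h)) - f(*g(p0, q0))) / h])
print(chain, fd)    # the two rows agree to about five digits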

Example 6.4.9. Let f : ℝⁿ → ℝ be arbitrary and γ : ℝ → ℝⁿ be defined by γ(t) = x + tv. The chain rule gives
\[ D_v f(x) = \frac{d}{dt} f(\gamma(t))\Big|_{t=0} = Df(\gamma(0))\gamma'(0) = Df(x)v. \]
In other words, the directional derivative D_vf(x) is the product of the derivative Df(x) and the tangent vector v. This is an alternative proof of Theorem 6.2.15.

6.5 Mean Value Theorem and Fundamental Theorem


of Calculus
In this section, we generalize several more properties of single-variable derivatives
to functions on Banach spaces. These include both the mean value theorem and the
fundamental theorem of calculus. We then use these to describe some situations
where uniform limits commute with differentiation.
Throughout this section, unless indicated otherwise, assume that (X, II · llx)
and (Y, II · llY) are Banach spaces over lF and that UC Xis an open set.

6.5.1 The Mean Value Theorem


We begin this section by generalizing the mean value theorem to functions on a
Banach space.

Theorem 6.5.1 (Mean Value Theorem). Let f : U → ℝ be differentiable on U. Given a, b ∈ U, if the entire line segment ℓ(a,b) = {(1−t)a + tb | t ∈ (0,1)} is also in U, then there exists c ∈ ℓ(a,b) such that
\[ f(b) - f(a) = Df(c)(b - a). \quad (6.14) \]

Proof. Let h : [0,1] → ℝ be given by h(t) = f((1−t)a + tb). Since f is differentiable on U and U contains the line segment ℓ(a,b), the function h is differentiable on (0,1) and continuous on [0,1]. By the usual mean value theorem in one dimension, there exists t₀ ∈ (0,1) such that h(1) − h(0) = h′(t₀). By the chain rule, h′(t₀) = Df((1−t₀)a + t₀b)(b − a) = Df(c)(b − a), and therefore (6.14) holds. ∎

Remark 6.5.2. There is an important property in mathematics that allows the hypothesis in Theorem 6.5.1 to be stated more concisely. If for any two points a and b in the set U the line segment ℓ(a,b) is also contained in U, then we say that U is convex. See Figure 6.2 for an illustration. Convexity is a property that is used widely in applications. We treat convexity in much more depth in Volume 2.

Figure 6.2. In this figure the set V is not convex because the line segment
£(a, b) between a and b does not lie inside of V. But the set U is convex because
for every pair of points a, b E U the line segment £( a, b) between a and b lies in U;
see Remark 6. 5. 2.

6.5.2 Single-Variable Fundamental Theorem of Calculus

Lemma 6.5.3. Assume f : [a,b] → X is continuous on [a,b] and differentiable on (a,b). If Df(t) = 0 for all t ∈ (a,b), then f is constant.

Proof. Let α, β ∈ (a,b) with α < β. Given ε > 0 and t ∈ (α, β), there exists δ_t > 0 such that ‖f(t+h) − f(t)‖_X ≤ ε|h| whenever |h| < δ_t. Since [α, β] is compact, we can cover it with a finite set of overlapping intervals {(t_i − δ_i, t_i + δ_i)}_{i=1}^n, where, without loss of generality, we can assume α < t₁ < t₂ < ⋯ < t_n < β. Choose points x₀, x₁, …, x_n so that
\[ \alpha = x_0 < t_1 < x_1 < t_2 < \cdots < t_n < x_n = \beta, \]
where |x_i − t_i| < δ_i and |x_{i−1} − t_i| < δ_i for each i; that is, the x_i are chosen to be in the overlapping regions created by adjacent open intervals. Thus, we have
\[ \|f(\beta) - f(\alpha)\|_X = \Big\| \sum_{i=1}^n [f(x_i) - f(x_{i-1})] \Big\|_X = \Big\| \sum_{i=1}^n \big[(f(x_i) - f(t_i)) + (f(t_i) - f(x_{i-1}))\big] \Big\|_X \le \sum_{i=1}^n \big[\|f(x_i) - f(t_i)\|_X + \|f(t_i) - f(x_{i-1})\|_X\big] \le \sum_{i=1}^n \varepsilon(x_i - x_{i-1}) = \varepsilon(\beta - \alpha). \]
Since ε > 0 is arbitrary, as are α, β ∈ (a,b), it follows that f is constant on (a,b). Since f is continuous on [a,b], it must also be constant on [a,b]. ∎

Theorem 6.5.4 (Fundamental Theorem of Calculus). Using the natural isomorphism ℬ(ℝ; X) ≅ X defined by sending φ ∈ ℬ(ℝ; X) to φ(1) ∈ X, we have the following:

(i) If f ∈ C([a,b]; X), then for all t ∈ (a,b) we have
\[ \frac{d}{dt} \int_a^t f(s)\,ds = f(t). \quad (6.15) \]

(ii) If F : [a,b] → X is continuously differentiable on (a,b) and DF(t) extends to a continuous function on [a,b], then
\[ \int_a^b DF(s)\,ds = F(b) - F(a). \quad (6.16) \]

Proof.
(i) Let ε > 0 be given. There exists δ > 0 such that ‖f(t+h) − f(t)‖_X < ε whenever |h| < δ. Thus,
\[ \Big\| \int_a^{t+h} f(s)\,ds - \int_a^t f(s)\,ds - f(t)h \Big\|_X = \Big\| \int_t^{t+h} \big(f(s) - f(t)\big)\,ds \Big\|_X \le \Big| \int_t^{t+h} \|f(s) - f(t)\|_X\,ds \Big| \le |h|\,\eta(h), \]
where η(h) = sup_{s∈B(t,|h|)} ‖f(s) − f(t)‖_X < ε. Hence, (6.15) holds.

(ii) Let G(t) = ∫_a^t DF(s) ds − F(t). This is continuous by Proposition 5.10.12(v). Moreover, by (i) above, we have that DG(t) = 0 for all t ∈ (a,b). Therefore, by Lemma 6.5.3, we have that G(t) is constant. In particular, G(a) = G(b), which implies (6.16). ∎

Corollary 6.5.5 (Integral Mean Value Theorem). Let f ∈ C¹(U; Y). If the line segment ℓ(x_*, x) = {(1−t)x_* + tx | t ∈ [0,1]} is contained in U, then
\[ f(x) - f(x_*) = \int_0^1 Df(tx + (1-t)x_*)(x - x_*)\,dt. \]
Alternatively, if we let x = x_* + h, then
\[ f(x_* + h) - f(x_*) = \int_0^1 Df(x_* + th)h\,dt. \quad (6.17) \]
Moreover, we have
\[ \|f(x) - f(x_*)\|_Y \le \sup_{c \in \ell(x_*, x)} \|Df(c)\|_{X,Y}\, \|x - x_*\|_X. \quad (6.18) \]

Proof. The proof is Exercise 6.22. ∎



Corollary 6.5.6 (Change of Variable Formula). Let f ∈ C([a,b]; X) and g : [c,d] → [a,b] be continuous. If g′ is continuous on (c,d) and can be continuously extended to [c,d], then
\[ \int_c^d f(g(s))\,g'(s)\,ds = \int_{g(c)}^{g(d)} f(\tau)\,d\tau. \quad (6.19) \]

Proof. The proof is Exercise 6.23. ∎

6.5.3 Uniform Convergence and Derivatives


Theorem 5.7.5 shows that for any Banach space X, the integral is a bounded linear operator, and hence continuous, on S([a,b]; X) ⊂ L^∞([a,b]; X). This means that for any uniformly convergent sequence (f_n)_{n=0}^∞ in S([a,b]; X), the integral commutes with the limit; that is,
\[ \lim_{n\to\infty} \int_a^b f_n\,dt = \int_a^b \Big( \lim_{n\to\infty} f_n \Big)\,dt. \]
It is natural to hope that something similar would be true for derivatives.


Unfortunately, the derivative is not a bounded linear transformation, which means,
among other things, that the derivatives of a convergent sequence may not even
converge. But all is not lost. The mean value theorem allows us to prove that
derivatives do pass through limits, provided the necessary limit exists.
First, however, we must be careful about what we mean by uniform convergence
for differentiable functions. The problem is that differentiability is a property of
functions on open sets, but the L 00 -norm is best behaved on compact sets. Specif-
ically, a function f E C 1 (U; Y) does not generally have a finite L 00 -norm (for
example, f(x) = l/x on the open set (0, 1) is differentiable, but has llfllL= = oo) .
The solution, when studying convergence of functions in the L 00 -norm, is to look
at all the compact subsets of U.

Definition 6.5.7. Restricting a function f ∈ C(U; Y) to a compact subset K ⊂ U defines a function f|_K ∈ (C(K; Y), ‖·‖_{L^∞}), where ‖f|_K‖_{L^∞} = sup_{x∈K} ‖f(x)‖. We say that a sequence (f_n)_{n=0}^∞ ⊂ C(U; Y)

(i) is a Cauchy sequence in C(U; Y) if the restriction (f_n|_K)_{n=0}^∞ ⊂ (C(K; Y), ‖·‖_{L^∞}) is a Cauchy sequence for every compact subset K ⊂ U;

(ii) converges uniformly on compact subsets to f ∈ C(U; Y) if the restriction (f_n|_K)_{n=0}^∞ converges to f|_K in (C(K; Y), ‖·‖_{L^∞}) for every compact subset K ⊂ U.

Remark 6.5.8. To check that a sequence on an open subset U of a


finite-dimensional Banach space is Cauchy or that it converges uniformly on com-
pact subsets, it suffices to check the condition on compact subsets that are closed
balls of the form B(x, r). Moreover, when the open set U is of the form B(xo, R) ,
then it suffices to check the condition on closed balls B (xo, r) centered at x o. The
proof of this is Exercise 6.24.

Example 6.5.9. For each n ∈ ℕ, let f_n(x) = xⁿ. Each f_n is differentiable on the interval (0,1) and has sup_{(0,1)} f_n = 1 for all n. But any compact subset of (0,1) lies in an interval [a,b] with 0 < a < b < 1, and we have ‖xⁿ|_{[a,b]}‖_{L^∞} = bⁿ → 0 as n → ∞, so f_n → 0 uniformly on compact subsets in the open interval (0,1).

Unexample 6.5.10. Consider the sequence g_n = cos(nx)/n on the open set U = (0, 2π). This sequence converges uniformly on compact subsets to 0, but the sequence of derivatives g_n′ = −sin(nx) does not converge at all. This example also shows that the induced norm of the derivative as a linear operator is infinite, since for every n > 0 we have
\[ \Big\| \frac{d}{dx} \Big\| = \sup_f \frac{\big\| \frac{d}{dx} f \big\|_{L^\infty}}{\|f\|_{L^\infty}} \ge \frac{\big\| \frac{d}{dx} g_n \big\|_{L^\infty}}{\|g_n\|_{L^\infty}} = \frac{1}{1/n} = n. \]
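The sketch below (an added numerical illustration, not part of the text) makes the same point on the compact interval [0.5, 5.5] ⊂ (0, 2π): the sup norm of g_n tends to 0 while the sup norm of g_n′ stays near 1.

import numpy as np

x = np.linspace(0.5, 5.5, 20001)     # a sample compact subset of (0, 2*pi)
for n in [1, 10, 100, 1000]:
    g = np.cos(n * x) / n
    dg = -np.sin(n * x)
    # prints n, sup|g_n| (about 1/n), sup|g_n'| (about 1)
    print(n, np.max(np.abs(g)), np.max(np.abs(dg)))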

Nevertheless, we can prove the following important result about uniform con-
vergence of derivatives in a finite-dimensional space.

Theorem 6.5.11. Let X be a finite-dimensional Banach space. Fix an open ball U = B_X(x_*, r) ⊂ X with a sequence (f_n)_{n=0}^∞ ⊂ C¹(U; Y) such that (f_n(x_*))_{n=0}^∞ ⊂ Y converges. If (Df_n)_{n=0}^∞ converges uniformly on compact subsets to g ∈ C(U; ℬ(X; Y)), then the sequence (f_n)_{n=0}^∞ converges uniformly on compact subsets to a function f ∈ C¹(U; Y), and g = Df.

The idea of the proof is fairly simple-define the limit function using t he
integral mean value theorem, and then use the fact that uniform limits commute
with integration. The result can also be extended to any path-connected U .

Proof. Let z = lim_{n→∞} f_n(x_*). For each x ∈ U let h = x − x_*, and define
\[ f(x) = z + \int_0^1 g(x_* + th)h\,dt. \quad (6.20) \]
This makes sense because U is convex. Note that f(x_*) = z = lim_{n→∞} f_n(x_*).
We claim that (f_n)_{n=0}^∞ converges to f uniformly on compact subsets. To prove this, it suffices (by Remark 6.5.8) to prove uniform convergence on any compact ball K = B(x_*, ρ) ⊂ U. To prove convergence on such a K, note that for any ε > 0 and for each x ∈ K we have, by the integral mean value theorem,
\[ \|f_n(x) - f(x)\|_Y = \Big\| f_n(x_*) + \int_0^1 Df_n(x_* + th)h\,dt - z - \int_0^1 g(x_* + th)h\,dt \Big\|_Y \le \|f_n(x_*) - z\|_Y + \Big\| \int_0^1 \big(Df_n(x_* + th) - g(x_* + th)\big)h\,dt \Big\|_Y \le \|f_n(x_*) - z\|_Y + \sup_{c \in K} \|Df_n(c) - g(c)\|_{X,Y}\,\|h\|_X. \quad (6.21) \]

Because f_n(x_*) → z, there is an N > 0 such that ‖f_n(x_*) − z‖_Y < ε/2 if n ≥ N, and since Df_n → g uniformly on K, there is an M > N such that
\[ \sup_{c \in K} \|Df_n(c) - g(c)\|_{X,Y} < \frac{\varepsilon}{2r} \]
whenever n > M. Combined with the fact that ‖h‖_X < r, Equation (6.21) gives
\[ \|f_n(x) - f(x)\|_Y < \frac{\varepsilon}{2} + \frac{\varepsilon}{2r}\|h\|_X < \frac{\varepsilon}{2} + \frac{\varepsilon}{2r}\,r = \varepsilon \]
whenever n ≥ M. Since the choice of N and M was independent of x, we have ‖f_n − f‖_{L^∞} ≤ ε on K. Therefore, f_n → f uniformly on all closed balls B(x_*, ρ) ⊂ U, and therefore on all compact subsets of U.
Finally, we must show that Df(x) = g(x) for any x in U. For any ε > 0 we must find a δ such that ‖f(x+h) − f(x) − g(x)h‖_Y ≤ ε‖h‖_X whenever ‖h‖_X < δ. By the integral mean value theorem, we have
\[ f_n(x+h) - f_n(x) = \int_0^1 Df_n(x + th)h\,dt. \]
Because U is open, there is an a > 0 such that B(x, a) ⊂ U. Integration is a bounded linear operator, so it commutes with uniform limits. Therefore, for all ‖h‖ < a we have f(x+h) − f(x) = ∫₀¹ g(x + th)h dt, and thus
\[ \|f(x+h) - f(x) - g(x)h\|_Y = \Big\| \int_0^1 g(x+th)h\,dt - g(x)h \Big\|_Y = \Big\| \int_0^1 \big(g(x+th) - g(x)\big)h\,dt \Big\|_Y. \]
Continuity of g on the compact set B(x, a) implies g is uniformly continuous there, and hence there exists a δ with 0 < δ < a such that ‖g(x+th) − g(x)‖_{X,Y} < ε whenever ‖h‖_X < δ. Therefore,
\[ \|f(x+h) - f(x) - g(x)h\|_Y \le \varepsilon\|h\|_X, \]
as required. ∎

6.6 Taylor's Theorem


Taylor's Theorem is one of the most powerful tools in analysis. It allows us to
approximate smooth (differentiable) functions in a small neighborhood to arbitrary
precision using polynomials. Not only does this give us a lot of insight into the
behavior of the function, but it also often allows us to compute, or at least approx-
imate, functions that are otherwise difficult to compute. It is also the foundation
of many of the important theorems of applied mathematics.
Before we can describe Taylor's theorem, we need to discuss higher-order
derivatives. Throughout this section, let (X, II · ll x) and (Y, I · llY) be Banach
spaces over IF, and let Uc X be an open set.

6.6.1 Higher-Order Derivatives


Definition 6.6.1. Fork :?: 2, let @k(X; Y) be defined inductively as @k(X; Y) =
@(X;@k- 1(X; Y)) and @ 1 (X; Y) = @(X; Y).

Definition 6.6.2. Let f : U --+ Y be differentiable on U . If D f : U --+ @(X; Y)


is differentiable, then we denote the derivative of D f as D 2 f (x) E @(X; @(X; Y) ),
or equivalently as D 2f(x) E @ 2 (X; Y), and call this map the second derivative .
Proceeding inductively, if the map Dk-l f: U--+ @k- 1 (X; Y) is differentiable
fork > 2, then we denote the kth derivative as Dk f(x) E @k(X; Y). If the kth
derivative Dk f (x ) is continuous on U, we say that f is k-times continuously dif-
ferentiable on U and denote the space of such functions as Ck(U; Y) .
Finally, we say that f is smooth and write f E C 00 (U; Y) if f E Ck(U; Y) for
every positive integer k.

Example 6.6.3. ▲ If f : ℝⁿ → ℝ is differentiable, then for each x ∈ ℝⁿ we have Df(x) ∈ ℬ(ℝⁿ; ℝ). By the Riesz representation theorem (Theorem 3.7.1), we have ℬ(ℝⁿ; ℝ) ≅ ℝⁿ, where a vector u ∈ ℝⁿ corresponds to the function v ↦ ⟨u, v⟩ of ℬ(ℝⁿ; ℝ). In the standard basis on ℝⁿ, it is convenient to write an element of ℬ(ℝⁿ; ℝ) as a row vector uᵀ, so that the corresponding linear transformation is just given by matrix multiplication (to indicate this we write ℬ¹(ℝⁿ; ℝ) ≅ (ℝⁿ)ᵀ). This is what is meant when Theorem 6.2.11 says that (in the standard basis) Df(x) can be written as a row vector Df(x) = [D₁f(x) ⋯ D_nf(x)].
If Df : ℝⁿ → ℬ(ℝⁿ; ℝ) ≅ (ℝⁿ)ᵀ is also differentiable at each x ∈ ℝⁿ, then D²f(x) ∈ ℬ(ℝⁿ; ℬ¹(ℝⁿ; ℝ)) ≅ ℬ(ℝⁿ; (ℝⁿ)ᵀ). Theorem 6.2.11 still applies, but since Df(x) is a row vector, the second derivative in the u direction D²f(x)(u) ∈ ℬ¹(ℝⁿ; ℝ) is also a row vector. In the standard basis we have D²f(x)(u) = uᵀHᵀ ∈ ℬ¹(ℝⁿ; ℝ) ≅ (ℝⁿ)ᵀ and D²f(x)(u)(v) = uᵀHᵀv, where
\[ H = D\Big( \begin{bmatrix} D_1 f(x) & \cdots & D_n f(x) \end{bmatrix}^{\mathsf T} \Big) = \begin{bmatrix} D_1 D_1 f(x) & \cdots & D_n D_1 f(x) \\ D_1 D_2 f(x) & \cdots & D_n D_2 f(x) \\ \vdots & & \vdots \\ D_1 D_n f(x) & \cdots & D_n D_n f(x) \end{bmatrix}, \qquad H_{ij} = \frac{\partial^2 f}{\partial x_j\,\partial x_i}. \]
The matrix H is called the Hessian of f. The next proposition shows that H is symmetric.

Definition 6.6.4. Let ((X_i, ‖·‖_i))_{i=1}^n be a collection of Banach spaces. Fix an open set U ⊂ X₁ × ⋯ × X_n and an ordered list of k integers i₁, …, i_k between 1 and n (not necessarily distinct). The kth-order partial derivative of f ∈ Cᵏ(U; Y) corresponding to i₁, …, i_k is the function D_{i₁} D_{i₂} ⋯ D_{i_k} f (see Definition 6.3.12).
If Y = 𝔽 and X_i = 𝔽 for every i, then U ⊂ 𝔽ⁿ, and in this case we often write D_{i₁} D_{i₂} ⋯ D_{i_k} f as
\[ \frac{\partial^k f}{\partial x_{i_1}\,\partial x_{i_2} \cdots \partial x_{i_k}}. \]

Proposition 6.6.5. Let f ∈ C²(U; Y), where Y is finite dimensional. For any x ∈ U and any v, w in X, we have
\[ D^2 f(x)(v, w) = D^2 f(x)(w, v). \quad (6.22) \]
If X = X₁ × ⋯ × X_n, then this says
\[ D_i D_j f(x) = D_j D_i f(x) \quad (6.23) \]
for all i and j. In the case that X = 𝔽 × ⋯ × 𝔽 = 𝔽ⁿ and Y = 𝔽ᵐ with f : X → Y given by f = (f₁, …, f_m), then this is equivalent to
\[ \frac{\partial^2 f_k}{\partial x_i\,\partial x_j} = \frac{\partial^2 f_k}{\partial x_j\,\partial x_i} \quad (6.24) \]
for all i, j, and k.

Proof. Since Y is finite dimensional, we may assume that Y ≅ 𝔽ᵐ and that f = (f₁, …, f_m). The usual norm on ℂᵐ is the same as the usual norm on ℝ²ᵐ, via the standard map (x₁ + iy₁, …, x_m + iy_m) ↦ (x₁, y₁, …, x_m, y_m); therefore, we may assume that 𝔽 = ℝ. Moreover, it suffices to prove the theorem for each f_k individually, so we may assume that Y = ℝ.
For each x ∈ U let g_t(x) = f(x + tv) − f(x), and let S_{s,t}(x) = g_t(x + sw) − g_t(x). By the single-variable mean value theorem, there exists a σ_{s,t} ∈ (0, s) such that
\[ S_{s,t}(x) = Dg_t(x + \sigma_{s,t} w)(sw). \]
But we have
\[ Dg_t(x + \sigma_{s,t} w)(sw) = Df(x + \sigma_{s,t} w + tv)(sw) - Df(x + \sigma_{s,t} w)(sw), \]
so we may apply the mean value theorem to get τ_{s,t} ∈ (0, t) such that
\[ S_{s,t}(x) = D^2 f(x + \sigma_{s,t} w + \tau_{s,t} v)(sw, tv). \]
Swapping the roles of tv and sw in the previous argument gives τ′_{s,t} ∈ (0, t) and σ′_{s,t} ∈ (0, s) such that
\[ S_{s,t}(x) = D^2 f(x + \tau'_{s,t} v + \sigma'_{s,t} w)(tv, sw). \]
Combining these two results and dividing by st gives
\[ D^2 f(x + \sigma_{s,t} w + \tau_{s,t} v)(w, v) = D^2 f(x + \tau'_{s,t} v + \sigma'_{s,t} w)(v, w). \]
Since f ∈ C²(U; Y), taking the limit as s, t → 0 gives (6.22).
To prove (6.23), take any v_i ∈ X_i and any w_j ∈ X_j and apply (6.22) with vectors v = (0, …, v_i, 0, …, 0) and w = (0, …, w_j, 0, …, 0) that are nonzero only in the ith or jth entry, respectively. Finally, (6.24) follows immediately from (6.23). ∎

Remark 6.6.6. This proposition guarantees that the Hessian matrix of


Example 6.6.3 is symmetric.

6.6.2 Higher-Order Directional Derivatives

If U ⊂ 𝔽ⁿ and f ∈ Cᵏ(U; 𝔽ᵐ), then Dᵏf(x) is an element of ℬᵏ(𝔽ⁿ; 𝔽ᵐ); so it accepts k different vectors from 𝔽ⁿ as inputs, and it returns an element of 𝔽ᵐ. It is useful to consider what happens when all of the input vectors are the same vector v = ∑_{i=1}^n v_i e_i. We can also consider this to be the kth directional derivative in the direction of v. We have
\[ D_v f(x) = Df(x)v = \begin{bmatrix} D_1 f(x) & \cdots & D_n f(x) \end{bmatrix} v = \sum_{j=1}^n v_j D_j f(x). \]
Repeating the process gives (for convenience we suppress the x)
\[ D_v^2 f = D_v D_v f = D_v \sum_{j=1}^n v_j D_j f = \sum_{i=1}^n \sum_{j=1}^n v_i v_j D_i D_j f = v^{\mathsf T} H v, \]
where H is the Hessian of Example 6.6.3.
Iterating k times gives
\[ D_v^k f = \sum_{i_1, \dots, i_k = 1}^n v_{i_1} \cdots v_{i_k}\, D_{i_1} \cdots D_{i_k} f. \quad (6.25) \]
We often also write D_v^k f = Dᵏf v^{(k)}.

Remark 6.6.7. Proposition 6.6.5 shows that (6.25) has repeated terms. If we combine these, we can reexpress (6.25) as
\[ D_v^k f = \sum_{j_1 + j_2 + \cdots + j_n = k} \frac{k!}{j_1!\, j_2! \cdots j_n!}\; v_1^{j_1} v_2^{j_2} \cdots v_n^{j_n}\; D_1^{j_1} D_2^{j_2} \cdots D_n^{j_n} f, \quad (6.26) \]
where the sum is taken over all nonnegative integers j₁, …, j_n summing to k.

6.6.3 Taylor's Theorem


Recall the very powerful single-variable Taylor's formula .

Theorem 6.6.8 (Taylor's Formula for One Variable). Let f : ℝ → ℝ be a (k+1)-times differentiable function. Then, for all a, h ∈ ℝ, we have
\[ f(a+h) = f(a) + f'(a)h + \frac{f''(a)}{2!}h^2 + \cdots + \frac{f^{(k)}(a)}{k!}h^k + \frac{f^{(k+1)}(c)}{(k+1)!}h^{k+1} \quad (6.27) \]
for some c ∈ ℝ between a and a + h.


Figure 6.3. Plots of the Taylor polynomials p₀, …, p₅ for f(x) = eˣ. The function f is plotted in red, and the Taylor polynomials are plotted in blue. Each polynomial is a good approximation of f in a small neighborhood of x = 0, but farther away from 0 the approximation is generally poor. As the degree increases, the neighborhood where the approximation is good increases in size, and the quality of the approximation improves. See Theorem 6.6.8 for the one-dimensional Taylor theorem over ℝ and Theorem 6.6.9 for the general case.

In this section, we prove the multidimensional version of Taylor's theorem and


show how to approximate smooth functions at a point with polynomials.

Theorem 6.6.9. Let f ∈ Cᵏ(U; Y). If x ∈ U and h ∈ X are such that the line segment ℓ(x, x+h) is contained in U, then
\[ f(x+h) = f(x) + Df(x)h + \frac{D^2 f(x)h^{(2)}}{2!} + \cdots + \frac{D^{k-1} f(x)h^{(k-1)}}{(k-1)!} + R_k, \quad (6.28) \]
where the remainder is given by
\[ R_k = \left( \int_0^1 \frac{(1-t)^{k-1}}{(k-1)!}\, D^k f(x+th)\,dt \right) h^{(k)}. \quad (6.29) \]

Proof. We proceed by induction on k. The base case k = 1 follows from (6.17). Next, we assume the theorem holds for k − 1 and then prove it also holds for k. Thus, assume
\[ f(x+h) = f(x) + Df(x)h + \frac{D^2 f(x)h^{(2)}}{2!} + \cdots + \frac{D^{k-2} f(x)h^{(k-2)}}{(k-2)!} + R_{k-1}, \]
where
\[ R_{k-1} = \left( \int_0^1 \frac{(1-t)^{k-2}}{(k-2)!}\, D^{k-1} f(x+th)\,dt \right) h^{(k-1)}. \]
Note that
\[
R_{k-1} = \left( \int_0^1 \frac{(1-t)^{k-2}}{(k-2)!} \big(D^{k-1} f(x+th) - D^{k-1} f(x)\big)\,dt \right) h^{(k-1)} + \left( \int_0^1 \frac{(1-t)^{k-2}}{(k-2)!}\, D^{k-1} f(x)\,h^{(k-1)}\,dt \right)
= \left( \int_0^1 \frac{(1-t)^{k-2}}{(k-2)!} \left( \int_0^t D\big(D^{k-1} f(x+uh)\big)h\,du \right) dt \right) h^{(k-1)} + \frac{D^{k-1} f(x)h^{(k-1)}}{(k-2)!} \int_0^1 (1-t)^{k-2}\,dt,
\]
where the first term of the last equality follows from the integral mean value theorem (6.17) and the change of variables formula (6.19).
Simplifying and changing the order of integration gives
\[
R_{k-1} = \frac{D^{k-1} f(x)h^{(k-1)}}{(k-1)!} + \left( \int_0^1 D^k f(x+uh) \left( \int_u^1 \frac{(1-t)^{k-2}}{(k-2)!}\,dt \right) du \right) h^{(k)}
= \frac{D^{k-1} f(x)h^{(k-1)}}{(k-1)!} + \left( \int_0^1 \frac{(1-u)^{k-1}}{(k-1)!}\, D^k f(x+uh)\,du \right) h^{(k)}.
\]
Thus, (6.28) holds, and the proof is complete. ∎

Remark 6.6.10. For each k ~ 0, neglecting the remainder term gives the degree-k
Taylor polynomial approximation of f. If f is smooth and the remainder Rk can
be shown to go to zero as k ~ oo, then the Taylor series converges to f.

Example 6.6.11. Let f : ℝ² → ℝ be given by f(x,y) = e^{x+y}. We find the second-order Taylor polynomial, that is, the Taylor polynomial of degree 2 of f at (0,0). We have
\[ D_1 f(x,y) = D_2 f(x,y) = e^{x+y}. \]
Likewise all second-order partials are equal to e^{x+y}:
\[ D_1^2 f(x,y) = D_1 D_2 f(x,y) = D_2^2 f(x,y) = e^{x+y}. \]
All of these derivative terms evaluate to 1 at 0. Thus, the second-order Taylor approximation of f at 0, evaluated at x = (x,y), is
\[ T_2(\mathbf{x}) = f(\mathbf{0}) + D_{\mathbf{x}} f(\mathbf{0}) + \tfrac{1}{2} D_{\mathbf{x}}^2 f(\mathbf{0}), \]
where f(0) = 1,
\[ D_{\mathbf{x}} f(\mathbf{0}) = x D_1 f(0,0) + y D_2 f(0,0) = x + y, \quad\text{and} \]
\[ D_{\mathbf{x}}^2 f(\mathbf{0}) = x^2 D_1^2 f(0,0) + 2xy D_1 D_2 f(0,0) + y^2 D_2^2 f(0,0) = x^2 + 2xy + y^2. \]
It follows that
\[ T_2(x,y) = 1 + x + y + \tfrac{1}{2}(x+y)^2. \]

Example 6.6.12. Let f : ℝ² → ℝ be given by f(x,y) = cos(x)e^{3y}. The first- and second-order partial derivatives at (0,0) are
\[ f(0,0) = 1, \qquad D_{11} f(0,0) = -\cos(x)e^{3y}\big|_{(0,0)} = -1, \]
\[ D_1 f(0,0) = -\sin(x)e^{3y}\big|_{(0,0)} = 0, \qquad D_{12} f(0,0) = -3\sin(x)e^{3y}\big|_{(0,0)} = 0, \]
\[ D_2 f(0,0) = 3\cos(x)e^{3y}\big|_{(0,0)} = 3, \qquad D_{21} f(0,0) = -3\sin(x)e^{3y}\big|_{(0,0)} = 0, \]
\[ D_{22} f(0,0) = 9\cos(x)e^{3y}\big|_{(0,0)} = 9. \]
Thus, the second-order Taylor polynomial is
\[
T_2(x,y) = f(0,0) + \begin{bmatrix} D_1 f(0,0) & D_2 f(0,0) \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} + \frac12 \begin{bmatrix} x & y \end{bmatrix} \begin{bmatrix} D_{11} f(0,0) & D_{12} f(0,0) \\ D_{21} f(0,0) & D_{22} f(0,0) \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}
= 1 + \begin{bmatrix} 0 & 3 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} + \frac12 \begin{bmatrix} x & y \end{bmatrix} \begin{bmatrix} -1 & 0 \\ 0 & 9 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}
= 1 + 3y - \frac12 x^2 + \frac92 y^2.
\]
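Second-order Taylor polynomials like this one can also be generated symbolically. The following sketch (an added illustration; SymPy is assumed) recovers the same polynomial by expanding f(tx, ty) in t about t = 0 and keeping terms through degree 2.

import sympy as sp

x, y, t = sp.symbols('x y t')
f = sp.cos(x) * sp.exp(3*y)

# Expand f(t*x, t*y) in t, keep terms up to t**2, then set t = 1.
T2 = sp.series(f.subs({x: t*x, y: t*y}), t, 0, 3).removeO().subs(t, 1)
print(sp.expand(T2))    # roughly: 9*y**2/2 - x**2/2 + 3*y + 1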

Example 6.6.13. For general functions z : ℝ² → ℝ, the first-order Taylor polynomial approximation at x₀ = (x₀, y₀) is the plane tangent to the graph of z at (x₀, y₀, z(x₀, y₀)):
\[ z(x,y) \approx z(x_0, y_0) + \frac{\partial z}{\partial x}(x_0, y_0)(x - x_0) + \frac{\partial z}{\partial y}(x_0, y_0)(y - y_0). \quad (6.30) \]

As a corollary to Taylor's theorem, we have the following.

Corollary 6.6.14. If ‖Dᵏf(x + th)‖ < M for all t ∈ [0,1], then the remainder R_k is bounded by
\[ \|R_k\| \le \frac{M}{k!}\,\|h\|^k. \quad (6.31) \]

Proof. The proof is Exercise 6.33. ∎

Exercises
Note to the student: Each section of this chapter has several corresponding
exercises, all collected here at the end of the chapter. The exercises between the
first and second lines are for Section 1, the exercises between the second and third lines are for Section 2, and so forth.
You should work every exercise (your instructor may choose to let you skip some of the advanced exercises marked with *). We have carefully selected them, and each is important for your ability to understand subsequent material. Many of the examples and results proved in the exercises are used again later in the text. Exercises marked with ▲ are especially important and are likely to be used later in this book and beyond. Those marked with † are harder than average, but should still be done.
Although they are gathered together at the end of the chapter, we strongly
recommend you do the exercises for each section as soon as you have completed
the section, rather than saving them until you have finished the entire chapter. The
exercises for each section are separated from those for other sections by a
horizontal line.

6.1. Let f : ℝ² → ℝ be defined by
\[ f(x,y) = \begin{cases} \dfrac{x^3 y}{x^6 + y^2}, & (x,y) \ne (0,0), \\[1ex] 0, & (x,y) = (0,0). \end{cases} \]
Show that the partial derivatives of f exist at (0,0) but are discontinuous there. Hint: To show the discontinuity, consider a sequence of the form (a/n, b/n³).
6. 2. Prove Proposition 6.1.9.
6.3. Let A c IR 2 be an open, path-connected set, and let f : A -+ IR. If for each
a E A we have Dif(a) = D2f(a) = 0, prove that f is constant on A. Hint:
Consider first the case where A= B(x, r) is an open ball.
6.4. Define f: IR 2 -+ IR by f(x,y) = xyex+y .
(i) Find the directional derivative off in the direction u = (~, t) at the
point (1, -1) .
(ii) Find the direction in which f is increasing the fastest at (1, -1).
2
6.5.t Let A c IR be an open set and f : A-+ IR. If t he partial derivatives off
exist and are bounded on A, then prove that f is continuous on A.

6.6. Let f : ℝ² → ℝ be defined by
\[ f(x,y) = \begin{cases} \dfrac{xy^2}{x^2 + y^2}, & (x,y) \ne (0,0), \\[1ex] 0, & (x,y) = (0,0). \end{cases} \]
Show that f is continuous but not differentiable at (0,0).

6.7. Let f : ℝ² → ℝ be defined by
\[ f(x,y) = \begin{cases} \dfrac{xy}{x^2 + y}, & x^2 + y \ne 0, \\[1ex] 0, & x^2 + y = 0. \end{cases} \]
Show that the partial derivatives exist at (0,0) but that f is discontinuous at (0,0), so f cannot be differentiable there.

6.8. Let f : ℝ² → ℝ be defined by
\[ f(x,y) = \begin{cases} \dfrac{x|y|}{\sqrt{x^2 + y^2}}, & (x,y) \ne (0,0), \\[1ex] 0, & (x,y) = (0,0). \end{cases} \]
Show that f has a directional derivative in every direction at (0,0) but is not differentiable there.

6.9. Let f : ℝ² → ℝ be defined by
\[ f(x,y) = \begin{cases} (x^2 + y^2)\sin\Big(\dfrac{1}{\sqrt{x^2 + y^2}}\Big), & (x,y) \ne (0,0), \\[1ex] 0, & (x,y) = (0,0). \end{cases} \]
Show that f is differentiable at (0,0). Then show that the partial derivatives are bounded near (0,0) but are discontinuous there.
6.10. Use the definition of the derivative to prove the claims made in Remark 6.2.6:
(i) Using row vectors to represent (lRm)*, if A E Mnxm(lR) defines a func-
tion g : ]Rn -+ (JRm)* given by g(x) = x TA , then Dg : ]Rn -+ (JRm)* is
the linear transformation v e-+ v TA.
(ii) Using column vectors to represent (JRm)*, prove that the function g is
given by g(x) = AT x and that Dg is the linear transformation v e-+ AT v .

6.11. Let X = C([0,1]; ℝ) with the sup norm.
(i) Fix p ∈ [0,1] and let e_p : X → ℝ be given by e_p(f) = f(p). Prove that e_p is a linear transformation. Use the definition of the derivative to verify that De_p(f)(g) = e_p(g) for any f, g ∈ X.
(ii) Let K : [0,1] × [0,1] → ℝ be a continuous function, and let 𝒦 : X → X be given by 𝒦[f](s) = ∫₀¹ K(s,t)f(t) dt. Verify that 𝒦 is a linear transformation. Use the definition of derivative to verify that D𝒦(f)(g) = 𝒦[g].

These are both special cases of Example 6.3.3, which says that for any linear
transformation L: U-+ Y, the derivative DL(x) is equal to L for every x.
6.12. Let X = C([0,1]; ℝ) with the sup norm, and let the function T : X → ℝ be given by T(f) = ∫₀¹ t²f²(t) dt. Let L : X → ℝ be the function L(g) = ∫₀¹ 2t²f(t)g(t) dt. Prove that L is a linear transformation. Use the definition of the Frechet derivative to prove that DT(f)(g) = L(g) for all f ∈ X.
6.13. Let X be a Banach space and A E @(X) be a bounded linear operator on X .
Let At : JR-+ @(X) be scalar multiplication of A by t, and let E : JR-+ @(X)
be the exponential of At, given by E(t) = eAt (see Example 5.7.2). Use the
definition of the Frechet derivative to prove that
DE(t)(s) = AeAts

for every t, s ER You may assume without proof that E(t + s) = E(t)E(s).
6.14. Prove Corollary 6.3.8.
6.15. Prove Proposition 6.3.11.

6.16. &.
Prove Proposition 6.4.6.
6.17. Let γ(t) be a differentiable curve in ℝⁿ. If there is some differentiable function F : ℝⁿ → ℝ with F(γ(t)) = C constant, show that DF(γ(t))ᵀ is orthogonal to the tangent vector γ′(t).
6.18. Define f : JR 2 -+ JR 2 and g : JR 2 -t JR 2 by

f(x, y) = (sin(y) - x, ex - y) and g(x , y) = (xy, x 2 + y2 ) .


Compute D(g o f)(O, 0).
6.19. In the previous problem, compute D(⟨f, g⟩)(0, 0), where ⟨·,·⟩ is the usual inner product on ℝ².
6.20. Let f : ℝⁿ → ℝ be defined by
\[ f(x) = \|Ax - b\|_2^2, \]
where A is an n × n matrix and b ∈ ℝⁿ. Find Df(x₀).

6.21.*† Let ℒ : ℝ² → ℝ be a bounded integrable function with continuous partial derivative D₂ℒ satisfying the additional condition that for every ε > 0 and for every compact subset K ⊂ ℝ², there is a (uniform) δ > 0 such that if |h| < δ, then for every (t, s) ∈ K we have
\[ |\mathcal{L}(t, s+h) - \mathcal{L}(t, s) - D_2\mathcal{L}(t, s)h| \le \varepsilon|h|. \]
Consider the functional S : L^∞([a,b]; ℝ) → ℝ defined by
\[ S(u) = \int_a^b \mathcal{L}(t, u(t))\,dt. \]
Prove that the Frechet derivative DS(u) is given by
\[ DS(u)(v) = \int_a^b D_2\mathcal{L}(t, u(t))\,v\,dt \quad (6.32) \]
using the following steps:
(i) Prove that for every u ∈ C([a,b]; ℝ) and for every ε > 0 there is a δ > 0 such that for every t ∈ [a,b], if |h(t)| < δ, then we have
\[ |\mathcal{L}(t, u(t)+h(t)) - \mathcal{L}(t, u(t)) - D_2\mathcal{L}(t, u(t))h(t)| \le \varepsilon\|h\|_\infty. \]
(ii) Use properties of the integral (Proposition 5.10.12) to prove that if ‖ξ‖_{L^∞} < δ, then
\[ \Big| S(u+\xi) - S(u) - \int_a^b D_2\mathcal{L}(t, u(t))\,\xi\,dt \Big| \le \varepsilon\|\xi\|(b - a). \]
Conclude by Lemma 6.4.1 that the Frechet derivative of S is given by (6.32).

6.22 . Prove Corollary 6.5.5. Hint: Consider the function g(t) = f(ty + (1 - t)x).
6.23. Prove Corollary 6.5.6. Hint: Consider the function F(t) = ∫_{g(c)}^{t} f(τ) dτ.
(i) First prove the result assuming the chain rule holds even at points where
g(s) = c or g(s) = d.
(ii)* Prove that the chain rule holds for the extension of D(Fog) to points
where g(s) = c or g(s) = d.
6.24. Prove the claims in Remark 6.5.8: Given an open subset U of a finite-
dimensional Banach space and a sequence Un)':=o in C(U; Y), prove the
following:
(i) The sequence Un)':=o is uniformly convergent on compact subsets in U
if and only if the restriction of Un)':=o to the ball B(x, r) is uniformly
convergent for every closed ball B(x, r) in U.
(ii) If U = B(xo, R), then prove that Un)':=o is uniformly convergent on
compact subsets if and only if for every 0 < r < R the restriction of
Un)':=o to the closed ball B(x0 , r) centered at xo converges uniformly.
Hint: For any compact subset K, show that d(K, uc) > 0, and use this
fact to construct a closed ball B(x0 , r) c U containing K.
(iii)* The sequence Un)':=o is Cauchy on U (see Definition 6.5.7) if and only
ifthe restriction of Un)':=o to the ball B(x, r) is Cauchy for every closed
ball B(x,r) in U.
(iv)* If U = B(xo, R), then prove that Un)':=o is Cauchy on U if and only
if for every 0 < r < R the restriction of Un)':=o to B(xo, r) is Cauchy.

6.25. For each integer n ≥ 1, let f_n : [−1, 1] → ℝ be f_n(x) = √(1/n + x²).
(i) Prove that each f_n is differentiable on (−1, 1).
(ii) Prove that (f_n) converges uniformly to f(x) = |x|.
(iii) Prove that f is not differentiable at 0.
(iv) Explain why this does not contradict Theorem 6.5.11.

6.26. For any a > 0, for x in the interval (a, ∞), and for any n ∈ ℕ, show that
\[ \lim_{N\to\infty} \frac{d^n}{dx^n}\left( \frac{1 - e^{-Nx}}{x} \right) = (-1)^n \frac{n!}{x^{n+1}}. \]

6.27. ▲ Let U ⊂ X be the open ball B(x₀, r). Assume that (f_n)_{n=1}^∞ is a sequence in (C¹(U; Y), ‖·‖_{L^∞}) such that the series ∑_{n=0}^∞ Df_n converges absolutely (using the sup norm) on all compact subsets of U. Assume also that ∑_{n=0}^∞ f_n(x₀) converges (as a series in Y). Prove that the sum ∑_{n=0}^∞ f_n converges uniformly on compact subsets in U and that the derivative commutes with the sum:
\[ D\sum_{n=0}^\infty f_n = \sum_{n=0}^\infty Df_n. \]

6.28. Let u : ℝ² → ℝ be twice continuously differentiable. If x = r cos θ and y = r sin θ, show that the Laplacian ∇²u = ∂²u/∂x² + ∂²u/∂y² takes the form
\[ \nabla^2 u = \frac{\partial^2 u}{\partial r^2} + \frac{1}{r}\frac{\partial u}{\partial r} + \frac{1}{r^2}\frac{\partial^2 u}{\partial \theta^2}. \]

6.29 . .&. Prove thatif X is a normed linear space, if U c X is an open subset,


and if f : U -+ JR is a differentiable function that attains a local minimum or
a local maximum at x E U, then D f(x) = 0. This is explored much more in
Volume 2.
6.30. Find the second-order Taylor polynomial for f(x, y) = cos(xy) at the point
(0, 0).
6.31. Find the second-order Taylor polynomial for g(x, y, z) = e2 x+yz at the point
(0, 0, 0).
6.32. Find the second-order Taylor polynomial for f(x,y) = log(l + xsin(y)) at
(0, 0). Now compute the second-order Taylor polynomial of g(t) = log(l + t)
at t = 0 and the first-order Taylor polynomial of r(y) = sin(y) at y = 0
and combine them to get a polynomial for f(x, y) = g(xr(y)). Compare the
results.
6.33. Prove Corollary 6.6.14.
6.34. ▲ Assume U ⊂ ℝⁿ is an open subset and f ∈ C²(U; ℝ). Prove that if x₀ ∈ U is such that Df(x₀) = 0 and D²f(x₀) is positive definite (see Definition 4.5.1), then f attains a local minimum at x₀. You may assume that there is a neighborhood B(x₀, δ) of x₀ where every eigenvalue is greater than ε, and hence vᵀD²f(x)v > ε‖v‖² for all v and for all x in B(x₀, δ) (this follows from the implicit function theorem, which is proved in the next chapter). Hint: Consider using Taylor's theorem.
Contraction Mappings
and Applications

There are fixed points through time where things must always stay the way they are.
This is not one of them. This is an opportunity.
-Dr. Who

Fixed-point theorems are among the most powerful tools in mathematical anal-
ysis. They are found in nearly every area of pure and applied mathematics. In
Corollary 5.9.16, we saw a very simple example of a fixed-point theorem, namely,
if a map f : [a, b] -+ [a, b] is continuous, then it has a fixed point. This result
generalizes to higher dimensions in what is called the Brouwer fixed-point theorem,
which states that a continuous map on the closed unit ball D(O, 1) C lFn into itself
has a fixed point. The Brouwer fixed-point theorem can be generalized further to
infinite dimensions by the Leray- Schauder fixed-point theorem, which is important
in both functional analysis and partial differential equations.
Most fixed-point theorems only say when a fixed point exists, and do not give
any additional information on how to find a fixed point or how many there are. In
this chapter, we study the contraction mapping principle, which gives conditions
guaranteeing both the existence and uniqueness of a fixed point. Moreover, it also
provides a way to actually compute the fixed point using the method of successive
approximations. As a result , the contraction mapping principle is widely used in
pure and applied mathematics and is the basis of many computational algorithms.
Chief among these are the Newton family of algorithms, which includes Newton's
method for finding zeros of a function, and several close cousins called quasi-Newton
methods. These numerical methods are ubiquitous in applications , particularly in
optimization problems and inverse problems.
The contraction mapping principle also gives two very important theorems:
the implicit function theorem and the inverse function theorem. Given a sys-
tem of equations, these two theorems give straightforward criteria that guarantee
the existence of a function (an implicit function) solving the system. They also
allow us to differentiate these implicit functions without ever explicitly writing the
functions down. These two theorems are essential tools in differential geometry, op-
timization, and differential equations, as well as in applications like economics and
physics.


7.1 Contraction Mapping Principle


In this section, we prove the contraction mapping principle and provide several
examples of its use in applications.

Definition 7.1.1. A point xE X is a fixed point of the function f : X -+ X if


f(x) = x..

Example 7.1.2. Let f : [O, 1] -+ [O, 1] be defined by f(x) = xn for some


positive n; then 0 and 1 are both fixed points of f.

Our first theorem tells us that a class of functions called contraction mappings
always have a unique fixed point.

D efinition 7 .1.3. Assume Dis a subset of a normed linear space (X, II· II) . The
function f : D -+ D is a contraction mapping if there exists 0 ::; k < 1 such that

llf(x) - f(y)ll ::; kllx - Yll for all x, y ED. (7.1)

Remark 7.1.4. It is easy to see that contraction mappings are continuous. In fact,
they are Lipschitz continuous with constant k (see Example 5.2.2(iii)).

Example 7.1.5. Consider the mapping f : ℝ → ℝ given by f(x) = x/2 + 10. For any x, y ∈ ℝ we have
\[ |f(x) - f(y)| = \Big| \frac{x}{2} + 10 - \frac{y}{2} - 10 \Big| = \frac12 |x - y|. \]
Thus, f(x) is a contraction mapping with contraction factor k = ½ and fixed point x = 20.

Ex ample 7 .1.6. Letµ be a positive real number, and let T: [O, 1] -t [O, 1]
be defined by T(x) = 4µx(l - x). This function is important in population
dynamics. For any x, y E [O, 1], we have -1::; (1 - x -y)::; 1, and therefore

IT(x) -T(y)I = 4µlx(l - x) - y(l - y)I = 4µlx - y - (x 2 - y2 )1


= 4µlx - Ylll - x - YI ::; 4µlx - YI·

If 4µ < 1, then Tis a contraction mapping.



Unexample 7.1.7. Let f: (0, 1)-+ (0, 1) be given by f(x) = x 2 . For any x
we have lf(x) - f(O)I = lx 21<Ix - OI; but f is not a contraction mapping on
(0, 1) because for any distinct x, y E (1/2, 1) we have lf(x) - f(y)I = lx 2-y 21=
Ix - YI Ix +YI > Ix - YI·

Theorem 7.1.8 (Contraction Mapping Principle). Assume D is a nonempty closed subset of a Banach space (X, ‖·‖). If f : D → D is a contraction mapping, then there exists a unique fixed point x ∈ D of f.

Proof. Let x₀ ∈ D. Iteratively define the sequence (x_n)_{n=0}^∞ by the rule
\[ x_{n+1} = f(x_n). \quad (7.2) \]
We first prove that the sequence is Cauchy. Since f is a contraction on D, say with constant k, it follows that
\[ \|x_n - x_{n-1}\| = \|f(x_{n-1}) - f(x_{n-2})\| \le k\|x_{n-1} - x_{n-2}\| \le \cdots \le k^{n-1}\|x_1 - x_0\|. \]
Hence,
\[ \|x_n - x_m\| \le \|x_n - x_{n-1}\| + \cdots + \|x_{m+1} - x_m\| \le k^{n-1}\|x_1 - x_0\| + \cdots + k^m\|x_1 - x_0\| = k^m(1 + k + \cdots + k^{n-m-1})\|x_1 - x_0\| \le \frac{k^m}{1-k}\|x_1 - x_0\|. \quad (7.3) \]
Given ε > 0, we can choose N such that (k^N/(1−k))‖x₁ − x₀‖ < ε. It follows that ‖x_n − x_m‖ < ε whenever n > m ≥ N. Therefore, the sequence (x_n)_{n=0}^∞ is Cauchy. Since X is complete and D is closed, the sequence converges to some x ∈ D.
To prove that f(x) = x, let ε > 0 and choose N > 0 so that ‖x − x_n‖ < ε/2 whenever n ≥ N. Thus,
\[ \|f(x) - x\| \le \|f(x) - f(x_n)\| + \|x - f(x_n)\| \le k\|x - x_n\| + \|x - x_{n+1}\| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon. \]
Since ε is arbitrary, we have f(x) = x.
To show uniqueness, suppose y ∈ D is some other fixed point of f. Then
\[ \|x - y\| = \|f(x) - f(y)\| \le k\|x - y\| < \|x - y\|, \]
which is a contradiction. ∎

Remark 7.1.9. The contraction mapping principle can be proved for complete
metric spaces instead of just Banach spaces without any extra work. Simply change
each occurrence of l x -y l to d(x,y) in the proof above.

Remark 7.1.10. The proof of the contraction mapping principle above gives an algorithm for finding the unique fixed point, given by (7.2). This is called the method of successive approximations. To use it, pick any initial guess x₀ ∈ D, and the sequence f(x₀), f²(x₀), … converges to the fixed point. Taking the limit of (7.3) as n → ∞ shows that the error of the mth approximation f^m(x₀) is at most (k^m/(1−k))‖x₁ − x₀‖.

Vista 7 .1.11. The contraction mapping principle plays a key role in


Blackwell's theorem, which is fundamental to solving sequential decision-
making problems (dynamic optimization). We discuss this application in
Volume 2. The contraction mapping principle is also important in the proof of
existence and uniqueness of solutions to ordinary differential equations. This
is covered in Volume 4.

Example 7.1.12. We can use the method of successive approximations to compute positive square roots of numbers greater than 1. If $c^2 = b > 1$, then $c$ is a fixed point of the mapping $f(x) = \frac{1}{2}\left(x + \frac{b}{x}\right)$. The function $f$ is not a contraction mapping on all of $\mathbb{R}$, but the interval $[\sqrt{b/2}, \infty)$ is closed in $\mathbb{R}$ and $f$ maps $[\sqrt{b/2}, \infty)$ to itself. To see that $f$ is a contraction mapping on $[\sqrt{b/2}, \infty)$, we compute
\[ |f(x) - f(y)| = \frac{1}{2}\left| x - y + \frac{b}{x} - \frac{b}{y} \right| \le \frac{1}{2}|x - y|\left| 1 - \frac{b}{xy} \right|. \]
So when the domain of $f$ is restricted to $[\sqrt{b/2}, \infty)$ we have
\[ |f(x) - f(y)| \le \frac{1}{2}|x - y|\left| 1 - \frac{b}{b/2} \right| = \frac{1}{2}|x - y|. \]
Therefore $f$ is a contraction mapping on this domain with $\sqrt{b}$ as its fixed point. Starting with any point in $[\sqrt{b/2}, \infty)$, the method of successive approximations converges (fairly rapidly) to $\sqrt{b}$.
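As a quick numerical check of this example, the following sketch (our own illustration) iterates $f(x) = \frac{1}{2}\left(x + \frac{b}{x}\right)$ from a starting point in $[\sqrt{b/2}, \infty)$:

```python
def sqrt_by_iteration(b, iters=30):
    """Successive approximations for f(x) = (x + b/x)/2, whose fixed point is sqrt(b)."""
    x = max(b, 1.0)        # any starting point >= sqrt(b/2) works; b itself does for b > 1
    for _ in range(iters):
        x = 0.5 * (x + b / x)
    return x

print(sqrt_by_iteration(2.0))   # ~1.4142135623730951
```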

Example 7.1.13. Recall that $(C([a, b]; \mathbb{F}), \|\cdot\|_{L^\infty})$ is a Banach space. Consider the operator $L : C([a, b]; \mathbb{F}) \to C([a, b]; \mathbb{F})$, given by
\[ L[f](x) = \lambda \int_a^b K(x, y) f(y)\, dy \]
on $C([a, b]; \mathbb{F})$, where $K$ is continuous and satisfies $|K(x, y)| \le M$ on the domain $[a, b] \times [a, b]$.

This operator $L$ is called a Fredholm integral transform with kernel$^a$ $K$, and it arises in many applications. For example, if $K(x, y) = e^{-ixy}$ and $a = 0$ and $b = 2\pi$, then $L$ is the Fourier transform.

The map $L$ is a contraction for sufficiently small $|\lambda|$. More specifically, we have
\[ \|L[f] - L[g]\|_{L^\infty} = \|L[f - g]\|_{L^\infty} = \left\| \lambda \int_a^b K(x, y)(f(y) - g(y))\, dy \right\|_{L^\infty} \le |\lambda| M (b - a) \|f - g\|_{L^\infty}. \]
Thus, if $|\lambda| < \frac{1}{M(b - a)}$, then $L$ is a contraction on $C([a, b]; \mathbb{F})$, and there exists a unique function $f(x) \in C([a, b]; \mathbb{F})$ such that
\[ f(x) = \lambda \int_a^b K(x, y) f(y)\, dy. \]

$^a$The word kernel here has nothing to do with kernels of linear transformations nor with kernels of group homomorphisms. It is unfortunate that standard mathematical usage has given the same word two entirely different meanings, but the meaning is usually clear from the context.
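The same Lipschitz estimate shows that the affine map $f \mapsto g + L[f]$ is a contraction whenever $|\lambda| < \frac{1}{M(b-a)}$, so the inhomogeneous equation $f(x) = g(x) + \lambda\int_a^b K(x, y)f(y)\,dy$ can also be solved by successive approximations. The sketch below is our own illustration (the kernel, the function $g$, and $\lambda$ are arbitrary choices satisfying the bound); it discretizes the integral with the trapezoidal rule.

```python
import numpy as np

a, b, n = 0.0, 1.0, 200
x = np.linspace(a, b, n)
w = np.full(n, (b - a) / (n - 1)); w[0] *= 0.5; w[-1] *= 0.5   # trapezoidal weights

K = np.exp(-np.abs(x[:, None] - x[None, :]))   # kernel with |K| <= M = 1
g = np.sin(np.pi * x)
lam = 0.4                                      # |lam| * M * (b - a) = 0.4 < 1

f = np.zeros(n)
for _ in range(100):                           # successive approximations
    f = g + lam * K @ (w * f)

# residual of the discretized equation should be essentially zero
print(np.max(np.abs(f - (g + lam * K @ (w * f)))))
```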

It sometimes happens that a function $f$ is not a contraction mapping, but some iterate $f^n$ is a contraction mapping. In this case, $f$ still has a unique fixed point.

Theorem 7.1.14. Assume $D$ is a nonempty closed subset of the Banach space $(X, \|\cdot\|)$. If $f : D \to D$ and $f^n$ is a contraction mapping for some positive integer $n$, then there exists a unique fixed point $x \in D$ of $f$.

Proof. The proof is Exercise 7.6. □

7.2 Uniform Contraction Mapping Principle


We now generalize the results of the previous section to the uniform contraction
mapping principle. This is one of the more challenging theorems in the text, but it
is also very powerful. In Section 7.4, we show that the implicit and inverse function
theorems are just corollaries of the uniform contraction mapping principle.

Definition 7.2.1. If $D$ is a nonempty subset of a normed linear space $(X, \|\cdot\|_X)$ and $B$ is some arbitrary set, then the function $f : D \times B \to D$ is called a uniform contraction mapping if there exists $0 \le \lambda < 1$ such that
\[ \|f(x_2, b) - f(x_1, b)\|_X \le \lambda\|x_2 - x_1\|_X \tag{7.4} \]
for all $x_1, x_2 \in D$ and all $b \in B$.

Example 7.2.2. Example 7.1.5 can be generalized to a uniform contraction mapping. Let $g : \mathbb{R} \times \mathbb{N} \to \mathbb{R}$ be given by $g(x, b) = \frac{x}{2} + b$. For $x_1$ and $x_2$ in $\mathbb{R}$ we have
\[ |g(x_1, b) - g(x_2, b)| = \left| \frac{x_1}{2} - \frac{x_2}{2} \right| = \frac{1}{2}|x_1 - x_2|. \]
Hence, $g$ is a uniform contraction mapping with $\lambda = \frac{1}{2}$. Although $\lambda$ is independent of $b$, the fixed point is not. For example, when $b = 10$ the fixed point is 20, but when $b = 20$ the fixed point is 40.

Example 7.2.3. Let $B = D = [1, 2]$, and let $f : D \times B \to D$ be given by $f(x, b) = \frac{1}{2}\left(x + \frac{b}{x}\right)$. The reader should check that $f(x, b) \in D$ for all $(x, b) \in D \times B$. Using arguments similar to those in Example 7.1.12, we compute for any $b \in B$ and any $x_1, x_2 \in D$ that
\[ |f(x_1, b) - f(x_2, b)| = \frac{1}{2}|x_1 - x_2|\left| 1 - \frac{b}{x_1 x_2} \right| \le \frac{1}{2}|x_1 - x_2|. \]
Therefore, $f$ is a uniform contraction mapping on $D \times B$.

7.2.1 The Uniform Contraction Mapping Principle


A uniform contraction f : D x B --+ D can be thought of as a family of contraction
mappings-one for each b E B . Each one of these contractions has a unique fixed
point, so this gives a map g : B --t D , sending b to the unique fixed point of
the corresponding contraction. The uniform contraction mapping principle gives
conditions that guarantee the function g is differentiable. This is useful because
it gives us a way to construct new differentiable functions with various desirable
properties.

Theorem 7.2.4 (Uniform Contraction Mapping Principle). Assume that $(X, \|\cdot\|_X)$ and $(Y, \|\cdot\|_Y)$ are Banach spaces, $U \subset X$ and $V \subset Y$ are open, and the function $f : U \times V \to U$ is a uniform contraction with constant $0 \le \lambda < 1$. Define a function $g : V \to U$ that sends each $y \in V$ to the unique fixed point of the contraction $f(\cdot, y)$. If $f \in C^k(U \times V; U)$ for some $k \in \mathbb{N}$, then $g \in C^k(V; U)$.

Remark 7.2.5. Although we have only defined derivatives on open subsets of Banach spaces, the theorem requires $f$ to be $C^k$ on the set $U \times V$, which is not open. For our purposes here, derivatives and differentiability at a point $x$ that is not interior to the domain mean that the limit (6.7) defining the derivative is only taken with respect to those $h$ with $x + h$ in the domain.

In order to prove the uniform contraction mapping principle we first establish several lemmata. The rough idea is to first prove continuity and differentiability of $g$, and then induct on $k$. Continuity of $g$ and its derivative (if it exists) is the easy part of the proof, and the existence of the derivative is much harder. We begin by observing that the definition of $g$ gives
\[ g(y) = f(g(y), y) \tag{7.5} \]
for every $y \in V$.

Lemma 7.2.6. Let $X$, $Y$, $U$, $V$, $f$, and $g$ be as in the hypothesis of Theorem 7.2.4 for $k = 0$. If $f$ is continuous at each $(g(y), y) \in U \times V$, then $g$ is continuous at each $y \in V$.

Proof. Given $\varepsilon > 0$, choose $B(y, \delta) \subset V$ so that
\[ \|f(g(y), y + k) - f(g(y), y)\|_X < (1 - \lambda)\varepsilon \]
whenever $\|k\|_Y < \delta$. Thus,
\[ \|g(y + k) - g(y)\|_X = \|f(g(y + k), y + k) - f(g(y), y)\|_X \le \|f(g(y + k), y + k) - f(g(y), y + k)\|_X + \|f(g(y), y + k) - f(g(y), y)\|_X < \lambda\|g(y + k) - g(y)\|_X + (1 - \lambda)\varepsilon, \]
which implies that $\|g(y + k) - g(y)\|_X < \varepsilon$ whenever $\|k\|_Y < \delta$. Therefore, $g$ is continuous at $y$. □

Lemma 7.2.7. Let $X$, $Y$, $U$, $V$, $f$, and $g$ be as in the hypothesis of Theorem 7.2.4 for $k = 1$. If $f$ is $C^1$ at each $(g(y), y) \in U \times V$, then the function $\phi : \mathscr{B}(Y; X) \times V \to \mathscr{B}(Y; X)$ defined by
\[ \phi(A, y) = D_1 f(g(y), y)A + D_2 f(g(y), y) \tag{7.6} \]
is a uniform contraction.

Proof. By Proposition 6.3.10, we have that $\|D_1 f(g(y), y)\| \le \lambda$ for each $y \in V$, where $\|\cdot\|$ is the induced norm,$^{41}$ and so
\[ \|\phi(A_2, y) - \phi(A_1, y)\| = \|D_1 f(g(y), y)A_2 + D_2 f(g(y), y) - D_1 f(g(y), y)A_1 - D_2 f(g(y), y)\| \le \|D_1 f(g(y), y)\|\|A_2 - A_1\| \le \lambda\|A_2 - A_1\|. \quad \Box \]

$^{41}$Note that the conclusion of Proposition 6.3.10 still holds for $f \in C^1(\bar{U} \times V)$.

Remark 7.2.8. The previous lemma shows that for each $y \in V$, there exists a unique fixed point $Z(y) \in \mathscr{B}(Y; X)$ satisfying
\[ Z(y) = D_1 f(g(y), y)Z(y) + D_2 f(g(y), y). \tag{7.7} \]
Moreover, $Z$ is continuous by Lemma 7.2.6.

Lemma 7.2.9. Let $X$, $Y$, $U$, $V$, $f$, and $g$ be as in the hypothesis of Theorem 7.2.4. If $f$ is $C^1$ at each $(g(y), y) \in U \times V$, then $g$ is $C^1$ at each $y \in V$.

The proof of this lemma is hard. It is given in Section 7.2.2.

We are now finally ready to prove the uniform contraction mapping principle (Theorem 7.2.4).

Proof of Theorem 7.2.4. By Lemma 7.2.9, if $f \in C^1(U \times V; U)$, then $g \in C^1(V; U)$. We assume that the theorem holds for $n = k - 1 \in \mathbb{N}$ and prove by induction that the theorem holds for $n = k$. If $f$ is $C^k$, then $g$ is at least $C^{k-1}$. Since $Dg(y)$ satisfies (7.7), and $\phi$ is a $C^{k-1}$ uniform contraction, we have that $Dg(y)$ is also $C^{k-1}$ by the inductive hypothesis. Hence, $g$ is $C^k$. □

7.2.2 *Proof of Lemma 7.2.9

Proof. If we knew that $Dg$ existed, then for each $y \in V$ we would have $Dg(y) \in \mathscr{B}(Y; X)$, and applying the chain rule to (7.5) would give
\[ Dg(y) = D_1 f(g(y), y)Dg(y) + D_2 f(g(y), y). \tag{7.8} \]
That means that $Dg(y)$ would be a fixed point of the function $\phi : \mathscr{B}(Y; X) \times V \to \mathscr{B}(Y; X)$ defined in Lemma 7.2.7. By Lemma 7.2.7, the map $\phi$ is a uniform contraction mapping, and so there exists a function $Z : V \to \mathscr{B}(Y; X)$, defined by setting $Z(y)$ equal to the unique fixed point of $\phi(\cdot, y)$. By Lemma 7.2.6 the map $Z$ is continuous; therefore, all that remains is to show that $Z(y) = Dg(y)$ for each $y$. That is, for every $\varepsilon > 0$ we must show there exists $B(y, \delta) \subset V$ such that if $\|k\|_Y < \delta$, then
\[ \|g(y + k) - g(y) - Z(y)k\|_X \le \varepsilon\|k\|_Y. \tag{7.9} \]
We now prove (7.9). For any $h \in X$ and $k \in Y$, let $\Delta(h, k)$ be given by
\[ \Delta(h, k) = f(g(y) + h, y + k) - f(g(y), y) - D_1 f(g(y), y)h - D_2 f(g(y), y)k. \tag{7.10} \]
If $f$ is $C^1$, then for all $\eta > 0$, there exists $B(y, \delta_0) \subset V$ such that
\[ \|\Delta(h, k)\|_X \le \eta(\|h\|_X + \|k\|_Y) \tag{7.11} \]
whenever $\|k\|_Y < \delta_0$ and $\|h\|_X < \delta_0$ is such that $g(y) + h \in U$.

The expression to control is $\|g(y + k) - g(y) - Z(y)k\|_X$, so to connect this to (7.11), we choose a very specific form for $h$ as a function of $k$ by setting $h(k) = g(y + k) - g(y)$. Note that since $g(y + k) \in U$, this choice of $h(k)$ satisfies the requirement $g(y) + h(k) \in U$ for (7.11) to hold. Since $g$ is continuous, the function $h$ is continuous, and thus there exists $B(y, \delta) \subset B(y, \delta_0) \subset V$ such that $\|h(k)\|_X \le \delta_0$ whenever $\|k\|_Y < \delta$. Moreover, we have
\[ h(k) = f(g(y + k), y + k) - f(g(y), y) = f(g(y) + h(k), y + k) - f(g(y), y). \]
Combining this with (7.10), we have
\[ h(k) = D_1 f(g(y), y)h(k) + D_2 f(g(y), y)k + \Delta(h(k), k), \]
which yields
\[ \|h(k)\|_X \le \|D_1 f(g(y), y)\|\|h(k)\|_X + \|D_2 f(g(y), y)\|\|k\|_Y + \eta(\|h(k)\|_X + \|k\|_Y). \]
Choose $\eta \le \frac{1 - \lambda}{2}$, and recall that $\|D_1 f(g(y), y)\| \le \lambda$. Combining these, we find
\[ \|h(k)\|_X \le \frac{2\|D_2 f(g(y), y)\| + 1 - \lambda}{1 - \lambda}\|k\|_Y. \]
Setting $M = \frac{2\|D_2 f(g(y), y)\| + 1 - \lambda}{1 - \lambda}$ and simplifying (7.11) yields
\[ \|\Delta(h(k), k)\|_X < \eta(M + 1)\|k\|_Y \tag{7.12} \]
whenever $\|k\|_Y \le \delta$. Moreover, from (7.7), we have
\[ \Delta(h(k), k) = h(k) - D_1 f(g(y), y)h(k) - D_2 f(g(y), y)k = h(k) - D_1 f(g(y), y)h(k) - Z(y)k + D_1 f(g(y), y)Z(y)k = (I - D_1 f(g(y), y))(h(k) - Z(y)k). \]
Since $\|D_1 f(g(y), y)\| \le \lambda < 1$, we have that $I - D_1 f(g(y), y)$ is invertible (see Proposition 5.7.4). This, combined with (7.12), yields
\[ \|g(y + k) - g(y) - Z(y)k\|_X = \|h(k) - Z(y)k\|_X = \|(I - D_1 f(g(y), y))^{-1}\Delta(h(k), k)\|_X \le \|(I - D_1 f(g(y), y))^{-1}\|\|\Delta(h(k), k)\|_X < \frac{\eta(M + 1)}{1 - \lambda}\|k\|_Y \]
whenever $\|k\|_Y < \delta$. Finally, setting $\eta = \min\left\{\frac{\varepsilon(1 - \lambda)}{M + 1}, \frac{1 - \lambda}{2}\right\}$ gives (7.9). □

7.3 Newton's Method


Finding zeros of a function is an essential step for solving many problems in both
pure and applied mathematics. Some of the fastest and most widely used methods
for solving these sorts of problems are Newton's method and its variants. Newton's
method is a fundamental tool in many important algorithms; indeed, even many

algorithms that are not obviously built from Newton's method really have Newton's
method at their core, once you look carefully [HRW12, Tap09] .
The idea of Newton's method is simple: if finding a zero of a function is
difficult, replace the function with a simpler approximation whose zeros are eas-
ier to find. Zeros of linear functions are easy to find, so the obvious choice is a linear approximation. Given a differentiable function $f : X \to X$, the best linear approximation to $f$ at $x_n$ is the function
\[ L(x) = f(x_n) + Df(x_n)(x - x_n). \]
Assuming that $Df(x_n)$ is invertible, this linear equation has a unique zero at
\[ x_{n+1} = x_n - Df(x_n)^{-1} f(x_n). \tag{7.13} \]

So $x_{n+1}$ should be a better approximation of a zero of $f$ than $x_n$ was. Starting at


any x 0 and repeating for each n E N gives a sequence x 0 , x 1 , ... that often (but not
always) converges to a zero of f. Moreover, if xo is chosen well, the convergence
can be very fast . See Figure 7.1 for an illustration.
In this section we give some conditions that guarantee both that the method
converges and that the convergence is rapid. We first prove some of these results
in one dimension, and then we generalize to higher dimensions and even Banach
spaces.

7.3.1 Convergence
Before treating Newton's method and its variants, we need to make a brief digression
into convergence rates. These are discussed much more in Volume 2.
An iterative process produces a sequence (xn)~=O of approximations. If we
are approximating x, we expect the sequence to converge to x. Better algorithms
produce a sequence that converges more rapidly.

Definition 7.3.1. Given a sequence $(x_n)_{n=0}^{\infty}$ approximating $x$, denote the error of the $n$th approximation by
\[ \varepsilon_n := \|x_n - x\|. \]
The sequence is said to converge linearly with rate $\mu < 1$ if
\[ \varepsilon_{n+1} \le \mu\,\varepsilon_n \]
for each $n \in \mathbb{N}$. The sequence is said to converge quadratically when there exists a constant $k \ge 0$ (not necessarily less than 1) such that
\[ \varepsilon_{n+1} \le k\,\varepsilon_n^2 \]
for all $n \in \mathbb{N}$.

If a sequence of real numbers converges linearly with rate $\mu$, then with each iteration the approximation adds about $-\log_{10}\mu$ digits of accuracy. Quadratic convergence means that with each iteration the approximation roughly doubles the number of digits of accuracy. This is much better than linear convergence.

The quasi-Newton method of Section 7.3.3 converges linearly. In Theorems 7.3.4 and 7.3.12 we show that Newton's method converges quadratically.

7.3.2 Newton's Method: Scalar Version


To prove that Newton's method converges, we construct a contraction mapping.

Lemma 7.3.2. Let $f : [a, b] \to \mathbb{R}$ be $C^2$ and assume that for some $\overline{x} \in (a, b)$ we have $f(\overline{x}) = 0$ and $f'(\overline{x}) \ne 0$. Under these hypotheses there exists $\delta > 0$ such that the map
\[ \phi(x) = x - \frac{f(x)}{f'(x)} \]
is a contraction on $[\overline{x} - \delta, \overline{x} + \delta] \subset [a, b]$.

Proof. Choose $\delta > 0$ so that $[\overline{x} - \delta, \overline{x} + \delta] \subset [a, b]$. Because $f$ is $C^2$ we can shrink $\delta$ so that for all $x \in [\overline{x} - \delta, \overline{x} + \delta] \subset [a, b]$ we have $f'(x) \ne 0$ and
\[ k = \sup_{x \in [\overline{x} - \delta, \overline{x} + \delta]} \left| \frac{f(x)f''(x)}{f'(x)^2} \right| < 1. \]
By the mean value theorem, for any $[x, y] \subset [\overline{x} - \delta, \overline{x} + \delta]$ there exists $c \in [x, y]$ such that
\[ |\phi(x) - \phi(y)| = |\phi'(c)||x - y| = \left| \frac{f(c)f''(c)}{f'(c)^2} \right||x - y| \le k|x - y|. \]
Therefore, $\phi$ is a contraction on $[\overline{x} - \delta, \overline{x} + \delta]$. □

Remark 7.3.3. We know that $\phi([\overline{x} - \delta, \overline{x} + \delta]) \subset [\overline{x} - \delta, \overline{x} + \delta]$ because $\phi$ is a contraction and $\overline{x}$ is a fixed point. Thus, for every $x \in [\overline{x} - \delta, \overline{x} + \delta]$,
\[ |\phi(x) - \overline{x}| = |\phi(x) - \phi(\overline{x})| \le k|x - \overline{x}| \le k\delta < \delta. \]

Theorem 7.3.4 (Newton's Method-Scalar Version). If $f : [a, b] \to \mathbb{R}$ satisfies the hypotheses of the lemma above, then the iterative map
\[ x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)} \quad \text{for all } n \in \mathbb{N} \tag{7.14} \]
converges to $\overline{x}$ quadratically whenever $x_0$ is sufficiently close to $\overline{x}$.

Proof. Since $f$ is $C^2$ the derivative $f'$ is locally Lipschitz at the point $\overline{x}$ with some constant $L$ (see Proposition 6.3.7), so there exists a $\delta_1$ such that $|f'(\overline{x} + \varepsilon) - f'(\overline{x})| < L|\varepsilon|$ whenever $|\varepsilon| < \delta_1$. Let $\delta < \delta_1$ be chosen as in the previous lemma. Choose any initial $x_0 \in [\overline{x} - \delta, \overline{x} + \delta]$ and iterate. By the lemma, the sequence must converge to $\overline{x}$.

Figure 7.1. Newton's method takes the tangent line (red) to the curve $y = f(x)$ at the point $(x_n, f(x_n))$ and defines $x_{n+1}$ to be the $x$-intercept of that line. Details are given in Theorem 7.3.4.

Let $\varepsilon_n = x_n - \overline{x}$ for each $n \in \mathbb{N}$. By the mean value theorem $f(\overline{x} + \varepsilon_{n-1}) = f(\overline{x}) + f'(\overline{x} + \eta\varepsilon_{n-1})\varepsilon_{n-1}$ for some $\eta \in [0, 1]$ (convince yourself that this still holds if $\varepsilon_{n-1} < 0$). Thus, from (7.14) we have
\[ |\varepsilon_n| = \left| \varepsilon_{n-1} - \frac{f(\overline{x} + \varepsilon_{n-1})}{f'(\overline{x} + \varepsilon_{n-1})} \right| = \left| \frac{f'(\overline{x} + \varepsilon_{n-1})\varepsilon_{n-1} - f(\overline{x} + \varepsilon_{n-1})}{f'(\overline{x} + \varepsilon_{n-1})} \right| \le \left| \frac{f'(\overline{x} + \varepsilon_{n-1}) - f'(\overline{x} + \eta\varepsilon_{n-1})}{f'(\overline{x} + \varepsilon_{n-1})} \right||\varepsilon_{n-1}| \le \frac{L(1 - \eta)|\varepsilon_{n-1}|}{|f'(\overline{x} + \varepsilon_{n-1})|}|\varepsilon_{n-1}| \le M|\varepsilon_{n-1}|^2, \]
where $M = \sup_{x \in [\overline{x} - \delta, \overline{x} + \delta]} \frac{L}{|f'(x)|}$. □

Example 7.3.5. Applying Newton's method to the function $f(x) = x^2 - a$ gives the following fast algorithm for computing $\sqrt{a}$:
\[ x_n = x_{n-1} + \frac{a - x_{n-1}^2}{2x_{n-1}}. \tag{7.15} \]
This is derived in Exercise 7.13. Using $x_0 = 1.0$ and $a = 4$ we get 14 digits of accuracy in five iterations:
\[ \begin{aligned} x_1 &= 2.500000000000000, \\ x_2 &= 2.050000000000000, \\ x_3 &= 2.000609756097561, \\ x_4 &= 2.000000092922295, \\ x_5 &= 2.000000000000002. \end{aligned} \]
Notice how quickly this sequence converges to 2.
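A direct implementation of (7.15) reproduces these iterates; the following sketch is our own illustration.

```python
def newton_sqrt(a, x0=1.0, iters=5):
    """Newton's method for f(x) = x**2 - a; each step roughly doubles the digits."""
    x = x0
    history = [x]
    for _ in range(iters):
        x = x + (a - x * x) / (2 * x)
        history.append(x)
    return history

for xk in newton_sqrt(4.0):
    print(f"{xk:.15f}")
```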

Remark 7.3.6. If the derivative of $f$ vanishes at $\overline{x}$, then Newton's method is not necessarily quadratic and may not even converge at all! If $f'(\overline{x}) = 0$, we say that $f$ has a multiple zero at $\overline{x}$. For example, if $f$ is a polynomial and $f'(\overline{x}) = 0$, then $f$ has a factor (over $\mathbb{C}$) of the form $(x - \overline{x})^2$.

Example 7.3.7.* Using Newton's method on the function $f(x) = x^3 - a$ gives the following fast algorithm for computing $\sqrt[3]{a}$:
\[ x_n = x_{n-1} + \frac{a - x_{n-1}^3}{3x_{n-1}^2}. \tag{7.16} \]
Using $a = 1729.03$ and $x_0 = 12$, we get 15 digits of accuracy after three iterations:
\[ \begin{aligned} x_1 &= 12.002384259259259, \\ x_2 &= 12.002383785691737, \\ x_3 &= 12.002383785691718. \end{aligned} \]

In Richard Feynman's book Surely You're Joking, Mr. Feynman! [FLH85], he tells a story of an abacus master who challenges him to a race to solve various arithmetic problems. The man was using his abacus, and Feynman was using pen and paper. After easily beating Feynman in various multiplication problems, the abacus master challenged him to a division problem, which turned out a tie. Frustrated that he had not won the division contest, the abacus master challenged Feynman to find the cube root of 1729.03. This was a mistake, because computing cube roots on an abacus is hard work, but Feynman was an expert in algorithms, and Newton's method was in his arsenal. He also knew that $12^3 = 1728$ since there are 1728 cubic inches in a cubic foot. Then using (7.16), he carried out the following estimate:
\[ \sqrt[3]{1729.03} \approx 12 + \frac{1729.03 - 1728}{3 \cdot 12^2} = 12 + \frac{1.03}{432} \approx 12.002. \tag{7.17} \]
Feynman won this last contest easily, finding the answer to three decimal places before the abacus master could find one.

The algorithm (7.16) would have allowed Feynman to compute the cube root to 5 decimals of accuracy in a single iteration had he computed the fraction in (7.17) to more decimal places instead of his quick estimation. In three iterations, he would have been able to get to 15 digits of accuracy. That would have taken forever with an abacus!

Remark 7.3.8. It is essential in Newton's method (7.14) that the initial point xo
be sufficiently close to the zero. If it is not close enough, it is possible that the
sequence will bounce around and never converge, or even go to infinity.

Unexample 7.3.9. Using Newton's method on the function $f(x) = x^{1/3}$, we have
\[ x_n = x_{n-1} - \frac{x_{n-1}^{1/3}}{\tfrac{1}{3}x_{n-1}^{-2/3}} = -2x_{n-1}. \]
The initial guess $x_0 = 1$ gives the sequence $1, -2, 4, -8, 16, \ldots$. Clearly this sequence does not converge to the zero of $x^{1/3}$.
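A generic Newton iteration makes this divergence easy to observe numerically. The sketch below is our own illustration (the real cube root is coded by hand so that negative iterates are handled correctly).

```python
import math

def cbrt(x):
    """Real cube root, valid for negative inputs."""
    return math.copysign(abs(x) ** (1.0 / 3.0), x)

def newton(f, df, x0, iters=6):
    x, seq = x0, [x0]
    for _ in range(iters):
        x = x - f(x) / df(x)
        seq.append(x)
    return seq

# Newton's method on f(x) = x**(1/3): each step satisfies x_n = -2*x_{n-1}.
f  = cbrt
df = lambda x: (1.0 / 3.0) * abs(x) ** (-2.0 / 3.0)
print(newton(f, df, 1.0))   # approximately [1, -2, 4, -8, 16, -32, 64]
```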

7.3.3 A Quasi-Newton Method: Vector Version

We now describe a quasi-Newton method for vector-valued functions that does not converge as quickly as Newton's method but whose convergence is easier to prove. This method also plays an important role in the proof of the implicit function theorem in the next section.

Theorem 7.3.10. Let $(X, \|\cdot\|)$ be a Banach space and assume $f : X \to X$ is $C^1$ on an open neighborhood $U$ of the point $\overline{x}$. If $f(\overline{x}) = 0$ and $Df(\overline{x}) \in \mathscr{B}(X)$ has a bounded inverse, then there exists $\delta > 0$ such that
\[ \phi(x) = x - Df(\overline{x})^{-1} f(x) \tag{7.18} \]
is a contraction on $B(\overline{x}, \delta)$.

Proof. Since $Df(x)$ is continuous on $U$, we can choose $B(\overline{x}, \delta) \subset U$ so that
\[ \|Df(x) - Df(\overline{x})\| < \frac{1}{2\|Df(\overline{x})^{-1}\|} \]
whenever $\|x - \overline{x}\| < \delta$. Hence, $x \in B(\overline{x}, \delta)$ implies
\[ \|D\phi(x)\| = \|I - Df(\overline{x})^{-1}Df(x)\| \le \|Df(\overline{x})^{-1}\|\|Df(x) - Df(\overline{x})\| < \tfrac{1}{2}. \]
Thus, by the mean value theorem we have $\|\phi(x) - \phi(y)\| \le \tfrac{1}{2}\|x - y\|$ for all $x, y \in B(\overline{x}, \delta)$. □
The previous theorem does not immediately give us an algorithm because it depends on computing $Df(\overline{x})$, which we generally do not know unless we know $\overline{x}$. Many quasi-Newton methods amount to choosing a suitable approximation to $Df(\overline{x})^{-1}$, which we can then use in the contraction mapping above to produce an iterative algorithm.

The following lemma provides a useful tool for approximating $Df(\overline{x})^{-1}$ and is also important for proving convergence of Newton's method.

Lemma 7.3.11. Let $(X, \|\cdot\|)$ be a Banach space and assume that $g : X \to \mathscr{B}(X)$ is a continuous map. If $g(\overline{x})$ has a bounded inverse, then there exists $\delta > 0$ such that $\|g(x)^{-1}\| < 2\|g(\overline{x})^{-1}\|$ whenever $\|x - \overline{x}\| < \delta$.

Proof. Since $g$ and matrix inversion are both continuous in a neighborhood of $\overline{x}$ (see Proposition 5.7.7), so is their composition. Hence, if $c = \|g(\overline{x})^{-1}\|$, then there exists $\delta > 0$ such that $\|g(x)^{-1} - g(\overline{x})^{-1}\| < c$ whenever $\|x - \overline{x}\| < \delta$. Thus,
\[ \|g(x)^{-1}\| \le \|g(x)^{-1} - g(\overline{x})^{-1}\| + \|g(\overline{x})^{-1}\| < c + c = 2\|g(\overline{x})^{-1}\| \]
whenever $\|x - \overline{x}\| < \delta$. □

7.3.4 Newton's Method: Vector Version


The idea of Newton's method in general is the same as in the single-variable case.
The derivative at a point provides the best possible linear approximation to the
function near that point, and if the derivative is invertible, we can find the zero of
this linear approximation and use it as the next iterate, namely, (7.19) below.
The hard part of all this is proving that convergence is quadratic for this
method. But even here the idea is similar to the single-variable case, that is, to
examine the remainder R 2 in the Taylor polynomial off.

Theorem 7.3.12 (Newton's Method-Vector Version). Let $(X, \|\cdot\|)$ be a Banach space and assume $f : X \to X$ is $C^1$ in an open neighborhood $U$ of the point $\overline{x} \in X$ and $f(\overline{x}) = 0$. If $Df(\overline{x}) \in \mathscr{B}(X)$ has a bounded inverse and $Df(x)$ is Lipschitz on $U$, then the iterative map
\[ x_{n+1} = x_n - Df(x_n)^{-1} f(x_n) \tag{7.19} \]
converges quadratically to $\overline{x}$ whenever $x_0$ is sufficiently close to $\overline{x}$.

Proof. Choose $\delta > 0$ as in Lemma 7.3.11, and such that $B(\overline{x}, \delta) \subset U$. Let $x_0 \in B(\overline{x}, \delta)$, and define $x_n$ for $n > 0$ using (7.19). We begin by writing the integral remainder of the first-order Taylor expansion
\[ f(x_n) - f(\overline{x}) = \int_0^1 Df(\overline{x} + t(x_n - \overline{x}))(x_n - \overline{x})\, dt = Df(\overline{x})(x_n - \overline{x}) + \int_0^1 \big(Df(\overline{x} + t(x_n - \overline{x})) - Df(\overline{x})\big)(x_n - \overline{x})\, dt. \]
Assume that $k$ is the Lipschitz constant for $Df$ on $U$. By the previous line, we have
\[ \|f(x_n) - f(\overline{x}) - Df(\overline{x})(x_n - \overline{x})\| \le \int_0^1 \|Df(\overline{x} + t(x_n - \overline{x})) - Df(\overline{x})\|\|x_n - \overline{x}\|\, dt \le \int_0^1 k\|\overline{x} + t(x_n - \overline{x}) - \overline{x}\|\|x_n - \overline{x}\|\, dt = \int_0^1 kt\|x_n - \overline{x}\|^2\, dt = \frac{k}{2}\|x_n - \overline{x}\|^2. \]
Also, from (7.19), we have
\[ \begin{aligned} x_{n+1} - \overline{x} &= x_n - Df(x_n)^{-1}f(x_n) - \overline{x} + Df(x_n)^{-1}f(\overline{x}) \\ &= x_n - \overline{x} - Df(x_n)^{-1}(f(x_n) - f(\overline{x})) \\ &= x_n - \overline{x} - Df(x_n)^{-1}\big(Df(\overline{x})(x_n - \overline{x}) + f(x_n) - f(\overline{x}) - Df(\overline{x})(x_n - \overline{x})\big) \\ &= Df(x_n)^{-1}(Df(x_n) - Df(\overline{x}))(x_n - \overline{x}) - Df(x_n)^{-1}\big(f(x_n) - f(\overline{x}) - Df(\overline{x})(x_n - \overline{x})\big). \end{aligned} \]
Thus, for $M = 3k\|Df(\overline{x})^{-1}\|$, we have
\[ \begin{aligned} \|x_{n+1} - \overline{x}\| &\le \|Df(x_n)^{-1}\|\|Df(x_n) - Df(\overline{x})\|\|x_n - \overline{x}\| + \frac{k}{2}\|Df(x_n)^{-1}\|\|x_n - \overline{x}\|^2 \\ &\le \frac{3}{2}k\|Df(x_n)^{-1}\|\|x_n - \overline{x}\|^2 \le 3k\|Df(\overline{x})^{-1}\|\|x_n - \overline{x}\|^2 \le M\|x_n - \overline{x}\|^2. \quad \Box \end{aligned} \]

Remark 7.3.13. In the previous theorem, we proved that when the initial point $x_0$ is "sufficiently close" to the zero $\overline{x}$, then Newton's method converges. But to be useful, we also need to know whether a given starting point will converge. This is answered by the Newton-Kantorovich theorem, which is a generalization of Lemma 7.3.2 to vector-valued functions. It says that the initial value $x_0$ produces a convergent sequence if

(7.20)

Here $K$ is the Lipschitz constant for the map $Df : U \to \mathscr{B}(X)$. The proof of the Newton-Kantorovich theorem is not beyond the scope of the text, but it is tedious, so we do not reproduce it here.

Example 7.3.14. We apply Newton's method to find the zeros of the function
\[ f(x, y) = \begin{bmatrix} 4x^3 + 4xy^2 \\ 4x^2 y + 4y^3 \end{bmatrix}. \]
Observe that
\[ Df(\mathbf{x}) = \begin{bmatrix} 12x^2 + 4y^2 & 8xy \\ 8xy & 4x^2 + 12y^2 \end{bmatrix}, \]
and since $Df(\mathbf{x})$ is a $2 \times 2$ matrix we calculate $Df(\mathbf{x})^{-1}$ directly as
\[ Df(\mathbf{x})^{-1} = \frac{1}{12(x^2 + y^2)^2}\begin{bmatrix} x^2 + 3y^2 & -2xy \\ -2xy & 3x^2 + y^2 \end{bmatrix}. \]
Suppose we select $\mathbf{x}_0 = \begin{bmatrix} a & a \end{bmatrix}^{\mathsf{T}}$ as our initial guess, where $a \ne 0$. Then
\[ \mathbf{x}_1 = \mathbf{x}_0 - Df(\mathbf{x}_0)^{-1}f(\mathbf{x}_0) = \begin{bmatrix} a \\ a \end{bmatrix} - \frac{1}{48a^4}\begin{bmatrix} 4a^2 & -2a^2 \\ -2a^2 & 4a^2 \end{bmatrix}\begin{bmatrix} 8a^3 \\ 8a^3 \end{bmatrix} = \begin{bmatrix} a \\ a \end{bmatrix} - \begin{bmatrix} a/3 \\ a/3 \end{bmatrix} = \frac{2}{3}\begin{bmatrix} a \\ a \end{bmatrix}. \]
It is clear that in general $\mathbf{x}_n = \left(\frac{2}{3}\right)^n\begin{bmatrix} a \\ a \end{bmatrix}$, which converges quickly to the zero at $(0, 0)$.
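A general-purpose Newton iteration in $\mathbb{R}^n$ is short to write. The sketch below (our own illustration) applies it to this example, solving the linear system $Df(\mathbf{x}_n)\mathbf{s} = f(\mathbf{x}_n)$ at each step instead of forming the inverse explicitly.

```python
import numpy as np

def newton_nd(f, Df, x0, iters=20):
    """Newton's method in R^n: x_{n+1} = x_n - Df(x_n)^{-1} f(x_n)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        x = x - np.linalg.solve(Df(x), f(x))
    return x

f  = lambda v: np.array([4*v[0]**3 + 4*v[0]*v[1]**2,
                         4*v[0]**2*v[1] + 4*v[1]**3])
Df = lambda v: np.array([[12*v[0]**2 + 4*v[1]**2, 8*v[0]*v[1]],
                         [8*v[0]*v[1], 4*v[0]**2 + 12*v[1]**2]])

print(newton_nd(f, Df, [1.0, 1.0]))   # approaches the zero at (0, 0) like (2/3)**n
```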

7.4 The Implicit and Inverse Function Theorems


The implicit function theorem and the inverse function theorem are consequences of the uniform contraction mapping principle and are the basis of many ideas in mathematics. Although most analysis texts prove these only in $\mathbb{R}^n$, we can do much better and prove them in a general Banach space with essentially no extra work.

7.4.1 Implicit Function Theorem

Given a function of two variables $F : X \times Y \to Z$, each point $z_0 \in Z$ defines a level set $\{(x, y) \mid F(x, y) = z_0\}$. Globally, this set is not usually the graph of a single function, but the implicit function theorem says that under mild conditions the level set is locally the graph of a function; call this function $f$. This new function $f$ is the implicit function defined by the relation $F(x, y) = z_0$ near some point $(x_0, y_0)$.

Example 7.4.1. Consider the level set $\{F(x, y) = 9\}$ of the function $F(x, y) = x^2 + y^2$. In a neighborhood of the point $(x_0, y_0) = (0, 3)$ we can define $y$ as a function of $x$, namely, $y = \sqrt{9 - x^2}$. However, we cannot define $y$ as a function of $x$ in a neighborhood around the point $(3, 0)$, since in any neighborhood of $(3, 0)$ there are two points of the level set of the form $(x, \pm\sqrt{9 - x^2})$ with the same $x$-coordinate. This is depicted in Figure 7.2.

The implicit function theorem tells us when we can implicitly define one or more of the variables (in our example the variable $y$) as functions of other variables (in our example the variable $x$).

In the previous example, we could solve explicitly for $y$ as a function of $x$, but in many cases solving explicitly for the function is just too hard (even impossible). Yet,


Figure 7.2. An illustration of Example 7.4.1. In a neighborhood around the point $(0, 3)$, the points on the circle $F(x, y) = 9$ (black arc on the left) can be written as $(x, f(x))$, provided $x$ remains in a small enough neighborhood (blue line) of 0. But near the point $(3, 0)$ there is no function of $x$ defining $y$. Instead, there is a function $g$ so that we can write points of the circle near $(3, 0)$ as $(g(y), y)$, provided $y$ remains in a small enough neighborhood (red line) of 0.

for many problems just knowing it exists and knowing its derivative is enough. The
implicit function theorem not only tells us when the function exists, but also how
to compute its derivative without computing the function itself; see (7.22).
To prove the implicit function theorem we construct a uniform contraction
mapping using a generalization of the quasi-Newton method (7.18) and then apply
the uniform contraction mapping principle (Theorem 7.2.4).

Theorem 7.4.2 (Implicit Function Theorem). Assume that $(X, \|\cdot\|_X)$, $(Y, \|\cdot\|_Y)$, and $(Z, \|\cdot\|_Z)$ are Banach spaces, that $U$ and $V$ are open neighborhoods of $x_0 \in X$ and $y_0 \in Y$, respectively, and that $F : U \times V \to Z$ is a $C^k$ map for some integer $k \ge 1$. Let $z_0 = F(x_0, y_0)$. If $D_2F(x_0, y_0) \in \mathscr{B}(Y; Z)$ has a bounded inverse, then there exists an open neighborhood $U_0 \times V_0 \subset U \times V$ of $(x_0, y_0)$ and a unique $C^k$ function $f : U_0 \to V_0$ such that $f(x_0) = y_0$ and
\[ \{(x, y) \in U_0 \times V_0 \mid F(x, y) = z_0\} = \{(x, f(x)) \mid x \in U_0\}. \tag{7.21} \]
Moreover, the derivative of $f$ satisfies
\[ Df(x) = -D_2F(x, f(x))^{-1}D_1F(x, f(x)) \tag{7.22} \]
on $U_0$.

Proof. Without loss of generality, we assume that $F(x_0, y_0) = 0$ (otherwise redefine $F$ to be $F - z_0$). Generalizing (7.18), consider the $C^k$ map $G : U \times V \to Y$ given by
\[ G(x, y) = y - D_2F(x_0, y_0)^{-1}F(x, y). \]
For fixed $x \in U$, we have that $G(x, y) = y$ if and only if $F(x, y) = 0$. Furthermore, $D_2G(x_0, y_0) = I - D_2F(x_0, y_0)^{-1}D_2F(x_0, y_0) = 0$. Since $G$ is $C^1$, there exists a neighborhood $U_1 \times V_0 \subset U \times V$ of $(x_0, y_0)$ such that
\[ \|D_2G(x, y)\| < \frac{1}{2} \]
whenever $(x, y) \in U_1 \times V_0$. Without loss of generality we may assume $V_0 = B(y_0, \delta)$ for some suitably chosen $\delta > 0$. Since $F(x, y)$ is $C^1$ and vanishes at $(x_0, y_0)$, there exists an open neighborhood $U_0 \subset U_1$ of $x_0$ such that
\[ \|D_2F(x_0, y_0)^{-1}\|\|F(x, y_0)\| < \frac{\delta}{2} \]
whenever $x \in U_0$. Applying the triangle inequality and the inequality (6.18) from the integral mean value theorem (Corollary 6.5.5), we have
\[ \|G(x, y) - y_0\| \le \|G(x, y) - G(x, y_0)\| + \|G(x, y_0) - y_0\| \le \sup_{\hat{y} \in \ell(y_0, y)} \|D_2G(x, \hat{y})\|\|y - y_0\| + \|D_2F(x_0, y_0)^{-1}\|\|F(x, y_0)\| < \frac{\delta}{2} + \frac{\delta}{2} = \delta \]
whenever $(x, y) \in U_0 \times V_0$, and thus $G : U_0 \times V_0 \to V_0$. Moreover, for $x \in U_0$ and $y_1, y_2 \in V_0$, we apply the mean value inequality (6.18) again to get
\[ \|G(x, y_2) - G(x, y_1)\| \le \sup_{\hat{y} \in \ell(y_1, y_2)} \|D_2G(x, \hat{y})\|\|y_2 - y_1\| \le \frac{1}{2}\|y_2 - y_1\|. \]
This implies that $G(x, \cdot)$ is a uniform contraction, so for each $x$ there is a unique $y$ satisfying $G(x, y) = y$. By Theorem 7.2.4, this defines a $C^k$ function $f : U_0 \to V_0$ satisfying $G(x, f(x)) = f(x)$ for all $x \in U_0$. Since $\|G(x, y) - y_0\| < \delta$ on $U_0 \times V_0$, we can restrict the codomain to $V_0$ and simply write $f : U_0 \to V_0$. It follows that $F(x, f(x)) = 0$ for all $x \in U_0$, which, together with uniqueness, gives (7.21). Differentiating and solving for $Df(x)$ gives (7.22). □

Example 7.4.3. The level set in Example 7.4.1 is a circle of radius 3, centered at the origin. By the implicit function theorem, as long as $D_2F(x_0, y_0) = 2y_0 \ne 0$, there exists a unique $C^1$ function $f(x)$ in a neighborhood of the point $(x_0, y_0)$ satisfying $F(x, f(x)) = 0$.

Setting $y = f(x)$ and differentiating the equation $F(x, y) = 0$ with respect to $x$ gives
\[ 0 = D_1F(x, f(x)) + D_2F(x, f(x))f'(x) = 2x + 2yy'. \]
Solving for $y' = f'(x)$ yields
\[ y' = f'(x) = -\frac{D_1F(x, y)}{D_2F(x, y)} = -\frac{2x}{2y} = -\frac{x}{y}, \]
which agrees with (7.22).

Notice that the tangent line to the circle at $(x, y)$ is perpendicular to the radius connecting the origin $(0, 0)$ to $(x, y)$, so the slope $m = y/x$ of that radius is the negative reciprocal of the slope of the tangent line.

Remark 7.4.4. The previous example is a special case of a claim often seen in a multivariable calculus class. For any function $F(x, y)$ of two variables, if the equation $F(x, y) = 0$ defines $y$ implicitly as a function of $x$, then the derivative $dy/dx$ is given by
\[ \frac{dy}{dx} = -\frac{\partial F/\partial x}{\partial F/\partial y}. \tag{7.23} \]
The implicit function theorem tells us that $y$ is a function of $x$ when $\frac{\partial F}{\partial y} \ne 0$, and (7.23) is a special case of the formula (7.22).

Example 7.4.5. Consider the two-dimensional surface $S$ defined implicitly by the equation $F(x, y, z) = 0$, where
\[ F(x, y, z) = z^3 + 3xyz^2 - 5x^2y^2z + 14. \]
Given $(x_0, y_0, z_0) = (1, -1, 2) \in S$, we compute $D_3F(x_0, y_0, z_0) = -5 \ne 0$. By the implicit function theorem, the surface $S$ can be written explicitly as the graph of a function $z = z(x, y)$ in a neighborhood of $(x_0, y_0, z_0)$.

Furthermore, we can find the partial derivatives of $z$ by differentiating $F(x, y, z(x, y)) = 0$, which gives
\[ \begin{aligned} 0 &= D_1F(x, y, z) + D_3F(x, y, z)D_1z(x, y) = \frac{\partial F}{\partial x} + \frac{\partial F}{\partial z}\frac{\partial z}{\partial x} = (3yz^2 - 10xy^2z) + (3z^2 + 6xyz - 5x^2y^2)\frac{\partial z}{\partial x}, \\ 0 &= D_2F(x, y, z) + D_3F(x, y, z)D_2z(x, y) = \frac{\partial F}{\partial y} + \frac{\partial F}{\partial z}\frac{\partial z}{\partial y} = (3xz^2 - 10x^2yz) + (3z^2 + 6xyz - 5x^2y^2)\frac{\partial z}{\partial y}. \end{aligned} \]
Substituting $x_0$, $y_0$, $z_0$ and solving for the partial derivatives of $z$, we get
\[ \frac{\partial z}{\partial x}(x_0, y_0) = -\frac{32}{5} \quad\text{and}\quad \frac{\partial z}{\partial y}(x_0, y_0) = \frac{32}{5}. \]
Thus, the tangent plane of the surface $S$ at $(x_0, y_0, z_0)$ is
\[ 32(x - 1) - 32(y + 1) + 5(z - 2) = 0. \]
Solving for $z$ in terms of $x$ and $y$ gives the first-order Taylor expansion of $z = z(x, y)$ at $(x_0, y_0)$; see Example 6.6.13.

Example 7.4.6. Consider the nonlinear system of polynomial equations
\[ \begin{aligned} xu^2 + yzv + x^2z &= 3, \\ xyv^3 + 2zu - u^2v^2 &= 2. \end{aligned} \]
We want to show that we can solve for $u$ and $v$ as smooth functions of $x$, $y$, and $z$ in a neighborhood of the point $(1, 1, 1, 1, 1)$. Writing $\mathbf{x} = (x, y, z)$ and $\mathbf{y} = (u, v)$, we define the smooth map $F : \mathbb{R}^3 \times \mathbb{R}^2 \to \mathbb{R}^2$ by
\[ F(\mathbf{x}, \mathbf{y}) = \begin{bmatrix} xu^2 + yzv + x^2z - 3 \\ xyv^3 + 2zu - u^2v^2 - 2 \end{bmatrix}. \]
Set $\mathbf{x}_0 = (1, 1, 1)$ and $\mathbf{y}_0 = (1, 1)$, and note that
\[ D_2F(\mathbf{x}_0, \mathbf{y}_0) = \begin{bmatrix} 2xu & yz \\ 2z - 2uv^2 & 3xyv^2 - 2u^2v \end{bmatrix}\bigg|_{\mathbf{x}_0, \mathbf{y}_0} = \begin{bmatrix} 2 & 1 \\ 0 & 1 \end{bmatrix}, \]
which is nonsingular (and thus has a bounded inverse). Therefore, by the implicit function theorem, we have that $\mathbf{y}(\mathbf{x}) = (u(\mathbf{x}), v(\mathbf{x}))$ is a $C^1$ function in an open neighborhood of $\mathbf{x}_0$ satisfying $F(\mathbf{x}, \mathbf{y}(\mathbf{x})) = 0$.
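Although $\mathbf{y}(\mathbf{x}) = (u(\mathbf{x}), v(\mathbf{x}))$ has no closed form here, it can be evaluated numerically: for each fixed $\mathbf{x}$ near $\mathbf{x}_0$, run Newton's method in the variables $(u, v)$ on $F(\mathbf{x}, \cdot)$. The following sketch is our own illustration.

```python
import numpy as np

def F(x, y):
    X, Y, Z = x
    u, v = y
    return np.array([X*u**2 + Y*Z*v + X**2*Z - 3,
                     X*Y*v**3 + 2*Z*u - u**2*v**2 - 2])

def D2F(x, y):
    X, Y, Z = x
    u, v = y
    return np.array([[2*X*u,          Y*Z],
                     [2*Z - 2*u*v**2, 3*X*Y*v**2 - 2*u**2*v]])

def implicit_uv(x, y0=(1.0, 1.0), iters=25):
    """Evaluate the implicit function y(x) = (u(x), v(x)) by Newton's method in y."""
    y = np.array(y0, dtype=float)
    for _ in range(iters):
        y = y - np.linalg.solve(D2F(x, y), F(x, y))
    return y

x0 = np.array([1.0, 1.0, 1.0])
print(implicit_uv(x0))                      # [1. 1.] -- recovers y0
print(implicit_uv(x0 + [0.01, 0.0, 0.0]))   # nearby value of (u, v)
```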

Example 7.4.7.* For this example we introduce some notation. Let $f_i : \mathbb{R}^n \to \mathbb{R}$ be $C^1$, where $i = 1, \ldots, n$, and let $(x_1, x_2, \ldots, x_n) \in \mathbb{R}^n$. We write
\[ \frac{\partial(f_1, f_2, \ldots, f_n)}{\partial(x_1, x_2, \ldots, x_n)} = \det \begin{bmatrix} \frac{\partial f_1}{\partial x_1} & \frac{\partial f_1}{\partial x_2} & \cdots & \frac{\partial f_1}{\partial x_n} \\ \frac{\partial f_2}{\partial x_1} & \frac{\partial f_2}{\partial x_2} & \cdots & \frac{\partial f_2}{\partial x_n} \\ \vdots & \vdots & & \vdots \\ \frac{\partial f_n}{\partial x_1} & \frac{\partial f_n}{\partial x_2} & \cdots & \frac{\partial f_n}{\partial x_n} \end{bmatrix}. \tag{7.24} \]
This is called the Jacobian determinant of the functions $f_1, f_2, \ldots, f_n$.

Consider the system
\[ \begin{aligned} f(x, y, z) &= 0, \\ g(x, y, z) &= 0, \end{aligned} \]
where $f$ and $g$ are real-valued $C^1$ functions on $\mathbb{R}^3$ satisfying $J = \frac{\partial(f, g)}{\partial(y, z)} \ne 0$. By the implicit function theorem, we can solve for $y(x)$ and $z(x)$ as $C^1$ functions of $x$. Thus, we have $f(x, y(x), z(x)) = 0$ and $g(x, y(x), z(x)) = 0$. Taking the derivative yields
\[ \begin{aligned} D_1f(x, y(x), z(x)) + D_2f(x, y(x), z(x))y'(x) + D_3f(x, y(x), z(x))z'(x) &= 0, \\ D_1g(x, y(x), z(x)) + D_2g(x, y(x), z(x))y'(x) + D_3g(x, y(x), z(x))z'(x) &= 0. \end{aligned} \]
Moreover, from Cramer's rule (Corollary 2.9.24) we can solve for the derivatives $y'(x)$ and $z'(x)$ to get
\[ y'(x) = J^{-1}\frac{\partial(f, g)}{\partial(z, x)} \quad\text{and}\quad z'(x) = J^{-1}\frac{\partial(f, g)}{\partial(x, y)}; \]
see Exercise 7.25 for details.

7.4.2 Inverse Function Theorem


The implicit function theorem has another incarnation, called the inverse function
theorem. It states that if the derivative of a function is invertible (or, rather,
has a bounded inverse) at some point, then the function itself is invertible in a
neighborhood of the image of that point. This should not come as a surprise- a
differentiable function is very nearly equal to its derivative in a small neighborhood,
so if the derivative is invertible in that neighborhood, the function should also be.
The inverse function theorem follows from the implicit function theorem, but
the implicit function theorem can also be proved from the inverse function theorem.
Depending on the application, one of these theorems may be easier to use than
the other.

Theorem 7.4.8 (Inverse Function Theorem). Assume that $(X, \|\cdot\|_X)$ and $(Y, \|\cdot\|_Y)$ are Banach spaces, that $U$ and $V$ are open neighborhoods of $x_0 \in X$ and $y_0 \in Y$, respectively, and that $f : U \to V$ is a $C^k$ map for some $k \in \mathbb{Z}^+$ satisfying $f(x_0) = y_0$. If $Df(x_0) \in \mathscr{B}(X; Y)$ has a bounded inverse, then there exist open neighborhoods $U_0 \subset U$ of $x_0$ and $V_0 \subset V$ of $y_0$, and a unique $C^k$ function $g : V_0 \to U_0$ that is inverse to $f$. In other words, $f(g(y)) = y$ for all $y \in V_0$ and $g(f(x)) = x$ for all $x \in U_0$. Moreover, for all $y \in V_0$, we have
\[ Dg(y) = Df(g(y))^{-1}. \tag{7.25} \]

Proof. Define $F(x, y) = f(x) - y$. Since $D_1F(x_0, y_0) = Df(x_0)$ has a bounded inverse, the implicit function theorem guarantees the existence of a neighborhood $U_1 \times V_0 \subset U \times Y$ of the point $(x_0, y_0)$ and a $C^k$ function $g : V_0 \to U_1$ such that $f(g(y)) = y$ for all $y \in V_0$, which implies that $g$ is injective (see Theorem A.2.19). By restricting the codomain of $g$ to $U_0 = g(V_0)$, we have that $g$ is bijective. By Corollary A.2.20, this implies that $f : U_0 \to V_0$ and $g : V_0 \to U_0$ are inverses of each other. Note that $U_0 = U_1 \cap f^{-1}(V_0)$, which implies that $U_0$ is open. Finally, (7.25) follows by differentiating $f(g(y)) = y$. □

Example 7.4.9. The function $f : \mathbb{R} \to \mathbb{R}$ given by $f(t) = \cos(t)$ has $Df(t) = -\sin(t)$, which is nonzero whenever $t \ne k\pi$ for all $k \in \mathbb{Z}$. The inverse function theorem guarantees that for any point $t \ne k\pi$ there is a neighborhood $U_0 \subset \mathbb{R}$ of $t$, a neighborhood $V_0 \subset \mathbb{R}$ of $\cos(t)$, and an inverse function $g : V_0 \to U_0$. There cannot be a global inverse function $g : \mathbb{R} \to \mathbb{R}$ because $f$ is not injective, and the image of $f$ lies in $[-1, 1]$; so the inverse function can only be defined on neighborhoods in $(-1, 1)$.

As an example, for $t \in (0, \pi)$ we can take neighborhoods $U_0 = (0, \pi)$ and $V_0 = (-1, 1)$ and let $g(x) = \operatorname{Arccos}(x)$. If $x = \cos(t)$, then the inverse function theorem guarantees that the derivative of $g$ is
\[ Dg(x) = Df(t)^{-1} = \frac{1}{-\sin(t)} = \frac{-1}{\sqrt{1 - \cos^2(t)}} = \frac{-1}{\sqrt{1 - x^2}}. \]

Nota Bene 7.4.10. Whenever you use the inverse function theorem, there are likely to be a lot of negative exponents flying around. Some of these denote function inverses, and some of them denote matrix inverses, including the reciprocals of scalars (which can be thought of as $1 \times 1$ matrices).

If the inverse function is $g = f^{-1}$, then the derivative $Dg(y) = D(f^{-1})(y)$ of the inverse function is the matrix inverse of the derivative:
\[ D(f^{-1})(y) = \big(Df(x)\big)^{-1}, \quad \text{where } y = f(x), \]
where $f^{-1}$ on the left means the inverse function $g$, but the exponent on the right means the inverse of the matrix $Df(x)$.

Of course the inverse of a matrix is the matrix representing the inverse of the corresponding linear operator, so these exponents really are denoting the inverse function in both cases; the real problem is that many people confuse the linear operator $Df(x) : X \to Y$ (the function we want to take the inverse of) with the nonlinear function $Df : U \to \mathscr{B}(X, Y)$, which often has no inverse at all.

Example 7.4.11. Let $f : \mathbb{R}^2 \to \mathbb{R}^2$ be the coordinate change in the plane from polar to Cartesian coordinates, given by $(r, \theta) \mapsto (r\cos\theta, r\sin\theta)$. Since
\[ \det Df(r, \theta) = \begin{vmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{vmatrix} = r, \]
the function $f$ is invertible in a neighborhood of $(r, \theta)$ whenever $r \ne 0$.

Note that no one function is the inverse of $f$ on the entire punctured plane $\mathbb{R}^2 \smallsetminus \{0\}$. After all, the function $f$ is not injective, so it can't have a global inverse. But for points in the open set $U_0 = \{(r, \theta) \mid r > 0,\ \theta \in (-\pi/2, \pi/2)\}$, the image of $f$ is the right half plane $V_0 = \{(x, y) \mid x > 0\}$, and on $V_0$ the inverse function $g$ is given by $(x, y) \mapsto \left(\sqrt{x^2 + y^2}, \operatorname{Arctan}(y/x)\right)$.

To find the derivative of the inverse function, one could try to find the inverse function explicitly in a neighborhood and then differentiate it. But the inverse function theorem tells us the derivative of the inverse without actually finding the inverse. In this example, if $g$ is the inverse function, and if $(x, y) = (r\cos\theta, r\sin\theta)$, then we have
\[ Dg(x, y) = Df(r, \theta)^{-1} = \frac{1}{r}\begin{bmatrix} r\cos\theta & r\sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix} = \begin{bmatrix} \dfrac{x}{\sqrt{x^2 + y^2}} & \dfrac{y}{\sqrt{x^2 + y^2}} \\[2mm] \dfrac{-y}{x^2 + y^2} & \dfrac{x}{x^2 + y^2} \end{bmatrix}. \]
You can verify that this agrees with the result of finding $g$ explicitly and then differentiating.

Theorem 7.4.12. The inverse and implicit function theorems are equivalent.

Proof. Since we used the implicit function theorem to prove the inverse function theorem, it suffices to prove the implicit function theorem using the inverse function theorem. Let $G : U \times V \to X \times Z$ be given by $G(x, y) = (x, F(x, y))$. Note that
\[ DG(x, y) = \begin{bmatrix} I & 0 \\ D_1F(x, y) & D_2F(x, y) \end{bmatrix}, \]
which has a bounded inverse whenever $D_2F(x, y)$ has a bounded inverse. Applying the inverse function theorem to the equation $G(x, y) = (x, 0)$, we have the solution $(x, y(x)) = G^{-1}(x, 0)$, which satisfies $F(x, y(x)) = 0$. □

7.4.3 *Application: Navigation


Global positioning systems (GPS) are found in cars, phones, and many other de-
vices. The implicit function theorem answers an important question about the
design of a GPS, namely, how accurately time must be kept in order to achieve
the desired accuracy in location.
To determine location, the GPS device measures the distance from the device to each of four satellites. If $(x_i, y_i, z_i)$ is the location of the satellite $S_i$, and if $r_i$ is its distance to the GPS device, then the location of the GPS device lies at the intersection of the following four spheres:

\[ \begin{aligned} (x - x_1)^2 + (y - y_1)^2 + (z - z_1)^2 &= r_1^2, \\ (x - x_2)^2 + (y - y_2)^2 + (z - z_2)^2 &= r_2^2, \\ (x - x_3)^2 + (y - y_3)^2 + (z - z_3)^2 &= r_3^2, \\ (x - x_4)^2 + (y - y_4)^2 + (z - z_4)^2 &= r_4^2. \end{aligned} \]
The distance $r_i$ is calculated by considering the difference between the satellite time and the user time, or in other words $r_i = c(\Delta t_i - \Delta t_{i,\mathrm{prop}})$, where $\Delta t_{i,\mathrm{prop}}$ is the atmospheric propagation delay and $c$ is the speed of light.

Unfortunately, $\Delta t_i$ is difficult to compute, since the clocks on both the satellite and the GPS device tend to drift. The value of $\Delta t_i$ is approximated as
\[ \Delta t_i = \Delta t_{i,\mathrm{fake}} + \Delta t_{i,\mathrm{sat}} + \Delta t_{\mathrm{loc}}, \]
where $\Delta t_{i,\mathrm{fake}}$ is the nominal time difference between the satellite clock and the local GPS clock, $\Delta t_{i,\mathrm{sat}}$ is the drift in the satellite clock, and $\Delta t_{\mathrm{loc}}$ is the drift in

the local GPS unit clock. Hence, $r_i$ can be written as
\[ r_i = c(\Delta t_{i,\mathrm{fake}} + \Delta t_{i,\mathrm{sat}} - \Delta t_{i,\mathrm{prop}} + \Delta t_{\mathrm{loc}}). \]
To make things cleaner we write
\[ t_i = \Delta t_{i,\mathrm{fake}} + \Delta t_{i,\mathrm{sat}} - \Delta t_{i,\mathrm{prop}} \quad\text{and}\quad \ell = \Delta t_{\mathrm{loc}}, \]
and we let $\mathbf{t} = \begin{bmatrix} t_1 & \cdots & t_4 \end{bmatrix}^{\mathsf{T}}$ and $\mathbf{x} = \begin{bmatrix} x & y & z & \ell \end{bmatrix}^{\mathsf{T}}$. Finally, let
\[ F_i(\mathbf{t}, \mathbf{x}) = (x - x_i)^2 + (y - y_i)^2 + (z - z_i)^2 - c^2(t_i + \ell)^2, \qquad i = 1, 2, 3, 4, \]
so the system of equations becomes
\[ F(\mathbf{t}, \mathbf{x}) = 0. \]
We treat this as a system of four equations with four unknowns ($x$, $y$, $z$, and $\ell$). Suppose we wish to determine the change in $\mathbf{x}$ if we perturb $\mathbf{t}$, or, conversely, suppose we want $\mathbf{x}$ to be determined with a certain degree of precision. How much error can $\mathbf{t}$ have? The implicit function theorem is perfectly suited to give this kind of information. It states that $\mathbf{x}$ is a function of $\mathbf{t}$ if $D_2F(\mathbf{t}, \mathbf{x})$ is invertible, and in that case we must have $D\mathbf{x}(\mathbf{t}) = -D_2F(\mathbf{t}, \mathbf{x}(\mathbf{t}))^{-1}D_1F(\mathbf{t}, \mathbf{x}(\mathbf{t}))$. Written more explicitly, we have

\[ \begin{bmatrix} \frac{\partial x}{\partial t_1} & \frac{\partial x}{\partial t_2} & \frac{\partial x}{\partial t_3} & \frac{\partial x}{\partial t_4} \\ \frac{\partial y}{\partial t_1} & \frac{\partial y}{\partial t_2} & \frac{\partial y}{\partial t_3} & \frac{\partial y}{\partial t_4} \\ \frac{\partial z}{\partial t_1} & \frac{\partial z}{\partial t_2} & \frac{\partial z}{\partial t_3} & \frac{\partial z}{\partial t_4} \\ \frac{\partial \ell}{\partial t_1} & \frac{\partial \ell}{\partial t_2} & \frac{\partial \ell}{\partial t_3} & \frac{\partial \ell}{\partial t_4} \end{bmatrix} = - \begin{bmatrix} \frac{\partial F_1}{\partial x} & \frac{\partial F_1}{\partial y} & \frac{\partial F_1}{\partial z} & \frac{\partial F_1}{\partial \ell} \\ \frac{\partial F_2}{\partial x} & \frac{\partial F_2}{\partial y} & \frac{\partial F_2}{\partial z} & \frac{\partial F_2}{\partial \ell} \\ \frac{\partial F_3}{\partial x} & \frac{\partial F_3}{\partial y} & \frac{\partial F_3}{\partial z} & \frac{\partial F_3}{\partial \ell} \\ \frac{\partial F_4}{\partial x} & \frac{\partial F_4}{\partial y} & \frac{\partial F_4}{\partial z} & \frac{\partial F_4}{\partial \ell} \end{bmatrix}^{-1} \begin{bmatrix} \frac{\partial F_1}{\partial t_1} & \frac{\partial F_1}{\partial t_2} & \frac{\partial F_1}{\partial t_3} & \frac{\partial F_1}{\partial t_4} \\ \frac{\partial F_2}{\partial t_1} & \frac{\partial F_2}{\partial t_2} & \frac{\partial F_2}{\partial t_3} & \frac{\partial F_2}{\partial t_4} \\ \frac{\partial F_3}{\partial t_1} & \frac{\partial F_3}{\partial t_2} & \frac{\partial F_3}{\partial t_3} & \frac{\partial F_3}{\partial t_4} \\ \frac{\partial F_4}{\partial t_1} & \frac{\partial F_4}{\partial t_2} & \frac{\partial F_4}{\partial t_3} & \frac{\partial F_4}{\partial t_4} \end{bmatrix}. \]

Elementary calculations show that if the times are perturbed, the change in $\mathbf{x}$ is approximately
\[ \Delta\mathbf{x} \approx \frac{\partial \mathbf{x}}{\partial t_1}\Delta t_1 + \frac{\partial \mathbf{x}}{\partial t_2}\Delta t_2 + \frac{\partial \mathbf{x}}{\partial t_3}\Delta t_3 + \frac{\partial \mathbf{x}}{\partial t_4}\Delta t_4. \]
Moreover, one can show that if all of the $\Delta t_i$ values are correct to within a nanosecond (typical for the clocks used on satellites), then the coordinates will be correct to within 3 meters. For further information, see [NJN98].

7.5 Conditioning
If the answer is highly sensitive to perturbations, you have probably asked the wrong
question.
-Nick Trefethen

Nearly every problem in applied mathematics can ultimately be expressed as a


function . Solving these problems amounts to evaluating the functions. But when
evaluating functions numerically, there are several potential sources of error. Two of
the most important of these are errors in the inputs and errors in the intermediate
computations. Since every measurement is inherently imprecise, and most numbers
cannot be represented exactly as a floating-point number, inputs almost always have
minor errors. Similarly, floating-point arithmetic almost always introduces minor
errors at each intermediate computation, and depending on the algorithm, these
can accumulate to produce significant errors in the output.
For each problem we must ask how much error can accumulate from round-off
(floating-point) error in the algorithm, and how sensitive the function is to small
changes in the inputs. The answer to the first question is measured by the stability of
the algorithm. Stability is treated in Volume 2. The answer to the second question
is captured in the conditioning of the problem. If a small change to the input only
results in a small change to the output of the function, we say that the function is
well conditioned. But if small changes to the input result in large changes to the
output, we say the function is ill conditioned. Not surprisingly, a function can be
ill conditioned for some inputs and well conditioned for other inputs.

Example 7.5.1. Consider the function $y = x/(1 - x)$. For values of $x$ close to 1, a small change in $x$ produces a large change in $y$. For example, if the correct input is $x = 1.001$, and if that is approximated by $\hat{x} = 1.002$, the actual output of $\hat{y} = 1.002/(1 - 1.002) = -501$ is very different from the desired output of $y = 1.001/(1 - 1.001) = -1001$. So this problem is ill conditioned near $x = 1$. Note that this error has nothing to do with round-off errors in the algorithm for computing the values; it is entirely a property of the problem itself.

But if the desired input is $x = 35$, then the correct output is $y = -1.0294$, and even a bad approximation to the input like $\hat{x} = 36$ gives a good approximate output $\hat{y} = -1.0286$. So this problem is well conditioned near 35.
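A quick numerical check of these claims (our own illustration):

```python
def f(x):
    return x / (1 - x)

for x, x_perturbed in [(1.001, 1.002), (35.0, 36.0)]:
    y, y_hat = f(x), f(x_perturbed)
    rel_in  = abs(x_perturbed - x) / abs(x)
    rel_out = abs(y_hat - y) / abs(y)
    print(f"x = {x:>7}: relative input error {rel_in:.2e}, relative output error {rel_out:.2e}")
```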

7.5.1 Condition Number of a Function


The condition number of a problem measures how sensitive the problem is to changes
in input values. In essence we would like something like

change in output = condition number x change in input .

As shown in Example 7.5.1, such a number also depends on the input.

Definition 7.5.2. Let $X$ and $Y$ be normed linear spaces, and let $f : X \to Y$ be a function. The absolute condition number of $f$ at $x \in X$ is
\[ \hat{\kappa}(x) = \lim_{\delta \to 0^+} \sup_{\|h\| < \delta} \frac{\|f(x + h) - f(x)\|}{\|h\|}. \]

Proposition 7.5.3. Let $X$ and $Y$ be Banach spaces, and let $U \subset X$ be an open set containing $x$. If $f : U \to Y$ is differentiable at $x$, then
\[ \hat{\kappa}(x) = \|Df(x)\|. \tag{7.26} \]

Proof. By Lemma 6.4.1, for every $\varepsilon > 0$, if $\|h\|$ is sufficiently small, then
\[ \|f(x + h) - f(x) - Df(x)h\| \le \varepsilon\|h\|. \]
Using the triangle inequality and dividing by $\|h\|$ gives
\[ \left| \frac{\|f(x + h) - f(x)\|}{\|h\|} - \|Df(x)\| \right| \le \varepsilon. \]
Therefore,
\[ \left| \lim_{\delta \to 0^+} \sup_{\|h\| < \delta} \frac{\|f(x + h) - f(x)\|}{\|h\|} - \|Df(x)\| \right| \le \varepsilon \]
for every $\varepsilon > 0$, and so (7.26) follows. □

In most settings, relative error is more useful than absolute error. An error of 1 is tiny if the true answer is $10^{20}$, but it is huge if the true answer is $10^{-20}$. Relative error accounts for this difference. Since the condition number is really about the size of errors in the output, the relative condition number is usually a better measure of conditioning than the absolute condition number.

Definition 7.5.4. Let $X$ and $Y$ be normed linear spaces, and let $f : X \to Y$ be a function. The relative condition number of $f$ at $x \in X$ is
\[ \kappa(x) = \lim_{\delta \to 0^+} \sup_{\|h\| < \delta} \left( \frac{\|f(x + h) - f(x)\|}{\|f(x)\|} \bigg/ \frac{\|h\|}{\|x\|} \right) = \frac{\hat{\kappa}(x)}{\|f(x)\|/\|x\|}. \tag{7.27} \]

Remark 7 .5.5. A problem is well conditioned at x if the relative condition number


is small. What we mean by "small" depends on the problem, of course. Similarly,
the problem is ill conditioned if the relative condition number is large. Again, what
is meant by "large" depends on the problem.

Nota Bene 7.5.6. Roughly speaking, we have

rel. change in output = rel. condition number x rel. change in input.

This leads to a general rule of thumb that, without any error in the algorithm itself, we should expect to lose $k$ digits of accuracy if the relative condition number is $10^k$.

If f is differentiable, then Proposition 7.5.3 gives a formula for the relative


condition number in terms of the derivative.

Corollary 7.5.7. If $X$ and $Y$ are Banach spaces, if $U \subset X$ is an open set containing $x$, and if $f : U \to Y$ is differentiable at $x$, then
\[ \kappa(x) = \frac{\|Df(x)\|}{\|f(x)\|/\|x\|}. \tag{7.28} \]

Example 7.5.8.

(i) Consider the function $f(x) = \frac{x}{1 - x}$ of Example 7.5.1. For this function $Df(x) = (1 - x)^{-2}$, and hence by (7.28) we have
\[ \kappa = \frac{\|Df(x)\|}{\|f(x)\|/\|x\|} = \frac{\left|\frac{1}{(1 - x)^2}\right|}{\left|\frac{x}{1 - x}\right|/|x|} = \left| \frac{1}{1 - x} \right|. \]
This problem is well conditioned when $x$ is far from 1, and poorly conditioned when $|1 - x|$ is small.

(ii) Given $y$, consider the problem of finding $x$ on the curve $x^3 - x = y^2$. Setting $F(x, y) = x^3 - x - y^2$, we can rewrite this as the problem of finding $x$ to satisfy $F(x, y) = 0$. Note that $D_xF = 3x^2 - 1$, so, provided that $x \ne \pm\sqrt{1/3}$, the implicit function theorem applies and guarantees that there is (locally) a function $x(y)$ such that $F(x(y), y) = 0$. Moreover, $Dx(y) = dx/dy = 2y/(3x^2 - 1)$. Therefore, near a point $(x, y)$ on the curve with $x \ne \pm\sqrt{1/3}$, the relative condition number of this function is
\[ \kappa = \frac{|2y/(3x^2 - 1)|}{|x|/|y|} = \frac{2y^2}{|x||3x^2 - 1|} = \frac{2|x^3 - x|}{|x||3x^2 - 1|}. \]
This problem is ill conditioned when $x$ is close to $\pm\sqrt{1/3}$ and well conditioned elsewhere.

7.5.2 Condition of Finding a Simple Root of a Polynomial


We now show, using the implicit function theorem, that varying the coefficients of
a single-variable polynomial p causes the simple roots (those of multiplicity 1) of p
to vary as a continuous function of the coefficients, and we calculate the condition
number of that function.

Proposition 7.5.9. Define $P : \mathbb{F}^{n+1} \times \mathbb{F} \to \mathbb{F}$ by $P(a, x) = \sum_{i=0}^{n} a_i x^i$. For any given $b \in \mathbb{F}^{n+1}$ and any simple root $z$ of the polynomial $p(x) = P(b, x)$, there is a neighborhood $U$ of $b$ in $\mathbb{F}^{n+1}$ and a continuously differentiable function $r : U \to \mathbb{F}$ with $r(b) = z$ such that $P(a, r(a)) = 0$ for all $a \in U$. Moreover, the relative condition number of $r$ as a function of the $i$th coefficient $a_i$, at the point $(b, z)$, is
\[ \kappa = \left| \frac{b_i z^{i-1}}{p'(z)} \right|. \tag{7.29} \]

Proof. A root $z$ of a polynomial $p$ is simple if and only if $p'(z) \ne 0$. Differentiating $P$ at $(b, z)$ with respect to $x$ gives $D_xP(b, z) = \sum_{i=1}^{n} i b_i z^{i-1} = p'(z)$. Because $p'(z)$ is invertible, the implicit function theorem guarantees the existence of a neighborhood $U$ of $b$ and a unique continuous function $r : U \to \mathbb{F}$ such that $r(b) = z$ and such that $P(a, r(a)) = 0$ for all $a \in U$. Moreover, we have
\[ \frac{\partial r}{\partial a_i}(b) = -\frac{z^i}{p'(z)}. \]
Combining this with (7.28) shows that the relative condition number of $r$ as a function of the $i$th coefficient $a_i$ is given by (7.29). □

Example 7.5.10. If the derivative $p'(z)$ is small, relative to the coefficient $b_i$, then the root-finding problem is ill conditioned. A classic example of this is the Wilkinson polynomial
\[ w(x) = \prod_{r=1}^{20} (x - r) = x^{20} - 210x^{19} + 20615x^{18} - \cdots. \]
Perturbing the polynomial by changing the $x^{19}$-coefficient from $-210$ to $-210.0000001$ changes the roots substantially, as shown in Figure 7.3. This is because the derivative $p'(z)$ is small for roots like $z = 15$, relative to $z^{18}b_{19}$, where $b_{19}$ is the coefficient of $x^{19}$. For example, at $z = 15$ we have
\[ \kappa = \left| \frac{15^{18}(-210)}{p'(15)} \right| \approx 3.0 \times 10^{10}. \]
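The perturbation shown in Figure 7.3 is easy to reproduce. The sketch below (our own illustration) builds the coefficients of $w(x)$, perturbs the $x^{19}$ coefficient by $10^{-7}$, and compares the computed roots.

```python
import numpy as np

coeffs = np.poly(np.arange(1, 21))   # coefficients of w(x) = (x-1)(x-2)...(x-20)
perturbed = coeffs.copy()
perturbed[1] -= 1e-7                 # x^19 coefficient: -210 -> -210.0000001

roots = np.sort_complex(np.roots(coeffs))
roots_perturbed = np.sort_complex(np.roots(perturbed))
for r, rp in zip(roots, roots_perturbed):
    print(f"{r.real:6.2f}  ->  {rp:.4f}")
```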

7.5.3 Condition Number of a Matrix


In this section we compute the condition number for problems of the form $A\mathbf{x} = \mathbf{b}$. There are several cases to consider:

(i) Given $A \in M_n(\mathbb{F})$, what is the relative condition number of $f(\mathbf{x}) = A\mathbf{x}$?

(ii) Given $\mathbf{x} \in \mathbb{F}^n$, what is the relative condition number of $g(A) = A\mathbf{x}$?

(iii) Given $A \in M_n(\mathbb{F})$, what is the relative condition number of $h(\mathbf{b}) = A^{-1}\mathbf{b}$?

Although the relative condition numbers of these three cases are not identical, they are all bounded by the number $\|A\|\|A^{-1}\|$, and this is the best uniform bound, as we show below.

Figure 7.3. The blue dots are the roots of the Wilkinson polynomial $w(x)$ plotted in the complex plane. The red crosses are the roots of the polynomial perturbed by $10^{-7}$ in the $x^{19}$-coefficient. As described in Example 7.5.10, the roots are very sensitive to tiny variations in this coefficient because the relative condition number is very large.

Theorem 7.5.11.

(i) If $A \in M_n(\mathbb{F})$ is nonsingular, the relative condition number of $f(\mathbf{x}) = A\mathbf{x}$ satisfies
\[ \kappa = \frac{\|A\|\|\mathbf{x}\|}{\|A\mathbf{x}\|} \le \|A\|\|A^{-1}\|. \tag{7.30} \]
Moreover, if the norm $\|\cdot\|$ is the 2-norm, then equality holds when $\mathbf{x}$ is a right singular vector of $A$ corresponding to the minimal singular value.

(ii) Given $\mathbf{x} \in \mathbb{F}^n$, the relative condition number of $g(A) = A\mathbf{x}$ satisfies
\[ \kappa = \frac{\|\mathbf{x}\|\|A\|}{\|A\mathbf{x}\|} \le \|A\|\|A^{-1}\| \tag{7.31} \]
if $A$ is nonsingular. Moreover, for the 2-norm, equality holds when $\mathbf{x}$ is a right singular vector of $A$ corresponding to the minimal singular value.

(iii) If $A \in M_n(\mathbb{F})$ is nonsingular, the relative condition number of $h(\mathbf{b}) = A^{-1}\mathbf{b}$ satisfies
\[ \kappa = \frac{\|A^{-1}\|\|\mathbf{b}\|}{\|A^{-1}\mathbf{b}\|} \le \|A\|\|A^{-1}\|. \tag{7.32} \]
Moreover, for the 2-norm, equality holds when $\mathbf{b}$ is a left singular vector of $A$ corresponding to the maximal singular value.

Proof. For (i) use (7.28) to find
\[ \kappa = \frac{\|Df(\mathbf{x})\|}{\|A\mathbf{x}\|/\|\mathbf{x}\|} = \frac{\|A\|\|\mathbf{x}\|}{\|A\mathbf{x}\|}. \]
To get the upper bound, substitute $\mathbf{x} = A\mathbf{y}$ into the definition of $\|A^{-1}\|$ to get
\[ \|A^{-1}\| = \sup_{\mathbf{x}} \frac{\|A^{-1}\mathbf{x}\|}{\|\mathbf{x}\|} = \sup_{\mathbf{y}} \frac{\|A^{-1}A\mathbf{y}\|}{\|A\mathbf{y}\|} = \sup_{\mathbf{y}} \frac{\|\mathbf{y}\|}{\|A\mathbf{y}\|}. \]
This gives $\|A^{-1}\| \ge \|\mathbf{y}\|/\|A\mathbf{y}\|$ for all $\mathbf{y}$, from which we get (7.30). See Exercise 7.27 for the proof that equality occurs when $\mathbf{x}$ is a right singular vector associated to the minimal singular value.

For (ii) it is straightforward to verify that for any $H \in M_n(\mathbb{F})$ we have that $Dg(A)H = H\mathbf{x}$. Exercise 3.29 gives $\|Dg(A)\| = \|\mathbf{x}\|$, so (7.28) gives
\[ \kappa = \frac{\|Dg(A)\|\|A\|}{\|g(A)\|} = \frac{\|\mathbf{x}\|\|A\|}{\|A\mathbf{x}\|} \le \|A\|\|A^{-1}\|. \]
The same argument as for (i) gives equality for a singular vector associated to the minimal singular value.

Finally, substituting $A^{-1}$ for $A$ in (i) gives (iii), where equality holds for a right singular vector of $A^{-1} = V\Sigma^{-1}U^H$ associated to the minimal singular value $1/\sigma_1$ of $A^{-1}$, which is a left singular vector of $A$ associated to the maximal singular value $\sigma_1$. □

The previous theorem inspires the following definition.

Definition 7.5.12. Let $A \in M_n(\mathbb{F})$. The condition number of $A$ is
\[ \kappa(A) = \|A\|\|A^{-1}\|. \]

Nota Bene 7.5.13. Although $\kappa(A)$ is called the condition number of the matrix $A$, it is not the condition number (as given in Definition 7.5.4) of most problems associated to $A$. Rather, it is the supremum of the condition numbers of each of the various problems in Theorem 7.5.11; in other words, it is a sharp uniform bound for each of those condition numbers. Also, the problem of finding eigenvalues of $A$ has an entirely different condition number (see Section 7.5.4).
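Numerically, $\kappa(A)$ in the 2-norm can be computed directly from the definition or with NumPy's built-in routine (np.linalg.cond uses the 2-norm by default); the snippet below is our own illustration.

```python
import numpy as np

A = np.array([[1.0, 1000.0],
              [0.001, 1.0]])

print(np.linalg.cond(A))                                            # 2-norm condition number
print(np.linalg.norm(A, 2) * np.linalg.norm(np.linalg.inv(A), 2))   # same value, by definition
```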

7.5.4 *Condition Number of Finding Simple Eigenvalues

Given a matrix $A \in M_n(\mathbb{F})$ and any simple eigenvalue of $A$, we show, using the implicit function theorem, that a continuous deformation of the entries of $A$ continuously deforms the eigenvalue. We are also interested in the condition number of the problem of finding the eigenvalue.

Proposition 7.5.14. Let $\lambda$ be a simple eigenvalue (that is, with algebraic multiplicity 1) of $A$ with right eigenvector $\mathbf{x}$ and left eigenvector $\mathbf{y}^H$, both of norm 1. Fix $E \in M_n(\mathbb{F})$ with $\|E\| = 1$. There is a neighborhood $U$ of $0 \in \mathbb{R}$ and continuously differentiable functions $\boldsymbol{\xi}(t)$, $\mu(t)$ defined on $U$ with $\boldsymbol{\xi}(0) = \mathbf{x}$ and $\mu(0) = \lambda$ such that $\mu(t)$ is an eigenvalue of $A + tE$ with eigenvector $\boldsymbol{\xi}(t)$.

Moreover, the absolute condition number of $\mu(t)$ at $t = 0$ satisfies
\[ \hat{\kappa} \le \frac{1}{|\mathbf{y}^H\mathbf{x}|}, \tag{7.33} \]
and equality holds if $E = \mathbf{y}\mathbf{x}^H$.

Proof. Let $F : \mathbb{F} \times \mathbb{F}^n \times \mathbb{C} \to \mathbb{F}^n \times \mathbb{R}$ be given by
\[ F(t, \boldsymbol{\xi}, \mu) = \begin{bmatrix} (A + tE)\boldsymbol{\xi} - \mu\boldsymbol{\xi} \\ \boldsymbol{\xi}^H\boldsymbol{\xi} - 1 \end{bmatrix}. \]
We have $F(t, \boldsymbol{\xi}, \mu) = \begin{bmatrix} \mathbf{0} & 0 \end{bmatrix}^{\mathsf{T}}$ if and only if $\|\boldsymbol{\xi}\| = 1$ and $\boldsymbol{\xi}$ is an eigenvector of $A + tE$ with eigenvalue $\mu$. In particular, we have $F(0, \mathbf{x}, \lambda) = \mathbf{0}$. Computing the derivative with respect to the last two coordinates $\boldsymbol{\xi}, \mu$, we have
\[ D_{\boldsymbol{\xi},\mu}F = \begin{bmatrix} A + tE - \mu I & -\boldsymbol{\xi} \\ 2\boldsymbol{\xi}^H & 0 \end{bmatrix}. \]
Evaluating at $(0, \mathbf{x}, \lambda)$ gives
\[ D_{\boldsymbol{\xi},\mu}F(0, \mathbf{x}, \lambda) = \begin{bmatrix} A - \lambda I & -\mathbf{x} \\ 2\mathbf{x}^H & 0 \end{bmatrix}. \]
Exercise 7.31 shows that $D_{\boldsymbol{\xi},\mu}F(0, \mathbf{x}, \lambda)$ is invertible. The implicit function theorem now applies to guarantee the existence of a differentiable function $f(t) = (\boldsymbol{\xi}(t), \mu(t)) \in \mathbb{F}^n \times \mathbb{C}$, defined in a neighborhood $U$ of $t = 0$ such that
\[ F(t, \boldsymbol{\xi}(t), \mu(t)) = \begin{bmatrix} \mathbf{0} \\ 0 \end{bmatrix} \tag{7.34} \]
for all $t \in U$, and thus $\mu(t)$ is an eigenvalue of $A + tE$ with eigenvector $\boldsymbol{\xi}(t)$.

Differentiating (7.34) with respect to $t$ at $t = 0$ gives
\[ \begin{bmatrix} (E - \mu'(0))\mathbf{x} + (A - \lambda I)\boldsymbol{\xi}'(0) \\ 2\mathbf{x}^H\boldsymbol{\xi}'(0) \end{bmatrix} = \begin{bmatrix} \mathbf{0} \\ 0 \end{bmatrix}. \]
Multiplication of the top row on the left by $\mathbf{y}^H$ gives
\[ \mathbf{y}^H(E - \mu'(0))\mathbf{x} + \mathbf{y}^H(A - \lambda I)\boldsymbol{\xi}'(0) = \mathbf{y}^H E\mathbf{x} - \mu'(0)\mathbf{y}^H\mathbf{x} = 0, \]
which gives
\[ \hat{\kappa} = |\mu'(0)| = \left| \frac{\mathbf{y}^H E\mathbf{x}}{\mathbf{y}^H\mathbf{x}} \right| \le \frac{1}{|\mathbf{y}^H\mathbf{x}|}, \]
where the last inequality follows from the fact that $\|E\| = 1$. In the special case that $E = \mathbf{y}\mathbf{x}^H$, it is immediate that the inequality is an equality. □

As a corollary, we see that finding simple eigenvalues of normal matrices is


especially well conditioned.

Corollary 7.5.15. If $A$ is normal, then the absolute condition number of finding any simple eigenvalue is no greater than 1.

Proof. If $A$ is normal, then the second spectral theorem (or rather the analogue of Corollary 4.4.9 for normal matrices) guarantees that there is an orthonormal eigenbasis of $A$. Since $\lambda$ is simple, one of the basis elements $\mathbf{x}$ corresponds to $\lambda$, and by Remark 4.3.19 $\mathbf{x}^H$ is a corresponding left eigenvector. Thus, $\mathbf{y}^H\mathbf{x} = \mathbf{x}^H\mathbf{x} = 1$, and $\hat{\kappa} \le 1$. □

Example 7.5.16. When a matrix $A$ is not normal, calculation of eigenvalues can be ill conditioned. For example, consider the matrix
\[ A = \begin{bmatrix} 1 & 1000 \\ 0.001 & 1 \end{bmatrix}, \]
which has eigenvalues $\{0, 2\}$. Take as right eigenvector $\mathbf{x} = \begin{bmatrix} 1 & -0.001 \end{bmatrix}^{\mathsf{T}}$ and left eigenvector $\mathbf{y}^H = \begin{bmatrix} -0.001 & 1 \end{bmatrix}$, and note that $\|\mathbf{x}\|_\infty = \|\mathbf{y}\|_\infty = 1$. Setting
\[ E = \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix}, \]
we have by Proposition 7.5.14 that there is a continuous function $\mu(t)$ such that $\mu(0) = 0$ and $\mu(t)$ is an eigenvalue of $A + tE$ near $t = 0$. Moreover, (7.33) gives the absolute condition number $\hat{\kappa}$ for $\mu$ at $t = 0$ as
\[ \hat{\kappa} = \left| \frac{\mathbf{y}^H E\mathbf{x}}{\mathbf{y}^H\mathbf{x}} \right| = \frac{1}{2 \times 10^{-3}} = 500. \]
The ill conditioning is illustrated by looking at the change in $\mu$ from $t = 0$ to $t = -0.001$. When $t = -0.001$, the matrix
\[ A + tE = \begin{bmatrix} 1 & 1000 \\ 0 & 1 \end{bmatrix} \]
has a double eigenvalue $\mu(-0.001) = 1$; so a small deformation in $t$ results in a relatively large change in $\mu$.
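This computation is easy to verify numerically; the sketch below (our own illustration) compares the eigenvalues of $A$ and $A + tE$ at $t = -0.001$.

```python
import numpy as np

A = np.array([[1.0, 1000.0],
              [0.001, 1.0]])
E = np.array([[0.0, 0.0],
              [1.0, 0.0]])

print(np.linalg.eigvals(A))              # approximately [2., 0.]
print(np.linalg.eigvals(A - 0.001 * E))  # double eigenvalue at 1
```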

Vista 7.5.17. In Chapter 14 we discuss pseudospectra of matrices. One of


the many uses of pseudospectra is that they help us better understand and
quantify some of the problems associated with ill conditioning of eigenvalues
for nonnormal matrices.

Exercises
Note to the student: Each section of this chapter has several corresponding
exercises, all collected here at the end of the chapter. The exercises between the
first and second line are for Section 1, the exercises between the second and third
lines are for Section 2, and so forth.
You should work every exercise (your instructor may choose to let you skip
some of the advanced exercises marked with *). We have carefully selected them,
and each is important for your ability to understand subsequent material. Many of
the examples and results proved in the exercises are used again later in the text.
Exercises marked with .& are especially important and are likely to be used later
in this book and beyond. Those marked with t are harder than average, but should
still be done.
Although they are gathered together at the end of the chapter, we strongly
recommend you do the exercises for each section as soon as you have completed
the section, rather than saving them until you have finished the entire chapter. The
exercises for each section are separated from those for other sections by a
horizontal line.

7.1. Consider the closed, complete metric space $X = [0, \infty)$ with the usual metric $d(x, y) = |x - y|$. Show that the map
$$f(x) = \frac{x + \sqrt{x^2 + 1}}{2}$$
satisfies $d(f(x), f(y)) < d(x, y)$ for all $x \ne y \in X$ and yet has no fixed point. Why doesn't this violate the contraction mapping principle?
7.2. Consider the sequence $(x_n)_{n=0}^\infty$ defined by the recursion $x_n = \sqrt{a + x_{n-1}}$, with $a > 1$ and $x_0 = 1$. Prove that
$$\lim_{n\to\infty} x_n = \frac{1 + \sqrt{1 + 4a}}{2}.$$
Hint: Show that the function $f(x) = \sqrt{a + x}$ is a contraction on the set $[0, \infty)$, with the usual metric.
7.3. Let $f: \mathbb{R}^n \to \mathbb{R}^n$ be a continuous function satisfying $\|f(x)\| < k\|x\|$ for some $k \in (0, 1)$ and for every $x \ne 0$. For some initial point $x_0 \in \mathbb{R}^n$ define the sequence $(x_n)_{n=0}^\infty$ recursively by the rule $x_{n+1} = f(x_n)$. Show that $x_n \to 0$. Hint: First prove that the sequence must have a limit, and show that if $x$ is the limit of the sequence, then $f(x)$ is also. Use this to show that if $x \ne 0$, it cannot be the limit.
7.4. Let $(X, d)$ be a metric space and $f : X \to X$ a contraction with constant $K$. Prove for $x, y \in X$ that
$$d(x, y) \le \frac{1}{1 - K}\bigl(d(x, f(x)) + d(y, f(y))\bigr).$$
Use this to prove that a contraction mapping can have at most one fixed point.

7.5. Let $(X, d)$ be a metric space and $f : X \to X$ a contraction with constant $K < 1$. Using the inequality in the previous problem, prove that for any $x \in X$ the sequence $(f^n(x))_{n=0}^\infty$ satisfies the inequality
$$d(f^m(x), f^n(x)) \le \frac{K^m + K^n}{1 - K}\, d(x, f(x)).$$
Use this to prove that $(f^n(x))_{n=0}^\infty$ is a Cauchy sequence. Finally, prove that if $\overline{x}$ is the limit of this sequence, then $f(\overline{x}) = \overline{x}$, and for any integer $n > 0$
$$d(f^n(x), \overline{x}) \le \frac{K^n}{1 - K}\, d(x, f(x)).$$
7.6. Prove Theorem 7.1.14. Hint: This follows the same ideas used to prove Theorem 7.1.8; define a sequence via the method of successive approximations and prove convergence.

7.7. Let $f: \mathbb{R} \times \mathbb{R} \to \mathbb{R}$ be defined by $f(x, y) = \cos(\cos(x)) + y$. Show that $f(x, y)$ is a $C^\infty$ uniform contraction mapping. Hint: Consider using the mean value theorem and the fact that $|\sin(x)| \le \sin(1) < 1$ for all $x \in [-1, 1]$.
7.8. Let $C_b([0,\infty);\mathbb{R}) = \{g \in C([0,\infty);\mathbb{R}) : \|g\|_{L^\infty} < \infty\}$ be the set of continuous functions with bounded sup norm. Given $\mu > 0$ let $T_\mu : C_b([0,\infty);\mathbb{R}) \times C_b([0,\infty);\mathbb{R}) \to C([0,\infty);\mathbb{R})$ be given by
$$(x, f) \mapsto T_\mu f[x](t) = f(t) + \frac{\mu}{2}\int_0^t e^{-\mu s} x(s)\, ds.$$
(i) Prove that $(C_b([0,\infty);\mathbb{R}), \|\cdot\|_{L^\infty})$ is a Banach space. Hint: The proof of Theorem 5.6.8 may be useful.
(ii) Prove that the image of $T_\mu f$ lies in $C_b([0,\infty);\mathbb{R})$ for every $f \in C_b([0,\infty);\mathbb{R})$.
(iii) Prove that $T_\mu : C_b([0,\infty);\mathbb{R}) \times C_b([0,\infty);\mathbb{R}) \to C_b([0,\infty);\mathbb{R})$ is a uniform contraction.
7.9. Let $(X, \|\cdot\|_X)$ be a Banach space and $A \in \mathscr{B}(X)$ an operator with $\|A\| < 1$, where $\|\cdot\|$ is the induced norm on $\mathscr{B}(X)$.
(i) Show that $f: X \times X \to X$ defined by $f(x, y) = Ax + y$ is a $C^\infty$ uniform contraction mapping.
(ii) Find the function $g(y)$ that sends each $y$ to the unique fixed point of the corresponding contraction $f(\cdot, y)$, and verify that $g : X \to X$ is $C^\infty$.
(iii) Verify that the map $T_\mu$ of the previous exercise is of the form $Ax + y$ with $\|A\| < 1$.
7.10. For $a \in \mathbb{R}$, define the weighted norm on $C([0,\infty);\mathbb{R})$ as
$$\|f\|_a = \sup_{t \ge 0}\,\bigl\{ e^{at}|f(t)| \bigr\}$$
and the corresponding function space
$$C_a = \{f \in C([0,\infty);\mathbb{R}) : \|f\|_a < \infty\}.$$


Essentially the same argument as in Exercise 7.8(i) shows that $(C_a, \|\cdot\|_a)$ is a Banach space.
(i) Fix $z \in C_a$, and define a map $T_z: C_a \times C_a \to C([0,\infty);\mathbb{R})$ by
$$(x, f) \mapsto T_{z,f}[x](t) = z(t) + f(t)\int_0^\infty e^{(a-1)s} x(s)\, ds.$$
Prove that for each $f \in C_a$ the map $T_{z,f}$ is an operator on $C_a$.
(ii) Fix $k < 1$ and let $B = \{f \in C_a : \|f\|_a < k\}$. Prove that $T_z: C_a \times B \to C_a$ is a uniform contraction.
7.11. Suppose $(X, \|\cdot\|_X)$ and $(Y, \|\cdot\|_Y)$ are Banach spaces, $U \subset X$ and $V \subset Y$ are open, and the function $f : U \times V \to U$ is a uniform contraction with constant $0 \le \lambda < 1$. In addition, suppose there exists a $C$ such that
$$\|f(x, b_1) - f(x, b_2)\|_X \le C\|b_1 - b_2\|_Y$$
for all $b_1, b_2 \in V$ and all $x \in U$. Show that the function $g : V \to U$ that sends each $y \in V$ to the unique fixed point of the contraction is Lipschitz.
7.12.* Let $D$ be a closed subset of a Banach space, and let $f: D \to D$ be continuous. Choose $x_0 \in D$ and let $x_{n+1} = f(x_n)$ for every integer $n \in \mathbb{N}$. Prove that if the series $\sum_{n=0}^\infty \|x_{n+1} - x_n\|$ converges (in $\mathbb{R}$), then the sequence $(x_n)_{n=0}^\infty$ converges in $D$ to a fixed point of $f$.

7.13. (i) Using Newton's method, derive the square root formula (7.15) in
Example 7.3.5.
(ii) Derive the cube root formula (7.16) in Example 7.3.7. Then compute
the long division in (7.17) by hand to get 5 decimals of accuracy.
7.14. The generic proof of convergence in Newton's method requires a sufficiently close initial guess $x_0$. Prove that the square root solver in (7.15) converges as long as $x_0 \ne 0$.
7.15. Although Newton's method is very fast in most cases, there are situations where it converges very slowly. Suppose that
$$f(x) = \begin{cases} e^{-1/x^2} & \text{if } x \ne 0,\\ 0 & \text{if } x = 0.\end{cases}$$
The function $f$ can be shown to be $C^\infty$, and $0$ is the only solution of $f(x) = 0$. Show that if $x_0 = 0.0001$, it takes more than one hundred million iterations of the Newton method to get below $0.00005$. Prove, moreover, that the closer $x_n$ is to $0$, the slower the convergence. Why doesn't this violate Theorem 7.3.4?
7.16. Let $F: \mathbb{R}^2 \to \mathbb{R}^2$ be given as
$$F(x, y) = \begin{bmatrix} x - y^2 + 8 + \cos y \\ y - x^2 + 9 + 2\cos x \end{bmatrix}.$$
Using the initial guess $x_0 = (\pi, \tfrac{\pi}{2})$, compute the first iteration of Newton's method by hand.

7.17. Determine whether the previous sequence converges by checking the Newton–Kantorovich bound given in (7.20). Hint: To compute the operator norms $\|DF(x_0)^{-1}\|_2$ and $\|DF(x) - DF(y)\|_2$ you may wish to use the results of Exercise 3.28.

7.18. Consider the points $(x, y) \in \mathbb{R}^2$ on the curve $\cos(xy) = 1 + \tan(y)$. Find conditions on $x$ and $y$ that guarantee $x$ is locally a function of $y$, and find $dx/dy$.
7.19. Find the total Fréchet derivative of $z$ as a function of $x$ and $y$ on the surface $\{(x, y, z) \in \mathbb{R}^3 \mid x^2 + xyz + 4x^5z^3 + 6z + y = 0\}$ at the origin.
7.20. Show that the equations
$$\sin(x + z) + \ln(yz^2) = 0,$$
$$e^{x+z} + yz = 0$$
implicitly define $C^1$ functions $x(z)$ and $y(z)$ in a neighborhood of the point $(1, 1, -1)$.
7.21. Show that there exist $C^1$ functions $f(x, y)$ and $g(x, y)$ in a neighborhood of $(0, 1)$ such that $f(0, 1) = 1$, $g(0, 1) = -1$, and
$$f(x, y)^3 + x\,g(x, y) - y = 0,$$
$$g(x, y)^3 + y\,f(x, y) - x = 0.$$

7.22. The principal inverse secant function $f(x) = \operatorname{Arcsec}(x)$ has domain $(-\infty, -1) \cup (1, \infty)$ and range $(0, \pi/2) \cup (\pi/2, \pi)$. Using only the derivative of the secant, basic trigonometric properties, and the inverse function theorem, prove that the derivative of $f$ is
$$\frac{df}{dx} = \frac{1}{|x|\sqrt{x^2 - 1}}.$$
7.23. Let $S: M_2(\mathbb{R}) \to M_2(\mathbb{R})$ be given by $S(A) = A^2$. Does $S$ have a local inverse in a neighborhood of the identity matrix? Justify your answer.
7.24.* Denote the functions $f : \mathbb{R}^2 \to \mathbb{R}^2$ and $g : \mathbb{R}^2 \to \mathbb{R}^2$ in the standard bases as
$$f(x) = (f_1(x_1, x_2), f_2(x_1, x_2)),$$
$$g(x) = (g_1(x_1, x_2), g_2(x_1, x_2)),$$
where $x = (x_1, x_2)$. Prove: If $f$ and $g$ are $C^1$ and satisfy $f(g(y)) = y$ for all $y$, then
$$\frac{\partial g_1}{\partial y_1} = J^{-1}\frac{\partial f_2}{\partial x_2}, \qquad \frac{\partial g_1}{\partial y_2} = -J^{-1}\frac{\partial f_1}{\partial x_2},$$
$$\frac{\partial g_2}{\partial y_1} = -J^{-1}\frac{\partial f_2}{\partial x_1}, \qquad \frac{\partial g_2}{\partial y_2} = J^{-1}\frac{\partial f_1}{\partial x_1},$$
where $J = \det\dfrac{\partial(f_1, f_2)}{\partial(x_1, x_2)}$; see (7.24) in Example 7.4.7.
7.25.* Work out the details of Example 7.4.7, showing that Cramer's rule gives the
formulas for the derivatives.

7.26. Find the relative condition number at each $x_0$ in the domain of the following functions:
(i) $e^x$.
(ii) $\ln(x)$.
(iii) $\cos(x)$.
(iv) $\tan(x)$.
7.27. Finish the proof of Theorem 7.5.11 by showing that in the 2-norm, if x is
a right singular vector of A associated to the minimal singular value, then
equality holds in (7.30).
7.28. Given $(x, y) \in \mathbb{C}^2$, consider the problem of finding $z$ such that $x^2 + y^2 - z^3 + z = 0$. Find all the points $(x, y)$ for which $z$ is locally a function of $(x, y)$. For a fixed value of $y$, find the relative condition number of $z$ as a function of $x$. What is this relative condition number near the point $(x, y) = (0, 0)$?
7.29. Give an example of a matrix $A$ with condition number $\kappa(A) > 1000$ (assuming the 2-norm). Give an example of a matrix $B$ with condition number $\kappa(B) = 1$. Are there any matrices with condition number less than 1? If so, give an example. If not, prove it.
7.30. Proposition 7.5.9 gives sufficient conditions for the roots of a polynomial to be a continuous function of the coefficients.
(i) Consider the roots of the polynomial $f(a, x) = x^2 + a$, where $a, x \in \mathbb{R}$. If $a = 0$, then $x_0 = 0$ is a root, but if $a > 0$, then $f$ has no roots in $\mathbb{R}$. Why doesn't this contradict Proposition 7.5.9?
(ii) The quadratic formula gives an explicit formula for all the roots of a quadratic polynomial as a function of the coefficients. There is a similar but much more complicated formula for roots of cubic and quartic polynomials, but Abel's theorem guarantees that there is no general algebraic formula for the roots of a polynomial of degree 5 or greater. Why doesn't this contradict Proposition 7.5.9?
7.31.* Prove that the derivative $D_{\xi,\mu}F(0, x, \lambda)$ of
$$F(t, \xi, \mu) = \begin{bmatrix} (A + tE)\xi - \mu\xi \\ \xi^H\xi - 1 \end{bmatrix},$$
as described in the proof of Proposition 7.5.14, is invertible at $t = 0$. Hint: Schur's lemma guarantees the existence of an orthonormal matrix $Q$ such that $Q^H A Q$ is upper triangular (of course, with all the eigenvalues of $A$ on the diagonal). Conjugate $D_{\xi,\mu}F(0, x, \lambda)$ by
$$\begin{bmatrix} Q & 0 \\ 0 & 1 \end{bmatrix}$$
to get a matrix that you should be able to identify as nonsingular.
Question: How does this rely on the fact that $\lambda$ is a simple eigenvalue?

Notes
For more on the uniform contraction mapping principle and further generalizations,
see [Chi06, Sect. 1.11].
A readable description of the Newton-Kantorovich bound is given in [Ort68].
Kantorovich actually proved the result in two different ways, and an English trans-
lation of his two proofs is given in [Kan52, KA82].
Our treatment of conditioning is inspired by [TB97, Dem97, GVL13]. For
more on conditioning in general, see [TB97, Dem97]. For more on conditioning of
the eigenvalue problem, see [GVL13, Sect. 7.2.2].
Exercises 7.4–7.5 are based on the paper [Pal07]. Exercise 7.15 comes from [Ans06].
Part III

Nonlinear Analysis II
Integration I

I'm not for integration, and I'm not against it.


-Richard Pryor

Integration is an essential tool in applied mathematics. It is a fundamental operation


in many problems arising in physics and mechanics, and it is also central to the
modern treatment of probability.
Most treatments of modern integration begin with a long treatment of measure
theory and then use that to develop the theory of integration. We take a completely
different approach, based on some of the topological ideas developed earlier in this
text. Our approach is unusual, but it has many advantages. We feel it is more
natural and direct. For many students, this approach is more intuitive than the
traditional treatments of Lebesgue integration. It also avoids most of the work
involved in developing measure theory and instead leverages and reinforces many
of the ideas developed in the first half of this text.
We begin our treatment of integration in this chapter by extending the defini-
tion of the regulated integral (Section 5.10) to multivariable functions taking values
in a Banach space X. Recall that in Section 5.10 we defined the single-variable
integral by defining the integral in the "obvious" way on step functions, and then
using the continuous linear extension theorem (Theorem 5.7.6) to show that there is
a unique linear extension of the integral operator from step functions to the closure
(with respect to the L 00 -norm) of the space of step functions. The construction
of the multivariable regulated integral is essentially identical to the single-variable
case. The hardest part is just defining and keeping track of notation in the definition
of higher-dimensional step functions.
Just as in the single-variable case, continuous functions lie in the closure of the
space of step functions, so this gives a definition of the integral, called the regulated
integral for all continuous functions (on a compact interval). The regulated integral
agrees with the Riemann integral whenever it is defined, but, unfortunately, both
the Riemann and the regulated integrals have some serious drawbacks. The most
important of these drawbacks is that these integrals do not behave well with respect
to limits if the limits are not uniform. It is often desirable to move a limit past the integral sign (from $\lim_{n\to\infty} \int f_n$ to $\int \lim_{n\to\infty} f_n$) even when the limit is not uniform.


But unfortunately the function limn-+oo f n need not be Riemann integrable, even if
the individual terms f n are all Riemann integrable.
Instead of limits in the $L^\infty$-norm (uniform limits), we often need to consider limits in the $L^1$-norm (given by $\|f\|_{L^1} = \int \|f\|$). The space of Riemann-integrable functions is not complete in the $L^1$-norm, so there are $L^1$-Cauchy sequences that
do not converge in this space. To remedy this, we need to extend the space to a
larger one that is complete with respect to the L 1 -norm , and then we can use the
continuous linear extension theorem (Theorem 5.7.6) to extend the integral to this
larger space. The result of all this is a much more general theory of integration. 42
If the functions being integrated take values in X = JR, then this construction is
known as the Daniell integral. We will usually call it the Lebesgue integral, however,
because it is equivalent to the Lebesgue integral, and that name is more familiar to
most mathematicians. If X is a more general Banach space, then this construction is
called (or, rather, is equivalent to) the Bochner integral. For simplicity, we restrict
ourselves to the case of X = JR for most of this chapter and the next, but much of
what we do works just as well when X is a general Banach space.
It is important to keep in mind that on a bounded set all Riemann-integrable
and regulated-integrable functions are also Lebesgue integrable, and for these func-
tions the Riemann and regulated integrals are the same as the Lebesgue integral.
Thus, to compute the Lebesgue integral of a continuous function on a compact set,
for example, we can just use the usual techniques for finding the regulated integral
of that function.
In this chapter we begin by extending the definition of the regulated integral
to multivariable functions, and then by giving an overview of the main ideas and
theorems of Lebesgue integration (in the next chapter we give the details and com-
plete the proofs). The majority of the chapter is devoted to some of the main tools
of integration. These include three important convergence theorems (the monotone
convergence theorem, Fatou's lemma, and the dominated convergence theorem),
Fubini's theorem, and a generalized change of variables formula. The three conver-
gence theorems give useful conditions for when limits commute with integration.
Fubini's theorem gives a way to convert multivariable integrals into several single-
variable integrals (iterated integrals), which can then be evaluated in the usual ways.
Fubini also shows us how to change the order in which those iterated single-variable
integrals are evaluated, and in many situations changing the order of integration can
greatly simplify the problem. Finally, the multivariable change of variables formula
is analogous to the single-variable version (Corollary 6.5 .6) and is also very useful.

8.1 Multivariable Integration


Recall that in Section 5.10 we define the single-variable integral by first defining
integration in the obvious way for step functions, and then the continuous linear
extension theorem (Theorem 5.7.6) gives a unique way to extend that definition
to the closure of the space of step functions, with respect to the L 00 -norm. The
same construction works for multivariable integration, once we have a definition of
$n$-dimensional intervals and step functions on $\mathbb{R}^n$.

⁴² The approach to integration in these next two chapters is inspired by, but is perhaps even more unusual than, what we did with integration in Chapter 5 (see Nota Bene 5.10.1).

8.1.1 Multivariable Step Functions


Step functions from a compact interval [a, b] C JR to a Banach space (X, II · II)
provide the key tool in the definition of the single-variable regulated integral. In
this subsection we generalize these to multivariable step functions . To do this we
must first define the idea of a multivariable interval.

Definition 8.1.1. Let $a, b \in \mathbb{R}^n$, with $a = (a_1, \ldots, a_n)$ and $b = (b_1, \ldots, b_n)$. We denote by $[a, b]$ the closed $n$-interval (or box)
$$[a, b] = [a_1, b_1] \times \cdots \times [a_n, b_n].$$

To define step functions on $[a, b]$ we must define a generalized subdivision of the $n$-interval $[a, b]$. Recall from Definition 5.10.2 that a subdivision $P$ of an interval $[a, b] \subset \mathbb{R}$ is a finite sequence $a = t^{(0)} < t^{(1)} < \cdots < t^{(k-1)} < t^{(k)} = b$.

Definition 8.1.2. A subdivision $\mathscr{P}$ of the $n$-interval $[a, b] \subset \mathbb{R}^n$ consists of a subdivision $P_i = \{t_i^{(0)} = a_i < t_i^{(1)} < \cdots < t_i^{(k_i - 1)} < t_i^{(k_i)} = b_i\}$ of the interval $[a_i, b_i]$ for each $i \in \{1, \ldots, n\}$ (note that the lengths of the subdivisions in each dimension may vary); see Figure 8.1.

Definition 8.1.3. A subdivision of an $n$-interval $[a, b]$ gives a decomposition of $[a, b]$ into a union of partially open and closed subintervals as follows. Each $t_i^{(j)}$ with $0 < j < k_i$ defines a hyperplane $H_i^{(j)} = \{x \in \mathbb{R}^n \mid x_i = t_i^{(j)}\}$ in $\mathbb{R}^n$, which divides the interval into two regions: a partially open interval
$$\{x \in [a, b] \mid a_i \le x_i < t_i^{(j)}\}$$

Figure 8.1. Partition of an interval $[a, b]$ in $\mathbb{R}^2$, consisting of two one-dimensional partitions: $a_1 = t_1^{(0)} < t_1^{(1)} < t_1^{(2)} < t_1^{(3)} < t_1^{(4)} = b_1$, and $a_2 = t_2^{(0)} < t_2^{(1)} < t_2^{(2)} < t_2^{(3)} = b_2$. See Definition 8.1.2 for details.

Figure 8.2. The hyperplane $H_1^{(3)}$ (green) in $\mathbb{R}^2$ defined by the point $t_1^{(3)}$ divides the interval into the union of a partially open interval (blue) and a closed interval (red), as described in Definition 8.1.3.

Figure 8.3. A partition of the interval $[a, b]$ in $\mathbb{R}^2$ divides $[a, b]$ into a union of subintervals $R_I$, some partially open and one closed (the upper right interval, containing $b$), as described in Definition 8.1.3.

and a closed interval
$$\{x \in [a, b] \mid t_i^{(j)} \le x_i \le b_i\}$$
(see Figure 8.2).
Repeating this process for each resulting interval (either partially open or closed) and for each hyperplane, we get a decomposition of $[a, b]$ into a union of intervals. Specifically, for each $n$-tuple $I = (i_1, \ldots, i_n) \in \{1, \ldots, k_1\} \times \cdots \times \{1, \ldots, k_n\}$ of indices we get a subinterval $R_I$ (see Figure 8.3).

The interval $[a, b]$ is the union
$$[a, b] = \bigcup_{I \in \mathscr{P}} R_I.$$
To simplify notation we write $I \in \mathscr{P}$ to denote that the index $I$ lies in the product of the indices of $\mathscr{P}$, that is, $I \in \{1, \ldots, k_1\} \times \cdots \times \{1, \ldots, k_n\}$. We use this notation repeatedly when discussing step functions.

Throughout the rest of this section, assume that $(X, \|\cdot\|)$ is a Banach space.

Definition 8.1.4. For any set $E \subset \mathbb{R}^n$ the indicator function $\mathbb{1}_E$ of $E$ is the function
$$\mathbb{1}_E(z) = \begin{cases} 1, & z \in E,\\ 0, & z \notin E.\end{cases}$$
A function $s : [a, b] \to X$ is a step function on $[a, b]$ if there is a subdivision $\mathscr{P}$ of $[a, b]$ such that $s$ can be written in the form
$$s(t) = \sum_{I \in \mathscr{P}} x_I \mathbb{1}_{R_I}(t).$$
Here $x_I \in X$ for each $I \in \mathscr{P}$, and the subintervals $R_I \subset [a, b]$ are determined by the index and the partition, as described in Definition 8.1.3. More generally, for any $E \subset \mathbb{R}^n$ we consider a function $s : E \to X$ to be a step function if it is zero outside of an interval $[a, b]$ and the restriction $s|_{[a,b]}$ to $[a, b]$ is a step function on $[a, b]$.

The proof of the next proposition is essentially identical to its one-dimensional counterpart (Proposition 5.10.3).

Proposition 8.1.5. The set $S([a, b]; X)$ of step functions is a subspace of the normed linear space of bounded functions $(L^\infty([a, b]; X), \|\cdot\|_{L^\infty})$.

8.1.2 Multivariable, Banach-Valued Integration

Now we need to define the $n$-dimensional volume or measure of an $n$-interval. Note that we define the measure of an $n$-interval to be the same regardless of whether the boundary or part of the boundary of the interval is included or not.

Definition 8.1.6. For each $j \in \{1, \ldots, n\}$ let $A_j \subset \mathbb{R}$ be an interval of the form $[a_j, b_j]$, $(a_j, b_j)$, $(a_j, b_j]$, or $[a_j, b_j)$. Let $R \subset \mathbb{R}^n$ be the Cartesian product $R = A_1 \times \cdots \times A_n \subset \mathbb{R}^n$. Define the measure $\lambda(R)$ of the $n$-interval $R$ to be the product of the lengths
$$\lambda(R) = \prod_{j=1}^{n} (b_j - a_j).$$

The definition of the integral of a step function is essentially identical to that


of the one-dimensional case.

Definition 8.1.7. The integral of a step function $s = \sum_{I \in \mathscr{P}} x_I \mathbb{1}_{R_I}$ is
$$\mathscr{I}(s) = \int_{[a,b]} s = \sum_{I \in \mathscr{P}} x_I \lambda(R_I).$$
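As a concrete numerical sketch (assuming NumPy, with a made-up subdivision and made-up real values $x_I$), the integral of a two-dimensional real-valued step function is just the sum of the values weighted by the areas $\lambda(R_I)$:

```python
import numpy as np

# Subdivisions of [a1, b1] = [0, 2] and [a2, b2] = [0, 1] (hypothetical breakpoints).
t1 = np.array([0.0, 0.5, 1.5, 2.0])
t2 = np.array([0.0, 0.25, 1.0])

# One value x_I per subinterval R_I (here X = R, so the values are scalars).
values = np.array([[1.0, 2.0],
                   [3.0, 4.0],
                   [5.0, 6.0]])   # values[i, j] is the value on the (i, j) subinterval

# lambda(R_I) is the product of the side lengths; the integral is sum of x_I * lambda(R_I).
areas = np.outer(np.diff(t1), np.diff(t2))
integral = np.sum(values * areas)
print(integral)
```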

The proof of the next proposition is very similar to the single-variable case (Proposition 5.10.5).

Proposition 8.1.8. For any compact interval $[a, b] \subset \mathbb{R}^n$, the integral operator $\mathscr{I} : S([a, b]; X) \to X$ is a bounded linear transformation with norm $\|\mathscr{I}\| = \lambda([a, b])$.

Proof. The proof is Exercise 8.2. □

By the continuous linear extension theorem (Theorem 5.7.6), the integral extends from step functions to all functions in the closure (in the $L^\infty$-norm) of $S([a, b]; X)$. In particular, since the closure includes continuous functions $C([a, b]; X)$, this defines the integral for all continuous functions. The arguments in the proof of the single-variable Banach-valued integration theorem (Theorem 5.10.6) apply again in the multivariable case to give the following theorem.

Theorem 8.1.9 (Multivariable Banach-Valued Integral). Let $\overline{S}([a, b]; X)$ be the closure of $S([a, b]; X)$ in $L^\infty([a, b]; X)$. The linear transformation $\mathscr{I} : S([a, b]; X) \to X$ can be extended uniquely to a bounded linear transformation $\mathscr{I} : \overline{S}([a, b]; X) \to X$, and
$$\|\mathscr{I}\| = \lambda([a, b]).$$
Moreover, we have
$$C([a, b]; X) \subset \overline{S}([a, b]; X) \subset L^\infty([a, b]; X).$$

Proof. The proof is Exercise 8.3. □

Definition 8.1.10. For any $[a, b] \subset \mathbb{R}^n$ we denote the set $\overline{S}([a, b]; X)$ by $\mathscr{R}([a, b]; X)$. The functions in $\mathscr{R}([a, b]; X)$ are called regulated-integrable functions. For any $f \in \mathscr{R}([a, b]; X)$, we call $\mathscr{I}(f)$, where $\mathscr{I}$ is the linear transformation in Theorem 8.1.9, the integral of $f$, and we usually denote it by
$$\int_{[a,b]} f = \mathscr{I}(f).$$

Proposition 8.1.11. If $f, g \in \mathscr{R}([a, b]; X)$, then the following hold:

(i) $\left\|\int_{[a,b]} f\right\| \le \lambda([a, b]) \sup_{t \in [a,b]} \|f(t)\|$.

(ii) For any sequence $(f_k)_{k=0}^\infty$ in $\mathscr{R}([a, b]; X)$ that converges uniformly to $f$, we have
$$\lim_{k\to\infty} \int_{[a,b]} f_k = \int_{[a,b]} \lim_{k\to\infty} f_k = \int_{[a,b]} f.$$

(iii) Let $\|f\|$ denote the function $t \mapsto \|f(t)\|$ from $[a, b]$ to $\mathbb{R}$. We have
$$\left\|\int_{[a,b]} f\right\| \le \int_{[a,b]} \|f\|.$$

(iv) If $\|f(t)\| \le \|g(t)\|$ for every $t \in [a, b]$, then $\int_{[a,b]} \|f\| \le \int_{[a,b]} \|g\|$.

Proof. The proofs of (i) and (ii) are Exercise 8.5.
Item (iii) holds for step functions by the triangle inequality, and since every $f \in \mathscr{R}([a, b]; X)$ is a uniform limit of step functions, we can use (ii) to conclude that (iii) holds for all $f \in \mathscr{R}([a, b]; X)$.
Finally, if $h \in \mathscr{R}([a, b]; \mathbb{R})$ and if $h(t) \ge 0$ for every $t \in [a, b]$, there is a sequence of step functions $(s_k)_{k=0}^\infty$ that converges uniformly to $h$. Hence for any $\varepsilon > 0$ there is an $N > 0$ such that $\|s_k - h\|_{L^\infty} < \varepsilon/\lambda([a, b])$ whenever $k \ge N$. This implies that $s_k(t) > -\varepsilon/\lambda([a, b])$ for every $t \in [a, b]$ whenever $k \ge N$, and thus
$$\int_{[a,b]} s_k \ge -\varepsilon.$$
Since $\varepsilon > 0$ is arbitrary, we have
$$\int_{[a,b]} h = \lim_{k\to\infty} \int_{[a,b]} s_k \ge 0.$$
Letting $h = \|g\| - \|f\|$ gives (iv). □

Remark 8.1.12. As in the single-variable case, one can easily check that the Riemann construction of the integral defines a bounded linear transformation on $\mathscr{R}([a, b]; X)$ that agrees with our definition on step functions, and hence by the uniqueness part of the continuous linear extension theorem must agree with our construction on all of $\mathscr{R}([a, b]; X)$. The Riemann construction does work for a slightly larger space of functions than $\mathscr{R}([a, b]; X)$; however, we need the integral to be defined on yet more functions, because in applications we must often move limits past integrals, but many limits of Riemann-integrable functions are not Riemann integrable. This is discussed in more depth in Section 8.2.

8.1.3 Integration over subsets of $[a, b]$

We have defined integration over intervals $[a, b]$, but we would like to define it over a more general set $E \subset [a, b]$. To do this, we extend by zero using the indicator function (see Definition 8.1.4) to make a function (denoted $f\mathbb{1}_E$) on all of $[a, b]$.

Definition 8.1.13. For any function $f: E \to X$, the extension of $f$ by zero is the function
$$f\mathbb{1}_E(z) = \begin{cases} f(z), & z \in E,\\ 0, & z \notin E.\end{cases}$$

The obvious way to try to define $\int_E f$ for $f : E \to X$ is to extend $f$ by zero and then integrate over the whole interval to get $\int_{[a,b]} f\mathbb{1}_E$. This definition of integration allows us to define integrals of many functions over many different sets, but there are some important problems. The most immediate problem is the fact that there are many sets for which the indicator function itself is not in $\mathscr{R}([a, b]; \mathbb{R})$.

Unexample 8.1.14. Even the set consisting of a single point $p \in [a, b) \subset \mathbb{R}$ has an indicator function $\mathbb{1}_p$ that is not in $\mathscr{R}([a, b]; \mathbb{R})$. To see this, note first that step functions on any interval $[a, b] \subset \mathbb{R}$ are all right continuous, meaning that
$$\lim_{t \to t_0^+} s(t) = s(t_0)$$
for every $t_0 \in [a, b)$. Moreover, by Exercise 8.4 the uniform limit of a sequence of right-continuous functions is right continuous, and thus every function in $\mathscr{R}([a, b]; \mathbb{R})$ is right continuous. But $\mathbb{1}_p$ is not right continuous, so $\mathbb{1}_p \notin \mathscr{R}([a, b]; \mathbb{R})$.

There are many important sets $E$ and functions $f$ for which we would expect to be able to define the integral $\int_E f$, but for which the regulated integral (and also the more traditional Riemann construction) does not work. In the next section we discuss this problem and its solution in more depth.

8.2 Overview of Daniell–Lebesgue Integration

8.2.1 The Problem

The main problem with the integral, as we have defined it, is that we have not defined it for enough functions. The space of functions that we know how to integrate so far is the space $\mathscr{R}([a, b]; X)$, which includes continuous functions, but does not include any function that is not right continuous (see Unexample 8.1.14).
By definition, $\mathscr{R}([a, b]; X)$ is closed in the $L^\infty$-norm, and it is a subspace of $L^\infty([a, b]; X)$, which is complete by Theorem 5.7.5, so $\mathscr{R}([a, b]; X)$ is also complete. This means that any Cauchy sequence $(f_n)_{n=0}^\infty$ in $\mathscr{R}([a, b]; X)$ converges uniformly to some function $f \in \mathscr{R}([a, b]; X)$, and since integration is continuous with respect to this norm, we have
$$\lim_{n\to\infty} \int_{[a,b]} f_n = \int_{[a,b]} \lim_{n\to\infty} f_n = \int_{[a,b]} f.$$
This is an extremely useful property, but uniform convergence is a very strong property, too strong for many of the applications we are interested in.
We need similar results for sequences and limits that may not converge uniformly, but rather only converge in the $L^1$-norm.

Definition 8.2.1. For any $f \in \mathscr{R}([a, b]; X)$, define $\|f\|_{L^1}$ to be
$$\|f\|_{L^1} = \int_{[a,b]} \|f\|.$$

We call $\|\cdot\|_{L^1}$ the $L^1$-norm.

Proposition 8.2.2. The function $\|\cdot\|_{L^1}$ is a norm on $\mathscr{R}([a, b]; X)$.

Proof. The proof is Exercise 8.6. □

Nota Bene 8.2.3. Although we previously defined the integral for Banach-valued functions, and although most of the results of this chapter hold for general Banach-valued functions, for the rest of the chapter we restrict ourselves to $\mathbb{R}$-valued functions (that is, to the case of $X = \mathbb{R}$) just to keep things simple.
Note that the case of real-valued functions can easily be used to describe the case where $f$ takes values in any finite-dimensional Banach space. For a complex-valued function $f = u + iv$, simply define the integral to be the sum of two real-valued integrals
$$\int_{[a,b]} f = \int_{[a,b]} u + i\int_{[a,b]} v,$$
and for $X = \mathbb{F}^n$ write $f = (f_1, \ldots, f_n)$ and then define
$$\int_{[a,b]} f = \left(\int_{[a,b]} f_1, \ldots, \int_{[a,b]} f_n\right).$$

Integration should be a continuous linear transformation on some space of functions $V$ that is complete in the $L^1$-norm. So, if a sequence $(g_n)_{n=0}^\infty$ in $V$ is Cauchy in the $L^1$-norm, then we need $\lim_{n\to\infty} g_n$ to exist in $V$, and the limit of integrals $\lim_{n\to\infty} \int_{[a,b]} g_n$ ought to be equal to $\int_{[a,b]} \lim_{n\to\infty} g_n$. But, unfortunately, $\mathscr{R}([a, b]; \mathbb{R})$ is not complete in the $L^1$-norm, and neither is the space of Riemann-integrable functions. The solution is to construct a larger space of functions and extend the integral to that larger space.
In this section we briefly sketch the following main ideas:

(i) How to construct a vector space $L^1([a, b]; \mathbb{R})$ containing both $S([a, b]; \mathbb{R})$ and $\mathscr{R}([a, b]; \mathbb{R})$ as subspaces.

(ii) How to define integration and the $L^1$-norm on $L^1([a, b]; \mathbb{R})$ in such a way that

(a) the new definition of integration agrees with the existing definition of integration for $\mathscr{R}([a, b]; \mathbb{R})$;
(b) the new definition of integration is a bounded (continuous) linear transformation with respect to the $L^1$-norm;
(c) the space $L^1([a, b]; \mathbb{R})$ is complete with respect to the $L^1$-norm.

The full proofs of these constructions and their properties are given in Chapter 9.
The full proofs of t hese constructions and their properties are given in Chapter 9.

Nota Bene 8.2.4. If $X = \mathbb{R}$, then the integral, as we have defined it here, is often called the Daniell integral. We will usually call it the Lebesgue integral, however, because it is equivalent to the Lebesgue integral, and that name is more familiar to most mathematicians.
It is also common to use the phrase Lebesgue integration when talking about integration of functions in $L^1([a, b]; \mathbb{R})$, but it is important to note that there is really just one "integration" going on here. In particular, $\mathscr{R}([a, b]; \mathbb{R}) \subset L^1([a, b]; \mathbb{R})$, and Lebesgue integration restricted to $\mathscr{R}([a, b]; \mathbb{R})$ is just what you have always called integration since your first introduction to calculus. That is, Lebesgue integration is essentially just a way to extend regular old Riemann integration to a much larger collection of functions.

8.2.2 Sketch of Daniell–Lebesgue Integration

The basic strategy for constructing the space $L^1([a, b]; \mathbb{R})$ is first to show that every normed linear space $S$ can be embedded uniquely into a complete normed linear space $\widehat{S}$ (called the completion of $S$) in such a way that $S$ is dense in $\widehat{S}$, and then to show that when $S = S([a, b]; \mathbb{R})$, the elements of $L^1([a, b]; \mathbb{R}) = \widehat{S}$ correspond to well-defined functions. The details of this are given in Section 9.1, but we give an overview of the main ideas here.
Given a normed linear space $(S, \|\cdot\|)$, we construct $\widehat{S}$ in two steps (for details see Section 9.1). First, let
$$S' = \{(s_n)_{n=0}^\infty \mid (s_n)_{n=0}^\infty \text{ is Cauchy in } S\}$$
be the set of Cauchy sequences in $S$. The set $S'$ is a vector space with vector addition and scalar multiplication defined as
$$(s_n)_{n=0}^\infty + (t_n)_{n=0}^\infty = (s_n + t_n)_{n=0}^\infty \quad\text{and}\quad \alpha(s_n)_{n=0}^\infty = (\alpha s_n)_{n=0}^\infty.$$
Define $\|\cdot\|_{S'}$ on $S'$ by
$$\|(s_n)_{n=0}^\infty\|_{S'} = \lim_{n\to\infty} \|s_n\|.$$
Unfortunately, $\|\cdot\|_{S'}$ is not a norm because there are nonzero elements $(s_n)_{n=0}^\infty$ of $S'$ that have $\|(s_n)_{n=0}^\infty\|_{S'} = 0$. But the set
$$K = \{(s_n)_{n=0}^\infty \in S' : \|(s_n)_{n=0}^\infty\|_{S'} = 0\}$$
is a vector subspace of $S'$. It is not hard to prove that $\|\cdot\|_{S'}$ defines a norm $\|\cdot\|_{\widehat{S}}$ on the quotient space
$$\widehat{S} = S'/K.$$
The elements of $\widehat{S}$ are equivalence classes of Cauchy sequences, where any two sequences $(s_n)_{n=0}^\infty$ and $(t_n)_{n=0}^\infty$ are equivalent if $\|s_n - t_n\| \to 0$. We prove in Section 9.1 that $(\widehat{S}, \|\cdot\|_{\widehat{S}})$ is complete and that $S$ can be mapped injectively into $\widehat{S}$ by sending the element $s \in S$ to the constant sequence ($s_n = s$ for all $n \in \mathbb{N}$), and the two norms on $S$ clearly agree (that is, $\|s\|_S = \|s\|_{\widehat{S}}$). Moreover, the subspace $S$ is dense in $\widehat{S}$.

We define the space $L^1([a, b]; \mathbb{R})$ to be the completion of $S([a, b]; \mathbb{R})$ in the $L^1$-norm; that is,
$$L^1([a, b]; \mathbb{R}) = \widehat{S}([a, b]; \mathbb{R}).$$
This is guaranteed to be complete, but we have two wrinkles to iron out. The first wrinkle arises from the way the completion is defined. We want $L^1([a, b]; \mathbb{R}) = \widehat{S}([a, b]; \mathbb{R})$ to consist of functions, not equivalence classes of sequences of functions. So, for each element of $L^1([a, b]; \mathbb{R})$, that is, for each equivalence class of $L^1$-Cauchy sequences, we must give a well-defined function. To do this, we take the pointwise limit of a Cauchy sequence in the equivalence class. We prove in Section 9.3 that for each equivalence class there is at least one sequence that converges pointwise to a function. So we have an associated function arising from each element of $L^1([a, b]; \mathbb{R})$.
But the second wrinkle is that two different sequences in the same equivalence class can converge pointwise to different functions (see Unexample 8.2.5). Thus, unfortunately, there is not a well-defined function for each element of $L^1([a, b]; \mathbb{R})$. One instance of this is given in Unexample 8.2.5.
One instance of this is given in Unexample 8.2.5.

Unexample 8.2.5. Let $s_n \in S([0, 1]; \mathbb{R})$ be the characteristic function of the box $[0, 2^{-n})$, and let $(t_n)_{n=0}^\infty$ be the zero sequence ($t_n = 0$ for all $n \in \mathbb{N}$). The sequence $(s_n)_{n=0}^\infty$ converges pointwise to the characteristic function of the singleton set $\{0\}$, but the sequence of $L^1$-norms $\|s_n\|_{L^1}$ converges to $0$. Integrating gives $\|s_n - t_n\|_{L^1} \to 0$, and so these two Cauchy sequences are in the same equivalence class in $L^1([0, 1]; \mathbb{R})$, but their pointwise limits are not the same.

In a strict sense, there is no way to solve this problem: there is always more than one function that can be associated to a given element of $L^1([a, b]; \mathbb{R})$. But the two functions $f$ and $g$ associated to the same element of $L^1([a, b]; \mathbb{R})$ differ only on an infinitesimal set, a set of measure zero. We give a careful definition of measure zero in the next section.
We iron out the second wrinkle by treating two functions as being "the same" if they differ only on a set of measure zero; that is, we define an equivalence relation on the set of all functions by saying two functions are equivalent if they differ on a set of measure zero. In other words, $f$ and $g$ are equivalent if $f - g$ is supported on a set of measure zero. In this case we say $f$ and $g$ are equal almost everywhere, and we write $f = g$ a.e. This allows us to associate a unique equivalence class of functions to each element of $L^1([a, b]; \mathbb{R})$.
The upshot of all this is that we have constructed a vector space $L^1([a, b]; \mathbb{R})$ of functions (or rather a vector space of equivalence classes of functions) and a norm $\|\cdot\|_{L^1}$ on $L^1([a, b]; \mathbb{R})$ such that $L^1([a, b]; \mathbb{R})$ is complete with respect to this norm. Moreover, the set $S([a, b]; \mathbb{R})$ of step functions is a dense subspace of $L^1([a, b]; \mathbb{R})$, and our same old definition of integration of step functions is still a bounded linear transformation on $S([a, b]; \mathbb{R})$ with respect to the $L^1$-norm. So the continuous linear extension theorem guarantees there is a unique way to extend integration to give

a bounded linear operator on all of L 1 ([a, b];IR). This is the Daniell or Lebesgue
integral.

Definition 8.2.6. We say that a function $f : [a, b] \to \mathbb{R}$ is integrable on $[a, b]$ if $f \in L^1([a, b]; \mathbb{R})$, that is, if $f$ is almost everywhere equal to the pointwise limit of an $L^1$-Cauchy sequence of step functions on $[a, b]$.

Nota Bene 8.2.7. Beware that although it is equivalent to the more traditional definition of $L^1([a, b]; \mathbb{R})$, our definition of $L^1([a, b]; \mathbb{R})$ is very different from the definition that you would see in a standard course on integration. In most treatments of integration, $L^1([a, b]; \mathbb{R})$ is defined to be the set of (equivalence classes a.e. of) measurable functions on $[a, b]$ for which $\|f\|_{L^1}$ is finite. But for us, $L^1([a, b]; \mathbb{R})$ is the set of (equivalence classes a.e. of) functions that are almost everywhere equal to the pointwise limit of an $L^1$-Cauchy sequence of step functions on $[a, b]$.

Finally, the following proposition and its corollaries show that every sequence that is Cauchy with respect to the $L^\infty$-norm is also Cauchy with respect to the $L^1$-norm; so $\mathscr{R}([a, b]; \mathbb{R})$ is a subspace of $L^1([a, b]; \mathbb{R})$, and the new definition of integration, when restricted to $\mathscr{R}([a, b]; \mathbb{R})$, agrees with our earlier definition of integration.

Proposition 8.2.8. For any $f \in \mathscr{R}([a, b]; \mathbb{R})$ we have
$$\|f\|_{L^1} \le \lambda([a, b])\|f\|_{L^\infty}. \tag{8.1}$$

Proof. The proof is Exercise 8.7. □

Corollary 8.2.9. If $(f_n)_{n=0}^\infty$ is a sequence in $\mathscr{R}([a, b]; \mathbb{R})$ converging uniformly to $f$, then $(f_n)_{n=0}^\infty$ also converges to $f$ in the $L^1$-norm.

Proof. The proof is Exercise 8.8. □

Corollary 8.2.10. For any normed linear space $X$, if $T: \mathscr{R}([a, b]; \mathbb{R}) \to X$ (or $T: S([a, b]; \mathbb{R}) \to X$) is a bounded linear transformation with respect to the $L^1$-norm, then it is also a bounded linear transformation with respect to the $L^\infty$-norm.

Proof. We have
$$\|T\|_{L^\infty, X} = \sup_f \frac{\|T(f)\|_X}{\|f\|_{L^\infty}} \le \sup_f \frac{\lambda([a, b])\|T(f)\|_X}{\|f\|_{L^1}} \le \lambda([a, b])\|T\|_{L^1, X},$$
where the suprema are both taken over all $f \in \mathscr{R}([a, b]; \mathbb{R})$. Thus, if $\|T\|_{L^1, X} < \infty$, then we also have $\|T\|_{L^\infty, X} < \infty$. □

Corollary 8.2.11. Let $T : \mathscr{R}([a, b]; \mathbb{R}) \to \mathbb{R}$ be a linear transformation that, when restricted to step functions $S([a, b]; \mathbb{R})$, agrees with the integral $\mathscr{I} : S([a, b]; \mathbb{R}) \to \mathbb{R}$. If $T$ is a bounded transformation with respect to the $L^1$-norm, then $T$ must be equal to the integral $\mathscr{I}: \mathscr{R}([a, b]; \mathbb{R}) \to \mathbb{R}$ defined in Theorem 8.1.9.

Proof. The proof is Exercise 8.10. □

8.3 Measure Zero and Measurability

In practical terms, the main new idea in our new theory of integration is the idea of sets of measure zero. It should not be surprising that some sets are so small as to have no volume. In Definition 8.1.6, for example, we defined an open interval $(a, b) = (a_1, b_1) \times \cdots \times (a_n, b_n)$ to have the same measure as a closed interval $[a, b]$. This would suggest that the measure of $[a, b] \setminus (a, b)$ should be $0$.
It turns out that extending the definition of measure from intervals to more general sets is a subtle thing, so we do not treat it here (for a careful treatment of measure theory, see Volume 3), but defining a set of measure zero is not nearly so hard. In this section we define sets of measure zero and also measurable sets, which are, essentially, the sets where it makes sense to talk about integration.

8.3.1 Sets of Measure Zero

To motivate the idea of sets of measure zero, consider two basic properties that one might expect from the definition of a measure $\lambda$ (or volume) of a set, assuming $\lambda$ is defined for all the sets involved. The first property we expect is that the measure of a set should be at least as big as the measure of any subset:

If $B \subset A$, then $\lambda(B) \le \lambda(A)$.

This suggests that for any set $A$ of measure zero, any subset $B \subset A$ should also have measure zero, and if $(C_k)_{k=0}^\infty$ is a sequence of sets whose measure goes to zero ($\lambda(C_k) \to 0$ as $k \to \infty$), then $\bigcap_{k\in\mathbb{N}} C_k$ should have measure zero.
The second property we expect is that the measure of a union of sets should be no bigger than the sum of the measures of the individual pieces:
$$\lambda(A \cup B) \le \lambda(A) + \lambda(B).$$
This suggests that any finite or countable union of sets of measure zero should also have measure zero.
Unfortunately, we only know how to define the measure of intervals so far, not more general sets, but these two properties suggest how to define sets of measure zero using only intervals.

Definition 8.3.1. A set $A \subset \mathbb{R}^n$ has measure zero if for any $\varepsilon > 0$ there is a countable collection of $n$-intervals $(I_k)_{k=0}^\infty$ such that $A \subset \bigcup_{k=0}^\infty I_k$ and $\sum_{k=0}^\infty \lambda(I_k) < \varepsilon$.

Proposition 8.3.2. The following hold:

(i) Any subset of a set of measure zero has measure zero.

(ii) A single point in $\mathbb{R}^n$ has measure zero.

(iii) A countable union of sets of measure zero has measure zero.

Proof. Item (i) follows immediately from the definition. The proof of (ii) is Exercise 8.11.
For (iii) assume that $(C_k)_{k\in\mathbb{N}}$ is a countable collection of sets of measure zero. Assume that $\varepsilon > 0$ is given. For each $k \in \mathbb{N}$ there exists a collection $(I_{j,k})_{j\in\mathbb{N}}$ of intervals covering $C_k$ such that $\sum_{j\in\mathbb{N}} \lambda(I_{j,k}) < \varepsilon/2^{k+1}$. Allowing $k$ also to vary, the collection $(I_{j,k})_{j\in\mathbb{N},\, k\in\mathbb{N}}$ of all these intervals is a countable union of countable sets, and hence is countable. Moreover, we have $\bigcup_{k\in\mathbb{N}} C_k \subset \bigcup_{j,k\in\mathbb{N}} I_{j,k}$ and
$$\sum_{k\in\mathbb{N}}\sum_{j\in\mathbb{N}} \lambda(I_{j,k}) < \sum_{k\in\mathbb{N}} \varepsilon/2^{k+1} = \varepsilon. \qquad \Box$$

Example 8.3.3. The Cantor ternary set is constructed by starting with the closed interval $C_0 = [0, 1] \subset \mathbb{R}$ and removing the (open) middle third to get $C_1 = [0, 1/3] \cup [2/3, 1]$. Repeating the process, removing the middle third of each of the preceding collection of intervals, gives $C_2 = [0, 1/9] \cup [2/9, 1/3] \cup [2/3, 7/9] \cup [8/9, 1]$. Continuing this process gives a sequence of sets $(C_k)_{k=0}^\infty$, such that each $C_k$ consists of $2^k$ closed intervals of total length $(2/3)^k$.
Define the Cantor ternary set to be the intersection $C_\infty = \bigcap_{k\in\mathbb{N}} C_k$. Since $[0, 1]$ is compact and each $C_k$ is closed, $C_\infty$ is nonempty (see Theorem 5.5.11). One can actually show that this set is uncountable (see, for example, [Cha95, Chap. 1, Sect. 4.4]).
To see that $C_\infty$ has measure zero, note that each $C_k$ contains $C_\infty$, and $C_k$ is a finite union of closed intervals of total length $(2/3)^k$. Since $(2/3)^k$ can be made arbitrarily small by choosing large enough values of $k$, the set $C_\infty$ satisfies the conditions of Definition 8.3.1.
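A short Python sketch (plain arithmetic, nothing beyond the standard library) shows how quickly the total length $(2/3)^k$ of the covering stage $C_k$ shrinks, which is exactly what Definition 8.3.1 asks for:

```python
# Each C_k consists of 2^k closed intervals of length 3^(-k),
# so the total length of the cover at stage k is (2/3)^k.
for k in [0, 5, 10, 20, 40]:
    total_length = (2 / 3) ** k
    print(k, 2 ** k, total_length)

# For any eps > 0, choosing k large enough that (2/3)^k < eps gives a cover of
# the Cantor set by finitely many intervals of total length less than eps.
```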

Definition 8.3.4. We say that functions $f$ and $g$ are equal almost everywhere and write $f = g$ a.e. if the set $\{t \mid f(t) \ne g(t)\}$ has measure zero in $\mathbb{R}^n$.

Example 8.3.5. Integration gives the same result for any two functions that are equal almost everywhere, so when we need to integrate a function that is messy or otherwise difficult to work with, but all the "bad" parts of the function are supported on a set of measure zero, we can replace it with a function that is equal to the first function almost everywhere, but is (hopefully) easier to integrate.
For example, the Dirichlet function defined on $\mathbb{R}$ by
$$f(t) = \begin{cases} 1 & \text{if } t \text{ is rational},\\ 0 & \text{if } t \text{ is irrational}\end{cases}$$
is equal to zero almost everywhere, because the set where $f(t) \ne 0$ is countable, and hence has measure zero. Since $f = 0$ a.e., for any interval $[a, b] \subset \mathbb{R}$ we have $\int_a^b f = \int_a^b 0 = 0$.
Proposition 8.3.6. The relation $=$ a.e. defines an equivalence relation on the set of all functions from $[a, b]$ to $\mathbb{R}$.

Proof. The proof is Exercise 8.12. □

Definition 8.3.7. We say that a sequence of functions $(f_k)_{k=0}^\infty$ converges almost everywhere if the set $\{t \mid (f_k(t))_{k=0}^\infty \text{ does not converge}\}$ has measure zero. If for almost all $t$ the sequence $(f_k(t))_{k=0}^\infty$ converges to $f(t)$, we write $f_k \to f$ a.e.

Note that convergence almost everywhere is about pointwise convergence. It


does not depend on a choice of norm (for example, the L 00 - or L 1 -norms) on the
space of functions.

Example 8.3.8. Consider the sequence of functions given by
$$f_n = n\mathbb{1}_{[0, \frac{1}{n}]}.$$
The sequence $(f_n(0))_{n=0}^\infty$ does not converge, but for all $x \ne 0$ the sequence $(f_n(x))_{n=0}^\infty$ converges to zero. Hence, $f_n(x) \to 0$ a.e.

8.3.2 Measurability

We often wish to integrate functions over a set that is not a compact interval. Unfortunately, we cannot do this consistently for all sets and all functions. The functions for which we can even think of defining a sensible integral are called measurable functions, and the sets where we can sensibly talk about the possibility of integrating functions are called measurable sets.

Definition 8.3.9. A function $f : \mathbb{R}^n \to \mathbb{R}$ is measurable if there is a sequence of step functions $(s_k)_{k=0}^\infty$ (not necessarily $L^1$-Cauchy) such that $s_k \to f$ a.e. We say that a set $A \subset \mathbb{R}^n$ is measurable if its indicator function $\mathbb{1}_A$ (see Definition 8.1.4) is measurable.

We now define integration on measurable sets that are contained in some compact interval, that is, on bounded measurable sets.

Definition 8.3.10. If $A \subset [a, b]$ is measurable and $f\mathbb{1}_A \in L^1([a, b]; \mathbb{R})$, then we define
$$\int_A f = \int_{[a,b]} f\mathbb{1}_A.$$

Define $L^1(A; \mathbb{R})$ to be the set of functions $f: A \to \mathbb{R}$ such that $f\mathbb{1}_A \in L^1([a, b]; \mathbb{R})$.

The next proposition and its corollary show that the integral over $A$ is well defined; that is, it is independent of the choice of interval $[a, b]$.

Proposition 8.3.11. Let $A \subset [a, b] \subset [c, d]$ be measurable. We have $f\mathbb{1}_A \in L^1([a, b]; \mathbb{R})$ if and only if $f\mathbb{1}_A \in L^1([c, d]; \mathbb{R})$. Moreover,
$$\int_{[a,b]} f\mathbb{1}_A = \int_{[c,d]} f\mathbb{1}_A. \tag{8.2}$$

Proof. ($\Rightarrow$) If $f\mathbb{1}_A \in L^1([a, b]; \mathbb{R})$, then there is a sequence $(s_n)_{n=0}^\infty$ of step functions on $[a, b]$ such that $(s_n)_{n=0}^\infty$ is $L^1$-Cauchy on $[a, b]$ and $s_n \to f\mathbb{1}_A$ a.e. Extending each of these step functions by zero outside of $[a, b]$ gives step functions $t_n$ on $[c, d]$, and $t_n \to f\mathbb{1}_A$ a.e.
From the definition of the integral of a step function, we have
$$\int_{[c,d]} |t_n - t_m| = \int_{[a,b]} |s_n - s_m| \quad\text{and}\quad \int_{[c,d]} t_n = \int_{[a,b]} s_n$$
for all $n, m \in \mathbb{N}$. Hence, $(t_n)_{n=0}^\infty$ is $L^1$-Cauchy on $[c, d]$, the function $f\mathbb{1}_A \in L^1([c, d]; \mathbb{R})$, and (8.2) holds.
($\Leftarrow$) If $f\mathbb{1}_A \in L^1([c, d]; \mathbb{R})$, then there is a sequence $(t_n)_{n=0}^\infty$ of step functions on $[c, d]$ such that $(t_n)_{n=0}^\infty$ is $L^1$-Cauchy on $[c, d]$ and $t_n \to f\mathbb{1}_A$ a.e. Multiplying each of these by $\mathbb{1}_{[a,b]}$ gives step functions $s_n = t_n\mathbb{1}_{[a,b]}$.
Now we show that $(t_n\mathbb{1}_{[a,b]})_{n=0}^\infty$ is $L^1$-Cauchy on $[a, b]$. Given $\varepsilon > 0$, choose $N > 0$ such that $\|t_n - t_m\|_{L^1} < \varepsilon$ (on $[c, d]$) whenever $n, m > N$. The $L^1$-norm of $s_n - s_m$ on $[a, b]$ is
$$\int_{[a,b]} |s_n - s_m| = \int_{[a,b]} |t_n - t_m|\mathbb{1}_{[a,b]} \le \int_{[c,d]} |t_n - t_m| < \varepsilon.$$
Thus, $(s_n)_{n=0}^\infty$ is $L^1$-Cauchy on $[a, b]$, and hence $f\mathbb{1}_A \in L^1([a, b]; \mathbb{R})$.
Finally, $(s_n)_{n=0}^\infty$ also defines an $L^1$-Cauchy sequence on $[c, d]$ converging to $f\mathbb{1}_A$ a.e. with $\int_{[a,b]} s_n = \int_{[c,d]} s_n$ for every $n$. Therefore $(s_n)_{n=0}^\infty$ and $(t_n)_{n=0}^\infty$ define the same element of $L^1([c, d]; \mathbb{R})$, and they have the same integral. Hence, (8.2) holds. □

Since any nonempty intersection of two compact intervals is again a compact interval, we get the following corollary.

Corollary 8.3.12. If $A \subset [a, b] \cap [a', b']$ is measurable, then $f\mathbb{1}_A \in L^1([a, b]; \mathbb{R})$ if and only if $f\mathbb{1}_A \in L^1([a', b']; \mathbb{R})$. Moreover,
$$\int_{[a,b]} f\mathbb{1}_A = \int_{[a',b']} f\mathbb{1}_A. \tag{8.3}$$

Nota Bene 8.3.13. Most sets that you are likely to encounter in applied mathematics are measurable, including all open and closed sets and any countable unions and intersections of open or closed sets. But not every subset of $\mathbb{R}^n$ is measurable. We do not provide an example here, but you can find examples in [Van08, RF10].

8.4 Monotone Convergence and Integration on Unbounded Domains

The main result of this section is the monotone convergence theorem, which is the first of three very important convergence theorems. Since $L^1([a, b]; \mathbb{R})$ is complete (by definition), any $L^1$-Cauchy sequence converges in the $L^1$-norm to a function in $L^1([a, b]; \mathbb{R})$. For such a sequence $(f_k)_{k=0}^\infty$ the limit commutes with the integral
$$\lim_{k\to\infty} \int_{[a,b]} f_k = \int_{[a,b]} \lim_{k\to\infty} f_k.$$
But a sequence of functions that converges pointwise does not necessarily converge in the $L^1$-norm (see Unexample 8.4.3), and its pointwise limit is not always integrable. The three convergence theorems give us conditions for identifying when a pointwise-convergent sequence is actually $L^1$-Cauchy.
After discussing some basic integral properties, we state and prove the monotone convergence theorem. We conclude the section with an important consequence of the monotone convergence theorem, namely, integration on unbounded domains.

8.4.1 Some Basic Integral Properties

Definition 8.4.1. For any set $A$ and any function $f: A \to \mathbb{R}$, define
$$f^+(a) = \begin{cases} f(a) & \text{if } f(a) \ge 0,\\ 0 & \text{if } f(a) \le 0,\end{cases} \qquad\text{and}\qquad f^-(a) = \begin{cases} -f(a) & \text{if } f(a) \le 0,\\ 0 & \text{if } f(a) \ge 0.\end{cases}$$
Note that $f = f^+ - f^-$ and $|f| = f^+ + f^-$.


The integral operator on L 1 ([a, b]; JR) is linear and continuous because it is
the unique continuous linear extension of the integral operator on step functions,
and continuity implies that it commutes with limits in the £ 1 -norm. In Section 9.3
we also show that it satisfies the following basic properties.

Proposition 8.4.2. For any f, g E L 1([a, b]; JR) we have the following:

(i) If f S g a.e., then f 1a,b] f S f(a ,b ] g.

(ii) If[a ,b] fl S f[a,b] IJI = llJllL1 ·


(iii) The functions max (!, g) , min(!, g), j+ , 1- , and Iii are all integrable.

(iv) If $h : \mathbb{R}^n \to \mathbb{R}$ is a measurable function (see Definition 8.3.9), and if $|h| \in L^1([a, b]; \mathbb{R})$, then $h \in L^1([a, b]; \mathbb{R})$.

(v) If $\|g\|_{L^\infty} \le M < \infty$, then $fg \in L^1([a, b]; \mathbb{R})$ and $\|fg\|_{L^1} \le M\|f\|_{L^1}$.

8.4.2 Monotone Convergence


As mentioned above, not every pointwise-convergent sequence of integrable functions
converges in the L 1 -norm. This means that we cannot move limits past integral signs
for these sequences, and in fact we cannot even expect that the limiting function is
integrable.

Unexample 8.4.3. In general we cannot expect to be able to interchange the limit and the integral. Consider the sequence $(f_k)_{k=0}^\infty$ in $L^1([0, 1]; \mathbb{R})$ given by
$$f_k(x) = \begin{cases} 2^k, & x \in (0, 2^{-k}],\\ 0 & \text{otherwise}.\end{cases}$$
This sequence converges pointwise to the zero function, but $\int_{[0,1]} f_k = 1$ for all $k \in \mathbb{N}$, so
$$\lim_{k\to\infty} \int_{[0,1]} f_k = 1 \ne 0 = \int_{[0,1]} \lim_{k\to\infty} f_k.$$
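The failure of the interchange can also be seen numerically. The following Python sketch (assuming NumPy; a crude Riemann sum on a fine grid stands in for each exact integral) shows the integrals staying at 1 while the functions collapse pointwise to 0:

```python
import numpy as np

xs = np.linspace(0.0, 1.0, 2_000_001)    # fine grid on [0, 1]
dx = xs[1] - xs[0]

for k in range(5):
    fk = np.where((xs > 0) & (xs <= 2.0 ** (-k)), 2.0 ** k, 0.0)
    print(k, np.sum(fk) * dx)             # each integral is (approximately) 1

# Pointwise, f_k(x) -> 0 for every x, so the integral of the limit is 0,
# while the limit of the integrals is 1.
```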

In the rest of this section we discuss the monotone convergence theorem, which
guarantees that if the integrals of a monotone sequence are bounded, then the
sequence must be L 1-Cauchy.

Definition 8.4.4. We say that a sequence of functions $(f_k)_{k=0}^\infty$ from $[a, b]$ to $\mathbb{R}$ is monotone increasing if for every $x \in [a, b]$ we have $f_k(x) \le f_{k+1}(x)$ for every $k \in \mathbb{N}$. This is denoted $f_k \le f_{k+1}$. We say that the sequence is almost everywhere monotone increasing if for every $k \in \mathbb{N}$ the set $\{x \in [a, b] \mid f_k(x) > f_{k+1}(x)\}$ has measure zero. This is denoted $f_k \le f_{k+1}$ a.e. Monotone decreasing and almost everywhere monotone decreasing are defined analogously.

Theorem 8.4.5 (Monotone Convergence Theorem). Let $(f_k)_{k=0}^\infty \subset L^1([a, b]; \mathbb{R})$ be almost everywhere monotone increasing. If there exists $M \in \mathbb{R}$ such that
$$\int_{[a,b]} f_k \le M \tag{8.4}$$
for all $k \in \mathbb{N}$, then $(f_k)_{k=0}^\infty$ is $L^1$-Cauchy, and hence there exists a function $f \in L^1([a, b]; \mathbb{R})$ such that

(i) $f = \lim_{k\to\infty} f_k$ a.e. and

(ii) $\int_{[a,b]} f = \int_{[a,b]} \lim_{k\to\infty} f_k = \lim_{k\to\infty} \int_{[a,b]} f_k$.

The same conclusion holds if $(f_k)_{k=0}^\infty \subset L^1([a, b]; \mathbb{R})$ is almost everywhere monotone decreasing and there exists $M \in \mathbb{R}$ such that
$$\int_{[a,b]} f_k \ge M. \tag{8.5}$$

This theorem is sometimes called the Beppo Levi theorem.

Proof. Monotonicity and Proposition 8.4.2(i) guarantee that for every $k \in \mathbb{N}$ we have $\int_{[a,b]} f_k \le \int_{[a,b]} f_{k+1} \le M$, so the sequence of real numbers $\left(\int_{[a,b]} f_k\right)_{k=0}^\infty$ is monotone increasing in $\mathbb{R}$ and bounded, and hence has a limit $L$.
For any $\varepsilon > 0$ there exists an $N > 0$ such that
$$0 \le L - \int_{[a,b]} f_k < \varepsilon$$
whenever $k \ge N$. Choosing $\ell > m \ge N$, we have
$$\|f_\ell - f_m\|_{L^1} = \int_{[a,b]} |f_\ell - f_m| = \int_{[a,b]} (f_\ell - f_m) = \left(L - \int_{[a,b]} f_m\right) - \left(L - \int_{[a,b]} f_\ell\right) < \varepsilon.$$
Thus $(f_k)_{k=0}^\infty$ is $L^1$-Cauchy, as required. The monotone decreasing case follows immediately from the previous result by replacing each $f_k$ by $-f_k$ and replacing $M$ by $-M$. □

Remark 8.4.6. Notice that the sequence in Unexample 8.4.3 is not monotone increasing, so the theorem does not apply to that sequence.
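As a numerical illustration of the theorem (a sketch assuming NumPy, with the hypothetical sequence $f_k(x) = \min(1/\sqrt{x},\, k)$ on $(0, 1]$, which is not from the text): the $f_k$ increase monotonically, their integrals equal $2 - 1/k$ and so are bounded by $M = 2$, and the monotone convergence theorem then says the pointwise limit $1/\sqrt{x}$ is integrable with integral 2.

```python
import numpy as np

# f_k(x) = min(1/sqrt(x), k) on (0, 1]: monotone increasing in k, with
# exact integral 2 - 1/k, bounded above by M = 2.
xs = np.linspace(1e-9, 1.0, 1_000_001)    # grid avoiding the singularity at 0
dx = xs[1] - xs[0]

for k in [1, 10, 100, 1000]:
    fk = np.minimum(1.0 / np.sqrt(xs), k)
    print(k, np.sum(fk) * dx, 2 - 1 / k)   # Riemann-sum approximation vs. exact value

# The integrals increase toward 2, the integral of the pointwise limit 1/sqrt(x).
```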

8.4.3 Integration on Unbounded Domains

So far we have only defined integration over measurable subsets of bounded intervals. But here we use the monotone convergence theorem to extend the definition of integration (and integrability) to unbounded intervals.

Definition 8.4.7. Given an unbounded, measurable set $A \subset \mathbb{R}^n$ and a measurable, nonnegative function $f : A \to \mathbb{R}$, we say that $f$ is integrable on $A$ if there exists an increasing sequence $(E_k)_{k=0}^\infty$ (by increasing we mean each $E_k \subset E_{k+1}$) of bounded measurable subsets of $A$ with $A = \bigcup_{k=0}^\infty E_k$, such that there exists $M \in \mathbb{R}$ with
$$\int_{E_k} f \le M$$
for all $k$. Since the sequence of real numbers $\int_{E_k} f$ is nondecreasing and bounded, it must have a finite limit. Define the integral of $f$ on $A$ to be the limit
$$\int_A f = \lim_{k\to\infty} \int_{E_k} f.$$

We say that an arbitrary measurable function $g$ is integrable on $A$ if $g^+$ and $g^-$ are both integrable on $A$, and we define
$$\int_A g = \int_A g^+ - \int_A g^-.$$
We write $L^1(A; \mathbb{R})$ to denote the set of equivalence classes of integrable functions on $A$ (modulo equality almost everywhere).

Nota Bene 8.4.8. Exercise 8.18 shows that $L^1(A; \mathbb{R})$ is a normed linear space with the $L^1$-norm and the very important property that a function $g$ is integrable on $A$ if and only if $|g|$ is integrable on $A$.

Example 8.4.9. If $f$ is a nonnegative continuous function on $\mathbb{R}$, then it is integrable on every compact interval of the form $[0, n]$. If $\lim_{n\to\infty} \int_0^n f(x)\, dx = L$ is finite, then every $\int_0^n f(x)\, dx$ is bounded above by $L$ and $\int_0^\infty f(x)\, dx = L$. For example,
$$\int_0^\infty e^{-x}\, dx = \lim_{n\to\infty} \int_0^n e^{-x}\, dx = \lim_{n\to\infty} (1 - e^{-n}) = 1.$$
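A quick numerical check of this example (a sketch assuming NumPy and SciPy's quad for the finite integrals):

```python
import numpy as np
from scipy.integrate import quad

for n in [1, 5, 10, 20]:
    integral, _ = quad(lambda x: np.exp(-x), 0.0, n)
    print(n, integral, 1.0 - np.exp(-n))   # both columns approach 1 as n grows
```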

Unexample 8.4.10. The continuous function $f(x) = (1 + x)/(1 + x^2)$ is not integrable on all of $\mathbb{R}$ because
$$f^+(x) = \begin{cases} f(x), & x \ge -1,\\ 0, & x \le -1,\end{cases}$$
has an integral that is unbounded on the intervals $[-n, n]$; that is,
$$\int_{[-n,n]} f^+ = \int_{-1}^{n} (1 + x)/(1 + x^2)\, dx = \left[\operatorname{Arctan}(x) + \tfrac{1}{2}\log(1 + x^2)\right]_{-1}^{n} = \operatorname{Arctan}(n) + \tfrac{1}{2}\log(1 + n^2) + \pi/4 - \tfrac{1}{2}\log(2) \to \infty \;\text{ as } n \to \infty.$$
This shows $f^+$ is not integrable on $\mathbb{R}$, and hence $f$ is not integrable on $\mathbb{R}$.
If $f^+$ and $f^-$ are not integrated separately, they may partially cancel each other. This is a problem because the cancellation may be different, depending on the increasing sequence of bounded subsets we use in the integration. For example, using intervals of the form $[-n, n]$ gives
$$\int_{[-n,n]} f = \int_{-n}^{n} (1 + x)/(1 + x^2)\, dx = 2\operatorname{Arctan}(n) \to \pi \;\text{ as } n \to \infty.$$
But using the intervals of the form $[-n, n^2]$ gives
$$\int_{[-n,n^2]} f\, dx = \operatorname{Arctan}(n^2) + \operatorname{Arctan}(n) + \frac{1}{2}\log\frac{1 + n^4}{1 + n^2} \to \infty \;\text{ as } n \to \infty.$$
This shows that nonnegativity of the integrand is important to have consistent results.
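The dependence on the exhausting sequence can be reproduced numerically. The following sketch (assuming NumPy and SciPy) uses the two families of intervals from this unexample:

```python
import numpy as np
from scipy.integrate import quad

f = lambda x: (1 + x) / (1 + x**2)

for n in [5, 20, 50]:
    symmetric, _ = quad(f, -n, n)        # over [-n, n]: approaches pi
    lopsided, _ = quad(f, -n, n**2)      # over [-n, n^2]: grows without bound
    print(n, symmetric, lopsided)

print(np.pi)   # the limit of the integrals over the symmetric intervals
```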

Below we show that the definition of integrable and the value of the integral on unbounded domains do not depend on the choice of the sequence $(E_k)_{k=0}^\infty$ of measurable subsets, that is, they are well defined.

Theorem 8.4.11. Let $A \subset \mathbb{R}^n$ be a measurable set, and let $f$ be a nonnegative measurable function on $A$. Let $(E_k)_{k=0}^\infty$ and $(E'_k)_{k=0}^\infty$ be two increasing sequences of bounded measurable subsets such that
$$A = \bigcup_{k=0}^\infty E_k = \bigcup_{k=0}^\infty E'_k.$$
If there exists an $M \in \mathbb{R}$ such that for all $k$ we have $\int_{E_k} f \le M$, then we also have $\int_{E'_k} f \le M$ for all $k$, and
$$\lim_{k\to\infty} \int_{E_k} f = \lim_{k\to\infty} \int_{E'_k} f.$$

Proof. Since each $E_k$ is bounded and measurable, there exists a compact interval $[a, b]$ with $E_k \subset [a, b]$, and such that $\mathbb{1}_{E_k} \in L^1([a, b]; \mathbb{R})$. Similarly, for every $m$ we may choose $[a', b']$ containing $E'_m$ and such that $\mathbb{1}_{E'_m} \in L^1([a', b']; \mathbb{R})$. Let $[c, d]$ be a compact interval containing both $[a, b]$ and $[a', b']$. By Proposition 8.3.11 we have that $\mathbb{1}_{E_k}$ and $\mathbb{1}_{E'_m}$ are in $L^1([c, d]; \mathbb{R})$, and by Proposition 8.4.2(v) the products $\mathbb{1}_{E_k}\mathbb{1}_{E'_m}$ and $\mathbb{1}_{E_k}\mathbb{1}_{E'_m} f$ are in $L^1([c, d]; \mathbb{R})$. Therefore, the restrictions of $\mathbb{1}_{E_k}$ and $\mathbb{1}_{E_k} f$ to $E'_m$ both lie in $L^1(E'_m; \mathbb{R})$. Trading the roles of $E'_m$ and $E_k$ in the previous argument implies that $\mathbb{1}_{E'_k}$ and $\mathbb{1}_{E'_k} f$ are in $L^1(E_m; \mathbb{R})$.
If $\int_{E_k} f \le M$ for all $k$, then
$$\int_{E'_k} f\mathbb{1}_{E_m} = \int_{E_m} f\mathbb{1}_{E'_k} \le \int_{E_m} f \le M$$
for all $k$ and $m$. By the monotone convergence theorem, we have
$$\int_{E'_k} f = \int_{E'_k} \lim_{m\to\infty} f\mathbb{1}_{E_m} = \lim_{m\to\infty} \int_{E'_k} f\mathbb{1}_{E_m} \le M. \tag{8.6}$$
Assume now that $L = \lim_{k\to\infty} \int_{E_k} f$ and $L' = \lim_{k\to\infty} \int_{E'_k} f$. Since the sequences $\left(\int_{E_k} f\right)_{k=0}^\infty$ and $\left(\int_{E'_k} f\right)_{k=0}^\infty$ are nondecreasing, they satisfy $\int_{E_k} f \le L$ for all $k$ and $\int_{E'_k} f \le L'$ for all $k$. Taking $M = L$ and taking the limit of (8.6) as $k \to \infty$ gives $L' \le L$. Similarly, interchanging the roles of $E_k$ and $E'_k$ and setting $M = L'$ gives $L \le L'$. Thus $L = L'$, as required. □

8.5 Fatou's Lemma and the Dominated Convergence Theorem

The monotone convergence theorem is the key to proving two other important convergence theorems: Fatou's lemma and the dominated convergence theorem. Like the monotone convergence theorem, these two theorems give conditions that guarantee a pointwise-convergent sequence is actually L¹-Cauchy.

Nota Bene 8.5.1. Being L¹-Cauchy only guarantees convergence in the space L¹([a,b];ℝ). In general an L¹-Cauchy sequence of regulated functions (uniform limits of step functions) does not converge in that subspace or in the space of Riemann-integrable functions. That is, the monotone convergence theorem, Fatou's lemma, and the dominated convergence theorem are usually only useful if we work in L¹([a,b];ℝ).

8.5.1 Fatou's Lemma

Recall that the limit inferior of a sequence (x_k)_{k=0}^∞ of real numbers is defined as
$$\liminf_{k\to\infty} x_k = \lim_{k\to\infty}\Bigl(\inf_{m\ge k} x_m\Bigr),$$
and the limit superior is defined as
$$\limsup_{k\to\infty} x_k = \lim_{k\to\infty}\Bigl(\sup_{m\ge k} x_m\Bigr).$$

These always exist (although they may be infinite), even if the limit does not.
Fatou's lemma tells us when the lim inf of a sequence (not necessarily convergent) of nonnegative integrable functions is integrable, and it tells us how the integral of the lim inf is related to the lim inf of the integrals. Fatou's lemma is also the key tool we use to prove the dominated convergence theorem.

Theorem 8.5.2 (Fatou's Lemma). Let (f_k)_{k=0}^∞ be a sequence of integrable functions on [a, b] that are almost everywhere nonnegative, that is, for every k ∈ ℕ we have f_k(x) ≥ 0 for almost every x. If
$$\liminf_{k\to\infty}\int_{[a,b]} f_k < \infty,$$
then

(i) (liminf_{k→∞} f_k) ∈ L¹([a,b];ℝ) and

(ii) ∫_{[a,b]} liminf_{k→∞} f_k ≤ liminf_{k→∞} ∫_{[a,b]} f_k.
Proof. First we show that the infimum of any sequence (f_ℓ)_{ℓ=1}^∞ of almost-everywhere-nonnegative integrable functions must also be integrable. For each k ∈ ℕ let g_k be the function defined by g_k(t) = min{f₁(t), f₂(t), ..., f_k(t)}. The sequence (g_k)_{k=0}^∞ is a monotone decreasing sequence of almost-everywhere-nonnegative functions with lim_{k→∞} g_k = g = inf_{m∈ℕ} f_m. Since every g_k is almost everywhere nonnegative, we have ∫_{[a,b]} g_k ≥ 0. By the monotone convergence theorem (Theorem 8.4.5), we have inf_{m∈ℕ} f_m = lim_{k→∞} g_k = g ∈ L¹([a,b];ℝ).

Now for each k ∈ ℕ let h_k = inf_{ℓ≥k} f_ℓ ∈ L¹([a,b];ℝ). Each h_k is almost everywhere nonnegative, and the sequence is monotone increasing, with lim_{k→∞} h_k = liminf_{k→∞} f_k < ∞. Moreover, for each n ∈ ℕ we have h_n ≤ f_n, so ∫_{[a,b]} h_n ≤ ∫_{[a,b]} f_n and, taking limits, we have
$$\int_{[a,b]} h_n \le \lim_{k\to\infty}\int_{[a,b]} h_k = \liminf_{k\to\infty}\int_{[a,b]} h_k \le \liminf_{k\to\infty}\int_{[a,b]} f_k < \infty.$$
Therefore the monotone convergence theorem guarantees that liminf_{k→∞} f_k = lim_{k→∞} h_k is integrable and lim_{k→∞} ∫_{[a,b]} h_k = ∫_{[a,b]} lim_{k→∞} h_k. Combining the previous steps gives
$$\int_{[a,b]}\liminf_{k\to\infty} f_k = \int_{[a,b]}\lim_{k\to\infty} h_k = \lim_{k\to\infty}\int_{[a,b]} h_k \le \liminf_{k\to\infty}\int_{[a,b]} f_k. \qquad\Box$$

Remark 8.5.3. The inequality in Fatou's lemma should not be surprising. For intuition about this, consider the situation where the sequence (f_k) consists of only two nonnegative functions f₀ and f₁. This is depicted in Figure 8.4. In this case
$$\inf(f_0, f_1) = \min(f_0, f_1) \le f_0 \quad\text{and}\quad \inf(f_0, f_1) = \min(f_0, f_1) \le f_1,$$
so by Proposition 8.4.2(i), we have
$$\int_{[a,b]}\inf(f_0, f_1) \le \int_{[a,b]} f_0 \quad\text{and}\quad \int_{[a,b]}\inf(f_0, f_1) \le \int_{[a,b]} f_1.$$

Figure 8.4. An elementary variant of the inequality in Fatou's lemma, as described in Remark 8.5.3. The area under min(f₀, f₁) (dark yellow intersection) is less than the area under f₀ (gray) and less than the area under f₁ (yellow). In symbols, this is $\int_a^b \min(f_0, f_1) \le \min\bigl(\int_a^b f_0, \int_a^b f_1\bigr)$.
This implies
$$\int_{[a,b]}\inf(f_0, f_1) \le \inf\Bigl(\int_{[a,b]} f_0,\ \int_{[a,b]} f_1\Bigr).$$
It should be clear that the same is true for any finite collection of integrable functions. The inequality in Fatou's lemma is the analogous result for an infinite sequence of functions.

Example 8.5.4. The inequality in Fatou's lemma cannot be replaced by an equality. For each k ∈ ℤ⁺ let f_k = k𝟙_{[0,1/k]} ∈ L¹([0,1];ℝ). We have
$$\int_0^1 \liminf_{k}(f_k) = \int_0^1 0 = 0 < 1 = \liminf_{k} 1 = \liminf_{k}\int_0^1 f_k.$$
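The strictness of the inequality is easy to observe concretely: each f_k integrates to exactly 1, while at any fixed point x > 0 the values f_k(x) are eventually 0. A small sketch (plain Python; these step functions need no quadrature at all):

```python
# f_k = k * indicator([0, 1/k]) has integral exactly k * (1/k) = 1 for every k,
# but at the fixed point x = 0.25 the values f_k(0.25) are eventually 0.
for k in [1, 10, 100, 1000]:
    integral_fk = k * (1.0 / k)                    # always 1
    value_at_x = k if 0.25 <= 1.0 / k else 0       # nonzero only for k <= 4
    print(k, integral_fk, value_at_x)
```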

8.5.2 Dominated Convergence

The last of the three big convergence theorems is the dominated convergence theorem, which says that if a sequence of integrable functions converges pointwise almost everywhere and is dominated in absolute value, almost everywhere, by a single integrable function, then it converges in the L¹-norm.

Theorem 8.5.5 (Dominated Convergence Theorem). Consider a sequence (f_k)_{k=0}^∞ ⊂ L¹([a,b];ℝ) of integrable functions that converges pointwise almost everywhere to f. If there exists an integrable function g ∈ L¹([a,b];ℝ) such that |f_k| ≤ g a.e. for every k ∈ ℕ, then f ∈ L¹([a,b];ℝ) and
$$\lim_{k\to\infty}\int_{[a,b]} f_k = \int_{[a,b]}\lim_{k\to\infty} f_k = \int_{[a,b]} f. \qquad (8.7)$$

Proof. For each k ∈ ℕ the function h_k = g − f_k is nonnegative almost everywhere. Moreover,
$$\int_{[a,b]} h_k = \int_{[a,b]} (g - f_k) \le \|g\|_{L^1} + \|f_k\|_{L^1} \le 2\|g\|_{L^1},$$
so Fatou's lemma gives g − lim_{k→∞} f_k = liminf_{k→∞} h_k ∈ L¹([a,b];ℝ), which implies that f = lim_{k→∞} f_k = g − liminf_{k→∞} h_k ∈ L¹([a,b];ℝ). Moreover, we have
$$\int_{[a,b]} g - \int_{[a,b]}\lim_{k\to\infty} f_k = \int_{[a,b]}\lim_{k\to\infty} h_k = \int_{[a,b]}\liminf_{k\to\infty} h_k \le \limsup_{k\to\infty}\int_{[a,b]} h_k = \int_{[a,b]} g - \liminf_{k\to\infty}\int_{[a,b]} f_k,$$
and thus we see
$$\int_{[a,b]}\lim_{k\to\infty} f_k \ge \liminf_{k\to\infty}\int_{[a,b]} f_k.$$
Repeating the previous argument with h_k = g + f_k gives the other direction, which gives (8.7). ∎

Remark 8.5.6. The dominated convergence theorem guarantees that we can interchange limits and integrals if we can find an integrable function that dominates the sequence almost everywhere. Note that in Unexample 8.4.3 no integrable function dominates all the terms f_k.

Example 8.5.7. For each n ∈ ℕ, let f_n: ℝ → ℝ be given by f_n(x) = (1 − x/n)ⁿ e^{x/3}. We wish to evaluate
$$\lim_{n\to\infty}\int_0^T f_n\,dx = \lim_{n\to\infty}\int_0^T (1 - x/n)^n e^{x/3}\,dx$$
for arbitrary T > 0, but it is not immediately obvious how to integrate ∫₀ᵀ f_n dx.

Since lim_{n→∞}(1 − x/n)ⁿ = e^{−x}, the sequence (f_n)_{n=0}^∞ converges pointwise to the function f(x) = e^{−2x/3}. It is easy to check that e^z is greater than or equal to 1 + z for all z ∈ ℝ, and therefore (1 − x/n) ≤ e^{−x/n} for all x. This gives
$$0 \le f_n(x) = (1 - x/n)^n e^{x/3} \le e^{-x} e^{x/3} = e^{-2x/3} = f(x) \quad\text{for } 0 \le x \le T \text{ and } n \ge T,$$
and since f is continuous, it is integrable. Therefore the dominated convergence theorem applies and gives
$$\lim_{n\to\infty}\int_0^T f_n\,dx = \int_0^T \lim_{n\to\infty} f_n\,dx = \int_0^T f\,dx = \int_0^T e^{-2x/3}\,dx,$$
which is easy to integrate.
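A numerical check of this limit (a sketch assuming NumPy, with T = 5 chosen arbitrarily and a midpoint-rule approximation) shows the integrals of f_n approaching the integral of the limiting function f:

```python
import numpy as np

T, m = 5.0, 400_000
x = np.linspace(0.0, T, m + 1)
mid = 0.5 * (x[:-1] + x[1:])
dx = T / m

limit_integral = np.sum(np.exp(-2.0 * mid / 3.0)) * dx   # integral of the pointwise limit f
for n in [10, 100, 1000, 10_000]:
    fn = (1.0 - mid / n) ** n * np.exp(mid / 3.0)
    print(n, np.sum(fn) * dx, limit_integral)            # the two columns agree in the limit
```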

As a nice consequence of the dominated convergence theorem, we can now give a useful condition for deciding when an infinite sum of integrable functions converges and is integrable.

Proposition 8.5.8. If (f_k)_{k=0}^∞ is any sequence of functions in L¹([a,b];ℝ) such that
$$\sum_{k=0}^{\infty}\int_{[a,b]} |f_k| < \infty,$$
then $\sum_{k=0}^{\infty} f_k$ converges almost everywhere on [a,b], and
$$\int_{[a,b]}\sum_{k=0}^{\infty} f_k = \sum_{k=0}^{\infty}\int_{[a,b]} f_k.$$
Proof. By Exercise 8.16 we have that for almost every x ∈ [a,b] the series $\sum_{k=0}^{\infty} |f_k(x)|$ converges, and the resulting function $\sum_{k=0}^{\infty} |f_k|$ is integrable. Therefore, for almost every x ∈ [a,b] the series $\sum_{k=0}^{\infty} f_k(x)$ converges absolutely, and hence it converges, by Proposition 5.6.13.

The partial sums of $\sum_{k=0}^{\infty} f_k$ are all dominated by the integrable function $\sum_{k=0}^{\infty} |f_k|$, so by the dominated convergence theorem the series $\sum_{k=0}^{\infty} f_k$ is integrable and
$$\int_{[a,b]}\sum_{k=0}^{\infty} f_k = \sum_{k=0}^{\infty}\int_{[a,b]} f_k. \qquad\Box$$

8.6 Fubini's Theorem and Leibniz's Integral Rule


An important consequence of the monotone convergence theorem is Fubini 's theo-
rem, which allows us to convert multivariable integrals into repeated single-variable
integrals, where we can use the fundamental theorem of calculus and other standard
tools of single-variable integration. We note that Fubini's theorem is easy to prove
for step functions, but integrating more general functions in L 1 ([a, b]; JR) involves
taking limits of step functions, and iterated integrals involve taking two different
limits. So it should not be too surprising that a convergence theorem is needed to
control the way the two limits interact.
Finally, as an additional benefit, Fubini's theorem gives Leibniz 's integral rule,
which gives conditions for when we can differentiate under the integral sign. This
turns out to be a very useful tool and can simplify many difficult problems.

8.6.1 Fubini's Theorem

Throughout this section we fix X = [a, b] ⊂ ℝⁿ and Y = [c, d] ⊂ ℝᵐ. Since we integrate functions over each of these sets, in order to reduce confusion we write the symbols dx, dy, or dxdy at the end of our integrals to help indicate where the integration is taking place, so we write ∫_X g(x) dx to indicate the integral of g ∈ L¹(X;ℝ), and we write ∫_{X×Y} f(x,y) dxdy to indicate the integral of f ∈ L¹(X × Y;ℝ).

Theorem 8.6.1 (Fubini's Theorem). Assume that f: X × Y → ℝ is integrable on X × Y ⊂ ℝ^{m+n}. For each x ∈ X consider the function f_x: Y → ℝ given by f_x(y) = f(x,y). We have the following:

(i) For almost all x ∈ X the function f_x is integrable on Y.

(ii) The function F: X → ℝ given by
$$F(x) = \begin{cases} \int_Y f_x(y)\,dy & \text{if } f_x \text{ is integrable},\\ 0 & \text{otherwise}\end{cases}$$
is integrable on X.
(iii) The integral of f may be computed as
$$\int_{X\times Y} f(x,y)\,dxdy = \int_X F(x)\,dx.$$

Remark 8.6.2. We often write $\int_X\bigl(\int_Y f_x(y)\,dy\bigr)\,dx$ instead of $\int_X F(x)\,dx$. We call this an iterated integral.

Nota Bene 8.6.3. The reader should beware that the obvious converse of Fubini is not true. Integrability of f_x and of F = ∫_Y f_x(y) dy is not sufficient to guarantee that f is integrable on X × Y. However, with some additional conditions one can sometimes still deduce the integrability of f. For more details see [Cha95, Sect. IV.5.11].

We prove Fubini's theorem in Section 9.4. In the rest of this section we focus on its implications and how to use it.

Fubini's theorem allows us to reduce higher-dimensional integrals down to a repeated application of one-dimensional integrals, and these can often be computed by standard techniques, such as the fundamental theorem of calculus (Theorem 6.5.4).

Example 8.6.4. We can compute the integral of f(x,y) = cos(x)y² over R = (0, π/2) × (0, 1) ⊂ ℝ² using Fubini's theorem:
$$\int_R \cos(x)y^2\,dxdy = \int_0^{\pi/2}\Bigl(\int_0^1 \cos(x)y^2\,dy\Bigr)\,dx = \int_0^{\pi/2}\frac{\cos(x)}{3}\,dx = \frac{1}{3}.$$
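The same value can be recovered numerically by iterated one-dimensional sums, in either order. A sketch (assuming NumPy; midpoint rule on a modest grid):

```python
import numpy as np

nx, ny = 2000, 2000
x = np.linspace(0.0, np.pi / 2, nx + 1); xm = 0.5 * (x[:-1] + x[1:]); dx = (np.pi / 2) / nx
y = np.linspace(0.0, 1.0, ny + 1);       ym = 0.5 * (y[:-1] + y[1:]); dy = 1.0 / ny

F = np.cos(xm)[:, None] * (ym**2)[None, :]     # f(x, y) = cos(x) * y^2 on the grid
y_first = np.sum(np.sum(F, axis=1) * dy * dx)  # integrate in y, then x
x_first = np.sum(np.sum(F, axis=0) * dx * dy)  # integrate in x, then y
print(y_first, x_first, 1.0 / 3.0)             # all three are about 0.3333
```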

8.6.2 Interchanging the Order of Integration

Proposition 8.6.5. If f: X × Y → ℝ is integrable, then the function Y × X → ℝ given by (y, x) ↦ f(x, y) is integrable, and
$$\int_{X\times Y} f(x,y)\,dxdy = \int_{Y\times X} f(x,y)\,dydx.$$

Proof. The proof is Exercise 9.20. ∎

Corollary 8.6.6. If f: X × Y → ℝ is integrable, then
$$\int_X\Bigl(\int_Y f_x(y)\,dy\Bigr)\,dx = \int_{X\times Y} f(x,y)\,dxdy = \int_Y\Bigl(\int_X f_y(x)\,dx\Bigr)\,dy.$$
Proof. This follows immediately from Fubini's theorem and the proposition. ∎

Corollary 8.6.6 is useful because changing the order of integration can often simplify a problem substantially.

Example 8.6.7. It is difficult to compute the inner antiderivative of the iterated integral
$$\int_{-\pi}^{\pi}\Bigl(\int_1^2 \tan(3x^2 - x + 2)\sin(y)\,dx\Bigr)\,dy.$$
But changing the order of integration gives
$$\int_{-\pi}^{\pi}\Bigl(\int_1^2 \tan(3x^2 - x + 2)\sin(y)\,dx\Bigr)\,dy = \int_1^2 \tan(3x^2 - x + 2)\Bigl(\int_{-\pi}^{\pi}\sin(y)\,dy\Bigr)\,dx,$$
and $\int_{-\pi}^{\pi}\sin(y)\,dy = 0$, so the entire double integral is zero.

Example 8.6.8. Computing the iterated integral $\int_0^1\int_x^1 e^{y^2}\,dy\,dx$ is not so easy, but notice that it is equal to $\int_0^1\int_0^1 e^{y^2}\mathbb{1}_S(x,y)\,dy\,dx$, where S is the upper triangle obtained by cutting the unit square (0, 1) × (0, 1) in half, diagonally. Interchanging the order of integration, using Corollary 8.6.6, gives
$$\int_0^1\int_0^1 e^{y^2}\mathbb{1}_S(x,y)\,dx\,dy.$$
Now for each value of y, we have (x,y) ∈ S if and only if 0 ≤ x ≤ y, so the integral becomes
$$\int_0^1\int_0^y e^{y^2}\,dx\,dy,$$
which is easily evaluated to $\int_0^1 y e^{y^2}\,dy = \frac{1}{2}(e-1)$.
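Once the order is swapped, the inner integral is elementary and the remaining one-dimensional integral is easy to check numerically. A sketch (assuming NumPy):

```python
import numpy as np

n = 400_000
y = np.linspace(0.0, 1.0, n + 1)
ym = 0.5 * (y[:-1] + y[1:])
# After integrating in x over [0, y], the integrand is y * exp(y^2).
approx = np.sum(ym * np.exp(ym**2)) * (1.0 / n)
print(approx, (np.e - 1.0) / 2.0)   # both are about 0.859141
```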

8.6.3 Leibniz's Integral Rule


An important consequence of Fubini's theorem is Leibniz's integral rule , which
gives conditions for when we can differentiate under the integral sign (that is, when
derivatives commute with integrals) . This is a very useful tool that greatly simplifies
many difficult problems.

Theorem 8.6.9 (Leibniz's Integral Rule). Let X = (a, b) ⊂ ℝ be an open interval, let Y = [c, d] ⊂ ℝ be a compact interval, and let f: X × Y → ℝ be continuous. If f has continuous partial derivative ∂f(x,y)/∂x at each point (x,y) ∈ X × Y, then the function
$$\psi(x) = \int_c^d f(x,y)\,dy$$
is differentiable at each point x ∈ X and
$$\frac{d}{dx}\psi(x) = \int_c^d \frac{\partial f(x,y)}{\partial x}\,dy. \qquad (8.8)$$

Proof. Fix some x₀ ∈ X. Using the second form of the fundamental theorem of calculus (6.16) and Fubini's theorem, we have
$$\psi(x) - \psi(x_0) = \int_c^d \bigl(f(x,y) - f(x_0,y)\bigr)\,dy = \int_c^d\Bigl(\int_{x_0}^{x}\frac{\partial f(z,y)}{\partial z}\,dz\Bigr)\,dy = \int_{x_0}^{x}\Bigl(\int_c^d \frac{\partial f(z,y)}{\partial z}\,dy\Bigr)\,dz.$$
Letting
$$g(z) = \int_c^d \frac{\partial f(z,y)}{\partial z}\,dy,$$
we have
$$\psi(x) - \psi(x_0) = \int_{x_0}^{x} g(z)\,dz. \qquad (8.9)$$
Using the first form of the fundamental theorem of calculus (6.15) to differentiate (8.9) with respect to x gives
$$\frac{d}{dx}\psi(x) = g(x). \qquad\Box$$

Example 8.6.10. Consider the problem of finding the derivative of the map $F(x) = \int_0^1 (x^2+t)^3\,dt$. We can solve this without Leibniz's rule by first using standard antidifferentiation techniques:
$$F(x) = \int_0^1 (x^2+t)^3\,dt = \frac{(x^2+t)^4}{4}\Big|_0^1 = \frac{(x^2+1)^4}{4} - \frac{x^8}{4}.$$
From this we find F'(x) = 2x(x²+1)³ − 2x⁷.


Alternatively, Leibniz says we can differentiate under the integral sign:
$$F'(x) = \int_0^1 \frac{\partial}{\partial x}(x^2+t)^3\,dt = \int_0^1 3(x^2+t)^2(2x)\,dt = (x^2+t)^3(2x)\Big|_0^1 = 2x(x^2+1)^3 - 2x^7.$$

Remark 8.6.11. In the previous example it is not hard to verify that the two
answers agree, but in many situations it is much easier to differentiate under the
integral sign than it is to integrate first and differentiate afterward.
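Differentiation under the integral sign is also easy to sanity-check numerically: compare a finite-difference derivative of F with the integral of the partial derivative. A sketch (assuming NumPy; x₀ = 1.3 is an arbitrary test point):

```python
import numpy as np

n = 200_000
t = np.linspace(0.0, 1.0, n + 1)
tm = 0.5 * (t[:-1] + t[1:])
dt = 1.0 / n

def F(x):
    # F(x) = integral of (x^2 + t)^3 over t in [0, 1], by midpoint rule
    return np.sum((x**2 + tm) ** 3) * dt

x0, h = 1.3, 1e-5
finite_difference = (F(x0 + h) - F(x0 - h)) / (2 * h)
leibniz = np.sum(3 * (x0**2 + tm) ** 2 * (2 * x0)) * dt   # integral of the partial derivative
print(finite_difference, leibniz, 2 * x0 * (x0**2 + 1) ** 3 - 2 * x0**7)
```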

We can use Theorem 8.6.9 and the chain rule to prove a more general result. This generalized Leibniz formula is important in the proof of Green's theorem (Theorem 10.5.15).

Corollary 8.6.12. Let X and A be open intervals in ℝ, and let f: X × A → ℝ be continuous with continuous partial derivative ∂f/∂x at each point of X × A. If a, b: X → A are differentiable functions and $\psi(x) = \int_{a(x)}^{b(x)} f(x,t)\,dt$, then ψ(x) is differentiable and
$$\frac{d}{dx}\psi(x) = \int_{a(x)}^{b(x)}\frac{\partial f(x,t)}{\partial x}\,dt - a'(x)f(x,a(x)) + b'(x)f(x,b(x)). \qquad (8.10)$$

Proof. The proof is Exercise 8.29. D

Example 8.6.13. Let $F(x) = \int_{\sin(x)}^{\cos(x)} \operatorname{Arctan}(x+t)\,dt$. To compute F'(x) we may use Corollary 8.6.12:
$$F'(x) = \frac{d}{dx}\int_{\sin(x)}^{\cos(x)}\operatorname{Arctan}(x+t)\,dt = \int_{\sin(x)}^{\cos(x)}\frac{1}{1+(x+t)^2}\,dt - \cos(x)\operatorname{Arctan}(x+\sin(x)) - \sin(x)\operatorname{Arctan}(x+\cos(x))$$
$$= (1 - \sin(x))\operatorname{Arctan}(x+\cos(x)) - (1 + \cos(x))\operatorname{Arctan}(x+\sin(x)).$$
Example 8.6.14. Generalizing the previous example, for any constants r, s ∈ ℝ and any function of the form $G(x) = \int_{a(x)}^{b(x)} g(rx+st)\,dt$, we have
$$G'(x) = \int_{a(x)}^{b(x)} r\,g'(rx+st)\,dt - a'(x)g(rx+sa(x)) + b'(x)g(rx+sb(x))$$
$$= \frac{r}{s}\,g(rx+st)\Big|_{t=a(x)}^{b(x)} - a'(x)g(rx+sa(x)) + b'(x)g(rx+sb(x))$$
$$= \frac{r}{s}\,g(rx+sb(x)) - \frac{r}{s}\,g(rx+sa(x)) - a'(x)g(rx+sa(x)) + b'(x)g(rx+sb(x))$$
$$= \Bigl(\frac{r}{s} + b'(x)\Bigr)g(rx+sb(x)) - \Bigl(\frac{r}{s} + a'(x)\Bigr)g(rx+sa(x)).$$

8.7 Change of Variables

In one dimension the chain rule gives the very useful change of variables formula (6.19):
$$\int_c^d f(g(s))g'(s)\,ds = \int_{g(c)}^{g(d)} f(u)\,du,$$
which is sometimes known as the substitution formula or u-substitution.

We now extend this to a change of variables formula for higher dimensions. Often an integral that is hard to compute in one coordinate system is much easier to compute in another coordinate system, and the change of variables formula is precisely what we need to move from one system to another.

8.7.1 Diffeomorphisms

The type of function we use for a change of variables is called a diffeomorphism.

Definition 8.7.1. Let U and V be open subsets of ℝⁿ. We say that Ψ: U → V is a diffeomorphism if Ψ is a C¹ bijection such that Ψ⁻¹ is also C¹.

Remark 8.7.2. The composition of two diffeomorphisms is a diffeomorphism. If Ψ: U → V is a diffeomorphism, then the derivative DΨ must be invertible at every point t ∈ U because the chain rule gives
$$(DI)(t) = D(\Psi^{-1}\circ\Psi)(t) = D(\Psi^{-1})(\Psi(t))\,D\Psi(t),$$
where I: U → U is the identity map, whose derivative at every point is the identity matrix. Therefore $D(\Psi^{-1})(\Psi(t)) = \bigl(D\Psi(t)\bigr)^{-1}$.


Example 8.7.3.

(i) The map f: (0,1) → (−1,1) given by f(x) = 2x − 1 is a diffeomorphism because f⁻¹(y) = (y+1)/2 is C¹.

(ii) The map g: (−1,1) → ℝ given by g(x) = x/(1 − x²) has Dg(x) = (1 + x²)/(1 − x²)², which is strictly positive for all x ∈ (−1,1), so g is strictly increasing and, hence, is injective. Since g is continuous and since lim_{x↓−1} g(x) = −∞ and lim_{x↑1} g(x) = ∞, the map g is also surjective. Thus, g has an inverse, and the inverse function theorem (Theorem 7.4.8) guarantees that the inverse is C¹. Therefore, g is a diffeomorphism.

Using the quadratic formula gives $g^{-1}(y) = \bigl(-1 + \sqrt{1+4y^2}\bigr)/(2y)$, but for a more complicated function, the inverse cannot always be written in a simple closed form. Nevertheless, we can often show the inverse exists from general arguments, and the inverse function theorem can often guarantee that the inverse is C¹, even if we don't know explicitly what the inverse function is.

(iii) The map f(x,y) = (cos(x), sin(y)) from the open square (0, π/2) × (0, π/2) to the open unit square (0, 1) × (0, 1) is a diffeomorphism because it is injective (cos and sin are both injective on (0, π/2)) and
$$Df(x,y) = \begin{bmatrix} -\sin(x) & 0\\ 0 & \cos(y)\end{bmatrix}$$
has a nonzero determinant for all (x,y) ∈ (0, π/2) × (0, π/2), so the inverse function theorem guarantees that f⁻¹ is C¹.

Unexample 8.7.4.

(i) The map g: (−1,1) → (−1,1) given by g(x) = x³ is not a diffeomorphism. Although it is bijective with inverse g⁻¹(y) = y^{1/3}, the inverse function is not C¹ at 0.

(ii) Let U be the punctured plane U = ℝ² ∖ {0}, and let h: (0,∞) × ℝ → U be given by h(r,t) = (r cos(t), r sin(t)). It is straightforward to check that h is surjective. The derivative is
$$Dh = \begin{bmatrix} \cos(t) & -r\sin(t)\\ \sin(t) & r\cos(t)\end{bmatrix},$$
which has determinant r > 0, so the derivative is always invertible. Therefore, the inverse function theorem guarantees that in a sufficiently small neighborhood of any point of U there is a C¹ inverse of h. But there is no single function h⁻¹ defined on all of U because h is not injective. Thus, h is not a diffeomorphism.
8.7.2 The Change of Variables Formula

In the single-variable change of variables formula (6.19), if g is a diffeomorphism on an open set containing [c, d], then the derivative g' is continuous and never zero, so it cannot change sign on the interval [c, d]. If g' < 0 on (c, d), then g(d) < g(c), so
$$\int_{g([c,d])} f = \int_{g(d)}^{g(c)} f(\tau)\,d\tau = -\int_{g(c)}^{g(d)} f(\tau)\,d\tau = -\int_c^d f(g(s))g'(s)\,ds = \int_{[c,d]} f(g(s))\,|g'(s)|\,ds.$$
If g' > 0 on (c, d), then
$$\int_{g([c,d])} f = \int_{g(c)}^{g(d)} f(\tau)\,d\tau = \int_c^d f(g(s))g'(s)\,ds = \int_{[c,d]} f(g(s))\,|g'(s)|\,ds.$$
In either case, we may write this as
$$\int_{g([c,d])} f = \int_{[c,d]} (f\circ g)\,|g'|. \qquad (8.11)$$
This is essentially the form of the change of variables formula in higher dimensions. The main theorem of this section is the following.

Theorem 8.7.5 (Change of Variables Theorem). Let U and V be open subsets of ℝⁿ, let X ⊂ U be a measurable subset of ℝⁿ, and let Ψ: U → V be a diffeomorphism. The set Y = Ψ(X) is measurable, and if f: Y → ℝ is integrable, then (f∘Ψ)|det(DΨ)| is integrable on X and
$$\int_Y f = \int_X (f\circ\Psi)\,|\det(D\Psi)|. \qquad (8.12)$$

Remark 8.7.6. In the special case that Ψ: U → V is a linear transformation, then DΨ = Ψ and the change of variables formula says that $\int_Y f = |\det(\Psi)|\int_X (f\circ\Psi)$.

An especially important special case of this is when X is an interval [a, b] ⊂ ℝⁿ and f = 1. In this case (8.12) says that the volume of Y = Ψ([a, b]) is exactly |det(Ψ)| times the volume of [a, b]. This should not be surprising: the singular value decomposition says that Ψ can be written as UΣVᵀ, where U and V are orthonormal. Orthonormal matrices are products of rigid rotations and reflections, so they should not change the volume at all, and Σ is diagonal, so it scales the ith standard basis vector by σᵢ. This changes the volume by the product of these σᵢ, that is, by the determinant of Σ, which is the absolute value of the determinant of Ψ.

We prove the change of variables theorem in Section 9.5. For the rest of this
section we discuss some implications and examples.
Example 8.7.7. Usually either the geometry of the region or the structure of the integrand gives a hint about which diffeomorphism to use for change of variables. Consider the integral
$$\iint_R \sin\Bigl(\frac{y-x}{x+y}\Bigr)\,dA,$$
where R is the trapezoid region with vertices (1, 0), (3, 0), (0, 3), and (0, 1). Without a change of variables, this is not so easy to compute. But the presence of the terms x + y and x − y in the integrand and the fact that two of the sides of the trapezoid are segments of the lines x + y = 1 and x + y = 3 suggest the change of variables u = y + x and v = y − x. Writing x and y in terms of u and v gives Ψ(u, v) = (½(u − v), ½(u + v)) with
$$D\Psi = \begin{bmatrix} 1/2 & -1/2\\ 1/2 & 1/2\end{bmatrix} \quad\text{and}\quad |\det(D\Psi)| = \frac{1}{2}.$$
Applying the change of variables formula with Ψ yields
$$\frac{1}{2}\int_1^3\int_{-u}^{u}\sin\Bigl(\frac{v}{u}\Bigr)\,dv\,du = 0.$$
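A direct Riemann sum over the trapezoid in the original (x, y) variables gives the same answer as the change-of-variables computation. The sketch below assumes NumPy and uses the integrand sin((y − x)/(x + y)) written above; the result should be near 0:

```python
import numpy as np

n = 2000
x = np.linspace(0.0, 3.0, n + 1)
xm = 0.5 * (x[:-1] + x[1:])                      # cell midpoints, all strictly positive
X, Y = np.meshgrid(xm, xm, indexing="ij")
cell = (3.0 / n) ** 2
mask = (X + Y >= 1.0) & (X + Y <= 3.0)           # the trapezoid 1 <= x + y <= 3, x, y >= 0
vals = np.where(mask, np.sin((Y - X) / (X + Y)), 0.0)
print(np.sum(vals) * cell)                        # close to 0
```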

Corollary 8.7.8. If Ψ: U → V is a diffeomorphism, then for any subset E ⊂ U of measure zero, the image Ψ(E) has measure zero.

Proof. If E has measure zero, then $\int_{\Psi(E)} 1 = \int_E |\det(D\Psi)| = \int_U \mathbb{1}_E\,|\det(D\Psi)| = 0$ because 𝟙_E = 0 a.e. The corollary now follows from Exercise 8.20. ∎

8.7.3 Polar Coordinates

One important application of Theorem 8.7.5 is integration in polar coordinates. Define Ψ: (0,∞) × (0,2π) → ℝ² by Ψ(r,θ) = (r cos(θ), r sin(θ)). This is just the restriction of the map h of Unexample 8.7.4(ii) to the set (0,∞) × (0,2π).

The map Ψ is a diffeomorphism from (0,∞) × (0,2π) to ℝ² ∖ {(x, 0) | x ≥ 0} (the plane with the origin and positive x-axis removed). To see this, first verify that Ψ is bijective, so it has an inverse. Moreover, Ψ is C¹, and since det(DΨ) = r > 0 is never zero, the inverse function theorem implies that the inverse must also be C¹.

For any A ⊂ (0,∞) × (0,2π), if we let B = Ψ(A) ⊂ ℝ², then (8.12) gives
$$\iint_B f(x,y)\,dx\,dy = \int_B f = \int_A (f\circ\Psi)\,r = \iint_A f(r\cos(\theta), r\sin(\theta))\,r\,dr\,d\theta. \qquad (8.13)$$
Moreover, we can extend this relation to [0,∞) × [0,2π] because the rays defined by θ = 0 and θ = 2π have measure zero and hence contribute nothing to the integral.
Example 8.7.9. Let A be the region {(r,θ) | 0 ≤ θ ≤ π/3, 0 ≤ r ≤ √(sin(3θ))}, and let B = Ψ(A) be the corresponding region in rectangular coordinates, as in Figure 8.5.

The area of B is
$$\int_B 1 = \iint_A r\,dr\,d\theta = \int_0^{\pi/3}\int_0^{\sqrt{\sin(3\theta)}} r\,dr\,d\theta = \int_0^{\pi/3}\frac{1}{2}\sin(3\theta)\,d\theta = \frac{1}{3}.$$

Figure 8.5. The petal-shaped region B of Example 8.7.9. Changing to polar coordinates simplifies the computation of the area of B.

Example 8.7.10. The integral $I = \int_{-\infty}^{\infty} e^{-x^2}\,dx$ plays an important role in probability and statistics, but it is not easy to integrate using traditional one-variable techniques. We can convert it to a two-dimensional integral that is easy to compute using polar coordinates, as follows:
$$I^2 = \Bigl(\int_{-\infty}^{\infty} e^{-x^2}\,dx\Bigr)\Bigl(\int_{-\infty}^{\infty} e^{-y^2}\,dy\Bigr) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{-x^2-y^2}\,dx\,dy = \int_{\mathbb{R}^2} e^{-(x^2+y^2)}\,dA = \int_0^{2\pi}\int_0^{\infty} e^{-r^2}\,r\,dr\,d\theta = \int_0^{2\pi}\frac{1}{2}\,d\theta = \pi.$$
Thus I = √π.
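A one-line numerical check (a sketch assuming NumPy; the integrand is negligible outside [−10, 10], so truncating there is harmless):

```python
import numpy as np

n = 400_000
x = np.linspace(-10.0, 10.0, n + 1)
xm = 0.5 * (x[:-1] + x[1:])
print(np.sum(np.exp(-xm**2)) * (20.0 / n), np.sqrt(np.pi))   # both are about 1.7724539
```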

Example 8.7.11. Sometimes more than one change of variables is needed. Consider the integral
$$\iint_E y^2\,dA,$$
where E is the region bounded by the ellipse 16x² + 4y² = 64. Since circles are generally easier to work with than ellipses, it is natural to make the substitution Ψ₁(u, v) = (u, 2v), so that the ellipse becomes 16u² + 16v² = 64, that is, the circle u² + v² = 4, and |det(DΨ₁)| = 2. This yields the integral
$$2\iint_C 4v^2\,dA,$$
where C is the region bounded by the circle u² + v² = 4. Switching to polar coordinates Ψ₂(r,θ) = (r cos(θ), r sin(θ)) gives
$$8\int_0^{2\pi}\int_0^2 r^3\sin^2(\theta)\,dr\,d\theta = 32\pi.$$

8.7.4 Spherical and Hyperspherical Coordinates

Spherical coordinates in ℝ³ are similar to polar coordinates in ℝ². The three new coordinates are r, θ, φ, where θ is the angle in the xy-plane, φ is the angle from the z-axis, and r is the radius, as depicted in Figure 8.6.

Definition 8.7.12. Let U = (0,2π) × (0,π) × (0,∞), and define spherical coordinates S: U → ℝ³ by S(θ,φ,r) = (r sin(φ)cos(θ), r sin(φ)sin(θ), r cos(φ)).

We have
$$DS = \begin{bmatrix} -r\sin(\varphi)\sin(\theta) & r\cos(\varphi)\cos(\theta) & \sin(\varphi)\cos(\theta)\\ r\sin(\varphi)\cos(\theta) & r\cos(\varphi)\sin(\theta) & \sin(\varphi)\sin(\theta)\\ 0 & -r\sin(\varphi) & \cos(\varphi)\end{bmatrix}$$

Figure 8.6. Representation of a point (red) in spherical coordinates (θ, φ, r) in 3-space. Observe that θ is the angle in the xy-plane to the projection (blue) of the point to that plane, whereas φ is the angle from the z-axis.
and
$$|\det(DS)| = r^2\sin(\varphi).$$
It is straightforward to check that S is C¹ and bijective onto S(U) = V, and hence has an inverse. Since det(DS) never vanishes, the inverse function theorem guarantees that the inverse is C¹, and thus S is a diffeomorphism. The change of variables formula (8.12) gives
$$\int_{S(X)} f = \iiint_{S(X)} f(x,y,z)\,dx\,dy\,dz = \int_X (f\circ S)\,r^2\sin(\varphi) = \int_X f(r\sin(\varphi)\cos(\theta),\ r\sin(\varphi)\sin(\theta),\ r\cos(\varphi))\,r^2\sin(\varphi)\,dr\,d\varphi\,d\theta \qquad (8.14)$$
for any measurable X ⊂ U.

Example 8.7.13. Consider the region D = [0, 2π] × [0, π/6] × [0, R], which when mapped by S gives an ice-cream-cone-shaped solid C ⊂ ℝ³ as in Figure 8.7. As with polar coordinates, spherical coordinates are not bijective if we include the boundary, but the boundary has measure zero, so it contributes nothing to the integral.

Using (8.14) we see the volume of C is given by
$$\int_C 1 = \int_0^R\int_0^{\pi/6}\int_0^{2\pi} r^2\sin(\varphi)\,d\theta\,d\varphi\,dr = \frac{(2-\sqrt{3})\pi R^3}{3}.$$

The next definition generalizes polar and spherical coordinates to arbitrary dimensions as follows.

Figure 8.7. The ice-cream-cone-shaped region of Example 8.7.13.
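The spherical-coordinate computation is easy to verify numerically, since the integrand r² sin(φ) separates into a product of one-dimensional integrals. A sketch (assuming NumPy, with R = 2 chosen arbitrarily):

```python
import numpy as np

R, n = 2.0, 100_000
r = np.linspace(0.0, R, n + 1);         rm = 0.5 * (r[:-1] + r[1:])
p = np.linspace(0.0, np.pi / 6, n + 1); pm = 0.5 * (p[:-1] + p[1:])
# theta contributes a factor of 2*pi; the r and phi integrals are done by midpoint rule
vol = 2.0 * np.pi * (np.sum(np.sin(pm)) * (np.pi / 6) / n) * (np.sum(rm**2) * R / n)
print(vol, (2.0 - np.sqrt(3.0)) * np.pi * R**3 / 3.0)   # both are about 2.2437
```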


Definition 8.7.14. Define hyperspherical coordinates on ℝⁿ by
$$\Psi: (0,\pi)\times\cdots\times(0,\pi)\times(0,2\pi)\times(0,\infty)\to\mathbb{R}^n,$$
$$(\varphi_1,\ldots,\varphi_{n-1},r) \mapsto r\begin{bmatrix}\cos(\varphi_1)\\ \sin(\varphi_1)\cos(\varphi_2)\\ \sin(\varphi_1)\sin(\varphi_2)\cos(\varphi_3)\\ \vdots\\ \sin(\varphi_1)\sin(\varphi_2)\cdots\sin(\varphi_{n-2})\cos(\varphi_{n-1})\\ \sin(\varphi_1)\sin(\varphi_2)\cdots\sin(\varphi_{n-2})\sin(\varphi_{n-1})\end{bmatrix}.$$

A straightforward but tedious computation gives
$$\det(D\Psi) = r^{n-1}\sin^{n-2}(\varphi_1)\sin^{n-3}(\varphi_2)\cdots\sin(\varphi_{n-2}),$$
so if we let U = (0,π) × ⋯ × (0,π) × (0,2π) × (0,∞) and V = Ψ(U) ⊂ ℝⁿ, then
$$\int_V f = \int_U (f\circ\Psi)\,r^{n-1}\sin^{n-2}(\varphi_1)\sin^{n-3}(\varphi_2)\cdots\sin(\varphi_{n-2}).$$

As with polar and spherical coordinates, one can check that Ψ is bijective to its image, and since det(DΨ) does not vanish on its domain, the inverse function theorem guarantees that the inverse is C¹, so Ψ is a diffeomorphism to its image. We use hyperspherical coordinates to find the volume of the unit ball in ℝⁿ (see Exercise 8.36).

Exercises
Note to the student: Each section of this chapter has several corresponding
exercises, all collected here at the end of the chapter. The exercises between the
first and second line are for Section 1, the exercises between the second and third
lines are for Section 2, and so forth .
You should work every exercise (your instructor may choose to let you skip
some of the advanced exercises marked with *). We have carefully selected them,
and each is important for your ability to understand subsequent material. Many of
the examples and results proved in the exercises are used again later in the text.
Exercises marked with &. are especially important and are likely to be used later
in this book and beyond. Those marked with t are harder than average, but should
still be done.
Although they are gathered together at the end of the chapter, we strongly
recommend you do the exercises for each section as soon as you have completed
the section, rather than saving them until you have finished the entire chapter. The
exercises for each section are separated from those for other sections by a
horizontal line.

8.1. Give an example of an interval [a, b] C IR and a function f: [a, b] --+IR that is
not in ~([a, b]; IR), but whose absolute value Iii is in ~([a, b]; IR) .
8.2. Prove Proposition 8.1.8.

8.3. Prove Theorem 8.1.9.


8.4. Prove the claim in Unexample 8.1.14 that the uniform limit of a sequence of
right-continuous functions is right continuous.
8.5. Prove (i) and (ii) of Proposition 8.1.11.

8.6. Prove Proposition 8.2.2, as follows:


(i) First prove that II· l u satisfies all the conditions of a norm (see Defini-
tion 3.5.1) except positivity.
(ii) Prove that llfllu 2:: 0 for all f E &i'([a, b]; X).
(iii) If f E &i'([a, b]; X) and if f =/::. 0, then there exists some p E [a, b] such
that llf(p) ll > 0. Prove that there exists an interval R C [a, b] with
>..(R) > 0 and with the property that for all x E R we have llf(x)ll >
llf(p) ll/3. Hint: By definition of &i'([a, b]; X) there exists a sequence
(sn)~ 0 of step functions that converges uniformly to f. Using the defi-
nition of uniform convergence, let c; > 0 be chosen so that c; < llf(p)ll/3.
(iv) Prove that f(a,b] llfll 2:: JR
llfll 2:: >..(R) l f(p)ll/3 > 0, and hence
l f llu > 0.
8.7. Prove Proposition 8.2.8.
8.8. Prove Corollary 8.2.9.
8.9. Give an example of a sequence of step functions that is Cauchy in the L 1 -norm
but is not Cauchy in the L 00 -norm.
8.10. Prove Corollary 8.2.11.

8.11 . Prove that a single point in JRn has measure zero.


8.12. Prove Proposition 8.3.6.
8.13. Let f_n(x) = n/(1 + n²x²). Show that f_n → 0 a.e. as n → ∞. What are all the values of x where f_n(x) ↛ 0?
8.14. Prove that a bounded set E ⊂ ℝⁿ has measure zero if and only if 𝟙_E is integrable and ∫_E 1 = 0.

8.15. For any b > a > 0, find (with proof) the value of $\lim_{n\to\infty}\int_a^b \log(x)\,e^{-nx}\,dx$.
8.16. &. Prove that if (f_k)_{k=0}^∞ is any sequence of nonnegative functions in L¹([a,b];ℝ) such that
$$\sum_{k=0}^{\infty}\int_{[a,b]} f_k < \infty,$$
then $\sum_{k=0}^{\infty} f_k$ converges almost everywhere on [a,b], and
$$\int_{[a,b]}\sum_{k=0}^{\infty} f_k = \sum_{k=0}^{\infty}\int_{[a,b]} f_k.$$
Hint: Recall that $\sum_{k=0}^{\infty} f_k$ is $\lim_{N\to\infty}\sum_{k=0}^{N} f_k$.


8.17. Let f ∈ L¹([a,b];ℝ) with 0 ≤ f < 1 a.e. Prove that
$$\lim_{k\to\infty}\int_{[a,b]} f^k = 0.$$
8.18. Let A ⊂ ℝⁿ be an unbounded measurable set. Prove that

(i) a measurable function g is integrable on A if and only if |g| is integrable on A;

(ii) the set L¹(A;ℝ) is a normed linear space with the L¹-norm.
8.19. For each of the following functions, either prove that the function is integrable on ℝ or prove that it is not integrable on ℝ. Hint: Some of the integrals cannot be evaluated easily, but you can still bound them using the basic properties of Proposition 8.4.2.

(i) f(x) = eˣ sin(eˣ).

(ii) f(x) = eˣ/(1 + e^{2x}).

(iii) f(x) = √x e^{−x} for x ≥ 0, and f(x) = 0 for x < 0.

(iv) f(x) = cos(x)/(1 + x²).

(v) f(x) = x^{−p} for x ≥ 1, and f(x) = 0 for x < 1. (Decide and prove which values of p make f integrable and which do not.)
8.20. Prove the result of Exercise 8.14 in the case that E is not necessarily bounded.

8.21. Prove the reverse Fatou lemma: If (f_k)_{k=0}^∞ is a sequence in L¹([a,b];ℝ) and if there exists some nonnegative g ∈ L¹([a,b];ℝ) with f_k ≤ g a.e. for all k ∈ ℕ, then
$$\limsup_{k\to\infty}\int_{[a,b]} f_k \le \int_{[a,b]}\limsup_{k\to\infty} f_k.$$
8.22. Prove that in Fatou's lemma the condition that each fk be nonnegative can be
replaced with the condition that there exists some function g E L 1 ([a, b]; JR)
such that for all k E N we have f k 2: g a.e.
8.23. Let
$$g(t) = \int_0^1 \frac{e^{-t^2(1+y^2)}}{1+y^2}\,dy.$$
Prove that lim_{t→∞} g(t) = 0.


8.24. Let f_n(x) = n/(1 + n²x²). Recall from Exercise 8.13 that f_n → 0 a.e.

(i) What does Fatou's lemma tell us about $\liminf_{n\to\infty}\int_a^b f_n\,dx$ in this case?

(ii) Use one of the convergence theorems to prove that $\lim_{n\to\infty}\int_a^b f_n(x)\,dx = 0$ for all b > a > 0.

(iii) Antidifferentiate explicitly to show that if 0 < b, then $\lim_{n\to\infty}\int_0^b f_n(x)\,dx = \pi/2$, and if a < 0 < b, then $\lim_{n\to\infty}\int_a^b f_n(x)\,dx = \pi$. Therefore,
$$\int_a^b \lim_{n\to\infty} f_n(x)\,dx = 0 < \pi/2 \le \lim_{n\to\infty}\int_a^b f_n(x)\,dx$$
when a ≤ 0.
(iv) Graph f_n(x) in an open neighborhood of x = 0 for several values of n ≥ 1.

(v) Explain why both the monotone and dominated convergence theorems fail to apply for a ≤ 0 < b.
8.25. Let (f_k)_{k=0}^∞ be a sequence in L¹([a,b];ℝ) with f_k → f a.e. Prove that if |f| ≤ g a.e. for some g ∈ L¹([a,b];ℝ), then f ∈ L¹([a,b];ℝ).
8.26. Let I be the integral $\int_C (2xy + 1)\,dxdy$, where C is the region bounded by y = x² and y = 2x. Write I as an iterated integral in two different ways. Compute the value of I.
8.27. Determine the volume of the region in ]Rn defined by the inequality

Justify (prove) your result. Hint: Consider the cases n = 1, n = 2, and


n = 3. Once you find the volume in these three cases, you should be able to
determine the pattern and use induction to prove it.
8.28. For every nonnegative integer n, let B(O, R) C ]Rn be the closed n-ball of
radius R. Prove that vol(B(O, R)) = Rnvol(B(O, 1)). Hint: Induct on n.
For the induction step, slice the ball perpendicular to one axis to get an
(n - 1)-ball, and then integrate over all the slices.
8.29. Use Leibniz's integral rule (Theorem 8.6.9) and the chain rule (Theorem 6.4.7) to prove Corollary 8.6.12. Hint: Consider the function $\Phi(x,\alpha,\beta) = \int_\alpha^\beta f(x,t)\,dt$.
8.30. Show that $\int_0^\infty t^n e^{-t}\,dt = n!$ as follows:

(i) For any N > 0 show that
$$\int_0^N e^{-tx}\,dt = \frac{1 - e^{-Nx}}{x}. \qquad (8.15)$$

(ii) Differentiate (8.15) repeatedly to show $\int_0^N (-1)^n t^n e^{-tx}\,dt = \frac{d^n}{dx^n}\frac{1-e^{-Nx}}{x}$.

(iii) Use the results of Exercise 6.26 to show for any x > 0 and for any n ∈ ℕ that $\int_0^\infty t^n e^{-tx}\,dt = \frac{n!}{x^{n+1}}$.

(iv) Evaluate at x = 1 to conclude that $\int_0^\infty t^n e^{-t}\,dt = n!$.

8.31. Show that $\int_0^\infty e^{-x^2}\,dx = \sqrt{\pi}/2$ as follows. Let $f(t) = \bigl(\int_0^t e^{-x^2}\,dx\bigr)^2$ and let
$$g(t) = \int_0^1 \frac{e^{-t^2(1+y^2)}}{1+y^2}\,dy.$$

(i) Show that f'(t) + g'(t) = 0 for all t > 0. Hint: After using Leibniz, consider the substitution u = ty.

(ii) Show that f(t) + g(t) = π/4 for all t > 0.

(iii) Use the result of Exercise 8.23 to show that lim_{t→∞} f(t) = π/4 and hence that $\int_0^\infty e^{-x^2}\,dx = \sqrt{\pi}/2$.

8.32. Let D ⊂ ℝ² be the square with vertices (2, 2), (3, 3), (2, 4), (1, 3). Compute the integral
$$\int_D \ln(y^2 - x^2)\,dxdy.$$

8.33. Derive a formula for the volume of the ellipsoid
$$\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} \le 1.$$
More generally, if A ∈ M₃(ℝ) is positive definite, derive a formula for the volume of the ellipsoid xᵀAx ≤ 1.
8.34. Given A ∈ Mₙ(ℝ) with A > 0, generalize Example 8.7.10 by showing that
$$\int_{\mathbb{R}^n} e^{-x^{\mathsf{T}}Ax}\,dx = \frac{\pi^{n/2}}{\sqrt{\det(A)}}.$$
8.35. Let Q be the open first quadrant Q = {(x,y) | x > 0, y > 0}, and let H be the upper half plane H = {(s,t) | t > 0}. Define Φ: Q → H by Φ(x,y) = (x² − y², xy) for (x,y) in Q. For a point (x,y) in Q, the pair of numbers (x² − y², xy) = Φ(x,y) are called hyperbolic coordinates for (x,y).

(i) Show that Φ: Q → H is a diffeomorphism.

(ii) Now define D = {(x,y) | x > 0, y > 0, 1 ≤ x² − y² ≤ 9, 2 ≤ xy ≤ 4}. Use hyperbolic coordinates to show that
$$\int_D (x^2 + y^2)\,dxdy = 8.$$
8.36. Let R denote the box [0,π] × ⋯ × [0,π] × [0,2π).

(i) Use hyperspherical coordinates to express the volume of the unit n-ball in terms of the integral $I = \int_R \sin(\varphi_1)\sin^2(\varphi_2)\cdots\sin^{n-2}(\varphi_{n-2})$.

(ii) Use hyperspherical coordinates to express the integral $\int_{\mathbb{R}^n} e^{-\|x\|^2}\,dx$ in terms of I and the Gamma function $\Gamma(x) = \int_0^\infty t^{x-1}e^{-t}\,dt$.

(iii) Use these results, combined with Example 8.7.10, to give a formula for the volume of the unit n-ball.

(iv) Combine this with Exercise 8.28 to give a formula for the volume of any n-ball of radius r. (Alternatively, you could slightly generalize your computation in (i) to achieve the same result.)

Notes
Much of our treatment of integration in this chapter and the next is inspired by Soo
Bong Chae's beautiful book [Cha95], which develops Lebesgue integration using an
approach due to Riesz. Another source for the Riesz approach is [Soh14]. The Riesz
approach has some significant similarities to our approach, but at heart they are
still very different ways of looking at integration. Sources on the Daniell integral
include [BM14], [AB66], and [Roy63] (the first edition of [RF10]). The Bochner
integral is described in [Mik14] . Sources for a more standard approach to Lebesgue
integration and measure theory include [Bre08, Jon93, RF10, Rud87].
Exercise 8.30 comes from Keith Conrad's "blurb" on differentiation under
the integral sign [Con16]. Exercise 8.31 comes from Luc Rey-Bellet [Rey06].
Exercise 8.35 comes from Fitzpatrick [Fit06].
Chapter 9

*Integration II

Nature laughs at the difficulties of integration.


-Pierre-Simon Laplace

In this chapter we give the details and remaining proofs of the development of the
Daniell-Lebesgue integral, as outlined in the previous chapter.

9.1 Every Normed Space Has a Unique Completion


In this section we prove that every normed linear space can be embedded (uniquely)
as a dense subspace of some Banach space. This is very handy in many settings,
but our immediate interest is to use it to complete the space of step functions with
respect to the L 1 -norm.

9.1.1 Seminorms and Norms


Before we begin our discussion of completion, we need one simple fact about seminorms. Recall from Definition 3.5.1 that seminorms are functions that are almost norms, but where ‖f‖ = 0 need not imply that f = 0.

Proposition 9.1.1. If V is a vector space and ‖·‖ is a seminorm on V, then the set K = {v ∈ V | ‖v‖ = 0} forms a vector subspace of V. Moreover, ‖·‖: V/K → ℝ, defined by the rule ‖v + K‖ = ‖v‖, forms a norm on the quotient space V/K.

Proof. The proof is Exercise 9.1. ∎

9.1.2 Every Normed Linear Space Has a Completion

The main theorem of this section is the following.

Theorem 9.1.2. For any normed linear space (X, ‖·‖), there exists a Banach space $(\overline{X}, \|\cdot\|_{\overline{X}})$ and an injective linear map $\phi: X \to \overline{X}$ such that for every x ∈ X we have $\|\phi(x)\|_{\overline{X}} = \|x\|$ (we call such a map an isometric embedding) and such that φ(X) is dense in $\overline{X}$. Moreover, this embedding is unique, in the sense that if $\widehat{X}$ is another Banach space with an isometric embedding $\psi: X \to \widehat{X}$ such that ψ(X) is dense in $\widehat{X}$, then there exists a unique isomorphism of Banach spaces $g: \widehat{X} \to \overline{X}$ such that g ∘ ψ = φ.

Remark 9.1.3. This theorem also holds for general metric spaces, that is, every
metric space can be embedded as a dense subset in a complete metric space; but
we need the additional linear structure in all of our applications, and our proofs are
simplified by assuming X is a normed vector space.

Proof. Let X' be the set of all Cauchy sequences in X. The space X maps injectively into X' by sending x ∈ X to the constant sequence (x)_{k=0}^∞. For any α, β ∈ 𝔽 and any two Cauchy sequences (x_k)_{k=0}^∞ and (y_k)_{k=0}^∞, let α(x_k)_{k=0}^∞ + β(y_k)_{k=0}^∞ be defined to be the sequence (αx_k + βy_k)_{k=0}^∞. It is straightforward to check that this is again a Cauchy sequence, that X' is a vector space, and that X is a vector subspace.

For each Cauchy sequence (x_k)_{k=0}^∞, the sequence of norms (‖x_k‖)_{k=0}^∞ is a Cauchy sequence in ℝ, since for any ε > 0 there is an N such that
$$\bigl|\,\|x_n\| - \|x_m\|\,\bigr| \le \|x_n - x_m\| < \varepsilon$$
whenever n, m > N. Thus, (‖x_k‖)_{k=0}^∞ has a limit. Define $\|(x_k)_{k=0}^\infty\|_{\overline{X}} = \lim_{k\to\infty}\|x_k\|$. Again, it is straightforward to check that this is a seminorm, but there are many sequences (x_k)_{k=0}^∞ such that ‖(x_k)_{k=0}^∞‖ = 0 but (x_k)_{k=0}^∞ ≠ (0)_{k=0}^∞ (the zero element of X'), so it is not a norm; see Exercise 9.2.

Let K ⊂ X' be the set of Cauchy sequences (x_k)_{k=0}^∞ such that $\|(x_k)_{k=0}^\infty\|_{\overline{X}} = 0$. Let $\overline{X} = X'/K$ be the quotient space. By Proposition 9.1.1, the seminorm $\|\cdot\|_{\overline{X}}$ induces a norm on the quotient $\overline{X}$. The map $\phi: X \to \overline{X}$ given by sending any x ∈ X to the equivalence class of the constant sequence (x)_{k=0}^∞ is an isometry, because ‖φ(x)‖ = lim_{k→∞}‖x‖ = ‖x‖. Also, φ is injective, because φ(x) ∈ K if and only if ‖x‖ = ‖φ(x)‖ = 0, which occurs if and only if x = 0.

To see that φ(X) is dense in $\overline{X}$, consider any ξ = (x_k)_{k=0}^∞ ∈ $\overline{X}$. For any ε > 0, choose an N > 0 such that ‖x_n − x_m‖ < ε/2 for all n, m > N. We have ‖φ(x_n) − ξ‖ = lim_{k→∞}‖x_n − x_k‖ ≤ ε/2, so φ(x_n) ∈ B(ξ, ε), hence ξ is in the closure of φ(X). Now by Lemma 5.4.26, since every Cauchy sequence in a dense subspace converges in $\overline{X}$, the space $\overline{X}$ is complete.

All that remains is the proof of uniqueness. If $\psi: X \to \widehat{X}$ is as in the hypothesis of the theorem, we must construct an isomorphism of Banach spaces $g: \widehat{X} \to \overline{X}$, by which we mean an isometry that has an inverse that is also an isometry.

Since ψ(X) is dense in $\widehat{X}$, for each $\hat x \in \widehat{X}$ there is some sequence (ψ(x_n))_{n=0}^∞ converging to $\hat x$. Define $g(\hat x) = \lim_{n\to\infty}\phi(x_n)$. To see that this map is well defined, consider any other sequence (ψ(y_n))_{n=0}^∞ in ψ(X) converging to $\hat x$. Both (ψ(x_n))_{n=0}^∞ and (ψ(y_n))_{n=0}^∞ are Cauchy sequences with ‖ψ(x_n − y_n)‖ → 0. Since ‖ψ(x)‖ = ‖x‖ = ‖φ(x)‖ for all x ∈ X, we have ‖φ(x_n − y_n)‖ → 0, so lim_{n→∞}φ(x_n) = lim_{n→∞}φ(y_n). Thus, $g(\hat x)$ does not depend on the choice of the sequence (x_n)_{n=0}^∞.
From the definition and the properties of φ and ψ, it is straightforward to verify that g is linear and $\|g(\hat x)\|_{\overline{X}} = \|\hat x\|_{\widehat{X}}$. An identical argument with the roles of $\overline{X}$ and $\widehat{X}$ switched gives another isometry, $h: \overline{X} \to \widehat{X}$, which is easily seen to be the inverse of g, and hence g is an isomorphism of Banach spaces. ∎

Vista 9.1.4. A similar argument can be used to construct the Banach space ℝ as the completion of the metric space ℚ. Beware, however, that this method of completing ℚ does not fit into the hypotheses of Theorem 9.1.2, because ℚ is not a vector space over ℝ or ℂ. The proof of the theorem above also uses the fact that ℝ is complete in order to construct the seminorm on X', so to complete ℚ in this way needs some additional steps.

9.1.3 The Space L¹([a,b];X)

The main application of Theorem 9.1.2 is to complete S([a,b];X) with respect to the L¹-norm. Recall that any sequence that converges with respect to the L∞-norm must also converge in the L¹-norm (see Corollary 8.2.9). This means that the completion of S([a,b];X) with respect to the L¹-norm contains the space of uniform limits of step functions. Therefore, to construct the desired space, we can either take Cauchy sequences in that larger space or Cauchy sequences in S([a,b];X). But, since step functions are so much simpler to work with, we use S([a,b];X).

Definition 9.1.5. Let L¹([a,b];X) denote the completion of S([a,b];X) with respect to the L¹-norm. By the continuous linear extension theorem, there is a unique linear extension of the step-function integral to all of L¹([a,b];X). We usually denote this extension by the symbol ∫_{[a,b]}. If [a,b] ⊂ ℝ, we often write ∫_a^b f instead of ∫_{[a,b]} f.

Remark 9.1.6. As discussed in the previous chapter, the integral is a bounded transformation, and hence continuous, so it commutes with limits in the L¹-norm (that is, limits pass through integrals). In particular, this means that if (s_k)_{k=0}^∞ is any L¹-Cauchy sequence representing an element f ∈ L¹([a,b];X), we have
$$\lim_{k\to\infty}\int_{[a,b]} s_k = \int_{[a,b]} f.$$
Moreover, we have defined the norm on L¹([a,b];X) to be
$$\|f\|_{L^1} = \lim_{k\to\infty}\|s_k\|_{L^1} = \lim_{k\to\infty}\int_{[a,b]}\|s_k(t)\|.$$
Thus, the integral and the L¹-norm both commute with limits of L¹-Cauchy sequences.

Proposition 9.1.7. If (s_k)_{k=0}^∞ is any L¹-Cauchy sequence of real-valued step functions on [a,b], then $(s_k^+)_{k=0}^\infty$ and $(s_k^-)_{k=0}^\infty$ are L¹-Cauchy, as is $(|s_k|)_{k=0}^\infty$.

Proof. The proof is Exercise 9.3. ∎



9.2 More about Measure Zero

9.2.1 An Alternative Definition of Sets of Measure Zero

In Definition 8.3.1 the sets of measure zero were described in terms of coverings by unions of intervals. This has many benefits pedagogically, but we prefer now to give an alternative definition. The new definition is a little harder to state and may be less intuitive, but it is easier to use when working with the concept of convergence almost everywhere.

Definition 9.2.1. A set E ⊂ [a,b] ⊂ ℝⁿ has measure zero if there is an L¹-Cauchy sequence (s_k)_{k=0}^∞ ⊂ S([a,b];ℝ) of step functions such that lim_{k→∞} |s_k(y)| = ∞ for each y ∈ E.

For the rest of this section, when we say measure zero, we mean it in the sense of Definition 9.2.1, unless otherwise specified. Near the end of this section we prove that the two definitions are equivalent. Unless otherwise specified, we work on a fixed compact interval [a,b] ⊂ ℝⁿ, so all functions are assumed to be defined on [a,b] and all integration is done on [a,b].

9.2.2 Convergence Almost Everywhere of L¹-Cauchy Sequences

Lemma 9.2.2. If (s_k)_{k=0}^∞ is an L¹-Cauchy, monotone increasing (or monotone decreasing) sequence of step functions, then there exists a function f such that s_k → f a.e.

Proof. Given a monotone increasing L¹-Cauchy sequence (s_k)_{k=0}^∞ of step functions, define f by
$$f(t) = \begin{cases} \lim_{k\to\infty} s_k(t) & \text{if the limit exists},\\ 0 & \text{otherwise}.\end{cases} \qquad (9.1)$$
For each t, the sequence (s_k(t))_{k=0}^∞ is nondecreasing, so the only way it can fail to converge is if s_k(t) → ∞. Let E be the set of points t where s_k(t) diverges. Using (s_k)_{k=0}^∞ itself as the L¹-Cauchy sequence in Definition 9.2.1 shows that E has measure zero, and hence s_k → f a.e.

The case where (s_k)_{k=0}^∞ is monotone decreasing follows from a similar argument; see Exercise 9.7. ∎

Lemma 9.2.3. If (s_k)_{k=0}^∞ is a sequence of step functions on [a,b] such that $\sum_{k=0}^{\infty}\|s_k\|_{L^1}$ converges to a finite value, then there exists a function F such that $\sum_{k=0}^{\infty} s_k = F$ a.e.

Proof. For each n ∈ ℕ let F_n be the partial sum $F_n = \sum_{k=0}^{n} s_k$, and let $T_n = \sum_{k=0}^{n} |s_k|$. Each T_n is a nonnegative step function, and the sequence (T_n)_{n=0}^∞ is monotone increasing. For each n we have
$$\|T_n\|_{L^1} \le \sum_{k=0}^{n}\|s_k\|_{L^1} \le \sum_{k=0}^{\infty}\|s_k\|_{L^1} < \infty.$$

Thus, the monotone increasing sequence (‖T_n‖_{L¹})_{n=0}^∞ converges to some value M. Given ε > 0, choose N > 0 such that 0 ≤ (M − ‖T_n‖_{L¹}) < ε for all n > N. We have
$$\|T_n - T_m\|_{L^1} = \int_{[a,b]} |T_n - T_m| = \int_{[a,b]} (T_n - T_m) = \|T_n\|_{L^1} - \|T_m\|_{L^1} = (M - \|T_m\|_{L^1}) - (M - \|T_n\|_{L^1}) < \varepsilon$$
whenever N < m < n. Therefore, (T_n)_{n=0}^∞ is L¹-Cauchy. By Lemma 9.2.2 it converges a.e., and thus (F_n)_{n=0}^∞ converges a.e., since pointwise absolute convergence implies pointwise convergence. The desired function F is given by setting $F(t) = \sum_{k=0}^{\infty} s_k(t)$ when the sum converges and F(t) = 0 when it does not converge. ∎

Proposition 9.2.4. If (s_k)_{k=0}^∞ is an L¹-Cauchy sequence of step functions on [a,b], then there exists a subsequence $(s_{k_\ell})_{\ell=0}^\infty$ and a function f such that $s_{k_\ell} \to f$ a.e. Moreover, there exist two monotone increasing L¹-Cauchy sequences $(\phi_\ell)_{\ell=0}^\infty$ and $(\psi_\ell)_{\ell=0}^\infty$ of step functions such that for every ℓ ∈ ℕ, we have $s_{k_\ell} = \phi_\ell - \psi_\ell$.

Proof. For each ℓ ∈ ℕ let k_ℓ be chosen such that ‖s_n − s_m‖_{L¹} < 2^{−ℓ} for all m, n ≥ k_ℓ. Let $g_0 = s_{k_0}$ and for each integer ℓ > 0 let $g_\ell = s_{k_\ell} - s_{k_{\ell-1}}$. This gives ‖g_ℓ‖_{L¹} < 2^{1−ℓ} for all ℓ > 0. We have
$$\sum_{\ell=0}^{\infty}\|g_\ell\|_{L^1} \le \|s_{k_0}\|_{L^1} + \sum_{\ell=1}^{\infty} 2^{1-\ell} = \|s_{k_0}\|_{L^1} + 2.$$
By Lemma 9.2.3 the sequence of partial sums $s_{k_N} = \sum_{\ell=0}^{N} g_\ell$ converges to a function f almost everywhere.

Let $\phi_m = \sum_{\ell=0}^{m} g_\ell^+$ and $\psi_m = \sum_{\ell=0}^{m} g_\ell^-$, so $s_{k_m} = \phi_m - \psi_m$. For each ℓ ∈ ℕ we have $g_\ell^+ \ge 0$ and $g_\ell^- \ge 0$, so the sequences $(\phi_m)_{m=0}^\infty$ and $(\psi_m)_{m=0}^\infty$ are both monotone increasing. They are also L¹-Cauchy because for any ε > 0
$$\|\phi_m - \phi_n\|_{L^1} = \sum_{\ell=n+1}^{m}\|g_\ell^+\|_{L^1} \le \sum_{\ell=n+1}^{m}\|g_\ell^+ + g_\ell^-\|_{L^1} = \sum_{\ell=n+1}^{m}\|g_\ell\|_{L^1} < \sum_{\ell=n+1}^{m} 2^{1-\ell} < 2^{1-n} < \varepsilon$$
whenever m > n > 1 − log₂(ε) (and similarly for $(\psi_\ell)_{\ell=0}^\infty$). ∎


9.2.3 Equivalence of Definitions of Sets of Measure Zero

We now use the previous results about convergence of L¹-Cauchy sequences to prove that the two definitions of measure zero (Definitions 9.2.1 and 8.3.1) are equivalent.

Proposition 9.2.5. For a set E ⊂ ℝⁿ the following are equivalent:

(i) E has measure zero in the sense of Definition 9.2.1.

(ii) There exists a monotone increasing L¹-Cauchy sequence $(\phi_k)_{k=0}^\infty$ of step functions such that φ_k(t) → ∞ for every t ∈ E.

(iii) E has measure zero in the sense of Definition 8.3.1.

Proof.
(i)⟹(ii) Let (s_k)_{k=0}^∞ be an L¹-Cauchy sequence of step functions with |s_k(t)| → ∞ for every t ∈ E. By Proposition 9.2.4 there is a pair of monotone increasing, L¹-Cauchy sequences $(\phi_\ell)_{\ell=1}^\infty$ and $(\psi_\ell)_{\ell=1}^\infty$ such that $(\phi_\ell - \psi_\ell)_{\ell=1}^\infty$ is a subsequence of (s_k)_{k=0}^∞. Since |s_k(t)| → ∞ for every t ∈ E, we must also have φ_k(t) → ∞ for every t ∈ E.

(ii)⟹(iii) Given $(\phi_k)_{k=0}^\infty$ as in (ii), let F = {t ∈ ℝⁿ | lim_{ℓ→∞} φ_ℓ(t) = ∞}. Since $(\phi_\ell)_{\ell=0}^\infty$ is L¹-Cauchy, we have lim_{ℓ→∞} ∫_{ℝⁿ} φ_ℓ = M < ∞. Monotonicity of the sequence implies that ∫_{ℝⁿ} φ_ℓ ≤ M for all ℓ.

Given ε > 0, let F_ℓ = {t ∈ ℝⁿ | φ_ℓ(t) > M/ε} for each ℓ ∈ ℕ. We have E ⊂ F ⊂ ⋃_{ℓ=1}^∞ F_ℓ, and since F_ℓ ⊂ F_{ℓ+1} for every ℓ, it suffices to show that each F_ℓ is a finite union of intervals, the sum of whose measures is less than ε. Since φ_ℓ is a step function, F_ℓ is a finite union of (partially open or closed) intervals, and λ(F_ℓ)M/ε < ∫_{ℝⁿ} φ_ℓ ≤ M, so λ(F_ℓ) < ε.

(iii)⟹(i) For each m ∈ ℕ choose intervals $(I_{m,k})_{k=0}^\infty$ satisfying the hypothesis of (iii) for ε = 2^{−m}. For each k ∈ ℕ let $s_k = \sum_{m\le k}\sum_{\ell\le m}\mathbb{1}_{I_{m,\ell}}$. These may not quite be step functions, because the intervals may not have the right form, but they are generalized step functions as defined in Exercise 9.6, and it is proved in Exercise 9.6 that it suffices to use generalized step functions in Definition 9.2.1.

We have
$$\int_{\mathbb{R}^n} s_k \le \sum_{m\le k}\sum_{\ell\le m}\lambda(I_{m,\ell}) \le \sum_{m\le k} 2^{-m} \le 2.$$
Moreover, the sequence (s_k)_{k=0}^∞ is monotone increasing, and hence (by essentially the same argument as given in the proof of Lemma 9.2.3) the sequence (s_k)_{k=0}^∞ is L¹-Cauchy. But for any t ∈ E, we have
$$s_k(t) = \sum_{m\le k}\sum_{\ell\le m}\mathbb{1}_{I_{m,\ell}}(t) \to \infty. \qquad\Box$$

Proposition 9.2.6. If E ⊂ ℝⁿ has measure zero in ℝⁿ, then E × ℝᵐ ⊂ ℝ^{n+m} has measure zero in ℝ^{n+m}.

Proof. The proof is Exercise 9.8. ∎

Corollary 9.2.7. The boundary of any interval (and hence the set of discontinuous points of any step function) has measure zero.

Proof. The boundary of any interval in ℝⁿ is contained in a finite union of sets of the form {a} × [c,d], where [c,d] ⊂ ℝ^{n−1}, and where a ∈ ℝ is a single point (hence of measure zero). ∎

9.3 Lebesgue-Integrable Functions

We have defined L¹([a,b];ℝ) to be a set of equivalence classes of L¹-Cauchy sequences. But equivalence classes of sequences are awkward to work with, and it would be much nicer if we could just use functions. In this section we show how to do this by proving that for each equivalence class of L¹-Cauchy sequences in L¹([a,b];ℝ) there is a uniquely determined equivalence class of functions (with respect to the equivalence relation = a.e.).

If we define the set ℒ of integrable functions to be ℒ = {f | ∃(s_k)_{k=0}^∞ ∈ L¹([a,b];ℝ), s_k → f a.e.}, and if we let ℒ₀ = {f ∈ ℒ | f = 0 a.e.}, then this gives a linear isomorphism of vector spaces Φ: L¹([a,b];ℝ) → ℒ/ℒ₀. And if we define a norm on ℒ/ℒ₀ by ‖f‖_{L¹} = lim_{k→∞}‖s_k‖ whenever s_k → f a.e., then ‖f‖_{L¹} is well defined, and Φ preserves the norm. That is to say, L¹([a,b];ℝ) is isomorphic, as a normed linear space, to the space ℒ/ℒ₀ of equivalence classes of functions that are pointwise limits of some L¹-Cauchy sequence of step functions.

The upshot of all of this is that we may talk about elements of L¹([a,b];ℝ) as integrable functions, rather than as Cauchy sequences of step functions.

9.3.1 L¹-Cauchy Sequences and Integrable Functions

Definition 9.3.1. Fix a compact n-interval [a,b] ⊂ ℝⁿ. We say that a function f: [a,b] → ℝ is Lebesgue integrable or just integrable on [a,b] if there exists an L¹-Cauchy sequence of step functions (s_k)_{k=0}^∞ such that s_k → f a.e. We denote the vector space of integrable functions on [a,b] by ℒ (for this section only).

The results of Section 9.2.2 show that for any L¹-Cauchy sequence (s_k)_{k=0}^∞ of step functions, we can construct a function f such that some subsequence converges almost everywhere to f. We use this to define a map Φ: L¹([a,b];ℝ) → ℒ/ℒ₀ by sending any L¹-Cauchy sequence (s_k)_{k=0}^∞ of step functions to the integrable function f guaranteed to exist by Proposition 9.2.4. But it is not yet clear that this map is well defined. To see this, we must show that the equivalence class f + ℒ₀ is uniquely determined.

Proposition 9.3.2. If (s_k)_{k=0}^∞ and (s'_k)_{k=0}^∞ are L¹-Cauchy sequences of step functions that are equivalent in L¹([a,b];ℝ), and if (s_k)_{k=0}^∞ → f a.e. and (s'_k)_{k=0}^∞ → g a.e., then f = g a.e.
Proof. For each k ∈ ℕ, the difference u_k = s_k − s'_k is a step function, and by the definition of equivalence in L¹([a,b];ℝ), the sequence (u_k)_{k=0}^∞ is an L¹-Cauchy sequence with ‖(u_k)_{k=0}^∞‖_{L¹} → 0. Moreover, (u_k)_{k=0}^∞ converges almost everywhere to h = f − g, so it suffices to show that the set of points t where h(t) ≠ 0 has measure zero.

Let E_m = {t ∈ ℝⁿ | |h(t)| > 1/m}. Since the set where h(x) ≠ 0 is precisely the (countable) union ⋃_{m=1}^∞ E_m, it suffices to show that each E_m has measure zero. Each u_k can be written as $u_k = \sum_{j\in J_k} c_{k,j}\mathbb{1}_{I_{k,j}}$, where the index j runs over a finite set J_k, each I_{k,j} is an interval, and each c_{k,j} is a real number. For each m ∈ ℕ and for each ε > 0 choose u_k such that ‖u_k‖_{L¹} < ε/m. The finite set of intervals $\mathscr{I} = \{I_{k,j} \mid j \in J_k,\ |c_{k,j}| > 1/m\}$ covers E_m, and we have
$$\sum_{\mathscr{I}}\lambda(I_{k,j}) = m\sum_{\mathscr{I}}\frac{1}{m}\lambda(I_{k,j}) \le m\sum_{\mathscr{I}} |c_{k,j}|\,\lambda(I_{k,j}) \le m\|u_k\|_{L^1} < \varepsilon.$$
Therefore, E_m has measure zero by Proposition 9.2.5. ∎

This proposition shows that the function Φ: L¹([a,b];ℝ) → ℒ/ℒ₀ is well defined, because if (s_k)_{k=0}^∞ is any L¹-Cauchy sequence with a subsequence $(s_{k_\ell})_{\ell=1}^\infty$ converging almost everywhere to f (so that we should have Φ((s_k)) = f), and if (s'_k)_{k=0}^∞ is an equivalent L¹-Cauchy sequence with a subsequence $(s'_{j_m})_{m=1}^\infty$ converging almost everywhere to g (so that we should have Φ((s'_k)) = g), then the subsequences are equivalent in L¹([a,b];ℝ), so f = g a.e.

The map Φ is surjective by the definition of ℒ, and it is straightforward to check that Φ is also linear. We now show that Φ is injective. To do this, we must show that if f = g a.e., and if (s_k)_{k=0}^∞ and (s'_k)_{k=0}^∞ are two L¹-Cauchy sequences of step functions such that s_k → f a.e. and s'_k → g a.e., then (s_k)_{k=0}^∞ and (s'_k)_{k=0}^∞ are equivalent in L¹([a,b];ℝ). To do this we first need two lemmata.

Lemma 9.3.3. If (s_k)_{k=0}^∞ is a monotone decreasing sequence of nonnegative step functions such that s_k → 0 a.e., then ‖s_k‖_{L¹} → 0. And similarly, if (s_k)_{k=0}^∞ is a monotone increasing sequence of nonpositive step functions such that s_k → 0 a.e., then ‖s_k‖_{L¹} → 0.

Proof. Let (s_k)_{k=0}^∞ be a monotone decreasing sequence of nonnegative step functions such that s_k → 0 a.e. Let D₀ be the set of all t where (s_k(t))_{k=0}^∞ does not converge to 0, and for each m ∈ ℕ, let D_m be the set of all t such that s_m is discontinuous at t. Each D_i has measure zero, so the countable union D = ⋃_{i=0}^∞ D_i has measure zero.

The step function s₁ is bounded by some positive real number B and has a compact interval [a,b] as its domain. Since the sequence is monotone decreasing, each function s_k is also bounded by B, and we may assume that s_k has the same domain [a,b]. For any ε > 0 choose open intervals {I_k}_{k∈ℕ} such that D ⊂ ⋃_{k∈ℕ} I_k and such that Σ_{k∈ℕ} λ(I_k) < δ = ε/(2B).

For each t ∈ [a,b] ∖ D, there exists an m_t ∈ ℕ such that s_{m_t}(t) < γ = ε/(2λ([a,b])). Since t is not a point of discontinuity of s_{m_t}, there is an open neighborhood U_t of t where s_{m_t}(t') < γ for all t' ∈ U_t. The collection {I_k}_{k∈ℕ} ∪ {U_t | t ∈ [a,b] ∖ D} is an open cover of the compact interval [a,b], so there is a finite subcover I_{i₁}, ..., I_{i_N}, U_{t₁}, ..., U_{t_j} that covers [a,b]. Let M = max(m_{t₁}, ..., m_{t_j}).
Since (s_k)_{k=0}^∞ is monotone decreasing, if s_{m_t}(t') < γ, then s_ℓ(t') < γ for every ℓ ≥ M. Therefore, we have 0 ≤ s_ℓ(t') < γ for all t' in ⋃_{i=1}^{j} U_{t_i}.

Putting this all together gives
$$\|s_\ell\|_{L^1} = \int_{[a,b]} s_\ell < B\sum_{k=1}^{N}\lambda(I_{i_k}) + \gamma\,\lambda([a,b]) = \varepsilon/2 + \varepsilon/2 = \varepsilon$$
for all ℓ > M. This shows that ‖s_k‖_{L¹} → 0.

The result about a monotone increasing sequence (s_k)_{k=0}^∞ of nonpositive step functions follows from the nonnegative decreasing case by changing the signs. ∎

Lemma 9.3.4. Let $(\phi_k)_{k=0}^{\infty}$ and $(\psi_k)_{k=0}^{\infty}$ be monotone increasing sequences of step
functions such that $\phi_k \to f$ a.e. and $\psi_k \to g$ a.e. If $f \le g$ a.e., then
$$\lim_{k\to\infty}\int_{[a,b]} \phi_k \;\le\; \lim_{k\to\infty}\int_{[a,b]} \psi_k.$$

Proof. For each $m \in \mathbb{N}$ the sequence $(\phi_m - \psi_k)_{k=0}^{\infty}$ is monotone decreasing, and
$$\lim_{k\to\infty}(\phi_m - \psi_k) = \phi_m - \lim_{k\to\infty}\psi_k = \phi_m - g \le f - g \le 0 \quad \text{a.e.}$$
Thus, the sequence $((\phi_m - \psi_k)^+)_{k=0}^{\infty}$ is nonnegative and monotone decreasing and
must converge to zero almost everywhere.

By Lemma 9.3.3 we have
$$\lim_{k\to\infty}\int_{[a,b]} (\phi_m - \psi_k)^+ = 0.$$
Moreover, we have $(\phi_m - \psi_k) \le (\phi_m - \psi_k)^+$, so
$$\int_{[a,b]} \phi_m - \int_{[a,b]} \psi_k \le \int_{[a,b]} (\phi_m - \psi_k)^+,$$
which gives
$$\int_{[a,b]} \phi_m \le \lim_{k\to\infty}\left(\int_{[a,b]}(\phi_m - \psi_k)^+ + \int_{[a,b]}\psi_k\right) = \lim_{k\to\infty}\int_{[a,b]}\psi_k.$$
Now letting $m \to \infty$ gives the desired result. $\Box$

Theorem 9.3.5. Let $(s_k)_{k=0}^{\infty}$ and $(s_k')_{k=0}^{\infty}$ be $L^1$-Cauchy sequences of step functions
(not necessarily monotone) such that $s_k \to f$ a.e. and $s_k' \to g$ a.e. If $f \le g$ a.e.,
then
$$\lim_{k\to\infty}\int_{[a,b]} s_k \;\le\; \lim_{k\to\infty}\int_{[a,b]} s_k'.$$

Proof. By Proposition 9.2.4 we may choose monotone increasing $L^1$-Cauchy
sequences $(\phi_k)_{k=0}^{\infty}$, $(\psi_k)_{k=0}^{\infty}$, $(\alpha_k)_{k=0}^{\infty}$, and $(\beta_k)_{k=0}^{\infty}$ such that $(\phi_k - \psi_k)_{k=0}^{\infty}$ is a
subsequence of $(s_k)_{k=0}^{\infty}$ and $(\alpha_k - \beta_k)_{k=0}^{\infty}$ is a subsequence of $(s_k')_{k=0}^{\infty}$.

Any subsequence of an $L^1$-Cauchy sequence is equivalent in $L^1([a,b];\mathbb{R})$ to the
original sequence, so $(\phi_k - \psi_k) \to f$ a.e. and $(\alpha_k - \beta_k) \to g$ a.e. Since each sequence
$(\phi_k)_{k=0}^{\infty}$, $(\psi_k)_{k=0}^{\infty}$, $(\alpha_k)_{k=0}^{\infty}$, and $(\beta_k)_{k=0}^{\infty}$ is $L^1$-Cauchy, there exist functions $\phi$, $\psi$,
$\alpha$, and $\beta$ such that each sequence converges to the corresponding function almost
everywhere. We have $\phi - \psi = f \le g = \alpha - \beta$ a.e., so
$$\phi + \beta \le \alpha + \psi \quad \text{a.e.}$$
The sequences $(\phi_k + \beta_k)_{k=0}^{\infty}$ and $(\alpha_k + \psi_k)_{k=0}^{\infty}$ are monotone increasing, so by
Lemma 9.3.4 we have
$$\lim_{k\to\infty}\int_{[a,b]}(\phi_k + \beta_k) \le \lim_{k\to\infty}\int_{[a,b]}(\alpha_k + \psi_k).$$
Since integration on step functions is linear, and addition (and subtraction) are
continuous, we have
$$\lim_{k\to\infty}\int_{[a,b]}(\phi_k - \psi_k) \le \lim_{k\to\infty}\int_{[a,b]}(\alpha_k - \beta_k).$$
By the uniqueness in the continuous linear extension theorem, limits of integrals of
$L^1$-Cauchy sequences are independent of the representative of the equivalence class,
so we have
$$\lim_{k\to\infty}\int_{[a,b]} s_k = \lim_{k\to\infty}\int_{[a,b]}(\phi_k - \psi_k) \le \lim_{k\to\infty}\int_{[a,b]}(\alpha_k - \beta_k) = \lim_{k\to\infty}\int_{[a,b]} s_k'. \qquad \Box$$

Proposition 9.3.6. If $f = g$ a.e., and if $(s_k)_{k=0}^{\infty}$ and $(s_k')_{k=0}^{\infty}$ are two $L^1$-Cauchy
sequences of step functions such that $s_k \to f$ a.e. and $s_k' \to g$ a.e., then $(s_k)_{k=0}^{\infty}$
and $(s_k')_{k=0}^{\infty}$ are equivalent in $L^1([a,b];\mathbb{R})$.

Proof. Proposition 9.1.7 shows that the sequences $(|s_k|)_{k=0}^{\infty}$ and $(|s_k'|)_{k=0}^{\infty}$ are $L^1$-
Cauchy, with $|s_k| \to |f|$ a.e. and $|s_k'| \to |g|$ a.e. By Theorem 9.3.5 we have
$$\lim_{k\to\infty}\|s_k\|_{L^1} = \lim_{k\to\infty}\int_{[a,b]}|s_k| = \lim_{k\to\infty}\int_{[a,b]}|s_k'| = \lim_{k\to\infty}\|s_k'\|_{L^1}. \tag{9.2}$$
Substituting $(s_k - s_k')$ for $s_k$ and $0$ for $s_k'$ in (9.2) gives the desired result. $\Box$

We have shown that $\Phi$ is an isomorphism of vector spaces. We may use $\Phi$ to
define the $L^1$-norm on $\mathscr{L}/\mathscr{L}_0$ by $\|f\|_{L^1} = \|(s_k)_{k=0}^{\infty}\|_{L^1} = \lim_{k\to\infty}\|s_k\|_{L^1}$ whenever
$f = \Phi((s_k)_{k=0}^{\infty})$. Similarly, we may define the integral $\int_{[a,b]} f = \lim_{k\to\infty}\int_{[a,b]} s_k$
whenever $f = \Phi((s_k)_{k=0}^{\infty})$.

From now on we usually describe elements of $L^1([a,b];\mathbb{R})$ as integrable functions
(or equivalence classes of integrable functions) rather than as equivalence
classes of $L^1$-Cauchy sequences of step functions; we have shown that the two
formulations are equivalent, but functions are usually more natural to work with.

9.3.2 Basic Integral Properties


In this section we prove several properties of integrals and integrable functions. We
first prove Proposition 8.4.2, which we restate here for the reader's convenience.

Proposition 8.4.2. For any $f, g \in L^1([a,b];\mathbb{R})$ we have the following:
(i) If $f \le g$ a.e., then $\int_{[a,b]} f \le \int_{[a,b]} g$.

(ii) $\bigl|\int_{[a,b]} f\bigr| \le \int_{[a,b]} |f| = \|f\|_{L^1}$.

(iii) The functions $\max(f,g)$, $\min(f,g)$, $f^+$, $f^-$, and $|f|$ are all integrable.
(iv) If $h: \mathbb{R}^n \to \mathbb{R}$ is a measurable function (see Definition 8.3.9), and if $|h| \in
L^1([a,b];\mathbb{R})$, then $h \in L^1([a,b];\mathbb{R})$.

(v) If $\|g\|_{L^\infty} \le M < \infty$, then $fg \in L^1([a,b];\mathbb{R})$ and $\|fg\|_{L^1} \le M\|f\|_{L^1}$.

Proof. Property (i) is an immediate consequence of Theorem 9.3.5. Property (ii)
follows from (i), since $f \le |f|$ and $-f \le |f|$. Property (iii) is Exercise 9.12.

For (iv), since $h$ is measurable, there is a sequence $(s_k)_{k=0}^{\infty}$ of step functions
with $s_k \to h$ a.e. For each $k \in \mathbb{N}$ let
$$\phi_k = \operatorname{mid}(-|h|, s_k, |h|) = \max(-|h|, \min(s_k, |h|)) =
\begin{cases} |h| & \text{if } s_k \ge |h|,\\ s_k & \text{if } -|h| \le s_k \le |h|,\\ -|h| & \text{if } s_k \le -|h|.\end{cases}$$
By Proposition 8.4.2(iii), each $\phi_k$ is integrable. Since $s_k \to h$ a.e., we also have
$\phi_k \to h$ a.e. Since $|\phi_k| \le |h|$ for all $k$, the dominated convergence theorem guarantees
that $h \in L^1([a,b];\mathbb{R})$.

For (v) choose sequences of step functions $(s_k)_{k=0}^{\infty}$ and $(s_k')_{k=0}^{\infty}$ such that $s_k \to
f$ a.e. and $s_k' \to g$ a.e. The sequence $(s_k s_k')_{k=0}^{\infty}$ converges to $fg$ almost everywhere.
We have
$$\lim_{k\to\infty}\int_{[a,b]} |s_k s_k'| \le \lim_{k\to\infty}\int_{[a,b]} |s_k|\,M = M\int_{[a,b]} |f| = M\|f\|_{L^1} < \infty,$$
so by Fatou's lemma we have $|fg| \in L^1([a,b];\mathbb{R})$, and by (iv) we also have $fg \in
L^1([a,b];\mathbb{R})$. $\Box$
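The truncation $\operatorname{mid}(-|h|, s, |h|)$ used in the proof of (iv) is easy to evaluate in practice. The following sketch is only an illustration (it is not part of the text, and the sample arrays are arbitrary); it computes the truncation pointwise with NumPy and exhibits the three cases of the displayed formula.

    import numpy as np

    # Pointwise truncation mid(-|h|, s, |h|) = max(-|h|, min(s, |h|)).
    def mid(h, s):
        bound = np.abs(h)
        return np.maximum(-bound, np.minimum(s, bound))

    s = np.array([ 3.0, 0.2, -5.0])
    h = np.array([ 1.0, 1.0,  2.0])
    print(mid(h, s))   # [ 1.   0.2 -2. ]  -- clipped above, unchanged, clipped below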

Proposition 9.3.7. Let $A \subset \mathbb{R}^n$ be a measurable set. For any $c \in \mathbb{R}^n$ and any
$f \in L^1(A;\mathbb{R})$, let $f_c$ be the function on $A - c = \{t \in \mathbb{R}^n \mid t + c \in A\}$ given by
$f_c(t) = f(t+c)$. We have $f_c \in L^1(A-c;\mathbb{R})$ and
$$\int_{A-c} f_c = \int_A f.$$

Proof. The proof is Exercise 9.15. $\Box$

9.4 Proof of Fubini's Theorem

In this section we prove Fubini's theorem, which we restate here for the reader's
convenience.

Theorem 8.6.1. Assume that $f: X \times Y \to \mathbb{R}$ is integrable on $X \times Y \subset \mathbb{R}^{m+n}$.
For each $x \in X$ consider the function $f_x: Y \to \mathbb{R}$ given by $f_x(y) = f(x,y)$. We
have the following:

(i) For almost all $x \in X$ the function $f_x$ is integrable on $Y$.

(ii) The function $F: X \to \mathbb{R}$ given by
$$F(x) = \begin{cases} \displaystyle\int_Y f_x(y)\,dy & \text{if } f_x \text{ is integrable},\\ 0 & \text{otherwise}\end{cases}$$
is integrable on $X$.

(iii) The integral of $f$ may be computed as
$$\int_{X\times Y} f(x,y)\,dx\,dy = \int_X F(x)\,dx.$$

The first step in the proof of Fubini's theorem is to check that it holds for step
functions.

Proposition 9.4.1 (Fubini's Theorem for Step Functions). If $s: [a,b] \times
[c,d] \to \mathbb{R}$ is a step function, then
(i) for every $x \in [a,b]$ the function $s_x: [c,d] \to \mathbb{R}$ is a step function;
(ii) the function $S: [a,b] \to \mathbb{R}$ given by $S(x) = \int_{[c,d]} s_x(y)\,dy$ is a step function;
(iii) the integral of $s$ can be computed as
$$\int_{[a,b]\times[c,d]} s(x,y)\,dx\,dy = \int_{[a,b]} S(x)\,dx = \int_{[a,b]}\left(\int_{[c,d]} s_x(y)\,dy\right)dx.$$

Proof. The proof is Exercise 9.18. $\Box$

Lemma 9.4.2. Let $X = [a,b] \subset \mathbb{R}^n$ and $Y = [c,d] \subset \mathbb{R}^m$. If $E \subset X \times Y$ has
measure zero, then for almost all $x \in X$ the set $E_x = \{y \in Y \mid (x,y) \in E\}$
has measure zero in $Y$.

Proof. Using Proposition 9.2.5, choose a monotone increasing $L^1$-Cauchy sequence
$(\phi_k)_{k=0}^{\infty}$ of step functions on $X \times Y$ such that $\phi_k(x,y) \to \infty$ for all $(x,y) \in E$. For
each $k \in \mathbb{N}$ and each $x \in X$, let $\Phi_k(x) = \int_Y \phi_{k,x}(y)\,dy$, where $\phi_{k,x}(y) = \phi_k(x,y)$.
Because of Fubini's theorem for step functions (Proposition 9.4.1) we have
$$\int_X \Phi_k(x)\,dx = \int_X\left(\int_Y \phi_{k,x}(y)\,dy\right)dx = \int_{X\times Y}\phi_k(x,y)\,dx\,dy.$$
Therefore, the sequence $\left(\int_X \Phi_k(x)\,dx\right)_{k=0}^{\infty}$ is bounded, and by the monotone convergence
theorem (Theorem 8.4.5) we have $\Phi_k \to \Phi$ a.e. for some $\Phi \in L^1(X;\mathbb{R})$.

Let $x \in X$ be such that $\Phi_k(x) \to \Phi(x)$, so the sequence $\left(\int_Y \phi_{k,x}(y)\,dy\right)_{k=0}^{\infty}$ is
bounded. Again by the monotone convergence theorem, $\phi_{k,x}$ converges for almost
all $y \in Y$. On the other hand, if $y \in E_x$, then $\phi_{k,x}$ diverges to infinity, so the
monotone increasing sequence of step functions $\phi_{k,x}: Y \to \mathbb{R}$ shows that $E_x$ has
measure zero in $Y$. $\Box$

Proof of Fubini's Theorem. Let $(s_k)_{k=0}^{\infty}$ be an $L^1$-Cauchy sequence of step
functions on $X \times Y$ that converges almost everywhere to $f$. By Proposition 9.2.4
we may assume that each $s_k$ is a difference $s_k = \phi_k - \psi_k$, where $(\phi_k)_{k=0}^{\infty}$ and $(\psi_k)_{k=0}^{\infty}$
are monotone increasing $L^1$-Cauchy sequences on $X \times Y$. Since integration is linear,
it suffices to prove the theorem just in the case that $\lim_{k\to\infty}\phi_k = f$ a.e.

For each $x \in X$ we have a sequence $(\phi_{k,x})_{k=0}^{\infty}$ of step functions on $Y$. For
each $k \in \mathbb{N}$ define $\Phi_k: X \to \mathbb{R}$ by $\Phi_k(x) = \int_Y \phi_{k,x}(y)\,dy$, so that $(\Phi_k)_{k=0}^{\infty}$ is a
sequence of step functions on $X$. Because of Fubini's theorem for step functions
(Proposition 9.4.1) we have
$$\lim_{k\to\infty}\int_X \Phi_k(x)\,dx = \lim_{k\to\infty}\int_X\left(\int_Y \phi_{k,x}(y)\,dy\right)dx
= \lim_{k\to\infty}\int_{X\times Y}\phi_k(x,y)\,dx\,dy = \int_{X\times Y} f(x,y)\,dx\,dy.$$
Therefore, the sequence $\left(\int_X \Phi_k(x)\,dx\right)_{k=0}^{\infty}$ is bounded, and by the monotone convergence
theorem (Theorem 8.4.5) we have $\Phi_k \to \Phi$ a.e. for some $\Phi \in L^1(X;\mathbb{R})$
with
$$\int_X \Phi(x)\,dx = \int_{X\times Y} f(x,y)\,dx\,dy.$$

We must now show that $\Phi = F$ a.e. and that $f_x$ is integrable for almost all
$x \in X$. Let $E$ be the measure-zero subset of $X \times Y$ where $(\phi_k(x,y))_{k=0}^{\infty}$ fails to
converge. By Lemma 9.4.2 the set $E_x = \{y \in Y \mid (x,y) \in E\}$ has measure zero
for almost all $x \in X$. For any $x \in X$ such that $E_x$ has measure zero and such
that $\Phi_k(x) \to \Phi(x)$ we have that $\Phi_k(x) = \int_Y \phi_{k,x}(y)\,dy$ converges, and so by the
monotone convergence theorem $f_x = \lim_{k\to\infty}\phi_{k,x}$ a.e. is integrable on $Y$, and
$$\Phi(x) = \lim_{k\to\infty}\int_Y \phi_{k,x}(y)\,dy = \int_Y f_x(y)\,dy = F(x),$$
as required. $\Box$

9.5 Proof of the Change of Variables Theorem

In this section we prove the change of variables theorem. Before we begin the proof,
we need to set up a little more notation and develop a few more results about
integrable and measurable functions.

Definition 9.5.1. For each $k \in \mathbb{N}$ let $Q_k \subset \mathbb{R}^n$ be the set of points $a \in \mathbb{R}^n$ whose
coordinates $a_i$ are all rational of the form $a_i = c/2^k$ for $c \in \mathbb{Z}$. Let $\mathcal{C}_k$ be the set of
compact intervals ($n$-cubes) of the form $[a, a + (2^{-k}, \ldots, 2^{-k})]$, where $a \in Q_k$.

Proposition 9.5.2. Countable intersections and unions of measurable sets are
measurable. If $X, Y$ are measurable, then $X \smallsetminus Y$ is measurable, and every open set
(and every closed set) in $\mathbb{R}^n$ is measurable.

Proof. If $\{A_k\}_{k\in\mathbb{N}}$ is a countable collection of measurable sets, then setting $f_n =
\prod_{k=1}^{n}\mathbb{1}_{A_k}$ defines a sequence of measurable functions (by Exercise 9.14(ii)). Moreover,
$f_n \to \mathbb{1}_{\bigcap_{k\in\mathbb{N}} A_k}$ pointwise, so by Exercise 9.14(iii) $\bigcap_{k\in\mathbb{N}} A_k$ is measurable. If
$X$ and $Y$ are measurable, then $(\mathbb{1}_X - \mathbb{1}_Y)^+ = \mathbb{1}_{X\smallsetminus Y}$ is measurable. Combining
this with the previous result shows that countable unions of measurable sets
are measurable.

The empty set is measurable because its indicator function is $0$. To see that
$\mathbb{R}^n$ is measurable, let $f_k$ be the step function $f_k = \mathbb{1}_{[(-k,\ldots,-k),(k,\ldots,k)]}$. The sequence
$(f_k)_{k=0}^{\infty}$ converges pointwise to $1 = \mathbb{1}_{\mathbb{R}^n}$, so $\mathbb{R}^n$ is measurable.

Let $U$ be an open set. For each $k \in \mathbb{N}$ let $C_k$ be the union of all cubes in
$\mathcal{C}_k$ that lie entirely in $U$. Since $C_k$ is a countable union of intervals (which are
measurable), it is measurable. It is clear that for every $k \in \mathbb{N}$ we have $C_k \subset C_{k+1}$.
For every $t \in U$ there is a $\delta > 0$ such that $B(t,\delta) \subset U$. For $k$ large enough, we have
$2^{-k} < \delta/\sqrt{n}$, which implies that any cube of $\mathcal{C}_k$ containing $t$ lies entirely inside
$B(t,\delta) \subset U$, and hence every $t \in U$ lies in some $C_k$. Therefore, $U = \bigcup_{k\in\mathbb{N}} C_k$ is
measurable. Since every closed set is the complement of an open set, every closed
set is also measurable. $\Box$

Corollary 9 .5.3. Every interval I C lRn, whether open, partially open, or closed,
is measurable. If A, B C lRn are open with I C B, and if f: A -7 B is continuous,
then f - 1 (I) is also measurable.

Proof. The proof is Exercise 9.21. D

We are now ready to start the proof of the change of variables formula
(Theorem 8.7.5), which we restate here for the reader's convenience.

Theorem 8.7.5. Let $U$ and $V$ be open subsets of $\mathbb{R}^n$, let $X \subset U$ be a measurable
subset, and let $\Psi: U \to V$ be a diffeomorphism. The set $Y = \Psi(X)$ is measurable,
and if $f: Y \to \mathbb{R}$ is integrable, then $(f\circ\Psi)\,|\det(D\Psi)|$ is integrable on $X$ and
$$\int_Y f = \int_X (f\circ\Psi)\,|\det(D\Psi)|. \tag{9.3}$$
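Before diving into the proof, it may help to see (9.3) in action numerically. The following sketch is only an illustration (the test function, the polar-coordinate map, and the quadrature routine are our choices, not part of the text): it checks (9.3) for the diffeomorphism $\Psi(r,\theta) = (r\cos\theta, r\sin\theta)$, for which $|\det(D\Psi)| = r$, so the right-hand side over $X = (0,1)\times(0,2\pi)$ should equal $\int_Y e^{-x^2-y^2}$ over the unit disk, namely $\pi(1 - e^{-1})$.

    import numpy as np
    from scipy.integrate import dblquad

    # Right-hand side of (9.3) for Psi(r, theta) = (r cos(theta), r sin(theta)),
    # |det(D Psi)| = r, with f(x, y) = exp(-x^2 - y^2) and X = (0,1) x (0,2*pi).
    f = lambda x, y: np.exp(-x**2 - y**2)
    rhs, _ = dblquad(lambda r, th: f(r * np.cos(th), r * np.sin(th)) * r,
                     0.0, 2 * np.pi, 0.0, 1.0)

    print(rhs, np.pi * (1 - np.exp(-1)))   # both are about 1.9859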

The fact that $Y = \Psi(X)$ is measurable follows from Corollary 9.5.3. We
prove the formula first for the case of a compact interval, and then we extend it to
unbounded $X$.

Lemma 9.5.4. Given the hypothesis of Theorem 8.7.5, if $n = 1$ and $X = [a,b] \subset
U \subset \mathbb{R}$ is a compact interval, then the change of variables theorem holds for all
$f \in L^1(Y;\mathbb{R})$, where $Y = \Psi(X)$.

Proof. If $f \in \mathscr{R}(Y;\mathbb{R})$ (that is, if $f$ is regulated-integrable), then this lemma is
just the usual substitution formula (8.11) (or (6.19)). In particular, the formula
holds for all step functions.

Since $X$ is a compact interval and $\Psi$ is continuous, the set $Y = \Psi(X)$ is
also a compact interval. If $f \in L^1(Y;\mathbb{R})$, then Proposition 9.2.4 implies there are
monotone increasing $L^1$-Cauchy sequences $(s_k)_{k=0}^{\infty}$ and $(u_k)_{k=0}^{\infty}$ of step functions on
$Y$ such that $s_k - u_k \to f$ a.e., and there are functions $g, h \in L^1(Y;\mathbb{R})$ with $f = g - h$
such that $s_k \to g$ a.e. and $u_k \to h$ a.e.

By the monotone convergence theorem (Theorem 8.4.5) we have
$$\int_Y g = \lim_{k\to\infty}\int_Y s_k = \lim_{k\to\infty}\int_X (s_k\circ\Psi)\,|\det(D\Psi)| = \int_X (g\circ\Psi)\,|\det(D\Psi)|,$$
and similarly for $h$. This gives
$$\int_Y f = \int_Y g - \int_Y h = \int_X (g\circ\Psi)\,|\det(D\Psi)| - \int_X (h\circ\Psi)\,|\det(D\Psi)| = \int_X (f\circ\Psi)\,|\det(D\Psi)|,$$
as required. $\Box$

Lemma 9.5.5. Given the hypothesis of Theorem 8.7.5, if $X = [a,b] \subset U \subset \mathbb{R}^n$
is a compact interval and $\Psi: U \to V$ is of the form $\Psi(t) = \Psi(t_1, \ldots, t_n) =
(t_i, \Psi_2(t), \ldots, \Psi_n(t))$ for some $i$, or of the form $\Psi(t) = (\Psi_1(t), \ldots, \Psi_{n-1}(t), t_i)$, then
the change of variables formula (9.3) with respect to $\Psi$ holds for all $f \in L^1(Y;\mathbb{R})$.

Proof. Here we prove the first case. The second is similar. We proceed by induction
on $n$. The base case of $n = 1$ follows from Lemma 9.5.4.
Given
$$X = [a,b] = [a_1,b_1] \times \cdots \times [a_i,b_i] \times \cdots \times [a_n,b_n],$$
let $S = [a_i,b_i] \subset \mathbb{R}$ and
$$W = [a_1,b_1] \times \cdots \times [a_{i-1},b_{i-1}] \times [a_{i+1},b_{i+1}] \times \cdots \times [a_n,b_n] \subset \mathbb{R}^{n-1}.$$
For each $t \in S$ let $\iota_t: W \to X$ be the injective map
$$\iota_t(w_1, \ldots, w_{n-1}) = (w_1, \ldots, w_{i-1}, t, w_i, \ldots, w_{n-1}),$$
and let $U_t$ be the open set
$$U_t = \{w \in \mathbb{R}^{n-1} \mid \iota_t(w) \in U\}.$$
Also, let $\Psi_t: U_t \to \mathbb{R}^{n-1}$ be the map
$$\Psi_t(w) = (\Psi_2(\iota_t(w)), \ldots, \Psi_n(\iota_t(w))),$$
and $V_t = \Psi_t(U_t)$. Since $\Psi$ is a diffeomorphism, it is straightforward to verify that
$\Psi_t: U_t \to V_t$ is bijective. The first row of $D\Psi(\iota_t(w))$ is zero in all but the $i$th position,
and the cofactor expansion of the determinant along this row (Theorem 2.9.16)
shows that
$$|\det D\Psi_t(w)| = |\det(D\Psi(\iota_t(w)))| \neq 0. \tag{9.4}$$
Therefore, $\Psi_t$ is a diffeomorphism.
Using (9.4) with the Fubini theorem (Theorem 8.6.1), we have
$$\begin{aligned}
\int_X (f\circ\Psi)\,|\det(D\Psi)|
&= \int_{S\times W} (f\circ\Psi)(\iota_t(w))\,|\det(D\Psi)(\iota_t(w))|\,dw\,dt\\
&= \int_S\left(\int_W (f\circ\Psi)(\iota_t(w))\,|\det(D\Psi)(\iota_t(w))|\,dw\right)dt\\
&= \int_S\left(\int_W f(t, \Psi_t(w))\,|\det(D\Psi_t)(w)|\,dw\right)dt\\
&= \int_S\left(\int_{\Psi_t(W)} f(t,z)\,dz\right)dt,
\end{aligned} \tag{9.5}$$
where the last equality follows from the induction hypothesis. Since $X$ is compact,
there exists a compact interval $Z \subset \mathbb{R}^{n-1}$ such that $\Psi(X) \subset S \times Z$. Thus, we have
$$\begin{aligned}
\int_S\left(\int_{\Psi_t(W)} f(t,z)\,dz\right)dt
&= \int_S\left(\int_Z \mathbb{1}_{\Psi_t(W)}(z)\,f(t,z)\,dz\right)dt\\
&= \int_S\left(\int_Z \mathbb{1}_{\Psi(X)}(t,z)\,f(t,z)\,dz\right)dt\\
&= \int_Y f.
\end{aligned} \tag{9.6}$$
Combining (9.5) and (9.6) gives the desired formula. $\Box$

This leads to the following corollary.

Corollary 9.5.6. Given the hypothesis of Theorem 8.7.5, if $X \subset [a,b]$ is a measurable
subset of a compact interval and $\Psi$ has the form given in Lemma 9.5.5, then
the change of variables formula (9.3) holds for all $f \in L^1(Y;\mathbb{R})$.

Proof. See Exercise 9.22. $\Box$

Lemma 9.5.7. Assume that $U, V, W \subset \mathbb{R}^n$ are open and $\Psi: U \to V$ and $\Phi: V \to
W$ are diffeomorphisms. Let $X \subset U$ be measurable, with $Y = \Psi(X)$ and $Z = \Phi(Y)$.
If the change of variables formula with respect to $\Psi$ holds for all $g \in L^1(Y;\mathbb{R})$ and
the change of variables formula with respect to $\Phi$ holds for some $f \in L^1(Z;\mathbb{R})$, then
the formula holds with respect to $\Phi\circ\Psi$ for $f$.

Proof. The proof is Exercise 9.23. $\Box$

Lemma 9.5.8. If $\Psi: U \to V$ is a diffeomorphism and $X \subset U$ is a measurable
subset with $Y = \Psi(X)$, then for any $x \in X$ there exists an open neighborhood
$W \subset U$ of $x$ such that the statement of the change of variables theorem holds for
every integrable function on $\Psi(X \cap W) = Y \cap \Psi(W)$.

Proof. First we show that there is an open neighborhood $W$ of $x$ where the diffeomorphism
$\Psi$ is a composition of diffeomorphisms of the form $\Psi(t) = \Psi(t_1, \ldots, t_n) =
(t_i, \Psi_2(t), \ldots, \Psi_n(t))$ for some $i$, or of the form $\Psi(t) = (\Psi_1(t), \ldots, \Psi_{n-1}(t), t_i)$. For
each $i$ let $C_{1,i}$ be the $(1,i)$ cofactor of $D\Psi(x)$, as given in Definition 2.9.11. Using
the cofactor expansion in the first row (Theorem 2.9.16), we have
$$0 \neq \det(D\Psi) = \sum_{j=1}^{n} \frac{\partial\Psi_1}{\partial x_j}(x)\,C_{1,j},$$
so there must be some $i$ such that $C_{1,i} \neq 0$. Choose one such $i$ and define $\psi: U \to \mathbb{R}^n$
by $\psi(t) = (t_i, \Psi_2(t), \ldots, \Psi_n(t))$. Note that $|\det(D\psi(x))| = |C_{1,i}| \neq 0$, so by the inverse
function theorem (Theorem 7.4.8), there exists an open neighborhood $W' \subset U$ of $x$
such that $\psi$ has a $C^1$ inverse on $\psi(W') \subset V$. In particular, $\psi$ is a diffeomorphism
on $W'$ of the required form.
Let $\Phi: \psi(W') \to V$ be the diffeomorphism $\Phi = \Psi\circ\psi^{-1}$, so that $\Psi = \Phi\circ\psi$.
Letting $z = (z_1, \ldots, z_n) = \psi(t_1, \ldots, t_n) = (t_i, \Psi_2(t), \ldots, \Psi_n(t))$, we have
$$\Phi(z) = \Psi(\psi^{-1}(z)) = (\Psi_1(\psi^{-1}(z)), z_2, \ldots, z_n),$$
so $\Phi$ is also of the required form.

Since $W'$ is an open neighborhood of $x$, there is a compact interval $[a,b] \subset W'$
with nonempty interior $W \subset [a,b]$ such that $x \in W$. By Corollary 9.5.6 and
Lemma 9.5.7, the change of variables formula holds on $X \cap W$. $\Box$

Now we can prove the full version of the change of variables formula.

Proof of Theorem 8.7.5. It suffices to prove the theorem for the case of $X = U$
and $f \in L^1(V;\mathbb{R})$. For each $\ell \in \mathbb{Z}^+$ let
$$K_\ell = \{t \in U \mid \operatorname{dist}(t, U^c) \ge 1/\ell \text{ and } |t| \le \ell\}.$$
Each $K_\ell$ is a compact subset of $U$, and $K_\ell \subset K_{\ell+1}$. Moreover, we have $\bigcup_{\ell=1}^{\infty} K_\ell = U$.
For each $\ell$ and for each $t \in K_\ell$, Lemma 9.5.8 gives an open neighborhood $W_t$ of $t$
such that the change of variables theorem holds for all functions in $L^1(\Psi(W_t);\mathbb{R})$.

Since $K_\ell$ is compact, there is a finite subcollection of the open sets $W_t$ that
covers $K_\ell$. Thus, there is a countable collection $W_1, W_2, \ldots$ of open subsets of
$U$ such that the change of variables theorem holds for $f$ on each $\Psi(W_i)$, and
$U = \bigcup_{i=1}^{\infty} W_i$. For each $i$ let $W_i' = W_i \smallsetminus \bigcup_{j=1}^{i-1} W_j$. All the $W_i'$ are disjoint
and $U = \bigcup_{i=1}^{\infty} W_i'$. Also, each $W_i'$ is a measurable subset of $W_i$.

If $f$ is nonnegative a.e., then because countable sums commute with integration
(Exercise 8.16) we have
$$\begin{aligned}
\int_U f\circ\Psi\,|\det(D\Psi)|
&= \sum_{i=1}^{\infty}\int_U \mathbb{1}_{W_i'}\,f\circ\Psi\,|\det(D\Psi)|
= \sum_{i=1}^{\infty}\int_{W_i} \mathbb{1}_{W_i'}\,f\circ\Psi\,|\det(D\Psi)|\\
&= \sum_{i=1}^{\infty}\int_{W_i} \bigl(\mathbb{1}_{\Psi(W_i')}\,f\bigr)\circ\Psi\,|\det(D\Psi)|
= \sum_{i=1}^{\infty}\int_{\Psi(W_i)} \mathbb{1}_{\Psi(W_i')}\,f\\
&= \sum_{i=1}^{\infty}\int_{\Psi(W_i')} f = \int_{\Psi(U)} f.
\end{aligned}$$
So the formula holds for nonnegative integrable functions. Writing $f = f^+ - f^-$
as a difference of nonnegative integrable functions shows that the formula holds for
any $f$. $\Box$

Exercises
Note to the student: Each section of this chapter has several corresponding
exercises, all collected here at the end of the chapter. The exercises between the
first and second line are for Section 1, the exercises between the second and third
lines are for Section 2, and so forth .
You should work every exercise (your instructor may choose to let you skip
some of the advanced exercises marked with *) . We have carefully selected them,
and each is important for your ability to understand subsequent material. Many of
the examples and results proved in the exercises are used again later in the text.
Exercises marked with & are especially important and are likely to be used later
in this book and beyond. Those marked with t are harder than average, but should
still be done.
Although they are gathered together at the end of the chapter, we strongly
recommend you do the exercises for each section as soon as you have completed
the section, rather than saving them until you have finished the entire chapter. The
exercises for each section are separated from those for other sections by a
horizontal line.

9.1. Prove Proposition 9.1.1.

9.2. Prove the claim in the proof of Theorem 9.1.2 that the function $\|\cdot\|_{X'}$ defined
by $\|(x_k)_{k=0}^{\infty}\|_{X'} = \lim_{k\to\infty}\|x_k\|$ is a seminorm. Provide an example of a
sequence $(x_k)_{k=0}^{\infty}$ such that $\|(x_k)_{k=0}^{\infty}\| = 0$ but $(x_k)_{k=0}^{\infty} \neq (0)_{k=0}^{\infty}$ (the zero
element of $X'$).
9.3. Prove Proposition 9.1.7.
9.4. Show that every element $f \in L^1([a,b];\mathbb{C})$ can be written as $f = u + iv$,
where $u, v \in L^1([a,b];\mathbb{R})$, and conversely, for any $u, v \in L^1([a,b];\mathbb{R})$, we
have $u + iv \in L^1([a,b];\mathbb{C})$.

9.5. For each $\ell \in \mathbb{N}$ let $Q_\ell \subset \mathbb{R}^n$ be the lattice of points with coordinates all
lying in $2^{-\ell}\mathbb{Z}$. Show that every element of $L^1([a,b];X)$ is equivalent to an
$L^1$-Cauchy sequence $(s_k)_{k=0}^{\infty}$ of step functions where all the corners of all the
intervals involved in the step function are points of $Q_\ell$ for some $\ell$; that is, for
each $k \in \mathbb{N}$ there exists an $\ell_k \in \mathbb{N}$ such that each interval $R_I$ appearing in
the sum $s_k = \sum_{I\in\mathscr{I}} x_I\,\mathbb{1}_{R_I}$ is of the form $R_I = [a_I, b_I]$ with $a_I, b_I \in Q_{\ell_k}$.

9.6. Define a generalized step function to be a function $f$ of the form $f = \sum_i c_i\,\mathbb{1}_{R_i}$
such that each $R_i$ is any interval (possibly all open, all closed, all partially
open, or any combination of these).
Prove that in Definition 9.2.1 we may use generalized step functions with all
open intervals. That is, prove that a set $E$ has measure zero in the sense
of Definition 9.2.1 if and only if there is an $L^1$-Cauchy sequence $(f_k)_{k=0}^{\infty}$ of
generalized step functions of the form $f_k = \sum_i c_{k,i}\,\mathbb{1}_{R_{k,i}}$ such that each
$R_{k,i}$ is an open interval and $|f_k(t)| \to \infty$ for every $t \in E$. What changes if
we use generalized step functions with all closed intervals?
9.7. Prove Lemma 9.2.2 for a monotone decreasing sequence of functions.
9.8. Prove Proposition 9.2.6.
9.9. Prove that if $I \subset \mathbb{R}^n$ is an interval and $f \in \mathscr{R}(I;\mathbb{R})$, then the graph $\Gamma_f =
\{(x, f(x))\} \subset I \times \mathbb{R} \subset \mathbb{R}^{n+1}$ has measure zero.

9.10. Find a sequence $(f_k)_{k=0}^{\infty}$ of functions in $\mathscr{R}([0,1];\mathbb{R})$ such that $f_k \to 0$ a.e. but
$\lim_{k\to\infty}\int_0^1 f_k(t)\,dt \neq 0$.
9.11. Let $f$ be continuous on $[a,b] \subset \mathbb{R}$. Describe in detail how to construct a
monotone increasing sequence $(s_k)_{k=0}^{\infty}$ of step functions such that $s_k \to f$ a.e.
and $\lim_{k\to\infty}\int_{[a,b]} s_k < \infty$.
9.12. Prove Proposition 8.4.2(iii).
9.13. Give an example of a function on $\mathbb{R}$ that is measurable but is not integrable.
9.14. Let f and g be measurable. Prove the following:

(i) The set of measurable functions is a vector space.

(ii) The product f g is measurable, and a product of any finite number of


measurable functions is measurable.

(iii) If (fk)k=O is a sequence of measurable functions that converges to f


almost everywhere, then f is measurable.

9.15. Prove Proposition 9.3.7.


9.16. Let f E L 1 ([a, b]; R) . Show that for every function g E f!/t([a, b]; R) the
product Jg lies in L 1 ([a, b];R).
9.17. Prove that if f is any measurable function and g E L 1 ([a, b];R) is such that
lf l :S:: g a.e., then f E L 1 ([a, b];R). Hint: Generalize the arguments in the
proof of Proposition 8.4.2.

9.18. Prove Fubini's theorem for step functions (Proposition 9.4.1).



9.19. Suppose that $X \subset \mathbb{R}^n$ and $Y \subset \mathbb{R}^m$ and both $f: X \to \mathbb{R}$ and $g: Y \to \mathbb{R}$
are integrable. Prove that the function $h: X \times Y \to \mathbb{R}$, given by $h(x,y) =
f(x)g(y)$, is integrable and
$$\int_{X\times Y} h = \left(\int_X f\right)\left(\int_Y g\right).$$
9.20. Prove Proposition 8.6.5 .

9.21. Prove Corollary 9.5.3.


9.22. Prove Corollary 9.5.6.
9.23. Prove Lemma 9.5.7. Hint: Use the chain rule and the fact that determinants
are multiplicative (det(AB) = det(A) det(B)).

Notes
As mentioned in the notes at the end of the previous chapter, much of our development
of integration is inspired by [Cha95]. Other references are given at the end
of the previous chapter. The proof of Proposition 9.2.5 is modeled after [Cha95,
Sect. II.2.4], and the proof of Lemma 9.3.3 is from [Cha95, Sect. II.2.3]. The
proof of Lemma 9.3.4 is from [Cha95, Sect. II.3.3]. The proof of the change of
variables formula is based on [Mun91, Sect. 19] and [Dri04, Sect. 20.2].
Calculus on Manifolds

Cultivate your curves-they may be dangerous, but they won't be avoided.


-Mae West

In this chapter we describe how to compute tangents, and integrals on parametrized


manifolds in a Banach space. Manifolds are curves, surfaces, and other spaces that
are, at least locally, images of especially nice, smooth maps from an open set in JRk
into some Banach space. We begin with manifolds whose domain is one dimensional,
that is, smooth curves. We describe how to compute tangents, normals, arclength,
and more general line integrals that account for the geometry of the curve. These
are important tools for complex analysis and spectral calculus. We then give a brief
treatment of the theory for more general manifolds.
The main result of this chapter is Green's theorem, which is a generalization
of the fundamental theorem of calculus. It tells us that under certain conditions the
line integral around a simple closed curve in the plane is the same as the integral
of a certain derivative of the integrand on the region bounded by the curve. In
other words, the surface integral of the derivative on the region is equal to the line
integral of the original integrand on the boundary of the region.

10.1 Curves and Arclength

In the next two sections we discuss smooth curves. Although these may seem like
very simple manifolds, much of the general theory is motivated by this basic case.
Throughout this section, assume that $(X, \|\cdot\|)$ is a Banach space.

Definition 10.1.1. A smooth parametrized curve in $X$ is an injective $C^1$ function
$\sigma: I \to X$, where $I \subset \mathbb{R}$ is an interval (not necessarily closed, open, or finite) such
that $\sigma'$ never vanishes. It is common to call the domain of a parametrization "time"
and to use the variable $t$ to denote points in $I$. If the domain $I$ of $\sigma$ is not an open
interval, then $C^1$ is meant in the sense of Remark 7.2.5. We say that $\sigma: [a,b] \to X$ is
a simple closed parametrized curve if $\sigma(a) = \sigma(b)$ and $\sigma$ is injective when restricted
to $[a,b)$.


Remark 10.1.2. With some work it is possible to show the results of this chapter
also hold for curves O" : I ---+ X on a closed interval I that are not differentiable at
the endpoints of I, but that are continuous on I and 0 1 on the interior of I.

Unexample 10.1.3.

(i) The derivative of the map a : IR---+ IR 2 given by o(t) = (1 - t 2 , t 3 - t)


never vanishes, but a does not define a smooth curve because it is not
injective- it takes on the value (0, 0) at t = 1 and t = -1.
(ii) The map a : IR---+ IR 2 given by o(t) = (t 2 , t 3 ) is inject ive, but it is not a
smooth curve because the derivative D o vanishes at t = 0.

We define the tangent to the curve $\sigma$ at time $t_1$ to be the vector $\sigma'(t_1)$. If a
curve in $\mathbb{R}^n$ is thought of as the trajectory of a particle, then $\sigma'(t_1)$ is its velocity
at time $t_1$. The line in $X$ that is tangent to the curve at time $t_1$ is defined by the
parametrization $L(t) = t\sigma'(t_1) + \sigma(t_1)$.

10.1.1 Parametrizations and Equivalent Curves


We often want to study the underlying curve itself, that is, the image of the map O",
rather than the parametrization of the curve. In other words, we want the objects
we study to be independent of parametrization.

Definition 10.1.4. Two smooth parametrized curves 0" 1 : I ---+ X and 0"2 : J ---+ X
are equivalent if there exists a bijective 0 1 map¢: I---+ J, such that ¢'(t) > 0 for
all t E I and 0"2 o ¢ = 0"1. In this case, we say that 0"2 is a reparametrization of 0" 1 .
Each equivalence class of parametrizations is called a smooth, oriented curve .
If we replace the condition that ¢' (t) > 0 by the condition that ¢' (t) -=f. 0 for all
t, we get a larger equivalence class that includes orientation-reversing reparametriza-
tions. Each of these larger equivalence classes is called a smooth, unoriented curve
or just a smooth curve.

Remark 10.1.5. The tangent vector O"'(ti) to the curve O" at the point O"(t 1 ) is not
independent of parametrization. A reparametrization ¢ : I ---+ J scales the tangent
vector by ¢'. However, we can define the unit tangent T to be

T(t) = O"'(t) /ll O"'(t) ll,

and the unit tangent at each point depends only on the orientation of the curve.
See Figure 10.l.

Definition 10.1.6. A finite collection of smooth parametrized curves 0" 1 : [a 1 , b1] ---+
X, ... , O"k : [ak, bk] ---+ X is called a piecewise-smooth parametrized curve if we have
O"i(bi) = O"i+1(aH1) for each i E {1,. .. ,k - 1}. Such a curve is often denoted
0"1 + ... + O"k.
Figure 10.1. A smooth parametrized curve $\sigma$ with starting point $\sigma(a)$ and
ending point $\sigma(b)$, as given in Definition 10.1.1. For each $t \in (a,b)$, the vector $T(t)$
is the unit tangent to the curve at time $t$. See Remark 10.1.5 for the definition
of $T(t)$.

Remark 10.1.7. Throughout this section we focus primarily on smooth paramet-


rized curves, but it should be clear how to extend these results to piecewise-smooth
curves.

10.1.2 Arclength
Arclength measures the length of an oriented curve in a way that is independent
of parametrization. It is essential for understanding the geometry of a curve. It
should seem intuitive that arclength would be the integral of speed, where speed is
the norm $\|\sigma'(u)\|$ of the velocity.

Definition 10.1.8. The arclength $\operatorname{len}(\sigma)$ of a smooth parametrized curve $\sigma:
[a,b] \to X$ is
$$\operatorname{len}(\sigma) = \int_a^b \|\sigma'(u)\|\,du.$$

Example 10.1.9. Consider a straight line segment, parametrized by CJ: [a, b]


-+ X with CJ(u) = ux+v, where x, v EX are fixed. The traditional definition
of the length of the segment is just the norm of the difference between the
starting and ending points: llax+ v - (bx+ v)ll = (b- a)llxll· It is immediate
to check that this agrees with our new definition of arclength.

Example 10.1.10. Consider a circle in $\mathbb{R}^2$ of radius $r$, centered at the origin,
with parametrization $\sigma: [0,2\pi] \to \mathbb{R}^2$ given by $\sigma(u) = (r\cos(u), r\sin(u))$.
The length of the curve is
$$\int_0^{2\pi} \|\sigma'(u)\|\,du = \int_0^{2\pi} r\,\|(-\sin(u), \cos(u))\|\,du = 2\pi r,$$
which agrees with the classical definition of arclength of the circle.

Example 10.1.11. The graph of a function $f \in C^1([a,b];\mathbb{R})$ defines a smooth
parametrized curve $\sigma: [a,b] \to \mathbb{R}^2$ by $\sigma(t) = (t, f(t))$. We have $\sigma'(t) =
(1, f'(t))$, so that
$$\operatorname{len}(\sigma) = \int_a^b \sqrt{1 + (f'(u))^2}\,du.$$
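For curves like the one in Example 10.1.11 the arclength integral rarely has a simple closed form, but it is easy to evaluate numerically. The sketch below is an illustration only (the function $f(t) = t^2$ and the use of scipy.integrate.quad are our choices, not part of the text); it approximates the arclength of the graph of $f$ on $[0,1]$ and compares it with the exact value of the integral.

    import numpy as np
    from scipy.integrate import quad

    # Arclength of the graph of f(t) = t^2 over [0, 1] via Definition 10.1.8:
    # len(sigma) = integral of ||sigma'(u)|| with sigma(t) = (t, f(t)).
    speed = lambda u: np.sqrt(1.0 + (2.0 * u) ** 2)      # ||(1, f'(u))||

    approx, _ = quad(speed, 0.0, 1.0)
    exact = (2.0 * np.sqrt(5.0) + np.arcsinh(2.0)) / 4.0  # closed form of the integral
    print(approx, exact)                                  # both are about 1.4789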

Proposition 10.1.12. Arclength is independent of parametrization. In other
words, if $\sigma: [a,b] \to X$ and $\gamma = \sigma\circ\phi$, where $\phi: [c,d] \to [a,b]$ is a bijective $C^1$
function with $\phi'(t) \neq 0$ for all $t \in [c,d]$, then $\operatorname{len}(\sigma) = \operatorname{len}(\gamma)$.

Proof. Since $\phi'(t) \neq 0$ for all $t \in [c,d]$, the intermediate value theorem guarantees
that either $\phi'(t) > 0$ for all $t$ or $\phi'(t) < 0$ for all $t$. We prove the case of $\phi'(t) < 0$.
The case of $\phi'(t) > 0$ is similar (and slightly easier).
Since $\phi$ is bijective, it must map some point $t_0$ to $b$. If $t_0 > c$, then the mean
value theorem guarantees that for some $\xi \in [c, t_0]$ we have
$$b - \phi(c) = \phi'(\xi)(t_0 - c) < 0,$$
which is impossible, since $\phi(c) \in [a,b]$; thus, $\phi(c) = b$. A similar argument shows
that $\phi(d) = a$. The substitution $u = \phi(t)$ now gives
$$\operatorname{len}(\gamma) = \int_c^d \|\gamma'(t)\|\,dt = \int_c^d \|\sigma'(\phi(t))\|\,|\phi'(t)|\,dt
= -\int_b^a \|\sigma'(u)\|\,du = \operatorname{len}(\sigma). \qquad \Box$$

Definition 10.1.13. Given $\sigma: [a,b] \to X$, define the arclength function $s: [a,b] \to
\mathbb{R}$ by assigning $t \in [a,b]$ to the length of $\sigma$ restricted to the subinterval $[a,t]$:
$s(t) = \operatorname{len}(\sigma|_{[a,t]})$.
If $a = 0$ and $\|\sigma'(u)\| = 1$ for all $u$ in $[0,b]$, then $s(t) = t$, and we say that $\sigma$ is
parametrized by arclength.

Example 10.1.14. If $\sigma: [0,\infty) \to \mathbb{R}^3$ is the helix given by
$$\sigma(t) = (t, \cos(t), \sin(t)),$$
then
$$s(t) = \int_0^t \sqrt{1 + \sin^2(u) + \cos^2(u)}\,du = t\sqrt{2},$$
so $\sigma$ is not parametrized by arclength, but it is easy to fix this. Setting
$t = \tau/\sqrt{2}$ gives a new parametrization
$$\alpha(\tau) = \sigma(\tau/\sqrt{2}) = (\tau/\sqrt{2}, \cos(\tau/\sqrt{2}), \sin(\tau/\sqrt{2}))$$
with $s(t(\tau)) = \tau$ and
$$\|\alpha'(\tau)\| = \left\|\frac{d\sigma(t(\tau))}{d\tau}\right\| = \frac{1}{\sqrt{2}}\,\|(1, -\sin(\tau/\sqrt{2}), \cos(\tau/\sqrt{2}))\|
= \frac{1}{\sqrt{2}}\sqrt{1 + \sin^2(\tau/\sqrt{2}) + \cos^2(\tau/\sqrt{2})} = 1.$$
So $\alpha(\tau)$ is parametrized by arclength.

The previous example is not a fluke-we can always reparametrize a smooth


curve so that it is parametrized by arclength, as the next two propositions show.

Proposition 10.1.15. For any smooth parametrized curve er : [a, b] -t X the


function s(t) has an inverse p: [O,L] -t [a,b], where L = len(er) = s(b).

Proof. The proof is Exercise 10.2. D

Definition 10.1.16. Given any smooth curve with parametrization er we define a


new parametrization 'Y : [O, L] -t X by 'Y(s) = erop(s) . We call"( the parametrization
of the curve by arclength.

Proposition 10.1.17. For any smooth curve with parametrization er the paramet-
rization 'Y = er o p of the curve is, in fact, a parametrization by arclength; that is,

ll"f'(s) JI =1 for alls E [O,L].

Proof. This follows immediately from the fact that s'(t) = ll er'(t) ll combined with
the chain rule. D

Example 10.1.18. The circle er: [0,27r] -t JR 2 with er(t) = (rcos(t),rsin(t))


in Example 10.1.10 is parametrized by arclength only if Jrl = 1. To see this,
note that the arclength function is s(t) = J~ ll er'(u)l l du= tr , which has inverse
p(s) = s/r . Thus, 'Y( s) = (rcos(s/r),r sin(s /r)) is the parametrization of this
circle by arclength.

10.2 Line Integrals


In this section we define the line integral of a function f over a curve. We want
the line integral to be independent of the parametrization and, in analogy to the
traditional single-variable integral, to correspond to summing the value of the func-
tion on a short segment of the curve times the length of the segment. A physical
interpretation of this would correspond to the mass of the curve if the function f
describes the density of the curve at each point.
Throughout this section, assume that (X, II · llx) and (Y, I · ll Y) are Banach
spaces over the same field lF.

10.2.1 Line Integrals

Definition 10.2.1. Given a smooth curve $C \subset X$ parametrized by arclength $\gamma:
[0,L] \to C$ and a function $f: C \to Y$, we define the line integral of $f$ over $C$ to be
$$\int_C f\,ds := \int_0^L f(\gamma(s))\,ds,$$
if the integral exists.

Proposition 10.2.2. Given a smooth curve $C$ with parametrization $\sigma: [a,b] \to C$
(not necessarily parametrized by arclength) and a function $f: C \to Y$, the line
integral of $f$ over $C$ can be computed as
$$\int_C f\,ds = \int_a^b f(\sigma(t))\,\|\sigma'(t)\|_X\,dt,$$
if that integral exists.

Proof. The proof is Exercise 10.8. $\Box$

Remark 10.2.3. Since every oriented curve has only one parametrization by arc-
length, the line integral depends only on the oriented curve class of C .

Example 10.2.4. Suppose $C \subset \mathbb{R}^3$ is given by the parametrization
$$\sigma(t) = (2\sin(t),\ t,\ -2\cos(t)),$$
where $0 \le t \le \pi$. The line integral $\int_C xyz\,ds$ can be evaluated as
$$\int_0^{\pi} (-4t\sin(t)\cos(t))\,\|\sigma'(t)\|\,dt = -\sqrt{5}\int_0^{\pi} 4t\sin(t)\cos(t)\,dt
= \sqrt{5}\,\bigl(t\cos(2t) - \sin(t)\cos(t)\bigr)\Big|_0^{\pi} = \sqrt{5}\,\pi.$$

Example 10.2.5.

(i) If $C \subset X$ is parametrized by $\sigma: [a,b] \to C$, then the line integral
$\int_C ds = \int_a^b \|\sigma'(t)\|_X\,dt$ is just the arclength of $C$.
(ii) If $\rho: C \to \mathbb{R}$ is positive for all $c \in C$, we can interpret it as giving the
density of a wire in the shape of the curve $C$. In this case, as mentioned
in the introduction to this section, the integral $\int_C \rho\,ds$ is the mass of $C$.
(iii) The center of mass of a wire in the shape of the curve $C$ with density
function $\rho$ is $(\bar{x}_1, \ldots, \bar{x}_n)$, where
$$\bar{x}_i = \frac{1}{m}\int_C x_i\,\rho\,ds$$
and $m = \int_C \rho\,ds$ is the mass.

10.2.2 Line Integrals of Vector Fields

In the special case where $C$ is a curve in $X = \mathbb{F}^n$, we call a function $F: C \to \mathbb{F}^n$ a
vector field on $C$. For vector fields it is often productive to consider a line integral
that takes into account not only the length of each infinitesimal curve segment, but
also its direction. This integral is useful, for example, when computing the work
done by a force $F$ moving a particle along $C$. If $F \in \mathbb{R}^n$ is a constant force, the
work done moving a particle along a line segment of the form $tv + c$, where $v \in \mathbb{R}^n$
and $t \in [a,b]$, is given by
$$\text{work} = F\cdot(b-a)v = (b-a)\langle F, v\rangle.$$
If $C$ is parametrized by $\sigma: [a,b] \to \mathbb{R}^n$ and the force $F: C \to \mathbb{R}^n$ is not necessarily
constant, then over a small interval $\Delta\sigma$ of the curve containing $\sigma(t)$, the work done is
approximately $F(\sigma(t))\cdot\Delta\sigma$. Summing these pieces and taking the limit as $\Delta\sigma \to 0$
yields an integral giving the work done by the force $F$ to move a particle along the
curve $C$. This motivates the following definition.

Definition 10.2.6. Given a curve $C \subset \mathbb{F}^n$ with parametrization $\sigma: [a,b] \to C$ and
a $C^1$ vector field $F: C \to \mathbb{F}^n$, if $F(\sigma(t))\cdot\sigma'(t)$ is integrable on $[a,b]$, then we define
the line integral of the vector field $F$ over $C$ to be
$$\int_C F\cdot d\sigma = \int_C F\cdot T\,ds = \int_a^b F(\sigma(t))\cdot\sigma'(t)\,dt, \tag{10.1}$$
where $T$ is the unit tangent to $C$. The $\cdot$ in the left-hand integral is a formal symbol
defined by (10.1), while the $\cdot$ in the second and third integrals is the usual dot
product: $x\cdot y = \langle x, y\rangle = x^{\mathsf{H}}y$.
If we write $\sigma$ and $F$ in terms of the standard basis as
$$\sigma(t) = x(t) = (x_1(t), \ldots, x_n(t))$$
and $F(x) = (F_1(x), \ldots, F_n(x))$, then it is traditional to write $dx_i = x_i'(t)\,dt$, and in
this notation we have
$$\int_C F\cdot d\sigma = \int_C F_1\,dx_1 + \cdots + F_n\,dx_n.$$

Remark 10 .2. 7. Using this new notation, our previous discussion tells us that the
line integral
i F · da

is the work done by a force F: C-+ ]Rn moving a particle along the curve C.

Proposition 10.2.8. The line integral of a vector field $F$ over a smooth curve
does not depend on the parametrization of the curve. That is, given two equivalent
parametrizations $\sigma_1: [a,b] \to C$ and $\sigma_2: [c,d] \to C$, we have
$$\int_a^b F(\sigma_1(t))\cdot\sigma_1'(t)\,dt = \int_c^d F(\sigma_2(t))\cdot\sigma_2'(t)\,dt.$$
However, if a reparametrization changes the orientation, then it changes the sign
of the integral.

Proof. The proof is Exercise 10.9. $\Box$

Example 10.2.9. If $C$ is a segment of a helix parametrized by $\sigma: [0,2\pi] \to
\mathbb{R}^3$, with $\sigma(t) = (\cos(t), \sin(t), t)$, and if $F(x,y,z) = (-y, x, z^2)$, then
$$\int_C F\cdot d\sigma = \int_C -y\,dx + \int_C x\,dy + \int_C z^2\,dz
= \int_0^{2\pi} \sin^2(t)\,dt + \int_0^{2\pi} \cos^2(t)\,dt + \int_0^{2\pi} t^2\,dt
= 2\pi + \frac{8\pi^3}{3}.$$
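The same kind of numerical check works for line integrals of vector fields. The sketch below (illustrative only, not part of the text) evaluates the integral of Example 10.2.9 from the right-hand side of (10.1).

    import numpy as np
    from scipy.integrate import quad

    # Work integral of F(x, y, z) = (-y, x, z^2) along sigma(t) = (cos t, sin t, t).
    def integrand(t):
        x, y, z = np.cos(t), np.sin(t), t
        F = np.array([-y, x, z ** 2])
        dsigma = np.array([-np.sin(t), np.cos(t), 1.0])   # sigma'(t)
        return F @ dsigma

    value, _ = quad(integrand, 0.0, 2 * np.pi)
    print(value, 2 * np.pi + 8 * np.pi ** 3 / 3)   # both are about 88.97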

Example 10.2.10. If $F = D\phi$ is the derivative of a scalar-valued function
$\phi: \mathbb{R}^n \to \mathbb{R}$ (in this context $\phi$ is usually called a potential), then we say that $F$
is a conservative vector field. In this case the fundamental theorem of calculus
gives
$$\int_C F\cdot d\sigma = \int_a^b F(\sigma(t))\cdot\sigma'(t)\,dt = \int_a^b D\phi\cdot\sigma'(t)\,dt
= \int_a^b D(\phi\circ\sigma)\,dt = \phi(\sigma(b)) - \phi(\sigma(a)).$$

In this case the integral depends only on the value of the potential at the
endpoints; it is independent of the path $C$ and of $\sigma$. This is an important
phenomenon that we revisit several times.

10.3 Parametrized Manifolds


A parametrized manifold is the natural higher-dimensional generalization of the
concept of a parametrized curve. Throughout this section assume that (X, II · II) is
a Banach space.

Definition 10.3.1. Let U be an open subset of "!Rm. We say that a E C 1 (U; X) is


a parametrized m-manifold if it is injective and at each point u E U the derivative
Da(u) is injective (that is, Da(u) has rank m). A parametrized 2-manifold is also
called a parametrized surface.

Remark 10.3.2. The injectivity of Da implies that m ::::; dim(X).

Example 10.3.3.

(i) Every smooth parametrized curve with an open domain is a parametrized
1-manifold.

(ii) For every $n \in \mathbb{Z}^+$ the identity map $I: \mathbb{R}^n \to \mathbb{R}^n$ is a parametrized
$n$-manifold.
(iii) For any $U \subset \mathbb{R}^m$, the graph of any $C^1$ function $f: U \to \mathbb{R}$ gives a
parametrized $m$-manifold in $\mathbb{R}^{m+1}$ by $\alpha(u) = (u, f(u)) \in \mathbb{R}^{m+1}$. (Check
that $D\alpha$ has rank $m$.)

10.3.1 Parametrizations and Equivalent Manifolds

As with curves, we often want to study the underlying spaces (manifolds) themselves,
rather than the parametrizations; that is, we want to understand properties
that are independent of parametrization.

Definition 10.3.4. Two parametrized $m$-manifolds $\alpha_1: U_1 \to X$ and $\alpha_2: U_2 \to X$
are equivalent if there exists a bijective $C^1$ map $\phi: U_1 \to U_2$ such that
(i) $\phi^{-1}$ is also $C^1$,
(ii) $\det(D\phi(u)) > 0$ for all $u \in U_1$, and

(iii) $\alpha_2\circ\phi = \alpha_1$.
In this case, we say that $\alpha_2$ is a reparametrization of $\alpha_1$. Each equivalence class of
parametrizations is called an oriented $m$-manifold or, if $m$ is understood, it is just
called an oriented manifold.

If we drop the condition $\det(D\phi) > 0$, then we get a larger equivalence class
of manifolds. Since $\det(D\phi)$ is continuous and nonvanishing, it is either
always positive or always negative. If $\det(D\phi(u)) < 0$ for all $u \in U_1$, then we say
that $\phi$ is orientation reversing. Each of these larger equivalence classes is called an
unoriented $m$-manifold or just an $m$-manifold.

Example 10.3.5. We may parametrize the upper half of the unit sphere
$S^2 \subset \mathbb{R}^3$ by $\alpha: U \to \mathbb{R}^3$, where $U = B(0,1)$ is the unit disk in the plane
and $\alpha(u,v) = (u, v, \sqrt{1 - u^2 - v^2})$. But we may also use the parametrization
$\sigma(u,v) = (v, u, \sqrt{1 - u^2 - v^2})$. These are equivalent by the map $G: U \to U$
given by $G(u,v) = (v,u)$. Moreover, we have
$$\det(DG) = \det\begin{bmatrix} 0 & 1\\ 1 & 0\end{bmatrix} = -1,$$
so the equivalence reverses the orientation.

We may also parametrize the upper half of the unit sphere by $\beta: (0,\pi)\times
(0,\pi) \to \mathbb{R}^3$ where $\beta(\phi,\theta) = (\cos(\phi), \cos(\theta)\sin(\phi), \sin(\theta)\sin(\phi))$. The parametrizations
$\alpha$ and $\beta$ are equivalent by the map $F: (0,\pi)\times(0,\pi) \to B(0,1)$,
given by $F(\phi,\theta) = (\cos(\phi), \cos(\theta)\sin(\phi))$. This map is $C^\infty$ and bijective
with inverse $F^{-1}(u,v) = (\operatorname{Arccos}(u), \operatorname{Arccos}(v/\sqrt{1-u^2}))$, which is also $C^\infty$.
Moreover, we have
$$\det(DF) = \det\begin{bmatrix} -\sin(\phi) & 0\\ \cos(\theta)\cos(\phi) & -\sin(\theta)\sin(\phi)\end{bmatrix} = \sin(\theta)\sin^2(\phi) > 0,$$
so this equivalence preserves orientation.

10.3.2 Tangent Spaces and Normals

As in the case of curves, the derivative $D\alpha$ is not independent of parametrization,
so if $\alpha$ and $\beta$ are two parametrizations of $M$, the derivatives $D\alpha$ and $D\beta$ are usually
not equal. However, the image of the derivative, the tangent space, is the same
for all equivalent parametrizations.

Definition 10.3.6. Given a parametrized $m$-manifold $\alpha: U \to M \subset X$, and a
point $u \in U$ with $\alpha(u) = p$, define the tangent space $T_pM$ of $M$ at $p$ to be the
image of the derivative $D\alpha(u): \mathbb{R}^m \to X$ in $X$:
$$T_pM = \mathscr{R}(D\alpha(u)) = \{D\alpha(u)v \mid v \in \mathbb{R}^m\}.$$
Thus, if $v_1, \ldots, v_m$ is any basis for $\mathbb{R}^m$, then the vectors $D\alpha(u)v_1, \ldots, D\alpha(u)v_m$
form a basis for $T_pM$.

Example 10.3.7. For each of the parametrizations $\alpha$, $\beta$, and $\sigma$ of
Example 10.3.5, we compute the tangent space of $S^2 \subset \mathbb{R}^3$ at the point
$p = (1/2, 0, \sqrt{3}/2)$. We have
$$D\alpha = \begin{bmatrix} 1 & 0\\ 0 & 1\\ \dfrac{-u}{\sqrt{1-u^2-v^2}} & \dfrac{-v}{\sqrt{1-u^2-v^2}}\end{bmatrix},$$
so the standard basis vectors $e_1, e_2 \in \mathbb{R}^2$ are mapped by $D\alpha(1/2,0)$ to
$[1\;\; 0\;\; -1/\sqrt{3}]^{\mathsf{T}}$ and $[0\;\; 1\;\; 0]^{\mathsf{T}}$, respectively. A similar calculation shows
that for the parametrization $\sigma$ the standard basis vectors are mapped by
$D\sigma(0,1/2)$ to $[0\;\; 1\;\; 0]^{\mathsf{T}}$ and $[1\;\; 0\;\; -1/\sqrt{3}]^{\mathsf{T}}$, respectively. And for the
parametrization $\beta$ we have
$$D\beta = \begin{bmatrix} -\sin(\phi) & 0\\ \cos(\theta)\cos(\phi) & -\sin(\theta)\sin(\phi)\\ \sin(\theta)\cos(\phi) & \cos(\theta)\sin(\phi)\end{bmatrix},$$
and thus the standard basis vectors are mapped by $D\beta(\pi/3, \pi/2)$ to
$[-\sqrt{3}/2\;\; 0\;\; 1/2]^{\mathsf{T}}$ and $[0\;\; -\sqrt{3}/2\;\; 0]^{\mathsf{T}}$, respectively. It is straightforward
to check that each of these pairs of vectors spans the same subspace $T_pS^2$ of $\mathbb{R}^3$.

Proposition 10.3.8. The tangent space $T_pM$ is independent of parametrization
and of orientation.

Proof. Given any two parametrizations $\alpha: U \to M \subset X$ and $\beta: W \to M$ with
$U, W \subset \mathbb{R}^m$ and $\alpha(u) = p = \beta(w)$, the composition $\phi = \beta^{-1}\circ\alpha$ is a reparametrization.
If $x \in \mathscr{R}(D\alpha(u))$, then $x = D\alpha(u)v$ for some $v \in \mathbb{R}^m$. At the point $u$ we
have
$$D\phi(u) = D(\beta^{-1}\circ\alpha)(u) = (D\beta)^{-1}(p)\,D\alpha(u),$$
and so
$$x = D\alpha(u)v = D\beta(w)\bigl((D\beta)^{-1}(p)\,D\alpha(u)v\bigr) \in \mathscr{R}(D\beta(w)).$$
This shows $\mathscr{R}(D\alpha(u)) \subset \mathscr{R}(D\beta(w))$. The proof that $\mathscr{R}(D\beta(w)) \subset \mathscr{R}(D\alpha(u))$ is
essentially identical. $\Box$

Remark 10.3.9. The tangent space TpM to M C X at p E Mis a vector subspace


of X, so it always contains 0 , but one often draws tangent lines or tangent planes
passing through the point p instead through 0. To get the traditional picture of
a tangent line or plane, simply add p to every element of TpM· See Figure 10.2 for
a depiction of both TpM and p+TpM. There are many reasons to prefer TpM over
the traditional picture-perhaps the most important reason is that the traditional
picture of a tangent plane is not a vector space.

In the special case of a surface in IR 3 we can use the cross product of tangent
vectors to define a normal to the surface. This cross product depends strongly
on the choice of parametrization, but if we rescale the normal to have length one,
then the resulting unit normal depends only on the oriented equivalence class of
the surface.

Figure 10.2. The tangent space TpM (red) to the manifold M at the point
p is a vector space, and hence passes through the origin. The traditional picture
that is often drawn instead is the tran.slate p + TpM (blue) . This translate is not a
vector space, but there is an obvious bijection from it to the tangent space given by
vr--+v-p.

Definition 10.3.10. Let $U \subset \mathbb{R}^2$ be an open set and $\alpha: U \to S \subset \mathbb{R}^3$ be a
parametrized surface. Let $e_1$ and $e_2$ be the standard basis elements in $\mathbb{R}^2$, and for
each $u \in U$ we have tangent vectors $D\alpha(u)e_1$ and $D\alpha(u)e_2$ in $\mathbb{R}^3$.
For each $u \in U$, we define the unit normal $N$ to the surface $S$ at $\alpha(u)$ to be
$$N = \frac{D\alpha(u)e_1 \times D\alpha(u)e_2}{\|D\alpha(u)e_1 \times D\alpha(u)e_2\|},$$
where $\times$ is the cross product (see Appendix C.3). See Figure 10.3 for a depiction
of the unit normal.

Proposition 10.3.11. The unit normal depends only on the orientation of the
surface (that is, the orientation-preserving equivalence class). If the orientation is
reversed, the unit normal changes sign: N H - N.


Figure 10.3. The unit normal N to the surface S at a(u), as described


in Definition 10.3.10. The unit normal is independent of the parametrization, but
if the orientation of the parametrization changes, then the sign of the unit normal
is flipped; see Proposition 10.3.11.
Proof. Given any two parametrizations $\alpha: U \to M \subset \mathbb{R}^3$ and $\beta: W \to M$ with
$U, W \subset \mathbb{R}^2$ and $\alpha(u) = p = \beta(w)$, such that the composition $\phi = \beta^{-1}\circ\alpha$ is
orientation preserving (that is, $\det(D\phi) > 0$), and given $u \in U$ we have
$$D\alpha(u)e_i = D\beta(\phi(u))\,D\phi(u)e_i.$$
By Proposition C.3.2, the cross product $D\beta(\phi(u))D\phi(u)e_1 \times D\beta(\phi(u))D\phi(u)e_2$ is
equal to $\det(D\phi(u))\bigl(D\beta(\phi(u))e_1 \times D\beta(\phi(u))e_2\bigr)$, so we have
$$\frac{D\alpha(u)e_1 \times D\alpha(u)e_2}{\|D\alpha(u)e_1 \times D\alpha(u)e_2\|}
= \frac{\det(D\phi(u))\,D\beta(\phi(u))e_1 \times D\beta(\phi(u))e_2}{|\det(D\phi(u))|\,\|D\beta(\phi(u))e_1 \times D\beta(\phi(u))e_2\|}. \qquad \Box$$
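Since the unit normal of Definition 10.3.10 only involves the two tangent vectors $D\alpha(u)e_1$ and $D\alpha(u)e_2$, it can be approximated numerically even when $D\alpha$ is not available in closed form. The sketch below is only an illustration (the finite-difference step and the hemisphere parametrization from Example 10.3.5 are our choices, not part of the text); it computes $N$ for $\alpha$ at $(u,v) = (1/2, 0)$ and confirms that the orientation-reversing parametrization gives $-N$, as in Proposition 10.3.11.

    import numpy as np

    def alpha(u, v):
        # Hemisphere parametrization from Example 10.3.5.
        return np.array([u, v, np.sqrt(1.0 - u**2 - v**2)])

    def unit_normal(f, u, v, h=1e-6):
        du = (f(u + h, v) - f(u - h, v)) / (2 * h)   # approximates D f(u,v) e_1
        dv = (f(u, v + h) - f(u, v - h)) / (2 * h)   # approximates D f(u,v) e_2
        n = np.cross(du, dv)
        return n / np.linalg.norm(n)

    print(unit_normal(alpha, 0.5, 0.0))     # approximately (1/2, 0, sqrt(3)/2), pointing outward
    sigma = lambda u, v: alpha(v, u)        # the orientation-reversing parametrization
    print(unit_normal(sigma, 0.0, 0.5))     # approximately the negative of the vector above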

Vista 10.3.12. The previous construction of the unit normal depends on the
fact that we have a two-dimensional tangent space embedded in JR 3 . Although
the cross product is only defined for a pair of vectors in JR 3 , there is a very
beautiful and powerful generalization of this idea to manifolds of arbitrary
dimension in X using the exterior algebra of differential forms. Unfortunately
we cannot treat this topic here, but the interested reader may find a complete
treatment in many books on vector calculus or differential geometry, including
several listed in the notes at the end of this chapter.

10.4 *Integration on Manifolds


In the case of the line integral, we compute the length of a curve by integrating
the length ll O"'(t) ll of the tangent vector at each point of the parametrization. To
compute the analogous k-dimensional volume of a k-dimensional manifold in ]Rn,
the correct term to integrate is the k-dimensional volume of an infinitesimal paral-
lelepiped in the manifold.
Throughout this section we will work with manifolds in a finite-dimensional
space over JR because working over <C or in infinite dimensions involves some addi-
tional complexities that we do not wish to address here.

10.4.1 The k-volume of a Parallelepiped in $\mathbb{R}^n$

To begin, we must understand volumes of parallelepipeds in $\mathbb{R}^n$.

Definition 10.4.1. We say that a $k$-dimensional parallelepiped $P$ in $X = \mathbb{R}^n$ with
one vertex at the origin is defined by vectors $x_1, \ldots, x_k \in X$ if each of those vectors
is an edge of $P$, and any other edge is obtained by translating one of the $x_i$ to start
at some sum $\sum_{j\in E} x_j$ of the other vectors and end at $x_i + \sum_{j\in E} x_j$. Also we require that each $(k-1)$-
plane spanned by any $(k-1)$ of the vectors contains one face of $P$. See Figure 10.4
for an example.

Let $Q$ be the unit interval (cube) $[0,\mathbb{1}] \subset \mathbb{R}^k$, where $\mathbb{1} = \sum_{i=1}^{k} e_i$. This
is the parallelepiped in $\mathbb{R}^k$ defined by the standard basis vectors $e_1, \ldots, e_k$. If
$x_1, \ldots, x_k$ is a collection of any $k$ vectors in $\mathbb{R}^k$, we can construct a linear operator

Figure 10.4. The parallelepiped defined by three vectors x1, x2, and X3.
Each of the planes spanned by any two of these three vectors contains a face of the
parallelepiped, and every edge is a translate of one of these three vectors.

$L(x_1, \ldots, x_k): \mathbb{R}^k \to \mathbb{R}^k$ by sending each $e_i$ to $x_i$. In terms of the standard
basis, the matrix representation of $L(x_1, \ldots, x_k)$ has its $i$th column equal to the
representation of $x_i$. The operator $L(x_1, \ldots, x_k)$ maps $Q$ to the parallelepiped
defined by $x_1, \ldots, x_k$. Therefore, the volume of this parallelepiped is equal to the
modulus of the determinant of $L$ (see Remark 8.7.6). We can also rewrite this as
$$\operatorname{vol}(P) = |\det(L)| = \sqrt{\det(L^{\mathsf{T}}L)}.$$

A $k$-dimensional parallelepiped $P \subset \mathbb{R}^n$, considered as a subset of $\mathbb{R}^n$, must
have measure zero, but it is useful to make sense of the $k$-dimensional volume of $P$.
We have already seen this with the length of a line segment in $\mathbb{R}^n$: the line segment
has measure zero as a subset of $\mathbb{R}^n$, but it is still useful to measure its length.

Definition 10.4.2. The $k$-volume of the $k$-dimensional parallelepiped $P \subset \mathbb{R}^n$
defined by vectors $x_1, \ldots, x_k \in \mathbb{R}^n$ is given by
$$\sqrt{\det(L^{\mathsf{T}}L)},$$
where $L: \mathbb{R}^k \to \mathbb{R}^n$ is the linear transformation mapping each $e_i \in \mathbb{R}^k$ to $x_i \in \mathbb{R}^n$.
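As a small illustration of Definition 10.4.2 (the vectors below are arbitrary, chosen only so that the answer is easy to check by hand; this sketch is not part of the text), the following computes $\sqrt{\det(L^{\mathsf{T}}L)}$ for a parallelogram in $\mathbb{R}^3$ and compares it with the cross-product formula mentioned later in Remark 10.4.7.

    import numpy as np

    # 2-volume (area) of the parallelogram in R^3 defined by the columns of L.
    x1 = np.array([1.0, 0.0, 0.0])
    x2 = np.array([0.0, 1.0, 1.0])
    L = np.column_stack([x1, x2])            # the linear map sending e_i to x_i

    vol = np.sqrt(np.linalg.det(L.T @ L))
    print(vol)                               # sqrt(2)

    # Sanity check in R^3: the area also equals the length of the cross product.
    print(np.linalg.norm(np.cross(x1, x2)))  # also sqrt(2)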

10.4.2 Integral over a k-Manifold in $\mathbb{R}^n$

If a $k$-manifold $M \subset \mathbb{R}^n$ is parametrized by $\alpha: U \to \mathbb{R}^n$ with $\alpha(u) = p$, then $D\alpha(u)$
maps a unit $k$-cube $Q$ in $U$ to $D\alpha(u)Q \subset T_pM \subset \mathbb{R}^n$, which has $k$-volume equal to
$\sqrt{\det(D\alpha(u)^{\mathsf{T}}D\alpha(u))}$. This motivates the following definitions.

Definition 10.4.3. Given a parametrized $k$-manifold $\alpha: U \to \mathbb{R}^n$, and given a
measurable subset $A \subset U$ with $\alpha(A) = M$, define the $k$-dimensional measure of $M$
to be
$$\int_A \sqrt{\det\bigl(D\alpha(u)^{\mathsf{T}}D\alpha(u)\bigr)}\,du.$$

This is a special case of the following more general definition.


Definition 10.4.4. Given a parametrized $k$-manifold $\alpha: U \to \mathbb{R}^n$, a measurable
subset $A \subset U$ with $\alpha(A) = M$, and a function $f: M \to Y$, where $(Y, \|\cdot\|_Y)$ is a
Banach space, if $f\sqrt{\det(D\alpha^{\mathsf{T}}D\alpha)}$ is integrable on $U$, define the integral
$$\int_M f\,dM = \int_A f(\alpha(u))\sqrt{\det\bigl(D\alpha(u)^{\mathsf{T}}D\alpha(u)\bigr)}\,du.$$

Example 10.4.5. If $k = 1$, the term $\sqrt{\det(D\alpha^{\mathsf{T}}D\alpha)}$ in the previous definition
is just $\|D\alpha\|_2 = \|\alpha'\|_2$, so that definition reduces to the formula for the line
integral of the function $f$. If $C = \alpha([a,b])$, we have
$$\int_C f\,dC = \int_{[a,b]} f\sqrt{\det(D\alpha^{\mathsf{T}}D\alpha)} = \int_a^b f\,\|\alpha'(t)\|\,dt = \int_C f\,ds.$$

Proposition 10.4.6. The value of the integral JM f dM is independent of the


choice of parametrization a .

Proof. The proof is Exercise 10.14. D

Remark 10.4.7. For a surface $S$ in $\mathbb{R}^3$, you may have seen surface integrals defined
differently, not with $\sqrt{\det(D\alpha^{\mathsf{T}}D\alpha)}$, but rather as
$$\int_S f\,dS = \int_A f(\alpha)\,\|D_1\alpha \times D_2\alpha\| = \iint f(\alpha(u,v))\,\|\alpha_u \times \alpha_v\|\,du\,dv. \tag{10.2}$$
This is equivalent to Definition 10.4.4, as can be seen by verifying that the area of
the parallelogram defined by two vectors $v_1, v_2$ in $\mathbb{R}^3$ is also equal to the length
$\|v_1 \times v_2\|$ of the cross product. For more on properties of cross products, see
Appendix C.3.

10.4.3 Surface Integrals of Vector Fields

For the special case in which $F$ is a vector field (that is, when $F: M \to \mathbb{R}^n$),
we defined a special line integral by taking the inner product of $F$ with the unit
tangent vector $T$. Although there is no uniquely determined tangent vector on
a surface or higher-dimensional manifold, there is a uniquely determined normal
vector on a surface $S$ in $\mathbb{R}^3$. This partially motivates the following definition, which
is also motivated physically by the fact that the following integral describes the flux
across $S$ (volume of fluid crossing the surface $S$ per unit time) of a fluid moving
with velocity $F$.

Definition 10.4.8. Given a parametrized surface $\alpha: U \to \mathbb{R}^3$, given a measurable
subset $A \subset U$ with $S = \alpha(A)$, and given a function $F: S \to \mathbb{R}^3$, let $N: U \to \mathbb{R}^3$
be the unit normal to $\alpha$. If $(F\cdot N)\sqrt{\det(D\alpha^{\mathsf{T}}D\alpha)}$ is integrable on $A$, define the
integral
$$\int_S F\cdot dS = \int_A (F\cdot N)\sqrt{\det\bigl(D\alpha^{\mathsf{T}}D\alpha\bigr)}.$$

Remark 10.4.9. Combining the definition of the unit normal with the relation
(10.2), we can also write the surface integral of a vector field as
$$\int_S F\cdot dS = \int_A (F\cdot N)\sqrt{\det(D\alpha^{\mathsf{T}}D\alpha)} = \iint F(\alpha)\cdot(\alpha_u \times \alpha_v)\,du\,dv.$$

Proposition 10.4.10. The surface integral of a vector-valued function is indepen-
dent of the parametrization (but will change sign if the orientation is reversed).

Proof. The proof is Exercise 10.15. D

10.5 Green's Theorem


Green's theorem is a two-dimensional analogue of the fundamental theorem of calcu-
lus. Recall that the single-variable fundamental theorem of calculus says the integral
over an interval of the derivative of a function can be computed from the value of
the function on the boundary. Green's theorem says that the integral of a certain
derivative of a function over a region in JR 2 can be computed from (an integral of)
the function on the boundary of the surface.
Like the fundamental theorem of calculus, Green's theorem is a powerful tool
both for proving theoretical results and for computing integrals that would be very
difficult to compute any ot her way.

10.5.1 Jordan Curve Theorem


We begin with a definition and a theorem that seems obvious, but which is actually
quite tricky to prove; it is called the Jordan curve theorem.

Definition 10.5.1. A connected component of a subset SC IFn is a subset TC S


that is connected (see Definition 5. 9.10) and is not contained in any other connected
subset of S .

Example 10.5.2.

(i) The set JR 1 "' { 0} consists of two connected components, namely, the sets
( - oo, 0) and (0, oo).
(ii) The set ]Rn has only one connected component, namely, itself.

(iii) The set {(x , y) E JR 2 [ xy -=/. O} has four connected components, namely,
the four quadrants of the plane.

Theorem 10.5.3 (Jordan Curve Theorem). Let $\gamma$ be a simple closed curve
in $\mathbb{R}^2$ (see Definition 10.1.1). The complement $\mathbb{R}^2\smallsetminus\gamma$ consists of two connected
components. One of these is bounded and one is unbounded. The curve $\gamma$ is the
topological boundary of each component.

Remark 10.5.4. We do not prove this theorem because it would take us far from
our goals for this text. A careful proof is actually quite difficult.

Definition 10.5.5. We call the bounded component of the complement of the curve
$\gamma$ the interior of $\gamma$ and the unbounded component of the complement the exterior of
$\gamma$. If a point $z_0$ lies in the interior of $\gamma$, we say that $z_0$ is enclosed by $\gamma$ or that it
lies within $\gamma$.

Remark 10.5.6. Because $\mathbb{C}$ is homeomorphic to the plane $\mathbb{R}^2$ (see Definition 5.9.1),
the Jordan curve theorem also applies to simple closed curves in $\mathbb{C}$.

Definition 10.5.7. We say that an open set Uc <C or in JR. 2 is simply connected
if for any simple closed curve 'Y that lies inside U, every point in the interior of 'Y
is also contained in U. See Figure 10.5 for an example.

Nota Bene 10.5.8. Intuitively, U is simply connected if it contains no holes.


Figure 10.5. A region (a) in the plane that is simply connected, and one
(b) that is not simply connected; see Definition 10.5. 7.

Example 10.5.9.

(i) The plane JR. 2 is simply connected.


(ii) The open ball B(O,r) c JR. 2 is simply connected for every r E (O,oo).

Unexample 10.5.10.

(i) The annulus {(x, y) E IR 2 I 0 < ll(x, y)ll < l} is not simply connected.
(ii) The set S = <C "\ B(O , r) is not simply connected, because for any c > 0
the circle {z E <C I llzll = r + c} lies in S, and the origin lies in the
interior of t he circle, but the origin is not in S.
Figure 10.6. As described in Definition 10.5.12, a parametrized curve is
positively oriented when the interior $\Theta$ lies to the left of the tangent vector $\gamma'(t)$
(blue). The left-pointing normal vector $n(t)$ (red) points into $\Theta$, and $-n(t)$ points
out of $\Theta$.

Proposition 10.5.11. The interior of any simple closed curve in the plane is
simply connected.

Proof. Let $\gamma$ be a simple closed curve, and let $\mathbb{R}^2\smallsetminus\gamma$ be the disjoint union of
connected components $U$ and $B$, with $U$ unbounded and $B$ bounded. For any simple
closed curve $\sigma \subset B$ we take $\mathbb{R}^2\smallsetminus\sigma = \nu \cup \beta$ with $\nu$ unbounded and $\beta$ bounded.
Since $U$ is connected and misses $\sigma$, we must have $U \subset \nu$ or $U \subset \beta$. Since $U$ is
unbounded, it cannot lie in $\beta$, so we have $U \subset \nu$, and hence $\beta \subset \nu^c \subset U^c = B \cup \gamma$.
But $\gamma \cap \beta = \emptyset$, so $\beta \subset B$. $\Box$

Definition 10.5.12. Let $\gamma: [a,b] \to \mathbb{R}^2$ be a simple closed curve with interior
$\Theta \subset \mathbb{R}^2$. Writing $\gamma = (x(t), y(t))$, we define the left-pointing normal vector $n(t)$ at
$t$ to be $n(t) = (-y'(t), x'(t))$.
We say that $\gamma$ is positively oriented if for any $t \in [a,b]$ there is a $\delta > 0$ such
that $\gamma(t) + hn(t) \in \Theta$ whenever $0 < h < \delta$.

Remark 10.5.13. Said informally, $\gamma$ has positive orientation if $\Theta$ always lies to the
left of the tangent vector $\gamma'(t)$, or, equivalently, if $\gamma$ is traversed in a counterclockwise
direction. See Figure 10.6 for a depiction of this.

10.5.2 Green's Theorem

Definition 10.5.14. We say that a closed subset $\Delta \subset \mathbb{R}^2$ is an $x$-simple region
if there is an interval $[a,b] \subset \mathbb{R}$ and continuous$^{43}$ functions $f, g: [a,b] \to \mathbb{R}$ that
are $C^1$ on $(a,b)$ such that $\Delta = \{(x,y) \mid x \in [a,b],\ f(x) \le y \le g(x)\}$. Similarly,
we say that $\Delta$ is a $y$-simple region if there is an interval $[a,b] \subset \mathbb{R}$ and functions
$f, g: [a,b] \to \mathbb{R}$ such that $\Delta = \{(x,y) \mid y \in [a,b],\ f(y) \le x \le g(y)\}$. See Figure 10.7
for examples. We say that $\Delta$ is a simple region if it is both $x$-simple and $y$-simple;
see Figure 10.8.
$^{43}$See Remark 10.1.2.
10.5 Green's Theorem 399

Figure 10.7. Examples of an x-simple region (left) and a y-simple region (right); see Definition 10.5.14.

Figure 10.8. Examples of regions that can be classified as both x-simple and y-simple, as described in Definition 10.5.14.

Theorem 10.5.15 (Green's Theorem). Let $\gamma : [a, b] \to \mathbb{R}^2$ be a piecewise-smooth, positively oriented, simple closed curve with interior $\Omega \subset \mathbb{R}^2$ such that $\overline{\Omega} = \Omega \cup \gamma$ is the union of a finite number of simple regions $\Delta_1, \ldots, \Delta_m$ for which the intersection $\Delta_i \cap \Delta_j$ has measure zero for every $i \neq j$.
If $U$ is an open set containing $\overline{\Omega}$ and if $P, Q : U \to \mathbb{R}$ are $C^1$, then we have
$$\iint_{\Omega} \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} = \int_{\gamma} (P, Q) \cdot d\gamma = \int_{\gamma} P\,dx + Q\,dy.$$

Proof. It suffices to prove the theorem in the case that $\Omega$ is simple. We show that $\iint_{\Omega} \frac{\partial Q}{\partial x} = \int_{\gamma} Q\,dy$ in the case that $\Omega$ is x-simple. The proof that $-\iint_{\Omega} \frac{\partial P}{\partial y} = \int_{\gamma} P\,dx$ when $\Omega$ is y-simple is essentially the same.
If $\Omega$ is x-simple, we can write $\overline{\Omega} = \{(x, y) \mid x \in [a, b],\ f(x) \le y \le g(x)\}$. We may take $\gamma$ to be the sum of four curves $\gamma_1, \gamma_2, \gamma_3, \gamma_4$, traversed consecutively, where $\gamma_1, \gamma_3 : [a, b] \to \mathbb{R}^2$ are given by $\gamma_1(t) = (t, f(t))$ (the graph of $f$, traversed from left to right) and $\gamma_3(t) = (b + a - t,\ g(b + a - t))$ (the graph of $g$, traversed from right to left), respectively, and $\gamma_2, \gamma_4 : [0, 1] \to \mathbb{R}^2$ are given by $\gamma_2(t) = (b,\ (1 - t)f(b) + t g(b))$ (the right vertical line, traversed from bottom to top) and $\gamma_4(t) = (a,\ (1 - t)g(a) + t f(a))$ (the left vertical line, traversed from top to bottom), respectively. See Figure 10.9 for a depiction of $\Omega$ and the four curves that make up $\gamma$.

Figure 10.9. The x-simple region $\Omega$ used in the proof of Green's theorem (Theorem 10.5.15). Here $\gamma_1$ is the graph of $f$, traversed from left to right, while $\gamma_3$ is the graph of $g$, traversed from right to left. The path $\gamma_2$ is the right-hand vertical line, traversed from bottom to top, and $\gamma_4$ is the left-hand vertical line, traversed from top to bottom.
By Fubini's theorem, the general Leibniz integral rule (8.10), and the fundamental theorem of calculus, we have
$$\iint_{\overline{\Omega}} \frac{\partial Q}{\partial x} = \int_a^b \int_{f(x)}^{g(x)} \frac{\partial Q}{\partial x}\,dy\,dx$$
$$= \int_a^b \left[ \frac{d}{dx}\left( \int_{f(x)}^{g(x)} Q(x, y)\,dy \right) + f'(x)Q(x, f(x)) - g'(x)Q(x, g(x)) \right] dx$$
$$= \int_{f(b)}^{g(b)} Q(b, y)\,dy - \int_{f(a)}^{g(a)} Q(a, y)\,dy + \int_a^b f'(x)Q(x, f(x))\,dx - \int_a^b g'(x)Q(x, g(x))\,dx$$
$$= \int_{\gamma_2} Q\,dy + \int_{\gamma_4} Q\,dy + \int_{\gamma_1} Q\,dy + \int_{\gamma_3} Q\,dy = \int_{\gamma} Q\,dy. \qquad \Box$$

Example 10.5.16. Let $D$ be the upper half of the unit disk, so
$$D = \{(x, y) \in \mathbb{R}^2 \mid 0 < \|(x, y)\| \le 1,\ y \ge 0\},$$
and let $\partial D$ be the boundary curve, traversed once counterclockwise. To compute the line integral
$$\int_{\partial D} x^2\,dx + 2xy\,dy$$
we may apply Green's theorem with $P(x, y) = x^2$ and $Q(x, y) = 2xy$:
$$\int_{\partial D} x^2\,dx + 2xy\,dy = \iint_D \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} = \iint_D 2y\,dx\,dy.$$
Converting to polar coordinates gives
$$\iint_D 2y\,dx\,dy = \int_0^{\pi} \int_0^1 2r\sin(\theta)\,r\,dr\,d\theta = 4/3.$$
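The identity in Example 10.5.16 is easy to check numerically. The following is a minimal sketch (not part of the text's exercises, and the grid sizes are arbitrary choices) that approximates both sides with NumPy: the line integral over the parametrized boundary of the half-disk, and the double integral in polar coordinates.

```python
import numpy as np

def line_integral(x, y):
    """Approximate  int P dx + Q dy  with P = x^2, Q = 2xy along a polyline."""
    P, Q = x**2, 2 * x * y
    dx, dy = np.diff(x), np.diff(y)
    return np.sum(0.5 * (P[:-1] + P[1:]) * dx + 0.5 * (Q[:-1] + Q[1:]) * dy)

# Boundary of the upper half-disk, counterclockwise: semicircle, then the
# segment from (-1, 0) back to (1, 0) along the x-axis.
t = np.linspace(0, np.pi, 100001)
s = np.linspace(-1, 1, 100001)
lhs = line_integral(np.cos(t), np.sin(t)) + line_integral(s, np.zeros_like(s))

# Double integral of dQ/dx - dP/dy = 2y over the half-disk, in polar coordinates.
r = np.linspace(0, 1, 1001)
theta = np.linspace(0, np.pi, 1001)
R, TH = np.meshgrid(0.5 * (r[:-1] + r[1:]), 0.5 * (theta[:-1] + theta[1:]),
                    indexing="ij")
rhs = np.sum(2 * R * np.sin(TH) * R) * (r[1] - r[0]) * (theta[1] - theta[0])

print(lhs, rhs, 4 / 3)   # all approximately 1.3333
```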

Remark 10.5.17. Although the statement of the theorem only mentions simple closed curves (and hence simply connected regions), we can easily extend it to more general cases. For example, if $\Omega$ is the region shown in Figure 10.10, we can cut $\Omega$ into two pieces, $\Omega_1 \cup \Omega_2$, each of which is bounded by a simple closed piecewise-smooth curve.
Since each of the new cuts is traversed twice, once in each direction, the contribution of each cut to the line integral is zero. Also each cut has measure zero in the plane, so it also contributes zero to the integral $\iint_{\Omega_i} \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}$, so we have
$$\iint_{\Omega} \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} = \sum_{i=1}^{2} \iint_{\Omega_i} \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} = \sum_{i=1}^{2} \int_{\gamma_i} (P, Q) \cdot d\gamma = \int_{\gamma} (P, Q) \cdot d\gamma + \int_{\tau} (P, Q) \cdot d\tau.$$

Figure 10.10. Two cross cuts $\sigma_1$ and $\sigma_2$ subdivide the annulus $\Omega$ into two simply connected regions, $\Omega_1$ and $\Omega_2$. The boundary of $\Omega_1$ is $\gamma_1 - \sigma_2 + \tau_1 - \sigma_1$ and the boundary of $\Omega_2$ is $\gamma_2 + \sigma_1 + \tau_2 + \sigma_2$. When integrating along the boundary of both $\Omega_1$ and $\Omega_2$, the contributions from $\pm\sigma_1$ and $\pm\sigma_2$ cancel, giving the same result as integrating along the boundary of $\Omega$, that is, integrating over $\gamma + \tau$. See Remark 10.5.17 for details and Example 10.5.18 for a specific case.

Example 10.5.18. Consider the annulus $D = \{(x, y) \in \mathbb{R}^2 \mid 1 \le \|(x, y)\| \le 2\}$. Let $\tau$ be the inner boundary circle, oriented clockwise, and let $\gamma$ be the outer boundary circle, oriented counterclockwise. To evaluate the integral
$$\int_{\tau \cup \gamma} y^3\,dx - x^3\,dy,$$
cut the annulus along the x-axis and add new paths $\sigma_1 : [1, 2] \to \mathbb{R}^2$ and $\sigma_2 : [1, 2] \to \mathbb{R}^2$, given by $\sigma_1(t) = (t, 0)$ and $\sigma_2(t) = (-2 + t, 0)$.
Green's theorem applies to each half of the annulus to give
$$\int_{\tau + \gamma} y^3\,dx - x^3\,dy = \left( \int_{\tau_1 + \gamma_1 - \sigma_1 - \sigma_2} y^3\,dx - x^3\,dy \right) + \left( \int_{\tau_2 + \gamma_2 + \sigma_1 + \sigma_2} y^3\,dx - x^3\,dy \right)$$
$$= \iint_{\Omega_1} -3(x^2 + y^2)\,dx\,dy + \iint_{\Omega_2} -3(x^2 + y^2)\,dx\,dy = \iint_{\Omega} -3(x^2 + y^2)\,dx\,dy$$
$$= \int_0^{2\pi} \int_1^2 (-3r^2)\,r\,dr\,d\theta = -45\pi/2.$$
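A quick numerical check of this value, in the same spirit as the sketch after Example 10.5.16 (again not part of the text, with arbitrary grid sizes), makes the orientation of the inner boundary explicit:

```python
import numpy as np

def line_integral(x, y):
    """Approximate  int y^3 dx - x^3 dy  along a polyline."""
    P, Q = y**3, -x**3
    dx, dy = np.diff(x), np.diff(y)
    return np.sum(0.5 * (P[:-1] + P[1:]) * dx + 0.5 * (Q[:-1] + Q[1:]) * dy)

t = np.linspace(0, 2 * np.pi, 200001)
outer = line_integral(2 * np.cos(t), 2 * np.sin(t))   # radius 2, counterclockwise
inner = line_integral(np.cos(-t), np.sin(-t))         # radius 1, clockwise

print(outer + inner, -45 * np.pi / 2)   # both approximately -70.69
```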

Remark 10.5.19. Green's theorem, when combined with the chain rule, can be applied to surfaces in $\mathbb{R}^3$ to give a special case of Stokes' theorem. Assume $\gamma$ is a simple closed curve in the plane with interior $\Omega$, and $\alpha : U \to \mathbb{R}^3$ is a $C^1$ surface with $\overline{\Omega} \subset U$. Let $S = \alpha(\overline{\Omega})$ with boundary $C$ (parametrized by $\alpha(\gamma) = \sigma$). For $F : S \to \mathbb{R}^3$ with $F = (P, Q, R)$, the curl of $F$ is
$$\operatorname{curl}(F) = \nabla \times F = \left( \frac{\partial R}{\partial y} - \frac{\partial Q}{\partial z},\ \frac{\partial P}{\partial z} - \frac{\partial R}{\partial x},\ \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right). \tag{10.3}$$
The curl operator describes the infinitesimal rotation of $F$ at each point. Stokes' theorem states that
$$\int_C F \cdot d\sigma = \iint_S \operatorname{curl}(F) \cdot dS.$$
This theorem follows from a tedious but straightforward application of Green's theorem and the chain rule.

Vista 10.5.20. Both Stokes' and Green's theorems are special cases of a great generalization of the fundamental theorem of calculus. Roughly speaking, when integrating over a parametrized manifold $M$, there is a natural way to differentiate integrands, called exterior differentiation, that turns a $k$-dimensional integrand $\eta$ into a $(k+1)$-dimensional integrand $d\eta$.
The generalized fundamental theorem of calculus can be written as
$$\int_{\partial M} \eta = \int_M d\eta, \tag{10.4}$$
where $\partial M$ is the boundary of $M$. In the case of $k = 1$, if $\eta = P\,dx + Q\,dy$, then $d\eta = \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dx\,dy$, which agrees with Green's theorem. In the case of $k = 2$, if $\eta = F \cdot d\sigma$, then $d\eta = \operatorname{curl}(F) \cdot dS$, so the theorem reduces to Stokes' theorem.
The case where $k = 0$ is just the usual fundamental theorem of calculus, if we make the following fairly natural definitions. Let the zero-dimensional integral $\int_p f$ of a function $f$ at a point $p \in \mathbb{R}$ be $f(p)$, let the integral at the negatively oriented point $-p$ be $\int_{-p} f = -f(p)$, and let $\partial[a, b]$ be the sum of the oriented points $+b$ and $-a$. Finally, for any $C^1$ function $f : [a, b] \to \mathbb{R}$ let $df = f'(t)\,dt$. Putting these together, we have
$$\int_{\partial[a,b]} f = f(b) - f(a) = \int_a^b f'(t)\,dt = \int_{[a,b]} df.$$
For more on this generalization of the fundamental theorem of calculus, see the references listed in the notes for this chapter.

Exercises
Note to the student: Each section of this chapter has several corresponding
exercises, all collected here at the end of the chapter. The exercises between the
first and second line are for Section 1, the exercises between the second and third
lines are for Section 2, and so forth.
You should work every exercise (your instructor may choose to let you skip
some of the advanced exercises marked with *). We have carefully selected them,
and each is important for your ability to understand subsequent material. Many of
the examples and results proved in the exercises are used again later in the text.
Exercises marked with & are especially important and are likely to be used later
in this book and beyond. Those marked with tare harder than average, but should
still be done .
Although they are gathered together at the end of the chapter, we strongly
recommend you do the exercises for each section as soon as you have completed
the section, rather than saving them until you have finished the entire chapter. The
exercises for each section are separated from those for other sections by a
horizontal line.

10.1. Let $H$ be the curve with parametrization $\sigma : [0, 2\pi] \to \mathbb{R}^3$ given by $\sigma(t) = (r\cos(t), r\sin(t), ct)$ for some positive constants $r, c \in \mathbb{R}$. Find the arclength of $H$.
10.2. Prove Proposition 10.1.15.

10.3. Consider a wheel of radius 1 rolling along the x-axis in $\mathbb{R}^2$ without slipping. Fix a point on the outside of the wheel, and let $C$ be the curve swept out by the point.
(i) Find a parametrization $\sigma : \mathbb{R} \to \mathbb{R}^2$ of this curve.
(ii) Find the distance traveled by the point when the wheel makes one full rotation.
10.4. Let X = M 2 (JR) be the Banach space of 2 x 2 real matrices with the 1-norm,
and let C be the curve with parametrization a-: [O, 1] ---+ C, given by

a-(t) = [t ~et e~t] .


For any t E (0, 1) find the unit tangent to C. Find the arclength of C.

10.5. Let $X$, $C$, and $\sigma$ be as in Exercise 10.4. Let $f = \det : X \to \mathbb{R}$ be the determinant function. Evaluate the integral $\int_C f\,ds$.
10.6. Compute the line integral
$$\int_C y(z + 1)\,dx + xz\,dy + xy\,dz,$$
where $C$ is parametrized by $\sigma(\theta) = (\cos(\theta), \sin(\theta), \sin^3(\theta) + \cos^3(\theta))$, with $\theta \in [0, 2\pi]$.
10.7. Compute the line integral
$$\int_C \frac{x\,dy - y\,dx}{x^2 + y^2},$$
where $C$ is a circle of radius $R$ around the origin beginning and ending at $(R, 0)$ and proceeding counterclockwise.
(i) How does changing $R$ change the answer?
(ii) How do the results change if the orientation is reversed?
(iii) How does the result change if $C$ is a different simple closed curve around the origin (not necessarily a circle)? Hint: Note that $F$ is conservative (see Example 10.2.10) in certain parts of the plane. If $u = -\operatorname{Arctan}(x/y)$, then we have $F = Du$ everywhere but on the line $y = 0$, and if we set $v = \operatorname{Arctan}(y/x)$, we have $F = Dv$ everywhere but on the line $x = 0$.
10.8. Prove Proposition 10.2.2.
10.9. Prove Proposition 10.2.8.

10.10. Let $S$ be the surface
$$S = \{(x, y, z) \in \mathbb{R}^3 \mid 9xz - 7y^2 = 0\}.$$

(i) Show that the function $\alpha : \mathbb{R}^2 \to \mathbb{R}^3$, given by $\alpha(t, u) = (t^2, 3ut, 7u^2)$, is $C^1$ on all of $\mathbb{R}^2$ and satisfies $\alpha(t, u) \in S$ for all $(t, u) \in \mathbb{R}^2$.
(ii) Show that $\alpha$ is not injective.
(iii) Find all the points of the domain where $D\alpha$ is not injective.
(iv) Find an open subset $U$ of $\mathbb{R}^2$ such that the point $p = (1, 3, 7) \in S$ lies in $\alpha(U)$ and such that $\alpha$ restricted to $U$ is a parametrized 2-manifold.
(v) Give a basis for the tangent space $T_pS$ of $S$ at $p$.
(vi) Give the unit normal $N$ of $S$ at $p$ with the parametrization $\alpha$.
10.11. Let $\phi : \mathbb{R}^2 \to \mathbb{R}^2$ be given by $(t, u) \mapsto (u, t)$. Let $S$, $p$, $U$, and $\alpha$ be as in the previous problem.
(i) Find a basis of $T_pS$ using the parametrization $\alpha \circ \phi$. Prove that the span of this basis is the same as the span of the basis computed with the parametrization $\alpha$.
(ii) Compute the unit normal at $p$ for the parametrization $\alpha \circ \phi$. Verify that this is the same as the unit normal computed for $\alpha$, except that the sign has changed.
10.12. For each of the parametrizations $\alpha$, $\beta$, and $\sigma$ of Examples 10.3.5 and 10.3.7, compute the unit normal at the point $(1/2, 0, \sqrt{3}/2)$. Verify that it is the same for $\alpha$ and $\beta$, but has the opposite sign for $\sigma$.
10.13. Describe the tangent space $T_pM$ at each point of each of the following manifolds:
(i) A vector subspace of $\mathbb{R}^n$.
(ii) The graph of a smooth function $f : \mathbb{R}^k \to \mathbb{R}$.
(iii) The cylinder parametrized by $(u, v) \mapsto (u, \cos(v), \sin(v))$.

10.14.* Prove Proposition 10.4.6.
10.15.* Prove Proposition 10.4.10.
10.16.* Prove that the surface area of a sphere $S \subset \mathbb{R}^3$ of radius $r$ is $4\pi r^2$ by parametrizing half of the sphere and computing the integral $\int_S dS$.
10.17.* Set up an integral to compute the $n$-dimensional "surface area" of the $n$-sphere of radius $r$ in $\mathbb{R}^{n+1}$. You need not evaluate the integral.
10.18.* Let $M$ be the surface in $\mathbb{R}^4$ defined by the equation $z_2 = z_1^2$ in $\mathbb{C}^2$ (thinking of $\mathbb{C}^2$ as $\mathbb{R}^4$). Compute the surface area of the part of $M$ satisfying $|z_1| \le 1$. Hint: Write the surface in polar coordinates as $\alpha(r, \theta) = (r\cos(\theta), r\sin(\theta), r^2\cos(2\theta), r^2\sin(2\theta))$, with $r \in [0, 1]$.

10.19. Let $C \subset \mathbb{R}^2$ be the simple closed curve described in polar coordinates by $r = \cos(2\theta)$, where $\theta \in [-\pi/4, \pi/4]$, traversed in a counterclockwise direction. Compute the integral
$$\int_C 3y\,dx + x\,dy.$$

10.20. Evaluate the integral $\int_C (x - y^3)\,dx + x^3\,dy$, where $C$ is the unit circle $x^2 + y^2 = 1$ traversed once counterclockwise.
10.21. Use Green's theorem to give another proof of the result in Example 10.2.10 that if $u : \mathbb{R}^2 \to \mathbb{R}$ is $C^2$ and $F = Du$, then for any simple closed curve $C$ in the plane satisfying the hypothesis of Green's theorem and parametrized by $\sigma : [a, b] \to \mathbb{R}^2$ we have
$$\int_C F \cdot d\sigma = 0.$$

10.22. Let $D = \{z \in \mathbb{C} : |z| \le 1\}$ be the unit disk, and let $C$ be the boundary circle $\{(x, y) \mid x^2 + y^2 = 1\}$, traversed once counterclockwise. Let
$$F(x, y) = (P(x, y), Q(x, y)) = \begin{cases} (-y, x)/(x^2 + y^2) & \text{if } (x, y) \neq (0, 0), \\ (0, 0) & \text{otherwise.} \end{cases}$$
Evaluate both of the integrals
$$\iint_D \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \qquad \text{and} \qquad \int_C P\,dx + Q\,dy.$$
Explain why your results do not contradict Green's theorem.


10.23.* Let $\gamma$ be a simple closed curve in the plane, and let $\Omega$ be its interior, with $\gamma \cup \Omega$ contained in an open set $U \subset \mathbb{R}^2$. Assume $\gamma(t) = (x(t), y(t))$ is positively oriented, and let $dn = (y'(t), -x'(t))\,dt$ be the outward-pointing normal. For any $C^1$ vector field $F : U \to \mathbb{R}^2$, let $\nabla \cdot F = \frac{\partial F_1}{\partial x} + \frac{\partial F_2}{\partial y}$. Prove that
$$\iint_{\Omega} \nabla \cdot F = \int_{\gamma} F \cdot dn.$$
10.24.* Using the same assumptions as the previous exercise, let $\nabla \times F = \operatorname{curl}(F)$ be as in (10.3), and let $e_3 = (0, 0, 1)$ be the third standard basis vector. Prove that
$$\iint_{\Omega} (\nabla \times F) \cdot e_3 = \int_{\gamma} F \cdot T\,ds.$$

Notes
The interested reader can find proofs of the Jordan curve theorem in [Bro00, Hal07a, Hal07b, Mae84]. For more on generalizations of the cross product, the exterior algebra, differential forms, and the generalized fundamental theorem of calculus, including its relation to Green's and Stokes' theorems, we recommend the books [HH99] and [Spi65].
Complex Analysis

Between two truths of the real domain, the easiest and shortest path quite often
passes through the complex domain.
-Paul Painleve

In this chapter we discuss properties of differentiable functions of a single complex


variable. You might think that we have completely treated differentiable functions
in Chapter 6 and that complex differentiable functions should just be a special case
of that more general setting, but some remarkable and important things happen
in this special case that do not happen in the general setting. Differentiability in
this setting is a very strong condition that allows us to prove many results about
complex functions that we use in Chapters 12 and 13 to prove some powerful results
about linear operators.
Throughout this chapter, let $(X, \|\cdot\|)$ be a Banach space over $\mathbb{C}$.

11.1 Holomorphic Functions


11.1.1 Differentiation on C
The complex plane $\mathbb{C} = \{x + iy \mid x, y \in \mathbb{R}\}$ is naturally a normed $\mathbb{R}$-vector space, with scalar multiplication $az = a(x + iy) = (ax + iay)$, with vector addition the same as complex addition, $(x_1 + iy_1) + (x_2 + iy_2) = (x_1 + x_2) + i(y_1 + y_2)$, and with norm $|x + iy| = \sqrt{x^2 + y^2}$. There is an obvious bijective map from $\mathbb{C}$ to $\mathbb{R}^2$, given by sending $z = x + iy \in \mathbb{C}$ to $(x, y) \in \mathbb{R}^2$, and this map is an isomorphism of normed $\mathbb{R}$-vector spaces, where we use the 2-norm on $\mathbb{R}^2$. This implies that $\mathbb{C}$ and $\mathbb{R}^2$ have the same topology. Throughout the rest of the book we often use this fact and the associated isomorphism $x + iy \mapsto (x, y)$ without further mention.
Let $U \subset \mathbb{C}$ be an open set in $\mathbb{C}$. The fact that we can think of $\mathbb{C}$ in two ways, as both a two-dimensional real vector space and as a one-dimensional complex vector space, leads to two different notions of the total derivative of a function $f : U \to X$. First, if we consider $\mathbb{C}$ as a two-dimensional real vector space and $X$ as a Banach space over $\mathbb{R}$, the total derivative at each point is a linear transformation from $\mathbb{R}^2$ to $X$. So, for example, if the target $X$ is the real vector space $\mathbb{C} \cong \mathbb{R}^2$, then the total real derivative of $f$ is represented by the $2 \times 2$ Jacobian matrix, provided the partial derivatives exist and are continuous.
But if we consider the domain $\mathbb{C}$ as a one-dimensional complex vector space, then the total derivative at each point is a bounded $\mathbb{C}$-linear transformation from $\mathbb{C}$ to $X$. In this case the total complex derivative at a point $z_0$ is given by
$$f'(z_0) = \lim_{z \to z_0} \frac{f(z) - f(z_0)}{z - z_0} \in X.$$
More precisely, for any $w \in \mathbb{C}$ the $\mathbb{C}$-linear map $Df(z_0) : \mathbb{C} \to X$ is given by $Df(z_0)(w) = w f'(z_0)$, where $w f'(z_0)$ means multiplication of the vector $f'(z_0) \in X$ by the complex scalar $w$ (see Remark 6.2.7). When we say that a complex function is differentiable, we always mean it is differentiable in this second sense, unless otherwise specified.

Definition 11.1.1. Let $f : U \to X$, where $U \subset \mathbb{C}$ is open. We say that the function $f$ is holomorphic on $U$ if it is differentiable (see Definition 6.3.1) at every point of $U$, considering $U$ as an open subset of the one-dimensional complex vector space $\mathbb{C}$; that is, $f$ is holomorphic on $U$ if the limit
$$f'(z_0) = \lim_{z \to z_0} \frac{f(z) - f(z_0)}{z - z_0} = \lim_{h \to 0} \frac{f(z_0 + h) - f(z_0)}{h} \tag{11.1}$$
exists for every $z_0 \in U$, where the limits are taken with $z$ and $h$ in $\mathbb{C}$ (not just in $\mathbb{R}$). If $U$ is not specified, then saying $f$ is holomorphic means that $f$ is holomorphic on all of $\mathbb{C}$. If $f$ is holomorphic on all of $\mathbb{C}$, it is sometimes said to be entire.

Example 11.1.2. We show that $f(z) = z^2$ is a holomorphic function. Let $h = re^{i\theta}$. We compute
$$f'(z_0) = \lim_{h \to 0} \frac{f(z_0 + h) - f(z_0)}{h} = \lim_{r \to 0} \frac{(z_0 + re^{i\theta})^2 - z_0^2}{re^{i\theta}} = \lim_{r \to 0} \frac{2z_0 re^{i\theta} + (re^{i\theta})^2}{re^{i\theta}} = \lim_{r \to 0}\left( 2z_0 + re^{i\theta} \right) = 2z_0.$$
Thus $f(z) = z^2$ is holomorphic with derivative $2z_0$ at $z_0$.

11.1 .2 The Cauchy- Riemann Equations


The condition of being holomorphic is a much stronger condition than being differentiable as a function of two real variables. If $f$ is holomorphic, then it is also differentiable as a map from $\mathbb{R}^2$; but the converse is not true.

The reason that real differentiability does not imply complex differentiability is that in the complex case the linear transformation $Df(z_0) = f'(z_0)$ is given by complex scalar multiplication $w \mapsto w f'(z_0)$, whereas most real linear transformations from $\mathbb{R}^2$ to $X$ are not given by complex scalar multiplication. Of course any $\mathbb{C}$-linear map is $\mathbb{R}$-linear, but the converse is certainly not true, as the next example shows.

Unexample 11.1.3. Consider the function $f(z) = \bar{z}$. We show that this is a real differentiable function of two variables but that it is not holomorphic. Let $z = x + iy$, so that $f(z) = f(x, y) = x - iy$. Considered as a function from $\mathbb{R}^2$ to $\mathbb{R}^2$, this is a differentiable function, and the total derivative is
$$Df(x, y) = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}.$$
For this function to be holomorphic, the limit (11.1) must exist regardless of the direction in which $z$ approaches $z_0$. Suppose $z_0 = 0$, and we let $h = t$ where $t \in \mathbb{R}$. If the derivative $f'(z_0) = \lim_{h \to 0}(f(z_0 + h) - f(z_0))/h$ exists, then it must be equal to
$$\lim_{t \to 0} \frac{f(t) - f(0)}{t} = \lim_{t \to 0} \frac{t - 0}{t} = 1.$$
On the other hand, if $h = it$ with $t \in \mathbb{R}$, then whenever the derivative $f'(z_0) = \lim_{h \to 0}(f(z_0 + h) - f(z_0))/h$ exists, it must be equal to
$$\lim_{t \to 0} \frac{f(it) - 0}{it} = \lim_{t \to 0} \frac{-it - 0}{it} = -1.$$
Since these two real limits are not equal, the complex limit does not exist, and the function cannot be holomorphic.
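The directional behavior in Example 11.1.2 and Unexample 11.1.3 is easy to see numerically. The following is a small illustrative sketch, not part of the text; the base point, step size, and directions are arbitrary choices. The difference quotients of $z^2$ agree in every direction, while those of $\bar z$ depend on the direction of approach.

```python
import numpy as np

def quotient(f, z0, direction, h=1e-6):
    """Difference quotient (f(z0 + h*d) - f(z0)) / (h*d) along a unit direction d."""
    step = h * direction
    return (f(z0 + step) - f(z0)) / step

z0 = 1.0 + 2.0j
directions = [1, 1j, np.exp(1j * np.pi / 4)]   # real, imaginary, diagonal approach

for d in directions:
    print(quotient(lambda z: z**2, z0, d))     # all approximately 2*z0 = 2+4j

for d in directions:
    print(quotient(np.conj, z0, d))            # direction dependent: 1, -1, -1j
```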

Any $\mathbb{C}$-linear map from $\mathbb{C}$ to $X$ is determined by a choice of $b \in X$, and the map is given by
$$x + iy \mapsto (x + iy)b = xb + iyb.$$
But an $\mathbb{R}$-linear map from $\mathbb{R}^2$ to $X$ is determined by a choice of two elements, $b_1, b_2 \in X$, and the map is given by
$$x + iy \mapsto x b_1 + y b_2.$$
Thus, the $\mathbb{R}$-linear map defined by $b_1, b_2$ is $\mathbb{C}$-linear if and only if $b_2 = i b_1$.
In terms of derivatives of a function $f : \mathbb{C} \to X$, this means that the total real derivative $\left[ \partial f/\partial x \;\; \partial f/\partial y \right]$ can only be $\mathbb{C}$-linear, and hence can only define a complex derivative $f'(z) = \partial f/\partial x = -i\,\partial f/\partial y$, if it satisfies
$$\frac{\partial f}{\partial y} = i\,\frac{\partial f}{\partial x}. \tag{11.2}$$
This important equation is called the Cauchy-Riemann equation. In the special case that $X = \mathbb{C}$, we can write $f(x, y) = u(x, y) + iv(x, y)$, so that $\partial f/\partial x = \partial u/\partial x + i\,\partial v/\partial x$ and $\partial f/\partial y = \partial u/\partial y + i\,\partial v/\partial y$. Expanding (11.2) gives
$$\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y} \qquad \text{and} \qquad \frac{\partial u}{\partial y} = -\frac{\partial v}{\partial x}, \tag{11.3}$$
which, together, are also often called the Cauchy-Riemann equations. These equations are an important way to determine whether a function is holomorphic.
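One convenient way to verify (11.3) for a specific function is symbolically. The following minimal sketch, which is not part of the text and assumes SymPy is available, checks the equations for $u = e^x\cos y$ and $v = e^x\sin y$, the real and imaginary parts of $e^z$ that reappear in Example 11.2.13.

```python
import sympy as sp

x, y = sp.symbols("x y", real=True)
u = sp.exp(x) * sp.cos(y)    # real part of e^z
v = sp.exp(x) * sp.sin(y)    # imaginary part of e^z

# Cauchy-Riemann equations (11.3): u_x = v_y and u_y = -v_x.
eq1 = sp.simplify(sp.diff(u, x) - sp.diff(v, y))
eq2 = sp.simplify(sp.diff(u, y) + sp.diff(v, x))
print(eq1, eq2)   # both 0, so the equations hold everywhere
```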

Theorem 11.1.4 (Cauchy-Riemann). Let $U \subset \mathbb{C}$ be an open set, and let $f : U \to X$. If $f$ is holomorphic on $U$, then it is also real differentiable on $U$, and the partials $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$ exist. Moreover, (11.2) holds for all $(x, y) \in U \subset \mathbb{R}^2$. The converse also holds: if $f$ is real differentiable and the partial derivatives $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$ exist, are continuous, and satisfy (11.2) on $U \subset \mathbb{C}$, then $f$ is holomorphic on $U$.

Proof. Assume that $f'(z_0)$ exists for $z_0 \in U$. Let $z_0 = x_0 + iy_0$ and $z = x + iy_0$, where $(x_0, y_0) \in \mathbb{R}^2$ and $x \in \mathbb{R}$. As $x \to x_0$ we have
$$f'(z_0) = \lim_{z \to z_0} \frac{f(z) - f(z_0)}{z - z_0} = \lim_{x \to x_0} \frac{f(x, y_0) - f(x_0, y_0)}{x - x_0} = \left. \frac{\partial f}{\partial x} \right|_{z = z_0}. \tag{11.4}$$
Similarly, if $z = x_0 + iy$, then as $y \to y_0$, we have
$$f'(z_0) = \lim_{z \to z_0} \frac{f(z) - f(z_0)}{z - z_0} = \lim_{y \to y_0} \frac{f(x_0, y) - f(x_0, y_0)}{i(y - y_0)} = -i \left. \frac{\partial f}{\partial y} \right|_{z = z_0}.$$
Since $f'(z_0)$ exists, these two real limits must exist and be equal, which gives the Cauchy-Riemann equation (11.2). If $X = \mathbb{C}$, we can write $f(x, y) = u(x, y) + iv(x, y)$, expand (11.2), and match the real parts to real parts, and imaginary parts to imaginary parts, to get (11.3).
Conversely, if the real partials of $f$ exist and are continuous, then the total real derivative exists and is given by $Df(x, y) = \left[ \frac{\partial f}{\partial x} \;\; \frac{\partial f}{\partial y} \right]$. By Lemma 6.4.1 (the alternative characterization of the Fréchet derivative) applied to the function $f$ on $U \subset \mathbb{R}^2$ at any $z_0 = x_0 + iy_0 \in U$, for every $\varepsilon > 0$ there exists a $\delta > 0$ such that if $(a, b) \in \mathbb{R}^2$ satisfies $\|(a, b)\|_2 < \delta$, then we have
$$\| f(x_0 + a, y_0 + b) - f(x_0, y_0) - Df(x_0, y_0)(a, b) \| \le \varepsilon \|(a, b)\|_2.$$
Let $g(z) = \frac{\partial f}{\partial x} \in X$. If (11.2) holds, then we have already seen that $Df(x_0, y_0)$ acts on elements $(a, b)$ complex linearly, by sending $(a, b)$ to $(a + bi)g(z_0) \in X$, thought of as an element of $\mathscr{B}(\mathbb{C}; X) \cong X$. But this shows that if $h = a + bi$ has $|h| < \delta$, then
$$\| f(z_0 + h) - f(z_0) - gh \| \le |h|\,\varepsilon,$$
which implies by Lemma 6.4.1 that $g = f'(z_0)$. $\Box$

Corollary 11.1.5. If $U \subset \mathbb{C}$ is open, and if all of the partial derivatives of $u(x + iy) : U \to \mathbb{R}$ and $v(x + iy) : U \to \mathbb{R}$ exist on $U$, are continuous on $U$, and satisfy the Cauchy-Riemann equations (11.3), then the function $f : U \to \mathbb{C}$ given by $f(x + iy) = u(x, y) + iv(x, y)$ is holomorphic on $U$.

Proof. This follows immediately from the previous theorem and the fact that
continuous partials imply differentiability (see Theorem 6.2.14). D

Unexample 11.1.6. The function $f(z) = 1/\bar{z}$ is not holomorphic on any open set. To see this, note that $f = \frac{x + iy}{x^2 + y^2}$, so $u(x, y) = \frac{x}{x^2 + y^2}$ and $v(x, y) = \frac{y}{x^2 + y^2}$, and
$$\frac{\partial u}{\partial x} = \frac{y^2 - x^2}{(x^2 + y^2)^2} \qquad \text{but} \qquad \frac{\partial v}{\partial y} = \frac{x^2 - y^2}{(x^2 + y^2)^2}.$$
These are only equal if $x^2 = y^2$. Similarly
$$\frac{\partial u}{\partial y} = \frac{-2xy}{(x^2 + y^2)^2} \qquad \text{and} \qquad -\frac{\partial v}{\partial x} = \frac{2xy}{(x^2 + y^2)^2}.$$
These are equal only if $xy = 0$. The only point where $x^2 = y^2$ and $xy = 0$ is $x = y = 0$, so the origin is the only point of $\mathbb{C}$ where the Cauchy-Riemann equations could hold; but $f$ is not even defined at the origin. Since the Cauchy-Riemann equations do not hold on any open set, $f$ is not holomorphic on any open set.

The following, somewhat surprising result is a consequence of the Cauchy-Riemann equations and gives an example of how special holomorphic functions are.

Proposition 11.1.7. Let $f$ be $\mathbb{C}$-valued and holomorphic on a path-connected open set $U$. If $\bar{f}$ is holomorphic on $U$ or if $|f|$ is constant on $U$, then $f$ is constant on $U$.

Proof. The proof is Exercise 11.5. $\Box$

11.2 Properties and Examples


In this section we survey some of the basic properties and examples of holomorphic
functions. Probably the most important holomorphic functions are those that are
complex valued-that is, functions from U C C to C. But we also need to think
about functions that take values in more general Banach spaces over C. This is
especially useful in Chapter 12, where we use complex-analytic techniques to develop
results about linear operators.
After describing some of the elementary properties of holomorphic functions,
we prove that every function that can be written as a convergent power series (these
are called analytic functions) is holomorphic. We conclude the section with a proof
of Euler's formula.

11 .2.1 Basic Properties


Remark 11.2.1. Holomorphic functions are continuous because they are differen-
tiable (see Corollary 6.3.8).

Theorem 11.2.2. Let $f, g$ be holomorphic, complex-valued functions on the open set $U \subset \mathbb{C}$. The following hold:

(i) $af + bg$ is holomorphic on $U$ and $(af + bg)' = af' + bg'$, where $a, b \in \mathbb{C}$.

(ii) $fg$ is holomorphic on $U$ and $(fg)' = f'g + fg'$.

(iii) If $g(z) \neq 0$ on $U$, then $f/g$ is holomorphic on $U$ and
$$\left( \frac{f}{g} \right)' = \frac{f'g - fg'}{g^2}.$$

(iv) Polynomials are holomorphic on all of $\mathbb{C}$ (they are entire) and
$$(a_n z^n + a_{n-1} z^{n-1} + \cdots + a_1 z + a_0)' = n a_n z^{n-1} + (n-1) a_{n-1} z^{n-2} + \cdots + a_1.$$

(v) A rational function
$$\frac{a_n z^n + a_{n-1} z^{n-1} + \cdots + a_1 z + a_0}{b_m z^m + b_{m-1} z^{m-1} + \cdots + b_1 z + b_0}$$
is holomorphic on all of $\mathbb{C}$ except at the roots of the denominator.

Proof. The proof is Exercise 11.6. $\Box$

Theorem 11.2.3 (Chain Rule). Let U, V be open sets in C. If f: U -+ C


and g: V -+ X are holomorphic on U and V, respectively, with f (U) c V, then
the map h : U -+ X given by h(z) = g(f ( z)) is holomorphic on U and satisfies
h'(z) = g'(f(z))f' (z) for all z EU.

Proof. This follows from Theorem 6.4.7. D

Proposition 11.2.4. Let $U \subset \mathbb{C}$ be open and path connected. If $f : U \to X$ is holomorphic on $U$ and $f'(z) = 0$ for all $z \in U$, then $f$ is constant on $U$.

Proof. For any $z_1, z_2 \in U$, let $g : [0, 1] \to U$ be a smooth path satisfying $g(0) = z_1$ and $g(1) = z_2$. From the fundamental theorem of calculus (6.16), we have
$$f(z_2) - f(z_1) = f(g(1)) - f(g(0)) = \int_0^1 f'(g(t))g'(t)\,dt = 0.$$
Thus, we have $f(z_1) = f(z_2)$. Since the choice of $z_1$ and $z_2$ is arbitrary, the function $f$ is constant. $\Box$

11.2.2 Convergent Power Series Are Holomorphic


In this section we consider some basic properties of convergent power series over the complex numbers. Any function that can be written as a convergent power series is called analytic. The main result of this section is the fact (Theorem 11.2.8) that all analytic functions are holomorphic.
One of the most important theorems of complex analysis is the converse to Theorem 11.2.8, that is, if a function $f$ is holomorphic in a neighborhood of a point $z_0$, then $f$ is analytic in some neighborhood of $z_0$. This theorem is a consequence of the famous and powerful Cauchy integral formula, which we discuss in Section 11.4.
For every positive constant $r$ and every $z_0 \in \mathbb{C}$, a polynomial of the form $f_n(z) = \sum_{k=0}^{n} a_k(z - z_0)^k$, with each $a_k \in X$, defines a function in the Banach space $(L^\infty(B(z_0, r); X), \|\cdot\|_{L^\infty})$. The discussion in Section 5.6.3 about convergence of series in a Banach space applies to series of the form $\sum_{k=0}^{\infty} a_k(z - z_0)^k$, and convergence with respect to the norm $\|\cdot\|_{L^\infty}$ is called uniform convergence. Whenever we talk about convergence of series, we mean convergence in this Banach space, unless explicitly stated otherwise. We also continue to use the definition of uniform convergence on an open set $U$ to mean uniform convergence on every compact subset of $U$.
We begin with a useful lemma about convergence of power series in $\mathbb{C}$.

Lemma 11.2.5 (Abel-Weierstrass Lemma). Let $(a_k)_{k \in \mathbb{N}} \subset X$ be a sequence. If there exists an $R > 0$ and $M \in \mathbb{R}$ such that
$$\|a_n\| R^n \le M$$
for every $n \in \mathbb{N}$, then for any positive $r < R$ the series $\sum_{k=0}^{\infty} a_k(z - z_0)^k$ and the series $\sum_{k=0}^{\infty} k a_k(z - z_0)^{k-1}$ both converge uniformly and absolutely on the closed ball $\overline{B}(z_0, r) \subset \mathbb{C}$.

Proof. Let $D$ denote the closed ball $\overline{B}(z_0, r)$ and set $\rho = r/R$. For every $z \in D$ we have
$$\|a_k(z - z_0)^k\| \le \|a_k\| r^k = \|a_k\| R^k \rho^k \le M \rho^k.$$
Thus, on $D$ we have
$$\sum_{k=0}^{\infty} \|a_k(z - z_0)^k\|_{L^\infty} \le \sum_{k=0}^{\infty} M \rho^k.$$
Similarly, for every $z \in D$ we have
$$\|k a_k(z - z_0)^{k-1}\| \le k\|a_k\| r^{k-1} = k\|a_k\| R^{k-1} \rho^{k-1} = k\|a_k\| R^k \frac{\rho^{k-1}}{R} \le kM \frac{\rho^{k-1}}{R}.$$
Thus, on $D$ we have
$$\sum_{k \in \mathbb{N}} \|k a_k(z - z_0)^{k-1}\|_{L^\infty} \le \frac{M}{R} \sum_{k \in \mathbb{N}} k \rho^{k-1}.$$
The series $M \sum_{k=0}^{\infty} \rho^k$ and the series $(M/R) \sum_{k=0}^{\infty} k \rho^{k-1}$ both converge absolutely by the ratio test because $\rho < 1$. Hence $\sum_{k=0}^{\infty} a_k(z - z_0)^k$ and $\sum_{k \in \mathbb{N}} k a_k(z - z_0)^{k-1}$ both converge absolutely on every $\overline{B}(z_0, r) \subset B(z_0, R)$, and therefore they both converge uniformly on $B(z_0, R)$. $\Box$

The following corollary is immediate.

Corollary 11.2.6. If a series $\sum_{k=0}^{\infty} a_k(z - z_0)^k$ does not converge at some point $z_1$, then the series does not converge at any point $z \in \mathbb{C}$ with $|z - z_0| > |z_1 - z_0|$.

Definition 11.2.7. Given a power series $\sum_{k=0}^{\infty} a_k(z - z_0)^k$, the largest real number $R$ such that the series converges uniformly on compact subsets in $B(z_0, R)$ is called the radius of convergence of the series. If the series converges for every $R > 0$, then we say the radius of convergence is $\infty$.

Theorem 11.2.8. Given any power series $f(z) = \sum_{k=0}^{\infty} a_k(z - z_0)^k$ that is uniformly convergent on compact subsets in an open ball $B(z_0, R)$, the function $f$ is holomorphic on $B(z_0, R)$. The power series $g(z) = \sum_{k=1}^{\infty} k a_k(z - z_0)^{k-1}$ also converges uniformly and absolutely on compact subsets in $B(z_0, R)$, and $g(z)$ is the derivative of $f(z)$ on $B(z_0, R)$.

Proof. By definition, the sequence of functions $f_n(z) = \sum_{k=0}^{n} a_k(z - z_0)^k$ converges uniformly on compact subsets to $f(z) = \sum_{k=0}^{\infty} a_k(z - z_0)^k$. Similarly, the sequence of derivatives is $f_n'(z) = \sum_{k=1}^{n} k a_k(z - z_0)^{k-1}$. Since $\sum_{k=0}^{\infty} a_k(z - z_0)^k$ converges on $B(z_0, r)$ for any $0 < r < R$, the individual terms $a_k(z - z_0)^k \to 0$ for every $z \in B(z_0, r)$, and there is a constant $M > 0$ such that $M \ge \|a_k\| r^k$ for every $k$.
Therefore, by the Abel-Weierstrass lemma, the series $\sum_{k \in \mathbb{N}} k a_k(z - z_0)^{k-1}$ converges absolutely to a function $g$ on $\overline{B}(z_0, r - \delta)$ for every $\delta < r$. But since $r < R$ and $\delta < r$ are arbitrary, the series converges absolutely on every closed ball in $B(z_0, R)$, as required.
Finally, observe that the function $g$ is continuous on $B(z_0, R)$, since it is the uniform limit of a sequence of continuous functions. Therefore $g(z) = f'(z)$ by Theorem 6.5.11. $\Box$

Definition 11.2.9. A complex function that can be written as a convergent power


series in some open subset U c C is analytic on U.

Remark 11.2.10. Theorem 11.2.8 says that if f(z) is analytic on U, then f(z)
is holomorphic on U. This gives a large collection of holomorphic functions to work
with. This theorem also guarantees that the derivative of any complex analytic func-
t ion is also analytic, and hence holomorphic. Thus, every complex analytic function
is infinitely differentiable.

Example 11.2.11.

(i) Define the function $e^z$ by the power series
$$e^z = \sum_{n=0}^{\infty} \frac{z^n}{n!}.$$
This series is absolutely convergent on the entire plane because for each $z \in \mathbb{C}$, the real sum $\sum_{n=0}^{\infty} \frac{|z|^n}{n!}$ converges (use the ratio test, for example). Since absolute convergence implies convergence (see Proposition 5.6.13), the series converges. This shows that $e^z$ is holomorphic on all of $\mathbb{C}$.

(ii) Define the function $\sin(z)$ by the power series
$$\sin(z) = \sum_{n=0}^{\infty} \frac{(-1)^n z^{2n+1}}{(2n+1)!}.$$
This is absolutely convergent everywhere because the sum $\sum_{n=0}^{\infty} \frac{|z|^{2n+1}}{(2n+1)!}$ is convergent everywhere (again by the ratio test). This shows that $\sin(z)$ is holomorphic on all of $\mathbb{C}$.

(iii) Define the function $\cos(z)$ by the power series $\cos(z) = \sum_{n=0}^{\infty} \frac{(-1)^n z^{2n}}{(2n)!}$. Again, this is absolutely convergent everywhere, and hence $\cos(z)$ is holomorphic on all of $\mathbb{C}$.

The three series of the previous example are related by Euler's formula.

Proposition 11.2.12 (Euler's Formula). For all $t \in \mathbb{C}$ we have
$$e^{it} = \cos(t) + i\sin(t).$$

Proof. Substituting $it$ into the power series for $e^z$ and separating the even-index and odd-index terms gives
$$e^{it} = \sum_{n=0}^{\infty} \frac{(it)^n}{n!} = \sum_{n=0}^{\infty} \frac{(-1)^n t^{2n}}{(2n)!} + i \sum_{n=0}^{\infty} \frac{(-1)^n t^{2n+1}}{(2n+1)!} = \cos(t) + i\sin(t),$$
where rearranging the terms is justified because the series converges absolutely. $\Box$

More discussion on the implications of Euler's formula is given in Appendix B.1.2.
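Euler's formula can also be checked numerically by truncating the three power series of Example 11.2.11. The following is a minimal sketch, not part of the text; the test point and the truncation at 30 terms are arbitrary choices.

```python
import math

def exp_series(z, terms=30):
    return sum(z**n / math.factorial(n) for n in range(terms))

def sin_series(z, terms=30):
    return sum((-1)**n * z**(2*n + 1) / math.factorial(2*n + 1) for n in range(terms))

def cos_series(z, terms=30):
    return sum((-1)**n * z**(2*n) / math.factorial(2*n) for n in range(terms))

t = 0.7 + 0.3j   # Euler's formula holds for complex t as well
print(exp_series(1j * t))
print(cos_series(t) + 1j * sin_series(t))   # agrees to machine precision
```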

Example 11.2.13. Example 11.2.11(i) shows that the function $f(z) = e^z$ is analytic, and hence holomorphic. Let's check the Cauchy-Riemann equations for $f$. We have $f(x, y) = e^{x+iy} = e^x \cos y + i e^x \sin y$, which gives
$$u(x, y) = e^x \cos y \qquad \text{and} \qquad v(x, y) = e^x \sin y.$$
Taking derivatives yields
$$\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y} = e^x \cos y \qquad \text{and} \qquad \frac{\partial u}{\partial y} = -\frac{\partial v}{\partial x} = -e^x \sin y.$$
Thus, the Cauchy-Riemann equations hold.

Example 11.2.14. Recall that $M_n(\mathbb{C}) = \mathscr{B}(\mathbb{C}^n) = \mathscr{B}(\mathbb{C}^n; \mathbb{C}^n)$ is a Banach space (see Theorem 5.7.1). For any $A \in M_n(\mathbb{C})$ define
$$f(z) = e^{zA} = \sum_{n=0}^{\infty} \frac{(zA)^n}{n!}.$$
An argument similar to that of Example 11.2.11(i) shows that the series converges absolutely on the entire plane, and hence $f(z)$ is holomorphic on all of $\mathbb{C}$. We can use Theorem 11.2.8 to find $f'(z)$: set $z_0 = 0$ and $a_k = \frac{A^k}{k!}$. This gives
$$f'(z) = \sum_{k=1}^{\infty} k\,\frac{A^k}{k!}\,z^{k-1} = \sum_{k=1}^{\infty} A\,\frac{A^{k-1}}{(k-1)!}\,z^{k-1} = \sum_{j=0}^{\infty} A\,\frac{(zA)^j}{j!} = A e^{zA}.$$
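Example 11.2.14 is easy to experiment with numerically. The sketch below is not part of the text; the matrix, the evaluation point, and the truncation level are arbitrary choices, and it assumes NumPy and SciPy are available. It compares a partial sum of the series for $e^{zA}$ with SciPy's expm and checks the derivative formula $f'(z) = Ae^{zA}$ with a difference quotient.

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[0.0, 1.0], [-2.0, -3.0]])
z = 0.5 + 0.25j

def exp_series(z, A, terms=40):
    """Partial sum of sum_n (zA)^n / n!."""
    result = np.zeros_like(A, dtype=complex)
    term = np.eye(A.shape[0], dtype=complex)
    for n in range(terms):
        result += term
        term = term @ (z * A) / (n + 1)
    return result

print(np.max(np.abs(exp_series(z, A) - expm(z * A))))    # essentially zero

h = 1e-6
derivative = (expm((z + h) * A) - expm(z * A)) / h        # difference quotient
print(np.max(np.abs(derivative - A @ expm(z * A))))       # small, of order h
```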

11.3 Contour Integrals


In this section we study integration over contours, which are piecewise-smooth
parametrized curves in C. Although contour integrals are just line integrals of
holomorphic functions over these piecewise-smooth curves, having a holomorphic
integrand gives them useful properties that make them much simpler to evaluate
than arbitrary line integrals in JR 2 .
We continue to assume, throughout, that (X, I · II) is a Banach space over C.

11 .3.1 Contour Integration

Definition 11.3.1. A parametrized contour $\Gamma \subset \mathbb{C}$ is a piecewise-smooth curve in $\mathbb{C}$. That is to say, $\Gamma$ consists of a sequence of $C^1$-parametrized curves $\gamma_1 : [a_1, b_1] \to \mathbb{C}$, $\gamma_2 : [a_2, b_2] \to \mathbb{C}$, \ldots, $\gamma_n : [a_n, b_n] \to \mathbb{C}$ such that the endpoint of one curve is the initial point of the next:
$$\gamma_j(b_j) = \gamma_{j+1}(a_{j+1}) \qquad \text{for } j = 1, 2, \ldots, n-1.$$
It is common to denote such a contour by $\Gamma = \sum_{j=1}^{n} \gamma_j$. We also often write $-\Gamma$ to denote the contour traversed in the opposite direction. Another common name for a contour is a path.

Definition 11.3.2. Let $U \subset \mathbb{C}$ be an open set, and let $f : U \to X$ be a continuous function. Let $\Gamma = \sum_{j=1}^{n} \gamma_j$ be a contour, as in the previous definition, with $\gamma_j(t) = x_j(t) + iy_j(t)$ for each $j \in \{1, \ldots, n\}$. The contour integral $\int_{\Gamma} f(z)\,dz$ of $f$ over $\Gamma = \sum_{j=1}^{n} \gamma_j$ is the sum of the line integrals
$$\int_{\Gamma} f(z)\,dz = \sum_{j=1}^{n} \int_{\gamma_j} f(z)\,dz = \sum_{j=1}^{n} \int_{a_j}^{b_j} f(\gamma_j(t))\gamma_j'(t)\,dt \tag{11.5}$$
$$= \sum_{j=1}^{n} \int_{a_j}^{b_j} f(\gamma_j(t))\,(dx + i\,dy) = \sum_{j=1}^{n} \int_{a_j}^{b_j} f(\gamma_j(t))(x_j'(t) + iy_j'(t))\,dt.$$

Remark 11.3.3. A contour integral can be computed in two equivalent ways: either as $\int_{\Gamma} f(z)\,dz$, which is a line integral in $\mathbb{C}$, or as $\int_{\Gamma} f(\gamma(t))\,(dx + i\,dy)$, which is a line integral in $\mathbb{R}^2$.

Remark 11.3.4. A contour integral does not usually represent an area or a vol-
ume, but it often represents a physical quantity like work or energy. Even when
the contour integral is not obviously being used to model a physical or geometric
quantity, it provides an important tool for working with holomorphic functions. We
use contour integrals repeatedly throughout this and the following chapters. One
of the important themes of Chapter 12 is that contour integrals are very useful for
understanding and manipulating linear operators.

Lemma 11.3.5. Let $\gamma : [0, 2\pi] \to \mathbb{C}$ be the circle centered at $z_0$ of radius $r$, given by $\gamma(\theta) = z_0 + re^{i\theta}$. For any $n \in \mathbb{Z}$ we have
$$\int_{\gamma} (z - z_0)^n\,dz = \begin{cases} 2\pi i, & n = -1, \\ 0 & \text{otherwise.} \end{cases}$$

Proof. Setting $f(z) = (z - z_0)^n$, we have
$$\int_{\gamma} f(z)\,dz = \int_0^{2\pi} f(\gamma(\theta))\gamma'(\theta)\,d\theta = \int_0^{2\pi} (z_0 + re^{i\theta} - z_0)^n (ire^{i\theta})\,d\theta = \int_0^{2\pi} r^n e^{in\theta} (ire^{i\theta})\,d\theta = ir^{n+1} \int_0^{2\pi} e^{i(n+1)\theta}\,d\theta.$$
If $n \neq -1$, then
$$\int_{\gamma} f(z)\,dz = ir^{n+1} \int_0^{2\pi} e^{i(n+1)\theta}\,d\theta = \frac{r^{n+1}}{n+1}\left( e^{2\pi i(n+1)} - e^0 \right) = 0,$$
and if $n = -1$, then $\int_{\gamma} f(z)\,dz = ir^0 \int_0^{2\pi} e^0\,d\theta = 2\pi i$. $\Box$
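Lemma 11.3.5 is easy to confirm numerically by discretizing the circle. The following is a minimal sketch, not part of the text; the center, radius, and grid size are arbitrary choices.

```python
import numpy as np

z0, r = 1.0 + 1.0j, 0.5
theta = np.linspace(0, 2 * np.pi, 100001)
z = z0 + r * np.exp(1j * theta)
dz = np.diff(z)
zmid = 0.5 * (z[:-1] + z[1:])   # midpoint rule along the circle

for n in range(-3, 3):
    integral = np.sum((zmid - z0)**n * dz)
    print(n, integral)   # approximately 2*pi*i for n = -1, and 0 otherwise
```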

Remark 11.3.6. Since the contour integral is a line integral in $\mathbb{R}^2$, it is independent of parametrization, but it does depend on the orientation of the contour. If $P$ is an oriented curve (without specific parametrization) in $\mathbb{C}$, it is common to write $\int_P f(z)\,dz$. Of course, to actually compute the contour integral directly from the definition requires that some parametrization of $P$ be chosen.
If $P$ is a closed curve in $\mathbb{C}$ with no parametrization or orientation given, it is common to assume that the orientation is counterclockwise. In this case we often write $\oint_P f(z)\,dz$ instead of $\int_P f(z)\,dz$. But if the contour is not closed, then the notions of "clockwise" and "counterclockwise" make no sense, so the orientation of $P$ must be specified.

The next theorem is just the fundamental theorem of calculus applied to contour integrals.

Theorem 11.3.7. If $\Gamma = \sum_{j=1}^{n} \gamma_j$ is a contour with endpoints $z_0$ and $z_1$ and $F$ is a holomorphic function on an open set $U \subset \mathbb{C}$ containing $\Gamma$, then
$$\int_{\Gamma} F'(z)\,dz = F(z_1) - F(z_0).$$

Proof. For each smooth curve $\gamma_j : [a_j, b_j] \to \mathbb{C}$ we have
$$\int_{\gamma_j} F'(z)\,dz = \int_{a_j}^{b_j} F'(\gamma_j(t))\gamma_j'(t)\,dt = \int_{a_j}^{b_j} \frac{d}{dt} F(\gamma_j(t))\,dt = F(\gamma_j(b_j)) - F(\gamma_j(a_j)).$$
Since $\gamma_j(b_j) = \gamma_{j+1}(a_{j+1})$, the sum (11.5) collapses to give $F(\gamma_n(b_n)) - F(\gamma_1(a_1))$. Since $z_0 = \gamma_1(a_1)$ and $z_1 = \gamma_n(b_n)$, we have the desired result. $\Box$

Remark 11.3.8. The theorem above implies that the value of the contour integral $\int_{\Gamma} F'(z)\,dz$ depends only on the endpoints of $\Gamma$ and is independent of the path itself, provided $\Gamma$ lies entirely in $U$.

Corollary 11.3.9. If $\Gamma$ is a closed contour in $\mathbb{C}$ and $F$ is a holomorphic function on an open set $U \subset \mathbb{C}$ containing $\Gamma$, then
$$\int_{\Gamma} F'(z)\,dz = 0.$$

Example 11.3.10. In the case of the integral $\int_{\gamma} (z - z_0)^n\,dz$ in Lemma 11.3.5, if $n \neq -1$, then we have $(z - z_0)^n = \frac{d}{dz} \frac{(z - z_0)^{n+1}}{n+1}$, so the corollary implies $\int_{\gamma} (z - z_0)^n\,dz = 0$.
One might guess that the function $(z - z_0)^{-1}$ should be the derivative of $\log(z - z_0)$, but that is not correct because $\log(z - z_0)$ is not a function on $\mathbb{C}$. To see this, recall that for $x \in (0, \infty) \subset \mathbb{R}$ the function $\log(x)$ is the inverse function of $e^y$; that is, $y = \log(x)$ means that $y$ is the unique number such that $e^y = x$. This is not a problem in $(0, \infty)$ because $e^y$ is injective as a map from $\mathbb{R}$ to $(0, \infty)$, but $e^z$ is not injective as a map from $\mathbb{C}$ to $\mathbb{C}$; for example, $e^{2\pi i k} = e^0 = 1$ for all $k \in \mathbb{Z}$. Since $e^z$ is not injective, it can have no inverse.
We can get around the failure to be injective by restricting the domain (just as with the inverse sine and inverse cosine); but even so, we cannot define $\log(z)$ as a single holomorphic function on a closed contour that encircles the origin, and $(z - z_0)^{-1}$ is not a derivative of any single function on such a contour. This is the reason that the corollary does not force the integral $\int_{\gamma} (z - z_0)^{-1}\,dz$ to vanish.

11 .3.2 The Cauchy- Goursat Theorem


In this section we consider contour integrals where the contour r forms a simple
closed curve in a simply connected set U. When we integrate a function f that
is holomorphic on U , the value of the contour integral is always zero, regardless
of the contour. This theorem, called the Cauchy-Goursat theorem, is one of the
fundamental results in complex analysis.

Theorem 11.3.11 (Cauchy-Goursat Theorem). If $f : U \to X$ is holomorphic on a simply connected set $U \subset \mathbb{C}$, then
$$\int_{\Gamma} f(z)\,dz = 0 \tag{11.6}$$
for any simple closed contour $\Gamma$ in $U$.


Remark 11.3.12. Compare the Cauchy-Goursat theorem to Corollary 11.3.9. This
theorem says that if we add the hypothesis that U is simply connected, then we
no longer need the requirement that the integrand be a derivative of a holomorphic
function- it suffices for the integrand to be holomorphic on U.

Remark 11.3.13. Let $f : U \to X$. If we can write $f = u + iv$, where $u$ and $v$ are $C^1$ functions from $U \subset \mathbb{R}^2$ to $X$ (taken as a Banach space over $\mathbb{R}$), then since $U$ is simply connected, we could use Green's theorem as follows. If $R$ is the region that has $\Gamma$ as its boundary, then
$$\int_{\Gamma} f(z)\,dz = \int_{\Gamma} [u(x, y) + iv(x, y)]\,(dx + i\,dy) = \int_{\Gamma} (u + iv)\,dx + (-v + iu)\,dy$$
$$= \iint_R \left[ \frac{\partial}{\partial x}(-v + iu) - \frac{\partial}{\partial y}(u + iv) \right] dx\,dy = \iint_R \left[ -\left( \frac{\partial u}{\partial y} + \frac{\partial v}{\partial x} \right) + i\left( \frac{\partial u}{\partial x} - \frac{\partial v}{\partial y} \right) \right] dx\,dy.$$
The last integral vanishes by the Cauchy-Riemann equations (Theorem 11.1.4).
Unfortunately, we cannot use Green's theorem here because in general we do not know that $f$ can be written as $u + iv$, where $u$ and $v$ are $C^1$. We show in Corollary 11.4.8 that $u$ and $v$ must, in fact, be $C^\infty$ whenever $f = u + iv$ is holomorphic, but that proof depends on the Cauchy-Goursat theorem, so we cannot use it here.
We give a complete proof of the Cauchy-Goursat theorem in the next section.

Example 11.3.14. Lemma 11.3.5 showed that when $\Gamma$ is a circle around $z_0$ and $n \in \mathbb{N}$, then $\int_{\Gamma} (z - z_0)^n\,dz = 0$. Since $(z - z_0)^n$ is holomorphic on the entire plane when $n \in \mathbb{N}$, this is consistent with the Cauchy-Goursat theorem.
In the case that $n < 0$ the Cauchy-Goursat theorem does not immediately apply, because the function $(z - z_0)^n$ is not continuous and hence not holomorphic at $z = z_0$, and the punctured plane $\mathbb{C} \smallsetminus \{z_0\}$ is not simply connected.

Remark 11.3.15. The Cauchy-Goursat theorem implies that contour integrals of any holomorphic function are path independent on a simply connected domain. That is to say, if $\gamma_1$ and $\gamma_2$ are two contours in a simply connected domain $U$ with the same starting and ending points, then the contour $\gamma_1 - \gamma_2$ is a closed contour in $U$, so
$$0 = \int_{\gamma_1 - \gamma_2} f(z)\,dz = \int_{\gamma_1} f(z)\,dz - \int_{\gamma_2} f(z)\,dz.$$
Hence
$$\int_{\gamma_1} f(z)\,dz = \int_{\gamma_2} f(z)\,dz.$$

Example 11.3.16. Lemma 11.3.5 shows that $\int_{\gamma_0} (z - z_0)^{-1}\,dz = 2\pi i$, where $\gamma_0$ is a circle around $z_0$, traversed once in a counterclockwise direction. Consider now a different path $\gamma_1$ that also wraps once around the point $z_0$, as in Figure 11.1. Adding a segment $\sigma$ (a "cut") from the starting (and ending) point of $\gamma_1$ to the starting (and ending) point of $\gamma_0$, the contour $\Gamma = \gamma_1 + \sigma - \gamma_0 - \sigma$ encloses a simply connected region; hence the integral $\int_{\Gamma} (z - z_0)^{-1}\,dz$ vanishes, and we have $\int_{\gamma_0} (z - z_0)^{-1}\,dz = \int_{\gamma_1} (z - z_0)^{-1}\,dz$, since the contribution from $\sigma$ cancels.

Figure 11.1. If $f$ is a function that is holomorphic everywhere but at $z_0$, the integrals of $f$ over $\gamma_1$ and $\gamma_0$ are the same. To see this, add a little cut $\sigma$ (red) so that the region (shaded blue) bounded by $\gamma_1 + \sigma - \gamma_0 - \sigma$ is simply connected. The integral over $\gamma_1 + \sigma - \gamma_0 - \sigma$ is 0 by the Cauchy-Goursat theorem (Theorem 11.3.11), and the contributions from $\sigma$ and $-\sigma$ cancel.

11.3.3 *Proof of the Cauchy-Goursat Theorem

Remark 11.3.13 shows that when $f = u + iv$ is holomorphic, if $u$ and $v$ are $C^1$, then the Cauchy-Goursat theorem follows from Green's theorem. Goursat was the first to prove this theorem without assuming that $u$ and $v$ are $C^1$. In this section we give a proof of this theorem.

Lemma 11.3.17 (Cauchy-Goursat for "Good" Contours). Let $U \subset \mathbb{C}$ be a simply connected set. As in Definition 9.5.1, for each $k \in \mathbb{N}$ let $Q_k$ be the set of points in $\mathbb{C}$ with (real and imaginary) coordinates of the form $c/2^k$ for $c \in \mathbb{Z}$, and let $\mathcal{C}_k$ be the set of squares of side-length $2^{-k}$ in $\mathbb{C}$ with coordinates in $Q_k$. Let $\Gamma$ be a closed contour in $U$ with interior $D$. Let $\overline{D} = D \cup \Gamma$.
Let $f : U \to X$ be holomorphic on $U$. Assume there exists a $K > 0$ such that for every $k \ge K$ and for every $R \in \mathcal{C}_k$, the intersection $\overline{D} \cap R$ has a finite number of boundary components.
In this case we have
$$\int_{\Gamma} f(z)\,dz = 0. \tag{11.7}$$
Note that the conditions of the lemma hold for every closed contour consisting of a finite sum $\Gamma = \gamma_1 + \cdots + \gamma_n$, where each $\gamma_i$ is linear. We call such paths polygonal.

Proof. Note that the interior $D$ of $\Gamma$ is bounded, so $\overline{D} = D \cup \Gamma$ is compact. Because $f$ is holomorphic on $U$, for each $\varepsilon > 0$ and for each $z \in \overline{D}$ there exists a $\delta_z > 0$ such that $\|f(w) - f(z) - f'(z)(w - z)\|_X \le \varepsilon|z - w|$ whenever $|z - w| < \delta_z$. Moreover, we may assume that $B(z, \delta_z) \subset U$. The collection of all balls $B(z, \delta_z)$ covers $\overline{D}$, so the cover has a Lebesgue number, which we denote by $\delta$ (see Theorem 5.5.11). Choose $m \in \mathbb{N}$ such that $\sqrt{2}/2^m < \delta$. For every point $z$ of $\overline{D}$ and for every $R \in \mathcal{C}_m$ containing $z$, every $w \in R$ satisfies $|w - z| \le \sqrt{2}/2^m < \delta$, so we have
$$\|f(w) - f(z) - f'(z)(w - z)\| \le \varepsilon|z - w|. \tag{11.8}$$
Since $\overline{D}$ is compact, it is covered by a finite number of squares $R_1, \ldots, R_n \in \mathcal{C}_m$. Discard from this list any square that does not meet $\overline{D}$. For each $i \in \{1, \ldots, n\}$ choose a $z_i \in R_i \cap \overline{D}$ and define a new function on $R_i$ by
$$g_i(w) = \begin{cases} 0 & \text{if } w = z_i, \\[4pt] \dfrac{f(w) - f(z_i)}{w - z_i} - f'(z_i) & \text{if } w \neq z_i. \end{cases}$$
By (11.8) we have $\|g_i(w)\| < \varepsilon$, provided $w \in R_i$.
For each $i \in \{1, \ldots, n\}$ let $C_i$ denote the positively oriented boundary of $R_i \cap \overline{D}$. By hypothesis, the contour $C_i$ must consist of only finitely many connected components, and each of them is a closed contour. We have
$$\sum_{i=1}^{n} \int_{C_i} f(z)\,dz = \int_{\Gamma} f(z)\,dz,$$
since any oriented segment of a side of $R_i$ that is not part of $\Gamma$ is also contained in exactly one other $R_j$, but with the opposite orientation in $R_j$, and thus they cancel out in the sum.
By definition of $g_i$ we have
$$\int_{C_i} f(w)\,dw = \int_{C_i} f(z_i) - z_i f'(z_i) + w f'(z_i) + (w - z_i)g_i(w)\,dw.$$
But by Corollary 11.3.9 we have $\int_{C_i} dw = 0$ and $\int_{C_i} w\,dw = 0$, and thus
$$\int_{C_i} f(w)\,dw = \int_{C_i} (w - z_i)g_i(w)\,dw.$$
Summing over all $i \in \{1, \ldots, n\}$ and using the fact that both $|w - z_i| < \sqrt{2}/2^m$ and $\|g_i(w)\| < \varepsilon$ for all $w \in R_i$, we get
$$\left\| \int_{\Gamma} f(w)\,dw \right\| = \left\| \sum_{i=1}^{n} \int_{C_i} (w - z_i)g_i(w)\,dw \right\| < \sum_{i=1}^{n} \varepsilon\,2^{-m}\sqrt{2}\,\bigl(\operatorname{len}(C_i)\bigr).$$
The sum of the lengths of the contours $C_i$ cannot be longer than the length of $\Gamma$ plus the sum of the lengths of all the sides of the $R_i$ (that is, $4 \cdot 2^{-m}$), so we have
$$\left\| \int_{\Gamma} f(w)\,dw \right\| < \varepsilon\,2^{-m}\sqrt{2}\,\bigl(4n\,2^{-m} + \operatorname{len}(\Gamma)\bigr) = \varepsilon\sqrt{2}\,\bigl(4A + 2^{-m}\operatorname{len}(\Gamma)\bigr), \tag{11.9}$$
where $A$ is the total area of all the squares $R_1 \cup \cdots \cup R_n$. Since $\overline{D}$ is compact and measurable, the area $A$ approaches the (finite) measure of $\overline{D}$ as $m \to \infty$. Since $\varepsilon$ is arbitrary, we may make the right side of (11.9) as small as we please, and hence the left side must vanish. $\Box$

To extend the result to a general closed contour we first prove one more lemma.

Lemma 11.3.18. Let $\gamma : [a, b] \to U$ be $C^1$, and let $f$ be holomorphic on $U$. For any $\varepsilon > 0$ there is a polygonal path $\sigma = \sum_{k=1}^{n} \sigma_k \subset U$ with $\sigma(a) = \gamma(a)$ and $\sigma(b) = \gamma(b)$ such that
$$\left\| \int_{\gamma} f(z)\,dz - \int_{\sigma} f(z)\,dz \right\| < \varepsilon.$$

Proof. Since $\gamma$ is compact, the distance $\rho = d(\gamma, U^c)$ to $U^c$ must be positive (see Exercise 5.33). Let $Z$ be the compact set $Z = \{z \in U \mid d(z, \gamma) \le \rho/2\} \subset U$. Since $f$ is holomorphic, it is uniformly continuous on $Z$.
Since $\gamma$ is $C^1$ it has a well-defined arclength $L = \int_{\gamma} |dz|$. For every $\varepsilon > 0$, choose $\delta > 0$ so that for all $z, w \in Z$ we have
$$\|f(z) - f(w)\| < \min(\varepsilon/3L,\ \rho/2),$$
provided $|z - w| < \delta$. The path $\gamma$ is uniformly continuous on $[a, b]$, so there exists an $\eta > 0$ such that
$$|\gamma(t) - \gamma(s)| < \delta$$
for all $s, t \in [a, b]$, provided $|s - t| < \eta$. Finally, $\gamma'$ is uniformly continuous on $[a, b]$, so letting $F = \sup_{t \in [a,b]} \|f(\gamma(t))\|$, there exists $\xi > 0$ so that
$$|\gamma'(t) - \gamma'(s)| < \frac{\varepsilon}{3(b - a)F}, \tag{11.10}$$
whenever $|s - t| < \xi$. Choose a partition $a = t_0 < t_1 < \cdots < t_n = b$ of $[a, b]$ such that $|t_k - t_{k-1}| < \min(\eta, \xi)$ for every $k$.
Let
$$s = \sum_{k=1}^{n} f(\gamma(t_k))\bigl[\gamma(t_k) - \gamma(t_{k-1})\bigr],$$
and let $\sigma_k(t)$ with $t \in [t_{k-1}, t_k]$ be the line segment from $\gamma(t_{k-1})$ to $\gamma(t_k)$; that is,
$$\sigma_k(t) = \frac{1}{t_k - t_{k-1}}\bigl[t(\gamma(t_k) - \gamma(t_{k-1})) + t_k\gamma(t_{k-1}) - t_{k-1}\gamma(t_k)\bigr].$$
Taking $\sigma = \sum_{k=1}^{n} \sigma_k(t)$ gives a polygonal path lying entirely in $Z$ with endpoints $\gamma(a)$ and $\gamma(b)$.
We have
$$\left\| \int_{\sigma} f(z)\,dz - s \right\| = \left\| \int_{\sigma} f(z)\,dz - \sum_{k=1}^{n} f(\gamma(t_k))[\gamma(t_k) - \gamma(t_{k-1})] \right\| = \left\| \int_{\sigma} f(z)\,dz - \sum_{k=1}^{n} \int_{\gamma(t_{k-1})}^{\gamma(t_k)} f(\gamma(t_k))\,dz \right\|$$
$$= \left\| \sum_{k=1}^{n} \int_{\gamma(t_{k-1})}^{\gamma(t_k)} \bigl[f(z) - f(\gamma(t_k))\bigr]\,dz \right\| \le \sum_{k=1}^{n} \int_{\gamma(t_{k-1})}^{\gamma(t_k)} \|f(z) - f(\gamma(t_k))\|\,|dz| < \frac{\varepsilon}{3L} \sum_{k=1}^{n} \int_{\gamma(t_{k-1})}^{\gamma(t_k)} |dz| = \frac{\varepsilon}{3}. \tag{11.11}$$
By the mean value theorem (Theorem 6.5.1), for each $k$ there exists a $t_k^* \in [t_{k-1}, t_k]$ such that
$$s = \sum_{k=1}^{n} f(\gamma(t_k))\bigl[\gamma(t_k) - \gamma(t_{k-1})\bigr] = \sum_{k=1}^{n} f(\gamma(t_k))\gamma'(t_k^*)\bigl[t_k - t_{k-1}\bigr]. \tag{11.12}$$
Since $f \circ \gamma$ is continuous, it is Riemann integrable, so after further refining the partition, if necessary, we may assume that
$$\left\| \int_{\gamma} f(z)\,dz - \sum_{k=1}^{n} f(\gamma(t_k))\gamma'(t_k)\bigl[t_k - t_{k-1}\bigr] \right\| < \frac{\varepsilon}{3}. \tag{11.13}$$
Combining (11.10), (11.11), (11.12), and (11.13) with the triangle inequality gives
$$\left\| \int_{\gamma} f(z)\,dz - \int_{\sigma} f(z)\,dz \right\| \le \left\| \int_{\gamma} f(z)\,dz - \sum_{k=1}^{n} f(\gamma(t_k))\gamma'(t_k)(t_k - t_{k-1}) \right\| + \left\| \sum_{k=1}^{n} f(\gamma(t_k))\gamma'(t_k)(t_k - t_{k-1}) - s \right\| + \left\| s - \int_{\sigma} f(z)\,dz \right\|$$
$$< \frac{\varepsilon}{3} + \left\| \sum_{k=1}^{n} f(\gamma(t_k))\bigl[\gamma'(t_k) - \gamma'(t_k^*)\bigr](t_k - t_{k-1}) \right\| + \frac{\varepsilon}{3} \le \frac{2\varepsilon}{3} + F \sum_{k=1}^{n} |\gamma'(t_k^*) - \gamma'(t_k)|(t_k - t_{k-1}) < \frac{2\varepsilon}{3} + F \cdot \frac{\varepsilon}{3(b-a)F}\,(b - a) = \varepsilon. \qquad \Box$$

The proof of Cauchy-Goursat is now straightforward.

Proof of Theorem 11.3.11. Let $\Gamma \subset U$ be a closed contour, with $\Gamma = \sum_{i=1}^{m} \gamma_i$, where $\gamma_1 : [a_1, b_1] \to \mathbb{C}$, $\gamma_2 : [a_2, b_2] \to \mathbb{C}$, \ldots, $\gamma_m : [a_m, b_m] \to \mathbb{C}$ are all $C^1$. For any $\varepsilon > 0$, Lemma 11.3.18 guarantees there are polygonal paths $\sigma_1, \ldots, \sigma_m$ such that $\sigma_i(a_i) = \gamma_i(a_i)$ and $\sigma_i(b_i) = \gamma_i(b_i)$ for every $i$ and such that $\bigl\| \int_{\gamma_i} f(z)\,dz - \int_{\sigma_i} f(z)\,dz \bigr\| < \varepsilon/m$. Letting $\sigma$ be the polygonal path $\sigma = \sum_{i=1}^{m} \sigma_i$ gives
$$\left\| \int_{\Gamma} f(z)\,dz - \int_{\sigma} f(z)\,dz \right\| < \varepsilon.$$
But $\sigma$ is a closed polygonal contour, and therefore Lemma 11.3.17 applies to give
$$\int_{\sigma} f(z)\,dz = 0;$$
hence,
$$\left\| \int_{\Gamma} f(z)\,dz \right\| < \varepsilon.$$
Since this holds for every $\varepsilon > 0$ we must have
$$\left\| \int_{\Gamma} f(z)\,dz \right\| = 0. \qquad \Box$$

11.4 Cauchy's Integral Formula


Cauchy's integral formula is probably the single most important result in all of
complex analysis. Among other things, it implies that holomorphic functions are
infinitely differentiable (smooth)-something that is certainly not true for arbitrary
differentiable functions from JR 2 to JR 2 . It also is used to show that every holomorphic
function can be represented as a convergent power series in a small neighborhood.
That is to say, all holomorphic functions are analytic (see Definition 11.2.9). This
is a very powerful result that vastly simplifies many problems.
Cauchy's integral formula is also useful for proving several other important
results about complex functions throughout this chapter and for proving important
results about linear operators throughout the next chapter.
We continue to assume that (X, I · II) is a Banach space over <C.

11.4.1 Cauchy's Integral Formula

Theorem 11.4.1 (Cauchy's Integral Formula). Let $X$ be a Banach space over $\mathbb{C}$, and let $f : U \to X$ be holomorphic on an open, simply connected domain $U \subset \mathbb{C}$. Let $\gamma$ be a simple closed contour lying entirely in $U$, traversed once counterclockwise. For any $z_0$ in the interior of $\gamma$, we have
$$f(z_0) = \frac{1}{2\pi i} \int_{\gamma} \frac{f(z)}{z - z_0}\,dz. \tag{11.14}$$

Proof. If $\sigma$ is any circle centered at $z_0$ and lying entirely in the interior of $\gamma$, then using the Cauchy-Goursat theorem with the same argument as in Example 11.3.16 and Figure 11.1, we have that
$$\int_{\gamma} \frac{f(z)}{z - z_0}\,dz = \int_{\sigma} \frac{f(z)}{z - z_0}\,dz.$$
Therefore, it suffices to prove the result in the case that $\gamma$ is some circle of sufficiently small radius $r$.
By Lemma 11.3.5 we have
$$f(z_0) = f(z_0)\,\frac{1}{2\pi i} \int_{\gamma} \frac{1}{z - z_0}\,dz = \frac{1}{2\pi i} \int_{\gamma} \frac{f(z_0)}{z - z_0}\,dz,$$
which gives
$$\left\| \frac{1}{2\pi i} \int_{\gamma} \frac{f(z)}{z - z_0}\,dz - f(z_0) \right\| = \left\| \frac{1}{2\pi i} \int_{\gamma} \frac{f(z) - f(z_0)}{z - z_0}\,dz \right\| = \left\| \frac{1}{2\pi i} \int_0^{2\pi} \frac{f(z_0 + re^{it}) - f(z_0)}{re^{it}}\,ire^{it}\,dt \right\|$$
$$= \left\| \frac{1}{2\pi} \int_0^{2\pi} f(z_0 + re^{it}) - f(z_0)\,dt \right\| \le \sup_{|z - z_0| = r} \|f(z) - f(z_0)\|.$$
Since $f$ is holomorphic in $U$, it is continuous at $z_0$; so, for any $\varepsilon > 0$ there is a choice of $\delta > 0$ such that $\|f(z) - f(z_0)\| < \varepsilon$ whenever $|z - z_0| \le \delta$. Therefore, choosing $0 < r < \delta$, we have
$$\left\| \frac{1}{2\pi i} \int_{\gamma} \frac{f(z)}{z - z_0}\,dz - f(z_0) \right\| < \varepsilon.$$
Since this holds for every positive $\varepsilon$, the result follows. $\Box$

Example 11.4.2.
(i) Let $\Gamma$ be any simple closed contour around $0$. Since $e^z$ is holomorphic everywhere, we compute
$$\oint_{\Gamma} \frac{e^z}{z}\,dz = 2\pi i\,e^0 = 2\pi i.$$
(ii) Consider the integral
$$\oint_{\Gamma} \frac{\cos(z)}{z + z^4}\,dz,$$
where $\Gamma$ is a circle of radius strictly less than one around the origin. The function $f(z) = \cos(z)/(1 + z^3)$ is holomorphic away from the three roots of $1 + z^3 = 0$, which all have norm 1. So inside of the open ball $B(0, 1)$ the function $f(z)$ is holomorphic, and $\Gamma$ lies inside of $B(0, 1)$. We now compute
$$\oint_{\Gamma} \frac{\cos(z)}{z + z^4}\,dz = \oint_{\Gamma} \frac{f(z)}{z}\,dz = 2\pi i f(0) = 2\pi i.$$
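As with the earlier contour integrals, (11.14) can be verified numerically. The following is a small sketch, not part of the text; the function, contour, and evaluation point are arbitrary choices, and the circle is parametrized directly rather than treated as a general contour.

```python
import numpy as np

f = np.exp
z0 = 0.3 + 0.4j                     # point enclosed by the contour

theta = np.linspace(0, 2 * np.pi, 100001)
z = 2.0 * np.exp(1j * theta)        # circle of radius 2 about the origin
dz = np.diff(z)
zmid = 0.5 * (z[:-1] + z[1:])

cauchy = np.sum(f(zmid) / (zmid - z0) * dz) / (2j * np.pi)
print(cauchy, f(z0))                # both approximately 1.243 + 0.526j
```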

Corollary 11.4.3 (Gauss's Mean Value Theorem). Let $f$ be holomorphic in a simply connected domain $U$, such that $U$ contains the circle $C = \{z \in \mathbb{C} : |z - z_0| = r\}$ of radius $r$ with center $z_0$. We have
$$f(z_0) = \frac{1}{2\pi} \int_0^{2\pi} f(z_0 + re^{it})\,dt.$$

Proof. Parametrize $C$ in the usual way, $\gamma(t) = z_0 + re^{it}$ for $t \in [0, 2\pi]$, and apply the Cauchy integral formula:
$$f(z_0) = \frac{1}{2\pi i} \oint_C \frac{f(z)}{z - z_0}\,dz = \frac{1}{2\pi i} \int_0^{2\pi} \frac{f(z_0 + re^{it})}{re^{it}}\,ire^{it}\,dt = \frac{1}{2\pi} \int_0^{2\pi} f(z_0 + re^{it})\,dt. \qquad \Box$$

R emark 11.4.4. Gauss's mean value theorem says that the average (mean) value
of a holomorphic function on a circle is the value of the function at the center of
the circle. This is much more precise than the other mean value theorems we have
encountered, which just say that the average value over a set is achieved somewhere
in the set, but do not specify where t hat mean value is attained.

11.4.2 Ri emann's Theorem and Cauchy's Differentiation Formul a


Cauchy's integral formula can be generalized to derivatives as well. As a first step,
we prove Riemann's theorem.

Theorem 11.4.5 (Riemann's Theorem). Let $f$ be continuous on a closed contour $\gamma$. For every positive integer $n$ the function
$$F_n(w) = \oint_{\gamma} \frac{f(z)}{(z - w)^n}\,dz$$
is holomorphic at all $w$ in the complement of $\gamma$ (both the interior and exterior components), and its derivative satisfies
$$F_n'(w) = n F_{n+1}(w).$$

Proof. The result follows immediately from Leibniz's integral rule (Theorem 8.6.9), which allows us to pass the derivative through the integral. For every $n$ we have
$$\frac{d}{dw} F_n(w) = \frac{d}{dw} \oint_{\gamma} \frac{f(z)}{(z - w)^n}\,dz = \oint_{\gamma} \frac{\partial}{\partial w}\,\frac{f(z)}{(z - w)^n}\,dz = \oint_{\gamma} \frac{n f(z)}{(z - w)^{n+1}}\,dz = n F_{n+1}(w). \qquad \Box$$

Corollary 11.4.6 (Cauchy's Differentiation Formula). Let $f$ be holomorphic on an open, simply connected domain $U \subset \mathbb{C}$. Let $\Gamma$ be a simple closed contour lying entirely in $U$, traversed once counterclockwise. For any $w$ in the interior of $\Gamma$, we have
$$f^{(k)}(w) = \frac{k!}{2\pi i} \oint_{\Gamma} \frac{f(z)}{(z - w)^{k+1}}\,dz. \tag{11.15}$$

Proof. Proceed by induction on $k$. The case of $k = 0$ is Cauchy's integral formula (Theorem 11.4.1). Assume that
$$f^{(k-1)}(w) = \frac{(k-1)!}{2\pi i} \oint_{\Gamma} \frac{f(z)}{(z - w)^k}\,dz.$$
Riemann's theorem gives
$$\frac{d}{dw} f^{(k-1)}(w) = \frac{k!}{2\pi i} \oint_{\Gamma} \frac{f(z)}{(z - w)^{k+1}}\,dz,$$
which completes the induction step. $\Box$

Nota Bene 11.4.7. The following, equivalent form of (11.15) is often useful:
$$\oint_{\gamma} \frac{f(z)}{(z - w)^k}\,dz = \frac{2\pi i\, f^{(k-1)}(w)}{(k-1)!} \qquad \text{(for } w \text{ inside } \gamma\text{)}. \tag{11.16}$$

Corollary 11.4.8. If f is a function that is holomorphic on an open set U , then


the derivative off is holomorphic on U , and f is infinitely differentiable on U .

Remark 11.4.9. This last corollary is yet another demonstration of the fact that
holomorphic functions are very special, because there are many functions from IB. 2
to IB. 2 that are differentiable but not twice differentiable, whereas any function
from C to C that is differentiable as a complex function (holomorphic) is also
infinitely differentiable.

Example 11.4.10.

(i) To compute the contour integral
$$\oint_C \frac{\cos(z)}{2z^3}\,dz,$$
where $C$ is any simple closed contour enclosing $0$, use the Cauchy differentiation formula with $f(z) = \cos(z)/2$ to get
$$\oint_C \frac{\cos(z)}{2z^3}\,dz = \frac{2\pi i\, f''(0)}{2!} = -\frac{\pi i}{2}.$$

(ii) Consider the integral
$$\oint_{\Gamma} \frac{\sin(z)}{z^3 + z^5}\,dz,$$
where $\Gamma$ is a circle of radius strictly less than one around the origin. Note that $f(z) = \frac{\sin(z)}{1 + z^2}$ is a holomorphic function inside the circle $\Gamma$ because it can be written as a product of two holomorphic functions $\sin(z)$ and $1/(1 + z^2)$. Therefore, we have
$$\oint_{\Gamma} \frac{\sin(z)}{z^3 + z^5}\,dz = \oint_{\Gamma} \frac{f(z)}{z^3}\,dz = \frac{2\pi i\, f''(0)}{2!} = \pi i \left. \frac{d^2}{dz^2}\,\frac{\sin(z)}{1 + z^2} \right|_{z=0} = 0.$$

(iii) Consider the integral
$$\oint_{\gamma} \frac{1}{(z^2 - 1)^2(z - 1)}\,dz = \oint_{\gamma} \frac{1}{(z - 1)^3(z + 1)^2}\,dz,$$
where $\gamma$ is a circle centered at $0$ with radius strictly greater than $1$. Since the integrand fails to be holomorphic at both $-1$ and $+1$, and since $\gamma$ encloses both $1$ and $-1$, Cauchy's formulas do not apply with the integral as given.
To fix this problem, let $\gamma_1$ and $\gamma_2$ be circles of radius less than $1$ with centers equal to $-1$ and $1$, respectively. Applying the cutting trick used in Figure 11.1 to the contours $\gamma_1 + \gamma_2$ versus $\gamma$ (see Figure 11.2) gives
$$\oint_{\gamma} \frac{1}{(z - 1)^3(z + 1)^2}\,dz = \oint_{\gamma_1 + \gamma_2} \frac{1}{(z - 1)^3(z + 1)^2}\,dz = \oint_{\gamma_1} \frac{1}{(z - 1)^3(z + 1)^2}\,dz + \oint_{\gamma_2} \frac{1}{(z - 1)^3(z + 1)^2}\,dz.$$
But inside $\gamma_1$ the function $f_1 = \frac{1}{(z - 1)^3}$ is holomorphic, so
$$\oint_{\gamma_1} \frac{1}{(z^2 - 1)^2(z - 1)}\,dz = \oint_{\gamma_1} \frac{f_1(z)}{(z + 1)^2}\,dz = 2\pi i\, f_1'(-1) = -\frac{3\pi i}{8}.$$
Similarly, inside $\gamma_2$ the function $f_2 = \frac{1}{(z + 1)^2}$ is holomorphic, so
$$\oint_{\gamma_2} \frac{1}{(z^2 - 1)^2(z - 1)}\,dz = \oint_{\gamma_2} \frac{f_2(z)}{(z - 1)^3}\,dz = \pi i\, f_2''(1) = \frac{3\pi i}{8}.$$
Therefore the integral is equal to $0$.

Figure 11.2. When integrating a function $f$ that is holomorphic everywhere inside of $\gamma$ except at $z_1$ and $z_2$, we can replace the contour $\gamma$ with two smaller circles: $\gamma_1$ around $z_1$ and $\gamma_2$ around $z_2$. To do this, add a short horizontal cut $\tau$ (red) from $\gamma$ to $\gamma_1$ and another $\sigma$ (orange) from $\gamma_1$ to $\gamma_2$. Removing $\sigma$, $\tau$, and the interiors of $\gamma_1$ and $\gamma_2$ from the interior of $\gamma$ leaves a simply connected region (shaded blue), so Cauchy's integral formula guarantees that the integral of $f$ over $\gamma$ is the same as the sum of integrals over $\tau$, the upper half of $\gamma_1$, $\sigma$, $\gamma_2$, $-\sigma$, the lower half of $\gamma_1$, and $-\tau$. But the integrals involving $\tau$ and $\sigma$ cancel, and the two halves of $\gamma_1$ combine to give $\oint_{\gamma} f\,dz = \oint_{\gamma_1} f\,dz + \oint_{\gamma_2} f\,dz$. See Example 11.4.10(iii) for a specific case of this.

11.5 Consequences of Cauchy's Integral Formula


Cauchy's integral formula has many important consequences, including the fact
that every bounded holomorphic function is constant (Liouville's theorem), the
fact that holomorphic functions on a disk attain their maximum norm on the bound-
ary of the disk (the maximum modulus principle), and the fact that every
polynomial with coefficients in C has a root in C (the fundamental theorem of
algebra) . We discuss all three of these in this section.

11.5.1 Liouville's Theorem

Theorem 11.5.1 (Liouville's Theorem). If f : ℂ → X is holomorphic and bounded on all of ℂ, then f is constant.

Proof. Let M ∈ ℝ be such that ‖f(z)‖ < M for all z ∈ ℂ. Choose any z₀ ∈ ℂ and let γ be a circle of radius R centered at z₀. Then

    ‖f′(z₀)‖ = (1/2π) ‖ ∮_γ f(z)/(z − z₀)² dz ‖
             ≤ (1/2π) ∫₀^{2π} ‖f(z₀ + Re^{it})‖ |iRe^{it}| / |Re^{it}|² dt
             ≤ (1/2π) ∫₀^{2π} (M/R) dt = M/R.

This holds for every R > 0, so as R → ∞, we see that ‖f′(z₀)‖ = 0. Thus f is constant by Proposition 11.2.4.  □

Example 11.5.2. The functions sin(z) and cos(z) are bounded if z ∈ ℝ, but since they are not constant, Liouville's theorem guarantees they cannot be bounded on all of ℂ. That is, for every M > 0, there must be a z ∈ ℂ such that |sin(z)| > M, and similarly for cos(z).
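The claim in Example 11.5.2 is easy to see numerically. The following short Python sketch (an illustration, not part of the original text; it assumes only NumPy) evaluates |sin(z)| along the imaginary axis, where it grows like e^y/2:

    import numpy as np

    # |sin(iy)| = sinh(y), so the values below grow without bound,
    # illustrating that sin(z) is unbounded on C (Example 11.5.2).
    for y in [1.0, 5.0, 10.0, 20.0]:
        print(y, abs(np.sin(1j * y)))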

Corollary 11.5.3. If f : ℂ → ℂ is holomorphic on all of ℂ, and if |f| is uniformly bounded away from zero (meaning there is an ε > 0 such that |f(z)| > ε for all z ∈ ℂ), then f is constant.

Proof. If |f| > ε, then |1/f| < 1/ε. Moreover, since f is entire and nonzero, 1/f is also entire. Hence, by Liouville's theorem, 1/f is constant, which implies f is constant.  □

11.5.2 The Fundamental Theorem of Algebra


The fundamental theorem of algebra is a very significant consequence of Liouville's
theorem.

Theorem 11.5.4 (Fundamental Theorem of Algebra). Every nonconstant polynomial over ℂ has at least one root in ℂ.

Proof. Let f be any nonconstant polynomial of degree k > 0, let c be the coefficient of the degree-k term, and let p = f/c, so

    p(z) = z^k + a_{k−1}z^{k−1} + ··· + a₁z + a₀.



It suffices to show that p(z) has a root. Let a = max{|a_{k−1}|, |a_{k−2}|, ..., |a₀|}. If a = 0, then p(z) = z^k and z = 0 is a root of p; thus, we may assume a > 0. Suppose that p(z) has no roots, and let R = max{(k + 1)a, 1}. If |z| ≥ R, then

    |p(z)| ≥ |z|^k − (|a_{k−1}||z|^{k−1} + ··· + |a₁||z| + |a₀|)
           ≥ |z|^k − ka|z|^{k−1}
           = |z|^{k−1}(|z| − ka)
           ≥ |z| − ka ≥ R − ka ≥ a.

Thus, p(z) is uniformly bounded away from zero on the exterior of B(0, R). Since no roots exist for |z| ≤ R, compactness and continuity imply that p(z) is also uniformly bounded away from zero on the interior of B(0, R). By Corollary 11.5.3, we must have that p(z) is constant, which is a contradiction.  □

11.5.3 The Maximum Modulus Principle


Recall that the extreme value theorem (Corollary 5.5.7) guarantees every contin-
uous, real-valued function on a compact set attains its maximum. The maximum
modulus principle is a sort of converse for holomorphic functions. It says that
nonconstant holomorphic functions never achieve their maximum norm on open
sets. That is, there are no holomorphic functions whose norms have a local max-
imum. This may be surprising, since there are lots of nonconstant real functions
that have many local maxima.

Theorem 11.5.5 (Maximum Modulus Principle). Let f be ℂ-valued, holomorphic, and not constant on an open, path-connected set U. The modulus |f(z)| of f never attains its supremum on U.

The proof depends on the following lemma.

Lemma 11.5.6. Let f be holomorphic on an open, path-connected set U. If |f| attains its maximum at z₀ ∈ U, then |f(z)| is constant in every open ball B(z₀, r) whose closure B̄(z₀, r) is entirely contained in U.

Proof. For any r > 0 with B̄(z₀, r) ⊂ U, Gauss's mean value theorem (Corollary 11.4.3) implies

    |f(z₀)| = | (1/2π) ∫₀^{2π} f(z₀ + re^{it}) dt | ≤ (1/2π) ∫₀^{2π} |f(z₀ + re^{it})| dt.

But since |f| attains its maximum at z₀ we have

    |f(z₀)| ≥ |f(z₀ + re^{it})|    (11.17)

for every t ∈ [0, 2π], so

    |f(z₀)| ≥ (1/2π) ∫₀^{2π} |f(z₀ + re^{it})| dt,

and hence equality holds. And thus we see

    0 = (1/2π) ∫₀^{2π} ( |f(z₀)| − |f(z₀ + re^{it})| ) dt.

But by (11.17) the integrand is nonnegative; hence it must be zero. Therefore, |f(z)| = |f(z₀)| whenever |z − z₀| = r. For every ε ≤ r we also have B(z₀, ε) ⊂ B(z₀, r) ⊂ U, so we have |f(z)| = |f(z₀)| for all z with |z − z₀| = ε ≤ r, and thus |f| is constant on all of B(z₀, r) ⊂ U.  □

Proof of Theorem 11.5.5. Assume, by way of contradiction, that |f| attains its maximum at z₀ ∈ U. By Lemma 11.5.6 the modulus |f| is constant in a small neighborhood around z₀. We will show it is constant on all of U.
For any w ∈ U, if |f(w)| < |f(z₀)|, then choose a path γ in U from z₀ to w. Since γ is compact and U^c is closed, the distance ε = d(γ, U^c) = inf{|c − d| : c ∈ γ, d ∉ U} is strictly positive (see Exercise 5.33). Now choose a sequence of points z₁, ..., z_n = w on γ such that |z_i − z_{i−1}| < ε; see Figure 11.3. Let z_k be the first such point for which |f(z_k)| < |f(z₀)|.
The maximum of |f(z)| is |f(z₀)| = |f(z_{k−1})|, and it is attained at z_{k−1}, so by the lemma |f(z)| is constant on B(z_{k−1}, ε). Since z_k ∈ B(z_{k−1}, ε), we have |f(z_k)| = |f(z_{k−1})| = |f(z₀)|, which is a contradiction.
This shows that there are no points w in U for which |f(w)| < |f(z₀)|; hence |f| is constant on U. The theorem now follows from Proposition 11.1.7.  □

Figure 11.3. Representation of the proof of Theorem 11.5.5. If |f| attains its maximum at z₀ (blue) but is smaller at w (red), then we can connect z₀ and w with a path γ, covered by a finite number of ε-balls inside U such that the centers z_i of the balls are less than ε apart. The previous lemma guarantees that |f| is constant on the first ball (blue), and hence on the second, and so forth, all the way to the ball around w (red).

Corollary 11.5.7. If f is ℂ-valued, continuous on a compact set D, and holomorphic in the interior of D, then |f| attains its maximum on the boundary of D.

Proof. Since |f| is continuous on a compact set, it must attain its maximum somewhere on D. If it is constant, |f| attains its maximum at every point of D. If it is not constant, then the maximum modulus principle guarantees the maximum cannot be attained on the interior of D; hence it must be on the boundary.  □

11.6 Power Series and Laurent Series


Power series are a very useful way to represent functions . In this section we show
that every function that is holomorphic on an open set U can be written as a power
series in a neighborhood of any point of U . Moreover, if a function has only isolated
points where it fails to be holomorphic, then in a neighborhood of each of these
isolated points it can be written as a power series with negative powers as well as
positive powers; these series are called Laurent series .

11.6.1 Power Series


The power series expansion of a holomorphic function at a point z0 is given by
taking the limit of its Taylor polynomials at zo. A real function of one variable
may be infinitely differentiable in a neighborhood of a point and yet its Taylor
series at that point might not converge. But the following theorem shows that
holomorphic functions are, once again, much more special than real differentiable
functions because their Taylor series always converge in some open ball.

Theorem 11.6.1. If f is holomorphic in an open neighborhood U of z₀, then f is analytic; that is, in any open ball B(z₀, r) ⊂ U that lies entirely in U it can be written as a power series that converges uniformly on compact subsets. Moreover, the power series expansion of f about z₀ is given by Taylor's formula:

    f(z) = Σ_{k=0}^∞ f^(k)(z₀)/k! (z − z₀)^k.    (11.18)

The expansion (11.18) is called the Taylor expansion or the Taylor series of f at z₀.

Proof. For any 0 < ε < r consider the closed ball D = B̄(z₀, r − ε) ⊂ B(z₀, r) ⊂ U. We need only show that the series in (11.18) converges uniformly on D. Let γ be the circle {w ∈ ℂ : |w − z₀| = r − ε/2} with the usual, counterclockwise orientation. Note that γ lies entirely within B(z₀, r) and completely encloses D.
For any z ∈ D and w ∈ γ, expand 1/(w − z) as a power series in z − z₀ to get

    1/(w − z) = 1/((w − z₀)(1 − (z − z₀)/(w − z₀))) = (1/(w − z₀)) Σ_{k=0}^∞ ((z − z₀)/(w − z₀))^k.    (11.19)

We have |z − z₀| ≤ r − ε and |w − z₀| = r − ε/2 for all w ∈ γ and z ∈ D; so the equality in (11.19) holds, since |(z − z₀)/(w − z₀)| < 1. This implies that the series (11.19)

converges uniformly and absolutely, as a series of functions in z and w on D × γ. Since γ is compact, f(w) is bounded for w ∈ γ, so the series Σ_{k=0}^∞ f(w)(z − z₀)^k/(w − z₀)^{k+1} is also uniformly and absolutely convergent on D × γ.
Combining (11.19) with the Cauchy integral formula gives

    f(z) = (1/2πi) ∮_γ f(w)/(w − z) dw = (1/2πi) ∮_γ Σ_{k=0}^∞ f(w)(z − z₀)^k/(w − z₀)^{k+1} dw.

Since integration is a bounded linear operator, it commutes with uniform limits, and hence with uniformly convergent sums. Thus, we have

    f(z) = Σ_{k=0}^∞ (1/2πi) ∮_γ f(w)(z − z₀)^k/(w − z₀)^{k+1} dw = Σ_{k=0}^∞ f^(k)(z₀)/k! (z − z₀)^k,

where the last equality follows from Cauchy's differentiation formula (11.15).
This shows that the Taylor series for f(z) converges uniformly on any closed ball B̄(z₀, r − ε) contained in any open ball B(z₀, r) ⊂ U.  □

Remark 11.6.2. We have already seen (see Theorem 11.2.8) that analytic func-
tions are holomorphic. The previous theorem guarantees that the converse holds,
so a complex function is holomorphic if and only if it is analytic. Because of this
theorem, many people use the terms holomorphic and analytic interchangeably.
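As a small illustration of Theorem 11.6.1, consider f(z) = 1/(1 + z²). Its nearest singularities are ±i, so its Taylor series about 0 converges only on B(0, 1), even though f is infinitely differentiable on all of ℝ. The following Python sketch (ours, not from the text) compares partial sums of the series Σ_k (−1)^k z^{2k} with f at a few real points:

    # Partial sums of the Taylor series of 1/(1 + z^2) about 0.
    def taylor_partial_sum(z, terms=60):
        return sum((-1)**k * z**(2 * k) for k in range(terms))

    for z in [0.5, 0.9, 1.1]:
        # Inside B(0,1) the partial sums match 1/(1 + z^2); at z = 1.1,
        # outside the radius of convergence, they blow up even though
        # f(1.1) itself is finite.
        print(z, taylor_partial_sum(z), 1.0 / (1.0 + z**2))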

Proposition 11.6.3. A convergent power series expansion

    f(z) = Σ_{k=0}^∞ a_k(z − z₀)^k

around z₀ is unique and is equal to the Taylor series.

Proof. By Theorem 11.2.8 we may differentiate term by term. Differentiating n times and substituting in z = z₀ shows that f^(n)(z₀) = n! a_n. So every expansion in z − z₀ that converges in a neighborhood of z₀ is equal to the Taylor series.  □

Corollary 11.6.4. If f is holomorphic in a path-connected, open set U ⊂ ℂ, and if there exists some z₀ ∈ U such that f^(n)(z₀) = 0 for all n ∈ ℕ, then f(z) = 0 on all of U.

Proof. Assume, by way of contradiction, that w ∈ U satisfies f(w) ≠ 0. Choose a piecewise-smooth path γ from z₀ to w.
The Taylor series of f converges to f in the largest open ball B(z₀, r) contained in U. Since the Taylor series is identically zero, the function f is identically zero on this ball. Just as in the proof of the maximum modulus principle, let ε be the distance from γ to U^c, and choose a sequence of points z₁, z₂, ..., z_n = w along the path γ that are no more than distance ε/2 apart. Each ball B(z_i, ε) is contained in U, and so f has a power series centered at z_i that converges in B(z_i, ε).

Let z_{k+1} be the first point in the sequence for which the power series expansion of f around z_{k+1} is not identically zero. Since f is identically zero on the ball B(z_k, ε), and since z_{k+1} lies in B(z_k, ε), all the derivatives of f vanish at z_{k+1}, so the power series expansion of f around the point z_{k+1} is also identically zero, a contradiction.  □

11.6.2 Zeros of Analytic Functions

Definition 11.6.5. Suppose f is analytic in an open neighborhood of z₀ with Taylor expansion

    f(z) = a_n(z − z₀)^n + a_{n+1}(z − z₀)^{n+1} + ··· = Σ_{k=n}^∞ a_k(z − z₀)^k,

where a_n ≠ 0. We say that f has a zero of order n (or a zero of multiplicity n) at z₀.

Proposition 11.6.6. If f is holomorphic in a neighborhood of a zero z₀ of order n, then f can be factored as f(z) = (z − z₀)^n g(z), where g(z) is a holomorphic function that does not vanish at z₀. Moreover, there is an open neighborhood V around z₀ such that f(z) ≠ 0 for every z ∈ V \ {z₀}.

Proof. It is clear from the Taylor expansion that f factors as (z − z₀)^n g(z) with g analytic. Since g is continuous near z₀, there is a neighborhood V of z₀ where g(z) ≠ 0 if z ∈ V. Since the polynomial (z − z₀)^n vanishes only at z = z₀, the product (z − z₀)^n g(z) = f(z) cannot vanish on V \ {z₀}.  □

Corollary 11.6.7 (Local Isolation of Zeros). If f is holomorphic in a path-connected open set U, and if there is a sequence (z_k)_{k=0}^∞ of distinct points in U converging to any w ∈ U such that f(z_k) = 0 for all k, then f(z) = 0 for all z ∈ U.

Proof. The convergent sequence (z_k)_{k=0}^∞ of zeros must intersect every neighborhood of w; hence a neighborhood with no additional zeros, as described in Proposition 11.6.6, cannot exist. Thus, f must be identically zero on a neighborhood of w. By Corollary 11.6.4, f must be zero on all of U.  □

11.6.3 Laurent Expansions


We must also consider functions, like 1/(z − z₀)² and cos(z)/(z − z₀), that are holomorphic on a punctured neighborhood of some point z₀ but are not holomorphic at z₀.
We can no longer use Taylor series expansions to write these as a convergent power series around z₀, since a convergent power series would be continuous at z₀, and these functions are not continuous at z₀. But if we allow negative powers, we can write the function as a series expansion, and much of the theory of power series still applies.

Theorem 11.6.8 (Laurent Expansion). If f is holomorphic on the annulus A = {z ∈ ℂ : r < |z − z₀| < R} for some z₀ ∈ ℂ (we also allow r = 0 or R = ∞ or both), then the function f can be written in the form

    f(z) = Σ_{n=−∞}^∞ c_n(z − z₀)^n,    (11.20)

where we can decompose (11.20) as a sum of power series

    f(z) = Σ_{n=0}^∞ c_n(z − z₀)^n + Σ_{n=1}^∞ c_{−n} (1/(z − z₀))^n,    (11.21)

and both of these series converge uniformly and absolutely on every closed annulus of the form D_{ρ,ϱ} = {z ∈ ℂ : ρ ≤ |z − z₀| ≤ ϱ} for r < ρ < ϱ < R. Furthermore, if γ is any circle around z₀ with radius r + ε < R, for some ε > 0, then for each integer n the coefficients are given by

    c_n = (1/2πi) ∮_γ f(w)/(w − z₀)^{n+1} dw.    (11.22)

Proof. Choose ε₁, ε₂ > 0 such that

    r < r + ε₁ < ρ < ϱ < r + ε₂ < R.

Let γ₁ be the circle about z₀ of radius r + ε₁, and let γ₂ be the circle about z₀ of radius r + ε₂; see Figure 11.4.
By the usual change-of-path arguments, the integral (11.22) is independent of ε, provided 0 < ε < R − r, so it is the same for ε₁ and ε₂. For each n ∈ ℤ, set

    c_n = (1/2πi) ∮_{γ₁} f(w)/(w − z₀)^{n+1} dw = (1/2πi) ∮_{γ₂} f(w)/(w − z₀)^{n+1} dw.


Figure 11.4. An open annulus A (blue), containing a closed annulus D_{ρ,ϱ} (green), as in Theorem 11.6.8, with r < r + ε₁ < ρ < |z| < ϱ < r + ε₂ < R, and with circles γ₁ and γ₂ of radius r + ε₁ and r + ε₂, respectively.

By Cauchy's integral formula, for any z ∈ D_{ρ,ϱ} we have

    f(z) = (1/2πi) ∮_{γ₂−γ₁} f(w)/(w − z) dw = (1/2πi) ∮_{γ₂} f(w)/(w − z) dw − (1/2πi) ∮_{γ₁} f(w)/(w − z) dw.    (11.23)

Expand f(w)/(w − z) as in (11.19) to get

    f(w)/(w − z) = Σ_{k=0}^∞ f(w)(z − z₀)^k/(w − z₀)^{k+1}.    (11.24)

This converges uniformly and absolutely as a function in w and z on the compact set γ₂ × D_{ρ,ϱ}, so we may integrate the first half of (11.23) term by term to get

    (1/2πi) ∮_{γ₂} f(w)/(w − z) dw = Σ_{k=0}^∞ (1/2πi) ∮_{γ₂} f(w)(z − z₀)^k/(w − z₀)^{k+1} dw = Σ_{k=0}^∞ c_k(z − z₀)^k,

and this series converges uniformly and absolutely on D_{ρ,ϱ}.
Alternatively, we can interchange the roles of z and w in (11.19) to get an expansion for −1/(w − z) = 1/(z − w) as

    1/(z − w) = Σ_{k=0}^∞ (w − z₀)^k/(z − z₀)^{k+1},

which converges uniformly and absolutely as a function in w and z on γ₁ × D_{ρ,ϱ}. Integrating the second half of (11.23) term by term gives

    −(1/2πi) ∮_{γ₁} f(w)/(w − z) dw = Σ_{k=1}^∞ c_{−k}/(z − z₀)^k.

To see that this second series converges uniformly and absolutely on D_{ρ,ϱ}, substitute t = 1/(z − z₀) and use the previous results for power series in t.  □

Proposition 11.6.9. For a given choice of A = {z ∈ ℂ : r < |z − z₀| < R}, the Laurent expansion (11.20) of f is unique.

Proof. The proof is similar to that for Taylor series, but instead of differentiating term by term, write the expansion of f(z)/(z − z₀)^{k+1} and integrate term by term around a contour γ centered at z₀ with radius r + ε in A, using Lemma 11.3.5. The details are Exercise 11.28.  □

Remark 11.6.10. Computing the Laurent expansion of a function is often much


more difficult than computing Taylor expansions. But in many cases finding just a
few terms of the Laurent expansion is sufficient to solve the problem at hand, and
often this can be done without much difficulty.

Example 11.6.11. To find the Laurent expansion of cos(z)/z³ around z₀ = 0, we can expand cos(z) as cos(z) = Σ_{k=0}^∞ (−1)^k z^{2k}/(2k)! and divide by z³ to get

    cos(z)/z³ = Σ_{k=0}^∞ (−1)^k z^{2k−3}/(2k)! = 1/z³ − 1/(2!z) + z/4! − ···.

Example 11.6.12. To compute the Laurent expansion of z/(z² + 1) around z₀ = i, solve for the partial fraction decomposition z/(z² + 1) = a/(z − i) + b/(z + i) to get a = 1/2 = b, so z/(z² + 1) = (1/2)/(z + i) + (1/2)/(z − i). Around z₀ = i the function 1/(z + i) is holomorphic, and expanding it as a power series in z − i gives

    (1/2)/(z + i) = (1/2)/(2i + (z − i)) = (1/(4i))/(1 − i(z − i)/2) = (1/(4i)) Σ_{k=0}^∞ (i/2)^k (z − i)^k.

This converges when |(i/2)(z − i)| < 1 and diverges at (i/2)(z − i) = 1, so the radius of convergence is 2. Thus we have

    z/(z² + 1) = (1/2)/(z − i) + (1/(4i)) Σ_{k=0}^∞ (i/2)^k (z − i)^k.

Since z/(z² + 1) is holomorphic away from z = ±i, the Laurent expansion is valid in the punctured ball B(i, 2) \ {i}.
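Computer algebra systems can check expansions like these. The following sketch (an illustration using SymPy, not part of the original text) reproduces the leading terms of the Laurent expansion in Example 11.6.11 and the expansion of Example 11.6.12 in the variable w = z − i:

    import sympy as sp

    z, w = sp.symbols('z w')

    # Example 11.6.11: leading terms 1/z**3 - 1/(2*z) + z/24 + ...
    print(sp.series(sp.cos(z) / z**3, z, 0, 4))

    # Example 11.6.12: expand z/(z^2 + 1) about i by substituting z = w + i;
    # the result should begin 1/(2*w) - I/4 + w/8 + ...
    expr = (w + sp.I) / ((w + sp.I)**2 + 1)
    print(sp.series(sp.simplify(expr), w, 0, 3))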

11.7 The Residue Theorem


When a function is holomorphic on a simply connected set U, the Cauchy-Goursat
theorem guarantees that the value of its contour integral on any simple closed
contour r is zero. The residue theorem gives a way to calculate the contour integral
when the function fails to be holomorphic at a finite number of points enclosed
by the contour but is holomorphic on the contour. Before we can understand this
theorem, we must first understand the sorts of points that occur where the function
is not holomorphic.

11.7.1 Isolated Singularities

Definition 11.7.1. If f is holomorphic on the set {z ∈ ℂ : 0 < |z − z₀| < R} and not holomorphic at z₀, we say that z₀ is an isolated singularity of f. Let z₀ be an isolated singularity of f, and let Σ_{k=−∞}^∞ c_k(z − z₀)^k be the Laurent expansion on a small punctured neighborhood B(z₀, r) \ {z₀} of z₀.

(i) The series Σ_{k=−∞}^{−1} c_k(z − z₀)^k is the principal part of the Laurent series for f at z₀.

(ii) If the coefficients c_k all vanish for k < 0 (if the principal part is zero), then the singularity is called removable. In this case f can be extended to a holomorphic function on B(z₀, r) using the power series f(z) = Σ_{k=0}^∞ c_k(z − z₀)^k.

(iii) If the principal part has only a finite number of nonzero terms, so that f = Σ_{k=−N}^∞ c_k(z − z₀)^k, with c_{−N} ≠ 0, then z₀ is called a pole of f of order N.

(iv) A pole of first order, meaning that c_k = 0 for all k < −1, and c_{−1} ≠ 0, is called a simple pole.

(v) If the principal part has an infinite number of nonzero terms, then z₀ is an essential singularity of f.

Example 11.7.2.

(i) sin(z)/z has a removable singularity at z₀ = 0, since

        sin(z)/z = (1/z)(z − z³/3! + z⁵/5! − ···) = 1 − z²/3! + z⁴/5! − ···.

(ii) e^z/z² has a pole of order 2 at z₀ = 0 since e^z/z² = (1/z²)(1 + z + z²/2! + ···).

(iii) e^{1/z} = Σ_{k=−∞}^0 z^k/(−k)! has an essential singularity at z = 0.

Example 11.7.3. Assume that f and g are both holomorphic and ℂ-valued in a neighborhood of z₀, that f has a zero at z₀ of order k, and that g has a zero of order ℓ at z₀. We can write f(z) = (z − z₀)^k F(z) and g(z) = (z − z₀)^ℓ G(z), where F and G are both holomorphic and do not vanish at z₀.

(i) If k ≥ ℓ, then f/g is undefined at z₀, but f/g = (z − z₀)^{k−ℓ} F/G away from z₀, and F/G and (z − z₀)^{k−ℓ} are both holomorphic at z₀, so the singularity at z₀ is removable.

(ii) If k < ℓ, then f/g has a pole of order ℓ − k at z₀.

Definition 11.7.4. A function that is holomorphic in an open set U, except for


poles in U, is meromorphic in U. If U is not specified, it is assumed to be all of C .

Example 11.7.5. Rational functions are functions of the form p(z)/q(z), where p, q ∈ ℂ[z]. If p and q have no common zeros and q is not identically zero, then these are meromorphic on ℂ, since their only singularities are isolated and of finite order. If p and q do have common zeros, then dividing out all common factors gives a new rational function a/b that is meromorphic

and that agrees with p/q at all points of ℂ except at the zeros of q. It is standard practice to replace p/q by a/b everywhere but still write p/q to denote the function a/b. We will also do this everywhere without further comment.

11 .7.2 Residues and Winding Numbers


Although the computation of the full Laurent series for a function may be difficult, in many cases it turns out that the only thing we really need from the Laurent series is the coefficient of (z − z₀)^{−1}. The main reason for this is the computation we did in Lemma 11.3.5, which says that the integral of (z − z₀)^k around a simple closed curve γ containing z₀ vanishes unless k = −1.

Definition 11.7.6. Let f be holomorphic on a punctured ball B(z₀, r) \ {z₀} = {z ∈ ℂ : 0 < |z − z₀| < r}, and let γ be a simple closed contour in B(z₀, r) \ {z₀} with z₀ in the interior of γ. The number

    (1/2πi) ∮_γ f(z) dz

is called the residue of f at z₀ and is denoted Res(f, z₀).

Proposition 11.7.7. Let f be holomorphic on a punctured ball B(z₀, r) \ {z₀} = {z ∈ ℂ : 0 < |z − z₀| < r}, with the expansion (11.20). The residue is given by

    Res(f, z₀) = c_{−1}.

Proof. This follows immediately from the fact that the Laurent expansion converges uniformly on compact subsets, so integration commutes with the sums and we get

    Res(f, z₀) = (1/2πi) ∮_γ f(z) dz = (1/2πi) ∮_γ Σ_{n=−∞}^∞ c_n(z − z₀)^n dz
               = (1/2πi) Σ_{n=−∞}^∞ ∮_γ c_n(z − z₀)^n dz = (1/2πi) ∮_γ c_{−1}/(z − z₀) dz = c_{−1}.  □

Proposition 11.7.8. If f has an isolated singularity at z₀, then the following hold:

(i) The singularity at z₀ is removable if and only if lim_{z→z₀} f(z) is finite.

(ii) If lim_{z→z₀} (z − z₀)^k f(z) exists (is finite) for some k ≥ 0, then the singularity is removable or is a pole of order less than or equal to k.

(iii) If lim_{z→z₀} (z − z₀)f(z) exists, then it is equal to the residue:

        Res(f, z₀) = lim_{z→z₀} (z − z₀)f(z).    (11.25)

Proof. This follows from the Laurent expansion; see Exercise 11.29.  □

If the contour γ is closed but not simple, or if it does not contain z₀, then the integral of f around γ depends not only on the coefficient c_{−1} (the residue), but also on the contour itself.
If γ is a simple closed curve traversed once in a counterclockwise direction with z₀ in the interior of γ, then the integral of 1/(z − z₀) around γ is 2πi. A little thought shows that if γ circles around z₀ a total of k times in a counterclockwise direction, then the integral is 2πki. If γ does not enclose the point z₀ at all, there exists a simply connected region containing γ but not z₀ where 1/(z − z₀) is holomorphic. When the function 1/(z − z₀) is holomorphic on a simply connected region containing γ, the Cauchy-Goursat theorem guarantees that the integral over this contour is 0.
These observations motivate the following definition.

Definition 11.7.9. Let γ be a closed contour in ℂ and z₀ ∈ ℂ a point not on the contour γ. The winding number of γ with respect to z₀ is

    I(γ, z₀) = (1/2πi) ∮_γ 1/(z − z₀) dz.    (11.26)

Nota Bene 11.7.10. The winding number essentially counts the total num-
ber of times a closed curve travels counterclockwise around a given point.
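A closed contour given only by sample points can have its winding number estimated directly from (11.26). The sketch below (illustrative Python, not from the text; the helper name is ours) does this for the unit circle traversed twice:

    import numpy as np

    def winding_number(zs, z0):
        # Approximate (1/2 pi i) * integral of dz/(z - z0) with a midpoint
        # sum over the sample points zs of a closed contour, then round.
        dz = np.diff(np.append(zs, zs[0]))
        mid = zs + dz / 2
        val = np.sum(dz / (mid - z0)) / (2j * np.pi)
        return int(round(val.real))

    t = np.linspace(0, 2 * np.pi, 2001)
    gamma = np.exp(2j * t)                  # unit circle traversed twice
    print(winding_number(gamma, 0.0))       # 2
    print(winding_number(gamma, 3.0))       # 0; the point 3 is outside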

Example 11.7.11.

(i) Lemma 11.3.5 and Example 11.3.16 show that for any simple closed contour σ the winding number I(σ, z₀) is 1 if z₀ is contained in σ and is zero otherwise.

(ii) For the curve γ in Figure 11.5 we have I(γ, z₁) = 1, I(γ, z₂) = 2, and I(γ, z₃) = 0.
The next lemma is a straightforward consequence of the previous definitions.

Lemma 11.7.12. Let U be a simply connected open set, and let γ be a closed contour in U. If N(z) = Σ_{k=1}^∞ c_{−k}/(z − z₀)^k is uniformly convergent on compact subsets in the punctured set U \ {z₀}, then we have

    (1/2πi) ∮_γ N(z) dz = Res(N, z₀) I(γ, z₀).

Proof. The proof is Exercise 11.30.  □



Figure 11.5. Examples of winding numbers as described in Definition 11.7.9. In this example I(γ, z₁) = 1, whereas I(γ, z₂) = 2, and I(γ, z₃) = 0.

11.7.3 The Residue Theorem


We are now ready for the main theorem of this section.

Theorem 11.7.13 (Residue Theorem). Let U be a simply connected open set, and let f be holomorphic on all of U, except for a finite number of isolated singularities {z₁, ..., z_n}; that is, f is holomorphic on U \ {z₁, ..., z_n}. Assume γ is a closed curve in U and that no z_i lies on γ. We have

    ∮_γ f(z) dz = 2πi Σ_{i=1}^n Res(f, z_i) I(γ, z_i).    (11.27)

Remark 11.7.14. The idea behind this theorem should be fairly clear: use the usual change-of-path method to replace the contour γ with a sum of little circles Σ_{j=1}^n γ_j, one around each isolated singularity z_j, traversed I(γ, z_j) times. To compute the integral ∮_{γ_j} f(z) dz, just use the Laurent expansion and integrate term by term to get the desired result.
We need to be more careful, however, since there is some sloppiness in the claim about winding numbers matching the number of times the path happens to encircle a given singularity. For the careful proof, it turns out to be easier to use a slightly different approach.

Proof. For each singularity z_j expand f as a convergent Laurent series on a punctured disk around z_j,

    f(z) = Σ_{m=−∞}^∞ c_m^{(j)} (z − z_j)^m,

and consider the principal parts

    N_j(z) = Σ_{m=−∞}^{−1} c_m^{(j)} (z − z_j)^m = Σ_{k=1}^∞ c_{−k}^{(j)}/(z − z_j)^k.

Each N_j converges uniformly on compact subsets of ℂ \ {z_j} and hence is holomorphic on U \ {z_j}.
Let g(z) be the function obtained by subtracting from f the sum of all the principal parts:

    g(z) = f(z) − Σ_{j=1}^n N_j(z).

Near z_j the positive part of the Laurent expansion of f is of the form Σ_{m=0}^∞ c_m^{(j)} (z − z_j)^m, which converges uniformly on compact subsets and defines a holomorphic function on a neighborhood B_j of z_j. In B_j, the principal parts N_ℓ(z) are all holomorphic if ℓ ≠ j, so the function G(z) = Σ_{m=0}^∞ c_m^{(j)} (z − z_j)^m − Σ_{ℓ≠j} N_ℓ is holomorphic on B_j. Since g(z) = G(z) at every point of B_j \ {z_j}, the function g(z) has a removable singularity at z = z_j.
Since this holds for every j ∈ {1, ..., n}, the Cauchy-Goursat theorem gives

    ∮_γ g(z) dz = 0.

Thus, we see

    ∮_γ f(z) dz = Σ_{j=1}^n ∮_γ N_j(z) dz = 2πi Σ_{j=1}^n Res(f, z_j) I(γ, z_j),

where the last equality follows from Lemma 11.7.12.  □

The residue theorem is very powerful, especially when combined with another
method for calculating the residues easily. The next proposition gives one method
for doing this.

Proposition 11.7.15. Assume h is a ℂ-valued function and g and h are both holomorphic in a neighborhood of z₀. If g(z₀) ≠ 0, h(z₀) = 0, and h′(z₀) ≠ 0, then the function g(z)/h(z) has a simple pole at z₀ and

    Res( g(z)/h(z), z₀ ) = g(z₀)/h′(z₀).    (11.28)

Proof. Note that

    lim_{z→z₀} (h(z) − h(z₀))/(z − z₀) = lim_{z→z₀} h(z)/(z − z₀) = h′(z₀) ≠ 0.

Hence

    lim_{z→z₀} (z − z₀)/h(z) = 1/h′(z₀)


Figure 11.6. The D-shaped contour of radius R, as described in Example 11.7.17. The contour D consists of a half circle C (red) and a line segment γ (blue) and contains two of the four poles of 1/(z⁴ + 1). It is not hard to show that as R → ∞ the integral along C goes to 0. Thus, in the limit, the integral along γ is the same as the integral over D, which can be computed using the residue theorem.

exists, and we get

    g(z₀)/h′(z₀) = lim_{z→z₀} (z − z₀) g(z)/h(z) = Res( g/h, z₀ ),

where the last equality follows from Proposition 11.7.8.  □

Example 11.7.16. Let γ be a circle centered at z₀ = 0 with radius 2. From Proposition 11.7.15 we have

    ∮_γ 1/(z² − 1) dz = 2πi [ Res(1/(z² − 1), 1) + Res(1/(z² − 1), −1) ]
                      = 2πi [ 1/(2·1) + 1/(2·(−1)) ] = 0.
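SymPy's residue function gives a quick check of this example (an illustration, not part of the original text):

    import sympy as sp

    z = sp.symbols('z')
    r1 = sp.residue(1 / (z**2 - 1), z, 1)        # 1/2
    r2 = sp.residue(1 / (z**2 - 1), z, -1)       # -1/2
    print(r1, r2, 2 * sp.pi * sp.I * (r1 + r2))  # the contour integral is 0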

Example 11.7.17. Contour-integral techniques are useful for computing real integrals that would otherwise be extremely difficult to evaluate. For example, consider the integral ∫_{−∞}^∞ dx/(x⁴ + 1). It is not difficult to verify that this improper integral converges and that it can be computed using the symmetric limit

    ∫_{−∞}^∞ dx/(x⁴ + 1) = lim_{R→∞} ∫_{−R}^R dx/(x⁴ + 1).

To compute this integral, consider the contour integral over the following D-shaped contour. Let C be the upper half circle of radius R > 2 centered at the origin, as shown in Figure 11.6, let γ be the line segment from (−R, 0) to (R, 0), and let D = C + γ be their sum. We have

    ∮_D dz/(z⁴ + 1) = ∫_C dz/(z⁴ + 1) + ∫_γ dx/(x⁴ + 1) = ∫_C dz/(z⁴ + 1) + ∫_{−R}^R dx/(x⁴ + 1).

From Proposition 11.7.15 we have

    ∮_D dz/(z⁴ + 1) = 2πi [ Res(1/(z⁴ + 1), e^{πi/4}) + Res(1/(z⁴ + 1), e^{3πi/4}) ]
                    = 2πi [ 1/(4(e^{πi/4})³) + 1/(4(e^{3πi/4})³) ]
                    = (πi/2) [ e^{−3πi/4} + e^{−9πi/4} ] = π/√2.

Now consider the integral just over C. Parametrizing C by Re^{it} with t ∈ [0, π] gives

    | ∫_C dz/(z⁴ + 1) | ≤ ∫_0^π R/|R⁴e^{4it} + 1| dt ≤ πR/(R⁴ − 1) → 0 as R → ∞.

Therefore we have

    ∫_{−∞}^∞ dx/(x⁴ + 1) = lim_{R→∞} ( ∫_{−R}^R dx/(x⁴ + 1) + ∫_C dz/(z⁴ + 1) ) = lim_{R→∞} ∮_D dz/(z⁴ + 1) = π/√2.

11.8 *The Argument Principle and Its Consequences


In this section we use the residue theorem to derive three important results that
have many practical applications. The first is the argument principle, which is useful
to determine the number of zeros and poles a given meromorphic function has. The
second result, Rouche's theorem, is useful for determining the locations of zeros of
a complicated analytic function when the location of zeros of a simpler analytic
function is known. Finally, the third is the holomorphic open mapping theorem,
which states that a nonconstant holomorphic function must map an open set to an
open set.

11.8.1 The Argument Principle


The name "argument principle" requires some explanation. The argument arg(z)
of a point z = reilJ is the angle e
between the positive real axis to the vector
representing z. This is not a well-defined function on <C because eilJ = ei(1J+ 2 r.k) for
any integer k.
Although the value of the argument at a point may not be well defined, it
does make sense to consider the change in argument as z traces a closed contour <5.
In fact this is exactly what the winding number !(<5, O) is tracking. That is to say,

2π I(σ, 0) is the total change in angle (argument) for the contour σ. If σ = f ∘ γ, then 2π I(f ∘ γ, 0) is the total change in argument for f ∘ γ, at least if f(w) ≠ 0 for every w ∈ γ.

Theorem 11.8.1 (Zero and Pole Counting Formula). Let U be a simply connected open subset of ℂ, and let γ be a closed contour in U. Let f be a ℂ-valued meromorphic function on U, with poles {w₁, ..., w_n} and zeros {z₁, ..., z_m}, none of which lies on the contour γ. If b₁, ..., b_n are the respective orders of the poles and a₁, ..., a_m are the respective multiplicities of the zeros, then

    ∮_γ f′(z)/f(z) dz = 2πi ( Σ_{j=1}^m a_j I(γ, z_j) − Σ_{i=1}^n b_i I(γ, w_i) ).    (11.29)

Proof. Because f is meromorphic in U, the Laurent series for f at any point z₀ in U has the form

    f(z) = Σ_{j=k}^∞ c_j(z − z₀)^j

for some (finite) integer k. Moreover, we may assume that c_k ≠ 0. This implies that we can factor f as

    f(z) = (z − z₀)^k g(z),

where g is holomorphic and nonvanishing at z₀. If k > 0, then f has a zero of order k at z₀, whereas if k < 0, then f has a pole of order −k.
Setting F(z) = f′(z)/f(z) we get

    F(z) = f′(z)/f(z) = ( k(z − z₀)^{k−1} g(z) + (z − z₀)^k g′(z) ) / ( (z − z₀)^k g(z) )
         = ( k g(z) + (z − z₀) g′(z) ) / ( (z − z₀) g(z) )
         = k/(z − z₀) + g′(z)/g(z).

Since g is holomorphic and nonvanishing at z₀, the term g′(z)/g(z) is also holomorphic near z₀. This shows that the function F(z) has residue k at z₀. The residue theorem now gives (11.29), as desired.  □

Remark 11.8.2. This theorem says the integral (11.29) is always an integer multiple of 2πi. That means we can compute it exactly by just using an approximation that is good enough to identify which multiple of 2πi it must be. If f is holomorphic, a rough numerical approximation that is just good enough to identify the nearest integer multiple of 2πi tells the exact number of zeros of f (with multiplicity) that lie inside of γ. If f is meromorphic, then such a numerical approximation tells the exact difference between the number of zeros and poles inside of γ.
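The following Python sketch (illustrative only; the helper name is ours) carries out the rough numerical approximation described in this remark, approximating (1/2πi) ∮_γ f′(z)/f(z) dz on a circle and rounding to the nearest integer:

    import numpy as np

    def count_zeros(f, fprime, center=0.0, radius=1.0, n=4000):
        # Assumes f is holomorphic with no zeros on the contour.
        t = np.linspace(0.0, 2 * np.pi, n, endpoint=False)
        z = center + radius * np.exp(1j * t)
        dz = 1j * radius * np.exp(1j * t) * (2 * np.pi / n)
        integral = np.sum(fprime(z) / f(z) * dz)
        return int(round((integral / (2j * np.pi)).real))

    # f(z) = z^3 - 1 has all three of its zeros inside |z| = 2.
    print(count_zeros(lambda z: z**3 - 1, lambda z: 3 * z**2, radius=2.0))  # 3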

Corollary 11.8.3. Let U be a simply connected, open subset of ℂ, and let γ be a simple closed contour in U. Fix w ∈ ℂ, and let f be a ℂ-valued holomorphic function on U,

with f(z) ≠ w for every z ∈ γ. If N is the number of solutions of f(z) = w (with multiplicities) that lie within γ, then

    (1/2πi) ∮_γ f′(z)/(f(z) − w) dz = N.

Proof. Let g = f − w and apply the zero and pole counting formula (11.29) to g on γ.  □

Corollary 11.8.4 (Argument Principle). Let U be a simply connected, open subset of ℂ, and let γ be a simple closed contour in U. Let f be a ℂ-valued function, meromorphic on U, with no zeros or poles on the contour γ, with Z zeros (counted with multiplicity) inside of γ, and with P poles (also counted with multiplicity) lying inside of γ. Define a new closed contour f ∘ γ as follows. If γ is parametrized by z(t) with t ∈ [a, b], parametrize a new contour by w(t) = f(z(t)), also with t ∈ [a, b] (or if γ is piecewise smooth, construct f ∘ γ from each of the smooth pieces w(t) = f(z(t)) in the obvious way). In this case, we have

    I(f ∘ γ, 0) = Z − P.

Proof. Since γ is a simple closed contour, the winding number I(γ, z₀) is 1 for every z₀ inside of γ and 0 otherwise. We have

    I(f ∘ γ, 0) = (1/2πi) ∮_{f∘γ} dw/w = (1/2πi) ∮_γ f′(z)/f(z) dz = Z − P,

where the last line follows from the zero and pole counting formula (11.29).  □

Example 11.8.5. Suppose

    f(z) = (z − 7)³ z² / ( (z − 6)³ (z + 2)⁵ (z − 1)² ).

Let's evaluate the integral of f′/f around the circle γ of radius 4, centered at zero, traversed once, oriented counterclockwise. The number of zeros inside γ is 2, and the number of poles inside γ is 2 + 5 = 7, so by Theorem 11.8.1 we have

    ∮_γ f′(z)/f(z) dz = 2πi(2 − 7) = −10πi.



11.8.2 Rouche's Theorem


Like the argument principle, Rouché's theorem is also concerned with the number of zeros of a function, but it approaches this problem from a different angle. Suppose we know that a function f(z) is holomorphic on a contour γ and inside γ. If we perturb f(z) by some other holomorphic function, Rouché's theorem tells us about the number of zeros of the perturbed function inside of γ.

Theorem 11.8.6 (Rouché's Theorem). Let U be a simply connected open subset of ℂ, and let f and g be ℂ-valued holomorphic functions on U. If γ is a simple closed contour in U, and if |f(z)| > |f(z) − g(z)| for every z ∈ γ, then f and g have the same number of zeros, counted with multiplicity, inside of γ.

Proof. Consider the function F(z) = g(z)/f(z). The difference between the number Z of zeros and the number P of poles of F is precisely the difference between the number of zeros of g and the number of zeros of f. We show that this difference is zero.
If f or g has a zero at some z ∈ γ, then the hypothesis |f(z)| > |f(z) − g(z)| could not hold. Therefore, the function F has no poles or zeros on γ.
For all z ∈ γ we have

    |1 − F(z)| = | f(z)/f(z) − g(z)/f(z) | = |f(z) − g(z)| / |f(z)| < 1.

Therefore, the distance from the contour F ∘ γ to 1 is always less than 1, and hence 0 does not lie within the contour F ∘ γ and I(F ∘ γ, 0) = 0. This gives

    0 = I(F ∘ γ, 0) = Z − P.  □

Remark 11.8.7. Sometimes Rouché's theorem is referred to as the "dog-walking" theorem. If you have ever tried to walk a dog on a leash near a lamppost or a tree you have seen this theorem in action. If the leash is short enough, you and the dog walk around the lamppost an equal number of times. But if the leash is too long, the dog may circle the lamppost more or fewer times than you do (and the leash becomes tangled).
In relation to Rouché's theorem, f(z) is your path, g(z) is the dog's path, and the origin is the lamppost. The maximum difference |f(z) − g(z)| is the length of the leash. If the leash never extends to the lamppost, then |f(z)| > |f(z) − g(z)|, and the dog must circle the lamppost the same number of times as you do; see Figure 11.7.

Example 11.8.8.

(i) To find the number of zeros of z⁵ + 8z + 10 inside the unit circle |z| = 1, choose g(z) = z⁵ + 8z + 10 and f(z) = 10. On the unit circle we have |g(z) − f(z)| = |z⁵ + 8z| ≤ 9 < |f(z)| = 10, so Rouché's theorem says the number of zeros of g inside the unit circle is the same as f, that is, none.

Figure 11.7. Rouché's dog walking. Your path (blue) corresponds to the contour f(z) and the dog's path (brown) corresponds to the contour g(z). At the origin is a lamppost. If the leash is not long enough to extend to the lamppost from any point of your path, then you and the dog must circle the lamppost the same number of times; see Remark 11.8.7.

(ii) To find the number of zeros of z⁵ + 8z + 10 inside the circle |z| = 2, choose g(z) = z⁵ + 8z + 10 and f(z) = z⁵. On the circle we have |g(z) − f(z)| = |8z + 10| ≤ 26 < |f(z)| = 32, so Rouché's theorem says the number of zeros of g inside this circle is the same as f, that is, 5. Combining this with the previous result shows that all the zeros of g lie in the annulus {z ∈ ℂ : 1 < |z| < 2}.

(iii) We show that for every R > √5 the function g(z) = e^z − z² + 4 has exactly 1 zero inside the left half semicircle^a S = {z ∈ ℂ : ℜ(z) ≤ 0, |z| = R} ∪ {z = iy ∈ ℂ : y ∈ [−R, R]}. Let f(z) = 4 − z², so that g(z) − f(z) = e^z.
Since R² > 5, on the circular part {z ∈ ℂ : ℜ(z) ≤ 0, |z| = R} we have

        |f(z)| = |4 − z²| ≥ |z|² − 4 = R² − 4 > 1 ≥ e^{ℜ(z)} = |e^z| = |g(z) − f(z)|.

On the vertical line {z = iy ∈ ℂ : y ∈ [−R, R]} we have

        |f(z)| = |y² + 4| ≥ 4 > 1 = |e^{iy}| = |g(z) − f(z)|.

So by Rouché's theorem, g has the same number of zeros as f = −(z − 2)(z + 2) in the semicircle S, namely, one zero.

^a Recall that ℜ(z) refers to the real part of z and ℑ(z) refers to the imaginary part of z.
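Since z⁵ + 8z + 10 is a polynomial, the counts in parts (i) and (ii) can also be checked directly by computing its roots numerically (an illustration, not part of the original text):

    import numpy as np

    roots = np.roots([1, 0, 0, 0, 8, 10])        # roots of z^5 + 8z + 10
    print(np.sum(np.abs(roots) < 1))             # 0 zeros inside |z| = 1
    print(np.sum(np.abs(roots) < 2))             # all 5 zeros inside |z| = 2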

Nota Bene 11.8.9. To use Rouché's theorem, you need to find a suitable function f(z). A good rule of thumb is to consider a function that shares part of the function g(z) but whose zeros are easier to find. For example, in part (i) of the previous example, both f(z) = z⁵ and f(z) = 10 are functions for which it is easy to find the zeros. But you can check that f(z) = z⁵ does not satisfy the hypothesis of the theorem, whereas f(z) = 10 does.
If you happen to choose the wrong function f(z) the first time, just try another.

11.8.3 The Holomorphic Open Mapping Theorem

Recall from Theorem 5.2.3 that continuous functions pull open sets back to open sets, but for continuous functions the image of an open set need not be open. The holomorphic open mapping theorem says that for a nonconstant holomorphic function the image of an open set is always open.

Theorem 11.8.10 (Holomorphic Open Mapping Theorem). If U ⊂ ℂ is open, and if f : U → ℂ is a nonconstant holomorphic function, then the image f(U) is open in ℂ.

Proof. Given any w₀ ∈ f(U), there is, by definition, a z₀ ∈ U such that w₀ = f(z₀). We must show that there is an ε > 0 such that B(w₀, ε) ⊂ f(U).
Consider the function g(z) = f(z) − w₀, which is also holomorphic on U and not constant. It has a zero at z₀. Since zeros of nonconstant holomorphic functions are isolated (see Proposition 11.6.6), there is an open neighborhood V ⊂ U such that the only zero of g(z) in V is z₀. Choose δ > 0 such that the closed ball B̄(z₀, δ) lies entirely inside V.
The circle C_δ = {z ∈ ℂ : |z − z₀| = δ} is compact, and g(z) ≠ 0 on the circle. Let ε > 0 be the minimum value of |g(z)| on C_δ. For any w ∈ B(w₀, ε), let h_w(z) = f(z) − w. For any z ∈ C_δ, we have

    |g(z)| ≥ ε > |w₀ − w| = |g(z) − h_w(z)|.

Therefore, Rouché's theorem guarantees g(z) and h_w(z) have the same number of zeros inside B(z₀, δ). This implies that for every w ∈ B(w₀, ε) there is a point z_w ∈ B(z₀, δ) such that 0 = h_w(z_w) = f(z_w) − w, and thus f(z_w) = w. That is to say, B(w₀, ε) ⊂ f(B(z₀, δ)) ⊂ f(U).  □

Remark 11.8.11. The holomorphic open mapping theorem shows, among other things, that no holomorphic function can map an open set to the real line, since the real line is not open in ℂ.
In particular, the maps ℜ(z), ℑ(z), and |·| cannot be holomorphic, since they all map ℂ to the real line. This also shows that z̄ cannot be holomorphic, since 2ℜ(z) = z + z̄.

Exercises
Note to the student: Each section of this chapter has several corresponding
exercises, all collected here at the end of the chapter. The exercises between the
first and second line are for Section 1, the exercises between the second and third
lines are for Section 2, and so forth.
You should work every exercise (your instructor may choose to let you skip
some of the advanced exercises marked with *). We have carefully selected them,
and each is important for your ability to understand subsequent material. Many of
the examples and results proved in the exercises are used again later in the text.
Exercises marked with & are especially important and are likely to be used later
in this book and beyond. Those marked with t are harder than average, but should
still be done.
Although they are gathered together at the end of the chapter, we strongly
recommend you do the exercises for each section as soon as you have completed
the section, rather than saving them until you have finished the entire chapter. The
exercises for each section are separated from those for other sections by a
horizontal line.

11.1. Identify (and prove) the largest open subset of C where the following functions
are holomorphic:
(i) g(z) = |z|.
(ii) f(z) = z sin(z).
(iii) h(z) = e^{|z|²}.

(iv) k(z) = i~~~ .


11.2. Let

    f(x, y) = ( x^{4/3} y^{5/3} + i x^{5/3} y^{4/3} ) / (x² + y²)   if z ≠ 0,
    f(x, y) = 0                                                     if z = 0.

Show that the Cauchy-Riemann equations hold at z = 0 but that f is not holomorphic at this point. Why doesn't this contradict Theorem 11.1.4?
11.3. Let u = e^{−x}(x sin y − y cos y). Find a v such that f(z) = u + iv is holomorphic.
11.4. Prove that in polar form the Cauchy-Riemann equations can be written

    ∂u/∂r = (1/r) ∂v/∂θ    and    ∂v/∂r = −(1/r) ∂u/∂θ.
11.5. Prove Proposition 11.1.7. Hint: Write f̄ = |f|²/f.

11.6. Prove Theorem 11.2.2. (Hint: For (iv) first show that any monomial az^n is holomorphic by induction on n. Then show Σ_{j=0}^k a_j z^j is holomorphic by induction on k.)
11.7. Can a power series Σ_{k=0}^∞ a_k(z − 2)^k converge at z = 0 but diverge at z = 3? Prove or disprove.
11.8. Consider the series (1 + z)^{−1} = Σ_{k=0}^∞ (−z)^k.
(i) What is the radius of convergence R?
(ii) Find a point on the boundary |z| = R where the series diverges.
(iii) Find a convergent power series expansion for f(z) = log(1 + z) around z = 0 by integrating the series for (1 + z)^{−1} term by term.
(iv) What is the radius of convergence of this new series?
11.9. Let f be holomorphic on an open, connected subset A of ℂ. Define g(z) to be the complex conjugate of f(z̄) on the set Ā = {z̄ : z ∈ A}. Prove that g is holomorphic on Ā, and that g′(z) is the complex conjugate of f′(z̄).

11.10. Show that the length of a contour γ : [a, b] → ℂ is given by ∫_γ |dz|.
11.11. Consider the contour integral ∫_Γ |z| dz, where Γ is the upper half of the circle of radius r centered at the origin oriented clockwise. Compute the integral in two different ways:
(i) Parametrize Γ and calculate the integral from the definition.
(ii) Note that ∫_Γ |z| dz = ∫_Γ r dz. Use the Cauchy-Goursat theorem to show that ∫_Γ r dz = ∫_{−r}^{r} r dt and compute this last integral.
11.12. Evaluate the integral ∮_C (z̄ − z) dz around the circle C = {z ∈ ℂ : |z| = 1}.
11.13. The function f(z) = 1/(z² + 1) fails to be holomorphic at z = ±i. Let γ : [0, 2π] → ℂ be given by γ(t) = re^{it} for some r > 1. Prove that ∮_γ f(z) dz = 0 by following these steps:
(i) Use the Cauchy-Goursat theorem to show that the answer is independent of r > 1.
(ii) Bound the norm of f(z) on γ in terms of r.
(iii) Use this bound to show that the absolute value of the integral is less than ε, if r is large enough. Beware: It generally does not make sense to talk about inequalities of the form ∫_γ |f| dz ≤ ∫_γ |g| dz, since neither term is necessarily real.

11.14. Let γ be the curve 1 + 2e^{it} for t ∈ [0, 2π]. Compute the following:
(i) ∮_γ (z + 1)e^z / z dz.

(ii) J ----za-
(z+l)e" d
I' z.
(iii) ∮_γ cos(z)/(z² + 2) dz.
(iv) ∮_γ cos(z)/(z² + 2)² dz.

(v) J'Y sin( z) d


z -z z.
2

(vi) J sin(z) dz
"!~ .

11.15. Reevaluate the integral of Exercise 11.12 using the Cauchy integral formula. Hint: On C we have 1 = |z|² = z z̄, which implies that z̄ = 1/z on C.
11.16. Reevaluate the integral of Exercise 11.13 using the Cauchy integral formula and changing the contour to two small circles, one around i and one around −i.
11.17. The nth Legendre polynomial P_n(z) is defined to be

    P_n(z) = (1/(2^n n!)) d^n/dz^n (z² − 1)^n.

Prove that

    P_n(z) = (1/2πi) ∮_γ (w² − 1)^n / ( 2^n (w − z)^{n+1} ) dw,

where γ is any simple closed curve containing z.
11.18. Let f(z) be holomorphic such that |f(z)| ≤ M for all z with |z − z₀| ≤ r. Prove that

    |f^(n)(z₀)| ≤ M n! r^{−n}.
11.19. Prove that any function f that is holomorphic on a punctured ball B(z₀, r) \ {z₀} and is continuous at z₀ must, in fact, be holomorphic on the whole ball B(z₀, r), following these steps:
(i) Choose a circle γ around z₀ of radius less than r, and define a function F(z) = (1/2πi) ∮_γ f(w)/(w − z) dw. Prove that F(z) is holomorphic inside of γ.
(ii) Prove that f(z) = F(z) on the interior of γ, except possibly at z₀.
(iii) Show that f(z₀) = F(z₀), so F = f on the interior of γ.

11.20. Liouville's theorem guarantees that all nonconstant holomorphic functions are unbounded on ℂ.
(i) Find a sequence z₁, z₂, ... such that |sin(z_n)| → ∞. Hint: sin(z) = (i/2)(e^{−iz} − e^{iz}).
(ii) Find a sequence w₁, w₂, ... such that |cos(w_n)| → ∞.
(iii) Consider the function f(z) = 1/(z³ + 1). Prove that |f| → 0 as |z| → ∞. This function is not constant. Why isn't that a counterexample to Liouville's theorem?
11.21. A common form of the fundamental theorem of algebra states that any poly-
nomial with coefficients in C of degree n has exactly n roots in C, counted
with multiplicity. Use Theorem 11.5.4 to prove this alternative form of the
fundamental theorem of algebra.
11.22. The minimum modulus principle states that if f is holomorphic and not constant on a path-connected, open set U, and if |f(z)| ≠ 0 for every z ∈ U, then |f| has no minimum on U.
(i) Prove the minimum modulus principle.

(ii) Give an example of a nonconstant, holomorphic function on an open set U such that |f(z₀)| = 0 for some z₀ ∈ U.
(iii) Use your example from the previous step to show why the condition |f| ≠ 0 is necessary in the proof.
11.23. Let D = {z ∈ ℂ : |z| ≤ 1} be the unit disk. Let f : D → D be holomorphic on the open unit ball B(0, 1) ⊂ ℂ, with f(0) = 0. In this exercise we prove that |f′(0)| ≤ 1 and that |f(z)| ≤ |z| for all z ∈ B(0, 1). We further show that if there is any nonzero z₀ ∈ B(0, 1) such that |f(z₀)| = |z₀|, or if |f′(0)| = 1, then f(z) = αz for some α with |α| = 1 (so f is a rotation).
We do this in several steps:
(i) Let g(z) = f(z)/z if z ≠ 0, and g(z) = f′(0) if z = 0. Prove that g is holomorphic away from 0 and continuous at 0.
(ii) Use Exercise 11.19 to deduce that g is holomorphic on all of B(0, 1).
(iii) Show that for any r < 1 the maximum value of |g| on B̄(0, r) must be attained on the circle |z| = r. Use this to show that |f(z)| ≤ |z| on B(0, 1). Hint: Show |g(z)| ≤ 1/r for all r < 1.
(iv) Show that if |f(z₀)| = |z₀| anywhere in B(0, 1) \ {0}, then f(z) = αz for some |α| = 1.
(v) Show that |f′(0)| ≤ 1 and if |f′(0)| = 1, then f(z) = αz with |α| = 1.

11.24. For each of the following functions, find the Laurent expansion on an annulus of the form {z : 0 < |z − z₀| < R} around the specified point z₀, and find the greatest R for which the expansion converges.
(i) z/(z − 1) around z₀ = 0.
(ii) e^z/(z − 1) around z₀ = 1.
(iii) sin(z)/z around z₀ = 0.
3
(iv) ez around zo = 0.
11.25. Prove that if f = Σ_{k=0}^∞ a_k(z − z₀)^k and g = Σ_{j=0}^∞ b_j(z − z₀)^j are ℂ-valued and both have radius of convergence at least r, then the product

    Σ_{k=0}^∞ a_k(z − z₀)^k · Σ_{j=0}^∞ b_j(z − z₀)^j = Σ_{n=0}^∞ ( Σ_{k+j=n} a_k b_j ) (z − z₀)^n

has radius of convergence at least r. Hint: Use Taylor's theorem and induction to show that the nth coefficient of the power series expansion of fg is Σ_{k+j=n} a_k b_j.
11.26. Find the Laurent expansion of the following functions in the indicated region:
(i) z/(z + 1) in the region 0 < |z| < 1.
(ii) e^{z²}/z in the region 0 < |z| < ∞.
(iii) 1/(z(z − 1)(z − 3)²) in the region 0 < |z − 3| < 1.

11.27. Uniqueness of the Laurent expansion depends on the choice of region. That is, different regions for the Laurent expansion give different expansions. Show this by computing the Laurent expansion for f(z) = 1/(z(z − 1)) in the following two regions:
(i) 0 < |z| < 1.
(ii) 1 < |z| < ∞.
11.28. Prove Proposition 11.6.9.

11.29. Prove Proposition 11.7.8.


11.30. Prove Lemma 11.7.12.
11.31. Compute the following residues:
( i)
1- 2z
at z = 0.
z(z - l)(z - 3) 2
(ii)
1 - 2z
at z = l.
z(z - l)(z - 3)2
(iii)
1- 2z
at z = 3.
z(z - l)(z - 3)2
11.32. Find the following integrals:
(i)
_1_ J 1 - 2z dz.
2ni .'fizl=1 z(z - l)(z - 3)2
(ii)
_1_ J 1 - 2z dz .
2ni Jlz-l l=1 z(z - l)(z - 3) 2
(iii)
_1_ J 1 - 2z dz.
2ni .'/fz-31=1 z(z - l)(z - 3) 2
(iv)
_1_ J 1 - 2z dz.
2ni .'fiz 1=4 z(z - l)(z - 3)2
11.33. Find the real integral

    ∫_{−∞}^∞ x²/(1 + x⁴) dx = lim_{R→∞} ∫_{−R}^R x²/(1 + x⁴) dx

by considering the contour integral ∮_γ z²/(1 + z⁴) dz, where γ is the contour consisting of the union of the upper half of the circle of radius R and the real interval [−R, R] (that is, γ = {z ∈ ℂ : R = |z|, ℑ(z) ≥ 0} ∪ [−R, R]). Hint: Bound z²/(1 + z⁴) on the upper half circle and show that this part of the integral goes to 0 as R → ∞.

11.34.* For each function p(z), find the number of zeros inside of the region A without explicitly solving for them.
(i) p(z) = z⁶ + 4z² − 1 and A is the disk |z| < 1.
(ii) p(z) = z³ + 9z + 27 and A is the disk |z| < 2.
(iii) p(z) = z⁶ − 5z² + 10 and A is the annulus 1 < |z| < 2.
(iv) p(z) = z⁴ − z + 5 and A is the first quadrant {z ∈ ℂ : ℜ(z) > 0, ℑ(z) > 0}.
11.35.* Show that ez = 5z 3 - 1 has exactly three solutions in the unit ball B(O, 1).
11.36.* Prove the following extension of the zero and pole counting formula.
Let f be a ℂ-valued meromorphic function on a simply connected open set U with zeros z₁, ..., z_m of multiplicities a₁, ..., a_m, respectively, and poles w₁, ..., w_n of multiplicities b₁, ..., b_n, respectively. If none of the zeros or poles of f lie on a closed contour γ, and if h is holomorphic on U, then

    (1/2πi) ∮_γ h(z) f′(z)/f(z) dz = Σ_{j=1}^m a_j h(z_j) I(γ, z_j) − Σ_{i=1}^n b_i h(w_i) I(γ, w_i).
11.37.* Let γ be the circle |z| = R of radius R (traversed once, counterclockwise). Let f(z) = z^n + a_{n−1}z^{n−1} + ··· + a₁z + a₀, where the coefficients a_k are all complex numbers (and notice that the leading coefficient is 1). Show that

    lim_{R→∞} (1/2πi) ∮_γ (f′(z)/f(z)) z dz = −a_{n−1}

in the following steps:
(i) If λ₁, ..., λ_n are all the zeros of f, show that the sum satisfies

        Σ_{j=1}^n λ_j = −a_{n−1}.

(ii) Use Exercise 11.36 to show that

        lim_{R→∞} (1/2πi) ∮_γ (f′(z)/f(z)) z dz = Σ_{j=1}^n λ_j.

11.38.* Use the holomorphic open mapping theorem to give a new proof of the
maximum modulus principle.

Notes
Much of our treatment of complex analysis is inspired by the books [MH99] and
[Der72] . Sections 11 .6- 11.8 are especially indebted to [MH99]. Other references
include [Ahl78, CB09, SS02, Tayll]. Our proof of the Cauchy- Goursat theorem is
modeled after [CB09, Sect. 47] and [Cos12]. Exercise 11.3 is from [SS02].
Part IV

Linear Analysis II
Spectral Calculus

Trying to make a model of an atom by studying its spectrum is like trying to make a
model of a grand piano by listening to the noise it makes when thrown downstairs.
-British Journal of Radiology

In this chapter, we return to the study of spectral theory for linear operators on
finite-dimensional vector spaces, or, equivalently, spectral theory of matrices. In
Chapter 4 we proved several key results about spectral theory, but largely restricted
our study to semisimple (and hence diagonalizable) matrices. The tools of complex
analysis developed in the previous chapter give more powerful techniques for study-
ing the spectral properties of matrices. In particular, we can now generalize the
results of Chapter 4 to general matrix operators.
The main innovation in this chapter is the use of a matrix-valued function called the resolvent of an operator. Given A ∈ M_n(𝔽), the resolvent is the function R(z) = (zI − A)^{−1}. The first main result is the spectral resolution formula, which says that if f is any holomorphic function whose power series converges on a disk containing the entire spectrum of A, then the operator f(A) can be computed using the Cauchy integral formula

    f(A) = (1/2πi) ∮_Γ f(z)R(z) dz,

where Γ is a suitably chosen simple closed curve. As an immediate corollary of the spectral resolution formula, we get the famous Cayley-Hamilton theorem, which says that the characteristic polynomial p(z) of A satisfies p(A) = 0.
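To get a feel for the resolvent before the formal development, the following sketch (illustrative Python; the matrix is our own example) evaluates R(z) = (zI − A)^{−1} for a small matrix and shows that its norm blows up as z approaches an eigenvalue of A:

    import numpy as np

    A = np.array([[2.0, 1.0],
                  [0.0, 3.0]])        # eigenvalues 2 and 3
    I = np.eye(2)
    for z in [5.0, 3.5, 3.1, 3.01]:
        R = np.linalg.inv(z * I - A)  # the resolvent at z
        print(z, np.linalg.norm(R, 2))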
The next main result is the spectral decomposition of an operator, which says that the space 𝔽ⁿ can be decomposed as a direct sum of generalized eigenspaces of the operator, and that the operator can be decomposed in terms of how it acts on those eigenspaces. The spectral decomposition leads to easy proofs of several key results, including the power method for computing the dominant eigenvalue-eigenvector pair and the spectral mapping theorem, which says that when an operator A is mapped by a holomorphic function f, the eigenvalues of f(A) are just the images f(λ), where λ ∈ σ(A). It also gives a nice way to write out the spectral decomposition of f(A) in terms of the spectral decomposition of A.


Finally, we conclude the chapter with some applications of the spectral


decomposition and the spectral mapping theorem. These include Perron's theorem,
the Drazin inverse, and the Jordan normal form of a matrix. Perron's theorem
tells us about the existence and uniqueness of a dominant eigenvalue of a positive
matrix operator. This theorem, combined with the power method, lies at the heart
of some interesting modern applications, including Google's PageRank algorithm
for ranking web page search results. The Drazin inverse is another pseudoinverse,
different from the Moore- Penrose inverse. The Moore-Penrose inverse is a geomet-
ric inverse, determined in part by the inner product, whereas the Drazin inverse is
a spectral inverse, determined by the eigenprojections of the particular operator.
The Jordan normal form of a matrix arises from a special choice of basis
that allows the spectral decomposition to be written in a nearly diagonal form.
This form has eigenvalues along the diagonal, zeros and ones on the superdiagonal,
and zeros elsewhere. The Jordan form can sometimes be a useful approach when
dealing with "fun-sized" matrices, that is, matrices small enough that you might
decompose them by hand in an effort to gain intuition. But it is important to
emphasize that the basis-independent form of the spectral decomposition, as de-
scribed in Theorem 12.6.12, is almost always more useful than the Jordan normal
form. Limiting yourself to a particular basis (in this case the Jordan basis) can be
a hindrance to problem solving. Nevertheless, for those who really wish to use the
Jordan normal form, our previous results about the spectral decomposition give an
easy proof of the Jordan decomposition.

12.1 Projections
A projection is a special type of operator whose domain can be decomposed into two complementary subspaces: one which maps identically to itself, and one which maps to zero. A standard example of a projection is the operator P : ℝ³ → ℝ³ given by P(x, y, z) = (x, y, 0). Intuitively, the image of an object in ℝ³ under this projection is the shadow in the plane that the object would cast if the sun were directly overhead (on the z-axis), hence the name "projection." We have already seen many examples of this type of projection in Section 3.1.3, namely, orthogonal projections. In this section, we consider more general projections that may or may not be orthogonal.⁴⁴
Throughout this section assume that V is a vector space.

12.1.1 Projections

Definition 12.1.1. A linear operator P : V---+ V is a projection if P 2 = P.

Example 12 .1.2. It is straightforward to show that if Pis a projection, then


so is the complementary projection I - P; see Exercise 12.1.

44 Nonorthogonal projections are sometimes called oblique projections to distinguish them from
orthogonal projections.
12.1. Projections 461

Lemma 12.1.3. If P is a projection, then

(i) y E a(P) if and only if Py= y;

(ii) JV (P) = a (I - P) .

Proof.

(i) If y Ea (P), then y = Px for some x, which implies Py = P(Px) = Px = y.


The converse is trivial.

(ii) Note that x E JV (P) if and only if Px = 0, and this holds if and only if
(I - P)x = x. This is equivalent to x Ea (I - P), using (i). D

A projection separates its domain into two complementary subspaces-the


kernel and the range .

Theorem 12.1.4. If Pis a projection on V , then V =a (P) EB JV (P).

Proof. For every x EV we have x = Px+(I-P)x, which gives V =a (P)+JV (P).


It suffices to show that a (P) n JV (P) = {O}. If x Ea (P) n JV (P), then Px = x
and Px = 0 . Therefore x = 0. D

Corollary 12.1.5. Assume V is finite dimensional. If P is a projection on V with


S = [s 1, ... , sk] as a basis for a (P) and T = [t 1, ... , te] as a basis for JV (P) , then
SU T is a basis for V , and the block matrix representation of P in the basis S U T is

[~ ~] '
where I is the k x k identity matrix (on a (P)), and the zero blocks are submatrices
of the appropriate sizes.

Proof. By Theorem 12.l.4 the basis SU Tis a basis of V. By Theorem 4.2.9 the
matrix representation in this basis is of the form

A 11 0 ]
[ 0 A22 '

where A 11 is the matrix representation of Pon a


(P) and A22 is the matrix repre-
sentation of Pon JV (P). But Pis zero on JV (P), so A 22 is the zero matrix. By
Lemma 12.l.3(i) we know P is the identity on all vectors in a
(P); hence, A11 is
the identity matrix. D

Theorem 12.1.6. If W 1 and W2 are complementary subspaces of V , that is,


V = W 1 EB W 2 , then there exists a unique projection P on V satisfying a (P) = W1
and JV (P) = W 2 . In this case we say that P is the projection onto W1 along W2 .
462 Chapter 12. Spectral Calculus

Proof. For each x E V there exists a unique X1 E W1 and x2 E W2 such that


x = x 1 +x 2. Define the map P by setting Px = X1 for each x EV. Since Px1 = X1
and Px2 = 0, we can see that P 2 =Pon both!% (P) and JY (P), separately, and
hence on all of V. To show that Pis linear, we let x = x1 + x2 and y = Y1 + Y2,
where x 1,y 1 E W1 and x2,Y2 E W2 . Since ax+ by= ax1 + by1 + ax2 + by2 , we
have P(ax+by) = ax 1 +by 1 = aP(x) +bP(y). To show uniqueness, suppose there
exists some other projection Q satisfying!% (Q) = W1 and JY (Q) = W2. Since
Px = Px 1 + Px 2 = x 1 = Qx 1 = Qx 1 + Qx2 = Qx for all x , we have Q = P. D

Example 12.1. 7. Let V = JF[x] . If W1 = lF[x; n] is the space of polynomials


of degree no more than n and W2 is the space of polynomials having no
monomials of degree n or less, then lF[x] = W1 EB W2. The projection of lF[x]
onto W1 along W2 is the map
m n
L aixi r-+ L aixi
i=O i=O

that just forgets the terms of degree more than n.

12.1.2 Invariant Subspaces and Their Projections


In this section we introduce some results that combine the invariant subspaces of
Section 4.2 with the results of the previous section on projections. In addition to
being important for our treatment of spectral theory, these results are useful in
numerical linear algebra, differential equations, and control theory.

Theorem 12.1.8. Let L : V -t V be a linear operator. A subspace W of V is


L-invariant if and only if for any projection P onto W we have LP = P LP.

Proof. (===?)Assume W is £ -invariant and !!l(P) = W. Since JY (I - P) = W,


it follows that (J - P)Lw = 0 for all w E W, and also Pv E W for all v E V.
Therefore, we have that (J - P)LPv = 0 for all v EV and LP= PLP.
(¢:=) Assume that LP = P LP and that !% (P) = W. Since Pw = w for all
w E W, it follows that Lw = PLw for all w E W. Thus, Lw E W for all w E W.
Therefore, W is £-invariant. D

Theorem 12.1.9. Assume that L is a linear operator on V . Two complementary


subspaces W1 and W2 of V satisfy LW1 c W1 and LW2 C W2 (we call these
mutually £ -invariant) if and only if the corresponding projection P onto W1 along
W2 satisfies LP = PL.

Proof. (===?) Since W1 and W2 are complementary, any v E V can be uniquely


written as v = w1 + w2 with w1 E W1 and W2 E W2. Since W1 and W2 are
mutually £ -invariant and P satisfies!% (P) = W1 and JY (P) = W2, we have
12.l Projections 463

PLv = PLw1 + PLw2 = PLw1 = Lw1 = LPw1 = LPv.


( ¢==) Assume that PL =LP, where Pis a projection satisfying fit (P) = W1
and JV (P) = W2. If w 1 E W1, then Lw 1 = LPw 1 = PLw 1 E W1. If w2 E W2,
then Lw2 = L(I - P)w2 = (I - P)Lw2 E W2. Thus, W1 and W2 are mutually
£ -invariant. D

12.1.3 Eigenprojections for Simple Operators


Some of the most important projections are those with range equal to the eigenspace
of an operator. In this section we consider such projections for simple matrices (see
Definition 4.3.4). It is straightforward to generalize these results to semisimple ma-
trices, but, in fact, they can also be generalized to matrices that are not semisimple.
Much of the rest of this chapter is devoted to developing this generalization.

Proposition 12.1.10. Let A E Mn(lF) have distinct eigenvalues >-.1, >-.2, . .. , An · Let
. .. , rn be a corresponding basis of {right) eigenvectors, and let S be the matrix
r 1,
whose columns are these eigenvectors, so the i th column of S is r i . Let £{ , .. . , £~
be the rows of s- 1 ~he left eigenvectors of A, by Remark 4.3.19), and define the
rank-I map Pi = r iii . For all i , j we have

(i) tT rj = J ij ,

(iv) L:~=l Pi = I, and

We call the maps Pi the eigenprojections of the simple matrix A.

Proof.

(i) This follows from the fact that s- 1S = I.

(iv) 2:~ 1 pi = L:~=l rilT' which is the outer-product expansion of ss- 1 =I.
(v) This follows by combining (iii) and (iv). D

The proposition says that any simple operator L on an n-dimensional space can
be decomposed into the sum of n rank-1 eigenprojections, and we can decompose its
domain into the direct sum of one-dimensional invariant subspaces. Representing L
464 Chapter 12. Spectral Calculus

in terms of the eigenbasis r 1 , ... , rn gives a diagonal matrix. In this representation


the eigenvectors are just the standard basis vectors ri = ei, and the projections
Pi are

0 0 0
0 0 0

Pi= eie{ = ,
0 1 0

0 0

where the only nonzero entry is P ii = 1. In this setting it is clear that the matrix
representation of L is

.A1 0 0 0
0 .A2 0 0
A= 0 0 .A3 0 = .A1P1 + · · · + AnPn.

0 0 0 An

The previous proposition says how to express these projections in terms of any
basis-not just an eigenbasis. In other words, the eigenprojections give a "basis-
free" description of spectral theory for simple operators (or matrices). In the rest
of the chapter we will develop a similar basis-free version of spectral theory for
matrices that are not necessarily even semisimple.

Example 12.1.11. Let

A = [! ~] .
We saw in Example 4.1.12 that the eigenvalues are .A 1 = -2 and >.. 2 = 5, with
corresponding eigenvectors

r1 = [ ~1 ] and r2 = [~] .
It is not hard to check that

£i=~7 [-4 3] and T


£2 = 71 [1 1] .

Thus the eigenprojections are

~ ~3] [~ ~] .
1
p1 = [4 and P2 = -
7 -4 7
12.2. Generalized Eigenvectors 465

These are both rank 1, and we can check the various properties in the lemma:

(ii) We have

-3] [3 3] = [O OJ
3 4 4 0 0 )

and it is similarly straightforward to check that Pf = P 1 and P:j = P 2 .


(iii) We have

and a similar computation shows that P2A = >..2P2.


(iv) We have

Pi
1[-44 -3]
+ p2 = 7 1[3 3]
3 +7 4 4 [~ ~] = I.

(v) Finally,

-3] ~ [3 3]
3 +7 4 4 [~ ~] =A.

12.2 Generalized Eigenvectors


Recall from Theorem 4.3.7 that a matrix A E Mn(F) has a basis of eigenvectors if
and only if it is diagonalizable. This allows us to write the domain of A as a direct
sum of eigenspaces. But the eigenvectors of a matrix that is not semisimple do not
span the domain of the matrix, so if we want to decompose such an operator like
we did for simple operators, we must generalize the idea of an eigenspace to include
more vectors.
In this section, we do just that . We develop a generalization of an eigenspace
that allows us to write the domain of any matrix operator as a direct sum of general-
ized eigenspaces. Later in the chapter we use this to transform any matrix operator
into a nice block-diagonal form.

12.2.1 The Index of an Operator


In order to discuss generalized eigenspaces, we need the concept of the index of
an operator. Recall from Exercise 2.8 that for any linear operator B, we have an
increasing sequence of subspaces

(12 .1)
466 Chapter 12. Spectral Calculus

If the operator is defined on a finite-dimensional space, then this sequence must


eventually stabilize. In other words, there exists a K E N such that JV (Bk)
JV (Bk+ 1 ) for all k 2 K.

Definition 12.2.1. The index of a matrix B E Mn(IF) is the smallest k EN such


that JV (Bk)= JV (Bk+ 1 ). We denote the index of B by ind(B).

Example 12.2.2. For any operator B we always have B 0 = I, so if B is


invertible, then ind(B) = 0 because the kernels of B 0 = I and B 1 = B are
both trivial: JV (B 0 ) = JV (B 1 ) == {O}.

In the ascending chain (12.1) it might seem possible for there to be some
proper (strict) inclusions that could be followed by an equality and then some more
proper inclusions before the terminal subspace. The following theorem shows that
this is not possible-once we have an equality, the rest are all equalities.

Theorem 12.2.3. The index of any B E Mn (IF) is well defined and finite. Further-
more, if k = ind(B), then JV (Bm) ==JV (Bk) for all m 2 k, and every inclusion
JV (B 1- 1 ) c JV (B 1) for l = 1, 2, ... , k is proper.

Proof. Since B is an operator on a finite-dimensional space, only a finite number


of inclusions in (12 .1) can be proper, so the index is well defined and finite. The
rest of the theorem follows immediately from Exercise 2.12. D

Example 12.2.4. Consider the matrix

o - 1
B = 3, -2 ll
-41 5
-1 2 .
[21 -1
-1 -3 3

It is straightforward to check that B 2 = 0, so Bk = 0 for all k ;::: 2. But


because B -=f. 0 we have dimJV (B) < dim JV (B 2 ). Hence, ind (B) = 2.

The rank-nullity theorem says t hat the dimension of the range and the kernel
of any operator on IFn must add to n, but it is possible that these spaces may have a
nontrivial intersection. The next theorem shows that when we take k large enough,
the range and kernel of Bk have a trivial intersection, and so together they span
the entire space.

Theorem 12.2.5. Let BE Mn(IF). If k : :'.'.: ind(B), then JV (Bk) n ~(Bk) = {O}
and wn =JV (Bk) E!7 ~(Bk).
12. 2. Generalized Eigenvectors 467

Proof. Suppose x E JV (Bk) n a' (Bk). Thus, Bkx = 0, and there exists y E lFn
such that x = Bky. Hence, B kBky = B 2 ky = 0, and soy E JV (B 2 k) = JV (Bk).
Therefore x = Bky = 0. It follows that JV (Bk) n a' (Bk) = {O}.
To prove that lFn = JV (Bk) + a' (Bk), use the rank-nullity theorem, which
states that n = dimJV (Bk)+ rank(Bk) . By Corollary 2.3.19, we have wn =
JV (Bk) +a' (Bk). Therefore, lFn = JV (Bk) tfJ a' (Bk). D

Corollary 12.2.6. If B E Mn(lF) and k = ind(B), then a' (Bm) = a' (Bk) for all
m~ k.

Proof. By Exercise 2.8 we also have a' (Bm) c a' (Bk), so it suffices to show
dimBm = dim Bk. But by Theorem 12.2.3 we have JV (Bm) = JV (Bk); so,
Theorem 12.2.5 implies that dim Bm = dim Bk. D

We conclude this section with one final important observation about repeated
powers of an operator B on a vector x .

Proposition 12.2.7. Assume BE Mn(lF) . If Bmx = 0 and Bm- l x-=/:- 0 for some
m E z+, then the set {x, Bx, ... , Bm- 1 x} is linearly independent.

Proof. Suppose that aox + aiBx + · · · + am_ 1 Bm- lx = 0 is a nontrivial linear


combination. Let ai be the first nonzero coefficient . Left multiplying by Bm-i-l
gives aiBm-lx = 0. But Bm-lx-=/:- 0 implies ai = 0, a contradiction. D

12.2.2 Generalized Eigenspaces


Recall that an eigenspace of an operator L associated with the eigenvalue >.. is the
subspace I:,x = JV (>..I - £),where the geometric multiplicity of>.. is the dimension
of the eigenspace. If A E Mn (JF) and A is semisimple, then we can write JFn as the
direct sum of A-invariant subspaces

where O"(A) = {>..i}~=l is the set of distinct eigenvalues of A. Choosing a basis


for each JV (.Ail - A) and writing the operator corresponding to A in terms of the
resulting basis for JFn yields a diagonal matrix that is similar to A, where each block
is of the form >..J
If A is not diagonalizable, the spaces JV (.Ail - A) do not span lFn, but as
explained in the previous section, we have a nested sequence of subspaces

JV (>..I -A) c JV ((AI - A) 2 ) c ·. · c JV ((AI - A)n) c · · · . (12.2)

If k = ind(>..! - A), then the sequence stabilizes at JV ((AI -A)k). For each
eigenvalue>.., the space JV((>..! - A)k) is a good generalization of the eigenspace.
This generalization allows us to put a nondiagonalizable operator L into block-
diagonal form. This is an important approach for understanding the spectral theory
of finite-dimensional operators.
468 Chapter 12. Spectral Calculus

Definition 12.2.8. If A E Mn(IF) with eigenvalue .A E <C, then the generalized


eigenspace of A corresponding to >. is the subspace <ff>.. = JY ((AI - A)k), where
k = ind(.XI - A). Any nonzero element of <ff>.. is called a generalized eigenvector of
A corresponding to >..

The next four lemmata provide the key tools we need to prove the main
theorem of this section (Theorem 12.2.14), which says !Fn decomposes into a direct
sum of its generalized eigenspaces.

Lemma 12.2.9. If A E Mn(IF) and>. is an eigenvalue of A, then the generalized


eigenspace <ff>.. is A-invariant.

Proof. Any subspace is invariant under the operator AI. Since A= .XI - (.XI -A),
it suffices to show that <ff>.. is (AI - A)-invariant. If x E <ff>.. and y = (AI - A)x, then
(.XI - A)ky = (AI - A)k+lx = 0, which implies that y E <ff>._. D

Example 12 .2.10. Consider the matrix

l 1
A= 0 1
[0 0

This matrix has two eigenvalues, .A, = 1 and >. = 3, but is not semisimple. A
straightforward calculation shows that the eigenvectors associated with >. =
1 and >. = 3 are [1 0 0] T and [1 2 2] T, respectively. An additional
calculation shows that ind(3I - A.) = 1 and ind(ll - A) = 2. So to span
the generalized eigenspace <ff1, we need an eigenvector corresponding to A = 1
and one additional generalized eigenvector v 3 E JY ((ll - A) 2 ) with v 3 ~
JY (ll - A). The vector v 3 = [O 1 OJ T satisfies this condition. Thus the
generalized eigenspaces are <ff1 = span( { [1 2 2] T , [0 1 0r}) and IC3 =
span({[l 0 O]T}).

Lemma 12.2.11. If A andµ are distinct eigenvalues of a matrix A E Mn(IF), then


<ff>.. n <ffµ= {O} .

Proof. Let k>.. = ind (AI - A) and kµ =ind (µI - A). The previous lemma shows
that <ff>.. is (AI - A)-invariant. We claim that µI - A restricted to <ff>.. has a trivial
kernel, and thus (by iterating) the kernel of (µI - A)k" on <ff>.. is also trivial. This
implies that <ff>.. n <ffµ = {O}.
To prove the claim, suppose that x E JY (µI - A) n <ff>.., so that Ax= µx and
(AI - A)k"'x = 0. Using the binomial theorem to expand (.XI - A)k"' shows that
(.A - µ)k-'x = O; see Exercise 12.9 for details. This implies that x = 0. D
12.2. Generalized Eigenvectors 469

Lemma 12.2.12. Assume that W1 and W2 are A-invariant subspaces of IFn with
W1 n W2 = {O}. If A is an eigenvalue of A with generalized eigenspace <ff>.., such
that <ff>. n Wi = {O} for each i, then

Proof. If x E <ff>, n (W1 EB W2), then x = X1 + x2, with Xi E wi for each i. If


k = ind(Al - A), then

0 = (AI - A)kx = (AI - A)kx 1 + (AI - A)kx 2 .

Since each Wi is A-invariant, it is also (Al - A)-invariant, so we have

Therefore, for each i we have Xi E JV ((Al - A)k) =<ff>,. But we also have Xi E wi
by definition, and thus xi = 0. This implies that x = 0. D

Lemma 12.2.13. If A E Mn(IF) and A is an eigenvalue of A , then dim (<ff>.) equals


the algebraic multiplicity m>, of A.

Proof. The proof follows that of Theorem 4.4.5. By Schur's lemma we can assume
that A is upper triangular and of the form

where the block Tu is upper triangular with all diagonal values equal to A, and
the block T22 is upper triangular with all diagonal values being different than A.
The block Al - Tu is strictly upper triangular, and the block Al - T22 is upper
triangular but has all nonzero diagonals and is thus nonsingular. It follows that
(AI -Tur" = 0 and (AI -T22)k is nonsingular for all k EN. Therefore, dim (<ff>.) =
dim JV ((AI - A)m") =dim JV ((AI - Tu)m") = m>,. D

Theorem 12.2.14. Given A E Mn(IF), we can decompose IFn into a direct sum of
A -invariant generalized eigenspaces
IFn = <ff>., EB g>-2 EB ... EB gAr' (12 .3)

where cr(A) = {A 1 , A2 , ... , Ar} are the distinct eigenvalues of A .

Proof. We claim first that for any A E cr(A) and for any subset M = {µ1, . .. , µe} C
cr(A) of distinct eigenvalues with A tf_ M, we have

To show this we use Lemmata 12.2.11 , 12.2.9, and 12.2.12 and induct on the size of
the set M .
470 Chapter 12_ Spectral Calculus

If [M[ = 1, the claim holds by Lemma 12.2.11. If the claim holds for [M[ = m,
then for any M' with [M'[ = m + 1, write M' = M U {v }. Set Wo = !:>.., set
W1 = <f:v, and set W2 = ffiµEM I:µ- The conditions of Lemma 12.2.12 hold, and so
!:>.. n ffiµE M' I:µ= {0}, proving the claim.
This shows we can define the subspace W = l:>.. 1 EB l:>.. 2 EB · · · EB l:>..r of lFn .
So, it suffices to show that W = lFn. By Lemma 12.2.13, the dimensions of the
generalized eigenspaces are equal to the algebraic multiplicities, which add up ton.
This implies that dim W = n, which implies W = lFn by Corollary 1.4.7. 0

Remark 12.2.15. By Theorem 12.2.14, a given matrix A with eigenvalues a(A) =


{A 1 , >. 2 , . .. , Ar} is similar to a block-diagonal matrix where the generalized eigen-
space !:>.., is invariant with respect to the corresponding block. In Section 12.10,
we describe a method for producing a specific basis for each generalized eigenspace
so that the block-diagonal matrix is banded with the eigenvalues on the diagonal,
zeros and ones on the superdiagonal, and zeros elsewhere. This is called the Jordan
canonical form. The rest of this chapter develops a basis-free approach for producing
the block-diagonal representation, where we use projections instead of making an
explicit choice of generalized eigenvectors.

12.3 The Resolvent


In this section we introduce the resolvent, which is a powerful tool for understanding
the spectral properties of an operator. The resolvent allows us to use the tools of
complex analysis, including the Cauchy integral formula, to study properties of
an operator.
One of the many things the resolvent gives us is a complete set of generalized
eigenprojections that describes the spectral theory of the operator in a basis-free
way. The resolvent is also a key tool in the proof of the power method (see
Section 12.7.3) and the Perron-Frobenius theorem (Theorem 12.8.11), which are
essential for working with Markov chains, and which are fundamental to applica-
tions like the Google PageRank algorithm (see Section 12.8.3).

12.3.1 Properties of the Resolvent

Definition 12.3.1. Let A E Mn(lF). The resolvent set p(A) C C of A consists of


the points z EC for which (zl -A)- 1 exists. The complement a(A) = C ""p(A) is
called the spectrum of A. The resolvent 45 of A is the map R(A, ·): p(A) -t Mn(C),
given by
R(A, z) = (zl - A)- 1 . (12.4)
If there is no ambiguity, we denote R(A, z) simply as R(z).

45 Although we define and study the resolvent only for matrices (i.e., finite-dimensional operators),
many of the results in this chapter work in the much more general case of closed linear operators
(see, for example, [Sch12].)
12.3. The Resolvent 471

Example 12.3.2. Consider the matrix

A= [~ ~]. (12.5)

By direct computation we get the resolvent

Note that R(z) has poles precisely at a(A) = {3, -2}, so the resolvent set is
p(A) = C "- {3, -2}.

Remark 12.3.3. Cramer's rule (Corollary 2.9.24) shows that the resolvent takes
the form of the rational function

R(z) = (zl - A) - 1 = adj(zl - A) (12. 7)


det(zl - A)'
where adj(-) is the adjugate matrix. Since the denominator of (12. 7) is the char-
acteristic polynomial of A, the poles of R(z) correspond precisely, in terms of both
location and multiplicity, to the eigenvalues of A. In other words, the spectrum is
the set of eigenvalues of A, and the resolvent set p(A) is precisely those points of C
that are not eigenvalues of A.

Example 12.3.4. Let

A~r~~~~j
Using (12.7) we compute the resolvent as R(z) = B(z)/p(z), where

B(z) =
(z - l)~(z
O
- 7) 3(z - l)(z - 7) 9(z - 7)
(z - 1) 2 (z - 7) 3(z - l)(z - 7) 9(z27- 1)
0 (z - 1) 2 (z - 7) 3(z - 1) 2
j
r 0 0 0 (z - 1) 3
and p(z) = (z -1) 3 (z - 7) .

Lemma 12.3.5. Let A E Mn(lF). The following identities hold:

(i) If z1 , z2 E p(A) , then

R(z2) - R(z1) = (z1 - z2)R(z2)R(z1). (12.8)

This is called Hilbert's identity.


472 Chapter 12. Spectral Calcu lus

(ii) If z E p(A1) n p(A2), then


R(A 2, z) - R(A 1, z) = R(A 1, z)(A2 - Ai)R(A2, z). (12.9)

(iii) If z E p(A), then


R(z )A = AR(z). (12.10)

(12.11)

Proof.
(i) Write (zi - z 2)I = (ziI - A) - (z2I - A), and then left and right multiply by
R(z2) and R(zi), respectively.
(ii) Write A 2 - Ai = (zI - Ai) - (zI - Az), and then left and right multiply by
R(Ai , z) and R(A2, z), respectively.
(iii) From (12.4), we have (zI - A)R(z) = R(z)(zI -A ), which simplifies to (12.10) .
(iv) For z 1 = z2 , (12.11) follows trivially. Otherwise, using (12 .8) and relabeling
indices, we have

12.3.2 Local Properties


Throughout the remainder of this section, let A E Mn(lF), and let II · II be a
matrix 46 norm.

Theorem 12.3.6. The set p(A) is open, and R(z) is holomorphic on p(A) with
the following convergent power series at zo E p(A) for lz - zol < llR(zo)ll - i:
00

R(z) = 2:) -l)k(z - zo)kRk+i(zo) . (12.12)


k=O

Proof. We use the fact (from Proposition 5.7. 4) that when llB ll < 1 the Neumann
series ""£~ 0 Bk converges to (I - Bt-i . From (12.8) we have
R(zo) = R(z) + (z - zo)R(zo)R(z) =[I+ (z - zo)R(zo)]R(z) .
Setting B = -( z - z0 )R(z0 ) in the Neumann series, we have
00

R(z) = [I+ (z - zo)R(zo)]-iR(zo) = 2:)-l )k(z - zo)k Rk+i(z0 ),


k=O

and this series converges in the open neighborhood {z EC I lz - zol < llR(zo)ll-i}
of zo. Therefore p(A) is an open set, and R(z) is holomorphic on p(A) . D

46 Recall from Definition 3.5.15 that a matrix norm is a norm on Mn(lF) that satisfies the submul-
tiplicative property !IABll ~ llAllllB!I.
12.3. The Resolvent 473

Remark 12.3.7. Comparing (12.12) with the Taylor series (11.18) for R(z) reveals
a relationship between powers of R(z) and its derivatives:

(12.13)

We can also get t his by formally taking derivatives of R(z).

Theorem 12.3.8. If lzl > l All, then R(z) exists and is given by

(12.14)

Proof. Note that

which converges whenever l z- 1All < l. D

Corollary 12.3.9.
lim llR(z) JI = 0. (12.15)
JzJ-+oo
Moreover, R(z) is holomorphic in a neighborhood of z = oo.
Remark 12.3.10. When we say R(z) is holomorphic in a neighborhood of oo, we
simply mean that if we make the substitution w = l / z, then R(l / w) is holomorphic
in a neighborhood of w = 0.

Proof. That R( z ) is holomorphic in a neighborhood of z = oo follows immediately


from (12.14). For lzl > llAll we have

l R(z) ll < ~ ~1
- L..., lzlk+
k =O
=
lzl
(1- ~)
_.!._
lzl
-
1
1
lzl - l All .
This shows lim JzJ -+oo l R(z)ll = 0. D

Remark 12.3.11. Remark 12.3.3 shows that the spectrum is the set of roots of the
characteristic polynomial. By the fundamental theorem of algebra (Theorem 11.5.4),
this is nonempty and finite; see also Exercise 11.21.
The previous theorem gives an alternative way to see that the spectrum is
nonempty, which we give in the following corollary.

Corollary 12.3.12. The spectrum cr(A) is nonempty.


474 Chapter 12. Spectral Calculus

Proof. Suppose that O"(A) = 0, so that p(A) = <C and R(z) is entire. By Corollary
12.3.9, R(z) is uniformly bounded on <C. Hence, R(z) is constant by Liouville's
theorem (Theorem 11.5.1). Thus, R(z) = limz--+oo R(z) = 0. This is a contradiction,
since I= (zl - A)R(z). D

Lemma 12.3.13. For any matrix norm II ·II and any A E Mn(lF), the limit

r(A) == lim llAklll/ k (12.16)


k--+oo

exists and r(A):::; llAll- We call r(A) the spectral radius of A .

Proof. For 1 :Sm< k fixed, we have the inequality llAkl l :S IJAk-mllllAmll· Let
ak = log llAkJI . The inequality implies that ak :::; am+ ak-m· By the division
algorithm, there exists a unique q and r such that k = qm + r, where 0 :::; r < m,
or, alternatively, we have q = lk/mj. This gives

It follows that
ak , q 1
k '."., kam + kar .
Leaving m fixed and letting k > m grow gives

. q
1imsup -k
.
= 1imsup -
lk/mJ
k- = - .
1
k k m
It follows that
. ak .
hmksup k :S hmksup
(qkam + kar
1 ) am
= --;;;-· (12.17)

Since (12 .17) holds for all m, it follows that


. ak
1imsup-k 1. . f am
:::; imm - .
k m m

Therefore the limit exists. To show that r(A):::; ll Al l, let m = 1 in (12.17). D

Theorem 12.3.14. The regions of convergence in Theorems 12.3.6 and 12.3.8 can
be increased to
1
(i) lz - zol < [r(R(z0 ))r and
(ii) lzl > r(A), respectively.

Proof. In each part below, let r denote r(R(zo)) and r(A) , respectively.

(i) Let z satisfy lz-zol < r- 1 . There exists an c > 0 such that Jz - zol < (r+2t:)- 1 .
Moreover, there exists N such that llR(z0 )kll 1/k < r + c for all k > N. Thus
lz - zolkllR(zo)kll :S (;.:;e)k < 1, which implies that (12.12) converges.
12.4. Spectral Resolution 475

(ii) Let z satisfy iz l > r. There exists an c. > 0 such that iz l > r + 2c.. Moreover
there exists N such that ll Ak ll 1 / k < r + c. for all k > N. Thus lzl-kllAk l <
(;.:;c)< 1, which implies that (12 .14) converges. 0
Remark 12.3.15. Theorem 12.3.14 implies that R(z) is holomorphic on lz l >
r(A), which implies
r (A) ~ O"M = sup l>- 1.
AEa(A)

In the next section we show in the finite-dimensional case that r(A) = O"M · Not
only does t his justify the name spectral radius for r(A), but it also implies that the
value of r(A) is independent of the operator norm used in (12.16).

12.4 Spectral Resolution


Throughout this section assume A E Mn(lF) is given. Recall from (12 .7) that R(z)
is a matrix of rational functions with a common denominator. The poles of these
functions correspond, in terms of both location and algebraic multiplicity, to the
eigenvalues of A. In Section 12.3.2, we saw the locally holomorphic behavior of R(z)
in p(A). In the next three sections we determine the Laurent expansion of R(z)
about the points of d A).

Definition 12.4.1. Let >. E u(A) . Let r be a positively oriented simple closed
curve containing>. E u(A) but no other points of u(A) . The spectral projection (or
eigenprojection) of A associated with >. is given by
1
P>, = Res(R(z), >.) = - . J, R(z)dz. (12.18)
2'Tri Jr

Theorem 12.4.2. For any A E Mn(C) and any>. E u(A), the following properties
of P>, hold:
(i) Idempotence: Pf= P>, for all>. E u(A).
(ii) Independence: P>,PN = P>-'P>- = 0, whenever>. , NE u(A) with>. =f. >.'.
(iii) A -invariance: AP>, = P>,A for all>. E u(A).
(iv) Completeness: 2=>-Ea(A) P>, =I.

Proof.
(i) Let r and r' be two positively oriented simple closed contours in p(A) sur-
rounding>. and no other points of u(A). Assume also that r is interior to I''
as in Figure 12.1. For each z E r we have
dz'
1~ = 0.
ir'
- - = 27Ti
z' - z
and
.'fr z' - z
By Hilbert's identity we have
476 Chapter 12. Spectral Calculus

,\

Figure 12.1. Illustration to assist in the proofs of Theorem 12.4.2(i) and


Lemma ta 12. 5. 2 and 12. 5.3. The point z is interior to f', but z' is exterior to r .

(~ ) 11
2
P'f. =
2m Jr }r, R(z)R(z')dz' dz
= (~)2 l, l, R(z),- R(z') dz' dz
2?Ti 1r .'fr, z -- z

Jr R(z) Jrr ~)dz _ Jrr


2
1
= ( -2?Ti.) [1 (1
Z - Z
l, R(z') Jr Z (1-f!--) dz']
- Z

1
= - . 1 R(z)dz = P>-..
2?Ti 1r

(ii) Let rand f' be two positively oriented disjoint simple closed contours in p(A)
surrounding,\ and A', respectively, and such that no other points of a(A) are
interior to either curve, as in Figure 12.2. Note that

i dz'
--=0
. rr z' - z
and 1~=0.
.'fr z' - z

,\
• ,\'

Figure 12.2. Diagram to assist in the proof of Theorem 12.4 .2(ii). In this
case z is not interior tor', nor is z' ·interior tor, and no points of a(A) "'-{A,,\'}
are interior to either curve.
12.4. Spectral Resolution 477

Again using Hilbert's identity, we have

P;_P;_, = (
2 ~i ri fr,, R(z)R(z')dz' dz
= ( ~ ) 2 l, l, R(z) ,- R(z') dz' dz
27ri Jr }r, z - z
[1Jr R(z) (1Jr, ~)dz - Jr,1 R(z') (1,fr -{!!-)dz']
2
1
= ( -27rZ.) Z - Z Z - Z

= o.
(iii) This follows directly from (12.10), that is,

A [f R(z)dz] = f AR(z)dz = f R(z)Adz = [f R(z)dz] A.

(iv) Let r be a positively oriented circle centered at z = 0 having radius R > r(A).
Theorems 12.3.8 and 12.3.14 guarantee that the Laurent expansion R(z) =
L:~o Akz-(k+l) holds along r, so the residue theorem (Theorem 11.7.13)
applied to this Laurent series gives
00

-1. ck
R(z)dz = - 1 . :/; Ak L0
k+l dz = A =I.
27rZ • r 27rZ r k=O z

However, the residue theorem applied to R(z) gives

;j-: .fr1 R(z)dz L ;j-: .fr,_


7rZ
1 R(z)dz L
=
>-Ea(A) 7rZ
=
>-Ea(A)
P;_,

where each I';_ is a positively oriented simple closed contour containing A and
no other points of o-(A). D

Remark 12.4.3. The double integrals in the previous proof require Fubini's
theorem (Theorem 8.6.1) to change the order of integration. Although we only
proved Fubini's theorem for real-valued integrals, it is straightforward to extend it
to the complex-valued case.

Example 12.4.4. Consider the matrix

of Example 12.3.2. The partial fraction decomposition of the resolvent (12.6) is

R(z) = z -1 3 (1[3
5 3 2])
2 + z +12( 15[-23 -2])
3 .
478 Chapter 12. Spect ra l Calculus

Taking residues gives the spectral projections

P3 =
1[3 2]
5 3 2

Notice that all four parts of Theorem 12.4.2 hold.

Example 12.4.5. Let


1 3
A= 0 1 3 0
0 01
1[0
0 1 3
0007
be the matrix in Example 12.3.4. Use (12.18) to compute P>. for A = 7, and
note that P 1 =I - P7. We get

o o
P1 =
[
0 0
0 0 0 1/2 ~ ~~~1 and
0
0
1
-1/81
-1/4
-1/2 .
0 0 0 1 0 0

Theorem 12.4.6 (Spectral Resolution Formula). Suppose that f(z) has a


power series at z = 0 with radius of convergence b > r(A) . For any positively
oriented simple closed contour r containing O"(A), we have

f(A) = -~ 1 (12.19)
27ri 1r J(z)R(z)dz.
Proof. Express f(z) as a power series f(z) = L,':=o akzk . W ithout loss of gener-
ality, let r be a positively oriented circle centered at z = 0 having radius bo, where
r(A) < bo < b. Thus, we have

-1. ck f(z)R(z)dz = - 1 .
2 7ri . r 2Ki r
i f(z)z- 1
00

L kAk
z
k= O
dz.

Since f(z)z- 1 is bounded on rand the summation converges uniformly on compact


subsets, the sum and integral can be interchanged to give

Corollary 12.4. 7. The spectral radius of A is

r(A) = O"NJ = sup j,\j .


>.Eo-(A)
12.4. Spectral Resolution 479

Proof. By Theorem 12.3.14, we know that r(A) 2: O'Af . For equality, it suffices to
show that r(A) :::; O'M +c for all c > 0. Let r be a positively oriented circle centered
at z = 0 of radius O'Af + c. By the spectral resolution formula, we have

1
An = - . J.. znR(z)dz . (12.20)
27rt Jr

Hence,

where
K = sup ll R(z)l loo ·
r
This gives

r(A) = lim llAnll 1 /n :S lim Klfn(O'M + c)l+l/n = O'M + c. 0


n-+oo n-+oo

Corollary 12.4.8 (Cayley- Hamilton Theorem). If A is an n x n matrix with


characteristic polynomial p(z) = det(zl - A), then p(A) = 0.

Proof. Let r be a simple closed contour containing O'(A). By the spectral resolution
formula (12.19) and Cramer's rule (Corollary 2.9.24) , we have

p(A) = - . lcf
27rt . r
det(zl - A)(zl - A) - 1 dz = - .
27ri. r
lcf.
adJ(zI - A)dz = 0,

since adj(zJ - A) is a polynomial in z and hence holomorphic. 0

Nota Bene 12.4.9. It might seem tempting to try to prove the Cayley-
Hamilton theorem by simply substituting A in for z in the expression
det(zJ - A) . Unfortunately it doesn't even make sense to substitute a matrix
A for the scalar z in the scalar multiplication zI. Contrast this with substi-
tuting A for z into p(z) , which is a polynomial in one variable, so p(A) is a
well-defined matrix in Mn(C) .

Example 12.4.10. Consider again the matrix A in Example 12.3.2. Note


that the characteristic polynomial is p(z) = z 2 - z - 6. The Cayley- Hamilton
theorem says that p(A) = A 2 - A - 6! = 0, which we verify directly:
2
1 2J _ [1 2J _ [6 OJ = [7 2J _ [l 2J _ [6 OJ = [O OJ
[30 30 06 36 30 06 oo·
480 Chapter 12. Spectral Calculus

12.5 Spectral Decomposition I


In the previous section, we were able to use the residues of R(z) at >. E O'(A) to
obtain the spectral projections of A and prove their key properties. In this section
and the next, we explore the rest of the Laurent expansion (11.20) of R(z) at >.as
a series of matrix operators, given by
00

R(z) = L Ak(z - >.)k. (12.21)


k '=-oo

According to Cauchy's integral theorem, each matrix operator Ak is given by (11.22)

1 J. R(z)
Ak = 27fi .'fr (z - >.)k+I dz, (12.22)

where r is a positively oriented simple closed contour containing >. and no other
points of O'(A); see Section 11 .6.3 for a review of Laurent expansions.
The main goal of these two sections is to establish the spectral decomposi-
tion formula for any operator A E Mn (IF). The spectral decomposition formula
is the generalization of the formula A = 2=>. >.P>. for semisimple matrices (see
Proposition 12.1.lO(v)) to general, not necessarily semisimple, matrices. We then
show in Section 12.7 that the spectral decomposition is unique. Writing out the
spectral decomposition explicitly in terms of a specific choice of basis gives the
popular Jordan normal form of the operator, but our development is a basis-free
description that works in any basis.

Nata Bene 12.5.1. Do not confuse the coefficient Ak of the Laurent expan-
sion (12.21) of the resolvent of A with the power Ak of A. Both can be
computed by a contour integral: Ak can be computed by (12.22) and Ak can
be computed by (12.20) (an application of the spectral resolution formula).
But despite the apparent similarity, they are very different things.

Lemma 12.5.2. Assume that>. E a-(A). Let r and f' be two positively oriented
simple closed contours in p(A) surrounding>. and no other points of O'(A) . Assume
also that r is interior to r'' that z' is a point on r'' and that z is a point on r' as
depicted in Figure 12.1. Let

1, n2'.0,
fin= {
0, n < 0.

The following equalities hold:


(i)
(12.23)

(ii)
(12.24)
12. 5. Spectral Decomposition I 481

Proof.
(i) Since z' is outside ofr, the function (z' - z) - 1 is holomorphic within. Expand
(z' - z)- 1 in terms of z - ;\to get

1 1 1 oo (z - >.)k
z' - z z' ->- 1-
z - ;\) =
--
L (z' -
k=O
;\)k+l ·
( z' - >-

Inserting into (12.23) and shrinking r to a small circler>- around;\ with every
point of r>- nearer to;\ than z' (see Figure 12.3) gives

1
27ri
j
Jr,.. (z - >-)
-m - 1 [~ (z - >.)k
-f;:o (z' - ;\)k+l
ldz

= I)z' - ;\) - k- 1 [~ j (z - >.)-m- l+kdz]


k= O 27ri Jr,..
= TJm(z' - ;\) - m-1.

(ii) In this case, both ;\ and z lie inside r. Split the contour into r >- and r z as in
Figure 12.3, so the left side of (12.24) becomes

~ j (z' - ;\) - n- 1 (z 1
- z) - 1 dz' +~ j (z' - >.) - n- 1 (z 1 - z)- 1 dz'
27ri Jr,.. 2m Jr"
= (1 - TJn)(z - ;\)-n-1.

The first integral follows the same idea as (i), except with a minus sign. The
second integral is an application of Cauchy's integral formula (11.14). D

Lemma 12.5.3. The coefficients of the Laurent expansion of R(z) at;\ E O"(A)
satisfy the identity
(12 .25)

r'

Figure 12.3. Diagram of the paths and points in Lemma 12.5.2. The
contour integral over r' (red) is the same as the sum of the integrals over the blue
circles r z and r >-.
482 Chapter 12. Spectral Calculus

Proof. Let r and r' be positively oriented simple closed contours surrounding .A
and no other points of O"(A). Assume also that r is interior to r' as depicted in
Figure 12.l. We have

(~) J J (z -
2

AmAn = .A)-m- 1(z' - A)-n- 1 R(z')R(z)dz'dz


27ri .'fr .'fr,
= (~ ) 2 J J (z - .A)-m-l(z' - .A)-n-1 R(z) ,- R(z') dz' dz
27ri 'Yr .'fr, z - z
2
= ( 2 ~i) i(z - .A)-m-l R(z) [.i' 1 1
(z' - .A)-n- (z' - z)- dz'] dz

- (2~i ri, (z' - .A)-n-1 R(z') [i(z - .A)-m-1(z' - z)-1dz] dz'

= - 1 . J (z - .A)-m-l R(z)(l - TJn)(z - .A)-n-ldz


27rZ ,fy

-~J (z' - .A)-n-l R(z')TJm(z' - .A)-m-ldz'


27ri }r,
1
= (1 - T/m - T/n)-. J (z - .A)-m-n- 2R(z)dz
27rZ Jr
= (1 - T/m - T/n)Am+n+l· D

Remark 12.5.4. Note that P>. = A_ 1 , so Lemma 12.5.3 gives another proof that
P>. is a projection, since

Lemma 12.5.5. Fix A E O"(A) and define D>. = A_2 and S>. = Ao. The following
hold:
(i) For n ::'.: 2, we have A_n = D~- 1 .
(ii) For n:::: 1, we have An = (-1)ns~+ 1 .

(iii) The operator P>. commutes with D>. and S>.:


(12.26)

(iv) The Laurent expansion of R(z) around A is


00 k 00

R( z ) -_ z P>.
- .A + '"'
L..... (z - D>.
.A) k+ 1 + '"'
L.) -1) k( z - .A) k S >.k+1 . (12.27)
k=l k=O

(v) We have

Proof. These follow by applying (12.25); see Exercises 12.23-25. D


12.6. Spectral Decomposition II 483

Example 12.5.6. Consider the matrix A given in Example 12.3.4. We found


the eigenprojections in Example 12.4.5. Below we determine Di and 5i . We
then verify (12.27). Using (12 .22) term by term for ,\ = 1 gives

- - 0
o 1
0
o1 -1/41
- 1/ 2
and A_3
2
o o
= D i = 9 00 00 0 ~ -~21
A_2 - Di - 3 0 0 0 0 0 .
r0 0 0 0 r0 0 0 0

Moreover, D~ = 0 for k > 2. To find Si apply (12.27) and (12.28) to get

Integrating shows that 5i = f,i P7 . Thus, fork EN we have

Putting this all together gives

Note that the holomorphic part of the Laurent expansion is a geometric series,
which sums nicely to give the final expression

Dr P1
R (z) =
(z - 1)3
+ (z Di
-1 )2
+ - Pi
z- 1
- + - -
z-7·

12.6 Spectral Decomposition II


Throughout this section, let A E Mn (IF) be a given matrix operator. The spectral
resolution formula (Theorem 12.4.6) allows us to write f (A) as a contour integral.
In this section, we use that formula to prove the spectral decomposition theorem,
which breaks up the contour integral into residue components P;.. and D;., for each
,\ E O"(A). We also examine the properties of D;., more carefully and show that D;.,
is nilpotent. Hereafter, we refer to D;., as the eigennilpotent of A associated with
the eigenvalue -\.

Lemma 12.6.1. For each,\ E O"(A), the operator D;., satisfies

(12. 29)

Moreover, its spectral radius r(D;.,) is zero.


484 Chapter 12. Spectral Calculus

Proof. To prove (12.29) it suffices to prove that

(12.30)

Let f>. be a positively oriented circle around>. containing no other points of O'(A).
By definition, we have R(z)(zI - A)== I, so zR(z) = AR(z) +I. This gives

AP>.=~ j AR(z)dz
2ni Jr,,
= ~ 1 AR(z) + Idz (I is holomorphic)
2ni Jr,,

= ~ 1 zR(z)dz
2ni Jr,,
=
1
- . 1 >.R(z)dz + -21 . 1 (z - >.)R(z)dz
2n i .fr,, ni Jr,,

= >-.P>. +D>. .
To prove r(D>.) = 0 parametrize f>. by z(t) = >. + peit for any sufficiently
small choice of p > 0. By Lemma 12.5.5(i) we have

llD~ll = II 2~i i,, (z - >-.)k R(z) dzll

2~ lifo
2
= 7r leikt R(z(t))pieit dtll

:::; l+l sup llR(z)ll·


zEr,,

Since f>. is compact, set M = supzEr,, llR(z)ll, which is finite. Thus,

r(D>.) = lim llD~ll 1 /k :::; lim pl+l/k M 1 /k = p.


k-+oo k-+oo

Since we can choose p to be arbitrarily small, we must have r(D>.) = 0. D

Example 12.6.2. Let A be the matrix

~1
1 3 0
0 1 3
A=
0 0 1
0 0 0

of Examples 12.3.4 and 12.4.5. From Lemma 12.6.1 we compute

[~ ~rn
D1 = (A - J)P1 =
3
0
0
3
0
1
0
0 -1/81
-1/4 [o 1
= 3 0 0 1 -1/41
0
-1/2
0 0 0 1 -1/2 0 0 0 0 '
0 0 0 0 0 0 0 0 0
12.6. Spectral Decomposition II 485

which agrees with our earlier computation in Example 12.5.6 and clearly has
all its eigenvalues equal to 0. Moreover

0 01[0 0 0 1/81
[~ ~1
3 0 0
~(A - ~ r-6~ -6 3 0 0 0 0 1/4 0 0
D, 7I)P, 0 -6 3 0 0 0 1/2 0 0
0 0 0 0 0 0 1 0 0

Lemma 12.6.3. A matrix B E Mn(lF) satisfies r(B) = 0 if and only if B is


nilpotent.

Proof. If r(A) = 0, then O"(A) = {O}. Hence the characteristic polynomial of A


is p(>.) = An. By the Cayley- Hamilton theorem (Corollary 12.4.8), we have that
p(A) = An = 0, which implies that A is nilpotent. Conversely, if A is nilpotent,
then Ak = 0 for all k ::'.:: n, which implies that r(A) = limk--+oo llAkll 1/k = 0. D

Remark 12.6.4. Recall that the order of a nilpotent operator B E Mn(lF) is the
smallest m such that Bm = 0. Since the order m of a nilpotent operator is the same
as its index, we have from Exercise 12.6 that m ::::; n .

Proposition 12.6.5. For each>. E O"(A), the order mA of the nilpotent operator
DA satisfies mA ::::; dim&? (PA).

Proof. By Lemmata 12.6.1 and 12.6.3, the operator DA is nilpotent because


r(DA) = 0. Since, DA = PADA = DAPA, we can consider DA as an opera-
tor on &? (PA)· Thus by Remark 12.6.4, the order mA of DA satisfies mA ::::;
dim&? (PA)· D

Remark 12.6.6. The proposition implies that R(z) is meromorphic, that is, it has
no essential singularities. More precisely, (12.27) becomes

PA m" - 1 Dk oo
~ A ~ k k k+l
R(z) = z->. + ~ (z - >.)k+l + ~ ( - 1) (z - >.) SA . (12.31)
k= l k=O
Therefore the principal part (12.28) becomes
PA m"-l D~
PAR(z) = R(z)PA = z - A+ L
(z - >.)k+l. (12.32)
k= l

Lemma 12.6.7. LetA E O"(A) andy EV. If(>.I-A)y E &?(PA), theny E &?(PA).

Proof. Assume y -:/- O; otherwise the result is trivial. Let v = (AI - A)y. If
v E &? (PA) , then v = PA v . Independence of the projections (Theorem 12.4.2(ii))
implies that Pµ v = 0 whenever µ E O"(A) and µ-:/- >.. Combining this with the fact
that PµA = µPµ + Dµ (12.30) and the fact that Pµ and Dµ commute (12 .26) gives
0 = Pµ(>.I - A)y = >.Pµy - µPµy - Dµy,
486 Chapter 12. Spectral Calculus

which implies
DµPµy = Dµy = (>. - µ)PµY·
Since Dµ is nilpotent, it follows that r(Dµ) = 0, which implies >. = µ (which is
false) or Pµy = 0 . The fact that I= °L::µEa(A) Pµ (see Theorem 12.4.2(iv)) gives

y = l= Pµy = P>,y,
µEa(A)

which implies y E !% (P>-.) D

Remark 12 .6.8. The proof of the previous lemma did not use the definition of
the projections Pµ and the nilpotents Dµ, but rather only the fact that the Pµ are
projections satisfying the basic properties listed in Theorem 12.4.2, and that the
Dµ are nilpotents satisfying commutativity of Pµ with Dµ and satisfying PµA =
µPµ + Dw Thus, the lemma holds for any collection of projections and nilpotents
indexed by the elements of u(A) and satisfying these properties.

Theorem 12.6.9. For each >. E u(A), the generalized eigenspace <ff>, is equal to
!% (P>J·

Proof. We first show that <ff>, c .~ (P>-.). Recall that IC>, C JV ( (>.I - A )k>-),
where k>, = ind(AI - A). Choose x E IC>, so that (>.I - A)k>-- 1 x -=f. 0 . The set
{x, (AI -A)x , ... , (AI -A)k>-- 1 x} is a basis for If>-. by Proposition 12.2.7. It suffices
to show that each basis vector is in ~~ ( P>-.) .
If y = (AI - A)k>-- 1 x, then (M - A)y = 0 E !% (P>-.), which from Lemma
12.6.7 implies y E !% (P>,). Similarly, y E !% (P>-.) implies (AI - A)k>-- 2 x E !% (P>-.) ·
Repeating gives (AI - A)lx E !% (P>-.) for each£. Thus, <ff>, c !% (P>-.) ·
Finally, we note that lFn = E9 !% ( P>,) and P = E9 IC>,. Since IC>, C !% ( P>-.) for
each>. E u(A), it follows that!% (P>,) =IC>,. D

Remark 12.6.10. The previous theorem holds for any collection of projections
and nilpotents satisfying the properties listed in Remark 12.6.8. This is important
for the proof that the spectral decomposition is unique (Theorem 12.7.5).

Example 12.6.11. The eigenprojection P 1 of Example 12.6.2 has rank 3, and


the eigennilpotent D 1 satisfies

o o 1 -1;2]
D2 = g 0 0 0 0 and D 31 -- o·'
1
[0 0 0 0
0 0 0 0

hence, Di has order 3. On the other hand, P1 has rank 1 and D1 has order 1,
that is, D 7 = 0, as we have already seen.
12.6. Spectral Decomposition II 487

Theorem 12.6.12 (Spectral Decomposition Theorem). For each>. E a(A)


let P>. denote the spectral projection associated to >., and let D>. denote the corre-
sponding eigennilpotent of order m>,. The resolvent takes the form

R(z) = L [z -
>.Ea (A)
p>.
), +
m"-
L
k=l
1
D~ l
(z - >.)k+l ' (12.33)

and we have the spectral decomposition

A= L >-P>.+D>. . (12.34)
>.Ea(A)

Proof. Using (12.32) , we have

R(z) = R(z) L
>.Ea(A)
P>. = L
>.Ea(A)
R(z)P>. = L
>.Ea(A) k=l
1
[z ~\ - mf (z -~~)k+l l·
Similarly Lemma 12.6.1 yields

A=A L P>,= L AP>,= L >-P>.+D>. . D


>.Ea(A) >.Ea (A) >.Ea (A)

Example 12.6.13. For the matrix

of Examples 12.4.5, 12.6.2, and 12.6.11 we can show that

L AP>,+ D>, = (lP1 + D1) + 7P7 = A.


>.Ea(A)

Corollary 12.6.14. Let f(z) be holomorphic in an open, simply connected set


containing a(A). For each>. E a(A) let
CXl

f(z) = f(>.) +L an,>.(Z - >.)n


n=l

be the Taylor series representation off near>.. We have

(12.35)
488 Chapter 12. Spectral Calculus

In the case that A is semisimple, (12.35) reduces to


f(A) = L f (>..)P;,. (12.36)
AEa-(A)

Proof. For each),. E O"(A) , let f;, be a small circle around),. that contains no other
points of er( A) and which lies inside of the open, simply connected set U containing
cr(A). For convenience of notation, set ao,>. = f(>..) . By the spectral resolution
formula (Theorem 12.4.6) we have

f(A) =
2 ~i L
>-Ea-(A)
ir>-
f(z)R(z) dz

In the next section we show (Theorem 12.7.6) that (12.35) is not just a way
to compute f(A), but that it is actually the spectral decomposition of f(A) .

Ex ample 12.6.15. For each),. E cr(A), let m;, be the order of D;, .

(i) Given k EN, we compute f(A) = Ak . The binomial formula gives

(12.37)

which converges everywhere, so if M;, = min(k, m;,), t hen

(12 .38)

(ii) Given t E IR, we compute f (A) = eAt. The Taylor expansion of ezt
around).. is

This gives the formula

(12.39)
12.7. Spectral Mapping Theorem 489

12.7 Spectral Mapping Theorem


Recall the semisimple spectral mapping theorem (Theorem 4.3.12), which says that
for any polynomial p E lF[x] and any semisimple A E Mn(lF) the set of eigenvalues of
p(A) is precisely the set {p(.A) I >. E G(A) }. In this section we extend this theorem
in two ways. First, the spectral mapping theorem applies to all matrix operators,
not just semisimple matrices, and second, it holds for all holomorphic functions, not
just polynomials.
The spectral mapping theorem describes the spectrum off (A), but it can be
extended to give a description of the full spectral decomposition off (A). Specifi-
cally, we prove Theorem 12.7.6, which says the spectral decomposition of f(A) has
the form described in the previous section; see Corollary 12.6.14. To prove this gen-
eralization we first prove Theorem 12.7.5, showing that the spectral decomposition
of any matrix operator is unique.
We conclude this section with the power method for finding eigenvectors of
matrices. This important application of the spectral mapping theorem plays a
central role in many applications, including Google's PageRank algorithm.

12.7.1 The Spectral Mapping Theorem

Theorem 12.7.1 (Spectral Mapping Theorem). For any A E Mn(lF), if f(z)


is holomorphic in an open, simply connected set containing G(A) , then G(j(A)) =
f(G(A)). Moreover, if xis an eigenvector of A corresponding to the eigenvalue>.,
then xis also an eigenvector of f(A) corresponding to f(>.) .

Proof. If µ tJ. f(G(A)), then f(z) - µ is both holomorphic in a neighborhood


of G(A) and nonzero on G(A). Hence, f(A) - µI is nonsingular, or, equivalently,
µ tj_ G(j(A)).
Conversely, ifµ E f(G(A)), that isµ= f(>.) for some>. E G(A), then set

f(z) - f(>.)
g(z) = z - >. '
z i= >. ,
{ f'(>.), z = >..

Note that g(z) is holomorphic in a punctured disk around >. and it is continuous
at >., so by Exercise 11.19 it must be holomorphic in a neighborhood of>.. Also,
g(z)(z - >.) = f(z) - µ, and hence g(A)(A - AI) = f(A) - µI.
If x is an eigenvector of A associated to >., then

(f(A) - µI)x = g(A)(A - >.I)x = g(A)O = 0.

Thus f(A) - µI is singular, andµ E G(f(A)) with x as an eigenvector. D


490 Chapter 12. Spectral Calculus

Example 12. 7.2. Consider the matrix

from Examples 12.4.5 and 12.6.13, with a(A) = {1 , 7}. The spectral mapping
theorem allows us to easily determine the eigenvalues of the following matrices:
(i) Let B = Ak for k E z. Since f(z) = zk is holomorphic on an open,
simply connected set containing a(A) , it follows by the spectral mapping
theorem that a(B ) = a(Ak) = {1, 7k} .
(ii) Let B = eAt for t E JR. Since f (z) = ezt is holomorphic on an open,
simply connected set containing O"(A) , it follows by the spectral mapping
theorem that

Example 12.7.3. Define B = sin(A) +tan(A 2 ) + log(A), where A is given in


the previous example. Since f(z) = sin(z) + tan(z 2 ) + log(z) is holomorphic
on an open, simply connected set containing O"(A) , it follows by the spectral
mapping theorem that

a(B) = O"(sin(A) + tan(A 2 ) +log( A))


=sin( a( A))+ tan(a(A) 2 ) + log(O"(A))
= {sin(l) + tan(l), sin(7) +tan( 49) + log(7)}.

Remark 12.7.4. For a less-complicated matrix, like the one found in Example
12.3.4, the spectral mapping theorem may not seem like a big deal, but for more
general matrices the spectral mapping theorem can be extremely helpful. The
spectral mapping theorem also plays an important role in the proof of the power
method for finding eigenvectors and in the proof of the Perron-Frobenius theorem,
presented in the next section.

12.7.2 Uniqueness of the Spectral Decomposition


The next theorem shows that there is really only one collection of projections
and nilpotents satisfying the main properties of the spectral decomposition. As
a simple consequence of this theorem, we show in the next section that for any
holomorphic function f and any matrix operator A, the formula for f (A) given in
Corollary 12.6.14 also gives the spectral decomposition off (A).

Theorem 12.7.5. Given A E Mn(IB'), assume that for every.XE O"(A) there is a
projection Q>. E Mn(IF) and a nilpotent C>. E Mn(IF) satisfying
12.7. Spectral Mapping Theorem 491

(i) Q~ = Q;.. ,
(ii) Q>.Qµ = 0 for allµ E o-(A) withµ#- .\

(iii) Q;..C;.. = C;..Q;.. = C;..,


(iv) QµC>. = C;..Qµ = 0 for allµ E o-(A) withµ#->.,

(v) L>.Eo-(A) Q;.. =I.


Assume further that
A = L >.Q;..+C;... (12 .40)
>.Eo-(A)
In this case, for each>. E o-(A) , we have
(i) Q;.. is the eigenprojection P;.. and
(ii) C;.. is the eigennilpotent D;...

Proof. For every µ E o-(A), the relation (12.40) implies

AQµ = L (>.Q;.. + C;..)Qµ = µQµ + Cµ-


>.Eo-(A)

This gives Cµ = (A - µI)Qµ- Hence, it suffices to show that Pµ = Qµ for every


µ E a(A).
Left multiplying (12.40) by Q;.. gives

Q;..A = >.Q;.. + C;..;


therefore Remark 12.6.10 shows that Theorem 12.6.9 applies, and~ (Qµ) =cf!µ for
every µ E o-(A). Thus, for any v E V we have Pµ v = QµPµ v. Hence, for any
>. E a(A) we have

Q;..v = L Q;..Pµv = Q;..P;..v = P;..v.


µEo-(A)

Therefore, Q;.. = P;.. for all>. E a(A). D

The following theorem is an easy corollary of the uniqueness of the spectral


decomposition. It explicitly gives the spectral decomposition of f (A) in terms of
the spectral decomposition of A.

Theorem 12.7.6 (Mapping the Spectral Decomposition). Let A E Mn(lF)


and f(z) be holomorphic on an open, simply connected set U containing a(A). For
each>. E o-(A) let f(z) = L~=oan,>.(z - >.)n be the Taylor series representation of
f near>. (with ao,>. = f (>.)) . The expression
492 Chapter 12. Spectral Calculus

given in Corollary 12.6.14 is the spectral decomposition of f(A) . That is, for each
v E O"(j(A)), the eigenprojection P 11 for f(A) is given by

P11 == L Pµ (12 .41)


µE o- (A)
f(µ) = 11

and the corresponding eigennilpotent D 11 is given by


m,..

L L: ak,µD~. (12.42)
µEo-(A) k = l
f(µ)=r.1

Proof. By the spectral mapping theorem the spectrum of f(A) is f(O'(A)) .


Corollary 12.6.14 says that the projections P11 defined by (12.41) and the nilpo-
tents D 11 defined by (12 .42) satisfy (12.40) . The other conditions of Theorem 12.7.5
clearly hold; hence these must be the unique eigenprojections and eigennilpotents,
respectively, off (A). D

Ex ample 12.7.7. Theorem 12.7.6 gives the spectral decomposition of A- 1 as

(12.43)

Thus, if A is the matrix from Examples 12.4.5 and 12.6.13, then we have that

( 12.44)

so the 1-eigennilpotent of A- 1 is -D 1 +Di and the ~ -eigennilpotent is 0. It


is easy to check that the decomposition (12.44) adds up to give

-~71
- 21 63
7 -21
0 7 -3 .
0 0 1

12.7.3 The Power Method


The power method is another important application of the spectral decomposition
formula. If an eigenvalue has its generalized eigenspace equal to its eigenspace-that
is, its algebraic and geometric multiplicity are the same- we say it is a semisimple
eigenvalue. If an operator has a sernisimple eigenvalue whose modulus is larger
than all the others, the power method gives a way to compute an eigenvector of
that dominant eigenvalue.
12.7. Spectral Mapping Theorem 493

More precisely, if A E Mn(lF) and ,\ E CT (A) is semisimple, then If>. is the


eigenspace of,\ and Theorem 12.6.9 guarantees that !% (P>.) = <ff>.. Therefore, if
x E JFn satisfies P>.x "I 0, then P>.x is an eigenvector of A associated to ,\. If ,\ is
a dominant semisimple eigenvalue, the power method gives a way to compute the
eigenvector P>.x.
Here we prove the power method for a dominant semisimple eigenvalue of
modulus 1. Exercise 12.34 extends this to dominant semisimple eigenvalues with
arbitrary modulus.

Theorem 12.7.8. For A E Mn(lF), assume that 1 E CT(A) is a semisimple eigen-


value and that there exists 0 :::; rJ < 1 such that l>-1 < rJ for all other eigenvalues
in CT(A) . If P is the eigenprojection corresponding to ,\ = 1 and v E JFn satisfies
Pv "I 0, then ask---+ oo we have

Proof. By the spectral resolution theorem (Theorem 12.4.6), we have

Ak = ~ j zk R(z)dz,
2ni Jr
where r is a positively oriented circular contour centered at zero having radius
greater than 1. Using the residue theorem (Theorem 11.7.13) , we have that

(12.45)

where r 'r/ is a positively oriented circular contour centered at zero and having radius
rJ, and f 1 is a small positively oriented circular contour centered at 1 and having
radius 1 - rJ.
From (12 .37), we have that

The last equality follows because the eigenvalue ,\ = 1 is semisimple, and thus
A _ 1 = P and A-e-i = 0 for each C 2 1. Combining this with (12.45) we have

Ak - P =~ j zkR(z)dz.
2ni Jr.,,
For any given operator norm II · II, following the proof of Corollary 12.4.7 gives

[[Ak - P[[ = 112~i i.,, znR(z)dzll:::; 2~2nCrJk = CrJk,


where C = sup [[R(z)[[. Thus, ask---+ oo we have
z Er.,,
494 Chapter 12. Spectral Calculus

Example 12. 7.9. Consider the matrix

1/5 2/5
2/5]
A= 3/5 1/5 2/5 .
[ 1/5 2/5 1/5

It is easy to show that o-(A) = {1, -1/5} , with the second eigenvalue having
algebraic multiplicity two. Therefore, the eigenvalue 1 is simple. For any xo
with Px 0 -=/= 0, semisimplicity of the eigenvalue >. = 1 means that Pxo is an
eigenvector of A with eigenvalue L
The iterative map Xk = Axk--1 gives

llxk - Pxoll = llAkxo - Pxoll :S llAk - Pllllxoll---+ 0 ask---+ oo.


Hence, the eigenvector Pxo is given by limk-+oo Xk·
After ten steps of applying this method to xo = [1 1 1] T in double-
precision floating-point arithmetic,, the sequence stabilizes to

X10 = [LO 1.16666667 0.83333333]T,

matching the correct answer Pxo = [1


0 7/6 5/6]T (essentially) perfectly.

12.8 The Perron-Frobeniius Theorem


In this section we consider matrix operators whose entries are all nonnegative or all
positive. Nonnegative and positive matrices arise in applications such as Markov
chains in probability theory, compartmental models in differential equations, and
Google's PageRank algorithm for information retrieval. In all these applications,
the spectral properties of the matrices are important. Some of the most important
spectral information for these applications comes from the theorems of Perron and
Frobenius, which we describe here.

12.8.1 Perron's Theorem

Definition 12.8.1. A matrix A E Mn(lR) is called nonnegative, denoted A ~ 0,


if every entry is nonnegative. It is called positive, denoted A >- 0, if every entry
is positive.

Remark 12.8.2. Sometimes it can also be useful to use the notation B ~A when
B - A ~ 0, and B >- A when B - A >- 0.

Remark 12.8.3. If A >- 0, then Ak >- 0 for all k E ;z;+, simply because every entry
is a sum of products of strictly positive numbers. This implies that A is not nilpo-
tent, and therefore, by Lemma 12.6.~:, the spectral radius r(A) must be positive.
12.8. The Perron-Frobenius Theorem 495

Theorem 12.8.4. A nonnegative matrix $A \succeq 0$ has an eigenvalue $\lambda$ equal to its
spectral radius $r(A)$, and $\lambda$ has a nonnegative eigenvector.

Proof. By Theorems 12.3.8 and 12.3.14 the Laurent series
$$R(z) = \frac{1}{z}I + \frac{1}{z^2}A + \frac{1}{z^3}A^2 + \cdots \qquad (12.46)$$
converges for all $z$ with $|z| > r(A)$.


By way of contradiction, assume that $\lambda = r(A)$ is not an eigenvalue of $A$. This
implies the value of $\lim_{z\to\lambda} R(z)$ is finite, so the series
$$R(\lambda) = \frac{1}{\lambda}I + \frac{1}{\lambda^2}A + \frac{1}{\lambda^3}A^2 + \cdots$$
converges as well. Since $A^n \succeq 0$ for every $n > 0$, the convergence is componentwise
absolute. Writing $z = 1/w$, the power series $wI + w^2A + w^3A^2 + \cdots$ converges
uniformly on the compact disk $\overline{B(0, 1/\lambda)}$, and hence $R(z) = z^{-1}I + z^{-2}A + z^{-3}A^2 + \cdots$
converges uniformly on the closed set $\{|z| \ge \lambda\}$. Thus the resolvent set includes
the entire circle $|z| = \lambda$, and no eigenvalue lies on the circle. Hence $r(A) < \lambda$,
a contradiction.
To see that a nonnegative eigenvector of $\lambda$ exists, note that by (12.33) we have
$$R(z)D_\lambda^{m_\lambda - 1} = \sum_{\mu\in\sigma(A)}\left[\frac{P_\mu}{z-\mu} + \sum_{k=1}^{m_\mu - 1}\frac{D_\mu^k}{(z-\mu)^{k+1}}\right]D_\lambda^{m_\lambda - 1} = \frac{D_\lambda^{m_\lambda - 1}}{z - \lambda}, \qquad (12.47)$$
where $m_\lambda$ is the order of the eigennilpotent $D_\lambda$ (define $D_\lambda^{m_\lambda - 1} = P_\lambda$ if $m_\lambda = 1$).
But $R(z) = (zI - A)^{-1}$, so (12.47) implies that
$$(zI - A)D_\lambda^{m_\lambda - 1} = (z - \lambda)D_\lambda^{m_\lambda - 1},$$
which gives
$$A D_\lambda^{m_\lambda - 1} = \lambda D_\lambda^{m_\lambda - 1}. \qquad (12.48)$$
Hence, the nonzero columns of $D_\lambda^{m_\lambda - 1}$ are eigenvectors of $\lambda$. By (12.31) we also have
$$D_\lambda^{m_\lambda - 1} = \lim_{z\to\lambda}(z - \lambda)^{m_\lambda}R(z).$$
But for $|z| > \lambda = r(A)$, the resolvent $R(z)$ can be written (see Theorem 12.3.8) as
$R(z) = \sum_{k=0}^{\infty} A^k/z^{k+1}$. Since $A \succeq 0$ we also have $\lim_{z\to\lambda^+}(z - \lambda)^{m_\lambda}R(z) \succeq 0$. $\Box$

The next theorem shows that for positive matrices the real eigenvalue equal
to its spectral radius is simple and the eigenvector is componentwise positive. We
will need t he following simple lemma in the proof.

Lemma 12.8.5. If B E Mn(lR) is nonnegative and has a nonzero entry on the


diagonal, then B is not nilpotent.

Proof. Let $B, C \in M_n(\mathbb{R})$ with $B, C \succeq 0$. Assume there exists some $k$ such that
$b_{kk} > 0$ and $c_{kk} > 0$. The $(k,k)$ entry of $BC$ is
$$\sum_{i=1}^{n} b_{ki}c_{ik} = b_{kk}c_{kk} + \sum_{i \neq k} b_{ki}c_{ik}, \qquad (12.49)$$
but $b_{kk}c_{kk}$ is strictly positive, and the remaining terms are all nonnegative, so the
$(k,k)$ entry of $BC$ is strictly positive.
If $b_{kk} > 0$, then by induction, using (12.49) with $C = B^{m-1}$, the $(k,k)$ entry
of $B^m = BB^{m-1}$ is also strictly positive for all $m \in \mathbb{Z}^+$; hence $B^m \neq 0$. $\Box$

Theorem 12.8.6 (Perron). A positive matrix $A \succ 0$ has a simple (multiplicity
one) eigenvalue $\lambda$ equal to its spectral radius $r(A)$. In addition, $\lambda$ has a positive
eigenvector. Moreover, all other eigenvalues of $A$ are smaller in modulus than $r(A)$.

Proof. Assume $A \succ 0$. Let $v_1$ be a nonnegative eigenvector belonging to the
eigenvalue $\lambda = r(A)$. Note that $Av_1 \succ 0$, and since $Av_1 = \lambda v_1$ and $\lambda > 0$,
it follows that $v_1 \succ 0$ as well. From (12.48), we know each nonzero column of
$D_\lambda^{m_\lambda - 1}$ is a nonnegative eigenvector, and hence is positive. By Lemma 12.8.5 this
contradicts the nilpotency of $D_\lambda$. Hence, $D_\lambda = 0$, and the generalized eigenspace
$\mathscr{G}_\lambda$ is spanned by eigenvectors of $\lambda$.
Since $A$ and $\lambda$ are both real, the kernel of $(A - \lambda I)$ has a basis of real vectors.
Suppose that $v_2$ is an element of this basis that is linearly independent from $v_1$, and
assume that $v_2$ has at least one negative entry (if not, multiply by $-1$). For every $t$,
the vector $tv_1 + v_2$ is also an eigenvector with eigenvalue $\lambda$. Since $v_1 \succ 0$, there must
be some real $t$ such that $tv_1 + v_2 \succeq 0$ and such that at least one coordinate of $tv_1 + v_2$
is zero. This is a contradiction, since nonnegative eigenvectors of
positive matrices are positive. Hence, the dimension of the $\lambda$-eigenspace is 1, and $\lambda$
is a simple eigenvalue.
Finally, we show that $A$ has no other eigenvalues on the circle $|z| = r(A)$.
Choose $\varepsilon > 0$ small enough that $A - \varepsilon I \succeq 0$. By definition of eigenvalue (or by
the spectral mapping theorem), $\lambda - \varepsilon$ is the largest positive eigenvalue of $A - \varepsilon I$.
It follows from Theorem 12.8.4 that $r(A - \varepsilon I) = \lambda - \varepsilon$. Again by the definition of
eigenvalue we have that $\sigma(A) = \sigma(A - \varepsilon I) + \varepsilon \subset \overline{B(\varepsilon, \lambda - \varepsilon)}$. Thus $\lambda \in \overline{B(\varepsilon, \lambda - \varepsilon)}$
is the only eigenvalue on the circle $|z| = \lambda$; see Figure 12.4. $\Box$

Remark 12.8.7. For any nonnegative matrix $A$, if $a$ is the smallest diagonal entry
of $A$, then the end of the previous proof (taking $\varepsilon = a$) shows that $\sigma(A) \subset
\overline{B(a, r(A) - a)}$. Thus, in the case of a nonnegative matrix with positive diagonal,
we still have the conclusion that the eigenvalue $\lambda = r(A)$ (often called the Perron
root or Perron-Frobenius eigenvalue) is the only eigenvalue on the circle $|z| = r(A)$.

Example 12.8.8. The positive matrix
$$A = \begin{bmatrix} 1/5 & 2/5 & 2/5 \\ 3/5 & 1/5 & 2/5 \\ 1/5 & 2/5 & 1/5 \end{bmatrix}$$


Figure 12.4. Illustration to assist in the proof of Theorem 12.8.6. Every
eigenvalue of a positive matrix $A$ lies in $\overline{B(\varepsilon, \lambda - \varepsilon)}$, where $\varepsilon$ is chosen small enough
that $A - \varepsilon I \succeq 0$. The Perron root $\lambda = r(A)$ is the only point in $\overline{B(\varepsilon, \lambda - \varepsilon)}$ on the
circle $|z| = \lambda$.

of Example 12.7.9 has spectrum $\sigma(A) = \{1, -1/5\}$, and the eigenvalue $-1/5$
has multiplicity two, so the eigenvalue 1 (the Perron root) is simple and is equal
to the spectral radius $r(A) = 1$. Example 12.7.9 also shows that the corresponding
eigenvector is $[1 \quad 7/6 \quad 5/6]^T$, which is strictly positive, as guaranteed
by Perron's theorem.

12.8.2 Perron-Frobenius Theorem


Frobenius extended the uniqueness result of Perron's theorem to a large class of
nonnegative matrices called irreducible.

Definition 12.8.9. A nonnegative matrix $A \succeq 0$ is called primitive if there is a
$k$ such that $A^k \succ 0$. We say that $A$ is irreducible if for any $i, j$ there is a $k \in \mathbb{Z}^+$
such that the $(i,j)$ entry of $A^k$ is positive.

Proposition 12.8.10. If $A \succeq 0$ is irreducible, then $I + A$ is primitive.

Proof. The proof is Exercise 12.38. $\Box$

Theorem 12.8.11 (Perron-Frobenius). A nonnegative irreducible matrix $A \succeq 0$
has a simple eigenvalue $\lambda$ equal to its spectral radius $r(A)$, and $\lambda$ has a positive
eigenvector.

Proof. We know from Theorem 12.8.4 that $A$ has an eigenvalue $\lambda$ equal to its
spectral radius $r(A)$ and that $\lambda$ has a nonnegative eigenvector. What remains
to be shown is that $\lambda$ is simple and the nonnegative eigenvector is positive. By
Proposition 12.8.10 the matrix $I + A$ is primitive. We define $B = (I + A)^k$,
where $k \in \mathbb{Z}^+$ is chosen to be large enough so that $B \succ 0$. It follows from the
spectral mapping theorem that $\lambda \in \sigma(A)$ if and only if $(1 + \lambda)^k \in \sigma(B)$, and thus
the algebraic multiplicity of $\lambda$ is equal to the algebraic multiplicity of $(1 + \lambda)^k$.
Observe that
$$r(B) = \max_{\lambda\in\sigma(A)} |(1 + \lambda)^k| = \max_{\lambda\in\sigma(A)} |1 + \lambda|^k = \Big\{\max_{\lambda\in\sigma(A)} |1 + \lambda|\Big\}^k = (1 + r(A))^k$$
because when the disk $|z| \le r(A)$ is translated one unit to the right, the point of
maximum modulus is $z = 1 + r(A)$. Since $B$ is a positive matrix, the algebraic
multiplicity of the eigenvalue at $r(B)$ is one, and so the algebraic multiplicity of $\lambda$
must also be one.
Finally, let $v$ be the nonnegative eigenvector of $A$ corresponding to $\lambda = r(A)$.
Since $v = [v_1, \ldots, v_n]^T$ is not identically zero, there exists an $i$ with $v_i \neq 0$. For
each $j \in \{1, \ldots, n\}$ there exists a $k$ such that the $(j,i)$ entry of $A^k$ is positive (since
$A$ is irreducible). This implies that the $j$th entry of $A^k v$ is positive. But we have
$A^k v = \lambda^k v$, and thus $v_j > 0$. Since this holds for every $j$, we have $v \succ 0$. $\Box$

12.8.3 Google's PageRank Algorithm


Google searches are useful because they not only return a list of web pages containing
your results, but they also rank the results. The PageRank algorithm is one of the
key tools used to determine the rank of each website.47
The basic idea of the algorithm is to simulate web traffic mathematically and
then rank pages by the percentage of traffic they get. Consider the following toy
model of the Internet, consisting of only four pages. The arrows in the diagram
correspond to the links from one page to another.

Assume that at each time step a user moves to a new page by clicking on a link
(selected with equal likelihood). Let $A = [a_{ij}] \in M_4(\mathbb{R})$, where $a_{ij}$ is the probability
that a user at page $j$ will click on the link for page $i$. At the first page a user will
click on pages two or four with equal probability; hence, $a_{21} = a_{41} = 0.5$. A user
at page two will click on page one with probability one; hence, $a_{12} = 1$. The third
page has two links, going to pages two and four. Thus, $a_{23} = a_{43} = 0.5$.
47 This algorithm is named after Larry Page, one of Google's founders.

The fourth page presents a problem, because it has no outbound links. Instead
of setting all the corresponding entries to 0, we assume the user will randomly
"teleport" to another page (all with equal probability), so $a_{i4} = 0.25$ for each $i$.
Putting this all together, we have
$$A = \begin{bmatrix} 0 & 1 & 0 & 0.25 \\ 0.5 & 0 & 0.5 & 0.25 \\ 0 & 0 & 0 & 0.25 \\ 0.5 & 0 & 0.5 & 0.25 \end{bmatrix}.$$

If $e_k \in \mathbb{R}^4$ is the $k$th standard basis vector, then $Ae_k$ is the $k$th column of $A$, the
vector describing the probability that a user starting at page $k$ will move to each of
the other pages. If the $k$th entry of $x \in \mathbb{R}^4$ is the percentage of all users currently
on page $k$, then the product $Ax$ describes the expected percentage of users that will
be on each of the pages after the next step. Repeating the process, $A^2x$ describes
the percentage of traffic that will be at each page after two steps, and $A^kx$ the
percentage of traffic at each page after $k$ steps.
Notice that $A$ is nonnegative and every column sums to 1, so $A$ has a left
eigenvector $l = \mathbf{1}^T$ of all ones, with corresponding eigenvalue 1. But we also have
$\|A\|_1 = 1$, so Lemma 12.3.13 implies that $r(A) \le 1$; hence $r(A) = 1$.
A right eigenvector r corresponding to the eigenvalue 1 satisfies Ar = r. If
r is scaled so that its entries are nonnegative and sum to 1, then it represents the
distribution of traffic in a steady state, where the overall distribution of traffic at
the next time step Ar is the same as the distribution r at the current time. If the
eigenvalue 1 has a one-dimensional eigenspace, then there is a unique nonnegative
choice of r whose entries sum to 1, and the percentage rk of traffic at the kth page
is a reasonable indicator of how important that page is.
The same logic applies to an arbitrary number n of pages with any config-
uration of links. Again construct a matrix A E Mn (IF) corresponding to the link
probabilities, following the approach described above. Since A is nonnegative and
its columns sum to 1, the previous argument shows that r(A) = 1, and if the
eigenvalue 1 is simple, then there is a unique nonnegative right eigenvector r whose
entries sum to l. This eigenvector gives the desired ranking of pages.
The one remaining problem is that the eigenvalue 1 is not necessarily simple.
To address this , the PageRank algorithm assumes that at each time step a percent-
age c < 1 of users follow links in A and the remaining percentage 1 - c > 0 teleport
to a random web page. If all web pages are equally likely for those who teleport,
the new matrix of page-hit probabilities is given by
$$B = cA + \frac{1-c}{n}E, \qquad (12.50)$$
where $E \in M_n(\mathbb{R})$ is the matrix of all ones, that is, $E = \mathbf{1}\mathbf{1}^T$. All of the columns
of $B$ sum to one, and applying the same argument used above gives $r(B) = 1$.
Depending on the choice of c, the matrix B might be a more realistic model
for Internet traffic than A is, since some users really do just move to a new page
without following a link. But another important advantage of B over A is that
B >- 0, so Perron's theorem (Theorem 12.8.6) applies, guaranteeing the eigenvalue
1 is simple. Thus, there is a unique positive right eigenvector r whose entries
sum to 1, and we can use r to rank the importance of web pages. Moreover,

since the Perron root 1 is simple, the power method (see Theorem 12.7.8) guaran-
tees that for every nontrivial initial choice x 0 ~ 0, the sequence xo, Axo, A2 xo, .. .
converges to r.
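To make the procedure concrete, here is a minimal sketch (not from the text) of the PageRank iteration on the four-page example above, using NumPy; the damping value $c = 0.85$ is an assumption, chosen only for illustration.

```python
import numpy as np

# Link matrix A from the four-page toy model (columns sum to 1).
A = np.array([[0.0, 1.0, 0.0, 0.25],
              [0.5, 0.0, 0.5, 0.25],
              [0.0, 0.0, 0.0, 0.25],
              [0.5, 0.0, 0.5, 0.25]])

n = A.shape[0]
c = 0.85                                     # damping factor (illustrative choice)
B = c * A + (1 - c) / n * np.ones((n, n))    # B = cA + (1-c)/n * E

# Power iteration: start from the uniform distribution and apply B repeatedly.
r = np.full(n, 1.0 / n)
for _ in range(100):
    r = B @ r

print(r)          # steady-state (PageRank) vector; entries sum to 1
print(r.sum())
```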

Vista 12.8.12. Probability matrices like those considered above are examples
of Markov chains, which have widespread applications. Both Perron's theorem
and the power method are key tools in the study of Markov chains.

12.9 The Drazin Inverse

The Moore-Penrose inverse $A^\dagger$ (defined in Theorem 4.6.1) of a square matrix $A$
satisfies $AA^\dagger = \operatorname{proj}_{\mathscr{R}(A)}$ and $A^\dagger A = \operatorname{proj}_{\mathscr{R}(A^H)}$. In other words, the products
$AA^\dagger$ and $A^\dagger A$ are geometrically the nearest possible approximations to $I$. But
in settings where we are interested in multiplicative and spectral properties, rather
than geometric properties, the Moore-Penrose inverse is not usually the best choice.
In this section we describe a different generalized inverse, called the Drazin
inverse, that behaves much better with respect to spectral and multiplicative properties.
In some sense, the only obstruction to inverting a matrix is the existence of
zero eigenvalues. Thus, instead of using a projection to the range of $A$ or $A^H$, as in
the Moore-Penrose inverse, the Drazin inverse uses a projection to the generalized
eigenspaces associated to nonzero eigenvalues.
The Drazin inverse has many applications in dynamical systems, Markov
chains, and control theory, among others. It also is an important theoretical concept
in the study of Krylov subspaces and the corresponding results in numerical
linear algebra. In the next chapter we show how the Drazin inverse connects to the
Arnoldi and GMRES algorithms, which are used to solve large sparse linear algebra
problems.

12.9.1 Definition and Spectral Decomposition

Definition 12.9.1. For $A \in M_n(\mathbb{F})$ define $P_* = I - P_0$, where $P_0$ is the
$0$-eigenprojection in the spectral decomposition
$$A = \sum_{\lambda\in\sigma(A)} \lambda P_\lambda + D_\lambda. \qquad (12.51)$$

The projections $P_0$ and $P_*$ commute with $A$, and so $\mathscr{R}(P_0)$ and $\mathscr{R}(P_*)$ are
both $A$-invariant. We can write $\mathbb{F}^n = \mathscr{R}(P_0) \oplus \mathscr{R}(P_*)$, and since $I = P_0 + P_*$ we
have
$$A = AP_0 + AP_* = D_0 + AP_*. \qquad (12.52)$$

The decomposition (12.52) is sometimes called the Wedderburn decomposition of $A$.



Definition 12.9.2. Let $A \in M_n(\mathbb{F})$ have spectral decomposition (12.51), and let
$P_*$ be as in Definition 12.9.1. Let $C$ denote the restriction of $A$ to $\mathscr{R}(P_*)$. The
operator $C$ has no zero eigenvalues, so it is invertible. Define the Drazin inverse
$A^D$ of $A$ to be the operator that agrees with $C^{-1}$ on $\mathscr{R}(P_*)$ and is zero on
$\mathscr{R}(P_0)$; that is, $A^D x = C^{-1}P_* x$ for every $x \in \mathbb{F}^n$.
Remark 12.9.3. When A is invertible, then Po = 0 and P* = I, so we have


AD= A- 1.

Remark 12.9.4. Since $\mathbb{F}^n = \mathscr{R}(P_*) \oplus \mathscr{N}(P_*)$, we may choose a basis such that
the operator $A$ can be written in block form as
$$A = S^{-1}\begin{bmatrix} M & 0 \\ 0 & N \end{bmatrix} S, \qquad (12.53)$$
where $S$ is the change of basis matrix that block diagonalizes $A$ into complementary
subspaces, $N$ is a nilpotent block of order $k$, and the block $M$ is the matrix
representation of $C$ in this basis. Thus, we have
$$AP_* = S^{-1}\begin{bmatrix} M & 0 \\ 0 & 0 \end{bmatrix} S, \qquad
D_0 = S^{-1}\begin{bmatrix} 0 & 0 \\ 0 & N \end{bmatrix} S, \qquad
A^D = S^{-1}\begin{bmatrix} M^{-1} & 0 \\ 0 & 0 \end{bmatrix} S.$$
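The block description above is easy to experiment with numerically. The following is a small sketch (not from the text); the blocks $M$, $N$ and the change of basis $S$ are arbitrary illustrative choices, and the checks at the end are the defining properties listed in Proposition 12.9.9 below.

```python
import numpy as np

# Illustration of Remark 12.9.4 with arbitrary (made-up) blocks:
# M invertible, N nilpotent, S an invertible change of basis.
M = np.array([[2.0, 1.0],
              [0.0, 3.0]])          # invertible block (no zero eigenvalues)
N = np.array([[0.0, 1.0],
              [0.0, 0.0]])          # nilpotent block (N^2 = 0)
S = np.array([[1.0, 0.0, 1.0, 0.0],
              [0.0, 1.0, 0.0, 1.0],
              [0.0, 0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0, 1.0]])
S_inv = np.linalg.inv(S)

def block_diag(X, Y):
    Z = np.zeros((X.shape[0] + Y.shape[0], X.shape[1] + Y.shape[1]))
    Z[:X.shape[0], :X.shape[1]] = X
    Z[X.shape[0]:, X.shape[1]:] = Y
    return Z

A  = S_inv @ block_diag(M, N) @ S
AD = S_inv @ block_diag(np.linalg.inv(M), np.zeros_like(N)) @ S

# Check the defining properties of the Drazin inverse (Proposition 12.9.9),
# here with index k = 2 since N^2 = 0.
print(np.allclose(A @ AD, AD @ A))                       # (i)
print(np.allclose(np.linalg.matrix_power(A, 3) @ AD,
                  np.linalg.matrix_power(A, 2)))         # (ii)
print(np.allclose(AD @ A @ AD, AD))                      # (iii)
```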

The spectral decomposition of the Drazin inverse is just like the spectral
decomposition for the usual inverse given in (12.43), except that the terms that would
have corresponded to the eigenvalue 0 are missing, as shown in the next theorem.

Theorem 12.9.5. For $A \in M_n(\mathbb{F})$ with spectral decomposition (12.51), the Drazin
inverse $A^D$ can be written as
$$A^D = \sum_{0 \neq \lambda\in\sigma(A)} \left( \frac{1}{\lambda}P_\lambda + \sum_{\ell=1}^{m_\lambda - 1} \frac{(-1)^\ell D_\lambda^\ell}{\lambda^{\ell+1}} \right), \qquad (12.54)$$
where each $m_\lambda$ denotes the algebraic multiplicity of the eigenvalue $\lambda \in \sigma(A)$.

Proof. Write the spectral decomposition of $C$ as
$$C = \sum_{\lambda\in\sigma(C)} \lambda P_{C,\lambda} + D_{C,\lambda},$$
and observe that $\sigma(C) = \sigma(A) \setminus \{0\}$. Using (12.43) gives the spectral decomposition
of $C^{-1}$ as
$$C^{-1} = \sum_{\lambda\in\sigma(C)} \left( \frac{1}{\lambda}P_{C,\lambda} + \sum_{\ell=1}^{m_\lambda - 1} \frac{(-1)^\ell D_{C,\lambda}^\ell}{\lambda^{\ell+1}} \right).$$

By the definition of $C$, we have that $P_{C,\lambda} \circ P_* = P_\lambda$ and $D_{C,\lambda} \circ P_* = D_\lambda$ for every
$\lambda \in \sigma(C)$, and this implies that $A^D = C^{-1}P_*$ is given by (12.54). $\Box$
Example 12.9.6. Consider the matrix

The eigenprojection and eigennilpotent for >. = 1 are

10 01 00 -927] 3
0 0 3
o 0 -27]
9
and Di=
[0 0 0 0
Pi= 0 0 1 3
[0 0 0
0 0 0
0
0
.

It is easy to check that $A = P_1 + D_1$. By (12.54) the Drazin inverse is
$A^D = P_1 - D_1 + D_1^2$, which turns out to be

AD=
1
0
[0
0
-3
1
0
0
9
-3
1
0
-18
81

0
3 .
l
It is straightforward to verify that $A^D A = AA^D = P_1$ and $A^D A A^D = A^D$.

Example 12.9.7. For the matrix $A$ from Example 12.9.6 we can use the
Wedderburn decomposition to compute $A^D$ and show that we get the same
answer as we got from the spectral decomposition in that example. For details,
see Exercise 12.43.

12.9.2 Alternative Characterizations

Theorem 12.9.8. Let $A \in M_n(\mathbb{F})$ be a singular matrix. If $\Gamma$ is a positively oriented
simple closed contour that contains all of the eigenvalues of $A$ except the origin, then
$$A^D = \frac{1}{2\pi i}\oint_{\Gamma} \frac{1}{z}\,R(z)\,dz. \qquad (12.55)$$
In other words, (12.55) is an alternative definition of the Drazin inverse.

Proof. Note that the function $f(z) = 1/z$ has Taylor series around $\lambda \neq 0$ equal to
$$\frac{1}{z} = \sum_{\ell=0}^{\infty} \frac{(-1)^\ell}{\lambda^{\ell+1}}(z - \lambda)^\ell.$$
By the same argument as the proof of Corollary 12.6.14, we have that
$$\frac{1}{2\pi i}\oint_{\Gamma} \frac{1}{z}\,R(z)\,dz = \sum_{0\neq\lambda\in\sigma(A)}\left(\frac{1}{\lambda}P_\lambda + \sum_{\ell=1}^{m_\lambda-1}\frac{(-1)^\ell D_\lambda^\ell}{\lambda^{\ell+1}}\right) = A^D. \qquad \Box$$

Our definition of the Drazin inverse is somewhat unusual. It is more tradi-


tional, but perhaps less intuitive, to define the Drazin inverse to be the unique
operator satisfying three particular properties. These three properties are listed in
the following proposition. We prove that these two approaches are equivalent by
showing that AD satisfies the three properties and that these properties uniquely
identify AD.

Proposition 12.9.9. For $A \in M_n(\mathbb{F})$ with index $\operatorname{ind}(A) = k$, the Drazin inverse
$A^D$ of $A$ satisfies
(i) $AA^D = A^D A$,
(ii) $A^{k+1}A^D = A^k$, and
(iii) $A^D A A^D = A^D$.

Proof. Writing $A$ and $A^D$ as in Remark 12.9.4, we have
$$AA^D = S^{-1}\begin{bmatrix} M & 0 \\ 0 & N \end{bmatrix}\begin{bmatrix} M^{-1} & 0 \\ 0 & 0 \end{bmatrix}S = S^{-1}\begin{bmatrix} I & 0 \\ 0 & 0 \end{bmatrix}S = P_*.$$
Similarly, we have
$$A^D A = S^{-1}\begin{bmatrix} M^{-1} & 0 \\ 0 & 0 \end{bmatrix}\begin{bmatrix} M & 0 \\ 0 & N \end{bmatrix}S = S^{-1}\begin{bmatrix} I & 0 \\ 0 & 0 \end{bmatrix}S = P_*.$$
This establishes (i).

Given $k = \operatorname{ind}(A)$, we have
$$A^{k+1}A^D = A^k(AA^D) = A^kP_* = A^k(I - P_0) = A^k - A^kP_0 = A^k,$$

since $A^kP_0 = 0$. This establishes (ii). Finally, since $A^D P_0 = 0$, we have
$$A^D A A^D = A^D P_* = A^D(I - P_0) = A^D - A^D P_0 = A^D.$$
This establishes (iii). $\Box$

Proposition 12.9.10. The three properties of the previous proposition uniquely


determine AD .

Proof. Suppose there exists B E Mn (JF) such that


(i) AB = BA,
(ii) Ak+ 1 B = Ak, and

(iii) BAB= B .

Since I= P* +Po, it suffices to show that BP* =ADP* and BP0 =AD Po= 0.
From (i) and the fact that Po and P* commute with A , we have

and thus
AP*BP0 Ak-·l = P*BP0 Ak = 0.
Since A is invertible on fl (P*), this gives

P*BPoAk-i = 0.
Using the argument again, we have

which yields

Continuing inductively gives

and BPo = PoBPo.


A similar argument gives

and

Combining these results gives

and PoB =BPo.

By (ii) we have

Ak+lp*B = Ak+lBP* = Akp* = Ak+lADP* = Ak+lp*AD.


This implies that

since A is invertible on~ (P*).


Finally, (iii) gives
BPo = PoB = B 2 P0 A.
Multiplying on the right by Ak- l gives

Continuing inductively, we find


BPo = 0,
as required. D

12.9.3 Application to Differential Equations

Consider the first-order linear differential equation
$$Ax'(t) = Bx(t), \qquad A, B \in M_n(\mathbb{C}),\quad x \in C^1(\mathbb{R}, \mathbb{F}^n). \qquad (12.56)$$
If $A$ is invertible, then the general solution is given by $x(t) = e^{A^{-1}Bt}q$, where $q \in \mathbb{F}^n$
is arbitrary. However, if $A$ is singular, then the general solution is more complicated.
Assume that there exists $\lambda \in \mathbb{F}$ such that $(\lambda A + B)^{-1}$ exists. Let
$$\tilde{A} = (\lambda A + B)^{-1}A \quad\text{and}\quad \tilde{B} = (\lambda A + B)^{-1}B.$$
Since $\lambda\tilde{A} + \tilde{B} = I$, we have
$$\tilde{A}\tilde{B} = \tilde{A}(I - \lambda\tilde{A}) = (I - \lambda\tilde{A})\tilde{A} = \tilde{B}\tilde{A}.$$
Multiplying both sides of (12.56) by $(\lambda A + B)^{-1}$, we rewrite the system as
$$\tilde{A}x'(t) = \tilde{B}x(t). \qquad (12.57)$$
Let $P_* = \tilde{A}^D\tilde{A} = \tilde{A}\tilde{A}^D$ be the projection of Definition 12.9.1 for the matrix $\tilde{A}$, and
let $P_0$ and $D_0$ be the $0$-eigenprojection and $0$-eigennilpotent, respectively, of $\tilde{A}$. We
show below that the general solution is $x(t) = e^{\tilde{A}^D\tilde{B}t}P_*q$, where $q \in \mathbb{F}^n$ is arbitrary.
Taking the derivative of the proposed solution, we have that $x'(t) = \tilde{A}^D\tilde{B}x(t)$.
Plugging it into (12.56) gives $P_*\tilde{B}x(t) = \tilde{B}x(t)$. Since $\tilde{B}$ commutes with $P_*$, it
suffices to show that $P_0 x(t) = 0$.
Recall that $\tilde{B} = I - \lambda\tilde{A}$. Multiplying (12.57) by $P_0$ gives
$$D_0 x'(t) = (P_0 - \lambda D_0)x(t). \qquad (12.58)$$
Multiplying this by $D_0^{m-1}$ gives $D_0^{m-1}x(t) = 0$, since $D_0^m = 0$. Differentiating gives
$D_0^{m-1}x'(t) = 0$. Thus, multiplying (12.58) by $D_0^{m-2}$ gives $D_0^{m-2}x(t) = 0$, because
the other two terms vanish from the previous case. Repeating reduces the exponent
on $D_0$ until we eventually have the desired result $P_0 x(t) = 0$.

12.10 *Jordan Canonical Form


In Sections 12.5 and 12.6, we showed that a matrix operator A E Mn(lF) can be
written as a sum A= 2=.>..Ea(A) >..P.>.. + D.>.., where each P.>.. and D.>.. are, respectively,
the eigenprojections and eigennilpotents corresponding to the eigenvalues>.. E a(A).
Recall that P.>.. and D.>.. can be determined by taking contour integrals in the complex
plane; see (12 .18) and (12.29) for details. This is a basis-free approach to spectral
theory, meaning that the spectral decomposition can be described abstractly (as
contour integrals) without referencing a specific choice of basis. In this framework,
we can also choose to use whatever basis is most natural or convenient for the
application at hand, and we are free to change to a different basis as needed, and
still preserve the decomposition.
In contrast, the development of spectral theory in most texts requires an
operator to be represented in a very specific basis, called the Jordan basis, which
gives an aesthetically pleasing representation of the spectral decomposition. In this
section, we describe the Jordan basis and the corresponding matrix representation
called the Jordan canonical form or Jordan normal form, which is a very-nearly-
diagonal matrix with zeros everywhere except possibly on the diagonal and the
superdiagonal. The diagonal elements are the eigenvalues, and the superdiagonal
consists of only zeros and ones.
Unfortunately, the Jordan form is poorly conditioned and thus causes problems
with numerical computation; that is, small errors in the floating-point representa-
tion of a matrix can compound into large errors in the final results. For this reason,
and also because choosing a specific basis is often unnecessarily constraining, the
basis-free methods discussed earlier in this chapter are almost always preferable to
the Jordan form, but there are occasional times where the basis-specific method can
be useful.

12.10.1 Nilpotent Operators


To begin we need a definition and a few results about matrix representations of
nilpotent operators.

Definition 12.10.1. For $A \in M_n(\mathbb{F})$ and $b \in \mathbb{F}^n$, the $k$th Krylov subspace of $A$
generated by $b$ is
$$\mathscr{K}_k(A, b) = \operatorname{span}\{b, Ab, A^2b, \ldots, A^{k-1}b\}.$$
If $\dim(\mathscr{K}_k(A, b)) = k$, we call $\{b, Ab, A^2b, \ldots, A^{k-1}b\}$ the Krylov basis of $\mathscr{K}_k(A, b)$.

Proposition 12.10.2. Let $N$ be a nilpotent operator on a vector space $V$, and
let $x \in V$ be such that $N^m(x) = 0$ but $N^{m-1}(x) \neq 0$. Expressed in terms of
the Krylov basis $\{x, N(x), \ldots, N^{m-2}(x), N^{m-1}(x)\}$, the restriction $D$ of $N$ to the
Krylov subspace $\mathscr{K}_m(N, x)$ has matrix representation
$$D = \begin{bmatrix}
0 & 1 & 0 & \cdots & 0 \\
0 & 0 & 1 & \cdots & 0 \\
\vdots & & & \ddots & \vdots \\
0 & 0 & 0 & \cdots & 1 \\
0 & 0 & 0 & \cdots & 0
\end{bmatrix}. \qquad (12.59)$$

Proof. See Proposition 12.2.7 and Exercise 12.50. D

Proposition 12.10.3. Let $D$ be a nilpotent operator on an $n$-dimensional vector
space $V$. There exist elements $x_1, \ldots, x_\ell \in V$ and nonnegative integers $d_1, \ldots, d_\ell$
such that $n = \sum_{i=1}^{\ell} d_i$ and $V$ is the direct sum of the Krylov subspaces
$$V = \bigoplus_{i=1}^{\ell} \mathscr{K}_{d_i}(D, x_i).$$

Proof. The proof is by induction on the dimension of V, and it is vacuously true if


a
dim(V) = 0. Assume now that dim(V) > 0. If (D) = {O}, then taking X1, ... 'Xe
to be any basis of V and all di = 1 gives the desired result . We may assume,
therefore, that a (D) contains at least one nonzero element.
Nilpotence of D implies that a
(D) -=f. V and hence dim(a (D)) < dim(V).
By the induction hypothesis there exist Y1, . . . , Yk Ea (D) and ni, ... , nk E N such
that
k
a (D) = EB Xn, (D,yi)· (12.60)
i=l
For each i let di = ni + 1 and choose x i E V so that Dxi = y i (this exists
because Yi Ea (D)). The elements Dn 1 x 1, ... , D nkxk all lie in JV (D). Moreover,
a
by (12.60) , they are all linearly independent and span JV (D) n (D), and by the
replacement theorem (Theorem 1.4.1) we may choose xk+l , ... , xe in JV (D) so that
nn 1 x1, . .. ' nnkxk, Xk+l >... 'Xe form a basis for JV (D) and dk+l = . .. =de = l.
If there exists any nontrivial linear combination of the elements
x1, Dx1, . .. , Dd 1 - 1x1, ... , x e, Dxe, . . . , Dd' - 1 xe,
equaling zero, then applying D to the sum and using (12 .60) gives a contradiction.
Hence these elements must be linearly independent and
e e
LJtd,(D,xi) = EB XJ,(D,xi) CV.
i=l i=l
a
Since span(x1, ... 'Xk) n (D) = span(x1, .. . ' xk) n JV (D) = {O}, repeated
application of the dimension formula (Corollary 2.3.19) gives

dim (t Jtd , (D, xi)) = dim(JV (D) +a (D) + span(x1, .. . , xk))


= dim(JV (D) +a (D)) + dim(span(x1 , ... 'Xk)))
= dim(JV (D)) + dim(a (D)) + k
= n - dim(JV (D) na( D)) + k
= n.
Therefore
e
EB XJ;(D, xi) = V. D
i=l

12.10.2 Jordan Normal Form


Combining the two previous propositions with the spectral decomposition yields an
almost-diagonal form for any finite-dimensional operator.

Theorem 12.10.4 (Jordan Normal Form). Let $V$ be a finite-dimensional vector
space, and let $A$ be an operator on $V$ with spectral decomposition
$$A = \sum_{\lambda\in\sigma(A)} \lambda P_\lambda + D_\lambda.$$
For each $\lambda \in \sigma(A)$ let $\mathscr{G}_\lambda = \mathscr{R}(P_\lambda)$ be the generalized eigenspace of $A$ associated
with $\lambda$ and let $m_\lambda = \operatorname{ind}(D_\lambda)$. For $d_1, \ldots, d_\ell$ as in the previous proposition, there
is a basis for $\mathscr{G}_\lambda$ in which $\lambda P_\lambda + D_\lambda$ is represented by a matrix that can
be written in block-matrix form as
$$\begin{bmatrix}
J_{d_1} & 0 & \cdots & 0 \\
0 & J_{d_2} & \cdots & 0 \\
\vdots & & \ddots & \vdots \\
0 & 0 & \cdots & J_{d_\ell}
\end{bmatrix},$$
where each $J_{d_i}$ is a $d_i \times d_i$ matrix of the form
$$J_{d_i} = \begin{bmatrix}
\lambda & 1 & 0 & \cdots & 0 \\
0 & \lambda & 1 & \cdots & 0 \\
\vdots & & \ddots & \ddots & \vdots \\
0 & 0 & \cdots & \lambda & 1 \\
0 & 0 & \cdots & 0 & \lambda
\end{bmatrix}.$$
In particular, if $A \in M_n(\mathbb{F})$, then there exists an invertible matrix $S \in M_n(\mathbb{C})$ such
that $A = S^{-1}MS$, where $M$ is block diagonal, with one block of the form above for
each of the distinct eigenvalues $\lambda_1, \ldots, \lambda_r$, and $r$ is the number of distinct eigenvalues
in $\sigma(A)$.
Remark 12.10.5. The Jordan form is poorly suited to computation because the
problem of finding the Jordan form is ill conditioned. This is because bases of
the form x, Dx, D 2 x, . . . are usually far from being orthogonal, with many almost-
parallel vectors. We show in section 13.3 that an orthogonal basis, such as the
Arnoldi basis, gives much better computational results. Of course the matrix forms
arising from Arnoldi bases do not have the nice visual appeal of the Jordan form,
but they are generally much more useful.

Remark 12.10.6 (Finding the Jordan normal form). Here we summarize the
algorithm for finding the Jordan form for a given matrix $A \in M_n(\mathbb{F})$; a small code
sketch follows the list.
(i) Find the spectrum $\{\lambda_1, \ldots, \lambda_r\}$ of $A$ and the algebraic multiplicity $m_j$ of each
$\lambda_j$. (When working by hand, on very small matrices, this could be done by
computing the characteristic polynomial $p_A(z) = \det(zI - A)$ and factoring
the polynomial as $\prod_{j=1}^{r}(z - \lambda_j)^{m_j}$, but for most matrices, other algorithms,
such as those of Section 13.4, are both faster and more stable.)
(ii) For each $\lambda_j$, compute the dimension of $\mathscr{N}((\lambda_j I - A)^k)$ for $k = 1, 2, \ldots$ until
$\dim\mathscr{N}((\lambda_j I - A)^k) = m_j$. Setting $d_{jk} = \dim\mathscr{N}((\lambda_j I - A)^k)$, this gives a
sequence $0 = d_{j0} < d_{j1} < \cdots < d_{jr} = m_j$.
(iii) The sequence $(d_{jk})$ gives information about the sizes of the corresponding
blocks. The sequence is interpreted as follows: $d_{j1} - d_{j0} = d_{j1}$ tells how many
Jordan blocks of size at least one corresponding to eigenvalue $\lambda_j$ there are.
Similarly $d_{j2} - d_{j1}$ tells how many Jordan blocks are of size at least 2, and so
forth.
(iv) Construct the matrix $J$ using the information from step (iii).
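The following is a minimal sketch (not from the text) of steps (ii) and (iii) in floating point, using NumPy. For the reasons discussed in Remark 12.10.5, this is only reliable for small, well-behaved examples, and the rank tolerance is an illustrative assumption.

```python
import numpy as np

def jordan_block_counts(A, lam, m, tol=1e-9):
    """Given an eigenvalue lam of A with algebraic multiplicity m, return a list
    whose kth entry is the number of Jordan blocks for lam of size at least k+1."""
    n = A.shape[0]
    d = [0]                              # d[k] = dim N((lam*I - A)^k), with d[0] = 0
    k = 1
    while d[-1] < m:
        Mk = np.linalg.matrix_power(lam * np.eye(n) - A, k)
        nullity = n - np.linalg.matrix_rank(Mk, tol=tol)
        d.append(nullity)
        k += 1
    # Number of blocks of size >= k is d[k] - d[k-1].
    return [d[k] - d[k - 1] for k in range(1, len(d))]

# A 3x3 matrix with a single eigenvalue 1 of multiplicity 3 and nullity
# sequence 0, 2, 3 (so two blocks, one of size at least two); this particular
# matrix is an assumption chosen to mirror the structure of Example 12.10.7.
A = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])
print(jordan_block_counts(A, 1.0, 3))    # [2, 1]
```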

Example 12.10. 7. Let

Applying the algorithm above, we have the following:

(i) The characteristic polynomial of $A$ is $p_A(z) = (z - 1)^3$; hence $A$ has a
single eigenvalue $\lambda = 1$ with multiplicity 3.

(ii) Since $\lambda = 1$, we compute $d_1 = \dim\mathscr{N}((I - A)^1) = 2$. Since $(I - A)^2 = 0$, we have
$d_2 = \dim\mathscr{N}((I - A)^2) = 3$, which is equal to the multiplicity of $\lambda$.

(iii) Since $d_0 = 0$, $d_1 = 2$, and $d_2 = 3$, there are at least two Jordan blocks
of size at least one, and one Jordan block of size at least two.

Hence the Jordan normal form of $A$ is
$$J = \begin{bmatrix} 1 & 1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}.$$

Example 12.10.8. Let $A$ be the matrix in Example 12.2.4, that is,
$$A = \begin{bmatrix}
0 & -1 & 1 & 1 \\
3 & -2 & -4 & 5 \\
1 & -1 & -1 & 2 \\
2 & -1 & -3 & 3
\end{bmatrix}.$$

To find the Jordan normal form, we first verify that $\sigma(A) = \{0\}$. We already
saw that $A^2 = 0$ and that $\dim\mathscr{N}(A) = 2$, so the geometric multiplicity
of the eigenvalue $\lambda = 0$ is 2. This shows that the generalized eigenspace of 0 can be decomposed
into two complementary two-dimensional subspaces. The $0$-eigenspace is $\mathscr{N}(A) =
\operatorname{span}\{x_1, x_3\}$, spanned by $x_1 = [2\ 1\ 1\ 0]^T$ and $x_3 = [-1\ 1\ 0\ 1]^T$.
We seek an $x_2$ so that $\{(A - \lambda I)x_2, x_2\} = \{x_1, x_2\}$ and an $x_4$ so that
$\{(A - \lambda I)x_4, x_4\} = \{x_3, x_4\}$. So we must solve $(A - 0I)x_2 = Ax_2 = x_1$ and
$Ax_4 = x_3$. We find $x_2 = [-1\ {-2}\ 0\ 0]^T$ and $x_4 = [1\ 1\ 0\ 0]^T$. Verify
that the set $\{x_1, x_2, x_3, x_4\} = \{Ax_2, x_2, Ax_4, x_4\}$ is a basis and that this basis
gives the desired normal form; that is, verify that we have

$$J = S^{-1}AS = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix},
\quad\text{where}\quad
S = \begin{bmatrix} 2 & -1 & -1 & 1 \\ 1 & -2 & 1 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}.$$
The verification of the preceding details is Exercise 12.51.

Example 12.10.9. Consider the matrix

First verify that $\sigma(A) = \{1\}$ and $\mathscr{N}(I - A) = \operatorname{span}\{z_1, z_2\}$, where $z_1 =
[0\ 1\ 0]^T$ and $z_2 = [-2\ 0\ 1]^T$.
Since $(I - A)^2 = 0$, we have two blocks that are $(I - A)$-invariant: a $1\times 1$
block and a (not semisimple) $2 \times 2$ block. To find the $2 \times 2$ block we must find an
$x$ that is not in $\mathscr{N}(A - \lambda I)$. This will guarantee that $\{(A - \lambda I)x, x\}$ has two
elements. In our case a simple choice is $x_2 = [1\ 0\ 0]^T$. Let $(A - \lambda I)x_2 = x_1$.
Since $(A - I)^2 = 0$, we have $(A - I)x_1 = 0$, which gives $Ax_1 = x_1$ and
$Ax_2 = x_1 + x_2$.
Verify that $z_1$ is not in $\operatorname{span}(x_1, x_2)$, so $\{x_1, x_2, z_1\}$ gives a basis of $\mathbb{F}^3$.
Verify that letting $P = [x_1\ x_2\ z_1]$ puts $P^{-1}AP$ in Jordan form. The
verification of the preceding details is Exercise 12.52.

Exercises
Note to the student: Each section of this chapter has several corresponding
exercises, all collected here at the end of the chapter. The exercises between the
first and second line are for Section 1, the exercises between the second and third
lines are for Section 2, and so forth.
You should work every exercise (your instructor may choose to let you skip
some of the advanced exercises marked with *). We have carefully selected them,
and each is important for your ability to understand subsequent material. Many of
the examples and results proved in the exercises are used again later in the text.
Exercises marked with .&. are especially important and are likely to be used later
in this book and beyond. Those marked with t are harder than average, but should
still be done.
Although they are gathered together at the end of the chapter, we strongly
recommend you do the exercises for each section as soon as you have completed
the section, rather than saving them until you have finished the entire chapter. The
exercises for each section are separated from those for other sections by a
horizontal line.

12.l. Verify the claim of Example 12.l.2; that is, show that (I - P) 2 = I - P.
12.2. Show that the matrix
$$\frac{1}{2}\begin{bmatrix} 2\cos^2\theta & \sin 2\theta \\ \sin 2\theta & 2\sin^2\theta \end{bmatrix}$$
is a projection for all $\theta \in \mathbb{R}$.
12.3. Let P be a projection on a finite-dimensional vector space V. Prove that
rank(P) = tr (P). Hint: Use Corollary 12.l.5 and Exercise 2.25.
12.4. Let F E Mn(lF) be the n x n matrix that reverses the indexing of x, that
is, if x = [x1 x2 xnr, then Fx = [xn Xn-1 . . . x1]T. Define
E E Mn(lF) such that Ex = (x + Fx)/2. Prove that E is a projection.
Determine whether it is an orthogonal projection or an oblique projection.
What are the entries of E?
12.5. Consider the matrix

A= [~ ~].
Find the eigenvalues and eigenvectors of A and use those to write out the
eigenprojection matrices P 1 and P 2 of Proposition 12.1.10. Verify that each
of the five properties listed in that proposition holds for these matrices.

12.6. Let L be a linear operator on a finite-dimensional vector space V . Prove that


ind(L):::; dim(V).
12.7. Assume A E Mn(IR) is positive semidefinite. Show that if xis a unit vector,
then x T Ax is less than or equal to the largest eigenvalue. Hint: Write x as a
linear combination of an orthonormal eigenbasis of A.
512 Chapter 12. Spectral Calculus

12.8. Find a basis of generalized eigenvectors of

12.9. Assume that >. and µ are eigenvalues of A E Mn(IF). If Ax = µx and


(>.I - A)mx = 0, then (>. - µr x = 0. Hint: Use t he binomial theorem.
12.10. Recall from Theorem 12.2.5 that if k = ind (B), then !Fn = flt (Bk) EBJV (Bk).
Prove that the two subspaces flt (Bk) and JV (Bk) are B -invariant.

12.11. Assume NE Mn (IF) is nilpotent (see Exercise 4.1), that is, Nk = 0 for some
k EN.
(i) Use the Neumann series of I + N to prove that I+ N is invertible.
(ii) Write (J + N)- 1 in terms of powers of N.
(iii) Generalize your results to show that >.I+ N is invertible for any nonzero
>. E C.
(iv) Write (>.I+ N)- 1 in terms of powers of N.
12.12. Let
A= rn ~].
(i) Write the resolvent R(z) of A in the form R(z) = B(z)/p(z), where B(z)
is a matrix of polynomials in z and p(z) is a monic polynomial in z .
(ii) Factor p(z) into the form p(z) = (z - a)(z - b) and write the partial
fraction decomposition of R(z) in the form R(z) = C/(z-a)+D/(z -b),
where C and D are constant matrices.
12.13. Let

A.=[~~~]
0 0 7
·
(i) Write the resolvent R(z) of A in the form R(z) = B(z)/p(z), where B(z)
is a matrix of polynomials in z and p( z) is a monic polynomial in z .
(ii) Factor p(z) and write the partial fraction decomposition of R(z).
12.14. Let
A= [~ ~] .
Compute the spectral radius r(A) using the 2-norm (see Exercise 4.16), using
the 1-norm 11 · Iii (see Theorem 3.5.20) , and using the infinity norm I · !loo·
12.15. Let $\|\cdot\|$ be a matrix norm. Prove that if $A \in M_n(\mathbb{F})$ satisfies $r(A) < 1$, then
for any $0 < \varepsilon < 1 - r(A)$ there exists $m \in \mathbb{Z}^+$ such that
$$\|A^k\| < (r(A) + \varepsilon)^k \le r(A) + \varepsilon < 1$$
for all $k \ge m$.
Exercises 513

12.16. For A E Mn(lF), let IAI denote componentwise modulus, so if aij is the (i,j)
entry of A, then laijl is the (i,j) entry of IAI. Let~ denote componentwise
inequality (see also Definition 12.8.1) .
(i) Prove that r(A):::; r(IAI).
(ii) Prove that if 0 ::SA ::SB, then r(A) :::; r(B) .
12.17. Using the matrix A in Exercise 12.12, compute its spectral projections and
show that the four properties in Theorem 12.4.2 are satisfied.
12.18. Using the matrix A in Exercise 12.13, compute its spectral projections and
show that the four properties in Theorem 12.4.2 are satisfied.
12.19. Verify the Cayley-Hamilton theorem for the matrix A in Exercise 12.12.
12.20. Compute the eigenvalues, resolvent, and spectral projections of the matrix

12.21. Let
o
A = [3i
3i]
6 .

Find the Laurent expansion of the resolvent of A around the point z = 3.


12.22. Let B be the matrix in Exercise 12.20.
(i) Find the Laurent expansion of the resolvent of B around the point z = 2.
(ii) Find the Laurent expansion of the resolvent of B around the point z = 4.
12.23. Prove Lemma 12.5.5, parts (i) and (ii) .
12.24. Prove Lemma 12.5.5, part (iii).
12.25. Prove Lemma 12.5.5, parts (iv) and (v).

12.26. Let A be the matrix in Exercise 12.12.


(i) Write the spectral decomposition A= L A >..PA+ DA.
(ii) Write the resolvent in the form of (12.33).
(iii) Write a general formula for Ak in a form that involves no matrix multi-
plications.
12.27. Let B be the matrix in Exercise 12.20.
(i) Write the spectral decomposition B =LA >..PA+ DA.
(ii) Write the resolvent in the form of (12.33).
(iii) Compute cos(B7r/6).
12.28. Prove that the number of distinct $n$th roots of an invertible matrix $A$ is at
least $n^k$, where $k$ is the number of distinct eigenvalues of $A$.
12.29. Let $A \in M_n(\mathbb{F})$. Prove that if $Av = \lambda v$, then $v \in \mathscr{N}(D_\lambda)$.
514 Chapter 12. Spectral Calculus

12.30. Given A E Mn(lF), use the spectral decomposition (12.33) and Exercise 12.3
to prove that tr R(z) = p'(z)/p(z), where pis the characteristic polynomial
of A. Hint: Write the characteristic polynomial in factored form (4.5), and
show that

12.31. Compute eA for each A in Exercises 12.12 and 12.13.


12.32. Let x E lFn be an eigenvector of A associated to the eigenvalue >. E O"(A) .
Let U c <C be an open, simply connected set containing O"(A), and let f(z)
be a holomorphic function on U . Show that xis an eigenvector of f(A) with
eigenvalue f(>.). Under what circumstances could one have an eigenvector y
of f(A) with eigenvalue f(>.) where y is not an eigenvector of A?
12.33. Prove that if U c <C is an open, simply connected set containing O"(A), and if
f(z) is a holomorphic function on U such that f(A) = 0, then f(>.) = 0 for
every >. E O"(A). Then prove that the converse is false .
12.34. Let BE Mn(lF). Assume>. E O"(B) is a semisimple eigenvalue of modulus J>.J =
r(B), with corresponding eigenprojection P, and that all other eigenvalues
have modulus strictly less than r(B). Let x 0 be a unit vector satisfying
Pxo =/=- 0. For each k EN, define

$$x_{k+1} = \frac{Bx_k}{\|Bx_k\|}. \qquad (12.61)$$

Prove that (xk)k=O has a convergent subsequence, that the corresponding


limit x is an eigenvector of B corresponding to >., and that >. = xH Bx. Hint:
The matrix A= >.- 1 B satisfies the hypotheses in Theorem 12.7.8 and has the
same eigenprojections as B. Consider using the fact that JJAkxo - PxoJI ---+ 0
to show that

12.35. Give an example of a matrix A E M2(lF) that has two distinct eigenvalues of
modulus 1 and such that the iterative map Xk = Axk-l fails to provide an
eigenvector for any nonzero xo that is not already an eigenvector.

12.36. Let

A ~ [!
1
0
2 !]
(i) Show that A is irreducible.
(ii) Find the Perron root of A.
12.37. For A E Mn(lF), prove that A>- 0 if and only if Ax>- 0 for every x ~ 0 with
x=f. O.
12.38. Prove Proposition 12.8.10.

12.39. Assume A'.':::: 0, A E Mn(lF).


(i) Prove that if A is primitive, then r(A) is an eigenvalue of A and that
no other eigenvalue has norm r(A).
(ii) Prove that if A '.': : 0 is irreducible, then its Perron root is simple, and
the corresponding eigenvector xis positive (x ~ 0).
(iii) Give an example showing that, unlike the case of positive matrices, there
are irreducible matrices with more than one eigenvalue lying on the circle
$|z| = r(A)$.
12.40. Let A E Mn(lF). Prove that if A'.':::: 0 is irreducible with Perron root,\= r(A)
and corresponding eigenvector x ~ 0 , then for any other nonnegative real
eigenvector y '.':::: 0, the corresponding eigenvalueµ must be equal to r(A) and
y = ex for some positive real constant c. To prove this, follow these steps:
(i) Show that if A '.': : 0 is irreducible, then AT '.': : 0 is also irreducible, and
r(A) = r(AT).
(ii) Show that if z is the positive eigenvector of AT associated to the eigen-
value r(AT), then z Ty> 0.
(iii) Show that r(A)z Ty= z TAy = µz Ty , and henceµ = r(A) .
(iv) Prove that y E span(x).
12.41. Use the previous exercise to show that if the row sums I:j aij of an irreducible
matrix A = (aij) '.':::: 0 are all the same for every i, then for every i we have
r(A) = l:j aij·
12.42. Let B E Mn(lR) be a positive matrix where every column's sum equals one.
Prove: If P is the eigenprojection of B corresponding to the Perron root
,\ = 1, then Pv -=f 0 for any nonzero, nonnegative v E ]Rn. Hint: Write the
eigenprojection in terms of the right and left eigenvectors, noting that the
left eigenvector is :n..

12.43. Given the matrix A from Example 12.9.6, use the Wedderburn decomposition
to compute AD. In particular, let

-~71
1 0 0 1 0 0
s= 0 1 0
an d S- 1 = 0 1
0 0
0
1
271
-9
3 .
[
0 0 1 -3 [
0 0 0 1 0 0 0 1

Verify that your answer is the same as that given in Example 12.9.6.
12.44. Given A E Mn(lF) , let Po and Do denote, respectively, the eigenprojection
and the eigennilpotent of the eigenvalue ,\ = 0. Show that

AD= (A - Do+ Po)- 1 - Po .

12.45. Given A E Mn(lF) , let Po and Do denote, respectively, the eigenprojection


and the eigennilpotent of the eigenvalue ,\ = 0. Also let P* = I - Po . Prove
that the following properties of the Drazin inverse hold:
(i) PoAD = AD Po= 0.

(ii) DoAD =AD Do = 0.


(iii) P*AD =AD p* =AD.
(iv) AD(A - Do)= P* =(A - Do)AD.
12.46. Given A E Mn(F), prove that the following properties of the Drazin inverse
hold:
(i) (AD)H = (AH)D.
(ii) (AD)e = (Ae)D for any f, EN.
(iii) $A^D = A$ if and only if $A^3 = A$.
(iv) (AD)D = A2AD.
12.47. Given A E Mn(lF), prove that the following properties of the Drazin inverse
hold:
(i) AD = 0 if and only if A is nilpotent.
(ii) ind(AD) = 0 if and only if A is nonsingular.
(iii) ind(AD) = 1 if and only if A is singular.
(iv) If A is idempotent, then AD =A.
12.48. Let A E Mn(lF) and m = ind(A). Prove the following :
(i) P* = AD A is the projection onto~ (Am) along JV (Am) . Hint: Show
that Am = P*Am = Am P* and use the fact that A is invertible on
~ (P*).
(ii) If b E ~(Am) , then x = ADb is a solution of Ax= b.
12.49. Prove that if b E ~ (P*), then the linear system Ax = b has the solution
X= ADb.

12.50.* Prove Proposition 12.10.2.


12.51.* Fill in all the details for Example 12.10.8.
12.52.* Fill in all the details for Example 12.10.9.
12.53.* Given the matrix
$$A = \begin{bmatrix} a & 1 & 0 \\ 0 & a & 2 \\ 0 & 0 & a \end{bmatrix},$$
find the matrix $P$ such that $P^{-1}AP$ is in Jordan normal form.

Notes
Two great sources on resolvent methods are [Kat95] and [Cha12]. Exercise 12.4
comes from [TB97].

A short history and further discussion of Perron's theorem and the Perron-
Frobenius theorem can be found in [MacOO]. For more details about Google's
PageRank algorithm, see [LM06] and [BL06].
For more about the Drazin inverse, see [CM09] and [Wil79]. The proof of
Proposition 12.9.10 is modeled after the proof of the same result in [Wil79] .
The proof of Proposition 12.10.3 was inspired by [Wi107].
Iterative Methods

Have no fear of perfection-you'll never reach it.


-Salvador Dali

The subject of numerical linear algebra is concerned with computing approximate


answers to linear algebra problems. The two main numerical linear algebra problems
are solving linear systems and finding eigenvalues and eigenvectors.
For small matrices, naïve methods, such as Gaussian elimination, are generally
suitable because computers can solve these little problems in the blink of an eye.
However, for large matrices, naïve methods can take a lifetime to give a solution.
For example, the temporal complexity (number of basic computations) of Gaussian
elimination is cubic, meaning that if the dimension of the matrix doubles, then the
run time increases by a factor of eight. It doesn't take very many of these doublings
before the run time is too long and we start to wish for algorithms with lower
temporal complexity.
In addition to the issue of temporal complexity, the spatial complexity-the
amount of memory, or storage space, required to solve the problem-is also im-
portant. The spatial complexity of matrix storage is quadratic, meaning that as
the matrix dimension doubles, the amount of space required to store the matrix
increases by a factor of four. It's very possible to encounter linear algebra problems
that are so large that storing the entire matrix on a single computer is impossible,
and solvers must therefore distribute the workload across multiple machines.
Fortunately, the matrices considered in most real-world applications are sparse,
that is, they have relatively few nonzero entries. Methods that take advantage of
this sparsity greatly improve the run times and do so with much less memory than
methods that don't. These sparse methods store only the nonzero entries and there-
fore use only a tiny fraction of the memory that would be needed to store all the
entries. Moreover, the matrix operations used in row reduction or matrix multipli-
cation can skip over the zeros since they don't alter the computation; when most
of the entries are zero, that skipping speeds up the algorithms substantially.
Numerical methods for large sparse linear algebra problems tend to be itera-
tive. These algorithms work by generating a sequence of successive approximations
that converge to the true solution, where each iteration builds on the previous


approximation, until it is sufficiently close that the algorithm can terminate. For
example, Newton's method, described in Section 7.3, is an iterative method for
finding zeros.
We begin the chapter with a result about convergence of sequences of the form

$$x_{k+1} = Bx_k + c, \qquad k \in \mathbb{N},$$

for B E Mn (IF) and x 0 , c E !Fn. This one result gives three fundamental iterative
methods for solving linear systems: the Jacobi method, the Gauss- Seidel method,
and the method of successive overrelaxation (SOR).
We then develop some iterative methods for solving numerical linear alge-
bra problems based on Krylov subspaces. Krylov methods rank among the fastest
general methods available for solving large, sparse linear systems and eigenvalue
problems. One of the most important of these methods is the GMRES algorithm.
We conclude the chapter with some powerful iterative methods for computing
eigenvalues, including QR iteration and the Arnoldi method.

13.1 Methods for Linear Systems


We have already seen an example of an iterative method with the power method
(see Section 12.7.3) for finding the Perron eigenvector. This eigenvector is found
by computing successive powers xo, Bxo, B 2 xo, .. ., for any xo C:: 0 whose entries
sum to one. We can think of this as a sequence (x k)k::o generated by the rule
Xk+l = Bxk for each k E N. The limit x of this sequence is a fixed point of B; that
is, x =Bx.
It is useful to generalize this result to sequences that are generated by succes-
sive powers together with a shift, that is,

$$x_{k+1} = Bx_k + c, \qquad k \in \mathbb{N}, \qquad (13.1)$$

where $B \in M_n(\mathbb{F})$ and $c, x_0 \in \mathbb{F}^n$ are given. Iterative processes of this form show
up in applications such as Markov chains, analysis of equilibria in economics,$^{48}$ and
control theory.
In this section we first show that sequences of the form (13.1) converge when-
ever the spectral radius r(B) satisfies r(B) < 1, regardless of the initial starting
point, and that the limit x E !Fn satisfies the equation x =Bx+ c, or, equivalently,
x = (I - B)- 1 c. We then use this fact to describe three iterative methods for
approximating solutions to linear systems of the form Ax = b .

13.1.1 Convergence Theorem


For any B E Mn(IF), if llBll < 1, then Exercise 7. 9 shows that the map
x H- Bx + c is a contraction mapping on !Fn and the corresponding fixed point is
(I -B)- 1c, provided II· I is the norm induced on Mn(IF) by the norm on !Fn. More-
over, the method of successive approximations (see Remark 7.1.10) always converges
t o the fixed point of a contraction mapping, and thus sequences generated by (13 .1)
48 In
1973, Wassily Leontief won the Nobel Prize in economics for his work on input-output analysis,
which used successive powers to compute equilibria for the global production of commodities.

converge to x = (I - B) - 1 c whenever l Bll < 1. The next theorem shows that this
convergence result holds whenever r(B) < 1, regardless of the size of l Bll·

Theorem 13.1.1. Let $B \in M_n(\mathbb{F})$ and $c \in \mathbb{F}^n$. If $r(B) < 1$, then sequences
generated by (13.1) converge to $x = (I - B)^{-1}c$, regardless of the initial vector $x_0$.

Proof. Let $\|\cdot\|$ be the induced norm on $M_n(\mathbb{F})$. If $r(B) < 1$, then Exercise 12.15
shows that for any $0 < \varepsilon < 1 - r(B)$, there exists $m \in \mathbb{Z}^+$ such that $\|B^k\| <
(r(B) + \varepsilon)^k \le r(B) + \varepsilon < 1$ for all $k \ge m$. Set $f(x) = Bx + c$ and observe that the
$k$th composition satisfies
$$f^k(x) = B^kx + \sum_{j=0}^{k-1} B^j c.$$
Thus, for any $k \ge m$, we have
$$\|f^k(x) - f^k(y)\| = \|B^k(x - y)\| \le \|B^k\|\,\|x - y\| < \|x - y\|,$$
and therefore $f^k$ is a contraction mapping. By Theorem 7.1.14 and the solution
of Exercise 7.6, the sequence $(x_k)_{k=0}^{\infty}$ converges to the unique fixed point $x$ of $f$,
and thus $x$ satisfies $x = Bx + c$. Since $I - B$ is invertible (otherwise $\lambda = 1$ is an
eigenvalue of $B$, which would contradict $r(B) < 1$), we have $x = (I - B)^{-1}c$. $\Box$

Remark 13.1.2. The proof of the previous theorem also shows that the sequence
(13.1) converges linearly (see Definition 7.3.1), and it gives a bound on the
approximation error $\varepsilon_k = \|x_k - x\|$, namely, $\varepsilon_k \le (r(B) + \varepsilon)^k\,\|x_0 - x\|$ for all $k \ge m$.
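As a quick illustration of Theorem 13.1.1 (a sketch, not from the text), the following iterates $x_{k+1} = Bx_k + c$ for a small $B$ with $r(B) < 1$ and compares the result with $(I - B)^{-1}c$; the matrix and vector are arbitrary illustrative choices.

```python
import numpy as np

B = np.array([[0.5, 0.2],
              [0.1, 0.4]])          # eigenvalues 0.6 and 0.3, so r(B) < 1
c = np.array([1.0, 2.0])

x = np.zeros(2)
for _ in range(200):
    x = B @ x + c                    # the iteration (13.1)

fixed_point = np.linalg.solve(np.eye(2) - B, c)
print(x, fixed_point, np.allclose(x, fixed_point))
```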

13.1.2 Iterative Solvers


Theorem 13.1.1 lies at the heart of three different iterative solvers for linear systems,
namely, the Jacobi method, the Gauss-Seidel method, and the method of successive
overrelaxation (SOR). These all follow from the same idea. First, decompose a
linear system Ax = b into an expression of the form x = Bx+ c, where r(B) < 1.
Then, by Theorem 13.1.1, the iteration (13.1) converges to the solution linearly.
Throughout this section, assume that A E Mn (JF) is given and that A is
decomposed into a diagonal part D, a strictly upper-triangular part U, and a strictly
lower-triangular part L , so that A = L + D + U.

$$A = \underbrace{\begin{bmatrix} 0 & & & \\ * & 0 & & \\ \vdots & \ddots & \ddots & \\ * & \cdots & * & 0 \end{bmatrix}}_{L}
+ \underbrace{\begin{bmatrix} * & & & \\ & * & & \\ & & \ddots & \\ & & & * \end{bmatrix}}_{D}
+ \underbrace{\begin{bmatrix} 0 & * & \cdots & * \\ & 0 & \ddots & \vdots \\ & & \ddots & * \\ & & & 0 \end{bmatrix}}_{U}.$$
Here $*$ indicates an arbitrary entry, not necessarily zero or nonzero.

Jacobi's Method
By writing $Ax = b$ as $Dx = -(L + U)x + b$, we have
$$x = -D^{-1}(L + U)x + D^{-1}b, \qquad (13.2)$$
which motivates the definition $B_{\mathrm{Jac}} = -D^{-1}(L + U)$ and $c_{\mathrm{Jac}} = D^{-1}b$. Jacobi's
method is to apply the iteration (13.1) with $B_{\mathrm{Jac}}$ and $c_{\mathrm{Jac}}$. Note that $D^{-1}$ is
trivial to compute because $D$ is diagonal, and multiplication by $D^{-1}$ is also trivial.
Theorem 13.1.1 guarantees that Jacobi's method converges to $x$ as long as $r(B_{\mathrm{Jac}}) < 1$.
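As an illustration, here is a minimal sketch (not from the text) of Jacobi's method in NumPy; the system is an arbitrary strictly diagonally dominant example, chosen so that convergence is guaranteed (see Theorem 13.1.8 below).

```python
import numpy as np

def jacobi(A, b, x0, iters=100):
    """Jacobi iteration: x_{k+1} = -D^{-1}(L+U) x_k + D^{-1} b."""
    D = np.diag(np.diag(A))
    LU = A - D                          # L + U (everything off the diagonal)
    D_inv = np.diag(1.0 / np.diag(A))
    B, c = -D_inv @ LU, D_inv @ b
    x = x0
    for _ in range(iters):
        x = B @ x + c
    return x

# A strictly diagonally dominant test system (an arbitrary illustrative choice).
A = np.array([[4.0, 1.0, 1.0],
              [1.0, 5.0, 2.0],
              [1.0, 2.0, 6.0]])
b = np.array([6.0, 8.0, 9.0])
x = jacobi(A, b, np.zeros(3))
print(x, np.allclose(A @ x, b))
```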

Example 13.1.3. Consider the linear system Ax= b, where

Decomposing A, we have

3
D= 0 2 0 ,
o ol
[0 0 2

Thus,

Starting with $x_0 = [0\ 1\ 1]^T$ and iterating via (13.1), we find that $x_k$ approximates
the correct answer $[-7\ 9\ 6]^T$ to four digits of accuracy once
$k \ge 99$. We note that $r(B_{\mathrm{Jac}}) = \sqrt{5/6} \approx 0.9129$, which implies that the
approximation error satisfies $\|x - x_k\| \le 0.9130^k \cdot \|x - x_0\|$ for $k$ sufficiently
large.

The Gauss-Seidel Method

By writing $Ax = b$ as $(D + L)x = -Ux + b$, we have that
$$x = -(D + L)^{-1}Ux + (D + L)^{-1}b. \qquad (13.3)$$
Hence for $B_{\mathrm{GS}} = -(D + L)^{-1}U$ and $c = (D + L)^{-1}b$, we have that the iteration
(13.1) converges to $x$ as long as $r(B_{\mathrm{GS}}) < 1$. Note that although the inverse
$(D + L)^{-1}$ is fairly easy to compute (because $D + L$ is lower triangular), it is
generally both faster and more stable to write the Gauss-Seidel iteration as
$$(D + L)x_{k+1} = -Ux_k + b$$
and then solve for $x_{k+1}$ by forward substitution.
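A minimal sketch (not from the text) of this formulation, again on an arbitrary strictly diagonally dominant test system; the triangular solve is done with scipy.linalg.solve_triangular.

```python
import numpy as np
from scipy.linalg import solve_triangular

def gauss_seidel(A, b, x0, iters=100):
    """Gauss-Seidel: solve (D + L) x_{k+1} = -U x_k + b at each step."""
    DL = np.tril(A)                  # D + L (lower-triangular part, with diagonal)
    U = np.triu(A, k=1)              # strictly upper-triangular part
    x = x0
    for _ in range(iters):
        x = solve_triangular(DL, -U @ x + b, lower=True)
    return x

A = np.array([[4.0, 1.0, 1.0],
              [1.0, 5.0, 2.0],
              [1.0, 2.0, 6.0]])
b = np.array([6.0, 8.0, 9.0])
x = gauss_seidel(A, b, np.zeros(3))
print(x, np.allclose(A @ x, b))
```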

Example 13.1.4. Considering the system from Example 13.1.3, we have

Bas= -(D + L)- 1 U = "6


1[o~ -~4 -~2]
Starting with $x_0 = [0\ 1\ 1]^T$ and iterating via (13.1), we find that $x_k$ approximates
the correct answer $[-7\ 9\ 6]^T$ to four digits of accuracy once
$k \ge 50$.
Here we have $r(B_{\mathrm{GS}}) = 5/6 \approx 0.8333$, which is the square of the radius
for the Jacobi method in the previous example. This implies that the
approximation error satisfies $\|x - x_k\| \le 0.9130^{2k} \cdot \|x - x_0\|$. Therefore, in
this example at least, the Gauss-Seidel method can solve this problem in half
the number of iterations as the Jacobi method. In other words, it converges
twice as fast.

Nota Bene 13.1.5. Just because one method converges faster than another
does not necessarily mean that it is a faster algorithm. One must also look at
the computational cost of each iteration.

Successive Overrelaxation

The Gauss-Seidel method converges faster than the Jacobi method because the
spectral radius of $B_{\mathrm{GS}}$ is smaller than the spectral radius of $B_{\mathrm{Jac}}$. We can often
improve convergence even more by splitting up $A$ in a way that has a free parameter
and then tuning the decomposition to further reduce the value of $r(B)$. More
precisely, by writing $Ax = b$ as $(D + \omega L)x = ((1 - \omega)D - \omega U)x + \omega b$, where $\omega > 0$,
we have
$$x = (D + \omega L)^{-1}((1 - \omega)D - \omega U)x + \omega(D + \omega L)^{-1}b. \qquad (13.4)$$
Hence for $B_\omega = (D + \omega L)^{-1}((1 - \omega)D - \omega U)$ and $c_\omega = \omega(D + \omega L)^{-1}b$, the iteration
(13.1) converges to $x$ as long as $r(B_\omega) < 1$. We note of course that when $\omega = 1$
we have the Gauss-Seidel case. Choosing $\omega$ to make $r(B_\omega)$ as small as possible will
give the fastest convergence.
Again, we note that rather than multiply by the inverse matrix to construct
$B_\omega$, it is generally both faster and more stable to write the iteration as
$$(D + \omega L)x_{k+1} = ((1 - \omega)D - \omega U)x_k + \omega b$$
and then solve for $x_{k+1}$ by forward substitution.
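The corresponding sketch (not from the text), reusing the test system above; the relaxation parameter here is an arbitrary illustrative value.

```python
import numpy as np
from scipy.linalg import solve_triangular

def sor(A, b, x0, omega, iters=100):
    """SOR: solve (D + omega*L) x_{k+1} = ((1-omega)*D - omega*U) x_k + omega*b."""
    D = np.diag(np.diag(A))
    L = np.tril(A, k=-1)
    U = np.triu(A, k=1)
    x = x0
    for _ in range(iters):
        rhs = ((1 - omega) * D - omega * U) @ x + omega * b
        x = solve_triangular(D + omega * L, rhs, lower=True)
    return x

A = np.array([[4.0, 1.0, 1.0],
              [1.0, 5.0, 2.0],
              [1.0, 2.0, 6.0]])     # symmetric positive definite, so SOR converges
b = np.array([6.0, 8.0, 9.0])
x = sor(A, b, np.zeros(3), omega=1.1)   # omega chosen only for illustration
print(x, np.allclose(A @ x, b))
```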

Example 13.1.6. Consider again the system from Examples 13.l.3 and 13.1.4.
Since w = 1 corresponds to the Gauss- Seidel case, we already know that
r(B 1 ) = 5/6 ~ 0.8333 ....

For a large class of matrices, the optimal choice of $\omega$ is
$$\omega^* = \frac{2}{1 + \sqrt{1 - \beta^2}}, \qquad (13.5)$$
where $\beta = r(B_{\mathrm{Jac}})$; see [Ise09, Gre97] for details. In the case of this example,
$r(B_{\mathrm{Jac}}) = \sqrt{5/6}$, which gives $\omega^* = \tfrac{2}{5}(6 - \sqrt{6}) \approx 1.4202$ and $r(B_{\omega^*}) = \omega^* -
1 \approx 0.4202$. Therefore, the approximation error is bounded by $\|x - x_k\| \le
0.4202^k \cdot \|x - x_0\|$, so this method converges much faster than the Gauss-Seidel
method.
For example, starting with $x_0 = [0\ 1\ 1]^T$, the term $x_k$ approximates
the correct answer $[-7\ 9\ 6]^T$ to four digits of accuracy once $k \ge 18$. Compare
this to $k \ge 50$ for Gauss-Seidel and $k \ge 99$ for Jacobi.

13.1.3 More Convergence Theorems

Definition 13.1.7. A matrix $A \in M_n(\mathbb{F})$ with components $A = [a_{ij}]$ is strictly
diagonally dominant if for each $i$ we have
$$\sum_{j \neq i} |a_{ij}| < |a_{ii}|. \qquad (13.6)$$

Theorem 13.1.8. Let $A \in M_n(\mathbb{F})$ have components $A = [a_{ij}]$. If $A$ is strictly
diagonally dominant, then $r(B_{\mathrm{Jac}}) < 1$; that is, Jacobi iteration (13.2) converges
for any initial vector $x_0$.

Proof. Assuming that $A = [a_{ij}]$, we have that the entries of $B_{\mathrm{Jac}} = -D^{-1}(L + U)$ are
$$(B_{\mathrm{Jac}})_{ij} = \begin{cases} -a_{ij}/a_{ii}, & j \neq i, \\ 0, & j = i. \end{cases}$$
Hence if $A$ is strictly diagonally dominant, then for each $i$ we have
$$\sum_{j=1}^{n} |(B_{\mathrm{Jac}})_{ij}| = \sum_{j \neq i} \frac{|a_{ij}|}{|a_{ii}|} < 1,$$
which implies that $\|B_{\mathrm{Jac}}\|_\infty < 1$. The result follows from Lemma 12.3.13. $\Box$

Theorem 13.1.9. Let $A \in M_n(\mathbb{F})$ have components $A = [a_{ij}]$. If $A$ is strictly
diagonally dominant, then $r(B_{\mathrm{GS}}) < 1$, and thus Gauss-Seidel iteration (13.3)
converges for any initial vector $x_0$.

Proof. Define
$$\gamma = \max_i \frac{\sum_{j>i} |a_{ij}|}{|a_{ii}| - \sum_{j<i} |a_{ij}|}. \qquad (13.7)$$

Since $A$ is strictly diagonally dominant, we have that
$$\sum_{j>i} |a_{ij}| < |a_{ii}| - \sum_{j<i} |a_{ij}|$$
for every $i$, which implies $\gamma < 1$. It suffices to show that $\|B_{\mathrm{GS}}\|_\infty \le \gamma$.
For any nonzero $x \in \mathbb{F}^n$, set $y = B_{\mathrm{GS}}x$. Thus, $(D + L)y = -Ux$, or,
equivalently,
$$a_{ii}y_i = -\sum_{j<i} a_{ij}y_j - \sum_{j>i} a_{ij}x_j$$
for every $i$. Choosing $i$ such that $|y_i| = \|y\|_\infty$, we have
$$|a_{ii}|\,\|y\|_\infty \le \sum_{j<i} |a_{ij}|\,\|y\|_\infty + \sum_{j>i} |a_{ij}|\,\|x\|_\infty,$$
which implies
$$\frac{\|B_{\mathrm{GS}}x\|_\infty}{\|x\|_\infty} = \frac{\|y\|_\infty}{\|x\|_\infty} \le \frac{\sum_{j>i} |a_{ij}|}{|a_{ii}| - \sum_{j<i} |a_{ij}|} \le \gamma.$$
Since this holds for all $x \neq 0$, we have that $\|B_{\mathrm{GS}}\|_\infty \le \gamma < 1$. $\Box$

Theorem 13.1.10. If $A$ is Hermitian and positive definite (denoted $A > 0$),
then $r(B_\omega) < 1$ for all $\omega \in (0, 2)$. In other words, SOR converges for any initial
vector $x_0$.

Proof. Since $A$ is Hermitian, we have that $A = D + L + L^H$. It follows that
$$B_\omega = (D + \omega L)^{-1}((1 - \omega)D - \omega L^H) = (D + \omega L)^{-1}(D + \omega L - \omega A) = I - (\omega^{-1}D + L)^{-1}A.$$
If $\lambda \in \sigma(B_\omega)$, then there exists a nonzero $x \in \mathbb{F}^n$ such that $B_\omega x = \lambda x$, and some
straightforward algebraic manipulations give
$$\frac{1}{1 - \lambda} = \frac{x^H(\omega^{-1}D + L)x}{x^H A x}.$$
Since $\omega \in (0, 2)$ and $A > 0$, we have that $\tfrac{2}{\omega} > 1$, and by Exercise 4.27 we have
$x^H D x > 0$. Moreover, since $A > 0$ we have $x^H A x > 0$. Therefore,
$$\operatorname{Re}\left(\frac{1}{1 - \lambda}\right) = \left(\frac{1}{\omega} - \frac{1}{2}\right)\frac{x^H D x}{x^H A x} + \frac{1}{2} > \frac{1}{2}.$$
Hence, for $\lambda = u + iv$, we have that
$$\frac{1}{2} < \operatorname{Re}\left(\frac{1}{1 - \lambda}\right) = \operatorname{Re}\left(\frac{1 - \bar\lambda}{(1 - \lambda)(1 - \bar\lambda)}\right) = \frac{1 - \operatorname{Re}(\lambda)}{1 - 2\operatorname{Re}(\lambda) + |\lambda|^2} = \frac{1 - u}{1 - 2u + u^2 + v^2}.$$
Cross multiplying gives $|\lambda|^2 = u^2 + v^2 < 1$. $\Box$



13.2 Minimal Polynomials and Krylov Subspaces


In this section and the next, we consider Krylov methods, which are related to
Newton's method, and which form the basis of many iterative methods for sparse
linear algebra problems. The basic idea is to construct a sequence of low-dimensional
subspaces in which an approximation to the true solution can be solved somewhat
easily. If the subspaces are well chosen, the true solution can be closely approxi-
mated in relatively few steps.
Before diving into Krylov spaces, we define minimal polynomials, which are
similar to characteristic polynomials, and which provide the theoretical founda-
tion for Krylov methods. We then show how Krylov subspaces can be used to
approximate the solutions of linear systems. In the next section, we show how the
algorithms work in practice.

13.2.1 Minimal Polynomials


The Cayley-Hamilton theorem (see Corollary 12.4.8) states that the characteristic
polynomial p(z) of a given matrix A E Mn(IF) annihilates the matrix, that is,
p(A) = 0. But it often happens that a polynomial of smaller degree will also
annihilate A.

Definition 13.2.1. The minimal polynomial $p \in \mathbb{F}[x]$ of a matrix $A \in M_n(\mathbb{F})$ is
the monic, nonzero polynomial of least degree in $\mathbb{F}[x]$ such that $p(A) = 0$.

Proposition 13.2.2. Given A E Mn(IF), there exists a unique minimal polynomial.

Proof. See Exercise 13.6. D

Lemma 13.2.3. Let p E IF[x] be the minimal polynomial of a matrix A E Mn(IF) .


If.\ is an eigenvalue of A, then .\ is a root of p; that is, p(.\) = 0.

Proof. By the spectral mapping theorem (Theorem 12.7.1), if.\ is an eigenvalue


of A, then p(.\) is an eigenvalue of p(A). Hence for some x =J 0, we have that
p(A)x = p(.\)x, but since p(A) = 0, we must also have that p(.\) = 0. D

Lemma 13.2.4. Let p E IF[x] be the minimal polynomial of A E Mn (IF). If q E IF[x]


satisfies q(A) = 0, then p(z) divides q(z) .

Proof. Since q(A) = 0, the degree of q is greater than or equal to the degree
of p. Using polynomial division, 49 we can write q = mp+ r, where m, r E IF[x]
and deg(r) < deg(p). Thus, r(A) = q(A) - m(A)p(A) = 0, which contradicts the
minimality of p, unless r = 0. D

The minimal polynomial can be constructed from the spectral decomposition,


as described in the next proposition.
49
We assume the reader has seen polynomial division before. For a review and for more about
polynomial division see Definition 15.2.l and Theorem 15.2.4.
13.2. Minimal Polynomials and Krylov Subspaces 527

Proposition 13.2.5. Given A E Mn(lF), let A = L>.E<T(A) >..P.x +D.x be the spectral
decomposition. For each distinct eigenvalue>.. , let m.x denote the order of nilpotency
of D.x. In this case, the minimal polynomial of A is

p(z) = IT (z - >..)m" . (13.8)


AE<T (A )

Proof. The proof is Exercise 13.7. D

Example 13.2.6. In practice t he characteristic polynomial and the minimal


polynomial are often the same, but they can differ. Consider the matrix

It is easy to see that the characteristic polynomial is p(z) = (z - 3) 4 , whereas


the minimal polynomial is p(z) = (z - 3) 2 .

13.2.2 Krylov Solution for Nonsingular Matrices


Let p(z) = L~=O ckzk be the minimal polynomial of a matrix A E Mn(lF) . If A is
nonsingular, then co is nonzero, and we can write (see Exercise 13.9)
d-1
A -1 = - -1 """'
L Ck+1A k . (13.9)
Co k = O

Applying this to the linear system Ax = b, we have


d-1
x =A - 1 b = - -1 """'
L Ck+1A k b. (13.10)
Co k =O

This implies that t he solution x lies in the d-dimensional subspace spanned by the
vectors b , Ab, A2 b , . . . , Ad- 1b.

Definition 13.2.7. For A E Mn(lF) and b E lFn, the kth Krylov subspace of A
generated by b is

Jtk(A, b) = span{b, Ab, A 2 b, ... , Ak- lb } .


If dim(Jtk (A, b )) = k, we call {b, Ab, A2 b ,. . ., Ak - lb} the Krylov basis of Jtk(A, b).

Remark 13.2.8. From (13.10) it follows that the solution of the invertible linear
system Ax= b lies in the Krylov subspace Jed(A, b) .
528 Chapter 13. Iterative Methods

Remark 13.2.9. Observe that

.Jei(A, b) c .JtZ(A, b) c Xa(A, b) c ··· .


The fundamental idea of Krylov subspace methods is to search for approximate
solutions of the linear system Ax = b in each of the low-dimensional subspaces
J[k(A,b), where k = 1,2, ... , until the approximate solution is close enough to
the true solution that we can terminate the algorithm. While we can prove that
Jt;i(A, b) contains the exact solution, we can usually get a sufficiently close approx-
imation with a much smaller value of k.

Remark 13.2.10. It's clear that J[k(A, b) C Jt;i(A, b) whenever k :::; d, but this
also holds for k > d, in particular, given a polynomial q E lF[x] of any order, we
can write it in the form q = mp+ r, where m, r E lF[x] and deg(r) < deg(p). Thus,
q(A)b = m(A)p(A)b + r(A)b = r(A) b E Jt;i(A, b) .

Example 13.2.11. Suppose

The minimal polynomial and is given by

p(z) = (z - 2) 3 (z - 3) = z4 - 9z 3 + 30z 2 - 44z + 24

and t he 4th Krylov subspace of A generated by bis given by

We may write the solution x of Ax= bin the Krylov basis as

13.2.3 Krylov Solution for Singu lar Matrices

Definition 13.2.12. Let A E Mn(lF) and b E lFn . The linear system Ax= bis said
to have a Krylov solution if for some positive integer k there exists x E J[k (A, b)
such that Ax = b .
13.2. Minimal Polynomials and Krylov Subspaces 529

In the previous subsection we saw that Ax = b always has a Krylov solution


if A is nonsingular. We now consider the case that A E Mn(IF) is singular .
Remark 13.2.13. Let A E Mn(IF), and let d be the degree of the minimal polyno-
mial of A . By Remark 13.2.10, the linear system Ax = b has a Krylov solution if
and only if there exists x E Jtd(A, b) satisfying Ax= b .

Lemma 13.2.14. If N E Mn(IF) is nilpotent of order d and b -j. 0, then Nx = b


has no Krylov solution. In other words, for any x E ..fi':k(N, b) we cannot have
Nx=b.

Proof. Since Nd - l "I- 0 and Nd = 0, the minimal polynomial of N has degree d.


If x E Jtd(N, b) satisfies Nx = b, then x may be written as a linear combination
of the Krylov basis, as
m-1
x = L ajNjb = aob + alNb + · · · + am- 2Nm - 2b + am - 1Nm- 1 b.
j=O
But this implies

and hence
(13.11)
The matrix in parentheses is nonsingular because it can be written in the form
I + M, where M = -aoN - a1N 2 - · · · - am- 2Nm- l is nilpotent; see
Exercise 13.ll(i). Hence b = 0, which is a contradiction D

Theorem 13.2.15 (Existence of a Krylov Solution). Given A E Mn(IF) and


b E !Fn, the linear system Ax = b has a K rylov solution if and only if b E &; (Am),
where m = ind(A) . Moreover, if a Krylov solution exists, then it is unique and can
be written as
(13.12)

Proof. The nonsingular case is immediate. Assume now that A is singular. Let
A = L>.Ecr (A) >..P;., + D;., be the spectral decomposition of A and let P* = L>.#O P;.. .
If xis a Krylov solution, then by Remark 13.2.13, it follows that x E Jtd(A, b),
where d is the degree of the minimal polynomial of A. Thus, we can write x as a
linear combination x =I:~:~ ajAJb. Left multiplying by P* and Po gives
d-1 d- 1
P*x = LajAJP*b and Pox= L::ajD6Pob , (13.13)
j=O j=O
and hence P*x E Jtd(A, P*b) and Pox E Jtd(Do, Pob).
Using the spectral decomposition of A, it is straightforward to verify that
P0 A = D 0 , and thus since Ax = b, then Pox is a solution of the nilpotent linear
system D 0 P 0 x = P0 b, which implies P0 b = 0 by the lemma. Therefore, b = P*b ,
which implies that b E &; (P*) = &;(Am) and x = A Db by Exercise 12.48.
530 Chapter 13. Iterative Methods

Conversely, if b E f!J2 (Am) , then by Exercise 12.48 we have AA Db = b , which


implies that x = ADb is a solution to Ax= b. To prove that ADb E Jth,(A, b), for
some k EN, we observe that A is invertible when restricted to the subspace f!J2 (P*),
and its inverse is AD (this follows since AAD = AD A = P*) . Thus ADP* can be
written as a polynomial q in AP* via (13.9), and so it follows that

x = ADb =AD P*b = q(AP*)b = q(A)P*b = q(A)b .

To prove uniqueness, suppose there exists another solution y = I:;=O CjAJb


of the linear system. It follows that y = x + z for some z E A' (A) . Note that
P*y = P*x and
k k k
P0 y = L::CjPoAjb = L::CjD6Pob = LcjD6o = o.
j=O j=O j=O

Thus, y = P.y + P0 y = P. y = P.x == ADb. D

Remark 13.2.16. By Remark 12.9.4, we can write the Drazin inverse of A as

Since c. is invertible, we can write C; 1 as a linear combination of powers of C*,


that is, as a polynomial in C* (see 13.10). The degree of this polynomial, which
is also the dimension of the largest Kry lov subspace Jtk (A, b), is degree d of the
minimal polynomial of A minus the index m of A, that is, k = d - m.

13.3 The Arnoldi Iteration and GMRES Methods


Although the solution x to a linear system Ax = b lies in the span of the Krylov
basis {b, Ab, ... , Ad- 2 b, Ad-lb}, the Krylov basis is not suitable for numerical
computations. This is because as k becomes large, the vectors Ak- lb and Akb
become nearly parallel, which causes the basis to become ill conditioned; that is,
small errors and round-off effects tend to compound into large errors in the solution.
Arnoldi iteration is a variant of the (modified) Gram- Schmidt orthonormal-
ization process (see Theorem 3.3.1) that constructs an orthonormal set of vectors
spanning the Krylov subspace Jtd(A, b). This new orthonormal basis is chosen so
that the matrix representation of A in terms of this basis is nearly upper triangular
(called upper Hessenberg) , which also makes solving many linear algebra problems
relatively easy.
In this section, we use Arnoldi iteration to solve linear systems using what
is called the generalized minimal residual method, or GMRES 50 for short. For
each k E z+, the GMRES algorithm determines the best approximate solution
xk E Jtk (A, b) of the linear system Ax = b and arrives at the exact solution once
k = d. In practice, however, we can usually get a good approximation with far
fewer iterations.
5 0Pronounced G-M-Rez.
13.3. The Arnoldi Iteration and GMRES Methods 531

There are several linear solvers that use Arnoldi iteration. GMRES is fairly
general and serves as a good example. For many systems, it will perform better than
Gaussian elimination (see Section 2.7), or, more precisely, the LU decomposition
(see Application 2.7.15). It is also often better than the methods described in
Section 13.l. Generally speaking, GMRES performs well when the matrix A is
sparse and when the eigenvalues of A are not clustered around the origin.

13.3.1 Hessenberg Matrices and the Arnoldi Iteration

Definition 13.3.1. A Hessenberg matrix is a matrix A E Mmxn(lF) with all zeros


below the first subdiagonal; that is, if A= [ai ,j], then ai,j = 0 whenever i > j + 1.

Example 13.3.2. Every 2 x 2 matrix is automatically Hessenberg. Hessen-


berg 3 x 3 matrices must have a zero in the bottom left corner, and Hessenberg
4 x 3 matrices have three zeros in the bottom left corner. We denote these as

[~ : :] E Ms(F) and
[0O** **0* **]** E M4x3(lF),
where * indicates an arbitrary entry- not necessarily zero or nonzero.

The Arnoldi process begins with an initial vector b -:/- 0. Since the basis being
constructed is orthonormal, the first step is to normalize b to get q1 = b/llbll·
This is the Arnoldi basis for the initial Krylov subspace Jfi(A, b) . Proceeding
inductively, if the orthonormal basis q 1 , .. . , qj for Jtj(A, b) has been computed,
then construct the next basis element by subtracting from A% E Jtj+l (A, b) its
projection onto Jtj (A, b). This goes as follows . First, set

hi,j = (qi , Aqj) = q~ Aqj, i :::; j.

Now set
j

qj+l = Aqj - L hi,jqi and hj+l,j = lliij+1 I · (13.14)


i =l

If hJ+l,j = 0, then lliiJ+ill = 0, which implies Aqj E Jtj(A, b), and the algorithm
terminates. Otherwise set
- qj+l
qj+l - -h- . (13.15)
j+l,j
This algorithm produces an orthonormal basis for each Jtk(A, b) fork = 1, 2, ... , m,
where mis the smallest integer such that Xm(A, b) = Xm+1(A, b).
For the ambient space JFn, define for each k = 1, 2, ... , m the n x k matrix
Qk whose jth column is the jth Arnoldi basis vector % · Let ilk be the (k + 1)
x k Hessenberg matrix whose (i,j) entry is hij· Note that when i > j + 1,
532 Chapter 13. Iterative Methods

we haven't yet defined hi,j , so we just set it to O; this is why the matrix fh
is Hessenberg.

Example 13.3.3 . The first two Hessenberg matrices H1 and H2 are

and

We can combine (13.14) and (13.15) as


Aq1 = hi,1Q1 + · · · + h1 ,1Q1 + hJ+1,1QJ+i
or, in matrix form ,
(13.16)
Note that since the columns of Q 1 are orthonormal and equal to the first j columns
of QJ+ 1 , the product Q7QJ+ 1 is the j x j identity matrix with an extra column of
zeros adjoined to the right. That is, it is the orthogonal projection that forgets the
last coordinate. So multiplying (13.16) on the left by Q7
gives
Q~AQ 1 =H1 , (13.17)
where H 1 is the matrix obtained from fI1 by removing the bottom row. Note that
in the special case that j = n, the matrix Qn is square and orthonormal.

Example 13.3.4. Consider the linear system Ax= b from Example 13.2.11 ,
that is,

A~ r~ ~ ~ ~1 Md b ~ rJ
Applying the Arnoldi method yields the following sequence of orthonormal
and Hessenberg matrices computed to four digits of accuracy:

0.51
0.5 8.75]

l
0.5 ' [5.403 '
r0.5
0.5 o. 7635 8.75 -5 .472]
0.5 0.1157
5.403 -1.495 '
0.5 -0.3471 ' [ 0
r0.5 -0.5322
0. 7238
l l
13.3. The Arnoldi Iteration and GM RES Methods 533

0.7635 - 5.472
04025 [ 8 75 2230
[05
0.5 0.1157 -0.7759 - 1.495 -0.1341
Q3=
0.5 -0.3471 - 0.1016 '
H3 = 5.r3
0.7238 0.6715 '

l
0.5 -0.5322 0.4750 0 0.1634
and
0.7635 0.4025
0.0710
[05
0.5 0.1157 -0 .7759 -0.3668
=

l
Q4 0.7869 ,
0.5 -0 .3471 -0.1016
0.5 -0.5322 0.4750 - 0.4911
-5.472 2.230
[ 8 75 - 1.495 -0.1341
-0.3009
1331
H4 = 5.r3
0.7238 0.6715 - 0.2531
0 0.1634 1.074

13.3.2 The GMRES Algorithm


GMRES is based on the observation that the solution x of the linear system Ax = b
can be found by solving the least squares problem, that is, minimizing the residual
llAx - bll2· To find this minimizer, the algorithm constructs successive approxi-
mations Xk of x by searching for the vector y E Jtk (A, b) such that 1 Ay - b II2 is
minimized. In ot her words, for each k E N, let

xk = argmin llAy - bll 2·


yEXdA ,b)

Once Jtk(A, b) = Jtk+ 1 (A , b), we have that x E Jtk(A, b), and so the least squares
solution is achieved; that is, xk = x.
Using the Arnoldi basis q1, . .. , qk of Jtk(A,b) described above, any y E
Jtk(A, b) can be written as y = Qkz for some z E IFk, where Qk is the matrix of
column vectors q 1 , ... , qk. It follows that

where Hk is the (k + 1) x k upper-Hessenberg matrix described in (13.16). Using


Exercise 4.31 , we have

l Qk+1Hkz - bll 2= llHkz - Q~+1h l 2 = lliikz - ,Be1 ll 2,


where ,B = llbll 2- Thus, in the kth iteration of GMRES the problem is to find
Zk E JFk minimizing llHkz - ,Be1112 - We write this as

(13.18)

and define the kth approximate solution of the linear system as xk = Qkzk·
Once the Arnoldi basis for step k is known, the solution of the minimization
problem is relatively fast , both because the special structure of Hk allows us to
use previous solutions to help find the next solution and because k is usually much
smaller than n .
534 Chapter 13. Iterative Methods

To complete the solution, we use the QR decomposition of fik. Recall the


QR decomposition (Section 3.3.3), where fik E M(k+l)xk(JF) can be written as a
product
(13.19)
where flk E Mk+i(lF) is orthonormal (that is, satisfies fl~{lk = flk{l~ = Ik+1) and
Rk E M(k+l)xk(JF) is upper triangular. Thus,

Zk = argmin
zE JFk
llfikz - f3e1 II 2
= argmin
zEJFk
l Akz - (3fl~e1 I .
2
(13.20)

Since Rk is upper triangular, the linear system Rkz = (3fl~ e 1 can be solved quickly
by back substitution. In fact, since Rk is a (k + 1) x k upper-triangular matrix, its
bottom row is all zeros. Hence, we write

where Rk is a k x k upper-triangular matrix. We can also split off the last entry of
the right -hand side (3fl~e1. Setting

(3fl~e1 = [~:] (13.21)

gives the solution of (13.18) as that of the linear system Rkzk = gk, which can be
computed via back substitution.

Remark 13.3.5. The QR decomposition, followed by back substitution, is not


faster than row reduction for solving linear systems, and so at first blush, it might
seem that this approach of solving a linear system at each step is ridiculous. How-
ever, as we see below, the special structure of the upper-Hessenberg matrix fik
makes it easy to solve these systems recursively, and thus this method solves the
overall problem rather efficiently.

Now compute the least squares solution for the next iterate. Note that

(13.22)

where hk+l = [h1,k+ 1 h2,k+i hk+I ,k+I]T. Left multiplying (13.22) by the
Hermitian matrix

yields

(13.23)

If u = 0, then the rightmost matrix of the above expression is Rk+I, which is upper
triangular , and the solution of the linear system Rkz = (3fl ~e 1 can be found quickly
with back substitution.
13.3. The Arnoldi Iteration and GM RES Methods 535

If, instead, we have ~ # 0, then perform a special rotation to eliminate the


bottom row. Specifically, let

(13.24)

where
and

Thus,

[~
J
where rk+l ,k+l = p2 + ~ 2 . This gives the next iterate of (13.19) without having
to compute the QR decomposition again. In particular, we have Fh+1 = nk+1Rk+1,
when~ nk+l is orthonormal (since it is the product of two orthonormal matrices)
and Rk+l is upper triangular.
We can also split up the next iterate of (13.21). Note that

It follows t hat

Hence, the solution zk+ 1 is found by solving the linear system

(13.25)

which can be found by back substitution .

Remark 13.3.6. Theorem 13.2.15, combined with Remark 13.2.16, guarantees


that GMRES will converge in no more than d iterations, where dis the degree of the
minimal polynomial minus the index of A. But, as mentioned earlier, GMRES can
often get a sufficiently good approximation to the solution with far fewer iterates.
Since the least squares error is given by ['Yk [, we can monitor the error at each
iteration and terminate when it is sufficiently small.

13.3.3 A Worked Example


We conclude this section with a worked example of the GMRES method. Consider
the linear system from Examples 13.2.11 and 13.3.4. Although this system is easy
536 Chapter 13. Iterative Methods

to solve wit h back substitution (since it's upper triangular) , we solve it here with
GMRES for illust rative reasons. In what follows we provide four digits of accuracy.
Since each fh is given in Example 13.3.4, we can solve each iteration fairly
easily. Fork = 1, it follows from (13.20) that

z1= ar~~in l H1z - f3e111 2= ar~~in llA1z - (3s1~e111 2 .


T he QR decomposition of H1 = n1R1 is given by
.H1 = [ 8.75] = [0.8509 -0.5254] [10.28]
5.403 0.5254 0.8509 0 .
"--v-''-v-'

Thus,

The least squares error is 1.051. For the next iteration, we have that

z2= argmin
z EJF2
l H2z - (3e II2 =
1 argmin
z EJF2
l A2z - (3n~e1 II2 .

0.4~16] ,
0
0.9113
-0.4116 0.9113

- = [n1
!°h ~] G~ =
[0.8509
0.5254
-0.5254
0.8509
~rn
0
0.9113 -0~116 l
l
0
0 0 0.4116 0.9113
[0.8509 - 0.4788 0. 2163
0.5254 0.7754 - 0.3502 '
0 0.4116 0.9113
and

R2 = 02
[R1
~ rb2 ''] [!
0
0.9113
- 0.4116
0.4116
0
0.9113
l [10.28
0
0
-5.441]
1.603
0.7238

n28 - 5441]
1. ~58 .

l
Moreover , we have that

fiil~e1 ~ [-sn1
c~~I ]
[ 0.9113(-1.051)
1702
-0.4116( - 1.051)
[ -0.9576l
1 702 .
0.4325
13.3. The Arnoldi Iteration and GM RES Methods 537

Thus, the least squares solution satisfies the linear system (13.25)

10.28 -5.441] = [ 1. 702 ]


[ 0 1.758 z 2 - 0.9576 '

which can be solved with back substitution. The solution is

Z2 = [- 0.1227 - 0.5446] T ,

l
which has a least squares error of 0.4325. Repeating for k = 3, we have H3 = D3 R3 ,
where
l 0
0 1 0 0
0 0 0.9899 0.1417 '
r 0 0 -0.1417 0.9899
0.8509 - 0.4788 0.2163 - 0.0307]
0.5254 0.7754 -0.3502 0.04965
0 0.4116 0.9021 - 0.1292 '
r 0 0 0.1417 0.9899
and
-5.441
1.758 - 1.8271
0.8953
0 1.153 .
0 0
Moreover, we have that

-H
(3D3 e1 = [
g2
c212 ]
- S 2 ')'2
=
r
- 1.702
0.9576
0.4281
- 0.06131
.
l
l [ l
Thus, the least squares solution is satisfied by the linear system (13.25)

10.28 -5.441 1.827 1. 702


0 1.758 -0.8953 Z2 = - 0.9576 .
[ 0 0 1.153 0.4281

Solving with back substitution gives

Z3 = [-0.08860 - 0.3555 0.3714] T ,

which has a least squares error of 0.06131. Finally, repeating for k = 4, we have
H4 = D4R4, where G4 = h ,

~l'
0.8509 -0.4788 0.2163 -0.0307
0.5254 0.7754 - 0.3502 0.04965
D3 = 0 0.4116 0.9021 -0.1292
0 0 0.1417 0.9899
0 0 0 0
538 Chapter 13. Iterative Methods

and
-5.441 1.827
-097461
1.758 -0.8953 0.7665
R3 =
ITS 0
0
0
1.153
0
0
-0 .4654 .
1.1513
0
Moreover, we have that

I
1.7021
,Bs1~e1 = [ c~n2
g2 -0.9576
] = 0.4281 .
-82"(2 -0.06131
0

Thus, the least squares solution is satisfied by the linear system (13.25)

[
10.28
0
0
0
-5.441
1.758
0
0
1.827
-0.8953
1.153
0
-0.9746]
0.7665
-0.4654 z 2
1.1513
=
ll 1.702
-0.9576
0.4281 ·
-.0613

Solving with back substitution gives

Z3 = [- 0.08860 -0.3432 0.3499 -0.05325] T,

which has a least squares error of 0. Since x4 = Q4z4, we have that

X4 = Q4Z4 = [-0.1667 -0.333,3 0 0.3333]


T
~ 61 [-1 -2 0
T
2] ,

which is the solution. In other words, x = x4.


Remark 13.3.7. In this example, we didn't stop until we got to k = 4 because
we wanted to demonstrate that when k = d, we get a least squares error of zero.
However, the previous step has a fairly small least squares error (0.06131) . Note
that
X 3 = Q3Z3 = [-0.1662 - 0.3736 0.0413 0.3213] T

is pretty close to X4.

13.4 *Computing Eigenvalues I


The power method (see Section 12.7.3) determines the eigenvalue oflargest modulus
of a given matrix A E Mn (IF). In this section, we show how to compute the full
set of eigenvalues of A, using Schur's lemma and the QR decomposition. Schur's
lemma (see Section 4.4) guarantees there exists an orthonormal Q E Mn(IF) such
that QH AQ is upper-triangular, and the diagonal elements of this upper-triangular
matrix are the eigenvalues of A (see Proposition 4.1.22) . We show how to compute
a Schur decomposition for any matrix A E Mn(IF) where the eigenvalues all have
distinct moduli. We do this by repeatedly taking the QR decomposition in a certain
clever way called QR iteration.
13.4. *Computing Eigenvalues I 539

13.4.1 Uniqueness of the QR Decomposition


The QR decompositions constructed in Theorems 3.3.9 and 3.4.9 do not necessarily
have only nonnegative diagonals, but they can always be rewritten that way. Specifi-
cally, if QR = A is any QR decomposition of a matrix A E Mn(IF) , let
A= diag(A1, ... , Am) E Mm(IF) be the diagonal matrix whose ith entry Ai is

if rii =f. 0,
if Tii = 0,
where Tii is the ith diagonal element of R . It is straightforward to check that A is
orthonormal, and R' = AR is upper triangular with only nonnegative entries on its
diagonal. Therefore, Q' = QA H is orthonormal, and A = Q' R' is a QR decomposi-
tion of A such that every diagonal element of R is nonnegative. Thus, throughout
this section and the next , we assume the convention that the QR decomposition
has only nonnegative diagonals. If A is nonsingular, the QR decomposition may be
assumed to have only positive diagonals.

Proposition 13.4.1. If A E Mn(IF) is nonsingular, then the QR decomposition


with only positive diagonals is unique.

Proof. Let A= Q1R1 and A= Q2R2, where Q1 and Q2 are orthonormal and R1
and R 2 are upper triangular with all positive elements on the diagonals. Note that

AHA=R~R1 = R~R2,

which implies that


R 1R 2- l -- R 1- HRH2 -- (R 2R 1- l)H .
1
Since R 1H2 is upper triangular and (R 2R! 1)H is lower triangular, it follows that
both are diagonal. Writing the diagonal elements of R 1 and R2, respectively, as
R1 = diag(a1, a2, ... , an) and R2 = diag(b1, b2, ... , bn), it follows for each i that

which implies that each a; = bi, and since each ai and bi are positive, we must have
1
ai = bi for each i. Thus, R 1H2 = I, which implies R1 = R2· Since Q1 = AR!
1
1
and Q2 = AR.;- , it follows that Q1 = Q2. D

13.4.2 Iteration with the QR decomposition


Using the QR decomposition in an iterative fashion, we compute a full set of eigen-
values for a certain subclass of matrices A E Mn(IF). We begin by setting Ao = A
and computing the QR decomposition Ao = QoRo. We then set A1 = RoQo and
repeat. In other words, for each k EN, we compute the QR decomposition of
(13.26)

and then set


(13.27)
540 Chapter 13. Iterative Methods

We then show that in the limit as k --+ oo, the subdiagonals converge to zero and
the diagonals converge to the eigenvalues of A.

Lemma 13.4.2. Each Ak is similar to A.

Proof. The proof is Exercise 13.16. D

Lemma 13.4.3. For each k E N define

(13.28)

We have
Ak+l = UkTk and Ak+i = ur AUk · (13.29)
Moreover, if A is nonsingular, then so is each Tk, and thus

Ak+1 = TkATi: 1. (13.30)

Proof. We proceed by induction on k E N. It is easy to show that (13.29) and


(13.30) hold for k = 0. Assume by the inductive hypothesis that Ak = UkTk
and Ak = ur_ 1AUk-l· If A is nonsingular, then we also assume Ak+l = TkATi: 1.
We have that

Ak+ 2 = AAk+l = AUkTk = UkAk+lTk = UkQk+1Rk+1Tk = Uk+1Tk+1

and

ur+ 1AUk+l = Q~+lur AUkQk+1 = Q~+1Ak+1Qk+1 = Rk+1Qk+1 = Ak+2·


Finally if A is nonsingular , then
1
Tk+1Ar;;_;1 = Rk+1TkATi: Ri;~ 1 = Rk+1Ak+1Ri;~ 1 = Rk+1Qk+1 = Ak+2·
Thus, (13.29) and (13.30) hold for all k E N. D

Theorem 13.4.4. If A E Mn (IF) is nonsingular and can be written as A = PD p-l,


where D = diag(>.1, >-2, ... , An) is spectrally separated, that is, it satisfies

l>-11 > l>-21 > · · · > l>-nl > 0, (13.31)

then the matrix Ak = [a1J)], as defined in (13.27), satisfies the following:

(i) If i > j, then limk-+oo alJ ) = 0.

(1·1·) ror
L'
eac h i,. we h ave 1.imk-+oo aii(k) = \
-"i·

Proof. Let P = QR be the QR decomposition of P with positive diagonals, and


let p-l = LU be the LU decomposition of p- 1. Since

A= P DP-- 1 = QRDR- 1QH,


13.4. *Computing Eigenvalues I 541

we have that

is upper triangular with diagonals .\ 1 , .\ 2 , ... , An ordered in moduli. Note

Since the diagonals of L are all one, we have

i < j,
i = J, (13.32)
i > j as k -t oo,

or, in other words, Dk LD-k = I+ Ek, where Ek -t 0. Hence,

\_Yriting I+ REkR- 1 = QkRk as the unique QR decomposition, where Qk -t I and


Rk -t I, we have

which is the product of an orthonormal matrix in an upper-triangular matrix. But


we do not know that all of the diagonals of RkRDkU are positive. Thus, for each
k we choose the unitary diagonal matrix Ak so that AkRkRDkU has only positive
diagonals. Hence,

Tk = AkRkRDkU -t AkRDkU,
Uk = QQkAJ; 1 -t QAJ; 1 ,
which holds by uniqueness of the QR decomposition. It follows that

which is upper triangular. Thus, the lower-triangular values of Ak+l converge to


zero and the diagonals of Ak+l converge to the diagonals of RDR- 1 , which are the
eigenvalues of A. D

Example 13.4.5. Let

A = ~ [~ ~].
By computing the QR iteration described above, we find that

.
}~Ak =
[20 1/3]
1 .

From this, we see that <J(A) = {1, 2}. In this special case the upper-triangular
portion converges, but this doesn't always happen.
542 Chapter 13. Iterative Methods

Remark 13.4.6. The upper-triangular elements of Ak+l need not converge. In


fact, it is not uncommon for them to oscillate in perpetuity as we iterate in k. As
a result, we can't say that Ak+l converges, rather we only assert that the lower-
triangular and diagonal portions do always converge as expected.

Example 13.4. 7. Let

A== [~ !1] .
By computing the QR iteration described above, we find that

(-l)k]
1 .

From this, we see that a(A) = {-2, 1}. Note that the upper-triangular entry
does not converge and in fact oscillates between positive and negative one.

Remark 13.4.8. If the hypothesis (13.31) is weakened, then QR iteration pro-


duces a block-upper-triangular matrix (similar to the original), where each block
diagonal corresponds to eigenvalues of equal moduli. This follows because (13.32)
generalizes to
i>-il < i>-ji,
l>-il = i>-ji,
i>-ii > l>-ji ask-+ oo.
It is also important to note that the block diagonals will often not converge, but
rather oscillate, since the convergence above is in modulus. Thus, we can perform
QR iteration and then inspect Ak once k is sufficiently large that the lower-diagonal
blocks converge to zero, and then we can analyze the individual blocks to find the
spectrum. In the example below we consider a simple case where we can easily
determine the spectrum of the oscillating block.

Example 13.4.9. Let

l
5 1
-4 12 -5
-4 01
-3
A= 10 -8 -4 3 .
-2 13 -8 -2

We can show that the eigenvalues of A are a(A) = {2, 1, 4 + 3i, 4 - 3i}. By
computing the QR iteration described above, we find that

l
9.8035 3.6791 1.6490 -18.30221
-11.6006 -1.8035 8.5913 0.7464
A 10000 = 0 0 2.0000 -1.2127 .
0 0 0 1.0000
13.5. *Computing Eigen valu es II 543

Thus we can see instantly that two of the eigenvalues are 2 and 1, but we have
to decompose the first block to get the other two eigenvalues. Since A 10000
is block upper triangular, the eigenvalues of each block are the eigenvalues of
the matrix (see Exercise 2.50), which is similar to A. Note that the block in
question is
B = [ 9.8035 3.6791 ]
-11.6006 -1.8035 '
which satisfies tr(B) = 8 and det(B) = 25. From here it is easy to see that the
eigenvalues of B are 4 ± 3i, and thus so are the eigenvalues of A 10000 and A.

Remark 13.4.10. Anytime we have a real matrix with a conjugate pair of eigen-
values, which is often, the condition (13.31) is not satisfied. In this case, instead of
the lower-triangular part of Ak converging, it will converge below the subdiagonal,
and the 2 x 2 block corresponding to the conjugate pair will oscillate. One way to
determine the eigenvalues is to compute the trace and determinant, as we did in
the previous example (see Exercise 4.3). The challenge arises when there are many
eigenvalues of the same modulus producing a large block. In the following section,
we show how to remedy this problem.

Remark 13.4.11. In practice this isn't a very good method for computing the
eigenvalues; it's too expensive computationally! The QR decomposition takes too
many operations for this method to be practical for large matrices. The number
of operations grows as a cubic polynomial in the dimension n of the matrix; thus,
when n is large, doubling the matrix increases the number of operations by a factor
of 8. And this is for each iteration! Since the number of QR iterations is likely to be
much larger than n (especially ifthere are two eigenvalues whose moduli are close),
the number of operations required by this method will have a leading order of more
than n 4 . In the following section we show how to vastly improve the computation
time and make QR iteration practical.

13.5 *Computing Eigenvalues II


The downside of the QR iteration described in the previous section is that it is
computationally expensive and therefore not practical, unless the matrices are small.
In this section we show how to substantially speed up QR iteration to make it
much more practical for mid-sized matrices. For very large matrices, it may not
be practical to compute all of the eigenvalues. Instead we may have to settle for
just computing the largest eigenvalues. We finish this section with a discussion of
a sort of a hybrid between the power method and QR iteration called the Arnoldi
algorithm, which computes the largest eigenvalues.
The techniques in this section are based on three main observations. First, by
using Householder transformations, we can transform a given matrix, via similarity,
into a Hessenberg matrix, and then do QR iteration on the Hessenberg matrix
instead. Second, if the initial matrix Ao is Hessenberg, then so is each Ak+l in QR
iteration. Finally, there's a very efficient way of doing the QR decomposition on
Hessenberg matrices. By combining these three observations, we are able to do QR
iteration relatively efficiently.
544 Chapter 13. Iterative Methods

13.5.1 Preconditioning to Hessenberg Form


We begin by showing how to use Householder transformations to transform a
matrix, via similarity, into a Hessenberg matrix. Recall from Section 3.4 that the
Householder transform is an Hermitian orthonormal matrix that maps a given vector
x to a multiple of e 1 (see Lemma 3.4. 7). This is used in Section 3.4 as a method
for computing the QR decomposition, but we can also use it to transform a matrix
into Hessenberg form.

Proposition 13.5.1. Given A E Mn(lF), there exists an orthonormal Q E Mn(lF)


such that QH AQ is Hessenberg.

Proof. We use a sequence of Householder transforms H1, H2 , .. . , Hn - 1 to eliminate


the entries below the subdiagonals. For k = 1, write

where au E JF, where a12 E JFn-l, and where A22 E Mn-1 (JF) . Let H1 E Mn-1 (JF)
denote the Householder matrix satisfying H 1a2 1 = v1e 1 E JFn- 1 , and let H 1 be the
block-diagonal matrix H 1 = diag(l, H1 ). Note that

Therefore, all entries below the subdiagonal in the first column of H 1 AH1 are zero.
For general k > 1, assume by the inductive hypothesis that we have

Here all the Hi are Hermitian and orthonormal, and Au E Mkxk-l(JF) is upper
Hessenberg. Also ak+l ,k E JFn-k, al,k E JFk , Aa,k+l E Mkxn-k, and Ak+ l,k+l E
Mn-k(JF) . Choose flk E Mn-k to be the Householder transformation such that
flkak+l,k = vke1 E JFn-k. Setting Hk = diag(h , Hk) gives

Continuing for all k < n shows that for Q = H 1H 2 · · · Hn-l the matrix QH AQ is
upper Hessenberg. D

13.5.2 QR Iteration for Hessenberg Matrices


Now we show that if the initial matrix Ao in QR iteration is Hessenberg, then so is
each Ak+1, where Ak+l is defined by (13.27).

Le mma 13.5.2. The products AB and BA of a Hessenberg matrix A E Mn(lF)


and an upper-triangular matrix BE Mn(lF) are Hessenberg.
13.5. *Computing Eigenvalues II 545

Proof. The matrix A = [aij] satisfies aij = 0 whenever i + 1 > j , and the matrix
B = [bij] satisfies bij = 0 whenever i > j. The (i, k) entry of the product AB is
given by
n
(AB)ik = L Uijbjk· (13.33)
j=l

Assume i + 1 > k and consider aijbjk · If j < i + 1, then aij = 0, which implies
aijbjk = 0. If j 2:: i + 1, then since j>k, we have that bjk=O, which implies aijbjk = O.
In other words, aijbjk = 0 for all j, which implies that the sum (13.33) is zero. The
proof that the product BA is Hessenberg is Exercise 13.19. D

Theorem 13.5.3. If A is Hessenberg, then the matrices Qk and Ak given in (13.26)


and (13.27) are also Hessenberg .

Proof. The proof is Exercise 13.20. D

13.5.3 QR Decomposition of Hessenberg Matrices


We now demonstrate an efficient way of doing the QR decomposition on Hessenberg
matrices. Specifically, we show that the QR decomposition of a Hessenberg matrix
A E Mn(IF) can be performed by taking the product of n - l rotation matrices (13.35)
called Givens rotations. The reason this is useful is that the number of operations
required to multiply by a Givens rotation is linear in n, and so the number of
operations required to construct the QR decomposition of a Hessenberg matrix in
this way is quadratic in n, whereas the number of operations required to construct
the QR decomposition of a general matrix in Mn(IF) is cubic inn. Therefore, QR
decomposition of Hessenberg matrices is much faster than QR decomposition of
general matrices.

Definition 13.5.4. Assume that i < j. A Givens rotation of() radians between
coordinates i and j is a matrix operator G(i , j, ()) E Mn(IF) of the form

G(i,j,B)=l- (1 - cosB)eieT-sinBeieJ +sinBejeT-(1 - cosB)ejeJ, (13.34)

or, equivalently,

G(i,j,B) =
Ji-1
0
0
0
0
0
cos()
0
sin()
0
0
0
Ij - i- 1
0
0
0
- sin()
0
cos()
0
0
0
Ino-j
I . (13.35)

Remark 13.5.5. When a Givens rotation acts on a vector, it only modifies two
components. Specifically, we have
546 Chapter 13. Iterative Methods

Xi - l
Xi COS 8 - Xj sin 8
Xi+l

G(i,j, B)
Xj-l Xj-l
Xj XisinB+xjcose
Xj+l Xj+l

Xn Xn

Similarly, left multiplying a matrix A E Mn(lF) by a Givens rotation only modifies


two rows of the matrix. The key point here is that the number of operations this
multiplication takes is linear in the number of columns (n) of the matrix (roughly
lOn operations). By contrast, multiplying two general n x n matrices together
requires roughly 2n3 operations. Hence, when n is large, Givens rotations are much
faster than general matrix multiplication.

Proposition 13.5.6. Givens rotations satisfy

(i) G(i,j, B)G(i,j, ¢) = G(i,j, e+ ¢),


(ii) G( i, j, B) T = G( i, j, -B ) (in particular, G( i, j, B) T is also a Given rotation),

(iii) G( i, j, B) is orthonormal.

Proof.

(i) This is Exercise 13.21.

(ii) This follows by taking the transpose of (13.35) and realizing that the cosine
is an even function and the sine is an odd function.

(iii) We have G(i , j,B)G(i,j,B)T = G(i,j,B)G(i,j, -B) = G(i,j,O) = I. D

Theorem 13.5.7. Suppose A E Mn(lF) is Hessenberg. Let H 1 =A and define

k = 1, 2, .. . , n - 1,

where Gk = G(k, k + 1, Bk), with

Bk = Arctan
( -W-
hk~l,k) .
hk,k
(13.36)

The matrix Hn is upper triangular and satisfies


13.5. *Computing Eigenvalues II 547

In other words, the QR decomposition A = QR of A is given by Q = G1G2 · · · Gn-1


andR =Hn .

Proof. We prove by induction that each Hk is Hessenberg, where the first k - 1


subdiagonal elements are zero. The case k = 1 is trivial. Assume Hk = [h;7)] is
Hessenberg and satisfies h;~l,i = 0 for each i = 1, 2, .. . , k-1. By Exercise 13.22 the
matrix Hk+l is Hessenberg. Moreover, Exercise 13.23 guarantees the subdiagonal
entries h;~t,~) all vanish for i = 1, 2, . . . , k - 1. We complete the proof by showing
(k+l)
that hk+l,k = 0. We have

(k+l)
h k+l,k T GTH
= ek+l k kek
= (eI+i - sinlheI - (1 - cos8k)ek+i) Hkek
= -eIHkeksin8k +eI+iHkekcosek
= -hi:k sin ek + hi~l,k cos ek
= 0,

where the last step follows from (13.36). D

Remark 13.5.8. As observed above, the number of operations required to carry


out the QR decomposition of a Hessenberg matrix is quadratic in n , while the
number required to construct the QR decomposition of a general matrix is cubic in
n. This efficiency is why for QR iteration it's usually best to transform a matrix
into Hessenberg form first and then do all the subsequent QR decompositions with
Givens rotations.

13.5.4 The Arnoldi Method for Computing Eigenvalues


We finish this section by discussing the Arnoldi method, which approximates the
largest eigenvalues of A E Mn (lF) by performing QR iteration on the Arnoldi iterate
Hk (see Section 13.3.1) instead of A. When k = n the matrix Hn is similar to A,
and therefore the eigenvalues of Hn are exactly the same as those of A (that is,
within the accuracy of floating-point arithmetic) . But for intermediate values of k,
the eigenvalues of Hk are usually good approximations for the largest k eigenvalues
(in modulus) of A.
Arnoldi iteration produces a k x k Hessenberg matrix Hk at each step, and the
k eigenvalues of Hk are called the Ritz eigenvalues of A of A. There does not yet
seem to be a good theory of how the Ritz eigenvalues converge to the actual eigenval-
ues. Nevertheless, in practice the Ritz eigenvalues do seem to converge (sometimes
geometrically) to some of the eigenvalues of A. The Arnoldi method provides bet-
ter approximations of the eigenvalues of A when the eigenspaces are orthogonal or
nearly orthogonal, whereas when the eigenspaces are far from orthogonal, that is,
nearly parallel, then the Ritz eigenvalues usually are worse approximations to the
true eigenvalues of A. For more details on this, see [TB97].
548 Chapter 13. Iterative Methods

Example 13 .5.9. Here we use the Arnoldi method to calculate the Ritz
eigenvalues of the matrix in Example 13.2.11. The square Hessenberg ma-
trices (Hk) and the Ritz eigenvalues corresponding to Arnoldi step k are as
follows:
Hk Ritz Eigenvalues

[8.75] 8.75

[ 8.75 - 5.472] 3.627 ± il.823


5.403 -1.495

[ 8 75
5 . ~03
-5.472
-1.495
0.724
223?
-O.U4
0.672
l 3.416, 2.555 ± i0.977

r5.r3
8 75
-5.472
-1.495
2.230
-0.l~\4
-13311
0.301
3.000, 2.000, 2.000, 2.000
0.724 0.672 -0.253
0 0.163 1.074

Rem ark 13.5.10. When A E Mn(lF) is Hermitian, the Hessenberg matrices formed
by Arnoldi iteration are tridiagonal and symmetric. In this case, the Arnoldi method
is better understood theoretically, and is actually called the Lanczos method. In
addit ion to t he fact that A has orthogonal eigenspaces, the implementations are
able to st ore fewer numbers by taking advantage of the symmetry of A; and this
greatly reduces t he computation time.

Exercises
N ote to the student: Each section of this chapter has several corresponding
exercises, all collected here at the end of the chapter. The exercises between the
first and second line are for Section 1, the exercises between the second and third
lines are for Section 2, and so forth.
You should work every exercise (your instructor may choose to let you skip
some of the advanced exercises marked with *). We have carefully selected them,
and each is important for your ability to understand subsequent material. Many of
the examples and results proved in the exercises are used again later in the text.
Exercises marked with &. are especially important and are likely to be used later
in this book and beyond. Those marked with tare harder than average, but should
still be done.
Although they are gathered together at the end of the chapter, we strongly
recommend you do the exercises for each section as soon as you have completed
Exercises 549

the section, rather than saving them until you have finished the entire chapter. The
exercises for each section are separated from those for other sections by a
horizontal line.

13.1. Let A E Mn(lF) have components A = [aiJ]· For each i, let Ri = Lj#i laiJ I
and let D(aii , Ri) be the closed disk centered at aii with radius Ri· Prove
that every eigenvalue A of A lies within at least one of the disks D(aii, Ri) ·
Hint : Since the eigenvalue equation can be written as L:7=l aijXj = .\xi , split
up the sum so that L #i aijXj = (.A - aii)Xi·
13.2. Assume A E Mn(lF) is strictly diagonally dominant. Prove A is nonsingular.
Hint: Use the previous exercise.
13.3. A matrix CE Mn(lF) is called an approximate inverse of A if r(I - CA) < 1.
Show that if C is an approximate inverse of A , then both A and C are
invertible, and for any x 0 E lFn and b E lFn, the map

Xn+l = Xn + C(b - Axn)


converges to x = A- 1 b.
13.4. Assume that 0 < a < b. If c: 2: 0, prove that

~<a+c:<l.
b - b+c:

Use this to prove that ll Bos ll oo :S llBJac ll oo·


13.5. Given a matrix of the form

where D 1 and D 2 are diagonal and invertible.


(i) Prove that

cn-
1
and Bos= [~ -D - B ]
D-
2
1 1
1 B ·

(ii) Prove that if A E cr(BJac), then .\ 2 E cr(Bos).


(iii) Conclude that r(Bos) = r(BJac) 2 .

13.6. Prove Proposition 13.2.2. Hint: The Cayley-Hamilton theorem guaran-


tees the existence of an annihilating polynomial. Use the well-ordering principle
the conclude that one of least order exists. By normalizing the polynomial,
you can make it monic. If there are two such minimal polynomials, then take
their difference.
13.7. Prove Proposition 13.2.5. Hint: Use the spectral decomposition of A and the
two lemmata preceding the statement of the proposition.
550 Chapter 13. Iterative Methods

13.8. As an alternative proof that the matrix in (13.11) is nonsingular, show that
all of its eigenvalues are one. Hint: Use the spectral mapping theorem (see
Theorem 12.7.1) and the fact that all of the eigenvalues of a nilpotent matrix
are zero (see Exercise 4 .1) .
13.9. Assuming A E Mn(lF) is invertible, prove (13.9).

13.10. Use GMRES to solve the linear system Ax= b, where

13.11 . Let
1 0 C1]
A= 0 1 C2 .
[0 0 1
Prove for any x 0 and b that GMRES converges to the exact solution after
two steps .
13.12. Prove that if rk is the kth residual in the GMRES algorithm for the system
Ax = b, then there exists a polynomial q E lF[z] of degree no more than k
such that rk = q(A)b.
13.13. Prove that if A = UHu - 1 , where H is square and properly Hessenberg
(meaning that the subdiagonal has no zero entries) , and where U E Mn(F )
has columns u 1, ... , Un, then span u1, . . . , Uj = Jtj(A, u1) for any j E
{l, . . .,n}.

13.14. Find the eigenvalues of the matrix

by using QR iteration.
[~ Hl
13.15. What happens when QR iteration is applied to an orthonormal matrix? How
does this relate to Theorem 13.4.4?
13.16. Prove Lemma 13.4.2.
13.17. One way to speed convergence of QR iteration is to use shifting. Instead
of factoring Ak = QkRk and then setting Ak+l = RkQk, factor Ak - O"kf,
where the shift O"k is close to an eigenvalue of A. Show that for any O" E lF,
if QR = A = O"f is the QR decomposition of A - O"f, then RQ + O"f is
orthonormally similar to A.

13.18. Put the following matrix into Hessenberg form :

2
1
2
4
ol1
[3 1 0
Exercises 551

13.19. Prove that the product BA of a Hessenberg matrix A E Mn(IF) and an upper
triangular matrix B E Mn (IF) is a Hessenberg matrix.
13.20. Prove Theorem 13.5.3.
13.21. Prove that Givens rotations G(i, j, e) satisfy the identity G(i, j, O)G(i, j , ¢) =
G(i,j,e + ¢).
13.22. Prove that Hk+l as defined in Theorem 13.5.7 is Hessenberg.
13.23. In the proof of Theorem 13.5.7, show that h~~i,~) = 0 for i = 1, 2, ... , k - l.

Notes
Our treatment of Krylov methods is based in part on [TB97] and [IM98].
Exercise 13.15 is from [TB97].
For details on the stability and computational complexity of the methods
described in this chapter, see [GVL13, CF13].
Spectra and
Pseudospectra

I am not strange. I am just not normal.


-Salvador Dali

Recall from Section 7.5.4 that the eigenvalue problem can be ill conditioned; that
is, a very small change in a matrix can produce a relatively large change in its
eigenvalues. This happens when the eigenspaces are nearly parallel. By contrast
we saw in Corollary 7.5.15 that the eigenvalue problem for normal matrices is well
conditioned. In other words, when matrices have orthogonal or nearly orthogonal
eigenspaces, the eigenvalue problem is well conditioned, and when the eigenspaces
are far from orthogonal, that is, very nearly parallel, the eigenvalue problem is ill
conditioned.
When a problem is ill conditioned, two nearly indistinguishable inputs can
have very different outputs, thus calling into question the reliability of the solution
in the presence of any kind of uncertainty, including that arising from finite-precision
arithmetic. For example, if two nearly indistinguishable matrices have very different
eigenvalues and therefore have wildly different behaviors, then the results of most
computations involving those matrices probably cannot be trusted.
An important example of this occurs with the iterative methods described in
Chapter 13 for solving linear systems. In Section 13.l we described three iterative
methods for solving linear systems by taking powers of matrices. If the eigenvalues
of the matrices are contained in the open unit disk, then the method converges
(see Theorem 13.1.1). However, even when all the eigenvalues are contained in
the open unit disk , a problem can arise if one or more of the eigenvalues are very
nearly unit length and the corresponding eigenspace is nearly parallel to another
of its eigenspaces. In this case, it is possible that these matrices will be essen-
tially indistinguishable, numerically, from those that have eigenvalues larger than
one. Because of ill conditioning, these iterative methods can fail to converge in
practice, even when they satisfy the necessary and sufficient conditions for conver-
gence. Moreover, even when the iterative methods do converge, ill conditioning can
drastically slow their convergence.
This chapter is about pseudospectral theory, which gives tools for analyzing
how conditioning impacts results and methods that depend on the eigenvalues of

553
554 Chapter 14. Spectra and Pseudospectra

matrices. In the first section we define the pseudospectrum and provide a few
equivalent definitions. One of these definitions describes the pseudospectrum in
terms of the spectra of nearby operators, which gives us a framework for connecting
convergence to conditioning. Another equivalent definition uses the resolvent.
Recall that the poles of the resolvent correspond to the eigenvalues. The pseu-
dospectrum corresponds to the regions of the complex plane where the norm of
the resolvent is large but not necessarily infinite, indicating eigenvalues are nearby.
These regions give a lot of information about the behavior of these matrices in
computations.
In the second section, we discuss the transient behavior of matrix powers.
Consider the sequence (llAkll)kEN, generated by a matrix A E Mn(C). The sequence
converges to zero as k --+ oo if and only if the spectral radius r(A) of A is less than
one, but before it goes to zero, it may actually grow first. And it need not go to
zero very quickly. Since convergence of many iterative methods depends upon these
matrix powers approaching zero, it becomes important to understand not just what
happens for large values of k, but also what happens for small and intermediate
values of k. When does the sequence converge monotonically? And when does it
grow first before decaying?
In the second section we address these questions and provide upper and lower
bounds for the sequence of powers via the Kreiss matrix theorem. We also discuss
preconditioning, which is a way of transforming the matrices used in iterative meth-
ods into new matrices that have better behaved sequences with faster convergence.
These are useful when dealing with a poorly conditioned matrix with eigenvalues
close to the unit circle. The pseudospectrum gives insights into how to choose a
good preconditioner.
In the final section we prove the Kreiss matrix theorem by appealing to a key
lemma by Spijker. The proof of Spijker's lemma is the longest part of this proof.

14.1 The Pseudospectrurn


In this section, we define the pseudospectrum and describe some of its properties as
well as several alternative definitions . We consider the role that conditioning plays
in describing the pseudospectrum and then present a few results that provide upper
and lower bounds on the pseudospectrum.
Remark 14.1.1. The definition of the pseudospectrum and most of the results of
this chapter extend very naturally to infinite-dimensional spaces and general oper-
ator norms, but in order to avoid some technical complications that would take us
too far afield, we limit ourselves to en and to the standard 2-norm on matrices.

14.1.1 Definition and Basic Properties


The pseudospectrum of a matrix A E Mn(C) is just the locus of all eigenvalues of
matrices sufficiently near to A.

Definition 14.1.2. Let A E Mn(C) and c > 0. The c-pseudospectrum of A is


the set
O"e:(A) = {z EC I z E <J(A + E) for some EE Mn(C) with llEll < c}. (14.1)
The elements O"e:(A) are called the c-pseudoeigenvalues of A.
14.1. The Pseudospectrum 555

In some sense the pseudospectra represent the possible eigenvalues of the ma-
trix when you throw in a little dirt. In mathematics we call these perturbations.
Ideally when we solve problems, we want the solutions to be robust to small pertur-
bations. Indeed, if your answer depends on infinite precision and no error, then it
probably has no connection to real-world phenomena and is therefore of little use.

Remark 14.1.3. We always have O'(A) C O'c:(A) for all c: > 0, since E = 0 trivially
satisfies llEll < c:.

Remark 14.1.4. The following two properties of pseudospectra follow immediately


from the definition:

By combining these two facts, we can identify O'(A) as the c: = 0 pseudospectrum.

Example 14.1.5. Recall the nonnormal, ill-conditioned matrix A of


Example 7.5.16,

A= [o.~Ol 1000]
1 '
which has eigenvalues {O, 2}. In Example 7.5 .16 we saw that if A was perturbed
by a matrix

E = [-o~OOl ~] '
the eigenvalues of A+ E became a double eigenvalue at l. Figure 14.l(a)
depicts the various c:-pseudospectra of A for c: = 10- 2 , 10- 2 · 5 , and 10- 3 . That
is, the figure depicts eigenvalues of matrices of the form A+ E, where E is a
matrix with 2-norm c:.

14.1.2 Equivalent Definitions


There are a few equivalent alternative definitions of the pseudospectrum.

Theorem 14.1.6. Given A E Mn(C) and c: > 0, the following sets are equal:

(i) The c:-pseudospectrum O'c: (A) of A .

(ii) The set of z E <C such that

ll(zI - A)vll < c: (14.2)

for some v E lFn with JJvJJ = l. The vectors v are the c:-pseudoeigenvectors
of A corresponding to the c:-pseudoeigenvalues z E <C.
556 Chapter 14. Spectra and Pseudospectra

31 3i

2i 21

-i -1

-2i -2i

-3i'-'--~-~-~~-~~-~-~
-3 -2 -1 3 5 -3 -2 -1

(a) (b)

Figure 14.1. Two depictions of pseudospectra for the matrix A of


Examples 7. 5.1 6 and 14 .1. 5. In panel (a), the two eigenvalues of A are plotted
in red, along with plots of eigenvalues of randomly chosen matrices of the form
A+ E with [[ E [[ 2 = 10- 3 (cyan), 10- 2 · 5 (blue), and 10-2 (green). Panel (b) shows
various contour lines for the resolvent . By Theorem 14 .1. 6(iii), these contour lines
represent the boundaries of various c-pseudospectra.

(iii) The set of z E e such that

(14.3)
where R(A, z) is the resolvent of A evaluated at z.

Proof. We establish equivalence by showing that each set is contained in the other.
(i)c(ii) . Suppose (A+E)v = zv for some [[E[[ < c and some unit vector v. In this
case, [[(zl -A)v[[ =[[Ev[[::; [[E[[[[ v[[ < c.
(ii)c(iii). Assume (zl - A)v = su for some unit vectors v, u E en and 0 < s < c.
In this case, (zl - A)- 1 u = s- 1 v, which implies I (zJ - A)- 1 [[ 2 s - 1 > c 1 .
(iii)C(i) . If [[(zl-A)- 1 [[ > c 1 , then by definition of the norm of an operator, there
exists a unit vector u E en such that II (zl -A)- 1 u [[ > c 1 . Thus, there exists
a unit vector v E en and a positives< c such that (zl - A)- 1 u = s- 1v . In
this case, we have that su = (zl -A)v, which implies that (A+suvH)v = zv.
Setting E = suvH, it suffices to show that [[E[[ = s < c. But Exercise 4.3l(ii)
implies that
[[E[[ 2 = s 2 [[(vuH)(uvH)[[::; s 2 [[vvH[[ ::; s 2 ,
where the last inequality follows from the fact that t he largest singular value of
the matrix vvH is l. Thus, z is an eigenvalue of A + E for some
[[ E [[ <c. D
Remark 14.1. 7. Each of these equivalent definitions of the pseudospectrum has its
advantages. The definition (i) is well motivated, but perhaps more difficult to visual-
ize than t he more traditional definition (iii) in terms of resolvents, which shows that
a"e:(A) is the open subset of e bounded by the c- 1 level set of the normed resolvent.
14.1. The Pseudospectrum 557

Remark 14.1.8. Recall that the spectrum of an operator A is the locus where
the resolvent ll (zl - A) - 1 11 is infinite. The fact that definitions (i) and (iii) are
equivalent means that the locus where II (zl - A)- 1 I is large, but not necessarily
infinite, gives information about how the spectrum will change when the operator
is slightly perturbed.

Example 14.1.9. Let

A=
1+i
-i
[ 0.3i
0~5 ~
0.5 0.7

Figure 14.2(a) shows a plot of the norm of the resolvent of A as a function
of z E C. The poles occur at the eigenvalues of A, and the points in the
plane where the plot is greater than c: - 1 form the c-pseudospectrum of A.
Figure 14.2(b) shows the level curves of llR(A, z)ll for the matrix A, cor-
responding to the points where llR(A,z)ll = c 1 for various choices of c.
For a given choice of c, the interior of the region bounded by the curve
llR(A, z)ll = C 1 is the c-pseudospectrum.

We can also give another equivalent form of the c-pseudospectrum, but unlike
the previous three definitions, this does not generalize to infinite-dimensional spaces
or to norms other than the 2-norm.

2i

3i/2

1.5
E
g
E 1
g: i/2
~
"'
.:::..g 0.5

"'
0

- i/2

-i

--0.5 0.5 1.5

(a) (b)

Figure 14.2. A plot of llR(A,z)ll for the matrix A in Example 14.1.9 as


z varies over the complex plane. These are represented as (a) a three-dimensional
plot and (b) a topographical map.
558 Chapter 14. Spectra and Pseudospectra

Proposition 14.1.10. If A E Mn(C), the set uc:(A) is equal to the set of z EC


such that
Smin(zI - A)< c, (14.4)
where Smin(zI - A) is the smallest singular value of (zI - A).

Proof. The equivalence of (14.4) and (14.2) is Exercise 14.4. D

Example 14.1.11. Consider the matrix

0 0 0 0 0 0 0 0 0 3628800
1 0 0 0 0 0 0 0 0 -10628640
0 1 0 0 0 0 0 0 0 12753576
0 0 1 0 0 0 0 0 0 -8409500
0 0 0 1 0 0 0 0 0 3416930
A= 0 0 0 0 1 0 0 0 0 -902055
E M10(1R).
0 0 0 0 0 1 0 0 0 157773
0 0 0 0 0 0 1 0 0 -18150
0 0 0 0 0 0 0 1 0 1320
0 0 0 0 0 0 0 0 1 -55

It is not hard to verify that the characteristic polynomial of A is the degree-


ten Wilkinson polynomial p(x) = f1~: 1 (x - k), and hence the spectrum is
{1,2,3,4,5,6, 7,8,9,10}.
We can visualize the pseudospectra of A in two different ways. First,
using the resolvent definition (iii) of the pseudospectrum, we can plot the
boundaries of uc:(A) for c = 10- 10 , 10- 9 , 10- 8 , 10- 1 . These consist of tiny
circles around the eigenvalues 1 and 2, and then some larger oblong curves
around the larger eigenvalues, as depicted in Figure 14.3(a) . Alternatively,
using the perturbation definition (i) we can perturb A by various random
matrices E with norm no more than 10-s and plot the eigenvalues of the
perturbed matrices in the complex plane. This is seen in Figure 14.3(b ).
Again, a perturbation of the matrix has almost no effect on the eigenvalues
near 1 and 2, but has a large effect on the eigenvalues near 8, 9, and 10.

14.1.3 Pseudospectra and Conditioning


Since the c-pseudospectrum of A is the set of all eigenvalues of all c perturbations
of A, it is very closely tied to the condition of the eigenvalue problem for A. Any
bounds on the pseudospectrum correspond to bounds on how much the eigenvalues
can move when the matrix is perturbed. In this section we prove the Bauer-Fike
theorem (Corollary 14.1.15), which gives a useful bound on the pseudospectrum of a
diagonalizable matrix and hence on the condition number of the eigenvalue problem
for such matrices.

Proposition 14.1.12. The following inequalities hold for all A E Mn(C)


and z EC:
14.1. The Pseudospectrum 559

-si~_~2~~~~~-~-1~0~12~~1•
-2 10 12 14

(a) (b)

Figure 14.3. In (a), the eigenvalues {black) of the matrix A from


Example 14.1.11, and the boundaries of cre(A). The boundary for E = 10- 7 is
the outer (yellow) curve, along with some invisibly tiny circles around the eigen-
values 1 and 2. The boundary for c = 10- 10 is the inner (green) oval, along with
invisibly tiny circles around l, 2, 3, and 4. In (b), the eigenvalues {red) of A along
with the boundary (yellow) of cre(A) for c = 10- s and the eigenvalues {black) of
various perturbed matrices of the form A + E, where E is chosen randomly with
ll Ell2 < 10-s ·

(i)
1 1
ll( zI - A)- 11 > - -- -
- dist(z, er( A))
(14.5)

(ii) If A is diagonalizable and has the form A = V Dv - 1


, where D is diagonal,
then
- 1 < ~(V)
ll(zI -A) 11 - dist(z,cr(A))'
(14.6)

where ~(V) = llVllllV- 1 11 is the condition number of V {see Section 7.5.3) .

(iii) If A is normal, then

-1 1
ll(zI-A) II = dist(z,cr(A)) (14. 7)

Proof. If $z \in \sigma(A)$, then both $\|(zI - A)^{-1}\| = \infty$ and $\operatorname{dist}(z, \sigma(A)) = 0$, and all the relations hold trivially. Thus, we assume below that $z \notin \sigma(A)$.

(i) If $Av = \lambda v$ for some unit vector $v \in \mathbb{C}^n$, then $(zI - A)v = (z - \lambda)v$. Thus, $(zI - A)^{-1}v = (z - \lambda)^{-1}v$, which implies that $\|(zI - A)^{-1}\| \ge |z - \lambda|^{-1}$. Since this holds for all $\lambda \in \sigma(A)$, we have that
$$\|(zI - A)^{-1}\| \ge \max_{\lambda\in\sigma(A)}\frac{1}{\operatorname{dist}(z, \lambda)} = \frac{1}{\operatorname{dist}(z, \sigma(A))}.$$

(ii) Since $\|(zI - A)^{-1}\| = \|V(zI - D)^{-1}V^{-1}\| \le \|V\|\,\|V^{-1}\|\,\|(zI - D)^{-1}\|$, it follows that
$$\|(zI - A)^{-1}\| \le \kappa(V)\|(zI - D)^{-1}\| = \kappa(V)\max_{\lambda\in\sigma(A)}\frac{1}{|z - \lambda|} = \frac{\kappa(V)}{\operatorname{dist}(z, \sigma(A))}.$$

(iii) If $A$ is normal, then it is orthonormally diagonalizable as $A = VDV^H$. But $V^HV = I$ implies $\|V\| = \|V^H\| = 1$, and thus the condition number of $V$ satisfies $\kappa(V) = 1$. The result now follows from (14.5) and (14.6). □

Remark 14.1.13. The condition number $\kappa(V)$ depends on the choice of the diagonalizing transformation $V$, which is not unique. Indeed, if we rescale the column vectors of $V$ (that is, the eigenvectors) by replacing $V$ with $V\Lambda$, where $\Lambda$ is a nonsingular diagonal matrix, then $V\Lambda$ still diagonalizes $A$, that is, $(V\Lambda)D(V\Lambda)^{-1} = V\Lambda D\Lambda^{-1}V^{-1} = VDV^{-1} = A$, yet the condition number $\kappa(V\Lambda)$ is different and can be drastically so in some cases. In fact, by making the eigenvectors in $V$ sufficiently large or small, we can make the condition number as large as we want.

Remark 14.1.14. The number
$$\kappa_\sigma(A) = \inf\{\kappa(V) : V^{-1}AV \text{ is diagonal}\}$$
is sometimes called the spectral condition number of $A$. We may substitute $\kappa_\sigma$ for $\kappa(V)$ in (14.6). We do not address this further here, but some references about bounds on the spectral condition number may be found in the notes at the end of this chapter.

Corollary 14.1.15 (Bauer–Fike Theorem). If $A \in M_n(\mathbb{C})$ satisfies $A = VDV^{-1}$ with $D$ diagonal, then for any $\varepsilon > 0$ the $\varepsilon$-pseudospectrum satisfies the inequality of sets
$$\sigma(A) + B(0, \varepsilon) \subset \sigma_\varepsilon(A) \subset \sigma(A) + B(0, \varepsilon\kappa), \qquad (14.8)$$
where $\kappa = \kappa(V)$ is the condition number of $V$. In the special case that $A$ is normal, we have that
$$\sigma_\varepsilon(A) = \sigma(A) + B(0, \varepsilon) = \{z \in \mathbb{C} : |z - \lambda| < \varepsilon \text{ for at least one } \lambda \in \sigma(A)\}.$$
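A quick numerical sanity check of the Bauer–Fike bound: take a diagonalizable but nonnormal matrix, compute $\kappa(V)$ from an eigendecomposition, and confirm that every eigenvalue of a perturbed matrix $A + E$ with $\|E\|_2 \le \varepsilon$ lies within $\varepsilon\kappa(V)$ of the spectrum. This sketch (Python with NumPy; the test matrix is our own choice) only illustrates the inequality; it is not a proof.

```python
import numpy as np

rng = np.random.default_rng(1)

A = np.array([[1.0, 100.0],
              [0.0,   2.0]])            # diagonalizable, but far from normal
evals, V = np.linalg.eig(A)
kappa = np.linalg.cond(V, 2)            # kappa(V) = ||V|| ||V^{-1}||

eps = 1e-6
for _ in range(1000):
    E = rng.standard_normal(A.shape)
    E *= eps / np.linalg.norm(E, 2)     # ||E||_2 <= eps
    for mu in np.linalg.eigvals(A + E):
        dist = np.abs(mu - evals).min() # distance from mu to sigma(A)
        assert dist <= eps * kappa + 1e-12
```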

Remark 14.1.16. The Bauer–Fike theorem shows that the (absolute) condition number of the eigenvalue problem for a diagonalizable $A$ is bounded by the condition number $\kappa(V)$ for every $V$ diagonalizing $A$, and therefore it is bounded by the spectral condition number $\kappa_\sigma(A)$.

Remark 14.1.17. The Bauer–Fike theorem says that the pseudospectrum is not especially useful when $A$ is normal because it can be constructed entirely from information about the spectrum alone.

14.2 Asymptotic and Transient Behavior


In this section, we study the long- and short-term behavior of the norm of the powers
of a matrix A E Mn(lF), that is, llAkll fork EN. As discussed in the introduction
of this chapter, this class of functions is widely used in applications. For exam-
ple, many iterative methods of linear algebra, including all of those described in
Section 13.1 , depend on the convergence of this sequence of powers.
The long-term (asymptotic) behavior of llAkll is characterized by the spectrum
of A. In particular, if the spectral radius r(A) is less than 1, then ll Ak ll --+ 0 as
k--+ O; if r(A) > 1, then llAkll --+ oo as k--+ oo; and if r(A) = 1, then llAkll may
either be bounded or grow to infinity depending on the circumstances.
Even though the asymptotic behavior of llAk II depends exclusively on the
eigenvalues of the underlying matrix, for intermediate values of k, the quantities
can grow quite large before converging to zero. We call this intermediate stage the
transient behavior.
In this section we show that the transient behavior depends on the pseudospec-
trum of A E Mn (JF) instead of the spectrum. Moreover, the Kreiss matrix theorem
gives both upper and lower bounds on the transient behavior. We state the Kreiss
matrix theorem in this section and prove it in the next.

14.2.1 Asymptotic Behavior


The asymptotic behavior of the sequence llAk II can be summarized in the following
theorem.

Theorem 14.2.1. For any A E Mn(lF),


(i) if r(A) < 1, then llAkll --+ 0 ask--+ oo;
(ii) if r(A ) > 1, then ll Ak ll --+ oo ask--+ oo;
(iii) if r(A) = 1, then llAkll is bounded if and only if all the eigenvalues .A satisfying
I.A l = 1 are semisimple, that is, they have no eigennilpotents.

Proof.
(i) This follows from Exercise 12.15.
(ii) If $r(A) > 1$, then there exists $\lambda \in \sigma(A)$ such that $|\lambda| > 1$. Since $|\lambda|^k \le \|A^k\|$, we have that $\|A^k\| \to \infty$ as $k \to \infty$.
(iii) See Exercises 14.9 and 14.10. □
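The distinction between asymptotic and transient behavior is easy to see numerically. In the sketch below (Python with NumPy; the matrix is our own toy example), the spectral radius is $0.99 < 1$, so Theorem 14.2.1(i) guarantees $\|A^k\| \to 0$, yet the norms climb into the hundreds before the decay sets in.

```python
import numpy as np

# Spectral radius 0.99 < 1, but the large off-diagonal entry produces a
# long transient: ||A^k|| grows substantially before it decays to zero.
A = np.array([[0.99, 10.0],
              [0.0,  0.99]])

norms = []
P = np.eye(2)
for k in range(2000):
    P = P @ A
    norms.append(np.linalg.norm(P, 2))

print(max(norms))     # large transient peak (hundreds)
print(norms[-1])      # tiny value, consistent with ||A^k|| -> 0
```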

14.2.2 Transient Behavior


The obvious analogue of the spectral radius for pseudospectra is the pseudospectral
radius. The pseudospectral radius and a related quantity called the Kreiss matrix
constant are important for understanding the transient behavior of ll Ak 11 -

Definition 14.2.2. For a matrix A E Mn(C) and any c: > 0, the c:-pseudospectral
radius re-(A) of A is given by
562 Chapter 14. Spectra and Pseudospectra

rc(A) = sup{lzl : z E o"c(A)}.

The Kreiss constant is the quantity
$$K(A) = \sup_{\varepsilon>0}\frac{r_\varepsilon(A) - 1}{\varepsilon}. \qquad (14.9)$$
The Kreiss constant also measures how fast the norm of the resolvent blows up near the open unit disk, as shown in the following equivalent, alternative definition of the Kreiss matrix constant.

Proposition 14.2.3. If $A \in M_n(\mathbb{F})$, then
$$K(A) = \sup_{|z|>1}(|z| - 1)\|(zI - A)^{-1}\|. \qquad (14.10)$$

Proof. Fix $\varepsilon > 0$. Choose $z_0$ so that $|z_0| = r_\varepsilon(A)$. By continuity of the resolvent and the norm, we have that $\|(z_0I - A)^{-1}\| \ge \varepsilon^{-1}$. Thus,
$$(|z_0| - 1)\|(z_0I - A)^{-1}\| \ge \frac{r_\varepsilon(A) - 1}{\varepsilon}.$$
Taking the supremum over $\mathbb{C}$ gives
$$\sup_{z\in\mathbb{C}}(|z| - 1)\|(zI - A)^{-1}\| \ge \frac{r_\varepsilon(A) - 1}{\varepsilon}.$$
Since the supremum will never occur when $|z| \le 1$, we can restrict the domain and write
$$\sup_{|z|>1}(|z| - 1)\|(zI - A)^{-1}\| \ge \frac{r_\varepsilon(A) - 1}{\varepsilon}.$$
Since this holds for all $\varepsilon > 0$, we have
$$\sup_{|z|>1}(|z| - 1)\|(zI - A)^{-1}\| \ge \sup_{\varepsilon>0}\frac{r_\varepsilon(A) - 1}{\varepsilon} = K(A).$$
To establish the other direction, fix $|z| > 1$. Define $\varepsilon_0^{-1} = \|(zI - A)^{-1}\|$. Thus, $z$ is on the boundary of $\sigma_{\varepsilon_0}(A)$. Hence, $r_{\varepsilon_0}(A) \ge |z|$. This yields
$$\frac{r_{\varepsilon_0}(A) - 1}{\varepsilon_0} \ge (|z| - 1)\|(zI - A)^{-1}\|.$$
Taking the supremum over all $\varepsilon > 0$ gives
$$K(A) = \sup_{\varepsilon>0}\frac{r_\varepsilon(A) - 1}{\varepsilon} \ge (|z| - 1)\|(zI - A)^{-1}\|.$$
Since this holds for all $|z| > 1$, we have
$$K(A) \ge \sup_{|z|>1}(|z| - 1)\|(zI - A)^{-1}\|. \qquad \Box$$
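Formula (14.10) suggests a crude way to estimate the Kreiss constant numerically: sample $(|z| - 1)\|(zI - A)^{-1}\|$ over points outside the unit circle and take the maximum. The sketch below (Python with NumPy; parameter choices are arbitrary) gives only a lower estimate, since an accurate value of $K(A)$ requires a genuine optimization over the region $|z| > 1$.

```python
import numpy as np

def kreiss_constant_estimate(A, radii=None, num_angles=360):
    """Lower estimate of K(A) by sampling (14.10) on circles |z| = r > 1."""
    n = A.shape[0]
    if radii is None:
        radii = 1.0 + np.logspace(-4, 1, 60)   # radii slightly above 1 up to 11
    thetas = np.linspace(0.0, 2 * np.pi, num_angles, endpoint=False)
    best = 0.0
    for r in radii:
        for t in thetas:
            z = r * np.exp(1j * t)
            resolvent_norm = np.linalg.norm(np.linalg.inv(z * np.eye(n) - A), 2)
            best = max(best, (abs(z) - 1.0) * resolvent_norm)
    return best
```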

Remark 14.2.4. The Kreiss constant is only useful when $r(A) \le 1$ because $K(A) = \infty$ whenever $r(A) > 1$, as can be seen from (14.10).

Proposition 14.2.5. If $A \in M_n(\mathbb{C})$ is normal and $r(A) \le 1$, then $K(A) = 1$.

Proof. Since $A$ is normal, Exercise 14.6 shows that $r_\varepsilon(A) = r(A) + \varepsilon$. We have
$$K(A) = \sup_{\varepsilon>0}\frac{r_\varepsilon(A) - 1}{\varepsilon} = \sup_{\varepsilon>0}\frac{r(A) + \varepsilon - 1}{\varepsilon} \le 1.$$
Moreover,
$$\lim_{\varepsilon\to\infty}\frac{r(A) + \varepsilon - 1}{\varepsilon} = 1,$$
which implies that $K(A) = 1$. □

We are now prepared to give a lower bound on the transient behavior of $\|A^k\|$.

Lemma 14.2.6. If $A \in M_n(\mathbb{F})$, then
$$K(A) \le \sup_k\|A^k\|. \qquad (14.11)$$

Proof. Let $M = \sup_k\|A^k\|$. If $M = \infty$, the result follows trivially. We now assume $M$ is finite, and thus $r(A) \le 1$. Choose $z \in \mathbb{C}$ such that $|z| > 1$. By Theorem 12.3.8, we can write
$$\|(zI - A)^{-1}\| = \left\|\sum_{k=0}^{\infty} z^{-k-1}A^k\right\| \le \sum_{k=0}^{\infty} |z|^{-k-1}\|A^k\| \le \frac{M}{|z| - 1}.$$
Thus,
$$(|z| - 1)\|(zI - A)^{-1}\| \le M.$$
Since $z$ was arbitrary, we may take the supremum of the left side to get the desired result. □

We are now prepared to state the Kreiss matrix theorem. The proof is given
in Section 14.3.

Theorem 14.2.7 (Kreiss Matrix Theorem). If $A \in M_n(\mathbb{F})$, then
$$K(A) \le \sup_{k\in\mathbb{N}}\|A^k\| \le enK(A). \qquad (14.12)$$

Remark 14.2.8. The Kreiss matrix theorem gives both upper and lower bounds on the transient behavior of $\|A^k\|$. When the Kreiss constant is greater than one, it means that the sequence $(\|A^k\|)_{k=0}^{\infty}$ grows before decaying back to zero. If the Kreiss constant is large, then the transient phase is nontrivial and it will take many iterations before $A^k$ converges to zero. This means that iterative methods may take a while to converge. By contrast, if the Kreiss constant is close to one, then the transient phase should be relatively brief by comparison and convergence should be relatively fast.

Remark 14.2.9. The original statement of the Kreiss theorem did not actually look much like the theorem above. The original version of the right-hand inequality in (14.12), proven by Kreiss in 1962, was $\sup_{k\in\mathbb{N}}\|A^k\| \le CK(A)$, where $C \sim c^{n^n}$. Over time the bound has been sharpened through a series of improvements.
The current bound is sharp in the following sense. Although there may not be matrices for which equality is actually attained, the inequality is the best possible in the sense that if $\sup_{k\in\mathbb{N}}\|A^k\| \le Cn^{\alpha}K(A)$ for all $A$, then $\alpha$ can be no smaller than one. Similarly, the factor $e$ is the best possible, since if $\sup_{k\in\mathbb{N}}\|A^k\| \le Cn^{\alpha}K(A)$ and $\alpha = 1$, then $C$ can be no smaller than $e$. For a good historical survey and a more complete treatment of the Kreiss matrix theorem, see [TE05].

Application 14.2.10 (Markov Chains and the Cutoff Phenomenon).


A finite-state stationary Markov chain can be described as an iterative system $x_{k+1} = Px_k$, where the matrix $P \in M_n(\mathbb{R})$ is nonnegative with each column summing to one and the initial vector $x_0$ is also nonnegative with its components summing to one.
Assuming $P$ is irreducible (see Definition 12.8.9), the power method (Theorem 12.7.8) implies the sequence $x_k = P^kx_0$ converges to the positive eigenvector $x_\infty$ corresponding to the unique eigenvalue at $\lambda = 1$. However, this sequence $(x_k)_{k=0}^{\infty}$ can experience transient behavior before converging.
In fact, some Markov chains have a certain interesting behavior in which, after an initial period with very little progress toward the steady state, convergence to the steady state occurs quite suddenly. This is known as the cutoff phenomenon.
Define the decay matrix $A = P - P_\infty$, where $P_\infty$ is the matrix where every column is the steady state vector $x_\infty$. By induction one can show that $A^k = P^k - P_\infty$ for each $k \ge 1$. Since $P^k$ converges to $P_\infty$, the powers $A^k$ converge to zero regardless of the norm. But the transient behavior of $A$ is determined by the pseudospectra of $A$. Careful application of the Kreiss matrix theorem gives a bound for the cutoff phenomenon. For more information on these bounds and how they apply to a variety of problems, including random walks on $n$-dimensional cubes and card shuffling, see [JT98].
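To see the decay matrix in action, one can build a random column-stochastic $P$, form $P_\infty$ from its steady state, and watch $\|A^k\|$ with $A = P - P_\infty$. The sketch below (Python with NumPy) is only an illustration of the construction; a dense random chain like this mixes almost immediately and will not show a dramatic cutoff, which requires more structured chains such as random walks on hypercubes.

```python
import numpy as np

rng = np.random.default_rng(2)

n = 50
P = rng.random((n, n))
P /= P.sum(axis=0, keepdims=True)       # columns sum to one

# Steady state: eigenvector for the eigenvalue 1, scaled to sum to one.
evals, V = np.linalg.eig(P)
x_inf = np.real(V[:, np.argmax(np.real(evals))])
x_inf /= x_inf.sum()
P_inf = np.tile(x_inf.reshape(-1, 1), (1, n))   # every column is x_inf

A = P - P_inf                                   # the decay matrix
for k in (1, 2, 5, 10, 20):
    print(k, np.linalg.norm(np.linalg.matrix_power(A, k), 2))
```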

Application 14.2.11 (Numerical Partial Differential Equations). An important technique for numerically solving time-dependent partial differential equations is to separate the treatment of space and time. The treatment of the spatial part of the equation can be handled using finite difference, finite element, or spectral methods, which results in a system of coupled ordinary differential equations. We then discretize the system in time, using time-stepping formulas. This is called the method of lines. The problem with the method of lines is that some choices of discretization of space and time are unstable numerically.
The pseudospectrum plays an important role in determining when a given discretization is stable. In fact, there are necessary and sufficient conditions for numerical stability that are expressed in terms of the pseudospectrum. For example, if $L_{\Delta t}$ is the differential operator for the spatial component, and $G$ is a function that characterizes the time-stepping scheme used, and $A_{\Delta t} = G(\Delta t L_{\Delta t})$, then we have
$$v^{n+1} = G(\Delta t L_{\Delta t})v^n,$$
where $v^n$ is the solution at time $n$. Determining whether the discretization is stable boils down to showing that $\|A_{\Delta t}^n\| \le C$ for all $n \in \mathbb{N}$ and $\Delta t$ with $0 \le n\Delta t \le T$, for some fixed function $C$ and all sufficiently small $\Delta t$. The Kreiss matrix theorem gives bounds on $\|A_{\Delta t}^n\|$, from which we deduce the stability properties of the discretization. For further details refer to [TE05, RT90, RT92].

14.2.3 Preconditioning
In Section 13.1 we describe three iterative methods for solving linear systems; that is, $Ax = b$, where $A \in M_n(\mathbb{F})$ and $b \in \mathbb{F}^n$ are given. The iterative methods are of the form $x_{k+1} = Bx_k + c$ for some $B \in M_n(\mathbb{F})$ and $c \in \mathbb{F}^n$. A necessary and sufficient condition for convergence is that the eigenvalues of $B$ be contained in the open unit disk (see Theorem 13.1.1). If for a given $x_0$ the sequence converges to some $x_\infty$, the limit satisfies $x_\infty = Bx_\infty + c$, which means that the error terms $e_k = x_k - x_\infty$ satisfy
$$e_{k+1} = Be_k,$$
or, equivalently, $e_k = B^k(x_0 - x_\infty)$. In other words, $\|e_k\| \le \|B^k\|\,\|x_0 - x_\infty\|$. However, if $\|B^k\|$ exhibits transient behavior, the convergence may take a long time to be realized, thus rendering the iterative method of little or no use. Therefore, for this method to be useful, not just the eigenvalues of $B$ must be contained in the unit disk, but also the pseudospectrum must be sufficiently well behaved.
But even if the pseudospectrum of $A$ is not well behaved, not all hope is lost. An important technique is to precondition the system $Ax = b$ by multiplying both sides by a nonsingular matrix $M^{-1}$ so that the resulting linear system $M^{-1}Ax = M^{-1}b$ has better spectral and transient behavior for its corresponding iterative method.
Of course $M = A$ is the ultimate preconditioner: the resulting linear system is $x = A^{-1}b$, which is the solution. However, if we knew $A^{-1}$, we wouldn't be having this conversation. Also, the amount of work involved in computing $A^{-1}$ is more than the amount of work required to solve the system. So our goal is to find a preconditioner $M^{-1}$ that's close to the inverse of $A$ but is easy to compute and leaves the resulting iterative method with good pseudospectral properties.
If we write $A = M - N$, then $M^{-1}A = I - M^{-1}N$, and the iterative method of Section 13.1 then becomes
$$x_{k+1} = (I - M^{-1}N)x_k + M^{-1}b.$$

Hence we need the pseudospectral properties of $B = I - M^{-1}N$ to be well behaved. That is, we want the Kreiss constant of $B$ to be as small as possible. Of course, the overall rate of convergence is determined by the spectral radius, which is bounded by $\|B\|$, so we seek an $M$ that will make $\|B\|$ as small as possible. For example, a preconditioner $M$ that makes $\|B\| = \|I - M^{-1}N\| < 1/2$ would be considered an excellent choice because then the error would satisfy $\|e_{k+1}\| \le C/2^k$ for some constant $C$.
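As a concrete instance of a simple preconditioner, take $M$ to be the diagonal of $A$ (the Jacobi choice from Section 13.1). The sketch below (Python with NumPy; the diagonally dominant test matrix is our own choice) forms the iteration matrix $B = I - M^{-1}A$ of the preconditioned system, checks that its norm is comfortably below one, and runs the fixed-point iteration.

```python
import numpy as np

rng = np.random.default_rng(3)

n = 200
A = rng.standard_normal((n, n))
A += 2 * n * np.eye(n)                  # strongly diagonally dominant, so Jacobi converges
b = rng.standard_normal(n)

M_inv = np.diag(1.0 / np.diag(A))       # Jacobi preconditioner M = diag(A)
B = np.eye(n) - M_inv @ A               # iteration matrix of the preconditioned system
c = M_inv @ b

x = np.zeros(n)
for _ in range(50):
    x = B @ x + c                       # x_{k+1} = B x_k + c

print(np.linalg.norm(B, 2))             # well below 1 for this A
print(np.linalg.norm(A @ x - b))        # small residual after 50 iterations
```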

14.3 * Proof of the Kreiss Matrix Theorem


14.3.1 Proof of the Kreiss Matrix Theorem
For ease of exposition we break down the proof of the Kreiss theorem into several lemmata, the first of which is due to Spijker. Spijker's lemma is an interesting result that has uses in other areas of mathematics as well, but the proof is rather intricate. We just give the statement of Spijker's lemma here and defer the proof to Section 14.3.2.

Lemma 14.3.1 (Spijker's Lemma). If $\Gamma \subset \mathbb{C}$ is a circle with radius $\rho > 0$ and $r(z) = p(z)/q(z)$ is a rational function of order $n \in \mathbb{N}$, meaning $p, q$ are polynomials over $\mathbb{C}$ of degree at most $n$, with $q \ne 0$ on $\Gamma$, then
$$\int_\Gamma |r'(z)|\,|dz| \le 2\pi n\sup_{z\in\Gamma}|r(z)|. \qquad (14.13)$$

Lemma 14.3.2. Given $A \in M_n(\mathbb{C})$, let $r(z) = u^HR(z)v$ for some unit vectors $u$ and $v$, where $R(z)$ is the resolvent of $A$. If $\Gamma = \{z \in \mathbb{C} \mid |z| = 1 + (k+1)^{-1}\}$ for some $k \in \mathbb{N}$, then
$$\sup_{z\in\Gamma}|r(z)| \le (k+1)K(A). \qquad (14.14)$$

Proof. Using Exercise 4.33, we note that $|r(z)| \le \|R(z)\|$. Hence,
$$K(A) = \sup_{|z|>1}(|z|-1)\|R(z)\| \ge \sup_{|z|>1}(|z|-1)|r(z)| \ge \sup_{z\in\Gamma}(|z|-1)|r(z)| = \sup_{z\in\Gamma}(k+1)^{-1}|r(z)|. \qquad \Box$$

Now we can state and prove a lemma that will give the right-hand side of (14.12).

Lemma 14.3.3. If $A \in M_n(\mathbb{C})$, then
$$\sup_{k\in\mathbb{N}}\|A^k\| \le enK(A). \qquad (14.15)$$

Proof of the Kreiss Matrix Theorem. Let $\Gamma = \{z \in \mathbb{C} \mid |z| = 1 + (k+1)^{-1}\}$ for some $k \in \mathbb{N}$. Let $u$ and $v$ be arbitrary unit vectors and define $r(z) = u^HR(z)v$ as in Lemma 14.3.2. By the spectral resolution formula (Theorem 12.4.6), we have
$$A^k = \frac{1}{2\pi i}\int_\Gamma z^kR(z)\,dz,$$
and thus
$$u^HA^kv = \frac{1}{2\pi i}\int_\Gamma z^kr(z)\,dz.$$
Integrating by parts gives
$$u^HA^kv = -\frac{1}{2\pi i(k+1)}\int_\Gamma z^{k+1}r'(z)\,dz.$$
On the contour $\Gamma$ we have
$$|z|^{k+1} = \left(1 + \frac{1}{1+k}\right)^{k+1} < e.$$
Using Spijker's lemma (Lemma 14.3.1) and Lemma 14.3.2, we have
$$|u^HA^kv| \le \frac{1}{2\pi(k+1)}\int_\Gamma |z|^{k+1}|r'(z)|\,|dz| \le \frac{e}{2\pi(k+1)}\int_\Gamma |r'(z)|\,|dz| \le \frac{en}{k+1}\sup_{z\in\Gamma}|r(z)| \le enK(A).$$
Using Exercise 4.33, this gives $\|A^k\| \le enK(A)$. Since this holds for all $k \in \mathbb{N}$, we have (14.15). □

14.3.2 Proof of Spijker's Lemma


Spijker's lemma is a consequence of the following five lemmata. The fourth lemma
is hard, but the others are relatively straightforward.

Lemma 14.3.4. Let $p(z)$ be a polynomial of degree $n \in \mathbb{N}$. The restriction of $z^n\overline{p(z)}$ to the circle $S = \{z \in \mathbb{C} : |z| = \rho\}$, where $\rho > 0$, is equivalent to the restriction to $S$ of some polynomial of degree at most $n$.

Proof. Let $p(z) = a_nz^n + a_{n-1}z^{n-1} + \cdots + a_1z + a_0$. Assume $z(t) = \rho e^{it}$. Note that $\overline{z(t)} = \rho e^{-it} = \rho^2 z(t)^{-1}$. Thus,
$$\overline{p(z)} = \bar a_n\bar z^n + \cdots + \bar a_1\bar z + \bar a_0 = \bar a_n\rho^{2n}z^{-n} + \cdots + \bar a_1\rho^2 z^{-1} + \bar a_0.$$
It follows that
$$z^n\overline{p(z)} = \bar a_n\rho^{2n} + \bar a_{n-1}\rho^{2(n-1)}z + \cdots + \bar a_1\rho^2 z^{n-1} + \bar a_0 z^n,$$
which is a polynomial in $z$ of degree at most $n$. □

Lemma 14.3.5. Assume $f(t) = r(\rho e^{it})$. We have that
$$\int_S |r'(z)|\,|dz| = \int_0^{2\pi}|f'(t)|\,dt. \qquad (14.16)$$

Proof. Since $f'(t) = \rho ie^{it}r'(\rho e^{it})$, we have $|f'(t)| = \rho|r'(\rho e^{it})|$, and since $z = \rho e^{it}$ we have that $|dz| = \rho\,dt$. Thus,
$$\int_S |r'(z)|\,|dz| = \int_0^{2\pi}\frac{|f'(t)|}{\rho}\,\rho\,dt = \int_0^{2\pi}|f'(t)|\,dt. \qquad \Box$$

Lemma 14.3.6. Let $g'(t) = |f'(t)|\cos\omega(t)$ and $h'(t) = |f'(t)|\sin\omega(t)$ be the real and imaginary parts of $f'(t)$, respectively. The following equality holds:
$$|f'(t)| = \frac{1}{4}\int_0^{2\pi}|g'(t)\cos\theta + h'(t)\sin\theta|\,d\theta. \qquad (14.17)$$

Proof. We have that
$$\int_0^{2\pi}|g'(t)\cos\theta + h'(t)\sin\theta|\,d\theta = \int_0^{2\pi}|f'(t)|\,|\cos\omega(t)\cos\theta + \sin\omega(t)\sin\theta|\,d\theta = |f'(t)|\int_0^{2\pi}|\cos(\omega(t) - \theta)|\,d\theta = 4|f'(t)|. \qquad \Box$$

Lemma 14.3.7. For fixed $\theta \in [0, 2\pi]$, define $F_\theta(t) = g(t)\cos\theta + h(t)\sin\theta$. We have
$$\int_0^{2\pi}|F_\theta'(t)|\,dt \le 4n\sup_{t\in[0,2\pi]}|F_\theta(t)|. \qquad (14.18)$$

Proof. Since $F_\theta'(t) = g'(t)\cos\theta + h'(t)\sin\theta$ is derived from a nontrivial rational function, it has finitely many distinct roots $0 \le t_0 < t_1 < \cdots < t_{k-1} < 2\pi$. Write $t_k = 2\pi$. Thus, $|F_\theta'(t)| > 0$ on each interval $(t_{j-1}, t_j)$ for $j = 1, 2, \ldots, k$. If $F_\theta'(t) < 0$ on $(t_{j-1}, t_j)$, then
$$\int_{t_{j-1}}^{t_j}|F_\theta'(t)|\,dt = -\int_{t_{j-1}}^{t_j}F_\theta'(t)\,dt = -(F_\theta(t_j) - F_\theta(t_{j-1})) = |F_\theta(t_j) - F_\theta(t_{j-1})|.$$
If instead $F_\theta'(t) > 0$ on $(t_{j-1}, t_j)$, then
$$\int_{t_{j-1}}^{t_j}|F_\theta'(t)|\,dt = \int_{t_{j-1}}^{t_j}F_\theta'(t)\,dt = (F_\theta(t_j) - F_\theta(t_{j-1})) = |F_\theta(t_j) - F_\theta(t_{j-1})|.$$
Thus,
$$\int_0^{2\pi}|F_\theta'(t)|\,dt = \sum_{j=1}^{k}\left|\int_{t_{j-1}}^{t_j}F_\theta'(t)\,dt\right| = \sum_{j=1}^{k}|F_\theta(t_j) - F_\theta(t_{j-1})| \le 2k\sup_{t\in[0,2\pi]}|F_\theta(t)|.$$

It suffices to show that $k \le 2n$. Note that
$$2F_\theta(t) = g(t)(e^{i\theta} + e^{-i\theta}) - ih(t)(e^{i\theta} - e^{-i\theta}) = e^{i\theta}(g(t) - ih(t)) + e^{-i\theta}(g(t) + ih(t)) = e^{i\theta}\overline{r(z(t))} + e^{-i\theta}r(z(t)).$$
Recall that $z(t) = \rho e^{it}$. Multiplying both sides by $z^nq(z)\overline{q(z)}$, we have
$$2F_\theta(t)\,z^nq(z)\overline{q(z)} = e^{i\theta}\bigl(z^n\overline{p(z)}\bigr)q(z) + e^{-i\theta}p(z)\bigl(z^n\overline{q(z)}\bigr).$$
Since $q(z)$ is nonzero on $S$ and, by Lemma 14.3.4, the right-hand side is a polynomial on $S$ of degree at most $2n$, we can conclude that $F_\theta(t)$ has at most $2n$ roots. Therefore $k \le 2n$. □

Lemma 14.3.8. For fixed $\theta \in [0, 2\pi]$ we have
$$\sup_{t\in[0,2\pi]}|F_\theta(t)| \le \sup_{z\in S}|r(z)|. \qquad (14.19)$$

Proof. Using the Cauchy–Schwarz inequality (Proposition 3.1.17), we have
$$|F_\theta(t)| = |g(t)\cos\theta + h(t)\sin\theta| = \left|\left\langle\begin{bmatrix} g(t)\\ h(t)\end{bmatrix}, \begin{bmatrix}\cos\theta\\ \sin\theta\end{bmatrix}\right\rangle\right| \le \sqrt{g(t)^2 + h(t)^2}\,\sqrt{\cos^2\theta + \sin^2\theta} = \sqrt{g(t)^2 + h(t)^2} = |r(z(t))|. \qquad \Box$$

Now we have all the pieces needed for an easy proof of Spijker's lemma.

Proof of Spijker's Lemma. Using Lemmata 14.3.5–14.3.8, we have
$$\int_S |r'(z)|\,|dz| = \int_0^{2\pi}|f'(t)|\,dt = \int_0^{2\pi}\left(\frac{1}{4}\int_0^{2\pi}|g'(t)\cos\theta + h'(t)\sin\theta|\,d\theta\right)dt = \frac{1}{4}\int_0^{2\pi}\left(\int_0^{2\pi}|F_\theta'(t)|\,dt\right)d\theta \le n\int_0^{2\pi}\left(\sup_{t\in[0,2\pi]}|F_\theta(t)|\right)d\theta \le 2\pi n\sup_{z\in S}|r(z)|. \qquad \Box$$

Exercises
Note to the student: Each section of this chapter has several corresponding
exercises, all collected here at the end of the chapter. The exercises between the
first and second line are for Section 1, the exercises between the second and third
lines are for Section 2, and so forth.
You should work every exercise (your instructor may choose to let you skip
some of the advanced exercises marked with *). We have carefully selected them,
and each is important for your ability to understand subsequent material. Many of
the examples and results proved in the exercises are used again later in the text.
Exercises marked with &. are especially important and are likely to be used later
in this book and beyond. Those marked with t are harder than average, but should
still be done.
Although they are gathered together at the end of the chapter, we strongly
recommend you do the exercises for each section as soon as you have completed
the section, rather than saving them until you have finished the entire chapter. The
exercises for each section are separated from those for other sections by a
horizontal line.

14.1. Given $A \in M_n(\mathbb{F})$, show that $\sigma_{\varepsilon_1}(A) \subset \sigma_{\varepsilon_2}(A)$ whenever $0 < \varepsilon_1 < \varepsilon_2$.
14.2. Prove the following: Given $A \in M_n(\mathbb{C})$, there exist $\gamma > 0$ and $M \ge 1$ such that
$$\|A^k\| \le M\gamma^k \quad\text{for all } k \in \mathbb{N}.$$
Hint: Given $\varepsilon > 0$, consider $\tilde A = (r(A) + \varepsilon)^{-1}A$ and use Exercise 12.15.
14.3. Given $A \in M_n(\mathbb{F})$, show that for any $z \in \mathbb{C}$ and $\varepsilon > 0$, we have
$$\sigma_\varepsilon(zI - A) = z - \sigma_\varepsilon(A).$$

14.4. Prove Proposition 14.1.10; that is, show the equivalence of (14.4) and (14.2).
14.5. For any $A \in M_n(\mathbb{C})$, any nonzero $c \in \mathbb{C}$, and $\varepsilon > 0$, prove the following:
(i) $\sigma_\varepsilon(A^H) = \overline{\sigma_\varepsilon(A)}$.
(ii) $\sigma_\varepsilon(A + c) = \sigma_\varepsilon(A) + c$.
(iii) $\sigma_{|c|\varepsilon}(cA) = c\,\sigma_\varepsilon(A)$.

14.6. Prove: If $A$ is normal, then $r_\varepsilon(A) = r(A) + \varepsilon$.


14.7. Prove: If $A \in M_n(\mathbb{C})$ is normal, then $\|A^k\| = r(A)^k$.
14.8. Given $\varepsilon > 0$, let
$$A = \begin{bmatrix} 1 & \varepsilon \\ 0 & 1 \end{bmatrix}.$$
(i) Use induction to prove that
$$A^k = \begin{bmatrix} 1 & \varepsilon k \\ 0 & 1 \end{bmatrix}.$$
(ii) Use Exercise 3.28 to show that $\|A^k\|_2 \to \infty$ as $k \to \infty$. Hint: It is trivial to compute the 1-norm and $\infty$-norm of $A^k$.

14.9. Given $A \in M_n(\mathbb{C})$, assume that $r(A) = 1$, but that the eigennilpotent $D_\lambda$ is zero for all $\lambda$ such that $|\lambda| = 1$. Prove one direction of Theorem 14.2.1(iii) by completing the following steps:
(i) Use the spectral decomposition to write
$$A = \sum_{|\lambda|=1}\lambda P_\lambda + B,$$
where $r(B) < 1$ and $P_\lambda B = BP_\lambda = 0$ for all $\lambda \in \sigma(A)$ satisfying $|\lambda| = 1$.
(ii) Show that $A^k = \sum_{|\lambda|=1}\lambda^kP_\lambda + B^k$.
(iii) Use the triangle inequality to show that
$$\|A^k\| \le \sum_{|\lambda|=1}\|P_\lambda\| + \|B^k\|.$$
Then take the limit as $k \to \infty$.
14.10. Given $A \in M_n(\mathbb{C})$, assume there exists an eigenvalue of modulus one that has an eigennilpotent. Prove that $\|A^k\| \to \infty$ as $k \to \infty$. This is the other direction of Theorem 14.2.1(iii).

14.11. Show that $\left(1 + \frac{1}{1+k}\right)^{k+1}$ is strictly monotonic in $k \in \mathbb{N}$ by completing the following steps:
(i) Let $f(x) = x^{k+1}$ and use the mean value theorem for $0 \le a < b$ to get $b^{k+1} - a^{k+1} < (k+1)b^k(b-a)$.
(ii) By expanding the previous expression, show that $a^{k+1} > b^k\bigl(b - (k+1)(b-a)\bigr)$.
(iii) Complete the proof by choosing $a = 1 + (k+1)^{-1}$ and $b = 1 + k^{-1}$.


14.12. Given $A \in M_n(\mathbb{F})$, prove that the Kreiss constant $K(A)$ is the smallest $C > 0$ such that
$$\|(zI - A)^{-1}\| \le \frac{C}{|z| - 1} \quad\text{for all } |z| > 1.$$
14.13. Given $A \in M_n(\mathbb{F})$ and the resolvent estimate in the previous problem, show that for a contour of radius $1 + \varepsilon$, centered at the origin with $\varepsilon > 0$, we have that
$$\|A^k\| \le \frac{(1+\varepsilon)^k}{\varepsilon}K(A).$$
Then, for each $k$, choose $\varepsilon$ to minimize the right-hand side, leaving the bound
$$\|A^k\| \le k\left(1 + \frac{1}{k}\right)^kK(A) \le keK(A).$$
Since this grows without bound, this isn't a very useful bound. It's for this reason that the integration-by-parts step in Lemma 14.3.3 is so critical.

14.14. By integrating by parts twice in Lemma 14.3.3, show that
$$\|A^k\| \le \frac{2en^2}{(k+1)(k+2)}\sup_{z\in\Gamma}|r(z)|.$$
Hint: The derivative of a rational function is a rational function.


14.15. If it is known a priori that the numerator and denominator of the rational
function r(z) have degrees n - 1 and n, respectively, can Spijker's lemma be
improved, and if so, what does this mean for the Kreiss matrix theorem?

Notes
Much of our treatment of pseudospectra was inspired by [Tre92, Tre97, TE05], and
Exercise 14.5 is from [TE05]. All figures showing contour plots of pseudospectra
were created using Eigtool [Wri02].
For more about the spectral condition number and the choice of V in (14.6)
and in the Bauer-Fike theorem (Corollary 14.1.15), see [Dem83, Dem82, JL97].
Finally, for a fascinating description of the history and applications of Spijker's
lemma, we strongly recommend the article [WT94] .
Rings and Polynomials

-Sauron

Ring theory has many applications in computing, counting, cryptography, and com-
munications. Rings are sets with two operations,"+" and"·" (usually called addition
and multiplication, respectively). In many ways rings are like vector spaces, and
much of our treatment of the theory of rings will mirror our treatment of vector
spaces at the beginning of this book. The prototypical ring is the set Z with addi-
tion and multiplication as the two operations. But there are many other interesting
and useful examples of rings, including IF[x], Mn(IF), C(U; IF), and .@(X) for any
normed linear space X. Ring theory shows that many results that are true for the
integers also apply to these rings.
We begin with a survey of the basic properties of rings, with a special focus
on the similarities between rings and vector spaces. We then narrow our focus to
a special class of rings called Euclidean domains, and to quotients of Euclidean
domains. These include rings of polynomials in one variable and the ring of all
the matrices that can be formed by applying polynomials to one given matrix. A
Euclidean domain is a ring in which the Euclidean algorithm holds, and we show
in the third section that this also implies the fundamental theorem of arithmetic
(unique prime factorization) holds in these rings. Applying this to the ring of
polynomials, we get an alternative form of the fundamental theorem of algebra.
In the fourth section we treat homomorphisms, which are the ring-theoretic
analogue of linear transformations. We show there are strong parallels between
maps of rings (homomorphisms) and maps of vector spaces (linear transformations).
Again the kernel plays a fundamental role, and the first and second isomorphism
theorems for rings both hold. 51
51 There is also a third isomorphism theorem both for rings and for vector spaces. We do not cover
that theorem here, but you can find it in almost any standard text on ring theory.


One of the most important results of this chapter is the Chinese remainder
theorem (CRT), which we prove in Section 15.6. It is a powerful tool with many
applications in both pure and applied mathematics. In the remainder of the chapter
we focus primarily on the important implications of the CRT for partial fraction
decomposition, polynomial interpolation, and spectral decomposition of operators.
We conclude the chapter by describing a remarkable connection between Lagrange
interpolants and the spectral decomposition of a matrix (see Section 15.7.3).

15.1 Definition and Examples


15.1.1 Definition and Basic Properties

Definition 15.1.1. A ring is a set R of elements with two operations: addition,


mapping $R \times R$ to $R$, and denoted by $(x, y) \mapsto x + y$; and multiplication, mapping $R \times R$ to $R$, and denoted by $(x, y) \mapsto x \cdot y$ or just $(x, y) \mapsto xy$. These operations must satisfy the following properties for all $x, y, z \in R$:

(i) Commutativity of addition: x + y = y + x.


(ii) Associativity of addition: (x + y) + z = x + (y + z) .
(iii) Existence of an additive identity: There exists an element $0 \in R$ such that $0 + x = x$.
(iv) Existence of an additive inverse: For each x E R there exists an element,
denoted -x, in R such that x + (-x) = 0.

(v) First distributive law: z(x + y) = zx + zy.

(vi) Second distributive law: (x + y)z = xz + yz.

(vii) Associativity of multiplication: (xy)z = x(yz) .

Remark 15.1.2. A ring is like a vector space in many ways. The main differences
are in the nature of the multiplication. In a vector space we multiply by scalars that
come from a field- outside the vector space. But in a ring we multiply by elements
inside the ring.

Example 15.1.3. Even if you have seen rings before, you should familiarize
yourself with the following examples, since most of them are very common in
mathematics and will arise repeatedly in many different contexts.

(i) The integers Z form a ring.

(ii) The rationals $\mathbb{Q}$, the reals $\mathbb{R}$, and the complex numbers $\mathbb{C}$ each form a ring (with the usual operations of addition and multiplication).

(iii) Fix a positive integer $n$. The set $\mathbb{Z}_n = \{[[0]], \ldots, [[n-1]]\}$ of equivalence classes mod $n$, as described in Example A.1.18(i), forms a ring with operations $\oplus$ and $\odot$, as described in Examples A.2.16(i) and A.2.16(ii).

(iv) The set $C^\infty((a, b); \mathbb{F})$ of smooth functions from $(a, b)$ to $\mathbb{F}$ forms a ring, where $+$ is pointwise addition $(f + g)(x) = f(x) + g(x)$ and $\cdot$ is pointwise multiplication $(f \cdot g)(x) = f(x)g(x)$.

(v) Given a vector space $X$, the set $\mathscr{B}(X)$ of bounded linear operators on $X$ is a ring, where $+$ is the usual addition of operators and $\cdot$ is composition of operators. In particular, the set of $n \times n$ matrices $M_n(\mathbb{F})$ is a ring, with the usual matrix addition and matrix multiplication.

(vi) Given a set $S$, the power set $\mathscr{P}(S)$ forms a ring, where $+$ is the symmetric difference and $\cdot$ is intersection:
$$A + B = (A \cup B) \smallsetminus (A \cap B) \quad\text{and}\quad A \cdot B = A \cap B.$$
The additive identity is $\varnothing$, and any subset $A$ is its own additive inverse: $-A = A$.
Warning: Some people use $+$ in this setting to denote union $\cup$. This definition of $+$ does not satisfy the axioms of a ring.

(vii) The set {True, False} of Boolean truth values forms a ring, where $+$ is the operation of exclusive OR (XOR); that is, $a + b$ = True if and only if exactly one of $a$ and $b$ is True. Multiplication $\cdot$ is the operation AND. The additive identity $0$ is the element False, and the additive inverse of any element is itself: $-a = a$.
Again, you should be aware that many people use $+$ to denote inclusive OR, but inclusive OR will not satisfy the axioms of a ring.
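Finite rings such as $\mathbb{Z}_n$ are small enough that the ring axioms can be checked exhaustively by a computer. The sketch below (Python; the class name and interface are our own, purely for illustration) models $\mathbb{Z}_6$ and spot-checks commutativity of addition and associativity of multiplication.

```python
class Zn:
    """Elements [[k]] of the ring Z_n, with addition and multiplication mod n."""

    def __init__(self, value, n):
        self.n = n
        self.value = value % n

    def __add__(self, other):
        return Zn(self.value + other.value, self.n)

    def __mul__(self, other):
        return Zn(self.value * other.value, self.n)

    def __neg__(self):
        return Zn(-self.value, self.n)

    def __eq__(self, other):
        return self.n == other.n and self.value == other.value

    def __repr__(self):
        return f"[[{self.value}]] (mod {self.n})"


n = 6
elements = [Zn(k, n) for k in range(n)]
# Axiom (i): commutativity of addition.
assert all(x + y == y + x for x in elements for y in elements)
# Axiom (vii): associativity of multiplication.
assert all((x * y) * z == x * (y * z)
           for x in elements for y in elements for z in elements)
```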

Remark 15.1.4. Note that the definition of a ring requires addition to be commu-
tative but does not require (or forbid) that multiplication be commutative. Perhaps
the most common example of a ring with noncommutative multiplication is the
ring Mn (lF).
Definition 15.1.5. A ring R is commutative if ab= ba for all a, b ER.

Example 15.1.6.

(i) Given any commutative ring $R$, the set
$$R[x] = \{a_0 + a_1x + \cdots + a_kx^k \mid k \in \mathbb{N},\ \text{all } a_i \in R\}$$
of all polynomials in the variable $x$ with coefficients in $R$ forms a commutative ring, where the operations $+$ and $\cdot$ are the usual addition and multiplication of polynomials.
Since $R[x]$ is itself a ring, we may repeat the process with a new variable $y$ to get that $R[x, y] = R[x][y]$ is also a ring, as is $R[x, y, z]$, and so forth.

(ii) Given any commutative ring $R$, the set
$$R[[x]] = \left\{\sum_{i=0}^{\infty}a_ix^i \;\middle|\; a_i \in R \text{ for all } i \in \{0, 1, \ldots\}\right\}$$
of formal power series in the indeterminate $x$ is a commutative ring. Elements of $R[[x]]$ need not converge. This is why they are called formal power series.

(iii) The set
$$R[x, x^{-1}] = \left\{\sum_{i=N}^{M}a_ix^i \;\middle|\; N \le M \in \mathbb{Z},\ a_i \in R \text{ for all } i \in \{N, \ldots, M\}\right\}$$
of Laurent polynomials in the indeterminate $x$ with coefficients in $R$ is a commutative ring if $R$ is a commutative ring. Notice here that $N$ may be negative, but $N$ and $M$ must be finite.

(iv) & If $A \in M_n(\mathbb{F})$ is a square matrix, the set
$$\mathbb{F}[A] = \{a_0I + a_1A + \cdots + a_kA^k \mid k \in \mathbb{N},\ a_i \in \mathbb{F}\}$$
of polynomials in $A$ with coefficients in $\mathbb{F}$ is a ring.
Note that while the symbol $x$ in the rings $\mathbb{F}[x]$ and $\mathbb{F}[[x]]$ and $\mathbb{F}[x, x^{-1}]$ is a formal symbol, the $A$ in $\mathbb{F}[A]$ is a specific matrix. So the elements of $\mathbb{F}[x]$, for example, are all the formal expressions of the form $a_0 + a_1x + \cdots + a_nx^n$, but the elements of $\mathbb{F}[A]$ are a specific set of matrices. For example, if $A = \left[\begin{smallmatrix} 0 & 1 \\ 0 & 0 \end{smallmatrix}\right]$, then $A^n = 0$ for $n > 1$, and so $\mathbb{C}[A] = \{c_1I + c_2A \mid c_1, c_2 \in \mathbb{C}\} = \left\{\left[\begin{smallmatrix} c_1 & c_2 \\ 0 & c_1 \end{smallmatrix}\right] \mid c_1, c_2 \in \mathbb{C}\right\}$ is the set of all upper-triangular matrices with constant diagonal.
The ring $\mathbb{F}[A] \subset M_n(\mathbb{F})$ is a commutative ring because any power of $A$ commutes with any other power of $A$, despite the fact that it is a subring of $M_n(\mathbb{F})$, which is not commutative.

Nota Bene 15.1.7. As with vector spaces, a subtle point that is sometimes missed, because it is not a separate item in the numbered list of axioms, is that the definition of an operation requires that the operations of addition and multiplication take their values in $R$. That is, $R$ must be closed under addition and multiplication.

Unexample 15.1.8.

(i) The natural numbers $\mathbb{N}$ with the usual operations of $+$ and $\cdot$ do not form a ring because every nonzero element fails to have an additive inverse.

(ii) The odd integers $\mathbb{O}$ with the usual operations $+$ and $\cdot$ do not form a ring because the operation $+$ takes two odd integers and returns an even integer. That is, the operation $+$ is not an operation on odd integers, since an operation on $\mathbb{O}$ is a function $\mathbb{O} \times \mathbb{O} \to \mathbb{O}$, but the range of $+$ is not contained in $\mathbb{O}$. Instead of saying that $+$ is not an operation on $\mathbb{O}$, many people say the set $\mathbb{O}$ is not closed under addition.

Proposition 15.1.9. Let $R$ be a ring. If $x, y \in R$, then the following hold:

(i) The additive identity $0$ is unique, that is, $x + y = x$ implies $y = 0$.

(ii) Additive inverses are unique, that is, $x + y = 0$ implies $y = -x$.

(iii) $0 \cdot x = 0$.

(iv) $(-y)\cdot x = -(y\cdot x) = y\cdot(-x)$.

Proof. The proof is identical to the case of vector spaces (Proposition 1.1. 7), except
for (iv), and even in that case it is similar. To see (iv) note that for each x, y ER,
we have 0 = 0 · x = (y + (- y)) · x = y · x + (-y) · x. Hence, (ii) implies that
(- y) · x = - (y · x) . A similar argument shows that -(y · x) = y · (-x). D

Remark 15.1.10. Since the expression x+(-y) determines a well-defined (unique)


element, we usually use the shorthand x - y to denote x + ( - y). We also usually
write xy instead of x · y.

Although the axioms of a ring require an additive identity element 0, they do


not require the existence of a multiplicative identity (usually denoted 1).

Definition 15.1.11. Let u in a ring R have the property that for every r E R we
have ur = ru = r. In this case u is called unity and is usually denoted 1.

Proposition 15.1.12. If R has a unity element, then it is unique .

Proof. Suppose that R contains two elements u and u' such that ur = ru =rand
u'r = ru' = r for every r ER. We have u = uu' by the unity property of u', but
also uu' = u' by the unity property of u. Hence u = u'. D

Multiplicative inverses are also not required in a ring, but when they exist,
they are very useful.

Definition 15.1.13. For any ring $R$ with unity $1$, an element $a \in R$ is invertible$^{52}$ if there exists $b \in R$ such that $ab = 1 = ba$. We usually denote this $b$ by $a^{-1}$.

Proposition 15.1.14. In any ring R with unity l, if an element a E R has a


multiplicative inverse, then its inverse is unique.

Proof. If there exist elements b, b' E R such that ab = 1 and b' a = 1, then we have

b =lb= (b'a)b = b'(ab) = b' . D

Example 15.1.15. In the ring Z, the only invertible elements are 1 and -1 ,
but in the ring IQ every nonzero element is invertible.

15.1.2 Ideals
Roughly speaking, an ideal is to a ring what a vector subspace is to a vector space.
But this analogy is incomplete because an ideal is not just a subring (a subset that
is also a ring)- it must satisfy some additional conditions that are important for
building quotients of rings.

Definition 15.1.16. Let R be a ring. A nonempty subset I c R is an ideal of R if


the operations of addition and multiplication in R satisfy the following properties:

(i) For any x, y E I, we have x +y E I and x - y E I.


(ii) For any r ER and any x EI, we have rx EI and xr EI.

Remark 15.1.17. Of course the first condition is just saying that $I$ is closed under addition and subtraction, but the second condition is new to us: $I$ must be closed under multiplication by any ring element. Roughly speaking, this is analogous to the condition that vector subspaces should be closed under scalar multiplication, but here the analogues of scalars are elements of the ring $R$.

Example 15.1.18. The set $\mathbb{E} = 2\mathbb{Z}$ of even integers is an ideal of the integers $\mathbb{Z}$, because it is closed under addition, subtraction, and multiplication by any integer.

52 It
is common to call an invertible element of a ring a unit, but this term is easily confused with
unity, so we prefer not to use it.

Unexample 15.1.19. The set $\mathbb{E} = 2\mathbb{Z}$ is not an ideal of the rationals $\mathbb{Q}$, because it is not closed under multiplication by an element from $\mathbb{Q}$. For example, $2 \in \mathbb{E}$, but $\tfrac{1}{2}\cdot 2 = 1 \notin \mathbb{E}$.
This unexample, combined with Example 15.1.18, shows that talking about an ideal makes no sense without specifying what ring it is an ideal of.

Proposition 15.1.20. Let R be a ring. Any ideal I c R of a ring R is itself a


ring, using the same operations + and · .

Proof. From the definition of ideal, we have that I is closed under addition and
multiplication, and hence those operations are operations on I. The properties of
associativity for + and ., as well as commutativity of + and distributivity all follow
immediately from the fact that they hold in the larger ring R.
All that remains is to check that the additive identity 0 is in I and that every
element of I has an additive inverse in I. But these both follow by closure under
subtraction. First, for any x EI we have 0 = x - x EI . Now since 0 E I, we have
- x = 0 - x EI. D

Remark 15.1.21. Example 15.1.19 shows that the converse to the previous propo-
sition is false: not every subring is an ideal.

Proposition 15.1.22. To check that a nonempty subset I of a ring R is an ideal,


it suffices to check that
(i) $rx \in I$ and $xr \in I$ for all $r \in R$ and $x \in I$ (closure under multiplication by any ring element), and
(ii) $x - y \in I$ for all $x, y \in I$ (closure under subtraction).

Proof. If I is closed under subtraction, then given any x, y E I, we have y - y =


0 EI, and thus -y = 0 -y EI. So we have x + y = x - (-y) EI. D

The next proposition is immediate, and in fact it may even seem more difficult
than checking the definition directly, but it makes many proofs much cleaner; it also
makes the similarity to vector subspaces clearer.

Proposition 15.1.23. If R is a ring with unity, then to check that a nonempty


subset I c R is an ideal, it suffices to check that ax + by E I and xa + yb E I for
every a, b ER and every x, y EI. If R is commutative, it suffices to check just that
ax+ by EI for every a, b ER and every x, y EI.

Example 15.1.24.

(i) Every ring R is an ideal of itself, but it may not be an ideal of a


larger ring.

(ii) The set {O} is an ideal in any ring.

(iii) For any $p \in \mathbb{R}$, the set $\mathfrak{m}_p = \{f \in C^\infty(\mathbb{R}; \mathbb{R}) \mid f(p) = 0\}$ of all functions that vanish at $p$ is an ideal in the ring $C^\infty(\mathbb{R}; \mathbb{R})$.

(iv) If $n, d, k \in \mathbb{Z}$ are positive integers with $dk = n$, then the set of (mod $n$) equivalence classes $d\mathbb{Z}_n = \{[[0]], [[d]], [[2d]], \ldots, [[(k-1)d]]\}$ is an ideal of $\mathbb{Z}_n$.

(v) For any ring $R$, the set
$$\{a_1x + a_2x^2 + \cdots + a_kx^k \mid k \in \mathbb{N},\ a_i \in R\}$$
of all polynomials with zero constant term is an ideal in the polynomial ring $R[x]$.

(vi) The matrices $M_n(2\mathbb{Z}) \subset M_n(\mathbb{Z})$ of matrices with even integer entries form an ideal in the ring of matrices with integer entries.

Unexample 15.1.25.

(i) The set $\mathbb{O}$ of odd integers is not an ideal in $\mathbb{Z}$ because it is not closed under subtraction or addition.

(ii) The only ideals of $\mathbb{Q}$ are $\{0\}$ and $\mathbb{Q}$ itself. Any ideal $I \subset \mathbb{Q}$ that is not $\{0\}$ must contain a nonzero element $x \in I$. But since $x \ne 0$, we also have $1/x \in \mathbb{Q}$. Let $s$ be any element of $\mathbb{Q}$, and take $r = s/x \in \mathbb{Q}$. Closure under multiplication by elements of $\mathbb{Q}$ implies that $s = (s/x)x = rx \in I$.

15.1.3 Generating Sets and Products of Idea ls


Rings are like vector spaces and ideals like subspaces in many ways. In the theory
of rings and ideals, generating sets play the same role that spanning sets play for
vector subspaces. But one important way that rings differ from vector spaces is
that most rings do not have a meaningful analogue of linear independence. So
while we can talk about generating sets (like spanning sets), we do not have all
the nice properties of a basis, and we cannot define dimension in the same way we
do for vector spaces. Nevertheless, many results on spanning sets and quotients of
vectors spaces carry over to generating sets and ring quotients. For many of these
results, after making the obvious adjustments from vector spaces to rings , the proof
is essentially identical to the vector space proof.
Throughout this section, assume that R is a commutative ring and that S
is a nonempty subset of R. Most of what we discuss here has a straightforward
generalization to noncommutative rings, but the commutative case is simpler and
is all we are interested in for now.

Definition 15.1.26. Let $R$ be a commutative ring. The ideal of $R$ generated by $S$, denoted $(S)$, is the set of all finite sums of the form
$$c_1x_1 + c_2x_2 + \cdots + c_mx_m, \quad\text{where } c_i \in R \text{ and } x_i \in S. \qquad (15.1)$$
We call such a sum an $R$-linear combination of elements of $S$. If $(S) = I$ for some ideal $I$ of $R$, then we say that $S$ is a generating set for $I$, or equivalently that $S$ generates $I$.

Proposition 15.1.27. Let $R$ be a commutative ring. For any subset $S \subset R$, the set $(S)$ is an ideal of $R$.

Proof. If $x, y \in (S)$, then there exists a finite subset $\{x_1, \ldots, x_m\} \subset S$ such that $x = \sum_{i=1}^m c_ix_i$ and $y = \sum_{i=1}^m d_ix_i$ for some coefficients $c_1, \ldots, c_m$ and $d_1, \ldots, d_m$ (some possibly zero). Since $ax + by = \sum_{i=1}^m(ac_i + bd_i)x_i$ is an $R$-linear combination of elements of $S$, it follows that $ax + by$ is contained in $(S)$. Since $R$ is commutative, we also have $xa + yb = ax + by \in (S)$; hence, $(S)$ is an ideal of $R$. □

Corollary 15.1.28. Let $R$ be a commutative ring. Let $\{I_j\}_{j=1}^n$ be a finite set of ideals in $R$. The sum $I = I_1 + \cdots + I_n = \{\sum_{j=1}^n a_j \mid a_j \in I_j\} = (I_1 \cup I_2 \cup \cdots \cup I_n)$ is an ideal of $R$.

Proposition 15.1.29. Let R be a commutative ring. If I is an ideal of R, then


(I) = I.

Proof. The proof is by induction on the number of terms in an R-linear combina-


tion. It is essentially identical to that given for vector spaces (Proposition 1.2.2).
The only changes we need to make to that proof are replacing scalars in lF with
elements of R. D

E x ample 15.1.30.

(i) The ideal (2) in the ring Z is the set (2) = {. . . , -4, - 2, 0, 2, 4, . . . } of
even integers. More generally, for any d E Z the ideal (d) is the set
(d ) = {... , -2d, -d, 0, d, 2d, . .. } of all multiples of d.
(ii) For any ring R, and x ER, the ideal (x) is the set of all multiples of x .
(iii) The ideal (6, 9) in Z is the set of all Z-linear combinations of 6
and 9. Since 3 = 9 - 6, we have 3 E (6, 9), which implies that (3) C (6, 9).
But since 6 and 9 are both elements of (3) , we also have (6, 9) C (3),
and hence (6, 9) = (3).
(iv) In the polynomial ring C [x, y] , the ideal (x, y) is the set of all polynomials
whose constant term is zero.

(v) In the polynomial ring $\mathbb{C}[x, y]$ the ideal $I = (x^2 - y + 1,\ 2y - 2)$ contains the element $x^2 - y + 1 + \tfrac{1}{2}(2y - 2) = x^2$, and also the element $\tfrac{1}{2}(2y - 2) = y - 1$, so $(x^2, y - 1) \subset I$; but also $x^2 - y + 1 = x^2 - (y - 1)$ and $2y - 2 = 2(y - 1)$, so we have $I = (x^2, y - 1)$.

Nota Bene 15.1.31. A good way to show an ideal is contained in another is to show its generators lie in the other ideal. To show that two ideals are equal, show that the generators of each are contained in the other ideal.

Example 15.1.32. In a commutative ring $R$ with unity $1$, for any invertible element $u \in R$, the ideal $(u)$ generated by $u$ is the entire ring; that is, $(u) = R$. To see this, note that for any element $r \in R$, we have $r = (ru^{-1})u \in (u)$.

Nota Bene 15.1.33. When talking about elements of a ring, parentheses are still sometimes used to specify order of operations, so you will often encounter confusing notation like $(x^2 - (y - 1))$. If $(x^2 - (y - 1))$ is supposed to mean an element of $R$, then $(x^2 - (y - 1)) = x^2 - y + 1$, but if it is supposed to be an ideal, then $(x^2 - (y - 1)) = (x^2 - y + 1)$ means the ideal generated by $x^2 - y + 1$, that is, the set of all elements of $R$ that can be written as a multiple of $x^2 - y + 1$. The meaning should be clear from the context, but it takes some getting used to; so you'll need to pay careful attention to identify the intended meaning. Although this notation is confusing and is certainly suboptimal, it is the traditional notation, so you should become comfortable with it.

The proofs of the remaining results in t his section are all essentially identical
to their vector space counterparts in Section 1.5. T he details are left as exercises.

Proposition 15.1.34. The intersection of a collection {Ia}aEJ of ideals of R is


an ideal of R.

Proposition 15 .1.35. Supersets of generating sets are also generating sets, that
is, if R is a commutative ring, if (S) =I, and if S C S ' C I , then (S') =I.

Theorem 15.1.36. Let R be a commutative ring. The ideal generated by S is the


smallest ideal of R that contains S , or, in other words, it is the intersection of all
ideals of R that contain S.

Proposition 15.1.37. If $I_1, \ldots, I_n$ are ideals in a commutative ring $R$, then the product of these ideals, defined by
$$\prod_{i=1}^n I_i = \left\{\sum_{j=1}^m\left(\prod_{i=1}^n a_{ji}\right) \;\middle|\; m \in \mathbb{N},\ a_{ji} \in I_i\right\},$$
is an ideal of $R$ and a subset of $\bigcap_{i=1}^n I_i$.

15.2 Euclidean Domains


Some of the most important rings in applications are the rings$^{53}$ $\mathbb{Z}$ and $\mathbb{F}[x]$. Some
key properties of these rings are that they are all commutative and satisfy a form
of the division property; thus they have a form of the Euclidean algorithm. These
properties allow us to talk meaningfully about primes and divisibility.
Rings that have these properties are called Euclidean domains. Later we will
use the properties of Euclidean domains to develop the partial fraction decomposi-
tion for rational functions and to better understand polynomial interpolation.

15.2.1 Euclidean Domains and Polynomial Rings


Euclidean domains are commutative rings where the division algorithm and the Eu-
clidean algorithm work. In this section we give a careful definition of Euclidean
domains and then prove that polynomials in one variable form a Euclidean domain.
The key to the Euclidean algorithm for the integers is the fact that size (abso-
lute value) of the remainder r is smaller than that of the divisor b. In a general ring,
the absolute value may not make sense, but in many cases we can find a function
that will take the place of the absolute value. We call such a function a valuation.

Definition 15.2.1. A ring R is a Euclidean domain if the following hold:

(i) $R$ has a multiplicative identity element $1$.

(ii) $R$ is commutative; that is, for every $a, b \in R$ we have $ab = ba$.

(iii) $R$ has no zero divisors; that is, for any $a, b \in R$, if $ab = 0$, then $a = 0$ or $b = 0$.

(iv) There exists a function $v : R \smallsetminus \{0\} \to \mathbb{N}$ (called a valuation) such that
(a) (division property) for any $a \in R$ and any nonzero $b \in R$ there exist $q, r \in R$ such that
$$a = bq + r,$$
with either $r = 0$ or $v(r) < v(b)$;
(b) for any $a, b \in R$ with $ab \ne 0$, we have
$$v(a) \le v(ab).$$

The canonical example of a Euclidean domain is the integers Z with the ab-
solute value v(x) = lxl as the valuation, but another important example is the ring
of polynomials in one variable over a field.

Definition 15.2.2. Define a valuation on the ring $\mathbb{F}[x]$ of polynomials with coefficients in $\mathbb{F}$ by $v(p(x)) = \deg p(x)$, where the degree $\deg p(x)$ of $p(x) = \sum_{i=0}^n a_ix^i$ is the greatest integer $i$ such that $a_i$ is not zero. For convenience, also define $\deg(0) = -\infty$.
$^{53}$Almost everything we do in this chapter with the ring $\mathbb{F}[x]$ will work just as well with $F[x]$, where $F$ is any field, not just $\mathbb{R}$ and $\mathbb{C}$. For a review of fields see Appendix B.2.

Proposition 15.2.3. For any $a, b \in \mathbb{F}[x]$ the degree satisfies

(i) $\deg(ab) = \deg(a) + \deg(b)$,

(ii) $\deg(a + b) \le \max(\deg(a), \deg(b))$.

Proof. First, observe that $\mathbb{F}$ has no zero divisors; that is, if $\alpha, \beta \in \mathbb{F}$ are both nonzero, then the product $\alpha\beta$ is also nonzero. To see this, assume that $\alpha\beta = 0$. Since $\alpha \ne 0$, it has an inverse, so $\beta = (\alpha^{-1}\alpha)\beta = \alpha^{-1}(\alpha\beta) = \alpha^{-1}\cdot 0 = 0$, a contradiction. So the product of any two nonzero elements is not zero.
For (i), if $a$ and $b$ are nonzero, then writing out the polynomials $a = a_0 + a_1x + \cdots + a_mx^m$ and $b = b_0 + b_1x + \cdots + b_nx^n$, with $b_n$ and $a_m$ both nonzero, so that $\deg(a) = m$ and $\deg(b) = n$, we have
$$ab = a_0b_0 + (a_0b_1 + a_1b_0)x + \cdots + a_mb_nx^{m+n}.$$
Since $a_mb_n \ne 0$, we have $\deg(ab) = n + m = \deg(a) + \deg(b)$.

For (ii), assume without loss of generality that $m \le n$. If $m < n$, we have
$$a + b = (a_0 + b_0) + (a_1 + b_1)x + \cdots + (a_m + b_m)x^m + b_{m+1}x^{m+1} + \cdots + b_nx^n,$$
and so in this case $\deg(a + b) = n = \max(\deg(a), \deg(b))$.

If $m = n$, we have
$$a + b = (a_0 + b_0) + (a_1 + b_1)x + \cdots + (a_m + b_m)x^m.$$
So, if $a_m + b_m \ne 0$, we have $\deg(a + b) = \deg(a) = \deg(b)$, and if $a_m + b_m = 0$, we have $\deg(a + b) < \deg(a) = \deg(b)$.
The result also holds if one or both of the polynomials is $0$. □

Theorem 15.2.4. The ring $\mathbb{F}[x]$ is a Euclidean domain with valuation given by the degree of the polynomial, $v(p(x)) = \deg p(x)$.

Proof. First observe that $\mathbb{F}[x]$ has no zero divisors, because if $a, b \in \mathbb{F}[x]$ are both nonzero, then $\deg(ab) = \deg(a) + \deg(b) > -\infty = \deg(0)$.
Given $a \in \mathbb{F}[x]$ and any nonzero $b \in \mathbb{F}[x]$, let $S = \{a - bq \mid q \in \mathbb{F}[x]\}$. If $0 \in S$, then the proof is done, since there is a $q$ such that $a = bq$. If $0 \notin S$, let $D = \{\deg(a - bq) \mid (a - bq) \in S\} \subset \mathbb{N}$ be the set of degrees of elements of $S$. By the well-ordering axiom of natural numbers (Axiom A.3.3), $D$ has a least element $d$. Let $r$ be some element of $S$ with $\deg(r) = d$, so $r = a - bq$ for some $q \in \mathbb{F}[x]$.
We claim $d = \deg(r) < \deg(b)$. If not, then $b = b_0 + b_1x + \cdots + b_nx^n$ and $r = r_0 + r_1x + \cdots + r_mx^m$ with $m \ge n$. But now let $r' = r - \frac{r_m}{b_n}x^{m-n}b \in S$, so that the degree-$m$ term of $r'$ cancels. We have $\deg(r') < \deg(r)$, a contradiction. □
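The division step in this proof is constructive: repeatedly cancel the leading term of the remainder against the leading term of $b$. The sketch below (Python, using exact rational arithmetic via the standard fractions module; the representation of polynomials as coefficient lists is our own convention) carries out the division property in $\mathbb{Q}[x]$.

```python
from fractions import Fraction

def poly_divmod(a, b):
    """Divide a by b in Q[x], returning (q, r) with a = b*q + r and deg r < deg b.

    Polynomials are lists of coefficients, lowest degree first, e.g.
    [1, 2, 0, 1] represents 1 + 2x + x^3.
    """
    a = [Fraction(c) for c in a]
    b = [Fraction(c) for c in b]
    q = [Fraction(0)] * max(len(a) - len(b) + 1, 1)
    r = a[:]
    while len(r) >= len(b) and any(r):
        shift = len(r) - len(b)
        coef = r[-1] / b[-1]             # cancel the leading term of r
        q[shift] = coef
        for i, bc in enumerate(b):
            r[i + shift] -= coef * bc
        while r and r[-1] == 0:          # strip trailing zero coefficients
            r.pop()
    return q, r

# Divide x^3 + 2x + 1 by x^2 + 1:  quotient x, remainder x + 1.
q, r = poly_divmod([1, 2, 0, 1], [1, 0, 1])
```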

The degree function also gives a good way to characterize the invertible elements of $\mathbb{F}[x]$.

Proposition 15.2.5. An element $f \in \mathbb{F}[x]$ is invertible if and only if $\deg(f) = 0$.

Proof. For any invertible $f \in \mathbb{F}[x]$, we have $0 = \deg(1) = \deg(ff^{-1}) = \deg(f) + \deg(f^{-1})$, which implies that $\deg(f) = 0$. Conversely, if $\deg(f) = 0$, then $f = a_0 \in \mathbb{F}$ and $f \ne 0$, so $f^{-1} = a_0^{-1} \in \mathbb{F} \subset \mathbb{F}[x]$. □

15.2.2 The Euclidean Algorithm in a Euclid ean Domain


In this section we extend the familiar idea of divisibility from the integers to Eu-
clidean domains. We also show that the very powerful Euclidean algorithm for
finding the greatest common divisor works in any Euclidean domain.
We first need a quick discussion of divisibility properties in a Euclidean
domain.

Definition 15.2.6. Let $R$ be a Euclidean domain and let $a, b \in R$. If $b = ac$ for some $c \in R$, we say $a$ divides $b$ and write $a \mid b$.

Proposition 15.2.7. Any ideal $I$ in a Euclidean domain $R$ can be generated by a single element; that is, there exists some element $d \in I$ such that $(d) = I$. Moreover, $v(d)$ is least among all nonnegative integers of the form $v(i)$ for $i \in I \smallsetminus \{0\}$.

Proof. Let $S = \{n \in \mathbb{Z} \mid \exists\, i \in I \smallsetminus \{0\},\ n = v(i)\}$ be the image of the valuation map $v : I \smallsetminus \{0\} \to \mathbb{N}$. By the well-ordering axiom of the integers, the set $S$ must have a least element $u$, and there must be some element $d \in I \smallsetminus \{0\}$ such that $v(d) = u$.
Let $(d) \subset R$ be the ideal generated by $d$. Since $d \in I$, we have $(d) \subset I$. Given any $i \in I$, apply the division property (Definition 15.2.1(iv)(a)) to get $i = qd + r$, where either $r = 0$ or $v(r) < v(d) = u$. Since $d, i \in I$, we must have $r \in I$, but $v(r) < u$ contradicts the minimality of $u$ unless $r = 0$. Therefore, $i = qd$ and $i \in (d)$. This proves that $I = (d)$. □

Proposition 15.2.8. If $R$ is a Euclidean domain and $a, b \in R$, there exists an element $d \in R$ that satisfies the following properties:
(i) The element $d$ can be written as $ax + by$ for some $x, y \in R$, and
$$v(d) = \min\{v(ax' + by') : x', y' \in R,\ ax' + by' \ne 0\}.$$
(ii) The element $d$ divides both $a$ and $b$, and any element $d'$ with $d' \mid a$ and $d' \mid b$ must satisfy $d' \mid d$.
Moreover, any element that satisfies one of these properties must satisfy the other property, and if elements $d, e$ satisfy these properties, then $d = ue$, where $u$ is an invertible element of $R$.

Proof. By the previous proposition, the ideal $(a, b)$ is generated by a single element $d$, and because $d \in (a, b)$, we have $d = ax + by$. In fact, an element $c$ can be written as $c = ax' + by'$ for some $x', y' \in R$ if and only if $c \in (a, b)$. Moreover, any $c \in (a, b)$ must be divisible by $d$; hence, by the multiplicative property of valuations (Definition 15.2.1(iv)(b)), we have $v(c) \ge v(d)$; so $v(d)$ is least in the set of all nonnegative integers of the form $v(ax' + by')$.

Since $(d) = (a, b)$ we have $d \mid a$ and $d \mid b$; conversely, given any $d'$ with $d' \mid a$ and $d' \mid b$, we immediately have $d' \mid (ax + by) = d$. Now given any element $d \in R$ of the form $ax + by$ such that $v(d)$ is least among the nonnegative integers of the form $v(ax' + by')$, the previous proposition shows that $(d) = (a, b)$; hence the second property must hold.
Conversely, if $d$ is an element such that the second property holds, then by Proposition 15.2.7 we have $(a, b) = (e)$ for some $e \in (a, b)$, and the second property must also hold for $e$. Thus we have $d \mid e$ and $e \mid d$, so $e = ud$ and $d = u'e$. Therefore $e = u(u'e) = (uu')e$, so $e(1 - uu') = 0$. Since $e$ is not zero and since $R$ has no zero divisors, we must have $1 - uu' = 0$, or $1 = uu'$. So $u$ is an invertible element of $R$. Moreover, $(d) = (e) = (a, b)$, so the first property must also hold. □

Remark 15.2.9. The element d in the previous proposition is not necessarily


unique. In fact, given any invertible element a E R the element ad also satis-
fies all the conditions of the proposition. When R = Z we usually rescale to make
the element d be positive, and we denote the resulting, rescaled element gcd(a, b).
Similarly, when $R = \mathbb{F}[x]$, we usually rescale $d$ so that it is monic; we also call this
rescaled element gcd(a, b).
But for a general Euclidean domain, there is no canonical way to identify a
unique element to call the gcd. Instead, we say that any element d is a gcd of a
and b if it satisfies the conditions of the previous proposition.

Definition 15.2.10. In a Euclidean domain, for any tuple of elements a 1 , . . . , an,


we say that d E R is a gcd of a 1 , . . . , an if the ideal ( d) equals the ideal (a 1 , . . . , an)
of R. Elements a1, .. . , an are relatively prime if (a1, ... , an) = R. Since R = (1) =
(u) for any invertible element u ER, we often write gcd(a 1, ... , an) = 1 to indicate
that a1, . . . , an are relatively prime.

Remark 15.2.11. The previous discussion shows that two elements a and b in a
Euclidean domain R are relatively prime if and only if the identity element 1 can
be written 1 = as + bt for some s, t E R .

Theorem 15.2.12 (Euclidean Algorithm). Let $R$ be a Euclidean domain, and let $a, b \in R$. Define $q_0$ and $r_0$ as in the division property (Definition 15.2.1(iv)(a)):
$$a = bq_0 + r_0.$$
Apply the division property to $b$ and $r_0$ to get $q_1, r_1$:
$$b = r_0q_1 + r_1.$$
Repeating the process, eventually the remainder will be zero:
$$\begin{aligned}
a &= bq_0 + r_0,\\
b &= r_0q_1 + r_1,\\
r_0 &= r_1q_2 + r_2,\\
r_1 &= r_2q_3 + r_3,\\
&\;\;\vdots\\
r_{n-2} &= r_{n-1}q_n + r_n,\\
r_{n-1} &= r_nq_{n+1} + 0.
\end{aligned}$$
The result $r_n$ is a greatest common divisor of $a$ and $b$.

Proof. The algorithm terminates in no more than $v(b) + 1$ steps, because at each stage we have that $0 \le v(r_k) < v(r_{k-1})$, so we have a sequence $v(b) > v(r_0) > v(r_1) > \cdots \ge 0$ that decreases at each step until it reaches $0$.
Since $r_n$ divides $r_{n-1}$, and $r_{n-2} = r_{n-1}q_n + r_n$, we have that $r_n$ divides $r_{n-2}$. Repeating the argument for the previous stages shows that $r_n$ divides $r_{n-3}$ and each $r_k$ for $k = (n-3), (n-4), \ldots, 1, 0$. Hence $r_n$ divides both $b$ and $a$. Conversely, given any $c$ that divides $a$ and $b$, the first equation $a = bq_0 + r_0$ shows that $c \mid r_0$. Repeating for each step gives $c \mid r_k$ for all $k$. Hence $c \mid r_n$. This implies that $r_n = \gcd(a, b)$. □
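For the integers, the algorithm is just the familiar repeated division with remainder. A minimal sketch in Python (the function name is ours):

```python
def euclidean_gcd(a, b):
    """Greatest common divisor in Z via the Euclidean algorithm."""
    while b != 0:
        a, b = b, a % b    # replace (a, b) by (b, r), where a = bq + r
    return abs(a)

print(euclidean_gcd(462, 1071))   # 21
```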

Example 15.2.13. Let $a = x^4 - 1$ and $b = 2x^3 - 12x^2 + 22x - 12$ in the ring $\mathbb{C}[x]$. We apply the Euclidean algorithm to $a$ and $b$ to compute the greatest common divisor:
$$\underbrace{x^4 - 1}_{a} = \left(\frac{x}{2} + 3\right)\underbrace{(2x^3 - 12x^2 + 22x - 12)}_{b} + \underbrace{(25x^2 - 60x + 35)}_{r_0},$$
$$\underbrace{(2x^3 - 12x^2 + 22x - 12)}_{b} = \left(\frac{2}{25}x - \frac{36}{125}\right)\underbrace{(25x^2 - 60x + 35)}_{r_0} + \underbrace{\left(\frac{48}{25}x - \frac{48}{25}\right)}_{r_1},$$
$$\underbrace{(25x^2 - 60x + 35)}_{r_0} = \left(\frac{625}{48}x - \frac{875}{48}\right)\underbrace{\left(\frac{48}{25}x - \frac{48}{25}\right)}_{r_1} + \underbrace{0}_{r_2}.$$
Thus $r_1 = \left(\frac{48}{25}x - \frac{48}{25}\right)$ is the desired common divisor of $a$ and $b$, and $\gcd(a, b) = (x - 1)$.

Remark 15.2.14 (Extended Euclidean Algorithm (EEA)). For any $a, b \in R$, Proposition 15.2.8 guarantees the element $\gcd(a, b)$ can be written as $as + bt$ for some $s, t \in R$. Knowing the actual values of $s$ and $t$ is useful in many applications.
In the Euclidean algorithm, we have $r_n = r_{n-2} - r_{n-1}q_n$, and $r_{n-1} = r_{n-3} - r_{n-2}q_{n-1}$, and so forth, up to $r_0 = a - bq_0$. Back substituting gives an explicit expression for $\gcd(a, b) = r_n$ as $as + bt$. This is called the extended Euclidean algorithm (EEA).

Example 15.2.15. Applying the EEA to the results of Example 15.2.13 gives $r_1 = b - q_1r_0 = b - q_1(a - q_0b) = (1 + q_0q_1)b - q_1a$, so $s = -q_1$ and $t = 1 + q_0q_1$ gives $r_1 = as + bt$.
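The back substitution can also be organized iteratively, updating the coefficients $s$ and $t$ alongside the remainders. A sketch in Python (written for integers; the same pattern works in any Euclidean domain once division with remainder is available):

```python
def extended_gcd(a, b):
    """Return (d, s, t) with d = gcd(a, b) and d = a*s + b*t."""
    s0, t0, s1, t1 = 1, 0, 0, 1   # invariants: a = a0*s0 + b0*t0, b = a0*s1 + b0*t1
    while b != 0:
        q, r = divmod(a, b)
        a, b = b, r
        s0, s1 = s1, s0 - q * s1
        t0, t1 = t1, t0 - q * t1
    return a, s0, t0

d, s, t = extended_gcd(462, 1071)
assert d == 462 * s + 1071 * t == 21
```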

15.3 The Fundamental Theorem of Arithmetic


The fundamental theorem of arithmetic states that every integer is a product of
prime integers, and that decomposition is unique (except for choices of signs on the
primes). The analogous result for Euclidean domains is a powerful tool and plays
a role in applications of the Chinese remainder theorem and in building partial
fraction decompositions.
We begin with the observation that even though a Euclidean domain does not
have a multiplicative inverse for every element , we can still cancel nonzero factors .

Proposition 15.3.1 (Cancellation). In a Euclidean domain R, if a, b, c ∈ R and
a ≠ 0, then ab = ac implies b = c.

Proof. If ab = ac, then ab − ac = 0 and so a(b − c) = 0. Since a ≠ 0 and since R
has no zero divisors, we must have b − c = 0, and hence b = c. □

Definition 15.3.2. An element a ∈ R is called prime if it is not zero and not
invertible, and if whenever a | bc for some b, c ∈ R, then a | b or a | c. An element
a ∈ R is called irreducible if it is not zero and not invertible, and whenever a = bc,
then either b or c is invertible in R.

Remark 15.3.3. The traditional definition of a prime integer is our definition of


irreducible, since the only invertible elements in Z are 1 and - 1. In a general ring,
prime and irreducible are not the same thing, but we show below that they are
equivalent in a Euclidean domain.

Proposition 15.3.4. If two elements a, b in a Euclidean domain R are relatively
prime and c ∈ R, then a | bc implies a | c.

Proof. Since a | bc, we have bc = ad for some d ∈ R. Since a, b are relatively prime,
we have gcd(a, b) = 1, so 1 = ax + by for some x, y ∈ R. Thus c = axc + byc =
axc + yad = a(xc + yd), which implies that a | c. □

Corollary 15.3.5. Let R be a Euclidean domain, and let m_1, m_2, ..., m_n ∈ R be
pairwise relatively prime; that is, for every i ≠ j we have gcd(m_i, m_j) = 1. If x ∈ R
is such that m_i divides x for every i ∈ {1, ..., n}, then ∏_{i=1}^n m_i divides x.

Proof. The proof is by induction on n. If n = 1, the result is immediate. If n > 1,
then m_n | x implies that x = m_n y for some y ∈ R, and m_{n-1} | x implies m_{n-1} | m_n y.
By Proposition 15.3.4, this means that m_{n-1} | y, so y = m_{n-1} z for some z ∈ R, and
we have that x = m_n m_{n-1} z is divisible by m'_{n-1} = m_n m_{n-1}. Now make a new list
m_1, ..., m_{n-2}, m'_{n-1}. This list has length n − 1, the elements in it are pairwise
relatively prime, and each divides x. By the induction hypothesis x is divisible by
m_1 m_2 ··· m_{n-2} m'_{n-1} = ∏_{i=1}^n m_i. □

Corollary 15.3.6. If I_1, I_2, ..., I_n are ideals in a Euclidean domain such that
I_j = (m_j) and the generators m_j are pairwise relatively prime, then the ideal
(m_1 m_2 ··· m_n) generated by the product m_1 m_2 ··· m_n satisfies

(m_1 m_2 ··· m_n) = ⋂_{i=1}^n I_i = ∏_{i=1}^n I_i.

Proof. Clearly m_1 m_2 ··· m_n ∈ ∏_{i=1}^n I_i ⊂ ⋂_{i=1}^n I_i, and hence (m_1 m_2 ··· m_n) ⊂
∏_{i=1}^n I_i ⊂ ⋂_{i=1}^n I_i. Conversely, for any x ∈ ⋂_{i=1}^n I_i we have x ∈ I_i for every
i ∈ {1, ..., n}, and hence m_i | x for every i. By Corollary 15.3.5, x is divisible
by m_1 m_2 ··· m_n, and hence x ∈ (m_1 m_2 ··· m_n). □

Theorem 15.3.7. An element in a Euclidean domain is prime if and only if it is
irreducible.

Proof. If a ∈ R is prime, and if a = bc, then a divides bc, so a | b or a | c. Without
loss of generality, assume a | b. We have b = ax for some x ∈ R, and a · 1 = bc = axc.
The cancellation property (Proposition 15.3.1) gives 1 = xc, which implies that
both x and c are invertible.
Conversely, assume an irreducible element a divides bc for some b, c ∈ R. If
gcd(a, b) = 1, then Proposition 15.3.4 gives a | c, as required.
If gcd(a, b) = d is not invertible, then a = dx for some x ∈ R. By irreducibility
of a, the element x is invertible in R. Thus, the element a x^{-1} = d divides b, and
hence a divides b. □

Proposition 15.3.8. Any element f of IF[x] that is of degree 1 must be irreducible,


and hence prime.

Proof. The proof is Exercise 15.22. D

Remark 15.3.9. Not every prime element in 𝔽[x] has degree 1. In particular, the
element x^2 + 1 ∈ ℝ[x] is irreducible in ℝ[x] but has degree 2.

Before proving the main result of this section, we need one more result about
the multiplicative property of the valuation in a Euclidean domain.

Proposition 15.3.10. For every s, t E R , we have v(s) = v(st) if and only if t is


invertible.

Proof. If t is invertible, then v(s) ≤ v(st) ≤ v(st t^{-1}) = v(s), so v(s) = v(st).
Conversely, if v(s) = v(st), then s = (st)q + r for some r ∈ R with v(r) < v(st).
But r = s − (st)q = s(1 − tq); so unless 1 − tq = 0 we have v(s) ≤ v(s(1 − tq)) = v(r) < v(st) = v(s),
a contradiction. Hence 1 − tq = 0, and t is invertible. □

Theorem 15.3.11 (Fundamental Theorem of Arithmetic for Euclidean


Domains). Any nonzero element a in a Euclidean domain R can be written as a
product of primes times some invertible element. Moreover, if a = α p_1 p_2 ··· p_n and
a = β q_1 q_2 ··· q_m are two such decompositions, with α, β invertible and all p_i and q_i
prime, then m = n. Reordering if necessary gives p_i = u_i q_i, where each u_i is an
invertible element in R.

Proof. Let S be the set of all nonzero, noninvertible elements of R that cannot be
written as an invertible element times a product of primes. The set V = {v(s) | s ∈ S}
is a subset of ℕ and hence has a smallest element v_0. Let s ∈ S be an element
with v(s) = v_0. Since s is not a product of primes, it is not prime, and hence
not irreducible. Therefore, there exist a, b ∈ R such that ab = s and a and b are
not invertible. Proposition 15.3.10 implies v(s) > v(a) and v(s) > v(b), and hence
a and b can be written in the desired form. But s = ab implies that s can also be
written in the desired form. Therefore, V and S must both be empty.
To prove uniqueness, assume that a = α p_1 p_2 ··· p_n and a = β q_1 q_2 ··· q_m are
two decompositions that do not satisfy the conclusion of the theorem; that is, either
n ≠ m, or n = m but there is no rearrangement of the q_j such that every q_i = u_i p_i
for invertible u_i. Assume further that n is the smallest integer for which such a
counterexample exists.
If n = 0, then a = α = β q_1 ··· q_m, so a is invertible. Thus, q_i | a implies that q_i
is also invertible for every i, and hence q_i is not a prime. So we may assume n > 0.
Since p_n is prime, it must divide q_i for some i ∈ {1, ..., m}. Rearrange the q_j
so that p_n divides q_m. Thus p_n u_n = q_m for some u_n. But since q_m is prime, it is
irreducible. Since p_n is not invertible, u_n must be invertible.
Now divide p_n out of both sides of the equation α p_1 p_2 ··· p_n = β q_1 q_2 ··· q_m to
get α p_1 p_2 ··· p_{n-1} = β q_1 q_2 ··· q_{m-1} u_n. Redefining q_{m-1} to be q_{m-1} u_n gives two new
decompositions into primes α p_1 p_2 ··· p_{n-1} = β q_1 q_2 ··· q_{m-1}, and the minimality
assumption on the original counterexample gives m − 1 = n − 1 and (after reordering)
p_i = u_i q_i, where each u_i is an invertible element in R. But this proves that the
supposed counterexample also satisfies the conclusion of the theorem. □

Example 15.3.12.

(i) The integer 1728 can be written as 3^3 · 2^6 or (−2)^3 · 2^3 · (−3)^3 and in many
other ways, but the fundamental theorem of arithmetic says that, after
rearranging, every prime factorization of 1728 must be of the form

(±3) · (±3) · (±3) · (±2) ··· (±2),

with six factors of ±2.

(ii) The polynomial

p = x^9 − 20x^8 + 129x^7 − 257x^6 + 59x^5 − 797x^4 − 265x^3 − 903x^2 − 196x − 343

can be factored in ℝ[x] as

(x^2 + 1)^2 (x − 7)^3 (x^2 + x + 1),

and it can be shown that in ℝ[x] the polynomials x^2 + 1 and x^2 + x + 1
are both prime, as is every linear polynomial. Therefore, this is a prime
factorization of p. The fundamental theorem of arithmetic shows that
any other factorization of p must (after rearranging) be of the form

p = a(x^2 + 1) · b(x^2 + 1) · c(x − 7) · d(x − 7) · e(x − 7) · f(x^2 + x + 1),

where a, b, c, d, e, f are invertible and, hence, are in ℝ.

Unexample 15.3.13. Exercise 15.23 shows that the fundamental theorem of
arithmetic does not hold in the ring

ℤ[√−5] = {a + b√−5 | a, b ∈ ℤ} ⊂ ℂ,

because there are two distinct prime factorizations of 9. Therefore ℤ[√−5] is
not a Euclidean domain.

Remark 15.3.14. This theorem says that the decomposition of an element into a
product of primes is almost unique. In both cases R = 𝔽[x] and R = ℤ we can make the uniqueness
more explicit.
If R = ℤ, then the only invertible elements are 1 and −1; if a > 0 we may
require all primes to be positive. We then have u_i = 1 for all i, and the decomposition
is completely unique (up to reordering).
If R = 𝔽[x], then the invertible elements are precisely the nonzero elements of 𝔽 (corresponding
to degree-zero polynomials). If both a and all the primes have their
leading (top-degree) coefficient equal to 1, then we can again assume that all the u_i
are 1, and the decomposition is completely unique (up to reordering).

Theorem 15.3.15 (Fundamental Theorem of Algebra, alternative form).
All primes in ℂ[x] have degree 1, and hence every polynomial f in ℂ[x] of degree n
can be factored uniquely (up to rearrangement of the factors) as

f(x) = c ∏_{i=1}^n (x − λ_i),

where c, λ_1, ..., λ_n ∈ ℂ.



Proof. Assume, by way of contradiction, that p(x) is prime in ℂ[x] and has degree
n > 1. By the fundamental theorem of algebra (Theorem 11.5.4), p(x) has at least
one root, which we denote λ. Dividing p(x) by (x − λ) gives

p(x) = (x − λ)q(x) + r(x),

where q(x) has degree n − 1, and where r(x) has degree less than deg(x − λ) = 1; hence
r is constant. Moreover, 0 = p(λ) = 0 + r, and hence r = 0. Therefore, (x − λ)
divides p(x), and p is not prime. □

15.4 Homomorphisms
A linear transformation is the right sort of map for vector spaces because it preserves
all the key properties of a vector space- vector addition and scalar multiplication.
Similarly, a ring homomorphism is the right sort of map for rings because it preserves
the key properties of a ring-addition and multiplication.
Just as with vector spaces, kernels and ranges (images) are the key to understanding
ring homomorphisms, and invertible homomorphisms (isomorphisms)
allow us to identify which rings are "the same."

15.4.1 Homomorphisms

Definition 15.4.1. A map f : R--+ S between rings is a homomorphism from R


into S if
f(xy) = f(x)f(y) and f(x + y) = f(x) + f(y) (15.2)
for all x,y ER.

The next proposition is immediate, and in fact it may even seem more difficult
than checking the definition directly, but it makes many proofs much cleaner; it also
makes the similarity to linear transformations clearer.

Proposition 15.4.2. To check that a map f : R --+ S of rings is a homomorphism,


it suffices to show that for every a, b, x, y E R we have f(ax +by) = f(a)f(x) +
f(b)f(y) .

Example 15.4.3.

(i) For any n ∈ ℤ the map ℤ → ℤ_n given by x ↦ [[x]]_n is a ring homomorphism.

(ii) For any interval [a, b] ⊂ ℝ, the map 𝔽[x] → C([a, b]; 𝔽) given by sending
a polynomial f(x) ∈ 𝔽[x] to the function on [a, b] defined by f is a
homomorphism of rings.

(iii) For any p ∈ 𝔽, the evaluation map e_p : 𝔽[x] → 𝔽 defined by f(x) ↦ f(p)
is a homomorphism of rings.

(iv) For any A ∈ M_n(𝔽), the evaluation map e_A : 𝔽[x] → 𝔽[A] ⊂ M_n(𝔽)
defined by f(x) ↦ f(A) is a homomorphism of rings.

Unexample 15.4.4.

(i) For any n ∈ ℤ⁺, the map C^n((a, b); 𝔽) → C^{n−1}((a, b); 𝔽) defined by
f(x) ↦ df/dx is not a homomorphism. It preserves addition,

d(f + g)/dx = df/dx + dg/dx,

but it does not preserve multiplication: by the product rule d(fg)/dx =
f (dg/dx) + g (df/dx), which is generally not equal to (df/dx)(dg/dx).

(ii) The map φ : 𝔽 → 𝔽 given by φ(x) = x^2 is not a homomorphism because
φ(x + y) is not always (in fact, almost never) equal to x^2 + y^2 = φ(x) +
φ(y).

Proposition 15.4.5. A homomorphism of rings f : R ---+ S maps the additive


identity 0 E R to the additive identity 0 E S.

Proof. The proof is identical to its linear counterpart, Proposition 2.1.5(ii). □

Proposition 15.4.6. If f : R ---+ S and g : S ---+ U are homomorphisms, then the


composition g o f : R ---+ U is also a homomorphism.

Proof. Given a, b, x, y ∈ R we have

(g ∘ f)(ax + by) = g(f(a)f(x) + f(b)f(y))
                 = g(f(a))g(f(x)) + g(f(b))g(f(y))
                 = (g ∘ f)(a)(g ∘ f)(x) + (g ∘ f)(b)(g ∘ f)(y). □

Remark 15.4. 7. Unlike its vector space analogue, a homomorphism does not nec-
essarily map ideals into ideals. But if the homomorphism is surjective, then it does
preserve ideals.

15.4.2 The Kernel and Image

Definition 15.4.8. Let R and S be rings. The kernel of a homomorphism f : R →
S is the set 𝒩(f) = {x ∈ R | f(x) = 0}. The image (or range) of f is the set
Im f = ℛ(f) = {f(x) ∈ S | x ∈ R}.

Proposition 15.4.9. For any homomorphism f : R → S,

(i) the set 𝒩(f) is an ideal of R;

(ii) the set Im f is a subring of S (but not necessarily an ideal).

Proof. Note that 𝒩(f) and Im f are both nonempty since f(0) = 0.

(i) Let x_1, x_2 ∈ 𝒩(f) and a, b ∈ R. By definition, f(x_1) = f(x_2) = 0; hence
f(a x_1 + b x_2) = f(a)f(x_1) + f(b)f(x_2) = f(a)·0 + f(b)·0 = 0. Therefore a x_1 +
b x_2 ∈ 𝒩(f).

(ii) For any s, t, y_1, y_2 ∈ Im(f) there exist a, b, x_1, x_2 ∈ R such that f(a) = s,
f(b) = t, and f(x_i) = y_i for i ∈ {1, 2}. Thus we have s y_1 + t y_2 = f(a)f(x_1) +
f(b)f(x_2) = f(a x_1 + b x_2) ∈ Im f. □

Example 15.4.10. If f : ℤ → ℤ_2 is given by x ↦ [[x]]_2, then 𝒩(f) = 2ℤ = 𝔼
is the set of all even integers.

Example 15.4.11. For any A ∈ M_n(𝔽), let e_A : 𝔽[x] → 𝔽[A] ⊂ M_n(𝔽) be
the evaluation homomorphism defined by f(x) ↦ f(A). The kernel 𝒩(e_A)
consists of precisely those polynomials p(x) such that p(A) = 0. In particular,
𝒩(e_A) contains the characteristic polynomial and the minimal polynomial.
Since every ideal in 𝔽[x] is generated by any element of least degree (see
Proposition 15.2.7), and since the minimal polynomial is defined to have the
least degree of any element in 𝔽[x] that annihilates A, the ideal 𝒩(e_A) must
be generated by the minimal polynomial.

Example 15.4.12. Let p ∈ 𝔽 be any point, and let e_p : 𝔽[x] → 𝔽 be
the evaluation homomorphism, given by e_p(f) = f(p). If f is a constant
polynomial f = c, then e_p(f) = c. This shows that e_p is surjective and
Im(e_p) = 𝔽. On the other hand, we have that e_p(x − p) = 0, so x − p ∈ 𝒩(e_p).

Since every ideal in 𝔽[x] is generated by a single element, we must have
𝒩(e_p) = (g) for some polynomial g ∈ 𝔽[x]. In particular, we have g | (x − p),
but since x − p is of degree 1, it is prime, and hence g = (x − p)u for some
invertible element u. Thus (x − p) = (g) = 𝒩(e_p).

For any p ≠ q ∈ 𝔽 the generator x − p of 𝒩(e_p) and the generator x − q
of 𝒩(e_q) are both of degree 1, hence prime, by Proposition 15.3.8. Thus, by the
fundamental theorem of arithmetic, their powers are relatively prime. This gives
the following proposition.

Proposition 15.4.13. If m, n are nonnegative integers and p ≠ q ∈ 𝔽 are arbitrary
points of 𝔽, then the polynomials (x − p)^m and (x − q)^n are relatively prime.

15.4.3 Isomorphisms: Invertible Homomorphisms

Definition 15.4.14. A homomorphism of rings f : R -+ S is called an isomor-


phism if it has an inverse that is also a homomorphism. An isomorphism from R
to R is called an automorphism.

As in the case of linear transformations of vector spaces, if a ring homomor-


phism has an inverse, then the inverse must also be a homomorphism.

Proposition 15.4.15. If a homomorphism is bijective, then the inverse function


is also a homomorphism.

Proof. Assume that f : R → S is a bijective homomorphism with inverse function
f^{-1}. Given y_1, y_2, s, t ∈ S there exist x_1, x_2, a, b ∈ R such that f(x_i) = y_i, f(a) = s,
and f(b) = t. Thus we have

f^{-1}(s y_1 + t y_2) = f^{-1}(f(a)f(x_1) + f(b)f(x_2))
                     = f^{-1}(f(a x_1 + b x_2))
                     = a x_1 + b x_2
                     = f^{-1}(s) f^{-1}(y_1) + f^{-1}(t) f^{-1}(y_2).

Therefore f^{-1} is a homomorphism. □

Corollary 15.4.16. A homomorphism is an isomorphism if and only if it is


bijective.

Example 15.4.17.

(i) Let R = {True, False} be the Boolean ring of Example 15.1.3(vii), and let
S be the ring ℤ_2. The map False ↦ 0 and True ↦ 1 is an isomorphism.

(ii) Let S ⊂ M_2(ℝ) be the set of all 2 × 2 real matrices of the form

[ a  −b ]
[ b   a ]     with a, b ∈ ℝ.

It is not hard to see that S with matrix addition and multiplication is a
ring. In fact, the map φ : S → ℂ given by

[ a  −b ]
[ b   a ]  ↦  a + bi

is an isomorphism of rings. The proof is Exercise 15.29.

Definition 15.4.18. If there exists an isomorphism L : R → S, we say that R is
isomorphic to S, and we denote this by R ≅ S.

Theorem 15.4.19. The relation ~ is an equivalence relation on the collection of


all rings.

Proof. The proof is identical to its counterpart for invertible linear transformations
(Theorem 2.2.12). D

15.4.4 Cartesian Products

Proposition 15.4.20. Let {R_1, R_2, ..., R_n} be a collection of rings. The Cartesian
product

R = ∏_{i=1}^n R_i = R_1 × R_2 × ··· × R_n = {(a_1, a_2, ..., a_n) | a_i ∈ R_i}

forms a ring with additive identity (0, 0, ..., 0) and with componentwise addition
and multiplication. That is, addition and multiplication are given by

(i) (a_1, a_2, ..., a_n) + (b_1, b_2, ..., b_n) = (a_1 + b_1, a_2 + b_2, ..., a_n + b_n),

(ii) (a_1, a_2, ..., a_n) · (b_1, b_2, ..., b_n) = (a_1 · b_1, a_2 · b_2, ..., a_n · b_n)

for all (a_1, a_2, ..., a_n), (b_1, b_2, ..., b_n) ∈ R.

Proof. The proof is Exercise 15.30. D

Example 15.4.21. The ring ℤ_3 × ℤ_2 consists of 6 elements:

([[0]]_3, [[0]]_2), ([[1]]_3, [[0]]_2), ([[2]]_3, [[0]]_2),
([[0]]_3, [[1]]_2), ([[1]]_3, [[1]]_2), ([[2]]_3, [[1]]_2).

Adding the element ([[1]]_3, [[1]]_2) to itself repeatedly gives

([[1]]_3, [[1]]_2) + ([[1]]_3, [[1]]_2) = ([[2]]_3, [[0]]_2),
([[2]]_3, [[0]]_2) + ([[1]]_3, [[1]]_2) = ([[0]]_3, [[1]]_2),
([[0]]_3, [[1]]_2) + ([[1]]_3, [[1]]_2) = ([[1]]_3, [[0]]_2),
([[1]]_3, [[0]]_2) + ([[1]]_3, [[1]]_2) = ([[2]]_3, [[1]]_2),
([[2]]_3, [[1]]_2) + ([[1]]_3, [[1]]_2) = ([[0]]_3, [[0]]_2).

But multiplying ([[1]]_3, [[1]]_2) times any (x, y) ∈ ℤ_3 × ℤ_2 gives (x, y) again; that
is, ([[1]]_3, [[1]]_2) is the multiplicative identity element for this ring.
There are two obvious homomorphisms from the ring ℤ_3 × ℤ_2, namely,
the first projection ℤ_3 × ℤ_2 → ℤ_3, defined by ([[a]], [[b]]) ↦ [[a]], and the second
projection ℤ_3 × ℤ_2 → ℤ_2, given by ([[a]], [[b]]) ↦ [[b]].

Proposition 15.4.22. Given a collection of rings R_1, ..., R_n, and given any i ∈
{1, ..., n}, the canonical projection p_i : ∏_{j=1}^n R_j → R_i, given by (x_1, ..., x_n) ↦ x_i,
is a homomorphism of rings.

Proof. We check p_i((x_1, ..., x_n) + (y_1, ..., y_n)) = p_i((x_1 + y_1, ..., x_n + y_n)) = x_i +
y_i = p_i(x_1, ..., x_n) + p_i(y_1, ..., y_n). The check for multiplication is similar. □

Remark 15.4.23. It is straightforward to check that all the results proved so far
about finite Cartesian products also hold for infinite Cartesian products.

15.4.5 * Universal Mapping Property

The next proposition says that giving a homomorphism T → ∏_{i=1}^n R_i is exactly
equivalent to giving a collection of homomorphisms T → R_i for every i.

Proposition 15.4.24. Let {R_i}_{i=1}^n be a finite collection of rings, and let T be a
ring. Any homomorphism F : T → ∏_{i=1}^n R_i determines a collection of homomorphisms
f_i : T → R_i for each i. Conversely, given a collection {f_i : T → R_i}_{i=1}^n of
homomorphisms, there is a uniquely determined homomorphism F : T → ∏_{i=1}^n R_i
such that p_i ∘ F = f_i for every i; that is, the corresponding triangular diagram commutes.^54

^54 See Proposition A.2.15 for a set-theoretic version of this proposition.



Proof. If F : T → ∏_{i=1}^n R_i is a homomorphism, then each f_i = p_i ∘ F is also a
homomorphism, because the composition of homomorphisms is a homomorphism.
Conversely, given functions f_i : T → R_i for every i, let F be given by
F(t) = (f_1(t), ..., f_n(t)). It is immediate that the composition p_i ∘ F satisfies
p_i ∘ F(t) = f_i(t) and that it is the unique function mapping T to ∏_{i=1}^n R_i that
satisfies this condition. It remains to check that F is a homomorphism.
We check the addition property of a homomorphism; the computation for
multiplication is similar:

F(s + t) = (f_1(s + t), ..., f_n(s + t)) = (f_1(s) + f_1(t), ..., f_n(s) + f_n(t)) = F(s) + F(t). □

15.5 Quotients and the First Isomorphism Theorem


Just as one may construct a quotient of a vector space by a subspace, one may also
construct a quotient of a ring by an ideal. Many of the results about isomorphisms
of vector spaces and quotients carry over almost exactly to rings. Throughout this
section we omit the ring-theoretic proofs of any results that are similar to their
vector-space counterparts and only include those that differ in some meaningful
way from the vector-space version.

Definition 15.5.1. Let I be an ideal of R. We say that x is equivalent to y modulo
I if x − y ∈ I; this is denoted x ≡ y (mod I). Sometimes, instead of equivalent
modulo I, we say congruent modulo I.

Proposition 15.5.2. Let I be an ideal of R. The relation ≡ (mod I) is an equivalence
relation.

Definition 15.5.3. The equivalence classes [[y]] = {x ∈ R | x ≡ y (mod I)} partition
R, and each equivalence class [[y]] is a translate of I, that is,

[[y]] = y + I.

We call these equivalence classes cosets of I and write them as either y + I or [[y]]_I.
If the ideal I is clear from context, we often write [[y]] without the I.

Because they are equivalence classes, any two cosets are either identical or
disjoint. That is, if (x + I) ∩ (y + I) = [[x]] ∩ [[y]] ≠ ∅, then x + I = [[x]] = [[y]] = y + I.

Definition 15.5.4. Let I be an ideal of R. The set {[[x]] | x ∈ R} of all cosets of
I in R is denoted R/I and is called the quotient of the ring R by the ideal I.

Proposition 15.5.5. Let I = (b) be any ideal in a Euclidean domain R. The
quotient R/I has a bijection to the set S = {s ∈ R | v(s) < v(b)}.

Proof. Given any [[a]] ∈ R/I, the division property gives a = bq + r with v(r) < v(b),
so r ∈ S. Define a map φ : R/I → S by [[a]]_I ↦ r. First we show this map φ is
well defined. Given any other a' with [[a']] = [[a]], we have a' − a ∈ I = (b); so there
exists an x ∈ R with a' = a + xb = (q + x)b + r. This implies φ(a') = φ(a), and
hence φ is well defined.
Define another map ψ : S → R/I by s ↦ [[s]]_I. For any s ∈ S, we have
s = 0·b + s, so we get φ ∘ ψ(s) = s, and conversely, for any [[a]]_I ∈ R/I with a = bq + r,
we have ψ ∘ φ(a) = [[r]]_I = [[a]]_I, so φ and ψ are inverses, and hence bijections. □

Remark 15.5.6. The previous proposition means that we can write the set ℤ/(n)
as the set {[[0]], [[1]], [[2]], ..., [[n − 1]]}, and, similarly, any coset in 𝔽[x]/(f) can be
written uniquely as [[r]] for some r of degree less than deg(f).

The set of all cosets of I has a natural addition and multiplication that makes
it into a ring.

Lemma 15.5.7. Let I be an ideal of R. The operations ⊕ : R/I × R/I → R/I
and ⊙ : R/I × R/I → R/I given by

(i) (x + I) ⊕ (y + I) = (x + y) + I (or [[x]] ⊕ [[y]] = [[x + y]]) and

(ii) (x + I) ⊙ (y + I) = (xy) + I (or [[x]] ⊙ [[y]] = [[xy]])

are well defined for all x, y ∈ R.

Proposition 15.5.8. Let I be an ideal of R. The quotient R/I is a ring when
endowed with the operations ⊕ and ⊙.

Remark 15.5.9. It is common to use the symbol + instead of ⊕ and · instead of
⊙. This can quickly lead to confusion unless you pay careful attention to distinguish
the various meanings of these overloaded symbols.

Example 15.5.10.

(i) By Remark 15.5.6 the quotient ℤ/(n) is in bijective correspondence with
the elements of {0, 1, ..., n − 1}, and hence with the elements of ℤ_n. It is
straightforward to check that the addition and multiplication in ℤ/(n)
are the same as the addition and multiplication in ℤ_n.

(ii) By Remark 15.5.6 every element of the quotient ℝ[x]/(x^2 + 1) can be
written uniquely as [[f]] for some f of degree less than 2. That means
it must be of the form [[ax + b]] for some a, b ∈ ℝ. The operation ⊕
is straightforward, [[ax + b]] ⊕ [[cx + d]] = [[(a + c)x + (b + d)]], but the
operation ⊙ is more subtle. Observe first that [[x^2]] = [[−1]], which gives

[[ax + b]] ⊙ [[cx + d]] = [[acx^2 + (ad + bc)x + bd]] = [[(ad + bc)x + (bd − ac)]].

This is just like multiplication of complex numbers, which inspires a
map ζ : ℝ[x]/I → ℂ defined by [[ax + b]] ↦ ai + b. Using the previous
computation, it is straightforward to check that this is a homomorphism.
It is also clearly both surjective and injective, hence an isomorphism.
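The multiplication rule in (ii) is easy to check by computing in ℝ[x] and reducing modulo x^2 + 1. The following SymPy snippet (ours, not from the text) does exactly that for one pair of cosets and compares with the corresponding product of complex numbers.

```python
import sympy as sp

x = sp.symbols('x')
a, b, c, d = 2, 5, -3, 7           # coset representatives ax + b and cx + d

# Multiply in R[x] and reduce modulo x^2 + 1 to get the coset representative.
product = sp.expand((a*x + b) * (c*x + d))
_, rem = sp.div(product, x**2 + 1, x)
print(sp.expand(rem))              # -x + 41, i.e. (ad + bc)x + (bd - ac)

# Compare with the product of the corresponding complex numbers a*i + b and c*i + d.
z = (a*sp.I + b) * (c*sp.I + d)
print(sp.expand(z))                # 41 - I, matching the map [[ax + b]] -> ai + b
```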

Proposition 15.5.11. If a ring R is commutative, then for any ideal I ⊂ R the
quotient ring R/I is commutative.

Proof. If x + I and y + I are any elements of R/I, then (x + I) ⊙ (y + I) = xy + I =
yx + I = (y + I) ⊙ (x + I). □

Proposition 15.5.12. Let I be an ideal of the ring R with quotient R/I. The
mapping π : R → R/I defined by π(x) = x + I is a surjective homomorphism. We
call π the canonical epimorphism.

Lemma 15.5.13. A homomorphism f is injective if and only if 𝒩(f) = {0}.

Lemma 15.5.14. Let R and S be rings, and let f : R → S be a homomorphism.
Assume that I and J are ideals of R and S, respectively. If f(I) ⊂ J, then f induces
a homomorphism f̄ : R/I → S/J defined by f̄(x + I) = f(x) + J.

Theorem 15.5.15 (First Isomorphism Theorem). If R and S are rings and
f : R → S is a homomorphism, then R/𝒩(f) ≅ Im f, where Im f is the image of
f. In particular, if f is surjective, then R/𝒩(f) ≅ S.

Remark 15.5.16. Any homomorphism f : R → S gives a commutative diagram in
which f factors as

f = i ∘ f̄ ∘ π : R → R/𝒩(f) → Im f → S,

where π is the canonical epimorphism, the right-hand map i is just the obvious
inclusion of the image of f into S, and the middle map f̄ : R/𝒩(f) → Im f is an
isomorphism.

This theorem is often used to prove that some quotient R/I is isomorphic
to some other ring S. If you construct a surjective homomorphism R → S that
has kernel equal to I, then the first isomorphism theorem gives the desired
isomorphism.

Example 15.5.17.

(i) If R is a ring, then R/R ≅ {0}. Define the homomorphism f : R → {0}
by f(x) = 0 for all x ∈ R. The kernel of f is all of R, and the first
isomorphism theorem gives the isomorphism.

(ii) For any p ∈ [a, b] ⊂ ℝ let m_p = {f ∈ C([a, b]; 𝔽) | f(p) = 0}. We
claim that C([a, b]; 𝔽)/m_p ≅ 𝔽. To see this, use the homomorphism
e_p : C([a, b]; 𝔽) → 𝔽 given by e_p(f) = f(p). The map e_p is clearly
surjective, and the kernel of e_p is precisely m_p, so the first isomorphism
theorem gives the desired isomorphism.

(iii) Example 15.5.10(ii) shows that the ring of complex numbers ℂ is isomorphic
to ℝ[x]/(x^2 + 1). The first isomorphism theorem gives another
way to see this: define a map ℝ[x] → ℂ by f(x) ↦ f(i), where i ∈ ℂ
is the usual square root of −1. It is easy to see that this map is a homomorphism.
Its kernel is the set of all polynomials f ∈ ℝ[x] such that
f(i) = 0, which one can verify is equal to (x^2 + 1). Therefore, we have
ℝ[x]/(x^2 + 1) = ℝ[x]/𝒩(f) ≅ ℂ.

Example 15.5.18. Example 15.4.12 shows that for any p ∈ 𝔽 the kernel
of the evaluation map e_p : 𝔽[x] → 𝔽 is given by 𝒩(e_p) = (x − p). The
evaluation map is surjective because for any a ∈ 𝔽 the constant polynomial
a ∈ 𝔽[x] satisfies e_p(a) = a. The first isomorphism theorem implies that
𝔽[x]/(x − p) ≅ 𝔽.

Example 15.5.19. Example 15.4.11 shows that for any A ∈ M_n(ℂ)
the kernel of the evaluation map e_A : ℂ[x] → ℂ[A] is given by 𝒩(e_A) =
(p(x)), where p(x) is the minimal polynomial of A. But the definition of e_A
shows that it is surjective onto ℂ[A], so the first isomorphism theorem gives
ℂ[x]/(p(x)) ≅ ℂ[A].

15.6 The Chinese Remainder Theorem


In this sect ion we discuss the Chinese remainder theorem (CRT) and some of its
applications. Its earliest version dates back to approximately the fifth century AD
and can be found in the book The Mathematical Classic of Sun Zi.
The CRT is a powerful tool for working with congruences in the rings Zn, as
well as in polynomial rings, with direct applications in cryptography and coding
theory. But it also has many other applications, including partial fraction decom-
positions and polynomial interpolation. The CRT can be proved for arbitrary (even
noncommutative) rings, but all our applications are for Euclidean domains. Since
the proofs and results are much cleaner in the case of Euclidean domains, we assume
for the rest of the chapter that all rings are Euclidean domains .
The CRT states that given any list of pairwise relatively prime elements
m 1 , ... , mn in the ring R, and given any other elements a 1 , ... , an, the system
of congruences

x ≡ a_1 (mod m_1),
x ≡ a_2 (mod m_2),
⋮                                                        (15.3)
x ≡ a_n (mod m_n)

has a unique solution in R/(m 1m2 · · · mn)· We sometimes call the system (15 .3) the
Chinese remainder problem, or the CR problem. It is easy to show that if
the solution exists, it is unique. In the case that R = Z, it is also easy to give
a nonconstructive proof that there is a solution, but often we want to actually
solve the CR problem-not just know that it has a solution. Also, in a more general
ring, the nonconstructive proof usually doesn't work.
The way to deal with both issues is to construct a solution to the CR problem.
To do this we need either the Lagrange decomposition or the Newton decomposition.
These give an explicit algorithm for solving the CR problem, and as a nice side
benefit they also give a proof that every rational function has a partial fraction
decomposition.

15.6.1 Lagrange and Newton Decompositions


Before we discuss the CRT, we prove a theorem we call the Lagrange decomposi-
tion that says how to write elements of a Euclidean domain in terms of pairwise
relatively prime elements. This decomposition gives, as a corollary, both the par-
tial fraction decomposition of rational functions and the CRT. We also provide an
alternative decomposition due to Newton that can be used for an alternative proof
of the CRT and that gives a refined version of the partial fraction decomposition.
Both the Lagrange and Newton decompositions can be used to construct polynomial
interpolations-polynomials that take on prescribed values at certain points.

Theorem 15.6.1 (Lagrange Decomposition). Let R be a Euclidean domain,
and let m_1, ..., m_n be pairwise relatively prime elements in R (that is, any m_i and
m_j are relatively prime whenever i ≠ j). Let M = m_1 m_2 ··· m_n. Any element
f ∈ R can be written as

f ≡ f_1 + ··· + f_n (mod M),

with each f_i ≡ 0 (mod m_j) for i ≠ j. Alternatively, we can also write

f = f_1 + ··· + f_n + H,

with each f_i ≡ 0 (mod m_j) for every j ≠ i and 0 ≤ v(f_i) < v(M), and with H ≡ 0
(mod M).

Proof of the Lagrange Decomposition Algorithm. For each i ∈ {1, ..., n} let
π_i = ∏_{j≠i} m_j. Since all the m_k are pairwise relatively prime, each π_i is relatively
prime to m_i, and hence there exist elements s_i, t_i ∈ R such that π_i s_i + m_i t_i = 1
(these can be found by the EEA). Let

L_i = π_i s_i.

By definition, L_i ≡ 1 (mod m_i) and L_i ≡ 0 (mod m_j) for every j ≠ i. Let L =
Σ_{i=1}^n L_i, so that L ≡ 1 (mod m_i) for every i ∈ {1, ..., n}. This means that 1 − L ∈
(m_i) for every i, and by Corollary 15.3.5 the product M must divide 1 − L.
For all i ∈ {1, ..., n} let f̂_i = f · L_i, and let Ĥ = f · (1 − L), so that we
have f = Σ_{i=1}^n f̂_i + Ĥ with f̂_i = f · L_i ≡ 0 (mod m_j) for every j ≠ i and Ĥ ≡ 0
(mod M).
Now using the division property, write each f̂_i uniquely as Q_i M + f_i for some
Q_i and some f_i with 0 ≤ v(f_i) < v(m_1 m_2 ··· m_n). This gives

f = Σ_{i=1}^n f_i + Ĥ + M Σ_{i=1}^n Q_i.

Letting H = Ĥ + M Σ_{i=1}^n Q_i gives the desired result. □

Example 15.6.2. To write the Lagrange decomposition of 8 ∈ ℤ in terms of
m_1 = 3, m_2 = 5, and m_3 = 7, compute π_1 = 5 · 7 = 35. Using the EEA for
3 and 35, compute 1 = 12 · 3 + (−1) · 35, so L_1 = −35. Similarly, π_2 = 21
and 1 = (−4) · 5 + 21, so L_2 = 21. Finally π_3 = 15 and 1 = (−2) · 7 + 15, so
L_3 = 15 and

1 = −35 + 21 + 15 = L_1 + L_2 + L_3.

Multiplication by 8 gives

8 = −280 + 168 + 120.

Now reduce mod 3 · 5 · 7 = 105 to get 8 ≡ −70 + 63 + 15 (mod 105). In this
special case equality holds: 8 = −70 + 63 + 15, and we have

−70 ≡ 0 (mod 5) and (mod 7),
 63 ≡ 0 (mod 3) and (mod 7),
 15 ≡ 0 (mod 5) and (mod 3).
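The steps of Example 15.6.2 are easy to automate. The sketch below (our code, not the text's; the name lagrange_elements is an assumption, and Python 3.8+ is assumed for the modular inverse pow(x, -1, m)) builds the integer Lagrange elements L_i = s_i π_i directly.

```python
from math import prod

def lagrange_elements(moduli):
    """Return [L_1, ..., L_n] with L_i = 1 (mod m_i) and L_i = 0 (mod m_j) for j != i."""
    M = prod(moduli)
    L = []
    for m in moduli:
        pi = M // m                 # pi_i = product of the other moduli
        s = pow(pi, -1, m)          # inverse of pi_i modulo m_i (exists by relative primality)
        L.append(s * pi)
    return L

moduli = [3, 5, 7]
L = lagrange_elements(moduli)
print(L)                            # [70, 21, 15]; note 70 = -35 (mod 105), as in the example
parts = [8 * Li % 105 for Li in L]  # Lagrange decomposition of 8 modulo 105
print(parts, sum(parts) % 105)      # [35, 63, 15] 8; here 35 = -70 (mod 105)
```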

Remark 15.6.3. The Lagrange decomposition is often used in the special case
where R = 𝔽[x] and m_i = (x − p_i) for some collection of distinct points p_1, ..., p_n ∈
𝔽. In this case, we do not need the EEA to compute the L_i because there is a simple
formula for them:

L_i = ∏_{j≠i} (x − p_j)/(p_i − p_j).                        (15.4)

These are called the Lagrange interpolants or Lagrange basis functions.

In the proof of the Lagrange decomposition, we have

π_i = ∏_{j≠i} m_j = ∏_{j≠i} (x − p_j),

and π_i s_i + m_i t_i = 1 means precisely that π_i s_i ≡ 1 (mod m_i); thus, s_i is the
multiplicative inverse of π_i after reducing modulo m_i = (x − p_i). As shown in
Example 15.5.18, the first isomorphism theorem says that reducing modulo the ideal
(x − p_i) is exactly equivalent to evaluating at the point p_i; thus π_i ≡ ∏_{j≠i} (p_i − p_j)
(mod (x − p_i)) and s_i = ∏_{j≠i} 1/(p_i − p_j).

The Newton decomposition gives an alternative way to decompose any element


in a Euclidean domain.

Theorem 15.6.4 (Newton Decomposition). If R is a Euclidean domain with
valuation v, and if m_1, ..., m_n ∈ R are any elements of R, then for any f ∈ R we
may write

f = b_0 + b_1 m_1 + b_2 m_1 m_2 + ··· + b_n m_1 m_2 ··· m_n,

with 0 ≤ v(b_i) < v(m_{i+1}) for every i ∈ {0, ..., n − 1}.

Proof. Use the division property to write f = m_1 q_1 + b_0, with v(b_0) < v(m_1),
and then again q_1 = m_2 q_2 + b_1, with v(b_1) < v(m_2), and so forth until q_{n−1} =
m_n b_n + b_{n−1}. This gives

f = b_0 + m_1(b_1 + m_2(b_2 + m_3(··· + m_{n−1}(b_{n−1} + m_n b_n))))
  = b_0 + b_1 m_1 + b_2 m_1 m_2 + ···,

as desired. □

The following is an easy corollary.

Corollary 15.6.5. For any field 𝔽, any f, g ∈ 𝔽[x], and any positive integer
n, we may write f = f_0 + f_1 g + f_2 g^2 + ··· + f_n g^n such that for every i < n we have
deg(f_i) < deg(g).

15.6.2 The Chinese Remainder Theorem


As mentioned before, the CR problem is the problem of finding an x such that
x = ai (mod mi) for each i. The Chinese remainder theorem (CRT) guarantees
that a solution exists and that it is unique, modulo the product m 1m 2 · · · mn.
In this section we prove the CRT, but we are interested in more than just the
existence of a solution- we want an algorithm for actually constructing a solution.
The Lagrange and Newton decompositions both do that.

Theorem 15.6.6 (Chinese Remainder Theorem). Given pairwise relatively
prime elements m_1, ..., m_n in a Euclidean domain R (that is, if i ≠ j, then m_i and
m_j are relatively prime), let m = m_1 m_2 ··· m_n. The natural ring homomorphism

R/(m) → R/(m_1) × ··· × R/(m_n)

given by [[x]]_(m) ↦ ([[x]]_(m_1), ..., [[x]]_(m_n)) is an isomorphism.

Proof. The natural map φ : R → R/(m_1) × ··· × R/(m_n) given by x ↦ ([[x]]_(m_1), ...,
[[x]]_(m_n)) is easily seen to be a ring homomorphism. The kernel of φ is the set of
all x such that [[x]]_(m_j) = [[0]]_(m_j) for every j ∈ {1, ..., n}. But [[x]]_(m_j) = [[0]]_(m_j)
if and only if x ∈ (m_j), so we have 𝒩(φ) = {x ∈ R | x ∈ (m_j) for all j ∈ {1, ..., n}} =
⋂_{j=1}^n (m_j). Corollary 15.3.6 gives ⋂_{j=1}^n (m_j) = (m), and the first isomorphism
theorem gives R/(m) ≅ Im(φ). All that remains is to show that φ is surjective;
that is, we must show that at least one solution to the CR problem exists.
In the case that R = ℤ we can give a nonconstructive proof by simply counting:
The image of φ is isomorphic to ℤ/(m_1 ··· m_n), so it has exactly m_1 ··· m_n elements.
But it is also a subset of ℤ/(m_1) × ··· × ℤ/(m_n), which also has m_1 ··· m_n elements
in it. Hence they must be equal, and φ must be surjective. But this proof only
works when R/(m_1) × ··· × R/(m_n) is finite; it does not work for other important
rings like 𝔽[x].
The proof in the general case gives a useful algorithm for solving the CR
problem.
Solution of CR Problem by Lagrange Decomposition: Theorem 15.6.1
gives 1 = L_1 + ··· + L_n + H ∈ R, where L_i ≡ 0 (mod m_j) for every j ≠ i, and H ≡ 0
(mod m), and hence L_i ≡ 1 (mod m_i). Given any element ([[a_1]]_(m_1), ..., [[a_n]]_(m_n))
∈ R/(m_1) × ··· × R/(m_n), let x = Σ_{i=1}^n a_i L_i ∈ R. For each i we have x ≡ a_i L_i ≡ a_i
(mod m_i). So φ(x) = ([[a_1]]_(m_1), ..., [[a_n]]_(m_n)). □

Remark 15.6.7. Here is an equivalent way to state the conclusion of the CRT:
Given pairwise relatively prime elements m_1, ..., m_n in a Euclidean domain R, for
any choice ([[a_1]]_{m_1}, ..., [[a_n]]_{m_n}) ∈ R/(m_1) × ··· × R/(m_n), there exists a unique
[[x]]_m ∈ R/(m) such that [[x]]_{m_i} = [[a_i]]_{m_i} for all i ∈ {1, ..., n}. That is, the system
of n equations

x ≡ a_1 (mod m_1),
x ≡ a_2 (mod m_2),
⋮                                                        (15.5)
x ≡ a_n (mod m_n)

has a unique solution for x modulo m = m_1 m_2 ··· m_n.


Remark 15.6.8. Theorem 15.6.6 returns a unique solution modulo m. The solu-
tion is not unique in Z or in R.

Remark 15.6.9 (Solution of CR Problem by Newton Decomposition
(Garner's Formula)). Instead of using the Lagrange decomposition to solve the
CR problem, we could use the Newton decomposition. Decomposing the unknown
x as x = b_0 + b_1 m_1 + ··· + b_n m_1 m_2 ··· m_n gives the system

b_0 ≡ a_1 (mod m_1),
b_0 + b_1 m_1 ≡ a_2 (mod m_2),
⋮                                                        (15.6)
b_0 + b_1 m_1 + ··· + b_{n−1} m_1 m_2 ··· m_{n−1} ≡ a_n (mod m_n).

Set b_0 = a_1. Since m_1 is relatively prime to m_2, the EEA gives s_1, s_2 such that
s_1 m_1 + s_2 m_2 = 1, so that s_1 m_1 ≡ 1 (mod m_2). Set b_1 = s_1(a_2 − b_0). For each
i ∈ {1, ..., n − 1}, since m_{i+1} is relatively prime to m_1 m_2 ··· m_i, we can find s_i such
that s_i m_1 m_2 ··· m_i ≡ 1 (mod m_{i+1}). Setting

b_i = s_i (a_{i+1} − b_0 − b_1 m_1 − ··· − b_{i−1} m_1 m_2 ··· m_{i−1})          (15.7)

gives a solution to the CR problem. When R = ℤ this method of solving the CR
problem is sometimes called Garner's formula.
Just as with the Lagrange decomposition, in the case that R = 𝔽[x] and
m_i = x − p_i for p_i ∈ 𝔽, we do not need the EEA to find the inverses s_i, since we can
write a simple formula for them. Essentially the same argument as in Remark 15.6.3
shows that we can take s_i = ∏_{j≤i} 1/(p_{i+1} − p_j) for each i ∈ {1, ..., n − 1}.
Each of these two methods for solving the CR problem has some advantages,
depending on the setting and the application.

Example 15.6.10. The following is an ancient problem, dating to sixth-century
India [Kan88]. "Find the number that if divided by 8 is known to leave
5, that if divided by 9 leaves a remainder 4, and that if divided by 7 leaves a
remainder 1."
Since 7, 8, and 9 are pairwise relatively prime and 7 · 8 · 9 = 504, the
CRT says that for any system of remainders there exists a unique equivalence
class (mod 504) that satisfies the system. We solve the system using first the
Lagrange approach and then the Newton–Garner approach.
Lagrange Approach: Let m_1 = 7, m_2 = 8, and m_3 = 9. The Lagrange
decomposition algorithm gives 1 ≡ (−3)(72) + (−1)(63) + (−4)(56) (mod 504).
The solution of this CR problem is, therefore, x ≡ 1·(−3)(72) + (5)(−1)(63) +
(4)(−4)(56) ≡ 85 (mod 504).
Newton–Garner Approach: We write x = b_0 + b_1 m_1 + b_2 m_1 m_2 +
b_3 m_1 m_2 m_3, or x = b_0 + 7b_1 + 56b_2 + 504b_3. Since we are only concerned about
solutions modulo 504, we need not compute b_3. From the Garner formula we
have b_0 = a_1 = 1. We now must find s_1 such that s_1 m_1 ≡ 1 (mod m_2). By
inspection (or the EEA if necessary) we have 1 = (−1) · 7 + 1 · 8, so s_1 = −1
and b_1 = −1·(a_2 − b_0) = −(5 − 1) = −4. Again, we must find s_2 such that
s_2 m_1 m_2 ≡ 1 (mod m_3). By the EEA we have (−4) · 56 + 25 · 9 = 1, so s_2 = −4
and b_2 = s_2(a_3 − b_0 − b_1 m_1) = (−4)(4 − 1 − (−4) · 7) = −124. This gives
x ≡ 1 + (−4) · 7 + (−124) · 56 (mod 504), which again yields x = 85 as the
smallest positive representative of the equivalence class.
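Either approach is short to code. The sketch below (ours, not from the text; the name crt_lagrange is an assumption, and Python 3.8+ is assumed for pow(x, -1, m)) solves the CR problem of Example 15.6.10 with the Lagrange elements; the Garner recursion can be coded analogously.

```python
from math import prod

def crt_lagrange(residues, moduli):
    """Solve x = a_i (mod m_i) for pairwise relatively prime moduli,
    returning the representative in [0, m_1*...*m_n)."""
    M = prod(moduli)
    x = 0
    for a, m in zip(residues, moduli):
        pi = M // m                # pi_i = product of the other moduli
        L = pow(pi, -1, m) * pi    # Lagrange element: L = 1 (mod m), L = 0 (mod m_j)
        x += a * L
    return x % M

print(crt_lagrange([1, 5, 4], [7, 8, 9]))   # 85, as in Example 15.6.10
```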

Example 15.6.11. The Lagrange decomposition and the CRT can simplify
the process of dividing polynomials, especially when the divisor has no repeated
prime factors. For example, consider the problem of computing the
remainder of a polynomial f when dividing by g = (x)(x − 1)(x + 1). If we
have already computed the Lagrange interpolants L_i for the prime factors
m_1 = x, m_2 = x − 1, and m_3 = x + 1 of g, then division by g is fairly
quick, using the CRT.

By (15.4) the Lagrange interpolants are

L_1 = (x − 1)(x + 1)/((−1)(1)) = 1 − x^2,                  (15.8)
L_2 = (x − 0)(x + 1)/((1)(2)) = (x^2 + x)/2,               (15.9)
L_3 = (x − 0)(x − 1)/((−1)(−2)) = (x^2 − x)/2.             (15.10)

The CRT says that the map

φ : ℝ[x]/((x)(x − 1)(x + 1)) → ℝ[x]/(x) × ℝ[x]/(x − 1) × ℝ[x]/(x + 1)

is an isomorphism, and that the inverse map

ψ : ℝ[x]/(x) × ℝ[x]/(x − 1) × ℝ[x]/(x + 1) → ℝ[x]/((x)(x − 1)(x + 1))

is given by ψ(a, b, c) = aL_1 + bL_2 + cL_3. This means that ψ(φ(f)) = f mod
(x)(x − 1)(x + 1). But φ(f) = (e_0(f), e_1(f), e_{−1}(f)) = (f(0), f(1), f(−1)), so

f ≡ f(0) L_1 + f(1) L_2 + f(−1) L_3 (mod (x)(x − 1)(x + 1)).

Since each L_i is already of degree less than the degree of g, the sum f(0)L_1 + f(1)L_2 +
f(−1)L_3 also has degree less than the degree of g and hence must be the remainder of
f mod g.
So, for example, to compute f = 3x^5 + 5x^4 + 9x^2 − x + 11 mod g, simply
compute f(0) = 11, f(1) = 27, f(−1) = 23, and thus f ≡ 11(1 − x^2) + (27/2)(x^2 +
x) + (23/2)(x^2 − x) mod g.
This method is especially useful when computing several remainders
modulo g; finding the Lagrange basis elements is not necessarily much easier
than doing a single division, but subsequent divisions can be completed more
rapidly.
This method does not work quite so easily if one of the factors m_i is not
linear. First, the simple formula (15.4) for the Lagrange basis elements L_i
does not apply, so one must use the EEA or Lagrange–Hermite interpolation
(see Section 15.7.4) to compute these.
Second, computing the image φ(f) is also more difficult. For example,
if m_2 = (x − 1)^2, then we could not compute φ(f) by evaluating f at x = 1.
Instead we would have to divide f by (x − 1)^2. But this is still fairly easy,
replacing every x in f by x = (x − 1) + 1 and then expanding. So, to compute
f mod (x − 1)^2, we write

f(x) = x^4 + 5x + 2 = ((x − 1) + 1)^4 + 5((x − 1) + 1) + 2
     = (x − 1)^4 + 4(x − 1)^3 + 6(x − 1)^2 + 4(x − 1) + 1 + 5(x − 1) + 7
     ≡ 9(x − 1) + 8 mod (x − 1)^2.
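The first computation in Example 15.6.11 is easy to check by machine. The SymPy snippet below (our illustration, not the text's) computes the remainder of f mod g from the three evaluations and compares it with ordinary polynomial division.

```python
import sympy as sp

x = sp.symbols('x')
g = x * (x - 1) * (x + 1)
f = 3*x**5 + 5*x**4 + 9*x**2 - x + 11

# Lagrange interpolants for the roots 0, 1, -1 of g, as in (15.8)-(15.10).
L1 = (x - 1)*(x + 1) / ((0 - 1)*(0 + 1))
L2 = (x - 0)*(x + 1) / ((1 - 0)*(1 + 1))
L3 = (x - 0)*(x - 1) / ((-1 - 0)*(-1 - 1))

# Remainder of f mod g from three evaluations, with no long division.
r = sp.expand(f.subs(x, 0)*L1 + f.subs(x, 1)*L2 + f.subs(x, -1)*L3)
print(r)                          # 14*x**2 + 2*x + 11
print(sp.rem(f, g, x))            # same remainder, computed by division
```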

Application 15.6.12 (Radar Systems). The CRT can be used to improve


the accuracy of radar systems. When a single radar pulse is sent out and
reflected off an object, the distance to that object can be calculated as ct/2,
where c is the speed of light and t is the time delay between sending the
pulse and receiving its reflection. Tracking a moving object requires sending
multiple pulses, but then you can't tell which reflection is from which pulse,
so you only know the total time t modulo the time δ between pulses.
One solution to this problem is to send pulses of two or more different
frequencies. Since they are different frequencies, you can tell them apart from
each other when they reflect. Note that in most cases, electronic systems can
only count a finite number of clock cycles per second, so time is essentially
measured as an integer number of clock cycles. If there are n different frequencies
of pulses, and if δ_j is the number of clock cycles between pulses of
frequency j, then the true time t for one pulse to travel to the object and
back is known modulo δ_j for each j ∈ {1, ..., n}. If δ_1, ..., δ_n are pairwise
relatively prime, then the CRT gives t modulo ∏_{j=1}^n δ_j. If this product is
known to be much larger than t could possibly be, then the true value of t
(and hence the true distance to the object) has been completely determined.

Application 15.6.13 (Parallelizing Arithmetic Computations). Com-


putations on a computer cluster with many processors can be sped up dra-
matically if the problem can be broken up into many subcomputations that
can be done simultaneously (in parallel) on the separate processors and then
assembled cheaply into the final answer. The CRT can be used to parallelize
arithmetic computations involving very large integers and the ring operations
+, - , x. If primes P1 , ... , Pn are chosen so that the product IJ~=l Pi is cer-
tainly larger than the answer to the computation, then the problem can be
computed modulo Pi on the ith processor, and then the final answer assembled
via the CRT from all these separate pieces.

15.6.3 * Partial Fraction Decomposition

Lagrange and Newton decompositions can be used to prove that every rational
function has a partial fraction decomposition.

Corollary 15.6.14 (Partial Fraction Decomposition). For any field 𝔽, if
g_1, ..., g_n are pairwise relatively prime polynomials in 𝔽[x], let G = ∏_{i=1}^n g_i. Any
rational function of the form f/G with f ∈ 𝔽[x] can be written as

f/G = h + s_1/g_1 + ··· + s_n/g_n,

with h, s_1, ..., s_n ∈ 𝔽[x] and deg(s_i) < deg(g_i) for every i ∈ {1, ..., n}.

Proof. The elements g_i satisfy the hypothesis of the Lagrange decomposition theorem
(Theorem 15.6.1), so we may write f = f_1 + ··· + f_n + H, with f_i ≡ 0 (mod g_j)
and v(f_i) < v(G) for all j ≠ i, and H ≡ 0 (mod G).
Thus, we have

f/G = f_1/G + ··· + f_n/G + H/G.

The relation f_i ≡ 0 (mod g_j) for each j ≠ i is equivalent to g_j | f_i for each j ≠ i.
Since the elements g_j are relatively prime, we must have (∏_{j≠i} g_j) | f_i. Hence f_i =
(∏_{j≠i} g_j) s_i for some s_i ∈ 𝔽[x]. This gives

f/G = s_1/g_1 + ··· + s_n/g_n + H/G.

Moreover, we have G | H, so H = hG, and thus

f/G = s_1/g_1 + ··· + s_n/g_n + h.

Finally, since deg(f_i) < deg(G) and f_i = (∏_{j≠i} g_j) s_i, we have

deg(s_i) = deg(f_i) − Σ_{j≠i} deg(g_j) < deg(G) − Σ_{j≠i} deg(g_j) = deg(g_i). □

The Newton decomposition, or rather its immediate consequence,
Corollary 15.6.5, allows further decomposition.

Corollary 15.6.15 (Complete Partial Fraction Decomposition). Given any
polynomials f, G ∈ 𝔽[x], if G factors as G = ∏_{j=1}^n g_j^{a_j} with all the g_j pairwise
relatively prime, then we may write

f/G = h + (s_{11}/g_1^{a_1} + s_{12}/g_1^{a_1−1} + ··· + s_{1a_1}/g_1) + ··· + (s_{n1}/g_n^{a_n} + s_{n2}/g_n^{a_n−1} + ··· + s_{na_n}/g_n),

with h ∈ 𝔽[x] and s_{ij} ∈ 𝔽[x] such that deg(s_{ij}) < deg(g_i) for all i and j.

Remark 15.6.16. Constructing the partial fraction decomposition can be done in at
least three ways. One way is to explicitly follow the steps in the Lagrange algorithm
(the proof of Theorem 15.6.1), using the EEA to write the Lagrange decomposition,
which is the main step in the construction of the partial fraction decomposition.
If the denominator G can be written as a square-free product G = ∏_{i=1}^n (x − p_i)
with distinct p_i, so each (x − p_i) occurs at most once, then the Lagrange interpolation
formula (15.4) gives the Lagrange decomposition of f.
Finally, regardless of the nature of G, one can write the partial fraction decomposition
with undetermined coefficients s_{ij}. Clearing denominators and matching
the terms of each degree gives a linear system of equations over 𝔽, which can be
solved for the coefficients.
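For concrete rational functions over ℚ or ℝ, SymPy's apart function carries out the complete partial fraction decomposition of Corollary 15.6.15; the small check below is our illustration (the particular f and G are ours, not from the text).

```python
import sympy as sp

x = sp.symbols('x')

# Denominator G = x*(x - 1)^2 has pairwise relatively prime factors g_1 = x, g_2 = x - 1.
f = x**3 + 1
G = x * (x - 1)**2

print(sp.apart(f / G, x))
# 1 + 1/x + 1/(x - 1) + 2/(x - 1)**2  (terms possibly printed in a different order)
```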

15.7 Polynomial Interpolation and Spectral Decomposition

The previous ring-theoretic results lead to some remarkable connections between
polynomial interpolation and the spectral decomposition of a matrix. We begin this
section with a discussion of interpolation. The connection to spectral decomposition
is described in Section 15.7.3.

15.7.1 Lagrange Interpolation

The interpolation problem is this: given points (x_1, y_1), ..., (x_n, y_n) ∈ 𝔽^2, find a
polynomial f(x) ∈ 𝔽[x] such that f(x_i) = y_i for all i = 1, ..., n. As discussed
in Example 15.4.12, for any x_i ∈ 𝔽, the kernel of the evaluation homomorphism
e_{x_i} : 𝔽[x] → 𝔽 is the ideal (x − x_i), so we are looking for a polynomial f such that
f ≡ y_i (mod (x − x_i)) for each i ∈ {1, ..., n}.
The CRT says that there is a unique solution modulo the product P(x) =
∏_{i=1}^n (x − x_i), and one construction of the solution is provided by the Lagrange
decomposition. As discussed in Remark 15.6.3, we can use the Lagrange interpolants.
This immediately gives the following theorem.

Theorem 15.7.1 (Lagrange Interpolation). Given points (x_1, y_1), ..., (x_n, y_n)
∈ 𝔽^2 with all the x_i distinct, let

L_i(x) = ∏_{j≠i} (x − x_j)/(x_i − x_j).                       (15.11)

The polynomial

f(x) = Σ_{i=1}^n y_i L_i(x)                                   (15.12)

is the unique polynomial of degree less than n such that f(x_i) = y_i for every i ∈
{1, ..., n}.

Remark 15.7.2. Naively computing the polynomials L_i in the form above and
using (15.12) is not very efficient; the number of operations required to compute
the Lagrange interpolants this way grows quadratically in n. Rewriting the L_i in
barycentric form makes them much better suited for computation. We write

L_i(x) = p(x) w_i/(x − x_i),

where

w_i = 1/∏_{k≠i} (x_i − x_k)

is the barycentric weight and

p(x) = ∏_{k=1}^n (x − x_k).

This means that

f(x) = p(x) Σ_{j=1}^n (w_j/(x − x_j)) y_j,

which is known as the first barycentric form. Moreover, noting that the identity
element (the polynomial 1) satisfies

1 = p(x) Σ_{j=1}^n w_j/(x − x_j),

we can avoid computing p(x) entirely and write

f(x) = [Σ_{j=1}^n (w_j/(x − x_j)) y_j] / [Σ_{j=1}^n w_j/(x − x_j)].          (15.13)

We call (15.13) the second barycentric form. It is not hard to see that the number
of computations needed to compute (15.13) grows linearly in n once the weights
w_1, ..., w_n are known. Efficient algorithms exist for adding new points as well; see
[BT04] for details.
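A minimal NumPy sketch of the second barycentric form (15.13) follows; the function names are ours, and the data are the three interpolation values from Example 15.6.11, whose interpolating polynomial is 14x^2 + 2x + 11.

```python
import numpy as np

def barycentric_weights(nodes):
    """Barycentric weights w_i = 1 / prod_{k != i} (x_i - x_k)."""
    diffs = nodes[:, None] - nodes[None, :]
    np.fill_diagonal(diffs, 1.0)              # ignore the zero self-difference
    return 1.0 / diffs.prod(axis=1)

def barycentric_eval(x, nodes, values, weights):
    """Second barycentric form (15.13); assumes x is not one of the nodes."""
    c = weights / (x - nodes)
    return (c * values).sum() / c.sum()

nodes = np.array([0.0, 1.0, -1.0])
values = np.array([11.0, 27.0, 23.0])          # f(0), f(1), f(-1)
w = barycentric_weights(nodes)
print(barycentric_eval(2.0, nodes, values, w))  # 71.0, the value of 14x^2 + 2x + 11 at x = 2
```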

15.7.2 Newton Interpolation


We can also write a solution to the interpolation problem using the Newton de-
composition. Again the proof is an immediate consequence of the CRT and the
fact that evaluation of a polynomial at x 1 is equivalent to reducing the polynomial
modulo x - x1 .

Theorem 15.7.3 (Newton Interpolation). Given points (x_1, y_1), ..., (x_n, y_n) ∈
𝔽^2 with all the x_i distinct, for each j ∈ {0, ..., n − 1} let

N_0(x) = 1   and   N_j(x) = ∏_{i≤j} (x − x_i),   so that   N_j(x_k) = ∏_{i≤j} (x_k − x_i),

and define coefficients β_j ∈ 𝔽 recursively, by an analogue of the Garner
formula (15.7):

β_0 = y_1   and   β_j = (y_{j+1} − (β_0 + β_1 N_1(x_{j+1}) + ··· + β_{j−1} N_{j−1}(x_{j+1}))) / N_j(x_{j+1}).    (15.14)

The polynomial

f(x) = Σ_{i=0}^{n−1} β_i N_i

is the unique polynomial of degree less than n satisfying the conditions f(x_i) = y_i
for every i ∈ {1, ..., n}.

Remark 15.7.4. Regardless of whether one uses Newton or Lagrange, the final
interpolation polynomial created is the same (provided one uses exact arithmetic).
The difference between them is just that they use different bases for the vector space
of polynomials of degree less than n . Each of these bases has its own advantages and
disadvantages. When computing in floating-point arithmetic the two approaches do
not give exactly the same answers, and the Newton approach can have some stability
problems, depending on the order of the points (x1, Y1), ... , (xn, Yn)· Traditionally,
Newton interpolation was preferred in most settings, but more recently it has be-
come clear that barycentric Lagrange interpolation is the better algorithm for many
applications (see [BT04]).
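For comparison with the barycentric sketch above, here is a short Python implementation of the Newton recursion (15.14) and of nested (Horner-like) evaluation of the Newton form; the function names are ours, and the data are again those of Example 15.6.11.

```python
import numpy as np

def newton_coefficients(xs, ys):
    """Coefficients beta_j of the Newton form, computed by the recursion (15.14)."""
    n = len(xs)
    beta = np.zeros(n)
    beta[0] = ys[0]
    for j in range(1, n):
        # Evaluate the partial Newton sum beta_0 + beta_1 N_1 + ... at x_{j+1}.
        acc, Nk = beta[0], 1.0
        for k in range(1, j):
            Nk *= xs[j] - xs[k - 1]
            acc += beta[k] * Nk
        Nj = Nk * (xs[j] - xs[j - 1])          # N_j(x_{j+1})
        beta[j] = (ys[j] - acc) / Nj
    return beta

def newton_eval(x, xs, beta):
    """Evaluate f(x) = sum_j beta_j N_j(x) by nested multiplication."""
    result = beta[-1]
    for j in range(len(beta) - 2, -1, -1):
        result = result * (x - xs[j]) + beta[j]
    return result

xs = [0.0, 1.0, -1.0]
ys = [11.0, 27.0, 23.0]
beta = newton_coefficients(xs, ys)
print(newton_eval(2.0, xs, beta))              # 71.0, agreeing with the barycentric sketch
```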

Application 15.7.5 (Shamir Secret Sharing). One application of polynomial
interpolation and the CRT is the following method for sharing a secret
(say, for example, a missile launch code) among a large number of people in
such a way that any k + 1 of them can deduce the secret, but no smaller
number of them can deduce the secret.
To do this choose a random polynomial f(x) of degree k. Tell the first
person the value of f(1), the second person the value of f(2), and so forth.
Set your secret to be f(0). With only k or fewer people, the value of f(0)
cannot be deduced (the missiles cannot be launched), but any k + 1 of them
will have enough information to reconstruct the polynomial and thus deduce
f(0) (and thus launch the missiles).
An alternative method that allows progressively more information to be
deduced from each additional collaborator is the following: Choose the secret
to be a large integer L and assign each person a prime p_i that is greater
than L^{1/(k+1)} but much smaller than L^{1/k}. Let each person know the value of
L mod p_i. If any k of them collaborate (say the first k), they can find the
value of L mod ∏_{i=1}^k p_i, so L = ℓ + m ∏_{i=1}^k p_i for some integer m > 0, but
∏_{i=1}^k p_i has been chosen to be much smaller than L, so L is not yet uniquely
determined. Nevertheless, it is clear that L is now much easier to try to guess
than it was before. If k + 1 people collaborate, then the value of L is completely
known because ∏_{i=1}^{k+1} p_i > L.
The two methods of secret sharing are both applications of the CRT,
but the first is in the ring 𝔽[x], while the second is in the ring ℤ.
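A toy sketch of the polynomial version follows, working with arithmetic modulo a fixed prime and recovering f(0) by Lagrange interpolation at 0. The code and names are ours (and the modulus is toy-sized); it illustrates the idea only and is not production cryptography. Python 3.8+ is assumed for pow(x, -1, P).

```python
import random

P = 2_147_483_647                       # a prime modulus (2^31 - 1)

def make_shares(secret, k, n_people):
    """Degree-k polynomial with f(0) = secret; person i receives (i, f(i) mod P)."""
    coeffs = [secret] + [random.randrange(P) for _ in range(k)]
    def f(x):
        return sum(c * pow(x, j, P) for j, c in enumerate(coeffs)) % P
    return [(i, f(i)) for i in range(1, n_people + 1)]

def recover_secret(shares):
    """Lagrange interpolation at x = 0 from any k+1 shares (arithmetic mod P)."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if j != i:
                num = num * (-xj) % P            # factor (0 - x_j)
                den = den * (xi - xj) % P
        secret = (secret + yi * num * pow(den, -1, P)) % P
    return secret

shares = make_shares(secret=12345, k=3, n_people=6)
print(recover_secret(shares[:4]))        # 12345, recovered from any 4 = k+1 shares
```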

15.7.3 Lagrange and the Spectral Decomposition


In this section we describe a remarkable connection between the Lagrange inter-
polants and the spectral decomposition of a matrix.
Consider a matrix A E Mn(rc) with spectrum O"(A) and minimal polynomial
p(z) = fI.xEa-(A)(z - >.)m"', where each m.x is the order of the eigennilpotent D.x
associated to >. E O"(A) .
As shown in Examples 15.4.11 and 15.5.19, the map eA : C[x] -t C[A] c
Mn(C), given by f(x) f-7 f(A), is a homomorphism. Also JV (eA) = (p(x)), so the
first isomorphism theorem (FIT) guarantees that the induced map C[x]/(p(x)) -t
15.7. Polynomial Interpol ation and Spect ral Deco m posit ion 613

C[A] is an isomorphism. Since each (z - >-)m" is relatively prime to any other


(z - µr"', we may apply the CRT to see that the natural map C[x]j(p(x)) ---+
IT>-. Eo- C[x]/(x - >-)m" is also an isomorphism. Putting these together gives the
following diagram:

C[x] C[x]j(p(x)) - C[A]

~+R7
II C[x]/(x - >-r"
(15 .15)

.AEo-

Here the map 7f takes f(x) to the tuple (f mod (x - >- 1 r"1, ...,
f mod (x - >-kr"k ),
where u(A) = {>-1, . . . ,>.k}. Also, 'I/; is t he map that takes an element aol +
alA + · · · + aeAe E C[A] and sends it to 7r (ao + a 1 x + · · · + aexe). In the case
that A is semisimple (m>-. = 1 for every A E u(A)), the map 7f simplifies to
1f(f) = (J(>-1), ... ,f(>-k)) .
The Lagrange decomposition (Theorem 15.6.1) guarantees the existence of
polynomials L>-. E C[x] for each >- E u(A) such that 2= >-. Eo-(A) L>-. = 1 mod p(x) and
L>-. = 0 mod (x - µ)m,, for every eigenvalueµ =f. A. This is equivalent to saying that
7r(L>-.) = (0, ... , 0, 1, 0, ... , 0), where the 1 occurs in the position corresponding to>..
In the case that m>-. = 1 for every A E u(A), the L >-. are just given by Theorem 15.7.1,
x- µ
L>-.(X) = II
µ Eo- (A)
>. - µ'
µ-cop)..

but if A is not semisimple, the multiplicities m>-. are not all 1 and the formula is
more complicated (see (15.18) below) .

Example 15.7.6. Let

A = [3 1 0; 0 3 0; 0 0 5],

which has minimal polynomial p(x) = (x − 3)^2 (x − 5). We have

ℂ[A] ≅ ℂ[x]/((x − 3)^2) × ℂ[x]/(x − 5) ≅ {a + b(x − 3) | a, b ∈ ℂ} × ℂ.

The Lagrange interpolants in this case are

L_3 = −(x − 3)(x − 5)/4 − (x − 5)/2 = −(x − 5)((x − 3)/4 + 1/2)   and   L_5 = (x − 3)^2/4.

Writing x − 5 = (x − 3) − 2 shows that

L_3 = −(x − 3)((x − 3) − 2)/4 − ((x − 3) − 2)/2 ≡ 2(x − 3)/4 − (x − 3)/2 + 1 = 1 mod (x − 3)^2

and L_3 ≡ 0 mod (x − 5), while L_5 ≡ 0 mod (x − 3)^2 and L_5 ≡ 1 mod (x − 5). Applying
the map e_A to L_3 gives

e_A(L_3) = L_3(A) = −(A − 5I)((A − 3I)/4 + I/2)
         = −[−2 1 0; 0 −2 0; 0 0 0] ([0 1/4 0; 0 0 0; 0 0 1/2] + [1/2 0 0; 0 1/2 0; 0 0 1/2])
         = [1 0 0; 0 1 0; 0 0 0] = P_3.

Similarly,

e_A(L_5) = L_5(A) = (A − 3I)^2/4 = [0 0 0; 0 0 0; 0 0 1] = P_5.

Note also that e_A((x − 3)L_3) = (A − 3I)P_3 = D_3 is the eigennilpotent associated
to λ = 3.

The next theorem shows that the appearance of the eigenprojections and eigen-
nilpotents in the previous example is not an accident.

Theorem 15.7.7. Given the setup described above for a matrix A ∈ M_n(ℂ), the
eigenprojection P_λ is precisely the image of L_λ under the map e_A : ℂ[x] → ℂ[A] ⊂
M_n(ℂ). That is, for each λ ∈ σ(A) we have

P_λ = e_A(L_λ) = L_λ(A).                                    (15.16)

Moreover, the eigennilpotent D_λ is the image of (x − λ)L_λ:

D_λ = e_A((x − λ)L_λ) = (A − λI) L_λ(A).                    (15.17)

Proof. By the uniqueness of the spectral decomposition (Theorem 12.7.5) it suffices
to show that the operators Q_λ = e_A(L_λ) and C_λ = e_A((x − λ)L_λ) satisfy the usual
properties:

(i) Q_λ^2 = Q_λ.

(ii) Q_λ Q_µ = 0 when λ ≠ µ.

(iii) Σ_{λ∈σ(A)} Q_λ = I.

(iv) Q_λ C_λ = C_λ Q_λ = C_λ.

(v) A = Σ_{λ∈σ(A)} (λ Q_λ + C_λ).

Since ψ is an isomorphism (see (15.15)), it suffices to verify these properties for
ψ(Q_λ) = π(L_λ). For (i) we have

π(L_λ)π(L_λ) = (0, ..., 0, 1, 0, ..., 0)(0, ..., 0, 1, 0, ..., 0) = (0, ..., 0, 1, 0, ..., 0) = π(L_λ),

and a similar argument gives (ii). Item (iii) follows from the fact that Σ_λ L_λ ≡
1 mod p(x), where p(x) is the minimal polynomial of A, and (iv) follows from the
fact that

π(L_λ) π((x − λ)L_λ) = π((x − λ)L_λ)

and the fact that ℂ[A] is commutative. Finally, for (v) observe that

ψ(Σ_{λ∈σ(A)} (λ Q_λ + C_λ)) = Σ_{λ∈σ(A)} (π(λ L_λ) + π((x − λ)L_λ))
                            = Σ_{λ∈σ(A)} π(x L_λ)
                            = π(x) Σ_{λ∈σ(A)} π(L_λ)
                            = π(x) = ψ(A),

which gives (v). □

Example 15.7.8. Let

B = [1 2; 2 4],

with minimal polynomial x^2 − 5x. The CRT gives an isomorphism φ : ℂ[B] ≅
ℂ[x]/(x) × ℂ[x]/(x − 5) ≅ ℂ × ℂ. The Lagrange interpolants are L_0 = −(x − 5)/5
and L_5 = x/5, and again we have φ(L_0) = (1, 0) and φ(L_5) = (0, 1). Applying
the evaluation map gives

P_0 = e_B(L_0) = −(1/5)(B − 5I) = −(1/5) [−4 2; 2 −1]

and

P_5 = e_B(L_5) = (1/5) B.

It is easy to see that P_0^2 = P_0, that ((1/5)B)^2 = (1/5)B, and that B P_0 = 0, as
expected for the eigenprojections.
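For a semisimple matrix such as B, Theorem 15.7.7 is easy to check numerically: evaluating the ordinary Lagrange interpolants at the matrix must produce the eigenprojections. The NumPy snippet below is our check, not part of the text.

```python
import numpy as np

B = np.array([[1.0, 2.0],
              [2.0, 4.0]])
I = np.eye(2)

# P_lambda = L_lambda(B) with the Lagrange interpolants for the eigenvalues 0 and 5.
P0 = -(B - 5*I) / 5          # L_0(x) = -(x - 5)/5 evaluated at B
P5 = B / 5                   # L_5(x) = x/5 evaluated at B

print(np.allclose(P0 @ P0, P0), np.allclose(P5 @ P5, P5))   # True True (idempotent)
print(np.allclose(P0 @ P5, np.zeros((2, 2))))               # True (P_0 P_5 = 0)
print(np.allclose(P0 + P5, I))                              # True (resolution of the identity)
print(np.allclose(0*P0 + 5*P5, B))                          # True (spectral decomposition of B)
```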

15.7.4 * Lagrange–Hermite Interpolation

Lagrange interpolation gives the solution of the interpolation problem in the case of
distinct interpolation points. With just a little more work, these polynomial interpolation results
generalize to account for various derivatives at each point as well. This is called
Lagrange–Hermite interpolation. To begin we need to connect equivalence modulo
(x − p)^m to properties of derivatives.

Proposition 15.7.9. Given f ∈ 𝔽[x], given p ∈ 𝔽, and given a positive integer n,
if f ≡ 0 (mod (x − p)^n), then for every nonnegative integer j < n, we have

    f^(j) ≡ 0 (mod (x − p)^(n−j)),

where f^(j) = d^j f / dx^j is the jth derivative of f.

Proof. The statement f ≡ 0 (mod (x − p)^n) means that f = (x − p)^n g for some
g ∈ 𝔽[x]. Repeated application of the product rule shows that (x − p)^(n−j) divides
f^(j), which means that f^(j) ≡ 0 (mod (x − p)^(n−j)). ∎

Suppose we want to find a polynomial with certain derivatives at a point. The


Taylor polynomial does this at a single point .

Proposition 15.7.10. Given f ∈ 𝔽[x] and p ∈ 𝔽 and a positive integer n, we have

    f ≡ Σ_{i=0}^{n−1} (a_i / i!)(x − p)^i   mod (x − p)^n

if and only if for every nonnegative integer j < n we have f^(j)(p) = a_j.

Proof. If f ≡ Σ_{i=0}^{n−1} (a_i/i!)(x − p)^i mod (x − p)^n, then the previous proposition gives

    d^j/dx^j ( f − Σ_{i=0}^{n−1} (a_i/i!)(x − p)^i ) ≡ 0   (mod (x − p)^(n−j))

for every j < n. A direct computation of the jth derivative of the sum gives

    d^j/dx^j Σ_{i=0}^{n−1} (a_i/i!)(x − p)^i = Σ_{k=0}^{n−j−1} (a_{j+k}/k!)(x − p)^k,

and hence

    f^(j) − Σ_{k=0}^{n−j−1} (a_{j+k}/k!)(x − p)^k ≡ 0   (mod (x − p)^(n−j)).

Evaluating at x = p gives f^(j)(p) = a_j.

Conversely, Corollary 15.6.5 guarantees that any polynomial f ∈ 𝔽[x] can be
written as f = Σ_{i=0}^{n} b_i(x − p)^i, where deg(b_i) < deg(x − p) for every i < n. This
shows b_i is constant for each i < n. A direct computation of the derivatives of
f shows that for each j < n we have b_j = a_j/j!; hence f = ( Σ_{i=0}^{n−1} (a_i/i!)(x − p)^i ) +
b_n(x − p)^n (here b_n ∈ 𝔽[x] is not necessarily of degree 0). ∎
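
Proposition 15.7.10 says that, for polynomials, knowing f modulo (x − p)^n is the same as knowing the derivatives f^(j)(p) for j < n. A minimal SymPy sketch illustrating this (the polynomial, the point p, and the order n below are arbitrary choices, not from the text):

    import sympy as sp

    x = sp.symbols('x')
    f = x**5 - 3*x**2 + 4
    p, n = 2, 3

    # Truncated Taylor polynomial: sum over i < n of f^(i)(p)/i! * (x - p)^i
    taylor = sum(sp.diff(f, x, i).subs(x, p) / sp.factorial(i) * (x - p)**i
                 for i in range(n))

    print(sp.rem(sp.expand(f - taylor), (x - p)**n, x))     # 0: f agrees with taylor mod (x - p)^n
    print([sp.diff(taylor, x, j).subs(x, p) == sp.diff(f, x, j).subs(x, p)
           for j in range(n)])                              # [True, True, True]
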

Theorem 15.7.11 (Lagrange–Hermite Interpolation). Given points x_1, ..., x_n ∈
𝔽, nonnegative integers m_1, ..., m_n, and values

    y_1^(0), ..., y_1^(m_1),  y_2^(0), ..., y_2^(m_2),  ...,  y_n^(0), ..., y_n^(m_n)

in 𝔽, let M = Σ_{j=1}^{n} (m_j + 1), and let P(x) = ∏_{i=1}^{n} (x − x_i)^(m_i+1). Use the partial fraction
decomposition (Corollary 15.6.15) to write

    1/P(x) = Σ_i ( s_{i,0} + s_{i,1}(x − x_i) + ··· + s_{i,m_i}(x − x_i)^{m_i} ) / (x − x_i)^(m_i+1).

For each i ∈ {1, ..., n} define L_i to be P times the ith term of the partial fraction
decomposition,

    L_i = P(x) · ( s_{i,0} + s_{i,1}(x − x_i) + ··· + s_{i,m_i}(x − x_i)^{m_i} ) / (x − x_i)^(m_i+1),     (15.18)

and define

    f_i = Σ_{j=0}^{m_i} (y_i^(j) / j!)(x − x_i)^j.

The function

    f = Σ_{i=1}^{n} f_i L_i

is the unique polynomial in 𝔽[x] of degree less than M such that f^(j)(x_i) = y_i^(j) for
every i ∈ {1, ..., n} and every j ∈ {0, ..., m_i}.

Proof. By Proposition 15.7.10 the condition that for each i ≤ n and each j ≤ m_i
we have f^(j)(x_i) = y_i^(j) is equivalent to the condition that f ≡ f_i (mod (x − x_i)^(m_i+1))
for each i ≤ n.

By construction, we have 1 = Σ_{i=1}^{n} L_i with L_i ≡ 0 (mod (x − x_j)^(m_j+1)) for any
j ≠ i. Thus, we also have L_i ≡ 1 (mod (x − x_i)^(m_i+1)) and f ≡ f_i (mod (x − x_i)^(m_i+1))
for any i ∈ {1, ..., n}. Therefore, the derivatives of f take on the required values.

To prove uniqueness, note that the ideals ((x − x_i)^(m_i+1)) are all pairwise relatively
prime and the intersection of these ideals is exactly (P), hence the CRT guarantees
there is a unique equivalence class [[g]]_(P) in 𝔽[x]/(P) such that g ≡ f_i (mod (x − x_i)^(m_i+1))
for every i. Finally, by Proposition 15.5.5, there is a unique f ∈ 𝔽[x] with
deg(f) < deg(P) such that f ∈ [[g]]_(P). Hence the solution for f is unique. ∎

Example 15.7.12. Consider the interpolation problem where x_1 = 1 and
x_2 = 2, with y_1^(0) = 1, y_1^(1) = 2, and y_2^(0) = 3. We have P(x) = (x − 1)²(x − 2),
and a routine calculation shows that the partial fraction decomposition of
1/P is

    1/P(x) = −x/(x − 1)² + 1/(x − 2).

Therefore,

    L_1 = P(x) · ( −x/(x − 1)² ) = −x² + 2x,
    L_2 = P(x) · ( 1/(x − 2) ) = (x − 1)².

Moreover, we have

    f_1 = (1/0!)(x − 1)^0 + (2/1!)(x − 1)^1 = 2x − 1,
    f_2 = (3/0!)(x − 2)^0 = 3,

and hence

    f = f_1 L_1 + f_2 L_2 = (2x − 1)(−x² + 2x) + 3(x − 1)² = −2x³ + 8x² − 8x + 3.

It is straightforward to check that f(1) = 1, f′(1) = 2, and f(2) = 3, as
required.
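
The computation in Example 15.7.12 can be reproduced symbolically. A minimal sketch (assuming SymPy; sp.apart plays the role of the partial fraction decomposition of Corollary 15.6.15):

    import sympy as sp

    x = sp.symbols('x')
    P = (x - 1)**2 * (x - 2)

    # Partial fraction decomposition of 1/P; SymPy may split the (x - 1) terms,
    # but the result regroups to -x/(x - 1)**2 + 1/(x - 2).
    print(sp.apart(1 / P, x))

    L1 = -x * (x - 2)              # P(x) * (-x/(x - 1)**2)
    L2 = (x - 1)**2                # P(x) * (1/(x - 2))
    f1 = 1 + 2 * (x - 1)           # value 1 and derivative 2 prescribed at x = 1
    f2 = sp.Integer(3)             # value 3 prescribed at x = 2

    f = sp.expand(f1 * L1 + f2 * L2)
    print(f)                                                     # -2*x**3 + 8*x**2 - 8*x + 3
    print(f.subs(x, 1), sp.diff(f, x).subs(x, 1), f.subs(x, 2))  # 1 2 3
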

Exercises
Note to the student: Each section of this chapter has several corresponding
exercises, all collected here at the end of the chapter. The exercises between the
first and second line are for Section 1, the exercises between the second and third
lines are for Section 2, and so forth.
You should work every exercise (your instructor may choose to let you skip
some of the advanced exercises marked with *). We have carefully selected them,
and each is important for your ability to understand subsequent material. Many of
the examples and results proved in the exercises are used again later in the text.
Exercises marked with ▲ are especially important and are likely to be used later
in this book and beyond. Those marked with † are harder than average, but should
still be done.

Although they are gathered together at the end of the chapter, we strongly
recommend you do the exercises for each section as soon as you have completed
the section, rather than saving them until you have finished the entire chapter. The
exercises for each section are separated from those for other sections by a
horizontal line.

15.1. Prove that for any positive integer n the set Zn described in Example 15. l.3(iii)
satisfies all the axioms of a ring. Prove, also, that multiplication is commu-
tative and that there exists a multiplicative identity element (unity).
15.2. Fill in all the details in the proof of Proposition 15.1.9.
15.3. Prove that for any ring R, and for any two elements x, y E R we have
(-x)( - y) = xy.
15.4. In the ring Z, identify which of the following sets is an ideal, and justify
your answer:
(i) The odd integers.
(ii) The even integers.
(iii) The set 3Z of all multiples of 3.
(iv) The set of divisors of 24.
(v) The set {n E Z In = 3k or n = 5j} of all multiples of either 3 or 5.
15.5. Provide an example showing that the union of two ideals need not be an ideal.
15.6. Prove the following:
(i) The ideal (3, 5) in ℤ generated by 3 and 5 is all of ℤ. Hint: Show
1 ∈ (3, 5).
(ii) The ideal (x², x² + x, x + 1) in 𝔽[x] is all of 𝔽[x].
(iii)* In the ring ℂ[x, y] the ideal (x² − y³, x − y) is a proper subset of the
ideal (x + y, x − 2y).
15.7.* Prove Proposition 15.1.34.
15.8.* Prove Proposition 15.1.35.
15.9.* Prove Theorem 15.1.36.
15.10.* Prove Proposition 15.1.37.

15.11. Let a = 204 and b = 323. Use the extended Euclidean algorithm (EEA) to
find gcd(a, b) in ℤ as well as integers s, t such that as + bt = gcd(a, b). Do
the same thing for a = x³ − 3x² − x + 3 and b = x³ − 3x² − 2x + 6 in ℚ[x].
15.12. Prove that if pis prime, then every nonzero element a E Zp has a multiplica-
tive inverse. That is, there exists x E Zp such that ax = 1 (mod p). Hint:
What is gcd(a,p)?
15.13. Find the only integer 0 < a < 72 satisfying 35a = 1 (mod 72) . Now find
the only integer 0 < b < 72 satisfying 35b = 67 (mod 72). Hint: Solve
35a + 72x = 1 with the EEA.
15.14. Find a polynomial q of degree 1 or less such that (x+l)q = x+2 (mod x 2 +3).

15.15. Prove that for any composite (nonprime) integer n, the ring Zn is not a
Euclidean domain. If p is prime, prove that Zp is a Euclidean domain. Hint:
Use Exercise 15.12.
15.16. Prove that JF[x, y] has no zero divisors but is not a Euclidean domain. Hint:
What can you say about the ideal (x, y) in JF[x, y]?
15.17.* Implement the EEA for integers in Python (or your favorite computer lan-
guage) from scratch (without importing any additional libraries or methods).
Your code should accept two integers x and y and return gcd(x, y) , as well
as a and b such that ax+ by= gcd(x, y).

15.18. For p a prime in a Euclidean domain D, prove or disprove each of the
following:
(i) If a ∈ D and p | a^k, then p^k | a^k.
(ii) If a, b ∈ D and p | a² + b², then p | a² and p | b².
15.19. If I is an ideal in a Euclidean domain R, show that there are only finitely
many ideals in R that contain I. Hint: I = (c) for some c ∈ R. Consider the
divisors of c.
15.20. If a, b ∈ ℤ and 3 | a² + b², prove that 3 | a and 3 | b. Hint: If 3 does not divide a,
then a = 3k + 1 or a = 3k + 2 for some k ∈ ℤ.
15.21. If p ∈ ℤ is any positive prime, prove that √p is irrational. Prove that if p(x)
is an irreducible polynomial, then √(p(x)) cannot be written as a rational
function in the form f(x)/g(x) with f, g ∈ 𝔽[x].
15.22. Prove Proposition 15.3.8.
15.23. Prove that the ring

    ℤ[√−5] = {a + b√−5 | a, b ∈ ℤ} ⊂ ℂ

is not a Euclidean domain by showing that it does not satisfy the fundamental
theorem of arithmetic. Hint: Find two different factorizations of 9 in ℤ[√−5].

15.24. Prove that each of the maps in Example 15.4.3 is a homomorphism.


15.25. Prove that if f is surjective, then Im f is an ideal of S. Give an example of
a ring homomorphism f : R → S that is not surjective where this fails.
15.26. Let R and S be commutative rings with multiplicative identities 1_R and 1_S,
respectively. Let f : R → S be a ring homomorphism that is not identically 0.
(i) Prove that if f is surjective, then f(1_R) = 1_S.
(ii) Prove that if S has no zero divisors, then f(1_R) = 1_S.
(iii) Provide an example to show that the previous results are not true if f
is not surjective and S has zero divisors.
15.27. What is the kernel of each of the following homomorphisms?
(i) ℤ_12 → ℤ_4 given by [[x]]_12 ↦ [[x]]_4.
(ii) ℤ ↪ ℚ, the usual inclusion.
(iii) The map 𝔽[x] → C([a, b]; ℝ) described in Example 15.4.3(ii).

15.28. Prove that if f : Z --t Z is an isomorphism, then f is the identity map


(f(n) = n for all n E Z) .
15.29. Prove that the map defined in Example 15.4.17(ii) is indeed an isomorphism
to ℂ.
15.30. Prove Proposition 15.4.20.
15.31. Let Rand S be rings. What is the kernel of the homomorphism p: Rx S --t S
given by p(r, s) = s?

15.32. Let I = (3) ⊂ ℤ_12 be the multiples of 3 in ℤ_12. Write out all the elements
of ℤ_12/I and write out the addition and multiplication tables for ℤ_12/I. Do
the same for I = (8) ⊂ ℤ_12.
15.33. Prove that the ring 𝔽[x]/(x − a) is isomorphic to 𝔽 for any a ∈ 𝔽.
15.34. Prove that the ring 𝔽[x]/((x − 1)³) is isomorphic to the set {a_0 + a_1(x − 1) +
a_2(x − 1)² | a_i ∈ 𝔽} with addition given by

    (a_0 + a_1(x − 1) + a_2(x − 1)²) + (b_0 + b_1(x − 1) + b_2(x − 1)²)
        = (a_0 + b_0) + (a_1 + b_1)(x − 1) + (a_2 + b_2)(x − 1)²

and with multiplication given by

    (a_0 + a_1(x − 1) + a_2(x − 1)²)(b_0 + b_1(x − 1) + b_2(x − 1)²)
        = a_0 b_0 + (a_1 b_0 + a_0 b_1)(x − 1) + (a_2 b_0 + a_1 b_1 + a_0 b_2)(x − 1)².

More generally, for any λ ∈ 𝔽 show that the ring 𝔽[x]/((x − λ)^n) can be
written as {Σ_{k=0}^{n−1} a_k(x − λ)^k | a_k ∈ 𝔽} with the addition given by

    Σ_{k=0}^{n−1} a_k(x − λ)^k + Σ_{k=0}^{n−1} b_k(x − λ)^k = Σ_{k=0}^{n−1} (a_k + b_k)(x − λ)^k

and with multiplication given by

    ( Σ_{k=0}^{n−1} a_k(x − λ)^k )( Σ_{k=0}^{n−1} b_k(x − λ)^k ) = Σ_{k=0}^{n−1} ( Σ_{i+j=k} a_i b_j )(x − λ)^k.

15.35. If λ ∈ 𝔽 and if π : 𝔽[x] → 𝔽[x]/((x − λ)^n) is the canonical epimorphism, then
for any k ∈ ℕ, write π(x^k) in the form Σ_{j=0}^{n−1} a_j(x − λ)^j.
15.36. Recall that an idempotent in a ring is an element a such that a² = a. The
element 0 is always idempotent, as is 1, if it exists in the ring. Find at least
one more idempotent (not 0 and not 1) in the ring ℝ[x]/(x⁴ + x²). Also find
at least one nonzero nilpotent in this ring.
15.37. Prove Lemma 15.5.14.
15.38. Prove Theorem 15.5.15.
15.39.* Prove that in any commutative ring R the set N of all nilpotents in R
forms an ideal of R. Prove that the quotient ring R/ N has no nonzero
nilpotent elements.

15.40.* Prove the second isomorphism theorem (compare this to Corollary 2.3.18,
which is the vector space analogue of this theorem): Given two ideals I, J of
a ring R, the intersection I ∩ J is an ideal of the ring I, and J is an ideal of
the ring I + J. We have

    (I + J)/J ≅ I/(I ∩ J).

Hint: Show that the obvious homomorphism from I to (I + J)/J is a surjective
homomorphism with kernel I ∩ J.

15.41. In each of the following cases, compute the Lagrange and the Newton decom-
positions for f ∈ R relative to the elements m_1, ..., m_n:
(i) f = 11 in R = ℤ, relative to m_1 = 2, m_2 = 3, m_3 = 5.
(ii) f = x⁴ − 2x + 7 in R = ℝ[x], relative to m_1 = (x − 1), m_2 = (x − 2),
m_3 = (x − 3). Hint: Consider using the method of Example 15.6.11 to
reduce the amount of dividing you have to do.
15.42. A gang of 19 thieves has a pile containing fewer than 8,000 coins. They try
to divide the pile evenly, but there are 9 coins left over. As a result, a fight
breaks out and one of the thieves is killed. They try to divide the pile again,
and now they have 8 coins left over. Again they fight, and again one of the
thieves dies. Once more they try to divide the pile, but now they have 3 coins
left.
(i) How many coins are in the pile?
(ii) If they continue this process of fighting, losing one thief, and redividing,
how many thieves will be left when the pile is finally divided evenly with
no remainder?
15.43. Let A ∈ M_n(ℂ) be a square matrix with minimal polynomial p(x) and eigen-
values λ_1, ..., λ_k with p(x) = (x − λ_1)^{m_1} ··· (x − λ_k)^{m_k}. Prove that the ring
ℂ[A] is isomorphic to the product of quotient rings ℂ[x]/((x − λ_1)^{m_1}) × ··· ×
ℂ[x]/((x − λ_k)^{m_k}). Hint: Use the result of Example 15.5.19.
15.44. Fix a positive integer N and let ω = e^{−2πi/N}. Prove that the map ℱ : ℂ[t] →
ℂ × ··· × ℂ, defined by ℱ(p(t)) = (p(ω^0), p(ω^1), ..., p(ω^{N−1})), induces an
isomorphism of rings from ℂ[t]/(t^N − 1) to the ring ℂ^N = ℂ × ··· × ℂ.
This isomorphism is called the discrete Fourier transform, and it plays an
important role in signal processing and many other applications. Hint: Use
the results of Exercise 15.33.
15.45.* Implement the Lagrange decomposition algorithm for integers in Python
(or your favorite computer language) using only your previous implementa-
tion of the EEA (Exercise 15.17), without importing any other libraries or
methods. Your code should accept an integer x and a tuple (m_1, ..., m_n) of
pairwise-relatively-prime integers and return a tuple (x_1, ..., x_n) such that
x ≡ Σ_{i=1}^{n} x_i (mod ∏_{i=1}^{n} m_i) and such that x_i ≡ 0 (mod m_j) whenever
i ≠ j.
15.46.* Implement the Newton–Garner algorithm for solving the CR problem for
integers in Python (or your favorite computer language) using only your
previous implementation of the EEA (Exercise 15.17), without importing
any other libraries or methods. Your code should accept a tuple (a_1, ..., a_n)
of integers and a tuple (m_1, ..., m_n) of pairwise-relatively-prime integers and
return an integer 0 ≤ x < ∏_{i=1}^{n} |m_i| such that x ≡ a_i (mod m_i) for every i.

15.47. Find a polynomial f(x) E Q[x] such that f(l) = 2, f(2) = 3, f(3) = 5, and
f(4) = 7 using
(i) Lagrange interpolation and
(ii) Newton interpolation.
15.48. Let

A= 0
7 1
7 0 .
ol
[0 0 2

(i) Find the minimal polynomial of A.
(ii) Write a ring of the form ∏_{λ∈σ(A)} ℂ[x]/((x − λ)^{m_λ}) that is isomorphic to
ℂ[A].
(iii) Compute the Lagrange–Hermite interpolants L_λ for each λ ∈ σ(A).
Verify that L_λ ≡ 0 mod (x − μ)^{m_μ} for every μ ≠ λ, and that L_λ ≡
1 mod (x − λ)^{m_λ}.
(iv) Compute the eigenprojections P_λ = e_A(L_λ) ∈ M_n(ℂ) by evaluating the
Lagrange–Hermite interpolants at A.
(v) Compute the eigennilpotents D_λ in a similar fashion; see (15.17).
15.49. Let

B ~ [H i]
(i) Find the minimal polynomial of B. Hint: Do not compute the resolvent.
(ii) Write a ring of the form ∏_{λ∈σ(B)} ℂ[x]/((x − λ)^{m_λ}) that is isomorphic to
ℂ[B].
(iii) Compute the Lagrange–Hermite interpolants L_λ for each λ ∈ σ(B).
Verify that L_λ ≡ 0 mod (x − μ)^{m_μ} for every μ ≠ λ, and that L_λ ≡
1 mod (x − λ)^{m_λ}.
(iv) Compute the eigenprojections P_λ = e_B(L_λ) ∈ M_n(ℂ) by evaluating the
Lagrange–Hermite interpolants at B.
(v) Compute the eigennilpotents D_λ in a similar fashion; see (15.17).
15.50. Given A ∈ M_n(ℂ), with spectrum σ(A) and minimal polynomial p(x) =
∏_{λ∈σ(A)} (x − λ)^{m_λ}, for each λ ∈ σ(A), let L_λ be the corresponding polyno-
mials in the Lagrange decomposition with L_λ ≡ 0 mod (x − μ)^{m_μ} for every
eigenvalue μ ≠ λ, and with Σ_{λ∈σ(A)} L_λ ≡ 1 mod p(x). Use the results of this
section to show that the Drazin inverse A^D lies in ℂ[A] ⊂ M_n(ℂ), and that

    A^D = e_A( Σ_{λ∈σ(A)} L_λ Σ_{k=0}^{m_λ−1} (−1)^k (x − λ)^k / λ^{k+1} ),

where e_A : ℂ[x] → ℂ[A] is the evaluation map given by x ↦ A.



15.51. Use the techniques of this section (polynomial interpolation) to give a formula
for t he inverse of the discrete Fourier transform of Exercise 15.44.

Notes
Some references for the material in this chapter include [Art91 , Her96, Alu09].
Several of the applications in this chapter are from [Wik14] .
Variations of Exercise 15.42 date back to Qin Jiushao's book Shushu Jiuzhang
(Nine Sections of Mathematics), written in 1247 (see [Lib73]), but we first learned
of this problem from [Ste09].
Part V

Appendices
Foundations of
Abstract Mathematics

It has long been an axiom of mine that the little things are infinitely the most
important.
-Sir Arthur Conan Doyle

In this appendix we provide a quick sketch of some of the standard foundational


ideas of mathematics, including the basics of set theory, relations, functions, order-
ings, induction, and Zorn's lemma. This is not intended to be a complete treatment
of these topics from first principles-for that see [GC08, DSlO, Hal74] . Our pur-
pose here is to provide a handy reference for some of these basic ideas, as well as to
solidify notation and conventions used throughout the book.

A.1 Sets and Relations


A.1.1 Sets

Definition A.1.1. A set is an unordered collection of distinct elements, usually
indicated with braces, as in S = {A, 2, Cougar, √3}. We write x ∈ S to indicate
that x is an element of the set S, and y ∉ S to indicate that y is not an element of
S. Two sets are considered to be equal if each has precisely the same elements as
the other.

Example A.1.2. Here are some sets that we use often in this book:

(i) The set with no elements is the empty set and is denoted ∅. The empty
set is unique; that is, any set with no elements must be equal to ∅.

(ii) The set ℂ of complex numbers.

(iii) The set ℝ of real numbers.

(iv) The set ℕ = {0, 1, 2, 3, ...} of natural numbers.

(v) The set ℤ = {..., −3, −2, −1, 0, 1, 2, 3, ...} of integers.

(vi) The set ℤ⁺ = {1, 2, 3, ...} of positive integers.

(vii) The set ℚ of rational numbers.

(viii) The set ℝ² of all pairs (a, b), where a and b are any elements of ℝ.

Example A.1.3.

(i) Sets may have elements which are themselves sets. The set

    A = {{1, 2, 3}, {r, s, t, u}}

has two elements, each of which is itself a set.

(ii) The empty set may be an element of another set. The set T = {∅, {∅}}
has two elements, the first is the empty set and the second is a set
B = {∅} whose only element is the empty set. If this is confusing, it
may be helpful to think of the empty set as a bag with nothing in it,
and the set T as a bag containing two items: one empty bag and one
other bag B with an empty bag inside of B.

Definition A.1.4. We say that a set T is a subset of a set S, and we write T ⊂ S,
if all of the elements of T are also elements of S. We say that T is a proper subset
of S if T ⊂ S, and T ≠ ∅, and there is at least one element of S that is not in T.

Example A.1.5.

(i) The integers are a subset of the rational numbers, which are a subset of
the real numbers, which are a subset of the complex numbers:

    ℤ ⊂ ℚ ⊂ ℝ ⊂ ℂ.

(ii) The empty set ∅ is a subset of every set.

(iii) Given a set S, the power set of S is the set of all subsets of S. It is
sometimes denoted 𝒫(S) or 2^S. For example, if S = {a, b, c}, then the
power set is

    𝒫({a, b, c}) = {∅, {a}, {b}, {c}, {a, b}, {a, c}, {b, c}, {a, b, c}}.

Note that if S is a finite set with n elements, then the power set consists
of 2^n elements.

The fact that two sets are equal if and only if they have the same elements
leads to the following elementary, but useful, way to prove that two sets are equal.

Proposition A.1.6. Sets A and B are equal if and only if both A C B and B C A.

Definition A.1. 7. We often use the following convenient shorthand for writing a
set in terms of the properties its elements must satisfy. If P is some property or
formula with a free variable x , then we write

{x ES I P(x)}

to indicate the subset of S consisting of all elements in S which satisfy P. We call


this set builder notation or set comprehension. When P contains symbols that look
like | we use the alternative form {x ∈ S : P(x)} to avoid confusion.

Example A.1.8.

(i) {x ∈ ℤ | x > 5} is the set of all integers greater than 5.

(ii) {x ∈ ℝ | x² = 1} is the set of all real numbers whose square is 1.

(iii) {(a, b) ∈ ℝ² | a³ = b⁵} is the set of all pairs (a, b) ∈ ℝ² such that a³ = b⁵.

(iv) {x ∈ ℝ | x² = −1} = ∅ is empty because there are no choices of x that
satisfy the condition x² = −1.

Remark A.1.9. In many cases it is useful to use set comprehensions that do not
specify the superset from which the elements are selected, for example, {(r, s, t) I
f (r, s, t) = O}. This can be very handy in some situations, but does have some
serious potential pitfalls. First, there is a possibility for misunderstanding. But
even when the meaning seems completely clear, this notation can lead to logical
paradoxes. The most famous of these is Russell's paradox, which concerns the
comprehension R = {A | A is a set, and A ∉ A}. If R itself is a set, then we have
a paradox in the question of whether R contains itself or not.
A proper treatment of these issues is beyond the scope of this appendix and
does not arise in most applications. The interested reader is encouraged to consult
one of the many standard references on set theory and logic, for example, [Hal74] .

Definition A.1.10. There are several standard operations on sets for building new
sets from old.

(i) The union of two sets A and B is

    A ∪ B = {x | x ∈ A or x ∈ B}.

(ii) If 𝒜 is a set of sets, the union of all the sets in 𝒜 is

    ∪_{A∈𝒜} A = {x | x ∈ A for at least one A ∈ 𝒜}.

(iii) The intersection of two sets A and B is

    A ∩ B = {x | x ∈ A and x ∈ B}.

(iv) If 𝒜 is a set of sets, the intersection of all the sets in 𝒜 is

    ∩_{A∈𝒜} A = {x | x ∈ A for every A ∈ 𝒜}.

(v) The set difference A ∖ B is

    A ∖ B = {x ∈ A | x ∉ B}.

(vi) The complement of a subset A ⊂ S is A^c = S ∖ A. Note that writing A^c
only makes sense if the superset S is already given.

(vii) The Cartesian product (or simply the product) A × B of two sets A and B
is the set of all ordered pairs (a, b), where a ∈ A and b ∈ B:

    A × B = {(a, b) | a ∈ A, b ∈ B}.

(viii) If 𝒜 = [A_1, ..., A_n] is a finite (ordered) list of sets, the product of all the
sets in 𝒜 is

    ∏_{A_i∈𝒜} A_i = ∏_{i=1}^{n} A_i = {(x_1, ..., x_n) | x_i ∈ A_i for each i ∈ {1, 2, ..., n}}.

The product S × ··· × S (n factors) of a set S with itself n times is often written S^n.

Proposition A.1.11 (De Morgan's Laws). If A, B ⊂ S, then

    (A ∪ B)^c = A^c ∩ B^c                                            (A.1)

and

    (A ∩ B)^c = A^c ∪ B^c.

More generally, if 𝒜 is a set of sets, then

    ( ∪_{A∈𝒜} A )^c = ∩_{A∈𝒜} A^c

and

    ( ∩_{A∈𝒜} A )^c = ∪_{A∈𝒜} A^c.

Proof. We will prove (A.1). The proofs of the rest of the laws are similar. By
definition, x ∈ (A ∪ B)^c if and only if x ∈ S and x ∉ A ∪ B, which holds if and only
if x ∉ A and x ∉ B. But this is the definition of x ∈ A^c ∩ B^c. ∎
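
These identities are easy to spot-check on finite sets. A minimal sketch in Python (the ambient set S and the subsets A and B are arbitrary examples):

    S = set(range(10))
    A = {0, 1, 2, 3}
    B = {2, 3, 5, 7}

    def complement(X):
        # The complement only makes sense relative to the superset S.
        return S - X

    print(complement(A | B) == complement(A) & complement(B))   # True, the law (A.1)
    print(complement(A & B) == complement(A) | complement(B))   # True, the companion law
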

A.1.2 Relations, Equivalence Relations, and Partitions

Definition A.1.12. A relation on a set A is a subset R c A x A. We say "a


is related to b" when we mean (a, b) E R. We usually choose some symbol (for
example*) to denote the relation and write a* b to denote (a, b) ER.

Example A.1.13.

(i) The less-than symbol defines a relation on ℝ, namely, the subset L =
{(x, y) ∈ ℝ² | x < y} ⊂ ℝ². So we have x < y if (x, y) ∈ L.

(ii) Define a relation on ℤ by the subset

    {(a, b) | a divides b} ⊂ ℤ × ℤ.

It is normal to use the symbol | to denote this relation, so a | b if a
divides b.

(iii) Given any set A, define a relation by E = {(x, x) | x ∈ A} ⊂ A × A.
This matches the usual definition of equality in A. That is, (a, b) ∈ E if
a = b.

(iv) Fix an integer n and define a relation on ℤ by a ≡ b if n divides (a − b).
It is traditional to denote this equivalence relation by a ≡ b (mod n)
and to say that a is congruent (or equivalent) to b modulo n.

(v) Let F be the set of formal symbols of the form a/b, where a, b ∈ ℤ and
b ≠ 0. More precisely, we have F = ℤ × (ℤ ∖ {0}). Define a relation ∼
on F by a/b ∼ c/d if ad = bc. This is the usual relation for equality
of fractions.

As we saw in Example A.l.13(iii), equality defines a relation on any set. In


many situations we'd like to say that certain elements of a set are alike in some
way, but without necessarily saying that they are equal. The definition of "alike"
also defines a relation on the set. The idea of equivalence relation identifies some
key properties that make such relations behave like equality.

Definition A.1.14. Given a relation R ⊂ A × A, denoted by ∼, we say that R (or
∼) is an equivalence relation if the following three properties hold:

(i) (Reflexivity.) For every a ∈ A we have a ∼ a.

(ii) (Symmetry.) Whenever a ∼ b, then b ∼ a.

(iii) (Transitivity.) If a ∼ b and b ∼ c, then a ∼ c.

Unexample A.1.15.

(i) The relation < of Example A.1.13(i) is not an equivalence relation on
ℝ because it fails to be either reflexive or symmetric. It is, however,
transitive.

(ii) The relation | of Example A.1.13(ii) is reflexive and (vacuously) transi-
tive. But it is not an equivalence relation because it fails to be symmetric.

Example A.1.16.

(i) The relation = of Example A.1.13(iii) is an equivalence relation. This is
the example that motivated the definition of equivalence relation.

(ii) The relation ≡ of Example A.1.13(iv) is an equivalence relation. It is
easy to see that it is reflexive and symmetric. To see that it is transitive,
note that if a ≡ b (mod n), then n | (b − a) and b = kn + a for some in-
teger k. If also b ≡ c (mod n), then n | (c − b) and c = ℓn + b for some
integer ℓ. Thus we have c = ℓn + b = ℓn + kn + a, so c − a = n(ℓ + k)
and a ≡ c (mod n).

(iii) The relation on ℂ given by x ∼ y if |x| = |y| is an equivalence relation.

(iv) The relation of Example A.1.13(v) is an equivalence relation.

It is often very useful to think about equivalence relations in terms of the


subsets of equivalent elements it defines.

Definition A.1.17. Given an equivalence relation ∼ on a set S, and given an
element x ∈ S, the equivalence class of x is the set [[x]] = {y ∈ S | y ∼ x}.

Example A.1.18.

(i) Fix any integer n, and let ≡ be the equivalence relation ≡ mod n of
Example A.1.13(iv). The equivalence classes are

    [[0]] = {..., −2n, −n, 0, n, 2n, 3n, ...},
    [[1]] = {..., 1 − 2n, 1 − n, 1, 1 + n, 1 + 2n, 1 + 3n, ...},
    [[2]] = {..., 2 − 2n, 2 − n, 2, 2 + n, 2 + 2n, 2 + 3n, ...},
    ⋮
    [[n − 1]] = {..., −n − 1, −1, n − 1, 2n − 1, 3n − 1, ...},
    [[n]] = {..., −2n, −n, 0, n, 2n, 3n, ...} = [[0]].

That is, for each i ∈ ℤ, we have [[i]] = {x ∈ ℤ : n | (x − i)}, and there
are precisely n distinct equivalence classes. We denote this set of n
equivalence classes by ℤ_n.

As a special case, if we take n = 2, then the two equivalence classes of
ℤ_2 are the even integers and the odd integers.

(ii) If the equivalence relation on S is equality (=), then the equivalence
classes are just the singleton sets [[x]] = {x}.

(iii) Consider the equivalence relation on ℂ given by x ∼ y if |x| = |y|. For
each nonnegative real number r there is exactly one equivalence class
[[r]], consisting of the circle [[r]] = {z ∈ ℂ : |z| = r}. Every complex
number z lies in one of the equivalence classes [[r]], and no two distinct
nonnegative real numbers lie in the same equivalence class.

(iv) For the relation of Example A.1.13(v) the equivalence class of a given
element a/b consists of all equivalent fractions. For example, we
have [[1/2]] = {1/2, 2/4, 3/6, −1/−2, 57/114, ...}.
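
The classes in item (i) can be generated mechanically. A minimal Python sketch (the modulus and the range of integers examined are arbitrary choices):

    def classes_mod(n, lo=-10, hi=10):
        """Group the integers in [lo, hi] into their equivalence classes mod n."""
        classes = {}
        for x in range(lo, hi + 1):
            classes.setdefault(x % n, []).append(x)
        return classes

    for rep, members in sorted(classes_mod(4).items()):
        print(f"[[{rep}]] contains {members}")
    # [[0]] contains [-8, -4, 0, 4, 8], [[1]] contains [-7, -3, 1, 5, 9], and so on.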

The properties of an equivalence relation immediately give the following prop-
erties of equivalence classes.

Proposition A.1.19. Given an equivalence relation ∼ on a set S, we have the
following:

(i) Any x ∈ S is contained in the equivalence class [[x]].

(ii) If y ∈ [[x]], then x ∈ [[y]].

(iii) Any two equivalence classes are either disjoint or equal; that is, if y ∈ [[x]] ∩ [[z]],
then [[x]] = [[z]].

Proof. Items (i) and (ii) follow immediately from the definition of an equivalence
relation. To prove item (iii), note that for any a ∈ [[x]] we have a ∼ x and y ∼ x, so
by symmetry and transitivity a ∼ y. But because y ∈ [[z]], we also have y ∼ z and
hence a ∼ z, which shows that a ∈ [[z]]. This shows that [[x]] ⊂ [[z]]. An essentially
identical argument shows that [[z]] ⊂ [[x]], and hence that [[x]] = [[z]]. ∎

Equivalence classes divide a set into disjoint subsets, which we call a
partition.

Definition A.1.20. A partition of a set S is a set 𝒜 of nonempty disjoint subsets
of S such that

(i) S = ∪_{A∈𝒜} A and

(ii) for every A, B ∈ 𝒜, if A ≠ B, then A ∩ B = ∅.

Example A.1.21.

(i) The integers have a partition into the even and odd integers. The
partition is 𝒜 = {E, O} with E = {..., −4, −2, 0, 2, 4, ...} and O =
{..., −3, −1, 1, 3, 5, ...}. This is a partition because ℤ = E ∪ O and
E ∩ O = ∅.

(ii) Taking 𝒜 = {S} gives a (not very interesting) partition of S into just
one set: itself.

(iii) If ∼ is any equivalence relation on a set S, then the equivalence classes
of ∼ define a partition of S by 𝒜 = {[[x]] | x ∈ S}. The first condition,
that ∪_{A∈𝒜} A = S, follows immediately from Proposition A.1.19(i). The
fact that the equivalence classes are disjoint is Proposition A.1.19(iii).

(iv) Taking 𝒜 = {{x} | x ∈ S} gives a partition of S into singleton sets.
This is the partition induced by the equivalence relation = of
Example A.1.13(iii).

(v) For any fixed integer n, the collection 𝒜 = ℤ_n = {[[0]], [[1]], ..., [[n − 1]]},
with [[i]] = {x ∈ ℤ : n | (x − i)}, is a partition of ℤ because ∪_{i=0}^{n−1} [[i]] = ℤ
and [[i]] ∩ [[j]] = ∅ for every i ≠ j. This is exactly the partition into
equivalence classes (mod n) described in Example A.1.18(i).

Any partition defines an equivalence relation, just as any equivalence relation
determines a partition into equivalence classes. It is not hard to see that partitions
and equivalence relations are equivalent; that is, each is completely determined by
the other.

Proposition A.1.22. A partition 𝒜 of a set S defines an equivalence relation on
S by x ∼ y if and only if x, y ∈ A for some A ∈ 𝒜. That is, x ∼ y if and only if
they both lie in the same part. Moreover, a partition is completely determined by the
equivalence relation it defines, and an equivalence relation is completely determined
by the corresponding partition into equivalence classes.

Proof. Given a partition 𝒜 of S, let ∼ be the induced relation. It is immediate
that ∼ is both reflexive and symmetric. If x ∼ y and y ∼ z, then there is a part
A ∈ 𝒜 such that x, y ∈ A, and a part B ∈ 𝒜 such that y, z ∈ B, but since partitions
are disjoint, the fact that y ∈ A and y ∈ B implies that A = B, so x, y, z ∈ A = B
and x ∼ z. Thus ∼ is an equivalence relation.

Given any two partitions 𝒜 ≠ ℬ, there must be at least one nonempty set
A ∈ 𝒜 ∖ ℬ. Let x ∈ A. Since the partition ℬ covers all of S, we must have some
B ∈ ℬ such that x ∈ B. Moreover, since A ∉ ℬ, we must have A ≠ B; and so there
must be some y ∈ A ∖ B or there must be a z ∈ B ∖ A. Without loss of generality,
assume y ∈ A ∖ B. Then we have x ∼_𝒜 y but x ≁_ℬ y, so the induced equivalence
relations are not the same.

Conversely, given any two equivalence relations ∼ and ≡, if they are not the
same, there must be a pair of elements x, y such that x ∼ y but x ≢ y. This implies
that y is an element of the ∼ equivalence class [[x]]_∼ of x, but y is not in the ≡
equivalence class [[x]]_≡ of x. Thus the corresponding partitions are not equal. ∎

A.2 Functions
Functions are a fundamental notion in mathematics. The formal definition of a
function is given in terms of what is often called the graph of the function.

A.2.1 Basic Properties of Functions

Definition A.2.1. A function (also called a map or a mapping) f : X → Y is
a relation Γ_f ⊂ X × Y such that for every x ∈ X there is a unique y ∈ Y such
that (x, y) ∈ Γ_f. For each x, we denote this uniquely determined y by f(x). It is
traditional to call the set Γ_f the graph of f, although formally speaking, the graph
is the function.

We also have the following sets determined by the function f:

(i) The set X is the domain of f.

(ii) The set Y is the codomain of f.

(iii) The set f(X) = {y ∈ Y | ∃x ∈ X s.t. f(x) = y} is the range or image of f.

(iv) For any subset X′ ⊂ X, we call the set f(X′) = {y ∈ Y | ∃x′ ∈ X′ s.t. y = f(x′)}
the image of X′.

(v) For any y ∈ Y, the set f⁻¹(y) = {x ∈ X | f(x) = y} is the level set of f at y
or the preimage of y.

(vi) For any Y′ ⊂ Y, we call the set f⁻¹(Y′) = {x ∈ X | f(x) ∈ Y′} the preimage
of Y′.

(vii) If 0 ∈ Y (for example, if Y = ℝ or Y = ℂ^n), then the set {x ∈ X | f(x) ≠ 0}
is called the support of f.

Notation A.2.2. As an alternative to the notation f(x) = y, it is common to
write f : x ↦ y.

The following proposition is immediate from the definition but still very useful.

Proposition A.2.3. Two functions f, g are equal if and only if they have the same
domain D and for each x ∈ D we have f(x) = g(x).

The following proposition is also immediate from the definitions.

Proposition A.2.4. Let f : X → Y be a function. For any R, S ⊂ X and
T, U ⊂ Y, the following hold:

(i) If R ⊂ S, then f(R) ⊂ f(S).

(ii) f(R ∪ S) = f(R) ∪ f(S).

Figure A.1. The composition g ∘ f of functions f and g, as described
in Definition A.2.5.

(iii) f(R ∩ S) ⊂ f(R) ∩ f(S).

(iv) R ⊂ f⁻¹(f(R)).

(v) If T ⊂ U, then f⁻¹(T) ⊂ f⁻¹(U).

(vi) f⁻¹(T ∪ U) = f⁻¹(T) ∪ f⁻¹(U).

(vii) f⁻¹(T ∩ U) = f⁻¹(T) ∩ f⁻¹(U).

(viii) f(f⁻¹(T)) ⊂ T.

(ix) f⁻¹(T^c) = f⁻¹(T)^c.

Definition A.2.5. The composition of two functions f : X → Y and g : Y → Z is
the function g ∘ f : X → Z given by

    Γ_{g∘f} = {(x, z) | ∃y ∈ Y s.t. (x, y) ∈ Γ_f and (y, z) ∈ Γ_g}.

This is illustrated in Figure A.1.

Proposition A.2.6. The composition g ∘ f of functions f : X → Y and g : Y → Z
is a function g ∘ f : X → Z. Also, composition of functions is associative, meaning
that given f : X → Y, g : Y → Z, and h : Z → W, we have

    h ∘ (g ∘ f) = (h ∘ g) ∘ f.

Proof. Because f is a function, given any x ∈ X, there is a unique y ∈ Y such that
(x, y) ∈ Γ_f. Since g is a function, there is a unique z ∈ Z such that (y, z) ∈ Γ_g;
hence there is a unique z ∈ Z such that (x, z) ∈ Γ_{g∘f}.

For the second part of the proposition, note that by Proposition A.2.3 it
suffices to show that for any x ∈ X, we have (h ∘ (g ∘ f))(x) = ((h ∘ g) ∘ f)(x). But
by definition, we have

    (h ∘ (g ∘ f))(x) = h((g ∘ f)(x)) = h(g(f(x))) = (h ∘ g)(f(x)) = ((h ∘ g) ∘ f)(x). ∎

Figure A.2. Injective, surjective, and bijective maps, as described in
Definition A.2.7. The left-hand map f is injective but not surjective. The center
map g is surjective but not injective. The right-hand map h is bijective.

A.2.2 Injective and Surjective

Definition A.2.7. A function f : X → Y is called

(i) injective if f(x) = f(x′) implies x = x′;

(ii) surjective if for any y ∈ Y there exists an x ∈ X such that f(x) = y;

(iii) bijective if it is both injective and surjective.

See Figure A.2 for an illustration of these ideas.

Nota Bene A.2.8. Many people use the phrase one-to-one to mean injec-
tive and onto to mean surjective. But the phrase one-to-one is misleading
and tends to make students mistakenly think of the uniqueness condition for
a function rather than the correct meaning of injective. Moreover, most stu-
dents who have encountered the phrase one-to-one before have heard various
intuitive definitions for this phrase that are too sloppy to actually use for
proofs and that tend to get them into logical trouble. We urge readers to
carefully avoid the use of the phrases one-to-one and onto as well as avoid
all the intuitive definitions you may have heard for those phrases. Instead,
we recommend you to use only the terms injective and surjective and their
formal definitions.

Proposition A.2.9. Given functions f : X → Y and g : Y → Z, the following
hold:

(i) If f and g are both injective, then g ∘ f is injective.

(ii) If f and g are both surjective, then g ∘ f is surjective.

(iii) If f and g are both bijective, then g ∘ f is bijective.

(iv) If g ∘ f is injective, then f is injective.

(v) If g ∘ f is surjective, then g is surjective.

(vi) If g ∘ f is bijective, then g is surjective and f is injective.

Proof.

(i) If x, x′ ∈ X and g(f(x)) = (g ∘ f)(x) = (g ∘ f)(x′) = g(f(x′)), then since g
is injective, we have that f(x) = f(x′), but since f is injective, we must have
x = x′.

(ii) For any z ∈ Z, because g is surjective, there is a y ∈ Y such that g(y) = z.
Because f is surjective, there is an x ∈ X such that f(x) = y, which implies
that (g ∘ f)(x) = g(f(x)) = g(y) = z.

(iii) Follows immediately from (i) and (ii).

(iv) If x, x′ ∈ X and f(x) = f(x′), then (g ∘ f)(x) = g(f(x)) = g(f(x′)) =
(g ∘ f)(x′). Since g ∘ f is injective, we must have that x = x′.

(v) If z ∈ Z, because g ∘ f is surjective, there exists an x ∈ X such that g(f(x)) =
(g ∘ f)(x) = z, so letting y = f(x) ∈ Y, we have g(y) = z.

(vi) Follows immediately from (iv) and (v). ∎
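
For maps between finite sets these properties can be tested by enumeration. A minimal Python sketch (the sets and maps below are arbitrary examples):

    def is_injective(f, X):
        images = [f(x) for x in X]
        return len(images) == len(set(images))

    def is_surjective(f, X, Y):
        return {f(x) for x in X} == set(Y)

    X, Y = {0, 1, 2}, {0, 1}
    g = lambda x: x % 2        # surjective onto Y but not injective
    h = lambda x: x + 10       # injective but not surjective onto {10, 11, 12, 13}

    print(is_injective(g, X), is_surjective(g, X, Y))                 # False True
    print(is_injective(h, X), is_surjective(h, X, {10, 11, 12, 13}))  # True False
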

To conclude this section, we describe a useful and common form of notation
called a commutative diagram.

Definition A.2.10. We say that the diagram

    [square diagram with maps f : A → B, f′ : A → C, g : B → D, and g′ : C → D]

commutes if g ∘ f = g′ ∘ f′. Similarly, we say that the diagram

    [triangle diagram with maps u : R → S, w : S → T, and v : R → T]

commutes if w ∘ u = v.

A.2.3 Products and Projections

When we defined Cartesian products of sets in Definition A.1.10(viii), we only did
it for finite lists of sets. We can now define infinite products.

Definition A.2.11. Given any set I (called the index set), and given any list of
sets 𝒜 = [A_α | α ∈ I] indexed by I, we define the Cartesian product of the sets in
𝒜 to be the set

    ∏_{α∈I} A_α = {φ : I → ∪_{α∈I} A_α | φ(α) ∈ A_α for all α ∈ I}.

Remark A.2.12. Note that the list 𝒜 is not necessarily a set itself, because we
permit duplicate elements. That is, we wish to allow constructions like S × S, which
corresponds to the list [S, S], whereas as a set this list would have only one element
{S, S} = {S}.

Example A.2.13.

(i) If the index set is I = {1, 2}, then ∏_{i∈I} A_i = {f : {1, 2} → A_1 ∪
A_2 s.t. f(i) ∈ A_i} = {(a_1, a_2) | a_1 ∈ A_1, a_2 ∈ A_2} = A_1 × A_2, so this
definition is equivalent to our earlier definition in the case that the index
set has only two elements.

(ii) Following the outline of the previous example, one can show that if the
index set is finite, then the new definition of a product is equivalent to
our earlier definition for a Cartesian product of a finite number of sets.

(iii) For a nonempty indexing set I, if there exists α ∈ I such that A_α = ∅,
then ∏_{β∈I} A_β = ∅.

Cartesian products have natural maps called projections mapping out of them.

Definition A.2.14. Given sets A and B, define the projections P_1 : A × B → A
and P_2 : A × B → B by P_1(a, b) = a and P_2(a, b) = b.

More generally, given any ordered list 𝒜 = [A_i | i ∈ I] of sets indexed by a
set I, the ith projection P_i is a map ∏_{i∈I} A_i → A_i defined by P_i(φ) = φ(i).

Proposition A.2.15. The product is universal in the following sense: given any
other set T with functions q_i : T → A_i for every i ∈ I, there exists a unique function
f : T → ∏_{j∈I} A_j such that for every i ∈ I we have q_i = P_i ∘ f.

    [triangle diagram with maps f : T → ∏_{j∈I} A_j, P_i : ∏_{j∈I} A_j → A_i, and q_i : T → A_i]

Note that the same map f makes the above diagram commute for all i ∈ I.

Proof. For each t ∈ T, let f(t) ∈ ∏_{j∈I} A_j be given by f(t)(j) = q_j(t) ∈ A_j. Clearly
f(t) ∈ ∏_{j∈I} A_j. Moreover, (P_i ∘ f)(t) = f(t)(i) = q_i(t), as desired. Finally, to show
uniqueness, note that if there is another g : T → ∏_{j∈I} A_j with (P_i ∘ g)(t) = q_i(t)
for every i ∈ I and every t ∈ T, then for each t ∈ T and for every i ∈ I, we have
g(t)(i) = q_i(t) = f(t)(i); so g(t) = f(t) for every t ∈ T. Hence g = f. ∎

A.2.4 Pitfall: Well-Defined Functions

There are some potential pitfalls to defining functions: it is very easy to write down
a rule that seems like it should be a function, but which is ambiguous in some way or
other. For example, if we define f : ℚ × ℚ × ℚ → ℚ to be the rule f(a, b, c) = a/b/c,
it is unclear what is meant, because division is a binary operation: division takes
only two inputs at a time. The expression a/b/c could mean either (a/b)/c or
a/(b/c), and since those expressions give different results, f is not a function. It is
traditional to say that the function f is "not well defined." This phrase is an abuse
of language, because f is not a function at all, but everyone uses this phrase instead
of the more correct "f is not a function."

This pitfall also occurs when we try to define functions from the set of equiv-
alence classes on a set. It is often convenient to use specific representatives of
the equivalence class to define the function. But this can lead to ambiguity which
prevents the supposed function from being well defined (i.e., from being a function).

For example, consider the partition of ℤ into the equivalence classes of ℤ_4, as
in Example A.1.18(i). We have four equivalence classes:

    [[0]]_4 = {..., −4, 0, 4, 8, ...},
    [[1]]_4 = {..., −7, −3, 1, 5, ...},
    [[2]]_4 = {..., −6, −2, 2, 6, ...},
    [[3]]_4 = {..., −5, −1, 3, 7, ...}.

Similarly, the partition ℤ_3 has three equivalence classes:

    [[0]]_3 = {..., −6, −3, 0, 3, 6, ...},
    [[1]]_3 = {..., −5, −2, 1, 4, 7, ...},
    [[2]]_3 = {..., −4, −1, 2, 5, 8, ...}.

We might try to define a function g : ℤ_4 → ℤ_3 by g([[a]]_4) = [[a]]_3, so we should
have

    g([[0]]_4) = [[0]]_3,

but also [[0]]_4 = [[4]]_4, so

    g([[4]]_4) = [[4]]_3 = [[1]]_3 ≠ [[0]]_3 = g([[0]]_4).

So the rule for g is ambiguous: it is unclear which of the possible values for g([[0]]_4)
the rule is specifying. To be a function, it must unambiguously identify a unique
choice for each input. Therefore, g is not a function. Again we abuse language and
say "the function g is not well defined."

On the other hand, we may define a function f : ℤ_4 → ℤ_2 by f([[a]]_4) = [[a]]_2,
which gives

    f([[0]]_4) = [[0]]_2 = {Even integers},
    f([[1]]_4) = [[1]]_2 = {Odd integers},
    f([[2]]_4) = [[2]]_2 = {Even integers},
    f([[3]]_4) = [[3]]_2 = {Odd integers}.

In this case f is well defined, because any time [[a]] = [[a′]] we have a′ ≡ a (mod 4),
which implies that 4 | (a′ − a) and hence 2 | (a′ − a), so the equivalence classes (mod 2)
are the same: [[a]]_2 = [[a′]]_2.

Whenever we wish to define a function f whose domain is a set of equivalence
classes, if the definition is given in terms of specific representatives of the equivalence
classes, then we must check that the function is well defined. That means we must
check that using two different representatives a and a′ of the same equivalence class
[[a]] = [[a′]] gives f([[a]]) = f([[a′]]).

Example A.2.16.

(i) For any n ∈ ℤ, define a function ⊕ : ℤ_n × ℤ_n → ℤ_n by [[a]] ⊕ [[b]] = [[a + b]].
To check that it is well defined, we must check that if [[a]] = [[a′]] and
[[b]] = [[b′]], then [[a + b]] = [[a′ + b′]]. We have a′ = a + kn and b′ = b + ℓn
for some k, ℓ ∈ ℤ, so a′ + b′ = a + b + kn + ℓn, and thus [[a′ + b′]] = [[a + b]],
as required.

(ii) For any n ∈ ℤ, define a function ⊙ : ℤ_n × ℤ_n → ℤ_n by [[a]] ⊙ [[b]] = [[ab]].
Again, to check that it is well defined, we must check that if [[a]] = [[a′]]
and [[b]] = [[b′]], then [[ab]] = [[a′b′]]. We have a′ = a + kn and b′ = b + ℓn
for some k, ℓ ∈ ℤ, so a′b′ = (a + kn)(b + ℓn) = ab + n(bk + aℓ + kℓn), and
thus [[a′b′]] = [[ab]], as required.

(iii) Given n, m ∈ ℤ, try to define a function f : ℤ_n → ℤ_m by f([[a]]_n) =
[[a]]_m. If m does not divide n, then this function is not well defined
because [[0]]_n = [[n]]_n, but f([[0]]_n) = [[0]]_m ≠ [[n]]_m = f([[n]]_n) because
m ∤ (n − 0).

(iv) The previous example does give a well-defined function if m divides n
because f([[a + kn]]_n) = [[a + kn]]_m = [[a]]_m = f([[a]]_n).

(v) If F is the set of equivalence classes of fractions, as in Example A.1.13(v),
then we define a function p : F × F → F by p(a/b, c/d) = (ad + bc)/(bd).
This is the function we usually call "+" for fractions. To check that it
is well defined, we must check that for any a′/b′ ∼ a/b and c′/d′ ∼ c/d
we have p(a′/b′, c′/d′) ∼ p(a/b, c/d). Although this is not immediately
obvious, it requires only basic arithmetic to check, and we leave the
computation to the reader.
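
The well-definedness check described in this section can be automated when the sets of equivalence classes are finite. A minimal Python sketch that tests the rule [[a]]_n ↦ [[a]]_m on many representatives of each class (the range of representatives is an arbitrary choice):

    def rule_is_well_defined(n, m, reps=range(-20, 21)):
        """Is the rule [[a]]_n -> [[a]]_m unambiguous on the given representatives?"""
        outputs = {}   # class of a mod n  ->  set of resulting classes mod m
        for a in reps:
            outputs.setdefault(a % n, set()).add(a % m)
        return all(len(images) == 1 for images in outputs.values())

    print(rule_is_well_defined(4, 2))   # True: 2 divides 4, as in the text
    print(rule_is_well_defined(4, 3))   # False: the rule is ambiguous, as in the text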

Remark A.2.17. If a rule is already known to be a function, or if it is obviously


unambiguously defined, then you need not check that it is well defined. In particular,
if a function is the composition of two other (well-defined) functions, then it is
already known to be a function (see Proposition A.2.6), and therefore it is already
well defined.

A.2.5 Inverses

Definition A.2.18. Let f : X → Y be a function.

(i) The function f is right invertible if there exists a right inverse, that is, if
there exists g : Y → X such that f ∘ g = I_Y, where I_Y : Y → Y denotes the
identity map on Y.

(ii) The function f is left invertible if there exists a left inverse, that is, if there
exists g : Y → X such that g ∘ f = I_X, where I_X : X → X denotes the identity
map on X.

Theorem A.2.19. Let f : X → Y be given.

(i) The function f is right invertible if and only if it is surjective.

(ii) The function f is left invertible if and only if it is injective.

(iii) The function f is bijective (both surjective and injective) if and only if there
is a function f⁻¹ which is both a left inverse and a right inverse. In this case,
the function f⁻¹ is the unique inverse, and we say that f is an invertible
function.

Proof.

(i) (⇒) Let y ∈ Y, and let g : Y → X be a right inverse of f. Define x = g(y).
This implies f(x) = f(g(y)) = y. Hence, f is surjective.

(⇐) Since f is surjective, each set in ℬ = {f⁻¹(y)}_{y∈Y} is nonempty. By the
axiom of choice, there exists φ : ℬ → ∪_{y∈Y} f⁻¹(y) = X such that φ(f⁻¹(y)) ∈
f⁻¹(y) for each y ∈ Y. Define g : Y → X as g(y) = φ(f⁻¹(y)), so that
f(g(y)) = y for each y ∈ Y.

(ii) (⇒) If f(x_1) = f(x_2), then g(f(x_1)) = g(f(x_2)), which implies that x_1 = x_2.
Hence, f is injective.

(⇐) Choose x_0 ∈ X. Define g : Y → X by

    g(y) = { x     if there exists x such that f(x) = y,
           { x_0   otherwise.

This is well defined since f is injective (f(x_1) = f(x_2) implies x_1 = x_2). Thus,
g(f(x)) = x for each x ∈ X.

(iii) (⇒) Since f is bijective, there exists both a left inverse and a right inverse.
Let f^{−L} be any left inverse, and let f^{−R} be any right inverse. For all y ∈ Y
we have f^{−L}(y) = (f^{−L} ∘ (f ∘ f^{−R}))(y) = ((f^{−L} ∘ f) ∘ f^{−R})(y) = f^{−R}(y), so
every left inverse is equal to every right inverse.

(⇐) Since f⁻¹ is a left inverse, f is injective, and since f⁻¹ is a right inverse,
f is surjective. ∎
Corollary A.2.20. Let f : X → Y and g : Y → X be given.

(i) If g is a bijection and g(f(x)) = x for all x ∈ X, then f = g⁻¹.

(ii) If g is a bijection and f(g(y)) = y for all y ∈ Y, then f = g⁻¹.

Proof.

(i) Since g is a bijection, we have that g⁻¹ : X → Y exists, and thus, g⁻¹(g(y)) =
y for all y ∈ Y. Thus, for all x ∈ X, we have that g⁻¹(g(f(x))) = f(x) and
g⁻¹(g(f(x))) = g⁻¹(x).

(ii) Since g is a bijection, we have that g⁻¹ : X → Y exists, and thus, g(g⁻¹(x)) =
x for all x ∈ X. Thus, for all x ∈ X, we have that f(g(g⁻¹(x))) = f(x) and
f(g(g⁻¹(x))) = g⁻¹(x). ∎

Corollary A.2.21. Let f : X → Y. If f is invertible, then (f⁻¹)⁻¹ = f.

Proof. Since f⁻¹ exists, we have that f⁻¹(f(x)) = x for all x ∈ X and f(f⁻¹(y)) =
y for all y ∈ Y. Thus, f is the left and right inverse of f⁻¹, which implies that
(f⁻¹)⁻¹ exists and equals f. ∎

Example A.2.22. The fundamental theorem of calculus says that the map
Int : C^{n−1}([a, b]; 𝔽) → C^n([a, b]; 𝔽) given by Int f = ∫_a^x f(t) dt is a right inverse
of the map D = d/dx : C^n([a, b]; 𝔽) → C^{n−1}([a, b]; 𝔽) because D ∘ Int = Id_{C^{n−1}}
is the identity. This shows that D is surjective and Int is injective. Because
D is not injective, it does not have a left inverse. But it has infinitely many
right inverses: for any constant C ∈ 𝔽 the map Int + C is also a right inverse
to D.
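
For functions between finite sets, the constructions in the proof of Theorem A.2.19 can be carried out explicitly (no appeal to the axiom of choice is needed). A minimal Python sketch with an arbitrary surjective example map:

    X = [0, 1, 2, 3]
    Y = ['a', 'b']
    f = {0: 'a', 1: 'b', 2: 'a', 3: 'b'}    # a surjective, non-injective map X -> Y

    # Build a right inverse g by choosing one element of each preimage f^{-1}(y).
    g = {y: next(x for x in X if f[x] == y) for y in Y}

    print(all(f[g[y]] == y for y in Y))     # True:  f o g is the identity on Y
    print(any(g[f[x]] != x for x in X))     # True:  g o f is not the identity, since f is not injective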

A.3 Orderings
A.3.1 Total Orderings

Definition A.3.1. A relation (<) on the set S is a total ordering (and the set is
called totally ordered) if for all x, y, z ∈ S, we have the following:

(i) Trichotomy: If x, y ∈ S, then exactly one of the following relations holds:
x < y, x = y, y < x.

(ii) Transitivity: If x < y and y < z, then x < z.

The relation x ≤ y means that x < y or x = y and is the negation of y < x. A total
ordering is also often called a linear ordering.

Definition A.3.2. A set S with a total ordering < is well ordered if every
nonempty subset X ⊂ S has a least element; that is, an element x ∈ X such
that x ≤ y for every y ∈ X.

We take the following as an axiom of the natural numbers.

Axiom A.3.3 (Well-Ordering Axiom for Natural Numbers). The natural


numbers with the usual ordering < are well ordered.

The well-ordering axiom (WOA) allows us to speak of the first element of a


set that does, or does not, satisfy some property.

Example A.3.4. A prime number is an integer greater than 1 that has no
divisors other than itself and 1. We can use the WOA to show that every
positive integer greater than 1 has a prime divisor. Let

    S = {n ∈ ℤ | n > 1 and n has no prime divisor}.

By the WOA, if S is not empty, there is a least element s ∈ S. If s is prime,
then s divides itself, and the claim is satisfied. If s is not prime, then it must
have a divisor d with 1 < d < s. Since d ∉ S, there is a prime p that divides d,
but now p also divides s, since p divides a divisor of s. This is a contradiction,
so S must be empty.

Remark A.3.5. Note that the set ℤ of all integers is not well ordered, since it does
not have a least element. However, for any n ∈ ℤ, the set ℤ_{>n} = {x ∈ ℤ | x > n}
is well ordered, since we can define a bijective map f : ℤ_{>n} → ℕ to the natural
numbers, f(x) = x − n − 1, which preserves ordering. In particular, the positive
integers ℤ_{>0} = ℤ⁺ are well ordered.

A.3.2 Induction
The principle of mathematical induction provides an extremely powerful way to
prove many theorems. It follows from the WOA of the natural numbers.

Theorem A.3.6 (Principle of Mathematical Induction). Given any property
P, in order to prove that P holds for all positive integers, it suffices to
prove that

(i) P holds for 1;

(ii) for every k ∈ ℤ⁺, if P holds for k, then P holds for k + 1.

The first step, that P holds for 1, is called the base case, and the second step is
called the inductive step. The statement P holds for the number k is called the
induction hypothesis.

Proof. Assume that both the base case and the inductive step have been verified
for the property P. Let S = {x ∈ ℤ⁺ | P does not hold for x} be the set of all
positive integers for which the property P does not hold. If S is not empty, then
by the WOA it must have a least element s ∈ S. By the base case of the induction,
we know that s > 1. Let k = s − 1. Since k < s, we have that P holds for k. Since
k > 0, the inductive step implies that P also holds for k + 1 = s, a contradiction.
Therefore S is empty, and P holds for all n ∈ ℤ⁺. ∎

Example A.3.7.

(i) We use induction to prove for any positive integer n that 1 + 3 + ··· +
(2n − 1) = n². The base case (2·1 − 1) = 1 = 1² is immediate. Now,
given the induction hypothesis that 1 + 3 + ··· + (2k − 1) = k², we have
1 + 3 + ··· + (2k − 1) + (2(k + 1) − 1) = k² + (2(k + 1) − 1) = k² + 2k + 1 =
(k + 1)², as required. So the claim holds for all positive integers n.

(ii) We use induction to prove for any integer n > 0 that Σ_{i=1}^{n} i = (n + 1)n/2.
The base case Σ_{i=1}^{1} i = 1 is immediate. Now, given the inductive hy-
pothesis that Σ_{i=1}^{k} i = (k + 1)k/2, we have Σ_{i=1}^{k+1} i = (k + 1) + Σ_{i=1}^{k} i =
(k + 1) + (k + 1)k/2 = (k + 1)(k + 2)/2, as required. So the claim holds for all
positive integers n.
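
Induction establishes these identities for every n; a quick numerical spot-check in Python for the first few values:

    for n in range(1, 8):
        assert sum(2 * i - 1 for i in range(1, n + 1)) == n ** 2
        assert sum(range(1, n + 1)) == (n + 1) * n // 2
    print("both identities hold for n = 1, ..., 7")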

Corollary A.3.8. The principle of induction can be applied to any set of the form

    ℕ + a = {x ∈ ℤ | x ≥ a}

by starting the base case at a. That is, a property P holds for all of ℕ + a if

(i) P holds for a;

(ii) for every k ∈ ℕ + a, if P holds for k, then P holds for k + 1.

Proof. Let P′(n) be the property P(n − 1 + a). Application of the usual induction
principle to P′ is equivalent to our new induction principle for P on ℕ + a. ∎

Definition A.3.9. An element b in a totally ordered set S is an upper bound of
the set E ⊂ S if x ≤ b for all x ∈ E. Similarly, an element a ∈ S is a lower bound
of the set E ⊂ S if a ≤ x for all x ∈ E. If E has an upper (respectively, lower)
bound, we say that E is bounded above (respectively, below).

If b is an upper bound for E ⊂ S that has the additional property that c < b
implies that c is not an upper bound of E, then b is called the least upper bound of
E or the supremum of E and is denoted sup E.

Similarly, if a is a lower bound of E such that a < d implies that d is not a
lower bound of E, then we call a the greatest lower bound or the infimum of E and
denote it by inf E.

Both inf E and sup E are unique, if they exist.

Definition A.3.10. An ordered set is said to have the least upper bound property,
or l.u.b. property, if every nonempty subset that is bounded above has a least upper
bound (supremum).

Example A.3.11.

(i) The set ℤ of integers has the l.u.b. property. To see this, let S ⊂ ℤ be any
subset that is bounded above, and let T = {x ∈ ℤ | x ≥ s for all s ∈ S} be the
set of all upper bounds for S. Because T is not empty (S is bounded),
the WOA guarantees that T has a least element t. The element t is an
l.u.b. for S.

(ii) The set ℚ of rational numbers does not have the l.u.b. property; for
example, the set E = {x ∈ ℚ | x² < 2} ⊂ ℚ does not have an l.u.b. in
ℚ, but the l.u.b. for E does exist in ℝ, namely, √2.

(iii) The real numbers ℝ have the l.u.b. property. We take this as an axiom
of the real numbers.

Theorem A.3.12. Let S be an ordered set with the l.u.b. property. If a nonempty
set E ⊂ S is bounded below, then its infimum exists in S.

Proof. Assume that E ⊂ S is nonempty and bounded below. Let L denote the set
of all lower bounds of E, which is nonempty by hypothesis. Since E is nonempty,
then any given x ∈ E is an upper bound for L, and thus a = sup L exists in S. We
claim that a = inf E. Suppose not. Then either a is not a lower bound of E or a is
a lower bound, but not the greatest lower bound. If the former is true, then there
exists x ∈ E such that x < a, but this contradicts a as the l.u.b. of L since x would
also be an upper bound of L. If the latter is true, then there exists c ∈ S such that
a < c and c is a lower bound of E. This is a contradiction because c ∈ L, and thus
a cannot be the l.u.b. of L. Thus, a = inf E. ∎

A.3.3 Partial Orderings

Definition A.3.13. A partial ordering of a nonempty set A is a relation (≤) on
A that is reflexive (x ≤ x for all x ∈ A), transitive (x ≤ y and y ≤ z ⇒ x ≤ z), and
antisymmetric (x ≤ y and y ≤ x ⇒ x = y). A partially ordered set (or poset, for
short) is a set with a partial order.

Remark A.3.14. Two elements x, y of a partially ordered set are comparable if
x ≤ y or y ≤ x. A partial ordering does not require every pair of elements to be
comparable, but in the special case that every pair of elements is comparable, then
≤ defines a total (linear) ordering; see Section A.3.1.

Definition A.3.15. Let (X, ≤) be a partially ordered set. If for some a ∈ X we
have x ≤ a for every x that is comparable to a, we say a is a maximal element of
X. If x ≤ a for all x ∈ X, we say that a is the greatest element.

Similarly, if for some b E X we have b :S x for every x that is comparable to


b, we say b is a minimal element of X. If b :S x for all x E X, we say that b is the
least element.

A.4 Zorn's Lemma, the Axiom of Choice, and Well Ordering

We saw in Example A.2.13(iii) that if A_α = ∅ for some α ∈ I, then ∏_{α∈I} A_α = ∅.
The converse states that if A_α ≠ ∅ for all α ∈ I, then ∏_{α∈I} A_α ≠ ∅. This statement,
however, is independent of the regular axioms of set theory, yet it stands as one of
the most fundamental ideas of mathematics. It is called the axiom of choice.

Theorem A.4.1 (Axiom of Choice). Let I be a set, and let 𝒜 = {X_α}_{α∈I} be a
set of nonempty sets indexed by I. There is a function f : I → ⋃_{α∈I} X_α such that
f(α) ∈ X_α for all α ∈ I.

The axiom of choice has two other useful equivalent formulations-the well-
ordering principle and Zorn's lemma.

Theorem A.4.2 (Well-Ordering Principle). If A is a nonempty set, then there
exists a total (linear) ordering ≤ of A such that (A, ≤) is well ordered.

Remark A.4.3. It is important to note that the choice of ordering may well differ
from the natural order on the set. For example, ℝ has a natural choice of total order
on it, but it is not true that ℝ with the usual ordering is well ordered: the subset
(0, 1) has no least element. The ordering that makes ℝ well ordered is entirely
different, and it is not at all obvious what that ordering is.

The third equivalent formulation is Zorn's lemma.

Theorem A.4.4 (Zorn's Lemma). Let (X, ≤) be a partially ordered set. If every
chain in X has an upper bound in X, then X contains a maximal element.

By chain we mean any nonempty subset C ⊂ X such that C is totally ordered,
that is, for every α, β ∈ C we have either α ≤ β or β ≤ α. A chain C is said to
have an upper bound in X if there is an element x ∈ X such that α ≤ x for every
α ∈ C.

Theorem A.4.5. The following three statements are equivalent:

(i) The axiom of choice.

(ii) The well-ordering principle.

(iii) Zorn's lemma.



We do not prove this theorem here, as it would take us too far afield. The
interested reader may consult [Hal74].

Definition A.4.6. Let (X, ≤) be a well-ordered set. For each y ∈ X, define
s(y) = {x ∈ X | x < y}, and define s̄(y) to be s(y) ∪ {y}. We say that a subset S ⊆ X
is a segment of X if either S = X or there exists y ∈ X such that S = s(y). If the
complement s̄(y)ᶜ = X ∖ s̄(y) is nonempty, we call the least element of s̄(y)ᶜ the
immediate successor of y.

Example A.4.7.

(i) If z is the immediate successor of y, then we have s(z) = s̄(y).

(ii) Every element of X except the greatest element (if it exists) has an
immediate successor.

A.5 Cardinality
It is often useful to think about the size of a set-we call this the cardinality of
the set. In the case that the set is finite, the cardinality is simply the number of
elements in the set. In the case that the set is infinite, cardinality is more subtle.

Definition A.5.1. We say that two sets A and B have the same cardinality if
there exists a bijection f : A → B.

Definition A.5.2. It is traditional to use the symbol ℵ₀ to denote the cardinality
of the integers. A set with this cardinality is countable. Any set with cardinality
strictly greater than ℵ₀ is called uncountable. If a set A has the same cardinality
as the set {1, 2, …, n}, we say that the cardinality of A is n. We often write |A| to
denote the cardinality of A, so

|ℤ| = ℵ₀   and   |{1, 2, …, n}| = n.

Example A.5.3.

(i) The cardinality of the even integers 2ℤ is the same as that of ℤ. This
follows from the bijection g : ℤ → 2ℤ given by g(n) = 2n, which is
bijective because it has a (left and right) inverse g⁻¹ : 2ℤ → ℤ given by
g⁻¹(m) = m/2.

(ii) The cardinality of the set ℤ⁺ of positive integers is the same as the
cardinality of the set of integers (it has cardinality ℵ₀). To see this, use
the function h : ℤ⁺ → ℤ given by h(2n) = n and h(2n + 1) = −n. This
function is clearly bijective, so the cardinality of ℤ⁺ is ℵ₀.

(iii) The set ℚ⁺ of positive rational numbers is countable. To prove this, we
show that it has the same cardinality as the positive integers ℤ⁺. This
can be seen from the following argument (a computational sketch of the
same enumeration appears after this example). List all the positive
rationals in an array, as follows:

1/1  2/1  3/1  4/1  ⋯
1/2  2/2  3/2  4/2  ⋯
1/3  2/3  3/3  4/3  ⋯
 ⋮

Now number the rationals by proceeding along a zig-zag route through
the array, sweeping back and forth along the anti-diagonals starting at
1/1, and skipping any that have already been numbered (e.g., 2/2 was
counted as 1/1, so it is not counted again).

In this way we get a function φ : ℚ⁺ → ℤ⁺, with φ(1/1) = 1, φ(2/1) = 2,
φ(1/2) = 3, φ(1/3) = 4, and so forth. This function is clearly both
injective and surjective, so the cardinality of ℚ⁺ is the same as ℤ⁺, and
hence ℚ⁺ is countable.

(iv) We do not prove it here, but it is not difficult to show that any subset
of ℚ is either finite or countably infinite.

(v) The real numbers in the interval (0, 1) are not countable, as can be seen
from the following sketch of Cantor's diagonal argument. If the set (0, 1)
were countable, there would have to exist a bijection c : ℤ⁺ → (0, 1).
Listing the decimal expansion of each real number, in order, we have

c(1) = 0.a₁₁a₁₂a₁₃…,
c(2) = 0.a₂₁a₂₂a₂₃…,
c(3) = 0.a₃₁a₃₂a₃₃…,
 ⋮

where a_ij is the jth digit of the decimal expansion of c(i).

Now consider the real number b with decimal expansion 0.b₁b₂b₃…,
where for each positive integer k the digit bₖ is chosen to be some digit in
the range {1, 2, …, 8} which is different from a_kk. We have b ∈ (0, 1), but
for any ℓ ∈ ℤ⁺ we have that b ≠ c(ℓ) because their decimal expansions
differ in the ℓth digit:ᵃ b_ℓ ≠ a_ℓℓ. Thus the map c is not surjective, a
contradiction.

ᵃThis last point is not completely obvious: some care is needed to prove that the different
decimal expansions give different real numbers. For more details on this argument see, for
example, [Hal74].
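The zig-zag enumeration in (iii) is easy to carry out by machine. The following Python sketch (an illustration under the stated ordering assumptions, not part of the text's formal development) walks the anti-diagonals of the array, skips repeated values, and lists the first several positive rationals in the resulting order.

```python
from fractions import Fraction

def zigzag_rationals(count):
    """Enumerate positive rationals p/q by sweeping the anti-diagonals
    p + q = 2, 3, 4, ... of the array in Example A.5.3(iii), skipping
    any value already seen (e.g., 2/2 repeats 1/1)."""
    seen = set()
    out = []
    s = 2  # p + q along the current anti-diagonal
    while len(out) < count:
        pairs = [(p, s - p) for p in range(1, s)]
        if s % 2 == 1:            # alternate direction to follow the zig-zag
            pairs.reverse()
        for p, q in pairs:
            r = Fraction(p, q)
            if r not in seen:
                seen.add(r)
                out.append(r)
                if len(out) == count:
                    break
        s += 1
    return out

print(zigzag_rationals(10))
# begins with 1, 2, 1/2, 1/3, ... matching phi(1/1)=1, phi(2/1)=2, phi(1/2)=3, phi(1/3)=4
```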

Proposition A.5.4. If |B| = n ∈ ℤ⁺ and b ∈ B, then |B ∖ b| = n − 1.

Proof. The expression |B| = n means that there is a bijection φ : B → {1, 2, …, n}.
We define a new bijection ψ : (B ∖ b) → {1, 2, …, n − 1} by

ψ(x) = φ(x) if φ(x) ≠ n,   and   ψ(x) = φ(b) if φ(x) = n.

We claim that ψ is a bijective map onto {1, 2, …, n − 1}. First, for any x ∈ (B ∖ b),
if φ(x) ≠ n, then ψ(x) ≠ n, while if φ(x) = n, then since x ≠ b, we have ψ(x) =
φ(b) ≠ φ(x) = n. So ψ is a well-defined map ψ : (B ∖ b) → {1, 2, …, n − 1}.
For injectivity, consider x, y ∈ (B ∖ b) with ψ(x) = ψ(y). If φ(x) ≠ n and
φ(y) ≠ n, then φ(x) = ψ(x) = ψ(y) = φ(y), so by injectivity of φ, we have x = y. If
φ(x) ≠ n but φ(y) = n, then φ(x) = ψ(x) = ψ(y) = φ(b), so x = b, a contradiction.
Finally, if φ(x) = n = φ(y), then by injectivity of φ we have x = y.
For surjectivity, consider any k ∈ {1, 2, …, n − 1}. By surjectivity of φ there
exists an x ∈ B such that φ(x) = k. If x ≠ b, then since k ≠ n, we have ψ(x) =
φ(x) = k. But if x = b, then there exists also a y ∈ B ∖ b such that φ(y) = n, so
ψ(y) = φ(b) = k. □

Proposition A.5.5. If B is a finite set and A ⊂ B, then |A| ≤ |B|.

Proof. We prove this by induction on the cardinality of B. If |B| = 0, then
B = ∅ and also A = ∅, so |A| = |B|. Assume now that the proposition holds
for every B with |B| < n. If A = B, then |A| = |B| and we are done. But if
A ≠ B and |B| = n, then there exists some b ∈ B ∖ A, and therefore A ⊂ B ∖ b.
By Proposition A.5.4, the set B ∖ b has cardinality n − 1, so by the induction
hypothesis, we have |A| ≤ n − 1 < |B| = n. □

Since any injective map gives a bijection onto its image, we have, as an
immediate consequence, the following corollary.

Corollary A.5.6. If A and B are two finite sets and f : A → B is injective, then
|A| ≤ |B|.

The previous corollary inspires the following definition for comparing cardinality
of arbitrary sets.

Definition A.5.7. For any two sets A and B, we say |A| ≤ |B| if there exists an
injection f : A → B, and we say |A| < |B| if there is an injection f : A → B, but
there is no surjection from A to B.

Example A.5.8.

(i) ℵ₀ < |ℝ| because there is an obvious injection ℤ → ℝ, but by
Example A.5.3(v), there is no surjection ℤ⁺ → (0, 1) and hence no
surjection ℤ → ℝ.

(ii) For every nonnegative integer n we have n < ℵ₀ because there is an
obvious injection {1, 2, …, n} → ℤ, but there is no surjection {1, 2, …, n} → ℤ.

Proposition A.5.9. If g : A → B is a surjection, then |A| ≥ |B|.

Proof. By surjectivity, for each b ∈ B, there exists an a ∈ A such that g(a) = b.
Choose one such a for each b and call it f(b). The assignment b ↦ f(b) defines
a function f : B → A. Moreover, if b, b′ ∈ B are such that f(b) = f(b′), then
b = g(f(b)) = g(f(b′)) = b′, so f is injective. This implies that |B| ≤ |A|. □

We conclude with two corollaries that follow immediately from our previous
discussion, but which are extremely useful.

Corollary A.5.10. If A and B are finite sets of the same cardinality, then any
map f : A → B is injective if and only if it is also surjective.

Remark A.5.11. The previous corollary is not true for infinite sets, as can be seen
from the case of 2ℤ ⊂ ℤ and also ℤ ⊂ ℚ. The inclusions are injective maps that
are not surjective, despite the fact that these sets all have the same cardinality.

Corollary A.5.12 (Pigeonhole Principle). If |A| > |B|, then given any function
f : A → B, there must be at least two elements a, a′ ∈ A such that f(a) = f(a′).

This last corollary gets its name from the example where A is a set of pigeons
and B is a set of pigeonholes. If there are more pigeons than pigeonholes, then
at least two pigeons must share a pigeonhole. This "principle" is used in many
counting arguments.
The Complex Numbers and Other Fields

For every complex problem there is an answer that is clear, simple, and wrong.
-H. L. Mencken

In this appendix we briefly review the fundamental properties of the field of complex
numbers and also general fields.

B.1 Complex Numbers


B.1.1 Basics of Complex Numbers

Definition B.1.1. Let i be a formal symbol (representing a square root of −1).
Let ℂ denote the set
ℂ = {a + bi | a, b ∈ ℝ}.
Elements of ℂ are called complex numbers. We define addition of complex
numbers by

(a + bi) + (c + di) = (a + c) + (b + d)i,

and we define multiplication of complex numbers by

(a + bi)(c + di) = (ac − bd) + (ad + bc)i.

Example B.1.2. Complex numbers of the form a + 0i are usually written
just as a, and those of the form 0 + bi are usually written just as bi.
We verify that i has the expected property

i² = (0 + 1i)² = (0 + 1i)(0 + 1i) = (0 − 1) + (0 + 0)i = −1.
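As a quick sanity check on the multiplication rule in Definition B.1.1, the following sketch (illustrative only; the helper name mul is ours) compares the formula (a + bi)(c + di) = (ac − bd) + (ad + bc)i with Python's built-in complex arithmetic.

```python
def mul(a, b, c, d):
    """Multiply a + bi and c + di using the rule from Definition B.1.1."""
    return (a * c - b * d, a * d + b * c)

a, b, c, d = 2.0, -1.0, 3.0, 5.0
re, im = mul(a, b, c, d)
w = complex(a, b) * complex(c, d)
assert (re, im) == (w.real, w.imag)
print(re, im)   # 11.0 7.0
```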


Definition B.1.3. For any z = a + bi ∈ ℂ we define the complex conjugate of
z to be z̄ = a − bi, and we define the modulus (sometimes also called the norm)
of z to be |z| = √(z z̄) = √(a² + b²). We also define the real part ℜ(z) = a and the
imaginary part ℑ(z) = b.

Note that |z| ∈ ℝ for any z ∈ ℂ.

Proposition B.1.4. Addition and multiplication of complex numbers satisfy the
following properties for any z, w, v ∈ ℂ:

(i) Associativity of addition: (v + w) + z = v + (w + z).
(ii) Commutativity of addition: z + w = w + z.
(iii) Associativity of multiplication: (vw)z = v(wz).
(iv) Commutativity of multiplication: zw = wz.
(v) Distributivity: v(w + z) = vw + vz.
(vi) Additive identity: 0 + z = z = z + 0.
(vii) Multiplicative identity: 1 · z = z = z · 1.
(viii) Additive inverses: if z = a + bi, then −z = −a − bi satisfies z + (−z) = 0.
(ix) Multiplicative inverses: If z = a + bi ≠ 0, then |z|⁻² ∈ ℝ and so

z⁻¹ = z̄|z|⁻² = (a − bi)/(a² + b²).

Proof. All of the properties are straightforward algebraic manipulations. We give
one example and leave the rest to the reader.
For (ix) first note that since z ≠ 0 we have |z|² = a² + b² ≠ 0, so its multiplicative
inverse (a² + b²)⁻¹ is also in ℝ. We have

z(z̄|z|⁻²) = (z z̄)|z|⁻² = |z|²|z|⁻² = 1,

so z̄|z|⁻² is the multiplicative inverse to z. □

B.1.2 Euler's Formula and Graphical Representation

Euler's Formula
For any z ∈ ℂ we define the exponential eᶻ using the Taylor series

eᶻ = Σ_{k=0}^{∞} zᵏ/k!.

Figure B.1. A complex number x + iy with x, y ∈ ℝ is usually represented
graphically in the plane as the point (x, y). This figure shows the graphical
representation of the complex numbers i, 1, and 3 + 2i.

Figure B.2. Graphical representation of complex addition. Thinking of
complex numbers z = a + bi and w = c + di as the points in the plane (a, b) and
(c, d), respectively, their sum z + w = (a + c) + (b + d)i corresponds to the usual
vector sum (a, b) + (c, d) = (a + c, b + d) in the plane.

One of the most important identities for complex numbers is Euler's formula (see
Proposition 11.2.12):

e^{it} = cos(t) + i sin(t)     (B.1)

for all t ∈ ℂ.

As a consequence of Euler's formula, we have De Moivre's formula:

(cos(t) + i sin(t))ⁿ = (e^{it})ⁿ = e^{int} = cos(nt) + i sin(nt).     (B.2)



Figure B.3. Complex multiplication of z = re^{it} and w = ρe^{is} adds the polar
angles (s + t) and multiplies the moduli (rρ), giving zw = (rρ)e^{i(s+t)}.

Graphical Representation
The complex numbers have a very useful graphical representation as points in the
plane, where we associate the complex number z = a + bi with the point (a, b) ∈ ℝ².
See Figure B.1 for an illustration. In this representation real numbers lie along the
x-axis and imaginary numbers lie along the y-axis. The modulus |z| of z is the
distance from the origin to z in the plane, and the complex conjugate z̄ is the image
of z under a reflection through the x-axis.
Addition of complex numbers is just the same as vector addition in the plane;
so, geometrically, the complex number z + w is the point in the plane corresponding
to the far corner of the parallelogram whose other corners are 0, z, and w. See
Figure B.2.
We can represent any point in the plane in polar form as z = r(cos(θ) + i sin(θ))
for some θ ∈ [0, 2π) and some r ∈ ℝ with r ≥ 0. Combining this with Euler's formula
means that we can write every complex number in the form z = re^{iθ}. In this form
we have

|z| = |re^{iθ}| = |r(cos(θ) + i sin(θ))| = r   and   z̄ = r(cos(θ) − i sin(θ)) = re^{−iθ}.


We define the sign of z = re^{iθ} ∈ ℂ to be

sign(z) = e^{iθ} = z/|z|  if z ≠ 0,   and   sign(z) = 1  if z = 0.     (B.3)

We can use the polar form to get a geometric interpretation of multiplication
of complex numbers. If z = re^{it} and w = ρe^{is}, then

wz = rρ e^{i(t+s)} = |z||w|(cos(t + s) + i sin(t + s)).

Multiplication of two complex numbers in polar form multiplies the moduli and
adds the angles; see Figure B.3.
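This geometric description is easy to confirm numerically. The following sketch (illustrative only) builds z = re^{it} and w = ρe^{is} with Python's cmath module and checks that the product has modulus rρ and argument t + s.

```python
import cmath

r, t = 2.0, 0.7
rho, s = 1.5, -0.3

z = cmath.rect(r, t)      # r * e^{it}
w = cmath.rect(rho, s)    # rho * e^{is}
zw = z * w

# The modulus multiplies and the argument adds (up to floating-point error).
assert abs(abs(zw) - r * rho) < 1e-12
assert abs(cmath.phase(zw) - (t + s)) < 1e-12
```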
Similarly, z⁻¹ = z̄|z|⁻² = re^{−it} r⁻² = r⁻¹e^{−it}, so the multiplicative inverse
changes the sign of the angle (t ↦ −t) and inverts the modulus (r ↦ r⁻¹). But
the complex conjugate leaves the modulus unchanged and changes the sign of the
angle; see Figure B.4.

Figure B.4. Graphical representation of multiplicative inverse (a) and
complex conjugate (b). The multiplicative inverse of a complex number changes the
sign of the polar angle and inverts the modulus. The complex conjugate also changes
the sign of the polar angle, but leaves the modulus unchanged.

B.1.3 Roots of Unity

Definition B.1.5. For n ∈ ℤ⁺ an nth root of unity is any solution to the equation
zⁿ = 1 in ℂ. The complex number ω_n = e^{2πi/n} is called the primitive nth root
of unity.

By the fundamental theorem of algebra (or rather its corollary,
Theorem 15.3.15) there are exactly n of these nth roots of unity in ℂ. Euler's
formula tells us that ω_n = cos(2π/n) + i sin(2π/n) is the point on the unit circle in
the complex plane corresponding to an angle of 2π/n radians, and

ω_n^k = e^{2πik/n} = cos(2πk/n) + i sin(2πk/n).

See Figure B.5.

Thus we have

ω_n^n = e^{2πi} = 1,

so ω_n^k is a root of unity for every k ∈ ℤ.
If k′ ≡ k (mod n), then k′ = k + mn for some m ∈ ℤ, and thus

ω_n^{k′} = ω_n^{k+mn} = ω_n^k (ω_n^n)^m = ω_n^k.

The nth roots of unity are uniformly distributed around the unit circle, so
their average is 0. The next proposition makes that precise.

Figure B.5. Plots of all the 3rd (on the left) and 10th (on the right) roots
of unity, including ω₃ = e^{2πi/3}. The roots are uniformly distributed around the
unit circle, so their sum is 0.

Proposition B.1.6. For any n ∈ ℤ⁺ and any k ∈ ℤ we have

(1/n) Σ_{ℓ=0}^{n−1} ω_n^{kℓ} = (1/n) Σ_{ℓ=0}^{n−1} e^{2πikℓ/n} = 0 if k ≢ 0 (mod n),  and  = 1 if k ≡ 0 (mod n).     (B.4)

Proof. The sum Σ_{ℓ=0}^{n−1} (ω_n^k)^ℓ is a geometric series, so if k ≢ 0 (mod n), we have

Σ_{ℓ=0}^{n−1} ω_n^{kℓ} = ((ω_n^k)^n − 1)/(ω_n^k − 1) = ((ω_n^n)^k − 1)/(ω_n^k − 1) = 0.

But if k ≡ 0 (mod n), then

(1/n) Σ_{ℓ=0}^{n−1} ω_n^{kℓ} = (1/n) Σ_{ℓ=0}^{n−1} 1 = 1. □
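A direct numerical check of (B.4), purely as an illustration, sums the powers of ω_n for several values of k:

```python
import cmath

def root_of_unity_average(n, k):
    """Compute (1/n) * sum_{l=0}^{n-1} omega_n^{k*l} with omega_n = e^{2 pi i / n}."""
    omega = cmath.exp(2j * cmath.pi / n)
    return sum(omega ** (k * l) for l in range(n)) / n

for k in range(-3, 7):
    avg = root_of_unity_average(5, k)
    expected = 1.0 if k % 5 == 0 else 0.0
    assert abs(avg - expected) < 1e-9
```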

We conclude this section with a simple observation that turns out to be very
powerful. The proof is immediate.

Proposition B.1.7. For any divisor d of n and any k ∈ ℤ, we have

ω_n^{dk} = ω_{n/d}^{k}.     (B.5)

Vista B.1.8. The relation (B.5) is the foundation of the fast Fourier transform
(FFT). We discuss the FFT in Volume 2.

B.2 Fields
B.2.1 Axioms and Basic Properties of Fields
The real numbers ℝ and the complex numbers ℂ are two examples of an important
structure called a field. Although we only defined vector spaces with the scalars
in the fields ℝ or ℂ, you can make the same definitions for scalars from any field,
and all of the results about vector spaces, linear transformations, and matrices in
Chapters 1 and 2 still hold (but not necessarily the results of Chapters 3 and 4).

Definition B.2.1. A field F is a set with two binary operations, addition (a, b) ↦
a + b and multiplication (a, b) ↦ ab, satisfying the following properties for all
a, b, c ∈ F:

(i) Commutativity of addition: a + b = b + a.
(ii) Associativity of addition: (a + b) + c = a + (b + c).
(iii) Additive identity: There exists 0 ∈ F such that a + 0 = a.
(iv) Additive inverse: For each a ∈ F there exists an element denoted (−a) ∈ F
such that a + (−a) = 0.
(v) Commutativity of multiplication: ab = ba.
(vi) Associativity of multiplication: (ab)c = a(bc).
(vii) Multiplicative identity: There exists 1 ∈ F such that 1a = a.
(viii) Multiplicative inverse: For each a ≠ 0 there exists a⁻¹ ∈ F such that aa⁻¹ = 1.
(ix) Distributivity: a(b + c) = ab + ac.

Example B.2.2. The most important examples of fields for our purposes are
the real numbers ℝ and the complex numbers ℂ. The fact that ℂ is a field
is the substance of Proposition B.1.4. It is easy to verify that the rational
numbers ℚ also form a field.

Unexample B.2.3. The integers ℤ do not form a field, because most nonzero
elements of ℤ have no multiplicative inverse. For example, the multiplicative
inverse 2⁻¹ of 2 is not in ℤ.
The nonnegative real numbers [0, ∞) do not form a field because not all
elements of [0, ∞) have an additive inverse in [0, ∞). For example, the additive
inverse −1 of 1 is not in [0, ∞).
660 Appendix B. The Complex Numbers and Ot her Fields

Proposition B.2.4. The additive and multiplicative identities of a field F are
unique, as are the additive and multiplicative inverses for a given element a ∈ F.
Moreover, for any x, y, z ∈ F the following properties hold:
(i) 0 · x = 0 = x · 0.
(ii) If x + y = x + z, then y = z.
(iii) If xy = xz and x ≠ 0, then y = z.
(iv) −(−x) = x.
(v) (−y)x = −(yx) = y(−x).

Proof. If a, b ∈ F and both of them are additive identities, then a = a + b because b
is an additive identity, but also a + b = b because a is an additive identity. Therefore,
we have a = a + b = b. The same proof applies mutatis mutandis to the uniqueness
of the multiplicative identity.
If b, c ∈ F are multiplicative inverses of a ∈ F, then ca = ba = 1, which gives
c = c(ab) = (ca)b = b. The same proof applies mutatis mutandis to the uniqueness
of additive inverses.
For (ii) we have that x + y = x + z implies (−x) + (x + y) = (−x) + (x + z).
By associativity of addition, we can regroup to get (−x + x) + y = (−x + x) + z,
and hence y = z. The same proof applies mutatis mutandis for (iii).
For (i) we have 0 · x = (0 + 0) · x = 0 · x + 0 · x, so by (ii) we have 0 = 0 · x.
Commutativity of multiplication gives x · 0 = 0.
For (iv) we have (−x) + (−(−x)) = 0 by definition, but adding x on the left
to both sides gives (x + (−x)) + (−(−x)) = x, and thus −(−x) = x.
Finally, for (v) we have (−y)x + yx = (−y + y)x = 0, and adding −(yx) to
both sides gives the desired result. □

B.2.2 The Field ℤ_p

Recall from Example A.1.18(i) that for any n ∈ ℤ the set ℤ_n is the set of
equivalence classes in ℤ with respect to the equivalence relation ≡ (mod n). In
Examples A.2.16(i) and A.2.16(ii) we showed that the operations of modular
addition and multiplication are well defined. In those examples we wrote [[k]] =
{…, k − n, k, k + n, …} to denote the equivalence class of k (mod n), and we wrote
⊕ and ⊗ to denote modular addition and multiplication, respectively. It is common,
however, for most mathematicians to be sloppy with notation and just write k to
indicate [[k]], to write + for ⊕, and to write · (or just put letters next to each other)
for ⊗. We also usually use this convenient shorthand.

Unexample B.2.5. The set ℤ₄ = {0, 1, 2, 3} with modular addition and
multiplication is not a field because the element 2 has no multiplicative inverse.
To see this, assume by way of contradiction that some element a is the
multiplicative inverse of 2. In this case we have

2 = 1 · 2 = (a · 2) · 2 = a · (2 · 2) = a · 0 = 0,

a contradiction.
In fact, if there is a d ≢ 0, 1 (mod n) which divides n, a very similar
argument shows that ℤ_n is not a field, because such a d has no multiplicative
inverse in ℤ_n.

It is straightforward to verify that ℤ_n satisfies all the properties of a field
except the existence of multiplicative inverses. In the special case that n = p is
prime, we do have multiplicative inverses, and ℤ_p is a field.

Proposition B.2.6. If p ∈ ℤ is prime, then ℤ_p with the operations of modular
addition and multiplication is a field.

Proof. As mentioned above, it is straightforward to verify that all the axioms of a
field hold except multiplicative inverses. We now show that nonzero elements of ℤ_p
all have multiplicative inverses. Any element a ∈ ℤ with a ≢ 0 (mod p) is relatively
prime to p, which means that gcd(a, p) = 1. By Proposition 15.2.8, this means there
exist x, y ∈ ℤ such that ax + py = 1. This implies that ax ≡ 1 (mod p), and hence
x is the multiplicative inverse to a in ℤ_p. □
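The proof is constructive: the extended Euclidean algorithm produces integers x and y with ax + py = 1, and x (mod p) is then the inverse of a. A minimal Python sketch of this computation (illustrative only; the function names are ours) follows.

```python
def extended_gcd(a, b):
    """Return (g, x, y) with g = gcd(a, b) and a*x + b*y = g."""
    if b == 0:
        return (a, 1, 0)
    g, x, y = extended_gcd(b, a % b)
    return (g, y, x - (a // b) * y)

def inverse_mod(a, p):
    """Multiplicative inverse of a in Z_p, assuming p is prime and a != 0 (mod p)."""
    g, x, _ = extended_gcd(a % p, p)
    assert g == 1
    return x % p

p = 101
for a in range(1, p):
    assert (a * inverse_mod(a, p)) % p == 1
```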
Topics in Matrix Analysis

It is my experience that proofs involving matrices can be shortened by 50% if one


throws the matrices out.
- Emil Artin

C.1 Matrix Algebra


C.1.1 Matrix Fundamentals

Definition C.1.1. An m × n matrix A over 𝔽 is a rectangular array of the form

A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}

with m rows and n columns, where each entry a_ij is an element of 𝔽. We also write
A = [a_ij] to indicate that the (i, j) entry of A is a_ij. We denote the set of m × n
matrices over 𝔽 by M_{m×n}(𝔽).

Matrices are important because they give us a nice way to write down linear
transformations explicitly (see Section 2.4). Composition of linear transformations
corresponds to matrix multiplication (see Section 2.5.1).

Definition C.1.2. Let A be an m × r matrix with entries [a_ij], and let B be an
r × n matrix with entries [b_ij]. The product AB is defined to be the m × n matrix
C with entries [c_ij], where

c_ij = Σ_{k=1}^{r} a_ik b_kj.

Note that you can also consider c_ij to be the inner product of the ith row of A and
the jth column of B. Given a scalar s ∈ 𝔽 and a matrix A ∈ M_{m×n}(𝔽) with entries
[a_ij], we define scalar multiplication of A by s to be the matrix sA whose (i, j) entry
is s·a_ij.
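The defining formula c_ij = Σ_{k=1}^{r} a_ik b_kj translates directly into code. The sketch below (illustrative only; in practice NumPy's @ operator performs the same computation) spells out the loops.

```python
import numpy as np

def matmul(A, B):
    """Multiply an m x r matrix A by an r x n matrix B entry by entry,
    following Definition C.1.2: C[i, j] = sum_k A[i, k] * B[k, j]."""
    m, r = A.shape
    r2, n = B.shape
    assert r == r2, "inner dimensions must agree"
    C = np.zeros((m, n))
    for i in range(m):
        for j in range(n):
            C[i, j] = sum(A[i, k] * B[k, j] for k in range(r))
    return C

A = np.array([[1.0, 2.0, 0.0], [0.0, -1.0, 3.0]])
B = np.array([[2.0, 1.0], [0.0, 1.0], [1.0, 1.0]])
assert np.allclose(matmul(A, B), A @ B)
```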


Definition C.1.3. The transpose Aᵀ of a matrix A = [a_ij] is defined to be the
matrix
Aᵀ = [a_ji].     (C.1)
The Hermitian conjugate Aᴴ of A is the complex conjugate of the transpose
Aᴴ = Āᵀ = [ā_ji].     (C.2)

It is straightforward to verify that for any A, B ∈ M_{m×n}(𝔽) we have

(Aᵀ)ᵀ = A,   (A + B)ᵀ = Aᵀ + Bᵀ,   (Aᴴ)ᴴ = A,   and   (A + B)ᴴ = Aᴴ + Bᴴ.

The following multiplicative property of transposes and conjugates can be verified
by a tedious computation that we will not give here.

Proposition C.1.4. For any m × r matrix A and any r × n matrix B we have

(AB)ᵀ = BᵀAᵀ   and   (AB)ᴴ = BᴴAᴴ.

C.1.2 Matrix Algebra

Theorem C.1.5. For scalars a, b, c ∈ 𝔽 and matrices A, B, C of the appropriate
shapes for the following operations to make sense, we have the following rules of
matrix algebra:

(i) Addition is commutative: A + B = B + A.
(ii) Addition is associative: (A + B) + C = A + (B + C).
(iii) Multiplication is associative: (AB)C = A(BC).
(iv) Matrix multiplication is left distributive: A(B + C) = AB + AC.
(v) Matrix multiplication is right distributive: (A + B)C = AC + BC.
(vi) Scalar multiplication is associative: (ab)A = a(bA).
(vii) Scalar multiplication is distributive over matrix addition: a(A + B) = aA + aB.
(viii) Matrix multiplication is distributive over scalar addition: (a + b)A = aA + bA.

Proof. Associativity of matrix multiplication is proved in Corollary 2.5.3. The
proof of the remaining properties is a tedious but straightforward verification that
can be found in many elementary linear algebra texts. □

Nota Bene C.1.6. Matrix multiplication is not commutative. In fact, if A
and B are not square and AB is defined, then BA is not defined. But
even if A and B are square, we rarely have equality of the two products AB
and BA.
C.2. Block Matrices 665

Proposition C.1.7. If A ∈ M_{m×n}(𝔽) satisfies Ax = 0 for all x ∈ 𝔽ⁿ, then A = 0.

Proof. If x = e_i is the ith standard basis vector, then Ae_i is the ith column of A.
Since this is zero for every i, the entire matrix A must be zero. □

C.1.3 Inverses

Definition C.1.8. A matrix A ∈ M_n is invertible (or nonsingular) if there exists
B ∈ M_n such that AB = BA = I.

Proposition C.1.9. Matrix inverses are unique.

Proof. If B and C are both inverses of A, then

B = BI = B(AC) = (BA)C = IC = C. □

Proposition C.1.10. If A, B ∈ M_n(𝔽) are both invertible, then (AB)⁻¹ = B⁻¹A⁻¹.

Proof. We have

(AB)(B⁻¹A⁻¹) = A(BB⁻¹)A⁻¹ = AA⁻¹ = I   and   (B⁻¹A⁻¹)(AB) = B⁻¹(A⁻¹A)B = B⁻¹B = I. □

C.2 Block Matrices


Partitioning a matrix into submatrices or blocks is often helpful. For example, we
may partition the matrix A as

A=[~ ~].
Here the number of rows in B and C are equal, the number of rows in D and E are
equal, the number of columns in B and D are equal, and the number of columns in
C and E are equal. There is no requirement that A, B , C , D , or Ebe square.

C.2.1 Block Matrix Multiplication


Suppose we are given two matrices A and B that are partitioned into blocks as
below:

A = \begin{bmatrix} A_{11} & \cdots & A_{1s} \\ \vdots & & \vdots \\ A_{r1} & \cdots & A_{rs} \end{bmatrix}   and   B = \begin{bmatrix} B_{11} & \cdots & B_{1t} \\ \vdots & & \vdots \\ B_{s1} & \cdots & B_{st} \end{bmatrix}.

If for every pair (A_ik, B_kj) the number of columns of A_ik equals the number of
rows of B_kj, then the product of the matrices is formed in a manner similar to that
of regular matrix multiplication. In fact, the (i, j) block of the product is equal to
Σ_{k=1}^{s} A_ik B_kj.

Example C.2.1. Suppose

A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}   and   B = \begin{bmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{bmatrix}.

Then the product is

AB = \begin{bmatrix} A_{11}B_{11} + A_{12}B_{21} & A_{11}B_{12} + A_{12}B_{22} \\ A_{21}B_{11} + A_{22}B_{21} & A_{21}B_{12} + A_{22}B_{22} \end{bmatrix}.

Block matrix multiplication is especially useful when there are patterns (usually
involving zeros or the identity) in the matrices to be multiplied.
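The blockwise formula in Example C.2.1 can be checked numerically. The following sketch (illustrative; the block sizes are arbitrary choices) assembles A and B from random blocks and compares the blockwise product with the ordinary matrix product.

```python
import numpy as np

rng = np.random.default_rng(0)
# Block sizes: A is (2+3) x (2+2), B is (2+2) x (3+1).
A11, A12 = rng.standard_normal((2, 2)), rng.standard_normal((2, 2))
A21, A22 = rng.standard_normal((3, 2)), rng.standard_normal((3, 2))
B11, B12 = rng.standard_normal((2, 3)), rng.standard_normal((2, 1))
B21, B22 = rng.standard_normal((2, 3)), rng.standard_normal((2, 1))

A = np.block([[A11, A12], [A21, A22]])
B = np.block([[B11, B12], [B21, B22]])

AB_blocks = np.block([
    [A11 @ B11 + A12 @ B21, A11 @ B12 + A12 @ B22],
    [A21 @ B11 + A22 @ B21, A21 @ B12 + A22 @ B22],
])
assert np.allclose(AB_blocks, A @ B)
```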

C.2.2 Block Matrix Identities

Lemma C.2.2 (Schur). Let M be a square matrix with block form

M = \begin{bmatrix} A & B \\ C & D \end{bmatrix}.     (C.3)

If A, D, A − BD⁻¹C, and D − CA⁻¹B are nonsingular, then we have two equivalent
descriptions of the inverse of M, that is,

M⁻¹ = \begin{bmatrix} A^{-1} + A^{-1}B(D - CA^{-1}B)^{-1}CA^{-1} & -A^{-1}B(D - CA^{-1}B)^{-1} \\ -(D - CA^{-1}B)^{-1}CA^{-1} & (D - CA^{-1}B)^{-1} \end{bmatrix}     (C.4)

and

M⁻¹ = \begin{bmatrix} (A - BD^{-1}C)^{-1} & -(A - BD^{-1}C)^{-1}BD^{-1} \\ -D^{-1}C(A - BD^{-1}C)^{-1} & D^{-1} + D^{-1}C(A - BD^{-1}C)^{-1}BD^{-1} \end{bmatrix}.     (C.5)

The matrices A − BD⁻¹C and D − CA⁻¹B are called the Schur complements of A
and D, respectively.

Proof. These identities can be verified by multiplying (C.3) by (C.4) and by (C.5),
respectively. □

Lemma C.2.3 (Sherman-Morrison-Woodbury). Let A and D be invertible
matrices, and let B and C be given so that A − BD⁻¹C is nonsingular. If
D − CA⁻¹B is also nonsingular, then

(A − BD⁻¹C)⁻¹ = A⁻¹ + A⁻¹B(D − CA⁻¹B)⁻¹CA⁻¹.

Proof. This follows by equating the upper left blocks of (C.4) and (C.5). □
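The identity is easy to verify numerically. The sketch below (illustrative only; the blocks are scaled so that the relevant matrices stay well conditioned) compares the two sides on random data.

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 6, 2
A = np.eye(n) + 0.1 * rng.standard_normal((n, n))   # well-conditioned A
D = np.eye(k) + 0.1 * rng.standard_normal((k, k))   # well-conditioned D
B = 0.1 * rng.standard_normal((n, k))
C = 0.1 * rng.standard_normal((k, n))

Ainv, Dinv = np.linalg.inv(A), np.linalg.inv(D)
lhs = np.linalg.inv(A - B @ Dinv @ C)
rhs = Ainv + Ainv @ B @ np.linalg.inv(D - C @ Ainv @ B) @ C @ Ainv
assert np.allclose(lhs, rhs)
```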

C.3 Cross Products

The cross product of two vectors x, y ∈ ℝ³ is defined to be the unique vector x × y
with the property that (x × y) · z = det(x, y, z) for every z ∈ ℝ³. Here det(x, y, z)
means the determinant of the matrix formed by writing x = (x₁, x₂, x₃)ᵀ, y =
(y₁, y₂, y₃)ᵀ, and z = (z₁, z₂, z₃)ᵀ in terms of the standard basis and assembling
these together as columns, so we have

(x × y) · z = det(x, y, z) = \det\begin{bmatrix} x_1 & y_1 & z_1 \\ x_2 & y_2 & z_2 \\ x_3 & y_3 & z_3 \end{bmatrix},

where det(A) is the determinant of A; see Sections 2.8 and 2.9. Geometrically,
|det(x, y, z)| is the volume of the parallelepiped having x, y, z as three of the sides;
see Remark 8.7.6.

Proposition C.3.1. For any x, y, z ∈ ℝ³ and any a, b ∈ ℝ the following properties
hold:
(i) x × y = −y × x.
(ii) The cross product is bilinear; that is,
(ax + bz) × y = a(x × y) + b(z × y)   and   x × (ay + bz) = a(x × y) + b(x × z).
(iii) x × y = 0 if and only if x = cy for some c ∈ ℝ.
(iv) ⟨x × y, x⟩ = 0 = ⟨x × y, y⟩; thus, x × y is orthogonal to the plane spanned by
x and y.
(v) ‖x × y‖ is the area of the parallelogram having x and y as two of the sides.
(vi) The cross product of x = (x₁, x₂, x₃) with y = (y₁, y₂, y₃) can be computed as

x × y = \det\begin{pmatrix} x_2 & x_3 \\ y_2 & y_3 \end{pmatrix} e_1 - \det\begin{pmatrix} x_1 & x_3 \\ y_1 & y_3 \end{pmatrix} e_2 + \det\begin{pmatrix} x_1 & x_2 \\ y_1 & y_2 \end{pmatrix} e_3.

Proof. The proof is a straightforward but messy computation. □
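The formula in (vi) is straightforward to check against NumPy's built-in cross product; the following sketch is purely illustrative.

```python
import numpy as np

def cross(x, y):
    """Cross product via the 2 x 2 determinants in Proposition C.3.1(vi)."""
    x1, x2, x3 = x
    y1, y2, y3 = y
    return np.array([
        x2 * y3 - x3 * y2,      #  det([[x2, x3], [y2, y3]])
        -(x1 * y3 - x3 * y1),   # -det([[x1, x3], [y1, y3]])
        x1 * y2 - x2 * y1,      #  det([[x1, x2], [y1, y2]])
    ])

x = np.array([1.0, 2.0, 3.0])
y = np.array([-2.0, 0.5, 4.0])
assert np.allclose(cross(x, y), np.cross(x, y))
# Orthogonality from part (iv): (x x y) . x = 0 = (x x y) . y
assert abs(cross(x, y) @ x) < 1e-12 and abs(cross(x, y) @ y) < 1e-12
```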

Replacing x and y by linear combinations of x and y will change the cross


product by the determinant of the transformation.

Proposition C.3.2. Any two vectors x, y ∈ ℝ³ define a linear transformation
B : ℝ² → ℝ³ by e₁ ↦ x and e₂ ↦ y. Given any linear transformation Φ : ℝ² → ℝ²,
the composition B ∘ Φ defines another linear transformation ℝ² → ℝ³. Define
x′ = B ∘ Φ(e₁) and y′ = B ∘ Φ(e₂). We have

x′ × y′ = det(Φ)(x × y).     (C.6)

Proof. Since we may factor Φ into elementary matrices, it suffices to show that
(C.6) holds when Φ is elementary (see Section 2.7.1). If Φ is type I, then the
determinant is −1, and we have x′ = y and y′ = x, so the desired result follows
from (i) in Proposition C.3.1. If Φ is type II, corresponding to multiplication of the
first row by a, then det(Φ) = a, and it is easy to see that x′ × y′ = a(x × y). Finally,
consider the case where Φ is type III, corresponding to adding a scalar multiple of
one row to the other row. Let us assume first that Φ = \begin{pmatrix} 1 & 0 \\ a & 1 \end{pmatrix}. We have x′ = x + ay
and y′ = y. We have det(Φ) = 1 and

x′ × y′ = (x + ay) × y = x × y + ay × y = x × y.

The case of Φ = \begin{pmatrix} 1 & a \\ 0 & 1 \end{pmatrix} is essentially identical. □


The Greek Alphabet

I fear the Greeks even when they bring gifts.


-Virgil

CAPITAL  LOWER  VARIANT  NAME

Α        α                Alpha
Β        β                Beta
Γ        γ                Gamma
Δ        δ                Delta
Ε        ε                Epsilon
Ζ        ζ                Zeta
Η        η                Eta
Θ        θ      ϑ         Theta
Ι        ι                Iota
Κ        κ      ϰ         Kappa
Λ        λ                Lambda
Μ        μ                Mu
Ν        ν                Nu
Ξ        ξ                Xi
Ο        ο                Omicron
Π        π      ϖ         Pi
Ρ        ρ      ϱ         Rho
Σ        σ      ς         Sigma
Τ        τ                Tau
Υ        υ                Upsilon
Φ        φ      ϕ         Phi
Χ        χ                Chi
Ψ        ψ                Psi
Ω        ω                Omega

Bibliography

[AB66] Edgar Asplund and Lutz Bungart. A First Course in Integration. Holt,
Rinehart and Winston, New York, Toronto, London, 1966. [360]

[Abb15] Stephen Abbott. Understanding Analysis. Undergraduate Texts in Math-


ematics. Springer, New York, second edition, 2015. [xiv, xvii]

[Ahl78] Lars V. Ahlfors. Complex Analysis. An Introduction to the Theory of


Analytic Functions of One Complex Variable. International Series in Pure
and Applied Mathematics. McGraw-Hill, New York, third edition, 1978.
[456]

[Alu09] Paolo Aluffi. Algebra: Chapter 0. Volume 104 of Graduate Studies in


Mathematics. American Mathematical Society, Providence, RI, 2009.
[624]

[Ans06] Richard Anstee. The Newton-Raphson method. https://www.math.
ubc.ca/~anstee/math104/104newtonmethod.pdf, 2006. Last accessed
5 April 2017. [315]

[Art91] Michael Artin. Algebra. Prentice-Hall, Englewood Cliffs, NJ, 1991. [624]

[Axl15] Sheldon Axler. Linear Algebra Done Right. Undergraduate Texts in


Mathematics. Springer, Cham, third edition, 2015. [85]

[Ber79] S. K. Berberian. Classroom Notes: Regulated functions: Bourbaki's al-


ternative to the Riemann integral. Amer. Math. Monthly, 86(3):208-211,
1979. [239]

[BL06] Kurt Bryan and Tanya Leise. The $25,000,000,000 eigenvector: The lin-
ear algebra behind Google. SIAM Rev., 48(3):569-581, 2006. [517]

[BM14] E. Blackstone and P. Mikusinski. The Daniell integral. ArXiv e-prints,


January 2014. [360]

[Bou76] N. Bourbaki. Elements de mathematique. Fonctions d'une variable reelle,


Theorie elementaire. Hermann, Paris, 1976. [228]

[Bre08] David M. Bressoud. A Radical Approach to Lebesgue's Theory of Inte-


gration. MAA Textbooks. Cambridge University Press, Cambridge, 2008.
[360]


[BroOO] Andrew Browder. Topology in the complex plane. Amer. Math. Monthly,
107(5):393-401, 2000. [406]

[BT04] Jean-Paul Berrut and Lloyd N. Trefethen. Barycentric Lagrange inter-
polation. SIAM Rev., 46(3):501-517, 2004. [611, 612]

[CB09] Ruel V. Churchill and James Ward Brown. Complex Variables and Ap-
plications. McGraw-Hill, New York, eighth edition, 2009. [456]

[CF13] Robert M. Corless and Nicolas Fillion. A Graduate Introduction to Nu-


merical Methods. From the Viewpoint of Backward Error Analysis. With
a foreword by John Butcher. Springer, New York, 2013. [551]

[Cha95] Soo Bong Chae. Lebesgue Integration. Universitext. Springer-Verlag,
New York, second edition, 1995. [332, 345, 360, 380]

[Cha12] Francoise Chatelin. Eigenvalues of Matrices. Volume 71 of Classics in


Applied Mathematics. With exercises by Mario Ahues and the au-
thor. Translated with additional material by Walter Ledermann; revised
reprint of the 1993 edition. Society for Industrial and Applied Mathe-
matics (SIAM), Philadelphia, PA, 2012. [516]

[Cha15] Tim Chartier. When Life is Linear. From Computer Graphics to


Bracketology. Volume 45 of Anneli Lax New Mathematical Library.
Mathematical Association of America, Washington, DC, 2015. [30]

[Chi06] Carmen Chicone. Ordinary Differential Equations with Applications. Vol-


ume 34 of Texts in Applied Mathematics. Springer, New York, second
edition, 2006 . [315]

[CM09] Stephen L. Campbell and Carl D. Meyer. Generalized Inverses of Linear


Transformations. Volume 56 of Classics in Applied Mathematics. Reprint
of the 1991 edition; corrected reprint of the 1979 original. Society for
Industrial and Applied Mathematics (SIAM), Philadelphia, PA, 2009.
[517]

[Con90] John B. Conway. A Course in Functional Analysis. Volume 96 of Grad-


uate Texts in Mathematics. Springer-Verlag, New York, second edition,
1990. [137, 176]

[Con16] Keith Conrad. Differentiation under the integral sign. http://www.
math.uconn.edu/~kconrad/blurbs/analysis/diffunderint.pdf,
2016. Last accessed 5 April 2017. [360]

[Cos12] Christopher M. Cosgrove. Contour integration and Cauchy's theorem,


2012 . Last accessed 2 Jan 2017. [456]

[Dem82] James Demmel. The condition number of similarities that diagonalize


matrices. Technical report, Electronics Research Laboratory Memoran-
dum, University of California at Berkeley, Berkeley, CA, 1982. [572]

[Dem83] James Demmel. The condition number of equivalence transformations


that block diagonalize matrix pencils. SIAM J. Numer. Anal., 20(3):599-
610, 1983. [572]

[Dem97] James W. Demmel. Applied Numerical Linear Algebra. Society for In-
dustrial and Applied Mathematics, Philadelphia, PA, 1997. [315]

[Der72] William R. Derrick. Introductory complex analysis and applications. Aca-


demic Press, New York, 1972. [456]

[Die60] J. Dieudonne. Foundations of Modern Analysis. Volume 10 of Pure and


Applied Mathematics. Academic Press, New York, London, 1960. [228]

[DR50] A. Dvoretzky and C. A. Rogers. Absolute and unconditional convergence


in normed linear spaces. Proc. Nat. Acad. Sci. U. S. A., 36:192-197, 1950.
[213]

[Dri04] Bruce K. Driver. Analysis tools with examples. http://www.math.ucsd.
edu/~bdriver/DRIVER/Book/anal.pdf, 2004. Last accessed 5 April
2017. [380]

[DS10] Doug Smith, Maurice Eggen, and Richard St. Andre. A Transition to
Advanced Mathematics. Brooks Cole, seventh edition, 2010. [627]

[Fit06] P. Fitzpatrick. Advanced Calculus. Pure and Applied Undergraduate


Texts. American Mathematical Society, Providence, RI, 2006. [360]

[FLH85] Richard Phillips Feynman, Ralph Leighton, and Edward Hutchings.


Surely You 're Joking, Mr. Feynman! Adventures of a Curious Character.
WW Norton & Company, 1985. [289]

[GC08] Gary Chartrand, Albert D. Polimeni, and Ping Zhang. Mathematical Proofs:
A Transition to Advanced Mathematics. Pearson/Addison Wesley, second
edition, 2008. [627]

[Gre97] Anne Greenbaum. Iterative Methods for Solving Linear Systems. Vol-
ume 17 of Frontiers in Applied Mathematics. Society for Industrial and
Applied Mathematics, Philadelphia, PA, 1997. [524]

[GVL13] Gene H. Golub and Charles F. Van Loan. Matrix Computations. Johns
Hopkins Studies in the Mathematical Sciences. Johns Hopkins University
Press, Baltimore, MD, fourth edition, 2013. [315, 551]

[Hal74] Paul R. Halmos. Naive Set Theory. Undergraduate Texts in Mathemat-


ics. Reprint of the 1960 edition. Springer-Verlag, New York, 1974. [627,
629, 648, 650]

[Hal07a] Thomas C. Hales. The Jordan curve theorem, formally and informally.
Amer. Math. Monthly, 114(10):882- 894, 2007. [406]

[Hal07b] Thomas C. Hales. Jordan's proof of the Jordan curve theorem. Studies
in Logic, Grammar, and Rhetoric, 10(23):45-60, 2007. [406]

[Her96) I. N. Herstein. Abstract Algebra. Prentice-Hall, Upper Saddle River, NJ,


third edition, 1996. [624)

[HH99) John Hamal Hubbard and Barbara Burke Hubbard. Vector Calculus,
Linear Algebra, and Differential Forms: A Unified Approach. Prentice-
Hall, Upper Saddle River, NJ, 1999. [406]

[HRW12] Jeffrey Humpherys, Preston Redd, and Jeremy West. A fresh look at the
Kalman filter. SIAM Rev., 54(4) :801- 823, 2012. [286]

[HS75) Edwin Hewitt and Karl Stromberg. Real and Abstract Analysis.
A Modern Treatment of the Theory of Functions of a Real Vari-
able. Volume 25 of Graduate Texts in Mathematics, Springer-Verlag,
New York-Heidelberg, 1975. [201)

[IM98) Ilse C. F. Ipsen and Carl D. Meyer. The idea behind Krylov methods.
Amer. Math. Monthly, 105(10):889- 899, 1998. [551)

[Ise09) Arieh Iserles. A First Co·urse in the Numerical Analysis of Differen-


tial Equations. Cambridge Texts in Applied Mathematics. Cambridge
University Press, Cambridge, 2009. [524)

[JL97) Erxiong Jiang and Peteir C. B. Lam. An upper bound for the
spectral condition number of a diagonalizable matrix. Linear Algebra
Appl., 262:165-178, 1997. [572]

[Jon93) Frank Jones. Lebesgue Integration on Euclidean Space. Jones and Bartlett
Publishers, Boston, MA, 1993. [360)

[JT98) G. F. Jonsson and L. N. Trefethen. A numerical analyst looks at the


"cutoff phenomenon" in caird shuffling and other Markov chains. In Nu-
merical analysis 1997 (Dundee), volume 380 of Pitman Res. Notes Math.
Ser., pages 150- 178. Longman, Harlow, 1998. [564]

[KA82] L. V. Kantorovich and G. P. Akilov. Functional Analysis. Translated from


the Russian by Howard L. Silcock. Pergamon Press, Oxford, Elmsford,
NY, second edition, 1982. [315]

[Kan52] L. V. Kantorovich. Functional analysis and applied mathematics. NBS


Report 1509. U. S. Department of Commerce, National Bureau of
Standards, Los Angeles, CA, 1952. Translated by C. D . Benster. [315]

[Kan88] Shen Kangsheng. Historical development of the Chinese Remainder
Theorem. Archive for History of Exact Sciences, 38(4):285-305, 1988.
[606]

[Kat95] Tosio Kato. Perturbation Theory for Linear Operators. Classics in Math-
ematics. Reprint of the 1980 edition. Springer-Verlag, Berlin, 1995. [516]

[Lay02] D.C. Lay. Linear Algebra and Its Applications. Pearson Education, 2002.
[30, 85]

[Leo80] Steven J . Leon. Linear Algebra with Applications. Macmillan, Inc.,


New York; Collier-Macmillan Publishers, London, 1980. [30, 85]

[Lib73] Ulrich Libbrecht. Chinese Mathematics in the Thirteenth Century.


Volume 1 of MIT East Asian Science Series. The Shu-shu chiu-chang
of Ch'in Chiu-shao. M.I.T. Press, Cambridge, MA, London, 1973. [624]

[LM06] Amy N. Langville and Carl D. Meyer. Google's PageRank and Beyond:
The Science of Search Engine Rankings. Princeton University Press,
Princeton, NJ, 2006. [517]

[MacOO] C.R. MacCluer. The many proofs and applications of Perron's theorem.
SIAM Rev., 42(3):487-498, 2000. [517]

[Mae84] Ryuji Maehara. The Jordan curve theorem via the Brouwer fixed point
theorem. Amer. Math. Monthly, 91(10):641-643, 1984. [406]

[MH99] Jerrold E. Marsden and Michael J. Hoffman. Basic Complex Analysis.


W. H. Freeman and Company, New York, third edition, 1999. [456]

[Mik14] P. Mikusinski. Integrals with values in Banach spaces and locally convex
spaces. ArXiv e-prints, March 2014. [360]

[Mor17] Sidney A. Morris. Topology without tears. http://www.
topologywithouttears.net/topbook.pdf, 2017. Last accessed 5
April 2017. [239]

[Mun75] James R. Munkres. Topology: A First Course. Prentice-Hall, Englewood


Cliffs, NJ, 1975. [227, 239]

[Mun91] James R. Munkres. Analysis on Manifolds. Advanced Book Program.


Addison-Wesley Publishing Company, Redwood City, CA, 1991. [380]

[NJN98] Gail Nord, David Jabon, and John Nord. The global positioning system
and the implicit function theorem. SIAM Rev., 40(3):692-696, 1998. [301]

[Ort68] James M. Ortega. The Newton-Kantorovich theorem. Amer. Math.


Monthly, 75:658-660, 1968. [315]

[OS06] Peter J. Olver and Chehrzad Shakiban. Applied Linear Algebra. Pearson
Prentice-Hall, Upper Saddle River, NJ, 2006. [30, 85]

[Pal07] Richard S. Palais. A simple proof of the Banach contraction principle. J.


Fixed Point Theory Appl., 2(2 ):221- 223, 2007. [315]

[Pro08] S. David Promislow. A First Course in Functional Analysis. Pure and


Applied Mathematics (Hoboken). Wiley-Interscience [John Wiley &
Sons], Hoboken, NJ, 2008. [131, 137, 176]

[Rey06] Luc Rey-Bellet. Math 623 homework 3. http://people.math.
umass.edu/~lr7q/ps_files/teaching/math623/PS3.pdf, 2006. Last
accessed 5 April 2017. [360]

[RF10] H. L. Royden and P. M. Fitzpatrick. Real Analysis. Prentice-Hall,


New York, fourth edition, 2010. [335, 360]
[Roy63] H. L. Royden. Real Analysis. The Macmillan Co., New York; Collier-
Macmillan Ltd., London, first edition, 1963. [360]
[RT90] Satish C. Reddy and Lloyd N. Trefethen. Lax-stability of fully discrete
spectral methods via stability regions and pseudo-eigenvalues. Spectral
and high order methods for partial differential equations (Como, 1989).
Comput. Methods Appl. Mech. Engrg., 80(1-3):147- 164, 1990. [565]

[RT92] Satish C. Reddy and Lloyd N. Trefethen. Stability of the method of lines.
Numer. Math ., 62(2):235-267, 1992. [565]

[Rud87] Walter Rudin. Real and Complex Analysis. McGraw-Hill, New York,
third edition, 1987. [360]
[Rud91] Walter Rudin. Functional Analysis. International Series in Pure and Ap-
plied Mathematics. McGraw-Hill, New York, second edition, 1991. [137,
176]
[Sch12] Konrad Schmiidgen. Unbounded Self-adjoint Operators on Hilbert Space.
Volume 265 of Graduate Texts in Mathematics. Springer, Dordrecht,
2012. [470]
[Soh14] Houshang H. Sohrab. Basic Real Analysis. Birkhäuser/Springer,
New York, second edition, 2014. [360]

[Spi65] Michael Spivak. Calculus on Manifolds . A Modern Approach to Classi-


cal Theorems of Advanced Calculus. W. A. Benjamin, Inc., New York,
Amsterdam, 1965. [406]
[SS02] E. B. Saff and A. D. Snider. Fundamentals of Complex Analysis with Ap-
plications for Engineering and Science. Prentice-Hall, Pearson Education
Inc., third edition, 2002. [456]

[Ste98] G. W. Stewart. Matrix Algorithms. Vol. I. Basic Decompositions. Society


for Industrial and Applied Mathematics, Philadelphia, PA, 1998. [167]
[Ste09] William Stein. Elementary Number Theory: Primes, Congruences, and
Secrets. A Computational Approach. Undergraduate Texts in Mathemat-
ics. Springer, New York, 2009. [624]

[Str80] Gilbert Strang. Linear Algebra and Its Applications. Academic Press
[Harcourt Brace Jovanovich, Publishers], New York, London, second
edition, 1980. [30 , 85]
[Str93] Gilbert Strang. The fundamental theorem of linear algebra. Amer. Math.
Monthly, 100(9):848- 855, 1993. [163]

[Str10] Gilbert Strang. Linear algebra. Spring 2010. http://ocw.mit.
edu/courses/mathematics/18-06-linear-algebra-spring-2010/
video-lectures/, 2010. Last accessed 5 April 2017. [30, 85]

[Tap09] Richard Tapia. Keynote lecture 4: If it's fast it must be Newton's


method. In Proceedings of the 15th American Conference on Applied
Mathematics, AMATH'09, pages 14-14, Stevens Point, WI, 2009. World
Scientific and Engineering Academy and Society (WSEAS). [286]
[Tay11] Joseph L. Taylor. Complex Variables. Volume 16 of Pure and Applied
Undergraduate Texts. American Mathematical Society, Providence, RI,
2011. [456]
[TB97] Lloyd N. Trefethen and David Bau, III. Numerical Linear Algebra.
Society for Industrial and Applied Mathematics, Philadelphia, PA, 1997.
[137, 315, 516, 547, 551]
[TE05] Lloyd N. Trefethen and Mark Embree. Spectra and Pseudospectra. The
Behavior of Nonnormal Matrices and Operators. Princeton University
Press, Princeton, NJ, 2005. [564, 565, 572]
[Tre92] L. N. Trefethen. Pseudospectra of matrices. In Numerical Analysis 1991
(Dundee, 1991), volume 260 of Pitman Res. Notes Math. Ser., pages
234-266. Longman Sci. Tech., Harlow, 1992. [572]
[Tre97] Lloyd N. Trefethen. Pseudospectra of linear operators. SIAM Rev.,
39(3):383-406, 1997. [572]
[Van08] E. B. Van Vleck. On non-measurable sets of points, with an example.
Transactions of the American Mathematical Society, 9(2):237-244, April
1908. [335]
[Wik14] MathOverflow Community Wiki. Applications of the Chinese re-
mainder theorem. http://mathoverflow.net/questions/10014/
applications-of-the-chinese-remainder-theorem, 2009-2014. Last
accessed 5 April 2017. [624]
[Wil79] James H Wilkinson. Note on the practical significance of the Drazin
inverse. Technical report, Stanford, CA, 1979. [517]
[Wil07] Mark Wildon. A short proof of the existence of the Jordan
normal form. http://www.ma.rhul.ac.uk/~uvah099/Maths/
JNFfinal.pdf, 2007. Last accessed 5 April 2017. [517]
[Wri02] Thomas G. Wright. Eigtool. http://www.comlab.ox.ac.uk/
pseudospectra/eigtool/, 2002. Last accessed 5 April 2017. [572]
[WT94] Elias Wegert and Lloyd N. Trefethen. From the Buffon needle problem to
the Kreiss matrix theorem. A mer. Math. Monthly, 101 (2): 132-139, 1994.
[572]
Index

Abel-Weierstrass lemma, 413 basic


absolute condition number, columns, 65
302 variables, 63
addition basis, 13
closure of, 5, 576 standard, 13, 51
adjoint, 121 Bauer-Fike theorem, 560
adjugate, 73, 76 Beppo Levi theorem, 337
algebraic multiplicity, 143 Bernstein
almost everywhere, 329, 332 operator, 82
convergence, 333 polynomials, 54
equal, 332 Bessel's inequality, 97
nonnegative, 340 bijection, 38
analytic function , 411, 414, 434 bijective, 38, 637
angle preserving, 131 bilinear, 89, 667
arclength, 383 block matrix
function, 384 inverse, 666
parametrized by, 384 multiplication, 665
argument principle, 445 Bochner integral, 320
arithmetic-geometric mean inequality, Boolean ring, 575, 595
118 boundary of a set, 192
Arnoldi bounded
basis, 531 above, 645
method, 547 functions, 113
asymptotic behavior, 561 linear transformation, 114, 186,
automorphism 200, 235
of a ring, 595 linear transformation theorem, 213,
of a vector space, 38 216
axiom of choice, 647 set, 197
Brouwer fixed-point theorem, 225, 277
Bezier curve, 54
back substitution, 61 calculus
Banach space, 211 fundamental theorem of, 403
Banach-valued regulated integral cancellation, 588
multivariable, 324 canonical epimorphism, 41, 600
single-variable, 230 Cantor
barycentric diagonal argument, 650
Lagrange interpolation, ternary set, 332
610 cardinality, 648
weight, 610 Cartesian product, 14, 19, 596, 630


Cauchy computer aided design, 58


differentiation formula, 427 condition, 302
integral formula, 413, 424 condition number
sequence, 195, 196, 263 of a matrix, 305
Cauchy- Goursat theorem, 419, 421 absolute, of a function, 302
Cauchy-Riemann eigenvalue problem, 307
equation, 410 of a matrix, 307
equations, polar form, 451 relative, of a function, 303
theorem, 410 congruent
Cauchy-Riemann equation, 410 modulo n, 631
Cauchy-Schwarz inequality, 91 modulo an ideal, 598
Cayley-Hamilton theorem, 479 connected, 179, 224
chain, 20, 647 component, 396
chain rule conservative vector field, 388
for holomorphic functions, 412 continuity
Frechet derivative, 259 uniform, 199
change of basis, 53 continuous
change of variables, 263, 351 at a point, 185
characteristic polynomial, 143 function, 185, 188
Chinese remainder linear extension theorem , 213, 216
problem, 602 Lipschitz, 186, 200, 235, 254, 278
theorem, 80, 601, 604 pointwise, 187
classical adjoint, 76 uniformly, 199
closed continuously differentiable, 266
n-interval, 321 contour, 416
ball, 191 integral, 416, 417
set, 190 simple closed, 419
closure contraction
of a set, 192 mapping, 278
of operations, 5, 576 mapping principle, 279
codomain, 635 mapping principle, uniform, 282
cofactor, 73 mapping, uniform, 282
expansion, 65, 75 convergence
column space, 126 absolute, 212
commutative almost everywhere, 333
ring, 575 linear, 286
commutative diagram, 638 of sums, 212
compact, 179, 203 pointwise, 210
support, 122 quadratic, 286
complementary uniform, 210, 413
subspace, 16 uniform on compact subsets, 263
complete, 198 convex, 261
completion, 361 coordinate
complex change of, 299
conjugate, 654 coordinates
numbers, 653 change of, 48, 351
composition, 36, 636 hyperbolic, 360
of continuous functions, 187 hyperspherical, 356

in a basis, 47 differentiable
polar, 352 complex function, 408
spherical, 354 continuously, 253
coset, 23, 598 function, 241, 246, 252
operations, 24, 599 dimension, 18
countable, 648 formula, 46
cover direct sum, 14, 16
open, 203 Dirichlet function , 332
Cramer's rule, 77, 217 divides , 585
cross product, 667 division property, 583
CRT, see Chinese remainder theorem domain
curl, 402 Euclidean, 583
curve of a function, 635
differentiable, 242 dominated convergence theorem, 342
fitting, 129 dot product, 89
piecewise-smooth, 382 Drazin inverse, 500, 501
positively oriented, 398 dual space, 248
simple closed, 381
smooth, 382 Eckart- Young, Schmidt, Mirsky
smooth parametrized, 381 theorem, 167
smooth, oriented, 382 EEA, see extended Euclidean algorithm,
smooth, unoriented, 382 587
cutoff phenomenon, 564 eigenbasis, 151
eigennilpotent, 483
Daniell integral, 320, 328 eigenprojection, 463, 475
data compression, 168 eigenspace, 140
De Morgan's Laws, 630 generalized, 465, 468, 486
decay matrix, 564 eigenvalue, 140
decomposition semisimple, 492
LU, 62 simple, 307, 496
polar, 165 eigenvector
QR,103 generalized, 468
singular value, 162 elementary
Wedderburn, 500 matrix, 59
dense, 190 product, 68
derivative, 242 empty set, 627
directional, 244 equivalence
higher, 266 modulo a subspace, 22
linearity, 256 class, 22, 598, 632
of a complex function , 408 modulo n, 21, 631 , 633
of a parametrized curve , 242 modulo an ideal, 598
second, 266 relation, 598, 631
determinant, 65 Euclidean
De Moivre's formula, 655 algorithm, 586
diagonal matrix, 162 domain, 573, 583
diagonalizable, 151 extended algorithm, 587
orthonormally, 157 Euler's formula, 411, 415, 655
diffeomorphism, 349 extension by zero, 325

extension theorem, 18, 29 Gauss' mean value theorem, 426


exterior of a simple closed curve, 397 Gauss-Seidel
extreme value theorem, 205 convergence, 524
method, 522
Fatou's lemma, 340 gcd, 586
field, 659 generalized
finite intersection property, 206 eigenspace, 465, 468, 486
first isomorphism theorem Heine-Borel theorem, 208
for rings, 600 inverse, see pseudoinverse
for vector spaces, 42 Leibniz integral rule, 348
fixed point, 278 minimal residual method (GMRES),
fomula 530
Euler, 411 step function, 379
formal power series, 576 geometric multiplicity, 141
Fourier transform, discrete, 622 Givens rotation, 545
Frechet derivative, 246 , 252 GMRES, 530, 533
chain rule, 259 Google, 498
higher-order, 266 Gram-Schmidt, 99
product rule, 257 modified, 101
real finite-dimensional case, 246 graph, 635
Fredholm greatest
alternative, 136 common divisor, 586
integral transform, 281 element, 646
free variables, 63 lower bound, 645
Frobenius Green's theorem, 381, 399
inner product, 90, 136
Fubini's theorem, 320, 344 Holder's inequality, 118
for step functions, 372 Heine-Borel theorem
function, 635 generalized, 208
analytic, 411, 413 on lRn, 204
entire, 408 Hermitian conjugate, 84, 664
holomorphic, 408 Hessian matrix, 266
integrable (Daniell- Lebesgue), 330 Hilbert's identity, 471
regulated integrable, 324, 375 Hilbert- Schmidt
Riemann integrable, 320, 325, 340 inner product, 90
smooth, 266 holomorphic
fundamental function, 408, 434
subspaces theorem, 124 open mapping theorem, 445, 450
theorem of algebra, 430, 591 homeomorphism, 222
theorem of arithmetic, 590 Householder, 105
theorem of calculus, 262 Householder transformation, 105, 106
theorem of calculus for line hyperbolic coordinates, 360
integrals, 388 hyperplane, 321
fundamental theorem hyperspherical coordinates, 356
of calculus, 403
ideal, 578
Gamma function, 359 generating set, 581
Garner's formula, 606 identity map, 39

ill conditioned, 303 irreducible


image, 594, 635 element, of a ring, 588
implicit function theorem, 294 matrix, 497
index of a matrix, 466 isolated point, 189
index set, 639 isometric embedding, 362
indicator function, 228, 323 isometry, 116
induced isomorphic rings, 596
metric, 208 isomorphism
norm on linear transformations, of rings, 595, 596
114 of vector spaces, 36, 38
norm, from an inner product, 111 iterated integral, 345
induction, 644 iterative
infimum, 645 numerical methods, 519
inherited metric, 208 solvers, 521
injective, 637 iterative methods, 520
inner product, 88
Frobenius, 90, 136 Jacobi method, 522
Hilbert-Schmidt, 90 convergence, 524
positivity of, 91 Jacobian
space, 89 determinant, 297
standard, 89 matrix, 249
integers, 628 Jordan
integrable canonical form, 140, 506
function, Daniell-Lebesgue, 330 curve theorem, 396
function, regulated, 324 Jordan normal form, 480
function, Riemann, 320, 325, 340 Kantorovich, 315
function,regulated, 375 kernel, 35, 594
on an unbounded domain, 337 Kreiss constant, 562
integral Kreiss matrix theorem, 563
Bochner, 320 Kronecker delta, 95
contour, 417 Krylov
Daniell, 320, 328 basis, 527
iterated, 345 methods, 526
Lebesgue, 320, 328 solution, 529
mean value theorem, 262 subspace, 506, 527
interior
of a set, 182 Lagrange
of a simple closed curve, 397 basis, 603
interior point, 182 decomposition, 602
intermediate value theorem, 225 interpolant, 603
intersection of sets, 630 interpolant, barycentric form ,
invariant subspace, 147, 462, 468 610
inverse function theorem, 298 interpolation, 610
invertible, 38 interpolation, barycentric, 610
element, 578 Lagrange-Hermite interpolation, 616,
left, 642 617
right, 642 Lanczos method, 548
involution, 172 Laplace equation, 145

Laurent systeni, 32,58, 105,519,520,527,


expansion, 436 530
polynoniial, 576 honiogeneous, 35, 40
series, 433, 436 overdeterniined, 127
series, principal part, 438 transforniation, 31- 33
law of cosines, 92 bounded, 114
leading entry, 61 coniposition of, 51
least niatrix representation, 49
elenient, 647 Liouville's theoreni, 430
upper bound, 645 Lipschitz
upper bound property, 646 continuous, 186, 200, 235 , 254,
least squares, 127 278
Lebesgue locally at a point, 254
integrable, 367 lower bound, 645
integral, 320, 328 1.u.b. property, see least upper bound
nuniber, 206 property
left LU deconiposition, 62
eigenvector, 154
invertible, 642 nianifold, 381
pointing nornial vector, 398 niap, 635
Legendre polynoniial, 101, 453 Markov chain, 564
Leibniz integral rule niatrix, 663
generalized, 348 oo-norni , 116
Leibniz's integral rule, 346 1-norni, 116
Leontief, Wassily, 520 augniented, 60
level set, 191, 293, 635 block, 665
liniit decay, 564
inferior, 340 diagonalizable, 151
of a function, 188 diagonally doniinant, 524
of a sequence, 193 Herniitian conjugate, 664
point, 189 index of, 466
pointwise, 210 invertible, 665
superior, 340 irreducible, 497
line fitting, 129 niinor, 73
line integral, 386 nilpotent, 171
of scalar field, 386 nonnegative, 494
of vector field, 387 nonsingular, 52, 665
linear nornial, 158
approxiniation, 242 orthogonal, 98
conibination, 10 orthonornial, 98, 132
convergence, 286 positive, 494
dependence, 12 priniitive, 497
extension, 230 product, 663
functional, 121 row equivalent, 61
homogeneous system, 40 semisimple, 151
independence, 12 similar, 53
operator, 33 simple, 151
ordering, 644 sparse, 519
transition, 53 natural numbers, 628
transpose, 664 neighborhood, 182
tridiagonal, 145 Neumann series, 215
unitary, see matrix, orthonormal Newton
maximal element, 646 decomposition, 604
maximum modulus principle, 431 interpolation, 611
mean value theorem, 260 Newton's method
Gauss's, 426 scalar version, 287
integral, 262 vector version, 291
measurable Newton-Kantorovich theorem,
function, 333 292, 315
set, 331, 333 noise canceling, 10
measure, 323 nonsingular matrix, 665
zero, 331 norm, 111
zero, alternative definition, 364 Lᵖ-, 113
meromorphic, 439 p-, 112
method of successive approximations, Euclidean, 111
280 Frobenius, 113, 168
metric, 179, 180 induced
Cartesian product, 181 from an inner product, 111
discrete, 181 on linear transformations, 114
Euclidean, 180 Manhattan, 111
French railway, 233 matrix, 115
induced, 208 operator, 114
inherited, 208 sup, 113
normalized, 181 taxicab, 111
space, 180 normal
minimal polynomial, 526 distribution, 359
minimum modulus principle, 453 equation, 128
Minkowski's inequality, 119 matrix, 158
Mirsky, Schmidt, Eckart-Young theo- normed linear space, 111
rem, 167 null space, 35
modulus nullity, 43
componentwise, of a matrix, 513 numerical
of a complex number, 654 linear algebra, 62
monic polynomial, 143, 586
monotone oblique projection, 460
convergence theorem, 336 one-to-one, 637
decreasing, 336 onto, 637
increasing, 201, 336 set, 182
Moore-Penrose pseudoinverse, 126, 166 ball, 182
multiplication cover, 203
closure of, 576 set, 182
multiplicity order
of a zero, 435, 446 of a nilpotent operator, 485
of an eigenvalue of a pole of a function, 439
algebraic, 143, 156, 469 of a zero of a function, 435
geometric, 141, 156, 467 orderings, 643
orthogonal, 91 Perron
complement, 105, 123 root, 496
projection, 93, 96 theorem, 496
orthonormal, 87 Perron-Frobenius
matrix, 98, 132 eigenvalue, 496
set, 95 theorem, 497
transformation, 97 piecewise smooth, 382, 416
orthonormally pigeonhole principle, 652
diagonalizable, 157, 158 polar decomposition, 165
similar, 155 polarization identity, 131
outer product expansion, pole of a function, 439
164, 174 polygonal path, 421
overdetermined, 127 polynomial
monic, 143, 586
PageRank algorithm, 498 ring, 575
parallelogram identity, 131 poset, see partially ordered set
parametrized positive
contour, 416 definite, 159
curve semidefinite, 159
equivalent, 382 potential function, 388
smooth, 381 power
manifold, 389 method, 492
equivalent, 389 set, 628
measure of, 394 preimage, 186, 635
oriented, 389 preimage of a function, 187
tangent space, 390 prime, 588
unoriented, 390
surface, 389 primitive
parametrized manifold, 381 matrix, 497
partial primitive root of unity, 657
derivative, 245 principal ideal domain
kth-order, 266 Euclidean domain is a, 585
in a Banach space, projection, 136, 460, 639
255 canonical, 597
ordering, 646 complementary, 460
sums, 212 map, 33
partially ordered set, 646 spectral, 475
partition, 22, 598, 633 pseudoeigenvalue, 554
path, 416 pseudoeigenvector, 555
connected, 226 pseudoinverse
independent, 420 Drazin, 500, 501
periodic function, 236 Moore-Penrose, 166
permutation, 66 pseudospectral
even, 67 radius, 561
inversion, 67 pseudospectrum, 554
odd, 67 Pythagorean
sign, 67 law, 92
transposition, 67 theorem, 97
QR ring, 574
decomposition, 102, 103 commutative, 575, 576
decomposition, reduced, 103 homomorphism, 592
iteration, 538 of polynomials in a matrix, 576
quadratic convergence, 286 Ritz eigenvalues, 547
quasi-Newton method, 290 root
quotient of unity, 657
of a ring by an ideal, 598 simple, 304
of a vector space by a subspace, rotation map, 34
23 Rouché's theorem, 448
row
R-linear combination, 581 echelon form, 61
radius echelon form, reduced, 61
pseudospectral, 561 operations, 58
radius of convergence, 414 reduction, 58
range, 35 RREF, see row echelon form, reduced
rank, 43 Russell's paradox, 629
rank-nullity theorem, 43
Rayleigh quotient, 174 scalar, 4
reduced row echelon form, 61 Schmidt, Mirsky, Eckart-Young
REF, see row echelon form theorem, 167, 168
reflection, 172 Schur
regulated complements, 666
integrable functions, 324 form, 140, 156
integral lemma, 155
multivariable, Banach valued, second
324 barycentric form, 611
single variable, Banach valued, derivative, 266
228 isomorphism theorem, 45
relation, 631 segment, 648
relative condition number, 303 self-adjoint, 155
relatively prime, 586 seminorm, 111, 361
reparametrization, 382 semisimple
replacement theorem, 17 eigenvalue, 492
residual vector, 93 self-adjoint, 155
residue spectral mapping theorem, 153
of a function, 440 separated
theorem, 442 spectrally, 540
resolvent, 470 sequence, 193
of an operator, 470 Cauchy, 195, 263
set, 141, 470 convergent, 193
reverse Fatou lemma, 358 uniformly convergent, 210
Riemann integral, 325 sequentially compact, 206
Riemann's theorem, 426 series, 212
Riesz representation theorem, 120 sesquilinear, 89
right set, 627
eigenvector, 154 complement, 630
invertible, 642 difference, 630
intersection, 630 step function, 228, 323
of measure zero, 331 generalized, 379
partition, 633 integral of, 229, 324
union, 629 Stokes' theorem, 402
Sherman-Morrison-Woodbury lemma, subcover
666 finite, 203
shifting subdivision
QR iteration, 550 generalized, 321
sign of a complex number, of intervals, 321
656 one-dimensional, 228
similar matrix, 53 submatrix, 73
simple submultiplicative property,
closed curve, 381 115
eigenvalue, 307, 496 subsequence, 197
matrix, 151 subset, 628
pole, 439 proper, 628
region, 398 subspace, 7
root, 304 complementary, 14, 16
simply connected, 397 invariant, 147
singular proper, 9
matrix, 52 trivial, 9
value decomposition, 162 substitution formula, 349
compact form, 164 successive approximation, 280
values, 162 successive overrelaxation
singularity convergence, 525
essential, 439 superposition, 9
isolated, 438 support, 635
removable, 439 supremum, 645
skew-Hermitian, 171 surface
small gain theorem, 171 parametrized, 389
smooth, 266 surjective, 637
SOR, see successive overrelaxation SVD, see singular value
span, 10 decomposition
spectral Sylvester map, 34
condition number, 560 symmetric group, 66
decomposition theorem,
487 tangent
mapping theorem, 489 to a curve, 382
radius, 474, 478 unit, 382
resolution formula, 478 tangent vector, 243
theorem Taylor
Hermitian matrices, 157 formula, 268, 269
normal matrices, 158 remainder, 271
spectrally separated, 540 series for holomorphic functions,
spectrum, 141, 470 433
spherical coordinates, 354 topological
Spijker's lemma, 566 properties, 179, 222, 223
Stein map, 34 spaces, 186
Index 689

topologically equivalent, 219, 220 universal property of the product, 639


norms, 134, 221 upper bound, 645
topology, 179, 186
induced, 219 valuation, 583
total variables
derivative, 246 change of, 351
ordering, 643 vector, 4
totally bounded, 206, 208 field, 387
transient behavior, 561 conservative, 388
transition matrix, 48, 53 length of, 91
transpose, 664 space, 4
transposition matrix, 59 unit, 91
tridiagonal matrix, 145 vectors
angle between, 92
unbounded linear transformation, 235 volume, 331, 351
uncountable, 648 n-dimensional, 323
uniform of n-ball, 360
continuity, 195, 199 of a parallelepiped, 394
contraction mapping, 282
contraction mapping principle, Wedderburn decomposition, 500
282 well conditioned, 303
convergence, 210, 413 well-defined functions, 640
convergence on compact subsets, well-ordering axiom, 644
263 well-ordering principle, 647
properties, 223 winding number, 441
union of sets, 629 WOA, see well-ordering axiom
unit, 578 Wronskian, 85
normal, 391, 392
tangent, 382 Young's inequality, 118
vector, 91
zero
unitarily similar, 155
and pole counting formula, 446
unitary matrix, see orthonormal
divisors, 583
matrix
of order n, 435
unity
Zorn's lemma, 20, 647
ring, 577
