Optical Imaging and Aberrations Part III Wavefront Analysis

PART III
WAVEFRONT ANALYSIS
VIRENDRA N. MAHAJAN
THE AEROSPACE CORPORATION

AND
COLLEGE OF OPTICAL SCIENCES - THE UNIVERSITY OF ARIZONA
Bellingham, Washington USA

Library of Congress Cataloging-in-Publication Data
Mahajan, Virendra N.
Optical imaging and aberrations, part III: wavefront analysis / Virendra N. Mahajan
pages cm.
Includes bibliographical references and index.
ISBN 978-0-8194-9111-4
1. Optical measurements. 2. Aberration--Measurement. 3. Orthogonal decompositions.
4. Orthogonal polynomials. I. Title.
QC367.M24 2013
621.36--dc23
2013018827
Published by
SPIE
P.O. Box 10
Bellingham, Washington 98227-0010 USA
Phone: +1 360.676.3290
Fax: +1 360.647.1445
Email: Books@spie.org
Web: http://spie.org
Copyright © 2013 Society of Photo-Optical Instrumentation Engineers
All rights reserved. No part of this publication may be reproduced or distributed in any
form or by any means without written permission of the publisher.
The content of this book reflects the work and thought of the author(s). Every effort has
been made to publish reliable and accurate information herein, but the publisher is not
responsible for the validity of the information or for any outcomes resulting from reliance
thereon.
Printed in the United States of America.

First printing
Front cover: Shown from left to right are the aberration-free PSFs of optical imaging
systems with circular, annular, hexagonal, elliptical, rectangular, and square pupils.
To my grandchildren
Maya, Leela, Rohan, and Krishan
FOREWORD
For years Vini Mahajan has been publishing a book series on optical imaging and
aberrations. Part I of the series on Ray Geometrical Optics was published in 1998, and
Part II on Wave Diffraction Optics followed in 2001. A second edition of Part II appeared
in 2011. Now Vini has written Part III on Wavefront Analysis, which should be of interest
to anyone working in the fields of optical design, fabrication, or testing.
Wavefront Analysis is focused on the use of orthonormal polynomials for wavefront
analysis of optical imaging systems with pupils of different shapes. The book starts with
an excellent introduction to optical imaging and aberrations. These first two chapters
should be of interest to anyone working in optics. Chapter 3 describes orthonormal
polynomials and the Gram–Schmidt orthonormalization process for obtaining
orthonormal polynomials over one domain from those that are orthonormal over another.
Chapter 4 is a long and complete chapter on imaging and aberrations for optical
systems with circular pupils. The chapter covers the PSF and OTF for aberration-free
imaging, Strehl ratio and aberration balancing and tolerancing, and a very complete
description of Zernike circle polynomials. Isometric, interferometric, and imaging
characteristics of the circle polynomial aberrations are very nicely explained and
illustrated. The important relationship between the circle polynomials and the classical
aberrations is discussed. Since optical systems generally have circular pupils, this chapter
will be of use to almost anyone working in optics.
The next several chapters are intended for readers interested in optical systems with
noncircular or apodized circular or annular pupils. Much of this material is difficult to
find in such detail elsewhere. The chapters start with a brief discussion of aberration-free
imaging that includes both the PSF and the OTF of the optical system, as this is
potentially the ultimate goal of any optical design or test. Then the polynomials
appropriate for systems with pupils of different shapes representing balanced classical
aberrations are described in detail. As in the case of the circle polynomial aberrations, the
isometric, interferometric, and PSF plots of the first forty-five polynomial aberrations for
systems with hexagonal, elliptical, annular, rectangular, and square pupils facilitate
understanding of their significance. Systems with circular and annular pupils with
Gaussian illumination, anamorphic systems with square and circular pupils, and those
with circular and annular sector pupils are also discussed thoroughly.
Anyone thinking of using the Zernike circle polynomials for wavefront analysis of
systems with noncircular pupils should read Chapter 12, where their pitfalls are
illustrated by applying them to systems with annular and hexagonal pupils. Numerical
examples on the calculation of the orthonormal aberration coefficients from the
wavefront or the wavefront slope data given in Chapter 14 add to the utility and
practicality of the book. A summary at the end of each chapter is quite useful, as it
describes the essence of the content.
Vini is an excellent writer with the gift of explaining complex topics in a simplified, yet
rigorous, manner. As in the first two volumes of this book series, the material presented
in Part III is thorough and detailed, and much of it is from his own publications.
Wavefront Analysis is primarily analytical in nature, but it is generally easy to read with a
lot of examples and numerical results. Both students and experienced optical engineers
and scientists who have a need for wavefront analysis of optical systems will find it to be
extremely useful.
James C. Wyant
Tucson, Arizona
June 2013
TABLE OF CONTENTS
PART III. WAVEFRONT ANALYSIS

Preface ........................................................................................................................... xvii
Acknowledgments .......................................................................................................... xix
Symbols and Notation.................................................................................................... xxi
CHAPTER 1: OPTICAL IMAGING ............................................................. 1

1.1 Introduction ............................................................................................................................ 3
1.2 Diffraction Image ................................................................................................................... 3
1.2.1 Pupil Function .......................................................................................................... 4
1.2.2 PSF ........................................................................................................................... 5
1.2.3 OTF .......................................................................................................................... 6
1.3 Strehl Ratio ............................................................................................................................. 7
1.3.1 General Expression .................................................................................................. 7
1.3.2 Approximate Expression in Terms of Aberration Variance ..................................... 9
1.4 Aberration Balancing ........................................................................................................... 10
1.5 Summary ............................................................................................................................... 11
References ........................................................................................................................................ 12
CHAPTER 2: OPTICAL WAVEFRONTS AND THEIR ABERRATIONS .......... 13

2.1 Introduction .......................................................................................................................... 15
2.2 Optical Imaging .................................................................................................................... 15
2.3 Wave and Ray Aberrations ................................................................................................. 17
2.4 Defocus Aberration .............................................................................................................. 22
2.5 Wavefront Tilt ...................................................................................................................... 23
2.6 Aberration Function of a Rotationally Symmetric System .............................................. 25
2.7 Observation of Aberrations: Interferograms .................................................................... 29
2.8 Summary ............................................................................................................................... 31
References ........................................................................................................................................ 33
CHAPTER 3: ORTHONORMAL POLYNOMIALS AND GRAM–SCHMIDT

ORTHONORMALIZATION................................................... 35
3.1 Introduction .......................................................................................................................... 37
3.2 Orthonormal Polynomials ................................................................................................... 37
3.3 Equivalence of Orthogonality-Based Coefficients and Least-Squares Fitting ............... 39
3.4 Orthonormalization of Zernike Circle Polynomials over Noncircular Pupils ............... 40
3.5 Unit Pupil .............................................................................................................................. 43
3.6 Summary ............................................................................................................................... 43
References ........................................................................................................................................ 46
CHAPTER 4: SYSTEMS WITH CIRCULAR PUPILS...................................... 47

4.1 Introduction .......................................................................................................................... 49
4.2 Pupil Function....................................................................................................................... 49
4.3 Aberration-Free Imaging .................................................................................................... 50
4.3.1 PSF ......................................................................................................................... 51
4.3.2 OTF ........................................................................................................................ 53
4.4 Strehl Ratio and Aberration Tolerance.............................................................................. 54
4.4.1 Strehl Ratio............................................................................................................. 54
4.4.2 Defocus Strehl Ratio............................................................................................... 55
4.4.3 Approximate Expressions for Strehl Ratio............................................................. 56
4.5 Balanced Aberrations........................................................................................................... 57
4.6 Description of Zernike Circle Polynomials ........................................................................ 63
4.6.1 Analytical Form...................................................................................................... 63
4.6.2 Circle Polynomials in Polar Coordinates ............................................................... 65
4.6.3 Polynomial Ordering .............................................................................................. 65
4.6.4 Number of Circle Polynomials through a Certain Order n .................................... 65
4.6.5 Relationships among the Indices n, m, and j .......................................................... 69
4.6.6 Uniqueness of Circle Polynomials ......................................................................... 69
4.6.7 Circle Polynomials in Cartesian Coordinates......................................................... 70
4.7 Zernike Circle Coefficients of a Circular Aberration Function ...................................... 70
4.8 Symmetry Properties of Images Aberrated by a Circle Polynomial Aberration ........... 74
4.8.1 Symmetry of PSF ................................................................................................... 74
4.8.2 Symmetry of OTF................................................................................................... 76
4.9 Isometric, Interferometric, and Imaging Characteristics of
Circle Polynomial Aberrations ........................................................................................... 78
4.9.1 Isometric Characteristics ........................................................................................ 78
4.9.2 Interferometric Characteristics ............................................................................... 78
4.9.3 PSF Characteristics ................................................................................................ 83
4.9.4 OTF Characteristics ............................................................................................... 84
4.10 Circle Polynomials and Their Relationships with Classical Aberrations ....................... 88
4.10.1 Introduction ............................................................................................................ 88
4.10.2 Wavefront Tilt and Defocus ................................................................................... 88
4.10.3 Astigmatism ........................................................................................................... 89
4.10.4 Coma....................................................................................................................... 90
4.10.5 Spherical Aberration............................................................................................... 90
4.10.6 Seidel Coefficients from Zernike Coefficients ....................................................... 91
4.10.7 Strehl Ratio for Seidel Aberrations with and without Balancing ........................... 92
4.11 Zernike Coefficients of a Scaled Pupil ............................................................................... 92
4.11.1 Theory .................................................................................................................... 92
4.11.2 Application to a Seidel Aberration Function.......................................................... 97
4.11.3 Numerical Example................................................................................................ 99
4.12 Summary ............................................................................................................................. 102
References ...................................................................................................................................... 103
CHAPTER 5: SYSTEMS WITH ANNULAR PUPILS .................................... 105

5.1 Introduction ........................................................................................................................ 107
5.2 Aberration-Free Imaging .................................................................................................. 107
5.2.1 PSF ....................................................................................................................... 107
5.2.2 OTF ...................................................................................................................... 109
5.3 Strehl Ratio and Aberration Balancing............................................................................ 111
5.4 Orthonormalization of Circle Polynomials over an Annulus ......................................... 114
5.5 Annular Polynomials ......................................................................................................... 116
5.6 Annular Coefficients of an Annular Aberration Function ............................................. 123
5.7 Strehl Ratio for Annular Polynomial Aberrations ......................................................... 129
Annular Polynomial Aberrations ..................................................................................... 132
5.9 Summary ............................................................................................................................. 139
References ...................................................................................................................................... 140
CHAPTER 6: SYSTEMS WITH GAUSSIAN PUPILS ................................... 141

6.1 Introduction ........................................................................................................................ 143
6.2 Gaussian Pupil .................................................................................................................... 144
6.3.1 PSF ....................................................................................................................... 145
6.3.2 Optimum Gaussian Radius.................................................................................. 146
6.3.3 OTF ...................................................................................................................... 147
6.5 Orthonormalization of Zernike Circle Polynomials over a Gaussian Circular Pupil . 153
6.6 Gaussian Circle Polynomials Representing Balanced Primary Aberrations for a
Gaussian Circular Pupil..................................................................................................... 155
6.7 Weakly Truncated Gaussian Pupils ................................................................................. 156
6.8 Aberration Coefficients of a Gaussian Circular Aberration Function......................... 157
6.9 Orthonormalization of Annular Polynomials over a Gaussian Annular Pupil ............ 157
6.10 Gaussian Annular Polynomials Representing Balanced Primary Aberrations for a
Gaussian Annular Pupil ..................................................................................................... 159
6.11 Aberration Coefficients of a Gaussian Annular Aberration Function ......................... 161
6.12 Summary ............................................................................................................................. 161
References ...................................................................................................................................... 163
CHAPTER 7: SYSTEMS WITH HEXAGONAL PUPILS ............................... 165

7.1 Introduction ........................................................................................................................ 167
7.2 Pupil Function..................................................................................................................... 168
7.3.1 PSF ..........................................................................................................169
7.3.2 OTF ..........................................................................................................174
7.4 Hexagonal Polynomials...................................................................................................... 177
7.5 Hexagonal Coefficients of a Hexagonal Aberration Function........................................ 185
Hexagonal Polynomial Aberrations ................................................................................. 187
7.7 Seidel Aberrations, Standard Deviation, and Strehl Ratio............................................. 194
7.7.1 Defocus ....................................................................................................194
7.7.2 Astigmatism............................................................................................. 194
7.7.3 Coma ........................................................................................................195
7.7.4 Spherical Aberration ................................................................................196
7.7.5 Strehl Ratio ..............................................................................................197
7.8 Summary ............................................................................................................................. 197
References ...................................................................................................................................... 200
CHAPTER 8: SYSTEMS WITH ELLIPTICAL PUPILS ................................... 201

8.1 Introduction ........................................................................................................................ 203
8.2 Pupil Function..................................................................................................................... 203
8.3.1 PSF ....................................................................................................................... 204
8.3.2 OTF ...................................................................................................................... 207
8.4 Elliptical Polynomials......................................................................................................... 209
8.5 Elliptical Coefficients of an Elliptical Aberration Function ......................................... 210
Elliptical Polynomial Aberrations..................................................................................... 214
8.7 Seidel Aberrations and Their Standard Deviations ........................................................ 228
8.7.1 Defocus ................................................................................................................. 228
8.7.2 Astigmatism ......................................................................................................... 228
8.7.3 Coma..................................................................................................................... 229
8.7.4 Spherical Aberration............................................................................................. 230
8.8 Summary ............................................................................................................................. 232
References ...................................................................................................................................... 234
CHAPTER 9: SYSTEMS WITH RECTANGULAR PUPILS ............................ 235
9.1 Introduction ........................................................................................................................ 237
9.2 Pupil Function..................................................................................................................... 237
9.3.1 PSF ..........................................................................................................238
9.3.2 OTF ..........................................................................................................240
9.4 Rectangular Polynomials ................................................................................................... 242
9.5 Rectangular Coefficients of a Rectangular Aberration Function.................................. 243
Rectangular Polynomial Aberrations ............................................................................... 247
9.7.1 Defocus ....................................................................................................260
9.7.2 Astigmatism............................................................................................. 260
9.7.3 Coma ........................................................................................................261
9.8 Summary ............................................................................................................................. 264
References ...................................................................................................................................... 265
CHAPTER 10: SYSTEMS WITH SQUARE PUPILS ..................................... 267

10.1 Introduction ........................................................................................................................ 269
10.2 Pupil Function..................................................................................................................... 269
10.3.1 PSF ..........................................................................................................272
10.3.2 OTF ..........................................................................................................274
10.4 Square Polynomials ............................................................................................................ 281
10.5 Square Coefficients of a Square Aberration Function.................................................... 282
Square Polynomial Aberrations ........................................................................................ 289
10.7.1 Defocus ....................................................................................................289
10.7.2 Astigmatism............................................................................................. 289
10.7.3 Coma ........................................................................................................290
10.8 Summary ............................................................................................................................. 293
References ...................................................................................................................................... 294
CHAPTER 11: SYSTEMS WITH SLIT PUPILS ............................................. 295
11.1 Introduction ........................................................................................................................ 297
11.2.1 PSF ..........................................................................................................297
11.2.2 Image of an Incoherent Slit......................................................................298
11.3.1 Strehl Ratio ..............................................................................................299
11.3.2 Aberration Balancing............................................................................... 289
11.4 Slit Polynomials .................................................................................................................. 301
11.5 Standard Deviation of a Primary Aberration ................................................................. 302
11. Summary ............................................................................................................................. 305
References ...................................................................................................................................... 306
CHAPTER 12: USE OF ZERNIKE CIRCLE POLYNOMIALS FOR

NONCIRCULAR PUPILS ................................................. 307
12.1 Introduction ........................................................................................................................ 309
12.2 Relationship Between the Orthonormal and the Corresponding
Zernike Circle Coefficients ................................................................................................ 309
12.3 Use of Zernike Circle Polynomials for the Analysis of an Annular Wavefront ........... 314
12.3.1 Zernike Circle Coefficients in Terms of the Annular Coefficients ...................... 314
12.3.2 Interferometer Setting Errors................................................................................ 320
12.3.3 Wavefront Fitting ................................................................................................. 320
12.3.4 Application to an Annular Seidel Aberration Function........................................ 321
12.3.4.1 Annular Coefficients ............................................................................ 321
12.3.4.2 Circle Coefficients................................................................................ 323
12.3.4.3 Residual Aberration Function after Removing
Interferometer Setting Errors................................................................ 323
12.3.4.4 Error with Assuming Circle Polynomials to be
Orthogonal over an Annulus ................................................................ 325
12.3.4.5 Numerical Example ............................................................................. 326
12.4 Use of Zernike Circle Polynomials for the Analysis of a Hexagonal Wavefront ......... 332
12.4.1 Zernike Circle Coefficients in Terms of Hexagonal Coefficients........................ 332
12.4.2 Interferometer Setting Errors................................................................................ 335
12.4.3 Numerical Example.............................................................................................. 336
12.5 Aberration Coefficients from Discrete Wavefront Data................................................. 345
12.6 Summary ............................................................................................................................. 345
References ...................................................................................................................................... 348
CHAPTER 13: ANAMORPHIC SYSTEMS................................................ 349
13.1 Introduction ........................................................................................................................ 351
13.2 Gaussian Imaging ............................................................................................................... 352
13.3 Classical Aberrations ......................................................................................................... 354
13.4 Strehl Ratio and Aberration Balancing for a Rectangular Pupil .................................. 355
13.5 Aberration Polynomials Orthonormal over a Rectangular Pupil ................................. 356
13.6 Expansion of a Rectangular Aberration Function in Terms of Orthonormal
Rectangular Polynomials
13.7 Anamorphic Imaging System with a Circular Pupil
13.7.1 Balanced Aberrations
13.7.2 Orthonormal Polynomials Representing Balanced Aberrations
13.8 Comparison of Polynomials for Rotationally Symmetric and
Anamorphic Imaging Systems
13.9 Summary
References
CHAPTER 14: NUMERICAL WAVEFRONT ANALYSIS
14.1 Introduction
14.2 Zernike Coefficients from Wavefront Data
14.2.1 Theory
14.2.2 Numerical Example
14.3 Zernike Coefficients from Wavefront Slope Data
14.3.1 Theory
14.3.2 Alternative Approach for Obtaining Zernike Coefficients from
Wavefront Slope Data
14.4 Summary
References
APPENDIX: SYSTEMS WITH SECTOR PUPILS
Index
PREFACE
This book is Part III of a series of books on Optical Imaging and Aberrations. Part I
on Ray Geometrical Optics and Part II on Wave Diffraction Optics were published
earlier. Part III is on Wavefront Analysis, which is an integral part of optical design,
fabrication, and testing. In optical design, rays are traced to determine the wavefront and
thereby the quality of a design. In optical testing, the fabrication errors and, therefore, the
associated aberrations are measured by way of interferometry. In both cases, the quality
of the wavefront is determined from the aberrations obtained at an array of points. The
aberrations thus obtained are used to calculate the mean, the peak-to-valley, and the
standard deviation values. While such statistical measures of the wavefront are part of
wavefront analysis, the purpose of this book is to determine the content of the wavefront
by decomposing the ray-traced or test-measured data in terms of polynomials that are
orthogonal over the expected domain of the data. These polynomials must include the
basic aberrations of wavefront defocus and tilt, and represent balanced classical
aberrations.
We start Part III with an outline of optical imaging in the presence of aberrations in
Chapter 1, i.e., on how to obtain the point-spread and optical transfer functions of an
imaging system with an arbitrarily shaped pupil. The Strehl ratio of a system as a measure
of image quality is introduced in this chapter, and shown to be dependent only on the
aberration variance when the aberration is small. It is followed in Chapter 2 with a brief
discussion of the wavefronts and aberrations. This chapter introduces the nomenclature of
aberrations. How to obtain the orthogonal polynomials over a certain domain from those
over another is discussed in Chapter 3. For systems with a circular pupil, the Zernike
circle polynomials are well known for wavefront analysis. They are discussed at length in
Chapter 4. These polynomials are orthogonalized over an annular pupil in Chapter 5, and
over a Gaussian pupil in Chapter 6. They are obtained similarly for systems with
hexagonal, elliptical, rectangular, square, and slit pupils in the succeeding chapters. For
each pupil, the polynomials are given in their orthonormal form so that an expansion
coefficient (with the exception of piston) represents the standard deviation of the
corresponding polynomial aberration term. The standard deviation of a Seidel aberration
with and without aberration balancing is also discussed in these chapters.
Since the Zernike circle polynomials form a complete set, a wavefront over any
domain can be expanded in terms of them. However, the pitfalls of using them over a
noncircular domain, which result from their lack of orthogonality over that domain,
are discussed in Chapter 12. Finally, the aberrations of anamorphic
systems are discussed, and polynomials suitable for their aberration analysis are given in
Chapter 13 for both rectangular and circular pupils. The use of the orthonormal
polynomials for determining the content of a wavefront is demonstrated in Chapter 14
by computer simulations of circular wavefronts. The determination of the aberration
coefficients from the wavefront slope data, as in a Shack–Hartmann sensor, is also
discussed in this chapter.
El Segundo, California Virendra N. Mahajan

June 2013
ACKNOWLEDGMENTS
Once again, it is a great pleasure to acknowledge the generous support I have
received over the years from my employer, The Aerospace Corporation, in preparing Part
III on Wavefront Analysis in a series of books on Optical Imaging and Aberrations. My
special thanks go to my former classmate Dr. Bill Swantner for his constant advice on
and constructive critique of my work. I have benefitted greatly from his practical
expertise in both optical design and testing. The Sanskrit verse on p. xxiii was provided
by Professor Sally Sutherland of the University of California at Berkeley. Many thanks to
Professor James W. Wyant for writing the Foreword for this book.
I am grateful to Professor José Antonio Díaz Navas for carrying out many computer
calculations and preparing many of the figures. My thanks to Drs. Barry Johnson, James
Harvey, and Daniel Topa for reading an early version of the manuscript and suggesting that
I include examples of wavefront analysis. I am grateful to Professor Eva Acosta for her
help with writing Chapter 14 on Numerical Wavefront Analysis, as my response to their
suggestion. Of course, any shortcomings or errors anywhere in the book are totally my
responsibility.
As in the past, I cannot say enough about the constant support I have received from
my wife Shashi over the many years it has taken me to complete this three-part series. I
dedicate Part III to my grandchildren.
Finally, I would like to thank SPIE Press Editors Dara Burrows and Scott McNeill,
and Manager Tim Lamkins for their quality support in bringing this book to publication.
It has always been a pleasure to work with the SPIE staff, starting with the Publications
Director Eric Pepper.
SYMBOLS AND NOTATION
$a_i$    aberration coefficient
$A$    amplitude
$A_i$    peak aberration coefficient
$B_d$    defocus coefficient
$B_j$    wave aberration polynomial
$B_t$    tilt coefficient
$c$    aspect ratio
$E_j$    elliptical polynomial
$F$    focal ratio
$G_j$    Gaussian or vector polynomial
$H_j$    hexagonal polynomial
$I$    irradiance
Im    imaginary part
$j$    polynomial number
$J_n$    Bessel function
$L_j$    Legendre polynomial
$M$    magnification
MTF    modulation transfer function
OTF    optical transfer function
$P$    object point
$P'$    Gaussian image point
$P_{ex}$    power in the exit pupil
$P_i$    image power
$P_n$    polynomial
$P(\cdot)$    pupil function
PSF    point-spread function
PTF    phase transfer function
$r$    radial coordinate
$r_c$    radius of circle
$\vec{r}_i$    image point position vector
$\vec{r}_p$    pupil point position vector
$R$    radius of reference sphere
Re    real part
$R_j$    rectangular polynomial
$R_n^m(\rho)$    Zernike radial polynomial
$S$    Strehl ratio
$S_{ex}$    area of exit pupil
$S_j$    square, sector, or ray aberration polynomial
$\vec{V}$    vector polynomial
$x, y$    Cartesian coordinates of a point
$W$    wave aberration
$Z_n^m$    Zernike circle polynomial
$Z_j$    Zernike circle polynomial
$\vec{v}_i$    image spatial frequency vector
$v$    normalized spatial frequency
$\tau$    optical transfer function
$\rho = r/a$    normalized radial coordinate
$\theta$    polar angle of a position vector
$\phi$    polar angle of frequency vector
$\epsilon$    obscuration or aspect ratio
$\delta(\cdot)$    Dirac delta function
$\delta_{ij}$    Kronecker delta
$\Delta$    longitudinal defocus
$\Phi$    phase aberration
$r, \theta$    polar coordinates of a point
$\lambda$    optical wavelength
$\xi, \eta$    spatial frequency coordinates
$\sigma_W$    standard deviation (wave)
$\sigma_\Phi$    standard deviation (phase)
Anantaratnaprabhavasya yasya himaṃ na saubhagyavilopi jatam
Eko hi doṣo guṇasannipate nimajjatindoḥ kiraṇeṣvivaṅkaḥ
The snow does not diminish the beauty of the Himalayan mountains
which are the source of countless gems. Indeed, one flaw is lost
among a host of virtues, as the moon’s dark spot is lost among its rays.
Kalidasa Kumarasambhava 1.3
CHAPTER 1
OPTICAL IMAGING
1.1 Introduction
1.2 Diffraction Image
1.2.1 Pupil Function
1.2.2 PSF
1.2.3 OTF
1.3 Strehl Ratio
1.3.1 General Expression
1.3.2 Approximate Expressions in Terms of Aberration Variance
1.4 Aberration Balancing
1.5 Summary
References
Chapter 1
Optical Imaging
1.1 INTRODUCTION
The position and the size of the Gaussian image of an object formed by an optical
imaging system are determined by using its Gaussian imaging equations. The aperture stop
is the element of the system that most limits the amount of light entering it. The entrance pupil determines
the amount of light from an object that enters it, and the exit pupil determines how that
light is distributed in the image. The Gaussian image is an exact replica of the object,
except for its magnification. The diffraction image of an isoplanatic incoherent object is
given by the convolution of the Gaussian image and the diffraction image of a point
object, called the point-spread function (PSF). In the spatial frequency domain, the
spectrum of the image is correspondingly given by the product of the optical transfer
function (OTF), which is the Fourier transform of the PSF, and the spectrum of the
Gaussian image. The image is obtained by inverse Fourier transforming its spectrum [1].
We define a pupil function, representing the complex amplitude at the exit pupil, and give
equations for obtaining the PSF and the OTF.
The aberrations of the system determine the quality of an image. An important
measure of the quality of an image is its Strehl ratio, which represents the ratio of the
central irradiances of the PSF with and without the aberration. This ratio is discussed and
simple but approximate expressions for it are derived for small aberrations in terms of the
variance of the aberration at the exit pupil. Since the Strehl ratio is higher for a smaller
variance, we discuss aberration balancing in which an aberration of a higher order is
balanced with one or more aberrations of lower order to minimize its variance and
thereby maximize the Strehl ratio. We discuss some general results on the effects of
nonuniform amplitude, called apodization, and nonuniform phase, called aberration, at
the exit pupil on the irradiance at the center of the reference sphere with respect to which
the aberration is defined. For a given total power in the pupil and, therefore, in the image
of a point object, maximum central irradiance is obtained for a system with an
unapodized and unaberrated pupil. Moreover, the peak value of an unaberrated image lies
at the center of curvature of the reference sphere regardless of the apodization of the
pupil. Generally, the effect of even large amplitude variations across the pupil is
relatively small compared to that of even small aberrations.
1.2 DIFFRACTION IMAGE

The Gaussian image of a point object formed by an imaging system is determined by
using Gaussian optics. In the Gaussian approximation, the aberrations are completely
neglected, and all of the rays originating at the point object and transmitted by the system
pass through the Gaussian image point. In reality, however, when the object rays are
traced through the system, they do not generally pass through the Gaussian image point
due to the aberrations. Instead, they are distributed in the vicinity of the image point, and
their distribution is referred to as the spot diagram. In practice, even if the aberrations are
absent or neglected, the light is distributed in a finite region around the Gaussian image
point due to its diffraction by the system. The diffraction image of a point object is called
the PSF of the system, and the aberration-free image is referred to as the diffraction-
limited image. The image of an extended object is determined by adding the amplitude or
the irradiance images of its small elements, depending on whether the object radiation is
coherent or incoherent.
A system is called isoplanatic for a small enough object if the distribution of light in
the image of any point on it is approximately the same, except for its location in the
image plane. Thus, over a small field of view, the image of a point object is shift
invariant. For an incoherent isoplanatic object, the diffraction image can be obtained by
convolving the Gaussian image (which is an exact replica of the object except for its size
and illumination scaling) with the diffraction PSF. In the spatial frequency domain, the
spectrum of the image is correspondingly given by the product of the OTF, which is the
Fourier transform of the PSF, and the spectrum of the Gaussian image. The image is
obtained by inverse Fourier transforming its spectrum [1]. We define a pupil function,
representing the complex amplitude at the exit pupil, and give equations for obtaining the
PSF and the OTF.
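The equivalence of the space-domain convolution and the frequency-domain product can be checked on a toy example. The following is only a minimal sketch under simplifying assumptions: the object is one-dimensional and sampled, the convolution is circular, and the sample values of `gaussian_image` and `psf` are hypothetical. The sign of the exponent in the discrete transform is a convention and does not affect the result.

```python
import cmath

def dft(x):
    """Naive discrete Fourier transform of a sequence."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * cmath.pi * m * k / n) for k in range(n))
            for m in range(n)]

def idft(spectrum):
    """Naive inverse discrete Fourier transform."""
    n = len(spectrum)
    return [sum(spectrum[m] * cmath.exp(2j * cmath.pi * m * k / n) for m in range(n)) / n
            for k in range(n)]

def circular_convolve(f, g):
    """Circular convolution of two equal-length sequences."""
    n = len(f)
    return [sum(f[j] * g[(k - j) % n] for j in range(n)) for k in range(n)]

# Hypothetical sampled Gaussian image (two point sources) and a toy PSF of unit sum.
gaussian_image = [0, 0, 1, 0, 0, 0, 2, 0]
psf = [0.5, 0.25, 0, 0, 0, 0, 0, 0.25]

# Space domain: convolve the Gaussian image with the PSF.
image_space = circular_convolve(gaussian_image, psf)

# Frequency domain: multiply the two spectra, then transform back.
spectrum = [a * b for a, b in zip(dft(gaussian_image), dft(psf))]
image_freq = [value.real for value in idft(spectrum)]

max_diff = max(abs(a - b) for a, b in zip(image_space, image_freq))
print("largest difference between the two routes:", max_diff)
```

The two routes agree to rounding error, and the total of the image samples equals that of the Gaussian image because the toy PSF sums to unity.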
1.2.1 Pupil Function

Consider a point object located at $\vec{r}_o$ in the object plane radiating at a wavelength $\lambda$.
Its Gaussian image formed by an imaging system determines the amount of light in the
image, depending on the object intensity, and distance from and the size of the entrance
pupil. The wave at the exit pupil of the system is represented by the pupil function
$$
P(\vec{r}_p; \vec{r}_o) =
\begin{cases}
A(\vec{r}_p; \vec{r}_o) \exp\left[ i\Phi(\vec{r}_p; \vec{r}_o) \right], & \text{inside the exit pupil} \\
0, & \text{outside the exit pupil},
\end{cases}
\tag{1-1}
$$
where $\vec{r}_p$ is the 2D position vector of a point in the plane of the pupil, and
$A(\vec{r}_p; \vec{r}_o)$ and $\Phi(\vec{r}_p; \vec{r}_o)$ are the amplitude and phase aberration
functions of the system for the point object under consideration. The phase aberration
$\Phi(\vec{r}_p; \vec{r}_o)$ is related to the wave aberration $W(\vec{r}_p; \vec{r}_o)$ according to
$$
\Phi(\vec{r}_p; \vec{r}_o) = \frac{2\pi}{\lambda} W(\vec{r}_p; \vec{r}_o) . \tag{1-2}
$$
The shape of the pupil is arbitrary. It may, for example, be circular or annular. The total
power in the pupil and, therefore, in the image is given by
$$
P_{ex} = \int \left| P(\vec{r}_p; \vec{r}_o) \right|^2 d\vec{r}_p
= \int A^2(\vec{r}_p; \vec{r}_o) \, d\vec{r}_p , \tag{1-3}
$$
where the integration is across the pupil.

The image lies at a distance $R$ from the plane of the exit pupil, where $R$ is the radius
of curvature of the Gaussian reference sphere with respect to which the aberration
$W(\vec{r}_p; \vec{r}_o)$ is defined. The center of curvature of the reference sphere lies at the Gaussian
image point (unless defocus is introduced). Generally, the amplitude function $A(\vec{r}_p; \vec{r}_o)$
is uniform across the exit pupil. An exception is the Gaussian pupil considered in Chapter
6. We assume a small field of view so that the dependence of the aberration function
$W(\vec{r}_p; \vec{r}_o)$ on the location of the point object in the object plane can be neglected.
1.2.2 PSF
The PSF of the system imaging an incoherent object is given by [1]
$$
\mathrm{PSF}(\vec{r}_i) = \frac{1}{P_{ex} \lambda^2 R^2}
\left| \int P(\vec{r}_p) \exp\left( -\frac{2\pi i}{\lambda R} \, \vec{r}_i \cdot \vec{r}_p \right) d\vec{r}_p \right|^2 , \tag{1-4}
$$
where the position vector $\vec{r}_i$ of the observation point is written with respect to the
location $\vec{r}_g$ of the Gaussian image point, and $P_{ex}$ is the total power in the image. The
irradiance distribution of the image is obtained by multiplying the PSF by the total power
$P_{ex}$ in the image, i.e.,
$$
I(\vec{r}_i) = \frac{1}{\lambda^2 R^2}
\left| \int P(\vec{r}_p) \exp\left( -\frac{2\pi i}{\lambda R} \, \vec{r}_i \cdot \vec{r}_p \right) d\vec{r}_p \right|^2 . \tag{1-5}
$$
For a uniformly illuminated pupil with irradiance $I_0$, the total power incident on and
transmitted by the pupil is given by
$$
P_{ex} = S_{ex} I_0 , \tag{1-6}
$$
where $S_{ex}$ is the area of the exit pupil. Letting $A^2(\vec{r}_p) = I_0$, we may write the irradiance
distribution
$$
I(\vec{r}_i) = \frac{I_0}{\lambda^2 R^2}
\left| \int \exp\left[ i\Phi(\vec{r}_p) \right] \exp\left( -\frac{2\pi i}{\lambda R} \, \vec{r}_i \cdot \vec{r}_p \right) d\vec{r}_p \right|^2 . \tag{1-7}
$$
The aberration-free irradiance at the center is given by
$$
I(0) = \frac{I_0}{\lambda^2 R^2} \left[ \int d\vec{r}_p \right]^2
= \frac{P_{ex} S_{ex}}{\lambda^2 R^2} . \tag{1-8}
$$
The irradiance distribution normalized by its central value may be written
$$
I(\vec{r}_i) = \frac{1}{S_{ex}^2}
\left| \int \exp\left[ i\Phi(\vec{r}_p) \right] \exp\left( -\frac{2\pi i}{\lambda R} \, \vec{r}_i \cdot \vec{r}_p \right) d\vec{r}_p \right|^2 . \tag{1-9}
$$
For convenience, we will refer to the irradiance distribution given by Eq. (1-9) as the
PSF. Letting $\Phi(\vec{r}_p) = 0$, we obtain the aberration-free PSF.
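As an illustration of Eq. (1-9), the sketch below evaluates its one-dimensional analog for a uniformly illuminated slit pupil by direct numerical integration and compares the aberration-free result with the familiar closed form $[\sin(\pi u)/(\pi u)]^2$, where $u = a x / \lambda R$ and $a$ is the slit width. The reduction to one dimension, the function names, and the numerical values of the wavelength, the radius $R$, and the slit width are assumptions made only for this example.

```python
import cmath
import math

def slit_psf(x, width, wavelength, radius, phase=lambda xi: 0.0, n=2000):
    """1D analog of Eq. (1-9): normalized irradiance at image position x.

    The pupil integral is evaluated with the midpoint rule over n subintervals;
    `phase` is the phase aberration as a function of the pupil coordinate xi.
    """
    step = width / n
    total = 0j
    for k in range(n):
        xi = -width / 2 + (k + 0.5) * step
        kernel = cmath.exp(-2j * math.pi * x * xi / (wavelength * radius))
        total += cmath.exp(1j * phase(xi)) * kernel * step
    return abs(total / width) ** 2

def sinc_squared(u):
    """Closed-form aberration-free slit PSF, with u = width * x / (wavelength * radius)."""
    return 1.0 if u == 0 else (math.sin(math.pi * u) / (math.pi * u)) ** 2

# Hypothetical values (meters): 0.5 um wavelength, R = 0.1 m, 10 mm slit.
WAVELENGTH, RADIUS, WIDTH = 0.5e-6, 0.1, 0.01
scale = WAVELENGTH * RADIUS / WIDTH  # position of the first zero of the PSF

for u in (0.0, 0.5, 1.0, 1.5):
    numeric = slit_psf(u * scale, WIDTH, WAVELENGTH, RADIUS)
    print(u, numeric, sinc_squared(u))
```

Passing a nonzero `phase` function, for example a quadratic one for defocus, gives the corresponding aberrated distribution from the same routine.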
1.2.3 OTF
The imaging process can be described in the space domain by way of the PSF, or in
the spatial frequency domain by way of the OTF. The OTF is the Fourier transform of the
PSF, defined as
$$
\tau(\vec{v}_i) = \int \mathrm{PSF}(\vec{r}_i) \exp\left( 2\pi i \, \vec{v}_i \cdot \vec{r}_i \right) d\vec{r}_i , \tag{1-10}
$$
where $\vec{v}_i$ is a spatial frequency vector in the image plane and related to the corresponding
frequency $\vec{v}_o$ in the object plane by the image magnification $M$ according to $\vec{v}_i = \vec{v}_o / M$.
Since the image of an isoplanatic incoherent object is given by the convolution of the PSF
and the Gaussian image, the (spatial frequency) spectrum of the image is given by the
product of the OTF and the spectrum of the Gaussian image. The image is obtained by
inverse Fourier transforming its spectrum.
Because of the relationship of the PSF with the pupil function, as in Eq. (1-4), the
OTF can also be written as the autocorrelation of the pupil function in the form
$$
\tau(\vec{v}_i)
= \frac{\int P(\vec{r}_p) \, P^*(\vec{r}_p - \lambda R \vec{v}_i) \, d\vec{r}_p}{\int \left| P(\vec{r}_p) \right|^2 d\vec{r}_p}
= P_{ex}^{-1} \int A(\vec{r}_p) \, A(\vec{r}_p - \lambda R \vec{v}_i) \exp\left[ iQ(\vec{r}_p; \vec{v}_i) \right] d\vec{r}_p , \tag{1-11}
$$
where an asterisk denotes a complex conjugate and
$$
Q(\vec{r}_p; \vec{v}_i) = \Phi(\vec{r}_p) - \Phi(\vec{r}_p - \lambda R \vec{v}_i) \tag{1-12}
$$
is a phase aberration difference function defined over the region of overlap of two pupils:
one centered at $\vec{r}_p = 0$ and the other at $\vec{r}_p = \lambda R \vec{v}_i$.
From Eq. (1-11), the aberration-free OTF can be written
$$
\tau(\vec{v}_i) = P_{ex}^{-1} \int A(\vec{r}_p) \, A(\vec{r}_p - \lambda R \vec{v}_i) \, d\vec{r}_p . \tag{1-13}
$$
For a uniformly illuminated pupil, the OTF is simply the fractional area of overlap of two
pupils centered at $(0, 0)$ and $\lambda R (\xi, \eta)$, where $(\xi, \eta)$ are the Cartesian components of the
spatial frequency vector $\vec{v}_i$.
The region of overlap is maximum and equal to the area of the pupil for $\vec{v}_i = 0$,
giving a value of unity for $\tau(0)$. It represents the fact that the contrast of an image is zero
for an object of zero contrast. Because of the finite size of the pupil, the overlap region
reduces to zero at some frequency $\vec{v}_c$, called the cutoff frequency, and stays zero for
larger frequencies, i.e., $\tau(\vec{v}_i) = 0$ for $|\vec{v}_i| \ge |\vec{v}_c|$. Because of isoplanatism, the spatial
frequency spectrum of the image is obtained as the product of the spectrum of the
Gaussian image and the OTF. Inverse Fourier transforming the image spectrum yields the
space domain image.
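The overlap-area interpretation of the aberration-free OTF can be tried numerically. The sketch below assumes a uniformly illuminated circular pupil of unit radius, counts grid points lying in both the pupil and its displaced copy, and compares the resulting fraction with the standard closed form $(2/\pi)\left[\cos^{-1} v - v\sqrt{1 - v^2}\right]$, where $v$ is the spatial frequency normalized by the cutoff frequency (the displacement divided by the pupil diameter). The closed form is quoted here as a known result for the circular pupil and is not derived in this section; the function names and grid size are choices made for the example.

```python
import math

def overlap_fraction(shift, radius=1.0, n=400):
    """Fraction of a circular pupil's area overlapped by a copy displaced by `shift` along x."""
    h = 2 * radius / n
    inside = both = 0
    for i in range(n):
        x = -radius + (i + 0.5) * h
        for j in range(n):
            y = -radius + (j + 0.5) * h
            if x * x + y * y <= radius * radius:
                inside += 1
                # The displaced pupil is centered at (shift, 0).
                if (x - shift) ** 2 + y * y <= radius * radius:
                    both += 1
    return both / inside

def circle_otf(v):
    """Closed-form aberration-free OTF of a circular pupil; v is normalized by the cutoff."""
    if v >= 1:
        return 0.0
    return (2 / math.pi) * (math.acos(v) - v * math.sqrt(1 - v * v))

for v in (0.25, 0.5, 0.75):
    shift = 2.0 * v  # displacement lambda*R*v_i in units of the pupil radius
    print(v, overlap_fraction(shift), circle_otf(v))
```

The grid count converges slowly near the pupil edge, so agreement with the closed form to about two or three decimal places is all that should be expected at this grid size.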
From Eq. (1-10), we note that
$$
\tau(\vec{v}_i) = \tau^*(-\vec{v}_i) , \tag{1-14}
$$
i.e., the OTF is complex symmetric or Hermitian. Therefore, its real part is even and its
imaginary part is odd, i.e.,
$$
\mathrm{Re}\, \tau(\vec{v}_i) = \mathrm{Re}\, \tau(-\vec{v}_i) \tag{1-15}
$$
and
$$
\mathrm{Im}\, \tau(\vec{v}_i) = -\,\mathrm{Im}\, \tau(-\vec{v}_i) . \tag{1-16}
$$
The OTF can also be written in the form
$$
\tau(\vec{v}_i) = \left| \tau(\vec{v}_i) \right| \exp\left[ i\Psi(\vec{v}_i) \right] , \tag{1-17}
$$
where $|\tau(\vec{v}_i)|$ and $\Psi(\vec{v}_i)$ are its modulus and phase, called the modulation and phase
transfer functions (MTF and PTF), respectively. Depending on the shape of the pupil and
the type of the aberration, the OTF may be real. A phase of $\pi$ is sometimes associated
with a negative value of the MTF. It represents contrast reversal, i.e., bright and dark
regions in the object appear as dark and bright regions in the image.
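The Hermitian property of Eq. (1-14) and the split into MTF and PTF of Eq. (1-17) can be illustrated with a discrete one-dimensional analog of Eq. (1-10). This is only a sketch: the sampled `psf` values are hypothetical, the transform is a plain sum over samples with the same sign of the exponent as Eq. (1-10), and the result is normalized so that the zero-frequency value is unity.

```python
import cmath

def otf_1d(psf):
    """Discrete 1D analog of Eq. (1-10), normalized to unity at zero frequency."""
    n = len(psf)
    total = sum(psf)
    return [sum(p * cmath.exp(2j * cmath.pi * m * k / n) for k, p in enumerate(psf)) / total
            for m in range(n)]

# Hypothetical real, non-negative, asymmetric PSF samples (asymmetric as for coma).
psf = [4.0, 3.0, 1.0, 0.0, 0.0, 0.0, 0.5, 1.5]
tau = otf_1d(psf)
n = len(tau)

# Index (-m) % n holds the sample at the negative of frequency m.
hermitian_error = max(abs(tau[m] - tau[(-m) % n].conjugate()) for m in range(n))

mtf = [abs(t) for t in tau]          # modulus, Eq. (1-17)
ptf = [cmath.phase(t) for t in tau]  # phase in radians, Eq. (1-17)

print("Hermitian error:", hermitian_error)
print("MTF:", [round(m, 4) for m in mtf])
```

Because the PSF samples are real, the Hermitian error is at rounding level; because they are also non-negative, the MTF never exceeds its zero-frequency value of unity.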
By inverse Fourier transforming Eq. (1-10), we can obtain the PSF according to
$$
\mathrm{PSF}(\vec{r}_i) = \int \tau(\vec{v}_i) \exp\left( -2\pi i \, \vec{v}_i \cdot \vec{r}_i \right) d\vec{v}_i . \tag{1-18}
$$
For a radially symmetric pupil with a radially symmetric aberration, e.g., a circular
pupil aberrated by spherical aberration, the PSF and OTF Eqs. (1-18) and (1-10) yield
$$
\mathrm{PSF}(r_i) = 2\pi \int \tau(v_i) \, J_0(2\pi v_i r_i) \, v_i \, dv_i \tag{1-19}
$$
and
$$
\tau(v_i) = 2\pi \int \mathrm{PSF}(r_i) \, J_0(2\pi v_i r_i) \, r_i \, dr_i , \tag{1-20}
$$
respectively, where $J_0(\cdot)$ is the zeroth-order Bessel function of the first kind. The OTF is
evidently real in this case.
1.3 STREHL RATIO

1.3.1 General Expression
The Strehl ratio of an image represents the ratio of its central irradiances with and
without aberration. From Eq. (1-5), the ratio of the central irradiance with aberration to
that at the Gaussian image point without aberration may be written [1]
$$
S = \frac{I_a(0)}{I_u(0)} , \tag{1-21}
$$
where the subscripts $a$ and $u$ refer to an aberrated and an unaberrated system,
respectively, and $S$ is the Strehl ratio given by
$$
S = \frac{\left| \int A(\vec{r}_p) \exp\left[ i\Phi(\vec{r}_p) \right] d\vec{r}_p \right|^2}{\left[ \int A(\vec{r}_p) \, d\vec{r}_p \right]^2} . \tag{1-22}
$$
It can be shown that [1]
$$
0 \le S \le 1 . \tag{1-23}
$$
The Strehl ratio may also be determined from the OTF of the system. By definition,
$$
S = \mathrm{PSF}_a(0) / \mathrm{PSF}_u(0) . \tag{1-24}
$$
Letting $\vec{r}_i = 0$ in Eq. (1-18), we may write
$$
\mathrm{PSF}(0) = \int \tau(\vec{v}_i) \, d\vec{v}_i . \tag{1-25}
$$
Since the PSF at any point is a real quantity, only the real part of the aberrated OTF
contributes to the integral, and the integral of its imaginary part must be zero. Hence, the
Strehl ratio is given by
$$
S = \int \mathrm{Re}\, \tau_a(\vec{v}) \, d\vec{v} \Big/ \int \tau_u(\vec{v}) \, d\vec{v} . \tag{1-26}
$$
Thus, the Strehl ratio may be obtained by integrating the real part of the measured
aberrated OTF over all spatial frequencies and dividing it by a similar integral of the
calculated unaberrated OTF.
The Strehl ratio gives a measure of the image quality in terms of the reduction in the
central irradiance due to the aberration in the system, including any defocus. Its value
being less than one is a consequence of the fact that the Huygens’ secondary spherical
wavelets on the reference sphere are not in phase due to the aberrations and, therefore,
they interfere nonconstructively at its center of curvature.
It can be shown that, for a given total power, the amplitude variations across the
pupil of an aberration-free system reduce the central irradiance, and any phase variations
(i.e., aberrations) further reduce it [2]. However, an irradiance reduced by phase
variations alone does not necessarily reduce any further if any amplitude variations are
also introduced. In fact, the amplitude variations can even increase this irradiance. For
example, the central value of a defocused PSF for a circular pupil decreases to zero as the
defocus aberration approaches one wave (see Section 4.4). The Huygens’ secondary
wavelets arriving at this point completely cancel each other. Hence, any amplitude
variations across the pupil will only help avoid complete cancellation and thereby
increase the central value. The maximum value of central irradiance is obtained when the
system is unapodized and unaberrated [1,2]. It is shown in Chapter 6 how a Gaussian
pupil, as in a Gaussian beam, yields a smaller central value.
The peak value of the aberrated irradiance distribution of the image of a point object
does not necessarily occur at the center of the reference sphere. However, the peak value
of an unaberrated image does occur at the center regardless of the apodization. The
Huygens’ secondary wavelets emanating from the spherical wavefront being equidistant
from this point are in phase. Hence, they interfere constructively, producing a maximum
possible value at this point.
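The statements about defocus can be illustrated by evaluating Eq. (1-22) numerically for a circular pupil. The sketch below assumes a phase aberration of the form $B_d \rho^2$, where $\rho$ is the normalized radial coordinate, and an amplitude $\exp(-\gamma \rho^2)$, with $\gamma = 0$ for a uniform pupil; the function name, the Gaussian amplitude profile, and the value $\gamma = 1$ are choices made for this example. For the uniform pupil the integral has the closed form $[\sin(B_d/2)/(B_d/2)]^2$, which vanishes for one wave of defocus, $B_d = 2\pi$.

```python
import cmath
import math

def strehl_defocus(b_d, gamma=0.0, n=4000):
    """Eq. (1-22) for a circular pupil with phase b_d * rho**2 and amplitude exp(-gamma * rho**2).

    The radial integrals use the midpoint rule; the factor 2*pi from the
    angular integration cancels in the ratio and is omitted.
    """
    step = 1.0 / n
    numerator = 0j
    denominator = 0.0
    for k in range(n):
        rho = (k + 0.5) * step
        amplitude = math.exp(-gamma * rho * rho)
        weight = rho * step  # radial area element
        numerator += amplitude * cmath.exp(1j * b_d * rho * rho) * weight
        denominator += amplitude * weight
    return abs(numerator) ** 2 / denominator ** 2

one_wave = 2 * math.pi
print("uniform pupil, half wave:", strehl_defocus(math.pi))
print("uniform pupil, one wave: ", strehl_defocus(one_wave))
print("Gaussian pupil, one wave:", strehl_defocus(one_wave, gamma=1.0))
```

With the uniform pupil the central value at one wave of defocus is zero to numerical accuracy, while the Gaussian amplitude weighting leaves a small nonzero value, consistent with the remark that amplitude variations avoid complete cancellation.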
1.3.2 Approximate Expressions in Terms of Aberration Variance

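Equation (1-22) for the Strehl ratio can be written in an abbreviated form
$$
S = \left| \langle \exp(i\Phi) \rangle \right|^2 , \tag{1-27}
$$
where the angular brackets $\langle \cdots \rangle$ indicate a spatial average over the amplitude-weighted
pupil, e.g.,
$$
\langle \Phi \rangle = \frac{\int A(\vec{r}_p) \, \Phi(\vec{r}_p) \, d\vec{r}_p}{\int A(\vec{r}_p) \, d\vec{r}_p} . \tag{1-28}
$$
Since $\langle \Phi \rangle$ is independent of $\vec{r}_p$, Eq. (1-27) can be written
$$
\begin{aligned}
S &= \left| \langle \exp\left[ i(\Phi - \langle \Phi \rangle) \right] \rangle \right|^2 \\
  &= \langle \cos(\Phi - \langle \Phi \rangle) \rangle^2 + \langle \sin(\Phi - \langle \Phi \rangle) \rangle^2 \\
  &\ge \langle \cos(\Phi - \langle \Phi \rangle) \rangle^2 ,
\end{aligned}
\tag{1-29}
$$
equality holding when $\Phi$ is zero across the pupil, in which case $S = 1$. For small
aberrations, expanding the cosine function in a power series and retaining the first two
terms, we obtain the Maréchal result generalized for an apodized pupil
$$
S \gtrsim \left( 1 - \sigma_\Phi^2 / 2 \right)^2 , \tag{1-30}
$$
where
$$
\sigma_\Phi^2 = \langle (\Phi - \langle \Phi \rangle)^2 \rangle \tag{1-31}
$$
is the variance of the phase aberration across the amplitude-weighted pupil. The quantity
$\sigma_\Phi$ is the standard deviation of the aberration. We will refer to it as the “sigma value” or
simply the “sigma” of the aberration.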
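The step from the inequality of Eq. (1-29) to the Maréchal result of Eq. (1-30) can be written out as follows. This is a short sketch of the power-series argument, using only the definition of the variance in Eq. (1-31) and the fact that the average of a constant is the constant itself; terms of fourth and higher order in the cosine expansion are dropped.

```latex
\begin{align*}
\langle \cos(\Phi - \langle\Phi\rangle) \rangle
  &\simeq \left\langle 1 - \tfrac{1}{2}\,(\Phi - \langle\Phi\rangle)^2 \right\rangle
   = 1 - \tfrac{1}{2}\,\langle (\Phi - \langle\Phi\rangle)^2 \rangle
   = 1 - \tfrac{1}{2}\,\sigma_\Phi^2 , \\
S &\ge \langle \cos(\Phi - \langle\Phi\rangle) \rangle^2
   \simeq \left( 1 - \tfrac{1}{2}\,\sigma_\Phi^2 \right)^2
   = 1 - \sigma_\Phi^2 + \tfrac{1}{4}\,\sigma_\Phi^4 .
\end{align*}
```

Only the variance of the aberration enters at this order, which is why the result does not depend on the type of the aberration.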
For small values of $\sigma_\Phi$, three approximate expressions have been used in the
literature:
$$
S_1 \simeq \left( 1 - \sigma_\Phi^2 / 2 \right)^2 , \tag{1-32}
$$
$$
S_2 \simeq 1 - \sigma_\Phi^2 , \tag{1-33}
$$
and
$$
S_3 \simeq \exp\left( -\sigma_\Phi^2 \right) . \tag{1-34}
$$
The first is the Maréchal formula [3], the second is the commonly used expression
obtained when the term in $\sigma_\Phi^4$ in the first is neglected [4,5], and the third is an empirical
expression giving a better fit to the actual numerical results for various aberrations [6]. Just
as $S_1 > S_2$ by $\sigma_\Phi^4 / 4$, similarly, $S_3 > S_1$ by approximately the same amount. The simplest
expression to use is, of course, $S_2$, according to which $\sigma_\Phi^2$ gives the drop in the Strehl
ratio. We note that, for a pupil of any shape, the Strehl ratio for a small aberration does
not depend on its type but only on its variance across the apodized pupil. For a high-
quality imaging system, a typical value of the Strehl ratio desired is 0.8, corresponding to
a wave aberration with a sigma of $\sigma_w = \lambda / 14$, where $\sigma_w = (\lambda / 2\pi) \, \sigma_\Phi$.
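The three expressions are easily compared for the quoted case of $\sigma_w = \lambda/14$. The short sketch below converts the sigma in waves to the phase sigma in radians and evaluates Eqs. (1-32) to (1-34); the function name and its argument are hypothetical and are used only for this illustration.

```python
import math

def strehl_approximations(sigma_w_in_waves):
    """Return (S1, S2, S3) of Eqs. (1-32)-(1-34) for a wave-aberration sigma given in waves."""
    sigma_phi = 2 * math.pi * sigma_w_in_waves  # phase sigma in radians
    variance = sigma_phi ** 2
    s1 = (1 - variance / 2) ** 2   # Marechal formula
    s2 = 1 - variance              # fourth-order term neglected
    s3 = math.exp(-variance)       # empirical expression
    return s1, s2, s3

s1, s2, s3 = strehl_approximations(1 / 14)
variance = (2 * math.pi / 14) ** 2
print("S1, S2, S3:", round(s1, 4), round(s2, 4), round(s3, 4))
print("S1 - S2:", s1 - s2, " sigma^4 / 4:", variance ** 2 / 4)
print("S3 - S1:", s3 - s1)
```

All three values lie close to 0.8, with $S_2 < S_1 < S_3$, and the two differences are both about $\sigma_\Phi^4/4 \approx 0.01$.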
1.4 ABERRATION BALANCING
In geometrical optics, we mix one aberration with another in order to minimize the
variance of the ray distribution in an image plane. For example, when we minimize the
variance by combining the primary spherical aberration with defocus aberration by
considering the ray distribution in a defocused image plane, the smallest spot, called the
circle of least confusion, has a radius that is 1/4 of its value in the Gaussian image plane
[7]. Similarly, when astigmatism is combined with defocus, the circle of least confusion
has a diameter equal to half the length of the line image in the Gaussian image plane. In
the case of coma, the ray distribution is asymmetric about the Gaussian image point and,
therefore, its centroid does not lie at this point. The centroid shift is equivalent to
introducing a wavefront tilt, or balancing coma with tilt.
Based on diffraction, the best image for small aberrations is the one for which the
variance of the wave aberration is minimum so that its Strehl ratio is maximum. Since the
value of variance depends on the shape of and the amplitude across the pupil, the value of
the balancing aberration also depends on those factors. Thus, for example, the value of
defocus for balancing spherical aberration for an annular pupil is different than that for a
circular pupil. Similarly, its value for a Gaussian circular pupil, as in the case of a circular
Gaussian beam, is different than that for a uniform circular pupil. The process of
balancing a higher-order aberration with one or more aberrations of the same and/or
lower orders to minimize the variance is called aberration balancing. Thus, for example,
secondary spherical aberration is balanced with primary spherical aberration and defocus,
and secondary coma is balanced with primary coma and tilt.
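Balancing based on variance can be illustrated for primary spherical aberration and defocus across a uniformly illuminated circular pupil. The sketch below takes the aberration as $\rho^4 + B_d \rho^2$ with a unit spherical-aberration coefficient, computes its variance over the pupil numerically, and scans $B_d$ for the minimum. The function names, the scanned range, and the step of 0.01 are choices made for this example; the pupil average of a radially symmetric function $f(\rho)$ is taken as $2\int_0^1 f(\rho)\,\rho\,d\rho$.

```python
import math

def pupil_average(f, n=2000):
    """Average of a radially symmetric f(rho) over a uniform unit circular pupil (midpoint rule)."""
    step = 1.0 / n
    return sum(f((k + 0.5) * step) * 2 * (k + 0.5) * step * step for k in range(n))

def variance(b_d):
    """Variance of rho**4 + b_d * rho**2 over the unit circular pupil."""
    def aberration(rho):
        return rho ** 4 + b_d * rho ** 2
    mean = pupil_average(aberration)
    return pupil_average(lambda rho: aberration(rho) ** 2) - mean ** 2

# Scan the defocus coefficient from -2 to 0 in steps of 0.01 and keep the minimum.
candidates = [-2 + 0.01 * k for k in range(201)]
best = min(candidates, key=variance)

sigma_unbalanced = math.sqrt(variance(0.0))
sigma_balanced = math.sqrt(variance(best))
print("balancing defocus coefficient:", round(best, 2))
print("sigma without and with balancing:", sigma_unbalanced, sigma_balanced)
print("reduction factor:", sigma_unbalanced / sigma_balanced)
```

In this sketch the minimum occurs for an equal and opposite amount of defocus, $B_d = -1$, for which the variance drops from $4/45$ to $1/180$ and the sigma is reduced by a factor of 4. Replacing the uniform weighting in `pupil_average` by an annular or Gaussian one would change the optimum, as noted in the text.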
The balanced aberrations for a system with a certain shape of the pupil form the basis
of determining the orthogonal polynomial aberrations for the analysis of wavefronts
across the given pupil. The Zernike circle polynomials, for example, are the orthogonal
polynomial aberrations for a system with a circular pupil that represent the balanced
classical aberrations for such a system.
1.5 SUMMARY
The diffraction image of an isoplanatic incoherent object is given by the convolution
of its Gaussian image and the PSF. In the spatial frequency domain, the spectrum of the
image is given by the product of the OTF and the spectrum of the Gaussian image. The
image is obtained by inverse Fourier transforming its spectrum.
For a system with a uniformly illuminated pupil, the aberration-free central
irradiance is given by $P_{ex} S_{ex} / \lambda^2 R^2$, independent of the shape of the pupil [see Eq. (1-8)].
The aberrations of a system are neglected in Gaussian optics when determining the
location and the size of an image formed by the system. The aberration-free OTF of a
system with a uniformly illuminated pupil is simply equal to the fractional area of overlap
of two pupils whose separation depends on the spatial frequency vector.
The aberrations of a system determine the quality of an image actually observed in
practice. An important measure of this quality is the Strehl ratio [see Eq. (1-21)], which
represents the ratio of the central irradiances of the image of a point object with and
without aberration. The Strehl ratio can also be obtained by integrating the real part of the
OTF of a system [see Eq. (1-26)]. For small aberrations, the Strehl ratio is determined by
the variance of the aberration according to, for example, Eq. (1-34), and it is independent
of the type of an aberration. The peak value of a PSF does not necessarily lie at its center,
as, for example, in the case of coma. For an apodized pupil, the aberration variance is
calculated over the amplitude-weighted pupil. A Strehl ratio of 0.8 is obtained when the
standard deviation $\sigma_w$ of the wave aberration is approximately $\lambda / 14$.
The variance of an aberration of a certain order can be reduced by mixing it with one
or more aberrations of lower order, thereby improving the Strehl ratio. The process of
mixing one aberration with others in this manner is called aberration balancing. The
polynomial aberrations used for wavefront analysis are not only orthogonal across the
pupil of a system, but also represent balanced classical aberrations for it.
References
1. V. N. Mahajan, Optical Imaging and Aberrations, Part II: Wave Diffraction
Optics, 2nd ed. (SPIE Press, Bellingham, WA, 2011).
2. V. N. Mahajan, “Luneburg apodization problem I,” Opt. Lett. 5, 267–269 (1980).
3. A. Maréchal, “Etude des effets combines de la diffraction et des aberrations
geometriques sur l'image d'un point lumineux,” Revue d'Optique 26, 257–277
(1947).
4. B. R. A. Nijboer, Thesis: “The Diffraction Theory of Aberrations,” University of
Groningen, The Netherlands (1942).
5. B. R. A. Nijboer, “The diffraction theory of optical aberrations. Part II:
Diffraction pattern in the presence of small aberrations,” Physica 13, 605–620
(1947).
6. V. N. Mahajan, “Strehl ratio for primary aberrations in terms of their aberration
variance,” J. Opt. Soc. Am. 73, 860–861 (1983).
7. V. N. Mahajan, Optical Imaging and Aberrations, Part I: Ray Geometrical Optics
(SPIE Press, Bellingham, WA, Second Printing 2001).
CHAPTER 2
OPTICAL WAVEFRONTS AND THEIR ABERRATIONS
2.1 Introduction
2.2 Optical Imaging
2.3 Wave and Ray Aberrations
2.4 Defocus Aberration
2.5 Wavefront Tilt
2.6 Aberration Function of a Rotationally Symmetric System
2.7 Observation of Aberrations: Interferograms
2.8 Summary
References
Chapter 2
Optical Wavefronts and Their Aberrations
2.1 INTRODUCTION
The position and the size of the Gaussian image of an object formed by an optical
imaging system are determined by using its Gaussian imaging equations. We have stated in
Chapter 1 that the quality of the diffraction image depends on the aberrations of the
system. A spherical wave originating at a point object is incident on the system. The
image formed by the system is aberration free and perfect if the wave exiting from the
system is also spherical. In this case, the rays originating at the point object and traced
through the system all pass through the Gaussian image point.
If the optical wavefront exiting from the exit pupil is not spherical, its optical
deviations from a spherical form represent its wave aberrations. These wave aberrations
play a fundamental role in determining the quality of the aberrated image. The rays traced
from the object point through the system, instead of passing through the Gaussian image
point, intersect the image plane in its vicinity. The distance of the point of intersection of
a ray in the image plane from the Gaussian image point is called the transverse ray
aberration, and the distribution of the rays is referred to as the spot diagram. In this
chapter, we define the wave and ray aberrations and give a relationship between them.
We relate the longitudinal defocus of an image to the defocus wave aberration, and its
wavefront tilt to the wavefront tilt aberration. Next, the possible aberrations of an
imaging system that is rotationally symmetric about its optical axis are described. The
aberration function of the system is expanded in a power series of the object and pupil
coordinates, and primary (or Seidel), secondary (or Schwarzschild), and tertiary
aberrations are introduced [1]. We also discuss briefly how the aberrations may be
observed using a Twyman–Green interferometer and what the fringe pattern of a primary
or Seidel aberration looks like. A short summary of the chapter is given at the end.
2.2 OPTICAL IMAGING

An optical imaging system consists of a series of refracting and/or reflecting
surfaces. The surfaces refract or reflect light rays from an object to form its image. The
image obtained according to geometrical optics in the Gaussian approximation, i.e.,
according to Snell's law in which the sines of the angles are replaced by the angles, is
called the Gaussian image. The Gaussian approximation and the Gaussian image are
often referred to as the paraxial approximation and the paraxial image, respectively. We
assume that the surfaces are rotationally symmetric about a common axis called the
optical axis (OA). Figure 2-1 illustrates the imaging of an on-axis point object P0 and an
off-axis point object P, respectively, by an optical system consisting of two thin lenses.
P′ and P0′ are the corresponding Gaussian image points. An object and its image are
called conjugates of each other, i.e., if one of the two conjugates is an object, the other is
its image. The location and size of the image of an extended object are determined by
using its Gaussian imaging equations.
Figure 2-1. (a) Imaging of an on-axis point object P0 by an optical imaging system
consisting of two lenses L1 and L2. OA is the optical axis. The Gaussian image is at
P0′. AS is the aperture stop; its image by L1 is the entrance pupil EnP, and its image
by L2 is the exit pupil ExP. CR0 is the axial chief ray, and MR0 is the axial marginal
ray. (b) Imaging of an off-axis point object P. The Gaussian image is at P′. CR is the
off-axis chief ray, and MR is the off-axis marginal ray.
An aperture in the system that physically limits the solid angle of the rays from a
point object the most is called the aperture stop (AS). For an extended (i.e., a nonpoint)
object, it is customary to consider the aperture stop as the limiting aperture for the axial
point object, and to determine vignetting, or blocking of some rays, by this stop for off-
axis object points. The object is assumed to be placed to the left of the system so that
light initially travels from left to right. The image of the stop by surfaces that precede it in
the sense of light propagation, i.e., by surfaces that lie between it and the object, is called
the entrance pupil (EnP). When observed from the object side, the entrance pupil appears
to limit the rays entering the system to form the image of the object. Similarly, the image
of the aperture stop by surfaces that follow it, i.e., by surfaces that lie between it and the
image, is called the exit pupil (ExP). The object rays reaching its image appear to be
limited by the exit pupil. Since the entrance and exit pupils are images of the stop by the
surfaces that precede and follow it, respectively, the two pupils are conjugates of each
other for the whole system, i.e., if one pupil is considered as the object, the other is its
image formed by the system.
An object ray passing through the center of the aperture stop and appearing to pass
through the centers of the entrance and exit pupils is called the chief (or the principal) ray
(CR). An object ray passing through the edge of the aperture stop is called a marginal ray
(MR). The rays lying between the center and the edge of the aperture, and, therefore,
appearing to lie between the center and edge of the entrance and exit pupils, are called
zonal rays.
It is possible that the stop of a system may also be its entrance and/or exit pupil. For
example, a stop placed to the left of a lens is also its entrance pupil. Similarly, a stop
placed to the right of a lens is also its exit pupil. Finally, a stop placed at a single thin lens
is both its entrance and exit pupils.
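As a rough illustration of how a pupil is located as the Gaussian image of the stop, the following sketch images a stop through a single thin lens. It assumes the thin-lens Gaussian imaging equation 1/s′ − 1/s = 1/f with a Cartesian sign convention (distances to the left of the lens are negative), which is not stated in this section; the function name and the numbers are hypothetical.

```python
def image_of_stop(s, f):
    """Gaussian image of a stop by a thin lens (assumed: 1/s' - 1/s = 1/f).

    s : stop distance from the lens (negative if the stop is to its left)
    f : focal length of the lens
    Returns (s_prime, magnification); a stop at the lens (s = 0) is its own image.
    """
    if s == 0:
        return 0.0, 1.0
    inv = 1.0 / f + 1.0 / s
    if inv == 0:
        # Stop in the front focal plane: its image (the pupil) lies at infinity.
        return float("inf"), float("inf")
    s_prime = 1.0 / inv
    return s_prime, s_prime / s


# Hypothetical example: stop 50 mm to the left of a lens with f = 100 mm.
# The stop itself is the entrance pupil; its image by the lens is the exit pupil.
exp_position, exp_magnification = image_of_stop(-50.0, 100.0)
```

Under these assumptions the exit pupil is virtual: it lies to the left of the lens and is larger than the stop.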
2.3 WAVE AND RAY ABERRATIONS

Consider an optical system imaging a point object P, as illustrated in Figure 2-2. The
object radiates a spherical wave. For perfect imaging, the diverging spherical wave
incident on the system is converted by it into a spherical wave converging to the Gaussian
image point P′. Generally, the wave exiting from real systems is only approximately
spherical.
The optical path length of a ray in a medium of refractive index n is equal to n times
its geometrical path length. Consider rays from a point object traced through the system
up to the exit pupil such that each one travels exactly the same optical path length. The
ray passing through the center of the pupil is called the chief ray, and represents the
reference ray with respect to which the optical path lengths of the other rays are
compared. The surface passing through the end points of the rays is called the system
wavefront, and it represents a surface of constant phase for the point object under
consideration. If the wavefront is spherical, with its center of curvature at the Gaussian
image point, we say that the image is perfect. The rays transmitted by the system have
equal optical lengths in propagating from P to P′, and they all pass through P′. If,
however, the actual wavefront deviates from this spherical wavefront, called the
Gaussian reference sphere, we say that the image is aberrated. The rays reaching the
Gaussian reference sphere do not travel the same optical path length, and they intersect
the Gaussian image plane in the vicinity of P′. The optical deviations (i.e., the
geometrical deviations times the refractive index $n_i$ of the image space) of the wavefront
from a Gaussian reference sphere are called wave aberrations. The wave aberration of a
ray at a point on the reference sphere where the ray meets it is equal to the optical
deviation of the wavefront along that ray from the Gaussian reference sphere. It
represents the difference between the optical path lengths of the ray under consideration
and the chief ray in traveling from the point object to the reference sphere. Accordingly,
the wave aberration associated with the chief ray is zero. Since the optical path lengths of
the rays from the reference sphere to the Gaussian image point are equal, the wave
aberration of a ray is also equal to the difference between its optical path length from the
point object P to the Gaussian image point P′ and that of the chief ray.
Figure 2-2. Perfect imaging of a point object P by an optical system at its Gaussian
image point P′.
The wave aberration of a ray is positive if it has to travel an extra optical path length,
compared to the chief ray, in order to reach the Gaussian reference sphere. Figures 2-3a
and 2-3b illustrate the reference sphere S and the aberrated wavefront W for on-axis and
off-axis point objects, respectively. The reference sphere, which is centered at the
Gaussian image point P0′ in Figure 2-3a or P′ in Figure 2-3b, and the wavefront pass
through the center O of the exit pupil. The wave aberration $n_i\bar{Q}Q$ of a general ray GR0
or GR, where $n_i$ is the refractive index of the image space, as shown in the figures, is
numerically positive. The coordinate system is also illustrated in these figures. We choose
a right-hand Cartesian coordinate system such that the optical axis lies along the z axis.
The object, entrance pupil, exit pupil, and Gaussian image lie in mutually parallel planes
that are perpendicular to this axis. Figure 2-4 illustrates the coordinate systems in the
object, exit pupil, and image planes. The origin of the coordinate system lies at O and the
Gaussian image plane lies at a distance $z_g$ from it along the z axis.
We assume that a point object such as P lies along the x axis. (There is no loss of
generality because of this since the system is rotationally symmetric about the optical
axis.) The zx plane containing the optical axis and the point object is called the
tangential or the meridional plane. The corresponding Gaussian image point P′ lying in
the Gaussian image plane along its x axis also lies in the tangential plane. This may be
seen by consideration of a tangential object ray and Snell’s law, according to which the
incident and the refracted (or reflected) rays at a surface lie in the same plane. The chief
ray always lies in the tangential plane. The plane normal to the tangential plane but
containing the chief ray is called the sagittal plane. As the chief ray bends when it is
refracted or reflected at an optical surface, so does the sagittal plane. It should be evident
that only the chief ray lies in both the tangential and sagittal planes, because it lies along
the line of intersection of these two planes.
Figure 2-3a. Aberrated wavefront for an on-axis point object. The reference sphere
S of radius of curvature R is centered at the Gaussian image point P0′. The
wavefront W and reference sphere pass through the center O of the exit pupil ExP.
A right-hand Cartesian coordinate system showing x, y, and z axes is illustrated,
where the z axis is along the optical axis OA of the imaging system. Angular
rotations α, β, and γ about the three axes are also indicated. CR0 is the chief ray,
and a general ray GR0 is shown intersecting the Gaussian image plane at P0″.
Figure 2-3b. Aberrated wavefront for an off-axis point object. The reference sphere
S of radius of curvature R is centered at the Gaussian image point P′. The value of
R in this figure is slightly larger than its value in Figure 2-3a. GR is a general ray
intersecting the Gaussian image plane at the point P″. By definition, the chief ray
(not shown) passes through O, but it may or may not pass through P′.
Figure 2-4. Right-hand coordinate system in object, exit pupil, and image planes.
The optical axis of the system is along the z axis, and the off-axis point object P is
assumed to be along the x axis, thus making the zx plane the tangential plane.
Consider an image ray such as GR in Figure 2-3b passing through a point Q with
coordinates (x, y, z) on the reference sphere of radius of curvature R centered at the image
point. We let W(x, y) represent its wave aberration $n\bar{Q}Q$, because z is related to x and y
by virtue of Q being on the reference sphere. It can be shown that the ray intersects the
Gaussian image plane at a point P″ whose coordinates with respect to the Gaussian
image point P′ are approximately given by [1,2]
$$(x_i, y_i) = \frac{R}{n}\left(\frac{\partial W}{\partial x}, \frac{\partial W}{\partial y}\right), \tag{2-1}$$
where $(x_i, y_i)$ represent the coordinates of P″ with respect to those of the Gaussian
image point P′. For systems with narrow fields of view, P′ lies close to P0′, and we may
replace R with $z_g$. Note that in the case of an axial point object, $R = z_g$. [Equation (2-1)
has been derived by Mahajan [1], Born and Wolf [2], and Welford [3]. Note, however,
that Welford uses a sign convention for the wave aberration that is opposite to ours.]
The displacement P0′P0″ in Figure 2-3a (or P′P″ in Figure 2-3b) of a ray from the
Gaussian image point is called its geometrical or transverse ray aberration, and its
coordinates $(x_i, y_i)$ in the Gaussian image plane relative to the Gaussian image point are
called its ray aberration components. Since a ray is normal to a wavefront, the ray
aberration depends on the shape of the wavefront and, therefore, on its geometrical path
difference from the reference sphere. The division of W by n in Eq. (2-1) converts the
optical path length difference into geometrical path length difference. When an image is
formed in free space, as is often the case in practice, then n = 1. The angle δ ≃ P0′P0″/R
between the ideal ray QP0′ and the actual ray QP0″ is called the angular ray aberration.
The distribution of rays in an image plane is called the ray spot diagram.
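The relation of Eq. (2-1) between the wave and ray aberrations can be checked numerically. The sketch below is only an illustration under stated assumptions: it approximates the partial derivatives of W by central differences, and the aberration function, the radius R, and the function name are hypothetical choices, not values taken from the text.

```python
import math


def ray_aberration(W, x, y, R, n=1.0, step=1e-6):
    """Approximate (x_i, y_i) = (R/n) (dW/dx, dW/dy) of Eq. (2-1) by central differences."""
    dWdx = (W(x + step, y) - W(x - step, y)) / (2.0 * step)
    dWdy = (W(x, y + step) - W(x, y - step)) / (2.0 * step)
    return (R / n) * dWdx, (R / n) * dWdy


# Hypothetical example: a defocus-like aberration W = A (x^2 + y^2), lengths in mm.
A = 1.0e-5   # 1/mm
R = 100.0    # radius of curvature of the reference sphere, mm


def W(x, y):
    return A * (x * x + y * y)


# Ray through the pupil point (5 mm, 0) in image space with n = 1.
xi, yi = ray_aberration(W, 5.0, 0.0, R)

# Angular ray aberration, delta ~ (transverse ray aberration) / R.
delta = math.hypot(xi, yi) / R
```

For this choice of W, the analytic result is $x_i = 2ARx = 0.01$ mm and $y_i = 0$, which the finite-difference estimate reproduces to within rounding error.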
We will refer to the aberration W(x, y) as the wave aberration at a projected point
Q(x, y) in the plane of the exit pupil. If (r, θ) are the polar coordinates of this point, as
illustrated in Figure 2-5, they are related to its rectangular coordinates (x, y) according to
$$(x, y) = r(\cos\theta, \sin\theta). \tag{2-2}$$
Note that the tangential rays, i.e., those lying in the zx plane, lie along the x axis of the
exit pupil plane and thus correspond to θ = 0 or π. Similarly, the sagittal rays, i.e., those
lying in a plane orthogonal to the tangential plane but containing the chief ray, lie along
the y axis of the exit pupil plane and thus correspond to θ = π/2 or 3π/2.
Figure 2-5. Circular exit pupil of radius a of an imaging system, and Cartesian and
polar coordinates (x, y) and (r, θ), respectively, of a point Q on the pupil.
2.4 DEFOCUS ABERRATION

We now discuss defocus wave aberration of a system and relate it to its longitudinal
defocus. Consider an imaging system for which the Gaussian image of a point object is
located at P1 . As indicated in Figure 2-6, let the wavefront for this point object be
spherical with a center of curvature at P2 (due, for example, to field curvature discussed
in Section 2.6 for an off-axis point object) such that P2 lies on the line OP1 joining the
center O of the exit pupil and the Gaussian image point P1. The aberration of the
wavefront representing its optical deviation along a ray from the Gaussian reference
sphere is given by nQ2Q1 , where n is the refractive index of the image space, and Q2Q1,
as indicated in the figure, is approximately equal to the difference in the sags of the
reference sphere and the wavefront at a height r. (The sag of a surface at a certain point
on it represents its deviation at that point along its axis of symmetry from a plane surface
that is tangent to it at its vertex.) Thus, the defocus wave aberration at a point Q1 at a
distance r from the optical axis, representing the second-order difference, is given by
$$W(r) = \frac{n}{2}\left(\frac{1}{z} - \frac{1}{R}\right) r^2, \tag{2-3}$$
where z and R are the radii of curvature of the reference sphere S and the spherical
wavefront W centered at P1 and P2, respectively, passing through the center O of the exit
pupil, and r is the distance of Q1 from the optical axis. We note that the defocus wave
aberration is proportional to $r^2$. If $z \simeq R$, then Eq. (2-3) may be written as follows:
$$W(r) \simeq -\frac{n\Delta}{2R^2}\, r^2, \tag{2-4}$$
where $\Delta = z - R$ is called the longitudinal defocus. We note that the defocus wave
aberration and the longitudinal defocus have numerically opposite signs.
Figure 2-6. Wavefront defocus. Defocused wavefront W is spherical with a radius of
curvature R centered at P2. The reference sphere S with a radius of curvature z is
centered at P1. Both W and S pass through the center O of the exit pupil ExP. The
ray Q2P2 is normal to the wavefront at Q2. OB represents the sag of Q1.
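The step from Eq. (2-3) to Eq. (2-4) may be made explicit as follows. With the longitudinal defocus defined as Δ = z − R, and with the approximation z ≃ R used only in the denominator, we may write:

```latex
\frac{1}{z} - \frac{1}{R} = \frac{R - z}{zR} = -\frac{\Delta}{zR} \simeq -\frac{\Delta}{R^{2}}
\quad (z \simeq R),
\qquad\Longrightarrow\qquad
W(r) = \frac{n}{2}\left(\frac{1}{z} - \frac{1}{R}\right) r^{2}
\simeq -\frac{n\,\Delta}{2R^{2}}\, r^{2} .
```

The minus sign in the last expression is the origin of the opposite signs of the defocus wave aberration and the longitudinal defocus.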
A defocus aberration is also introduced if the image is observed in a plane other than
the Gaussian image plane. Consider, for example, an imaging system forming an
aberration-free image at the Gaussian image point P2 (and not at P1, as in Figure 2-6).
Thus, the wavefront at the exit pupil is spherical passing through its center O with its
center of curvature at P2. Let the image be observed in a defocused plane passing through
a point P1, which lies on the line joining O and P2. For the observed image at P1 to be
aberration free, the wavefront at the exit pupil must be spherical with its center of
curvature at P1. Such a wavefront forms the reference sphere with respect to which the
aberration of the actual wavefront must be defined. The aberration of the wavefront at a
point Q1 on the reference sphere is given by Eqs. (2-3) and (2-4).
If the exit pupil is circular with a radius a, then Eq. (2-4) may be written
$$W(\rho) = B_d\, \rho^2, \tag{2-5}$$
where ρ = r/a is the normalized distance of a pupil point and
$$B_d \simeq -\frac{n\Delta}{8F^2} \tag{2-6}$$
represents the peak value of the defocus aberration with F = R/2a as the focal ratio or
the f-number of the image-forming light cone. Note that, according to Eq. (2-6), a positive
value of $B_d$ implies a negative value of Δ. Thus, an imaging system having a positive value
of defocus aberration $B_d$ can be made defocus free if the image is observed in a plane lying
farther from the plane of the exit pupil, compared to the defocused image plane, by a distance
$8B_dF^2/n$. Similarly, a positive defocus aberration of $B_d \simeq -n\Delta/8F^2$ is introduced into
the system if the image is observed in a plane lying closer to the plane of the exit pupil,
compared to the defocus-free image plane, by a distance |Δ|.
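As a small numerical illustration of Eq. (2-6), the sketch below converts between the peak defocus aberration and the longitudinal defocus. The function names, the f-number, and the wavelength are hypothetical and serve only to show the magnitudes and signs involved.

```python
def defocus_peak(delta, F, n=1.0):
    """Peak defocus aberration B_d ~ -n * Delta / (8 F^2), as in Eq. (2-6)."""
    return -n * delta / (8.0 * F * F)


def longitudinal_defocus(B_d, F, n=1.0):
    """Eq. (2-6) solved for the longitudinal defocus: Delta ~ -8 B_d F^2 / n."""
    return -8.0 * B_d * F * F / n


# Hypothetical numbers: an f/10 beam in air with one wave of defocus at 0.5 um.
wavelength_mm = 0.5e-3
B_d = 1.0 * wavelength_mm
delta = longitudinal_defocus(B_d, F=10.0)   # in mm; negative for positive B_d
```

With these values the longitudinal defocus is −0.4 mm, i.e., one wave of positive defocus aberration at f/10 corresponds to a focus shift of 0.4 mm in magnitude.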
2.5 WAVEFRONT TILT

Now we describe the relationship between a wavefront tilt and the corresponding tilt
aberration. As indicated in Figure 2-7, consider a spherical wavefront centered at P2 in
the Gaussian image plane passing through the Gaussian image point P1. The wave
aberration of the wavefront at Q1 is its optical deviation nQ2Q1 from a reference sphere
centered at P1. It is evident that, for small values of the ray aberration P1P2, the wavefront
and the reference sphere are tilted with respect to each other by an angle β. The
wavefront tilt may be due to an inadvertently tilted element of the imaging system or
distortion (discussed in Section 2.6) for an off-axis point object. The ray and the wave
aberrations can be written
$$x_i = R\beta \tag{2-7}$$
and
$$W(r, \theta) = n\beta r\cos\theta, \tag{2-8}$$
respectively, where P1P2 = $x_i$ and (r, θ) are the polar coordinates of the point Q1. Both
the wave and ray aberrations are numerically positive in Figure 2-7.
Figure 2-7. Wavefront tilt. The spherical wavefront W is centered at P2 while the
reference sphere S is centered at P1, such that the two spherical surfaces are tilted
with respect to each other by a small angle β = P1P2/R, where R is their radius of
curvature. The ray Q2P2 is normal to the wavefront at Q2.
Once again, for a system with a circular exit pupil of radius a, Eq. (2-8) may be
written
$$W(\rho, \theta) = na\beta\rho\cos\theta = B_t\,\rho\cos\theta, \tag{2-9}$$
where
$$B_t = na\beta \tag{2-10}$$
is the peak value of the wavefront tilt aberration. Note that a positive value of $B_t$ implies
that the wavefront tilt angle is also positive. Thus, if an aberration-free wavefront is
centered at P2, then an observation with respect to P1 as the origin implies that we have
introduced a tilt aberration of $B_t\,\rho\cos\theta$.
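The tilt relations of Eqs. (2-7) through (2-10) can be illustrated with a few lines of code. This is a sketch under assumed values: the radius R, the pupil radius a, the displacement $x_i$, and the function names are hypothetical.

```python
import math


def tilt_peak(beta, a, n=1.0):
    """Peak value of the tilt aberration, B_t = n a beta, as in Eq. (2-10)."""
    return n * a * beta


def tilt_aberration(rho, theta, B_t):
    """Tilt aberration W(rho, theta) = B_t rho cos(theta), as in Eq. (2-9)."""
    return B_t * rho * math.cos(theta)


# Hypothetical numbers (lengths in mm, angles in radians).
R, a = 200.0, 10.0
x_i = 0.02                  # displacement P1P2 of the center of curvature
beta = x_i / R              # tilt angle from Eq. (2-7), x_i = R beta
B_t = tilt_peak(beta, a)

# Peak-to-valley value across the pupil edge, between theta = 0 and theta = pi.
pv = tilt_aberration(1.0, 0.0, B_t) - tilt_aberration(1.0, math.pi, B_t)
```

Because the tilt aberration varies as cos θ, its peak-to-valley value across the pupil is twice the peak value $B_t$.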
2.6 ABERRATION FUNCTION OF A ROTATIONALLY SYMMETRIC SYSTEM
Consider a point object with Cartesian coordinates (p, q) in the object plane. Its
image, formed by a rotationally symmetric system, is perfect if the spherical wavefront
diverging from the object point and incident on the imaging system is converted by the
system into a spherical wavefront converging to its Gaussian image point. Any deviation
of the imaging wavefront at the exit pupil of the system from a reference sphere passing
through the center of the pupil with center of curvature at the Gaussian image point
represents the aberration function. In optical design, the aberration function is determined
by tracing rays originating at the point object and propagating them through the system
and determining their optical path lengths in reaching the reference sphere relative to that
of the chief ray passing through the center of the pupil. Similarly, in optical testing the
wave aberration at a discrete array of points is determined interferometrically.
If (x, y) are the coordinates of a pupil point, the aberration function consists of terms
formed from three rotational invariants, namely, $p^2 + q^2$, $x^2 + y^2$, and $px + qy$. If $\vec{h}$
and $\vec{r}$ are the position vectors of the object and pupil points, then the rotational invariants
are $\vec{h}\cdot\vec{h}$, $\vec{r}\cdot\vec{r}$, and $\vec{h}\cdot\vec{r}$, or $h^2$, $r^2$, and $hr\cos\theta$, where $h = |\vec{h}|$, $r = |\vec{r}|$, and θ is the polar
angle of $\vec{r}$ with respect to that of $\vec{h}$. It is convenient to consider the aberration function
in terms of the image height h′, for example, when the object is at infinity, and let θ be
the angle for the image point. The image height is, of course, related to the object height
by the Gaussian magnification. We now expand the aberration function $W(h'; r, \theta)$ in a
power series in terms of the three rotational invariants $h'^2$, $r^2$, and $h' r\cos\theta$ in the form
$$\begin{aligned}
W(h'; r, \theta) &= \sum_{l=0}^{\infty}\sum_{p=0}^{\infty}\sum_{m=0}^{\infty} C_{lpm}\,(h'^2)^l\,(r^2)^p\,(h' r\cos\theta)^m \\
&= \sum_{l=0}^{\infty}\sum_{p=0}^{\infty}\sum_{m=0}^{\infty} C_{lpm}\, h'^{\,2l+m}\, r^{2p+m}\cos^m\theta,
\end{aligned} \tag{2-11}$$
where $C_{lpm}$ are the expansion coefficients, and l, p, and m are non-negative integers.
There is no term with sin θ dependence. The aberration terms are called the
classical aberrations.
It is evident that the degree of each term of the series in the object or image and pupil
coordinates is even and given by 2(l + p + m). Any terms for which p = 0 = m so that
2p + m = 0, i.e., those terms that do not depend on r and, therefore, vary only as $h'^{\,2l}$,
must add up to zero since the aberration associated with the chief ray (for which r = 0) is
zero. Thus, the zero-degree term $C_{000}$ and terms such as $C_{100}h'^2$, $C_{200}h'^4$, etc., do not
appear in Eq. (2-11). There is also no term of second degree. For example, the term
$C_{010}r^2$ represents defocus aberration that is independent of h′. It has the implication that
the image is being observed in a plane other than the Gaussian image plane. Similarly, the
term $C_{001}h' r\cos\theta$ represents a wavefront tilt aberration that depends on h′. It has the
implication that the image height is not h′. Hence, a power series expansion of the
aberration function consists of terms of degree 4, 6, 8, etc. The corresponding aberrations
are referred to as the primary, secondary, tertiary aberrations, etc. The primary
aberrations are also called the Seidel aberrations, and the secondary aberrations are also
called the Schwarzschild aberrations.
It is convenient to write Eq. (2-11) in the form
$$W(h'; r, \theta) = \sum_{l=0}^{\infty}\sum_{n=1}^{\infty}\sum_{m=0}^{n} {}_{2l+m}a_{nm}\, h'^{\,2l+m}\, r^n\cos^m\theta, \tag{2-12}$$
where
$$n = 2p + m \tag{2-13}$$
is a positive integer not including zero, and ${}_{2l+m}a_{nm}$ are the expansion coefficients. From
Eq. (2-13), we note that $n - m = 2p \geq 0$ and even. The order i of an aberration term,
which is equal to its degree in the object and pupil coordinates, is given by
$$i = 2l + m + n. \tag{2-14}$$
The number of terms $N_i$ of a certain order i, i.e., the number of integer sets satisfying Eq.
(2-14) with $n - m \geq 0$ and even, is given by
$$N_i = (i + 2)(i + 4)/8. \tag{2-15}$$
This number includes a term with n = 0 = m, called piston aberration, although such a
term does not constitute an aberration (since it corresponds to the chief ray, which has a
zero aberration associated with it). It is included here for completeness, as interferometric
data based on the aberrations of a system may have a piston component.
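The count of Eq. (2-15) can be verified by brute-force enumeration of the integer sets (l, n, m) that satisfy Eq. (2-14) with n − m ≥ 0 and even. The sketch below is one way to do this; the function names are hypothetical, and the piston term n = 0 = m is included, as in the text.

```python
def terms_of_order(i):
    """All (l, n, m) with 2l + m + n = i, n - m >= 0 and even, and l, n, m >= 0."""
    terms = []
    for n in range(i + 1):
        for m in range(n + 1):
            if (n - m) % 2:
                continue              # n - m must be even
            rest = i - n - m          # what remains must equal 2l
            if rest >= 0 and rest % 2 == 0:
                terms.append((rest // 2, n, m))
    return terms


def n_terms(i):
    """Closed-form count of Eq. (2-15), N_i = (i + 2)(i + 4)/8, for even i."""
    return (i + 2) * (i + 4) // 8


counts = {i: len(terms_of_order(i)) for i in (4, 6, 8)}
```

For i = 4, 6, and 8 the enumeration gives 6, 10, and 15 terms, i.e., five primary, nine secondary, and fourteen tertiary aberration terms plus one piston term in each case.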
The fourth order (i = 4), i.e., the primary or the Seidel aberration function consisting
of a sum of five fourth-order terms, can be written
$$\begin{aligned}
W_P(r, \theta; h') = {}& {}_0a_{40}\, r^4 + {}_1a_{31}\, h' r^3\cos\theta + {}_2a_{22}\, h'^2 r^2\cos^2\theta \\
& + {}_2a_{20}\, h'^2 r^2 + {}_3a_{11}\, h'^3 r\cos\theta.
\end{aligned} \tag{2-16}$$
Since the wave aberration W has dimensions of length, the dimensions of the coefficients
${}_ia_{jk}$ are inverse length cubed. Since the ray aberrations are related to the wave
aberrations by a spatial derivative [see Eq. (2-1)], their degree is lower by one.
Accordingly, the primary aberrations are also referred to as the third-order ray
aberrations. The wave aberration coefficients ${}_0a_{40}$, ${}_1a_{31}$, ${}_2a_{22}$, ${}_2a_{20}$, and ${}_3a_{11}$ represent
the coefficients of spherical aberration, coma, astigmatism, field curvature, and
distortion, respectively.
From Eq. (2-16), we note that only spherical aberration is independent of the object
or image height. The field curvature, in its dependence on the pupil coordinates (r, θ), is
like the defocus aberration discussed in Section 2.4. However, the field curvature
represents a defocus aberration that depends on the field h′, thus requiring a curved
image surface for its elimination. On the other hand, pure defocus aberration, such as that
produced by observing the image in a plane other than the Gaussian image plane, is
independent of the field h′. Similarly, distortion depends on the pupil coordinates as a
wavefront tilt. However, distortion depends on the field as $h'^3$, but the wavefront tilt
produced by a tilted element in the system would be independent of h′.
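The field dependence of the primary aberrations described above can be made concrete by evaluating Eq. (2-16) directly. The following is a minimal sketch; the coefficient values and the dictionary keys are hypothetical and chosen only for illustration.

```python
import math


def primary_aberration(r, theta, h, coeffs):
    """Primary (Seidel) aberration function of Eq. (2-16).

    coeffs maps the names below to the coefficients 0a40, 1a31, 2a22, 2a20,
    and 3a11, respectively; h is the image height h'.
    """
    c = math.cos(theta)
    return (coeffs["spherical"] * r**4
            + coeffs["coma"] * h * r**3 * c
            + coeffs["astigmatism"] * h**2 * r**2 * c**2
            + coeffs["field_curvature"] * h**2 * r**2
            + coeffs["distortion"] * h**3 * r * c)


# Hypothetical coefficients (units of 1/mm^3), for illustration only.
coeffs = {"spherical": 1e-7, "coma": 2e-7, "astigmatism": 3e-7,
          "field_curvature": 4e-7, "distortion": 5e-7}

on_axis = primary_aberration(10.0, 0.3, 0.0, coeffs)    # only spherical survives
chief_ray = primary_aberration(0.0, 0.0, 5.0, coeffs)   # zero for r = 0
```

For an on-axis point object (h′ = 0) only the spherical aberration term contributes, and the aberration of the chief ray (r = 0) is zero for any field, as required.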
The sixth order (i = 6), i.e., the secondary or the Schwarzschild aberration function,
can be written
$$\begin{aligned}
W_S(h'; r, \theta) = {}& {}_0a_{60}\, r^6 + {}_1a_{51}\, h' r^5\cos\theta + {}_2a_{42}\, h'^2 r^4\cos^2\theta + {}_3a_{33}\, h'^3 r^3\cos^3\theta + {}_2a_{40}\, h'^2 r^4 \\
& + {}_3a_{31}\, h'^3 r^3\cos\theta + {}_4a_{22}\, h'^4 r^2\cos^2\theta + {}_4a_{20}\, h'^4 r^2 + {}_5a_{11}\, h'^5 r\cos\theta.
\end{aligned} \tag{2-17}$$
Four of the nine aberration terms (excluding piston) correspond to l = 0. They are the
secondary spherical aberration (${}_0a_{60}\, r^6$), secondary coma (${}_1a_{51}\, h' r^5\cos\theta$), secondary
astigmatism (${}_2a_{42}\, h'^2 r^4\cos^2\theta$) (wings or Flügelfehler), and arrows or Pfeilfehler
(${}_3a_{33}\, h'^3 r^3\cos^3\theta$). The remaining five corresponding to l ≠ 0 and called lateral
aberrations are similar to the corresponding primary aberrations except for their
dependence on the image height h′. The lateral spherical aberration ${}_2a_{40}\, h'^2 r^4$ is also
called the oblique spherical aberration.
Aberration terms of the eighth (i = 8) order are called the tertiary aberrations. There
are fourteen aberration terms of this order, excluding piston. Only five of them have the
dependencies on pupil coordinates that are different from those of the secondary or
primary aberrations. Four have dependence on these coordinates as for the secondary
aberrations, and the remaining five have the same dependence as the primary aberrations.
Their difference lies in their dependence on the image height.
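The grouping of the fourteen tertiary terms by their pupil dependence can be reproduced by enumerating the allowed (n, m) pairs of each order. The sketch below is illustrative; the function name is hypothetical, and the piston term is excluded by starting n at 1.

```python
def pupil_terms(i):
    """(n, m) pairs of order i = 2l + m + n with n - m >= 0 and even, excluding piston."""
    return [(n, m)
            for n in range(1, i + 1)
            for m in range(n + 1)
            if (n - m) % 2 == 0 and i - n - m >= 0 and (i - n - m) % 2 == 0]


primary = set(pupil_terms(4))
secondary = set(pupil_terms(6))
tertiary = set(pupil_terms(8))

# Pupil dependencies that appear for the first time in the eighth order.
new_in_tertiary = tertiary - secondary
# Dependencies shared with the secondary, but not the primary, aberrations.
like_secondary = (tertiary & secondary) - primary
# Dependencies shared with the primary aberrations.
like_primary = tertiary & primary
```

The three groups contain five, four, and five terms, respectively, which add up to the fourteen tertiary terms.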
By combining the aberration terms having different dependencies on the object
coordinates but the same dependence on pupil coordinates so that there is only one term
for each pair of (n, m) values, Eq. (2-12) for the power-series expansion of the aberration
function may be written
$$W(\rho, \theta) = \sum_{n=1}^{\infty}\sum_{m=0}^{n} a_{nm}\,\rho^n\cos^m\theta, \tag{2-18}$$
where the expansion coefficients $a_{nm}$ are related to the coefficients ${}_{2l+m}a_{nm}$ according to
$$a_{nm} = a^n \sum_{l=0}^{\infty} {}_{2l+m}a_{nm}\, h'^{\,2l+m}. \tag{2-19}$$
The radial coordinate r has been normalized to ρ = r/a. It has the advantage that, since
$0 \leq \rho \leq 1$ and $|\cos\theta| \leq 1$, the coefficient $a_{nm}$ of a classical aberration $\rho^n\cos^m\theta$
represents the peak value or half of the peak-to-valley (P-V) value of the corresponding
aberration term, depending on whether m is even or odd, respectively. The indices n and
m represent the powers of ρ and cos θ, respectively. The index m also represents the
minimum power of h′ dependence of a coefficient (with the exception of the tilt and defocus
coefficients $a_{11}$ and $a_{20}$, for which n − m = 0 and 2, respectively). The maximum power of h′
dependence is given by i − n. Moreover, the powers of h′ dependence are even or odd
according to whether n and m are even or odd, respectively. The number of terms through
a certain order i in the reduced power-series expansion of the aberration function given
by Eq. (2-18) is also given by Eq. (2-15). This number includes a nonaberration piston
term corresponding to n = 0 = m. The terms of Eq. (2-12) through a certain order i
correspond to those terms of Eq. (2-18) for which $n + m \leq i$.
The primary aberrations correspond to terms with $n + m \leq 4$. The primary or the
Seidel aberration function of Eq. (2-16) may be written in terms of the coefficients $a_{nm}$
in the form
$$W_P(\rho, \theta) = a_{11}\,\rho\cos\theta + a_{20}\,\rho^2 + a_{22}\,\rho^2\cos^2\theta + a_{31}\,\rho^3\cos\theta + a_{40}\,\rho^4, \tag{2-20}$$
where
$$a_{11} = {}_3a_{11}\, h'^3 a, \tag{2-21a}$$
$$a_{20} = {}_2a_{20}\, h'^2 a^2, \tag{2-21b}$$
$$a_{22} = {}_2a_{22}\, h'^2 a^2, \tag{2-21c}$$
$$a_{31} = {}_1a_{31}\, h' a^3, \tag{2-21d}$$
and
$$a_{40} = {}_0a_{40}\, a^4. \tag{2-21e}$$
Comparing the distortion term $a_{11}\,\rho\cos\theta$ with the wavefront tilt aberration given by
Eq. (2-9), we note that while the two are similar in their dependence on the pupil
coordinates, their coefficients depend on the image height differently. The distortion
coefficient $a_{11}$ varies with h′ as $h'^3$, but the tilt coefficient $B_t$ is independent of h′.
Similarly, comparing the field curvature term $a_{20}\,\rho^2$ with the defocus wave aberration
given by Eq. (2-5), we note that their dependence on the pupil coordinates is the same.
However, whereas the field curvature coefficient $a_{20}$ varies with h′ as $h'^2$, the defocus
coefficient $B_d$ is independent of h′.
The aberration function through the sixth order, i.e., for $i \leq 6$ or $n + m \leq 6$, may be
written
$$\begin{aligned}
W_S(\rho, \theta) = {}& a_{11}\,\rho\cos\theta + a_{20}\,\rho^2 + a_{22}\,\rho^2\cos^2\theta + a_{31}\,\rho^3\cos\theta + a_{33}\,\rho^3\cos^3\theta \\
& + a_{40}\,\rho^4 + a_{42}\,\rho^4\cos^2\theta + a_{51}\,\rho^5\cos\theta + a_{60}\,\rho^6,
\end{aligned} \tag{2-22}$$
where
$$a_{11} = \left({}_3a_{11}\, h'^3 + {}_5a_{11}\, h'^5\right) a, \tag{2-23a}$$
$$a_{20} = \left({}_2a_{20}\, h'^2 + {}_4a_{20}\, h'^4\right) a^2, \tag{2-23b}$$
$$a_{22} = \left({}_2a_{22}\, h'^2 + {}_4a_{22}\, h'^4\right) a^2, \tag{2-23c}$$
$$a_{31} = \left({}_1a_{31}\, h' + {}_3a_{31}\, h'^3\right) a^3, \tag{2-23d}$$
$$a_{33} = {}_3a_{33}\, h'^3 a^3, \tag{2-23e}$$
$$a_{40} = \left({}_0a_{40} + {}_2a_{40}\, h'^2\right) a^4, \tag{2-23f}$$
$$a_{42} = {}_2a_{42}\, h'^2 a^4, \tag{2-23g}$$
$$a_{51} = {}_1a_{51}\, h' a^5, \tag{2-23h}$$
$$a_{60} = {}_0a_{60}\, a^6. \tag{2-23i}$$
Written in this form, the aberration function has nine aberration terms through the sixth
order or through the secondary aberrations. Since the dependence of an aberration term
on the image height h′ is contained in the aberration coefficient $a_{nm}$, it should be noted
that the primary aberrations (including distortion and field curvature terms) in Eqs. (2-23)
are not the same as those in Eq. (2-20), because they contain aberration components not
only of the fourth degree, but of the sixth degree as well. For example, $a_{40}\,\rho^4$ consists of
spherical and lateral spherical aberrations ${}_0a_{40}\, a^4\rho^4$ and ${}_2a_{40}\, h'^2 a^4\rho^4$.
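The combination of power-series coefficients into the reduced coefficients $a_{nm}$, Eq. (2-19), amounts to a sum over the field powers for each (n, m) pair. A minimal sketch follows; the dictionary layout, the function name, and the numerical values are hypothetical.

```python
def reduced_coefficients(power_series, h, a):
    """Combine coefficients as in Eq. (2-19): a_nm = a^n * sum_k (k a_nm) h'^k.

    power_series maps (k, n, m) -> coefficient, where k = 2l + m is the power
    of the image height h' and (n, m) are the powers of r and cos(theta).
    """
    reduced = {}
    for (k, n, m), value in power_series.items():
        reduced[(n, m)] = reduced.get((n, m), 0.0) + value * h**k * a**n
    return reduced


# Hypothetical coefficients: primary spherical and lateral spherical aberration only.
series = {(0, 4, 0): 2.0e-7, (2, 4, 0): 1.0e-9}
a_nm = reduced_coefficients(series, h=5.0, a=10.0)
```

Here the two contributions merge into the single coefficient $a_{40}$ of Eq. (2-23f), so the field dependence is no longer visible in the pupil expansion itself.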
Similarly, the aberration function through the eighth order can be written. Once
again, an aberration term of this expansion will not be necessarily the same as a
corresponding term of the expansions of Eq. (2-20) or (2-22). We add that it is convenient
to refer to the aberration terms of a power-series expansion as the classical aberrations,
e.g., a term in $\rho^4$ may be referred to as the classical primary spherical aberration.
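As an illustration of Eqs. (2-22) and (2-23f), the following minimal sketch evaluates the nine-term aberration function at a normalized pupil point. The function names, the dictionary layout for the coefficients, and the numerical values are assumptions made for this example only; they are not taken from the text.

```python
import math

# (n, m) index pairs of the nine terms a_nm * rho**n * cos(theta)**m in Eq. (2-22)
TERMS = [(1, 1), (2, 0), (2, 2), (3, 1), (3, 3), (4, 0), (4, 2), (5, 1), (6, 0)]

def sixth_order_aberration(rho, theta, coeffs):
    """Evaluate Eq. (2-22) at the normalized pupil point (rho, theta).

    coeffs maps (n, m) -> a_nm; terms that are absent are treated as zero.
    """
    c = math.cos(theta)
    return sum(coeffs.get((n, m), 0.0) * rho**n * c**m for n, m in TERMS)

def a40_coefficient(h, a, a40_0, a40_2):
    """Eq. (2-23f): spherical plus lateral spherical contributions to a_40.

    h is the image height h', a the pupil radius, and a40_0, a40_2 the
    field-independent and h'^2-dependent coefficients (hypothetical values).
    """
    return (a40_0 + a40_2 * h**2) * a**4

# Hypothetical coefficients, in waves
coeffs = {(4, 0): 1.0, (3, 1): 0.5}
print(sixth_order_aberration(1.0, 0.0, coeffs))
```

The coefficient of the $\rho^4$ term changes with image height through Eq. (2-23f), even though its pupil dependence stays the same.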
2.7 OBSERVATION OF ABERRATIONS: INTERFEROGRAMS

There are a variety of interferometers that are used for detecting and measuring
aberrations of optical systems [4]. Figure 2-8 illustrates schematically a Twyman–Green
interferometer in which a collimated laser beam is divided into two parts by a beam
splitter BS. One part, called the test beam, is incident on the system under test, indicated
by the lens L, and the other, called the reference beam, is incident on a plane mirror M1.
The focus F of the lens system lies at the center of curvature C of a spherical mirror M2.
As the angle of the incident light is changed to study the off-axis aberrations of the
system, the mirror is tilted so that its center of curvature lies at the current focus of the
beam. In this arrangement the mirror does not introduce any aberration since it is forming
the image of an object lying at its center of curvature.
The two reflected beams interfere in the region of their overlap. Lens $L'$ is used to
observe the interference pattern on a screen S placed in a plane containing the image of L
formed by $L'$. A record of the interference pattern is called an interferogram. Note that
since the test beam goes through the lens system L twice, its aberration is twice that of the
system.
Figure 2-8. Twyman–Green interferometer for testing a lens system L. A laser beam
is split into two parts by a beam splitter BS. The reflected part is incident on a plane
mirror M1 and the transmitted part is incident on L. F is the image-space focal
point of L, and C is the center of curvature of a spherical mirror M2. The
interfering beams are focused by a lens $L'$, and the interference pattern is observed
on a screen S.
If the reference beam has a uniform phase and the test beam has a phase distribution
$\Phi(x, y)$, and if their amplitudes are equal to each other, the irradiance distribution of their
interference pattern is given by
$$ I(x, y) = I_0\left|1 + \exp[i\Phi(x, y)]\right|^2 = 2I_0\{1 + \cos[\Phi(x, y)]\} , \tag{2-24} $$
where $I_0$ is the irradiance when only one beam is present. Of course, the phase and the
wave aberration distributions are related to each other according to
$$ \Phi(x, y) = \frac{2\pi}{\lambda}W(x, y) , \tag{2-25} $$
where $\lambda$ is the wavelength of the laser beam. The irradiance has a maximum value equal
to $4I_0$ at those points for which
$$ \Phi(x, y) = 2\pi n \tag{2-26a} $$
and a minimum value equal to zero wherever
$$ \Phi(x, y) = 2\pi(n + 1/2) , \tag{2-26b} $$
where n is a positive or a negative integer, including zero. Each fringe in the interference
pattern represents a certain value of n, which in turn corresponds to the locus of $(x, y)$
points with phase aberration given by Eq. (2-26a) for a bright fringe and Eq. (2-26b) for a
dark fringe. If the test beam is aberration free $[\Phi(x, y) = 0]$, then the interference pattern
has a uniform irradiance of $4I_0$. Figure 2-9 shows interferograms of six waves of a
primary aberration. In Figure 2-9b for spherical aberration and 2-9d for astigmatism, a
certain amount of defocus has also been added. In Figure 2-9c, a certain amount of tilt has
been added to the coma aberration.
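The fringe conditions of Eqs. (2-24) to (2-26) can be sketched numerically as follows. This is only an illustration under stated assumptions: the grid size, the choice of three waves of defocus in the system under test, and the function name `interferogram` are hypothetical and not taken from the text.

```python
import numpy as np

def interferogram(W_waves, I0=1.0):
    """Two-beam irradiance of Eq. (2-24).

    W_waves is the wave aberration of the test beam in units of the wavelength,
    so the phase of Eq. (2-25) is simply 2*pi*W_waves.
    """
    phi = 2.0 * np.pi * np.asarray(W_waves, dtype=float)
    return 2.0 * I0 * (1.0 + np.cos(phi))

# Square grid covering the unit pupil (assumed sampling)
n = 201
x = np.linspace(-1.0, 1.0, n)
X, Y = np.meshgrid(x, x)
rho2 = X**2 + Y**2
inside = rho2 <= 1.0

# Hypothetical system aberration: 3 waves of defocus, doubled by the double pass
W = 2.0 * 3.0 * rho2
I = np.where(inside, interferogram(W), 0.0)

print(I.max(), I[inside].min())
```

Bright fringes appear wherever the doubled aberration is an integer number of waves, and dark fringes wherever it is a half-integer number of waves.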
2.8 SUMMARY
A perfect image of a point object is formed by an imaging system when a spherical
wave diverging from the object and incident on the system is converted by it into a
spherical wave converging to the Gaussian image point. If rays from the object point are
traced through the system, they all travel exactly the same optical path length from the
object point to the Gaussian image point, and they all pass through this image point.
When the wavefront exiting from the exit pupil of the system is not spherical, its optical
deviations from the spherical form represent the wave aberrations, and an aberrated
image is formed. The rays intersect the image plane in the vicinity of the Gaussian image
point, and their distribution is called the spot diagram. The wave and the ray aberrations
are related to each other by a spatial derivative, as in Eq. (2-1).
The aberrations of a rotationally symmetric system depend on the product of the
integral powers of three rotational invariants, namely, $h'^2$, $r^2$, and $h'r\cos\theta$, where $h'$ is
the height of the Gaussian image point from the optical axis and $(r, \theta)$ are the polar
coordinates of a point in the plane of the exit pupil. There is no term with $\sin\theta$
dependence. The order of an aberration term, representing its degree in the object and
pupil coordinates, is even. The aberrations of the lowest order, namely 4, are called
primary or Seidel aberrations. Similarly, the aberrations of the next order, namely 6, are
called the secondary or the Schwarzschild aberrations. When an image is observed in a
defocused image plane, the defocus aberration thus introduced varies as $r^2$. It is similar
to the field curvature aberration in its pupil dependence, but whereas the former is
independent of the image height, the latter varies as $h'^2$.
The interference pattern formed by two beams, one of which has traveled through an
aberrated system, is shown in Figure 2-9 for primary aberrations, as an illustration of
interferograms.
Figure 2-9. Interferograms of primary aberrations: (a) defocus $B_d\rho^2$, (b) spherical
aberration combined with defocus $A_s\rho^4 + B_d\rho^2$, (c) coma combined with tilt
$A_c\rho^3\cos\theta + B_t\rho\cos\theta$, and (d) astigmatism combined with defocus $A_a\rho^2\cos^2\theta + B_d\rho^2$. The
aberrations in the interferograms are twice their corresponding values in the system
under test, because the test beam goes through the system twice.
References
1. V. N. Mahajan, Optical Imaging and Aberrations, Part I: Ray Geometrical Optics,
2nd Printing (SPIE Press, Bellingham, Washington, 2001).
2. M. Born and E. Wolf, Principles of Optics, 7th ed. (Cambridge University Press,
New York, 1999).
3. W. T. Welford, Aberrations of the Symmetrical Optical System (Academic Press,
New York, 1974).
4. D. Malacara, Ed., Optical Shop Testing, 3rd ed. (Wiley, New York, 2007).
CHAPTER 3
ORTHONORMAL POLYNOMIALS AND GRAM–SCHMIDT ORTHONORMALIZATION
3.1 Introduction
3.2 Orthonormal Polynomials
3.3 Equivalence of Orthogonality-Based Coefficients and Least-Squares Fitting
3.4 Orthonormalization of Zernike Circle Polynomials over Noncircular Pupils
3.5 Unit Pupil
3.6 Summary
References
Chapter 3
Orthonormal Polynomials and Gram–Schmidt
Orthonormalization
3.1 INTRODUCTION
In optical design, we trace rays from a point object through a system to determine the
aberrations of the wavefront at its exit pupil. In optical testing, we determine the
aberrations of a system or an element interferometrically. In both cases, we obtain
aberration numbers at an array of points. We can calculate the PSF or other associated
image quality measures from these numbers. We can also calculate the aberration
variance, which, in turn, gives some idea of the image quality. However, such measures
do not shed light on the content of the aberration function. To understand the nature of
this function, we want to know the amount of certain familiar aberrations discussed in
Chapter 2 that are present, so that perhaps something can be done about them in
improving the design or the system under test.
A straightforward approach to determine the content of an aberration function is to
decompose it into a set of orthogonal polynomials that represent balanced classical
aberrations and include wavefront defocus and tilt. The Zernike circle polynomials are in
widespread use for this purpose for systems with circular pupils. These polynomials are
unique in the sense that they are not only orthogonal across a unit circle, but they also
represent balanced aberrations yielding minimum variance, as we shall see in Chapter 4.
In this chapter, we discuss the basic properties of the orthogonal polynomials. We also
describe the Gram–Schmidt orthogonalization process for obtaining orthogonal
polynomials over one domain from those that are orthogonal over another domain, e.g.,
obtaining polynomials that are orthogonal over an annular pupil from the circle
polynomials. We emphasize the use of orthonormal polynomials so that their coefficients
represent the standard deviations of the corresponding polynomial aberration terms.
3.2 ORTHONORMAL POLYNOMIALS

Consider a complete set of polynomials $F_j(x, y)$ in Cartesian coordinates $(x, y)$ that
are orthonormal over a certain pupil according to
$$ \frac{1}{A}\int_{\text{pupil}} F_j(x, y)\,F_{j'}(x, y)\,dx\,dy = \delta_{jj'} , \tag{3-1} $$
where $A$ is the area of the pupil inscribed inside a unit circle, the integration is carried out
over the area of the pupil, and $\delta_{jj'}$ is a Kronecker delta. Let $F_1 = 1$. Since it is
independent of the coordinates x and y, it is referred to as the piston polynomial. As a
result, the mean value of each polynomial, except for $j = 1$, is zero, i.e.,
$$ \langle F_j(x, y)\rangle = \frac{1}{A}\int_{\text{pupil}} F_j(x, y)\,dx\,dy = 0 \quad \text{for } j \ne 1 , \tag{3-2} $$
as may be seen by letting $j' = 1$ in Eq. (3-1). The angular brackets on the left-hand side
of Eq. (3-2) indicate a mean value over the area of the pupil. Similarly, the mean square
value of a polynomial is unity, i.e.,
$$ \langle F_j^2(x, y)\rangle = \frac{1}{A}\int_{\text{pupil}} F_j^2(x, y)\,dx\,dy = 1 , \tag{3-3} $$
as may be seen by letting $j' = j$ in Eq. (3-1).
An aberration function $W(x, y)$ can be expanded in terms of the polynomials in the
form
$$ W(x, y) = \sum_{j=1}^{\infty} a_j F_j(x, y) , \tag{3-4} $$
where $a_j$ is an expansion or the aberration coefficient of the polynomial $F_j(x, y)$.
Multiplying both sides of Eq. (3-4) by $F_{j'}(x, y)$, integrating over the pupil, and utilizing
the orthonormality Eq. (3-1), the aberration coefficients are given by
$$ \frac{1}{A}\int_{\text{pupil}} W(x, y)\,F_{j'}(x, y)\,dx\,dy = \frac{1}{A}\sum_{j=1}^{\infty} a_j \int_{\text{pupil}} F_j(x, y)\,F_{j'}(x, y)\,dx\,dy = a_{j'} , $$
or
$$ a_j = \frac{1}{A}\int_{\text{pupil}} W(x, y)\,F_j(x, y)\,dx\,dy . \tag{3-5} $$
It is evident that the value of an expansion coefficient is independent of the number of
polynomials used in the expansion. Accordingly, one or more terms can be added to or
subtracted from the aberration function without affecting the other coefficients. It is a
consequence of the orthogonality of the polynomials.
The mean value of the aberration function is given by
$$ \langle W(x, y)\rangle = \sum_{j=1}^{\infty} a_j \langle F_j(x, y)\rangle = a_1 , \tag{3-6} $$
where we have utilized Eq. (3-2) for the mean value of a polynomial. The mean square
value of the aberration function is given by
$$ \langle W^2(x, y)\rangle = \frac{1}{A}\int_{\text{pupil}} \sum_{j=1}^{\infty} a_j F_j(x, y) \sum_{j'=1}^{\infty} a_{j'} F_{j'}(x, y)\,dx\,dy = \sum_{j=1}^{\infty} a_j^2 , \tag{3-7} $$
where we have utilized the orthonormality Eq. (3-1) and Eq. (3-3) for the mean square
value of a polynomial. The variance $\sigma_W^2$ of the aberration function is accordingly given
by
$$ \sigma_W^2 = \langle W^2(x, y)\rangle - \langle W(x, y)\rangle^2 = \sum_{j=2}^{\infty} a_j^2 , \tag{3-8} $$
where $\sigma_W$ is the standard deviation or the sigma value of the aberration function. Since
the mean value of a polynomial (except piston) is zero, each expansion coefficient $a_j$
represents the standard deviation of the corresponding polynomial term. The variance of
the aberration function is simply the sum of the variances of the polynomial terms.
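A minimal numerical sketch of Eqs. (3-5), (3-6), and (3-8) for a unit circular pupil is given below. It is an illustration under assumptions not stated in this section: the pupil mean is approximated by an average over grid points, the four polynomials are the standard low-order orthonormal Zernike circle polynomials (piston, x and y tilt, and defocus), and the coefficient values and function names are hypothetical.

```python
import numpy as np

def disc_points(n=401):
    """Grid points inside the unit disc; their average approximates (1/A) * integral."""
    x = np.linspace(-1.0, 1.0, n)
    X, Y = np.meshgrid(x, x)
    keep = X**2 + Y**2 <= 1.0
    return X[keep], Y[keep]

def basis(x, y):
    """Piston, x tilt, y tilt, and defocus in orthonormal form (assumed basis)."""
    r2 = x**2 + y**2
    return np.stack([np.ones_like(x), 2 * x, 2 * y, np.sqrt(3.0) * (2 * r2 - 1)])

x, y = disc_points()
F = basis(x, y)                          # shape (4, number of points)

true_a = np.array([0.3, 0.5, 0.0, -0.2])  # hypothetical coefficients, in waves
W = true_a @ F                           # aberration function, Eq. (3-4)

a = (F * W).mean(axis=1)                 # coefficients from Eq. (3-5)
mean_W = W.mean()                        # mean value, Eq. (3-6)
var_direct = (W**2).mean() - mean_W**2   # variance computed from the data
var_coeffs = np.sum(a[1:]**2)            # variance from Eq. (3-8)

print(a, mean_W, var_direct, var_coeffs)
```

The recovered coefficients agree with the assumed ones only to within the sampling error of the grid, and the two variance estimates agree to the same accuracy.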
In the orthonormality Eq. (3-1) and those that follow it, we have assumed a
uniformly illuminated pupil, i.e., the amplitude across it is constant. If that is not the case,
as for example in a Gaussian pupil where the amplitude across the pupil varies as a
Gaussian function, then the amplitude function must be included in all the integrations
over the pupil (see Chapter 6). The quantity A in such cases would also be an amplitude-
weighted area of the pupil. Thus, the integrations, indicated by the angular brackets
implying a mean value, would be over an amplitude-weighted area of the pupil.
In practice, the number of polynomials used in the expansion will be truncated such
that the resulting variance obtained from Eq. (3-8) equals the actual value obtained from
the function $W(x, y)$ within some specified tolerance. The Strehl ratio of an image for
small aberrations can be estimated from the variance according to Eq. (1-34).
3.3 EQUIVALENCE OF ORTHOGONALITY-BASED COEFFICIENTS AND LEAST-SQUARES FITTING
It is easy to show that the expansion coefficients $a_j$ given by Eq. (3-5) and obtained
as a consequence of the orthogonality of the polynomials $F_j(x, y)$ represent a
least-squares fit of the aberration function $W(x, y)$. Suppose we estimate the function with
only J polynomials. Thus we write
$$ \hat{W}(x, y) = \sum_{j=1}^{J} a_j F_j(x, y) , \tag{3-9} $$
where $\hat{W}(x, y)$ is the best-fit estimate of $W(x, y)$. The least-squares error resulting from
fitting the aberration function with J polynomials is given by
$$ E = \frac{1}{A}\int_{\text{pupil}} \left[W(x, y) - \hat{W}(x, y)\right]^2 dx\,dy = \frac{1}{A}\int_{\text{pupil}} \left[W(x, y) - \sum_{j=1}^{J} a_j F_j(x, y)\right]^2 dx\,dy . \tag{3-10} $$
The error is minimum when the coefficients obey the condition
$$ \frac{\partial E}{\partial a_{j'}} = 0 , \tag{3-11} $$
or
$$ \frac{1}{A}\int_{\text{pupil}} \left[W(x, y) - \sum_{j=1}^{J} a_j F_j(x, y)\right] F_{j'}(x, y)\,dx\,dy = 0 . \tag{3-12} $$
Using the orthonormality Eq. (3-1), Eq. (3-12) yields Eq. (3-5). The variance of the
estimated aberration function is given by
$$ \sigma_{\hat{W}}^2 = \langle \hat{W}^2(x, y)\rangle - \langle \hat{W}(x, y)\rangle^2 = \sum_{j=2}^{J} a_j^2 . \tag{3-13} $$
It should be evident that each polynomial coefficient provides a best fit to the
aberration function. The fit, of course, improves as more and more polynomials are added
until there is no more improvement. We point out that, in practice, the aberration function
data is available at a discrete set of points. Hence, there will be some error in the
coefficient values, because the orthonormality Eq. (3-1) will not be satisfied exactly. This
error decreases as the number of data points increases.
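The effect of discrete sampling can be illustrated with a short sketch that compares the coefficients obtained from a discrete form of Eq. (3-5) with those from an explicit least-squares fit of Eq. (3-10). The grid sampling, the four low-order Zernike circle polynomials used as the basis, the coefficient values, and the function names are all assumptions of this example.

```python
import numpy as np

def disc_points(n):
    """Grid points inside the unit disc for an n x n grid."""
    x = np.linspace(-1.0, 1.0, n)
    X, Y = np.meshgrid(x, x)
    keep = X**2 + Y**2 <= 1.0
    return X[keep], Y[keep]

def basis(x, y):
    """Piston, x tilt, y tilt, and defocus in orthonormal form (assumed basis)."""
    r2 = x**2 + y**2
    return np.stack([np.ones_like(x), 2 * x, 2 * y, np.sqrt(3.0) * (2 * r2 - 1)])

def fit_both_ways(n, true_a):
    """Return coefficients from the discrete Eq. (3-5) and from least squares."""
    x, y = disc_points(n)
    F = basis(x, y)
    W = true_a @ F
    a_proj = (F * W).mean(axis=1)                      # discrete Eq. (3-5)
    a_lsq, *_ = np.linalg.lstsq(F.T, W, rcond=None)    # minimizes discrete Eq. (3-10)
    return a_proj, a_lsq

true_a = np.array([0.3, 0.5, -0.1, -0.2])   # hypothetical coefficients
for n in (21, 201):
    a_proj, a_lsq = fit_both_ways(n, true_a)
    print(n, np.abs(a_proj - true_a).max(), np.abs(a_lsq - true_a).max())
```

Because the sampled polynomials are only approximately orthonormal, the projection coefficients carry a small error that depends on the sampling, whereas the explicit least-squares solve recovers the assumed coefficients for data that lie exactly in the span of the basis.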
3.4 ORTHONORMALIZATION OF ZERNIKE CIRCLE POLYNOMIALS OVER NONCIRCULAR PUPILS
The Zernike circle polynomials (discussed in Chapter 4) are orthogonal over a
circular pupil. They uniquely represent balanced classical aberrations and include
wavefront tilt and defocus aberrations. The corresponding polynomials $F_j(x, y)$ that are
orthogonal over a noncircular pupil can be obtained by orthogonalizing the circle
polynomials $Z_j(x, y)$ using the Gram–Schmidt orthonormalization process [1]. Omitting
the argument $(x, y)$ of the polynomials for simplicity, we may write
$$ G_1 = Z_1 = 1 , \tag{3-14} $$
$$ G_{j+1} = Z_{j+1} + \sum_{k=1}^{j} c_{j+1,k} F_k , \tag{3-15} $$
$$ F_{j+1} = \frac{G_{j+1}}{\|G_{j+1}\|} = \frac{G_{j+1}}{\left[\dfrac{1}{A}\displaystyle\int_{\text{pupil}} G_{j+1}^2\,dx\,dy\right]^{1/2}} , \tag{3-16} $$
where
$$ c_{j+1,k} = -\frac{1}{A}\int_{\text{pupil}} Z_{j+1} F_k\,dx\,dy \tag{3-17a} $$
$$ \equiv -\langle Z_{j+1} F_k\rangle . \tag{3-17b} $$
It is evident from Eq. (3-14) that $F_1 = 1$. Substituting Eq. (3-17b) into Eq. (3-15) and
substituting the result thus obtained into Eq. (3-16), we may write
$$ F_{j+1} = N_{j+1}\left[Z_{j+1} - \sum_{k=1}^{j} \langle Z_{j+1} F_k\rangle F_k\right] , \tag{3-18} $$
where $N_{j+1}$ is a normalization constant so that the polynomials are orthonormal over the
pupil under consideration, i.e., they satisfy the orthonormality condition of Eq. (3-1).
Thus, the F-polynomials are obtained recursively, starting with $F_1 = 1$. It is clear from Eq.
(3-18) that each F-polynomial of a certain order is a linear combination of the circle
polynomials of no more than that order. It should be evident that the F-polynomials are
ordered in the same manner as the basis polynomials and that there is a one-to-one
correspondence between them.
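The recursion of Eqs. (3-14) to (3-18) can be sketched numerically as follows. This is an illustration under assumptions: the pupil is taken to be an annulus with a hypothetical obscuration ratio of 0.5, the pupil mean is approximated by an average over grid points, and only the four lowest-order Zernike circle polynomials are used; none of the function names come from the text.

```python
import numpy as np

def annulus_points(eps, n=401):
    """Grid points with eps <= rho <= 1 (annular unit pupil, assumed sampling)."""
    x = np.linspace(-1.0, 1.0, n)
    X, Y = np.meshgrid(x, x)
    r2 = X**2 + Y**2
    keep = (r2 <= 1.0) & (r2 >= eps**2)
    return X[keep], Y[keep]

def zernike_low_order(x, y):
    """Z1..Z4: piston, x tilt, y tilt, defocus, orthonormal over the unit disc."""
    r2 = x**2 + y**2
    return np.stack([np.ones_like(x), 2 * x, 2 * y, np.sqrt(3.0) * (2 * r2 - 1)])

def inner(f, g):
    """Discrete stand-in for (1/A) * integral of f*g over the pupil."""
    return np.mean(f * g)

def gram_schmidt(Z):
    """Orthonormalize the rows of Z over the sampled pupil."""
    F = []
    for Zj in Z:
        G = Zj.copy()
        for Fk in F:
            G = G - inner(Zj, Fk) * Fk          # c = -<Z F_k>, Eqs. (3-15) and (3-17)
        F.append(G / np.sqrt(inner(G, G)))      # normalization, Eq. (3-16)
    return np.stack(F)

eps = 0.5                                       # hypothetical obscuration ratio
x, y = annulus_points(eps)
F = gram_schmidt(zernike_low_order(x, y))
gram = F @ F.T / x.size                         # discrete version of Eq. (3-1)
print(np.round(gram, 12))
```

The printed matrix of inner products should be the identity to within rounding, which is the discrete counterpart of the orthonormality condition of Eq. (3-1).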
Because of the biaxial symmetry of the pupils considered in this chapter and,
therefore, the symmetric limits of integration, the integral in Eq. (3-17a) is zero when the
integrand is an odd function of one or both integration variables. It should be evident that
a c-coefficient is zero unless the Z- and the G-polynomials have the same cosine or sine
dependence. If all of the c-coefficients in Eq. (3-15) are zero, then the F-polynomial has
the same form as the corresponding Zernike polynomial, except for its normalization.
The orthonormal F-polynomials represent the unit vectors of the space that span the
aberration function. They can be written in a matrix form according to
$$ F_l(x, y) = \sum_{i=1}^{l} M_{li} Z_i(x, y) \quad \text{with} \quad M_{ll} = \frac{1}{\|G_l\|} . \tag{3-19} $$
While the diagonal elements of the M-matrix are simply equal to the normalization
constants of the G-polynomials [since there is no multiplier with the polynomial $Z_{j+1}$ in
Eq. (3-15)], there are no matrix elements above the diagonal because a polynomial $F_l$
consists of a linear combination of circle polynomials up to $Z_l$ only. The matrix is lower
triangular and the missing elements may be given a value of zero when multiplying a
Zernike column vector $(\ldots, Z_j, \ldots)$ to obtain the orthonormal column vector $(\ldots, F_j, \ldots)$.
It should be evident that the orthonormal polynomials for a noncircular pupil written in
terms of the circle polynomials immediately yield the elements of the conversion matrix
M.
The conversion matrix M can be obtained independently and nonrecursively using a
matrix approach [2], which is not only faster but also avoids the potential numerical
instability of the Gram–Schmidt approach as the number of polynomials increases.
Multiplying both sides of Eq. (3-19) by $F_k$, integrating over the pupil, and using the
orthonormality Eq. (3-1), we obtain
$$ \langle F_k F_l\rangle = \delta_{kl} = \sum_{j=1}^{J} M_{lj}\langle Z_j F_k\rangle , \tag{3-20} $$
where, for example, $\langle Z_j F_k\rangle$ represents the inner product of the Zernike polynomial $Z_j$
and the orthonormal polynomial $F_k$ over the pupil, i.e.,
$$ \langle Z_j F_k\rangle = \frac{1}{A}\int_{\text{pupil}} Z_j(x, y)\,F_k(x, y)\,dx\,dy . \tag{3-21} $$
Equation (3-20) can be written in a matrix form as
$$ M C^{ZF} = 1 , \tag{3-22} $$
where $C^{ZF}$ is a $J \times J$ matrix of the inner products between the Zernike polynomials $Z_j$
and the orthonormal polynomials $F_k$. The elements of this matrix are given by
$$ \langle Z_k F_i\rangle = \sum_{j=1}^{J} M_{ij}\langle Z_j Z_k\rangle = \sum_{j=1}^{J} \langle Z_k Z_j\rangle \left[M^T\right]_{ji} , \tag{3-23} $$
where $\left[M^T\right]_{ji} = M_{ij}$ is an element of the transpose $M^T$ of the matrix M (obtained
by interchanging the rows and columns of the matrix M). Equation (3-23) can be written
in the matrix form as
$$ C^{ZF} = C^{ZZ} M^T , \tag{3-24} $$
where $C^{ZZ}$ is a $J \times J$ symmetric matrix of inner products of the first J Zernike circle
polynomials between themselves. Substituting Eq. (3-24) into Eq. (3-22), we obtain
$$ M C^{ZZ} M^T = 1 . \tag{3-25} $$
Letting
$$ M = \left(Q^T\right)^{-1} , \tag{3-26} $$
where $Q^T$ is the transpose of the matrix Q, Eq. (3-25) reduces to
$$ Q^T Q = C^{ZZ} . \tag{3-27} $$
Solving Eq. (3-27) for the matrix Q, the conversion matrix M can be obtained from Eq.
(3-26). While the matrix M is lower triangular, the matrix Q is upper triangular.
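A minimal sketch of the matrix route of Eqs. (3-25) to (3-27) is given below for a unit square pupil. Solving Eq. (3-27) with a Cholesky factorization is one standard way to obtain an upper triangular Q; its use here, the grid sampling of the pupil, the choice of the first six Zernike circle polynomials, and the function names are assumptions of this example.

```python
import numpy as np

def square_points(n=201):
    """Grid points of a square of half width 1/sqrt(2) (corners at unit distance)."""
    h = 1.0 / np.sqrt(2.0)
    s = np.linspace(-h, h, n)
    X, Y = np.meshgrid(s, s)
    return X.ravel(), Y.ravel()

def zernike_first_six(x, y):
    """Z1..Z6 in Cartesian form: piston, tilts, defocus, and two astigmatisms."""
    r2 = x**2 + y**2
    return np.stack([
        np.ones_like(x),
        2 * x,
        2 * y,
        np.sqrt(3.0) * (2 * r2 - 1),
        2 * np.sqrt(6.0) * x * y,
        np.sqrt(6.0) * (x**2 - y**2),
    ])

x, y = square_points()
Z = zernike_first_six(x, y)

C_zz = Z @ Z.T / x.size          # inner products <Z_j Z_k> over the sampled pupil
L = np.linalg.cholesky(C_zz)     # NumPy returns lower triangular L with C_zz = L @ L.T
Q = L.T                          # upper triangular, so Q.T @ Q = C_zz, Eq. (3-27)
M = np.linalg.inv(Q.T)           # conversion matrix, Eq. (3-26)
F = M @ Z                        # orthonormal polynomials, Eq. (3-19)

print(np.round(M, 4))
```

The resulting M is lower triangular, and the polynomials F obtained from it are orthonormal over the sampled square without any recursion.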
3.5 UNIT PUPIL

When considering the aberrations of a circular pupil of radius a, we normalize the
radial coordinate r by defining $\rho = r/a$. Thus, $0 \le r \le a$, but $0 \le \rho \le 1$. This
normalization has the advantage that the coefficient of a classical aberration $\rho^n\cos^m\theta$
(see Section 2.6) represents its peak value. This value occurs at the point where the x axis
intersects the circle. At this point, $\rho$ has its maximum value of unity and the value of $\theta$ is
zero giving a maximum value of unity for $\cos\theta$. For example, the coefficient $A_s$ of the
primary spherical aberration $A_s\rho^4$ represents the peak value of the aberration. Indeed,
when $A_s = 1\lambda$, we speak of one wave of spherical aberration. The same is true of primary
coma $A_c\rho^3\cos\theta$, where $A_c$ represents its peak value. Similarly, we define a unit pupil
such that the distance of the farthest point from its center is unity. Figure 3-1 shows the
noncircular pupils considered in this book. The outer radius of an annular pupil is unity,
as in Figure 3-1a. The corners of the hexagon in Figure 3-1b lie at a distance of unity.
Figure 3-1c illustrates an ellipse with an aspect ratio of b, and its semimajor axis has a
length of unity. For each of these pupils, the coefficient of a classical aberration
represents its peak value. Figure 3-1d shows a rectangle with a half width a and its
corners at a distance of unity from its center. Similarly, Figure 3-1e shows a square of
half width $1/\sqrt{2}$ so that its corners are also at a distance of unity from its center. In these
two cases, while $\rho$ has its maximum value of unity at a corner, the value of $\cos\theta$ at that
point is not unity. Hence, in these cases, the coefficient of a classical aberration does not
represent its peak value. In the case of a rectangle, the value of $\cos\theta$ depends on the
value of a, but in the case of a square its value is $1/\sqrt{2}$. For example, coma has a peak
value of $A_c/\sqrt{2}$ at a corner, where $\rho = 1$ and $\cos\theta = 1/\sqrt{2}$. Finally, a unit slit pupil with a half
width of unity is shown in Figure 3-1f. The value of a coefficient of a classical aberration
in this case does represent its peak value.
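The peak values quoted above can be checked with a short numerical sketch. The sampling grids and function names are assumptions made for this illustration; the aberration coefficients are set to unity so that the printed numbers are the peak values per unit coefficient.

```python
import numpy as np

def coma(rho, theta):
    """Classical primary coma with unit coefficient."""
    return rho**3 * np.cos(theta)

def spherical(rho, theta):
    """Classical primary spherical aberration with unit coefficient."""
    return rho**4

# Unit circular pupil sampled in polar coordinates
rho_d, theta_d = np.meshgrid(np.linspace(0.0, 1.0, 101),
                             np.linspace(0.0, 2.0 * np.pi, 361))

# Unit square pupil: half width 1/sqrt(2), corners at unit distance from the center
h = 1.0 / np.sqrt(2.0)
s = np.linspace(-h, h, 201)
XS, YS = np.meshgrid(s, s)
rho_s, theta_s = np.hypot(XS, YS), np.arctan2(YS, XS)

peak_disc_coma = np.abs(coma(rho_d, theta_d)).max()
peak_square_coma = np.abs(coma(rho_s, theta_s)).max()
peak_square_spherical = np.abs(spherical(rho_s, theta_s)).max()

print(peak_disc_coma, peak_square_coma, peak_square_spherical)
```

For the square, a purely radial term such as spherical aberration still peaks at the coefficient value at the corners, whereas a term with a $\cos\theta$ factor does not.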
3.6 SUMMARY
The content of an aberration function can be determined by expanding it in terms of a
complete set of polynomials that are orthogonal over its domain and have the form of
familiar aberrations, such as those discussed in Chapter 2. The Zernike circle
polynomials, for example, are not only orthogonal over a circular pupil, but they also
represent balanced classical aberrations, as discussed in Chapter 4. It is advantageous to
use the polynomials in their orthonormal form so that the piston coefficient represents the
mean value of the aberration function and the other expansion coefficients represent the
standard deviations of the corresponding polynomial aberration terms. As illustrated by
Eq. (3-5), the value of an expansion coefficient is independent of the number of
polynomials used in the expansion. Moreover, each coefficient yields a least-squares fit to
the aberration function. The variance of the aberration function is given by the sum of the
squares of the coefficients (other than the piston), as in Eq. (3-8).
Figure 3-1. Unit pupils inscribed inside a unit circle. (a) annulus of obscuration ratio
$\epsilon$, (b) hexagon with a side of unity, (c) ellipse of aspect ratio b, (d) rectangle of half
width a, (e) square of half width $1/\sqrt{2}$, and (f) slit of half width of unity.
Given a set of polynomials that are orthonormal over a certain domain, those that are
orthonormal over another domain can be obtained from them by the recursive Gram–
Schmidt orthonormalization process. They can also be obtained by a nonrecursive matrix
approach. Each new polynomial obtained is a linear combination of the basis
polynomials, as indicated by Eq. (3-18). We use the Zernike circle polynomials as the
basis functions to obtain the polynomials that are orthonormal over an annular, Gaussian,
hexagonal, elliptical, rectangular, or a square pupil. The slit pupil is a limiting case of a
rectangular pupil whose one dimension is negligibly small compared to the other. The
concept of a unit pupil is emphasized so that the farthest point or points on a pupil are at a
distance of unity from its center. It has the advantage that the coefficient of a single
aberration term represents its peak value. Thus, in each case the pupil is inscribed inside a
unit circle.
References
1. A. Korn and T. M. Korn, Mathematical Handbook for Scientists and Engineers
(McGraw-Hill, New York, 1968).
2. G.-m. Dai and V. N. Mahajan, “Nonrecursive orthonormal polynomials with
matrix formulation,” Opt. Lett. 32, 74–76 (2007).
CHAPTER 4
SYSTEMS WITH CIRCULAR PUPILS
4.1 Introduction
4.2 Pupil Function
4.3 Aberration-Free Imaging
4.3.1 PSF
4.3.2 OTF
4.4 Strehl Ratio and Aberration Tolerance
4.4.1 Strehl Ratio
4.4.2 Defocus Strehl Ratio
4.4.3 Approximate Expressions for Strehl Ratio
4.5 Balanced Aberrations
4.6 Description of Zernike Circle Polynomials
4.6.1 Analytical Form
4.6.2 Circle Polynomials in Polar Coordinates
4.6.3 Polynomial Ordering
4.6.4 Number of Circle Polynomials through a Certain Order n
4.6.5 Relationships among the Indices n, m, and j
4.6.6 Uniqueness of Circle Polynomials
4.6.7 Circle Polynomials in Cartesian Coordinates
4.7 Zernike Circle Coefficients of a Circular Aberration Function
4.8 Symmetry Properties of Images Aberrated by a Circle Polynomial Aberration
4.8.1 Symmetry of PSF
4.8.2 Symmetry of OTF
4.9 … Circle Polynomial Aberrations
4.9.1 Isometric Characteristics
4.9.2 Interferometric Characteristics
4.9.3 PSF Characteristics
4.9.4 OTF Characteristics
4.10 Circle Polynomials and Their Relationships with Classical Aberrations
4.10.1 Introduction
4.10.2 Wavefront Tilt and Defocus
4.10.3 Astigmatism
4.10.4 Coma
4.10.5 Spherical Aberration
4.10.6 Seidel Coefficients from Zernike Coefficients
4.10.7 Strehl Ratio for Seidel Aberrations with and without Balancing
4.11 Zernike Coefficients of a Scaled Pupil
4.11.1 Theory
4.11.2 Application to a Seidel Aberration Function
4.12 Summary
References
Chapter 4
Systems with Circular Pupils
4.1 INTRODUCTION
Optical systems generally have a circular pupil. The imaging elements of such
systems also have a circular boundary. Therefore, they are also represented by circular
pupils in fabrication and testing. As a result, the Zernike circle polynomials have been in
widespread use since Zernike introduced them in his phase contrast method for testing
circular mirrors [1]. They are used in optical design and testing to understand the
aberration content of a wavefront. They have also been used for analyzing the wavefront
aberrations introduced by atmospheric turbulence on a wave propagating through it [2].
We start this chapter with a brief discussion of the point-spread function (PSF) and
the optical transfer function (OTF) of an aberration-free system with a circular pupil. We
then consider the effect of primary aberrations on the Strehl ratio of an image. Since the
Strehl ratio for small aberrations depends on the variance of an aberration, we balance a
classical aberration of a certain order with those of lower orders to reduce its variance.
The utility of the Zernike circle polynomials stems from the fact that they are not only
orthogonal over a circular pupil, but they also uniquely represent the balanced classical
aberrations yielding minimum variance over the pupil [3–6]. Because of their
orthogonality, when a circular wavefront is expanded in terms of them, the value of a
Zernike expansion coefficient is independent of the number of polynomials used in the
expansion. Hence, one or more polynomial terms can be added or subtracted without
affecting the other coefficients. The piston coefficient represents the mean value of the
aberration function, and the variance of the function is given simply by the sum of the
squares of the other expansion coefficients.
Given the m-fold symmetry of a Zernike polynomial aberration, we discuss the
symmetry of its interferogram, the corresponding aberrated PSF, the real and imaginary
parts of the OTF, and the modulation transfer function (MTF). It is shown that the
interferogram, the real part of the OTF, and the corresponding MTF are 2m-fold whether
m is an even or an odd integer, but the PSF and the imaginary part of the OTF are m-fold
when m is odd. Numerical examples are given to illustrate the Zernike aberrations
isometrically, interferometrically, and by the corresponding PSFs, OTFs, and MTFs.
Relationships between the coefficients of a power series expansion of an aberration
function and the corresponding Zernike expansion coefficients are considered. In
particular, we discuss how to obtain the Seidel coefficients from the Zernike coefficients
of an aberration function. We illustrate by an example how wrong Seidel coefficients are
obtained when using only the corresponding Zernike polynomials. Finally, we show how
the Zernike coefficients of an aberration function over a circular pupil change as its
diameter is reduced.
4.2 PUPIL FUNCTION

Consider an imaging system with a circular exit pupil of radius $a$, diameter $D = 2a$, and area $S_{ex} = \pi a^2$ lying in the pupil plane $(x_p, y_p)$ with $z$ as its optical axis. The Cartesian and polar coordinates $(x_p, y_p)$ and $(r_p, \theta)$ of a pupil point Q, as illustrated in Figure 4-1, are related to each other according to

$$(x_p, y_p) = r_p(\cos\theta, \sin\theta), \quad 0 \le r_p \le a, \quad 0 \le \theta \le 2\pi. \tag{4-1}$$

Using a normalized radial variable $\rho = r_p/a$, we may write

$$(x_p, y_p) = a\rho(\cos\theta, \sin\theta), \quad 0 \le \rho \le 1. \tag{4-2}$$

We refer to the pupil in the $(\rho, \theta)$ coordinates as a unit circular pupil in the sense of a unit disc. For a uniformly illuminated pupil with an aberration function $\Phi(\vec{r}_p)$ and power $P_{ex}$ exiting from it, the pupil function of the system can be written

$$P(\vec{r}_p) = \begin{cases} A(\vec{r}_p)\exp[i\Phi(\vec{r}_p)], & |\vec{r}_p| \le a \\ 0, & \text{otherwise,} \end{cases} \tag{4-3}$$

where

$$A(\vec{r}_p) = (P_{ex}/S_{ex})^{1/2} \tag{4-4}$$

is the uniform amplitude across the circular pupil.
Figure 4-1. (a) Circular exit pupil of radius $a$ of an imaging system. (b) Circular pupil as a unit disc. The polar coordinates of a point Q are $(r_p, \theta)$ in (a) and $(\rho, \theta)$ in (b).
4.3 ABERRATION-FREE IMAGING

4.3.1 PSF
Using polar coordinates $(r_i, \theta_i)$ for an observation point in Eq. (2-9), the PSF representing the irradiance distribution in the image plane for a circular pupil can be written

$$I(r, \theta_i) = \frac{1}{\pi^2}\left|\int_0^1\!\!\int_0^{2\pi} \exp[i\Phi(\rho, \theta)]\exp[-\pi i r\rho\cos(\theta - \theta_i)]\,\rho\,d\rho\,d\theta\right|^2, \tag{4-5}$$

where $r = r_i/\lambda F$, $F = R/D$ is the focal ratio of the image-forming light cone, $\Phi(\rho, \theta)$ is the phase aberration at a point $(\rho, \theta)$ in the pupil plane, and the irradiance is normalized by the aberration-free central value $P_{ex}S_{ex}/\lambda^2R^2 = \pi P_{ex}/4\lambda^2F^2$.

For an aberration-free system, i.e., for a spherical wavefront exiting from the pupil so that $\Phi(\rho, \theta) = 0$, Eq. (4-5) reduces to

$$I(r, \theta_i) = \frac{1}{\pi^2}\left|\int_0^1\!\!\int_0^{2\pi} \exp[-\pi i r\rho\cos(\theta_p - \theta_i)]\,\rho\,d\rho\,d\theta_p\right|^2. \tag{4-6}$$

Noting that

$$\int_0^{2\pi}\exp(ix\cos\alpha)\,d\alpha = 2\pi J_0(x), \tag{4-7}$$

where $J_0(\cdot)$ is the zero-order Bessel function of the first kind, Eq. (4-6) reduces to

$$I(r) = 4\left[\int_0^1 J_0(\pi r\rho)\,\rho\,d\rho\right]^2. \tag{4-8}$$

Noting further that

$$\int_0^a xJ_0(bx)\,dx = \frac{a}{b}J_1(ab), \tag{4-9}$$

where $J_1(\cdot)$ is the first-order Bessel function of the first kind, Eq. (4-8) yields

$$I(r) = \left[\frac{2J_1(\pi r)}{\pi r}\right]^2. \tag{4-10}$$

Integrating over a circle of radius $r_c$ (in units of $\lambda F$), it can be shown that it contains a fractional power given by

$$P(r_c) = 1 - J_0^2(\pi r_c) - J_1^2(\pi r_c). \tag{4-11}$$

Figure 4-2 shows a plot of Eq. (4-10), called the Airy pattern. It consists of a bright spot at the center, called the Airy disc, surrounded by dark and bright diffraction rings. The fractional power is also plotted in Figure 4-2a. The radius of the Airy disc is 1.22 (in units of $\lambda F$), and the disc contains 83.8% of the total light, as may be seen by letting $r_c = 1.22$ in Eq. (4-11). The center of the pattern lies at the Gaussian image point.
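The Airy pattern of Eq. (4-10) and the encircled power of Eq. (4-11) are easy to check numerically. The following is a minimal sketch, not part of the original text: the helper names (`bessel_j`, `airy_irradiance`, `encircled_power`) are hypothetical, and the Bessel functions are evaluated from Bessel's integral with a midpoint rule so that only the Python standard library is assumed.

```python
import math


def bessel_j(order, x, steps=2000):
    """J_order(x) from Bessel's integral (1/pi) * int_0^pi cos(order*t - x*sin t) dt.

    Illustrative helper (midpoint rule); a library Bessel routine would also do.
    """
    h = math.pi / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h
        total += math.cos(order * t - x * math.sin(t))
    return total * h / math.pi


def airy_irradiance(r):
    """Eq. (4-10): I(r) = [2 J1(pi r) / (pi r)]^2, with r in units of lambda*F."""
    if r == 0.0:
        return 1.0  # limiting value, since J1(x) ~ x/2 for small x
    x = math.pi * r
    return (2.0 * bessel_j(1, x) / x) ** 2


def encircled_power(rc):
    """Eq. (4-11): P(rc) = 1 - J0^2(pi rc) - J1^2(pi rc)."""
    x = math.pi * rc
    return 1.0 - bessel_j(0, x) ** 2 - bessel_j(1, x) ** 2


if __name__ == "__main__":
    for r in (0.0, 0.5, 1.0, 1.22, 2.0):
        print(f"r = {r:4.2f}  I = {airy_irradiance(r):.4f}  P = {encircled_power(r):.4f}")
```

Evaluating `encircled_power(1.22)` reproduces the fraction of about 0.838 quoted above for the Airy disc, and `airy_irradiance(1.22)` is essentially zero, as expected at the first dark ring.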
Figure 4-2. (a) Irradiance $I(r)$ and encircled power $P(r_c)$ distributions for an aberration-free system with a circular pupil. (b) 2D PSF, called the Airy pattern.
4.3.2 OTF
From Eq. (2-11), the aberration-free OTF can be written

$$\tau(\vec{v}_i) = P_{ex}^{-1}\int A(\vec{r}_p)\,A(\vec{r}_p - \lambda R\vec{v}_i)\,d\vec{r}_p. \tag{4-12}$$

It is evident that the OTF represents the fractional area of overlap of two circles, each of radius $a$, separated by a distance $\lambda Rv_i$, where $v_i = |\vec{v}_i|$. From Figure 4-3, we note that the area of overlap is given by four times the difference between the area of a sector of radius $a$ and cone angle $\beta$, and the area of the triangle OAB. Hence, the OTF can be written

$$\tau(v_i) = \frac{4}{S_{ex}}\left(\frac{\beta}{2\pi}\pi a^2 - \frac{1}{2}\,OA\cdot AB\right). \tag{4-13}$$

Substituting $OA = a\cos\beta$, $AB = a\sin\beta$, and $\cos\beta = \lambda Rv_i/2a = \lambda Fv_i = v$ into Eq. (4-13), we obtain

$$\tau(v_i) = \frac{2}{\pi}(\beta - \sin\beta\cos\beta) \tag{4-14}$$

$$= \frac{2}{\pi}\left[\cos^{-1}v - v(1 - v^2)^{1/2}\right], \quad 0 \le v \le 1. \tag{4-15}$$

Here, $v = \cos\beta$ is a spatial frequency normalized by the cutoff frequency $v_c = 1/\lambda F$ at which the overlap area reduces to zero. The OTF is radially symmetric because the overlap area depends only on the separation $\lambda Rv_i$ of the two pupils and is independent of the direction of $\vec{v}_i$.
Figure 4-3. Aberration-free OTF as the fractional area of overlap of two circles of radius $a$ whose centers are separated by a distance $\lambda Rv_i$.
Figure 4-4 shows how the OTF varies with $v$. The integral of the aberration-free OTF that enters into the calculation of the Strehl ratio from the real part of the complex aberrated OTF [see Eq. (2-25)] is given by

$$\int_0^1\tau(v)\,v\,dv = 1/8. \tag{4-16}$$

The slope of the OTF at the origin is given by

$$\tau'(0) = -4/\pi. \tag{4-17}$$

Although obtained from the aberration-free OTF, this slope is independent of any aberration.
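Equations (4-15) through (4-17) can be confirmed with a few lines of numerical work. The sketch below is illustrative only: the function names `otf` and `integrate_midpoint` are assumptions made here, and a simple midpoint rule and a forward difference stand in for the analytical integration and differentiation.

```python
import math


def otf(v):
    """Eq. (4-15): aberration-free OTF at the normalized frequency v = lambda*F*v_i."""
    v = abs(v)
    if v >= 1.0:
        return 0.0  # no pupil overlap at or beyond the cutoff frequency
    return (2.0 / math.pi) * (math.acos(v) - v * math.sqrt(1.0 - v * v))


def integrate_midpoint(f, a, b, steps=20000):
    """Midpoint-rule estimate of the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (k + 0.5) * h) for k in range(steps))


# Left-hand side of Eq. (4-16)
otf_integral = integrate_midpoint(lambda v: otf(v) * v, 0.0, 1.0)

# Forward-difference estimate of the slope in Eq. (4-17)
slope_at_origin = (otf(1e-6) - otf(0.0)) / 1e-6

if __name__ == "__main__":
    print(f"integral of tau(v) v dv = {otf_integral:.6f} (expected 1/8)")
    print(f"slope at the origin     = {slope_at_origin:.6f} (expected -4/pi)")
```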
4.4 STREHL RATIO AND ABERRATION TOLERANCE
4.4.1 Strehl Ratio

Letting $r = 0$ in Eq. (4-5) for the irradiance distribution normalized by its aberration-free central value, we obtain the Strehl ratio of an aberrated image:

$$S = \frac{1}{\pi^2}\left|\int_0^1\!\!\int_0^{2\pi}\exp[i\Phi(\rho, \theta)]\,\rho\,d\rho\,d\theta\right|^2. \tag{4-18}$$
Figure 4-4. Aberration-free OTF $\tau$ as a function of normalized spatial frequency $v$.

4.4.2 Defocus Strehl Ratio

Consider an observation being made in an image plane passing through a point $P_1$ at a distance $z$ from the exit pupil of a system, while a beam with a spherical wavefront $W$ is focused at a point $P_2$ at a distance $R$, as illustrated in Figure 1-6. The spherical wavefront is aberrated with respect to the reference sphere $S$ of radius of curvature $z$ due to the longitudinal defocus $z - R$. The defocus aberration may be written

$$\Phi(\rho) = B_d\rho^2, \tag{4-19}$$
where the peak value $B_d$ of the phase aberration is related to the longitudinal defocus according to

$$B_d \simeq -\frac{\pi}{4\lambda F^2}(z - R). \tag{4-20}$$

A positive value of the defocus aberration is introduced when an observation is made at a distance $z < R$, as in Figure 1-6. Substituting Eq. (4-19) into Eq. (4-18), we obtain the
Strehl ratio of the defocused image:

$$S = \left[\frac{\sin(B_d/2)}{B_d/2}\right]^2. \tag{4-21}$$

The Strehl ratio decreases as the aberration increases until it reaches a value of zero when the aberration becomes $2\pi$ radians or one wave. As shown in Figure 4-5, it fluctuates for increasing value of defocus, becoming zero when the aberration is an integral number of waves. It should be evident that the defocused Strehl ratio represents the axial irradiance of a focused beam.
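As a check on Eq. (4-21), the closed form can be compared with a direct numerical evaluation of Eq. (4-18) for $\Phi = B_d\rho^2$. This is a sketch under stated assumptions: the function names are hypothetical, $B_d$ is taken in radians (one wave is $2\pi$), and the $\theta$ integral is replaced by its value $2\pi$ because the aberration is radially symmetric.

```python
import cmath
import math


def strehl_defocus_closed_form(bd):
    """Eq. (4-21); bd is the peak defocus phase aberration in radians."""
    if bd == 0.0:
        return 1.0
    half = bd / 2.0
    return (math.sin(half) / half) ** 2


def strehl_defocus_numeric(bd, steps=4000):
    """Eq. (4-18) with Phi = bd * rho^2, integrated over rho by the midpoint rule."""
    h = 1.0 / steps
    total = 0j
    for k in range(steps):
        rho = (k + 0.5) * h
        total += cmath.exp(1j * bd * rho * rho) * rho
    amplitude = 2.0 * total * h  # (1/pi) * (2*pi) * radial integral
    return abs(amplitude) ** 2


if __name__ == "__main__":
    for waves in (0.25, 0.5, 1.0, 1.5, 2.0):
        bd = 2.0 * math.pi * waves
        print(f"{waves:4.2f} waves: closed form {strehl_defocus_closed_form(bd):.4f}, "
              f"numeric {strehl_defocus_numeric(bd):.4f}")
```

For a quarter wave of defocus both routes give $8/\pi^2 \approx 0.811$, and for one wave the Strehl ratio vanishes.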
Figure 4-5. Strehl ratio $S$ of a defocused beam, representing its axial irradiance, where $B_d$ is the defocus aberration in units of wavelength.
4.4.3 Approximate Expressions for Strehl Ratio

The approximate expressions for the Strehl ratio when the aberration is small are given by Eqs. (2-31)–(2-33), i.e.,

$$S_1 \simeq \left(1 - \sigma_\Phi^2/2\right)^2, \tag{4-22a}$$

$$S_2 \simeq 1 - \sigma_\Phi^2, \tag{4-22b}$$

and

$$S_3 \simeq \exp\left(-\sigma_\Phi^2\right), \tag{4-22c}$$

where

$$\sigma_\Phi^2 = \langle\Phi^2\rangle - \langle\Phi\rangle^2 \tag{4-23}$$

is the variance of the phase aberration across the pupil. The mean and the mean square values of the aberration are obtained from the expression

$$\langle\Phi^n\rangle = \pi^{-1}\int_0^1\!\!\int_0^{2\pi}\Phi^n(\rho, \theta)\,\rho\,d\rho\,d\theta \tag{4-24}$$

with $n = 1$ and 2, respectively.
Table 4-1 gives the form as well as the standard deviation $\sigma_\Phi$ of a primary (or a Seidel) aberration, where an aberration coefficient $A_i$ represents the peak value of the aberration. It also lists the aberration tolerance, i.e., the value of the aberration coefficient $A_i$, for a Strehl ratio of 0.8. This tolerance has been obtained by using the Strehl ratio expression $S_2$, according to which the standard deviation for a Strehl ratio of 0.8 is given by

$$\sigma_\Phi = \sqrt{0.2} \tag{4-25}$$

or

$$\sigma_w = (\lambda/2\pi)\sqrt{0.2} = 0.07\lambda = \lambda/14.05, \tag{4-26}$$

where $\sigma_w$ is the sigma value of the wave aberration. The aberration tolerance listed in Table 4-1 is for the wave (as opposed to the phase) aberration coefficient, as is customary in optics. It should be understood that the tolerance numbers given are not accurate to the second decimal place. They are listed as such for consistency only. We have used the
symbol $A_d$ for the coefficient of field curvature aberration, which varies quadratically with the angle that a point object makes with the optical axis of the system. However, to avoid confusion, we have used the symbol $B_d$ for representing the defocus wave aberration, which is independent of the field angle but has the same dependence on pupil coordinates as field curvature. Similarly, we have used the symbol $A_t$ for distortion, which varies as the cube of the field angle. But we will use the symbol $B_t$ to represent the wavefront tilt, which is independent of the field angle but has the same dependence on pupil coordinates as distortion.

Table 4-1. Standard deviation and aberration tolerance for primary aberrations.

| Aberration | $\Phi(\rho, \theta)$ | $\sigma_\Phi$ | $A_i$ for $S = 0.8$ |
|---|---|---|---|
| Spherical | $A_s\rho^4$ | $2A_s/3\sqrt{5} = A_s/3.35$ | $\lambda/4.19$ |
| Coma | $A_c\rho^3\cos\theta$ | $A_c/2\sqrt{2} = A_c/2.83$ | $\lambda/4.96$ |
| Astigmatism | $A_a\rho^2\cos^2\theta$ | $A_a/4$ | $\lambda/3.51$ |
| Field curvature (defocus) | $A_d\rho^2$ | $A_d/2\sqrt{3} = A_d/3.46$ | $\lambda/4.06$ |
| Distortion (tilt) | $A_t\rho\cos\theta$ | $A_t/2$ | $\lambda/7.03$ |
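The tolerances in Table 4-1 follow directly from Eqs. (4-22b) and (4-26) once the ratio $\sigma_\Phi/A_i$ of each aberration is known. The sketch below illustrates that arithmetic; the dictionary and function names are hypothetical, and the ratios are simply the $\sigma_\Phi$ entries of Table 4-1 for a unit coefficient.

```python
import math


def strehl_approximations(sigma_phi):
    """Eqs. (4-22a)-(4-22c) for a phase standard deviation sigma_phi in radians."""
    s2 = sigma_phi ** 2
    return {"S1": (1.0 - s2 / 2.0) ** 2, "S2": 1.0 - s2, "S3": math.exp(-s2)}


# sigma_Phi / A_i for each primary aberration, as listed in Table 4-1
SIGMA_PER_UNIT_COEFF = {
    "spherical": 2.0 / (3.0 * math.sqrt(5.0)),
    "coma": 1.0 / (2.0 * math.sqrt(2.0)),
    "astigmatism": 1.0 / 4.0,
    "defocus": 1.0 / (2.0 * math.sqrt(3.0)),
    "tilt": 1.0 / 2.0,
}


def tolerance_in_waves(sigma_per_unit_coeff, strehl=0.8):
    """Coefficient A_i, in waves, that gives the requested Strehl ratio under S2."""
    sigma_phi = math.sqrt(1.0 - strehl)     # Eq. (4-25) when strehl = 0.8
    sigma_w = sigma_phi / (2.0 * math.pi)   # Eq. (4-26), in units of the wavelength
    return sigma_w / sigma_per_unit_coeff


if __name__ == "__main__":
    print(strehl_approximations(math.sqrt(0.2)))
    for name, ratio in SIGMA_PER_UNIT_COEFF.items():
        print(f"{name:12s} A_i = lambda / {1.0 / tolerance_in_waves(ratio):.2f}")
```

The printed denominators agree with the last column of Table 4-1 to within the rounding noted in the text.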
4.5 BALANCED ABERRATIONS

The variance of a primary aberration can be reduced by observing the image in a defocused image plane, i.e., by mixing it with defocus aberration. Thus, for example, we balance primary spherical aberration with defocus aberration and write it as

$$\Phi(\rho) = A_s\rho^4 + B_d\rho^2. \tag{4-27}$$

The defocus aberration is introduced by making an observation in a plane at a distance $z$, as discussed in Section 4.4.2. The mean and the mean square value of the aberration function are given by

$$\langle\Phi\rangle = \frac{1}{\pi}\int_0^1\!\!\int_0^{2\pi}\left(A_s\rho^4 + B_d\rho^2\right)\rho\,d\rho\,d\theta = \frac{A_s}{3} + \frac{B_d}{2} \tag{4-28}$$

and

$$\langle\Phi^2\rangle = \frac{A_s^2}{5} + \frac{B_d^2}{3} + \frac{A_sB_d}{2}. \tag{4-29}$$

Accordingly, the aberration variance is given by

$$\sigma_\Phi^2 = \langle\Phi^2\rangle - \langle\Phi\rangle^2 = \frac{4A_s^2}{45} + \frac{B_d^2}{12} + \frac{A_sB_d}{6}. \tag{4-30}$$

The value of defocus $B_d$ yielding minimum variance is obtained by letting

$$\frac{\partial\sigma_\Phi^2}{\partial B_d} = 0, \tag{4-31}$$

and checking that it yields a minimum and not a maximum. Thus, we find that the optimum value is $B_d = -A_s$, and the balanced aberration is given by

$$\Phi_{bs}(\rho) = A_s\left(\rho^4 - \rho^2\right). \tag{4-32}$$

Its standard deviation or sigma value is $A_s/6\sqrt{5}$, which is a factor of 4 smaller than the corresponding value $2A_s/3\sqrt{5}$ for $B_d = 0$. Since the sigma value has been reduced by a factor of 4, its tolerance has been increased by the same factor. For example, $S = 0.8$ is obtained in the Gaussian image plane for $A_s = \lambda/4$. However, the same Strehl ratio is obtained for $A_s = 1\lambda$ in a slightly defocused image plane such that $B_d = -\lambda$.
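The balancing calculation of Eqs. (4-28) through (4-32) can be reproduced exactly with rational arithmetic, using the fact that the mean of $\rho^k$ over the unit disc is $2/(k + 2)$. The following is a minimal sketch with hypothetical function names; it recovers the optimum defocus by fitting the quadratic dependence of the variance on $B_d$ from three samples.

```python
from fractions import Fraction


def disc_mean_of_power(k):
    """Mean of rho^k over the unit disc: (1/pi) * int int rho^k rho drho dtheta = 2/(k+2)."""
    return Fraction(2, k + 2)


def variance(a_s, b_d):
    """Variance of Phi = a_s*rho^4 + b_d*rho^2 over the unit disc, Eqs. (4-28)-(4-30)."""
    mean = a_s * disc_mean_of_power(4) + b_d * disc_mean_of_power(2)
    mean_sq = (a_s ** 2 * disc_mean_of_power(8)
               + 2 * a_s * b_d * disc_mean_of_power(6)
               + b_d ** 2 * disc_mean_of_power(4))
    return mean_sq - mean ** 2


a_s = Fraction(1)
# The variance is quadratic in b_d: var = c0 + c1*b_d + c2*b_d^2, so three samples fix it.
v_zero = variance(a_s, Fraction(0))
v_plus = variance(a_s, Fraction(1))
v_minus = variance(a_s, Fraction(-1))
c2 = (v_plus + v_minus) / 2 - v_zero
c1 = (v_plus - v_minus) / 2
best_b_d = -c1 / (2 * c2)  # stationary point of Eq. (4-31); a minimum because c2 > 0

if __name__ == "__main__":
    print("variance without balancing:", v_zero)
    print("optimum B_d / A_s:", best_b_d)
    print("balanced variance:", variance(a_s, best_b_d))
```

With $A_s = 1$ the unbalanced variance is 4/45, the optimum defocus is $-1$, and the balanced variance is 1/180, i.e., the sigma value drops by a factor of 4.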
Similarly, we balance astigmatism with defocus and coma with tilt. Table 4-2 lists the form of a balanced primary aberration, its standard deviation, and its tolerance for a Strehl ratio of 0.8, according to Eq. (4-22b). Also listed in the table is the location of the diffraction focus, i.e., the point with respect to which the aberration variance is minimum so that the Strehl ratio is maximum at it. The amount of balancing defocus is minus half the amount of astigmatism, or the diffraction focus lies at a distance $4F^2A_a$ along the $z$ axis. The balancing tilt is minus two-thirds the amount of the coma. Thus, the maximum Strehl ratio is obtained at a point that is displaced from the Gaussian image point by $4FA_c/3$ but lies in the Gaussian image plane.

Table 4-2. Balanced primary aberrations and corresponding diffraction focus, standard deviation, and aberration tolerance.

| Balanced aberration | $\Phi(\rho, \theta)$ | Diffraction focus* | $\sigma_\Phi$ | $A_i$ for $S = 0.8$ |
|---|---|---|---|---|
| Spherical | $A_s\left(\rho^4 - \rho^2\right)$ | $\left(0, 0, 8F^2A_s\right)$ | $A_s/6\sqrt{5}$ | $0.955\lambda$ |
| Coma | $A_c\left(\rho^3 - 2\rho/3\right)\cos\theta$ | $\left(4FA_c/3, 0, 0\right)$ | $A_c/6\sqrt{2}$ | $0.604\lambda$ |
| Astigmatism | $A_a\rho^2\left(\cos^2\theta - 1/2\right) = (A_a/2)\rho^2\cos 2\theta$ | $\left(0, 0, 4F^2A_a\right)$ | $A_a/2\sqrt{6}$ | $0.349\lambda$ |

*The diffraction focus coordinates are relative to the Gaussian image point.
For primary aberrations, $S_1$ and $S_2$ underestimate the true Strehl ratio $S$. $S_3$ gives a better approximation for the true Strehl ratio than $S_1$ and $S_2$. The reason is that, for small values of $\sigma_w$, it is larger than $S_1$ by approximately $\sigma_\Phi^4/4$. Of course, $S_1$ is larger than $S_2$ by $\sigma_\Phi^4/4$. The expression $S_3$ underestimates the true Strehl ratio only for coma and astigmatism; it overestimates for the other aberrations. Numerical analysis shows that the error, defined as $100(1 - S_3/S)$, is less than 10% for $S > 0.3$ [5,7].
Rayleigh [8] showed that a quarter-wave of primary spherical aberration reduces the irradiance at the Gaussian image point by 20%, i.e., the Strehl ratio for this aberration is 0.8. This result has brought forth Rayleigh's $\lambda/4$ rule; namely, that a Strehl ratio of approximately 0.8 is obtained if the maximum absolute value of the aberration at any point in the pupil is equal to $\lambda/4$. A variant of this definition is that an aberrated wavefront that lies between two concentric spheres spaced a quarter-wave apart will give a Strehl ratio of approximately 0.8. Thus, instead of $W_p = \lambda/4$, we require $W_{pv} = \lambda/4$, where $W_p$ is the peak absolute value and $W_{pv}$ is the peak-to-valley (P-V) value of the aberration. However, a Strehl ratio of 0.8 is obtained for $W_p = \lambda/4 = W_{pv}$ for spherical aberration only. For other primary aberrations, distinctly different values of $W_p$ and $W_{pv}$ give a Strehl ratio of 0.8 [5,9]. Thus, it is advantageous to use $\sigma_w$ for estimating the Strehl ratio. A Strehl ratio of $S \gtrsim 0.8$ is obtained for $\sigma_w \lesssim \lambda/14$.
When a certain aberration is balanced with other aberrations to minimize its variance, the balanced aberration does not necessarily yield a higher or the highest possible Strehl ratio. For small aberrations, a maximum Strehl ratio is obtained when the variance is minimum. For large aberrations, however, there is no simple relationship between the Strehl ratio and the aberration variance. For example [9], when $A_s = 3\lambda$, the optimum amount of defocus is $B_d = -3\lambda$, but the Strehl ratio is a minimum and equal to 0.12. The Strehl ratio is maximum and equal to 0.26 for $B_d \simeq -4\lambda$ or $-2\lambda$. For $A_s \lesssim 2.3\lambda$, the axial irradiance is maximum at a point with respect to which the aberration variance is minimum. Similarly, in the case of coma, the maximum irradiance in the image plane occurs at the point with respect to which the aberration variance is minimum only if $A_c \lesssim 0.7\lambda$, which in turn corresponds to $S \gtrsim 0.76$. For larger values of $A_c$, the distance of the point of maximum irradiance does not increase linearly with its value and even fluctuates in some regions [10]. Moreover, it is found that for $A_c > 2.3\lambda$, the Seidel coma gives a larger Strehl ratio than the balanced coma, i.e., the irradiance in the image plane at the origin is larger than at the point with respect to which the aberration variance is minimum. Thus, only for large Strehl ratios is the irradiance maximum at the point associated with the minimum aberration variance.
The defocused PSFs are shown in Figure 4-6 to illustrate the zero Strehl ratio for an integral number of waves of defocus aberration. As an illustration of the improvement in the Strehl ratio by aberration balancing, Table 4-3 lists the Strehl ratio of a primary aberration with and without balancing for a quarter wave of aberration. The Strehl ratio for a quarter wave of defocus is 0.811. As shown in Figure 4-7, the Strehl ratio for a quarter wave of spherical aberration improves from a value of 0.800 to 0.986 when it is balanced with an equal and opposite amount of defocus aberration. In the case of coma, a Strehl ratio of 0.737 is obtained, but a peak of value 0.966 lies to the right of the origin, as shown in Figure 4-8. When coma is balanced with a wavefront tilt equal to 2/3 the amount of coma, the peak moves to the origin and the Strehl ratio increases from 0.737 to 0.966. In the case of astigmatism, as shown in Figure 4-9, the Strehl ratio increases from a value of 0.857 to 0.902 when it is balanced with defocus.
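The Strehl ratios quoted above for a quarter wave of each primary aberration can be reproduced by evaluating Eq. (4-18) numerically. The sketch below is an illustration rather than the method used to generate the published table: the function and dictionary names are hypothetical, and a plain midpoint rule on a polar grid of assumed size 150 by 150 is used.

```python
import cmath
import math


def strehl_ratio(phase, n_rho=150, n_theta=150):
    """Eq. (4-18) by the midpoint rule; phase(rho, theta) returns radians."""
    d_rho = 1.0 / n_rho
    d_theta = 2.0 * math.pi / n_theta
    total = 0j
    for i in range(n_rho):
        rho = (i + 0.5) * d_rho
        ring = 0j
        for k in range(n_theta):
            theta = (k + 0.5) * d_theta
            ring += cmath.exp(1j * phase(rho, theta))
        total += ring * rho
    return abs(total * d_rho * d_theta / math.pi) ** 2


QUARTER_WAVE = 2.0 * math.pi / 4.0  # peak phase in radians for a coefficient of lambda/4

# Pupil dependence of each aberration for a unit coefficient
ABERRATIONS = {
    "defocus": lambda rho, th: rho ** 2,
    "astigmatism": lambda rho, th: rho ** 2 * math.cos(th) ** 2,
    "balanced astigmatism": lambda rho, th: rho ** 2 * (math.cos(th) ** 2 - 0.5),
    "coma": lambda rho, th: rho ** 3 * math.cos(th),
    "balanced coma": lambda rho, th: (rho ** 3 - 2.0 * rho / 3.0) * math.cos(th),
    "spherical": lambda rho, th: rho ** 4,
    "balanced spherical": lambda rho, th: rho ** 4 - rho ** 2,
}


def strehl_table():
    """Strehl ratio of each aberration in ABERRATIONS for a quarter-wave coefficient."""
    return {name: strehl_ratio(lambda r, t, f=shape: QUARTER_WAVE * f(r, t))
            for name, shape in ABERRATIONS.items()}


if __name__ == "__main__":
    for name, s in strehl_table().items():
        print(f"{name:22s} S = {s:.3f}")
```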
The variance of the secondary spherical aberration ($\rho^6$), secondary coma ($\rho^5\cos\theta$), and secondary astigmatism ($\rho^4\cos^2\theta$) can be reduced similarly by mixing them with appropriate aberrations of lower order. The secondary spherical aberration is balanced with primary spherical aberration and defocus to minimize its variance. The balanced secondary spherical aberration thus obtained is given by

$$\Phi_{bss}(\rho, \theta) = \rho^6 - 1.5\rho^4 + 0.6\rho^2. \tag{4-33}$$

Similarly, secondary coma is balanced with primary coma and wavefront tilt to minimize its variance, and the balanced aberration thus obtained is given by

$$\Phi_{bsc}(\rho, \theta) = \left(\rho^5 - 1.2\rho^3 + 0.3\rho\right)\cos\theta. \tag{4-34}$$
Figure 4-6. PSFs $I(r)$ for a quarter-wave and one wave of defocus as a function of $r$ in units of $\lambda F$. For clarity, the curve for $B_d = 1$ has been multiplied by ten. The aberration-free PSF, representing the Airy pattern with its first zero at 1.22, is shown by the solid curve.
Table 4-3. Strehl ratio $S$ for a quarter-wave of a primary aberration with and without balancing for a circular pupil, i.e., for $B_d = A_a = A_c = A_s = \lambda/4$ and $0 \le \rho \le 1$.

| Aberration | $S$ |
|---|---|
| Aberration free | 1 |
| Defocus, $B_d\rho^2$ | 0.811 |
| Astigmatism, $A_a\rho^2\cos^2\theta$ | 0.857 |
| Balanced astigmatism, $A_a\rho^2\left(\cos^2\theta - 1/2\right)$ | 0.902 |
| Coma, $A_c\rho^3\cos\theta$ | 0.737 |
| Balanced coma, $A_c\left(\rho^3 - 2\rho/3\right)\cos\theta$ | 0.966 |
| Spherical aberration, $A_s\rho^4$ | 0.800 |
| Balanced spherical aberration, $A_s\left(\rho^4 - \rho^2\right)$ | 0.986 |
Figure 4-7. PSFs $I(r)$ for a quarter-wave of spherical aberration with and without balancing with equal and opposite amount of defocus. The aberration-free PSF, representing the Airy pattern with its first zero at 1.22, is shown by the solid curve.
Figure 4-8. PSFs $I(x, 0)$ for a quarter-wave of coma along the $x$ axis (in units of $\lambda F$) with and without the balancing tilt. The aberration-free PSF is shown by the solid curve.
Finally, secondary astigmatism is balanced with primary spherical aberration, primary astigmatism, and defocus to minimize its variance, and the balanced aberration thus obtained is given by

$$\Phi_{bsa}(\rho, \theta) = \rho^4\cos^2\theta - \frac{1}{2}\rho^4 - \frac{3}{4}\rho^2\cos^2\theta + \frac{3}{8}\rho^2 = \frac{1}{2}\left(\rho^4 - \frac{3}{4}\rho^2\right)\cos 2\theta. \tag{4-35}$$
Figure 4-9. PSFs $I(x, 0)$ for a quarter-wave of astigmatism along the $x$ axis (in units of $\lambda F$) with and without the balancing defocus. The aberration-free PSF is shown by the solid curve.
When secondary spherical aberration or secondary coma is balanced with lower-order aberrations to minimize its variance, it is found [11] that a maximum Strehl ratio is obtained only if its value comes out to be greater than about 0.5. Otherwise, a mixture of aberrations yielding a larger-than-minimum possible variance gives a higher Strehl ratio than the one provided by a minimum-variance mixture.
4.6 DESCRIPTION OF ZERNIKE CIRCLE POLYNOMIALS

4.6.1 Analytical Form
In his phase contrast method for testing the figure of circular mirrors, which he
proposed as an improvement over the Foucault knife-edge test, Zernike introduced his
circle polynomials as eigenfunctions of a second-order differential equation in two
variables [1]. These polynomials, which form a complete orthogonal set for the interior of
a unit circle, are the well-known circle polynomials. Nijboer used these polynomials to
study the balancing of classical aberrations of a power-series expansion of the aberration
function and the effect of small aberrations on the diffraction images formed by
rotationally symmetric imaging systems with circular pupils [2].
The orthonormal form of the circle polynomials may be written

$$Z_n^m(\rho, \theta) = \left[2(n + 1)/(1 + \delta_{m0})\right]^{1/2}R_n^m(\rho)\cos m\theta, \quad 0 \le \rho \le 1, \quad 0 \le \theta \le 2\pi, \tag{4-36}$$

where $n$ and $m$ are positive integers including zero, $n - m \ge 0$ and even, and $R_n^m(\rho)$ is a radial polynomial given by

$$R_n^m(\rho) = \sum_{s=0}^{(n-m)/2}\frac{(-1)^s(n - s)!}{s!\left(\frac{n+m}{2} - s\right)!\left(\frac{n-m}{2} - s\right)!}\,\rho^{n-2s} \tag{4-37}$$

with a degree $n$ in $\rho$ containing terms in $\rho^n$, $\rho^{n-2}$, $\ldots$, and $\rho^m$. It is clear from Eq. (4-36) that the circle polynomials are separable in the polar coordinates $\rho$ and $\theta$ of a pupil point.
A radial polynomial $R_n^m(\rho)$ is even or odd in $\rho$ depending on whether $n$ (or $m$) is even or odd. It is normalized such that

$$R_n^m(1) = 1. \tag{4-38}$$

We find from Eq. (4-37) that

$$R_n^n(\rho) = \rho^n, \tag{4-39}$$

and

$$R_n^m(0) = \begin{cases} \delta_{m0} & \text{for even } n/2 \\ -\delta_{m0} & \text{for odd } n/2. \end{cases} \tag{4-40}$$

For $m = 0$, a radial polynomial has the same form as a corresponding Legendre polynomial $P_{n/2}(\cdot)$ according to

$$R_n^0(\rho) = P_{n/2}\left(2\rho^2 - 1\right). \tag{4-41}$$
The orthogonality of the trigonometric functions yields

$$\int_0^{2\pi}\cos m\theta\cos m'\theta\,d\theta = \pi(1 + \delta_{m0})\,\delta_{mm'}. \tag{4-42}$$

The polynomials $R_n^m(\rho)$ obey the orthogonality relation

$$\int_0^1R_n^m(\rho)R_{n'}^m(\rho)\,\rho\,d\rho = \frac{1}{2(n + 1)}\,\delta_{nn'}. \tag{4-43}$$

In Eq. (4-43), the $m$ value is the same for both radial polynomials because of the orthogonality Eq. (4-42) of the trigonometric functions. Accordingly, the polynomials $Z_n^m(\rho, \theta)$ are orthonormal according to

$$\frac{1}{\pi}\int_0^1\!\!\int_0^{2\pi}Z_n^m(\rho, \theta)Z_{n'}^{m'}(\rho, \theta)\,\rho\,d\rho\,d\theta = \delta_{nn'}\delta_{mm'}. \tag{4-44}$$
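The radial polynomials of Eq. (4-37) have integer coefficients, so the normalization of Eq. (4-38) and the orthogonality of Eq. (4-43) can be verified exactly. The sketch below uses hypothetical function names and relies only on the elementary integral $\int_0^1\rho^k\rho\,d\rho = 1/(k + 2)$.

```python
from fractions import Fraction
from math import factorial


def radial_coefficients(n, m):
    """Coefficients of R_n^m from Eq. (4-37) as {power of rho: integer coefficient}."""
    if m < 0 or m > n or (n - m) % 2:
        raise ValueError("need 0 <= m <= n with n - m even")
    coeffs = {}
    for s in range((n - m) // 2 + 1):
        numerator = (-1) ** s * factorial(n - s)
        denominator = (factorial(s)
                       * factorial((n + m) // 2 - s)
                       * factorial((n - m) // 2 - s))
        coeffs[n - 2 * s] = numerator // denominator  # exact: a multinomial coefficient
    return coeffs


def radial_value(n, m, rho):
    """Evaluate R_n^m(rho) from its coefficients."""
    return sum(c * rho ** p for p, c in radial_coefficients(n, m).items())


def radial_inner_product(n1, n2, m):
    """Exact value of int_0^1 R_{n1}^m R_{n2}^m rho drho, the left side of Eq. (4-43)."""
    c1 = radial_coefficients(n1, m)
    c2 = radial_coefficients(n2, m)
    return sum(Fraction(a * b, p + q + 2) for p, a in c1.items() for q, b in c2.items())


if __name__ == "__main__":
    print(radial_coefficients(4, 0))         # 6 rho^4 - 6 rho^2 + 1
    print(radial_inner_product(4, 4, 0))     # 1 / [2 (n + 1)] with n = 4
    print(radial_inner_product(4, 2, 0))     # orthogonal pair
```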
Since the aberrations introduced by fabrication errors or atmospheric turbulence are random in nature, we need both the cosine and the sine Zernike circle polynomials to express them. It is convenient in such cases to write their form and numbering as [5]:

$$Z_{\text{even } j}(\rho, \theta) = \sqrt{2(n + 1)}\,R_n^m(\rho)\cos m\theta, \quad m \ne 0, \tag{4-45a}$$

$$Z_{\text{odd } j}(\rho, \theta) = \sqrt{2(n + 1)}\,R_n^m(\rho)\sin m\theta, \quad m \ne 0, \tag{4-45b}$$

$$Z_j(\rho, \theta) = \sqrt{n + 1}\,R_n^0(\rho), \quad m = 0. \tag{4-45c}$$

An even number is associated with a cosine polynomial and an odd number with a sine polynomial. The orthogonality of the trigonometric functions yields

$$\int_0^{2\pi}d\theta\begin{cases} \cos m\theta\cos m'\theta, & j \text{ and } j' \text{ are both even} \\ \cos m\theta\sin m'\theta, & j \text{ is even and } j' \text{ is odd} \\ \sin m\theta\cos m'\theta, & j \text{ is odd and } j' \text{ is even} \\ \sin m\theta\sin m'\theta, & j \text{ and } j' \text{ are both odd} \end{cases} = \begin{cases} \pi(1 + \delta_{m0})\,\delta_{mm'}, & j \text{ and } j' \text{ are both even} \\ \pi\,\delta_{mm'}, & j \text{ and } j' \text{ are both odd} \\ 0, & \text{otherwise.} \end{cases} \tag{4-46}$$

Therefore, the Zernike circle polynomials are orthonormal over a unit disc according to

$$\int_0^1\!\!\int_0^{2\pi}Z_j(\rho, \theta)Z_{j'}(\rho, \theta)\,\rho\,d\rho\,d\theta\bigg/\int_0^1\!\!\int_0^{2\pi}\rho\,d\rho\,d\theta = \delta_{jj'}. \tag{4-47}$$
4.6.2 Circle Polynomials in Polar Coordinates

The orthonormal Zernike circle polynomials and the names associated with some of them when identified with the classical aberrations are listed in Table 4-4 in polar coordinates for $n \le 8$. The polynomials independent of $\theta$ are the spherical aberrations, those varying as $\cos\theta$ are the coma aberrations, and those varying as $\cos 2\theta$ are the astigmatism aberrations. The variation of several radial polynomials $R_n^m(\rho)$ with $\rho$ is illustrated in Figure 4-10. A polynomial with an even value of $n$ has a value of zero at $n/2$ values of $\rho$, e.g., for defocus, astigmatism, and various orders of spherical aberration. A polynomial with an odd value of $n$ has a value of zero at $(n + 1)/2$ values of $\rho$, e.g., for various orders of coma. The larger the value of $n$ of a polynomial, the more oscillatory the polynomial.
4.6.3 Polynomial Ordering

The index $n$ of a Zernike polynomial represents its radial degree or the order, since it represents the highest power of $\rho$ in the polynomial. This is different from the order of a classical aberration, which represents the degree of the object (for which the aberration function is considered) and pupil points in Cartesian coordinates (see Section 1.6). The index $m$ of a polynomial is referred to as its azimuthal frequency. The index $j$ is a polynomial-ordering number and is a function of both $n$ and $m$. The polynomials in Table 4-4 are ordered such that an even $j$ corresponds to a symmetric polynomial varying as $\cos m\theta$, while an odd $j$ corresponds to an antisymmetric polynomial varying as $\sin m\theta$. A polynomial with a lower value of $n$ is ordered first, and for a given value of $n$, a polynomial with a lower value of $m$ is ordered first.
4.6.4 Number of Circle Polynomials through a Certain Order n

The number of circle polynomials of a given order $n$ is $n + 1$. Their number through a certain order $n$ is given by

$$N_n = (n + 1)(n + 2)/2. \tag{4-48}$$

For a rotationally symmetric imaging system, each of the $\sin m\theta$ terms is zero, as discussed in Section 1.6. Accordingly, the number of polynomials of an even order is $(n/2) + 1$ and $(n + 1)/2$ for an odd order. Their number through an order $n$ is given by

$$N_n = \left[(n/2) + 1\right]^2 \quad \text{for even } n, \tag{4-49a}$$

$$N_n = (n + 1)(n + 3)/4 \quad \text{for odd } n. \tag{4-49b}$$
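The counts in Eqs. (4-48) and (4-49) can be confirmed by direct enumeration of the permissible $(n, m)$ pairs. The short sketch below uses a hypothetical function name and counts one polynomial for $m = 0$ and a cosine and sine pair for $m > 0$, dropping the sine term in the rotationally symmetric case.

```python
def count_through_order(n_max, rotationally_symmetric=False):
    """Count circle polynomials with radial degree <= n_max by direct enumeration."""
    count = 0
    for n in range(n_max + 1):
        for m in range(n % 2, n + 1, 2):  # n - m must be even
            # m = 0 contributes one polynomial; m > 0 contributes a cosine and a sine
            # term, of which only the cosine survives for a rotationally symmetric system.
            count += 1 if (m == 0 or rotationally_symmetric) else 2
    return count


if __name__ == "__main__":
    for n in range(9):
        print(n, count_through_order(n), count_through_order(n, rotationally_symmetric=True))
```

Through $n = 8$ the enumeration gives 45 polynomials in general, matching the last entry of Table 4-4, and 25 for a rotationally symmetric system.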

Table 4-4. Orthonormal Zernike circle polynomials $Z_j(\rho, \theta)$. The indices $j$, $n$, and $m$ are called the polynomial number, radial degree, and azimuthal frequency, respectively. The polynomials $Z_j$ are ordered such that an even $j$ corresponds to a symmetric polynomial varying as $\cos m\theta$, while an odd $j$ corresponds to an antisymmetric polynomial varying as $\sin m\theta$. A polynomial with a lower value of $n$ is ordered first, and for a given value of $n$, a polynomial with a lower value of $m$ is ordered first.

| $j$ | $n$ | $m$ | $Z_j(\rho, \theta)$ | Aberration Name* |
|---|---|---|---|---|
| 1 | 0 | 0 | $1$ | Piston |
| 2 | 1 | 1 | $2\rho\cos\theta$ | x-tilt |
| 3 | 1 | 1 | $2\rho\sin\theta$ | y-tilt |
| 4 | 2 | 0 | $\sqrt{3}\left(2\rho^2 - 1\right)$ | Defocus |
| 5 | 2 | 2 | $\sqrt{6}\,\rho^2\sin 2\theta$ | 45° Primary astigmatism |
| 6 | 2 | 2 | $\sqrt{6}\,\rho^2\cos 2\theta$ | 0° Primary astigmatism |
| 7 | 3 | 1 | $\sqrt{8}\left(3\rho^3 - 2\rho\right)\sin\theta$ | Primary y-coma |
| 8 | 3 | 1 | $\sqrt{8}\left(3\rho^3 - 2\rho\right)\cos\theta$ | Primary x-coma |
| 9 | 3 | 3 | $\sqrt{8}\,\rho^3\sin 3\theta$ | |
| 10 | 3 | 3 | $\sqrt{8}\,\rho^3\cos 3\theta$ | |
| 11 | 4 | 0 | $\sqrt{5}\left(6\rho^4 - 6\rho^2 + 1\right)$ | Primary spherical aberration |
| 12 | 4 | 2 | $\sqrt{10}\left(4\rho^4 - 3\rho^2\right)\cos 2\theta$ | 0° Secondary astigmatism |
| 13 | 4 | 2 | $\sqrt{10}\left(4\rho^4 - 3\rho^2\right)\sin 2\theta$ | 45° Secondary astigmatism |
| 14 | 4 | 4 | $\sqrt{10}\,\rho^4\cos 4\theta$ | |
| 15 | 4 | 4 | $\sqrt{10}\,\rho^4\sin 4\theta$ | |
| 16 | 5 | 1 | $\sqrt{12}\left(10\rho^5 - 12\rho^3 + 3\rho\right)\cos\theta$ | Secondary x-coma |
| 17 | 5 | 1 | $\sqrt{12}\left(10\rho^5 - 12\rho^3 + 3\rho\right)\sin\theta$ | Secondary y-coma |
| 18 | 5 | 3 | $\sqrt{12}\left(5\rho^5 - 4\rho^3\right)\cos 3\theta$ | |
| 19 | 5 | 3 | $\sqrt{12}\left(5\rho^5 - 4\rho^3\right)\sin 3\theta$ | |
| 20 | 5 | 5 | $\sqrt{12}\,\rho^5\cos 5\theta$ | |
| 21 | 5 | 5 | $\sqrt{12}\,\rho^5\sin 5\theta$ | |
| 22 | 6 | 0 | $\sqrt{7}\left(20\rho^6 - 30\rho^4 + 12\rho^2 - 1\right)$ | Secondary spherical |
| 23 | 6 | 2 | $\sqrt{14}\left(15\rho^6 - 20\rho^4 + 6\rho^2\right)\sin 2\theta$ | 45° Tertiary astigmatism |
| 24 | 6 | 2 | $\sqrt{14}\left(15\rho^6 - 20\rho^4 + 6\rho^2\right)\cos 2\theta$ | 0° Tertiary astigmatism |
| 25 | 6 | 4 | $\sqrt{14}\left(6\rho^6 - 5\rho^4\right)\sin 4\theta$ | |
| 26 | 6 | 4 | $\sqrt{14}\left(6\rho^6 - 5\rho^4\right)\cos 4\theta$ | |
| 27 | 6 | 6 | $\sqrt{14}\,\rho^6\sin 6\theta$ | |
| 28 | 6 | 6 | $\sqrt{14}\,\rho^6\cos 6\theta$ | |
| 29 | 7 | 1 | $4\left(35\rho^7 - 60\rho^5 + 30\rho^3 - 4\rho\right)\sin\theta$ | Tertiary y-coma |
| 30 | 7 | 1 | $4\left(35\rho^7 - 60\rho^5 + 30\rho^3 - 4\rho\right)\cos\theta$ | Tertiary x-coma |
| 31 | 7 | 3 | $4\left(21\rho^7 - 30\rho^5 + 10\rho^3\right)\sin 3\theta$ | |
| 32 | 7 | 3 | $4\left(21\rho^7 - 30\rho^5 + 10\rho^3\right)\cos 3\theta$ | |
| 33 | 7 | 5 | $4\left(7\rho^7 - 6\rho^5\right)\sin 5\theta$ | |
| 34 | 7 | 5 | $4\left(7\rho^7 - 6\rho^5\right)\cos 5\theta$ | |
| 35 | 7 | 7 | $4\rho^7\sin 7\theta$ | |
| 36 | 7 | 7 | $4\rho^7\cos 7\theta$ | |
| 37 | 8 | 0 | $3\left(70\rho^8 - 140\rho^6 + 90\rho^4 - 20\rho^2 + 1\right)$ | Tertiary spherical |
| 38 | 8 | 2 | $\sqrt{18}\left(56\rho^8 - 105\rho^6 + 60\rho^4 - 10\rho^2\right)\cos 2\theta$ | 0° Quaternary astigmatism |
| 39 | 8 | 2 | $\sqrt{18}\left(56\rho^8 - 105\rho^6 + 60\rho^4 - 10\rho^2\right)\sin 2\theta$ | 45° Quaternary astigmatism |
| 40 | 8 | 4 | $\sqrt{18}\left(28\rho^8 - 42\rho^6 + 15\rho^4\right)\cos 4\theta$ | |
| 41 | 8 | 4 | $\sqrt{18}\left(28\rho^8 - 42\rho^6 + 15\rho^4\right)\sin 4\theta$ | |
| 42 | 8 | 6 | $\sqrt{18}\left(8\rho^8 - 7\rho^6\right)\cos 6\theta$ | |
| 43 | 8 | 6 | $\sqrt{18}\left(8\rho^8 - 7\rho^6\right)\sin 6\theta$ | |
| 44 | 8 | 8 | $\sqrt{18}\,\rho^8\cos 8\theta$ | |
| 45 | 8 | 8 | $\sqrt{18}\,\rho^8\sin 8\theta$ | |

*The words "orthonormal Zernike circle" are to be associated with these names, e.g., orthonormal Zernike circle 0° primary astigmatism.
Figure 4-10. Variation of a Zernike circle radial polynomial $R_n^m(\rho)$ as a function of $\rho$. (a) Defocus and spherical aberrations. (b) Tilt and coma. (c) Astigmatism.
4.6.5 Relationships among the Indices n, m, and j

The number of polynomials $N_n$ through a certain order $n$ represents the largest value of $j$. Since the number of polynomials with the same value of $n$ but different values of $m$ is equal to $n + 1$, the smallest value of $j$ for a given value of $n$ is $N_n - n$. For a given value of $n$ and $m$, there are two $j$ values, $N_n - n + m - 1$ and $N_n - n + m$. The even value of $j$ represents the $\cos m\theta$ polynomial, and the odd value of $j$ represents the $\sin m\theta$ polynomial. The value of $j$ with $m = 0$ is $N_n - n$. For example, for $n = 5$, $N_n = 21$ and $j = 21$ represents the $\sin 5\theta$ polynomial. The number of the corresponding $\cos 5\theta$ polynomial is $j = 20$. The two polynomials with $m = 3$, for example, have $j$ values of 18 and 19, representing the $\cos 3\theta$ and the $\sin 3\theta$ polynomials, respectively.

For a given value of $j$, $n$ is given by

$$n = \left[(2j - 1)^{1/2} + 0.5\right]_{\text{integer}} - 1, \tag{4-50}$$

where the subscript integer implies the integer value of the number in brackets. Once $n$ is known, the value of $m$ is given by

$$m = 2\left\{\left[2j + 1 - n(n + 1)\right]/4\right\}_{\text{integer}} \quad \text{when } n \text{ is even,} \tag{4-51a}$$

$$m = 2\left\{\left[2(j + 1) - n(n + 1)\right]/4\right\}_{\text{integer}} - 1 \quad \text{when } n \text{ is odd.} \tag{4-51b}$$

For example, suppose we want to know the values of $n$ and $m$ for the polynomial $j = 10$. From Eq. (4-50), $n = 3$ and from Eq. (4-51b), $m = 3$. Hence, it is a $\cos 3\theta$ polynomial.
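The index relations of this section translate directly into code. The following is a minimal sketch with hypothetical function names: `j_to_nm` applies Eqs. (4-50) and (4-51), and `nm_to_j` applies the rule that the pair of numbers for a given $n$ and $m$ is $N_n - n + m - 1$ and $N_n - n + m$, with the even number assigned to the cosine polynomial. The labels `"radial"`, `"cos"`, and `"sin"` are illustrative conventions introduced here.

```python
import math


def j_to_nm(j):
    """Return (n, m, kind) for polynomial number j >= 1, using Eqs. (4-50) and (4-51)."""
    n = int(math.sqrt(2 * j - 1) + 0.5) - 1
    if n % 2 == 0:
        m = 2 * ((2 * j + 1 - n * (n + 1)) // 4)          # Eq. (4-51a)
    else:
        m = 2 * ((2 * (j + 1) - n * (n + 1)) // 4) - 1    # Eq. (4-51b)
    kind = "radial" if m == 0 else ("cos" if j % 2 == 0 else "sin")
    return n, m, kind


def nm_to_j(n, m, kind):
    """Inverse mapping: polynomial number for radial degree n and azimuthal frequency m."""
    n_total = (n + 1) * (n + 2) // 2                      # Eq. (4-48)
    if m == 0:
        return n_total - n
    upper = n_total - n + m                               # the pair is (upper - 1, upper)
    upper_is_even = upper % 2 == 0
    if kind == "cos":                                     # cosine terms carry the even number
        return upper if upper_is_even else upper - 1
    return upper - 1 if upper_is_even else upper


if __name__ == "__main__":
    for j in (4, 10, 11, 18, 19, 21):
        print(j, j_to_nm(j))
```

For $j = 10$ the sketch returns $n = 3$, $m = 3$ and a cosine polynomial, in agreement with the worked example above.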
4.6.6 Uniqueness of Circle Polynomials

The Zernike circle polynomials have certain unique mathematical properties. They are the only polynomials in two variables $\rho$ and $\theta$, which (a) are orthogonal over a circle, (b) are invariant in form with respect to rotation of the coordinate axes about the origin, and (c) include a polynomial for each permissible pair of $n$ and $m$ values [4,12].

From the standpoint of wavefront analysis, their uniqueness lies in the fact that they are not only orthogonal over a circular pupil, but include wavefront tilt, defocus, and balanced classical aberrations as members of the polynomial set for such a pupil. For example, $Z_6$, $Z_8$, and $Z_{11}$ represent the balanced primary aberrations of astigmatism, coma, and spherical aberration, as may be seen by comparing their forms with those given in Table 4-2. Similarly, $Z_{12}$, $Z_{16}$, and $Z_{22}$ represent the balanced secondary aberrations of astigmatism, coma, and spherical aberration, respectively, as may be seen by comparing their forms with those given in Eqs. (4-33)–(4-35), respectively. Note that the constant term in a radially symmetric aberration is needed to make its mean value zero over the pupil. A balanced classical aberration in the form of a Zernike polynomial is referred to as a Zernike or orthogonal aberration, e.g., $Z_6$ is Zernike primary astigmatism or $Z_8$ is Zernike primary coma. In Section 4.5, aberrations with only $\cos m\theta$ type dependence are considered, as would be the case for a rotationally symmetric imaging system. In general, an aberration function will also have $\sin m\theta$ type terms, for example, due to fabrication errors or those due to atmospheric turbulence. The corresponding polynomials with $\sin m\theta$ dependence are considered in Section 4.6.
4.6.7 Circle Polynomials in Cartesian Coordinates

The circle polynomials given in polar coordinates in Table 4-4 can be written in the Cartesian coordinates $(x, y)$ of a pupil point, and $\cos m\theta$ and $\sin m\theta$ can be written in terms of powers of $\cos\theta$ and $\sin\theta$, respectively. They are listed in Table 4-5 using the polynomial ordering index $j$. It is quite common in the optics literature to consider a point object lying along the $y$ axis when imaged by a rotationally symmetric optical system, thus making the $yz$ plane the tangential plane [4]. To maintain symmetry of the aberration function about this plane, the polar angle $\theta$ of a pupil point in Figure 4-1 is accordingly defined as the angle made by its position vector OQ with the $y$ axis, contrary to the standard convention as the angle with the $x$ axis. We choose a point object along the $x$ axis so that, for example, the coma aberration is expressed as $x\left(x^2 + y^2\right)$ and not as $y\left(x^2 + y^2\right)$. A positive value of our coma aberration yields a diffraction point spread function that is symmetric about the $x$ axis (or symmetric in $y$) with its peak and centroid shifted to a positive value of $x$ with respect to the Gaussian image point.

In practice, the aberration data obtained by way of interferometry will generally be available at a uniformly spaced array of points in Cartesian coordinates. Hence, it is convenient to carry out numerical analysis in a Cartesian coordinate system using the Zernike circle polynomials in Cartesian coordinates.
4.7 ZERNIKE CIRCLE COEFFICIENTS OF A CIRCULAR ABERRATION FUNCTION
The aberration function $W(\rho, \theta)$ of a rotationally symmetric imaging system for a
certain point object can be expanded in terms of the Zernike circle
polynomials $Z_n^m(\rho, \theta)$ that are orthonormal over a unit disc in the form

$$ W(\rho, \theta) = \sum_{n=0}^{\infty} \sum_{m=0}^{n} c_{nm} Z_n^m(\rho, \theta) , \quad 0 \le \rho \le 1 , \quad 0 \le \theta \le 2\pi , \tag{4-52} $$

where $c_{nm}$ are the orthonormal expansion coefficients that depend on the object location.
The orthonormal Zernike expansion coefficients are given by

$$ c_{nm} = \frac{1}{\pi} \int_0^1 \int_0^{2\pi} W(\rho, \theta)\, Z_n^m(\rho, \theta)\, \rho \, d\rho \, d\theta , \tag{4-53} $$

as may be seen by substituting Eq. (4-52) and utilizing the orthonormality Eq. (4-44) of
the polynomials.
Table 4-5. Orthonormal Zernike circle polynomials $Z_j(x, y)$ in Cartesian coordinates $(x, y)$, where $x = \rho\cos\theta$, $y = \rho\sin\theta$, and $0 \le \rho = (x^2 + y^2)^{1/2} \le 1$.

Poly.  n  m  $Z_j(x, y)$                                                         Name
Z1     0  0  $1$                                                                 Piston
Z2     1  1  $2x$                                                                x tilt
Z3     1  1  $2y$                                                                y tilt
Z4     2  0  $\sqrt{3}(2\rho^2 - 1)$                                             Defocus
Z5     2  2  $2\sqrt{6}xy$                                                       45° Primary astig.
Z6     2  2  $\sqrt{6}(x^2 - y^2)$                                               0° Primary astig.
Z7     3  1  $\sqrt{8}y(3\rho^2 - 2)$                                            Primary y-coma
Z8     3  1  $\sqrt{8}x(3\rho^2 - 2)$                                            Primary x-coma
Z9     3  3  $\sqrt{8}y(3x^2 - y^2)$
Z10    3  3  $\sqrt{8}x(x^2 - 3y^2)$
Z11    4  0  $\sqrt{5}(6\rho^4 - 6\rho^2 + 1)$                                   Primary spherical
Z12    4  2  $\sqrt{10}(x^2 - y^2)(4\rho^2 - 3)$                                 0° Secondary astig.
Z13    4  2  $2\sqrt{10}xy(4\rho^2 - 3)$                                         45° Secondary astig.
Z14    4  4  $\sqrt{10}(\rho^4 - 8x^2y^2)$
Z15    4  4  $4\sqrt{10}xy(x^2 - y^2)$
Z16    5  1  $\sqrt{12}x(10\rho^4 - 12\rho^2 + 3)$                               Secondary x-coma
Z17    5  1  $\sqrt{12}y(10\rho^4 - 12\rho^2 + 3)$                               Secondary y-coma
Z18    5  3  $\sqrt{12}x(x^2 - 3y^2)(5\rho^2 - 4)$
Z19    5  3  $\sqrt{12}y(3x^2 - y^2)(5\rho^2 - 4)$
Z20    5  5  $\sqrt{12}x(16x^4 - 20x^2\rho^2 + 5\rho^4)$
Z21    5  5  $\sqrt{12}y(16y^4 - 20y^2\rho^2 + 5\rho^4)$
Z22    6  0  $\sqrt{7}(20\rho^6 - 30\rho^4 + 12\rho^2 - 1)$                      Secondary spherical
Z23    6  2  $2\sqrt{14}xy(15\rho^4 - 20\rho^2 + 6)$                             45° Tertiary astig.
Z24    6  2  $\sqrt{14}(x^2 - y^2)(15\rho^4 - 20\rho^2 + 6)$                     0° Tertiary astig.
Z25    6  4  $4\sqrt{14}xy(x^2 - y^2)(6\rho^2 - 5)$
Z26    6  4  $\sqrt{14}(8x^4 - 8x^2\rho^2 + \rho^4)(6\rho^2 - 5)$
Z27    6  6  $\sqrt{14}xy(32x^4 - 32x^2\rho^2 + 6\rho^4)$
Z28    6  6  $\sqrt{14}(32x^6 - 48x^4\rho^2 + 18x^2\rho^4 - \rho^6)$
Z29    7  1  $4y(35\rho^6 - 60\rho^4 + 30\rho^2 - 4)$                            Tertiary y-coma
Z30    7  1  $4x(35\rho^6 - 60\rho^4 + 30\rho^2 - 4)$                            Tertiary x-coma
Z31    7  3  $4y(3x^2 - y^2)(21\rho^4 - 30\rho^2 + 10)$
Z32    7  3  $4x(x^2 - 3y^2)(21\rho^4 - 30\rho^2 + 10)$
Z33    7  5  $4(7\rho^2 - 6)[4x^2y(x^2 - y^2) + y(\rho^4 - 8x^2y^2)]$
Z34    7  5  $4(7\rho^2 - 6)[x(\rho^4 - 8x^2y^2) - 4xy^2(x^2 - y^2)]$
Z35    7  7  $8x^2y(3\rho^4 - 16x^2y^2) + 4y(x^2 - y^2)(\rho^4 - 16x^2y^2)$
Z36    7  7  $4x(x^2 - y^2)(\rho^4 - 16x^2y^2) - 8xy^2(3\rho^4 - 16x^2y^2)$
Z37    8  0  $3(70\rho^8 - 140\rho^6 + 90\rho^4 - 20\rho^2 + 1)$                 Tertiary spherical
Z38    8  2  $\sqrt{18}(56\rho^6 - 105\rho^4 + 60\rho^2 - 10)(x^2 - y^2)$        0° Quaternary astig.
Z39    8  2  $2\sqrt{18}xy(56\rho^6 - 105\rho^4 + 60\rho^2 - 10)$                45° Quaternary astig.
Z40    8  4  $\sqrt{18}(28\rho^4 - 42\rho^2 + 15)(\rho^4 - 8x^2y^2)$
Z41    8  4  $4\sqrt{18}xy(28\rho^4 - 42\rho^2 + 15)(x^2 - y^2)$
Z42    8  6  $\sqrt{18}(x^2 - y^2)(\rho^4 - 16x^2y^2)(8\rho^2 - 7)$
Z43    8  6  $2\sqrt{18}xy(3\rho^4 - 16x^2y^2)(8\rho^2 - 7)$
Z44    8  8  $\sqrt{18}[2(\rho^4 - 8x^2y^2)^2 - \rho^8]$
Z45    8  8  $8\sqrt{18}xy(x^2 - y^2)(\rho^4 - 8x^2y^2)$

Because of the orthogonality of the Zernike polynomials, the mean value of a circle
polynomial, except when $n = 0 = m$ (the piston polynomial), is zero, and its mean square
value is unity, as shown in Section 3.2. Therefore, the mean and the mean square values
of the aberration function are given by
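$$ \langle W(\rho, \theta) \rangle = c_{00} , \tag{4-54} $$

$$ \langle W^2(\rho, \theta) \rangle = \sum_{n=0}^{\infty} \sum_{m=0}^{n} c_{nm}^2 , \tag{4-55} $$

respectively. Accordingly, its variance is given by

$$ \sigma^2 = \langle W^2(\rho, \theta) \rangle - \langle W(\rho, \theta) \rangle^2 = \sum_{n=1}^{\infty} \sum_{m=0}^{n} c_{nm}^2 . \tag{4-56} $$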
In practice, the expansion will be truncated at some value N of n such that the variance
obtained from Eq. (4-56) will be equal to its value obtained from the actual data within
some specified tolerance.
An aberration function $W(\rho, \theta)$ across a unit disc representing aberrations resulting
from fabrication errors or atmospheric turbulence can be expanded in terms of the
Zernike circle polynomials $Z_j(\rho, \theta)$ in the form [2,5]

$$ W(\rho, \theta) = \sum_{j=1}^{J} a_j Z_j(\rho, \theta) , \tag{4-57} $$

where $a_j$ are the expansion coefficients, and we have truncated the polynomials at
a maximum value $J$ of $j$. Multiplying both sides of Eq. (4-57) by $Z_j(\rho, \theta)$, integrating over
the unit disc, and using the orthonormality Eq. (4-4), we obtain the circle expansion
coefficients:

$$ a_j = \frac{1}{\pi} \int_0^1 \int_0^{2\pi} W(\rho, \theta)\, Z_j(\rho, \theta)\, \rho \, d\rho \, d\theta . \tag{4-58} $$
As stated in Section 3.2, it is evident from Eq. (4-58) that the value of a circle coefficient
$a_j$ is independent of the number $J$ of the polynomials used in Eq. (4-57) for the
expansion of the aberration function. Hence, one or more terms can be added to or
subtracted from the aberration function without affecting the value of the coefficients of
the other polynomials in the expansion.
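The projection integral for a circle coefficient can be approximated numerically. The sketch below is only illustrative: it uses a midpoint rule in the radial variable and a uniform rule in the angle, and the function name `zernike_coefficient`, the grid sizes, and the test aberration (classical coma $\rho^3\cos\theta$) are assumptions made for the example. Each coefficient is computed on its own, which reflects the statement that its value does not depend on how many other polynomials are included.

```python
import math

def zernike_coefficient(W, Z, n_rho=200, n_theta=64):
    """Approximate (1/pi) * integral of W * Z * rho d(rho) d(theta) over the unit disc."""
    d_rho = 1.0 / n_rho
    d_theta = 2.0 * math.pi / n_theta
    total = 0.0
    for i in range(n_rho):
        rho = (i + 0.5) * d_rho          # midpoint rule in rho
        for k in range(n_theta):
            theta = k * d_theta          # uniform rule in theta
            total += W(rho, theta) * Z(rho, theta) * rho
    return total * d_rho * d_theta / math.pi

# Polar forms of three orthonormal circle polynomials.
Z2 = lambda rho, theta: 2.0 * rho * math.cos(theta)
Z8 = lambda rho, theta: math.sqrt(8.0) * (3.0 * rho**3 - 2.0 * rho) * math.cos(theta)
Z11 = lambda rho, theta: math.sqrt(5.0) * (6.0 * rho**4 - 6.0 * rho**2 + 1.0)

# Hypothetical test aberration: classical coma rho^3 cos(theta), in waves.
W = lambda rho, theta: rho**3 * math.cos(theta)

a2 = zernike_coefficient(W, Z2)     # analytically 1/3
a8 = zernike_coefficient(W, Z8)     # analytically 1/(3*sqrt(8))
a11 = zernike_coefficient(W, Z11)   # analytically 0
```

The analytic values quoted in the comments follow from writing $\rho^3\cos\theta = Z_8/(3\sqrt{8}) + Z_2/3$.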
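The mean and the mean square values of the aberration function are given by

$$ \langle W(\rho, \theta) \rangle = a_1 , \tag{4-59} $$

$$ \langle W^2(\rho, \theta) \rangle = \sum_{j=1}^{J} a_j^2 , \tag{4-60} $$

respectively. Accordingly, the aberration variance is given by

$$ \sigma^2 = \langle W^2(\rho, \theta) \rangle - \langle W(\rho, \theta) \rangle^2 = \sum_{j=2}^{J} a_j^2 . \tag{4-61} $$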
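With orthonormal coefficients in hand, the mean, mean square, and variance reduce to simple sums. The following sketch assumes the coefficients are stored in a dictionary keyed by the polynomial index $j$; the function name `aberration_statistics` and the sample values (a piston term plus the tilt and coma coefficients of classical coma $\rho^3\cos\theta$) are illustrative assumptions.

```python
import math

def aberration_statistics(coeffs):
    """Mean, mean square, and variance from orthonormal coefficients {j: a_j}."""
    mean = coeffs.get(1, 0.0)                                    # piston coefficient a_1
    mean_square = sum(a * a for a in coeffs.values())            # sum over all j
    variance = sum(a * a for j, a in coeffs.items() if j >= 2)   # piston excluded
    return mean, mean_square, variance

# Hypothetical coefficients, in waves.
coeffs = {1: 0.25, 2: 1.0 / 3.0, 8: 1.0 / (3.0 * math.sqrt(8.0))}
mean, mean_square, variance = aberration_statistics(coeffs)
sigma = math.sqrt(variance)
```

For these values the variance is $1/9 + 1/72 = 1/8$, which agrees with the variance of $\rho^3\cos\theta$ computed directly over the unit disc.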
4.8 SYMMETRY PROPERTIES OF IMAGES ABERRATED BY A CIRCLE POLYNOMIAL ABERRATION
It is evident that a Zernike circle polynomial aberration varying as $\cos m\theta$ or $\sin m\theta$
is $m$-fold symmetric, unless $m = 0$, in which case it is radially symmetric. However, the
symmetry of the corresponding interferogram depends on $|\cos m\theta|$ or $|\sin m\theta|$, since it
does not depend on the sign of the aberration. Hence, it is $2m$-fold symmetric. Based on
the symmetry of the aberration, we now determine the symmetry of the PSF, the real and
the imaginary parts of the OTF, and the MTF [13,14].
4.8.1 Symmetry of PSF

Consider an $m$-fold symmetric aberration of the form $\cos m\theta$. From Eq. (4-5), the
PSF at a distance $r$ but an angle $\theta_i + 2\pi k/m$, where $k = 1, 2, \ldots, m$, can be written

$$ I(r, \theta_i + 2\pi k/m) = \frac{1}{\pi^2} \left| \int_0^1 \int_0^{2\pi} \exp[i\Phi(\rho, \theta)] \exp[-\pi i r \rho \cos(\theta - \theta_i - 2\pi k/m)]\, \rho \, d\rho \, d\theta \right|^2 . \tag{4-62} $$

Now,

$$ \Phi(\rho, \theta - 2\pi k/m) \sim \cos[m(\theta - 2\pi k/m)] = \cos(m\theta - 2\pi k) = \cos m\theta \sim \Phi(\rho, \theta) . \tag{4-63} $$

Hence, we can write Eq. (4-62) as

$$ I(r, \theta_i + 2\pi k/m) = \frac{1}{\pi^2} \left| \int_0^1 \int_0^{2\pi} \exp[i\Phi(\rho, \theta - 2\pi k/m)] \exp[-\pi i r \rho \cos(\theta - \theta_i - 2\pi k/m)]\, \rho \, d\rho \, d\theta \right|^2 = I(r, \theta_i) . \tag{4-64} $$

Thus if we change the angle $\theta_i$ by $2\pi k/m$ but keep $r$ unchanged, we obtain the same
value of the PSF as at $(r, \theta_i)$. This change can occur $m$ times over a complete cycle of
$2\pi$. Therefore, Eq. (4-64) shows that the PSF is $m$-fold symmetric, as expected for the
$m$-fold aberration function. However, this is true for odd values of $m$ only.
If $m$ is even, the invariance of the PSF when $\theta_i$ changes by $\pi$, i.e., for $k = m/2$,
implies that the PSF is symmetric or even about the origin, i.e., $I(\vec{r}) = I(-\vec{r})$. It has the
consequence that the PSF is $2m$-fold symmetric when $m$ is even, as we show next. The
PSF at a distance $r$ but angle $\theta_i \pm \pi j/m$, where $j = 1, 2, \ldots, 2m$, is given by

$$ I(r, \theta_i \pm \pi j/m) = \frac{1}{\pi^2} \left| \int_0^1 \int_0^{2\pi} \exp[i\Phi(\rho, \theta)] \exp[-\pi i r \rho \cos(\theta - \theta_i \mp \pi j/m)]\, \rho \, d\rho \, d\theta \right|^2 . \tag{4-65} $$
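Now

$$ \Phi(\rho, \theta \pm \pi j/m) \sim \cos[m(\theta \pm \pi j/m)] = \cos(m\theta \pm \pi j) = \begin{cases} \cos m\theta & \text{for even } j \\ -\cos m\theta & \text{for odd } j \end{cases} \sim \begin{cases} \Phi(\rho, \theta) & \text{for even } j \\ -\Phi(\rho, \theta) & \text{for odd } j . \end{cases} \tag{4-66} $$

Therefore, Eq. (4-65) can be written

$$ I(r, \theta_i \pm \pi j/m) = \frac{1}{\pi^2} \left| \int_0^1 \int_0^{2\pi} \exp[i\Phi(\rho, \theta - \pi j/m)] \exp[-\pi i r \rho \cos(\theta - \theta_i \mp \pi j/m)]\, \rho \, d\rho \, d\theta \right|^2 \tag{4-67} $$

$$ = \begin{cases} I(r, \theta_i) & \text{for even } j \\ I(r, \theta_i + \pi) \equiv I(-\vec{r}) & \text{for odd } j , \end{cases} \tag{4-68} $$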
where in Eq. (4-67) we have substituted $\Phi(\rho, \theta) = \Phi(\rho, \theta \pm \pi j/m)$ for even $j$ and
$\Phi(\rho, \theta) = -\Phi(\rho, \theta \pm \pi j/m)$ for odd $j$ to obtain Eq. (4-68). Since $I(\vec{r}) = I(-\vec{r})$ for even $m$,
the right-hand side of Eq. (4-68) is equal to $I(r, \theta_i)$ for odd values of $j$ also. Hence the
PSF is $2m$-fold symmetric when $m$ is even. Of course, when $m = 0$, the PSF is radially
symmetric, like the aberration function.
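These symmetry statements can be checked numerically. The sketch below assumes the PSF is the squared modulus of the pupil integral normalized by $\pi^2$, with $r$ in units of $\lambda F$ and the phase $\Phi = 2\pi W$ for $W$ in waves; the function name `psf`, the quadrature sizes, and the sample point are choices made for the example. The angular grid has 96 points so that rotations by $2\pi/3$ and $\pi/2$ map it onto itself, which makes the symmetry hold to rounding error even though the integral itself is only approximated.

```python
import cmath
import math

def psf(r, theta_i, phase, n_rho=60, n_theta=96):
    """|(1/pi) * integral of exp(i*Phi) * exp(-pi*i*r*rho*cos(theta - theta_i)) rho d(rho) d(theta)|^2."""
    d_rho = 1.0 / n_rho
    d_theta = 2.0 * math.pi / n_theta
    amp = 0.0 + 0.0j
    for i in range(n_rho):
        rho = (i + 0.5) * d_rho
        for k in range(n_theta):
            theta = k * d_theta
            arg = phase(rho, theta) - math.pi * r * rho * math.cos(theta - theta_i)
            amp += cmath.exp(1j * arg) * rho
    amp *= d_rho * d_theta / math.pi
    return abs(amp) ** 2

SIGMA = 0.1  # sigma value of the aberration, in waves

# Phase (radians) for Z10 (m = 3, odd) and Z6 (m = 2, even).
phase_z10 = lambda rho, theta: 2 * math.pi * SIGMA * math.sqrt(8) * rho**3 * math.cos(3 * theta)
phase_z6 = lambda rho, theta: 2 * math.pi * SIGMA * math.sqrt(6) * rho**2 * math.cos(2 * theta)

r, t = 1.0, 0.2
odd_pair = (psf(r, t, phase_z10), psf(r, t + 2 * math.pi / 3, phase_z10))   # 3-fold: equal
odd_flip = psf(r, t + math.pi, phase_z10)                                   # not even about origin
even_pair = (psf(r, t, phase_z6), psf(r, t + math.pi / 2, phase_z6))        # 4-fold: equal
```

For odd $m$ the two values in `odd_pair` agree while `odd_flip` differs, and for even $m$ a rotation by $\pi/m$ already reproduces the PSF, consistent with $2m$-fold symmetry.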
The PSFs for two polynomial aberrations with the same $n$ and $m$ values, and the
same sigma value, but different angular dependence as $\cos m\theta$ and $\sin m\theta$ are the same
except that one is rotated by an angle $\pi/2m$ with respect to the other. If two such
polynomial aberrations are present simultaneously with sigma values $a_j$ and $b_j$, we can
write their sum in the form

$$ W(\rho, \theta) = a_j Z_{\text{even } j}(\rho, \theta) + b_j Z_{\text{odd } j}(\rho, \theta) $$
$$ = \sqrt{2(n+1)}\, R_n^m(\rho) \left( a_j \cos m\theta + b_j \sin m\theta \right) $$
$$ = \sqrt{2(n+1)}\, R_n^m(\rho) \sqrt{a_j^2 + b_j^2}\, \cos\left\{ m \left[ \theta - (1/m) \tan^{-1}(b_j/a_j) \right] \right\} . \tag{4-69} $$

It represents an aberration of the form $\cos m\theta$ with a sigma value of $\sqrt{a_j^2 + b_j^2}$, except
that its orientation is different by an angle $(1/m)\tan^{-1}(b_j/a_j)$. Hence, the orientation of
the PSF (and OTF) also changes by this angle.
It is easy to see that when both $a_j$ and $b_j$ are negative, $(a_j^2 + b_j^2)^{1/2}$ in Eq. (4-69)
must be replaced by $-(a_j^2 + b_j^2)^{1/2}$. However, when one of the coefficients is positive and
the other is negative, then $\tan^{-1}(b_j/a_j)$ of a negative argument has two solutions: a
negative acute angle or that angle increased by $\pi$. The choice is made depending on
whether the cosine coefficient $a_j$ or the sine coefficient $b_j$ is negative according to

$$ \tan^{-1}(b_j/a_j) = \begin{cases} -\tan^{-1}|b_j/a_j| & \text{for positive } a_j \text{ and negative } b_j \quad \text{(4-70a)} \\ \pi - \tan^{-1}|b_j/a_j| & \text{for negative } a_j \text{ and positive } b_j . \quad \text{(4-70b)} \end{cases} $$

An alternative when $a_j$ is negative is to let the angle be $-\tan^{-1}|b_j/a_j|$, as when $a_j$ is
positive, but also replace $(a_j^2 + b_j^2)^{1/2}$ with $-(a_j^2 + b_j^2)^{1/2}$.
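In numerical work, the quadrant bookkeeping described above can be delegated to a two-argument arctangent. The sketch below is one possible convention rather than the only one: it always returns a non-negative magnitude and lets `math.atan2` choose the angle from the signs of both coefficients, which is equivalent to the sign rules given in the text. The function name `combine_cos_sin` is an assumption made for this example.

```python
import math

def combine_cos_sin(a, b, m):
    """Return (magnitude, orientation) such that
    a*cos(m*theta) + b*sin(m*theta) == magnitude * cos(m*(theta - orientation))."""
    magnitude = math.hypot(a, b)          # always non-negative
    orientation = math.atan2(b, a) / m    # quadrant chosen from the signs of a and b
    return magnitude, orientation

# Example: a negative cosine coefficient with a positive sine coefficient, m = 2.
mag, ang = combine_cos_sin(-0.3, 0.4, 2)
```

Here `mag` is 0.5, and `ang` lies between $\pi/4$ and $\pi/2$ because $2\,\texttt{ang}$ falls in the second quadrant.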
4.8.2 Symmetry of OTF

The complex OTF given by Eq. (2-10) can be written in terms of its real and
imaginary parts:

$$ \tau(\vec{v}) = \mathrm{Re}\, \tau(\vec{v}) + i\, \mathrm{Im}\, \tau(\vec{v}) , \tag{4-71} $$

where the real and the imaginary parts are given by

$$ \mathrm{Re}\, \tau(\vec{v}) = \int I(\vec{r}) \cos(2\pi \vec{v} \cdot \vec{r})\, d\vec{r} \tag{4-72a} $$

and

$$ \mathrm{Im}\, \tau(\vec{v}) = \int I(\vec{r}) \sin(2\pi \vec{v} \cdot \vec{r})\, d\vec{r} , \tag{4-72b} $$

respectively. In polar coordinates, we can write them

$$ \mathrm{Re}\, \tau(v, \phi) = \iint I(r, \theta_i) \cos[2\pi v r \cos(\theta_i - \phi)]\, r \, dr \, d\theta_i \tag{4-73a} $$

and

$$ \mathrm{Im}\, \tau(v, \phi) = \iint I(r, \theta_i) \sin[2\pi v r \cos(\theta_i - \phi)]\, r \, dr \, d\theta_i . \tag{4-73b} $$
When $m$ is odd, the OTF is complex. To determine the symmetry of its real part, we
consider it for a spatial frequency $(v, \phi + \pi j/m)$, where, as before, $j = 1, 2, \ldots, 2m$:

$$ \mathrm{Re}\, \tau(v, \phi + \pi j/m) = \iint I(r, \theta_i) \cos[2\pi v r \cos(\theta_i - \phi - \pi j/m)]\, r \, dr \, d\theta_i . \tag{4-74} $$

From Eq. (4-68) for even $j$, we can replace $I(r, \theta_i)$ with $I(r, \theta_i - \pi j/m)$, and thus

$$ \mathrm{Re}\, \tau(v, \phi + \pi j/m) = \iint I(r, \theta_i - \pi j/m) \cos[2\pi v r \cos(\theta_i - \phi - \pi j/m)]\, r \, dr \, d\theta_i = \mathrm{Re}\, \tau(v, \phi) . \tag{4-75} $$

For odd $j$,

$$ I(r, \theta_i + \pi j/m) = I(r, \theta_i + \pi) . \tag{4-76} $$
Therefore, changing the variable of integration from $\theta_i$ to $\theta_i + \pi$, we may write Eq. (4-74) as

$$ \mathrm{Re}\, \tau(v, \phi + \pi j/m) = \iint I(r, \theta_i + \pi) \cos[2\pi v r \cos(\theta_i + \pi - \phi - \pi j/m)]\, r \, dr \, d\theta_i $$
$$ = \iint I(r, \theta_i + \pi j/m) \cos[2\pi v r \cos(\theta_i - \phi - \pi j/m)]\, r \, dr \, d\theta_i $$
$$ = \mathrm{Re}\, \tau(v, \phi) . \tag{4-77} $$

Hence, $\mathrm{Re}\, \tau(v, \phi)$ is $2m$-fold symmetric.
Now consider the imaginary part given by Eq. (4-73b). Following the same
procedure as for the real part, we replace $I(r, \theta_i)$ by $I(r, \theta_i - \pi j/m)$ for even $j$ and write

$$ \mathrm{Im}\, \tau(v, \phi + \pi j/m) = \iint I(r, \theta_i - \pi j/m) \sin[2\pi v r \cos(\theta_i - \phi - \pi j/m)]\, r \, dr \, d\theta_i = \mathrm{Im}\, \tau(v, \phi) . \tag{4-78} $$

However, for odd $j$, we obtain

$$ \mathrm{Im}\, \tau(v, \phi + \pi j/m) = \iint I(r, \theta_i) \sin[2\pi v r \cos(\theta_i - \phi - \pi j/m)]\, r \, dr \, d\theta_i . \tag{4-79} $$

Again, changing the variable of integration from $\theta_i$ to $\theta_i + \pi$ and utilizing Eq. (4-68) for
odd $j$, we may write Eq. (4-79) as

$$ \mathrm{Im}\, \tau(v, \phi + \pi j/m) = \iint I(r, \theta_i + \pi) \sin[2\pi v r \cos(\theta_i + \pi - \phi - \pi j/m)]\, r \, dr \, d\theta_i $$
$$ = -\iint I(r, \theta_i + \pi j/m) \sin[2\pi v r \cos(\theta_i - \phi - \pi j/m)]\, r \, dr \, d\theta_i $$
$$ = -\mathrm{Im}\, \tau(v, \phi) . \tag{4-80} $$

Thus, the imaginary part does not change for even $j$, but its sign changes for odd $j$ without
changing its magnitude. Hence, the imaginary part is only $m$-fold symmetric.
However, when $m$ is even, the PSF is even about the origin, and, therefore, the
imaginary part of the OTF given by Eq. (4-72b) is zero (since its integrand is an odd
function). Accordingly, the OTF is real. Moreover, since the PSF is $2m$-fold symmetric in
this case, so is the OTF. Accordingly, the MTF, which is the modulus of the OTF, is
$2m$-fold symmetric whether $m$ is even or odd. Of course, when $m = 0$, i.e., for a radially
symmetric aberration, the OTF is real, radially symmetric, and equal to the MTF.
The symmetry properties of the various functions discussed above for a Zernike
polynomial aberration with $m$-fold symmetry varying as $\cos m\theta$ or $\sin m\theta$ are
summarized in Table 4-6, where NA stands for “not applicable.” Of course, for $m = 0$,
the interferogram, the PSF, and the OTF are all radially symmetric. In addition, the OTF
is real when $m$ is zero or even.
Table 4-6. Symmetry of interferogram, PSF, real and imaginary parts of OTF, and
MTF for an $m$-fold symmetric Zernike polynomial aberration varying as $\cos m\theta$ or
$\sin m\theta$.

m      Interferogram   PSF       Re OTF    Im OTF    MTF
Even   2m-fold         2m-fold   2m-fold   NA        2m-fold
Odd    2m-fold         m-fold    2m-fold   m-fold    2m-fold
4.9 ISOMETRIC, INTERFEROMETRIC, AND IMAGING CHARACTERISTICS OF CIRCLE POLYNOMIAL ABERRATIONS
The circle polynomial aberrations for $n \le 8$ are illustrated in three different but
equivalent ways in Figure 4-11 for a sigma value of one wave. For each polynomial
aberration, the isometric plot is shown at the top, the interferogram on the left, and the
PSF on the right. The peak-to-valley numbers of the aberrations are given, and the Strehl
ratio and examples of the OTF characteristics are illustrated for a sigma value of 0.1 wave
[14].
4.9.1 Isometric Characteristics

The isometric plot at the top illustrates the shape of an aberration polynomial, as
produced, for example, in a deformable mirror. The corresponding P-V aberration
numbers (in units of wavelength) are given in Table 4-7. From the form of the
polynomials given in Eqs. (4-45a) and (4-45b) for $m \ne 0$, these numbers are given by
$2\sqrt{2(n+1)}$, since $R_n^m(1) = 1$ and $\cos m\theta$ or $\sin m\theta$ varies by 2 from $-1$ to 1. When $m = 0$
and $n/2$ is even, as for the primary and tertiary spherical aberrations $Z_{11}$ and $Z_{37}$, the
P-V numbers are given by $(1 - b)\sqrt{n+1}$, where $b$ is the extreme negative value of $R_n^0(\rho)$
as $\rho$ varies between 0 and 1. However, when $m = 0$ and $n/2$ is odd, as for defocus $Z_4$
and secondary spherical aberration $Z_{22}$, $R_n^0(\rho)$ varies from $-1$ at $\rho = 0$ to 1 at $\rho = 1$, as
may be seen from Figure 4-10. The P-V numbers in this case are given by $2\sqrt{n+1}$. It
should be evident that the P-V numbers of two polynomials with the same values of $n$ and
$m$ are the same. The P-V numbers of a polynomial aberration representing the fabrication
errors give a measure of the depth of material to be removed in the fabrication process.
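The P-V numbers can be reproduced numerically from the radial polynomials. The sketch below uses the standard factorial series for $R_n^m(\rho)$ and samples it on $[0, 1]$; the function names `radial_polynomial` and `peak_to_valley` and the number of samples are assumptions made for this example, and the result is a sampled estimate rather than an exact extremum.

```python
import math

def radial_polynomial(n, m, rho):
    """Zernike radial polynomial R_n^m(rho) from its standard factorial series."""
    total = 0.0
    for s in range((n - m) // 2 + 1):
        num = (-1) ** s * math.factorial(n - s)
        den = (math.factorial(s)
               * math.factorial((n + m) // 2 - s)
               * math.factorial((n - m) // 2 - s))
        total += num / den * rho ** (n - 2 * s)
    return total

def peak_to_valley(n, m, samples=20001):
    """P-V value (in waves) of an orthonormal circle polynomial with a sigma of one wave."""
    values = [radial_polynomial(n, m, i / (samples - 1)) for i in range(samples)]
    if m == 0:
        # Radially symmetric: the range of R_n^0 scaled by sqrt(n + 1).
        return math.sqrt(n + 1) * (max(values) - min(values))
    # For m != 0 the angular factor sweeps between -1 and +1.
    return 2.0 * math.sqrt(2 * (n + 1)) * max(abs(v) for v in values)

pv_defocus = peak_to_valley(2, 0)      # Z4
pv_spherical = peak_to_valley(4, 0)    # Z11
pv_tertiary = peak_to_valley(8, 0)     # Z37
pv_coma = peak_to_valley(3, 1)         # Z7, Z8
```

The values obtained this way can be compared with the entries of Table 4-7 for $Z_4$, $Z_{11}$, $Z_{37}$, and $Z_7$ or $Z_8$.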
4.9.2 Interferometric Characteristics

The symmetry of an interferogram of a polynomial aberration, as in optical testing,
can be different from that of the aberration, because a fringe is formed independent of its
sign. For example, astigmatism $Z_6$ varying as $\cos 2\theta$ is 2-fold symmetric. It has the
implication that the aberration function does not change when it is rotated by $\pi$. Rotating
by $\pi/2$ yields an aberration of the same magnitude but with an opposite sign.
Accordingly, its interferogram is 4-fold symmetric. Thus, the fringes intersecting the $x$ axis
are formed by a positive aberration, and those intersecting the $y$ axis are formed by a
negative aberration. The number of fringes in an interferogram, which is equal to the
number of times the aberration changes by one wave as we move from the center to the
edges of the pupil, is different for the different polynomials. Each fringe represents a
contour of constant phase or aberration. The fringe is dark when the phase is an odd
multiple of $\pi$, or the aberration is an odd multiple of $\lambda/2$. In the case of tilts, for
example, the aberration changes by one wave four times, which is the same as the
peak-to-valley value of 4 waves. Hence, 4 straight line fringes symmetric about the center are
obtained. The x-tilt polynomial $Z_2$ yields vertical fringes, and the y-tilt polynomial $Z_3$
yields horizontal fringes. Similarly, defocus aberration $Z_4$ yields about 3.5 fringes. In the
case of spherical aberration $Z_{11}$, the aberration starts at a value of $\sqrt{5}$ waves, decreases
to zero, reaches a negative value of $-\sqrt{5}/2$ waves, and then increases to $\sqrt{5}$ waves.
Hence, the total number of times the aberration changes by unity is equal to 6.7, and
approximately seven circular fringes are obtained.

Figure 4-11. Zernike circle polynomials $Z_1$ through $Z_{45}$ shown as isometric plot on the top,
interferogram on the left, and PSF on the right for a sigma value of one wave.

Table 4-7. Peak-to-valley (P-V) numbers (in units of wavelength) of orthonormal
Zernike polynomial aberrations for a sigma value of one wave.

Poly.  P-V #                    Poly.  P-V #                    Poly.  P-V #
Z1     0                        Z16    $2\sqrt{12}$ = 6.928     Z31    8
Z2     4                        Z17    $2\sqrt{12}$ = 6.928     Z32    8
Z3     4                        Z18    $2\sqrt{12}$ = 6.928     Z33    8
Z4     $2\sqrt{3}$ = 3.464      Z19    $2\sqrt{12}$ = 6.928     Z34    8
Z5     $2\sqrt{6}$ = 4.899      Z20    $2\sqrt{12}$ = 6.928     Z35    8
Z6     $2\sqrt{6}$ = 4.899      Z21    $2\sqrt{12}$ = 6.928     Z36    8
Z7     $4\sqrt{2}$ = 5.657      Z22    $2\sqrt{7}$ = 5.292      Z37    4.286
Z8     $4\sqrt{2}$ = 5.657      Z23    $2\sqrt{14}$ = 7.483     Z38    $2\sqrt{18}$ = 8.485
Z9     $4\sqrt{2}$ = 5.657      Z24    $2\sqrt{14}$ = 7.483     Z39    $2\sqrt{18}$ = 8.485
Z10    $4\sqrt{2}$ = 5.657      Z25    $2\sqrt{14}$ = 7.483     Z40    $2\sqrt{18}$ = 8.485
Z11    $1.5\sqrt{5}$ = 3.354    Z26    $2\sqrt{14}$ = 7.483     Z41    $2\sqrt{18}$ = 8.485
Z12    $2\sqrt{10}$ = 6.325     Z27    $2\sqrt{14}$ = 7.483     Z42    $2\sqrt{18}$ = 8.485
Z13    $2\sqrt{10}$ = 6.325     Z28    $2\sqrt{14}$ = 7.483     Z43    $2\sqrt{18}$ = 8.485
Z14    $2\sqrt{10}$ = 6.325     Z29    8                        Z44    $2\sqrt{18}$ = 8.485
Z15    $2\sqrt{10}$ = 6.325     Z30    8                        Z45    $2\sqrt{18}$ = 8.485
4.9.3 PSF Characteristics

The PSF plots represent the images of a point object in the presence of a polynomial
aberration. The piston aberration represented by the Zernike polynomial $Z_1$ has no effect
on the image. Thus the PSF it yields is the Airy pattern given by Eq. (4-10). The full
width of a square displaying the PSFs in Figure 4-11 is $24\lambda F$.
The polynomial aberrations $Z_2$ and $Z_3$, representing the $x$ and $y$ wavefront tilts with
aberration coefficients $a_2$ and $a_3$, displace the PSF in the image plane along the $x$ and $y$
axes, respectively. If the coefficient $a_2$ is in units of wavelength, it corresponds to a
wavefront tilt angle of $4(\lambda/D)a_2$ about the $y$ axis and displaces the PSF along the $x$ axis
by $4\lambda F a_2$. Similarly, $a_3$ corresponds to a wavefront tilt angle of $4(\lambda/D)a_3$ about the $x$
axis and displaces the PSF by $4\lambda F a_3$ along the $y$ axis. The aberrated PSFs can be
obtained from Eq. (4-5). For astigmatism $Z_5$ and $Z_6$, $m = 2$, and the PSF is 4-fold
symmetric. For coma $Z_7$ and $Z_8$, $m = 1$, and the PSF is symmetric about the $y$ and the $x$ axis,
respectively. The polynomial $Z_{10}$ corresponds to $m = 3$; the aberration function is 3-fold
symmetric, but the interferogram is 6-fold symmetric. Since $m$ is odd, the PSF is also
3-fold symmetric.
The Strehl ratio for the first 45 circle polynomial aberrations with a sigma value of
0.1 wave is listed in Table 4-8 and plotted in Figure 4-12 on a nominal and an expanded
scale to clearly show the variation of their values. For the tilt polynomials $Z_2$ and $Z_3$, the
Strehl ratio simply represents the PSF value at a displaced point along the $x$ or the $y$ axis,
respectively. This displacement for a tilt aberration sigma of 0.1 wave is $0.4\lambda F$.
A closed-form expression for the Strehl ratio for the defocus circle polynomial $Z_4$
can be obtained from Eq. (4-18) by letting

$$ 2\pi W(\rho, \theta) = a_4 Z_4(\rho) . \tag{4-81} $$

The result obtained is

$$ S = \left[ \frac{\sin(\sqrt{3}\, a_4)}{\sqrt{3}\, a_4} \right]^2 . \tag{4-82} $$

For a defocus sigma of 0.1 wave, $a_4 = 0.2\pi$ and $S = 0.66255$, in agreement with the
result given in Table 4-8. Note that $a_4$ is the sigma value, which in turn is equal to
$B_d/2\sqrt{3}$, where $B_d$ is the peak value of the defocus aberration. Hence, Eq. (4-82) is the
same as Eq. (4-21). The amount of longitudinal defocus required to produce a certain
value of $a_4$, and therefore $B_d$, is given by Eq. (4-20).
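The closed-form defocus result can be cross-checked by direct quadrature. The sketch below assumes that the Strehl ratio is the squared modulus of the pupil average of $\exp(i\Phi)$, which for a radially symmetric phase reduces to a single radial integral; the function names and the number of quadrature points are choices made for this example. It also evaluates the small-aberration estimate $\exp(-\sigma_\Phi^2)$ for comparison.

```python
import cmath
import math

def strehl_defocus_closed_form(a4):
    """Strehl ratio [sin(sqrt(3)*a4) / (sqrt(3)*a4)]^2 for a defocus sigma a4 in radians."""
    x = math.sqrt(3.0) * a4
    return (math.sin(x) / x) ** 2 if x != 0.0 else 1.0

def strehl_numerical(phase, n_rho=2000):
    """|2 * integral_0^1 exp(i*Phi(rho)) rho d(rho)|^2 by the midpoint rule (assumed definition)."""
    d_rho = 1.0 / n_rho
    amp = 0.0 + 0.0j
    for i in range(n_rho):
        rho = (i + 0.5) * d_rho
        amp += cmath.exp(1j * phase(rho)) * rho
    return abs(2.0 * amp * d_rho) ** 2

a4 = 0.2 * math.pi   # defocus sigma of 0.1 wave, expressed in radians
s_closed = strehl_defocus_closed_form(a4)
s_numeric = strehl_numerical(lambda rho: a4 * math.sqrt(3.0) * (2.0 * rho**2 - 1.0))
s_approx = math.exp(-a4 ** 2)   # exp(-sigma_Phi^2)
```

The closed-form and numerical values agree closely, and the exponential estimate is slightly higher, as expected for an approximation that depends only on the variance.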
The results of Table 4-8 and Figure 4-12 illustrate that the Strehl ratio for a small
aberration is nearly independent of the type of the aberration and that it depends primarily
on its sigma value. It is approximately given by Eq. (4-22c) as $\exp(-\sigma_\Phi^2)$, or 0.67,
where $\sigma_\Phi = 0.2\pi$.

Table 4-8. Strehl ratio $S$ for Zernike circle polynomial aberrations with a sigma
value of 0.1 wave.

Poly.  S        Poly.  S        Poly.  S
Z1     1        Z16    0.673    Z31    0.674
Z2     0.665    Z17    0.673    Z32    0.674
Z3     0.665    Z18    0.674    Z33    0.680
Z4     0.663    Z19    0.674    Z34    0.680
Z5     0.671    Z20    0.692    Z35    0.705
Z6     0.671    Z21    0.692    Z36    0.705
Z7     0.669    Z22    0.668    Z37    0.670
Z8     0.669    Z23    0.673    Z38    0.674
Z9     0.678    Z24    0.673    Z39    0.674
Z10    0.678    Z25    0.677    Z40    0.676
Z11    0.666    Z26    0.677    Z41    0.676
Z12    0.672    Z27    0.698    Z42    0.684
Z13    0.672    Z28    0.698    Z43    0.684
Z14    0.685    Z29    0.675    Z44    0.711
Z15    0.685    Z30    0.675    Z45    0.711
4.9.4 OTF Characteristics

An image displacement of $\vec{r}_t$ due to a wavefront tilt produces a linearly varying
phase factor of $2\pi \vec{v} \cdot \vec{r}_t$ in the OTF, as may be seen from Eq. (1-10) by replacing $PSF(\vec{r})$
with the displaced PSF $PSF(\vec{r} - \vec{r}_t)$ and the OTF $\tau(\vec{v})$ by the corresponding OTF $\tau_t(\vec{v})$.
Of course, the phase factor, representing the phase transfer function, has no effect on the
MTF of the system.
The 3D MTF plots are shown in Figure 4-13 for the primary aberration polynomials
with a sigma value of 0.1 wave. The MTF for the piston aberration represents the
aberration-free MTF. It is included among the aberrated MTF plots by a solid line as a
reference. The symmetry of the MTFs is made more explicit by the contour plots shown
below each 3D MTF figure. The MTF value at the center of the contours is unity, and
the contours are drawn from the center out, starting with a value of 0.9 and ending with zero.
The tangential (long dashes), sagittal (medium dashes), and 45° (small dashes) MTF plots
are also shown in this figure, i.e., for the spatial frequency vector along the $x$ axis, $y$ axis,
and at 45° from the $x$ axis, respectively. Because of the 4-fold symmetry of the MTF in
the case of astigmatism, the tangential MTF is equal to the sagittal MTF. As expected
[3,8], the aberrated MTF is lower than the aberration-free MTF at all spatial frequencies
$0 < v < 1$, i.e., within the passband of the system.

Figure 4-12. Strehl ratio $S$ for Zernike circle polynomial aberrations $Z_j$ with a sigma
value of 0.1 wave, shown on a nominal scale as well as on an expanded scale.
Figure 4-13. 3D, tangential or along $x$ axis (in long dashes), sagittal or along $y$ axis
(in medium dashes), and at 45° from the $x$ axis (in small dashes) MTF plots for
Zernike circle polynomial aberrations with a sigma value of 0.1 wave. The solid
curve represents the aberration-free MTF. The spatial frequency $v$ is normalized
by the cutoff frequency $1/\lambda F$. The contour plots below each 3D MTF plot are in
steps of 0.1 from the center out, starting with 0.9 and ending with zero. Panels:
$Z_1$ (piston), $Z_4$ (defocus), $Z_6$ (primary astigmatism), $Z_8$ (primary coma), $Z_{10}$, and
$Z_{11}$ (primary spherical).
Figure 4-14a shows the symmetry of the real and the imaginary parts of the OTF for
coma Z 8 . The real part has even symmetry, but the imaginary part has odd symmetry.
The thick and thin contours of the imaginary part in both cases represent its positive and
negative values, respectively. The real and imaginary parts of the OTF for the aberration
Z10 are shown in Figure 4-14b. In addition to their even and odd symmetry, it shows that
the real part is 6-fold symmetric and the imaginary part is 3-fold symmetric, as expected
for a 3-fold symmetric aberration. Because of the odd symmetry of the imaginary part, its
integral over the spatial frequencies imaged by a system is zero, as expected from the
statement after Eq. (1-25).
Figure 4-14. Real and imaginary parts of the OTF for a Zernike polynomial
aberration with a sigma value of 0.1 wave. (a) $Z_8$ (primary coma) showing the even
and odd symmetry of the real and imaginary parts. (b) $Z_{10}$ showing the 6-fold
symmetry of the real part and 3-fold symmetry of the imaginary part, in addition to
their even and odd symmetry, respectively. The thick and thin contours of the
imaginary part in both cases represent its positive and negative values, respectively.
4.10 CIRCLE POLYNOMIALS AND THEIR RELATIONSHIPS WITH CLASSICAL ABERRATIONS
4.10.1 Introduction
It is seen from Eq. (1-18) that a classical aberration depends on the polar angle $\theta$ as
$\cos^m\theta$. However, a Zernike polynomial depends on the angle as $\cos m\theta$ (or $\sin m\theta$). By
expressing $\cos^m\theta$ as a series of $\cos m\theta$ terms, or $\cos m\theta$ as a power series of $\cos\theta$
terms, the coefficients of classical aberrations can be obtained from the Zernike
coefficients and vice versa [15,16]. We illustrate this for primary aberrations. The names
of some of the aberrations associated with the Zernike polynomials are given in Table 4-4.
They are a carryover from the names associated with the classical aberrations.
The Seidel aberrations are well known in optical design, where the optical system
has an axis of rotational symmetry with the consequence that the angle-dependent terms
are in the form of powers of $\cos\theta$. However, the measured aberrations of a system in
optical testing generally contain both the cosine and sine terms due to the assembly and
fabrication errors. We show how to define the effective Seidel coefficients in such cases.
We emphasize that the Seidel aberration coefficients determined from the primary
Zernike aberrations will be in error unless the higher-order terms that also contain Seidel
terms are negligible [16,17].
4.10.2 Wavefront Tilt and Defocus

The Zernike tilt aberration

$$ a_2 Z_2(\rho, \theta) = 2 a_2 \rho \cos\theta \tag{4-83} $$

represents a tilt of the wavefront about the $y$ axis by an angle $4(\lambda/D)a_2$, where the
aberration coefficient is in units of wavelength. It results in a displacement of the PSF
along the $x$ axis by $4\lambda F a_2$. Similarly, the Zernike tilt aberration

$$ a_3 Z_3(\rho, \theta) = 2 a_3 \rho \sin\theta \tag{4-84} $$

represents a tilt of the wavefront about the $x$ axis by an angle $4(\lambda/D)a_3$ and results in a
displacement of the PSF along the $y$ axis by $4\lambda F a_3$.
It should be evident that when the cosine and sine terms of a certain aberration are
present simultaneously, as in optical testing, their combination represents the aberration
whose orientation depends on the value of the component terms. For example, if both $x$
and $y$ Zernike tilts are present in the form

$$ W(\rho, \theta) = a_2 Z_2(\rho, \theta) + a_3 Z_3(\rho, \theta) \tag{4-85a} $$
$$ = 2 a_2 \rho \cos\theta + 2 a_3 \rho \sin\theta , \tag{4-85b} $$

it can be written

$$ W(\rho, \theta) = 2 (a_2^2 + a_3^2)^{1/2} \rho \cos[\theta - \tan^{-1}(a_3/a_2)] . \tag{4-86} $$

Thus, it represents a Zernike wavefront tilt aberration of magnitude $2(a_2^2 + a_3^2)^{1/2}$ about
an axis that is orthogonal to a line making an angle of $\tan^{-1}(a_3/a_2)$ with the $x$ axis. How
to decide the sign of the overall tilt and the value of its angle is discussed following Eq.
(4-69).
The Zernike tilt aberration $Z_2(\rho, \theta)$ is similar to the Seidel distortion in its $(\rho, \theta)$
dependence. Similarly, the Zernike defocus aberration $Z_4(\rho)$ varies with $\rho$ as the Seidel
field curvature varies with it. The constant term in $Z_4(\rho)$ makes its mean value across the
circular pupil zero, without changing its standard deviation.
4.10.3 Astigmatism
The Zernike primary astigmatism

$$ a_6 Z_6(\rho, \theta) = \sqrt{6}\, a_6 \rho^2 \cos 2\theta \tag{4-87} $$

is referred to as the 0° astigmatism. It consists of Seidel astigmatism $\rho^2\cos^2\theta$ balanced
with defocus aberration $\rho^2$ to yield minimum variance. It yields a uniform circular spot
diagram, but a line sagittal image along the $x$ axis (i.e., in a plane that zeroes out the
defocus part). The Zernike primary astigmatism

$$ a_5 Z_5(\rho, \theta) = \sqrt{6}\, a_5 \rho^2 \sin 2\theta \tag{4-88} $$

can be written

$$ a_5 Z_5(\rho, \theta) = \sqrt{6}\, a_5 \rho^2 \cos[2(\theta - \pi/4)] . \tag{4-89} $$

Comparing with Eq. (4-87), it is equivalent to changing $\theta$ to $\theta - \pi/4$, i.e., to rotating the
aberration by 45°. Accordingly, it is called the 45° astigmatism. The secondary Zernike
astigmatism given by

$$ a_{12} Z_{12}(\rho, \theta) = \sqrt{10}\, a_{12} (4\rho^4 - 3\rho^2) \cos 2\theta \tag{4-90} $$

does not yield a line image in any plane. However, it is referred to as the 0° astigmatism
in conformance with the corresponding primary astigmatism because of its variation with
$\theta$ as $\cos 2\theta$. Similarly, the name tertiary astigmatism in Table 4-4 can be explained.
If both 0° and 45° astigmatisms are present so that

$$ W(\rho, \theta) = a_6 Z_6(\rho, \theta) + a_5 Z_5(\rho, \theta) \tag{4-91a} $$
$$ = \sqrt{6}\, a_6 \rho^2 \cos 2\theta + \sqrt{6}\, a_5 \rho^2 \sin 2\theta , \tag{4-91b} $$

we may write it in the form

$$ W(\rho, \theta) = (a_5^2 + a_6^2)^{1/2} \sqrt{6}\, \rho^2 \cos\left\{ 2\left[ \theta - (1/2) \tan^{-1}(a_5/a_6) \right] \right\} , \tag{4-92} $$

showing that it is Zernike astigmatism of magnitude $(a_5^2 + a_6^2)^{1/2}$ at an angle of
$(1/2)\tan^{-1}(a_5/a_6)$.
It should be evident that there is ambiguity in determining astigmatism, because it
can be written in different but equivalent forms by separating defocus aberration from it.
For example, a 0° astigmatism can be written

$$ a_6 Z_6(\rho, \theta) = a_6 \left( \sqrt{6}\, \rho^2 \cos 2\theta \right) \tag{4-93a} $$
$$ = a_6 \sqrt{6} \left( 2\rho^2 \cos^2\theta - \rho^2 \right) \tag{4-93b} $$
$$ = a_6 \sqrt{6} \left( -2\rho^2 \sin^2\theta + \rho^2 \right) . \tag{4-93c} $$

It is clear that a 0° Zernike astigmatism given by Eq. (4-93a) can be written as a
combination of 0° positive Seidel astigmatism and a negative defocus, as in Eq. (4-93b),
or a 90° negative Seidel astigmatism and a positive defocus, as in Eq. (4-93c).
4.10.4 Coma
The Zernike coma terms $a_8 Z_8(\rho, \theta)$ and $a_7 Z_7(\rho, \theta)$ are called the $x$ and $y$ Zernike
comas. They represent classical coma $\rho^3\cos\theta$ or $\rho^3\sin\theta$ balanced with tilt $\rho\cos\theta$ or
$\rho\sin\theta$, respectively, to yield minimum variance. They yield PSFs that are symmetric
about the $x$ and $y$ axes, respectively. Similarly, the names for the secondary and tertiary
coma can be explained.
When both $x$- and $y$-Zernike comas are present, the aberration may be written

$$ W(\rho, \theta) = a_8 Z_8(\rho, \theta) + a_7 Z_7(\rho, \theta) \tag{4-94a} $$
$$ = \sqrt{8}\, a_8 (3\rho^3 - 2\rho) \cos\theta + \sqrt{8}\, a_7 (3\rho^3 - 2\rho) \sin\theta \tag{4-94b} $$
$$ = (a_7^2 + a_8^2)^{1/2} \sqrt{8}\, (3\rho^3 - 2\rho) \cos[\theta - \tan^{-1}(a_7/a_8)] , \tag{4-94c} $$

which is equivalent to a Zernike coma of magnitude $(a_7^2 + a_8^2)^{1/2}$ inclined at an angle of
$\tan^{-1}(a_7/a_8)$ with the $x$ axis.
4.10.5 Spherical Aberration

The Zernike spherical aberrations represent balanced classical spherical aberrations.
For example, the primary or Seidel spherical aberration varying as $\rho^4$ is balanced with
defocus varying as $\rho^2$ to yield $Z_{11}(\rho)$ representing the balanced primary spherical
aberration. As in the case of the Zernike defocus term $Z_4(\rho)$, the constant term in $Z_{11}(\rho)$
makes its mean value across the circular pupil zero. Similarly, the Zernike
secondary and tertiary spherical aberrations $Z_{22}$ and $Z_{37}$ also contain a constant term so
that their mean value is zero.
4.10.6 Seidel Coefficients from Zernike Coefficients

It should be noted that the wavefront tilt aberration given by Eq. (4-86) represents the
tilt aberration obtained from Zernike tilt aberrations. However, there are other Zernike
aberrations that also contain tilt aberration built into them, e.g., Zernike primary,
secondary, or tertiary coma. Similarly, the Seidel coma $3\sqrt{8}\,(a_7^2 + a_8^2)^{1/2}$ in Eq. (4-94c) at
an angle of $\tan^{-1}(a_7/a_8)$ is only from the primary Zernike comas. But the secondary and
tertiary Zernike comas also contain Seidel coma. Hence, only if the higher-order Zernike
comas are zero or negligible will the PSF aberrated by primary Zernike coma be
symmetric about a line making an angle of $\tan^{-1}(a_7/a_8)$ with the $x$ axis. Similarly, only
if the secondary and tertiary astigmatisms are zero or negligible is the Seidel astigmatism
$2\sqrt{6}\,(a_5^2 + a_6^2)^{1/2}$, as in Eq. (4-92). It yields an aberrated PSF that is symmetric about two
orthogonal axes, one of which is along a line that makes an angle of $(1/2)\tan^{-1}(a_5/a_6)$
with the $x$ axis.
To illustrate how a wrong Seidel coefficient can be inferred unless it is obtained from all of the significant Zernike terms that contain Seidel aberrations, we consider an axial image aberrated by one wave of secondary spherical aberration $\rho^6$. In terms of Zernike polynomials it will be written as
$$W(\rho) = a_{22} Z_{22}(\rho) + a_{11} Z_{11}(\rho) + a_4 Z_4(\rho) + a_1 Z_1(\rho) , \tag{4-95}$$
where
$$a_{22} = 1/(20\sqrt{7}) , \quad a_{11} = 1/(4\sqrt{5}) , \quad a_4 = 9/(20\sqrt{3}) , \quad a_1 = 1/4 . \tag{4-96}$$
If we infer the Seidel spherical aberration from only the primary Zernike aberration $a_{11}Z_{11}(\rho)$, its amount would be $6\sqrt{5}\,a_{11} = 1.5$ waves. Such a conclusion is obviously incorrect, because in reality the amount of Seidel spherical aberration is zero. Even if we expand the aberration function up to as many as the first 21 terms, we will still incorrectly conclude that the amount of Seidel spherical aberration is 1.5 waves. However, the Seidel spherical aberration will correctly reduce to zero when at least the first 22 terms are included in the expansion. For an off-axis image, there are angle-dependent aberrations, e.g., $Z_{14}$, that also contain Seidel aberrations. Hence, it is important that the expansion be carried out up to a certain number of terms such that any additional terms do not significantly change the mean square difference between the function and its estimate. Otherwise, the inferred Seidel aberrations will be erroneous.
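The pitfall described above can be checked numerically. The following is a minimal sketch, assuming the standard orthonormal forms of $Z_1$, $Z_4$, $Z_{11}$, and $Z_{22}$ (they are not restated in this section); the function and variable names are ours, chosen for illustration only.

```python
import math

# Standard orthonormal radial Zernike terms (assumed forms, Noll ordering).
def Z1(r):  return 1.0
def Z4(r):  return math.sqrt(3) * (2*r**2 - 1)
def Z11(r): return math.sqrt(5) * (6*r**4 - 6*r**2 + 1)
def Z22(r): return math.sqrt(7) * (20*r**6 - 30*r**4 + 12*r**2 - 1)

# Coefficients of Eq. (4-96) for one wave of rho^6.
a1, a4 = 1/4, 9/(20*math.sqrt(3))
a11, a22 = 1/(4*math.sqrt(5)), 1/(20*math.sqrt(7))

def expansion(r):
    """Right-hand side of Eq. (4-95)."""
    return a22*Z22(r) + a11*Z11(r) + a4*Z4(r) + a1*Z1(r)

# Largest deviation of the expansion from rho^6 on a radial grid.
max_err = max(abs(expansion(k/100) - (k/100)**6) for k in range(101))

# rho^4 content: Z11 contributes 6*sqrt(5)*a11, Z22 contributes -30*sqrt(7)*a22.
As_from_Z11_only = 6*math.sqrt(5)*a11
As_all_terms = As_from_Z11_only - 30*math.sqrt(7)*a22

print(max_err, As_from_Z11_only, As_all_terms)
```

The $\rho^4$ term carried by $Z_{22}$ cancels the one carried by $Z_{11}$, which is why the Seidel spherical aberration vanishes only when the 22nd term is included.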
If we approximate a certain aberration function by the primary Zernike aberrations only, we may write [16,17]
$$W(\rho,\theta) = \sum_{j=1}^{8} a_j Z_j(\rho,\theta) + a_{11} Z_{11}(\rho) \tag{4-97a}$$
$$= A_p + A_t\,\rho\cos(\theta - \beta_t) + A_d\,\rho^2 + A_a\,\rho^2\cos^2(\theta - \beta_a) + A_c\,\rho^3\cos(\theta - \beta_c) + A_s\,\rho^4 , \tag{4-97b}$$
where $A_p$ is the piston aberration, the other coefficients $A_i$ represent the peak value of the corresponding Seidel aberration term, and $\beta_i$ is the orientation angle of the Seidel aberration. They are given by
$$A_p = a_1 - \sqrt{3}\,a_4 + \sqrt{5}\,a_{11} , \tag{4-98a}$$
$$A_t = 2\left[(a_2 - \sqrt{8}\,a_8)^2 + (a_3 - \sqrt{8}\,a_7)^2\right]^{1/2} , \quad \beta_t = \tan^{-1}\!\left(\frac{a_3 - \sqrt{8}\,a_7}{a_2 - \sqrt{8}\,a_8}\right) , \tag{4-98b}$$
$$A_d = 2\left(\sqrt{3}\,a_4 - 3\sqrt{5}\,a_{11}\right) - A_a/2 , \tag{4-98c}$$
$$A_a = 2\sqrt{6}\,(a_5^2 + a_6^2)^{1/2} , \quad \beta_a = \frac{1}{2}\tan^{-1}(a_5/a_6) , \tag{4-98d}$$
$$A_c = 6\sqrt{2}\,(a_7^2 + a_8^2)^{1/2} , \quad \beta_c = \tan^{-1}(a_7/a_8) , \tag{4-98e}$$
and
$$A_s = 6\sqrt{5}\,a_{11} . \tag{4-98f}$$
As a note of caution, we add that the approximation of Eq. (4-97a) is good only when the higher-order Zernike aberrations that also contain Seidel aberration terms are negligible.
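The conversion of Eqs. (4-97) and (4-98) can be sketched as follows. This is an illustrative sketch, not a definitive implementation: the orthonormal forms of $Z_1$ to $Z_8$ and $Z_{11}$ are the standard ones (assumed here), the function names are ours, and `atan2` is used in place of $\tan^{-1}$ to resolve the quadrant. The defocus coefficient is obtained by collecting every $\rho^2$ term, which gives $A_d = 2(\sqrt{3}\,a_4 - 3\sqrt{5}\,a_{11}) - A_a/2$, consistent with inverting Eqs. (4-118c), (4-118d), and (4-118f).

```python
import math

S3, S5, S6, S8 = (math.sqrt(x) for x in (3, 5, 6, 8))

def zernike_sum(a, rho, th):
    """Eq. (4-97a); a maps the Noll index j to the coefficient a_j."""
    g = lambda j: a.get(j, 0.0)
    return (g(1)
            + 2*rho*(g(2)*math.cos(th) + g(3)*math.sin(th))
            + S3*(2*rho**2 - 1)*g(4)
            + S6*rho**2*(g(5)*math.sin(2*th) + g(6)*math.cos(2*th))
            + S8*(3*rho**3 - 2*rho)*(g(7)*math.sin(th) + g(8)*math.cos(th))
            + S5*(6*rho**4 - 6*rho**2 + 1)*g(11))

def seidel_from_zernike(a):
    """Seidel coefficients and orientation angles from primary Zernike terms."""
    g = lambda j: a.get(j, 0.0)
    tx, ty = g(2) - S8*g(8), g(3) - S8*g(7)
    Aa = 2*S6*math.hypot(g(5), g(6))
    return dict(
        Ap=g(1) - S3*g(4) + S5*g(11),
        At=2*math.hypot(tx, ty), bt=math.atan2(ty, tx),
        Ad=2*(S3*g(4) - 3*S5*g(11)) - Aa/2,
        Aa=Aa, ba=0.5*math.atan2(g(5), g(6)),
        Ac=6*math.sqrt(2)*math.hypot(g(7), g(8)), bc=math.atan2(g(7), g(8)),
        As=6*S5*g(11))

def seidel_sum(s, rho, th):
    """Eq. (4-97b)."""
    return (s["Ap"] + s["At"]*rho*math.cos(th - s["bt"]) + s["Ad"]*rho**2
            + s["Aa"]*rho**2*math.cos(th - s["ba"])**2
            + s["Ac"]*rho**3*math.cos(th - s["bc"]) + s["As"]*rho**4)

# Arbitrary test coefficients (hypothetical values, in waves).
a = {1: 0.3, 2: -0.2, 3: 0.5, 4: 0.1, 5: 0.4, 6: -0.3, 7: 0.25, 8: -0.15, 11: 0.2}
s = seidel_from_zernike(a)
worst = max(abs(zernike_sum(a, r/10, t*0.3) - seidel_sum(s, r/10, t*0.3))
            for r in range(11) for t in range(21))
print(worst)
```

The two forms agree pointwise across the pupil, which is a useful check on the signs and orientation angles before the Seidel values are reported.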
4.10.7 Strehl Ratio for Seidel Aberrations with and without Balancing
In Figure 4-12, we have shown the Strehl ratio for the circle polynomial aberrations with a sigma value of one-tenth of a wave. In Figure 4-15, we show how it varies with the sigma value of a Seidel aberration, with and without balancing (as in Tables 4-1 and 4-2), for $0 \le \sigma_W \le 0.25$. Also plotted, as the dashed curve, is the Strehl ratio obtained from the approximate expression $\exp(-\sigma_\Phi^2)$. As expected, the exponential expression yields a very good estimate of the Strehl ratio for $\sigma_W \le 0.1$. As $\sigma_W$ increases, the true Strehl ratio departs from its approximate value, except in the case of balanced astigmatism, for which the difference is quite small. The exponential expression overestimates the Strehl ratio in the case of defocus, balanced coma, and spherical aberration, but underestimates it for astigmatism and coma. Moreover, for a given value of sigma, the Strehl ratio for spherical aberration is exactly the same as for the balanced spherical aberration. The aberration coefficient and the P-V number for a certain value of $\sigma_W$ of these aberrations can be obtained from Table 4-9.
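As an illustration of the comparison described above, the defocus case can be computed directly. The sketch below assumes a uniformly illuminated circular pupil with defocus phase $B\rho^2$ (the symbol $B$ and the function names are ours), uses $\sigma_d = A_d/(2\sqrt{3})$ from Table 4-9, and evaluates the pupil integral by a simple midpoint rule.

```python
import cmath
import math

def strehl_defocus(sigma_w, n=2000):
    """Strehl ratio for defocus of standard deviation sigma_w (in waves):
    |2 * integral_0^1 exp(i*B*rho^2) rho d(rho)|^2, by the midpoint rule."""
    B = 2*math.pi * 2*math.sqrt(3)*sigma_w   # peak defocus phase in radians
    h = 1.0/n
    amp = sum(cmath.exp(1j*B*((k + 0.5)*h)**2) * (k + 0.5)*h for k in range(n)) * h
    return abs(2*amp)**2

def strehl_exp(sigma_w):
    """Approximate Strehl ratio exp(-sigma_Phi^2), sigma_Phi = 2*pi*sigma_w."""
    return math.exp(-(2*math.pi*sigma_w)**2)

for s in (0.05, 0.10, 0.25):
    print(s, round(strehl_defocus(s), 4), round(strehl_exp(s), 4))
```

For defocus the integral also has the closed form $[\sin(B/2)/(B/2)]^2$, which can be used to check the numerical result.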
4.11 ZERNIKE COEFFICIENTS OF A SCALED PUPIL

Given an aberration function across a circular pupil, its orthonormal Zernike
coefficients can be obtained from Eq. (4-48). Now we discuss how these coefficients
change when the size of the pupil is reduced, as when the aperture of a camera lens or the
pupil of a human eye (assuming it to be circular) is reduced due to an illumination
increase. We give two approaches. In one, we express a scaled Zernike radial polynomial
as a linear combination of the unscaled radial polynomials and utilize the orthogonal
property of the radial polynomials [18]. In the other, we use some known integrals [19].
[Figure 4-15 plots: Strehl ratio $S$ versus $\sigma_W$ from 0 to 0.25 in four panels labeled Defocus, Astigmatism, Coma, and Spherical.]
Figure 4-15. Strehl ratio as a function of the sigma value of a Seidel aberration with
and without balancing. (a) defocus, (b) astigmatism, (c) coma, and (d) spherical
aberration.
Table 4-9. Sigma value of a Seidel aberration with and without balancing, and P-V numbers for a sigma value of unity, where $A_i$ is the aberration coefficient.

Aberration                       Sigma                                           P-V # for $\sigma = 1$
Defocus                          $\sigma_d = A_d/(2\sqrt{3}) = A_d/3.46$         3.46
Astigmatism                      $\sigma_a = A_a/4$                              4
Balanced astigmatism             $\sigma_{ba} = A_a/(2\sqrt{6}) = A_a/4.90$      4.90
Coma                             $\sigma_c = A_c/(2\sqrt{2}) = A_c/2.83$         2.83
Balanced coma                    $\sigma_{bc} = A_c/(6\sqrt{2}) = A_c/8.49$      9.212
Spherical aberration             $\sigma_s = 2A_s/(3\sqrt{5}) = A_s/3.35$        3.35
Balanced spherical aberration    $\sigma_{bs} = A_s/(6\sqrt{5}) = A_s/13.42$     3.35

An alternate approach may also be considered [20]. It is perhaps worth noting that, in
practice, one will determine the Zernike coefficients of an aberration function of a system
from its interferometric data by using Eq. (4-58). The corresponding coefficients of a
scaled pupil can also be determined in the same manner by utilizing its data, i.e., by
excluding that data of the unscaled pupil that is not part of the scaled pupil. The result
obtained can be illustrated by considering a Seidel aberration function and writing it in
terms of the Zernike polynomials for both the unscaled and the scaled pupils.
4.11.1 Theory
Consider a circular pupil with its wave aberration function $W(\rho,\theta)$ expanded in terms of the orthonormal Zernike circle polynomials $Z_j(\rho,\theta)$, as in Eq. (4-57). For a corresponding scaled pupil with a normalized radius of $\epsilon \le 1$, as in Figure 4-16, the aberration function can be written from Eq. (4-57) in the form
$$W(\epsilon\rho,\theta) = \sum_j a_j Z_j(\epsilon\rho,\theta) . \tag{4-99}$$
Normalizing the smaller pupil to a unit circle, the aberration function across it can also be written in terms of the Zernike polynomials that are orthonormal over it in the form
$$\hat{W}(\rho,\theta) = \sum_{j'} b_{j'} Z_{j'}(\rho,\theta) , \tag{4-100}$$
where $\hat{W}(\rho,\theta) = W(\epsilon\rho,\theta)$ and the orthonormal coefficients $b_{j'}$ are given by
$$b_{j'} = \frac{1}{\pi}\int_0^1\!\!\int_0^{2\pi} \hat{W}(\rho,\theta)\,Z_{j'}(\rho,\theta)\,\rho\,d\rho\,d\theta , \tag{4-101}$$
or
$$b_{j'} = \frac{1}{\pi}\int_0^1\!\!\int_0^{2\pi} W(\epsilon\rho,\theta)\,Z_{j'}(\rho,\theta)\,\rho\,d\rho\,d\theta . \tag{4-102}$$
Figure 4-16. Scaled circular pupil, where the pupil radius is reduced from unity to $\epsilon$ by blocking the outer portion.
To obtain a coefficient $b_{j'}$ in terms of the coefficients $a_j$, we substitute Eq. (4-99) into Eq. (4-102) and obtain
$$b_{j'} = \frac{1}{\pi}\sum_j a_j \int_0^1\!\!\int_0^{2\pi} Z_j(\epsilon\rho,\theta)\,Z_{j'}(\rho,\theta)\,\rho\,d\rho\,d\theta . \tag{4-103}$$
From Eq. (4-46), the angular integration in Eq. (4-103) yields $\pi(1 + \delta_{m0})\,\delta_{mm'}$. Hence, we may write
$$b_{n',m} = \sqrt{2(n'+1)}\,\sum_n \sqrt{2(n+1)}\;a_{n,m} \int_0^1 R_n^m(\epsilon\rho)\,R_{n'}^m(\rho)\,\rho\,d\rho , \tag{4-104}$$
where we have replaced the single index $j$ by the corresponding double indices $n$ and $m$, and similarly replaced $j'$ by $n'$ and $m$ according to Eqs. (4-50) and (4-51).
The integral in Eq. (4-104) can be solved very simply by writing the radial polynomial $R_n^m(\epsilon\rho)$ in terms of the corresponding polynomials $R_{n'}^m(\rho)$ in the form [18]
$$R_n^m(\epsilon\rho) = \sum_{n'=m}^{n} h_{n'}(n;\epsilon)\,R_{n'}^m(\rho) , \tag{4-105}$$
where
$$h_{n'}(n;\epsilon) = (n'+1)\sum_s\sum_{s'} \frac{(-1)^s\,(n-s)!\;\epsilon^{\,n-2s}}{s!\;s'!\;(n'+s'+1)!} , \tag{4-106}$$
$s$ and $s'$ are nonnegative integers, and $n - n' = 2(s + s')$. Substituting Eq. (4-105) into Eq. (4-104) and utilizing Eq. (4-43) for the orthogonality of the radial polynomials, we obtain the intended result:
$$b_{n',m} = \sum_n \sqrt{\frac{n+1}{n'+1}}\;h_{n'}(n;\epsilon)\,a_{n,m} . \tag{4-107}$$
Since $n - n' \ge 0$ and even, $n = n', n'+2, \ldots$ . If $N$ is the highest order among the terms of the aberration function in Eq. (4-52), then the largest value of $n$ in Eq. (4-107) is $N$ or $N - 1$, depending on whether $N - m$ is even or odd, respectively. From Eq. (4-105), it is easy to show that
$$h_n(n;\epsilon) = \epsilon^n , \tag{4-108a}$$
$$h_{n-2}(n;\epsilon) = -(n-1)\,(1-\epsilon^2)\,\epsilon^{\,n-2} , \tag{4-108b}$$
$$h_{n-4}(n;\epsilon) = \frac{n-3}{2}\,(1-\epsilon^2)\,(n - 2 - n\epsilon^2)\,\epsilon^{\,n-4} , \tag{4-108c}$$
$$h_{n-6}(n;\epsilon) = -\frac{n-5}{6}\,(1-\epsilon^2)\left[(n-3)(n-4) - 2(n-1)(n-3)\epsilon^2 + n(n-1)\epsilon^4\right]\epsilon^{\,n-6} , \tag{4-108d}$$
$$h_{n-8}(n;\epsilon) = \frac{n-7}{2}\,\epsilon^{\,n-8}\,(1-\epsilon^2)\left[\frac{(n-4)(n-5)(n-6)}{12} - \frac{(n-2)(n-4)(n-5)}{4}\epsilon^2 + \frac{(n-1)(n-2)(n-4)}{4}\epsilon^4 - \frac{n(n-1)(n-2)}{12}\epsilon^6\right] , \text{ etc.} \tag{4-108e}$$
Equations (4-108a)–(4-108e) are sufficient to obtain the Zernike coefficients of the scaled pupil up to and including the eighth order. The expressions for $h_{n'}(n;\epsilon)$ for $n \le 8$ are listed in Table 4-9.
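The double sum of Eq. (4-106) is straightforward to evaluate. The following sketch (the function name `h` and the test value of $\epsilon$ are ours) uses the constraint $n - n' = 2(s + s')$ to reduce it to a single loop and compares the result with a few of the closed forms quoted above.

```python
from math import factorial

def h(n_prime, n, eps):
    """Expansion coefficient h_{n'}(n; eps) of Eq. (4-106).
    Since n - n' = 2(s + s'), only s is free; s' follows from it."""
    if n_prime > n or (n - n_prime) % 2:
        return 0.0
    half = (n - n_prime)//2
    total = 0.0
    for s in range(half + 1):
        sp = half - s
        total += ((-1)**s * factorial(n - s) * eps**(n - 2*s)
                  / (factorial(s) * factorial(sp) * factorial(n_prime + sp + 1)))
    return (n_prime + 1)*total

eps = 0.8
q = 1 - eps**2
checks = [
    (h(4, 4, eps), eps**4),                                        # Eq. (4-108a)
    (h(2, 4, eps), -3*eps**2*q),                                   # Eq. (4-108b)
    (h(0, 4, eps), q*(1 - 2*eps**2)),                              # Eq. (4-108c)
    (h(2, 8, eps), -eps**2*q*(10 - 35*eps**2 + 28*eps**4)),        # Eq. (4-108d)
    (h(0, 8, eps), q*(1 - 2*eps**2)*(1 - 7*eps**2 + 7*eps**4)),    # Eq. (4-108e)
]
for value, closed_form in checks:
    print(value, closed_form)
```

Evaluating the sum directly avoids having to extend the list of closed forms when orders above eight are needed.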
Since $h_{n'}(n';\epsilon) = \epsilon^{n'}$ from Eq. (4-108a), the first term in the summation is $\epsilon^{n'} a_{n'm}$. Moreover, for a given value of $n'$, the multiplier of a coefficient $a_{nm}$ is independent of $m$, regardless of whether it is a cosine or a sine polynomial. For example, when $n' = 4$, the b-coefficients are given by
$$b_{4,0} = h_4(4;\epsilon)\,a_{4,0} + \sqrt{7/5}\;h_4(6;\epsilon)\,a_{6,0} + \sqrt{9/5}\;h_4(8;\epsilon)\,a_{8,0} + \ldots , \tag{4-109a}$$
$$b_{4,2} = h_4(4;\epsilon)\,a_{4,2} + \sqrt{7/5}\;h_4(6;\epsilon)\,a_{6,2} + \sqrt{9/5}\;h_4(8;\epsilon)\,a_{8,2} + \ldots , \tag{4-109b}$$
and
$$b_{4,4} = h_4(4;\epsilon)\,a_{4,4} + \sqrt{7/5}\;h_4(6;\epsilon)\,a_{6,4} + \sqrt{9/5}\;h_4(8;\epsilon)\,a_{8,4} + \ldots . \tag{4-109c}$$
As $\epsilon \to 1$, all the multipliers vanish except that of $a_{n'm}$, which approaches unity and yields the expected result $b_{n',m} = a_{n',m}$.
The integral in Eq. (4-104) can also be evaluated by using the relationship [21]
$$R_n^m(\rho) = (-1)^{(n-m)/2}\int_0^\infty J_{n+1}(r)\,J_m(\rho r)\,dr \tag{4-110}$$
to rewrite $R_n^m(\epsilon\rho)$, where $J_n(\cdot)$ is the nth-order Bessel function of the first kind. Thus, we obtain after interchanging the integrals,
$$\int_0^1 R_n^m(\epsilon\rho)\,R_{n'}^m(\rho)\,\rho\,d\rho = (-1)^{(n-m)/2}\int_0^\infty J_{n+1}(r)\left[\int_0^1 R_{n'}^m(\rho)\,J_m(\epsilon\rho r)\,\rho\,d\rho\right]dr$$
$$= (-1)^{(n+n'-2m)/2}\int_0^\infty J_{n+1}(r)\,\frac{J_{n'+1}(\epsilon r)}{\epsilon r}\,dr$$
$$= \frac{1}{2(n'+1)}\left[R_n^{n'}(\epsilon) - R_n^{n'+2}(\epsilon)\right] , \tag{4-111}$$
where we have sequentially used the relationships

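Table 4-9. Expansion coefficients $h_{n'}(n;\epsilon)$ given by Eq. (4-106) for $n \le 8$.

n    n'    $h_{n'}(n;\epsilon)$
0    0     $1$
1    1     $\epsilon$
2    0     $-(1-\epsilon^2)$
2    2     $\epsilon^2$
3    1     $-2\epsilon(1-\epsilon^2)$
3    3     $\epsilon^3$
4    0     $(1-\epsilon^2)(1-2\epsilon^2)$
4    2     $-3\epsilon^2(1-\epsilon^2)$
4    4     $\epsilon^4$
5    1     $\epsilon(1-\epsilon^2)(3-5\epsilon^2)$
5    3     $-4\epsilon^3(1-\epsilon^2)$
5    5     $\epsilon^5$
6    0     $-(1-\epsilon^2)(1-5\epsilon^2+5\epsilon^4)$
6    2     $3\epsilon^2(1-\epsilon^2)(2-3\epsilon^2)$
6    4     $-5\epsilon^4(1-\epsilon^2)$
6    6     $\epsilon^6$
7    1     $-2\epsilon(1-\epsilon^2)(2-8\epsilon^2+7\epsilon^4)$
7    3     $2\epsilon^3(1-\epsilon^2)(5-7\epsilon^2)$
7    5     $-6\epsilon^5(1-\epsilon^2)$
7    7     $\epsilon^7$
8    0     $(1-\epsilon^2)(1-2\epsilon^2)(1-7\epsilon^2+7\epsilon^4)$
8    2     $-\epsilon^2(1-\epsilon^2)(10-35\epsilon^2+28\epsilon^4)$
8    4     $5\epsilon^4(1-\epsilon^2)(3-4\epsilon^2)$
8    6     $-7\epsilon^6(1-\epsilon^2)$
8    8     $\epsilon^8$
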
$$\int_0^1 R_{n'}^m(\rho)\,J_m(\epsilon\rho r)\,\rho\,d\rho = (-1)^{(n'-m)/2}\left[\frac{J_{n'+1}(\epsilon r)}{\epsilon r}\right] , \tag{4-112a}$$
$$\frac{J_{n+1}(r)}{r} = \frac{J_n(r) + J_{n+2}(r)}{2(n+1)} , \tag{4-112b}$$
and Eq. (4-110). Substituting Eq. (4-111) into Eq. (4-104), we obtain
$$b_{n'm} = \sum_n \sqrt{\frac{n+1}{n'+1}}\;a_{nm}\left[R_n^{n'}(\epsilon) - R_n^{n'+2}(\epsilon)\right] . \tag{4-113}$$
The equivalence of Eqs. (4-107) and (4-113) can be established by expanding the scaled radial polynomial in terms of the orthogonal radial polynomials in the form
$$R_n^m(\epsilon\rho) = \sum_{n'=m}^{n} a_{n'}(n;\epsilon)\,R_{n'}^m(\rho) , \tag{4-114}$$
where, using the orthogonality of the radial polynomials, an expansion coefficient given by
$$a_{n'}(n;\epsilon) = 2(n'+1)\int_0^1 R_n^m(\epsilon\rho)\,R_{n'}^m(\rho)\,\rho\,d\rho \tag{4-115}$$
is the same as $h_{n'}(n;\epsilon)$, as may be seen by comparing Eqs. (4-105) and (4-114).
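The equivalence can also be checked numerically. The sketch below assumes the standard explicit sum for the radial polynomial $R_n^m(\rho)$ (with $R_n^m(1) = 1$); the function names and the chosen indices are ours. It evaluates the integral of Eq. (4-115) by the midpoint rule and compares it with the bracketed difference in Eq. (4-113).

```python
from math import factorial

def R(n, m, rho):
    """Zernike radial polynomial R_n^m(rho); zero unless n - m is even and >= 0."""
    if m > n or (n - m) % 2:
        return 0.0
    return sum((-1)**s * factorial(n - s)
               / (factorial(s) * factorial((n + m)//2 - s) * factorial((n - m)//2 - s))
               * rho**(n - 2*s)
               for s in range((n - m)//2 + 1))

def coeff_by_integral(n_prime, n, m, eps, steps=4000):
    """Eq. (4-115): 2(n'+1) * integral_0^1 R_n^m(eps*rho) R_n'^m(rho) rho d(rho)."""
    h = 1.0/steps
    total = sum(R(n, m, eps*(k + 0.5)*h) * R(n_prime, m, (k + 0.5)*h) * (k + 0.5)*h
                for k in range(steps))
    return 2*(n_prime + 1)*total*h

def coeff_by_difference(n_prime, n, eps):
    """Bracket of Eq. (4-113): R_n^{n'}(eps) - R_n^{n'+2}(eps)."""
    return R(n, n_prime, eps) - R(n, n_prime + 2, eps)

eps = 0.8
for m in (0, 2, 4):
    print(m, coeff_by_integral(4, 6, m, eps), coeff_by_difference(4, 6, eps))
```

The integral gives the same value for every allowed $m$, in agreement with the remark that the multiplier of a coefficient is independent of $m$.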
4.11.2 Application to a Seidel Aberration Function

As an example of the use of Eq. (4-107), we consider a Seidel aberration function [16]
$$W(\rho,\theta) = A_t\,\rho\cos\theta + A_d\,\rho^2 + A_a\,\rho^2\cos^2\theta + A_c\,\rho^3\cos\theta + A_s\,\rho^4 , \tag{4-116}$$
where a Seidel coefficient $A_i$ represents the peak value of a Seidel aberration. It can be written in terms of the Zernike polynomials in the form
$$W(\rho,\theta) = a_{0,0}Z_0^0 + a_{1,1}Z_1^1 + a_{2,0}Z_2^0 + a_{2,2}Z_2^2 + a_{3,1}Z_3^1 + a_{4,0}Z_4^0$$
$$= a_1Z_1 + a_2Z_2 + a_4Z_4 + a_6Z_6 + a_8Z_8 + a_{11}Z_{11} , \tag{4-117}$$
where the argument $(\rho,\theta)$ of the orthonormal Zernike polynomials $Z_n^m$ is omitted for brevity, and the Zernike coefficients are given by
$$a_{0,0} \equiv a_1 = \frac{A_d}{2} + \frac{A_a}{4} + \frac{A_s}{3} , \tag{4-118a}$$
$$a_{1,1} \equiv a_2 = \frac{A_t}{2} + \frac{A_c}{3} , \tag{4-118b}$$
$$a_{2,0} \equiv a_4 = \frac{A_d}{2\sqrt{3}} + \frac{A_a}{4\sqrt{3}} + \frac{A_s}{2\sqrt{3}} , \tag{4-118c}$$
$$a_{2,2} \equiv a_6 = \frac{A_a}{2\sqrt{6}} , \tag{4-118d}$$
$$a_{3,1} \equiv a_8 = \frac{A_c}{6\sqrt{2}} , \tag{4-118e}$$
and
$$a_{4,0} \equiv a_{11} = \frac{A_s}{6\sqrt{5}} . \tag{4-118f}$$
Moreover, it is evident that the highest order among the aberrations is $N = 4$. The aberration variance in terms of the Zernike coefficients is given by
$$\sigma^2 = a_{1,1}^2 + a_{2,0}^2 + a_{2,2}^2 + a_{3,1}^2 + a_{4,0}^2 \tag{4-119a}$$
$$= a_2^2 + a_4^2 + a_6^2 + a_8^2 + a_{11}^2 . \tag{4-119b}$$
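For a scaled pupil, the aberration function $\hat{W}(\rho,\theta) = W(\epsilon\rho,\theta)$ can be written in the form
$$\hat{W}(\rho,\theta) = b_{0,0}Z_0^0 + b_{1,1}Z_1^1 + b_{2,0}Z_2^0 + b_{2,2}Z_2^2 + b_{3,1}Z_3^1 + b_{4,0}Z_4^0 \tag{4-120a}$$
$$= b_1Z_1 + b_2Z_2 + b_4Z_4 + b_6Z_6 + b_8Z_8 + b_{11}Z_{11} , \tag{4-120b}$$
where, from Eq. (4-107) and utilizing the h-coefficients given in Table 4-9, the Zernike coefficients are given by
$$b_{0,0} = h_0(0;\epsilon)\,a_{0,0} + \sqrt{3}\,h_0(2;\epsilon)\,a_{2,0} + \sqrt{5}\,h_0(4;\epsilon)\,a_{4,0} = a_{0,0} - \sqrt{3}\,(1-\epsilon^2)\,a_{2,0} + \sqrt{5}\,(1-\epsilon^2)(1-2\epsilon^2)\,a_{4,0} ,$$
or
$$b_1 = a_1 - \sqrt{3}\,(1-\epsilon^2)\,a_4 + \sqrt{5}\,(1-\epsilon^2)(1-2\epsilon^2)\,a_{11} , \tag{4-121a}$$
$$b_{1,1} = h_1(1;\epsilon)\,a_{1,1} + \sqrt{2}\,h_1(3;\epsilon)\,a_{3,1} = \epsilon\left[a_{1,1} - 2\sqrt{2}\,(1-\epsilon^2)\,a_{3,1}\right] ,$$
or
$$b_2 = \epsilon\left[a_2 - 2\sqrt{2}\,(1-\epsilon^2)\,a_8\right] , \tag{4-121b}$$
$$b_{2,0} = h_2(2;\epsilon)\,a_{2,0} + \sqrt{5/3}\;h_2(4;\epsilon)\,a_{4,0} = \epsilon^2\left[a_{2,0} - \sqrt{15}\,(1-\epsilon^2)\,a_{4,0}\right] ,$$
or
$$b_4 = \epsilon^2\left[a_4 - \sqrt{15}\,(1-\epsilon^2)\,a_{11}\right] , \tag{4-121c}$$
$$b_{2,2} = h_2(2;\epsilon)\,a_{2,2} = \epsilon^2 a_{2,2} ,$$
or
$$b_6 = \epsilon^2 a_6 , \tag{4-121d}$$
$$b_{3,1} = h_3(3;\epsilon)\,a_{3,1} = \epsilon^3 a_{3,1} ,$$
or
$$b_8 = \epsilon^3 a_8 , \tag{4-121e}$$
and
$$b_{4,0} = h_4(4;\epsilon)\,a_{4,0} = \epsilon^4 a_{4,0} ,$$
or
$$b_{11} = \epsilon^4 a_{11} . \tag{4-121f}$$
The aberration variance for the scaled pupil is given by
$$\sigma^2 = b_2^2 + b_4^2 + b_6^2 + b_8^2 + b_{11}^2 . \tag{4-122}$$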
It is easy to verify that the Zernike coefficients obtained in Eqs. (4-121a)–(4-121f) are indeed correct by writing the Seidel aberration function for the scaled pupil and determining its Zernike coefficients. From Eq. (4-116), the aberration function of the scaled pupil can be written
$$W(\epsilon\rho,\theta) = A_t\,\epsilon\rho\cos\theta + A_d\,\epsilon^2\rho^2 + A_a\,\epsilon^2\rho^2\cos^2\theta + A_c\,\epsilon^3\rho^3\cos\theta + A_s\,\epsilon^4\rho^4 . \tag{4-123}$$
It can also be written, with $\hat{W}(\rho,\theta) = W(\epsilon\rho,\theta)$, as
$$\hat{W}(\rho,\theta) = A_t'\,\rho\cos\theta + A_d'\,\rho^2 + A_a'\,\rho^2\cos^2\theta + A_c'\,\rho^3\cos\theta + A_s'\,\rho^4 , \tag{4-124}$$
where
$$A_t' = \epsilon A_t , \quad A_d' = \epsilon^2 A_d , \quad A_a' = \epsilon^2 A_a , \quad A_c' = \epsilon^3 A_c , \quad \text{and} \quad A_s' = \epsilon^4 A_s . \tag{4-125}$$
Writing Eq. (4-124) in terms of Zernike polynomials, as was done in obtaining Eq. (4-117) from Eq. (4-116), it is easy to see that the Zernike coefficients thus obtained are the same as the corresponding coefficients given by Eqs. (4-121a)–(4-121f).
4.11.3 Numerical Example

If each Seidel aberration coefficient in Eq. (4-116) is unity (e.g., one wave), then the corresponding Zernike coefficients in Eq. (4-117) for the full pupil are given by
$$a_1 = 13/12 , \quad a_2 = 5/6 , \quad a_4 = 5/(4\sqrt{3}) , \quad a_6 = 1/(2\sqrt{6}) , \quad a_8 = 1/(6\sqrt{2}) , \quad a_{11} = 1/(6\sqrt{5}) . \tag{4-126}$$
Substituting Eq. (4-126) into Eq. (4-119b), the variance of the aberration function is given by $\sigma^2 = 919/720$, or its standard deviation is given by $\sigma = 1.1298$. For a pupil scaled with $\epsilon = 0.8$, the Zernike coefficients in Eq. (4-120b) are given by
$$b_1 = 0.6165 , \quad b_2 = 0.5707 , \quad b_4 = 0.3954 , \quad b_6 = 0.1306 , \quad b_8 = 0.0603 , \quad b_{11} = 0.0305 . \tag{4-127}$$
Substituting Eq. (4-127) into Eq. (4-122), the aberration variance and standard deviation for the scaled pupil are given by
$$\sigma^2 = 0.5036 \tag{4-128}$$
and
$$\sigma = 0.7097 , \tag{4-129}$$
respectively.
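The numbers in this example can be reproduced with a few lines of code. The sketch below (function names are ours, for illustration) computes the full-pupil coefficients from Eq. (4-118), scales them with Eq. (4-121), and cross-checks the result by first scaling the Seidel coefficients as in Eq. (4-125).

```python
import math

def zernike_from_seidel(At, Ad, Aa, Ac, As):
    """Eq. (4-118): orthonormal Zernike coefficients keyed by Noll index j."""
    return {1: Ad/2 + Aa/4 + As/3,
            2: At/2 + Ac/3,
            4: (Ad/2 + Aa/4 + As/2)/math.sqrt(3),
            6: Aa/(2*math.sqrt(6)),
            8: Ac/(6*math.sqrt(2)),
            11: As/(6*math.sqrt(5))}

def scale(a, eps):
    """Eq. (4-121): Zernike coefficients of the pupil scaled by eps."""
    q = 1 - eps**2
    return {1: a[1] - math.sqrt(3)*q*a[4] + math.sqrt(5)*q*(1 - 2*eps**2)*a[11],
            2: eps*(a[2] - 2*math.sqrt(2)*q*a[8]),
            4: eps**2*(a[4] - math.sqrt(15)*q*a[11]),
            6: eps**2*a[6],
            8: eps**3*a[8],
            11: eps**4*a[11]}

def variance(c):
    """Eqs. (4-119b) and (4-122): sum of squares, excluding piston (j = 1)."""
    return sum(v*v for j, v in c.items() if j != 1)

eps = 0.8
a = zernike_from_seidel(1, 1, 1, 1, 1)
b = scale(a, eps)
# Cross-check: scale the Seidel coefficients first, Eq. (4-125).
b_direct = zernike_from_seidel(eps, eps**2, eps**2, eps**3, eps**4)

print({j: round(v, 4) for j, v in b.items()})
print(round(variance(a), 4), round(variance(b), 4), round(math.sqrt(variance(b)), 4))
```

Both routes give the same b-coefficients, which is the verification described in Section 4.11.2.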
We have thus demonstrated how to analytically obtain the Zernike coefficients of an

aberration function of a scaled pupil in terms of their values for a corresponding unscaled
pupil. It is perhaps worth noting that, in practice, one will determine the Zernike
coefficients of an aberration function of a system from its interferometric data by using
Eq. (4-58). The corresponding coefficients of a scaled pupil can also be determined in the
same manner by utilizing its data, i.e., by excluding that data of the unscaled pupil that is
not part of the scaled pupil.
4.12 SUMMARY
The aberration-free PSF, called the Airy pattern, is shown in Figure 4-2. It consists of a bright central spot of radius $1.22\lambda F$, called the Airy disc, containing 83.8% of the total light, surrounded by the diffraction rings. The corresponding OTF shown in Figure 4-4 starts at a value of unity and decreases monotonically to zero at the cutoff frequency $1/\lambda F$. Since the Strehl ratio for a small aberration increases with a decrease in the aberration variance, we explicitly consider the balancing of primary aberrations with lower-order aberrations. As seen from Tables 4-1 and 4-2, the sigma value of primary spherical aberration when balanced with defocus, primary coma balanced with tilt, and primary astigmatism balanced with defocus, is reduced by a factor of 4, 3, and $\sqrt{6}/2$, respectively. Accordingly, the aberration tolerance for a given Strehl ratio increases by the same factor.
The Zernike circle polynomials are in widespread use for the analysis of circular
wavefronts because of their orthogonality over a unit circle and their representation of the
balanced classical aberrations for systems with circular pupils. The polynomials are
described by three indices: j is a polynomial ordering number, n represents the radial
degree or the order of a polynomial, and m represents its azimuthal frequency. The
polynomials are ordered such that an even j corresponds to a cosine polynomial and an
odd j corresponds to a sine polynomial. A polynomial with a lower value of n is ordered

first, and, for a given value of n, a polynomial with a lower value of m is ordered first.
The expressions for the polynomials through the eighth order are given in polar
coordinates in Table 4-4 and in Cartesian coordinates in Table 4-5 in the orthonormal
form so that each expansion coefficient (except piston) of an aberration function
represents the sigma value of the corresponding polynomial term.
Only the cosine circle polynomials are needed to represent the aberration function of a rotationally symmetric system. However, both cosine and sine polynomials are needed to represent fabrication errors, or the aberrations introduced by atmospheric turbulence. A circle polynomial aberration varying as $\cos m\theta$ or $\sin m\theta$ is m-fold symmetric. However, its interferogram is 2m-fold symmetric. The PSF is m-fold symmetric when m is odd, and 2m-fold symmetric when m is even, unless m = 0, in which case it is radially symmetric, like the aberration itself. These symmetry properties (along with those of the OTF) are summarized in Table 4-6. The PSFs for two polynomial aberrations with the same n and m values and the same sigma value but different angular dependence as $\cos m\theta$ and $\sin m\theta$ are the same except that one is rotated by an angle $\pi/2m$ with respect to the other. If two such polynomial aberrations are present simultaneously with sigma values $a_j$ and $b_j$, then the orientation of the interferogram, PSF, and OTF changes by an angle $(1/m)\tan^{-1}(b_j/a_j)$.
The circle polynomials for $n \le 8$ are illustrated in Figure 4-11 by an isometric plot, an interferogram, and a PSF for a sigma value of one wave. The corresponding P-V numbers are given in Table 4-7. The Strehl ratio for a sigma value of $0.1\lambda$ for each polynomial aberration is given in Table 4-8 and plotted in Figure 4-12, illustrating that, for a small aberration, its value can be estimated from the aberration variance regardless of the aberration type.
The OTF is complex with real and imaginary parts (or MTF and PTF) for odd m, but it is real for even m. For m = 0, the OTF is real and radially symmetric. The real part of the OTF is 2m-fold symmetric whether m is odd or even. However, its imaginary part is m-fold symmetric for odd m, though its magnitude (i.e., if we ignore its sign) is 2m-fold symmetric. Accordingly, the MTF is 2m-fold symmetric whether m is even or odd. The MTF for primary aberrations and $Z_{10}$, and the real and imaginary parts of the OTF for coma and $Z_{10}$, are given for a sigma value of 0.1 wave in Figures 4-13 and 4-14, respectively.
The determination of the effective Seidel or primary aberration coefficients from the
corresponding coefficients of the cosine and sine polynomials is demonstrated in Section
4.9. It is emphasized that these coefficients cannot be obtained from only the primary
Zernike aberrations, but must also include the primary aberrations in the higher-order
Zernike terms. How to obtain the Zernike coefficients of a certain aberration function
when the diameter of the pupil is reduced from its nominal value is discussed in Section
4.11.
References
1. F. Zernike, “Diffraction theory of knife-edge test and its improved form, the phase contrast method,” Mon. Not. R. Astron. Soc. 94, 377–384 (1934).
2. R. J. Noll, “Zernike polynomials and atmospheric turbulence,” J. Opt. Soc. Am. 66, 207–211 (1976).
3. B. R. A. Nijboer, “The diffraction theory of optical aberrations. Part II: Diffraction pattern in the presence of small aberrations,” Physica 13, 605–620 (1947).
4. M. Born and E. Wolf, Principles of Optics, 7th ed. (Cambridge University Press, New York, 1999).
5. V. N. Mahajan, Optical Imaging and Aberrations, Part II: Wave Diffraction Optics, 2nd ed. (SPIE Press, Bellingham, Washington, 2011).
6. V. N. Mahajan, “Zernike polynomials and aberration balancing,” Proc. SPIE 5173, 1–17 (2003).
7. V. N. Mahajan, “Strehl ratio for primary aberrations in terms of their aberration variance,” J. Opt. Soc. Am. 73, 860–861 (1983).
8. Lord Rayleigh, Phil. Mag. (5) 8, 403 (1879); also in his Scientific Papers (Dover, New York, 1964) Vol. 1, p. 432.
9. V. N. Mahajan, “Strehl ratio for primary aberrations: some analytical results for circular and annular pupils,” J. Opt. Soc. Am. 72, 1258–1266 (1982); Errata, 10, 2092 (1993).
10. V. N. Mahajan, “Line of sight of an aberrated optical system,” J. Opt. Soc. Am. A 2, 833–846 (1985).
11. W. B. King, “Dependence of the Strehl ratio on the magnitude of the variance of the wave aberration,” J. Opt. Soc. Am. 58, 655–661 (1968).
12. A. B. Bhatia and E. Wolf, “On the circle polynomials of Zernike and related orthogonal sets,” Proc. Cambridge Philos. Soc. 50, 40–48 (1954).
13. V. N. Mahajan, “Symmetry properties of aberrated point-spread functions,” J. Opt. Soc. Am. A 11, 1993–2003 (1994).
14. V. N. Mahajan and José A. Díaz, “Imaging characteristics of Zernike and annular polynomial aberrations,” Appl. Opt. 52, 2062–2074 (2013).
15. V. N. Mahajan, Optical Imaging and Aberrations, Part I: Ray Geometrical Optics (SPIE Press, Bellingham, Washington, Second Printing 2001).
16. J. C. Wyant and K. Creath, “Basic wavefront aberration theory for optical metrology,” Applied Optics and Optical Engineering, XI, 1–53 (1992). Note that the polynomials used in this work are not in their orthonormal form, and are ordered differently as well.
17. V. N. Mahajan and W. H. Swantner, “Seidel coefficients in optical testing,” Asian J. Phys. 15, 203–209 (2006).
18. V. N. Mahajan, “Zernike coefficients of a scaled pupil,” Appl. Opt. 49, 5374–5377 (2010).
19. A. J. E. M. Janssen and P. Dirksen, “Concise formula for the Zernike coefficients of scaled pupils,” J. Microlith., Microfab., Microsyst. 5, 030501 (2006).
20. J. A. Díaz, J. Fernández-Dorado, C. Pizarro, and J. Arasa, “Zernike coefficients for concentric circular scaled pupils: an equivalent expression,” J. Mod. Opt. 56, 149–155 (2009).
21. B. R. A. Nijboer, “The Diffraction Theory of Aberrations,” Thesis, University of Groningen, The Netherlands (1942).
CHAPTER 5
SYSTEMS WITH ANNULAR PUPILS
5.1 Introduction
5.2 Aberration-Free Imaging
5.2.1 PSF
5.2.2 OTF
5.3 Strehl Ratio and Aberration Balancing
5.4 Orthonormalization of Circle Polynomials over an Annulus
5.5 Annular Polynomials
5.6 Annular Coefficients of an Annular Aberration Function
5.7 Strehl Ratio for Annular Polynomial Aberrations
Annular Polynomial Aberrations
5.9 Summary
References
Chapter 5
Systems with Annular Pupils
5.1 INTRODUCTION
An important example of an imaging system with a noncircular pupil is that of a system with an annular pupil. The two-mirror astronomical telescopes represent systems with annular pupils. Examples of such telescopes, with their linear obscuration ratios given in parentheses, are the 200-inch telescope at Mount Palomar (0.36), the 84-inch telescope at the Kitt Peak observatory (0.37), the telescope at the McDonald Observatory (0.5), and the Hubble Space Telescope (0.33 when using the Wide-Field Planetary Camera).
We start this chapter with a brief discussion of how the obscuration affects the
aberration-free PSF and OTF of a circular pupil. We then consider its effect on the Strehl
ratio of primary aberrations, their balancing, and tolerances with and without balancing.
Next we obtain the polynomials that are orthonormal over an annular pupil by
orthogonalizing the Zernike circle polynomials by the procedure outlined in Chapter 3.
The annular polynomials are given in terms of the Zernike circle polynomials, and in both
polar and Cartesian coordinates. They are also related to the balanced aberrations. The
aberrated PSFs and OTFs are illustrated for the annular polynomial aberrations.

5.2 ABERRATION-FREE IMAGING
5.2.1 PSF
Figure 5-1 illustrates a unit annular pupil with outer and inner radii of 1 and $\epsilon$, i.e., a pupil with a linear obscuration ratio of $\epsilon$. Thus, if $(\rho,\theta)$ are the coordinates of a point on the pupil, then $\epsilon \le \rho \le 1$ and $0 \le \theta \le 2\pi$. The PSF, Strehl ratio, and the OTF of a system with an annular pupil can be obtained from the equations given in Section 2.2 in the same manner as for a system with a circular pupil. The significant difference lies in replacing the lower limit 0 of the radial integration by the obscuration ratio $\epsilon$ of the annular pupil. Thus, Eq. (4-3) for the aberrated PSF for an aberration $\Phi(\rho,\theta;\epsilon)$ is replaced by
$$I(r,\theta_i;\epsilon) = \frac{1}{\pi^2(1-\epsilon^2)^2}\left|\int_\epsilon^1\!\!\int_0^{2\pi} \exp[i\Phi(\rho,\theta;\epsilon)]\,\exp[-\pi i r\rho\cos(\theta_i - \theta)]\,\rho\,d\rho\,d\theta\right|^2 , \tag{5-1}$$
where $(r,\theta_i)$ are the polar coordinates of a point in the image plane, $r$ is in units of $\lambda F$, and $F = R/D$ is the focal ratio of the image-forming light cone. The PSF is normalized to unity at the center by the aberration-free central irradiance $\pi P_{ex}(1-\epsilon^2)/4\lambda^2F^2$. It is smaller than the corresponding central value for a circular pupil by a factor of $(1-\epsilon^2)^2$, since both the pupil area and the power $P_{ex}$ are each smaller by a factor of $(1-\epsilon^2)$.
Figure 5-1. Unit annulus of obscuration ratio $\epsilon$, representing the ratio of its inner and outer radii.
The aberration-free PSF is given by [1,2]
$$I(r;\epsilon) = \frac{1}{(1-\epsilon^2)^2}\left[\frac{2J_1(\pi r)}{\pi r} - \epsilon^2\,\frac{2J_1(\pi\epsilon r)}{\pi\epsilon r}\right]^2 . \tag{5-2}$$
The effect of the obscuration is twofold. First, there is a loss of light in the image that increases with increasing $\epsilon$. Second, the radius of the central bright spot decreases and contains less and less light, while more and more light appears in the diffraction rings. As $\epsilon \to 1$, the PSF approaches $J_0^2(\pi r)$, and the central bright spot radius decreases to 0.76 compared to a value of 1.22 for a circular pupil. The irradiance distribution $I$ of the PSF and its encircled power $P$ are shown in Figure 5-2 for several typical values of the obscuration ratio. The 2D PSF is shown in Figure 5-3 for obscuration ratios of 0.5 and 0.8. For large obscuration ratios, such as 0.8, the PSF consists of groups of diffraction rings.
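The shrinking of the central bright spot can be illustrated with Eq. (5-2). The sketch below is self-contained: as an implementation choice of ours, it evaluates $J_1$ from its integral representation $J_1(x) = (1/\pi)\int_0^\pi \cos(\tau - x\sin\tau)\,d\tau$ rather than from a library, and it locates the first dark ring by bisection within a bracket that we assume contains only that zero.

```python
import math

def J1(x, n=64):
    """Bessel function J_1(x) from its integral representation (midpoint rule)."""
    h = math.pi/n
    return sum(math.cos((k + 0.5)*h - x*math.sin((k + 0.5)*h)) for k in range(n))/n

def jinc(x):
    """2*J_1(x)/x, with the limiting value 1 at x = 0."""
    return 1.0 if x == 0 else 2*J1(x)/x

def amplitude(r, eps):
    """Bracketed amplitude in Eq. (5-2); r is in units of lambda*F."""
    return jinc(math.pi*r) - eps**2*jinc(math.pi*eps*r)

def psf(r, eps):
    """Aberration-free annular PSF of Eq. (5-2), unity at r = 0."""
    return (amplitude(r, eps)/(1 - eps**2))**2

def first_dark_ring(eps, r_lo=0.5, r_hi=1.3, iters=60):
    """Bisection on the amplitude, which changes sign at the first zero."""
    for _ in range(iters):
        mid = 0.5*(r_lo + r_hi)
        if amplitude(r_lo, eps)*amplitude(mid, eps) <= 0:
            r_hi = mid
        else:
            r_lo = mid
    return 0.5*(r_lo + r_hi)

for eps in (0.0, 0.25, 0.5, 0.75, 0.99):
    print(eps, round(first_dark_ring(eps), 3))
```

The radius falls from about 1.22 for a circular pupil toward about 0.76 as the obscuration ratio approaches unity, in line with the limits quoted above.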
[Figure 5-2 plot: irradiance $I(r)$ and encircled power $P(r_c)$, on a vertical scale from 0 to 1, versus $r$ and $r_c$ from 0 to 3.0, with curves labeled $\epsilon$ = 0, 0.25, 0.50, and 0.75.]
Figure 5-2. The irradiance and encircled power distributions for various values of the obscuration ratio $\epsilon$.
Figure 5-3. 2D aberration-free PSF of a system with an annular pupil having an obscuration ratio of (a) 0.5 and (b) 0.8.
5.2.2 OTF
The aberration-free OTF, representing the Fourier transform of the corresponding PSF given by Eq. (5-2) [3], or the fractional overlap area of two unit annular circles separated by a distance $\lambda R v_i$, is given by [1,4]
$$\tau(v;\epsilon) = \frac{1}{1-\epsilon^2}\left[\tau(v) + \epsilon^2\,\tau(v/\epsilon) - \tau_{12}(v;\epsilon)\right] , \quad 0 \le v \le 1 , \tag{5-3}$$
where $\tau(v)$ is given by Eq. (4-15) and represents the OTF of the system if there were no obscuration, $v = \lambda F v_i$ is a normalized radial spatial frequency as in the case of a circular pupil (since the obscuration has no effect on the cutoff frequency $1/\lambda F$), and
$$\tau_{12}(v;\epsilon) = 2\epsilon^2 , \quad 0 \le v \le (1-\epsilon)/2 \tag{5-4a}$$
$$= (2/\pi)\left(\theta_1 + \epsilon^2\theta_2 - 2v\sin\theta_1\right) , \quad (1-\epsilon)/2 \le v \le (1+\epsilon)/2 \tag{5-4b}$$
$$= 0 , \quad \text{otherwise} . \tag{5-4c}$$
In Eq. (5-4b), the angles $\theta_1$ and $\theta_2$ are given by
$$\cos\theta_1 = \frac{4v^2 + 1 - \epsilon^2}{4v} \tag{5-5a}$$
and
$$\cos\theta_2 = \frac{4v^2 - 1 + \epsilon^2}{4\epsilon v} , \tag{5-5b}$$
respectively. It is evident from Eq. (5-3) that $\tau(v;\epsilon) > \tau(v)$, by a factor of $(1-\epsilon^2)^{-1}$, at least for spatial frequencies $(1+\epsilon)/2 < v < 1$. This is illustrated in Figure 5-4 for the same values of $\epsilon$ as the PSFs in Figure 5-2. The OTF decreases at the low and mid spatial frequencies and increases at the high. This is the spatial frequency analog of the increased light in the diffraction rings and a smaller central bright spot.
[Figure 5-4 plot: $\tau(v;\epsilon)$, from 0 to 1, versus $v$ from 0 to 1, with curves labeled $\epsilon$ = 0, 0.25, 0.50, and 0.75.]
Figure 5-4. OTF of an aberration-free system with an annular pupil of obscuration ratio $\epsilon$.
The radial integral of the aberration-free OTF is given by
$$\int_0^1 \tau(v;\epsilon)\,v\,dv = (1-\epsilon^2)/8 . \tag{5-6}$$
Its slope at the origin is given by
$$\tau'(0;\epsilon) = -\frac{4}{\pi(1-\epsilon)} . \tag{5-7}$$
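Equations (5-3) to (5-7) can be exercised with a short script. The sketch below assumes the standard circular-pupil OTF $\tau(v) = (2/\pi)[\cos^{-1}v - v(1-v^2)^{1/2}]$ for Eq. (4-15), which is not restated in this section; the function names are ours, and the routine requires $0 < \epsilon < 1$.

```python
import math

def tau_circ(v):
    """OTF of an unobscured circular pupil (assumed form of Eq. (4-15))."""
    if v >= 1:
        return 0.0
    return (2/math.pi)*(math.acos(v) - v*math.sqrt(1 - v*v))

def tau12(v, eps):
    """Cross term of Eqs. (5-4a)-(5-4c)."""
    if v <= (1 - eps)/2:
        return 2*eps**2
    if v >= (1 + eps)/2:
        return 0.0
    clamp = lambda c: max(-1.0, min(1.0, c))   # guard against rounding
    t1 = math.acos(clamp((4*v*v + 1 - eps**2)/(4*v)))
    t2 = math.acos(clamp((4*v*v - 1 + eps**2)/(4*eps*v)))
    return (2/math.pi)*(t1 + eps**2*t2 - 2*v*math.sin(t1))

def tau_annular(v, eps):
    """Eq. (5-3) for 0 <= v <= 1 and 0 < eps < 1."""
    return (tau_circ(v) + eps**2*tau_circ(v/eps) - tau12(v, eps))/(1 - eps**2)

def radial_integral(eps, n=20000):
    """Left-hand side of Eq. (5-6) by the midpoint rule."""
    h = 1.0/n
    return sum(tau_annular((k + 0.5)*h, eps)*(k + 0.5)*h for k in range(n))*h

eps = 0.5
print(radial_integral(eps), (1 - eps**2)/8)
print((tau_annular(1e-4, eps) - 1)/1e-4, -4/(math.pi*(1 - eps)))
```

The first line compares the numerical integral with Eq. (5-6), and the second compares a finite-difference slope at the origin with Eq. (5-7).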
5.3 STREHL RATIO AND ABERRATION BALANCING

Letting $r = 0$ in Eq. (5-1), we obtain the Strehl ratio of an image:
$$S \equiv I(0;\epsilon) = \frac{1}{\pi^2(1-\epsilon^2)^2}\left|\int_\epsilon^1\!\!\int_0^{2\pi} \exp[i\Phi(\rho,\theta;\epsilon)]\,\rho\,d\rho\,d\theta\right|^2 . \tag{5-8}$$
The approximate value of the Strehl ratio can be obtained from the aberration variance
$$\sigma_\Phi^2 = \langle\Phi^2\rangle - \langle\Phi\rangle^2 \tag{5-9}$$
according to Eq. (1-34), where
$$\langle\Phi^n\rangle = \frac{1}{\pi(1-\epsilon^2)}\int_\epsilon^1\!\!\int_0^{2\pi} \Phi^n(\rho,\theta;\epsilon)\,\rho\,d\rho\,d\theta , \tag{5-10}$$
with $n = 1$ and 2, respectively. Table 5-1 gives the form as well as the standard deviation $\sigma_\Phi$ of a primary aberration.
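Table 5-1. Primary aberrations and their standard deviations for a system with a uniformly illuminated annular pupil of obscuration ratio $\epsilon$.

Aberration                   $\Phi(\rho,\theta)$           $\sigma_\Phi$
Spherical                    $A_s\rho^4$                   $(4 - \epsilon^2 - 6\epsilon^4 - \epsilon^6 + 4\epsilon^8)^{1/2} A_s/(3\sqrt{5})$
Coma                         $A_c\rho^3\cos\theta$         $(1 + \epsilon^2 + \epsilon^4 + \epsilon^6)^{1/2} A_c/(2\sqrt{2})$
Astigmatism                  $A_a\rho^2\cos^2\theta$       $(1 + \epsilon^4)^{1/2} A_a/4$
Field curvature (defocus)    $A_d\rho^2$                   $(1 - \epsilon^2) A_d/(2\sqrt{3})$
Distortion (tilt)            $A_t\rho\cos\theta$           $(1 + \epsilon^2)^{1/2} A_t/2$
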
For a small aberration, we balance a classical aberration with one or more aberrations of lower order to minimize its variance and thereby maximize the corresponding Strehl ratio. Thus, for example, we balance spherical aberration with defocus, as in Chapter 4, and write it as
$$\Phi(\rho;\epsilon) = A_s\rho^4 + B_d\rho^2 . \tag{5-11}$$
We determine the amount of defocus $B_d$ such that the variance $\sigma_\Phi^2$ is minimized; i.e., we calculate $\sigma_\Phi^2$ and let
$$\frac{\partial\sigma_\Phi^2}{\partial B_d} = 0 \tag{5-12}$$
to determine $B_d$. Proceeding in this manner, we find that the optimum value is $B_d = -(1+\epsilon^2)A_s$. The corresponding standard deviation is $(1-\epsilon^2)^2 A_s/(6\sqrt{5})$.
Astigmatism and coma aberrations can be treated similarly. Table 5-2 lists the form of a balanced primary aberration and its standard deviation. Also listed in the table is the location of the diffraction focus, i.e., the point with respect to which the aberration variance is minimum so that the Strehl ratio at it is maximum. We note that in the case of coma, the balancing aberration is a wavefront tilt whose amount depends on $\epsilon$. Thus, maximum Strehl ratio is obtained at a point that is displaced from the Gaussian image point but lies in the Gaussian image plane. In the case of astigmatism, the amount of balancing defocus is independent of $\epsilon$. The higher-order classical aberrations can be balanced in a similar manner.
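The balancing of spherical aberration with defocus over an annulus can be reproduced numerically. In the sketch below (function names, search bracket, and step counts are our own choices), the variance of Eq. (5-9) is computed by midpoint integration of Eq. (5-10) and minimized with respect to $B_d$ by a golden-section search, with $A_s = 1$.

```python
import math

def variance_sph_defocus(Bd, eps, As=1.0, n=2000):
    """Variance of As*rho^4 + Bd*rho^2 over the annulus eps <= rho <= 1,
    from Eqs. (5-9) and (5-10); the angular integral contributes 2*pi."""
    h = (1 - eps)/n
    m1 = m2 = 0.0
    for k in range(n):
        rho = eps + (k + 0.5)*h
        phi = As*rho**4 + Bd*rho**2
        m1 += phi*rho
        m2 += phi*phi*rho
    norm = 2*h/(1 - eps**2)
    return m2*norm - (m1*norm)**2

def best_defocus(eps, lo=-3.0, hi=0.0, iters=60):
    """Golden-section search for the Bd that minimizes the variance."""
    g = (math.sqrt(5) - 1)/2
    c = hi - g*(hi - lo)
    d = lo + g*(hi - lo)
    for _ in range(iters):
        if variance_sph_defocus(c, eps) < variance_sph_defocus(d, eps):
            hi, d = d, c
            c = hi - g*(hi - lo)
        else:
            lo, c = c, d
            d = lo + g*(hi - lo)
    return 0.5*(lo + hi)

eps = 0.5
Bd = best_defocus(eps)
print(Bd, -(1 + eps**2))
print(math.sqrt(variance_sph_defocus(Bd, eps)), (1 - eps**2)**2/(6*math.sqrt(5)))
```

Because the variance is quadratic in $B_d$, the search is not strictly necessary; it is used here only to show that the quoted optimum is indeed the minimum.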
Figure 5-5 shows how the standard deviation of an aberration, for a given value of the aberration coefficient $A_i$, varies with the obscuration ratio of the pupil. In Figures 5-5a and 5-5b, the amounts of defocus and tilt required to minimize the variance of spherical aberration and coma, respectively, are also shown. We observe from these figures that the standard deviation of spherical and balanced spherical aberrations and
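Table 5-2. Balanced primary aberrations, their standard deviation, and diffraction focus.

| Aberration | $\Phi(\rho,\theta;\epsilon)$ | $\sigma_\Phi$ | Diffraction focus |
|---|---|---|---|
| Balanced spherical | $A_s\left[\rho^4-(1+\epsilon^2)\rho^2\right]$ | $(1-\epsilon^2)^2A_s/(6\sqrt{5})$ | $\left[0,\,0,\,8(1+\epsilon^2)F^2A_s\right]$ |
| Balanced coma | $A_c\left[\rho^3-\dfrac{2}{3}\dfrac{1+\epsilon^2+\epsilon^4}{1+\epsilon^2}\rho\right]\cos\theta$ | $\dfrac{(1-\epsilon^2)(1+4\epsilon^2+\epsilon^4)^{1/2}}{6\sqrt{2}\,(1+\epsilon^2)^{1/2}}A_c$ | $\left[\dfrac{4(1+\epsilon^2+\epsilon^4)}{3(1+\epsilon^2)}FA_c,\,0,\,0\right]$ |
| Balanced astigmatism | $A_a\rho^2\left(\cos^2\theta-\tfrac{1}{2}\right)$ | $(1+\epsilon^2+\epsilon^4)^{1/2}A_a/(2\sqrt{6})$ | $(0,\,0,\,4F^2A_a)$ |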
[Figure 5-5: five panels plotting $\sigma_\Phi/A_i$ against $\epsilon$ from 0 to 1. Panel (a) also plots the balancing defocus factor $(1+\epsilon^2)$ and panel (b) the balancing tilt factor $2(1+\epsilon^2+\epsilon^4)/3(1+\epsilon^2)$.]
Figure 5-5. Variation of standard deviation of a primary and a balanced primary aberration with obscuration ratio $\epsilon$. Variation of balancing defocus in the case of spherical aberration and tilt in the case of coma are also shown. (a) Spherical aberration, (b) coma, (c) astigmatism, (d) defocus, and (e) tilt.
defocus decreases as $\epsilon$ increases. Correspondingly, the tolerance in terms of their aberration coefficients $A_s$ and $B_d$, for a given Strehl ratio, increases. Thus, for example, the depth of focus for a certain value of the Strehl ratio increases as $\epsilon$ increases. The standard deviation of coma, astigmatism, balanced astigmatism, and tilt increases as $\epsilon$ increases. The standard deviation of balanced coma first slightly increases, achieves its
maximum value at $\epsilon = 0.29$, and then decreases rapidly as $\epsilon$ increases. The factor by which the standard deviation of an aberration is reduced by balancing it with another aberration increases in the case of spherical aberration and coma, but decreases in the case of astigmatism, as $\epsilon$ increases, as may be verified from the expressions in Tables 5-1 and 5-2.
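The closed forms in Tables 5-1 and 5-2 can be evaluated directly to see how the reduction factor and the balanced-coma standard deviation vary with $\epsilon$. The following is a minimal sketch; the function names `sigma_ratios` and `balanced_coma_sigma` are hypothetical, and unit aberration coefficients are assumed.

```python
import numpy as np

def sigma_ratios(eps):
    """Unbalanced/balanced sigma for spherical, coma, astigmatism (unit coefficients)."""
    e2 = np.asarray(eps, dtype=float) ** 2
    sph = np.sqrt(4 - e2 - 6 * e2**2 - e2**3 + 4 * e2**4) / (3 * np.sqrt(5))
    sph_bal = (1 - e2) ** 2 / (6 * np.sqrt(5))
    coma = np.sqrt(1 + e2 + e2**2 + e2**3) / (2 * np.sqrt(2))
    coma_bal = balanced_coma_sigma(eps)
    ast = np.sqrt(1 + e2**2) / 4
    ast_bal = np.sqrt(1 + e2 + e2**2) / (2 * np.sqrt(6))
    return sph / sph_bal, coma / coma_bal, ast / ast_bal

def balanced_coma_sigma(eps):
    """Standard deviation of balanced coma from Table 5-2 (A_c = 1)."""
    e2 = np.asarray(eps, dtype=float) ** 2
    return (1 - e2) * np.sqrt(1 + 4 * e2 + e2**2) / (6 * np.sqrt(2) * np.sqrt(1 + e2))

# Locate the maximum of the balanced-coma sigma on a fine grid of eps values.
eps_grid = np.linspace(0.0, 0.9, 901)
eps_peak = eps_grid[np.argmax(balanced_coma_sigma(eps_grid))]

for e in (0.0, 0.25, 0.5, 0.75):
    print(e, [float(np.round(r, 3)) for r in sigma_ratios(e)])
print("balanced coma sigma peaks near eps =", eps_peak)
```

At $\epsilon = 0$ the three ratios reduce to the circular-pupil values 4, 3, and $\sqrt{6}/2$.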
5.4 ORTHONORMALIZATION OF CIRCLE POLYNOMIALS OVER AN ANNULUS
The polynomials $A_j(\rho,\theta;\epsilon)$ orthonormal over a unit annulus of obscuration ratio $\epsilon$ can be obtained recursively from the Zernike circle polynomials $Z_j(\rho,\theta)$, starting with $A_1 = 1$ (omitting the arguments for brevity) from Eq. (3-18) according to [5–7]
$$A_{j+1} = N_{j+1}\left[Z_{j+1} - \sum_{k=1}^{j}\langle Z_{j+1}A_k\rangle A_k\right], \tag{5-13}$$
where $N_{j+1}$ is a normalization constant so that the polynomials are orthonormal. The angular brackets indicate a mean value over the annulus. Thus,
$$\langle Z_{j+1}A_k\rangle = \frac{1}{\pi(1-\epsilon^2)}\int_\epsilon^1\int_0^{2\pi} Z_{j+1}A_k\,\rho\,d\rho\,d\theta. \tag{5-14}$$
The orthonormality of the polynomials implies that
$$\langle A_jA_{j'}\rangle = \frac{1}{\pi(1-\epsilon^2)}\int_\epsilon^1\int_0^{2\pi} A_jA_{j'}\,\rho\,d\rho\,d\theta = \delta_{jj'}. \tag{5-15}$$
Now a circle polynomial $Z_j$ varies with angle $\theta$ as $\cos m\theta$ or $\sin m\theta$ depending on whether j is even or odd. It is radially symmetric when m = 0. Because of the orthogonal properties of $\cos m\theta$ and $\sin m\theta$ over a period of 0 to $2\pi$ [see Eq. (4-46)], the polynomials $A_k$ that contribute to the sum in Eq. (5-13) must also have the same angular dependence as that of the polynomial $Z_{j+1}$. Hence, the polynomial $A_{j+1}$ will also have the same angular dependence. Thus, an annular polynomial $A_j$ is separable in polar coordinates $\rho$ and $\theta$, and differs from the corresponding circle polynomial only in its radial dependence. Given the form of the circle polynomials by Eqs. (4-45a)–(4-45c), the annular polynomials can accordingly be written [1]
$$A_{\text{even } j}(\rho,\theta;\epsilon) = \sqrt{2(n+1)}\,R_n^m(\rho;\epsilon)\cos m\theta, \quad m \neq 0, \tag{5-16a}$$
$$A_{\text{odd } j}(\rho,\theta;\epsilon) = \sqrt{2(n+1)}\,R_n^m(\rho;\epsilon)\sin m\theta, \quad m \neq 0, \tag{5-16b}$$
$$A_j(\rho,\theta;\epsilon) = \sqrt{n+1}\,R_n^0(\rho;\epsilon), \quad m = 0, \tag{5-16c}$$
where n and m are positive integers (including zero), $n - m \geq 0$ and even, and $R_n^m(\rho;\epsilon)$ is an annular radial polynomial.
Substituting Eqs. (5-16a)–(5-16c) into Eq. (5-15), we find that the annular radial polynomials obey the orthogonality condition
$$\int_\epsilon^1 R_n^m(\rho;\epsilon)\,R_{n'}^m(\rho;\epsilon)\,\rho\,d\rho = \frac{1-\epsilon^2}{2(n+1)}\,\delta_{nn'}. \tag{5-17}$$
In the two-index n and m representation $A_n^m(\rho,\theta;\epsilon)$ of an annular polynomial, Eq. (5-13) can be written
$$A_n^m = N_n^m\left[Z_n^m - \sum_{i=1}^{(n-m)/2}\langle Z_n^mA_{n-2i}^m\rangle A_{n-2i}^m\right], \tag{5-18}$$
where $N_n^m$ replaces the normalization constant $N_j$ and, as in Eq. (5-13), the angular brackets indicate a mean value over the unit annulus. Substituting Eqs. (5-16a)–(5-16c) into Eq. (5-18), we find that the annular radial polynomials are given by
$$R_n^m(\rho;\epsilon) = N_n^m\left[R_n^m(\rho) - \sum_{i\geq 1}^{(n-m)/2}(n-2i+1)\,\langle R_n^m(\rho)R_{n-2i}^m(\rho;\epsilon)\rangle\,R_{n-2i}^m(\rho;\epsilon)\right], \tag{5-19}$$
where
$$\langle R_n^m(\rho)R_{n'}^m(\rho;\epsilon)\rangle = \frac{2}{1-\epsilon^2}\int_\epsilon^1 R_n^m(\rho)\,R_{n'}^m(\rho;\epsilon)\,\rho\,d\rho. \tag{5-20}$$
Thus, $R_n^m(\rho;\epsilon)$ is a radial polynomial of degree n in $\rho$ containing terms in $\rho^n$, $\rho^{n-2}$, …, and $\rho^m$ with coefficients that depend on $\epsilon$. The radial polynomials are even or odd in $\rho$ depending on whether n (or m) is even or odd.
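The recursion of Eq. (5-19) can be sketched as a Gram–Schmidt process on polynomial coefficient arrays. This is an illustrative implementation under stated assumptions: the circle radial polynomials are generated from the standard factorial formula (assumed to agree with the definition in Chapter 4), and the function names are hypothetical.

```python
import numpy as np
from math import factorial
from numpy.polynomial import polynomial as P

def zernike_radial(n, m):
    """Coefficients (ascending powers of rho) of the circle radial polynomial R_n^m(rho)."""
    c = np.zeros(n + 1)
    for s in range((n - m) // 2 + 1):
        c[n - 2 * s] = ((-1) ** s * factorial(n - s)
                        / (factorial(s) * factorial((n + m) // 2 - s)
                           * factorial((n - m) // 2 - s)))
    return c

def annulus_inner(p, q, eps):
    """Integral of p(rho) q(rho) rho d(rho) from eps to 1, evaluated exactly."""
    anti = P.polyint(P.polymul(P.polymul(p, q), [0.0, 1.0]))
    return P.polyval(1.0, anti) - P.polyval(eps, anti)

def annular_radial(n, m, eps):
    """Coefficients of R_n^m(rho; eps), normalized as in Eq. (5-17)."""
    basis = []
    for k in range(m, n + 1, 2):
        r = zernike_radial(k, m)
        for b in basis:  # remove components along lower-order annular polynomials
            r = P.polysub(r, annulus_inner(r, b, eps) / annulus_inner(b, b, eps) * b)
        target = (1.0 - eps ** 2) / (2.0 * (k + 1))  # right side of Eq. (5-17)
        basis.append(r * np.sqrt(target / annulus_inner(r, r, eps)))
    return basis[-1]

eps = 0.5
print(annular_radial(3, 1, eps))  # coefficients of rho^0 .. rho^3 for R_3^1(rho; eps)
```

For $n = 3$, $m = 1$ the result can be compared with the closed form $[3(1+\epsilon^2)\rho^3 - 2(1+\epsilon^2+\epsilon^4)\rho]/B$ of Eq. (5-26).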
For m = 0, the annular radial polynomials are equal to the Legendre polynomials $P_n(\cdot)$ according to
$$R_{2n}^0(\rho;\epsilon) = P_n\left[\frac{2(\rho^2-\epsilon^2)}{1-\epsilon^2} - 1\right]. \tag{5-21}$$
Thus, they can be obtained from the circle radial polynomials $R_{2n}^0(\rho)$ by replacing $\rho$ with $[(\rho^2-\epsilon^2)/(1-\epsilon^2)]^{1/2}$, i.e.,
$$R_{2n}^0(\rho;\epsilon) = R_{2n}^0\left[\left(\frac{\rho^2-\epsilon^2}{1-\epsilon^2}\right)^{1/2}\right]. \tag{5-22}$$
Given that $R_n^n(\rho) = \rho^n$ [see Eq. (4-39)], it can be seen from Eqs. (5-17) and (5-19) that
$$R_n^n(\rho;\epsilon) = \rho^n\left\{(1-\epsilon^2)\Big/\left[1-\epsilon^{2(n+1)}\right]\right\}^{1/2} \tag{5-23a}$$
$$= \rho^n\Big/\left(\sum_{i=0}^{n}\epsilon^{2i}\right)^{1/2}. \tag{5-23b}$$
Moreover,
$$R_n^{n-2}(\rho;\epsilon) = \frac{n\rho^n - (n-1)\left[\left(1-\epsilon^{2n}\right)\big/\left(1-\epsilon^{2(n-1)}\right)\right]\rho^{n-2}}{\left\{(1-\epsilon^2)^{-1}\left[n^2\left(1-\epsilon^{2(n+1)}\right) - (n^2-1)\left(1-\epsilon^{2n}\right)^2\big/\left(1-\epsilon^{2(n-1)}\right)\right]\right\}^{1/2}}. \tag{5-24}$$
It is evident that an annular radial polynomial $R_n^n(\rho;\epsilon)$ differs from the corresponding circle polynomial $R_n^n(\rho)$ only in its normalization. We also note that
$$R_n^m(1;\epsilon) = 1, \quad m = 0 \tag{5-25a}$$
$$\neq 1, \quad m \neq 0. \tag{5-25b}$$
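Equations (5-21) and (5-23) are easy to evaluate with a Legendre routine. The sketch below is illustrative only; it uses NumPy's `legval`, and the function names `annular_radial_m0` and `annular_radial_nn` are hypothetical.

```python
import numpy as np
from numpy.polynomial import legendre

def annular_radial_m0(two_n, rho, eps):
    """R_{2n}^0(rho; eps) from the Legendre relation of Eq. (5-21)."""
    n = two_n // 2
    x = 2.0 * (rho ** 2 - eps ** 2) / (1.0 - eps ** 2) - 1.0
    coeffs = np.zeros(n + 1)
    coeffs[n] = 1.0          # select the single Legendre polynomial P_n
    return legendre.legval(x, coeffs)

def annular_radial_nn(n, rho, eps):
    """R_n^n(rho; eps) from Eq. (5-23b)."""
    return rho ** n / np.sqrt(np.sum(eps ** (2 * np.arange(n + 1))))

eps = 0.5
rho = np.linspace(eps, 1.0, 6)
closed_form_R40 = (6 * rho**4 - 6 * (1 + eps**2) * rho**2
                   + 1 + 4 * eps**2 + eps**4) / (1 - eps**2) ** 2
print(np.allclose(annular_radial_m0(4, rho, eps), closed_form_R40))
print(annular_radial_m0(8, 1.0, eps))   # Eq. (5-25a): equals 1 at rho = 1
print(annular_radial_nn(3, 1.0, eps))   # Eq. (5-25b): differs from 1 for m != 0
```

Since the argument of $P_n$ runs from −1 at $\rho = \epsilon$ to +1 at $\rho = 1$, the $m = 0$ polynomials take the values $(-1)^n$ and 1 at the inner and outer edges of the annulus.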
5.5 ANNULAR POLYNOMIALS
The annular polynomials obtained from Eq. (5-13) in terms of the Zernike circle polynomials are given in Table 5-3 [1,7]. The elements of the matrix M to convert the circle polynomials into the annular polynomials can be obtained easily from this table according to $\{A_j\} = M\{Z_j\}$ [see Eq. (3-19)]. The nonzero elements of the matrix for the first 15 polynomials are given in Table 5-4. The polynomial ordering, the number of polynomials of a certain order or through a certain order n, and the relationships among the indices n, m, and j are the same as discussed for circle polynomials in Chapter 4. It should be evident that an annular polynomial $A_j(\rho,\theta;\epsilon)$ reduces to the corresponding circle polynomial $Z_j(\rho,\theta)$ as $\epsilon \to 0$. In Table 5-5, the annular polynomials are given in the Cartesian coordinates. The variation of several annular radial polynomials with $\rho$ is shown in Figure 5-6 for $\epsilon = 0.5$.
The annular polynomials are also unique like the circle polynomials. They not only are orthogonal over an annular pupil but also include wavefront tilt and defocus and balanced classical aberrations as members of the polynomial set. For example, $A_6$, $A_8$, and $A_{11}$ represent the balanced primary aberrations of astigmatism, coma, and spherical aberration, as may be seen by comparing their forms with those given in Table 5-2. The annular polynomials may be referred to as the orthogonal aberrations because of their orthogonality over the annular pupil.
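The conversion $\{A_j\} = M\{Z_j\}$ can be checked numerically for the first eight polynomials. The sketch below is an illustration rather than the author's code: the matrix entries are transcribed from the expressions for $A_1$ to $A_8$, the quadrature scheme (Gauss–Legendre in $\rho$, uniform samples in $\theta$) is a choice made here, and the function names are hypothetical.

```python
import numpy as np

def circle_polys(rho, theta):
    """Zernike circle polynomials Z1..Z8, orthonormal over the unit disc."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([
        np.ones_like(rho),
        2 * rho * c,
        2 * rho * s,
        np.sqrt(3) * (2 * rho**2 - 1),
        np.sqrt(6) * rho**2 * np.sin(2 * theta),
        np.sqrt(6) * rho**2 * np.cos(2 * theta),
        np.sqrt(8) * (3 * rho**3 - 2 * rho) * s,
        np.sqrt(8) * (3 * rho**3 - 2 * rho) * c,
    ])

def conversion_matrix(eps):
    """8 x 8 block of M in {A_j} = M {Z_j} (row j-1, column j'-1)."""
    e2, e4 = eps**2, eps**4
    B = (1 - e2) * np.sqrt((1 + e2) * (1 + 4 * e2 + e4))
    M = np.zeros((8, 8))
    M[0, 0] = 1.0
    M[1, 1] = M[2, 2] = 1 / np.sqrt(1 + e2)
    M[3, 0], M[3, 3] = -np.sqrt(3) * e2 / (1 - e2), 1 / (1 - e2)
    M[4, 4] = M[5, 5] = 1 / np.sqrt(1 + e2 + e4)
    M[6, 2] = M[7, 1] = -2 * np.sqrt(2) * e4 / B
    M[6, 6] = M[7, 7] = (1 + e2) / B
    return M

def annulus_gram(eps, n_rho=40, n_theta=64):
    """Matrix of mean values <A_j A_j'> over the annulus, Eq. (5-15)."""
    x, w = np.polynomial.legendre.leggauss(n_rho)
    rho = 0.5 * (1 - eps) * x + 0.5 * (1 + eps)       # map [-1, 1] to [eps, 1]
    theta = 2 * np.pi * np.arange(n_theta) / n_theta
    R, T = np.meshgrid(rho, theta, indexing="ij")
    A = np.einsum("jk,krt->jrt", conversion_matrix(eps), circle_polys(R, T))
    weights = (0.5 * (1 - eps) * w * rho)[:, None] * np.ones_like(R)
    weights *= (2 * np.pi / n_theta) / (np.pi * (1 - eps**2))
    return np.einsum("jrt,krt,rt->jk", A, A, weights)

print(np.round(annulus_gram(0.5), 12))  # expected to be the 8 x 8 identity matrix
```

Setting `eps = 0` in `conversion_matrix` returns the identity, in line with the statement that $A_j$ reduces to $Z_j$ as $\epsilon \to 0$.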
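Table 5-3. Orthonormal annular polynomials $A_j(\rho,\theta;\epsilon)$ in terms of the orthonormal Zernike circle polynomials $Z_j(\rho,\theta)$, where $\epsilon$ is the obscuration ratio of the annular pupil.

$A_1 = Z_1$
$A_2 = (1+\epsilon^2)^{-1/2} Z_2$
$A_3 = (1+\epsilon^2)^{-1/2} Z_3$
$A_4 = (1-\epsilon^2)^{-1}\left(-\sqrt{3}\,\epsilon^2 Z_1 + Z_4\right)$
$A_5 = (1+\epsilon^2+\epsilon^4)^{-1/2} Z_5$
$A_6 = (1+\epsilon^2+\epsilon^4)^{-1/2} Z_6$
$A_7 = B^{-1}\left[-2\sqrt{2}\,\epsilon^4 Z_3 + (1+\epsilon^2) Z_7\right]$
$A_8 = B^{-1}\left[-2\sqrt{2}\,\epsilon^4 Z_2 + (1+\epsilon^2) Z_8\right]$
$B = (1-\epsilon^2)\left[(1+\epsilon^2)(1+4\epsilon^2+\epsilon^4)\right]^{1/2}$
$A_9 = (1+\epsilon^2+\epsilon^4+\epsilon^6)^{-1/2} Z_9$
$A_{10} = (1+\epsilon^2+\epsilon^4+\epsilon^6)^{-1/2} Z_{10}$
$A_{11} = (1-\epsilon^2)^{-2}\left[\sqrt{5}\,\epsilon^2(1+\epsilon^2) Z_1 - \sqrt{15}\,\epsilon^2 Z_4 + Z_{11}\right]$
$A_{12} = \left(\dfrac{1+\epsilon^2+\epsilon^4}{1+4\epsilon^2+10\epsilon^4+4\epsilon^6+\epsilon^8}\right)^{1/2}\left(-\dfrac{\sqrt{15}\,\epsilon^6}{1-\epsilon^6} Z_6 + \dfrac{1}{1-\epsilon^2} Z_{12}\right)$
$A_{13} = \left(\dfrac{1+\epsilon^2+\epsilon^4}{1+4\epsilon^2+10\epsilon^4+4\epsilon^6+\epsilon^8}\right)^{1/2}\left(-\dfrac{\sqrt{15}\,\epsilon^6}{1-\epsilon^6} Z_5 + \dfrac{1}{1-\epsilon^2} Z_{13}\right)$
$A_{14} = (1+\epsilon^2+\epsilon^4+\epsilon^6+\epsilon^8)^{-1/2} Z_{14}$
$A_{15} = (1+\epsilon^2+\epsilon^4+\epsilon^6+\epsilon^8)^{-1/2} Z_{15}$
$A_{16} = \dfrac{1}{(1-\epsilon^2)^2}\left\{\dfrac{\epsilon^4}{a}\left[\sqrt{3}\,(3+4\epsilon^2+3\epsilon^4) Z_2 - 2\sqrt{6}\,(3+\epsilon^2) Z_8\right] + b Z_{16}\right\}$
$A_{17} = \dfrac{1}{(1-\epsilon^2)^2}\left\{\dfrac{\epsilon^4}{a}\left[\sqrt{3}\,(3+4\epsilon^2+3\epsilon^4) Z_3 - 2\sqrt{6}\,(3+\epsilon^2) Z_7\right] + b Z_{17}\right\}$
$a = (1+13\epsilon^2+46\epsilon^4+46\epsilon^6+13\epsilon^8+\epsilon^{10})^{1/2}$, $b = \left(\dfrac{1+4\epsilon^2+\epsilon^4}{1+9\epsilon^2+9\epsilon^4+\epsilon^6}\right)^{1/2}$
$A_{18} = \left(\dfrac{1+\epsilon^2+\epsilon^4+\epsilon^6}{1+4\epsilon^2+10\epsilon^4+20\epsilon^6+10\epsilon^8+4\epsilon^{10}+\epsilon^{12}}\right)^{1/2}\left(-\dfrac{2\sqrt{6}\,\epsilon^8}{1-\epsilon^8} Z_{10} + \dfrac{1}{1-\epsilon^2} Z_{18}\right)$
$A_{19} = \left(\dfrac{1+\epsilon^2+\epsilon^4+\epsilon^6}{1+4\epsilon^2+10\epsilon^4+20\epsilon^6+10\epsilon^8+4\epsilon^{10}+\epsilon^{12}}\right)^{1/2}\left(-\dfrac{2\sqrt{6}\,\epsilon^8}{1-\epsilon^8} Z_9 + \dfrac{1}{1-\epsilon^2} Z_{19}\right)$
$A_{20} = (1+\epsilon^2+\epsilon^4+\epsilon^6+\epsilon^8+\epsilon^{10})^{-1/2} Z_{20}$
$A_{21} = (1+\epsilon^2+\epsilon^4+\epsilon^6+\epsilon^8+\epsilon^{10})^{-1/2} Z_{21}$
$A_{22} = (1-\epsilon^2)^{-3}\left[-\sqrt{7}\,\epsilon^2(1+3\epsilon^2+\epsilon^4) Z_1 + \sqrt{21}\,\epsilon^2(1+2\epsilon^2) Z_4 - \sqrt{35}\,\epsilon^2 Z_{11} + Z_{22}\right]$
$A_{23} = \dfrac{1}{(1-\epsilon^2)^2}\left\{\dfrac{\epsilon^6}{\gamma}\left[\sqrt{21}\,(2+3\epsilon^2+3\epsilon^4+2\epsilon^6) Z_5 - \sqrt{35}\,(6+3\epsilon^2+\epsilon^4) Z_{13}\right] + \delta Z_{23}\right\}$
$A_{24} = \dfrac{1}{(1-\epsilon^2)^2}\left\{\dfrac{\epsilon^6}{\gamma}\left[\sqrt{21}\,(2+3\epsilon^2+3\epsilon^4+2\epsilon^6) Z_6 - \sqrt{35}\,(6+3\epsilon^2+\epsilon^4) Z_{12}\right] + \delta Z_{24}\right\}$
$\gamma = (1+13\epsilon^2+91\epsilon^4+339\epsilon^6+792\epsilon^8+1028\epsilon^{10}+792\epsilon^{12}+339\epsilon^{14}+91\epsilon^{16}+13\epsilon^{18}+\epsilon^{20})^{1/2}$
$\delta = \left(\dfrac{1+4\epsilon^2+10\epsilon^4+4\epsilon^6+\epsilon^8}{1+9\epsilon^2+45\epsilon^4+65\epsilon^6+45\epsilon^8+9\epsilon^{10}+\epsilon^{12}}\right)^{1/2}$
$A_{25} = c\left(-\dfrac{\sqrt{35}\,\epsilon^{10}}{1-\epsilon^{10}} Z_{15} + \dfrac{1}{1-\epsilon^2} Z_{25}\right)$
$A_{26} = c\left(-\dfrac{\sqrt{35}\,\epsilon^{10}}{1-\epsilon^{10}} Z_{14} + \dfrac{1}{1-\epsilon^2} Z_{26}\right)$
$c = \left(\dfrac{1+\epsilon^2+\epsilon^4+\epsilon^6+\epsilon^8}{1+4\epsilon^2+10\epsilon^4+20\epsilon^6+35\epsilon^8+20\epsilon^{10}+10\epsilon^{12}+4\epsilon^{14}+\epsilon^{16}}\right)^{1/2}$
$A_{27} = (1+\epsilon^2+\epsilon^4+\epsilon^6+\epsilon^8+\epsilon^{10}+\epsilon^{12})^{-1/2} Z_{27}$
$A_{28} = (1+\epsilon^2+\epsilon^4+\epsilon^6+\epsilon^8+\epsilon^{10}+\epsilon^{12})^{-1/2} Z_{28}$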
It is evident from Eq. (5-13) that each annular polynomial is a linear combination of the circle polynomials, without any mixing of the cosine and the sine terms. Similarly, because of the same angular dependence of an annular polynomial $A_j(\rho,\theta;\epsilon)$ as the corresponding circle polynomial $Z_j(\rho,\theta)$, each radial polynomial $R_n^m(\rho;\epsilon)$ can be written as a linear combination of the polynomials $R_n^m(\rho)$, $R_{n-2}^m(\rho)$, etc. This, of course, is also evident from Eq. (5-19). For example,
$$R_3^1(\rho;\epsilon) = \frac{1}{B}\left[(1+\epsilon^2)R_3^1(\rho) - 2\epsilon^4R_1^1(\rho)\right], \tag{5-26}$$
where
$$B = (1-\epsilon^2)\left[(1+\epsilon^2)(1+4\epsilon^2+\epsilon^4)\right]^{1/2}, \tag{5-27}$$
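Table 5-4. Nonzero elements of a 15 × 15 conversion matrix M for obtaining the annular polynomials $A_j(\rho,\theta;\epsilon)$ from the Zernike circle polynomials $Z_j(\rho,\theta)$.

$M_{11} = 1$
$M_{22} = (1+\epsilon^2)^{-1/2} = M_{33}$
$M_{41} = -\sqrt{3}\,\epsilon^2(1-\epsilon^2)^{-1}$
$M_{44} = (1-\epsilon^2)^{-1}$
$M_{55} = (1+\epsilon^2+\epsilon^4)^{-1/2} = M_{66}$
$M_{73} = -2\sqrt{2}\,\epsilon^4/B = M_{82}$
$M_{77} = (1+\epsilon^2)/B = M_{88}$
$B = (1-\epsilon^2)\left[(1+\epsilon^2)(1+4\epsilon^2+\epsilon^4)\right]^{1/2}$
$M_{99} = (1+\epsilon^2+\epsilon^4+\epsilon^6)^{-1/2} = M_{10,10}$
$M_{11,1} = \sqrt{5}\,\epsilon^2(1+\epsilon^2)(1-\epsilon^2)^{-2}$
$M_{11,4} = -\sqrt{15}\,\epsilon^2(1-\epsilon^2)^{-2}$
$M_{11,11} = (1-\epsilon^2)^{-2}$
$M_{12,6} = -\dfrac{\sqrt{15}\,\epsilon^6}{1-\epsilon^6}\left(\dfrac{1+\epsilon^2+\epsilon^4}{1+4\epsilon^2+10\epsilon^4+4\epsilon^6+\epsilon^8}\right)^{1/2} = M_{13,5}$
$M_{12,12} = \dfrac{1}{1-\epsilon^2}\left(\dfrac{1+\epsilon^2+\epsilon^4}{1+4\epsilon^2+10\epsilon^4+4\epsilon^6+\epsilon^8}\right)^{1/2} = M_{13,13}$
$M_{14,14} = (1+\epsilon^2+\epsilon^4+\epsilon^6+\epsilon^8)^{-1/2} = M_{15,15}$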
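Table 5-5. Orthonormal annular polynomials $A_j(x,y;\epsilon)$ in Cartesian coordinates $(x,y)$, where $x = \rho\cos\theta$, $y = \rho\sin\theta$, and $\epsilon \leq \rho = (x^2+y^2)^{1/2} \leq 1$.

| Poly. | $A_j(x,y;\epsilon)$ |
|---|---|
| $A_1$ | $1$ |
| $A_2$ | $2x/(1+\epsilon^2)^{1/2}$ |
| $A_3$ | $2y/(1+\epsilon^2)^{1/2}$ |
| $A_4$ | $\sqrt{3}\,(2\rho^2-1-\epsilon^2)/(1-\epsilon^2)$ |
| $A_5$ | $2\sqrt{6}\,xy/(1+\epsilon^2+\epsilon^4)^{1/2}$ |
| $A_6$ | $\sqrt{6}\,(x^2-y^2)/(1+\epsilon^2+\epsilon^4)^{1/2}$ |
| $A_7$ | $\dfrac{\sqrt{8}\,y\left[3(1+\epsilon^2)\rho^2-2(1+\epsilon^2+\epsilon^4)\right]}{(1-\epsilon^2)\left[(1+\epsilon^2)(1+4\epsilon^2+\epsilon^4)\right]^{1/2}}$ |
| $A_8$ | $\dfrac{\sqrt{8}\,x\left[3(1+\epsilon^2)\rho^2-2(1+\epsilon^2+\epsilon^4)\right]}{(1-\epsilon^2)\left[(1+\epsilon^2)(1+4\epsilon^2+\epsilon^4)\right]^{1/2}}$ |
| $A_9$ | $\sqrt{8}\,y(3x^2-y^2)/(1+\epsilon^2+\epsilon^4+\epsilon^6)^{1/2}$ |
| $A_{10}$ | $\sqrt{8}\,x(x^2-3y^2)/(1+\epsilon^2+\epsilon^4+\epsilon^6)^{1/2}$ |
| $A_{11}$ | $\sqrt{5}\left[6\rho^4-6(1+\epsilon^2)\rho^2+(1+4\epsilon^2+\epsilon^4)\right]/(1-\epsilon^2)^2$ |
| $A_{12}$ | $\dfrac{\sqrt{10}\,(x^2-y^2)\left[4\rho^2-3(1-\epsilon^8)/(1-\epsilon^6)\right]}{\left\{(1-\epsilon^2)^{-1}\left[16(1-\epsilon^{10})-15(1-\epsilon^8)^2/(1-\epsilon^6)\right]\right\}^{1/2}}$ |
| $A_{13}$ | $\dfrac{2\sqrt{10}\,xy\left[4\rho^2-3(1-\epsilon^8)/(1-\epsilon^6)\right]}{\left\{(1-\epsilon^2)^{-1}\left[16(1-\epsilon^{10})-15(1-\epsilon^8)^2/(1-\epsilon^6)\right]\right\}^{1/2}}$ |
| $A_{14}$ | $\sqrt{10}\,(\rho^4-8x^2y^2)/(1+\epsilon^2+\epsilon^4+\epsilon^6+\epsilon^8)^{1/2}$ |
| $A_{15}$ | $4\sqrt{10}\,xy(x^2-y^2)/(1+\epsilon^2+\epsilon^4+\epsilon^6+\epsilon^8)^{1/2}$ |
| $A_{16}$ | $\dfrac{\sqrt{12}\,x\left[10(1+4\epsilon^2+\epsilon^4)\rho^4-12(1+4\epsilon^2+4\epsilon^4+\epsilon^6)\rho^2+3(1+4\epsilon^2+10\epsilon^4+4\epsilon^6+\epsilon^8)\right]}{(1-\epsilon^2)^2\left[(1+4\epsilon^2+\epsilon^4)(1+9\epsilon^2+9\epsilon^4+\epsilon^6)\right]^{1/2}}$ |
| $A_{17}$ | $\dfrac{\sqrt{12}\,y\left[10(1+4\epsilon^2+\epsilon^4)\rho^4-12(1+4\epsilon^2+4\epsilon^4+\epsilon^6)\rho^2+3(1+4\epsilon^2+10\epsilon^4+4\epsilon^6+\epsilon^8)\right]}{(1-\epsilon^2)^2\left[(1+4\epsilon^2+\epsilon^4)(1+9\epsilon^2+9\epsilon^4+\epsilon^6)\right]^{1/2}}$ |
| $A_{18}$ | $\dfrac{\sqrt{12}\,x(x^2-3y^2)\left[5\rho^2-4(1-\epsilon^{10})/(1-\epsilon^8)\right]}{\left\{(1-\epsilon^2)^{-1}\left[25(1-\epsilon^{12})-24(1-\epsilon^{10})^2/(1-\epsilon^8)\right]\right\}^{1/2}}$ |
| $A_{19}$ | $\dfrac{\sqrt{12}\,y(3x^2-y^2)\left[5\rho^2-4(1-\epsilon^{10})/(1-\epsilon^8)\right]}{\left\{(1-\epsilon^2)^{-1}\left[25(1-\epsilon^{12})-24(1-\epsilon^{10})^2/(1-\epsilon^8)\right]\right\}^{1/2}}$ |
| $A_{20}$ | $\sqrt{12}\,x(16x^4-20x^2\rho^2+5\rho^4)/(1+\epsilon^2+\epsilon^4+\epsilon^6+\epsilon^8+\epsilon^{10})^{1/2}$ |
| $A_{21}$ | $\sqrt{12}\,y(16y^4-20y^2\rho^2+5\rho^4)/(1+\epsilon^2+\epsilon^4+\epsilon^6+\epsilon^8+\epsilon^{10})^{1/2}$ |
| $A_{22}$ | $\sqrt{7}\left[20\rho^6-30(1+\epsilon^2)\rho^4+12(1+3\epsilon^2+\epsilon^4)\rho^2-(1+9\epsilon^2+9\epsilon^4+\epsilon^6)\right]/(1-\epsilon^2)^3$ |
| $A_{23}$ | $\dfrac{2\sqrt{14}\,xy\left[15(1+4\epsilon^2+10\epsilon^4+4\epsilon^6+\epsilon^8)\rho^4-20(1+4\epsilon^2+10\epsilon^4+10\epsilon^6+4\epsilon^8+\epsilon^{10})\rho^2+6(1+4\epsilon^2+10\epsilon^4+20\epsilon^6+10\epsilon^8+4\epsilon^{10}+\epsilon^{12})\right]}{(1-\epsilon^2)^2\left[(1+4\epsilon^2+10\epsilon^4+4\epsilon^6+\epsilon^8)(1+9\epsilon^2+45\epsilon^4+65\epsilon^6+45\epsilon^8+9\epsilon^{10}+\epsilon^{12})\right]^{1/2}}$ |
| $A_{24}$ | $\dfrac{\sqrt{14}\,(x^2-y^2)\left[15(1+4\epsilon^2+10\epsilon^4+4\epsilon^6+\epsilon^8)\rho^4-20(1+4\epsilon^2+10\epsilon^4+10\epsilon^6+4\epsilon^8+\epsilon^{10})\rho^2+6(1+4\epsilon^2+10\epsilon^4+20\epsilon^6+10\epsilon^8+4\epsilon^{10}+\epsilon^{12})\right]}{(1-\epsilon^2)^2\left[(1+4\epsilon^2+10\epsilon^4+4\epsilon^6+\epsilon^8)(1+9\epsilon^2+45\epsilon^4+65\epsilon^6+45\epsilon^8+9\epsilon^{10}+\epsilon^{12})\right]^{1/2}}$ |
| $A_{25}$ | $\dfrac{4\sqrt{14}\,xy(x^2-y^2)\left[6\rho^2-5(1-\epsilon^{12})/(1-\epsilon^{10})\right]}{\left\{(1-\epsilon^2)^{-1}\left[36(1-\epsilon^{14})-35(1-\epsilon^{12})^2/(1-\epsilon^{10})\right]\right\}^{1/2}}$ |
| $A_{26}$ | $\dfrac{\sqrt{14}\,(8x^4-8x^2\rho^2+\rho^4)\left[6\rho^2-5(1-\epsilon^{12})/(1-\epsilon^{10})\right]}{\left\{(1-\epsilon^2)^{-1}\left[36(1-\epsilon^{14})-35(1-\epsilon^{12})^2/(1-\epsilon^{10})\right]\right\}^{1/2}}$ |
| $A_{27}$ | $\sqrt{14}\,xy(32x^4-32x^2\rho^2+6\rho^4)/(1+\epsilon^2+\epsilon^4+\epsilon^6+\epsilon^8+\epsilon^{10}+\epsilon^{12})^{1/2}$ |
| $A_{28}$ | $\sqrt{14}\,(32x^6-48x^4\rho^2+18x^2\rho^4-\rho^6)/(1+\epsilon^2+\epsilon^4+\epsilon^6+\epsilon^8+\epsilon^{10}+\epsilon^{12})^{1/2}$ |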
[Figure 5-6: three panels plotting $R_n^m(\rho;\epsilon)$ against $\rho$ from 0.5 to 1. Panel (a): $R_n^0$ for n = 2, 4, 6, 8. Panel (b): $R_n^1$ for n = 1, 3, 5, 7. Panel (c): $R_n^2$ for n = 2, 4, 6.]
Figure 5-6. Variation of an annular radial polynomial $R_n^m(\rho;\epsilon)$ with $\rho$ for $\epsilon = 0.5$. (a) Defocus and spherical aberrations. (b) Tilt and coma. (c) Astigmatism.
and
$$R_4^0(\rho;\epsilon) = (1-\epsilon^2)^{-2}\left[R_4^0(\rho) - 3\epsilon^2R_2^0(\rho) + \epsilon^2(1+\epsilon^2)R_0^0(\rho)\right]. \tag{5-28}$$
The radial annular polynomials $R_n^m(\rho;\epsilon)$ for $n \leq 8$ are listed in Table 5-6. Table 5-7 lists the full annular polynomials, illustrating their ordering.
5.6 ANNULAR COEFFICIENTS OF AN ANNULAR ABERRATION FUNCTION
The aberration function $W(\rho,\theta;\epsilon)$ across a unit annulus with an obscuration ratio $\epsilon$ can be expanded in terms of J annular polynomials $A_j(\rho,\theta;\epsilon)$ in the form
$$W(\rho,\theta;\epsilon) = \sum_{j=1}^{J} a_jA_j(\rho,\theta;\epsilon), \quad 0 \leq \epsilon < 1, \quad \epsilon \leq \rho \leq 1, \quad 0 \leq \theta \leq 2\pi, \tag{5-29}$$
where $a_j$ is an annular expansion coefficient of the polynomial $A_j$. Multiplying both sides of Eq. (5-29) by $A_j(\rho,\theta;\epsilon)$, integrating over the unit annulus, and using the orthonormality Eq. (5-15), we obtain the annular expansion coefficients:
$$a_j = \frac{1}{\pi(1-\epsilon^2)}\int_\epsilon^1\int_0^{2\pi} W(\rho,\theta;\epsilon)\,A_j(\rho,\theta;\epsilon)\,\rho\,d\rho\,d\theta. \tag{5-30}$$
The mean and the mean square values of the aberration function are given by
$$\langle W(\rho,\theta;\epsilon)\rangle = a_1 \tag{5-31}$$
and
$$\langle W^2(\rho,\theta;\epsilon)\rangle = \sum_{j=1}^{J} a_j^2. \tag{5-32}$$
The variance of the aberration function is accordingly given by
$$\sigma_W^2 = \langle W^2(\rho,\theta;\epsilon)\rangle - \langle W(\rho,\theta;\epsilon)\rangle^2 = \sum_{j=2}^{J} a_j^2. \tag{5-33}$$
As explained in Section 3.3, the annular expansion coefficients yield a least-squares fit of the aberration function with J polynomials.
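Equations (5-30) through (5-33) can be illustrated with a small numerical experiment. The sketch below is an assumption-laden example, not the author's method: it synthesizes a test wavefront from four annular polynomials with made-up coefficients, recovers those coefficients by quadrature, and compares the variance with the sum of squared coefficients. All function names and the coefficient values are hypothetical.

```python
import numpy as np

def annular_polys(rho, theta, eps):
    """A1, A2, A4, and A11 in closed form, keyed by the index j."""
    e2 = eps ** 2
    return {
        1: np.ones_like(rho),
        2: 2 * rho * np.cos(theta) / np.sqrt(1 + e2),
        4: np.sqrt(3) * (2 * rho**2 - 1 - e2) / (1 - e2),
        11: np.sqrt(5) * (6 * rho**4 - 6 * (1 + e2) * rho**2
                          + 1 + 4 * e2 + e2**2) / (1 - e2) ** 2,
    }

def annulus_mean(f, eps, n_rho=32, n_theta=32):
    """Mean of f(rho, theta) over the annulus (Gauss-Legendre in rho, uniform in theta)."""
    x, w = np.polynomial.legendre.leggauss(n_rho)
    rho = 0.5 * (1 - eps) * x + 0.5 * (1 + eps)
    theta = 2 * np.pi * np.arange(n_theta) / n_theta
    R, T = np.meshgrid(rho, theta, indexing="ij")
    w_rho = 0.5 * (1 - eps) * w * rho
    total = np.sum(w_rho[:, None] * f(R, T)) * (2 * np.pi / n_theta)
    return total / (np.pi * (1 - eps ** 2))

def annular_coefficients(W, eps):
    """Expansion coefficients a_j from Eq. (5-30) for j = 1, 2, 4, 11."""
    return {j: annulus_mean(lambda r, t, j=j: W(r, t) * annular_polys(r, t, eps)[j], eps)
            for j in (1, 2, 4, 11)}

eps = 0.5
true_coeffs = {1: 0.3, 2: 0.05, 4: 0.2, 11: -0.1}   # illustrative values only

def W(rho, theta):
    return sum(a * annular_polys(rho, theta, eps)[j] for j, a in true_coeffs.items())

coeffs = annular_coefficients(W, eps)
variance = annulus_mean(lambda r, t: W(r, t) ** 2, eps) - coeffs[1] ** 2
print(coeffs)
print(variance, sum(a ** 2 for j, a in coeffs.items() if j != 1))
```

With measured wavefront data on an irregular grid, a least-squares fit would normally replace the quadrature used here.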
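Table 5-6. Annular radial polynomials $R_n^m(\rho;\epsilon)$, where $\epsilon$ is the obscuration ratio and $\epsilon \leq \rho \leq 1$.

| n | m | $R_n^m(\rho;\epsilon)$ |
|---|---|---|
| 0 | 0 | $1$ |
| 1 | 1 | $\rho/(1+\epsilon^2)^{1/2}$ |
| 2 | 0 | $(2\rho^2-1-\epsilon^2)/(1-\epsilon^2)$ |
| 2 | 2 | $\rho^2/(1+\epsilon^2+\epsilon^4)^{1/2}$ |
| 3 | 1 | $\dfrac{3(1+\epsilon^2)\rho^3-2(1+\epsilon^2+\epsilon^4)\rho}{(1-\epsilon^2)\left[(1+\epsilon^2)(1+4\epsilon^2+\epsilon^4)\right]^{1/2}}$ |
| 3 | 3 | $\rho^3/(1+\epsilon^2+\epsilon^4+\epsilon^6)^{1/2}$ |
| 4 | 0 | $\left[6\rho^4-6(1+\epsilon^2)\rho^2+1+4\epsilon^2+\epsilon^4\right]/(1-\epsilon^2)^2$ |
| 4 | 2 | $\dfrac{4\rho^4-3\left[(1-\epsilon^8)/(1-\epsilon^6)\right]\rho^2}{\left\{(1-\epsilon^2)^{-1}\left[16(1-\epsilon^{10})-15(1-\epsilon^8)^2/(1-\epsilon^6)\right]\right\}^{1/2}}$ |
| 4 | 4 | $\rho^4/(1+\epsilon^2+\epsilon^4+\epsilon^6+\epsilon^8)^{1/2}$ |
| 5 | 1 | $\dfrac{10(1+4\epsilon^2+\epsilon^4)\rho^5-12(1+4\epsilon^2+4\epsilon^4+\epsilon^6)\rho^3+3(1+4\epsilon^2+10\epsilon^4+4\epsilon^6+\epsilon^8)\rho}{(1-\epsilon^2)^2\left[(1+4\epsilon^2+\epsilon^4)(1+9\epsilon^2+9\epsilon^4+\epsilon^6)\right]^{1/2}}$ |
| 5 | 3 | $\dfrac{5\rho^5-4\left[(1-\epsilon^{10})/(1-\epsilon^8)\right]\rho^3}{\left\{(1-\epsilon^2)^{-1}\left[25(1-\epsilon^{12})-24(1-\epsilon^{10})^2/(1-\epsilon^8)\right]\right\}^{1/2}}$ |
| 5 | 5 | $\rho^5/(1+\epsilon^2+\epsilon^4+\epsilon^6+\epsilon^8+\epsilon^{10})^{1/2}$ |
| 6 | 0 | $\left[20\rho^6-30(1+\epsilon^2)\rho^4+12(1+3\epsilon^2+\epsilon^4)\rho^2-(1+9\epsilon^2+9\epsilon^4+\epsilon^6)\right]/(1-\epsilon^2)^3$ |
| 6 | 2 | $\dfrac{15(1+4\epsilon^2+10\epsilon^4+4\epsilon^6+\epsilon^8)\rho^6-20(1+4\epsilon^2+10\epsilon^4+10\epsilon^6+4\epsilon^8+\epsilon^{10})\rho^4+6(1+4\epsilon^2+10\epsilon^4+20\epsilon^6+10\epsilon^8+4\epsilon^{10}+\epsilon^{12})\rho^2}{(1-\epsilon^2)^2\left[(1+4\epsilon^2+10\epsilon^4+4\epsilon^6+\epsilon^8)(1+9\epsilon^2+45\epsilon^4+65\epsilon^6+45\epsilon^8+9\epsilon^{10}+\epsilon^{12})\right]^{1/2}}$ |
| 6 | 4 | $\dfrac{6\rho^6-5\left[(1-\epsilon^{12})/(1-\epsilon^{10})\right]\rho^4}{\left\{(1-\epsilon^2)^{-1}\left[36(1-\epsilon^{14})-35(1-\epsilon^{12})^2/(1-\epsilon^{10})\right]\right\}^{1/2}}$ |
| 6 | 6 | $\rho^6/(1+\epsilon^2+\epsilon^4+\epsilon^6+\epsilon^8+\epsilon^{10}+\epsilon^{12})^{1/2}$ |
| 7 | 1 | $a_7^1\rho^7+b_7^1\rho^5+c_7^1\rho^3+d_7^1\rho$ |
| 7 | 3 | $a_7^3\rho^7+b_7^3\rho^5+c_7^3\rho^3$ |
| 7 | 5 | $\dfrac{7\rho^7-6\left[(1-\epsilon^{14})/(1-\epsilon^{12})\right]\rho^5}{\left\{(1-\epsilon^2)^{-1}\left[49(1-\epsilon^{16})-48(1-\epsilon^{14})^2/(1-\epsilon^{12})\right]\right\}^{1/2}}$ |
| 7 | 7 | $\rho^7/(1+\epsilon^2+\epsilon^4+\epsilon^6+\epsilon^8+\epsilon^{10}+\epsilon^{12}+\epsilon^{14})^{1/2}$ |
| 8 | 0 | $\left[70\rho^8-140(1+\epsilon^2)\rho^6+30(3+8\epsilon^2+3\epsilon^4)\rho^4-20(1+6\epsilon^2+6\epsilon^4+\epsilon^6)\rho^2+e_8^0\right]/(1-\epsilon^2)^4$ |
| 8 | 2 | $a_8^2\rho^8+b_8^2\rho^6+c_8^2\rho^4+d_8^2\rho^2$ |
| 8 | 4 | $a_8^4\rho^8+b_8^4\rho^6+c_8^4\rho^4$ |
| 8 | 6 | $a_8^6\rho^8+b_8^6\rho^6$ |
| 8 | 8 | $\rho^8/(1+\epsilon^2+\epsilon^4+\epsilon^6+\epsilon^8+\epsilon^{10}+\epsilon^{12}+\epsilon^{14}+\epsilon^{16})^{1/2}$ |

The coefficients appearing in the table for n = 7 and 8 are as follows.

$a_7^1 = 35(1+9\epsilon^2+9\epsilon^4+\epsilon^6)/A_7^1$
$b_7^1 = -60(1+9\epsilon^2+15\epsilon^4+9\epsilon^6+\epsilon^8)/A_7^1$
$c_7^1 = 30(1+9\epsilon^2+25\epsilon^4+25\epsilon^6+9\epsilon^8+\epsilon^{10})/A_7^1$
$d_7^1 = -4(1+9\epsilon^2+45\epsilon^4+65\epsilon^6+45\epsilon^8+9\epsilon^{10}+\epsilon^{12})/A_7^1$
$A_7^1 = (1-\epsilon^2)^3(1+9\epsilon^2+9\epsilon^4+\epsilon^6)^{1/2}(1+16\epsilon^2+36\epsilon^4+16\epsilon^6+\epsilon^8)^{1/2}$

$a_7^3 = 21(1+4\epsilon^2+10\epsilon^4+20\epsilon^6+10\epsilon^8+4\epsilon^{10}+\epsilon^{12})/A_7^3$
$b_7^3 = -30(1+4\epsilon^2+10\epsilon^4+20\epsilon^6+20\epsilon^8+10\epsilon^{10}+4\epsilon^{12}+\epsilon^{14})/A_7^3$
$c_7^3 = 10(1+4\epsilon^2+10\epsilon^4+20\epsilon^6+35\epsilon^8+20\epsilon^{10}+10\epsilon^{12}+4\epsilon^{14}+\epsilon^{16})/A_7^3$
$A_7^3 = (1-\epsilon^2)^2(1+4\epsilon^2+10\epsilon^4+20\epsilon^6+10\epsilon^8+4\epsilon^{10}+\epsilon^{12})^{1/2}(1+9\epsilon^2+45\epsilon^4+165\epsilon^6+270\epsilon^8+270\epsilon^{10}+165\epsilon^{12}+45\epsilon^{14}+9\epsilon^{16}+\epsilon^{18})^{1/2}$

$e_8^0 = 1+16\epsilon^2+36\epsilon^4+16\epsilon^6+\epsilon^8$

$a_8^2 = 56(1+9\epsilon^2+45\epsilon^4+65\epsilon^6+45\epsilon^8+9\epsilon^{10}+\epsilon^{12})/A_8^2$
$b_8^2 = -105(1+9\epsilon^2+45\epsilon^4+85\epsilon^6+85\epsilon^8+45\epsilon^{10}+9\epsilon^{12}+\epsilon^{14})/A_8^2$
$c_8^2 = 60(1+9\epsilon^2+45\epsilon^4+115\epsilon^6+150\epsilon^8+115\epsilon^{10}+45\epsilon^{12}+9\epsilon^{14}+\epsilon^{16})/A_8^2$
$d_8^2 = -10(1+9\epsilon^2+45\epsilon^4+165\epsilon^6+270\epsilon^8+270\epsilon^{10}+165\epsilon^{12}+45\epsilon^{14}+9\epsilon^{16}+\epsilon^{18})/A_8^2$
$A_8^2 = (1-\epsilon^2)^3(1+9\epsilon^2+45\epsilon^4+65\epsilon^6+45\epsilon^8+9\epsilon^{10}+\epsilon^{12})^{1/2}(1+16\epsilon^2+136\epsilon^4+416\epsilon^6+626\epsilon^8+416\epsilon^{10}+136\epsilon^{12}+16\epsilon^{14}+\epsilon^{16})^{1/2}$

$a_8^4 = 28(1+4\epsilon^2+10\epsilon^4+20\epsilon^6+35\epsilon^8+20\epsilon^{10}+10\epsilon^{12}+4\epsilon^{14}+\epsilon^{16})/A_8^4$
$b_8^4 = -42(1+4\epsilon^2+10\epsilon^4+20\epsilon^6+35\epsilon^8+35\epsilon^{10}+20\epsilon^{12}+10\epsilon^{14}+4\epsilon^{16}+\epsilon^{18})/A_8^4$
$c_8^4 = 15(1+4\epsilon^2+10\epsilon^4+20\epsilon^6+35\epsilon^8+56\epsilon^{10}+35\epsilon^{12}+20\epsilon^{14}+10\epsilon^{16}+4\epsilon^{18}+\epsilon^{20})/A_8^4$
$A_8^4 = (1-\epsilon^2)^2(1+4\epsilon^2+10\epsilon^4+20\epsilon^6+35\epsilon^8+20\epsilon^{10}+10\epsilon^{12}+4\epsilon^{14}+\epsilon^{16})^{1/2}(1+9\epsilon^2+45\epsilon^4+165\epsilon^6+495\epsilon^8+846\epsilon^{10}+994\epsilon^{12}+846\epsilon^{14}+495\epsilon^{16}+165\epsilon^{18}+45\epsilon^{20}+9\epsilon^{22}+\epsilon^{24})^{1/2}$

$a_8^6 = 8(1+\epsilon^2+\epsilon^4+\epsilon^6+\epsilon^8+\epsilon^{10}+\epsilon^{12})/A_8^6$
$b_8^6 = -7(1+\epsilon^2+\epsilon^4+\epsilon^6+\epsilon^8+\epsilon^{10}+\epsilon^{12}+\epsilon^{14})/A_8^6$
$A_8^6 = (1-\epsilon^2)(1+\epsilon^2+\epsilon^4+\epsilon^6+\epsilon^8+\epsilon^{10}+\epsilon^{12})^{1/2}(1+4\epsilon^2+10\epsilon^4+20\epsilon^6+35\epsilon^8+56\epsilon^{10}+84\epsilon^{12}+56\epsilon^{14}+35\epsilon^{16}+20\epsilon^{18}+10\epsilon^{20}+4\epsilon^{22}+\epsilon^{24})^{1/2}$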
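Table 5-7. Orthonormal annular polynomials $A_j(\rho,\theta;\epsilon)$, ordered in the same manner as the circle polynomials in Table 4-3.

| j | n | m | $A_j(\rho,\theta;\epsilon)$ | Aberration Name* |
|---|---|---|---|---|
| 1 | 0 | 0 | $R_0^0(\rho;\epsilon) = 1$ | Piston |
| 2 | 1 | 1 | $2R_1^1(\rho;\epsilon)\cos\theta$ | x-tilt |
| 3 | 1 | 1 | $2R_1^1(\rho;\epsilon)\sin\theta$ | y-tilt |
| 4 | 2 | 0 | $\sqrt{3}\,R_2^0(\rho;\epsilon)$ | Defocus |
| 5 | 2 | 2 | $\sqrt{6}\,R_2^2(\rho;\epsilon)\sin 2\theta$ | Primary astigmatism at 45° |
| 6 | 2 | 2 | $\sqrt{6}\,R_2^2(\rho;\epsilon)\cos 2\theta$ | Primary astigmatism at 0° |
| 7 | 3 | 1 | $\sqrt{8}\,R_3^1(\rho;\epsilon)\sin\theta$ | Primary y-coma |
| 8 | 3 | 1 | $\sqrt{8}\,R_3^1(\rho;\epsilon)\cos\theta$ | Primary x-coma |
| 9 | 3 | 3 | $\sqrt{8}\,R_3^3(\rho;\epsilon)\sin 3\theta$ | |
| 10 | 3 | 3 | $\sqrt{8}\,R_3^3(\rho;\epsilon)\cos 3\theta$ | |
| 11 | 4 | 0 | $\sqrt{5}\,R_4^0(\rho;\epsilon)$ | Primary spherical |
| 12 | 4 | 2 | $\sqrt{10}\,R_4^2(\rho;\epsilon)\cos 2\theta$ | Secondary astigmatism at 0° |
| 13 | 4 | 2 | $\sqrt{10}\,R_4^2(\rho;\epsilon)\sin 2\theta$ | Secondary astigmatism at 45° |
| 14 | 4 | 4 | $\sqrt{10}\,R_4^4(\rho;\epsilon)\cos 4\theta$ | |
| 15 | 4 | 4 | $\sqrt{10}\,R_4^4(\rho;\epsilon)\sin 4\theta$ | |
| 16 | 5 | 1 | $\sqrt{12}\,R_5^1(\rho;\epsilon)\cos\theta$ | Secondary x-coma |
| 17 | 5 | 1 | $\sqrt{12}\,R_5^1(\rho;\epsilon)\sin\theta$ | Secondary y-coma |
| 18 | 5 | 3 | $\sqrt{12}\,R_5^3(\rho;\epsilon)\cos 3\theta$ | |
| 19 | 5 | 3 | $\sqrt{12}\,R_5^3(\rho;\epsilon)\sin 3\theta$ | |
| 20 | 5 | 5 | $\sqrt{12}\,R_5^5(\rho;\epsilon)\cos 5\theta$ | |
| 21 | 5 | 5 | $\sqrt{12}\,R_5^5(\rho;\epsilon)\sin 5\theta$ | |
| 22 | 6 | 0 | $\sqrt{7}\,R_6^0(\rho;\epsilon)$ | Secondary spherical |
| 23 | 6 | 2 | $\sqrt{14}\,R_6^2(\rho;\epsilon)\sin 2\theta$ | Tertiary astigmatism at 45° |
| 24 | 6 | 2 | $\sqrt{14}\,R_6^2(\rho;\epsilon)\cos 2\theta$ | Tertiary astigmatism at 0° |
| 25 | 6 | 4 | $\sqrt{14}\,R_6^4(\rho;\epsilon)\sin 4\theta$ | |
| 26 | 6 | 4 | $\sqrt{14}\,R_6^4(\rho;\epsilon)\cos 4\theta$ | |
| 27 | 6 | 6 | $\sqrt{14}\,R_6^6(\rho;\epsilon)\sin 6\theta$ | |
| 28 | 6 | 6 | $\sqrt{14}\,R_6^6(\rho;\epsilon)\cos 6\theta$ | |
| 29 | 7 | 1 | $4R_7^1(\rho;\epsilon)\sin\theta$ | |
| 30 | 7 | 1 | $4R_7^1(\rho;\epsilon)\cos\theta$ | |
| 31 | 7 | 3 | $4R_7^3(\rho;\epsilon)\sin 3\theta$ | |
| 32 | 7 | 3 | $4R_7^3(\rho;\epsilon)\cos 3\theta$ | |
| 33 | 7 | 5 | $4R_7^5(\rho;\epsilon)\sin 5\theta$ | |
| 34 | 7 | 5 | $4R_7^5(\rho;\epsilon)\cos 5\theta$ | |
| 35 | 7 | 7 | $4R_7^7(\rho;\epsilon)\sin 7\theta$ | |
| 36 | 7 | 7 | $4R_7^7(\rho;\epsilon)\cos 7\theta$ | |
| 37 | 8 | 0 | $3R_8^0(\rho;\epsilon)$ | Tertiary spherical aberration |
| 38 | 8 | 2 | $\sqrt{18}\,R_8^2(\rho;\epsilon)\cos 2\theta$ | |
| 39 | 8 | 2 | $\sqrt{18}\,R_8^2(\rho;\epsilon)\sin 2\theta$ | |
| 40 | 8 | 4 | $\sqrt{18}\,R_8^4(\rho;\epsilon)\cos 4\theta$ | |
| 41 | 8 | 4 | $\sqrt{18}\,R_8^4(\rho;\epsilon)\sin 4\theta$ | |
| 42 | 8 | 6 | $\sqrt{18}\,R_8^6(\rho;\epsilon)\cos 6\theta$ | |
| 43 | 8 | 6 | $\sqrt{18}\,R_8^6(\rho;\epsilon)\sin 6\theta$ | |
| 44 | 8 | 8 | $\sqrt{18}\,R_8^8(\rho;\epsilon)\cos 8\theta$ | |
| 45 | 8 | 8 | $\sqrt{18}\,R_8^8(\rho;\epsilon)\sin 8\theta$ | |

* The words “orthonormal annular” should be added to the name, e.g., orthonormal annular primary spherical aberration.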
5.7 STREHL RATIO FOR ANNULAR POLYNOMIAL ABERRATIONS
The Strehl ratio for an annular polynomial aberration with a sigma value of 0.1 wave is listed in Table 5-8 and plotted in Figure 5-7. For the wavefront tilt polynomials $A_2$ and $A_3$, the Strehl ratio simply represents the PSF value at a displaced point along the x or the y axis, respectively. This displacement for a tilt aberration sigma of 0.1 wave and $\epsilon = 0.5$ is $0.358\lambda F$. A closed-form expression for the Strehl ratio for the annular defocus polynomial can be obtained from Eq. (5-8) by letting
$$\Phi(\rho,\theta) = a_4A_4(\rho). \tag{5-34}$$
The result obtained is
$$S = \left[\frac{\sin\left(\sqrt{3}\,a_4\right)}{\sqrt{3}\,a_4}\right]^2. \tag{5-35}$$
For a defocus aberration sigma of 0.1 wave, $a_4 = 0.2\pi$ and $S = 0.66255$, in agreement with the result given in Table 5-8. Although Eq. (5-35) reads exactly the same as Eq. (4-82) for a circular pupil, the longitudinal defocus for a given value of $a_4$ is different for the annular pupil [see Eq. (5-37)].
If the defocus aberration is introduced by making an observation in a plane at a distance z instead of the Gaussian image plane at a distance R, the longitudinal defocus is $z - R$, and the aberration may be written in the form
$$W(\rho) = B_d\rho^2, \tag{5-36}$$
where $B_d$ represents its peak value given by Eq. (4-19). The annular coefficient $a_4$ is related to the longitudinal defocus $z - R$ according to
$$a_4 = \frac{\pi}{8\sqrt{3}\,\lambda F^2}\,(1-\epsilon^2)(z-R). \tag{5-37}$$
A positive value of defocus aberration is introduced when an observation is made at a distance $z < R$.
The results in Table 5-8 and Figure 5-7 illustrate that the Strehl ratio for a small aberration is nearly independent of the type of the aberration, and depends primarily on its sigma value. It is approximately given by Eq. (1-34) as $\exp(-\sigma_\Phi^2)$, or 0.67, where $\sigma_\Phi = 0.2\pi$.
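The defocus result of Eq. (5-35) can be compared with a direct numerical evaluation. The sketch below assumes that Eq. (5-8) has the usual form $S = |\langle\exp(i\Phi)\rangle|^2$, with the mean taken over the annulus; that form and the function names are assumptions of this example.

```python
import numpy as np

def strehl_defocus(a4):
    """Closed-form Strehl ratio of Eq. (5-35) for Phi = a4 * A4 (a4 in radians)."""
    x = np.sqrt(3) * a4
    return (np.sin(x) / x) ** 2

def strehl_defocus_numeric(a4, eps, n_rho=64):
    """|<exp(i*Phi)>|^2 over the annulus; assumes Eq. (5-8) has this standard form."""
    x, w = np.polynomial.legendre.leggauss(n_rho)
    rho = 0.5 * (1 - eps) * x + 0.5 * (1 + eps)        # nodes mapped to [eps, 1]
    A4 = np.sqrt(3) * (2 * rho**2 - 1 - eps**2) / (1 - eps**2)
    # Radially symmetric mean: (2 / (1 - eps^2)) * integral of f(rho) rho d(rho)
    mean = np.sum(0.5 * (1 - eps) * w * rho * np.exp(1j * a4 * A4)) * 2 / (1 - eps**2)
    return abs(mean) ** 2

a4 = 0.2 * np.pi                       # sigma of 0.1 wave expressed in radians
print(strehl_defocus(a4))              # about 0.6626, cf. Table 5-8 entry for A4
print(strehl_defocus_numeric(a4, 0.5)) # same value, independent of eps
print(np.exp(-a4 ** 2))                # approximation exp(-sigma^2), about 0.674
```

The numerical value does not depend on $\epsilon$ because $A_4$ is linear in $\rho^2$ and ranges over $[-\sqrt{3}, \sqrt{3}]$ for any obscuration ratio.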
Table 5-8. Strehl ratio S for annular polynomial aberrations for $\epsilon = 0.5$ and a sigma value of 0.1 wave.
A1 1 A16 0.675 A31 0.673
A2 0.661 A17 0.675 A32 0.673
A3 0.661 A18 0.669 A33 0.672
A4 0.663 A19 0.669 A34 0.672
A5 0.665 A20 0.681 A35 0.691
A6 0.665 A21 0.681 A36 0.691
A7 0.670 A22 0.668 A37 0.670
A8 0.670 A23 0.674 A38 0.678
A9 0.670 A24 0.674 A39 0.678
A10 0.670 A25 0.670 A40 0.672
A11 0.666 A26 0.670 A41 0.672
A12 0.669 A27 0.686 A42 0.675
A13 0.669 A28 0.686 A43 0.675
A14 0.675 A29 0.678 A44 0.696
A15 0.675 A30 0.678 A45 0.696

Figure 5-7. Strehl ratio for annular polynomial aberrations for $\epsilon = 0.5$ and a sigma value of 0.1 wave, shown on a nominal scale as well as on an expanded scale.
5.8 ISOMETRIC, INTERFEROMETRIC, AND IMAGING CHARACTERISTICS OF ANNULAR POLYNOMIAL ABERRATIONS
As in the case of circle polynomials (see Section 4.8), we illustrate the annular polynomials for $n \leq 8$ in three different but equivalent ways in Figure 5-8 for $\epsilon = 0.5$ and a sigma value of one wave [8]. For each polynomial, the isometric plot at the top illustrates its shape. An interferogram is shown on the left, and a corresponding PSF is shown on the right. The peak-to-valley aberration numbers (in units of wavelength) are given in Table 5-9. From Eqs. (5-16) for the form of the polynomials, it is evident that the P-V numbers of two polynomials with the same values of n and m are the same. This may also be seen from Table 5-7.
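The peak-to-valley numbers can be estimated by sampling a polynomial over the annulus. The sketch below is an illustration only: it uses the closed forms of $A_2$, $A_4$, and $A_{11}$, a simple polar grid chosen here, and a hypothetical helper `peak_to_valley`.

```python
import numpy as np

def A2(rho, theta, eps):
    return 2 * rho * np.cos(theta) / np.sqrt(1 + eps**2)

def A4(rho, theta, eps):
    return np.sqrt(3) * (2 * rho**2 - 1 - eps**2) / (1 - eps**2) + 0 * theta

def A11(rho, theta, eps):
    return (np.sqrt(5) * (6 * rho**4 - 6 * (1 + eps**2) * rho**2
                          + 1 + 4 * eps**2 + eps**4) / (1 - eps**2) ** 2 + 0 * theta)

def peak_to_valley(poly, eps, n_rho=1001, n_theta=361):
    """Max minus min of a unit-sigma polynomial sampled on a polar grid of the annulus."""
    rho = np.linspace(eps, 1.0, n_rho)
    theta = np.linspace(0.0, 2 * np.pi, n_theta)
    R, T = np.meshgrid(rho, theta, indexing="ij")
    values = poly(R, T, eps)
    return values.max() - values.min()

for name, poly in (("A2", A2), ("A4", A4), ("A11", A11)):
    print(name, round(peak_to_valley(poly, 0.5), 3))
```

For $\epsilon = 0.5$ these estimates can be compared with the entries 3.578, 3.464, and 3.354 listed for $A_2$, $A_4$, and $A_{11}$ in the table of P-V numbers.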
The PSF plots represent the images of a point object in the presence of an annular polynomial aberration. Thus, for example, piston yields the aberration-free PSF (since it has no effect on the PSF) given by Eq. (5-2). The full width of a square displaying the PSFs in Figure 5-8 is $24\lambda F$.
The polynomial aberrations $A_2$ and $A_3$ represent the x and y wavefront tilts. The coefficient $a_2$ corresponds to a wavefront tilt angle of $4a_2\lambda/[D(1+\epsilon^2)^{1/2}]$ about the y axis and displaces the PSF along the x axis by $4a_2\lambda F/(1+\epsilon^2)^{1/2}$. Similarly, $a_3$ corresponds to a wavefront tilt angle of $4a_3\lambda/[D(1+\epsilon^2)^{1/2}]$ about the x axis and displaces the PSF by $4a_3\lambda F/(1+\epsilon^2)^{1/2}$ along the y axis. As the order of a polynomial aberration increases, the interferograms and the PSFs become more and more complex.
The 3D MTF plots for the primary polynomial aberrations and $A_{10}$ are shown in Figure 5-9 for a sigma value of 0.1 wave. The contour plots shown below each 3D MTF figure are in steps of 0.1 from the center out, starting with a value of 0.9 and ending with zero. The tangential (long dashes), sagittal (medium dashes), and 45° (small dashes) MTF plots are also shown in this figure, i.e., for the spatial frequency vector along the x axis, y axis, and at 45° from the x axis, respectively. Figure 5-10a shows the symmetry of the real and the imaginary parts of the OTF for the orthogonal primary coma $A_8$. The real part has even symmetry, but the imaginary part has odd symmetry. The real and imaginary parts of the OTF for the polynomial aberration $A_{10}$ are shown in Figure 5-10b. Since the aberration is 3-fold symmetric, the imaginary part of the OTF is 3-fold symmetric, but the real part is 6-fold symmetric, as expected.
Comparing the form of the annular polynomials with those of the circle polynomials given in Chapter 4, it is easy to see that the symmetry properties of the interferograms, PSFs, real and imaginary parts of the OTF, and the MTFs aberrated by an annular polynomial aberration are the same as those for a corresponding circle polynomial aberration in a circular pupil. These properties are summarized in Table 4-6.
[Figure 5-8: panels for $A_1$ through $A_{45}$.]
Figure 5-8. Annular polynomials shown as isometric plot on the top, interferogram on the left, and PSF on the right for $\epsilon = 0.5$ and a sigma value of one wave.
Table 5-9. Peak-to-valley (P-V) numbers in units of wavelength of orthonormal annular polynomials for $\epsilon = 0.5$ and a sigma value of one wave.
A1 0 A16 6.626 A31 7.206
A2 3.578 A17 6.626 A32 7.206
A3 3.578 A18 6.094 A33 6.944
A4 3.464 A19 6.094 A34 6.944
A5 4.276 A20 6.001 A35 6.928
A6 4.276 A21 6.001 A36 6.928
A7 5.285 A22 5.292 A37 4.286
A8 5.285 A23 6.916 A38 7.138
A9 4.909 A24 6.916 A39 7.138
A10 4.909 A25 6.520 A40 7.510
A11 3.354 A26 6.520 A41 7.510
A12 5.679 A27 6.481 A42 7.354
A13 5.679 A28 6.481 A43 7.354
A14 5.480 A29 7.329 A44 7.348
A15 5.480 A30 7.329 A45 7.348

[Figure 5-9: panels for $A_1$ (piston), $A_4$ (defocus), $A_6$ (primary astigmatism), $A_8$ (primary coma), $A_{10}$, and $A_{11}$ (primary spherical).]
Figure 5-9. 3D, tangential or along x axis (in long dashes), sagittal or along y axis (in medium dashes), and at 45° from the x axis (in small dashes) MTF plots for annular polynomial aberrations with a sigma value of 0.1 wave for $\epsilon = 0.5$. The solid curve represents the aberration-free MTF. The spatial frequency $v$ is normalized by the cutoff frequency $1/\lambda F$. The contour plots below each 3D MTF plot are in steps of 0.1 from the center out, starting with 0.9 and ending with zero.
[Figure 5-10: real (Re) and imaginary (Im) parts of the OTF for (a) $A_8$, primary coma, and (b) $A_{10}$.]
Figure 5-10. Real and imaginary parts of the OTF for an annular polynomial aberration with a sigma value of 0.1 wave for $\epsilon = 0.5$. (a) $A_8$ (primary coma) shows the even and odd symmetry of the real and imaginary parts. (b) $A_{10}$ shows the 6-fold symmetry of the real part and 3-fold symmetry of the imaginary part, in addition to their even and odd symmetry, respectively. The thick and thin contours of the imaginary part represent its positive and negative values, respectively.
5.9 SUMMARY
A brief description of the aberration-free PSF and OTF of a system with an annular pupil is given in Section 5.2, followed by a discussion of the Strehl ratio and aberration balancing for such a system in Section 5.3. The variation of the standard deviation of a primary aberration with the obscuration ratio is shown in Figure 5-5. It is evident, for example, from Figure 5-5d that the standard deviation of the defocus aberration decreases, and the depth of focus accordingly increases, as the obscuration increases.
The annular polynomials orthonormal over an annular pupil, obtained by orthonormalizing the Zernike circle polynomials, are given in Table 5-3 in terms of the circle polynomials. This form is useful for comparing the expansions of an annular wavefront in terms of the annular and circle polynomials, as discussed in Chapter 12. The nonzero elements of a 15 × 15 conversion matrix for obtaining the annular polynomials from the circle polynomials are given in Table 5-4. The annular polynomials are given in Cartesian coordinates in Table 5-5 for numerical analyses of annular wavefronts. The radial annular polynomials for $n \leq 8$ are given in Table 5-6. The ordering of the annular polynomials in Table 5-7 is the same as that for the circle polynomials in Table 4-3.
The Strehl ratio for a sigma value of $0.1\lambda$ for each aberration polynomial is given in Table 5-8 and illustrated in Figure 5-7. It shows that, for a small aberration, the Strehl ratio can be estimated from the aberration variance. The annular polynomials for $n \leq 8$ are illustrated by an isometric plot, an interferogram, and a PSF in Figure 5-8 for $\epsilon = 0.5$ and a sigma value of one wave. Their peak-to-valley numbers are given in Table 5-9 in units of wavelength. The 3D MTFs are shown in Figure 5-9 for the primary and $A_{10}$ polynomial aberrations. The tangential, sagittal, and 45° MTF plots are also shown in Figure 5-9, i.e., for the spatial frequency vector along the x axis, y axis, and at 45° from the x axis, respectively. The real and imaginary parts of the OTFs are shown in Figure 5-10 for the $A_8$ and $A_{10}$ polynomial aberrations that have odd values of m.
The symmetry properties of an interferogram, PSF, and real and imaginary parts of the OTF and MTF aberrated by an annular polynomial aberration are the same as those for a corresponding circle polynomial aberration in a circular pupil. These properties are summarized in Table 4-6.
References

2. H. F. Tschunko, “Imaging performance of annular apertures,” Appl. Opt. 13, 1820–1823 (1974).
3. E. L. O’Neill, “Transfer function for an annular aperture,” J. Opt. Soc. Am. 46, 285–288 (1956). Note that a term of $-2\eta^2$ is missing in the second of O’Neill’s Eq. (26), as was pointed out by the author in an Errata on p. 1096 in the Dec 1956 issue. Unfortunately, the obscuration ratio $\eta$ in the original paper was typed incorrectly as n in the Errata.
4. W. H. Steel, “Étude des effets combines des aberrations et d’une obturation

centrale de la pupille sur le contraste des images optiques.” Rev. Opt. (Paris) 32,
143–178 (1953).
5. V. N. Mahajan, “Zernike annular polynomials and optical aberrations of systems

with annular pupils,” Appl. Opt. 33, 8125–8127 (1994).
6. V. N. Mahajan, “Zernike annular polynomials for imaging systems with annular

pupils,” J. Opt. Soc. Am. 71, 75–85 (1981); 71, 1408 (1981); 1, 685 (1984).
7. V. N. Mahajan, “Orthonormal polynomials in wavefront analysis,” Handbook of

Optics, V. N. Mahajan and E. V. Stryland, eds., 3rd edition, Vol II, (McGraw Hill,
2009), pp. 11.3–11.41.
8. V. N. Mahajan and José A. Díaz, “Imaging characteristics of Zernike and annular

polynomial aberrations,” Appl. Opt. 52, 1–13 (2013).
CHAPTER 6
SYSTEMS WITH GAUSSIAN PUPILS
6.1 Introduction
6.2 Gaussian Pupil
6.3.1 PSF
6.3.2 Optimum Gaussian Radius
6.3.3 OTF
6.4 Strehl Ratio and Aberration Balancing
6.5 Orthonormalization of Zernike Circle Polynomials over a Gaussian Circular Pupil
6.6 Gaussian Circle Polynomials Representing Balanced Primary Aberrations for a Gaussian Circular Pupil
6.7 Weakly Truncated Gaussian Pupils
6.8 Aberration Coefficients of a Gaussian Circular Aberration Function
6.9 Orthonormalization of Annular Polynomials over a Gaussian Annular Pupil
6.10 Gaussian Annular Polynomials Representing Balanced Primary Aberrations for a Gaussian Annular Pupil
6.11 Aberration Coefficients of a Gaussian Annular Aberration Function
6.12 Summary
References
Chapter 6
Systems with Gaussian Pupils
6.1 INTRODUCTION
In this chapter, we consider optical systems with Gaussian apodization or Gaussian
pupils, i.e., those with a Gaussian amplitude across the wavefront at their exit pupils,
which may be circular or annular [1,2]. The discussion in this chapter is equally
applicable to imaging systems with a Gaussian transmission (obtained, for example, by
placing a Gaussian filter at its exit pupil) as well as laser transmitters in which the laser
beam has a Gaussian distribution at its exit pupil. It is evident that whereas a Gaussian
function extends to infinity, the pupil of an optical system can only have a finite diameter.
The net effect is that the finite size of the pupil truncates the infinite-extent Gaussian
function. If the Gaussian function is very narrow (i.e., its standard deviation is very small)
compared to the radius of the pupil, it is said to be weakly truncated. In such cases, the
truncation can be neglected, and the pupil can be assumed to be infinitely wide.
The aberration-free image for a system with a Gaussian pupil shows that the
Gaussian illumination reduces the central value, broadens the central bright spot, but
reduces the power in the diffraction rings compared to a uniform pupil. Correspondingly,
the OTF for a Gaussian pupil is higher for low spatial frequencies, and lower for the high.
In these respects, the effect of a Gaussian illumination is opposite to that of a central
obscuration in an annular pupil. The diffraction rings practically disappear when the pupil
radius is twice the Gaussian radius, and the beam propagates as a Gaussian everywhere.
The OTF in this case is also described by a Gaussian function.
The standard deviation of a primary aberration over a Gaussian pupil is calculated
and shown to be smaller than its corresponding value for a uniform pupil. This is due to
the fact that the wave amplitude decreases as a function of the radial distance from the
center of the pupil while the aberration increases, i.e., the amplitude is smaller where the
aberration is larger. Accordingly, the Strehl ratio for a Gaussian pupil for a given amount
of a primary aberration is higher than that for a uniform pupil, or the aberration tolerance
for a given Strehl ratio is higher for a Gaussian pupil. The balanced primary aberrations
with minimum variance are also obtained, and the diffraction focus for various values of
the truncation ratio are given. The Gaussian polynomials orthonormal over a Gaussian
pupil are obtained by orthogonalizing the circle polynomials over such a pupil. As
expected, the Gaussian polynomials for primary aberrations represent balanced
aberrations. Similarly, the orthonormal Gaussian annular polynomials are obtained by
orthogonalizing the annular polynomials over a Gaussian pupil. Again, the primary
Gaussian annular polynomials represent the balanced aberrations for a Gaussian annular
pupil. The isometric, interferometric, and imaging characteristics of the Gaussian circular
and annular polynomial aberrations are not discussed because of their similarity with
those of the corresponding circle or annular polynomial aberrations for uniform pupils.
6.2 GAUSSIAN PUPIL

The pupil function for a system with a Gaussian pupil of radius $a$ may be written [1]
$$P(\rho, \theta) = A(\rho) \exp[i\Phi(\rho, \theta)] , \tag{6-1}$$
where
$$A(\rho) = A_0 \exp(-\gamma\rho^2) . \tag{6-2}$$
Here $A_0$ is the amplitude at the center of the pupil, a constant that is determined from the total power in the pupil, and
$$\gamma = (a/\omega)^2 , \tag{6-3}$$
where the quantity $\omega$, called the Gaussian radius, represents the radial distance from the
center of the pupil at which the amplitude drops to $e^{-1}$ of the amplitude at the center. The
pupil radius $a$ normalized by the Gaussian radius $\omega$, i.e., $\sqrt{\gamma} = a/\omega$, is called the
truncation ratio. The larger the value of $\gamma$ is, the narrower the Gaussian beam is. A
uniform beam is represented by the limiting case of $\gamma \to 0$. The aberration function
$\Phi(\rho, \theta)$ represents the phase aberration at a point $(\rho, \theta)$ in the plane of the exit pupil,
where $0 \le \rho \le 1$ and $0 \le \theta \le 2\pi$.
A Gaussian pupil is obtained when a Gaussian laser beam illuminates a pupil or
when a uniform beam illuminates the pupil with a Gaussian transmission. In the former
case, the total power incident on the pupil and that exiting from it are given by
$$P_{inc} = 2A_0^2 S_{ex} \int_0^\infty \exp(-2\gamma\rho^2)\, \rho\, d\rho = \frac{A_0^2 S_{ex}}{2\gamma} , \tag{6-4}$$
and
$$P_{ex} = 2A_0^2 S_{ex} \int_0^1 \exp(-2\gamma\rho^2)\, \rho\, d\rho = A_0^2 (S_{ex}/2\gamma)[1 - \exp(-2\gamma)] , \tag{6-5}$$
respectively. The fractional transmitted power that goes on to the image is given by
$$P_{trans} = P_{ex}/P_{inc} = 1 - \exp(-2\gamma) . \tag{6-6}$$
More and more power is transmitted as the beam becomes narrower and narrower, i.e., as
$\omega$ decreases or $\gamma$ increases. The pupil irradiance $A^2(\rho)$ in units of $P_{ex}/S_{ex}$ may be
written
$$I(\rho) = 2\gamma \exp(-2\gamma\rho^2) / [1 - \exp(-2\gamma)] . \tag{6-7}$$
The pupil in the latter case, where an amplitude filter is placed in the pupil plane, is
said to be apodized. The power incident in this case is $P_{inc} = A_0^2 S_{ex}$. The power exiting
from the pupil is again given by Eq. (6-5), but the fractional transmitted power is given
by
$$P_{trans} = P_{ex}/P_{inc} = \frac{1 - \exp(-2\gamma)}{2\gamma} . \tag{6-8}$$
In this case, the transmitted power decreases as $\gamma$ increases.
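The opposite trends of Eqs. (6-6) and (6-8) can be checked numerically. The following is a minimal sketch, not part of the original text; the function names are illustrative assumptions.

```python
import math

def transmitted_fraction_laser(gamma):
    """Eq. (6-6): a Gaussian laser beam truncated by the pupil."""
    return 1.0 - math.exp(-2.0 * gamma)

def transmitted_fraction_apodized(gamma):
    """Eq. (6-8): a uniform beam passing through a Gaussian amplitude filter."""
    if gamma == 0.0:
        return 1.0  # limit of [1 - exp(-2*gamma)] / (2*gamma) as gamma -> 0
    return (1.0 - math.exp(-2.0 * gamma)) / (2.0 * gamma)

if __name__ == "__main__":
    for gamma in (0.25, 1.0, 4.0):
        print(f"gamma={gamma}: laser {transmitted_fraction_laser(gamma):.4f}, "
              f"apodized {transmitted_fraction_apodized(gamma):.4f}")
```

For increasing $\gamma$ the first fraction rises toward unity while the second falls, in line with the two statements in the text.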

6.3.1 PSF
Substituting Eq. (6-2) into Eq. (2-4), the irradiance distribution in the image plane in
units of $P_{ex}S_{ex}/\lambda^2R^2$ may be written
$$I(r; \theta_i; \gamma) = \pi^{-2} \left| \int_0^1\!\!\int_0^{2\pi} \sqrt{I(\rho)}\, \exp[-\pi i r\rho \cos(\theta_i - \theta)]\, \rho\, d\rho\, d\theta \right|^2 , \tag{6-9}$$
or, carrying out the angular integration,
$$I(r; \gamma) = 4 \left[ \int_0^1 \sqrt{I(\rho)}\, J_0(\pi r\rho)\, \rho\, d\rho \right]^2 . \tag{6-10}$$
Letting $r = 0$ in Eq. (6-10), we obtain the central value
$$I(0; \gamma) = \tanh(\gamma/2) / (\gamma/2) . \tag{6-11}$$
For large values of $\gamma$, a pupil is said to be weakly truncated. For such a pupil,
$$I(0; \gamma) \to 2/\gamma . \tag{6-12}$$
The fractional power in the image plane contained in a circle of radius $r_c$ is given by
$$P(r_c; \gamma) = (\pi^2/2) \int_0^{r_c} I(r; \gamma)\, r\, dr , \tag{6-13}$$
where $r_c$ is in units of $\lambda F$.
Figure 6-1 shows the image-plane irradiance and encircled-power distributions for
$\sqrt{\gamma} = 0$, 1, 2, and 3. It is evident that the Gaussian illumination reduces the central value
and broadens the central bright spot, but reduces the power in the diffraction rings. For
example, when $\gamma = 1$, the central value is 0.924 compared to a value of 1 for a uniform
beam. Moreover, the central bright spot has a radius of 1.43 and contains 95.5% of the
total power compared to a radius of 1.22 containing 83.8% of the power for a uniform
beam. The diffraction rings practically disappear for $\gamma \ge 4$, and the beam propagates as a
Gaussian everywhere.
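The central value quoted above can be reproduced by evaluating Eq. (6-10) numerically. The sketch below is an illustration only: the midpoint quadrature, the step counts, and the function names are assumptions, and $J_0$ is computed from its standard integral representation so that only the Python standard library is needed.

```python
import math

def pupil_irradiance(rho, gamma):
    """Eq. (6-7): pupil irradiance in units of P_ex / S_ex."""
    if gamma == 0.0:
        return 1.0  # uniform pupil
    return 2.0 * gamma * math.exp(-2.0 * gamma * rho**2) / (1.0 - math.exp(-2.0 * gamma))

def bessel_j0(x, steps=200):
    """J0(x) = (1/pi) * integral_0^pi cos(x sin t) dt, by the midpoint rule."""
    h = math.pi / steps
    return sum(math.cos(x * math.sin((k + 0.5) * h)) for k in range(steps)) * h / math.pi

def psf(r, gamma, steps=400):
    """Eq. (6-10) by midpoint quadrature; r is in units of lambda*F."""
    h = 1.0 / steps
    total = 0.0
    for k in range(steps):
        rho = (k + 0.5) * h
        total += math.sqrt(pupil_irradiance(rho, gamma)) * bessel_j0(math.pi * r * rho) * rho * h
    return 4.0 * total**2

def central_value(gamma):
    """Eq. (6-11)."""
    return 1.0 if gamma == 0.0 else math.tanh(gamma / 2.0) / (gamma / 2.0)

if __name__ == "__main__":
    print(psf(0.0, 1.0), central_value(1.0))  # both should be close to 0.924
    print(psf(1.22, 0.0))                     # near the first dark ring of a uniform pupil
```

The quadrature at $r = 0$ and the closed form of Eq. (6-11) should agree to within the discretization error of the sketch.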
6.3.2 Optimum Gaussian Radius

For a given total beam power $P_{inc}$ incident on a pupil of fixed radius $a$, the
transmitted power $P_{ex}$ increases as $\omega$ decreases, but the corresponding central irradiance
in the image plane decreases. Hence, there is an optimum value of $\omega$ that yields the
maximum central value. To determine this value, we write the central irradiance given by
Eq. (6-11) in units of $P_{inc}S_{ex}/\lambda^2R^2$:
$$I(0; \gamma) = [1 - \exp(-2\gamma)]\, \frac{\tanh(\gamma/2)}{\gamma/2} = (2/\gamma)[1 - \exp(-\gamma)]^2 . \tag{6-14}$$
Figure 6-1. PSF and encircled power for a Gaussian pupil with $\sqrt{\gamma} = 0$, 1, 2, and 3.
The irradiance is in units of $P_{ex}S_{ex}/\lambda^2R^2$, and the encircled power is in units of $P_{ex}$.
$r$ and $r_c$ are in units of $\lambda F$.
Letting
$$\frac{\partial I(0; \gamma)}{\partial \gamma} = 0 , \tag{6-15}$$
we find that $I(0; \gamma)$ is maximum and equal to 0.8145 when $\gamma = 1.255$ or $\omega = 0.893a$.
The corresponding irradiance at the edge of the pupil is 8.1% of its central value, and the transmitted power
$P_{trans}$ is 91.87%. Figure 6-2 shows how $I(0; \gamma)$ varies with $\sqrt{\gamma}$.
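The optimum can be located numerically. Differentiating Eq. (6-14) and setting the result to zero gives the condition $\exp(\gamma) = 1 + 2\gamma$; this intermediate step is worked out here and is not quoted in the text. The sketch below solves it by bisection; the bracketing interval and function names are assumptions.

```python
import math

def central_irradiance_fixed_incident_power(gamma):
    """Eq. (6-14): central irradiance in units of P_inc*S_ex/(lambda^2 R^2)."""
    return 2.0 * (1.0 - math.exp(-gamma))**2 / gamma

def optimum_gamma(lo=0.5, hi=3.0, tol=1e-12):
    """Bisection on exp(gamma) - 1 - 2*gamma = 0, the stationarity condition of Eq. (6-14)."""
    def f(g):
        return math.exp(g) - 1.0 - 2.0 * g
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(lo) * f(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    g = optimum_gamma()
    print("gamma       :", g)
    print("omega / a   :", 1.0 / math.sqrt(g))
    print("I(0; gamma) :", central_irradiance_fixed_incident_power(g))
    print("edge / peak :", math.exp(-2.0 * g))       # pupil irradiance at rho = 1 relative to rho = 0
    print("P_trans     :", 1.0 - math.exp(-2.0 * g))  # Eq. (6-6)
```

The printed values should be close to those quoted above, up to rounding of $\gamma$ in the text.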
6.3.3 OTF
From Eq. (2-13), the OTF for an aberration-free Gaussian pupil is given by
$$\tau(\vec{v}_i; \gamma) = P_{ex}^{-1} \int A(\vec{r}_p)\, A(\vec{r}_p - \lambda R\vec{v}_i)\, d\vec{r}_p \tag{6-16}$$
in the pupil coordinate system $(x_p, y_p)$. Let the spatial frequency vector $\vec{v}_i$ with its
Cartesian components $(\xi, \eta)$ make an angle $\phi$ with the $x_p$ axis, as illustrated in Figure 6-3.
It is convenient to write the autocorrelation integral in a $(p, q)$ coordinate system
whose axes are rotated by an angle $\phi$ with respect to the $(x_p, y_p)$ system (so that the $p$
axis lies along the direction of the spatial frequency vector $\vec{v}_i$) and whose origin lies at a
distance $\lambda R v_i/2$ from that of the $(x_p, y_p)$ system along the $p$ axis. If we further let the
$(p, q)$ coordinates be normalized by the pupil radius $a$ and the spatial frequency $v_i$ be
normalized by the cutoff spatial frequency $1/\lambda F$, the OTF can be written
$$\tau(v; \gamma) = (a^2/P_{ex}) \iint A(p + v, q)\, A(p - v, q)\, dp\, dq , \quad 0 \le v \le 1 . \tag{6-17}$$
Figure 6-2. Variation of $I(0; \gamma)$ normalized by $P_{inc}S_{ex}/\lambda^2R^2$ as a function of $\sqrt{\gamma}$,
showing that its value is maximum when $\sqrt{\gamma} = 1.120$ or $\omega = 0.893a$.
Figure 6-3. Geometry for evaluating the OTF. The centers of the two pupils are
located at $(0, 0)$ and $\lambda R(\xi, \eta)$ in the $(x_p, y_p)$ coordinate system and $\mp(\lambda R/2)(v_i, 0)$
in the $(p, q)$ coordinate system, where $v_i = (\xi^2 + \eta^2)^{1/2}$ and $\phi = \tan^{-1}(\eta/\xi)$. The
shaded area is the overlap area of the two pupils. When normalized by the pupil
radius $a$, the centers of the two pupils of unity radius lie at $\mp v$ along the $p$ axis.
Substituting for the amplitude $A(\rho)$ from Eq. (6-2) and for the power $P_{ex}$ from Eq. (6-5)
into Eq. (6-17), we obtain
$$\tau(v; \gamma) = \frac{8\gamma \exp(-2\gamma v^2)}{\pi[1 - \exp(-2\gamma)]} \int_0^{\sqrt{1 - v^2}} dq \int_0^{\sqrt{1 - q^2} - v} \exp[-2\gamma(p^2 + q^2)]\, dp , \tag{6-18}$$
where the integration is over a quadrant of the overlap region of two pupils whose centers
are separated by a distance $v$ along the $p$ axis. For large values of $\gamma$ (e.g., $\gamma \ge 4$), the
contribution to the integral in Eq. (6-18) is negligible unless $v = 0$, in which case it
represents the Gaussian-weighted area of a quadrant of the pupil, and the equation
reduces to
$$\tau(v; \gamma) = \exp(-2\gamma v^2) , \quad 0 \le v \le 1 . \tag{6-19}$$
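Equation (6-18) is straightforward to evaluate numerically, which also allows a check of the Gaussian limit of Eq. (6-19). The sketch below is an illustration under stated assumptions: the inner integral over $p$ is written in closed form with the error function, the outer integral over $q$ uses a midpoint rule with an arbitrarily chosen number of steps, and the uniform-pupil OTF used for comparison is the standard diffraction-limited expression for a circular pupil.

```python
import math

def gaussian_otf(v, gamma, steps=2000):
    """Eq. (6-18) for gamma > 0; v is normalized by the cutoff frequency 1/(lambda*F)."""
    if not 0.0 <= v <= 1.0:
        return 0.0
    q_max = math.sqrt(1.0 - v * v)
    h = q_max / steps
    scale = math.sqrt(2.0 * gamma)
    total = 0.0
    for k in range(steps):
        q = (k + 0.5) * h
        p_max = math.sqrt(1.0 - q * q) - v
        # integral of exp(-2*gamma*p^2) from 0 to p_max, in closed form
        inner = math.sqrt(math.pi / (8.0 * gamma)) * math.erf(scale * p_max)
        total += math.exp(-2.0 * gamma * q * q) * inner * h
    prefactor = (8.0 * gamma * math.exp(-2.0 * gamma * v * v)
                 / (math.pi * (1.0 - math.exp(-2.0 * gamma))))
    return prefactor * total

def uniform_otf(v):
    """Diffraction-limited OTF of a uniform circular pupil (the gamma = 0 limit)."""
    if not 0.0 <= v <= 1.0:
        return 0.0
    return (2.0 / math.pi) * (math.acos(v) - v * math.sqrt(1.0 - v * v))

if __name__ == "__main__":
    for v in (0.0, 0.2, 0.5, 0.8):
        print(v, uniform_otf(v), gaussian_otf(v, 1.0), gaussian_otf(v, 9.0),
              math.exp(-2.0 * 9.0 * v * v))
```

For a large value such as $\gamma = 9$ the numerical result should be nearly indistinguishable from $\exp(-2\gamma v^2)$, while for $\gamma = 1$ it lies above the uniform-pupil OTF at low frequencies and below it at high frequencies.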
Figure 6-4 shows how the OTF varies with $v$ for several values of $\gamma$. We note that
compared to a uniform pupil (i.e., for $\gamma = 0$), the OTF of a Gaussian pupil is higher for
low spatial frequencies, and lower for the high. Moreover, as $\gamma$ increases, the bandwidth
of low frequencies for which the OTF is higher decreases and the OTF at high
frequencies becomes increasingly smaller. This is due to the fact that the Gaussian
weighting across the overlap region of two pupils whose centers are separated by small
values of $v$ is higher than that for large values of $v$. If we consider an apodization such
that the amplitude increases from the center toward the edge of the pupil, then the OTF is
lower for low frequencies and higher for the high. Thus unlike aberrations, which reduce
the MTF of a system at all frequencies within its passband, the amplitude variations can
increase or decrease the MTF at any of those frequencies.
Figure 6-4. The OTF of a Gaussian pupil. A uniform pupil corresponds to $\gamma = 0$,
and a large value of $\gamma$ represents a weakly truncated pupil.
6.4 STREHL RATIO AND ABERRATION BALANCING

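From Eq. (2-22), the Strehl ratio (representing the ratio of the central irradiances with
and without aberration) for a Gaussian pupil is given by [1–3]
$$\begin{aligned} S &= \left| \int_0^1\!\!\int_0^{2\pi} A(\rho) \exp[i\Phi(\rho, \theta)]\, \rho\, d\rho\, d\theta \right|^2 \Big/ \left[ \int_0^1\!\!\int_0^{2\pi} A(\rho)\, \rho\, d\rho\, d\theta \right]^2 \\ &= \left\{ \frac{\gamma}{\pi[1 - \exp(-\gamma)]} \right\}^2 \left| \int_0^1\!\!\int_0^{2\pi} \exp(-\gamma\rho^2) \exp[i\Phi(\rho, \theta)]\, \rho\, d\rho\, d\theta \right|^2 . \end{aligned} \tag{6-20}$$
For small aberrations, the Strehl ratio is approximately given by
$$S \sim \exp(-\sigma_\Phi^2) , \tag{6-21}$$
where
$$\sigma_\Phi^2 = \langle\Phi^2\rangle - \langle\Phi\rangle^2 \tag{6-22}$$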
is the variance of the phase aberration across the Gaussian-amplitude weighted pupil. The
mean and the mean square values of the aberration are obtained from the expression
$$\begin{aligned} \langle\Phi^n\rangle &= \int_0^1\!\!\int_0^{2\pi} A(\rho)[\Phi(\rho, \theta)]^n\, \rho\, d\rho\, d\theta \Big/ \int_0^1\!\!\int_0^{2\pi} A(\rho)\, \rho\, d\rho\, d\theta \\ &= \frac{\gamma}{\pi[1 - \exp(-\gamma)]} \int_0^1\!\!\int_0^{2\pi} \exp(-\gamma\rho^2)[\Phi(\rho, \theta)]^n\, \rho\, d\rho\, d\theta , \end{aligned} \tag{6-23}$$
with $n = 1$ and 2, respectively. The angular brackets indicate a mean value over the
Gaussian pupil.
Table 6-1 lists the primary aberrations and their standard deviations for increasing
values of $\gamma$. It is evident that the standard deviation of an aberration decreases as $\gamma$
increases. This is due to the fact that while an aberration increases as $\rho$ increases, the
amplitude decreases more and more rapidly as $\gamma$ increases, thus reducing its effect more
and more compared to that for a uniform pupil. Accordingly, for a given small amount of
aberration $A_i$, the Strehl ratio for a Gaussian pupil is higher than that for a uniform pupil.
Similarly, the aberration tolerance for a given Strehl ratio is higher for a Gaussian pupil.
Its approximate value can be obtained from Eq. (6-21).

Table 6-1. Primary aberrations and their standard deviations for optical systems
with Gaussian pupils. For comparison, the results for a uniform pupil ($\gamma = 0$) are
also given.

| Primary Aberration | $\sigma_\Phi$ ($\gamma = 0$) | $\sigma_\Phi$ ($\gamma = 1$) | $\sigma_\Phi$ ($\sqrt{\gamma} = 2$) | $\sigma_\Phi$ ($\sqrt{\gamma} \ge 3$) |
| Spherical, $A_s\rho^4$ | $2A_s/(3\sqrt{5}) = A_s/3.35$ | $A_s/3.67$ | $A_s/6.20$ | $2\sqrt{5}A_s/\gamma^2$ |
| Coma, $A_c\rho^3\cos\theta$ | $A_c/(2\sqrt{2}) = A_c/2.83$ | $A_c/3.33$ | $A_c/6.08$ | $\sqrt{3}A_c/\gamma^{3/2}$ |
| Astigmatism, $A_a\rho^2\cos^2\theta$ | $A_a/4$ | $A_a/4.40$ | $A_a/6.59$ | $A_a/(\sqrt{2}\gamma)$ |
| Defocus, $B_d\rho^2$ | $B_d/(2\sqrt{3}) = B_d/3.46$ | $B_d/3.55$ | $B_d/4.79$ | $B_d/\gamma$ |
| Tilt, $B_t\rho\cos\theta$ | $B_t/2$ | $B_t/2.19$ | $B_t/2.94$ | $B_t/\sqrt{2\gamma}$ |
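The entries of Table 6-1 follow from Eq. (6-23) once the moments $\langle\rho^s\rangle$ of the amplitude-weighted pupil are known. The sketch below is an illustration, not the author's procedure: it uses a moment recursion obtained by one integration by parts, takes each aberration coefficient as unity, and carries out the angular averages analytically. The function names are assumptions, and for very small nonzero $\gamma$ the recursion loses precision.

```python
import math

def moment(s, gamma):
    """<rho^s> for even s over the pupil weighted by the amplitude exp(-gamma*rho^2)."""
    if gamma == 0.0:
        return 2.0 / (s + 2.0)  # uniform pupil
    p = 1.0  # <rho^0>
    for k in range(2, s + 1, 2):
        p = 1.0 / (1.0 - math.exp(gamma)) + k / (2.0 * gamma) * p
    return p

def primary_sigmas(gamma):
    """Standard deviation of each primary aberration for a unit coefficient."""
    p2, p4, p6, p8 = (moment(s, gamma) for s in (2, 4, 6, 8))
    return {
        "spherical": math.sqrt(p8 - p4**2),                     # rho^4
        "coma": math.sqrt(p6 / 2.0),                            # rho^3 cos(theta)
        "astigmatism": math.sqrt(3.0 * p4 / 8.0 - p2**2 / 4.0), # rho^2 cos^2(theta)
        "defocus": math.sqrt(p4 - p2**2),                       # rho^2
        "tilt": math.sqrt(p2 / 2.0),                            # rho cos(theta)
    }

if __name__ == "__main__":
    for gamma in (0.0, 1.0, 4.0):  # gamma = 4 corresponds to sqrt(gamma) = 2
        divisors = {name: 1.0 / s for name, s in primary_sigmas(gamma).items()}
        print(gamma, {name: round(d, 2) for name, d in divisors.items()})
```

The printed divisors can be compared directly with the table; for example, the defocus entry for $\gamma = 1$ should come out close to 3.55.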
Since the Strehl ratio depends on the aberration variance, we balance a given
aberration with lower-order aberrations to minimize its variance. Thus, we balance
spherical aberration and astigmatism with defocus aberration, and coma with tilt
aberration to minimize their variance. The balanced primary aberrations thus obtained are
listed in Table 6-2. For example, the defocus aberration that balances spherical aberration
is given by $B_d/A_s = -1$, $-0.933$, and $-4/\gamma$ when $\sqrt{\gamma} = 0$, 1, and $\ge 3$, respectively.
Similarly, the tilt aberration that balances coma for these values of $\gamma$ is given by
$B_t/A_c = -(2/3)$, $-0.608$, and $-2/\gamma$, respectively. The defocus coefficient given by
$B_d = -A_a/2$ to balance astigmatism is independent of the value of $\gamma$.
The standard deviations of the balanced primary aberrations are given in Table 6-3.
The factor by which the standard deviation of a primary aberration is reduced by
balancing it with another is listed in Table 6-4. The diffraction focus representing the
point of maximum irradiance for a small aberration is listed in Table 6-5. We note that,
although aberration balancing in the case of a uniform pupil reduces the standard
deviation of spherical aberration and coma by factors of 4 and 3, respectively, the
reduction in the case of astigmatism is only a factor of 1.22. For a Gaussian pupil, the
trend is similar but the reduction factors are smaller for spherical aberration and coma,
and are larger for astigmatism. For a Gaussian beam with $\gamma = 1$, they are 3.74, 2.64, and
1.27, corresponding to spherical aberration, coma, and astigmatism, respectively. In
Section 6.6, the balanced aberrations are identified with the Gaussian polynomials
discussed in Section 6.5.
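The balancing coefficients and reduction factors quoted above can be recovered by minimizing the variance of each aberration with respect to the balancing coefficient. The sketch below is an illustration: the minimization is done in closed form in terms of the moments $\langle\rho^s\rangle$, the aberration coefficients are set to unity, and the function and key names are assumptions.

```python
import math

def moment(s, gamma):
    """<rho^s> for even s over the pupil weighted by the amplitude exp(-gamma*rho^2)."""
    if gamma == 0.0:
        return 2.0 / (s + 2.0)  # uniform pupil
    p = 1.0
    for k in range(2, s + 1, 2):
        p = 1.0 / (1.0 - math.exp(gamma)) + k / (2.0 * gamma) * p
    return p

def balance_primary(gamma):
    """Variance-minimizing balancing coefficients and sigma reduction factors."""
    p2, p4, p6, p8 = (moment(s, gamma) for s in (2, 4, 6, 8))
    var_defocus = p4 - p2**2
    cov = p6 - p2 * p4                    # covariance of rho^4 and rho^2
    defocus_for_spherical = -cov / var_defocus
    tilt_for_coma = -p4 / p2
    var_sph, var_sph_bal = p8 - p4**2, p8 - p4**2 - cov**2 / var_defocus
    var_coma, var_coma_bal = p6 / 2.0, (p6 - p4**2 / p2) / 2.0
    var_ast, var_ast_bal = 3.0 * p4 / 8.0 - p2**2 / 4.0, p4 / 8.0  # defocus of -1/2
    return {
        "defocus_for_spherical": defocus_for_spherical,
        "tilt_for_coma": tilt_for_coma,
        "spherical_factor": math.sqrt(var_sph / var_sph_bal),
        "coma_factor": math.sqrt(var_coma / var_coma_bal),
        "astigmatism_factor": math.sqrt(var_ast / var_ast_bal),
    }

if __name__ == "__main__":
    for gamma in (0.0, 1.0, 4.0):  # gamma = 4 corresponds to sqrt(gamma) = 2
        print(gamma, {k: round(v, 3) for k, v in balance_primary(gamma).items()})
```

For astigmatism the minimizing defocus coefficient works out to $-1/2$ whatever the moments are, which is why it is hard-coded in the sketch.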
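Table 6-2. Balanced primary aberrations.

| Balanced Aberration | $\Phi(\rho, \theta; \gamma = 0)$ | $\Phi(\rho, \theta; \gamma = 1)$ | $\Phi(\rho, \theta; \sqrt{\gamma} = 2)$ | $\Phi(\rho, \theta; \sqrt{\gamma} \ge 3)$ |
| Spherical | $A_s(\rho^4 - \rho^2)$ | $A_s(\rho^4 - 0.933\rho^2)$ | $A_s(\rho^4 - 0.728\rho^2)$ | $A_s[\rho^4 - (4/\gamma)\rho^2]$ |
| Coma | $A_c[\rho^3 - (2/3)\rho]\cos\theta$ | $A_c(\rho^3 - 0.608\rho)\cos\theta$ | $A_c(\rho^3 - 0.419\rho)\cos\theta$ | $A_c[\rho^3 - (2/\gamma)\rho]\cos\theta$ |
| Astigmatism | $A_a\rho^2(\cos^2\theta - 1/2)$ | $A_a\rho^2(\cos^2\theta - 1/2)$ | $A_a\rho^2(\cos^2\theta - 1/2)$ | $A_a\rho^2(\cos^2\theta - 1/2)$ |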
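Table 6-3. Standard deviation of balanced primary aberrations.

| Balanced Aberration | $\sigma_\Phi$ ($\gamma = 0$) | $\sigma_\Phi$ ($\gamma = 1$) | $\sigma_\Phi$ ($\sqrt{\gamma} = 2$) | $\sigma_\Phi$ ($\sqrt{\gamma} \ge 3$) |
| Spherical | $A_s/(6\sqrt{5}) = A_s/13.42$ | $A_s/13.71$ | $A_s/18.29$ | $2A_s/\gamma^2$ |
| Coma | $A_c/(6\sqrt{2}) = A_c/8.49$ | $A_c/8.80$ | $A_c/12.21$ | $A_c/\gamma^{3/2}$ |
| Astigmatism | $A_a/(2\sqrt{6}) = A_a/4.90$ | $A_a/5.61$ | $A_a/9.08$ | $A_a/(2\gamma)$ |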
Table 6-4. Factor by which the standard deviation of a Seidel aberration across an
aperture is reduced when it is optimally balanced with other aberrations.

| Balanced Aberration | Uniform ($\gamma = 0$) | Gaussian ($\gamma = 1$) | Gaussian ($\sqrt{\gamma} = 2$) | Weakly Truncated Gaussian ($\sqrt{\gamma} \ge 3$) |
| Spherical | 4 | 3.74 | 2.95 | $\sqrt{5} = 2.24$ |
| Coma | 3 | 2.64 | 2.01 | $\sqrt{3} = 1.73$ |
| Astigmatism | 1.22 | 1.27 | 1.38 | $\sqrt{2} = 1.41$ |
Table 6-5. Diffraction focus.

| Balanced Aberration | Uniform ($\gamma = 0$) | Gaussian ($\gamma = 1$) | Gaussian ($\sqrt{\gamma} = 2$) | Weakly Truncated Gaussian ($\sqrt{\gamma} \ge 3$) |
| Spherical | $(0, 0, 8F^2A_s)$ | $(0, 0, 7.46F^2A_s)$ | $(0, 0, 5.82F^2A_s)$ | $(0, 0, (32/\gamma)F^2A_s)$ |
| Coma | $(4FA_c/3, 0, 0)$ | $(1.22FA_c, 0, 0)$ | $(0.84FA_c, 0, 0)$ | $(4FA_c/\gamma, 0, 0)$ |
| Astigmatism | $(0, 0, 4F^2A_a)$ | $(0, 0, 4F^2A_a)$ | $(0, 0, 4F^2A_a)$ | $(0, 0, 4F^2A_a)$ |
6.5 ORTHONORMALIZATION OF ZERNIKE CIRCLE POLYNOMIALS OVER A GAUSSIAN CIRCULAR PUPIL
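The Gaussian circle polynomials $G_j(\rho, \theta; \gamma)$ orthonormal over a Gaussian pupil can
be obtained recursively from the Zernike circle polynomials $Z_j(\rho, \theta)$ discussed in
Chapter 4, starting with $G_1 = 1$ (omitting the arguments for brevity) from Eq. (3-18)
according to
$$G_{j+1} = N_{j+1} \left[ Z_{j+1} - \sum_{k=1}^{j} \langle Z_{j+1}G_k\rangle\, G_k \right] , \tag{6-24}$$
where $N_{j+1}$ is a normalization constant and the
angular brackets indicate a mean value over the Gaussian pupil. Thus
$$\begin{aligned} \langle Z_{j+1}G_k\rangle &= \int_0^1\!\!\int_0^{2\pi} A(\rho)\, Z_{j+1}G_k\, \rho\, d\rho\, d\theta \Big/ \int_0^1\!\!\int_0^{2\pi} A(\rho)\, \rho\, d\rho\, d\theta \\ &= \frac{\gamma}{\pi[1 - \exp(-\gamma)]} \int_0^1\!\!\int_0^{2\pi} \exp(-\gamma\rho^2)\, Z_{j+1}G_k\, \rho\, d\rho\, d\theta . \end{aligned} \tag{6-25}$$
The orthonormality of the Gaussian polynomials is expressed by
$$\begin{aligned} \langle G_jG_{j'}\rangle &= \int_0^1\!\!\int_0^{2\pi} A(\rho)\, G_jG_{j'}\, \rho\, d\rho\, d\theta \Big/ \int_0^1\!\!\int_0^{2\pi} A(\rho)\, \rho\, d\rho\, d\theta \\ &= \frac{\gamma}{\pi[1 - \exp(-\gamma)]} \int_0^1\!\!\int_0^{2\pi} \exp(-\gamma\rho^2)\, G_jG_{j'}\, \rho\, d\rho\, d\theta \\ &= \delta_{jj'} . \end{aligned} \tag{6-26}$$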
Now a circle polynomial $Z_j$ varies with the angle $\theta$ as $\cos m\theta$ or $\sin m\theta$ depending
on whether $j$ is even or odd. It is radially symmetric when $m = 0$. Because of the
orthogonal properties of $\cos m\theta$ and $\sin m\theta$ over a period of 0 to $2\pi$ [see Eq. (4-46)],
the polynomials $G_k$ that contribute to the sum in Eq. (6-24) must also have the same
angular dependence as that of the polynomial $Z_{j+1}$. Hence, the polynomial $G_{j+1}$ will also
have the same angular dependence. Thus, a Gaussian polynomial $G_j$ is separable in polar
coordinates $\rho$ and $\theta$, and differs from the corresponding circle polynomial only in its
radial dependence. Given the form of the circle polynomials by Eqs. (4-45a)–(4-45c), the
Gaussian polynomials can accordingly be written
$$G_{\text{even } j}(\rho, \theta; \gamma) = \sqrt{2(n + 1)}\, R_n^m(\rho; \gamma) \cos m\theta , \quad m \ne 0 , \tag{6-27a}$$
$$G_{\text{odd } j}(\rho, \theta; \gamma) = \sqrt{2(n + 1)}\, R_n^m(\rho; \gamma) \sin m\theta , \quad m \ne 0 , \tag{6-27b}$$
$$G_j(\rho, \theta; \gamma) = \sqrt{n + 1}\, R_n^0(\rho; \gamma) , \quad m = 0 , \tag{6-27c}$$
where $n$ and $m$ are positive integers (including zero), $n - m \ge 0$ and even, and $R_n^m(\rho; \gamma)$
is a Gaussian radial polynomial.
Substituting Eqs. (6-27a)–(6-27c) into the orthonormality Eq. (6-26), we find that the
Gaussian radial polynomials obey the orthogonality condition [1]
$$\int_0^1 R_n^m(\rho; \gamma)\, R_{n'}^m(\rho; \gamma)\, A(\rho)\, \rho\, d\rho \Big/ \int_0^1 A(\rho)\, \rho\, d\rho = \frac{1}{n + 1}\, \delta_{nn'} . \tag{6-28}$$
Writing Eq. (6-24) in terms of two-index polynomials given by Eqs. (6-27a)–(6-27c) and
substituting these equations into it, as was done in Chapter 5 for the annular polynomials,
we find that the Gaussian radial polynomials are given by
$$R_n^m(\rho; \gamma) = M_n^m \left[ R_n^m(\rho) - \sum_{i \ge 1}^{(n - m)/2} (n - 2i + 1)\, \langle R_n^m(\rho)\, R_{n-2i}^m(\rho; \gamma)\rangle\, R_{n-2i}^m(\rho; \gamma) \right] , \tag{6-29}$$
where
$$\langle R_n^m(\rho)\, R_{n-2i}^m(\rho; \gamma)\rangle = \int_0^1 R_n^m(\rho)\, R_{n-2i}^m(\rho; \gamma)\, A(\rho)\, \rho\, d\rho \Big/ \int_0^1 A(\rho)\, \rho\, d\rho . \tag{6-30}$$
The normalization constant $M_n^m$ that replaces the normalization constant $N_j$ is
determined from the orthogonality Eq. (6-28) of the radial polynomials. Note that except
for the normalization constant, the radial polynomial $R_n^n(\rho; \gamma)$ is identical to the
corresponding polynomial for a uniformly illuminated circular pupil $R_n^n(\rho)$, i.e.,
$$R_n^n(\rho; \gamma) = M_n^n R_n^n(\rho) . \tag{6-31}$$
The radial polynomial $R_n^m(\rho; \gamma)$ is a polynomial of degree $n$ in $\rho$ containing terms in $\rho^n$,
$\rho^{n-2}$, ..., and $\rho^m$, whose coefficients depend on the Gaussian amplitude through $\gamma$, i.e., it
has the form
$$R_n^m(\rho; \gamma) = a_n^m\rho^n + b_n^m\rho^{n-2} + \cdots + d_n^m\rho^m , \tag{6-32}$$
where the coefficients $a_n^m$, etc., depend on $\gamma$. The radial polynomials are even or odd in $\rho$
depending on whether $n$ (or $m$) is even or odd.
The polynomial ordering, the number of polynomials of a certain order or through a
certain order $n$, and the relationships among the indices $n$, $m$, and $j$ are the same as
discussed for circle polynomials in Chapter 4. Moreover, a Gaussian circle polynomial
$G_j(\rho, \theta; \gamma)$ reduces to the corresponding circle polynomial $Z_j(\rho, \theta)$ as $\gamma \to 0$. The
Gaussian circle polynomials are also unique like the circle polynomials. They are not
only orthogonal over a Gaussian circular pupil, but they also include wavefront tilt and
defocus and balanced classical aberrations as members of the polynomial set.
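The recursive construction of Eq. (6-29) can be sketched in a few lines. The version below is an illustration under stated assumptions rather than the author's procedure: it applies Gram-Schmidt orthogonalization to the monomials $\rho^m, \rho^{m+2}, \ldots$ instead of the Zernike radial polynomials (the two span the same nested spaces, so the result is the same when the leading coefficient is taken positive), and it evaluates the radial averages from a moment recursion obtained by integrating by parts. Function names are illustrative.

```python
import math

def moment(s, gamma):
    """<rho^s> for even s over the pupil weighted by the amplitude exp(-gamma*rho^2)."""
    if gamma == 0.0:
        return 2.0 / (s + 2.0)  # uniform pupil
    p = 1.0  # <rho^0>
    for k in range(2, s + 1, 2):
        p = 1.0 / (1.0 - math.exp(gamma)) + k / (2.0 * gamma) * p
    return p

def gaussian_radial_polynomials(m, n_max, gamma):
    """Return {n: {power: coefficient}} for R_n^m(rho; gamma), n = m, m+2, ..., n_max."""

    def mean_product(f, g):
        # radial average <f g> of two polynomials stored as {power: coefficient}
        return sum(cf * cg * moment(i + j, gamma)
                   for i, cf in f.items() for j, cg in g.items())

    polys = {}
    for n in range(m, n_max + 1, 2):
        monomial = {n: 1.0}
        poly = dict(monomial)
        for k, lower in polys.items():
            # <R_k R_k> = 1/(k+1) by Eq. (6-28), so the projection carries a factor (k+1)
            c = (k + 1) * mean_product(monomial, lower)
            for power, coeff in lower.items():
                poly[power] = poly.get(power, 0.0) - c * coeff
        scale = 1.0 / math.sqrt((n + 1) * mean_product(poly, poly))
        polys[n] = {power: coeff * scale for power, coeff in poly.items()}
    return polys

if __name__ == "__main__":
    print(gaussian_radial_polynomials(0, 4, 0.0)[4])  # uniform pupil: 6 rho^4 - 6 rho^2 + 1
    print(gaussian_radial_polynomials(0, 4, 1.0)[4])  # gamma = 1
    print(gaussian_radial_polynomials(1, 3, 1.0)[3])  # gamma = 1, m = 1
```

For $\gamma = 0$ the sketch reproduces the Zernike radial polynomials, consistent with the statement that $G_j$ reduces to $Z_j$ as $\gamma \to 0$.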
6.6 GAUSSIAN CIRCLE POLYNOMIALS REPRESENTING BALANCED PRIMARY ABERRATIONS FOR A GAUSSIAN CIRCULAR PUPIL
The radial polynomials corresponding to balanced primary aberrations are listed in
Table 6-6. The column “Gaussian” is for any value of $\gamma$, and the column “Weakly
Truncated Gaussian” is for its large values. It can be seen that the balancing defocus for
spherical aberration given by $B_d = (b_4^0/a_4^0)A_s$ and the balancing tilt for coma given by
$B_t = (b_3^1/a_3^1)A_c$ are in agreement with the corresponding values given in Table 6-2. For
example, the relative balancing defocus in the case of spherical aberration from Table 6-6
for $\gamma = 1$ is $-5.71948/6.12902$, which is the same as $-0.933$ in Table 6-2. From the
form of the Gaussian circle polynomial $R_2^2(\rho; \gamma)\cos 2\theta$ representing balanced
astigmatism and varying as $\rho^2\cos 2\theta$, it is evident that the balancing defocus of
$-(1/2)\rho^2$ for astigmatism $\rho^2\cos^2\theta$ is independent of the value of $\gamma$. Similarly,
comparing the form of a balanced primary aberration with the corresponding Gaussian
polynomial, we can immediately write its standard deviation. Thus, we can see that the
sigma values $A_s/\sqrt{5}a_4^0$, $A_c/2\sqrt{2}a_3^1$, and $A_a/2\sqrt{6}a_2^2$ of balanced spherical aberration,
coma, and astigmatism, respectively, are in agreement with their values given in Table 6-3.
For example, the balanced aberration for spherical aberration $A_s\rho^4$ can be written
$$\begin{aligned} W(\rho, \theta; \gamma) &= \frac{A_s}{a_4^0}\left(a_4^0\rho^4 + b_4^0\rho^2 + c_4^0\right) \\ &= \frac{A_s}{\sqrt{5}a_4^0}\, G_4(\rho, \theta; \gamma) . \end{aligned} \tag{6-33}$$
Table 6-6. Gaussian radial polynomials representing balanced primary aberrations
for Gaussian beams. Polynomials for special cases of $\gamma = 0$ (corresponding to a
uniform beam), $\gamma = 1$, and weakly truncated Gaussian beams are also given.

| Aberration | Radial Polynomial | Gaussian* | Gaussian, $\gamma = 1$ | Uniform, $\gamma = 0$ | Weakly Truncated Gaussian |
| Piston | $R_0^0$ | 1 | 1 | 1 | 1 |
| Distortion (tilt) | $R_1^1$ | $a_1^1\rho$ | $1.09367\rho$ | $\rho$ | $\sqrt{\gamma/2}\,\rho$ |
| Field curvature (defocus) | $R_2^0$ | $a_2^0\rho^2 + b_2^0$ | $2.04989\rho^2 - 0.85690$ | $2\rho^2 - 1$ | $(\gamma\rho^2 - 1)/\sqrt{3}$ |
| Astigmatism | $R_2^2$ | $a_2^2\rho^2$ | $1.14541\rho^2$ | $\rho^2$ | $(\gamma/\sqrt{6})\rho^2$ |
| Coma | $R_3^1$ | $a_3^1\rho^3 + b_3^1\rho$ | $3.11213\rho^3 - 1.89152\rho$ | $3\rho^3 - 2\rho$ | $\sqrt{\gamma/2}\,[(\gamma/2)\rho^3 - \rho]$ |
| Spherical aberration | $R_4^0$ | $a_4^0\rho^4 + b_4^0\rho^2 + c_4^0$ | $6.12902\rho^4 - 5.71948\rho^2 + 0.83368$ | $6\rho^4 - 6\rho^2 + 1$ | $(\gamma^2\rho^4 - 4\gamma\rho^2 + 2)/2\sqrt{5}$ |

*$a_1^1 = (2p_2)^{-1/2}$, $a_2^0 = [3(p_4 - p_2^2)]^{-1/2}$, $b_2^0 = -p_2a_2^0$, $a_2^2 = (3p_4)^{-1/2}$, $a_3^1 = \frac{1}{2}(p_6 - p_4^2/p_2)^{-1/2}$, $b_3^1 = -(p_4/p_2)a_3^1$,
$a_4^0 = \{5[p_8 - 2K_1p_6 + (K_1^2 + 2K_2)p_4 - 2K_1K_2p_2 + K_2^2]\}^{-1/2}$, $b_4^0 = -K_1a_4^0$, $c_4^0 = K_2a_4^0$,
$p_s = \langle\rho^s\rangle = (1 - \exp\gamma)^{-1} + (s/2\gamma)p_{s-2}$, $s$ is an even integer,
$p_0 = 1$, $K_1 = (p_6 - p_2p_4)/(p_4 - p_2^2)$, $K_2 = (p_2p_6 - p_4^2)/(p_4 - p_2^2)$.
Since $G_4$ is an orthonormal polynomial, its multiplier $A_s/\sqrt{5}a_4^0$ yields the sigma value
of the balanced aberration. The balancing defocus is, of course, $A_sb_4^0/a_4^0$. As a numerical
example, it yields a sigma value of $A_s/13.71$ for $\gamma = 1$, the same as in Table 6-3. The
corresponding balancing defocus is $-0.933A_s$, as expected.
6.7 WEAKLY TRUNCATED GAUSSIAN PUPILS

For a weakly truncated Gaussian pupil, we can let the upper limit of the radial
integration approach infinity with negligible error. Thus, Eq. (6-20) for the Strehl ratio
and Eq. (6-23) for the mean and mean square values of the aberration may be written [1]
$$S = \left(\frac{\gamma}{\pi}\right)^2 \left| \int_0^\infty\!\!\int_0^{2\pi} \exp(-\gamma\rho^2) \exp[i\Phi(\rho, \theta)]\, \rho\, d\rho\, d\theta \right|^2 \tag{6-34}$$
and
$$\langle\Phi^n\rangle = \frac{\gamma}{\pi} \int_0^\infty\!\!\int_0^{2\pi} \exp(-\gamma\rho^2)[\Phi(\rho, \theta)]^n\, \rho\, d\rho\, d\theta , \tag{6-35}$$
respectively.
The standard deviation of a primary aberration for a large value of $\gamma$ can be obtained
by calculating its mean and mean square values according to Eq. (6-35). The results thus
obtained are given in the last column of Table 6-1. The corresponding balanced
aberrations and their standard deviations are similarly given in Tables 6-2 and 6-3,
respectively. The balancing of an aberration reduces the standard deviation by a factor of
$\sqrt{5}$, $\sqrt{3}$, and $\sqrt{2}$ in the case of spherical aberration, coma, and astigmatism,
respectively, as noted in Table 6-4. The diffraction focus for these aberrations is listed in
Table 6-5. The amount of balancing aberration decreases as $\gamma$ increases in the case of
spherical aberration and coma, but does not change in the case of astigmatism. For
example, in the case of spherical aberration, the amount of balancing defocus for a
weakly truncated Gaussian beam is $(4/\gamma)$ times the corresponding amount for a uniform
beam. Similarly, in the case of coma, the balancing tilt for a weakly truncated Gaussian
beam is $(3/\gamma)$ times the corresponding amount for a uniform beam. The location of the
diffraction focus is independent of the value of $\gamma$ in the case of astigmatism, since the
balancing defocus is the same regardless of the value of $\gamma$. Compared to the peak value
of a primary aberration, the standard deviation of the corresponding balanced aberration
is smaller by a factor of $\gamma^2/2$, $\gamma^{3/2}$, and $2\gamma$ in
the case of spherical aberration, coma, and astigmatism, respectively.
When a Gaussian beam is weakly truncated, i.e., when $\gamma$ is large, the quantity $p_s$ in
Table 6-6 reduces to
$$p_s = \langle\rho^s\rangle = (s/2\gamma)\, p_{s-2} = (s/2)!\,/\,\gamma^{s/2} . \tag{6-36}$$
As a result, we obtain simple expressions for the radial polynomials, which are listed in
the last column in Table 6-6. They are similar to Laguerre polynomials [4]. If we
normalize the radial coordinate $\rho$ of a point on the pupil by $\omega$ (instead of by $a$), then $\gamma$
disappears from these expressions. Since the power in a weakly truncated Gaussian beam
is concentrated in a small region near the center of the pupil, the effect of the aberration
in its outer region is negligible. Accordingly, the aberration tolerances in terms of the
peak value of the aberration at the edge of the pupil ($\rho = 1$) may not be very meaningful.
They may instead be defined in terms of their value at the Gaussian radius [1].
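The weakly truncated forms can be checked against the orthogonality condition of Eq. (6-28) using the moments of Eq. (6-36). The sketch below is an illustration; the dictionary layout and function names are assumptions, and the polynomial coefficients are those of the last column of Table 6-6.

```python
import math

def weak_moment(s, gamma):
    """Eq. (6-36): <rho^s> = (s/2)! / gamma^(s/2) for even s."""
    return math.factorial(s // 2) / gamma ** (s // 2)

def weak_radial_polynomials(gamma):
    """Last column of Table 6-6 as {(n, m): {power: coefficient}}."""
    root = math.sqrt(gamma / 2.0)
    return {
        (0, 0): {0: 1.0},
        (1, 1): {1: root},
        (2, 0): {2: gamma / math.sqrt(3.0), 0: -1.0 / math.sqrt(3.0)},
        (2, 2): {2: gamma / math.sqrt(6.0)},
        (3, 1): {3: root * gamma / 2.0, 1: -root},
        (4, 0): {4: gamma**2 / (2.0 * math.sqrt(5.0)),
                 2: -4.0 * gamma / (2.0 * math.sqrt(5.0)),
                 0: 2.0 / (2.0 * math.sqrt(5.0))},
    }

def mean_product(f, g, gamma):
    """Radial average <f g> of two polynomials under the weakly truncated moments."""
    return sum(cf * cg * weak_moment(i + j, gamma)
               for i, cf in f.items() for j, cg in g.items())

if __name__ == "__main__":
    gamma = 9.0  # sqrt(gamma) = 3
    polys = weak_radial_polynomials(gamma)
    for (n, m), poly in polys.items():
        # Eq. (6-28) requires (n + 1) <R R> = 1
        print((n, m), (n + 1) * mean_product(poly, poly, gamma))
    print(mean_product(polys[(4, 0)], polys[(2, 0)], gamma))  # expected to vanish
    print(mean_product(polys[(3, 1)], polys[(1, 1)], gamma))  # expected to vanish
```

Because the moments scale as $\gamma^{-s/2}$ and the coefficient of $\rho^s$ scales as $\gamma^{s/2}$, every product is independent of $\gamma$, which is the numerical counterpart of the remark that $\gamma$ disappears when $\rho$ is normalized by $\omega$.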
6.8 ABERRATION COEFFICIENTS OF A GAUSSIAN CIRCULAR ABERRATION FUNCTION
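The aberration function $W(\rho, \theta; \gamma)$ across a Gaussian circular pupil can be expanded
in terms of a complete set of orthonormal Gaussian circle polynomials $G_j(\rho, \theta; \gamma)$ in the
form
$$W(\rho, \theta; \gamma) = \sum_{j=1}^{J} a_j G_j(\rho, \theta; \gamma) , \quad 0 \le \rho \le 1 , \quad 0 \le \theta \le 2\pi , \tag{6-37}$$
where $a_j$ is an expansion coefficient of the polynomial. Multiplying both sides of Eq. (6-37)
by $G_{j'}(\rho, \theta; \gamma)$, integrating over the Gaussian pupil, and using the orthonormality Eq.
(6-26), we obtain the Gaussian circle expansion coefficients:
$$a_j = \int_0^1\!\!\int_0^{2\pi} W(\rho, \theta; \gamma)\, G_j(\rho, \theta; \gamma)\, A(\rho)\, \rho\, d\rho\, d\theta \Big/ 2\pi \int_0^1 A(\rho)\, \rho\, d\rho . \tag{6-38}$$
The mean and mean square values of the aberration function are given by
$$\langle W(\rho, \theta; \gamma)\rangle = a_1 \tag{6-39}$$
and
$$\langle W^2(\rho, \theta; \gamma)\rangle = \sum_{j=1}^{J} a_j^2 . \tag{6-40}$$
Accordingly, the variance of the aberration function is given by
$$\sigma_W^2 = \langle W^2(\rho, \theta; \gamma)\rangle - \langle W(\rho, \theta; \gamma)\rangle^2 = \sum_{j=2}^{J} a_j^2 . \tag{6-41}$$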
6.9 ORTHONORMALIZATION OF ANNULAR POLYNOMIALS OVER A GAUSSIAN ANNULAR PUPIL
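The balanced aberrations for an annular Gaussian pupil with an obscuration ratio $\epsilon$
can be obtained in a manner similar to those for a circular pupil, except that the lower
limit of zero in the radial integration is replaced by $\epsilon$. The Gaussian annular polynomials
$G_j(\rho, \theta; \gamma; \epsilon)$ orthonormal over a Gaussian annular pupil can be obtained recursively from
the annular polynomials $A_j(\rho, \theta; \epsilon)$, starting with $G_1 = 1$ (omitting the arguments for
brevity) from Eq. (3-18) according to
$$G_{j+1} = N_{j+1} \left[ A_{j+1} - \sum_{k=1}^{j} \langle A_{j+1}G_k\rangle\, G_k \right] , \tag{6-42}$$
where $N_{j+1}$ is a normalization constant and the
angular brackets indicate a mean value over the Gaussian annular pupil. Thus
$$\langle A_{j+1}G_k\rangle = \int_\epsilon^1\!\!\int_0^{2\pi} A(\rho)\, A_{j+1}G_k\, \rho\, d\rho\, d\theta \Big/ \int_\epsilon^1\!\!\int_0^{2\pi} A(\rho)\, \rho\, d\rho\, d\theta . \tag{6-43}$$
The orthonormality of the Gaussian annular polynomials is expressed by
$$\langle G_jG_{j'}\rangle = \int_\epsilon^1\!\!\int_0^{2\pi} A(\rho)\, G_jG_{j'}\, \rho\, d\rho\, d\theta \Big/ \int_\epsilon^1\!\!\int_0^{2\pi} A(\rho)\, \rho\, d\rho\, d\theta = \delta_{jj'} . \tag{6-44}$$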
Applying the same reasoning as in the case of Gaussian circle polynomials, we find
that the polynomial $G_j(\rho, \theta; \gamma; \epsilon)$ also has the same angular dependence as an annular
polynomial $A_j(\rho, \theta; \epsilon)$. Thus, a Gaussian annular polynomial $G_j$ is separable in polar
coordinates $\rho$ and $\theta$, and differs from the corresponding annular polynomial only in its
radial dependence. Given the form of the annular polynomials by Eqs. (5-17a)–(5-17c),
the Gaussian annular polynomials can accordingly be written
$$G_{\text{even } j}(\rho, \theta; \gamma; \epsilon) = \sqrt{2(n + 1)}\, R_n^m(\rho; \gamma; \epsilon) \cos m\theta , \quad m \ne 0 , \tag{6-45a}$$
$$G_{\text{odd } j}(\rho, \theta; \gamma; \epsilon) = \sqrt{2(n + 1)}\, R_n^m(\rho; \gamma; \epsilon) \sin m\theta , \quad m \ne 0 , \tag{6-45b}$$
$$G_j(\rho, \theta; \gamma; \epsilon) = \sqrt{n + 1}\, R_n^0(\rho; \gamma; \epsilon) , \quad m = 0 , \tag{6-45c}$$
where $n$ and $m$ are positive integers (including zero), $n - m \ge 0$ and even, and $R_n^m(\rho; \gamma; \epsilon)$
is a Gaussian annular radial polynomial.
Substituting Eqs. (6-45a)–(6-45c) into the orthonormality Eq. (6-44), we find that the
Gaussian annular radial polynomials obey the orthogonality condition [1,3]
$$\int_\epsilon^1 R_n^m(\rho; \gamma; \epsilon)\, R_{n'}^m(\rho; \gamma; \epsilon)\, A(\rho)\, \rho\, d\rho \Big/ \int_\epsilon^1 A(\rho)\, \rho\, d\rho = \frac{1}{n + 1}\, \delta_{nn'} . \tag{6-46}$$
Writing Eq. (6-42) in terms of two-index polynomials given by Eqs. (6-45a)–(6-45c) and
substituting these equations into it, as was done in Chapter 5 for the annular polynomials,
we find that the Gaussian annular radial polynomials are given by
$$R_n^m(\rho; \gamma; \epsilon) = M_n^m \left[ R_n^m(\rho; \epsilon) - \sum_{i \ge 1}^{(n - m)/2} (n - 2i + 1)\, \langle R_n^m(\rho; \epsilon)\, R_{n-2i}^m(\rho; \gamma; \epsilon)\rangle\, R_{n-2i}^m(\rho; \gamma; \epsilon) \right] , \tag{6-47}$$
where the angular brackets indicate an average over the annular Gaussian pupil; i.e.,
$$\langle R_n^m(\rho; \epsilon)\, R_{n-2i}^m(\rho; \gamma; \epsilon)\rangle = \int_\epsilon^1 R_n^m(\rho; \epsilon)\, R_{n-2i}^m(\rho; \gamma; \epsilon)\, A(\rho)\, \rho\, d\rho \Big/ \int_\epsilon^1 A(\rho)\, \rho\, d\rho . \tag{6-48}$$

The normalization constant $M_n^m$ that replaces the normalization constant $N_j$ is
determined from the orthogonality Eq. (6-46) of the radial polynomials. Note that the
radial polynomial $R_n^n(\rho; \gamma; \epsilon)$ is identical to the corresponding polynomial for a uniformly
illuminated annular pupil $R_n^n(\rho; \epsilon)$, except for the normalization constant, i.e.,
$$ R_n^n(\rho; \gamma; \epsilon) = M_n^n R_n^n(\rho; \epsilon) . \tag{6-49} $$
The radial polynomial $R_n^m(\rho; \gamma; \epsilon)$ is a polynomial of degree n in $\rho$ containing terms in
$\rho^n$, $\rho^{n-2}$, ..., and $\rho^m$ whose coefficients depend on the Gaussian amplitude through $\gamma$,
i.e., it has the form
$$ R_n^m(\rho; \gamma; \epsilon) = a_n^m \rho^n + b_n^m \rho^{n-2} + \cdots + d_n^m \rho^m , \tag{6-50} $$
where the coefficients $a_n^m$, etc., depend on $\gamma$ and $\epsilon$.
The polynomial ordering, the number of polynomials of a certain order or through a
certain order n, and the relationships among the indices n, m, and j are the same as those
discussed for the Zernike circle polynomials in Chapter 4, or the annular polynomials in
Chapter 5. Moreover, a Gaussian annular polynomial $G_j(\rho, \theta; \gamma; \epsilon)$ reduces to the
corresponding annular polynomial $A_j(\rho, \theta; \epsilon)$ as $\gamma \to 0$. The Gaussian annular
polynomials are also unique like the Gaussian circle polynomials. They are not only
orthogonal over a Gaussian annular pupil, but also include wavefront tilt and defocus and
balanced classical aberrations as members of the polynomial set.
6.10 GAUSSIAN ANNULAR POLYNOMIALS REPRESENTING BALANCED
PRIMARY ABERRATIONS FOR A GAUSSIAN ANNULAR PUPIL
The radial annular polynomials $R_n^m(\rho; \gamma; \epsilon)$ for the balanced primary aberrations are
given by the same expressions as for the circle radial polynomials in Table 6-6 except
that now
$$ p_s = \langle \rho^s \rangle = \frac{\epsilon^s \exp\left[\gamma\left(1 - \epsilon^2\right)\right] - 1}{\exp\left[\gamma\left(1 - \epsilon^2\right)\right] - 1} + \frac{s}{2\gamma}\, p_{s-2} . \tag{6-51} $$
Using these expressions, numerical results for the coefficients of the terms of a radial
polynomial for any values of $\gamma$ and $\epsilon$ can be obtained.
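The recursion of Eq. (6-51) is easy to evaluate numerically. The following is a minimal sketch, not the author's code. It assumes the amplitude weight is $A(\rho) = \exp(-\gamma\rho^2)$ (the form consistent with Eq. (6-51)), and the function names are illustrative. The coefficients follow from Eqs. (6-46) and (6-50): $R_1^1 = a_1^1 \rho$ with $\langle (R_1^1)^2 \rangle = 1/2$, and $R_2^0 = a_2^0 \rho^2 + b_2^0$ with zero mean and $\langle (R_2^0)^2 \rangle = 1/3$.

```python
import math

def moment(s, gamma, eps):
    """p_s = <rho^s> for even s >= 0, built up with the recursion of Eq. (6-51)."""
    if s < 0 or s % 2:
        raise ValueError("s must be a non-negative even integer")
    e = math.exp(gamma * (1.0 - eps * eps))
    p = 1.0  # p_0 = <1> = 1
    for k in range(2, s + 1, 2):
        p = (eps**k * e - 1.0) / (e - 1.0) + k / (2.0 * gamma) * p
    return p

def moment_quadrature(s, gamma, eps, n=20000):
    """Same mean value by midpoint quadrature.

    Assumption of this sketch: the weight is A(rho) = exp(-gamma * rho**2).
    """
    h = (1.0 - eps) / n
    num = den = 0.0
    for i in range(n):
        rho = eps + (i + 0.5) * h
        w = math.exp(-gamma * rho * rho) * rho
        num += rho**s * w
        den += w
    return num / den

def primary_coefficients(gamma, eps):
    """Return (a_1^1, a_2^0, b_2^0) from the moments p_2 and p_4."""
    p2, p4 = moment(2, gamma, eps), moment(4, gamma, eps)
    a11 = 1.0 / math.sqrt(2.0 * p2)             # <(a rho)^2> = 1/2
    a20 = 1.0 / math.sqrt(3.0 * (p4 - p2 * p2)) # <(a rho^2 + b)^2> = 1/3
    b20 = -a20 * p2                             # zero mean: orthogonal to R_0^0 = 1
    return a11, a20, b20

if __name__ == "__main__":
    # Table 6-7 lists 1.09367, 2.04989, -0.85690 for gamma = 1, eps = 0.
    print(primary_coefficients(1.0, 0.0))
```

The closed-form recursion and the quadrature agree, which gives a quick check on any value of $\gamma$ and $\epsilon$ not listed in the table.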
The coefficients for $\gamma = 1$ and $\epsilon$ = 0, 0.25, 0.50, 0.75, and 0.90 are given in Table 6-7.
For comparison, the coefficients for a uniformly illuminated pupil, i.e., for $\gamma = 0$, are
given in parentheses in this table. An increase (decrease) in the value of a coefficient $a_n^m$
of an orthogonal aberration $R_n^m(\rho; \gamma; \epsilon) \cos m\theta$ implies a decrease (increase) in the value
of $\sigma_\Phi$ for a given amount of the corresponding classical aberration. This, in turn, implies
that for small aberrations, the system performance as measured by the Strehl ratio is less
(more) sensitive to that classical aberration when balanced with other classical
aberrations to form an orthogonal aberration. Thus, as $\epsilon$ increases, irrespective of the
value of $\gamma$, the system becomes less sensitive to field curvature (defocus) and spherical
aberration but more sensitive to distortion (tilt) and astigmatism. In the case of coma, it
first becomes slightly more sensitive but is much less sensitive for larger values of $\epsilon$. As
$\gamma$ increases, i.e., as the width of the Gaussian illumination becomes narrower, the system
becomes less sensitive to all classical primary aberrations. Although the results for $\gamma = 0$
and $\gamma = 1$ only are given in Table 6-7, the coefficients for $0 \le \gamma \le 3$ show that the
differences between the coefficients for uniform and Gaussian illumination are small, and
they decrease as $\epsilon$ increases and increase as $\gamma$ increases. This is understandable because
as $\epsilon$ increases or $\gamma$ decreases, the differences between the two illuminations decrease.
Table 6-7. Coefficients of terms in Gaussian radial polynomials $R_n^m(\rho; \gamma; \epsilon)$ for $\gamma = 1$.
The numbers given in parentheses are the corresponding coefficients for uniform
illumination.
$\epsilon$   $a_1^1$   $a_2^0$   $b_2^0$   $a_2^2$   $a_3^1$   $b_3^1$   $a_4^0$   $b_4^0$   $c_4^0$
0.00 1.09367 2.04989 – 0.85690 1.14541 3.11213 – 1.89152 6.12902 – 5.71948 0.83368
(1.00000) (2.00000) (– 1.00000) (1.00000) (3.00000) (– 2.00000) (6.00000) (– 6.00000) (1.00000)
0.25 1.04364 2.18012 – 1.00080 1.08940 3.01573 – 1.84513 6.95563 – 6.98197 1.25153
(0.97014) (2.13333) (– 1.13333) (0.96836) (2.94566) (– 1.97099) (6.82667) (– 7.25333) (1.42667)
0.50 0.92963 2.70412 – 1.56449 0.93620 3.14319 – 2.06618 10.79549 – 13.08900 3.46706
(0.89443) (2.66667) (– 1.66667) (0.87287) (3.11400) (– 2.17980) (10.66667) (– 13.33333) (3.66667)
0.75 0.80827 4.59329 – 3.51548 0.74439 4.55179 – 3.57767 31.47560 – 48.77879 18.39840
(0.80000) (4.57143) (– 3.57143) (0.72954) (4.53877) (– 3.63858) (31.34694) (– 48.97959) (18.63265)
0.90 0.74453 10.53581 – 9.50324 0.63890 9.60573 – 8.69629 166.33359 – 300.66342 135.36926
(0.74329) (10.52632) (– 9.52632) (0.63679) (9.60023) (– 8.72012) (166.20500) (– 300.83102) (135.62604)

6.11 ABERRATION COEFFICIENTS OF A GAUSSIAN ANNULAR
ABERRATION FUNCTION
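The aberration function $W(\rho, \theta; \gamma; \epsilon)$ across a Gaussian annular pupil can be
expanded in terms of a complete set of orthonormal Gaussian annular polynomials
$G_j(\rho, \theta; \gamma; \epsilon)$ in the form
$$ W(\rho, \theta; \gamma; \epsilon) = \sum_{j=1}^{J} a_j G_j(\rho, \theta; \gamma; \epsilon) , \quad \epsilon \le \rho \le 1 , \quad 0 \le \theta \le 2\pi , \tag{6-52} $$
where $a_j$ is an expansion coefficient of the polynomial. Multiplying both sides of Eq. (6-52)
by $G_j(\rho, \theta; \gamma; \epsilon)$, integrating over the Gaussian pupil, and using the orthonormality
Eq. (6-44), we obtain the Gaussian annular expansion coefficients:
$$ a_j = \int_{\epsilon}^{1} \int_{0}^{2\pi} W(\rho, \theta; \gamma; \epsilon)\, G_j(\rho, \theta; \gamma; \epsilon)\, A(\rho)\, \rho \, d\rho \, d\theta \bigg/ 2\pi \int_{\epsilon}^{1} A(\rho)\, \rho \, d\rho . \tag{6-53} $$
The mean and the mean square values of the aberration function are given by
$$ \langle W(\rho, \theta; \gamma; \epsilon) \rangle = a_1 \tag{6-54} $$
and
$$ \langle W^2(\rho, \theta; \gamma; \epsilon) \rangle = \sum_{j=1}^{J} a_j^2 . \tag{6-55} $$
The aberration variance is accordingly given by
$$ \sigma^2 = \langle W^2(\rho, \theta; \gamma; \epsilon) \rangle - \langle W(\rho, \theta; \gamma; \epsilon) \rangle^2 = \sum_{j=2}^{J} a_j^2 . \tag{6-56} $$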
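Because the polynomials are orthonormal, the statistics of Eqs. (6-54)–(6-56) follow directly from the expansion coefficients, with no further integration. The sketch below illustrates this. The coefficient values are hypothetical and the function name is chosen here for illustration only.

```python
import math

def aberration_statistics(coeffs):
    """Mean, mean square, and variance of W from its orthonormal coefficients.

    coeffs[0] is a_1 (the piston term), coeffs[1:] are a_2 ... a_J,
    all in the same units as W.
    """
    mean = coeffs[0]                          # Eq. (6-54)
    mean_square = sum(a * a for a in coeffs)  # Eq. (6-55)
    variance = mean_square - mean * mean      # Eq. (6-56): sum over j >= 2
    return mean, mean_square, variance

# Hypothetical coefficients a_1 ... a_4 in waves (not taken from the text).
coeffs = [0.05, 0.02, -0.01, 0.03]
mean, mean_square, variance = aberration_statistics(coeffs)
sigma = math.sqrt(variance)
```

The piston coefficient $a_1$ sets the mean but drops out of the variance, which is why the sum in Eq. (6-56) starts at $j = 2$.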
6.12 SUMMARY
A pupil with Gaussian illumination is called a Gaussian pupil. The Gaussian
illumination may be due to a filter with Gaussian transmission placed at the pupil or due
to a laser beam with Gaussian amplitude distribution. The illumination is characterized by
a truncation ratio $\sqrt{\gamma} = a/\omega$, where a is the pupil radius and $\omega$ is the radial distance,
called the Gaussian radius, where the amplitude is $1/e$ times its central value.
The aberration-free image for a system with a Gaussian pupil shows that the
Gaussian illumination reduces the central value, broadens the central bright spot, but
reduces the power in the diffraction rings compared to a uniform pupil. Correspondingly,
the OTF is higher for low spatial frequencies, and lower for high spatial frequencies. The
diffraction rings practically disappear when the pupil radius is twice the Gaussian radius, and the
beam propagates as a Gaussian everywhere. The OTF in this case is also described by a
Gaussian function.
The Strehl ratio for a small aberration can be estimated from its variance calculated
over the Gaussian amplitude-weighted pupil. The aberration variance decreases, and,
therefore, its tolerance increases as the truncation ratio increases (see Tables 6-1 and 6-3),
because the amplitude decreases as the aberration increases with the radial distance from
the center.
The Gaussian polynomials orthonormal over a Gaussian circular pupil are obtained
by orthonormalizing the Zernike circle polynomials over a corresponding Gaussian
amplitude-weighted pupil. They are given in Table 6-6 for the primary aberrations for
$\gamma = 1$. For a weakly truncated pupil, i.e., for large values of $\gamma$, the polynomials have a
simple analytical form similar to Laguerre polynomials, as shown in the last column in
Table 6-6.
The orthonormal Gaussian annular polynomials for Gaussian annular pupils can be
obtained by orthonormalizing the annular polynomials. The polynomial ordering is
exactly the same as that for the circle or the annular polynomials.
References

2. V. N. Mahajan, “Uniform versus Gaussian beams: a comparison of the effects of diffraction, obscuration, and aberrations,” J. Opt. Soc. Am. A 3, 470–485 (1986).
3. V. N. Mahajan, “Strehl ratio of a Gaussian beam,” J. Opt. Soc. Am. A 22, 1824–1833 (2005).

5. V. N. Mahajan, “Gaussian apodization and beam propagation,” Progress in Optics 49, 1–96 (2006).
CHAPTER 7
SYSTEMS WITH HEXAGONAL PUPILS
7.1 Introduction
7.2 Pupil Function
7.3.1 PSF
7.3.2 OTF
7.4 Hexagonal Polynomials
7.5 Hexagonal Coefficients of a Hexagonal Aberration Function
Hexagonal Polynomial Aberrations
7.7 Seidel Aberrations, Standard Deviation, and Strehl Ratio
7.7.1 Defocus
7.7.2 Astigmatism
7.7.3 Coma
7.7.5 Strehl Ratio
7.8 Summary
References
Chapter 7
Systems with Hexagonal Pupils
7.1 INTRODUCTION
Although most optical imaging systems have a circular or an annular pupil, with or
without Gaussian illumination, there are times when the wavefront or the interferogram is
hexagonal. This is most notable for the primary mirrors of large telescopes, such as the
Keck [1], the James Webb [2], or the CELT [3]. Although these mirrors are circular, they
are large enough that they are segmented into small hexagonal segments. Optical testing
of a hexagonal segment yields a hexagonal wavefront or interferogram, thus requiring
polynomials that are orthogonal over a hexagon. Even a large hexagonal primary mirror
consisting of hexagonal segments has been proposed [4].
Smith and Marsh [5] have discussed the PSF of a hexagonal pupil, but their equation
for it is incorrect. Sabatke et al. [4] describe the complex amplitude for a trapezoid
forming the upper half of a regular hexagon, but do not carry out the summation of the
diffracted amplitudes of the two trapezoids of the hexagonal pupil. We give closed-form
expressions for the six-fold symmetric aberration-free PSF and OTF [6]. Similar
expressions for the PSF have been given by others [7,8]. The PSF and OTF are plotted
along with the ensquared power, and compared with the corresponding quantities for a
system with a circular pupil. The ensquared power and the OTF are shown to be lower
than the corresponding values for a circular pupil.
The hexagonal polynomials representing balanced aberrations are obtained in this
chapter by orthogonalizing the Zernike circle polynomials over a unit hexagon by using
the procedure described in Chapter 3. Each of these polynomials consists of either the
cosine or the sine terms, but not both. This is a consequence of the biaxial symmetry of a
hexagonal pupil. Whereas the circle, annular, and Gaussian polynomials, described in
Chapters 4, 5, and 6, respectively, are separable in their dependence on the polar
coordinates $\rho$ and $\theta$ of a pupil point, only some of the hexagonal polynomials are
separable. For example, the polynomial $H_{14}$ contains $\cos 2\theta$ and $\cos 4\theta$ terms. Hence,
numbering the polynomials with two indices n and m loses significance, and they must be
numbered with a single index j. A hexagonal pupil has two distinct configurations where
the hexagon in one is rotated by 30 degrees with respect to that in the other. Only some of
the polynomials are common between the two configurations.
In Chapters 4–6, we considered the balancing of classical aberrations for systems
with circular, annular, and Gaussian pupils, respectively, and showed that the
corresponding orthonormal polynomials also represented balanced aberrations. Although
not shown explicitly, as was done in Chapters 4 through 6, the hexagonal polynomials
also represent balanced classical aberrations. However, some interesting results are
obtained in this respect due to lack of the radial symmetry of the hexagonal pupil. For
example, while the polynomials $H_{11}$ and $H_{22}$ representing the balanced primary and
secondary spherical aberrations are radially symmetric, the polynomial H37 representing
the balanced tertiary spherical aberration is not, because it also consists of an
angle-dependent term in $Z_{28}$ or $\cos 6\theta$. The balancing defocus, however, to optimally balance
Seidel astigmatism for a hexagonal pupil is the same as that for a circular or an annular
pupil.
The isometric, interferometric, and PSF plots for the hexagonal polynomial
aberrations are shown. The P-V numbers for the polynomials with a sigma value of one
wave are given, and the Strehl ratios are calculated for a sigma value of one-tenth of a
wave to illustrate that the exponential expression for it, in terms of the aberration
variance, gives a good estimate for small aberrations.
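As a small worked illustration of the exponential expression mentioned above, the sketch below assumes the familiar form $S \simeq \exp(-\sigma_\Phi^2)$ with the phase standard deviation $\sigma_\Phi = 2\pi\sigma_W$, where $\sigma_W$ is in waves. The function name is illustrative and not taken from the text.

```python
import math

def strehl_exponential(sigma_waves):
    """Approximate Strehl ratio exp(-sigma_phi^2) for a wavefront sigma in waves."""
    sigma_phi = 2.0 * math.pi * sigma_waves  # phase standard deviation in radians
    return math.exp(-sigma_phi ** 2)

# A sigma value of one-tenth of a wave, as used for the polynomial aberrations.
estimate = strehl_exponential(0.1)
```

For a sigma of one-tenth of a wave the estimate is roughly 0.67, which is the value against which the computed Strehl ratios of the polynomial aberrations can be compared.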
The balancing of Seidel aberrations is considered, and their standard deviations are
obtained by expressing them in terms of the orthonormal polynomials. The diffraction
focus is shown to lie closer to the Gaussian image point in the case of coma, and closer to
the Gaussian image plane in the case of spherical aberration, compared to their
corresponding locations for a circular pupil. Plots of Strehl ratio as a function of the
sigma value of a Seidel aberration are given. They demonstrate that the exponential
expression underestimates the Strehl ratio in the case of defocus, but overestimates it in the case of
astigmatism, coma, and spherical aberration. The Strehl ratio is estimated very well for
balanced astigmatism and coma, but it is underestimated in the case of balanced spherical
aberration for $\sigma_W > 0.2$.
7.2 PUPIL FUNCTION

Consider an imaging system with a uniformly illuminated hexagonal exit pupil with
each side of length a and area $S_{ex} = (3\sqrt{3}/2)\, a^2$ lying in the $(x_p, y_p)$ plane with z axis as
its optical axis, as illustrated in Figure 7-1. For a uniformly illuminated pupil with an
aberration function $\Phi(x_p, y_p)$ and power $P_{ex}$ exiting from it, the pupil function of the
system can be written
$$ P(x_p, y_p) = A(x_p, y_p) \exp\left[ i\Phi(x_p, y_p) \right] , \tag{7-1} $$
where
$$ A(x_p, y_p) = \left( P_{ex} / S_{ex} \right)^{1/2} \tag{7-2} $$
across the hexagonal pupil.
Figure 7-1. (a) Hexagonal pupil with dimension a. (b) Unit hexagonal pupil inscribed
inside a unit circle showing the coordinates of its corners. Each side of the hexagon
has a length of unity. The x axis passes through the corners D and A, and y axis
bisects its parallel sides EF and CB.

7.3.1 PSF
From Eq. (1-9), the aberrated irradiance distribution in the image plane normalized
by its aberration-free central value $P_{ex} S_{ex} / \lambda^2 R^2$ can be written
$$ I(\vec{r}_i) = \frac{1}{S_{ex}^2} \left| \int \exp\left[ i\Phi(\vec{r}_p) \right] \exp\left( -\frac{2\pi i}{\lambda R}\, \vec{r}_i \cdot \vec{r}_p \right) d\vec{r}_p \right|^2 , \tag{7-3} $$
or, using Cartesian coordinates,
$$ I(x_i, y_i) = \frac{1}{S_{ex}^2} \left| \iint \exp\left[ i\Phi(x_p, y_p) \right] \exp\left[ -\frac{2\pi i}{\lambda R} \left( x_i x_p + y_i y_p \right) \right] dx_p \, dy_p \right|^2 , \tag{7-4} $$
where the integration is carried over the hexagonal pupil. Letting
$$ (x_p, y_p) = a\, (x', y') \tag{7-5} $$
and
$$ (x_i, y_i) = \lambda F_x\, (x, y) , \tag{7-6} $$
where
$$ F_x = R / 2a \tag{7-7} $$
is the focal ratio of the image-forming light cone along the x axis, Eq. (7-4) can be written
$$ I(x, y) = \frac{4}{27} \left| \iint \exp\left[ i\Phi(x', y') \right] \exp\left[ -\pi i \left( x x' + y y' \right) \right] dx' \, dy' \right|^2 . \tag{7-8} $$
For the aberration-free case, Eq. (7-8) reduces to
$$ I(x, y) = \frac{4}{27} \left| \iint \exp\left[ -\pi i \left( x x' + y y' \right) \right] dx' \, dy' \right|^2 . \tag{7-9} $$
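The hexagonal region of integration consists of a rectangle CBEF and two congruent
triangles BFA and CDE with the limits of integration $(-1/2,\, 1/2;\, -\sqrt{3}/2,\, \sqrt{3}/2)$,
$[1/2,\, 1;\, -\sqrt{3}(1 - x'),\, \sqrt{3}(1 - x')]$, and $[-1,\, -1/2;\, -\sqrt{3}(1 + x'),\, \sqrt{3}(1 + x')]$, respectively. In
each case, the first pair of limits is on $x'$, and the second on $y'$. Hence, the irradiance
distribution is given by
$$ I(x, y) = \frac{4}{27} \left| \left[ \int_{-1/2}^{1/2} dx' \int_{-\sqrt{3}/2}^{\sqrt{3}/2} + \int_{1/2}^{1} dx' \int_{-\sqrt{3}(1 - x')}^{\sqrt{3}(1 - x')} + \int_{-1}^{-1/2} dx' \int_{-\sqrt{3}(1 + x')}^{\sqrt{3}(1 + x')} \right] \exp\left[ -\pi i (x x' + y y') \right] dy' \right|^2 . \tag{7-10} $$
The integrand in Eq. (7-10) is separable in the integration coordinates. We carry out the
integration of each of its three parts:
$$ A_1(x, y) = \int_{-1/2}^{1/2} dx' \int_{-\sqrt{3}/2}^{\sqrt{3}/2} \exp\left[ -\pi i (x x' + y y') \right] dy' = \frac{4}{\pi^2 x y} \sin(\pi x / 2) \sin\left( \sqrt{3}\pi y / 2 \right) , \tag{7-11} $$
$$ A_2(x, y) = \int_{1/2}^{1} dx' \int_{-\sqrt{3}(1 - x')}^{\sqrt{3}(1 - x')} \exp\left[ -\pi i (x x' + y y') \right] dy' = \frac{-2}{\pi^2 y \left( x^2 - 3y^2 \right)} \left\{ e^{-i\pi x/2} \left[ -\sqrt{3}\, y \cos\left( \sqrt{3}\pi y / 2 \right) + i x \sin\left( \sqrt{3}\pi y / 2 \right) \right] + \sqrt{3}\, y\, e^{-i\pi x} \right\} , \tag{7-12} $$
and
$$ A_3(x, y) = \int_{-1}^{-1/2} dx' \int_{-\sqrt{3}(1 + x')}^{\sqrt{3}(1 + x')} \exp\left[ -\pi i (x x' + y y') \right] dy' = \frac{2}{\pi^2 y \left( x^2 - 3y^2 \right)} \left\{ e^{i\pi x/2} \left[ \sqrt{3}\, y \cos\left( \sqrt{3}\pi y / 2 \right) + i x \sin\left( \sqrt{3}\pi y / 2 \right) \right] - \sqrt{3}\, y\, e^{i\pi x} \right\} . \tag{7-13} $$
Combining $A_2$ and $A_3$, we find that their sum is real:
$$ A_2 + A_3 = \frac{4}{\pi^2 y \left( x^2 - 3y^2 \right)} \left[ \sqrt{3}\, y \cos(\pi x / 2) \cos\left( \sqrt{3}\pi y / 2 \right) - x \sin(\pi x / 2) \sin\left( \sqrt{3}\pi y / 2 \right) - \sqrt{3}\, y \cos(\pi x) \right] . \tag{7-14} $$
From Eqs. (7-11) and (7-14), we obtain
$$ A_1 + A_2 + A_3 = \frac{4}{\pi^2 x \left( x^2 - 3y^2 \right)} \left\{ \sqrt{3}\, x \left[ \cos(\pi x / 2) \cos\left( \sqrt{3}\pi y / 2 \right) - \cos(\pi x) \right] - 3y \sin(\pi x / 2) \sin\left( \sqrt{3}\pi y / 2 \right) \right\} . \tag{7-15} $$
The sum of the three parts of diffracted amplitude is real. The irradiance distribution is
given by
$$ I(x, y) = \frac{4}{27} \left| A_1 + A_2 + A_3 \right|^2 = \frac{4}{27} \left( A_1 + A_2 + A_3 \right)^2 . \tag{7-16} $$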
Using L'Hôpital's rule, it can be shown that the PSF $I(0, 0)$ at the origin is unity,
as expected from the normalization in Eq. (7-3). Rotating the $(x, y)$ coordinate system by
$60^\circ$, i.e., by changing $(x, y)$ to $(1/2)\left[ x + \sqrt{3}\, y,\; y - \sqrt{3}\, x \right]$, it can be shown that the PSF
remains invariant, thus showing that the PSF is 6-fold symmetric, as expected for the
6-fold symmetric pupil. The PSF along the x and y axes can be written from Eqs. (7-15) and (7-16) as
$$ I(x, 0) = \frac{64}{9\pi^4 x^4} \left[ \cos(\pi x / 2) - \cos(\pi x) \right]^2 \tag{7-17a} $$
and
$$ I(0, y) = \frac{16}{243\pi^4 y^4} \left\{ 2\sqrt{3} \left[ 1 - \cos\left( \sqrt{3}\pi y / 2 \right) \right] + 3\pi y \sin\left( \sqrt{3}\pi y / 2 \right) \right\}^2 . \tag{7-17b} $$
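The closed-form PSF can be checked numerically. The sketch below evaluates Eqs. (7-15) and (7-16) and compares them with the axial forms of Eqs. (7-17a) and (7-17b). It is an illustration, not the author's code. The function names are chosen here, and the small coordinate shift used to step off the removable singularities is an assumption of this sketch, standing in for the limiting process.

```python
import math

SQRT3 = math.sqrt(3.0)

def hex_psf(x, y, shift=1e-4):
    """Aberration-free PSF of Eqs. (7-15)-(7-16); x, y in units of lambda*F_x."""
    # Eq. (7-15) is 0/0 on the lines x = 0 and x = +/- sqrt(3)*y. As a crude
    # workaround (an assumption of this sketch), step slightly off those lines.
    if abs(x) < shift:
        x = shift
    if abs(abs(x) - SQRT3 * abs(y)) < shift:
        x += math.copysign(2.0 * shift, x)
    cx, sx = math.cos(math.pi * x / 2.0), math.sin(math.pi * x / 2.0)
    cy = math.cos(SQRT3 * math.pi * y / 2.0)
    sy = math.sin(SQRT3 * math.pi * y / 2.0)
    num = SQRT3 * x * (cx * cy - math.cos(math.pi * x)) - 3.0 * y * sx * sy
    amp = 4.0 * num / (math.pi ** 2 * x * (x * x - 3.0 * y * y))
    return 4.0 / 27.0 * amp * amp

def psf_x(x):
    """PSF along the x axis, Eq. (7-17a)."""
    d = math.cos(math.pi * x / 2.0) - math.cos(math.pi * x)
    return 64.0 / (9.0 * math.pi ** 4 * x ** 4) * d * d

def psf_y(y):
    """PSF along the y axis, Eq. (7-17b)."""
    k = SQRT3 * math.pi * y / 2.0
    b = 2.0 * SQRT3 * (1.0 - math.cos(k)) + 3.0 * math.pi * y * math.sin(k)
    return 16.0 / (243.0 * math.pi ** 4 * y ** 4) * b * b

def rotate_60(x, y):
    """Coordinate change (x, y) -> (1/2)[x + sqrt(3) y, y - sqrt(3) x]."""
    return 0.5 * (x + SQRT3 * y), 0.5 * (y - SQRT3 * x)
```

Evaluating `hex_psf` at a point and at its image under `rotate_60` gives the same value, and Eq. (7-17a) vanishes at $x = 4/3$, which is the first zero quoted as 1.33.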
A 2D PSF is shown in Figure 7-2. The PSF in Figure 7-2a emphasizes the low-value
details, but that in Figure 7-2b is truncated to a value of $10^{-3}$ relative to a value of unity at
the center. It shows a nearly circular bright spot at the center surrounded by nearly
hexagonal alternating dark and bright rings, three dark and two bright. Beyond the rings,
the PSF breaks into six diffracted arms each of alternating bright and dark strips with
some dim structure between two consecutive arms. Plots of the PSF along the x and y
axes and at $15^\circ$ from the x axis are shown in Figure 7-3 as $I(x, 0)$, $I(0, y)$, and
$I(15^\circ) \equiv I(r)$, respectively. The solid curve $I_c$ represents the Airy pattern for a circular
pupil (of the same radius a as the side of the hexagonal pupil imaging an object at the
same wavelength $\lambda$ with the same focal ratio as $F_x$) with its first zero at 1.22, as in
Figure 4-2. The central bright spot has its zero value along the x axis at 1.33, and at 1.35
along the y axis.
The ensquared power, i.e., the fractional power in a square region centered at the
Gaussian image point, is given by
$$ P(s) = \int_{-s}^{s} dx \int_{-s}^{s} I(x, y)\, dy , \tag{7-18} $$
where s is the half-width of the square. It is tabulated in Table 7-1 along with the
corresponding value for a circular pupil. The two ensquared powers are plotted in Figure
7-4 as Ph and Pc . The ensquared power for a hexagonal pupil, plotted as a dotted curve
Ph , starts at zero and rises to 83.8% as s increases to the first zero along the x axis at
1.33, like the Airy disc of radius 1.22 for a circular pupil (as in Figure 4-2a), and
approaches 100% asymptotically. It is evident that the ensquared power for a hexagonal
pupil is lower than the corresponding value for a circular pupil.
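The ensquared power can be estimated by integrating the closed-form PSF of Eqs. (7-15) and (7-16) over the square. The sketch below uses a plain midpoint rule over one quadrant. It assumes a normalization that is not written out in Eq. (7-18) as printed here: with $I$ normalized to unity at the center and $x$, $y$ in units of $\lambda F_x$, the integral is multiplied by $S_{ex}/4a^2 = 3\sqrt{3}/8$ so that the total power is unity. The function names are illustrative.

```python
import math

SQRT3 = math.sqrt(3.0)

def hex_psf(x, y):
    """Eqs. (7-15)-(7-16); valid off the lines x = 0 and x = +/- sqrt(3)*y."""
    cx, sx = math.cos(math.pi * x / 2.0), math.sin(math.pi * x / 2.0)
    cy = math.cos(SQRT3 * math.pi * y / 2.0)
    sy = math.sin(SQRT3 * math.pi * y / 2.0)
    num = SQRT3 * x * (cx * cy - math.cos(math.pi * x)) - 3.0 * y * sx * sy
    amp = 4.0 * num / (math.pi ** 2 * x * (x * x - 3.0 * y * y))
    return 4.0 / 27.0 * amp * amp

def ensquared_power(s, n=200):
    """Fractional power in a square of half-width s (in units of lambda*F_x).

    Midpoint rule with n x n cells over the quadrant x, y > 0; the PSF is
    symmetric about both axes, so the result is four times the quadrant sum.
    The midpoints (i + 1/2)h never fall on the singular lines of Eq. (7-15).
    """
    h = s / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        for k in range(n):
            total += hex_psf(x, (k + 0.5) * h)
    # Assumed normalization factor S_ex / (4 a^2) = 3*sqrt(3)/8 (see lead-in).
    return 4.0 * total * h * h * 3.0 * SQRT3 / 8.0
```

With this normalization the estimate can be compared with the $P_h$ column of Table 7-1, for example the entries at $s = 0.1$ and $s = 1$.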
Figure 7-2. 2D aberration-free PSF of a system with a hexagonal pupil, shown in panels (a) and (b).
Figure 7-3. PSF along the x and y axes and at $15^\circ$ from the x axis, where x, y, and r
are in units of $\lambda F_x$.
Table 7-1. Ensquared power $P_h$ of a system with a hexagonal pupil, where s is the
half width of a square in units of $\lambda F_x$, compared with the ensquared power $P_c$ for a
circular pupil.
s   $P_h$   $P_c$
0 0 0
0.1 0.0256 0.0310
0.2 0.0984 0.1180
0.3 0.2070 0.2449
0.4 0.3354 0.3897
0.5 0.4663 0.5302
0.6 0.5848 0.6491
0.7 0.6809 0.7369
0.8 0.7504 0.7930
0.9 0.7945 0.8229
1 0.8186 0.8360
1.2 0.8344 0.8455
1.4 0.8434 0.8624
1.6 0.8613 0.8862
1.8 0.8819 0.9043
2 0.8972 0.9135
2.2 0.9060 0.9184
2.4 0.9116 0.9241
2.6 0.9175 0.9315
2.8 0.9244 0.9384
3 0.9311 0.9426
3.5 0.9397 0.9495
4 0.9469 0.9573
4.5 0.9536 0.9615
5 0.9575 0.9662
6 0.9645 0.9722
7 0.9699 0.9765
8 0.9738 0.9798
9 0.9768 0.9823
10 0.9791 0.9843
Figure 7-4. Ensquared power as a function of the half-width s of a square, where s is
in units of $\lambda F_x$. The curves are labeled $P_c$ and $P_h$.
7.3.2 OTF
From Eq. (1-11), the OTF for a uniformly illuminated hexagonal pupil can be
obtained as the autocorrelation of the pupil function:
$$ \tau(\vec{v}) = S_{ex}^{-1} \int \exp\left[ i Q(\vec{r}_p; \vec{v}) \right] d\vec{r}_p , \tag{7-19} $$
where
$$ Q(\vec{r}_p; \vec{v}) = \Phi(\vec{r}_p) - \Phi(\vec{r}_p - \lambda R \vec{v}) \tag{7-20} $$
is the phase aberration difference function, and $\vec{v}$ is a spatial frequency vector in the
image plane. The integration in Eq. (7-19) is carried out over the overlap area of two
hexagonal pupils whose centers are displaced from each other by $\lambda R \vec{v}$. In the aberration-free
case, the OTF is real and simply equal to the relative area of overlap of two pupils
where the center of one is displaced from that of the other by $\lambda R \vec{v}$.
For a displacement x along the x axis, as in Figure 7-5a, the overlap area consists of
two isosceles triangles and a rectangle when $x < a$. The area of each triangle is $\sqrt{3}\, a^2 / 4$,
and that of the rectangle is $\sqrt{3}\, a (a - x)$. The total fractional overlap area is $1 - 2x/3a$.
For $x = a$, as in Figure 7-5b, the rectangle vanishes and the two triangles meet forming a
rhombus. For $x > a$, the two triangles intersect each other, thus reducing the size and
therefore the area of the rhombus. The fractional area of the rhombus is given by
$(1/3)(2 - x/a)^2$. The rhombus vanishes as $x \to 2a$, and the two hexagons meet at a
vertex only, namely, the extreme right-hand vertex of one hexagon and the extreme
left-hand vertex of the other. Replacing the displacement x by $\lambda R v_x$, where $v_x$ is a spatial
frequency along the x axis, and normalizing it by the cutoff frequency $1/\lambda F_x$ along this
axis, we can write the tangential or the x-OTF as
$$ \tau_x(v_x) = \begin{cases} 1 - (4/3)\, v_x , & 0 \le v_x \le 1/2 \\ (4/3) \left( 1 - v_x \right)^2 , & 1/2 \le v_x \le 1 . \end{cases} \tag{7-21} $$
Figure 7-5. Overlap area of two hexagonal pupils displaced from each other along
the x axis in (a) and with x = a in (b), and along the y axis in (c).
Now consider a displacement y along the y axis, as illustrated in Figure 7-5c. Here
again, the overlap area consists of two congruent isosceles triangles and a rectangle. The
area of each triangle is $\left( 1/4\sqrt{3} \right) \left( \sqrt{3}\, a - y \right)^2$ and that of the rectangle is $a \left( \sqrt{3}\, a - y \right)$ for
$0 \le y \le \sqrt{3}\, a$. The fractional overlap area is given by $(2/3) \left[ \left( 1 - y/\sqrt{3}\, a \right) + (1/2) \left( 1 - y/\sqrt{3}\, a \right)^2 \right]$.
Again, replacing y by $\lambda R v_y$, where $v_y$ is the spatial frequency along the y axis, and
normalizing by the cutoff frequency $1/\lambda F_x$, the sagittal or the y-OTF can be written
$$ \tau_y(v_y) = (2/3) \left[ \left( 1 - 2v_y/\sqrt{3} \right) + (1/2) \left( 1 - 2v_y/\sqrt{3} \right)^2 \right] , \quad 0 \le v_y \le \sqrt{3}/2 . \tag{7-22} $$
Note that the cutoff frequency in the y direction is $\sqrt{3}/2$ compared to a value of unity in
the x direction.
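Since the aberration-free OTF is the fractional overlap area of two displaced hexagons, Eqs. (7-21) and (7-22) can be checked against a direct geometric computation. The sketch below clips one hexagon against the other with the Sutherland-Hodgman algorithm and takes the area from the shoelace formula. These are standard techniques chosen here for illustration, not the author's method, and all function names are assumptions of this sketch. A unit hexagon ($a = 1$) is used, so a normalized frequency $v$ corresponds to a displacement of $2v$.

```python
import math

SQRT3 = math.sqrt(3.0)

def hexagon(cx=0.0, cy=0.0, a=1.0):
    """Counterclockwise vertices of a regular hexagon with a vertex on the x axis."""
    return [(cx + a * math.cos(k * math.pi / 3.0),
             cy + a * math.sin(k * math.pi / 3.0)) for k in range(6)]

def clip(subject, clipper):
    """Sutherland-Hodgman clipping of `subject` by a convex counterclockwise polygon."""
    out = list(subject)
    for i in range(len(clipper)):
        (x1, y1), (x2, y2) = clipper[i], clipper[(i + 1) % len(clipper)]

        def side(p):
            # >= 0 when p lies on or to the left of the directed clip edge
            return (x2 - x1) * (p[1] - y1) - (y2 - y1) * (p[0] - x1)

        inp, out = out, []
        for j in range(len(inp)):
            p, q = inp[j - 1], inp[j]
            sp, sq = side(p), side(q)
            if (sp >= 0.0) != (sq >= 0.0):  # edge p->q crosses the clip line
                t = sp / (sp - sq)
                out.append((p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1])))
            if sq >= 0.0:
                out.append(q)
        if not out:
            break
    return out

def area(poly):
    """Polygon area by the shoelace formula (0 for an empty polygon)."""
    return 0.5 * abs(sum(poly[i - 1][0] * poly[i][1] - poly[i][0] * poly[i - 1][1]
                         for i in range(len(poly))))

def otf_geometric(vx, vy):
    """Fractional overlap area for normalized frequency (vx, vy), unit hexagon."""
    overlap = clip(hexagon(2.0 * vx, 2.0 * vy), hexagon())
    return area(overlap) / area(hexagon())

def tau_x(v):
    """Eq. (7-21)."""
    return 1.0 - 4.0 * v / 3.0 if v <= 0.5 else 4.0 / 3.0 * (1.0 - v) ** 2

def tau_y(v):
    """Eq. (7-22), valid for 0 <= v <= sqrt(3)/2."""
    q = 1.0 - 2.0 * v / SQRT3
    return 2.0 / 3.0 * (q + 0.5 * q * q)
```

The same `otf_geometric` function can be evaluated for any direction of the displacement, which makes it a convenient cross-check on expressions for intermediate angles as well.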
It can be shown that the OTF for an angle $\theta$ from the x axis in the range $0 \le \theta \le \pi/6$
is given by [6]
Ï 4 È Ê2 ˆ ˘
Ô1 - vq Ísin q + 3 cos q + Á sin 2 q - sin 2q˜ vq ˙ , 0 £ vq £ v1
Ô 3 3 Î Ë 3 ¯ ˚
t(vq ) = Ì (7-23)
Ô 4 + 2 Ê sin q - 4 cos qˆ v + 1 Ê 1 - 1 sin 2q + 3 cos 2qˆ v 2 , v £ v £ v ,
Ô 3 3 ÁË 3 ˜ q
¯ 3Ë
Á
3
˜ q 1
¯ q 2
Ó
where $v_\theta$ is the normalized spatial frequency for the angle $\theta$ and
$$ v_1 = \left[ 2 \left( \cos\theta - \frac{\sin\theta}{\sqrt{3}} \right) \right]^{-1} \tag{7-24} $$
and
$$ v_2 = \left( \cos\theta + \frac{\sin\theta}{\sqrt{3}} \right)^{-1} \tag{7-25} $$
are normalized spatial frequencies corresponding to the displacements $r_1$ and $r_2$. The
spatial frequency $v_2$ represents the cutoff frequency as a function of angle $\theta$. It
decreases monotonically from a value of unity to $\sqrt{3}/2$ as the angle $\theta$ increases from
zero to $\pi/6$. By letting $\theta = 0$, we obtain the OTF along the x axis as given by Eq. (7-21).
Similarly, $\theta = \pi/6$ yields the OTF along the y axis given by Eq. (7-22), since the OTFs
for angles $\pi/6$ and $\pi/2$ are identical owing to the six-fold symmetry of the hexagonal
pupil. The OTF for the range $\pi/6 \le \theta \le \pi/3$ is the same as that for the range $0 \le \theta \le \pi/6$,
because of the symmetry of the pupil about the direction making an angle of $\pi/6$. For
larger angles, we make use of the six-fold symmetry of the OTF.
Figure 7-6 shows how the OTF varies with the spatial frequency (in units of the
cutoff frequency $1/\lambda F_x$) along the x and y axes, and at $15^\circ$ from the x axis as $\tau(v_x)$,
$\tau(v_y)$ (in long dashes), and $\tau(15^\circ) \equiv \tau(v)$. The OTF of a system with a corresponding
circular pupil of radius a is also included for comparison as $\tau_c$. Note that the cutoff
frequency of the hexagonal pupil is the same as that for the circular pupil only along the x
axis and every $60^\circ$ from it. Otherwise, it is smaller. We note that the OTF of a
hexagonal pupil is lower than that for a circular pupil at all spatial frequencies. The OTF
along the x axis is slightly higher than that along the y axis, and the OTF at $15^\circ$ is slightly
higher in the low frequency region but lower in the high. The $15^\circ$ OTF is lower than that
along the x axis. The differences among the three curves are relatively small.
Figure 7-6. OTF along the x and y axes, and at $15^\circ$ from the x axis, where the spatial
frequencies $v_x$, $v_y$, and $v$ are in units of $1/\lambda F_x$.
7.4 HEXAGONAL POLYNOMIALS

Figure 7-7 shows a unit hexagon inscribed inside a unit circle. The x axis passes
through the corners D and A, and y axis bisects its parallel sides EF and CB. The
coordinates of the corners of the hexagon are labeled in the figure. Each side of the
hexagon has a length of unity. The area of the unit hexagon is $A = 3\sqrt{3}/2$.
The orthonormal hexagonal polynomials $H_j$ obtained by orthogonalizing the
Zernike circle polynomials over a hexagon [5,6] are given by [see Eq. (3-18)]
$$ H_{j+1} = N_{j+1} \left[ Z_{j+1} - \sum_{k=1}^{j} \langle Z_{j+1} H_k \rangle H_k \right] , \tag{7-26} $$
where $N_{j+1}$ is a normalization constant so that the polynomials are orthonormal over the
unit hexagon, i.e., they satisfy the orthonormality condition
$$ \frac{2}{3\sqrt{3}} \int_{\mathrm{hexagon}} H_j H_{j'} \, dx \, dy = \delta_{jj'} . \tag{7-27} $$
The hexagonal region of integration consists of a rectangle EFCB and two congruent
triangles FAB and CDE with limits of integration $(-1/2,\, 1/2;\, -\sqrt{3}/2,\, \sqrt{3}/2)$,
$[1/2,\, 1;\, -\sqrt{3}(1 - x),\, \sqrt{3}(1 - x)]$, and $[-1,\, -1/2;\, -\sqrt{3}(1 + x),\, \sqrt{3}(1 + x)]$, respectively. The
angular brackets indicate a mean value over the hexagonal pupil. Thus,
$$ \langle Z_{j+1} H_k \rangle = \frac{2}{3\sqrt{3}} \int_{\mathrm{hexagon}} Z_{j+1} H_k \, dx \, dy . \tag{7-28} $$
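The recursion of Eq. (7-26) can be tried out numerically for the first few polynomials. The sketch below is an illustration under stated assumptions rather than the procedure used to generate the tables. It represents each $H_j$ by its coefficients on $Z_1$ to $Z_4$, evaluates the hexagonal mean values with a midpoint rule using the integration limits given above, and carries out the Gram-Schmidt steps in coefficient space. The standard Cartesian forms $Z_1 = 1$, $Z_2 = 2x$, $Z_3 = 2y$, $Z_4 = \sqrt{3}(2\rho^2 - 1)$ are assumed, and all function names are illustrative.

```python
import math

SQRT3 = math.sqrt(3.0)

# Zernike circle polynomials Z1..Z4 in Cartesian form (rho^2 = x^2 + y^2).
ZERNIKE = [
    lambda x, y: 1.0,
    lambda x, y: 2.0 * x,
    lambda x, y: 2.0 * y,
    lambda x, y: SQRT3 * (2.0 * (x * x + y * y) - 1.0),
]

def hexagon_mean(f, n=200):
    """Mean value of f over the unit hexagon (midpoint rule, n cells per direction).

    n should be a multiple of 4 so that x = +/- 1/2 falls on cell boundaries.
    """
    dx = 2.0 / n
    total = 0.0
    for i in range(n):
        x = -1.0 + (i + 0.5) * dx
        # half-height of the hexagon at this x: rectangle or triangle part
        half = SQRT3 / 2.0 if abs(x) <= 0.5 else SQRT3 * (1.0 - abs(x))
        dy = 2.0 * half / n
        row = 0.0
        for k in range(n):
            row += f(x, -half + (k + 0.5) * dy)
        total += row * dy * dx
    return total / (1.5 * SQRT3)  # divide by the hexagon area 3*sqrt(3)/2

def orthonormalize(basis, n=200):
    """Gram-Schmidt per Eq. (7-26); returns each H_j as coefficients on the basis."""
    m = len(basis)
    gram = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for j in range(i, m):
            g = hexagon_mean(lambda x, y: basis[i](x, y) * basis[j](x, y), n)
            gram[i][j] = gram[j][i] = g

    def inner(u, v):
        return sum(u[i] * gram[i][j] * v[j] for i in range(m) for j in range(m))

    rows = []
    for j in range(m):
        z = [1.0 if i == j else 0.0 for i in range(m)]
        v = z[:]
        for h in rows:
            proj = inner(z, h)  # <Z_{j+1} H_k>, Eq. (7-28)
            v = [vi - proj * hi for vi, hi in zip(v, h)]
        norm = math.sqrt(inner(v, v))
        rows.append([vi / norm for vi in v])
    return rows
```

For the fourth polynomial this gives coefficients close to $\sqrt{5/43}$ on $Z_1$ and $2\sqrt{15/43}$ on $Z_4$, in agreement with the entry for $H_4$ in Table 7-2.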
The orthonormal hexagonal polynomials are given in Tables 7-2–7-4 up to the eighth
order in three different but equivalent forms [9,10]. In Table 7-2, each hexagonal
polynomial is written in terms of the circle polynomials, thus illustrating the relationship
Figure 7-7. Unit hexagon inscribed inside a unit circle showing the coordinates of its
corners: $A\,(1, 0)$, $B\,(1/2, -\sqrt{3}/2)$, $C\,(-1/2, -\sqrt{3}/2)$, $D\,(-1, 0)$, $E\,(-1/2, \sqrt{3}/2)$, and $F\,(1/2, \sqrt{3}/2)$.
Each side of the hexagon has a length of unity. The x axis passes through
the corners D and A, and y axis bisects its parallel sides EF and CB.
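Table 7-2. Orthonormal hexagonal polynomials $H_j(\rho, \theta)$ in terms of the Zernike
circle polynomials $Z_j(\rho, \theta)$.
$H_1$    $Z_1$
$H_2$    $\sqrt{6/5}\, Z_2$
$H_3$    $\sqrt{6/5}\, Z_3$
$H_4$    $\sqrt{5/43}\, Z_1 + 2\sqrt{15/43}\, Z_4$
$H_5$    $\sqrt{10/7}\, Z_5$
$H_6$    $\sqrt{10/7}\, Z_6$
$H_7$    $16\sqrt{14/11055}\, Z_3 + 10\sqrt{35/2211}\, Z_7$
$H_8$    $16\sqrt{14/11055}\, Z_2 + 10\sqrt{35/2211}\, Z_8$
$H_9$    $(2\sqrt{5}/3)\, Z_9$
$H_{10}$    $2\sqrt{35/103}\, Z_{10}$
$H_{11}$    $(521/\sqrt{1072205})\, Z_1 + 88\sqrt{15/214441}\, Z_4 + 14\sqrt{43/4987}\, Z_{11}$
$H_{12}$    $225\sqrt{6/492583}\, Z_6 + 42\sqrt{70/70369}\, Z_{12}$
$H_{13}$    $225\sqrt{6/492583}\, Z_5 + 42\sqrt{70/70369}\, Z_{13}$
$H_{14}$    $-2525\sqrt{14/297774543}\, Z_6 - (1495\sqrt{70/99258181}/3)\, Z_{12} + (\sqrt{378910/18337}/3)\, Z_{14}$
$H_{15}$    $2525\sqrt{14/297774543}\, Z_5 + (1495\sqrt{70/99258181}/3)\, Z_{13} + (\sqrt{378910/18337}/3)\, Z_{15}$
$H_{16}$    $30857\sqrt{2/3268147641}\, Z_2 + (49168/\sqrt{3268147641})\, Z_8 + 42\sqrt{1474/1478131}\, Z_{16}$
$H_{17}$    $30857\sqrt{2/3268147641}\, Z_3 + (49168/\sqrt{3268147641})\, Z_7 + 42\sqrt{1474/1478131}\, Z_{17}$
$H_{18}$    $386\sqrt{770/295894589}\, Z_{10} + 6\sqrt{118965/2872763}\, Z_{18}$
$H_{19}$    $6\sqrt{10/97}\, Z_9 + 14\sqrt{5/291}\, Z_{19}$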
H20 0.71499593Z2 0.72488884Z8 0.46636441Z16 +1.72029850Z20

H21 0.71499594Z3 + 0.72488884Z7 + 0.46636441Z17 + 1.72029850Z21
H22 0.58113135Z1 + 0.89024136Z4 + 0.89044507Z11 + 1.32320623Z22
H23 1.15667686Z5 + 1.10775599Z13 + 0.43375081Z15 + 1.39889072Z23
H24 1.15667686Z6 + 1.10775599Z12 0.43375081Z14 + 1.39889072Z24
H25 1.31832566Z5 + 1.14465174Z13 + 1.94724032Z15 + 0.67629133Z23 + 1.75496998Z25
H26 1.31832566Z6 1.14465174Z12 + 1.94724032Z14 0.67629133Z24 + 1.75496998Z26
H27 2 77 93 Z27
H28 1.07362889Z1 1.52546162Z4 1.28216588Z11 0.70446308Z22 + 2.09532473Z28

H29 0.97998834Z3 + 1.16162002Z7 +1.04573775Z17 +0.40808953Z21 +1.36410394Z29
H30 0.97998834Z2 + 1.16162002Z8 + 1.04573775Z16 0.40808953Z20 + 1.36410394Z30
H31 3.63513758Z9 + 2.92084414Z19 + 2.11189625Z31
H32 0.69734874Z10 + 0.67589740Z18 + 1.22484055Z32
H33 1.56189763Z3 + 1.69985309Z7 + 1.29338869Z17 + 2.57680871Z21
+ 0.67653220Z29 + 1.95719339Z33
H34 1.56189763Z2 1.69985309Z8 1.29338869Z16 + 2.57680871Z20
0.67653220Z30 + 1.95719339Z34
H35 1.63832594Z3 1.74759886Z7 1.27572528Z17 0.77446421Z21
0.60947360Z29 0.36228537Z33 + 2.24453237Z35
H36 1.63832594Z2 1.74759886Z8 1.27572528Z16 + 0.77446421Z20
0.60947360Z30 + 0.36228537Z34 + 2.24453237Z36
H37 0.82154671Z1 + 1.27988084Z4 + 1.32912377Z11 + 1.11636637Z22
0.54097038Z28 + 1.37406534Z37
H38 1.54526522Z6 + 1.57785242Z12 0.89280081Z14 + 1.28876176Z24
0.60514082Z26 + 1.43097780Z38
H39 1.54526522Z5 + 1.57785242Z13 + 0.89280081Z15 + 1.28876176Z23
+ 0.60514082Z25 + 1.43097780Z39
H40 2.51783502Z6 2.38279377Z12 + 3.42458933Z14 1.69296616Z24
+ 2.56612920Z26 0.85703819Z38 + 1.89468756Z40
H41 2.51783502Z5 + 2.38279377Z13 + 3.42458933Z15 + 1.69296616Z23
+ 2.56612920Z25 + 0.85703819Z39 + 1.89468756Z41
H42 2.72919646Z1 4.02313214Z4 3.69899239Z11 2.49229315Z22
+ 4.36717121Z28 1.13485132Z37 + 2.52330106Z42
H43 1362 77 20334667 Z27 + (260/3) 341 655957 Z43
H44 2.76678413Z6 2.50005278Z12 + 1.48041348Z14 1.62947374Z24

+ 0.95864121Z26 0.69034812Z38 + 0.40743941Z40 + 2.56965299Z44
H45 2.76678413Z5 2.50005278Z13 1.48041348Z15 1.62947374Z23

0.95864121Z25 0.69034812Z39 0.40743941Z41 + 2.56965299Z45
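Table 7-3. Orthonormal hexagonal polynomials $H_j(\rho, \theta)$ in polar coordinates
$(\rho, \theta)$.
$H_1$    $1$
$H_2$    $2\sqrt{6/5}\, \rho \cos\theta$
$H_3$    $2\sqrt{6/5}\, \rho \sin\theta$
$H_4$    $\sqrt{5/43}\, (-5 + 12\rho^2)$
$H_5$    $2\sqrt{15/7}\, \rho^2 \sin 2\theta$
$H_6$    $2\sqrt{15/7}\, \rho^2 \cos 2\theta$
$H_7$    $4\sqrt{42/3685}\, (-14\rho + 25\rho^3) \sin\theta$
$H_8$    $4\sqrt{42/3685}\, (-14\rho + 25\rho^3) \cos\theta$
$H_9$    $(4\sqrt{10}/3)\, \rho^3 \sin 3\theta$
$H_{10}$    $4\sqrt{70/103}\, \rho^3 \cos 3\theta$
$H_{11}$    $(3/\sqrt{1072205})\, (737 - 5140\rho^2 + 6020\rho^4)$
$H_{12}$    $(30/\sqrt{492583})\, (-249\rho^2 + 392\rho^4) \cos 2\theta$
$H_{13}$    $(30/\sqrt{492583})\, (-249\rho^2 + 392\rho^4) \sin 2\theta$
$H_{14}$    $(10/3)\sqrt{7/99258181}\, [10(297 - 598\rho^2)\rho^2 \cos 2\theta + 5413\rho^4 \cos 4\theta]$
$H_{15}$    $(10/3)\sqrt{7/99258181}\, [-10(297 - 598\rho^2)\rho^2 \sin 2\theta + 5413\rho^4 \sin 4\theta]$
$H_{16}$    $2\sqrt{6/1089382547}\, (70369\rho - 322280\rho^3 + 309540\rho^5) \cos\theta$
$H_{17}$    $2\sqrt{6/1089382547}\, (70369\rho - 322280\rho^3 + 309540\rho^5) \sin\theta$
$H_{18}$    $4\sqrt{385/295894589}\, (-3322\rho^3 + 4635\rho^5) \cos 3\theta$
$H_{19}$    $4\sqrt{5/97}\, (-22\rho^3 + 35\rho^5) \sin 3\theta$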

H20 ( 2.17600248ȡ + 13.23551876ȡ3 + 16.15533716ȡ5)cosș + 5.95928883ȡ5 cos5ș
H21 (2.17600248ȡ 13.23551876ȡ3 + 16.15533716ȡ5) sinș + 5.95928883ȡ5 sin5ș
H22 2.47059083 + 33.14780774ȡ2 93.07966445ȡ4 + 70.01749250ȡ6
H23 (23.72919095ȡ2 90.67126833ȡ4 + 78.51254738ȡ6)sin2ș + 1.37164051ȡ4sin4ș
H24 (23.72919095ȡ2 90.67126833ȡ4 + 78.51254738ȡ6)cos2ș 1.37164051ȡ4cos4ș
H25 (7.55280798ȡ2 36.13018255ȡ4 + 37.95675688ȡ6)sin2ș + ( 26.67476754ȡ4
+ 39.39897852ȡ6)sin4ș
H26 ( 7.55280798ȡ2 + 36.13018255ȡ4 37.95675688ȡ6)cos2ș + ( 26.67476754ȡ4
+ 39.39897852ȡ6)cos4ș
H27 14 22 / 93 ȡ6sin6ș
H28 0.56537219 10.44830313ȡ2 + 38.71296332ȡ4 37.27668254ȡ6 + 7.83998727ȡ6cos6ș
H29 ( 15.56917599ȡ + 130.07864353ȡ3 291.15952742ȡ5
+ 190.97455178ȡ7)sinș + 1.41366362ȡ5sin5ș
H30 ( 15.56917599ȡ + 130.07864353ȡ3 291.15952742ȡ5
+ 190.97455178ȡ7)cosș 1.41366362ȡ5cos5ș
H31 (54.28516840 202.83704634ȡ2 + 177.39928561ȡ4)ȡ3sin3ș
H32 (41.60051295 135.27397959ȡ2 + 102.88660624ȡ4)ȡ3cos3ș
H33 ( 3.87525156 + 41.84243767ȡ2 117.56342978ȡ4 + 94.71450820ȡ6)ȡsin ș
+ 76.09262860 + ( 38.04631430 + 54.80141514ȡ2)ȡ5sin5ș
H34 (3.87525156 + 41.84243767ȡ2 117.56342978ȡ4+ 94.71450820ȡ6)ȡcos ș
+ ( 38.04631430 + 54.80141514ȡ2)ȡ5cos5ș
H35 (3.10311187 34.93479698ȡ2 + 102.08124605ȡ4 85.32630533ȡ6)ȡsinș
+ (6.01202622 10.14399046ȡ2)ȡ5 sin 5ș + 8.978129552ȡ7sin7ș
H36 (3.10311187ȡ 34.93479698ȡ2 + 114.10529848ȡ4 87.65802721ȡ6)ȡcosș
+ (12.02405243 2.33172188ȡ2) ȡ5cos3ș + (12.02405243 + 3.68030434ȡ2)ȡ5cos5ș
+ 6.01202622ȡ7cos7ș
H37 2.74530738 60.39881618ȡ2 + 300.22087475ȡ4 518.03488742ȡ6
+ 288.55372176ȡ8 2.02412582ȡ6cos6ș
H38 ( 42.96232789 + 287.78381063ȡ2 565.13651608ȡ4
+ 339.98298180ȡ6)ȡ2cos2ș + (8.49786414 13.58537785ȡ2)ȡ4cos4ș
H39 ( 42.96232789 + 287.78381063ȡ2 565.13651608ȡ4
+ 339.98298180ȡ6)ȡ2sin2ș + (8.49786414 13.58537785ȡ2)ȡ4sin4ș
H40 (14.79181046 121.61654135ȡ2 + 286.77354559ȡ4
203.62188574ȡ )ȡ2cos2ș
6
+ (83.39879886 280.00664075ȡ2 + 225.07739907ȡ4)ȡ4cos4ș

H41 ( 14.79181046 + 121.61654135ȡ2 286.77354559ȡ4 + 203.62188574ȡ6)ȡ2sin2ș
+ (83.39879886 280.00664075ȡ2 + 225.07739907ȡ4)ȡ4sin4ș
H42 0.84269170 + 24.65387703ȡ2 158.21741244ȡ4 + 344.75780000ȡ6
238.31877895ȡ8 + ( 58.59775991 + 85.64367812ȡ2)ȡ6cos6ș
H43 2 22 / 20334667 ( 23443 + 32240ȡ2)ȡ6sin6ș

H44 (9.64776957 85.41873843ȡ2 + 216.08041438ȡ4
164.01834750ȡ6)ȡ2cos2ș + (12.67622930 51.08055822ȡ2
+ 48.40133344ȡ4)ȡ4cos4ș + 10.90211434ȡ8cos8ș
H45 (9.64776957 85.41873843ȡ2 + 216.08041438ȡ4 164.01834750ȡ6)ȡ2sin2ș
(12.67622930 51.08055822ȡ2 + 48.40133344ȡ4)ȡ4sin4ș + 10.90211434ȡ8sin8ș
Table 7-4. Orthonormal hexagonal polynomials $H_j(x, y)$ in Cartesian coordinates $(x, y)$, where $\rho^2 = x^2 + y^2$.
H1 $1$
H2 $2\sqrt{6/5}\,x$
H3 $2\sqrt{6/5}\,y$
H4 $\sqrt{5/43}\,(-5 + 12\rho^2)$
H5 $4\sqrt{15/7}\,xy$
H6 $2\sqrt{15/7}\,(x^2 - y^2)$
H7 $4\sqrt{42/3685}\,(-14 + 25\rho^2)\,y$
H8 $4\sqrt{42/3685}\,(-14 + 25\rho^2)\,x$
H9 $(4/3)\sqrt{10}\,(3x^2y - y^3)$
H10 $4\sqrt{70/103}\,(x^3 - 3xy^2)$
H11 $(3/\sqrt{1072205})\,(737 - 5140\rho^2 + 6020\rho^4)$
H12 $(30/\sqrt{492583})\,(392\rho^2 - 249)(x^2 - y^2)$
H13 $(60/\sqrt{492583})\,(392\rho^2 - 249)\,xy$
H14 (10/3) 7 / 99258181 [567x4 + 32478 x2 y2 11393y4 2970(x2 y2)]
H15 (40/3) 7 / 99258181 ( 1485 + 8403x2 2423y2)xy
H16 2 2 / 3268147641 (211107 966840ȡ2 + 928620ȡ4)x
H17 2 2 / 3268147641 (211107 966840ȡ2 + 928620ȡ4)y
H18 4 385 / 295894589 ( 3322 + 4635ȡ2)(x3 3xy2)
H19 4 5 / 97 ( 22 + 35ȡ2)(3x2y y3)

H20 ( 2.17600247 + 13.23551876ȡ2 + 13.64110699 ȡ4)x 119.18577680 ȡ2 x3
+ 95.3486212x5
H21 (2.17600247 13.23551876ȡ2 + 45.95178131ȡ 4)y 119.18577680 ȡ2y3
+ 95.34862128y5
H22 2.47059083 + 33.14780774ȡ2 93.07966445ȡ4 + 70.01749250ȡ6
H23 (47.45838189 175.85597460x2 186.82909872y2 + 157.02509476x4
+ 314.05018953x2y2 + 157.02509476y4)xy
H24 (23.72919094 92.04290884x2 + 78.51254738x4)x2 + ( 23.72919094
+ 8.22984309x2 + 89.29962781y2 + 78.51254738x4 78.51254738x2y2
78.51254738y4)y2
H25 (15.10561596 – 178.95943525x2 + 34.43870505y2 + 233.50942786x4
+ 151.82702751x2y2 – 81.68240034y4)xy
H26 (– 7.55280798 + 9.45541501x2 + 1.44222164x4)x2 + (7.55280798 + 160.04860523x2– 62.80495008y2
–234.95164950x4 – 159.03813574x2y2 + 77.35573540y4)y2
H27 (40.85537039x4 136.18456799 x2y2 + 40.85537039y4)xy
H28 0.56537219 – 10.44830312ȡ2 + 38.71296332x4 + 77.42592664 x2y2 + 38.71296332y4 29.43669525x6
229.42985678 x4y2 +5.76976155 x2y4 45.11666981y6
H29 ( 15.56917599 + 130.07864353ȡ2 – 284.09120931ȡ4 + 190.97455178ȡ6) y

– 28.2732724ȡ2y3 + 22.61861792y5
2 3 5
H30 ( 15.56917599 + 130.07864353ȡ2 – 298.22784553ȡ4 + 190.97455178ȡ6)x + 28.27327243ȡ x – 22.61861792x
H31 (162.85550520x2 54.28516840y2 608.51113904x2ȡ2 + 202.83704634y2ȡ2 +532.19785685x2ȡ4

177.39928561y2ȡ4)y
H32 [(41.60051295 135.27397959x2 + 102.88660624x4)x2 +( 124.80153887 + 270.54795919x2+ 405.82193879y2

102.88660624x4 – 514.43303123 x2y2 308.65981874y4)y2]x
H33 [ 3.87525156 + (41.84243767 307.79500129x2 + 368.72158389x4)x2 + (41.84243767 + 145.33628349x2
155.60974407y + 10.13644892x4
2
209.06921162 x2y2 + 149.51592334y4)y2]y
H34 [3.87525156 + ( 41.84243767 + 79.51711547x2 39.91309306x4)x2 + ( 41.84243767 + 615.59000259x2
72.66814174y2 777.35626084x4 558.15060029 x2y2 + 179.29256748y4)y2]x
H35 [3.10311187 + ( 34.93479698 + 132.14137712x2 73.19935100x4)x2 + ( 34.93479698 + 144.04222993x2
2 2 4 2
+ 108.09327226y2 519.49349681x4 + 23.85771799 x y 104.44842531y )y ]y
H36 [3.10311187 + ( 34.93479698 + 96.06921983x2 66.20418535x4)x2 + ( 34.93479698 + 264.28275425x2
2 2 4 2
+ 72.02111496y2 535.81555000x4 + 7.53566481 x y 97.45325965y )y ]x
H37 2.74530738 60.39881618ȡ2 + 300.22087475ȡ4 + 288.55372176ȡ8

520.05901324x6 1523.74277487 x4y2 1584.46654966 x2y4 516.01076159y6
H38 ( 42.96232789 + 296.28167478x2 578.72189394x4 + 339.98298180x6)x2 + (42.96232789 50.98718488x2
279.28594648y2 497.20962679x4 + 633.06340537 x2y2 + 551.55113822y4 + 679.96596360x6
679.96596360 x2y4 339.98298180y6)y2
H39 [ 85.92465579 + (541.57616468 1075.93152073x2 + 679.96596360x4)x2 + (609.55907786
2 2 2 4 2
2260.54606433x 1184.61454360y2 + 2039.89789081x4 + 2039.89789081x y + 679.96596360y )y ]xy
H40 (14.79181046 38.21774249x2 + 6.76690483x4 + 21.45551332x6)x2 + ( 14.79181046 500.39279319x2
2 4 2 2 4
+ 205.01534022y + 1686.80674937x + 1113.25965819 x y 566.78018634y 1307.55336779x6
4 2 2 4 6 2
2250.77399075 x y 493.06582480 x y + 428.69928482y )y
H41 [ 29.58362093 + (576.82827818 1693.57365421x2 + 1307.55336779x4)x2 +( 90.36211274
1147.09418236x2 + 546.47947184y2 + 2122.04091078x4 + 321.42171817x2y2 493.06582480y4)y2]xy
H42 0.84269170 + (24.65387703 158.21741244x2 + 286.16004008x4 152.67510082x6) x2+ (24.65387703
316.43482489x2 158.21741244y2 + 1913.23979875x4 + 155.30700127x2 y2 + 403.35555992y4
– 2152.28660953x6 – 1429.91267370x4y2 + 245.73637792x2y4 – 323.96245707y6)y2 + 403
3 3 5 2
H43 2 22 / 20334667 (6x5y 20x y +6xy )( 23443 + 32240ȡ )
2 4 6
H44 (9.64776957 72.74250912x + 164.99985615x 104.71489971x )x2
+ ( 9.64776957 –76.05737585x2 + 98.09496774y2 + 471.48320551x4
+ 39.32237674 x2y2 267.16097261y4 826.90123032x6
+ 279.13466933 x4 y2 170.82784030 x2 y4 + 223.32179529y6) y2
H45 [19.29553915 + ( 221.54239411 + 636.48306167x2 434.42511407x4)x2
+ ( 120.13255963 + 864.32165754x2 + 227.83859586y2 1788.23382186x4
179.98634818 x2y2 221.64827593y4)y2]xy
between the two. In particular, it helps determine the potential error made when a
hexagonal aberration function is expanded in terms of the circle polynomials (see Chapter
12). The coefficients of the circle polynomials are the elements of the conversion matrix
M (discussed in Chapter 3). The polynomials up to H19 are given in their analytical form,
but those with j > 19 are written in a numerical form because of the increasing
complexity of the coefficients of the circle polynomials. In Table 7-3, the hexagonal
polynomials are given in polar coordinates, showing one-to-one correspondence with the
circle polynomials but illustrating the difference between them. This form is convenient
for analytical calculations because of integration of trigonometric functions over
symmetric limits. Finally, the polynomials are given in Cartesian coordinates in Table 7-4
for a quantitative numerical analysis of, say, an interferogram.
Several observations can be made from the polynomial tables. It is evident from
Table 7-2 that the corresponding coefficients of the Zernike polynomials that make up the
hexagonal polynomial (n, m) pairs are the same except for signs in some cases, unless m
is a multiple of 3. For example, H14 and H15 have some coefficients with different signs,
but H16 and H17 have the same signs. H9 and H10 , which correspond to n = 3 and m =
3, and H18 and H19 , which correspond to n = 5 and m = 3, have different coefficients.
From Table 7-3, we note that each hexagonal polynomial consists of cosine or sine terms,
but not both.
Unlike the circle and annular polynomials, the hexagonal polynomials are generally
not separable in $\rho$ and $\theta$ due to the lack of radial symmetry of the hexagonal pupil. The first
13 polynomials, i.e., up to $H_{13}$, are separable, but $H_{14}$ and $H_{15}$ are not; $H_{16}$ through $H_{19}$
are separable, but $H_{20}$ and $H_{21}$ are not. Accordingly, the notion of two indices n and m
with dependence on m in the form of $\cos m\theta$ loses significance. For example, the Zernike
polynomial $Z_{14}$ for n = 4 and m = 4 varies as $\cos 4\theta$, but $H_{14}$ has a term in $\cos 2\theta$ also.
Hence, the hexagonal polynomials can be ordered by a single index only. While the
polynomials $H_{11}$ and $H_{22}$ representing balanced primary and secondary spherical
aberrations are radially symmetric, the polynomial $H_{37}$ representing balanced tertiary
spherical aberration is not, since it also contains an angle-dependent term in $Z_{28}$, i.e., in
$\cos 6\theta$. If this term is not included in the polynomial $H_{37}$, the standard deviation of the
aberration increases from a value of unity to 1.13339.
A different configuration of a hexagonal pupil is illustrated in Figure 7-8, where the
hexagon is rotated by 30° compared to that in Figure 7-7 so that the point A, for example,
moves to a point A′. Whereas in Figure 7-7 the x axis passes through the corners D and A
of the hexagon and the y axis bisects its parallel sides EF and CB, in Figure 7-8 the x axis
bisects the parallel sides F′A′ and D′C′ of the hexagon and the y axis passes through its
corners E′ and B′. As a result, some polynomials change, as may be seen by comparing
the polynomials given in Table 7-5 for the 30-degree rotation with those in Table 7-2.
The first eight polynomials, $H_{11}$ through $H_{13}$, $H_{16}$, $H_{17}$, $H_{22}$, $H_{27}$, etc., do not change.
Polynomials $H_9$ and $H_{10}$, $H_{14}$ and $H_{15}$, and $H_{18}$ and $H_{19}$, etc., exchange the
coefficients of the circle polynomial components.
[Figure 7-8: unit hexagon in the $(x, y)$ plane with origin O and corners $A'(\sqrt{3}/2, -1/2)$, $B'(0, -1)$, $C'(-\sqrt{3}/2, -1/2)$, $D'(-\sqrt{3}/2, 1/2)$, $E'(0, 1)$, and $F'(\sqrt{3}/2, 1/2)$.]
Figure 7-8. Unit hexagon rotated clockwise by 30 degrees with respect to that in Figure 7-7,
showing the coordinates of its corners. The x axis bisects the parallel sides F′A′
and D′C′ of the hexagon, and the y axis passes through its corners E′ and B′.
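The exchange of coefficients between $H_9$ and $H_{10}$ under the 30-degree rotation can be checked numerically. The following is a minimal sketch, not the author's procedure: it assumes a Gauss-Legendre rule over the unit hexagon (the helper names `hexagon_nodes`, `norm_factor`, and `trefoil_sin` are hypothetical) and computes the factor that normalizes $\rho^3 \sin 3\theta$ to unit mean square over each orientation of the hexagon.

```python
import numpy as np

def hexagon_nodes(n=20):
    """Gauss-Legendre nodes (x, y) and weights w (summing to 1) over the unit
    hexagon with corners at (+-1, 0) and (+-1/2, +-sqrt(3)/2)."""
    t, wt = np.polynomial.legendre.leggauss(n)      # rule on [-1, 1]
    h = np.sqrt(3.0) / 2.0                          # half-height of the hexagon
    xs, ys, ws = [], [], []
    for sign in (1.0, -1.0):                        # upper and lower halves
        y = sign * 0.5 * h * (t + 1.0)
        half = 1.0 - np.abs(y) / np.sqrt(3.0)       # half-width at height y
        xs.append((half[:, None] * t[None, :]).ravel())
        ys.append(np.repeat(y, n))
        ws.append((0.5 * h * wt[:, None] * half[:, None] * wt[None, :]).ravel())
    x, y, w = map(np.concatenate, (xs, ys, ws))
    return x, y, w / w.sum()

def trefoil_sin(x, y):
    """rho^3 sin(3 theta) written in Cartesian form."""
    return 3 * x**2 * y - y**3

def norm_factor(f, x, y, w):
    """Factor N such that N * f has unit mean square value over the nodes."""
    return 1.0 / np.sqrt(np.sum(w * f(x, y) ** 2))

x, y, w = hexagon_nodes()                           # hexagon of Figure 7-7
alpha = np.deg2rad(30.0)
xr = x * np.cos(alpha) + y * np.sin(alpha)          # clockwise rotation by 30 deg
yr = -x * np.sin(alpha) + y * np.cos(alpha)         # hexagon of Figure 7-8

n_fig77 = norm_factor(trefoil_sin, x, y, w)
n_fig78 = norm_factor(trefoil_sin, xr, yr, w)
print(n_fig77, 4 * np.sqrt(10) / 3)
print(n_fig78, 4 * np.sqrt(70 / 103))
```

For the unrotated hexagon the factor agrees with the coefficient of $H_9$ in Table 7-3, while for the rotated hexagon it takes the value that multiplies $\rho^3 \cos 3\theta$ in $H_{10}$ of the same table.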
7.5 HEXAGONAL COEFFICIENTS OF A HEXAGONAL ABERRATION FUNCTION
A hexagonal aberration function $W(x, y)$ across a unit hexagon can be expanded in
terms of $J$ hexagonal polynomials $H_j(x, y)$ in the form
\[ W(x, y) = \sum_{j=1}^{J} a_j H_j(x, y) , \tag{7-29} \]
where $a_j$ are the expansion coefficients. Multiplying both sides of Eq. (7-29) by
$H_j(x, y)$, integrating over the unit hexagon, and using the orthonormality Eq. (7-27), we
obtain the hexagonal expansion coefficients:
\[ a_j = \frac{2}{3\sqrt{3}} \iint_{\text{hexagon}} W(x, y)\, H_j(x, y)\, dx\, dy . \tag{7-30} \]
It is evident from Eq. (7-30) that the value of a hexagonal coefficient is independent of
the number $J$ of polynomials used in the expansion of the aberration function. Hence, one
or more polynomial terms can be added to or subtracted from the aberration function
without affecting the value of the coefficients of the other polynomials in the expansion.
The mean and mean square values of the aberration function are given by
\[ \langle W(\rho, \theta) \rangle = a_1 \tag{7-31} \]
and
\[ \langle W^2(\rho, \theta) \rangle = \sum_{j=1}^{J} a_j^2 , \tag{7-32} \]
respectively.
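Equations (7-29) through (7-32) can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the quadrature helper `hexagon_nodes`, the function `hexagonal_basis`, and the test aberration `W` are hypothetical, and only the first six polynomials from Table 7-4 are used. It checks the orthonormality of the polynomials, evaluates the coefficients as weighted averages per Eq. (7-30), and compares the sum of squared coefficients for $j \ge 2$ with the variance computed directly.

```python
import numpy as np

def hexagon_nodes(n=20):
    """Gauss-Legendre nodes (x, y) and weights w (summing to 1) over the unit
    hexagon with corners at (+-1, 0) and (+-1/2, +-sqrt(3)/2)."""
    t, wt = np.polynomial.legendre.leggauss(n)      # rule on [-1, 1]
    h = np.sqrt(3.0) / 2.0                          # half-height of the hexagon
    xs, ys, ws = [], [], []
    for sign in (1.0, -1.0):                        # upper and lower halves
        y = sign * 0.5 * h * (t + 1.0)
        half = 1.0 - np.abs(y) / np.sqrt(3.0)       # half-width at height y
        xs.append((half[:, None] * t[None, :]).ravel())
        ys.append(np.repeat(y, n))
        ws.append((0.5 * h * wt[:, None] * half[:, None] * wt[None, :]).ravel())
    x, y, w = map(np.concatenate, (xs, ys, ws))
    return x, y, w / w.sum()

def hexagonal_basis(x, y):
    """H1 through H6 in Cartesian form, as listed in Table 7-4."""
    r2 = x**2 + y**2
    return np.array([
        np.ones_like(x),                            # H1, piston
        2 * np.sqrt(6 / 5) * x,                     # H2, x tilt
        2 * np.sqrt(6 / 5) * y,                     # H3, y tilt
        np.sqrt(5 / 43) * (12 * r2 - 5),            # H4, defocus
        4 * np.sqrt(15 / 7) * x * y,                # H5, astigmatism
        2 * np.sqrt(15 / 7) * (x**2 - y**2),        # H6, astigmatism
    ])

x, y, w = hexagon_nodes()
H = hexagonal_basis(x, y)

# Orthonormality: the weighted Gram matrix should be the identity matrix.
gram = (H * w) @ H.T

# Hypothetical aberration function (in waves), used only for illustration.
W = 0.3 + 0.2 * x + 0.5 * (x**2 + y**2)

a = (H * w) @ W                                     # Eq. (7-30) as a weighted mean
mean_W = np.sum(w * W)                              # compare with a[0], Eq. (7-31)
var_W = np.sum(w * W**2) - mean_W**2                # compare with sum of a_j^2, j >= 2
print(a)
print(mean_W, var_W, np.sum(a[1:] ** 2))
```

Because the weights sum to unity, the weighted sum plays the role of the normalized integral in Eq. (7-30). Adding further polynomials to `hexagonal_basis` leaves the coefficients already computed unchanged.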
Table 7-5. Orthonormal hexagonal polynomials $H_j(\rho, \theta)$ in terms of Zernike circle
polynomials $Z_j(\rho, \theta)$ for a hexagon rotated by 30°, as in Figure 7-8.
H1 Z1
H2 6 / 5 Z2
H3 6 / 5 Z3
H4 5 / 43 Z1 + 2 15 / 43 Z4
H5 10 / 7 Z5
H6 10 / 7 Z6
H7 16 14 / 11055 Z3 + 10 35 / 2211 Z7
H8 16 14 / 11055 Z2 + 10 35 / 2211 Z8
H9 2 35 / 103 Z9
H10 (2 5 /3)Z10
H11 (521/ 1072205 ) Z1 + 88 15 / 214441 Z4 + 14 43 / 4987 Z11
H12 = 225 6 / 492583 Z6 + 42 70 / 70369 Z12
H13 = 225 6 / 492583 Z5 + 42 70 / 70369 Z13
H14 = 2525 14 / 297774543 Z6 + (1495 70 / 99258181 /3)Z12 + ( 378910 / 18337 /3)Z14
H15 = 2525 14 / 297774543 Z5 (1495 70 / 99258181 /3)Z13 + ( 378910 / 18337 /3)Z15
H16 = 30857 2 / 3268147641 Z2 + (49168/ 3268147641) Z8 + 42 1474 / 1478131 Z16
H17 = 30857 2 / 3268147641 Z3 + (49168/ 3268147641) Z7 + 42 1474 / 1478131 Z17
H18 = 6 10 / 97 Z10 + 14 5 / 291 Z18
H19 = 386 770 / 295894589 Z9 + 6 118965 / 2872763 Z19

H20 = 0.71499593Z2 + 0.72488884Z8 + 0.46636441Z16 + 1.72029850Z20
H21 = 0.71499593Z3 0.72488884Z7 0.46636441Z17 + 1.72029850Z21
H22 = 0.58113135Z1 + 0.89024136Z4 + 0.89044507Z11 + 1.32320623Z22
H23 = 1.15667686Z5 + 1.10775599Z13 0.43375081Z15 + 1.39889072Z23
H24 = 1.15667686Z6 + 1.10775599Z12 + 0.43375081Z14 + 1.39889072Z24
H25 = 1.31832566Z5 1.14465174Z13 + 1.94724032Z15 0.67629133Z23 + 1.75496998Z25
H26 = 1.31832566Z6 + 1.14465174Z12 + 1.94724032Z14 + 0.67629133Z24 + 1.75496998Z26
H27 = 2 77 / 93 Z27
H28 = 1.07362889Z1 + 1.52546162Z4 + 1.28216588Z11 + 0.70446308Z22 + 2.09532473Z28
The variance of the aberration function is accordingly given by
\[ \sigma_W^2 = \langle W^2(\rho, \theta) \rangle - \langle W(\rho, \theta) \rangle^2 = \sum_{j=2}^{J} a_j^2 . \tag{7-33} \]

7.6 ISOMETRIC, INTERFEROMETRIC, AND IMAGING CHARACTERISTICS OF HEXAGONAL POLYNOMIAL ABERRATIONS
As in the case of circle and annular polynomials (see Sections 4.9 and 5.7,
respectively), we illustrate the hexagonal polynomials for $n \le 8$ in three different but
equivalent ways in Figure 7-9. For each polynomial, the isometric plot at the top
illustrates its shape. An interferogram is shown on the left, and a corresponding PSF is
shown on the right for a sigma value of one wave. The peak-to-valley aberration numbers
(in units of wavelength) are given in Table 7-6.
The PSF plots represent the images of a point object in the presence of a polynomial
aberration. They can be obtained by applying Eq. (7-6) to a hexagonal pupil. Piston yields
the aberration-free PSF since it does not affect the PSF. The full width of a square
displaying the PSFs is $24\lambda F_x$.
The polynomial aberrations $H_2$ and $H_3$, representing the x and y wavefront tilts
with aberration coefficients $a_2$ and $a_3$, displace the PSF in the image plane along the x
and y axes, respectively. If the coefficient $a_2$ is in units of wavelength, it corresponds to a
wavefront tilt angle of $2\sqrt{6/5}\,\lambda a_2/a$ about the y axis and displaces the PSF along the x
axis by $4\sqrt{6/5}\,\lambda F_x a_2$, where $F_x = R/2a$ is the focal ratio of the image-forming beam
along the x axis. Similarly, the coefficient $a_3$ corresponds to a tilt angle of $4\sqrt{2/5}\,\lambda a_3/a$
about the x axis, and yields a displacement of the PSF along the y axis by $4\sqrt{6/5}\,\lambda F_y a_3$,
where $F_y = R/(\sqrt{3}\,a)$ is the focal ratio of the image-forming beam along the y axis.
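The numbers quoted for the x tilt follow in a few steps. The derivation below is a sketch under two assumptions that are not spelled out in the text: the normalized coordinate is taken as $x = x_p/a$, where $x_p$ (a symbol introduced here) is the unnormalized pupil coordinate and $a$ is the center-to-corner distance of the hexagon, and only the magnitudes of the tilt angle $\beta_x$ and the displacement $\Delta x$ (also introduced here) are tracked, not their signs.

```latex
\begin{aligned}
W(x_p) &= a_2 \lambda\, H_2 = 2\sqrt{6/5}\; a_2 \lambda\, \frac{x_p}{a} , \\
\beta_x &= \frac{\partial W}{\partial x_p} = 2\sqrt{6/5}\; \frac{\lambda a_2}{a} , \\
\Delta x &= R\, \beta_x = 2\sqrt{6/5}\; \frac{R}{a}\, \lambda a_2
         = 4\sqrt{6/5}\; \lambda F_x a_2 \approx 4.38\, \lambda F_x a_2 ,
\qquad F_x = \frac{R}{2a} .
\end{aligned}
```

For a coefficient of one wave, the PSF is thus displaced by about $4.4\lambda F_x$, which is well within the $24\lambda F_x$ width of the squares used to display the PSFs.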
The symmetry properties of the aberrated PSFs (and OTFs) discussed for the circular
pupils in Section 4.7 are generally not applicable to hexagonal pupils. For example,
although the forms of the polynomials $H_5$ and $H_6$, representing balanced astigmatisms,
are the same as those of the corresponding Zernike circle polynomials, the interferogram and the
PSF for one cannot be obtained by a 45° rotation of the other. This is due to the lack of
radial symmetry of the hexagonal pupil. However, the interferograms and PSFs for the
polynomials $H_7$ and $H_8$, representing balanced comas, are different from each other
only by a 90° rotation. Similarly, the polynomials $H_9$ and $H_{10}$ have the same form as
the Zernike circle polynomials $Z_9$ and $Z_{10}$, respectively, and they yield 6-fold symmetric
interferograms and 3-fold symmetric PSFs. The PSF for one can be obtained by a 120°
rotation of the other. The interferograms and the PSFs for $H_{11}$ and $H_{22}$, representing the
balanced primary and secondary spherical aberrations, respectively, are radially symmetric, but
those for $H_{37}$, representing the balanced tertiary spherical aberration, are not because it contains a
term in $\cos 6\theta$. Of course, as the order of a polynomial aberration increases, the
interferograms and the PSFs become more and more complex.
Figure 7-9. Hexagonal polynomials $H_1$ through $H_{45}$ shown as an isometric plot on the top,
an interferogram on the left, and a PSF on the right for a sigma value of one wave.
Table 7-6. Peak-to-valley numbers (in units of wavelength) of the orthonormal
hexagonal polynomials for a sigma value of one wave.
H1 0 H16 17.108 H31 8.210
H2 4.328 H17 14.816 H32 18.426
H3 3.795 H18 11.982 H33 10.495
H4 4.092 H19 5.696 H34 9.657
H5 5.071 H20 8.081 H35 10.094
H6 5.123 H21 7.855 H36 10.537
H7 5.790 H22 10.086 H37 12.843
H8 9.395 H23 17.665 H38 16.723
H9 5.477 H24 15.298 H39 25.254
H10 6.595 H25 8.764 H40 11.499
H11 5.728 H26 7.919 H41 12.891
H12 9.169 H27 7.384 H42 6.278
H13 10.587 H28 6.655 H43 9.859
H14 6.803 H29 22.362 H44 11.139
H15 7.116 H30 25.822 H45 9.983
From Eq. (7-6), the Strehl ratio, representing the central value of an aberrated PSF
relative to its aberration-free value, is given by
\[ S \equiv I(0, 0) = \frac{4}{27} \left| \iint \exp[i\Phi(x, y)]\, dx\, dy \right|^2 , \tag{7-34} \]
where the integration is carried out over the unit hexagon, as in Eq. (7-8). We have
removed the primes on the x and y coordinates in Eq. (7-34), because the hexagonal
polynomial aberrations are already written in the normalized coordinates. The Strehl ratio
for these aberrations with a sigma value of 0.1 wave is listed in Table 7-7 and plotted in
Figure 7-10. Because of the small value of the aberration, the Strehl ratio is
approximately the same for each polynomial, thus illustrating its independence of the
type of the aberration. It is approximately given by $\exp(-\sigma_\Phi^2)$, or 0.67, where
$\sigma_\Phi = 0.2\pi$.
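Equation (7-34) lends itself to a direct numerical evaluation. The following sketch is an illustration rather than the method used to produce the tabulated values: it assumes a Gauss-Legendre rule over the unit hexagon (the helpers `hexagon_nodes` and `strehl` are hypothetical names) and evaluates the Strehl ratio for $H_4$, $H_6$, and $H_{11}$ with a sigma value of 0.1 wave, alongside the estimate $\exp(-\sigma_\Phi^2)$.

```python
import numpy as np

def hexagon_nodes(n=40):
    """Gauss-Legendre nodes (x, y) and weights w (summing to 1) over the unit
    hexagon with corners at (+-1, 0) and (+-1/2, +-sqrt(3)/2)."""
    t, wt = np.polynomial.legendre.leggauss(n)      # rule on [-1, 1]
    h = np.sqrt(3.0) / 2.0                          # half-height of the hexagon
    xs, ys, ws = [], [], []
    for sign in (1.0, -1.0):                        # upper and lower halves
        y = sign * 0.5 * h * (t + 1.0)
        half = 1.0 - np.abs(y) / np.sqrt(3.0)       # half-width at height y
        xs.append((half[:, None] * t[None, :]).ravel())
        ys.append(np.repeat(y, n))
        ws.append((0.5 * h * wt[:, None] * half[:, None] * wt[None, :]).ravel())
    x, y, w = map(np.concatenate, (xs, ys, ws))
    return x, y, w / w.sum()

def strehl(poly_values, w, sigma_waves):
    """Eq. (7-34) with unit-sum weights: squared modulus of the mean of exp(i*Phi)."""
    phase = 2 * np.pi * sigma_waves * poly_values   # Phi in radians
    return np.abs(np.sum(w * np.exp(1j * phase))) ** 2

x, y, w = hexagon_nodes()
r2 = x**2 + y**2
polys = {
    "H4": np.sqrt(5 / 43) * (12 * r2 - 5),
    "H6": 2 * np.sqrt(15 / 7) * (x**2 - y**2),
    "H11": 3 / np.sqrt(1072205) * (737 - 5140 * r2 + 6020 * r2**2),
}

sigma = 0.1                                         # sigma value in waves
approx = np.exp(-(2 * np.pi * sigma) ** 2)          # exp(-sigma_Phi^2)
results = {name: strehl(values, w, sigma) for name, values in polys.items()}
for name, s in results.items():
    print(f"{name}: S = {s:.3f}   exp(-sigma_Phi^2) = {approx:.3f}")
```

The factor 4/27 in Eq. (7-34) is the inverse square of the hexagon area $3\sqrt{3}/2$, which is why normalizing the weights to unit sum reproduces it. The printed values can be compared with the corresponding entries of Table 7-7.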
Table 7-7. Strehl ratio S for hexagonal polynomial aberrations for a sigma value of
0.1 wave.
H1 1 H16 0.700 H 31 0.678
H2 0.665 H17 0.703 H 32 0.709
H3 0.665 H18 0.694 H 33 0.686
H4 0.664 H19 0.671 H 34 0.687
H5 0.672 H 20 0.692 H 35 0.704
H6 0.672 H 21 0.692 H 36 0.704
H7 0.676 H 22 0.700 H 37 0.713
H8 0.676 H 23 0.706 H 38 0.710
H9 0.677 H 24 0.703 H 39 0.714
H10 0.682 H 25 0.680 H 40 0.693
H11 0.680 H 26 0.680 H 41 0.693
H12 0.686 H 27 0.697 H 42 0.680
H13 0.686 H 28 0.700 H 43 0.693
H14 0.685 H 29 0.717 H 44 0.710
H15 0.685 H 30 0.712 H 45 0.710

Figure 7-10. Strehl ratio for a hexagonal polynomial aberration with a sigma value
of 0.1 wave.
7.7 SEIDEL ABERRATIONS, STANDARD DEVIATION, AND STREHL RATIO

As discussed in the previous chapters, the Strehl ratio of an aberrated image for small
aberrations is determined by the variance of the aberration across the pupil under
consideration. Just as the Zernike circle polynomials represent balanced aberrations in the
sense of minimum variance and, in turn, maximum Strehl ratio for a small aberration,
similarly, the hexagonal polynomials also represent balanced aberrations for the
hexagonal pupils. In Chapters 4 through 6, we have given the value of sigma for a Seidel
aberration, using Ai as its coefficient, with and without balancing for circular, annular,
and Gaussian pupils. As shown below, similar results for a hexagonal pupil can be
obtained from the corresponding orthonormal polynomials. We also determine the Strehl
ratio for Seidel aberrations with and without balancing, and compare with the result
obtained by the exponential approximation.
7.7.1 Defocus
Consider the defocus aberration
\[ W_d(\rho) = A_d \rho^2 . \tag{7-35} \]
From the form of the orthonormal defocus polynomial $H_4$ given in Table 7-2, it is
evident that its sigma value across a hexagonal pupil is given by
\[ \sigma_d = \frac{A_d}{12} \sqrt{\frac{43}{5}} = \frac{A_d}{4.092} . \tag{7-36} \]
7.7.2 Astigmatism
Next consider 0° Seidel astigmatism given by
\[ W_a(\rho, \theta) = A_a \rho^2 \cos^2\theta . \tag{7-37} \]
The orthonormal polynomial representing balanced astigmatism is given by
\[ H_6 = 2\sqrt{15/7}\, \rho^2 \cos 2\theta \tag{7-38a} \]
\[ \phantom{H_6} = 2\sqrt{15/7}\, \rho^2 \left( 2\cos^2\theta - 1 \right) . \tag{7-38b} \]
It shows that the relative amount of defocus $\rho^2$ that balances Seidel astigmatism
$\rho^2 \cos^2\theta$ is the same for a hexagonal pupil as for a circular, annular, or Gaussian pupil.
Hence, for a small amount of astigmatism, the diffraction focus for a hexagonal pupil is
the same as for a circular, annular, or Gaussian pupil. For an image with a focal ratio of
$F$, it lies along the z axis at a distance of $-4A_aF^2$ from the Gaussian image point. The
balanced astigmatism is given by
\[ W_{ba}(\rho, \theta) = A_a \left( \rho^2 \cos^2\theta - \frac{1}{2}\rho^2 \right) . \tag{7-39} \]
Its sigma value is given by
\[ \sigma_{ba} = \frac{A_a}{4} \sqrt{\frac{7}{15}} = \frac{A_a}{5.855} . \tag{7-40} \]
To obtain the sigma value of astigmatism, we write Eq. (7-37) in the form
\[ W_a(\rho, \theta) = \frac{1}{2} A_a \left( \rho^2 \cos 2\theta + \rho^2 \right)
 = \frac{1}{4} A_a \left[ \sqrt{\frac{7}{15}}\, H_6 + \frac{1}{6} \sqrt{\frac{43}{5}}\, H_4 \right] + \text{constant} . \tag{7-41} \]
Utilizing Eq. (7-33), the sigma value is given by
\[ \sigma_a = \frac{A_a}{24} \sqrt{\frac{127}{5}} = \frac{A_a}{4.762} . \tag{7-42} \]
Comparing Eqs. (7-40) and (7-42), we find that balancing astigmatism with defocus
reduces its sigma value by a factor of 1.23.
7.7.3 Coma
Now we consider Seidel coma:
\[ W_c(\rho, \theta) = A_c \rho^3 \cos\theta . \tag{7-43} \]
The orthonormal polynomial representing balanced coma is given by
\[ H_8 = 4\sqrt{42/3685} \left( 25\rho^3 - 14\rho \right) \cos\theta . \tag{7-44} \]
It shows that the relative amount of tilt $\rho \cos\theta$ that optimally balances Seidel coma
$\rho^3 \cos\theta$ is $-14/25 \approx -0.56$ compared to $-2/3$ for a circular pupil. The diffraction focus
in this case lies along the x axis at a distance of $-(4/3)F$ times the amount of tilt from
the Gaussian image point. The balanced coma is given by
\[ W_{bc}(\rho, \theta) = A_c \left( \rho^3 - \frac{14}{25}\rho \right) \cos\theta . \tag{7-45} \]
Its sigma value is given by
\[ \sigma_{bc} = \frac{A_c}{20} \sqrt{\frac{737}{210}} = \frac{A_c}{10.676} . \tag{7-46} \]
To obtain the sigma value of Seidel coma, we write Eq. (7-43) in the form
\[ W_c(\rho, \theta) = A_c \left[ \frac{1}{100} \sqrt{\frac{3685}{42}}\, H_8 + \frac{7}{25} \sqrt{\frac{5}{6}}\, H_2 \right] . \tag{7-47} \]
Utilizing Eq. (7-33), the sigma value is given by
\[ \sigma_c = \frac{A_c}{4} \sqrt{\frac{83}{70}} = \frac{A_c}{3.673} . \tag{7-48} \]
Comparing Eqs. (7-46) and (7-48), we find that balancing coma with tilt reduces its sigma
value by a factor of 2.91.

7.7.4 Spherical Aberration
Finally, we consider Seidel spherical aberration:
\[ W_s(\rho) = A_s \rho^4 . \tag{7-49} \]
The orthonormal polynomial representing balanced spherical aberration is given by
\[ H_{11} = \frac{60}{\sqrt{1072205}} \left( 301\rho^4 - 257\rho^2 \right) + \text{constant} . \tag{7-50} \]
It shows that the relative amount of defocus that optimally balances Seidel spherical
aberration $\rho^4$ is $-257/301 \approx -0.85$ compared to a value of $-1$ for a circular pupil. The
diffraction focus lies closer to the Gaussian image point in the case of coma, and closer to
the Gaussian image plane in the case of spherical aberration, compared to their
corresponding locations for a circular pupil. The balanced spherical aberration is given by
\[ W_{bs}(\rho) = A_s \left( \rho^4 - \frac{257}{301}\rho^2 \right) . \tag{7-51} \]
Its sigma value is given by
\[ \sigma_{bs} = \frac{A_s}{60 \times 301} \sqrt{1072205} = \frac{A_s}{84} \sqrt{\frac{4987}{215}} = \frac{A_s}{17.441} . \tag{7-52} \]
To obtain the sigma value of Seidel spherical aberration, we write Eq. (7-49) in the form
\[ W_s(\rho) = A_s \left[ \frac{\sqrt{1072205}}{60 \times 301}\, H_{11} + \frac{257}{12 \times 301} \sqrt{\frac{43}{5}}\, H_4 \right] + \text{constant} . \tag{7-53} \]
Utilizing Eq. (7-33), the sigma value is given by
\[ \sigma_s = \frac{A_s}{6} \sqrt{\frac{59}{35}} = \frac{A_s}{4.621} . \tag{7-54} \]
Comparing Eqs. (7-52) and (7-54), we find that balancing spherical aberration with defocus
reduces its sigma value by a factor of 3.77.
The sigma values of the Seidel aberrations with and without balancing are given in
Table 7-8. The corresponding peak-to-valley (P-V) numbers for a sigma value of unity
are also given in the table.
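The closed-form sigma values of Eqs. (7-36) through (7-54) can be cross-checked by integrating the Seidel terms directly over the unit hexagon. The sketch below is one way to do so, under the assumption that a Gauss-Legendre rule over the hexagon is acceptable (the helpers `hexagon_nodes` and `sigma_of` are hypothetical names); each aberration is evaluated with a unit coefficient, so the printed reciprocal is the divisor of the coefficient.

```python
import numpy as np

def hexagon_nodes(n=20):
    """Gauss-Legendre nodes (x, y) and weights w (summing to 1) over the unit
    hexagon with corners at (+-1, 0) and (+-1/2, +-sqrt(3)/2)."""
    t, wt = np.polynomial.legendre.leggauss(n)      # rule on [-1, 1]
    h = np.sqrt(3.0) / 2.0                          # half-height of the hexagon
    xs, ys, ws = [], [], []
    for sign in (1.0, -1.0):                        # upper and lower halves
        y = sign * 0.5 * h * (t + 1.0)
        half = 1.0 - np.abs(y) / np.sqrt(3.0)       # half-width at height y
        xs.append((half[:, None] * t[None, :]).ravel())
        ys.append(np.repeat(y, n))
        ws.append((0.5 * h * wt[:, None] * half[:, None] * wt[None, :]).ravel())
    x, y, w = map(np.concatenate, (xs, ys, ws))
    return x, y, w / w.sum()

def sigma_of(W, w):
    """Standard deviation of the aberration values W with unit-sum weights w."""
    mean = np.sum(w * W)
    return np.sqrt(np.sum(w * W**2) - mean**2)

x, y, w = hexagon_nodes()
r2 = x**2 + y**2                                    # rho^2; note rho*cos(theta) = x

# (aberration with unit coefficient, closed-form sigma from the text)
cases = {
    "defocus": (r2, np.sqrt(43 / 5) / 12),
    "astigmatism": (x**2, np.sqrt(127 / 5) / 24),
    "balanced astigmatism": (x**2 - r2 / 2, np.sqrt(7 / 15) / 4),
    "coma": (r2 * x, np.sqrt(83 / 70) / 4),
    "balanced coma": ((r2 - 14 / 25) * x, np.sqrt(737 / 210) / 20),
    "spherical": (r2**2, np.sqrt(59 / 35) / 6),
    "balanced spherical": (r2**2 - (257 / 301) * r2, np.sqrt(4987 / 215) / 84),
}

numeric = {name: sigma_of(W, w) for name, (W, _) in cases.items()}
closed = {name: value for name, (_, value) in cases.items()}
for name in cases:
    print(f"{name:22s} A/sigma: numeric {1 / numeric[name]:.3f}, closed form {1 / closed[name]:.3f}")
```

Since the integrands are polynomials, the quadrature is exact up to rounding, so any disagreement would point to a transcription error in a coefficient rather than to numerical noise.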
7.7.5 Strehl Ratio

In Figure 7-10, we showed the Strehl ratio for the hexagonal polynomial aberrations
with a sigma value of one-tenth of a wave. In Figure 7-11, we show how it varies with the
sigma value of a Seidel aberration, with and without balancing, for $0 \le \sigma_W \le 0.25$. Also
plotted, as the dashed curve, is the Strehl ratio obtained from the approximate expression
$\exp(-\sigma_\Phi^2)$. As expected, the exponential expression yields a very good estimate of the
Strehl ratio for $\sigma_W \le 0.1$. As $\sigma_W$ increases, the true Strehl ratio departs from its
approximate value, except in the case of balanced astigmatism and balanced coma. The
exponential expression overestimates the Strehl ratio in the case of defocus, but
underestimates it for the other aberrations. Moreover, the Strehl ratio for the balanced
spherical aberration for large values of $\sigma_W$ is larger than that for the corresponding
Seidel aberration, but the opposite is true in the case of astigmatism and coma. The
aberration coefficient and the P-V number for a certain value of $\sigma_W$ of these aberrations
can be obtained from Table 7-8.
7.8 SUMMARY
Closed-form expressions for the aberration-free PSF and OTF are given for a system
with a hexagonal pupil. They are plotted along with the ensquared power, and compared
with the corresponding quantities for a system with a corresponding circular pupil. The
ensquared power and the OTF for a hexagonal pupil are shown to be lower than the
corresponding values for a circular pupil. Generally, the quantitative differences between
the corresponding functions for the two pupils are small, perhaps because the difference
in the pupil area is only about 16%.
Table 7-8. Sigma value of a Seidel aberration with and without balancing, and the corresponding P-V number for a sigma value of unity.
Aberration                      Sigma                                                          P-V
Defocus                         $\sigma_d = (A_d/12)\sqrt{43/5} = A_d/4.09$                    4.092
Astigmatism                     $\sigma_a = (A_a/24)\sqrt{127/5} = A_a/4.76$                   4.762
Balanced astigmatism            $\sigma_{ba} = (A_a/4)\sqrt{7/15} = A_a/5.86$                  5.123
Coma                            $\sigma_c = (A_c/4)\sqrt{83/70} = A_c/3.67$                    7.347
Balanced coma                   $\sigma_{bc} = (A_c/20)\sqrt{737/210} = A_c/10.68$             9.395
Spherical aberration            $\sigma_s = (A_s/6)\sqrt{59/35} = A_s/4.62$                    4.621
Balanced spherical aberration   $\sigma_{bs} = (A_s/84)\sqrt{4987/215} = A_s/17.44$            5.728

[Figure 7-11: Strehl ratio $S$ versus $\sigma_W$ from 0 to 0.25 in four panels: (a) defocus, (b) astigmatism, (c) coma, (d) spherical aberration.]
The polynomials orthonormal over a hexagonal pupil, representing the balanced
classical aberrations over such a pupil, are given through the eighth order in Tables 7-2
through 7-4 in terms of the circle polynomials, in polar coordinates, and in Cartesian
coordinates, respectively. The polynomials are ordered in the same manner as the circle,
annular, and Gaussian polynomials discussed in Chapters 4, 5, and 6, respectively.
However, unlike these polynomials, the hexagonal polynomials are generally not
separable in the coordinates $\rho$ and $\theta$ of a pupil point due to the lack of radial symmetry
of the hexagonal pupil. The first 13 polynomials, i.e., up to $H_{13}$, are separable, but $H_{14}$
and $H_{15}$ are not; $H_{16}$ through $H_{19}$ are separable, but $H_{20}$ and $H_{21}$ are not. Accordingly,
the concept of two indices n and m with dependence on m in the form of $\cos m\theta$ or
$\sin m\theta$ loses significance. For example, the Zernike circle polynomial $Z_{14}$ for n = 4 and
m = 4 varies as $\cos 4\theta$, but $H_{14}$ has a term in $\cos 2\theta$ also. Hence, the hexagonal
polynomials can be ordered by a single index only. Even so, each polynomial contains
only the cosine or the sine terms. Thus an even j polynomial, for example, consists of
only the cosine terms, as may be seen from Table 7-2.
While the polynomials $H_{11}$ and $H_{22}$ representing balanced primary and secondary
spherical aberrations are radially symmetric, the polynomial $H_{37}$ representing balanced
tertiary spherical aberration is not, since it also contains an angle-dependent term in $Z_{28}$,
i.e., in $\cos 6\theta$. If this term is not included in the polynomial $H_{37}$, the standard deviation of
the aberration increases from a value of unity to 1.13339.
In practice, the polynomials in Cartesian coordinates given in Table 7-4 will be used
for the analysis of aberration data of a hexagonal wavefront. A somewhat different set of
hexagonal polynomials is obtained when the hexagon is rotated by 30 degrees. These
polynomials are given in Table 7-5.
The first 45 hexagonal polynomials, i.e., up to and including the 8th order, are
illustrated by an isometric plot, an interferogram, and a PSF in Figure 7-9. The coefficient
of each orthonormal polynomial, or the sigma value of the corresponding aberration, is
one wave. Their corresponding P-V numbers for a sigma value of one wave are given in
Table 7-6 in units of wavelength. The Strehl ratio for a sigma value of $0.1\lambda$ for each
aberration is given in Table 7-7 and illustrated in Figure 7-10. It shows that, for a small
aberration, the Strehl ratio can be estimated from the aberration variance. The sigma
values of the Seidel aberrations and their balanced forms are given, along with their P-V
numbers in Table 7-8.
The diffraction focus for a system with a hexagonal pupil is shown to lie closer to the
Gaussian image point in the case of coma, and closer to the Gaussian image plane in the
case of spherical aberration, compared to their corresponding locations for a circular
pupil. Figure 7-11 shows how the Strehl ratio varies with the sigma value of a Seidel
aberration, with and without balancing. The approximate expression $\exp(-\sigma_\Phi^2)$
overestimates its value in the case of defocus, but underestimates it for the other
aberrations.
References
1. keckobservatory.org/
2. L. D. Feinberg, M. Clampin, R. K. Keski-Kuha, C. Atkinson, S. Texter, M.
Bergeland, and B. B. Gallagher, “James Webb telescope optical telescope element
mirror development history and results,” in Space Telescopes and Instrumentation,
Proc. SPIE, 84422 (2012).
3. M. Troy and G. Chanan, “Diffraction effects from giant segmented-mirror
telescopes,” Appl. Opt. 42, 3745–3753 (2003).
4. E. Sabatke, J. Burge, and D. Sabatke, “Analytic diffraction analysis of a 32-m
telescope with hexagonal segments for high-contrast imaging,” Appl. Opt. 44,
1360–1365 (2005).
5. R. C. Smith and J. S. Marsh, “Diffraction patterns of simple apertures,” J. Opt.
Soc. Am. 64, 798–803 (1974).
6. J. A. Díaz and V. N. Mahajan, “Imaging by a system with a hexagonal pupil,”
Appl. Opt. 52, 5112–5122 (2013).
7. G. Chanan and M. Troy, “Strehl ratio and modulation transfer function for
segmented mirror telescopes as functions of segment phase error,” Appl. Opt. 38,
6642–6647 (1999).
8. N. Yaitskova and K. Dohlen, “Tip-tilt error for extremely large segmented
telescopes: detailed theoretical point-spread function analysis and numerical
simulation results,” J. Opt. Soc. Am. A 19, 1274–1285 (2003).
9. V. N. Mahajan and G.-m. Dai, “Orthonormal polynomials in wavefront analysis:
analytical solution,” J. Opt. Soc. Am. A 24, 2994–3016 (2007). Errata: J. Opt. Soc.
Am. A 29, 1673–1674 (2012).

Optics, V. N. Mahajan and E. V. Stryland, eds., 3rd edition, Vol II, pp. 11.3–
11.41 (McGraw Hill, 2009).
CHAPTER 8
SYSTEMS WITH ELLIPTICAL PUPILS
8.1 Introduction
8.2 Pupil Function
8.3.1 PSF
8.3.2 OTF
8.4 Elliptical Polynomials
8.5 Elliptical Coefficients of an Elliptical Aberration Function
Elliptical Polynomial Aberrations
8.7 Seidel Aberrations and Their Standard Deviations
8.7.1 Defocus
8.7.2 Astigmatism
8.7.3 Coma
8.8 Summary
References
8.1 INTRODUCTION
The pupil of a human eye is slightly elliptical [1]. The pupil for off-axis imaging by a
system with an axial circular pupil may be vignetted, but can be approximated by an
ellipse [2]. When a flat mirror is tested by shining a circular beam on it at some angle
(other than normal incidence), the illuminated spot is elliptical. Similarly, the overlap
region of two circular wavefronts that are displaced from each other, as in lateral shearing
interferometry [3] or in the calculation of the optical transfer function of a system [4], can
also be approximated by an ellipse.
Starting with the pupil function of a system with an elliptical pupil, we scale the
coordinates of a point on the pupil and transform it to a circular pupil. The aberration-free
PSF and OTF are then obtained as for a system with a circular pupil. The corresponding
PSF and OTF obtained by unscaling the coordinates represent the results for the elliptical
pupil. Then we discuss the polynomials that are orthonormal over and represent balanced
classical aberrations for a unit elliptical pupil [5]. These polynomials cannot be obtained
by scaling the coordinates of the Zernike circle polynomials. The balancing of a Seidel
aberration over an elliptical pupil is discussed, and its standard deviation with and
without balancing is determined.
8.2 PUPIL FUNCTION

As illustrated in Figure 8-1a, consider an imaging system with an elliptical exit pupil with semimajor and semiminor axes $a$ and $b$ and area $S_{ex} = \pi ab$ lying in the $(x_p, y_p)$ plane with the $z$ axis as its optical axis. The pupil is described by
$$\frac{x_p^2}{a^2} + \frac{y_p^2}{b^2} \le 1 . \tag{8-1}$$
The aspect ratio $c$ of the pupil is given by
$$c = b/a \le 1 . \tag{8-2}$$
For a uniformly illuminated pupil with an aberration function $\Phi(x_p, y_p)$ and power $P_{ex}$ exiting from it, the pupil function can be written
$$P(x_p, y_p) = A(x_p, y_p)\exp[i\Phi(x_p, y_p)] , \tag{8-3}$$
where
$$A(x_p, y_p) = (P_{ex}/S_{ex})^{1/2} \tag{8-4}$$
is the uniform amplitude across the pupil.

Figure 8-1. (a) Elliptical pupil with semimajor and semiminor axes $a$ and $b$. (b) Elliptical pupil transformed into a circular pupil by scaling its $y_p$ coordinate.

An elliptical pupil can be transformed to a circular pupil by scaling its coordinates.
Using the results for a circular pupil, the PSF [6] and OTF [7] of an elliptical pupil can be
written in this scaled coordinate system. Unscaling the coordinates finally yields the PSF
and OTF for a system with an elliptical pupil.
8.3.1 PSF
From Eq. (1-9), the aberrated irradiance distribution in the image plane of a system with a uniformly illuminated elliptical exit pupil, normalized by its aberration-free central value $P_{ex}S_{ex}/\lambda^2R^2$, can be written
$$I(x_i, y_i) = \frac{1}{S_{ex}^2}\left|\iint \exp[i\Phi(x_p, y_p)]\exp\left[-\frac{2\pi i}{\lambda R}(x_i x_p + y_i y_p)\right]dx_p\,dy_p\right|^2 , \tag{8-5}$$
where the integration is carried over the elliptical pupil. Using the scaled pupil coordinates $(x'_p, y'_p)$, where
$$(x'_p, y'_p) = (x_p, y_p/c) , \tag{8-6}$$
the elliptical pupil is transformed into a circular pupil of radius $a$ defined by
$$x'^2_p + y'^2_p \le a^2 . \tag{8-7}$$
Similarly, we scale the image plane coordinates $(x_i, y_i)$ into $(x'_i, y'_i)$ according to
$$(x'_i, y'_i) = (x_i, c\,y_i) , \tag{8-8}$$
because of the Fourier transform relationship between the pupil function and the diffracted amplitude. In the scaled coordinates, Eq. (8-5) for the aberration-free case becomes
$$I(x'_i, y'_i; c) = \frac{c^2}{S_{ex}^2}\left|\iint_{\text{circle}} \exp\left[-\frac{2\pi i}{\lambda R}(x'_i x'_p + y'_i y'_p)\right]dx'_p\,dy'_p\right|^2 . \tag{8-9}$$
In polar coordinates $(\rho'_p, \theta'_p)$ and $(r'_i, \theta'_i)$ for the pupil and image points, we can write
$$(x'_p, y'_p) = \rho'_p(\cos\theta'_p, \sin\theta'_p) = a\rho(\cos\theta'_p, \sin\theta'_p) \tag{8-10}$$
and
$$(x'_i, y'_i) = r'_i(\cos\theta'_i, \sin\theta'_i) , \tag{8-11}$$
where $0 \le \rho \le 1$ and $0 \le \theta'_p, \theta'_i \le 2\pi$. In these polar coordinates, we can write Eq. (8-9) in the form
$$I(r, \theta'_i; c) = \frac{1}{\pi^2}\left|\int_0^1\!\!\int_0^{2\pi}\exp[-\pi i r\rho\cos(\theta'_i - \theta'_p)]\,\rho\,d\rho\,d\theta'_p\right|^2 , \tag{8-12}$$
where
$$r = \frac{r'_i}{\lambda R/2a} = \frac{r'_i}{\lambda F_x} , \tag{8-13}$$
and
$$F_x = R/2a \tag{8-14}$$
is the focal ratio of the image-forming light cone along the $x_p$ axis.
For the aberration-free case, we let $\Phi(\rho, \theta'_p) = 0$ and perform the integration as for a circular pupil. Thus, we obtain
$$I(r) = \left[\frac{2J_1(\pi r)}{\pi r}\right]^2 . \tag{8-15}$$
Substituting for $r$ from Eqs. (8-8), (8-11), and (8-13), we obtain
$$I(x, y; c) = \left\{\frac{2J_1\!\left[\pi(x^2 + c^2y^2)^{1/2}\right]}{\pi(x^2 + c^2y^2)^{1/2}}\right\}^2 , \tag{8-16}$$
where $(x, y)$ are image plane coordinates in units of $\lambda F_x$. The fractional power contained in an elliptical ring can be obtained in a similar manner from the corresponding equation for a circular pupil, namely, Eq. (4-11). Thus, the fractional power in an elliptical ring with semimajor and semiminor axes $x_c$ and $y_c$ with $y_c = c\,x_c$ is given by
$$P(x_c, y_c; c) = 1 - J_0^2\!\left(\pi\sqrt{x_c^2 + c^2y_c^2}\right) - J_1^2\!\left(\pi\sqrt{x_c^2 + c^2y_c^2}\right) . \tag{8-17}$$
The distribution given by Eq. (8-16) approaches the Airy pattern for a circular pupil as we let the aspect ratio $c \to 1$. We also note that the relative irradiance at a point $(x, y/c)$ is equal to the relative irradiance of the Airy pattern at a point $(x, y)$. However, the central irradiance for the elliptical pupil is equal to $c^2$ times the central value of the Airy pattern. This is due to the area of the elliptical pupil being equal to $c$ times that of the circular pupil, and the power incident on and exiting from the elliptical pupil also being equal to $c$ times that for the circular pupil.
Figure 8-2a shows the 2D PSF for $c = 0.85$. It is evident that the circular diffraction rings of a circular pupil have been replaced by the elliptical diffraction rings of an elliptical pupil. The dimension of a ring is larger in the direction of the smaller dimension of the pupil with an aspect ratio of $1/c$. Figure 8-2b shows the irradiance distribution along the $x$ and $y$ axes, and at 45° from the $x$ axis. The first zero along the $x$ axis occurs at 1.22 (in units of $\lambda F_x$), as in the Airy pattern, at 1.22/0.85 or about 1.44 along the $y$ axis, and at about 1.32 at 45° from the $x$ axis [see the curve $I(r) \equiv I(x = y)$].
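The first zeros quoted above can be checked numerically from Eq. (8-16). The following is a minimal sketch, not the author's procedure: the helper names `bessel_j1`, `psf_elliptical`, and `first_zero` are hypothetical, and $J_1$ is evaluated from Bessel's integral with a midpoint rule so that only the Python standard library is needed.

```python
import math

def bessel_j1(x, n=400):
    """J1(x) from Bessel's integral (1/pi) * int_0^pi cos(t - x sin t) dt (midpoint rule)."""
    h = math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        total += math.cos(t - x * math.sin(t))
    return total * h / math.pi

def psf_elliptical(x, y, c):
    """Aberration-free PSF of Eq. (8-16); x and y are in units of lambda*Fx."""
    r = math.hypot(x, c * y)
    if r < 1e-12:
        return 1.0  # central value, normalized to unity
    return (2.0 * bessel_j1(math.pi * r) / (math.pi * r)) ** 2

def first_zero(angle_deg, c, lo=1.0, hi=1.6, tol=1e-7):
    """Distance from the center to the first PSF zero along a ray at angle_deg from the x axis.

    Bisection on J1, which is positive at lo and negative at hi for 0.85 <= c <= 1.
    """
    t = math.radians(angle_deg)
    g = math.hypot(math.cos(t), c * math.sin(t))  # (x^2 + c^2 y^2)^(1/2) per unit distance
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if bessel_j1(math.pi * mid * g) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

c = 0.85
zero_x = first_zero(0.0, c)
zero_y = first_zero(90.0, c)
zero_45 = first_zero(45.0, c)
print(zero_x, zero_y, zero_45)
```

The zero along the $y$ axis is the zero along the $x$ axis divided by $c$, which is the scaling of the rings described in the text.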
Figure 8-2. (a) 2D aberration-free PSF for $c = 0.85$. (b) Irradiance distributions $I(x, 0)$, $I(0, y)$, and $I(r)$ along the $x$ and $y$ axes, and at 45° from the $x$ axis, where $x$, $y$, and $r$ are in units of $\lambda F_x$.
8.3.2 OTF
The OTF of an aberration-free system at a spatial frequency $\vec{v}_i$ is given by [see Eq. (2-13)]
$$\tau(\vec{v}_i) = P_{ex}^{-1}\int A(\vec{r}_p)\,A(\vec{r}_p - \lambda R\,\vec{v}_i)\,d\vec{r}_p . \tag{8-18}$$
It represents the fractional area of overlap of two elliptical pupils centered at $(0, 0)$ and $\lambda R(\xi, \eta)$, where $(\xi, \eta)$ are the Cartesian components of the spatial frequency vector $\vec{v}_i$. In the scaled coordinates $(x'_p, y'_p)$, as in Eq. (8-6), the elliptical pupil reduces to a circular pupil of radius $a$. The overlap area of two circular pupils, each of radius $a$, with their origins at $(0, 0)$ and $(x'_0, y'_0)$ is given by
$$S(x'_0, y'_0; a) = 2a^2\left\{\cos^{-1}\!\left(\frac{r'_0}{2a}\right) - \frac{r'_0}{2a}\left[1 - \left(\frac{r'_0}{2a}\right)^2\right]^{1/2}\right\} , \tag{8-19}$$
where
$$r'_0 = (x'^2_0 + y'^2_0)^{1/2} \tag{8-20}$$
is the distance between the centers of the two pupils.
Letting
$$(x'_0, y'_0) = \lambda R(\xi', \eta') = \lambda R(\xi, \eta/c) \tag{8-21}$$
and noting that the overlap area is to be multiplied by $c$ when writing it in the unscaled coordinates, the OTF of a system with an elliptical pupil can be written from Eq. (8-19) in the form
$$\tau(v_x, v_y) = \frac{2}{\pi}\left[\cos^{-1}v_e - v_e(1 - v_e^2)^{1/2}\right] , \tag{8-22}$$
where
$$v_e = \left(v_x^2 + \frac{v_y^2}{c^2}\right)^{1/2} \tag{8-23}$$
and
$$(v_x, v_y) = \left(\frac{\xi}{1/\lambda F_x}, \frac{\eta}{1/\lambda F_x}\right) \tag{8-24}$$
are the spatial frequency components normalized by the cutoff frequency $1/\lambda F_x$ along the $x$ axis.
It should be evident that, since $-a \le x_p \le a$ and $-b \le y_p \le b$, we have $0 \le v_x \le 1$ and $0 \le v_y \le c$. Hence, the cutoff spatial frequency varies with its orientation. Thus, for example, the cutoff frequencies along the $x$ and $y$ axes are 1 and $c$, respectively. A smaller cutoff frequency along the $y$ axis is the spatial frequency analog of the larger diffraction spread due to the smaller dimension of the pupil along this axis. For an arbitrary direction making an angle $\theta$ with the $x$ axis, the cutoff frequency is given by $c\left[1 - (1 - c^2)\cos^2\theta\right]^{-1/2}$, and represents the distance from the center of a unit ellipse to the point where a line passing through the center and making an angle $\theta$ with the $x$ axis meets it. For example, the cutoff frequency for 45° is equal to 0.916 when $c = 0.85$.
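Equations (8-22) and (8-23) and the direction-dependent cutoff translate directly into code. The sketch below is only illustrative; the function names `otf_elliptical` and `cutoff_frequency` are hypothetical, and frequencies are assumed to be normalized by $1/\lambda F_x$ as in Eq. (8-24).

```python
import math

def otf_elliptical(vx, vy, c):
    """Aberration-free OTF of Eq. (8-22) for normalized frequencies (vx, vy)."""
    ve = math.hypot(vx, vy / c)          # Eq. (8-23)
    if ve >= 1.0:
        return 0.0                       # beyond cutoff the two pupils no longer overlap
    return (2.0 / math.pi) * (math.acos(ve) - ve * math.sqrt(1.0 - ve * ve))

def cutoff_frequency(theta_deg, c):
    """Cutoff frequency along a direction at theta_deg from the x axis."""
    cos_t = math.cos(math.radians(theta_deg))
    return c / math.sqrt(1.0 - (1.0 - c * c) * cos_t * cos_t)

c = 0.85
cut_x = cutoff_frequency(0.0, c)
cut_y = cutoff_frequency(90.0, c)
cut_45 = cutoff_frequency(45.0, c)
print(cut_x, cut_y, round(cut_45, 3))
```

Evaluating `otf_elliptical` at a frequency of magnitude `cut_45` along the 45° direction gives $v_e = 1$, so the OTF vanishes there, consistent with the interpretation of the cutoff as a point on the unit ellipse.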
Figure 8-3 shows the OTF for $c = 0.85$ along the $x$ and $y$ axes, and at 45° from the $x$ axis as $\tau(v_x)$, $\tau(v_y)$, and $\tau(v_x = v_y) \equiv \tau(v_e)$ with the corresponding cutoff spatial frequencies of 1, 0.85, and 0.916, respectively, each in units of $1/\lambda F_x$. It should be evident that $\tau(v_x)$ is obtained from Eq. (8-22) by letting $v_y = 0$. Similarly, $\tau(v_y)$ is obtained by letting $v_x = 0$. Moreover, the OTF along the $x$ axis is the same as for a corresponding circular pupil.
Figure 8-3. OTF of a system with an elliptical pupil with aspect ratio $c = 0.85$, along the $x$ and $y$ axes, and at 45° from the $x$ axis, where $v_x$, $v_y$, and $v_e$ are all in units of $1/\lambda F_x$.
8.4 ELLIPTICAL POLYNOMIALS

In Section 8.3, we obtained the aberration-free PSF and OTF by scaling the
coordinates of the elliptical pupil and thereby transforming it into a circular pupil, and
then using the PSF and OTF of a circular pupil. Similarly, by scaling the coordinates of
the Zernike circle polynomials we can obtain polynomials that are orthogonal over an
elliptical pupil. However, these elliptical polynomials do not represent the balanced
classical aberrations for a system with an elliptical pupil. To obtain the polynomials that
are orthogonal over and represent balanced aberrations for an elliptical pupil, we
orthogonalize the Zernike circle polynomials over the elliptical pupil [7,8].
Figure 8-4 shows a unit ellipse of aspect ratio $c$ inscribed inside a unit circle. Thus the semimajor and semiminor axes $a$ and $b$ of the ellipse have been normalized by $a$ so that the farthest points on the ellipse lie at a distance of unity from its center. The unit ellipse is represented by the equation
$$x^2 + y^2/c^2 = 1 , \tag{8-25}$$
or
$$y = \pm c\sqrt{1 - x^2} . \tag{8-26}$$
The area of the unit ellipse is given by $\pi c$.
The orthonormal elliptical polynomials $E_j$ obtained by orthonormalizing the Zernike circle polynomials $Z_j$ over a unit ellipse are given by [see Eq. (3-18)]
$$E_{j+1} = N_{j+1}\left[Z_{j+1} - \sum_{k=1}^{j}\langle Z_{j+1}E_k\rangle E_k\right] , \tag{8-27}$$
Figure 8-4. Unit ellipse of aspect ratio $c$ inscribed inside a unit circle with its semimajor axis of unity along the $x$ axis.
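where $N_{j+1}$ is a normalization constant so that the polynomials are orthonormal over the unit ellipse, i.e., they satisfy the orthonormality condition
$$\frac{1}{\pi c}\int_{-1}^{1}dx\int_{-c\sqrt{1-x^2}}^{c\sqrt{1-x^2}}E_jE_{j'}\,dy = \delta_{jj'} . \tag{8-28}$$
The angular brackets indicate a mean value over the elliptical pupil. Thus, for example,
$$\langle Z_jE_k\rangle = \frac{1}{\pi c}\int_{-1}^{1}dx\int_{-c\sqrt{1-x^2}}^{c\sqrt{1-x^2}}Z_jE_k\,dy . \tag{8-29}$$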
It should be evident that, because of the symmetric limits of integration, a mean value is zero if the integrand is an odd function of $x$ and/or $y$. If the integrand is an even function of both $x$ and $y$, then we may replace the lower limits of integration by zero and multiply the double integral by 4.
The orthonormal elliptical polynomials up to the fourth order are given in Tables 8-1 through 8-3 in three different but equivalent forms, as in the case of hexagonal polynomials. The expressions for higher-order elliptical polynomials are very long unless the aspect ratio $c$ is specified. As in the case of a hexagonal pupil, each elliptical polynomial consists of either cosine or sine terms, but not both. For example, $E_6$ is a linear combination of $Z_6$, $Z_4$, and $Z_1$. It also shows that the balancing defocus for (zero-degree) Seidel astigmatism is different for an elliptical pupil compared to that for a circular, annular, or Gaussian pupil, as may be seen from Table 4-2, 5-2, or 6-2, respectively. Moreover, $E_{11}$ is a linear combination of $Z_{11}$, $Z_6$, $Z_4$, and $Z_1$. Thus, spherical aberration $\rho^4$ is balanced with not only defocus $\rho^2$ but astigmatism $\rho^2\cos 2\theta$ as well. The elliptical polynomials are generally more complex in that they are made up of a larger number of circle polynomials. These results are a consequence of the fact that the $x$ and $y$ dimensions of the elliptical pupil are not equal. As expected, the elliptical polynomials reduce to the circle polynomials as $c \to 1$, i.e., as the unit ellipse approaches a unit circle.
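The orthonormalization of Eq. (8-27) can be sketched numerically for the first six polynomials. This is an illustrative implementation under stated assumptions rather than the author's procedure: polynomials are stored as dictionaries mapping exponent pairs $(p, q)$ of $x^p y^q$ to coefficients, the helper names (`moment`, `inner`, `orthonormalize`) are hypothetical, and the mean values over the unit ellipse are evaluated from the closed-form monomial moments of a unit disc after scaling $y$ by $c$.

```python
import math

def moment(p, q, c):
    """Mean value of x^p y^q over the unit ellipse x^2 + y^2/c^2 <= 1."""
    if p % 2 or q % 2:
        return 0.0  # odd integrands average to zero by symmetry
    g = math.gamma
    return c**q * g((p + 1) / 2) * g((q + 1) / 2) / (math.pi * g((p + q) / 2 + 2))

def inner(f, h, c):
    """Mean value <f h> over the unit ellipse, as in Eq. (8-29)."""
    return sum(a * b * moment(pa + pb, qa + qb, c)
               for (pa, qa), a in f.items() for (pb, qb), b in h.items())

def add_scaled(f, s, h):
    """Return the polynomial f + s*h."""
    out = dict(f)
    for key, value in h.items():
        out[key] = out.get(key, 0.0) + s * value
    return out

def orthonormalize(basis, c):
    """Gram-Schmidt over the unit ellipse, following Eq. (8-27)."""
    ortho = []
    for z in basis:
        e = dict(z)
        for ek in ortho:
            e = add_scaled(e, -inner(z, ek, c), ek)
        norm = math.sqrt(inner(e, e, c))
        ortho.append({key: value / norm for key, value in e.items()})
    return ortho

s3, s6 = math.sqrt(3.0), math.sqrt(6.0)
zernike = [
    {(0, 0): 1.0},                                    # Z1 = 1
    {(1, 0): 2.0},                                    # Z2 = 2x
    {(0, 1): 2.0},                                    # Z3 = 2y
    {(0, 0): -s3, (2, 0): 2 * s3, (0, 2): 2 * s3},    # Z4 = sqrt(3)(2 rho^2 - 1)
    {(1, 1): 2 * s6},                                 # Z5 = 2 sqrt(6) xy
    {(2, 0): s6, (0, 2): -s6},                        # Z6 = sqrt(6)(x^2 - y^2)
]
E = orthonormalize(zernike, 0.85)
E4, E6 = E[3], E[5]
print(E4)
print(E6)
```

For $c = 0.85$, the constant term that appears in `E6` shows that balancing astigmatism over the ellipse brings in defocus and piston, as stated above; for $c = 1$ the same routine returns the circle polynomials unchanged.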
8.5 ELLIPTICAL COEFFICIENTS OF AN ELLIPTICAL ABERRATION FUNCTION
An elliptical aberration function $W(x, y)$ across a unit ellipse can be expanded in terms of $J$ elliptical polynomials $E_j(x, y)$ in the form
$$W(x, y) = \sum_{j=1}^{J}a_jE_j(x, y) , \tag{8-30}$$
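Table 8-1. Orthonormal elliptical polynomials $E_j(\rho, \theta)$ in terms of the Zernike circle polynomials $Z_j(\rho, \theta)$.
$E_1 = Z_1$
$E_2 = Z_2$
$E_3 = Z_3/c$
$E_4 = \left(1/\sqrt{3 - 2c^2 + 3c^4}\right)\left[\sqrt{3}(1 - c^2)Z_1 + 2Z_4\right]$
$E_5 = Z_5/c$
$E_6 = \left[1/\left(2\sqrt{2}\,c^2\sqrt{3 - 2c^2 + 3c^4}\right)\right]\left[-\sqrt{3}(3 - 4c^2 + c^4)Z_1 - 3(1 - c^4)Z_4 + \sqrt{2}(3 - 2c^2 + 3c^4)Z_6\right]$
$E_7 = \left[1/\left(c\sqrt{5 - 6c^2 + 9c^4}\right)\right]\left[6(1 - c^2)Z_3 + 2\sqrt{2}\,Z_7\right]$
$E_8 = \left(2/\sqrt{9 - 6c^2 + 5c^4}\right)\left[(1 - c^2)Z_2 + \sqrt{2}\,Z_8\right]$
$E_9 = \left[1/\left(2\sqrt{2}\,c^3\sqrt{5 - 6c^2 + 9c^4}\right)\right]\left[-2\sqrt{2}(5 - 8c^2 + 3c^4)Z_3 - (5 - 2c^2 - 3c^4)Z_7 + (5 - 6c^2 + 9c^4)Z_9\right]$
$E_{10} = \left[1/\left(2\sqrt{2}\,c^2\sqrt{9 - 6c^2 + 5c^4}\right)\right]\left[-2\sqrt{2}(3 - 4c^2 + c^4)Z_2 - (3 + 2c^2 - 5c^4)Z_8 + (9 - 6c^2 + 5c^4)Z_{10}\right]$
$E_{11} = (1/\alpha)\left[\sqrt{5}(7 - 10c^2 + 3c^4)Z_1 + 4\sqrt{15}(1 - c^2)Z_4 - 2\sqrt{30}(1 - c^2)Z_6 + 8Z_{11}\right]$
$E_{12} = -\sqrt{5/8}\,c^{-2}(195 - 475c^2 + 558c^4 - 422c^6 + 159c^8 - 15c^{10})\beta^{-1}Z_1 - \sqrt{15/8}\,c^{-2}(105 - 205c^2 + 194c^4 - 114c^6 + 5c^8 + 15c^{10})\beta^{-1}Z_4 + (1/2)\sqrt{15}\,c^{-2}(75 - 155c^2 + 174c^4 - 134c^6 + 55c^8 - 15c^{10})\beta^{-1}Z_6 - 10\sqrt{2}\,c^{-2}(3 - 2c^2 + 2c^6 - 3c^8)\beta^{-1}Z_{11} + c^{-2}\alpha\gamma^{-1}Z_{12}$
$E_{13} = \left[1/\left(c\sqrt{5 - 6c^2 + 5c^4}\right)\right]\left[\sqrt{15}(1 - c^2)Z_5 + 2Z_{13}\right]$
$E_{14} = \left(\sqrt{5/2}/4\right)(1 - c^2)^2c^{-4}(35 - 10c^2 - c^4)\gamma^{-1}Z_1 + \left(5\sqrt{15/2}/8\right)(1 - c^2)^2c^{-4}(7 + 2c^2 - c^4)\gamma^{-1}Z_4 - \left(\sqrt{15}/8\right)c^{-4}(35 - 70c^2 + 56c^4 - 26c^6 + 5c^8)\gamma^{-1}Z_6 + \left(5/8\sqrt{2}\right)(1 - c^2)^2c^{-4}(7 + 10c^2 + 7c^4)\gamma^{-1}Z_{11} - (5/8)c^{-4}(7 - 6c^2 + 6c^6 - 7c^8)\gamma^{-1}Z_{12} + (\gamma/8c^4)Z_{14}$
$E_{15} = -\left(\sqrt{15}/4\right)c^{-3}(5 - 8c^2 + 3c^4)\delta^{-1}Z_5 - (5/4)(1 - c^4)c^{-3}\delta^{-1}Z_{13} + (\delta/2c^3)Z_{15}$
_______________________________________________________________________
$\alpha = (45 - 60c^2 + 94c^4 - 60c^6 + 45c^8)^{1/2}$
$\beta = (1575 - 4800c^2 + 12020c^4 - 17280c^6 + 21066c^8 - 17280c^{10} + 12020c^{12} - 4800c^{14} + 1575c^{16})^{1/2}$
$\gamma = (35 - 60c^2 + 114c^4 - 60c^6 + 35c^8)^{1/2}$
$\delta = (5 - 6c^2 + 5c^4)^{1/2}$
$\alpha\gamma = \beta$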
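Table 8-2. Orthonormal elliptical polynomials $E_j(\rho, \theta)$ in polar coordinates $(\rho, \theta)$.
$E_1 = 1$
$E_2 = 2\rho\cos\theta$
$E_3 = (2\rho\sin\theta)/c$
$E_4 = \left(\sqrt{3}/\sqrt{3 - 2c^2 + 3c^4}\right)(-1 - c^2 + 4\rho^2)$
$E_5 = \left(\sqrt{6}/c\right)\rho^2\sin 2\theta$
$E_6 = (1/2c^2)\left(\sqrt{6}/\sqrt{3 - 2c^2 + 3c^4}\right)\left[2c^2(1 - c^2) - 3(1 - c^4)\rho^2 + (3 - 2c^2 + 3c^4)\rho^2\cos 2\theta\right]$
$E_7 = \left[4/\left(c\sqrt{5 - 6c^2 + 9c^4}\right)\right]\left[-(1 + 3c^2)\rho + 6\rho^3\right]\sin\theta$
$E_8 = \left(4/\sqrt{9 - 6c^2 + 5c^4}\right)\left[-(3 + c^2)\rho + 6\rho^3\right]\cos\theta$
$E_9 = \left[1/\left(c^3\sqrt{5 - 6c^2 + 9c^4}\right)\right]\left\{3\left[4c^2(1 - c^2)\rho - (5 - 2c^2 - 3c^4)\rho^3\right]\sin\theta + (5 - 6c^2 + 9c^4)\rho^3\sin 3\theta\right\}$
$E_{10} = \left[1/\left(c^2\sqrt{9 - 6c^2 + 5c^4}\right)\right]\left\{3\left[4c^2(1 - c^2)\rho - (3 + 2c^2 - 5c^4)\rho^3\right]\cos\theta + (9 - 6c^2 + 5c^4)\rho^3\cos 3\theta\right\}$
$E_{11} = \left(\sqrt{5}/\alpha\right)\left[3 + 2c^2 + 3c^4 - 24(1 + c^2)\rho^2 + 48\rho^4 - 12(1 - c^2)\rho^2\cos 2\theta\right]$
$E_{12} = \left[\sqrt{10}\,\alpha/(\gamma c^2)\right](-3\rho^2 + 4\rho^4)\cos 2\theta + \left[\sqrt{5/2}/(2c^2\beta)\right]\left[-12c^2(5 - 2c^2 + 2c^6 - 5c^8) + 6(15 + 125c^2 - 194c^4 + 194c^6 - 125c^8 - 15c^{10})\rho^2 + 240(-3 + 2c^2 - 2c^6 + 3c^8)\rho^4 + 6(75 - 155c^2 + 174c^4 - 134c^6 + 55c^8 - 15c^{10})\rho^2\cos 2\theta\right]$
$E_{13} = \left(\sqrt{10}/c\delta\right)\left[-3(1 + c^2)\rho^2 + 8\rho^4\right]\sin 2\theta$
$E_{14} = \left[\sqrt{10}/(8c^4\gamma)\right]\left\{3(1 - c^2)^2\left[8c^4 - 40c^2(1 + c^2)\rho^2 + 5(7 + 10c^2 + 7c^4)\rho^4\right] + 4\left[6c^2(5 - 7c^2 + 7c^4 - 5c^6) - 5(7 - 6c^2 + 6c^6 - 7c^8)\rho^2\right]\rho^2\cos 2\theta + (35 - 60c^2 + 114c^4 - 60c^6 + 35c^8)\rho^4\cos 4\theta\right\}$
$E_{15} = \left(\sqrt{10}/c^3\right)\delta^{-1}\left\{\left[6c^2(1 - c^2) - 5(1 - c^4)\rho^2\right]\rho^2\sin 2\theta + \left[(5 - 6c^2 + 5c^4)/2\right]\rho^4\sin 4\theta\right\}$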
Table 8-3. Orthonormal elliptical polynomials $E_j(x, y)$ in Cartesian coordinates $(x, y)$, where $\rho^2 = x^2 + y^2$.
$E_1 = 1$
$E_2 = 2x$
$E_3 = 2y/c$
$E_4 = \left(\sqrt{3}/\sqrt{3 - 2c^2 + 3c^4}\right)(-1 - c^2 + 4\rho^2)$
$E_5 = \left(2\sqrt{6}/c\right)xy$
$E_6 = \left[\sqrt{6}/\left(c^2\sqrt{3 - 2c^2 + 3c^4}\right)\right]\left[c^2(1 - c^2) + c^2(3c^2 - 1)x^2 - (3 - c^2)y^2\right]$
$E_7 = \left[4/\left(c\sqrt{5 - 6c^2 + 9c^4}\right)\right]\left[-(1 + 3c^2) + 6\rho^2\right]y$
$E_8 = \left(4/\sqrt{9 - 6c^2 + 5c^4}\right)\left[-(3 + c^2) + 6\rho^2\right]x$
$E_9 = \left[4/\left(c^3\sqrt{5 - 6c^2 + 9c^4}\right)\right]\left[3c^2(3c^2 - 1)x^2 - (5 - 3c^2)y^2 + 3c^2(1 - c^2)\right]y$
$E_{10} = \left[4/\left(c^2\sqrt{9 - 6c^2 + 5c^4}\right)\right]\left[c^2(5c^2 - 3)x^2 - 3(3 - c^2)y^2 + 3c^2(1 - c^2)\right]x$
$E_{11} = \left(\sqrt{5}/\alpha\right)\left[48\rho^4 - 12(3 + c^2)x^2 - 12(1 + 3c^2)y^2 + 3 + 2c^2 + 3c^4\right]$
$E_{12} = \left[\sqrt{10}\,\alpha/(c^2\gamma)\right](x^2 - y^2)(4\rho^2 - 3) + \left[\sqrt{5}/\left(2\sqrt{2}\,c^2\beta\right)\right]\left[240(-3 + 2c^2 - 2c^6 + 3c^8)\rho^4 - 60(-9 + 3c^2 + 2c^4 - 6c^6 + 7c^8 + 3c^{10})x^2 - 24(15 - 70c^2 + 92c^4 - 82c^6 + 45c^8)y^2 + 12c^2(-5 + 2c^2 - 2c^6 + 5c^8)\right]$
$E_{13} = \left(2\sqrt{10}/c\delta\right)(8\rho^2 - 3 - 3c^2)xy$
$E_{14} = \left(\sqrt{10}/c^4\gamma\right)\left[c^4(3 - 30c^2 + 35c^4)x^4 + 6c^2(5 - 18c^2 + 5c^4)x^2y^2 + (35 - 30c^2 + 3c^4)y^4 - 6c^4(1 - 6c^2 + 5c^4)x^2 - 6c^2(5 - 6c^2 + c^4)y^2 + 3c^4(1 - c^2)^2\right]$
$E_{15} = \left(4\sqrt{10}/c^3\delta\right)\left[c^2(5c^2 - 3)x^2 - (5 - 3c^2)y^2 + 3c^2(1 - c^2)\right]xy$
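The closed-form entries of these tables can be evaluated for a given aspect ratio and compared with the numerical tables for $c = 0.85$. The sketch below is an illustration only. It assumes that the terms under the radicals defining $\alpha$, $\beta$, $\gamma$, and $\delta$ alternate in sign (an assumption that the identity $\alpha\gamma = \beta$ lets us check numerically), and it evaluates the polar form of $E_{11}$ with the signs that make it reduce to $Z_{11}$ for $c = 1$. The function names are hypothetical.

```python
import math

def alpha(c):
    u = c * c
    return math.sqrt(45 - 60*u + 94*u**2 - 60*u**3 + 45*u**4)

def gamma_c(c):
    u = c * c
    return math.sqrt(35 - 60*u + 114*u**2 - 60*u**3 + 35*u**4)

def delta(c):
    u = c * c
    return math.sqrt(5 - 6*u + 5*u**2)

def beta(c):
    u = c * c
    coeffs = [1575, -4800, 12020, -17280, 21066, -17280, 12020, -4800, 1575]
    return math.sqrt(sum(a * u**k for k, a in enumerate(coeffs)))

def e11(rho, theta, c):
    """Balanced primary spherical aberration E11 in polar coordinates."""
    u = c * c
    bracket = (3 + 2*u + 3*u**2 - 24*(1 + u)*rho**2 + 48*rho**4
               - 12*(1 - u)*rho**2 * math.cos(2 * theta))
    return math.sqrt(5) / alpha(c) * bracket

c = 0.85
print(alpha(c) * gamma_c(c), beta(c))       # the two numbers should agree
print(e11(0.0, 0.0, c))                     # constant term of E11 for c = 0.85
```

For $c = 1$ the routine gives $\alpha = \gamma = 8$ and $\delta = 2$, and `e11` reduces to $\sqrt{5}(6\rho^4 - 6\rho^2 + 1)$.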

Multiplying both sides of Eq. (8-30) by $E_j(x, y)$, integrating over the unit ellipse, and using the orthonormality Eq. (8-28), we obtain the elliptical expansion coefficients:
$$a_j = \frac{1}{\pi c}\int_{-1}^{1}dx\int_{-c\sqrt{1-x^2}}^{c\sqrt{1-x^2}}W(x, y)E_j(x, y)\,dy . \tag{8-31}$$
As stated in Section 3.2, it is evident from Eq. (8-31) that the value of an elliptical coefficient is independent of the number $J$ of polynomials used in the expansion of the aberration function. Hence, one or more terms can be added to or subtracted from the aberration function without affecting the value of the coefficients of the other polynomials in the expansion.
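The mean and the mean square values of the aberration function are given by
$$\langle W(\rho, \theta)\rangle = a_1 \tag{8-32}$$
and
$$\langle W^2(\rho, \theta)\rangle = \sum_{j=1}^{J}a_j^2 , \tag{8-33}$$
respectively. Accordingly, the variance of the aberration function is given by
$$\sigma_W^2 = \langle W^2(\rho, \theta)\rangle - \langle W(\rho, \theta)\rangle^2 = \sum_{j=2}^{J}a_j^2 . \tag{8-34}$$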
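As a numerical illustration of Eqs. (8-31) through (8-34), the sketch below expands a made-up quadratic aberration in the first six elliptical polynomials and compares the variance obtained from the coefficients with the variance computed directly. Everything specific here is an assumption for illustration: the aberration `W`, the helper `ellipse_mean`, and the use of the substitution $x = s\cos t$, $y = c\,s\sin t$ with a midpoint rule to evaluate the mean values over the unit ellipse. The polynomials are entered in their Cartesian forms of Table 8-3.

```python
import math

def ellipse_mean(f, c, n_rho=400, n_theta=64):
    """Mean of f(x, y) over the unit ellipse, using x = s cos t, y = c s sin t."""
    total = 0.0
    for i in range(n_rho):
        s = (i + 0.5) / n_rho
        for j in range(n_theta):
            t = 2.0 * math.pi * (j + 0.5) / n_theta
            total += f(s * math.cos(t), c * s * math.sin(t)) * s
    return 2.0 * total / (n_rho * n_theta)

def elliptical_polynomials(c):
    """E1 through E6 in Cartesian form for aspect ratio c."""
    q = math.sqrt(3 - 2*c**2 + 3*c**4)
    return [
        lambda x, y: 1.0,
        lambda x, y: 2.0 * x,
        lambda x, y: 2.0 * y / c,
        lambda x, y: math.sqrt(3) / q * (4*(x*x + y*y) - 1 - c*c),
        lambda x, y: 2.0 * math.sqrt(6) / c * x * y,
        lambda x, y: math.sqrt(6) / (c*c*q) * (c*c*(1 - c*c) + c*c*(3*c*c - 1)*x*x - (3 - c*c)*y*y),
    ]

c = 0.85
E = elliptical_polynomials(c)
W = lambda x, y: 0.3*x*x + 0.1*x*y - 0.2*y        # hypothetical aberration, in waves

# Expansion coefficients, Eq. (8-31)
a = [ellipse_mean(lambda x, y, Ej=Ej: W(x, y) * Ej(x, y), c) for Ej in E]

mean_W = ellipse_mean(W, c)                                       # Eq. (8-32): equals a[0]
var_direct = ellipse_mean(lambda x, y: W(x, y)**2, c) - mean_W**2
var_coeffs = sum(aj**2 for aj in a[1:])                           # Eq. (8-34)
print(a[0], mean_W)
print(var_direct, var_coeffs)
```

Because `W` is a quadratic, six polynomials represent it exactly, so the two variance estimates agree to within the quadrature error. Adding more polynomials to the expansion would not change the six coefficients already found.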

8.6 ISOMETRIC, INTERFEROMETRIC, AND IMAGING CHARACTERISTICS OF ELLIPTICAL POLYNOMIAL ABERRATIONS
The first 45 elliptical polynomials for an elliptical pupil with an aspect ratio of $c = 0.85$ are given in Tables 8-4 through 8-6. They are illustrated in three different but equivalent ways in Figure 8-5. For each polynomial, the isometric plot at the top illustrates its shape. An interferogram is shown on the left, and a corresponding PSF is shown on the right for a sigma value of one wave. The peak-to-valley aberration numbers (in units of wavelength) are given in Table 8-7.
The PSF plots, representing the images of a point object in the presence of a polynomial aberration and obtained by applying Eq. (8-5), are shown in Figure 8-5. The full width of a square displaying the PSFs is $24\lambda F_x$. Since the piston aberration $E_1$ has no effect on the PSF, it yields an aberration-free PSF.
Table 8-4. Elliptical polynomials in terms of Zernike polynomials for an elliptical pupil with an aspect ratio c = 0.85.
E1 Z1
E2 Z2
E3 1.1765Z3
E4 0.2721Z1 + 1.1321Z4
E5 1.1765Z5
E6 −0.3032Z1 − 0.3972Z4 + 1.2226Z6
E7 0.8458Z3 + 1.4369Z7
E8 0.2058Z2 + 1.0486Z8
E9 −0.5527Z3 − 0.4945Z7 + 1.3332Z9
E10 −0.3243Z2 − 0.3329Z8 + 1.3199Z10
E11 0.4721Z1 + 0.6768Z4 − 0.4785Z6 + 1.2594Z11
E12 −0.6786Z1 − 0.9419Z4 + 1.0489Z6 − 0.7451Z11 + 1.4250Z12
E13 0.6987Z5 + 1.3002Z13
E14 0.2576Z1 + 0.3242Z4 − 0.7837Z6 + 0.1889Z11 − 0.5861Z12 + 1.4774Z14
E15 −0.6848Z5 − 0.5376Z13 + 1.4734Z15
E16 0.3201Z2 + 0.3747Z8 0.3747Z10 + 1.1026Z16
E17 1.6951Z3 + 1.7799Z7 0.5933Z9 + 1.7457Z17
E18 0.6114Z2 0.6730Z8 + 1.1686Z10 0.5222Z16 + 1.4580Z18
E19 1.4290Z3 1.4348Z7 + 1.3271Z9 0.9078Z17 + 1.4985Z19
E20 0.3159Z2 + 0.3003Z8 1.1073Z10 + 0.1586Z16 0.7251Z18 + 1.6493Z20
E21 0.5487Z3 + 0.5004Z7 1.1469Z9 + 0.2441Z17 0.7400Z19 + 1.6506Z21
E22 0.8435Z1 + 1.2371Z4 0.9604Z6 + 1.1277Z11 0.7974Z12 + 1.3738Z22
E23 1.2479Z5 + 1.1962Z13 0.5981Z15 + 1.4572Z23
E24 1.5657Z1 2.2518Z4 + 2.4365Z6 1.95526Z11 + 2.0855Z12 0.7030Z14

1.1709Z22 + 1.7128Z24
E25 1.5089Z5 1.3450Z13 + 1.6563Z15 0.8395Z23 + 1.5980Z25
E26 0.8344Z1 + 1.1536Z4 2.0055Z6 + 0.9046Z11 1.7006Z12 + 1.7223Z14 + 0.4133Z22

0.9739Z24 + 1.6111Z26
E27 0.7754Z5 + 0.6060Z13 1.6348Z15 + 0.2747Z23 0.9271Z25 + 1.8541Z27

Table 8-4. Elliptical polynomials in terms of Zernike polynomials for an elliptical

pupil with an aspect ratio c = 0.85. (Cont.)
E28 0.2686Z1 0.3567Z4 + 0.8956Z6 0.2550Z11 + 0.6867Z12 1.6500Z14 0.0970Z22

+ 0.3021Z24 0.9317Z26 + 1.8545Z28
E29 3.5331Z3 + 3.8704Z7 1.5265Z9 + 3.0038Z17 1.0013Z19 + 2.0832Z29
E30 0.5126Z2 + 0.6090Z8 0.6874Z10 + 0.5538Z16 0.5538Z18 + 1.1521Z30
E31 3.7743Z3 4.0384Z7 + 3.1911Z9 2.9785Z17 + 2.4088Z19 0.8496Z21 1.4764Z29

+ 1.7676Z31
E32 1.2170Z2 1.3856Z8 + 2.4334Z10 1.1564Z16 + 1.9603Z18 0.8039Z20

0.7334Z30 + 1.6725Z32
E33 1.8697Z3 + 1.9215Z7 2.9091Z9 + 1.2980Z17 2.1909Z19 + 2.2306Z21 + 0.5180Z29

1.1466Z31 + 1.7471Z33
E34 0.8765Z2 + 0.9332Z8 2.6500Z10 + 0.6726Z16 2.0393Z18 + 2.2043Z20 + 0.2987Z30

1.1006Z32 + 1.7428Z34
E35 0.6216Z3 0.6104Z7 + 1.4855Z9 0.3771Z17 + 1.0045Z19 2.3123Z21

0.1273Z29 + 0.4026Z31 1.1573Z33 + 2.0935Z35
E36 0.3561Z2 0.3568Z8 + 1.4294Z10 0.2285Z16 + 0.9733Z18 2.3066Z20

0.0816Z30 + 0.3938Z32 1.1559Z34 + 2.0934Z36
E37 1.5647Z1 + 2.3399Z4 1.9757Z6 + 2.2294Z11 1.7784Z12 + 0.2020Z14 + 1.6239Z22

1.1483Z24 + 1.4746Z37
E38 3.6706Z1 5.4231Z4 + 5.8485Z6 5.0215Z11 + 5.3399Z12 1.9891Z14 3.4518Z22

+ 3.5769Z24 1.1361Z26 1.6754Z37 + 2.0633Z38
E39 2.3101Z5 + 2.2906Z13 1.3805Z15 + 1.7827Z23 0.8913Z25 + 1.6187Z39
E40 2.5295Z1 + 3.6636Z4 5.4516Z6 + 3.2375Z11 4.9525Z12 + 4.2289Z14 + 2.0206Z22

3.2919Z24 + 2.8973Z26 1.0342Z28 + 0.7704Z37 1.5053Z38 + 1.8782Z40
E41 3.4596Z5 3.2813Z13 + 3.8178Z15 2.3452Z23 + 2.6958Z25 1.0155Z27

1.2074Z39 + 1.8441Z41
E42 1.0497Z1 1.4880Z4 + 2.9632Z6 1.2532Z11 + 2.5925Z12 4.2448Z14 0.7161Z22

+ 1.5920Z24 2.8929Z26 + 2.8550Z28 0.2315Z37 + 0.5922Z38 1.3793Z40 + 1.9027Z42
E43 2.3202Z5 + 2.0734Z13 4.1210Z15 + 1.3161Z23 2.8314Z25 + 2.8448Z27 +
0.5132Z39 1.3637Z41 + 1.9013Z43
E44 0.3097Z1 + 0.4294Z4 1.1006Z6 + 0.3448Z11 0.9148Z12 + 2.3943Z14 + 0.1816Z22

0.5119Z24 + 1.4601Z26 3.1647Z28 + 0.0514Z37 0.1605Z38 + 0.5359Z40
1.4192Z42 + 2.3730Z44
E45 0.9499Z5 0.79791Z13 + 2.3698Z15 0.4537Z23 + 1.4484Z25 3.1626Z27

0.1454Z39 + 0.5331Z41 1.4187Z43 + 2.3730Z45
Table 8-5. Elliptical polynomials in polar coordinates for an elliptical pupil with an
aspect ratio c = 0.85.
E1 1
E2 2ρ cosθ
E3 2.3529ρ sinθ
E4 −1.6888 + 3.9217ρ²
E5 2.8818ρ² sin2θ
E6 0.3848 − 1.3760ρ² + 2.9947ρ² cos2θ
E7 (−6.4365ρ + 12.1923ρ³) sinθ
E8 (−5.5205ρ + 8.8980ρ³) cosθ
E9 (1.6917ρ − 4.1956ρ³) sinθ + 3.7710ρ³ sin3θ
E10 (1.2346ρ − 2.8248ρ³) cosθ + 3.733ρ³ cos3θ
E11 2.1159 − 14.5521ρ² + 16.8965ρ⁴ − 1.1722ρ² cos2θ
E12 −0.7133 + 6.7333ρ² − 9.9960ρ⁴ + (−10.9496ρ² + 18.0251ρ⁴) cos2θ
E13 (−10.6232ρ² + 16.4461ρ⁴) sin2θ
E14 0.1184 − 1.4114ρ² + 2.5347ρ⁴ + (3.6409ρ² − 7.4142ρ⁴) cos2θ + 4.6720ρ⁴ cos4θ
E15 (3.4228ρ² − 6.8003ρ⁴) sin2θ + 4.6593ρ⁴ sin4θ
E16 (9.9790U 42.6545U 3 + 38.1952U5)cosș 1.0599U3cos3ș
E17 (11.4631U 57.4626U3 + 60.4711U 5)sinș 1.6781U 3sin3ș
E18 ( 2.8430U + 15.9987U 3 18.0913 U5)cosș + ( 16.8978U3 + 25.2538 U5)cos3ș
E19 ( 4.1751U + 25.5604 U3 31.4461U 5)sinș + ( 17.0096U 3 + 25.9539U 5)sin3ș
E20 (0.5810 U 4.0436 U3 + 5.4933 U5)cosș + (6.9151U 3 12.5589 U5)cos3ș + 5.7134 U5cos5ș
E21 (0.8035U 5.9010U3 + 8.4557U5)sinș + (7.0098U3 12.8173 U5)sin3ș + 5.7177U 5sin5ș
E22 2.41236 + 32.7723U2 93.9111U 4 + 72.6936 U6 + (5.21205U2 10.0862U 4)cos2ș
E23 (24.4229U2 93.9159 U4 + 81.7846 U6)sin2ș 1.89127U 4sin4ș
E24 1.0603 18.7425U 2 + 66.7039U4 61.9577U6 + (24.6341U 2 101.7900U4 +

6.1279U6)cos2ș 2.2230 U4cos4ș
E25 ( 9.7837U2 + 45.8111U 4 47.1181U6)sin2ș + ( 24.658 U4 + 35.8748U6)sin4ș
E26 (2.3177U2 12.8931U 4 + 15.4190 U6)sin2ș + (12.1747U 4 20.8131U6)sin4ș + 6.9375U6sin6ș
E27 0.0357 0.8943U2 + 4.2779U4 5.1324 U6 + (2.4613U2 13.9209U4 +

16.9552U6)cos2 + (12.2137U4 20.9176U6)cos4ș + 6.9389 U6cos6ș
Table 8-5. Elliptical polynomials in polar coordinates for an elliptical pupil with an
aspect ratio c = 0.85. (Cont.)
E28 ( 16.9428 U + 157.9560 U3 395.9030 U5 + 291.6410 U7)sin ș + (9.5563U3 17.3422 U5)sin3ș
E29 ( 15.0992 U + 120.4040 U3 257.3300 U5 + 161.3000U7)cosș + (5.7290 U3 9.5919 U5)cos3ș
E30 (7.9651U 87.6220U3 + 251.1590 U5 206.6960 U7)sinș + (46.3528 U3 170.3910U5

+ 148.4790 U7)sin3ș 2.9431U 5sin5ș
E31 (5.1212U 51.6989U3 + 135.9670U5 102.6820U 7)cosș + (46.6210U3 166.7500U5

+ 140.4930U7)cos3 ș 2.7848 U5cos5ș
E32 ( 1.9287 U + 24.5047 U3 79.3495 U5 + 72.5158U7)sinș + ( 23.7341 U3 +

99.6445 U5 96.3144U7)sin3ș + ( 34.2032U5 + 48.9185 U7)sin5ș
E33 ( 1.3158 U + 15.8070 U3 48.3961U 5 + 41.8223U7)cosș + ( 23.2637 U3 +

96.7544 U5 92.4529 U7)cos3ș + ( 34.1909U5 + 48.7981U 7)cos5ș
E34 (0.3278U 4.7830 U3 + 17.4967U5 17.8268U7)sinș + (6.3868U3 30.9130 U5 +

33.8176 U7)sin3ș + (19.7658U 5 32.4050U 7)sin5ș + 8.3740U7sin7ș
E35 (0.2368 U 3.3195U 3 + 11.6663U 5 11.4227U 7)cosș + (6.3087U3 30.3995U5 +

33.0808 U7)cos3ș + (19.7502U 5 32.3638U7)cos5ș + 8.3736U7cos7ș
E36 2.6240 58.7201U2 + 299.1450 U4 533.3820U6 + 309.6560U 8 + ( 13.7469U2 +

63.4344 U4 64.4471U 6)cos2ș + 0.6387U 4cos4ș
E37 1.3994 + 39.5156U 2 245.7470U4 + 521.0140U 6 351.8340U 8 + ( 43.5675U 2 +

325.0900 U4 718.3760U 6 + 490.2030 U8)cos2ș + (14.9649U 4 5.5058U6)cos4ș
E38 ( 44.7266U 2 + 307.6270U4 621.0460U6 + 384.5860 U8)sin2ș + (12.3100U 4 20.0104 U6)sin4ș
E39 0.3882 12.8141U2 + 91.0564U 4 216.6400U6 + 161.7810U8 + (23.5910U 2

199.4850 U4 + 485.8140U6 357.6390U 8)cos2ș + (78.6963U4 269.6320U 6 +
223.11800U8)cos4ș 3.8697U 6cos6ș
E40 (21.2315U 2 173.3660 U4 + 406.2550U6 286.8680U8)sin2ș + (78.9987U4

268.0850 U6 + 219.0700 U8)sin4ș 3.7995U6sin6ș
E41 0.0744 + 2.8113U2 22.4722 U4 + 59.3275U 6 48.6104U8 + ( 6.7226U2 +

64.4168 U4 174.4750U6 + 140.7070U8)cos2ș + ( 47.0816U4 + 180.8360U 6
163.8540U8)cos4ș + ( 45.8256 U6 + 64.5805U8)cos6ș
E42 ( 6.2129U 2 + 58.3775 U4 154.7540U6 + 121.9310U8)sin2ș + ( 46.8482U 4 +

179.4400U 6 162.0030U 8)sin4ș + ( 45.8214 U6 + 64.5323U8)sin6ș
E43 0.0106 0.4565U 2 + 4.0923U4 11.9863U6 + 10.7991U 8 + (1.3004U 2 14.1264U4

+ 42.7822U 6 38.1414U8)cos2ș + (14.3621U 4 62.7180U 6 + 63.6642U8)cos4 ș
+ (30.3057U 6 48.1679U8)cos6ș + 10.0677 U8cos8ș
E44 (1.2269U2 13.1623 U4 + 39.3265U6 34.5557U 8)sin2ș + (14.3234U4 62.4781 U6

+ 63.3298 U8)sin4ș + (30.2998 U6 48.1523U8)sin6ș + 10.0676U8sin8ș
E45 (1.2269U2 13.1623U4 + 39.3265U6 34.5557U 8)sin2ș + (14.3234U4 62.4781U 6 +

63.3298 U8)sin4ș + (30.2998 U6 48.1523U 8)sin6ș + 10.0676 U8sin8ș
Table 8-6. Elliptical polynomials in Cartesian coordinates for an elliptical pupil with
an aspect ratio c = 0.85.
E1 1
E2 2x
E3 2.3529y
E4 −1.6888 + 3.9217x² + 3.9217y²
E5 5.7635xy
E6 0.3848 + 1.6188x² − 4.3707y²
E7 −6.4365y + 12.1923x²y + 12.1923y³
E8 −5.5205x + 8.8980x³ + 8.8980xy²
E9 1.6917y + 7.1173x²y − 7.9665y³
E10 1.2346x + 0.9083x³ − 14.0244xy²
E11 2.1159 − 15.7243x² + 16.8965x⁴ − 13.3799y² + 33.7930x²y² + 16.8965y⁴
E12 −0.7133 − 4.2163x² + 8.0291x⁴ + 17.6829y² − 19.9921x²y² − 28.0211y⁴
E13 −21.2463xy + 32.8922x³y + 32.8922xy³
E14 0.1184 + 2.229x² − 0.2075x⁴ − 5.0523y² − 22.9629x²y² + 14.6209y⁴
E15 6.8457xy + 5.0366x³y − 32.2378xy³
E16 9.9790x 43.7144x3 + 38.1952x5 39.4747xy2 + 76.3904x3y2 + 38.1952xy4
E17 11.4631y 62.4968x2y + 60.4711x4y 55.7845y3 + 120.9420x2y3 + 60.4711y5
E18 2.8430x 0.8991x3 + 7.1625x5 + 66.6923xy2 86.6903x3y2 93.8528xy4
E19 4.1751y 25.4686x2y + 46.4157x4y + 42.5700y3 10.9843x2y3 57.4000y5
E20 0.5810x + 2.8716x3 1.3522x5 24.7890xy2 21.0295x3y2 + 71.7367xy4
E21 0.8035y + 15.1286x2y 1.4078x4y 12.9108y3 65.9003x2y3 + 26.9908y5
E22 2.4124 + 37.9843x2 103.9970x4 + 72.6936x6 + 27.5602y2 187.8220x2y2 +

218.0810x4y2 83.8249y4 + 218.0810x2y4 + 72.6936y6
E23 48.8459xy 195.3970x3y + 163.5690x5y 180.2670xy3 + 327.1380x3y3 +

163.5690xy5
E24 1.0603 + 5.89157x2 37.3093x4 + 34.1702x6 43.3766y2 + 146.7460x2y2

89.7452x4y2 + 166.2710y4 282.0010x2y4 158.0860y6
E25 19.5673xy 7.0099x3y + 49.2630x5y + 190.2540xy3 188.4720x3y3 237.7350xy5
E26 0.2346 5.6603x2 + 6.0036x4 + 3.3804x6 + 15.6266y2 + 106.8290x2y2

169.8950x4y2 96.7384y4 60.5711x2y4 + 112.7040y6
E27 4.6354xy + 22.9127x3y 10.7896x5y 74.4852xy3 77.0736x3y3 + 155.7150xy5
E28 0.0357 + 1.5670x2 + 2.5707x4 2.1559x6 3.3556y2 64.7263x2y2 + 2.0618x4y2 +

30.4124y4 + 176.3200x2y4 49.9441y6
Table 8-6. Elliptical polynomials in Cartesian coordinates for an elliptical pupil with
an aspect ratio c = 0.85. (Cont.)
E29 16.9428y + 186.6250x2y 447.9300x4y + 291.6410x6y + 148.400y3

826.4900x2y3 + 874.9230x4y3 378.5610y5 + 874.9230x2y5 + 291.6410y7
E30 15.0992x + 126.1330x3 266.9220x5 + 161.300x7 + 103.2180xy2 495.4780x3y2

+ 483.8990x5y2 228.5560xy4 + 483.8990x3y4 + 161.300xy6
E31 7.9651y + 51.4363x2y 274.7310x4y + 238.7420x6y 133.9750y3 + 190.9650x2y3

+ 122.3080x4y3 + 418.6070y5 471.6090x2y5 355.1750y7
E32 5.1212x 5.0779x3 33.5676x5 + 37.8102x7 191.5620xy2 + 633.2820x3y2

448.5390x5y2 + 622.2940xy4 1010.5100x3y4 524.1600xy6
E33 1.9287y 46.6975x2y + 48.5678x4y + 28.1652x6y + 48.2387y3 + 382.6220x2y3

508.6170x4y3 213.1970y5 319.0340x2y5 + 217.7490y7
E34 1.3158x 7.4567x3 + 14.1674x5 1.8325x7 + 85.5979xy2 + 51.6084x3y2

221.2630x5y2 509.6140xy4 + 343.7410x3y4 + 563.1720xy6
E35 0.3278y + 14.3775x2y + 23.5868x4y 19.7803x6y 11.1698y3 224.4910x2y3

15.4587x4y3 + 68.1755y5 + 447.8370x2y5 92.4234y7
E36 0.2368x + 2.9892x3 + 1.0170x5 2.3322x7 22.2454xy2 113.3710x3y2 +

48.0804x5y2 + 201.6160xy4 + 255.2210x3y4 331.0990xy6
E37 2.6240 72.4669x2 + 363.2180x4 597.8290x6 + 309.6560x8 44.9732y2 +

594.4580x2y2 1664.5900x4y2 + 1238.6200x6y2 + 236.3490y4 1535.7000x2y4 +
1857.9300x4y4 468.9350y6 + 1238.6200x2y6 + 309.6560y8
E38 1.3994 4.0520x2 + 94.3076x4 222.8680x6 + 138.3690x8 + 83.0831y2

581.2840x2y2 + 972.1960x4y2 426.9300x6y2 555.8720y4 + 2408.9500x2y4
2111x4y4 + 1213.8800y6 2387.7400x2y6 842.0370y8
E39 89.4533xy + 664.4940x3y 1322.1300x5y + 769.1720x7y + 566.0140xy3

2484.1900x3y3 + 2307.5200x5y3 1162.0500xy5 + 2307.5200x3y5 + 769.1720xy7
E40 0.3882 + 10.7769x2 29.7327x4 4.3268x6 + 27.2600x8 36.4051y2

290.0650x2y2 + 1242.1000x4y2 960.6230x6y2 + 369.2380y4 + 154.3790x2y4
1260.4900x4y4 968.2160y6 + 469.9310x2y6 + 742.5370y8
E41 42.4629xy 30.7369x3y 282.6290x5y + 302.5440x7y 662.7270xy3 +

1701.0100x3y3 844.9290x5y3 + 1862.0500xy5 2597.4900x3y5 1450.0200xy7
E42 0.0744 3.9113x2 5.1370x4 + 19.8626x6 7.1767x8 + 9.5338y2 + 237.5450x2y2

213.2880x4y2 161.7390x6y2 133.9710y4 1239.1100x2y4 + 1346.8700x4y4 +
460.4640y6 + 1083.6900x2y6 417.7510y8
E43 12.4259xy 70.6377x3y + 133.3220x5y 16.9553x7y + 304.1480xy3 +

297.4110x3y3 819.8750x5y3 1302.1900xy5 + 476.1470x3y5 + 1279.0700xy7
E44 0.0106 + 0.8439x2 + 4.3280x4 1.6163x6 1.7783x8 1.7569y2 77.9881x2y2

134.1720x4y2 + 104.7100x6y2 + 32.5808y4 + 689.4340x2y4 + 132.8950x4y4
147.7920y6 1091.4200x2y6 + 170.84y8
E45 2.4538xy + 30.9691x3y + 10.5393x5y 24.1655x7y 83.6182xy3 448.6900x3y3 +

156.3320x5y3 + 510.3640xy5 + 777.2630x3y5 691.8850xy7
The polynomial aberrations $E_2$ and $E_3$, representing the $x$- and $y$-tilts with aberration coefficients $a_2$ and $a_3$, displace the aberration-free PSF along the $x$ and $y$ axes, respectively. The coefficient $a_2$ corresponds to a tilt angle of $2a_2/a$ about the $y$ axis, and yields a displacement of the PSF along the $x$ axis by $4a_2F_x$, where $F_x = R/2a$ is the focal ratio of the image-forming beam along the $x$ axis. Similarly, the coefficient $a_3$ corresponds to a tilt angle of $2a_3/b$ about the $x$ axis, and yields a displacement of the PSF along the $y$ axis by $4a_3F_y$, where $F_y = R/2b$ is the focal ratio of the image-forming beam along the $y$ axis.
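The tilt relations above amount to simple arithmetic, sketched below. The function names and the sample values of $R$, $a$, and the coefficients are made up for illustration; a coefficient expressed in wavelengths gives a displacement in wavelengths.

```python
def focal_ratios(R, a, c):
    """Focal ratios Fx = R/2a and Fy = R/2b, with b = c*a."""
    return R / (2.0 * a), R / (2.0 * c * a)

def tilt_displacement(coefficient, focal_ratio):
    """PSF displacement 4*a_j*F produced by a tilt coefficient a_2 (or a_3)."""
    return 4.0 * coefficient * focal_ratio

# Hypothetical example: R = 100, a = 5 (same length units), c = 0.85,
# and tilt coefficients of half a wave each.
Fx, Fy = focal_ratios(100.0, 5.0, 0.85)
shift_x = tilt_displacement(0.5, Fx)   # along x, in wavelengths
shift_y = tilt_displacement(0.5, Fy)   # along y, in wavelengths
print(Fx, Fy, shift_x, shift_y)
```

For equal coefficients the displacement along the $y$ axis is larger by the factor $1/c$, because the beam is slower along the smaller dimension of the pupil.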
The defocus aberration represented by the polynomial $E_4$ is radially symmetric and yields a radially symmetric interferogram bounded, of course, by an ellipse. However, the PSF is biaxially and not radially symmetric because of the larger diffraction spread along the smaller dimension of the pupil. The interferograms and PSFs for the polynomial aberrations $E_5$ and $E_6$, representing balanced astigmatisms, are biaxially symmetric but distinctly different from each other for the two aberrations. The polynomial aberrations $E_7$ and $E_8$, representing balanced comas, produce biaxially symmetric interferograms, but the PSFs are symmetric about the $y$ and $x$ axes, respectively. The polynomial aberrations $E_{11}$, $E_{22}$, and $E_{37}$, representing balanced primary, secondary, and tertiary spherical aberrations, respectively, are not radially symmetric because of the different diffraction spreads along the $x$ and the $y$ axes, and because of the presence of the $\cos 2\theta$ term in $E_{11}$ and $E_{22}$, and the $\cos 2\theta$ and $\cos 4\theta$ terms in $E_{37}$.
From Eq. (8-5), the Strehl ratio, i.e., the central value of a PSF relative to its aberration-free value, can be written:
$$S(c) \equiv I(0, 0; c) = \left|\frac{1}{\pi c}\int_{-1}^{1}dx\int_{-c\sqrt{1-x^2}}^{c\sqrt{1-x^2}}\exp[i\Phi(x, y)]\,dy\right|^2 , \tag{8-35}$$
where $(x, y)$ are the pupil coordinates normalized by the pupil dimension $a$ along the $x_p$ axis, as used in the polynomials given in Table 8-3.
The Strehl ratio for elliptical polynomial aberrations with a sigma value of 0.1 wave
is listed in Table 8-8 and plotted in Figure 8-6. Because of the small value of the
aberration, the Strehl ratio is approximately the same for each polynomial. Both the table
and the figure illustrate that the Strehl ratio for a small aberration is nearly independent
of the type of aberration. It is approximately given by $\exp(-\sigma_\Phi^2)$, or 0.67, where
$\sigma_\Phi = 0.2\pi$.
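The approximation above can be checked numerically for one case. The following is a minimal sketch, not the procedure used to generate Table 8-8: it evaluates the pupil average of Eq. (8-35) on a uniform grid for a pure defocus aberration $A_d\rho^2$ whose sigma is set to 0.1 wave through Eqs. (8-37) and (8-38). The function name, the grid size, and the use of a plain grid average are assumptions made for illustration.

```python
import numpy as np

def strehl_defocus_ellipse(c, sigma_waves, n=1201):
    """Strehl ratio of defocus A_d*rho^2 over the ellipse x^2 + (y/c)^2 <= 1."""
    eta = np.sqrt(3 - 2 * c**2 + 3 * c**4)            # Eq. (8-38)
    a_d = sigma_waves * 4 * np.sqrt(3) / eta           # Eq. (8-37) inverted, in waves
    x = np.linspace(-1.0, 1.0, n)
    y = np.linspace(-c, c, n)
    xx, yy = np.meshgrid(x, y)
    inside = xx**2 + (yy / c) ** 2 <= 1.0              # points inside the pupil
    phase = 2 * np.pi * a_d * (xx**2 + yy**2)[inside]  # phase aberration in radians
    # Grid average of exp(i*Phi) approximates the normalized integral of Eq. (8-35).
    return abs(np.mean(np.exp(1j * phase))) ** 2

s_numeric = strehl_defocus_ellipse(c=0.85, sigma_waves=0.1)
s_approx = np.exp(-(0.2 * np.pi) ** 2)                 # exp(-sigma_Phi^2)
print(round(s_numeric, 3), round(s_approx, 3))
```

A piston term does not change the modulus of the pupil average, so the result can be compared with the E4 entry of Table 8-8.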
Figure 8-5. Elliptical polynomials E1 through E45 for an elliptical pupil with an aspect
ratio c = 0.85, shown as an isometric plot on the top, an interferogram on the left, and a
PSF on the right for a sigma value of one wave.
Table 8-7. Peak-to-valley (P-V) numbers (in units of wavelength) of orthonormal
elliptical polynomial aberrations with an aspect ratio c = 0.85 and a sigma value of
one wave.
Poly. P-V # Poly. P-V # Poly. P-V #
E1 0 E16 8.920 E31 7.805
E2 4 E17 6.068 E32 8.415
E3 4 E18 7.554 E33 7.667
E4 3.922 E19 6.379 E34 8.768
E5 4.899 E20 8.700 E35 10.673
E6 4.777 E21 8.239 E36 11.196
E7 4.256 E22 6.681 E37 7.395
E8 6.7755 E23 8.444 E38 7.795
E9 5.839 E24 6.920 E39 9.824
E10 6.149 E25 8.181 E40 8.506
E11 4.831 E26 7.051 E41 8.692
E12 5.816 E27 9.958 E42 8.233
E13 6.942 E28 9.459 E43 9.313
E14 7.024 E29 7.351 E44 8.606
E15 7.428 E30 10.824 E45 12.414

Table 8-8. Strehl ratio S for elliptical polynomial aberrations with an aspect ratio
c = 0.85 and a sigma value of 0.1 wave.
Poly. S Poly. S Poly. S
E1 1 E16 0.680 E31 0.675
E2 0.665 E17 0.669 E32 0.677
E3 0.665 E18 0.678 E33 0.684
E4 0.664 E19 0.675 E34 0.685
E5 0.671 E20 0.692 E35 0.703
E6 0.672 E21 0.692 E36 0.703
E7 0.667 E22 0.675 E37 0.680
E8 0.674 E23 0.677 E38 0.673
E9 0.679 E24 0.672 E39 0.679
E10 0.679 E25 0.6811 E40 0.678
E11 0.671 E26 0.680 E41 0.678
E12 0.671 E27 0.698 E42 0.688
E13 0.675 E28 0.698 E43 0.689
E14 0.686 E29 0.671 E44 0.708
E15 0.685 E30 0.684 E45 0.708

Figure 8-6. Strehl ratio for an elliptical polynomial aberration with an aspect ratio c
= 0.85 and a sigma value of 0.1 wave.
8.7 SEIDEL ABERRATIONS AND THEIR STANDARD DEVIATIONS

We now consider balancing of a Seidel aberration and obtain its standard deviation
with and without balancing.
8.7.1 Defocus
We start with the defocus aberration
$$W_d(\rho) = A_d \rho^2 . \tag{8-36}$$
From the form of the orthonormal defocus polynomial E4 given in Table 8-2, it is
evident that its sigma value across an elliptical pupil is given by
$$\sigma_d = \frac{A_d\, \eta}{4\sqrt{3}} , \tag{8-37}$$
where
$$\eta = (3 - 2c^2 + 3c^4)^{1/2} . \tag{8-38}$$
8.7.2 Astigmatism
Next, we consider Seidel astigmatism:
$$W_a(\rho, \theta) = A_a \rho^2 \cos^2\theta . \tag{8-39}$$
The orthonormal polynomial E6 can be written
$$E_6 = \frac{\sqrt{6}}{2c^2\eta}\left[\eta^2 \rho^2 \cos 2\theta - 3(1 - c^4)\rho^2\right] + \text{constant} \tag{8-40a}$$
$$\phantom{E_6} = \frac{\eta\sqrt{6}}{c^2}\left(\rho^2\cos^2\theta - \frac{3 - c^2}{\eta^2}\,\rho^2\right) + \text{constant} . \tag{8-40b}$$
It shows that Seidel astigmatism $\rho^2\cos^2\theta$ is balanced with defocus aberration
$-[(3 - c^2)/\eta^2]\rho^2$, or that balanced astigmatism is given by
$$W_{ba}(\rho, \theta) = A_a\left(\rho^2\cos^2\theta - \frac{3 - c^2}{\eta^2}\,\rho^2\right) . \tag{8-41}$$
Its sigma value is given by
$$\sigma_{ba} = \frac{c^2}{\eta\sqrt{6}}\, A_a . \tag{8-42}$$
To determine the sigma of Seidel astigmatism, we write the aberration in terms of the
elliptical polynomials. Thus,
$$W_a(\rho, \theta) = A_a\rho^2\cos^2\theta = A_a\left(\frac{c^2}{\sqrt{6}\,\eta}\,E_6 + \frac{3 - c^2}{4\sqrt{3}\,\eta}\,E_4\right) + \text{constant} . \tag{8-43}$$
Utilizing Eq. (8-34), we find the sigma to be
$$\sigma_a = A_a/4 . \tag{8-44}$$
Its value is independent of the aspect ratio c of the elliptical pupil, and thus equal to that
for a circular pupil. Since Seidel astigmatism $x^2$ varies only along the x axis for which
the unit ellipse has the same length as a unit circle, the sigma is independent of c.
8.7.3 Coma
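Now we consider Seidel coma:
$$W_c(\rho, \theta) = A_c \rho^3 \cos\theta . \tag{8-45}$$
The orthonormal polynomial E8 can be written
$$E_8 = \frac{4}{(9 - 6c^2 + 5c^4)^{1/2}}\left[6\rho^3\cos\theta - (3 + c^2)\rho\cos\theta\right] . \tag{8-46}$$
The relative amount of tilt $\rho\cos\theta$ that balances Seidel coma $\rho^3\cos\theta$ is
$-(3 + c^2)/6$, compared to $-2/3$ for a circular pupil. The balanced coma is
given by
$$W_{bc}(\rho, \theta) = A_c\left(\rho^3\cos\theta - \frac{3 + c^2}{6}\,\rho\cos\theta\right) . \tag{8-47}$$
Its sigma value is given by
$$\sigma_{bc} = \frac{(9 - 6c^2 + 5c^4)^{1/2}}{24}\, A_c . \tag{8-48}$$
In terms of the elliptical polynomials, Seidel coma can be written
$$W_c(\rho, \theta) = A_c\left[\frac{(9 - 6c^2 + 5c^4)^{1/2}}{24}\,E_8 + \frac{3 + c^2}{12}\,E_2\right] . \tag{8-49}$$
Utilizing Eq. (8-34), we obtain the sigma value:
$$\sigma_c = \frac{1}{8}\,(5 + 2c^2 + c^4)^{1/2} A_c . \tag{8-50}$$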

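8.7.4 Spherical Aberration
Finally, we consider Seidel spherical aberration
$$W_s(\rho) = A_s\rho^4 . \tag{8-51}$$
The orthonormal polynomial E11 can be written
$$E_{11} = (\sqrt{5}/\alpha)\left[48\rho^4 - 12(1 - c^2)\rho^2\cos 2\theta - 24(1 + c^2)\rho^2\right] + \text{constant} \tag{8-52a}$$
$$\phantom{E_{11}} = (\sqrt{5}/\alpha)\left[48\rho^4 - 24(1 - c^2)\rho^2\cos^2\theta - 12(1 + 3c^2)\rho^2\right] + \text{constant} , \tag{8-52b}$$
where, consistent with Table 8-9, $\alpha = (45 - 60c^2 + 94c^4 - 60c^6 + 45c^8)^{1/2}$.
The balanced spherical aberration is given by
$$W_{bs}(\rho, \theta) = A_s\left[\rho^4 - \tfrac{1}{4}(1 - c^2)\rho^2\cos 2\theta - \tfrac{1}{2}(1 + c^2)\rho^2\right] \tag{8-53a}$$
$$\phantom{W_{bs}(\rho, \theta)} = A_s\left[\rho^4 - \tfrac{1}{2}(1 - c^2)\rho^2\cos^2\theta - \tfrac{1}{4}(1 + 3c^2)\rho^2\right] . \tag{8-53b}$$
It shows that spherical aberration is balanced not only by defocus but by astigmatism as
well. Its sigma value is given by
$$\sigma_{bs} = \frac{\alpha}{48\sqrt{5}}\, A_s . \tag{8-54}$$
In terms of the elliptical polynomials, Seidel spherical aberration can be written
$$W_s(\rho) = A_s\left\{\frac{\alpha}{48\sqrt{5}}\,E_{11} + \frac{c^2(1 - c^2)}{2\eta\sqrt{6}}\,E_6 + \frac{1}{8\sqrt{3}}\left[\frac{3(1 - c^2)(1 - c^4)}{2\eta} + \eta(1 + c^2)\right]E_4\right\} + \text{constant} . \tag{8-55}$$
Utilizing Eq. (8-34), we obtain the sigma value:
$$\sigma_s = \frac{(225 + 60c^2 - 58c^4 + 60c^6 + 225c^8)^{1/2}}{24\sqrt{10}}\, A_s . \tag{8-56}$$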
The sigma values of Seidel aberrations with and without balancing are given in Table
8-9. They reduce to the corresponding values for a circular pupil given in Table 4-3 as
$c \to 1$. The variation of sigma for a primary aberration with the aspect ratio c is shown in
Figure 8-7. While $\sigma_a$ for astigmatism is constant, sigma increases monotonically in the
case of coma ($\sigma_c$) and spherical aberration ($\sigma_s$). For defocus, its value $\sigma_d$ has a
minimum for $c = 1/\sqrt{3}$. The variation of sigma of a balanced primary aberration as a
function of c is shown in Figure 8-8. While its variation for balanced coma $\sigma_{bc}$ and
balanced spherical aberration $\sigma_{bs}$ is small, sigma of balanced astigmatism $\sigma_{ba}$
increases monotonically.
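The closed-form sigma values quoted from Table 8-9 can be cross-checked by direct integration over the unit ellipse. The sketch below is an illustration under stated assumptions rather than the author's derivation: it maps the ellipse to a unit disc, uses Gauss-Legendre quadrature in the radial variable and a uniform angular sum (both exact for the low-order polynomials involved), and takes unit aberration coefficients. The helper names `ellipse_mean` and `sigma` are hypothetical.

```python
import numpy as np

def ellipse_mean(f, c, nr=20, nt=64):
    """Mean of f(x, y) over the ellipse x^2 + (y/c)^2 <= 1."""
    u, w = np.polynomial.legendre.leggauss(nr)
    r = 0.5 * (u + 1.0)                 # map nodes from [-1, 1] to [0, 1]
    wr = 0.5 * w * r                    # radial weights, including the factor r
    t = 2.0 * np.pi * np.arange(nt) / nt
    rr, tt = np.meshgrid(r, t, indexing="ij")
    vals = f(rr * np.cos(tt), c * rr * np.sin(tt))
    # (sum over r, t) * (2*pi/nt) / pi; the Jacobian c cancels against the area pi*c.
    return 2.0 * (wr[:, None] * vals).sum() / nt

def sigma(f, c):
    """Standard deviation of f over the unit ellipse."""
    m1 = ellipse_mean(f, c)
    m2 = ellipse_mean(lambda x, y: f(x, y) ** 2, c)
    return np.sqrt(m2 - m1**2)

c = 0.85
eta = np.sqrt(3 - 2 * c**2 + 3 * c**4)
rho2 = lambda x, y: x**2 + y**2

pairs = {  # name: (numerical sigma, closed form from Table 8-9 with unit coefficient)
    "defocus": (sigma(rho2, c), eta / (4 * np.sqrt(3))),
    "astigmatism": (sigma(lambda x, y: x**2, c), 0.25),
    "balanced astigmatism": (
        sigma(lambda x, y: x**2 - (3 - c**2) / eta**2 * rho2(x, y), c),
        c**2 / (eta * np.sqrt(6)),
    ),
    "coma": (sigma(lambda x, y: x * rho2(x, y), c),
             np.sqrt(5 + 2 * c**2 + c**4) / 8),
    "balanced coma": (
        sigma(lambda x, y: x * rho2(x, y) - (3 + c**2) / 6 * x, c),
        np.sqrt(9 - 6 * c**2 + 5 * c**4) / 24,
    ),
    "spherical": (
        sigma(lambda x, y: rho2(x, y) ** 2, c),
        np.sqrt(225 + 60 * c**2 - 58 * c**4 + 60 * c**6 + 225 * c**8)
        / (24 * np.sqrt(10)),
    ),
    "balanced spherical": (
        sigma(lambda x, y: rho2(x, y) ** 2
              - 0.25 * (1 - c**2) * (x**2 - y**2)
              - 0.5 * (1 + c**2) * rho2(x, y), c),
        np.sqrt(45 - 60 * c**2 + 94 * c**4 - 60 * c**6 + 45 * c**8)
        / (48 * np.sqrt(5)),
    ),
}
for name, (numeric, closed_form) in pairs.items():
    print(f"{name:22s} {numeric:.6f} {closed_form:.6f}")
```

Changing `c` reproduces the trends described above, for example the constant value 1/4 for astigmatism and the minimum of the defocus sigma near $c = 1/\sqrt{3}$.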
Table 8-9. Standard deviation $\sigma_i$ of a primary and a balanced primary aberration
for an elliptical pupil of aspect ratio c.
Aberration                        Sigma
Defocus                           $\sigma_d = (A_d/4)\,[(3 - 2c^2 + 3c^4)/3]^{1/2}$
Astigmatism                       $\sigma_a = A_a/4$
Balanced astigmatism              $\sigma_{ba} = A_a c^2/[6(3 - 2c^2 + 3c^4)]^{1/2}$
Coma                              $\sigma_c = A_c (5 + 2c^2 + c^4)^{1/2}/8$
Balanced coma                     $\sigma_{bc} = A_c (9 - 6c^2 + 5c^4)^{1/2}/24$
Spherical aberration              $\sigma_s = A_s (225 + 60c^2 - 58c^4 + 60c^6 + 225c^8)^{1/2}/(24\sqrt{10})$
Balanced spherical aberration     $\sigma_{bs} = A_s (45 - 60c^2 + 94c^4 - 60c^6 + 45c^8)^{1/2}/(48\sqrt{5})$
Figure 8-7. Variation of sigma of a Seidel aberration as a function of aspect ratio c
of a unit elliptical pupil, where the subscript d is for defocus, a for astigmatism, c for
coma, and s for spherical aberration.
Figure 8-8. Variation of sigma of a balanced Seidel aberration as a function of
aspect ratio c of a unit elliptical pupil, where the subscript ba is for balanced
astigmatism, bc for balanced coma, and bs for balanced spherical aberration.
8.8 SUMMARY
The PSF and OTF of a system with an elliptical pupil are obtained from the
corresponding PSF and OTF of a system with a circular pupil discussed in Chapter 4 by
scaling the coordinates of the elliptical pupil and transforming it into a circular pupil. It is
explained that the orthogonal aberration polynomials for an elliptical pupil representing
balanced classical aberrations for such a pupil cannot be obtained in the same manner.
These polynomials orthonormal over a unit elliptical pupil are obtained by
orthonormalizing the circle polynomials by the Gram–Schmidt orthonormalization
process. They are given through the fourth order in Tables 8-1 through 8-3 in terms of the
circle polynomials, in the polar coordinates, and in the Cartesian coordinates,
respectively. Table 8-2 shows that each polynomial consists of either the cosine or the
sine terms, but not both. Thus, an even j polynomial, for example, consists of only the
cosine terms. This is a consequence of the biaxial symmetry of the pupil. Since the
polynomials are not separable in the polar coordinates r and q of a pupil point,
polynomial numbering with two indices n and m loses significance. Hence, they must be
numbered with a single index j. Their ordering is the same as for the polynomials
discussed in previous chapters.
Only the first 15 elliptical polynomials are given for an arbitrary aspect ratio c of the
pupil in Tables 8-1 through 8-3. The expressions for the higher-order elliptical
polynomials are very long unless c is specified. The polynomial E6 for astigmatism is a
linear combination of $Z_6$, $Z_4$, and $Z_1$, showing that the balancing defocus for (zero-
degree) Seidel astigmatism is different for an elliptical pupil compared to that for a
circular, annular, or a Gaussian pupil. Moreover, E11 is a linear combination of $Z_{11}$, $Z_6$,
$Z_4$, and $Z_1$. Thus, spherical aberration $\rho^4$ is balanced with not only defocus $\rho^2$ but
astigmatism $\rho^2\cos^2\theta$ as well. It is evidently not radially symmetric. As expected, the
elliptical polynomials reduce to the circle polynomials as $c \to 1$, i.e., as the unit ellipse
approaches a unit circle.
The elliptical polynomials up to the eighth order for an elliptical pupil with an aspect
ratio of c = 0.85 are given in Tables 8-4 to 8-6 in terms of the Zernike circle polynomials,
in polar coordinates, and in Cartesian coordinates, respectively. They are illustrated in
three different but equivalent ways in Figure 8-5 with the isometric plot, interferogram,
and the PSF for a sigma value of one wave. The peak-to-valley aberration numbers (in
units of wavelength) are given in Table 8-7. The Strehl ratio for a sigma value of 0.1
wave is given in Table 8-8 and plotted in Figure 8-6. The Seidel aberrations are discussed
in Section 8.7 and their sigma values with and without balancing are given in Table 8-9.
References
1. H. J. Wyatt, “The form of the human pupil,” Vision Res. 35, 2021–2036 (1995).
2. W. B. King, “The approximation of a vignetted pupil shape by an ellipse,” Appl.

Opt. 7, 197–201 (1968).
3. G. Harbers, P. J. Kunst, and G. W. R. Leibbrandt, “Analysis of lateral shearing

interferogram by use of Zernike polynomials,” Appl. Opt. 35, 6162–6172 (1996).
4. H. Sumita, “Orthogonal expansion of the aberration difference function and its

application to image evaluation,” Japanese J. Appl. Phys. 8, 1027–1036 (1969).
5. Y. P. Kathuria, “Far-field radiation patterns of elliptical pupil apertures and its
annuli,” IEEE Trans. Anten. Propa. AP-31, 360–363 (1983).
6. J. V. Cornacchio and R. P. Soni, “Autoconvolution of an ellipse,” J. Opt. Soc. Am.

55, 107–108 (1965).

analytical solution,” J Opt. Soc. Am A 24, 2994–3016 (2007). Errata: J. Opt. Soc.
Am. A 29, 1673–1674 (2012).

11.41 (McGraw Hill, 2009).
CHAPTER 9
SYSTEMS WITH RECTANGULAR PUPILS
9.1 Introduction
9.2 Pupil Function
9.3.1 PSF
9.3.2 OTF
9.4 Rectangular Polynomials
9.5 Rectangular Coefficients of a Rectangular Aberration Function
9.6 Isometric, Interferometric, and Imaging Characteristics of Rectangular Polynomial Aberrations
9.7 Seidel Aberrations and Their Standard Deviations
9.7.1 Defocus
9.7.2 Astigmatism
9.7.3 Coma
9.8 Summary
References
9.1 INTRODUCTION
High-power laser beams have a rectangular cross-section; hence there is a need to
discuss the diffraction characteristics of a rectangular pupil. We start this chapter with a
brief discussion of the PSF and OTF of a system with such a pupil.
Although high-power rectangular laser beams have been around for a long time [1],
there is little in the literature on rectangular polynomials representing balanced
aberrations for such beams. In this chapter we discuss such polynomials that are
orthonormal over a unit rectangular pupil [2,3]. These polynomials are not separable in
the x and y coordinates of a point on the pupil. The expressions for only the first 15
orthonormal polynomials, i.e., up to and including the fourth order, are given for an
arbitrary aspect ratio of the pupil because they become quite cumbersome as their order
increases. However, expressions for the first 45 polynomials, i.e., up to and including the
eighth order, are given for an aspect ratio of 0.75. The isometric, interferometric, and PSF
plots of these polynomial aberrations with a sigma value of one wave are given along
with their P-V numbers. The Strehl ratios for these polynomial aberrations for a sigma
value of one-tenth of a wave are also given. Finally, we discuss how to obtain the
standard deviation of a Seidel aberration with and without balancing.
Products of Legendre polynomials (one for the x axis and the other for the y axis), which
are also orthogonal over a rectangular pupil [4], are not suitable for the analysis of
rectangular wavefronts of rotationally symmetric systems, since they do not represent
classical or balanced aberrations for such systems. For example, the defocus aberration
for such a system is represented by $x^2 + y^2$. While it can be expanded in terms of a
complete set of 2D Legendre polynomials, it cannot be represented by a single product of
the x- and y-Legendre polynomials. The same difficulty holds for spherical aberration,
coma, etc. However, products of such Legendre polynomials are suitable for anamorphic
systems, as discussed in Chapter 13. Products of Chebyshev polynomials, one for the x-
and the other for the y-axis, are also orthogonal over a rectangular pupil, but they are not
suitable either for the rectangular pupils considered in this chapter for the same reasons as
for the products of Legendre polynomials.
9.2 PUPIL FUNCTION

As illustrated in Figure 9-1, consider an optical system with a rectangular exit pupil
with half-widths a and b and area $S_{ex} = 4ab$ lying in the $(x_p, y_p)$ plane with the z axis as
its optical axis. For a uniformly illuminated pupil with an aberration function
$\Phi(x_p, y_p)$ and power $P_{ex}$ exiting from it, the pupil function of the system can be written
$$P(x_p, y_p) = A(x_p, y_p)\exp[i\Phi(x_p, y_p)] , \tag{9-1}$$
where
$$A(x_p, y_p) = (P_{ex}/S_{ex})^{1/2} , \quad -a \le x_p \le a , \quad -b \le y_p \le b . \tag{9-2}$$
Figure 9-1. Rectangular pupil with half-widths a and b.

9.3.1 PSF
From Eq. (1-9), the aberrated PSF at a point $(x_i, y_i)$ in the image plane of a system
with a uniformly illuminated rectangular exit pupil, normalized by its aberration-free
central value $P_{ex}S_{ex}/\lambda^2R^2$, can be written
$$I(x_i, y_i) = \frac{1}{S_{ex}^2}\left|\int_{-a}^{a}\int_{-b}^{b}\exp[i\Phi(x_p, y_p)]\exp\left[-\frac{2\pi i}{\lambda R}(x_i x_p + y_i y_p)\right]dx_p\,dy_p\right|^2 . \tag{9-3}$$
Substituting
$$(x', y') = (x_p/a,\; y_p/b) \tag{9-4}$$
and
$$(x, y) = \frac{1}{\lambda F_x}(x_i, y_i) \tag{9-5}$$
into Eq. (9-3), where
$$F_x = R/2a \tag{9-6}$$
is the focal ratio of the image-forming light cone along the x axis, and
$$\epsilon = b/a \tag{9-7}$$
is the aspect ratio of the pupil, the irradiance distribution can be written
$$I(x, y) = \frac{1}{16}\left|\int_{-1}^{1}\int_{-1}^{1}\exp[i\Phi(x', y')]\exp[-\pi i(xx' + \epsilon yy')]\,dx'\,dy'\right|^2 . \tag{9-8}$$

36) 239
2
1 1 1
I ( x, y) = Ú Ú exp[ -pi ( xx ¢ + yy ¢) ] dx ¢dy ¢
16 1 1
2 2
Ê sin px ˆ Ê sin py ˆ
= Á ˜ . (9-9)
Ë px ¯ ÁË py ˜¯
Figure 9-2a shows the 2D PSF for an aspect ratio = 0.75 . In particular, it shows the
central bright rectangular spot of size 2 ¥ 2 , with each dimension in units of l Fx . The
PSF is zero wherever x and/or y is a positive or a negative integer. Figure 9-2b shows
the irradiance distribution along the x and y axes, and along the diagonal of the central
12
bright spot as I ( x, 0) , I (0, y ) , and I ( x , y ) ∫ I ( r ) , where r = x 2 + y 2 (and )
4
È Ê 2ˆ ˘
Í sinË pr 1 + ¯ ˙
I (r) = Í ˙ . (9-10)
Í pr 1 +
2
˙
Î ˚
(a)
1.0
0.8
0.6
I (0, y)
(b)
0.4
I (r)
0.2
I (x, 0)
0.0
0.0 0.5 1.0 1.5 2.0 2.5 3.0
x, y, or r
Figure 9-2. (a) 2D aberration-free PSF for = 0.75. (b) Irradiance distribution along
the x and y axes, and along the diagonal of the central bright spot of the PSF.
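The aberration-free results of Eqs. (9-9) and (9-10) are easy to evaluate directly. The following minimal sketch is an illustration only; the function names and the variable `eps` (standing for the aspect ratio b/a of the pupil) are assumptions introduced here. It relies on `numpy.sinc`, which is defined as $\sin(\pi t)/(\pi t)$.

```python
import numpy as np

def psf_rect(x, y, eps):
    """Aberration-free PSF of Eq. (9-9); x and y are in units of lambda*F_x."""
    # np.sinc(t) = sin(pi*t)/(pi*t), so the factors of pi are built in.
    return np.sinc(x) ** 2 * np.sinc(eps * y) ** 2

def psf_diagonal(r, eps):
    """Irradiance along the diagonal of the central bright spot, Eq. (9-10)."""
    return np.sinc(eps * r / np.sqrt(1 + eps**2)) ** 4

eps = 0.75
x = 0.4
y = x / eps                      # a point on the diagonal of the central spot
r = np.hypot(x, y)

on_axis = psf_rect(0.0, 0.0, eps)            # central value, normalized to unity
first_zero_y = psf_rect(0.0, 1 / eps, eps)   # first zero along the y axis
print(on_axis, first_zero_y, psf_rect(x, y, eps), psf_diagonal(r, eps))
```

The first zero along the y axis falls at $y = 1/\epsilon$, which is why the central bright spot is wider along the smaller dimension of the pupil.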
9.3.2 OTF
From Eq. (1-13), the aberration-free OTF of a system with a rectangular pupil at a
spatial frequency $(\xi, \eta)$ is given by the fractional area of overlap of two rectangles
centered at (0, 0) and $\lambda R(\xi, \eta)$, as shown in Figure 9-3. The overlap area is given by
$$S(\xi, \eta) = (2a - \lambda R\xi)(2b - \lambda R\eta) = 4ab\left(1 - \frac{\xi}{1/\lambda F_x}\right)\left(1 - \frac{1}{\epsilon}\,\frac{\eta}{1/\lambda F_x}\right) , \tag{9-11}$$
where $\epsilon = b/a$ is the aspect ratio of the pupil. Hence, the fractional area of overlap, or
the OTF of the system, may be written
$$\tau(v_x, v_y) = (1 - v_x)\left(1 - \frac{v_y}{\epsilon}\right) , \tag{9-12}$$
where
$$(v_x, v_y) = \left(\frac{\xi}{1/\lambda F_x},\; \frac{\eta}{1/\lambda F_x}\right) \tag{9-13}$$
are the spatial frequency components in units of the cutoff frequency $1/\lambda F_x$ along the x
axis. The OTF $\tau(v)$, where $v = (v_x^2 + v_y^2)^{1/2}$, along the diagonal of the pupil can be
obtained from Eq. (9-12) by letting $v_y/v_x = \epsilon$. Thus
$$\tau(v) = \left(1 - \frac{v}{\sqrt{1 + \epsilon^2}}\right)^2 . \tag{9-14}$$
Its cutoff frequency is $\sqrt{1 + \epsilon^2}$.
Figure 9-3. Overlap area of two rectangular pupils centered at (0, 0) and $\lambda R(\xi, \eta)$
for an aspect ratio b/a = 0.75.
Figure 9-4 shows the OTF for an aspect ratio $\epsilon = b/a = 0.75$ along the x and y axes, and
along the diagonal of the pupil, as $\tau(v_x, 0)$, $\tau(0, v_y)$, and $\tau(v)$, with the corresponding
cutoff frequencies 1, 0.75, and 1.25, respectively, each in units of $1/\lambda F_x$. We note that
$\tau(0, v_y) < \tau(v_x, 0)$ for any value of $v_x = v_y$ due to the smaller dimension of the pupil
along the y axis. Moreover, $\tau(v) < \tau(v_x, 0)$ for any frequency lying in the range
$0 < v = v_x < 2(1 + \epsilon^2)^{1/2} - (1 + \epsilon^2)$, or $0 < v = v_x < 0.9375$ in our example of
$\epsilon = 0.75$. The two OTFs are equal to each other at the frequency
$2(1 + \epsilon^2)^{1/2} - (1 + \epsilon^2)$, or 0.9375. At larger frequencies, $\tau(v) > \tau(v_x, 0)$ until
$v = (1 + \epsilon^2)^{1/2}$. Of course, the values of both OTFs in the vicinity of the unity cutoff
frequency for $\tau(v_x, 0)$ are quite small in our example. Finally, $\tau(0, v_y)$ is only slightly
greater than $\tau(v)$ in the frequency range
$0 < v = v_y < 2(1 + \epsilon^2)^{1/2} - (1/\epsilon)(1 + \epsilon^2)$. The two OTFs are equal to each other at
the frequency $2(1 + \epsilon^2)^{1/2} - (1/\epsilon)(1 + \epsilon^2)$, or 1/2.4 in our example. For larger
frequencies, $\tau(v)$ is significantly greater. We point out that they are equal to each other
only if $\epsilon \ge 1/\sqrt{3}$. As $\epsilon \to 1$ and the rectangular pupil becomes square,
$\tau(0, v_y) \to \tau(v_x, 0)$ for any value of $v_x = v_y$, and the cutoff frequency for $\tau(v)$
approaches $\sqrt{2}$, as discussed in the next chapter.
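The crossover frequencies quoted above follow from equating Eq. (9-14) with Eq. (9-12) along each axis. The sketch below evaluates them for an aspect ratio of 0.75. It is illustrative only: the names `otf_rect`, `otf_diagonal`, and `eps` are assumptions, and the absolute values and the clipping to zero beyond cutoff are added here so that the functions behave sensibly for any real frequency.

```python
import numpy as np

def otf_rect(vx, vy, eps):
    """Aberration-free OTF of Eq. (9-12), clipped to zero beyond cutoff."""
    return max(0.0, 1.0 - abs(vx)) * max(0.0, 1.0 - abs(vy) / eps)

def otf_diagonal(v, eps):
    """OTF along the diagonal of the pupil, Eq. (9-14)."""
    s = np.sqrt(1 + eps**2)            # cutoff frequency along the diagonal
    return max(0.0, 1.0 - abs(v) / s) ** 2

eps = 0.75
s = np.sqrt(1 + eps**2)
v_cross_x = 2 * s - s**2               # tau(v) = tau(v_x, 0) at this frequency
v_cross_y = 2 * s - s**2 / eps         # tau(v) = tau(0, v_y) at this frequency
print(s, v_cross_x, v_cross_y)
```

For this aspect ratio the diagonal cutoff is 1.25, and the two crossovers occur at 0.9375 and 1/2.4, in units of $1/\lambda F_x$.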
Figure 9-4. Aberration-free OTF curves $\tau(v_x, 0)$, $\tau(0, v_y)$, and $\tau(v)$ for an aspect
ratio b/a = 0.75, where $v_x$, $v_y$, and $v$ are in units of the cutoff frequency $1/\lambda F_x$
along the x axis.
9.4 RECTANGULAR POLYNOMIALS

Figure 9-5 shows a unit rectangle inscribed inside a unit circle. The half-widths a
and b of the rectangular pupil are normalized by its semidiagonal $(a^2 + b^2)^{1/2}$ so that the
farthest points (such as A) on the pupil lie at a distance of unity. The half-widths of the
unit rectangle along the x and y axes are $c = a/(a^2 + b^2)^{1/2}$ and $(1 - c^2)^{1/2}$, respectively,
where $0 < c < 1$. Accordingly, the aspect ratio of the rectangle is $(1 - c^2)^{1/2}/c$, and its
area is given by $A = 4c(1 - c^2)^{1/2}$. As in the case of a unit ellipse, a unit rectangle is also
not unique, since c can have any value between 0 and 1. For example, when c = 0.8, the
aspect ratio of the pupil is 0.75 and the area is 1.92. As $c \to 1/\sqrt{2}$, the rectangle becomes
a square, and as $c \to 1$ or 0, it becomes a slit parallel to the x or the y axis, respectively.
The orthonormal rectangular polynomials $R_j(x, y)$ obtained by orthogonalizing the
Zernike circle polynomials $Z_j$ over a unit rectangle are given by [see Eq. (3-18)]
$$R_{j+1} = N_{j+1}\left[Z_{j+1} - \sum_{k=1}^{j}\langle Z_{j+1}R_k\rangle R_k\right] , \tag{9-15}$$
where $N_{j+1}$ is a normalization constant so that the polynomials are orthonormal over the
unit rectangle, i.e., they satisfy the orthonormality condition
$$\frac{1}{4c\sqrt{1 - c^2}}\int_{-c}^{c}dx\int_{-\sqrt{1-c^2}}^{\sqrt{1-c^2}}R_jR_{j'}\,dy = \delta_{jj'} . \tag{9-16}$$
The angular brackets indicate a mean value over the rectangular pupil. Thus
$$\langle Z_jR_k\rangle = \frac{1}{4c\sqrt{1 - c^2}}\int_{-c}^{c}dx\int_{-\sqrt{1-c^2}}^{\sqrt{1-c^2}}Z_jR_k\,dy . \tag{9-17}$$
Figure 9-5. Unit rectangle of half-width c inscribed inside a unit circle. Its corner
points, such as A, lie at a distance of unity from its center. The corners are
$A(c, \sqrt{1-c^2})$, $B(c, -\sqrt{1-c^2})$, $C(-c, -\sqrt{1-c^2})$, and $D(-c, \sqrt{1-c^2})$.
It should be evident that because of the symmetric limits of integration, a mean value is
zero if the integrand is an odd function of x and/or y. If the integrand is an even function,
then we may replace the lower limits of integration by zero and multiply the double
integral by 4.
The rectangular polynomials thus obtained up to the fourth order are given in Tables
9-1 through 9-3 in the same manner as the elliptical polynomials. Only the first 15
polynomials are given in these tables, because their expressions become too long unless
the aspect ratio is specified. Each polynomial consists of a number of circle polynomials,
but contains only the cosine or the sine terms, not both. The polynomial $R_6$ representing
balanced astigmatism is a linear combination of $Z_6$, $Z_4$, and $Z_1$, showing that the
balancing defocus for 0° Seidel astigmatism is different for a rectangular pupil compared
to that, for example, for a circular pupil. Similarly, the polynomial $R_{11}$, representing
balanced primary spherical aberration, is not radially symmetric, since it consists of a
term in astigmatism $Z_6$ or $\cos 2\theta$. As expected, the rectangular polynomials reduce to
the square polynomials as $c \to 1/\sqrt{2}$, and the slit polynomials for a slit pupil parallel to
the x axis as $c \to 1$, discussed in Chapters 10 and 11, respectively.
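The orthonormalization of Eq. (9-15) can be reproduced numerically for a given aspect ratio. The sketch below is one possible implementation, not the author's: instead of carrying out the Gram-Schmidt recursion term by term, it factors the matrix of mean values $\langle Z_iZ_j\rangle$ over the unit rectangle by Cholesky decomposition and inverts the factor, which yields the same lower-triangular conversion matrix with positive diagonal elements. Only the first six circle polynomials are included, in the single-index ordering used in the tables; the function names are hypothetical.

```python
import numpy as np

def zernike_1_to_6(x, y):
    """First six Zernike circle polynomials in Cartesian form."""
    r2 = x**2 + y**2
    return np.stack([
        np.ones_like(x),               # Z1, piston
        2 * x,                         # Z2, x tilt
        2 * y,                         # Z3, y tilt
        np.sqrt(3) * (2 * r2 - 1),     # Z4, defocus
        2 * np.sqrt(6) * x * y,        # Z5, sqrt(6) rho^2 sin(2 theta)
        np.sqrt(6) * (x**2 - y**2),    # Z6, sqrt(6) rho^2 cos(2 theta)
    ])

def rectangle_conversion_matrix(c, n=8):
    """Rows give R_j as linear combinations of Z_1..Z_6 over the unit rectangle."""
    hx, hy = c, np.sqrt(1 - c**2)      # half-widths of the unit rectangle
    u, w = np.polynomial.legendre.leggauss(n)
    xx, yy = np.meshgrid(hx * u, hy * u, indexing="ij")
    ww = np.outer(w, w) / 4.0          # weights for the mean over the rectangle
    z = zernike_1_to_6(xx, yy)
    gram = np.einsum("imn,jmn,mn->ij", z, z, ww)   # <Z_i Z_j>
    chol = np.linalg.cholesky(gram)    # gram = L L^T
    return np.linalg.inv(chol)         # M = L^{-1}, so that <R_i R_j> = delta_ij

m = rectangle_conversion_matrix(0.8)
print(np.round(m, 4))
```

For c = 0.8 the rows can be compared with the first six entries of Table 9-4; for example, the fourth row gives the coefficients of $Z_1$ and $Z_4$ in $R_4$.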
9.5 RECTANGULAR COEFFICIENTS OF A RECTANGULAR ABERRATION

FUNCTION
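A rectangular aberration function $W(x, y)$ across a unit rectangle can be expanded in
terms of J rectangular polynomials $R_j(x, y)$ in the form
$$W(x, y) = \sum_{j=1}^{J}a_jR_j(x, y) , \tag{9-18}$$
where $a_j$ are the expansion coefficients. Multiplying both sides of Eq. (9-18) by
$R_j(x, y)$, integrating over the unit rectangle, and using the orthonormality Eq. (9-16), we
obtain the rectangular expansion coefficients:
$$a_j = \frac{1}{4c\sqrt{1 - c^2}}\int_{-c}^{c}dx\int_{-\sqrt{1-c^2}}^{\sqrt{1-c^2}}W(x, y)R_j(x, y)\,dy . \tag{9-19}$$
As stated in Section 3.2, it is evident from Eq. (9-19) that the value of a rectangular
coefficient is independent of the number J of polynomials used in the expansion of the
aberration function. The mean and the mean square values of the aberration function are
given by
$$\langle W(\rho, \theta)\rangle = a_1 , \tag{9-20}$$
and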
Table 9-1. Orthonormal rectangular polynomials $R_j(\rho, \theta)$ in terms of the Zernike
circle polynomials $Z_j(\rho, \theta)$.
$R_1 = Z_1$
$R_2 = (\sqrt{3}/2c)\,Z_2$
$R_3 = [\sqrt{3}/(2\sqrt{1 - c^2})]\,Z_3$
$R_4 = [\sqrt{5}/(4\sqrt{1 - 2c^2 + 2c^4})]\,(Z_1 + \sqrt{3}\,Z_4)$
$R_5 = [\sqrt{3/2}/(2c\sqrt{1 - c^2})]\,Z_5$
$R_6 = \{\sqrt{5}/[8c^2(1 - c^2)\sqrt{1 - 2c^2 + 2c^4}]\}\,[(3 - 10c^2 + 12c^4 - 8c^6)Z_1 + \sqrt{3}(1 - 2c^2)Z_4 + \sqrt{6}(1 - 2c^2 + 2c^4)Z_6]$
$R_7 = [\sqrt{21}/(4\sqrt{2}\sqrt{27 - 81c^2 + 116c^4 - 62c^6})]\,[\sqrt{2}(1 + 4c^2)Z_3 + 5Z_7]$
$R_8 = [\sqrt{21}/(4\sqrt{2}\,c\sqrt{35 - 70c^2 + 62c^4})]\,[\sqrt{2}(5 - 4c^2)Z_2 + 5Z_8]$
$R_9 = \{\sqrt{5/2}\,\sqrt{(27 - 54c^2 + 62c^4)/(1 - c^2)}/[16c^2(27 - 81c^2 + 116c^4 - 62c^6)]\}\,[2\sqrt{2}(9 - 36c^2 + 52c^4 - 60c^6)Z_3 + (9 - 18c^2 - 26c^4)Z_7 + (27 - 54c^2 + 62c^4)Z_9]$
$R_{10} = \{\sqrt{5/2}/[16c^3(1 - c^2)\sqrt{35 - 70c^2 + 62c^4}]\}\,[2\sqrt{2}(35 - 112c^2 + 128c^4 - 60c^6)Z_2 + (35 - 70c^2 + 26c^4)Z_8 + (35 - 70c^2 + 62c^4)Z_{10}]$
$R_{11} = [1/(16\mu)]\,[8(3 + 4c^2 - 4c^4)Z_1 + 25\sqrt{3}\,Z_4 + 10\sqrt{6}(1 - 2c^2)Z_6 + 21\sqrt{5}\,Z_{11}]$
$R_{12} = [3\mu/(16c^2\nu\eta)]\,\{(105 - 550c^2 + 1559c^4 - 2836c^6 + 2695c^8 - 1078c^{10})Z_1 + 5\sqrt{3}(14 - 74c^2 + 205c^4 - 360c^6 + 335c^8 - 134c^{10})Z_4 + (5/2)\sqrt{3/2}\,(35 - 156c^2 + 421c^4 - 530c^6 + 265c^8)Z_6 + 21\sqrt{5}(1 - 4c^2 + 6c^4 - 4c^6)Z_{11} + [(7/2)\sqrt{5/2}\,\eta/(1 - c^2)]Z_{12}\}$
$R_{13} = [\sqrt{21}/(16\sqrt{2}\,c\sqrt{1 - 3c^2 + 4c^4 - 2c^6})]\,(\sqrt{3}\,Z_5 + \sqrt{5}\,Z_{13})$
$R_{14} = \tau\,[6(245 - 1400c^2 + 3378c^4 - 4452c^6 + 3466c^8 - 1488c^{10} + 496c^{12})Z_1 + 15\sqrt{3}(49 - 252c^2 + 522c^4 - 540c^6 + 270c^8)Z_4 + 15\sqrt{6}(49 - 252c^2 + 534c^4 - 596c^6 + 360c^8 - 144c^{10})Z_6 + 3\sqrt{5}(49 - 196c^2 + 282c^4 - 172c^6 + 86c^8)Z_{11} + 147\sqrt{10}(1 - 4c^2 + 6c^4 - 4c^6)Z_{12} + 3\sqrt{10}\,\nu^2Z_{14}]$
$R_{15} = \{1/[32c^3(1 - c^2)(1 - 3c^2 + 4c^4 - 2c^6)^{1/2}]\}\,[3\sqrt{7/2}\,(5 - 18c^2 + 24c^4 - 16c^6)Z_5 + \sqrt{105/2}\,(1 - 2c^2)Z_{13} + \sqrt{210}\,(1 - 2c^2 + 2c^4)Z_{15}]$
__________________________________________________
$\mu = (9 - 36c^2 + 103c^4 - 134c^6 + 67c^8)^{1/2}$
$\nu = (49 - 196c^2 + 330c^4 - 268c^6 + 134c^8)^{1/2}$
$\tau = 1/[128\nu c^4(1 - c^2)^2]$
$\eta = 9 - 45c^2 + 139c^4 - 237c^6 + 201c^8 - 67c^{10} = \mu^2(1 - c^2)$
Table 9-2. Orthonormal rectangular polynomials $R_j(\rho, \theta)$ in polar coordinates
$(\rho, \theta)$.
$R_1 = 1$
$R_2 = (\sqrt{3}/c)\,\rho\cos\theta$
$R_3 = [\sqrt{3}/\sqrt{1 - c^2}]\,\rho\sin\theta$
$R_4 = [\sqrt{5}/(2\sqrt{1 - 2c^2 + 2c^4})]\,(3\rho^2 - 1)$
$R_5 = [3/(2c\sqrt{1 - c^2})]\,\rho^2\sin 2\theta$
$R_6 = \{\sqrt{5}/[4c^2(1 - c^2)\sqrt{1 - 2c^2 + 2c^4}]\}\,[3(1 - 2c^2 + 2c^4)\rho^2\cos 2\theta + 3(1 - 2c^2)\rho^2 - 2c^2(1 - c^2)(1 - 2c^2)]$
$R_7 = [\sqrt{21}/(2\sqrt{27 - 81c^2 + 116c^4 - 62c^6})]\,(15\rho^2 - 9 + 4c^2)\rho\sin\theta$
$R_8 = [\sqrt{21}/(2c\sqrt{35 - 70c^2 + 62c^4})]\,(15\rho^2 - 5 - 4c^2)\rho\cos\theta$
$R_9 = \{\sqrt{5}\,\sqrt{(27 - 54c^2 + 62c^4)/(1 - c^2)}/[8c^2(27 - 81c^2 + 116c^4 - 62c^6)]\}\,\{(27 - 54c^2 + 62c^4)\rho^3\sin 3\theta - 3[4c^2(3 - 13c^2 + 10c^4) - (9 - 18c^2 - 26c^4)\rho^2]\rho\sin\theta\}$
$R_{10} = \{\sqrt{5}/[8c^3(1 - c^2)\sqrt{35 - 70c^2 + 62c^4}]\}\,\{(35 - 70c^2 + 62c^4)\rho^3\cos 3\theta - 3[4c^2(7 - 17c^2 + 10c^4) - (35 - 70c^2 + 26c^4)\rho^2]\rho\cos\theta\}$
$R_{11} = [1/(8\mu)]\,[315\rho^4 + 30(1 - 2c^2)\rho^2\cos 2\theta - 240\rho^2 + 27 + 16c^2 - 16c^4]$
$R_{12} = [3\mu/(8c^2\nu\eta)]\,[315(1 - 2c^2)(1 - 2c^2 + 2c^4)\rho^4 + 5(7\mu^2\rho^2 - 21 + 72c^2 - 225c^4 + 306c^6 - 153c^8)\rho^2\cos 2\theta - 15(1 - 2c^2)(7 + 4c^2 - 71c^4 + 134c^6 - 67c^8)\rho^2 + c^2(1 - c^2)(1 - 2c^2)(70 - 233c^2 + 233c^4)]$
$R_{13} = [\sqrt{21}/(4c\sqrt{1 - 3c^2 + 4c^4 - 2c^6})]\,(5\rho^2 - 3)\rho^2\sin 2\theta$
$R_{14} = 6\tau\,\{5\nu^2\rho^4\cos 4\theta - 20(1 - 2c^2)[6c^2(7 - 16c^2 + 18c^4 - 9c^6) - 49(1 - 2c^2 + 2c^4)\rho^2]\rho^2\cos 2\theta + 8c^4(1 - c^2)^2(21 - 62c^2 + 62c^4) - 120c^2(7 - 30c^2 + 46c^4 - 23c^6)\rho^2 + 15(49 - 196c^2 + 282c^4 - 172c^6 + 86c^8)\rho^4\}$
$R_{15} = \{\sqrt{21}/[8c^3(1 - c^2)^{3/2}(1 - 2c^2 + 2c^4)^{1/2}]\}\,[-(1 - 2c^2)(6c^2 - 6c^4 - 5\rho^2)\rho^2\sin 2\theta + (5/2)(1 - 2c^2 + 2c^4)\rho^4\sin 4\theta]$
Table 9-3. Orthonormal rectangular polynomials $R_j(x, y)$ in Cartesian coordinates
$(x, y)$, where $\rho^2 = x^2 + y^2$.
$R_1 = 1$
$R_2 = (\sqrt{3}/c)\,x$
$R_3 = [\sqrt{3}/\sqrt{1 - c^2}]\,y$
$R_4 = [\sqrt{5}/(2\sqrt{1 - 2c^2 + 2c^4})]\,(3\rho^2 - 1)$
$R_5 = [3/(c\sqrt{1 - c^2})]\,xy$
$R_6 = \{\sqrt{5}/[2c^2(1 - c^2)\sqrt{1 - 2c^2 + 2c^4}]\}\,[3(1 - c^2)^2x^2 - 3c^4y^2 - c^2(1 - 3c^2 + 2c^4)]$
$R_7 = [\sqrt{21}/(2\sqrt{27 - 81c^2 + 116c^4 - 62c^6})]\,(15\rho^2 - 9 + 4c^2)\,y$
$R_8 = [\sqrt{21}/(2c\sqrt{35 - 70c^2 + 62c^4})]\,(15\rho^2 - 5 - 4c^2)\,x$
$R_9 = \{\sqrt{5}\,\sqrt{(27 - 54c^2 + 62c^4)/(1 - c^2)}/[2c^2(27 - 81c^2 + 116c^4 - 62c^6)]\}\,[27(1 - c^2)^2x^2 - 35c^4y^2 - c^2(9 - 39c^2 + 30c^4)]\,y$
$R_{10} = \{\sqrt{5}/[2c^3(1 - c^2)\sqrt{35 - 70c^2 + 62c^4}]\}\,[35(1 - c^2)^2x^2 - 27c^4y^2 - c^2(21 - 51c^2 + 30c^4)]\,x$
$R_{11} = [1/(8\mu)]\,[315\rho^4 - 30(7 + 2c^2)x^2 - 30(9 - 2c^2)y^2 + 27 + 16c^2 - 16c^4]$
$R_{12} = [3\mu/(8c^2\nu\eta)]\,[35(1 - c^2)^2(18 - 36c^2 + 67c^4)x^4 + 630(1 - 2c^2)(1 - 2c^2 + 2c^4)x^2y^2 - 35c^4(49 - 98c^2 + 67c^4)y^4 - 30(1 - c^2)(7 - 10c^2 - 12c^4 + 75c^6 - 67c^8)x^2 - 30c^2(7 - 77c^2 + 189c^4 - 193c^6 + 67c^8)y^2 + c^2(1 - c^2)(1 - 2c^2)(70 - 233c^2 + 233c^4)]$
$R_{13} = [\sqrt{21}/(2c\sqrt{1 - 3c^2 + 4c^4 - 2c^6})]\,(5\rho^2 - 3)\,xy$
$R_{14} = 16\tau\,[735(1 - c^2)^4x^4 - 540c^4(1 - c^2)^2x^2y^2 + 735c^8y^4 - 90c^2(1 - c^2)^3(7 - 9c^2)x^2 + 90c^6(1 - c^2)(2 - 9c^2)y^2 + 3c^4(1 - c^2)^2(21 - 62c^2 + 62c^4)]$
$R_{15} = \{\sqrt{21}/[2c^3(1 - c^2)\sqrt{1 - 3c^2 + 4c^4 - 2c^6}]\}\,[5(1 - c^2)^2x^2 - 5c^4y^2 - c^2(3 - 9c^2 + 6c^4)]\,xy$
The quantities $\mu$, $\nu$, $\tau$, and $\eta$ are defined at the bottom of Table 9-1.
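$$\langle W^2(\rho, \theta)\rangle = \sum_{j=1}^{J}a_j^2 . \tag{9-21}$$
Accordingly, the variance of the aberration function is given by
$$\sigma_W^2 = \langle W^2(\rho, \theta)\rangle - \langle W(\rho, \theta)\rangle^2 = \sum_{j=2}^{J}a_j^2 . \tag{9-22}$$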
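Equations (9-20) through (9-22) can be illustrated with a small numerical experiment. The sketch below builds a wavefront from the Cartesian forms of $R_1$ through $R_4$ given in Table 9-3 for c = 0.8, using hypothetical coefficient values chosen only for illustration, and compares the mean, mean square, and sigma obtained by quadrature over the unit rectangle with the values predicted from the coefficients. The quadrature setup and all names are assumptions.

```python
import numpy as np

c = 0.8
d = np.sqrt(1 - 2 * c**2 + 2 * c**4)

def r_polys(x, y):
    """R1 through R4 in Cartesian form (Table 9-3)."""
    rho2 = x**2 + y**2
    return np.stack([
        np.ones_like(x),
        np.sqrt(3) / c * x,
        np.sqrt(3) / np.sqrt(1 - c**2) * y,
        np.sqrt(5) / (2 * d) * (3 * rho2 - 1),
    ])

a = np.array([0.3, 0.1, 0.0, -0.2])     # hypothetical coefficients, in waves

# Gauss-Legendre nodes scaled to the half-widths c and sqrt(1 - c^2).
u, w = np.polynomial.legendre.leggauss(8)
xx, yy = np.meshgrid(c * u, np.sqrt(1 - c**2) * u, indexing="ij")
ww = np.outer(w, w) / 4.0               # weights for the mean over the rectangle

wavefront = np.tensordot(a, r_polys(xx, yy), axes=1)
mean = (ww * wavefront).sum()           # should equal a_1, Eq. (9-20)
mean_sq = (ww * wavefront**2).sum()     # should equal the sum of a_j^2, Eq. (9-21)
sigma = np.sqrt(mean_sq - mean**2)      # Eq. (9-22)
print(mean, mean_sq, sigma, np.sqrt(np.sum(a[1:] ** 2)))
```

The piston coefficient sets the mean and drops out of the variance, which depends only on the coefficients with j ≥ 2.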

9.6 ISOMETRIC, INTERFEROMETRIC, AND IMAGING CHARACTERISTICS
OF RECTANGULAR POLYNOMIAL ABERRATIONS
The rectangular polynomials up to the eighth order for a rectangular pupil with
c = 0.8, corresponding to an aspect ratio of b/a = 0.75, are given in Tables 9-4 to 9-6. They
are illustrated in three different but equivalent ways in Figure 9-6. For each polynomial,
the isometric plot at the top illustrates its shape. An interferogram is shown on the left,
and a corresponding PSF is shown on the right for a sigma value of one wave. The peak-
to-valley aberration numbers (in units of wavelength) are given in Table 9-7.
The PSFs for each polynomial aberration, obtained by applying Eq. (9-3), are shown in
Figure 9-6. The full width of a square displaying the PSFs is $24\lambda F_x$. Since the piston
aberration $R_1$ has no effect on the PSF, it yields an aberration-free PSF. The polynomial
aberrations $R_2$ and $R_3$, representing the x and y wavefront tilts with aberration
coefficients $a_2$ and $a_3$, displace the PSF in the image plane along the x and y axes,
respectively. If the coefficient $a_2$ is in units of wavelength, it corresponds to a wavefront
tilt angle of $\sqrt{3}\,\lambda a_2/ca$ about the y axis and displaces the PSF along the x axis by
$2\sqrt{3}\,\lambda F_x a_2/c$, where $F_x = R/2a$ and $c = a/(a^2 + b^2)^{1/2}$ is the half-width of the
rectangle along the x axis normalized by its semidiagonal. Similarly, $a_3$ corresponds to a
wavefront tilt angle of $\sqrt{3/(1 - c^2)}\,\lambda a_3/b$ about the x axis and displaces the PSF by
$2\sqrt{3/(1 - c^2)}\,\lambda F_y a_3$, where $F_y = R/2b$ is the focal ratio of the image-forming beam
along the y axis.
The defocus aberration represented by the polynomial $R_4$ is radially symmetric and
yields a radially symmetric interferogram bounded, of course, by a rectangle. However,
the PSF is biaxially and not radially symmetric because of the larger diffraction spread
along the smaller dimension of the pupil. The polynomial aberrations $R_5$ and $R_6$,
representing balanced astigmatism, both yield biaxially symmetric interferograms and
PSFs, but they are distinctly different from each other. The polynomial aberrations $R_7$
and $R_8$, representing balanced comas, produce biaxially symmetric interferograms, but
the PSFs are symmetric only about the y and x axes, respectively. The polynomial
aberrations $R_{11}$, $R_{22}$, and $R_{37}$, representing balanced primary, secondary, and tertiary
spherical aberrations, are not radially symmetric because of the presence of $\cos 2\theta$;
$\cos 2\theta$ and $\cos 4\theta$; and $\cos 2\theta$, $\cos 4\theta$, and $\cos 6\theta$ terms, respectively.
Table 9-4. Rectangular polynomials in terms of Zernike circle polynomials for a
rectangular pupil with c = 0.8 corresponding to an aspect ratio of 0.75.
R1 Z1
R2 1.0825Z2
R3 1.4434Z3
R4 0.7613Z1 + 1.3186Z4
R5 1.2758Z5
R6 -0.9614Z1 - 0.8012Z4 + 2.1820Z6
R7 1.6096Z3 + 1.5985Z7
R8 0.8848Z2 + 1.2821Z8
R9 -4.0549Z3 - 2.2292Z7 + 3.0190Z9
R10 0.0077Z2 + 0.1153Z8 + 2.1173Z10
R11 0.9498Z1 + 1.3109Z4 - 0.2076Z6 + 1.4216Z11
R12 -1.8433Z1 - 2.0095Z4 + 4.7861Z6 - 0.8443Z11 + 2.8091Z12
R13 0.9952Z5 + 1.2848Z13
R14 5.7024Z1 + 6.0904Z4 - 7.9324Z6 + 2.5076Z11 - 3.1207Z12 + 4.6212Z14
R15 -1.9090Z5 - 0.7807Z13 + 3.0068Z15
R16 1.0746Z2 + 1.2027Z8 + 0.5203Z10 + 1.3544Z16
R17 2.4267Z3 + 2.3540Z7 0.8114Z9 + 1.7220Z17
R18 0.7905Z2 + 0.7891Z8 + 4.0955Z10 + 0.4914Z16 + 2.4652Z18
R19 9.1771Z3 7.2660Z7 + 6.8435Z9 2.7816Z17 + 3.1455Z19
R20 3.1155Z2 + 2.4245Z8 4.4115Z10 + 0.8983Z16 1.3467Z18 + 4.7364Z20
R21 22.2957Z3 + 16.1385Z7 16.6680Z9 + 5.1449Z17 5.3074Z19 + 6.9206Z21
R22 1.2407Z1 + 1.8668Z4 0.3413Z6 + 1.7268Z11 0.3191Z12 + 0.6512Z14 + 1.4983Z22
R23 0.82769Z5 + 1.0323Z13 + 0.1445Z15 + 1.3087Z23
R24 3.7592Z1 4.8556Z4 + 10.4311Z6 3.2528Z11 + 7.6493Z12 0.8460Z14

1.0933Z22 + 3.4474Z24
R25 3.2181Z5 2.2882Z13 + 5.6636Z15 0.8568Z23 + 2.9200Z25
R26 14.8185Z1 + 18.3776Z4 19.4312Z6 + 11.4773Z11 11.6213Z12 + 13.7289Z14 +

3.6298Z22 3.3094Z24 + 4.9523Z26
R27 9.9177Z5 + 5.7801Z13 11.4544Z15 + 1.5808Z23 3.0839Z25 + 7.1762Z27

R28 = 30.6444Z1 36.3206Z4 + 53.9421Z6 20.5096Z11 + 30.7165Z12 31.3914Z14

5.3566Z22 + 8.1769Z24 8.2421Z26 + 10.7448Z28
R29 = 3.4865Z3 + 3.9022Z7 1.8556Z9 + 2.9825Z17 1.1968Z19 + 0.4761Z21 + 1.8221Z29
R30 = 1.2903Z2 + 1.5913Z8 + 1.5103Z10 + 1.4507Z16 + 0.7232Z18 + 0.0791Z20 + 1.4055Z30
R31 = 20.4078Z3 19.3401Z7 + 16.0374Z9 11.2671Z17 + 9.6922Z19 1.9475Z21

3.5735Z29 + 3.5748Z31
R32 = 2.7256Z2 + 2.6116Z8 + 6.9151Z10 + 1.6480Z16 + 5.3888Z18 + 1.0051Z20 + 0.7002Z30

+ 2.8331Z32
R33 = 58.4744Z3 + 51.4017Z7 45.1816Z9 + 26.2581Z17 23.1848Z19 + 20.6702Z21 +

6.7194Z29 5.9855Z31 + 6.2807Z33
R34 = 8.9453Z2 + 8.6027Z8 7.0055Z10 + 5.1606Z16 3.7909Z18 + 12.5633Z20 +

1.7207Z30 0.9675Z32 + 4.7946Z34
R35 = 137.4560Z3 115.4710Z7 + 119.5700Z9 54.4067Z17 + 56.4789Z19 59.716Z21

12.2438Z29 + 12.7553Z31 13.4933Z33 + 16.6422Z35
R36 = 9.1288Z2 7.5791Z8 + 29.6113Z10 3.4590Z16 + 14.4106Z18 23.3638Z20

0.7039Z30 + 3.4183Z32 5.3160Z34 + 11.2833Z36
R37 = 1.4443Z1 + 2.3880Z4 0.2229Z6 + 2.6066Z11 0.4738Z12 + 1.5018Z14 + 2.1013Z22

0.4267Z24 + 0.9143Z26 0.1707Z28 + 1.5680Z37
R38 = 6.7920Z1 9.6812Z4 + 20.4832Z6 8.1391Z11 + 17.6424Z12 3.3244Z14

4.5139Z22 + 10.4761Z24 1.4218Z26 + 1.6720Z28 1.3388Z37 + 3.9661Z38
R39 = 0.1065Z5 + 0.5880Z13 + 1.2183Z15 + 1.0307Z23 + 0.1823Z25 0.4340Z27 +

1.3327Z39
R40 = 39.3796Z1 + 53.2283Z4 53.0596Z6 + 40.6751Z11 39.4938Z12 + 41.0417Z14 +

20.0217Z22 18.4232Z24 + 20.9849Z26 2.4001Z28 + 5.33986Z37 4.3544Z38 +
6.1988Z40
R41 = 3.8438Z5 3.9634Z13 + 8.2513Z15 2.7281Z23 + 6.3196Z25 + 0.6634Z27

1.0209Z39 + 3.1115Z41
R42 = 78.9935Z1 102.5530Z4 + 153.1260Z6 72.0204Z11 + 109.2280Z12 87.8651Z14

30.8082Z22 + 48.1556Z24 38.2527Z26 + 36.5458Z28 6.4972Z37 + 10.8145Z38
8.3071Z40 + 9.4857Z42
R43 = 22.1387Z5 + 17.0827Z13 24.7366Z15 + 8.4351Z23 12.3584Z25 + 18.7263Z27 +

2.1738Z39 3.2116Z41 + 6.2555Z43
R44 = 197.7770Z1 + 252.3210Z4 358.1940Z6 + 171.0860Z11 242.6080Z12 +

254.2440Z14 + 69.4217Z22 98.2143Z24 + 103.3860Z26 109.2310Z28 + 13.6842Z37
19.2514Z38 + 20.4330Z40 21.5294Z42 + 26.1698Z44
R45 = 49.1651Z5 33.5817Z13 + 72.6480Z15 13.7675Z23 + 30.0565Z25 47.9434Z27

2.7431Z39 + 6.0701Z41 9.6463Z43 + 17.5983Z45
Table 9-5. Rectangular polynomials in polar coordinates for a rectangular pupil

with c = 0.8 corresponding to an aspect ratio 0.75.
R1 = 1
R2 = 2.1651U cosT
R3 = 2.8868U cosT
R4 = 0.7613 + 2.2839( 1 + 2U2)
R5 = 3.1250U 2cos2T
R6 = 0.9614 1.3878( 1 + 2U2) + 5.3449U 2cos2T
R7 = ( 5.8234 U + 13.5638U3)cosT
R8 = ( 5.4830 U + 10.8789U3)cosT
R9 = (4.5005 U 18.9154U3)cosT + 8.5389U3cos3T
R10 = ( 0.6370 U + 0.9787U3)cosT + 5.9885U3cos3T
R11 = 0.9498 + 2.2705( 1 + 2U2) + 3.1787(1 6U2 + 6U4) 0.5086U2cos2T
R12 = 1.8433 3.4805( 1 + 2U2) 1.8880(1 6U2 + 6U4) + ( 14.9264U2 + 35.5330U4)cos2T
R13 = ( 9.7511U 2 + 16.2519U4)cos2T
R14 = 5.7024 + 10.5488( 1 + 2U 2) + 5.6072(1 6U2 + 6 U4) + (10.1748U2 39.4736U4)cos2T

+ 14.6134U4cos4T
R15 = (2.7303U2 9.8753U4)cos2T + 9.5085 U4cos4T
R16 = (9.4205U 46.0944U3 + 46.9165U5)cosT + 1.4715 U3cos3T
R17 = (9.4323U 51.6062U3 + 59.6505U5)cosT 2.2951U 3cos3T
R18 = (2.2238U 13.7300U3 + 17.0212U5)cosT + ( 22.5745U3 + 42.6979U5)cos3T
R19 = ( 6.1582 U + 53.9729U3 96.3558 U5)cosT + ( 24.2284U3 + 54.4811U 5)cos3T
R20 = (1.8516U 16.7701U 3 + 31.1191U5)cosT + (6.1828U3 23.3257U5)cos3T + 16.4075U5cos5T
R21 = (6.7650U 76.9274U3 + 178.2230U5)cosT + (26.3979U3 91.9276U 5)cos3T + 23.9735U5cos5T
R22 = 1.2407 + 3.2334( 1 + 2U2) + 3.8612(1 6U2 + 6U4) + 3.9642( 1 + 12U 2 30U4 + 20U6)
+ (2.1911U 2 4.0362 U4)cos2T + 2.0593U4cos4T
R23 = (21.6144 U2 84.877 U4 + 73.4513U6)cos2T + 0.4570U4cos4T
R24 = 3.7592 8.4102( 1 + 2U2) 7.2735(1 6U2 + 6U4) 2.8925( 1 + 12U2 30U4 + 20U6)
+ (30.3780U2 161.2260U4 + 193.4870 U6)cos2T 2.6753U 4cos4T
R25 = ( 5.4111U 2 + 35.1766U4 48.0902 U6)cos2T + ( 36.7175 U4 + 65.5530 U6)cos4T
R26 = 14.8185 + 31.8310( 1 + 2U2) + 25.6640(1 6U2 + 6U4) + 9.60361( 1 + 12U2 30U4 + 20 U6)
+ ( 11.6421U2 + 100.6510U4 185.7370U6)cos2T + ( 49.2338 U4 + 111.1780 U6)cos4T
R27 = (4.9469U2 45.1814 U4 + 88.7207U6)cos2T + (21.4719U4 69.2325U6)cos4T + 26.8510U6cos6T
R28 = 30.6444 62.9091( 1 + 2U2) 45.8608(1 6U 2 + 6 U4) 14.1723( 1 + 12U2 30U4 + 20 U6)
+ (24.2988U 2 223.3660U4 + 458.9270U6)cos2 T + (54.9277U4 185.0350 U6)cos4T + 40.2033 U6cos6 T
Table 9-5. Rectangular polynomials in polar coordinates for a rectangular pupil

with c = 0.8 corresponding to an aspect ratio 0.75. (Cont.)
R29 ( 13.2595U + 127.7800U3 333.9810U5 + 255.0900U7)cosT + (11.3350U 3 20.7293U 5)cos3T + 1.6493U5cos5T
R30 ( 13.8336U + 121.8610 U3 287.0700U5 + 196.7720U7)cosT + ( 5.7494U 3 + 12.5263U 5)cos3T + 0.2742U 5cos5T
R31 (8.6741U 124.5650 U3 + 467.3430U 5 500.2940U7)cos T + (54.0511U 3 261.0960U 5 + 300.2790U7)cos3T

6.7464 U5cos5T
R32 ( 3.3996U + 37.6803U3 110.9620U 5 + 98.0286U7)cos T + (58.2124U3 246.6330U5 + 237.9790U7)cos3T

+ 3.4819U5cos5T
R33 ( 8.4508U + 150.9530U3 703.0400 U5 + 940.7110U7)cosT + ( 45.9536U 3 + 316.6830U 5 502.7790U7)cos3T

+ ( 79.1341U 5 + 175.8610U7)cos5T
R34 ( 4.6745U + 64.9609U3 234.2050U 5 + 240.9010U 7)cosT + ( 5.9855U 3 + 50.4373U 5 81.2682U7)cos3T

+ ( 71.5489U5 + 134.2480U7)cos5T
R35 (8.7830 U 187.4210 U3 + 1053.8100 U5 1714.1300U7)cos T + (65.8151U3 552.3990 U5 + 1071.4500U7)cos3T

+ (116.9780U5 377.8130 U7)cos5T + 66.5688 U7cos7T
R36 ( 0.0678U 4.9868U 3 + 49.1032U 5 98.5394U 7)cos T + (20.8068U 3 160.5990U5 + 287.1390 U7)cos3T
+ (46.6489U 5 148.8470 U7)cos5T + 45.1331U 7cos7T
R37 1.4443 + 4.1359( 1 + 2U 2) + 5.8286(1 6U2 + 6U4) + 5.5594( 1 + 12U2 30U4 + 20U6)
+ 4.7041(1 20U 2 + 90 U4 140U 6 + 70U 8) + ( 5.6303U 2 + 25.9377U 4 23.9482U 6)cos2T
+ ( 12.3568U4 + 20.5270 U6)cos4T 0.6386 U6cos6T
R38 6.7920 16.7684( 1 + 2U 2) 18.1996(1 6U 2 + 6U4) 11.9426( 1 + 12U2 30 U4 + 20U 6)

4.0165(1 20U 2 + 90U4 140 U6 + 70U8) + ( 50.2770U 2 + 448.8090U 4 1178.8400 U6 + 942.3010U8)cos2T
+ (16.0858U4 31.9182U 6)cos4T + 6.2562U6cos6T
R39 ( 39.2423U2 + 269.5590U4 535.8400 U6 + 316.6330U8)cos2T + (0.4428U 4 + 4.0919U6)cos4T 1.6238U6cos6T
R40 39.3796 + 92.1941( 1 + 2U2) + 90.9522(1 6U 2 + 6U4) + 52.9725( 1 + 12U2 30U4 + 20U 6)
+ 16.0196(1 20U 2 + 90U4 140U6 + 70U8) + (15.8434U 2 229.3420U 4 + 905.7830U 6 1034.5500U8)cos2T
+ (131.6850U 4 633.4660U6 + 736.3840U 8)cos4T 8.9803U6cos6 T
R41 (10.2509U2 105.8530 U4 + 301.6630 U6 242.5490 U8)cos2 T + (105.8810 U4 412.5710 U6 + 369.6300U8)cos4T

+ 2.4823U 6cos6T
R42 78.9935 177.6270( 1 + 2U2) 161.0420(1 6U2 + 6U4) 81.5109( 1 + 12U2 30U4 + 20 U6)
19.4915(1 20 U2 + 90U4 140U6 + 70 U8) + ( 38.8745U2 + 530.9200 U4 2114.8800U6 + 2569.3900U8)cos2T
+ ( 90.8696U4 + 621.471 U6 986.8280 U8)cos4T + ( 144.9680U6 + 321.9540U8)cos6T
R43 ( 10.6920U2 + 138.2210 U4 494.9710 U6 + 516.4750U8)cos2T + ( 51.4043U4 + 294.8320 U6 381.5180U8)cos4T

+ ( 115.7120 U6 + 212.3190 U8)cos6T
R44 197.7770 + 437.0330( 1 + 2U2) + 382.5600(1 6U2 + 6 U4) + 183.6730( 1 + 12 U2 30U4 + 20 U6)
+ 41.0527(1 20U2 + 90U4 140 U6 + 70U8) + (36.0550U2 619.6960 U4 + 3063.7900U6 4573.8900U8)cos2T
+ (170.1620U4 1319.9600U 6 + 2427.3200 U8)cos4T + (230.6850U6 730.7330U 8)cos6 T + 111.0290U8cos8T
R45 (5.4529U 2 92.7804 U4 + 449.2680 U6 651.7150U8)cos2 T + (53.7265U 4 406.8700 U6 + 721.0900U8)cos4 T

+ (107.0920U6 327.4060U8)cos6 T + 74.6631U 8cos8T
Table 9-6. Rectangular polynomials in Cartesian coordinates for a rectangular pupil

with c = 0.8 corresponding to an aspect ratio 0.75.
R1 1
R2 2.1651x
R3 2.8868y
R4 1.5226 + 4.5677x2 + 4.5677y2
R5 6.2500xy
R6 0.4263 + 2.5694x2 8.1204y2
R7 5.8234y + 13.5638x2y + 13.5638y3
R8 5.4830x + 10.8789x3 + 10.8789xy2
R9 4.5005y + 6.7012x2y 27.4543y3
R10 0.6370x + 6.9672x3 16.9868xy2
R11 1.8580 15.0398x2 + 19.0722x4 14.0226y2 + 38.1445x2y2 + 19.0722y4
R12 0.2507 10.5596x2 + 24.2052x4 + 19.2931y2 22.6556x2y2 46.8608y4
R13 19.5023xy + 32.5038x3y + 32.5038xy3
R14 0.7608 2.3708x2 + 8.7829x4 22.7203y2 20.3939x2y2 + 87.7301y4
R15 5.4606xy + 18.2834x3y 57.7844xy3
R16 9.4205x 44.6228x3 + 46.9165x5 50.5090xy2 + 93.8330x3y2 + 46.9165xy4
R17 9.4323y 58.4915x2y + 59.6505x4y 49.3111y3 + 119.3010x2y3 + 59.6505y5
R18 2.2238x 36.3045x3 + 59.7191x5 + 53.9936xy2 51.3535x3y2 111.0730xy4
R19 6.1582y 18.7124x2y + 67.0875x4y + 78.2013y3 83.7494x2y3 150.8370y5
R20 1.8516x 10.5873x3 + 24.2009x5 35.3186xy2 55.1853x3y2 + 183.1340xy4
R21 6.7650y + 2.2661x2y + 22.3073x4y 103.3250y3 67.1447x2y3 + 294.1240y5
R22 2.0957 + 33.0605x2 97.7345x4 + 79.2831x6 + 28.6783y2 203.8710x2y2 +

237.8490x4y2 89.6620y4 + 237.8490x2y4 + 79.2831y6
R23 43.2289xy 167.9260x3y + 146.9030x5y 171.5820xy3 + 293.8050x3y3 + 146.9030xy5

R24 0.2700 + 22.4881x2 120.7660x4 + 135.6370x6 38.2678y2 + 102.3210x2y2 +
19.9357x4y2 + 201.6850y4 367.0390x2y4 251.3380y6
R25 10.8221xy 76.5169x3y + 166.0320x5y + 217.2230xy3 192.3610x3y3 358.3920xy5

R26 0.9521 + 13.2791x2 82.7075x4 + 117.5130x6 + 36.5633y2 + 27.1545x2y2
165.4110x4y2 284.0090y4 + 206.0630x2y4 + 488.9880y6
R27 9.8939xy 4.4751x3y + 61.6173x5y 176.2500xy3 182.1370x3y3 + 615.4780xy5
R28 0.5762 + 3.5782x2 18.4357x4 + 30.6504x6 45.0195y2 29.5599x2y2

69.2811x4y2 + 428.2970y4 + 218.9620x2y4 967.6110y6
Table 9-6. Rectangular polynomials in Cartesian coordinates for a rectangular pupil

with c = 0.8 corresponding to an aspect ratio 0.75. (Cont.)
R29 = 13.2595y + 161.7850x2y 387.9230x4y + 255.0900x6y + 116.4450y3
725.9140x2y3 + 765.2710x4y3 311.6030y5 + 765.2710x2y5 + 255.0900y7
R30 = 13.8336x + 116.1110x3 274.2700x5 + 196.7720x7 + 139.1090xy2 601.9340x3y2

+ 590.3150x5y2 323.2780xy4 + 590.3150x3y4 + 196.7720xy6
R31 = 8.6741y + 37.5880x2y 349.6770x4y + 400.5440x6y 178.6160y3 + 479.9580x2y3 +

0.5157x4y3 + 721.6920y5 1200.6000x2y5 800.5730y7
R32 = 3.3996x + 95.8927x3 354.1130x5 + 336.0080x7 136.9570xy2 + 236.5240x3y2 +

56.1063x5y2 + 646.3470xy4 895.8120x3y4 615.9100xy6
R33 = 8.4508y + 13.0920x2y 148.6600x4y + 311.6780x6y + 196.9060y3 + 18.6281x2y3

571.0660x4y3 1098.8600y5 + 736.6060x2y5 + 1619.3500y7
R34 = 4.6745x + 58.9754x3 255.3170x5 + 293.8810x7 + 82.9174xy2 + 146.2040x3y2

404.2550x5y2 743.2620xy4 + 457.8080x3y4 + 1155.9400xy6
R35 = 8.7830y + 10.0244x2y 18.4974x4y + 77.1286x6y 253.2360y3 166.9520x2y3

225.9950x4y3 + 1723.1900y5 + 727.3160x2y5 3229.9600y7
R36 = 0.0688x + 15.8200x3 64.8468x5 + 84.8849x7 67.4072xy2 47.0852x3y2

190.9260x5y2 + 764.1440xy4 + 592.5860x3y4 2020.1200xy6
R37 = 2.2817 59.6994x2 + 305.1400x4 551.4460x6 + 329.2870x8 48.4388y2 +

657.2590x2y2 1759.1600x4y2 + 1317.1500x6y2 + 253.2650y4 1730.4300x2y4 +
1975.7200x4y4 502.2730y6 + 1317.1500x2y6 + 329.2870y8
R38 = 0.2972 37.5960x2 + 352.4860x4 881.0430x6 + 661.1430x8 + 62.9580y2

321.3330x2y2 142.7060x4y2 + 759.9720x6y2 545.1320y4 + 2402.6700x2y4
1686.9400x4y4 + 1464.1300y6 3009.2300x2y6 1223.4600y8
R39 = 78.4846xy + 540.8890x3y 1065.0600x5y + 633.2650x7y + 537.3460xy3

2110.8800x3y3 + 1899.8000x5y3 1097.7900xy5 + 1899.8000x3y5 + 633.2650xy7
R40 = 1.1848 30.2033x2 + 300.6440x4 919.9570x6 + 823.2050x8 61.8900y2 +

6.4945x2y2 + 657.9390x4y2 529.1540x6y2 + 759.3280y4 1423.0400x2y4
635.6140x4y4 2713.5600y6 + 3609.0500x2y6 + 2892.3100y8
R41 = 20.5019xy + 211.8160x3y 1032.0600x5y + 993.4240x7y 635.2280xy3 +

1157.0100x3y3 + 23.2280x5y3 + 2268.5100xy5 2933.8200x3y5 1963.6200xy7
R42 = 0.3900 16.1732x2 + 164.8860x4 539.7850x6 + 540.1110x8 + 61.5757y2

5.1104x2y2 + 248.0660x4y2 878.8930x6y2 896.9530y4 + 128.7860x2y4 +
1681.8400x4y4 + 3979.9200y6 2141.7300x2y6 5242.5800y8
R43 = 21.3841xy + 70.8246x3y 504.8870x5y + 780.7940x7y + 482.0590xy3 +

334.3580x3y3 1399.6900x5y3 2863.5400xy5 + 1652.4500x3y5 + 3832.9400xy7
R44 = 0.6837 2.2182x2 + 30.3867x4 99.4155x6 + 107.4090x8 74.3282y2

61.1328x2y2 18.4555x4y2 240.8510x6y2 + 1269.7800y4 + 774.5320x2y4 +
741.0040x4y4 6688.3600y6 2405.7900x2y6 + 10716.7000y8
R45 = 10.9057xy + 29.3452x3y 86.3882x5y + 213.8010x7y 400.4670xy3 344.7800x3y3

623.3740x5y3 + 3168.5700xy5 + 1970.1700x3y5 6749.5300xy7
[Figure 9-6: panels R1 through R45, each shown as an isometric plot, an interferogram, and a PSF]
Figure 9-6. Rectangular polynomials for c = 0.8 corresponding to an aspect ratio of 0.75 shown as isometric plot on the top, interferogram on the left, and PSF on the right for a sigma value of one wave.
Table 9-7. Peak-to-valley (P-V) numbers (in units of wavelength) of rectangular polynomial aberrations for c = 0.8 corresponding to an aspect ratio of 0.75 for a sigma value of one wave.
Poly. P-V # Poly. P-V # Poly. P-V #
R1 0 R16 15.352 R31 11.357
R2 3.464 R17 16.675 R32 10.471
R3 3.464 R18 7.354 R33 8.574
R4 4.568 R19 7.741 R34 8.959
R5 6.000 R20 7.981 R35 11.357
R6 4.568 R21 9.224 R36 9.195
R7 9.289 R22 12.142 R37 16.914
R8 8.6345 R23 20.054 R38 12.861
R9 6.460 R24 9.195 R39 28.345
R10 6.115 R25 8.181 R40 7.783
R11 7.364 R26 6.821 R41 12.659
R12 6.024 R27 7.960 R42 10.108
R13 12.481 R28 12.142 R43 10.351
R14 5.488 R29 24.920 R44 8.480
R15 6.491 R30 23.048 R45 9.297
The Strehl ratio, namely, the central value of a PSF relative to its aberration-free
value, can be obtained from Eq. (9-8) by letting x = 0 = y, i.e., from
$$ I(0,0) = \frac{1}{16}\left|\int_{-1}^{1}\int_{-1}^{1}\exp\left[i\Phi(x',y')\right]dx'\,dy'\right|^{2} . \tag{9-23} $$
Its value for a rectangular polynomial aberration with a sigma value of 0.1 wave is listed
in Table 9-8 and plotted in Figure 9-7. Because of the small value of the aberration, the
Strehl ratio is approximately the same for each polynomial. Both the table and the figure
illustrate that the Strehl ratio for a small aberration is approximately independent of the
type of aberration. It is approximately given by $\exp(-\sigma_\Phi^2)$, or 0.67, where
$\sigma_\Phi = 0.2\pi$ is the standard deviation of the phase aberration in radians.
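The Strehl ratio of Eq. (9-23) is the squared modulus of the pupil average of exp(iΦ), so it can be evaluated numerically. The following is a minimal sketch, not the author's computation. It assumes Gauss-Legendre quadrature for the pupil average, and it takes the tilt polynomials as R2 = (√3/c)x and R3 = (√3/√(1 − c²))y, which reproduce the numerical coefficients listed for c = 0.8. The function and variable names are hypothetical.

```python
import numpy as np

def strehl_ratio(poly, c, sigma_waves, n=64):
    """Return |<exp(i*Phi)>|^2 over the unit rectangle of half-widths c and sqrt(1 - c^2).

    poly(x, y) is an orthonormal polynomial (unit sigma); the phase aberration is
    Phi = 2*pi*sigma_waves*poly(x, y). The pupil mean uses Gauss-Legendre quadrature.
    """
    t, w = np.polynomial.legendre.leggauss(n)         # nodes and weights on [-1, 1]
    X, Y = np.meshgrid(c * t, np.sqrt(1.0 - c**2) * t, indexing="ij")
    W = np.outer(w, w) / 4.0                          # weights normalized to sum to 1
    phase = 2.0 * np.pi * sigma_waves * poly(X, Y)
    return abs(np.sum(W * np.exp(1j * phase))) ** 2

c = 0.8                                               # aspect ratio 0.6/0.8 = 0.75

def R2(x, y):
    return (np.sqrt(3.0) / c) * x                     # about 2.1651 x

def R3(x, y):
    return (np.sqrt(3.0) / np.sqrt(1.0 - c**2)) * y   # about 2.8868 y

S_R2 = strehl_ratio(R2, c, 0.1)
S_R3 = strehl_ratio(R3, c, 0.1)
S_approx = np.exp(-(0.2 * np.pi) ** 2)                # small-aberration estimate
print(f"R2: {S_R2:.3f}  R3: {S_R3:.3f}  exp(-sigma^2): {S_approx:.3f}")
```

For the two tilt polynomials the result agrees with the tabulated value of 0.663 to three decimal places, and the small-aberration estimate is close to 0.67.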
Table 9-8. Strehl ratio S for rectangular polynomial aberrations for c = 0.8
corresponding to an aspect ratio of 0.75 for a sigma value of 0.1 wave.
R1 1 R16 0.704 R31 0.702
R2 0.663 R17 0.715 R32 0.691
R3 0.663 R18 0.678 R33 0.688
R4 0.669 R19 0.685 R34 0.683
R5 0.676 R20 0.687 R35 0.685
R6 0.669 R21 0.681 R36 0.691
R7 0.688 R22 0.718 R37 0.723
R8 0.679 R23 0.719 R38 0.703
R9 0.673 R24 0.688 R39 0.722
R10 0.678 R25 0.691 R40 0.679
R11 0.700 R26 0.682 R41 0.705
R12 0.674 R27 0.688 R42 0.690
R13 0.701 R28 0.684 R43 0.691
R14 0.680 R29 0.724 R44 0.687
R15 0.683 R30 0.718 R45 0.691

Figure 9-7. Strehl ratio S for rectangular polynomial aberrations for c = 0.8
corresponding to an aspect ratio of 0.75 for a sigma value of 0.1 wave.
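9.7 SEIDEL ABERRATIONS AND THEIR STANDARD DEVIATIONS
In this section, we consider the Seidel aberrations and determine their sigma values across a
rectangular pupil with and without balancing.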
9.7.1 Defocus
We start with the defocus aberration
$$ W_d(\rho) = A_d\,\rho^2 . \tag{9-24} $$
From the form of the orthonormal defocus polynomial $R_4$ given in Table 9-2, it is
evident that its sigma value across a rectangular pupil is given by
$$ \sigma_d = \frac{2\gamma}{3\sqrt{5}}\,A_d , \tag{9-25} $$
where
$$ \gamma = \left(1 - 2c^2 + 2c^4\right)^{1/2} . \tag{9-26} $$
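As a brief check of Eq. (9-25), the variance of $\rho^2 = x^2 + y^2$ can be computed directly. This sketch assumes a uniformly illuminated rectangle with half-widths $c$ along $x$ and $\sqrt{1-c^2}$ along $y$, and uses only the elementary moments of a uniform distribution; it is not quoted from the text.

```latex
\begin{align*}
\langle x^2\rangle &= \frac{c^2}{3}, \qquad \langle x^4\rangle = \frac{c^4}{5}
  \;\Rightarrow\; \operatorname{var}(x^2) = \frac{c^4}{5}-\frac{c^4}{9} = \frac{4c^4}{45},\\
\operatorname{var}(y^2) &= \frac{4\,(1-c^2)^2}{45},\\
\sigma_d^2 &= A_d^2\left[\operatorname{var}(x^2)+\operatorname{var}(y^2)\right]
  = \frac{4A_d^2}{45}\left[c^4+(1-c^2)^2\right]
  = \frac{4A_d^2}{45}\left(1-2c^2+2c^4\right)
  = \frac{4\gamma^2}{45}\,A_d^2 .
\end{align*}
```

The two variances add because $x$ and $y$ are independent over a rectangle, and taking the square root gives $\sigma_d = 2\gamma A_d/3\sqrt{5}$.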
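9.7.2 Astigmatism
Next, we consider Seidel astigmatism
$$ W_a(\rho,\theta) = A_a\,\rho^2\cos^2\theta . \tag{9-27} $$
The orthonormal polynomial $R_6$ can be written
$$ R_6 = \frac{3\sqrt{5}}{4c^2\left(1-c^2\right)\gamma}\left[\gamma^2\rho^2\cos 2\theta + \left(1-2c^2\right)\rho^2\right] + \text{constant} \tag{9-28a} $$
$$ \phantom{R_6} = \frac{3\sqrt{5}\,\gamma}{2c^2\left(1-c^2\right)}\left(\rho^2\cos^2\theta - \frac{c^4}{\gamma^2}\rho^2\right) + \text{constant} , \tag{9-28b} $$
showing that the relative amount of defocus $\rho^2$ that balances Seidel astigmatism
$\rho^2\cos^2\theta$ is $-c^4/\gamma^2$. It is evident that the balanced astigmatism is given by
$$ W_{ba}(\rho,\theta) = A_a\left(\rho^2\cos^2\theta - \frac{c^4}{\gamma^2}\rho^2\right) . \tag{9-29} $$
Its sigma value is given by
$$ \sigma_{ba} = \frac{2c^2\left(1-c^2\right)}{3\sqrt{5}\,\gamma}\,A_a . \tag{9-30} $$
The Seidel astigmatism can be written in terms of the orthonormal polynomials as
$$ W_a(\rho,\theta) = \frac{2A_a}{3\sqrt{5}\,\gamma}\left[c^2\left(1-c^2\right)R_6 + c^4 R_4\right] + \text{constant} . \tag{9-31} $$
Adding the squares of the coefficients of $R_6$ and $R_4$, we obtain its sigma value
$$ \sigma_a = \frac{2c^2}{3\sqrt{5}}\,A_a . \tag{9-32} $$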
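9.7.3 Coma
Now, we consider Seidel coma
$$ W_c(\rho,\theta) = A_c\,\rho^3\cos\theta . \tag{9-33} $$
The orthonormal polynomial $R_8$ can be written
$$ R_8 = \frac{1}{2c}\left(\frac{21}{35 - 70c^2 + 62c^4}\right)^{1/2}\left[15\rho^3\cos\theta - \left(5+4c^2\right)\rho\cos\theta\right] , \tag{9-34} $$
showing that the relative amount of tilt $\rho\cos\theta$ that balances Seidel coma
$\rho^3\cos\theta$ is $-\left(5+4c^2\right)/15$ compared to $-2/3$ for a circular pupil. The sigma value
of the balanced coma is given by
$$ \sigma_{bc} = \frac{2c}{15}\left(\frac{35 - 70c^2 + 62c^4}{21}\right)^{1/2} A_c . \tag{9-35} $$
The Seidel coma can be written in terms of the orthonormal polynomials as
$$ W_c(\rho,\theta) = \frac{A_c}{15}\left[2c\left(\frac{35 - 70c^2 + 62c^4}{21}\right)^{1/2} R_8 + \frac{c\left(5+4c^2\right)}{\sqrt{3}}\,R_2\right] . \tag{9-36} $$
Utilizing Eq. (9-22), we obtain the sigma value
$$ \sigma_c = c\left(\frac{7 + 8c^4}{105}\right)^{1/2} A_c . \tag{9-37} $$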

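9.7.4 Spherical Aberration
Finally, we consider Seidel spherical aberration
$$ W_s(\rho) = A_s\,\rho^4 . \tag{9-38} $$
The orthonormal polynomial $R_{11}$ can be written
$$ R_{11} = \frac{1}{8\mu}\left[315\rho^4 + 30\left(1-2c^2\right)\rho^2\cos 2\theta - 240\rho^2 + \text{constant}\right] \tag{9-39a} $$
$$ \phantom{R_{11}} = \frac{1}{8\mu}\left[315\rho^4 + 60\left(1-2c^2\right)\rho^2\cos^2\theta - 30\left(9-2c^2\right)\rho^2 + \text{constant}\right] . \tag{9-39b} $$
Hence, the balanced spherical aberration is given by
$$ W_{bs}(\rho,\theta) = A_s\left[\rho^4 + \frac{6}{63}\left(1-2c^2\right)\rho^2\cos 2\theta - \frac{16}{21}\rho^2\right] \tag{9-40a} $$
$$ \phantom{W_{bs}(\rho,\theta)} = A_s\left[\rho^4 + \frac{12}{63}\left(1-2c^2\right)\rho^2\cos^2\theta - \frac{6}{63}\left(9-2c^2\right)\rho^2\right] . \tag{9-40b} $$
It shows, as in the case of an elliptical pupil, that spherical aberration is balanced not only
by defocus but astigmatism as well. Its sigma value is given by
$$ \sigma_{bs} = \frac{8\mu}{315}\,A_s . \tag{9-41} $$
The Seidel spherical aberration can be written in terms of the orthonormal polynomials as
$$ W_s(\rho) = \frac{A_s}{315}\left[8\mu R_{11} - \frac{40c^2\left(1-c^2\right)\left(1-2c^2\right)}{\sqrt{5}\,\gamma}R_6 + \frac{20\left(9 - 20c^2 + 20c^4\right)}{\sqrt{5}\,\gamma}R_4\right] + \text{constant} . \tag{9-42} $$
Its sigma value is given by
$$ \sigma_s = \frac{4A_s}{45\sqrt{7}}\left(63 - 162c^2 + 206c^4 - 88c^6 + 44c^8\right)^{1/2} . \tag{9-43} $$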
The sigma values of Seidel aberrations with and without balancing are given in Table 9-9.
Table 9-9. Sigma of a Seidel aberration with and without balancing, where $A_i$ is the
coefficient of an aberration.
Aberration                      Sigma
Defocus                         $\sigma_d = \left(2\gamma/3\sqrt{5}\right)A_d$
Astigmatism                     $\sigma_a = \left(2c^2/3\sqrt{5}\right)A_a$
Balanced astigmatism            $\sigma_{ba} = \left[2c^2\left(1-c^2\right)/3\sqrt{5}\,\gamma\right]A_a$
Coma                            $\sigma_c = c\left[\left(7+8c^4\right)/105\right]^{1/2}A_c$
Balanced coma                   $\sigma_{bc} = \left(2c/15\sqrt{21}\right)\left(35 - 70c^2 + 62c^4\right)^{1/2}A_c$
Spherical aberration            $\sigma_s = \left(4A_s/45\sqrt{7}\right)\left(63 - 162c^2 + 206c^4 - 88c^6 + 44c^8\right)^{1/2}$
Balanced spherical aberration   $\sigma_{bs} = \left(8\mu/315\right)A_s$

Figures 9-8 and 9-9 show the variation of sigma for a rectangular pupil as a function
of its half-width c along the x axis. It is evident from Figure 9-8 that defocus and spherical
sigmas have a minimum for a square pupil (i.e., for $c = 1/\sqrt{2}$), but coma and astigmatism
sigmas increase monotonically as c increases from a value of zero, representing a slit
pupil along the y axis, to a value of 1, representing a slit pupil along the x axis. The
balanced spherical sigma in Figure 9-9 has a minimum for a square pupil though its
variation is relatively small. The sigma for balanced astigmatism has a distinct maximum
for a square pupil, while the monotonically increasing sigma for balanced coma has a
point of inflection.
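The closed-form sigma values can be cross-checked against direct numerical integration over the rectangle. The sketch below is only an illustration under stated assumptions. It takes the pupil as uniformly illuminated with half-widths c and √(1 − c²), uses Gauss-Legendre quadrature (exact for these polynomial integrands), and sets each aberration coefficient A_i to unity. The balanced spherical aberration is omitted because its constant μ is not defined in this section. All function names are hypothetical.

```python
import numpy as np

def closed_form_sigmas(c):
    """Sigma per unit aberration coefficient from the closed-form expressions."""
    g = np.sqrt(1 - 2*c**2 + 2*c**4)
    return {
        "defocus": 2*g / (3*np.sqrt(5)),
        "astigmatism": 2*c**2 / (3*np.sqrt(5)),
        "balanced astigmatism": 2*c**2*(1 - c**2) / (3*np.sqrt(5)*g),
        "coma": c*np.sqrt((7 + 8*c**4) / 105),
        "balanced coma": (2*c/15)*np.sqrt((35 - 70*c**2 + 62*c**4) / 21),
        "spherical": 4/(45*np.sqrt(7))
                     * np.sqrt(63 - 162*c**2 + 206*c**4 - 88*c**6 + 44*c**8),
    }

def numerical_sigmas(c, n=16):
    """Standard deviation of each aberration over the rectangle by quadrature."""
    g2 = 1 - 2*c**2 + 2*c**4
    t, w = np.polynomial.legendre.leggauss(n)
    X, Y = np.meshgrid(c*t, np.sqrt(1 - c**2)*t, indexing="ij")
    W = np.outer(w, w) / 4.0                  # weights normalized to sum to 1
    rho2 = X**2 + Y**2
    aberrations = {
        "defocus": rho2,
        "astigmatism": X**2,                  # rho^2 cos^2(theta) = x^2
        "balanced astigmatism": X**2 - (c**4/g2)*rho2,
        "coma": rho2*X,                       # rho^3 cos(theta) = rho^2 x
        "balanced coma": rho2*X - ((5 + 4*c**2)/15)*X,
        "spherical": rho2**2,
    }
    out = {}
    for name, F in aberrations.items():
        mean = np.sum(W*F)
        out[name] = np.sqrt(max(np.sum(W*F**2) - mean**2, 0.0))
    return out

closed = closed_form_sigmas(0.8)
numeric = numerical_sigmas(0.8)
for name in closed:
    print(f"{name:22s} closed = {closed[name]:.6f}  numerical = {numeric[name]:.6f}")

# Locate the minimum of the defocus sigma on a grid of c values.
c_grid = np.linspace(0.05, 0.95, 1801)
c_min_defocus = c_grid[np.argmin([closed_form_sigmas(ci)["defocus"] for ci in c_grid])]
```

On the grid used here, the defocus sigma is smallest near c = 1/√2, in line with the behavior described for Figure 9-8.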
Figure 9-8. Variation of sigma of a primary or Seidel aberration as a function of
half-width c of a unit rectangular pupil.
Figure 9-9. Variation of sigma of a balanced primary aberration as a function of
half-width c of a unit rectangular pupil.
9.8 SUMMARY
The aberration-free PSF and OTF are discussed in Section 9.3. The polynomials
orthonormal over a unit rectangular pupil, representing balanced aberrations over such a
pupil are given through the fourth order in Tables 9-1 through 9-3 in terms of the circle
polynomials, in polar coordinates, and in Cartesian coordinates, respectively. Each
orthonormal polynomial consists of either the cosine or the sine terms, but not both. Thus
an even j polynomial, for example, consists of only the cosine terms, as may be seen from
Table 9-2. This is a consequence of the biaxial symmetry of the pupil. Since the
polynomials are not separable in the polar coordinates $\rho$ and $\theta$ of a pupil point,
numbering them with two indices n and m loses significance, and they must be
numbered with a single index j. They are ordered in the same manner as the polynomials
discussed in previous chapters.
As in the case of elliptical polynomials, only the first 15 rectangular polynomials are
given in the tables. The expressions for the higher-order polynomials are very long unless
the aspect ratio of the pupil is specified. The polynomial $R_6$ for astigmatism is a linear
combination of $Z_6$, $Z_4$, and $Z_1$, showing that the balancing defocus for (zero-degree)
Seidel astigmatism is different for a rectangular pupil compared to that, for example, for a
circular pupil. Moreover, $R_{11}$ is a linear combination of $Z_{11}$, $Z_6$, $Z_4$, and $Z_1$. Thus,
spherical aberration $\rho^4$ is balanced with not only defocus $\rho^2$ but astigmatism $\rho^2\cos^2\theta$
as well. The balanced aberration is evidently not radially symmetric. As expected, the rectangular polynomials
reduce to the square polynomials (discussed in the next chapter) as $c \to 1/\sqrt{2}$, i.e., as the
unit rectangle approaches a unit square.
The first 45 rectangular polynomials, i.e., up to and including the eighth order, for a
rectangular pupil with an aspect ratio of 0.75 are given in Tables 9-4 through 9-6 in
terms of Zernike circle polynomials, in polar coordinates, and in Cartesian coordinates,
respectively. They are illustrated in three different but equivalent ways in Figure 9-6 with
the isometric plot, interferogram, and the PSF for a sigma value of one wave. The
peak-to-valley aberration numbers (in units of wavelength) are given in Table 9-7. The Strehl
ratio for a sigma value of 0.1 wave is given in Table 9-8 and plotted in Figure 9-7. The
Seidel aberrations are discussed in Section 9.7, and their sigma values with and without
balancing are given in Table 9-9.
References
1. K. N. LaFortune, R. L. Hurd, S. N. Fochs, M. D. Rotter, P. H. Pax, R. L. Combs,

S. S. Olivier, J. M. Brase, and R. M. Yamamoto, “Technical challenges for the
future of high energy lasers,” Proc. SPIE 6454, 1–11 (2007).

analytical solution,” J. Opt. Soc. Am. A 24, 2994–3016 (2007). Errata: J. Opt. Soc.
Am. A 29, 1673–1674 (2012).

11.41 (McGraw–Hill, 2009).
4. J. Rayces, “Least-squares fitting of orthogonal polynomials to the wave-aberration
function,” Appl. Opt. 31, 2223–2228 (1992).
CHAPTER 10
SYSTEMS WITH SQUARE PUPILS
10.1 Introduction
10.2 Pupil Function
10.3.1 PSF
10.3.2 OTF
10.4 Square Polynomials
10.5 Square Coefficients of a Square Aberration Function
10.6 Isometric, Interferometric, and Imaging Characteristics of Square Polynomial Aberrations
10.7 Seidel Aberrations, Standard Deviation, and Strehl Ratio
10.7.1 Defocus
10.7.2 Astigmatism
10.7.3 Coma
10.7.5 Strehl Ratio
10.8 Summary
References
Chapter 10
Systems with Square Pupils
10.1 INTRODUCTION
We start this chapter with a brief discussion of the aberration-free PSF and OTF for a
system with a square pupil, as, for example, a high-power laser beam with a square cross-
section. We can obtain these results as a special case of the rectangular pupils discussed
in the last chapter. Similarly, the square polynomials $S_k$ can be obtained as a special case
of the rectangular polynomials $R_k$ discussed there, i.e., by letting $c = 1/\sqrt{2}$. However,
we describe the procedure for obtaining them independently [1,2], and give expressions
for the first 45 polynomials, i.e., up to and including the eighth order. The isometric,
interferometric, and PSF plots of these polynomial aberrations with a sigma value of one
wave are given along with their P-V numbers. The Strehl ratios for these polynomial
aberrations for a sigma value of one-tenth of a wave are also given. Finally, we discuss
how to obtain the standard deviation of a Seidel aberration with and without balancing
and then discuss the Strehl ratio as a function of it.
Orthogonal square polynomials were also obtained by Bray by orthogonalizing the
circle polynomials, but he chose a circle inscribed inside a square instead of the other way
around [3]. Thus, his square with a full width of unity has regions that fall outside the unit
circle. Defining a unit square as we have, where its semidiagonal is unity, has the
advantage that the coefficient of a term in a certain polynomial represents its peak value.
For example, since $\rho$ has a maximum value of unity, the coefficients of astigmatism
$\rho^2\cos^2\theta$ in $S_6$, or coma $\rho^3\cos\theta$ in $S_8$, or spherical aberration $\rho^4$ in $S_{11}$ represent
their peak values.
As in the case of rectangular polynomials, products of the x- and y-Legendre
polynomials, which are orthogonal over a square pupil, are not suitable for the analysis of
square wavefronts [4], because they do not represent classical or balanced aberrations.
For example, defocus is represented by a term in $x^2 + y^2$. While it can be expanded in
terms of a complete set of Legendre polynomials, it cannot be represented by a single 2D
Legendre polynomial (i.e., as a product of x- and y-Legendre polynomials). The same
difficulty holds for spherical aberration and coma, etc. However, products of Legendre
polynomials are the correct polynomials for an anamorphic system, as discussed in
Chapter 13.
10.2 PUPIL FUNCTION

As illustrated in Figure 10-1, consider an optical system with a square exit pupil of
half-width a and area $S_{ex} = 4a^2$ lying in the $(x_p, y_p)$ plane with the z axis as its optical axis.
For a uniformly illuminated pupil with an aberration function $\Phi(x_p, y_p)$ and power $P_{ex}$,
the pupil function can be written
$$ P(x_p, y_p) = A(x_p, y_p)\exp\left[i\Phi(x_p, y_p)\right] , \tag{10-1} $$
where
$$ A(x_p, y_p) = \left(P_{ex}/S_{ex}\right)^{1/2} , \quad -a \le x_p \le a , \quad -a \le y_p \le a . \tag{10-2} $$
Figure 10-1. Square pupil of half-width a.

10.3.1 PSF
From Eq. (2-9), the aberrated PSF at a point $(x_i, y_i)$ in the image plane of a system
with a uniformly illuminated square exit pupil, normalized by its aberration-free
central value $P_{ex}S_{ex}/\lambda^2R^2$, can be written
$$ I(x_i, y_i) = \frac{1}{S_{ex}^2}\left|\int_{-a}^{a}\int_{-a}^{a}\exp\left[i\Phi(x_p, y_p)\right]\exp\left[-\frac{2\pi i}{\lambda R}\left(x_i x_p + y_i y_p\right)\right]dx_p\,dy_p\right|^2 . \tag{10-3} $$
Substituting
$$ (x', y') = a^{-1}(x_p, y_p) \tag{10-4} $$
and
$$ (x, y) = \frac{1}{\lambda F}(x_i, y_i) \tag{10-5} $$
into Eq. (10-3), where
$$ F = R/2a \tag{10-6} $$
is the focal ratio of the image-forming beam along the x and the y axes, we obtain the
irradiance distribution
$$ I(x, y) = \frac{1}{16}\left|\int_{-1}^{1}\int_{-1}^{1}\exp\left[i\Phi(x', y')\right]\exp\left[-\pi i\left(xx' + yy'\right)\right]dx'\,dy'\right|^2 . \tag{10-7} $$
Accordingly, the aberration-free distribution is given by
$$ I(x, y) = \frac{1}{16}\left|\int_{-1}^{1}\int_{-1}^{1}\exp\left[-\pi i\left(xx' + yy'\right)\right]dx'\,dy'\right|^2 = \left(\frac{\sin\pi x}{\pi x}\right)^2\left(\frac{\sin\pi y}{\pi y}\right)^2 . \tag{10-8} $$
Figure 10-2a shows the 2D PSF, in particular, the central bright square spot of size
$2 \times 2$, with each dimension in units of $\lambda F$. The PSF is zero wherever x and/or y is a
positive or a negative integer. Moreover, there are rectangular spots along the x and y
axes, but square spots elsewhere in the PSF. Figure 10-2b shows the irradiance
distribution along the x and y axes, and along the diagonal of the central bright spot as
$I(x, 0)$, $I(0, y)$, and $I(x, x) \equiv I(r)$, where $r = \left(x^2 + y^2\right)^{1/2} = \sqrt{2}\,x$ and
$$ I(r) = \left[\frac{\sin\left(\pi r/\sqrt{2}\right)}{\pi r/\sqrt{2}}\right]^4 . \tag{10-9} $$
The irradiance along the diagonal is zero at integral multiples of $\sqrt{2}$.
[Figure 10-2: (a) 2D PSF; (b) plot of I(x, 0), I(0, y), and I(r) versus x, y, or r from 0 to 3, with the irradiance axis from 0 to 1]
Figure 10-2. (a) 2D aberration-free PSF. (b) Irradiance distribution along the x and
y axes, and along the diagonal of the central bright spot of the PSF.
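The aberration-free PSF of Eq. (10-8) and its diagonal section of Eq. (10-9) are simple to evaluate. A minimal sketch follows, assuming NumPy, whose `np.sinc(t)` is defined as sin(πt)/(πt) and so matches the factors in these equations directly. The function names are hypothetical.

```python
import numpy as np

def psf_square(x, y):
    """Aberration-free PSF of a square pupil; x and y are in units of lambda*F."""
    return np.sinc(x)**2 * np.sinc(y)**2

def psf_diagonal(r):
    """PSF along the diagonal, with r = sqrt(2)*x measured from the center."""
    return np.sinc(r / np.sqrt(2))**4

x = np.linspace(0.0, 3.0, 7)                # 0, 0.5, ..., 3
axis_values = psf_square(x, 0.0)            # I(x, 0)
diag_values = psf_diagonal(np.sqrt(2) * x)  # I(x, x) expressed through r
for xi, ia, idg in zip(x, axis_values, diag_values):
    print(f"x = {xi:.1f}   I(x,0) = {ia:.4e}   I(x,x) = {idg:.4e}")
```

The diagonal values are the squares of the axial ones, which is why the sidelobes along the diagonal are much weaker than those along the axes.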
10.3.2 OTF
From Eq. (1-13), the aberration-free OTF of a system with a square pupil at a spatial
frequency $(\xi, \eta)$ is given by the fractional area of overlap of two squares centered at
$(0, 0)$ and $\lambda R(\xi, \eta)$, as shown in Figure 10-3. The overlap area is given by
$$ S(\xi, \eta) = \left(2a - \lambda R\xi\right)\left(2a - \lambda R\eta\right) = 4a^2\left(1 - \frac{\xi}{1/\lambda F}\right)\left(1 - \frac{\eta}{1/\lambda F}\right) . \tag{10-10} $$
Hence, the fractional area of overlap, or the OTF of the system, may be written
$$ \tau(v_x, v_y) = \left(1 - v_x\right)\left(1 - v_y\right) , \tag{10-11} $$
where
$$ (v_x, v_y) = \left(\frac{\xi}{1/\lambda F}, \frac{\eta}{1/\lambda F}\right) \tag{10-12} $$
are the spatial frequency components in units of the cutoff frequency $1/\lambda F$ along the x
or the y axis. The OTF $\tau(v_x, 0)$ along the x axis is the same as the OTF $\tau(0, v_y)$ along
the y axis, with the same normalized cutoff frequency of unity.
[Figure 10-3: two overlapping square pupils of half-width a in the $(x_p, y_p)$ plane, centered at O and O′]
Figure 10-3. Overlap area of two square pupils centered at $(0, 0)$ and $\lambda R(\xi, \eta)$.
The OTF $\tau(v)$, where $v = \left(v_x^2 + v_y^2\right)^{1/2}$, along the diagonal of the pupil can be
obtained from Eq. (10-11) by letting $v_x = v_y = v/\sqrt{2}$. Thus
$$ \tau(v) = \left(1 - \frac{v}{\sqrt{2}}\right)^2 . \tag{10-13} $$
Its cutoff frequency is $\sqrt{2}$.
Figure 10-4 shows the OTF $\tau(v_x, 0)$, $\tau(0, v_y)$, and $\tau(v)$ along the x and y axes, and
along the diagonal of the pupil with cutoff frequencies 1, 1, and $\sqrt{2}$, respectively, each in
units of $1/\lambda F$. Of course, $\tau(v_x, 0) = \tau(0, v_y)$ for any $v_x = v_y$. The OTF $\tau(v) < \tau(v_x, 0)$ for
any frequency lying in the range $0 < v = v_x < 2\left(\sqrt{2} - 1\right)$. They are equal to each other at
the frequency $2\left(\sqrt{2} - 1\right)$ (or about 0.83), and $\tau(v) > \tau(v_x, 0)$ for frequencies in the range
$2\left(\sqrt{2} - 1\right) < v = v_x < \sqrt{2}$. Of course, $\tau(v_x, 0)$ is zero for $v_x \ge 1$, but $\tau(v)$ is not until
$v = \sqrt{2}$.
[Figure 10-4: plot of $\tau(v_x, 0)$, $\tau(0, v_y)$, and $\tau(v)$ versus $v_x$, $v_y$, or $v$ from 0 to 1.5, with the OTF axis from 0 to 1]
Figure 10-4. Aberration-free OTF of a system with a square pupil, where $v_x$, $v_y$,
and $v$ are in units of the cutoff frequency $1/\lambda F$ along the x axis.
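The comparison between the axial and diagonal OTFs can be checked with a few lines of code. This is a minimal sketch using only the Python standard library. Clipping to zero beyond the cutoff and the use of |v| are assumptions added here so that the functions are defined for all frequencies; the function names are hypothetical.

```python
import math

def otf_axis(v):
    """OTF along the x (or y) axis; v is in units of the axial cutoff 1/(lambda*F)."""
    return max(0.0, 1.0 - abs(v))

def otf_diagonal(v):
    """OTF along the pupil diagonal, with v = sqrt(vx^2 + vy^2) and vx = vy."""
    return max(0.0, 1.0 - abs(v) / math.sqrt(2)) ** 2

v_cross = 2.0 * (math.sqrt(2) - 1.0)   # frequency at which the two curves meet

for v in (0.25, 0.5, v_cross, 0.9, 1.0, 1.2, math.sqrt(2)):
    print(f"v = {v:.4f}   axis = {otf_axis(v):.4f}   diagonal = {otf_diagonal(v):.4f}")
```

Below the crossover frequency the axial OTF is the larger of the two, and above it the diagonal OTF is larger and stays nonzero up to its own cutoff.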
10.4 SQUARE POLYNOMIALS

Figure 10-5 shows a unit square inscribed inside a unit circle. The distance of a
corner point of the square, such as A, from its center O is unity, but each of its sides has a
length of $\sqrt{2}$, and its area is 2.
The orthonormal square polynomials $S_j(x, y)$ obtained by orthogonalizing the
Zernike circle polynomials $Z_j(x, y)$ over a unit square are given by [see Eq. (3-18)]
$$ S_{j+1} = N_{j+1}\left[Z_{j+1} - \sum_{k=1}^{j}\left\langle Z_{j+1}S_k\right\rangle S_k\right] , \tag{10-14} $$
where $N_{j+1}$ is a normalization constant so that the polynomials are orthonormal over the
unit square, i.e., they satisfy the orthonormality condition
$$ \frac{1}{2}\int_{-1/\sqrt{2}}^{1/\sqrt{2}}dy\int_{-1/\sqrt{2}}^{1/\sqrt{2}}S_j S_{j'}\,dx = \delta_{jj'} . \tag{10-15} $$
The angular brackets indicate a mean value over the square pupil. Thus, for example,
$$ \left\langle Z_j S_k\right\rangle = \frac{1}{2}\int_{-1/\sqrt{2}}^{1/\sqrt{2}}dy\int_{-1/\sqrt{2}}^{1/\sqrt{2}}Z_j S_k\,dx . \tag{10-16} $$
If the integrand is an odd function of x and/or y, the mean value is zero because of the
symmetric limits of integration. If the integrand is an even function of both x and y, then we
may replace the lower limits of integration by zero and multiply the double integral by 4.
The orthonormal square polynomials up to and including the eighth order, i.e., the
first 45 polynomials, in terms of the Zernike circle polynomials are given in Table 10-1.
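The recursive orthogonalization of Eq. (10-14) can be illustrated numerically for the first six polynomials. The sketch below is not the author's procedure for generating the table. It assumes the standard Cartesian forms of the first six Zernike circle polynomials (Z1 = 1, Z2 = 2x, Z3 = 2y, Z4 = √3(2ρ² − 1), Z5 = 2√6 xy, Z6 = √6(x² − y²)) and uses Gauss-Legendre quadrature for the mean values over the unit square. All function names are hypothetical.

```python
import numpy as np

def zernikes(x, y):
    """First six Zernike circle polynomials Z1..Z6 in Cartesian coordinates."""
    r2 = x**2 + y**2
    return np.stack([
        np.ones_like(x),
        2*x,
        2*y,
        np.sqrt(3)*(2*r2 - 1),
        2*np.sqrt(6)*x*y,
        np.sqrt(6)*(x**2 - y**2),
    ])

def gram_matrix(n=12):
    """G[i, j] = <Z_i Z_j>, the mean value over the unit square of half-width 1/sqrt(2)."""
    t, w = np.polynomial.legendre.leggauss(n)
    h = 1/np.sqrt(2)
    X, Y = np.meshgrid(h*t, h*t, indexing="ij")
    W = np.outer(w, w) / 4.0                 # weights normalized to sum to 1
    Z = zernikes(X, Y)
    return np.einsum("imn,jmn,mn->ij", Z, Z, W)

def orthonormalize(G):
    """Gram-Schmidt in coefficient space: row j of M gives S_{j+1} = sum_i M[j, i] Z_{i+1}."""
    m = G.shape[0]
    M = np.zeros((m, m))
    for j in range(m):
        v = np.zeros(m)
        v[j] = 1.0                           # start from Z_{j+1}
        for k in range(j):
            v -= (M[k] @ G[:, j]) * M[k]     # subtract <Z_{j+1} S_k> S_k
        M[j] = v / np.sqrt(v @ G @ v)        # normalization constant N_{j+1}
    return M

G = gram_matrix()
M = orthonormalize(G)
np.set_printoptions(precision=4, suppress=True)
print(M)
```

Each row of `M` lists the Zernike coefficients of one square polynomial. The S2, S4, and S6 rows reproduce the coefficients √(3/2), ½√(5/2) and ½√(15/2), and ½√15, respectively.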
[Figure 10-5: unit square ABCD centered at O with corners $A(1/\sqrt{2}, 1/\sqrt{2})$, $B(1/\sqrt{2}, -1/\sqrt{2})$, $C(-1/\sqrt{2}, -1/\sqrt{2})$, and $D(-1/\sqrt{2}, 1/\sqrt{2})$]
Figure 10-5. Unit square of half-width $1/\sqrt{2}$ inscribed inside a unit circle. Its corner
points, such as A, lie at a distance of unity from its center.
Table 10-1. Orthonormal square polynomials $S_j(\rho, \theta)$ in terms of the Zernike circle
polynomials $Z_j(\rho, \theta)$.
S1 = Z1
S2 = $\sqrt{3/2}$ Z2
S3 = $\sqrt{3/2}$ Z3
S4 = ($\sqrt{5/2}$/2)Z1 + ($\sqrt{15/2}$/2)Z4
S5 = $\sqrt{3/2}$ Z5
S6 = ($\sqrt{15}$/2)Z6
S7 = (3$\sqrt{21/31}$/2)Z3 + (5$\sqrt{21/62}$/2)Z7
S8 = (3$\sqrt{21/31}$/2)Z2 + (5$\sqrt{21/62}$/2)Z8
S9 = −(7$\sqrt{5/31}$/2)Z3 − (13$\sqrt{5/62}$/4)Z7 + ($\sqrt{155/2}$/4)Z9
S10 = (7$\sqrt{5/31}$/2)Z2 + (13$\sqrt{5/62}$/4)Z8 + ($\sqrt{155/2}$/4)Z10
S11 = (8/$\sqrt{67}$)Z1 + (25$\sqrt{3/67}$/4)Z4 + (21$\sqrt{5/67}$/4)Z11
S12 = (45$\sqrt{3}$/16)Z6 + (21$\sqrt{5}$/16)Z12
S13 = (3$\sqrt{7}$/8)Z5 + ($\sqrt{105}$/8)Z13
S14 = [261/(8$\sqrt{134}$)]Z1 + (345$\sqrt{3/134}$/16)Z4 + (129$\sqrt{5/134}$/16)Z11 + (3$\sqrt{335}$/16)Z14
S15 = ($\sqrt{105}$/4)Z15

S16 = 1.71440511Z2 +1.71491497Z8 + 0.65048499Z10 + 1.52093102Z16
S17 = 1.71440511Z3 + 1.71491497Z7 0.65048449Z9 + 1.52093102Z17
S18 = 4.10471345Z2 + 3.45884077Z8 + 5.34411808Z10 + 1.51830574Z16 + 2.80808005Z18
S19 = 4.10471345Z3 3.45884078Z7 + 5.34411808Z9 1.51830575Z17 + 2.80808005Z19
S20 = 5.57146696Z2 + 4.44429264Z8 + 3.00807599Z10 + 1.70525179Z16 +1.16777987Z18 + 4.19716701Z20
S21 = 5.57146696Z3 + 4.44429264Z7 3.00807599Z9 + 1.70525179Z17 1.16777988Z19 + 4.19716701Z21
S22 = 1.33159935Z1 + 1.94695912Z4 + 1.74012467Z11 + 0.65624211Z14 + 1.50989174Z22
S23 = 0.95479991Z5 + 1.01511643Z13 + 1.28689496Z23
S24 = 9.87992565Z6 + 7.28853095Z12 + 3.38796312Z24
S25 = 5.61978925Z15 + 2.84975327Z25
S26 = 11.00650275Z1 + 14.00366597Z4 + 9.22698484Z11 + 13.55765720Z14
+ 3.18799971Z22 + 5.11045000Z26
S27 = 4.24396143Z5 + 2.70990074Z13 + 0.84615108Z23 + 5.17855026Z27
S28 = 17.58672314Z6 + 11.15913268Z12 + 3.57668869Z24 + 6.44185987Z28

S29 = 2.42764289Z3 + 2.69721906Z7 1.56598064Z9 + 2.12208902Z17
0.93135653Z19 + 0.25252773Z21 + 1.59017528Z29
S30 = 2.42764289Z2 + 2.69721906Z8 + 1.56598064Z10 + 2.12208902Z16

+ 0.93135653Z18 + 0.25252773Z20 + 1.59017528Z30
S31 9.10300982Z3 8.79978208Z7 + 10.69381427Z9 5.37383385Z17

+ 7.01044701Z19 1.26347272Z21 1.90131756Z29 + 3.07960207Z31
S32 9.10300982Z2 + 8.79978208Z8 + 10.69381427Z10 +5.37383385Z16
+ 7.01044701Z18 + 1.26347272Z20 + 1.90131756Z30 + 3.07960207Z32
S33 21.39630883Z3 + 19.76696884Z7 12.70550260Z9 + 11.05819453Z17
7.02178756Z19 +15.80286172Z21 + 3.29259996Z29 2.07602718Z31
+ 5.40902889Z33
S34 21.39630883Z2 + 19.76696884Z8 + 12.70550260Z10 + 11.05819453Z16
+ 7.02178756Z18 +15.80286172Z20 + 3.29259996Z30 + 2.07602718Z32
+ 5.40902889Z34
S35 16.54454462Z3 14.89205549Z7 + 22.18054997Z9 7.94524849Z17
+ 11.85458952Z19 6.18963457Z21 2.19431441Z29 +3.24324400Z31
1.72001172Z33 + 8.16384008Z35
S36 16.54454462Z2 + 14.89205549Z8 + 22.18054997Z10 + 7.94524849Z16
+ 11.85458952Z18 + 6.18963457Z20 + 2.19431441Z30 +3.24324400Z32
+ 1.72001172Z34 + 8.16384008Z36
S37 1.75238960Z1 + 2.72870567Z4 + 2.76530671Z11 + 1.43647360Z14
+ 2.12459170Z22 + 0.92450043Z26 + 1.58545010Z37
S38 19.24848143Z6 + 16.41468913Z12 + 9.76776798Z24 + 1.47438007Z28
+ 3.83118509Z38
S39 0.46604820Z5 + 0.84124290Z13 + 1.00986774Z23 0.42520747Z27 + 1.30579570Z39
S40 28.18104531Z1 + 38.52219208Z4 + 30.18363661Z11 + 36.44278147Z14 +
15.52577202Z22 + 19.21524879Z26 + 4.44731721Z37 + 6.00189814Z40
S41 (369/4) 35 3574 Z15 + [11781/(32 3574 )]Z25 + (2145/32) 7 3574 Z41
S42 85.33469748Z6 + 64.01249391Z12 + 30.59874671Z24 + 34.09158819Z28

+7.75796322Z38 + 9.37150432Z42
S43 14.30642479Z5 + 11.17404702Z13 + 5.68231935Z23 + 18.15306055Z27
+ 1.54919583Z39 + 5.90178984Z43
S44 36.12567424Z1 + 47.95305224Z4 + 35.30691679Z11 + 56.72014548Z14
+ 16.36470429Z22 + 26.32636277Z26 +3.95466397Z37 +6.33853092Z40
+ 12.38056785Z44
S45 21.45429746Z15 + 9.94633083Z25 + 2.34632890Z41 + 10.39130049Z45
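Table 10-2. Orthonormal square polynomials $S_j(\rho, \theta)$ in polar coordinates $(\rho, \theta)$.
$S_1 = 1$
$S_2 = \sqrt{6}\, \rho \cos\theta$
$S_3 = \sqrt{6}\, \rho \sin\theta$
$S_4 = \sqrt{5/2}\, (3\rho^2 - 1)$
$S_5 = 3\rho^2 \sin 2\theta$
$S_6 = 3\sqrt{5/2}\, \rho^2 \cos 2\theta$
$S_7 = \sqrt{21/31}\, (15\rho^2 - 7)\rho \sin\theta$
$S_8 = \sqrt{21/31}\, (15\rho^2 - 7)\rho \cos\theta$
$S_9 = (\sqrt{5/31}/2)[31\rho^3 \sin 3\theta - 3(13\rho^2 - 4)\rho \sin\theta]$
$S_{10} = (\sqrt{5/31}/2)[31\rho^3 \cos 3\theta + 3(13\rho^2 - 4)\rho \cos\theta]$
$S_{11} = [1/(2\sqrt{67})](315\rho^4 - 240\rho^2 + 31)$
$S_{12} = [15/(2\sqrt{2})](7\rho^2 - 3)\rho^2 \cos 2\theta$
$S_{13} = \sqrt{21/2}\, (5\rho^2 - 3)\rho^2 \sin 2\theta$
$S_{14} = [3/(8\sqrt{134})](335\rho^4 \cos 4\theta + 645\rho^4 - 300\rho^2 + 22)$
$S_{15} = (5/2)\sqrt{21/2}\, \rho^4 \sin 4\theta$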

3
S16 = 55 1966 [11ȡ cos3ș + 3(19 97ȡ2 + 105ȡ4)ȡcosș]
3
S17 = 55 1966 [ 11ȡ sin3ș + 3(19 97ȡ2 + 105ȡ4)ȡsinș]
4
S18 = (1/4) 3 844397 [5( 10099 + 20643ȡ2)ȡ3 cos3ș + 3(3128 23885ȡ2 + 37205ȡ )ȡcosș]
4
S19 = (1/4) 3 844397 [5( 10099 + 20643ȡ2)ȡ3 sin3ș 3(3128 23885ȡ2 + 37205ȡ )ȡsinș]
4
S20 = (1/16) 7 859 [2577ȡ5 cos5ș 5(272 717ȡ2)ȡ3 cos3ș + 30(22 196ȡ2 + 349ȡ )ȡcosș]
4
S21 = (1/16) 7 859 [2577ȡ5 sin5ș + 5(272 717ȡ2)ȡ3 sin 3ș + 30(22 196ȡ2 + 349ȡ )ȡsinș]
S22 = (1/4) 65 849 (1155ȡ6 + 30ȡ4 cos4ș 1395ȡ4 + 453ȡ2 31)
S23 = (1/2) 33 3923 (471 1820ȡ2 + 1575ȡ4)ȡ2 sin2ș
S24 = (21/4) 65 1349 (27 140ȡ2 + 165ȡ4)ȡ2 cos2ș
S25 = (7/4) 33 2 (9ȡ2 5)ȡ4 sin4ș
S26 = (1/16 849 )[5( 98 + 2418ȡ2 12051ȡ4 + 15729ȡ6) + 3( 8195 + 17829ȡ2)ȡ4 cos4ș]
S27 = (1/16 7846 )[27461ȡ6 sin6ș + 15(348 2744ȡ2 + 4487ȡ4)ȡ2 sin2ș]
S28 = (21/32 1349 )[1349ȡ6 cos6ș + 5(196 1416ȡ2 + 2247ȡ4)ȡ2 cos2ș]
S29 = ( 13.79189793ȡ + 125.49411319ȡ3 308.13074909ȡ5 + 222.62454035ȡ7) sinș

+ (8.47599260ȡ3 16.13156842ȡ5) sin3ș + 0.87478174ȡ5 sin5ș
S30 = ( 13.79189793ȡ + 125.49411319ȡ3 308.13074909ȡ5 + 222.62454035ȡ7) cosș
+ ( 8.47599260ȡ3 + 16.13156842ȡ5) cos3ș + 0.87478174ȡ5 cos5ș
S31 = (6.14762642ȡ 79.44065626ȡ3 + 270.16115026ȡ5 266.18445920ȡ7) sinș
+ (56.29115383ȡ3 248.12774426ȡ5 + 258.68657393ȡ7) sin3ș 4.37679791ȡ5 sin5ș
3
S32 = ( 6.14762642ȡ + 79.44065626ȡ 270.16115026ȡ + 266.18445920ȡ7) cosș
5
3
+ (56.29115383ȡ 248.12774426ȡ5 + 258.68657393ȡ7) cos3ș +4.37679791ȡ5 cos5ș
S33 = ( 6.78771487ȡ + 103.15977419ȡ3 407.15689696ȡ5 + 460.96399558ȡ7)sinș
+ ( 21.68093294ȡ3 + 127.50233381ȡ5 174.38628345ȡ7) sin3ș
+ ( 75.07397471ȡ5 + 151.45280913ȡ7) sin5ș
S34 = ( 6.78771487ȡ + 103.15977419ȡ3 407.15689696ȡ5 + 460.96399558ȡ7)cosș
+ (21.68093294ȡ3 127.50233381ȡ5 + 174.38628345ȡ7) cos3ș
+ ȡ5( 75.07397471 + 151.45280913ȡ2) cos5ș
S35 = (3.69268433ȡ 59.40323317ȡ3 + 251.40397826ȡ5 307.20401818ȡ7)sinș
+ (28.20381860ȡ3 183.86176738ȡ5 + 272.43249673ȡ7)sin3ș
+ (19.83875817ȡ5 48.16032819ȡ7) sin 5ș + 32.65536033ȡ7 sin7ș
S36 = ( 3.69268433ȡ + 59.40323317ȡ3 251.40397826ȡ5 + 307.20401818ȡ7)cosș
+ (28.20381860ȡ3 183.86176738ȡ5 + 272.43249673ȡ7)cos3ș
+ ( 19.83875817ȡ5 + 48.16032819ȡ7) cos5ș + 32.65536033ȡ7 cos7ș
S37 = 2.34475558 55.32128002ȡ2 + 296.53777290ȡ4 553.46621887ȡ6
+ 332.94452229ȡ8 + ( 12.75329096ȡ4 + 20.75498320ȡ6)cos4ș
S38 = ( 51.83202694ȡ2 + 451.93890159ȡ4 1158.49126888ȡ6 + 910.24313983ȡ8)cos2ș
+ 5.51662508ȡ6 cos6ș
S39 = ( 39.56789598ȡ2 + 267.47071204ȡ4 525.02362247ȡ6 + 310.24123146ȡ8)sin2ș
1.59098067ȡ6 sin6ș
S40 = 1.21593465 45.42224477ȡ2 + 373.41167834ȡ4 1046.32659847ȡ6
+ 933.93661610ȡ8 + (137.71626496ȡ4 638.10242034ȡ6 + 712.98912399ȡ8)cos4ș
S41 = (9/8) 7 1787 (1455 5544ȡ2 + 5005ȡ4)ȡ4 sin4ș
S42 = ( 40.45171657ȡ2 + 494.75561036ȡ4 1738.64589491ȡ6 + 1843.19802390ȡ8)cos2ș

+ ( 150.76043598ȡ6 + 318.07940431ȡ8)cos6ș
S43 = ( 9.12193686ȡ2 + 110.47679089ȡ4 371.21215287ȡ6 + 368.07015240ȡ8)sin2ș
+ ( 107.35168289ȡ6 + 200.31338972ȡ8) sin6ș
S44 = 0.58427150 25.29433513ȡ2 + 242.54313549ȡ4 795.02011474ȡ6
+ 830.47943579ȡ8 + (90.22533813ȡ4 538.44320774ȡ6 + 752.97905752ȡ8) cos4ș
+ 52.52630092ȡ8 cos8ș
S45 = (31.08509142ȡ4 194.79990628ȡ6 + 278.72965314ȡ8) sin4ș + 44.08655427ȡ8 sin8ș
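Table 10-3. Orthonormal square polynomials $S_j(x, y)$ in Cartesian coordinates $(x, y)$, where $\rho^2 = x^2 + y^2$.
$S_1 = 1$
$S_2 = \sqrt{6}\, x$
$S_3 = \sqrt{6}\, y$
$S_4 = \sqrt{5/2}\, (3\rho^2 - 1)$
$S_5 = 6xy$
$S_6 = 3\sqrt{5/2}\, (x^2 - y^2)$
$S_7 = \sqrt{21/31}\, (15\rho^2 - 7) y$
$S_8 = \sqrt{21/31}\, (15\rho^2 - 7) x$
$S_9 = \sqrt{5/31}\, (27x^2 - 35y^2 + 6) y$
$S_{10} = \sqrt{5/31}\, (35x^2 - 27y^2 - 6) x$
$S_{11} = [1/(2\sqrt{67})](315\rho^4 - 240\rho^2 + 31)$
$S_{12} = [15/(2\sqrt{2})](x^2 - y^2)(7\rho^2 - 3)$
$S_{13} = \sqrt{42}\, (5\rho^2 - 3) xy$
$S_{14} = [3/(4\sqrt{134})][10(49x^4 - 36x^2y^2 + 49y^4) - 150\rho^2 + 11]$
$S_{15} = 5\sqrt{42}\, (x^2 - y^2) xy$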

4
S16 = 55 1966 (315ȡ 280x2 324y2 + 57)x
4
S17 = 55 1966 (315ȡ 324x2 280y2 + 57)y
S18 = (1/2) 3 844397 [105(1023x4 + 80x2y2 943y4) 61075x2 + 39915y2 + 4692]x
S19 = (1/2) 3 844397 [105(943x4 80x2y2 1023y4) 39915x2 + 61075y2 4692]y
S20 = (1/4) 7 859 [6(693x4 500x2y2 + 525y4) 1810x2 450y2 + 165]x
S21 = (1/4) 7 859 [6(525x4 500x2y2 + 693y4) 450x2 1810y2 + 165]y
S22 = (1/4) 65 849 [1155ȡ6 15(91x4 + 198x2y2 + 91y4) + 453ȡ2 31]
4
S23 = 33 3923 (1575ȡ 1820ȡ2 + 471)xy
S24 = (21/4) 65 1349 (165ȡ4 140ȡ2 + 27) (x2 y2)
S25 = 7 33 2 (9ȡ2 5)xy(x2 y2)
S26 = (1/8) 849[42(1573x6 375x4y2 375x2y4 + 1573y6) 60(707x4 225x2y2

+ 707y4) + 6045ȡ2 245]
S27 = (1/2 7846 )[14(2673x4 2500 x2y2 + 2673y4) 10290ȡ2 + 1305]xy
S28 = (21/8 1349 )[3146x6 2250 x4y2 + 2250 x2y4 3146y6 1770(x4 y4) + 245(x2 y2)]
S29 = ( 13.79189793 + 150.92209099x2 + 117.01812058y2 352.15154565x4 657.27245247x2y2
291.12439892y4 + 222.62454035x6 + 667.87362106x4y2 + 667.87362106x2y4 + 222.62454035y6)y
S30 = ( 13.79189793 + 117.01812058x2 + 150.92209099y2 291.12439892x4 657.27245247x2y2
352.15154565y + 222.62454035x + 667.87362106x y + 667.87362106x2y4 + 222.62454035y6)x
4 6 4 2
S31 = (6.14762642 + 89.43280522x2 135.73181009y2 496.10607212x4 + 87.83479115x2y2

+ 513.91209661y4 + 509.87526260x6 + 494.87949207x4y2 539.86680367x2y4 524.87103314y6)y
S32 = ( 6.14762642 + 135.73181009x2 89.43280522y2 513.91209661x4 87.83479115x2y2
+ 496.10607212y4 + 524.87103314x6 + 539.86680367x4y2 494.87949207x2y4 509.87526260y6)x
2 2
S33 = ( 6.78771487 + 38.11697536x + 124.84070714y 400.01976911x4 + 191.43062089x2y2
609.73320550y4 + 695.06919087x6 246.30347616x4y2 154.56957886x2y4 + 786.80308817y6)y
2 2
S34 = ( 6.78771487 + 124.84070714x + 38.11697536y 609.73320550x4 + 191.43062089x2y2
400.01976911y4 + 786.80308817x6 154.56957886x4y2 246.30347616x2y4 + 695.06919087y6)x
S35 = (3.69268433 + 25.20822264x2 87.60705178y2 200.98753298x4 63.30315999x2y2
+ 455.10450382y4 + 497.87935336x6 461.58554163x4y2 + 470.02596297x2y4 660.45220344y6)y
S36 = ( 3.69268433 + 87.60705178x2 25.20822264y2 455.10450382x4 + 63.30315999x2y2
+ 200.98753298y4 + 660.45220344x6 470.02596297x4y2 + 461.58554163x2y4 497.87935336y6)x
S37 = 2.34475558 55.32128002ȡ2 + 283.78448194ȡ4 532.71123567ȡ6 + 332.94452229ȡ8
+ 8(12.75329096ȡ2 20.75498320ȡ4) x2 + 8( 12.75329096 + 20.75498320ȡ2)x4
S38 = ( 51.83202694 + 451.93890159x2 1152.97464379x4 + 910.24313983x6)x2
+ (51.83202694 451.93890159y2 1241.24064523x4 + 1241.24064523x2y2
+ 1152.97464379y4 + 1820.48627967x6 1820.48627967x2y4 910.24313983y6)y2
S39 = ( 79.13579197 + 534.94142408x2 + 534.94142408y2 1059.59312899x4 2068.27487642x2y2
4 6 4 2
1059.59312899y + 620.48246292x + 1861.44738877x y + 1861.44738877x2y4 620.48246292y6)xy
S40 = 1.21593465 + ( 45.42224477 + 511.12794331x2 1684.42901882x4
+ 1646.92574009x6)x2 + ( 45.42224477 79.47423312x2 + 511.12794331y2
+ 51.53230630x4 + 51.53230630x2y2 1684.42901882y4 + 883.78996844x6
1526.27154329x4y2 + 883.78996844x2y4 + 1646.92574009y6)y2
S41 = (409.79084415x2 409.79084415y2 1561.42985567x4 + 1561.42985567y4
+ 1409.62417525x6 + 1409.62417525xy2 1409.62417525x2y4 1409.62417525y6)xy
S42 = ( 40.45171657 + 494.75561036x2 1889.40633090x4 + 2161.27742821x6)x2
+ (40.45171657 494.75561036y2 + 522.76064491x4 522.76064491x2y2
+ 1889.40633090y4 766.71561254x6 + 766.71561254x2y4 2161.27742821y6)y2
S43 = ( 18.24387372 + 220.95358178x2 + 220.95358178y2 1386.53440310x4
+ 662.18504631x2y2 1386.53440310y4 + 1938.02064313x6 595.96654168x4y2
595.96654168x2y4 + 1938.02064313y6)xy
S44 = 0.58427150 + ( 25.29433513 + 332.76847363x2 1333.46332249x4
+ 1635.98479424x6)x2 + ( 25.29433513 56.26575785x2 + 332.76847363y2
+ 307.15569451x4 + 307.15569451x2y2 1333.46332249y4 1160.73491284x6
+ 1129.92710444x4y2 1160.73491284x2y4 + 1635.98479424y6)y2
S45 = (124.34036571x2 124.34036571y2 779.19962514x4 + 779.19962514y4
+ 1467.61104674x6 1353.92842666x4y2 + 1353.92842666x2y4 1467.61104674y6)xy
The corresponding polynomials in polar and Cartesian coordinates are given in Tables 10-2 and 10-3, respectively. Of course, up to the fourth order, they can be obtained simply from the rectangular polynomials $R_j$ given in Tables 9-1 through 9-3 by letting $c = 1/\sqrt{2}$. The square polynomial $S_{11}$ representing balanced primary spherical aberration is radially symmetric, but the polynomial $S_{22}$ representing balanced secondary spherical aberration is not, because it also contains a term in $Z_{14}$, which varies as $\cos 4\theta$. Similarly, the polynomial $S_{37}$ representing balanced tertiary spherical aberration is not radially symmetric, since it contains terms in $Z_{14}$ and $Z_{26}$, both of which vary as $\cos 4\theta$.
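As a numerical cross-check of the tabulated forms, the following minimal sketch evaluates the mean of the product of two square polynomials over the unit square using Gauss-Legendre quadrature. The helper names `square_mean` and `gram_matrix` are hypothetical, and only a handful of the Cartesian forms from Table 10-3 are included; it is an illustration under these assumptions, not the procedure used to generate the tables.

```python
import numpy as np
from numpy.polynomial.legendre import leggauss

H = 1 / np.sqrt(2)  # half-width of the unit square inscribed in the unit circle


def square_mean(f, n_nodes=12):
    """Mean of f(x, y) over the square |x|, |y| <= H (Gauss-Legendre quadrature)."""
    t, w = leggauss(n_nodes)                  # nodes and weights on [-1, 1]
    x, y = np.meshgrid(H * t, H * t, indexing="ij")
    weights = np.outer(w, w) / 4.0            # normalized so the weights sum to 1
    return float(np.sum(weights * f(x, y)))


# A few square polynomials in Cartesian form (Table 10-3), with rho^2 = x^2 + y^2
S = {
    1: lambda x, y: np.ones_like(x),
    2: lambda x, y: np.sqrt(6) * x,
    3: lambda x, y: np.sqrt(6) * y,
    4: lambda x, y: np.sqrt(5 / 2) * (3 * (x**2 + y**2) - 1),
    5: lambda x, y: 6 * x * y,
    6: lambda x, y: 3 * np.sqrt(5 / 2) * (x**2 - y**2),
    11: lambda x, y: (315 * (x**2 + y**2) ** 2
                      - 240 * (x**2 + y**2) + 31) / (2 * np.sqrt(67)),
}


def gram_matrix():
    """Matrix of mean values <S_i S_j>; the identity if the set is orthonormal."""
    keys = sorted(S)
    return np.array([[square_mean(lambda x, y: S[i](x, y) * S[j](x, y))
                      for j in keys] for i in keys])


if __name__ == "__main__":
    print(np.round(gram_matrix(), 10))
```

Because the integrands are polynomials of low degree, a 12-point rule in each variable integrates them exactly up to rounding, so any deviation from the identity matrix would point to a transcription error in a polynomial.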
10.5 SQUARE COEFFICIENTS OF A SQUARE ABERRATION FUNCTION

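A square aberration function $W(x, y)$ across a unit square can be expanded in terms of $J$ square polynomials $S_j(x, y)$ in the form
$$W(x, y) = \sum_{j=1}^{J} a_j S_j(x, y) , \tag{10-17}$$
where $a_j$ is the expansion coefficient of the polynomial $S_j(x, y)$. Multiplying both sides of Eq. (10-17) by $S_j(x, y)$, integrating over the unit square, and using the orthonormality Eq. (10-15), we obtain the square expansion coefficients:
$$a_j = \frac{1}{2} \int_{-1/\sqrt{2}}^{1/\sqrt{2}} dy \int_{-1/\sqrt{2}}^{1/\sqrt{2}} W(x, y)\, S_j(x, y)\, dx . \tag{10-18}$$
As stated in Section 3.2, it is evident from Eq. (10-18) that the value of a square coefficient is independent of the number $J$ of polynomials used in the expansion of the aberration function. The mean and the mean square values of the aberration function are given by
$$\langle W \rangle = a_1 , \tag{10-19}$$
and
$$\langle W^2 \rangle = \sum_{j=1}^{J} a_j^2 , \tag{10-20}$$
respectively. Accordingly, the aberration variance is given by
$$\sigma_W^2 = \langle W^2 \rangle - \langle W \rangle^2 = \sum_{j=2}^{J} a_j^2 . \tag{10-21}$$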
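The relations in Eqs. (10-18) through (10-21) can be illustrated with a short sketch. It assumes a test aberration $W = x^2$ (Seidel astigmatism with unit coefficient) and uses only the polynomials $S_1$ through $S_6$, which suffice for a quadratic aberration; the function names are hypothetical.

```python
import numpy as np
from numpy.polynomial.legendre import leggauss

H = 1 / np.sqrt(2)  # half-width of the unit square


def square_mean(f, n_nodes=12):
    """Mean of f(x, y) over the square |x|, |y| <= H, i.e., (1/2) x double integral."""
    t, w = leggauss(n_nodes)
    x, y = np.meshgrid(H * t, H * t, indexing="ij")
    return float(np.sum(np.outer(w, w) / 4.0 * f(x, y)))


# Square polynomials S1..S6 in Cartesian form (Table 10-3)
SQUARE_POLYS = {
    1: lambda x, y: np.ones_like(x),
    2: lambda x, y: np.sqrt(6) * x,
    3: lambda x, y: np.sqrt(6) * y,
    4: lambda x, y: np.sqrt(5 / 2) * (3 * (x**2 + y**2) - 1),
    5: lambda x, y: 6 * x * y,
    6: lambda x, y: 3 * np.sqrt(5 / 2) * (x**2 - y**2),
}


def square_coefficients(W):
    """Expansion coefficients a_j = <W S_j>, Eq. (10-18)."""
    return {j: square_mean(lambda x, y: W(x, y) * Sj(x, y))
            for j, Sj in SQUARE_POLYS.items()}


def W_test(x, y):
    """Assumed test aberration: x^2 = rho^2 cos^2(theta)."""
    return x**2


a = square_coefficients(W_test)
mean_W = a[1]                                             # Eq. (10-19)
var_from_coeffs = sum(v**2 for j, v in a.items() if j > 1)  # Eq. (10-21)
var_direct = square_mean(lambda x, y: W_test(x, y) ** 2) - square_mean(W_test) ** 2

if __name__ == "__main__":
    print(a, mean_W, var_from_coeffs, var_direct)
```

For this test aberration only $a_1$, $a_4$, and $a_6$ are nonzero, and the variance obtained from the coefficients agrees with the variance computed directly from the mean and mean square values.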
10.6 ISOMETRIC, INTERFEROMETRIC, AND IMAGING CHARACTERISTICS OF SQUARE POLYNOMIAL ABERRATIONS
The square polynomials are illustrated in three different but equivalent ways in Figure 10-6. For each polynomial, the isometric plot at the top illustrates its shape. An interferogram is shown on the left, and a corresponding PSF is shown on the right for a sigma value of one wave. The peak-to-valley aberration numbers (in units of wavelength) are given in Table 10-4. The PSFs for each polynomial aberration, obtained by applying Eq. (10-7), are shown in Figure 10-6. The full width of a square displaying the PSFs is $24\lambda F_x$. Since the piston aberration $S_1$ has no effect on the PSF, it yields an aberration-free PSF.
The polynomial aberrations $S_2$ and $S_3$, representing the x and y wavefront tilts with aberration coefficients $a_2$ and $a_3$, respectively, displace the PSF. The coefficient $a_2$ corresponds to a wavefront tilt angle of $\sqrt{3/2}\,\lambda a_2/a$ about the y axis and displaces the PSF along the x axis by $\sqrt{6}\, a_2 \lambda F$. Similarly, $a_3$ corresponds to a wavefront tilt angle of $\sqrt{3/2}\,\lambda a_3/a$ about the x axis and displaces the PSF along the y axis by $\sqrt{6}\, a_3 \lambda F$.
The defocus aberration represented by the polynomial $S_4$ is radially symmetric and yields a radially symmetric interferogram bounded, of course, by a square. However, the PSF is biaxially symmetric. The polynomial aberrations $S_5$ and $S_6$, representing balanced astigmatism, yield biaxially symmetric interferograms and PSFs, but distinctly different from each other. The polynomial aberrations $S_7$ and $S_8$, representing balanced comas, produce biaxially symmetric interferograms, but the PSFs are symmetric only about the y and x axes, respectively. The polynomial aberration $S_{11}$, representing balanced primary spherical aberration, is radially symmetric and, like defocus, yields a radially symmetric interferogram. However, the polynomial aberrations $S_{22}$ and $S_{37}$, representing the balanced secondary and tertiary spherical aberrations, are not radially symmetric because of the presence of a $\cos 4\theta$ term. Accordingly, neither the interferograms nor the PSFs for these aberrations are radially symmetric.
The Strehl ratio, namely, the central value of a PSF relative to its aberration-free value, can be obtained from Eq. (10-7) by letting $x = 0 = y$, i.e., from
$$I(0, 0) = \frac{1}{16} \left| \int_{-1}^{1} \int_{-1}^{1} \exp[i\Phi(x', y')]\, dx'\, dy' \right|^2 . \tag{10-22}$$
Its value for a square polynomial aberration with a sigma value of 0.1 wave is listed in Table 10-5 and plotted in Figure 10-7. Because of the small value of the aberration, the Strehl ratio is approximately the same for each polynomial. Both the table and the figure illustrate that the Strehl ratio for a small aberration is approximately independent of the type of aberration. It is approximately given by $\exp(-\sigma_\Phi^2)$, or 0.67, where $\sigma_\Phi = 0.2\pi$.
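Equation (10-22) lends itself to a direct numerical evaluation. The sketch below assumes that the primed coordinates in Eq. (10-22) are normalized by the half-width of the square, so that the polynomial arguments are $x = x'/\sqrt{2}$ and $y = y'/\sqrt{2}$; this normalization and the function names are assumptions made for illustration. It evaluates the Strehl ratio for the tilt $S_2$ and the defocus $S_4$ with a sigma value of 0.1 wave and compares the results with $\exp(-\sigma_\Phi^2)$.

```python
import numpy as np
from numpy.polynomial.legendre import leggauss

SIGMA_WAVES = 0.1  # sigma value of the polynomial aberration, in waves


def strehl_square(phase, n_nodes=64):
    """Strehl ratio from Eq. (10-22) for a phase aberration phase(x', y')."""
    t, w = leggauss(n_nodes)
    xp, yp = np.meshgrid(t, t, indexing="ij")     # x', y' in [-1, 1]
    amplitude = np.sum(np.outer(w, w) * np.exp(1j * phase(xp, yp))) / 4.0
    return float(abs(amplitude) ** 2)             # (1/16)|integral|^2


def phase_S2(xp, yp):
    """Phase (radians) for the tilt polynomial S2 = sqrt(6) x."""
    x = xp / np.sqrt(2)                           # assumed coordinate scaling
    return 2 * np.pi * SIGMA_WAVES * np.sqrt(6) * x


def phase_S4(xp, yp):
    """Phase (radians) for the defocus polynomial S4 = sqrt(5/2)(3 rho^2 - 1)."""
    x, y = xp / np.sqrt(2), yp / np.sqrt(2)       # assumed coordinate scaling
    return 2 * np.pi * SIGMA_WAVES * np.sqrt(5 / 2) * (3 * (x**2 + y**2) - 1)


strehl_tilt = strehl_square(phase_S2)
strehl_defocus = strehl_square(phase_S4)
strehl_approx = float(np.exp(-(2 * np.pi * SIGMA_WAVES) ** 2))

if __name__ == "__main__":
    print(strehl_tilt, strehl_defocus, strehl_approx)
```

Under these assumptions, the computed values should lie close to the entries for $S_2$ and $S_4$ in Table 10-5 and to the approximate value of 0.67.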
Figure 10-6. Isometric plot (top), interferogram (left), and PSF (right) for each of the square polynomial aberrations $S_1$ through $S_{45}$, shown for a sigma value of one wave.

Table 10-4. Peak-to-valley numbers (in units of wavelength) of the orthonormal square polynomials for a sigma value of unity.
S1 0 S16 16.558 S31 11.511
S2 3.464 S17 16.558 S32 11.511
S3 3.464 S18 7.893 S33 9.390
S4 4.743 S19 7.893 S34 9.390
S5 6.000 S20 9.559 S35 12.574
S6 4.743 S21 9.559 S36 10.359
S7 9.312 S22 12.659 S37 17.116
S8 9.312 S23 20.728 S38 13.581
S9 6.532 S24 9.603 S39 29.423
S10 6.532 S25 9.749 S40 8.021
S11 7.374 S26 5.927 S41 13.325
S12 6.061 S27 7.975 S42 9.322
S13 12.962 S28 10.470 S43 10.502
S14 5.429 S29 24.983 S44 9.082
S15 6.236 S30 24.983 S45 9.853

Table 10-5. Strehl ratio S for square polynomial aberrations for a sigma value of 0.1
wave.
S1 1 S16 0.712 S31 0.798
S2 0.662 S17 0.712 S32 0.798
S3 0.662 S18 0.681 S33 0.683
S4 0.669 S19 0.681 S34 0.683
S5 0.675 S20 0.688 S35 0.700
S6 0.669 S21 0.688 S36 0.700
S7 0.685 S22 0.722 S37 0.725
S8 0.685 S23 0.721 S38 0.708
S9 0.675 S24 0.690 S39 0.722
S10 0.675 S25 0.694 S40 0.688
S11 0.703 S26 0.673 S41 0.707
S12 0.669 S27 0.691 S42 0.679
S13 0.704 S28 0.698 S43 0.693
S14 0.6875 S29 0.723 S44 0.711
S15 0.682 S30 0.723 S45 0.700

Figure 10-7. Strehl ratio S for square polynomial aberrations with a sigma value of
0.1 wave.
10.7 SEIDEL ABERRATIONS, STANDARD DEVIATION, AND STREHL RATIO
In this section, we consider the Seidel aberrations and give their standard deviations with and without balancing. We also show how the Strehl ratio varies as a function of the standard deviation and compare it with the approximate exponential expression for it.
10.7.1 Defocus
The defocus aberration is given by
$$W_d(\rho) = A_d \rho^2 . \tag{10-23}$$
From the form of the defocus orthonormal polynomial $S_4$ given in Table 10-2, it is evident that its sigma value across a square pupil is given by
$$\sigma_d = \frac{1}{3}\sqrt{\frac{2}{5}}\, A_d = \frac{A_d}{4.743} . \tag{10-24}$$
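10.7.2 Astigmatism
Next, consider 0° Seidel astigmatism given by
$$W_a(\rho, \theta) = A_a \rho^2 \cos^2\theta . \tag{10-25}$$
The polynomial $S_6$ given in Table 10-2 can be written
$$S_6 = 3\sqrt{\frac{5}{2}}\, \rho^2 \cos 2\theta \tag{10-26a}$$
$$\quad = 3\sqrt{10} \left( \rho^2 \cos^2\theta - \frac{1}{2}\rho^2 \right) , \tag{10-26b}$$
showing that the relative amount of defocus $\rho^2$ that balances Seidel astigmatism $\rho^2 \cos^2\theta$ is $-1/2$, as in the case of a circular, annular, or Gaussian pupil. Thus, the balanced astigmatism is given by
$$W_{ba}(\rho, \theta) = A_a \left( \rho^2 \cos^2\theta - \frac{1}{2}\rho^2 \right) . \tag{10-27}$$
Its sigma value is given by
$$\sigma_{ba} = \frac{A_a}{3\sqrt{10}} = \frac{A_a}{9.487} . \tag{10-28}$$
Apart from a piston term, Seidel astigmatism can be written in terms of the square polynomials as
$$W_a(\rho, \theta) = \frac{A_a}{3\sqrt{10}} (S_6 + S_4) . \tag{10-29}$$
Accordingly, its sigma value is given by
$$\sigma_a = \frac{A_a}{3\sqrt{5}} = \frac{A_a}{6.708} . \tag{10-30}$$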
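10.7.3 Coma
Now, we consider Seidel coma:
$$W_c(\rho, \theta) = A_c \rho^3 \cos\theta . \tag{10-31}$$
The polynomial $S_8$ given in Table 10-2 can be written
$$S_8 = \sqrt{\frac{21}{31}} \left( 15\rho^3 \cos\theta - 7\rho \cos\theta \right) , \tag{10-32}$$
showing that the relative amount of tilt $\rho \cos\theta$ that balances Seidel coma $\rho^3 \cos\theta$ is $-7/15$ compared to $-2/3$ for a circular pupil. The balanced coma is given by
$$W_{bc}(\rho, \theta) = A_c \left( \rho^3 \cos\theta - \frac{7}{15}\rho \cos\theta \right) . \tag{10-33}$$
Its sigma value is given by
$$\sigma_{bc} = \frac{1}{15}\sqrt{\frac{31}{21}}\, A_c = \frac{A_c}{12.346} . \tag{10-34}$$
Seidel coma can be written in terms of the square polynomials as
$$W_c(\rho, \theta) = \frac{A_c}{15} \left( \sqrt{\frac{31}{21}}\, S_8 + \frac{7}{\sqrt{6}}\, S_2 \right) . \tag{10-35}$$
Accordingly, its sigma value is given by
$$\sigma_c = \sqrt{\frac{3}{70}}\, A_c = \frac{A_c}{4.831} . \tag{10-36}$$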

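10.7.4 Spherical Aberration
Finally, we consider Seidel spherical aberration:
$$W_s(\rho) = A_s \rho^4 . \tag{10-37}$$
The polynomial $S_{11}$ given in Table 10-2 is
$$S_{11} = \frac{1}{2\sqrt{67}} \left( 315\rho^4 - 240\rho^2 + 31 \right) . \tag{10-38}$$
Hence, the balanced spherical aberration is given by
$$W_{bs}(\rho) = A_s \left( \rho^4 - \frac{16}{21}\rho^2 \right) . \tag{10-39}$$
It shows that spherical aberration is balanced by a relative defocus of $-16/21$. Its sigma value is given by
$$\sigma_{bs} = \frac{2}{315}\sqrt{67}\, A_s = \frac{A_s}{19.242} . \tag{10-40}$$
Seidel spherical aberration can be written in terms of the square polynomials as
$$W_s(\rho) = \frac{2A_s}{315} \left( \sqrt{67}\, S_{11} + 8\sqrt{10}\, S_4 \right) + \text{constant} . \tag{10-41}$$
Accordingly, its sigma value is given by
$$\sigma_s = \frac{2}{45}\sqrt{\frac{101}{7}}\, A_s = \frac{A_s}{5.923} . \tag{10-42}$$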
The sigma values of Seidel aberrations with and without balancing are given in Table 10-6.
Defocus                          $\sigma_d = \sqrt{2/5}\, A_d/3 = A_d/4.74$                 4.74
Astigmatism                      $\sigma_a = A_a/(3\sqrt{5}) = A_a/6.71$                    6.71
Balanced astigmatism             $\sigma_{ba} = A_a/(3\sqrt{10}) = A_a/9.49$                4.74
Coma                             $\sigma_c = \sqrt{3/70}\, A_c = A_c/4.83$                  9.66
Balanced coma                    $\sigma_{bc} = \sqrt{31/21}\, A_c/15 = A_c/12.35$          9.31
Spherical aberration             $\sigma_s = 2\sqrt{101/7}\, A_s/45 = A_s/5.92$             5.92
Balanced spherical aberration    $\sigma_{bs} = 2\sqrt{67}\, A_s/315 = A_s/19.24$           7.37
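The sigma values quoted in Eqs. (10-24) through (10-42) can be checked by computing the standard deviation of each Seidel aberration directly over the unit square. The following is a minimal sketch with unit aberration coefficients; the names `square_sigma` and `SEIDEL` are hypothetical, and the quadrature is exact for these polynomial integrands.

```python
import numpy as np
from numpy.polynomial.legendre import leggauss

H = 1 / np.sqrt(2)  # half-width of the unit square


def square_mean(f, n_nodes=12):
    """Mean of f(x, y) over the square |x|, |y| <= H."""
    t, w = leggauss(n_nodes)
    x, y = np.meshgrid(H * t, H * t, indexing="ij")
    return float(np.sum(np.outer(w, w) / 4.0 * f(x, y)))


def square_sigma(W):
    """Standard deviation of W over the unit square."""
    mean = square_mean(W)
    mean_sq = square_mean(lambda x, y: W(x, y) ** 2)
    return float(np.sqrt(mean_sq - mean**2))


# Seidel aberrations with unit coefficient; rho^2 = x^2 + y^2, rho cos(theta) = x
SEIDEL = {
    "defocus": lambda x, y: x**2 + y**2,
    "astigmatism": lambda x, y: x**2,
    "balanced astigmatism": lambda x, y: x**2 - (x**2 + y**2) / 2,
    "coma": lambda x, y: (x**2 + y**2) * x,
    "balanced coma": lambda x, y: (x**2 + y**2) * x - (7 / 15) * x,
    "spherical": lambda x, y: (x**2 + y**2) ** 2,
    "balanced spherical": lambda x, y: (x**2 + y**2) ** 2 - (16 / 21) * (x**2 + y**2),
}

# Reciprocal of sigma, i.e., the denominator in sigma = A / (number)
inverse_sigma = {name: 1.0 / square_sigma(W) for name, W in SEIDEL.items()}

if __name__ == "__main__":
    for name, value in inverse_sigma.items():
        print(f"{name:22s} sigma = A / {value:.3f}")
```

The printed denominators can be compared with those in Eqs. (10-24), (10-28), (10-30), (10-34), (10-36), (10-40), and (10-42).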

10.7.5 Strehl Ratio

In Figure 10-7, we have shown the Strehl ratio for the square polynomial aberrations with a sigma value of 0.1 wave. In Figure 10-8, we show how it varies with the sigma value of a Seidel aberration, with and without balancing, for $0 \le \sigma_W \le 0.25$. Also plotted as the dashed curve is the Strehl ratio obtained from the approximate expression $\exp(-\sigma_\Phi^2)$. We note that this expression underestimates the Strehl ratio for defocus and Seidel astigmatism, but overestimates it for Seidel coma and Seidel spherical aberration. The agreement between the actual and the approximate values is quite good for the balanced aberrations, except that the approximate expression overestimates in the case of spherical aberration for $\sigma_W > 0.15$. The aberration coefficient or the P-V aberration for a certain value of $\sigma_W$ can be obtained from Tables 10-4 and 10-6 for the aberrations considered here.
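Figure 10-8. Strehl ratio as a function of the sigma value $\sigma_W$ of a Seidel aberration, with and without balancing, in panels (a) through (d).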
10.8 SUMMARY
The aberration-free PSF and OTF of a square pupil are discussed in Section 10.3. The polynomials orthonormal over a unit square pupil, representing balanced aberrations over such a pupil, are given through the eighth order in Tables 10-1 through 10-3 in terms of the circle polynomials, in polar coordinates, and in Cartesian coordinates, respectively. Each orthonormal polynomial consists of either the cosine or the sine terms, but not both. Thus, an even j polynomial, for example, consists of only the cosine terms, as may be seen from Table 10-1 or 10-2. This is a consequence of the four-fold symmetry of the pupil. Since the polynomials are not separable in the polar coordinates $\rho$ and $\theta$ of a pupil point, the polynomial numbering with two indices n and m loses significance, and the polynomials must be numbered with a single index j. They are ordered in the same manner as the polynomials discussed in previous chapters.
Because of the higher symmetry of a square pupil compared to a rectangular pupil, the form of the polynomial $S_6$ representing balanced astigmatism is the same as that for a circular pupil. Similarly, as indicated by the polynomial $S_{11}$, spherical aberration $\rho^4$ is balanced only by defocus $\rho^2$, compared to $R_{11}$ for a rectangular pupil, which consists of a term in astigmatism $\rho^2 \cos^2\theta$ as well.
The first 45 square polynomials, i.e., up to and including the eighth order, are illustrated by an isometric plot, an interferogram, and a PSF in Figure 10-6. The coefficient of each orthonormal polynomial, or the sigma value of the corresponding aberration, is one wave. Their peak-to-valley numbers for a sigma value of one wave are given in Table 10-4 in units of wavelength. The Strehl ratio for a sigma value of $0.1\lambda$ for each aberration is given in Table 10-5 and illustrated in Figure 10-7. It shows that, for a small aberration, the Strehl ratio can be estimated from the aberration variance. The sigma values of the Seidel aberrations and their balanced forms are given in Table 10-6.
References

analytical solution,” J. Opt. Soc. Am. A 24, 2994–3016 (2007). Errata: J. Opt. Soc. Am. A 29, 1673–1674 (2012).

11.41 (McGraw Hill, 2009).
3. M. Bray, “Orthogonal polynomials: A set for square areas,” Proc. SPIE 5252, 314–320 (2004).
4. J. L. Rayces, “Least-squares fitting of orthogonal polynomials to the wave-aberration function,” Appl. Opt. 31, 2223–2228 (1992).
CHAPTER 11
SYSTEMS WITH SLIT PUPILS
11.1 Introduction
11.2.1 PSF
11.2.2 Image of an Incoherent Slit
11.3.1 Strehl Ratio
11.3.2 Aberration Balancing
11.4 Slit Polynomials
11.5 Standard Deviation of a Primary Aberration
11.6 Summary
References
Chapter 11
Systems with Slit Pupils
11.1 INTRODUCTION
A slit pupil is a limiting case of a rectangular pupil whose one dimension is
negligibly small. It is used in spectrographs. The power series aberrations of a
rotationally symmetric imaging system with a slit pupil are the 1D analog of the
corresponding aberration terms discussed in Chapter 1. In this chapter, we discuss the
PSF of a slit pupil and the incoherent image of a slit parallel to the slit pupil. The Strehl
ratio for and the balanced aberrations of a slit pupil are discussed. It is shown that the
balanced aberrations are represented by the Legendre polynomials [1,2]. We show further that, compared to a circular pupil, the slit pupil is more sensitive to a primary aberration with or without balancing, except for spherical aberration, for which it is slightly less sensitive.

11.2.1 PSF
As illustrated in Figure 11-1, consider a slit pupil, i.e., a rectangular pupil of half-widths a and b, where $b \ll a$. Thus, the aspect ratio $b/a$ of the pupil is negligibly small. Its PSF can be obtained from that of a rectangular pupil by letting the aspect ratio be negligibly small. Letting it be practically zero in Eq. (9-8) for the PSF of a rectangular pupil, the PSF of a slit pupil may be written
$$I(x) = \frac{1}{4} \left| \int_{-1}^{1} \exp[i\Phi(x')] \exp(-\pi i x' x)\, dx' \right|^2 , \tag{11-1}$$
where x is in units of $\lambda F$, and $F = R/2a$ is the focal ratio of the beam focusing at a distance R from the focusing lens. The irradiance distribution is normalized by its central value $P_{ex} S_{ex}/\lambda^2 R^2$, where $P_{ex}$ is the total power in the pattern, and $S_{ex} = 4ab$ is the pupil area.
Figure 11-1. A slit pupil of half-width a along the x axis, where $b \ll a$.
Figure 11-2. PSF of a slit pupil. (a) Irradiance distribution. (b) 1D PSF $I(x)$ as a function of x.
For the aberration-free case, we obtain
$$I(x) = \left( \frac{\sin \pi x}{\pi x} \right)^2 . \tag{11-2}$$
The PSF is shown in Figure 11-2. Its value is zero wherever x is a positive or a negative integer.
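The reduction of Eq. (11-1) to Eq. (11-2) in the aberration-free case can be confirmed numerically. The sketch below evaluates the integral in Eq. (11-1) by Gauss-Legendre quadrature; the function name `slit_psf` and its `phase` argument are hypothetical, and the default phase is zero.

```python
import numpy as np
from numpy.polynomial.legendre import leggauss


def slit_psf(x, phase=lambda xp: np.zeros_like(xp), n_nodes=64):
    """PSF of a slit pupil from Eq. (11-1); x is in units of lambda*F."""
    t, w = leggauss(n_nodes)                       # pupil points x' in [-1, 1]
    x = np.atleast_1d(np.asarray(x, dtype=float))
    # Integrand exp[i Phi(x')] exp(-pi i x' x) on a grid of (x, x')
    integrand = np.exp(1j * phase(t))[None, :] * np.exp(-1j * np.pi * np.outer(x, t))
    return np.abs(integrand @ w) ** 2 / 4.0


x_values = np.linspace(-3.0, 3.0, 61)
psf_numeric = slit_psf(x_values)
psf_analytic = np.sinc(x_values) ** 2              # np.sinc(x) = sin(pi x)/(pi x)

if __name__ == "__main__":
    print(np.max(np.abs(psf_numeric - psf_analytic)))
```

Passing a nonzero `phase` function, for example a multiple of $x'^2$, gives the aberrated PSF under the same normalization.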
11.2.2 Image of an Incoherent Slit

If the point source is replaced by an incoherently illuminated slit object parallel to
the slit pupil, then each point on the source forms a PSF, and the net result for an
incoherent illumination is the sum of their irradiance images. The incoherent image of the
slit object thus obtained is shown in Figure 11-3.
Figure 11-3. Image of an incoherent slit object formed by a system with a slit pupil.
11.3 STREHL RATIO AND ABERRATION BALANCING

11.3.1 Strehl Ratio
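From Eq. (11-1), the Strehl ratio, representing the ratio of the central values of the PSF with and without an aberration, can be written
$$S \equiv I(0) = \frac{1}{4} \left| \int_{-1}^{1} \exp[i\Phi(x')]\, dx' \right|^2 . \tag{11-3}$$
It can also be written as
$$S = \frac{1}{4} \left| \int_{-1}^{1} \exp\{ i[\Phi(x) - \langle\Phi\rangle] \}\, dx \right|^2
= \left| \left\langle \exp\{ i[\Phi(x) - \langle\Phi\rangle] \} \right\rangle \right|^2
= \left| \left\langle 1 + i[\Phi(x) - \langle\Phi\rangle] - \frac{1}{2}[\Phi(x) - \langle\Phi\rangle]^2 + \dots \right\rangle \right|^2
\simeq 1 - \left( \langle\Phi^2\rangle - \langle\Phi\rangle^2 \right)
\equiv 1 - \sigma_\Phi^2 , \tag{11-4}$$
where the angular brackets indicate a mean value across the pupil, $\langle\Phi\rangle$ is the mean value of the aberration function, $\langle\Phi^2\rangle$ is its mean square value, $\sigma_\Phi^2$ is its variance, and we have neglected the higher-order terms in the power-series expansion of the exponential. The mean value of a function $g(x)$ is given by
$$\langle g(x) \rangle = \frac{\int_{-1}^{1} g(x)\, dx}{\int_{-1}^{1} dx} = \frac{1}{2} \int_{-1}^{1} g(x)\, dx . \tag{11-5}$$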
11.3.2 Aberration Balancing

A unit slit pupil along the x axis is illustrated in Figure 11-4. Consider an aberration such as primary x-coma:
$$W_{cx}(x) = x^3 . \tag{11-6}$$
Its variance across the slit pupil is given by
$$\sigma_{cx}^2 = \left\langle [W_{cx}(x)]^2 \right\rangle - \left\langle W_{cx}(x) \right\rangle^2 . \tag{11-7}$$
Thus, the standard deviation of the x-coma aberration is given by $\sigma_{cx} = 1/\sqrt{7}$.
Figure 11-4. Unit slit pupil along the x axis, extending from $x = -1$ to $x = 1$, inscribed inside a unit circle.
The variance can be reduced by mixing it with a certain amount b of x-tilt. Thus, the balanced aberration may be written in the form
$$W_{bcx}(x) = x^3 + bx . \tag{11-8}$$
Its variance is given by
$$\sigma_{bcx}^2 = \frac{1}{7} + \frac{2b}{5} + \frac{b^2}{3} . \tag{11-9}$$
The variance has a minimum value of 4/175 for a tilt of $b = -3/5$ compared to a value of 1/7 without any tilt. Thus, the variance is reduced by a factor of 25/4, or the standard deviation of the balanced aberration is smaller by a factor of 5/2. The corresponding balanced aberration is given by
$$W_{bcx}(x) = x^3 - (3/5)x . \tag{11-10}$$
A balanced aberration yields a higher Strehl ratio or increases the aberration tolerance for a given Strehl ratio.
Similarly, the variance of the x-spherical aberration $x^4$ can be minimized by combining it with x-defocus. Thus, consider the balanced aberration
$$W_{bsx}(x) = x^4 + bx^2 . \tag{11-11}$$
Its variance is given by
$$\sigma_{bsx}^2 = \frac{16}{225} + \frac{16b}{105} + \frac{4b^2}{45} . \tag{11-12}$$
Its sigma value is minimum and equal to 8/105 for $b = -6/7$ compared to a value of 4/15 with no defocus. The balanced aberration is given by
$$W_{bsx}(x) = x^4 - (6/7)x^2 . \tag{11-13}$$
It should be evident that there is no distinction between defocus and astigmatism, since they both vary as $x^2$.
The process of minimizing the variance in this manner is called aberration balancing. The variance of the higher-order classical aberrations, e.g., secondary coma $x^5$, secondary spherical aberration $x^6$, tertiary coma $x^7$, and tertiary spherical aberration $x^8$, can also be minimized by combining them with lower-degree aberrations.
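The balancing of x-coma with tilt and of x-spherical aberration with defocus can be reproduced with exact rational arithmetic. The sketch below minimizes the variance of $x^n + b x^{n-2}$ over the unit slit for n = 3 and n = 4; the function names are hypothetical, and the single-term balancing shown here is only the case treated in the text, not the general balancing of higher-order aberrations.

```python
from fractions import Fraction


def moment(k):
    """Mean value of x**k over the unit slit -1 <= x <= 1."""
    return Fraction(1, k + 1) if k % 2 == 0 else Fraction(0)


def balance(n):
    """Optimum b and the variances for W(x) = x**n + b * x**(n - 2).

    Returns (b, variance without balancing, minimum variance).
    """
    var_high = moment(2 * n) - moment(n) ** 2           # variance of x**n
    var_low = moment(2 * n - 4) - moment(n - 2) ** 2    # variance of x**(n-2)
    cov = moment(2 * n - 2) - moment(n) * moment(n - 2)
    b = -cov / var_low                                  # from d(variance)/db = 0
    var_min = var_high - cov**2 / var_low
    return b, var_high, var_min


if __name__ == "__main__":
    for n, name in ((3, "x-coma"), (4, "x-spherical")):
        b, var_high, var_min = balance(n)
        print(f"{name}: b = {b}, variance {var_high} -> {var_min}")
```

The variance is a quadratic function of b with coefficients given by the covariance and the variance of the lower-degree term, which is why the optimum follows from a single division.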
11.4 SLIT POLYNOMIALS

By letting $c \to 1$ in the rectangular pupil discussed in Chapter 9, we obtain a unit slit pupil inscribed inside a unit circle that is parallel to the x axis, as illustrated in Figure 11-4. The corresponding orthonormal polynomials representing balanced aberrations for such pupils can be obtained from the rectangular polynomials $R_j(x, y)$ given in Table 9-3 by letting $y \to 0$ and $c \to 1$. Half of the rectangular polynomials thus reduce to zero. Some of the other polynomials are redundant. For example, the 1D defocus and astigmatism cannot be distinguished from each other. The slit polynomials are the Legendre polynomials. Since the pupil is 1D along the x axis, the aberrations vary with x only.
The Legendre polynomials $P_n(x)$ are orthogonal over the interval $[-1, 1]$, according to [3]
$$\frac{1}{2} \int_{-1}^{1} P_n(x) P_{n'}(x)\, dx = \frac{1}{2n + 1}\, \delta_{nn'} , \tag{11-14}$$
where n is a nonnegative integer. A polynomial with an even (odd) value of n consists of terms with even (odd) powers of x. Thus, a polynomial is symmetric for an even n and antisymmetric for an odd n, according to
$$P_n(-x) = (-1)^n P_n(x) . \tag{11-15}$$
Moreover,
$$P_n(1) = 1 , \tag{11-16}$$
$$P_n(-1) = \begin{cases} 1 & \text{for even } n \\ -1 & \text{for odd } n , \end{cases} \tag{11-17}$$
$$P_n(0) = 0 \ \text{for odd } n , \tag{11-18}$$
and, for even n, $P_n(0)$ is positive or negative depending on whether n/2 is even or odd.
Starting with $P_0(x) = 1$ and $P_1(x) = x$, the polynomials can be obtained recursively from the relation
$$(n + 1) P_{n+1}(x) = (2n + 1)\, x P_n(x) - n P_{n-1}(x) . \tag{11-19}$$
It is evident from Eq. (11-19) that $P_n(x)$ is a polynomial of degree n in x, i.e., the highest power of x in a polynomial $P_n(x)$ is n. It is perhaps worth noting that a Zernike radial polynomial $R_{2n}^0(\rho)$ is the same as a shifted Legendre polynomial $\tilde{P}_n(\rho^2) = P_n(2\rho^2 - 1)$, both of which are orthogonal over the interval [0, 1] [see Eq. (4-41)].
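The recursion of Eq. (11-19) translates directly into code. The following minimal sketch builds the polynomials as NumPy `Polynomial` objects, starting from $P_0$ and $P_1$; the function name `legendre_polys` is hypothetical. NumPy's `leg2poly` is used only as an independent comparison.

```python
import numpy as np
from numpy.polynomial import Polynomial
from numpy.polynomial.legendre import leg2poly


def legendre_polys(n_max):
    """Legendre polynomials P_0 ... P_n_max from the recurrence of Eq. (11-19)."""
    x = Polynomial([0.0, 1.0])
    polys = [Polynomial([1.0]), x]                 # P_0(x) = 1 and P_1(x) = x
    for n in range(1, n_max):
        # (n + 1) P_{n+1} = (2n + 1) x P_n - n P_{n-1}
        polys.append(((2 * n + 1) * x * polys[n] - n * polys[n - 1]) / (n + 1))
    return polys[: n_max + 1]


def matches_numpy(n_max=8):
    """Compare power-series coefficients with NumPy's Legendre conversion."""
    polys = legendre_polys(n_max)
    return all(
        np.allclose(polys[n].coef, leg2poly([0.0] * n + [1.0]))
        for n in range(n_max + 1)
    )


if __name__ == "__main__":
    for n, p in enumerate(legendre_polys(4)):
        print(n, p.coef)                            # coefficients in increasing powers of x
    print(matches_numpy())
```

The coefficient arrays are ordered from the constant term upward, so the entry for n = 4 corresponds to $(35x^4 - 30x^2 + 3)/8$.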
The variation of a polynomial $P_n(x)$ for $-1 \le x \le 1$ is shown in Figure 11-5. For clarity, the even polynomials are plotted in Figure 11-5a and the odd in Figure 11-5b. It is evident, as expressed by Eqs. (11-15)–(11-18), that an odd polynomial starts at –1 for $x = -1$ and ends with 1 for $x = 1$. However, the even polynomials start and end at unity. The number of peaks and valleys in a polynomial $P_n(x)$ is $n - 1$.
We use the Legendre polynomials in their orthonormal form $L_n(x)$ given by
$$L_n(x) = \sqrt{2n + 1}\, P_n(x) . \tag{11-20}$$
Their orthonormality is expressed by
$$\frac{1}{2} \int_{-1}^{1} L_n(x) L_{n'}(x)\, dx = \delta_{nn'} . \tag{11-21}$$
The first few $L_n(x)$ polynomials are listed in Table 11-1. The standard deviation of each polynomial [other than $L_0(x)$] is unity. The mean value of each polynomial [other than $L_0(x)$] is zero, as may be seen by letting $n' = 0$ in Eq. (11-21). It is easy to see this explicitly for a polynomial with an odd value of n, since the integral of an odd function over symmetric limits is zero. For an even value of n, the piston term in the polynomial makes its mean value zero. For example, the balanced x-spherical aberration is $x^4 - (6/7)x^2$ with a mean value of $-3/35$. The piston term of $3(3/8) = 9/8$ in $L_4(x)$ makes its mean value zero. The slit pupil is more sensitive to a Seidel aberration with or without balancing compared to a circular pupil, except for spherical aberration, for which it is slightly less sensitive.
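The orthonormality of Eq. (11-21) and the zero mean value of the polynomials can be checked numerically. The sketch below uses NumPy's `Legendre.basis` for $P_n(x)$ and Gauss-Legendre quadrature for the mean over the slit; the names `L` and `slit_gram` are hypothetical.

```python
import numpy as np
from numpy.polynomial.legendre import Legendre, leggauss


def L(n, x):
    """Orthonormal Legendre polynomial L_n(x) = sqrt(2n + 1) P_n(x), Eq. (11-20)."""
    return np.sqrt(2 * n + 1) * Legendre.basis(n)(x)


def slit_gram(n_max=8, n_nodes=20):
    """Matrix of mean values <L_n L_n'> over -1 <= x <= 1, Eq. (11-21)."""
    t, w = leggauss(n_nodes)
    values = np.array([L(n, t) for n in range(n_max + 1)])
    return (values * w) @ values.T / 2.0


def slit_mean(g, n_nodes=20):
    """Mean of g(x) over the unit slit."""
    t, w = leggauss(n_nodes)
    return float(np.sum(w * g(t)) / 2.0)


gram = slit_gram()
mean_balanced_spherical = slit_mean(lambda x: x**4 - (6 / 7) * x**2)

if __name__ == "__main__":
    print(np.round(gram, 12))
    print(mean_balanced_spherical)   # compare with -3/35
```

The first row of the matrix contains the mean values of the polynomials, since $L_0(x) = 1$; it is zero except for its first element.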
11.5 STANDARD DEVIATION OF A PRIMARY ABERRATION

The standard deviation of a 1D primary aberration for a slit pupil can be obtained from the orthonormal polynomials by writing it as a sum of these polynomials. Of course, the sigma values were already obtained in Section 11.3, and they are listed in Table 11-2. Comparing them with the sigma value of a corresponding 2D aberration for a circular pupil (see Tables 4-1 and 4-2), we find that a slit pupil is more sensitive to a primary aberration with or without balancing, except for spherical aberration, for which it is slightly less sensitive.
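Writing an aberration as a sum of the orthonormal polynomials can be sketched with NumPy's `poly2leg`, which converts power-series coefficients into coefficients of $P_n(x)$. Dividing the coefficient of $P_n$ by $\sqrt{2n + 1}$ gives the coefficient of $L_n$, and the standard deviation is the root sum square of the coefficients with $n \ge 1$. The function name `slit_sigma` is hypothetical, and unit aberration coefficients are assumed.

```python
import numpy as np
from numpy.polynomial.legendre import poly2leg


def slit_sigma(power_coeffs):
    """Sigma of W(x) = sum_k c_k x**k over the unit slit.

    power_coeffs lists c_0, c_1, ... in increasing powers of x.
    """
    legendre_coeffs = poly2leg(power_coeffs)        # coefficients of P_n(x)
    n = np.arange(len(legendre_coeffs))
    a = legendre_coeffs / np.sqrt(2 * n + 1)        # coefficients of L_n(x)
    return float(np.sqrt(np.sum(a[1:] ** 2)))       # piston (n = 0) excluded


sigma_values = {
    "tilt": slit_sigma([0, 1]),
    "defocus": slit_sigma([0, 0, 1]),
    "coma": slit_sigma([0, 0, 0, 1]),
    "balanced coma": slit_sigma([0, -3 / 5, 0, 1]),
    "spherical": slit_sigma([0, 0, 0, 0, 1]),
    "balanced spherical": slit_sigma([0, 0, -6 / 7, 0, 1]),
}

if __name__ == "__main__":
    for name, sigma in sigma_values.items():
        print(f"{name:20s} sigma = A / {1 / sigma:.3f}")
```

For example, $x^3 = (2/5)P_3(x) + (3/5)P_1(x)$, so that $\sigma^2 = 4/175 + 3/25 = 1/7$, in agreement with Section 11.3.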
Figure 11-5. Legendre polynomials $P_n(x)$ as a function of x. (a) Even n and (b) odd n.
Table 11-1. Legendre polynomials $L_n(x) = \sqrt{2n + 1}\, P_n(x)$ for a unit slit pupil orthonormal over the interval $-1 \le x \le 1$.
n   Aberration                        $L_n(x)$
0   Piston                            $1$
1   Tilt                              $\sqrt{3}\, x$
2   Defocus                           $(\sqrt{5}/2)(3x^2 - 1)$
3   Primary coma                      $(\sqrt{7}/2)(5x^3 - 3x)$
4   Primary spherical aberration      $(3/8)(35x^4 - 30x^2 + 3)$
5   Secondary coma                    $(\sqrt{11}/8)(63x^5 - 70x^3 + 15x)$
6   Secondary spherical aberration    $(\sqrt{13}/16)(231x^6 - 315x^4 + 105x^2 - 5)$
7   Tertiary coma                     $(\sqrt{15}/16)(429x^7 - 693x^5 + 315x^3 - 35x)$
8   Tertiary spherical aberration     $(\sqrt{17}/128)(6435x^8 - 12012x^6 + 6930x^4 - 1260x^2 + 35)$
Table 11-2. Standard deviation $\sigma$ of a primary aberration for a slit pupil, where $A_i$
is its aberration coefficient.

Aberration                       $\sigma$
Tilt                             $A_t/\sqrt{3} = A_t/1.732$
Defocus (or astigmatism)         $2A_d/(3\sqrt{5}) = A_d/3.354$
Coma                             $A_c/\sqrt{7} = A_c/2.646$
Balanced coma                    $2A_c/(5\sqrt{7}) = A_c/6.614$
Spherical aberration             $4A_s/15 = A_s/3.750$
Balanced spherical aberration    $8A_s/105 = A_s/13.125$
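The entries of Table 11-2 can be reproduced from the moments of $x^k$ over the slit. The sketch below is only illustrative: the function names are not from the text, unit aberration coefficients are assumed, and the balanced forms $x^3 - (3/5)x$ and $x^4 - (6/7)x^2$ are read off from $L_3(x)$ and $L_4(x)$ in Table 11-1.

```python
from fractions import Fraction
from math import sqrt

def slit_mean(p):
    """(1/2) * integral over -1 <= x <= 1 of the polynomial with coefficients p."""
    return sum(c * Fraction(1, k + 1) for k, c in enumerate(p) if k % 2 == 0)

def poly_mul(p, q):
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def slit_variance(p):
    """Variance <W^2> - <W>^2 of the aberration W(x) across the slit pupil."""
    return slit_mean(poly_mul(p, p)) - slit_mean(p) ** 2

# Coefficient lists (lowest power first) for unit aberration coefficients A_i = 1.
aberrations = {
    "tilt": [0, 1],
    "defocus": [0, 0, 1],
    "coma": [0, 0, 0, 1],
    "balanced coma": [0, Fraction(-3, 5), 0, 1],
    "spherical": [0, 0, 0, 0, 1],
    "balanced spherical": [0, 0, Fraction(-6, 7), 0, 1],
}

variance = {name: slit_variance([Fraction(c) for c in p]) for name, p in aberrations.items()}
sigma = {name: sqrt(v) for name, v in variance.items()}

for name, s in sigma.items():
    print(f"{name:20s} sigma = A/{1 / s:.3f}")
```

A constant (piston) term does not change the variance, so the piston $3/35$ of the balanced spherical aberration is omitted here.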

11.6 SUMMARY
A slit pupil is a limiting case of a rectangular pupil whose one dimension is
negligibly small, as illustrated in Figure 11-1. Its PSF is shown in Figure 11-2. The image
of an incoherent slit object parallel to the slit pupil is shown in Figure 11-3. The balanced
aberrations for a slit pupil are the Legendre polynomials. We have written them in an
orthonormal form, as in Eq. (11-3). They are listed in Table 11-1 up to the eighth order
and plotted in Figure 11-4. The sigma value of a 1D primary aberration with and without
balancing is listed in Table 11-2. It is shown that a slit pupil is more sensitive to a
primary aberration with or without balancing, except for spherical aberration for which it
is slightly less sensitive.
References

analytical solution,” J. Opt. Soc. Am. A 24, 2994–3016 (2007).
2. R. Barakat and L. Riseberg, “Diffraction theory of the aberrations of a slit
aperture,” J. Opt. Soc. Am. 55, 878–881 (1965). There is an error in their
polynomial $S_2$, which should read as $x^2 - 1/3$.

CHAPTER 12
USE OF ZERNIKE CIRCLE POLYNOMIALS FOR NONCIRCULAR PUPILS
12.1 Introduction
12.2 Relationship Between the Orthonormal and the Corresponding Zernike Circle Coefficients
12.3 Use of Zernike Circle Polynomials for the Analysis of an Annular Wavefront
  12.3.1 Zernike Circle Coefficients in Terms of the Annular Coefficients
  12.3.2 Interferometer Setting Errors
  12.3.3 Wavefront Fitting
  12.3.4 Application to an Annular Seidel Aberration Function
    12.3.4.1 Annular Coefficients
    12.3.4.2 Circle Coefficients
    12.3.4.3 Residual Aberration Function after Removing Interferometer Setting Errors
    12.3.4.4 Error with Assuming Circle Polynomials to be Orthogonal over an Annulus
    12.3.4.5 Numerical Example
12.4 Use of Zernike Circle Polynomials for the Analysis of a Hexagonal Wavefront
  12.4.1 Zernike Circle Coefficients in Terms of Hexagonal Coefficients
  12.4.2 Interferometer Setting Errors
12.5 Aberration Coefficients from Discrete Wavefront Data
12.6 Summary
References
12.1 INTRODUCTION
The orthonormal polynomials for various pupils discussed in the preceding chapters
represent balanced aberrations for those pupils, just as the Zernike circle polynomials
(discussed in Chapter 4) do for a circular pupil. In this chapter, we consider the use of
circle polynomials for the analysis of a noncircular wavefront. Since the circle
polynomials form a complete set, any wavefront, regardless of the shape of the pupil
(which defines the perimeter of the wavefront), can be expanded in terms of them.
Moreover, since each orthonormal polynomial is a linear combination of the circle
polynomials [see Eq. (3-18)], the wavefront fitting with the former set of polynomials is
as good as that with the latter. However, we illustrate the pitfalls of using circle
polynomials for a noncircular pupil by considering an annular and a hexagonal pupil
[1,2].
It is shown that, unlike the orthonormal coefficients, the circle coefficients generally
change as the number of polynomials used in the expansion changes. Although the
wavefront fit with a certain number of circle polynomials is the same as that with the
corresponding orthonormal polynomials, the piston circle coefficient does not represent
the mean value of the aberration function, and the sum of the squares of the other
coefficients does not yield its variance. The interferometer setting errors of tip, tilt,
and defocus from a 4-circle-polynomial expansion are the same as those from the
orthonormal polynomial expansion. However, if these errors are obtained from, say, an
11-circle-polynomial expansion and removed from the aberration function, zeroing out the
residual aberration function yields incorrect polishing. If one follows the common practice
of defining the center of an interferogram, drawing a circle around it, and determining the
circle coefficients in the same manner as for a circular interferogram, then the circle
coefficients of a noncircular interferogram do not yield a correct representation of the
aberration function. Moreover, in this case, some of the higher-order coefficients of
aberrations that are nonexistent in the aberration function are also nonzero. Finally, the
circle coefficients, however obtained, do not represent coefficients of the balanced
aberrations for a noncircular pupil. Such results are illustrated analytically and
numerically by considering annular and hexagonal Seidel aberration functions as
examples.
12.2 RELATIONSHIP BETWEEN THE ORTHONORMAL AND THE CORRESPONDING ZERNIKE CIRCLE COEFFICIENTS
Consider an aberration function $W(x,y)$ across a noncircular pupil fit with $J$
orthonormal polynomials $F_j(x,y)$ in the form
$$\hat{W}(x,y) = \sum_{j=1}^{J} a_j F_j(x,y) , \tag{12-1}$$
where $\hat{W}(x,y)$ is the best-fit estimate of the function with $J$ polynomials, and $a_j$ is the
coefficient of the polynomial $F_j(x,y)$. The orthonormality of the polynomials across the
noncircular pupil is described by
$$\frac{1}{A}\int_{\text{pupil}} F_j(x,y)\,F_{j'}(x,y)\,dx\,dy = \delta_{jj'} , \tag{12-2}$$
where $\delta_{jj'}$ is a Kronecker delta. The orthonormal coefficients are given by
$$a_j = \frac{1}{A}\int_{\text{pupil}} W(x,y)\,F_j(x,y)\,dx\,dy . \tag{12-3}$$
It is evident that their value does not depend on the number of polynomials $J$ used in the
expansion.
Letting $F_1(x,y) = 1$, it is easy to see from Eq. (12-2) that the mean value of a
polynomial $F_{j \ne 1}(x,y)$ across the pupil is zero. Hence, the mean and the mean square
values of the estimated aberration function are given by
$$\langle \hat{W} \rangle = a_1 \tag{12-4}$$
and
$$\langle \hat{W}^2(x,y) \rangle = \sum_{j=1}^{J} a_j^2 , \tag{12-5}$$
respectively. Its variance is accordingly given by
$$\sigma_{\hat{W}}^2 = \langle \hat{W}^2(x,y) \rangle - \langle \hat{W}(x,y) \rangle^2 = \sum_{j=2}^{J} a_j^2 , \tag{12-6}$$
where $\sigma_{\hat{W}}$ is its standard deviation. The number of polynomials $J$ used in the expansion
to estimate the aberration function is increased until $\sigma_{\hat{W}}$ approaches the true value as
determined from the ray-trace or interferometric data within a certain prespecified
tolerance.
Since the circle polynomials $Z_j(x,y)$ form a complete set, each orthonormal
polynomial can be written in terms of them as a linear sum in the form [see Eq. (3-18)]
$$F_j(x,y) = \sum_{i=1}^{J} M_{ji} Z_i(x,y) , \tag{12-7}$$
or
$$\{F_j\} = M\{Z_j\} , \tag{12-8}$$
where $M_{ji}$ are the elements of the lower triangular conversion matrix $M$ (i.e., $M_{ji} = 0$
for $i > j$). The estimated aberration function can accordingly be expanded in terms of the
circle polynomials in the form
$$\hat{W}(x,y) = \sum_{j=1}^{J} \hat{b}_j Z_j(x,y) , \tag{12-9}$$
where $\hat{b}_j$ is the Zernike coefficient of a polynomial $Z_j(x,y)$. The circle polynomials are
orthonormal over a unit circle in Cartesian coordinates according to
$$\frac{1}{\pi}\int_{x^2+y^2 \le 1} Z_j(x,y)\,Z_{j'}(x,y)\,dx\,dy = \delta_{jj'} , \tag{12-10a}$$
or in polar coordinates (with $x = \rho\cos\theta$ and $y = \rho\sin\theta$)
$$\frac{1}{\pi}\int_{0}^{1}\!\int_{0}^{2\pi} Z_j(\rho,\theta)\,Z_{j'}(\rho,\theta)\,\rho\,d\rho\,d\theta = \delta_{jj'} . \tag{12-10b}$$
Substituting Eq. (12-7) into Eq. (12-1), we obtain
$$\hat{W}(x,y) = \sum_{j=1}^{J} a_j \sum_{i=1}^{j} M_{ji} Z_i(x,y) = \sum_{j=1}^{J}\sum_{i=j}^{J} a_i M_{ij} Z_j(x,y) . \tag{12-11}$$
Comparing Eqs. (12-9) and (12-11), we obtain
$$\hat{b}_j = \sum_{i=j}^{J} a_i M_{ij} . \tag{12-12}$$
It is clear that the value of a circle coefficient $\hat{b}_j$ depends on the number of polynomials $J$
used in the expansion. Moreover, it is a linear combination of the orthonormal
coefficients, just as an orthonormal polynomial is a linear combination of the circle
polynomials. Equation (12-12) can be written in a matrix form as
$$\hat{b} = M^T a , \tag{12-13}$$
where $a$ and $\hat{b}$ are the column vectors representing the orthonormal and the Zernike
coefficients, respectively, and $M^T$ is the transpose of the conversion matrix $M$. Thus, the
matrix that is used to obtain the orthonormal polynomials from the circle polynomials is
also used to obtain the circle coefficients from the orthonormal coefficients. The
transpose of a matrix is obtained by interchanging its rows and columns. Since $M$ is a
lower triangular matrix, $M^T$ is an upper triangular matrix. Multiplying both sides of Eq.
(12-13) by the inverse $(M^T)^{-1}$ of $M^T$, we obtain
$$a = (M^T)^{-1}\hat{b} . \tag{12-14}$$
Accordingly, if the circle coefficients are known, the orthonormal coefficients can be
obtained from them.
If the orthonormal coefficients are not known, the circle coefficients $\hat{b}_j$ can be
obtained by a least squares fit. Suppose the aberration values are known over a certain
domain by way of interferometry at $N$ data points. Equation (12-9) can be written in
matrix form
$$\hat{S} = Z\hat{b} , \tag{12-15}$$
where $\hat{S}$ is an array of $N$ elements representing the values of the aberration function
$\hat{W}(x,y)$, and $Z$ is an $N \times J$ matrix representing each of the $J$ polynomials over the $N$
data points. Solving Eq. (12-15), for example, with a standard singular-value
decomposition algorithm yields
$$\hat{b} = Z^{-1}\hat{S} , \tag{12-16}$$
where $Z^{-1}$ is a generalized inverse of the $Z$ matrix. Of course, this procedure can also be
used to determine the orthonormal coefficients by replacing the circle polynomials with
the orthonormal polynomials. Except for any numerical error because of the finite
number $N$ of the data points, the $\hat{b}$-coefficients given by Eq. (12-16) are the same as
those given by Eq. (12-13).
If the practice of drawing a unit circle around an interferogram and determining the
Zernike coefficients for a circular pupil is extended to a noncircular wavefront, the
coefficients thus obtained will be given by
$$b_j = \frac{1}{A}\int_{\text{pupil}} W(x,y)\,Z_j(x,y)\,dx\,dy . \tag{12-17}$$
The circle polynomials in Eq. (12-17) are implicitly assumed to be orthonormal over the
noncircular pupil. The value of a circle coefficient $b_j$ does not depend on the number of
polynomials used in the expansion. Substituting Eq. (12-1) for the estimated aberration
function $\hat{W}(x,y)$ in terms of the orthonormal polynomials, we obtain
$$b_j = \sum_{j'=1}^{J} a_{j'}\,\frac{1}{A}\int_{\text{pupil}} Z_j(x,y)\,F_{j'}(x,y)\,dx\,dy = \sum_{j'=1}^{J} a_{j'} \langle Z_j | F_{j'} \rangle , \tag{12-18}$$
or in a matrix form
$$b = C^{ZF} a , \tag{12-19}$$
where $C^{ZF}$ is a matrix representing the inner products $\langle Z_j | F_{j'} \rangle$ of the Zernike
polynomials with the orthonormal polynomials over the domain of the noncircular
wavefront. As illustrated in Sections 12.3 and 12.4, by considering an annular or a
hexagonal Seidel aberration function, respectively, the circle coefficients $b_j$ thus
obtained are incorrect in the sense that they do not yield a least-squares fit of the
aberration function $W(x,y)$, unlike the coefficients $\hat{b}_j$. This, of course, is due to the
incorrect assumption of orthonormality of the circle polynomials over the noncircular
pupil.
To relate the $\hat{b}$- and the $b$-circle coefficients, we equate the right-hand sides of Eqs.
(12-1) and (12-9), multiply both sides by $Z_{j'}$, and integrate over the domain of the
noncircular pupil. Thus,
$$\sum_{j=1}^{J} \hat{b}_j Z_j(x,y) = \sum_{j=1}^{J} a_j F_j(x,y) \tag{12-20}$$
and
$$\sum_{j=1}^{J} \hat{b}_j \langle Z_{j'} | Z_j \rangle = \sum_{j=1}^{J} a_j \langle Z_{j'} | F_j \rangle , \tag{12-21}$$
or
$$C^{ZZ}\hat{b} = C^{ZF} a = b , \tag{12-22}$$
where we have utilized Eq. (12-19). From Eqs. (12-13) and (12-22), it is evident that
$$C^{ZF} = C^{ZZ} M^T . \tag{12-23}$$
Typical elements of the matrices $C^{ZZ}$ and $C^{ZF}$ are given by
$$c_{jj'} = \frac{1}{A}\int_{\text{pupil}} Z_j(x,y)\,Z_{j'}(x,y)\,dx\,dy \tag{12-24}$$
and
$$d_{jj'} = \frac{1}{A}\int_{\text{pupil}} Z_j(x,y)\,F_{j'}(x,y)\,dx\,dy , \tag{12-25}$$
respectively. It is evident from Eq. (12-24) that $c_{jj'} = c_{j'j}$.
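To make Eq. (12-24) concrete, the following minimal sketch evaluates a few elements $c_{jj'}$ numerically for an annular pupil, the case treated in Section 12.3. It rests on assumptions that are not in the text: an obscuration ratio of 0.5 chosen for illustration, a simple midpoint rule on equal-area cells, and the names `ZERNIKE` and `c_element`. The circle polynomials are written in the ordering of Table 12-1.

```python
import math

# Zernike circle polynomials Z_1 ... Z_11 in the ordering of Table 12-1.
ZERNIKE = {
    1: lambda r, t: 1.0,
    2: lambda r, t: 2 * r * math.cos(t),
    3: lambda r, t: 2 * r * math.sin(t),
    4: lambda r, t: math.sqrt(3) * (2 * r**2 - 1),
    5: lambda r, t: math.sqrt(6) * r**2 * math.sin(2 * t),
    6: lambda r, t: math.sqrt(6) * r**2 * math.cos(2 * t),
    7: lambda r, t: math.sqrt(8) * (3 * r**3 - 2 * r) * math.sin(t),
    8: lambda r, t: math.sqrt(8) * (3 * r**3 - 2 * r) * math.cos(t),
    9: lambda r, t: math.sqrt(8) * r**3 * math.sin(3 * t),
    10: lambda r, t: math.sqrt(8) * r**3 * math.cos(3 * t),
    11: lambda r, t: math.sqrt(5) * (6 * r**4 - 6 * r**2 + 1),
}

def c_element(j, jp, eps, n_r=400, n_t=24):
    """c_jj' of Eq. (12-24) for the annulus eps <= rho <= 1, using a midpoint
    rule that is uniform in rho^2 and theta (equal-area cells)."""
    total = 0.0
    for i in range(n_r):
        rho = math.sqrt(eps**2 + (1 - eps**2) * (i + 0.5) / n_r)
        for k in range(n_t):
            theta = 2 * math.pi * (k + 0.5) / n_t
            total += ZERNIKE[j](rho, theta) * ZERNIKE[jp](rho, theta)
    return total / (n_r * n_t)

eps = 0.5                      # illustrative obscuration ratio (assumption)
c44 = c_element(4, 4, eps)     # differs from 1: Z_4 is not normalized over the annulus
c14 = c_element(1, 4, eps)     # nonzero: Z_4 does not have zero mean over the annulus
c28 = c_element(2, 8, eps)     # nonzero: tilt and coma are correlated
c82 = c_element(8, 2, eps)     # equals c28 by symmetry
c45 = c_element(4, 5, eps)     # zero: different azimuthal dependence
print(c44, c14, c28, c82, c45)
```

Nonzero off-diagonal elements such as $c_{14}$ and $c_{28}$ are what make $C^{ZZ}$ differ from the identity matrix, and hence $b$ differ from $\hat{b}$ in Eq. (12-22).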

12.3 USE OF ZERNIKE CIRCLE POLYNOMIALS FOR THE ANALYSIS OF AN ANNULAR WAVEFRONT
12.3.1 Zernike Circle Coefficients in Terms of the Annular Coefficients
Consider a system with a unit annular pupil with an obscuration ratio $\epsilon$, as illustrated
in Figure 5-1. The polynomials $A_j(\rho,\theta;\epsilon)$ that are orthonormal across it and represent
balanced aberrations for it are similar to the circle polynomials in that they are separable
in the radial coordinate $\rho$ and the azimuthal angle $\theta$ of a point on the pupil. The
dependence on the obscuration ratio is contained only in the radial portion of the
polynomial. As discussed in Chapter 5, the annular polynomials are given by
$$A_{\text{even } j}(\rho,\theta;\epsilon) = \sqrt{2(n+1)}\,R_n^m(\rho;\epsilon)\cos m\theta , \quad m \ne 0 , \tag{12-26a}$$
$$A_{\text{odd } j}(\rho,\theta;\epsilon) = \sqrt{2(n+1)}\,R_n^m(\rho;\epsilon)\sin m\theta , \quad m \ne 0 , \tag{12-26b}$$
$$A_j(\rho,\theta;\epsilon) = \sqrt{n+1}\,R_n^0(\rho;\epsilon) , \quad m = 0 , \tag{12-26c}$$
where $\epsilon \le \rho \le 1$, $n$ and $m$ are positive integers (including zero), and $n - m \ge 0$ and even.
The annular polynomials are orthonormal across the annular pupil according to
$$\frac{1}{\pi(1-\epsilon^2)}\int_{\epsilon}^{1}\!\int_{0}^{2\pi} A_j(\rho,\theta;\epsilon)\,A_{j'}(\rho,\theta;\epsilon)\,\rho\,d\rho\,d\theta = \delta_{jj'} . \tag{12-27}$$
As $\epsilon \to 0$, an annular polynomial reduces to a corresponding circle polynomial. The
annular polynomials can be written in terms of the Zernike circle polynomials $Z_j(\rho,\theta)$,
as discussed in Chapter 4, according to
$$\{A_j\} = M\{Z_j\} , \tag{12-28}$$
where $M$ is the conversion matrix.
An annular aberration function $W(\rho,\theta;\epsilon)$ can be estimated by $J$ orthonormal
polynomials according to
$$\hat{W}(\rho,\theta;\epsilon) = \sum_{j=1}^{J} a_j A_j(\rho,\theta;\epsilon) , \tag{12-29}$$
where the orthonormal annular expansion coefficients are given by
$$a_j = \frac{1}{\pi(1-\epsilon^2)}\int_{\epsilon}^{1}\!\int_{0}^{2\pi} W(\rho,\theta;\epsilon)\,A_j(\rho,\theta;\epsilon)\,\rho\,d\rho\,d\theta . \tag{12-30}$$
The mean value and the variance of the estimated function are accordingly given by Eqs.
(12-4) and (12-6).
Table 12-1 lists the first 11 annular polynomials, as obtained from the annular-polynomial
Tables 5-3 and 5-4. They are given in terms of the circle polynomials in
Table 12-2. The nonzero elements of an $11 \times 11$ conversion matrix, as obtained from Table
12-2, are listed in Table 12-3. The transpose matrix $M^T$ can be obtained easily by
interchanging the rows and columns of $M$. The nonzero elements of the $11 \times 11$ matrices
$C^{ZZ}$ and $C^{ZF}$ are given in Tables 12-4 and 12-5, respectively.
Given a certain annular aberration function, its annular coefficients $a_j$ can be
determined from Eq. (12-30). If it is expanded in terms of only the first four circle
polynomials, i.e., if $J = 4$ in Eq. (12-9), then the expansion $\hat{b}$-coefficients according to
Eq. (12-13) are given by
$$\begin{pmatrix} \hat{b}_1 \\ \hat{b}_2 \\ \hat{b}_3 \\ \hat{b}_4 \end{pmatrix}
= \begin{pmatrix}
1 & 0 & 0 & -\sqrt{3}\,\epsilon^2 (1-\epsilon^2)^{-1} \\
0 & (1+\epsilon^2)^{-1/2} & 0 & 0 \\
0 & 0 & (1+\epsilon^2)^{-1/2} & 0 \\
0 & 0 & 0 & (1-\epsilon^2)^{-1}
\end{pmatrix}
\begin{pmatrix} a_1 \\ a_2 \\ a_3 \\ a_4 \end{pmatrix}
= \begin{pmatrix}
a_1 - \sqrt{3}\,\epsilon^2 (1-\epsilon^2)^{-1} a_4 \\
(1+\epsilon^2)^{-1/2} a_2 \\
(1+\epsilon^2)^{-1/2} a_3 \\
(1-\epsilon^2)^{-1} a_4
\end{pmatrix} \tag{12-31}$$
or
$$\hat{b}_1 = a_1 - \sqrt{3}\,\epsilon^2 (1-\epsilon^2)^{-1} a_4 , \tag{12-32a}$$
$$\hat{b}_2 = (1+\epsilon^2)^{-1/2} a_2 , \tag{12-32b}$$
$$\hat{b}_3 = (1+\epsilon^2)^{-1/2} a_3 , \tag{12-32c}$$
$$\hat{b}_4 = (1-\epsilon^2)^{-1} a_4 . \tag{12-32d}$$
These coefficients represent the Zernike piston, tip, tilt, and defocus coefficients.
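To see how these coefficients change with the number of polynomials used in the
expansion, we consider an expansion using the first 11 circle polynomials. The
coefficients are now given by
$$\hat{b}_1 = a_1 - \sqrt{3}\,\epsilon^2 (1-\epsilon^2)^{-1} a_4 + \sqrt{5}\,\epsilon^2 (1+\epsilon^2)(1-\epsilon^2)^{-2} a_{11} , \tag{12-33a}$$
$$\hat{b}_2 = (1+\epsilon^2)^{-1/2} a_2 - (2\sqrt{2}\,\epsilon^4 / B)\,a_8 , \tag{12-33b}$$
$$\hat{b}_3 = (1+\epsilon^2)^{-1/2} a_3 - (2\sqrt{2}\,\epsilon^4 / B)\,a_7 , \tag{12-33c}$$
$$\hat{b}_4 = (1-\epsilon^2)^{-1} a_4 - \sqrt{15}\,\epsilon^2 (1-\epsilon^2)^{-2} a_{11} , \tag{12-33d}$$
$$\hat{b}_5 = (1+\epsilon^2+\epsilon^4)^{-1/2} a_5 , \tag{12-33e}$$
$$\hat{b}_6 = (1+\epsilon^2+\epsilon^4)^{-1/2} a_6 , \tag{12-33f}$$
$$\hat{b}_7 = [(1+\epsilon^2)/B]\,a_7 , \tag{12-33g}$$
$$\hat{b}_8 = [(1+\epsilon^2)/B]\,a_8 , \tag{12-33h}$$
$$\hat{b}_9 = (1+\epsilon^2+\epsilon^4+\epsilon^6)^{-1/2} a_9 , \tag{12-33i}$$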
Table 12-1. Orthonormal annular polynomials $A_j(\rho,\theta;\epsilon)$.

j   n  m  $A_j(\rho,\theta;\epsilon)$   Aberration Name
1   0  0  $1$   Piston
2   1  1  $2\,[\rho/(1+\epsilon^2)^{1/2}]\cos\theta$   x tilt
3   1  1  $2\,[\rho/(1+\epsilon^2)^{1/2}]\sin\theta$   y tilt
4   2  0  $\sqrt{3}\,(2\rho^2 - 1 - \epsilon^2)/(1-\epsilon^2)$   Defocus
5   2  2  $\sqrt{6}\,[\rho^2/(1+\epsilon^2+\epsilon^4)^{1/2}]\sin 2\theta$   45° Primary astigmatism
6   2  2  $\sqrt{6}\,[\rho^2/(1+\epsilon^2+\epsilon^4)^{1/2}]\cos 2\theta$   0° Primary astigmatism
7   3  1  $\sqrt{8}\,\dfrac{3(1+\epsilon^2)\rho^3 - 2(1+\epsilon^2+\epsilon^4)\rho}{(1-\epsilon^2)[(1+\epsilon^2)(1+4\epsilon^2+\epsilon^4)]^{1/2}}\sin\theta$   Primary y coma
8   3  1  $\sqrt{8}\,\dfrac{3(1+\epsilon^2)\rho^3 - 2(1+\epsilon^2+\epsilon^4)\rho}{(1-\epsilon^2)[(1+\epsilon^2)(1+4\epsilon^2+\epsilon^4)]^{1/2}}\cos\theta$   Primary x coma
9   3  3  $\sqrt{8}\,[\rho^3/(1+\epsilon^2+\epsilon^4+\epsilon^6)^{1/2}]\sin 3\theta$
10  3  3  $\sqrt{8}\,[\rho^3/(1+\epsilon^2+\epsilon^4+\epsilon^6)^{1/2}]\cos 3\theta$
11  4  0  $\sqrt{5}\,[6\rho^4 - 6(1+\epsilon^2)\rho^2 + 1 + 4\epsilon^2 + \epsilon^4]/(1-\epsilon^2)^2$   Primary spherical aberration
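Table 12-2. Annular polynomials $A_j(\rho,\theta;\epsilon)$ in terms of the Zernike circle
polynomials $Z_j(\rho,\theta)$, where $\epsilon$ is the obscuration ratio of the annular pupil.

$A_1 = Z_1$
$A_2 = (1+\epsilon^2)^{-1/2} Z_2$
$A_3 = (1+\epsilon^2)^{-1/2} Z_3$
$A_4 = (1-\epsilon^2)^{-1}(-\sqrt{3}\,\epsilon^2 Z_1 + Z_4)$
$A_5 = (1+\epsilon^2+\epsilon^4)^{-1/2} Z_5$
$A_6 = (1+\epsilon^2+\epsilon^4)^{-1/2} Z_6$
$A_7 = B^{-1}[-2\sqrt{2}\,\epsilon^4 Z_3 + (1+\epsilon^2) Z_7]$
$A_8 = B^{-1}[-2\sqrt{2}\,\epsilon^4 Z_2 + (1+\epsilon^2) Z_8]$
$A_9 = (1+\epsilon^2+\epsilon^4+\epsilon^6)^{-1/2} Z_9$
$A_{10} = (1+\epsilon^2+\epsilon^4+\epsilon^6)^{-1/2} Z_{10}$
$A_{11} = (1-\epsilon^2)^{-2}[\sqrt{5}\,\epsilon^2 (1+\epsilon^2) Z_1 - \sqrt{15}\,\epsilon^2 Z_4 + Z_{11}]$

$B = (1-\epsilon^2)[(1+\epsilon^2)(1+4\epsilon^2+\epsilon^4)]^{1/2}$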
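Table 12-3. Nonzero elements of an $11 \times 11$ conversion matrix $M$ for obtaining the
annular polynomials $A_j(\rho,\theta;\epsilon)$ from the Zernike circle polynomials $Z_j(\rho,\theta)$.

$M_{11} = 1$
$M_{22} = (1+\epsilon^2)^{-1/2} = M_{33}$
$M_{41} = -\sqrt{3}\,\epsilon^2 (1-\epsilon^2)^{-1}$
$M_{44} = (1-\epsilon^2)^{-1}$
$M_{55} = (1+\epsilon^2+\epsilon^4)^{-1/2} = M_{66}$
$M_{73} = -2\sqrt{2}\,\epsilon^4 B^{-1} = M_{82}$
$M_{77} = (1+\epsilon^2) B^{-1} = M_{88}$
$M_{99} = (1+\epsilon^2+\epsilon^4+\epsilon^6)^{-1/2} = M_{10,10}$
$M_{11,1} = \sqrt{5}\,\epsilon^2 (1+\epsilon^2)(1-\epsilon^2)^{-2}$
$M_{11,4} = -\sqrt{15}\,\epsilon^2 (1-\epsilon^2)^{-2}$
$M_{11,11} = (1-\epsilon^2)^{-2}$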
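Table 12-4. Nonzero elements $c_{jj'}$ of $11 \times 11$ matrix $C^{ZZ}$ of the Zernike circle
polynomials over an annular pupil of obscuration ratio $\epsilon$, where $c_{jj'} = c_{j'j}$.

$c_{11} = 1$
$c_{14} = \sqrt{3}\,\epsilon^2 = c_{41}$
$c_{1,11} = -\sqrt{5}\,\epsilon^2 (1 - 2\epsilon^2) = c_{11,1}$
$c_{22} = 1 + \epsilon^2 = c_{33}$
$c_{28} = 2\sqrt{2}\,\epsilon^4 = c_{82} = c_{37} = c_{73}$
$c_{44} = 1 - 2\epsilon^2 + 4\epsilon^4$
$c_{4,11} = \sqrt{15}\,\epsilon^2 (1 - 3\epsilon^2 + 3\epsilon^4) = c_{11,4}$
$c_{55} = 1 + \epsilon^2 + \epsilon^4 = c_{66}$
$c_{77} = 1 + \epsilon^2 - 7\epsilon^4 + 9\epsilon^6 = c_{88}$
$c_{99} = 1 + \epsilon^2 + \epsilon^4 + \epsilon^6 = c_{10,10}$
$c_{11,11} = 1 - 4\epsilon^2 + 26\epsilon^4 - 54\epsilon^6 + 36\epsilon^8$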
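Table 12-5. Nonzero elements $d_{jj'}$ of $11 \times 11$ matrix $C^{ZF}$ of the Zernike circle
polynomials over an annular pupil of obscuration ratio $\epsilon$.

$d_{11} = 1$
$d_{22} = (1+\epsilon^2)^{1/2} = d_{33}$
$d_{41} = \sqrt{3}\,\epsilon^2$
$d_{44} = (1 - 2\epsilon^2 + \epsilon^4)(1-\epsilon^2)^{-1}$
$d_{55} = (1+\epsilon^2+\epsilon^4)^{1/2} = d_{66}$
$d_{73} = 2\sqrt{2}\,\epsilon^4 (1+\epsilon^2)^{-1/2} = d_{82}$
$d_{77} = (1-\epsilon^2)(1+4\epsilon^2+\epsilon^4)^{1/2}(1+\epsilon^2)^{-1/2} = d_{88}$
$d_{99} = (1+\epsilon^2+\epsilon^4+\epsilon^6)^{1/2} = d_{10,10}$
$d_{11,1} = -\sqrt{5}\,\epsilon^2 (1 - 2\epsilon^2)$
$d_{11,4} = \sqrt{15}\,\epsilon^2 (1-\epsilon^2)$
$d_{11,11} = (1-\epsilon^2)^2$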
$$\hat{b}_{10} = (1+\epsilon^2+\epsilon^4+\epsilon^6)^{-1/2} a_{10} , \tag{12-33j}$$
$$\hat{b}_{11} = (1-\epsilon^2)^{-2} a_{11} , \tag{12-33k}$$
where
$$B = (1-\epsilon^2)[(1+\epsilon^2)(1+4\epsilon^2+\epsilon^4)]^{1/2} . \tag{12-34}$$
It is evident that all of the first four coefficients change, and $\hat{b}_j = M_{jj} a_j$ for $5 \le j \le 11$.
The Zernike astigmatism coefficients $\hat{b}_5$ and $\hat{b}_6$ are smaller than the corresponding
annular coefficients $a_5$ and $a_6$ by a factor of $(1+\epsilon^2+\epsilon^4)^{1/2}$. However, the Zernike
spherical aberration coefficient $\hat{b}_{11}$ is larger than the corresponding annular coefficient
$a_{11}$ by a factor of $(1-\epsilon^2)^{-2}$. For example, when $\epsilon = 0.5$, the astigmatism coefficients are
smaller by a factor of 1.1456, and the spherical aberration coefficient is larger by a factor
of 1.7778.
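The dependence of the $\hat{b}$-coefficients on $J$ can be seen numerically by building the conversion matrix of Table 12-3 and applying Eq. (12-12). The sketch below is illustrative only: the function names are not from the text, and the annular coefficients are all set to unity purely as a hypothetical input.

```python
import math

def conversion_matrix(eps):
    """11 x 11 lower triangular M of Table 12-3 (1-based j stored at index j - 1)."""
    e2, e4, e6 = eps**2, eps**4, eps**6
    B = (1 - e2) * math.sqrt((1 + e2) * (1 + 4 * e2 + e4))   # Eq. (12-34)
    M = [[0.0] * 11 for _ in range(11)]

    def put(i, j, value):
        M[i - 1][j - 1] = value

    put(1, 1, 1.0)
    put(2, 2, (1 + e2) ** -0.5)
    put(3, 3, (1 + e2) ** -0.5)
    put(4, 1, -math.sqrt(3) * e2 / (1 - e2))
    put(4, 4, 1 / (1 - e2))
    put(5, 5, (1 + e2 + e4) ** -0.5)
    put(6, 6, (1 + e2 + e4) ** -0.5)
    put(7, 3, -2 * math.sqrt(2) * e4 / B)
    put(8, 2, -2 * math.sqrt(2) * e4 / B)
    put(7, 7, (1 + e2) / B)
    put(8, 8, (1 + e2) / B)
    put(9, 9, (1 + e2 + e4 + e6) ** -0.5)
    put(10, 10, (1 + e2 + e4 + e6) ** -0.5)
    put(11, 1, math.sqrt(5) * e2 * (1 + e2) / (1 - e2) ** 2)
    put(11, 4, -math.sqrt(15) * e2 / (1 - e2) ** 2)
    put(11, 11, 1 / (1 - e2) ** 2)
    return M

def circle_coefficients(a, eps):
    """b_hat_j = sum_{i=j}^{J} a_i M_ij, Eq. (12-12), with J = len(a)."""
    J = len(a)
    M = conversion_matrix(eps)
    return [sum(a[i] * M[i][j] for i in range(j, J)) for j in range(J)]

eps = 0.5
a = [1.0] * 11                                  # hypothetical annular coefficients
b_hat_4 = circle_coefficients(a[:4], eps)       # J = 4, Eqs. (12-32a-d)
b_hat_11 = circle_coefficients(a, eps)          # J = 11, Eqs. (12-33a-k)

astigmatism_factor = a[4] / b_hat_11[4]         # (1 + eps^2 + eps^4)^(1/2)
spherical_factor = b_hat_11[10] / a[10]         # (1 - eps^2)^(-2)
print(b_hat_4)
print(b_hat_11[:4])
print(round(astigmatism_factor, 4), round(spherical_factor, 4))
```

Comparing `b_hat_4` with the first four entries of `b_hat_11` shows that piston, tip, tilt, and defocus all change when $J$ goes from 4 to 11, as stated above.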
It should be evident that, because of the orthogonality of the trigonometric functions,
there is correlation between an annular and a circle polynomial only if they have the same
azimuthal dependence. As a consequence, the piston coefficient $\hat{b}_1$, for example, is a
linear combination of the piston coefficient $a_1$, defocus coefficient $a_4$, and various
orders of spherical aberration. Similarly, the tilt coefficient $\hat{b}_2$ is a linear combination of
the tilt coefficient $a_2$ and various orders of coma, and the astigmatism coefficient $\hat{b}_5$ is a
linear combination of various orders of astigmatism. Accordingly, the astigmatism
coefficients change if a 13-polynomial expansion is considered. For example, $\hat{b}_5$ then
contains a contribution from $a_{13}$ as well. The tip and tilt coefficients $\hat{b}_2$ and $\hat{b}_3$ change
further if polynomials $A_{16}$ (varying as $\cos\theta$) and $A_{17}$ (varying as $\sin\theta$) are included in
the expansion. Moreover, $A_{16}$ also contributes to the coma coefficient $\hat{b}_8$, and $A_{17}$
similarly contributes to the coma coefficient $\hat{b}_7$. The defocus coefficient $\hat{b}_4$ does not
change until the secondary spherical aberration polynomial $A_{22}$ is included with its
coefficient $a_{22}$. Its inclusion also affects the primary spherical aberration coefficient $\hat{b}_{11}$.
Thus, it is easy to see which, when, and by how much the $\hat{b}_j$ coefficients change,
depending on the number of polynomials used in the expansion.
We note that the mean value of the aberration function is given by the annular piston
coefficient $a_1$. However, the value of the corresponding Zernike circle coefficient $\hat{b}_1$
depends on the number of polynomials used in the expansion, and it does not equal $a_1$;
therefore, it does not represent the mean value. An orthonormal annular coefficient (other
than piston) represents the standard deviation of the corresponding aberration term in the
expansion, but a Zernike circle coefficient generally does not. The variance of the
aberration function cannot be obtained by summing the squares of the Zernike circle
coefficients $\hat{b}_j$ (excluding the piston coefficient). The circle coefficients $b_j$ can be
obtained from the $\hat{b}_j$- or the $a_j$-coefficients, according to Eq. (12-22). They are
considered in Section 12.3.4.4 for a Seidel aberration function.
12.3.2 Interferometer Setting Errors

The estimated wavefront obtained by using only the first four polynomials represents
the best-fit parabolic approximation of the aberration function in a least squares sense. In
terms of the orthonormal annular polynomials, it can be written as
$$\hat{W}(x,y) = a_1 A_1 + a_2 A_2 + a_3 A_3 + a_4 A_4 \tag{12-35a}$$
$$= a_1 + 2(1+\epsilon^2)^{-1/2} a_2\,x + 2(1+\epsilon^2)^{-1/2} a_3\,y + \sqrt{3}\,(1-\epsilon^2)^{-1} a_4 [-\epsilon^2 + (2\rho^2 - 1)] . \tag{12-35b}$$
In terms of the circle polynomials it can be written
$$\hat{W}(x,y) = \hat{b}_1 Z_1 + \hat{b}_2 Z_2 + \hat{b}_3 Z_3 + \hat{b}_4 Z_4 \tag{12-36a}$$
$$= \hat{b}_1 + 2\hat{b}_2\,x + 2\hat{b}_3\,y + \sqrt{3}\,\hat{b}_4 (2\rho^2 - 1) . \tag{12-36b}$$
In Eqs. (12-35) and (12-36), we have omitted the arguments of the annular and circle
polynomials for simplicity. The coefficients of $x$, $y$, and $\rho^2$ representing the tip, tilt, and
defocus values obtained from the circle coefficients are the same as those obtained from
the orthonormal coefficients. The estimated piston from the Zernike expansion of Eq.
(12-36b) is $\hat{b}_1 - \sqrt{3}\,\hat{b}_4$, which is the same as $a_1 - \sqrt{3}\,(1+\epsilon^2)(1-\epsilon^2)^{-1} a_4$ from the
orthonormal expansion in Eq. (12-35b), as follows by substituting Eqs. (12-32a) and
(12-32d). Accordingly, the aberration function obtained by subtracting
the piston, tip, tilt, and defocus values from the measured aberration function is
independent of the nature of the polynomials used in the expansion, so long as the
nonorthogonal expansion is in terms of only the first four circle polynomials [as may be
seen, for example, by comparing Eqs. (12-33a–d) with Eqs. (12-32a–d)]. In an
interferometer, the tip and tilt represent the lateral errors and defocus represents the
longitudinal error in the location of a point source illuminating an optical surface under
test from its center of curvature. These four terms are generally removed from the
aberration function and the remaining function is given to the optician to zero out from
the optical surface by polishing.
12.3.3 Wavefront Fitting

When an aberration function is expanded in terms of the orthonormal polynomials,
one or more polynomial terms can be added or subtracted from the aberration function
without affecting the coefficients of the other polynomials in the expansion. But that is
generally not true with the Zernike expansion. This is due to the fact that an expansion in
terms of the orthonormal polynomials gives a best fit for each polynomial, but an
expansion in terms of the circle polynomials gives it for the whole set in the expansion.
The estimated or reconstructed wavefront by the same number of corresponding
orthonormal or Zernike polynomials is the same. For example, the 4-polynomial
aberration functions of Eqs. (12-35) and (12-36) are exactly the same function.
Although the wavefront fit with a certain number of circle polynomials is as good as
the fit with a corresponding set of the orthonormal polynomials, there are pitfalls in using
the circle polynomials. Since the circle polynomials are not orthogonal over the
noncircular pupil, the advantages of orthogonality and aberration balancing are lost. Since
they do not represent the balanced classical aberrations for a noncircular pupil, the
Zernike coefficients $\hat{b}_j$ do not have the physical significance of their orthonormal
counterparts. For example, the mean value of a circle polynomial across a noncircular
pupil is not zero, the Zernike piston coefficient does not represent the mean value of the
aberration, the other Zernike coefficients do not represent the standard deviation of the
corresponding aberration terms, and the variance of the aberration is not equal to the sum
of the squares of these other coefficients. Moreover, the value of a Zernike coefficient
generally changes as the number of polynomials used in the expansion of an aberration
function changes. Hence, the circle polynomials are not appropriate for the analysis of a
noncircular wavefront. Of course, wavefront fitting with the improperly calculated
Zernike coefficients $b_j$ by using Eq. (12-17) will be in error, as demonstrated in Section
12.3.4 for a Seidel aberration function.
12.3.4 Application to an Annular Seidel Aberration Function

Consider an annular pupil aberrated by a Seidel aberration function given by
$$W(\rho,\theta;\epsilon) = A_t\,\rho\cos\theta + A_d\,\rho^2 + A_a\,\rho^2\cos^2\theta + A_c\,\rho^3\cos\theta + A_s\,\rho^4 , \quad \epsilon \le \rho \le 1 , \tag{12-37}$$
where $A_t$, $A_d$, $A_a$, $A_c$, and $A_s$ represent the peak values of distortion, field curvature,
astigmatism, coma, and spherical aberration, respectively. Without the explicit field
dependence, distortion is equivalent to a wavefront tilt, and field curvature is equivalent
to a wavefront defocus.
12.3.4.1 Annular Coefficients
The aberration function when approximated by only the first four annular
polynomials can be written
$$\hat{W}(\rho,\theta;\epsilon) = a_1 A_1 + a_2 A_2 + a_4 A_4 , \tag{12-38}$$
where the expansion coefficients according to Eq. (12-30) are given by
$$a_1 = (1+\epsilon^2)(2A_d + A_a)/4 + (1+\epsilon^2+\epsilon^4) A_s/3 , \tag{12-39a}$$
$$a_2 = (1+\epsilon^2)^{1/2} A_t/2 + (1+\epsilon^2+\epsilon^4)(1+\epsilon^2)^{-1/2} A_c/3 , \tag{12-39b}$$
$$a_4 = (1-\epsilon^2)(2A_d + A_a)/(4\sqrt{3}) + (1-\epsilon^4) A_s/(2\sqrt{3}) . \tag{12-39c}$$
It should be evident that the coefficient $a_3$ of the annular polynomial $A_3$ varying as $\sin\theta$
is zero. The mean value of the estimated aberration function is given by $a_1$, and its
variance is given by
$$\sigma_{\hat{W}}^2 = a_2^2 + a_4^2 . \tag{12-40}$$
An expansion in terms of 11 annular polynomials can be written
$$W(\rho,\theta;\epsilon) = a_1 A_1 + a_2 A_2 + a_4 A_4 + a_6 A_6 + a_8 A_8 + a_{11} A_{11} , \tag{12-41}$$
where the coefficients $a_1$, $a_2$, and $a_4$ are given by Eqs. (12-39a–c) and
$$a_6 = \frac{1}{2\sqrt{6}}\,(1+\epsilon^2+\epsilon^4)^{1/2} A_a , \tag{12-39d}$$
$$a_8 = \frac{1-\epsilon^2}{6\sqrt{2}} \left( \frac{1+4\epsilon^2+\epsilon^4}{1+\epsilon^2} \right)^{1/2} A_c , \tag{12-39e}$$
$$a_{11} = \frac{(1-\epsilon^2)^2}{6\sqrt{5}}\,A_s . \tag{12-39f}$$
Again, it should be evident that the coefficients $a_5$, $a_7$, and $a_9$ of the polynomials $A_5$,
$A_7$, and $A_9$, respectively, each polynomial varying as $\sin m\theta$, are zero. Moreover, the
coefficient $a_{10}$ of the polynomial $A_{10}$ varying as $\cos 3\theta$ is also zero. The 11-polynomial
expansion represents the Seidel aberration function exactly. Its mean value is again $a_1$, as
given by Eq. (12-39a), and its variance is given by
$$\sigma_W^2 = a_2^2 + a_4^2 + a_6^2 + a_8^2 + a_{11}^2 . \tag{12-42}$$
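Equations (12-39a) and (12-42) can be checked against a direct numerical average over the annulus. The sketch below is a hedged illustration, not part of the original analysis: the peak values are all set to unity and $\epsilon = 0.5$ as hypothetical inputs, the function names are illustrative, and the integrals are approximated by a midpoint rule on equal-area cells.

```python
import math

def seidel(rho, theta, A):
    """Seidel aberration function of Eq. (12-37); A = (At, Ad, Aa, Ac, As)."""
    At, Ad, Aa, Ac, As = A
    c = math.cos(theta)
    return At * rho * c + Ad * rho**2 + Aa * rho**2 * c**2 + Ac * rho**3 * c + As * rho**4

def annular_coefficients(A, eps):
    """Nonzero annular coefficients of Eqs. (12-39a-f), keyed by j."""
    At, Ad, Aa, Ac, As = A
    e2, e4 = eps**2, eps**4
    return {
        1: (1 + e2) * (2 * Ad + Aa) / 4 + (1 + e2 + e4) * As / 3,
        2: math.sqrt(1 + e2) * At / 2 + (1 + e2 + e4) * Ac / (3 * math.sqrt(1 + e2)),
        4: (1 - e2) * (2 * Ad + Aa) / (4 * math.sqrt(3)) + (1 - e4) * As / (2 * math.sqrt(3)),
        6: math.sqrt(1 + e2 + e4) * Aa / (2 * math.sqrt(6)),
        8: (1 - e2) * math.sqrt((1 + 4 * e2 + e4) / (1 + e2)) * Ac / (6 * math.sqrt(2)),
        11: (1 - e2) ** 2 * As / (6 * math.sqrt(5)),
    }

def annulus_mean_and_variance(A, eps, n_r=400, n_t=24):
    """Mean and variance of W over eps <= rho <= 1 (midpoint rule in rho^2, theta)."""
    s1 = s2 = 0.0
    for i in range(n_r):
        rho = math.sqrt(eps**2 + (1 - eps**2) * (i + 0.5) / n_r)
        for k in range(n_t):
            w = seidel(rho, 2 * math.pi * (k + 0.5) / n_t, A)
            s1 += w
            s2 += w * w
    mean = s1 / (n_r * n_t)
    return mean, s2 / (n_r * n_t) - mean**2

A = (1.0, 1.0, 1.0, 1.0, 1.0)   # hypothetical peak values
eps = 0.5                       # hypothetical obscuration ratio
a = annular_coefficients(A, eps)
mean, variance = annulus_mean_and_variance(A, eps)
variance_from_a = sum(v**2 for j, v in a.items() if j != 1)   # Eq. (12-42)

print(mean, a[1])
print(variance, variance_from_a)
```

Within the quadrature error, the mean agrees with $a_1$ and the variance with the sum of the squares of the remaining annular coefficients.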
12.3.4.2 Circle Coefficients
Next we expand the Seidel aberration function in terms of the circle polynomials. A
4-polynomial expansion can be obtained from Eqs. (12-32) and (12-39) in the form
$$\hat{W}(\rho,\theta;\epsilon) = \hat{b}_1 Z_1 + \hat{b}_2 Z_2 + \hat{b}_4 Z_4 , \tag{12-43}$$
where
$$\hat{b}_1 = (2A_d + A_a)/4 + [1 - \epsilon^2 (1+\epsilon^2)/2]\,A_s/3 , \tag{12-44a}$$
$$\hat{b}_2 = a_2/(1+\epsilon^2)^{1/2} , \tag{12-44b}$$
$$\hat{b}_4 = a_4/(1-\epsilon^2) . \tag{12-44c}$$
The estimated aberration function in Eq. (12-43) is exactly the same as that in Eq. (12-38),
and the values of piston, x-tilt, and defocus are exactly the same as those obtained
from Eqs. (12-39a–c). It should be evident, however, that its mean value is not given by
$\hat{b}_1$. Moreover, since an expansion coefficient does not represent the standard deviation of
the corresponding aberration polynomial term, its variance is not given by $\hat{b}_2^2 + \hat{b}_4^2$.
From Eqs. (12-33) and (12-39), an 11-polynomial Zernike circle expansion can be
written
$$W(\rho,\theta;\epsilon) = \hat{b}_1 Z_1 + \hat{b}_2 Z_2 + \hat{b}_4 Z_4 + \hat{b}_6 Z_6 + \hat{b}_8 Z_8 + \hat{b}_{11} Z_{11} , \tag{12-45}$$
where
$$\hat{b}_1 = (2A_d + A_a)/4 + A_s/3 , \tag{12-46a}$$
$$\hat{b}_2 = A_t/2 + A_c/3 , \tag{12-46b}$$
$$\hat{b}_4 = (2A_d + A_a)/(4\sqrt{3}) + A_s/(2\sqrt{3}) , \tag{12-46c}$$
$$\hat{b}_6 = A_a/(2\sqrt{6}) , \tag{12-46d}$$
$$\hat{b}_8 = A_c/(6\sqrt{2}) , \tag{12-46e}$$
$$\hat{b}_{11} = A_s/(6\sqrt{5}) . \tag{12-46f}$$
As in the case of annular polynomials, the eleven circle polynomials also represent the
Seidel aberration function exactly. The expansion coefficients can also be obtained by
inspection of the aberration function and the form of the circle polynomials. Indeed,
because of the form of the Seidel aberration function, the circle coefficients are
independent of the obscuration ratio $\epsilon$. Each $\hat{b}$-coefficient represents the value of the
corresponding $a$-coefficient for $\epsilon = 0$. It is clear that each of the three nonzero
coefficients of the 4-polynomial expansion changes as the number of polynomials is
increased from four to eleven. Hence, the values of piston, x-tilt, and defocus obtained
from the coefficients $\hat{b}_1$, $\hat{b}_2$, and $\hat{b}_4$ are incorrect. Again, the mean value of the aberration
function is not given by $\hat{b}_1$, and its variance is not given by the sum of the squares of the
other coefficients.
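The change of the circle coefficients with $J$ can be reproduced by a discrete least squares fit in the spirit of Eq. (12-16). The following is only a sketch under stated assumptions: unit peak values and $\epsilon = 0.5$ are hypothetical inputs, the data points are taken on an equal-area polar grid, and the normal equations are solved by Gaussian elimination rather than the singular-value decomposition mentioned in Section 12.2. All function names are illustrative.

```python
import math

# Zernike circle polynomials Z_1 ... Z_11 (index 0 holds Z_1), ordered as in Table 12-1.
ZERNIKE = [
    lambda r, t: 1.0,
    lambda r, t: 2 * r * math.cos(t),
    lambda r, t: 2 * r * math.sin(t),
    lambda r, t: math.sqrt(3) * (2 * r**2 - 1),
    lambda r, t: math.sqrt(6) * r**2 * math.sin(2 * t),
    lambda r, t: math.sqrt(6) * r**2 * math.cos(2 * t),
    lambda r, t: math.sqrt(8) * (3 * r**3 - 2 * r) * math.sin(t),
    lambda r, t: math.sqrt(8) * (3 * r**3 - 2 * r) * math.cos(t),
    lambda r, t: math.sqrt(8) * r**3 * math.sin(3 * t),
    lambda r, t: math.sqrt(8) * r**3 * math.cos(3 * t),
    lambda r, t: math.sqrt(5) * (6 * r**4 - 6 * r**2 + 1),
]

def seidel(r, t, A):
    """Seidel aberration function of Eq. (12-37); A = (At, Ad, Aa, Ac, As)."""
    At, Ad, Aa, Ac, As = A
    c = math.cos(t)
    return At * r * c + Ad * r**2 + Aa * r**2 * c**2 + Ac * r**3 * c + As * r**4

def solve(G, y):
    """Solve G x = y by Gaussian elimination with partial pivoting."""
    n = len(y)
    aug = [row[:] + [y[i]] for i, row in enumerate(G)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(c + 1, n):
            f = aug[r][c] / aug[c][c]
            for k in range(c, n + 1):
                aug[r][k] -= f * aug[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (aug[r][n] - sum(aug[r][k] * x[k] for k in range(r + 1, n))) / aug[r][r]
    return x

def fit_circle_coefficients(J, A, eps, n_r=60, n_t=48):
    """Least squares b_hat for the first J circle polynomials over the annulus."""
    rows, S = [], []
    for i in range(n_r):
        rho = math.sqrt(eps**2 + (1 - eps**2) * (i + 0.5) / n_r)
        for k in range(n_t):
            theta = 2 * math.pi * (k + 0.5) / n_t
            rows.append([Z(rho, theta) for Z in ZERNIKE[:J]])
            S.append(seidel(rho, theta, A))
    G = [[sum(row[i] * row[j] for row in rows) for j in range(J)] for i in range(J)]
    y = [sum(row[i] * s for row, s in zip(rows, S)) for i in range(J)]
    return solve(G, y)

A = (1.0, 1.0, 1.0, 1.0, 1.0)   # hypothetical peak values
eps = 0.5                       # hypothetical obscuration ratio
b4 = fit_circle_coefficients(4, A, eps)     # compare with Eqs. (12-44a-c)
b11 = fit_circle_coefficients(11, A, eps)   # compare with Eqs. (12-46a-f)
print([round(v, 4) for v in b4])
print([round(v, 4) for v in b11])
```

Because the Seidel function lies in the span of the first 11 circle polynomials, the 11-term fit should recover Eqs. (12-46a–f) to rounding error, whereas the 4-term fit approaches Eqs. (12-44a–c) only as the number of data points increases.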
12.3.4.3 Residual Aberration Function after Removing Interferometer Setting Errors
If we consider the first four polynomial terms as representing the interferometer

setting errors and remove them from the aberration function, the residual aberration
function from the annular expansion is given by
W RA (r, q; ) = a 6 A6 + a 8 A8 + a11 A11 . (12-47)

The same residual aberration function is obtained if a 4-polynomial Zernike expansion of

Eq. (12-43) is subtracted from the aberration function W (r, q; ). However, if the first
four polynomials are subtracted from the aberration function of Eq. (12-45), the residual
aberration function is given by
W RCb̂ (r, q; ) = bˆ6 Z 6 + bˆ8 Z 8 + bˆ11Z11 .
( ) ( )
= Aa 2 6 Z 6 + Ac 6 2 Z 8 + As 6 5 Z11 . ( ) (12-48)
Since the 11-polynomial aberration functions of Eqs. (12-41) and (12-45) are equal
to each other [and equal to the Seidel aberration function of Eq. (12-37)], the difference
between the residual aberration functions of Eqs. (12-48) and (12-47) is equal to the
difference between the interferometer setting errors given by Eq. (12-38) or (12-43) and
those given by Eq. (12-45). Accordingly, the difference or the error function consists of
piston, tilt, and defocus only. It is given by
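$$ \Delta W_{R\hat{b}}(\rho, \theta; \epsilon) = -\frac{1}{6}\,\epsilon^2\left(4 + \epsilon^2\right) A_s + \frac{2}{3}\,\frac{\epsilon^4}{1 + \epsilon^2}\, A_c\, \rho \cos\theta + \epsilon^2 A_s\, \rho^2 , \tag{12-49} $$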
and is independent of the number J of the annular and circle polynomials (e.g., 11, as
above) used in the expansion. Of course, piston does not affect the peak-to-valley value
or the variance of the aberration function. If the interferometer setting errors obtained
from Eq. (12-45) are applied in the fabrication and testing of an optical system with an
annular pupil, the difference function represents the polishing error due to the use of the
circle polynomials.
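The statement that the two residual functions differ only by piston, tilt, and defocus can be checked numerically. The sketch below is illustrative rather than the author's procedure: it assumes the standard forms of the Zernike circle polynomials $Z_6$, $Z_8$, $Z_{11}$ and of the annular polynomials $A_6$, $A_8$, $A_{11}$ (defined elsewhere in the book, not in this section), uses the coefficient ratios of Eqs. (12-50a–c), and the function names are hypothetical.

```python
import numpy as np

def residual_circle(rho, theta, A_a, A_c, A_s):
    """Residual of Eq. (12-48): least-squares circle coefficients times Z6, Z8, Z11."""
    Z6 = np.sqrt(6) * rho**2 * np.cos(2 * theta)
    Z8 = np.sqrt(8) * (3 * rho**3 - 2 * rho) * np.cos(theta)
    Z11 = np.sqrt(5) * (6 * rho**4 - 6 * rho**2 + 1)
    return (A_a / (2 * np.sqrt(6)) * Z6
            + A_c / (6 * np.sqrt(2)) * Z8
            + A_s / (6 * np.sqrt(5)) * Z11)

def residual_annular(rho, theta, eps, A_a, A_c, A_s):
    """Residual of Eq. (12-47); annular polynomial forms are assumed, not quoted."""
    e2 = eps**2
    e4 = eps**4
    # Annular coefficients from the ratios a_j / b_hat_j of Eqs. (12-50a-c).
    a6 = np.sqrt(1 + e2 + e4) * A_a / (2 * np.sqrt(6))
    a8 = (1 - e2) * np.sqrt((1 + 4 * e2 + e4) / (1 + e2)) * A_c / (6 * np.sqrt(2))
    a11 = (1 - e2)**2 * A_s / (6 * np.sqrt(5))
    A6 = np.sqrt(6) * rho**2 * np.cos(2 * theta) / np.sqrt(1 + e2 + e4)
    A8 = (np.sqrt(8) * (3 * (1 + e2) * rho**3 - 2 * (1 + e2 + e4) * rho) * np.cos(theta)
          / ((1 - e2) * np.sqrt((1 + e2) * (1 + 4 * e2 + e4))))
    A11 = np.sqrt(5) * (6 * rho**4 - 6 * (1 + e2) * rho**2 + 1 + 4 * e2 + e4) / (1 - e2)**2
    return a6 * A6 + a8 * A8 + a11 * A11

def error_function(rho, theta, eps, A_c, A_s):
    """Eq. (12-49): piston, tilt, and defocus terms only."""
    e2 = eps**2
    piston = -e2 * (4 + e2) * A_s / 6
    tilt = (2 / 3) * eps**4 / (1 + e2) * A_c * rho * np.cos(theta)
    defocus = e2 * A_s * rho**2
    return piston + tilt + defocus

# Compare at random points of an annulus with eps = 0.5 (illustrative values).
rng = np.random.default_rng(1)
eps = 0.5
rho = rng.uniform(eps, 1.0, 200)
theta = rng.uniform(0.0, 2 * np.pi, 200)
difference = (residual_circle(rho, theta, 1.0, 2.0, 3.0)
              - residual_annular(rho, theta, eps, 1.0, 2.0, 3.0))
max_mismatch = np.max(np.abs(difference - error_function(rho, theta, eps, 2.0, 3.0)))
```

Under these assumptions `max_mismatch` should be at the level of rounding error, and the astigmatism coefficient $A_a$ drops out of the difference because $a_6 A_6 = \hat{b}_6 Z_6$.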
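If we compare the annular coefficients of astigmatism, coma, and spherical
aberration given by Eqs. (12-39d–f) with the corresponding Zernike coefficients given by
Eqs. (12-46d–f), we obtain
$$ \frac{a_6}{\hat{b}_6} = \left(1 + \epsilon^2 + \epsilon^4\right)^{1/2} , \tag{12-50a} $$
$$ \frac{a_8}{\hat{b}_8} = \left(1 - \epsilon^2\right)\left(\frac{1 + 4\epsilon^2 + \epsilon^4}{1 + \epsilon^2}\right)^{1/2} , \tag{12-50b} $$
and
$$ \frac{a_{11}}{\hat{b}_{11}} = \left(1 - \epsilon^2\right)^2 . \tag{12-50c} $$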
Since the $\hat{b}_j$-coefficients are independent of the value of $\epsilon$, the variation of a ratio
$a_j/\hat{b}_j$ with $\epsilon$ represents the variation of an annular coefficient $a_j$.
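A short numerical sketch of the ratios in Eqs. (12-50a–c) is given here to make their dependence on the obscuration ratio concrete. The function name `coefficient_ratios` and the sampled values of $\epsilon$ are illustrative choices, not part of the text.

```python
import numpy as np

def coefficient_ratios(eps):
    """Return (a6/b6_hat, a8/b8_hat, a11/b11_hat) from Eqs. (12-50a-c)."""
    e2 = float(eps)**2
    e4 = e2**2
    astigmatism = np.sqrt(1 + e2 + e4)
    coma = (1 - e2) * np.sqrt((1 + 4 * e2 + e4) / (1 + e2))
    spherical = (1 - e2)**2
    return astigmatism, coma, spherical

# Tabulate the ratios for a few obscuration ratios.
for eps in (0.0, 0.25, 0.5, 0.75):
    r6, r8, r11 = coefficient_ratios(eps)
    print(f"eps = {eps:4.2f}   a6/b6 = {r6:.4f}   a8/b8 = {r8:.4f}   a11/b11 = {r11:.4f}")
```

All three ratios equal unity for a circular pupil ($\epsilon = 0$), as they must, since the annular polynomials then reduce to the circle polynomials.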
12.3.4.4 Error with Assuming Circle Polynomials to be Orthogonal over an Annulus
Now we consider the expansion of the Seidel aberration function in terms of the
circle polynomials by assuming them to be orthogonal over the annulus. This is what one
does when defining a center of an interferogram, drawing a unit circle around it, and
determining its circle coefficients. The aberration function in this case can be written in
the form
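$$ W(\rho, \theta; \epsilon) = b_1 Z_1 + b_2 Z_2 + b_4 Z_4 + b_6 Z_6 + b_8 Z_8 + b_{11} Z_{11} + \ldots , \tag{12-51} $$
where, according to Eq. (12-17), the coefficients $b_j$ are given by
$$ b_j = \frac{1}{\pi\left(1 - \epsilon^2\right)} \int_{\epsilon}^{1} \int_{0}^{2\pi} W(\rho, \theta; \epsilon)\, Z_j(\rho, \theta)\, \rho \, d\rho \, d\theta . \tag{12-52} $$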
They can also be obtained from Eq. (12-22), i.e., from the annular or circle coefficients
by using the matrix C ZZ or C ZF given in Tables 12-4 and 12-5, respectively. The
“incorrect” circle coefficients b j are given by
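$$ b_1 = a_1 , \tag{12-53a} $$
$$ b_2 = \left(1 + \epsilon^2\right)^{1/2} a_2 , \tag{12-53b} $$
$$ b_4 = \frac{1}{4\sqrt{3}}\left(1 + \epsilon^2 + 4\epsilon^4\right)\left(2A_d + A_a\right) + \frac{1}{2\sqrt{3}}\left(1 + \epsilon^2 + \epsilon^4 + 3\epsilon^6\right) A_s , \tag{12-53c} $$
$$ b_6 = \left(1 + \epsilon^2 + \epsilon^4\right)^{1/2} a_6 , \tag{12-53d} $$
$$ b_8 = \sqrt{2}\,\epsilon^4 A_t + \frac{1}{6\sqrt{2}}\left(1 + \epsilon^2 + \epsilon^4 + 9\epsilon^6\right) A_c , \tag{12-53e} $$
$$ b_{11} = \frac{\sqrt{5}}{4}\,\epsilon^4\left(3\epsilon^2 - 1\right)\left(2A_d + A_a\right) + \frac{1}{6\sqrt{5}}\left(1 + \epsilon^2 + \epsilon^4 - 9\epsilon^6 + 36\epsilon^8\right) A_s , \tag{12-53f} $$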
etc. These coefficients are incorrect in the sense that they do not yield a least-squares fit
of the aberration function. Since an annular polynomial with n = m has the same form as
that for a corresponding circle polynomial except for the normalization constant, the
coefficients b j and a j for such a polynomial are also related to each other by the
normalization constant. Equations (12-53a, b, d) represent this fact for n = m = 0, 1, 2 ,
respectively. It is clear, however, that the improperly calculated circle coefficients b j
depend on the obscuration ratio $\epsilon$ of the pupil. Evidently, they are different from the
corresponding $\hat{b}$-coefficients given by Eqs. (12-46a–f). While the value of the piston
coefficient $b_1$ is equal to the true mean value $a_1$, the tilt coefficient $b_2$ is larger than $a_2$
by a factor of $(1 + \epsilon^2)^{1/2}$ or 1.1180, and the astigmatism coefficient $b_6$ is larger than $a_6$ by a
factor of $(1 + \epsilon^2 + \epsilon^4)^{1/2}$ or 1.1456 when $\epsilon = 0.5$. Moreover, the b-coefficients of some of
the nonexistent higher-order aberrations are not zero. For example, the coefficients b22 ,
b37 , etc. of the secondary and tertiary Zernike spherical aberrations Z 22 , Z 37 , etc., and
b16 , b30 , etc. of the secondary and tertiary Zernike coma Z16 and Z 30 , etc., are nonzero.
Thus, nonexistent aberrations are generated when an aberration function is expanded
improperly in terms of the circle polynomials.
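The improperly calculated coefficients can be reproduced by evaluating the integral of Eq. (12-52) numerically. The following is a minimal sketch under stated assumptions: the Seidel aberration function is taken in the form $A_t \rho\cos\theta + A_d \rho^2 + A_a \rho^2\cos^2\theta + A_c \rho^3\cos\theta + A_s \rho^4$ (Eq. (12-37) is not reproduced in this section), the circle polynomials are in their standard orthonormal form, and the helper names are hypothetical. Gauss-Legendre quadrature in $\rho$ and a uniform rule in $\theta$ are used because both are exact for the polynomial integrands involved.

```python
import numpy as np

ZERNIKE = {  # standard orthonormal circle polynomials (assumed forms)
    1: lambda r, t: np.ones_like(r),
    2: lambda r, t: 2 * r * np.cos(t),
    4: lambda r, t: np.sqrt(3) * (2 * r**2 - 1),
    6: lambda r, t: np.sqrt(6) * r**2 * np.cos(2 * t),
    8: lambda r, t: np.sqrt(8) * (3 * r**3 - 2 * r) * np.cos(t),
    11: lambda r, t: np.sqrt(5) * (6 * r**4 - 6 * r**2 + 1),
    22: lambda r, t: np.sqrt(7) * (20 * r**6 - 30 * r**4 + 12 * r**2 - 1),
}

def seidel(r, t, A_t, A_d, A_a, A_c, A_s):
    """Assumed form of the Seidel aberration function of Eq. (12-37)."""
    c = np.cos(t)
    return A_t * r * c + A_d * r**2 + A_a * r**2 * c**2 + A_c * r**3 * c + A_s * r**4

def b_improper(j, eps, A, n_rho=32, n_theta=64):
    """Eq. (12-52): project W onto Z_j over the annulus eps <= rho <= 1."""
    x, w = np.polynomial.legendre.leggauss(n_rho)
    rho = 0.5 * (1 - eps) * x + 0.5 * (1 + eps)   # map nodes from [-1, 1] to [eps, 1]
    w_rho = 0.5 * (1 - eps) * w
    theta = 2 * np.pi * np.arange(n_theta) / n_theta
    R, T = np.meshgrid(rho, theta, indexing="ij")
    integrand = seidel(R, T, *A) * ZERNIKE[j](R, T) * R
    integral = np.sum(w_rho[:, None] * integrand) * (2 * np.pi / n_theta)
    return integral / (np.pi * (1 - eps**2))

def b_closed_form(j, eps, A_t, A_d, A_a, A_c, A_s):
    """Closed forms of Eqs. (12-53c), (12-53e), and (12-53f)."""
    e2, e4, e6, e8 = eps**2, eps**4, eps**6, eps**8
    if j == 4:
        return ((1 + e2 + 4 * e4) * (2 * A_d + A_a) / (4 * np.sqrt(3))
                + (1 + e2 + e4 + 3 * e6) * A_s / (2 * np.sqrt(3)))
    if j == 8:
        return np.sqrt(2) * e4 * A_t + (1 + e2 + e4 + 9 * e6) * A_c / (6 * np.sqrt(2))
    if j == 11:
        return (np.sqrt(5) / 4 * e4 * (3 * e2 - 1) * (2 * A_d + A_a)
                + (1 + e2 + e4 - 9 * e6 + 36 * e8) * A_s / (6 * np.sqrt(5)))
    raise ValueError("closed form listed only for j = 4, 8, 11")

A = (1.0, 1.0, 1.0, 2.0, 3.0)   # A_t, A_d, A_a, A_c, A_s in waves
for j in (4, 8, 11):
    print(j, b_improper(j, 0.5, A), b_closed_form(j, 0.5, *A))
print("b22 for eps = 0.5:", b_improper(22, 0.5, A))
print("b22 for eps = 0.0:", b_improper(22, 0.0, A))
```

In this sketch the secondary spherical coefficient $b_{22}$ vanishes for the circular pupil but not for the annulus, which is the generation of a nonexistent aberration described above.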
If we estimate the annular Seidel aberration function with only four circle polynomials
from Eq. (12-51), we obtain
$$ \hat{W}(\rho, \theta; \epsilon) = b_1 Z_1 + b_2 Z_2 + b_4 Z_4 . \tag{12-54} $$
If we truncate the expansion in terms of the circle polynomials in Eq. (12-51) to the first
11 circle polynomials and remove the first four coefficients as interferometer setting
errors, the residual aberration function in this case is given by
$$ W_{RCb}(\rho, \theta; \epsilon) = b_6 Z_6 + b_8 Z_8 + b_{11} Z_{11} . \tag{12-55} $$
The tilt error is larger than its true value given by $a_2$ by a factor of $(1 + \epsilon^2)^{1/2}$,
or 1.1180 when $\epsilon = 0.5$, and the defocus error given by $b_4$ can be compared with its
true value given by $a_4$. Since the 11-polynomial aberration function from Eq. (12-51) is not equal to the
aberration function of Eq. (12-41), their difference does not consist of the difference in
their interferometer setting errors. For example, Eq. (12-53d) indicates that there will be
an astigmatism term in the difference function. Thus, wrong polishing will result if the
aberration function of Eq. (12-55) is provided to the optician to zero out.
12.3.4.5 Numerical Example
As a numerical example, we consider an annular Seidel aberration function with
$A_t = A_d = A_a = 1$, $A_c = 2$, and $A_s = 3$ in waves. As illustrated in Figure 12-1, the
annular and circle coefficients of a 4-polynomial expansion differ from each other,
although they yield the same fit of the aberration function. We note that, whereas the
mean value $a_1$ increases as $\epsilon$ increases, the piston coefficient $\hat{b}_1$ decreases. However,
the defocus coefficient $a_4$ decreases, while $\hat{b}_4$ increases. Both tilt coefficients $a_2$ and $\hat{b}_2$
increase. For an 11-polynomial expansion, the first four annular coefficients remain the
same, but the circle coefficients become independent of $\epsilon$, as in Eqs. (12-46). Figure 12-2
shows the coefficient ratios $a_6/\hat{b}_6$ (astigmatism), $a_8/\hat{b}_8$ (coma), and $a_{11}/\hat{b}_{11}$ (spherical)
for an 11-polynomial expansion. We note that the coefficient $a_6$ increases, $a_{11}$ decreases,
and $a_8$ is nearly constant for small values of $\epsilon$ and then decreases as $\epsilon$ increases. Figure
12-3 shows how the $\hat{b}$-coefficients change as we change the number of polynomials from
4 to 11 for $\epsilon = 0.5$. Wrong polishing will result if the tip, tilt, and focus errors of an
interferometer setting are estimated from the 11-circle-polynomial expansion instead of
the 4-polynomial expansion. The variation of standard deviation obtained from the
coefficients of a 4- or 11-polynomial expansion is shown in Figure 12-4, illustrating that
the circle coefficients yield incorrect results. The standard deviation obtained from the
orthonormal coefficients increases slowly with $\epsilon$, starting at 1.7460 and 1.7877 for the 4- and 11-polynomial
expansions, respectively. However, the standard deviation obtained from the circle
coefficients is correct only when $\epsilon = 0$. It increases rapidly with $\epsilon$ for the 4-polynomial
expansion, but it is constant for the 11-polynomial expansion, indicating its incorrect
nature. The sigma values from the orthonormal and the circle coefficients are nearly equal
to each other for $\epsilon \le 0.5$ because of the very slow increase of the orthonormal sigma.
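The starting values of sigma quoted above for $\epsilon = 0$ can be checked with a few lines of arithmetic. The sketch below assumes the circular-pupil Zernike coefficients of the Seidel function (the $\epsilon = 0$ limits of Eqs. (12-53c–f), the coefficients appearing in Eq. (12-48), and the standard result $\hat{b}_2 = A_t/2 + A_c/3$, which is not stated in this section); variable names are illustrative.

```python
import numpy as np

# Seidel coefficients of the numerical example, in waves.
A_t, A_d, A_a, A_c, A_s = 1.0, 1.0, 1.0, 2.0, 3.0

# Orthonormal coefficients for a circular pupil (eps = 0); piston is omitted
# because it does not contribute to the variance.
b2 = A_t / 2 + A_c / 3                               # tilt (assumed standard result)
b4 = (2 * A_d + A_a + 2 * A_s) / (4 * np.sqrt(3))    # defocus
b6 = A_a / (2 * np.sqrt(6))                          # astigmatism
b8 = A_c / (6 * np.sqrt(2))                          # coma
b11 = A_s / (6 * np.sqrt(5))                         # spherical aberration

sigma_4 = np.sqrt(b2**2 + b4**2)
sigma_11 = np.sqrt(b2**2 + b4**2 + b6**2 + b8**2 + b11**2)
print(f"sigma (4 polynomials)  = {sigma_4:.4f}")
print(f"sigma (11 polynomials) = {sigma_11:.4f}")
```

Because the orthonormal coefficients are used, each additional term can only increase the estimated variance, which is why the 11-polynomial value exceeds the 4-polynomial one.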
Figure 12-5 shows the contours of the Seidel aberration function for a circular and an
annular pupil with obscuration ratio of $\epsilon = 0.5$. The case of a circular pupil is included
just for reference. The dark circular region in Figure 12-5b (and others) represents the
obscuration. The contours of the annular Seidel aberration function fit with only four
polynomials, as in Eq. (12-38) or (12-43) and in Eq. (12-54), are shown in Figures
12-6a and 12-6b, respectively. The two figures look similar, but they are not the same.
Only Figure 12-6a represents the least-squares and, therefore, the correct fit. The contours of
the residual aberration function when the first four (of the eleven) polynomials are
removed as interferometer setting errors, as in Eqs. (12-47), (12-48), and (12-55), are
shown in Figures 12-7a, 12-7b, and 12-7c, respectively. All of the three figures are
different from each other, as expected. Only Figure 12-7a reflects removal of the correct
interferometer setting errors, and thus the correct residual aberration function. The
contours of the difference of the residual functions using the circle polynomials from the
one using the annular polynomials are shown in Figures 12-8a and 12-8b. They represent
the error functions given by Eq. (12-49) and the difference of Eqs. (12-55) and (12-47),
respectively, due to the removal of incorrect interferometer setting errors.
Figure 12-1. Orthonormal annular coefficients $a_j$ and Zernike circle coefficients $\hat{b}_j$
for a 4-polynomial expansion.
Figure 12-2. Ratio of the orthonormal annular coefficients $a_j$ and Zernike circle
coefficients $\hat{b}_j$ for an 11-polynomial expansion.
Figure 12-3. Orthonormal annular coefficients $a_j$ and Zernike circle coefficients
$\hat{b}_j$, illustrating how the latter change as the number of polynomials changes from 4
to 11.
Figure 12-4. Standard deviation as obtained from the orthonormal annular
coefficients $a_j$ and Zernike circle coefficients $\hat{b}_j$ of a 4- and 11-polynomial
expansion.
Figure 12-5. Contours of (a) Seidel aberration function of Eq. (12-37) for a circular
pupil with $A_t = A_d = A_a = 1$, $A_c = 2$, and $A_s = 3$ in waves. (b) Same Seidel
aberration function, but for an annular pupil with obscuration ratio $\epsilon = 0.5$.
Figure 12-6. Contours of an annular Seidel aberration function for $\epsilon = 0.5$ fit with
only 4 polynomials, as in (a) Eq. (12-38) or (12-43), and (b) Eq. (12-54).
Figure 12-7. Contours of the residual aberration function after removing the
interferometer setting errors. (a) $W_{RA}$ of Eq. (12-47) using annular polynomials, (b)
$W_{RC\hat{b}}$ of Eq. (12-48) using circle polynomials correctly, and (c) $W_{RCb}$ of Eq. (12-55)
using circle polynomials incorrectly.
Figure 12-8. Contours of the difference or the error function (a) given by Eq. (12-49) and (b)
obtained by subtracting Eq. (12-47) from Eq. (12-55).
12.4 USE OF ZERNIKE CIRCLE POLYNOMIALS FOR THE ANALYSIS OF A HEXAGONAL WAVEFRONT
12.4.1 Zernike Circle Coefficients in Terms of Hexagonal Coefficients
Now, we consider a hexagonal aberration function W ( x , y ) across a unit hexagon
shown in Figure 7-7, and demonstrate the pitfalls of using Zernike circle polynomials for
its expansion. Estimating the aberration function with J hexagonal polynomials H j ( x , y )
given in Chapter 7, we may write
$$ \hat{W}(x, y) = \sum_{j=1}^{J} a_j H_j(x, y) , \tag{12-56} $$
where the orthonormal hexagonal expansion coefficients are given by
$$ a_j = \frac{2}{3\sqrt{3}} \int_{\text{hexagon}} W(x, y)\, H_j \, dx \, dy . \tag{12-57} $$
The mean value and the variance of the estimated aberration function are given by
Eqs. (12-4) and (12-6).
An 11 × 11 conversion matrix M for obtaining the hexagonal polynomials in terms of
the Zernike circle polynomials is given in Table 12-6, as obtained from Table 7-1. Its
transpose and inverse matrices are given in Tables 12-7 and 12-8, respectively. If only the
first 4 polynomials are used in the expansion, then the $\hat{b}_j$ coefficients according to Eq.
(12-13) are given by
$$ \begin{pmatrix} \hat{b}_1 \\ \hat{b}_2 \\ \hat{b}_3 \\ \hat{b}_4 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 & \sqrt{5/43} \\ 0 & \sqrt{6/5} & 0 & 0 \\ 0 & 0 & \sqrt{6/5} & 0 \\ 0 & 0 & 0 & 2\sqrt{15/43} \end{pmatrix} \begin{pmatrix} a_1 \\ a_2 \\ a_3 \\ a_4 \end{pmatrix} = \begin{pmatrix} a_1 + \sqrt{5/43}\, a_4 \\ \sqrt{6/5}\, a_2 \\ \sqrt{6/5}\, a_3 \\ 2\sqrt{15/43}\, a_4 \end{pmatrix} , \tag{12-58} $$
or
$$ \hat{b}_1 = a_1 + \sqrt{5/43}\, a_4 , \tag{12-59a} $$
$$ \hat{b}_2 = \sqrt{6/5}\, a_2 , \tag{12-59b} $$
$$ \hat{b}_3 = \sqrt{6/5}\, a_3 , \tag{12-59c} $$
and
$$ \hat{b}_4 = 2\sqrt{15/43}\, a_4 . \tag{12-59d} $$
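It is evident that the piston coefficient $\hat{b}_1$ is not equal to $a_1$ and, therefore, does not
represent the mean value of the aberration function. The coefficients $\hat{b}_2$, $\hat{b}_3$, and $\hat{b}_4$
represent the tip, tilt, and defocus circle coefficients.
Table 12-6. Conversion matrix M for obtaining the Zernike coefficients $\hat{b}_j$ from the
orthonormal hexagonal coefficients $a_j$, as in Eq. (12-12).
$$ M = \left(\begin{array}{ccccccccccc}
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & \sqrt{6/5} & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & \sqrt{6/5} & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
\sqrt{5/43} & 0 & 0 & 2\sqrt{15/43} & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & \sqrt{10/7} & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & \sqrt{10/7} & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 16\sqrt{14/11055} & 0 & 0 & 0 & 10\sqrt{35/2211} & 0 & 0 & 0 & 0 \\
0 & 16\sqrt{14/11055} & 0 & 0 & 0 & 0 & 0 & 10\sqrt{35/2211} & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & (2/3)\sqrt{5} & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 2\sqrt{35/103} & 0 \\
521/\sqrt{1072205} & 0 & 0 & 88\sqrt{15/214441} & 0 & 0 & 0 & 0 & 0 & 0 & 14\sqrt{43/4987}
\end{array}\right) $$
Table 12-7. Transpose matrix $M^T$ for use in Eq. (12-13).
$$ M^T = \left(\begin{array}{ccccccccccc}
1 & 0 & 0 & \sqrt{5/43} & 0 & 0 & 0 & 0 & 0 & 0 & 521/\sqrt{1072205} \\
0 & \sqrt{6/5} & 0 & 0 & 0 & 0 & 0 & 16\sqrt{14/11055} & 0 & 0 & 0 \\
0 & 0 & \sqrt{6/5} & 0 & 0 & 0 & 16\sqrt{14/11055} & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 2\sqrt{15/43} & 0 & 0 & 0 & 0 & 0 & 0 & 88\sqrt{15/214441} \\
0 & 0 & 0 & 0 & \sqrt{10/7} & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & \sqrt{10/7} & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 10\sqrt{35/2211} & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 10\sqrt{35/2211} & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & (2/3)\sqrt{5} & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 2\sqrt{35/103} & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 14\sqrt{43/4987}
\end{array}\right) $$
Table 12-8. Analytical matrix $M^{-1}$ for obtaining the orthonormal hexagonal coefficients $a_j$ from
the Zernike coefficients $\hat{b}_j$.
$$ M^{-1} = \left(\begin{array}{ccccccccccc}
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & \sqrt{5/6} & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & \sqrt{5/6} & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
-1/(2\sqrt{3}) & 0 & 0 & \tfrac{1}{2}\sqrt{43/15} & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & \sqrt{7/10} & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & \sqrt{7/10} & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & -8/(5\sqrt{15}) & 0 & 0 & 0 & \tfrac{1}{10}\sqrt{2211/35} & 0 & 0 & 0 & 0 \\
0 & -8/(5\sqrt{15}) & 0 & 0 & 0 & 0 & 0 & \tfrac{1}{10}\sqrt{2211/35} & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 3/(2\sqrt{5}) & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & \tfrac{1}{2}\sqrt{103/35} & 0 \\
-1/(2\sqrt{5}) & 0 & 0 & -22/(7\sqrt{43}) & 0 & 0 & 0 & 0 & 0 & 0 & \tfrac{1}{14}\sqrt{4987/43}
\end{array}\right) $$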
To see how these coefficients change with the number of polynomials used in the
expansion, we consider an expansion using 11 polynomials. The coefficients, obtained
from Eq. (12-13), are given by
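$$ \hat{b}_1 = a_1 + \sqrt{5/43}\, a_4 + \frac{521}{\sqrt{1072205}}\, a_{11} , \tag{12-60a} $$
$$ \hat{b}_2 = \sqrt{6/5}\, a_2 + 16\sqrt{14/11055}\, a_8 , \tag{12-60b} $$
$$ \hat{b}_3 = \sqrt{6/5}\, a_3 + 16\sqrt{14/11055}\, a_7 , \tag{12-60c} $$
$$ \hat{b}_4 = 2\sqrt{15/43}\, a_4 + 88\sqrt{15/214441}\, a_{11} , \tag{12-60d} $$
$$ \hat{b}_5 = \sqrt{10/7}\, a_5 , \tag{12-60e} $$
$$ \hat{b}_6 = \sqrt{10/7}\, a_6 , \tag{12-60f} $$
$$ \hat{b}_7 = 10\sqrt{35/2211}\, a_7 , \tag{12-60g} $$
$$ \hat{b}_8 = 10\sqrt{35/2211}\, a_8 , \tag{12-60h} $$
$$ \hat{b}_9 = (2/3)\sqrt{5}\, a_9 , \tag{12-60i} $$
$$ \hat{b}_{10} = 2\sqrt{35/103}\, a_{10} , \tag{12-60j} $$
and
$$ \hat{b}_{11} = 14\sqrt{43/4987}\, a_{11} . \tag{12-60k} $$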
It is clear that all of the first four coefficients change, and $\hat{b}_j = M_{jj} a_j$ for $5 \le j \le 11$.
For astigmatism ($H_5$ and $H_6$), coma ($H_7$ and $H_8$), and spherical aberration ($H_{11}$), the
$\hat{b}_j$ coefficient is larger than the corresponding hexagonal coefficient by a factor of
$\sqrt{10/7} \approx 1.20$, $10\sqrt{35/2211} \approx 1.26$, and $14\sqrt{43/4987} \approx 1.30$, respectively. The
astigmatism coefficients b̂5 and b̂6 change if a 15-polynomial expansion is considered.
For example, b̂5 then contains contributions from a13 and a15 , as well. The tip and tilt
coefficients b̂2 and b̂3 change further if polynomials H16 and H17 are included in the
expansion. Moreover, H16 also contributes to the coma coefficient b̂8 , and H17 similarly
contributes to the coma coefficient b̂7 . The piston and defocus coefficients b̂1 and b̂4 do
not change until the secondary spherical aberration polynomial H 22 is included with its
coefficient a 22 . Its inclusion also affects the primary spherical aberration coefficient b̂11 .
Thus, it is easy to see which, when, and by how much the b̂ j coefficients change,
depending on the number of polynomials used in the expansion.
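The dependence of the circle coefficients on the number of polynomials can be illustrated by building the conversion matrix of Table 12-6 and applying its truncated transpose, as in Eqs. (12-58) and (12-60). This is a sketch only: the function names are hypothetical, and the hexagonal coefficients used are arbitrary illustrative numbers, not the values of the numerical example.

```python
import numpy as np

def conversion_matrix():
    """11 x 11 matrix M of Table 12-6 (row j: H_j in terms of Z_1..Z_11)."""
    s = np.sqrt
    M = np.zeros((11, 11))
    M[0, 0] = 1.0
    M[1, 1] = M[2, 2] = s(6 / 5)
    M[3, 0], M[3, 3] = s(5 / 43), 2 * s(15 / 43)
    M[4, 4] = M[5, 5] = s(10 / 7)
    M[6, 2], M[6, 6] = 16 * s(14 / 11055), 10 * s(35 / 2211)
    M[7, 1], M[7, 7] = 16 * s(14 / 11055), 10 * s(35 / 2211)
    M[8, 8] = (2 / 3) * s(5)
    M[9, 9] = 2 * s(35 / 103)
    M[10, 0] = 521 / s(1072205)
    M[10, 3] = 88 * s(15 / 214441)
    M[10, 10] = 14 * s(43 / 4987)
    return M

def circle_coefficients(a, J):
    """Least-squares circle coefficients b_hat for a J-polynomial fit (J <= 11)."""
    M = conversion_matrix()[:J, :J]
    return M.T @ np.asarray(a[:J], dtype=float)

# Arbitrary illustrative hexagonal coefficients a_1..a_11.
a = np.linspace(0.1, 1.1, 11)
b_hat_4 = circle_coefficients(a, 4)
b_hat_11 = circle_coefficients(a, 11)
print("first four b_hat, J = 4 :", np.round(b_hat_4, 4))
print("first four b_hat, J = 11:", np.round(b_hat_11[:4], 4))

# The numerical inverse can be compared entry by entry with Table 12-8.
M_inv = np.linalg.inv(conversion_matrix())
```

The first four circle coefficients differ between the two fits even though the hexagonal coefficients are unchanged, while for $5 \le j \le 11$ each $\hat{b}_j$ is simply $M_{jj} a_j$.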
12.4.2 Interferometer Setting Errors

The estimated wavefront obtained by using only the first four polynomials represents
the best-fit parabolic approximation of the aberration function in a least-squares sense. In
terms of the Zernike polynomials, it can be written as
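$$ \hat{W}(x, y) = \hat{b}_1 Z_1 + \hat{b}_2 Z_2 + \hat{b}_3 Z_3 + \hat{b}_4 Z_4 \tag{12-61a} $$
$$ = \hat{b}_1 + 2\hat{b}_2 x + 2\hat{b}_3 y + \sqrt{3}\,\hat{b}_4\left(2\rho^2 - 1\right) . \tag{12-61b} $$
Similarly, it can be written in terms of the orthonormal hexagonal polynomials as
$$ \hat{W}(x, y) = a_1 H_1 + a_2 H_2 + a_3 H_3 + a_4 H_4 \tag{12-62a} $$
$$ = a_1 + 2\sqrt{6/5}\, a_2 x + 2\sqrt{6/5}\, a_3 y + a_4\left[\sqrt{5/43} + 6\sqrt{5/43}\left(2\rho^2 - 1\right)\right] . \tag{12-62b} $$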
Comparing the right-hand sides of Eqs. (12-61b) and (12-62b) and utilizing Eqs. (12-59a–d),
it is seen that the coefficients of x, y, and $x^2 + y^2$, representing the tip, tilt, and
defocus values obtained from the Zernike coefficients, are the same as those obtained
from the hexagonal coefficients. The estimated piston from the Zernike expansion of Eq.
(12-61b) is $\hat{b}_1 - \sqrt{3}\,\hat{b}_4$. Substituting for $\hat{b}_1$ and $\hat{b}_4$ from Eqs. (12-59a–d), we find that it is
the same as $a_1 - 5\sqrt{5/43}\, a_4$ from the hexagonal expansion of Eq. (12-62b). Accordingly,
the aberration function obtained by subtracting the piston, tip, tilt, and defocus values
from the measured aberration function is independent of the nature of the polynomials
used in the expansion, regardless of the domain of the function or the shape of the pupil,
so long as the nonorthogonal expansion is in terms of only the first four circle
polynomials. In an interferometer, the tip, tilt, and defocus terms represent the lateral
and longitudinal errors in the location of a point source illuminating an optical surface
under test from its center of curvature. These four terms are generally removed from the
aberration function, and the remaining function is given to the optician to zero out from
the optical surface by polishing.
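The equivalence of the two four-term fits can be verified by reducing both Eq. (12-61b) and Eq. (12-62b) to the coefficients of $1$, $x$, $y$, and $\rho^2$. The sketch below uses hypothetical function names and arbitrary hexagonal coefficients; only the relations of Eqs. (12-59a–d) come from the text.

```python
import numpy as np

def monomials_from_zernike(b_hat):
    """Coefficients of (1, x, y, rho^2) in Eq. (12-61b)."""
    b1, b2, b3, b4 = b_hat
    return np.array([b1 - np.sqrt(3) * b4, 2 * b2, 2 * b3, 2 * np.sqrt(3) * b4])

def monomials_from_hexagonal(a):
    """Coefficients of (1, x, y, rho^2) in Eq. (12-62b)."""
    a1, a2, a3, a4 = a
    k = np.sqrt(5 / 43)
    return np.array([a1 - 5 * k * a4, 2 * np.sqrt(6 / 5) * a2,
                     2 * np.sqrt(6 / 5) * a3, 12 * k * a4])

def zernike_from_hexagonal(a):
    """Eqs. (12-59a-d): four-term circle coefficients from hexagonal ones."""
    a1, a2, a3, a4 = a
    return np.array([a1 + np.sqrt(5 / 43) * a4, np.sqrt(6 / 5) * a2,
                     np.sqrt(6 / 5) * a3, 2 * np.sqrt(15 / 43) * a4])

a = np.array([0.3, -0.2, 0.5, 0.7])      # arbitrary illustrative values
piston_tip_tilt_focus_z = monomials_from_zernike(zernike_from_hexagonal(a))
piston_tip_tilt_focus_h = monomials_from_hexagonal(a)
print(piston_tip_tilt_focus_z)
print(piston_tip_tilt_focus_h)
```

Both routes give the same piston, tip, tilt, and defocus terms, so the wavefront left after subtracting them does not depend on which four-term basis was used.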

As a numerical example, we consider a hexagonal aberration function defined by 15
hexagonal coefficients a j given in Table 12-9. The mean value of the aberration function
is given by a1 = 0.0842 . The first 4, the first 11, or all of the 15 coefficients represent the
coefficients of a 4-, 11-, or 15-polynomial expansion. The corresponding circle
coefficients b̂ j obtained from Eqs. (12-59), (12-60), or in general Eq. (12-13) are also
given in Table 12-9. We note that the value of the piston coefficient b̂1 changes as the
number of polynomials increases from 4 to 11. Neither equals a1 ; and, therefore, they do
not represent the mean value. Similarly, the tip, tilt, and defocus coefficients b̂2 , b̂3 , and
b̂4 change. When the number of polynomials increases from 11 to 15, only the
astigmatism coefficients b̂5 and b̂6 change, as expected from our discussion in Section
12.4.1. The other coefficients would have changed if higher-order terms were present.
The Zernike coefficients $b_j$ obtained from Eq. (12-17) are also listed in Table 12-9. Their
values do not change as the number of polynomials used in the expansion changes. They
are different from the corresponding Zernike coefficients b̂ j obtained from Eq. (12-13).
The standard deviation of an aberration function is given by Eq. (12-6) in terms of
the hexagonal coefficients. As the number of hexagonal polynomials increases from 4 to
11 to 15, the standard deviation approaches its true value of 0.6068, as indicated in Table
12-10. If Eq. (12-6) is applied to the Zernike coefficients b̂ j or b j , incorrect values of
sigma are obtained. They are also listed in Table 12-10. Once again, whereas a hexagonal
coefficient (other than piston) represents the standard deviation of the corresponding
polynomial term in the expansion, a Zernike coefficient b̂ j or b j does not.
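As a check on the first column of Table 12-10, the standard deviation can be recomputed from the hexagonal coefficients listed in Table 12-9. The sketch assumes that Eq. (12-6) gives the variance as the sum of the squares of the orthonormal coefficients excluding piston; the coefficient values are copied from the table, and the helper name is illustrative.

```python
import numpy as np

# Hexagonal coefficients a_1..a_15 as listed in Table 12-9.
a = np.array([0.0842, 0.0501, -0.2689, -0.0534, -0.2070,
              0.1956, 0.0874, -0.2145, -0.1605, 0.1071,
              -0.1382, 0.2819, 0.0730, 0.1055, 0.0596])

def sigma_from_orthonormal(coeffs, J):
    """Standard deviation of a J-term fit: root sum of squares, piston excluded."""
    return np.sqrt(np.sum(coeffs[1:J]**2))

sigma = {J: sigma_from_orthonormal(a, J) for J in (4, 11, 15)}
for J, value in sigma.items():
    print(f"J = {J:2d}   sigma = {value:.4f}")
```

Only the squares of the coefficients enter, so the result is insensitive to their signs.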
The contour plots of the aberration function fitted with 4, 11, and 15 hexagonal
polynomials are shown in Figure 12-9. The same plots are obtained with the
corresponding properly calculated Zernike coefficients b̂ j , illustrating an identical fit.
However, different plots are obtained with the improperly calculated Zernike coefficients
$b_j$, as shown in Figure 12-10. If we remove the first four $a_j$, $\hat{b}_j$, or $b_j$ coefficients of
piston, tip, tilt, and defocus representing the interferometer setting errors from the
aberration function estimated by 11 or 15 polynomials, we obtain the residual aberration
function whose contour plots are shown in Figures 12-11, 12-12, and 12-13, respectively.
Comparing these figures, it is evident that the residual functions represented in Figures
12-12 and 12-13 are incorrect. Only Figure 12-11 represents the correct residual function.
The difference of the residual aberration functions representing the error functions in
using the Zernike polynomials and thereby removing the incorrect interferometer setting
errors are shown in Figures 12-14 and 12-15. Thus, the contours in these figures represent
the difference of the contours in Figures 12-12 and 12-13 from those in Figure 12-11,
respectively.
Figure 12-9. Contour plots of a hexagonal aberration function fit with (a) 4, (b) 11,
and (c) 15 hexagonal polynomials or circle polynomials with coefficients $\hat{b}_j$.
Figure 12-10. Contour plots of a hexagonal aberration function fit with (a) 4, (b) 11,
and (c) 15 circle polynomials with coefficients $b_j$.
Figure 12-11. Contour plots of a residual hexagonal aberration function after
removing the first four coefficients from (a) 11 and (b) 15 hexagonal polynomial fit.
Figure 12-12. Contour plots of a residual hexagonal aberration function after
removing the first four coefficients from (a) 11 and (b) 15 circle polynomials fit
using $\hat{b}_j$ coefficients.
Figure 12-13. Contour plots of a residual hexagonal aberration function after
removing the first four coefficients from (a) 11 and (b) 15 circle polynomials fit
using $b_j$ coefficients.
Figure 12-14. Contour plots of the error function after removing the first four $\hat{b}_j$
coefficients from (a) 11 and (b) 15 coefficients.
Figure 12-15. Contour plots of the error function after removing the first four $b_j$
coefficients from (a) 11 and (b) 15 coefficients.
Table 12-9. Hexagonal and Zernike coefficients of a hexagonal aberration function
fit with 4, 11, or 15 polynomials.
  j      a_j      b̂_j (J = 4)   b̂_j (J = 11)   b̂_j (J = 15)     b_j
  1     0.0842      0.1024         0.0329          0.0329        0.0827
  2     0.0501      0.0549        -0.0673         -0.0673        0.0452
  3    -0.2689     -0.2946        -0.2448         -0.2448       -0.2439
  4    -0.0534      0.0631        -0.0386         -0.0386        0.0195
  5    -0.2070                    -0.2474         -0.1575       -0.1725
  6     0.1956                     0.2338          0.3974        0.1642
  7     0.0874                     0.1100          0.1100        0.1794
  8    -0.2145                    -0.2699         -0.2699       -0.1903
  9    -0.1605                    -0.2393         -0.2393       -0.1070
 10     0.1071                     0.1249          0.1249        0.0921
 11    -0.1382                    -0.1797         -0.1797       -0.1506
 12     0.2819                                     0.3293        0.1160
 13     0.0730                                     0.1216        0.1565
 14     0.1055                                     0.1599        0.1597
 15     0.0596                                     0.0903        0.0585
Table 12-10. Standard deviation σ of the aberration functions fit with 4, 11, and 15
hexagonal or circle polynomials using Eq. (12-6).
 Number of          σ from             σ from             σ from
 polynomials J      coefficients a_j   coefficients b̂_j   coefficients b_j
  4                 0.2787             0.3062             0.2488
 11                 0.5184             0.6099             0.4792
 15                 0.6068             0.7718             0.5445

As in the case of an annular wavefront, the fit with a certain number of circle
polynomials is as good as with a corresponding set of the hexagonal polynomials.
However, again there are pitfalls in using the circle polynomials. For example, the mean
value of a circle polynomial across a noncircular pupil is not zero, the Zernike piston
coefficient does not represent the mean value of the aberration, the other Zernike
coefficients do not represent the standard deviation of the corresponding aberration term,
and the variance of the aberration is not equal to the sum of the squares of these other
coefficients. Moreover, the value of a Zernike coefficient generally changes as the
number of polynomials used in the expansion of an aberration function changes.
12.5 ABERRATION COEFFICIENTS FROM DISCRETE WAVEFRONT DATA
When an aberration function is known only at a discrete set of points, as in a
digitized interferogram, the integral for determining the aberration coefficients reduces to
a sum, and the orthonormal coefficients thus obtained may be in error due to the lack of
orthogonality of the polynomials over the discrete points of the aberration data set. The
magnitude of the error decreases as the number of (uniformly distributed) points
increases. This is not a serious problem when the wavefront errors are determined by,
say, phase-shifting interferometry [3], since the number of points can be very large.
However, when the number of data points is small, or the pupil is irregular in shape due
to vignetting, then ray-tracing or testing of the system yields wavefront error data at an
array of points across a region for which closed-form orthonormal polynomials are not
available. In such cases, we can determine the coefficients of an expansion in terms of the
numerical polynomials that are orthogonal over the data set, obtained by the Gram-Schmidt
orthogonalization process. However, if we just want to determine the values of
tip/tilt and defocus terms, yielding the errors in interferometer settings, they can be
obtained by least-squares fitting the aberration function data with only these terms.
Numerical simulations for obtaining the orthonormal coefficients from discrete data are
considered in Chapter 14 on Numerical Wavefront Analysis.
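The procedure outlined above can be sketched for a synthetic data set. The example is an illustration under stated assumptions, not the method of Chapter 14: the data region (a unit disc with one side vignetted), the monomial-type starting basis, and the wavefront coefficients are all hypothetical, and a QR factorization is used in place of an explicit Gram-Schmidt loop because it orthogonalizes the columns in the same order and spans the same subspaces.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical irregular data region: unit disc with one side vignetted.
pts = rng.uniform(-1.0, 1.0, size=(2000, 2))
inside = (np.sum(pts**2, axis=1) <= 1.0) & (pts[:, 0] <= 0.6)
x, y = pts[inside].T
n = x.size

# Starting basis: piston, tip, tilt, defocus, and two astigmatism-like terms.
G = np.column_stack([np.ones(n), x, y, x**2 + y**2, x**2 - y**2, 2 * x * y])

# Orthonormalize the columns over the data points.
Q, R = np.linalg.qr(G)
signs = np.sign(np.diag(R))          # fix signs so the first polynomial is +1
F = Q * signs * np.sqrt(n)           # now F.T @ F / n is the identity matrix

# Synthetic wavefront data lying in the span of the basis (illustrative values).
W = G @ np.array([0.1, -0.3, 0.2, 0.5, 0.15, -0.05])

# Discrete orthonormal coefficients: the integral is replaced by a sum.
a = F.T @ W / n
mean_W = a[0]                        # piston coefficient equals the data mean
var_W = np.sum(a[1:]**2)             # sum of squares equals the data variance

# Interferometer setting errors alone: least-squares fit with four terms.
setting, *_ = np.linalg.lstsq(G[:, :4], W, rcond=None)
fit_from_orthonormal = F[:, :4] @ a[:4]
fit_from_least_squares = G[:, :4] @ setting
```

In this sketch the first four numerically orthonormal terms reproduce the four-term least-squares fit, while the mean and variance follow directly from the orthonormal coefficients.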
12.6 SUMMARY
The expansion of a noncircular aberration function in terms of the Zernike circle
polynomials is compared with the corresponding expansion in terms of the polynomials
that are orthonormal over the domain of the function. It is shown that, whereas the
orthonormal expansion coefficients are independent of the number of polynomials used in
the expansion, the circle coefficients generally change as the number of polynomials
changes. We demonstrate which circle coefficients change and by how much.
Accordingly, one or more orthonormal polynomial terms can be added to or subtracted
from the aberration function without affecting the other coefficients only when the
orthonormal polynomials are used. Moreover, unlike the orthonormal coefficients, the
piston circle coefficient does not represent the mean value of the aberration function, and
the sum of the squares of the other circle coefficients does not yield its variance.
However, since each orthonormal polynomial of a certain order is a linear combination of
the circle polynomials of that and lower orders, the wavefront fit with a certain number of
orthonormal polynomials is exactly the same as that with the corresponding circle
polynomials.
These results are illustrated analytically as well as numerically by considering an
annular Seidel aberration function and a hexagonal aberration function. Equations (12-32a–d) show how the
annular and circle coefficients of a four-polynomial expansion, representing the
interferometer setting errors, differ from each other (see Figure 12-1), although they yield
the same fit of the aberration function [compare Eqs. (12-35) and (12-36) and see Figure
12-6a]. How the circle coefficients change when the function is fit with 11 polynomials
may be seen from Eqs. (12-33a–d) and Figure 12-3. If 11 polynomials are used to
estimate the aberration function and the first four are removed as interferometer setting
errors, only the annular polynomials give the correct residual aberration function as in
Figure 12-7a. The residual aberration functions shown in Figures 12-7b and 12-7c,
obtained respectively by using the circle polynomials in a least-squares fit or assuming
their orthogonality over the annulus, are incorrect. Such figures illustrate that, while the
correct interferometer setting errors can be obtained by a 4-polynomial least-squares fit
using the annular or the circle polynomials, only the annular polynomials yield their
correct values when obtained by fitting with a larger number of polynomials.
Similar results are illustrated when the circle polynomials are used for the analysis of
a hexagonal wavefront. For example, Eqs. (12-59) and (12-60) show that the circle
coefficients change when fitting it first with 4 circle polynomials and then with 11
polynomials. This may be seen in Table 12-9 by comparing the first four coefficients of
the column with J = 11 with the corresponding coefficients in the column with J = 4.
When the number of polynomials increases from 11 to 15, then only the astigmatism
coefficients b̂5 and b̂6 change. However, Eqs. (12-61) and (12-62) show that an identical
fit is obtained when the same number of corresponding circle or hexagonal polynomials
are used, as illustrated in Figures 12-9a–c for J = 4, 11, and 15, respectively. However,
different fits are obtained with the improperly calculated Zernike coefficients b j , as
shown in Figures 12-10a–c. If we remove the first four $a_j$, $\hat{b}_j$, or $b_j$ coefficients of
piston, tip, tilt, and defocus representing the interferometer setting errors from the
aberration function estimated by 11 or 15 polynomials, only the residual aberration
function illustrated in Figures 12-11a and 12-11b is correct; those in Figures 12-12
and 12-13 are incorrect. The sigma value of an aberration function obtained by summing
the squares of the coefficients and taking the square root of the result is correct only for
the hexagonal coefficients, as may be seen from Table 12-10.
If the common practice of defining the center of an interferogram and drawing a unit
circle around it is followed, then the circle coefficients of a noncircular interferogram do
not yield a correct representation of the aberration function. Moreover, in this case, some
of the higher-order coefficients of aberrations that are nonexistent in the aberration
function are also nonzero, as mentioned in Section 12.3.4.4. Finally, the circle
coefficients, however obtained, do not represent the coefficients of the balanced
aberrations for an annular pupil. Consequently, it should be clear that the circle
polynomials are not suitable for the analysis of an annular wavefront, and only the
annular polynomials should be used for such an analysis.
References
1. V. N. Mahajan and M. Aftab, “Systematic comparison of the use of annular and
Zernike circle polynomials for annular wavefronts,” Appl. Opt. 49, 6489–6501
(2010).
2. G.-m. Dai and V. N. Mahajan, “Orthonormal polynomials in wavefront analysis:
Error analysis,” Appl. Opt. 47, 3433–3445 (2008).
3. H. Schreiber and J. H. Bruning, “Phase shifting interferometry,” in Optical Shop
Testing, 3rd Ed., D. Malacara, ed., pp. 547–666 (Wiley, New York, 2008).
CHAPTER 13
ANAMORPHIC SYSTEMS
13.1 Introduction
13.2 Gaussian Imaging
13.3 Classical Aberrations
13.4 Strehl Ratio and Aberration Balancing for a Rectangular Pupil
13.5 Aberration Polynomials Orthonormal over a Rectangular Pupil
13.6 Expansion of a Rectangular Aberration Function in Terms of Orthonormal Rectangular Polynomials
13.7 Anamorphic Imaging System with a Circular Pupil
13.7.1 Balanced Aberrations
13.7.2 Orthonormal Polynomials Representing Balanced Aberrations
13.8 Comparison of Polynomials for Rotationally Symmetric and Anamorphic Imaging Systems
13.9 Summary
References
13.1 INTRODUCTION
An anamorphic imaging system, for example, consisting of cylindrical optics, is
symmetric about two orthogonal planes whose intersection defines its optical axis. The
Gaussian images of a point object with object rays in the two symmetry planes are
formed separately. They are coincident in the final image space of the system for only
two pairs of conjugate planes [1]. By definition, an anamorphic system forms the image
of an extended object with different transverse magnifications in the two symmetry
planes. Thus, for example, the image of a square object is rectangular and that of a
rectangular object can be square. The two orthogonal planes of symmetry of the imaging
system yield six “reflection” invariants in terms of the Cartesian coordinates of the object
and pupil points [2,3], which become the building blocks of its aberration function for a
certain point object. The six invariants reduce to three “rotational” invariants for a
rotationally symmetric system, or equivalently for an infinite number of symmetry
planes.
In this chapter, we discuss the power series expansion of the aberration function in
terms of the six reflection invariants, define the classical aberrations of the system, and
discuss their balancing to minimize their variance across a rectangular exit pupil, and
thereby improve the image quality [see Chapter 2]. We show that the balanced
aberrations are represented by the products of the Legendre polynomials, one for each of
the two dimensions of the rectangular pupil [4]. The compound Legendre polynomials are
orthogonal across a rectangular pupil and, like the classical aberrations, are inherently
separable in the Cartesian coordinates of the pupil point.
The 2D orthogonal Legendre polynomials are different from the orthogonal
polynomials representing the balanced aberrations for a system with rotational symmetry
but a rectangular pupil. The rectangular polynomials for such a system are obtained by
orthogonalizing the Zernike circle polynomials over a rectangular pupil, as in Section 9.2,
and are not separable in the Cartesian coordinates of a pupil point. Similarly, products of
Chebyshev polynomials, one for the x axis and the other for the y axis, which are also
orthogonal over a rectangular or a square pupil, have been suggested for wavefront
analysis [5]. However, they are also not suitable for anamorphic systems, since they do
not represent balanced aberrations for such systems.
Just as we considered the orthonormal polynomials representing the balanced
aberrations of a rotationally symmetric system with a circular or an elliptical pupil, we
determine the polynomials for an anamorphic system with such pupils. We emphasize
that, although the Zernike circle polynomials are orthogonal over the circular pupil, they
are not suitable for an anamorphic system with such a pupil, because they do not
represent balanced aberrations for it.
13.2 GAUSSIAN IMAGING

Consider a point object $P$ located at a point $(p, q)$ in the object plane imaged by an
anamorphic system at a point $P'$, as illustrated in Figure 13-1. The cylindrical lens $L_1$
schematically represents cylindrical lenses with their symmetry axes parallel to the x axis,
and similarly for $L_2$ along the y axis. The system is symmetric about the yz and zx
planes, whose intersection defines its optical axis z. The rays in the zx plane originating at
$P$ are transmitted by $L_1$ like a plane-parallel plate and focused by $L_2$ at $P'$. Similarly, the
rays in the yz plane are focused by $L_1$ at $P'$ and transmitted by $L_2$ like a plane-parallel
plate. The projections of skew rays on the zx and yz planes contribute to the image in a
similar manner.
Let $S_1$ be the distance of the point object $P$, and $S_1'$ be the distance of the Gaussian
image point $P'$, from the object- and image-space principal planes $H_1$ and $H_1'$ of the lens
$L_1$, respectively, as illustrated in Figure 13-2. They are related to each other by the
image-space focal length $f_1'$ according to
$$\frac{1}{S_1'} - \frac{1}{S_1} = \frac{1}{f_1'}, \tag{13-1a}$$
or
$$S_1 = \frac{S_1' f_1'}{f_1' - S_1'}. \tag{13-1b}$$
Figure 13-1. Schematic of an anamorphic imaging system consisting of orthogonal
cylindrical lenses in a configuration called crossed cylinders. The system is
symmetric about the yz and zx planes whose intersection defines the optical axis z. A
fan of rays in the zx plane is shown originating at a point $P$ in the center of a square
object. The cylindrical lens $L_1$ acts as a plane-parallel plate on these rays and
transmits them without any bending. When the transmitted rays are incident on the
cylindrical lens $L_2$, they are refracted by it just like a spherical lens and focused at
the image point $P'$.
Figure 13-2. Gaussian imaging by an anamorphic imaging system, such as in Figure
13-1.
Similarly, the object and image distances $S_2$ and $S_2'$ for the lens $L_2$ of focal length $f_2'$
are related to each other according to
$$\frac{1}{S_2'} - \frac{1}{S_2} = \frac{1}{f_2'} \tag{13-2a}$$
or
$$\frac{1}{S_1' - d_2} - \frac{1}{S_1 - d_1} = \frac{1}{f_2'}, \tag{13-2b}$$
where $d_1$ and $d_2$ are the distances $H_1H_2$ and $H_1'H_2'$ between the respective principal
planes of the two lenses. In the thin lens approximation, $d_1$ and $d_2$ are equal to the
spacing between the lenses. Substituting for $S_1$ from Eq. (13-1b) into Eq. (13-2b), we
obtain a quadratic equation in $S_1'$ yielding two solutions for it. A corresponding value of
$S_1$ can be obtained for each value of $S_1'$ from Eq. (13-1b). Thus, an anamorphic system
has only two pairs of conjugates, compared to an infinite number for a rotationally
symmetric imaging system. It should be evident that the image magnifications along the x
and y axes are different, as they are given by
$$M_x = -\frac{S_2'}{S_2} \tag{13-3a}$$
and
$$M_y = -\frac{S_1'}{S_1}, \tag{13-3b}$$
respectively. A consequence, for example, is that the image of a square object is
rectangular and that of a circle is elliptical.
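The quadratic in $S_1'$ mentioned above can be made concrete with a short numerical sketch. It is only an illustration under stated assumptions: the function names, the sample focal lengths and spacings, and the explicit quadratic coefficients (obtained here by clearing denominators after substituting Eq. (13-1b) into Eq. (13-2b)) are not taken from the text.

```python
import math

def conjugate_pairs(f1, f2, d1, d2):
    """Return the real (S1, S1') pairs that satisfy Eqs. (13-1a) and (13-2b)."""
    # Substituting Eq. (13-1b) into Eq. (13-2b) and clearing denominators gives
    # A*u**2 + B*u + C = 0 for u = S1' (algebra worked out for this sketch).
    A = f1 + d1 - f2
    B = -(f1 * (d1 + d2) + d1 * d2 + f2 * (d1 - d2))
    C = f1 * (d1 * d2 + f2 * (d1 - d2))
    disc = B * B - 4.0 * A * C
    if A == 0 or disc < 0:
        return []  # degenerate or no real conjugates; not handled in this sketch
    pairs = []
    for sign in (1.0, -1.0):
        s1p = (-B + sign * math.sqrt(disc)) / (2.0 * A)
        if math.isclose(s1p, f1):
            continue  # object at infinity; Eq. (13-1b) is singular there
        s1 = s1p * f1 / (f1 - s1p)  # Eq. (13-1b)
        pairs.append((s1, s1p))
    return pairs

def magnifications(s1, s1p, d1, d2):
    """Transverse magnifications (Mx, My) of Eqs. (13-3a) and (13-3b)."""
    s2, s2p = s1 - d1, s1p - d2  # distances measured from the planes of L2
    return -s2p / s2, -s1p / s1

# Hypothetical thin-lens example: f1' = 100, f2' = 60, lens spacing 30.
for s1, s1p in conjugate_pairs(100.0, 60.0, 30.0, 30.0):
    print(s1, s1p, magnifications(s1, s1p, 30.0, 30.0))
```

For each pair returned, the two magnifications differ, which is the anamorphic behavior described in the text.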
13.3 CLASSICAL ABERRATIONS

Let the exit pupil of the anamorphic system be rectangular with half-widths $a$ and $b$.
It may be an aperture stop in the image space of the system. Let $(x, y)$ be the coordinates
of a pupil point normalized by $(a, b)$ so that $-1 \le x \le 1$ and $-1 \le y \le 1$.
It is evident that the system is symmetric about two orthogonal planes zx and yz.
Accordingly, the aberration function, which depends on both $(p, q)$ and $(x, y)$
coordinates, consists of products of nonnegative integral powers of six reflection invariants
[2–4]:
$$p^2,\ x^2,\ px,\ q^2,\ y^2,\ \text{and}\ qy. \tag{13-4}$$
The first three are symmetric about the yz plane and the other three are symmetric about
the zx plane. The aberration function can accordingly be written in the form
$$W(p, q; x, y) = \sum_{i,j,k,l,m,n} C_{i,j,k,l,m,n}\,(p^2)^i (q^2)^j (x^2)^k (y^2)^l (px)^m (qy)^n, \tag{13-5}$$
where $i$, $j$, $k$, $l$, $m$, and $n$ are nonnegative integers, and $C_{i,j,k,l,m,n}$ is the
coefficient of the aberration term that has a degree in the object and pupil coordinates
given by
$$\text{degree} = 2(i + j + k + l + m + n). \tag{13-6}$$
It is evident that the degree of an aberration term is even, and thus the aberration function
consists of aberrations of even orders only. The zero-degree term must be zero, as it
represents the aberration of the chief ray, which is zero by its definition as the reference
ray. There are six terms of second degree, namely the reflection invariants multiplied
by their respective coefficients. Two of these terms, namely those in $p^2$ and $q^2$, are
piston terms, i.e., they are independent of the pupil coordinates, and can generally be
ignored. Among the other four, those in $px$ and $qy$ represent lateral deviations of the
image point from the Gaussian image point, and those in $x^2$ and $y^2$ represent
longitudinal deviations. Since our aberration function is defined with respect to the
Gaussian image point, these four terms must be zero. It is clear that the aberration terms
are separable in the Cartesian coordinates $(x, y)$ of a pupil point.
There are 21 terms of the fourth degree, of which three are piston terms and two are
equal to another two. Hence, we are left with 16 terms that depend on the pupil
coordinates. They are called the primary aberrations of an anamorphic system, compared
to only five for a rotationally symmetric system. The primary aberration function can be
written
$$\begin{aligned}
W(p, q; x, y) ={}& \left(C_1 p^3 + C_2 p q^2\right) x + \left(C_3 p^2 q + C_4 q^3\right) y + \left(C_5 p^2 + C_6 q^2\right) x^2 \\
&+ C_7\, pqxy + \left(C_8 p^2 + C_9 q^2\right) y^2 + C_{10}\, p x y^2 + C_{11}\, q y x^2 + C_{12}\, p x^3 \\
&+ C_{13}\, q y^3 + C_{14}\, x^2 y^2 + C_{15}\, x^4 + C_{16}\, y^4,
\end{aligned} \tag{13-7}$$
where we have expressed the aberration coefficients in a simplified form with one
subscript for convenience. For a rotationally symmetric system, the six reflection
invariants reduce to three rotational invariants, namely, $p^2 + q^2$, $x^2 + y^2$, and $px + qy$,
and the 16 primary aberrations reduce to five. If $\vec{h}$ and $\vec{r}$ are the position vectors of the
object and pupil points, then the rotational invariants are $\vec{h} \cdot \vec{h}$, $\vec{r} \cdot \vec{r}$, and $\vec{h} \cdot \vec{r}$, or $h^2$, $r^2$, and
$hr\cos\theta$, where $h = |\vec{h}|$, $r = |\vec{r}|$, and $\theta$ is the polar angle of $\vec{r}$ with respect to that of $\vec{h}$.
In conformance with the aberrations of a rotationally symmetric system, the linear terms
in x and y are the distortion aberrations; the quadratic terms may be referred to as the field
curvature, defocus, or astigmatism; the cubic terms are comas; and the quartic terms
are the spherical aberrations. It is easy to see that an anamorphic system has three primary
aberrations for an axial point object ($p = q = 0$), namely the terms in $x^2 y^2$, $x^4$, and $y^4$,
compared to only one for a rotationally symmetric system.
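The term counts quoted in this section (21 fourth-degree products, three piston terms, two duplicates, 16 primary aberrations, three of them for an axial point object) can be checked by brute-force enumeration. The following is a minimal sketch; the exponent-tuple representation and the names `INVARIANTS` and `fourth_degree_counts` are choices made for this illustration only.

```python
from itertools import combinations_with_replacement

# Exponents of (p, q, x, y) for the six reflection invariants of Eq. (13-4).
INVARIANTS = {
    "p^2": (2, 0, 0, 0), "q^2": (0, 2, 0, 0),
    "x^2": (0, 0, 2, 0), "y^2": (0, 0, 0, 2),
    "px": (1, 0, 1, 0), "qy": (0, 1, 0, 1),
}

def fourth_degree_counts():
    """Count the fourth-degree terms formed as products of two invariants."""
    products = list(combinations_with_replacement(INVARIANTS.values(), 2))
    # Adding exponent tuples multiplies the monomials; a set removes duplicates
    # such as p^2 * x^2 = (px)^2.
    monomials = {tuple(a + b for a, b in zip(u, v)) for u, v in products}
    piston = {m for m in monomials if m[2] == 0 and m[3] == 0}  # no x or y
    primary = monomials - piston
    axial = {m for m in primary if m[0] == 0 and m[1] == 0}     # p = q = 0
    return {
        "products": len(products),
        "distinct": len(monomials),
        "piston": len(piston),
        "primary": len(primary),
        "axial": len(axial),
    }

print(fourth_degree_counts())
```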
13.4 STREHL RATIO AND ABERRATION BALANCING FOR A RECTANGULAR PUPIL
The Strehl ratio of an imaging system with a rectangular pupil is given by Eq. (9-20).
Its approximate value for a small aberration depends on the aberration variance according
to Eq. (1-34). However, the balanced aberrations for an anamorphic system for minimum
variance are different from those for a rotationally symmetric system, as we now show.
Consider an aberration such as x-coma [4]:
$$W_{cx}(x, y) = x^3. \tag{13-8}$$
Its variance across the rectangular pupil is given by
$$\sigma_{cx}^2 = \left\langle [W_{cx}(x, y)]^2 \right\rangle - \left\langle W_{cx}(x, y) \right\rangle^2, \tag{13-9}$$
where the angular brackets indicate a mean value across the pupil. For example, the mean
value of a function $g(x, y)$ is given by
$$\left\langle g(x, y) \right\rangle = \frac{\int_{-1}^{1}\int_{-1}^{1} g(x, y)\,dx\,dy}{\int_{-1}^{1}\int_{-1}^{1} dx\,dy} = \frac{1}{4}\int_{-1}^{1}\int_{-1}^{1} g(x, y)\,dx\,dy. \tag{13-10}$$
Thus, the standard deviation of the x-coma aberration is given by $\sigma_{cx} = 1/\sqrt{7}$.
The variance can be reduced by mixing it with a certain amount $b$ of x-tilt (not to be
confused with the half-width $b$ of the pupil). Thus, the balanced aberration may be
written in the form
$$W_{bcx}(x, y) = x^3 + bx. \tag{13-11}$$
Its variance is given by
$$\sigma_{bcx}^2 = \frac{1}{7} + \frac{2b}{5} + \frac{b^2}{3}. \tag{13-12}$$
The variance has a minimum value of 4/175 for a tilt of $b = -3/5$ compared to a value of
1/7 without any tilt. Thus the variance is reduced by a factor of 25/4, or the standard
deviation of the balanced aberration is smaller by a factor of 5/2. The corresponding
balanced aberration is given by
$$W_{bcx}(x, y) = x^3 - (3/5)\,x. \tag{13-13}$$
A balanced aberration yields a higher Strehl ratio or increases the aberration tolerance for
a given Strehl ratio. The balanced aberration given by Eq. (13-13) is the same as Eq. (11-6)
for a slit pupil. Similarly, the variance of the x-spherical aberration $x^4$ can be
minimized by combining it with x-defocus of $-(6/7)\,x^2$, yielding a balanced aberration of
$x^4 - (6/7)\,x^2$. This balanced aberration is also the same as for a slit pupil. The sigma
value of the aberration is reduced by a factor of 7/2 from 4/15 to 8/105. It should be
evident that the y-coma or y-spherical aberration may be balanced in the same manner.
The variance of the higher-order classical aberrations, e.g., secondary aberrations (of
sixth degree), can also be minimized by combining them with one or more lower-degree
aberrations.
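The balancing arithmetic for the rectangular pupil can be reproduced with exact rational numbers. The sketch below assumes only the pupil average of Eq. (13-10), for which the mean of $x^n$ is $1/(n+1)$ for even $n$ and zero for odd $n$; the helper names `mean_power` and `balance` are hypothetical.

```python
from fractions import Fraction

def mean_power(n):
    """Mean of x**n over -1 <= x <= 1; the y integral in Eq. (13-10) cancels."""
    return Fraction(1, n + 1) if n % 2 == 0 else Fraction(0)

def balance(n, k):
    """Minimize the variance of x**n + b*x**k with respect to b.

    Returns (b, unbalanced variance, balanced variance) as exact fractions.
    """
    def cov(i, j):
        return mean_power(i + j) - mean_power(i) * mean_power(j)

    b = -cov(n, k) / cov(k, k)  # zero of the derivative of the variance
    var_unbalanced = cov(n, n)
    var_balanced = var_unbalanced + 2 * b * cov(n, k) + b * b * cov(k, k)
    return b, var_unbalanced, var_balanced

print(balance(3, 1))  # x-coma balanced with x-tilt
print(balance(4, 2))  # x-spherical aberration balanced with x-defocus
```

The first call returns the tilt amount and variances quoted with Eq. (13-12); the second returns the squares of the sigma values 4/15 and 8/105 quoted for the spherical aberration.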
13.5 ABERRATION POLYNOMIALS ORTHONORMAL OVER A RECTANGULAR PUPIL
The balanced classical aberrations for an anamorphic imaging system are separable
in the x and y coordinates of a point on the rectangular exit pupil. We have shown in
Chapter 11 that the balanced aberrations and the polynomials representing them in one
dimension are the Legendre polynomials. Hence, the balanced aberrations for a
rectangular pupil and the polynomials representing them are the products of Legendre
polynomials in x and y variables. The polynomials with the x and y variables properly
normalized by the dimensions of the rectangular pupil $(a, b)$ are orthogonal over the
pupil, and may be referred to as the orthogonal aberrations. The order of an orthogonal
aberration is represented by its degree in the pupil coordinates, which is the same as that
of its leading classical aberration term only for an axial point object. For example, the
fourth-order classical aberration $px^3$ in the object and pupil coordinates representing
x-coma becomes a third-order orthogonal aberration $p\left[x^3 - (3/5)\,x\right]$ in pupil coordinates.
We define the products of Legendre polynomials (discussed in Chapter 11) in x and y
variables that are orthonormal over the rectangular pupil:
$$Q_j(x, y) = L_l(x)\,L_m(y), \tag{13-14}$$
where $j$ is a polynomial ordering index starting with $j = 1$, and $l$ and $m$ are nonnegative
integers. It is evident that these polynomials are inherently separable in
the Cartesian pupil coordinates x and y. This is different from the Zernike circle
polynomials, which are orthogonal over a unit circle, but separable in polar coordinates
$(\rho, \theta)$, where $0 \le \rho \le 1$ and $0 \le \theta \le 2\pi$. The order $n$ of a polynomial representing its
degree in the pupil coordinates is given by $n = l + m$. As in the case of Zernike circle
polynomials, the number of polynomials with a certain order $n$ is $n + 1$. The number of
polynomials through a certain order $n$ is given by
$$N_n = (n + 1)(n + 2)/2. \tag{13-15}$$
The first polynomial is the piston polynomial
$$Q_1(x, y) = L_0(x)\,L_0(y) = 1. \tag{13-16}$$
The orthonormality of the polynomials is expressed by
$$\frac{1}{4}\int_{-1}^{1}\int_{-1}^{1} Q_j(x, y)\,Q_{j'}(x, y)\,dx\,dy = \delta_{jj'}. \tag{13-17}$$
The rectangular Q-polynomials up to and including the eighth order are listed in
Table 13-1 as products of the Legendre polynomials, along with the names associated
with some of them. Their explicit form can be obtained by using the expressions of the
Legendre polynomials given in Table 11-1. Note that for each polynomial $L_l(x)\,L_m(y)$,
there is a corresponding polynomial $L_m(x)\,L_l(y)$. These polynomials are evidently
different from those for a rotationally symmetric system with a rectangular pupil. The
rectangular polynomials given in Section 9.4 for such a system are not separable in the
Cartesian coordinates $(x, y)$ of a pupil point.
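As an illustration of the ordering index and of the orthonormality condition of Eq. (13-17), the sketch below builds the Q-polynomials numerically with NumPy. Two assumptions are made: the orthonormal one-dimensional polynomial is taken as $L_n(x) = \sqrt{2n+1}\,P_n(x)$, where $P_n$ is the standard Legendre polynomial (consistent with the square-pupil entries of Table 13-3), and the index $j$ is mapped to $(l, m)$ by the ordering visible in Table 13-1. The function names are hypothetical.

```python
import numpy as np
from numpy.polynomial import legendre

def index_to_lm(j):
    """Map the 1-based ordering index j of Table 13-1 to the degrees (l, m)."""
    n = 0
    while (n + 1) * (n + 2) // 2 < j:  # N_n of Eq. (13-15)
        n += 1
    m = j - n * (n + 1) // 2 - 1       # position within order n; l decreases
    return n - m, m

def L(n, t):
    """Orthonormal Legendre polynomial, assumed to be sqrt(2n + 1) * P_n(t)."""
    c = np.zeros(n + 1)
    c[n] = 1.0
    return np.sqrt(2 * n + 1) * legendre.legval(t, c)

def Q(j, x, y):
    """Rectangular polynomial Q_j(x, y) = L_l(x) L_m(y) of Eq. (13-14)."""
    l, m = index_to_lm(j)
    return L(l, x) * L(m, y)

def gram_matrix(jmax, npts=12):
    """Evaluate the left side of Eq. (13-17) by Gauss-Legendre quadrature."""
    t, w = legendre.leggauss(npts)
    X, Y = np.meshgrid(t, t)
    weights = np.outer(w, w) / 4.0     # the factor 1/4 is the pupil area
    return np.array([[np.sum(weights * Q(j, X, Y) * Q(k, X, Y))
                      for k in range(1, jmax + 1)]
                     for j in range(1, jmax + 1)])

print(index_to_lm(7), index_to_lm(32))
print(np.allclose(gram_matrix(15), np.eye(15)))
```

A 12-point rule in each variable integrates polynomials up to degree 23 exactly, which is ample for products of the polynomials through the eighth order.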
The higher-order Q-polynomials can be written in a similar manner. It should be
evident that $Q_7$ represents the balanced x-primary coma, and $Q_{11}$ represents the balanced
x-primary spherical aberration. These polynomials are the same as those for a slit pupil
(see Table 11-1). The piston term in $Q_{11}$ yields a zero mean value across the rectangular
pupil (without changing its variance). Since the aberration function is separable in the
Cartesian coordinates x and y of a pupil point, an aberration term containing both x and y
dependence is balanced separately for its x and y factors. Thus, for example, a seventh-order
aberration term $x^4 y^3$ will yield a balanced aberration of the form
$\left[x^4 - (6/7)\,x^2\right]\left[y^3 - (3/5)\,y\right]$. The corresponding seventh-order orthonormal polynomial is
given by
$$Q_{32}(x, y) = L_4(x)\,L_3(y). \tag{13-18}$$
Table 13-1. Orthonormal polynomials $Q_j(x, y)$ for an anamorphic system with a
rectangular pupil, where the $(x, y)$ coordinates of a pupil point have been
normalized by the half-widths $(a, b)$ of the pupil.
Order n = l + m    Polynomial Q_j(x, y) = L_l(x) L_m(y)    Aberration name
0                  Q1  = L0(x) L0(y)                        Piston
1                  Q2  = L1(x) L0(y)                        x-tilt
1                  Q3  = L0(x) L1(y)                        y-tilt
2                  Q4  = L2(x) L0(y)                        x-defocus
2                  Q5  = L1(x) L1(y)
2                  Q6  = L0(x) L2(y)                        y-defocus
3                  Q7  = L3(x) L0(y)                        x-primary coma
3                  Q8  = L2(x) L1(y)
3                  Q9  = L1(x) L2(y)
3                  Q10 = L0(x) L3(y)                        y-primary coma
4                  Q11 = L4(x) L0(y)                        x-primary spherical
4                  Q12 = L3(x) L1(y)
4                  Q13 = L2(x) L2(y)
4                  Q14 = L1(x) L3(y)
4                  Q15 = L0(x) L4(y)                        y-primary spherical
5                  Q16 = L5(x) L0(y)                        x-secondary coma
5                  Q17 = L4(x) L1(y)
5                  Q18 = L3(x) L2(y)
5                  Q19 = L2(x) L3(y)
5                  Q20 = L1(x) L4(y)
5                  Q21 = L0(x) L5(y)                        y-secondary coma
6                  Q22 = L6(x) L0(y)                        x-secondary spherical
6                  Q23 = L5(x) L1(y)
6                  Q24 = L4(x) L2(y)
6                  Q25 = L3(x) L3(y)
6                  Q26 = L2(x) L4(y)
6                  Q27 = L1(x) L5(y)
6                  Q28 = L0(x) L6(y)                        y-secondary spherical
7                  Q29 = L7(x) L0(y)                        x-tertiary coma
7                  Q30 = L6(x) L1(y)
7                  Q31 = L5(x) L2(y)
7                  Q32 = L4(x) L3(y)
7                  Q33 = L3(x) L4(y)
7                  Q34 = L2(x) L5(y)
7                  Q35 = L1(x) L6(y)
7                  Q36 = L0(x) L7(y)                        y-tertiary coma
8                  Q37 = L8(x) L0(y)                        x-tertiary spherical
8                  Q38 = L7(x) L1(y)
8                  Q39 = L6(x) L2(y)
8                  Q40 = L5(x) L3(y)
8                  Q41 = L4(x) L4(y)
8                  Q42 = L3(x) L5(y)
8                  Q43 = L2(x) L6(y)
8                  Q44 = L1(x) L7(y)
8                  Q45 = L0(x) L8(y)                        y-tertiary spherical
It should be evident that the polynomials for a square pupil can be obtained from
those for a rectangular pupil by letting $a = b$, i.e., by using the same scale for the x and y
axes. Products of Chebyshev polynomials (one for the x axis and the other for the y axis), which
are also orthogonal over a rectangular or a square pupil, have been suggested for the
analysis of rectangular wavefronts [5]. However, they are not suitable for anamorphic
systems since they do not represent balanced aberrations for such systems.
13.6 EXPANSION OF A RECTANGULAR ABERRATION FUNCTION IN TERMS OF ORTHONORMAL RECTANGULAR POLYNOMIALS

An aberration function defined over a rectangular exit pupil can be expanded in
terms of the rectangular Q-polynomials in the form
$$W(x, y) = \sum_j a_j\,Q_j(x, y), \tag{13-19}$$
where $a_j$ is an expansion coefficient of the polynomial $Q_j(x, y)$ given by
$$a_j = \frac{1}{4}\int_{-1}^{1}\int_{-1}^{1} W(x, y)\,Q_j(x, y)\,dx\,dy. \tag{13-20}$$
It is evident that the value of a coefficient is independent of the number of polynomials
used in the expansion. Since the mean value of each polynomial (other than piston) is
zero and $Q_1(x, y)$ is unity, the mean value of the aberration function is given by the
piston coefficient $a_1$:
$$\left\langle W(x, y) \right\rangle = a_1. \tag{13-21}$$
Its mean square value is given by
$$\left\langle [W(x, y)]^2 \right\rangle = \sum_j a_j^2. \tag{13-22}$$
Accordingly, its variance is given by
$$\sigma_W^2 = \left\langle [W(x, y)]^2 \right\rangle - \left\langle W(x, y) \right\rangle^2 = \sum_{j \ne 1} a_j^2. \tag{13-23}$$
When an aberration function is obtained at a discrete array of points by tracing rays
from a point object through an imaging system or by testing a system interferometrically,
it can be expanded in terms of the orthonormal polynomials. Once the expansion
coefficients are calculated using Eq. (13-20), they can be used to reconstruct the function
and obtain it continuously across the pupil. Because of the orthogonality of the Legendre
polynomials, the coefficients are independent of each other, and an orthogonal aberration
term can be added to or subtracted from the aberration function without affecting the
other terms.
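A minimal numerical sketch of Eqs. (13-20) and (13-23) follows. It assumes that the aberration function can be evaluated at Gauss-Legendre nodes, which is convenient for a ray-trace model but is not how measured data on a uniform grid would normally be handled; it also assumes $L_n(x) = \sqrt{2n+1}\,P_n(x)$ for the orthonormal Legendre polynomials. The function names are hypothetical.

```python
import numpy as np
from numpy.polynomial import legendre

def L(n, t):
    """Orthonormal Legendre polynomial, assumed to be sqrt(2n + 1) * P_n(t)."""
    c = np.zeros(n + 1)
    c[n] = 1.0
    return np.sqrt(2 * n + 1) * legendre.legval(t, c)

def expand(W, max_order, npts=16):
    """Coefficients a_j of Eq. (13-20), ordered as in Table 13-1.

    W is a callable W(x, y) defined on the normalized pupil; the double
    integral is approximated by a product Gauss-Legendre rule.
    """
    t, w = legendre.leggauss(npts)
    X, Y = np.meshgrid(t, t)
    weights = np.outer(w, w) / 4.0
    samples = W(X, Y)
    coeffs = []
    for n in range(max_order + 1):
        for m in range(n + 1):         # l = n - m decreases within an order
            coeffs.append(np.sum(weights * samples * L(n - m, X) * L(m, Y)))
    return np.array(coeffs)

# Classical x-coma of Eq. (13-8): only the x-tilt (a_2) and x-primary coma
# (a_7) coefficients should be nonzero.
a = expand(lambda x, y: x**3, max_order=4)
variance = np.sum(a[1:] ** 2)          # Eq. (13-23): piston a_1 is excluded
print(a[1], a[6], variance)
```

For this example the sum of the squared coefficients reproduces the variance of 1/7 found in Section 13.4, and the coma coefficient alone carries the balanced variance of 4/175.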
13.7 ANAMORPHIC IMAGING SYSTEM WITH A CIRCULAR PUPIL

If a circular aperture stop is placed in the image space of an anamorphic system, then
its inherent aberrations discussed in Section 13.3 must be balanced over its circular exit
pupil. Similarly, if a circular aperture stop is placed in the object space of the system,
then aberrations are balanced over its elliptical exit pupil. The difference in the
corresponding circular and elliptical polynomials lies in the scaling factor of the x and y
axes by the aspect ratio of the elliptical pupil.
13.7.1 Balanced Aberrations

Consider x-coma given by Eq. (13-8). Its variance across a unit circular pupil is
given by Eq. (13-9), except that the angular brackets now indicate a mean value across
the circular pupil. Thus the mean value of a function $g(x, y)$ is given by
$$\left\langle g(x, y) \right\rangle = \frac{1}{\pi}\int_0^1\int_0^{2\pi} g(x, y)\,\rho\,d\rho\,d\theta, \tag{13-24}$$
where $x = \rho\cos\theta$ and $y = \rho\sin\theta$. Thus, $\left\langle W_{cx}(x, y) \right\rangle = 0$, and the variance is given by
$$\sigma_{cx}^2 = \left\langle [W_{cx}(x, y)]^2 \right\rangle = \frac{1}{\pi}\int_0^1\int_0^{2\pi} \rho^6\cos^6\theta\,\rho\,d\rho\,d\theta = \frac{5}{64}. \tag{13-25}$$
Now we balance x-coma with tilt, as in Eq. (13-11). Its variance across the circular
pupil is given by
$$\sigma_{bcx}^2 = \frac{5}{64} + \frac{b}{4} + \frac{b^2}{4}. \tag{13-26}$$
It has a minimum value of 1/64 for $b = -1/2$. Thus, the sigma value is reduced by a
factor of $\sqrt{5}$. The corresponding balanced aberration is given by
$$W_{bcx}(x, y) = x^3 - (1/2)\,x. \tag{13-27}$$
Similarly, we can show that when a certain amount of the x-spherical aberration is
balanced by $-3/4$ of that amount of the x-defocus, its sigma value is reduced by a factor
of $\sqrt{10}$ from $(1/8)\sqrt{5/2}$ to $1/16$. The balanced x-spherical aberration is $x^4 - (3/4)\,x^2$.
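The circular-pupil values can be verified with exact moments over the unit disc. The sketch below uses the standard closed form for the mean of $x^m y^n$ over a unit disc, $2\,(m-1)!!\,(n-1)!! / [(m+n+2)\,(m+n)!!]$ for even $m$ and $n$ and zero otherwise; this closed form and the helper names are supplied for this illustration and are not quoted from the text.

```python
from fractions import Fraction

def double_factorial(n):
    """n!! with the conventions (-1)!! = 0!! = 1."""
    result = 1
    while n > 1:
        result *= n
        n -= 2
    return result

def disc_moment(m, n):
    """Mean of x**m * y**n over the unit disc, in the sense of Eq. (13-24)."""
    if m % 2 or n % 2:
        return Fraction(0)
    return Fraction(2 * double_factorial(m - 1) * double_factorial(n - 1),
                    (m + n + 2) * double_factorial(m + n))

def balance_x(n, k):
    """Minimize the variance of x**n + b*x**k over the unit disc.

    Returns (b, unbalanced variance, balanced variance) as exact fractions.
    """
    def cov(i, j):
        return disc_moment(i + j, 0) - disc_moment(i, 0) * disc_moment(j, 0)

    b = -cov(n, k) / cov(k, k)
    var_unbalanced = cov(n, n)
    var_balanced = var_unbalanced + 2 * b * cov(n, k) + b * b * cov(k, k)
    return b, var_unbalanced, var_balanced

print(balance_x(3, 1))  # x-coma with x-tilt, Eqs. (13-25) to (13-27)
print(balance_x(4, 2))  # x-spherical aberration with x-defocus
```

The balanced variance of the spherical aberration comes out as 1/256, i.e., a sigma value of 1/16, and the ratio of the unbalanced to the balanced variance is 10, in agreement with the factor of $\sqrt{10}$ in the sigma value.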
13.7.2 Orthonormal Polynomials Representing Balanced Aberrations

The orthonormal polynomials $F_j(x, y)$ representing balanced aberrations for a
circular exit pupil can be obtained by orthonormalizing the square polynomials $Q_j(x, y)$
given in Table 13-1 by the Gram–Schmidt process [6]. The first 15 polynomials obtained
in this manner are given in Table 13-2. Their orthonormality is expressed by
$$\frac{1}{\pi}\int_0^1\int_0^{2\pi} F_j(x, y)\,F_{j'}(x, y)\,\rho\,d\rho\,d\theta = \delta_{jj'}. \tag{13-28}$$
We note that there are x-polynomials for defocus, balanced coma, and balanced spherical
aberration, but no corresponding y-polynomials. The $y^2$ term representing y-defocus
appears in $F_6$, the $y^3$ term representing y-coma appears in $F_{10}$, and the $y^4$ term
representing y-spherical aberration appears in $F_{15}$. The only difference between
the polynomials for a circular and an elliptical pupil is the scaling of the x and y axes. An
aberration function across such a pupil can be expanded in terms of these polynomials in
the same manner as in Section 13.6.
The x-polynomials for square and circular pupils are compared in Table 13-3. They
illustrate how the sigma values or the balancing aberrations differ for the two pupils. The
polynomials for a rectangular or an elliptical pupil can be obtained from the
corresponding polynomials for a square or a circular pupil by simply scaling the $(x, y)$
coordinates.
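The Gram–Schmidt construction can be sketched in a few lines. As a simplification, the sketch orthonormalizes the monomials $x^l y^m$, taken in the order of Table 13-1, rather than the Q-polynomials themselves; because each $Q_j$ is its leading monomial plus monomials that appear earlier in that order, both starting sets should lead to the same polynomials. The dictionary representation, the closed-form disc moments, and the function names are assumptions of this illustration.

```python
import math

def double_factorial(n):
    """n!! with the conventions (-1)!! = 0!! = 1."""
    result = 1
    while n > 1:
        result *= n
        n -= 2
    return result

def disc_moment(m, n):
    """Mean of x**m * y**n over the unit disc (zero unless m and n are even)."""
    if m % 2 or n % 2:
        return 0.0
    return (2.0 * double_factorial(m - 1) * double_factorial(n - 1)
            / ((m + n + 2) * double_factorial(m + n)))

def inner(p, q):
    """Inner product of Eq. (13-28); polynomials are {(i, j): coefficient}."""
    return sum(a * b * disc_moment(i + k, j + l)
               for (i, j), a in p.items() for (k, l), b in q.items())

def gram_schmidt(max_order):
    """Orthonormalize x**l * y**m over the unit disc in the Table 13-1 order."""
    basis = []
    for n in range(max_order + 1):
        for m in range(n + 1):
            seed = {(n - m, m): 1.0}
            p = dict(seed)
            for f in basis:
                c = inner(seed, f)          # projection on an earlier F_j
                for key, val in f.items():
                    p[key] = p.get(key, 0.0) - c * val
            norm = math.sqrt(inner(p, p))
            basis.append({k: v / norm for k, v in p.items()
                          if abs(v / norm) > 1e-12})
    return basis

F = gram_schmidt(4)                         # F[j - 1] corresponds to F_j
print(F[5])                                 # compare with F_6 in Table 13-2
```

The printed coefficients of `F[5]` can be compared with the entry $\sqrt{2}\,(x^2 + 3y^2 - 1)$ for $F_6$.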
13.8 COMPARISON OF POLYNOMIALS FOR ROTATIONALLY SYMMETRIC AND ANAMORPHIC IMAGING SYSTEMS
For clarity and convenience, we summarize in Table 13-4 which polynomial set to
use for analyzing the aberrations of an anamorphic system depending on the shape of its
pupil. The polynomials must be orthogonal over the pupil and represent balanced
aberrations for it. For a rectangular pupil, the appropriate polynomials are the 2D
Legendre polynomials $Q_j(x, y)$. The polynomials for a square pupil can be obtained as a
special case of a rectangular pupil by using the same scale for both the x and y axes. For a
circular pupil, the polynomials given in Table 13-2 obtained by orthogonalizing the
square polynomials over the circular pupil are the right polynomials. The Zernike circle
polynomials, although orthogonal over the circular pupil, are not suitable because they do
not represent balanced aberrations for it. Since the 2D Legendre polynomials are
inherently separable in the x and y coordinates, the appropriate polynomials for an
elliptical pupil can be obtained by properly scaling the x and y axes of the polynomials
given for a circular pupil in Table 13-2.
Similarly, Table 13-5 summarizes which polynomial set to use for analyzing the
aberrations of a rotationally symmetric system with different pupil shapes. For a circular
pupil, the appropriate polynomials are the Zernike circle polynomials, since they are
orthogonal over and represent balanced aberrations for such a pupil. If the pupil is
elliptical, as in the case of the human eye, the appropriate polynomials are those given in
Tables 8-1 to 8-3 obtained by orthogonalizing the circle polynomials over the elliptical
pupil. They cannot be obtained from the circle polynomials by simply scaling the x and y
axes of a circular pupil. Although the polynomials thus obtained will be orthogonal over
an elliptical pupil, they will not represent the balanced aberrations for it. For a rectangular
pupil, e.g., a rectangular beam passing through such a system, the appropriate
polynomials are those given in Tables 9-1 to 9-3. The 2D Legendre polynomials are not
suitable because, although they are orthogonal over the pupil, they do not represent
balanced aberrations for it. The polynomials for a square pupil can be obtained as a
special case of those for a rectangular pupil. They are given in Tables 10-1 to 10-3.
Table 13-2. Orthonormal polynomials $F_j(x, y)$ for an anamorphic system with a
circular exit pupil, where $x = \rho\cos\theta$, $y = \rho\sin\theta$, and $0 \le \rho \le 1$.
Polynomial number j    $F_j(x, y)$                                                         Aberration name
1                      $1$                                                                 Piston
2                      $2x$                                                                x-tilt
3                      $2y$                                                                y-tilt
4                      $4x^2 - 1$                                                          x-defocus
5                      $2\sqrt{6}\,xy$
6                      $\sqrt{2}\,(x^2 + 3y^2 - 1)$
7                      $4\,(2x^3 - x)$                                                     x-primary coma
8                      $(4/\sqrt{5})\,(6x^2 y - y)$
9                      $4\,(x^3 + 3xy^2 - x)$
10                     $(4/\sqrt{5})\,(3x^2 y + 5y^3 - 3y)$
11                     $16x^4 - 12x^2 + 1$                                                 x-primary spherical
12                     $2\sqrt{2}\,(8x^3 y - 3xy)$
13                     $\sqrt{10/7}\,(8x^4 + 24x^2 y^2 - 9x^2 - 3y^2 + 1)$
14                     $4\sqrt{2}\,(3x^3 y + 5xy^3 - 3xy)$
15                     $\sqrt{2/7}\,(3x^4 + 30x^2 y^2 + 35y^4 - 6x^2 - 30y^2 + 3)$
Table 13-3. Comparison of defocus, coma, and spherical aberration orthonormal
polynomials for anamorphic systems with square and circular pupils.
Aberration    Square pupil                        Circular pupil
Defocus       $(\sqrt{5}/2)\,(3x^2 - 1)$          $4x^2 - 1$
Coma          $(\sqrt{7}/2)\,(5x^3 - 3x)$         $4\,(2x^3 - x)$
Spherical     $(3/8)\,(35x^4 - 30x^2 + 3)$        $16x^4 - 12x^2 + 1$
Table 13-4. Appropriate polynomials for an anamorphic system with different pupil
shapes.
Pupil shape    Polynomial set
Rectangular    Table 13-1
Square         Table 13-1 with the same scaling of x- and y-axes
Circular       Table 13-2
Elliptical     Table 13-2 with scaling of x- and y-axes
Table 13-5. Appropriate polynomials for a rotationally symmetric system with
different pupil shapes.
Pupil shape    Polynomial set
Rectangular    Rectangular polynomial Tables 9-1 to 9-3
Square         Square polynomial Tables 10-1 to 10-3
Circular       Zernike circle polynomial Table 4-4
Elliptical     Elliptical polynomial Tables 8-1 to 8-3
13.9 SUMMARY
An anamorphic imaging system has only two pairs of Gaussian conjugates,
compared to an infinite number for a rotationally symmetric imaging system. The
diffraction PSF or OTF of these systems, which depends on the shape of the exit pupil
and the aberration across it, is the same for the same pupil function. It is assumed that the
aperture stop lies in the image space of the system so that it is also its exit pupil.
The aberration function of an anamorphic system depends on the object and pupil
coordinates $(p, q)$ and $(x, y)$, respectively, through six reflection invariants $p^2$, $q^2$, $x^2$,
$y^2$, $px$, and $qy$, compared to three rotational invariants $p^2 + q^2$, $x^2 + y^2$, and $px + qy$
in the case of a rotationally symmetric system. Its aberration terms are separable in the
pupil coordinates. The degree of an aberration term is even, and the aberration function
accordingly consists of aberrations of even orders only. There are 16 primary aberrations
[see Eq. (13-7)] as opposed to only five for a rotationally symmetric system [see Eq. (2-16)].
The orthonormal polynomials $Q_j(x, y)$ representing balanced aberrations are
products of the Legendre polynomials $L_l(x)$ and $L_m(y)$ in the x and y variables,
respectively, as in Table 13-1, where the x and y coordinates are normalized by the
half-widths $(a, b)$ of the rectangular pupil. They are inherently separable in the Cartesian
coordinates of a pupil point. For each polynomial $L_l(x)\,L_m(y)$, there is a corresponding
polynomial $L_m(x)\,L_l(y)$. If $l$ and $m$ are the degrees of the x- and y-Legendre polynomials,
then the degree or the order $n$ of the orthonormal polynomial obtained by their product is
$n = l + m$. There are $n + 1$ polynomials of a certain order $n$. The polynomials for a square
pupil can be obtained from the rectangular polynomials by letting $a = b$.
Products of Chebyshev polynomials (one for the x axis and the other for the y axis),
which are also orthogonal over a rectangular or a square pupil, have been suggested for
wavefront analysis, but they are not suitable for anamorphic systems, since they do not
represent balanced aberrations for such systems [5]. For a system with an axis of
rotational symmetry, as with spherical optics, the aberrations are not separable in
Cartesian coordinates, and the products of the x- and y-Legendre polynomials are not
suitable for expanding an aberration function for a rectangular pupil. The rectangular
polynomials for such systems are those obtained by orthogonalizing the Zernike circle
polynomials over a rectangular pupil, as discussed in Chapter 9.
The orthonormal polynomials $F_j(x, y)$ for an anamorphic system with a circular or
an elliptical exit pupil obtained by the Gram–Schmidt orthogonalization of the square or
the rectangular polynomials, respectively, are given in Table 13-2. There are
x-polynomials for defocus, balanced coma, and balanced spherical aberration, but no
corresponding y-polynomials. The $y^2$ term representing y-defocus appears in $F_6$, the $y^3$
term representing y-coma appears in $F_{10}$, and the $y^4$ term representing y-spherical
aberration appears in $F_{15}$. The only difference between the polynomials for a
circular and an elliptical exit pupil is the scaling of the x and y axes. We emphasize that
the Zernike circle polynomials, although orthogonal over a circular pupil, are not suitable
for anamorphic systems with such a pupil since they do not represent balanced
aberrations for these systems.
References
1. W. T. Welford, Aberrations of the Symmetrical Optical System (Adam Hilger,
New York, 1986).
2. J. C. Burfoot, “Third-order aberrations of ‘doubly symmetric’ systems,” Proc.
Phys. Soc. B 67, 523–528 (1954).
3. C. G. Wynne, “The primary aberrations of anamorphotic lens systems,” Proc.
Phys. Soc. B 67, 529–537 (1954).
4. V. N. Mahajan, “Orthonormal aberration polynomials for anamorphic optical
imaging systems with rectangular pupils,” Appl. Opt. 49, 6924–6929 (2010).
5. F. Liu, B. M. Robinson, and J. M. Geary, “Analyzing optics on rectangular
apertures using 2-D Chebyshev polynomials,” Opt. Eng. 50 (4), 043609 (April
2011).
6. V. N. Mahajan, “Orthonormal aberration polynomials for anamorphic optical
imaging systems with circular pupils,” Appl. Opt. 51, 4087–4091 (2012).
CHAPTER 14
NUMERICAL WAVEFRONT ANALYSIS
14.1 Introduction ........................................................................................................ 371
14.2 Zernike Coefficients from Wavefront Data ...................................................... 372
14.2.1 Theory .................................................................................................... 372
14.2.2 Numerical Examples .............................................................................. 373
14.3 Zernike Coefficients from Wavefront Slope Data ........................................... 383
14.3.1 Theory .................................................................................................... 383
Wavefront Slope Data ............................................................................ 388
14.3.3 Numerical Example ................................................................................ 393
14.4 Summary ............................................................................................................. 398
References ..................................................................................................................... 399
Chapter 14
Numerical Wavefront Analysis*
14.1 INTRODUCTION
While a wavefront represents a surface of constant phase and an aberration function
represents its deviation from a spherical surface, called the reference sphere, we will use
the two terms, wavefront and aberration function, synonymously. In the previous
chapters, we have emphasized that for wavefront analysis, i.e., to determine the content
of an aberration function, a set of polynomials that are orthonormal over the pupil of an
imaging system and represent balanced classical aberrations for the system must be used.
The utility of using the orthonormal polynomials, as opposed to the orthogonal, is that
each expansion coefficient is not only independent of the number of polynomials used in
the expansion, but also represents the standard deviation or the sigma value of the
corresponding polynomial aberration term (with the exception of piston). The variance of
the aberration function is then simply equal to the sum of the squares of the aberration
coefficients.
In this chapter, we consider how best to determine the orthonormal expansion or the
aberration coefficients from the wavefront data measured at an array of points, as, for
example, in a phase-measuring interferometer [1]. The problem of determining the
expansion coefficients when the measured data are the wavefront slopes, as, for example,
in a Shack–Hartmann sensor [2], is also discussed. Although we have considered optical
imaging systems with several different pupil shapes, our focus in this chapter is on a
system with a circular pupil. The analysis given here for such a pupil can be extended to
systems with other pupil shapes.
In practice, what is needed in both optical design and fabrication is the wavefront.
The wavefront aberrations determine the image quality in optical design. In fabrication
and testing of an optical surface, the wavefront errors determine surface errors, and thus
the polishing requirements to obtain the desired surface. Similarly, in adaptive optics, the
signal for the actuators of a deformable mirror to negate the aberrations, such as those
introduced by atmospheric turbulence, comes from the wavefront data. Hence, there is a
need to determine the Zernike coefficients from the wavefront data measured by a
wavefront sensor, or from the slope data provided by a slope sensor. In this chapter, we
present the two main mathematical approaches to determine the expansion coefficients:
an integration method for orthogonal solutions and the classic least squares approach.
We also illustrate the methods with some numerical examples for determining the
Zernike coefficients from the wavefront data or the wavefront slope data. The key points
considered are: how the number of data points affects the accuracy of the coefficients,
how the noise in the data affects this accuracy, and how many Zernike polynomials are
needed for adequate representation of the data.
* This chapter was contributed by Prof. Eva Acosta and Dr. Justo Arines of the Departamento de Física Aplicada, Universidad de Santiago de Compostela, Galicia, Spain.
14.2 ZERNIKE COEFFICIENTS FROM WAVEFRONT DATA
14.2.1 Theory
In Chapter 3, we discussed the orthonormal polynomials to represent the aberrations
of a system with a certain shape of its exit pupil. In Chapter 4, we considered the specific
case of a system with a circular pupil. Consider an aberration function W(x, y) of a system
expanded in terms of J Zernike circle polynomials in the form
$$W(x, y) = \sum_{j=1}^{J} a_j Z_j(x, y) . \tag{14-1}$$
The Zernike polynomials are orthonormal over a unit disc according to
$$\int_{x^2 + y^2 \le 1} Z_j(x, y)\, Z_{j'}(x, y)\, dx\, dy = \delta_{jj'} . \tag{14-2}$$
Because of the orthonormality of the Zernike polynomials, the expansion coefficients are
given by
$$a_j = \int Z_j(x, y)\, W(x, y)\, dx\, dy , \tag{14-3}$$
where the limits of integration in Eq. (14-3) and others are the same as in Eq. (14-2),
unless specified otherwise. Similarly, the variance of the aberration function is given by
$$\sigma^2 = \sum_{j=2}^{J} a_j^2 . \tag{14-4}$$
In practice, the measured data are available at a finite number of points, and the
integral in Eq. (14-3) reduces to a sum, thus causing some error in the value of the
integral. The accuracy of the integral can be improved by interpolating the data to yield
their values at a set of points and using them to perform numerical integration by
algorithms such as adaptive integration, Monte Carlo integration, or cubature formulas
among others [3]. In the least squares (LS) approach, we determine the expansion
coefficients by minimizing the difference between the measured wavefront and the
wavefront estimated with a certain number J of the Zernike polynomials by solving a
linear system of equations [4]
$$D_{N \times 1} = A_{N \times J}\, \hat{a}_{J \times 1} , \tag{14-5}$$
where D is the column vector of N data values, â is a column vector containing the J
expansion coefficients, and A is an N × J matrix representing the values of the Zernike
polynomials at the location of the data points according to
$$A_{N \times J} = \begin{pmatrix}
Z_1(x_1, y_1) & Z_2(x_1, y_1) & \cdots & Z_J(x_1, y_1) \\
Z_1(x_2, y_2) & Z_2(x_2, y_2) & \cdots & Z_J(x_2, y_2) \\
\vdots & \vdots & & \vdots \\
Z_1(x_N, y_N) & Z_2(x_N, y_N) & \cdots & Z_J(x_N, y_N)
\end{pmatrix}_{N \times J} . \tag{14-6}$$
In general, A is not a square matrix, because the number of data points is larger than
the number of polynomials representing the wavefront. A pseudoinverse matrix is used to
evaluate the Zernike coefficients [4]:
$$\hat{a}_{J \times 1} = \left( A^{T}_{J \times N} A_{N \times J} \right)^{-1} A^{T}_{J \times N} D_{N \times 1} . \tag{14-7}$$
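The least squares estimate of Eq. (14-7) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names (`noll_to_nm`, `zernike`, `solve`, `least_squares_fit`) are hypothetical, the polynomials follow the usual Noll ordering and normalization, the grid is a plain square array clipped to the unit disc, and the normal equations are solved by Gaussian elimination rather than by an explicit matrix inverse. The test wavefront is the Seidel function 1.0 Z6 − 1.5 Z8 + 2.0 Z11 used in the numerical examples.

```python
import math

def noll_to_nm(j):
    """Map a Noll index j (starting at 1) to the radial degree n and a signed
    azimuthal frequency m; m > 0 marks a cosine term, m < 0 a sine term."""
    n, rest = 0, j - 1
    while rest > n:
        n += 1
        rest -= n
    m = (n % 2) + 2 * ((rest + (n + 1) % 2) // 2)
    return n, (m if j % 2 == 0 else -m)

def zernike(j, x, y):
    """Noll-normalized Zernike circle polynomial Z_j at the point (x, y)."""
    n, m = noll_to_nm(j)
    k = abs(m)
    rho, theta = math.hypot(x, y), math.atan2(y, x)
    radial = sum((-1) ** s * math.factorial(n - s)
                 / (math.factorial(s) * math.factorial((n + k) // 2 - s)
                    * math.factorial((n - k) // 2 - s)) * rho ** (n - 2 * s)
                 for s in range((n - k) // 2 + 1))
    if m == 0:
        return math.sqrt(n + 1) * radial
    angular = math.cos(k * theta) if m > 0 else math.sin(k * theta)
    return math.sqrt(2 * (n + 1)) * radial * angular

def solve(M, b):
    """Solve M c = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(M)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    c = [0.0] * n
    for i in reversed(range(n)):
        c[i] = (aug[i][n] - sum(aug[i][k] * c[k] for k in range(i + 1, n))) / aug[i][i]
    return c

def least_squares_fit(points, data, J):
    """Build A (Eq. 14-6) and solve the normal equations (A^T A) a = A^T D."""
    A = [[zernike(j, x, y) for j in range(1, J + 1)] for x, y in points]
    AtA = [[sum(row[p] * row[q] for row in A) for q in range(J)] for p in range(J)]
    AtD = [sum(row[p] * d for row, d in zip(A, data)) for p in range(J)]
    return solve(AtA, AtD)

# 15 x 15 square grid, keeping only the nodes inside the unit disc (an assumed layout).
side = 15
grid = [-1 + 2 * i / (side - 1) for i in range(side)]
points = [(x, y) for x in grid for y in grid if x * x + y * y <= 1.0 + 1e-12]
data = [1.0 * zernike(6, x, y) - 1.5 * zernike(8, x, y) + 2.0 * zernike(11, x, y)
        for x, y in points]
a_hat = least_squares_fit(points, data, J=15)
print([round(a, 6) for a in a_hat])
```

With noise-free data and all the polynomials present in the wavefront included in the fit, the recovered coefficients agree with the input values to within rounding error.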
14.2.2 Numerical Examples

As a numerical example, we first consider a Seidel aberration function consisting of
astigmatism, coma, and spherical aberration with orthonormal coefficients or sigma
values of 1, –1.5, and 2 waves:
$$W(x, y) = 1.0\, Z_6(x, y) - 1.5\, Z_8(x, y) + 2.0\, Z_{11}(x, y) . \tag{14-8}$$
The P-V number of the aberration function is 14.5 waves, and its sigma value is 2.69 waves. Its
isometric and contour plots are shown in Figure 14-1. The contour spacing is one wave.
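The quoted sigma value follows directly from Eq. (14-4), since the variance is the sum of the squared orthonormal coefficients. A minimal check, assuming the three coefficients quoted for this example:

```python
import math

# Orthonormal coefficients of the Seidel example (Noll index -> value in waves).
coefficients = {6: 1.0, 8: -1.5, 11: 2.0}

# Eq. (14-4): the piston term j = 1 is excluded from the variance.
sigma = math.sqrt(sum(a * a for j, a in coefficients.items() if j != 1))
print(f"sigma = {sigma:.2f} waves")  # sigma = 2.69 waves
```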
We generate wavefront data at the nodes of three square arrays of 9 × 9, 15 × 15, and
21 × 21 points within a unit circle, as shown in Figure 14-2. To simulate measurement
noise, we added to each data point an uncorrelated random Gaussian noise of zero mean
with a standard deviation of 2%, 5%, and 10% of the absolute mean value of the
aberration function according to
$$\sigma_n = \frac{p}{100} \int \left| W(x, y) \right| dx\, dy , \tag{14-9}$$
where p is the percent noise, e.g., 2.
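The noise model can be sketched as follows. This is an illustration only: the function names are hypothetical, the integral of |W| in Eq. (14-9) is estimated by a simple midpoint rule on a fine square grid clipped to the unit disc, and the expression is used as written (without any further division by the pupil area). The wavefront is the Seidel example, written in Cartesian coordinates.

```python
import math
import random

def seidel_w(x, y):
    """Seidel example 1.0 Z6 - 1.5 Z8 + 2.0 Z11 in Cartesian pupil coordinates."""
    r2 = x * x + y * y
    z6 = math.sqrt(6) * (x * x - y * y)               # astigmatism
    z8 = math.sqrt(8) * (3 * r2 - 2) * x              # coma
    z11 = math.sqrt(5) * (6 * r2 * r2 - 6 * r2 + 1)   # spherical aberration
    return 1.0 * z6 - 1.5 * z8 + 2.0 * z11

def integral_abs_w(steps=300):
    """Midpoint-rule estimate of the integral of |W| over the unit disc."""
    h = 2.0 / steps
    total = 0.0
    for i in range(steps):
        x = -1.0 + (i + 0.5) * h
        for k in range(steps):
            y = -1.0 + (k + 0.5) * h
            if x * x + y * y <= 1.0:
                total += abs(seidel_w(x, y))
    return total * h * h

def add_noise(values, percent, rng):
    """Add zero-mean Gaussian noise whose standard deviation follows Eq. (14-9)."""
    sigma_n = percent / 100.0 * integral_abs_w()
    return [v + rng.gauss(0.0, sigma_n) for v in values], sigma_n

rng = random.Random(0)  # fixed seed so the sketch is repeatable
clean = [seidel_w(0.25 * i, 0.0) for i in range(-4, 5)]
noisy, sigma_n = add_noise(clean, percent=2, rng=rng)
print(sigma_n)
```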
Figure 14-1. (a) Isometric and (b) contour plots of a Seidel aberration function
represented by Eq. (14-8).
Figure 14-2. Square arrays of data points on a unit disc. (a) 9 × 9 array with 45 data
points, (b) 15 × 15 array with 145 data points, and (c) 21 × 21 array with 305 data
points.
To determine the coefficients by the integration method of Eq. (14-3), we
interpolate the data by bicubic splines based on SPLIN2, as explained in [5], to yield the
value of the integrand at the nodes of the Albrecht cubature formula that allows exact
evaluation of polynomial integrands up to degree 15 [6]. When determining the
coefficients by the least squares approach, the matrix inversion is performed by the
inv(M) function of Matlab [7]. The main advantage of the numerical integration method
is the independence of the calculated coefficients. In the least squares approach, there is
some cross-coupling of the coefficients if the number of coefficients estimated is smaller
than those present in the aberration function representing the data.
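The independence of the coefficients in the integration method can be illustrated with a small quadrature sketch. It is not the procedure described above: instead of spline interpolation and the Albrecht cubature, it evaluates the known Seidel function directly and integrates with Simpson's rule in the radial variable and a uniform rule in angle. It also assumes that the projection integral of Eq. (14-3) is divided by the pupil area π, the normalization under which the Noll-normalized polynomials tabulated below have unit norm. All function names are hypothetical.

```python
import math

def z(j, rho, theta):
    """The few Noll-normalized Zernike circle polynomials needed in this sketch."""
    if j == 4:
        return math.sqrt(3) * (2 * rho**2 - 1)
    if j == 6:
        return math.sqrt(6) * rho**2 * math.cos(2 * theta)
    if j == 8:
        return math.sqrt(8) * (3 * rho**3 - 2 * rho) * math.cos(theta)
    if j == 11:
        return math.sqrt(5) * (6 * rho**4 - 6 * rho**2 + 1)
    raise ValueError("polynomial not tabulated in this sketch")

def seidel_w(rho, theta):
    """Seidel example 1.0 Z6 - 1.5 Z8 + 2.0 Z11."""
    return 1.0 * z(6, rho, theta) - 1.5 * z(8, rho, theta) + 2.0 * z(11, rho, theta)

def disc_average(f, n_rho=200, n_theta=64):
    """(1/pi) times the integral of f over the unit disc."""
    h = 1.0 / n_rho                      # n_rho must be even for Simpson's rule
    d_theta = 2 * math.pi / n_theta
    total = 0.0
    for i in range(n_rho + 1):
        weight = 1 if i in (0, n_rho) else (4 if i % 2 else 2)
        rho = i * h
        ring = sum(f(rho, k * d_theta) for k in range(n_theta))
        total += weight * rho * ring     # the factor rho is the polar area element
    return total * (h / 3) * d_theta / math.pi

def coefficient(j, w):
    """Projection of w onto Z_j; each coefficient is computed on its own."""
    return disc_average(lambda rho, theta: z(j, rho, theta) * w(rho, theta))

a = {j: coefficient(j, seidel_w) for j in (4, 6, 8, 11)}
print({j: round(v, 4) for j, v in a.items()})
```

Each coefficient is obtained from its own integral, so adding or dropping other polynomials does not change it; the defocus term j = 4, which is absent from the wavefront, comes out as zero to within the quadrature error.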
The expansion coefficients obtained by the integration and the LS approaches for
various numbers of data values and different amounts of noise are compared in Figure
14-3. The Zernike polynomials up to and including the seventh order, or J = 36, are used
in determining their coefficients. The figure shows that the accuracy of the retrieved
coefficients in both approaches increases with the number of data points, and decreases
with an increase in the amount of noise. The quality of the wavefront fit, defined as the
root mean square difference between the values of the estimated and the actual aberration
functions Ŵ and W at the data points, i.e.,
$$Q = \frac{1}{N} \left\{ \sum_{i=1}^{N} \left[ \hat{W}(x_i, y_i) - W(x_i, y_i) \right]^2 \right\}^{1/2} , \tag{14-10}$$
is given in Table 14-1. For a small number of data points N (close to the number of
coefficients J) and a small amount of noise, the integration method yields a slightly better fit
than the LS method due to some coupling between the lower- and the higher-order modes
in the LS method. However, as the noise increases, the LS method yields a slightly better
fit. This is because the interpolation method used in the integration method worsens as
the noise increases. As the number of data points increases, the quality of the fit becomes
approximately the same for the two methods.
[Figure 14-3a: two panels, Integration Method and LS Method, 9 × 9 array; estimated coefficient â_j versus j = 1 to 36 for 0%, 2%, 5%, and 10% noise.]
Figure 14-3a. Estimated Zernike coefficients of a Seidel aberration function from
wavefront data on a 9 × 9 array with different amounts of noise.
[Figure 14-3b: two panels, Integration Method and LS Method, 15 × 15 array; estimated coefficient â_j versus j = 1 to 36 for 0%, 2%, 5%, and 10% noise.]
Figure 14-3b. Estimated Zernike coefficients of a Seidel aberration function from
wavefront data on a 15 × 15 array with different amounts of noise.
[Figure 14-3c: two panels, Integration Method and LS Method, 21 × 21 array; estimated coefficient â_j versus j = 1 to 36 for 0%, 2%, 5%, and 10% noise.]
Figure 14-3c. Estimated Zernike coefficients of a Seidel aberration function from
wavefront data on a 21 × 21 array with different amounts of noise.
Table 14-1. Wavefront fit quality factor Q for a Seidel aberration function.
         Integration Method              LS Method
σ_n      9 × 9    15 × 15   21 × 21      9 × 9    15 × 15   21 × 21
0%       0.0000   0.0000    0.0000       0.0099   0.0000    0.0000
2%       0.0065   0.0034    0.0025       0.0100   0.0023    0.0021
5%       0.0171   0.0088    0.0057       0.0100   0.0059    0.0047
10%      0.0299   0.0184    0.0118       0.0105   0.0113    0.0097
Next, we consider a wavefront representing the aberrations of a normal human eye.
The aberration function is defined by a series expansion of Zernike polynomials up to the
seventh degree with coefficients obtained from the statistical model proposed by Thibos
[8]. The values of the 36 orthonormal coefficients in this model are given in Table 14-2.
The P-V number of this aberration function is 12.3 waves and its sigma value is 2.34
waves. Its isometric and contour plots are illustrated in Figure 14-4. The contour spacing
is one wave. Its orthonormal Zernike coefficients determined in the same manner as in
the case of the Seidel aberration function for various data arrays and amounts of noise are
shown in Figure 14-5 and Table 14-3. As in the case of the Seidel aberration function, the
fit quality factor improves as the number of data points increases and the noise level
decreases.
Figure 14-4. (a) Isometric and (b) contour plots of the eye aberration function.
Table 14-2. Orthonormal Zernike aberration coefficients of a normal human eye.
j     a_j        j     a_j        j     a_j
1     0.0000     13    −0.2080    25    −0.0138
2     −0.8236    14    0.0217     26    −0.0024
3     −0.8680    15    −0.0165    27    −0.0031
4     −0.3168    16    0.0264     28    0.0151
5     −1.6749    17    −0.0283    29    0.0296
6     0.9368     18    0.0122     30    −0.0228
7     0.3711     19    −0.0155    31    0.0029
8     −0.2984    20    0.0344     32    −0.0059
9     −0.0107    21    −0.0263    33    −0.0078
10    −0.0679    22    0.0188     34    −0.0062
11    0.0121     23    −0.0020    35    −0.0037
12    0.0294     24    −0.0195    36    0.0151
Table 14-3. Wavefront fit quality factor Q for the human eye aberration function.
         Integration Method              LS Method
σ_n      9 × 9    15 × 15   21 × 21      9 × 9    15 × 15   21 × 21
0%       0.0000   0.0000    0.0000       0.0053   0.0000    0.0000
2%       0.0087   0.0040    0.0025       0.0053   0.0022    0.0017
5%       0.0172   0.0101    0.0057       0.0052   0.0022    0.0042
10%      0.0183   0.0194    0.0118       0.0072   0.0095    0.0081

[Figure 14-5a: two panels, Integration Method and LS Method, 9 × 9 array; estimated coefficient â_j versus j = 1 to 36 for 0%, 2%, 5%, and 10% noise.]
Figure 14-5a. Estimated Zernike coefficients of the eye aberration function from
wavefront data on a 9 × 9 array with different amounts of noise.
[Figure 14-5b: two panels, Integration Method and LS Method, 15 × 15 array; estimated coefficient â_j versus j = 1 to 36 for 0%, 2%, 5%, and 10% noise.]
Figure 14-5b. Estimated coefficients of the eye aberration function from wavefront
data on a 15 × 15 array with different amounts of noise.
[Figure 14-5c: two panels, Integration Method and LS Method, 21 × 21 array; estimated coefficient â_j versus j = 1 to 36 for 0%, 2%, 5%, and 10% noise.]
Figure 14-5c. Estimated Zernike coefficients of the eye aberration function from
wavefront data on a 21 × 21 array with different amounts of noise.
14.3 ZERNIKE COEFFICIENTS FROM WAVEFRONT SLOPE DATA

14.3.1 Theory
A Shack–Hartmann sensor [2] consists of a lenslet array placed in the pupil of a
system or its conjugate and a detector array placed in its focal plane. Each lenslet samples
a small portion of the beam so that the wavefront across it can be assumed to be planar. If
the beam is collimated, i.e., if its wavefront is planar and, therefore, aberration free, each
lenslet focuses the light across it on a corresponding pixel of the detector array. However,
if the wavefront is not planar, i.e., if it is aberrated, the image formed by a lenslet is
displaced. As the aberrations change, the wavefront portion across a lenslet is tilted
slightly, thus displacing the image spot formed by it on the detector array by some small
amount $(\Delta x, \Delta y)$. Each displacement represents the average slope of the wavefront across
a corresponding lenslet and is assigned as the slope at the center of the lenslet. The
relationship between the displacements and the wavefront slope is the same as for the
transverse ray aberrations of the system, i.e.,
$$(\Delta x, \Delta y) = f \left[ \frac{\partial W(x_l, y_l)}{\partial x}, \frac{\partial W(x_l, y_l)}{\partial y} \right] , \tag{14-11}$$
where $(x_l, y_l)$ is the center of a lenslet, and f is its focal length. Because of the spatial
derivative relationship between them, such a sensor is called a wavefront slope sensor.
We assume that the wavefront sensor provides accurate measurements of the local slopes
of the wavefront under test, affected only by the measurement noise and the setup
constraints [9].
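As a small illustration of Eq. (14-11), the slopes assigned to a lenslet center follow from the measured spot displacement divided by the lenslet focal length. The function name and the numbers below are hypothetical, and the sign convention (positive displacement for positive slope) is an assumption of this sketch.

```python
def slopes_from_spot_shift(dx, dy, focal_length):
    """Invert Eq. (14-11): wavefront slopes at the lenslet center obtained from
    the spot displacement (dx, dy); all lengths are in the same units."""
    return dx / focal_length, dy / focal_length

# Hypothetical values: a spot shifted by (2.5, -1.0) micrometers behind a
# lenslet of 5 mm focal length, all expressed in meters.
slope_x, slope_y = slopes_from_spot_shift(2.5e-6, -1.0e-6, 5.0e-3)
print(slope_x, slope_y)
```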
The Zernike coefficients of a wavefront can be obtained from the gradient $\nabla W(x, y)$
of the aberration function by integrating it and then using Eq. (14-3). This two-step
process can be avoided if a set of vector polynomials $\vec{V}_j(x, y)$ can be found such that their
inner product with the gradient function $\nabla W(x, y)$ yields the Zernike coefficients, i.e., if
$$a_j = \int \nabla W(x, y) \cdot \vec{V}_j(x, y)\, dx\, dy . \tag{14-12}$$
This problem has been studied and solved by several authors [8-12]. They
arrived at different solutions for the vector functions; in other words, the set of vector
polynomials is not unique [13]. A straightforward and intuitive way to find a set of vector
polynomials [12] is to apply the divergence theorem [14] to the scalar function $W(x, y)$
and a vector field $\vec{V}_j(x, y)$ on a unit circular pupil with a contour C:
$$\int \nabla W(x, y) \cdot \vec{V}_j(x, y)\, dx\, dy = \oint_C W(x, y)\, \vec{V}_j(x, y) \cdot d\vec{l} - \int W(x, y)\, \nabla \cdot \vec{V}_j(x, y)\, dx\, dy , \tag{14-13}$$
where $d\vec{l}$ is the differential contour element pointing out of the circumference of the unit
pupil. Thus, if there exists a vector function $\vec{V}_j(x, y)$ such that
$$\nabla \cdot \vec{V}_j(x, y) = -Z_j(x, y) \tag{14-14}$$
and
$$\vec{V}_j(\rho = 1) \cdot d\vec{l} = 0 , \tag{14-15}$$
then using Eq. (14-3), Eq. (14-13) yields Eq. (14-12), which, in turn, can be used to
obtain the Zernike coefficients.
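The two conditions can be checked numerically for a single case. The sketch below uses a candidate vector polynomial for j = 2 whose signs are an assumption of this example, chosen so that its divergence equals −Z2 and its normal component vanishes on the rim; it then verifies that the inner product with the gradient of a test wavefront returns the x-tilt coefficient. The test coefficients are hypothetical, and the integral is divided by the pupil area π, which is the normalization under which the Noll-normalized polynomials have unit norm.

```python
import math

def v2(x, y):
    """Candidate vector polynomial for j = 2 (signs assumed, see the lead-in)."""
    return (3 - 3 * x * x - y * y) / 4.0, -x * y / 2.0

def z2(x, y):
    return 2.0 * x

def divergence(v, x, y, h=1e-5):
    """Central-difference divergence of the vector field v at (x, y)."""
    return ((v(x + h, y)[0] - v(x - h, y)[0])
            + (v(x, y + h)[1] - v(x, y - h)[1])) / (2 * h)

def grad_w(x, y):
    """Gradient of the test wavefront W = 0.7 Z2 + 0.3 Z4 - 0.2 Z6."""
    s3, s6 = math.sqrt(3), math.sqrt(6)
    return (0.7 * 2 + 0.3 * 4 * s3 * x - 0.2 * 2 * s6 * x,
            0.3 * 4 * s3 * y + 0.2 * 2 * s6 * y)

def disc_average(f, n_rho=200, n_theta=64):
    """(1/pi) times the integral of f(x, y) over the unit disc (Simpson in rho)."""
    h, d_theta, total = 1.0 / n_rho, 2 * math.pi / n_theta, 0.0
    for i in range(n_rho + 1):
        weight = 1 if i in (0, n_rho) else (4 if i % 2 else 2)
        rho = i * h
        ring = sum(f(rho * math.cos(k * d_theta), rho * math.sin(k * d_theta))
                   for k in range(n_theta))
        total += weight * rho * ring
    return total * (h / 3) * d_theta / math.pi

def inner_product(x, y):
    gx, gy = grad_w(x, y)
    vx, vy = v2(x, y)
    return gx * vx + gy * vy

a2 = disc_average(inner_product)
print(round(a2, 6))
```

The defocus and astigmatism terms of the test wavefront do not leak into the result, which is what the orthonormality of the underlying Zernike polynomials requires.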
The vector polynomials $\vec{G}_j(x, y)$ proposed by Gavrielides [10] require a more
restrictive solution for $\vec{V}_j(x, y)$, namely, that it be irrotational [15] and therefore
derivable as the gradient of a scalar function $U_j(x, y)$. Thus, in order to find these
polynomials, we must solve the Poisson equation
$$\nabla^2 U_j(x, y) = -Z_j(x, y) \tag{14-16}$$
with the boundary condition
$$\nabla U_j(\rho = 1) \cdot d\vec{l} = 0 , \tag{14-17}$$
and $\vec{G}_j(x, y)$ can be straightforwardly evaluated as $\vec{G}_j(x, y) = \nabla U_j(x, y)$.
Using one set of vector functions or another is important because the slope data are
inevitably afflicted with noise. Let $\vec{n}(x, y)$ represent the noise vector associated with the
measured slope data, so that the measured slopes are given by $\nabla W(x, y) + \vec{n}(x, y)$. (The
noise sources in a Shack–Hartmann sensor have been described in detail by Neal,
Copland, and Neal [9].) Equation (14-12) is thus modified and the coefficients we
calculate in practice are given by
$$\tilde{a}_j = \int \left[ \nabla W(x, y) + \vec{n}(x, y) \right] \cdot \vec{V}_j(x, y)\, dx\, dy . \tag{14-18}$$
The error variance associated with an estimated coefficient is accordingly given by
$$\sigma_j^2 = \left\langle (\tilde{a}_j - a_j)^2 \right\rangle = \left\langle \int \vec{n}(x, y) \cdot \vec{V}_j(x, y)\, dx\, dy \int \vec{n}(x', y') \cdot \vec{V}_j(x', y')\, dx'\, dy' \right\rangle . \tag{14-19}$$
Assuming uncorrelated random Gaussian noise with zero mean and covariance
$\langle \vec{n}(x, y)\, \vec{n}(x', y') \rangle = \langle n^2 \rangle\, \delta(x - x', y - y')$, the variance associated with the estimated coefficient
is given by [13]
$$\sigma_j^2 = \iint \left\langle \vec{n}(x, y)\, \vec{n}(x', y') \right\rangle \vec{V}_j(x, y) \cdot \vec{V}_j(x', y')\, dx\, dy\, dx'\, dy' = \langle n^2 \rangle \int \left| \vec{V}_j(x, y) \right|^2 dx\, dy , \tag{14-20}$$
and hence, the vector polynomials $\vec{V}_j(x, y)$ for which $\int |\vec{V}_j(x, y)|^2\, dx\, dy$ is minimum will
propagate less noise to the expansion coefficients. Solomon et al. [13,16] showed that the
vector functions given by Gavrielides obey this property, and therefore we will use them
for the numerical simulations in the next section. These polynomials up to the eighth
degree are listed in Table 14-4.
Table 14-4. Vector polynomials [ G jx (U, T) , G jy (U, T) ] for direct determination of

Zernike coefficients.
j G jx (U, T) G jy (U, T)
1 0 0
2 (1 / 4)(U 2 cos 2 T 2U 2 3) (1 / 4)U 2 sin 2 T
3 (1 / 4)U 2 sin 2 T (1 / 4)( U 2 cos 2 T 2U 2 3)
4 3 / 4 (U 3 U ) cos T 3 / 4(U3 U) sin T
5 (1 / 24)[U3 sin 3T (3U3 4U)sin T] (1 / 24 )[ U 3 cos 3T (3U 3 4U ) cos T ]
6 (1 / 24 )[U 3 cos 3T (3U 3 4U ) cos T] (1 / 24 )[U 3 sin 3T (3U 3 4U ) sin T]
7 (1 / 8 )(U 4 U 2 ) sin 2 T (1 / 8 )(U 2 1)[ 2U 2 cos 2T (3U 3 1)]
8 (1/ 8) (U2 1)[2U2 cos2T (3U3 1)] (1 / 8)(U 4 U 2 ) sin 2 T
9 (1 / 32 )[U 4 sin 4T ( 4U 4 5U 2 ) sin 2T] (1 / 32 )[ U 4 cos 4T ( 4U 4 5U 2 ) cos 2T]
10 (1 / 32 )[U 4 cos 4T ( 4U 4 5U 2 ) cos 2T] (1 / 32)[U4 sin 4T (4U4 5U2 )sin 2T]
11 5 / 4 (U 2 1)(2U 3 U ) cos T 5 / 4 (U 2 1)(2U 3 U ) sin T
12 5 / 8(U2 1)[U3 cos 3T (2U3 U) cos T] 5 / 8 (U 2 1)[U3 sin 3T (2U3 U) sin T]
13 5 / 8(U 2 1)[U 3 sin 3T (2U 3 U ) sin T] 5 / 8 (U 2 1)[ U3 cos 3T ( 2U3 U) cos T]
14 (1/ 40)[U5 cos5T (5U5 6U3 ) cos T] (1 / 40 )[U 5 sin 5T (5U 5 6U 3 ) sin T]
15 (1 / 40)[U5 sin 5T (5U5 6U3 ) sin T] (1/ 40)[U5 cos5T (5U5 6U3 ) cosT]
(1/ 48)(U2 1)[3(5U4 3U2 )cos2T

16 (1/ 48)(U2 1)(5U4 3U2 )sin2T
2(10U4 8U2 1)]
(1/ 48)(U2 1)[3(5U4 3U2 )cos2T

17 (1 / 48)(U2 1)(5U4 3U2 )sin 2T
2(10U4 8U2 1)]
3/16(U2 1)[2U4 cos4T 3/16(U2 1)[2U4 sin 4T

18
(5U4 3U2 )cos2T] (5U4 3U2 )sin 2T]

3/16(U2 1)[2U4 sin 4T 3 /16(U2 1)[2U4 cos 4T

19
(5U4 3U2 )sin 2T] (5U4 3U2 ) cos 2T]
20 (1/ 48)[U6 cos6T (6U6 7U4 )cos4T] (1/ 48) [U6 sin 6T (6U6 7U4 ) sin 4T]
21 (1/ 48)[U6 sin6T (6U6 7U4 )sin 4T] (1/ 48)[U6 cos6T (6U6 7U4 )cos4T]
22 49 / 4(U 2 1)(5U5 5U3 U) cos T 49 / 4 (U 2 1)(5U 5 5U 3 U ) sin T
7 / 8(U2 1)[(3U5 2U3 )sin3T 7 / 8(U2 1)[(3U5 2U3 )cos3T

23
(5U5 5U3 U)sin T] (5U5 5U3 U)cos T]
7 / 8(U2 1)[(3U5 2U3 )cos3T 7 / 8(U2 1)[(3U5 2U3 )sin3T

24
(5U5 5U3 U)cos T] (5U5 5U3 U)sin T]
7 / 8(U2 1)[U5 sin5T 7 / 8(U2 1)[U5 cos5T

25
(3U5 2U3 )sin3T] (3U5 2U3 )cos3T]
7 / 8(U2 1)[U5 cos5T 7 / 8(U2 1)[U5 sin5T

26
(3U5 2U3 ) cos 3T] (3U5 2U3 )sin3T]
27 (1/ 56)[U7 sin 7T (7U7 8U5 )sin 5T] (1 / 56)[ U 7 cos 7T (7U7 8U5 ) cos 5T]
28 (1/ 56)[U7 cos7T (7U7 8U5 )cos5T] (1 / 56)[U7 sin 7T (7U7 8U5 ) sin 5T]
(1 / 4)(U2 1)[4(7U6 8U4 2U2 )cos2T

29 (U2 1)(7U6 8U4 2U2 )sin 2T
(35U6 45U4 15U2 1)]
(1/ 4)(U2 1)[4(7U6 8U4 2U2 )cos2T

30 (U 2 1)(7U 6 8U 4 2U 2 ) sin 2 T
(35U6 45U4 15U2 1)]
(1/ 2)(U2 1)[(7U6 5U4 )sin 4T (1/ 2)(U2 1)[(7U6 5U4 )cos4T
31
2(7U6 8U4 2U2 )sin 2T] 2(7U6 8U4 2U2 )cos2T]
(1/ 2)(U2 1)[(7U6 5U4 )cos4T (1/ 2)(U2 1)[(7U6 5U4 )sin 4T
32
2(7U6 8U4 2U2 )cos2T] 2(7U6 8U4 2U2 )sin 2T]
(1/ 2)(U2 1)[2U6 sin6T (1/ 2)(U2 1)[2U6 cos6T

33 6 4
(7U 5U )sin 4T] (7U6 5U4 )cos 4T]
(1/ 2)(U2 1)[2U6 cos6T (1/ 2)(U2 1)[2U6 sin 6T

34
(7U6 5U4 )cos4T] (7U6 5U4 )sin 4T]
35 (1/ 8)[U8 sin8T (8U8 9U6 )sin 6T] (1 / 8)[ U8 sin 8T (8U8 9U 6 ) sin 6T]
36 (1 / 8)[U8 sin 8T (8U8 9U 6 ) sin 6T] (1 / 8)[U8 sin 8T (8U8 9U6 ) sin 6T]
37 (3 / 2)(U 2 1)(2U2 1)(7U5 7U3 U)cos T (3 / 2)(U 2 1)(2U 2 1)(7U5 7U 3 U)sin T
(1/ 8)(U2 1)[(28U7 35U5 10U3 )cos3T (1 / 8)(U 2 1)[(28U 7 35U 5 10U 3 ) sin 3T
38
3(14U7 21U5 9U3 U)cos T] 3(14U 7 21U 5 9U 3 U )sin T]
(1 / 8)(U 2 1)[(28U7 35U5 10U3 )sin 3T (1/ 8)(U2 1)[(28U7 35U5 10U3 )cos3T
39
3(14U7 21U5 9U3 U)sin T] 3(14U7 21U5 9U3 U)cos T]
(1/ 8)(U2 1)[3(4U7 3U5 )cos5T (1/ 8)(U2 1)[3(4U7 3U5 )sin5T
40
(28U7 35U5 10U3 )cos3T] (28U7 35U5 10U3 )sin3T]
(1 / 8)(U2 1)[3(4U7 3U5 )sin 5T (1/ 8)(U2 1)[3(4U7 3U5 )cos5T

41 7 5 3
(28U 35U 10U )sin 3T] (28U7 35U5 10U3 )cos3T]
(1/ 8)(U2 1)[3U7 cos7T (1/ 8)(U2 1)[3U7 sin 7T

42
3(4U7 3U5 )cos5T] 3(4U7 3U5 )sin5T]
(1/ 8)(U2 1)[3U7 sin7T (1 / 8)(U2 1)[3U7 cos7T

43
3(4U7 3U5 )sin5T] 3(4U7 3U5 )cos5T]
44 (1 / 72 )[U 9 cos 9T (9U 9 10U7 ) cos 7T] (1/ 72)[U9 sin 9T (9U9 10U7 )sin 7T]
45 (1/ 72)[U9 sin 9T (9U9 10U7 )sin 7T] (1/ 72)[U9 cos9T (9U9 10U7 )cos7T]

14.3.2 Alternative Approach for Obtaining Zernike Coefficients from Wavefront Slope Data
Whereas the Zernike circle polynomials orthogonal over a circular pupil represent
the balanced classical wave aberrations that minimize their variance, the polynomials
representing the balanced aberrations that minimize the variance of the transverse ray
aberrations have been given by Lukosz [17] and by Braat [18]. The variance of the
transverse ray aberrations is given by
$$\sigma_s^2 = \int \left[ \nabla W(x, y) - \left\langle \nabla W(x, y) \right\rangle \right]^2 dx\, dy , \tag{14-21}$$
where $\sigma_s$ is the standard deviation or the spot sigma. For large aberrations, minimizing $\sigma_s^2$
is a useful criterion for obtaining a good MTF (modulation transfer function) at low
spatial frequencies. Thus, if we expand the aberration function $W(x, y)$ in terms of a set of
polynomials $B_j(x, y)$ in the form
$$W(x, y) = \sum_j b_j B_j(x, y) , \tag{14-22}$$
the transverse ray aberrations are given by
$$\nabla W(x, y) = \sum_j b_j \nabla B_j(x, y) , \tag{14-23}$$
where the polynomial gradients are orthonormal to each other over a unit disc, i.e.,
$$\int \nabla B_j(x, y) \cdot \nabla B_{j'}(x, y)\, dx\, dy = \delta_{jj'} . \tag{14-24}$$
As a result,
$$b_j = \int \nabla W(x, y) \cdot \nabla B_j(x, y)\, dx\, dy , \tag{14-25}$$
and
$$\sigma_s^2 = \sum_{j=2} b_j^2 . \tag{14-26}$$
The polynomials $\{B_j(x, y)\}$ form a complete set, just like the Zernike circle
polynomials, but they are not orthogonal to each other over a unit disc. For all j with
n = m, they are given in terms of the Zernike polynomials by
$$B_j(x, y) = \frac{1}{\sqrt{2n(n+1)}}\, Z_j(x, y) , \tag{14-27}$$
and for all j with n ≠ m by a suitable linear combination of two Zernike polynomials with
the same azimuthal frequency:
$$B_j(x, y) = \frac{1}{\sqrt{4n(n+1)}} \left[ Z_j(x, y) - \sqrt{\frac{n+1}{n-1}}\, Z_{j'(n' = n-2,\, m' = m)}(x, y) \right] . \tag{14-28}$$
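Zhao and Burge [19,20] have constructed a set of vector polynomials $\{\vec{S}_j(x, y)\}$ by
the Gram–Schmidt orthonormalization of the gradients of the Zernike circle polynomials
that are in fact the gradients of the $B_j(x, y)$ polynomials, i.e., $\vec{S}_j(x, y) = \nabla B_j(x, y)$. The first
45 polynomials of the two sets in polar coordinates are given in Tables 14-5 and 14-6.
Once the $b_j$ coefficients are known from Eq. (14-25), the Zernike coefficients $a_j$ can
be obtained according to [19]:
$$a_j = \begin{cases}
\dfrac{b_{j(n,m)}}{\sqrt{2n(n+1)}} - \dfrac{b_{j'(n+2,m)}}{\sqrt{4(n+1)(n+2)}} , & n = m , \\[2ex]
\dfrac{b_{j(n,m)}}{\sqrt{4n(n+1)}} - \dfrac{b_{j'(n+2,m)}}{\sqrt{4(n+1)(n+2)}} , & n \neq m ,
\end{cases} \tag{14-29}$$
where $b_{j'(n+2,m)}$ is taken as zero when the order n + 2 exceeds the highest order retained in the expansion.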
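The conversion from the $b_j$ to the Zernike coefficients can be checked numerically. The sketch below is an illustration under stated assumptions: modes are labeled by (n, m) with a signed m (m > 0 for cosine terms, m < 0 for sine terms) rather than by the index j, the B polynomials are built from Zernike polynomials as in Eqs. (14-27) and (14-28), piston is handled as the special case B = Z = 1, and each Zernike coefficient is assembled from the b coefficient of the same mode and the b coefficient of the mode two orders higher with the same azimuthal frequency. The b values are hypothetical. The check is that both expansions give the same wavefront at arbitrary pupil points.

```python
import math

def zernike(n, m, rho, theta):
    """Orthonormal Zernike circle polynomial; m > 0 cosine term, m < 0 sine term."""
    k = abs(m)
    radial = sum((-1) ** s * math.factorial(n - s)
                 / (math.factorial(s) * math.factorial((n + k) // 2 - s)
                    * math.factorial((n - k) // 2 - s)) * rho ** (n - 2 * s)
                 for s in range((n - k) // 2 + 1))
    if m == 0:
        return math.sqrt(n + 1) * radial
    angular = math.cos(k * theta) if m > 0 else math.sin(k * theta)
    return math.sqrt(2 * (n + 1)) * radial * angular

def b_poly(n, m, rho, theta):
    """B polynomial from Eqs. (14-27) and (14-28); piston is taken as B = 1."""
    if n == 0:
        return 1.0
    if n == abs(m):
        return zernike(n, m, rho, theta) / math.sqrt(2 * n * (n + 1))
    lower = math.sqrt((n + 1) / (n - 1)) * zernike(n - 2, m, rho, theta)
    return (zernike(n, m, rho, theta) - lower) / math.sqrt(4 * n * (n + 1))

def zernike_from_b(b):
    """Convert {(n, m): b} into {(n, m): a}; absent b terms count as zero."""
    modes = set(b) | {(n - 2, m) for (n, m) in b if n > abs(m)}
    a = {}
    for n, m in modes:
        own = b.get((n, m), 0.0)
        if n == 0:
            first = own                                   # piston special case
        elif n == abs(m):
            first = own / math.sqrt(2 * n * (n + 1))
        else:
            first = own / math.sqrt(4 * n * (n + 1))
        a[(n, m)] = first - b.get((n + 2, m), 0.0) / math.sqrt(4 * (n + 1) * (n + 2))
    return a

# Hypothetical b coefficients for modes up to the fourth order.
b = {(1, 1): 0.5, (1, -1): -0.2, (2, 0): 0.8, (2, 2): 0.1, (2, -2): 0.3,
     (3, 1): -0.4, (3, -3): 0.25, (4, 0): 0.6, (4, 2): -0.15, (4, -4): 0.05}
a = zernike_from_b(b)

def w_from_b(rho, theta):
    return sum(c * b_poly(n, m, rho, theta) for (n, m), c in b.items())

def w_from_a(rho, theta):
    return sum(c * zernike(n, m, rho, theta) for (n, m), c in a.items())

print(abs(w_from_b(0.6, 1.1) - w_from_a(0.6, 1.1)))
```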
Table 14-5. Orthonormal wave aberration polynomials $B_j(\rho, \theta)$ for minimum
standard deviation of transverse ray aberrations.
j    n   m   $B_j(\rho, \theta)$
1    0   0   $1$
2    1   1   $\rho \cos\theta$
3    1   1   $\rho \sin\theta$
4    2   0   $(1/\sqrt{2})(\rho^2 - 1)$
5    2   2   $(1/\sqrt{2})\rho^2 \sin 2\theta$
6    2   2   $(1/\sqrt{2})\rho^2 \cos 2\theta$
7    3   1   $\sqrt{3/2}\,(\rho^3 - \rho) \sin\theta$
8    3   1   $\sqrt{3/2}\,(\rho^3 - \rho) \cos\theta$
9    3   3   $(1/\sqrt{3})\rho^3 \sin 3\theta$
10   3   3   $(1/\sqrt{3})\rho^3 \cos 3\theta$
11   4   0   $(1/2)(3\rho^4 - 4\rho^2 + 1)$
12   4   2   $\sqrt{2}\,(\rho^4 - \rho^2) \cos 2\theta$
13   4   2   $\sqrt{2}\,(\rho^4 - \rho^2) \sin 2\theta$
14   4   4   $(1/2)\rho^4 \cos 4\theta$
15   4   4   $(1/2)\rho^4 \sin 4\theta$
16   5   1   $\sqrt{5/2}\,(2\rho^5 - 3\rho^3 + \rho) \cos\theta$
17   5   1   $\sqrt{5/2}\,(2\rho^5 - 3\rho^3 + \rho) \sin\theta$
18   5   3   $\sqrt{5/2}\,(\rho^5 - \rho^3) \cos 3\theta$
19   5   3   $\sqrt{5/2}\,(\rho^5 - \rho^3) \sin 3\theta$
20   5   5   $(1/\sqrt{5})\rho^5 \cos 5\theta$
21   5   5   $(1/\sqrt{5})\rho^5 \sin 5\theta$
22   6   0   $(1/\sqrt{24})(20\rho^6 - 36\rho^4 + 18\rho^2 - 2)$
23   6   2   $\sqrt{3/4}\,(5\rho^6 - 8\rho^4 + 3\rho^2) \sin 2\theta$
24   6   2   $\sqrt{3/4}\,(5\rho^6 - 8\rho^4 + 3\rho^2) \cos 2\theta$
25   6   4   $\sqrt{3}\,(\rho^6 - \rho^4) \sin 4\theta$
26   6   4   $\sqrt{3}\,(\rho^6 - \rho^4) \cos 4\theta$
27   6   6   $(1/\sqrt{6})\rho^6 \sin 6\theta$
28   6   6   $(1/\sqrt{6})\rho^6 \cos 6\theta$
29   7   1   $\sqrt{7/2}\,(5\rho^7 - 10\rho^5 + 6\rho^3 - \rho) \sin\theta$
30   7   1   $\sqrt{7/2}\,(5\rho^7 - 10\rho^5 + 6\rho^3 - \rho) \cos\theta$
31   7   3   $\sqrt{7/2}\,(3\rho^7 - 5\rho^5 + 2\rho^3) \sin 3\theta$
32   7   3   $\sqrt{7/2}\,(3\rho^7 - 5\rho^5 + 2\rho^3) \cos 3\theta$
33   7   5   $\sqrt{7/2}\,(\rho^7 - \rho^5) \sin 5\theta$
34   7   5   $\sqrt{7/2}\,(\rho^7 - \rho^5) \cos 5\theta$
35   7   7   $(1/\sqrt{7})\rho^7 \sin 7\theta$
36   7   7   $(1/\sqrt{7})\rho^7 \cos 7\theta$
37   8   0   $(1/\sqrt{8})(35\rho^8 - 80\rho^6 + 60\rho^4 - 16\rho^2 + 1)$
38   8   2   $2(7\rho^8 - 15\rho^6 + 10\rho^4 - 2\rho^2) \cos 2\theta$
39   8   2   $2(7\rho^8 - 15\rho^6 + 10\rho^4 - 2\rho^2) \sin 2\theta$
40   8   4   $(7\rho^8 - 12\rho^6 + 5\rho^4) \cos 4\theta$
41   8   4   $(7\rho^8 - 12\rho^6 + 5\rho^4) \sin 4\theta$
42   8   6   $2(\rho^8 - \rho^6) \cos 6\theta$
43   8   6   $2(\rho^8 - \rho^6) \sin 6\theta$
44   8   8   $(1/\sqrt{8})\rho^8 \cos 8\theta$
45   8   8   $(1/\sqrt{8})\rho^8 \sin 8\theta$
Table 14-6. Orthonormal transverse ray aberration vector polynomials [ S jx (U, T) ,

S jy (U, T) ].
j S jx (U, T) S jy (U, T)
1 0 0
2 1 0
3 0 1
4 2U cos T 2U sin T
5 2U sin T 2U cos T
6 2U cos T 2U sin T
7
2
3 / 2U sin 2T 3 / 2 ( U 2 cos 2 T 2U 2 1)
8 3 / 2(U 2 cos 2T 2U 2 1) 3 / 2U 2 sin 2 T

9 3U2 sin 2T 3U 2 cos 2 T
10 3U 2 cos 2 T 3U 2 sin 2T
11 2(3U 3 2U ) cos T 2(3U 3 2U ) sin T
12 2[(3U 3 2U) cos T U 3 cos 3T] 2[ (3U 3 2U ) sin T U 3 sin 3T]
13 2[(3U 3 2U ) sin T U 3 sin 3T] 2[(3U3 2U)cos T U3 cos3T]

14 2U 3 cos 3T 2U3 sin 3T
15 2U 3 sin 3T 2U 3 cos 3T
5 / 2[(4U4 4U2 ) cos 2T
16 5 / 2(4U 4 3U 2 ) sin 2T
(6U4 6U2 1)]
5 / 2[(4U4 4U2 ) cos 2T
17 5 / 2(4U 4 3U2 ) sin 2T
(6U4 6U2 1)]
18 5 / 2[U4 cos 4T (4U 4 3U2 ) cos 2T] 5 / 2[U4 sin 4T (4U4 3U2 )sin 2T]
19 5 / 2[U4 sin 4T (4U4 3U2 )sin 2T] 5 / 2[ U4 cos 4T (4U4 3U 2 ) cos 2T]
20 5U 4 cos 4 T 5U 4 sin 4T
21 5U4 sin 4T 5U 4 cos 4 T
22 6(10U 5 12U 3 3U ) cos T 6(10U 5 12U 3 3U ) sin T
3[(5U5 4U3 )sin 3T 3[(5U5 4U3 )cos3T
23
(10U5 12U3 3U)sin T] (10U5 12U3 3U)cos T]
3[(5U5 4U3 ) cos3T 3[(5U5 4U3 )sin 3T
24
(10U5 12U3 3U) cos T] (10U5 12U3 3U)sin T]
25 3[U 5 sin 5T (5U 5 4U 3 ) sin 3T] 3[ U5 cos 5T (5U5 4U3 ) cos 3T]
26 3[U5 cos 5T (5U5 4U3 ) cos 3T] 3[U5 cos 5T (5U 5 4U 3 ) cos 3T]
27 6U5 sin 5T 6U 5 cos 5T
28 6U 5 cos 5T 6U 5 sin 5T
7 / 2[(15U6 20U4 6U2 )cos2T

29 7 / 2 (15U 6 20U 4 6U 2 ) sin 2 T
(20U6 30U4 12U2 1)]
7 / 2[(15U6 20U4 6U2 )cos 2T

30 7 / 2 (15U 6 20U 4 6U 2 ) sin 2 T
(20U6 30U4 12U2 1)]
7 / 2[(6U6 5U4 )sin 4T 7 / 2[(6U6 5U4 )cos4T

31
(15U6 20U4 6U2 )sin 2T] (15U6 20U4 6U2 )cos2T]
7 / 2[(6U6 5U4 )cos4T 7 / 2[(6U6 5U4 )sin4T

32
(15U6 20U4 6U2 )cos2T] (15U6 20U4 6U2 )sin2T]
33 7 / 2[U6 sin 6T (6U6 5U4 )sin 4T] 7 / 2[U6 cos 6T (6U6 5U4 ) cos 4T]
34 7 / 2[U6 cos6T (6U6 5U4 )cos4T] 7 / 2[U6 sin 6T (6U6 5U4 ) sin 4T]
35 7U 6 sin 6T 7U 6 cos 6 T
36 7U 6 cos 6 T 7U 6 sin 6 T
37 8(35U 7 60U 5 30U 3 4U ) cos T 8(35U 7 60U 5 30U 3 4U ) sin T
2(21U7 30U5 10U3 )cos3T 2(21U7 30U5 10U3 )sin 3T

38
2(35U7 60U5 30U3 4U)cos T 2(35U7 60U5 30U3 4U)sin T
2(21U7 30U5 10U3 )sin 3T 2(21U7 30U5 10U3 )cos3T

39
2(35U7 60U5 30U3 4U)sin T 2(35U7 60U5 30U3 4U)cos T
2(7U7 6U5 )cos5T 2(7U7 6U5 )sin5T

40
2(21U7 30U5 10U3 )cos3T 2(21U7 30U5 10U3 )sin 3T
2(7U7 6U5 )sin5T 2(7U7 6U5 )cos5T

41
2(21U7 30U5 10U3 )sin 3T 2(21U7 30U5 10U3 )cos3T
42 2 U 7 cos 7 T 2(7 U 7 6 U 5 ) cos 5 T 2U 7 sin 7 T 2(7U 7 6U 5 ) sin 5T
43 2U 7 sin 7 T 2(7U 7 6U 5 ) sin 5T 2U 7 cos 7 T 2(7 U 7 6U 5 ) cos 5T
44 8U7 cos 7T 8U7 sin 7T
45 8U 7 sin 7T 8U7 cos 7T


14.3.3 Numerical Example
As a numerical example of obtaining the Zernike coefficients from the wavefront
slope data, we again consider the aberration function describing the eye aberrations, as in
the previous section. The slopes are evaluated analytically at the nodes of the same square
arrays, i.e., 9 × 9, 15 × 15, and 21 × 21. To simulate experimental data, we add to the x and
y slope values an uncorrelated random Gaussian noise with a standard deviation of 2%,
5%, and 10% of the absolute mean value according to
$$\sigma_n = \frac{p}{100}\, \frac{1}{\pi} \int \left( 0.5 \left| \frac{\partial W(x, y)}{\partial x} \right| + 0.5 \left| \frac{\partial W(x, y)}{\partial y} \right| \right) dx\, dy . \tag{14-30}$$
The numerical procedure for the integral approach to retrieve the Zernike
coefficients is the same as explained in the previous section. First, we interpolate the data
(separately for the x and y slopes) by bicubic splines [5] to evaluate the integrands at the
nodes of a cubature formula that allows exact evaluation of polynomial integrands up to
degree 15 [6]. We use the vector polynomials given by Gavrielides [10] to evaluate the
orthonormal Zernike coefficients. For the classic least squares approach, we solve the
system of linear equations given by
$$D_{2N \times 1} = C_{2N \times J}\, \hat{a}_{J \times 1} , \tag{14-31}$$
where D is now a column vector of 2N data values corresponding to both slopes at the
measurement points, â is a column vector containing the Zernike coefficients, and C is a
matrix representing the values of the derivatives of the Zernike polynomials at the
location of the data points according to
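$$C_{2N \times J} = \begin{pmatrix}
\dfrac{\partial Z_1(x_1, y_1)}{\partial x} & \dfrac{\partial Z_2(x_1, y_1)}{\partial x} & \cdots & \dfrac{\partial Z_J(x_1, y_1)}{\partial x} \\
\dfrac{\partial Z_1(x_2, y_2)}{\partial x} & \dfrac{\partial Z_2(x_2, y_2)}{\partial x} & \cdots & \dfrac{\partial Z_J(x_2, y_2)}{\partial x} \\
\vdots & \vdots & & \vdots \\
\dfrac{\partial Z_1(x_N, y_N)}{\partial x} & \dfrac{\partial Z_2(x_N, y_N)}{\partial x} & \cdots & \dfrac{\partial Z_J(x_N, y_N)}{\partial x} \\
\dfrac{\partial Z_1(x_1, y_1)}{\partial y} & \dfrac{\partial Z_2(x_1, y_1)}{\partial y} & \cdots & \dfrac{\partial Z_J(x_1, y_1)}{\partial y} \\
\dfrac{\partial Z_1(x_2, y_2)}{\partial y} & \dfrac{\partial Z_2(x_2, y_2)}{\partial y} & \cdots & \dfrac{\partial Z_J(x_2, y_2)}{\partial y} \\
\vdots & \vdots & & \vdots \\
\dfrac{\partial Z_1(x_N, y_N)}{\partial y} & \dfrac{\partial Z_2(x_N, y_N)}{\partial y} & \cdots & \dfrac{\partial Z_J(x_N, y_N)}{\partial y}
\end{pmatrix}_{2N \times J} . \tag{14-32}$$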
The pseudoinverse of matrix C is also evaluated by the inversion routine inv(M) of
Matlab. The results thus obtained up to the seventh degree are shown in Figure 14-6, and
the fit quality Q is shown in Table 14-7. As in the case of determining the Zernike
coefficients from the wavefront data, the accuracy of the retrieved coefficients from the
wavefront slope data increases with the number of data points and decreases with the
amount of noise. Unlike the case of wavefront data in Table 14-3, where the LS method for
the 9 × 9 grid gives a nonzero fit error for zero noise, the wavefront fit error is zero in the
case of wavefront slope data, because the number of slope measurements is double that of
the wavefront data, and hence the matrix inversion is better conditioned.
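The least squares solution from slope data can be sketched for a handful of low-order terms. The example below is an illustration, not the procedure used for the results above: it keeps only the five terms j = 2 to 6 with the corresponding eye-model coefficients of Table 14-2, writes their derivatives explicitly in Cartesian coordinates, omits the piston column (its derivatives vanish, so it cannot be recovered from slopes), and solves the normal equations by Gaussian elimination. The function names are hypothetical.

```python
import math

S3, S6 = math.sqrt(3), math.sqrt(6)

def zernike_gradients(x, y):
    """x and y derivatives of Z2..Z6 in Noll ordering:
    Z2 = 2x, Z3 = 2y, Z4 = sqrt(3)(2x^2 + 2y^2 - 1),
    Z5 = 2 sqrt(6) x y, Z6 = sqrt(6)(x^2 - y^2)."""
    dx = [2.0, 0.0, 4 * S3 * x, 2 * S6 * y, 2 * S6 * x]
    dy = [0.0, 2.0, 4 * S3 * y, 2 * S6 * x, -2 * S6 * y]
    return dx, dy

def solve(M, b):
    """Solve M c = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(M)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    c = [0.0] * n
    for i in reversed(range(n)):
        c[i] = (aug[i][n] - sum(aug[i][k] * c[k] for k in range(i + 1, n))) / aug[i][i]
    return c

def fit_from_slopes(points, slopes_x, slopes_y):
    """Stack the x-slope rows over the y-slope rows (Eq. 14-32) and solve
    the normal equations (C^T C) a = C^T D."""
    C = ([zernike_gradients(x, y)[0] for x, y in points]
         + [zernike_gradients(x, y)[1] for x, y in points])
    D = list(slopes_x) + list(slopes_y)
    J = len(C[0])
    CtC = [[sum(row[p] * row[q] for row in C) for q in range(J)] for p in range(J)]
    CtD = [sum(row[p] * d for row, d in zip(C, D)) for p in range(J)]
    return solve(CtC, CtD)

true_a = [-0.8236, -0.8680, -0.3168, -1.6749, 0.9368]   # j = 2..6, Table 14-2
grid = [-1 + 0.25 * i for i in range(9)]                 # 9 x 9 square array
points = [(x, y) for x in grid for y in grid if x * x + y * y <= 1.0]
slopes_x = [sum(a * g for a, g in zip(true_a, zernike_gradients(x, y)[0]))
            for x, y in points]
slopes_y = [sum(a * g for a, g in zip(true_a, zernike_gradients(x, y)[1]))
            for x, y in points]
a_hat = fit_from_slopes(points, slopes_x, slopes_y)
print([round(a, 4) for a in a_hat])
```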
Similar results are obtained for the Zernike coefficients of the eye aberration
function by determining first the bj coefficients according to Eq. (14-25). The numerical
integration is carried out in the same manner as in the previous cases. The Zernike
coefficients aj are then obtained by using Eq. (14-29).
Table 14-7. Wavefront fit quality factor Q for the eye aberration function.
         Integration Method              LS Method
σ_n      9 × 9    15 × 15   21 × 21      9 × 9    15 × 15   21 × 21
0%       0.0000   0.0000    0.0000       0.0000   0.0000    0.0000
2%       0.0371   0.0421    0.0380       0.0367   0.0399    0.0369
5%       0.1216   0.1177    0.0899       0.1019   0.1119    0.0881
10%      0.1666   0.2237    0.1901       0.1647   0.2093    0.1860

Integration Method
9x9
1.5
1.0
0.5
0.0
âj
0.5
1.0 0% noise
2% noise
5% noise
1.5
10% noise
2.0
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36
LS Method
9x9
1.5
1.0
0.5
0.0
âj
0.5
1.0 0% noise
2% noise
5% noise
1.5
10% noise
2.0
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36
Figure 14-6a. Estimated Zernike coefficients of the eye aberration function from
wavefront slope data on a 9 u 9 array with different amounts of noise.
[Figure 14-6b: two panels, Integration Method and LS Method, 15 × 15 array; estimated
coefficients âj plotted against j = 1 to 36 for 0%, 2%, 5%, and 10% noise.]
Figure 14-6b. Estimated Zernike coefficients of the eye aberration function from
wavefront slope data on a 15 × 15 array with different amounts of noise.
[Figure 14-6c: two panels, Integration Method and LS Method, 21 × 21 array; estimated
coefficients âj plotted against j = 1 to 36 for 0%, 2%, 5%, and 10% noise.]
Figure 14-6c. Estimated Zernike coefficients of the eye aberration function from
wavefront slope data on a 21 × 21 array with different amounts of noise.
14.4 SUMMARY
Two reconstruction methods most commonly used to estimate the Zernike
coefficients from the wavefront data are the numerical integration and the classic least
squares fit. Figures 14-3 and 14-4 as well as Tables 14-1 and 14-3 demonstrate, with
various arrays of data points and different amounts of noise, that the accuracy of
wavefront fit for both methods improves as the number of data points increases and the
amount of noise decreases.
When the wavefront slope data, instead of the wavefront data, are available, the
orthogonality properties of the Zernike polynomials over a unit disk are not
straightforwardly transferred to their derivatives, and, therefore, the Zernike coefficients
cannot be evaluated as a projection over the gradient of the polynomials. Two
conceptually different approaches solve this problem. One utilizes a set of vector
polynomials [G_j^x(x, y), G_j^y(x, y)] given in Table 14-4 such that their inner products with
the wavefront gradient yield the orthonormal Zernike coefficients [see Eq. (14-12)]. In
the other, one expands the aberration function in a set of polynomials Bj(x, y) whose
gradients are orthonormal to each other over a unit disk [see Eq. (14-25)]. In the first
approach, the coefficients represent the minimum sigma values of the balanced classical
wave aberrations, and in the second, the aberrations are balanced to yield minimum
variance of the transverse ray aberrations. The wave aberration and transverse ray
aberration coefficients are related to each other according to Eq. (14-29). These
approaches require numerical evaluation of integrals or a solution of a linear system of
equations. In either case, the accuracy of wavefront fit improves as the number of data
points increases and/or the amount of noise decreases, as demonstrated in Figure 14-6
and in Table 14-7.
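The least squares route from slope data summarized above can be sketched numerically. The
following is only a minimal illustration, not the procedure used to produce Figure 14-6 or
Table 14-7: the grid layout (`disk_grid`), the five-term model (Z2 to Z6 in Cartesian form),
the test coefficients, and the noise model are all assumptions made for the example. Piston
is left out because it has no slope.

```python
import numpy as np

def disk_grid(n):
    """Points of an n x n square grid on [-1, 1]^2 that lie inside the unit disk."""
    c = np.linspace(-1.0, 1.0, n)
    x, y = np.meshgrid(c, c)
    inside = x**2 + y**2 <= 1.0
    return x[inside], y[inside]

def slope_matrix(x, y):
    """Rows: x slopes, then y slopes, at each point; columns: Z2..Z6."""
    s3, s6 = np.sqrt(3.0), np.sqrt(6.0)
    one, zero = np.ones_like(x), np.zeros_like(x)
    # Cartesian forms: Z2 = 2x, Z3 = 2y, Z4 = sqrt(3)(2x^2 + 2y^2 - 1),
    # Z5 = 2 sqrt(6) x y, Z6 = sqrt(6)(x^2 - y^2)
    dx = np.column_stack([2 * one, zero, 4 * s3 * x, 2 * s6 * y, 2 * s6 * x])
    dy = np.column_stack([zero, 2 * one, 4 * s3 * y, 2 * s6 * x, -2 * s6 * y])
    return np.vstack([dx, dy])

def fit_from_slopes(x, y, slope_x, slope_y):
    """Least squares estimate of a2..a6 from measured x and y slopes."""
    A = slope_matrix(x, y)
    b = np.concatenate([slope_x, slope_y])
    coeffs, *_ = np.linalg.lstsq(A, b, rcond=None)
    return coeffs

true_coeffs = np.array([0.3, -0.2, 0.5, 0.1, -0.4])  # hypothetical a2..a6
rng = np.random.default_rng(0)
for n in (9, 15, 21):
    x, y = disk_grid(n)
    clean = slope_matrix(x, y) @ true_coeffs
    exact = fit_from_slopes(x, y, clean[:x.size], clean[x.size:])
    # assumed noise model: Gaussian, 5% of the largest slope magnitude
    noisy = clean + 0.05 * np.abs(clean).max() * rng.standard_normal(clean.size)
    est = fit_from_slopes(x, y, noisy[:x.size], noisy[x.size:])
    print(n, np.abs(exact - true_coeffs).max(), np.abs(est - true_coeffs).max())
```

With noise-free slopes the fit returns the input coefficients to rounding error on every
grid, which mirrors the zero-noise row of Table 14-7; the noisy columns depend on the
assumed noise model and random seed.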
References
1. K. Creath, “Phase-measurement interferometry techniques,” Progress in Optics
26, 349–393 (1988).
2. B. C. Platt and R. V. Shack, “History and principles of Shack–Hartmann
wavefront sensing,” J. Refract. Surg. 17, 573–577 (2001).
3. W. T. Vetterling, S. A. Teukolsky, W. H. Press, and B. P. Flannery, Numerical
Recipes 3rd Edition: The Art of Scientific Computing (Cambridge University
Press, 2007).
4. Å. Björck, Numerical Methods for Least Squares Problems (SIAM, Philadelphia,
1996).
5. C. de Boor, A Practical Guide to Splines (Springer, New York, 2001).
6. A. H. Stroud, Approximate Calculation of Multiple Integrals (Prentice-Hall, Inc.,
New Jersey, 1971).
7. MATLAB and Statistics Toolbox Release 2012b, The MathWorks, Inc., Natick,
Massachusetts, United States.
8. L. N. Thibos, A. Bradley, and X. A. Hong, “A statistical model of the aberration
structure of normal, well corrected eyes,” Ophthalm. Physiol. Opt. 22, 427–433
(2002).
9. R. Neal, J. Copland, and D. Neal, “Shack–Hartmann wavefront sensor precision
and accuracy,” Proc. SPIE 4779, 148–160 (2002).
10. A. Gavrielides, “Vector polynomials orthogonal to the gradient of Zernike
polynomials,” Opt. Lett. 7, 526–528 (1982).
11. V. P. Aksenov and Y. N. Isaev, “Analytical representation of the phase and its
mode components reconstructed according to the wave-front slopes,” Opt. Lett.
17, 1180–1182 (1992).
12. E. Acosta, S. Bará, M. A. Rama, and S. Ríos, “Determination of phase mode
components in terms of local wave-front slopes: an analytical approach,” Opt.
Lett. 20, 1083–1085 (1995).
13. C. Solomon, S. Ríos, E. Acosta, and S. Bará, “Modal wavefront projectors of
minimum error norm,” Opt. Commun. 155, 252–254 (1998).
14. M. R. Spiegel, S. Lipschutz, and D. Spellman, Vector Analysis (Schaum’s
Outlines, McGraw Hill, India, 2009).
15. I. N. Bronshtein, K. A. Semendyayev, G. Musiol, and H. Mühlig, Handbook of
Mathematics (Springer, Berlin, 2007).
16. C. Solomon, G. C. Loos, and S. Ríos, “Variational solution for modal wavefront
projection functions of minimum error norm,” J. Opt. Soc. Am. A 18, 1519–1522
(2001).
17. W. Lukosz, “Der Einfluß der Aberrationen auf die optische
Übertragungsfunktion bei kleinen Orts-Frequenzen,” Optica Acta 10, 1–9
(1963).
18. J. Braat, “Polynomial expansion of severely aberrated wavefront,” J. Opt. Soc.
Am. A 4, 643–650 (1987).
19. C. Zhao and J. H. Burge, “Orthonormal vector polynomials in a unit circle, Part I:
basis set derived from gradients of Zernike polynomials,” Opt. Express 15,
18014–18024 (2007).
20. C. Zhao and J. H. Burge, “Orthonormal vector polynomials in a unit circle, Part II:
completing the basis set,” Opt. Express 16, 6586–6591 (2008).
Appendix: Systems with Sector Pupils

As discussed in the following paper,* there are systems, such as the cube-corner
retroreflectors, that consist of six segments in a circular pupil, with each segment in the
form of a sector pupil. The orthonormal polynomials, representing the balanced
aberrations for such segmented pupils, can be obtained by orthonormalizing the Zernike
circle polynomials over a sector area using the Gram–Schmidt orthonormalization
process.
Due to the low symmetry of the sector pupils, the closed-form analytical expressions
for the polynomials are very complex; even the tilt and defocus polynomials are not
simple. The complexity increases even more for a system with an annular sector pupil. In
that case, there are two variables representing a point on the pupil, two parameters
defining its orientation and angular subtense, and the parameter specifying its obscuration
ratio. However, relatively simple expressions are obtained when the angular subtense and
the orientation of the sector pupil are specified along with its obscuration ratio.
In the following paper, the first 11 orthonormal polynomials are obtained for a sector
pupil of angular subtense of π/3 symmetric about the x or the y axis, angular subtense of
π/2 symmetric about the y axis, or a semicircle symmetric about the x axis. Similarly, the
corresponding orthonormal polynomials are given for an annular sector pupil and a
semicircular pupil with an obscuration ratio of 0.5. We have shown in Chapters 8 and 9
that the radially symmetric Seidel spherical aberration in a system with an elliptical or a
rectangular pupil is balanced with not only the radially symmetric defocus aberration but
with an angular aberration of astigmatism as well, due to their low symmetry compared to
that of the radially symmetric pupils such as the circular, annular or Gaussian. In systems
with sector pupils, even the radially symmetric defocus aberration is balanced with an
angular aberration of tilt due to their lower symmetry. Moreover, a polynomial for such
pupils consists of an increasing number of terms as its order increases.
A simple example of an aberration function consisting of spherical aberration,
defocus, and tilt is considered in this chapter, and its orthonormal aberration coefficients
are determined for the various sector pupils in various orientations, both circular and
annular. In each case, the P-V numbers and the sigma values of the aberration function
are given along with their interferograms.
The number of sector polynomials up to and including a certain order is the same as for
the polynomials of systems with pupils of other shapes considered in Chapters 4 through
10. The orthonormal circle polynomials given in Chapter 4, or the annular polynomials
given in Chapter 5, are special cases of the polynomials for sector pupils with an angular
subtense of 2π.
* José A. Díaz and V. N. Mahajan, “Orthonormal aberration polynomials for optical systems with circular and
annular sector pupils,” Appl. Opt. 52, 1136–1147 (2013) [doi: 10.1364/AO.52.001136]. Reprinted with permission.
Orthonormal aberration polynomials for optical systems with circular and annular
sector pupils
José Antonio Díaz¹ and Virendra N. Mahajan²,*
¹Departamento de Óptica, Universidad de Granada, Granada 18071, Spain
²The Aerospace Corporation, 2310 El Segundo Boulevard, El Segundo, California 90245, USA
*Corresponding author: virendra.n.mahajan@aero.org
Received 28 November 2012; accepted 26 December 2012;
posted 8 January 2013 (Doc. ID 180765); published 11 February 2013
Using the Zernike circle polynomials as the basis functions, we obtain the orthonormal polynomials for
optical systems with circular and annular sector pupils by the Gram–Schmidt orthogonalization process.
These polynomials represent balanced aberrations yielding minimum variance of the classical
aberrations of rotationally symmetric systems. Use of the polynomials obtained is illustrated with
numerical examples. © 2013 Optical Society of America
OCIS codes: 110.0110, 010.7350, 220.1010, 120.3180, 220.0220.
1. Introduction
The interferogram of a cube-corner retroreflector consists of a fringe pattern inside a circle. It
consists of six equal segments, where each segment has the form of a circular sector with an angular
subtense of π/3 [1]. The interferogram can also be hexagonal with six triangular segments [2]. The
imaging properties of and phase retrieval on sector and annular sector pupils have also been discussed
in the literature [3–5]. Two recent papers discuss the effects of the thermal gradient on, as well as
the polarization and far-field diffraction patterns of, cube corners [6,7]. In this paper, we discuss
the orthonormal aberration polynomials for circular and annular sector pupils. In Section 2, we obtain
the first four orthonormal polynomials for these pupils by orthogonalizing the Zernike circle
polynomials [8,9] by the Gram–Schmidt orthogonalization process [10]. Due to the low symmetry of the
sector pupils, the polynomials are sufficiently complex that the closed-form analytical expressions for
only the first four polynomials are given. The details of the Gram–Schmidt process are illustrated only
for the circular sector. However, the polynomials can be obtained numerically for any order and any
obscuration value. The first 11 polynomials for a circular sector and an annular sector with an
obscuration ratio of 0.5 with an angular subtense of π/3 are given as numerical examples. In these
examples, the sector is assumed to be symmetrical about the x axis. We also outline the procedure for
obtaining the orthonormal polynomials for a sector with an arbitrary orientation in an xy plane. Such a
procedure is useful for obtaining the polynomials for the six circular sector segments of the
interferogram of a cube-corner retroreflector. It is illustrated by applying it to obtain the
polynomials for a circular sector that is symmetrical about the y axis. We point out that the defocus
polynomial given in [1] for the annular and thereby the circular sector pupils is incorrect.
The expansion of an aberration function for a certain pupil in terms of the polynomials that are
orthonormal over it, and how to obtain the orthonormal expansion coefficients, are considered in
Section 3. We illustrate the use of the sector polynomials in Section 4 by considering an aberration
function that consists of
1559-128X/13/061136-12$15.00/0
© 2013 Optical Society of America
1136 APPLIED OPTICS / Vol. 52, No. 6 / 20 February 2013

spherical aberration combined with defocus and tilt, and determine the aberration coefficients for
circular and annular sector pupils that are symmetric about the x axis. To see how the coefficients
change with the orientation of a sector, we also consider a circular sector pupil that is symmetric
about the y axis. The coefficients for a pupil that is semi-circular, circular, or annular but aberrated
by the same aberration function are also obtained. The interferograms of the aberration function for
the various pupils are shown for the starting aberration function as well as when the first four
polynomial terms representing the interferometer errors of piston, tip, tilt, and defocus are removed.
A brief summary of the main results and conclusions is given in Section 5.

2. Gram–Schmidt Orthogonalization of Zernike Circle Polynomials over a Sector Pupil
A. Circular Sector Pupil
Consider a circular sector, as illustrated in Fig. 1(a). The sector is symmetrical about the x axis and
subtends a semi-angle $\alpha$ at the center of the circle. Let the orthonormal Zernike circle
polynomials be represented by $Z_j(\rho,\theta)$. These polynomials may be written [8,9]
\[ Z_{\mathrm{even}\,j}(\rho,\theta) = \sqrt{2(n+1)}\,R_n^m(\rho)\cos m\theta, \quad m \neq 0, \tag{1a} \]
\[ Z_{\mathrm{odd}\,j}(\rho,\theta) = \sqrt{2(n+1)}\,R_n^m(\rho)\sin m\theta, \quad m \neq 0, \tag{1b} \]
\[ Z_j(\rho,\theta) = \sqrt{n+1}\,R_n^0(\rho), \quad m = 0, \tag{1c} \]
where $R_n^m(\rho)$ are the radial polynomials given by
\[ R_n^m(\rho) = \sum_{s=0}^{(n-m)/2} \frac{(-1)^s (n-s)!}{s!\left(\frac{n+m}{2}-s\right)!\left(\frac{n-m}{2}-s\right)!}\,\rho^{n-2s}, \tag{2} \]
n and m are positive integers (including zero), and $n - m \geq 0$ and even. The index n represents the
radial degree or the order of the polynomial since it represents the highest power of $\rho$ in the
polynomial, and m is called the azimuthal frequency. The index j is a polynomial-ordering number and is
a function of n and m. The first 11 orthonormal polynomials and the relationship among the indices j, n,
and m are given in Table 1. They are ordered such that an even j corresponds to a symmetric polynomial
varying as $\cos m\theta$, while an odd j corresponds to an antisymmetric polynomial varying as
$\sin m\theta$. For a given value of n, a polynomial with a lower value of m is ordered first. The
polynomials are orthonormal over a unit circular pupil according to
\[ \int_0^1\!\!\int_0^{2\pi} Z_j(\rho,\theta)\,Z_{j'}(\rho,\theta)\,\rho\,d\rho\,d\theta \Big/ \int_0^1\!\!\int_0^{2\pi} \rho\,d\rho\,d\theta = \delta_{jj'}. \tag{3} \]
The circular sector polynomials $S(\rho,\theta;\alpha)$ can be obtained by Gram–Schmidt
orthogonalizing [10] the circle polynomials over the circular sector pupil according to
\[ S_{j+1} = N_{j+1}\left[ Z_{j+1} - \sum_{k=1}^{j} \langle Z_{j+1} S_k \rangle S_k \right], \tag{4} \]
where
\[ S_1 = 1, \tag{5} \]
the angular brackets represent an average value over the pupil, and $N_{j+1}$ is a normalization
constant so that the S-polynomials are orthonormal, i.e.,
\[ \langle S_j S_{j'} \rangle = \frac{1}{\alpha}\int_0^1\!\!\int_{-\alpha}^{\alpha} S_j(\rho,\theta;\alpha)\,S_{j'}(\rho,\theta;\alpha)\,\rho\,d\rho\,d\theta = \delta_{jj'}. \tag{6} \]
Here $\delta_{jj'}$ is a Kronecker delta. Letting $j' = 1$, we find that the mean value
$\langle S_j \rangle$ of a polynomial is zero (except when $j = 1$). Similarly, letting $j = j'$, its
mean square value $\langle S_j^2 \rangle$ is unity.
Letting $j = 1$ and substituting $Z_2 = 2\rho\cos\theta$ into Eq. (4), we obtain $S_2$ as follows:
\[ S_2 = N_2\left[ Z_2 - \langle Z_2 \rangle \right], \]
\[ \langle Z_2 \rangle = \frac{1}{\alpha}\int_0^1\!\!\int_{-\alpha}^{\alpha} 2\rho\cos\theta\,\rho\,d\rho\,d\theta = \frac{4}{3}\frac{\sin\alpha}{\alpha}, \]
Fig. 1. Sector pupil of unit radius and semi-angular subtense α symmetrical about the x axis.
(a) Circular. (b) Annular with obscuration ratio ϵ.

Table 1. Orthonormal Zernike Circle Polynomials Zj(ρ, θ)
j    n   m   Zj(ρ, θ)                  Aberration
1    0   0   1                         Piston
2    1   1   2ρ cos θ                  x tilt
3    1   1   2ρ sin θ                  y tilt
4    2   0   √3(2ρ² − 1)               Defocus
5    2   2   √6 ρ² sin 2θ              45° primary astigmatism
6    2   2   √6 ρ² cos 2θ              0° primary astigmatism
7    3   1   √8(3ρ³ − 2ρ) sin θ        Primary y coma
8    3   1   √8(3ρ³ − 2ρ) cos θ        Primary x coma
9    3   3   √8 ρ³ sin 3θ
10   3   3   √8 ρ³ cos 3θ
11   4   0   √5(6ρ⁴ − 6ρ² + 1)         Primary spherical
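The entries of Table 1 follow directly from Eqs. (1a)–(1c) and (2) together with the ordering
rules stated in the text (even j for cos mθ, odd j for sin mθ, lower m first for a given n).
The sketch below is one way to generate them and to check Eq. (3) numerically; the function
names (`index_table`, `radial`, `zernike`, `gram_matrix`) and the quadrature sizes are
illustrative choices, not part of the paper.

```python
from math import factorial, sqrt
import numpy as np

def index_table(jmax):
    """Map j -> (n, m, kind) using the ordering rules described for Table 1."""
    table, j, n = {}, 1, 0
    while j <= jmax:
        for m in range(n % 2, n + 1, 2):          # lower m first for a given n
            if m == 0:
                table[j] = (n, 0, "none")
                j += 1
            else:
                for jj in (j, j + 1):             # even j -> cos, odd j -> sin
                    table[jj] = (n, m, "cos" if jj % 2 == 0 else "sin")
                j += 2
        n += 1
    return table

def radial(n, m, rho):
    """Radial polynomial R_n^m(rho) of Eq. (2)."""
    rho = np.asarray(rho, dtype=float)
    total = np.zeros_like(rho)
    for s in range((n - m) // 2 + 1):
        c = (-1) ** s * factorial(n - s) / (
            factorial(s) * factorial((n + m) // 2 - s) * factorial((n - m) // 2 - s))
        total = total + c * rho ** (n - 2 * s)
    return total

def zernike(j, rho, theta):
    """Orthonormal Zernike circle polynomial Z_j of Eqs. (1a)-(1c)."""
    n, m, kind = index_table(j)[j]
    if m == 0:
        return sqrt(n + 1) * radial(n, 0, rho)
    angular = np.cos(m * theta) if kind == "cos" else np.sin(m * theta)
    return sqrt(2 * (n + 1)) * radial(n, m, rho) * angular

def gram_matrix(jmax=11, n_rho=20, n_theta=64):
    """Pupil averages <Z_i Z_j> over the unit disk, Eq. (3), by quadrature."""
    t, w = np.polynomial.legendre.leggauss(n_rho)
    rho, w_rho = 0.5 * (t + 1.0), 0.5 * w          # Gauss-Legendre on [0, 1]
    theta = 2.0 * np.pi * np.arange(n_theta) / n_theta
    R, T = np.meshgrid(rho, theta, indexing="ij")
    weights = np.repeat((2.0 * rho * w_rho)[:, None], n_theta, axis=1) / n_theta
    Z = np.array([zernike(j, R, T) for j in range(1, jmax + 1)])
    return np.einsum("ipq,jpq,pq->ij", Z, Z, weights)

print(np.round(gram_matrix(), 6))  # close to the 11 x 11 identity matrix
```

The uniform grid in θ integrates the trigonometric products exactly, and the Gauss–Legendre
rule in ρ is exact for the polynomial degrees involved, so the printed matrix should match the
identity to rounding error.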
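\[ \langle Z_2^2 \rangle = \frac{4}{\alpha}\int_0^1\!\!\int_{-\alpha}^{\alpha} \rho^2\cos^2\theta\,\rho\,d\rho\,d\theta = (2\alpha + \sin 2\alpha)/2\alpha, \]
\[ \langle S_2^2 \rangle = N_2^2 \left\langle (Z_2 - \langle Z_2 \rangle)^2 \right\rangle = N_2^2\left[ \langle Z_2^2 \rangle - \langle Z_2 \rangle^2 \right] = N_2^2\left[ \frac{1}{2\alpha}(2\alpha + \sin 2\alpha) - \left(\frac{4}{3}\frac{\sin\alpha}{\alpha}\right)^2 \right]. \]
Since $\langle S_2^2 \rangle = 1$,
\[ N_2 = \left[ \frac{1}{2\alpha}(2\alpha + \sin 2\alpha) - \left(\frac{4}{3}\frac{\sin\alpha}{\alpha}\right)^2 \right]^{-1/2} \]
and
\[ S_2(\rho,\theta;\alpha) = \frac{2\rho\cos\theta - \frac{4}{3}\frac{\sin\alpha}{\alpha}}{\left[ \frac{1}{2\alpha}(2\alpha + \sin 2\alpha) - \left(\frac{4}{3}\frac{\sin\alpha}{\alpha}\right)^2 \right]^{1/2}}. \tag{7} \]
Letting $j = 2$ and substituting $Z_3 = 2\rho\sin\theta$ into Eq. (4), we obtain $S_3$ as follows:
\[ S_3 = N_3\left[ Z_3 - \langle Z_3 \rangle - \langle Z_3 S_2 \rangle S_2 \right]. \]
Since the integral of $\sin m\theta$ between symmetric limits $\mp\alpha$ is zero,
\[ \langle Z_3 \rangle = 0; \quad \langle Z_3 S_2 \rangle = 0; \quad \langle S_3 \rangle = 0; \quad S_3 = N_3 Z_3. \]
Hence,
\[ \langle S_3^2 \rangle = N_3^2 \langle Z_3^2 \rangle \]
and
\[ \langle Z_3^2 \rangle = \frac{4}{\alpha}\int_0^1\!\!\int_{-\alpha}^{\alpha} \rho^2\sin^2\theta\,\rho\,d\rho\,d\theta = (2\alpha - \sin 2\alpha)/2\alpha. \]
Since $\langle S_3^2 \rangle = 1$,
\[ N_3 = \left[ (2\alpha - \sin 2\alpha)/2\alpha \right]^{-1/2} \]
and
\[ S_3(\rho,\theta;\alpha) = \frac{2\rho\sin\theta}{\left[ (2\alpha - \sin 2\alpha)/2\alpha \right]^{1/2}}. \tag{8} \]
We see that even the expressions for the orthonormal tilt polynomials $S_2$ and $S_3$ are relatively
complex. We also note that the polynomial $S_2$ contains a piston term, since the mean value of the
wavefront tilt $\rho\cos\theta$ over the circular sector is not zero.
Letting $j = 3$ and substituting $Z_4 = \sqrt{3}(2\rho^2 - 1)$ into Eq. (4), the polynomial $S_4$ is
given by
\[ S_4 = N_4\left[ Z_4 - \langle Z_4 \rangle - \langle Z_4 S_2 \rangle S_2 - \langle Z_4 S_3 \rangle S_3 \right] = N_4\left[ Z_4 - \langle Z_4 S_2 \rangle S_2 \right], \]
since $\langle Z_4 \rangle$ and $\langle Z_4 S_3 \rangle$ are both equal to zero. It can be shown that
\[ S_4(\rho,\theta;\alpha) = \frac{1}{N_4}\left\{ \sqrt{3}(2\rho^2 - 1) - \frac{12\sqrt{6}\,\sin\alpha\,\left[ 2\rho\cos\theta - 4\sin\alpha/(3\alpha) \right]}{\frac{5}{\sqrt{2}\,\alpha}\left[ 9\alpha(2\alpha + \sin 2\alpha) - 32\sin^2\alpha \right]} \right\}, \tag{9} \]
where $N_4$ now denotes the reciprocal of the normalization constant of Eq. (4) and is given by
\[ N_4 = \frac{1}{5}\left[ 25 - \frac{96\sin^2\alpha}{9\alpha(2\alpha + \sin 2\alpha) - 32\sin^2\alpha} \right]^{1/2}. \tag{10} \]
It is evident that the complexity of the polynomials increases considerably as we try to obtain the
higher-order polynomials.
It can be shown further that the polynomial $S_4$ represents defocus aberration $\rho^2$ balanced with
an amount $B_t$ of wavefront tilt aberration $\rho\cos\theta$ in the form
\[ W = \rho^2 + B_t\,\rho\cos\theta \tag{11} \]
such that it yields minimum variance of the balanced aberration. The orthonormal aberration is then
given by
\[ S_4 = \frac{\rho^2 + B_t\,\rho\cos\theta - \langle W \rangle}{\sigma}, \tag{12} \]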

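where
\[ B_t = -\frac{\sin\alpha}{15\alpha\left[ \frac{1}{4} + \frac{\sin 2\alpha}{8\alpha} - \frac{4\sin^2\alpha}{9\alpha^2} \right]}, \tag{13} \]
and
\[ \langle W \rangle = \frac{1}{2} - \frac{16\sin^2\alpha}{5\left[ 9\alpha(2\alpha + \sin 2\alpha) - 32\sin^2\alpha \right]} \tag{14} \]
is the mean value of the aberration and
\[ \sigma = \left[ \langle W^2 \rangle - \langle W \rangle^2 \right]^{1/2} = \frac{1}{10}\left[ \frac{25}{3} - \frac{32\sin^2\alpha}{9\alpha(2\alpha + \sin 2\alpha) - 32\sin^2\alpha} \right]^{1/2} \tag{15} \]
is its standard deviation over the circular sector. The units of $B_t$, $\langle W \rangle$, and
$\sigma$ are the same as those of the starting defocus aberration.
The number of polynomials up to and including a certain degree n in $\rho$ is given by the same number
as in the case of Zernike circle polynomials, namely,
\[ N_n = \frac{(n+1)(n+2)}{2}. \tag{16} \]
In fact, if we let $\alpha = \pi$, the circular sector polynomials obtained above reduce to the
corresponding Zernike circle polynomials. Similarly, if we let $\alpha = \pi/2$, we obtain the first
four orthonormal polynomials for a semi-circular pupil.

B. Annular Sector Pupil
Consider an annular sector pupil with inner and outer radii of $\epsilon$ and unity, and thus an
obscuration ratio of $\epsilon$, as illustrated in Fig. 1(b). The pupil is symmetrical about the x axis
and subtends a semi-angle $\alpha$ at the center of the circles formed by the arcs of the sector. The
annular sector reduces to a circular sector as $\epsilon \to 0$. The annular sector polynomials
$S(\rho,\theta;\epsilon;\alpha)$ can be obtained by Gram–Schmidt orthogonalizing the circle polynomials
over the annular sector pupil, i.e., by replacing the lower limit of radial integrations from zero in
Section 2.A by $\epsilon$. The first four polynomials thus obtained that are orthonormal over the
annular sector pupil according to
\[ \langle S_j S_{j'} \rangle = \frac{1}{\alpha(1-\epsilon^2)}\int_{\epsilon}^{1}\!\!\int_{-\alpha}^{\alpha} S_j(\rho,\theta;\epsilon;\alpha)\,S_{j'}(\rho,\theta;\epsilon;\alpha)\,\rho\,d\rho\,d\theta = \delta_{jj'} \tag{17} \]
are given by
\[ S_1 = 1, \tag{18} \]
\[ S_2(\rho,\theta;\epsilon;\alpha) = \frac{2\rho\cos\theta - \frac{4}{3}\frac{1+\epsilon+\epsilon^2}{1+\epsilon}\frac{\sin\alpha}{\alpha}}{\left[ \frac{1+\epsilon^2}{2\alpha}(2\alpha + \sin 2\alpha) - \left(\frac{4}{3}\frac{1+\epsilon+\epsilon^2}{1+\epsilon}\frac{\sin\alpha}{\alpha}\right)^2 \right]^{1/2}}, \tag{19} \]
\[ S_3(\rho,\theta;\epsilon;\alpha) = \frac{2\rho\sin\theta}{\left[ (1+\epsilon^2)(2\alpha - \sin 2\alpha)/2\alpha \right]^{1/2}}, \tag{20} \]
and
\[ S_4(\rho,\theta;\epsilon;\alpha) = \frac{\sqrt{3}}{N_4}\left\{ 2\rho^2 - 1 - \epsilon^2 - \frac{16(1-\epsilon)^2\left[ 1 + \epsilon(3+\epsilon) \right]\sin\alpha\,\left[ 3\alpha(1+\epsilon)\rho\cos\theta - 2(1+\epsilon+\epsilon^2)\sin\alpha \right]}{5\left[ -32(1+\epsilon+\epsilon^2)^2\sin^2\alpha + 9\alpha(1+\epsilon)^2(1+\epsilon^2)(2\alpha + \sin 2\alpha) \right]} \right\}, \tag{21} \]
where
\[ N_4 = \frac{1-\epsilon}{5}\left[ \frac{-128(7\epsilon^6 + 28\epsilon^5 + 50\epsilon^4 + 55\epsilon^3 + 50\epsilon^2 + 28\epsilon + 7)\sin^2\alpha + 225\alpha(1+\epsilon)^4(1+\epsilon^2)(2\alpha + \sin 2\alpha)}{-32(1+\epsilon+\epsilon^2)^2\sin^2\alpha + 9\alpha(1+\epsilon)^2(1+\epsilon^2)(2\alpha + \sin 2\alpha)} \right]^{1/2}. \tag{22} \]
We see that the complexity of the polynomials increases because of the obscuration. As in the case of a
circular sector, the orthonormal tilt polynomial $S_2$ contains a piston term besides the wavefront
tilt so that its mean value over the annular sector is zero. As in the case of a circular sector pupil,
the orthonormal polynomial $S_4(\rho,\theta;\epsilon;\alpha)$ represents a defocus aberration $\rho^2$
balanced with tilt aberration $\rho\cos\theta$ in the form of Eq. (11) such that the variance of the
balanced aberration over the annular sector pupil is minimum, the mean value of the polynomial is zero,
and its mean square value is unity. It can be shown that, similar to the case of a circular sector
pupil [see Eqs. (11)–(15)], the balancing tilt aberration $B_t(\epsilon)$, the mean value
$\langle W(\epsilon) \rangle$, and the standard deviation $\sigma(\epsilon)$ of the balanced aberration
are given by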

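\[ B_t(\epsilon) = -\frac{12\alpha(1-\epsilon)^2(1+\epsilon)\left[ 1 + \epsilon(3+\epsilon) \right]\sin\alpha}{5\left[ (9/2)\alpha(1+\epsilon)^2(1+\epsilon^2)(2\alpha + \sin 2\alpha) - 16(1+\epsilon+\epsilon^2)^2\sin^2\alpha \right]}, \tag{23a} \]
\[ \langle W(\epsilon) \rangle = \frac{3\left[ 15\alpha(1+\epsilon+\epsilon^2+\epsilon^3)^2(2\alpha + \sin 2\alpha) - 64(1+\epsilon+\epsilon^2)(1+\epsilon+\epsilon^2+\epsilon^3+\epsilon^4)\sin^2\alpha \right]}{20\left[ (9/2)\alpha(1+\epsilon)^2(1+\epsilon^2)(2\alpha + \sin 2\alpha) - 16(1+\epsilon+\epsilon^2)^2\sin^2\alpha \right]}, \tag{23b} \]
and
\[ \sigma(\epsilon) = \left\{ \frac{(1-\epsilon^2)^2}{12} - \frac{8(1-\epsilon)^4\left[ 1 + \epsilon(3+\epsilon) \right]^2\sin^2\alpha}{25\left[ 9\alpha(1+\epsilon)^2(1+\epsilon^2)(2\alpha + \sin 2\alpha) - 32(1+\epsilon+\epsilon^2)^2\sin^2\alpha \right]} \right\}^{1/2}. \tag{23c} \]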
As $\alpha \to \pi$, the annular sector polynomials approach the annular polynomials that are
orthonormal over an annular pupil with an obscuration ratio of $\epsilon$ [9,11,12].

C. Sector Pupil Symmetrical About an Arbitrary Orientation
The orthonormal polynomials for a circular sector pupil with an arbitrary orientation such that its
sides make angles $\alpha_1$ and $\alpha_2$ with the x axis, as in Fig. 2(a), or an annular sector pupil
with an obscuration ratio $\epsilon$, as in Fig. 2(b), can be obtained in a manner similar to that in
Section 2.A or 2.B, respectively. The angular integrations now will be from $\alpha_1$ to $\alpha_2$.
For example, the orthonormality of the polynomials for the circular and annular sector pupils will be
described by
\[ \langle S_j S_{j'} \rangle = \frac{2}{\alpha_2 - \alpha_1}\int_0^1\!\!\int_{\alpha_1}^{\alpha_2} S_j(\rho,\theta;\alpha_1,\alpha_2)\,S_{j'}(\rho,\theta;\alpha_1,\alpha_2)\,\rho\,d\rho\,d\theta = \delta_{jj'} \tag{24} \]
and
\[ \langle S_j S_{j'} \rangle = \frac{2}{(\alpha_2 - \alpha_1)(1-\epsilon^2)}\int_{\epsilon}^{1}\!\!\int_{\alpha_1}^{\alpha_2} S_j(\rho,\theta;\epsilon;\alpha_1,\alpha_2)\,S_{j'}(\rho,\theta;\epsilon;\alpha_1,\alpha_2)\,\rho\,d\rho\,d\theta = \delta_{jj'}, \tag{25} \]
respectively. However, the closed-form expressions thus obtained are too complex to be of practical
value. It is better to obtain the results for each specific case. As an example, the polynomials for
circular and annular sector pupils that are symmetrical about the y axis, as illustrated in Figs. 3(a)
and 3(b), can be obtained by letting $\alpha_1 = \pi/2 - \alpha$ and $\alpha_2 = \pi/2 + \alpha$. The
first four polynomials thus obtained are similar to those for the corresponding pupil symmetrical about
the x axis, except that $S_2$ and $S_3$ exchange with each other and $\cos\theta$ is replaced by
$\sin\theta$ and vice versa.
We have checked Eqs. (13)–(15) given by Swantner and Chow [1] numerically, and found that they yield a
negative value of $\sigma^2$. For example, when $\epsilon = 0$ and $\alpha = \pi/6$, they yield
(approximately) $B_t = 5.36$, $\langle W \rangle = 3.84$, and $\sigma^2 = -1.40$, while the correct
numbers, as obtained from our Eqs. (13)–(15), are $B_t = -1.24$, $\langle W \rangle = -0.29$, and
$\sigma^2 = 0.005$. Similarly, when $\epsilon = 0.8235$ and $\alpha = \pi/6$, they yield
$B_t = 132.99$, $\langle W \rangle = 115.31$, and $\sigma^2 = -65.57$, while the correct numbers, as
obtained from our Eqs. (23a)–(23c), are $B_t = -1.22$, $\langle W \rangle = -0.22$, and
$\sigma^2 = 0.003$. Hence, the Swantner and Chow equations referred to above are incorrect.
Fig. 2. Sector pupil of unit radius with its sides making angles of α1 and α2 with the x axis.
(a) Circular. (b) Annular with obscuration ratio ϵ.

3. Expansion of an Aberration Function in Terms of Orthonormal Polynomials
The wave aberration function $W(\rho,\theta)$ of a sector pupil can be expanded in terms of the
orthonormal sector polynomials $S_j(\rho,\theta)$ in the form

\[ W(\rho,\theta) = \sum_{j=1}^{\infty} a_j S_j(\rho,\theta), \tag{26} \]
where $a_j$ is an expansion or the aberration coefficient of the polynomial $S_j(\rho,\theta)$.
Multiplying both sides of Eq. (26) by $S_{j'}(\rho,\theta)$, integrating over the sector pupil of area
A, and utilizing the orthonormality of the polynomials, the aberration coefficients are given by
\[ \frac{1}{A}\int_{\mathrm{pupil}} W(\rho,\theta)\,S_{j'}(\rho,\theta)\,\rho\,d\rho\,d\theta = \frac{1}{A}\sum_{j=1}^{\infty} a_j \int_{\mathrm{pupil}} S_j(\rho,\theta)\,S_{j'}(\rho,\theta)\,\rho\,d\rho\,d\theta = a_{j'}, \]
or
\[ a_j = \frac{1}{A}\int_{\mathrm{pupil}} W(\rho,\theta)\,S_j(\rho,\theta)\,\rho\,d\rho\,d\theta. \tag{27} \]
It is evident that the value of an expansion coefficient is independent of the number of polynomials
used in the expansion. Accordingly, one or more terms can be added to or subtracted from the aberration
function without affecting the other coefficients. It is a consequence of the orthogonality of the
polynomials.
The mean value of the aberration function is given by
\[ \langle W(\rho,\theta) \rangle = \sum_{j=1}^{\infty} a_j \langle S_j(\rho,\theta) \rangle = a_1, \tag{28} \]
as may be seen from the orthonormality equation such as Eq. (6) with $j' = 1$. The mean square value
of the aberration function is given by
\[ \langle W^2(\rho,\theta) \rangle = \frac{1}{A}\int_{\mathrm{pupil}} \sum_{j=1}^{\infty} a_j S_j(\rho,\theta) \sum_{j'=1}^{\infty} a_{j'} S_{j'}(\rho,\theta)\,\rho\,d\rho\,d\theta = \sum_{j=1}^{\infty} a_j^2, \tag{29} \]
where we have utilized the orthonormality of the polynomials. The variance $\sigma^2$ of the aberration
function is accordingly given by
\[ \sigma^2 = \langle W^2(\rho,\theta) \rangle - \langle W(\rho,\theta) \rangle^2 = \sum_{j=2}^{\infty} a_j^2, \tag{30} \]
where $\sigma$ is the standard deviation or the sigma value of the aberration function. Since the mean
value of a polynomial (except piston) is zero, each expansion coefficient $a_j$ represents the standard
deviation of the corresponding polynomial term. The variance of the aberration function is simply the
sum of the variances of these polynomial terms. It provides a measure of the quality of the image by
way of the Strehl ratio, which for small aberrations is approximately given by
$\exp(-\sigma_\Phi^2)$, where $\sigma_\Phi$ is the standard deviation of the phase aberration. The
overall image quality may be estimated by averaging the variance over the six sectors of a cube-corner
retroreflector.

4. Numerical Examples
In this section, we consider an aberration function
\[ W(\rho,\theta) = 4\rho^4 - 5\rho^2 + 10\rho\cos\theta \tag{31} \]
as measured by an interferometer, and determine the orthonormal aberration coefficients for a circular
sector pupil and an annular sector pupil with obscuration ratio $\epsilon = 0.5$, each with an angular
subtense of $\pi/3$ (or $\alpha = \pi/6$) and each symmetrical about the x axis. We also consider a
circular sector symmetrical about the y axis to show how the orthonormal coefficients change. Finally,
we consider the limiting cases of circular and annular [11,12] pupils with the same aberration function
and determine the coefficients of these radially symmetric or full pupils. All the polynomials were
determined by programming the nonrecursive matrix approach [13] using Mathematica software [14].
The first 11 orthonormal polynomials for a circular sector pupil of angular subtense $\pi/3$ are given
in Table 2 when it is symmetrical about the x axis, and in Table 3 when it is symmetrical about the y
axis. The polynomials for a circular sector of angular subtense $\pi/2$ symmetrical about the y axis
are given in Table 4. We note that a polynomial consists of more and more terms as its order increases.
The polynomials for a semi-circular pupil are given in Table 5.
Fig. 3. Sector pupil of unit radius with its sides making angles of α1 = π/2 − α and α2 = π/2 + α
with the x axis. (a) Circular. (b) Annular with obscuration ratio ϵ.

Table 2. Orthonormal Polynomials for a Circular Sector Pupil with Angular Subtense of π/3 Symmetrical about the x Axis,
as in Fig. 1(a)
S1 = 1
S2 = 4.4081ρ cos θ − 2.8063
S3 = 4.8084ρ sin θ
S4 = 14.7738ρ² − 18.2756ρ cos θ + 4.2477
S5 = 15.7199ρ² sin 2θ − 23.1380ρ sin θ
S6 = −2.3267ρ² + 13.4384ρ² cos 2θ − 11.5019ρ cos θ + 2.9289
S7 = 87.0864ρ³ sin θ − 65.2393ρ² sin 2θ + 37.9679ρ sin θ
S8 = 72.1271ρ³ cos θ − 88.0240ρ² − 35.9271ρ² cos 2θ + 61.3806ρ cos θ − 7.7589
S9 = 7.5982ρ³ sin θ + 42.8343ρ³ sin 3θ − 87.1692ρ² sin 2θ + 54.9874ρ sin θ
S10 = −23.0378ρ³ cos θ + 49.0241ρ³ cos 3θ + 51.5225ρ² − 83.7513ρ² cos 2θ + 10.8200ρ cos θ − 1.7027
S11 = 237.8242ρ⁴ − 578.3556ρ³ cos θ + 41.5354ρ³ cos 3θ + 312.4650ρ² + 95.9653ρ² cos 2θ − 116.2166ρ cos θ + 9.1348
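The first four rows of Table 2 can be cross-checked against the closed-form expressions of
Eqs. (7)–(10). The sketch below simply evaluates those formulas for a given semi-angle α; the
function name `first_four_sector` and the returned dictionary layout are illustrative
assumptions, not notation from the paper.

```python
from math import sin, sqrt, pi

def first_four_sector(alpha):
    """Numerical coefficients of S2, S3, S4 for a circular sector of semi-angle alpha."""
    s, s2a = sin(alpha), sin(2.0 * alpha)
    mean_z2 = 4.0 * s / (3.0 * alpha)                       # <Z2>
    n2 = ((2 * alpha + s2a) / (2 * alpha) - mean_z2**2) ** -0.5
    n3 = ((2 * alpha - s2a) / (2 * alpha)) ** -0.5
    d = 9 * alpha * (2 * alpha + s2a) - 32 * s**2            # common denominator
    big_n4 = 0.2 * sqrt(25 - 96 * s**2 / d)                  # Eq. (10)
    c = 24 * sqrt(3) * alpha * s / (5 * d)                   # multiplies (Z2 - <Z2>) in Eq. (9)
    return {
        "S2": {"rho cos": 2 * n2, "const": -mean_z2 * n2},   # Eq. (7)
        "S3": {"rho sin": 2 * n3},                           # Eq. (8)
        "S4": {"rho^2": 2 * sqrt(3) / big_n4,                # Eq. (9)
               "rho cos": -2 * c / big_n4,
               "const": (-sqrt(3) + c * mean_z2) / big_n4},
    }

for name, terms in first_four_sector(pi / 6).items():
    print(name, {k: round(v, 4) for k, v in terms.items()})
```

For α = π/6 the printed values should agree with the S2, S3, and S4 rows of Table 2 to the
four decimals shown there, and α = π/2 should reproduce the corresponding rows of Table 5.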
Table 3. Orthonormal Polynomials for a Circular Sector Pupil with Angular Subtense of π/3 Symmetrical about the y Axis,
as in Fig. 3(a)
S1 = 1
S2 = 4.8084ρ cos θ
S3 = 4.4081ρ sin θ − 2.8063
S4 = 14.7738ρ² − 18.2756ρ sin θ + 4.2477
S5 = 15.7199ρ² sin 2θ − 23.1380ρ cos θ
S6 = 2.3267ρ² + 13.4384ρ² cos 2θ + 11.5019ρ sin θ − 2.9289
S7 = 72.1271ρ³ sin θ − 88.0240ρ² + 35.9271ρ² cos 2θ + 61.3806ρ sin θ − 7.7589
S8 = 87.0864ρ³ cos θ − 65.2393ρ² sin 2θ + 37.9679ρ cos θ
S9 = 23.0378ρ³ sin θ + 49.0240ρ³ sin 3θ − 51.5225ρ² − 83.7513ρ² cos 2θ − 10.8200ρ sin θ + 1.7027
S10 = −7.5981ρ³ cos θ + 42.8343ρ³ cos 3θ + 87.1692ρ² sin 2θ − 54.9874ρ cos θ
S11 = 237.8243ρ⁴ − 578.3546ρ³ sin θ − 41.5354ρ³ sin 3θ + 312.4651ρ² − 95.9653ρ² cos 2θ − 116.2156ρ sin θ + 9.1348
Table 4. Orthonormal Polynomials for a Circular Sector Pupil with Angular Subtense of π/2 Symmetrical about the y Axis,
as in Fig. 4
S1 = 1
S2 = 3.3178ρ cos θ
S3 = 4.5221ρ sin θ − 2.7142
S4 = 10.1720ρ² − 12.4849ρ sin θ + 2.4076
S5 = 11.1500ρ² sin 2θ − 14.7336ρ cos θ
S6 = −7.0559ρ² + 9.5665ρ² cos 2θ + 18.2521ρ sin θ − 4.3820
S7 = 69.5749ρ³ sin θ − 82.2871ρ² + 36.6255ρ² cos 2θ + 57.1668ρ sin θ − 6.5661
S8 = 31.8696ρ³ cos θ − 22.6486ρ² sin 2θ + 8.6814ρ cos θ
S9 = −17.9479ρ³ sin θ + 25.2706ρ³ sin 3θ + 11.0762ρ² − 52.6627ρ² cos 2θ − 28.0196ρ sin θ + 4.0136
S10 = −34.0646ρ³ cos θ + 27.8620ρ³ cos 3θ + 73.1727ρ² sin 2θ − 41.4386ρ cos θ
S11 = 93.7045ρ⁴ − 223.2860ρ³ sin θ − 11.1546ρ³ sin 3θ + 106.3560ρ² − 51.7028ρ² cos 2θ − 41.1621ρ sin θ + 2.9078
Table 5. Orthonormal Polynomials for a Semi-circular Pupil Symmetrical about the x Axis, as in Fig. 5(a)
S1 = 1
S2 = 3.7831ρ cos θ − 1.6056
S3 = 2ρ sin θ
S4 = 4.1683ρ² − 2.5319ρ cos θ − 1.0096
S5 = 4.4114ρ² sin 2θ − 2.9956ρ sin θ
S6 = 6.7981ρ² + 7.5887ρ² cos 2θ − 13.3480ρ cos θ + 2.2660
S7 = 8.9027ρ³ sin θ − 1.4006ρ² sin 2θ − 4.9840ρ sin θ
S8 = 20.5600ρ³ cos θ − 13.6275ρ² − 7.6440ρ² cos 2θ + 0.3233ρ cos θ + 1.4414
S10 = 40.7949ρ³ cos θ + 15.2277ρ³ cos 3θ − 39.5924ρ² − 41.0149ρ² cos 2θ + 31.8150ρ cos θ − 2.8023
S11 = 18.2324ρ⁴ − 21.1998ρ³ cos θ − 2.1906ρ³ cos 3θ − 6.1076ρ² + 7.8677ρ² cos 2θ + 2.5110ρ cos θ + 1.1232
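The Gram–Schmidt process of Eq. (4) can also be carried out numerically for any sector. The
sketch below is not the nonrecursive matrix approach or the Mathematica program used for the
tables; it is an assumed, equivalent formulation in which the matrix of pupil averages
⟨Z_i Z_j⟩ is factored by a Cholesky decomposition, so that the inverse of the lower-triangular
factor converts the Zernike circle polynomials of Table 1 into sector polynomials. The names
`ZERNIKE`, `sector_nodes`, and `conversion_matrix`, and the 40-point Gauss–Legendre rule, are
illustrative choices.

```python
import numpy as np

SQ = np.sqrt
# Z1..Z11 exactly as listed in Table 1
ZERNIKE = [
    lambda r, t: np.ones_like(r),
    lambda r, t: 2 * r * np.cos(t),
    lambda r, t: 2 * r * np.sin(t),
    lambda r, t: SQ(3) * (2 * r**2 - 1),
    lambda r, t: SQ(6) * r**2 * np.sin(2 * t),
    lambda r, t: SQ(6) * r**2 * np.cos(2 * t),
    lambda r, t: SQ(8) * (3 * r**3 - 2 * r) * np.sin(t),
    lambda r, t: SQ(8) * (3 * r**3 - 2 * r) * np.cos(t),
    lambda r, t: SQ(8) * r**3 * np.sin(3 * t),
    lambda r, t: SQ(8) * r**3 * np.cos(3 * t),
    lambda r, t: SQ(5) * (6 * r**4 - 6 * r**2 + 1),
]

def sector_nodes(theta1, theta2, eps=0.0, n=40):
    """Quadrature nodes and weights; sum(w * f) approximates the pupil average of f."""
    t, w = np.polynomial.legendre.leggauss(n)
    rho, w_rho = 0.5 * (1 - eps) * t + 0.5 * (1 + eps), 0.5 * (1 - eps) * w
    th, w_th = 0.5 * (theta2 - theta1) * t + 0.5 * (theta2 + theta1), 0.5 * (theta2 - theta1) * w
    R, T = np.meshgrid(rho, th, indexing="ij")
    W = np.outer(w_rho * rho, w_th)
    return R.ravel(), T.ravel(), (W / W.sum()).ravel()

def conversion_matrix(theta1, theta2, eps=0.0):
    """Lower-triangular M with S_j = sum_i M[j, i] Z_i, plus the Gram matrix G."""
    r, t, w = sector_nodes(theta1, theta2, eps)
    Z = np.array([z(r, t) for z in ZERNIKE])
    G = (Z * w) @ Z.T                      # G[i, j] = <Z_i Z_j> over the sector
    M = np.linalg.inv(np.linalg.cholesky(G))
    return M, G

M, G = conversion_matrix(-np.pi / 6, np.pi / 6)
# S2 = M[1,0] + 2 M[1,1] rho cos(theta): compare with the S2 row of Table 2
print(round(2 * M[1, 1], 4), round(M[1, 0], 4))
```

Because the Cholesky factor has a positive diagonal, the result coincides with classical
Gram–Schmidt applied in the order Z1, Z2, ..., Z11. Passing `eps=0.5` gives the annular
sector case, and other values of `theta1` and `theta2` give other orientations.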

Table 6. Orthonormal Polynomials for an Annular Sector Pupil with Obscuration Ratio ϵ = 0.5 and Angular Subtense of π/3
Symmetrical about the x Axis, as in Fig. 1(b)
S1 = 1
S2 = 7.1986ρ cos θ − 5.3465
S3 = 4.3007ρ sin θ
S4 = −28.7951ρ cos θ + 19.0444ρ² + 9.4841
S5 = 17.5981ρ² sin 2θ − 26.7660ρ sin θ
S6 = 27.0338ρ² cos 2θ − 74.4974ρ cos θ + 24.0159ρ² + 26.3481
S7 = 83.4824ρ³ sin θ − 67.1102ρ² sin 2θ + 43.6343ρ sin θ
S8 = 180.6545ρ³ cos θ − 267.7044ρ² − 126.9429ρ² cos 2θ + 273.1875ρ cos θ − 59.1057
S11 = 351.1153ρ⁴ − 1032.8704ρ³ cos θ + 30.8480ρ³ cos 3θ + 705.9722ρ² + 320.4290ρ² cos 2θ − 440.8119ρ cos θ + 66.3866
Table 7. Orthonormal Polynomials for a Semi-annular Pupil with Obscuration Ratio ϵ = 0.5 Symmetrical about the x
Axis, as in Fig. 5(b)
S1 = 1
S2 = 3.8539ρ cos θ − 1.9083
S3 = 1.7889ρ sin θ
S4 = 4.9234ρ² − 1.4225ρ cos θ − 2.3728
S5 = 3.9259ρ² sin 2θ − 2.7548ρ sin θ
S6 = 6.0925ρ² + 7.9347ρ² cos 2θ − 14.6815ρ cos θ + 3.4617
S7 = 9.0714ρ³ sin θ − 0.9678ρ² sin 2θ − 5.6709ρ sin θ
S8 = 22.9120ρ³ cos θ − 14.6184ρ² − 5.3639ρ² cos 2θ − 6.0598ρ cos θ + 4.6007
S11 = 25.8811ρ⁴ − 14.6339ρ³ cos θ − 2.5180ρ³ cos 3θ − 21.2183ρ² + 8.0154ρ² cos 2θ − 1.9705ρ cos θ + 7.4515
Table 8. Annular Polynomials Aj(ρ, θ; ϵ = 0.5) for an Annular Pupil with Obscuration Ratio ϵ = 0.5
j    n   m   Aj(ρ, θ; ϵ = 0.5)                     Aberration
1    0   0   1                                     Piston
2    1   1   1.7889ρ cos θ                         x tilt
3    1   1   1.7889ρ sin θ                         y tilt
4    2   0   2.3094(2ρ² − 1.25)                    Defocus
5    2   2   2.1381ρ² sin 2θ                       45° primary astigmatism
6    2   2   2.1381ρ² cos 2θ                       0° primary astigmatism
7    3   1   2.3487(3.75ρ³ − 2.625ρ) sin θ         Primary y coma
8    3   1   2.3487(3.75ρ³ − 2.625ρ) cos θ         Primary x coma
9    3   3   2.4543ρ³ sin 3θ
10   3   3   2.4543ρ³ cos 3θ
11   4   0   3.9752(6ρ⁴ − 7.5ρ² + 2.0625)          Spherical aberration
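The expansion coefficients of Eq. (27) and the sigma value of Eq. (30) can be checked by
direct quadrature. The sketch below does this for the aberration function of Eq. (31) over the
circular sector of Table 2 (α = π/6, symmetrical about the x axis), using only the tabulated
S1, S2, and S4 with their rounded coefficients, so the results are good to about three
decimals. The helper name `sector_average` and the quadrature order are assumptions made for
the example.

```python
import numpy as np

def sector_average(f, alpha, n=40):
    """Average of f(rho, theta) over a unit-radius sector with |theta| <= alpha."""
    t, w = np.polynomial.legendre.leggauss(n)
    rho, w_rho = 0.5 * (t + 1.0), 0.5 * w          # radial nodes on [0, 1]
    theta, w_theta = alpha * t, alpha * w           # angular nodes on [-alpha, alpha]
    R, T = np.meshgrid(rho, theta, indexing="ij")
    weights = np.outer(rho * w_rho, w_theta)        # includes the rho of rho d(rho) d(theta)
    return float(np.sum(weights * f(R, T)) / np.sum(weights))

def W(r, t):
    """Aberration function of Eq. (31)."""
    return 4 * r**4 - 5 * r**2 + 10 * r * np.cos(t)

def S1(r, t):
    return np.ones_like(r)

def S2(r, t):
    return 4.4081 * r * np.cos(t) - 2.8063          # Table 2

def S4(r, t):
    return 14.7738 * r**2 - 18.2756 * r * np.cos(t) + 4.2477   # Table 2

alpha = np.pi / 6
a1 = sector_average(lambda r, t: W(r, t) * S1(r, t), alpha)    # Eq. (27); equals <W>, Eq. (28)
a2 = sector_average(lambda r, t: W(r, t) * S2(r, t), alpha)
a4 = sector_average(lambda r, t: W(r, t) * S4(r, t), alpha)
sigma = np.sqrt(sector_average(lambda r, t: W(r, t) ** 2, alpha) - a1**2)
print(round(a1, 4), round(a2, 4), round(a4, 4), round(sigma, 4))
```

Here sigma is computed from the definition ⟨W²⟩ − ⟨W⟩² rather than from the sum of squared
coefficients, which gives an independent check on Eq. (30) once all coefficients are known.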
For example, in Table 3, the S5 polynomial consists of Zernike 45° astigmatism balanced by tilt, and S6
consists of Zernike astigmatism balanced by not only tilt but additional defocus as well. The spherical
aberration ρ⁴ in S11 is balanced not only by defocus but several other lower-order terms as well.
Moreover, the balancing defocus has the same sign as the spherical aberration, instead of the opposite
sign as in the corresponding Zernike circle polynomial. All of this is a consequence of the lower
symmetry of the sector pupil. The orthonormal polynomials for an annular sector pupil, a semi-annular
pupil, and an annular pupil of an obscuration ratio ϵ = 0.5 are shown in Tables 6, 7, and 8,
respectively. Of course, the lower symmetry of the annular sector pupil results in similar balancing of
an aberration as for a circular sector pupil. The balancing defocus for spherical aberration in
semi-circular and semi-annular pupils does have the opposite sign, as for the circular and the annular
pupils.
Using Eq. (27), we obtain the orthonormal coefficients of the aberration function. Thus we may write
the aberration function of Eq. (31) in terms of the orthonormal polynomials for the various pupils.
They are given below along with their peak-to-valley and sigma values.
Circular sector pupil of angular subtense π/3 symmetrical about the x axis, as in Fig. 1(a):

Wρ; θ; π∕6; π∕6 5.1995S1 1.9345S2

0.1539S4 0.1395S6
0.1303S8 0.0143S10
0.0168S11 ; (32a)
P V 9 and σ 1.9501: (32b)
Circular sector pupil of angular subtense π∕3 symmetrical about the y axis, as in Fig. 3(a):
W(ρ, θ; π∕3, 2π∕3) = 1.6667S1 2.0797S2 0.3341S3 0.1539S4 0.1395S6 0.1303S7 0.0143S9 0.0168S11, (33a)
P-V = 10.0135 and σ = 1.9501. (33b)
Circular sector pupil of angular subtense π∕2 symmetrical about the y axis, as in Fig. 4:
W(ρ, θ; π∕4, 3π∕4) = 1.1667S1 3.0141S2 0.3231S3 0.0444S4 0.2087S6 0.1419S7 0.0188S9 0.0427S11, (34a)
P-V = 14.1421 and σ = 3.2585. (34b)

Fig. 4. Circular sector pupil of unit radius and semi-angular subtense π∕2 with its sides
making angles of α1 = π∕4 and α2 = 3π∕4 with the x axis.

Semi-circular pupil symmetrical about the x axis, as in Fig. 5(a):
W(ρ, θ; −π∕2, π∕2) = 3.0775S1 2.4522S2 0.2194S4 0.1079S6 0.1636S8 0.0316S10 0.2194S11, (35a)
P-V = 10.5625 and σ = 2.4798. (35b)

Fig. 5. Sector pupil of unit radius symmetrical about the x axis. (a) Semi-circular.
(b) Semi-annular with obscuration ratio ϵ = 0.5.

Circular pupil (not shown):
W(ρ, θ; 0, 2π) = 1.6667Z1 5.0000Z2 0.2887Z4 0.2981Z11, (36a)
P-V = 20 and σ = 5.0172. (36b)

Annular sector pupil of angular subtense π∕3 symmetrical about the x axis, as in Fig. 2(a):
W(ρ, θ; ϵ = 0.5; −π∕6, π∕6) = 6.0522S1 1.3755S2 0.0546S4 0.1412S6 0.0700S8 0.0034S10 0.0114S11, (37a)
P-V = 5.6699 and σ = 1.3857. (37b)

Semi-annular pupil symmetrical about the x axis, as in Fig. 5(b):

W(ρ, θ; ϵ = 0.5; −π∕2, π∕2) = 3.5765S1 2.5909S2 0.0018S4 0.0185S6 0.0573S8 0.0241S10 0.1546S11, (38a)
P-V = 10.5625 and σ = 2.5953. (38b)

Annular pupil (not shown):
W(ρ, θ; ϵ = 0.5; 0, 2π) = 1.3750A1 5.5902A2 0.1677A11, (39a)
P-V = 20 and σ = 5.5927. (39b)

In Section 2.C, we showed how the orthonormal polynomials change as the orientation of the
sector pupil changes from the x to the y axis. Tables 2 and 3 illustrate this fact over a
larger number of polynomials. Equations (32a) and (33a) illustrate it with a numerical
example. The aberration functions of Eqs. (33a) and (34a) for sector pupils symmetrical about
the y axis contain both tilt polynomials S2 and S3. Note that the defocus polynomial term A4
is missing in Eq. (39a), because the defocus term in the aberration function of Eq. (31)
exactly balances its spherical aberration term for an annular pupil of obscuration ratio
ϵ = 0.5, as may be seen from the polynomial A11 in Table 8.

Swantner and Chow also discussed a circular sector pupil of angular subtense π∕2 symmetrical
about the y axis and aberrated by primary spherical aberration. It can be shown that the
orthonormal polynomials obtained by orthonormalizing 1, ρ cos θ, ρ sin θ, ρ², and ρ⁴ over
such a sector are
1, (40a)
3.3178ρ cos θ, (40b)
4.5221ρ sin θ − 2.7142, (40c)
10.1720ρ² − 12.4849ρ sin θ + 2.4076, (40d)
15.5885ρ⁴ − 21.2467ρ² + 7.8559ρ sin θ + 0.7120. (40e)
If we divide the last polynomial by 15.5885, we obtain (approximately) the orthogonal
spherical aberration
ρ⁴ − 1.3630ρ² + 0.5040ρ sin θ + 0.0458, (41)
considered by Swantner and Chow and plotted in their Fig. 5 [1]. This polynomial represents
spherical aberration ρ⁴ balanced by appropriate amounts of defocus ρ² and y tilt ρ sin θ to
minimize its variance, as may be seen by dropping the first four polynomials in Eq. (34a):
W_R(ρ, θ) = 0.2087S6 0.1419S7 0.0188S9 0.0427S11
= 4(ρ⁴ − 1.3630ρ² + 0.5040ρ sin θ + 0.0458). (42)

Fig. 6. Interferograms of the aberration function of Eq. (31), as described by Eqs. (32)–(36)
for the various circular sectors. The left-side interferograms are for the aberration function
without removal of the first four aberration polynomial terms of the piston, x and y tilts,
and defocus, while the right side is for the residual aberration function after removing the
first four terms.

The factor of 4 is simply a result of the 4 in 4ρ⁴ in the starting aberration function of
Eq. (31), compared to only ρ⁴ in Eq. (41). It is not surprising that the residual aberration
has the same form as the balanced spherical aberration of Eq. (41). Since the starting
aberration function consists of spherical

aberration combined with defocus and tilt, the residual aberration function has to be
spherical aberration balanced with the amount of defocus and tilt that yields minimum
variance. However, the sigma value of spherical aberration ρ⁴ over a π∕2 sector is 2∕(3√5).
When it is balanced by defocus and tilt only, as in Eq. (41), its sigma value decreases by a
factor of 4.65. But, if it is balanced in the form of S11, as in Table 4, the sigma value
decreases by a factor of 27.94, i.e., a reduction by an additional factor of 6.

Figures 6 and 7 show the interferograms of the aberration function of Eq. (31) for the various
pupils. In Fig. 6 they are given for the circular sectors, and in Fig. 7 for the annular
sectors. The corresponding interferograms of the residual aberration function, when the first
four polynomial terms, representing the interferometer setting errors of piston, tip, tilt,
and defocus, are removed, are also shown side by side. The peak-to-valley and the sigma values
of the aberration function are given in each case below the corresponding interferogram.

Fig. 7. Interferograms of the aberration function of Eq. (31), as described by Eqs. (37)–(39)
for the various annular sectors with an obscuration ratio ϵ = 0.5. The left-side
interferograms are for the aberration function without removal of the first four aberration
polynomial terms of the piston, x and y tilts, and defocus, while the right side is for the
residual aberration function after removing the first four terms. Values printed below the
interferogram pairs (left | right): P-V = 20, σ = 5.5927 | P-V = 0.5625, σ = 0.1677;
P-V = 10.5625, σ = 2.5923 | P-V = 0.5765, σ = 0.1678; P-V = 5.6699, σ = 1.3857 |
P-V = 0.7003, σ = 0.1580.

5. Discussion and Conclusions

We have considered the problem of a sector pupil, such as those that are formed in an
interferogram of a cube corner. In Section 2.A, we obtained the first four orthonormal
aberration polynomials by orthogonalizing the Zernike circle polynomials recursively over a
sector pupil, as an illustration of the recursive process of the Gram–Schmidt
orthogonalization. The fourth polynomial represents defocus balanced with tilt. It is more
convenient to obtain the orthonormal polynomials nonrecursively using a matrix approach [13].
Because of the lower symmetry of a sector pupil, the closed-form expressions of the
polynomials for the general case are quite complex and lengthy, even with the computer algebra
programs such as Mathematica, which was used for the calculations in this work [14]. However,
it is straightforward to obtain the polynomial expressions for any specific pupil, as
illustrated in Section 4 by considering circular and annular sector pupils with an angular
subtense of π∕3 (as would be encountered in an interferogram of a cube-corner retroreflector),
semi-circular and semi-annular pupils, and circular and annular pupils. The obscuration ratio
of each annulus is 0.5.

We have considered an aberration function consisting of primary spherical aberration combined
with defocus and tilt, and obtained the orthonormal coefficients for the various pupils just
mentioned. For the circular sector pupil, we also showed, as an example, how the orthonormal
polynomials and the coefficients change when the symmetry axis of a pupil changes from the x
to the y axis. We also illustrate a significant advantage in balancing spherical aberration
with aberration terms as in the polynomial S11 compared to balancing with just defocus and
tilt. The interferograms of the aberration function for the various pupils are shown for the
starting aberration function as well as when the first four polynomial terms representing the
interferometer errors of piston, x and y tilts, and defocus are removed.

In practice, as in the more familiar case of a circular pupil, the aberration data will
generally be available at a square array of points and the integrations will be carried out
over the sector pupils in the x and y coordinates.

The problem of obtaining the sector polynomials was suggested by William H. Swantner. One of
the authors (VNM) gratefully acknowledges helpful discussions with him.

References
1. W. Swantner and W. W. Chow, “Gram–Schmidt orthogonalization of Zernike polynomials for general aperture shapes,” Appl. Opt. 33, 1832–1837 (1994).
2. D. A. Thomas and J. C. Wyant, “Determination of the dihedral angle errors of a corner cube from its Twyman–Green interferogram,” J. Opt. Soc. Am. 67, 467–472 (1977).
3. R. A. Lessard and S. C. Som, “Imaging properties of sector-shaped apertures,” Appl. Opt. 11, 811–817 (1972).
4. G. Urcid and A. Padilla, “Far-field diffraction patterns of circular sectors and related apertures,” Appl. Opt. 44, 7677–7696 (2005).
5. S. Huang, F. Xi, C. Liu, and Z. Jiang, “Phase retrieval on annular and annular sector pupils by using the eigenfunctions method to solve the transport intensity equation,” J. Opt. Soc. Am. A 29, 513–520 (2012).
6. S. D. Goodrow and T. W. Murphy, Jr., “Effects of thermal gradients in total internal reflection corner cubes,” Appl. Opt. 51, 8793–8799 (2012).
7. T. W. Murphy and S. D. Goodrow, “Polarization and far-field diffraction patterns of total internal reflection corner cubes,” Appl. Opt. 52, 117–126 (2013).

8. R. J. Noll, “Zernike polynomials and atmospheric turbulence,” J. Opt. Soc. Am. 66, 207–211 (1976).
9. V. N. Mahajan, Optical Imaging and Aberrations, Part II: Wave Diffraction Optics, 2nd ed. (SPIE, 2011).
10. A. Korn and T. M. Korn, Mathematical Handbook for Scientists and Engineers (McGraw-Hill, 1968).
11. V. N. Mahajan, “Zernike annular polynomials for imaging systems with annular pupils,” J. Opt. Soc. Am. 71, 75–85 (1981).
12. V. N. Mahajan, “Zernike annular polynomials and optical aberrations of systems with annular pupils,” Appl. Opt. 33, 8125–8127 (1994).
13. G.-M. Dai and V. N. Mahajan, “Nonrecursive orthonormal polynomials with matrix formulation,” Opt. Lett. 32, 74–76 (2007).
14. Wolfram Research, Inc., Mathematica, Version 8.0, Champaign, Illinois (2010).
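
The polynomials of Eqs. (40a)–(40e) and the factor of 4.65 quoted in Section 4 can be checked numerically. The Python sketch below is an illustration under stated assumptions, not the calculation used in this work (which was carried out with Mathematica [14]). It forms the Gram matrix of 1, ρ cos θ, ρ sin θ, ρ², and ρ⁴ over the π∕2 sector by Gauss–Legendre quadrature and inverts its Cholesky factor. In exact arithmetic this yields the same polynomials as Gram–Schmidt orthonormalization with positive leading coefficients. All function and variable names are hypothetical.

```python
import numpy as np

def sector_rule(alpha1, alpha2, n=40):
    """Gauss-Legendre nodes and area-normalized weights on a unit-radius sector."""
    x, w = np.polynomial.legendre.leggauss(n)
    rho = 0.5 * (x + 1.0)                                   # radial nodes on [0, 1]
    theta = 0.5 * (alpha2 - alpha1) * x + 0.5 * (alpha2 + alpha1)
    R, T = np.meshgrid(rho, theta, indexing="ij")
    weights = np.outer(w * rho, w)                          # rho is the polar Jacobian
    return R, T, weights / weights.sum()                    # weights sum to 1 (mean value)

# Sector of angular subtense pi/2 symmetrical about the y axis.
R, T, wts = sector_rule(np.pi / 4, 3 * np.pi / 4)

# Basis functions in the order used in the text.
names = ["1", "rho cos(theta)", "rho sin(theta)", "rho^2", "rho^4"]
basis = np.stack([np.ones_like(R), R * np.cos(T), R * np.sin(T), R**2, R**4])

# Gram matrix of mean-value inner products <b_i b_j> over the sector.
gram = np.einsum("iab,jab,ab->ij", basis, basis, wts)

# If gram = L L^T, the rows of M = L^{-1} are the coefficients of the
# orthonormal polynomials in terms of the basis (M gram M^T = identity).
L = np.linalg.cholesky(gram)
M = np.linalg.inv(L)

for row in M:
    terms = [f"{c:+.4f} {name}" for c, name in zip(row, names) if abs(c) > 1e-10]
    print(" ".join(terms))

# Sigma of rho^4 over the sector, and the reduction obtained when it is
# balanced by defocus and tilt only, i.e. the last polynomial divided by
# its leading coefficient, which has sigma = 1 / M[4, 4].
sigma_rho4 = np.sqrt(np.sum(wts * R**8) - np.sum(wts * R**4) ** 2)
reduction_factor = sigma_rho4 * M[4, 4]
print(round(sigma_rho4, 4), round(reduction_factor, 2))
```

Each printed row lists the coefficients of one orthonormal polynomial and can be compared term by term with Eqs. (40a)–(40e). The last line gives the sigma value of ρ⁴ and the factor by which it decreases for the balanced form of Eq. (41).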

ABOUT THE AUTHOR
Virendra N. Mahajan was born in Vihari, Pakistan, and educated in India and the
United States. He received his Ph.D.
in optical sciences from the College of Optical Sciences,
University of Arizona. He spent nine years at the Charles Stark
Draper Laboratory in Cambridge, Massachusetts, where he
worked on space optical systems. Since 1983, he has been at
The Aerospace Corporation in El Segundo, California, where
he is a distinguished scientist working on space-based
surveillance systems. Parts I and II of Optical Imaging and Aberrations evolved out of a
graduate course he taught as an adjunct professor in the Electrical Engineering-
Electrophysics department at the University of Southern California. Dr. Mahajan is an
adjunct professor in the College of Optical Sciences at the University of Arizona, and the
Department of Optics and Photonics at the National Central University in Taiwan, where
he teaches graduate courses on imaging and aberrations. He also teaches short courses on
aberrations at meetings of the Optical Society of America and SPIE. He has published
numerous papers on diffraction, aberrations, wavefront analysis, adaptive optics, and
acousto-optics. He is a fellow of OSA, SPIE, and the Optical Society of India. He is an
associate editor of OSA’s 3rd edition of the Handbook of Optics, and a recipient of
SPIE’s Conrady award. He has served as a Topical Editor of Optics Letters, chairman of
OSA’s Astronomical, Aeronautical, and Space Optics technical group, and a member of
several committees of both OSA and SPIE. Dr. Mahajan is the author of Aberration
Theory Made Simple, 2nd edition (2011), editor of Selected Papers on Effects of
Aberrations in Optical Imaging (1994), and author of Optical Imaging and Aberrations,
Part I: Ray Geometrical Optics (1998) and Part II: Wave Diffraction Optics (2001,
2011), all published by SPIE Press.

