WAVEFRONT ANALYSIS
PART III
WAVEFRONT ANALYSIS
VIRENDRA N. MAHAJAN
Mahajan, Virendra N.
Optical imaging and aberrations, part III: wavefront analysis / Virendra N. Mahajan
pages cm.
Includes bibliographical references and index.
ISBN 978-0-8194-9111-4
1. Optical measurements. 2. Aberration--Measurement. 3. Orthogonal decompositions.
4. Orthogonal polynomials. I. Title.
QC367.M24 2013
621.36--dc23
2013018827
Published by
SPIE
P.O. Box 10
Bellingham, Washington 98227-0010 USA
Phone: +1 360.676.3290
Fax: +1 360.647.1445
Email: Books@spie.org
Web: http://spie.org
All rights reserved. No part of this publication may be reproduced or distributed in any
form or by any means without written permission of the publisher.
The content of this book reflects the work and thought of the author(s). Every effort has
been made to publish reliable and accurate information herein, but the publisher is not
responsible for the validity of the information or for any outcomes resulting from reliance
thereon.
Front cover: Shown from left to right are the aberration-free PSFs of optical imaging
systems with circular, annular, hexagonal, elliptical, rectangular, and square pupils.
To my grandchildren
FOREWORD
For years Vini Mahajan has been publishing a book series on optical imaging and
aberrations. Part I of the series on Ray Geometrical Optics was published in 1998, and
Part II on Wave Diffraction Optics followed in 2001. A second edition of Part II appeared
in 2011. Now Vini has written Part III on Wavefront Analysis, which should be of interest
to anyone working in the fields of optical design, fabrication, or testing.
Chapter 4 is a long and complete chapter on imaging and aberrations for optical
systems with circular pupils. The chapter covers the PSF and OTF for aberration-free
imaging, Strehl ratio and aberration balancing and tolerancing, and a very complete
description of Zernike circle polynomials. Isometric, interferometric, and imaging
characteristics of the circle polynomial aberrations are very nicely explained and
illustrated. The important relationship between the circle polynomials and the classical
aberrations is discussed. Since optical systems generally have circular pupils, this chapter
will be of use to almost anyone working in optics.
The next several chapters are intended for readers interested in optical systems with
noncircular or apodized circular or annular pupils. Much of this material is difficult to
find in such detail elsewhere. The chapters start with a brief discussion of aberration-free
imaging that includes both the PSF and the OTF of the optical system, as this is
potentially the ultimate goal of any optical design or test. Then the polynomials
appropriate for systems with pupils of different shapes representing balanced classical
aberrations are described in detail. As in the case of the circle polynomial aberrations, the
isometric, interferometric, and PSF plots of the first forty-five polynomial aberrations for
systems with hexagonal, elliptical, annular, rectangular, and square pupils facilitate
understanding of their significance. Systems with circular and annular pupils with
Gaussian illumination, anamorphic systems with square and circular pupils, and those
with circular and annular sector pupils are also discussed thoroughly.
Anyone thinking of using the Zernike circle polynomials for wavefront analysis of
systems with noncircular pupils should read Chapter 12, where their pitfalls are
illustrated by applying them to systems with annular and hexagonal pupils. Numerical
examples on the calculation of the orthonormal aberration coefficients from the
wavefront or the wavefront slope data given in Chapter 14 add to the utility and
practicality of the book. A summary at the end of each chapter is quite useful, as it
describes the essence of the content.
Vini is an excellent writer with the gift of writing complex topics in a simplified, yet
rigorous, manner. As in the first two volumes of this book series, the material presented
in Part III is thorough and detailed, and much of it is from his own publications.
Wavefront Analysis is primarily analytical in nature, but it is generally easy to read with a
lot of examples and numerical results. Both students and experienced optical engineers
and scientists who have a need for wavefront analysis of optical systems will find it to be
extremely useful.
TABLE OF CONTENTS
3.5 Unit Pupil
3.6 Summary
References

4.11 Zernike Coefficients of a Scaled Pupil
4.11.1 Theory
4.11.2 Application to a Seidel Aberration Function
4.11.3 Numerical Example
4.12 Summary
References

6.11 Aberration Coefficients of a Gaussian Annular Aberration Function
6.12 Summary
References

CHAPTER 9: SYSTEMS WITH RECTANGULAR PUPILS
9.1 Introduction
9.2 Pupil Function
9.3 Aberration-Free Imaging
9.3.1 PSF
9.3.2 OTF
9.4 Rectangular Polynomials
9.5 Rectangular Coefficients of a Rectangular Aberration Function
9.6 Isometric, Interferometric, and Imaging Characteristics of Rectangular Polynomial Aberrations
9.7 Seidel Aberrations and Their Standard Deviations
9.7.1 Defocus
9.7.2 Astigmatism
9.7.3 Coma
9.7.4 Spherical Aberration
9.8 Summary
References

CHAPTER 11: SYSTEMS WITH SLIT PUPILS
11.1 Introduction
11.2 Aberration-Free Imaging
11.2.1 PSF
11.2.2 Image of an Incoherent Slit
11.3 Strehl Ratio and Aberration Balancing
11.3.1 Strehl Ratio
11.3.2 Aberration Balancing
11.4 Slit Polynomials
11.5 Standard Deviation of a Primary Aberration
11.6 Summary
References

CHAPTER 13: ANAMORPHIC SYSTEMS
13.1 Introduction
13.2 Gaussian Imaging
13.3 Classical Aberrations
13.4 Strehl Ratio and Aberration Balancing for a Rectangular Pupil
13.5 Aberration Polynomials Orthonormal over a Rectangular Pupil
13.6 Expansion of a Rectangular Aberration Function in Terms of Orthonormal Rectangular Polynomials
13.7 Anamorphic Imaging System with a Circular Pupil
13.7.1 Balanced Aberrations
13.7.2 Orthonormal Polynomials Representing Balanced Aberrations
13.8 Comparison of Polynomials for Rotationally Symmetric and Anamorphic Imaging Systems
13.9 Summary
References
PREFACE
This book is Part III of a series of books on Optical Imaging and Aberrations. Part I
on Ray Geometrical Optics and Part II on Wave Diffraction Optics were published
earlier. Part III is on Wavefront Analysis, which is an integral part of optical design,
fabrication, and testing. In optical design, rays are traced to determine the wavefront and
thereby the quality of a design. In optical testing, the fabrication errors and, therefore, the
associated aberrations are measured by way of interferometry. In both cases, the quality
of the wavefront is determined from the aberrations obtained at an array of points. The
aberrations thus obtained are used to calculate the mean, the peak-to-valley, and the
standard deviation values. While such statistical measures of the wavefront are part of
wavefront analysis, the purpose of this book is to determine the content of the wavefront
by decomposing the ray-traced or test-measured data in terms of polynomials that are
orthogonal over the expected domain of the data. These polynomials must include the
basic aberrations of wavefront defocus and tilt, and represent balanced classical
aberrations.
We start Part III with an outline of optical imaging in the presence of aberrations in
Chapter 1, i.e., on how to obtain the point-spread and optical transfer functions of an
imaging system with an arbitrary shaped pupil. The Strehl ratio of a system as a measure
of image quality is introduced in this chapter, and shown to be dependent only on the
aberration variance when the aberration is small. It is followed in Chapter 2 with a brief
discussion of the wavefronts and aberrations. This chapter introduces the nomenclature of
aberrations. How to obtain the orthogonal polynomials over a certain domain from those
over another is discussed in Chapter 3. For systems with a circular pupil, the Zernike
circle polynomials are well known for wavefront analysis. They are discussed at length in
Chapter 4. These polynomials are orthogonalized over an annular pupil in Chapter 5, and
over a Gaussian pupil in Chapter 6. They are obtained similarly for systems with
hexagonal, elliptical, rectangular, square, and slit pupils in the succeeding chapters. For
each pupil, the polynomials are given in their orthonormal form so that an expansion
coefficient (with the exception of piston) represents the standard deviation of the
corresponding polynomial aberration term. The standard deviation of a Seidel aberration
with and without aberration balancing is also discussed in these chapters.
Since the Zernike circle polynomials form a complete set, a wavefront over any
domain can be expanded in terms of them. However, the pitfalls of their use over a
domain other than circular and resulting from the lack of their orthogonality over the
chosen domain are discussed in Chapter 12. Finally, the aberrations of anamorphic
systems are discussed, and polynomials suitable for their aberration analysis are given in
Chapter 13 for both rectangular and circular pupils. The use of the orthonormal
polynomials for determining the content of a wavefront is demonstrated in Chapter 14
by computer simulations of circular wavefronts. The determination of the aberration
coefficients from the wavefront slope data, as in a Shack–Hartmann sensor, is also
discussed in this chapter.
ACKNOWLEDGMENTS
I am grateful to Professor José Antonio Díaz Navas for carrying out many computer
calculations and preparing many of the figures. My thanks to Drs. Barry Johnson, James
Harvey, and Daniel Topa for reading an early version of the manuscript and suggesting that
I include examples of wavefront analysis. I am grateful to Professor Eva Acosta for her
help with writing Chapter 14 on Numerical Wavefront Analysis, as my response to their
suggestion. Of course, any shortcomings or errors anywhere in the book are totally my
responsibility.
As in the past, I cannot say enough about the constant support I have received from
my wife Shashi over the many years it has taken me to complete this three-part series. I
dedicate Part III to my grandchildren.
Finally, I would like to thank SPIE Press Editors Dara Burrows and Scott McNeill,
and Manager Tim Lamkins for their quality support in bringing this book to publication.
It has always been a pleasure to work with the SPIE staff, starting with the Publications
Director Eric Pepper.
SYMBOLS AND NOTATION
$a_i$  aberration coefficient
$A$  amplitude
$A_i$  peak aberration coefficient
$B_d$  defocus coefficient
$B_j$  wave aberration polynomial
$B_t$  tilt coefficient
$c$  aspect ratio
$E_j$  elliptical polynomial
$F$  focal ratio
$G_j$  Gaussian or vector polynomial
$H_j$  hexagonal polynomial
$I$  irradiance
Im  imaginary part
$j$  polynomial number
$J_n$  Bessel function
$L_j$  Legendre polynomial
$M$  magnification
MTF  modulation transfer function
OTF  optical transfer function
$P$  object point
$P'$  Gaussian image point
$P_{ex}$  power in the exit pupil
$P_i$  image power
$P_n$  polynomial
$P(\cdot)$  pupil function
PSF  point-spread function
PTF  phase transfer function
$r$  radial coordinate
$r_c$  radius of circle
$\vec{r}_i$  image point position vector
$\vec{r}_p$  pupil point position vector
$R$  radius of reference sphere
Re  real part
$R_j$  rectangular polynomial
$R_n^m(\rho)$  Zernike radial polynomial
$S$  Strehl ratio
$S_{ex}$  area of exit pupil
$S_j$  square, sector, or ray aberration polynomial
$\vec{V}$  vector polynomial
$x, y$  Cartesian coordinates of a point
$W$  wave aberration
$Z_n^m$  Zernike circle polynomial
$Z_j$  Zernike circle polynomial
$\vec{v}_i$  image spatial frequency vector
$v$  normalized spatial frequency
$\tau$  optical transfer function
$\rho = r/a$  normalized radial coordinate
$\theta$  polar angle of a position vector
$\phi$  polar angle of frequency vector
$\epsilon$  obscuration or aspect ratio
$\delta(\cdot)$  Dirac delta function
$\delta_{ij}$  Kronecker delta
$\Delta$  longitudinal defocus
$\Phi$  phase aberration
$r, \theta$  polar coordinates of a point
$\lambda$  optical wavelength
$\xi, \eta$  spatial frequency coordinates
$\sigma_W$  standard deviation (wave)
$\sigma_\Phi$  standard deviation (phase)
Anantaratnaprabhavasya yasya himaṃ na saubhagyavilopi jatam
Eko hi doṣo guṇasannipate
nimajjatindoḥ kiraṇeṣvivaṅkaḥ
The snow does not diminish the beauty of the Himalayan mountains
which are the source of countless gems. Indeed, one flaw is lost
among a host of virtues, as the moon’s dark spot is lost among its rays.
PART III
WAVEFRONT ANALYSIS
CHAPTER 1
OPTICAL IMAGING
1.5 Summary
References
1.1 INTRODUCTION
The position and the size of the Gaussian image of an object formed by an optical
imaging system are determined by using its Gaussian imaging equations. The aperture stop
of the system limits the amount of light entering it the most. Its entrance pupil determines
the amount of light from an object that enters it, and the exit pupil determines how that
light is distributed in the image. The Gaussian image is an exact replica of the object,
except for its magnification. The diffraction image of an isoplanatic incoherent object is
given by the convolution of the Gaussian image and the diffraction image of a point
object, called the point-spread function (PSF). In the spatial frequency domain, the
spectrum of the image is correspondingly given by the product of the optical transfer
function (OTF), which is the Fourier transform of the PSF, and the spectrum of the
Gaussian image. The image is obtained by inverse Fourier transforming its spectrum [1].
We define a pupil function, representing the complex amplitude at the exit pupil, and give
equations for obtaining the PSF and the OTF.
absent or neglected, the light is distributed in a finite region around the Gaussian image
point due to its diffraction by the system. The diffraction image of a point object is called
the PSF of the system, and the aberration-free image is referred to as the diffraction-limited
image. The image of an extended object is determined by adding the amplitude or
the irradiance images of its small elements, depending on whether the object radiation is
coherent or incoherent.
A system is called isoplanatic for a small enough object if the distribution of light in
the image of any point on it is approximately the same, except for its location in the
image plane. Thus, over a small field of view, the image of a point object is shift
invariant. For an incoherent isoplanatic object, the diffraction image can be obtained by
convolving the Gaussian image (which is an exact replica of the object except for its size
and illumination scaling) with the diffraction PSF. In the spatial frequency domain, the
spectrum of the image is correspondingly given by the product of the OTF, which is the
Fourier transform of the PSF, and the spectrum of the Gaussian image. The image is
obtained by inverse Fourier transforming its spectrum [1]. We define a pupil function,
representing the complex amplitude at the exit pupil, and give equations for obtaining the
PSF and the OTF.
$$
P(\vec{r}_p; \vec{r}_o) =
\begin{cases}
A(\vec{r}_p; \vec{r}_o) \exp\left[ i \Phi(\vec{r}_p; \vec{r}_o) \right], & \text{inside the exit pupil} \\
0, & \text{outside the exit pupil,}
\end{cases}
\tag{1-1}
$$
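where $\vec{r}_p$ is the 2D position vector of a point in the plane of the pupil and $A(\vec{r}_p; \vec{r}_o)$ and
$\Phi(\vec{r}_p; \vec{r}_o)$ are the amplitude and phase aberration functions of the system for the point
object under consideration. The phase aberration $\Phi(\vec{r}_p; \vec{r}_o)$ is related to the wave aberration
$W(\vec{r}_p; \vec{r}_o)$ according to
$$
\Phi(\vec{r}_p; \vec{r}_o) = \frac{2\pi}{\lambda} W(\vec{r}_p; \vec{r}_o), \tag{1-2}
$$
where $\lambda$ is the optical wavelength.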
The shape of the pupil is arbitrary. It may, for example, be circular or annular. The total
power in the pupil and, therefore, in the image is given by
$$
P_{ex} = \int \left| P(\vec{r}_p; \vec{r}_o) \right|^2 d\vec{r}_p
= \int A^2(\vec{r}_p; \vec{r}_o)\, d\vec{r}_p . \tag{1-3}
$$
The image lies at a distance $R$ from the plane of the exit pupil, where $R$ is the radius
of curvature of the Gaussian reference sphere with respect to which the aberration
$W(\vec{r}_p; \vec{r}_o)$ is defined. The center of curvature of the reference sphere lies at the Gaussian
image point (unless defocus is introduced). Generally, the amplitude function $A(\vec{r}_p; \vec{r}_o)$
is uniform across the exit pupil. An exception is the Gaussian pupil considered in Chapter
6. We assume a small field of view so that the dependence of the aberration function
$W(\vec{r}_p; \vec{r}_o)$ on the location of the point object in the object plane can be neglected.
1.2.2 PSF
The PSF of the system imaging an incoherent object is given by [1]
$$
\mathrm{PSF}(\vec{r}_i) = \frac{1}{P_{ex} \lambda^2 R^2}
\left| \int P(\vec{r}_p) \exp\left( -\frac{2\pi i}{\lambda R}\, \vec{r}_i \cdot \vec{r}_p \right) d\vec{r}_p \right|^2 , \tag{1-4}
$$
where the position vector $\vec{r}_i$ of the observation point is written with respect to the
location $\vec{r}_g$ of the Gaussian image point, and $P_{ex}$ is the total power in the image. The
irradiance distribution of the image is obtained by multiplying the PSF by the total power
$P_{ex}$ in the image, i.e.,
$$
I(\vec{r}_i) = \frac{1}{\lambda^2 R^2}
\left| \int P(\vec{r}_p) \exp\left( -\frac{2\pi i}{\lambda R}\, \vec{r}_i \cdot \vec{r}_p \right) d\vec{r}_p \right|^2 . \tag{1-5}
$$
For a uniformly illuminated pupil with irradiance $I_0$, the total power incident on and
transmitted by the pupil is given by
$$
P_{ex} = I_0 S_{ex} , \tag{1-6}
$$
where $S_{ex}$ is the area of the exit pupil. Letting $A^2(\vec{r}_p) = I_0$, we may write the irradiance
distribution
$$
I(\vec{r}_i) = \frac{I_0}{\lambda^2 R^2}
\left| \int \exp\left[ i \Phi(\vec{r}_p) \right] \exp\left( -\frac{2\pi i}{\lambda R}\, \vec{r}_i \cdot \vec{r}_p \right) d\vec{r}_p \right|^2 . \tag{1-7}
$$
$$
I(0) = \frac{I_0}{\lambda^2 R^2} \left[ \int d\vec{r}_p \right]^2
= \frac{P_{ex} S_{ex}}{\lambda^2 R^2} . \tag{1-8}
$$
For convenience, we will refer to the irradiance distribution given by Eq. (1-9) as the
PSF. Letting $\Phi(\vec{r}_p) = 0$, we obtain the aberration-free PSF.
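As a numerical illustration of Eq. (1-4), the following minimal sketch samples a pupil function on a grid and obtains the PSF as the squared modulus of its discrete Fourier transform. The grid size, the pupil radius in samples, and the function name `psf_from_pupil` are assumptions made for this example only; the physical scale factor $\lambda R$ is left out, so the PSF is normalized to unit total power rather than expressed in physical units.

```python
import numpy as np

def psf_from_pupil(amplitude, phase):
    """Discrete analogue of Eq. (1-4), normalized so that the PSF sums to one."""
    pupil = amplitude * np.exp(1j * phase)            # pupil function, Eq. (1-1)
    field = np.fft.fft2(pupil)                        # Fourier transform of the pupil function
    power = np.sum(np.abs(pupil) ** 2)                # discrete counterpart of P_ex, Eq. (1-3)
    psf = np.abs(field) ** 2 / (pupil.size * power)   # Parseval's theorem gives sum(psf) = 1
    return np.fft.fftshift(psf)                       # place the Gaussian image point at the array center

n = 256          # grid size (assumed)
radius = 32      # pupil radius in samples (assumed)
y, x = np.mgrid[-n // 2:n // 2, -n // 2:n // 2]
rho = np.hypot(x, y) / radius

amplitude = (rho <= 1.0).astype(float)                # uniformly illuminated circular pupil
psf_free = psf_from_pupil(amplitude, np.zeros_like(rho))   # Phi = 0: aberration-free PSF

peak_index = np.unravel_index(np.argmax(psf_free), psf_free.shape)
print("peak location:", peak_index)
print("total power:", psf_free.sum())
```

In this normalization the aberration-free central value equals the pupil area divided by the grid area, which is the discrete counterpart of the proportionality of $I(0)$ to $S_{ex}$ in Eq. (1-8).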
1.2.3 OTF
The imaging process can be described in the space domain by way of the PSF, or in
the spatial frequency domain by way of the OTF. The OTF is the Fourier transform of the
PSF, defined as
$$
\tau(\vec{v}_i) = \int \mathrm{PSF}(\vec{r}_i) \exp\left( 2\pi i\, \vec{v}_i \cdot \vec{r}_i \right) d\vec{r}_i , \tag{1-10}
$$
where $\vec{v}_i$ is a spatial frequency vector in the image plane and related to the corresponding
frequency $\vec{v}_o$ in the object plane by the image magnification $M$ according to $\vec{v}_i = \vec{v}_o / M$.
Since the image of an isoplanatic incoherent object is given by the convolution of the PSF
and the Gaussian image, the (spatial frequency) spectrum of the image is given by the
product of the OTF and the spectrum of the Gaussian image. The image is obtained by
inverse Fourier transforming its spectrum.
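The equivalence of the two descriptions can be checked numerically. The sketch below is a one-dimensional illustration under assumed inputs: a hypothetical Gaussian image made of two point-like elements and a hypothetical sinc-squared PSF. It forms the image once by direct (circular) convolution and once as the inverse transform of the product of the OTF and the spectrum of the Gaussian image. NumPy's transform uses the opposite sign in the exponent from Eq. (1-10); the comparison is unaffected as long as the forward and inverse transforms are used consistently.

```python
import numpy as np

n = 128
x = np.arange(n) - n // 2

# Hypothetical Gaussian image: two incoherent point-like elements of unequal strength.
gaussian_image = np.zeros(n)
gaussian_image[n // 2 - 6] = 1.0
gaussian_image[n // 2 + 6] = 0.5

# Hypothetical PSF with a sinc^2 profile, normalized to unit total power.
psf = np.sinc(x / 4.0) ** 2
psf /= psf.sum()
psf_origin = np.fft.ifftshift(psf)        # move the PSF peak to index 0

# Space domain: circular convolution of the Gaussian image with the PSF.
m = np.arange(n)
direct = np.array([np.sum(gaussian_image * psf_origin[(k - m) % n]) for k in range(n)])

# Frequency domain: product of the OTF and the spectrum of the Gaussian image.
otf = np.fft.fft(psf_origin)
spectrum = np.fft.fft(gaussian_image)
via_spectrum = np.fft.ifft(otf * spectrum).real

print("maximum difference:", np.max(np.abs(direct - via_spectrum)))
```

Because the PSF is normalized to unit total power, the zero-frequency value of the OTF is unity and the total power of the Gaussian image is preserved in the diffraction image.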
Because of the relationship of the PSF with the pupil function, as in Eq. (1-4), the
OTF can also be written as the autocorrelation of the pupil function in the form
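$$
\tau(\vec{v}_i) = \frac{\int P(\vec{r}_p)\, P^*(\vec{r}_p - \lambda R \vec{v}_i)\, d\vec{r}_p}{\int \left| P(\vec{r}_p) \right|^2 d\vec{r}_p}
= P_{ex}^{-1} \int A(\vec{r}_p)\, A(\vec{r}_p - \lambda R \vec{v}_i) \exp\left[ i Q(\vec{r}_p; \vec{v}_i) \right] d\vec{r}_p , \tag{1-11}
$$
where
$$
Q(\vec{r}_p; \vec{v}_i) = \Phi(\vec{r}_p) - \Phi(\vec{r}_p - \lambda R \vec{v}_i) \tag{1-12}
$$
is a phase aberration difference function defined over the region of overlap of two pupils:
one centered at $\vec{r}_p = 0$ and the other at $\vec{r}_p = \lambda R \vec{v}_i$.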
For a uniformly illuminated pupil, the OTF is simply the fractional area of overlap of two
pupils centered at $(0, 0)$ and $\lambda R (\xi, \eta)$, where $(\xi, \eta)$ are the Cartesian components of the
spatial frequency vector $\vec{v}_i$.
The region of overlap is maximum and equal to the area of the pupil for $\vec{v}_i = 0$,
giving a value of unity for $\tau(0)$. It represents the fact that the contrast of an image is zero
for an object of zero contrast. Because of the finite size of the pupil, the overlap region
reduces to zero at some frequency $\vec{v}_c$, called the cutoff frequency, and stays zero for
larger frequencies, i.e., $\tau(\vec{v}_i) = 0$ for $|\vec{v}_i| \ge |\vec{v}_c|$. Because of isoplanatism, the spatial
frequency spectrum of the image is obtained as the product of the spectrum of the
Gaussian image and the OTF. Inverse Fourier transforming the image spectrum yields the
space domain image.
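The overlap-area interpretation can be illustrated with a short sketch, given here under stated assumptions. A uniform, aberration-free circular pupil is sampled on a grid (grid size and radius are arbitrary choices), its autocorrelation is computed with FFTs, and the result is compared with the standard closed-form overlap fraction of two equal circles, $(2/\pi)\left[\cos^{-1} v - v\sqrt{1 - v^2}\right]$, where $v$ is the frequency normalized by the cutoff. That closed form is a well-known result quoted here for comparison and is not derived in this section.

```python
import numpy as np

n, radius = 256, 32      # grid size and pupil radius in samples (assumed)
y, x = np.mgrid[-n // 2:n // 2, -n // 2:n // 2]
pupil = (x**2 + y**2 <= radius**2).astype(float)   # uniform, aberration-free circular pupil

# Autocorrelation of the pupil function computed with FFTs, normalized so that tau(0) = 1.
pupil_spectrum = np.fft.fft2(pupil)
autocorr = np.fft.ifft2(np.abs(pupil_spectrum) ** 2).real
otf = autocorr / autocorr[0, 0]

def overlap_fraction(v):
    """Fractional overlap area of two equal circles whose centers are v diameters apart."""
    v = np.clip(v, 0.0, 1.0)
    return (2.0 / np.pi) * (np.arccos(v) - v * np.sqrt(1.0 - v**2))

shifts = np.arange(0, 2 * radius + 1)     # pupil displacement in samples along x
v = shifts / (2.0 * radius)               # spatial frequency normalized by the cutoff
numeric = otf[0, shifts]
analytic = overlap_fraction(v)
max_error = np.max(np.abs(numeric - analytic))
print("largest deviation from the overlap formula:", max_error)
```

The residual deviation comes from the staircase edge of the sampled pupil and shrinks as the radius in samples grows. Beyond a displacement of one pupil diameter the two pupils no longer overlap, and the computed OTF is zero to within rounding error.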
i.e., the OTF is complex symmetric or Hermitian. Therefore, its real part is even and its
imaginary part is odd, i.e.,
$$
\mathrm{Re}\, \tau(\vec{v}_i) = \mathrm{Re}\, \tau(-\vec{v}_i) , \tag{1-15}
$$
and
$$
\mathrm{Im}\, \tau(\vec{v}_i) = -\mathrm{Im}\, \tau(-\vec{v}_i) . \tag{1-16}
$$
By inverse Fourier transforming Eq. (1-10), we can obtain the PSF according to
$$
\mathrm{PSF}(\vec{r}_i) = \int \tau(\vec{v}_i) \exp\left( -2\pi i\, \vec{v}_i \cdot \vec{r}_i \right) d\vec{v}_i . \tag{1-18}
$$
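For a radially symmetric pupil with a radially symmetric aberration, e.g., a circular
pupil aberrated by spherical aberration, the OTF and PSF Eqs. (1-10) and (1-18) yield
$$
\tau(v_i) = 2\pi \int_0^\infty \mathrm{PSF}(r_i)\, J_0(2\pi v_i r_i)\, r_i\, dr_i \tag{1-19}
$$
and
$$
\mathrm{PSF}(r_i) = 2\pi \int_0^\infty \tau(v_i)\, J_0(2\pi v_i r_i)\, v_i\, dv_i , \tag{1-20}
$$
respectively, where $J_0(\cdot)$ is the zeroth-order Bessel function of the first kind. The OTF is
evidently real in this case.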
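$$
S = \frac{I_a(0)}{I_u(0)} , \tag{1-21}
$$
$$
S = \frac{\left| \int A(\vec{r}_p) \exp\left[ i \Phi(\vec{r}_p) \right] d\vec{r}_p \right|^2}{\left[ \int A(\vec{r}_p)\, d\vec{r}_p \right]^2} . \tag{1-22}
$$
$$
0 \le S \le 1 . \tag{1-23}
$$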
The Strehl ratio may also be determined from the OTF of the system. By definition,
Since the PSF at any point is a real quantity, only the real part of the aberrated OTF
contributes to the integral, and the integral of its imaginary part must be zero. Hence, the
Strehl ratio is given by
$$
S = \frac{\int \mathrm{Re}\, \tau_a(\vec{v})\, d\vec{v}}{\int \tau_u(\vec{v})\, d\vec{v}} . \tag{1-26}
$$
Thus, the Strehl ratio may be obtained by integrating the real part of the measured
aberrated OTF over all spatial frequencies and dividing it by a similar integral of the
calculated unaberrated OTF.
The Strehl ratio gives a measure of the image quality in terms of the reduction in the
central irradiance due to the aberration in the system, including any defocus. Its value
being less than one is a consequence of the fact that the Huygens’ secondary spherical
wavelets on the reference sphere are not in phase due to the aberrations and, therefore,
they interfere nonconstructively at its center of curvature.
It can be shown that, for a given total power, the amplitude variations across the
pupil of an aberration-free system reduce the central irradiance, and any phase variations
(i.e., aberrations) further reduce it [2]. However, an irradiance reduced by phase
variations alone does not necessarily reduce any further if any amplitude variations are
also introduced. In fact, the amplitude variations can even increase this irradiance. For
example, the central value of a defocused PSF for a circular pupil decreases to zero as the
defocus aberration approaches one wave (see Section 4.4). The Huygens’ secondary
wavelets arriving at this point completely cancel each other. Hence, any amplitude
variations across the pupil will only help avoid complete cancellation and thereby
increase the central value. The maximum value of central irradiance is obtained when the
system is unapodized and unaberrated [1,2]. It is shown in Chapter 6 how a Gaussian
pupil, as in a Gaussian beam, yields a smaller central value.
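The cancellation described above can be reproduced with a small numerical sketch of Eq. (1-22). It assumes a circular pupil sampled in equal-area rings, a defocus phase $\Phi = 2\pi\rho^2$ (one wave at the pupil edge), and, purely as an illustration, a Gaussian amplitude $\exp(-\rho^2)$; the function name `strehl_ratio` and the sample count are choices made for this example.

```python
import numpy as np

def strehl_ratio(amplitude, phase):
    """Discrete form of Eq. (1-22): |sum A exp(i Phi)|^2 / (sum A)^2 over equal-area samples."""
    numerator = np.abs(np.sum(amplitude * np.exp(1j * phase))) ** 2
    return numerator / np.sum(amplitude) ** 2

# Sample u = rho^2 at midpoints: equal steps in u correspond to equal-area rings of a circular pupil.
m = 20000
u = (np.arange(m) + 0.5) / m

uniform = np.ones(m)                     # uniformly illuminated pupil
gaussian = np.exp(-u)                    # hypothetical Gaussian amplitude exp(-rho^2)
one_wave_defocus = 2.0 * np.pi * u       # Phi = 2*pi*rho^2, one wave of defocus at the pupil edge

s_free = strehl_ratio(uniform, np.zeros(m))
s_uniform = strehl_ratio(uniform, one_wave_defocus)
s_gaussian = strehl_ratio(gaussian, one_wave_defocus)
print(s_free, s_uniform, s_gaussian)
```

For the uniform pupil the wavelets cancel completely, whereas the Gaussian amplitude leaves a small but nonzero central value relative to its own unaberrated case. For this particular amplitude the ratio evaluates analytically to $1/(1 + 4\pi^2)$.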
The peak value of the aberrated irradiance distribution of the image of a point object
does not necessarily occur at the center of the reference sphere. However, the peak value
of an unaberrated image does occur at the center regardless of the apodization. The
Huygens’ secondary wavelets emanating from the spherical wavefront being equidistant
from this point are in phase. Hence, they interfere constructively, producing a maximum
possible value at this point.
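where the angular brackets $\langle \cdots \rangle$ indicate a spatial average over the amplitude-weighted
pupil, e.g.,
$$
\langle \Phi \rangle = \frac{\int A(\vec{r}_p)\, \Phi(\vec{r}_p)\, d\vec{r}_p}{\int A(\vec{r}_p)\, d\vec{r}_p} . \tag{1-28}
$$
Since $\langle \Phi \rangle$ is independent of $\vec{r}_p$, Eq. (1-27) can be written
$$
\begin{aligned}
S &= \left| \left\langle \exp\left[ i (\Phi - \langle \Phi \rangle) \right] \right\rangle \right|^2 \\
  &= \left\langle \cos(\Phi - \langle \Phi \rangle) \right\rangle^2 + \left\langle \sin(\Phi - \langle \Phi \rangle) \right\rangle^2 \\
  &\ge \left\langle \cos(\Phi - \langle \Phi \rangle) \right\rangle^2 ,
\end{aligned}
\tag{1-29}
$$
equality holding when $\Phi$ is zero across the pupil, in which case $S = 1$. For small
aberrations, expanding the cosine function in a power series and retaining the first two
terms, we obtain the Maréchal result generalized for an apodized pupil
$$
S \gtrsim \left( 1 - \sigma_\Phi^2 / 2 \right)^2 , \tag{1-30}
$$
where
$$
\sigma_\Phi^2 = \left\langle (\Phi - \langle \Phi \rangle)^2 \right\rangle \tag{1-31}
$$
is the variance of the phase aberration across the amplitude-weighted pupil. The quantity
$\sigma_\Phi$ is the standard deviation of the aberration. We will refer to it as the “sigma value” or
simply the “sigma” of the aberration.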
For small values of $\sigma_\Phi$, three approximate expressions have been used in the
literature:
$$
S_1 \simeq \left( 1 - \sigma_\Phi^2 / 2 \right)^2 , \tag{1-32}
$$
$$
S_2 \simeq 1 - \sigma_\Phi^2 , \tag{1-33}
$$
and
$$
S_3 \simeq \exp\left( -\sigma_\Phi^2 \right) . \tag{1-34}
$$
The first is the Maréchal formula [3], the second is the commonly used expression
obtained when the term in $\sigma_\Phi^4$ in the first is neglected [4,5], and the third is an empirical
expression giving a better fit to the actual numerical results for various aberrations [6]. Just
as $S_1 > S_2$ by $\sigma_\Phi^4 / 4$, similarly, $S_3 > S_1$ by approximately the same amount. The simplest
expression to use is, of course, $S_2$, according to which $\sigma_\Phi^2$ gives the drop in the Strehl
ratio. We note that, for a pupil of any shape, the Strehl ratio for a small aberration does
not depend on its type but only on its variance across the apodized pupil. For a high-quality
imaging system, a typical value of the Strehl ratio desired is 0.8, corresponding to
a wave aberration with a sigma of $\sigma_w = \lambda / 14$, where $\sigma_w = (\lambda / 2\pi)\, \sigma_\Phi$.
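The three expressions are easy to compare numerically. The sketch below evaluates them for $\sigma_w = \lambda/14$ and, as one assumed test case, compares them with the exact Strehl ratio for pure defocus $\Phi = a\rho^2$ over a uniform circular pupil, for which Eq. (1-22) gives $[\sin(a/2)/(a/2)]^2$ and Eq. (1-31) gives $\sigma_\Phi = a/\sqrt{12}$. The choice of defocus as the test aberration and the function name are illustrative only.

```python
import numpy as np

def strehl_approximations(sigma_phi):
    """Return (S1, S2, S3) of Eqs. (1-32) to (1-34) for a phase sigma in radians."""
    s1 = (1.0 - sigma_phi**2 / 2.0) ** 2
    s2 = 1.0 - sigma_phi**2
    s3 = np.exp(-sigma_phi**2)
    return s1, s2, s3

sigma_w = 1.0 / 14.0                      # sigma of the wave aberration, in waves
sigma_phi = 2.0 * np.pi * sigma_w         # sigma of the phase aberration, in radians
s1, s2, s3 = strehl_approximations(sigma_phi)

# Exact Strehl ratio for defocus Phi = a*rho^2 over a uniform circular pupil (assumed test case).
a = sigma_phi * np.sqrt(12.0)
s_exact = (np.sin(a / 2.0) / (a / 2.0)) ** 2

print(f"S1 = {s1:.4f}, S2 = {s2:.4f}, S3 = {s3:.4f}, exact (defocus) = {s_exact:.4f}")
```

For this aberration the ordering $S_2 < S_1 < S_3$ holds, with $S_2$ very close to 0.8 and $S_3$ closest to the exact value.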
1.4 ABERRATION BALANCING
In geometrical optics, we mix one aberration with another in order to minimize the
variance of the ray distribution in an image plane. For example, when we minimize the
variance by combining the primary spherical aberration with defocus aberration by
considering the ray distribution in a defocused image plane, the smallest spot, called the
circle of least confusion, has a radius that is 1/4 of its value in the Gaussian image plane
[7]. Similarly, when astigmatism is combined with defocus, the circle of least confusion
has a diameter equal to half the length of the line image in the Gaussian image plane. In
the case of coma, the ray distribution is asymmetric about the Gaussian image point and,
therefore, its centroid does not lie at this point. The centroid shift is equivalent to
introducing a wavefront tilt, or balancing coma with tilt.
Based on diffraction, the best image for small aberrations is the one for which the
variance of the wave aberration is minimum so that its Strehl ratio is maximum. Since the
value of variance depends on the shape of and the amplitude across the pupil, the value of
the balancing aberration also depends on those factors. Thus, for example, the value of
defocus for balancing spherical aberration for an annular pupil is different than that for a
circular pupil. Similarly, its value for a Gaussian circular pupil, as in the case of a circular
Gaussian beam, is different than that for a uniform circular pupil. The process of
balancing a higher-order aberration with one or more aberrations of the same and/or
lower orders to minimize the variance is called aberration balancing. Thus, for example,
secondary spherical aberration is balanced with primary spherical aberration and defocus,
and secondary coma is balanced with primary coma and tilt.
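A minimal numerical sketch of this variance minimization is given below for primary spherical aberration $\rho^4$ balanced with defocus $b\rho^2$ over a uniformly illuminated pupil. The obscuration ratio `eps`, the function name, and the sampling scheme are assumptions made for this example. The balancing coefficient follows from a least-squares argument: the variance of $\rho^4 + b\rho^2$ is smallest when $b$ equals minus the covariance of the two terms divided by the variance of $\rho^2$.

```python
import numpy as np

def balancing_defocus(eps=0.0, m=200000):
    """Defocus coefficient b (per unit spherical aberration) minimizing the variance of
    rho^4 + b*rho^2 over a uniform pupil with obscuration ratio eps (eps = 0 is circular)."""
    # Equal-area samples: u = rho^2 is uniformly distributed between eps^2 and 1.
    u = eps**2 + (1.0 - eps**2) * (np.arange(m) + 0.5) / m
    spherical = u**2                      # rho^4
    defocus = u                           # rho^2
    cov = np.mean(spherical * defocus) - spherical.mean() * defocus.mean()
    b = -cov / defocus.var()
    sigma_unbalanced = spherical.std()
    sigma_balanced = (spherical + b * defocus).std()
    return b, sigma_unbalanced, sigma_balanced

b_circ, su_circ, sb_circ = balancing_defocus(eps=0.0)
b_ann, su_ann, sb_ann = balancing_defocus(eps=0.5)
print("circular pupil: b =", b_circ, " sigma ratio =", su_circ / sb_circ)
print("annular pupil (eps = 0.5): b =", b_ann)
```

For the circular pupil the optimum is an equal and opposite amount of defocus, which reduces the standard deviation by a factor of 4. For the annular pupil with an obscuration ratio of 0.5 the optimum defocus coefficient is $-1.25$, illustrating that the balancing aberration depends on the shape of the pupil.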
The balanced aberrations for a system with a certain shape of the pupil form the basis
of determining the orthogonal polynomial aberrations for the analysis of wavefronts
across the given pupil. The Zernike circle polynomials, for example, are the orthogonal
polynomial aberrations for a system with a circular pupil that represent the balanced
classical aberrations for such a system.
1.5 SUMMARY
The diffraction image of an isoplanatic incoherent object is given by the convolution
of its Gaussian image and the PSF. In the spatial frequency domain, the spectrum of the
image is given by the product of the OTF and the spectrum of the Gaussian image. The
image is obtained by inverse Fourier transforming its spectrum.
The variance of an aberration of a certain order can be reduced by mixing it with one
or more aberrations of lower order, thereby improving the Strehl ratio. The process of
mixing one aberration with others in this manner is called aberration balancing. The
polynomial aberrations used for wavefront analysis are not only orthogonal across the
pupil of a system, but also represent balanced classical aberrations for it.
Chapter 2
Optical Wavefronts and Their Aberrations
2.1 INTRODUCTION
The position and the size of the Gaussian image of an object formed by an optical
imaging system are determined by using its Gaussian imaging equations. We have stated in
Chapter 1 that the quality of the diffraction image depends on the aberrations of the
system. A spherical wave originating at a point object is incident on the system. The
image formed by the system is aberration free and perfect if the wave exiting from the
system is also spherical. In this case, the rays originating at the point object and traced
through the system all pass through the Gaussian image point.
If the optical wavefront exiting from the exit pupil is not spherical, its optical
deviations from a spherical form represent its wave aberrations. These wave aberrations
play a fundamental role in determining the quality of the aberrated image. The rays traced
from the object point through the system, instead of passing through the Gaussian image
point, intersect the image plane in its vicinity. The distance of the point of intersection of
a ray in the image plane from the Gaussian image point is called the transverse ray
aberration, and the distribution of the rays is referred to as the spot diagram. In this
chapter, we define the wave and ray aberrations and give a relationship between them.
We relate the longitudinal defocus of an image to the defocus wave aberration, and its
wavefront tilt to the wavefront tilt aberration. Next, the possible aberrations of an
imaging system that is rotationally symmetric about its optical axis are described. The
aberration function of the system is expanded in a power series of the object and pupil
coordinates, and primary (or Seidel), secondary (or Schwarzschild), and tertiary
aberrations are introduced [1]. We also discuss briefly how the aberrations may be
observed using a Twyman–Green interferometer and what the fringe pattern of a primary
or Seidel aberration looks like. A short summary of the chapter is given at the end.
Figure 2-1. (a) Imaging of an on-axis point object P0 by an optical imaging system
consisting of two lenses L1 and L2. OA is the optical axis. The Gaussian image is at
P0′. AS is the aperture stop; its image by L1 is the entrance pupil EnP, and its image
by L2 is the exit pupil ExP. CR0 is the axial chief ray, and MR0 is the axial marginal
ray. (b) Imaging of an off-axis point object P. The Gaussian image is at P′. CR is the
off-axis chief ray, and MR is the off-axis marginal ray.
An aperture in the system that physically limits the solid angle of the rays from a
point object the most is called the aperture stop (AS). For an extended (i.e., a nonpoint)
object, it is customary to consider the aperture stop as the limiting aperture for the axial
point object, and to determine vignetting, or blocking of some rays, by this stop for off-
axis object points. The object is assumed to be placed to the left of the system so that
light initially travels from left to right. The image of the stop by surfaces that precede it in
the sense of light propagation, i.e., by surfaces that lie between it and the object, is called
the entrance pupil (EnP). When observed from the object side, the entrance pupil appears
to limit the rays entering the system to form the image of the object. Similarly, the image
of the aperture stop by surfaces that follow it, i.e., by surfaces that lie between it and the
image, is called the exit pupil (ExP). The object rays reaching its image appear to be
limited by the exit pupil. Since the entrance and exit pupils are images of the stop by the
surfaces that precede and follow it, respectively, the two pupils are conjugates of each
other for the whole system, i.e., if one pupil is considered as the object, the other is its
image formed by the system.
An object ray passing through the center of the aperture stop and appearing to pass
through the centers of the entrance and exit pupils is called the chief (or the principal) ray
(CR). An object ray passing through the edge of the aperture stop is called a marginal ray
(MR). The rays lying between the center and the edge of the aperture, and, therefore,
appearing to lie between the center and edge of the entrance and exit pupils, are called
zonal rays.
It is possible that the stop of a system may also be its entrance and/or exit pupil. For
example, a stop placed to the left of a lens is also its entrance pupil. Similarly, a stop
placed to the right of a lens is also its exit pupil. Finally, a stop placed at a single thin lens
is both its entrance and exit pupils.
The optical path length of a ray in a medium of refractive index n is equal to n times
its geometrical path length. Consider rays from a point object traced through the system
up to the exit pupil such that each one travels exactly the same optical path length. The
ray passing through the center of the pupil is called the chief ray, and represents the
reference ray with respect to which the optical path lengths of the other rays are
compared. The surface passing through the end points of the rays is called the system
wavefront, and it represents a surface of constant phase for the point object under
consideration. If the wavefront is spherical, with its center of curvature at the Gaussian
image point, we say that the image is perfect. The rays transmitted by the system have
equal optical lengths in propagating from P to P′, and they all pass through P′. If,
however, the actual wavefront deviates from this spherical wavefront, called the
Gaussian reference sphere, we say that the image is aberrated. The rays reaching the
Gaussian reference sphere do not travel the same optical path length, and they intersect
the Gaussian image plane in the vicinity of P′. The optical deviations (i.e., the
geometrical deviations times the refractive index $n_i$ of the image space) of the wavefront
from a Gaussian reference sphere are called wave aberrations. The wave aberration of a
ray at a point on the reference sphere where the ray meets it is equal to the optical
deviation of the wavefront along that ray from the Gaussian reference sphere. It
represents the difference between the optical path lengths of the ray under consideration
and the chief ray in traveling from the point object to the reference sphere. Accordingly,
the wave aberration associated with the chief ray is zero. Since the optical path lengths of
the rays from the reference sphere to the Gaussian image point are equal, the wave
aberration of a ray is also equal to the difference between its optical path length from the
point object P to the Gaussian image point P′ and that of the chief ray.
Figure 2-2. Perfect imaging of a point object P by an optical system at its Gaussian
image point P′.
The wave aberration of a ray is positive if it has to travel an extra optical path length,
compared to the chief ray, in order to reach the Gaussian reference sphere. Figures 2-3a
and 2-3b illustrate the reference sphere S and the aberrated wavefront W for on-axis and
off-axis point objects, respectively. The reference sphere, which is centered at the
Gaussian image point P0′ in Figure 2-3a or P′ in Figure 2-3b, and the wavefront pass
through the center O of the exit pupil. The wave aberration $n_i \bar{Q}Q$ of a general ray GR0
or GR, where $n_i$ is the refractive index of the image space, as shown in the figures, is
numerically positive. The coordinate system is also illustrated in these figures. We choose
a right-hand Cartesian coordinate system such that the optical axis lies along the z axis.
The object, entrance pupil, exit pupil, and Gaussian image lie in mutually parallel planes
that are perpendicular to this axis. Figure 2-4 illustrates the coordinate systems in the
object, exit pupil, and image planes. The origin of the coordinate system lies at O and the
Gaussian image plane lies at a distance zg from it along the z axis.
We assume that a point object such as P lies along the x axis. (There is no loss of
generality because of this since the system is rotationally symmetric about the optical
axis.) The zx plane containing the optical axis and the point object is called the
tangential or the meridional plane. The corresponding Gaussian image point P′ lying in
the Gaussian image plane along its x axis also lies in the tangential plane. This may be
seen by consideration of a tangential object ray and Snell’s law, according to which the
incident and the refracted (or reflected) rays at a surface lie in the same plane. The chief
ray always lies in the tangential plane. The plane normal to the tangential plane but
containing the chief ray is called the sagittal plane. As the chief ray bends when it is
refracted or reflected at an optical surface, so does the sagittal plane. It should be evident
that only the chief ray lies in both the tangential and sagittal planes, because it lies along
the line of intersection of these two planes.
Figure 2-3a. Aberrated wavefront for an on-axis point object. The reference sphere
S of radius of curvature R is centered at the Gaussian image point P0′. The
wavefront W and reference sphere pass through the center O of the exit pupil ExP.
A right-hand Cartesian coordinate system showing x, y, and z axes is illustrated,
where the z axis is along the optical axis OA of the imaging system. Angular
rotations α, β, and γ about the three axes are also indicated. CR0 is the chief ray,
and a general ray GR0 is shown intersecting the Gaussian image plane at P0″.
Figure 2-3b. Aberrated wavefront for an off-axis point object. The reference sphere
S of radius of curvature R is centered at the Gaussian image point P′. The value of
R in this figure is slightly larger than its value in Figure 2-3a. GR is a general ray
intersecting the Gaussian image plane at the point P″. By definition, the chief ray
(not shown) passes through O, but it may or may not pass through P′.
Figure 2-4. Right-hand coordinate system in object, exit pupil, and image planes.
The optical axis of the system is along the z axis, and the off-axis point object P is
assumed to be along the x axis, thus making the zx plane the tangential plane.
Consider an image ray such as GR in Figure 2-3b passing through a point Q with
coordinates (x, y, z) on the reference sphere of radius of curvature R centered at the image
point. We let W(x, y) represent its wave aberration $n \bar{Q}Q$, because z is related to x and y
by virtue of Q being on the reference sphere. It can be shown that the ray intersects the
Gaussian image plane at a point P″ whose coordinates with respect to the Gaussian
image point P′ are approximately given by [1,2]
$$(x_i, y_i) = \frac{R}{n}\left(\frac{\partial W}{\partial x}, \frac{\partial W}{\partial y}\right) , \tag{2-1}$$
where n is the refractive index of the image space. When R and zg differ only slightly, as
in Figure 2-3b, we may replace R with zg. Note that in the case of an axial point object,
R = zg. [Equation (2-1)
has been derived by Mahajan [1], Born and Wolf [2], and Welford [3]. Note, however,
that Welford uses a sign convention for the wave aberration that is opposite to ours.]
The displacement P0′P0″ in Figure 2-3a (or P′P″ in Figure 2-3b) of a ray from the
Gaussian image point is called its geometrical or transverse ray aberration, and its
coordinates $(x_i, y_i)$ in the Gaussian image plane relative to the Gaussian image point are
called its ray aberration components. Since a ray is normal to a wavefront, the ray
aberration depends on the shape of the wavefront and, therefore, on its geometrical path
difference from the reference sphere. The division of W by n in Eq. (2-1) converts the
optical path length difference into geometrical path length difference. When an image is
formed in free space, as is often the case in practice, then n = 1. The angle δ ≈ P0′P0″/R
between the ideal ray QP0′ and the actual ray QP0″ is called the angular ray aberration.
The distribution of rays in an image plane is called the ray spot diagram.
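As an illustration of Eq. (2-1), the following minimal sketch estimates the transverse ray aberration of a single ray from a given wave aberration function by central differences. The function names, the primary spherical aberration W = A_s r**4 used as the example, and all numerical values are assumptions made for this illustration and do not come from the text.

```python
def ray_aberration(W, x, y, R, n=1.0, step=1e-6):
    """Transverse ray aberration (x_i, y_i) = (R / n) * grad W, as in Eq. (2-1).

    W is a callable W(x, y); its gradient is estimated by central differences.
    """
    dW_dx = (W(x + step, y) - W(x - step, y)) / (2.0 * step)
    dW_dy = (W(x, y + step) - W(x, y - step)) / (2.0 * step)
    return R / n * dW_dx, R / n * dW_dy

# Hypothetical example: primary spherical aberration W = A_s * r**4
A_s = 1e-9   # assumed coefficient, in mm**-3
R = 100.0    # assumed radius of curvature of the reference sphere, in mm

def W_spherical(x, y):
    return A_s * (x * x + y * y) ** 2

# Ray through the pupil point (10 mm, 0); analytically x_i = 4 * R * A_s * x**3
x_i, y_i = ray_aberration(W_spherical, 10.0, 0.0, R)
```

Repeating the calculation over a grid of pupil points would give the ray distribution, i.e., the spot diagram, for the assumed aberration.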
Note that the tangential rays, i.e., those lying in the zx plane, lie along the x axis of the
exit pupil plane and thus correspond to θ = 0 or π. Similarly, the sagittal rays, i.e., those
lying in a plane orthogonal to the tangential plane but containing the chief ray, lie along
the y axis of the exit pupil plane and thus correspond to θ = π/2 or 3π/2.
Figure 2-5. Circular exit pupil of radius a of an imaging system, and Cartesian and
polar coordinates (x, y) and (r, θ), respectively, of a point Q on the pupil.
$$W(r) = \frac{n}{2}\left(\frac{1}{z} - \frac{1}{R}\right) r^2 , \tag{2-3}$$
where z and R are the radii of curvature of the reference sphere S and the spherical
wavefront W centered at P1 and P2, respectively, passing through the center O of the exit
pupil, and r is the distance of Q1 from the optical axis. We note that the defocus wave
aberration is proportional to $r^2$. If z ≈ R, then Eq. (2-3) may be written as follows:
$$W(r) \simeq -\frac{n\Delta}{2R^2}\, r^2 , \tag{2-4}$$
where Δ = z − R is called the longitudinal defocus. We note that the defocus wave
aberration and the longitudinal defocus have numerically opposite signs.
[Figure 2-6: exit pupil ExP with the reference sphere S centered at P1 and the spherical
wavefront W centered at P2.]
A defocus aberration is also introduced if the image is observed in a plane other than
the Gaussian image plane. Consider, for example, an imaging system forming an
aberration-free image at the Gaussian image point P2 (and not at P1, as in Figure 2-6).
Thus, the wavefront at the exit pupil is spherical passing through its center O with its
center of curvature at P2. Let the image be observed in a defocused plane passing through
a point P1, which lies on the line joining O and P2. For the observed image at P1 to be
aberration free, the wavefront at the exit pupil must be spherical with its center of
curvature at P1 . Such a wavefront forms the reference sphere with respect to which the
aberration of the actual wavefront must be defined. The aberration of the wavefront at a
point Q1 on the reference sphere is given by Eqs. (2-3) and (2-4).
If the exit pupil is circular with a radius a, then Eq. (2-4) may be written
$$W(\rho) = B_d\, \rho^2 , \tag{2-5}$$
where ρ = r/a is the normalized radial coordinate and
$$B_d \simeq -\frac{n\Delta}{8F^2} \tag{2-6}$$
represents the peak value of the defocus aberration with F = R/2a as the focal ratio or
the f-number of the image-forming light cone. Note that a positive value of $B_d$ implies a
negative value of Δ. Thus, an imaging system having a positive value of defocus
aberration $B_d$ can be made defocus free if the image is observed in a plane lying farther
from the plane of the exit pupil, compared to the defocused image plane, by a distance
$8B_dF^2/n$. Similarly, a positive defocus aberration of $B_d \simeq -n\Delta/8F^2$ is introduced into
the system if the image is observed in a plane lying closer to the plane of the exit pupil,
compared to the defocus-free image plane, by a distance |Δ|.
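The conversion between the peak defocus aberration and the longitudinal defocus in Eq. (2-6) can be sketched as follows. The function names, the wavelength, and the focal ratio are assumptions chosen only for illustration.

```python
def defocus_peak(delta, F, n=1.0):
    """Eq. (2-6): B_d ~ -n * Delta / (8 * F**2)."""
    return -n * delta / (8.0 * F ** 2)

def longitudinal_defocus(B_d, F, n=1.0):
    """Inverse of Eq. (2-6): Delta ~ -8 * B_d * F**2 / n."""
    return -8.0 * B_d * F ** 2 / n

# Hypothetical case: a quarter wave of defocus in an f/10 beam in air
wavelength = 0.5e-3   # assumed wavelength, in mm (0.5 micrometer)
F = 10.0              # assumed focal ratio
delta = longitudinal_defocus(wavelength / 4.0, F)
```

With these assumed numbers, a positive quarter wave of defocus corresponds to a longitudinal defocus of about -0.1 mm, in line with the opposite signs of $B_d$ and Δ.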
$$x_i = R\beta \tag{2-7}$$
and
$$W(r, \theta) = n_i \beta\, r \cos\theta , \tag{2-8}$$
respectively, where P1P2 = $x_i$ and (r, θ) are the polar coordinates of the point Q1. Both
the wave and ray aberrations are numerically positive in Figure 2-7.
Once again, for a system with a circular exit pupil of radius a, Eq. (2-8) may be
written
$$W(\rho, \theta) = B_t\, \rho \cos\theta , \tag{2-9}$$
where
$$B_t = n_i a \beta \tag{2-10}$$
is the peak value of the wavefront tilt aberration. Note that a positive value of $B_t$ implies
that the wavefront tilt angle is also positive. Thus, if an aberration-free wavefront is
centered at P2, then an observation with respect to P1 as the origin implies that we have
introduced a tilt aberration of $B_t \rho \cos\theta$.
Figure 2-7. Wavefront tilt. The spherical wavefront W is centered at P2 while the
reference sphere S is centered at P1, such that the two spherical surfaces are tilted
with respect to each other by a small angle β = P1P2/R, where R is their radius of
curvature. The ray Q2P2 is normal to the wavefront at Q2.
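The relation between the peak tilt aberration and the lateral shift of the image point can be sketched in the same way. Here n stands for the image-space refractive index, F = R/2a is the focal ratio defined with Eq. (2-6), and the function names and numbers are hypothetical.

```python
def tilt_peak(beta, a, n=1.0):
    """Eq. (2-10): B_t = n * a * beta, with the tilt angle beta in radians."""
    return n * a * beta

def image_shift(B_t, F, n=1.0):
    """Ray aberration x_i = R * beta of Eq. (2-7), rewritten as 2 * F * B_t / n."""
    return 2.0 * F * B_t / n

# Assumed geometry: pupil radius 10 mm, reference sphere radius 200 mm, so F = 10
a, R = 10.0, 200.0
beta = 1e-5                       # assumed tilt angle, in radians
B_t = tilt_peak(beta, a)          # peak tilt aberration, in mm
x_i = image_shift(B_t, R / (2.0 * a))
```

The second function follows from substituting β = $B_t$/(na) and R = 2aF into Eq. (2-7), so it returns the same shift as Rβ.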
2.6 Aberration Function of a Rotationally Symmetric System
If (x, y) are the coordinates of a pupil point, the aberration function consists of terms
formed from three rotational invariants, namely, $p^2 + q^2$, $x^2 + y^2$, and $px + qy$. If $\vec{h}$
and $\vec{r}$ are the position vectors of the object and pupil points, then the rotational invariants
are $\vec{h}\cdot\vec{h}$, $\vec{r}\cdot\vec{r}$, and $\vec{h}\cdot\vec{r}$, or $h^2$, $r^2$, and $hr\cos\theta$, where $h = |\vec{h}|$, $r = |\vec{r}|$, and θ is the polar
angle of $\vec{r}$ with respect to that of $\vec{h}$. It is convenient to consider the aberration function
in terms of the image height $h'$, for example, when the object is at infinity, and let θ be
the angle for the image point. The image height is, of course, related to the object height
by the Gaussian magnification. We now expand the aberration function $W(h'; r, \theta)$ in a
power series in terms of the three rotational invariants $h'^2$, $r^2$, and $h'r\cos\theta$ in the form
$$W(h'; r, \theta) = \sum_{l=0}^{\infty}\sum_{p=0}^{\infty}\sum_{m=0}^{\infty} C_{lpm}\,(h'^2)^l (r^2)^p (h' r\cos\theta)^m = \sum_{l=0}^{\infty}\sum_{p=0}^{\infty}\sum_{m=0}^{\infty} C_{lpm}\, h'^{2l+m} r^{2p+m} \cos^m\theta , \tag{2-11}$$
where $C_{lpm}$ are the expansion coefficients, and l, p, and m are positive integers, including
zero. There is no term with sin θ dependence. The aberration terms are called the
classical aberrations.
It is evident that the degree of each term of the series in the object or image and pupil
coordinates is even and given by 2(l + p + m). Any terms for which p = 0 = m so that
2p + m = 0, i.e., those terms that do not depend on r and, therefore, vary only as $h'^{2l}$,
must add up to zero since the aberration associated with the chief ray (for which r = 0) is
zero. Thus, the zero-degree term $C_{000}$ and terms such as $C_{100}h'^2$, $C_{200}h'^4$, etc., do not
appear in Eq. (2-11). There is also no term of second degree. For example, the term
$C_{010}r^2$ represents defocus aberration that is independent of $h'$. It has the implication that
the image is being observed in a plane other than the Gaussian image plane. Similarly, the
term $C_{001}h'r\cos\theta$ represents a wavefront tilt aberration that depends on $h'$. It has the
implication that the image height is not $h'$. Hence, a power series expansion of the
aberration function may be written
$$W(h'; r, \theta) = \sum_{l=0}^{\infty}\sum_{n=1}^{\infty}\sum_{m=0}^{n} {}_{2l+m}a_{nm}\, h'^{2l+m} r^{n} \cos^m\theta , \tag{2-12}$$
where
$$n = 2p + m \tag{2-13}$$
is a positive integer not including zero, and ${}_{2l+m}a_{nm}$ are the expansion coefficients. From
Eq. (2-13), we note that n − m = 2p ≥ 0 and even. The order i of an aberration term,
which is equal to its degree in the object and pupil coordinates, is given by
$$i = 2l + m + n . \tag{2-14}$$
The number of terms $N_i$ of a certain order i, i.e., the number of integer sets satisfying Eq.
(2-14) with n − m ≥ 0 and even, is given by
$$N_i = (i + 2)(i + 4)/8 . \tag{2-15}$$
This number includes a term with n = 0 = m , called piston aberration, although such a
term does not constitute an aberration (since it corresponds to the chief ray, which has a
zero aberration associated with it). It is included here for completeness, as interferometric
data based on the aberrations of a system may have a piston component.
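The count in Eq. (2-15) can be checked by direct enumeration. The following sketch lists the index sets (l, n, m) that satisfy Eq. (2-14) with n − m ≥ 0 and even, including the piston term n = 0 = m; the function names are hypothetical.

```python
def terms_of_order(i):
    """All (l, n, m) with 2*l + m + n == i, 0 <= m <= n, and n - m even (Eq. 2-14)."""
    terms = []
    for n in range(i + 1):
        for m in range(n % 2, n + 1, 2):   # m has the parity of n, so n - m is even
            rest = i - n - m               # this must equal 2*l
            if rest >= 0 and rest % 2 == 0:
                terms.append((rest // 2, n, m))
    return terms

def count_formula(i):
    """Eq. (2-15): N_i = (i + 2)(i + 4) / 8, piston term included."""
    return (i + 2) * (i + 4) // 8

fourth_order = terms_of_order(4)   # five primary aberrations plus piston
```

For i = 4 the enumeration returns six index sets, i.e., the five primary aberrations and the piston term (l, n, m) = (2, 0, 0).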
The fourth order (i = 4), i.e., the primary or the Seidel aberration function consisting
of a sum of five fourth-order terms, can be written
$$W_P(r, \theta; h') = {}_0a_{40}\, r^4 + {}_1a_{31}\, h' r^3\cos\theta + {}_2a_{22}\, h'^2 r^2\cos^2\theta + {}_2a_{20}\, h'^2 r^2 + {}_3a_{11}\, h'^3 r\cos\theta . \tag{2-16}$$
Since the wave aberration W has dimensions of length, the dimensions of the coefficients
${}_ia_{jk}$ are inverse length cubed. Since the ray aberrations are related to the wave
aberrations by a spatial derivative [see Eq. (2-1)], their degree is lower by one.
Accordingly, the primary aberrations are also referred to as the third-order ray
aberrations. The wave aberration coefficients ${}_0a_{40}$, ${}_1a_{31}$, ${}_2a_{22}$, ${}_2a_{20}$, and ${}_3a_{11}$ represent
the coefficients of spherical aberration, coma, astigmatism, field curvature, and
distortion, respectively.
From Eq. (2-16), we note that only spherical aberration is independent of the object
or image height. The field curvature, in its dependence on the pupil coordinates (r, θ), is
like the defocus aberration discussed in Section 2.4. However, the field curvature
represents a defocus aberration that depends on the field $h'$, thus requiring a curved
image surface for its elimination. On the other hand, pure defocus aberration, such as that
produced by observing the image in a plane other than the Gaussian image plane, is
independent of the field $h'$. Similarly, distortion depends on the pupil coordinates as a
wavefront tilt. However, distortion depends on the field as $h'^3$, but the wavefront tilt
produced by a tilted element in the system would be independent of $h'$.
The sixth order (i = 6), i.e., the secondary or the Schwarzschild aberration function,
can be written
$$W_S(r, \theta; h') = {}_0a_{60}\, r^6 + {}_1a_{51}\, h' r^5\cos\theta + {}_2a_{42}\, h'^2 r^4\cos^2\theta + {}_3a_{33}\, h'^3 r^3\cos^3\theta + {}_2a_{40}\, h'^2 r^4 + {}_3a_{31}\, h'^3 r^3\cos\theta + {}_4a_{22}\, h'^4 r^2\cos^2\theta + {}_4a_{20}\, h'^4 r^2 + {}_5a_{11}\, h'^5 r\cos\theta . \tag{2-17}$$
Four of the nine aberration terms (excluding piston) correspond to l = 0. They are the
secondary spherical aberration (${}_0a_{60}\, r^6$), secondary coma (${}_1a_{51}\, h' r^5\cos\theta$), secondary
astigmatism (${}_2a_{42}\, h'^2 r^4\cos^2\theta$) (wings or Flügelfehler), and arrows or Pfeilfehler
(${}_3a_{33}\, h'^3 r^3\cos^3\theta$). The remaining five corresponding to l ≠ 0 and called lateral
aberrations are similar to the corresponding primary aberrations except for their
dependence on the image height $h'$. The lateral spherical aberration ${}_2a_{40}\, h'^2 r^4$ is also
called the oblique spherical aberration.
Aberration terms of the eighth (i = 8) order are called the tertiary aberrations. There
are fourteen aberration terms of this order, excluding piston. Only five of them have the
dependencies on pupil coordinates that are different from those of the secondary or
primary aberrations. Four have dependence on these coordinates as for the secondary
aberrations, and the remaining five have the same dependence as the primary aberrations.
Their difference lies in their dependence on the image height.
For a given image height, the aberration function may be written in the reduced form
$$W(\rho, \theta) = \sum_{n=1}^{\infty}\sum_{m=0}^{n} a_{nm}\, \rho^n \cos^m\theta . \tag{2-18}$$
The radial coordinate r has been normalized to ρ = r/a. It has the advantage that, since
0 ≤ ρ ≤ 1 and |cos θ| ≤ 1, the coefficient $a_{nm}$ of a classical aberration $\rho^n\cos^m\theta$
represents the peak value or half of the peak-to-valley (P-V) value of the corresponding
aberration term, depending on whether m is even or odd, respectively. The indices n and
m represent the powers of ρ and cos θ, respectively. The index m also represents the
minimum power of $h'$ dependence of a coefficient (with the exception of the tilt and defocus
terms, for which n = m = 1 and n = 2, m = 0, respectively). The maximum power of $h'$
dependence is given by i − n. Moreover, the powers of $h'$ dependence are even or odd
according to whether n and m are even or odd, respectively. The number of terms through
a certain order i in the reduced power-series expansion of the aberration function given
by Eq. (2-18) is also given by Eq. (2-15). This number includes a nonaberration piston
term corresponding to n = 0 = m. The terms of Eq. (2-12) through a certain order i
correspond to those terms of Eq. (2-18) for which n + m ≤ i.
Thus, the primary aberration function can be written
$$W_P(\rho, \theta) = a_{11}\rho\cos\theta + a_{20}\rho^2 + a_{22}\rho^2\cos^2\theta + a_{31}\rho^3\cos\theta + a_{40}\rho^4 , \tag{2-20}$$
where
$$a_{11} = {}_3a_{11}\, h'^3 a , \tag{2-21a}$$
$$a_{20} = {}_2a_{20}\, h'^2 a^2 , \tag{2-21b}$$
$$a_{22} = {}_2a_{22}\, h'^2 a^2 , \tag{2-21c}$$
$$a_{31} = {}_1a_{31}\, h' a^3 , \tag{2-21d}$$
and
$$a_{40} = {}_0a_{40}\, a^4 . \tag{2-21e}$$
Comparing the distortion term $a_{11}\rho\cos\theta$ with the wavefront tilt aberration given by
Eq. (2-9), we note that while the two are similar in their dependence on the pupil
coordinates, their coefficients depend on the image height differently. The distortion
coefficient $a_{11}$ varies with $h'$ as $h'^3$, but the tilt coefficient $B_t$ is independent of $h'$.
Similarly, comparing the field curvature term $a_{20}\rho^2$ with the defocus wave aberration
given by Eq. (2-5), we note that their dependence on the pupil coordinates is the same.
However, whereas the field curvature coefficient $a_{20}$ varies with $h'$ as $h'^2$, the defocus
coefficient $B_d$ is independent of $h'$.
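The change from the coefficients of Eq. (2-16) to the peak coefficients of Eq. (2-20) can be checked numerically. The sketch below evaluates both forms at the same pupil point; the dictionary keys, function names, and coefficient values are hypothetical and serve only as an illustration.

```python
import math

def primary_aberration(r, theta, h, c):
    """Eq. (2-16): primary wave aberration at pupil point (r, theta) for image height h."""
    cos_t = math.cos(theta)
    return (c["0a40"] * r ** 4
            + c["1a31"] * h * r ** 3 * cos_t
            + c["2a22"] * h ** 2 * r ** 2 * cos_t ** 2
            + c["2a20"] * h ** 2 * r ** 2
            + c["3a11"] * h ** 3 * r * cos_t)

def peak_coefficients(c, h, a):
    """Eqs. (2-21a)-(2-21e): coefficients of the normalized form for pupil radius a."""
    return {
        "a11": c["3a11"] * h ** 3 * a,
        "a20": c["2a20"] * h ** 2 * a ** 2,
        "a22": c["2a22"] * h ** 2 * a ** 2,
        "a31": c["1a31"] * h * a ** 3,
        "a40": c["0a40"] * a ** 4,
    }

def primary_aberration_normalized(rho, theta, p):
    """Eq. (2-20): the same function in terms of rho = r / a."""
    cos_t = math.cos(theta)
    return (p["a11"] * rho * cos_t + p["a20"] * rho ** 2
            + p["a22"] * rho ** 2 * cos_t ** 2
            + p["a31"] * rho ** 3 * cos_t + p["a40"] * rho ** 4)

# Hypothetical coefficients (inverse length cubed) and geometry, for illustration only
seidel = {"0a40": 1e-7, "1a31": 2e-7, "2a22": 3e-7, "2a20": -1e-7, "3a11": 5e-8}
h_image, a_pupil = 2.0, 10.0
peaks = peak_coefficients(seidel, h_image, a_pupil)
```

Because each peak coefficient has dimensions of length, it can be quoted directly in units of wavelength, which is one practical reason for working with the normalized form.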
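The aberration function through the sixth order, i.e., for i ≤ 6 or n + m ≤ 6, may be
written
$$W_S(\rho, \theta) = a_{11}\rho\cos\theta + a_{20}\rho^2 + a_{22}\rho^2\cos^2\theta + a_{31}\rho^3\cos\theta + a_{33}\rho^3\cos^3\theta + a_{40}\rho^4 + a_{42}\rho^4\cos^2\theta + a_{51}\rho^5\cos\theta + a_{60}\rho^6 , \tag{2-22}$$
where
$$a_{11} = \left({}_3a_{11}\, h'^3 + {}_5a_{11}\, h'^5\right) a , \tag{2-23a}$$
$$a_{20} = \left({}_2a_{20}\, h'^2 + {}_4a_{20}\, h'^4\right) a^2 , \tag{2-23b}$$
$$a_{22} = \left({}_2a_{22}\, h'^2 + {}_4a_{22}\, h'^4\right) a^2 , \tag{2-23c}$$
$$a_{31} = \left({}_1a_{31}\, h' + {}_3a_{31}\, h'^3\right) a^3 , \tag{2-23d}$$
$$a_{33} = {}_3a_{33}\, h'^3 a^3 , \tag{2-23e}$$
$$a_{40} = \left({}_0a_{40} + {}_2a_{40}\, h'^2\right) a^4 , \tag{2-23f}$$
$$a_{42} = {}_2a_{42}\, h'^2 a^4 , \tag{2-23g}$$
$$a_{51} = {}_1a_{51}\, h' a^5 , \tag{2-23h}$$
and
$$a_{60} = {}_0a_{60}\, a^6 . \tag{2-23i}$$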
Written in this form, the aberration function has nine aberration terms through the sixth
order or through the secondary aberrations. Since the dependence of an aberration term
on the image height $h'$ is contained in the aberration coefficient $a_{nm}$, it should be noted
that the primary aberrations (including distortion and field curvature terms) in Eqs. (2-23)
are not the same as those in Eq. (2-20), because they contain aberration components not
only of the fourth degree, but of the sixth degree as well. For example, $a_{40}\rho^4$ consists of
spherical and lateral spherical aberrations ${}_0a_{40}\, a^4\rho^4$ and ${}_2a_{40}\, h'^2 a^4\rho^4$.
Similarly, the aberration function through the eighth order can be written. Once
again, an aberration term of this expansion will not be necessarily the same as a
corresponding term of the expansions of Eq. (2-20) or (2-22). We add that it is convenient
to refer to the aberration terms of a power-series expansion as the classical aberrations,
e.g., a term in $\rho^4$ may be referred to as the classical primary spherical aberration.
The two reflected beams interfere in the region of their overlap. Lens L′ is used to
observe the interference pattern on a screen S placed in a plane containing the image of L
Figure 2-8. Twyman–Green interferometer for testing a lens system L. A laser beam
is split into two parts by a beam splitter BS. The reflected part is incident on a plane
mirror M1 and the transmitted part is incident on L. F is the image-space focal
point of L, and C is the center of curvature of a spherical mirror M2. The
interfering beams are focused by a lens L′, and the interference pattern is observed
on a screen S.
If the reference beam has a uniform phase and the test beam has a phase distribution
Φ(x, y), and if their amplitudes are equal to each other, the irradiance distribution of their
interference pattern is given by
$$I(x, y) = I_0 \left| 1 + \exp[i\Phi(x, y)] \right|^2 = 2I_0 \{ 1 + \cos[\Phi(x, y)] \} , \tag{2-24}$$
where $I_0$ is the irradiance when only one beam is present. Of course, the phase and the
wave aberration distributions are related to each other according to
$$\Phi(x, y) = \frac{2\pi}{\lambda} W(x, y) , \tag{2-25}$$
where λ is the wavelength of the laser beam. The irradiance has a maximum value equal
to $4I_0$ at those points for which
$$\Phi(x, y) = 2\pi n \tag{2-26a}$$
and a minimum value equal to zero at those points for which
$$\Phi(x, y) = 2\pi(n + 1/2) , \tag{2-26b}$$
where n is a positive or a negative integer, including zero. Each fringe in the interference
pattern represents a certain value of n, which in turn corresponds to the locus of (x, y)
points with phase aberration given by Eq. (2-26a) for a bright fringe and Eq. (2-26b) for a
dark fringe. If the test beam is aberration free [Φ(x, y) = 0], then the interference pattern
has a uniform irradiance of $4I_0$. Figure 2-9 shows interferograms of six waves of a
primary aberration. In Figure 2-9a for spherical aberration and 2-9d for astigmatism, a
certain amount of defocus has also been added. In Figure 2-9c, a certain amount of tilt has
been added to the coma aberration.
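The fringe conditions above can be illustrated with a small calculation. The sketch below evaluates the irradiance of Eq. (2-24) from a wave aberration value and, for an assumed pure defocus aberration of a whole number of waves at the pupil edge, lists the normalized radii of the bright fringes. The function names and the choice of defocus as the test aberration are assumptions of this example, not a description of Figure 2-9.

```python
import math

def fringe_irradiance(W, wavelength, I0=1.0):
    """Eqs. (2-24) and (2-25): I = 2 * I0 * [1 + cos(2 * pi * W / wavelength)]."""
    phase = 2.0 * math.pi * W / wavelength
    return 2.0 * I0 * (1.0 + math.cos(phase))

def bright_fringe_radii(peak_waves):
    """Radii rho at which W = peak_waves * wavelength * rho**2 is a whole number of waves."""
    return [math.sqrt(k / peak_waves) for k in range(int(peak_waves) + 1)]

wavelength = 0.5                                      # assumed, in micrometers
I_bright = fringe_irradiance(0.0, wavelength)         # W = 0, bright fringe
I_dark = fringe_irradiance(wavelength / 2.0, wavelength)   # W = half a wave, dark fringe
radii = bright_fringe_radii(6)                        # six waves of defocus at rho = 1
```

For the assumed six waves of defocus, the bright fringes are circles whose radii grow as the square root of the fringe number, so they crowd together toward the edge of the pupil.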
2.8 SUMMARY
A perfect image of a point object is formed by an imaging system when a spherical
wave diverging from the object and incident on the system is converted by it into a
spherical wave converging to the Gaussian image point. If rays from the object point are
traced through the system, they all travel exactly the same optical path length from the
object point to the Gaussian image point, and they all pass through this image point.
When the wavefront exiting from the exit pupil of the system is not spherical, its optical
deviations from the spherical form represent the wave aberrations, and an aberrated
image is formed. The rays intersect the image plane in the vicinity of the Gaussian image
point, and their distribution is called the spot diagram. The wave and the ray aberrations
are related to each other by a spatial derivative, as in Eq. (2-1).
The interference pattern formed by two beams, one of which has traveled through an
aberrated system, is shown in Figure 2-9 for primary aberrations, as an illustration of
interferograms.
References
2. M. Born and E. Wolf, Principles of Optics, 7th ed., Cambridge University Press,
New York (1999).
4. D. Malacara, Ed., Optical Shop Testing, 3rd ed., Wiley, New York (2007).
Chapter 3
Orthonormal Polynomials and Gram–Schmidt
Orthonormalization
3.1 INTRODUCTION
In optical design, we trace rays from a point object through a system to determine the
aberrations of the wavefront at its exit pupil. In optical testing, we determine the
aberrations of a system or an element interferometrically. In both cases, we obtain
aberration numbers at an array of points. We can calculate the PSF or other associated
image quality measures from these numbers. We can also calculate the aberration
variance, which, in turn, gives some idea of the image quality. However, such measures
do not shed light on the content of the aberration function. To understand the nature of
this function, we want to know the amount of certain familiar aberrations discussed in
Chapter 2 that are present, so that perhaps something can be done about them in
improving the design or the system under test.
1
Ú F ( x , y ) F j ' ( x , y ) dx dy = d jj ' , (3-1)
A pupil j
where A is the area of the pupil inscribed inside a unit circle, the integration is carried out
over the area of the pupil, and d jj' is a Kronecker delta. Let F1 = 1. Since it is
independent of the coordinates x and y, it is referred to as the piston polynomial. As a
result, the mean value of each polynomial, except for j = 1, is zero, i.e.,
$$\langle F_j(x, y) \rangle = \frac{1}{A} \int_{\text{pupil}} F_j(x, y)\, dx\, dy = 0 \quad \text{for } j \neq 1 , \tag{3-2}$$
as may be seen by letting $j' = 1$ in Eq. (3-1). The angular brackets on the left-hand side
of Eq. (3-2) indicate a mean value over the area of the pupil. Similarly, the mean square
value of a polynomial is unity, i.e.,
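$$\langle F_j^2(x, y) \rangle = \frac{1}{A} \int_{\text{pupil}} F_j^2(x, y)\, dx\, dy = 1 , \tag{3-3}$$

$$W(x, y) = \sum_{j=1}^{\infty} a_j F_j(x, y) , \tag{3-4}$$

$$\frac{1}{A} \int_{\text{pupil}} W(x, y)\, F_{j'}(x, y)\, dx\, dy = \frac{1}{A} \sum_{j=1}^{\infty} a_j \int_{\text{pupil}} F_j(x, y)\, F_{j'}(x, y)\, dx\, dy = a_{j'} ,$$

or

$$a_j = \frac{1}{A} \int_{\text{pupil}} W(x, y)\, F_j(x, y)\, dx\, dy . \tag{3-5}$$

$$\langle W(x, y) \rangle = \sum_{j=1}^{\infty} a_j \langle F_j(x, y) \rangle = a_1 , \tag{3-6}$$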
where we have utilized Eq. (3-2) for the mean value of a polynomial. The mean square
value of the aberration function is given by
$$\langle W^2(x, y) \rangle = \frac{1}{A} \int_{\text{pupil}} \sum_{j=1}^{\infty} a_j F_j(x, y) \sum_{j'=1}^{\infty} a_{j'} F_{j'}(x, y)\, dx\, dy = \sum_{j=1}^{\infty} a_j^2 , \tag{3-7}$$
where we have utilized the orthonormality Eq. (3-1) and Eq. (3-3) for the mean square
value of a polynomial. The variance $\sigma_W^2$ of the aberration function is accordingly given
by

$$\sigma_W^2 = \langle W^2(x, y) \rangle - \langle W(x, y) \rangle^2 = \sum_{j=2}^{\infty} a_j^2 , \tag{3-8}$$

where $\sigma_W$ is the standard deviation or the sigma value of the aberration function. Since
the mean value of a polynomial (except piston) is zero, each expansion coefficient $a_j$
represents the standard deviation of the corresponding polynomial term. The variance of
the aberration function is simply the sum of the variances of the polynomial terms.
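The relations in Eqs. (3-5), (3-6), and (3-8) can be checked numerically. The sketch below is illustrative only: it assumes a circular unit pupil, uses the orthonormal Zernike defocus and primary spherical polynomials as the F-polynomials, and picks arbitrary coefficient values. The helper name `disc_mean` is hypothetical.

```python
import math

def z4(rho):
    # Orthonormal Zernike defocus polynomial on the unit disc
    return math.sqrt(3.0) * (2.0 * rho**2 - 1.0)

def z11(rho):
    # Orthonormal Zernike primary spherical aberration polynomial
    return math.sqrt(5.0) * (6.0 * rho**4 - 6.0 * rho**2 + 1.0)

def disc_mean(f, n=2000):
    """Mean of a radially symmetric f(rho) over the unit disc,
    i.e. 2 * integral_0^1 f(rho) rho d(rho), by the midpoint rule."""
    h = 1.0 / n
    return sum(2.0 * f((k + 0.5) * h) * (k + 0.5) * h for k in range(n)) * h

# Arbitrary illustrative coefficients (piston, defocus, spherical)
a1, a4, a11 = 0.3, -0.2, 0.1

def w(rho):
    return a1 + a4 * z4(rho) + a11 * z11(rho)

mean_w = disc_mean(w)                                    # Eq. (3-6): equals a1
variance = disc_mean(lambda r: w(r) ** 2) - mean_w ** 2  # Eq. (3-8): a4^2 + a11^2
recovered_a4 = disc_mean(lambda r: w(r) * z4(r))         # Eq. (3-5): recovers a4

print(mean_w, variance, recovered_a4)
```

The mean reproduces the piston coefficient, and the variance reproduces the sum of the squares of the remaining coefficients, to within the quadrature error.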
In the orthonormality Eq. (3-1) and those that follow it, we have assumed a
uniformly illuminated pupil, i.e., the amplitude across it is constant. If that is not the case,
as for example in a Gaussian pupil where the amplitude across the pupil varies as a
Gaussian function, then the amplitude function must be included in all the integrations
over the pupil (see Chapter 6). The quantity A in such cases would also be an amplitude-
weighted area of the pupil. Thus, the integrations, indicated by the angular brackets
implying a mean value, would be over an amplitude-weighted area of the pupil.
In practice, the number of polynomials used in the expansion will be truncated such
that the resulting variance obtained from Eq. (3-8) equals the actual value obtained from
the function W ( x , y ) within some specified tolerance. The Strehl ratio of an image for
small aberrations can be estimated from the variance according to Eq. (1-34).
It is easy to show that the expansion coefficients $a_j$ given by Eq. (3-5) and obtained
as a consequence of the orthogonality of the polynomials $F_j(x, y)$ represent a least-
squares fit of the aberration function $W(x, y)$. Suppose we estimate the function with
only J polynomials. Thus we write

$$\hat{W}(x, y) = \sum_{j=1}^{J} a_j F_j(x, y) . \tag{3-9}$$

The mean square difference between the function and its estimate is

$$E = \frac{1}{A} \int_{\text{pupil}} \left[ W(x, y) - \hat{W}(x, y) \right]^2 dx\, dy = \frac{1}{A} \int_{\text{pupil}} \left[ W(x, y) - \sum_{j=1}^{J} a_j F_j(x, y) \right]^2 dx\, dy . \tag{3-10}$$

It is a minimum when

$$\frac{\partial E}{\partial a_{j'}} = 0 , \tag{3-11}$$

or

$$\frac{1}{A} \int_{\text{pupil}} \left[ W(x, y) - \sum_{j=1}^{J} a_j F_j(x, y) \right] F_{j'}(x, y)\, dx\, dy = 0 . \tag{3-12}$$
Using the orthonormality Eq. (3-1), Eq. (3-12) yields Eq. (3-5). The variance of the
estimated aberration function is given by
$$\hat{\sigma}_W^2 = \langle \hat{W}^2(x, y) \rangle - \langle \hat{W}(x, y) \rangle^2 = \sum_{j=2}^{J} a_j^2 . \tag{3-13}$$
It should be evident that each polynomial coefficient provides a best fit to the
aberration function. The fit, of course, improves as more and more polynomials are added
until there is no more improvement. We point out that, in practice, the aberration function
data is available at a discrete set of points. Hence, there will be some error in the
coefficient values, because the orthonormality Eq. (3-1) will not be satisfied exactly. This
error decreases as the number of data points increases.
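The effect of discrete sampling can be illustrated with a small experiment. The sketch below is a hypothetical example, not a prescribed procedure: it assumes a circular unit pupil, takes $W = \rho^4$, and estimates the defocus coefficient as a sample mean of $W Z_4$ over a square grid of points clipped to the disc. The exact value, $1/(2\sqrt{3})$, follows from Eq. (3-5).

```python
import math

def z4(x, y):
    # Orthonormal Zernike defocus polynomial in Cartesian form
    return math.sqrt(3.0) * (2.0 * (x * x + y * y) - 1.0)

def w(x, y):
    # Illustrative aberration function: rho^4
    return (x * x + y * y) ** 2

def discrete_a4(n):
    """Estimate a_4 as the sample mean of W * Z4 over the points of an
    n x n grid (cell centers) that fall inside the unit disc."""
    total, count = 0.0, 0
    for i in range(n):
        x = -1.0 + (2.0 * i + 1.0) / n
        for j in range(n):
            y = -1.0 + (2.0 * j + 1.0) / n
            if x * x + y * y <= 1.0:
                total += w(x, y) * z4(x, y)
                count += 1
    return total / count

exact_a4 = 1.0 / (2.0 * math.sqrt(3.0))   # continuous result from Eq. (3-5)
errors = {n: abs(discrete_a4(n) - exact_a4) for n in (8, 32, 128)}
for n, err in errors.items():
    print(n, err)
```

The printed errors indicate how closely the sampled inner product approaches the continuous one for each grid density.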
$$G_1 = Z_1 = 1 , \tag{3-14}$$

$$G_{j+1} = Z_{j+1} + \sum_{k=1}^{j} c_{j+1,k} F_k , \tag{3-15}$$

$$F_{j+1} = \frac{G_{j+1}}{\| G_{j+1} \|} = \frac{G_{j+1}}{\left[ \dfrac{1}{A} \displaystyle\int_{\text{pupil}} G_{j+1}^2\, dx\, dy \right]^{1/2}} , \tag{3-16}$$

where

$$c_{j+1,k} = - \frac{1}{A} \int_{\text{pupil}} Z_{j+1} F_k\, dx\, dy \tag{3-17a}$$

$$\equiv - \langle Z_{j+1} F_k \rangle . \tag{3-17b}$$
It is evident from Eq. (3-14) that $F_1 = 1$. Substituting Eq. (3-17b) into Eq. (3-15) and
substituting the result thus obtained into Eq. (3-16), we may write

$$F_{j+1} = N_{j+1} \left[ Z_{j+1} - \sum_{k=1}^{j} \langle Z_{j+1} F_k \rangle F_k \right] , \tag{3-18}$$

where $N_{j+1}$ is a normalization constant so that the polynomials are orthonormal over the
pupil under consideration, i.e., they satisfy the orthonormality condition of Eq. (3-1).
Thus, the F-polynomials are obtained recursively, starting with F1 = 1. It is clear from Eq.
(3-18) that each F-polynomial of a certain order is a linear combination of the circle
polynomials of no more than that order. It should be evident that the F-polynomials are
ordered in the same manner as the basis polynomials and that there is a one-to-one
correspondence between them.
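The recursion of Eqs. (3-14) to (3-18) can be sketched in a few lines. The example below is a minimal illustration under stated assumptions: the pupil is an annulus with an arbitrarily chosen obscuration ratio of 0.5, only the radially symmetric circle polynomials $Z_1$, $Z_4$, and $Z_{11}$ are used as the basis, and the inner products are evaluated by a midpoint rule. The names `pupil_mean` and `gram_schmidt` are hypothetical helpers.

```python
import math

EPS = 0.5  # obscuration ratio of the annular pupil (illustrative value)

def pupil_mean(f, n=2000):
    """Mean of a radially symmetric f(rho) over the annulus EPS <= rho <= 1."""
    h = (1.0 - EPS) / n
    total = 0.0
    for k in range(n):
        rho = EPS + (k + 0.5) * h      # midpoint rule in rho
        total += f(rho) * rho
    return 2.0 * total * h / (1.0 - EPS**2)

def z1(rho):
    return 1.0

def z4(rho):
    return math.sqrt(3.0) * (2.0 * rho**2 - 1.0)

def z11(rho):
    return math.sqrt(5.0) * (6.0 * rho**4 - 6.0 * rho**2 + 1.0)

def gram_schmidt(basis):
    """Build polynomials orthonormal over the pupil, following Eqs. (3-15)-(3-17)."""
    ortho = []
    for z in basis:
        prev = list(ortho)
        # Eq. (3-17): c = -<Z F_k> for every F_k obtained so far
        coeffs = [-pupil_mean(lambda r: z(r) * f(r)) for f in prev]

        def g(r, z=z, coeffs=coeffs, prev=prev):
            # Eq. (3-15): G = Z + sum_k c_k F_k
            return z(r) + sum(c * f(r) for c, f in zip(coeffs, prev))

        norm = math.sqrt(pupil_mean(lambda r: g(r) ** 2))
        # Eq. (3-16): F = G / ||G||
        ortho.append(lambda r, g=g, norm=norm: g(r) / norm)
    return ortho

f1, f4, f11 = gram_schmidt([z1, z4, z11])
print(f4(1.0), pupil_mean(lambda r: f4(r) * f11(r)))
```

For this pupil the second polynomial should agree with the closed form $\sqrt{3}\,(2\rho^2 - 1 - \epsilon^2)/(1 - \epsilon^2)$, which equals $\sqrt{3}$ at $\rho = 1$.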
Because of the biaxial symmetry of the pupils considered in this chapter and,
therefore, the symmetric limits of integration, the integral in Eq. (3-17a) is zero when the
integrand is an odd function of one or both integration variables. It should be evident that
a c-coefficient is zero unless the Z- and the G-polynomials have the same cosine or sine
dependence. If all of the c-coefficients in Eq. (3-15) are zero, then the F-polynomial has
the same form as the corresponding Zernike polynomial, except for its normalization.
The orthonormal F-polynomials represent the unit vectors of the space that span the
aberration function. They can be written in a matrix form according to
$$F_l(x, y) = \sum_{i=1}^{l} M_{li} Z_i(x, y) \quad \text{with} \quad M_{ll} = \frac{1}{\| G_l \|} . \tag{3-19}$$

While the diagonal elements of the M-matrix are simply equal to the normalization
constants of the G-polynomials [since there is no multiplier with the polynomial $Z_{j+1}$ in
Eq. (3-15)], there are no matrix elements above the diagonal because a polynomial $F_l$
consists of a linear combination of circle polynomials up to $Z_l$ only. The matrix is lower
triangular and the missing elements may be given a value of zero when multiplying a
Zernike column vector $(\ldots, Z_j, \ldots)$ to obtain the orthonormal column vector $(\ldots, F_j, \ldots)$.
It should be evident that the orthonormal polynomials for a noncircular pupil written in
terms of the circle polynomials immediately yield the elements of the conversion matrix
M.
where, for example, $\langle Z_j F_k \rangle$ represents the inner product of the Zernike polynomial $Z_j$
and the orthonormal polynomial $F_k$ over the pupil, i.e.,

$$\langle Z_j F_k \rangle = \frac{1}{A} \int_{\text{pupil}} Z_j(x, y)\, F_k(x, y)\, dx\, dy . \tag{3-21}$$
$$M C^{ZF} = 1 , \tag{3-22}$$

$$\langle Z_k F_i \rangle = \sum_{j=1}^{J} M_{ij} \langle Z_j Z_k \rangle = \sum_{j=1}^{J} \langle Z_k Z_j \rangle \left[ M_{ij} \right]^T , \tag{3-23}$$

where, for example, $[M_{ij}]^T$ is the transpose of the matrix with elements $M_{ij}$ (obtained
by interchanging the rows and columns of the matrix M). Equation (3-23) can be written
in the matrix form as

$$C^{ZF} = C^{ZZ} M^T , \tag{3-24}$$

$$M C^{ZZ} M^T = 1 . \tag{3-25}$$

Letting

$$M = \left( Q^T \right)^{-1} , \tag{3-26}$$

$$Q^T Q = C^{ZZ} . \tag{3-27}$$
Solving Eq. (3-27) for the matrix Q , the conversion matrix M can be obtained from Eq.
(3-26). While the matrix M is lower triangular, the matrix Q is upper triangular.
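The nonrecursive route of Eqs. (3-25) to (3-27) amounts to a Cholesky factorization of $C^{ZZ}$. The sketch below is an illustration under assumptions rather than a definitive implementation: it uses an annular pupil with an arbitrarily chosen obscuration ratio of 0.5, the radially symmetric basis $Z_1$, $Z_4$, $Z_{11}$, and small hand-written helpers (`cholesky_lower`, `invert_lower`) in place of a linear algebra library.

```python
import math

EPS = 0.5  # obscuration ratio of the annular pupil (illustrative value)

def pupil_mean(f, n=2000):
    """Mean of a radially symmetric f(rho) over the annulus EPS <= rho <= 1."""
    h = (1.0 - EPS) / n
    total = sum(f(EPS + (k + 0.5) * h) * (EPS + (k + 0.5) * h) for k in range(n))
    return 2.0 * total * h / (1.0 - EPS**2)

zernikes = [
    lambda r: 1.0,                                                  # Z1
    lambda r: math.sqrt(3.0) * (2.0 * r**2 - 1.0),                  # Z4
    lambda r: math.sqrt(5.0) * (6.0 * r**4 - 6.0 * r**2 + 1.0),     # Z11
]

# C^{ZZ}: matrix of inner products <Z_j Z_k> over the pupil
c_zz = [[pupil_mean(lambda r, a=a, b=b: a(r) * b(r)) for b in zernikes]
        for a in zernikes]

def cholesky_lower(c):
    """Lower-triangular L with L L^T = c (c symmetric positive definite)."""
    n = len(c)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = c[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low

def invert_lower(low):
    """Inverse of a lower-triangular matrix by forward substitution."""
    n = len(low)
    inv = [[0.0] * n for _ in range(n)]
    for col in range(n):
        for i in range(col, n):
            rhs = 1.0 if i == col else 0.0
            s = rhs - sum(low[i][k] * inv[k][col] for k in range(col, i))
            inv[i][col] = s / low[i][i]
    return inv

q_transpose = cholesky_lower(c_zz)   # Q^T is lower triangular, so Q is upper triangular
m = invert_lower(q_transpose)        # Eq. (3-26): M = (Q^T)^{-1}
for row in m:
    print(row)
```

The second row of M gives the annular defocus polynomial as $M_{21} Z_1 + M_{22} Z_4$, with $M_{22} = 1/(1 - \epsilon^2)$ expected for this pupil.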
3.6 SUMMARY
The content of an aberration function can be determined by expanding it in terms of a
complete set of polynomials that are orthogonal over its domain and have the form of
familiar aberrations, such as those discussed in Chapter 2. The Zernike circle
polynomials, for example, are not only orthogonal over a circular pupil, but they also
represent balanced classical aberrations, as discussed in Chapter 4. It is advantageous to
use the polynomials in their orthonormal form so that the piston coefficient represents the
mean value of the aberration function and the other expansion coefficients represent the
standard deviations of the corresponding polynomial aberration terms. As illustrated by
Eq. (3-5), the value of an expansion coefficient is independent of the number of
polynomials used in the expansion. Moreover, each coefficient yields a least-squares fit to
the aberration function. The variance of the aberration function is given by the sum of the
squares of the coefficients (other than the piston), as in Eq. (3-8).
Figure 3-1. Unit pupils inscribed inside a unit circle. (a) annulus of obscuration ratio
$\epsilon$, (b) hexagon with a side of unity, (c) ellipse of aspect ratio b, (d) rectangle of half
width a, (e) square of half width $1/\sqrt{2}$, and (f) slit of half width of unity.
Given a set of polynomials that are orthonormal over a certain domain, those that are
orthonormal over another domain can be obtained from them by the recursive Gram–
Schmidt orthonormalization process. They can also be obtained by a nonrecursive matrix
approach. Each new polynomial obtained is a linear combination of the basis
polynomials, as indicated by Eq. (3-18). We use the Zernike circle polynomials as the
basis functions to obtain the polynomials that are orthonormal over an annular, Gaussian,
hexagonal, elliptical, rectangular, or a square pupil. The slit pupil is a limiting case of a
rectangular pupil whose one dimension is negligibly small compared to the other. The
concept of a unit pupil is emphasized so that the farthest point or points on a pupil are at a
distance of unity from its center. It has the advantage that the coefficient of a single
aberration term represents its peak value. Thus, in each case the pupil is inscribed inside a
unit circle.
4.10 Circle Polynomials and Their Relationships with Classical Aberrations
4.10.1 Introduction
4.10.3 Astigmatism
4.10.7 Strehl Ratio for Seidel Aberrations with and without Balancing
References
Chapter 4
Systems with Circular Pupils
4.1 INTRODUCTION
Optical systems generally have a circular pupil. The imaging elements of such
systems also have a circular boundary. Therefore, they are also represented by circular
pupils in fabrication and testing. As a result, the Zernike circle polynomials have been in
widespread use since Zernike introduced them in his phase contrast method for testing
circular mirrors [1]. They are used in optical design and testing to understand the
aberration content of a wavefront. They have also been used for analyzing the wavefront
aberrations introduced by atmospheric turbulence on a wave propagating through it [2].
We start this chapter with a brief discussion of the point-spread function (PSF) and
the optical transfer function (OTF) of an aberration-free system with a circular pupil. We
then consider the effect of primary aberrations on the Strehl ratio of an image. Since the
Strehl ratio for small aberrations depends on the variance of an aberration, we balance a
classical aberration of a certain order with those of lower orders to reduce its variance.
The utility of the Zernike circle polynomials stems from the fact that they are not only
orthogonal over a circular pupil, but they also uniquely represent the balanced classical
aberrations yielding minimum variance over the pupil [3–6]. Because of their
orthogonality, when a circular wavefront is expanded in terms of them, the value of a
Zernike expansion coefficient is independent of the number of polynomials used in the
expansion. Hence, one or more polynomial terms can be added or subtracted without
affecting the other coefficients. The piston coefficient represents the mean value of the
aberration function, and the variance of the function is given simply by the sum of the
squares of the other expansion coefficients.
We refer to the pupil in the $(\rho, \theta)$ coordinates as a unit circular pupil in the sense of a unit
disc. For a uniformly illuminated pupil with an aberration function $\Phi(\vec{r}_p)$ and power $P_{ex}$
exiting from it, the pupil function of the system can be written

$$P(\vec{r}_p) = \begin{cases} A(\vec{r}_p) \exp\left[ i \Phi(\vec{r}_p) \right] , & |\vec{r}_p| \le a \\ 0 , & \text{otherwise} , \end{cases} \tag{4-3}$$

where

$$A(\vec{r}_p) = \left( P_{ex} / S_{ex} \right)^{1/2} \tag{4-4}$$

Figure 4-1. (a) Circular exit pupil of radius a of an imaging system. (b) Circular
pupil as a unit disc. The polar coordinates of a point Q are $(r_p, \theta)$ in (a) and $(\rho, \theta)$
in (b).
$$I(r, \theta_i) = \frac{1}{\pi^2} \left| \int_0^1 \int_0^{2\pi} \exp\left[ i \Phi(\rho, \theta) \right] \exp\left[ - \pi i r \rho \cos(\theta - \theta_i) \right] \rho\, d\rho\, d\theta \right|^2 , \tag{4-5}$$

For an aberration-free system, i.e., for a spherical wavefront exiting from the pupil so
that $\Phi(\rho, \theta) = 0$, Eq. (4-5) reduces to

$$I(r, \theta_i) = \frac{1}{\pi^2} \left| \int_0^1 \int_0^{2\pi} \exp\left[ - \pi i r \rho \cos(\theta_p - \theta_i) \right] \rho\, d\rho\, d\theta_p \right|^2 . \tag{4-6}$$
Noting that

$$\int_0^{2\pi} \exp(i x \cos\alpha)\, d\alpha = 2\pi J_0(x) , \tag{4-7}$$

where $J_0(\cdot)$ is the zero-order Bessel function of the first kind, Eq. (4-6) reduces to

$$I(r) = 4 \left[ \int_0^1 J_0(\pi r \rho)\, \rho\, d\rho \right]^2 . \tag{4-8}$$

Noting further that

$$\int_0^1 J_0(x \rho)\, \rho\, d\rho = J_1(x) / x , \tag{4-9}$$

Eq. (4-8) yields

$$I(r) = \left[ \frac{2 J_1(\pi r)}{\pi r} \right]^2 , \tag{4-10}$$

where $J_1(\cdot)$ is the first-order Bessel function of the first kind. Integrating over a circle of
radius $r_c$ (in units of $\lambda F$), it can be shown that it contains a fractional power given by

$$P(r_c) = 1 - J_0^2(\pi r_c) - J_1^2(\pi r_c) . \tag{4-11}$$
Figure 4-2 shows a plot of Eq. (4-10), called the Airy pattern. It consists of a bright
spot at the center, called the Airy disc, surrounded by dark and bright diffraction rings.
The fractional power is also plotted in Figure 4-2a. The Airy disc has a radius of 1.22
(in units of $\lambda F$) and contains 83.8% of the total light, as may be seen by letting
$r_c = 1.22$ in Eq. (4-11). The center of the pattern lies at the Gaussian image point.
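The Airy disc numbers quoted above can be reproduced with a short calculation. The sketch below is an illustration under assumptions: it takes the encircled power to be the standard closed form $P(r_c) = 1 - J_0^2(\pi r_c) - J_1^2(\pi r_c)$, and it evaluates the Bessel functions from their integral representation with a hypothetical helper `bessel_j`, since the Python standard library has no Bessel functions.

```python
import math

def bessel_j(order, x, n=200):
    """J_order(x) from (1/pi) * integral_0^pi cos(order*t - x*sin(t)) dt,
    evaluated by the midpoint rule (very accurate for this periodic integrand)."""
    h = math.pi / n
    total = sum(math.cos(order * (k + 0.5) * h - x * math.sin((k + 0.5) * h))
                for k in range(n))
    return total * h / math.pi

def airy_irradiance(r):
    """Eq. (4-10): aberration-free PSF, r in units of lambda*F."""
    if r == 0.0:
        return 1.0
    x = math.pi * r
    return (2.0 * bessel_j(1, x) / x) ** 2

def encircled_power(rc):
    """Assumed closed form for the fractional power inside radius rc."""
    x = math.pi * rc
    return 1.0 - bessel_j(0, x) ** 2 - bessel_j(1, x) ** 2

print(airy_irradiance(1.22), encircled_power(1.22))
```

The irradiance is very nearly zero at r = 1.22, the first dark ring, and the encircled power there is close to the quoted 83.8%.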
Figure 4-2. (a) Irradiance and encircled power distributions for an aberration-free
system with a circular pupil. (b) 2D PSF, called the Airy pattern.
4.3.2 OTF
From Eq. (2-11), the aberration-free OTF can be written
$$\tau(\vec{v}_i) = P_{ex}^{-1} \int A(\vec{r}_p)\, A(\vec{r}_p - \lambda R \vec{v}_i)\, d\vec{r}_p . \tag{4-12}$$

It is evident that the OTF represents the fractional area of overlap of two circles, each of
radius a, separated by a distance $\lambda R v_i$, where $v_i = |\vec{v}_i|$. From Figure 4-3, we note that the
area of overlap is given by four times the difference between the area of a sector of radius
a and cone angle $\beta$, and the area of the triangle OAB. Hence, the OTF can be written
$$\tau(v_i) = \frac{4}{S_{ex}} \left( \frac{\beta}{2\pi}\, \pi a^2 - \frac{1}{2}\, OA \cdot AB \right) . \tag{4-13}$$

$$\tau(v_i) = \frac{2}{\pi} \left( \beta - \sin\beta \cos\beta \right) \tag{4-14}$$

$$= \frac{2}{\pi} \left[ \cos^{-1} v - v \left( 1 - v^2 \right)^{1/2} \right] , \quad 0 \le v \le 1 . \tag{4-15}$$

Figure 4-3. Aberration-free OTF as the fractional area of overlap of two circles of
radius a whose centers are separated by a distance $\lambda R v_i$.
Figure 4-4 shows how the OTF varies with v. The integral of the aberration-free
OTF that enters into the calculation of the Strehl ratio from the real part of the complex
aberrated OTF [see Eq. (2-25)] is given by
$$\int_0^1 \tau(v)\, v\, dv = 1/8 . \tag{4-16}$$

$$\tau'(0) = - 4/\pi . \tag{4-17}$$
Although obtained from the aberration-free OTF, this slope is independent of any
aberration.
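Equations (4-15) to (4-17) are easy to verify numerically. The following sketch evaluates the aberration-free OTF of Eq. (4-15), integrates it with a midpoint rule, and estimates the slope at the origin by a forward difference. The step sizes are arbitrary choices.

```python
import math

def otf(v):
    """Eq. (4-15): aberration-free OTF of a circular pupil, 0 <= v <= 1."""
    if v >= 1.0:
        return 0.0
    return (2.0 / math.pi) * (math.acos(v) - v * math.sqrt(1.0 - v * v))

# Eq. (4-16): integral of tau(v) v dv over 0 <= v <= 1, midpoint rule
n = 20000
h = 1.0 / n
integral = sum(otf((k + 0.5) * h) * (k + 0.5) * h for k in range(n)) * h

# Eq. (4-17): slope at the origin from a small forward difference
dv = 1e-6
slope = (otf(dv) - otf(0.0)) / dv

print(integral, slope, -4.0 / math.pi)
```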
$$S = \frac{1}{\pi^2} \left| \int_0^1 \int_0^{2\pi} \exp\left[ i \Phi(\rho, \theta) \right] \rho\, d\rho\, d\theta \right|^2 . \tag{4-18}$$

[Figure 4-4: plot of the OTF $\tau$ versus $v$.]
$$\Phi(\rho) = B_d \rho^2 , \tag{4-19}$$

where the peak value $B_d$ of the phase aberration is related to the longitudinal defocus
according to

$$B_d \simeq - \frac{\pi}{4 \lambda F^2} (z - R) . \tag{4-20}$$

$$S = \left[ \frac{\sin(B_d / 2)}{B_d / 2} \right]^2 . \tag{4-21}$$

The Strehl ratio decreases as the aberration increases until it reaches a value of zero
when the aberration becomes $2\pi$ radians or one wave. As shown in Figure 4-5, it
fluctuates for increasing value of defocus, becoming zero when the aberration is an
integral number of waves. It should be evident that the defocused Strehl ratio represents
the axial irradiance of a focused beam.
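As a quick illustration of Eq. (4-21), the sketch below evaluates the defocused Strehl ratio with the defocus coefficient given in waves. The function name `strehl_defocus` and the choice of units are assumptions made for the example.

```python
import math

def strehl_defocus(bd_waves):
    """Eq. (4-21) with the peak defocus aberration B_d given in waves."""
    bd = 2.0 * math.pi * bd_waves      # convert waves to radians of phase
    if bd == 0.0:
        return 1.0
    half = bd / 2.0
    return (math.sin(half) / half) ** 2

for waves in (0.0, 0.25, 0.5, 1.0, 2.0):
    print(waves, strehl_defocus(waves))
```

A quarter wave of defocus gives a Strehl ratio of about 0.81, and the ratio vanishes (to rounding error) at integral numbers of waves.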
Figure 4-5. Strehl ratio S of a defocused beam, representing its axial irradiance,
where $B_d$ is the defocus aberration in units of wavelength.
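$$S_2 \simeq 1 - \sigma_\Phi^2 , \tag{4-22b}$$

and

$$S_3 \simeq \exp\left( - \sigma_\Phi^2 \right) , \tag{4-22c}$$

where

$$\sigma_\Phi^2 = \langle \Phi^2 \rangle - \langle \Phi \rangle^2 \tag{4-23}$$

is the variance of the phase aberration across the pupil. The mean and the mean square
values of the aberration are obtained from the expression

$$\langle \Phi^n \rangle = \pi^{-1} \int_0^1 \int_0^{2\pi} \Phi^n(\rho, \theta)\, \rho\, d\rho\, d\theta \tag{4-24}$$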
Table 4-1 gives the form as well as the standard deviation $\sigma_\Phi$ of a primary (or a
Seidel) aberration, where an aberration coefficient $A_i$ represents the peak value of the
aberration. It also lists the aberration tolerance, i.e., the value of the aberration coefficient
$A_i$, for a Strehl ratio of 0.8. This tolerance has been obtained by using the Strehl ratio
expression $S_2$, according to which the standard deviation for a Strehl ratio of 0.8 is given
by

$$\sigma_\Phi = \sqrt{0.2} \tag{4-25}$$

or

$$\sigma_w = \lambda / 14.05 , \tag{4-26}$$

where $\sigma_w$ is the sigma value of the wave aberration. The aberration tolerance listed in
Table 4-1 is for the wave (as opposed to the phase) aberration coefficient, as is customary
in optics. It should be understood that the tolerance numbers given are not accurate to the
second decimal place. They are listed as such for consistency only. We have used the
symbol $A_d$ for the coefficient of field curvature aberration, which varies quadratically
with the angle that a point object makes with the optical axis of the system. However, to
avoid confusion, we have used the symbol $B_d$ for representing the defocus wave
aberration, which is independent of the field angle but has the same dependence on pupil
coordinates as field curvature. Similarly, we have used the symbol $A_t$ for distortion,
which varies as the cube of the field angle. But, we will use the symbol $B_t$ to represent
the wavefront tilt, which is independent of the field angle but has the same dependence on
pupil coordinates as distortion.

Table 4-1. Standard deviation and aberration tolerance for primary aberrations.

| Aberration | $\Phi(\rho, \theta)$ | $\sigma_\Phi$ | $A_i$ for $S = 0.8$ |
|---|---|---|---|
| Spherical | $A_s \rho^4$ | $2 A_s / (3\sqrt{5}) = A_s / 3.35$ | $\lambda / 4.19$ |
$$\Phi(\rho) = A_s \rho^4 + B_d \rho^2 . \tag{4-27}$$

$$\langle \Phi \rangle = \frac{A_s}{3} + \frac{B_d}{2} \tag{4-28}$$

and

$$\langle \Phi^2 \rangle = \frac{A_s^2}{5} + \frac{B_d^2}{3} + \frac{A_s B_d}{2} . \tag{4-29}$$

$$\sigma_\Phi^2 = \frac{4 A_s^2}{45} + \frac{B_d^2}{12} + \frac{A_s B_d}{6} . \tag{4-30}$$

$$\frac{\partial \sigma_\Phi^2}{\partial B_d} = 0 , \tag{4-31}$$

and checking that it yields a minimum and not a maximum. Thus, we find that the
optimum value is $B_d = - A_s$, and the balanced aberration is given by

$$\Phi_{bs}(\rho) = A_s \left( \rho^4 - \rho^2 \right) . \tag{4-32}$$

Its standard deviation or sigma value is $A_s / (6\sqrt{5})$, which is a factor of 4 smaller than the
corresponding value $2 A_s / (3\sqrt{5})$ for $B_d = 0$. Since the sigma value has been reduced by a
factor of 4, its tolerance has been increased by the same factor. For example, S = 0.8 is
obtained in the Gaussian image plane for $A_s = \lambda / 4$. However, the same Strehl ratio is
obtained for $A_s = 1\lambda$ in a slightly defocused image plane such that $B_d = - \lambda$.
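The balancing step can be followed numerically. The sketch below evaluates the variance of Eq. (4-30) on a grid of defocus values to locate its minimum, then converts the sigma values into tolerances for a Strehl ratio of 0.8 using $S_2 = 1 - \sigma_\Phi^2$. The scan range and step are arbitrary, and the variable names are illustrative.

```python
import math

def variance(a_s, b_d):
    """Eq. (4-30): variance of A_s rho^4 + B_d rho^2 over the unit disc."""
    return 4.0 * a_s**2 / 45.0 + b_d**2 / 12.0 + a_s * b_d / 6.0

a_s = 1.0
# Scan defocus values from -2 to 2 in steps of 0.001 and keep the minimum
candidates = [k / 1000.0 for k in range(-2000, 2001)]
best_bd = min(candidates, key=lambda b: variance(a_s, b))

sigma_unbalanced = math.sqrt(variance(a_s, 0.0))      # 2 A_s / (3 sqrt 5)
sigma_balanced = math.sqrt(variance(a_s, best_bd))    # A_s / (6 sqrt 5)

# S2 = 0.8 gives sigma_phi = sqrt(0.2) rad, i.e. sigma_w in waves:
sigma_w_tol = math.sqrt(0.2) / (2.0 * math.pi)
tol_unbalanced = sigma_w_tol / sigma_unbalanced       # tolerance on A_s, in waves
tol_balanced = sigma_w_tol / sigma_balanced

print(best_bd, sigma_unbalanced / sigma_balanced, tol_unbalanced, tol_balanced)
```

The scan recovers $B_d = -A_s$, a sigma ratio of 4, and tolerances of about $\lambda/4.19$ without balancing and $0.955\lambda$ with it.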
Similarly, we balance astigmatism with defocus and coma with tilt. Table 4-2 lists
the form of a balanced primary aberration, its standard deviation, and its tolerance for a
Strehl ratio of 0.8, according to Eq. (4-22b). Also listed in the table is the location of the
diffraction focus, i.e., the point with respect to which the aberration variance is minimum
so that the Strehl ratio is maximum at it. The amount of balancing defocus is minus half
the amount of astigmatism, or the diffraction focus lies at a distance $4 F^2 A_a$ along the z
axis. The balancing tilt is minus two-thirds the amount of the coma. Thus, the maximum
Strehl ratio is obtained at a point that is displaced from the Gaussian image point by
$4 F A_c / 3$ but lies in the Gaussian image plane.

Table 4-2.

| Balanced aberration | $\Phi(\rho, \theta)$ | Diffraction focus* | $\sigma_\Phi$ | $A_i$ for $S = 0.8$ |
|---|---|---|---|---|
| Spherical | $A_s (\rho^4 - \rho^2)$ | $(0, 0, 8 F^2 A_s)$ | $A_s / (6\sqrt{5})$ | $0.955\lambda$ |
| Coma | $A_c (\rho^3 - 2\rho/3) \cos\theta$ | $(4 F A_c / 3, 0, 0)$ | $A_c / (6\sqrt{2})$ | $0.604\lambda$ |
| Astigmatism | $A_a \rho^2 (\cos^2\theta - 1/2) = (A_a / 2) \rho^2 \cos 2\theta$ | $(0, 0, 4 F^2 A_a)$ | $A_a / (2\sqrt{6})$ | $0.349\lambda$ |

*The diffraction focus coordinates are relative to the Gaussian image point.
For primary aberrations, $S_1$ and $S_2$ underestimate the true Strehl ratio S. $S_3$ gives a
better approximation for the true Strehl ratio than $S_1$ and $S_2$. The reason is that, for small
values of $\sigma_w$, it is larger than $S_1$ by approximately $\sigma_\Phi^4 / 4$. Of course, $S_1$ is larger than $S_2$
by $\sigma_\Phi^4 / 4$. The expression $S_3$ underestimates the true Strehl ratio only for coma and
astigmatism; it overestimates for the other aberrations. Numerical analysis shows that the
error, defined as $100 (1 - S_3 / S)$, is < 10% for S > 0.3 [5,7].
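The relative sizes of the three approximations can be checked directly. In the sketch below, $S_2$ and $S_3$ follow Eqs. (4-22b) and (4-22c); the form used for $S_1$, $(1 - \sigma_\Phi^2/2)^2$, is an assumption chosen to be consistent with the $\sigma_\Phi^4/4$ differences quoted above. The value of sigma is arbitrary.

```python
import math

def s1(var):
    # Assumed form of S1, consistent with S1 - S2 = sigma^4 / 4
    return (1.0 - var / 2.0) ** 2

def s2(var):
    # Eq. (4-22b)
    return 1.0 - var

def s3(var):
    # Eq. (4-22c)
    return math.exp(-var)

sigma = 0.3                 # standard deviation of the phase aberration, radians
var = sigma ** 2

print(s1(var), s2(var), s3(var))
print(s1(var) - s2(var), s3(var) - s1(var), sigma ** 4 / 4.0)
```

The first difference equals $\sigma_\Phi^4/4$ exactly, while the second matches it only to leading order in $\sigma_\Phi^2$.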
Rayleigh [8] showed that a quarter-wave of primary spherical aberration reduces the
irradiance at the Gaussian image point by 20%, i.e., the Strehl ratio for this aberration is
0.8. This result has brought forth Rayleigh's $\lambda/4$ rule; namely, that a Strehl ratio of
approximately 0.8 is obtained if the maximum absolute value of the aberration at any
point in the pupil is equal to $\lambda/4$. A variant of this definition is that an aberrated
wavefront that lies between two concentric spheres spaced a quarter-wave apart will give
a Strehl ratio of approximately 0.8. Thus, instead of $W_p = \lambda/4$, we require
$W_{p-v} = \lambda/4$, where $W_p$ is the peak absolute value and $W_{p-v}$ is the peak-to-valley (P-V)
value of the aberration. However, a Strehl ratio of 0.8 is obtained for $W_p = \lambda/4 = W_{p-v}$
for spherical aberration only. For other primary aberrations, distinctly different values of
$W_p$ and $W_{p-v}$ give a Strehl ratio of 0.8 [5,9]. Thus, it is advantageous to use $\sigma_w$ for
estimating the Strehl ratio. A Strehl ratio of $S \gtrsim 0.8$ is obtained for $\sigma_w \lesssim \lambda/14$.
When a certain aberration is balanced with other aberrations to minimize its variance,
the balanced aberration does not necessarily yield a higher or the highest possible Strehl
ratio. For small aberrations, a maximum Strehl ratio is obtained when the variance is
minimum. For large aberrations, however, there is no simple relationship between the
Strehl ratio and the aberration variance. For example [9], when $A_s = 3\lambda$, the optimum
amount of defocus is $B_d = - 3\lambda$, but the Strehl ratio is a minimum and equal to 0.12. The
Strehl ratio is maximum and equal to 0.26 for $B_d \simeq - 4\lambda$ or $- 2\lambda$. For $A_s \lesssim 2.3\lambda$, the
axial irradiance is maximum at a point with respect to which the aberration variance is
minimum. Similarly, in the case of coma, the maximum irradiance in the image plane
occurs at the point with respect to which the aberration variance is minimum only if
$A_c \lesssim 0.7\lambda$, which in turn corresponds to $S \gtrsim 0.76$. For larger values of $A_c$, the
distance of the point of maximum irradiance does not increase linearly with its value and
even fluctuates in some regions [10]. Moreover, it is found that for $A_c > 2.3\lambda$, the Seidel
coma gives a larger Strehl ratio than the balanced coma, i.e., the irradiance in the image
plane at the origin is larger than at the point with respect to which the aberration variance
is minimum. Thus, only for large Strehl ratios, the irradiance is maximum at the point
associated with the minimum aberration variance.
The defocused PSFs are shown in Figure 4-6 to illustrate the zero Strehl ratio for
integral number of waves of defocus aberration. As an illustration of the improvement in
the Strehl ratio by aberration balancing, Table 4-3 lists the Strehl ratio of a primary
aberration with and without balancing for a quarter wave of aberration. The Strehl ratio
for a quarter wave of defocus is 0.811. As shown in Figure 4-7, the Strehl ratio for a quarter
wave of spherical aberration improves from a value of 0.800 to 0.986 when it is balanced
with an equal and opposite amount of defocus aberration. In the case of coma, a Strehl
ratio of 0.737 is obtained, but a peak of value 0.966 lies to the right of the origin, as
shown in Figure 4-8. When coma is balanced with a wavefront tilt equal to 2/3 the
amount of coma, the peak moves to the origin and the Strehl ratio increases from 0.737 to
0.966. In the case of astigmatism, as shown in Figure 4-9, the Strehl ratio increases from
a value of 0.857 to 0.902 when it is balanced with defocus.
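The Strehl ratios quoted for the rotationally symmetric aberrations can be reproduced by evaluating Eq. (4-18) numerically. The sketch below is a minimal illustration: for a phase that depends on $\rho$ only, the angular integral contributes a factor of $2\pi$, leaving a one-dimensional integral that is evaluated with a midpoint rule. The helper name `strehl_radial` is hypothetical.

```python
import cmath
import math

def strehl_radial(phase, n=4000):
    """Eq. (4-18) for a rotationally symmetric phase(rho), in radians:
    S = |2 * integral_0^1 exp(i * phase(rho)) rho d(rho)|^2."""
    h = 1.0 / n
    amplitude = sum(cmath.exp(1j * phase((k + 0.5) * h)) * (k + 0.5) * h
                    for k in range(n)) * 2.0 * h
    return abs(amplitude) ** 2

quarter_wave = 2.0 * math.pi * 0.25    # peak aberration of lambda/4, in radians

s_defocus = strehl_radial(lambda r: quarter_wave * r**2)
s_spherical = strehl_radial(lambda r: quarter_wave * r**4)
s_balanced = strehl_radial(lambda r: quarter_wave * (r**4 - r**2))

print(s_defocus, s_spherical, s_balanced)
```

The three values can be compared with the entries 0.811, 0.800, and 0.986 discussed above.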
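Similarly, secondary coma is balanced with primary coma and wavefront tilt to minimize
its variance, and the balanced aberration thus obtained is given by

$$\Phi_{bsc}(\rho, \theta) = \left( \rho^5 - 1.2 \rho^3 + 0.3 \rho \right) \cos\theta . \tag{4-34}$$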
Figure 4-6. PSFs for a quarter-wave and one wave of defocus as a function of r in
units of $\lambda F$. For clarity, the curve for $B_d = 1$ has been multiplied by ten. The
aberration-free PSF, representing the Airy pattern with its first zero at 1.22, is
shown by the solid curve.
Table 4-3. Strehl ratio S for a quarter-wave of a primary aberration with and
without balancing for a circular pupil, i.e., for $B_d = A_a = A_c = A_s = \lambda/4$ and
$0 \le \rho \le 1$.

| Aberration | S |
|---|---|
| Aberration free | 1 |
| Defocus, $B_d \rho^2$ | 0.811 |
| Balanced astigmatism, $A_a \rho^2 (\cos^2\theta - 1/2)$ | 0.902 |
| Balanced coma, $A_c (\rho^3 - \tfrac{2}{3} \rho) \cos\theta$ | 0.966 |
| Balanced spherical aberration, $A_s (\rho^4 - \rho^2)$ | 0.986 |
Figure 4-7. PSFs for a quarter-wave of spherical aberration with and without
balancing with equal and opposite amount of defocus. The aberration-free PSF,
representing the Airy pattern with its first zero at 1.22, is shown by the solid curve.
Figure 4-8. PSFs for a quarter-wave of coma along the x axis (in units of $\lambda F$) with
and without the balancing tilt. The aberration-free PSF is shown by the solid curve.
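$$\Phi_{bsa}(\rho, \theta) = \rho^4 \cos^2\theta - \frac{1}{2} \rho^4 - \frac{3}{4} \rho^2 \cos^2\theta + \frac{3}{8} \rho^2 = \frac{1}{2} \left( \rho^4 - \frac{3}{4} \rho^2 \right) \cos 2\theta . \tag{4-35}$$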
Figure 4-9. PSFs for a quarter-wave of astigmatism along the x axis (in units of
$\lambda F$) with and without the balancing defocus. The aberration-free PSF is shown by
the solid curve.
$$Z_n^m(\rho, \theta) = \left[ 2(n + 1) / (1 + \delta_{m0}) \right]^{1/2} R_n^m(\rho) \cos m\theta , \quad 0 \le \rho \le 1 , \quad 0 \le \theta \le 2\pi , \tag{4-36}$$

where n and m are positive integers including zero, $n - m \ge 0$ and even, and $R_n^m(\rho)$ is a
radial polynomial given by

$$R_n^m(\rho) = \sum_{s=0}^{(n-m)/2} \frac{(-1)^s (n - s)!}{s! \left( \dfrac{n + m}{2} - s \right)! \left( \dfrac{n - m}{2} - s \right)!}\, \rho^{n - 2s} \tag{4-37}$$

with a degree n in $\rho$ containing terms in $\rho^n$, $\rho^{n-2}$, $\ldots$, and $\rho^m$. It is clear from Eq. (4-36)
that the circle polynomials are separable in the polar coordinates $\rho$ and $\theta$ of a pupil
point.
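Equations (4-36) and (4-37) translate directly into code. The sketch below is one possible transcription; the function names `radial` and `zernike_cos` are illustrative, and only the cosine form of Eq. (4-36) is covered.

```python
from math import cos, factorial, sqrt

def radial(n, m, rho):
    """Radial polynomial R_n^m(rho) of Eq. (4-37); n - m must be even and >= 0."""
    if m < 0 or m > n or (n - m) % 2 != 0:
        raise ValueError("need 0 <= m <= n with n - m even")
    total = 0.0
    for s in range((n - m) // 2 + 1):
        numerator = (-1) ** s * factorial(n - s)
        denominator = (factorial(s)
                       * factorial((n + m) // 2 - s)
                       * factorial((n - m) // 2 - s))
        total += numerator / denominator * rho ** (n - 2 * s)
    return total

def zernike_cos(n, m, rho, theta):
    """Orthonormal circle polynomial Z_n^m of Eq. (4-36), cosine form."""
    delta_m0 = 1 if m == 0 else 0
    norm = sqrt(2.0 * (n + 1) / (1 + delta_m0))
    return norm * radial(n, m, rho) * cos(m * theta)

# R_4^0(rho) = 6 rho^4 - 6 rho^2 + 1, evaluated at rho = 0.5
print(radial(4, 0, 0.5))
# Defocus polynomial sqrt(3) (2 rho^2 - 1) at the pupil edge
print(zernike_cos(2, 0, 1.0, 0.0))
```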
and

$$R_n^m(0) = \begin{cases} \delta_{m0} & \text{for even } n/2 \\ - \delta_{m0} & \text{for odd } n/2 . \end{cases} \tag{4-40}$$

$$R_n^0(\rho) = P_{n/2}\left( 2\rho^2 - 1 \right) . \tag{4-41}$$
In Eq. (4-43), the m value is the same for both radial polynomials because of the
orthogonality Eq. (4-42) of the trigonometric functions. Accordingly, the polynomials
$Z_n^m(\rho, \theta)$ are orthonormal according to

$$\frac{1}{\pi} \int_0^1 \int_0^{2\pi} Z_n^m(\rho, \theta)\, Z_{n'}^{m'}(\rho, \theta)\, \rho\, d\rho\, d\theta = \delta_{nn'} \delta_{mm'} . \tag{4-44}$$
An even number is associated with a cosine polynomial and an odd number with a sine
polynomial. The orthogonality of the trigonometric functions yields
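$$\int_0^{2\pi} d\theta \begin{cases} \cos m\theta \cos m'\theta , & j \text{ and } j' \text{ are both even} \\ \cos m\theta \sin m'\theta , & j \text{ is even and } j' \text{ is odd} \\ \sin m\theta \cos m'\theta , & j \text{ is odd and } j' \text{ is even} \\ \sin m\theta \sin m'\theta , & j \text{ and } j' \text{ are both odd} \end{cases} = \begin{cases} \pi (1 + \delta_{m0}) \delta_{mm'} , & j \text{ and } j' \text{ are both even} \\ \pi \delta_{mm'} , & j \text{ and } j' \text{ are both odd} \\ 0 , & \text{otherwise} . \end{cases}$$

Therefore, the Zernike circle polynomials are orthonormal over a unit disc according to

$$\int_0^1 \int_0^{2\pi} Z_j(\rho, \theta)\, Z_{j'}(\rho, \theta)\, \rho\, d\rho\, d\theta \Big/ \int_0^1 \int_0^{2\pi} \rho\, d\rho\, d\theta = \delta_{jj'} . \tag{4-47}$$

$$N_n = (n + 1)(n + 2) / 2 . \tag{4-48}$$

For a rotationally symmetric imaging system, each of the $\sin m\theta$ terms is zero, as
discussed in Section 1.6. Accordingly, the number of polynomials of an even order is
$(n/2) + 1$ and $(n + 1)/2$ for an odd order. Their number through an order n is given by

$$N_n = \left[ (n/2) + 1 \right]^2 \quad \text{for even } n , \tag{4-49a}$$

$$N_n = (n + 1)(n + 3) / 4 \quad \text{for odd } n . \tag{4-49b}$$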
Table 4-4. Orthonormal Zernike circle polynomials $Z_j(\rho, \theta)$. The indices j, n, and m
are called the polynomial number, radial degree, and azimuthal frequency,
respectively. The polynomials $Z_j$ are ordered such that an even j corresponds to a
symmetric polynomial varying as $\cos m\theta$, while an odd j corresponds to an
antisymmetric polynomial varying as $\sin m\theta$. A polynomial with a lower value of n
is ordered first, and for a given value of n, a polynomial with a lower value of m is
ordered first.

| j | n | m | $Z_j(\rho, \theta)$ | Aberration name* |
|---|---|---|---|---|
| 4 | 2 | 0 | $\sqrt{3}\,(2\rho^2 - 1)$ | Defocus |
| 7 | 3 | 1 | $\sqrt{8}\,(3\rho^3 - 2\rho) \sin\theta$ | Primary y-coma |
| 8 | 3 | 1 | $\sqrt{8}\,(3\rho^3 - 2\rho) \cos\theta$ | Primary x-coma |
| 9 | 3 | 3 | $\sqrt{8}\,\rho^3 \sin 3\theta$ | |
| 10 | 3 | 3 | $\sqrt{8}\,\rho^3 \cos 3\theta$ | |
| 11 | 4 | 0 | $\sqrt{5}\,(6\rho^4 - 6\rho^2 + 1)$ | Primary spherical aberration |
| 12 | 4 | 2 | $\sqrt{10}\,(4\rho^4 - 3\rho^2) \cos 2\theta$ | 0° Secondary astigmatism |
| 13 | 4 | 2 | $\sqrt{10}\,(4\rho^4 - 3\rho^2) \sin 2\theta$ | 45° Secondary astigmatism |
| 14 | 4 | 4 | $\sqrt{10}\,\rho^4 \cos 4\theta$ | |
| 15 | 4 | 4 | $\sqrt{10}\,\rho^4 \sin 4\theta$ | |
| 16 | 5 | 1 | $\sqrt{12}\,(10\rho^5 - 12\rho^3 + 3\rho) \cos\theta$ | Secondary x-coma |
| 18 | 5 | 3 | $\sqrt{12}\,(5\rho^5 - 4\rho^3) \cos 3\theta$ | |
| 19 | 5 | 3 | $\sqrt{12}\,(5\rho^5 - 4\rho^3) \sin 3\theta$ | |
| 20 | 5 | 5 | $\sqrt{12}\,\rho^5 \cos 5\theta$ | |
| 21 | 5 | 5 | $\sqrt{12}\,\rho^5 \sin 5\theta$ | |
| 22 | 6 | 0 | $\sqrt{7}\,(20\rho^6 - 30\rho^4 + 12\rho^2 - 1)$ | Secondary spherical |
| 23 | 6 | 2 | $\sqrt{14}\,(15\rho^6 - 20\rho^4 + 6\rho^2) \sin 2\theta$ | 45° Tertiary astigmatism |
| 25 | 6 | 4 | $\sqrt{14}\,(6\rho^6 - 5\rho^4) \sin 4\theta$ | |
| 26 | 6 | 4 | $\sqrt{14}\,(6\rho^6 - 5\rho^4) \cos 4\theta$ | |
| 27 | 6 | 6 | $\sqrt{14}\,\rho^6 \sin 6\theta$ | |
| 28 | 6 | 6 | $\sqrt{14}\,\rho^6 \cos 6\theta$ | |
| 29 | 7 | 1 | $4\,(35\rho^7 - 60\rho^5 + 30\rho^3 - 4\rho) \sin\theta$ | Tertiary y-coma |
| 33 | 7 | 5 | $4\,(7\rho^7 - 6\rho^5) \sin 5\theta$ | |
| 34 | 7 | 5 | $4\,(7\rho^7 - 6\rho^5) \cos 5\theta$ | |
| 35 | 7 | 7 | $4\,\rho^7 \sin 7\theta$ | |
| 36 | 7 | 7 | $4\,\rho^7 \cos 7\theta$ | |
| 37 | 8 | 0 | $3\,(70\rho^8 - 140\rho^6 + 90\rho^4 - 20\rho^2 + 1)$ | Tertiary spherical |
| 38 | 8 | 2 | $\sqrt{18}\,(56\rho^8 - 105\rho^6 + 60\rho^4 - 10\rho^2) \cos 2\theta$ | 0° Quaternary astigmatism |
| 42 | 8 | 6 | $\sqrt{18}\,(8\rho^8 - 7\rho^6) \cos 6\theta$ | |
| 43 | 8 | 6 | $\sqrt{18}\,(8\rho^8 - 7\rho^6) \sin 6\theta$ | |
| 44 | 8 | 8 | $\sqrt{18}\,\rho^8 \cos 8\theta$ | |
| 45 | 8 | 8 | $\sqrt{18}\,\rho^8 \sin 8\theta$ | |

*The words “orthonormal Zernike circle” are to be associated with these names, e.g.,
orthonormal Zernike circle 0° primary astigmatism.
[Figure: radial polynomials $R_n^m(\rho)$ plotted against $\rho$ from 0 to 1, in panels (a), (b), and (c).]
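$$n = \left[ (2j - 1)^{1/2} + 0.5 \right]_{\text{integer}} - 1 , \tag{4-50}$$

where the subscript integer implies the integer value of the number in brackets. Once n is
known, the value of m is given by

$$m = \begin{cases} 2 \left\{ \left[ 2j + 1 - n(n + 1) \right] / 4 \right\}_{\text{integer}} & \text{when } n \text{ is even} \quad \text{(4-51a)} \\ 2 \left\{ \left[ 2(j + 1) - n(n + 1) \right] / 4 \right\}_{\text{integer}} - 1 & \text{when } n \text{ is odd} . \quad \text{(4-51b)} \end{cases}$$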
For example, suppose we want to know the values of n and m for the polynomial j = 10.
From Eq. (4-50), n = 3 and from Eq. (4-51b), m = 3. Hence, it is a $\cos 3\theta$ polynomial.
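The index conversion of Eqs. (4-50) and (4-51) can be written as a small function. This is a sketch with an illustrative name, `index_to_nm`; the integer part is taken with `int()` and integer division, which is valid here because all quantities involved are nonnegative.

```python
import math

def index_to_nm(j):
    """Radial degree n and azimuthal frequency m for polynomial number j,
    following Eqs. (4-50), (4-51a), and (4-51b)."""
    if j < 1:
        raise ValueError("polynomial number j starts at 1")
    n = int(math.sqrt(2 * j - 1) + 0.5) - 1
    if n % 2 == 0:
        m = 2 * ((2 * j + 1 - n * (n + 1)) // 4)
    else:
        m = 2 * ((2 * (j + 1) - n * (n + 1)) // 4) - 1
    return n, m

for j in (1, 4, 10, 11, 21, 37, 45):
    n, m = index_to_nm(j)
    # Even j varies as cos(m theta) and odd j as sin(m theta) when m != 0
    kind = "none" if m == 0 else ("cos" if j % 2 == 0 else "sin")
    print(j, n, m, kind)
```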
From the standpoint of wavefront analysis, their uniqueness lies in the fact that they
are not only orthogonal over a circular pupil, but include wavefront tilt, defocus, and
balanced classical aberrations as members of the polynomial set for such a pupil. For
example, $Z_6$, $Z_8$, and $Z_{11}$ represent the balanced primary aberrations of astigmatism,
coma, and spherical aberration, as may be seen by comparing their forms with those
given in Table 4-2. Similarly, $Z_{12}$, $Z_{16}$, and $Z_{22}$ represent the balanced secondary
aberrations of astigmatism, coma, and spherical aberration, respectively, as may be seen
by comparing their forms with those given in Eqs. (4-33)–(4-35), respectively. Note that
the constant term in a radially symmetric aberration is needed to make its mean value
zero over the pupil. A balanced classical aberration in the form of a Zernike polynomial is
referred to as a Zernike or orthogonal aberration, e.g., $Z_6$ is Zernike primary
astigmatism or $Z_8$ is Zernike primary coma. In Section 4.5, aberrations with only $\cos m\theta$
type dependence are considered, as would be the case for a rotationally symmetric
imaging system. In general, an aberration function will also have $\sin m\theta$ type terms, for
example, due to fabrication errors or those due to atmospheric turbulence. The
corresponding polynomials with $\sin m\theta$ dependence are considered in Section 4.6.
\[ W(\rho, \theta) = \sum_{n=0}^{\infty} \sum_{m=0}^{n} c_{nm} Z_n^m(\rho, \theta) , \quad 0 \le \rho \le 1 , \quad 0 \le \theta \le 2\pi , \tag{4-52} \]
where $c_{nm}$ are the orthonormal expansion coefficients that depend on the object location.
The orthonormal Zernike expansion coefficients are given by
\[ c_{nm} = \frac{1}{\pi} \int_0^1 \int_0^{2\pi} W(\rho, \theta)\, Z_n^m(\rho, \theta)\, \rho \, d\rho \, d\theta , \tag{4-53} \]
as may be seen by substituting Eq. (4-52) and utilizing the orthonormality Eq. (4-44) of
the polynomials.
Because of the orthogonality of the Zernike polynomials, the mean value of a circle
polynomial, except when n = 0 = m (the piston polynomial), is zero, and its mean square
value is unity, as shown in Section 3.2. Therefore, the mean and the mean square values
of the aberration function are given by
4.7 Zernike Circle Coefficients of a Circular Aberration Function 71
Poly.      n   m   $Z_j(x, y)$                                                               Name
$Z_1$      0   0   $1$                                                                       Piston
$Z_2$      1   1   $2x$                                                                      x tilt
$Z_3$      1   1   $2y$                                                                      y tilt
$Z_4$      2   0   $\sqrt{3}\,(2\rho^2 - 1)$                                                 Defocus
$Z_6$      2   2   $\sqrt{6}\,(x^2 - y^2)$                                                   0° Primary astig.
$Z_9$      3   3   $\sqrt{8}\, y (3x^2 - y^2)$
$Z_{10}$   3   3   $\sqrt{8}\, x (x^2 - 3y^2)$
$Z_{14}$   4   4   $\sqrt{10}\,(\rho^4 - 8x^2 y^2)$
$Z_{15}$   4   4   $4\sqrt{10}\, xy (x^2 - y^2)$
$Z_{18}$   5   3   $\sqrt{12}\, x (x^2 - 3y^2)(5\rho^2 - 4)$
$Z_{19}$   5   3   $\sqrt{12}\, y (3x^2 - y^2)(5\rho^2 - 4)$
$Z_{20}$   5   5   $\sqrt{12}\, x (16x^4 - 20x^2\rho^2 + 5\rho^4)$
$Z_{21}$   5   5   $\sqrt{12}\, y (16y^4 - 20y^2\rho^2 + 5\rho^4)$
$Z_{22}$   6   0   $\sqrt{7}\,(20\rho^6 - 30\rho^4 + 12\rho^2 - 1)$
$Z_{23}$   6   2   $2\sqrt{14}\, xy (15\rho^4 - 20\rho^2 + 6)$
$Z_{26}$   6   4   $\sqrt{14}\,(8x^4 - 8x^2\rho^2 + \rho^4)(6\rho^2 - 5)$
$Z_{27}$   6   6   $\sqrt{14}\, xy (32x^4 - 32x^2\rho^2 + 6\rho^4)$
$Z_{28}$   6   6   $\sqrt{14}\,(32x^6 - 48x^4\rho^2 + 18x^2\rho^4 - \rho^6)$
$Z_{29}$   7   1   $4y (35\rho^6 - 60\rho^4 + 30\rho^2 - 4)$                                 Tertiary y-coma
$Z_{33}$   7   5   $4(7\rho^2 - 6)[\,4x^2 y (x^2 - y^2) + y(\rho^4 - 8x^2 y^2)\,]$
$Z_{34}$   7   5   $4(7\rho^2 - 6)[\,x(\rho^4 - 8x^2 y^2) - 4xy^2 (x^2 - y^2)\,]$
$Z_{35}$   7   7   $8x^2 y (3\rho^4 - 16x^2 y^2) + 4y (x^2 - y^2)(\rho^4 - 16x^2 y^2)$
$Z_{36}$   7   7   $4x (x^2 - y^2)(\rho^4 - 16x^2 y^2) - 8xy^2 (3\rho^4 - 16x^2 y^2)$
$Z_{42}$   8   6   $\sqrt{18}\,(x^2 - y^2)(\rho^4 - 16x^2 y^2)(8\rho^2 - 7)$
$Z_{43}$   8   6   $2\sqrt{18}\, xy (3\rho^4 - 16x^2 y^2)(8\rho^2 - 7)$
$Z_{44}$   8   8   $\sqrt{18}\,[\,2(\rho^4 - 8x^2 y^2)^2 - \rho^8\,]$
$Z_{45}$   8   8   $8\sqrt{18}\, xy (x^2 - y^2)(\rho^4 - 8x^2 y^2)$
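\[ \langle W(\rho, \theta) \rangle = c_{00} , \tag{4-54} \]
\[ \langle W^2(\rho, \theta) \rangle = \sum_{n=0}^{\infty} \sum_{m=0}^{n} c_{nm}^2 , \tag{4-55} \]
\[ \sigma^2 = \langle W^2(\rho, \theta) \rangle - \langle W(\rho, \theta) \rangle^2 = \sum_{n=1}^{\infty} \sum_{m=0}^{n} c_{nm}^2 . \tag{4-56} \]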
In practice, the expansion will be truncated at some value N of n such that the variance
obtained from Eq. (4-56) will be equal to its value obtained from the actual data within
some specified tolerance.
where $a_j$ are the expansion coefficients, and we have truncated the polynomials at the
maximum value J of j. Multiplying both sides of Eq. (4-57) by $Z_j(\rho, \theta)$, integrating over
the unit disc, and using the orthonormality Eq. (4-4), we obtain the circle expansion
coefficients:
\[ a_j = \frac{1}{\pi} \int_0^1 \int_0^{2\pi} W(\rho, \theta)\, Z_j(\rho, \theta)\, \rho \, d\rho \, d\theta . \tag{4-58} \]
As stated in Section 3.2, it is evident from Eq. (4-58) that the value of a circle coefficient
$a_j$ is independent of the number J of the polynomials used in Eq. (4-57) for the
expansion of the aberration function. Hence, one or more terms can be added to or
subtracted from the aberration function without affecting the value of the coefficients of
the other polynomials in the expansion.
The mean and the mean square values of the aberration function are given by
\[ \langle W(\rho, \theta) \rangle = a_1 , \tag{4-59} \]
\[ \langle W^2(\rho, \theta) \rangle = \sum_{j=1}^{J} a_j^2 , \tag{4-60} \]
\[ \sigma^2 = \langle W^2(\rho, \theta) \rangle - \langle W(\rho, \theta) \rangle^2 = \sum_{j=2}^{J} a_j^2 . \tag{4-61} \]
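As a rough numerical check of Eqs. (4-58), (4-59), and (4-61), the sketch below builds a test aberration from known coefficients and recovers them by a simple midpoint-rule integration over the unit disc. The helper names (`radial`, `zernike`, `pupil_average`, `coefficient`), the grid sizes, and the test coefficients are all assumptions made for illustration; the radial polynomial is evaluated from its standard factorial series, and even j is taken as the cosine polynomial and odd j as the sine polynomial.

```python
import math


def noll_to_nm(j):
    """(n, m) from the single index j, as in Eqs. (4-50) and (4-51)."""
    n = int(math.sqrt(2 * j - 1) + 0.5) - 1
    if n % 2 == 0:
        return n, 2 * ((2 * j + 1 - n * (n + 1)) // 4)
    return n, 2 * ((2 * (j + 1) - n * (n + 1)) // 4) - 1


def radial(n, m, rho):
    """Radial polynomial R_n^m(rho) from its factorial series."""
    total = 0.0
    for s in range((n - m) // 2 + 1):
        c = ((-1) ** s * math.factorial(n - s)
             / (math.factorial(s)
                * math.factorial((n + m) // 2 - s)
                * math.factorial((n - m) // 2 - s)))
        total += c * rho ** (n - 2 * s)
    return total


def zernike(j, rho, theta):
    """Orthonormal circle polynomial Z_j (even j: cosine, odd j: sine)."""
    n, m = noll_to_nm(j)
    if m == 0:
        return math.sqrt(n + 1) * radial(n, 0, rho)
    angular = math.cos(m * theta) if j % 2 == 0 else math.sin(m * theta)
    return math.sqrt(2 * (n + 1)) * radial(n, m, rho) * angular


def pupil_average(f, n_rho=100, n_theta=32):
    """(1/pi) times the integral of f(rho, theta) over the unit disc."""
    d_rho, d_theta = 1.0 / n_rho, 2 * math.pi / n_theta
    total = 0.0
    for i in range(n_rho):
        rho = (i + 0.5) * d_rho
        for k in range(n_theta):
            total += f(rho, (k + 0.5) * d_theta) * rho
    return total * d_rho * d_theta / math.pi


def coefficient(W, j):
    """Eq. (4-58): a_j as the pupil average of W * Z_j."""
    return pupil_average(lambda r, t: W(r, t) * zernike(j, r, t))


def W(rho, theta):
    """Hypothetical test aberration with known coefficients (in waves)."""
    return (0.5 + 0.3 * zernike(4, rho, theta)
            + 0.2 * zernike(8, rho, theta) + 0.1 * zernike(11, rho, theta))


a = {j: coefficient(W, j) for j in range(1, 12)}
mean = a[1]                                        # Eq. (4-59)
variance = sum(a[j] ** 2 for j in range(2, 12))    # Eq. (4-61)
print(mean, variance)
```

Each coefficient is computed independently of the others, which is the numerical counterpart of the statement that $a_j$ does not depend on J.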
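\[ I(r, \theta_i + 2\pi k/m) = \frac{1}{\pi^2} \left| \int_0^1 \int_0^{2\pi} \exp[\, i\Phi(\rho, \theta) \,] \exp[\, -\pi i r \rho \cos(\theta - \theta_i - 2\pi k/m) \,] \, \rho \, d\rho \, d\theta \right|^2 , \tag{4-62} \]
Now,
\[ \Phi(\rho, \theta - 2\pi k/m) \sim \cos[\, m(\theta - 2\pi k/m) \,] = \cos(m\theta - 2\pi k) = \cos m\theta \sim \Phi(\rho, \theta) . \tag{4-63} \]
Hence, we can write Eq. (4-62) as
\[ \begin{aligned} I(r, \theta_i + 2\pi k/m) &= \frac{1}{\pi^2} \left| \int_0^1 \int_0^{2\pi} \exp[\, i\Phi(\rho, \theta - 2\pi k/m) \,] \exp[\, -\pi i r \rho \cos(\theta - \theta_i - 2\pi k/m) \,] \, \rho \, d\rho \, d\theta \right|^2 \\ &= I(r, \theta_i) . \end{aligned} \tag{4-64} \]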
Thus, if we change the angle $\theta_i$ by $2\pi k/m$ but keep r unchanged, we obtain the same
value of the PSF as at $(r, \theta_i)$. This change can occur m times over a complete cycle of
$2\pi$. Therefore, Eq. (4-64) shows that the PSF is m-fold symmetric, as expected for the
m-fold aberration function. However, this is true for odd values of m only.
If m is even, the invariance of the PSF when $\theta_i$ changes by $\pi$, i.e., for k = m/2,
implies that the PSF is symmetric or even about the origin, i.e., $I(\vec{r}) = I(-\vec{r})$. It has the
consequence that the PSF is 2m-fold symmetric when m is even, as we show next. The
PSF at a distance r but angle $\theta_i \pm \pi j/m$, where j = 1, 2, ..., 2m, is given by
4.8.1 Symmetry of PSF 75
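\[ I(r, \theta_i \pm \pi j/m) = \frac{1}{\pi^2} \left| \int_0^1 \int_0^{2\pi} \exp[\, i\Phi(\rho, \theta) \,] \exp[\, -\pi i r \rho \cos(\theta - \theta_i \mp \pi j/m) \,] \, \rho \, d\rho \, d\theta \right|^2 . \tag{4-65} \]
Now
\[ \Phi(\rho, \theta \pm \pi j/m) \sim \cos[\, m(\theta \pm \pi j/m) \,] = \cos(m\theta \pm \pi j) = (-1)^j \cos m\theta , \]
so that
\[ I(r, \theta_i \pm \pi j/m) = \frac{1}{\pi^2} \left| \int_0^1 \int_0^{2\pi} \exp[\, i\Phi(\rho, \theta - \pi j/m) \,] \exp[\, -\pi i r \rho \cos(\theta - \theta_i \mp \pi j/m) \,] \, \rho \, d\rho \, d\theta \right|^2 \tag{4-67} \]
\[ = \begin{cases} I(r, \theta_i) & \text{for even } j \\ I(r, \theta_i + \pi) \equiv I(-\vec{r}) & \text{for odd } j , \end{cases} \tag{4-68} \]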
where in Eq. (4-67) we have substituted $\Phi(\rho, \theta) = \Phi(\rho, \theta \pm \pi j/m)$ for even j and
$\Phi(\rho, \theta) = -\Phi(\rho, \theta \pm \pi j/m)$ for odd j to obtain Eq. (4-68). Since $I(\vec{r}) = I(-\vec{r})$ for even m,
the right-hand side of Eq. (4-68) is equal to $I(r, \theta_i)$ for odd values of j also. Hence the
PSF is 2m-fold symmetric when m is even. Of course, when m = 0, the PSF is radially
symmetric, like the aberration function.
The PSFs for two polynomial aberrations with the same n and m values, and the
same sigma value, but different angular dependence as $\cos m\theta$ and $\sin m\theta$ are the same
except that one is rotated by an angle $\pi/2m$ with respect to the other. If two such
polynomial aberrations are present simultaneously with sigma values $a_j$ and $b_j$, we can
write their sum in the form
\[ \begin{aligned} W(\rho, \theta) &= \sqrt{2(n+1)}\, R_n^m(\rho) \left( a_j \cos m\theta + b_j \sin m\theta \right) \\ &= \sqrt{2(n+1)}\, R_n^m(\rho) \left( a_j^2 + b_j^2 \right)^{1/2} \cos \left\{ m \left[ \theta - (1/m) \tan^{-1}(b_j / a_j) \right] \right\} . \end{aligned} \tag{4-69} \]
It is easy to see that when both $a_j$ and $b_j$ are negative, $(a_j^2 + b_j^2)^{1/2}$ in Eq. (4-69)
must be replaced by $-(a_j^2 + b_j^2)^{1/2}$. However, when one of the coefficients is positive and
the other is negative, then $\tan^{-1}(b_j / a_j)$ of a negative argument has two solutions: a
negative acute angle or the obtuse angle that differs from it by $\pi$. The choice is made depending on
whether $a_2$ or $a_3$ is negative according to
\[ \tan^{-1}(b_j / a_j) = \begin{cases} -\tan^{-1} |b_j / a_j| & \text{for positive } a_2 \text{ and negative } a_3 \qquad \text{(4-70a)} \\ \pi - \tan^{-1} |b_j / a_j| & \text{for negative } a_2 \text{ and positive } a_3 . \qquad \text{(4-70b)} \end{cases} \]
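The combination in Eq. (4-69) can be illustrated numerically. The sketch below is one possible implementation, not the procedure of the text: it uses the two-argument arctangent `math.atan2`, which selects the quadrant from the signs of both coefficients, so the amplitude stays positive and the separate sign rules above are not needed. The function name `combine_pair` is hypothetical.

```python
import math


def combine_pair(a_j, b_j, m):
    """Amplitude and orientation angle of a_j cos(m t) + b_j sin(m t).

    Returns (A, beta) such that
        a_j cos(m t) + b_j sin(m t) = A cos(m (t - beta)),
    with A >= 0 and beta = (1/m) atan2(b_j, a_j).
    """
    amplitude = math.hypot(a_j, b_j)
    beta = math.atan2(b_j, a_j) / m
    return amplitude, beta


# Example: one positive and one negative coefficient with m = 2.
amplitude, beta = combine_pair(3.0, -4.0, 2)
theta = 0.7
lhs = 3.0 * math.cos(2 * theta) - 4.0 * math.sin(2 * theta)
rhs = amplitude * math.cos(2 * (theta - beta))
print(amplitude, beta, lhs, rhs)
```

The common factor $\sqrt{2(n+1)}\, R_n^m(\rho)$ multiplies both sides and is therefore left out of the sketch.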
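and
\[ {\rm Im}\, \tau(\vec{v}) = \int I(\vec{r}) \sin(2\pi \vec{v} \cdot \vec{r}) \, d\vec{r} , \tag{4-72b} \]
\[ {\rm Re}\, \tau(v, \phi) = \iint I(r, \theta_i) \cos[\, 2\pi v r \cos(\theta_i - \phi) \,] \, r \, dr \, d\theta_i \tag{4-73a} \]
and
\[ {\rm Im}\, \tau(v, \phi) = \iint I(r, \theta_i) \sin[\, 2\pi v r \cos(\theta_i - \phi) \,] \, r \, dr \, d\theta_i . \tag{4-73b} \]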
When m is odd, the OTF is complex. To determine the symmetry of its real part, we
consider it for a spatial frequency $(v, \phi + \pi j/m)$, where, as before, j = 1, 2, ..., 2m:
\[ {\rm Re}\, \tau(v, \phi + \pi j/m) = \iint I(r, \theta_i) \cos[\, 2\pi v r \cos(\theta_i - \phi - \pi j/m) \,] \, r \, dr \, d\theta_i . \tag{4-74} \]
From Eq. (4-68) for even j, we can replace $I(r, \theta_i)$ with $I(r, \theta_i - \pi j/m)$, and thus
\[ \begin{aligned} {\rm Re}\, \tau(v, \phi + \pi j/m) &= \iint I(r, \theta_i - \pi j/m) \cos[\, 2\pi v r \cos(\theta_i - \phi - \pi j/m) \,] \, r \, dr \, d\theta_i \\ &= {\rm Re}\, \tau(v, \phi) . \end{aligned} \tag{4-75} \]
For odd j,
\[ I(r, \theta_i + \pi j/m) = I(r, \theta_i + \pi) . \tag{4-76} \]
4.8.2 Symmetry of OTF 77
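Therefore, changing the variable of integration from $\theta_i$ to $\theta_i + \pi$, we may write Eq. (4-74) as
\[ \begin{aligned} {\rm Re}\, \tau(v, \phi + \pi j/m) &= \iint I(r, \theta_i + \pi) \cos[\, 2\pi v r \cos(\theta_i + \pi - \phi - \pi j/m) \,] \, r \, dr \, d\theta_i \\ &= \iint I(r, \theta_i + \pi j/m) \cos[\, 2\pi v r \cos(\theta_i - \phi - \pi j/m) \,] \, r \, dr \, d\theta_i \\ &= {\rm Re}\, \tau(v, \phi) . \end{aligned} \tag{4-77} \]
Now consider the imaginary part given by Eq. (4-73b). Following the same
procedure as for the real part, we replace $I(r, \theta_i)$ by $I(r, \theta_i - \pi j/m)$ for even j and write
\[ \begin{aligned} {\rm Im}\, \tau(v, \phi + \pi j/m) &= \iint I(r, \theta_i - \pi j/m) \sin[\, 2\pi v r \cos(\theta_i - \phi - \pi j/m) \,] \, r \, dr \, d\theta_i \\ &= {\rm Im}\, \tau(v, \phi) . \end{aligned} \tag{4-78} \]
For odd j,
\[ {\rm Im}\, \tau(v, \phi + \pi j/m) = \iint I(r, \theta_i) \sin[\, 2\pi v r \cos(\theta_i - \phi - \pi j/m) \,] \, r \, dr \, d\theta_i . \tag{4-79} \]
Again, changing the variable of integration from $\theta_i$ to $\theta_i + \pi$ and utilizing Eq. (4-68) for
odd j, we may write Eq. (4-79) as
\[ \begin{aligned} {\rm Im}\, \tau(v, \phi + \pi j/m) &= \iint I(r, \theta_i + \pi) \sin[\, 2\pi v r \cos(\theta_i + \pi - \phi - \pi j/m) \,] \, r \, dr \, d\theta_i \\ &= -\iint I(r, \theta_i + \pi j/m) \sin[\, 2\pi v r \cos(\theta_i - \phi - \pi j/m) \,] \, r \, dr \, d\theta_i \\ &= -{\rm Im}\, \tau(v, \phi) . \end{aligned} \tag{4-80} \]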
Thus, the imaginary part does not change for even j, but its sign changes for odd j without
changing its magnitude. Hence, the imaginary part is only m-fold symmetric.
However, when m is even, the PSF is even about the origin, and, therefore, the
imaginary part of the OTF given by Eq. (4-72b) is zero (since its integrand is an odd
function). Accordingly, the OTF is real. Moreover, since the PSF is 2m-fold symmetric in
this case, so is the OTF. Accordingly, the MTF, which is the modulus of the OTF, is 2m-
fold symmetric whether m is even or odd. Of course, when m = 0, i.e., for a radially
symmetric aberration, the OTF is real, radially symmetric, and equal to the MTF.
The symmetry properties of the various functions discussed above for a Zernike
polynomial aberration with m-fold symmetry varying as $\cos m\theta$ or $\sin m\theta$ are
summarized in Table 4-6, where NA stands for “not applicable.” Of course, for m = 0,
the interferogram, the PSF, and the OTF are all radially symmetric. In addition, the OTF
is real when m is zero or even.
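The symmetry rules derived in this section can be collected in a small lookup, sketched below. It only restates the conclusions of the text (interferogram 2m-fold; PSF m-fold for odd m and 2m-fold for even m; real part of the OTF and the MTF 2m-fold; imaginary part m-fold for odd m and absent for even m). The function name, the dictionary keys, and the use of `None` for "not applicable" are illustrative choices.

```python
def symmetry_folds(m):
    """Fold symmetry of each function for a cos(m theta) or sin(m theta) aberration."""
    if m < 0:
        raise ValueError("m must be a nonnegative integer")
    if m == 0:
        # Radially symmetric aberration: everything is radially symmetric,
        # and the OTF is real (no imaginary part).
        return {"aberration": "radial", "interferogram": "radial",
                "psf": "radial", "otf_real": "radial",
                "otf_imag": None, "mtf": "radial"}
    even = (m % 2 == 0)
    return {
        "aberration": m,
        "interferogram": 2 * m,
        "psf": 2 * m if even else m,
        "otf_real": 2 * m,
        "otf_imag": None if even else m,  # OTF is real when m is even
        "mtf": 2 * m,
    }


for m in (1, 2, 3):
    print(m, symmetry_folds(m))
```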
Table 4-6. Symmetry of interferogram, PSF, real and imaginary parts of OTF, and
MTF for m-fold symmetric Zernike polynomial aberration varying as $\cos m\theta$ or
$\sin m\theta$.
The circle polynomial aberrations for $n \le 8$ are illustrated in three different but
equivalent ways in Figure 4-11 for a sigma value of one wave. For each polynomial
aberration, the isometric plot is shown at the top, the interferogram on the left, and the
PSF on the right. The peak-to-valley numbers of the aberrations are given, and the Strehl
ratio and examples of the OTF characteristics are illustrated for a sigma value of 0.1 wave
[14].
Figure 4-11. Zernike circle polynomials shown as isometric plot on the top,
interferogram on the left, and PSF on the right for a sigma value of one wave.
4.9.2 Interferometric Characteristics 81
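$Z_1$   0                      $Z_{16}$   $2\sqrt{12} = 6.928$   $Z_{31}$   8
$Z_2$   4                      $Z_{17}$   $2\sqrt{12} = 6.928$   $Z_{32}$   8
$Z_3$   4                      $Z_{18}$   $2\sqrt{12} = 6.928$   $Z_{33}$   8
$Z_5$   $2\sqrt{6} = 4.899$    $Z_{20}$   $2\sqrt{12} = 6.928$   $Z_{35}$   8
$Z_6$   $2\sqrt{6} = 4.899$    $Z_{21}$   $2\sqrt{12} = 6.928$   $Z_{36}$   8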
are formed by a positive aberration, and those intersecting the y axis are formed by a
negative aberration. The number of fringes in an interferogram, which is equal to the
number of times the aberration changes by one wave as we move from the center to the
edges of the pupil, is different for the different polynomials. Each fringe represents a
contour of constant phase or aberration. The fringe is dark when the phase is an odd
multiple of $\pi$, or the aberration is an odd multiple of $\lambda/2$. In the case of tilts, for
example, the aberration changes by one wave four times, which is the same as the
peak-to-valley value of 4 waves. Hence, 4 straight line fringes symmetric about the center are
obtained. The x-tilt polynomial $Z_2$ yields vertical fringes, and the y-tilt polynomial $Z_3$
yields horizontal fringes. Similarly, defocus aberration $Z_4$ yields about 3.5 fringes. In the
case of spherical aberration $Z_{11}$, the aberration starts at a value of $\sqrt{5}$ waves, decreases
to zero, reaches a negative value of $-\sqrt{5}/2$ waves, and then increases to $\sqrt{5}$ waves.
Hence, the total number of times the aberration changes by unity is equal to 6.7, and
approximately seven circular fringes are obtained.
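The fringe counts quoted for defocus and spherical aberration can be reproduced with a short sketch, assuming that the count is the total variation of the aberration (in waves, for a sigma value of one wave) along a pupil radius. The helper `total_variation` and its sampling density are illustrative assumptions.

```python
import math


def total_variation(f, samples=10000):
    """Sum of |change in f| over 0 <= rho <= 1, sampled on a uniform grid."""
    values = [f(i / samples) for i in range(samples + 1)]
    return sum(abs(b - a) for a, b in zip(values, values[1:]))


def z4(rho):
    """Defocus polynomial Z_4 = sqrt(3) (2 rho^2 - 1)."""
    return math.sqrt(3) * (2 * rho ** 2 - 1)


def z11(rho):
    """Primary spherical polynomial Z_11 = sqrt(5) (6 rho^4 - 6 rho^2 + 1)."""
    return math.sqrt(5) * (6 * rho ** 4 - 6 * rho ** 2 + 1)


fringes_defocus = total_variation(z4)      # about 2 sqrt(3), i.e. roughly 3.5
fringes_spherical = total_variation(z11)   # about 3 sqrt(5), i.e. roughly 6.7
print(fringes_defocus, fringes_spherical)
```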
The polynomial aberrations $Z_2$ and $Z_3$, representing the x and y wavefront tilts with
aberration coefficients $a_2$ and $a_3$, displace the PSF in the image plane along the x and y
axes, respectively. If the coefficient $a_2$ is in units of wavelength, it corresponds to a
wavefront tilt angle of $4(\lambda/D)a_2$ about the y axis and displaces the PSF along the x axis
by $4\lambda F a_2$. Similarly, $a_3$ corresponds to a wavefront tilt angle of $4(\lambda/D)a_3$ about the x
axis and displaces the PSF by $4\lambda F a_3$ along the y axis. The aberrated PSFs can be
obtained from Eq. (4-5). For astigmatism $Z_5$ and $Z_6$, m = 2, and the PSF is 4-fold
symmetric. For coma $Z_7$ and $Z_8$, m = 1, and the PSF is symmetric about the y and the x axis,
respectively. The polynomial $Z_{10}$ corresponds to m = 3; the aberration function is 3-fold
symmetric, but the interferogram is 6-fold symmetric. Since m is odd, the PSF is also
3-fold symmetric.
The Strehl ratio for the first 45 circle polynomial aberrations with a sigma value of
0.1 wave is listed in Table 4-8 and plotted in Figure 4-12 on a nominal and an expanded
scale to clearly show the variation of their values. For the tilt polynomials $Z_2$ and $Z_3$, the
Strehl ratio simply represents the PSF value at a displaced point along the x or the y axis,
respectively. This displacement for a tilt aberration sigma of 0.1 wave is $0.4\lambda F$.
A closed-form expression for the Strehl ratio for the defocus circle polynomial $Z_4$
can be obtained from Eq. (4-18) by letting $B_d = 2\sqrt{3}\, a_4$:
\[ S = \left[ \frac{\sin(\sqrt{3}\, a_4)}{\sqrt{3}\, a_4} \right]^2 . \tag{4-82} \]
For a defocus sigma of 0.1 wave, $a_4 = 0.2\pi$ and S = 0.66255, in agreement with the
result given in Table 4-8. Note that $a_4$ is the sigma value, which in turn is equal to
$B_d / 2\sqrt{3}$, where $B_d$ is the peak value of the defocus aberration. Hence, Eq. (4-82) is the
same as Eq. (4-21). The amount of longitudinal defocus required to produce a certain
value of $a_4$, and therefore $B_d$, is given by Eq. (4-20).
Table 4-8. Strehl ratio S for Zernike circle polynomial aberrations with a sigma
value of 0.1 wave.
The results of Table 4-8 and Figure 4-12 illustrate that the Strehl ratio for a small
aberration is nearly independent of the type of the aberration and that it depends primarily
on its sigma value. It is approximately given by Eq. (4-22c) as $\exp(-\sigma_\Phi^2)$, or 0.67,
where $\sigma_\Phi = 0.2\pi$.
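The two Strehl-ratio values quoted for a sigma of 0.1 wave can be checked with a brief sketch. The function names are hypothetical; sigma is supplied in waves and converted to a phase sigma in radians by multiplying by $2\pi$, which is the conversion implied by $a_4 = 0.2\pi$ for 0.1 wave.

```python
import math


def strehl_defocus(sigma_waves):
    """Eq. (4-82): S = [sin(sqrt(3) a_4) / (sqrt(3) a_4)]^2, a_4 in radians."""
    a4 = 2 * math.pi * sigma_waves
    x = math.sqrt(3) * a4
    if x == 0.0:
        return 1.0  # aberration-free limit
    return (math.sin(x) / x) ** 2


def strehl_exponential(sigma_waves):
    """Approximate expression exp(-sigma_Phi^2), sigma_Phi in radians."""
    sigma_phi = 2 * math.pi * sigma_waves
    return math.exp(-sigma_phi ** 2)


print(strehl_defocus(0.1), strehl_exponential(0.1))
```

For 0.1 wave the exponential estimate lies slightly above the defocus value, and the gap widens as sigma grows.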
Figure 4-12. Strehl ratio for Zernike circle polynomial aberrations with a sigma
value of 0.1 wave, shown on a nominal scale as well as on an expanded scale.
4.9.4 OTF Characteristics
The 3D MTF plots are shown in Figure 4-13 for the primary aberration polynomials
with a sigma value of 0.1 wave. The MTF for the piston aberration represents the
aberration-free MTF. It is included among the aberrated MTF plots by a solid line as a
reference. The symmetry of the MTFs is made more explicit by the contour plots shown
below each 3D MTF figure. The MTF value at the center of the contours is unity, and the
contours run from a value of 0.9 near the center to zero at the outer edge.
The tangential (long dashes), sagittal (medium dashes), and 45° (small dashes) MTF plots
are also shown in this figure, i.e., for the spatial frequency vector along the x axis, y axis,
and at 45° from the x axis, respectively. Because of the 4-fold symmetry of the MTF in
the case of astigmatism, the tangential MTF is equal to the sagittal MTF. As expected
[3,8], the aberrated MTF is lower than the aberration-free MTF at all spatial frequencies
$0 < v < 1$, i.e., within the passband of the system.
[Figure panels: $Z_1$ (piston), $Z_4$ (defocus), $Z_6$ (primary astigmatism), $Z_8$ (primary coma), $Z_{10}$, and $Z_{11}$ (primary spherical).]
Figure 4-13. 3D, tangential or along x axis (in long dashes), sagittal or along y axis
(in medium dashes), and at 45° from the x axis (in small dashes) MTF plots for
Zernike circle polynomial aberrations with a sigma value of 0.1 wave. The solid
curve represents the aberration-free MTF. The spatial frequency v is normalized
by the cutoff frequency $1/\lambda F$. The contour plots below each 3D MTF plot are in
steps of 0.1 from the center out, starting with 0.9 and ending with zero.
Figure 4-14a shows the symmetry of the real and the imaginary parts of the OTF for
coma $Z_8$. The real part has even symmetry, but the imaginary part has odd symmetry.
The thick and thin contours of the imaginary part in both cases represent its positive and
negative values, respectively. The real and imaginary parts of the OTF for the aberration
$Z_{10}$ are shown in Figure 4-14b. In addition to their even and odd symmetry, it shows that
the real part is 6-fold symmetric and the imaginary part is 3-fold symmetric, as expected
for a 3-fold symmetric aberration. Because of the odd symmetry of the imaginary part, its
integral over the spatial frequencies imaged by a system is zero, as expected from the
statement after Eq. (1-25).
Figure 4-14. Real and imaginary parts of the OTF for a Zernike polynomial
aberration with a sigma value of 0.1 wave. (a) Z8 (primary coma) showing the even
and odd symmetry of the real and imaginary parts. (b) Z10 showing the 6-fold
symmetry of the real part and 3-fold symmetry of the imaginary part, in addition to
their even and odd symmetry, respectively. The thick and thin contours of the
imaginary part in both cases represent its positive and negative values, respectively.
The Seidel aberrations are well known in optical design, where the optical system
has an axis of rotational symmetry with the consequence that the angle-dependent terms
are in the form of powers of $\cos\theta$. However, the measured aberrations of a system in
optical testing generally contain both the cosine and sine terms due to the assembly and
fabrication errors. We show how to define the effective Seidel coefficients in such cases.
We emphasize that the Seidel aberration coefficients determined from the primary
Zernike aberrations will be in error unless the higher-order terms that also contain Seidel
terms are negligible [16,17].
represents a tilt of the wavefront about the y axis by an angle $4(\lambda/D)a_2$, where the
aberration coefficient is in units of wavelength. It results in a displacement of the PSF
along the x axis by $4\lambda F a_2$. Similarly, the Zernike tilt aberration
represents a tilt of the wavefront about the x axis by an angle $4(\lambda/D)a_3$ and results in a
displacement of the PSF along the y axis by $4\lambda F a_3$.
It should be evident that when the cosine and sine terms of a certain aberration are
present simultaneously, as in optical testing, their combination represents the aberration
whose orientation depends on the value of the component terms. For example, if both x
and y Zernike tilts are present in the form
it can be written
4.10.2 Wavefront Tilt and Defocus 89
\[ W(\rho, \theta) = 2 \left( a_2^2 + a_3^2 \right)^{1/2} \rho \cos\left[ \theta - \tan^{-1}(a_3 / a_2) \right] . \tag{4-86} \]
to decide the sign of the overall tilt and the value of its angle are discussed following Eq.
(4-69).
The Zernike tilt aberration $Z_2(\rho, \theta)$ is similar to the Seidel distortion in its $(\rho, \theta)$
dependence. Similarly, the Zernike defocus aberration $Z_4(\rho)$ varies with $\rho$ as the Seidel
field curvature varies with it. The constant term in $Z_4(\rho)$ makes its mean value across the
circular pupil zero, without changing its standard deviation.
4.10.3 Astigmatism
The Zernike primary astigmatism
can be written
\[ a_5 Z_5(\rho, \theta) = \sqrt{6}\, a_5 \rho^2 \cos[\, 2(\theta - \pi/4) \,] . \tag{4-89} \]
does not yield a line image in any plane. However, it is referred to as the 0° astigmatism
in conformance with the corresponding primary astigmatism because of its variation with
$\theta$ as $\cos 2\theta$. Similarly, the name tertiary astigmatism in Table 4-4 can be explained.
\[ W(\rho, \theta) = \left( a_5^2 + a_6^2 \right)^{1/2} \sqrt{6}\, \rho^2 \cos\left\{ 2\left[ \theta - (1/2) \tan^{-1}(a_5 / a_6) \right] \right\} , \tag{4-92} \]
\[ a_6 Z_6(\rho, \theta) = a_6 \sqrt{6} \left( 2\rho^2 \cos^2\theta - \rho^2 \right) \tag{4-93b} \]
\[ \phantom{a_6 Z_6(\rho, \theta)} = a_6 \sqrt{6} \left( -2\rho^2 \sin^2\theta + \rho^2 \right) . \tag{4-93c} \]
4.10.4 Coma
The Zernike coma terms $a_8 Z_8(\rho, \theta)$ and $a_7 Z_7(\rho, \theta)$ are called the x and y Zernike
comas. They represent classical coma $\rho^3 \cos\theta$ or $\rho^3 \sin\theta$ balanced with tilt $\rho \cos\theta$ or
$\rho \sin\theta$, respectively, to yield minimum variance. They yield PSFs that are symmetric
about the x and y axes, respectively. Similarly, the names for the secondary and tertiary
coma can be explained.
When both x- and y-Zernike comas are present, the aberration may be written
\[ W(\rho, \theta) = \sqrt{8}\, a_8 \left( 3\rho^3 - 2\rho \right) \cos\theta + \sqrt{8}\, a_7 \left( 3\rho^3 - 2\rho \right) \sin\theta \tag{4-94b} \]
\[ \phantom{W(\rho, \theta)} = \left( a_7^2 + a_8^2 \right)^{1/2} \sqrt{8} \left( 3\rho^3 - 2\rho \right) \cos\left[ \theta - \tan^{-1}(a_7 / a_8) \right] , \tag{4-94c} \]
To illustrate how a wrong Seidel coefficient can be inferred unless it is obtained from
all of the significant Zernike terms that contain Seidel aberrations, we consider an axial
image aberrated by one wave of secondary spherical aberration $\rho^6$. In terms of Zernike
polynomials it will be written as
\[ W(\rho) = \rho^6 = a_{22} Z_{22}(\rho) + a_{11} Z_{11}(\rho) + a_4 Z_4(\rho) + a_1 Z_1 , \tag{4-95} \]
where
\[ a_{22} = 1/(20\sqrt{7}) , \quad a_{11} = 1/(4\sqrt{5}) , \quad a_4 = 9/(20\sqrt{3}) , \quad a_1 = 1/4 . \tag{4-96} \]
If we infer the Seidel spherical aberration from only the primary Zernike aberration
$a_{11} Z_{11}(\rho)$, its amount would be 1.5 waves. Such a conclusion is obviously incorrect,
because in reality the amount of Seidel spherical aberration is zero. Needless to say, if we
expand the aberration function up to the first, say, 21 terms, we will still
incorrectly conclude that the amount of Seidel spherical aberration is 1.5 waves.
However, the Seidel spherical aberration will correctly reduce to zero when at least the
first 22 terms are included in the expansion. For an off-axis image, there are
angle-dependent aberrations, e.g., $Z_{14}$, that also contain Seidel aberrations. Hence, it is
important that the expansion be carried out up to a certain number of terms such that any
additional terms do not significantly change the mean square difference between the
function and its estimate. Otherwise, the inferred Seidel aberrations will be erroneous.
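The coefficients in Eq. (4-96) and the misleading value of 1.5 waves can be reproduced by matching powers of $\rho$, as sketched below. The approach (back-substitution from the highest power down) and the variable names are illustrative choices; the polynomial forms used for $Z_4$, $Z_{11}$, and $Z_{22}$ are the standard orthonormal ones.

```python
import math

s3, s5, s7 = math.sqrt(3), math.sqrt(5), math.sqrt(7)


def z4(rho):
    return s3 * (2 * rho ** 2 - 1)


def z11(rho):
    return s5 * (6 * rho ** 4 - 6 * rho ** 2 + 1)


def z22(rho):
    return s7 * (20 * rho ** 6 - 30 * rho ** 4 + 12 * rho ** 2 - 1)


# Match powers of rho in rho^6 = a1 + a4 Z4 + a11 Z11 + a22 Z22.
a22 = 1 / (20 * s7)                                   # rho^6 terms
a11 = 30 * s7 * a22 / (6 * s5)                        # rho^4 terms sum to zero
a4 = (6 * s5 * a11 - 12 * s7 * a22) / (2 * s3)        # rho^2 terms sum to zero
a1 = s3 * a4 - s5 * a11 + s7 * a22                    # constant terms sum to zero

# Seidel spherical aberration inferred from a11 alone (coefficient of rho^4 in a11 Z11).
seidel_from_z11_only = 6 * s5 * a11
# Coefficient of rho^4 once the Z22 term is included as well.
seidel_with_z22 = 6 * s5 * a11 - 30 * s7 * a22

print(a22, a11, a4, a1)
print(seidel_from_z11_only, seidel_with_z22)
```

The second printed pair shows the point of the paragraph: the $\rho^4$ content of $Z_{11}$ is cancelled exactly by that of $Z_{22}$.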
\[ W(\rho, \theta) = \sum_{j=1}^{8} a_j Z_j(\rho, \theta) + a_{11} Z_{11}(\rho) \tag{4-97a} \]
where $A_p$ is the piston aberration, the other coefficients $A_i$ represent the peak value of the
corresponding Seidel aberration term, and $\beta_i$ is the orientation angle of the Seidel
aberration. They are given by
\[ A_p = a_1 - \sqrt{3}\, a_4 + \sqrt{5}\, a_{11} , \tag{4-98a} \]
\[ A_t = 2 \left[ \left( a_2 - \sqrt{8}\, a_8 \right)^2 + \left( a_3 - \sqrt{8}\, a_7 \right)^2 \right]^{1/2} , \quad \beta_t = \tan^{-1} \left( \frac{a_3 - \sqrt{8}\, a_7}{a_2 - \sqrt{8}\, a_8} \right) , \tag{4-98b} \]
\[ A_d = 2 \left( \sqrt{3}\, a_4 - 3\sqrt{5}\, a_{11} - A_a / 4 \right) , \tag{4-98c} \]
\[ A_a = 2\sqrt{6} \left( a_5^2 + a_6^2 \right)^{1/2} , \quad \beta_a = \frac{1}{2} \tan^{-1}(a_5 / a_6) , \tag{4-98d} \]
\[ A_c = 6\sqrt{2} \left( a_7^2 + a_8^2 \right)^{1/2} , \quad \beta_c = \tan^{-1}(a_7 / a_8) , \tag{4-98e} \]
and
\[ A_s = 6\sqrt{5}\, a_{11} . \tag{4-98f} \]
As a note of caution, we add that the approximation of Eq. (4-97a) is good only when the
higher-order Zernike aberrations that also contain Seidel aberration terms are negligible.
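A sketch of Eqs. (4-98a)–(4-98f) as a function is given below. Two points are assumptions rather than statements of the text: the orientation angles are computed with `math.atan2` so that the quadrant follows from the signs of both coefficients, and the defocus coefficient is written so that the astigmatism contributes $-A_a/2$ in total, which is the form that inverts the relation $a_4 = A_d/2\sqrt{3} + A_a/4\sqrt{3} + A_s/2\sqrt{3}$ of Section 4.11.2. The function name and the dictionary layout are hypothetical.

```python
import math


def seidel_from_zernike(a):
    """Effective Seidel coefficients from primary Zernike coefficients.

    `a` maps the index j (1..8 and 11) to the coefficient a_j; missing
    entries are treated as zero.
    """
    g = lambda j: a.get(j, 0.0)
    s3, s5, s6, s8 = (math.sqrt(x) for x in (3, 5, 6, 8))
    tx, ty = g(2) - s8 * g(8), g(3) - s8 * g(7)
    A_a = 2 * s6 * math.hypot(g(5), g(6))
    return {
        "A_p": g(1) - s3 * g(4) + s5 * g(11),
        "A_t": 2 * math.hypot(tx, ty),
        "beta_t": math.atan2(ty, tx),
        "A_d": 2 * (s3 * g(4) - 3 * s5 * g(11)) - A_a / 2,  # assumed form, see lead-in
        "A_a": A_a,
        "beta_a": 0.5 * math.atan2(g(5), g(6)),
        "A_c": 6 * math.sqrt(2) * math.hypot(g(7), g(8)),
        "beta_c": math.atan2(g(7), g(8)),
        "A_s": 6 * s5 * g(11),
    }


# Zernike coefficients that correspond to unit Seidel tilt, defocus,
# astigmatism, coma, and spherical aberration (all along the x axis).
a_example = {1: 13 / 12, 2: 5 / 6, 4: 5 / (4 * math.sqrt(3)),
             6: 1 / (2 * math.sqrt(6)), 8: 1 / (6 * math.sqrt(2)),
             11: 1 / (6 * math.sqrt(5))}
seidel = seidel_from_zernike(a_example)
print(seidel)
```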
4.10.7 Strehl Ratio for Seidel Aberrations with and without Balancing
In Figure 4-12, we have shown the Strehl ratio for the circle polynomial aberrations
with a sigma value of one-tenth of a wave. In Figure 4-15, we show how it varies with the
sigma value of a Seidel aberration, with and without balancing (as in Tables 4-1 and 4-2),
for $0 \le \sigma_W \le 0.25$. Also plotted is the Strehl ratio obtained from the approximate
expression $\exp(-\sigma_\Phi^2)$ as the dashed curve. As expected, the exponential expression
yields a very good estimate of the Strehl ratio for $\sigma_W \le 0.1$. As $\sigma_W$ increases, the true
Strehl ratio departs from its approximate value, except in the case of balanced
astigmatism, for which the difference is quite small. The exponential expression
overestimates the Strehl ratio in the case of defocus, balanced coma, and spherical
aberration, but underestimates it for astigmatism and coma. Moreover, for a given value of
sigma, the Strehl ratio for spherical aberration is exactly the same as for the balanced
spherical aberration. The aberration coefficient and the P-V
number for a certain value of $\sigma_W$ of these aberrations can be obtained from Table 4-9.
[Figure: four panels (Defocus, Astigmatism, Coma, Spherical), each plotting the Strehl ratio S from 0 to 1 against $\sigma_W$ from 0 to 0.25.]
Figure 4-15. Strehl ratio as a function of the sigma value of a Seidel aberration with
and without balancing. (a) defocus, (b) astigmatism, (c) coma, and (d) spherical
aberration.
Table 4-9. Sigma value of a Seidel aberration with and without balancing, and P-V
numbers for a sigma value of unity, where $A_i$ is the aberration coefficient.
An alternate approach may also be considered [20]. It is perhaps worth noting that, in
practice, one will determine the Zernike coefficients of an aberration function of a system
from its interferometric data by using Eq. (4-58). The corresponding coefficients of a
scaled pupil can also be determined in the same manner by utilizing its data, i.e., by
excluding that data of the unscaled pupil that is not part of the scaled pupil. The result
obtained can be illustrated by considering a Seidel aberration function and writing it in
terms of the Zernike polynomials for both the unscaled and the scaled pupils.
4.11.1 Theory
Consider a circular pupil with its wave aberration function $W(\rho, \theta)$ expanded in
terms of the orthonormal Zernike circle polynomials $Z_j(\rho, \theta)$, as in Eq. (4-57). For a
corresponding scaled pupil with a normalized radius of $\epsilon \le 1$, as in Figure 4-16, the
aberration function can be written from Eq. (4-57) in the form
Normalizing the smaller pupil to a unit circle, the aberration function across it can also be
written in terms of the Zernike polynomials that are orthonormal over it in the form
or
\[ b_{j'} = \frac{1}{\pi} \int_0^1 \int_0^{2\pi} W(\epsilon\rho, \theta)\, Z_{j'}(\rho, \theta)\, \rho \, d\rho \, d\theta . \tag{4-102} \]
Figure 4-16. Scaled circular pupil, where the pupil radius is reduced from unity to
$\epsilon$ by blocking the outer portion.
From Eq. (4-46), the angular integration in Eq. (4-103) yields $\pi(1 + \delta_{m0})\, \delta_{mm'}$. Hence, we
may write
\[ b_{n',m} = \sqrt{2(n'+1)} \sum_{n} \sqrt{2(n+1)}\, a_{n,m} \int_0^1 R_n^m(\epsilon\rho)\, R_{n'}^m(\rho)\, \rho \, d\rho , \tag{4-104} \]
where we have replaced the single index j by the corresponding double indices n and m,
and similarly replaced $j'$ by $n'$ and m according to Eqs. (4-50) and (4-51).
The integral in Eq. (4-104) can be solved very simply by writing the radial
polynomial $R_n^m(\epsilon\rho)$ in terms of the corresponding polynomials $R_{n'}^m(\rho)$ in the form [18]
\[ R_n^m(\epsilon\rho) = \sum_{n'=m}^{n} h_{n'}(n; \epsilon)\, R_{n'}^m(\rho) , \tag{4-105} \]
where
s and $s'$ are positive integers (including zero), and $n - n' = 2(s + s')$. Substituting
Eq. (4-105) into Eq. (4-104) and utilizing Eq. (4-43) for the orthogonality of the radial
polynomials, we obtain the intended result:
\[ b_{n',m} = \sum_{n} \sqrt{\frac{n+1}{n'+1}}\; h_{n'}(n; \epsilon)\, a_{n,m} . \tag{4-107} \]
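\[ h_n(n; \epsilon) = \epsilon^n , \tag{4-108a} \]
\[ h_{n-2}(n; \epsilon) = -(n-1) \left( 1 - \epsilon^2 \right) \epsilon^{n-2} , \tag{4-108b} \]
\[ h_{n-4}(n; \epsilon) = \frac{n-3}{2} \left( 1 - \epsilon^2 \right) \left( n - 2 - n\epsilon^2 \right) \epsilon^{n-4} , \tag{4-108c} \]
\[ h_{n-6}(n; \epsilon) = -\frac{n-5}{6} \left( 1 - \epsilon^2 \right) \left[ (n-3)(n-4) - 2(n-1)(n-3)\epsilon^2 + n(n-1)\epsilon^4 \right] \epsilon^{n-6} , \tag{4-108d} \]
\[ h_{n-8}(n; \epsilon) = \frac{n-7}{2} \left( 1 - \epsilon^2 \right) \left[ \frac{(n-4)(n-5)(n-6)}{12} - \frac{(n-2)(n-4)(n-5)}{4} \epsilon^2 + \frac{(n-1)(n-2)(n-4)}{4} \epsilon^4 - \frac{n(n-1)(n-2)}{12} \epsilon^6 \right] \epsilon^{n-8} , \ \text{etc.} \tag{4-108e} \]
Equations (4-108a)–(4-108e) are sufficient to obtain the Zernike coefficients of the scaled
pupil up to and including the eighth order. The expressions for $h_{n'}(n; \epsilon)$ for $n \le 8$ are
listed in Table 4-9.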
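\[ b_{4,0} = h_4(4; \epsilon)\, a_{4,0} + \sqrt{7/5}\; h_4(6; \epsilon)\, a_{6,0} + \sqrt{9/5}\; h_4(8; \epsilon)\, a_{8,0} + \ldots , \tag{4-109a} \]
\[ b_{4,2} = h_4(4; \epsilon)\, a_{4,2} + \sqrt{7/5}\; h_4(6; \epsilon)\, a_{6,2} + \sqrt{9/5}\; h_4(8; \epsilon)\, a_{8,2} + \ldots , \tag{4-109b} \]
and
\[ b_{4,4} = h_4(4; \epsilon)\, a_{4,4} + \sqrt{7/5}\; h_4(6; \epsilon)\, a_{6,4} + \sqrt{9/5}\; h_4(8; \epsilon)\, a_{8,4} + \ldots . \tag{4-109c} \]
As $\epsilon \to 1$, all the multipliers vanish except that of $a_{n'm}$, which approaches unity and yields the
expected result $b_{n',m} = a_{n',m}$.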
The integral in Eq. (4-104) can also be evaluated by using the relationship [21]
\[ R_n^m(\rho) = (-1)^{(n-m)/2} \int_0^{\infty} J_{n+1}(r)\, J_m(\rho r) \, dr \tag{4-110} \]
to rewrite $R_n^m(\epsilon\rho)$, where $J_n(\cdot)$ is the nth-order Bessel function of the first kind. Thus,
we obtain after interchanging the integrals,
\[ \begin{aligned} \int_0^1 R_n^m(\epsilon\rho)\, R_{n'}^m(\rho)\, \rho \, d\rho &= (-1)^{(n-m)/2} \int_0^{\infty} J_{n+1}(r) \left[ \int_0^1 R_{n'}^m(\rho)\, J_m(\epsilon\rho r)\, \rho \, d\rho \right] dr \\ &= (-1)^{(n+n'-2m)/2} \int_0^{\infty} J_{n+1}(r)\, \frac{J_{n'+1}(\epsilon r)}{\epsilon r} \, dr \\ &= \frac{1}{2(n'+1)} \left[ R_n^{n'}(\epsilon) - R_n^{n'+2}(\epsilon) \right] , \end{aligned} \tag{4-111} \]
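n   n'   $h_{n'}(n; \epsilon)$
0   0    $1$
1   1    $\epsilon$
2   0    $-(1 - \epsilon^2)$
2   2    $\epsilon^2$
3   1    $-2\epsilon (1 - \epsilon^2)$
3   3    $\epsilon^3$
4   0    $(1 - \epsilon^2)(1 - 2\epsilon^2)$
4   2    $-3\epsilon^2 (1 - \epsilon^2)$
4   4    $\epsilon^4$
5   1    $\epsilon (1 - \epsilon^2)(3 - 5\epsilon^2)$
5   3    $-4\epsilon^3 (1 - \epsilon^2)$
5   5    $\epsilon^5$
6   0    $-(1 - \epsilon^2)(1 - 5\epsilon^2 + 5\epsilon^4)$
6   2    $3\epsilon^2 (1 - \epsilon^2)(2 - 3\epsilon^2)$
6   4    $-5\epsilon^4 (1 - \epsilon^2)$
6   6    $\epsilon^6$
7   1    $2\epsilon (1 - \epsilon^2)(-2 + 8\epsilon^2 - 7\epsilon^4)$
7   3    $2\epsilon^3 (1 - \epsilon^2)(5 - 7\epsilon^2)$
7   5    $-6\epsilon^5 (1 - \epsilon^2)$
7   7    $\epsilon^7$
8   4    $5\epsilon^4 (1 - \epsilon^2)(3 - 4\epsilon^2)$
8   6    $-7\epsilon^6 (1 - \epsilon^2)$
8   8    $\epsilon^8$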
\[ \int_0^1 R_{n'}^m(\rho)\, J_m(\epsilon\rho r)\, \rho \, d\rho = (-1)^{(n'-m)/2} \left[ \frac{J_{n'+1}(\epsilon r)}{\epsilon r} \right] , \tag{4-112a} \]
\[ \frac{J_{n+1}(r)}{r} = \frac{J_n(r) + J_{n+2}(r)}{2(n+1)} , \tag{4-112b} \]
and Eq. (4-110). Substituting Eq. (4-111) into Eq. (4-104), we obtain
\[ b_{n'm} = \sum_{n} \sqrt{\frac{n+1}{n'+1}}\; a_{nm} \left[ R_n^{n'}(\epsilon) - R_n^{n'+2}(\epsilon) \right] . \tag{4-113} \]
The equivalence of Eqs. (4-107) and (4-113) can be established by expanding the scaled
radial polynomial in terms of the orthogonal radial polynomials in the form
\[ R_n^m(\epsilon\rho) = \sum_{n'=m}^{n} a_{n'}(n; \epsilon)\, R_{n'}^m(\rho) , \tag{4-114} \]
where, using the orthogonality of the radial polynomials, an expansion coefficient given
by
\[ a_{n'}(n; \epsilon) = 2(n'+1) \int_0^1 R_n^m(\epsilon\rho)\, R_{n'}^m(\rho)\, \rho \, d\rho \tag{4-115} \]
is the same as $h_{n'}(n; \epsilon)$, as may be seen by comparing Eqs. (4-105) and (4-114).
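The two routes to the scaling coefficient can be compared numerically. The sketch below evaluates Eq. (4-115) by integrating the product of radial polynomials term by term, and the bracketed difference $R_n^{n'}(\epsilon) - R_n^{n'+2}(\epsilon)$ from Eq. (4-111), taking $\epsilon$ as the symbol for the scaled pupil radius. The function names are hypothetical, the radial polynomials come from their standard factorial series, and $R_n^{n'+2}$ is taken as zero when $n' = n$.

```python
import math
from fractions import Fraction


def radial_coeffs(n, m):
    """Coefficients {power: value} of R_n^m(rho); requires n - m even, m <= n."""
    coeffs = {}
    for s in range((n - m) // 2 + 1):
        coeffs[n - 2 * s] = Fraction(
            (-1) ** s * math.factorial(n - s),
            math.factorial(s)
            * math.factorial((n + m) // 2 - s)
            * math.factorial((n - m) // 2 - s))
    return coeffs


def radial(n, m, x):
    """R_n^m(x); returns 0 when m > n."""
    if m > n:
        return 0.0
    return sum(float(c) * x ** p for p, c in radial_coeffs(n, m).items())


def h_integral(n_prime, n, m, eps):
    """Eq. (4-115): 2(n'+1) * integral of R_n^m(eps rho) R_n'^m(rho) rho d rho."""
    total = 0.0
    for p, c in radial_coeffs(n, m).items():
        for q, d in radial_coeffs(n_prime, m).items():
            # integral over 0..1 of rho^p * rho^q * rho = 1 / (p + q + 2)
            total += float(c * d) * eps ** p / (p + q + 2)
    return 2 * (n_prime + 1) * total


def h_closed(n_prime, n, eps):
    """Bracket of Eq. (4-111): R_n^{n'}(eps) - R_n^{n'+2}(eps)."""
    return radial(n, n_prime, eps) - radial(n, n_prime + 2, eps)


eps = 0.8
print(h_integral(2, 6, 2, eps), h_integral(2, 6, 0, eps), h_closed(2, 6, eps))
```

The first two printed values use different m and agree with each other and with the third, illustrating that the coefficient depends on n and $n'$ but not on m.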
where a Seidel coefficient $A_i$ represents the peak value of a Seidel aberration. It can be
written in terms of the Zernike polynomials in the form
where the argument $(\rho, \theta)$ of the orthonormal Zernike polynomials $Z_n^m$ is omitted for
brevity, and the Zernike coefficients are given by
\[ a_{0,0} \equiv a_1 = \frac{A_d}{2} + \frac{A_a}{4} + \frac{A_s}{3} , \tag{4-118a} \]
\[ a_{1,1} \equiv a_2 = \frac{A_t}{2} + \frac{A_c}{3} , \tag{4-118b} \]
4.11.2 Application to a Seidel Aberration Function 99
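\[ a_{2,0} \equiv a_4 = \frac{A_d}{2\sqrt{3}} + \frac{A_a}{4\sqrt{3}} + \frac{A_s}{2\sqrt{3}} , \tag{4-118c} \]
\[ a_{2,2} \equiv a_6 = \frac{A_a}{2\sqrt{6}} , \tag{4-118d} \]
\[ a_{3,1} \equiv a_8 = \frac{A_c}{6\sqrt{2}} , \tag{4-118e} \]
and
\[ a_{4,0} \equiv a_{11} = \frac{A_s}{6\sqrt{5}} . \tag{4-118f} \]
Moreover, it is evident that the highest order among the aberrations is N = 4. The
aberration variance in terms of the Zernike coefficients is given by
\[ \sigma^2 = a_{1,1}^2 + a_{2,0}^2 + a_{2,2}^2 + a_{3,1}^2 + a_{4,0}^2 \tag{4-119a} \]
\[ \phantom{\sigma^2} = a_2^2 + a_4^2 + a_6^2 + a_8^2 + a_{11}^2 . \tag{4-119b} \]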
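For a scaled pupil, the aberration function can be written in the form
where, from Eq. (4-107) and utilizing the h-coefficients given in Table 4-9, the Zernike
coefficients are given by
\[ \begin{aligned} b_{0,0} &= a_{0,0}\, h_0(0; \epsilon) + \sqrt{3}\, h_0(2; \epsilon)\, a_{2,0} + \sqrt{5}\, h_0(4; \epsilon)\, a_{4,0} \\ &= a_{0,0} - \sqrt{3} \left( 1 - \epsilon^2 \right) a_{2,0} + \sqrt{5} \left( 1 - \epsilon^2 \right) \left( 1 - 2\epsilon^2 \right) a_{4,0} , \end{aligned} \]
or
\[ b_1 = a_1 - \sqrt{3} \left( 1 - \epsilon^2 \right) a_4 + \sqrt{5} \left( 1 - \epsilon^2 \right) \left( 1 - 2\epsilon^2 \right) a_{11} , \tag{4-121a} \]
\[ b_{1,1} = h_1(1; \epsilon)\, a_{1,1} + \sqrt{2}\, h_1(3; \epsilon)\, a_{3,1} = \epsilon \left[ a_{1,1} - 2\sqrt{2} \left( 1 - \epsilon^2 \right) a_{3,1} \right] , \]
or
\[ b_2 = \epsilon \left[ a_2 - 2\sqrt{2} \left( 1 - \epsilon^2 \right) a_8 \right] , \tag{4-121b} \]
\[ b_4 = \epsilon^2 \left[ a_4 - \sqrt{15} \left( 1 - \epsilon^2 \right) a_{11} \right] , \tag{4-121c} \]
\[ b_6 = \epsilon^2 a_6 , \tag{4-121d} \]
\[ b_8 = \epsilon^3 a_8 , \tag{4-121e} \]
and
\[ b_{11} = \epsilon^4 a_{11} . \tag{4-121f} \]
\[ W(\epsilon\rho, \theta) = A_t' \rho \cos\theta + A_d' \rho^2 + A_a' \rho^2 \cos^2\theta + A_c' \rho^3 \cos\theta + A_s' \rho^4 , \tag{4-124} \]
where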
Writing Eq. (4-124) in terms of Zernike polynomials, as was done in obtaining Eq. (4-117)
from Eq. (4-116), it is easy to see that the Zernike coefficients thus obtained are the
same as the corresponding coefficients given by Eqs. (4-121a)–(4-121f).
\[ a_1 = 13/12 , \quad a_2 = 5/6 , \quad a_4 = 5/(4\sqrt{3}) , \quad a_6 = 1/(2\sqrt{6}) , \quad a_8 = 1/(6\sqrt{2}) , \quad a_{11} = 1/(6\sqrt{5}) . \tag{4-126} \]
Substituting Eqs. (4-126) into Eq. (4-119b), the variance of the aberration function is
given by $\sigma^2 = 919/720$, or its standard deviation is given by $\sigma = 1.1298$. For a pupil scaled
with $\epsilon = 0.8$, the Zernike coefficients in Eq. (4-120b) are given by
Substituting Eq. (4-118) into Eq. (4-122), the aberration variance and standard deviation
for the scaled pupil are given by
\[ \sigma^2 = 0.5036 \tag{4-128} \]
and
\[ \sigma = 0.7097 , \tag{4-129} \]
respectively.
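The numbers in this example can be reproduced with the sketch below. It assumes that the coefficients of Eq. (4-126) correspond to unit Seidel coefficients in Eq. (4-118), that $\epsilon$ denotes the scaled pupil radius, and that the scaled coefficients follow Eq. (4-121) with $b_{11} = \epsilon^4 a_{11}$ for the last one (inferred from $h_4(4;\epsilon) = \epsilon^4$). The function names are hypothetical.

```python
import math


def seidel_to_zernike(A_t, A_d, A_a, A_c, A_s):
    """Eq. (4-118): Zernike coefficients of a Seidel aberration function."""
    s2, s3, s5, s6 = (math.sqrt(x) for x in (2, 3, 5, 6))
    return {
        1: A_d / 2 + A_a / 4 + A_s / 3,
        2: A_t / 2 + A_c / 3,
        4: A_d / (2 * s3) + A_a / (4 * s3) + A_s / (2 * s3),
        6: A_a / (2 * s6),
        8: A_c / (6 * s2),
        11: A_s / (6 * s5),
    }


def scale_pupil(a, eps):
    """Eq. (4-121): Zernike coefficients of the pupil scaled to radius eps."""
    e2 = eps ** 2
    return {
        1: a[1] - math.sqrt(3) * (1 - e2) * a[4]
           + math.sqrt(5) * (1 - e2) * (1 - 2 * e2) * a[11],
        2: eps * (a[2] - 2 * math.sqrt(2) * (1 - e2) * a[8]),
        4: e2 * (a[4] - math.sqrt(15) * (1 - e2) * a[11]),
        6: e2 * a[6],
        8: eps ** 3 * a[8],
        11: eps ** 4 * a[11],   # assumed last line of Eq. (4-121)
    }


def variance(coeffs):
    """Sum of squared coefficients, excluding the piston term j = 1."""
    return sum(v ** 2 for j, v in coeffs.items() if j != 1)


a = seidel_to_zernike(1.0, 1.0, 1.0, 1.0, 1.0)   # reproduces Eq. (4-126)
b = scale_pupil(a, 0.8)
print(variance(a), math.sqrt(variance(a)))        # unscaled pupil
print(variance(b), math.sqrt(variance(b)))        # pupil scaled to 0.8
```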
4.12 SUMMARY
The aberration-free PSF, called the Airy pattern, is shown in Figure 4-2. It consists of
a bright central spot of radius $1.22\lambda F$, called the Airy disc, containing 83.8% of the total
light, surrounded by the diffraction rings. The corresponding OTF shown in Figure 4-4
starts at a value of unity and decreases monotonically to zero at the cutoff frequency
$1/\lambda F$. Since the Strehl ratio for a small aberration increases with a decrease in the
aberration variance, we explicitly consider the balancing of primary aberrations with
lower-order aberrations. As seen from Tables 4-1 and 4-2, the sigma value of primary
spherical aberration when balanced with defocus, primary coma balanced with tilt, and
primary astigmatism balanced with defocus, is reduced by a factor of 4, 3, and $\sqrt{6}/2$,
respectively. Accordingly, the aberration tolerance for a given Strehl ratio increases by
the same factor.
The Zernike circle polynomials are in widespread use for the analysis of circular
wavefronts because of their orthogonality over a unit circle and their representation of the
balanced classical aberrations for systems with circular pupils. The polynomials are
described by three indices: j is a polynomial ordering number, n represents the radial
degree or the order of a polynomial, and m represents its azimuthal frequency. The
polynomials are ordered such that an even j corresponds to a cosine polynomial and an
odd j to a sine polynomial.
Only the cosine circle polynomials are needed to represent the aberration function of
a rotationally symmetric system. However, both cosine and sine polynomials are needed
to represent fabrication errors, or the aberrations introduced by atmospheric turbulence. A
circle polynomial aberration varying as $\cos m\theta$ or $\sin m\theta$ is m-fold symmetric. However,
its interferogram is 2m-fold symmetric. The PSF is m-fold symmetric when m is odd, and
2m-fold symmetric when m is even, unless m = 0, in which case it is radially symmetric,
like the aberration itself. These symmetry properties (along with those of the OTF) are
summarized in Table 4-6. The PSFs for two polynomial aberrations with the same n and
m values and the same sigma value but different angular dependence as $\cos m\theta$ and
$\sin m\theta$ are the same except that one is rotated by an angle $\pi/2m$ with respect to the
other. If two such polynomial aberrations are present simultaneously with sigma values
$a_j$ and $b_j$, then the orientation of the interferogram, PSF, and OTF changes by an angle
$(1/m)\tan^{-1}(b_j/a_j)$.
The circle polynomials for $n \le 8$ are illustrated in Figure 4-11 by an isometric plot,
an interferogram, and a PSF for a sigma value of one wave. The corresponding P-V
numbers are given in Table 4-7. The Strehl ratio for a sigma value of $0.1\lambda$ for each
polynomial aberration is given in Table 4-8 and plotted in Figure 4-12, illustrating that,
for a small aberration, its value can be estimated from the aberration variance regardless
of the aberration type.
The OTF is complex with real and imaginary parts (or MTF and PTF) for odd m, but
it is real for even m. For m = 0, the OTF is real and radially symmetric. The real part of
the OTF is 2m-fold symmetric whether m is odd or even. However, its imaginary part is
m-fold symmetric for odd m, though its magnitude (i.e., if we ignore its sign) is 2m-fold
symmetric. Accordingly, the MTF is 2m-fold symmetric whether m is even or odd. The
MTFs for the primary aberrations and $Z_{10}$ are given in Figure 4-13, and the real and
imaginary parts of the OTF for coma and $Z_{10}$ are given in Figure 4-14, in both cases for a
sigma value of 0.1 wave.
The determination of the effective Seidel or primary aberration coefficients from the
corresponding coefficients of the cosine and sine polynomials is demonstrated in Section
4.9. It is emphasized that these coefficients cannot be obtained from only the primary
Zernike aberrations, but must also include the primary aberrations in the higher-order
Zernike terms. How to obtain the Zernike coefficients of a certain aberration function
when the diameter of the pupil is reduced from its nominal value is discussed in Section
4.11.
References
1. F. Zernike, “Diffraction theory of knife-edge test and its improved form, the phase
contrast method,” Mon. Not. R. Astron. Soc. 94, 377–384 (1934).
4. M. Born and E. Wolf, Principles of Optics, 7th ed. (Cambridge University Press,
New York, 1999).
8. Lord Rayleigh, Phil. Mag. (5) 8, 403 (1879); also in his Scientific Papers (Dover,
New York, 1964) Vol. 1, p. 432.
9. V. N. Mahajan, “Strehl ratio for primary aberrations: some analytical results for
circular and annular pupils,” J. Opt. Soc. Am. 72, 1258–1266 (1982); Errata, 10,
2092 (1993).
10. V. N. Mahajan, “Line of sight of an aberrated optical system,” J. Opt. Soc. Am. A
2, 833–846 (1985).
11. W. B. King, “Dependence of the Strehl ratio on the magnitude of the variance of
the wave aberration,” J. Opt. Soc. Am. 58, 655–661 (1968).
12. A. B. Bhatia and E. Wolf, “On the circle polynomials of Zernike and related
orthogonal sets,” Proc. Cambridge Philos. Soc. 50, 40–48 (1954).
14. V. N. Mahajan and José A. Díaz, “Imaging characteristics of Zernike and annular
polynomial aberrations,” Appl. Opt. 52, 2062–2074 (2013).
18. V. N. Mahajan, “Zernike coefficients of a scaled pupil,” Appl. Opt. 49, 5374–5377
(2010).
19. A. J. E. M. Janssen and P. Dirksen, “Concise formula for the Zernike coefficients
of scaled pupils,” J. Microlith., Microfab., Microsyst. 5, 030501 (2006).
Chapter 5
Systems with Annular Pupils
5.1 INTRODUCTION
An important example of an imaging system with a noncircular pupil is a system with an
annular pupil. Two-mirror astronomical telescopes are systems of this kind. Examples of
such telescopes, with their linear obscuration ratios given in parentheses, are the 200-inch
telescope at Mount Palomar (0.36), the 84-inch telescope at the Kitt Peak observatory
(0.37), the telescope at the McDonald Observatory (0.5), and the Hubble Space Telescope
(0.33 when using the Wide-Field Planetary Camera).
We start this chapter with a brief discussion of how the obscuration affects the
aberration-free PSF and OTF of a circular pupil. We then consider its effect on the Strehl
ratio of primary aberrations, their balancing, and tolerances with and without balancing.
Next we obtain the polynomials that are orthonormal over an annular pupil by
orthogonalizing the Zernike circle polynomials by the procedure outlined in Chapter 3.
The annular polynomials are given in terms of the Zernike circle polynomials, and in both
polar and Cartesian coordinates. They are also related to the balanced aberrations. The
aberrated PSFs and OTFs are illustrated for the annular polynomial aberrations.
Figure 5-1. Unit annulus of obscuration ratio $\epsilon$, representing the ratio of its inner
and outer radii.
\[ I(r,\theta_i) = \frac{1}{\pi^2(1-\epsilon^2)^2}\left|\int_\epsilon^1\!\!\int_0^{2\pi} \exp[i\Phi(\rho,\theta)]\,\exp[-\pi i r\rho\cos(\theta_i-\theta)]\,\rho\,d\rho\,d\theta\right|^2 , \tag{5-1} \]
where $(r,\theta_i)$ are the polar coordinates of a point in the image plane, $r$ is in units of $\lambda F$,
and $F = R/D$ is the focal ratio of the image-forming light cone. The PSF is normalized to
unity at the center by the aberration-free central irradiance $\pi P_{ex}(1-\epsilon^2)/4\lambda^2F^2$. It is
smaller than the corresponding central value for a circular pupil by a factor of $(1-\epsilon^2)^2$,
since the pupil area and the power $P_{ex}$ are each smaller by a factor of $(1-\epsilon^2)$.
The aberration-free PSF is given by [1,2]
\[ I(r;\epsilon) = \frac{1}{(1-\epsilon^2)^2}\left[\frac{2J_1(\pi r)}{\pi r} - \epsilon^2\,\frac{2J_1(\pi\epsilon r)}{\pi\epsilon r}\right]^2 . \tag{5-2} \]
The effect of the obscuration is twofold. First, there is a loss of light in the image that
increases with increasing $\epsilon$. Second, the central bright spot becomes smaller and
contains less and less light, while more and more light appears in the diffraction rings. As
$\epsilon \to 1$, the PSF approaches $J_0^2(\pi r)$, and the central bright spot radius decreases to 0.76
compared to a value of 1.22 for a circular pupil. The irradiance distribution $I$ of the PSF
and its encircled power $P$ are shown in Figure 5-2 for several typical values of the
obscuration ratio. The 2D PSF is shown in Figure 5-3 for obscuration ratios of 0.5 and
0.8. For large obscuration ratios, such as 0.8, the PSF consists of groups of diffraction
rings.
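The shrinking of the central bright spot can be reproduced numerically from Eq. (5-2). The sketch below is illustrative only: the helper names are hypothetical, and the Bessel function $J_1$ is evaluated from its standard integral representation with the trapezoidal rule rather than from a library routine. It locates the first zero of the PSF amplitude for a given obscuration ratio.

```python
import math

def bessel_j1(x, n=200):
    """J1(x) = (1/pi) * integral_0^pi cos(t - x sin t) dt, by the trapezoidal rule."""
    h = math.pi / n
    total = 0.5 * (math.cos(0.0) + math.cos(math.pi - x * math.sin(math.pi)))
    for k in range(1, n):
        t = k * h
        total += math.cos(t - x * math.sin(t))
    return total * h / math.pi

def jinc(x):
    """2 J1(x) / x, with its limiting value 1 at x = 0."""
    return 1.0 if abs(x) < 1e-12 else 2.0 * bessel_j1(x) / x

def amplitude(r, eps):
    """Bracketed term of Eq. (5-2); r is in units of lambda*F."""
    return jinc(math.pi * r) - eps**2 * jinc(math.pi * eps * r)

def annular_psf(r, eps):
    """Aberration-free PSF of Eq. (5-2), normalized to unity at r = 0."""
    return (amplitude(r, eps) / (1.0 - eps**2))**2

def first_dark_ring(eps, step=0.01):
    """Radius of the first zero of the PSF: coarse scan followed by bisection."""
    lo = step
    while amplitude(lo + step, eps) > 0.0:
        lo += step
    hi = lo + step
    for _ in range(50):
        mid = 0.5 * (lo + hi)
        if amplitude(mid, eps) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

for eps in (0.0, 0.5, 0.99):
    print(eps, first_dark_ring(eps))
```

For ε = 0 the routine should return the Airy value of about 1.22, and for ε close to unity a value near the limit of 0.76 quoted above.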
[Figure 5-2: plot of $I(r)$ and $P(r_c)$ versus $r$; $r_c$ (0 to 3.0) for $\epsilon$ = 0, 0.25, 0.50, and 0.75.]
Figure 5-2. The irradiance and encircled power distributions for various values of
the obscuration ratio $\epsilon$.
5.2.2 OTF
The aberration-free OTF, representing the Fourier transform of the corresponding
PSF given by Eq. (5-2) [3], or the fractional overlap area of two unit annular circles
separated by a distance $\lambda R v_i$, is given by [1,4]
\[ \tau(v;\epsilon) = \frac{1}{1-\epsilon^2}\left[\tau(v) + \epsilon^2\,\tau(v/\epsilon) - \tau_{12}(v;\epsilon)\right] , \quad 0 \le v \le 1 , \tag{5-3} \]
where $\tau(v)$ is given by Eq. (4-15) and represents the OTF of the system if there were no
obscuration, $v = \lambda F v_i$ is a normalized radial spatial frequency as in the case of a circular
pupil (since the obscuration has no effect on the cutoff frequency $1/\lambda F$), and
\[ \tau_{12}(v;\epsilon) = 2\epsilon^2 , \quad 0 \le v \le (1-\epsilon)/2 \tag{5-4a} \]
\[ \phantom{\tau_{12}(v;\epsilon)} = (2/\pi)\left(\theta_1 + \epsilon^2\theta_2 - 2v\sin\theta_1\right) , \quad (1-\epsilon)/2 \le v \le (1+\epsilon)/2 \tag{5-4b} \]
\[ \phantom{\tau_{12}(v;\epsilon)} = 0 , \quad \text{otherwise} . \tag{5-4c} \]
The angles $\theta_1$ and $\theta_2$ are given by
\[ \cos\theta_1 = \frac{4v^2 + 1 - \epsilon^2}{4v} \tag{5-5a} \]
and
\[ \cos\theta_2 = \frac{4v^2 - 1 + \epsilon^2}{4\epsilon v} , \tag{5-5b} \]
respectively. It is evident from Eq. (5-3) that $\tau(v;\epsilon) > \tau(v)$, by a factor of $(1-\epsilon^2)^{-1}$, at
least for spatial frequencies $(1+\epsilon)/2 < v < 1$. This is illustrated in Figure 5-4 for the same
values of $\epsilon$ as the PSFs in Figure 5-2. The OTF decreases at the low and mid spatial
frequencies and increases at the high. This is the spatial frequency analog of the increased
light in the diffraction rings and a smaller central bright spot.
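A direct implementation of Eqs. (5-3)–(5-5) makes these properties easy to verify. The following is a minimal sketch with hypothetical function names. It assumes that Eq. (4-15) is the familiar circular-pupil OTF $(2/\pi)[\cos^{-1}v - v(1-v^2)^{1/2}]$, and that the cross term equals $2\epsilon^2$ for $v \le (1-\epsilon)/2$, the value with which Eq. (5-4b) is continuous.

```python
import math

def tau_circ(v):
    """OTF of an unobscured circular pupil (assumed form of Eq. 4-15)."""
    if v >= 1.0:
        return 0.0
    return (2.0 / math.pi) * (math.acos(v) - v * math.sqrt(1.0 - v * v))

def tau12(v, eps):
    """Cross term of Eq. (5-4); the constant branch 2*eps^2 is an assumption."""
    if v <= 0.5 * (1.0 - eps):
        return 2.0 * eps**2
    if v >= 0.5 * (1.0 + eps):
        return 0.0
    clamp = lambda c: max(-1.0, min(1.0, c))   # guard acos against round-off
    th1 = math.acos(clamp((4*v*v + 1 - eps**2) / (4*v)))
    th2 = math.acos(clamp((4*v*v - 1 + eps**2) / (4*eps*v)))
    return (2.0 / math.pi) * (th1 + eps**2 * th2 - 2*v*math.sin(th1))

def tau_annular(v, eps):
    """Aberration-free annular OTF of Eq. (5-3), for 0 <= v <= 1 and 0 < eps < 1."""
    return (tau_circ(v) + eps**2 * tau_circ(v / eps) - tau12(v, eps)) / (1 - eps**2)

def otf_moment(eps, n=4000):
    """Midpoint-rule estimate of the integral of tau(v; eps) * v over 0 <= v <= 1."""
    h = 1.0 / n
    return sum(tau_annular((k + 0.5) * h, eps) * (k + 0.5) * h for k in range(n)) * h

eps = 0.5
print(tau_annular(0.0, eps))                       # unity at the origin
print(tau_annular(0.9, eps), tau_circ(0.9))        # high-frequency gain
print(otf_moment(eps), (1 - eps**2) / 8)
```

At frequencies above $(1+\epsilon)/2$ only the first term of Eq. (5-3) survives, so the ratio of the two values on the second printed line should equal $(1-\epsilon^2)^{-1}$.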
[Figure 5-4: plot of $\tau(\nu;\epsilon)$ versus $\nu$ (0 to 1.0) for $\epsilon$ = 0, 0.25, 0.50, and 0.75.]
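\[ \int_0^1 \tau(v;\epsilon)\,v\,dv = \left(1-\epsilon^2\right)/8 . \tag{5-6} \]
\[ \tau'(0;\epsilon) = -\frac{4}{\pi(1-\epsilon)} . \tag{5-7} \]
\[ S \equiv I(0;\epsilon) = \frac{1}{\pi^2(1-\epsilon^2)^2}\left|\int_\epsilon^1\!\!\int_0^{2\pi} \exp[i\Phi(\rho,\theta;\epsilon)]\,\rho\,d\rho\,d\theta\right|^2 . \tag{5-8} \]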
The approximate value of the Strehl ratio can be obtained from the aberration variance
with n = 1 and 2, respectively. Table 5-1 gives the form as well as the standard deviation
$\sigma_\Phi$ of a primary aberration.
Table 5-1. Primary aberrations and their standard deviations for a system with a
uniformly illuminated annular pupil of obscuration ratio $\epsilon$.
Aberration          $\Phi(\rho,\theta)$              $\sigma_\Phi$
Spherical           $A_s\rho^4$                      $(4-\epsilon^2-6\epsilon^4-\epsilon^6+4\epsilon^8)^{1/2}A_s/(3\sqrt{5})$
Coma                $A_c\rho^3\cos\theta$            $(1+\epsilon^2+\epsilon^4+\epsilon^6)^{1/2}A_c/(2\sqrt{2})$
Astigmatism         $A_a\rho^2\cos^2\theta$          $(1+\epsilon^4)^{1/2}A_a/4$
Distortion (tilt)   $A_t\rho\cos\theta$              $(1+\epsilon^2)^{1/2}A_t/2$
For a small aberration, we balance a classical aberration with one or more aberrations
of lower order to minimize its variance and thereby maximize the corresponding Strehl
ratio. Thus, for example, we balance spherical aberration with defocus, as in Chapter 4,
and write it as
\[ \Phi(\rho;\epsilon) = A_s\rho^4 + B_d\rho^2 . \tag{5-11} \]
We determine the amount of defocus $B_d$ such that the variance $\sigma_\Phi^2$ is minimized; i.e., we
calculate $\sigma_\Phi^2$ and let
\[ \frac{\partial\sigma_\Phi^2}{\partial B_d} = 0 . \tag{5-12} \]
Figure 5-5 shows how the standard deviation of an aberration, for a given value of
the aberration coefficient $A_i$, varies with the obscuration ratio of the pupil. In Figures 5-5a
and 5-5b, the amounts of defocus and tilt required to minimize the variance of
spherical aberration and coma, respectively, are also shown. We observe from these
figures that the standard deviation of spherical and balanced spherical aberrations and
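Table 5-2. Balanced primary aberrations, their standard deviation, and diffraction
focus.
Aberration             $\Phi(\rho,\theta;\epsilon)$    $\sigma_\Phi$    Diffraction focus
Balanced spherical     $A_s\left[\rho^4 - (1+\epsilon^2)\rho^2\right]$    $(1-\epsilon^2)^2A_s/(6\sqrt{5})$    $\left[0,\ 0,\ 8(1+\epsilon^2)F^2A_s\right]$
Balanced coma          $A_c\left[\rho^3 - \dfrac{2}{3}\,\dfrac{1+\epsilon^2+\epsilon^4}{1+\epsilon^2}\,\rho\right]\cos\theta$    $\dfrac{(1-\epsilon^2)(1+4\epsilon^2+\epsilon^4)^{1/2}}{6\sqrt{2}\,(1+\epsilon^2)^{1/2}}\,A_c$    $\left[\dfrac{4(1+\epsilon^2+\epsilon^4)}{3(1+\epsilon^2)}\,FA_c,\ 0,\ 0\right]$
Balanced astigmatism   $A_a\rho^2\left(\cos^2\theta - 1/2\right)$    $(1+\epsilon^2+\epsilon^4)^{1/2}A_a/(2\sqrt{6})$    $\left(0,\ 0,\ 4F^2A_a\right)$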
[Figure 5-5, panels (c)–(e): standard deviation versus obscuration ratio (0 to 1.0). (c) Astigmatism and balanced astigmatism, $\sigma_\Phi/A_a$. (d) Defocus, $\sigma_\Phi/A_d$. (e) Tilt, $\sigma_\Phi/A_t$.]
\[ A_{j+1} = N_{j+1}\left[Z_{j+1} - \sum_{k=1}^{j}\langle Z_{j+1}A_k\rangle A_k\right] , \tag{5-13} \]
\[ \langle A_jA_{j'}\rangle = \delta_{jj'} . \tag{5-15} \]
where n and m are positive integers (including zero), $n - m \ge 0$ and even, and $R_n^m(\rho;\epsilon)$ is
an annular radial polynomial.
Substituting Eqs. (5-16a)–(5-16c) into Eq. (5-15), we find that the annular radial
polynomials obey the orthogonality condition
\[ \int_\epsilon^1 R_n^m(\rho;\epsilon)\,R_{n'}^m(\rho;\epsilon)\,\rho\,d\rho = \frac{1-\epsilon^2}{2(n+1)}\,\delta_{nn'} . \tag{5-17} \]
In the two-index n and m representation $A_n^m(\rho,\theta;\epsilon)$ of an annular polynomial, Eq. (5-13)
can be written
\[ A_n^m = N_n^m\left[Z_n^m - \sum_{i=1}^{(n-m)/2}\langle Z_n^mA_{n-2i}^m\rangle A_{n-2i}^m\right] , \tag{5-18} \]
where $N_n^m$ replaces the normalization constant $N_j$ and, as in Eq. (5-13), the angular
brackets indicate a mean value over the unit annulus. Substituting Eqs. (5-16a)–(5-16c)
into Eq. (5-18), we find that the annular radial polynomials are given by
\[ R_n^m(\rho;\epsilon) = N_n^m\left[R_n^m(\rho) - \sum_{i\ge1}^{(n-m)/2}(n-2i+1)\,\langle R_n^m(\rho)R_{n-2i}^m(\rho;\epsilon)\rangle\,R_{n-2i}^m(\rho;\epsilon)\right] , \tag{5-19} \]
where
\[ \langle R_n^m(\rho)R_{n'}^m(\rho;\epsilon)\rangle = \frac{2}{1-\epsilon^2}\int_\epsilon^1 R_n^m(\rho)\,R_{n'}^m(\rho;\epsilon)\,\rho\,d\rho . \tag{5-20} \]
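The recursion of Eq. (5-19) is a Gram–Schmidt orthogonalization of the circle radial polynomials of a given m over the annulus. As an illustration, the sketch below builds $R_1^1(\rho;\epsilon)$ and $R_3^1(\rho;\epsilon)$ for ε = 0.5 and compares the result with the closed form of Eq. (5-26). The polynomial representation as a dictionary and all function names are assumptions made for this example.

```python
import math

def mean_product(p, q, eps):
    """Annular mean of Eq. (5-20): (2/(1-eps^2)) * integral_eps^1 p(rho) q(rho) rho d rho.
    p and q are polynomials in rho stored as {power: coefficient}."""
    total = 0.0
    for i, a in p.items():
        for j, b in q.items():
            total += a * b * (1 - eps**(i + j + 2)) / (i + j + 2)
    return 2.0 * total / (1 - eps**2)

def annular_radial(circle_polys, eps):
    """Orthogonalize circle radial polynomials of one azimuthal frequency m.
    circle_polys is a list of (n, poly) in increasing n. Each result is normalized
    so that the annular mean of R^2 equals 1/(n+1), consistent with Eq. (5-17)."""
    result = []
    for n, poly in circle_polys:
        new = dict(poly)
        for k, prev in result:
            c = (k + 1) * mean_product(poly, prev, eps)   # projection onto a lower-order term
            for power, coeff in prev.items():
                new[power] = new.get(power, 0.0) - c * coeff
        norm = math.sqrt((n + 1) * mean_product(new, new, eps))
        result.append((n, {pw: cf / norm for pw, cf in new.items()}))
    return result

eps = 0.5
R11 = {1: 1.0}                 # circle polynomial rho
R31 = {3: 3.0, 1: -2.0}        # circle polynomial 3 rho^3 - 2 rho
(_, r11), (_, r31) = annular_radial([(1, R11), (3, R31)], eps)

# Closed form of Eq. (5-26), expanded in powers of rho, for comparison.
B = (1 - eps**2) * math.sqrt((1 + eps**2) * (1 + 4*eps**2 + eps**4))
closed = {3: 3 * (1 + eps**2) / B, 1: -2 * (1 + eps**2 + eps**4) / B}
print(r31, closed)
```

The factor (k + 1) in the projection plays the role of (n − 2i + 1) in Eq. (5-19); it arises because the annular mean of the square of a normalized radial polynomial of degree k is 1/(k + 1).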
For m = 0, the annular radial polynomials are equal to the Legendre polynomials
$P_n(\cdot)$ according to
\[ R_{2n}^0(\rho;\epsilon) = P_n\left[\frac{2(\rho^2-\epsilon^2)}{1-\epsilon^2} - 1\right] . \tag{5-21} \]
Thus, they can be obtained from the circle radial polynomials $R_{2n}^0(\rho)$ by replacing $\rho$ with
$\left[(\rho^2-\epsilon^2)/(1-\epsilon^2)\right]^{1/2}$, i.e.,
\[ R_{2n}^0(\rho;\epsilon) = R_{2n}^0\left[\left(\frac{\rho^2-\epsilon^2}{1-\epsilon^2}\right)^{1/2}\right] . \tag{5-22} \]
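Equation (5-21) gives a convenient way of evaluating the radially symmetric annular polynomials. A brief sketch, with hypothetical function names, compares the Legendre form for n = 2 with the closed-form $R_4^0(\rho;\epsilon)$ listed in Table 5-6.

```python
def legendre(n, x):
    """P_n(x) from the recurrence (k+1) P_{k+1} = (2k+1) x P_k - k P_{k-1}."""
    p_prev, p = 1.0, x
    if n == 0:
        return p_prev
    for k in range(1, n):
        p_prev, p = p, ((2*k + 1) * x * p - k * p_prev) / (k + 1)
    return p

def annular_radial_m0(degree, rho, eps):
    """R_{2n}^0(rho; eps) of Eq. (5-21); degree = 2n is the even radial degree."""
    x = 2.0 * (rho**2 - eps**2) / (1.0 - eps**2) - 1.0
    return legendre(degree // 2, x)

def r40_closed(rho, eps):
    """Closed form of R_4^0(rho; eps) as listed in Table 5-6."""
    return (6*rho**4 - 6*(1 + eps**2)*rho**2 + 1 + 4*eps**2 + eps**4) / (1 - eps**2)**2

eps = 0.5
for rho in (0.5, 0.7, 0.9, 1.0):
    print(rho, annular_radial_m0(4, rho, eps), r40_closed(rho, eps))
```

Since $P_n(1) = 1$, the m = 0 polynomials evaluated this way equal unity at the outer edge of the annulus.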
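Given that $R_n^n(\rho) = \rho^n$ [see Eq. (4-39)], it can be seen from Eqs. (5-17) and (5-19) that
\[ R_n^n(\rho;\epsilon) = \rho^n\left\{\left(1-\epsilon^2\right)\Big/\left[1-\epsilon^{2(n+1)}\right]\right\}^{1/2} \tag{5-23a} \]
\[ \phantom{R_n^n(\rho;\epsilon)} = \rho^n\left(\sum_{i=0}^{n}\epsilon^{2i}\right)^{-1/2} . \tag{5-23b} \]
Moreover,
\[ R_n^{n-2}(\rho;\epsilon) = \frac{n\rho^n - (n-1)\left[\left(1-\epsilon^{2n}\right)\big/\left(1-\epsilon^{2(n-1)}\right)\right]\rho^{n-2}}{\left\{\left(1-\epsilon^2\right)^{-1}\left[n^2\left(1-\epsilon^{2(n+1)}\right) - \left(n^2-1\right)\left(1-\epsilon^{2n}\right)^2\big/\left(1-\epsilon^{2(n-1)}\right)\right]\right\}^{1/2}} . \tag{5-24} \]
It is evident that an annular radial polynomial $R_n^n(\rho;\epsilon)$ differs from the corresponding
circle polynomial $R_n^n(\rho)$ only in its normalization. We also note that
\[ R_n^m(1;\epsilon) = 1 , \quad m = 0 \tag{5-25a} \]
\[ \phantom{R_n^m(1;\epsilon)} \ne 1 , \quad m \ne 0 . \tag{5-25b} \]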
The annular polynomials are also unique like the circle polynomials. They not only
are orthogonal over an annular pupil but also include wavefront tilt and defocus and
balanced classical aberrations as members of the polynomial set. For example, A6 , A8 ,
and A11 represent the balanced primary aberrations of astigmatism, coma, and spherical
aberration, as may be seen by comparing their forms with those given in Table 5-2. The
annular polynomials may be referred to as the orthogonal aberrations because of their
orthogonality over the annular pupil.
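$A_1 = Z_1$
$A_2 = (1+\epsilon^2)^{-1/2}Z_2$
$A_3 = (1+\epsilon^2)^{-1/2}Z_3$
$A_4 = (1-\epsilon^2)^{-1}\left(-\sqrt{3}\,\epsilon^2Z_1 + Z_4\right)$
$A_5 = (1+\epsilon^2+\epsilon^4)^{-1/2}Z_5$
$A_6 = (1+\epsilon^2+\epsilon^4)^{-1/2}Z_6$
$A_7 = B^{-1}\left[-2\sqrt{2}\,\epsilon^4Z_3 + (1+\epsilon^2)Z_7\right]$
$A_8 = B^{-1}\left[-2\sqrt{2}\,\epsilon^4Z_2 + (1+\epsilon^2)Z_8\right]$
    $B = (1-\epsilon^2)\left[(1+\epsilon^2)(1+4\epsilon^2+\epsilon^4)\right]^{1/2}$
$A_9 = (1+\epsilon^2+\epsilon^4+\epsilon^6)^{-1/2}Z_9$
$A_{10} = (1+\epsilon^2+\epsilon^4+\epsilon^6)^{-1/2}Z_{10}$
$A_{11} = (1-\epsilon^2)^{-2}\left[\sqrt{5}\,\epsilon^2(1+\epsilon^2)Z_1 - \sqrt{15}\,\epsilon^2Z_4 + Z_{11}\right]$
$A_{12} = \left(\dfrac{1+\epsilon^2+\epsilon^4}{1+4\epsilon^2+10\epsilon^4+4\epsilon^6+\epsilon^8}\right)^{1/2}\left(-\sqrt{15}\,\dfrac{\epsilon^6}{1-\epsilon^6}\,Z_6 + \dfrac{1}{1-\epsilon^2}\,Z_{12}\right)$
$A_{13} = \left(\dfrac{1+\epsilon^2+\epsilon^4}{1+4\epsilon^2+10\epsilon^4+4\epsilon^6+\epsilon^8}\right)^{1/2}\left(-\sqrt{15}\,\dfrac{\epsilon^6}{1-\epsilon^6}\,Z_5 + \dfrac{1}{1-\epsilon^2}\,Z_{13}\right)$
$A_{14} = (1+\epsilon^2+\epsilon^4+\epsilon^6+\epsilon^8)^{-1/2}Z_{14}$
$A_{15} = (1+\epsilon^2+\epsilon^4+\epsilon^6+\epsilon^8)^{-1/2}Z_{15}$
$A_{16} = \dfrac{1}{(1-\epsilon^2)^2}\left\{\dfrac{\epsilon^4}{a}\left[\sqrt{3}\,(3+4\epsilon^2+3\epsilon^4)Z_2 - 2\sqrt{6}\,(3+\epsilon^2)Z_8\right] + bZ_{16}\right\}$
$A_{17} = \dfrac{1}{(1-\epsilon^2)^2}\left\{\dfrac{\epsilon^4}{a}\left[\sqrt{3}\,(3+4\epsilon^2+3\epsilon^4)Z_3 - 2\sqrt{6}\,(3+\epsilon^2)Z_7\right] + bZ_{17}\right\}$
    $a = (1+13\epsilon^2+46\epsilon^4+46\epsilon^6+13\epsilon^8+\epsilon^{10})^{1/2}$ ,  $b = \left(\dfrac{1+4\epsilon^2+\epsilon^4}{1+9\epsilon^2+9\epsilon^4+\epsilon^6}\right)^{1/2}$
$A_{18} = \left(\dfrac{1+\epsilon^2+\epsilon^4+\epsilon^6}{1+4\epsilon^2+10\epsilon^4+20\epsilon^6+10\epsilon^8+4\epsilon^{10}+\epsilon^{12}}\right)^{1/2}\left(-2\sqrt{6}\,\dfrac{\epsilon^8}{1-\epsilon^8}\,Z_{10} + \dfrac{1}{1-\epsilon^2}\,Z_{18}\right)$
$A_{19} = \left(\dfrac{1+\epsilon^2+\epsilon^4+\epsilon^6}{1+4\epsilon^2+10\epsilon^4+20\epsilon^6+10\epsilon^8+4\epsilon^{10}+\epsilon^{12}}\right)^{1/2}\left(-2\sqrt{6}\,\dfrac{\epsilon^8}{1-\epsilon^8}\,Z_9 + \dfrac{1}{1-\epsilon^2}\,Z_{19}\right)$
$A_{20} = (1+\epsilon^2+\epsilon^4+\epsilon^6+\epsilon^8+\epsilon^{10})^{-1/2}Z_{20}$
$A_{21} = (1+\epsilon^2+\epsilon^4+\epsilon^6+\epsilon^8+\epsilon^{10})^{-1/2}Z_{21}$
$A_{22} = (1-\epsilon^2)^{-3}\left[-\sqrt{7}\,\epsilon^2(1+3\epsilon^2+\epsilon^4)Z_1 + \sqrt{21}\,\epsilon^2(1+2\epsilon^2)Z_4 - \sqrt{35}\,\epsilon^2Z_{11} + Z_{22}\right]$
$A_{23} = \dfrac{1}{(1-\epsilon^2)^2}\left\{\dfrac{\epsilon^6}{\gamma}\left[\sqrt{21}\,(2+3\epsilon^2+3\epsilon^4+2\epsilon^6)Z_5 - \sqrt{35}\,(6+3\epsilon^2+\epsilon^4)Z_{13}\right] + \delta Z_{23}\right\}$
$A_{24} = \dfrac{1}{(1-\epsilon^2)^2}\left\{\dfrac{\epsilon^6}{\gamma}\left[\sqrt{21}\,(2+3\epsilon^2+3\epsilon^4+2\epsilon^6)Z_6 - \sqrt{35}\,(6+3\epsilon^2+\epsilon^4)Z_{12}\right] + \delta Z_{24}\right\}$
    $\gamma = (1+13\epsilon^2+91\epsilon^4+339\epsilon^6+792\epsilon^8+1028\epsilon^{10}+792\epsilon^{12}+339\epsilon^{14}+91\epsilon^{16}+13\epsilon^{18}+\epsilon^{20})^{1/2}$
    $\delta = \left(\dfrac{1+4\epsilon^2+10\epsilon^4+4\epsilon^6+\epsilon^8}{1+9\epsilon^2+45\epsilon^4+65\epsilon^6+45\epsilon^8+9\epsilon^{10}+\epsilon^{12}}\right)^{1/2}$
$A_{25} = c\left(-\sqrt{35}\,\dfrac{\epsilon^{10}}{1-\epsilon^{10}}\,Z_{15} + \dfrac{1}{1-\epsilon^2}\,Z_{25}\right)$
$A_{26} = c\left(-\sqrt{35}\,\dfrac{\epsilon^{10}}{1-\epsilon^{10}}\,Z_{14} + \dfrac{1}{1-\epsilon^2}\,Z_{26}\right)$
    $c = \left(\dfrac{1+\epsilon^2+\epsilon^4+\epsilon^6+\epsilon^8}{1+4\epsilon^2+10\epsilon^4+20\epsilon^6+35\epsilon^8+20\epsilon^{10}+10\epsilon^{12}+4\epsilon^{14}+\epsilon^{16}}\right)^{1/2}$
$A_{27} = (1+\epsilon^2+\epsilon^4+\epsilon^6+\epsilon^8+\epsilon^{10}+\epsilon^{12})^{-1/2}Z_{27}$
$A_{28} = (1+\epsilon^2+\epsilon^4+\epsilon^6+\epsilon^8+\epsilon^{10}+\epsilon^{12})^{-1/2}Z_{28}$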
It is evident from Eq. (5-13) that each annular polynomial is a linear combination of
the circle polynomials, without any mixing of the cosine and the sine terms. Similarly,
because of the same angular dependence of an annular polynomial $A_j(\rho,\theta;\epsilon)$ as the
corresponding circle polynomial $Z_j(\rho,\theta)$, each radial polynomial $R_n^m(\rho;\epsilon)$ can be written
as a linear combination of the polynomials $R_n^m(\rho)$, $R_{n-2}^m(\rho)$, etc. This, of course, is also
evident from Eq. (5-19). For example,
\[ R_3^1(\rho;\epsilon) = \frac{1}{B}\left[\left(1+\epsilon^2\right)R_3^1(\rho) - 2\epsilon^4R_1^1(\rho)\right] , \tag{5-26} \]
where
\[ B = \left(1-\epsilon^2\right)\left[\left(1+\epsilon^2\right)\left(1+4\epsilon^2+\epsilon^4\right)\right]^{1/2} , \tag{5-27} \]
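$M_{11} = 1$
$M_{22} = (1+\epsilon^2)^{-1/2} = M_{33}$
$M_{41} = -\sqrt{3}\,\epsilon^2(1-\epsilon^2)^{-1}$
$M_{44} = (1-\epsilon^2)^{-1}$
$M_{55} = (1+\epsilon^2+\epsilon^4)^{-1/2} = M_{66}$
$M_{73} = -2\sqrt{2}\,\epsilon^4/B = M_{82}$
$M_{77} = (1+\epsilon^2)/B = M_{88}$
    $B = (1-\epsilon^2)\left[(1+\epsilon^2)(1+4\epsilon^2+\epsilon^4)\right]^{1/2}$
$M_{99} = (1+\epsilon^2+\epsilon^4+\epsilon^6)^{-1/2} = M_{10,10}$
$M_{11,1} = \sqrt{5}\,\epsilon^2(1+\epsilon^2)(1-\epsilon^2)^{-2}$
$M_{11,4} = -\sqrt{15}\,\epsilon^2(1-\epsilon^2)^{-2}$
$M_{11,11} = (1-\epsilon^2)^{-2}$
$M_{12,6} = -\sqrt{15}\,\dfrac{\epsilon^6}{1-\epsilon^6}\left(\dfrac{1+\epsilon^2+\epsilon^4}{1+4\epsilon^2+10\epsilon^4+4\epsilon^6+\epsilon^8}\right)^{1/2} = M_{13,5}$
$M_{12,12} = \dfrac{1}{1-\epsilon^2}\left(\dfrac{1+\epsilon^2+\epsilon^4}{1+4\epsilon^2+10\epsilon^4+4\epsilon^6+\epsilon^8}\right)^{1/2} = M_{13,13}$
$M_{14,14} = (1+\epsilon^2+\epsilon^4+\epsilon^6+\epsilon^8)^{-1/2} = M_{15,15}$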
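Poly.     $A_j(x, y; \epsilon)$, with $\rho^2 = x^2 + y^2$
$A_1$     $1$
$A_2$     $2x/(1+\epsilon^2)^{1/2}$
$A_3$     $2y/(1+\epsilon^2)^{1/2}$
$A_4$     $\sqrt{3}\,(2\rho^2 - 1 - \epsilon^2)/(1-\epsilon^2)$
$A_5$     $2\sqrt{6}\,xy/(1+\epsilon^2+\epsilon^4)^{1/2}$
$A_6$     $\sqrt{6}\,(x^2-y^2)/(1+\epsilon^2+\epsilon^4)^{1/2}$
$A_7$     $\sqrt{8}\,y\left[3(1+\epsilon^2)\rho^2 - 2(1+\epsilon^2+\epsilon^4)\right] / \left\{(1-\epsilon^2)\left[(1+\epsilon^2)(1+4\epsilon^2+\epsilon^4)\right]^{1/2}\right\}$
$A_8$     $\sqrt{8}\,x\left[3(1+\epsilon^2)\rho^2 - 2(1+\epsilon^2+\epsilon^4)\right] / \left\{(1-\epsilon^2)\left[(1+\epsilon^2)(1+4\epsilon^2+\epsilon^4)\right]^{1/2}\right\}$
$A_9$     $\sqrt{8}\,y(3x^2-y^2)/(1+\epsilon^2+\epsilon^4+\epsilon^6)^{1/2}$
$A_{10}$  $\sqrt{8}\,x(x^2-3y^2)/(1+\epsilon^2+\epsilon^4+\epsilon^6)^{1/2}$
$A_{11}$  $\sqrt{5}\left[6\rho^4 - 6(1+\epsilon^2)\rho^2 + (1+4\epsilon^2+\epsilon^4)\right]/(1-\epsilon^2)^2$
$A_{12}$  $\sqrt{10}\,(x^2-y^2)\left[4\rho^2 - 3(1-\epsilon^8)/(1-\epsilon^6)\right] / \left\{(1-\epsilon^2)^{-1}\left[16(1-\epsilon^{10}) - 15(1-\epsilon^8)^2/(1-\epsilon^6)\right]\right\}^{1/2}$
$A_{13}$  $2\sqrt{10}\,xy\left[4\rho^2 - 3(1-\epsilon^8)/(1-\epsilon^6)\right] / \left\{(1-\epsilon^2)^{-1}\left[16(1-\epsilon^{10}) - 15(1-\epsilon^8)^2/(1-\epsilon^6)\right]\right\}^{1/2}$
$A_{14}$  $\sqrt{10}\,(\rho^4 - 8x^2y^2)/(1+\epsilon^2+\epsilon^4+\epsilon^6+\epsilon^8)^{1/2}$
$A_{15}$  $4\sqrt{10}\,xy(x^2-y^2)/(1+\epsilon^2+\epsilon^4+\epsilon^6+\epsilon^8)^{1/2}$
$A_{16}$  $\sqrt{12}\,x\left[10(1+4\epsilon^2+\epsilon^4)\rho^4 - 12(1+4\epsilon^2+4\epsilon^4+\epsilon^6)\rho^2 + 3(1+4\epsilon^2+10\epsilon^4+4\epsilon^6+\epsilon^8)\right] / \left\{(1-\epsilon^2)^2\left[(1+4\epsilon^2+\epsilon^4)(1+9\epsilon^2+9\epsilon^4+\epsilon^6)\right]^{1/2}\right\}$
$A_{17}$  $\sqrt{12}\,y\left[10(1+4\epsilon^2+\epsilon^4)\rho^4 - 12(1+4\epsilon^2+4\epsilon^4+\epsilon^6)\rho^2 + 3(1+4\epsilon^2+10\epsilon^4+4\epsilon^6+\epsilon^8)\right] / \left\{(1-\epsilon^2)^2\left[(1+4\epsilon^2+\epsilon^4)(1+9\epsilon^2+9\epsilon^4+\epsilon^6)\right]^{1/2}\right\}$
$A_{18}$  $\sqrt{12}\,x(x^2-3y^2)\left[5\rho^2 - 4(1-\epsilon^{10})/(1-\epsilon^8)\right] / \left\{(1-\epsilon^2)^{-1}\left[25(1-\epsilon^{12}) - 24(1-\epsilon^{10})^2/(1-\epsilon^8)\right]\right\}^{1/2}$
$A_{19}$  $\sqrt{12}\,y(3x^2-y^2)\left[5\rho^2 - 4(1-\epsilon^{10})/(1-\epsilon^8)\right] / \left\{(1-\epsilon^2)^{-1}\left[25(1-\epsilon^{12}) - 24(1-\epsilon^{10})^2/(1-\epsilon^8)\right]\right\}^{1/2}$
$A_{20}$  $\sqrt{12}\,x(16x^4 - 20x^2\rho^2 + 5\rho^4)/(1+\epsilon^2+\epsilon^4+\epsilon^6+\epsilon^8+\epsilon^{10})^{1/2}$
$A_{21}$  $\sqrt{12}\,y(16y^4 - 20y^2\rho^2 + 5\rho^4)/(1+\epsilon^2+\epsilon^4+\epsilon^6+\epsilon^8+\epsilon^{10})^{1/2}$
$A_{22}$  $\sqrt{7}\left[20\rho^6 - 30(1+\epsilon^2)\rho^4 + 12(1+3\epsilon^2+\epsilon^4)\rho^2 - (1+9\epsilon^2+9\epsilon^4+\epsilon^6)\right]/(1-\epsilon^2)^3$
$A_{23}$  $2\sqrt{14}\,xy\left[15(1+4\epsilon^2+10\epsilon^4+4\epsilon^6+\epsilon^8)\rho^4 - 20(1+4\epsilon^2+10\epsilon^4+10\epsilon^6+4\epsilon^8+\epsilon^{10})\rho^2 + 6(1+4\epsilon^2+10\epsilon^4+20\epsilon^6+10\epsilon^8+4\epsilon^{10}+\epsilon^{12})\right] / \left\{(1-\epsilon^2)^2\left[(1+4\epsilon^2+10\epsilon^4+4\epsilon^6+\epsilon^8)(1+9\epsilon^2+45\epsilon^4+65\epsilon^6+45\epsilon^8+9\epsilon^{10}+\epsilon^{12})\right]^{1/2}\right\}$
$A_{24}$  $\sqrt{14}\,(x^2-y^2)\left[15(1+4\epsilon^2+10\epsilon^4+4\epsilon^6+\epsilon^8)\rho^4 - 20(1+4\epsilon^2+10\epsilon^4+10\epsilon^6+4\epsilon^8+\epsilon^{10})\rho^2 + 6(1+4\epsilon^2+10\epsilon^4+20\epsilon^6+10\epsilon^8+4\epsilon^{10}+\epsilon^{12})\right] / \left\{(1-\epsilon^2)^2\left[(1+4\epsilon^2+10\epsilon^4+4\epsilon^6+\epsilon^8)(1+9\epsilon^2+45\epsilon^4+65\epsilon^6+45\epsilon^8+9\epsilon^{10}+\epsilon^{12})\right]^{1/2}\right\}$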
[Figure 5-6: plots of $R_n^m(\rho;\epsilon)$ versus $\rho$ from 0.5 to 1, with curves labeled by n. (a) $R_n^0(\rho;\epsilon)$. (b) $R_n^1(\rho;\epsilon)$. (c) $R_n^2(\rho;\epsilon)$.]
Figure 5-6. Variation of an annular radial polynomial $R_n^m(\rho;\epsilon)$ with $\rho$ for $\epsilon = 0.5$.
(a) Defocus and spherical aberrations. (b) Tilt and coma. (c) Astigmatism.
and
\[ R_4^0(\rho;\epsilon) = \left(1-\epsilon^2\right)^{-2}\left[R_4^0(\rho) - 3\epsilon^2R_2^0(\rho) + \epsilon^2\left(1+\epsilon^2\right)R_0^0(\rho)\right] . \tag{5-28} \]
The radial annular polynomials $R_n^m(\rho;\epsilon)$ for $n \le 8$ are listed in Table 5-6. Table 5-7 lists
the full annular polynomials, illustrating their ordering.
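\[ a_j = \frac{1}{\pi(1-\epsilon^2)}\int_\epsilon^1\!\!\int_0^{2\pi} W(\rho,\theta;\epsilon)\,A_j(\rho,\theta;\epsilon)\,\rho\,d\rho\,d\theta . \tag{5-30} \]
The mean and the mean square values of the aberration function are given by
\[ \langle W(\rho,\theta;\epsilon)\rangle = a_1 \tag{5-31} \]
and
\[ \langle W^2(\rho,\theta;\epsilon)\rangle = \sum_{j=1}^{J} a_j^2 . \tag{5-32} \]
The variance of the aberration function is accordingly given by
\[ \sigma_W^2 = \langle W^2\rangle - \langle W\rangle^2 = \sum_{j=2}^{J} a_j^2 . \tag{5-33} \]
As explained in Section 3.3, the annular expansion coefficients yield a least-squares fit of
the aberration function with J polynomials.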
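As a worked illustration of Eqs. (5-30)–(5-33), consider the radially symmetric aberration $W = \rho^4$ over an annulus with ε = 0.5. Only the piston, defocus, and spherical annular polynomials ($A_1$, $A_4$, and $A_{11}$) contribute. The sketch below uses hypothetical helper names, and polynomials in ρ are stored as dictionaries. It computes the three coefficients from annular means and compares the variance obtained from the coefficients with the one computed directly.

```python
import math

def mean(poly, eps):
    """Annular mean of a radially symmetric polynomial {power: coefficient}."""
    return sum(c * 2 * (1 - eps**(p + 2)) / ((p + 2) * (1 - eps**2))
               for p, c in poly.items())

def multiply(p, q):
    """Product of two polynomials in rho."""
    out = {}
    for i, a in p.items():
        for j, b in q.items():
            out[i + j] = out.get(i + j, 0.0) + a * b
    return out

def annular_basis(eps):
    """A1, A4, and A11 (piston, defocus, primary spherical) as polynomials in rho."""
    k = 1 - eps**2
    return {
        1: {0: 1.0},
        4: {2: 2 * math.sqrt(3) / k, 0: -math.sqrt(3) * (1 + eps**2) / k},
        11: {4: 6 * math.sqrt(5) / k**2,
             2: -6 * math.sqrt(5) * (1 + eps**2) / k**2,
             0: math.sqrt(5) * (1 + 4 * eps**2 + eps**4) / k**2},
    }

eps = 0.5
W = {4: 1.0}   # spherical aberration rho^4 with unit coefficient
coeffs = {j: mean(multiply(W, A), eps) for j, A in annular_basis(eps).items()}   # Eq. (5-30)
variance_from_coeffs = coeffs[4]**2 + coeffs[11]**2                             # Eq. (5-33)
variance_direct = mean(multiply(W, W), eps) - mean(W, eps)**2
print(coeffs)
print(variance_from_coeffs, variance_direct)
```

The coefficient of $A_{11}$ obtained this way equals the standard deviation of the balanced spherical aberration in Table 5-2 for a unit coefficient, which is one way of seeing that $A_{11}$ represents the balanced aberration.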
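Table 5-6. Annular radial polynomials $R_n^m(\rho;\epsilon)$, where $\epsilon$ is the obscuration ratio
and $\epsilon \le \rho \le 1$.
n  m  $R_n^m(\rho;\epsilon)$
0  0  $1$
1  1  $\rho/(1+\epsilon^2)^{1/2}$
2  0  $(2\rho^2 - 1 - \epsilon^2)/(1-\epsilon^2)$
2  2  $\rho^2/(1+\epsilon^2+\epsilon^4)^{1/2}$
3  1  $\left[3(1+\epsilon^2)\rho^3 - 2(1+\epsilon^2+\epsilon^4)\rho\right] / \left\{(1-\epsilon^2)\left[(1+\epsilon^2)(1+4\epsilon^2+\epsilon^4)\right]^{1/2}\right\}$
3  3  $\rho^3/(1+\epsilon^2+\epsilon^4+\epsilon^6)^{1/2}$
4  0  $\left[6\rho^4 - 6(1+\epsilon^2)\rho^2 + 1 + 4\epsilon^2 + \epsilon^4\right]/(1-\epsilon^2)^2$
4  2  $\left\{4\rho^4 - 3\left[(1-\epsilon^8)/(1-\epsilon^6)\right]\rho^2\right\} / \left\{(1-\epsilon^2)^{-1}\left[16(1-\epsilon^{10}) - 15(1-\epsilon^8)^2/(1-\epsilon^6)\right]\right\}^{1/2}$
4  4  $\rho^4/(1+\epsilon^2+\epsilon^4+\epsilon^6+\epsilon^8)^{1/2}$
5  1  $\left[10(1+4\epsilon^2+\epsilon^4)\rho^5 - 12(1+4\epsilon^2+4\epsilon^4+\epsilon^6)\rho^3 + 3(1+4\epsilon^2+10\epsilon^4+4\epsilon^6+\epsilon^8)\rho\right] / \left\{(1-\epsilon^2)^2\left[(1+4\epsilon^2+\epsilon^4)(1+9\epsilon^2+9\epsilon^4+\epsilon^6)\right]^{1/2}\right\}$
5  3  $\left\{5\rho^5 - 4\left[(1-\epsilon^{10})/(1-\epsilon^8)\right]\rho^3\right\} / \left\{(1-\epsilon^2)^{-1}\left[25(1-\epsilon^{12}) - 24(1-\epsilon^{10})^2/(1-\epsilon^8)\right]\right\}^{1/2}$
5  5  $\rho^5/(1+\epsilon^2+\epsilon^4+\epsilon^6+\epsilon^8+\epsilon^{10})^{1/2}$
6  0  $\left[20\rho^6 - 30(1+\epsilon^2)\rho^4 + 12(1+3\epsilon^2+\epsilon^4)\rho^2 - (1+9\epsilon^2+9\epsilon^4+\epsilon^6)\right]/(1-\epsilon^2)^3$
6  2  $\left[15(1+4\epsilon^2+10\epsilon^4+4\epsilon^6+\epsilon^8)\rho^6 - 20(1+4\epsilon^2+10\epsilon^4+10\epsilon^6+4\epsilon^8+\epsilon^{10})\rho^4 + 6(1+4\epsilon^2+10\epsilon^4+20\epsilon^6+10\epsilon^8+4\epsilon^{10}+\epsilon^{12})\rho^2\right] / \left\{(1-\epsilon^2)^2\left[(1+4\epsilon^2+10\epsilon^4+4\epsilon^6+\epsilon^8)(1+9\epsilon^2+45\epsilon^4+65\epsilon^6+45\epsilon^8+9\epsilon^{10}+\epsilon^{12})\right]^{1/2}\right\}$
6  4  $\left\{6\rho^6 - 5\left[(1-\epsilon^{12})/(1-\epsilon^{10})\right]\rho^4\right\} / \left\{(1-\epsilon^2)^{-1}\left[36(1-\epsilon^{14}) - 35(1-\epsilon^{12})^2/(1-\epsilon^{10})\right]\right\}^{1/2}$
6  6  $\rho^6/(1+\epsilon^2+\epsilon^4+\epsilon^6+\epsilon^8+\epsilon^{10}+\epsilon^{12})^{1/2}$
7  1  $a_7^1\rho^7 + b_7^1\rho^5 + c_7^1\rho^3 + d_7^1\rho$
7  3  $a_7^3\rho^7 + b_7^3\rho^5 + c_7^3\rho^3$
7  5  $\left\{7\rho^7 - 6\left[(1-\epsilon^{14})/(1-\epsilon^{12})\right]\rho^5\right\} / \left\{(1-\epsilon^2)^{-1}\left[49(1-\epsilon^{16}) - 48(1-\epsilon^{14})^2/(1-\epsilon^{12})\right]\right\}^{1/2}$
7  7  $\rho^7/(1+\epsilon^2+\epsilon^4+\epsilon^6+\epsilon^8+\epsilon^{10}+\epsilon^{12}+\epsilon^{14})^{1/2}$
8  0  $\left[70\rho^8 - 140(1+\epsilon^2)\rho^6 + 30(3+8\epsilon^2+3\epsilon^4)\rho^4 - 20(1+6\epsilon^2+6\epsilon^4+\epsilon^6)\rho^2 + e_8^0\right]/(1-\epsilon^2)^4$
8  2  $a_8^2\rho^8 + b_8^2\rho^6 + c_8^2\rho^4 + d_8^2\rho^2$
8  4  $a_8^4\rho^8 + b_8^4\rho^6 + c_8^4\rho^4$
8  6  $a_8^6\rho^8 + b_8^6\rho^6$
8  8  $\rho^8/(1+\epsilon^2+\epsilon^4+\epsilon^6+\epsilon^8+\epsilon^{10}+\epsilon^{12}+\epsilon^{14}+\epsilon^{16})^{1/2}$
$a_7^1 = 35(1+9\epsilon^2+9\epsilon^4+\epsilon^6)/A_7^1$
$b_7^1 = -60(1+9\epsilon^2+15\epsilon^4+9\epsilon^6+\epsilon^8)/A_7^1$
$c_7^1 = 30(1+9\epsilon^2+25\epsilon^4+25\epsilon^6+9\epsilon^8+\epsilon^{10})/A_7^1$
$d_7^1 = -4(1+9\epsilon^2+45\epsilon^4+65\epsilon^6+45\epsilon^8+9\epsilon^{10}+\epsilon^{12})/A_7^1$
$A_7^1 = (1-\epsilon^2)^3(1+9\epsilon^2+9\epsilon^4+\epsilon^6)^{1/2}(1+16\epsilon^2+36\epsilon^4+16\epsilon^6+\epsilon^8)^{1/2}$
$a_7^3 = 21(1+4\epsilon^2+10\epsilon^4+20\epsilon^6+10\epsilon^8+4\epsilon^{10}+\epsilon^{12})/A_7^3$
$b_7^3 = -30(1+4\epsilon^2+10\epsilon^4+20\epsilon^6+20\epsilon^8+10\epsilon^{10}+4\epsilon^{12}+\epsilon^{14})/A_7^3$
$c_7^3 = 10(1+4\epsilon^2+10\epsilon^4+20\epsilon^6+35\epsilon^8+20\epsilon^{10}+10\epsilon^{12}+4\epsilon^{14}+\epsilon^{16})/A_7^3$
$A_7^3 = (1-\epsilon^2)^2(1+4\epsilon^2+10\epsilon^4+20\epsilon^6+10\epsilon^8+4\epsilon^{10}+\epsilon^{12})^{1/2}(1+9\epsilon^2+45\epsilon^4+165\epsilon^6+270\epsilon^8+270\epsilon^{10}+165\epsilon^{12}+45\epsilon^{14}+9\epsilon^{16}+\epsilon^{18})^{1/2}$
$e_8^0 = 1+16\epsilon^2+36\epsilon^4+16\epsilon^6+\epsilon^8$
$a_8^2 = 56(1+9\epsilon^2+45\epsilon^4+65\epsilon^6+45\epsilon^8+9\epsilon^{10}+\epsilon^{12})/A_8^2$
$b_8^2 = -105(1+9\epsilon^2+45\epsilon^4+85\epsilon^6+85\epsilon^8+45\epsilon^{10}+9\epsilon^{12}+\epsilon^{14})/A_8^2$
$c_8^2 = 60(1+9\epsilon^2+45\epsilon^4+115\epsilon^6+150\epsilon^8+115\epsilon^{10}+45\epsilon^{12}+9\epsilon^{14}+\epsilon^{16})/A_8^2$
$d_8^2 = -10(1+9\epsilon^2+45\epsilon^4+165\epsilon^6+270\epsilon^8+270\epsilon^{10}+165\epsilon^{12}+45\epsilon^{14}+9\epsilon^{16}+\epsilon^{18})/A_8^2$
$A_8^2 = (1-\epsilon^2)^3(1+9\epsilon^2+45\epsilon^4+65\epsilon^6+45\epsilon^8+9\epsilon^{10}+\epsilon^{12})^{1/2}(1+16\epsilon^2+136\epsilon^4+416\epsilon^6+626\epsilon^8+416\epsilon^{10}+136\epsilon^{12}+16\epsilon^{14}+\epsilon^{16})^{1/2}$
$a_8^4 = 28(1+4\epsilon^2+10\epsilon^4+20\epsilon^6+35\epsilon^8+20\epsilon^{10}+10\epsilon^{12}+4\epsilon^{14}+\epsilon^{16})/A_8^4$
$b_8^4 = -42(1+4\epsilon^2+10\epsilon^4+20\epsilon^6+35\epsilon^8+35\epsilon^{10}+20\epsilon^{12}+10\epsilon^{14}+4\epsilon^{16}+\epsilon^{18})/A_8^4$
$c_8^4 = 15(1+4\epsilon^2+10\epsilon^4+20\epsilon^6+35\epsilon^8+56\epsilon^{10}+35\epsilon^{12}+20\epsilon^{14}+10\epsilon^{16}+4\epsilon^{18}+\epsilon^{20})/A_8^4$
$A_8^4 = (1-\epsilon^2)^2(1+4\epsilon^2+10\epsilon^4+20\epsilon^6+35\epsilon^8+20\epsilon^{10}+10\epsilon^{12}+4\epsilon^{14}+\epsilon^{16})^{1/2}(1+9\epsilon^2+45\epsilon^4+165\epsilon^6+495\epsilon^8+846\epsilon^{10}+994\epsilon^{12}+846\epsilon^{14}+495\epsilon^{16}+165\epsilon^{18}+45\epsilon^{20}+9\epsilon^{22}+\epsilon^{24})^{1/2}$
$a_8^6 = 8(1+\epsilon^2+\epsilon^4+\epsilon^6+\epsilon^8+\epsilon^{10}+\epsilon^{12})/A_8^6$
$b_8^6 = -7(1+\epsilon^2+\epsilon^4+\epsilon^6+\epsilon^8+\epsilon^{10}+\epsilon^{12}+\epsilon^{14})/A_8^6$
$A_8^6 = (1-\epsilon^2)(1+\epsilon^2+\epsilon^4+\epsilon^6+\epsilon^8+\epsilon^{10}+\epsilon^{12})^{1/2}(1+4\epsilon^2+10\epsilon^4+20\epsilon^6+35\epsilon^8+56\epsilon^{10}+84\epsilon^{12}+56\epsilon^{14}+35\epsilon^{16}+20\epsilon^{18}+10\epsilon^{20}+4\epsilon^{22}+\epsilon^{24})^{1/2}$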
* The words “orthonormal annular” should be added to the name, e.g., orthonormal
annular primary spherical aberration.
5.7 STREHL RATIO FOR ANNULAR POLYNOMIAL ABERRATIONS
\[ S = \left[\frac{\sin\left(\sqrt{3}\,a_4\right)}{\sqrt{3}\,a_4}\right]^2 . \tag{5-35} \]
For a defocus aberration sigma of 0.1 wave, $a_4 = 0.2\pi$ and $S = 0.66255$, in agreement
with the result given in Table 5-8. Although Eq. (5-35) reads exactly the same as Eq. (4-82)
for a circular pupil, the longitudinal defocus for a given value of $a_4$ is different for
the annular pupil [see Eq. (5-37)].
\[ W(\rho) = B_d\rho^2 , \tag{5-36} \]
where $B_d$ represents its peak value given by Eq. (4-19). The annular coefficient $a_4$ is
related to the longitudinal defocus $z - R$ according to
\[ a_4 = \frac{\pi}{8\sqrt{3}\,\lambda F^2}\left(1-\epsilon^2\right)\left(z - R\right) . \tag{5-37} \]
The results in Table 5-8 and Figure 5-7 illustrate that the Strehl ratio for a small
aberration is nearly independent of the type of the aberration, and depends primarily on
its sigma value. It is approximately given by Eq. (1-34) as $\exp\left(-\sigma_\Phi^2\right)$, or 0.67, where
$\sigma_\Phi = 0.2\pi$.
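The defocus case can be checked numerically. The following is a minimal sketch, with hypothetical function names, that evaluates the Strehl ratio integral of Eq. (5-8) for $\Phi = a_4A_4(\rho;\epsilon)$ by the midpoint rule and compares it with the closed form of Eq. (5-35) and with the approximation $\exp(-\sigma_\Phi^2)$.

```python
import cmath
import math

def strehl_defocus(a4, eps, n=20000):
    """Strehl ratio of Eq. (5-8) for Phi = a4 * A4(rho; eps), by midpoint integration."""
    total = 0.0 + 0.0j
    h = (1.0 - eps) / n
    for k in range(n):
        rho = eps + (k + 0.5) * h
        phase = a4 * math.sqrt(3) * (2 * rho**2 - 1 - eps**2) / (1 - eps**2)
        total += cmath.exp(1j * phase) * rho * h
    # The theta integral contributes 2*pi; normalize by the annulus area pi*(1 - eps^2).
    return abs(2 * math.pi * total / (math.pi * (1 - eps**2)))**2

a4 = 0.2 * math.pi                     # sigma of 0.1 wave, expressed in radians
numeric = strehl_defocus(a4, 0.5)
closed = (math.sin(math.sqrt(3) * a4) / (math.sqrt(3) * a4))**2   # Eq. (5-35)
approx = math.exp(-a4**2)                                          # exp(-sigma^2)
print(numeric, closed, approx)
```

Because the phase of $A_4$ is linear in $\rho^2$, the numerical result does not depend on the obscuration ratio for a fixed $a_4$, which is why Eq. (5-35) has the same form as for a circular pupil.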
Table 5-8. Strehl ratio S for annular polynomial aberrations for $\epsilon = 0.5$ and a sigma
value of 0.1 wave.
Figure 5-7. Strehl ratio for annular polynomial aberrations for $\epsilon = 0.5$ and a sigma
value of 0.1 wave, shown on a nominal scale as well as on an expanded scale.
As in the case of circle polynomials (see Section 4.8), we illustrate the annular
polynomials for $n \le 8$ in three different but equivalent ways in Figure 5-8 for $\epsilon = 0.5$ and
a sigma value of one wave [8]. For each polynomial, the isometric plot at the top
illustrates its shape. An interferogram is shown on the left, and a corresponding PSF is
shown on the right. The peak-to-valley aberration numbers (in units of wavelength) are
given in Table 5-9. From Eqs. (5-16) for the form of the polynomials, it is evident that
the P-V numbers of two polynomials with the same values of n and m are the same. This
may also be seen from Table 5-7.
The PSF plots represent the images of a point object in the presence of an annular
polynomial aberration. Thus, for example, piston yields the aberration-free PSF (since it
has no effect on the PSF) given by Eq. (5-2). The full width of a square displaying the
PSFs in Figure 5-8 is $24\lambda F$.
The polynomial aberrations $A_2$ and $A_3$, representing the x and y wavefront tilts with
aberration coefficients $a_2$ and $a_3$, displace the PSF in the image plane along the x and y
axes, respectively. If the coefficient $a_2$ is in units of wavelength, it corresponds to a
wavefront tilt angle of $4a_2\lambda/\left[D(1+\epsilon^2)^{1/2}\right]$ about the y axis and displaces the PSF along the
x axis by $4a_2\lambda F/(1+\epsilon^2)^{1/2}$. Similarly, $a_3$ corresponds to a wavefront tilt angle of
$4a_3\lambda/\left[D(1+\epsilon^2)^{1/2}\right]$ about the x axis and displaces the PSF by $4a_3\lambda F/(1+\epsilon^2)^{1/2}$ along the y
axis. As the order of a polynomial aberration increases, the interferograms and the PSFs
become more and more complex.
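As a quick numerical illustration of the tilt relation, the short sketch below (the function name is hypothetical) evaluates the PSF displacement in units of $\lambda F$ for a tilt coefficient given in waves.

```python
import math

def tilt_displacement(a2_waves, eps):
    """PSF displacement along x, in units of lambda*F, for an A2 coefficient in waves."""
    return 4.0 * a2_waves / math.sqrt(1.0 + eps**2)

# One wave of A2 tilt, for a circular pupil and for an annular pupil with eps = 0.5.
print(tilt_displacement(1.0, 0.0), tilt_displacement(1.0, 0.5))
```

For a given coefficient the displacement is smaller for the annular pupil because the normalization of $A_2$ reduces the wavefront slope by the factor $(1+\epsilon^2)^{1/2}$.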
The 3D MTF plots for the primary polynomial aberrations and $A_{10}$ are shown
in Figure 5-9 for a sigma value of 0.1 wave. The contour plots shown below each 3D
MTF figure are in steps of 0.1 from the center out, starting with a value of 0.9 and ending
with zero. The tangential (long dashes), sagittal (medium dashes), and 45° (small dashes)
MTF plots are also shown in this figure, i.e., for the spatial frequency vector along the x
axis, y axis, and at 45° from the x axis, respectively. Figure 5-10a shows the symmetry of
the real and the imaginary parts of the OTF for the orthogonal primary coma $A_8$. The real
part has even symmetry, but the imaginary part has odd symmetry. The real and
imaginary parts of the OTF for the polynomial aberration $A_{10}$ are shown in Figure 5-10b.
Since the aberration is 3-fold symmetric, the imaginary part of the OTF is 3-fold
symmetric, but the real part is 6-fold symmetric, as expected.
Comparing the form of the annular polynomials with those of the circle polynomials
given in Chapter 4, it is easy to see that the symmetry properties of the interferograms,
PSFs, real and imaginary parts of the OTF and the MTFs aberrated by an annular
polynomial aberration are the same as those for a corresponding circle polynomial
aberration in a circular pupil. These properties are summarized in Table 4-6.
[Figure 5-8: one panel per annular polynomial ($A_1$, $A_2$, $A_3$, ...).]
Figure 5-8. Annular polynomials shown as isometric plot on the top, interferogram
on the left, and PSF on the right for $\epsilon = 0.5$ and a sigma value of one wave.
[Figure 5-9: panels for $A_1$ (piston), $A_4$ (defocus), $A_6$ (primary astigmatism), $A_8$ (primary coma), $A_{10}$, and $A_{11}$ (primary spherical).]
Figure 5-9. 3D, tangential or along x axis (in long dashes), sagittal or along y axis (in
medium dashes), and at 45° from the x axis (in small dashes) MTF plots for annular
polynomial aberrations with a sigma value of 0.1 wave for $\epsilon = 0.5$. The solid curve
represents the aberration-free MTF. The spatial frequency $v$ is normalized by the
cutoff frequency $1/\lambda F$. The contour plots below each 3D MTF plot are in steps of
0.1 from the center out, starting with 0.9 and ending with zero.
[Figure 5-10: real (Re) and imaginary (Im) parts of the OTF. (a) $A_8$. (b) $A_{10}$.]
Figure 5-10. Real and imaginary parts of the OTF for an annular polynomial
aberration with a sigma value of 0.1 wave for $\epsilon = 0.5$. (a) $A_8$ (primary coma) shows
the even and odd symmetry of the real and imaginary parts. (b) $A_{10}$ shows the 6-fold
symmetry of the real part and 3-fold symmetry of the imaginary part, in addition to
their even and odd symmetry, respectively. The thick and thin contours of the
imaginary part represent its positive and negative values, respectively.
5.9 SUMMARY
A brief description of the aberration-free PSF and OTF of a system with an annular
pupil is given in Section 5.2, followed by a discussion of the Strehl ratio and
aberration balancing for such a system in Section 5.3. The variation of the standard
deviation of a primary aberration with the obscuration ratio is shown in Figure 5-5. It is
evident, for example, from Figure 5-5d that the standard deviation of the defocus
aberration decreases, and the depth of focus accordingly increases, as the obscuration
increases.
The Strehl ratio for a sigma value of $0.1\lambda$ for each aberration polynomial is given in
Table 5-8 and illustrated in Figure 5-7. It shows that, for a small aberration, the Strehl
ratio can be estimated from the aberration variance. The annular polynomials for $n \le 8$
are illustrated by an isometric plot, an interferogram, and a PSF in Figure 5-8 for $\epsilon = 0.5$
and a sigma value of one wave. Their peak-to-valley numbers are given in Table 5-9 in
units of wavelength. The 3D MTFs are shown in Figure 5-9 for the primary and $A_{10}$
polynomial aberrations. The tangential, sagittal, and 45° MTF plots are also shown in
Figure 5-9 for the orthogonal primary coma, i.e., for the spatial frequency vector along
the x axis, y axis, and at 45° from the x axis, respectively. The real and imaginary parts of
the OTFs are shown in Figure 5-10 for the $A_8$ and $A_{10}$ polynomial aberrations that have
odd values of m.
The symmetry properties of an interferogram, PSF, and real and imaginary parts of
the OTF and MTF aberrated by an annular polynomial aberration are the same as those
for a corresponding circle polynomial aberration in a circular pupil. These properties are
summarized in Table 4-6.
References
3. E. L. O’Neill, “Transfer function for an annular aperture,” J. Opt. Soc. Am. 46,
285–288 (1956). Note that a term of - 2 h2 is missing in the second of O’Neill’s
Eq. (26), as was pointed out by the author in an Errata on p. 1096 in the Dec 1956
issue. Unfortunately, the obscuration ratio h in the original paper was typed
incorrectly as n in the Errata.
Chapter 6
Systems with Gaussian Pupils
6.1 INTRODUCTION
In this chapter, we consider optical systems with Gaussian apodization or Gaussian
pupils, i.e., those with a Gaussian amplitude across the wavefront at their exit pupils,
which may be circular or annular [1,2]. The discussion in this chapter is equally
applicable to imaging systems with a Gaussian transmission (obtained, for example, by
placing a Gaussian filter at its exit pupil) as well as laser transmitters in which the laser
beam has a Gaussian distribution at its exit pupil. It is evident that whereas a Gaussian
function extends to infinity, the pupil of an optical system can only have a finite diameter.
The net effect is that the finite size of the pupil truncates the infinite-extent Gaussian
function. If the Gaussian function is very narrow (i.e., its standard deviation is very small)
compared to the radius of the pupil, it is said to be weakly truncated. In such cases, the
truncation can be neglected, and the pupil can be assumed to be infinitely wide.
The aberration-free image for a system with a Gaussian pupil shows that the
Gaussian illumination reduces the central value, broadens the central bright spot, but
reduces the power in the diffraction rings compared to a uniform pupil. Correspondingly,
the OTF for a Gaussian pupil is higher for low spatial frequencies, and lower for the high.
In these respects, the effect of a Gaussian illumination is opposite to that of a central
obscuration in an annular pupil. The diffraction rings practically disappear when the pupil
radius is twice the Gaussian radius, and the beam propagates as a Gaussian everywhere.
The OTF in this case is also described by a Gaussian function.
where
Here $A_0$ is a constant that is determined from the total power in the pupil and
$$\gamma = (a/w)^2 , \tag{6-3}$$
where the quantity $w$, called the Gaussian radius, represents the radial distance from the
center of the pupil at which the amplitude drops to $e^{-1}$ of the amplitude at the center. The
pupil radius $a$ normalized by the Gaussian radius $w$, i.e., $\sqrt{\gamma} = a/w$, is called the
truncation ratio. The larger the value of $\gamma$, the narrower the Gaussian beam. A
uniform beam is represented by the limiting case of $\gamma \to 0$. The aberration function
$\Phi(\rho, \theta)$ represents the phase aberration at a point $(\rho, \theta)$ in the plane of the exit pupil,
where $0 \le \rho \le 1$ and $0 \le \theta \le 2\pi$. The amplitude $A_0$ at its center is determined from
the total power in the pupil.
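$$P_{inc} = 2A_0^2 S_{ex}\int_0^{\infty} \exp\left(-2\gamma\rho^2\right)\rho\,d\rho = \frac{A_0^2 S_{ex}}{2\gamma} , \tag{6-4}$$
and
$$P_{ex} = 2A_0^2 S_{ex}\int_0^{1} \exp\left(-2\gamma\rho^2\right)\rho\,d\rho = A_0^2\left(S_{ex}/2\gamma\right)\left[1 - \exp(-2\gamma)\right] , \tag{6-5}$$
respectively. The fractional transmitted power that goes on to the image is given by
$$P_{trans} = P_{ex}/P_{inc} = 1 - \exp(-2\gamma) . \tag{6-6}$$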
More and more power is transmitted as the beam becomes narrower and narrower, i.e., as
$w$ decreases or $\gamma$ increases. The pupil irradiance $A^2(\rho)$ in units of $P_{ex}/S_{ex}$ may be
written
The pupil in the latter case, where an amplitude filter is placed in the pupil plane, is
said to be apodized. The power incident in this case is $P_{inc} = A_0^2 S_{ex}$. The power exiting
from the pupil is again given by Eq. (6-5), but the fractional transmitted power is given
by
$$P_{trans} = P_{ex}/P_{inc} = \frac{1 - \exp(-2\gamma)}{2\gamma} . \tag{6-8}$$
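As a quick numerical illustration of Eqs. (6-6) and (6-8), the following Python sketch evaluates the two transmitted fractions for a few truncation ratios. The function names are hypothetical and are introduced here only for illustration.

```python
import math

def transmitted_fraction_beam(gamma):
    """Eq. (6-6): fraction of an incident Gaussian beam passed by the pupil."""
    return -math.expm1(-2.0 * gamma)          # 1 - exp(-2*gamma)

def transmitted_fraction_filter(gamma):
    """Eq. (6-8): fraction of a uniform beam passed by a Gaussian amplitude filter."""
    if gamma == 0.0:
        return 1.0                            # limiting value for no apodization
    return -math.expm1(-2.0 * gamma) / (2.0 * gamma)

for sqrt_gamma in (0.5, 1.0, 2.0, 3.0):
    g = sqrt_gamma ** 2
    print(f"a/w = {sqrt_gamma:3.1f}: beam {transmitted_fraction_beam(g):.4f}, "
          f"filter {transmitted_fraction_filter(g):.4f}")
```

The two cases move in opposite directions: a narrower beam loses less power at the pupil edge, whereas a narrower filter absorbs more of a uniform incident beam.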
$$I(0; \gamma) = \tanh(\gamma/2)\big/(\gamma/2) . \tag{6-11}$$
For large values of $\gamma$, a pupil is said to be weakly truncated. For such a pupil,
$$I(0; \gamma) \to 2/\gamma . \tag{6-12}$$
The fractional power in the image plane contained in a circle of radius $r_c$ is given by
$$P(r_c; \gamma) = \left(\pi^2/2\right)\int_0^{r_c} I(r; \gamma)\, r\,dr , \tag{6-13}$$
where $r_c$ is in units of $\lambda F$.
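The closed form of Eq. (6-11) can be cross-checked numerically. The sketch below assumes the focal-point irradiance, relative to a uniform pupil carrying the same exit power, is $|\int A\,dS|^2 / (S_{ex}\int A^2\,dS)$ with $A(\rho) = \exp(-\gamma\rho^2)$. This normalization is our assumption, chosen because it reproduces Eq. (6-11), and the function names are hypothetical.

```python
import math

def central_irradiance(gamma):
    """Eq. (6-11): tanh(gamma/2) / (gamma/2), with the uniform-pupil limit 1."""
    if gamma == 0.0:
        return 1.0
    return math.tanh(gamma / 2.0) / (gamma / 2.0)

def central_irradiance_numeric(gamma, n=2000):
    """Midpoint-rule estimate of |int A dS|^2 / (S_ex int A^2 dS) on the unit disc."""
    h = 1.0 / n
    amp = 0.0      # int_0^1 A(rho) rho d rho
    power = 0.0    # int_0^1 A(rho)^2 rho d rho
    for k in range(n):
        rho = (k + 0.5) * h
        a = math.exp(-gamma * rho * rho)
        amp += a * rho * h
        power += a * a * rho * h
    # dS = 2*pi*rho*d(rho) and S_ex = pi, so the ratio reduces to (2*amp)^2 / (2*power)
    return (2.0 * amp) ** 2 / (2.0 * power)

for g in (0.0, 1.0, 4.0, 9.0):
    print(g, round(central_irradiance(g), 4), round(central_irradiance_numeric(g), 4))
```

For $\gamma = 1$ both routes give 0.924, and for large $\gamma$ the value approaches $2/\gamma$ as in Eq. (6-12).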
Figure 6-1 shows the image-plane irradiance and encircled-power distributions for
$\sqrt{\gamma} = 0$, 1, 2, and 3. It is evident that the Gaussian illumination reduces the central value
and broadens the central bright spot, but reduces the power in the diffraction rings. For
example, when $\sqrt{\gamma} = 1$, the central value is 0.924 compared to a value of 1 for a uniform
beam. Moreover, the central bright spot has a radius of 1.43 and contains 95.5% of the
total power compared to a radius of 1.22 containing 83.8% of the power for a uniform
beam. The diffraction rings practically disappear for $\gamma \ge 4$, and the beam propagates as a
Gaussian everywhere.
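The broadening of the central bright spot can be illustrated with a short calculation. The sketch below assumes that the aberration-free focal-plane amplitude is proportional to the Hankel-type integral $\int_0^1 \exp(-\gamma\rho^2) J_0(\pi r\rho)\rho\,d\rho$, with $r$ in units of $\lambda F$. That expression is the standard result for a radially symmetric pupil and is an assumption here, since it is not written out above. The Bessel function is evaluated from its integral representation so that only the standard library is needed, and all names are hypothetical.

```python
import math

def bessel_j0(x, m=48):
    """J0(x) = (1/pi) * int_0^pi cos(x sin t) dt, by the periodic trapezoid rule."""
    return sum(math.cos(x * math.sin(math.pi * k / m)) for k in range(m)) / m

def focal_amplitude(r, gamma, n=100):
    """Simpson estimate of int_0^1 exp(-gamma rho^2) J0(pi r rho) rho d rho."""
    h = 1.0 / n
    total = 0.0
    for k in range(n + 1):
        rho = k * h
        weight = 1 if k in (0, n) else (4 if k % 2 else 2)
        total += weight * math.exp(-gamma * rho * rho) * bessel_j0(math.pi * r * rho) * rho
    return total * h / 3.0

def first_dark_ring(gamma, lo=1.0, hi=2.0, iterations=30):
    """Bisection for the first zero of the amplitude, assumed to lie in [lo, hi]."""
    f_lo = focal_amplitude(lo, gamma)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        f_mid = focal_amplitude(mid, gamma)
        if (f_mid > 0.0) == (f_lo > 0.0):
            lo, f_lo = mid, f_mid     # zero lies in the upper half
        else:
            hi = mid                  # zero lies in the lower half
    return 0.5 * (lo + hi)

for g in (0.0, 1.0):
    print(f"gamma = {g}: first dark ring near r = {first_dark_ring(g):.3f}")
```

For a uniform pupil this recovers the Airy radius of about 1.22, and for $\gamma = 1$ the zero moves outward to roughly the radius of 1.43 quoted above.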
[Figure 6-1 plot: $I(r)$ and $P(r_c)$ versus $r$; $r_c$.]
Figure 6-1. PSF and encircled power for a Gaussian pupil with $\sqrt{\gamma} = 0$, 1, 2, and 3.
The irradiance is in units of $P_{ex}S_{ex}/\lambda^2R^2$, and the encircled power is in units of $P_{ex}$.
$r$ and $r_c$ are in units of $\lambda F$.
6.3.2 Optimum Gaussian Radius
Letting
$$\frac{\partial I(0; \gamma)}{\partial \gamma} = 0 , \tag{6-15}$$
6.3.3 OTF
From Eq. (2-13), the OTF for an aberration-free Gaussian pupil is given by
$$\tau(\vec{v}_i; \gamma) = P_{ex}^{-1}\int A(\vec{r}_p)\,A(\vec{r}_p - \lambda R\,\vec{v}_i)\,d\vec{r}_p \tag{6-16}$$
in the pupil coordinate system $(x_p, y_p)$. Let the spatial frequency vector $\vec{v}_i$ with its
Cartesian components $(\xi, \eta)$ make an angle $\phi$ with the $x_p$ axis, as illustrated in Figure 6-3.
It is convenient to write the autocorrelation integral in a $(p, q)$ coordinate system
whose axes are rotated by an angle $\phi$ with respect to the $(x_p, y_p)$ system (so that the $p$
axis lies along the direction of the spatial frequency vector $\vec{v}_i$) and whose origin lies at a
distance $\lambda R v_i/2$ from that of the $(x_p, y_p)$ system along the $p$ axis. If we further let the
$(p, q)$ coordinates be normalized by the pupil radius $a$ and the spatial frequency $v_i$ be
normalized by the cutoff spatial frequency $1/\lambda F$, the OTF can be written
[Figure 6-2 plot: central irradiance $I(0; \gamma)$ against the truncation parameter; caption not recovered.]
[Figure 6-3 diagram: two overlapping pupils with axes $x_p$, $y_p$, $p$, $q$.]
Figure 6-3. Geometry for evaluating the OTF. The centers of the two pupils are
located at $(0, 0)$ and $\lambda R(\xi, \eta)$ in the $(x_p, y_p)$ coordinate system and $\mp(\lambda R/2)(v_i, 0)$
in the $(p, q)$ coordinate system, where $v_i = (\xi^2 + \eta^2)^{1/2}$ and $\phi = \tan^{-1}(\eta/\xi)$. The
shaded area is the overlap area of the two pupils. When normalized by the pupil
radius $a$, the centers of the two pupils of unity radius lie at $\mp\nu$ along the $p$ axis.
$$\tau(\nu; \gamma) = \left(a^2/P_{ex}\right)\iint A(p + \nu, q)\,A(p - \nu, q)\,dp\,dq , \quad 0 \le \nu \le 1 . \tag{6-17}$$
Substituting for the amplitude $A(\rho)$ from Eq. (6-2) and for the power $P_{ex}$ from Eq. (6-5)
into Eq. (6-17), we obtain
$$\tau(\nu; \gamma) = \frac{8\gamma\exp\left(-2\gamma\nu^2\right)}{\pi\left[1 - \exp(-2\gamma)\right]}\int_0^{\sqrt{1-\nu^2}} dq\int_0^{\sqrt{1-q^2}-\nu}\exp\left[-2\gamma\left(p^2 + q^2\right)\right]dp , \tag{6-18}$$
where the integration is over a quadrant of the overlap region of two pupils whose centers
are separated by a distance $2\nu$ along the $p$ axis. For large values of $\gamma$ (e.g., $\gamma \ge 4$), the
contribution to the integral in Eq. (6-18) is negligible unless $\nu = 0$, in which case it
represents the Gaussian-weighted area of a quadrant of the pupil, and the equation
reduces to
$$\tau(\nu; \gamma) = \exp\left(-2\gamma\nu^2\right) , \quad 0 \le \nu \le 1 . \tag{6-19}$$
Figure 6-4 shows how the OTF varies with $\nu$ for several values of $\gamma$. We note that
compared to a uniform pupil (i.e., for $\gamma = 0$), the OTF of a Gaussian pupil is higher for
low spatial frequencies, and lower for the high. Moreover, as $\gamma$ increases, the bandwidth
of low frequencies for which the OTF is higher decreases and the OTF at high
frequencies becomes increasingly smaller. This is due to the fact that the Gaussian
weighting across the overlap region of two pupils whose centers are separated by small
values of $\nu$ is higher than that for large values of $\nu$. If we consider an apodization such
that the amplitude increases from the center toward the edge of the pupil, then the OTF is
lower for low frequencies and higher for the high. Thus, unlike aberrations, which reduce
the MTF of a system at all frequencies within its passband, the amplitude variations can
increase or decrease the MTF at any of those frequencies.
[Figure 6-4 plot: OTF $\tau(\nu; \gamma)$ versus $\nu$ for several values of $\gamma$; caption not recovered.]
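Equation (6-18) is straightforward to evaluate numerically. The sketch below does the inner integral over $p$ in closed form with the error function and the outer integral over $q$ by Simpson's rule after the substitution $q = \sqrt{1-\nu^2}\sin t$, which keeps the integrand smooth at the upper limit. The function names and the substitution are our choices, not part of the derivation above.

```python
import math

def gaussian_otf(nu, gamma, n=400):
    """Eq. (6-18) for gamma > 0; n must be even (Simpson's rule)."""
    if not 0.0 <= nu < 1.0:
        return 0.0
    q_max = math.sqrt(1.0 - nu * nu)
    s = math.sqrt(2.0 * gamma)
    h = (math.pi / 2.0) / n
    total = 0.0
    for k in range(n + 1):
        t = k * h
        q = q_max * math.sin(t)
        p_max = max(math.sqrt(max(0.0, 1.0 - q * q)) - nu, 0.0)  # upper limit of p
        # int_0^p_max exp(-2 gamma p^2) dp = sqrt(pi) / (2 s) * erf(s * p_max)
        inner = math.sqrt(math.pi) / (2.0 * s) * math.erf(s * p_max)
        weight = 1 if k in (0, n) else (4 if k % 2 else 2)
        total += weight * math.exp(-2.0 * gamma * q * q) * inner * q_max * math.cos(t)
    integral = total * h / 3.0
    prefactor = 8.0 * gamma * math.exp(-2.0 * gamma * nu * nu) / (
        math.pi * -math.expm1(-2.0 * gamma))
    return prefactor * integral

def uniform_otf(nu):
    """Aberration-free OTF of a uniform circular pupil, for comparison."""
    return (2.0 / math.pi) * (math.acos(nu) - nu * math.sqrt(1.0 - nu * nu))

for nu in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(nu, round(uniform_otf(nu), 4), round(gaussian_otf(nu, 1.0), 4),
          round(gaussian_otf(nu, 4.0), 4), round(math.exp(-8.0 * nu * nu), 4))
```

The printed columns show the uniform OTF, the Gaussian OTF for $\gamma = 1$ and $\gamma = 4$, and the weak-truncation form of Eq. (6-19) for $\gamma = 4$, so the higher-at-low and lower-at-high behavior can be read off directly.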
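$$\begin{aligned}
S &= \left|\int_0^1\!\!\int_0^{2\pi} A(\rho)\exp\left[i\Phi(\rho, \theta)\right]\rho\,d\rho\,d\theta\right|^2 \Bigg/ \left[\int_0^1\!\!\int_0^{2\pi} A(\rho)\,\rho\,d\rho\,d\theta\right]^2 \\
&= \left\{\frac{\gamma}{\pi\left[1 - \exp(-\gamma)\right]}\right\}^2 \left|\int_0^1\!\!\int_0^{2\pi}\exp\left(-\gamma\rho^2\right)\exp\left[i\Phi(\rho, \theta)\right]\rho\,d\rho\,d\theta\right|^2 .
\end{aligned} \tag{6-20}$$
$$S \sim \exp\left(-\sigma_\Phi^2\right) , \tag{6-21}$$
where
$$\sigma_\Phi^2 = \langle\Phi^2\rangle - \langle\Phi\rangle^2 \tag{6-22}$$
is the variance of the phase aberration across the Gaussian-amplitude weighted pupil. The
mean and the mean square values of the aberration are obtained from the expression
$$\begin{aligned}
\langle\Phi^n\rangle &= \int_0^1\!\!\int_0^{2\pi} A(\rho)\left[\Phi(\rho, \theta)\right]^n \rho\,d\rho\,d\theta \Bigg/ \int_0^1\!\!\int_0^{2\pi} A(\rho)\,\rho\,d\rho\,d\theta \\
&= \frac{\gamma}{\pi\left[1 - \exp(-\gamma)\right]}\int_0^1\!\!\int_0^{2\pi}\exp\left(-\gamma\rho^2\right)\left[\Phi(\rho, \theta)\right]^n \rho\,d\rho\,d\theta ,
\end{aligned} \tag{6-23}$$
with $n = 1$ and 2, respectively. The angular brackets indicate a mean value over the
Gaussian pupil.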
Table 6-1 lists the primary aberrations and their standard deviations for increasing
values of $\gamma$. It is evident that the standard deviation of an aberration decreases as $\gamma$
increases. This is due to the fact that while an aberration increases as $\rho$ increases, the
amplitude decreases more and more rapidly as $\gamma$ increases, thus reducing its effect more
and more compared to that for a uniform pupil. Accordingly, for a given small amount of
aberration $A_i$, the Strehl ratio for a Gaussian pupil is higher than that for a uniform pupil.
Similarly, the aberration tolerance for a given Strehl ratio is higher for a Gaussian pupil.
Its approximate value can be obtained from Eq. (6-21).

Table 6-1. Primary aberrations and their standard deviations for optical systems
with Gaussian pupils. For comparison, the results for a uniform pupil ($\gamma = 0$) are
also given.

| Primary Aberration | $\sigma_\Phi$ ($\gamma = 0$) | $\sigma_\Phi$ ($\gamma = 1$) | $\sigma_\Phi$ ($\sqrt{\gamma} = 2$) | $\sigma_\Phi$ ($\sqrt{\gamma} \ge 3$) |
|---|---|---|---|---|
| Spherical, $A_s\rho^4$ | $2A_s/(3\sqrt{5}) = A_s/3.35$ | $A_s/3.67$ | $A_s/6.20$ | $2\sqrt{5}A_s/\gamma^2$ |
| Coma, $A_c\rho^3\cos\theta$ | $A_c/(2\sqrt{2}) = A_c/2.83$ | $A_c/3.33$ | $A_c/6.08$ | $\sqrt{3}A_c/\gamma^{3/2}$ |
| Astigmatism, $A_a\rho^2\cos^2\theta$ | $A_a/4$ | $A_a/4.40$ | $A_a/6.59$ | $A_a/(\sqrt{2}\gamma)$ |
| Defocus, $B_d\rho^2$ | $B_d/(2\sqrt{3}) = B_d/3.46$ | $B_d/3.55$ | $B_d/4.79$ | $B_d/\gamma$ |
| Tilt, $B_t\rho\cos\theta$ | $B_t/2$ | $B_t/2.19$ | $B_t/2.94$ | $B_t/\sqrt{2\gamma}$ |
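The entries of Table 6-1 follow from Eq. (6-23) once the radial moments $\langle\rho^s\rangle$ over the Gaussian-weighted pupil are known. The sketch below computes them by simple quadrature and uses the angular averages $\langle\cos^2\theta\rangle = 1/2$ and $\langle\cos^4\theta\rangle = 3/8$. It is a minimal check under these stated assumptions, and the function names are hypothetical.

```python
import math

def radial_moment(s, gamma, n=4000):
    """<rho^s> over the unit pupil with amplitude weight exp(-gamma rho^2)."""
    h = 1.0 / n
    num = den = 0.0
    for k in range(n):
        rho = (k + 0.5) * h                      # midpoint rule
        w = math.exp(-gamma * rho * rho) * rho
        num += w * rho ** s
        den += w
    return num / den

def sigma_primary(gamma):
    """Standard deviation per unit aberration coefficient for each primary aberration."""
    p2, p4, p6, p8 = (radial_moment(s, gamma) for s in (2, 4, 6, 8))
    return {
        "spherical": math.sqrt(p8 - p4 ** 2),                        # rho^4
        "coma": math.sqrt(p6 / 2.0),                                 # rho^3 cos(theta)
        "astigmatism": math.sqrt(3.0 * p4 / 8.0 - p2 ** 2 / 4.0),    # rho^2 cos^2(theta)
        "defocus": math.sqrt(p4 - p2 ** 2),                          # rho^2
        "tilt": math.sqrt(p2 / 2.0),                                 # rho cos(theta)
    }

for g in (0.0, 1.0, 4.0):
    row = {name: round(1.0 / value, 2) for name, value in sigma_primary(g).items()}
    print(f"gamma = {g}: coefficient / sigma =", row)
```

The printed ratios reproduce the denominators in the first three columns of Table 6-1, with $\gamma = 4$ corresponding to the column headed $\sqrt{\gamma} = 2$.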
Since the Strehl ratio depends on the aberration variance, we balance a given
aberration with lower-order aberrations to minimize its variance. Thus, we balance
spherical aberration and astigmatism with defocus aberration, and coma with tilt
aberration to minimize their variance. The balanced primary aberrations thus obtained are
listed in Table 6-2. For example, the defocus aberration that balances spherical aberration
is given by $B_d/A_s = -1$, $-0.933$, and $-4/\gamma$ when $\sqrt{\gamma} = 0$, 1, and $\ge 3$, respectively.
Similarly, the tilt aberration that balances coma for these values of $\gamma$ is given by
$B_t/A_c = -(2/3)$, $-0.608$, and $-2/\gamma$, respectively. The defocus coefficient given by
$B_d = -A_a/2$ to balance astigmatism is independent of the value of $\gamma$.
The standard deviations of the balanced primary aberrations are given in Table 6-3.
The factor by which the standard deviation of a primary aberration is reduced by
balancing it with another is listed in Table 6-4. The diffraction focus representing the
point of maximum irradiance for a small aberration is listed in Table 6-5. We note that,
although aberration balancing in the case of a uniform pupil reduces the standard
deviation of spherical aberration and coma by factors of 4 and 3, respectively, the
reduction in the case of astigmatism is only a factor of 1.22. For a Gaussian pupil, the
trend is similar but the reduction factors are smaller for spherical aberration and coma,
and are larger for astigmatism. For a Gaussian beam with $\gamma = 1$, they are 3.74, 2.64, and
1.27, corresponding to spherical aberration, coma, and astigmatism, respectively. In
Section 6.6, the balanced aberrations are identified with the Gaussian polynomials
discussed in Section 6.5.
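The balancing coefficients and reduction factors quoted above can be reproduced by minimizing the weighted variance directly. The sketch below assumes the least-squares results $B_d/A_s = -\mathrm{cov}(\rho^4, \rho^2)/\mathrm{var}(\rho^2)$ for spherical aberration and $B_t/A_c = -\langle\rho^4\rangle/\langle\rho^2\rangle$ for coma, which follow from setting the derivative of the variance with respect to the balancing coefficient to zero. The names are hypothetical.

```python
import math

def moment(s, gamma, n=4000):
    """<rho^s> with amplitude weight exp(-gamma rho^2) on the unit pupil (midpoint rule)."""
    h = 1.0 / n
    num = den = 0.0
    for k in range(n):
        rho = (k + 0.5) * h
        w = math.exp(-gamma * rho * rho) * rho
        num += w * rho ** s
        den += w
    return num / den

def balance(gamma):
    """Balancing ratios and sigma reduction factors for the primary aberrations."""
    p2, p4, p6, p8 = (moment(s, gamma) for s in (2, 4, 6, 8))
    var_defocus = p4 - p2 ** 2
    cov = p6 - p2 * p4
    sph_unbalanced = p8 - p4 ** 2
    sph_balanced = sph_unbalanced - cov ** 2 / var_defocus
    coma_unbalanced = p6 / 2.0
    coma_balanced = (p6 - p4 ** 2 / p2) / 2.0
    ast_unbalanced = 3.0 * p4 / 8.0 - p2 ** 2 / 4.0
    ast_balanced = p4 / 8.0                      # with B_d = -A_a / 2 for any gamma
    return {
        "Bd/As": -cov / var_defocus,
        "Bt/Ac": -p4 / p2,
        "spherical": math.sqrt(sph_unbalanced / sph_balanced),
        "coma": math.sqrt(coma_unbalanced / coma_balanced),
        "astigmatism": math.sqrt(ast_unbalanced / ast_balanced),
    }

for g in (0.0, 1.0, 4.0):
    print(g, {k: round(v, 3) for k, v in balance(g).items()})
```

For $\gamma = 0$ this returns the familiar factors 4, 3, and 1.22, and for $\gamma = 1$ the values 3.74, 2.64, and 1.27 stated in the text.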
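Table 6-2. Balanced primary aberrations.

| Balanced Aberration | $\Phi(\rho, \theta; \gamma = 0)$ | $\Phi(\rho, \theta; \gamma = 1)$ | $\Phi(\rho, \theta; \sqrt{\gamma} = 2)$ | $\Phi(\rho, \theta; \sqrt{\gamma} \ge 3)$ |
|---|---|---|---|---|
| Spherical | $A_s(\rho^4 - \rho^2)$ | $A_s(\rho^4 - 0.933\rho^2)$ | $A_s(\rho^4 - 0.728\rho^2)$ | $A_s\left(\rho^4 - \frac{4}{\gamma}\rho^2\right)$ |
| Coma | $A_c\left(\rho^3 - \frac{2}{3}\rho\right)\cos\theta$ | $A_c(\rho^3 - 0.608\rho)\cos\theta$ | $A_c(\rho^3 - 0.419\rho)\cos\theta$ | $A_c\left(\rho^3 - \frac{2}{\gamma}\rho\right)\cos\theta$ |
| Astigmatism | $A_a\rho^2(\cos^2\theta - 1/2)$ | $A_a\rho^2(\cos^2\theta - 1/2)$ | $A_a\rho^2(\cos^2\theta - 1/2)$ | $A_a\rho^2(\cos^2\theta - 1/2)$ |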
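Table 6-3. Standard deviations of the balanced primary aberrations.

| Balanced Aberration | $\sigma_\Phi$ ($\gamma = 0$) | $\sigma_\Phi$ ($\gamma = 1$) | $\sigma_\Phi$ ($\sqrt{\gamma} = 2$) | $\sigma_\Phi$ ($\sqrt{\gamma} \ge 3$) |
|---|---|---|---|---|
| Spherical | $A_s/(6\sqrt{5}) = A_s/13.42$ | $A_s/13.71$ | $A_s/18.29$ | $2A_s/\gamma^2$ |
| Coma | $A_c/(6\sqrt{2}) = A_c/8.49$ | $A_c/8.80$ | $A_c/12.21$ | $A_c/\gamma^{3/2}$ |
| Astigmatism | $A_a/(2\sqrt{6}) = A_a/4.90$ | $A_a/5.61$ | $A_a/9.08$ | $A_a/(2\gamma)$ |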
Table 6-4. Factor by which the standard deviation of a Seidel aberration across an
aperture is reduced when it is optimally balanced with other aberrations.

| Balanced Aberration | Uniform ($\gamma = 0$) | Gaussian ($\gamma = 1$) | Gaussian ($\sqrt{\gamma} = 2$) | Weakly Truncated Gaussian ($\sqrt{\gamma} \ge 3$) |
|---|---|---|---|---|
| Spherical | 4 | 3.74 | 2.95 | $\sqrt{5} = 2.24$ |
Table 6-5. Diffraction focus.

| Balanced Aberration | Uniform ($\gamma = 0$) | Gaussian ($\gamma = 1$) | Gaussian ($\sqrt{\gamma} = 2$) | Weakly Truncated Gaussian ($\sqrt{\gamma} \ge 3$) |
|---|---|---|---|---|
| Spherical | $(0, 0, 8F^2A_s)$ | $(0, 0, 7.46F^2A_s)$ | $(0, 0, 5.82F^2A_s)$ | $\left(0, 0, \frac{32}{\gamma}F^2A_s\right)$ |
| Astigmatism | $(0, 0, 4F^2A_a)$ | $(0, 0, 4F^2A_a)$ | $(0, 0, 4F^2A_a)$ | $(0, 0, 4F^2A_a)$ |
6.5 Orthonormalization of Zernike Circle Polynomials over a Gaussian Circular Pupil
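$$G_{j+1} = N_{j+1}\left[Z_{j+1} - \sum_{k=1}^{j}\langle Z_{j+1}G_k\rangle\,G_k\right] , \tag{6-24}$$
$$\begin{aligned}
\langle Z_{j+1}G_k\rangle &= \int_0^1\!\!\int_0^{2\pi} A(\rho)\,Z_{j+1}G_k\,\rho\,d\rho\,d\theta \Bigg/ \int_0^1\!\!\int_0^{2\pi} A(\rho)\,\rho\,d\rho\,d\theta \\
&= \frac{\gamma}{\pi\left[1 - \exp(-\gamma)\right]}\int_0^1\!\!\int_0^{2\pi}\exp\left(-\gamma\rho^2\right)Z_{j+1}G_k\,\rho\,d\rho\,d\theta .
\end{aligned} \tag{6-25}$$
$$\begin{aligned}
\langle G_jG_{j'}\rangle &= \int_0^1\!\!\int_0^{2\pi} A(\rho)\,G_jG_{j'}\,\rho\,d\rho\,d\theta \Bigg/ \int_0^1\!\!\int_0^{2\pi} A(\rho)\,\rho\,d\rho\,d\theta \\
&= \frac{\gamma}{\pi\left[1 - \exp(-\gamma)\right]}\int_0^1\!\!\int_0^{2\pi}\exp\left(-\gamma\rho^2\right)G_jG_{j'}\,\rho\,d\rho\,d\theta \\
&= \delta_{jj'} .
\end{aligned} \tag{6-26}$$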
Now a circle polynomial $Z_j$ varies with the angle $\theta$ as $\cos m\theta$ or $\sin m\theta$ depending
on whether $j$ is even or odd. It is radially symmetric when $m = 0$. Because of the
orthogonal properties of $\cos m\theta$ and $\sin m\theta$ over a period of 0 to $2\pi$ [see Eq. (4-46)],
the polynomials $G_k$ that contribute to the sum in Eq. (6-24) must also have the same
angular dependence as that of the polynomial $Z_{j+1}$. Hence, the polynomial $G_{j+1}$ will also
have the same angular dependence. Thus, a Gaussian polynomial $G_j$ is separable in polar
coordinates $\rho$ and $\theta$, and differs from the corresponding circle polynomial only in its
radial dependence. Given the form of the circle polynomials by Eqs. (4-45a)–(4-45c), the
Gaussian polynomials can accordingly be written
where $n$ and $m$ are positive integers (including zero), $n - m \ge 0$ and even, and $R_n^m(\rho; \gamma)$
is a Gaussian radial polynomial.
Substituting Eqs. (6-27a)–(6-27c) into the orthonormality Eq. (6-26), we find that the
Gaussian radial polynomials obey the orthogonality condition [1]
$$\int_0^1 R_n^m(\rho; \gamma)\,R_{n'}^m(\rho; \gamma)\,A(\rho)\,\rho\,d\rho \Bigg/ \int_0^1 A(\rho)\,\rho\,d\rho = \frac{1}{n+1}\,\delta_{nn'} . \tag{6-28}$$
Writing Eq. (6-24) in terms of two-index polynomials given by Eqs. (6-27a)–(6-27c) and
substituting these equations into it, as was done in Chapter 5 for the annular polynomials,
we find that the Gaussian radial polynomials are given by
$$R_n^m(\rho; \gamma) = M_n^m\left[R_n^m(\rho) - \sum_{i \ge 1}^{(n-m)/2}(n - 2i + 1)\left\langle R_n^m(\rho)\,R_{n-2i}^m(\rho; \gamma)\right\rangle R_{n-2i}^m(\rho; \gamma)\right] , \tag{6-29}$$
where
$$\left\langle R_n^m(\rho)\,R_{n-2i}^m(\rho; \gamma)\right\rangle = \int_0^1 R_n^m(\rho)\,R_{n-2i}^m(\rho; \gamma)\,A(\rho)\,\rho\,d\rho \Bigg/ \int_0^1 A(\rho)\,\rho\,d\rho . \tag{6-30}$$
where the coefficients $a_n^m$, etc., depend on $\gamma$. The radial polynomials are even or odd in $\rho$
depending on whether $n$ (or $m$) is even or odd.
$$W(\rho, \theta; \gamma) = \frac{A_s}{a_4^0}\left(a_4^0\rho^4 + b_4^0\rho^2 + c_4^0\right) = \frac{A_s}{\sqrt{5}\,a_4^0}\,G_4^0(\rho, \theta; \gamma) . \tag{6-33}$$
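Table 6-6.

| Aberration | Radial Polynomial | General Form* | Gaussian ($\gamma = 1$) | Uniform ($\gamma = 0$) | Weakly Truncated Gaussian |
|---|---|---|---|---|---|
| Piston | $R_0^0$ | 1 | 1 | 1 | 1 |
| Field curvature (defocus) | $R_2^0$ | $a_2^0\rho^2 + b_2^0$ | $2.04989\rho^2 - 0.85690$ | $2\rho^2 - 1$ | $(\gamma\rho^2 - 1)/\sqrt{3}$ |
| Astigmatism | $R_2^2$ | $a_2^2\rho^2$ | $1.14541\rho^2$ | $\rho^2$ | $(\gamma/\sqrt{6})\rho^2$ |
| Spherical aberration | $R_4^0$ | $a_4^0\rho^4 + b_4^0\rho^2 + c_4^0$ | $6.12902\rho^4 - 5.71948\rho^2 + 0.83368$ | $6\rho^4 - 6\rho^2 + 1$ | $(\gamma^2\rho^4 - 4\gamma\rho^2 + 2)/(2\sqrt{5})$ |

*$a_1^1 = (2p_2)^{-1/2}$, $a_2^0 = [3(p_4 - p_2^2)]^{-1/2}$, $b_2^0 = -p_2a_2^0$, $a_2^2 = (3p_4)^{-1/2}$, $a_3^1 = \frac{1}{2}(p_6 - p_4^2/p_2)^{-1/2}$, $b_3^1 = -(p_4/p_2)a_3^1$,
$a_4^0 = \{5[p_8 - 2K_1p_6 + (K_1^2 + 2K_2)p_4 - 2K_1K_2p_2 + K_2^2]\}^{-1/2}$, $b_4^0 = -K_1a_4^0$, $c_4^0 = K_2a_4^0$,
$p_0 = 1$, $K_1 = (p_6 - p_2p_4)/(p_4 - p_2^2)$, $K_2 = (p_2p_6 - p_4^2)/(p_4 - p_2^2)$.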
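and
$$\langle\Phi^n\rangle = \frac{\gamma}{\pi}\int_0^{\infty}\!\!\int_0^{2\pi}\exp\left(-\gamma\rho^2\right)\left[\Phi(\rho, \theta)\right]^n \rho\,d\rho\,d\theta , \tag{6-35}$$
respectively.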
The standard deviation of a primary aberration for a large value of $\gamma$ can be obtained
by calculating its mean and mean square values according to Eq. (6-35). The results thus
obtained are given in the last column of Table 6-1. The corresponding balanced
aberrations and their standard deviations are similarly given in Tables 6-2 and 6-3,
respectively. The balancing of an aberration reduces the standard deviation by a factor of
$\sqrt{5}$, $\sqrt{3}$, and $\sqrt{2}$ in the case of spherical aberration, coma, and astigmatism,
respectively, as noted in Table 6-4. The diffraction focus for these aberrations is listed in
Table 6-5. The amount of balancing aberration decreases as $\gamma$ increases in the case of
spherical aberration and coma, but does not change in the case of astigmatism. For
example, in the case of spherical aberration, the amount of balancing defocus for a
weakly truncated Gaussian beam is $(4/\gamma)$ times the corresponding amount for a uniform
beam. Similarly, in the case of coma, the balancing tilt for a weakly truncated Gaussian
beam is $(3/\gamma)$ times the corresponding amount for a uniform beam. The location of the
diffraction focus is independent of the value of $\gamma$ in the case of astigmatism, since the
balancing defocus is the same regardless of the value of $\gamma$. Compared to the peak value
of a balanced aberration, its standard deviation is smaller by a factor of $\gamma^2/2$, $\gamma^{3/2}$, and $2\gamma$ in
the case of spherical aberration, coma, and astigmatism, respectively.
When a Gaussian beam is weakly truncated, i.e., when $\gamma$ is large, the quantity $p_s$ in
Table 6-6 reduces to
$$p_s = \langle\rho^s\rangle = (s/2\gamma)\,p_{s-2} = (s/2)!\big/\gamma^{s/2} . \tag{6-36}$$
As a result, we obtain simple expressions for the radial polynomials, which are listed in
the last column in Table 6-6. They are similar to Laguerre polynomials [4]. If we
normalize the radial coordinate $\rho$ of a point on the pupil by $w$ (instead of by $a$), then $\gamma$
disappears from these expressions. Since the power in a weakly truncated Gaussian beam
is concentrated in a small region near the center of the pupil, the effect of the aberration
in its outer region is negligible. Accordingly, the aberration tolerances in terms of the
peak value of the aberration at the edge of the pupil ($\rho = 1$) may not be very meaningful.
They may instead be defined in terms of their value at the Gaussian radius [1].
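As a worked illustration of how the weak-truncation moments lead to the last columns of Tables 6-1 to 6-4, consider defocus and spherical aberration. The steps below use only $p_s = (s/2)!/\gamma^{s/2}$ and the definition of the variance. The least-squares expression for the balancing defocus is our restatement of the variance minimization, not a formula quoted from the text.

```latex
% Moments for a weakly truncated Gaussian pupil, p_s = (s/2)! / gamma^{s/2}:
p_2 = \frac{1}{\gamma}, \qquad p_4 = \frac{2}{\gamma^2}, \qquad
p_6 = \frac{6}{\gamma^3}, \qquad p_8 = \frac{24}{\gamma^4}.

% Defocus B_d \rho^2:
\sigma_\Phi^2 = B_d^2\,(p_4 - p_2^2) = \frac{B_d^2}{\gamma^2}
\quad\Longrightarrow\quad \sigma_\Phi = \frac{B_d}{\gamma}.

% Spherical aberration A_s \rho^4:
\sigma_\Phi^2 = A_s^2\,(p_8 - p_4^2) = \frac{20\,A_s^2}{\gamma^4}
\quad\Longrightarrow\quad \sigma_\Phi = \frac{2\sqrt{5}\,A_s}{\gamma^2}.

% Balancing defocus that minimizes the variance of A_s \rho^4 + B_d \rho^2:
\frac{B_d}{A_s} = -\frac{p_6 - p_2 p_4}{p_4 - p_2^2}
               = -\frac{4/\gamma^3}{1/\gamma^2} = -\frac{4}{\gamma}.

% Balanced variance and reduction factor:
\sigma_\Phi^2 = A_s^2\left[\frac{20}{\gamma^4} - \frac{(4/\gamma^3)^2}{1/\gamma^2}\right]
             = \frac{4\,A_s^2}{\gamma^4}
\quad\Longrightarrow\quad \sigma_\Phi = \frac{2A_s}{\gamma^2},
\qquad \frac{2\sqrt{5}/\gamma^2}{2/\gamma^2} = \sqrt{5}.
```

The same two moments $p_2$ and $p_4$ give $\sigma_\Phi = A_a/(\sqrt{2}\gamma)$ for astigmatism once the angular averages $\langle\cos^2\theta\rangle = 1/2$ and $\langle\cos^4\theta\rangle = 3/8$ are included.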
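$$W(\rho, \theta; \gamma) = \sum_{j=1}^{J} a_j\,G_j(\rho, \theta; \gamma) , \quad 0 \le \rho \le 1 , \quad 0 \le \theta \le 2\pi , \tag{6-37}$$
where $a_j$ is an expansion coefficient of the polynomial. Multiplying both sides of Eq. (6-37)
by $G_{j'}(\rho, \theta; \gamma)$, integrating over the Gaussian pupil, and using the orthonormality Eq.
(6-26), we obtain the Gaussian expansion coefficients:
$$a_j = \int_0^1\!\!\int_0^{2\pi} W(\rho, \theta; \gamma)\,G_j(\rho, \theta; \gamma)\,A(\rho)\,\rho\,d\rho\,d\theta \Bigg/ 2\pi\int_0^1 A(\rho)\,\rho\,d\rho . \tag{6-38}$$
The mean and mean square values of the aberration function are given by
$$\langle W(\rho, \theta; \gamma)\rangle = a_1 \tag{6-39}$$
and
$$\langle W^2(\rho, \theta; \gamma)\rangle = \sum_{j=1}^{J} a_j^2 . \tag{6-40}$$
$$\sigma_W^2 = \langle W^2(\rho, \theta; \gamma)\rangle - \langle W(\rho, \theta; \gamma)\rangle^2 = \sum_{j=2}^{J} a_j^2 . \tag{6-41}$$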
The balanced aberrations for an annular Gaussian pupil with an obscuration ratio $\epsilon$
can be obtained in a manner similar to those for a circular pupil, except that the lower
limit of zero in the radial integration is replaced by $\epsilon$. The Gaussian annular polynomials
$G_j(\rho, \theta; \gamma; \epsilon)$ orthonormal over a Gaussian annular pupil can be obtained recursively from
the annular polynomials $A_j(\rho, \theta; \epsilon)$, starting with $G_1 = 1$ (omitting the arguments for
brevity) from Eq. (3-18) according to
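$$G_{j+1} = N_{j+1}\left[A_{j+1} - \sum_{k=1}^{j}\langle A_{j+1}G_k\rangle\,G_k\right] , \tag{6-42}$$
$$\langle A_{j+1}G_k\rangle = \int_\epsilon^1\!\!\int_0^{2\pi} A(\rho)\,A_{j+1}G_k\,\rho\,d\rho\,d\theta \Bigg/ \int_\epsilon^1\!\!\int_0^{2\pi} A(\rho)\,\rho\,d\rho\,d\theta . \tag{6-43}$$
$$\langle G_jG_{j'}\rangle = \int_\epsilon^1\!\!\int_0^{2\pi} A(\rho)\,G_jG_{j'}\,\rho\,d\rho\,d\theta \Bigg/ \int_\epsilon^1\!\!\int_0^{2\pi} A(\rho)\,\rho\,d\rho\,d\theta = \delta_{jj'} . \tag{6-44}$$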
Applying the same reasoning as in the case of Gaussian circle polynomials, we find
that the polynomial $G_j(\rho, \theta; \gamma; \epsilon)$ also has the same angular dependence as an annular
polynomial $A_j(\rho, \theta; \epsilon)$. Thus, a Gaussian annular polynomial $G_j$ is separable in polar
coordinates $\rho$ and $\theta$, and differs from the corresponding annular polynomial only in its
radial dependence. Given the form of the annular polynomials by Eqs. (5-17a)–(5-17c),
the Gaussian annular polynomials can accordingly be written
where $n$ and $m$ are positive integers (including zero), $n - m \ge 0$ and even, and $R_n^m(\rho; \gamma; \epsilon)$
is a Gaussian annular radial polynomial.
Substituting Eqs. (6-45a)–(6-45c) into the orthonormality Eq. (6-44), we find that the
Gaussian annular radial polynomials obey the orthogonality condition [1,3]
$$\int_\epsilon^1 R_n^m(\rho; \gamma; \epsilon)\,R_{n'}^m(\rho; \gamma; \epsilon)\,A(\rho)\,\rho\,d\rho \Bigg/ \int_\epsilon^1 A(\rho)\,\rho\,d\rho = \frac{1}{n+1}\,\delta_{nn'} . \tag{6-46}$$
Writing Eq. (6-42) in terms of two-index polynomials given by Eqs. (6-45a)–(6-45c) and
substituting these equations into it, as was done in Chapter 5 for the annular polynomials,
we find that the Gaussian annular radial polynomials are given by
$$R_n^m(\rho; \gamma; \epsilon) = M_n^m\left[R_n^m(\rho; \epsilon) - \sum_{i \ge 1}^{(n-m)/2}(n - 2i + 1)\left\langle R_n^m(\rho; \epsilon)\,R_{n-2i}^m(\rho; \gamma; \epsilon)\right\rangle R_{n-2i}^m(\rho; \gamma; \epsilon)\right] , \tag{6-47}$$
where the angular brackets indicate an average over the annular Gaussian pupil; i.e.,
$$\left\langle R_n^m(\rho; \epsilon)\,R_{n-2i}^m(\rho; \gamma; \epsilon)\right\rangle = \int_\epsilon^1 R_n^m(\rho; \epsilon)\,R_{n-2i}^m(\rho; \gamma; \epsilon)\,A(\rho)\,\rho\,d\rho \Bigg/ \int_\epsilon^1 A(\rho)\,\rho\,d\rho . \tag{6-48}$$
$$p_s = \langle\rho^s\rangle = \frac{\epsilon^s\exp\left[\gamma\left(1 - \epsilon^2\right)\right] - 1}{\exp\left[\gamma\left(1 - \epsilon^2\right)\right] - 1} + (s/2\gamma)\,p_{s-2} . \tag{6-51}$$
Using these expressions, numerical results for the coefficients of the terms of a radial
polynomial for any values of $\gamma$ and $\epsilon$ can be obtained.
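The recursion of Eq. (6-51), combined with the coefficient expressions in the footnote of Table 6-6, is enough to generate the coefficients numerically. The sketch below assumes that those footnote expressions, which are stated for a Gaussian circular pupil, carry over unchanged to the annular case once the moments $p_s$ are taken over the annulus. This holds because they follow from Gram-Schmidt orthogonalization under any radially symmetric weight. The function and key names are hypothetical.

```python
import math

def moments(gamma, eps=0.0, s_max=8):
    """Even moments p_s = <rho^s>, s = 0, 2, ..., s_max, from the recursion of Eq. (6-51)."""
    e = math.exp(gamma * (1.0 - eps ** 2))
    p = {0: 1.0}
    for s in range(2, s_max + 1, 2):
        p[s] = (eps ** s * e - 1.0) / (e - 1.0) + s / (2.0 * gamma) * p[s - 2]
    return p

def radial_coefficients(gamma, eps=0.0):
    """Coefficients of the primary Gaussian (annular) radial polynomials, gamma > 0."""
    p = moments(gamma, eps)
    p2, p4, p6, p8 = p[2], p[4], p[6], p[8]
    d = p4 - p2 ** 2
    k1 = (p6 - p2 * p4) / d
    k2 = (p2 * p6 - p4 ** 2) / d
    a11 = (2.0 * p2) ** -0.5
    a20 = (3.0 * d) ** -0.5
    a22 = (3.0 * p4) ** -0.5
    a31 = 0.5 * (p6 - p4 ** 2 / p2) ** -0.5
    a40 = (5.0 * (p8 - 2.0 * k1 * p6 + (k1 ** 2 + 2.0 * k2) * p4
                  - 2.0 * k1 * k2 * p2 + k2 ** 2)) ** -0.5
    return {
        "a11": a11,
        "a20": a20, "b20": -p2 * a20,
        "a22": a22,
        "a31": a31, "b31": -(p4 / p2) * a31,
        "a40": a40, "b40": -k1 * a40, "c40": k2 * a40,
    }

for eps in (0.0, 0.25, 0.5):
    print(eps, {k: round(v, 5) for k, v in radial_coefficients(1.0, eps).items()})
```

With $\gamma = 1$ the output can be compared row by row with the tabulated coefficients. For very large $\gamma$ or $\epsilon$ close to 1 the differences of moments become small, so a higher-precision evaluation may be preferable there.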
The coefficients for $\gamma = 1$ and $\epsilon = 0$, 0.25, 0.50, 0.75, and 0.90 are given in Table 6-7.
For comparison, the coefficients for a uniformly illuminated pupil, i.e., for $\gamma = 0$, are
given in parentheses in this table. An increase (decrease) in the value of a coefficient $a_n^m$
of an orthogonal aberration $R_n^m(\rho; \gamma; \epsilon)\cos m\theta$ implies a decrease (increase) in the value
of $\sigma_\Phi$ for a given amount of the corresponding classical aberration. This, in turn, implies
that for small aberrations, the system performance as measured by the Strehl ratio is less
(more) sensitive to that classical aberration when balanced with other classical
aberrations to form an orthogonal aberration. Thus, as $\epsilon$ increases, irrespective of the
value of $\gamma$, the system becomes less sensitive to field curvature (defocus) and spherical
aberration but more sensitive to distortion (tilt) and astigmatism. In the case of coma, it
first becomes slightly more sensitive but is much less sensitive for larger values of $\epsilon$. As
$\gamma$ increases, i.e., as the width of the Gaussian illumination becomes narrower, the system
becomes less sensitive to all classical primary aberrations. Although the results for $\gamma = 0$
and $\gamma = 1$ only are given in Table 6-7, the coefficients for $0 \le \gamma \le 3$ show that the
differences between the coefficients for uniform and Gaussian illumination are small, and
they decrease as $\epsilon$ increases and increase as $\gamma$ increases. This is understandable because
as $\epsilon$ increases or $\gamma$ decreases, the difference between the two illuminations decreases.
Table 6-7. Coefficients of terms in Gaussian radial polynomials $R_n^m(\rho; \gamma; \epsilon)$ for $\gamma = 1$.
The numbers given in parentheses are the corresponding coefficients for uniform
illumination.

| $\epsilon$ | $a_1^1$ | $a_2^0$ | $b_2^0$ | $a_2^2$ | $a_3^1$ | $b_3^1$ | $a_4^0$ | $b_4^0$ | $c_4^0$ |
|---|---|---|---|---|---|---|---|---|---|
| 0.00 | 1.09367 | 2.04989 | −0.85690 | 1.14541 | 3.11213 | −1.89152 | 6.12902 | −5.71948 | 0.83368 |
| 0.25 | 1.04364 | 2.18012 | −1.00080 | 1.08940 | 3.01573 | −1.84513 | 6.95563 | −6.98197 | 1.25153 |
| 0.50 | 0.92963 | 2.70412 | −1.56449 | 0.93620 | 3.14319 | −2.06618 | 10.79549 | −13.08900 | 3.46706 |
| 0.75 | 0.80827 | 4.59329 | −3.51548 | 0.74439 | 4.55179 | −3.57767 | 31.47560 | −48.77879 | 18.39840 |
| 0.90 | 0.74453 | 10.53581 | −9.50324 | 0.63890 | 9.60573 | −8.69629 | 166.33359 | −300.66342 | 135.36926 |
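where $a_j$ is an expansion coefficient of the polynomial. Multiplying both sides of Eq. (6-52)
by $G_j(\rho, \theta; \gamma; \epsilon)$, integrating over the Gaussian pupil, and using the orthonormality
Eq. (6-44), we obtain the Gaussian annular expansion coefficients:
$$a_j = \int_\epsilon^1\!\!\int_0^{2\pi} W(\rho, \theta; \gamma; \epsilon)\,G_j(\rho, \theta; \gamma; \epsilon)\,A(\rho)\,\rho\,d\rho\,d\theta \Bigg/ 2\pi\int_\epsilon^1 A(\rho)\,\rho\,d\rho . \tag{6-53}$$
The mean and mean square values of the aberration function are given by
$$\langle W(\rho, \theta; \gamma; \epsilon)\rangle = a_1 \tag{6-54}$$
and
$$\langle W^2(\rho, \theta; \gamma; \epsilon)\rangle = \sum_{j=1}^{J} a_j^2 . \tag{6-55}$$
$$\sigma_W^2 = \langle W^2(\rho, \theta; \gamma; \epsilon)\rangle - \langle W(\rho, \theta; \gamma; \epsilon)\rangle^2 = \sum_{j=2}^{J} a_j^2 . \tag{6-56}$$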
6.12 SUMMARY
A pupil with Gaussian illumination is called a Gaussian pupil. The Gaussian
illumination may be due to a filter with Gaussian transmission placed at the pupil or due
to a laser beam with Gaussian amplitude distribution. The illumination is characterized by
a truncation ratio $\sqrt{\gamma} = a/w$, where $a$ is the pupil radius and $w$ is the radial distance,
called the Gaussian radius, where the amplitude is $1/e$ times its central value.
The aberration-free image for a system with a Gaussian pupil shows that the
Gaussian illumination reduces the central value, broadens the central bright spot, but
reduces the power in the diffraction rings compared to a uniform pupil. Correspondingly,
the OTF is higher for low spatial frequencies, and lower for the high. The diffraction
rings practically disappear when the pupil radius is twice the Gaussian radius, and the
beam propagates as a Gaussian everywhere. The OTF in this case is also described by a
Gaussian function.
The Strehl ratio for a small aberration can be estimated from its variance calculated
over the Gaussian amplitude-weighted pupil. The aberration variance decreases, and,
therefore, its tolerance increases as the truncation ratio increases (see Tables 6-1 and 6-3),
because the amplitude decreases as the aberration increases with the radial distance from
the center.
The Gaussian polynomials orthonormal over a Gaussian circular pupil are obtained
by orthonormalizing the Zernike circle polynomials over a corresponding Gaussian
amplitude-weighted pupil. They are given in Table 6-6 for the primary aberrations for
$\gamma = 1$. For a weakly truncated pupil, i.e., for large values of $\gamma$, the polynomials have a
simple analytical form similar to Laguerre polynomials, as shown in the last column in
Table 6-6.
The orthonormal Gaussian annular polynomials for Gaussian annular pupils can be
obtained by orthonormalizing the annular polynomials. The polynomial ordering is
exactly the same as that for the circle or the annular polynomials.
References
3. V. N. Mahajan, “Strehl ratio of a Gaussian beam,” J. Opt. Soc. Am. A22, 1824–
1833 (2005).
Chapter 7
Systems with Hexagonal Pupils
7.1 INTRODUCTION
Although most optical imaging systems have a circular or an annular pupil, with or
without Gaussian illumination, there are times when the wavefront or the interferogram is
hexagonal. This is most notable for the primary mirrors of large telescopes, such as the
Keck [1], the James Webb [2], or the CELT [3]. Although these mirrors are circular, they
are large enough that they are segmented into small hexagonal segments. Optical testing
of a hexagonal segment yields a hexagonal wavefront or interferogram, thus requiring
polynomials that are orthogonal over a hexagon. Even a large hexagonal primary mirror
consisting of hexagonal segments has been proposed [4].
Smith and Marsh [5] have discussed the PSF of a hexagonal pupil, but their equation
for it is incorrect. Sabatke et al. [4] describe the complex amplitude for a trapezoid
forming the upper half of a regular hexagon, but do not carry out the summation of the
diffracted amplitudes of the two trapezoids of the hexagonal pupil. We give closed-form
expressions for the six-fold symmetric aberration-free PSF and OTF [6]. Similar
expressions for the PSF have been given by others [7,8]. The PSF and OTF are plotted
along with the ensquared power, and compared with the corresponding quantities for a
system with a circular pupil. The ensquared power and the OTF are shown to be lower
than the corresponding values for a circular pupil.
secondary spherical aberrations are radially symmetric, the polynomial H37 representing
the balanced tertiary spherical aberration is not, because it also consists of an angle-
dependent term in Z28 or cos 6q . The balancing defocus, however, to optimally balance
Seidel astigmatism for a hexagonal pupil is the same as that for a circular or an annular
pupil.
The isometric, interferometric, and PSF plots for the hexagonal polynomial
aberrations are shown. The P-V numbers for the polynomials with a sigma value of one
wave are given, and the Strehl ratios are calculated for a sigma value of one-tenth of a
wave to illustrate that the exponential expression for it, in terms of the aberration
variance, gives a good estimate for small aberrations.
The balancing of Seidel aberrations is considered, and their standard deviations are
obtained by expressing them in terms of the orthonormal polynomials. The diffraction
focus is shown to lie closer to the Gaussian image point in the case of coma, and closer to
the Gaussian image plane in the case of spherical aberration, compared to their
corresponding locations for a circular pupil. Plots of Strehl ratio as a function of the
sigma value of a Seidel aberration are given. They demonstrate that the exponential
expression underestimates the Strehl ratio in the case of defocus, but overestimates it in the case of
astigmatism, coma, and spherical aberration. The Strehl ratio is estimated very well for
balanced astigmatism and coma, but it is underestimated in the case of balanced spherical
aberration for $\sigma_W > 0.2$.
[Figure 7-1 diagram: hexagon with corners A, B, C, D, E, F, center o, side a, and width 2a; panels (a) and (b).]
Figure 7-1. (a) Hexagonal pupil with dimension a. (b) Unit hexagonal pupil inscribed
inside a unit circle showing the coordinates of its corners. Each side of the hexagon
has a length of unity. The x axis passes through the corners D and A, and the y axis
bisects its parallel sides EF and CB.
7.2 Pupil Function
$$P(x_p, y_p) = A(x_p, y_p)\exp\left[i\Phi(x_p, y_p)\right] , \tag{7-1}$$
where
$$A(x_p, y_p) = \left(P_{ex}/S_{ex}\right)^{1/2} \tag{7-2}$$
$$(x_p, y_p) = a(x', y') \tag{7-5}$$
and
$$(x_i, y_i) = \lambda F_x(x, y) , \tag{7-6}$$
where
$$F_x = R/2a \tag{7-7}$$
is the focal ratio of the image-forming light cone along the x axis, Eq. (7-4) can be written
$$I(x, y) = \frac{4}{27}\left|\iint\exp\left[i\Phi(x', y')\right]\exp\left[-\pi i(xx' + yy')\right]dx'\,dy'\right|^2 . \tag{7-8}$$
The hexagonal region of integration consists of a rectangle CBEF and two congruent
triangles BFA and CDE with the limits of integration $\left(-1/2, 1/2;\ -\sqrt{3}/2, \sqrt{3}/2\right)$,
$\left[1/2, 1;\ -\sqrt{3}(1 - x'), \sqrt{3}(1 - x')\right]$, and $\left[-1, -1/2;\ -\sqrt{3}(1 + x'), \sqrt{3}(1 + x')\right]$, respectively. In
each case, the first pair of limits is on $x'$, and the second on $y'$. Hence, the irradiance
distribution is given by
$$I(x, y) = \frac{4}{27}\left|\left[\int_{-1/2}^{1/2}dx'\int_{-\sqrt{3}/2}^{\sqrt{3}/2} + \int_{1/2}^{1}dx'\int_{-\sqrt{3}(1-x')}^{\sqrt{3}(1-x')} + \int_{-1}^{-1/2}dx'\int_{-\sqrt{3}(1+x')}^{\sqrt{3}(1+x')}\right]\exp\left[-\pi i(xx' + yy')\right]dy'\right|^2 . \tag{7-10}$$
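The integrand in Eq. (7-10) is separable in the integration coordinates. We carry out the
integration of each of its three parts:
$$A_1(x, y) = \int_{-1/2}^{1/2}dx'\int_{-\sqrt{3}/2}^{\sqrt{3}/2}\exp\left[-\pi i(xx' + yy')\right]dy' = \frac{4}{\pi^2xy}\sin(\pi x/2)\sin\left(\sqrt{3}\pi y/2\right) , \tag{7-11}$$
$$\begin{aligned}
A_2(x, y) &= \int_{1/2}^{1}dx'\int_{-\sqrt{3}(1-x')}^{\sqrt{3}(1-x')}\exp\left[-\pi i(xx' + yy')\right]dy' \\
&= \frac{-2}{\pi^2y\left(x^2 - 3y^2\right)}\left\{e^{-i\pi x/2}\left[-\sqrt{3}y\cos\left(\sqrt{3}\pi y/2\right) + ix\sin\left(\sqrt{3}\pi y/2\right)\right] + \sqrt{3}y\,e^{-i\pi x}\right\} .
\end{aligned} \tag{7-12}$$
$$\begin{aligned}
A_3(x, y) &= \int_{-1}^{-1/2}dx'\int_{-\sqrt{3}(1+x')}^{\sqrt{3}(1+x')}\exp\left[-\pi i(xx' + yy')\right]dy' \\
&= \frac{2}{\pi^2y\left(x^2 - 3y^2\right)}\left\{e^{i\pi x/2}\left[\sqrt{3}y\cos\left(\sqrt{3}\pi y/2\right) + ix\sin\left(\sqrt{3}\pi y/2\right)\right] - \sqrt{3}y\,e^{i\pi x}\right\} .
\end{aligned} \tag{7-13}$$
Adding $A_2$ and $A_3$, we obtain
$$A_2 + A_3 = \frac{4}{\pi^2y\left(x^2 - 3y^2\right)}\left[\sqrt{3}y\cos(\pi x/2)\cos\left(\sqrt{3}\pi y/2\right) - x\sin(\pi x/2)\sin\left(\sqrt{3}\pi y/2\right) - \sqrt{3}y\cos(\pi x)\right] . \tag{7-14}$$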
From Eqs. (7-11) and (7-14), we obtain
$$A_1 + A_2 + A_3 = \frac{4}{\pi^2x\left(x^2 - 3y^2\right)}\left\{\sqrt{3}x\left[\cos(\pi x/2)\cos\left(\sqrt{3}\pi y/2\right) - \cos(\pi x)\right] - 3y\sin(\pi x/2)\sin\left(\sqrt{3}\pi y/2\right)\right\} . \tag{7-15}$$
The sum of the three parts of diffracted amplitude is real. The irradiance distribution is
given by
$$I(x, y) = \frac{4}{27}\left|A_1 + A_2 + A_3\right|^2 = \frac{4}{27}\left(A_1 + A_2 + A_3\right)^2 . \tag{7-16}$$
Using the L’Hopital rule, it can be shown that the PSF I (0, 0) at the origin is unity,
as expected from the normalization in Eq. (7-3). Rotating the ( x , y ) coordinate system by
[ ]
60 o , i.e., by changing ( x , y ) to (1 2) x + 3 y , y - 3 x , it can be shown that the PSF
remains invariant, thus showing that the PSF is 6-fold symmetric, as expected for the 6-
fold symmetric pupil. The PSF along the x and y axes can be written from Eq. (7-14) as
64
I ( x , 0) = [
9p 4 x 4
cos(px 2) - cos( px ) ]2 . (7-17a)
and
16 2
I (0, y ) =
243p 4 y 4
{ [
2 3 1 - cos ( )]
3py 2 + 3py sin ( )}
3py 2 . (7-17b)
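The closed-form results above are easy to check numerically. The following is a minimal Python sketch, not part of the original derivation, that evaluates Eqs. (7-15) through (7-17b); the function names are illustrative, and the removable singularities at x = 0 and x² = 3y² are simply avoided rather than handled.

```python
import math

SQRT3 = math.sqrt(3.0)

def hex_amplitude(x, y):
    """A1 + A2 + A3 of Eq. (7-15); x and y are in units of lambda*F_x.
    The removable singularities at x = 0 and x**2 = 3*y**2 are not treated."""
    cx, sx = math.cos(math.pi * x / 2), math.sin(math.pi * x / 2)
    s = SQRT3 * math.pi * y / 2
    num = SQRT3 * x * (cx * math.cos(s) - math.cos(math.pi * x)) - 3.0 * y * sx * math.sin(s)
    return 4.0 * num / (math.pi ** 2 * x * (x * x - 3.0 * y * y))

def hex_psf(x, y):
    """Aberration-free PSF of Eq. (7-16), normalized to unity at the origin."""
    return (4.0 / 27.0) * hex_amplitude(x, y) ** 2

def hex_psf_x(x):
    """PSF along the x axis, Eq. (7-17a)."""
    diff = math.cos(math.pi * x / 2) - math.cos(math.pi * x)
    return 64.0 * diff ** 2 / (9.0 * math.pi ** 4 * x ** 4)

def hex_psf_y(y):
    """PSF along the y axis, Eq. (7-17b)."""
    s = SQRT3 * math.pi * y / 2
    bracket = 2.0 * SQRT3 * (1.0 - math.cos(s)) + 3.0 * math.pi * y * math.sin(s)
    return 16.0 * bracket ** 2 / (243.0 * math.pi ** 4 * y ** 4)

def rotate60(x, y):
    """Coordinate change (x, y) -> (1/2)[x + sqrt(3) y, y - sqrt(3) x]."""
    return 0.5 * (x + SQRT3 * y), 0.5 * (y - SQRT3 * x)
```

Evaluating `hex_psf` close to the origin, on the axes, and at a point and its 60° rotation gives a quick check of the normalization, of Eqs. (7-17a) and (7-17b), and of the 6-fold symmetry. The zero of Eq. (7-17a) at x = 4/3 is the value quoted as 1.33 in the text.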
A 2D PSF is shown in Figure 7-2. The PSF in Figure 7-2a emphasizes the low-value details, but that in Figure 7-2b is truncated to a value of $10^{-3}$ relative to a value of unity at the center. It shows a nearly circular bright spot at the center surrounded by nearly hexagonal alternating dark and bright rings, three dark and two bright. Beyond the rings, the PSF breaks into six diffracted arms each of alternating bright and dark strips with some dim structure between two consecutive arms. Plots of the PSF along the x and y axes and at 15° from the x axis are shown in Figure 7-3 as $I(x, 0)$, $I(0, y)$, and $I(15^\circ) \equiv I(r)$, respectively. The solid curve $I_c$ represents the Airy pattern for a circular pupil (of the same radius a as the side of the hexagonal pupil, imaging an object at the same wavelength $\lambda$ with the same focal ratio as $F_x$) with its first zero at 1.22, as in Figure 4-2. The central bright spot has its zero value along the x axis at 1.33, and at 1.35 along the y axis.
The ensquared power, i.e., the fractional power in a square region centered at the Gaussian image point, is given by
\[
P(s) = \int_{-s}^{s} dx \int_{-s}^{s} I(x, y)\, dy , \tag{7-18}
\]
where s is the half-width of the square. It is tabulated in Table 7-1 along with the corresponding value for a circular pupil. The two ensquared powers are plotted in Figure 7-4 as $P_h$ and $P_c$. The ensquared power for a hexagonal pupil, plotted as a dotted curve $P_h$, starts at zero and rises to 83.8% as s increases to the first zero along the x axis at 1.33, like the Airy disc of radius 1.22 for a circular pupil (as in Figure 4-2a), and approaches 100% asymptotically. It is evident that the ensquared power for a hexagonal pupil is lower than the corresponding value for a circular pupil.
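As a rough numerical illustration of Eq. (7-18), the sketch below integrates the closed-form PSF of Eqs. (7-15) and (7-16) over a square with a midpoint rule. It assumes that P(s) is normalized by the total power, which for a PSF of unit peak value with x and y in units of λF_x works out to 8/(3√3) by Parseval's theorem; that factor and the function names are assumptions of this sketch, not taken from the text.

```python
import math

SQRT3 = math.sqrt(3.0)

def hex_psf(x, y):
    """Aberration-free PSF of Eqs. (7-15) and (7-16); singular points are avoided by the grid."""
    cx, sx = math.cos(math.pi * x / 2), math.sin(math.pi * x / 2)
    s = SQRT3 * math.pi * y / 2
    num = SQRT3 * x * (cx * math.cos(s) - math.cos(math.pi * x)) - 3.0 * y * sx * math.sin(s)
    amp = 4.0 * num / (math.pi ** 2 * x * (x * x - 3.0 * y * y))
    return (4.0 / 27.0) * amp * amp

def ensquared_power(s, n=100):
    """Midpoint-rule estimate of Eq. (7-18) for a square of half-width s.
    n must be even so that the line x = 0 is never sampled.
    Assumption: the result is divided by the total power 8/(3*sqrt(3))."""
    h = 2.0 * s / n
    total = 0.0
    for i in range(n):
        x = -s + (i + 0.5) * h
        for j in range(n):
            y = -s + (j + 0.5) * h
            total += hex_psf(x, y)
    return total * h * h * 3.0 * SQRT3 / 8.0
```

With this normalization, `ensquared_power(0.1)` and `ensquared_power(1.0)` can be compared directly with the $P_h$ column of Table 7-1.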
Figure 7-3. PSF along the x and y axes and at 15° from the x axis, where x, y, and r are in units of $\lambda F_x$.
Table 7-1. Ensquared power $P_h$ of a system with a hexagonal pupil, where s is the half-width of a square in units of $\lambda F_x$, compared with the ensquared power $P_c$ for a circular pupil.
s Ph Pc
0 0 0
0.1 0.0256 0.0310
0.2 0.0984 0.1180
0.3 0.2070 0.2449
0.4 0.3354 0.3897
0.5 0.4663 0.5302
0.6 0.5848 0.6491
0.7 0.6809 0.7369
0.8 0.7504 0.7930
0.9 0.7945 0.8229
1 0.8186 0.8360
1.2 0.8344 0.8455
1.4 0.8434 0.8624
1.6 0.8613 0.8862
1.8 0.8819 0.9043
2 0.8972 0.9135
2.2 0.9060 0.9184
2.4 0.9116 0.9241
2.6 0.9175 0.9315
2.8 0.9244 0.9384
3 0.9311 0.9426
3.5 0.9397 0.9495
4 0.9469 0.9573
4.5 0.9536 0.9615
5 0.9575 0.9662
6 0.9645 0.9722
7 0.9699 0.9765
8 0.9738 0.9798
9 0.9768 0.9823
10 0.9791 0.9843
7.3.2 OTF
From Eq. (1-11), the OTF for a uniformly illuminated hexagonal pupil can be obtained as the autocorrelation of the pupil function:
\[
\tau(\vec{v}) = S_{ex}^{-1} \int \exp\left[ i Q\left( \vec{r}_p; \vec{v} \right) \right] d\vec{r}_p , \tag{7-19}
\]
where
\[
Q\left( \vec{r}_p; \vec{v} \right) = \Phi\left( \vec{r}_p \right) - \Phi\left( \vec{r}_p - \lambda R \vec{v} \right) \tag{7-20}
\]
is the phase aberration difference function, and $\vec{v}$ is a spatial frequency vector in the image plane. The integration in Eq. (7-19) is carried out over the overlap area of two hexagonal pupils whose centers are displaced from each other by $\lambda R \vec{v}$. In the aberration-free case, the OTF is real and simply equal to the relative area of overlap of two pupils where the center of one is displaced from that of the other by $\lambda R \vec{v}$.
For a displacement x along the x axis, as in Figure 7-5a, the overlap area consists of two isosceles triangles and a rectangle when $x < a$. The area of each triangle is $\sqrt{3}\, a^2/4$, and that of the rectangle is $\sqrt{3}\, a(a - x)$. Since the area of the hexagon is $3\sqrt{3}\, a^2/2$, the total fractional overlap area is $1 - 2x/3a$. For $x = a$, as in Figure 7-5b, the rectangle vanishes and the two triangles meet forming a rhombus. For $x > a$, the two triangles intersect each other, thus reducing the size and therefore the area of the rhombus. The fractional area of the rhombus is given by $(1/3)(2 - x/a)^2$. The rhombus vanishes as $x \to 2a$, and the two hexagons meet at a vertex only, namely, the extreme right-hand vertex of one hexagon and the extreme left-hand vertex of the other. Replacing the displacement x by $\lambda R v_x$, where $v_x$ is a spatial frequency along the x axis, and normalizing it by the cutoff frequency $1/\lambda F_x$ along this axis, we can write the tangential or the x-OTF as
\[
\tau_x(v_x) = \begin{cases} 1 - (4/3)\, v_x , & 0 \le v_x \le 1/2 \\ (4/3)\left( 1 - v_x \right)^2 , & 1/2 \le v_x \le 1 . \end{cases} \tag{7-21}
\]
Figure 7-5. Overlap area of two hexagonal pupils displaced from each other along the x axis in (a) and with x = a in (b), and along the y axis in (c).
Now consider a displacement y along the y axis, as illustrated in Figure 7-5c. Here again, the overlap area consists of two congruent isosceles triangles and a rectangle. The area of each triangle is $\left( 1/4\sqrt{3} \right)\left( \sqrt{3}\, a - y \right)^2$ and that of the rectangle is $a\left( \sqrt{3}\, a - y \right)$ for $0 \le y \le \sqrt{3}\, a$. The fractional overlap area is given by $(2/3)\left[ \left( 1 - y/\sqrt{3}\, a \right) + (1/2)\left( 1 - y/\sqrt{3}\, a \right)^2 \right]$. Again, replacing y by $\lambda R v_y$, where $v_y$ is the spatial frequency along the y axis, and normalizing by the cutoff frequency $1/\lambda F_x$, the sagittal or the y-OTF can be written
\[
\tau_y(v_y) = \frac{2}{3}\left[ \left( 1 - 2v_y/\sqrt{3} \right) + \frac{1}{2}\left( 1 - 2v_y/\sqrt{3} \right)^2 \right] , \quad 0 \le v_y \le \sqrt{3}/2 . \tag{7-22}
\]
Note that the cutoff frequency in the y direction is $\sqrt{3}/2$ compared to a value of unity in the x direction.
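It can be shown that the OTF for an angle $\theta$ from the x axis in the range $0 \le \theta \le \pi/6$ is given by [6]
\[
\tau(v_\theta) = \begin{cases}
1 - \dfrac{4}{3\sqrt{3}}\, v_\theta \left[ \sin\theta + \sqrt{3}\cos\theta + \left( \dfrac{2}{\sqrt{3}}\sin^2\theta - \sin 2\theta \right) v_\theta \right] , & 0 \le v_\theta \le v_1 \\[2ex]
\dfrac{4}{3} - \dfrac{8}{3}\cos\theta\; v_\theta + \dfrac{4}{9}\left( 1 + 2\cos 2\theta \right) v_\theta^2 , & v_1 \le v_\theta \le v_2 ,
\end{cases} \tag{7-23}
\]
where
\[
v_1 = \frac{1}{2}\left( \cos\theta - \frac{\sin\theta}{\sqrt{3}} \right)^{-1} \tag{7-24}
\]
and
\[
v_2 = \left( \cos\theta + \frac{\sin\theta}{\sqrt{3}} \right)^{-1} . \tag{7-25}
\]
The second branch of Eq. (7-23) is the fractional area $(4/3)\left( 1 - v_\theta/2v_1 \right)\left( 1 - v_\theta/v_2 \right)$ of the parallelogram formed by the overlapping corners of the two hexagons. For $\theta = 0$, Eq. (7-23) reduces to Eq. (7-21). For $\theta = \pi/6$, $v_1 = v_2 = \sqrt{3}/2$ and it reduces to Eq. (7-22), as expected from the 6-fold symmetry of the OTF.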
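Since the aberration-free OTF is just a fractional overlap area, the closed-form expressions can be checked against a direct polygon intersection. The sketch below, which is only an illustration, clips one unit hexagon (a = 1) against a displaced copy with the Sutherland-Hodgman algorithm and compares the result with Eqs. (7-21) and (7-22). The function `otf_theta` is an assumption of this sketch: its branch point $v_1$ and the product form of its second branch follow from the parallelogram-shaped overlap and are validated only against the numerical overlap.

```python
import math

SQRT3 = math.sqrt(3.0)

def hexagon(dx=0.0, dy=0.0):
    """Counterclockwise vertices of the unit hexagon of Figure 7-7, shifted by (dx, dy)."""
    return [(math.cos(k * math.pi / 3) + dx, math.sin(k * math.pi / 3) + dy) for k in range(6)]

def side(p, a, b):
    """Cross product (b - a) x (p - a); nonnegative when p lies to the left of a -> b."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

def clip(subject, clipper):
    """Sutherland-Hodgman clipping of a polygon by a convex counterclockwise polygon."""
    out = list(subject)
    for i in range(len(clipper)):
        a, b = clipper[i], clipper[(i + 1) % len(clipper)]
        inp, out = out, []
        for j in range(len(inp)):
            p, q = inp[j - 1], inp[j]
            dp, dq = side(p, a, b), side(q, a, b)
            if (dp >= 0) != (dq >= 0):      # edge p -> q crosses the clip line
                t = dp / (dp - dq)
                out.append((p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1])))
            if dq >= 0:                     # q is on the inner side
                out.append(q)
    return out

def area(poly):
    """Shoelace area of a polygon given as a list of vertices (zero for an empty list)."""
    return 0.5 * abs(sum(poly[i - 1][0] * poly[i][1] - poly[i][0] * poly[i - 1][1]
                         for i in range(len(poly))))

def otf_numeric(v, theta):
    """Fractional overlap for a normalized frequency v at angle theta (displacement 2*v for a = 1)."""
    d = 2.0 * v
    overlap = clip(hexagon(), hexagon(d * math.cos(theta), d * math.sin(theta)))
    return area(overlap) / (1.5 * SQRT3)

def otf_x(v):
    """Eq. (7-21)."""
    return 1.0 - 4.0 * v / 3.0 if v <= 0.5 else (4.0 / 3.0) * (1.0 - v) ** 2

def otf_y(v):
    """Eq. (7-22)."""
    t = 1.0 - 2.0 * v / SQRT3
    return (2.0 / 3.0) * (t + 0.5 * t * t)

def otf_theta(v, theta):
    """Sketch of the angular dependence for 0 <= theta <= pi/6 (see the assumptions above)."""
    c, s = math.cos(theta), math.sin(theta)
    v1, v2 = 0.5 / (c - s / SQRT3), 1.0 / (c + s / SQRT3)
    if v <= v1:
        return 1.0 - 4.0 / (3.0 * SQRT3) * v * (s + SQRT3 * c + (2.0 / SQRT3 * s * s - math.sin(2 * theta)) * v)
    if v <= v2:
        return (4.0 / 3.0) * (1.0 - v * (c - s / SQRT3)) * (1.0 - v * (c + s / SQRT3))
    return 0.0
```

Polygon clipping was chosen over pixel counting because it gives the overlap area to machine precision, so any disagreement with a closed-form expression points to the expression rather than to sampling error.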
Figure 7-6 shows how the OTF varies with the spatial frequency (in units of the cutoff frequency $1/\lambda F_x$) along the x and y axes, and at 15° from the x axis as $\tau(v_x)$, $\tau(v_y)$ (in long dashes), and $\tau(15^\circ) \equiv \tau(v)$. The OTF of a system with a corresponding circular pupil of radius a is also included for comparison as $\tau_c$. Note that the cutoff frequency of the hexagonal pupil is the same as that for the circular pupil only along the x axis and every 60° from it. Otherwise, it is smaller. We note that the OTF of a hexagonal pupil is lower than that for a circular pupil at all spatial frequencies. The OTF along the x axis is slightly higher than that along the y axis, and the OTF at 15° is slightly higher in the low frequency region but lower in the high. The 15° OTF is lower than that along the x axis. The differences among the three curves are relatively small.
Figure 7-6. OTF along the x and y axes, and at 15° from the x axis, where the spatial frequencies $v_x$, $v_y$, and $v$ are in units of $1/\lambda F_x$.
7.4 HEXAGONAL POLYNOMIALS
\[
H_{j+1} = N_{j+1} \left[ Z_{j+1} - \sum_{k=1}^{j} \left\langle Z_{j+1} H_k \right\rangle H_k \right] , \tag{7-26}
\]
where $N_{j+1}$ is a normalization constant so that the polynomials are orthonormal over the unit hexagon, i.e., they satisfy the orthonormality condition
\[
\frac{2}{3\sqrt{3}} \int_{\text{hexagon}} H_j H_{j'}\, dx\, dy = \delta_{jj'} . \tag{7-27}
\]
The hexagonal region of integration consists of a rectangle EFCB and two congruent triangles FAB and CDE with limits of integration $\left( -1/2,\ 1/2;\ -\sqrt{3}/2,\ \sqrt{3}/2 \right)$, $\left[ 1/2,\ 1;\ -\sqrt{3}(1 - x),\ \sqrt{3}(1 - x) \right]$, and $\left[ -1,\ -1/2;\ -\sqrt{3}(1 + x),\ \sqrt{3}(1 + x) \right]$, respectively. The angular brackets indicate a mean value over the hexagonal pupil. Thus,
\[
\left\langle Z_{j+1} H_k \right\rangle = \frac{2}{3\sqrt{3}} \int_{\text{hexagon}} Z_{j+1} H_k\, dx\, dy . \tag{7-28}
\]
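The recursion of Eq. (7-26) can be sketched numerically as follows. This is an illustrative implementation rather than the procedure used to generate the tables: the mean over the hexagon is evaluated with a 5-point Gauss-Legendre rule on the rectangle and the two triangles (exact for the low-order products used here), and each $H_j$ is stored as a list of coefficients on the circle polynomials $Z_1$ to $Z_6$. All function and variable names are assumptions of the sketch.

```python
import math

SQRT3 = math.sqrt(3.0)
# 5-point Gauss-Legendre rule on [-1, 1], exact for polynomials up to degree 9
GL_X = (-0.9061798459386640, -0.5384693101056831, 0.0, 0.5384693101056831, 0.9061798459386640)
GL_W = (0.2369268850561891, 0.4786286704993665, 0.5688888888888889, 0.4786286704993665, 0.2369268850561891)

def hex_mean(f):
    """Mean value of f(x, y) over the unit hexagon (rectangle plus two triangles)."""
    total = 0.0
    for x0, x1 in ((-1.0, -0.5), (-0.5, 0.5), (0.5, 1.0)):
        for tx, wx in zip(GL_X, GL_W):
            x = 0.5 * (x0 + x1) + 0.5 * (x1 - x0) * tx
            ymax = min(SQRT3 / 2, SQRT3 * (1.0 - abs(x)))
            for ty, wy in zip(GL_X, GL_W):
                total += wx * wy * 0.5 * (x1 - x0) * ymax * f(x, ymax * ty)
    return total * 2.0 / (3.0 * SQRT3)

# Zernike circle polynomials Z1..Z6 in Cartesian form
ZERNIKE = [
    lambda x, y: 1.0,
    lambda x, y: 2.0 * x,
    lambda x, y: 2.0 * y,
    lambda x, y: SQRT3 * (2.0 * (x * x + y * y) - 1.0),
    lambda x, y: 2.0 * math.sqrt(6.0) * x * y,
    lambda x, y: math.sqrt(6.0) * (x * x - y * y),
]

def gram_schmidt(basis, mean):
    """Eq. (7-26): returns each orthonormal polynomial as coefficients on the basis."""
    n = len(basis)
    gram = [[mean(lambda x, y, a=a, b=b: a(x, y) * b(x, y)) for b in basis] for a in basis]
    ortho = []
    for j in range(n):
        v = [0.0] * n
        v[j] = 1.0
        for h in ortho:
            proj = sum(h[m] * gram[j][m] for m in range(n))     # <Z_j H_k>
            v = [vi - proj * hi for vi, hi in zip(v, h)]
        norm2 = sum(v[a] * v[b] * gram[a][b] for a in range(n) for b in range(n))
        ortho.append([vi / math.sqrt(norm2) for vi in v])
    return ortho

H = gram_schmidt(ZERNIKE, hex_mean)
```

The coefficient lists `H[1]`, `H[3]`, and `H[5]` can then be compared with the entries for $H_2$, $H_4$, and $H_6$ in Table 7-2.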
The orthonormal hexagonal polynomials are given in Tables 7-2–7-4 up to the eighth order in three different but equivalent forms [9,10]. In Table 7-2, each hexagonal polynomial is written in terms of the circle polynomials, thus illustrating the relationship between the two. In particular, it helps determine the potential error made when a hexagonal aberration function is expanded in terms of the circle polynomials (see Chapter 12). The coefficients of the circle polynomials are the elements of the conversion matrix M (discussed in Chapter 3). The polynomials up to $H_{19}$ are given in their analytical form, but those with j > 19 are written in a numerical form because of the increasing complexity of the coefficients of the circle polynomials. In Table 7-3, the hexagonal polynomials are given in polar coordinates, showing one-to-one correspondence with the circle polynomials but illustrating the difference between them. This form is convenient for analytical calculations because of integration of trigonometric functions over symmetric limits. Finally, the polynomials are given in Cartesian coordinates in Table 7-4, for a quantitative numerical analysis of, say, an interferogram.

Figure 7-7. Unit hexagon inscribed inside a unit circle showing the coordinates of its corners: $A(1, 0)$, $B(1/2, -\sqrt{3}/2)$, $C(-1/2, -\sqrt{3}/2)$, $D(-1, 0)$, $E(-1/2, \sqrt{3}/2)$, and $F(1/2, \sqrt{3}/2)$. Each side of the hexagon has a length of unity. The x axis passes through the corners D and A, and the y axis bisects its parallel sides EF and CB.

Table 7-2 (excerpt). Orthonormal hexagonal polynomials $H_j$ in terms of the Zernike circle polynomials $Z_j$.
$H_2 = \sqrt{6/5}\; Z_2$
$H_3 = \sqrt{6/5}\; Z_3$
$H_4 = \sqrt{5/43}\; Z_1 + 2\sqrt{15/43}\; Z_4$
$H_5 = \sqrt{10/7}\; Z_5$
$H_6 = \sqrt{10/7}\; Z_6$
$H_7 = 16\sqrt{14/11055}\; Z_3 + 10\sqrt{35/2211}\; Z_7$
$H_8 = 16\sqrt{14/11055}\; Z_2 + 10\sqrt{35/2211}\; Z_8$
$H_9 = \left( 2\sqrt{5}/3 \right) Z_9$
$H_{27} = 2\sqrt{77/93}\; Z_{27}$

Table 7-3 (excerpt). Orthonormal hexagonal polynomials in polar coordinates $(\rho, \theta)$.
$H_2 = 2\sqrt{6/5}\; \rho\cos\theta$
$H_3 = 2\sqrt{6/5}\; \rho\sin\theta$
$H_4 = \sqrt{5/43}\left( -5 + 12\rho^2 \right)$
$H_5 = 2\sqrt{15/7}\; \rho^2\sin 2\theta$
$H_6 = 2\sqrt{15/7}\; \rho^2\cos 2\theta$
$H_9 = \left( 4\sqrt{10}/3 \right) \rho^3\sin 3\theta$

Table 7-4 (excerpt). Orthonormal hexagonal polynomials in Cartesian coordinates $(x, y)$, where $\rho^2 = x^2 + y^2$.
$H_2 = 2\sqrt{6/5}\; x$
$H_3 = 2\sqrt{6/5}\; y$
$H_4 = \sqrt{5/43}\left( -5 + 12\rho^2 \right)$
$H_5 = 4\sqrt{15/7}\; xy$
$H_6 = 2\sqrt{15/7}\left( x^2 - y^2 \right)$
$H_7 = 4\sqrt{42/3685}\left( -14 + 25\rho^2 \right) y$
$H_8 = 4\sqrt{42/3685}\left( -14 + 25\rho^2 \right) x$
$\ldots + 314.05018953\,x^2y^2 + 157.02509476\,y^4)\,xy$
$H_{24} = (23.72919094 - 92.04290884\,x^2 + 78.51254738\,x^4)\,x^2 + (-23.72919094 + 8.22984309\,x^2 + 89.29962781\,y^2 + 78.51254738\,x^4 - 78.51254738\,x^2y^2 - 78.51254738\,y^4)\,y^2$
Several observations can be made from the polynomial tables. It is evident from Table 7-2 that the corresponding coefficients of the Zernike polynomials that make up the hexagonal polynomial (n, m) pairs are the same except for signs in some cases, unless m is a multiple of 3. For example, $H_{14}$ and $H_{15}$ have some coefficients with different signs, but $H_{16}$ and $H_{17}$ have the same signs. $H_9$ and $H_{10}$, which correspond to n = 3 and m = 3, and $H_{18}$ and $H_{19}$, which correspond to n = 5 and m = 3, have different coefficients. From Table 7-3, we note that each hexagonal polynomial consists of cosine or sine terms, but not both.
Unlike the circle and annular polynomials, the hexagonal polynomials are generally not separable in $\rho$ and $\theta$ due to lack of radial symmetry of the hexagonal pupil. The first 13 polynomials, i.e., up to $H_{13}$, are separable, but $H_{14}$ and $H_{15}$ are not; $H_{16}$ through $H_{19}$ are separable, but $H_{20}$ and $H_{21}$ are not. Accordingly, the notion of two indices n and m with dependence on m in the form of $\cos m\theta$ loses significance. For example, the Zernike polynomial $Z_{14}$ for n = 4 and m = 4 varies as $\cos 4\theta$ but $H_{14}$ has a term in $\cos 2\theta$ also. Hence, the hexagonal polynomials can be ordered by a single index only. While the polynomials $H_{11}$ and $H_{22}$ representing balanced primary and secondary spherical aberrations are radially symmetric, the polynomial $H_{37}$ representing balanced tertiary spherical aberration is not, since it consists of an angle-dependent term in $Z_{28}$ or $\cos 6\theta$ also. If this term is not included in the polynomial $H_{37}$, the standard deviation of the aberration increases from a value of unity to 1.13339.
Figure 7-8. Unit hexagon rotated clockwise by 30 degrees with respect to that in Figure 7-7, showing the coordinates of its corners: $A'(\sqrt{3}/2, -1/2)$, $B'(0, -1)$, $C'(-\sqrt{3}/2, -1/2)$, $D'(-\sqrt{3}/2, 1/2)$, $E'(0, 1)$, and $F'(\sqrt{3}/2, 1/2)$. The x axis bisects the parallel sides $F'A'$ and $D'C'$ of the hexagon, and the y axis passes through its corners $E'$ and $B'$.
where $a_j$ are the expansion coefficients. Multiplying both sides of Eq. (7-29) by $H_j(x, y)$, integrating over the unit hexagon, and using the orthonormality Eq. (7-27), we obtain the hexagonal expansion coefficients:
\[
a_j = \frac{2}{3\sqrt{3}} \int_{\text{hexagon}} W(x, y)\, H_j(x, y)\, dx\, dy . \tag{7-30}
\]
It is evident from Eq. (7-30) that the value of a hexagonal coefficient is independent of the number J of polynomials used in the expansion of the aberration function. Hence, one or more polynomial terms can be added to or subtracted from the aberration function without affecting the value of the coefficients of the other polynomials in the expansion. The mean and mean square values of the aberration function are given by
\[
\left\langle W(\rho, \theta) \right\rangle = a_1 , \tag{7-31}
\]
and
\[
\left\langle W^2(\rho, \theta) \right\rangle = \sum_{j=1}^{J} a_j^2 , \tag{7-32}
\]
respectively. The variance of the aberration function is given by
\[
\sigma_W^2 = \left\langle W^2(\rho, \theta) \right\rangle - \left\langle W(\rho, \theta) \right\rangle^2 = \sum_{j=2}^{J} a_j^2 . \tag{7-33}
\]

Table 7-5 (excerpt). Orthonormal polynomials $H_j$ for the rotated hexagon of Figure 7-8 in terms of the Zernike circle polynomials $Z_j$.
$H_1 = Z_1$
$H_2 = \sqrt{6/5}\; Z_2$
$H_3 = \sqrt{6/5}\; Z_3$
$H_4 = \sqrt{5/43}\; Z_1 + 2\sqrt{15/43}\; Z_4$
$H_5 = \sqrt{10/7}\; Z_5$
$H_6 = \sqrt{10/7}\; Z_6$
$H_7 = 16\sqrt{14/11055}\; Z_3 + 10\sqrt{35/2211}\; Z_7$
$H_8 = 16\sqrt{14/11055}\; Z_2 + 10\sqrt{35/2211}\; Z_8$
$H_9 = 2\sqrt{35/103}\; Z_9$
$H_{10} = \left( 2\sqrt{5}/3 \right) Z_{10}$
$H_{27} = 2\sqrt{77/93}\; Z_{27}$
$H_{28} = 1.07362889\,Z_1 + 1.52546162\,Z_4 + 1.28216588\,Z_{11} + 0.70446308\,Z_{22} + 2.09532473\,Z_{28}$
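As an illustration of Eqs. (7-30) through (7-33), the sketch below expands a sample aberration, defocus plus a small tilt, in the first four hexagonal polynomials and compares the sum of the squared coefficients with the variance computed directly. The sample aberration, the quadrature rule, and all names are assumptions chosen for the example; the polynomials are taken in their Cartesian form.

```python
import math

SQRT3 = math.sqrt(3.0)
# 5-point Gauss-Legendre rule on [-1, 1], exact for polynomials up to degree 9
GL_X = (-0.9061798459386640, -0.5384693101056831, 0.0, 0.5384693101056831, 0.9061798459386640)
GL_W = (0.2369268850561891, 0.4786286704993665, 0.5688888888888889, 0.4786286704993665, 0.2369268850561891)

def hex_mean(f):
    """Mean value of f(x, y) over the unit hexagon (rectangle plus two triangles)."""
    total = 0.0
    for x0, x1 in ((-1.0, -0.5), (-0.5, 0.5), (0.5, 1.0)):
        for tx, wx in zip(GL_X, GL_W):
            x = 0.5 * (x0 + x1) + 0.5 * (x1 - x0) * tx
            ymax = min(SQRT3 / 2, SQRT3 * (1.0 - abs(x)))
            for ty, wy in zip(GL_X, GL_W):
                total += wx * wy * 0.5 * (x1 - x0) * ymax * f(x, ymax * ty)
    return total * 2.0 / (3.0 * SQRT3)

# H1..H4 in Cartesian coordinates
HEX_POLYS = [
    lambda x, y: 1.0,
    lambda x, y: 2.0 * math.sqrt(6.0 / 5.0) * x,
    lambda x, y: 2.0 * math.sqrt(6.0 / 5.0) * y,
    lambda x, y: math.sqrt(5.0 / 43.0) * (-5.0 + 12.0 * (x * x + y * y)),
]

def W(x, y):
    """Sample aberration (an assumption of this example): rho^2 + 0.2 x."""
    return x * x + y * y + 0.2 * x

# Eq. (7-30): each coefficient is the mean of W * H_j over the hexagon
a = [hex_mean(lambda x, y, h=h: W(x, y) * h(x, y)) for h in HEX_POLYS]

variance_from_coefficients = sum(aj * aj for aj in a[1:])            # Eq. (7-33)
variance_direct = hex_mean(lambda x, y: W(x, y) ** 2) - hex_mean(W) ** 2
```

Because the sample aberration lies in the span of $H_1$ to $H_4$, the two variances agree to rounding error, and the piston coefficient `a[0]` equals the mean value 5/12 of $\rho^2$ over the hexagon.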
As in the case of circle and annular polynomials (see Sections 4.9 and 5.7, respectively), we illustrate the hexagonal polynomials for $n \le 8$ in three different but equivalent ways in Figure 7-9. For each polynomial, the isometric plot at the top illustrates its shape. An interferogram is shown on the left, and a corresponding PSF is shown on the right for a sigma value of one wave. The peak-to-valley aberration numbers (in units of wavelength) are given in Table 7-6.
The PSF plots represent the images of a point object in the presence of a polynomial aberration. They can be obtained by applying Eq. (7-6) to a hexagonal pupil. Piston yields the aberration-free PSF since it does not affect the PSF. The full width of a square displaying the PSFs is $24\lambda F_x$.
The symmetry properties of the aberrated PSFs (and OTFs) discussed for the circular pupils in Section 4.7 are generally not applicable to hexagonal pupils. For example, although the form of the polynomials $H_5$ and $H_6$, representing balanced astigmatisms, is the same as that of the corresponding Zernike circle polynomials, the interferogram and the PSF for one cannot be obtained by a 45° rotation of the other. This is due to the lack of radial symmetry of the hexagonal pupil. However, the interferograms and PSFs for the polynomials $H_7$ and $H_8$, representing balanced comas, are different from each other only by a 90° rotation. Similarly, the polynomials $H_9$ and $H_{10}$ have the same form as the Zernike circle polynomials $Z_9$ and $Z_{10}$, respectively, and they yield 6-fold symmetric interferograms and 3-fold symmetric PSFs. The PSF for one can be obtained by a 120° rotation of the other. The interferograms and the PSFs for $H_{11}$ and $H_{22}$, representing the balanced primary and secondary spherical aberrations, respectively, are radially symmetric, but those for $H_{37}$, representing the balanced tertiary spherical aberration, are not because it contains an angle-dependent term in $\cos 6\theta$.
From Eq. (7-6), the Strehl ratio, representing the central value of an aberrated PSF relative to its aberration-free value, is given by
\[
S \equiv I(0, 0) = \frac{4}{27} \left| \iint \exp\left[ i\Phi(x, y) \right] dx\, dy \right|^2 , \tag{7-34}
\]
where the integration is carried out over the unit hexagon, as in Eq. (7-8). We have removed the primes on the x and y coordinates in Eq. (7-34), because the hexagonal polynomial aberrations are already written in the normalized coordinates. The Strehl ratio for these aberrations with a sigma value of 0.1 wave is listed in Table 7-7 and plotted in Figure 7-10. Because of the small value of the aberration, the Strehl ratio is approximately the same for each polynomial, thus illustrating its independence of the type of the aberration. It is approximately given by $\exp\left( -\sigma_\Phi^2 \right)$, or 0.67, where $\sigma_\Phi = 0.2\pi$.
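A minimal numerical sketch of Eq. (7-34) is given below for a single polynomial aberration, the balanced defocus $H_4$ with a sigma value of 0.1 wave. It is an illustration only: the composite Gauss-Legendre quadrature, the number of panels, and the names are assumptions, and the result is meant to be compared with the estimate $\exp(-\sigma_\Phi^2)$ rather than with any tabulated value.

```python
import math

SQRT3 = math.sqrt(3.0)
# 5-point Gauss-Legendre rule on [-1, 1]
GL_X = (-0.9061798459386640, -0.5384693101056831, 0.0, 0.5384693101056831, 0.9061798459386640)
GL_W = (0.2369268850561891, 0.4786286704993665, 0.5688888888888889, 0.4786286704993665, 0.2369268850561891)

def hex_mean(f, panels=4):
    """Mean of f(x, y) over the unit hexagon with a composite Gauss-Legendre rule."""
    total = 0.0
    for x0, x1 in ((-1.0, -0.5), (-0.5, 0.5), (0.5, 1.0)):
        hx = (x1 - x0) / panels
        for i in range(panels):
            for tx, wx in zip(GL_X, GL_W):
                x = x0 + (i + 0.5) * hx + 0.5 * hx * tx
                ymax = min(SQRT3 / 2, SQRT3 * (1.0 - abs(x)))
                hy = 2.0 * ymax / panels
                for j in range(panels):
                    for ty, wy in zip(GL_X, GL_W):
                        y = -ymax + (j + 0.5) * hy + 0.5 * hy * ty
                        total += wx * wy * 0.25 * hx * hy * f(x, y)
    return total * 2.0 / (3.0 * SQRT3)

def h4(x, y):
    """Balanced defocus polynomial H4 in Cartesian coordinates."""
    return math.sqrt(5.0 / 43.0) * (-5.0 + 12.0 * (x * x + y * y))

def strehl(polynomial, sigma_waves):
    """Eq. (7-34) for the phase aberration 2*pi*sigma_waves*polynomial(x, y)."""
    k = 2.0 * math.pi * sigma_waves
    re = hex_mean(lambda x, y: math.cos(k * polynomial(x, y)))
    im = hex_mean(lambda x, y: math.sin(k * polynomial(x, y)))
    return re * re + im * im

sigma_phi = 0.2 * math.pi                 # 0.1 wave expressed in radians
S_defocus = strehl(h4, 0.1)
S_estimate = math.exp(-sigma_phi ** 2)    # about 0.67
```

The factor 4/27 of Eq. (7-34) does not appear explicitly because `hex_mean` already divides by the area $3\sqrt{3}/2$ of the hexagon, and the square of that normalization is 4/27.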
Table 7-7. Strehl ratio S for hexagonal polynomial aberrations for a sigma value of
0.1 wave.
Figure 7-10. Strehl ratio for a hexagonal polynomial aberration with a sigma value
of 0.1 wave.
7.7.1 Defocus
Consider the defocus aberration
\[
W_d(\rho) = A_d \rho^2 . \tag{7-35}
\]
From the form of the orthonormal defocus polynomial $H_4$ given in Table 7-2, it is evident that its sigma value across a hexagonal pupil is given by
\[
\sigma_d = \frac{A_d}{12}\sqrt{\frac{43}{5}} = \frac{A_d}{4.092} . \tag{7-36}
\]
7.7.2 Astigmatism
Next consider 0° Seidel astigmatism given by
\[
W_a(\rho, \theta) = A_a \rho^2 \cos^2\theta . \tag{7-37}
\]
The polynomial $H_6$ representing balanced astigmatism is given by
\[
H_6 = 2\sqrt{15/7}\; \rho^2 \cos 2\theta \tag{7-38a}
\]
\[
\phantom{H_6} = 2\sqrt{15/7}\; \rho^2 \left( 2\cos^2\theta - 1 \right) . \tag{7-38b}
\]
It shows that the relative amount of defocus $\rho^2$ that balances Seidel astigmatism $\rho^2\cos^2\theta$ is the same for a hexagonal pupil as for a circular, annular, or a Gaussian pupil. Hence, for a small amount of astigmatism, the diffraction focus for a hexagonal pupil is the same as for a circular, annular, or a Gaussian pupil. For an image with a focal ratio of F, it lies along the z axis at a distance of $-4A_aF^2$ from the Gaussian image point. The balanced astigmatism is given by
\[
W_{ba}(\rho, \theta) = A_a \left( \rho^2\cos^2\theta - \frac{1}{2}\rho^2 \right) . \tag{7-39}
\]
Its sigma value is
\[
\sigma_{ba} = \frac{A_a}{4}\sqrt{\frac{7}{15}} = \frac{A_a}{5.855} . \tag{7-40}
\]
To obtain the sigma value of astigmatism, we write Eq. (7-37) in the form
\[
\begin{aligned}
W_a(\rho, \theta) &= \frac{1}{2} A_a \left( \rho^2\cos 2\theta + \rho^2 \right) \\
&= \frac{1}{4} A_a \left[ \sqrt{\frac{7}{15}}\, H_6 + \frac{1}{6}\sqrt{\frac{43}{5}}\, H_4 \right] + \text{constant} .
\end{aligned} \tag{7-41}
\]
Its sigma value is
\[
\sigma_a = \frac{A_a}{24}\sqrt{\frac{127}{5}} = \frac{A_a}{4.762} . \tag{7-42}
\]
Comparing Eqs. (7-40) and (7-42), we find that balancing astigmatism with defocus reduces its sigma value by a factor of 1.23.
7.7.3 Coma
Now we consider Seidel coma:
\[
W_c(\rho, \theta) = A_c \rho^3 \cos\theta . \tag{7-43}
\]
The polynomial $H_8$ representing balanced coma is given by
\[
H_8 = 4\sqrt{42/3685}\left( 25\rho^3 - 14\rho \right)\cos\theta . \tag{7-44}
\]
It shows that the relative amount of tilt $\rho\cos\theta$ that optimally balances Seidel coma $\rho^3\cos\theta$ is $-14/25 \approx -0.56$ compared to $-2/3$ for a circular pupil. The diffraction focus in this case lies along the x axis at a distance of $-(4/3)F$ times the amount of tilt from the Gaussian image point. The balanced coma is given by
\[
W_{bc}(\rho, \theta) = A_c \left( \rho^3 - \frac{14}{25}\rho \right)\cos\theta . \tag{7-45}
\]
Its sigma value is
\[
\sigma_{bc} = \frac{A_c}{20}\sqrt{\frac{737}{210}} = \frac{A_c}{10.676} . \tag{7-46}
\]
To obtain the sigma value of Seidel coma, we write Eq. (7-43) in the form
\[
W_c(\rho, \theta) = A_c \left[ \frac{1}{100}\sqrt{\frac{3685}{42}}\, H_8 + \frac{7}{25}\sqrt{\frac{5}{6}}\, H_2 \right] . \tag{7-47}
\]
Its sigma value is
\[
\sigma_c = \frac{A_c}{4}\sqrt{\frac{83}{70}} = \frac{A_c}{3.673} . \tag{7-48}
\]
Comparing Eqs. (7-46) and (7-48), we find that balancing coma with tilt reduces its sigma value by a factor of 2.91.
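7.7.4 Spherical Aberration
Finally, we consider Seidel spherical aberration:
\[
W_s(\rho) = A_s \rho^4 . \tag{7-49}
\]
The polynomial $H_{11}$ representing balanced spherical aberration is given by
\[
H_{11} = \frac{60}{\sqrt{1072205}}\left( 301\rho^4 - 257\rho^2 \right) + \text{constant} . \tag{7-50}
\]
It shows that the relative amount of defocus that optimally balances Seidel spherical aberration $\rho^4$ is $-257/301 \approx -0.85$ compared to a value of $-1$ for a circular pupil. The diffraction focus lies closer to the Gaussian image point in the case of coma, and closer to the Gaussian image plane in the case of spherical aberration, compared to their corresponding locations for a circular pupil. The balanced spherical aberration is given by
\[
W_{bs}(\rho) = A_s \left( \rho^4 - \frac{257}{301}\rho^2 \right) . \tag{7-51}
\]
Its sigma value is
\[
\sigma_{bs} = \frac{A_s}{60 \times 301}\sqrt{1072205} = \frac{A_s}{84}\sqrt{\frac{4987}{215}} = \frac{A_s}{17.441} . \tag{7-52}
\]
To obtain the sigma value of Seidel spherical aberration, we write Eq. (7-49) in the form
\[
W_s(\rho) = A_s \left[ \frac{\sqrt{1072205}}{60 \times 301}\, H_{11} + \frac{257}{12 \times 301}\sqrt{\frac{43}{5}}\, H_4 \right] + \text{constant} . \tag{7-53}
\]
Its sigma value is
\[
\sigma_s = \frac{A_s}{6}\sqrt{\frac{59}{35}} = \frac{A_s}{4.621} . \tag{7-54}
\]
Comparing Eqs. (7-52) and (7-54), we find that balancing spherical aberration with defocus reduces its sigma value by a factor of 3.77.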
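The sigma values of the Seidel aberrations and their balanced forms can be cross-checked by computing the standard deviation of each aberration directly over the unit hexagon, without using the orthonormal polynomials. The sketch below does this for unit aberration coefficients; the quadrature (a 5-point Gauss-Legendre rule on the rectangle and two triangles, exact for the polynomial integrands involved) and the names are assumptions of the example.

```python
import math

SQRT3 = math.sqrt(3.0)
# 5-point Gauss-Legendre rule on [-1, 1], exact for polynomials up to degree 9
GL_X = (-0.9061798459386640, -0.5384693101056831, 0.0, 0.5384693101056831, 0.9061798459386640)
GL_W = (0.2369268850561891, 0.4786286704993665, 0.5688888888888889, 0.4786286704993665, 0.2369268850561891)

def hex_mean(f):
    """Mean value of f(x, y) over the unit hexagon (rectangle plus two triangles)."""
    total = 0.0
    for x0, x1 in ((-1.0, -0.5), (-0.5, 0.5), (0.5, 1.0)):
        for tx, wx in zip(GL_X, GL_W):
            x = 0.5 * (x0 + x1) + 0.5 * (x1 - x0) * tx
            ymax = min(SQRT3 / 2, SQRT3 * (1.0 - abs(x)))
            for ty, wy in zip(GL_X, GL_W):
                total += wx * wy * 0.5 * (x1 - x0) * ymax * f(x, ymax * ty)
    return total * 2.0 / (3.0 * SQRT3)

def sigma(f):
    """Standard deviation of the aberration f(x, y) over the unit hexagon."""
    return math.sqrt(hex_mean(lambda x, y: f(x, y) ** 2) - hex_mean(f) ** 2)

def rho2(x, y):
    return x * x + y * y

# Seidel aberrations with unit coefficients, and their balanced forms
aberrations = {
    "defocus": lambda x, y: rho2(x, y),
    "astigmatism": lambda x, y: x * x,                                   # rho^2 cos^2(theta)
    "balanced astigmatism": lambda x, y: x * x - 0.5 * rho2(x, y),
    "coma": lambda x, y: rho2(x, y) * x,                                 # rho^3 cos(theta)
    "balanced coma": lambda x, y: (rho2(x, y) - 14.0 / 25.0) * x,
    "spherical": lambda x, y: rho2(x, y) ** 2,
    "balanced spherical": lambda x, y: rho2(x, y) ** 2 - (257.0 / 301.0) * rho2(x, y),
}

inverse_sigma = {name: 1.0 / sigma(f) for name, f in aberrations.items()}
```

Each entry of `inverse_sigma` is the denominator that appears in the corresponding sigma expression, for example about 4.762 for astigmatism and about 17.441 for balanced spherical aberration.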
The sigma values of the Seidel aberrations with and without balancing are given in
Table 7-8. The corresponding peak-to-valley (P-V) numbers for a sigma value of unity
are also given in the table.
7.8 SUMMARY
Closed-form expressions for the aberration-free PSF and OTF are given for a system with a hexagonal pupil. They are plotted along with the ensquared power, and compared with the corresponding quantities for a system with a corresponding circular pupil. The ensquared power and the OTF for a hexagonal pupil are shown to be lower than the corresponding values for a circular pupil. Generally, the quantitative differences between the corresponding functions for the two pupils are small, perhaps because the difference in the pupil area is only about 17%.
Table 7-8. Sigma value of a Seidel aberration with and without balancing, and P-V
numbers for a sigma value of unity, where Ai is the aberration coefficient.
Figure 7-11. Strehl ratio S as a function of the sigma value $\sigma_W$ (from 0 to 0.25) of a Seidel aberration with and without balancing. (a) defocus, (b) astigmatism, (c) coma, and (d) spherical aberration.
While the polynomials $H_{11}$ and $H_{22}$ representing balanced primary and secondary spherical aberrations are radially symmetric, the polynomial $H_{37}$ representing balanced tertiary spherical aberration is not, since it consists of an angle-dependent term in $Z_{28}$ or $\cos 6\theta$ also. If this term is not included in the polynomial $H_{37}$, the standard deviation of the aberration increases from a value of unity to 1.13339.
In practice, the polynomials in Cartesian coordinates given in Table 7-4 will be used
for the analysis of aberration data of a hexagonal wavefront. A somewhat different set of
hexagonal polynomials is obtained when the hexagon is rotated by 30 degrees. These
polynomials are given in Table 7-5.
The first 45 hexagonal polynomials, i.e., up to and including the 8th order, are illustrated by an isometric plot, an interferogram, and a PSF in Figure 7-9. The coefficient of each orthonormal polynomial, or the sigma value of the corresponding aberration, is one wave. Their corresponding P-V numbers for a sigma value of one wave are given in Table 7-6 in units of wavelength. The Strehl ratio for a sigma value of $0.1\lambda$ for each aberration is given in Table 7-7 and illustrated in Figure 7-10. It shows that, for a small aberration, the Strehl ratio can be estimated from the aberration variance. The sigma values of the Seidel aberrations and their balanced forms are given, along with their P-V numbers, in Table 7-8.
The diffraction focus for a system with a hexagonal pupil is shown to lie closer to the Gaussian image point in the case of coma, and closer to the Gaussian image plane in the case of spherical aberration, compared to their corresponding locations for a circular pupil. Figure 7-11 shows how the Strehl ratio varies with the sigma value of a Seidel aberration, with and without balancing. The approximate expression $\exp\left( -\sigma_\Phi^2 \right)$ overestimates its value in the case of defocus, but underestimates it for the other aberrations.
References
1. keckobservatory.org/
7. G. Chanan and M. Troy, “Strehl ratio and modulation transfer function for
segmented mirror telescopes as functions of segment phase error,” Appl. Opt. 38,
6642–6647 (1999).
Chapter 8
Systems with Elliptical Pupils
8.1 INTRODUCTION
The pupil of a human eye is slightly elliptical [1]. The pupil for off-axis imaging by a
system with an axial circular pupil may be vignetted, but can be approximated by an
ellipse [2]. When a flat mirror is tested by shining a circular beam on it at some angle
(other than normal incidence), the illuminated spot is elliptical. Similarly, the overlap
region of two circular wavefronts that are displaced from each other, as in lateral shearing
interferometry [3] or in the calculation of the optical transfer function of a system [4], can
also be approximated by an ellipse.
Starting with the pupil function of a system with an elliptical pupil, we scale the
coordinates of a point on the pupil and transform it to a circular pupil. The aberration-free
PSF and OTF are then obtained as for a system with a circular pupil. The corresponding
PSF and OTF obtained by unscaling the coordinates represent the results for the elliptical
pupil. Then we discuss the polynomials that are orthonormal over and represent balanced
classical aberrations for a unit elliptical pupil [5]. These polynomials cannot be obtained
by scaling the coordinates of the Zernike circle polynomials. The balancing of a Seidel
aberration over an elliptical pupil is discussed, and its standard deviation with and
without balancing is determined.
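\[
\frac{x_p^2}{a^2} + \frac{y_p^2}{b^2} \le 1 . \tag{8-1}
\]
\[
c = b/a \le 1 . \tag{8-2}
\]
For a uniformly illuminated pupil with an aberration function $\Phi\left( x_p, y_p \right)$ and power $P_{ex}$ exiting from it, the pupil function of the system can be written
\[
P\left( x_p, y_p \right) = A\left( x_p, y_p \right) \exp\left[ i\Phi\left( x_p, y_p \right) \right] , \tag{8-3}
\]
where
\[
A\left( x_p, y_p \right) = \left( P_{ex}/S_{ex} \right)^{1/2} \tag{8-4}
\]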
Figure 8-1. (a) Elliptical pupil with semimajor and semiminor axes a and b. (b) Elliptical pupil transformed into a circular pupil by scaling its $y_p$ coordinate.
8.3.1 PSF
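From Eq. (1-9), the aberrated irradiance distribution in the image plane of a system with a uniformly illuminated elliptical exit pupil, normalized by its aberration-free central value $P_{ex}S_{ex}/\lambda^2R^2$, can be written
\[
I\left( x_i, y_i \right) = \frac{1}{S_{ex}^2} \left| \iint \exp\left[ i\Phi\left( x_p, y_p \right) \right] \exp\left[ -\frac{2\pi i}{\lambda R}\left( x_i x_p + y_i y_p \right) \right] dx_p\, dy_p \right|^2 , \tag{8-5}
\]
where the integration is carried over the elliptical pupil. Using the scaled pupil coordinates $\left( x'_p, y'_p \right)$, where
\[
\left( x'_p, y'_p \right) = \left( x_p,\ y_p/c \right) , \tag{8-6}
\]
the elliptical pupil is transformed into a circular pupil of radius a (see Figure 8-1b). The image-plane coordinates are scaled correspondingly as
\[
\left( x'_i, y'_i \right) = \left( x_i,\ c y_i \right) , \tag{8-8}
\]
because of the Fourier transform relationship between the pupil function and the diffracted amplitude. In the scaled coordinates, Eq. (8-5) for the aberration-free case becomes
\[
I\left( x'_i, y'_i; c \right) = \frac{c^2}{S_{ex}^2} \left| \iint_{\text{circle}} \exp\left[ -\frac{2\pi i}{\lambda R}\left( x'_i x'_p + y'_i y'_p \right) \right] dx'_p\, dy'_p \right|^2 , \tag{8-9}
\]
where $S_{ex} = \pi a b = \pi c a^2$ is the area of the elliptical pupil. In polar coordinates $\left( \rho'_p, \theta'_p \right)$ and $\left( r'_i, \theta'_i \right)$ for the pupil and image points, we can write
\[
\left( x'_p, y'_p \right) = \rho'_p\left( \cos\theta'_p, \sin\theta'_p \right) = a\rho\left( \cos\theta'_p, \sin\theta'_p \right) \tag{8-10}
\]
and
\[
\left( x'_i, y'_i \right) = r'_i\left( \cos\theta'_i, \sin\theta'_i \right) , \tag{8-11}
\]
so that Eq. (8-9) becomes
\[
I\left( r, \theta'_i; c \right) = \frac{1}{\pi^2} \left| \int_0^1 \int_0^{2\pi} \exp\left[ -\pi i r \rho \cos\left( \theta'_i - \theta'_p \right) \right] \rho\, d\rho\, d\theta'_p \right|^2 , \tag{8-12}
\]
where
\[
r = \frac{r'_i}{\lambda R/2a} = \frac{r'_i}{\lambda F_x} , \tag{8-13}
\]
and
\[
F_x = R/2a \tag{8-14}
\]
is the focal ratio of the image-forming light cone along the $x_p$ axis.
For the aberration-free case, we let $\Phi\left( \rho, \theta'_p \right) = 0$ and perform the integration as for a circular pupil. Thus, we obtain
\[
I(r) = \left[ \frac{2J_1(\pi r)}{\pi r} \right]^2 , \tag{8-15}
\]
or
\[
I(x, y; c) = \left\{ \frac{2J_1\left[ \pi\left( x^2 + c^2y^2 \right)^{1/2} \right]}{\pi\left( x^2 + c^2y^2 \right)^{1/2}} \right\}^2 , \tag{8-16}
\]
where $(x, y)$ are image plane coordinates in units of $\lambda F_x$. The fractional power contained in an elliptical ring can be obtained in a similar manner from the corresponding equation for a circular pupil, namely, Eq. (4-11). Thus, the fractional power in an elliptical ring with semimajor and semiminor axes $x_c$ and $y_c$ with $y_c = cx_c$ is given by
\[
P\left( x_c, y_c; c \right) = 1 - J_0^2\left( \pi\sqrt{x_c^2 + c^2y_c^2} \right) - J_1^2\left( \pi\sqrt{x_c^2 + c^2y_c^2} \right) . \tag{8-17}
\]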
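Equations (8-16) and (8-17) are simple to evaluate once the Bessel functions are available. The sketch below is an illustration under stated assumptions: it uses truncated power series for $J_0$ and $J_1$, which are adequate only for moderate arguments, and the function names and the argument-based form of the fractional power are choices made for the example.

```python
import math

def bessel_j0(z, terms=40):
    """Power series of J0(z); adequate for moderate |z| (roughly |z| < 15)."""
    half = z / 2.0
    term = 1.0
    total = term
    for k in range(1, terms):
        term *= -half * half / (k * k)
        total += term
    return total

def bessel_j1(z, terms=40):
    """Power series of J1(z); adequate for moderate |z| (roughly |z| < 15)."""
    half = z / 2.0
    term = half                       # k = 0 term
    total = term
    for k in range(1, terms):
        term *= -half * half / (k * (k + 1))
        total += term
    return total

def ellipse_psf(x, y, c):
    """Eq. (8-16): aberration-free PSF for aspect ratio c; x, y in units of lambda*F_x."""
    arg = math.pi * math.sqrt(x * x + c * c * y * y)
    if arg == 0.0:
        return 1.0
    return (2.0 * bessel_j1(arg) / arg) ** 2

def fractional_power(arg):
    """Form of Eq. (8-17) as a function of its argument pi*sqrt(x_c^2 + c^2 y_c^2)."""
    return 1.0 - bessel_j0(arg) ** 2 - bessel_j1(arg) ** 2

FIRST_ZERO_J1 = 3.8317059702          # first positive zero of J1
x_zero = FIRST_ZERO_J1 / math.pi                   # first dark ring along x, about 1.22
y_zero = FIRST_ZERO_J1 / (0.85 * math.pi)          # along y for c = 0.85, about 1.44
```

For c = 0.85 the first zero along the y axis is larger than that along the x axis by the factor 1/c, and the fractional power inside the first dark ring is about 84%, as for the Airy pattern.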
The distribution given by Eq. (8-16) approaches the Airy pattern for a circular pupil as we let the aspect ratio $c \to 1$. We also note that the relative irradiance at a point $(x, y/c)$ is equal to the relative irradiance of the Airy pattern at a point $(x, y)$. However, the central irradiance for the elliptical pupil is equal to $c^2$ times the central value of the Airy pattern. This is due to the area of the elliptical pupil being equal to c times that of the circular pupil, and the power incident on and exiting from the elliptical pupil also being equal to c times that for the circular pupil.
Figure 8-2a shows the 2D PSF for c = 0.85. It is evident that the circular diffraction rings of a circular pupil have been replaced by the elliptical diffraction rings of an elliptical pupil. The dimension of a ring is larger in the direction of the smaller dimension of the pupil with an aspect ratio of $1/c$. Figure 8-2b shows the irradiance distribution along the x and y axes, and at 45° from the x axis. The first zero along the x axis occurs at 1.22 (in units of $\lambda F_x$), as in the Airy pattern, at 1.22/0.85 or about 1.44 along the y axis, and at about 1.32 at 45° from the x axis [see the curve $I(r) \equiv I(x = y)$].
Figure 8-2. (a) 2D aberration-free PSF for c = 0.85. (b) Irradiance distribution along the x and y axes, and at 45° from the x axis, where x, y, and r are in units of $\lambda F_x$.
8.3.2 OTF
The OTF of an aberration-free system at a spatial frequency $\vec{v}_i$ is given by [see Eq. (2-13)]
\[
\tau\left( \vec{v}_i \right) = P_{ex}^{-1} \int A\left( \vec{r}_p \right) A\left( \vec{r}_p - \lambda R \vec{v}_i \right) d\vec{r}_p . \tag{8-18}
\]
It represents the fractional area of overlap of two elliptical pupils centered at (0, 0) and $\lambda R(\xi, \eta)$, where $(\xi, \eta)$ are the Cartesian components of the spatial frequency vector $\vec{v}_i$. In the scaled coordinates $\left( x'_p, y'_p \right)$, as in Eq. (8-6), the elliptical pupil reduces to a circular pupil of radius a. The overlap area of two circular pupils, each of radius a, with their origins at (0, 0) and $\left( x'_0, y'_0 \right)$ is given by
\[
S\left( x'_0, y'_0; a \right) = 2a^2 \left\{ \cos^{-1}\left( \frac{r'_0}{2a} \right) - \left( \frac{r'_0}{2a} \right)\left[ 1 - \left( \frac{r'_0}{2a} \right)^2 \right]^{1/2} \right\} , \tag{8-19}
\]
where
\[
r'_0 = \left( x_0'^2 + y_0'^2 \right)^{1/2} . \tag{8-20}
\]
Letting
\[
\left( x'_0, y'_0 \right) = \lambda R\left( \xi,\ \eta/c \right) \tag{8-21}
\]
and noting that the overlap area is to be multiplied by c when writing it in the unscaled coordinates, the OTF of a system with an elliptical pupil can be written from Eq. (8-19) in the form
\[
\tau\left( v_x, v_y \right) = \frac{2}{\pi}\left[ \cos^{-1} v_e - v_e\left( 1 - v_e^2 \right)^{1/2} \right] , \tag{8-22}
\]
where
\[
v_e = \left( v_x^2 + \frac{v_y^2}{c^2} \right)^{1/2} \tag{8-23}
\]
and
\[
\left( v_x, v_y \right) = \left( \frac{\xi}{1/\lambda F_x},\ \frac{\eta}{1/\lambda F_x} \right) \tag{8-24}
\]
are the spatial frequency components normalized by the cutoff frequency $1/\lambda F_x$ along the x axis.
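Equations (8-22) and (8-23) translate directly into a few lines of code. The sketch below is illustrative only; the helper `cutoff`, which returns the frequency at which $v_e$ reaches unity along a given direction, is an addition of this example.

```python
import math

def ellipse_otf(vx, vy, c):
    """Eq. (8-22): aberration-free OTF of an elliptical pupil of aspect ratio c.
    vx and vy are in units of the cutoff frequency 1/(lambda*F_x) along the x axis."""
    ve = math.hypot(vx, vy / c)                    # Eq. (8-23)
    if ve >= 1.0:
        return 0.0
    return (2.0 / math.pi) * (math.acos(ve) - ve * math.sqrt(1.0 - ve * ve))

def cutoff(theta, c):
    """Spatial frequency along the direction theta at which v_e = 1."""
    return 1.0 / math.hypot(math.cos(theta), math.sin(theta) / c)

c = 0.85
cutoffs = [cutoff(0.0, c), cutoff(math.pi / 2, c), cutoff(math.pi / 4, c)]
```

For c = 0.85 the three cutoffs come out close to 1, 0.85, and 0.916 for the x axis, the y axis, and the 45° direction, respectively.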
Figure 8-3 shows the OTF for c = 0.85 along the x and y axes, and at 45° from the x axis as $\tau(v_x)$, $\tau(v_y)$, and $\tau(v_x = v_y) \equiv \tau(v_e)$ with the corresponding cutoff spatial frequencies of 1, 0.85, and 0.916, respectively, each in units of $1/\lambda F_x$. It should be evident that $\tau(v_x)$ is obtained from Eq. (8-22) by letting $v_y = 0$. Similarly, $\tau(v_y)$ is obtained by letting $v_x = 0$. Moreover, the OTF along the x axis is the same as for a corresponding circular pupil.
Figure 8-3. OTF of a system with an elliptical pupil with aspect ratio c = 0.85, along the x and y axes, and at 45° from the x axis, where $v_x$, $v_y$, and $v_e$ are all in units of $1/\lambda F_x$.
Figure 8-4 shows a unit ellipse of an aspect ratio c inscribed inside a unit circle. Thus the semimajor and semiminor axes a and b of the ellipse have been normalized by a so that the farthest point(s) on the ellipse lie at a distance of unity. The unit ellipse is represented by an equation
\[
x^2 + y^2/c^2 = 1 , \tag{8-25}
\]
or
\[
y = \pm c\sqrt{1 - x^2} . \tag{8-26}
\]
Figure 8-4. Unit ellipse of aspect ratio c inscribed inside a unit circle with its semimajor axis of unity along the x axis. Its extreme points are $A(1, 0)$, $B(0, -c)$, $C(-1, 0)$, and $D(0, c)$.
\[
E_{j+1} = N_{j+1} \left[ Z_{j+1} - \sum_{k=1}^{j} \left\langle Z_{j+1} E_k \right\rangle E_k \right] , \tag{8-27}
\]
where $N_{j+1}$ is a normalization constant so that the polynomials are orthonormal over the unit ellipse, i.e., they satisfy the orthonormality condition
\[
\frac{1}{\pi c} \int_{-1}^{1} dx \int_{-c\sqrt{1 - x^2}}^{c\sqrt{1 - x^2}} E_j E_{j'}\, dy = \delta_{jj'} . \tag{8-28}
\]
The angular brackets indicate a mean value over the elliptical pupil. Thus, for example,
\[
\left\langle Z_j E_k \right\rangle = \frac{1}{\pi c} \int_{-1}^{1} dx \int_{-c\sqrt{1 - x^2}}^{c\sqrt{1 - x^2}} Z_j E_k\, dy . \tag{8-29}
\]
It should be evident that, because of the symmetric limits of integration, a mean value is
zero if the integrand is an odd function of $x$ and/or $y$. If the integrand is an even function,
then we may replace the lower limits of integration by zero and multiply the double
integral by 4.
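The recursion of Eq. (8-27) can be sketched numerically. The following is a minimal illustration, not the author's procedure: it assumes polynomials are stored as dictionaries mapping exponent pairs to coefficients, and it evaluates the mean values of Eq. (8-29) from closed-form monomial moments over the unit ellipse (obtained by the scaling $y = c y'$, which maps the ellipse to a unit disc). The names `ellipse_moment`, `inner`, `axpy`, and `gram_schmidt` are hypothetical helpers introduced only for this sketch.

```python
import math

def ellipse_moment(px, py, c):
    """Mean of x**px * y**py over the unit ellipse x^2 + y^2/c^2 <= 1."""
    if px % 2 or py % 2:
        return 0.0  # odd integrands average to zero by symmetry
    m, n = px // 2, py // 2
    # Disc moment <x^(2m) y^(2n)> = Gamma(m+1/2) Gamma(n+1/2) / (pi Gamma(m+n+2));
    # the substitution y = c*y' contributes the factor c**py.
    disc = math.gamma(m + 0.5) * math.gamma(n + 0.5) / (math.pi * math.gamma(m + n + 2))
    return c**py * disc

def inner(p, q, c):
    """Mean value <p q> over the ellipse; p and q are {(px, py): coeff} dicts."""
    return sum(a * b * ellipse_moment(i + k, j + l, c)
               for (i, j), a in p.items() for (k, l), b in q.items())

def axpy(alpha, p, q):
    """Return the polynomial alpha*p + q as a new dict."""
    out = dict(q)
    for key, a in p.items():
        out[key] = out.get(key, 0.0) + alpha * a
    return out

def gram_schmidt(basis, c):
    """Orthonormalize the given polynomials in order, following Eq. (8-27)."""
    ortho = []
    for z in basis:
        e = dict(z)
        for ek in ortho:
            e = axpy(-inner(z, ek, c), ek, e)   # subtract <Z E_k> E_k
        norm = math.sqrt(inner(e, e, c))        # 1/N is the norm of the remainder
        ortho.append({key: v / norm for key, v in e.items()})
    return ortho

s3 = math.sqrt(3.0)
circle = [
    {(0, 0): 1.0},                                  # Z1, piston
    {(1, 0): 2.0},                                  # Z2, x tilt
    {(0, 1): 2.0},                                  # Z3, y tilt
    {(0, 0): -s3, (2, 0): 2 * s3, (0, 2): 2 * s3},  # Z4, defocus
]
E = gram_schmidt(circle, c=0.85)
print(E[3])  # coefficients of the constant, x^2, and y^2 terms of E4
```

For $c = 0.85$ the constant and $\rho^2$ coefficients of the resulting $E_4$ can be compared with the closed form $\sqrt{3}\,(4\rho^2 - 1 - c^2)/(3 - 2c^2 + 3c^4)^{1/2}$.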
The orthonormal elliptical polynomials up to the fourth order are given in Tables 8-1
through 8-3 in three different but equivalent forms, as in the case of hexagonal
polynomials. The expressions for higher-order elliptical polynomials are very long unless
the aspect ratio c is specified. As in the case of a hexagonal pupil, each elliptical
polynomial consists of either cosine or sine terms, but not both. For example, $E_6$ is a
linear combination of $Z_6$, $Z_4$, and $Z_1$. It also shows that the balancing defocus for (zero-
degree) Seidel astigmatism is different for an elliptical pupil compared to that for a
circular, annular, or a Gaussian pupil, as may be seen from Table 4-2, 5-2, or 6-2,
respectively. Moreover, $E_{11}$ is a linear combination of $Z_{11}$, $Z_6$, $Z_4$, and $Z_1$. Thus,
spherical aberration $\rho^4$ is balanced with not only defocus $\rho^2$ but astigmatism $\rho^2\cos^2\theta$
as well. The elliptical polynomials are generally more complex in that they are made up
of a larger number of circle polynomials. These results are a consequence of the fact that
the $x$ and $y$ dimensions of the elliptical pupil are not equal. As expected, the elliptical
polynomials reduce to the circle polynomials as $c \to 1$, i.e., as the unit ellipse approaches
a unit circle.
where a j are the expansion coefficients. Multiplying both sides of Eq. (8-30) by
8.5 Elliptical Coefficients of an Elliptical Aberration Function 211
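$E_2 = Z_2$
$E_3 = Z_3/c$
$E_4 = (1/\sqrt{3 - 2c^2 + 3c^4})\,[\sqrt{3}\,(1 - c^2) Z_1 + 2 Z_4]$
$E_5 = Z_5/c$
$E_6 = [1/(2\sqrt{2}\,c^2\sqrt{3 - 2c^2 + 3c^4})]\,[-\sqrt{3}\,(3 - 4c^2 + c^4) Z_1 - 3(1 - c^4) Z_4 + \sqrt{2}\,(3 - 2c^2 + 3c^4) Z_6]$
$E_7 = [1/(c\sqrt{5 - 6c^2 + 9c^4})]\,[6(1 - c^2) Z_3 + 2\sqrt{2}\,Z_7]$
$E_8 = (2/\sqrt{9 - 6c^2 + 5c^4})\,[(1 - c^2) Z_2 + \sqrt{2}\,Z_8]$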
2 4
E9 [1/(2 2 c3 5 6c 9c )][ 2 2 (5 8c2 + 3c4)Z3 (5 2c2 3c4)Z7 + (5 6c2 + 9c4)Z9]
2 4
E10 [1/(2 2 c3 9 6c 5c )][ 2 2 (3 4c2 + c4)Z2 (3 + 2c2 5c4)Z8 + (9 6c2 + 5c4)Z10]
E12 5 / 8 c 2(195 475c2 + 558c4 422c6 + 159c8 15c10)ȕ 1Z1 15 / 8 c 2(105 205c2
+ 194c4 114c6 + 5c8 + 15c10)ȕ 1Z4 + (1/2) 15 c 2 (75 155c2 + 174c4 134c6 + 55c8 15c10) ȕ 1Z6
6
10 2 c 2(3 2c2 +2c 3c8)ȕ 1Z11 + c 2ĮȖ 1Z12
$E_{13} = [1/(c\sqrt{5 - 6c^2 + 5c^4})]\,[\sqrt{15}\,(1 - c^2) Z_5 + 2 Z_{13}]$
( 15 /8)c 4 (35 70c2 + 56c4 26c6 + 5c8)Ȗ 1Z6 + (5/8 2 ) (1 c2)2c 4(7 + 10c2 + 7c4)Ȗ 1Z11
E15 ( 15 /4)c 3(5 8c2 + 3c4)į 1Z5 (5/4)(1 c4)c 3 į 1Z13 + (į/2c3) Z15
_______________________________________________________________________
Į (45 60c2 + 94c4 60c6 + 45c8)1/2
ȕ (1575 4800c2 + 12020c4 17280c6 + 21066c8 17280c10 + 12020c12 4800c14 + 1575c16)1/2
Ȗ (35 60c2 + 114c4 60c6 + 35c8)1/2
į (5 6c2 +5c4)1/2
ĮȖ ȕ
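$E_1 = 1$
$E_2 = 2\rho\cos\theta$
$E_3 = (2\rho\sin\theta)/c$
$E_4 = (\sqrt{3}/\sqrt{3 - 2c^2 + 3c^4})\,(-1 - c^2 + 4\rho^2)$
$E_5 = (\sqrt{6}/c)\,\rho^2\sin 2\theta$
$E_7 = [4/(c\sqrt{5 - 6c^2 + 9c^4})]\,[-(1 + 3c^2)\rho + 6\rho^3]\sin\theta$
$E_8 = (4/\sqrt{9 - 6c^2 + 5c^4})\,[-(3 + c^2)\rho + 6\rho^3]\cos\theta$
$E_{11} = (\sqrt{5}/\alpha)\,[3 + 2c^2 + 3c^4 - 24(1 + c^2)\rho^2 + 48\rho^4 - 12(1 - c^2)\rho^2\cos 2\theta]$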
E12 [ 10 Į/(Ȗc2)]( 3ȡ2 + 4ȡ4) cos2ș + [ 5 2 /(2c2ȕ)][ 12c2(5 2c2 + 2c6 5c8)
+ 4[6c2(5 7c2 + 7c4 5c6) 5(7 6c2 +6c6 7c8)ȡ2]ȡ2 cos2ș + (35 60c2
E15 ( 10 /c3)į 1{[6c2(1 c2) 5(1 c4)ȡ2]ȡ2 sin2ș + [(5 6c2 +5c4)/2]ȡ4 sin4ș}
$E_2 = 2x$
$E_3 = 2y/c$
$E_4 = (\sqrt{3}/\sqrt{3 - 2c^2 + 3c^4})\,(-1 - c^2 + 4\rho^2)$
$E_5 = (2\sqrt{6}/c)\,xy$
$E_6 = [\sqrt{6}/(c^2\sqrt{3 - 2c^2 + 3c^4})]\,[c^2(1 - c^2) + c^2(3c^2 - 1)x^2 - (3 - c^2)y^2]$
$E_7 = [4/(c\sqrt{5 - 6c^2 + 9c^4})]\,[-(1 + 3c^2) + 6\rho^2]\,y$
$E_8 = (4/\sqrt{9 - 6c^2 + 5c^4})\,[-(3 + c^2) + 6\rho^2]\,x$
$E_9 = [4/(c^3\sqrt{5 - 6c^2 + 9c^4})]\,[3c^2(3c^2 - 1)x^2 - (5 - 3c^2)y^2 + 3c^2(1 - c^2)]\,y$
$E_{10} = [4/(c^2\sqrt{9 - 6c^2 + 5c^4})]\,[c^2(5c^2 - 3)x^2 - 3(3 - c^2)y^2 + 3c^2(1 - c^2)]\,x$
2
+ 3c8)ȡ4 í 60(í 9 + 3c2 +2c4 í 6c6 +7c8 +3c10)x í 24(15 í 70c2 + 92c4 í 82c6
E14 = ( 10 /c4Ȗ)[c4(3 í 30c2 + 35c4)x4 +6c2(5 í 18c2 + 5c4)x2y2 + (35 í 30c2 +3c4)y4
$E_j(x, y)$, integrating over the unit ellipse, and using the orthonormality Eq. (8-28), we
obtain the elliptical expansion coefficients:
$$a_j = \frac{1}{\pi c}\int_{-1}^{1} dx \int_{-c\sqrt{1-x^2}}^{c\sqrt{1-x^2}} W(x, y)\, E_j(x, y)\, dy . \tag{8-31}$$
As stated in Section 3.2, it is evident from Eq. (8-31) that the value of an elliptical
coefficient is independent of the number J of polynomials used in the expansion of the
aberration function. Hence, one or more terms can be added to or subtracted from the
aberration function without affecting the value of the coefficients of the other
polynomials in the expansion.
The mean and mean square values of the aberration function are given by
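$$\langle W(\rho, \theta) \rangle = a_1 , \tag{8-32}$$
and
$$\langle W^2(\rho, \theta) \rangle = \sum_{j=1}^{J} a_j^2 , \tag{8-33}$$
$$\sigma_W^2 = \langle W^2(\rho, \theta) \rangle - \langle W(\rho, \theta) \rangle^2 = \sum_{j=2}^{J} a_j^2 . \tag{8-34}$$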
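As a small worked check of Eqs. (8-31) through (8-34), the sketch below projects the defocus aberration $W = A\rho^2$ onto $E_1$ and $E_4$ and compares the variance with $a_4^2$. It assumes the closed form $E_4 = \sqrt{3}\,(4\rho^2 - 1 - c^2)/(3 - 2c^2 + 3c^4)^{1/2}$ and uses closed-form even moments over the unit ellipse; `mean_even` and `defocus_coefficients` are hypothetical names chosen for this illustration.

```python
import math

def mean_even(m, n, c):
    """Mean of x**(2m) * y**(2n) over the unit ellipse x^2 + y^2/c^2 <= 1."""
    return (c ** (2 * n) * math.gamma(m + 0.5) * math.gamma(n + 0.5)
            / (math.pi * math.gamma(m + n + 2)))

def defocus_coefficients(A, c):
    """Return (a1, a4, variance) for W = A * rho^2 over the unit ellipse."""
    eta = math.sqrt(3 - 2 * c**2 + 3 * c**4)
    rho2 = mean_even(1, 0, c) + mean_even(0, 1, c)                          # <rho^2>
    rho4 = mean_even(2, 0, c) + 2 * mean_even(1, 1, c) + mean_even(0, 2, c)  # <rho^4>
    a1 = A * rho2                                                   # <W E1>, E1 = 1
    a4 = A * (math.sqrt(3) / eta) * (4 * rho4 - (1 + c**2) * rho2)  # <W E4>
    variance = A**2 * (rho4 - rho2**2)                              # <W^2> - <W>^2
    return a1, a4, variance

a1, a4, variance = defocus_coefficients(A=1.0, c=0.85)
print(a1, a4, variance, a4**2)
```

Since $\rho^2$ is a linear combination of $E_4$ and $E_1$ only, the variance printed here should agree with $a_4^2$, in line with Eq. (8-34).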
The PSF plots, representing the images of a point object in the presence of a
polynomial aberration and obtained by applying Eq. (8-5), are shown in Figure 8-5. The
full width of a square displaying the PSFs is $24\lambda F_x$. Since the piston aberration $E_1$ has
no effect on the PSF, it yields an aberration-free PSF.
8.6 Isometric, Interferometric, and Imaging Characteristics of Elliptical Polynomial Aberrations 215
E2 Z2
E3 1.1765Z3
E4 0.2721Z1 + 1.1321Z4
E5 1.17645Z5
E7 0.8458Z3 + 1.4369Z7
E8 0.2058Z2 + 1.0486Z8
E 13 0.6987Z5 + 1.3002Z13
Table 8-5. Elliptical polynomials in polar coordinates for an elliptical pupil with an
aspect ratio c = 0.85.
$E_1 = 1$
$E_2 = 2\rho\cos\theta$
$E_3 = 2.3529\rho\sin\theta$
$E_4 = -1.6888 + 3.9217\rho^2$
$E_5 = 2.8818\rho^2\sin 2\theta$
$E_6 = 0.3848 - 1.3760\rho^2 + 2.9947\rho^2\cos 2\theta$
$E_8 = (-5.5205\rho + 8.8980\rho^3)\cos\theta$
E20 (0.5810 U 4.0436 U3 + 5.4933 U5)cosș + (6.9151U 3 12.5589 U5)cos3ș + 5.7134 U5cos5ș
E28 ( 16.9428 U + 157.9560 U3 395.9030 U5 + 291.6410 U7)sin ș + (9.5563U3 17.3422 U5)sin3ș
E29 ( 15.0992 U + 120.4040 U3 257.3300 U5 + 161.3000U7)cosș + (5.7290 U3 9.5919 U5)cos3ș
Table 8-6. Elliptical polynomials in Cartesian coordinates for an elliptical pupil with
an aspect ratio c = 0.85.
E1 1
E2 2x
E3 2.3529y
E5 5.7635xy
From Eq. (8-5), the Strehl ratio, i.e., the central value of a PSF relative to its
aberration-free value, can be written:
$$S(c) \equiv I(0, 0; c) = \left| \frac{1}{\pi c}\int_{-1}^{1} dx \int_{-c\sqrt{1-x^2}}^{c\sqrt{1-x^2}} \exp[i\Phi(x, y)]\, dy \right|^2 , \tag{8-35}$$
where $(x, y)$ are the pupil coordinates normalized by the pupil dimension $a$ along the $x_p$
axis, as used in the polynomials given in Table 8-3.
The Strehl ratio for elliptical polynomial aberrations with a sigma value of 0.1 wave
is listed in Table 8-8 and plotted in Figure 8-6. Because of the small value of the
aberration, the Strehl ratio is approximately the same for each polynomial. Both the table
and the figure illustrate that the Strehl ratio for a small aberration is independent of the
type of aberration. It is approximately given by $\exp(-\sigma_\Phi^2)$, or 0.67, where $\sigma_\Phi = 0.2\pi$.
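The statement above can be checked with a rough numerical sketch of Eq. (8-35). The code below is only an illustration under stated assumptions: it takes the closed form of $E_4$ for an elliptical pupil, scales it to a sigma of 0.1 wave, and averages $\exp(i\Phi)$ over the ellipse with a simple midpoint rule in the mapped polar coordinates $x = r\cos t$, $y = c\,r\sin t$. The function names `strehl_ratio` and `e4` are hypothetical.

```python
import cmath
import math

def strehl_ratio(phase, c, nr=200, nt=64):
    """|<exp(i*Phi)>|^2 over the unit ellipse x^2 + y^2/c^2 <= 1.

    With x = r cos(t), y = c r sin(t) the area element is c r dr dt, so a
    midpoint rule in r (weight r) and a uniform sum in t give the pupil mean.
    """
    total = 0j
    weight = 0.0
    for i in range(nr):
        r = (i + 0.5) / nr
        for k in range(nt):
            t = 2 * math.pi * (k + 0.5) / nt
            x, y = r * math.cos(t), c * r * math.sin(t)
            total += r * cmath.exp(1j * phase(x, y))
            weight += r
    return abs(total / weight) ** 2

def e4(x, y, c):
    """Orthonormal elliptical defocus polynomial E4 (assumed closed form)."""
    eta = math.sqrt(3 - 2 * c**2 + 3 * c**4)
    return math.sqrt(3) / eta * (4 * (x * x + y * y) - 1 - c**2)

c = 0.85
sigma_waves = 0.1                     # sigma of the aberration in waves
S = strehl_ratio(lambda x, y: 2 * math.pi * sigma_waves * e4(x, y, c), c)
approx = math.exp(-(2 * math.pi * sigma_waves) ** 2)   # exp(-sigma_Phi^2)
print(S, approx)
```

The two printed numbers are expected to be close but not identical, since $\exp(-\sigma_\Phi^2)$ is an approximation that ignores the detailed form of the aberration.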
[Figure panels: $E_1$ through $E_9$, followed by the higher-order polynomials on subsequent pages.]
Figure 8-5. Elliptical polynomials for an elliptical pupil with an aspect ratio $c = 0.85$
shown as isometric plot on the top, interferogram on the left, and PSF on the right
for a sigma value of one wave.
Table 8-7. Peak-to-valley (P-V) numbers (in units of wavelength) of orthonormal
elliptical polynomial aberrations with an aspect ratio $c = 0.85$ and a sigma value of
one wave.
Table 8-8. Strehl ratio S for elliptical polynomial aberrations with an aspect ratio
c = 0.85 and a sigma value of 0.1 wave.
Figure 8-6. Strehl ratio for an elliptical polynomial aberration with an aspect ratio c
= 0.85 and a sigma value of 0.1 wave.
8.7.1 Defocus
We start with the defocus aberration
$$W_d(\rho) = A_d\,\rho^2 . \tag{8-36}$$
From the form of the orthonormal defocus polynomial $E_4$ given in Table 8-2, it is
evident that its sigma value across an elliptical pupil is given by
$$\sigma_d = \frac{A_d\,\eta}{4\sqrt{3}} , \tag{8-37}$$
where
$$\eta = \left(3 - 2c^2 + 3c^4\right)^{1/2} . \tag{8-38}$$
8.7.2 Astigmatism
Next consider 0° Seidel astigmatism given by
$$W_a(\rho, \theta) = A_a\,\rho^2\cos^2\theta . \tag{8-39}$$
$$E_6 = \frac{\sqrt{6}}{2c^2\eta}\left[\eta^2\rho^2\cos 2\theta - 3\left(1 - c^4\right)\rho^2 + 2c^2\left(1 - c^2\right)\right] \tag{8-40a}$$
$$\phantom{E_6} = \frac{\eta\sqrt{6}}{c^2}\left(\rho^2\cos^2\theta - \frac{3 - c^2}{\eta^2}\,\rho^2\right) + \text{constant} . \tag{8-40b}$$
$$W_{ba}(\rho, \theta) = A_a\left(\rho^2\cos^2\theta - \frac{3 - c^2}{\eta^2}\,\rho^2\right) . \tag{8-41}$$
$$\sigma_{ba} = \frac{c^2}{\eta\sqrt{6}}\,A_a . \tag{8-42}$$
To determine the sigma of Seidel astigmatism, we write the aberration in terms of the
elliptical polynomials. Thus,
$$W_a(\rho, \theta) = A_a\,\rho^2\cos^2\theta = A_a\left(\frac{c^2}{\eta\sqrt{6}}\,E_6 + \frac{3 - c^2}{4\eta\sqrt{3}}\,E_4\right) + \text{constant} . \tag{8-43}$$
$$\sigma_a = A_a/4 . \tag{8-44}$$
Its value is independent of the aspect ratio $c$ of the elliptical pupil, and thus equal to that
for a circular pupil. Since Seidel astigmatism $x^2$ varies only along the $x$ axis, for which
the unit ellipse has the same length as a unit circle, the sigma is independent of $c$.
8.7.3 Coma
Now we consider Seidel coma:
$$W_c(\rho, \theta) = A_c\,\rho^3\cos\theta . \tag{8-45}$$
$$E_8 = \frac{4}{\left(9 - 6c^2 + 5c^4\right)^{1/2}}\left[6\rho^3\cos\theta - \left(3 + c^2\right)\rho\cos\theta\right] . \tag{8-46}$$
It shows that the relative amount of tilt $\rho\cos\theta$ that optimally balances Seidel coma
$\rho^3\cos\theta$ is $-(3 + c^2)/6$ compared to $-2/3$ for a circular pupil. The balanced coma is
given by
$$W_{bc}(\rho, \theta) = A_c\left(\rho^3\cos\theta - \frac{3 + c^2}{6}\,\rho\cos\theta\right) . \tag{8-47}$$
$$\sigma_{bc} = \frac{\left(9 - 6c^2 + 5c^4\right)^{1/2}}{24}\,A_c . \tag{8-48}$$
To obtain the sigma value of Seidel coma, we write Eq. (8-45) in the form
$$W_c(\rho, \theta) = A_c\left[\frac{\left(9 - 6c^2 + 5c^4\right)^{1/2}}{24}\,E_8 + \frac{3 + c^2}{12}\,E_2\right] . \tag{8-49}$$
$$\sigma_c = \frac{1}{8}\left(5 + 2c^2 + c^4\right)^{1/2} A_c . \tag{8-50}$$
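$$E_{11} = \left(\sqrt{5}/\alpha\right)\left[48\rho^4 - 12\left(1 - c^2\right)\rho^2\cos 2\theta - 24\left(1 + c^2\right)\rho^2\right] + \text{constant} \tag{8-52a}$$
$$\phantom{E_{11}} = \left(\sqrt{5}/\alpha\right)\left[48\rho^4 - 24\left(1 - c^2\right)\rho^2\cos^2\theta - 12\left(1 + 3c^2\right)\rho^2\right] + \text{constant} . \tag{8-52b}$$
$$W_{bs}(\rho) = A_s\left[\rho^4 - \frac{1}{4}\left(1 - c^2\right)\rho^2\cos 2\theta - \frac{1}{2}\left(1 + c^2\right)\rho^2\right] \tag{8-53a}$$
$$\phantom{W_{bs}(\rho)} = A_s\left[\rho^4 - \frac{1}{2}\left(1 - c^2\right)\rho^2\cos^2\theta - \frac{1}{4}\left(1 + 3c^2\right)\rho^2\right] . \tag{8-53b}$$
It shows that spherical aberration is balanced not only by defocus but astigmatism as
well. Its sigma value is given by
$$\sigma_{bs} = \frac{\alpha}{48\sqrt{5}}\,A_s . \tag{8-54}$$
To obtain the sigma value of Seidel spherical aberration, we write Eq. (8-51) in the form
$$W_s(\rho) = A_s\left\{\frac{\alpha}{48\sqrt{5}}\,E_{11} + \frac{c^2\left(1 - c^2\right)}{2\eta\sqrt{6}}\,E_6 + \frac{1}{8\sqrt{3}}\left[\frac{3\left(1 - c^2\right)\left(1 - c^4\right)}{2\eta} + \eta\left(1 + c^2\right)\right]E_4\right\} + \text{constant} . \tag{8-55}$$
$$\sigma_s = \frac{\left(225 + 60c^2 - 58c^4 + 60c^6 + 225c^8\right)^{1/2}}{24\sqrt{10}}\,A_s . \tag{8-56}$$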
The sigma values of Seidel aberrations with and without balancing are given in Table
8-9. They reduce to the corresponding values for a circular pupil given in Table 4-3 as
$c \to 1$. The variation of sigma for a primary aberration with the aspect ratio $c$ is shown in
Figure 8-7. While $\sigma_a$ for astigmatism is constant, the sigma increases monotonically in the
case of coma $\sigma_c$ and spherical aberration $\sigma_s$. For defocus, its value $\sigma_d$ has a minimum for
$c = 1/\sqrt{3}$. The variation of sigma of a balanced primary aberration as a function of $c$ is
shown in Figure 8-8. While its variation for balanced coma $\sigma_{bc}$ and balanced spherical
aberration $\sigma_{bs}$ is small, sigma of balanced astigmatism $\sigma_{ba}$ increases monotonically.
Aberration                       Sigma
Defocus                          $\sigma_d = (A_d/4)\,[(3 - 2c^2 + 3c^4)/3]^{1/2}$
Astigmatism                      $\sigma_a = A_a/4$
Balanced astigmatism             $\sigma_{ba} = A_a c^2/[6(3 - 2c^2 + 3c^4)]^{1/2}$
Coma                             $\sigma_c = A_c\,(5 + 2c^2 + c^4)^{1/2}/8$
Balanced coma                    $\sigma_{bc} = A_c\,(9 - 6c^2 + 5c^4)^{1/2}/24$
Spherical aberration             $\sigma_s = A_s\,(225 + 60c^2 - 58c^4 + 60c^6 + 225c^8)^{1/2}/(24\sqrt{10})$
Balanced spherical aberration    $\sigma_{bs} = A_s\,(45 - 60c^2 + 94c^4 - 60c^6 + 45c^8)^{1/2}/(48\sqrt{5})$
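The closed-form sigmas listed above can be spot-checked by brute-force integration over the ellipse. The sketch below is illustrative only: it assumes unit aberration coefficients, uses the balanced forms $\rho^2\cos^2\theta - [(3 - c^2)/\eta^2]\rho^2$ and $\rho^3\cos\theta - [(3 + c^2)/6]\rho\cos\theta$, and the helper names `ellipse_sigma`, `closed_form_sigmas`, and `seidel_aberrations` are hypothetical.

```python
import math

def ellipse_sigma(W, c, nr=400, nt=64):
    """Standard deviation of W(x, y) over the unit ellipse x^2 + y^2/c^2 <= 1."""
    s0 = s1 = s2 = 0.0
    for i in range(nr):
        r = (i + 0.5) / nr                       # midpoint rule in r, weight r
        for k in range(nt):
            t = 2 * math.pi * (k + 0.5) / nt
            w = W(r * math.cos(t), c * r * math.sin(t))
            s0 += r
            s1 += r * w
            s2 += r * w * w
    mean, mean_sq = s1 / s0, s2 / s0
    return math.sqrt(max(mean_sq - mean * mean, 0.0))

def closed_form_sigmas(c):
    """Sigmas for unit aberration coefficients, as listed in the table above."""
    eta2 = 3 - 2 * c**2 + 3 * c**4
    return {
        "defocus": math.sqrt(eta2 / 3) / 4,
        "astigmatism": 0.25,
        "balanced astigmatism": c**2 / math.sqrt(6 * eta2),
        "coma": math.sqrt(5 + 2 * c**2 + c**4) / 8,
        "balanced coma": math.sqrt(9 - 6 * c**2 + 5 * c**4) / 24,
        "spherical": math.sqrt(225 + 60 * c**2 - 58 * c**4 + 60 * c**6
                               + 225 * c**8) / (24 * math.sqrt(10)),
    }

def seidel_aberrations(c):
    """Seidel aberrations (unit coefficient) as functions of Cartesian x, y."""
    eta2 = 3 - 2 * c**2 + 3 * c**4
    rho2 = lambda x, y: x * x + y * y
    return {
        "defocus": rho2,
        "astigmatism": lambda x, y: x * x,
        "balanced astigmatism": lambda x, y: x * x - (3 - c**2) / eta2 * rho2(x, y),
        "coma": lambda x, y: x * rho2(x, y),
        "balanced coma": lambda x, y: x * rho2(x, y) - (3 + c**2) / 6 * x,
        "spherical": lambda x, y: rho2(x, y) ** 2,
    }

c = 0.85
for name, W in seidel_aberrations(c).items():
    print(name, ellipse_sigma(W, c), closed_form_sigmas(c)[name])
```

Each printed pair should agree to within the accuracy of the quadrature; the balanced spherical aberration is omitted here because it would require the explicit constant term of $E_{11}$.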
8.8 SUMMARY
The PSF and OTF of a system with an elliptical pupil are obtained from the
corresponding PSF and OTF of a system with a circular pupil discussed in Chapter 4 by
scaling the coordinates of the elliptical pupil and transforming it into a circular pupil. It is
explained that the orthogonal aberration polynomials for an elliptical pupil representing
balanced classical aberrations for such a pupil cannot be obtained in the same manner.
These polynomials orthonormal over a unit elliptical pupil are obtained by
orthonormalizing the circle polynomials by the Gram–Schmidt orthonormalization
process. They are given through the fourth order in Tables 8-1 through 8-3 in terms of the
circle polynomials, in the polar coordinates, and in the Cartesian coordinates,
respectively. Table 8-2 shows that each polynomial consists of either the cosine or the
sine terms, but not both. Thus, an even j polynomial, for example, consists of only the
cosine terms. This is a consequence of the biaxial symmetry of the pupil. Since the
polynomials are not separable in the polar coordinates $\rho$ and $\theta$ of a pupil point,
polynomial numbering with two indices n and m loses significance. Hence, they must be
numbered with a single index j. Their ordering is the same as for the polynomials
discussed in previous chapters.
Only the first 15 elliptical polynomials are given for an arbitrary aspect ratio $c$ of the
pupil in Tables 8-1 through 8-3. The expressions for the higher-order elliptical
polynomials are very long unless $c$ is specified. The polynomial $E_6$ for astigmatism is a
linear combination of $Z_6$, $Z_4$, and $Z_1$. It shows that the balancing defocus for (zero-
degree) Seidel astigmatism is different for an elliptical pupil compared to that for a
circular, annular, or a Gaussian pupil. Moreover, $E_{11}$ is a linear combination of $Z_{11}$, $Z_6$,
$Z_4$, and $Z_1$. Thus, spherical aberration $\rho^4$ is balanced with not only defocus $\rho^2$ but
astigmatism $\rho^2\cos^2\theta$ as well. It is evidently not radially symmetric. As expected, the
elliptical polynomials reduce to the circle polynomials as $c \to 1$, i.e., as the unit ellipse
approaches a unit circle.
The elliptical polynomials up to the eighth order for an elliptical pupil with an aspect
ratio of c = 0.85 are given in Tables 8-4 to 8-6 in terms of the Zernike circle polynomials,
in polar coordinates, and in Cartesian coordinates, respectively. They are illustrated in
three different but equivalent ways in Figure 8-5 with the isometric plot, interferogram,
and the PSF for a sigma value of one wave. The peak-to-valley aberration numbers (in
units of wavelength) are given in Table 8-7. The Strehl ratio for a sigma value of 0.1
wave is given in Table 8-8 and plotted in Figure 8-6. The Seidel aberrations are discussed
in Section 8.7 and their sigma values with and without balancing are given in Table 8-9.
References
1. H. J. Wyatt, “The form of the human pupil,” Vision Res. 35, 2021–2036 (1995).
Chapter 9
Systems with Rectangular Pupils
9.1 INTRODUCTION
High-power laser beams have a rectangular cross-section; hence there is a need to
discuss the diffraction characteristics of a rectangular pupil. We start this chapter with a
brief discussion of the PSF and OTF of a system with such a pupil.
Although high-power rectangular laser beams have been around for a long time [1],
there is little in the literature on rectangular polynomials representing balanced
aberrations for such beams. In this chapter we discuss such polynomials that are
orthonormal over a unit rectangular pupil [2,3]. These polynomials are not separable in
the x and y coordinates of a point on the pupil. The expressions for only the first 15
orthonormal polynomials, i.e., up to and including the fourth order, are given for an
arbitrary aspect ratio of the pupil because they become quite cumbersome as their order
increases. However, expressions for the first 45 polynomials, i.e., up to and including the
eighth order, are given for an aspect ratio of 0.75. The isometric, interferometric, and PSF
plots of these polynomial aberrations with a sigma value of one wave are given along
with their P-V numbers. The Strehl ratios for these polynomial aberrations for a sigma
value of one-tenth of a wave are also given. Finally, we discuss how to obtain the
standard deviation of a Seidel aberration with and without balancing.
Products of Legendre polynomials (one for the $x$ axis and the other for the $y$ axis), which
are also orthogonal over a rectangular pupil [4], are not suitable for the analysis of
rectangular wavefronts of rotationally symmetric systems, since they do not represent
classical or balanced aberrations for such systems. For example, the defocus aberration
for such a system is represented by $x^2 + y^2$. While it can be expanded in terms of a
complete set of 2D Legendre polynomials, it cannot be represented by a single product of
the x- and y-Legendre polynomials. The same difficulty holds for spherical aberration,
coma, etc. However, products of such Legendre polynomials are suitable for anamorphic
systems, as discussed in Chapter 13. Products of Chebyshev polynomials, one for the x-
and the other for the y-axis, are also orthogonal over a rectangular pupil, but they are not
suitable either for the rectangular pupils considered in this chapter for the same reasons as
for the products of Legendre polynomials.
$$P(x_p, y_p) = A(x_p, y_p)\exp[i\Phi(x_p, y_p)] , \tag{9-1}$$
where
$$A(x_p, y_p) = (P_{ex}/S_{ex})^{1/2} , \quad -a \le x_p \le a , \quad -b \le y_p \le b . \tag{9-2}$$
Letting
$$(x', y') = (x_p/a,\; y_p/b) , \tag{9-4}$$
and
$$(x, y) = \frac{1}{\lambda F_x}(x_i, y_i) , \tag{9-5}$$
where
$$F_x = R/2a \tag{9-6}$$
is the focal ratio of the image-forming light cone along the $x$ axis, and
$$\epsilon = b/a \tag{9-7}$$
is the aspect ratio of the pupil, the irradiance distribution can be written
$$I(x, y) = \frac{1}{16}\left|\int_{-1}^{1}\int_{-1}^{1} \exp[i\Phi(x', y')]\exp[-\pi i(xx' + \epsilon yy')]\, dx'\,dy'\right|^2 . \tag{9-8}$$
$$I(x, y) = \frac{1}{16}\left|\int_{-1}^{1}\int_{-1}^{1} \exp[-\pi i(xx' + \epsilon yy')]\, dx'\,dy'\right|^2 = \left(\frac{\sin \pi x}{\pi x}\right)^2\left(\frac{\sin \pi\epsilon y}{\pi\epsilon y}\right)^2 . \tag{9-9}$$
Figure 9-2a shows the 2D PSF for an aspect ratio $\epsilon = 0.75$. In particular, it shows the
central bright rectangular spot of size $2 \times 2/\epsilon$, with each dimension in units of $\lambda F_x$. The
PSF is zero wherever $x$ and/or $\epsilon y$ is a positive or a negative integer. Figure 9-2b shows
the irradiance distribution along the $x$ and $y$ axes, and along the diagonal of the central
bright spot as $I(x, 0)$, $I(0, y)$, and $I(x, y) \equiv I(r)$, where $r = (x^2 + y^2)^{1/2}$ and
$$I(r) = \left[\frac{\sin\left(\pi r/\sqrt{1 + \epsilon^{-2}}\right)}{\pi r/\sqrt{1 + \epsilon^{-2}}}\right]^4 . \tag{9-10}$$
[Figure panels: (a) 2D PSF image; (b) plot of irradiance (0 to 1) versus $x$, $y$, or $r$ (0 to 3), with curves $I(0, y)$, $I(r)$, and $I(x, 0)$.]
Figure 9-2. (a) 2D aberration-free PSF for $\epsilon = 0.75$. (b) Irradiance distribution along
the $x$ and $y$ axes, and along the diagonal of the central bright spot of the PSF.
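The aberration-free PSF of Eq. (9-9) and its diagonal profile of Eq. (9-10) can be sketched in a few lines. This is a minimal illustration that assumes the aspect ratio is $\epsilon = b/a$, that $x$ and $y$ are in units of $\lambda F_x$, and that the diagonal of the central bright spot is the line $y = x/\epsilon$; the function names are hypothetical.

```python
import math

def sinc(u):
    """sin(pi u)/(pi u), with the limiting value 1 at u = 0."""
    return 1.0 if u == 0 else math.sin(math.pi * u) / (math.pi * u)

def psf_rect(x, y, eps):
    """Aberration-free PSF of a rectangular pupil, Eq. (9-9)."""
    return sinc(x) ** 2 * sinc(eps * y) ** 2

def psf_diagonal(r, eps):
    """PSF along the diagonal y = x/eps of the central bright spot, Eq. (9-10)."""
    return sinc(r / math.sqrt(1 + eps ** -2)) ** 4

eps = 0.75
r = 0.7
x = r * eps / math.sqrt(1 + eps**2)   # point on the diagonal at distance r
y = x / eps
print(psf_rect(x, y, eps), psf_diagonal(r, eps))
```

The two printed values should coincide, which is just the statement that Eq. (9-10) is Eq. (9-9) restricted to the diagonal.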
9.3.2 OTF
From Eq. (1-13), the aberration-free OTF of a system with a rectangular pupil at a
spatial frequency $(\xi, \eta)$ is given by the fractional area of overlap of two rectangles
centered at $(0, 0)$ and $\lambda R(\xi, \eta)$, as shown in Figure 9-3. The overlap area is given by
$$\text{overlap area} = 4ab\left(1 - \frac{\xi}{1/\lambda F_x}\right)\left(1 - \frac{1}{\epsilon}\,\frac{\eta}{1/\lambda F_x}\right) . \tag{9-11}$$
Hence, the fractional area of overlap, or the OTF of the system, may be written
$$\tau(v_x, v_y) = (1 - v_x)\left(1 - \frac{v_y}{\epsilon}\right) , \tag{9-12}$$
where
$$(v_x, v_y) = \left(\frac{\xi}{1/\lambda F_x},\; \frac{\eta}{1/\lambda F_x}\right) \tag{9-13}$$
are the spatial frequency components in units of the cutoff frequency $1/\lambda F_x$ along the $x$
axis. The OTF $\tau(v)$, where $v = (v_x^2 + v_y^2)^{1/2}$, along the diagonal of the pupil can be
obtained from Eq. (9-12) by letting $v_y/v_x = \epsilon$. Thus
$$\tau(v) = \left(1 - \frac{v}{\sqrt{1 + \epsilon^2}}\right)^2 . \tag{9-14}$$
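[Diagram: two overlapping rectangular pupils of half-widths $a$ and $b$ in the $(x_p, y_p)$ plane, centered at $O$ and $O'$.]
Figure 9-3. Overlap area of two rectangular pupils centered at $(0, 0)$ and $\lambda R(\xi, \eta)$
for an aspect ratio $\epsilon = 0.75$.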
Figure 9-4 shows the OTF for $\epsilon = 0.75$ along the $x$ and $y$ axes, and along the
diagonal of the pupil, as $\tau(v_x, 0)$, $\tau(0, v_y)$, and $\tau(v)$, with the corresponding cutoff
frequencies 1, 0.75, and 1.25, respectively, each in units of $1/\lambda F_x$. We note that
$\tau(0, v_y) < \tau(v_x, 0)$ for any value of $v_x = v_y$ due to the smaller dimension of the pupil
along the $y$ axis. Moreover, $\tau(v) < \tau(v_x, 0)$ for any frequency lying in the range
$0 < v = v_x < 2\sqrt{1 + \epsilon^2} - (1 + \epsilon^2)$, or $0 < v = v_x < 0.9375$ in our example of $\epsilon = 0.75$. The
two OTFs are equal to each other at the frequency $2\sqrt{1 + \epsilon^2} - (1 + \epsilon^2)$, or 0.9375. At
larger frequencies, $\tau(v) > \tau(v_x, 0)$ until $v = \sqrt{1 + \epsilon^2}$. Of course, the values of both OTFs
in the vicinity of the unity cutoff frequency for $\tau(v_x, 0)$ are quite small in our example.
Finally, $\tau(0, v_y)$ is only slightly greater than $\tau(v)$ in the frequency range
$0 < v = v_y < 2\sqrt{1 + \epsilon^2} - (1/\epsilon)(1 + \epsilon^2)$. The two OTFs are equal to each other at the
frequency $2\sqrt{1 + \epsilon^2} - (1/\epsilon)(1 + \epsilon^2)$, or $1/2.4$ in our example. For larger frequencies, $\tau(v)$ is
significantly greater. We point out that they are equal to each other only if $\epsilon \ge 1/\sqrt{3}$. As
$\epsilon \to 1$ and the rectangular pupil becomes square, $\tau(0, v_y) \to \tau(v_x, 0)$ for any value of
$v_x = v_y$, and the cutoff frequency for $\tau(v)$ approaches $\sqrt{2}$, as discussed in the next
chapter.
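The crossing frequencies quoted above follow from equating Eq. (9-14) with the axial OTFs of Eq. (9-12). The sketch below works this out numerically; it assumes $\epsilon = b/a$ and frequencies in units of $1/\lambda F_x$, clips the OTF to zero beyond cutoff, and uses hypothetical function names.

```python
import math

def otf_rect(vx, vy, eps):
    """Aberration-free OTF of a rectangular pupil, Eq. (9-12), zero beyond cutoff."""
    return max(0.0, 1 - abs(vx)) * max(0.0, 1 - abs(vy) / eps)

def otf_diagonal(v, eps):
    """OTF along the pupil diagonal (vy/vx = eps), Eq. (9-14)."""
    return max(0.0, 1 - abs(v) / math.sqrt(1 + eps**2)) ** 2

def crossings(eps):
    """Frequencies at which the diagonal OTF equals the x-axis and y-axis OTFs."""
    d = math.sqrt(1 + eps**2)          # cutoff frequency along the diagonal
    # (1 - v/d)^2 = 1 - v      gives v = 2d - d^2
    # (1 - v/d)^2 = 1 - v/eps  gives v = 2d - d^2/eps
    return 2 * d - d**2, 2 * d - d**2 / eps

eps = 0.75
v_x_cross, v_y_cross = crossings(eps)
print(v_x_cross, v_y_cross)
```

For $\epsilon = 0.75$ the diagonal cutoff is 1.25, so the two crossings evaluate to $2.5 - 1.5625$ and $2.5 - 1.5625/0.75$, matching the values 0.9375 and $1/2.4$ in the text.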
[Plot: $\tau$ (0 to 1) versus $v_x$, $v_y$, or $v$ (0 to 1.4), with curves $\tau(v_x, 0)$, $\tau(0, v_y)$, and $\tau(v)$.]
Figure 9-4. Aberration-free OTF for $\epsilon = 0.75$, where $v_x$, $v_y$, and $v$ are in units of
the cutoff frequency $1/\lambda F_x$ along the $x$ axis.
$$R_{j+1} = N_{j+1}\left[ Z_{j+1} - \sum_{k=1}^{j} \langle Z_{j+1} R_k \rangle R_k \right] , \tag{9-15}$$
where $N_{j+1}$ is a normalization constant so that the polynomials are orthonormal over the
unit rectangle, i.e., they satisfy the orthonormality condition
$$\frac{1}{4c\sqrt{1 - c^2}}\int_{-c}^{c} dx \int_{-\sqrt{1-c^2}}^{\sqrt{1-c^2}} R_j R_{j'}\, dy = \delta_{jj'} . \tag{9-16}$$
The angular brackets indicate a mean value over the rectangular pupil. Thus
$$\langle Z_j R_k \rangle = \frac{1}{4c\sqrt{1 - c^2}}\int_{-c}^{c} dx \int_{-\sqrt{1-c^2}}^{\sqrt{1-c^2}} Z_j R_k\, dy . \tag{9-17}$$
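[Diagram: rectangle centered at $O$ with corners $A(c, \sqrt{1-c^2})$, $B(c, -\sqrt{1-c^2})$, $C(-c, -\sqrt{1-c^2})$, and $D(-c, \sqrt{1-c^2})$, inscribed inside a unit circle; the $x$ axis is horizontal.]
Figure 9-5. Unit rectangle of half-width $c$ inscribed inside a unit circle. Its corner
points, such as $A$, lie at a distance of unity from its center.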
It should be evident that because of the symmetric limits of integration, a mean value is
zero if the integrand is an odd function of x and/or y. If the integrand is an even function,
then we may replace the lower limits of integration by zero and multiply the double
integral by 4.
The rectangular polynomials thus obtained up to the fourth order are given in Tables
9-1 through 9-3 in the same manner as the elliptical polynomials. Only the first 15
polynomials are given in these tables, because their expressions become too long unless
the aspect ratio is specified. Each polynomial consists of a number of circle polynomials,
but contains only the cosine or the sine terms, not both. The polynomial $R_6$ representing
balanced astigmatism is a linear combination of $Z_6$, $Z_4$, and $Z_1$, showing that the
balancing defocus for 0° Seidel astigmatism is different for a rectangular pupil compared
to that, for example, for a circular pupil. Similarly, the polynomial $R_{11}$, representing
balanced primary spherical aberration, is not radially symmetric, since it consists of a
term in astigmatism $Z_6$ or $\cos 2\theta$. As expected, the rectangular polynomials reduce to
the square polynomials as $c \to 1/\sqrt{2}$, and the slit polynomials for a slit pupil parallel to
the $x$ axis as $c \to 1$, discussed in Chapters 10 and 11, respectively.
where $a_j$ are the expansion coefficients. Multiplying both sides of Eq. (9-18) by
$R_j(x, y)$, integrating over the unit rectangle, and using the orthonormality Eq. (9-16), we
obtain the rectangular expansion coefficients:
$$a_j = \frac{1}{4c\sqrt{1 - c^2}}\int_{-c}^{c} dx \int_{-\sqrt{1-c^2}}^{\sqrt{1-c^2}} W(x, y)\, R_j(x, y)\, dy . \tag{9-19}$$
As stated in Section 3.2, it is evident from Eq. (9-19) that the value of a rectangular
coefficient is independent of the number J of polynomials used in the expansion of the
aberration function. Hence, one or more terms can be added to or subtracted from the
aberration function without affecting the value of the coefficients of the other
polynomials in the expansion.
The mean and mean square values of the aberration function are given by
$$\langle W(\rho, \theta) \rangle = a_1 , \tag{9-20}$$
and
$R_2 = (\sqrt{3}/2c)\,Z_2$
$R_3 = [\sqrt{3}/(2\sqrt{1 - c^2})]\,Z_3$
$R_4 = [\sqrt{5}/(4\sqrt{1 - 2c^2 + 2c^4})]\,(Z_1 + \sqrt{3}\,Z_4)$
$R_5 = [\sqrt{3/2}/(2c\sqrt{1 - c^2})]\,Z_5$
2 4
R6 { 5 /[8c2(1 c2) 1 2c + 2c ]}[(3 10c2 + 12c4 8c6)Z1 + 3 (1 2c2)Z4
+ 6 (1 2c2 + 2c4)Z6]
2 4 6
R7 [ 21 /(4 2 27 81c + 116c 62c )][ 2 (1+4c2)Z3 +5Z7]
2 4
R8 [ 21 /(4 2c 35 70c + 62c )][ 2 (5 4c2)Z2 +5Z8]
2 4 6
R13 [ 21 /(16 2 c 1 3c + 4c 2c )]( 3 Z5 + 5 Z13)
2 4 6 8
R14 IJ[6(245 1400c + 3378c 4452c + 3466c 1488c + 496c12)Z1
10
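$R_2 = (\sqrt{3}/c)\,\rho\cos\theta$
$R_3 = \sqrt{3/(1 - c^2)}\,\rho\sin\theta$
$R_4 = [\sqrt{5}/(2\sqrt{1 - 2c^2 + 2c^4})]\,(3\rho^2 - 1)$
$R_5 = [3/(2c\sqrt{1 - c^2})]\,\rho^2\sin 2\theta$
$R_6 = \{\sqrt{5}/[4c^2(1 - c^2)\sqrt{1 - 2c^2 + 2c^4}]\}\,[3(1 - 2c^2 + 2c^4)\rho^2\cos 2\theta + 3(1 - 2c^2)\rho^2 - 2c^2(1 - c^2)(1 - 2c^2)]$
$R_7 = [\sqrt{21}/(2\sqrt{27 - 81c^2 + 116c^4 - 62c^6})]\,(15\rho^2 - 9 + 4c^2)\,\rho\sin\theta$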
R15 = { 21 /[8c3(1 í c2)3/2(1 í 2c2 +2c4)1/2]}[ í (1 í 2c2) (6c2 í 6c4 í 5ȡ2)ȡ2 sin2ș
+ (5/2)(1 í 2c2 +2c4)ȡ4 sin4ș]
$R_2 = (\sqrt{3}/c)\,x$
$R_3 = (\sqrt{3}/\sqrt{1 - c^2})\,y$
$R_4 = [\sqrt{5}/(2\sqrt{1 - 2c^2 + 2c^4})]\,(3\rho^2 - 1)$
$R_5 = [3/(c\sqrt{1 - c^2})]\,xy$
$R_6 = \{\sqrt{5}/[2c^2(1 - c^2)\sqrt{1 - 2c^2 + 2c^4}]\}\,[3(1 - c^2)^2x^2 - 3c^4y^2 - c^2(1 - 3c^2 + 2c^4)]$
$R_7 = [\sqrt{21}/(2\sqrt{27 - 81c^2 + 116c^4 - 62c^6})]\,(15\rho^2 - 9 + 4c^2)\,y$
$R_8 = [\sqrt{21}/(2c\sqrt{35 - 70c^2 + 62c^4})]\,(15\rho^2 - 5 - 4c^2)\,x$
R9 = { 5 § 27 í 54c + 62c
2 4·
/ §©1 í c ·¹ /[2c2(27 í 81c2 + 116c4 í 62c6)]}
2
© ¹
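$$\langle W^2(\rho, \theta) \rangle = \sum_{j=1}^{J} a_j^2 , \tag{9-21}$$
$$\sigma_W^2 = \langle W^2(\rho, \theta) \rangle - \langle W(\rho, \theta) \rangle^2 = \sum_{j=2}^{J} a_j^2 . \tag{9-22}$$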
The PSF plots, representing the images of a point object in the presence of a
polynomial aberration and obtained by applying Eq. (9-3), are shown in Figure 9-6. The
full width of a square displaying the PSFs is $24\lambda F_x$. Since the piston aberration $R_1$ has
no effect on the PSF, it yields an aberration-free PSF. The polynomial aberrations $R_2$ and
$R_3$, representing the $x$ and $y$ wavefront tilts with aberration coefficients $a_2$ and $a_3$,
displace the PSF in the image plane along the $x$ and $y$ axes, respectively. If the coefficient
$a_2$ is in units of wavelength, it corresponds to a wavefront tilt angle of $\sqrt{3}\,\lambda a_2/ca$ about
the $y$ axis and displaces the PSF along the $x$ axis by $2\sqrt{3}\,\lambda F_x a_2/c$, where $F_x = R/2a$ and
$c = a/(a^2 + b^2)^{1/2}$ is the half-width of the rectangle along the $x$ axis normalized by its
semidiagonal. Similarly, $a_3$ corresponds to a wavefront tilt angle of $\sqrt{3/(1 - c^2)}\,\lambda a_3/b$
about the $x$ axis and displaces the PSF by $2\sqrt{3/(1 - c^2)}\,\lambda F_y a_3$, where $F_y = R/2b$ is the
focal ratio of the image-forming beam along the $y$ axis.
R2 1.0825Z2
R3 1.4434Z3
R4 0.7613Z1 + 1.3186Z4
R5 1.2758Z5
R7 1.6096Z3 + 1.5985Z7
R8 0.8848Z2 + 1.2821Z8
$R_2 = 2.1651\rho\cos\theta$
$R_3 = 2.8868\rho\sin\theta$
$R_5 = 3.1250\rho^2\sin 2\theta$
$R_7 = (-5.8234\rho + 13.5638\rho^3)\sin\theta$
$R_8 = (-5.4830\rho + 10.8789\rho^3)\cos\theta$
R22 = 1.2407 + 3.2334( 1 + 2U2) + 3.8612(1 6U2 + 6U4) + 3.9642( 1 + 12U 2 30U4 + 20U6)
+ (2.1911U 2 4.0362 U4)cos2T + 2.0593U4cos4T
R23 = (21.6144 U2 84.877 U4 + 73.4513U6)cos2T + 0.4570U4cos4T
R24 = 3.7592 8.4102( 1 + 2U2) 7.2735(1 6U2 + 6U4) 2.8925( 1 + 12U2 30U4 + 20U6)
+ (30.3780U2 161.2260U4 + 193.4870 U6)cos2T 2.6753U 4cos4T
R26 = 14.8185 + 31.8310( 1 + 2U2) + 25.6640(1 6U2 + 6U4) + 9.60361( 1 + 12U2 30U4 + 20 U6)
+ ( 11.6421U2 + 100.6510U4 185.7370U6)cos2T + ( 49.2338 U4 + 111.1780 U6)cos4T
R28 = 30.6444 62.9091( 1 + 2U2) 45.8608(1 6U 2 + 6 U4) 14.1723( 1 + 12U2 30U4 + 20 U6)
+ (24.2988U 2 223.3660U4 + 458.9270U6)cos2 T + (54.9277U4 185.0350 U6)cos4T + 40.2033 U6cos6 T
9.6 Isometric, Interferometric, and Imaging Characteristics of Rectangular Polynomial Aberrations 251
R30 ( 13.8336U + 121.8610 U3 287.0700U5 + 196.7720U7)cosT + ( 5.7494U 3 + 12.5263U 5)cos3T + 0.2742U 5cos5T
R37 1.4443 + 4.1359( 1 + 2U 2) + 5.8286(1 6U2 + 6U4) + 5.5594( 1 + 12U2 30U4 + 20U6)
+ 4.7041(1 20U 2 + 90 U4 140U 6 + 70U 8) + ( 5.6303U 2 + 25.9377U 4 23.9482U 6)cos2T
+ ( 12.3568U4 + 20.5270 U6)cos4T 0.6386 U6cos6T
R40 39.3796 + 92.1941( 1 + 2U2) + 90.9522(1 6U 2 + 6U4) + 52.9725( 1 + 12U2 30U4 + 20U 6)
+ 16.0196(1 20U 2 + 90U4 140U6 + 70U8) + (15.8434U 2 229.3420U 4 + 905.7830U 6 1034.5500U8)cos2T
+ (131.6850U 4 633.4660U6 + 736.3840U 8)cos4T 8.9803U6cos6 T
R44 197.7770 + 437.0330( 1 + 2U2) + 382.5600(1 6U2 + 6 U4) + 183.6730( 1 + 12 U2 30U4 + 20 U6)
+ 41.0527(1 20U2 + 90U4 140 U6 + 70U8) + (36.0550U2 619.6960 U4 + 3063.7900U6 4573.8900U8)cos2T
+ (170.1620U4 1319.9600U 6 + 2427.3200 U8)cos4T + (230.6850U6 730.7330U 8)cos6 T + 111.0290U8cos8T
$R_2 = 2.1651x$
$R_3 = 2.8868y$
$R_5 = 6.2500xy$
[Figure 9-6 panels: $R_1$ through $R_9$.]
The Strehl ratio, namely the central value of a PSF relative to its aberration-free
value, can be obtained from Eq. (9-8) by letting $x = 0 = y$, i.e., from
$$I(0, 0) = \frac{1}{16}\left|\int_{-1}^{1}\int_{-1}^{1} \exp[i\Phi(x', y')]\, dx'\,dy'\right|^2 . \tag{9-23}$$
Its value for a rectangular polynomial aberration with a sigma value of 0.1 wave is listed
in Table 9-8 and plotted in Figure 9-7. Because of the small value of the aberration, the
Strehl ratio is approximately the same for each polynomial. Both the table and the figure
illustrate that the Strehl ratio for a small aberration is independent of the type of
aberration. It is approximately given by $\exp(-\sigma_\Phi^2)$, or 0.67, where $\sigma_\Phi = 0.2\pi$.
Table 9-8. Strehl ratio $S$ for rectangular polynomial aberrations for $c = 0.8$
corresponding to an aspect ratio of $\epsilon = 0.75$ for a sigma value of 0.1 wave.
Figure 9-7. Strehl ratio $S$ for rectangular polynomial aberrations for $c = 0.8$
corresponding to an aspect ratio of $\epsilon = 0.75$ for a sigma value of 0.1 wave.
9.7.1 Defocus
We start with the defocus aberration
$$W_d(\rho) = A_d\,\rho^2 . \tag{9-24}$$
From the form of the orthonormal defocus polynomial $R_4$ given in Table 9-2, it is
evident that its sigma value across a rectangular pupil is given by
$$\sigma_d = \frac{2g}{3\sqrt{5}}\,A_d , \tag{9-25}$$
where
$$g = \left(1 - 2c^2 + 2c^4\right)^{1/2} . \tag{9-26}$$
9.7.2 Astigmatism
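Next consider 0° Seidel astigmatism given by
$$W_a(\rho, \theta) = A_a\,\rho^2\cos^2\theta . \tag{9-27}$$
$$R_6 = \frac{3\sqrt{5}}{4c^2\left(1 - c^2\right)g}\left[g^2\rho^2\cos 2\theta + \left(1 - 2c^2\right)\rho^2\right] + \text{constant} \tag{9-28a}$$
$$\phantom{R_6} = \frac{3\sqrt{5}\,g}{2c^2\left(1 - c^2\right)}\left(\rho^2\cos^2\theta - \frac{c^4}{g^2}\,\rho^2\right) + \text{constant} , \tag{9-28b}$$
showing that the relative amount of defocus $\rho^2$ that balances Seidel astigmatism
$\rho^2\cos^2\theta$ is $c^4/g^2$. It is evident that the balanced astigmatism is given by
$$W_{ba}(\rho, \theta) = A_a\left(\rho^2\cos^2\theta - \frac{c^4}{g^2}\,\rho^2\right) . \tag{9-29}$$
$$\sigma_{ba} = \frac{2c^2\left(1 - c^2\right)}{3\sqrt{5}\,g}\,A_a . \tag{9-30}$$
To obtain the sigma value of astigmatism, we write Eq. (9-27) in the form
$$W_a(\rho, \theta) = \frac{2A_a}{3\sqrt{5}\,g}\left[c^2\left(1 - c^2\right)R_6 + c^4 R_4\right] + \text{constant} . \tag{9-31}$$
$$\sigma_a = \frac{2c^2}{3\sqrt{5}}\,A_a . \tag{9-32}$$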
9.7.3 Coma
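Now, we consider Seidel coma
$$W_c(\rho, \theta) = A_c\,\rho^3\cos\theta . \tag{9-33}$$
$$R_8 = \frac{\sqrt{21}}{2c\left(35 - 70c^2 + 62c^4\right)^{1/2}}\left[15\rho^3\cos\theta - \left(5 + 4c^2\right)\rho\cos\theta\right] . \tag{9-34}$$
It shows that the relative amount of tilt $\rho\cos\theta$ that optimally balances Seidel coma
$\rho^3\cos\theta$ is $-(5 + 4c^2)/15$ compared to $-2/3$ for a circular pupil. Its sigma value is given
by
$$\sigma_{bc} = \frac{2c}{15}\left(\frac{35 - 70c^2 + 62c^4}{21}\right)^{1/2} A_c . \tag{9-35}$$
To obtain the sigma value of Seidel coma, we write Eq. (9-33) in the form
$$W_c(\rho, \theta) = \frac{A_c}{15}\left[2c\left(\frac{35 - 70c^2 + 62c^4}{21}\right)^{1/2} R_8 + \frac{c\left(5 + 4c^2\right)}{\sqrt{3}}\,R_2\right] . \tag{9-36}$$
$$\sigma_c = c\left(\frac{7 + 8c^4}{105}\right)^{1/2} A_c . \tag{9-37}$$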
$$R_{11} = (1/8\mu)\left[315\rho^4 + 30\left(1 - 2c^2\right)\rho^2\cos 2\theta - 240\rho^2\right] + \text{constant} \tag{9-39a}$$
$$\phantom{R_{11}} = (1/8\mu)\left[315\rho^4 + 60\left(1 - 2c^2\right)\rho^2\cos^2\theta - 30\left(9 - 2c^2\right)\rho^2\right] + \text{constant} . \tag{9-39b}$$
$$W_{bs}(\rho) = A_s\left[\rho^4 + \frac{6}{63}\left(1 - 2c^2\right)\rho^2\cos 2\theta - \frac{16}{21}\,\rho^2\right] \tag{9-40a}$$
$$\phantom{W_{bs}(\rho)} = A_s\left[\rho^4 + \frac{12}{63}\left(1 - 2c^2\right)\rho^2\cos^2\theta - \frac{6}{63}\left(9 - 2c^2\right)\rho^2\right] . \tag{9-40b}$$
It shows, as in the case of an elliptical pupil, that spherical aberration is balanced not only
by defocus but astigmatism as well. Its sigma value is given by
$$\sigma_{bs} = \frac{8\mu}{315}\,A_s . \tag{9-41}$$
To obtain the sigma value of Seidel spherical aberration, we write Eq. (9-38) in the form
$$W_s(\rho) = \frac{A_s}{315}\left[8\mu R_{11} - \frac{40c^2\left(1 - c^2\right)\left(1 - 2c^2\right)}{\sqrt{5}\,g}\,R_6 + \frac{20\left(9 - 20c^2 + 20c^4\right)}{\sqrt{5}\,g}\,R_4\right] + \text{constant} . \tag{9-42}$$
$$\sigma_s = \frac{4A_s}{45\sqrt{7}}\left(63 - 162c^2 + 206c^4 - 88c^6 + 44c^8\right)^{1/2} . \tag{9-43}$$
The sigma values of Seidel aberrations with and without balancing are given in Table 9-9.
Table 9-9. Sigma of a Seidel aberration with and without balancing, where $A_i$ is the
coefficient of an aberration.
Aberration               Sigma
Defocus                  $\sigma_d = (2g/3\sqrt{5})\,A_d$
Astigmatism              $\sigma_a = (2c^2/3\sqrt{5})\,A_a$
Balanced astigmatism     $\sigma_{ba} = [2c^2(1 - c^2)/3\sqrt{5}\,g]\,A_a$
Coma                     $\sigma_c = c\,[(7 + 8c^4)/105]^{1/2} A_c$
Balanced coma            $\sigma_{bc} = (2c/15\sqrt{21})\,(35 - 70c^2 + 62c^4)^{1/2} A_c$
Spherical aberration     $\sigma_s = (4A_s/45\sqrt{7})\,(63 - 162c^2 + 206c^4 - 88c^6 + 44c^8)^{1/2}$
Figures 9-8 and 9-9 show the variation of sigma for a rectangular pupil as a function
of its width c along the x axis. It is evident from Figure 9-8 that defocus and spherical
sigmas have a minimum for a square pupil (i.e., for $c = 1/\sqrt{2}$), but coma and astigmatism
sigmas increase monotonically as c increases from a value of zero, representing a slit
pupil along the y axis, to a value of 1, representing a slit pupil parallel to the x axis. The
balanced spherical sigma in Figure 9-9 has a minimum for a square pupil though its
variation is relatively small. The sigma for balanced astigmatism has a distinct maximum
for a square pupil, while the monotonically increasing sigma for balanced coma has a
point of inflection.
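The closed-form sigmas of Table 9-9 can be cross-checked against direct moment integrals over the rectangle. The following is a minimal sketch, not the procedure used in the text. It assumes that the unit rectangle has half-widths $c$ and $(1-c^2)^{1/2}$ along the x and y axes, and that $\gamma = (1 - 2c^2 + 2c^4)^{1/2}$. The symbol $\gamma$ is defined earlier in the chapter, outside this excerpt; the form used here is an assumption, chosen because it reproduces the square-pupil value $A_d/4.743$ at $c = 1/\sqrt{2}$. All function names are illustrative.

```python
import math

def even_moment(half_width, n):
    """Mean of t**n for t uniform on [-half_width, half_width]."""
    return half_width**n / (n + 1) if n % 2 == 0 else 0.0

def rect_mean(c, i, j):
    """Mean of x**i * y**j over a rectangle with half-widths c and sqrt(1 - c**2)."""
    return even_moment(c, i) * even_moment(math.sqrt(1.0 - c * c), j)

def sigma_defocus_moments(c):
    """Standard deviation of rho**2 = x**2 + y**2 from its first two moments."""
    m1 = rect_mean(c, 2, 0) + rect_mean(c, 0, 2)
    m2 = rect_mean(c, 4, 0) + 2 * rect_mean(c, 2, 2) + rect_mean(c, 0, 4)
    return math.sqrt(m2 - m1 * m1)

def sigma_defocus_table(c):
    """Table 9-9 entry for defocus with A_d = 1 (gamma as assumed above)."""
    gamma = math.sqrt(1 - 2 * c**2 + 2 * c**4)
    return 2 * gamma / (3 * math.sqrt(5))

def sigma_astigmatism_moments(c):
    """Standard deviation of rho**2 * cos(theta)**2 = x**2."""
    return math.sqrt(rect_mean(c, 4, 0) - rect_mean(c, 2, 0) ** 2)

def sigma_astigmatism_table(c):
    """Table 9-9 entry for astigmatism with A_a = 1."""
    return 2 * c**2 / (3 * math.sqrt(5))

# Scan c on a grid and locate the minimum of the defocus sigma.
grid = [k / 1000 for k in range(1, 1000)]
c_at_min = min(grid, key=sigma_defocus_table)
```

On this grid the defocus sigma is smallest at the point nearest $c = 1/\sqrt{2}$, while the astigmatism sigma grows as $c^2$, in line with the behavior described for Figure 9-8.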
9.8 SUMMARY
The aberration-free PSF and OTF are discussed in Section 9.3. The polynomials
orthonormal over a unit rectangular pupil, representing balanced aberrations over such a
pupil are given through the fourth order in Tables 9-1 through 9-3 in terms of the circle
polynomials, in polar coordinates, and in Cartesian coordinates, respectively. Each
orthonormal polynomial consists of either the cosine or the sine terms, but not both. Thus
an even j polynomial, for example, consists of only the cosine terms, as may be seen from
Table 9-2. This is a consequence of the biaxial symmetry of the pupil. Since the
polynomials are not separable in the polar coordinates $\rho$ and $\theta$ of a pupil point,
polynomial numbering with two indices n and m loses significance, and they must be
numbered with a single index j. They are ordered in the same manner as the polynomials
discussed in previous chapters.
As in the case of elliptical polynomials, only the first 15 rectangular polynomials are
given in the tables. The expressions for the higher-order polynomials are very long unless
the aspect ratio of the pupil is specified. The polynomial $R_6$ for astigmatism is a linear
combination of $Z_6$, $Z_4$, and $Z_1$, showing that the balancing defocus for (zero-degree)
Seidel astigmatism is different for a rectangular pupil compared to that, for example, for a
circular pupil. Moreover, $R_{11}$ is a linear combination of $Z_{11}$, $Z_6$, $Z_4$, and $Z_1$. Thus,
spherical aberration $\rho^4$ is balanced with not only defocus $\rho^2$ but astigmatism $\rho^2\cos^2\theta$
as well. The balanced aberration is evidently not radially symmetric. As expected, the
rectangular polynomials reduce to the square polynomials (discussed in the next chapter) as
$c \to 1/\sqrt{2}$, i.e., as the unit rectangle approaches a unit square.
The first 45 rectangular polynomials, i.e., up to and including the eighth order, for a
rectangular pupil with an aspect ratio of 0.75 are given in Tables 9-4 through 9-6 in
terms of Zernike circle polynomials, in polar coordinates, and in Cartesian coordinates,
respectively. They are illustrated in three different but equivalent ways in Figure 9-7 with
the isometric plot, interferogram, and the PSF for a sigma value of one wave. The peak-
to-valley aberration numbers (in units of wavelength) are given in Table 9-7. The Strehl
ratio for a sigma value of 0.1 wave is given in Table 9-8 and plotted in Figure 9-7. The
Seidel aberrations are discussed in Section 9.7, and their sigma values with and without
balancing are given in Table 9-9.
Chapter 10
Systems with Square Pupils
10.1 INTRODUCTION
We start this chapter with a brief discussion of the aberration-free PSF and OTF for a
system with a square pupil, as, for example, a high-power laser beam with a square cross-
section. We can obtain these results as a special case of the rectangular pupils discussed
in the last chapter. Similarly, the square polynomials $S_k$ can be obtained as a special case
of the rectangular polynomials $R_k$ discussed there, i.e., by letting $c = 1/\sqrt{2}$. However,
we describe the procedure for obtaining them independently [1,2], and give expressions
for the first 45 polynomials, i.e., up to and including the eighth order. The isometric,
interferometric, and PSF plots of these polynomial aberrations with a sigma value of one
wave are given along with their P-V numbers. The Strehl ratios for these polynomial
aberrations for a sigma value of one-tenth of a wave are also given. Finally, we discuss
how to obtain the standard deviation of a Seidel aberration with and without balancing
and then discuss the Strehl ratio as a function of it.
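$$ P\left(x_p, y_p\right) = A\left(x_p, y_p\right)\exp\left[i\Phi\left(x_p, y_p\right)\right] , \tag{10-1} $$
where
$$ A\left(x_p, y_p\right) = \left(P_{ex}/S_{ex}\right)^{1/2} , \quad -a \le x_p \le a , \quad -a \le y_p \le a . \tag{10-2} $$
Letting
$$ \left(x', y'\right) = a^{-1}\left(x_p, y_p\right) \tag{10-4} $$
and
$$ (x, y) = \frac{1}{\lambda F}\left(x_i, y_i\right) , \tag{10-5} $$
where
$$ F = R/2a \tag{10-6} $$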
is the focal ratio of the image forming beam along the x and the y axes, we obtain the
irradiance distribution
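$$ I(x, y) = \frac{1}{16}\left|\int_{-1}^{1}\int_{-1}^{1}\exp\left[i\Phi\left(x', y'\right)\right]\exp\left[-\pi i\left(xx' + yy'\right)\right]dx'\,dy'\right|^2 . \tag{10-7} $$
$$ I(x, y) = \frac{1}{16}\left|\int_{-1}^{1}\int_{-1}^{1}\exp\left[-\pi i\left(xx' + yy'\right)\right]dx'\,dy'\right|^2 = \left(\frac{\sin\pi x}{\pi x}\right)^2\left(\frac{\sin\pi y}{\pi y}\right)^2 . \tag{10-8} $$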
Figure 10-2a shows the 2D PSF, in particular, the central bright square spot of size
$2 \times 2$, with each dimension in units of $\lambda F$. The PSF is zero wherever x and/or y is a
positive or a negative integer. Moreover, there are rectangular spots along the x and y
axes, but square spots elsewhere in the PSF. Figure 10-2b shows the irradiance
distribution along the x and y axes, and along the diagonal of the central bright spot as
$I(x, 0)$, $I(0, y)$, and $I(x, x) \equiv I(r)$, where $r = \left(x^2 + y^2\right)^{1/2} = \sqrt{2}\,x$ and
$$ I(r) = \left[\frac{\sin\left(\pi r/\sqrt{2}\right)}{\pi r/\sqrt{2}}\right]^4 . \tag{10-9} $$
Figure 10-2. (a) 2D aberration-free PSF. (b) Irradiance distribution along the x and
y axes, and along the diagonal of the central bright spot of the PSF.
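As a quick numerical companion to Eqs. (10-8) and (10-9), the following minimal sketch evaluates the aberration-free PSF and its diagonal section. The function names are illustrative, and x, y, and r are assumed to be in units of $\lambda F$, as in the text.

```python
import math

def sinc(t):
    """sin(pi t)/(pi t), with the limiting value 1 at t = 0."""
    return 1.0 if t == 0 else math.sin(math.pi * t) / (math.pi * t)

def psf(x, y):
    """Eq. (10-8): aberration-free PSF of a square pupil."""
    return sinc(x) ** 2 * sinc(y) ** 2

def psf_diagonal(r):
    """Eq. (10-9): irradiance along the diagonal, where r = sqrt(2) * x."""
    return sinc(r / math.sqrt(2)) ** 4

# The diagonal section agrees with the 2D PSF evaluated at x = y.
x = 0.3
same_value = abs(psf(x, x) - psf_diagonal(math.sqrt(2) * x)) < 1e-12
```

Evaluating `psf` at any integer x or y returns zero to within rounding, which is the grid of dark lines that bounds the central $2 \times 2$ spot.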
10.3.2 OTF
From Eq. (1-13), the aberration-free OTF of a system with a square pupil at a spatial
frequency $(\xi, \eta)$ is given by the fractional area of overlap of two squares centered at
$(0, 0)$ and $\lambda R(\xi, \eta)$, as shown in Figure 10-3. The overlap area is given by
$$ S(\xi, \eta) = (2a - \lambda R\xi)(2a - \lambda R\eta) = 4a^2\left(1 - \frac{\xi}{1/\lambda F}\right)\left(1 - \frac{\eta}{1/\lambda F}\right) . \tag{10-10} $$
Hence, the fractional area of overlap, or the OTF of the system, may be written
$$ \tau\left(\nu_x, \nu_y\right) = \left(1 - \nu_x\right)\left(1 - \nu_y\right) , \tag{10-11} $$
where
$$ \left(\nu_x, \nu_y\right) = \left(\frac{\xi}{1/\lambda F}, \frac{\eta}{1/\lambda F}\right) \tag{10-12} $$
are the spatial frequency components in units of the cutoff frequency $1/\lambda F$ along the x
or the y axis. The OTF $\tau(\nu_x, 0)$ along the x axis is the same as the OTF $\tau(0, \nu_y)$ along
the y axis, with the same normalized cutoff frequency of unity.
Figure 10-3. Overlap area of two square pupils centered at $(0, 0)$ and $\lambda R(\xi, \eta)$.
The OTF $\tau(\nu)$, where $\nu = \left(\nu_x^2 + \nu_y^2\right)^{1/2}$, along the diagonal of the pupil can be
obtained from Eq. (10-11) by letting $\nu_x = \nu_y$. Thus
$$ \tau(\nu) = \left(1 - \frac{\nu}{\sqrt{2}}\right)^2 . \tag{10-13} $$
Figure 10-4 shows the OTF $\tau(\nu_x, 0)$, $\tau(0, \nu_y)$, and $\tau(\nu)$ along the x and y axes, and
along the diagonal of the pupil with cutoff frequencies 1, 1, and $\sqrt{2}$, respectively, each in
units of $1/\lambda F$. Of course, $\tau(\nu_x, 0) = \tau(0, \nu_y)$ for any $\nu_x = \nu_y$. The OTF $\tau(\nu) < \tau(\nu_x, 0)$ for
any frequency lying in the range $0 < \nu = \nu_x < 2\left(\sqrt{2} - 1\right)$. They are equal to each other at
the frequency $2\left(\sqrt{2} - 1\right)$ (or about 0.83), and $\tau(\nu) > \tau(\nu_x, 0)$ for frequencies in the range
$2\left(\sqrt{2} - 1\right) < \nu = \nu_x < \sqrt{2}$. Of course, $\tau(\nu_x, 0)$ is zero for $\nu_x \ge 1$, but $\tau(\nu)$ is not until
$\nu = \sqrt{2}$.
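The crossover between the axial and the diagonal OTF can be checked numerically. The sketch below is illustrative only: the function names are not from the text, and the use of absolute values to extend Eq. (10-11) to negative frequencies is an assumption based on the symmetry of the pupil.

```python
import math

def otf(vx, vy):
    """Eq. (10-11), extended symmetrically (assumption) and set to zero beyond cutoff."""
    vx, vy = abs(vx), abs(vy)
    if vx >= 1 or vy >= 1:
        return 0.0
    return (1 - vx) * (1 - vy)

def otf_diagonal(v):
    """Eq. (10-13): OTF along the pupil diagonal, v = sqrt(vx**2 + vy**2), vx = vy."""
    return otf(v / math.sqrt(2), v / math.sqrt(2))

# Frequency at which the diagonal and the axial OTFs are equal.
v_cross = 2 * (math.sqrt(2) - 1)          # about 0.83
gap_at_cross = otf_diagonal(v_cross) - otf(v_cross, 0.0)
```

Below `v_cross` the axial OTF is the larger of the two; above it the diagonal OTF is larger and remains nonzero up to $\nu = \sqrt{2}$.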
[Figure 10-4: OTF $\tau(\nu_x, 0)$, $\tau(0, \nu_y)$, and $\tau(\nu)$ plotted against $\nu_x$, $\nu_y$, or $\nu$.]
$$ S_{j+1} = N_{j+1}\left[Z_{j+1} - \sum_{k=1}^{j}\left\langle Z_{j+1}S_k\right\rangle S_k\right] , \tag{10-14} $$
where $N_{j+1}$ is a normalization constant so that the polynomials are orthonormal over the
unit square, i.e., they satisfy the orthonormality condition
$$ \frac{1}{2}\int_{-1/\sqrt{2}}^{1/\sqrt{2}} dy \int_{-1/\sqrt{2}}^{1/\sqrt{2}} S_j S_{j'}\,dx = \delta_{jj'} . \tag{10-15} $$
The angular brackets indicate a mean value over the square pupil. Thus, for example,
$$ \left\langle Z_j S_k\right\rangle = \frac{1}{2}\int_{-1/\sqrt{2}}^{1/\sqrt{2}} dy \int_{-1/\sqrt{2}}^{1/\sqrt{2}} Z_j S_k\,dx . \tag{10-16} $$
If the integrand is an odd function of x and/or y, the mean value is zero because of the
symmetric limits of integration. If the integrand is an even function, then we may replace
the lower limits of integration by zero and multiply the double integral by 4.
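The Gram-Schmidt step of Eq. (10-14) can be sketched with exact monomial moments over the unit square, since the mean of $x^i y^j$ factors into two 1D integrals. This is an illustration under stated assumptions, not the author's implementation: polynomials are stored as dictionaries mapping exponent pairs $(i, j)$ to coefficients, only $Z_1$ through $Z_4$ are used, and all helper names are hypothetical.

```python
import math

H = 1 / math.sqrt(2)  # half-width of the unit square

def moment(n):
    """Mean of t**n for t uniform on [-H, H]; zero for odd n by symmetry."""
    return H**n / (n + 1) if n % 2 == 0 else 0.0

def mean(p):
    """Mean value over the square of a polynomial {(i, j): coeff}."""
    return sum(c * moment(i) * moment(j) for (i, j), c in p.items())

def mul(p, q):
    """Product of two polynomials."""
    out = {}
    for (i, j), a in p.items():
        for (k, l), b in q.items():
            out[(i + k, j + l)] = out.get((i + k, j + l), 0.0) + a * b
    return out

def axpy(a, p, q):
    """Return a*p + q."""
    out = dict(q)
    for key, c in p.items():
        out[key] = out.get(key, 0.0) + a * c
    return out

def gram_schmidt(zernikes):
    """Eq. (10-14): subtract projections on earlier S_k, then normalize."""
    squares = []
    for z in zernikes:
        g = dict(z)
        for s in squares:
            g = axpy(-mean(mul(z, s)), s, g)
        norm = math.sqrt(mean(mul(g, g)))
        squares.append({key: c / norm for key, c in g.items()})
    return squares

# Z1..Z4 in Cartesian form: 1, 2x, 2y, sqrt(3) * (2*rho**2 - 1)
r3 = math.sqrt(3)
Z = [{(0, 0): 1.0}, {(1, 0): 2.0}, {(0, 1): 2.0},
     {(2, 0): 2 * r3, (0, 2): 2 * r3, (0, 0): -r3}]
S = gram_schmidt(Z)
```

With these inputs, `S[1]` comes out as $\sqrt{6}\,x$ and `S[3]` as $\sqrt{5/2}\,(3\rho^2 - 1)$, matching the tabulated $S_2$ and $S_4$.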
The orthonormal square polynomials up to and including the eighth order, i.e., the
first 45 polynomials, in terms of the Zernike circle polynomials are given in Table 10-1.
Figure 10-5. Unit square of half-width $1/\sqrt{2}$ inscribed inside a unit circle. Its corner
points, such as A, lie at a distance of unity from its center.
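$S_1 = Z_1$
$S_2 = \sqrt{3/2}\,Z_2$
$S_3 = \sqrt{3/2}\,Z_3$
$S_4 = \left(\sqrt{5/2}/2\right)Z_1 + \left(\sqrt{15/2}/2\right)Z_4$
$S_5 = \sqrt{3/2}\,Z_5$
$S_6 = \left(\sqrt{15}/2\right)Z_6$
$S_7 = \left(3\sqrt{21/31}/2\right)Z_3 + \left(5\sqrt{21/62}/2\right)Z_7$
$S_8 = \left(3\sqrt{21/31}/2\right)Z_2 + \left(5\sqrt{21/62}/2\right)Z_8$
$S_{14} = \left[261/\left(8\sqrt{134}\right)\right]Z_1 + \left(345\sqrt{3/134}/16\right)Z_4 + \left(129\sqrt{5/134}/16\right)Z_{11} + \left(3\sqrt{335}/16\right)Z_{14}$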
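$S_2 = \sqrt{6}\,\rho\cos\theta$
$S_3 = \sqrt{6}\,\rho\sin\theta$
$S_4 = \sqrt{5/2}\,(3\rho^2 - 1)$
$S_5 = 3\rho^2\sin 2\theta$
$S_6 = 3\sqrt{5/2}\,\rho^2\cos 2\theta$
$S_7 = \sqrt{21/31}\,(15\rho^2 - 7)\rho\sin\theta$
$S_8 = \sqrt{21/31}\,(15\rho^2 - 7)\rho\cos\theta$
$S_{17} = \sqrt{55/1966}\,[-11\rho^3\sin 3\theta + 3(19 - 97\rho^2 + 105\rho^4)\rho\sin\theta]$
$S_{18} = (1/4)\sqrt{3/844397}\,[5(-10099 + 20643\rho^2)\rho^3\cos 3\theta + 3(3128 - 23885\rho^2 + 37205\rho^4)\rho\cos\theta]$
$S_{19} = (1/4)\sqrt{3/844397}\,[5(-10099 + 20643\rho^2)\rho^3\sin 3\theta - 3(3128 - 23885\rho^2 + 37205\rho^4)\rho\sin\theta]$
$S_{20} = (1/16)\sqrt{7/859}\,[2577\rho^5\cos 5\theta - 5(272 - 717\rho^2)\rho^3\cos 3\theta + 30(22 - 196\rho^2 + 349\rho^4)\rho\cos\theta]$
$S_{21} = (1/16)\sqrt{7/859}\,[2577\rho^5\sin 5\theta + 5(272 - 717\rho^2)\rho^3\sin 3\theta + 30(22 - 196\rho^2 + 349\rho^4)\rho\sin\theta]$
$S_{26} = \left[1/\left(16\sqrt{849}\right)\right][5(-98 + 2418\rho^2 - 12051\rho^4 + 15729\rho^6) + 3(-8195 + 17829\rho^2)\rho^4\cos 4\theta]$
$S_{27} = \left[1/\left(16\sqrt{7846}\right)\right][27461\rho^6\sin 6\theta + 15(348 - 2744\rho^2 + 4487\rho^4)\rho^2\sin 2\theta]$
  $+ (56.29115383\rho^3 - 248.12774426\rho^5 + 258.68657393\rho^7)\cos 3\theta + 4.37679791\rho^5\cos 5\theta$
$S_{33} = (-6.78771487\rho + 103.15977419\rho^3 - 407.15689696\rho^5 + 460.96399558\rho^7)\sin\theta$
  $+ (-21.68093294\rho^3 + 127.50233381\rho^5 - 174.38628345\rho^7)\sin 3\theta$
  $+ (-75.07397471\rho^5 + 151.45280913\rho^7)\sin 5\theta$
$S_{34} = (-6.78771487\rho + 103.15977419\rho^3 - 407.15689696\rho^5 + 460.96399558\rho^7)\cos\theta$
  $+ (21.68093294\rho^3 - 127.50233381\rho^5 + 174.38628345\rho^7)\cos 3\theta$
  $+ \rho^5(-75.07397471 + 151.45280913\rho^2)\cos 5\theta$
$S_{35} = (3.69268433\rho - 59.40323317\rho^3 + 251.40397826\rho^5 - 307.20401818\rho^7)\sin\theta$
  $+ (28.20381860\rho^3 - 183.86176738\rho^5 + 272.43249673\rho^7)\sin 3\theta$
  $+ (19.83875817\rho^5 - 48.16032819\rho^7)\sin 5\theta + 32.65536033\rho^7\sin 7\theta$
$S_{36} = (-3.69268433\rho + 59.40323317\rho^3 - 251.40397826\rho^5 + 307.20401818\rho^7)\cos\theta$
  $+ (28.20381860\rho^3 - 183.86176738\rho^5 + 272.43249673\rho^7)\cos 3\theta$
  $+ (-19.83875817\rho^5 + 48.16032819\rho^7)\cos 5\theta + 32.65536033\rho^7\cos 7\theta$
$S_{37} = 2.34475558 - 55.32128002\rho^2 + 296.53777290\rho^4 - 553.46621887\rho^6$
  $+ 332.94452229\rho^8 + (-12.75329096\rho^4 + 20.75498320\rho^6)\cos 4\theta$
$S_{38} = (-51.83202694\rho^2 + 451.93890159\rho^4 - 1158.49126888\rho^6 + 910.24313983\rho^8)\cos 2\theta$
  $+ 5.51662508\rho^6\cos 6\theta$
$S_{39} = (-39.56789598\rho^2 + 267.47071204\rho^4 - 525.02362247\rho^6 + 310.24123146\rho^8)\sin 2\theta$
  $- 1.59098067\rho^6\sin 6\theta$
$S_{40} = 1.21593465 - 45.42224477\rho^2 + 373.41167834\rho^4 - 1046.32659847\rho^6$
  $+ 933.93661610\rho^8 + (137.71626496\rho^4 - 638.10242034\rho^6 + 712.98912399\rho^8)\cos 4\theta$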
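$S_2 = \sqrt{6}\,x$
$S_3 = \sqrt{6}\,y$
$S_4 = \sqrt{5/2}\,(3\rho^2 - 1)$
$S_5 = 6xy$
$S_6 = 3\sqrt{5/2}\,(x^2 - y^2)$
$S_7 = \sqrt{21/31}\,(15\rho^2 - 7)y$
$S_8 = \sqrt{21/31}\,(15\rho^2 - 7)x$
$S_{17} = \sqrt{55/1966}\,(315\rho^4 - 324x^2 - 280y^2 + 57)y$
$S_{23} = \sqrt{33/3923}\,(1575\rho^4 - 1820\rho^2 + 471)xy$
$S_{28} = \left[21/\left(8\sqrt{1349}\right)\right][3146x^6 - 2250x^4y^2 + 2250x^2y^4 - 3146y^6 - 1770(x^4 - y^4) + 245(x^2 - y^2)]$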
The corresponding polynomials in polar and Cartesian coordinates are given in Tables
10-2 and 10-3, respectively. Of course, up to the fourth order, they can be obtained
simply from the rectangular polynomials $R_k$ given in Tables 9-1 through 9-3 by letting
$c = 1/\sqrt{2}$. The square polynomial $S_{11}$ representing the balanced primary spherical
aberration is radially symmetric, but the polynomial $S_{22}$ representing balanced secondary
spherical aberration is not because it also contains a term in $Z_{14}$, which varies as $\cos 4\theta$.
Similarly, the polynomial $S_{37}$ representing balanced tertiary spherical aberration is also not
radially symmetric, since it contains terms in $Z_{14}$ and $Z_{26}$, both varying as $\cos 4\theta$.
$$ W(x, y) = \sum_{j=1}^{J} a_j S_j(x, y) , \tag{10-17} $$
where $a_j$ are the expansion coefficients. Multiplying both sides of Eq. (10-17) by
$S_j(x, y)$, integrating over the unit square, and using the orthonormality Eq. (10-15), we
obtain the square expansion coefficients:
$$ a_j = \frac{1}{2}\int_{-1/\sqrt{2}}^{1/\sqrt{2}} dy \int_{-1/\sqrt{2}}^{1/\sqrt{2}} W(x, y)S_j(x, y)\,dx . \tag{10-18} $$
As stated in Section 3.2, it is evident from Eq. (10-18) that the value of a square
coefficient is independent of the number J of polynomials used in the expansion of the
aberration function. Hence, one or more terms can be added to or subtracted from the
aberration function without affecting the value of the coefficients of the other
polynomials in the expansion.
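To make Eq. (10-18) concrete, the following sketch projects Seidel coma $\rho^3\cos\theta = x^3 + xy^2$ (with unit coefficient) onto $S_2$ and $S_8$ using exact monomial moments over the unit square. The dictionary representation of polynomials and the helper names are assumptions made for this illustration; the Cartesian forms of $S_2$ and $S_8$ are those of Table 10-3.

```python
import math

H = 1 / math.sqrt(2)  # half-width of the unit square

def moment(n):
    """Mean of t**n for t uniform on [-H, H]."""
    return H**n / (n + 1) if n % 2 == 0 else 0.0

def mean_product(p, q):
    """Mean over the square of the product of two polynomials {(i, j): coeff}."""
    return sum(a * b * moment(i + k) * moment(j + l)
               for (i, j), a in p.items() for (k, l), b in q.items())

k = math.sqrt(21 / 31)
S2 = {(1, 0): math.sqrt(6)}                              # sqrt(6) x
S8 = {(3, 0): 15 * k, (1, 2): 15 * k, (1, 0): -7 * k}    # sqrt(21/31) (15 rho^2 - 7) x
W = {(3, 0): 1.0, (1, 2): 1.0}                           # x^3 + x y^2

# Each coefficient is a single projection; no other polynomial enters.
a2 = mean_product(W, S2)
a8 = mean_product(W, S8)
variance = a2**2 + a8**2
```

Each coefficient is computed without reference to how many other polynomials are kept, which is the independence from J noted above. Here `a2` and `a8` are the only nonzero coefficients, and their squares sum to 3/70, the variance of Seidel coma over the square pupil.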
The mean and mean square values of the aberration function are given by
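$$ \left\langle W(\rho, \theta)\right\rangle = a_1 , \tag{10-19} $$
and
$$ \left\langle W^2(\rho, \theta)\right\rangle = \sum_{j=1}^{J} a_j^2 , \tag{10-20} $$
$$ \sigma_W^2 = \left\langle W^2(\rho, \theta)\right\rangle - \left\langle W(\rho, \theta)\right\rangle^2 = \sum_{j=2}^{J} a_j^2 . \tag{10-21} $$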
The PSF plots, representing the images of a point object in the presence of a
polynomial aberration and obtained by applying Eq. (10-7) are shown in Figure 10-6. The
full width of a square displaying the PSFs is $24\lambda F_x$. Since the piston aberration $S_1$ has
no effect on the PSF, it yields an aberration-free PSF.
The polynomial aberrations $S_2$ and $S_3$, representing the x and y wavefront tilts with
aberration coefficients $a_2$ and $a_3$, displace the PSF in the image plane along the x and y
axes, respectively. If the coefficient $a_2$ is in units of wavelength, it corresponds to a
wavefront tilt angle of $\sqrt{3/2}\,\lambda a_2/a$ about the y axis and displaces the PSF along the x
axis by $\sqrt{6}\,a_2\lambda F$. Similarly, $a_3$ corresponds to a wavefront tilt angle of $\sqrt{3/2}\,\lambda a_3/a$
about the x axis and displaces the PSF by $\sqrt{6}\,a_3\lambda F$.
The Strehl ratio, namely the central value of a PSF relative to its aberration-free
value can be obtained from Eq. (10-7) by letting x = 0 = y , i.e., from
$$ I(0, 0) = \frac{1}{16}\left|\int_{-1}^{1}\int_{-1}^{1}\exp\left[i\Phi\left(x', y'\right)\right]dx'\,dy'\right|^2 . \tag{10-22} $$
Its value for a square polynomial aberration with a sigma value of 0.1 wave is listed in
Table 10-5 and plotted in Figure 10-7. Because of the small value of the aberration, the
Strehl ratio is approximately the same for each polynomial. Both the table and the figure
illustrate that the Strehl ratio for a small aberration is independent of the type of
aberration. It is approximately given by $\exp\left(-\sigma_\Phi^2\right)$, or 0.67, where $\sigma_\Phi = 0.2\pi$.
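The approximation can be compared with a direct numerical evaluation of the Strehl ratio for one polynomial. The sketch below is a rough check under stated assumptions, not the computation behind Table 10-5: it takes the defocus polynomial $S_4$ with a sigma of 0.1 wave, writes the Strehl ratio as the squared modulus of the pupil mean of $\exp(i\Phi)$ over the unit square of half-width $1/\sqrt{2}$ (equivalent to Eq. (10-22) after rescaling the coordinates), and uses a simple midpoint rule. The function names and the grid size are illustrative.

```python
import cmath
import math

H = 1 / math.sqrt(2)  # half-width of the unit square

def s4(x, y):
    """Defocus polynomial S4 = sqrt(5/2) (3 rho^2 - 1) in Cartesian form."""
    return math.sqrt(2.5) * (3 * (x * x + y * y) - 1)

def strehl(poly, sigma_waves, n=200):
    """|<exp(i Phi)>|^2 over the unit square, midpoint rule on an n x n grid."""
    step = 2 * H / n
    total = 0j
    for i in range(n):
        x = -H + (i + 0.5) * step
        for j in range(n):
            y = -H + (j + 0.5) * step
            # Phase aberration in radians: 2 pi times the aberration in waves.
            total += cmath.exp(2j * math.pi * sigma_waves * poly(x, y))
    return abs(total / (n * n)) ** 2

strehl_defocus = strehl(s4, 0.1)
strehl_estimate = math.exp(-(0.2 * math.pi) ** 2)
```

For this case the two values agree to within about 0.01, consistent with the statement that the Strehl ratio for a small aberration depends mainly on its variance.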
Table 10-5. Strehl ratio S for square polynomial aberrations for a sigma value of 0.1
wave.
Figure 10-7. Strehl ratio S for square polynomial aberrations with a sigma value of
0.1 wave.
10.7 SEIDEL ABERRATIONS, STANDARD DEVIATION, AND STREHL RATIO
10.7.1 Defocus
We start with the defocus aberration
$$ W_d(\rho) = A_d\rho^2 . \tag{10-23} $$
From the form of the defocus orthonormal polynomial $S_4$ given in Table 10-2, it is
evident that its sigma value across a square pupil is given by
$$ \sigma_d = \frac{1}{3}\sqrt{\frac{2}{5}}\,A_d = \frac{A_d}{4.743} . \tag{10-24} $$
10.7.2 Astigmatism
Next, consider $0^\circ$ Seidel astigmatism given by
$$ S_6 = 3\sqrt{\frac{5}{2}}\,\rho^2\cos 2\theta \tag{10-26a} $$
$$ = 3\sqrt{10}\left(\rho^2\cos^2\theta - \frac{1}{2}\rho^2\right) , \tag{10-26b} $$
showing that the relative amount of defocus $\rho^2$ that balances Seidel astigmatism
$\rho^2\cos^2\theta$ is $-1/2$, as in the case of a circular, annular, or a Gaussian pupil. Thus, the
balanced astigmatism is given by
$$ W_{ba}(\rho, \theta) = A_a\left(\rho^2\cos^2\theta - \frac{1}{2}\rho^2\right) . \tag{10-27} $$
$$ \sigma_{ba} = \frac{A_a}{3\sqrt{10}} = \frac{A_a}{9.487} . \tag{10-28} $$
To obtain the sigma value of astigmatism, we write Eq. (10-25) in the form
$$ W_a(\rho, \theta) = \frac{A_a}{3\sqrt{10}}\left(S_6 + S_4\right) . \tag{10-29} $$
$$ \sigma_a = \frac{A_a}{3\sqrt{5}} = \frac{A_a}{6.708} . \tag{10-30} $$
10.7.3 Coma
Now, we consider Seidel coma:
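$$ S_8 = \sqrt{\frac{21}{31}}\left(15\rho^3\cos\theta - 7\rho\cos\theta\right) . \tag{10-32} $$
It shows that the relative amount of tilt $\rho\cos\theta$ that optimally balances Seidel coma
$\rho^3\cos\theta$ is $-7/15$ compared to $-2/3$ for a circular pupil. The balanced coma is given by
$$ W_{bc}(\rho, \theta) = A_c\left(\rho^3\cos\theta - \frac{7}{15}\rho\cos\theta\right) . \tag{10-33} $$
$$ \sigma_{bc} = \frac{1}{15}\sqrt{\frac{31}{21}}\,A_c = \frac{A_c}{12.346} . \tag{10-34} $$
To obtain the sigma value of Seidel coma, we write Eq. (10-31) in the form
$$ W_c(\rho, \theta) = \frac{A_c}{15}\left(\sqrt{\frac{31}{21}}\,S_8 + \frac{7}{\sqrt{6}}\,S_2\right) . \tag{10-35} $$
$$ \sigma_c = \sqrt{\frac{3}{70}}\,A_c = \frac{A_c}{4.831} . \tag{10-36} $$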
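$$ S_{11} = \frac{1}{2\sqrt{67}}\left(315\rho^4 - 240\rho^2 + 31\right) . \tag{10-38} $$
$$ W_{bs}(\rho) = A_s\left(\rho^4 - \frac{16}{21}\rho^2\right) . \tag{10-39} $$
It shows that spherical aberration is balanced by a relative defocus of $-16/21$. Its sigma
value is given by
$$ \sigma_{bs} = \frac{2}{315}\sqrt{67}\,A_s = \frac{A_s}{19.242} . \tag{10-40} $$
To obtain the sigma value of Seidel spherical aberration, we write Eq. (10-37) in the form
$$ W_s(\rho) = \frac{2A_s}{315}\left(\sqrt{67}\,S_{11} + 8\sqrt{10}\,S_4\right) + \text{constant} . \tag{10-41} $$
$$ \sigma_s = \frac{2}{45}\sqrt{\frac{101}{7}}\,A_s = \frac{A_s}{5.923} . \tag{10-42} $$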
The sigma values of Seidel aberrations with and without balancing are given in Table 10-6.
Table 10-6. Sigma value of a Seidel aberration with and without balancing, and P-V
numbers for a sigma value of unity, where $A_i$ is the aberration coefficient.
Figure 10-8. Strehl ratio as a function of the sigma value of a Seidel aberration with
and without balancing. (a) defocus, (b) astigmatism, (c) coma, and (d) spherical
aberration.
10.8 SUMMARY
The aberration-free PSF and OTF of a square pupil are discussed in Section 10.3.
The polynomials orthonormal over a unit square pupil, representing balanced aberrations
over such a pupil, are given through the eighth order in Tables 10-1 through 10-3 in
terms of the circle polynomials, in polar coordinates, and in Cartesian coordinates,
respectively. Each orthonormal polynomial consists of either the cosine or the sine terms,
but not both. Thus, an even j polynomial, for example, consists of only the cosine terms,
as may be seen from Table 10-1 or 10-2. This is a consequence of the four-fold symmetry
of the pupil. Since the polynomials are not separable in the polar coordinates $\rho$ and $\theta$ of
a pupil point, the polynomial numbering with two indices n and m loses significance, and
the polynomials must be numbered with a single index j. They are ordered in the same
manner as the polynomials discussed in previous chapters.
The first 45 square polynomials, i.e., up to and including the eighth order, are
illustrated by an isometric plot, an interferogram, and a PSF in Figure 10-6. The
coefficient of each orthonormal polynomial, or the sigma value of the corresponding
aberration, is one wave. Their peak-to-valley numbers for a sigma value of one wave are
given in Table 10-4 in units of wavelength. The Strehl ratio for a sigma value of $0.1\lambda$
for each aberration is given in Table 10-5 and illustrated in Figure 10-7. It shows that, for
a small aberration, the Strehl ratio can be estimated from the aberration variance. The
sigma values of the Seidel aberrations and their balanced forms are given in Table 10-6.
References
3. M. Bray, "Orthogonal polynomials: A set for square areas," Proc. SPIE 5252,
314–320 (2004).
Chapter 11
Systems with Slit Pupils
11.1 INTRODUCTION
A slit pupil is a limiting case of a rectangular pupil whose one dimension is
negligibly small. It is used in spectrographs. The power series aberrations of a
rotationally symmetric imaging system with a slit pupil are the 1D analog of the
corresponding aberration terms discussed in Chapter 1. In this chapter, we discuss the
PSF of a slit pupil and the incoherent image of a slit parallel to the slit pupil. The Strehl
ratio for and the balanced aberrations of a slit pupil are discussed. It is shown that the
balanced aberrations are represented by the Legendre polynomials [1,2]. We show further
that the slit pupil is more sensitive than a circular pupil to a primary aberration with or
without balancing, except for spherical aberration, for which it is slightly less sensitive.
$$ I(x) = \frac{1}{4}\left|\int_{-1}^{1}\exp\left[i\Phi(x')\right]\exp\left(-\pi i x'x\right)dx'\right|^2 , \tag{11-1} $$
Figure 11-1. A slit pupil of half-width a along the x axis, where $b \ll a$.
Figure 11-2. PSF of a slit pupil. (a) Irradiance distribution. (b) 1D PSF.
$$ I(x) = \left(\frac{\sin\pi x}{\pi x}\right)^2 . \tag{11-2} $$
The PSF is shown in Figure 11-2. Its value is zero wherever x is a positive or a negative
integer.
Figure 11-3. Image of an incoherent slit object formed by a system with a slit pupil.
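$$ S \equiv I(0) = \frac{1}{4}\left|\int_{-1}^{1}\exp\left[i\Phi(x')\right]dx'\right|^2 . \tag{11-3} $$
$$ S = \left|\left\langle\exp\left\{i\left[\Phi(x) - \langle\Phi\rangle\right]\right\}\right\rangle\right|^2 $$
$$ = \left|1 + i\left\langle\Phi(x) - \langle\Phi\rangle\right\rangle - \frac{1}{2}\left\langle\left[\Phi(x) - \langle\Phi\rangle\right]^2\right\rangle + \ldots\right|^2 $$
$$ \sim 1 - \left(\left\langle\Phi^2\right\rangle - \langle\Phi\rangle^2\right) $$
$$ \equiv 1 - \sigma_\Phi^2 , \tag{11-4} $$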
where the angular brackets indicate a mean value across the pupil, $\langle\Phi\rangle$ is the mean value
of the aberration function, $\langle\Phi^2\rangle$ is its mean square value, $\sigma_\Phi^2$ is its variance, and we have
neglected the higher-order terms in the power-series expansion of the exponential. The
mean value of a function $g(x)$ is given by
$$ \left\langle g(x)\right\rangle = \frac{\int_{-1}^{1} g(x)\,dx}{\int_{-1}^{1} dx} = \frac{1}{2}\int_{-1}^{1} g(x)\,dx . \tag{11-5} $$
$$ W_{cx}(x) = x^3 . \tag{11-6} $$
$$ \sigma_{cx}^2 = \left\langle\left[W_{cx}(x)\right]^2\right\rangle - \left\langle W_{cx}(x)\right\rangle^2 . \tag{11-7} $$
Figure 11-4. Unit slit pupil along the x axis inscribed inside a unit circle.
The variance can be reduced by mixing the aberration with a certain amount b of x-tilt.
Thus, the balanced aberration may be written in the form
$$ W_{bcx}(x) = x^3 + bx . \tag{11-8} $$
$$ \sigma_{bcx}^2 = \frac{1}{7} + \frac{2b}{5} + \frac{b^2}{3} . \tag{11-9} $$
The variance has a minimum value of 4/175 for a tilt of $b = -3/5$ compared to a value of
1/7 without any tilt. Thus, the variance is reduced by a factor of 25/4, or the standard
deviation of the balanced aberration is smaller by a factor of 5/2. The corresponding
balanced aberration is given by
$$ W_{bcx}(x) = x^3 - (3/5)x . \tag{11-10} $$
A balanced aberration yields a higher Strehl ratio or increases the aberration tolerance for
a given Strehl ratio.
$$ W_{bsx}(x) = x^4 + bx^2 . \tag{11-11} $$
$$ \sigma_{bsx}^2 = \frac{16}{225} + \frac{16b}{105} + \frac{4b^2}{45} . \tag{11-12} $$
Its sigma value is minimum and equal to $8/105$ for $b = -6/7$ compared to a value of
$4/15$ with no defocus. The balanced aberration is given by
$$ W_{bsx}(x) = x^4 - (6/7)x^2 . \tag{11-13} $$
It should be evident that there is no distinction between defocus and astigmatism, since
they both vary as $x^2$.
The process of minimizing the variance in this manner is called aberration balancing.
The variance of the higher-order classical aberrations, e.g., secondary coma $x^5$,
secondary spherical aberration $x^6$, tertiary coma $x^7$, and tertiary spherical aberration $x^8$,
can also be minimized by combining them with lower-degree aberrations.
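The balancing of coma and spherical aberration over a slit can be reproduced exactly with rational arithmetic. The sketch below is illustrative: it handles a single balancing term $b\,x^q$ added to $x^p$, which is all that the primary aberrations require, and uses the fact that the mean of $x^n$ over $-1 \le x \le 1$ is $1/(n+1)$ for even n and zero for odd n. The function names are not from the text.

```python
from fractions import Fraction

def slit_mean(n):
    """Mean of x**n over the slit -1 <= x <= 1."""
    return Fraction(1, n + 1) if n % 2 == 0 else Fraction(0)

def balance(p, q):
    """Return (b, minimum variance) for the aberration x**p + b * x**q."""
    cov = slit_mean(p + q) - slit_mean(p) * slit_mean(q)
    var_p = slit_mean(2 * p) - slit_mean(p) ** 2
    var_q = slit_mean(2 * q) - slit_mean(q) ** 2
    b = -cov / var_q                      # minimizes var_p + 2 b cov + b^2 var_q
    return b, var_p - cov * cov / var_q

b_coma, var_coma = balance(3, 1)          # x^3 balanced with tilt
b_sph, var_sph = balance(4, 2)            # x^4 balanced with defocus
```

The results are $b = -3/5$ with variance 4/175 for coma and $b = -6/7$ with variance $64/11025 = (8/105)^2$ for spherical aberration. Balancing a higher-order term such as $x^5$ needs more than one lower-degree term and is outside this two-term sketch.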
The Legendre polynomials $P_n(x)$ are orthogonal over the interval $[-1, 1]$, according
to [3]
$$ \frac{1}{2}\int_{-1}^{1} P_n(x)P_{n'}(x)\,dx = \frac{1}{2n+1}\,\delta_{nn'} , \tag{11-14} $$
where n is a nonnegative integer. A polynomial with an even (odd) value of n
consists of terms with even (odd) powers of x. Thus, a polynomial is symmetric for an
even n and antisymmetric for an odd n, according to
$$ P_n(-x) = (-1)^n P_n(x) . \tag{11-15} $$
Moreover,
$$ P_n(1) = 1 , \tag{11-16} $$
$$ P_n(-1) = \begin{cases} 1 & \text{for even } n \\ -1 & \text{for odd } n , \end{cases} \tag{11-17} $$
Starting with $P_0(x) = 1$ and $P_1(x) = x$, the polynomials can be obtained recursively from
the relation
It is evident from Eq. (11-19) that $P_n(x)$ is a polynomial of degree n in x, i.e., the highest
power of x in a polynomial $P_n(x)$ is n. It is perhaps worth noting that a Zernike radial
polynomial $R_{2n}^0(\rho)$ is the same as a shifted Legendre polynomial $\tilde{P}_n\left(\rho^2\right) = P_n\left(2\rho^2 - 1\right)$,
both of which are orthogonal over the interval [0, 1] [see Eq. (4-41)].
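The recursion relation itself did not survive in this copy of the text. The sketch below assumes that it is the standard three-term (Bonnet) recursion, $(n+1)P_{n+1}(x) = (2n+1)\,x\,P_n(x) - n\,P_{n-1}(x)$, and uses it to generate the polynomial coefficients exactly and to check the orthogonality relation of Eq. (11-14). The coefficient-list representation and the function names are choices made for this illustration.

```python
from fractions import Fraction

def legendre(n_max):
    """Coefficient lists (lowest power first) of P_0 .. P_n_max via Bonnet's recursion."""
    polys = [[Fraction(1)], [Fraction(0), Fraction(1)]]
    for n in range(1, n_max):
        shifted = [Fraction(0)] + polys[n]            # coefficients of x * P_n
        prev = polys[n - 1] + [Fraction(0)] * 2       # P_{n-1}, padded to the same length
        polys.append([((2 * n + 1) * a - n * b) / (n + 1)
                      for a, b in zip(shifted, prev)])
    return polys[: n_max + 1]

def slit_mean_product(p, q):
    """(1/2) * integral over [-1, 1] of p(x) q(x), computed term by term."""
    total = Fraction(0)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if (i + j) % 2 == 0:                      # odd powers average to zero
                total += a * b / (i + j + 1)
    return total

P = legendre(8)
```

For example, `P[4]` holds the coefficients of $(35x^4 - 30x^2 + 3)/8$, the mean of $P_3^2$ over the slit is 1/7, and the mean of $P_2 P_4$ is zero.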
$$ L_n(x) = \sqrt{2n+1}\,P_n(x) . \tag{11-20} $$
$$ \frac{1}{2}\int_{-1}^{1} L_n(x)L_{n'}(x)\,dx = \delta_{nn'} . \tag{11-21} $$
The first few $L_n(x)$ polynomials are listed in Table 11-1. The standard deviation of
each polynomial [other than the piston $L_0(x)$] is unity. The mean value of each polynomial
[other than $L_0(x)$] is zero, as may be seen by letting $n' = 0$ in Eq. (11-21). It is easy to see
this explicitly for a polynomial with an odd value of n, since the integral of an odd function
over symmetric limits is zero. For an even value of n, the piston term in the polynomial
makes its mean value zero. For example, the balanced x-spherical aberration is
$x^4 - (6/7)x^2$ with a mean value of $-3/35$. The piston term of $3(3/8)$ in $L_4(x)$ makes its mean
value zero. The slit pupil is more sensitive to a Seidel aberration with or without balancing
compared to a circular pupil, except for spherical aberration for which it is slightly less
sensitive.
Figure 11-5. Legendre polynomials $P_n(x)$ as a function of x. (a) Even n and (b) odd
n.
n   Aberration                       $L_n(x)$
0   Piston                           $1$
1   Tilt                             $\sqrt{3}\,x$
2   Defocus                          $\left(\sqrt{5}/2\right)\left(3x^2 - 1\right)$
3   Primary coma                     $\left(\sqrt{7}/2\right)\left(5x^3 - 3x\right)$
4   Primary spherical aberration     $(3/8)\left(35x^4 - 30x^2 + 3\right)$
5   Secondary coma                   $\left(\sqrt{11}/8\right)\left(63x^5 - 70x^3 + 15x\right)$
6   Secondary spherical aberration   $\left(\sqrt{13}/16\right)\left(231x^6 - 315x^4 + 105x^2 - 5\right)$
7   Tertiary coma                    $\left(\sqrt{15}/16\right)\left(429x^7 - 693x^5 + 315x^3 - 35x\right)$
8   Tertiary spherical aberration    $\left(\sqrt{17}/128\right)\left(6435x^8 - 12012x^6 + 6930x^4 - 1260x^2 + 35\right)$
Table 11-2. Standard deviation $\sigma$ of a primary aberration for a slit pupil, where $A_i$
is its aberration coefficient.
Aberration   $\sigma$
Tilt         $A_t/\sqrt{3} = A_t/1.732$
Coma         $A_c/\sqrt{7} = A_c/2.646$
11.6 SUMMARY
A slit pupil is a limiting case of a rectangular pupil whose one dimension is
negligibly small, as illustrated in Figure 11-1. Its PSF is shown in Figure 11-2. The image
of an incoherent slit object parallel to the slit pupil is shown in Figure 11-3. The balanced
aberrations for a slit pupil are the Legendre polynomials. We have written them in an
orthonormal form, as in Eq. (11-20). They are listed in Table 11-1 up to the eighth order
and plotted in Figure 11-5. The sigma value of a 1D primary aberration with and without
balancing is listed in Table 11-2. It is shown that a slit pupil is more sensitive to a
primary aberration with or without balancing, except for spherical aberration for which it
is slightly less sensitive.
References
Chapter 12
Use of Zernike Circle Polynomials for
Noncircular Pupils
12.1 INTRODUCTION
The orthonormal polynomials for various pupils discussed in the preceding chapters
represent balanced aberrations for those pupils, just as the Zernike circle polynomials
(discussed in Chapter 4) do for a circular pupil. In this chapter, we consider the use of
circle polynomials for the analysis of a noncircular wavefront. Since the circle
polynomials form a complete set, any wavefront, regardless of the shape of the pupil
(which defines the perimeter of the wavefront), can be expanded in terms of them.
Moreover, since each orthonormal polynomial is a linear combination of the circle
polynomials [see Eq. (3-18)], the wavefront fitting with the former set of polynomials is
as good as that with the latter. However, we illustrate the pitfalls of using circle
polynomials for a noncircular pupil by considering an annular and a hexagonal pupil
[1,2].
It is shown that, unlike the orthonormal coefficients, the circle coefficients generally
change as the number of polynomials used in the expansion changes. Although the
wavefront fit with a certain number of circle polynomials is the same as that with the
corresponding orthonormal polynomials, the piston circle coefficient does not represent
the mean value of the aberration function, and the sum of the squares of the other
coefficients does not yield its variance. While the interferometer setting errors of tip, tilt,
and defocus from a 4-circle-polynomial expansion are the same as those from the
orthonormal polynomial expansion, these errors, when obtained from, say, an
11-circle-polynomial expansion and removed from the aberration function, yield wrong
polishing when the residual aberration function is zeroed out. If the common practice of
defining the center of an interferogram and drawing a circle around it is followed, and the
circle coefficients are determined in the same manner as for a circular interferogram, then
the circle coefficients of a noncircular interferogram do not yield a correct representation of
the aberration function. Moreover, in this case, some of the higher-order coefficients of
aberrations that are nonexistent in the aberration function are also nonzero. Finally, the
circle coefficients, however obtained, do not represent coefficients of the balanced
aberrations for a noncircular pupil. Such results are illustrated analytically and
numerically by considering annular and hexagonal Seidel aberration functions as
examples.
$$ \hat{W}(x, y) = \sum_{j=1}^{J} a_j F_j(x, y) , \tag{12-1} $$
where $\hat{W}(x, y)$ is the best-fit estimate of the function with J polynomials, and $a_j$ is the
coefficient of the polynomial $F_j(x, y)$. The orthonormality of the polynomials across the
noncircular pupil is described by
$$ \frac{1}{A}\int_{\text{pupil}} F_j(x, y)F_{j'}(x, y)\,dx\,dy = \delta_{jj'} , \tag{12-2} $$
$$ a_j = \frac{1}{A}\int_{\text{pupil}} W(x, y)F_j(x, y)\,dx\,dy . \tag{12-3} $$
It is evident that their value does not depend on the number of polynomials J used in the
expansion.
Letting $F_1(x, y) = 1$, it is easy to see from Eq. (12-2) that the mean value of a
polynomial $F_{j \ne 1}(x, y)$ across the pupil is zero. Hence, the mean and the mean square
values of the estimated aberration function are given by
$$ \left\langle\hat{W}\right\rangle = a_1 \tag{12-4} $$
and
$$ \left\langle\hat{W}^2(x, y)\right\rangle = \sum_{j=1}^{J} a_j^2 , \tag{12-5} $$
$$ \sigma_{\hat{W}}^2 = \left\langle\hat{W}^2(x, y)\right\rangle - \left\langle\hat{W}(x, y)\right\rangle^2 = \sum_{j=2}^{J} a_j^2 , \tag{12-6} $$
where $\sigma_{\hat{W}}$ is its standard deviation. The number of polynomials J used in the expansion
to estimate the aberration function is increased until $\sigma_{\hat{W}}$ approaches the true value as
determined from the ray-trace or interferometric data within a certain prespecified
tolerance.
$$ F_j(x, y) = \sum_{i=1}^{J} M_{ji}Z_i(x, y) , \tag{12-7} $$
or
$$ \left\{F_j\right\} = M\left\{Z_j\right\} , \tag{12-8} $$
where $M_{ji}$ are the elements of the lower triangular conversion matrix M. The estimated
aberration function can accordingly be expanded in terms of the circle polynomials in the
form
$$ \hat{W}(x, y) = \sum_{j=1}^{J} \hat{b}_j Z_j(x, y) , \tag{12-9} $$
$$ \frac{1}{\pi}\int_{x^2 + y^2 \le 1} Z_j(x, y)Z_{j'}(x, y)\,dx\,dy = \delta_{jj'} , \tag{12-10a} $$
$$ \frac{1}{\pi}\int_0^1\int_0^{2\pi} Z_j(\rho, \theta)Z_{j'}(\rho, \theta)\,\rho\,d\rho\,d\theta = \delta_{jj'} . \tag{12-10b} $$
$$ \hat{W}(x, y) = \sum_{j=1}^{J}\sum_{i=j}^{J} a_i M_{ij} Z_j(x, y) . \tag{12-11} $$
It is clear that the value of a circle coefficient $\hat{b}_j$ depends on the number of polynomials J
used in the expansion. Moreover, it is a linear combination of the orthonormal
coefficients, just as an orthonormal polynomial is a linear combination of the circle
polynomials. Equation (12-12) can be written in a matrix form as
$$ \hat{b} = M^T a , \tag{12-13} $$
where $a$ and $\hat{b}$ are the column vectors representing the orthonormal and the Zernike
coefficients, respectively, and $M^T$ is the transpose of the conversion matrix M. Thus, the
matrix that is used to obtain the orthonormal polynomials from the circle polynomials is
also used to obtain the circle coefficients from the orthonormal coefficients. The
transpose of a matrix is obtained by interchanging its rows and columns. Since M is a
lower triangular matrix, $M^T$ is an upper triangular matrix. Multiplying both sides of Eq.
(12-13) by the inverse $\left(M^T\right)^{-1}$ of $M^T$, we obtain
$$ a = \left(M^T\right)^{-1}\hat{b} . \tag{12-14} $$
Accordingly, if the circle coefficients are known, the orthonormal coefficients can be
obtained from them.
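A small numerical illustration of Eqs. (12-13) and (12-14) may help. The sketch below assumes a square pupil and uses the first four rows of the conversion matrix M as read from Table 10-1; the orthonormal coefficients are made-up values in waves, and the function names are hypothetical.

```python
import math

# Rows of M for S1..S4 in terms of Z1..Z4 (square pupil, Table 10-1).
M = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, math.sqrt(1.5), 0.0, 0.0],
    [0.0, 0.0, math.sqrt(1.5), 0.0],
    [math.sqrt(2.5) / 2, 0.0, 0.0, math.sqrt(7.5) / 2],
]

def circle_from_orthonormal(M, a):
    """Eq. (12-13): b_hat = M^T a, using the first len(a) polynomials."""
    J = len(a)
    return [sum(M[i][j] * a[i] for i in range(J)) for j in range(J)]

def orthonormal_from_circle(M, b_hat):
    """Eq. (12-14): solve M^T a = b_hat by back substitution (M^T is upper triangular)."""
    J = len(b_hat)
    a = [0.0] * J
    for j in reversed(range(J)):
        known = sum(M[i][j] * a[i] for i in range(j + 1, J))
        a[j] = (b_hat[j] - known) / M[j][j]
    return a

a = [0.0, 0.2, 0.0, 0.1]                      # made-up orthonormal coefficients
b_hat_3 = circle_from_orthonormal(M, a[:3])   # expansion with J = 3
b_hat_4 = circle_from_orthonormal(M, a)       # expansion with J = 4
a_back = orthonormal_from_circle(M, b_hat_4)
```

The piston circle coefficient is zero for J = 3 but becomes $a_4 M_{41}$ for J = 4, even though the mean of the aberration function has not changed. This is the dependence on J described above. The back substitution recovers the original orthonormal coefficients.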
If the orthonormal coefficients are not known, the circle coefficients b̂ j can be
obtained by a least squares fit. Suppose the aberration values are known over a certain
domain by way of interferometry at N data points. Equation (12-9) can be written in
matrix form
Sˆ = Zbˆ , (12-15)
bˆ = Z 1Sˆ , (12-16)
where Z 1 is a generalized inverse of the Z matrix. Of course, this procedure can also be
used to determine the orthonormal coefficients by replacing the circle polynomials with
the orthonormal polynomials. Except for any numerical error because of the finite
number N of the data points, the b̂ -coefficients given by Eq. (12-16) are the same as
those given by Eq. (12-13).
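The least-squares step of Eqs. (12-15) and (12-16) can be sketched as follows. This is an illustration under stated assumptions rather than a prescribed procedure: the data points are drawn at random over an annulus with a hypothetical obscuration ratio, the "measured" aberration values are synthetic and noise free, and only the first four circle polynomials are used.

```python
import numpy as np


def zernike_first_four(x, y):
    """Columns Z1..Z4 (piston, x tilt, y tilt, defocus) evaluated at the data points."""
    r2 = x ** 2 + y ** 2
    return np.column_stack([np.ones_like(x), 2.0 * x, 2.0 * y,
                            np.sqrt(3.0) * (2.0 * r2 - 1.0)])


rng = np.random.default_rng(0)
eps = 0.5                                   # hypothetical obscuration ratio
n_points = 2000                             # number N of data points

# Uniform sampling of the annulus eps <= rho <= 1.
rho = np.sqrt(rng.uniform(eps ** 2, 1.0, n_points))
theta = rng.uniform(0.0, 2.0 * np.pi, n_points)
x, y = rho * np.cos(theta), rho * np.sin(theta)

Z = zernike_first_four(x, y)                # N x J matrix of Eq. (12-15)
b_true = np.array([0.1, 0.4, -0.2, 0.3])    # synthetic circle coefficients
S = Z @ b_true                              # synthetic aberration values

# Two equivalent routes to the generalized inverse of Eq. (12-16).
b_fit, *_ = np.linalg.lstsq(Z, S, rcond=None)
b_pinv = np.linalg.pinv(Z) @ S
print(b_fit, b_pinv)
```

With noisy data the two routes still agree with each other, but the recovered coefficients then differ from the true ones by an amount that depends on N and on the noise level.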
If the practice of drawing a unit circle around an interferogram and determining the
Zernike coefficients for a circular pupil is extended to a noncircular wavefront, the
coefficients thus obtained will be given by
\[ b_j = \frac{1}{A} \int_{\text{pupil}} W(x,y)\, Z_j(x,y)\, dx\, dy . \tag{12-17} \]
The circle polynomials in Eq. (12-17) are implicitly assumed to be orthonormal over the
noncircular pupil. The value of a circle coefficient b j does not depend on the number of
polynomials used in the expansion. Substituting Eq. (12-1) for the estimated aberration
function Wˆ ( x , y ) in terms of the orthonormal polynomials, we obtain
\[ b_j = \sum_{j'=1}^{J} a_{j'} \frac{1}{A} \int_{\text{pupil}} Z_j(x,y)\, F_{j'}(x,y)\, dx\, dy = \sum_{j'=1}^{J} a_{j'} \langle Z_j | F_{j'} \rangle , \tag{12-18} \]
or in a matrix form
\[ b = C^{ZF} a , \tag{12-19} \]
To relate the b̂ - and the b-circle coefficients, we equate the right-hand sides of Eqs.
(12-1) and (12-9), multiply both sides by Z j ¢ , and integrate over the domain of the
noncircular pupil. Thus,
\[ \sum_{j=1}^{J} \hat{b}_j Z_j(x,y) = \sum_{j=1}^{J} a_j F_j(x,y) \tag{12-20} \]
and
\[ \sum_{j=1}^{J} \hat{b}_j \langle Z_{j'} | Z_j \rangle = \sum_{j=1}^{J} a_j \langle Z_{j'} | F_j \rangle , \tag{12-21} \]
\[ C^{ZZ} \hat{b} = C^{ZF} a = b , \tag{12-22} \]
where we have utilized Eq. (12-19). From Eqs. (12-13) and (12-22), it is evident that
\[ C^{ZF} = C^{ZZ} M^T . \tag{12-23} \]
\[ c_{jj'} = \frac{1}{A} \int_{\text{pupil}} Z_j(x,y)\, Z_{j'}(x,y)\, dx\, dy \tag{12-24} \]
and
\[ d_{jj'} = \frac{1}{A} \int_{\text{pupil}} Z_j(x,y)\, F_{j'}(x,y)\, dx\, dy , \tag{12-25} \]
where $\epsilon \le \rho \le 1$, n and m are nonnegative integers, and $n - m \ge 0$ and even. The annular
polynomials are orthonormal across the annular pupil according to
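\[ \frac{1}{\pi (1-\epsilon^2)} \int_{\epsilon}^{1} \int_0^{2\pi} A_j(\rho,\theta;\epsilon)\, A_{j'}(\rho,\theta;\epsilon)\, \rho\, d\rho\, d\theta = \delta_{jj'} . \tag{12-27} \]
\[ \{A_j\} = M \{Z_j\} , \tag{12-28} \]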
The mean value and the variance of the estimated function are accordingly given by Eqs.
(12-4) and (12-6).
12.3.1 Zernike Circle Coefficients in Terms of the Annular Coefficients
Table 12-1 lists the first 11 annular polynomials, as obtained from the annular-
polynomial Tables 5-3 and 5-4. They are given in terms of the circle polynomials in
Table 12-2. The nonzero elements of an 11 × 11 conversion matrix, as obtained from Table
12-2, are listed in Table 12-3. The transpose matrix $M^T$ can be obtained easily by
interchanging the rows and columns of M. The nonzero elements of the 11 × 11 matrices
$C^{ZZ}$ and $C^{ZF}$ are given in Tables 12-4 and 12-5, respectively.
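\[
\begin{pmatrix} \hat{b}_1 \\ \hat{b}_2 \\ \hat{b}_3 \\ \hat{b}_4 \end{pmatrix}
=
\begin{pmatrix}
1 & 0 & 0 & -\sqrt{3}\,\epsilon^2 (1-\epsilon^2)^{-1} \\
0 & (1+\epsilon^2)^{-1/2} & 0 & 0 \\
0 & 0 & (1+\epsilon^2)^{-1/2} & 0 \\
0 & 0 & 0 & (1-\epsilon^2)^{-1}
\end{pmatrix}
\begin{pmatrix} a_1 \\ a_2 \\ a_3 \\ a_4 \end{pmatrix}
=
\begin{pmatrix}
a_1 - \sqrt{3}\,\epsilon^2 (1-\epsilon^2)^{-1} a_4 \\
(1+\epsilon^2)^{-1/2} a_2 \\
(1+\epsilon^2)^{-1/2} a_3 \\
(1-\epsilon^2)^{-1} a_4
\end{pmatrix}
\tag{12-31}
\]
or
\[ \hat{b}_1 = a_1 - \sqrt{3}\,\epsilon^2 (1-\epsilon^2)^{-1} a_4 , \tag{12-32a} \]
\[ \hat{b}_2 = (1+\epsilon^2)^{-1/2} a_2 , \tag{12-32b} \]
\[ \hat{b}_3 = (1+\epsilon^2)^{-1/2} a_3 , \tag{12-32c} \]
\[ \hat{b}_4 = (1-\epsilon^2)^{-1} a_4 . \tag{12-32d} \]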
These coefficients represent the Zernike piston, tip, tilt, and defocus coefficients.
To see how these coefficients change with the number of polynomials used in the
expansion, we consider an expansion using the first 11 circle polynomials. The
coefficients are now given by
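\[ \hat{b}_1 = a_1 - \sqrt{3}\,\epsilon^2 (1-\epsilon^2)^{-1} a_4 + \sqrt{5}\,\epsilon^2 (1+\epsilon^2)(1-\epsilon^2)^{-2} a_{11} , \tag{12-33a} \]
\[ \hat{b}_2 = (1+\epsilon^2)^{-1/2} a_2 - \left( 2\sqrt{2}\,\epsilon^4 / B \right) a_8 , \tag{12-33b} \]
\[ \hat{b}_3 = (1+\epsilon^2)^{-1/2} a_3 - \left( 2\sqrt{2}\,\epsilon^4 / B \right) a_7 , \tag{12-33c} \]
\[ \hat{b}_4 = (1-\epsilon^2)^{-1} a_4 - \sqrt{15}\,\epsilon^2 (1-\epsilon^2)^{-2} a_{11} , \tag{12-33d} \]
\[ \hat{b}_5 = (1+\epsilon^2+\epsilon^4)^{-1/2} a_5 , \tag{12-33e} \]
\[ \hat{b}_6 = (1+\epsilon^2+\epsilon^4)^{-1/2} a_6 , \tag{12-33f} \]
\[ \hat{b}_7 = \left[ (1+\epsilon^2) / B \right] a_7 , \tag{12-33g} \]
\[ \hat{b}_8 = \left[ (1+\epsilon^2) / B \right] a_8 , \tag{12-33h} \]
\[ \hat{b}_9 = (1+\epsilon^2+\epsilon^4+\epsilon^6)^{-1/2} a_9 , \tag{12-33i} \]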
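[Table 12-1: first 11 orthonormal annular polynomials $A_j(\rho,\theta;\epsilon)$]
j    n    m    $A_j(\rho,\theta;\epsilon)$    Aberration name
1    0    0    $1$    Piston
2    1    1    $2\left[\rho/(1+\epsilon^2)^{1/2}\right]\cos\theta$    x tilt
3    1    1    $2\left[\rho/(1+\epsilon^2)^{1/2}\right]\sin\theta$    y tilt
4    2    0    $\sqrt{3}\,(2\rho^2 - 1 - \epsilon^2)/(1-\epsilon^2)$    Defocus
5    2    2    $\sqrt{6}\left[\rho^2/(1+\epsilon^2+\epsilon^4)^{1/2}\right]\sin 2\theta$    45° Primary astigmatism
6    2    2    $\sqrt{6}\left[\rho^2/(1+\epsilon^2+\epsilon^4)^{1/2}\right]\cos 2\theta$    0° Primary astigmatism
7    3    1    $\sqrt{8}\,\dfrac{3(1+\epsilon^2)\rho^3 - 2(1+\epsilon^2+\epsilon^4)\rho}{(1-\epsilon^2)\left[(1+\epsilon^2)(1+4\epsilon^2+\epsilon^4)\right]^{1/2}}\sin\theta$    Primary y coma
8    3    1    $\sqrt{8}\,\dfrac{3(1+\epsilon^2)\rho^3 - 2(1+\epsilon^2+\epsilon^4)\rho}{(1-\epsilon^2)\left[(1+\epsilon^2)(1+4\epsilon^2+\epsilon^4)\right]^{1/2}}\cos\theta$    Primary x coma
9    3    3    $\sqrt{8}\left[\rho^3/(1+\epsilon^2+\epsilon^4+\epsilon^6)^{1/2}\right]\sin 3\theta$
10    3    3    $\sqrt{8}\left[\rho^3/(1+\epsilon^2+\epsilon^4+\epsilon^6)^{1/2}\right]\cos 3\theta$
11    4    0    $\sqrt{5}\left[6\rho^4 - 6(1+\epsilon^2)\rho^2 + 1 + 4\epsilon^2 + \epsilon^4\right]/(1-\epsilon^2)^2$    Primary spherical aberration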
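[Table 12-2: annular polynomials in terms of the circle polynomials]
$A_1 = Z_1$
$A_2 = (1+\epsilon^2)^{-1/2} Z_2$
$A_3 = (1+\epsilon^2)^{-1/2} Z_3$
$A_4 = (1-\epsilon^2)^{-1} \left( -\sqrt{3}\,\epsilon^2 Z_1 + Z_4 \right)$
$A_5 = (1+\epsilon^2+\epsilon^4)^{-1/2} Z_5$
$A_6 = (1+\epsilon^2+\epsilon^4)^{-1/2} Z_6$
$A_7 = B^{-1} \left[ -2\sqrt{2}\,\epsilon^4 Z_3 + (1+\epsilon^2) Z_7 \right]$
$A_8 = B^{-1} \left[ -2\sqrt{2}\,\epsilon^4 Z_2 + (1+\epsilon^2) Z_8 \right]$
$A_9 = (1+\epsilon^2+\epsilon^4+\epsilon^6)^{-1/2} Z_9$
$A_{10} = (1+\epsilon^2+\epsilon^4+\epsilon^6)^{-1/2} Z_{10}$
$A_{11} = (1-\epsilon^2)^{-2} \left[ \sqrt{5}\,\epsilon^2 (1+\epsilon^2) Z_1 - \sqrt{15}\,\epsilon^2 Z_4 + Z_{11} \right]$
$B = (1-\epsilon^2) \left[ (1+\epsilon^2)(1+4\epsilon^2+\epsilon^4) \right]^{1/2}$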
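[Table 12-3: nonzero elements of the 11 × 11 conversion matrix M]
$M_{11} = 1$
$M_{22} = (1+\epsilon^2)^{-1/2} = M_{33}$
$M_{41} = -\sqrt{3}\,\epsilon^2 (1-\epsilon^2)^{-1}$
$M_{44} = (1-\epsilon^2)^{-1}$
$M_{55} = (1+\epsilon^2+\epsilon^4)^{-1/2} = M_{66}$
$M_{73} = -2\sqrt{2}\,\epsilon^4 B^{-1} = M_{82}$
$M_{77} = (1+\epsilon^2) B^{-1} = M_{88}$
$M_{99} = (1+\epsilon^2+\epsilon^4+\epsilon^6)^{-1/2} = M_{10,10}$
$M_{11,1} = \sqrt{5}\,\epsilon^2 (1+\epsilon^2)(1-\epsilon^2)^{-2}$
$M_{11,4} = -\sqrt{15}\,\epsilon^2 (1-\epsilon^2)^{-2}$
$M_{11,11} = (1-\epsilon^2)^{-2}$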
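[Table 12-4: nonzero elements of the 11 × 11 matrix $C^{ZZ}$]
$c_{11} = 1$
$c_{14} = \sqrt{3}\,\epsilon^2 = c_{41}$
$c_{1,11} = -\sqrt{5}\,\epsilon^2 (1 - 2\epsilon^2) = c_{11,1}$
$c_{22} = 1 + \epsilon^2 = c_{33}$
$c_{28} = 2\sqrt{2}\,\epsilon^4 = c_{82} = c_{37} = c_{73}$
$c_{44} = 1 - 2\epsilon^2 + 4\epsilon^4$
$c_{4,11} = \sqrt{15}\,\epsilon^2 (1 - 3\epsilon^2 + 3\epsilon^4) = c_{11,4}$
$c_{55} = 1 + \epsilon^2 + \epsilon^4 = c_{66}$
$c_{77} = 1 + \epsilon^2 - 7\epsilon^4 + 9\epsilon^6 = c_{88}$
$c_{99} = 1 + \epsilon^2 + \epsilon^4 + \epsilon^6 = c_{10,10}$
$c_{11,11} = 1 - 4\epsilon^2 + 26\epsilon^4 - 54\epsilon^6 + 36\epsilon^8$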
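[Table 12-5: nonzero elements of the 11 × 11 matrix $C^{ZF}$]
$d_{11} = 1$
$d_{22} = (1+\epsilon^2)^{1/2} = d_{33}$
$d_{41} = \sqrt{3}\,\epsilon^2$
$d_{44} = (1 - 2\epsilon^2 + \epsilon^4)(1-\epsilon^2)^{-1}$
$d_{55} = (1+\epsilon^2+\epsilon^4)^{1/2} = d_{66}$
$d_{73} = 2\sqrt{2}\,\epsilon^4 (1+\epsilon^2)^{-1/2} = d_{82}$
$d_{77} = (1-\epsilon^2)(1+4\epsilon^2+\epsilon^4)^{1/2}(1+\epsilon^2)^{-1/2} = d_{88}$
$d_{99} = (1+\epsilon^2+\epsilon^4+\epsilon^6)^{1/2} = d_{10,10}$
$d_{11,4} = \sqrt{15}\,\epsilon^2 (1-\epsilon^2)$
$d_{11,11} = (1-\epsilon^2)^2$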
\[ \hat{b}_{10} = (1+\epsilon^2+\epsilon^4+\epsilon^6)^{-1/2} a_{10} , \tag{12-33j} \]
\[ \hat{b}_{11} = (1-\epsilon^2)^{-2} a_{11} , \tag{12-33k} \]
where
\[ B = (1-\epsilon^2) \left[ (1+\epsilon^2)(1+4\epsilon^2+\epsilon^4) \right]^{1/2} . \tag{12-34} \]
It is evident that all of the first four coefficients change, and $\hat{b}_j = M_{jj} a_j$ for $5 \le j \le 11$.
The Zernike astigmatism coefficients $\hat{b}_5$ and $\hat{b}_6$ are smaller than the corresponding
annular coefficients $a_5$ and $a_6$ by a factor of $(1+\epsilon^2+\epsilon^4)^{1/2}$. However, the Zernike
spherical aberration coefficient $\hat{b}_{11}$ is larger than the corresponding annular coefficient
$a_{11}$ by a factor of $(1-\epsilon^2)^{-2}$. For example, when $\epsilon = 0.5$, the astigmatism coefficients are
smaller by a factor of 1.1456, and the spherical aberration coefficient is larger by a factor
of 1.7778.
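The two numerical factors quoted above follow directly from the diagonal elements of M. A small Python sketch, using only the standard library and hypothetical function names, reproduces them and can be reused for other obscuration ratios.

```python
import math


def astigmatism_factor(eps):
    """Factor (1 + eps^2 + eps^4)^(1/2) by which b_hat_5, b_hat_6 are smaller than a_5, a_6."""
    return math.sqrt(1.0 + eps ** 2 + eps ** 4)


def spherical_factor(eps):
    """Factor (1 - eps^2)^(-2) by which b_hat_11 is larger than a_11."""
    return (1.0 - eps ** 2) ** -2


for eps in (0.0, 0.25, 0.5, 0.75):
    print(eps, round(astigmatism_factor(eps), 4), round(spherical_factor(eps), 4))
```

Both factors reduce to unity for a circular pupil, as expected, since the annular polynomials then reduce to the circle polynomials.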
We note that the mean value of the aberration function is given by the annular piston
coefficient a1 . However, the value of the corresponding Zernike circle coefficient b̂1
depends on the number of polynomials used in the expansion, and it does not equal a1 ;
therefore, it does not represent the mean value. An orthonormal annular coefficient (other
than piston) represents the standard deviation of the corresponding aberration term in the
expansion, but a Zernike circle coefficient generally does not. The variance of the
aberration function cannot be obtained by summing the squares of the Zernike circle
coefficients b̂ j (excluding the piston coefficient). The circle coefficients b j can be
obtained from the b̂ j - or the a j -coefficients, according to Eq. (12-22). They are
considered in Section 12.3.5 for a Seidel aberration function.
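\[ \hat{W}(x,y) = a_1 A_1 + a_2 A_2 + a_3 A_3 + a_4 A_4 \tag{12-35a} \]
\[ = a_1 + 2 (1+\epsilon^2)^{-1/2} a_2 x + 2 (1+\epsilon^2)^{-1/2} a_3 y + \sqrt{3}\,(1-\epsilon^2)^{-1} a_4 \left[ -\epsilon^2 + (2\rho^2 - 1) \right] . \tag{12-35b} \]
\[ = \hat{b}_1 + 2\hat{b}_2 x + 2\hat{b}_3 y + \sqrt{3}\,\hat{b}_4 (2\rho^2 - 1) . \tag{12-36b} \]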
In Eqs. (12-35) and (12-36), we have omitted the arguments of the annular and circle
polynomials for simplicity. The coefficients of x, y, and $\rho^2$ representing the tip, tilt, and
defocus values obtained from the circle coefficients are the same as those obtained from
the orthonormal coefficients. The estimated piston from the Zernike expansion of Eq.
(12-36b) is $\hat{b}_1 - \sqrt{3}\,\hat{b}_4$, which is the same as $a_1 - \sqrt{3}\,(1+\epsilon^2)(1-\epsilon^2)^{-1} a_4$ from the orthonormal
expansion in Eq. (12-35b). Accordingly, the aberration function obtained by subtracting
the piston, tip, tilt, and defocus values from the measured aberration function is
independent of the nature of the polynomials used in the expansion, so long as the
nonorthogonal expansion is in terms of only the first four circle polynomials [as may be
seen, for example, by comparing Eqs. (12-33a–d) with Eqs. (12-32a–d)]. In an
interferometer, the tip and tilt represent the lateral errors and defocus represents the
longitudinal error in the location of a point source illuminating an optical surface under
test from its center of curvature. These four terms are generally removed from the
aberration function and the remaining function is given to the optician to zero out from
the optical surface by polishing.
Although the wavefront fit with a certain number of circle polynomials is as good as
the fit with a corresponding set of the orthonormal polynomials, there are pitfalls in using
the circle polynomials. Since the circle polynomials are not orthogonal over the
noncircular pupil, the advantages of orthogonality and aberration balancing are lost. Since
they do not represent the balanced classical aberrations for a noncircular pupil, the
Zernike coefficients b̂ j do not have the physical significance of their orthonormal
counterparts. For example, the mean value of a circle polynomial across a noncircular
pupil is not zero, the Zernike piston coefficient does not represent the mean value of the
aberration, the other Zernike coefficients do not represent the standard deviation of the
corresponding aberration terms, and the variance of the aberration is not equal to the sum
of the squares of these other coefficients. Moreover, the value of a Zernike coefficient
generally changes as the number of polynomials used in the expansion of an aberration
function changes. Hence, the circle polynomials are not appropriate for the analysis of a
noncircular wavefront. Of course, wavefront fitting with the improperly calculated
Zernike coefficients b j by using Eq. (12-17) will be in error, as demonstrated in Section
12.3.4 for a Seidel aberration function.
The aberration function when approximated by only the first four annular
polynomials can be written
\[ \hat{W}(\rho,\theta;\epsilon) = a_1 A_1 + a_2 A_2 + a_4 A_4 , \tag{12-38} \]
\[ a_1 = (1+\epsilon^2)(2A_d + A_a)/4 + (1+\epsilon^2+\epsilon^4) A_s/3 , \tag{12-39a} \]
\[ a_2 = (1+\epsilon^2)^{1/2} A_t/2 + (1+\epsilon^2+\epsilon^4)(1+\epsilon^2)^{-1/2} A_c/3 , \tag{12-39b} \]
\[ a_4 = (1-\epsilon^2)(2A_d + A_a)/4\sqrt{3} + (1-\epsilon^4) A_s/2\sqrt{3} . \tag{12-39c} \]
It should be evident that the coefficient $a_3$ of the annular polynomial $A_3$ varying as $\sin\theta$
is zero. The mean value of the estimated aberration function is given by $a_1$, and its
variance is given by
\[ \sigma_{\hat{W}}^2 = a_2^2 + a_4^2 . \tag{12-40} \]
\[ a_6 = \frac{1}{2\sqrt{6}} (1+\epsilon^2+\epsilon^4)^{1/2} A_a , \tag{12-39d} \]
\[ a_8 = \frac{1-\epsilon^2}{6\sqrt{2}} \left( \frac{1+4\epsilon^2+\epsilon^4}{1+\epsilon^2} \right)^{1/2} A_c , \tag{12-39e} \]
\[ a_{11} = \frac{(1-\epsilon^2)^2}{6\sqrt{5}} A_s . \tag{12-39f} \]
Next we expand the Seidel aberration function in terms of the circle polynomials. A
4-polynomial expansion can be obtained from Eqs. (12-32) and (12-39) in the form
where
\[ \hat{b}_1 = (2A_d + A_a)/4 + \left[ 1 - \epsilon^2 (1+\epsilon^2)/2 \right] A_s/3 , \tag{12-44a} \]
\[ \hat{b}_2 = a_2 / (1+\epsilon^2)^{1/2} , \tag{12-44b} \]
\[ \hat{b}_4 = a_4 / (1-\epsilon^2) . \tag{12-44c} \]
12.3.4 Application to an Annular Seidel Aberration Function 323
The estimated aberration function in Eq. (12-43) is exactly the same as that in Eq. (12-38),
and the values of piston, x-tilt, and defocus are exactly the same as those obtained
from Eqs. (12-39a–c). It should be evident, however, that its mean value is not given by
$\hat{b}_1$. Moreover, since an expansion coefficient does not represent the standard deviation of
the corresponding aberration polynomial term, its variance is not given by $\hat{b}_2^2 + \hat{b}_4^2$.
From Eqs. (12-33) and (12-39), an 11-polynomial Zernike circle expansion can be
written
where
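\[ \hat{b}_1 = (2A_d + A_a)/4 + A_s/3 , \tag{12-46a} \]
\[ \hat{b}_2 = A_t/2 + A_c/3 , \tag{12-46b} \]
\[ \hat{b}_4 = (2A_d + A_a)/4\sqrt{3} + A_s/2\sqrt{3} , \tag{12-46c} \]
\[ \hat{b}_6 = A_a/2\sqrt{6} , \tag{12-46d} \]
\[ \hat{b}_8 = A_c/6\sqrt{2} , \tag{12-46e} \]
\[ \hat{b}_{11} = A_s/6\sqrt{5} . \tag{12-46f} \]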
As in the case of annular polynomials, the eleven circle polynomials also represent the
Seidel aberration function exactly. The expansion coefficients can also be obtained by
inspection of the aberration function and the form of the circle polynomials. Indeed
because of the form of the Seidel aberration function, the circle coefficients are
independent of the obscuration ratio $\epsilon$. Each $\hat{b}$-coefficient represents the value of the
corresponding a-coefficient for $\epsilon = 0$. It is clear that each of the three nonzero
coefficients of the 4-polynomial expansion changes as the number of polynomials is
increased from four to eleven. Hence, the values of piston, x-tilt, and defocus obtained
from the coefficients b̂1, b̂2 , and b̂4 are incorrect. Again, the mean value of the aberration
function is not given by b̂1, and its variance is not given by the sum of the squares of the
other coefficients.
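The change of the first few circle coefficients with the number of polynomials can be made concrete with a short Python sketch based on Eqs. (12-39), (12-44), and (12-46). The Seidel coefficients used here are those quoted in the caption of Figure 12-5 (in waves), the obscuration ratio of 0.5 is the one used in the figures, and the function names are hypothetical.

```python
import math


def b_hat_four(At, Ad, Aa, Ac, As, eps):
    """Piston, x-tilt, and defocus circle coefficients of the 4-polynomial fit, Eqs. (12-44a-c)."""
    e2 = eps ** 2
    a2 = math.sqrt(1 + e2) * At / 2 + (1 + e2 + e2 ** 2) * Ac / (3 * math.sqrt(1 + e2))
    a4 = ((1 - e2) * (2 * Ad + Aa) / (4 * math.sqrt(3))
          + (1 - e2 ** 2) * As / (2 * math.sqrt(3)))
    b1 = (2 * Ad + Aa) / 4 + (1 - e2 * (1 + e2) / 2) * As / 3
    return b1, a2 / math.sqrt(1 + e2), a4 / (1 - e2)


def b_hat_eleven(At, Ad, Aa, Ac, As):
    """Same three coefficients in the 11-polynomial fit, Eqs. (12-46a-c)."""
    b1 = (2 * Ad + Aa) / 4 + As / 3
    b2 = At / 2 + Ac / 3
    b4 = (2 * Ad + Aa) / (4 * math.sqrt(3)) + As / (2 * math.sqrt(3))
    return b1, b2, b4


seidel = dict(At=1.0, Ad=1.0, Aa=1.0, Ac=2.0, As=3.0)   # values of Figure 12-5
four = b_hat_four(eps=0.5, **seidel)
eleven = b_hat_eleven(**seidel)
print(four)
print(eleven)
```

For a circular pupil the two sets coincide, which is the numerical counterpart of the statement that the circle coefficients change with the number of polynomials only when the pupil is noncircular.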
\[ = \left( A_a/2\sqrt{6} \right) Z_6 + \left( A_c/6\sqrt{2} \right) Z_8 + \left( A_s/6\sqrt{5} \right) Z_{11} . \tag{12-48} \]
Since the 11-polynomial aberration functions of Eqs. (12-41) and (12-45) are equal
to each other [and equal to the Seidel aberration function of Eq. (12-37)], the difference
between the residual aberration functions of Eqs. (12-48) and (12-47) is equal to the
difference between the interferometer setting errors given by Eq. (12-38) or (12-43) and
those given by Eq. (12-45). Accordingly, the difference or the error function consists of
piston, tilt, and defocus only. It is given by
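\[ \Delta W_{R\hat{b}}(\rho,\theta;\epsilon) = -\frac{1}{6}\,\epsilon^2 (4+\epsilon^2) A_s + \frac{2\epsilon^4}{3(1+\epsilon^2)} A_c\, \rho\cos\theta + \epsilon^2 A_s \rho^2 , \tag{12-49} \]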
and is independent of the number J of the annular and circle polynomials (e.g., 11, as
above) used in the expansion. Of course, piston does not affect the peak-to-valley value
or the variance of the aberration function. If the interferometer setting errors obtained
from Eq. (12-45) are applied in the fabrication and testing of an optical system with an
annular pupil, the difference function represents the polishing error due to the use of the
circle polynomials.
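\[ \frac{a_6}{\hat{b}_6} = (1+\epsilon^2+\epsilon^4)^{1/2} , \tag{12-50a} \]
\[ \frac{a_8}{\hat{b}_8} = (1-\epsilon^2) \left( \frac{1+4\epsilon^2+\epsilon^4}{1+\epsilon^2} \right)^{1/2} , \tag{12-50b} \]
and
\[ \frac{a_{11}}{\hat{b}_{11}} = (1-\epsilon^2)^2 . \tag{12-50c} \]
Since the $\hat{b}_j$-coefficients are independent of the value of $\epsilon$, the variation of a ratio
$a_j/\hat{b}_j$ with $\epsilon$ represents the variation of an annular coefficient $a_j$.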
12.3.4.4 Error with Assuming Circle Polynomials to be Orthogonal over an Annulus 325
Now we consider the expansion of the Seidel aberration function in terms of the
circle polynomials by assuming them to be orthogonal over the annulus. This is what one
does when defining a center of an interferogram, drawing a unit circle around it, and
determining its circle coefficients. The aberration function in this case can be written in
the form
They can also be obtained from Eq. (12-22), i.e., from the annular or circle coefficients
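by using the matrix $C^{ZZ}$ or $C^{ZF}$ given in Tables 12-4 and 12-5, respectively. The
“incorrect” circle coefficients $b_j$ are given by
\[ b_1 = a_1 , \tag{12-53a} \]
\[ b_2 = (1+\epsilon^2)^{1/2} a_2 , \tag{12-53b} \]
\[ b_4 = \frac{1}{4\sqrt{3}} (1+\epsilon^2+4\epsilon^4)(2A_d + A_a) + \frac{1}{2\sqrt{3}} (1+\epsilon^2+\epsilon^4+3\epsilon^6) A_s , \tag{12-53c} \]
\[ b_6 = (1+\epsilon^2+\epsilon^4)^{1/2} a_6 , \tag{12-53d} \]
\[ b_8 = \sqrt{2}\,\epsilon^4 A_t + \frac{1}{6\sqrt{2}} (1+\epsilon^2+\epsilon^4+9\epsilon^6) A_c , \tag{12-53e} \]
\[ b_{11} = \frac{\sqrt{5}}{4}\,\epsilon^4 (3\epsilon^2 - 1)(2A_d + A_a) + \frac{1}{6\sqrt{5}} (1+\epsilon^2+\epsilon^4-9\epsilon^6+36\epsilon^8) A_s , \tag{12-53f} \]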
etc. These coefficients are incorrect in the sense that they do not yield a least-squares fit
of the aberration function. Since an annular polynomial with n = m has the same form as
that for a corresponding circle polynomial except for the normalization constant, the
coefficients b j and a j for such a polynomial are also related to each other by the
normalization constant. Equations (12-53a, b, d) represent this fact for n = m = 0, 1, 2 ,
respectively. It is clear, however, that the improperly calculated circle coefficients b j
depend on the obscuration ratio $\epsilon$ of the pupil. Evidently, they are different from the
corresponding $\hat{b}$-coefficients given by Eqs. (12-46a–f). While the value of the piston
coefficient $b_1$ is equal to the true mean value $a_1$, the tilt coefficient $b_2$ is larger than $a_2$
by a factor of $(1+\epsilon^2)^{1/2}$ or 1.1180, and the astigmatism coefficient $b_6$ is larger than $a_6$ by a
factor of $(1+\epsilon^2+\epsilon^4)^{1/2}$ or 1.1456 when $\epsilon = 0.5$. Moreover, the b-coefficients of some of
the nonexistent higher-order aberrations are not zero. For example, the coefficients $b_{22}$,
$b_{37}$, etc. of the secondary and tertiary Zernike spherical aberrations $Z_{22}$, $Z_{37}$, etc., and
$b_{16}$, $b_{30}$, etc. of the secondary and tertiary Zernike coma $Z_{16}$, $Z_{30}$, etc., are nonzero.
Thus, nonexistent aberrations are generated when an aberration function is expanded
improperly in terms of the circle polynomials.
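The appearance of nonexistent aberrations can be illustrated numerically. The sketch below evaluates the improperly calculated coefficients of Eq. (12-17) for the rotationally symmetric polynomials $Z_4$ and $Z_{22}$ by Gauss-Legendre quadrature over the annulus. It assumes that the Seidel aberration function has the angular mean $(A_d + A_a/2)\rho^2 + A_s\rho^4$ (its tilt and coma terms average to zero over $\theta$), uses the standard radial form $\sqrt{7}(20\rho^6 - 30\rho^4 + 12\rho^2 - 1)$ for $Z_{22}$, and takes the Seidel coefficients of Figure 12-5; all function names are hypothetical.

```python
import numpy as np


def annulus_mean_radial(f, eps, order=20):
    """Mean over the annulus eps <= rho <= 1 of a rotationally symmetric function f(rho)."""
    nodes, weights = np.polynomial.legendre.leggauss(order)
    rho = 0.5 * (1.0 - eps) * nodes + 0.5 * (1.0 + eps)   # map [-1, 1] to [eps, 1]
    w = 0.5 * (1.0 - eps) * weights
    return 2.0 * np.sum(w * f(rho) * rho) / (1.0 - eps ** 2)


def seidel_angular_mean(rho, Ad, Aa, As):
    """Angular mean of the Seidel function; only this part couples to m = 0 polynomials."""
    return (Ad + 0.5 * Aa) * rho ** 2 + As * rho ** 4


def z4(rho):
    return np.sqrt(3.0) * (2.0 * rho ** 2 - 1.0)


def z22(rho):
    return np.sqrt(7.0) * (20.0 * rho ** 6 - 30.0 * rho ** 4 + 12.0 * rho ** 2 - 1.0)


def improper_coefficient(radial_poly, eps, Ad=1.0, Aa=1.0, As=3.0):
    """b_j of Eq. (12-17) for a rotationally symmetric circle polynomial."""
    return annulus_mean_radial(
        lambda r: seidel_angular_mean(r, Ad, Aa, As) * radial_poly(r), eps)


eps = 0.5
e2 = eps ** 2
b4_numeric = improper_coefficient(z4, eps)
b4_formula = ((1 + e2 + 4 * e2 ** 2) * 3.0 / (4 * np.sqrt(3.0))
              + (1 + e2 + e2 ** 2 + 3 * e2 ** 3) * 3.0 / (2 * np.sqrt(3.0)))  # Eq. (12-53c)
b22_annulus = improper_coefficient(z22, eps)
b22_circle = improper_coefficient(z22, 0.0)
print(b4_numeric, b4_formula, b22_annulus, b22_circle)
```

The quadrature is exact for these polynomial integrands, so the numerical defocus coefficient agrees with Eq. (12-53c), and the secondary spherical aberration coefficient vanishes for the circular pupil but not for the annular one.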
If we estimate the annular Seidel aberration function with only 4-circle polynomials
from Eq. (12-51), we obtain
If we truncate the expansion in terms of the circle polynomials in Eq. (12-51) to the first
11 circle polynomials and remove the first four coefficients as interferometer setting
errors, the residual aberration function in this case is given by
expansions, respectively. However, the standard deviation obtained from the circle
coefficients is correct only when $\epsilon = 0$. It increases rapidly with $\epsilon$ for the 4-polynomial
expansion, but it is constant for the 11-polynomial expansion, indicating its incorrect
nature. The sigma values from the orthonormal and the circle coefficients are nearly equal
to each other for $\epsilon \le 0.5$ because of the very slow increase of the orthonormal sigma.
Figure 12-5 shows the contours of the Seidel aberration function for a circular and an
annular pupil with obscuration ratio of $\epsilon = 0.5$. The case of a circular pupil is included
just for reference. The dark circular region in Figure 12-5b (and others) represents the
obscuration. The contours of the annular Seidel aberration function fit with only four
polynomials, as in Eq. (12-38) or (12-43) and in Eq. (12-54), are shown in Figures
12-6a and 12-6b, respectively. The two figures look similar, but they are not the same.
Only Figure 12-6a represents the least-squares and, therefore, the correct fit. The contours of
the residual aberration function when the first four (of the eleven) polynomials are
removed as interferometer setting errors, as in Eqs. (12-47), (12-48), and (12-55), are
shown in Figures 12-7a, 12-7b, and 12-7c, respectively. All three figures are
different from each other, as expected. Only Figure 12-7a reflects removal of the correct
interferometer setting errors, and thus the correct residual aberration function. The
contours of the difference of the residual functions using the circle polynomials from the
one using the annular polynomials are shown in Figures 12-8a and 12-8b. They represent
the error functions given by Eq. (12-49) and the difference of Eqs. (12-55) and (12-47),
respectively, due to the removal of incorrect interferometer setting errors.
Figure 12-2. Ratio of the orthonormal annular coefficients $a_j$ and Zernike circle
coefficients $\hat{b}_j$ for an 11-polynomial expansion.
Figure 12-5. Contours of (a) Seidel aberration function of Eq. (12-37) for a circular
pupil with $A_t = A_d = A_a = 1$, $A_c = 2$, and $A_s = 3$ in waves. (b) Same Seidel
aberration function, but for an annular pupil with obscuration ratio $\epsilon = 0.5$.
Figure 12-6. Contours of an annular Seidel aberration function for $\epsilon = 0.5$ fit with
only four polynomials, as in (a) Eq. (12-38) or (12-43), and (b) Eq. (12-54).
Figure 12-7. Contours of the residual aberration function after removing the
interferometer setting errors. (a) $W_{RA}$ of Eq. (12-47) using annular polynomials, (b)
$W_{RC\hat{b}}$ of Eq. (12-48) using circle polynomials correctly, and (c) $W_{RCb}$ of Eq. (12-55)
using circle polynomials incorrectly.
12.3.4.5 Numerical Example 331
Figure 12-8. Contours of the difference or the error function (a) Eq. (12-49) and (b)
obtained by subtracting Eq. (12-47) from Eq. (12-55).
\[ a_j = \frac{2}{3\sqrt{3}} \int_{\text{hexagon}} W(x,y)\, H_j\, dx\, dy . \tag{12-57} \]
The mean value and the variance of the estimated aberration function are given by Eqs. (12-4)
and (12-6).
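\[
\begin{pmatrix} \hat{b}_1 \\ \hat{b}_2 \\ \hat{b}_3 \\ \hat{b}_4 \end{pmatrix}
=
\begin{pmatrix}
1 & 0 & 0 & \sqrt{5/43} \\
0 & \sqrt{6/5} & 0 & 0 \\
0 & 0 & \sqrt{6/5} & 0 \\
0 & 0 & 0 & 2\sqrt{15/43}
\end{pmatrix}
\begin{pmatrix} a_1 \\ a_2 \\ a_3 \\ a_4 \end{pmatrix}
=
\begin{pmatrix}
a_1 + \sqrt{5/43}\, a_4 \\
\sqrt{6/5}\, a_2 \\
\sqrt{6/5}\, a_3 \\
2\sqrt{15/43}\, a_4
\end{pmatrix} ,
\tag{12-58}
\]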
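or
\[ \hat{b}_1 = a_1 + \sqrt{5/43}\, a_4 , \tag{12-59a} \]
\[ \hat{b}_2 = \sqrt{6/5}\, a_2 , \tag{12-59b} \]
\[ \hat{b}_3 = \sqrt{6/5}\, a_3 , \tag{12-59c} \]
and
\[ \hat{b}_4 = 2\sqrt{15/43}\, a_4 . \tag{12-59d} \]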
It is evident that the piston coefficient b̂1 is not equal to a1 and, therefore, does not
12.4.1 Zernike Circle Coefficients in Terms of Hexagonal Coefficients 333
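Table 12-6. Conversion matrix M for obtaining the Zernike coefficients $\hat{b}_j$ from the
orthonormal hexagonal coefficients $a_j$, as in Eq. (12-12).
\[
\left( \begin{array}{ccccccccccc}
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & \sqrt{6/5} & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & \sqrt{6/5} & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
\sqrt{5/43} & 0 & 0 & 2\sqrt{15/43} & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & \sqrt{10/7} & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & \sqrt{10/7} & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 16\sqrt{14/11055} & 0 & 0 & 0 & 10\sqrt{35/2211} & 0 & 0 & 0 & 0 \\
0 & 16\sqrt{14/11055} & 0 & 0 & 0 & 0 & 0 & 10\sqrt{35/2211} & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & (2/3)\sqrt{5} & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 2\sqrt{35/103} & 0 \\
521/\sqrt{1072205} & 0 & 0 & 88\sqrt{15/214441} & 0 & 0 & 0 & 0 & 0 & 0 & 14\sqrt{43/4987}
\end{array} \right)
\]
[Table 12-7: transpose matrix $M^T$]
\[
\left( \begin{array}{ccccccccccc}
1 & 0 & 0 & \sqrt{5/43} & 0 & 0 & 0 & 0 & 0 & 0 & 521/\sqrt{1072205} \\
0 & \sqrt{6/5} & 0 & 0 & 0 & 0 & 0 & 16\sqrt{14/11055} & 0 & 0 & 0 \\
0 & 0 & \sqrt{6/5} & 0 & 0 & 0 & 16\sqrt{14/11055} & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 2\sqrt{15/43} & 0 & 0 & 0 & 0 & 0 & 0 & 88\sqrt{15/214441} \\
0 & 0 & 0 & 0 & \sqrt{10/7} & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & \sqrt{10/7} & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 10\sqrt{35/2211} & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 10\sqrt{35/2211} & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & (2/3)\sqrt{5} & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 2\sqrt{35/103} & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 14\sqrt{43/4987}
\end{array} \right)
\]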
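Table 12-8. Analytical matrix $M^{-1}$ for obtaining the orthonormal hexagonal coefficients $a_j$ from
the Zernike coefficients $\hat{b}_j$.
\[
\left( \begin{array}{ccccccccccc}
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & \sqrt{5/6} & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & \sqrt{5/6} & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
-1/(2\sqrt{3}) & 0 & 0 & \sqrt{43/15}/2 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & \sqrt{7/10} & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & \sqrt{7/10} & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & -8/(5\sqrt{15}) & 0 & 0 & 0 & \sqrt{2211/35}/10 & 0 & 0 & 0 & 0 \\
0 & -8/(5\sqrt{15}) & 0 & 0 & 0 & 0 & 0 & \sqrt{2211/35}/10 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 3/(2\sqrt{5}) & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & \sqrt{103/35}/2 & 0 \\
-1/(2\sqrt{5}) & 0 & 0 & -22/(7\sqrt{43}) & 0 & 0 & 0 & 0 & 0 & 0 & \sqrt{4987/43}/14
\end{array} \right)
\]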
represent the mean value of the aberration function. The coefficients b̂2 , b̂3 , and b̂4
represent the tip, tilt, and defocus circle coefficients.
To see how these coefficients change with the number of polynomials used in the
expansion, we consider an expansion using 11 polynomials. The coefficients, obtained
from Eq. (12-13), are given by
\[ \hat{b}_5 = \sqrt{10/7}\, a_5 , \tag{12-60e} \]
\[ \hat{b}_6 = \sqrt{10/7}\, a_6 , \tag{12-60f} \]
\[ \hat{b}_9 = (2/3)\sqrt{5}\, a_9 , \tag{12-60i} \]
and
It is clear that all of the first four coefficients change, and $\hat{b}_j = M_{jj} a_j$ for $5 \le j \le 11$.
For astigmatism ($H_5$ and $H_6$), coma ($H_7$ and $H_8$), and spherical aberration ($H_{11}$), the
$\hat{b}_j$ coefficient is larger than the corresponding hexagonal coefficient by a factor of
$\sqrt{10/7} \approx 1.20$, $10\sqrt{35/2211} \approx 1.26$, and $14\sqrt{43/4987} \approx 1.30$, respectively. The
astigmatism coefficients b̂5 and b̂6 change if a 15-polynomial expansion is considered.
For example, b̂5 then contains contributions from a13 and a15 , as well. The tip and tilt
coefficients b̂2 and b̂3 change further if polynomials H16 and H17 are included in the
expansion. Moreover, H16 also contributes to the coma coefficient b̂8 , and H17 similarly
contributes to the coma coefficient b̂7 . The piston and defocus coefficients b̂1 and b̂4 do
not change until the secondary spherical aberration polynomial H 22 is included with its
coefficient a 22 . Its inclusion also affects the primary spherical aberration coefficient b̂11 .
Thus, it is easy to see which, when, and by how much the b̂ j coefficients change,
depending on the number of polynomials used in the expansion.
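The dependence of the piston and defocus circle coefficients on the number of polynomials can be sketched with the elements of Table 12-6. The Python fragment below uses only the standard library; the hexagonal coefficient values are hypothetical, and the function name is an assumption made for this illustration.

```python
import math

# Diagonal elements of M quoted in the text for astigmatism, coma, and spherical aberration.
M55 = math.sqrt(10 / 7)
M77 = 10 * math.sqrt(35 / 2211)
M1111 = 14 * math.sqrt(43 / 4987)


def piston_and_defocus(a1, a4, a11=0.0):
    """b_hat_1 and b_hat_4 from columns 1 and 4 of M; a11 = 0 gives the 4-polynomial case."""
    b1 = a1 + math.sqrt(5 / 43) * a4 + 521 / math.sqrt(1072205) * a11
    b4 = 2 * math.sqrt(15 / 43) * a4 + 88 * math.sqrt(15 / 214441) * a11
    return b1, b4


a1, a4, a11 = 0.3, 0.5, 0.2              # hypothetical hexagonal coefficients
b1_4, b4_4 = piston_and_defocus(a1, a4)          # expansion with J = 4
b1_11, b4_11 = piston_and_defocus(a1, a4, a11)   # expansion with J = 11
print(round(M55, 2), round(M77, 2), round(M1111, 2))
print(b1_4, b4_4, b1_11, b4_11)
```

In this sketch, the shift of each coefficient between the two expansions is proportional to $a_{11}$, which shows by how much the piston and defocus coefficients move when primary spherical aberration is added to the fit.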
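\[ = \hat{b}_1 + 2\hat{b}_2 x + 2\hat{b}_3 y + \sqrt{3}\,\hat{b}_4 (2\rho^2 - 1) . \tag{12-61b} \]
\[ \hat{W}(x,y) = a_1 H_1 + a_2 H_2 + a_3 H_3 + a_4 H_4 \tag{12-62a} \]
\[ = a_1 + 2\sqrt{6/5}\, a_2 x + 2\sqrt{6/5}\, a_3 y + a_4 \left[ \sqrt{5/43} + 6\sqrt{5/43}\,(2\rho^2 - 1) \right] . \tag{12-62b} \]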
Comparing the right-hand sides of Eqs. (12-61b) and (12-62b) and utilizing Eqs. (12-59a–d),
it is seen that the coefficients of x, y, and $x^2 + y^2$, representing the tip, tilt, and
defocus values obtained from the Zernike coefficients, are the same as those obtained
from the hexagonal coefficients. The estimated piston from the Zernike expansion of Eq.
(12-61b) is $\hat{b}_1 - \sqrt{3}\,\hat{b}_4$. Substituting for $\hat{b}_1$ and $\hat{b}_4$ from Eqs. (12-59a–d), we find that it is
the same as $a_1 - 5\sqrt{5/43}\, a_4$ from the hexagonal expansion of Eq. (12-62b). Accordingly,
the aberration function obtained by subtracting the piston, tip, tilt, and defocus values
from the measured aberration function is independent of the nature of the polynomials
used in the expansion, regardless of the domain of the function or the shape of the pupil,
so long as the nonorthogonal expansion is in terms of only the first four circle
polynomials. In an interferometer, the tip and tilt represent the lateral errors and the
defocus represents the longitudinal error in the location of a point source illuminating an
optical surface under test from its center of curvature. These four terms are generally
removed from the aberration function, and the remaining function is given to the optician
to zero out from the optical surface by polishing.
The contour plots of the aberration function fitted with 4, 11, and 15 hexagonal
polynomials are shown in Figure 12-9. The same plots are obtained with the
corresponding properly calculated Zernike coefficients b̂ j , illustrating an identical fit.
However, different plots are obtained with the improperly calculated Zernike coefficients
$b_j$, as shown in Figure 12-10. If we remove the first four $a_j$, $\hat{b}_j$, or $b_j$ coefficients of
piston, tip, tilt, and defocus representing the interferometer setting errors from the
aberration function estimated by 11 or 15 polynomials, we obtain the residual aberration
12.4.3 Numerical Example 337
function whose contour plots are shown in Figures 12-11, 12-12, and 12-13, respectively.
Comparing these figures, it is evident that the residual functions represented in Figures
12-12 and 12-13 are incorrect. Only Figure 12-11 represents the correct residual function.
The difference of the residual aberration functions representing the error functions in
using the Zernike polynomials and thereby removing the incorrect interferometer setting
errors are shown in Figures 12-14 and 12-15. Thus, the contours in these figures represent
the difference of the contours in Figures 12-12 and 12-13 from those in Figure 12-11,
respectively.
Figure 12-9. Contour plots of a hexagonal aberration function fit with (a) 4, (b) 11,
and (c) 15 hexagonal polynomials or circle polynomials with coefficients b̂ j .
Figure 12-10. Contour plots of a hexagonal aberration function fit with (a) 4, (b) 11,
and (c) 15 circle polynomials with coefficients b j .
Figure 12-14. Contour plots of the error function after removing the first four b̂ j
coefficients from (a) 11 and (b) 15 coefficients.
Figure 12-15. Contour plots of the error function after removing the first four b j
coefficients from (a) 11 and (b) 15 coefficients.
j aj b̂ j , J = 4 b̂ j , J = 11 b̂ j , J = 15 bj
Table 12-10. Standard deviation $\sigma$ of the aberration functions fit with 4, 11, and 15
hexagonal or circle polynomials using Eq. (12-6).
As in the case of an annular wavefront, the fit with a certain number of circle
polynomials is as good as with a corresponding set of the hexagonal polynomials.
However, again there are pitfalls in using the circle polynomials. For example, the mean
value of a circle polynomial across a noncircular pupil is not zero, the Zernike piston
coefficient does not represent the mean value of the aberration, the other Zernike
coefficients do not represent the standard deviation of the corresponding aberration term,
and the variance of the aberration is not equal to the sum of the squares of these other
coefficients. Moreover, the value of a Zernike coefficient generally changes as the
number of polynomials used in the expansion of an aberration function changes.
12.6 SUMMARY
The expansion of a noncircular aberration function in terms of the Zernike circle
polynomials is compared with the corresponding expansion in terms of the polynomials
that are orthonormal over the domain of the function. It is shown that, whereas the
orthonormal expansion coefficients are independent of the number of polynomials used in
the expansion, the circle coefficients generally change as the number of polynomials
changes. We demonstrate which circle coefficients change and by how much.
Accordingly, one or more orthonormal polynomial terms can be added to or subtracted
from the aberration function without affecting the other coefficients only when the
orthonormal polynomials are used. Moreover, unlike the orthonormal coefficients, the
piston circle coefficient does not represent the mean value of the aberration function, and
the sum of the squares of the other circle coefficients does not yield its variance.
However, since each orthonormal polynomial of a certain order is a linear combination of
the circle polynomials of that and lower orders, the wavefront fit with a certain number of
orthonormal polynomials is exactly the same as that with the corresponding circle
polynomials.
Similar results are illustrated when the circle polynomials are used for the analysis of
a hexagonal wavefront. For example, Eqs. (12-59) and (12-60) show that the circle
coefficients change when fitting it first with 4 circle polynomials and then with 11
polynomials. This may be seen in Table 12-9 by comparing the first four coefficients of
the column with J = 11 with the corresponding coefficients in the column with J = 4.
When the number of polynomials increases from 11 to 15, then only the astigmatism
coefficients b̂5 and b̂6 change. However, Eqs. (12-61) and (12-62) show that an identical
fit is obtained when the same number of corresponding circle or hexagonal polynomials
are used, as illustrated in Figures 12-9a–c for J = 4, 11, and 15, respectively. However,
different fits are obtained with the improperly calculated Zernike coefficients b j , as
shown in Figures 12-10a–c. If we remove the first four $a_j$, $\hat{b}_j$, or $b_j$ coefficients of
piston, tip, tilt, and defocus representing the interferometer setting errors from the
aberration function estimated by 11 or 15 polynomials, only the residual aberration
function illustrated in Figures 12-11a and 12-11b is correct; those in Figures 12-12
and 12-13 are incorrect. The sigma value of an aberration function obtained by summing
the squares of the coefficients and taking the square root of the result is correct only for
the hexagonal coefficients, as may be seen from Table 12-10.
If the common practice of defining the center of an interferogram and drawing a unit
circle around it is followed, then the circle coefficients of a noncircular interferogram do
not yield a correct representation of the aberration function. Moreover, in this case, some
of the higher-order coefficients of aberrations that are nonexistent in the aberration
function are also nonzero, as mentioned in Section 12.3.4.4. Finally, the circle
coefficients, however obtained, do not represent the coefficients of the balanced
aberrations for an annular pupil. Consequently, it should be clear that the circle
polynomials are not suitable for the analysis of an annular wavefront, and only the
annular polynomials should be used for such an analysis.
Chapter 13
Anamorphic Systems
13.1 INTRODUCTION
An anamorphic imaging system, for example, consisting of cylindrical optics, is
symmetric about two orthogonal planes whose intersection defines its optical axis. The
Gaussian images of a point object with object rays in the two symmetry planes are
formed separately. They are coincident in the final image space of the system for only
two pairs of conjugate planes [1]. By definition, an anamorphic system forms the image
of an extended object with different transverse magnifications in the two symmetry
planes. Thus, for example, the image of a square object is rectangular and that of a
rectangular object can be square. The two orthogonal planes of symmetry of the imaging
system yield six “reflection” invariants in terms of the Cartesian coordinates of the object
and pupil points [2,3], which become the building blocks of its aberration function for a
certain point object. The six invariants reduce to three “rotational” invariants for a
rotationally symmetric system, or equivalently for an infinite number of symmetry
planes.
In this chapter, we discuss the power series expansion of the aberration function in
terms of the six reflection invariants, define the classical aberrations of the system, and
discuss their balancing to minimize their variance across a rectangular exit pupil, and
thereby improve the image quality [see Chapter 2]. We show that the balanced
aberrations are represented by the products of the Legendre polynomials, one for each of
the two dimensions of the rectangular pupil [4]. The compound Legendre polynomials are
orthogonal across a rectangular pupil and, like the classical aberrations, are inherently
separable in the Cartesian coordinates of the pupil point.
Let $S_1$ be the distance of the point object $P$, and $S_1'$ be the distance of the Gaussian
image point $P'$ from the object- and image-space principal planes $H_1$ and $H_1'$ of the lens
$L_1$, respectively, as illustrated in Figure 13-2. They are related to each other by the
image-space focal length $f_1'$ according to
$$\frac{1}{S_1'} - \frac{1}{S_1} = \frac{1}{f_1'} , \tag{13-1a}$$
or
$$S_1 = \frac{S_1' f_1'}{f_1' - S_1'} . \tag{13-1b}$$
Similarly, the object and image distances $S_2$ and $S_2'$ for the lens $L_2$ of focal length $f_2'$
are related to each other according to
$$\frac{1}{S_2'} - \frac{1}{S_2} = \frac{1}{f_2'} \tag{13-2a}$$
or
$$\frac{1}{S_1' - d_2} - \frac{1}{S_1 - d_1} = \frac{1}{f_2'} , \tag{13-2b}$$
where $d_1$ and $d_2$ are the distances $H_1H_2$ and $H_1'H_2'$ between the respective principal
planes of the two lenses. In the thin lens approximation, $d_1$ and $d_2$ are equal to the
spacing between the lenses. Substituting for $S_1$ from Eq. (13-1b) into Eq. (13-2b), we
obtain a quadratic equation in $S_1'$ yielding two solutions for it. A corresponding value of
$S_1$ can be obtained for each value of $S_1'$ from Eq. (13-1b). Thus, an anamorphic system
has only two pairs of conjugates, compared to an infinite number for a rotationally
symmetric imaging system. It should be evident that the image magnifications along the x
and y axes are different, as they are given by
$$M_x = -\frac{S_2'}{S_2} \tag{13-3a}$$
and
$$M_y = -\frac{S_1'}{S_1} , \tag{13-3b}$$
respectively. Its consequence, for example, is that the image of a square object is
rectangular and that of a circle is elliptical.
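The two pairs of conjugates can be illustrated numerically. The following is a minimal sketch, assuming that eliminating $S_1$ between Eqs. (13-1b) and (13-2b) gives the quadratic whose coefficients are written out in the code. The function name and the focal lengths and spacings are hypothetical example values, not data from the text. The ratios $S_2'/S_2$ and $S_1'/S_1$, which enter the magnifications of Eqs. (13-3a) and (13-3b), are printed to show that they differ.

```python
import math

def conjugate_image_distances(f1, f2, d1, d2):
    """Solve the quadratic in S1' obtained from Eqs. (13-1b) and (13-2b).

    f1, f2 are the image-space focal lengths of lenses L1 and L2;
    d1, d2 are the separations of their principal planes.
    """
    a = f2 - f1 - d1
    b = f2 * (d1 - d2) + d1 * f1 + d2 * (f1 + d1)
    c = f1 * f2 * (d2 - d1) - d1 * d2 * f1
    disc = b * b - 4.0 * a * c
    if a == 0.0 or disc < 0.0:
        return []                      # no pair of real, finite conjugates
    root = math.sqrt(disc)
    return [(-b + root) / (2.0 * a), (-b - root) / (2.0 * a)]

# Hypothetical example values in arbitrary length units (thin lenses, d1 = d2).
F1, F2, D1, D2 = 100.0, 50.0, 30.0, 30.0

pairs = []
for s1p in conjugate_image_distances(F1, F2, D1, D2):
    s1 = s1p * F1 / (F1 - s1p)         # Eq. (13-1b)
    pairs.append((s1, s1p))
    ratio_x = (s1p - D2) / (s1 - D1)   # S2'/S2, which enters M_x
    ratio_y = s1p / s1                 # S1'/S1, which enters M_y
    print(f"S1 = {s1:9.3f}  S1' = {s1p:8.3f}  "
          f"S2'/S2 = {ratio_x:7.4f}  S1'/S1 = {ratio_y:7.4f}")
```

Each root is substituted back into both imaging equations in the accompanying check, which is a simple way to guard against an algebra slip in the quadratic coefficients.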
It is evident that the system is symmetric about two orthogonal planes $zx$ and $yz$.
Accordingly, the aberration function, which depends on both $(p, q)$ and $(x, y)$
coordinates, consists of products of positive integral powers of six reflection invariants
[2–4]:
$$p^2,\ x^2,\ px,\ q^2,\ y^2,\ \text{and}\ qy . \tag{13-4}$$
The first three are symmetric about the $yz$ plane and the other three are symmetric about
the $zx$ plane. The aberration function can accordingly be written in the form
$$W(p, q; x, y) = \sum_{i,j,k,l,m,n} C_{i,j,k,l,m,n}\,(p^2)^i (q^2)^j (x^2)^k (y^2)^l (px)^m (qy)^n , \tag{13-5}$$
where $i$, $j$, $k$, $l$, $m$, and $n$ are nonnegative integers, and $C_{i,j,k,l,m,n}$ is the
coefficient of the aberration term that has a degree in the object and pupil coordinates
given by
$$2(i + j + k + l + m + n) .$$
It is evident that the degree of an aberration term is even, and thus the aberration function
consists of aberrations of even orders only. The zero-degree term must be zero, as it
represents the aberration of the chief ray, which is zero by its definition as the reference
ray. There are six terms of second degree, namely the reflection invariants multiplied
with their respective coefficients. Two of these terms, namely those in $p^2$ and $q^2$, are
piston terms, i.e., they are independent of the pupil coordinates, and can generally be
ignored. Among the other four, those in $px$ and $qy$ represent lateral deviations of the
image point from the Gaussian image point, and those in $x^2$ and $y^2$ represent
longitudinal deviations. Since our aberration function is defined with respect to the
Gaussian image point, these four terms must be zero. It is clear that the aberration terms
are separable in the Cartesian coordinates $(x, y)$ of a pupil point.
There are 21 terms of the fourth degree, of which three are piston terms and two are
equal to another two. Hence, we are left with 16 terms that depend on the pupil
coordinates. They are called the primary aberrations of an anamorphic system, compared
to only five for a rotationally symmetric system. The primary aberration function can be
written
$$\begin{aligned} W(p, q; x, y) ={}& (C_1 p^3 + C_2 pq^2)\,x + (C_3 p^2 q + C_4 q^3)\,y + (C_5 p^2 + C_6 q^2)\,x^2 \\ &+ C_7\,pqxy + (C_8 p^2 + C_9 q^2)\,y^2 + C_{10}\,pxy^2 + C_{11}\,qyx^2 + C_{12}\,px^3 \\ &+ C_{13}\,qy^3 + C_{14}\,x^4 + C_{15}\,x^2y^2 + C_{16}\,y^4 , \end{aligned}$$
where we have expressed the aberration coefficients in a simplified form with one
subscript for convenience. For a rotationally symmetric system, the six reflection
invariants reduce to three rotational invariants, namely, $p^2 + q^2$, $x^2 + y^2$, and $px + qy$,
and the 16 primary aberrations reduce to five. If $\vec{h}$ and $\vec{r}$ are the position vectors of the
object and pupil points, then the rotational invariants are $\vec{h}\cdot\vec{h}$, $\vec{r}\cdot\vec{r}$, and $\vec{h}\cdot\vec{r}$, or $h^2$, $r^2$, and
$hr\cos\theta$, where $h = |\vec{h}|$, $r = |\vec{r}|$, and $\theta$ is the polar angle of $\vec{r}$ with respect to that of $\vec{h}$.
In conformance with the aberrations of a rotationally symmetric system, the linear terms
in x and y are the distortion aberrations; the quadratic terms may be referred to as the field
curvature, defocus, or astigmatism; the cubic terms are comas; and the quartic terms
are the spherical aberrations. It is easy to see that an anamorphic system has three primary
aberrations for an axial point object compared to only one for a rotationally symmetric
system.
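The term counts quoted above can be checked by brute-force enumeration. The sketch below is illustrative only: it represents each reflection invariant by its exponents in $(p, q, x, y)$, forms all products of two invariants, and classifies the resulting fourth-degree monomials. The dictionary and function names are hypothetical.

```python
from itertools import combinations_with_replacement

# Exponents (p, q, x, y) of the six reflection invariants of Eq. (13-4).
INVARIANTS = {
    "p^2": (2, 0, 0, 0), "x^2": (0, 0, 2, 0), "px": (1, 0, 1, 0),
    "q^2": (0, 2, 0, 0), "y^2": (0, 0, 0, 2), "qy": (0, 1, 0, 1),
}

def fourth_degree_counts():
    """Count the fourth-degree terms built from pairs of invariants.

    Returns (products, distinct monomials, piston terms,
             pupil-dependent terms, terms surviving on axis).
    """
    products = list(combinations_with_replacement(INVARIANTS.values(), 2))
    monomials = {tuple(a + b for a, b in zip(u, v)) for u, v in products}
    # Piston terms do not depend on the pupil coordinates x and y.
    piston = {m for m in monomials if m[2] == 0 and m[3] == 0}
    primary = monomials - piston
    # For an axial point object p = q = 0, so only pure pupil terms remain.
    axial = {m for m in primary if m[0] == 0 and m[1] == 0}
    return len(products), len(monomials), len(piston), len(primary), len(axial)

counts = fourth_degree_counts()
print(counts)
```

The 21 products collapse to 19 distinct monomials because $(px)^2 = p^2x^2$ and $(qy)^2 = q^2y^2$; removing the three piston terms leaves the 16 primary aberrations, of which three survive for an axial point object.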
$$W_{cx}(x, y) = x^3 . \tag{13-8}$$
$$\sigma_{cx}^2 = \left\langle [W_{cx}(x, y)]^2 \right\rangle - \left\langle W_{cx}(x, y) \right\rangle^2 , \tag{13-9}$$
where the angular brackets indicate a mean value across the pupil. For example, the mean
value of a function $g(x, y)$ is given by
$$\langle g(x, y) \rangle = \frac{\int_{-1}^{1}\int_{-1}^{1} g(x, y)\,dx\,dy}{\int_{-1}^{1}\int_{-1}^{1} dx\,dy} = \frac{1}{4}\int_{-1}^{1}\int_{-1}^{1} g(x, y)\,dx\,dy . \tag{13-10}$$
The variance can be reduced by mixing the aberration with a certain amount $b$ of x-tilt. Thus, the
balanced aberration may be written in the form
$$W_{bcx}(x, y) = x^3 + bx . \tag{13-11}$$
$$\sigma_{bcx}^2 = \frac{1}{7} + \frac{2b}{5} + \frac{b^2}{3} . \tag{13-12}$$
The variance has a minimum value of 4/175 for a tilt of $b = -3/5$ compared to a value of
1/7 without any tilt. Thus the variance is reduced by a factor of 25/4, or the standard
deviation of the balanced aberration is smaller by a factor of 5/2. The corresponding
balanced aberration is given by
$$W_{bcx}(x, y) = x^3 - (3/5)\,x . \tag{13-13}$$
A balanced aberration yields a higher Strehl ratio or increases the aberration tolerance for
a given Strehl ratio. The balanced aberration given by Eq. (13-13) is the same as Eq. (11-6)
for a slit pupil. Similarly, the variance of the x-spherical aberration $x^4$ can be
minimized by combining it with x-defocus of $-(6/7)x^2$, yielding a balanced aberration of
$x^4 - (6/7)x^2$. This balanced aberration is also the same as for a slit pupil. The sigma
value of the aberration is reduced by a factor of 7/2 from 4/15 to 8/105. It should be
evident that the y-coma or y-spherical aberration may be balanced in the same manner.
The variance of the higher-order classical aberrations, e.g., secondary aberrations (of
sixth degree), can also be minimized by combining them with one or more lower-degree
aberrations.
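The balancing arithmetic for the square pupil can be reproduced exactly with rational numbers. The following is a minimal sketch, assuming only the pupil average of Eq. (13-10); the function names are hypothetical. For an aberration $x^n + bx^m$, the optimum $b$ is minus the covariance of $x^n$ and $x^m$ divided by the variance of $x^m$.

```python
from fractions import Fraction

def mean_power(k):
    """Mean of x**k over -1 <= x <= 1 (the y integration cancels out)."""
    return Fraction(1, k + 1) if k % 2 == 0 else Fraction(0)

def balance(n, m):
    """Minimize the variance of x**n + b*x**m over the square pupil.

    Returns (optimum b, unbalanced variance, balanced variance).
    """
    var_n = mean_power(2 * n) - mean_power(n) ** 2
    var_m = mean_power(2 * m) - mean_power(m) ** 2
    cov = mean_power(n + m) - mean_power(n) * mean_power(m)
    b = -cov / var_m
    return b, var_n, var_n - cov ** 2 / var_m

coma = balance(3, 1)        # x-coma balanced with x-tilt
spherical = balance(4, 2)   # x-spherical aberration balanced with x-defocus
print(coma)
print(spherical)
```

The coma result gives $b = -3/5$ with the variance falling from 1/7 to 4/175, and the spherical result gives $b = -6/7$ with the variance falling from 16/225 to 64/11025, i.e., the sigma value falls from 4/15 to 8/105.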
$$Q_j(x, y) = L_l(x)\,L_m(y) , \tag{13-14}$$
where $j$ is a polynomial ordering index starting with $j = 1$, and $l$ and $m$ are nonnegative
integers. It is evident that these polynomials are inherently separable in
the Cartesian pupil coordinates x and y. This is different from the Zernike circle
polynomials, which are orthogonal over a unit circle, but separable in polar coordinates
$(\rho, \theta)$, where $0 \le \rho \le 1$ and $0 \le \theta \le 2\pi$. The order $n$ of a polynomial representing its
degree in the pupil coordinates is given by $n = l + m$. As in the case of Zernike circle
polynomials, the number of polynomials with a certain order $n$ is $n + 1$. The number of
polynomials through a certain order $n$ is given by
$$N_n = (n + 1)(n + 2)/2 . \tag{13-15}$$
$$Q_1(x, y) = L_0(x)\,L_0(y) = 1 . \tag{13-16}$$
$$\frac{1}{4}\int_{-1}^{1}\int_{-1}^{1} Q_j(x, y)\,Q_{j'}(x, y)\,dx\,dy = \delta_{jj'} . \tag{13-17}$$
The rectangular Q-polynomials up to and including the eighth order are listed in
Table 13-1 as products of the Legendre polynomials, along with the names associated
with some of them. Their explicit form can be obtained by using the expressions of the
Legendre polynomials given in Table 11-1. Note that for each polynomial Ll ( x ) Lm ( y ) ,
there is a corresponding polynomial Lm ( x ) Ll ( y ) . These polynomials are evidently
different from those for a rotationally symmetric system with a rectangular pupil. The
rectangular polynomials given in Section 9.4 for such a system are not separable in the
Cartesian coordinates (x, y) of a pupil point.
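The ordering index can be computed from the Legendre orders. The sketch below assumes the ordering that the entries of Table 13-1 appear to follow: polynomials are grouped by order $n = l + m$, and within an order the x-degree $l$ decreases from $n$ to 0. This rule is inferred from the tabulated entries, and the function names are hypothetical.

```python
def count_through_order(n):
    """Number of polynomials through order n, Eq. (13-15)."""
    return (n + 1) * (n + 2) // 2

def q_index(l, m):
    """Index j of Q_j = L_l(x) L_m(y) under the assumed ordering."""
    n = l + m
    # All polynomials of lower order come first, then l = n, n-1, ..., 0.
    return n * (n + 1) // 2 + m + 1

def q_orders(j):
    """Inverse mapping: return (l, m) for a given index j >= 1."""
    n = 0
    while count_through_order(n) < j:
        n += 1
    m = j - n * (n + 1) // 2 - 1
    return n - m, m

print(q_index(4, 3), q_orders(32), count_through_order(8))
```

Under this rule $L_4(x)L_3(y)$ is $Q_{32}$, and the 45 polynomials through the eighth order agree with Eq. (13-15).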
n   Polynomial             Aberration name
0   Q1 = L0(x) L0(y)       Piston
1   Q2 = L1(x) L0(y)       x-tilt
1   Q3 = L0(x) L1(y)       y-tilt
2   Q4 = L2(x) L0(y)       x-defocus
2   Q5 = L1(x) L1(y)
2   Q6 = L0(x) L2(y)       y-defocus
3   Q7 = L3(x) L0(y)       x-primary coma
3   Q8 = L2(x) L1(y)
3   Q9 = L1(x) L2(y)
4   Q12 = L3(x) L1(y)
4   Q13 = L2(x) L2(y)
4   Q14 = L1(x) L3(y)
5   Q17 = L4(x) L1(y)
5   Q18 = L3(x) L2(y)
5   Q19 = L2(x) L3(y)
5   Q20 = L1(x) L4(y)
6   Q23 = L5(x) L1(y)
6   Q24 = L4(x) L2(y)
6   Q25 = L3(x) L3(y)
6   Q26 = L2(x) L4(y)
6   Q27 = L1(x) L5(y)
7   Q30 = L6(x) L1(y)
7   Q31 = L5(x) L2(y)
7   Q32 = L4(x) L3(y)
7   Q33 = L3(x) L4(y)
7   Q34 = L2(x) L5(y)
7   Q35 = L1(x) L6(y)
7   Q36 = L0(x) L7(y)      y-tertiary coma
8   Q38 = L7(x) L1(y)
8   Q39 = L6(x) L2(y)
8   Q40 = L5(x) L3(y)
8   Q41 = L4(x) L4(y)
8   Q42 = L3(x) L5(y)
8   Q43 = L2(x) L6(y)
8   Q44 = L1(x) L7(y)
$$Q_{32}(x, y) = L_4(x)\,L_3(y) . \tag{13-18}$$
It should be evident that the polynomials for a square pupil can be obtained from
those for a rectangular pupil by letting a = b , i.e., by using the same scale for the x and y
axes. Products of Chebyshev polynomials (one for the x and the other for y axis), which
are also orthogonal over a rectangular or a square pupil, have been suggested for the
analysis of rectangular wavefronts [5]. However, they are not suitable for anamorphic
systems since they do not represent balanced aberrations for such systems.
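$$W(x, y) = \sum_j a_j\,Q_j(x, y) , \tag{13-19}$$
$$a_j = \frac{1}{4}\int_{-1}^{1}\int_{-1}^{1} W(x, y)\,Q_j(x, y)\,dx\,dy . \tag{13-20}$$
$$\langle W(x, y) \rangle = a_1 . \tag{13-21}$$
$$\left\langle [W(x, y)]^2 \right\rangle = \sum_j a_j^2 . \tag{13-22}$$
$$\sigma_W^2 = \left\langle [W(x, y)]^2 \right\rangle - \langle W(x, y) \rangle^2 = \sum_{j \ne 1} a_j^2 . \tag{13-23}$$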
and obtain it continuously across the pupil. Because of the orthogonality of the Legendre
polynomials, the coefficients are independent of each other, and an orthogonal aberration
term can be added to or subtracted from the aberration function without affecting the
other terms.
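As an illustration of the expansion and the variance sum, the sketch below expands the x-coma aberration $x^3$ in orthonormal Legendre polynomials and recovers its variance of 1/7 from the sum of squared coefficients with the piston term excluded. It assumes the normalization $L_n(x) = \sqrt{2n+1}\,P_n(x)$, which makes the polynomials orthonormal under the pupil average, and it uses a simple midpoint rule in one dimension because the aberration does not depend on $y$. The function names and the number of samples are hypothetical choices.

```python
import math

def legendre_orthonormal(n, x):
    """L_n(x) = sqrt(2n + 1) * P_n(x), using the three-term recurrence."""
    if n == 0:
        return 1.0
    p_prev, p = 1.0, x
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    return math.sqrt(2 * n + 1) * p

def coefficients_1d(w, n_max, samples=4000):
    """Midpoint-rule estimate of a_n = (1/2) * integral of w(x) L_n(x) dx."""
    h = 2.0 / samples
    xs = [-1.0 + (i + 0.5) * h for i in range(samples)]
    return [0.5 * h * sum(w(x) * legendre_orthonormal(n, x) for x in xs)
            for n in range(n_max + 1)]

a = coefficients_1d(lambda x: x ** 3, n_max=3)
variance = sum(c * c for c in a[1:])   # exclude the piston coefficient
print(a, variance)
```

Only the tilt and coma coefficients are nonzero, and dropping the tilt coefficient leaves the balanced-coma variance of 4/175, which illustrates how an orthogonal term can be removed without affecting the others.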
$$\langle g(x, y) \rangle = \frac{1}{\pi}\int_0^1\int_0^{2\pi} g(x, y)\,\rho\,d\rho\,d\theta , \tag{13-24}$$
$$\sigma_{cx}^2 = 5/64 . \tag{13-25}$$
Now we balance x-coma with tilt, as in Eq. (13-11). Its variance across the circular
pupil is given by
$$\sigma_{bcx}^2 = \frac{5}{64} + \frac{b}{4} + \frac{b^2}{4} . \tag{13-26}$$
$$W_{bcx}(x, y) = x^3 - (1/2)\,x . \tag{13-27}$$
Similarly, we can show that when a certain amount of the x-spherical aberration is
balanced by $-3/4$ of that amount of the x-defocus, its sigma value is reduced by a factor
of $\sqrt{10}$ from $(1/8)\sqrt{5/2}$ to $1/16$. The balanced x-spherical aberration is $x^4 - (3/4)x^2$.
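The same balancing calculation can be repeated for the circular pupil with the average of Eq. (13-24). The sketch below is a hedged illustration using the closed form for the mean of $x^k$ over a unit disc, $2\,(k-1)!!/[(k+2)\,k!!]$ for even $k$, which follows from the radial and angular integrals. The function names are hypothetical.

```python
from fractions import Fraction

def double_factorial(k):
    """k!! with the convention (-1)!! = 0!! = 1."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result

def disc_mean_power(k):
    """Mean of x**k over the unit disc; zero for odd k."""
    if k % 2:
        return Fraction(0)
    return Fraction(2 * double_factorial(k - 1), (k + 2) * double_factorial(k))

def balance_disc(n, m):
    """Minimize the variance of x**n + b*x**m over the unit disc.

    Returns (optimum b, unbalanced variance, balanced variance).
    """
    var_n = disc_mean_power(2 * n) - disc_mean_power(n) ** 2
    var_m = disc_mean_power(2 * m) - disc_mean_power(m) ** 2
    cov = disc_mean_power(n + m) - disc_mean_power(n) * disc_mean_power(m)
    return -cov / var_m, var_n, var_n - cov ** 2 / var_m

coma_disc = balance_disc(3, 1)
spherical_disc = balance_disc(4, 2)
print(coma_disc)
print(spherical_disc)
```

The coma variance falls from 5/64 to 1/64 with $b = -1/2$, and the spherical-aberration variance falls from 5/128 to 1/256 with $b = -3/4$, a factor of 10 in variance and hence $\sqrt{10}$ in sigma.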
$$\frac{1}{\pi}\int_0^1\int_0^{2\pi} F_j(x, y)\,F_{j'}(x, y)\,\rho\,d\rho\,d\theta = \delta_{jj'} . \tag{13-28}$$
We note that there are x-polynomials for defocus, balanced coma, and balanced spherical
aberration, but no corresponding y-polynomials. The $y^2$ term representing the y-defocus
appears in $F_6$, the $y^3$ term representing y-coma appears in $F_{10}$, and the $y^4$ term
representing the y-spherical aberration appears in $F_{15}$. The only difference between
the polynomials for a circular and elliptical pupil is the scaling of the x- and y-axes. An
aberration function across such a pupil can be expanded in terms of these polynomials in
the same manner as in Section 13.6.
The x-polynomials for square and circular pupils are compared in Table 2. They
illustrate how the sigma values or the balancing aberrations differ for the two pupils. The
polynomials for a rectangular or an elliptical pupil can be obtained from the
corresponding polynomials for a square or a circular pupil by simply scaling the ( x , y )
coordinates.
j    $F_j(x, y)$                                                        Aberration name
1    $1$                                                                Piston
2    $2x$                                                               x-tilt
3    $2y$                                                               y-tilt
4    $4x^2 - 1$                                                         x-defocus
5    $2\sqrt{6}\,xy$
6    $\sqrt{2}\,(x^2 + 3y^2 - 1)$
7    $4\,(2x^3 - x)$                                                    x-primary coma
8    $(4/\sqrt{5})\,(6x^2y - y)$
9    $4\,(x^3 + 3xy^2 - x)$
10   $(4/\sqrt{5})\,(3x^2y + 5y^3 - 3y)$
11   $16x^4 - 12x^2 + 1$                                                x-primary spherical
12   $2\sqrt{2}\,(8x^3y - 3xy)$
13   $\sqrt{10/7}\,(8x^4 + 24x^2y^2 - 9x^2 - 3y^2 + 1)$
14   $4\sqrt{2}\,(3x^3y + 5xy^3 - 3xy)$
15   $\sqrt{2/7}\,(3x^4 + 30x^2y^2 + 35y^4 - 6x^2 - 30y^2 + 3)$

Similarly, Table 13-5 summarizes which polynomial set to use for analyzing the
aberrations of a rotationally symmetric system with different pupil shapes. For a circular
pupil, the appropriate polynomials are the Zernike circle polynomials, since they are
orthogonal over and represent balanced aberrations for such a pupil. If the pupil is
elliptical, as in the case of the human eye, the appropriate polynomials are those given in
Tables 8-1 to 8-3 obtained by orthogonalizing the circle polynomials over the elliptical
pupil. They cannot be obtained from the circle polynomials by simply scaling the x and y
axes of a circular pupil. Although the polynomials thus obtained will be orthogonal over
an elliptical pupil, they will not represent the balanced aberrations for it. For a rectangular
pupil, e.g., a rectangular beam passing through such a system, the appropriate
polynomials are those given in Tables 9-1 to 9-3. The 2D Legendre polynomials are not
suitable because, although they are orthogonal over the pupil, they do not represent
balanced aberrations for it. The polynomials for a square pupil can be obtained as a
special case of those for a rectangular pupil. They are given in Tables 10-1 to 10-3.
Aberration   Square pupil                          Circular pupil
Defocus      $(\sqrt{5}/2)\,(3x^2 - 1)$            $4x^2 - 1$
Coma         $(\sqrt{7}/2)\,(5x^3 - 3x)$           $4\,(2x^3 - x)$
Spherical    $(3/8)\,(35x^4 - 30x^2 + 3)$          $16x^4 - 12x^2 + 1$
Table 13-4. Appropriate polynomials for an anamorphic system with different pupil
shapes.
13.9 SUMMARY
An anamorphic imaging system has only two pairs of Gaussian conjugates,
compared to an infinite number for a rotationally symmetric imaging system. The
diffraction PSF or OTF of these systems, which depends on the shape of the exit pupil
and the aberration across it, is the same for the same pupil function. It is assumed that the
aperture stop lies in the image space of the system so that it is also its exit pupil.
The aberration function of an anamorphic system depends on the object and pupil
coordinates $(p, q)$ and $(x, y)$, respectively, through six reflection invariants $p^2$, $q^2$, $x^2$,
$y^2$, $px$, and $qy$, compared to three rotational invariants $p^2 + q^2$, $x^2 + y^2$, and $px + qy$
in the case of a rotationally symmetric system. Its aberration terms are separable in the
pupil coordinates. The degree of an aberration term is even, and the aberration function
accordingly consists of aberrations of even orders only. There are 16 primary aberrations
[see Eq. (13-4)] as opposed to only five for a rotationally symmetric system [see Eq. (2-
16)].
Products of Chebyshev polynomials (one for the x axis and the other for the y axis),
which are also orthogonal over a rectangular or a square pupil, have been suggested for
wavefront analysis, but they are not suitable for anamorphic systems, since they do not
represent balanced aberrations for such systems [5]. For a system with an axis of
rotational symmetry, as with spherical optics, the aberrations are not separable in
Cartesian coordinates, and the products of the x- and y-Legendre polynomials are not
suitable for expanding an aberration function for a rectangular pupil. The rectangular
polynomials for such systems are those obtained by orthogonalizing the Zernike circle
polynomials over a rectangular pupil, as discussed in Chapter 9.
Chapter 14
Numerical Wavefront Analysis*
14.1 INTRODUCTION
In this chapter, we consider how best to determine the orthonormal expansion or the
aberration coefficients from the wavefront data measured at an array of points, as, for
example, in a phase-measuring interferometer [1]. The problem of determining the
expansion coefficients when the measured data are the wavefront slopes, as, for example,
in a Shack–Hartmann sensor [2] is also discussed. Although we have considered optical
imaging systems with several different pupil shapes, our focus in this chapter is on a
system with a circular pupil. The analysis given here for such a pupil can be extended to
systems with other pupil shapes.
In practice, what is needed in both optical design and fabrication is the wavefront.
The wavefront aberrations determine the image quality in optical design. In fabrication
and testing of an optical surface, the wavefront errors determine surface errors, and thus
the polishing requirements to obtain the desired surface. Similarly, in adaptive optics, the
signal for the actuators of a deformable mirror to negate the aberrations, such as those
introduced by atmospheric turbulence, comes from the wavefront data. Hence, there is a
need to determine the Zernike coefficients from the wavefront data measured by a
wavefront sensor, or from the slope data provided by a slope sensor. In this chapter, we
present the two main mathematical approaches to determine the expansion coefficients:
an integration method for orthogonal solutions and the classic least squares approach.
We also illustrate the methods with some numerical examples for determining the
Zernike coefficients from the wavefront data or the wavefront slope data. The key points
considered are: how the number of data points affects the accuracy of the coefficients,
how the noise in the data affects this accuracy, and how many Zernike polynomials are
needed for adequate representation of the data.
* This chapter was contributed by Prof. Eva Acosta and Dr. Justo Arines of the Departamento de Física Aplicada, Universidad
de Santiago de Compostela, Galicia, Spain.
14.2.1 Theory
In Chapter 3, we discussed the orthonormal polynomials to represent the aberrations
of a system with a certain shape of its exit pupil. In Chapter 4, we considered the specific
case of a system with a circular pupil. Consider an aberration function W(x, y) of a system
expanded in terms of J Zernike circle polynomials in the form
$$W(x, y) = \sum_{j=1}^{J} a_j\,Z_j(x, y) . \tag{14-1}$$
Because of the orthonormality of the Zernike polynomials, the expansion coefficients are
given by
where the limits of integration in Eq. (14-3) and others are the same as in Eq. (14-2),
unless specified otherwise. Similarly, the variance of the aberration function is given by
$$\sigma^2 = \sum_{j=2}^{J} a_j^2 , \tag{14-4}$$
In practice, the measured data are available at a finite number of points, and the
integral in Eq. (14-3) reduces to a sum, thus causing some error in the value of the
integral. The accuracy of the integral can be improved by interpolating the data to yield
their values at a set of points and using them to perform numerical integration by
algorithms such as adaptive integration, Monte Carlo integration, or cubature formulas
among others [3]. In the least squares (LS) approach, we determine the expansion
coefficients by minimizing the difference between the measured wavefront and the
wavefront estimated with a certain number J of the Zernike polynomials by solving a
linear system of equations [4]
$$D_{N \times 1} = A_{N \times J}\,\hat{a}_{J \times 1} , \tag{14-5}$$
where $D$ is the column vector of $N$ data values, $\hat{a}$ is a column vector containing the $J$
expansion coefficients, and $A$ is an $N \times J$ matrix representing the values of the Zernike
polynomials at the locations of the data points according to
$$A_{N \times J} = \begin{pmatrix} Z_1(x_1, y_1) & Z_2(x_1, y_1) & \cdots & Z_J(x_1, y_1) \\ Z_1(x_2, y_2) & Z_2(x_2, y_2) & \cdots & Z_J(x_2, y_2) \\ \vdots & \vdots & & \vdots \\ Z_1(x_N, y_N) & Z_2(x_N, y_N) & \cdots & Z_J(x_N, y_N) \end{pmatrix}_{N \times J} . \tag{14-6}$$
In general, $A$ is not a square matrix, because the number of data points is larger than
the number of polynomials representing the wavefront. A pseudoinverse matrix is used to
evaluate the Zernike coefficients [4]:
$$\hat{a}_{J \times 1} = \left( A^T_{J \times N}\,A_{N \times J} \right)^{-1} A^T_{J \times N}\,D_{N \times 1} . \tag{14-7}$$
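The least squares estimate of Eq. (14-7) can be sketched in a few lines. The example below is an illustration under stated assumptions, not the procedure used for the numerical results of this chapter: it uses only the first four orthonormal Zernike circle polynomials, a hypothetical 9 × 9 grid clipped to the unit disc (which need not coincide with the arrays of Figure 14-2), noise-free synthetic data, and a plain Gaussian elimination of the normal equations in place of a library pseudoinverse. All names are hypothetical.

```python
import math

def zernike_basis(x, y):
    """First four orthonormal Zernike circle polynomials at (x, y)."""
    r2 = x * x + y * y
    return [1.0, 2.0 * x, 2.0 * y, math.sqrt(3.0) * (2.0 * r2 - 1.0)]

def solve(matrix, rhs):
    """Solve matrix * a = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    a = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][c] * a[c] for c in range(i + 1, n))
        a[i] = (aug[i][n] - tail) / aug[i][i]
    return a

def least_squares(points, data):
    """Return a_hat = (A^T A)^(-1) A^T D for the basis above."""
    A = [zernike_basis(x, y) for x, y in points]        # N x J matrix
    J = len(A[0])
    ata = [[sum(row[i] * row[k] for row in A) for k in range(J)]
           for i in range(J)]
    atd = [sum(row[i] * d for row, d in zip(A, data)) for i in range(J)]
    return solve(ata, atd)

# Hypothetical test case: known coefficients sampled without noise.
grid = [-1.0 + 0.25 * i for i in range(9)]
points = [(x, y) for x in grid for y in grid if x * x + y * y <= 1.0]
true_a = [0.1, -0.3, 0.2, 0.5]
data = [sum(a * z for a, z in zip(true_a, zernike_basis(x, y)))
        for x, y in points]
a_hat = least_squares(points, data)
print(a_hat)
```

With noise-free data generated from the same four polynomials, the estimate reproduces the input coefficients to rounding error. Forming $A^T A$ explicitly is adequate for a small, well-conditioned example; a QR or singular-value solution is generally preferable when many polynomials are fitted.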
The P-V number of the aberration function is 14.5, and its sigma value is 2.69 waves. Its
isometric and contour plots are shown in Figure 14-1. The contour spacing is one wave.
$$\sigma_n = \frac{p}{100}\int \left[ W(x, y) \right] dx\,dy , \tag{14-9}$$
Figure 14-1. (a) Isometric and (b) contour plots of a Seidel aberration function
represented by Eq. (14-8).
Figure 14-2. Square arrays of data points on a unit disc. (a) 9 × 9 array with 45 data
points, (b) 15 × 15 array with 145 data points, and (c) 21 × 21 array with 305 data
points.
To determine the coefficients by the integration method of Eq. (14-3), we
interpolate the data by bicubic splines based on SPLIN2, as explained in [5], to yield the
value of the integrand at the nodes of the Albrecht cubature formula that allows exact
evaluation of polynomial integrands up to degree 15 [6]. When determining the
coefficients by the least squares approach, the matrix inversion is performed by the
inv(M) function of Matlab [7]. The main advantage of the numerical integration method
is the independence of the calculated coefficients. In the least squares approach, there is
some cross-coupling of the coefficients if the number of coefficients estimated is smaller
than those present in the aberration function representing the data.
The expansion coefficients obtained by the integration and the LS approaches for
various numbers of data values and different amounts of noise are compared in Figure
14-3. The Zernike polynomials up to and including the seventh order, or J = 36, are used
in determining their coefficients. The figure shows that the accuracy of the retrieved
coefficients in both approaches increases with the number of data points, and decreases
with an increase in the amount of noise. The quality of the wavefront fit, defined as the
root mean square difference between the values of the estimated and the actual aberration
functions Ŵ and W at the data points, i.e.,
$$Q = \left\{ \frac{1}{N}\sum_{i=1}^{N} \left[ \hat{W}(x_i, y_i) - W(x_i, y_i) \right]^2 \right\}^{1/2} , \tag{14-10}$$
is given in Table 14-1. For a small number of points N (close to the number of data
points) and a small amount of noise, the integration method yields a slightly better fit
than the LS method due to some coupling between the lower- and the higher-order modes
in the LS method. However, as the noise increases, the LS method yields a slightly better
fit. This is because the interpolation method used in the integration method worsens as
the noise increases. As the number of data points increases, the quality of the fit becomes
approximately the same for the two methods.
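The fit quality of Eq. (14-10) is simply the root mean square difference between the estimated and actual wavefront values at the data points. A minimal sketch follows; the function name and the sample values are hypothetical.

```python
import math

def fit_quality(estimated, actual):
    """Root mean square difference between two equally long sequences."""
    if len(estimated) != len(actual) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    total = sum((e - a) ** 2 for e, a in zip(estimated, actual))
    return math.sqrt(total / len(actual))

# Hypothetical wavefront values (in waves) at three data points.
w_actual = [0.00, 0.50, -0.25]
w_estimated = [0.02, 0.47, -0.25]
q = fit_quality(w_estimated, w_actual)
print(q)
```

Because the sum runs only over the data points, a small value of Q does not by itself guarantee a good fit between the points, which is one reason to examine the coefficients as well.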
[Figure 14-3: plots of the estimated coefficients $\hat{a}_j$ versus $j$ ($j$ = 1 to 36) for the Integration Method and the LS Method, using 9 × 9, 15 × 15, and 21 × 21 arrays with 0%, 2%, 5%, and 10% noise.]
Table 14-1. Wavefront fit quality factor Q for a Seidel aberration function.
                         Q
         Integration Method               LS Method
$\sigma_n$   9 × 9   15 × 15   21 × 21      9 × 9   15 × 15   21 × 21
Figure 14-4. (a) Isometric and (b) contour plots of the eye aberration function.
j aj j aj j aj
Table 14-3. Wavefront fit quality factor Q for the human eye aberration function.
[Panels: Integration Method and LS Method, 9 × 9 array; $\hat{a}_j$ versus $j$ ($j$ = 1 to 36) for 0%, 2%, 5%, and 10% noise.]
Figure 14-5a. Estimated Zernike coefficients of the eye aberration function from
wavefront data on a 9 × 9 array with different amounts of noise.
[Panels: Integration Method and LS Method, 15 × 15 array; $\hat{a}_j$ versus $j$ ($j$ = 1 to 36) for 0%, 2%, 5%, and 10% noise.]
Figure 14-5b. Estimated coefficients of the eye aberration function from wavefront
data on a 15 × 15 array with different amounts of noise.
[Panels: Integration Method and LS Method, 21 × 21 array; $\hat{a}_j$ versus $j$ ($j$ = 1 to 36) for 0%, 2%, 5%, and 10% noise.]
Figure 14-5c. Estimated Zernike coefficients of the eye aberration function from
wavefront data on a 21 × 21 array with different amounts of noise.
14.3.1 Theory
$$(\Delta x, \Delta y) = f\left[ \frac{\partial W(x_l, y_l)}{\partial x}, \frac{\partial W(x_l, y_l)}{\partial y} \right] , \tag{14-11}$$
where ( xl , yl ) is the center of a lenslet, and f is its focal length. Because of the spatial
derivative relationship between them, such a sensor is called the wavefront slope sensor.
We assume that the wavefront sensor provides accurate measurements of the local slopes
of the wavefront under test, affected only by the measurement noise and the setup
constraints [9].
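The relation of Eq. (14-11) between spot displacements and local wavefront slopes can be sketched as follows. This is a minimal illustration: the function names, the defocus-like test wavefront $W = c\,(x^2 + y^2)$, and the numerical values are hypothetical, and consistent units for the slopes and the focal length are assumed.

```python
def spot_displacements(slopes, focal_length):
    """Displacements (dx, dy) = f * (dW/dx, dW/dy) for each lenslet."""
    return [(focal_length * sx, focal_length * sy) for sx, sy in slopes]

def slopes_from_displacements(displacements, focal_length):
    """Invert Eq. (14-11): local slopes from measured spot displacements."""
    return [(dx / focal_length, dy / focal_length) for dx, dy in displacements]

# Hypothetical lenslet centers and a defocus-like wavefront W = c (x^2 + y^2).
c = 1.0e-3
centers = [(-0.5, -0.5), (0.0, 0.0), (0.5, 0.25)]
true_slopes = [(2.0 * c * x, 2.0 * c * y) for x, y in centers]

f = 5.0                                   # hypothetical lenslet focal length
measured = spot_displacements(true_slopes, f)
recovered = slopes_from_displacements(measured, f)
print(recovered)
```

In a real sensor the measured displacements also carry centroiding noise, which is why the choice of vector polynomials used to convert slopes into coefficients matters.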
This problem has been studied and solved by several authors [8-12], and all of them
arrived at different solutions for the vector functions; in other words, the set of vector
polynomials is not unique [13]. A straightforward and intuitive way to find a set of vector
polynomials [12] is to apply the divergence theorem [14] to the scalar function $W(x, y)$
and a vector field $\vec{V}_j(x, y)$ on a unit circular pupil with a contour $C$:
$$\int \nabla W(x, y) \cdot \vec{V}_j(x, y)\,dx\,dy = \oint_C W(x, y)\,\vec{V}_j(x, y) \cdot d\vec{l} - \int W(x, y)\,\nabla \cdot \vec{V}_j(x, y)\,dx\,dy , \tag{14-13}$$
where $d\vec{l}$ is the differential contour element pointing out of the circumference of the unit
pupil. Thus, if there exists a vector function $\vec{V}_j(x, y)$ such that
$$\nabla \cdot \vec{V}_j(x, y) = -Z_j(x, y) \tag{14-14}$$
and
$$\vec{V}_j(\rho = 1) \cdot d\vec{l} = 0 , \tag{14-15}$$
then using Eq. (14-3), Eq. (14-13) yields Eq. (14-12), which, in turn, can be used to
obtain the Zernike coefficients.
The vector polynomials $\vec{G}_j(x, y)$ proposed by Gavrielides [10] require a more
restrictive solution for $\vec{V}_j(x, y)$ as being irrotational [15], deriving therefore from the
gradient of a scalar function $U(x, y)$. Thus, in order to find these polynomials, we must
solve the Poisson equation
$$\nabla^2 U_j(x, y) = -Z_j(x, y) \tag{14-16}$$
$$\nabla U_j(\rho = 1) \cdot d\vec{l} = 0 , \tag{14-17}$$
and $\vec{G}_j(x, y)$ can be straightforwardly evaluated as $\vec{G}_j(x, y) = \nabla U_j(x, y)$.
Using one set of vector functions or another is important because the slope data are
inevitably afflicted with noise. Let $\vec{n}(x, y)$ represent the noise vector associated with the
measured slope data, so that the measured slopes are given by $\nabla W(x, y) + \vec{n}(x, y)$. (The
noise sources in a Shack–Hartmann sensor have been described in detail by Neal,
Copland, and Neal [9].) Equation (14-12) is thus modified and the coefficients we
calculate in practice are given by
$$\tilde{a}_j = \int \left[ \nabla W(x, y) + \vec{n}(x, y) \right] \cdot \vec{V}_j\,dx\,dy . \tag{14-18}$$
$$\sigma_j^2 = \left\langle (\tilde{a}_j - a_j)^2 \right\rangle = \left\langle \int \vec{n}(x, y) \cdot \vec{V}_j(x, y)\,dx\,dy \int \vec{n}(x', y') \cdot \vec{V}_j(x', y')\,dx'\,dy' \right\rangle . \tag{14-19}$$
Assuming uncorrelated random Gaussian noise with zero mean and covariance
$\langle \vec{n}(x, y)\,\vec{n}(x', y') \rangle = \langle n^2 \rangle\,\delta(x - x', y - y')$, the variance associated with the estimated coefficient
is given by [13]
$$\sigma_j^2 = \iint \left\langle \vec{n}(x, y)\,\vec{n}(x', y') \right\rangle \vec{V}_j(x, y)\,\vec{V}_j(x', y')\,dx\,dy\,dx'\,dy' \propto \int \left| \vec{V}_j(x, y) \right|^2 dx\,dy , \tag{14-20}$$
and hence, the vector polynomials $\vec{V}_j(x, y)$ for which $\int |\vec{V}_j(x, y)|^2\,dx\,dy$ is minimum will
propagate less noise to the expansion coefficients. Solomon et al. [13,16] showed that the
vector functions given by Gavrielides obey this property, and therefore we will use them
for the numerical simulations in the next section. These polynomials up to the eighth
degree are listed in Table 14-4.
j G jx (U, T) G jy (U, T)
1 0 0
10 (1 / 32 )[U 4 cos 4T ( 4U 4 5U 2 ) cos 2T] (1 / 32)[U4 sin 4T (4U4 5U2 )sin 2T]
14 (1/ 40)[U5 cos5T (5U5 6U3 ) cos T] (1 / 40 )[U 5 sin 5T (5U 5 6U 3 ) sin T]
15 (1 / 40)[U5 sin 5T (5U5 6U3 ) sin T] (1/ 40)[U5 cos5T (5U5 6U3 ) cosT]
20 (1/ 48)[U6 cos6T (6U6 7U4 )cos4T] (1/ 48) [U6 sin 6T (6U6 7U4 ) sin 4T]
21 (1/ 48)[U6 sin6T (6U6 7U4 )sin 4T] (1/ 48)[U6 cos6T (6U6 7U4 )cos4T]
27 (1/ 56)[U7 sin 7T (7U7 8U5 )sin 5T] (1 / 56)[ U 7 cos 7T (7U7 8U5 ) cos 5T]
28 (1/ 56)[U7 cos7T (7U7 8U5 )cos5T] (1 / 56)[U7 sin 7T (7U7 8U5 ) sin 5T]
(1/ 2)(U2 1)[(7U6 5U4 )sin 4T (1/ 2)(U2 1)[(7U6 5U4 )cos4T
31
2(7U6 8U4 2U2 )sin 2T] 2(7U6 8U4 2U2 )cos2T]
(1/ 2)(U2 1)[(7U6 5U4 )cos4T (1/ 2)(U2 1)[(7U6 5U4 )sin 4T
32
2(7U6 8U4 2U2 )cos2T] 2(7U6 8U4 2U2 )sin 2T]
35 (1/ 8)[U8 sin8T (8U8 9U6 )sin 6T] (1 / 8)[ U8 sin 8T (8U8 9U 6 ) sin 6T]
36 (1 / 8)[U8 sin 8T (8U8 9U 6 ) sin 6T] (1 / 8)[U8 sin 8T (8U8 9U6 ) sin 6T]
(1/ 8)(U2 1)[(28U7 35U5 10U3 )cos3T (1 / 8)(U 2 1)[(28U 7 35U 5 10U 3 ) sin 3T
38
3(14U7 21U5 9U3 U)cos T] 3(14U 7 21U 5 9U 3 U )sin T]
(1 / 8)(U 2 1)[(28U7 35U5 10U3 )sin 3T (1/ 8)(U2 1)[(28U7 35U5 10U3 )cos3T
39
3(14U7 21U5 9U3 U)sin T] 3(14U7 21U5 9U3 U)cos T]
(1/ 8)(U2 1)[3(4U7 3U5 )cos5T (1/ 8)(U2 1)[3(4U7 3U5 )sin5T
40
(28U7 35U5 10U3 )cos3T] (28U7 35U5 10U3 )sin3T]
44 (1 / 72 )[U 9 cos 9T (9U 9 10U7 ) cos 7T] (1/ 72)[U9 sin 9T (9U9 10U7 )sin 7T]
45 (1/ 72)[U9 sin 9T (9U9 10U7 )sin 7T] (1/ 72)[U9 cos9T (9U9 10U7 )cos7T]
\[
\sigma_s^2 = \int \left[ \nabla W(x,y) - \langle \nabla W(x,y) \rangle \right]^2 dx\, dy , \qquad \text{(14-21)}
\]
where $\sigma_s$ is the standard deviation of the transverse ray aberrations, or the spot sigma. For large aberrations, minimizing $\sigma_s^2$
is a useful criterion for obtaining a good MTF (modulation transfer function) at low
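spatial frequencies. Thus, if we expand the aberration function $W(x,y)$ in terms of a set of
polynomials $B_j(x,y)$ in the form
\[
W(x,y) = \sum_j b_j B_j(x,y) , \qquad \text{(14-22)}
\]
then its gradient is
\[
\nabla W(x,y) = \sum_j b_j \nabla B_j(x,y) , \qquad \text{(14-23)}
\]
where the polynomial gradients are orthonormal to each other over a unit disc, i.e.,
\[
\int \nabla B_j(x,y) \cdot \nabla B_{j'}(x,y) \, dx\, dy = \delta_{jj'} . \qquad \text{(14-24)}
\]
As a result,
\[
b_j = \int \nabla W(x,y) \cdot \nabla B_j(x,y) \, dx\, dy , \qquad \text{(14-25)}
\]
and
\[
\sigma_s^2 = \sum_{j=2} b_j^2 . \qquad \text{(14-26)}
\]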
The polynomials $\{B_j(x,y)\}$ form a complete set, just like the Zernike circle
polynomials, but they are not orthogonal to each other over a unit disc. For all j with
$n = m$, they are given in terms of the Zernike polynomials by
\[
B_j(x,y) = \frac{1}{\sqrt{2n(n+1)}} \, Z_j(x,y) , \qquad \text{(14-27)}
\]
and for all j with $n \neq m$ by a suitable linear combination of two Zernike polynomials with
the same azimuthal frequency:
\[
B_j(x,y) = \frac{1}{\sqrt{4n(n+1)}} \left[ Z_j(x,y) - \sqrt{\frac{n+1}{n-1}} \; Z_{j'(n'=n-2,\, m'=m)}(x,y) \right] . \qquad \text{(14-28)}
\]
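As a quick numerical check of Eqs. (14-27) and (14-28), the radial part of B_j can be assembled from the standard Zernike radial polynomials. The sketch below is illustrative only: the function names are hypothetical, and it assumes the orthonormal Zernike normalization, sqrt(n + 1) for m = 0 and sqrt(2(n + 1)) for m ≠ 0. Because Z_j and Z_j' share the same angular factor, only the radial coefficients need to be combined.

```python
from math import factorial, sqrt


def zernike_radial(n, m):
    """Coefficients {power: value} of the Zernike radial polynomial R_n^m(rho)."""
    coeffs = {}
    for s in range((n - m) // 2 + 1):
        num = (-1) ** s * factorial(n - s)
        den = (factorial(s) * factorial((n + m) // 2 - s)
               * factorial((n - m) // 2 - s))
        coeffs[n - 2 * s] = num / den
    return coeffs


def zernike_norm(n, m):
    """Normalization factor of the orthonormal Zernike polynomial (assumed convention)."""
    return sqrt(n + 1) if m == 0 else sqrt(2 * (n + 1))


def b_radial(n, m):
    """Radial coefficients {power: value} of B_j for radial degree n, azimuthal frequency m."""
    if n == 0:
        return {0: 1.0}                       # B_1 = 1 (piston)
    zj = {p: zernike_norm(n, m) * c for p, c in zernike_radial(n, m).items()}
    if n == m:                                # Eq. (14-27)
        return {p: c / sqrt(2 * n * (n + 1)) for p, c in zj.items()}
    # Eq. (14-28): subtract sqrt((n+1)/(n-1)) times the Zernike term of degree n - 2
    weight = sqrt((n + 1) / (n - 1)) * zernike_norm(n - 2, m)
    for p, c in zernike_radial(n - 2, m).items():
        zj[p] = zj.get(p, 0.0) - weight * c
    return {p: c / sqrt(4 * n * (n + 1)) for p, c in zj.items()}
```

For example, `b_radial(4, 0)` gives coefficients close to 1.5, −2, and 0.5 for ρ⁴, ρ², and the constant term, i.e., (1/2)(3ρ⁴ − 4ρ² + 1), and `b_radial(2, 2)` gives ρ²/√2.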
Zhao and Burge [19,20] have constructed a set of vector polynomials $\{\vec{S}_j(x,y)\}$ by
the Gram–Schmidt orthonormalization of the gradients of the Zernike circle polynomials.
These are in fact the gradients of the $B_j(x,y)$ polynomials, i.e., $\vec{S}_j(x,y) = \nabla B_j(x,y)$. The first
45 polynomials of the two sets in polar coordinates are given in Tables 14-5 and 14-6.
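Once the $b_j$ coefficients are known from Eq. (14-25), the Zernike coefficients $a_j$ can
be obtained according to [19]:
\[
a_j =
\begin{cases}
\dfrac{b_{j(n,m)}}{\sqrt{2n(n+1)}} - \dfrac{b_{j'(n+2,\,m)}}{\sqrt{4(n+1)(n+2)}} , & n = m , \\[2ex]
\dfrac{b_{j(n,m)}}{\sqrt{4n(n+1)}} - \dfrac{b_{j'(n+2,\,m)}}{\sqrt{4(n+1)(n+2)}} , & n \neq m ,
\end{cases}
\qquad \text{(14-29)}
\]
where $b_{j'(n+2,\,m)}$ is taken as zero when the degree $n + 2$ is not included in the expansion. The
second term arises because, according to Eq. (14-28), the polynomial $B_{j'}$ of degree $n + 2$ contains
the Zernike polynomial of degree $n$ with the same azimuthal frequency.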
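The conversion from the b_j to the Zernike coefficients a_j follows from substituting Eqs. (14-27) and (14-28) into W = Σ b_j B_j and collecting the terms that multiply each Z_j. The sketch below is one way to organize that bookkeeping; it is not taken from the source. The function name and the mode key (n, m, kind), with kind equal to 'c' for cosine, 's' for sine, and '0' for m = 0, are hypothetical.

```python
from math import sqrt


def zernike_from_b(b):
    """Convert {(n, m, kind): b_j} into {(n, m, kind): a_j}.

    Each B_j of degree n also contains the Zernike term of degree n - 2
    with the same m and kind, so those lower-degree keys are added even
    when their own b coefficient is absent (treated as zero).
    """
    keys = set(b)
    keys |= {(n - 2, m, kind) for (n, m, kind) in b if n - 2 >= m}
    a = {}
    for (n, m, kind) in keys:
        bj = b.get((n, m, kind), 0.0)
        if n == 0:
            own = bj                              # B_1 = Z_1 = 1
        elif n == m:
            own = bj / sqrt(2 * n * (n + 1))      # from Eq. (14-27)
        else:
            own = bj / sqrt(4 * n * (n + 1))      # from Eq. (14-28)
        higher = b.get((n + 2, m, kind), 0.0)     # zero if degree n + 2 is absent
        a[(n, m, kind)] = own - higher / sqrt(4 * (n + 1) * (n + 2))
    return a


# A wavefront equal to sqrt(80) * B_j(n=4, m=2, cosine) is
# Z(4, 2) - sqrt(5/3) * Z(2, 2) according to Eq. (14-28).
a = zernike_from_b({(4, 2, 'c'): sqrt(80.0)})
```

In this example the coefficient of the n = 4 mode comes out as 1, and the n = 2 astigmatism mode picks up −√(5/3) even though its own b coefficient is zero.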
j     n   m   B_j(ρ, θ)
1     0   0   1
2     1   1   ρ cos θ
3     1   1   ρ sin θ
4     2   0   (1/√2)(ρ² − 1)
5     2   2   (1/√2) ρ² sin 2θ
6     2   2   (1/√2) ρ² cos 2θ
7     3   1   √(3/2) (ρ³ − ρ) sin θ
8     3   1   √(3/2) (ρ³ − ρ) cos θ
9     3   3   (1/√3) ρ³ sin 3θ
10    3   3   (1/√3) ρ³ cos 3θ
11    4   0   (1/2)(3ρ⁴ − 4ρ² + 1)
12    4   2   √2 (ρ⁴ − ρ²) cos 2θ
13    4   2   √2 (ρ⁴ − ρ²) sin 2θ
14    4   4   (1/2) ρ⁴ cos 4θ
15    4   4   (1/2) ρ⁴ sin 4θ
16    5   1   √(5/2) (2ρ⁵ − 3ρ³ + ρ) cos θ
17    5   1   √(5/2) (2ρ⁵ − 3ρ³ + ρ) sin θ
18    5   3   √(5/2) (ρ⁵ − ρ³) cos 3θ
19    5   3   √(5/2) (ρ⁵ − ρ³) sin 3θ
20    5   5   (1/√5) ρ⁵ cos 5θ
21    5   5   (1/√5) ρ⁵ sin 5θ
22    6   0   (1/√24)(20ρ⁶ − 36ρ⁴ + 18ρ² − 2)
23    6   2   √(3/4) (5ρ⁶ − 8ρ⁴ + 3ρ²) sin 2θ
24    6   2   √(3/4) (5ρ⁶ − 8ρ⁴ + 3ρ²) cos 2θ
25    6   4   √3 (ρ⁶ − ρ⁴) sin 4θ
26    6   4   √3 (ρ⁶ − ρ⁴) cos 4θ
27    6   6   (1/√6) ρ⁶ sin 6θ
28    6   6   (1/√6) ρ⁶ cos 6θ
29    7   1   √(7/2) (5ρ⁷ − 10ρ⁵ + 6ρ³ − ρ) sin θ
30    7   1   √(7/2) (5ρ⁷ − 10ρ⁵ + 6ρ³ − ρ) cos θ
31    7   3   √(7/2) (3ρ⁷ − 5ρ⁵ + 2ρ³) sin 3θ
32    7   3   √(7/2) (3ρ⁷ − 5ρ⁵ + 2ρ³) cos 3θ
33    7   5   √(7/2) (ρ⁷ − ρ⁵) sin 5θ
34    7   5   √(7/2) (ρ⁷ − ρ⁵) cos 5θ
35    7   7   (1/√7) ρ⁷ sin 7θ
36    7   7   (1/√7) ρ⁷ cos 7θ
37    8   0   (1/√8)(35ρ⁸ − 80ρ⁶ + 60ρ⁴ − 16ρ² + 1)
38    8   2   2(7ρ⁸ − 15ρ⁶ + 10ρ⁴ − 2ρ²) cos 2θ
39    8   2   2(7ρ⁸ − 15ρ⁶ + 10ρ⁴ − 2ρ²) sin 2θ
40    8   4   (7ρ⁸ − 12ρ⁶ + 5ρ⁴) cos 4θ
41    8   4   (7ρ⁸ − 12ρ⁶ + 5ρ⁴) sin 4θ
42    8   6   2(ρ⁸ − ρ⁶) cos 6θ
43    8   6   2(ρ⁸ − ρ⁶) sin 6θ
44    8   8   (1/√8) ρ⁸ cos 8θ
45    8   8   (1/√8) ρ⁸ sin 8θ
14.3.2 Alternative Approach for Obtaining Zernike Coefficients from Wavefront Slope Data 391
j     S_jx(ρ, θ)    S_jy(ρ, θ)
1     0    0
2     1    0
3     0    1
4     √2 ρ cos θ    √2 ρ sin θ
5     √2 ρ sin θ    √2 ρ cos θ
6     √2 ρ cos θ    −√2 ρ sin θ
7     √(3/2) ρ² sin 2θ    √(3/2) (−ρ² cos 2θ + 2ρ² − 1)
20    √5 ρ⁴ cos 4θ    −√5 ρ⁴ sin 4θ
21    √5 ρ⁴ sin 4θ    √5 ρ⁴ cos 4θ
22    √6 (10ρ⁵ − 12ρ³ + 3ρ) cos θ    √6 (10ρ⁵ − 12ρ³ + 3ρ) sin θ
23    √3 [(5ρ⁵ − 4ρ³) sin 3θ + (10ρ⁵ − 12ρ³ + 3ρ) sin θ]    √3 [−(5ρ⁵ − 4ρ³) cos 3θ + (10ρ⁵ − 12ρ³ + 3ρ) cos θ]
24    √3 [(5ρ⁵ − 4ρ³) cos 3θ + (10ρ⁵ − 12ρ³ + 3ρ) cos θ]    √3 [(5ρ⁵ − 4ρ³) sin 3θ − (10ρ⁵ − 12ρ³ + 3ρ) sin θ]
25    √3 [ρ⁵ sin 5θ + (5ρ⁵ − 4ρ³) sin 3θ]    √3 [−ρ⁵ cos 5θ + (5ρ⁵ − 4ρ³) cos 3θ]
26    √3 [ρ⁵ cos 5θ + (5ρ⁵ − 4ρ³) cos 3θ]    √3 [ρ⁵ sin 5θ − (5ρ⁵ − 4ρ³) sin 3θ]
27    √6 ρ⁵ sin 5θ    √6 ρ⁵ cos 5θ
28    √6 ρ⁵ cos 5θ    −√6 ρ⁵ sin 5θ
33    √(7/2) [ρ⁶ sin 6θ + (6ρ⁶ − 5ρ⁴) sin 4θ]    √(7/2) [−ρ⁶ cos 6θ + (6ρ⁶ − 5ρ⁴) cos 4θ]
34    √(7/2) [ρ⁶ cos 6θ + (6ρ⁶ − 5ρ⁴) cos 4θ]    √(7/2) [ρ⁶ sin 6θ − (6ρ⁶ − 5ρ⁴) sin 4θ]
35    √7 ρ⁶ sin 6θ    √7 ρ⁶ cos 6θ
36    √7 ρ⁶ cos 6θ    −√7 ρ⁶ sin 6θ
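The orthonormality of the gradients of the B_j polynomials can be checked without two-dimensional quadrature. The sketch below is a minimal check under stated assumptions, not the procedure used in the text: the inner product is taken as the average over the unit disc (the integral divided by the area π), and the function names are hypothetical. For two polynomials f(ρ) cos mθ and g(ρ) cos mθ (or both with sin mθ), the angular integration reduces the averaged inner product to ε ∫ [f′g′ρ + m² f g/ρ] dρ over 0 ≤ ρ ≤ 1, with ε = 2 for m = 0 and ε = 1 otherwise. Polynomials with different m, or with cosine versus sine dependence, are orthogonal by the angular integration alone.

```python
from math import factorial, sqrt


def zernike_radial(n, m):
    """Coefficients {power: value} of the Zernike radial polynomial R_n^m(rho)."""
    coeffs = {}
    for s in range((n - m) // 2 + 1):
        num = (-1) ** s * factorial(n - s)
        den = (factorial(s) * factorial((n + m) // 2 - s)
               * factorial((n - m) // 2 - s))
        coeffs[n - 2 * s] = num / den
    return coeffs


def zernike_norm(n, m):
    return sqrt(n + 1) if m == 0 else sqrt(2 * (n + 1))


def b_radial(n, m):
    """Radial coefficients of B_j from Eqs. (14-27) and (14-28)."""
    if n == 0:
        return {0: 1.0}
    zj = {p: zernike_norm(n, m) * c for p, c in zernike_radial(n, m).items()}
    if n == m:
        return {p: c / sqrt(2 * n * (n + 1)) for p, c in zj.items()}
    weight = sqrt((n + 1) / (n - 1)) * zernike_norm(n - 2, m)
    for p, c in zernike_radial(n - 2, m).items():
        zj[p] = zj.get(p, 0.0) - weight * c
    return {p: c / sqrt(4 * n * (n + 1)) for p, c in zj.items()}


def derivative(poly):
    return {p - 1: p * c for p, c in poly.items() if p != 0}


def multiply(f, g):
    out = {}
    for p, a in f.items():
        for q, c in g.items():
            out[p + q] = out.get(p + q, 0.0) + a * c
    return out


def integrate_0_1(poly, shift):
    """Integral over 0 <= rho <= 1 of poly(rho) * rho**shift."""
    return sum(c / (p + shift + 1) for p, c in poly.items())


def gradient_inner_product(n1, n2, m):
    """Disc-averaged inner product of grad B(n1, m) and grad B(n2, m), same trig type."""
    f, g = b_radial(n1, m), b_radial(n2, m)
    radial = integrate_0_1(multiply(derivative(f), derivative(g)), 1)
    azimuthal = m * m * integrate_0_1(multiply(f, g), -1) if m else 0.0
    return (2.0 if m == 0 else 1.0) * (radial + azimuthal)
```

With these definitions, `gradient_inner_product(3, 3, 1)` and `gradient_inner_product(7, 7, 5)` evaluate to 1 within rounding, while `gradient_inner_product(5, 3, 1)` and `gradient_inner_product(4, 2, 0)` evaluate to 0.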
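\[
\sigma_n = \frac{p}{100} \, \frac{1}{\pi} \int \left( 0.5 \, \frac{\partial W(x,y)}{\partial x} + 0.5 \, \frac{\partial W(x,y)}{\partial y} \right) dx\, dy . \qquad \text{(14-30)}
\]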
The numerical procedure for the integral approach to retrieve the Zernike
coefficients is the same as explained in the previous section. First, we interpolate the data
(separately for the x and y slopes) by bicubic splines [5] to evaluate the integrands at the
nodes of a cubature formula that allows exact evaluation of polynomial integrands up to
degree 15 [6]. We use the vector polynomials given by Gavrielides [10] to evaluate the
orthonormal Zernike coefficients. For the classic least squares approach, we solve the
system of linear equations given by
\[
D = C \hat{a} , \qquad \text{(14-31)}
\]
where D is now a column vector of 2N data values corresponding to both slopes at the
measurement points, â is a column vector containing the Zernike coefficients, and C is a
matrix representing the values of the derivatives of the Zernike polynomials at the
location of the data points according to
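\[
C_{2N \times J} =
\begin{pmatrix}
\dfrac{\partial Z_1(x_1, y_1)}{\partial x} & \dfrac{\partial Z_2(x_1, y_1)}{\partial x} & \cdots & \dfrac{\partial Z_J(x_1, y_1)}{\partial x} \\[2ex]
\dfrac{\partial Z_1(x_2, y_2)}{\partial x} & \dfrac{\partial Z_2(x_2, y_2)}{\partial x} & \cdots & \dfrac{\partial Z_J(x_2, y_2)}{\partial x} \\[2ex]
\vdots & \vdots & & \vdots \\[1ex]
\dfrac{\partial Z_1(x_N, y_N)}{\partial x} & \dfrac{\partial Z_2(x_N, y_N)}{\partial x} & \cdots & \dfrac{\partial Z_J(x_N, y_N)}{\partial x} \\[2ex]
\dfrac{\partial Z_1(x_1, y_1)}{\partial y} & \dfrac{\partial Z_2(x_1, y_1)}{\partial y} & \cdots & \dfrac{\partial Z_J(x_1, y_1)}{\partial y} \\[2ex]
\dfrac{\partial Z_1(x_2, y_2)}{\partial y} & \dfrac{\partial Z_2(x_2, y_2)}{\partial y} & \cdots & \dfrac{\partial Z_J(x_2, y_2)}{\partial y} \\[2ex]
\vdots & \vdots & & \vdots \\[1ex]
\dfrac{\partial Z_1(x_N, y_N)}{\partial y} & \dfrac{\partial Z_2(x_N, y_N)}{\partial y} & \cdots & \dfrac{\partial Z_J(x_N, y_N)}{\partial y}
\end{pmatrix}_{2N \times J} . \qquad \text{(14-32)}
\]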
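The least squares step can be illustrated with a small, noise-free synthetic example. The sketch below is only an illustration under stated assumptions: it uses the five orthonormal Zernike polynomials Z2 to Z6 written in Cartesian form (piston has zero gradient and cannot be recovered from slopes), a hypothetical 5 × 5 grid of sample points, and the normal equations solved by Gaussian elimination. The function names and the test coefficients are made up for the example and are not the eye aberration data of the text.

```python
from math import sqrt

# Cartesian gradients (dZ/dx, dZ/dy) of the orthonormal Zernike polynomials Z2..Z6.
GRADIENTS = [
    lambda x, y: (2.0, 0.0),                             # Z2 = 2x
    lambda x, y: (0.0, 2.0),                             # Z3 = 2y
    lambda x, y: (4 * sqrt(3) * x, 4 * sqrt(3) * y),     # Z4 = sqrt(3)(2x^2 + 2y^2 - 1)
    lambda x, y: (2 * sqrt(6) * y, 2 * sqrt(6) * x),     # Z5 = 2 sqrt(6) x y
    lambda x, y: (2 * sqrt(6) * x, -2 * sqrt(6) * y),    # Z6 = sqrt(6)(x^2 - y^2)
]


def slope_matrix(points):
    """Matrix C: N rows of x derivatives followed by N rows of y derivatives."""
    rows_x = [[g(x, y)[0] for g in GRADIENTS] for x, y in points]
    rows_y = [[g(x, y)[1] for g in GRADIENTS] for x, y in points]
    return rows_x + rows_y


def solve(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [r] for row, r in zip(A, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def least_squares(C, D):
    """Least squares estimate from the normal equations (C^T C) a = C^T D."""
    J = len(C[0])
    CtC = [[sum(row[i] * row[k] for row in C) for k in range(J)] for i in range(J)]
    CtD = [sum(row[i] * d for row, d in zip(C, D)) for i in range(J)]
    return solve(CtC, CtD)


grid = [-0.6, -0.3, 0.0, 0.3, 0.6]
points = [(x, y) for x in grid for y in grid if x * x + y * y <= 1.0]
true_a = [0.5, -0.2, 1.0, 0.3, -0.7]                     # made-up coefficients a2..a6
C = slope_matrix(points)
D = [sum(c * a for c, a in zip(row, true_a)) for row in C]  # noise-free slope data
estimated = least_squares(C, D)
```

With noise-free data the estimate reproduces `true_a` to within rounding error. For larger or ill-conditioned systems, a QR or singular value decomposition would usually be preferred over the normal equations.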
Similar results are obtained for the Zernike coefficients of the eye aberration
function by determining first the bj coefficients according to Eq. (14-25). The numerical
integration is carried out in the same manner as in the previous cases. The Zernike
coefficients aj are then obtained by using Eq. (14-29).
Table 14-7. Wavefront fit quality factor Q for the eye aberration function.
σ_n    9 × 9    15 × 15    21 × 21    9 × 9    15 × 15    21 × 21
[Figure 14-6a: two panels, "Integration Method" and "LS Method", for the 9 × 9 array. Each panel plots the estimated coefficients â_j against j = 1 to 36 for 0%, 2%, 5%, and 10% noise.]
Figure 14-6a. Estimated Zernike coefficients of the eye aberration function from
wavefront slope data on a 9 × 9 array with different amounts of noise.
[Figure 14-6b: two panels, "Integration Method" and "LS Method", for the 15 × 15 array. Each panel plots the estimated coefficients â_j against j = 1 to 36 for 0%, 2%, 5%, and 10% noise.]
Figure 14-6b. Estimated Zernike coefficients of the eye aberration function from
wavefront slope data on a 15 × 15 array with different amounts of noise.
14.3.3 Numerical Example 397
[Figure 14-6c: two panels, "Integration Method" and "LS Method", for the 21 × 21 array. Each panel plots the estimated coefficients â_j against j = 1 to 36 for 0%, 2%, 5%, and 10% noise.]
Figure 14-6c. Estimated Zernike coefficients of the eye aberration function from
wavefront slope data on a 21 × 21 array with different amounts of noise.
14.4 SUMMARY
When the wavefront slope data, instead of the wavefront data, are available, the
orthogonality properties of the Zernike polynomials over a unit disk are not
straightforwardly transferred to their derivatives, and, therefore, the Zernike coefficients
cannot be evaluated as a projection over the gradient of the polynomials. Two
conceptually different approaches solve this problem. One utilizes a set of vector
polynomials [G_jx(x, y), G_jy(x, y)] given in Table 14-4 such that their inner products with
the wavefront gradient yield the orthonormal Zernike coefficients [see Eq. (14-12)]. In
the other, one expands the aberration function in a set of polynomials B_j(x, y) whose
gradients are orthonormal to each other over a unit disc [see Eq. (14-25)]. In the first
approach, the coefficients represent the minimum sigma values of the balanced classical
wave aberrations, and in the second, the aberrations are balanced to yield minimum
variance of the transverse ray aberrations. The wave aberration and transverse ray
aberration coefficients are related to each other according to Eq. (14-29). These
approaches require numerical evaluation of integrals or a solution of a linear system of
equations. In either case, the accuracy of wavefront fit improves as the number of data
points increases and/or the amount of noise decreases, as demonstrated in Figure 14-6
and in Table 14-7.
References
7. MATLAB and Statistics Toolbox Release 2012b, The MathWorks, Inc., Natick,
Massachusetts, United States.
11. V. P. Aksenov and Y. N. Isaev, “Analytical representation of the phase and its
mode components reconstructed according to the wave-front slopes,” Opt. Lett.
17, 1180–1182 (1992).
16. C. Solomon, G. C. Loos, and S. Rios, “Variational solution for modal wavefront
projection functions of minimum error norm,” J. Opt. Soc. Am. A 18, 1519–1522
(2001).
19. C. Zhao and J. H. Burge, “Orthonormal vector polynomials in a unit circle, Part I:
basis set derived from gradients of Zernike polynomials,” Opt. Express 15,
18014–18024 (2007).
20. C. Zhao and J. H. Burge, “Orthonormal vector polynomials in a unit circle, Part II:
completing the basis set,” Opt. Express 16, 6586–6591 (2008).
APPENDIX: SYSTEMS WITH SECTOR PUPILS
Due to the low symmetry of the sector pupils, the closed-form analytical expressions
for the polynomials are very complex; even the tilt and defocus polynomials are not
simple. The complexity increases even more for a system with an annular sector pupil. In
that case, there are two variables representing a point on the pupil, two parameters
defining its orientation and angular subtense, and the parameter specifying its obscuration
ratio. However, relatively simple expressions are obtained when the angular subtense and
the orientation of the sector pupil are specified along with its obscuration ratio.
In the following paper, the first 11 orthonormal polynomials are obtained for a sector
pupil of angular subtense of π/3 symmetric about the x or the y axis, angular subtense of
π/2 symmetric about the x axis, or a semicircle symmetric about the x axis. Similarly, the
corresponding orthonormal polynomials are given for an annular sector pupil and a
semicircular pupil with an obscuration ratio of 0.5. We have shown in Chapters 8 and 9
that the radially symmetric Seidel spherical aberration in a system with an elliptical or a
rectangular pupil is balanced with not only the radially symmetric defocus aberration but
with an angular aberration of astigmatism as well, due to their low symmetry compared to
that of the radially symmetric pupils such as the circular, annular or Gaussian. In systems
with sector pupils, even the radially symmetric defocus aberration is balanced with an
angular aberration of tilt due to their lower symmetry. Moreover, a polynomial for such
pupils consists of an increasing number of terms as its order increases.
The number of sector polynomials up to and including a certain order is the same as for
the polynomials of systems with pupils of other shapes considered in Chapters 4 through
10. The orthonormal circle polynomials given in Chapter 4, or the annular polynomials
given in Chapter 5, are special cases of the polynomials for sector pupils with an angular
subtense of 2π.
* José A. Díaz and V. N. Mahajan, “Orthonormal aberration polynomials for optical systems with circular and
annular sector pupils,” Appl. Opt. 52, 1136–1147 (2013) [doi: 10.1364/AO.52.001136]. Reprinted with permission.
Using the Zernike circle polynomials as the basis functions, we obtain the orthonormal polynomials for
optical systems with circular and annular sector pupils by the Gram–Schmidt orthogonalization process.
These polynomials represent balanced aberrations yielding minimum variance of the classical
aberrations of rotationally symmetric systems. Use of the polynomials obtained is illustrated with numerical
examples. © 2013 Optical Society of America
OCIS codes: 110.0110, 010.7350, 220.1010, 120.3180, 220.0220.
where
S1 = 1; (5)
j     n   m   Zj(ρ, θ)                Aberration
1     0   0   1                       Piston
2     1   1   2ρ cos θ                x tilt
3     1   1   2ρ sin θ                y tilt
4     2   0   √3 (2ρ² − 1)            Defocus
5     2   2   √6 ρ² sin 2θ            45° primary astigmatism
6     2   2   √6 ρ² cos 2θ            0° primary astigmatism
7     3   1   √8 (3ρ³ − 2ρ) sin θ     Primary y coma
8     3   1   √8 (3ρ³ − 2ρ) cos θ     Primary x coma
9     3   3   √8 ρ³ sin 3θ
10    3   3   √8 ρ³ cos 3θ
11    4   0   √5 (6ρ⁴ − 6ρ² + 1)      Primary spherical
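\[
\langle Z_2^2 \rangle = \frac{4}{\alpha} \int_0^1 \!\! \int_{-\alpha}^{\alpha} \rho^2 \cos^2\theta \, \rho \, d\rho \, d\theta = (2\alpha + \sin 2\alpha)/2\alpha ;
\]
\[
S_3(\rho, \theta; \alpha) = \frac{2\rho \sin\theta}{\left[ (2\alpha - \sin 2\alpha)/2\alpha \right]^{1/2}} . \qquad (8)
\]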
δjj0 (17)
is the mean value of the aberration and
are given by
and
p
3 161 ϵ2 1 ϵ3 ϵ sin α 3α1 ϵρ cos θ 21 ϵ ϵ2 sin α
S4 ρ; θ; ϵ; α 2ρ2 1 ϵ2 2 2 2 2 2
; (21)
N4 321 ϵ ϵ sin α 9α1 ϵ 1 ϵ 2α sin 2α
where
1∕2
1 ϵ 1287ϵ6 28ϵ5 50ϵ4 55ϵ3 50ϵ2 28ϵ 7sin2 α 225α1 ϵ4 1 ϵ2 2α sin 2α
N4 × :
5 321 ϵ ϵ2 2 sin2 α 9α1 ϵ2 1 ϵ2 2α sin 2α
(22)
As α → π, the annular sector polynomials approach hWi 3.84, and σ 2 1.40, while the correct num-
the annular polynomials that are orthonormal over bers, as obtained from our Eqs. (13)–(15), are Bt
an annular pupil with an obscuration ratio of 1.24, hWi 0.29, and σ 2 0.005. Similarly, when
ϵ [9,11,12]. ϵ 0.8235 and α π∕6, they yield Bt 132.99,
hWi 115.31, and σ 2 65.57, while the correct
C. Sector Pupil Symmetrical About an Arbitrary numbers, as obtained from our Eqs. (23a)–(23c), are
Orientation Bt 1.22, hWi 0.22, and σ 2 0.003. Hence, the
The orthonormal polynomials for a circular sector Swantner and Chow equations referred to above
pupil with an arbitrary orientation such that its are incorrect.
sides make angles α1 and α2 with the x axis, as in
Fig. 2(a), or an annular sector pupil with an obscura- 3. Expansion of an Aberration Function in Terms of
tion ratio ϵ, as in Fig. 2(b), can be obtained in a man- Orthonormal Polynomials
ner similar to that in Section 2.A or 2.B, respectively. The wave aberration function Wρ; θ of a sector pupil
The angular integrations now will be from α1 to α2 . can be expanded in terms of the orthonormal sector
For example, the orthonormality of the polynomials polynomials Sj ρ; θ in the form
for the circular and annular sector pupils will be
described by
Z 1Z α
1 2
hSj Sj0 i S ρ; θ; α1 ; α2
α2 α1 0 α1 j
× Sj0 ρ; θ; α1 ; α2 ρdρdθ δjj0 (24)
and
Z 1Z
1 α2
hSj Sj0 i 2
Sj ρ; θ; ϵ; α1 ; α2
α2 α1 1 ϵ ϵ α1
X
∞
hWρ; θi aj hSj ρ; θi a1 ; (28)
j 1
Table 2. Orthonormal Polynomials for a Circular Sector Pupil with Angular Subtense of π/3 Symmetrical about the x Axis,
as in Fig. 1(a)
S1 = 1
S2 = 4.4081ρ cos θ − 2.8063
S3 = 4.8084ρ sin θ
S4 = 14.7738ρ² − 18.2756ρ cos θ + 4.2477
S5 = 15.7199ρ² sin 2θ − 23.1380ρ sin θ
S6 = −2.3267ρ² + 13.4384ρ² cos 2θ − 11.5019ρ cos θ + 2.9289
S7 = 87.0864ρ³ sin θ − 65.2393ρ² sin 2θ + 37.9679ρ sin θ
S8 = 72.1271ρ³ cos θ − 88.0240ρ² − 35.9271ρ² cos 2θ + 61.3806ρ cos θ − 7.7589
S9 = 7.5982ρ³ sin θ + 42.8343ρ³ sin 3θ − 87.1692ρ² sin 2θ + 54.9874ρ sin θ
S10 = −23.0378ρ³ cos θ + 49.0241ρ³ cos 3θ + 51.5225ρ² − 83.7513ρ² cos 2θ + 10.8200ρ cos θ − 1.7027
S11 = 237.8242ρ⁴ − 578.3556ρ³ cos θ + 41.5354ρ³ cos 3θ + 312.4650ρ² + 95.9653ρ² cos 2θ − 116.2166ρ cos θ + 9.1348
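The first three entries of such a table can be reproduced in closed form, which gives a useful spot check on the tabulated numbers. The sketch below is illustrative only and makes its assumptions explicit: the sector is symmetric about the x axis with half-angle α (so the angular subtense is 2α), the mean over the sector is (1/α) ∫∫ (·) ρ dρ dθ with 0 ≤ ρ ≤ 1 and −α ≤ θ ≤ α, and the function name is hypothetical. Gram–Schmidt orthonormalization of 1, Z2 = 2ρ cos θ, and Z3 = 2ρ sin θ then gives S2 = (Z2 − ⟨Z2⟩)/σ2 and S3 = Z3/⟨Z3²⟩^(1/2), since ⟨Z3⟩ and ⟨Z2 Z3⟩ vanish by symmetry.

```python
from math import pi, sin, sqrt


def sector_tilt_polynomials(alpha):
    """Return (c2, d2, c3) with S2 = c2*rho*cos(theta) - d2 and S3 = c3*rho*sin(theta)."""
    mean_z2 = 4.0 * sin(alpha) / (3.0 * alpha)                  # <Z2>
    mean_z2_sq = (2 * alpha + sin(2 * alpha)) / (2 * alpha)     # <Z2^2>
    mean_z3_sq = (2 * alpha - sin(2 * alpha)) / (2 * alpha)     # <Z3^2>
    sigma2 = sqrt(mean_z2_sq - mean_z2 ** 2)                    # standard deviation of Z2
    return 2.0 / sigma2, mean_z2 / sigma2, 2.0 / sqrt(mean_z3_sq)


third_pi_sector = sector_tilt_polynomials(pi / 6)   # angular subtense pi/3
semicircle = sector_tilt_polynomials(pi / 2)        # angular subtense pi
```

For α = π/6 the three numbers round to 4.4081, 2.8063, and 4.8084, in agreement with S2 and S3 for the sector of angular subtense π/3 symmetric about the x axis; for α = π/2 they round to 3.7831, 1.6056, and 2, as expected for a semicircular pupil.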
Table 3. Orthonormal Polynomials for a Circular Sector Pupil with Angular Subtense of π/3 Symmetrical about the y Axis,
as in Fig. 3(a)
S1 = 1
S2 = 4.8084ρ cos θ
S3 = 4.4081ρ sin θ − 2.8063
S4 = 14.7738ρ² − 18.2756ρ sin θ + 4.2477
S5 = 15.7199ρ² sin 2θ − 23.1380ρ cos θ
S6 = 2.3267ρ² + 13.4384ρ² cos 2θ + 11.5019ρ sin θ − 2.9289
S7 = 72.1271ρ³ sin θ − 88.0240ρ² + 35.9271ρ² cos 2θ + 61.3806ρ sin θ − 7.7589
S8 = 87.0864ρ³ cos θ − 65.2393ρ² sin 2θ + 37.9679ρ cos θ
S9 = 23.0378ρ³ sin θ + 49.0240ρ³ sin 3θ − 51.5225ρ² − 83.7513ρ² cos 2θ − 10.8200ρ sin θ + 1.7027
S10 = −7.5981ρ³ cos θ + 42.8343ρ³ cos 3θ + 87.1692ρ² sin 2θ − 54.9874ρ cos θ
S11 = 237.8243ρ⁴ − 578.3546ρ³ sin θ − 41.5354ρ³ sin 3θ + 312.4651ρ² − 95.9653ρ² cos 2θ − 116.2156ρ sin θ + 9.1348
Table 4. Orthonormal Polynomials for a Circular Sector Pupil with Angular Subtense of π/2 Symmetrical about the y Axis,
as in Fig. 4
S1 = 1
S2 = 3.3178ρ cos θ
S3 = 4.5221ρ sin θ − 2.7142
S4 = 10.1720ρ² − 12.4849ρ sin θ + 2.4076
S5 = 11.1500ρ² sin 2θ − 14.7336ρ cos θ
S6 = −7.0559ρ² + 9.5665ρ² cos 2θ + 18.2521ρ sin θ − 4.3820
S7 = 69.5749ρ³ sin θ − 82.2871ρ² + 36.6255ρ² cos 2θ + 57.1668ρ sin θ − 6.5661
S8 = 31.8696ρ³ cos θ − 22.6486ρ² sin 2θ + 8.6814ρ cos θ
S9 = −17.9479ρ³ sin θ + 25.2706ρ³ sin 3θ + 11.0762ρ² − 52.6627ρ² cos 2θ − 28.0196ρ sin θ + 4.0136
S10 = −34.0646ρ³ cos θ + 27.8620ρ³ cos 3θ + 73.1727ρ² sin 2θ − 41.4386ρ cos θ
S11 = 93.7045ρ⁴ − 223.2860ρ³ sin θ − 11.1546ρ³ sin 3θ + 106.3560ρ² − 51.7028ρ² cos 2θ − 41.1621ρ sin θ + 2.9078
Table 5. Orthonormal Polynomials for a Semi-circular Pupil Symmetrical about the x Axis, as in Fig. 5(a)
S1 = 1
S2 = 3.7831ρ cos θ − 1.6056
S3 = 2ρ sin θ
S4 = 4.1683ρ² − 2.5319ρ cos θ − 1.0096
S5 = 4.4114ρ² sin 2θ − 2.9956ρ sin θ
S6 = 6.7981ρ² + 7.5887ρ² cos 2θ − 13.3480ρ cos θ + 2.2660
S7 = 8.9027ρ³ sin θ − 1.4006ρ² sin 2θ − 4.9840ρ sin θ
S8 = 20.5600ρ³ cos θ − 13.6275ρ² − 7.6440ρ² cos 2θ + 0.3233ρ cos θ + 1.4414
S9 = 8.4228ρ³ sin θ + 9.2844ρ³ sin 3θ − 14.4709ρ² sin 2θ + 4.2114ρ sin θ
S10 = 40.7949ρ³ cos θ + 15.2277ρ³ cos 3θ − 39.5924ρ² − 41.0149ρ² cos 2θ + 31.8150ρ cos θ − 2.8023
S11 = 18.2324ρ⁴ − 21.1998ρ³ cos θ − 2.1906ρ³ cos 3θ − 6.1076ρ² + 7.8677ρ² cos 2θ + 2.5110ρ cos θ + 1.1232
Table 6. Orthonormal Polynomials for an Annular Sector Pupil with Obscuration Ratio ϵ = 0.5 and Angular Subtense of π/3
Symmetrical about the x Axis, as in Fig. 1(b)
S1 = 1
S2 = 7.1986ρ cos θ − 5.3465
S3 = 4.3007ρ sin θ
S4 = −28.7951ρ cos θ + 19.0444ρ² + 9.4841
S5 = 17.5981ρ² sin 2θ − 26.7660ρ sin θ
S6 = 27.0338ρ² cos 2θ − 74.4974ρ cos θ + 24.0159ρ² + 26.3481
S7 = 83.4824ρ³ sin θ − 67.1102ρ² sin 2θ + 43.6343ρ sin θ
S8 = 180.6545ρ³ cos θ − 267.7044ρ² − 126.9429ρ² cos 2θ + 273.1875ρ cos θ − 59.1057
S9 = 63.9617ρ³ sin θ + 66.6774ρ³ sin 3θ − 191.8725ρ² sin 2θ + 135.5028ρ sin θ
S10 = 257.4016ρ³ cos θ + 103.2660ρ³ cos 3θ − 371.1898ρ² − 416.5508ρ² cos 2θ + 561.8510ρ cos θ − 130.9662
S11 = 351.1153ρ⁴ − 1032.8704ρ³ cos θ + 30.8480ρ³ cos 3θ + 705.9722ρ² + 320.4290ρ² cos 2θ − 440.8119ρ cos θ + 66.3866
Table 7. Orthonormal Polynomials for a Semi-annular Pupil with Obscuration Ratio ϵ = 0.5 Symmetrical about the x
Axis, as in Fig. 5(b)
S1 = 1
S2 = 3.8539ρ cos θ − 1.9083
S3 = 1.7889ρ sin θ
S4 = 4.9234ρ² − 1.4225ρ cos θ − 2.3728
S5 = 3.9259ρ² sin 2θ − 2.7548ρ sin θ
S6 = 6.0925ρ² + 7.9347ρ² cos 2θ − 14.6815ρ cos θ + 3.4617
S7 = 9.0714ρ³ sin θ − 0.9678ρ² sin 2θ − 5.6709ρ sin θ
S8 = 22.9120ρ³ cos θ − 14.6184ρ² − 5.3639ρ² cos 2θ − 6.0598ρ cos θ + 4.6007
S9 = 7.0047ρ³ sin θ + 8.2895ρ³ sin 3θ − 13.0446ρ² sin 2θ + 4.2501ρ sin θ
S10 = 39.3643ρ³ cos θ + 16.1588ρ³ cos 3θ − 41.7012ρ² − 44.7775ρ² cos 2θ + 39.2625ρ cos θ − 4.5536
S11 = 25.8811ρ⁴ − 14.6339ρ³ cos θ − 2.5180ρ³ cos 3θ − 21.2183ρ² + 8.0154ρ² cos 2θ − 1.9705ρ cos θ + 7.4515
Table 8. Annular Polynomials Aj(ρ, θ; ϵ = 0.5) for an Annular Pupil with Obscuration Ratio ϵ = 0.5
For example, in Table 3, the S5 polynomial consists of Zernike 45° astigmatism balanced by tilt, and S6
consists of Zernike astigmatism balanced by not only tilt but additional defocus as well. The spherical
aberration ρ⁴ in S11 is balanced not only by defocus but several other lower-order terms as well. Moreover, the
balancing defocus has the same sign as the spherical aberration, instead of the opposite sign as in the
corresponding Zernike circle polynomial. All of this is a consequence of the lower symmetry of the sector
pupil. The orthonormal polynomials for an annular sector pupil, a semi-annular pupil, and an annular
pupil of an obscuration ratio ϵ = 0.5 are shown in Tables 6, 7, and 8, respectively. Of course, the lower
symmetry of the annular sector pupil results in similar balancing of an aberration as for a circular sector
pupil. The balancing defocus for spherical aberration in semi-circular and semi-annular pupils does have
opposite signs as for the circular and the annular pupils.
Using Eq. (27), we obtain the orthonormal coefficients of the aberration function. Thus we may write
the aberration function of Eq. (31) in terms of the orthonormal polynomials for the various pupils.
They are given below along with their peak-to-valley and sigma values.
Circular sector pupil of angular subtense π/3 symmetrical about the x axis, as in Fig. 1(a):
Wρ; θ; π∕4; 3π∕4 1.1667S1 3.0141S2 Fig. 5. Sector pupil of unit radius symmetrical about the x axis.
0.3231S3 0.0444S4 (a) Semi-circular. (b) Semi-annular with obscuration ratio ϵ 0.5.
0.2087S6 0.1419S7
0.0188S9 0.0427S11 ; (34a)
P V 10.5625 and σ 2.4798: (35b)
Wρ; θ; ϵ 0.5; π∕2; π∕2 3.5765S1 2.5909S2 aberration ρ4 balanced by appropriate amounts of
defocus ρ2 and y tilt ρ sin θ to minimize its variance,
0.0018S4 0.0185S6 as may be seen by dropping the first four polynomials
0.0573S8 0.0241S10 in Eq. (34a):
0.1546S11 ; (38a)
W R ρ; θ 0.2087S6 0.1419S7 0.0188S9
P V 10.5625 and σ 2.5953: (38b) 0.0427S11
4ρ4 1.3630ρ2 0.5040ρ sin θ
Annular pupil (not shown):
0.0458. (42)
Wρ; θ; ϵ 0.5; 0; 2π 1.3750A1 5.5902A2
0.1677A11 ; (39a)
The factor of 4 is simply a result of the 4 in 4ρ4 in
the starting aberration function of Eq. (31), com-
P V 20 and σ 5.5927: (39b) pared to only ρ4 in Eq. (41). It is not surprising that
the residual aberration has the same form as the
In Section 2.C, we showed how the orthonormal balanced spherical aberration of Eq. (41). Since the
polynomials change as the orientation of the sector starting aberration function consists of spherical
pupil changes from the x to the y axis. Tables 2 and
3 illustrate this fact over a larger number of polyno-
mials. Equations (32a) and (33a) illustrate it with a
numerical example. The aberration functions of
Eqs. (33a) and (34a) for sector pupils symmetrical
about the y axis contain both tilt polynomials S2 and
S3 . Note that the defocus polynomial term A4 is
missing in Eq. (39a), because the defocus term in the
aberration function of Eq. (31) exactly balances its
spherical aberration term for an annular pupil of
obscuration ratio ϵ 0.5, as may be seen from the
polynomial A11 in Table 8.
Swantner and Chow also discussed a circular
sector pupil of angular subtense π∕2 symmetrical
about the y axis and aberrated by primary spherical
aberration. It can be shown that the orthonormal
polynomials obtained by orthonormalizing 1, ρ cos θ,
ρ sin θ, ρ2 , and ρ4 over such a sector are
1; 40a
8. R. J. Noll, “Zernike polynomials and atmospheric turbulence,” J. Opt. Soc. Am. 66, 207–211 (1976).
9. V. N. Mahajan, Optical Imaging and Aberrations, Part II: Wave Diffraction Optics, 2nd ed. (SPIE, 2011).
10. A. Korn and T. M. Korn, Mathematical Handbook for Scientists and Engineers (McGraw-Hill, 1968).
11. V. N. Mahajan, “Zernike annular polynomials for imaging systems with annular pupils,” J. Opt. Soc. Am. 71, 75–85 (1981).
12. V. N. Mahajan, “Zernike annular polynomials and optical aberrations of systems with annular pupils,” Appl. Opt. 33, 8125–8127 (1994).
13. G.-M. Dai and V. N. Mahajan, “Nonrecursive orthonormal polynomials with matrix formulation,” Opt. Lett. 32, 74–76 (2007).
14. Wolfram Research, Inc., Mathematica, Version 8.0, Champaign, Illinois (2010).