Selected Topics in
Characteristic Functions
Nikolai G. USHAKOV
Russian Academy of Sciences
UTRECHT, THE NETHERLANDS, 1999
VSP BV
P.O. Box 346
3700 AH Zeist
The Netherlands

Tel: +31 30 692 5790
Fax: +31 30 693 2081
E-mail: vsppub@compuserve.com
Home Page: http://www.vsppub.com

© VSP BV 1999
ISBN 90-6764-307-6
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in
any form or by any means, electronic, mechanical, photocopying, recording or otherwise, without the prior
permission of the copyright owner.
Contents

Preface vii
Notation ix
1 Basic properties 1
1.1 Definition and elementary properties 1
1.2 Continuity theorems and inversion formulas 4
1.3 Criteria 8
1.4 Inequalities 23
1.5 Characteristic functions and moments, expansions of characteristic functions, asymptotic behavior 39
1.6 Unimodality 43
1.7 Analyticity of characteristic functions 52
1.8 Multivariate characteristic functions 54
1.9 Notes 64
2 Inequalities 67
2.1 Auxiliary results 67
2.2 Inequalities for characteristic functions of distributions with bounded support 81
2.3 Moment inequalities 88
2.4 Estimates for the characteristic functions of unimodal distributions 95
2.5 Estimates for the characteristic functions of absolutely continuous distributions 100
2.6 Estimates for the characteristic functions of discrete distributions 111
2.7 Inequalities for multivariate characteristic functions 114
2.8 Inequalities involving integrals of characteristic functions 133
2.9 Inequalities involving differences of characteristic functions 142
2.10 Estimates for the first positive zero of a characteristic function 154
2.11 Notes 158
3 Empirical characteristic functions 159
3.1 Definition and basic properties 159
3.2 Asymptotic properties of empirical characteristic functions 164
3.3 The first positive zero 182
3.4 Parameter estimation 187
3.5 Non-parametric density estimation I 198
3.6 Non-parametric density estimation II 210
3.7 Tests for independence 226
3.8 Tests for symmetry 232
3.9 Testing for normality 242
3.10 Goodness-of-fit tests based on empirical characteristic functions 251
3.11 Notes 256
A Examples 259
Bibliography 335
Preface
sented in the remainder of the chapter. Some results are presented on statistical estimation (both parametric and nonparametric) and on testing statistical hypotheses.
A collection of various examples, counterexamples and assertions demonstrating interesting (sometimes unexpected) properties of characteristic functions is presented in Appendix A. Constructing counterexamples permanently accompanies the investigation of characteristic functions and their applications and sometimes leads to beautiful and intriguing results. Such results are often not only of theoretical interest but helpful in solving applied problems as well. Many examples have already been included in monographs concerning counterexamples in probability theory and statistics and in those concerning characteristic functions. However, recently quite a number of new interesting examples have appeared in various publications, and the goal of the appendix is to bring together and systematize all known counterexamples.
In Appendix B, we give formulas and graphs of some frequently used char-
acteristic functions as well as characteristic functions demonstrating interest-
ing properties.
In Appendix C, several unsolved problems are presented.
I am very grateful to Yu.V. Prokhorov, D.V. Belomestnii, A.V. Kolchin,
V.Yu. Korolev, and A.P. Ushakova for many useful comments.
N. G. Ushakov
Notation
Throughout the book, a triple numbering is used: the first digit denotes the number of a chapter, the second one the number of the section within the chapter, and the last one the number of an item (theorem, lemma, formula) within the section.
For a complex number z, z̄ denotes the complex conjugate. R^m is the m-dimensional Euclidean space. ℜ and ℑ denote the real and imaginary parts of a function, respectively; f(x + 0) and f(x − 0) are respectively the right-hand and left-hand limits; →^{a.s.}, →^{P} and →^{D} denote, respectively, convergence almost surely, convergence in probability and convergence in distribution; =^{a.s.} means that an equality holds with probability one. Weak convergence is denoted

lim_{n→∞} F_n(x) = F(x)

or

F_n →^{w} F.
1 Basic properties

The characteristic function f(t) of a random variable X with distribution function F(x) is defined by

f(t) = E e^{itX} = ∫_{−∞}^{∞} e^{itx} dF(x).

If F(x) is absolutely continuous with density p(x), then

f(t) = ∫_{−∞}^{∞} e^{itx} p(x) dx.

If X is discrete with P(X = x_n) = p_n, n = 1, 2, ..., Σ_n p_n = 1, then

f(t) = Σ_{n=1}^{∞} p_n e^{itx_n}.
Characteristic functions of the most frequently arising distributions can
be found in Appendix B.
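These defining formulas are easy to check numerically. The sketch below (assuming NumPy is available; the two-point law is an arbitrary illustrative choice) evaluates the discrete-case formula and verifies the elementary properties f(0) = 1, |f(t)| ≤ 1 and f(−t) = f̄(t) discussed in this chapter:

```python
import numpy as np

def cf_discrete(t, xs, ps):
    # f(t) = sum_n p_n * exp(i t x_n): characteristic function of a discrete law
    t = np.atleast_1d(np.asarray(t, dtype=float))
    return (np.asarray(ps) * np.exp(1j * np.outer(t, np.asarray(xs)))).sum(axis=1)

# Illustrative two-point distribution P(X = -1) = 0.3, P(X = 2) = 0.7.
xs, ps = [-1.0, 2.0], [0.3, 0.7]
ts = np.linspace(-10.0, 10.0, 401)
f = cf_discrete(ts, xs, ps)

f0 = cf_discrete(0.0, xs, ps)[0]                               # should equal 1
max_mod = np.abs(f).max()                                      # should not exceed 1
sym_err = np.abs(cf_discrete(-ts, xs, ps) - np.conj(f)).max()  # should be ~0
```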
(b) f(0) = 1;
(d) f(−t) = f̄(t), where the horizontal bar denotes the complex conjugate.
THEOREM 1.1.4. Let f(t) be a characteristic function. Then the following statements are equivalent:
In Theorems 1.1.5 and 1.1.6, the converse is not true: there exist singu-
lar distributions whose characteristic functions satisfy (1.1.1) or (1.1.2) (see
Appendix A, Examples 6 and 7).
if and only if f(t) = f₁(t)f₂(t), where f(t), f₁(t) and f₂(t) are characteristic functions of F(x), F₁(x) and F₂(x) respectively.
or

F_n →^{w} F.
THEOREM 1.2.1 (continuity theorem 1). Let F₁(x), F₂(x), ... be a sequence of distribution functions and f₁(t), f₂(t), ... be the corresponding sequence of characteristic functions. The sequence F₁(x), F₂(x), ... converges weakly to some distribution function F(x) if and only if the sequence f₁(t), f₂(t), ... converges at all points to some function f(t) which is continuous at zero. In this case, f(t) is the characteristic function of F(x).
THEOREM 1.2.2 (continuity theorem 2). Let F₁(x), F₂(x), ... be a sequence of distribution functions and f₁(t), f₂(t), ... be the corresponding sequence of characteristic functions. The sequence F₁(x), F₂(x), ... converges weakly to some distribution function F(x) if and only if the sequence f₁(t), f₂(t), ... converges uniformly on each bounded interval to some function f(t). In this case, f(t) is the characteristic function of F(x).
Inversion formulas of other kinds are given by the next two theorems.
F(b) − F(a) = lim_{T→∞} (1/2π) ∫_{−T}^{T} ((e^{−ita} − e^{−itb})/(it)) f(t) dt

for all continuity points a and b of F(x).
We can use the inversion formula of Theorem 1.2.5 at all points if we redefine F(x) at discontinuity points as

F(x) = (F(x + 0) + F(x − 0))/2.
For absolutely continuous distributions, inversion formulas for the distribution functions are supplemented by the inversion formula for the density function:

p(x) = (1/2π) ∫_{−∞}^{∞} e^{−itx} f(t) dt. (1.2.1)
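Formula (1.2.1) can be illustrated numerically. In the sketch below (assuming NumPy; the standard normal law with f(t) = e^{−t²/2} is chosen because its density is known in closed form) the inversion integral is truncated to [−T, T] and approximated by a Riemann sum:

```python
import numpy as np

def density_from_cf(x, cf, T=40.0, n=20001):
    # p(x) = (1/(2*pi)) * integral of exp(-i t x) f(t) dt, truncated to [-T, T]
    t = np.linspace(-T, T, n)
    dt = t[1] - t[0]
    integrand = np.exp(-1j * np.outer(np.atleast_1d(x), t)) * cf(t)
    return (integrand.sum(axis=1) * dt).real / (2.0 * np.pi)

gauss_cf = lambda t: np.exp(-t**2 / 2.0)   # CF of the standard normal law
xs = np.array([0.0, 1.0, -2.0])
p_num = density_from_cf(xs, gauss_cf)
p_exact = np.exp(-xs**2 / 2.0) / np.sqrt(2.0 * np.pi)
err = np.abs(p_num - p_exact).max()
```

The truncation is harmless here because the Gaussian CF decays rapidly; for heavier-tailed CFs the window T must be widened.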
THEOREM 1.2.9 (Parseval). Let F(x) and G(x) be distribution functions with characteristic functions f(t) and g(t) respectively. Then

∫_{−∞}^{∞} e^{−itx} g(x) dF(x) = ∫_{−∞}^{∞} f(y − t) dG(y).
Σ_{n=1}^{∞} [F(x_n + 0) − F(x_n − 0)]² = lim_{T→∞} (1/2T) ∫_{−T}^{T} |f(t)|² dt,

where x₁, x₂, ... are the jump points of F(x).
If F(x) has only a finite number of jumps, then the series on the left-hand side should be replaced by the finite sum.
1.3. Criteria
We start with some general necessary and sufficient conditions, and then
present various sufficient conditions for a function to be the characteristic
function of a probability distribution. We also give some methods of con-
structing characteristic functions satisfying given properties. At the end of
the section, some necessary conditions are presented which supplement nec-
essary conditions given by Theorem 1.1.1. Other necessary conditions (in the
form of inequalities) are given in Section 1.4. Criteria related to unimodal
distributions will be given separately in Section 1.6.
A complex-valued function f(t) of the real variable t is called non-negative definite if it is continuous and the sum

Σ_{j=1}^{N} Σ_{k=1}^{N} f(t_j − t_k) z_j z̄_k

is real and non-negative for any positive integer N, any real t₁, t₂, ..., t_N, and any complex z₁, z₂, ..., z_N.
THEOREM 1.3.1 (Bochner–Khintchine). A complex-valued function f(t) of the real variable t is a characteristic function if and only if
(a) f(t) is non-negative definite;
(b) f(0) = 1.
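Non-negative definiteness can be probed numerically: for any choice of points t₁, ..., t_N the matrix M with entries f(t_j − t_k) must be Hermitian and positive semi-definite. A minimal sketch (assuming NumPy; the Gaussian characteristic function e^{−t²/2} and the random grid are arbitrary illustrative choices):

```python
import numpy as np

f = lambda t: np.exp(-t**2 / 2.0)      # characteristic function of N(0, 1)

rng = np.random.default_rng(0)
ts = rng.uniform(-5.0, 5.0, size=12)   # arbitrary grid t_1, ..., t_N
M = f(np.subtract.outer(ts, ts))       # M[j, k] = f(t_j - t_k)

# PSD => the smallest eigenvalue is >= 0 (up to floating-point roundoff)
min_eig = np.linalg.eigvalsh(M).min()
```

A single grid that produces a negative eigenvalue is enough to disqualify a candidate function; passing many random grids is, of course, only evidence, not a proof.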
(a) f(0) = 1;
(b) ∫_0^A ∫_0^A f(t − s) e^{ix(t−s)} dt ds ≥ 0 for all real x and all A > 0.
THEOREM 1.3.3 (Khintchine). A complex-valued function f(t) of the real variable t is a characteristic function if and only if there exists a sequence of complex-valued functions g₁(t), g₂(t), ..., satisfying the condition

∫_{−∞}^{∞} |g_n(t)|² dt = 1,

and such that

f(t) = lim_{n→∞} ∫_{−∞}^{∞} g_n(t + s) ḡ_n(s) ds

uniformly in each bounded interval. In particular, if g(t) is a complex-valued function with

∫_{−∞}^{∞} |g(t)|² dt = 1,

then ∫_{−∞}^{∞} g(t + s) ḡ(s) ds is a characteristic function.
THEOREM 1.3.5. Let R(s, t) be a covariance function satisfying the condition

0 < ∫_{−∞}^{∞} R(s, s) ds < ∞. (1.3.1)

Then the function

f(t) = ∫_{−∞}^{∞} R(s, s + t) ds / ∫_{−∞}^{∞} R(s, s) ds (1.3.2)

is a characteristic function.
PROOF. Let X(s) be a random process with covariance function R(s, t), and let X̂ denote its Fourier transform. Then

∫_{−∞}^{∞} R(s, s + t) ds = (1/2π) ∫_{−∞}^{∞} e^{−iut} E|X̂(u)|² du,

hence f(t) defined by (1.3.2) is the characteristic function of the probability density

p(y) = E|X̂(−y)|² / ∫_{−∞}^{∞} E|X̂(u)|² du.

COROLLARY. If g(t) is a complex-valued function such that

∫_{−∞}^{∞} |g(t)|² dt = 1,

then we can set

R(s, t) = g(s) ḡ(t)

in Theorem 1.3.5.
We also present a quite recent criterion, which is due to Trigub (1989).
(a) f{0) = 1;
(e) t h e r e e x i s t s k o s u c h t h a t f o r a n y k > k o a n d . a n y ,
r f ( m
(sgnx)',+1
t ) d t
>0. (1.3.4)
J-oo (x + iii t f + 1
LEMMA 1.3.1. There exists an absolute constant c such that the following assertions hold.

(a) For any real x ≠ 0 and any complex z with ℑz ≠ 0,

lim_{N→∞} (1/2) ∫_{−N}^{N} e^{itx} dt / (t − z) = (πi/2) e^{ixz} (sgn x + sgn ℑz),

and the integral on the left-hand side is bounded uniformly in N ≥ 1 by a constant depending only on x and z.

(b) For any real x ≠ 0 and any N ≥ 1,

∫_{−N}^{N} e^{it} dt / (t + ix) = e^{iN}/(i(N + ix)) + e^{−iN}/(i(N − ix)) + θ c/(1 + x²), |θ| ≤ 2.

PROOF. To prove (a), we apply the Cauchy theorem to the contour consisting of the segment [−N, N] and the half-circle Γ_N = {N e^{iφ}: 0 ≤ φ ≤ π} (the upper one for x > 0, the lower one for x < 0):

(1/2) ∫_{−N}^{N} e^{itx} dt / (t − z) = (πi/2) e^{ixz} (sgn x + sgn ℑz) − (1/2) ∫_{Γ_N} e^{ixζ} dζ / (ζ − z).

Obviously, on Γ_N

|e^{ixζ}| = e^{−|x| N sin φ},

and, since

sin φ ≥ 2φ/π

for 0 ≤ φ ≤ π/2,

∫_0^{π} e^{−N sin φ} dφ = 2 ∫_0^{π/2} e^{−N sin φ} dφ ≤ 2 ∫_0^{π/2} e^{−2Nφ/π} dφ < π/N.

Hence the integral over Γ_N tends to zero as N → ∞ and remains bounded for all N ≥ 1.

To prove (b), let us integrate by parts. We have

∫_{−N}^{N} e^{it} dt / (t + ix) = e^{iN}/(i(N + ix)) + e^{−iN}/(i(N − ix)) + (1/i) ∫_{−N}^{N} e^{it} dt / (t + ix)².

If |x| ≥ N/2 ≥ 1/2, then, integrating by parts once more, we see that the absolute value of the last integral does not exceed

2/(N² + x²) + 4N/|x|³ ≤ 2/(N² + x²) + 8/x² ≤ c/(1 + x²).

If |x| < N/2, then we use (as above) the Cauchy theorem with Γ_N being the upper half-circle:

|∫_{Γ_N} e^{iζ} dζ / (ζ + ix)²| ≤ (N/(N − |x|)²) ∫_0^{π} e^{−N sin φ} dφ ≤ π/(N − |x|)² ≤ c/(1 + x²).
LEMMA 1.3.2. Let f(t) satisfy conditions (a)–(c) of Theorem 1.3.6. Set

g(x) = v.p. ∫_{−∞}^{∞} f(t) dt / (x + it) = lim_{M→∞} ∫_{−M}^{M} f(t) dt / (x + it).

If

lim_{|x|→∞} g(x) = 0,

then for any real y ≠ 0,

∫_{−∞}^{∞} e^{ixy} g(x) dx = 2πi sgn y ∫_0^{∞} f(−t sgn y) e^{−t|y|} dt.
PROOF. Denote

g_M(x) = ∫_{−M}^{M} f(t) dt / (x + it).

One verifies that g_M(x) → g(x) as M → ∞ uniformly in each bounded interval (we use the boundedness of f(t) in the first integral and the Abel test in the second one). Therefore

∫_{−N}^{N} e^{ixy} g(x) dx = lim_{M→∞} ∫_{−N}^{N} e^{ixy} g_M(x) dx = ∫_{−∞}^{∞} f(t) dt ∫_{−Ny}^{Ny} e^{iu} du / (u + ity).
Let us take the limit of the right-hand side as N → ∞. By virtue of Lemma 1.3.1 (N > 1/|y|),

∫_{−Ny}^{Ny} e^{iu} du / (u + ity) = e^{iNy}/(i(N + it)y) + e^{−iNy}/(i(N − it)y) + θ c/(1 + t²y²)

(the existence of the limit in the right-hand side follows from the previous equality). Thus, it remains to verify that

lim_{N→∞} ∫_{−∞}^{∞} f(t) [e^{iNy}/(i(N + it)y) + e^{−iNy}/(i(N − it)y)] dt = ∫_{−∞}^{∞} f(t) lim_{N→∞} [e^{iNy}/(i(N + it)y) + e^{−iNy}/(i(N − it)y)] dt.

The right-hand side of this relation is equal to zero, and the left-hand side is

(1/(iy)) lim_{N→∞} [g(N) e^{iNy} − g(−N) e^{−iNy}] = 0.
Thus, by virtue of Lemma 1.3.1,

∫_{−∞}^{∞} e^{ixy} g(x) dx = ∫_{−∞}^{∞} f(t) dt ∫_{−∞}^{∞} e^{ixy} dx / (x + it) = πi ∫_{−∞}^{∞} f(t) e^{ty} (sgn y − sgn t) dt = 2πi sgn y ∫_0^{∞} f(−t sgn y) e^{−t|y|} dt.
PROOF OF THEOREM 1.3.6. The necessity of conditions (a)–(d) is obvious (see (Lukacs, 1970)). Let us prove the necessity of condition (e). Let

f(t) = ∫_{−∞}^{∞} e^{ixt} dF(x)

for some distribution function F(x). For any real y ≠ 0, by virtue of the Fubini theorem,

g_M(y) = ∫_{−M}^{M} f(t) dt / (y + it) = ∫_{−∞}^{∞} dF(x) ∫_{−M}^{M} e^{itx} dt / (y + it).

By Lemma 1.3.1 (z = iy), the inner integral is bounded by a constant depending only on y and has a limit as M → ∞; therefore, for y ≠ 0,

g(y) = 2π sgn y ∫_{x sgn y > 0} e^{−yx} dF(x). (1.3.6)

Therefore, for any k ≥ 0 and y ≠ 0,

(−1)^k g^{(k)}(y) (sgn y)^{k+1} ≥ 0,

which is equivalent to (1.3.4).
F(x) = ½ F₁(x)

for x > 0 and

F(x) = ½ F₂(x)

for x < 0. Then the function

f₀(t) = ∫_{−∞}^{∞} e^{itx} dF(x)
satisfies

v.p. ∫_{−∞}^{∞} f₀(t) dt / (x + it) = g(x).

Now it suffices to prove that f(t) = f₀(t), i.e., that g(t) uniquely determines f(t). Assuming g(t) ≡ 0, we, due to Lemma 1.3.2, obtain

∫_0^{∞} f(−t sgn y) e^{−t|y|} dt = 0

or

∫_0^1 f(ln u · sgn y) u^{|y|−1} du = 0.

Considering this relation for all complex y ≠ 0 and taking into account that f(t) is continuous, we obtain f(t) ≡ 0.
THEOREM 1.3.7. Let f(t) be a characteristic function. Then the following functions are also characteristic functions:
(c) ℜf(t);
(d) |f(t)|²;
(a) if X is a random variable with characteristic function f(t), then f̄(t) is the characteristic function of −X;
(e) if X is a random variable with characteristic function f(t), then e^{ibt} f(at) is the characteristic function of aX + b.
(a) for any fixed a the function f_a(t) = f(t, a) is a characteristic function;
(b) for any fixed t the function f_t(a) = f(t, a) is a measurable function.
is a characteristic function.
f(t) = Σ_{n=1}^{∞} a_n f_n(t)

is a characteristic function.
PROOF. We have

f(t)/(2 − f(t)) = Σ_{m=1}^{∞} f^m(t)/2^m,

and the assertion follows from the preceding results on mixtures.
g(t) = (p/t^p) ∫_0^t f(u) u^{p−1} du

is also a characteristic function for any p > 0. Set

r_t(x) = p x^{p−1}/t^p for 0 < x < t, and r_t(x) = 0 otherwise.

This function is a probability density (this can be examined directly) for all t > 0 and p > 0. Denote its characteristic function by h_t(u). It is easy to see that h_t(u) = h₁(tu). Using the Parseval theorem (Theorem 1.2.9), we obtain

g(t) = (p/t^p) ∫_0^t f(u) u^{p−1} du = ∫_{−∞}^{∞} f(u) r_t(u) du = ∫_{−∞}^{∞} h_t(u) dF(u) = ∫_{−∞}^{∞} h₁(tu) dF(u).

Thus, by virtue of Theorem 1.3.8, g(t) is a characteristic function.
COROLLARY 1.3.4. Let f(t) be a characteristic function. Then for any p > 0,

g(t) = exp{p(f(t) − 1)}

is a characteristic function.

The following sufficient condition, which is due to Pólya (1949), has found wide applications.
(a) f(0) = 1;
(d) lim_{t→∞} f(t) = 0;
(a) f(0) = 1;
is a characteristic function.
(a) g(0) = 0;
(b) lim_{t→∞} g(t) = −∞;

Then

f(t) = e^{g(t)}

is a characteristic function.
(a) g(0) = 0;

Then

f(t) = exp{−λ|t| + g(t)}

is a characteristic function, provided λ > 0 is sufficiently large.
then
(a) f(0) = 1;
(c) there exists an interval (0, a) such that f(t) is convex on (0, a);
Theorems 1.3.11 and 1.3.12 are very usefully combined with the following
criterion.
THEOREM 1.3.13. Let f(t) be a periodic (say, with period T) characteristic function. Then for any a > 0 satisfying the condition

a/(1 + a) ≤ p₀, (1.3.11)

the function

(1 + a) f(t) − a (1.3.10)

is a characteristic function, where

p₀ = (1/T) ∫_0^T f(t) dt. (1.3.12)

PROOF. The function f(t) is the characteristic function of a lattice distribution with atoms p_k at the points 2πk/T, and p₀ is the mass at zero. Consider the distribution with masses q_k = (1 + a) p_k for k ≠ 0 and

q₀ = (1 + a) p₀ − a ≥ 0;

the total mass is

q₀ + Σ_{k≠0} (1 + a) p_k = (1 + a) p₀ − a + (1 + a)(1 − p₀) = 1.

But taking (1.3.11) into account, we see that the characteristic function of this distribution is

q₀ + Σ_{k≠0} (1 + a) p_k e^{2πikt/T} = (1 + a) f(t) − a,
g(t) = f″(t)/f″(0)

is a characteristic function.
Then there exist ε > 0 and a > 0 such that ℜf(t) < −ε for |t| > a. Since u(t) = ℜf(t) is a characteristic function, it is a positive definite function, i.e., for any positive integer N, any real t₁, ..., t_N and complex z₁, ..., z_N,

Σ_{j=1}^{N} Σ_{k=1}^{N} u(t_j − t_k) z_j z̄_k ≥ 0.

Let us take N > 1/ε + 1 and set z₁ = z₂ = ... = z_N = 1, t_j = ja, j = 1, 2, ..., N. Then

0 ≤ Σ_{j=1}^{N} Σ_{k=1}^{N} u(t_j − t_k) < N − N(N − 1)ε < 0,

a contradiction. Thus it is impossible that

lim sup_{|t|→∞} ℜf(t) < 0.
1.4. Inequalities
Various inequalities for characteristic functions are the subject of Chapter 2.
However, some general inequalities are given in this section. They can be
considered as necessary conditions for characteristic functions, therefore they
supplement the results of the preceding section. We also present several well-
known estimates of the closeness of distribution functions in terms of the
closeness of the corresponding characteristic functions. Recent developments
in this area are given in the next chapter.
THEOREM 1.4.1. For any characteristic function f(t) and any positive integer n,

1 − ℜf(nt) ≤ n{1 − [ℜf(t)]^n} ≤ n²[1 − ℜf(t)].

PROOF. The key elementary inequality is

1 − cos(nx) ≤ n(1 − cos^n x). (1.4.2)

At an extremum of the difference of its two sides, either sin x = 0 or the derivatives of the two sides coincide. In the former case, x = πk (k is an integer), and (1.4.2) holds at such points. In the latter case, the verification rests on the identity

sin(nx)/sin x = cos((n − 1)x) + cos x · sin((n − 1)x)/sin x,

which yields |sin(nx)/sin x| ≤ n by induction, so that (1.4.2) follows.

Integrating (1.4.2) with respect to dF(x), we obtain 1 − ℜf(nt) ≤ n[1 − ∫ cos^n(tx) dF(x)]. Next, consider

ψ(z) = z^n − a^n − n a^{n−1}(z − a) (1.4.3)

for |z| ≤ 1 and 0 < a < 1. An elementary computation shows that ψ(z) ≥ 0 for |z| ≤ 1. Setting a = ℜf(t), z = cos(tx) in (1.4.3) and integrating with respect to dF(x), we obtain

∫_{−∞}^{∞} cos^n(tx) dF(x) ≥ [ℜf(t)]^n,

which proves the first inequality. The second follows from 1 − a^n ≤ n(1 − a), |a| ≤ 1.
Theorem 1.4.1 was proved first for n = 2^k and then extended to all integer n in (Heathcote & Pitman, 1972). Further extension is impossible in the sense that for any non-integer positive b there exist a characteristic function f(t) and t₀ such that 1 − ℜf(bt₀) > b²[1 − ℜf(t₀)]. Indeed, consider the characteristic function f(t) = cos t and set t₀ = 2π. Then 1 − ℜf(t₀) = 0, while 1 − ℜf(bt₀) = 1 − cos(2πb) > 0.
THEOREM 1.4.2. For any characteristic function f(t) and any t₁ and t₂,

det ( 1, f(−t₁), f(−t₂); f(t₁), 1, f(t₁ − t₂); f(t₂), f(t₂ − t₁), 1 ) ≥ 0,

i.e.,

Indeed, if |f(t₁ − t₂)| ≥ |f(t₁)||f(t₂)|, then (1.4.6) is obvious. If |f(t₁ − t₂)| < |f(t₁)||f(t₂)|, then (1.4.6) is equivalent to (1.4.5) (this can be verified directly) and therefore is again true.

Now replacing t₂ by −t₂ in (1.4.6) and taking into account that |f(t)| = |f(−t)|, we finally obtain (1.4.4).
COROLLARY 1.4.2. Let f(t) be a characteristic function. If |f(t₁)| ≥ cos φ₁ and |f(t₂)| ≥ cos φ₂, where φ₁ ≥ 0, φ₂ ≥ 0, and φ₁ + φ₂ ≤ π/2, then |f(t₁ + t₂)| ≥ cos(φ₁ + φ₂).

In particular, if

|f(t)| ≥ cos φ,

then

|f(nt)| ≥ cos(nφ).
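Corollary 1.4.2 is easy to test numerically. The sketch below (assuming NumPy; the standard normal CF and the values t₀ = 0.3, n = 3 are arbitrary choices that keep nφ below π/2) also checks the inequality of Theorem 1.4.1 at the same point:

```python
import numpy as np

f = lambda t: np.exp(-t**2 / 2.0)   # real-valued CF of the standard normal law

t0, n = 0.3, 3
phi = np.arccos(abs(f(t0)))         # angle with |f(t0)| = cos(phi)

corollary_ok = abs(f(n * t0)) >= np.cos(n * phi) - 1e-12   # |f(nt)| >= cos(n*phi)
theorem_ok = 1 - f(n * t0) <= n**2 * (1 - f(t0)) + 1e-12   # 1 - Re f(nt) <= n^2 [1 - Re f(t)]
in_range = n * phi < np.pi / 2                             # hypothesis of the corollary
```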
and
PROOF. Without loss of generality we may assume that t₁² ≤ 1/c and t₂² ≤ 1/c. Otherwise the right-hand side of (1.4.9) is negative, and the assertion of the lemma is obvious.

Suppose the contrary: let (1.4.7) and (1.4.8) hold but (1.4.9) fail. From (1.4.7) and (1.4.8), taking into account that 1 − ct₂² ≥ 0 and 1 − ct₁² ≥ 0, we obtain an inequality which is equivalent to

(t₁ + t₂)² < 0,

which is impossible.
COROLLARY 1.4.3. Let f(t) be a characteristic function. Then |f(t₀)| ≥ 1 − ct₀² implies |f(nt₀)| ≥ 1 − c(nt₀)² for any n = 1, 2, ...
THEOREM 1.4.3. Let f(t) be a characteristic function. Then the following assertions take place.
(2) If ℜf(t) ≥ 1 − ct² in some neighborhood of the origin, t ≠ 0, then ℜf(t) ≥ 1 − ct² for all t. The assertion remains true if '≥' is replaced by '>' in both inequalities.

By the Cauchy–Schwarz inequality,

[ℜf(t)]² ≤ [1 + ℜf(2t)]/2 (1.4.11)

for all t. Suppose that ℜf(t) ≥ 1 − ct² in some neighborhood of the origin but not everywhere. Let (−b, b) be the widest interval such that ℜf(t) ≥ 1 − ct² for t ∈ (−b, b), t ≠ 0. Then ℜf(b) = 1 − cb², and taking (1.4.11) into account we obtain
|f(t)| ≤ 1 − (γ/4) t² (1.4.13)

PROOF. Suppose the contrary: let (1.4.13) not hold. Then there exists t₀ ∈ (0, b] such that

b/2^{n+1} < t₀ ≤ b/2^n
In connection with Theorem 1.4.4, the question arises whether the coefficient 1/4 at the term γt² can be improved (made larger) or not. It is clear that this constant cannot be larger than 1.
and

|f(t + s) − f(t)|² ≤ 2[1 − ℜf(s)].

PROOF. Let us prove the first inequality. Making use of the Cauchy–Schwarz–Buniakowskii inequality, we obtain

|∫_{−∞}^{∞} sin(tx) dF(x)| ≤ [∫_{−∞}^{∞} sin²(tx) dF(x)]^{1/2} = [(1/2) ∫_{−∞}^{∞} (1 − cos(2tx)) dF(x)]^{1/2},
where 1 > p_k > 1 − ε. Then, as is easy to see, the variances of the distributions corresponding to the characteristic functions f_k(t) are equal to σ², and, for k large enough, f_k(t_k) > p_k > 1 − ε.
Now we present several so-called truncation inequalities connecting the
behavior of the tail of a distribution function with the behavior of the corre-
sponding characteristic function in the neighborhood of the origin.
∫_{|x|<a/t} x² dF(x) ≤ (a²/((1 − cos a) t²)) [1 − ℜf(t)], (1.4.16)

since, with c = (1 − cos a)/a²,

ℜf(t) ≤ 1 − t² c ∫_{|x|<a/t} x² dF(x);

similarly,

∫_{|x|<a/t} x² dF(x) ≤ (2a²/((1 − cos a) t²)) |1 − f(t)|

for all 0 < t < a.
∫_{|x|>1/t} dF(x) ≤ (7/t) ∫_0^t [1 − ℜf(u)] du

and

∫_{|x|>2/t} dF(x) ≤ (1/t) ∫_{−t}^{t} [1 − f(u)] du.
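The second truncation inequality can be checked against a distribution whose tail is known in closed form. A minimal sketch (assuming NumPy; the standard Cauchy law, with f(u) = e^{−|u|} and P(|X| > a) = (2/π) arctan(1/a), is the illustrative choice):

```python
import numpy as np

cf = lambda u: np.exp(-np.abs(u))                    # CF of the standard Cauchy law
tail = lambda a: (2.0 / np.pi) * np.arctan(1.0 / a)  # P(|X| > a)

t = 1.0
u = np.linspace(-t, t, 100001)
vals = 1.0 - cf(u).real
# trapezoidal approximation of (1/t) * integral over [-t, t] of [1 - Re f(u)] du
rhs = (vals.sum() - 0.5 * (vals[0] + vals[-1])) * (u[1] - u[0]) / t
lhs = tail(2.0 / t)                                  # P(|X| > 2/t)
```

Here the integral equals 2/e exactly, so the code doubles as a sanity check on the quadrature.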
THEOREM 1.4.9. Let F(x) and G(x) be two distribution functions with characteristic functions f(t) and g(t) respectively. If G(x) has a derivative and |G′(x)| ≤ A, then for every b > 1/(2π) and every T > 0,

sup_x |F(x) − G(x)| ≤ b ∫_{−T}^{T} |(f(t) − g(t))/t| dt + c(b) A/T,

where c(b) is a positive constant depending only on b, determined by the equation

∫_0^{c(b)/4} (sin²u / u²) du = π/4 + 1/(8b).
THEOREM 1.4.10. Let Fix) and G(x) be distribution functions of integer-valued
random variables, fit) and git) be their characteristic functions. Then
I fit)-git)
sup \Fix) - G(x)\ < \ dt.
x 4 J-n
THEOREM 1.4.11. Let F(x) and G(x) be two distribution functions with characteristic functions f(t) and g(t) respectively. If G(x) is absolutely continuous and the corresponding density q(x) is bounded, then for any positive T and L satisfying the condition LT > 2, the inequality
V_a^b(f) = sup Σ_{i=1}^{n} |f(x_i) − f(x_{i−1})|,

where the supremum is taken over all n and all collections x₀, x₁, ..., x_n such that a = x₀ < x₁ < ... < x_n = b. The total variation on the whole real line is defined as

V_{−∞}^{∞}(f) = lim_{a→−∞, b→∞} V_a^b(f).

For V_{−∞}^{∞}(f), we will omit the limits and write V(f).

A function f(x) is said to be a function of bounded total variation if V(f) < ∞.
sup_x |F(x) − G(x)| ≤ c sup_{|t|≤T} |f(t) − g(t)| log(LT) + γ(L)
PROOF. Denote W(x) = F(x) − G(x) and δ = 8π/T. Without loss of generality we may assume that εT > 16 and ε > 16γ(L), for otherwise the theorem would be obvious. We show now that we can choose two points x₀ < y₀ so that |x₀ − y₀| ≤ L + (8π + 1)/T and, for x ∈ (x₀ − 4π/T, x₀ + 4π/T] and y ∈ (y₀ − 4π/T, y₀ + 4π/T], condition (1.4.17) holds.

In the first case, as x₀ and y₀ we may take x₁ + 4π/T and y₁ − 4π/T. Since F(x) is a non-decreasing function and by condition 2 of the theorem the increment of G(x) on intervals of length 8π/T does not exceed ε/4, condition (1.4.17) is satisfied in this case.

In the second case we assume that W(x₁) < ε/4. For x < x₁ we get |W(x)| < ε/4 + γ(L). On the other hand, for y > y₁ we obtain a similar bound. Therefore, there exists a point z such that x₁ < z < y₁ and |W(z)| > 3ε/4. Set x₀ = z − 4π/T and y₀ = z + 4π/T if W(z) > 0; if W(z) < 0, interchange the roles of x₀ and y₀. From the monotonicity of F(x) and the boundedness of the derivative of G(x) it follows that (1.4.17) holds. The case W(y₁) < ε/4 can be handled with obvious modifications.
Now consider two auxiliary functions:

H(x) = (3T/(8π)) (sin(xT/4)/(xT/4))⁴,  h(t) = ∫_{−∞}^{∞} e^{itx} H(x) dx.

Then

∫_{−∞}^{∞} H(x) dx = 1,  2 ∫_{4π/T}^{∞} H(x) dx < 1/8, (1.4.18)

and

|h(t)| ≤ 1,  h(t) = 0 for |t| ≥ T. (1.4.19)
Now apply the inversion formula to the characteristic functions of the distributions

F₁(x) = ∫_{−∞}^{∞} F(x − y) H(y) dy

and

G₁(x) = ∫_{−∞}^{∞} G(x − y) H(y) dy.

We obtain

I = F₁(x₀) − G₁(x₀) − F₁(y₀) + G₁(y₀) = (1/2π) ∫_{−T}^{T} ((e^{−itx₀} − e^{−ity₀})/(it)) [f(t) − g(t)] h(t) dt.
From the choice of x₀ and y₀, (1.4.17) and (1.4.18) it follows that

|I| ≥ ε/4.

On the other hand, from (1.4.19) and the fact that |x₀ − y₀| ≤ L + (8π + 1)/T we obtain

|I| ≤ (1/π) sup_{|t|≤T} |f(t) − g(t)| ∫_{−T}^{T} |sin[t(x₀ − y₀)/2]| / |t| dt

≤ (2/π) sup_{|t|≤T} |f(t) − g(t)| ∫_0^{T(x₀ − y₀)/2} (|sin t|/t) dt

≤ (2/π) sup_{|t|≤T} |f(t) − g(t)| ∫_0^{(LT + 8π + 1)/2} (|sin t|/t) dt

≤ c₂ sup_{|t|≤T} |f(t) − g(t)| log(LT),

and comparing the two estimates of |I| completes the proof.
THEOREM 1.4.13. Let F(x) and G(x) be two distribution functions with characteristic functions f(t) and g(t) respectively. Then, for any T > 1.3, the inequality

L(F, G) ≤ (1/π) ∫_0^T |(f(t) − g(t))/t| dt + 2e (log T)/T

holds.
and
LiF, G) < 8 ( l + In yr + (1 + 1/r) In W,g),
We also point out an estimate for the uniform distance between proba-
bility densities pix) and qix) via the Li-distance between the corresponding
characteristic functions fit) and git):
sup_x |p(x) − q(x)| ≤ (1/2π) ∫_{−∞}^{∞} |f(t) − g(t)| dt,

which is just a trivial consequence of the inversion theorem for densities (Theorem 1.2.6); its discrete analog is: if X and Y are integer-valued random variables with characteristic functions f(t) and g(t), then

sup_k |P(X = k) − P(Y = k)| ≤ (1/2π) ∫_{−π}^{π} |f(t) − g(t)| dt.
The proof of the theorem can be found in (Ibragimov & Linnik, 1971).
If characteristic functions of two distributions coincide on some interval
containing the origin, they are not necessarily the same (see Appendix A,
Examples 11 and 12). However, this implies that the distribution functions
must be close (the wider the interval, the closer). More exactly, the following
estimate holds.
The proof of the theorem is contained in (Esseen, 1945). It turns out that Theorem 1.4.16 is sharp, as we see from the assertion below: the equality

sup_{F,G} ∫_{−∞}^{∞} |F(x) − G(x)| dx = 2π/T

is true, where the supremum is taken over the set of all pairs of distribution functions whose characteristic functions coincide on the interval [−T, T].
p_n(x) = c_n (1 − cos x)^n / x^{2n},

where the constants c_n > 0 are chosen so as to make p_n(x) a density function for each n. For each n, let f_n(t) be the characteristic function corresponding to p_n(x). Then f₁(t) vanishes outside the interval [−1, 1] (Feller, 1971, p. 501), and if we write cos x = (e^{ix} + e^{−ix})/2, it is easily seen that f_n(t) vanishes outside [−n, n]. Continuing f_n(t) periodically with the period 2λ > 2n, we obtain the characteristic function f_{n,λ}(t) of an arithmetic distribution (Feller, 1971), with atoms at x_k = πk/λ, k = 0, ±1, ±2, ..., of sizes

P_{n,λ}({x_k}) = (π/λ) p_n(x_k). (1.4.23)
The characteristic functions f_{n,λ}(t) and f_{n,λ+1}(t) coincide on the interval [−n, n], so it remains to estimate

∫_{−∞}^{∞} |F_{n,λ}(x) − F_{n,λ+1}(x)| dx (1.4.24)

from below. Let h(x) denote the distance from x to the nearest atom of F_{n,λ}. Hence by (1.4.22) and (1.4.25), with x_k = πk/(λ + 1),

∫_{−∞}^{∞} |F_{n,λ}(x) − F_{n,λ+1}(x)| dx ≥ Σ_{k=−∞}^{∞} h(x_k) [F_{n,λ+1}(x_k) − F_{n,λ+1}(x_k − 0)] = (π/(λ + 1)) Σ_{k=−∞}^{∞} h(x_k) p_n(x_k). (1.4.26)

From (1.4.23) and (1.4.26) we obtain

∫_{−∞}^{∞} |F_{n,λ}(x) − F_{n,λ+1}(x)| dx ≥ c′ ∫_{−∞}^{∞} p_n(x) h(x) dx = 2c′ ∫_0^{π} Ψ_n(x) h(x) dx, (1.4.27)
where

Ψ_n(x) = Σ_{k=−∞}^{∞} (p_n(2πk + x) + p_n(2πk − x)) = 2c_n (1 − cos x)^n Σ_{k=−∞}^{∞} (2πk + x)^{−2n},  0 < x < π.

We denote the distribution function corresponding to the density Ψ_n(x) by Φ_n(x), and set ψ₁(u) = Σ_{k=−∞}^{∞} (2πk + u)^{−2}. Let us prove that

Φ_n(x) → 0 as n → ∞ (1.4.28)

for each x ∈ [0, π). Let us choose any y ∈ (x, π). Then

1 ≥ ∫_y^{π} Ψ_n(u) du ≥ 2c_n ∫_y^{π} (1 − cos u)^n ψ₁(u) du ≥ 2c_n (π − y)(1 − cos y)^n ψ₁(π),

therefore

Φ_n(x) = ∫_0^x Ψ_n(u) du ≤ (c_n/c₁) ∫_0^x (1 − cos u)^{n−1} ψ₁(u) du ≤ (1 − cos x)^{n−1} (c_n/c₁) ∫_0^x ψ₁(u) du ≤ C(x, y) (1 − cos x)^{n−1} / (2(π − y)(1 − cos y)^n) → 0,
1.5. Moments, expansions, asymptotics 39
provided that the integral on the right-hand side exists and is finite. In this
case the moment of order
exists also.
In this section, we present some results concerning the relationship be-
tween the existence of moments of a distribution function and the behavior of
its characteristic function in the neighbourhood of the origin.
If the moment α_n of order n exists, then it is related to the nth derivative of the characteristic function by the formula

α_n = (−i)^n f^{(n)}(0),

and

f(t) = Σ_{k=0}^{n} ((it)^k / k!) α_k + o(|t|^n)

as |t| → 0.

Conversely, if f(t) admits an expansion of the form

f(t) = Σ_{k=0}^{n} a_k t^k + o(|t|^n),

then F(x) has all moments of orders inferior to n + 1. In this case

a_k = (i^k / k!) α_k.
and, conversely, (1.5.1) implies that α_k exists for all k < n, the following theorem holds.
If the function

[f(t + s) − f(t) − (s/1!) f′(t) − ⋯ − (s^n/n!) f^{(n)}(t)] / (s^{n+1}/(n + 1)!) (1.5.2)

is bounded for 0 < |s| < δ, then the moment of order 2n of F(x) exists. Conversely, if the moment of order 2n of F(x) exists, then function (1.5.2) is bounded on the set (−δ, δ) \ {0} for some positive δ.
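The relation α_n = (−i)^n f^{(n)}(0) can be verified with finite differences. A sketch (assuming NumPy; the Exp(1) law with f(t) = 1/(1 − it), for which E X = 1 and E X² = 2, is the illustrative choice):

```python
import numpy as np

f = lambda t: 1.0 / (1.0 - 1j * t)   # CF of the exponential law with mean 1

h = 1e-3
d1 = (f(h) - f(-h)) / (2.0 * h)                # central difference for f'(0)
d2 = (f(h) - 2.0 * f(0.0) + f(-h)) / h**2      # central difference for f''(0)

m1 = ((-1j) * d1).real       # alpha_1 = (-i)^1 f'(0)  -> E X   = 1
m2 = ((-1j)**2 * d2).real    # alpha_2 = (-i)^2 f''(0) -> E X^2 = 2
```

The step h must balance truncation against roundoff; h ≈ 1e-3 keeps both errors well below 1e-4 here.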
(a) lim_{n→∞} t_n = 0;
1 71,5
ft. = "TV1
^ sin
) ^ " ( 2 " ) (^0 ) ~ dt.
+ T
if and only if

1 − ℜf(t) = O(|t|^r), |t| → 0,

if and only if

1 − |f(t)| = O(|t|^r), |t| → 0.
1.6. Unimodality
The concept of unimodality was introduced in (Khintchine, 1938). A random
variable X and its distribution function F(x) are called unimodal with the mode
at a if F(x) is convex on (oo, a) and concave on (a, oo). Unimodal distributions
play an important role in many branches of probability theory and statistics.
This section brings together some of the basic results concerning characteristic
functions of unimodal distributions.
For simplicity, we will consider below only unimodal distributions with the mode at zero. Possible extensions to the general case are evident.
In (Khintchine, 1938), the following criterion for characteristic functions
of unimodal distributions was proved.
THEOREM 1.6.1. Let φ(t) be an arbitrary characteristic function. Then the function

f(t) = (1/t) ∫_0^t φ(u) du (1.6.1)

is the characteristic function of a distribution unimodal with the mode at zero, and every such characteristic function admits representation (1.6.1).

THEOREM 1.6.2. Let G(x) be an arbitrary distribution function. Then the function

f(t) = ∫_{−∞}^{∞} ((e^{iut} − 1)/(iut)) dG(u) (1.6.2)

is the characteristic function of a distribution unimodal with the mode at zero, and conversely.
PROOF. Let a function f(t) be represented in the form (1.6.2), where G(u) is a distribution function. Taking into account that

(e^{iut} − 1)/(iut) = e^{iut/2} (sin(ut/2)/(ut/2)),

we obtain

f(t) = ∫_{−∞}^{∞} e^{iut/2} (sin(ut/2)/(ut/2)) dG(u) = ∫_{−∞}^{∞} e^{iut} (sin(ut)/(ut)) dG̃(u),

where G̃(u) = G(2u).
unimodal law, then the distribution of X₁ − X₂ is also unimodal. Note that this is not true for the sum: f²(t) does not necessarily correspond to a unimodal distribution if f(t) does (see Appendix A, Example 30).
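Khintchine's criterion has a probabilistic reading: X is unimodal with mode at zero if and only if X = UZ with U uniform on (0, 1) independent of some Z, in which case the characteristic function of X is (1/t)∫₀ᵗ φ(u) du, φ being the CF of Z. A Monte Carlo sketch (assuming NumPy; Z standard normal, and the sample size and test points t are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
Z = rng.standard_normal(n)      # phi(u) = exp(-u^2/2)
U = rng.uniform(0.0, 1.0, n)
X = U * Z                       # unimodal with mode at 0, by Khintchine's theorem

def f_theory(t, m=10001):
    # (1/t) * integral_0^t phi(u) du via the trapezoidal rule
    u = np.linspace(0.0, t, m)
    phi = np.exp(-u**2 / 2.0)
    return (phi.sum() - 0.5 * (phi[0] + phi[-1])) * (u[1] - u[0]) / t

max_err = max(abs(np.exp(1j * t * X).mean() - f_theory(t)) for t in (0.5, 1.0, 2.0))
```

With 2·10⁵ samples the empirical CF agrees with (1.6.1) to Monte Carlo accuracy of a few thousandths.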
The necessary and sufficient conditions given by Theorems 1.6.1 and 1.6.2
are often hard to verify for a given function; therefore useful sufficient condi-
tions are of interest. One of these conditions is given below.
(1) f(0) = 1;
PROOF. We prove the theorem under the additional conditions that f(t) is three times differentiable and f‴(t) is non-negative for t > 0. We have

p′(x) = −(1/π) ∫_0^{∞} t f(t) sin(tx) dt

for x > 0. Integrate by parts, differentiating f(t) and integrating the other factor. Then

p′(x) = −(1/π) lim_{t→∞} f′(t) ∫_0^t u sin(ux) du + (1/π) ∫_0^{∞} f″(t) ∫_0^t u sin(ux) du dt,

therefore, after one more integration by parts, the assertion reduces to the identity

∫_0^t (t − v)² v sin v dv = 4 ∫_0^t (1 − cos v)(1 − cos(t − v)) dv,

and the right-hand side is obviously positive when t > 0.
jet rt
=^ T TT / (1.6.4)
(1 - elt) Jo
where λ (0 < λ < 1) is arbitrary and φ(u) is the characteristic function of some discrete distribution.
f(t) = (1/(2 sin(t/2))) ∫_0^t φ(u) du,  f(2πn) = 1,  n = 0, ±1, ±2, ..., (1.6.6)
for k − 1/2 < x ≤ k + 1/2, k = 0, ±1, ±2, .... Denote its characteristic function by g(t):

g(t) = ∫_{−∞}^{∞} e^{itx} dG(x).

Then, as we easily see, G(x) is a continuous distribution function such that

G(x) − xG′(x) = F(x)

(the forward derivative is taken at the points x = k + 1/2). Then, due to (Khintchine, 1938), G(x) is a unimodal distribution function (with mode at zero), and
f(t) = ∫_{−∞}^{∞} e^{itx} dF(x).

Taking (1.6.8) into account, we obtain

f(t) = (1/(2 sin(t/2))) ∫_0^t φ(u) du,  f(2πn) = 1,  n = 0, ±1, ±2, ....
g(t) = f(t) (sin(t/2)/(t/2)),

i.e.,

G(x) = ∫_{−∞}^{∞} H(x − y) dF(y),
where H(x) is defined by formula (1.6.9), and F(x) is the distribution function corresponding to the characteristic function f(t); i.e., G(x) is a continuous distribution function, linear on each interval [k − 1/2, k + 1/2], convex for x < 0 and concave for x > 0. In other words, on each interval [k − 1/2, k + 1/2), k = 0, ±1, ±2, ..., the distribution function G(x) can be represented in the form

G(x) = a_k x + b_k,

where a_k ≥ 0, b_k ≥ 0, a_k increases for k < 0 and decreases for k > 0, and b_k increases for all k. Therefore (see (Khintchine, 1938))
and

g(t) = (1/t) ∫_0^t φ(u) du,

where Φ(x) is some distribution function and φ(t) is the corresponding characteristic function.
From (1.6.10) we obtain Φ(x) = b_k for x ∈ (k − 1/2, k + 1/2), k = 0, ±1, ±2, .... This means that Φ is a discrete distribution function having jumps b_{k+1} − b_k at the points k + 1/2, k = 0, ±1, ±2, ..., i.e., there is a random variable Y with

P(Y = k + 1/2) = b_{k+1} − b_k,

and

a_i = P(i − 1/2 < X ≤ i + 1/2) = 2 Σ_{k=i}^{∞} (P(Y = k + 1/2)/(2k + 1)).

Hence

1 = Σ_i P(i − 1/2 < X ≤ i + 1/2) = Σ_i a_i,

where

a_i = 2 Σ_{k=i}^{∞} (P(Y = k + 1/2)/(2k + 1)).
Now consider the set of all distributions with densities of the form
or of the form
[, < k, x>0.
Denote this set by 𝒜, and denote the set of all mixtures of distributions from 𝒜 by 𝒜̄.
for all 0 < r < R, R > 0, then f(z) is an analytic characteristic function in the
circle \z\ < R.
Conversely, if fiz) is an analytic characteristic function in a circle \z\ < R,
then F(x) satisfies condition (1.7.1) for all 0 < r < R.
for all 0 < r < R, R > 0, then f(z) is an analytic characteristic function in the
circle \z\ <R.
Conversely, if f(z) is an analytic characteristic function in a circle \z\ < R,
then F(x) satisfies condition (1.7.2) for all 0 < r < R.
1.7. Analyticity of characteristic functions 53
The maximal strip of the form −a < ℑz < b, a > 0, b > 0, in which a characteristic function f(z) is analytic is said to be the strip of analyticity of f(z). The strip of analyticity of an analytic characteristic function can be
the whole complex plane or a half-plane or can have one or two horizontal
boundary lines. If the strip has a boundary, then the points of intersection
of the boundary with the imaginary axis are singular points of the analytic
characteristic function. If the strip of analyticity is the whole plane, then the
characteristic function is said to be an entire characteristic function.
Let f(t), f₁(t), and f₂(t) be characteristic functions such that f(t) = f₁(t)f₂(t). If f(t) is an analytic characteristic function, then both f₁(t) and f₂(t) are analytic characteristic functions. If f(z) is analytic in a strip −a < ℑz < b, a > 0, b > 0, then f₁(z) and f₂(z) are analytic in this strip. If f(z) is an entire characteristic function, then f₁(t) and f₂(t) are entire characteristic functions.
The methods of the theory of analytic functions have proved to be a very powerful means in the investigation of characteristic functions. One of the most celebrated examples is the following Cramér theorem on the decomposition of the normal law.
The next theorem is among those results of this section which will be used
below.
f(t) = ∫_{R^m} e^{i⟨t,x⟩} dF(x) = E e^{i⟨t,X⟩}.

If F(x) is absolutely continuous with density p(x), then

f(t) = ∫_{R^m} e^{i⟨t,x⟩} p(x) dx.

If X is discrete with P(X = x_n) = p_n, n = 1, 2, ..., Σ_n p_n = 1, then

f(t) = Σ_{n=1}^{∞} p_n e^{i⟨t,x_n⟩}.
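For a concrete multivariate example, the Gaussian vector with mean μ and covariance Σ has f(t) = exp(i⟨t, μ⟩ − ⟨t, Σt⟩/2), which can be compared with the empirical mean of e^{i⟨t,X⟩} over simulated samples. A sketch (assuming NumPy; μ, Σ, the sample size and the test point t are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(2)
mu = np.array([1.0, -0.5])
Sigma = np.array([[2.0, 0.6],
                  [0.6, 1.0]])
X = rng.multivariate_normal(mu, Sigma, size=200_000)

t = np.array([0.3, -0.4])
ecf = np.exp(1j * (X @ t)).mean()                   # empirical E exp(i<t, X>)
cf = np.exp(1j * (mu @ t) - 0.5 * (t @ Sigma @ t))  # Gaussian characteristic function
err = abs(ecf - cf)
```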
if and only if f(t) = f₁(t)f₂(t), where f(t), f₁(t) and f₂(t) are characteristic functions of F(x), F₁(x), and F₂(x) respectively.
∫_{R^m} u(x) dF_n(x) → ∫_{R^m} u(x) dF(x) as n → ∞

for any continuous bounded function u(x). The continuity theorem in the multi-dimensional case is also identical to that in the univariate case.
Let F(x) be a distribution function. We will use the same letter to denote the corresponding distribution. So, for any Borel set B ⊂ R^m,

F(B) = ∫_B dF(x).
f(t) = Γ(m/2) (2/‖t‖)^{(m−2)/2} J_{(m−2)/2}(‖t‖),

where J_p(z) is the Bessel function of order p.
Let a = (a₁, ..., a_m) and b = (b₁, ..., b_m) be two vectors such that a_j < b_j, j = 1, ..., m. Denote

p(x) = (2π)^{−m} ∫_{R^m} e^{−i⟨t,x⟩} f(t) dt.
THEOREM 1.8.6 (Parseval). Let F(x) and G(x) be distribution functions with characteristic functions f(t) and g(t), respectively. Then
In particular,

∫_{R^m} f(t) dG(t) = ∫_{R^m} g(t) dF(t).
Σ_{j=1}^N Σ_{k=1}^N f(t_j − t_k) z_j z̄_k

is real and non-negative for any positive integer N, any complex z₁, z₂, ..., z_N, and any t₁, t₂, ..., t_N belonging to R^m.
THEOREM 1.8.9 (Bochner–Khintchine). A complex-valued function f(t) defined on R^m is a characteristic function if and only if

(a) f(t) is non-negative definite,
(b) f(0) = 1.
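Non-negative definiteness in condition (a) can be probed numerically (an illustration, not from the book): for a genuine characteristic function, every matrix (f(t_j − t_k))_{j,k} must be Hermitian positive semi-definite, so its smallest eigenvalue is non-negative up to rounding. Here the univariate standard normal is used as the test case:

```python
import numpy as np

# Characteristic function of the standard normal on R (known closed form).
f = lambda t: np.exp(-0.5 * t**2)

# Build the N x N matrix f(t_j - t_k) for arbitrarily chosen points.
t = np.array([-2.0, -0.4, 0.0, 0.3, 1.1, 2.5])
M = f(t[:, None] - t[None, :])

# Hermitian PSD: all eigenvalues >= 0 (up to floating-point error).
eigs = np.linalg.eigvalsh(M)
print(eigs.min() >= -1e-12)  # True
```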
(a) f(0) = 1;
(b) for any unit vector e, f_e(t) = f(te) is convex for t > 0,
There are functions in R^m satisfying these conditions which are not characteristic functions (see Appendix A, Example 37). Nevertheless, some multi-dimensional analogs of the Pólya criterion were obtained (see, e.g., (Velikoivanenko, 1987; Velikoivanenko, 1992)). Unfortunately, many multi-dimensional analogs lose the main advantage of the Pólya criterion, namely its remarkable simplicity. A nice exception is the Askey theorem (Askey, 1973); see also (Trigub, 1989).
(a) f(0) = 1;
(b) for any unit vector e, f_e(t) = f(te) is k = [m/2] times differentiable, where [·] denotes the greatest integer less than or equal to its argument, and (−1)^k f_e^{(k)}(t) is convex for t > 0;
THEOREM 1.8.11. For any characteristic function f(t) and any positive integer n,

1 − ℜf(nt) ≤ n{1 − [ℜf(t)]^n} ≤ n²[1 − ℜf(t)]

for all t ∈ R^m.
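Taking the chain as reconstructed here (1 − ℜf(nt) ≤ n{1 − [ℜf(t)]^n} ≤ n²[1 − ℜf(t)] — an assumption of this illustration, not a quotation), it can be verified on a grid for the standard normal, where it reduces to the elementary facts 1 − v^{n²} ≤ n(1 − vⁿ) ≤ n²(1 − v) for 0 ≤ v ≤ 1:

```python
import numpy as np

n = 3
t = np.linspace(0, 10, 100_001)
f = lambda s: np.exp(-s**2 / 2)   # ch.f. of the standard normal (real-valued)

lhs = 1 - f(n * t)                # 1 - Re f(nt)
mid = n * (1 - f(t) ** n)         # middle term of the (assumed) chain
rhs = n**2 * (1 - f(t))           # n^2 [1 - Re f(t)]

print(np.all(lhs <= mid + 1e-12) and np.all(mid <= rhs + 1e-12))  # True
```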
THEOREM 1.8.12. Let f(t) be a characteristic function such that |f(t)| ≤ c for ‖t‖ ≥ b, where c < 1, b > 0. Then

|f(t)| ≤ 1 − ((1 − c²)/(8b²)) ‖t‖²

for ‖t‖ ≤ b.
Note that, generally speaking, the second inequality of Theorem 1.4.5 does not hold in the multi-dimensional case. Indeed, consider a characteristic function f(t) = f(t₁, t₂) whose distribution is concentrated on the y-axis. Let s and t be arbitrary real numbers such that s ≠ 0 and t ≠ 0. Set s = (s, 0) and t = (0, t). Then
and

1 − ℜf(s) = 1 − f(s, 0) = 0;

hence

|f(t + s) − f(t)|² > 2[1 − ℜf(s)].
The following multi-dimensional analog of Theorem 1.4.2 is not a conse-
quence of the one-dimensional result but can be proved in exactly the same
way as in the one-dimensional case.
THEOREM 1.8.14. For any characteristic function f(t) and any t₁ and t₂

COROLLARY 1.8.1. Let f(t) be a characteristic function. If |f(t₁)| ≥ cos φ₁ and |f(t₂)| ≥ cos φ₂, where φ₁ > 0, φ₂ > 0, and φ₁ + φ₂ ≤ π/2, then
where k₁ + ... + k_m = k (k_j are non-negative integers, j = 1, ..., m; some of them can be equal to zero).
(−1)^k ∂^k f(t)/∂t₁^{k₁} ⋯ ∂t_m^{k_m}
Recall that this result is not true (even in the univariate case) if the order
of the derivatives is not even.
∂^{2k} f(t)/∂t_j^{2k}
g(t) = (∂^{2k} f(t)/∂t₁^{2k₁} ⋯ ∂t_m^{2k_m}) / (∂^{2k} f(0)/∂t₁^{2k₁} ⋯ ∂t_m^{2k_m})

is a characteristic function.
p(x) = (1/((2π)^{m/2} √(det C))) exp{−(1/2)⟨C^{−1}(x − a), x − a⟩}

F(x) = ∫_0^∞ U^{(r)}(x) dH(r), (1.8.2)

where U^{(r)} is the uniform distribution on the surface of the m-dimensional sphere of radius r with center at zero and H(r) is some distribution function.
u^{(r)}(x) = (Γ(m/2)/(r√π Γ((m − 1)/2))) (1 − x²/r²)^{(m−3)/2}, |x| < r, (1.8.3)

and u^{(r)}(x) = 0 for |x| ≥ r.
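The marginal density in (1.8.3) (as used here — the exact constants are an assumption of this illustration, not a quotation) can be sanity-checked numerically: it should integrate to 1, and its second moment should equal r²/m, since the coordinates of a point uniform on the sphere of radius r satisfy E X₁² = r²/m by symmetry:

```python
import numpy as np
from math import gamma, sqrt, pi

def u(x, r, m):
    # Marginal density of one coordinate of a uniform point on the
    # sphere of radius r in R^m, per (1.8.3) (assumed form).
    c = gamma(m / 2) / (r * sqrt(pi) * gamma((m - 1) / 2))
    out = np.zeros_like(x)
    inside = np.abs(x) < r
    out[inside] = c * (1 - (x[inside] / r) ** 2) ** ((m - 3) / 2)
    return out

m, r = 5, 2.0
x = np.linspace(-r + 1e-9, r - 1e-9, 200_001)
dens = u(x, r, m)
dx = x[1] - x[0]

total = np.sum(dens) * dx            # should be close to 1
second_moment = np.sum(x**2 * dens) * dx  # should be close to r^2/m

print(total, second_moment)
```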
This theorem implies (in view of Theorem 1.6.3) the following important
necessary condition for characteristic functions of spherically symmetric dis-
tributions.
This theorem shows, in particular, that the Pólya criterion (Theorem 1.3.9) cannot be extended to the multivariate case straightforwardly (see Appendix A, Example 37), at least for m ≥ 3. Actually, as shown in (Velikoivanenko, 1987), the same situation takes place in the case m = 2.
Another consequence of Theorem 1.8.19 is (due to Theorem 1.6.1) the following representation of characteristic functions of spherically symmetric distributions for m ≥ 3: each of these functions can be represented as

f(t) = (1/‖t‖) ∫_0^{‖t‖} g(u) du,

where 0 ≤ λ ≤ 1, E(x) is the distribution degenerate at zero, and G(x) is an absolutely continuous distribution whose density g(x) satisfies the condition g(x₁) ≥ g(x₂) if ‖x₁‖ ≤ ‖x₂‖.
ω^{(r)}(t) = Γ(m/2 + 1) (2/(r‖t‖))^{m/2} J_{m/2}(r‖t‖). (1.8.6)

Therefore, by (1.8.2) and (1.8.4) we obtain the following (quite similar) characterizations for spherically symmetric and unimodal spherically symmetric distributions respectively.

f(t) = ∫_0^∞ Γ(m/2 + 1) (2/(r‖t‖))^{m/2} J_{m/2}(r‖t‖) dG(r),

where G(r) is a univariate distribution function concentrated on the positive half-line.
1.9. Notes
Characteristic functions have been used in probability theory for a long time: almost two centuries ago Laplace employed them in his work concerning a limit theorem for independent, uniformly distributed random variables. At the beginning of the twentieth century, Lyapunov widely used the method of characteristic functions in studying limit theorems. A systematic study of the properties of characteristic functions was started by Lévy (see (Lévy, 1925)). The modern state of the theory of characteristic functions is reflected in a number of books: (Gnedenko & Kolmogorov, 1954; Lukacs & Laha, 1964; Ramachandran, 1967; Lukacs, 1970; Feller, 1971; Ibragimov & Linnik, 1971; Kawata, 1972; Cuppens, 1975; Petrov, 1975; Linnik & Ostrovskii, 1977; Loève, 1977; Lukacs, 1983; Galambos, 1988; Prokhorov & Rozanov, 1969).
Most results presented in Chapter 1 are well known and can be found, together with proofs, in many textbooks and monographs, first of all in those mentioned above. Historical and bibliographical information concerning these results can be derived from these sources as well. For this reason, here we touch upon only some results, mainly recent or less known ones.
Theorems 1.3.5 and 1.3.9 are due to (Berman, 1975). Some close results
were obtained in (Velikoivanenko, 1987; Velikoivanenko, 1992). Theorem 1.3.6
was obtained in (Trigub, 1989).
Theorem 1.4.1 is due to (Heathcote & Pitman, 1972). Theorem 1.4.2 was derived in (Postnikova & Yudin, 1977). A weaker variant of Theorem 1.4.4 was originally obtained by Cramér (see (Cramér, 1962)) and later improved in (Postnikova & Yudin, 1977). Theorem 1.4.5 was obtained in (Raikov, 1940). Theorem 1.4.6 was obtained by Loève (see (Loève, 1977)). Theorem 1.4.9 is
due to (Esseen, 1945). There are some generalizations of this inequality (see,
e.g., (Petrov, 1975)). Theorem 1.4.10 is due to (Tsaregradskii, 1958). Theo-
rem 1.4.12 (Theorem 1.4.11 is its particular case) was obtained in (Meshalkin
& Rogozin, 1962). Theorems 1.4.13, 1.4.14, and 1.4.15 are, respectively, due
to (Zolotarev, 1970; Zolotarev, 1971; Zolotarev & Senatov, 1975; Esseen, 1945).
Proposition 1.4.1 is due to (Kallenberg, 1974).
Theorem 1.6.5 was obtained in (Askey, 1975). Theorem 1.6.6 is due to
(Medgyessy, 1972); Theorems 1.6.7 and 1.6.8 are due to (Ushakov, 1998). The-
orem 1.8.14 is due to (Postnikova & Yudin, 1977).
2
Inequalities
cos x ≤ 1 − h₂(c) x²/2, |x| ≤ c, (2.1.2)

(2.1.3)

(2.1.4)

h₂(x) ≥ 1 − x²/12 (2.1.5)
for all x;
sin x / x ≤ 1 − h₂(c) x²/6, |x| ≤ c, (2.1.6)

for any 0 < c < 2;
cos x ≥ 1 − x²/2 (2.1.8)

for all x;
cos x ≤ sin x / x. (2.1.9)
Now it suffices to observe that the function sin x/x decreases for |x| ≤ π and is non-negative in this interval.
To prove (2.1.3), we rewrite it in the form (without loss of generality we assume that x > 0)
decreases in the interval (0, √2). But this is obvious because both functions 2/x² and ln(1 − x²/2) decrease.
To prove (2.1.4), let us use the elementary inequality ln(1 + z) ≤ z, which is true for any z > −1. Then we obtain (for 0 < x < 1)
cos x ≤ 1 − x²/2 + x⁴/24. (2.1.10)

cos x = 1 − x²/2 + (cos^{(IV)}(θx)/24) x⁴
      = 1 − x²/2 + (cos(θx)/24) x⁴
      ≤ 1 − x²/2 + x⁴/24.
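The elementary sandwich formed by (2.1.8) and (2.1.10), namely 1 − x²/2 ≤ cos x ≤ 1 − x²/2 + x⁴/24, is easy to confirm on a dense grid (an illustration, not from the book):

```python
import numpy as np

x = np.linspace(-20.0, 20.0, 400_001)

lower = 1 - x**2 / 2                 # (2.1.8)
upper = 1 - x**2 / 2 + x**4 / 24     # (2.1.10)

# Small slack absorbs floating-point rounding near the tangency at x = 0.
print(np.all(np.cos(x) >= lower - 1e-12))  # True
print(np.all(np.cos(x) <= upper + 1e-12))  # True
```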
v(x) = x − h₂(c) x³/6 − sin x ≥ 0

v′(x) = 1 − h₂(c) x²/2 − cos x ≥ 0

due to (2.1.2). Hence v(x) is non-decreasing for 0 ≤ x ≤ c. Taking into account that v(0) = 0, we see that v(x) ≥ 0 for 0 ≤ x ≤ c.
Let us prove (2.1.7). As in the previous case, assume without loss of generality that x > 0. Set

v(x) = 1 − (4/π²) x² − cos x.

We have v(0) = 0, v(π/2) = 0. Consider the derivative of v(x):

v′(x) = sin x − (8/π²) x.

Since the function sin x/x decreases in the interval [0, π/2], v′(x) has only one root in this interval; therefore either v(x) ≥ 0 for all x ∈ [0, π/2] or v(x) ≤ 0 for all x ∈ [0, π/2]. Demonstrate that the first relation holds. Indeed, we have

v′(x) = x (sin x/x − 8/π²) > 0

provided that x is sufficiently small, i.e., for some a > 0, v(x) ≥ 0 for x ∈ [0, a].
v(x) = cos x − 1 + x²/2.

We have v(0) = 0 and v′(x) = x − sin x ≥ 0 for x ≥ 0, which implies (2.1.8).
Finally, let us prove (2.1.9). Without loss of generality, we may assume that x > 0. Consider the function

v(x) = sin x − x cos x.

We have v(0) = 0, and v′(x) = x sin x > 0 for 0 < x < π. These obviously imply that v(x) is positive on the interval (0, π), i.e., (2.1.9) is valid.
A set of distributions and the set of the corresponding characteristic functions (denote the latter by ℱ) are said to be closed with respect to translation if the inclusion f(t) ∈ ℱ implies the inclusion f(t)e^{itb} ∈ ℱ for any real b. It is easy to see that a class of characteristic functions is closed with respect to translation if and only if it can be represented in the form

ℱ = ∪_{α∈I} ℱ_α,

where I is some set of indices and for each ℱ_α the corresponding set of distributions is an additive type (a set of distributions is called the additive type of a distribution F if it consists of all distribution functions of the form F(x − a), −∞ < a < ∞; see (Prokhorov, 1965; Kagan et al., 1973)). The following lemma is frequently used in this chapter.
LEMMA 2.1.1. Let ℱ be a class of characteristic functions closed with respect to translation, B be an arbitrary subset of the real line, and let g(t) be a real-valued function defined on B. If for any f ∈ ℱ and any t ∈ B,

|ℜf(t)| ≤ g(t),

then

|f(t)| ≤ g(t), t ∈ B.

PROOF. Let us fix an arbitrary f ∈ ℱ and an arbitrary t ∈ B. Since ℱ is closed with respect to translation, f(t)e^{itb} ∈ ℱ for any b. Choose b so that

ℑ(f(t)e^{itb}) = 0

(this is always possible because the equation a sin x + b cos x = 0 has roots for any a and b). So, we obtain
The two lemmas below are very close to each other but are presented
separately for convenience.
LEMMA 2.1.2. Let φ(x), −∞ < x < ∞, be a real-valued non-negative function, symmetric about some point c and non-increasing for x ≥ c, and let X and Y be random variables. If

P(|X − c| < a) ≤ P(|Y − c| < a)
lim () = 0
and
lim () = 0,
|*|-oo
and
() = () + , () = () + .
Observe that

Eφ(X) = Eψ(|X|)

and

Eφ(Y) = Eψ(|Y|).

Set

D(x) = P(|Y| ≤ x) − P(|X| ≤ x).
Eφ(Y) − Eφ(X) = Eψ(|Y|) − Eψ(|X|) = ∫_0^∞ ψ(x) dD(x) = −∫_0^∞ D(x) dψ(x) ≥ 0,

because ψ(x) is non-increasing for x ≥ 0 and hence

∫_0^∞ D(x) dψ(x) ≤ 0.
LEMMA 2.1.3. Let f₁(x), f₂(x) and g(x) be integrable functions defined on the interval [a, b]. Suppose that g(x) is non-increasing on [a, b],

∫_a^b f₁(x) dx = ∫_a^b f₂(x) dx,

and there exists c ∈ (a, b) such that

∫_a^b f₁(x)g(x) dx − ∫_a^b f₂(x)g(x) dx
= ∫_a^b [f₁(x) − f₂(x)] g(x) dx
= ∫_a^c [f₁(x) − f₂(x)] g(x) dx − ∫_c^b [f₂(x) − f₁(x)] g(x) dx
Let f(x) be a non-negative continuous function defined in the interval [a, b]. Let us introduce the functional

ℱ_a^b(f, ε) = sup_{B∈𝔅(ε)} ∫_B f(x) dx,

where 𝔅(ε) is the class of all Borel subsets of the interval [a, b] whose Lebesgue measure is equal to ε (it is assumed that ε ≤ b − a). We need the following properties of the functional ℱ_a^b.
LEMMA 2.1.4. Let a be a real number, and let b, c and d be positive numbers such that

b ≥ π/(2c), cd ≥ 1/2.

Then

ℱ_{a−c}^{a+c}(d cos bx, 1/d) ≤ (4cd/π) sin(π/(4cd)).

PROOF. Without loss of generality, one can suppose that c = π/2 (the general case can be reduced to this one by the change of variables). Let 0 < ε < π. Then, for any real a and b ≥ 1,
LEMMA 2.1.5. Let f(x) be a probability density function and g(x) be a continuous function. If

sup_x f(x) ≤ c

and

∫_a^b f(x) dx = 1

for some a and b, then

∫_a^b f(x)g(x) dx ≤ ℱ_a^b(cg, 1/c).
LEMMA 2.1.6. Let m ≥ 2. Then
(2)
1 I 2 ) m
.r(f + l) fr(f) m
r (i) r(f) r(f) 2"
+ (N+I)
13s(2n + 1)2" /
~ 13s(2ra - l ) 2 n + 1
In +1 m
= 2 = 2*'
LEMMA 2.1.7. Let X and Y be independent and identically distributed random variables with zero expectation and finite absolute moment of order 2 + δ, 0 < δ ≤ 1. Then

E|X − Y|^{2+δ} ≤ 2(E|X|^{2+δ} + σ² E|X|^δ).
we obtain

E|X − Y|^{2+δ} ≤ ∫_{−∞}^∞ ∫_{−∞}^∞ (x − y)² (|x|^δ + |y|^δ) dF(x) dF(y).

Further,

∫_{−∞}^∞ ∫_{−∞}^∞ xy (|x|^δ + |y|^δ) dF(x) dF(y) = 2 ∫_{−∞}^∞ x dF(x) ∫_{−∞}^∞ y|y|^δ dF(y) = 2 (EX) E(Y|Y|^δ) = 0,
therefore

E|X + Y| ≥ E|X|;

therefore

∫_{−∞}^∞ E|x + Y| dF(x) ≥ ∫_{−∞}^∞ |x| dF(x) = E|X|.
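Both moment facts above can be sanity-checked by Monte Carlo (an illustration, not from the book; the centered exponential law is an arbitrary choice of i.i.d. variables with zero mean and finite moments):

```python
import numpy as np

rng = np.random.default_rng(1)
delta, n = 0.5, 1_000_000

# i.i.d. centered exponentials: zero mean, all moments finite.
X = rng.exponential(1.0, n) - 1.0
Y = rng.exponential(1.0, n) - 1.0

# Lemma 2.1.7-type bound: E|X-Y|^{2+d} <= 2(E|X|^{2+d} + Var(X) E|X|^d).
lhs = np.mean(np.abs(X - Y) ** (2 + delta))
rhs = 2 * (np.mean(np.abs(X) ** (2 + delta))
           + X.var() * np.mean(np.abs(X) ** delta))
print(lhs < rhs)  # True

# Companion fact: E|X + Y| >= E|X|.
print(np.mean(np.abs(X + Y)) >= np.mean(np.abs(X)))  # True
```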
LEMMA 2.1.9. For any 0 < p ≤ 1/2, the function

ψ(x) = sin x/x − [p + (1 − p) cos x]

has a unique root x₀, x₀ > π, in the interval (0, 2π). In addition, ψ(x) is positive for 0 < x < x₀ and negative for x₀ < x < 2π.
sin x / x > (1 + cos x)/2 (2.1.11)

sin x / x < cos x ≤ p + (1 − p) cos x,

hence ψ(x) is negative.
It remains to consider the interval (π, 2π). In this interval the function sin x/x decreases (because its derivative (x cos x − sin x)/x² is negative), while the function p + (1 − p) cos x increases. Hence sin x/x and p + (1 − p) cos x can have at most one intersection. Summing up, we obtain the assertion of the lemma.
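The conclusion of Lemma 2.1.9 can be confirmed numerically (an illustration, not from the book): on a fine grid over (0, 2π), ψ changes sign exactly once, at a point larger than π.

```python
import numpy as np

p = 0.3  # any value in (0, 1/2]
x = np.linspace(1e-9, 2 * np.pi - 1e-9, 2_000_000)
psi = np.sin(x) / x - (p + (1 - p) * np.cos(x))

# Count strict sign changes and locate the crossing.
signs = np.sign(psi)
changes = np.nonzero(signs[:-1] * signs[1:] < 0)[0]

print(len(changes))            # 1: a unique root in (0, 2*pi)
print(x[changes[0]] > np.pi)   # True: the root exceeds pi
```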
For any 0 < δ ≤ 1,

cos x ≤ 1 − x²/2 + c(δ)|x|^{2+δ}

for all real x, where

c(δ) = (x₀ − sin x₀) / ((2 + δ) x₀^{1+δ}) (2.1.12)

and x₀ is the unique (in the interval (0, 2π)) root of the equation

(δ/(2(2 + δ))) x² + (1/(2 + δ)) x sin x + cos x = 1.
2.1. Auxiliary results 77
PROOF. Without loss of generality we can take x > 0. Let us fix an arbitrary δ ∈ (0, 1] and consider the function

φ_z(x) = 1 − x²/2 + z x^{2+δ} − cos x,

(we write min instead of inf because this infimum is, obviously, attained). Let us prove that z_m = c(δ), where c(δ) is defined by (2.1.12).
Observe that all roots of the function φ_{z_m}(x) (we consider φ_{z_m}(x) only for x > 0), if they exist, are isolated. Besides, if x_r is a root of φ_{z_m}(x), then both conditions

φ_{z_m}(x_r) = 0 (2.1.13)

and

dφ_{z_m}(x)/dx |_{x=x_r} = 0 (2.1.14)

lim_{x→∞} φ_{z_m}(x) = ∞

and

φ_{z_m}(x) > 0, x > 0,

that for sufficiently small ε,

which contradicts the definition of z_m. Denote the minimal positive root of φ_{z_m}(x) by x_m. Then relations (2.1.13) and (2.1.14) mean that

1 − x_m²/2 + z_m x_m^{2+δ} − cos x_m = 0

and

−x_m + (2 + δ) z_m x_m^{1+δ} + sin x_m = 0.
z_m = (x_m − sin x_m) / ((2 + δ) x_m^{1+δ}),

and substituting this expression into the first one, we obtain the following equation for x_m:

1 − (δ/(2(2 + δ))) x_m² − (1/(2 + δ)) x_m sin x_m − cos x_m = 0.

Let us show that the equation

1 − (δ/(2(2 + δ))) x² − (1/(2 + δ)) x sin x − cos x = 0 (2.1.15)

has a unique solution in the interval (0, 2π) for any 0 < δ ≤ 1.
Consider the left-hand side of equation (2.1.15):

ψ(x) = 1 − (δ/(2(2 + δ))) x² − (1/(2 + δ)) x sin x − cos x.

We have

ψ(0) = 0, ψ(2π) = −2π²δ/(2 + δ) < 0, (2.1.16)

and

ψ′(x) = ((1 + δ)x/(2 + δ)) [sin x/x − δ/(1 + δ) − (1/(1 + δ)) cos x].

Due to Lemma 2.1.9, the expression in square brackets in the last relation has exactly one change of its sign (at some point x₀) in the interval (0, 2π), and this expression is positive for 0 < x < x₀ and negative for x₀ < x < 2π. Therefore the derivative ψ′(x), which coincides with the expression in square brackets up to a positive factor, satisfies the same conditions. Hence ψ(x) increases on the interval (0, x₀) and decreases on (x₀, 2π). Taking (2.1.16) into account, we come to the conclusion that ψ(x) has a root in (0, 2π), and the root is unique.
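Assuming (2.1.15) has the form 1 − δx²/(2(2+δ)) − x sin x/(2+δ) − cos x = 0 and c(δ) = (x₀ − sin x₀)/((2+δ)x₀^{1+δ}) (assumptions of this sketch, reconstructed from the derivation), the root and the constant can be computed numerically, and the resulting bound cos x ≤ 1 − x²/2 + c(δ)|x|^{2+δ} survives a grid check:

```python
import numpy as np

delta = 0.5

def psi(x):
    # Left-hand side of (2.1.15) (assumed form).
    return (1 - delta * x**2 / (2 * (2 + delta))
            - x * np.sin(x) / (2 + delta) - np.cos(x))

# psi is positive at pi and negative at 2*pi with a single sign change
# between them, so bisection locates the unique root x0.
lo, hi = np.pi, 2 * np.pi
for _ in range(100):
    m = 0.5 * (lo + hi)
    if psi(m) > 0:
        lo = m
    else:
        hi = m
x0 = 0.5 * (lo + hi)

# c(delta) from (2.1.12); the claimed bound should then hold everywhere.
c = (x0 - np.sin(x0)) / ((2 + delta) * x0 ** (1 + delta))
x = np.linspace(-30, 30, 600_001)
ok = np.all(np.cos(x) <= 1 - x**2 / 2 + c * np.abs(x) ** (2 + delta) + 1e-9)
print(x0 > np.pi, ok)  # True True
```

The small slack absorbs rounding at the tangency point x₀, where the bound holds with equality.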
0 . sin0.. .
cos* < cos 0 sin 0 + (|x| ) . (2.1.17)
2 20
PROOF. First let us prove that for any θ ∈ (0, π] and all x ∈ [0, 2π], (2.1.18)
0 . sin0,
cosx < cos 0 sin 0 + ( ) . (2.1.18)
2 20
i.e., (2.1.18) holds for all x ≥ 0. Taking into account that cos x is an even function, we finally arrive at (2.1.17).
The next two lemmas concern functions of bounded variation. The defini-
tion of these functions was given in Section 1.4.
LEMMA 2.1.12. Let p(x) and q(x) be two probability density functions, and let r(x) be their convolution:

r(x) = ∫_{−∞}^∞ p(x − u) q(u) du.
PROOF. Let x₀ < x₁ < ... < x_n be arbitrary points of the real line. We have

Σ_{i=1}^n |r(x_i) − r(x_{i−1})| = Σ_{i=1}^n |∫_{−∞}^∞ p(x_i − u)q(u) du − ∫_{−∞}^∞ p(x_{i−1} − u)q(u) du|
= Σ_{i=1}^n |∫_{−∞}^∞ [p(x_i − u) − p(x_{i−1} − u)] q(u) du|
≤ ∫_{−∞}^∞ Σ_{i=1}^n |p(x_i − u) − p(x_{i−1} − u)| q(u) du
The proof of the lemma is similar to that of Lemma 2.1.12: taking derivatives of both sides of the equality

r(x) = ∫_{−∞}^∞ p(x − u) q(u) du,

we obtain

r^{(n)}(x) = ∫_{−∞}^∞ p^{(n)}(x − u) q(u) du,

and now we can repeat the proof of Lemma 2.1.12, replacing p(x) by p^{(n)}(x).
LEMMA 2.1.14. Let z and t be complex numbers such that |z| ≤ 1, |t| ≤ 1. Then
PROOF. We have
PROOF. Denote

v_n(x) = e^{ix} − 1 − ix/1! − ⋯ − (ix)^{n−1}/(n − 1)!.

Then

v_n(x) = i ∫_0^x v_{n−1}(u) du,

and the result is derived by induction, since

|v₁(x)| = |i ∫_0^x e^{iu} du| ≤ |x|.
PROOF. Without loss of generality, we can assume that 0 ≤ t ≤ π/(2c). Let us fix an arbitrary t from the interval [0, π/(2c)]. Denote the additive type of the distribution of the random variable X by 𝒜_X. Let us take an arbitrary distribution function F(x) from 𝒜_X and denote its characteristic function by f(t). Set

L_k = [−π/(2t) + 2πk/t, π/(2t) + 2πk/t], k = 0, ±1, ±2, ..., (2.2.1)

B = {x: F(x) ≠ 0} ∩ {x: F(x) ≠ 1}.

Introduce the distance between sets on the real line and the diameter of a set as usual:

ρ(E₁, E₂) = inf_{x∈E₁, y∈E₂} |x − y|, d(E) = sup_{x,y∈E} |x − y|,

where E, E₁, and E₂ are any subsets of the real line.
One of the following two inequalities holds:
or
because cos(tx) ≥ 0 when x ∈ ∪_{k=−∞}^∞ L_k and cos(tx) ≤ 0 when x ∉ ∪_{k=−∞}^∞ L_k. Further,

ρ(L_i, L_j) ≥ π/t ≥ 2c ≥ d(B),

so there exists an integer k₀ such that

∫_{L_{k₀}} dF(x) > 0,

i.e.,

∫_{L_k} dF(x) = 0, k ≠ k₀,

|ℜf(t)| ≤ ∫_{L_{k₀}} cos(tx) dF(x). (2.2.5)
Denote the distribution function of the random variable Y by G(x) and set

φ(x) = cos(tx), x ∈ L_{k₀}; φ(x) = 0, x ∉ L_{k₀}.

Let X₀ and Y₀ be random variables with distribution functions F(x) and H(x), respectively. Since Y₀ = Y + 2πk₀/t and X₀ is some translation of X, we have
for any a > 0. Therefore Lemma 2.1.2 can be applied. Using Lemma 2.1.2 and (2.2.5), we obtain inequality (2.2.6). In the case where (2.2.3) holds, the same inequality can be proved similarly. Thus, (2.2.6) is true for any F ∈ 𝒜_X and any t ∈ [−π/(2c), π/(2c)]. But the class of distributions 𝒜_X is closed with respect to translation; hence, by virtue of Lemma 2.1.1, the same bound holds for |f(t)| for |t| ≤ π/(2c).
THEOREM 2.2.2. Let X be a random variable with density function p(x) and characteristic function f(t). If |X| ≤ c with probability 1 and p(x) ≤ a for all x (c and a are some positive constants), then

and

|f(t)| ≤ (4ac/π) sin(π/(4ac)) for |t| ≥ π/(2c). (2.2.8)
|ℜf(t)| ≤ (4ac/π) sin(π/(4ac)) for |t| ≥ π/(2c),

or
Suppose, for example, that (2.2.9) holds. Denote by q(x) the density function corresponding to the characteristic function ℜf(t). Let [α − c, α + c] be some interval such that q(x) = 0 if x ∉ [α − c, α + c]. Using Lemmas 2.1.4 and 2.1.5, we obtain

∫_{−∞}^∞ cos(tx) q(x) dx ≤ ℱ_{α−c}^{α+c}(a cos(tx), 1/a) ≤ (4ac/π) sin(π/(4ac))

for |t| ≥ π/(2c).
The corollary follows from the theorem and inequality (2.1.6). One just should take into account that 2ac ≥ 1.
PROOF. Fix an arbitrary α from the interval [0, π/4]. Denote the additive type of the distribution of the random variable X by 𝒜. Since 𝒜 is closed with respect to translation, it suffices to prove (Lemma 2.1.1) that

|ℜf(t)| ≤ 1 − h₂(α) σ²t²/2 (2.2.11)

for |t| ≤ α/c for any characteristic function f(t) whose distribution belongs to 𝒜.
Let us fix an arbitrary f(t) whose distribution belongs to 𝒜 and fix an arbitrary t from the interval [0, α/c]. We will use the same notation as that introduced in the proof of Theorem 2.2.1. In particular, L_k (k = 0, ±1, ±2, ...) is determined by (2.2.1). As in the proofs of Theorems 2.2.1 and 2.2.2, we should separately consider the cases ℜf(t) ≥ 0 and ℜf(t) < 0, and, because the reasoning is similar, we will consider only the first case.
As in the proof of Theorem 2.2.1, we come to the conclusion that there exists an integer k₀ such that
Then σ_G² = σ_F² = σ², where σ_G² and σ_F² are the variances of the distributions G and F respectively. Substituting y = x − 2πk₀/t in (2.2.12), we obtain

|ℜf(t)| ≤ ∫_{L₀} cos(ty) dG(y).
Since the support of the distribution of the random variable X is included in the interval [−c, c], there exists some real d such that the support of F(x) is included in [d − c, d + c]. Let us introduce the following notation:
H+ = _ =
2? 7
2t
L+ = 0, L_ =
21 "0
2nko , 2nko
B0 = d - c,d +c
(1) 0 c L + u L _ = L 0
(3) B 0 n H _ &, Bo c H - u L _
|ℜf(t)| ≤ ∫_{L₀} cos(tx) dG(x) ≤ ∫_{−π/(2t)}^{π/(2t)} (1 − h₂(α) t²x²/2) dG(x)
= 1 − h₂(α) (t²/2) ∫_{−∞}^∞ x² dG(x) ≤ 1 − h₂(α) σ²t²/2.
2. In this case

d₀ = d − c − 2πk₀/t > 0.

Let us set H(x) = G(x + d₀); then σ_H² = σ_G² = σ², and the support of the distribution H(x) is included in the interval [0, 2c]. Let us prove that
Indeed, we have
Taking into account that 0 ≤ d₀ ≤ π/(2t) and that cos x decreases in the interval [0, π], we see that the difference is non-negative. Thus, we finally obtain
∫_{−π/(2t)}^{π/(2t)} cos(tx) dG(x) ≤ ∫_{−π/(2t)}^{π/(2t)} cos(tx) dH(x)
= ∫_0^{2c} cos(tx) dH(x) ≤ 1 − h₂(α) (t²/2) ∫_{−π/(2t)}^{π/(2t)} x² dH(x)
≤ 1 − h₂(α) σ²t²/2.
3. This case can be considered similarly to case 2.
The proof of (2.2.11) is the same if the inequality ℜf(t) < 0 holds (instead of ℜf(t) ≥ 0).

|f(t)| ≤ 1 − h₂(π/4) σ²t²/2

for |t| ≤ π/(4c).
exp{−h₁(α) σ²t²/2} ≤ |f(t)| ≤ exp{−h₂(α) σ²t²/2}

The upper bound in this theorem follows from Theorem 2.2.3, whereas the lower bound is a particular case of Theorem 2.3.4 below.
exp{−(1 + α²/4) σ²t²/2} ≤ |f(t)| ≤ exp{−(1 − α²/12) σ²t²/2}. (2.2.13)

To obtain this corollary it suffices to use inequalities (2.1.4) and (2.1.5) from Section 2.1.
Taking various values of α in (2.2.13), we obtain estimates for |f(t)| which are more precise in narrower intervals. For example, setting α = 1/4, we obtain

exp{−1.016 σ²t²/2} ≤ |f(t)| ≤ exp{−0.994 σ²t²/2}
THEOREM 2.3.1. Let f(t) be an even characteristic function. If for some non-negative integer n the derivative f^{(2n)}(0) exists, then
<2* (2 3 1)
2n+l fWfn-v
1 (2
w * 3'2)
=0
i.e.,
, <A2 , 2 Pit4
2?
PROOF. First let us prove that if the expectation of F(x) is equal to zero, then (2.3.6) is true.
Since ℜf(t) is the characteristic function of a symmetric distribution, by virtue of Corollary 2.3.1, we obtain

ℜf(t) ≥ 1 − σ̃²t²/2,

where σ̃² is the variance of that distribution. Further,

f(t) = 1 − σ²t²/2 + o(t²), t → 0,
f(−t) = 1 − σ²t²/2 + o(t²), t → 0;

therefore

ℜf(t) = 1 − σ²t²/2 + o(t²), t → 0,

which, in view of the second part of Theorem 1.5.3, implies that the variance of the distribution corresponding to ℜf(t) is σ².
Now suppose that F(x) has an arbitrary expectation. Denote it by a. Then the expectation of the distribution function F(x + a) is equal to zero; hence

ℜ(e^{−ita} f(t)) ≥ 1 − σ²t²/2

and

|f(t)| = |e^{−ita} f(t)| ≥ ℜ(e^{−ita} f(t)) ≥ 1 − σ²t²/2.
THEOREM 2.3.3. Let F(x) be a distribution function with zero expectation, variance σ² and characteristic function f(t). Then

|1 − f(t)| ≤ σ²t²/2
where, as usual, β₁ is the first absolute moment. Indeed, again using Lemma 2.1.15, we obtain

∫_{−∞}^∞ |1 − e^{itx}| dF(x) ≤ ∫_{−∞}^∞ |tx| dF(x) = β₁|t|.
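Taking the two bounds as |1 − f(t)| ≤ σ²t²/2 and |1 − f(t)| ≤ β₁|t|, both are easy to confirm for a concrete case (an illustration, not from the book): for the standard normal, f(t) = e^{−t²/2}, σ² = 1 and β₁ = √(2/π).

```python
import numpy as np

t = np.linspace(-50, 50, 1_000_001)
f = np.exp(-t**2 / 2)          # ch.f. of the standard normal
beta1 = np.sqrt(2 / np.pi)     # E|X| for the standard normal

print(np.all(np.abs(1 - f) <= t**2 / 2 + 1e-15))           # second-moment bound
print(np.all(np.abs(1 - f) <= beta1 * np.abs(t) + 1e-15))  # first-moment bound
```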
From Theorem 2.3.2 and inequality (2.1.3) we immediately obtain the following estimate.
for < /.
If the variance is infinite but the first-order moment is finite, then the
following simple lower estimate can be used.
PROOF. It clearly suffices to confine our considerations to the case t > 0. Consider the function

v(t) = ℜf(t) − (1 − β₁t).

We have v(0) = 0 and

v′(t) = β₁ − ∫_{−∞}^∞ x sin(tx) dF(x) ≥ β₁ − ∫_{−∞}^∞ |x| dF(x) = 0,

i.e., v(t) is non-decreasing for t ≥ 0. This implies that v(t) ≥ 0.
In connection with Theorems 2.3.2 and 2.3.4, the question arises whether inequalities of the form (2.3.5) and (2.3.7) hold true for moments of any order between 0 and 2. More exactly, is the following assertion true: for any r ∈ (0, 2] there exists some constant c = c(r) depending only on r such that for any characteristic function f(t) of a distribution with a finite absolute moment β_r of order r, the inequality |f(t)| ≥ 1 − c(r)β_r|t|^r is true? The question is still open.
THEOREM 2.3.6. Let F(x) be a distribution function with zero expectation, variance σ² and characteristic function f(t). Denote the absolute moment of F(x) of order a > 0 by β_a:

β_a = ∫_{−∞}^∞ |x|^a dF(x).

If

β_{2+δ} < ∞
The values of the constant c(δ) for some δ are given in Table 2.1.
Table 2.1.
|f(t)| ≤ 1 − σ²t²/2 + c(δ₀)(β_{2+δ₀} + σ²β_{δ₀})|t|^{2+δ₀}
Jifl
= _ . +c(s0m+So + 2**.
=
c(o)(2+So + 2)
|f(t)|² = E cos(t(X − Y))
sin sin ,. ^
< - cos + - )I - )2
sin sin r 99 ,, 9,
< - cos + [ 2 e r r - 2\\ + ],
2 2 (2.3.12)
(the term under the square root is non-negative because, due to the Lyapunov inequality, β₁² ≤ σ²); therefore, we may set

ε = √(2 − 2β₁|t| + 2σ²t²)

in (2.3.12) (this value minimizes the right-hand side of (2.3.12)). After that we obtain
|/|2 < - cos \Jit2 2|| + 2 2 2 ,
PROOF. It obviously suffices to confine our considerations to the case t > 0. Let t be a fixed arbitrary positive number. We first prove that

(2.4.2)
and

a_n(t) = ∫_{c_n(t)}^{c_{n+1}(t)} cos(tx) dF(x).

|ℜf(t)| ≤ |a_k(t)|. (2.4.3)
We shall assume that ℜf(t) ≥ 0 (the case ℜf(t) < 0 is treated in a similar fashion). There exists an integer k = k(t) such that
71 Ttk 71 7lk
+ < U < + .
It t It t
Since the cases a_k(t) ≥ 0 and a_k(t) < 0 are handled in one and the same way, we shall assume that a_k(t) ≥ 0. For any integer n, a_n(t) and a_{n+1}(t) have opposite signs and, in addition, by virtue of the unimodality of the distribution F(x), |a_n(t)| ≥ |a_{n+1}(t)| for n ≥ k + 1 and |a_n(t)| ≤ |a_{n+1}(t)| for n ≤ k − 1. Therefore,
Σ_{n=k+1}^∞ a_n(t) ≤ 0

and

Σ_{n=−∞}^{k−1} a_n(t) ≤ 0.

But

ℜf(t) = Σ_{n=−∞}^∞ a_n(t) ≥ 0,
Choose b so that ℑ(e^{ibt} f(t)) = 0 (for a given fixed t). This is always possible because the equation a sin x + b cos x = 0 has roots for any real a and b. Then
Such a b can be chosen for any value of t; hence inequality (2.4.1) holds for any t.
COROLLARY 2.4.1. Let X be a random variable with symmetric unimodal dis-
tribution, and let f(t) be its characteristic function. Then
Then
(2.4.4)
We set
for any l > 0; therefore, taking (2.4.6) and (2.4.7) into account and applying Lemma 2.1.2, we obtain

∫_{−π/(2t)}^{π/(2t)} cos(tx) dG(x) = ∫_{−∞}^∞ cos(tx) dG(x) = ℜg(t).

The passage from the resultant inequality to inequality (2.4.4) is accomplished in exactly the same way as in the proof of Theorem 2.4.1.
COROLLARY 2.4.2. Let F(x) be a unimodal distribution function with charac-
teristic function f(t). Then, for any positive b,
|/(f)| <1 ^ - t 2
b2
PROOF. We will use Theorem 2.4.2. As Y, consider the random variable taking the three values −π/(2b), 0, π/(2b) with probabilities
7tt
|/(i)| <Q [ F ; - ] + COS
2b'
' t2'
\m\<Q(F;f) 1 -QlF-l
_ _ 1 Q(F; nib)
b2 f '
for all t.
PROOF. Let us prove the first inequality. Let X be a random variable with the distribution function F(x) and let Y be a random variable with the uniform distribution on the interval [−1/(2a), 1/(2a)]. It is easy to see that the random variables X and Y satisfy the conditions of Theorem 2.4.2; hence |f(t)| ≤ ℜg(t) for |t| ≤ πa, where g(t) is the characteristic function of Y. Taking into account that

g(t) = sin(t/(2a)) / (t/(2a)),

we arrive at (2.4.8).
Let us prove the second inequality. Without loss of generality we can assume that t > 0. Let c_n(t) and a_n(t), n = 0, ±1, ±2, ..., denote the same as in the proof of Theorem 2.4.1. Then for any t there exists an integer m such that (2.4.3) holds (see the proof of Theorem 2.4.1). Therefore,

|ℜf(t)| ≤ |a_m(t)| = |∫_{c_m(t)}^{c_{m+1}(t)} cos(tx) p(x) dx|
≤ a ∫_{c_m(t)}^{c_{m+1}(t)} |cos(tx)| dx = a ∫_{−π/(2t)}^{π/(2t)} cos(tx) dx = 2a/t.
Combining inequality (2.4.8) with inequality (2.1.6), we arrive at the following assertion.

COROLLARY 2.4.3. Let the conditions of Theorem 2.4.3 be satisfied. Then for any 0 < c < π/2,

|f(t)| ≤ 1 − h₂(c) t²/(24a²).
(2.5.1)
(2.5.2)
PROOF. We set

S_c = P(|X| ≥ c),

q(x) = p(x)/(1 − S_c) for |x| ≤ c, and q(x) = 0 for |x| > c.
We have
Eg(\X\)
g(c) =
e
(2.5.3)
g(c)
Further, q(x) is a probability density function whose support belongs to the interval [−c, c] and such that
for |t| ≤ π/(2c). Taking (2.5.3) into account, we finally obtain
" 3 2 2
(1 )3
< 1
3 2 2
(1 - )3 ,, ( \
and
The set of distributions whose densities are bounded by the same constant
a is closed with respect to translation. This implies that a in Corollary 2.5.1
can be replaced by miny \ \.
One partial case of Corollary 2.5.1 (a = 2, = 1/4) has to be pointed out.
then
91
l/WI * 1
64 22
for |t| ≤ π/4 and

(2.5.4)

for all t.
m m
f{ 64 ' 1024 1J >~ 64 (16
9t2
2 2
9
2 2 2
9
2 2
+ 2 )'
<2 5 6,
{ - 2 ^ +2*4
for all t.
If

sup_x p(x) ≤ a,

then
*1 - 3 ^ 2D: * 2 (2 5 7)
3
m
| (2.5.8)
|/| *
12a 2 c 2
for \t\ > n/2c.
PROOF. We set

p_j(x) = p(x)/P_j for x ∈ A_j, and p_j(x) = 0 for x ∉ A_j, j = 1, 2, ...,
141 (2.5.9)
3 2 2
' r 1 2c'
\t\ > 2zc? (2.5.10)
12a2c2'
lzcrc"
are true. Further,

|f(t)| = |Σ_{j=1}^∞ ∫_{A_j} e^{itx} p(x) dx| = |Σ_{j=1}^∞ P_j f_j(t)| ≤ Σ_{j=1}^∞ P_j |f_j(t)|.

Using (2.5.9) and (2.5.10), and taking into account that Σ_{j=1}^∞ P_j = 1, we finally arrive at (2.5.7) and (2.5.8).
COROLLARY 2.5.4. Let the conditions of Theorem 2.5.2 be satisfied. Then, for any l > 0, the inequalities
[QxiD?
P(X e ) = Qxil)
12aHl/2)2 3a2l2
THEOREM 2.5.3. Let p(x) be a probability density of bounded variation, with characteristic function f(t). Then

|f(t)| ≤ (V(p)/t) sin(t/V(p)) (2.5.11)

|f(t)| ≤ V(p)/|t|. (2.5.12)
PROOF. Let us prove the first inequality. Since the set of densities with a given total variation is closed with respect to translation, it suffices, in view of Lemma 2.1.1, to prove that

for |t₀| ≤ πV(p)/2. Let us fix an arbitrary t₀ such that |t₀| ≤ πV(p)/2. Without loss of generality assume that t₀ > 0. The cases ℜf(t₀) ≥ 0 and ℜf(t₀) < 0 should be considered separately. We consider only the first one: it will be seen that the second case can be treated in a similar way.
So, suppose that ℜf(t₀) ≥ 0. Denote
B_n = [πn/t₀, π(n + 1)/t₀],
M_n = sup_{x∈B_n} p(x),
m_n = inf_{x∈B_n} p(x),
I_n = ∫_{B_n} p(x) dx, n = 0, ±1, ±2, ...

We have

∫_{−∞}^∞ cos(t₀x) p(x) dx = Σ_{n=−∞}^∞ ∫_{B_n} cos(t₀x) p(x) dx. (2.5.14)
where

r_n(x) = M_n − m_n for x ∈ [πn/t₀, πn/t₀ + z_n], and r_n(x) = 0 otherwise,

for even n, and
1. In this case,

r_n(x) = 0 ≤ M_n − m_n

for πn/t₀ + z_n ≤ x ≤ π(n + 1)/t₀. Therefore, by virtue of Lemma 2.1.3 (cos(t₀x) decreases on the interval B_n),

∫_{B_n} cos(t₀x)[p(x) − m_n] dx ≤ ∫_{πn/t₀}^{πn/t₀ + z_n} cos(t₀x)[M_n − m_n] dx = ∫_{B_n} cos(t₀x) r_n(x) dx.
If n is odd, then
( ) _ /^ + +
D^o). /2 < x < 0,
n
10 otherwise
It is easy to see that

max_x q_n(x) ≤ (M_n − m_n)/2. (2.5.18)

Indeed, p_n(x) and p̃_n(x) have non-intersecting supports; therefore
In addition, obviously,

∫_{−∞}^∞ q_n(x) dx ≤ I_n. (2.5.19)

From (2.5.14), (2.5.15), and (2.5.17) we obtain
q(x) = Σ_{n=−∞}^∞ q_n(x). (2.5.21)

Then, by (2.5.19),

∫_{−∞}^∞ q(x) dx ≤ 1,
and, by (2.5.18),

sup_x q(x) ≤ (1/2) Σ_{n=−∞}^∞ (M_n − m_n) ≤ (1/2) V(p).

In addition, since each q_n(x) vanishes outside the interval [−π/(2t₀), π/(2t₀)], the support of q(x) belongs to this interval as well. Applying Theorem 2.2.2, we obtain

∫_{−∞}^∞ cos(tx) q(x) dx ≤ (V(p)/t) sin(t/V(p))

for all |t| ≤ πV(p)/2; in particular,

∫_{−∞}^∞ cos(t₀x) q(x) dx ≤ (V(p)/t₀) sin(t₀/V(p)). (2.5.22)

From (2.5.20), (2.5.21), and (2.5.22) we finally obtain (2.5.13).
Now let us prove inequality (2.5.12). First we prove it in the case where p(x) is differentiable; then

∫_{−∞}^∞ e^{itx} p(x) dx = (1/(it)) ∫_{−∞}^∞ p(x) de^{itx} = −(1/(it)) ∫_{−∞}^∞ e^{itx} dp(x) = −(1/(it)) ∫_{−∞}^∞ e^{itx} p′(x) dx,

which yields

|f(t)| ≤ (1/|t|) ∫_{−∞}^∞ |p′(x)| dx.
is the normal density function with zero mean and variance ε². The function p_ε(x) is differentiable because n_ε(x) is differentiable; hence

|f(t) e^{−ε²t²/2}| ≤ V(p_ε)/|t| ≤ V(p)/|t|.
Estimates (2.5.11) and (2.5.12) are sharp: for an arbitrary V > 0 and any fixed t₀ such that |t₀| ≤ πV/2, there exists a probability density p(x) such that V(p) = V and

|f(t₀)| = (V/t₀) sin(t₀/V),

where f(t) is the characteristic function corresponding to p(x), and a similar fact holds true for inequality (2.5.12).
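Taking the bounds of Theorem 2.5.3 in the forms |f(t)| ≤ (V(p)/t) sin(t/V(p)) for |t| ≤ πV(p)/2 and |f(t)| ≤ V(p)/|t| (assumptions of this illustration), a concrete check is easy: the Laplace density p(x) = e^{−|x|}/2 has total variation V(p) = 1 (it rises from 0 to 1/2 and falls back) and characteristic function f(t) = 1/(1 + t²).

```python
import numpy as np

V = 1.0  # total variation of the Laplace density exp(-|x|)/2

# First bound on 0 < t <= pi*V/2.
t = np.linspace(1e-3, np.pi * V / 2, 100_000)
print(np.all(1 / (1 + t**2) <= (V / t) * np.sin(t / V)))  # True

# Second bound for all t > 0.
t2 = np.linspace(1e-3, 100.0, 100_000)
print(np.all(1 / (1 + t2**2) <= V / t2))                  # True
```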
COROLLARY 2.5.5. Let the conditions of Theorem 2.5.3 be satisfied. Then for any 0 < c ≤ π/2,

|f(t)| ≤ 1 − h₂(c) t²/(6V²(p))

for |t| ≤ c V(p), where the function h₂(x) is defined by (2.1.1).
PROOF. The proof is similar to that of inequality (2.5.12). First, suppose that
p^{(n−1)}(x) is differentiable (i.e., p(x) is n times differentiable). The procedure
used in the proof of inequality (2.5.12) can be repeated as many times as there are
derivatives of p(x). More exactly, if p(x) is n times differentiable, and its
first n − 1 derivatives satisfy the condition
then
f(t) = −(1/(it)) ∫_{−∞}^{∞} e^{itx} p′(x) dx = −(1/(it)²) ∫_{−∞}^{∞} p′(x) de^{itx}
= (1/(it)²) ∫_{−∞}^{∞} e^{itx} dp′(x) = … = (−1/(it))ⁿ ∫_{−∞}^{∞} e^{itx} p^{(n)}(x) dx,

which yields

|f(t)| ≤ (1/|t|ⁿ) ∫_{−∞}^{∞} |p^{(n)}(x)| dx.
The passage to the case where p^{(n−1)}(x) is not differentiable can be per-
formed in exactly the same way as in the proof of inequality (2.5.12) with the
use of Lemma 2.1.13.
Then

G(x) = ∫_{−∞}^{∞} H(x − y) dF(y),    (2.6.1)

where

H(x) = x + 1/2 for −1/2 ≤ x < 1/2,  H(x) = 0 for x < −1/2,  H(x) = 1 for x ≥ 1/2.
Moreover, denote the density of G(x) by q(x). Then q(x) is a piecewise constant
function taking the value p_k on each interval (−1/2 + k, 1/2 + k), k = 0, 1, 2, …
(at the boundary points x = 1/2 + k, q(x) can be defined arbitrarily; for definite-
ness and convenience, we define it to be continuous on the right). This implies
that the maximum value of the density q(x) coincides with the maximum of the
numbers p_k, and the total variation of the density q(x) is given by the relation
for any z, where Q(G; z) and Q_X(z) are the concentration functions of G and
X respectively (see, e.g., (Hengartner & Theodorescu, 1973)). From (2.6.1),
(2.6.2), and Theorem 2.2.2, we immediately obtain the following discrete analog
of Theorem 2.2.2.
then
lJKJl sin(f/2)
for |t| ≤ π/(2m + 1).
2.6. Discrete distributions 113
Now suppose that F(x) is a discrete unimodal distribution function (for the
definition of discrete unimodality and some basic properties used below, see
Section 1.6). Then G(x) defined by (2.6.1) is a unimodal distribution function;
therefore Theorems 2.4.1 and 2.4.3 hold for its characteristic function g(t).
Taking (2.6.2) and (2.6.4) into account, we obtain the two theorems below.
for < .
for || < .
then
|J 1 P sin(i/2)
for || < .
Denote the right-hand side of (2.6.3) by V_d(F). Thus, V_d(F) is the discrete
analog of the total variation. Using (2.6.2), (2.6.3), and Theorem 2.5.3, we
arrive at the following assertion.
WdiF)SinitlWdiF))
2 sin(i/2)
|9t/(t)| < % ( t )
PROOF. Let X be a random vector having the characteristic function f(t). For
any c > 0, we set
that
|ℜf(t)| = |∫ cos((t, x)) dF(x)|
Thus, by virtue of Lemma 2.7.1, without loss of generality, we can assume that
X is bounded with probability one: P(||X|| < a) = 1 for some a > 0.
Let us demonstrate that there exists σ² > 0 such that

inf_{||e||=1} Var((e, X)) ≥ σ²,    (2.7.1)

where the infimum is taken over all unit vectors in R^m.
We assume the contrary:

inf_{||e||=1} Var((e, X)) = 0.

Then there exists a sequence of unit vectors e₁, e₂, …, such that

Var((e_n, X)) → 0 as n → ∞.    (2.7.2)

Since the unit sphere in R^m is a compact set, there exists a limit point of
this sequence, say, a vector e₀. Without loss of generality we assume that the
sequence e₁, e₂, … converges to e₀:

||e_n − e₀|| → 0 as n → ∞.
i.e.,

Var((e₀, X)) = 0.
and

∫_{||x||>2/u} dF(x) ≤ (1/(mu)) Σ_{j=1}^{m} ∫_{−u}^{u} [1 − ℜf(δ_{1j}t, …, δ_{mj}t)] dt.
PROOF. We will prove only the first inequality. The second one can be proved
similarly. Let X = (X₁, …, X_m) be a random vector with distribution function
F(x). We have

Taking into account that for each j = 1, …, m, the characteristic function of
X_j is f(δ_{1j}t₁, …, δ_{mj}t_m), and applying Theorem 1.4.8 to each summand of the
right-hand side, we arrive at (2.7.7).
PROOF. Let E₁, …, E_{2^m} denote the 2^m orthants of R^m labeled by e^{(1)}, …, e^{(2^m)},
respectively. This means that if x = (x₁, …, x_m) ∈ E_j,
j = 1, …, 2^{m−1}, then e_k^{(j)} x_k = |x_k|, k = 1, …, m, and if x = (x₁, …, x_m) ∈ E_{j+2^{m−1}},
j = 1, …, 2^{m−1}, then e_k^{(j)} x_k = −|x_k|, k = 1, …, m.
Denote C_j = E_j ∪ E_{j+2^{m−1}},
Au = j x = (xi, ...,* m ):
and
s(x) = Σ_{k=1}^{m} |x_k|.
Then
dt
=1 u Jo L
+ $xm)) dt dF(x)
- V f [l - s i n ( "(l)xl + + mxrn))
dF(x)
2m 1
^ J ^ _ sin(ttS(x))j
dF(x)
" ^ JcjrAyu 1 ~ uS(x) J
4 / = dF(x),
r ( f + 1)
c(a) =
rrm/2 1 - - J 2 -
PROOF. First let us prove that
m
|y/r)(t)| < (2.7.9)
V-Iltll
() () ((/ + 2)12) m
sup vy '(x) = v ' = = <
r^/nT{{m + l)/2) 2 ry/'
Let ψ^{(r)}(t) be the characteristic function of v^{(r)}(x). Then, using Theorem 2.4.3
(v^{(r)}(x) is unimodal), we obtain
m
^r\t\
= 1 dF(x)
~ l h i \[ s(t,x)dt
l(u) [y||t||<u
= 1 - f m \f cos (t, x) dF(x)
Ju L-Zr"1
= 1 - [ m / " ' ( x W x ) = f m [1 - i / u ) ( x ) W ( x )
JR JR
[ [1 - yriu)<(x)]<F(x)> 1 -
m
y/.
/ dF(x),
then
(2.7.10)
and
w e i ^ - s ^ W>5 71"
where

α = π^{(m−1)/2} a c^{m−1} / Γ((m + 1)/2),
p₁(x) = ∫_{R^{m−1}} p(x, x₂, …, x_m) dx₂ ⋯ dx_m
= ∫_{x₂²+⋯+x_m²<c²} p(x, x₂, …, x_m) dx₂ ⋯ dx_m
≤ a ∫_{x₂²+⋯+x_m²<c²} dx₂ ⋯ dx_m = a π^{(m−1)/2} c^{m−1} / Γ((m + 1)/2) = α.
Applying Theorem 2.2.2 (we obviously have |X₁| ≤ c), we arrive at (2.7.10) and
(2.7.11).
lltll2
and
s i-isp * >
THEOREM 2.7.6. Let X be a bounded m-dimensional random vector, ||X|| ≤ c,
with characteristic function f(t) and covariance matrix Σ.
For any a ∈ [0, π/4],

|f(t)| ≤ 1 − h₂(a) tΣt′/c²

for ||t|| ≤ a/c (the functions h₁(a) and h₂(a) are defined by (2.1.1)).
PROOF. Without loss of generality we can assume that EX = 0. Let a ∈ [0, π/4].
Fix an arbitrary t such that ||t|| ≤ a/c and set t₀ = t/||t||. Consider the random
variable (t₀, X). Denote its characteristic function by φ(t). Then
We have

|(t₀, X)| ≤ ||X|| ≤ c,

so, applying Theorem 2.2.3 to the random variable (t₀, X) and taking into
account that

Var (t₀, X) = t₀Σt₀′ = tΣt′/||t||²,

we obtain
In what follows, when we speak about the covariance matrix of a distribu-
tion, we assume that the corresponding second-order moments are finite.
Theorems 2.7.7–2.7.11 below, which are multi-dimensional generalizations
of Corollary 2.3.1, Theorems 2.3.2–2.3.4, and Corollary 2.3.2, can be obtained
in exactly the same way as Theorem 2.7.6.
for all t ∈ R^m,
|f(t)| ≥ 1 − tΣt′/2

for all t ∈ R^m,
for all t ∈ R^m.
LEMMA 2.7.2. Let Σ be an m×m non-negative definite matrix and c ∈ (0, √2).
Then
f txt'l
|/(t)| exp I - ! ( a ) j
|f(t)| ≤ 1 − tΣt′/2 + c(δ)(E||X||^{2+δ} + E||X||² E||X||^δ) ||t||^{2+δ}

for all t ∈ R^m, where c(δ) is defined by (2.1.12).
for some 0 < δ < 1, then for any 0 < γ < δ and any ε > 0 there exists T > 0
such that

|f(t)| ≤ 1 − tΣt′/2 + ε||t||^{2+γ}
Now we give some estimates for the characteristic functions of absolutely
continuous multivariate distributions, making use of estimates for
characteristic functions of distributions with bounded support. The proofs are
very similar to those in the one-dimensional case.
As in Section 2.5, let g(x), x > 0, be a non-negative increasing function such
that g(x) → ∞ as x → ∞, whose inverse function is denoted by g^{−1}(x).
|/(t)|<l-^^||t||2 (2.7.13)
(2 7 14)
i ^ - w
and

α₀ = π^{(m−1)/2} c^{m−1} a / Γ((m + 1)/2).    (2.7.15)
and
E||X||^α < ∞ for some α > 0,

and denote

γ_α = min_{b∈R^m} E||X − b||^α.
Then
2
(1 - 5)352(m-l)/a ^ ( W ^ j
2
l/(t)| 1
- 3rt m + l y 2(m-l)/a a 2 ^
(i - )32/ [r
l/(t)| 12nm-lY2m/aa2
l/(t)|
- 1
-
9 [ {)[ (2 7 16)
2.7. Multivariate characteristic functions 125
9 [()] 2 1 2
(2.7.17)
|/(t)| ~ 1 2" +12 " - 1 2 ( 2 )" - 1 ( / )
PROOF. The proof, in general, repeats the arguments of the proof of
Theorem 2.5.1. Without loss of generality, we assume that EX = 0. Let us
fix an arbitrary t. For the sake of simplicity, suppose that t is of the form
t = (t, 0, …, 0). Let c be a positive constant (it will be chosen later). We set
B = {xe + . . . < c 2 } ,
0 otherwise.
Denote the characteristic function corresponding to g(x) by g(t). Then (see the
proof of Theorem 2.5.1)
912
kiwi ^ - (1 - <5C)2
^ 1 ^ - 6 4 * - - *
(1 - 6C)2
l/<t)|S1 ( 1 4 ) 3
- 1024(W"-" "
c2 " c2 2'
and we finally arrive at (2.7.16) and (2.7.17).
9|r(^)l2||t||2
|/(t)| < exp
The corollary can be obtained in exactly the same way as Corollary 2.5.3.
An estimate similar to the estimate of the corollary is given by the following
theorem.
PROOF. Without loss of generality we assume that X has zero expectation. For
t = 0 the assertion of the theorem is obvious, so we suppose that t ≠ 0. We
have

(1/2)(1 − |f(2πt)|²) = ∫_{R^m} sin²(π(t, x)) p(x) dx.
Let ε and r be positive numbers (ε < 1). Divide the space R^m into three disjoint
sets:
< \ F (-^^
< yRm
J
~
i=l >1
I = p(x)dx
JA2
and choose the parameters ε and r. For any real x, we denote its distance from
the nearest integer by [x]₀, i.e.,

[x]₀ = min_{k=0,±1,±2,…} |x − k|.
Since

|sin(π(t, x))| ≥ 2[(t, x)]₀,

the relations
are true.
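The distance-to-nearest-integer function and the bound |sin(πx)| ≥ 2[x]₀ used at this step can be checked directly on a grid; a small illustrative sketch:

```python
import numpy as np

# [x]_0 = distance from x to the nearest integer; the proof uses
# |sin(pi * x)| >= 2 [x]_0, which this grid check illustrates.
def dist_to_nearest_int(x):
    return np.abs(x - np.round(x))

xs = np.linspace(-3.0, 3.0, 6001)
assert np.all(np.abs(np.sin(np.pi * xs)) >= 2 * dist_to_nearest_int(xs) - 1e-12)
```

Equality holds at half-integers, where both sides equal 1, and at integers, where both sides vanish.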
The matrix Σ^{−1}, which is inverse to a non-degenerate covariance matrix
Σ, can be represented in the form of the product of two matrices A and A′:
Σ^{−1} = AA′. Taking into account that p(x) ≤ a and changing the variables
y = xA, we obtain

I₁ ≤ (a/|A|) ∫_{B_K} dy,    (2.7.20)
where

a₀ = 2a(2π)^{(m−1)/2} / (|A|(m − 1)!!)

for even m and

a₀ = 2a(2π)^{(m−1)/2} / (|A| π (m − 1)!!)
for odd m. It is easy to see that
L ≤ √2 a(2π)^{(m−1)/2} / (|A| ||t(A′)^{−1}|| (m − 1)!!)    (2.7.21)
for all m.
By the definition of m̄ we have

|m̄| ≤ ε + |⟨t, x⟩| + 1.

Since 0 < ε < 1 and tΣ^{−1}t′ ≤ r², we obtain

|m̄| ≤ r√(tΣt′) + 2.
Making use of (2.7.19)–(2.7.22) and taking into account that
and
tst' < HtCAO-^lllA-Vll,
we come to the conclusion that
" ^ "--V
yfiim -1)!! V Vm/J
<272
Besides,
/2|(2/2/2 ^ 1 2m
y/nm\\ ~ r2 '
LEMMA 2.7.3. Let X = (X₁, …, X_m) and Y = (Y₁, …, Y_m) be two spherically sym-
metric random vectors. If

P(||X|| ≤ r) ≥ P(||Y|| ≤ r)    (2.7.24)

where U^{(r)} is the uniform distribution on the surface of the m-dimensional
sphere of radius r with center at zero.
Suppose that (2.7.24) holds. Denote

S_r = {x ∈ R^m : ||x|| ≤ r}.

Then

U^{(r)}(S_a) = 0

if r > a, and

U^{(r)}(S_a) = 1

if r ≤ a; therefore (2.7.26) and (2.7.27) yield

P(||X|| ≤ a) = F(S_a) = ∫₀^∞ U^{(r)}(S_a) dH_F(r) = H_F(a)

and

P(||Y|| ≤ a) = H_G(a).
when 0 < x < y, i.e., the function φ(r) = U^{(r)}(E_a) decreases (for any fixed a).
Therefore from (2.7.26) and (2.7.27), taking (2.7.28) into account and making
use of Lemma 2.1.2, we obtain

P(|X₁| ≤ a) = F(E_a) = ∫₀^∞ U^{(r)}(E_a) dH_F(r)
= ∫₀^∞ φ(r) dH_F(r) ≥ ∫₀^∞ φ(r) dH_G(r)
= ∫₀^∞ U^{(r)}(E_a) dH_G(r) = G(E_a) = P(|Y₁| ≤ a),
By the way, the assertion converse to Lemma 2.7.3 also holds, i.e., (2.7.25)
implies (2.7.24). Of course, relation (2.7.25) in the formulation of the lemma
can be replaced with the relation
and
P(||X|| ≤ r) ≤ P(||Y|| ≤ r)
for some unit vector e ∈ R^m. Consider the unit vector e = (1, 0, …, 0), and
denote by X₁ and Y₁ the random variables (X, e) and (Y, e), whose characteristic
functions are f_e(t) and g_e(t) (see Section 1.8). From Lemma 2.7.3 it follows
that

P(|X₁| ≤ a) ≤ P(|Y₁| ≤ a)
Then

P(||X|| ≤ r) ≤ P(||Y|| ≤ r)

for all r > 0, where Y is a random vector having a spherically symmetric distri-
bution of the form

G = pE₀ + (1 − p)U^{(r)}
Γ(m/2) (2/(r||t||))^{m/2−1} J_{m/2−1}(r||t||).
On the other hand, the random vectors X and Y obviously satisfy the conditions
of Theorem 2.7.16. We thus come to the following assertion.
|f(t)| ≤ p + (1 − p) Γ(m/2) (2/(r||t||))^{m/2−1} |J_{m/2−1}(r||t||)|.
The estimate given by the corollary is, obviously, sharp. Another conse-
quence of Theorem 2.7.16 is the following.
then
ml2
1/(4,1 s r +1 I)
(f )(^Bi)
for ||t|| ≤ π/(2r_a), where r_a is the radius of the ball of volume 1/a:

r_a = (1/√π) (Γ(m/2 + 1)/a)^{1/m}.
(i) the function ℜK(t) is even, and the function ℑK(t) is odd;
Then for any distribution function F(x) with characteristic function f(t), the
inequalities
and
are true.
PROOF. It is easy to see that we only need to prove the first inequality: the
second one is its consequence. Indeed, the distribution function corresponding
to the characteristic function g(t) = f(−t) is G(x) = 1 − F(−x − 0); therefore, in
view of (2.8.1),
v.p. i T T e - i i x K ( t ) f M t
K[^\e-itxm + K(-^)eitxf(-t) dt
h^OJh
-)-* *(e~ltxf(t))
Z{e-itxf{t))}dt
"ff M * ) (jy^-^Fiy^dt
dF(y). (2.8.3)
Furthermore,
1 f 1
FOt + 0) dF(y). (2.8.4)
Relations (2.8.3) and (2.8.4) imply that (2.8.3) is true for every distribution
function F(x) if and only if

K(t) = aK^{(1)}(t) + (1 − a)K^{(2)}(t),
Further, let K(t) be a function satisfying conditions (i)–(iv). Denote its real
and imaginary parts by K₁(t) and K₂(t) respectively. Then condition (iv) can be
written in the form

H₁(x) = 2 ∫₀^∞ K₁(t) cos(xt) dt,

H₂(x) = 2 ∫₀^∞ K₂(t) sin(xt) dt.
Conditions (i)–(iv) can be expressed in terms of the functions H₁(x) and H₂(x) (or
their analytic continuations). For instance, condition (iv) is equivalent to the
condition
()+ 2 (*) > ( * ) -
or, since H₁(x) is an even and H₂(x) is an odd function, to the condition
Thus, the construction of a function K(x) satisfying conditions (i)–(iv) (or the ex-
amination whether a given function satisfies these conditions or not) can be
fulfilled via constructing (examining) the functions H₁(x) and H₂(x). Sometimes
the latter proves to be easier. Below we give the formulation of a theorem
realizing this idea. Its proof is contained in (Prawitz, 1972).
The functions H₁(x) and H₂(x) under consideration can always be continued
as entire analytic functions. Let ℋ be the class of those even entire analytic
functions H(z), z = x + iy, which satisfy the conditions
(c) there exist positive constants c and h such that |H(z)| ≥ c for |y| ≤ h;
* du,.
J-u 2
2ida, J-iociw -x2)H(w)
K(t) = 1/(2πt) + (1/2) ∫_{−∞}^{∞} [H(x) cos(tx) + iG(x) sin(tx)] dx
2nt 2 -oo
and
F(x − 0) ≥ 1/2 − v.p. ∫ e^{−itx} K(−t) f(t) dt

are true for every distribution function F(x) with the characteristic function f(t).
0 otherwise.
The fact that K^{(0)} satisfies (i)–(iv) can be verified directly or on the basis of
Theorem 2.8.2 (for details, see (Prawitz, 1972)).
Using Theorem 2.8.1 with K(t) = K^{(0)}(t), one obtains a useful addition to
the inversion formula in the form of Theorem 1.2.5. Before formulating it, let
us establish the following estimate of the function K^{(0)}(t). Denote the real and
imaginary parts of K^{(0)}(t) by K₁^{(0)}(t) and K₂^{(0)}(t) respectively.
|1 - 2ittK$\t)\ <
is true.
2.8. Integrals of characteristic functions 137
PROOF. Without loss of generality we may assume that 0 < t < 1. We have
|1 - 2ntK%\t)\ = |1 - - i)cot(?rf) - t\
itt sin(jtf)
= (1-0- cos(Trf)
sin(Trt) TCt
Thus, the lemma will be proved if we prove the inequalities

sin(πt)/(πt) ≥ cos(πt),  0 ≤ t ≤ 1,    (2.8.6)

sin(πt)/(πt) ≥ 1 − t,  0 ≤ t ≤ 1.    (2.8.7)
Let us prove (2.8.6). First of all, we see that, by inequality (2.1.9),
sin(Ttf) sin(Trt)
- cos(Ttf) - cos(Trt), 0 < t < 1. (2.8.8)
jtt 7tt
Consider the function

v(t) = sin(πt) − πt cos(πt),
that is,
V 2 } Jtt
The last inequality follows from inequality (2.1.8) and the inequality

sin(πt)/(πt) ≤ 1.
THEOREM 2.8.3. Let F(x) be a distribution function with characteristic function
f(t). Then

F(x) = 1/2 + (i/(2π)) v.p. ∫_{−T}^{T} (e^{−itx}/t) f(t) dt + R

for any T > 0, where the remainder R = R(f, T) satisfies the inequality

|R| ≤ (1/(2T)) ∫_{−T}^{T} |f(t)| dt.
PROOF. Denote
that is,

F(x) = I + R₀,    (2.8.9)

where

|R₀| ≤ J.    (2.8.10)
Denote
2 2
Then, by virtue of (2.8.9),

F(x) = 1/2 + I₁ + R + R₁;

therefore, to prove the theorem it suffices to demonstrate that
f_J
-T
mdt (2812)
whLH^if)-1
dt. (2.8.13)
where

C = max p(x),  c = min_{|x| ≤ b/2} p(x).
/
p(a(x )) dF(x) = - / eiyuh{ula) / e~iuxdF(x) du
oo 2 J-a U-oo
< \f(t)h{t/a)\dt (2.8.15)
2 J
I = { X : y - X - < x < Y + * - Y
Then

P(X ∈ I) ≤ (1/(2πac)) ∫_{−a}^{a} |f(t) h(t/a)| dt
for any positive a and satisfying the condition <b. Thus (2.8.14) is proved
for such a and .
Setting a = b/ in (2.8.16), we come to the conclusion that
_
Q x W ^ A r f \f(t)h(t/a)\dt
2nbc J\t\<ba
i\t\<ba
Inequality (2.8.14) is thus proved for any positive a, b, and β. The concentration
function Q_X(b) is non-decreasing for b ≥ 0; therefore (2.8.16) and (2.8.14) are
true for b = 0 as well.
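The concentration function Q_X(b) = sup_x P(x ≤ X ≤ x + b) that these estimates control can also be approximated directly from a sample. A minimal sketch (NumPy only, illustrative names), together with a check of the monotonicity property just used:

```python
import numpy as np

# Sample-based estimate of the concentration function
# Q_X(b) = sup_x P(x <= X <= x + b); monotonicity in b is checked below.
def concentration(sample, b):
    s = np.sort(sample)
    # for each left endpoint s[i], count sample points in [s[i], s[i] + b]
    counts = np.searchsorted(s, s + b, side='right') - np.arange(len(s))
    return counts.max() / len(s)

rng = np.random.default_rng(1)
sample = rng.standard_normal(2000)
qs = [concentration(sample, b) for b in (0.1, 0.5, 1.0, 2.0)]
assert all(q1 <= q2 for q1, q2 in zip(qs, qs[1:]))   # Q is non-decreasing in b
```

Restricting the supremum to left endpoints at sample points loses nothing here, since the counting function only changes at those points.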
er) s ( ) 2 (, 1 ) m (i - e.8.17)
h(t) = 1 − |t| for |t| ≤ 1, and h(t) = 0 for |t| > 1,
we obtain
p(x) = (1/(2π)) (sin(x/2)/(x/2))²  and  c ≥ (1/(2π)) (95/96)²,
and, therefore, (2.8.14) implies (2.8.17).
<2 818)
) W i ; ) / > i *
h(t) = 0 for |t| > 1;  h(t) = 2(1 − |t|)³ for 1/2 ≤ |t| ≤ 1;  h(t) = 1 − 6t² + 6|t|³ for |t| ≤ 1/2,
we obtain
p(x) = (3/(8π)) (sin(x/4)/(x/4))⁴
(Gnedenko & Kolmogorov, 1954), and
c ≥ (3/(8π)) (95/96)⁴,
1
Jo 2'
where E(x) is the distribution function degenerate at zero.
In other words, K(t) is supposed to satisfy the conditions of Theorem 2.8.1.
For these functions, the smoothing inequalities (2.8.1) and (2.8.2) are true for
every distribution function F(x) with characteristic function f(t). On the basis
of these inequalities, we obtain a series of estimates of the uniform distance
between distribution functions in terms of the differences of their characteristic
functions. We start with some general estimates.
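As a numerical sanity check before the general estimates, the classical Esseen smoothing bound — a close relative of the inequalities derived below, but with the constants of the standard smoothing lemma rather than the sharpened ones of this chapter — can be verified for two normal laws (all parameter choices are illustrative):

```python
import numpy as np
from math import erf, sqrt, pi

# Standard Esseen-type smoothing bound (not this chapter's constants):
#   sup|F - G| <= (1/pi) * int_{-T}^{T} |f(t)-g(t)|/|t| dt + 24*m/(pi*T),
# where m = sup G'.  Here F = N(0,1), f(t) = exp(-t^2/2); G = N(0, sigma^2).
sigma, T = 1.2, 5.0
Phi = lambda z: 0.5 * (1 + erf(z / sqrt(2)))

ts = np.linspace(1e-6, T, 20000)
integrand = np.abs(np.exp(-ts**2 / 2) - np.exp(-sigma**2 * ts**2 / 2)) / ts
integral = 2 * np.sum(integrand) * (ts[1] - ts[0])   # symmetric in t
m = 1 / (sigma * sqrt(2 * pi))                       # sup of N(0, sigma^2) density
rhs = integral / pi + 24 * m / (pi * T)

xs = np.linspace(-8, 8, 4001)
lhs = max(abs(Phi(x) - Phi(x / sigma)) for x in xs)
assert lhs <= rhs
```

The integrand is bounded near t = 0 because f − g vanishes quadratically there, so the principal-value issue never arises for this pair.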
Let F(x) and G(x) be arbitrary distribution functions with characteristic
functions f(t) and g(t). From Theorem 2.8.1 we obtain
(2.9.1)
(2.9.2)
2.9. Differences of characteristic functions 143
THEOREM 2.9.1. Let the function K(t) = K₁(t) + iK₂(t) satisfy conditions (i)–(iv).
Then for any distribution functions F(x) and G(x) with characteristic functions
f(t) and g(t), the inequality
\fit)+git)\dt (2.9.3)
and
- 2 f \ - ^ K l ^ ) g i t ) d t ,
THEOREM 2.9.2. Let the function K(t) = K₁(t) + iK₂(t) satisfy conditions (i)–(iv).
Then for any distribution functions F(x) and G(x) with characteristic functions
f(t) and g(t), the inequality
sup\Fix)-G(x)\< ( I ) \fit)-git)\dt
I₁(x) = v.p. ∫ …,

I₂(x) = v.p. ∫ …,

J(x) = (i/(2π)) v.p. ∫_{−∞}^{∞} (e^{ixt}/t) g(t) dt,

where, in the last integral, the principal value is taken both at the origin and
at infinity, that is,

v.p. ∫_{−∞}^{∞} = lim_{H→∞} lim_{h→0} ( ∫_{−H}^{−h} + ∫_{h}^{H} ).
Then we obtain from Theorems 1.2.5 and 2.8.1
which yields
But
hix)-Jix)
T
- i x (t ^iK ^
( h2 i r(J^ l \) fit) _ _ L v.p. f ~^ )
dt
v.p. ^ - git) dt e g { t
2 7d J 2 7|(|> t
and
\m\dt
git)
dt + - i dt. (2.9.5)
2 J\t\>T
This relation will be used below together with the inequalities of Theorems 2.9.1
and 2.9.2.
A number of useful inequalities that supplement (and often improve) Theo-
rem 1.4.9 can be derived from Theorems 2.9.1 and 2.9.2 (as well as from relation
(2.9.5)) by substituting K(t) = K^{(0)}(t), where K^{(0)}(t) is defined by (2.8.5). Before
formulating some of them, we prove some simple relations for the function
K^{(0)}(t) (in addition to Lemma 2.8.1, which will also be used). As in the previous
section, K₁^{(0)}(t) and K₂^{(0)}(t) denote, respectively, the real and imaginary parts of
K^{(0)}(t).
PROOF. Without loss of generality, we assume that 0 < t < 1. Using inequality
(2.8.7), we obtain
cos(7rt)| 1
l a - ! + 2
()
1 () | cos(;rt)| 1
+
2 nt sin(Tti) 2
2 V w)
LEMMA 2.9.2. For all real t,
, ^ . (2.9.6)
PROOF. Since |K^{(0)}(t)| is an even function and K^{(0)}(t) = 0 for |t| > 1, without
loss of generality, we may assume that 0 ≤ t ≤ 1. Using inequality (2.8.7), we
obtain
| ) 2 = [ t f f w ] 2 + u4 0) (i)] 2
_ 1
(1 - t)2 + (1 - t)2 cot2() + - ( 1 - )| cot(7rt)| + -
~ 4
1 2() sin2(7tf) , 2sin(Ttf). , 1
< - cot () + cot(Trt) +
4 ()2 (nt)2
- + - + 11
1 2
" 42 t2 t n2t2'
THEOREM 2.9.3. Let F(x) and G(x) be two distribution functions with charac-
teristic functions f(t) and g(t). Then for any 0 < p ≤ 1 and any positive T, the
inequality

sup_x |F(x) − G(x)| ≤ ((1 + p)/(2π)) ∫_{−T}^{T} (|f(t) − g(t)|/|t|) dt + … ∫_{−T}^{T} (|f(t)| + |g(t)|) dt    (2.9.7)

is true.
PROOF. Setting K(t) = K^{(0)}(t) in Theorem 2.9.1, using Lemma 2.9.1 and taking
into account the obvious bound |K^{(0)}(t)| ≤ 1/2, we obtain
rT\m-git)
sup - G(x)\ < [ dt
x 2 J- t
I l/W - 8(t)\ dt + ^l_T l/W + (2.9.8)
2
which implies (2.9.7), because
and
COROLLARY 2.9.1. Let F(x) and G(x) be two distribution functions with charac-
teristic functions f(t) and g(t). Then for any positive T the inequalities

sup_x |F(x) − G(x)| ≤ (1/π) ∫_{−T}^{T} (|f(t) − g(t)|/|t|) dt + (1/(2T))(1 + 1/π) ∫_{−T}^{T} (|f(t)| + |g(t)|) dt    (2.9.9)
and

sup_x |F(x) − G(x)| ≤ (1/π) ∫_{−T}^{T} |(f(t) − g(t))/t| dt + …    (2.9.10)
are true.
THEOREM 2.9.4. Let F(x) and G(x) be two distribution functions with charac-
teristic functions f(t) and g(t). Then for any positive T,

sup_x |F(x) − G(x)| ≤ … ∫_{−T}^{T} (|f(t) − g(t)|/|t|) dt + … ∫_{−T}^{T} |g(t)| dt.    (2.9.11)
The assertion of the theorem immediately follows from Theorem 2.9.2 and
Lemma 2.9.2.
REMARK 2.9.1. An estimate similar to (2.9.9) can also be obtained from Theo-
rem 2.8.3:

sup_x |F(x) − G(x)| ≤ (1/π) ∫_{−T}^{T} (|f(t) − g(t)|/|t|) dt + (3/(4T)) ∫_{−T}^{T} (|f(t)| + |g(t)|) dt.

However, here the constant 3/4 at the second summand on the right-hand side
is somewhat worse than that in (2.9.9): (1/2)(1 + 1/π) ≈ 0.65915.
REMARK 2.9.2. We can improve the constant at the first summand on the right-
hand side of (2.9.11) at the cost of introducing an additional summand, an
integral of the difference |f(t) − g(t)|. For instance, together with (2.9.11)
the following inequality is true: for any 0 < p ≤ 1,

sup_x |F(x) − G(x)| ≤ ((1 + p)/(2π)) ∫_{−T}^{T} (|f(t) − g(t)|/|t|) dt + …    (2.9.12)
Now we estimate the term (1 − |t|) in the first and third summands by sin(πt)/(πt):
) 1 - 1 sin(?rt) | cos(;rt)|
+ +
' ' ~ 2itt 2 2 ret sin(Tri) + 2
1+p 1/ 1
< + -11- + -
2 2 V
Substituting this inequality into (2.9.4) and estimating |K^{(0)}(t)| by 1/2, we
obtain (2.9.12).
THEOREM 2.9.5. Let F(x) and G(x) be two distribution functions with charac-
teristic functions f(t) and g(t). Then for any positive T,
m git)
sup IFix) - G(x) I < f I I dt
2 J- I t I
3 rT i/ git)
+ m i t dt. (2.9.13)
ir U * ^ L\t\>T
[T \m\dt+ [ m dt.
2T J-T 2 J\t\>T
\t\> I t
L - ^ {) * s - L^
dt
dt = h+I2,
where
T
I fit) - git)
dt,
2 y_ I t
and, by virtue of Lemma 2.8.1,
* L wwi^i I1 -
s
(?) I **h Lmdt-
rT
PROPOSITION 2.9.1. Let f(t) and g(t) be two characteristic functions, and let the
absolute value of g(t) be non-increasing for t ≥ 0. Then, if
PROOF. We set

and prove first (2.9.15) for |t| ≤ T₀. From (2.9.14), the definition of T₀, and the
elementary inequality
elementary inequality
||f(t)|ⁿ − |g(t)|ⁿ| ≤ n|f(t) − g(t)|,
we obtain
Thus, neither fⁿ(t) nor gⁿ(t) is equal to zero for |t| ≤ T₀ and, hence, in view of
the continuity of f(t) and g(t), the representations

are true, where φ(t) and ψ(t) are continuous functions such that

for |t| ≤ T₀, because, if (2.9.18) does not hold, then, due to the continuity of
the functions φ(t) and ψ(t) and because of the validity of (2.9.17), there exists t₁ > 0
(t₁ ≤ T₀) such that
- < |() - y ( i i ) | < -
But then

|fⁿ(t) − gⁿ(t)|² = (|f(t)|ⁿ − |g(t)|ⁿ)² + 2|fⁿ(t)gⁿ(t)| [1 − cos(φ(t) − ψ(t))];
hence,
sin²((φ(t) − ψ(t))/2) ≤ |fⁿ(t) − gⁿ(t)|² / (4|fⁿ(t)gⁿ(t)|),    (2.9.19)
which is true for |x| ≤ π/6. Then from (2.9.18) and (2.9.19) we obtain
|φ(t) − ψ(t)| ≤ π/3,

that is,

|φ(t) − ψ(t)| ≤ … (|f(t)|ⁿ |g(t)|ⁿ)^{−1/2} |fⁿ(t) − gⁿ(t)|.    (2.9.20)
Furthermore,
/ - / 1 + / 2 ^ ) + - + W 1
)
= /() - < f/"(i) < ,
|(,)| s s ( 2 9 2 1 )
" *'" + ' '
From (2.9.16), (2.9.20), (2.9.21), and the elementary inequality |sin x| ≤ |x|, we
obtain
= /2+m 2
- 2\mw)\ cos ()
-{)
= l l / ( i ) | - k W I I 2 + 4|/(0|L?(i)|sin 2
2
II II / I
(pit) Iff I / I ^
)
</-*2 + ( 2
V3 J /"-1^)!"-1
2
( ^ ( 1 1/
" { )
\ 3 /
Thus,

|f(t) − g(t)| ≤ (1 + π/3) … .    (2.9.22)
which yields

|g(t)| ≤ (2Δ)^{1/2},  |f(t)| ≤ (3Δ)^{1/2};

hence,

|f(t) − g(t)| ≤ |f(t)| + |g(t)| ≤ (√3 + √2) √Δ.

Combining the last inequality for T₀ ≤ |t| ≤ T and inequality (2.9.22) for
|t| ≤ T₀, we finally obtain
THEOREM 2.9.6. Let f(t) and g(t) be two characteristic functions, and let the
absolute value of g(t) be non-increasing for t ≥ 0. Then
PROOF. We set

Δ = Δ(fₙ, gₙ).
|/()-/()|<2
To V2
v/2
PROPOSITION 2.9.2. Let f(t) and g(t) be two characteristic functions such that

and

where

c(λ) = (π(√2 + 1) + 6√λ) / (3√2 λ(√2 + 1))

is a constant depending only on λ.
PROOF. We have
/()2-?2</2()-^()|<|
which yields
2 2
k w i ^ \ m \ - * ;
hence
] f 2 ( t ) ( 2 9 2 4 )
- * > * ^ S f * v S ^
Let us demonstrate that

|f²(t) − g²(t)|² ≥ 4|f²(t)g²(t)| sin²((arg f²(t) − arg g²(t))/2),
which yields

|arg f²(t) − arg g²(t)| ≤ …,

which is true for |x| ≤ π/6. Then from (2.9.25) and (2.9.26) we obtain

|arg f(t) − arg g(t)| = (1/2)|arg f²(t) − arg g²(t)| ≤ … .    (2.9.27)
And, finally, from (2.9.24), (2.9.27), and Lemma 2.1.14 it follows that

|f(t) − g(t)| ≤ ((π(√2 + 1) + 6√λ) / (3√2 λ(√2 + 1))) |f²(t) − g²(t)|.
It can easily be seen that both Propositions 2.9.1 and 2.9.2 are true not
only for characteristic functions but for any continuous functions f₁(t) and
f₂(t) satisfying the conditions |f₁(t)| ≤ 1 and |f₂(t)| ≤ 1.
>
oSx-.
2
The theorem is a consequence of the following more general result.
(if |f(t)| > a for all t > 0, then we set t_a = ∞). Then

t_a ≥ (arccos a)/β₁.
PROOF. Without loss of generality, we assume that ta < oo. For any natural n,
we set

αₙ = arccos |f(t_a + 1/n)|.    (2.10.2)
- 2n> l^2J. - > oo;
therefore
ta
/ = 1
2 2
+0 > oo. (2.10.3)
which yields
Furthermore,
1 - +o - = =
( 1
2n2
(arccosa) 2 / 1
<1 ;- +
2 2
which yields
(arccosa) 2
*
THEOREM 2.10.2. Let F(x) have a finite first absolute moment β₁. Then
> 1
+
l
PROOF. Let us prove the first assertion of the theorem. Consider the function

(by Theorem 2.10.1, R₀ ≥ π/(2β₁) and, by the Lyapunov inequality, β₁²/β₂ ≤ 1). Then,
using Theorem 2.3.2, we obtain
The second assertion of the theorem can be obtained in exactly the same
way.
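The first positive zero of a characteristic function, the quantity these theorems bound from below, is easy to locate numerically by a sign-change search plus bisection. A small illustrative sketch for the uniform distribution on [−1, 1], whose characteristic function sin(t)/t has its first positive zero at π:

```python
import numpy as np

# Locate the first positive zero of a characteristic function by walking
# to a sign change and then bisecting.  For U[-1, 1], f(t) = sin(t)/t,
# and the first positive zero is pi.
def first_zero(f, step=0.01, tol=1e-12):
    t = step
    while f(t) > 0:                 # walk until f changes sign
        t += step
    lo, hi = t - step, t
    while hi - lo > tol:
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if f(mid) > 0 else (lo, mid)
    return (lo + hi) / 2

f_uniform = lambda t: np.sin(t) / t
t0 = first_zero(f_uniform)
assert abs(t0 - np.pi) < 1e-9
```

The same routine applied to the real part of a characteristic function locates the first zero of ℜf(t) instead.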
In the case of R₀, a more accurate estimate can be obtained than that given
by Theorem 2.10.3 if we use Lemma 2.10.1 instead of inequality (2.3.5).
+ - arccos J 1 - (2.10.6)
V V
PROOF. Let 0 < a < 1, and let t_a be defined by (2.10.1). Consider the function
i.e.,
a
Rq + fa
Pi
a 1
RN > + arccos a. (2.10.7)
Setting
. . , 1 - I2
(this value maximizes the right-hand side of (2.10.7)), we obtain (2.10.6).
Perhaps more accurate lower bounds for the first positive zero of a char-
acteristic function or its real part can be obtained if we use moments of order
higher than two. For example, there are some grounds to expect that
'6
2.11. Notes
Theorems 2.2.1–2.2.4 are due to (Ushakov, 1997). Theorem 2.2.4 to some extent
improves inequalities from (Doob, 1953). Theorem 2.3.1 is due to (Sapogov,
1979). Theorem 2.3.6 was obtained by (Prawitz, 1973; Prawitz, 1975) in the
case δ = 1. In (Ushakov & Ushakov, 1999) it was shown that the same approach
is valid in the general case 0 < δ < 1. Theorem 2.3.8 is due to (Prawitz, 1975).
Some other inequalities of this kind are contained in (Prawitz, 1973; Prawitz,
1975; Prawitz, 1991).
Section 2.4 is mainly based on (Ushakov, 1981a). Theorems 2.4.1, 2.4.2,
and their corollaries are taken from this work. Theorem 2.4.3 was originally
obtained in (Prokhorov, 1962) for symmetric distributions, and in the general
case, in (Ushakov, 1981a).
Theorem 2.5.1 and its corollaries, as well as Theorem 2.5.2, were obtained
by (Ushakov, 1997). Theorems 2.5.3 and 2.5.4 are due to (Ushakov & Ushakov,
1999a). Results of Section 2.6 are simple consequences of results of Sections 2.4
and 2.5.
Theorem 2.7.3 was obtained in (Csörgő, 1981b); Theorems 2.7.4 and 2.7.5
are due to (Ushakov & Ushakov, 1999b). Theorems 2.7.7–2.7.12 immediately
follow from their univariate analogs. Theorem 2.7.17 is taken from (Ushakov,
1981b).
Theorems 2.8.1 and 2.8.2 are due to (Prawitz, 1972). Theorem 2.8.3 fol-
lows from the results of (Prawitz, 1972); see also (Bentkus & Götze, 1996).
Theorem 2.8.4 and its corollaries were obtained in (Daugavet & Petrov, 1987).
Some extensions and generalizations of these results are contained in (Sa-
likhov, 1996). The main results of Section 2.9 (Theorems 2.9.1–2.9.5) follow
from (Prawitz, 1972); see also (Bentkus & Götze, 1996). Theorem 2.9.6 is due
to (Ushakov & Ushakova, 1995).
Theorem 2.10.1 was obtained in (Sakovich, 1965); other results of Section 2.10
have not been published before.
3 Empirical characteristic functions
distribution function F_n(x) associated with the sample X₁, …, X_n is defined as

F_n(x) = (1/n) Σ_{i=1}^{n} I_i(x),

where

I_i(x) = 1 if X_i ≤ x,  I_i(x) = 0 if X_i > x.

In terms of the order statistics X_(1) ≤ ⋯ ≤ X_(n),

F_n(x) = 0 for x < X_(1),  F_n(x) = r/n for X_(r) ≤ x < X_(r+1),  F_n(x) = 1 for x ≥ X_(n).

Obviously,

E F_n(x) = F(x)
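The definition above translates directly into code; the following minimal sketch (NumPy only, illustrative sample) implements F_n and checks the order-statistic form on a tiny sample:

```python
import numpy as np

# Minimal empirical distribution function; the order-statistic form says
# F_n(x) = r/n for X_(r) <= x < X_(r+1).
def ecdf(sample, x):
    s = np.sort(sample)
    return np.searchsorted(s, x, side='right') / len(s)

sample = np.array([3.0, 1.0, 2.0])
assert ecdf(sample, 0.5) == 0.0        # below the smallest order statistic
assert ecdf(sample, 1.0) == 1/3        # X_(1) <= x < X_(2)
assert ecdf(sample, 2.5) == 2/3
assert ecdf(sample, 3.0) == 1.0
```

`searchsorted` with `side='right'` counts the sample points ≤ x, which is exactly n·F_n(x).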
(1) f_n(0) = 1;

cov(f_n(t₁), f_n(t₂)) = E f_n(t₁) f_n(t₂) − f(t₁) f(t₂)
= E((1/n²) Σ_{j=1}^{n} Σ_{k=1}^{n} e^{it₁X_j} e^{it₂X_k}) − f(t₁) f(t₂)
= (1/n) [f(t₁ + t₂) − f(t₁) f(t₂)].

Similarly,

In particular,
Denote the real and the imaginary parts of f(t) by u(t), v(t) and those of f_n(t)
by u_n(t) and v_n(t) respectively. Then

u_n(t) = (1/n) Σ_{j=1}^{n} cos(tX_j),

v_n(t) = (1/n) Σ_{j=1}^{n} sin(tX_j).

Obviously,

E u_n(t) = u(t),  E v_n(t) = v(t).
Find the cross-covariances of u_n(t) and v_n(t). Observe that
In addition, by virtue of the strong law of large numbers, f_n(t) converges almost
surely to f(t) at any fixed point.
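The pointwise convergence just stated is easy to observe in simulation; the sketch below (illustrative seed and sample size) compares the empirical characteristic function of normal data with f(t) = exp(−t²/2) at a few fixed points:

```python
import numpy as np

# Empirical characteristic function f_n(t) = (1/n) * sum_j exp(i t X_j),
# compared at fixed t with f(t) = exp(-t^2/2) for N(0,1) data.
rng = np.random.default_rng(2)
X = rng.standard_normal(20000)

def ecf(sample, t):
    return np.mean(np.exp(1j * t * sample))

for t in (0.5, 1.0, 2.0):
    assert abs(ecf(X, t) - np.exp(-t**2 / 2)) < 0.05
```

The fluctuation at each fixed t is of order n^{−1/2}, so the 0.05 tolerance is generous for n = 20000.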
3.1. Definition and basic properties 163
In other words, the real part of f_n(t) is always closer to f(t) than f_n(t) itself if
f(t) is symmetric.
If a sample under consideration consists of multi-dimensional random vec-
tors, and it is known that their components are independent, then it is reason-
able to use the product of one-dimensional marginal empirical characteristic
functions
∏_{j=1}^{m} f_n(0, …, 0, t_j, 0, …, 0)    (3.1.7)
where T_n → ∞ and (log T_n)/n → 0 as n → ∞, and if it is known that the underlying
distribution is either absolutely continuous or discrete, then the function

f̂_n(t) = f̃_n(t) if sup_x (F_n(x + 0) − F_n(x − 0)) > 1/n,
f̂_n(t) = f*_n(t) if sup_x (F_n(x + 0) − F_n(x − 0)) = 1/n,

almost surely for any fixed positive T < ∞. On the other hand, in the general
case, the empirical characteristic function is not consistent (even in the sense
3.2. Asymptotic properties 165
lim_{|t|→∞} |f(t)| = 0;

then

lim sup_{|t|→∞} |f_n(t) − f(t)| = 1

with probability one. In fact,
any time when the underlying distribution contains a continuous (maybe sin-
gular) component (in the sense of mixture). It turns out that (3.2.1) holds when
T = T_n depends on n and tends to infinity as n → ∞ but not too fast. The
sharp result is the following theorem.
THEOREM 3.2.1. If

lim_{n→∞} (log T_n)/n = 0,

then

lim_{n→∞} sup_{||t|| ≤ T_n} |f_n(t) − f(t)| = 0
∫_{||x||>K} dF(x) < ε/8.
We set

b(t) = ∫_{||x||≤K} e^{i(t,x)} dF(x),

b_n(t) = ∫_{||x||≤K} e^{i(t,x)} dF_n(x) = (1/n) Σ_{j=1}^{n} e^{i(t,X_j)} I_{{||X_j||≤K}}.
(1/n) Σ_{j=1}^{n} I_{{||X_j||>K}},

and these bounds converge almost surely to ∫_{||x||>K} dF(x), which is also a bound
for the third term.
Let us cover the cube [−T_n, T_n]^m by N_n = ([(8Km^{3/2}T_n)/ε] + 1)^m (here the
square brackets denote the integer part) disjoint small cubes Δ₁, …, Δ_{N_n}, whose
edges are of length ε/(4Km^{3/2}), and let t₁, …, t_{N_n} be the centers of these cubes.
Then
= p
( s > I)
v2(t) = ER?(t)
(l n e\ i 2 e X P l~S}' ^ 2w2(t),
> i H k l
Since σ²(t) ≤ 1 and the summands are bounded by 2, the probability in question is no greater than
2 exp{−nε²/64}, and the same is true for the other one, with the v's. Thus,

Let δ < ε²/(64m). Then T_n < exp{δn} for sufficiently large n; hence the series
converges, and the Borel–Cantelli lemma and relation (3.2.2) yield the desired result.
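A simulation in the spirit of Theorem 3.2.1 is straightforward: take T_n growing polynomially in n (so that (log T_n)/n → 0) and observe that the supremum of |f_n − f| over a grid in [−T_n, T_n] stays small. The sketch below (illustrative seed, grid resolution, and tolerance) does this for standard normal data:

```python
import numpy as np

# With T_n growing so that (log T_n)/n -> 0, sup over |t| <= T_n of
# |f_n(t) - f(t)| remains small (Theorem 3.2.1 asserts a.s. convergence).
rng = np.random.default_rng(3)
n = 2000
X = rng.standard_normal(n)
T_n = float(n)                      # (log T_n)/n = log(2000)/2000, small
ts = np.linspace(-T_n, T_n, 2001)   # coarse grid over [-T_n, T_n]
fn = np.exp(1j * np.outer(ts, X)).mean(axis=1)
f = np.exp(-ts**2 / 2)
sup_err = np.max(np.abs(fn - f))
assert sup_err < 0.2
```

For |t| beyond a few units, f(t) ≈ 0, so the grid supremum is dominated by the O(n^{−1/2}) stochastic fluctuation of f_n, well inside the stated tolerance.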
THEOREM 3.2.2. If

lim |f(t₁, …, t_k, …, t_m)| = 0,

then
p_a(𝒩) > 0

PROOF. We set

β(𝒩) = inf{a : p_a(𝒩) > 0}.

What we have to show is that β(𝒩) = 1. First we establish the following
properties of β(𝒩):

then β(𝒩) = β(𝒩′);

(iii) if 2𝒩 = {X_{2k}}, then (β(2𝒩))² = β(𝒩).
The proof of (i) is based on the Dirichlet theorem from the theory of Dio-
phantine approximations, which asserts that if y₁, …, y_n are arbitrary real num-
bers, K > 0 and a > 1, then there exists an integer t ∈ [K, Kaⁿ] such that, with
appropriate integers v₁, …, v_n, the inequalities

|t y_j − v_j| ≤ 1/a,  j = 1, …, n,
Since for every fixed K, we have 5" < 6" for all sufficiently large n, it follows
that
But (3.2.3) implies that for all sufficiently large k, we have a m * > a"* and
and so
This means that p_a(𝒩) > 0. Since this is true for all a > β(𝒩′), we can
conclude that β(𝒩) ≤ β(𝒩′). Reversing the roles of 𝒩 and 𝒩′, we obtain the
opposite inequality, and hence (ii) is also proved.
Let us prove (iii). Introduce the following subsequences of the original
sequence:

X^{(1)} = {X₁, X₄, X₇, X₁₀, …},
X^{(2)} = {X₂, X₅, X₈, X₁₁, …},
X^{(3)} = {X₃, X₆, X₉, X₁₂, …},
Y^{(1)} = {X₂, X₃, X₅, X₆, X₈, X₉, X₁₁, X₁₂, …},
Y^{(2)} = {X₁, X₃, X₄, X₆, X₇, X₉, X₁₀, X₁₂, …},
Y^{(3)} = {X₁, X₂, X₄, X₅, X₇, X₈, X₁₀, X₁₁, …},
whence
+ Pf sup W ' U ^ j
\K<t<an* %nk 2 y
( sup nk' nk
K<t<a
IS ('^)!
> )I
\
(
= 0 + 0 = 0,
sup nk'
K<t<a nk
IS (<2))|
> JI
\
where at the last step we used a < β(𝒩). Thus, a < β(𝒩) yields √a ≤ β(2𝒩).
Therefore β(𝒩) ≤ (β(2𝒩))².
Now let a < β(2𝒩). It is obvious that
{VtfSSa2'* 22nnik
\s2nk{t-,Y^l))\
3
^
sup
\K<t<a2n* 2 nk
= 0 + 0 + 0 = 0,
i.e., a < β(2𝒩) yields a² < β(𝒩). Therefore the opposite inequality
(β(2𝒩))² ≤ β(𝒩) also follows, and hence we have (iii).
Thus, properties (i)–(iii) are validated. Now, for a positive integer m, we set
Uk I
= {
2m I 2m
2> =
2m
V2m
V2m
= m-J/ <6
2
and since this is true for any integer m ≥ 1, the equality β(𝒩) = 1 follows.
sup_{|(t₁, …, t_m)| ≤ T_n} |f_n(t₁, …, t_m) − f(t₁, …, t_m)|
≥ sup_{−T_n ≤ t_k ≤ T_n} |f_n(0, …, t_k, …, 0) − f(0, …, t_k, …, 0)|,
for every ε > 0. Choosing K so large that |f(t)| < M/2 for t > K and then setting
δ = M/2, we obtain
lim sup_{k→∞} P( sup_{|t| ≤ T_{n_k}} |f_{n_k}(t) − f(t)| > δ ) ≥ lim sup_{k→∞} P( sup_{K ≤ t ≤ e^{γn_k}} |f_{n_k}(t)| > δ ) > 0,
Although Theorem 3.2.1 cannot be improved in the general case, this can
be done under the additional condition that the underlying distribution is
discrete.
PROOF. Let X_j take values x_1, x_2, ... with probabilities p_1, p_2, .... For an arbitrary ε > 0 there exists k_0 such that
$$\sum_{k=k_0+1}^{\infty}p_k<\varepsilon. \qquad(3.2.4)$$
Denote A = {x_{k_0+1}, x_{k_0+2}, ...}. By virtue of the strong law of large numbers, for this ε there exists n_0 such that for n ≥ n_0,
(3.2.5)
and
(3.2.6)
$$|f_n(t)-f(t)|=\Bigl|\frac1n\sum_{j=1}^ne^{itX_j}-\sum_{k=1}^{\infty}p_ke^{itx_k}\Bigr|$$
$$\le\sum_{k=1}^{k_0}\Bigl|\frac1n\sum_{j=1}^nI\{X_j=x_k\}-p_k\Bigr|+\frac1n\sum_{j=1}^nI\{X_j\in A\}+\sum_{k=k_0+1}^{\infty}p_k<\frac{\varepsilon}{2}+\frac{\varepsilon}{3}+\frac{\varepsilon}{6}=\varepsilon,$$
which, since ε is arbitrary, proves the assertion of the theorem.
As mentioned above, the empirical characteristic function is not an (almost surely) consistent estimator uniformly on the whole space if the underlying distribution is absolutely continuous. However, in this case the empirical characteristic function can be modified in such a way that the resulting estimator is uniformly consistent. In what follows we use the norm
$$\|\varphi\|_B=\sup_{t\in B}|\varphi(t)|.$$
Let Y(t), restricted to B, be a random element of C(B) with EY(t) = 0 and the cross-covariance matrix (see (3.1.4)-(3.1.6))
$$\begin{pmatrix}\mathsf{E}U_n(t)U_n(s)&\mathsf{E}U_n(t)V_n(s)\\\mathsf{E}V_n(t)U_n(s)&\mathsf{E}V_n(t)V_n(s)\end{pmatrix}=\begin{pmatrix}\frac12[u(t-s)+u(t+s)]-u(t)u(s)&\frac12[v(t+s)-v(t-s)]-u(t)v(s)\\\frac12[v(t-s)+v(t+s)]-v(t)u(s)&\frac12[u(t-s)-u(t+s)]-v(t)v(s)\end{pmatrix},$$
where mes stands for the Lebesgue measure. Then the non-decreasing rearrangement φ̄(h) of φ(t) is defined through the Lebesgue measure of the level sets of φ, and the condition of the theorem reads
$$\int_0\bigl(\bar\varphi(h)\bigr)^{1/2}\,dh<\infty. \qquad(3.2.7)$$
A proof of the theorem is contained in (Marcus, 1981) for the univariate case and in (Csörgő, 1981c) for the general case (see also (Feuerverger & Mureika, 1977; Csörgő, 1981a)). Note that the condition of the theorem is satisfied, in particular, if
$$\mathsf{E}\bigl(\log^+\|X\|\bigr)^{1+\varepsilon}<\infty$$
for some ε > 0.
The results presented below in this section have been proved only in the univariate case, so in the remainder of the section we assume that m = 1.
The following two theorems, which are due to (Keller, 1988), deal with the problem of large deviations of the empirical characteristic function. More precisely, they give asymptotic expressions for the limit, for which a bound of the form
$$\limsup_{n\to\infty}\frac1n\log \Pr\Bigl(\sup_{-\infty<x<\infty}|F_n(x)-F(x)|>\varepsilon\Bigr)\le-\min\{J(\delta,\varepsilon)\colon 0<\delta<1-\varepsilon\}$$
is true.
Now let us introduce the random vector Y_t(θ) and denote
$$h_t(\varepsilon,\theta)=\inf_{r>0}e^{-r\varepsilon}M_t(r\theta).$$
Then
$$\lim_{n\to\infty}\frac1n\log \Pr\Bigl(\sup_{t\in B}|f_n(t)-f(t)|>\varepsilon\Bigr)=\rho(\varepsilon).$$
Hence
$$\limsup_{n\to\infty}\frac1n\log \Pr\Bigl(\sup_{t\in B}|f_n(t)-f(t)|>\varepsilon\Bigr)=\max\Bigl\{\max_{1\le k\le N}\limsup_{n\to\infty}\frac1n\log \Pr\bigl(|f_n(t_k)-f(t_k)|>\varepsilon\bigr),\ \limsup_{n\to\infty}\frac1n\log \Pr\bigl(S_{n,k}>\varepsilon\bigr)\Bigr\}.$$
Now consider
$$\int_{|x|>X}\bigl(e^{it_1x}-e^{it_2x}\bigr)\,d\bigl(F_n(x)-F(x)\bigr).$$
Let
$$T=\max_{t\in B}|t|,\qquad X'=2(1+T).$$
Using integration by parts, we obtain
$$\Bigl|\int\bigl(e^{it_1x}-e^{it_2x}\bigr)\,d\bigl(F_n(x)-F(x)\bigr)\Bigr|=\Bigl|i\int\bigl(F_n(x)-F(x)\bigr)\bigl(t_1e^{it_1x}-t_2e^{it_2x}\bigr)\,dx\Bigr|$$
$$\le 2|F_n(X-0)-F(X)|+2|F_n(-X)-F(-X)|+|t_1-t_2|\,X'\sup_{-\infty<x<\infty}|F_n(x)-F(x)|,$$
and
$$S_{n,k}\le 2|F_n(X)-F(X)|+2|F_n(X-0)-F(X)|+4|F_n(-X)-F(-X)|.$$
Hence,
$$\limsup_{n\to\infty}\frac1n\log \Pr\Bigl(\sup_{t\in B}|f_n(t)-f(t)|>\varepsilon\Bigr)$$
is bounded by the maximum of the corresponding exponents, and therefore
$$\rho(\varepsilon)=\sup_{t\in B}h_t(\varepsilon)>-\infty.$$
PROOF. With the same reasoning as in the proof of Theorem 3.2.5, we obtain the required bound. Since g(t) is almost periodic, there exists an L = L(γ/2) > 0 such that every interval of the real line of length no smaller than L contains at least one γ/2-almost period, i.e., a number τ satisfying |g(t + τ) − g(t)| < γ/2 for all real t. Hence, if t is fixed, we can choose an almost period τ from the open interval (−t, −t + L). Then we obtain
$$|f_n(t)-f(t)|\le|f_n(t+\tau)-f(t+\tau)|+|f_n(t)-f(t)-(f_n(t+\tau)-f(t+\tau))|$$
$$\le\sup_{0\le t\le L}|f_n(t)-f(t)|+|f_n(t)-f(t)-(f_n(t+\tau)-f(t+\tau))|. \qquad(3.2.8)$$
It follows from Theorem 3.2.5 that
$$|f_n(t)-f(t)|\le\frac1n\sum_{j=1}^nZ_j,$$
where
$$Z_j=\sum_{k=1}^{k_0}\bigl|I\{X_j=x_k\}-p_k\bigr|+2\sum_{k=k_0+1}^{\infty}\bigl|I\{X_j=x_k\}-p_k\bigr|,\qquad 1\le j\le n.$$
$$\limsup_{n\to\infty}\frac1n\log \Pr\Bigl(\sup_{-\infty<t<\infty}|f_n(t)-f(t)|>\varepsilon\Bigr)\le\max\Bigl\{-J(\varepsilon),\ \log\mathsf{E}e^{rZ_1}-r\varepsilon\Bigr\}$$
for all r > 0. If we first let δ converge to zero, then let k_0 tend to infinity (to N if X takes a finite number of values), and finally let r tend to infinity, we obtain
THEOREM 3.2.7. Let the underlying distribution function F(x) have finite first absolute moment β_1. Then, for a > 0, b > 0, where a and b may depend on n in an arbitrary fashion,
where
$$R_n=F^{*n}(\,\cdot\,)+1-F^{*n}(\,\cdot\,).$$
PROOF. Set
$$\gamma=\frac{b}{4\beta_1}.$$
We find numbers t_1 < t_2 < ... < t_k with the property that t_1 = −a, t_k = a, |t_i − t_{i+1}| ≤ γ. Obviously, we can assure this with k ≤ 1 + 2a/γ. We begin with the decomposition of the deviation into three summands; denote the three summands on the right-hand side by T_1, T_2, and T_3, and estimate them. Note that
$$|f(t)-f(s)|\le\mathsf{E}\bigl|1-e^{i(t-s)X}\bigr|\le\mathsf{E}|(t-s)X|\le\gamma\beta_1=\frac{b}{4}<\frac{b}{3}.$$
Therefore,
$$\Pr\bigl(|f_n(t_i)-f(t_i)|>b/3\bigr)\le\Pr\bigl(|u_n(t_i)-u(t_i)|>b/6\bigr)+\Pr\bigl(|v_n(t_i)-v(t_i)|>b/6\bigr)\le4e^{-nb^2/72}$$
by Hoeffding's inequality for bounded random variables (Hoeffding, 1963), where, as usual, u(t) and v(t) are the real and imaginary parts of f(t), and u_n(t) and v_n(t) are those of f_n(t). This concludes the proof of the theorem.
We set
$$r_0=\min\{t>0\colon u(t)=0\}$$
(the first positive zero of u(t); if r_0 does not exist, we write r_0 = ∞), and define the random variable R_n as the first positive zero of u_n(t), where r_0, r_0^{(1)}, r_0^{(2)}, ... are the first positive zeros of the real parts of f(t), g_1(t), g_2(t), .... As an example of such a sequence, consider
$$f(t)=pe^{-|t|}+(1-p)\cos t,$$
where 0 < p < 1 is some appropriately chosen number, i.e., chosen in such a way that f′(t) ≠ 0 for t = r_0, and
$$g_m(t)=(p+\varepsilon_m)e^{-|t|}+(1-p-\varepsilon_m)\cos t,$$
THEOREM 3.3.1. If r_0 < ∞ is an isolated zero of u(t), and u(t) decreases in some neighborhood of r_0, then
$$R_n\to r_0\quad\text{almost surely as }n\to\infty.$$
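Theorem 3.3.1 suggests a simple plug-in procedure: scan u_n(t) for its first sign change and refine it by bisection. A minimal sketch, not the book's own algorithm (the helper name `first_positive_zero` and the uniform example are illustrative assumptions):

```python
import numpy as np

def first_positive_zero(x, t_max=10.0, step=1e-3):
    """First positive zero R_n of u_n(t) = mean(cos(t * X_j)): scan for a
    sign change of u_n, then refine the bracketing interval by bisection."""
    u = lambda t: np.mean(np.cos(t * x))
    t = step
    while t < t_max and u(t) > 0:
        t += step
    if t >= t_max:
        return np.inf            # u_n does not vanish on (0, t_max]
    lo, hi = t - step, t
    for _ in range(60):          # bisection on the bracketing interval
        mid = 0.5 * (lo + hi)
        if u(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

rng = np.random.default_rng(3)
x = rng.uniform(-np.pi, np.pi, size=20000)   # u(t) = sin(pi t)/(pi t), r0 = 1
r_n = first_positive_zero(x)
```

For the uniform distribution on (−π, π), u(t) = sin(πt)/(πt), so r₀ = 1 and the estimate should be close to 1 for a large sample.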
we immediately obtain from Theorem 2.10.1 the following estimate for R_n:
$$R_n\ge\frac{\pi}{2\sqrt{\hat m_2}}\quad\text{almost surely}.$$
This bound requires only the sample second moment. Other estimates, involving the sample first-order absolute moment or both the first and the second sample moments, follow from Theorems 2.10.2-2.10.4.
where
$$\hat m_a=\frac1n\sum_{j=1}^n|X_j|^a,\qquad 0<a\le 1,$$
and
$$\sup_{n\ge N}|T_{n,k}-R_n|\to 0\quad\text{as }k\to\infty.$$
By the definition of R_n, Δ_n > 0 and T_{n,k} ≥ b_n for k ≥ [b/Δ] + 1, where [x] denotes the integer part of x.
Let B be any compact subset of the real line such that [0, r_0 + δ/2] ⊂ B. Then, due to the uniform convergence of the empirical characteristic function to the corresponding characteristic function on each compact set (Theorem 3.2.1), there exists an N_1 such that, with probability as close to one as desired, for all n ≥ N_1,
$$u(t)-\varepsilon<u_n(t)<u(t)+\varepsilon.$$
Also, by the strong law of large numbers, there exists an N_2 such that, with probability as close to one as desired, for all n ≥ N_2,
$$\hat m_a<m_a+\varepsilon,$$
where
$$m_a=\int_{-\infty}^{\infty}|x|^a\,dF(x).$$
But δ > 0 by the choice of ε, so that with probability as close to one as desired, for all k ≥ [(r_0 + δ/2)/Δ] + 1 ([x] is the integer part),
$$|T_{n,k}-R_n|<\varepsilon.$$
The theorems below give some results concerning the distribution of R_n depending on the behavior of the real part of the underlying characteristic function. The proofs of the theorems and some additional results are contained in (Heathcote & Hüsler, 1990; Bräker & Hüsler, 1991).
$$R_n=r_0+Z_n,\qquad \sigma^2(t)=\tfrac12\bigl(1+u(2t)-2u^2(t)\bigr),$$
THEOREM 3.3.4. Let u(t) = exp{−β|t|^α} for some β > 0, 1 < α ≤ 2. Then
THEOREM 3.3.5. Let u(t) = exp{−β|t|^α} for some β > 0, 1 < α ≤ 2. Then, for
THEOREM 3.3.6. Let u(t) = exp{−β|t|^α} with 0 < α ≤ 1, β > 0. Then
$$\Bigl(\frac{2\beta}{\log n}\Bigr)^{1/\alpha}R_n\to 1\quad\text{as }n\to\infty.$$
0 < α ≤ 2, |β| ≤ 1, γ > 0, −∞ < a < ∞. The parameter α is called the characteristic exponent, β is a measure of skewness, γ is the scale parameter, and a is the location parameter. Let X be a random variable with the characteristic function of form (3.4.1). Denote its distribution function and density by S(x; α, β, γ, a) and s(x; α, β, γ, a), respectively.
The inference problem with stable distributions is not straightforward; it is complicated by the fact that their densities are not generally available in closed form, which makes it difficult to apply conventional estimation techniques. For this reason, characteristic function-based estimation seems reasonable for the stable laws.
$$\gamma|t_1|^\alpha=-\log|f(t_1)|,\qquad\gamma|t_2|^\alpha=-\log|f(t_2)|.$$
Solving these two equations simultaneously for α and γ, and replacing f(t) by f_n(t), we obtain the estimators
$$\hat\alpha=\frac{\log\bigl(\log|f_n(t_1)|/\log|f_n(t_2)|\bigr)}{\log|t_1/t_2|}, \qquad(3.4.2)$$
$$\hat\gamma=-\frac{\log|f_n(t_1)|}{|t_1|^{\hat\alpha}}. \qquad(3.4.3)$$
Further,
$$\frac{w(t_k)}{t_k}=a+\beta\gamma\tan\frac{\pi\alpha}{2}\,|t_k|^{\alpha-1},\qquad k=3,4. \qquad(3.4.4)$$
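The two-point estimators of α and γ are easy to try out numerically. A minimal sketch (the function name `press_alpha_gamma` and the points t₁ = 0.2, t₂ = 1.0 are illustrative choices; the Cauchy sample is simply a convenient stable law with α = 1, γ = 1):

```python
import numpy as np

def press_alpha_gamma(x, t1=0.2, t2=1.0):
    """Two-point estimators of the stable exponent alpha and scale gamma,
    based on |f(t)| = exp(-gamma * |t|**alpha) evaluated at t1 and t2,
    with f replaced by the empirical characteristic function."""
    fn_abs = lambda t: np.abs(np.mean(np.exp(1j * t * x)))
    y1, y2 = -np.log(fn_abs(t1)), -np.log(fn_abs(t2))
    alpha = np.log(y1 / y2) / np.log(abs(t1) / abs(t2))
    gamma = y1 / abs(t1) ** alpha
    return alpha, gamma

rng = np.random.default_rng(1)
x = rng.standard_cauchy(size=20000)   # stable law with alpha = 1, gamma = 1
alpha_hat, gamma_hat = press_alpha_gamma(x)
```

Because the sample mean of exp(itX) concentrates around f(t), both estimates converge at the usual n^{−1/2} rate for fixed t₁, t₂; the choice of the points affects the constants.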
Since
$$f_n(t)=\frac1n\sum_{j=1}^n\cos(tX_j)+\frac{i}{n}\sum_{j=1}^n\sin(tX_j),$$
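This definition translates directly into a few lines of NumPy; the helper name `ecf` and the standard normal example are illustrative assumptions, not part of the text:

```python
import numpy as np

def ecf(t, x):
    """Empirical characteristic function f_n(t) = (1/n) * sum_j exp(i*t*X_j)."""
    t = np.atleast_1d(t)
    return np.exp(1j * np.outer(t, x)).mean(axis=1)

rng = np.random.default_rng(0)
x = rng.normal(size=5000)
t = np.linspace(-3.0, 3.0, 61)
fn = ecf(t, x)
# For a standard normal sample, f_n(t) should track exp(-t^2/2).
err = np.max(np.abs(fn - np.exp(-t ** 2 / 2)))
```

Since f_n(0) = 1 exactly and |f_n(t)| ≤ 1 always, the computation gives a quick sanity check against any closed-form characteristic function.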
in polar coordinates we have
$$f_n(t)=\rho_n(t)e^{iw_n(t)},$$
where ρ_n²(t) = u_n²(t) + v_n²(t). Hence,
$$\log f_n(t)=\log\rho_n(t)+iw_n(t),\qquad w_n(t)=\Im\bigl(\log f_n(t)\bigr).$$
We choose the principal values of log f_n(t_k), k = 3, 4, i.e., using principal values, for t = t_3, t_4,
$$w_n(t)=\arctan\frac{\sum_{j=1}^n\sin(tX_j)}{\sum_{j=1}^n\cos(tX_j)}. \qquad(3.4.5)$$
Replacing w(t), α and γ in (3.4.4) by their estimated values given in (3.4.2), (3.4.3), and (3.4.5), and solving the two implied linear equations simultaneously for β and a, we obtain the estimators
$$\hat\beta=\frac{w_n(t_3)/t_3-w_n(t_4)/t_4}{\hat\gamma\tan(\pi\hat\alpha/2)\bigl(|t_3|^{\hat\alpha-1}-|t_4|^{\hat\alpha-1}\bigr)}, \qquad(3.4.6)$$
$$\hat a=\frac{|t_4|^{\hat\alpha-1}w_n(t_3)/t_3-|t_3|^{\hat\alpha-1}w_n(t_4)/t_4}{|t_4|^{\hat\alpha-1}-|t_3|^{\hat\alpha-1}}. \qquad(3.4.7)$$
$$|f(t)|=e^{-\gamma|t|},$$
i.e., γ = −log|f(t)|/|t|; therefore, for the parameter γ the estimator is
$$\hat\gamma=-\frac{\log|f_n(t_1)|}{|t_1|}. \qquad(3.4.8)$$
$$\hat\beta=\frac{\pi\bigl[w_n(t_2)/t_2-w_n(t_3)/t_3\bigr]}{2\hat\gamma\log|t_3/t_2|}, \qquad(3.4.10)$$
Thus, (3.4.2), (3.4.3), (3.4.6), and (3.4.7) yield moment-type estimators for the parameters α, γ, β, and a in the case α ≠ 1; (3.4.8), (3.4.10), and (3.4.11) yield
$$C_{34}=C_{43}=\tfrac12\bigl[f(t_1-t_2)-f(t_1+t_2)\bigr],$$
where L(f) is defined by (3.4.13). Then, from (3.4.12) and (3.4.14) we obtain
$$g(Z)=\alpha$$
for
$$\omega_1=\frac{1}{\log|t_1|-\log|t_2|}=-\omega_2,$$
and
$$g(Z)=\log\gamma$$
for
$$\omega_1=-\frac{\log|t_2|}{\log|t_1|-\log|t_2|},\qquad\omega_2=\frac{\log|t_1|}{\log|t_1|-\log|t_2|}.$$
From large-sample theory (see, e.g. (Rao, 1965, p. 321)),
and
$$\eta_j=\eta_j(t_1,t_2),\qquad j=1,2,$$
$$\sigma^2=\sigma^2(t_1,t_2,t_3,t_4). \qquad(3.4.17)$$
we obtain
$$\frac{\partial g(\theta_1,\theta_2,\theta_3,\theta_4)}{\partial\theta_j}=\frac{\omega_j}{f(t_j)\log|f(t_j)|},\qquad j=1,2,$$
and
$$\frac{\partial g(\theta_1,\theta_2,\theta_3,\theta_4)}{\partial\theta_j}=0,\qquad j=3,4,$$
where
$$\sigma^2=\frac{1+|f(2t_1)|-2|f(t_1)|^2}{2\bigl(|f(t_1)|\log|f(t_1)|\log|t_1/t_2|\bigr)^2}+\frac{1+|f(2t_2)|-2|f(t_2)|^2}{2\bigl(|f(t_2)|\log|f(t_2)|\log|t_1/t_2|\bigr)^2}.$$
THEOREM 3.4.1. Let a family f(t; θ), θ ∈ Θ, and a metric ρ satisfy the conditions
PROOF. We have
$$\rho\bigl(f_n(t),f(t;\theta_0)\bigr)\to 0\quad\text{almost surely},$$
and hence θ_n → θ_0 as n → ∞.
Note that, in particular, the first condition of the theorem is satisfied when Θ is a compact set and F(x; θ) is uniquely determined by θ, that is, F(x; θ_1) ≢ F(x; θ_2) if θ_1 ≠ θ_2. Or this is so when Θ is not compact but the limits of the form lim_{‖θ‖→∞} f(t; θ) and lim_{θ→θ_0} f(t; θ), where θ_0 ∉ Θ, either do not exist or are not characteristic functions (the case of stable laws).
Two examples of distances satisfying the conditions of the theorem are given below. These are the weighted uniform distance
$$\rho\bigl(\varphi(t),\psi(t)\bigr)=\sup_t w(t)|\varphi(t)-\psi(t)|,$$
where w(t) is a bounded positive function such that lim_{|t|→∞} w(t) = 0, and the weighted L_p-metric
$$\rho\bigl(\varphi(t),\psi(t)\bigr)=\Bigl(\int|\varphi(t)-\psi(t)|^p\,dG(t)\Bigr)^{1/p},\qquad p>0.$$
Then the asymptotic behavior of the estimator is determined by sums of ζ_1(θ), ..., ζ_n(θ), independent and identically distributed random variables with variance
$$\sigma^2(\theta)=\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\bigl[\operatorname{cov}(\cos(tX),\cos(sX))u'(t;\theta)u'(s;\theta)+2\operatorname{cov}(\cos(tX),\sin(sX))u'(t;\theta)v'(s;\theta)+\operatorname{cov}(\sin(tX),\sin(sX))v'(t;\theta)v'(s;\theta)\bigr]\,dG(t)\,dG(s), \qquad(3.4.24)$$
where, as it follows from elementary trigonometric identities or from (3.1.4)-(3.1.6) for m = 1,
$$\operatorname{cov}(\cos(tX),\cos(sX))=\tfrac12\bigl[u(t+s;\theta)+u(t-s;\theta)-2u(t;\theta)u(s;\theta)\bigr],$$
hence
$$\int_{-\infty}^{\infty}|f'(t;\theta_0)|^2\,dG(t)=\sigma^2(\theta_0).$$
By the strong law of large numbers, this yields
THEOREM 3.4.2. Let f(t; θ) be differentiable with respect to θ under the integral sign, and let the functions u″(t; θ) and v″(t; θ) be uniformly bounded by some G-integrable functions. Then
$$\sqrt n\,(\theta_n-\theta_0)\ \text{is asymptotically normal as }n\to\infty,$$
where Var(θ_0) and μ(θ_0), which determine the asymptotic variance, are defined by (3.4.24) and (3.4.26).
be satisfied. Then
$$\theta_n\to\theta_0\quad\text{almost surely as }n\to\infty,$$
i.e., θ_n is an almost surely consistent estimator of θ. Details can be found in (Markatou et al., 1995; Markatou & Horowitz, 1995).
$$\int_{-\infty}^{\infty}e^{itx}K(x)\,dx$$
(see (Titchmarsh, 1937, Chapter 4)).
Now we introduce some basic characteristics of a density estimator which are of frequent use and which we will deal with below. Let p_n(x) be an estimator (not necessarily a kernel estimator) of p(x) associated with the sample X_1, ..., X_n. The bias of p_n(x) is (we try to make the notation close to that used in the literature on density estimation)
$$B_n\bigl(p_n(x)\bigr)=\mathsf{E}p_n(x)-p(x). \qquad(3.5.2)$$
In the case of the kernel estimator p_n(x) defined by (3.5.1), the bias is written as
$$B_n\bigl(p_n(x)\bigr)=(K_h*p)(x)-p(x),$$
where
$$K_h(x)=\frac1hK\Bigl(\frac{x}{h}\Bigr). \qquad(3.5.3)$$
$$\mathrm{MISE}(p_n)=\int_{-\infty}^{\infty}B_n^2\bigl(p_n(x)\bigr)\,dx+\int_{-\infty}^{\infty}\operatorname{Var}\bigl(p_n(x)\bigr)\,dx.$$
Along with MSE and MISE, other measures of the deviation are used. Among them the mean absolute error
$$\mathrm{MAE}\bigl(p_n(x)\bigr)=\mathsf{E}|p_n(x)-p(x)|$$
and, respectively, the mean integrated absolute error
$$\mathrm{MIAE}(p_n)=\mathsf{E}\int_{-\infty}^{\infty}|p_n(x)-p(x)|\,dx$$
are especially important (see (Devroye & Györfi, 1985)). However, in this book we will consider only MSE and MISE.
In the case of the kernel estimator, the variance, MSE, and MISE of an estimator can be easily expressed in terms of the function K_h(x) defined by (3.5.3) and the density to be estimated:
$$\operatorname{Var}\bigl(p_n(x)\bigr)=\frac1n\bigl[(K_h^2*p)(x)-(K_h*p)^2(x)\bigr].$$
Denote
$$\mu_k(g)=\int_{-\infty}^{\infty}|x|^kg(x)\,dx,\quad k=1,2,\ldots,\qquad R(g)=\int_{-\infty}^{\infty}g^2(x)\,dx,$$
provided that these integrals exist. If the kernel K(x) is a probability density function, and the density to be estimated is twice differentiable with square-integrable second derivative, then the asymptotic relation for MISE holds (see, e.g. (Wand & Jones, 1995))
$$\inf_{h>0}\mathrm{MISE}\bigl(p_n(x)\bigr)\sim\frac54\bigl[\mu_2^{1/2}(K)R(K)\bigr]^{4/5}R^{1/5}(p'')\,n^{-4/5},\qquad n\to\infty.$$
Thus the best order of approximation is n^{−4/5} if only density functions are used for the kernel. However, if we permit the kernel not to be a density, then the order can be improved. For example, if p(x) is the normal density and K(x) is the sine kernel, i.e., K(x) = sin(x)/(πx), then
$$\inf_{h>0}\mathrm{MISE}\bigl(p_n(x)\bigr)=O\Bigl(\frac{\sqrt{\log n}}{n}\Bigr),\qquad n\to\infty.$$
The general form of the kernel estimator (3.5.1) in terms of the empirical characteristic function is
$$p_{nh}(x)=\frac{1}{2\pi}\int_{-\infty}^{\infty}e^{-itx}f_n(t)\varphi(h_nt)\,dt,$$
where φ(t) is the Fourier transform of the kernel K(x).
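The inversion formula above can be evaluated by a direct Riemann sum. A minimal sketch under stated assumptions (Gaussian kernel, so φ(u) = exp(−u²/2); the helper name `kde_via_ecf`, the truncation T and grid size are illustrative):

```python
import numpy as np

def kde_via_ecf(x_grid, sample, h, phi, T=30.0, m=1201):
    """Kernel density estimate written through the empirical characteristic
    function: p_nh(x) = (1/2pi) * integral of exp(-i*t*x) f_n(t) phi(h*t) dt,
    truncated to [-T, T] and evaluated by a Riemann sum."""
    t = np.linspace(-T, T, m)
    dt = t[1] - t[0]
    fn = np.exp(1j * np.outer(t, sample)).mean(axis=1)      # f_n on the grid
    integrand = fn * phi(h * t)
    vals = (np.exp(-1j * np.outer(x_grid, t)) * integrand).sum(axis=1) * dt
    return vals.real / (2 * np.pi)

gauss_phi = lambda u: np.exp(-u ** 2 / 2)   # Fourier transform of the Gaussian kernel
rng = np.random.default_rng(2)
sample = rng.normal(size=1000)
p0 = kde_via_ecf(np.array([0.0]), sample, h=0.3, phi=gauss_phi)[0]
```

For a standard normal sample, the value at 0 should be near 1/√(2π) ≈ 0.40 up to the usual smoothing bias and sampling noise.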
$$\mathrm{MSE}\bigl(p_n(x)\bigr)=\Bigl|\frac{1}{2\pi}\int_{-\infty}^{\infty}e^{-itx}f(t)\bigl(1-\varphi(h_nt)\bigr)\,dt\Bigr|^2$$
$$+\frac{1}{(2\pi)^2n}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}e^{-i(u+v)x}\varphi(h_nu)\varphi(h_nv)f(u+v)\,du\,dv$$
$$-\frac{1}{(2\pi)^2n}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}e^{-i(u+v)x}\varphi(h_nu)\varphi(h_nv)f(u)f(v)\,du\,dv.$$
The first term on the right-hand side is dominated by the first term of the right-hand side of (3.5.14). Let us estimate the absolute values of the second (denoted by T_2) and third (denoted by T_3) terms. We have
$$T_2=\frac{1}{2\pi n}\int_{-\infty}^{\infty}\varphi(h_nu)\Bigl[\frac{1}{2\pi}\int_{-\infty}^{\infty}e^{-i(u+v)x}\varphi(h_nv)f(u+v)\,dv\Bigr]\,du.$$
The term in the square brackets, being transformed to the form $\frac{1}{2\pi}\int_{-\infty}^{\infty}e^{-itx}\varphi(h_n(t-u))f(t)\,dt$, is equal to $\int_{-\infty}^{\infty}p(x-y)K_h(y)e^{-iuy}\,dy$ (since φ(h_n(t − u))f(t) is the Fourier transform of the convolution of the functions p(x) and K_h(x)e^{−iux}), and we have
$$\Bigl|\int_{-\infty}^{\infty}p(x-y)K_h(y)e^{-iuy}\,dy\Bigr|\le\int_{-\infty}^{\infty}p(x-y)|K_h(y)|\,dy.$$
Furthermore,
$$|T_3|=\frac1n\Bigl|\frac{1}{2\pi}\int_{-\infty}^{\infty}e^{-iux}\varphi(h_nu)f(u)\,du\Bigr|\cdot\Bigl|\frac{1}{2\pi}\int_{-\infty}^{\infty}e^{-ivx}f(v)\varphi(h_nv)\,dv\Bigr|$$
$$\le\frac1n\sup_y\Bigl|\frac{1}{2\pi}\int_{-\infty}^{\infty}e^{-iuy}f(u)\varphi(h_nu)\,du\Bigr|\cdot\frac{1}{2\pi}\int_{-\infty}^{\infty}|f(v)||\varphi(h_nv)|\,dv$$
$$\le\frac1n\sup_y(K_h*p)(y)\cdot\frac{1}{2\pi}\int_{-\infty}^{\infty}|f(v)|\,dv\le\frac{\sup_yp(y)}{2\pi n}\int_{-\infty}^{\infty}|f(v)|\,dv$$
(we used the obvious inequality sup_y (K_h * p)(y) ≤ sup_y p(y)). Thus we finally arrive at (3.5.14).
and by virtue of Theorem 2.3.3,
$$|1-\varphi(t)|\le\frac{\mu_2(K)}{2}\,t^2$$
for all t. Hence,
$$\Bigl|\frac{1}{2\pi}\int_{-\infty}^{\infty}e^{-itx}f(t)\bigl(1-\varphi(h_nt)\bigr)\,dt\Bigr|\le\frac{\mu_2(K)h_n^2}{4\pi}\int_{-\infty}^{\infty}t^2|f(t)|\,dt. \qquad(3.5.17)$$
Further, using the Parseval-Plancherel identity (Theorem 1.2.10), we obtain
$$\frac{1}{2\pi nh_n}\int_{-\infty}^{\infty}|\varphi(t)|^2\,dt=\frac{1}{2\pi}n^{-4/5}h_0^{-1}\int_{-\infty}^{\infty}|\varphi(t)|^2\,dt \qquad(3.5.18)$$
$$=n^{-4/5}\frac1{h_0}\int_{-\infty}^{\infty}K^2(x)\,dx=\frac{R(K)}{h_0}\,n^{-4/5}. \qquad(3.5.19)$$
COROLLARY 3.5.2. Let the conditions of Theorem 3.5.1 be satisfied. Then for each n = 1, 2, ..., the resulting bound, minimized over h_0, is of order n^{−4/5}. If p(x) is only once differentiable and/or the first moment of K(x) does not equal zero, then the results are weaker.
PROOF. By virtue of Theorem 2.5.4 and the remark after Theorem 2.3.3,
$$|1-\varphi(h_nt)|\le\mu_1(K)h_n|t|$$
for all t. Hence (see the proof of Theorem 3.5.1),
$$\frac{1}{2\pi nh_n}\int_{-\infty}^{\infty}|\varphi(t)|^2\,dt=n^{-2/3}\frac{R(K)}{h_0}. \qquad(3.5.22)$$
Theorems 3.5.1 and 3.5.2 provide us with bounds for the integrated deviation of the mean squared error of a kernel estimator from zero. Now we obtain bounds for the sup-deviation. Denote
$$B(K)=\frac{1}{2\pi}\int_{-\infty}^{\infty}|\varphi(t)|\,dt.$$
$$\int_{-\infty}^{\infty}|f(t)|\,|1-\varphi(h_nt)|\,dt\le\mu_2(K)h_n^2\int_0^{V_3}t^2\,dt+2\int_{V_3}^{\infty}\frac{dt}{t^2}.$$
COROLLARY 3.5.4. Let the conditions of Theorem 3.5.3 be satisfied. Then for each n = 1, 2, ..., the quantity inf_{h>0} sup_x MSE(p_n(x)) admits an explicit bound of order n^{−4/5}.
PROOF. Use (3.5.16) and the inequality from the remark after Theorem 2.3.3:
$$|1-\varphi(t)|\le\mu_1(K)|t|.$$
Then
$$\int_{-\infty}^{\infty}|f(t)|\,|1-\varphi(h_nt)|\,dt\le\mu_1(K)h_n\int_0^{V_2}t\,dt+2\mu_1(K)h_n\int_{V_2}^{\infty}\frac{dt}{t^3}=O(h_n).$$
COROLLARY 3.5.5. Let the conditions of Theorem 3.5.4 be satisfied. Then for each n = 1, 2, ..., the quantity inf_{h>0} sup_x MSE(p_n(x)) admits an explicit bound of order n^{−2/3}.
THEOREM 3.5.5. Let the underlying density p(x) be a function of bounded variation: V = V(p) < ∞. If
$$h_n=\frac{h_0}{\sqrt{n\log n}},$$
then
$$\mathrm{MISE}\bigl(p_{nh}(x)\bigr)\le\frac{4\sqrt2\,\log2}{\log n}\max\{V^{3/2},V^2\}\max\{\sqrt{h_0},h_0\}\frac{1}{\sqrt n}+\frac{R(K)}{h_0}\sqrt{\frac{\log n}{n}}, \qquad(3.5.24)$$
provided that n ≥ e^e.
PROOF. Let us use Lemma 3.5.2. For the second term in the square brackets, by the Parseval-Plancherel identity, we obtain
$$\frac{1}{2\pi nh_n}\int_{-\infty}^{\infty}|\varphi(t)|^2\,dt=\frac{1}{nh_n}\int_{-\infty}^{\infty}K^2(x)\,dx=\frac{R(K)}{nh_n}. \qquad(3.5.25)$$
Let us estimate the first term. First we establish the following inequality: for any 0 < a < 1/2,
$$|1-\varphi(t)|\le2^{1-2a}\mu_1^{2a}(K)|t|^{2a} \qquad(3.5.26)$$
for all real t. Indeed, in view of the remark after Theorem 2.3.3,
$$|1-\varphi(t)|\le\mu_1(K)|t|. \qquad(3.5.27)$$
For |t| ≤ 2/μ_1(K), the right-hand side of (3.5.26) majorizes the right-hand side of (3.5.27); therefore (3.5.26) is true for these t. If |t| > 2/μ_1(K), then (3.5.26) becomes obvious because its right-hand side exceeds two.
Let a be arbitrary, 0 < a < 1/2. Making use of (3.5.26) and Theorem 2.5.3, we obtain
$$\int_{-\infty}^{\infty}|f(t)|^2|1-\varphi(h_nt)|^2\,dt=2\int_0^{V}|f(t)|^2|1-\varphi(h_nt)|^2\,dt+2\int_V^{\infty}|f(t)|^2|1-\varphi(h_nt)|^2\,dt$$
$$\le\frac{2^{4-2a}}{1-4a^2}\,\mu_1^{2a}(K)V^{2a+1}h_n^{2a}.$$
From this bound and (3.5.25), using Lemma 3.5.1, we obtain
$$\mathrm{MISE}\bigl(p_{nh}(x)\bigr)\le\frac{2^{3-2a}}{\pi(1-4a^2)}\,\mu_1^{2a}(K)V^{2a+1}h_n^{2a}+\frac{R(K)}{nh_n}$$
$$=\frac{2^{3-2a}}{\pi(1-4a^2)}\,\mu_1^{2a}(K)V^{2a+1}\Bigl(\frac{h_0}{\sqrt{n\log n}}\Bigr)^{2a}+\frac{R(K)}{h_0}\sqrt{\frac{\log n}{n}}. \qquad(3.5.28)$$
Set
$$a_0=\frac{\log2+\log\log n}{2(\log n+2\log\log n)};$$
then a ≥ a_0 (if n ≥ e^e); hence
$$\Bigl(\frac{h_0}{\sqrt{n\log n}}\Bigr)^{2a}\le\Bigl(\frac{h_0}{\sqrt{n\log n}}\Bigr)^{2a_0}\le\max\{\sqrt{h_0},h_0\}\,\frac{\sqrt2\,\log2}{\log n}\,\frac{1}{\sqrt n}. \qquad(3.5.30)$$
If
$$h_n=\frac{h_0}{\sqrt{n\log n}},$$
then
Before constructing new estimators, let us make the following remark. The most tempting way of overcoming the troubles related to possible negativity and non-normalization (or even non-integrability) of an estimator seems to be the replacement of the estimator by its projection onto the set of all probability densities (or onto some subset of this set, if some preliminary information about the density to be estimated is available). In other words, let ℱ be some space of functions the estimator under consideration almost surely belongs to, and let 𝒟 be the set of all probability density functions. Suppose that 𝒟 ⊂ ℱ, and ℱ is a metric space with metric ρ(·, ·). The estimator p_n(x) is replaced by the estimator p*_n(x) such that for any realization of p_n(x), the corresponding realization of p*_n(x) satisfies the conditions
(1) p*_n(x) ∈ 𝒟;
and construct for them the modifications p̄_n(x) and p̃_n(x), respectively. If p_n(x) almost surely satisfies one of the relations (3.6.1) or (3.6.2), then p*_n(x) = p̄_n(x) or p*_n(x) = p̃_n(x), depending on which relation holds. If some realizations of the estimator p_n(x) satisfy (3.6.1) while others satisfy (3.6.2), then we can use the following combination of the estimators p̄_n(x) and p̃_n(x):
LEMMA 3.6.1. Let q(x) be a bounded, square integrable function. Then the function
$$q_\varepsilon(x)=\max\{0,\,q(x)-\varepsilon\}$$
PROOF. Denote
$$b=\sup_xq(x)$$
and
$$A_\varepsilon=\{x\colon q(x)>\varepsilon\}.$$
Due to the conditions of the lemma, b < ∞. It is easy to see that A_ε has finite Lebesgue measure. Indeed, if mes A_ε = ∞, then the integral of q² diverges (mes denotes the Lebesgue measure), which contradicts the assumption that q(x) is square integrable. We have q_ε(x) = 0 if x ∉ A_ε; therefore
We set
$$\Phi(z)=\int_{-\infty}^{\infty}\max\{0,\,q(x)-z\}\,dx.$$
Let z_1 > z_2 ≥ 0. We have
LEMMA 3.6.2. The function Φ(z) is continuous for z ≥ 0 and strictly decreases for 0 ≤ z ≤ b.
Lemmas 3.6.1 and 3.6.2 imply that such an ε exists, and each realization of p_n uniquely determines the corresponding realization of ε, i.e., p̄_n is well defined. By the definition of p̄_n(x), each realization of it is a probability density function. We prove now that the MISE of p̄_n(x) (as an estimator of p(x)) is at least as good as that of p_n(x) for any n.
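The shift-and-truncate modification of Lemmas 3.6.1 and 3.6.2 is easy to realize numerically: since Φ(ε) is continuous and strictly decreasing, ε can be found by bisection. A minimal sketch for a grid-discretized estimate (the function name `project_to_density` and the test density are illustrative assumptions):

```python
import numpy as np

def project_to_density(q, grid):
    """Replace an estimate q (on a uniform grid, with integral > 1) by
    max(0, q - eps), where eps is found by bisection so that the result
    integrates to one -- the shift described by Lemmas 3.6.1 and 3.6.2."""
    dx = grid[1] - grid[0]
    mass = lambda eps: np.sum(np.maximum(0.0, q - eps)) * dx
    lo, hi = 0.0, float(np.max(q))
    for _ in range(80):               # Phi(eps) is continuous and decreasing
        mid = 0.5 * (lo + hi)
        if mass(mid) > 1.0:
            lo = mid
        else:
            hi = mid
    eps = 0.5 * (lo + hi)
    return np.maximum(0.0, q - eps)

grid = np.linspace(-5.0, 5.0, 2001)
q = np.exp(-grid ** 2 / 2) / np.sqrt(2 * np.pi) + 0.02   # density plus spurious mass
p_bar = project_to_density(q, grid)
total = np.sum(p_bar) * (grid[1] - grid[0])
```

The output is nonnegative and integrates (numerically) to one, i.e., it is a valid discretized density whatever the sign or total mass defects of the input.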
which implies (3.6.5); so we will assume that ε > 0 for the realization considered. This means, in particular, that
$$\int_{-\infty}^{\infty}\max\{0,\,p_n(x)\}\,dx>1. \qquad(3.6.6)$$
Let us fix an arbitrary 0 < δ < 1. Choose a set A_m (depending on δ) on the real line, having a sufficiently large Lebesgue measure μ > 0, so that (3.6.7) holds (the last two relations are meant for the considered realization of p_n(x); the first of them is possible due to (3.6.6), the second one is possible due to the square integrability of p_n).
Now, let us construct the sequence of functions {q_k(x)}, k = 0, 1, 2, ..., as in (3.6.10) and (3.6.11). We have
$$q_k(x)+c_k=\max\Bigl\{0,\,q_0(x)-\sum_{j=1}^{k-1}c_j\Bigr\},\qquad k=1,2,\ldots, \qquad(3.6.14)$$
which can be proved by induction: this is obviously true for k = 1, and if it is true for k = m, then
$$q_{m+1}(x)+c_{m+1}=\max\Bigl\{0,\,q_0(x)-\sum_{j=1}^{m}c_j\Bigr\}.$$
We set
$$q_M(x)=\lim_{k\to\infty}q_k(x).$$
We have
$$\sum_{j=1}^{\infty}c_j<\infty,$$
and therefore
$$q_M(x)\le p_n(x), \qquad(3.6.21)$$
whence
$$\int_{A_m}\bigl(q_M(x)-p(x)\bigr)^2\,dx\le\int_{A_m}\bigl(p_n(x)-p(x)\bigr)^2\,dx+\varepsilon\le\int_{-\infty}^{\infty}\bigl(p_n(x)-p(x)\bigr)^2\,dx+\varepsilon. \qquad(3.6.22)$$
The transition from q_M(x) to p̄_n(x) is carried out by passage to the limit as δ → 0, M → ∞, and A_m → ℝ¹. More exactly, let δ decrease and A_m increase (the latter in the sense A_{m_1} ⊂ A_{m_2} for m_1 < m_2), and let δ → 0, M → ∞, and A_m → ℝ¹ (in the sense ∪_m A_m = ℝ¹) in such a way that conditions (3.6.7)-(3.6.9) are satisfied. Below, all lim's and lim sup's are taken as δ → 0, M → ∞ and A_m → ℝ¹. In view of (3.6.19), and from this relation and (3.6.18), taking (3.6.17), (3.6.20), and (3.6.21) into account and setting s = sup_x p(x) (by the conditions of the theorem, s < ∞), we obtain
$$\lim\int_{-\infty}^{\infty}\bigl(\bar p_n(x)-q_M(x)\bigr)^2\,dx=0. \qquad(3.6.23)$$
$$\int_{-\infty}^{\infty}\max\{0,\,p_n(x)\}\,dx\le1. \qquad(3.6.2)$$
The results in the case (3.6.2) are not as good as in the case (3.6.1), but they are quite sufficient for applications. The estimator p̃_n(x) introduced below for the case (3.6.2) is not 'always better' than the initial estimator p_n(x) (as the estimator p̄_n(x) is), but it is 'almost as good as' p_n(x). Let p_n(x) be an estimator of a density function p(x), and let (3.6.2) hold almost surely. Assume that p_n(x) is almost surely square integrable. Our proposal for the estimator p̃_n(x) (it is supposed to depend on a parameter M = M(n) > 0) is such that
$$\int_{-\infty}^{\infty}\tilde p_n(x)\,dx=1,\qquad\int_{-\infty}^{\infty}\max\{0,\,p_n(x)\}\,dx\le1.$$
Then for any n, almost surely,
$$\int_{-\infty}^{\infty}\bigl(\tilde p_n(x)-p(x)\bigr)^2\,dx\le\int_{-\infty}^{\infty}\bigl(p_n(x)-p(x)\bigr)^2\,dx+\frac{2}{M}.$$
PROOF. We have
$$\int_{-\infty}^{\infty}\bigl(\tilde p_n(x)-p(x)\bigr)^2\,dx\le\int_{-\infty}^{\infty}\bigl(\max\{0,\,p_n(x)\}-p(x)\bigr)^2\,dx+\frac{2}{M}\le\int_{-\infty}^{\infty}\bigl(p_n(x)-p(x)\bigr)^2\,dx+\frac{2}{M}.$$
Taking, for example, M = εn^{3/2}, where ε is an arbitrary positive number, we arrive at the following assertion.
Theorems 3.6.1 and 3.6.2 show that there are no reasons to avoid the use
of estimators producing estimates which are not densities. This is especially
important with respect to kernel estimators based on the so-called sine and
superkernels. The sine kernel is the function
$$K(x)=\frac{\sin x}{\pi x},$$
whose Fourier transform is
$$\varphi(t)=\begin{cases}1,&|t|\le1,\\0,&|t|>1.\end{cases}$$
Superkernels are integrable functions K(x) with
$$\int_{-\infty}^{\infty}K(x)\,dx=1,$$
whose Fourier transforms are equal to one on the interval [−1, 1]. An example
of a superkernel is
$$K(x)=\frac{\sin x\,\sin(2x)}{\pi x^2},$$
whose Fourier transform is
$$\varphi(t)=\begin{cases}1,&|t|\le1,\\\tfrac12(3-|t|),&1<|t|\le3,\\0,&|t|>3.\end{cases} \qquad(3.6.26)$$
where f_n(t) is the empirical characteristic function. Suppose that the characteristic function f(t) of the underlying density p(x) is integrable, and denote the sine estimator by p̂_n(x). First, we obtain relations for the sine estimator similar to those given by Lemmas 3.5.1 and 3.5.2 (we cannot apply Lemmas 3.5.1 and 3.5.2 directly, since now K(x) is not integrable).
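The sine estimator amounts to inverting the empirical characteristic function over the band [−1/h, 1/h]. A minimal numerical sketch (the function name `sine_estimator`, the grid size and the normal example are illustrative assumptions):

```python
import numpy as np

def sine_estimator(x_grid, sample, h):
    """Sine (sinc) kernel estimator: Fourier inversion of the empirical
    characteristic function restricted to the band [-1/h, 1/h]."""
    T = 1.0 / h
    t = np.linspace(-T, T, 1001)
    dt = t[1] - t[0]
    fn = np.exp(1j * np.outer(t, sample)).mean(axis=1)   # f_n on the band
    vals = (np.exp(-1j * np.outer(x_grid, t)) * fn).sum(axis=1) * dt
    return vals.real / (2 * np.pi)

rng = np.random.default_rng(5)
sample = rng.normal(size=1500)
p0 = sine_estimator(np.array([0.0]), sample, h=0.2)[0]
```

For a standard normal sample with h = 0.2, almost all of |f|² lies inside the band, so the value at 0 should be close to 1/√(2π) ≈ 0.40 up to sampling noise; unlike a density-kernel estimate, the output may be slightly negative in the tails, which is exactly the situation addressed by the modifications of this section.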
and
(3.6.28)
$$\mathrm{MSE}\bigl(\hat p_n(x)\bigr)=\mathsf{E}\Bigl|\frac{1}{2\pi}\int_{-\infty}^{\infty}e^{-itx}f(t)\,dt-\frac{1}{2\pi}\int_{-1/h_n}^{1/h_n}e^{-itx}f_n(t)\,dt\Bigr|^2$$
$$=\mathsf{E}\Bigl|\frac{1}{2\pi}\int_{|t|>1/h_n}e^{-itx}f(t)\,dt+\frac{1}{2\pi}\int_{-1/h_n}^{1/h_n}e^{-itx}\bigl(f(t)-f_n(t)\bigr)\,dt\Bigr|^2.$$
Let us estimate the second term on the right-hand side. Denote it by T_2. Taking into account that
$$\mathsf{E}f_n(u)f_n(v)=\frac1nf(u+v)+\Bigl(1-\frac1n\Bigr)f(u)f(v),$$
we obtain
$$T_2=\frac{1}{(2\pi)^2n}\int_{-1/h_n}^{1/h_n}\int_{-1/h_n}^{1/h_n}e^{-i(u+v)x}\bigl[f(u+v)-f(u)f(v)\bigr]\,du\,dv.$$
Introduce
$$f_h(t)=\begin{cases}f(t),&|t|\le1/h_n,\\0&\text{otherwise};\end{cases}$$
therefore,
$$\mathrm{MISE}\bigl(\hat p_n(x)\bigr)=\frac{1}{2\pi}\Bigl(\int_{-1/h_n}^{1/h_n}\mathsf{E}|f_n(t)-f(t)|^2\,dt+\int_{|t|>1/h_n}|f(t)|^2\,dt\Bigr).$$
Since
$$\mathsf{E}|f_n(t)|^2=\frac1n+\Bigl(1-\frac1n\Bigr)|f(t)|^2,$$
we obtain
$$\int_{-1/h_n}^{1/h_n}\mathsf{E}|f_n(t)-f(t)|^2\,dt=\frac1n\int_{-1/h_n}^{1/h_n}\bigl(1-|f(t)|^2\bigr)\,dt\le\frac1n\int_{-1/h_n}^{1/h_n}dt=\frac{2}{nh_n}.$$
Now we obtain some estimates for the MISE and MSE of the sine estimator depending on the smoothness of the underlying density. First we consider the non-smooth case, where the density to be estimated is not assumed to be differentiable or even continuous.
THEOREM 3.6.3. Let p(x) have bounded variation, V(p) = V < ∞, and let p̂_n(x) be the sine estimator. If h_n = h_0/√n, then
$$\mathrm{MISE}\bigl(\hat p_n(x)\bigr)\le\Bigl(V^2h_0+\frac{1}{h_0}\Bigr)\frac{1}{\pi\sqrt n}.$$
PROOF. Making use of relation (3.6.28) of Lemma 3.6.3 and Theorem 2.5.3, we obtain the assertion.
COROLLARY 3.6.3. Let the conditions of Theorem 3.6.3 be satisfied. Then for each n ≥ 1,
$$\inf_{h>0}\mathrm{MISE}\bigl(\hat p_n(x)\bigr)\le\frac{2V}{\pi\sqrt n}.$$
COROLLARY 3.6.4. Let p(x) be a unimodal density function and p̂_n(x) be the sine estimator. If p(x) is bounded, then MISE(p̂_n(x)) admits a bound of the same order n^{−1/2}.
Now consider the case where the density to be estimated is m times differentiable, m ≥ 1. It will be shown that in this case the upper bound for the MISE of the sine kernel is of the order n^{−2m/(2m+1)}, which in essence cannot be achieved (for m ≥ 2) if kernel estimators are used with kernels that are density functions.
THEOREM 3.6.4. Let p(x) be m times differentiable (m ≥ 1), and let p^{(m)}(x) be a function of bounded variation: V(p^{(m)}) = V_m < ∞. If p̂_n(x) is the sine estimator, and
$$h_n=h_0n^{-1/(2m+1)},$$
then
PROOF. We have
$$\int_{|t|>1/h_n}|f(t)|^2\,dt=h_n^{2m}\int_{|t|>1/h_n}\Bigl(\frac{1}{h_n}\Bigr)^{2m}|f(t)|^2\,dt\le h_n^{2m}\int_{|t|>1/h_n}|t|^{2m}|f(t)|^2\,dt. \qquad(3.6.30)$$
Let us estimate the integral on the right-hand side, making use of Theorem 2.5.4. We have
$$|f(t)|\le\frac{V_m}{|t|^{m+1}};$$
therefore
$$\int_{|t|>1/h_n}|t|^{2m}|f(t)|^2\,dt\le V_m^2\int_{|t|>1/h_n}\frac{dt}{t^2}=2V_m^2h_n. \qquad(3.6.31)$$
Thus, from inequality (3.6.28) of Lemma 3.6.3, and relations (3.6.30) and (3.6.31), we obtain (3.6.29).
COROLLARY 3.6.5. Let the conditions of Theorem 3.6.4 be satisfied. Then for each n ≥ 1, if
$$h_n=h_0n^{-1/(2m-1)},$$
then
$$\sup_x\mathrm{MSE}\bigl(\hat p_n(x)\bigr)\le\frac{(m+1)^2}{m^2}\,V_m^{2m/(m+1)}h_0^{2(m-1)}\,n^{-2(m-1)/(2m-1)}.$$
The proof of this theorem is similar to that of Theorem 3.6.4; one just needs to use relation (3.6.27) of Lemma 3.6.3 instead of relation (3.6.28), and take into account that, by virtue of Theorem 2.5.4, |f(t)| ≤ V_m/|t|^{m+1}.
COROLLARY 3.6.6. Let the conditions of Theorem 3.6.5 be satisfied. Then for each n = 1, 2, ...,
$$\inf_{h>0}\sup_x\mathrm{MSE}\bigl(\hat p_n(x)\bigr)\le\frac{2m-1}{m}\Bigl(\frac{m+1}{m(m-1)}\Bigr)^{(m-1)/(m+1)}V_m^{2/(m+1)}\,n^{-2(m-1)/(2m-1)}.$$
functions is simpler, more natural, and more convenient for our purposes). A distribution F with characteristic function f(t) is said to be supersmooth if for some a > 0 and γ > 0,
$$\int_{-\infty}^{\infty}e^{\gamma|t|^a}|f(t)|\,dt<\infty.$$
THEOREM 3.6.6. Let the characteristic function f(t) of p(x) satisfy this relation for some a > 0 and γ > 0. If p̂_n(x) is the sine estimator, and
$$h_n=\Bigl(\frac{\log(h_0n)}{\gamma}\Bigr)^{-1/a},$$
then
PROOF. We have
$$\int_{|t|>1/h_n}|f(t)|^2\,dt\le\int_{|t|>1/h_n}|f(t)|\,dt\le e^{-\gamma/h_n^a}\int_{|t|>1/h_n}e^{\gamma|t|^a}|f(t)|\,dt\le\frac{1}{h_0n}\int_{-\infty}^{\infty}e^{\gamma|t|^a}|f(t)|\,dt.$$
Using this bound and inequality (3.6.28) of Lemma 3.6.3, we obtain (3.6.32).
$$h_n\le\frac1T,$$
then
$$\sup_x\mathrm{MSE}\bigl(\hat p_n(x)\bigr)\le\frac{2}{\pi nh_n},$$
and
$$\mathrm{MISE}\bigl(\hat p_n(x)\bigr)\le\frac{1}{\pi nh_n}.$$
In particular, if h_n = const = 1/T, then
$$\mathrm{MISE}\bigl(\hat p_n(x)\bigr)\le\frac{T}{\pi n}.$$
$$f_{nk}(t_k)=\frac1n\sum_{j=1}^ne^{it_kX_{jk}}=\int_{-\infty}^{\infty}e^{it_kx}\,dF_{nk}(x)=f_n(0,\ldots,0,t_k,0,\ldots,0),\qquad 1\le k\le m.$$
The hypothesis of independence of the components of the random vectors X_j can be formulated in the following two equivalent forms:
$$H_0\colon F(x_1,\ldots,x_m)=\prod_{k=1}^mF_k(x_k),\qquad(x_1,\ldots,x_m)\in\mathbb R^m, \qquad(3.7.1)$$
and
$$H_0\colon f(t_1,\ldots,t_m)=\prod_{k=1}^mf_k(t_k),\qquad(t_1,\ldots,t_m)\in\mathbb R^m, \qquad(3.7.2)$$
where t^{(0)} is a specially chosen point of ℝ^m (the exact definition will be given below), or the test statistic
$$S_n\bigl(\mathbf t^{(n)}\bigr)=\sqrt n\Bigl(f_n\bigl(\mathbf t^{(n)}\bigr)-\prod_{k=1}^mf_{nk}\bigl(t_k^{(n)}\bigr)\Bigr).$$
Then (Csörgő, 1985) under H_0, S_n(t) converges weakly to S_F(t) if and only if condition (3.2.7) holds. Under H_0,
$$\sigma_F^2(\mathbf t)=\mathsf{E}|S_F(\mathbf t)|^2=1-\prod_{k=1}^m|f_k(t_k)|^2-\sum_{k=1}^m\bigl(1-|f_k(t_k)|^2\bigr)\prod_{\substack{l=1\\l\ne k}}^m|f_l(t_l)|^2.$$
By definition, t^{(0)} is the point maximizing σ_F²(t) on B. The idea of this choice is that at t^{(0)} the random function |S_F(t)|, and hence |S_n(t)| for large n, is most variable. Since the function σ_F²(t), and hence the point t^{(0)}, is unknown in practice, t^{(0)} is replaced for each n by its estimate t^{(n)}, which maximizes
$$\hat\sigma_n^2(\mathbf t)=1-\prod_{k=1}^m|f_{nk}(t_k)|^2-\sum_{k=1}^m\bigl(1-|f_{nk}(t_k)|^2\bigr)\prod_{\substack{l=1\\l\ne k}}^m|f_{nl}(t_l)|^2,$$
an estimator of σ_F²(t).
Under some additional conditions, the limiting distribution of the test statistic can be obtained under H_0. Let us introduce some notation. Denote the real and imaginary parts of f(t), f_n(t), f_k(t_k), and f_{nk}(t_k), as usual, by the letters u and v with the same indices and arguments, and those of the random function Y_F(t) by R_F(t) and I_F(t) (see Section 3.1). For the sake of brevity, we will use the notation t_k for the vector (0, ..., 0, t_k, 0, ..., 0) ∈ ℝ^m. Let S_F^{(1)}(t) and S_F^{(2)}(t) be the real and imaginary parts of the random function S_F(t). We have
$$\sigma_n^{11}(\mathbf s,\mathbf t)=\mathsf{E}S_F^{(1)}(\mathbf s)S_F^{(1)}(\mathbf t)$$
$$=\sigma^{11}(\mathbf s,\mathbf t)-\sum_{k=1}^m\bigl[A_k(\mathbf t)\sigma^{11}(\mathbf s,\mathbf t_k)-B_k(\mathbf t)\sigma^{12}(\mathbf s,\mathbf t_k)\bigr]-\sum_{k=1}^m\bigl[A_k(\mathbf s)\sigma^{11}(\mathbf t,\mathbf s_k)-B_k(\mathbf s)\sigma^{12}(\mathbf t,\mathbf s_k)\bigr]$$
$$+\sum_{k=1}^m\sum_{l=1}^m\bigl[A_k(\mathbf s)A_l(\mathbf t)\sigma^{11}(\mathbf s_k,\mathbf t_l)-A_k(\mathbf s)B_l(\mathbf t)\sigma^{12}(\mathbf s_k,\mathbf t_l)-B_k(\mathbf s)A_l(\mathbf t)\sigma^{12}(\mathbf t_l,\mathbf s_k)+B_k(\mathbf s)B_l(\mathbf t)\sigma^{22}(\mathbf s_k,\mathbf t_l)\bigr],$$
$$\sigma_n^{12}(\mathbf s,\mathbf t)=\mathsf{E}S_F^{(1)}(\mathbf s)S_F^{(2)}(\mathbf t)$$
$$=\sigma^{12}(\mathbf s,\mathbf t)-\sum_{k=1}^m\bigl[B_k(\mathbf t)\sigma^{11}(\mathbf s,\mathbf t_k)+A_k(\mathbf t)\sigma^{12}(\mathbf s,\mathbf t_k)\bigr]-\sum_{k=1}^m\bigl[A_k(\mathbf s)\sigma^{12}(\mathbf s_k,\mathbf t)-B_k(\mathbf s)\sigma^{22}(\mathbf s_k,\mathbf t)\bigr]$$
$$+\sum_{k=1}^m\sum_{l=1}^m\bigl[A_k(\mathbf s)B_l(\mathbf t)\sigma^{11}(\mathbf s_k,\mathbf t_l)+A_k(\mathbf s)A_l(\mathbf t)\sigma^{12}(\mathbf s_k,\mathbf t_l)-B_k(\mathbf s)B_l(\mathbf t)\sigma^{12}(\mathbf t_l,\mathbf s_k)-B_k(\mathbf s)A_l(\mathbf t)\sigma^{22}(\mathbf s_k,\mathbf t_l)\bigr],$$
$$\sigma_n^{22}(\mathbf s,\mathbf t)=\mathsf{E}S_F^{(2)}(\mathbf s)S_F^{(2)}(\mathbf t)$$
$$=\sigma^{22}(\mathbf s,\mathbf t)-\sum_{k=1}^m\bigl[B_k(\mathbf t)\sigma^{12}(\mathbf t_k,\mathbf s)+A_k(\mathbf t)\sigma^{22}(\mathbf s,\mathbf t_k)\bigr]-\sum_{k=1}^m\bigl[B_k(\mathbf s)\sigma^{12}(\mathbf s_k,\mathbf t)+A_k(\mathbf s)\sigma^{22}(\mathbf s_k,\mathbf t)\bigr]$$
$$+\sum_{k=1}^m\sum_{l=1}^m\bigl[B_k(\mathbf s)B_l(\mathbf t)\sigma^{11}(\mathbf s_k,\mathbf t_l)+B_k(\mathbf s)A_l(\mathbf t)\sigma^{12}(\mathbf s_k,\mathbf t_l)+A_k(\mathbf s)B_l(\mathbf t)\sigma^{12}(\mathbf t_l,\mathbf s_k)+A_k(\mathbf s)A_l(\mathbf t)\sigma^{22}(\mathbf s_k,\mathbf t_l)\bigr].$$
the matrices
$$\Sigma_n(\mathbf t)=\begin{pmatrix}\sigma_{11,n}(\mathbf t)&\sigma_{12,n}(\mathbf t)\\\sigma_{12,n}(\mathbf t)&\sigma_{22,n}(\mathbf t)\end{pmatrix},$$
where
$$D_n(\mathbf t)=\det\Sigma_n(\mathbf t)=\sigma_{11,n}(\mathbf t)\sigma_{22,n}(\mathbf t)-\sigma_{12,n}^2(\mathbf t),$$
and σ_{11,n}(t), σ_{12,n}(t), σ_{22,n}(t) are obtained upon replacing u(t), v(t) and f(t) and their marginals by u_n(t), v_n(t) and f_n(t) and their marginals, respectively, everywhere in the respective definitions of σ_{11}(t), σ_{12}(t) and σ_{22}(t).
We introduce the maximum variance quadratic form statistic
$$Q_n\bigl(\mathbf t^{(n)}\bigr)=\mathbf S_n\bigl(\mathbf t^{(n)}\bigr)^{\mathsf T}\Sigma_n^{-1}\bigl(\mathbf t^{(n)}\bigr)\mathbf S_n\bigl(\mathbf t^{(n)}\bigr).$$
THEOREM 3.7.1. Let a compact set B be chosen in such a way that the point t^{(0)} maximizing σ_F²(t) on B is unique, that is,
$$\sigma_F^2\bigl(\mathbf t^{(0)}\bigr)>\sigma_F^2(\mathbf t),\qquad\mathbf t\ne\mathbf t^{(0)}.$$
The described test based on the test statistic S_n(t) is not consistent in the general case. To demonstrate this, let us construct an example of a bivariate characteristic function f(s, t) such that
$$f\bigl(s^{(0)},t^{(0)}\bigr)=f\bigl(s^{(0)},0\bigr)f\bigl(0,t^{(0)}\bigr),$$
but
$$f(s,t)\not\equiv f(s,0)f(0,t).$$
Consider the characteristic function
$$f(s,t)=e^{-|s+t|}.$$
It is clear that
$$f(s,t)\not\equiv f(s,0)f(0,t).$$
Let us demonstrate that
$$f\bigl(s^{(0)},t^{(0)}\bigr)=f\bigl(s^{(0)},0\bigr)f\bigl(0,t^{(0)}\bigr).$$
We have
$$\sigma_F^2(s,t)=1-|f(s,0)|^2-|f(0,t)|^2+|f(s,0)|^2|f(0,t)|^2=\bigl(1-|f(s,0)|^2\bigr)\bigl(1-|f(0,t)|^2\bigr),$$
where f′_n(s, t) and f′_{n1}(s), f′_{n2}(t) are the empirical characteristic function of specially transformed data and its marginals, and W(s, t) is an appropriate weight function.
For arbitrary m, tests based on test statistics of the form
$$n\int_{\mathbb R^m}\Bigl|f_n(\mathbf t)-\prod_{k=1}^mf_{nk}(t_k)\Bigr|^2w(\mathbf t)\,d\mathbf t,$$
where w(t) is a weight function, were studied in (Kankainen, 1995; Kankainen & Ushakov, 1998).
(3.8.1)
for some h > 0 (F(x) is the distribution function). Of course, (3.8.1) is quite
a restrictive condition. However, as pointed out in (Feuerverger & Mureika,
1977), 'the restriction is less troublesome than might first be thought': we can
reduce the general case to the considered one by some transformation of the
data. Indeed, let X be a random variable with distribution function F(x), and
H(x) be any absolutely continuous, strictly increasing distribution function
which is symmetric about the origin. Then X is symmetric about the origin if
and only if 2H(X) 1 is. This means that testing symmetry of a distribution
function F(x) is equivalent to testing symmetry of the distribution function
F_H(x) = 0 for x < -1,
F_H(x) = F(H^{-1}((x + 1)/2)) for |x| ≤ 1,
F_H(x) = 1 for x > 1,
which, obviously, satisfies (3.8.1). The sample values X_1, ..., X_n should be
replaced by 2H(X_1) - 1, ..., 2H(X_n) - 1.
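The transformation is easy to carry out numerically. A minimal sketch, assuming the logistic choice H(x) = 1/(1 + e^{-x}) (any absolutely continuous, strictly increasing H symmetric about the origin works equally well; the sample is simulated):

```python
import math
import random

def symmetrize(sample):
    """Map data to [-1, 1] via Y = 2H(X) - 1, with H the logistic CDF.

    H is absolutely continuous, strictly increasing and symmetric about 0,
    so X is symmetric about 0 if and only if Y = 2H(X) - 1 is.
    """
    H = lambda x: 1.0 / (1.0 + math.exp(-x))
    return [2.0 * H(x) - 1.0 for x in sample]

random.seed(0)
xs = [random.gauss(0.0, 3.0) for _ in range(1000)]  # symmetric about 0
ys = symmetrize(xs)
print(max(abs(y) for y in ys) < 1.0)  # bounded support, as required by (3.8.1)
```

Since H(-x) = 1 - H(x), the map sends x and -x to opposite values, which is exactly why symmetry about the origin is preserved in both directions.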
Let X_1, ..., X_n be a random sample from a distribution function F(x) (which
is supposed to satisfy (3.8.1)) with characteristic function f(t). Denote the test statistic
3.8. Tests for symmetry 233
T_n = (1/(2n²)) Σ_{j=1}^n Σ_{k=1}^n [ g(X_j - X_k) - g(X_j + X_k) ],

where g(x) = ∫_{-∞}^{∞} cos(tx) dG(t).
Denote

μ = ∫_{-∞}^{∞} [Im f(t)]² dG(t),

σ² = ∫_{-∞}^{∞} ∫_{-∞}^{∞} Im f(t_1) Im f(t_2) [ 2 Re( f(t_1 - t_2) - f(t_1 + t_2) )
- 4 Im f(t_1) Im f(t_2) ] dG(t_1) dG(t_2).
PROOF. We set
The first and third terms on the right converge to zero. Consider the middle
term. Since U_n is a Hoeffding U-statistic, it is, by the Hoeffding theorem (see,
e.g. (Fraser, 1957, Theorem 5.1)), asymptotically normal with mean μ
and variance

(4/n) Var [ E^{X_1}( g(X_1 - X_2) - g(X_1 + X_2) ) ]

(where E^X Y denotes the conditional expectation of Y given X). This implies
the assertion of the theorem.
Denote

W_n(t) = n^{-1/2} Σ_{j=1}^n sin(tX_j).
The covariance function of the random process W_n(t) is

K(s, t) = (1/2)[ u(s - t) - u(s + t) ]. (3.8.2)
THEOREM 3.8.2. Let F(x) be symmetric about the origin, and let W(t) be a zero
mean Gaussian process having the same covariance function (3.8.2) as W_n(t).
Denote

T = ∫_{-∞}^{∞} W²(t) dG(t);
then
nT_n →_D T as n → ∞.

For each u, X_u → X as u → ∞. We set

X_u = ∫_{-u}^{u} W²(t) dG(t),   X = ∫_{-∞}^{∞} W²(t) dG(t) = T,

X_{u,n} = ∫_{-u}^{u} W_n²(t) dG(t),   Y_n = ∫_{-∞}^{∞} W_n²(t) dG(t) = nT_n.
Using the Markov inequality, we obtain

P( |X_{u,n} - Y_n| > ε ) = P( ∫_{|t|>u} W_n²(t) dG(t) > ε )

≤ (1/ε²) ∫_{|t_1|>u} ∫_{|t_2|>u} E[ W_n²(t_1) W_n²(t_2) ] dG(t_1) dG(t_2),
is given by

ψ(t) = Π_{j=1}^∞ (1 - 2iλ_j t)^{-1/2},

where {λ_j} is the solution set of the eigenvalue equation

λ_j v_j(t) = ∫_{-M}^{M} v_j(s) K(s, t) (q(s) q(t))^{1/2} ds.
PROOF. According to the Karhunen–Loève theorem (see, e.g. (Ash, 1965, Appendix)),

W(t) = Σ_{j=1}^∞ Z_j w_j(t), |t| ≤ M,
ET = (1/2) ∫_{-M}^{M} [1 - f(2t)] dG(t),

Var T = ∫_{-M}^{M} ∫_{-M}^{M} Cov( W²(s), W²(t) ) dG(s) dG(t)
= 2 ∫_{-M}^{M} ∫_{-M}^{M} K²(s, t) dG(s) dG(t).

f(t) may be estimated by

u_n(t) = (1/n) Σ_{j=1}^n cos(tX_j).
When the underlying distribution is symmetric, u_n(t) will become uniformly
close to f(t). This leads to a test procedure which, to within the accuracy of
the χ² approximation, will have asymptotic level α. More details and further
possibilities are contained in (Feuerverger & Mureika, 1977).
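The quadratic symmetry statistic T_n = ∫ v_n²(t) dG(t) collapses to a double sum once G is fixed. A sketch under the assumption G = N(0, 1), for which ∫ cos(tx) dG(t) = e^{-x²/2} (the simulated samples are our choices):

```python
import math
import random

def Tn_symmetry(sample):
    """T_n = integral of v_n(t)^2 dG(t) with G = N(0,1).

    Expanding v_n(t)^2 and integrating term by term gives
    T_n = (1/(2n^2)) sum_j sum_k [g(Xj - Xk) - g(Xj + Xk)], g(x) = exp(-x^2/2).
    """
    n = len(sample)
    g = lambda x: math.exp(-0.5 * x * x)
    s = 0.0
    for xj in sample:
        for xk in sample:
            s += g(xj - xk) - g(xj + xk)
    return s / (2.0 * n * n)

random.seed(1)
sym = [random.gauss(0, 1) for _ in range(200)]     # symmetric about 0
asym = [abs(random.gauss(0, 1)) for _ in range(200)]  # clearly asymmetric
print(Tn_symmetry(sym) < Tn_symmetry(asym))
```

T_n is non-negative by construction (it is the G-weighted integral of v_n²), small under symmetry, and bounded away from zero under asymmetric alternatives.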
The second test we consider in this section is based on a special measure of
asymmetry and can be used for testing symmetry about an unspecified centre.
First we introduce the characteristic symmetry function, based on the characteristic
function of the underlying distribution, whose behavior is indicative
of symmetry or its absence. Let X be a random variable with distribution
function F(x) and characteristic function f(t). Denote the real and imaginary
parts of the latter by u(t) and v(t) respectively. Let r_0 be the first positive zero
of u(t), 0 < r_0 ≤ ∞ (recall that we write r_0 = ∞ if u(t) > 0 for all real t). Define
the characteristic symmetry function of X (or of F(x)) as

θ(t) = (1/t) arctan( v(t)/u(t) ), 0 < t < r_0.
If X is symmetric about a point θ, then v(t)/u(t) = tan(tθ) and, since |tθ| < π/2
if t < r_0, θ(t) = θ. The symmetry hence implies that θ(t)
is a constant. Conversely, let θ(t) be equal to a constant θ for 0 < t < r_0. Then
tan(tθ) = v(t)/u(t) and hence, sin(tθ)u(t) - cos(tθ)v(t) = 0 (|t| < r_0). Besides,
sin(tθ)u(t) - cos(tθ)v(t) = -Im( e^{-itθ} f(t) ). Therefore, the characteristic function
f_θ(t) = e^{-itθ} f(t) of X - θ is real on the interval (-r_0, r_0). Under condition
(3.8.1), which is supposed to be satisfied, this implies that f_θ(t) is real for all
real t.
Thus, under condition (3.8.1), the constancy of θ(t) is indicative of symmetry.
Further, a straightforward calculation yields, as t → 0,

θ(t) = μ - (t²/3!) E(X - μ)³ + (t⁴/5!) [ E(X - μ)⁵ - 10 E(X - μ)³ E(X - μ)² ] + O(t⁶),

with

EX = μ = lim_{t→0} θ(t).

Thus, as t increases from zero, θ(t) either remains constant or, in the case
of asymmetry, departs from a constant value with direction and magnitude
initially determined by the odd central moments of X.
Now let X_1, ..., X_n be a random sample from the distribution function F(x).
Define the empirical characteristic symmetry function as

θ_n(t) = (1/t) arctan( v_n(t)/u_n(t) ),

where u_n(t) and v_n(t) are the real and imaginary parts of the empirical characteristic
function f_n(t), and [a, b] is some appropriately chosen interval with
0 < a < b < r_0. The idea of the proposed test consists in rejecting the hypothesis
of symmetry if θ_n(t) departs sufficiently from a constant value.
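A minimal sketch of θ_n(t); the sample, its centre, and the grid of t values are our choices:

```python
import cmath
import math
import random

def theta_n(sample, t):
    """Empirical characteristic symmetry function
    theta_n(t) = (1/t) arctan(v_n(t)/u_n(t)),
    where u_n, v_n are the real and imaginary parts of the empirical cf."""
    n = len(sample)
    fn = sum(cmath.exp(1j * t * x) for x in sample) / n
    # atan2 coincides with arctan(v/u) while u(t) > 0, i.e. for t below r_0
    return math.atan2(fn.imag, fn.real) / t

random.seed(2)
mu = 1.5
xs = [mu + random.uniform(-1, 1) for _ in range(20000)]  # symmetric about mu
vals = [theta_n(xs, t) for t in (0.2, 0.4, 0.6)]
print(all(abs(v - mu) < 0.15 for v in vals))
```

For data symmetric about μ the function is nearly flat at level μ over (0, r_0), which is precisely the behaviour the test exploits.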
Under the weak tail condition (3.2.7), θ_n(t) almost surely converges to θ(t) uniformly
on each compact set (Csörgő & Heathcote, 1982, Theorem 2). Deviations
of θ_n(t) from θ(t) are measured by the process

T_n(t) = √n ( θ_n(t) - θ(t) ).

This process converges weakly (Csörgő & Heathcote, 1982) to a zero mean
Gaussian process T(t) with covariance Γ(s, t).
A sufficient condition for this convergence is the weak tail condition that, for
some δ > 0,

E(log⁺ |X|)^{1+δ} < ∞,
which is, obviously, satisfied if (3.8.1) holds.
The test statistic of the proposed test uses the variance function of the
stochastic process T(t), namely σ²(t) = Γ(t, t). Before introducing it, we make
the following remark. A sensitive measure of the deviation of θ(t) from a
constant is

Δ = sup_{a≤s,t≤b} |θ(s) - θ(t)|.

Under symmetry, the corresponding normalized empirical quantity converges
in distribution to

sup_{a≤s,t≤b} |T(s) - T(t)| = | sup_{a≤s≤b} T(s) - inf_{a≤s≤b} T(s) |.
S_n = √n | θ_n(s_n) - θ_n(t_n) | / [ σ_n²(s_n) + σ_n²(t_n) - 2Γ_n(s_n, t_n) ]^{1/2}

converges in distribution to |T(s_0) - T(t_0)|, s_0, t_0 ∈ [a, b]. The latter is the modulus
of a normally distributed random variable with zero mean and variance σ²(s_0) +
σ²(t_0) - 2Γ(s_0, t_0). Thus, if the condition of the uniqueness of s_0 and t_0 and
(3.8.1) are satisfied, then we have the following result concerning the limiting
distribution of the test statistic.
with Φ(x) being the standard normal distribution function. If, on the contrary,
θ(s_0) ≠ θ(t_0), then

S_n → ∞ as n → ∞.
PROOF. Assume that the distribution of X is symmetric about m(X). Then, for any
and similarly,
Hence,
which yields
P( X - m(X) ∈ A ) = P( X - m(X) ∈ -A ),

F(x) = pG(x) + (1 - p)H(x),
where 0 < p < 1 and p ≠ 1/2. The non-symmetry of F(x) is obvious. Let us show
that if X has distribution function F(x), then |X| and sgn X are independent.
Indeed, for any x > 0,

and obviously,

P(sgn X = 1) = p,

i.e.,

P(|X| < x, sgn X = 1) = P(|X| < x) P(sgn X = 1).
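The construction can be checked by simulation. In the sketch below the choice |X| ~ Exp(1) and p = 0.7 is ours; any distribution of |X| drawn independently of sgn X produces an asymmetric X with the same independence property:

```python
import random

# |X| and sgn X drawn independently: X = sgn * |X|
random.seed(6)
p = 0.7
xs = []
for _ in range(100000):
    r = random.expovariate(1.0)          # |X| ~ Exp(1)
    xs.append(r if random.random() < p else -r)

# empirical check of P(|X| < 1, sgn X = 1) = P(|X| < 1) P(sgn X = 1)
joint = sum(1 for x in xs if 0 < x < 1) / len(xs)
pa = sum(1 for x in xs if abs(x) < 1) / len(xs)
pb = sum(1 for x in xs if x > 0) / len(xs)
print(abs(joint - pa * pb) < 0.01)
```

The distribution of X here is plainly non-symmetric (mass p on the positive half-line), yet |X| and sgn X factorize, as the text asserts.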
The equality
H_0: F is an m-variate normal distribution,
where, as usual,

X̄ = (1/n) Σ_{j=1}^n X_j

is the sample mean vector. F_n(x) and f_n(t) denote the empirical distribution
function and empirical characteristic function of X_1, ..., X_n respectively.
3.9. Testing for normality 243
M_n(β) = 1 if S_n is singular, and

M_n(β) = sup_{t ∈ D(β)} | |f_n(S_n^{-1/2} t)|² - e^{-||t||²} | if S_n is non-singular,

where
S_n^{-1/2} is the symmetric positive definite square root of the inverse S_n^{-1} of S_n,
D(β) = {t = (t_1, ..., t_m): |t_j| ≤ β, j = 1, ..., m} is a cubic box centered at the origin
with sides of length 2β, and β is some positive parameter (in (Csörgő, 1986), it
is recommended to choose β = 1.4/√m).
Thus, the test consists of rejecting the hypothesis of normality for large
values of M_n(β). Since M_n(β) is undefined if S_n is singular, it is replaced in this
case by its maximum possible value 1. In practice this will always lead to the
rejection of H_0.
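A univariate (m = 1) sketch of the maximal deviation statistic, with the supremum over D(β) approximated on a finite grid; the grid size and simulated data are our choices, and squaring the modulus of the empirical characteristic function removes the unknown location, as in the definition above:

```python
import cmath
import math
import random

def Mn(sample, beta=1.4, grid=200):
    """Maximal-deviation normality statistic, univariate sketch:
    M_n(beta) = sup over [-beta, beta] of | |f_n(t/s_n)|^2 - exp(-t^2) |,
    where f_n is the empirical cf and s_n^2 the sample variance."""
    n = len(sample)
    mean = sum(sample) / n
    var = sum((x - mean) ** 2 for x in sample) / n
    if var == 0.0:
        return 1.0  # singular case: maximum possible value
    s = math.sqrt(var)
    best = 0.0
    for k in range(-grid, grid + 1):
        t = beta * k / grid
        fn = sum(cmath.exp(1j * (t / s) * x) for x in sample) / n
        best = max(best, abs(abs(fn) ** 2 - math.exp(-t * t)))
    return best

random.seed(3)
normal = [random.gauss(5, 2) for _ in range(500)]
expo = [random.expovariate(1.0) for _ in range(500)]
print(Mn(normal) < Mn(expo))
```

For normal data |f_n(t/s_n)|² tracks e^{-t²} and the statistic is small; for the exponential sample the standardized squared modulus 1/(1 + t²) departs visibly from e^{-t²} inside the box.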
The second test is based on the test statistic

T_n(β) = n Δ_n(β),

where

Δ_n(β) = ∫_{R^m} | f_n(S_n^{-1/2} t) - e^{-||t||²/2} |² φ(t) dt,

φ(t) is the weight function (some density function). Again, since Δ_n(β) is only
defined if S_n is non-singular, it is replaced by its maximum possible value 4 in
the case where S_n is not invertible (and in this case H_0 is rejected).

A good choice for the weight function is

φ_β(t) = (2πβ²)^{-m/2} exp{ -||t||²/(2β²) }. (3.9.1)
Δ_n(β) = (1/n²) Σ_{j=1}^n Σ_{k=1}^n exp{ -β² ||Y_j - Y_k||² / 2 }
- 2(1 + β²)^{-m/2} (1/n) Σ_{j=1}^n exp{ -β² ||Y_j||² / (2(1 + β²)) }
+ (1 + 2β²)^{-m/2},

where Y_j = S_n^{-1/2}(X_j - X̄);
so, since the computation of ||Y_j - Y_k||² and ||Y_j||² involves only S_n^{-1}, not even
the square root S_n^{-1/2} of S_n^{-1} is needed.
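A sketch of this computation for m = 2, using only S_n^{-1} through the Mahalanobis-type quantities ||Y_j - Y_k||² = (X_j - X_k)' S_n^{-1} (X_j - X_k); the explicit 2×2 inversion and the simulated data are our choices:

```python
import math
import random

def bhep_delta(X, beta):
    """Delta_n(beta) for bivariate data (m = 2), computed without S^{-1/2}."""
    n = len(X)
    m = 2
    mx = [sum(x[i] for x in X) / n for i in range(m)]
    # sample covariance S (with factor 1/n) and its explicit 2x2 inverse
    s = [[sum((x[i] - mx[i]) * (x[j] - mx[j]) for x in X) / n
          for j in range(m)] for i in range(m)]
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    inv = [[ s[1][1] / det, -s[0][1] / det],
           [-s[1][0] / det,  s[0][0] / det]]
    def maha(a, b):  # (a-b)' S^{-1} (a-b)
        d = [a[0] - b[0], a[1] - b[1]]
        return (d[0] * (inv[0][0] * d[0] + inv[0][1] * d[1])
                + d[1] * (inv[1][0] * d[0] + inv[1][1] * d[1]))
    b2 = beta * beta
    t1 = sum(math.exp(-b2 * maha(X[j], X[k]) / 2)
             for j in range(n) for k in range(n)) / n ** 2
    t2 = sum(math.exp(-b2 * maha(x, mx) / (2 * (1 + b2))) for x in X) / n
    return t1 - 2 * (1 + b2) ** (-m / 2) * t2 + (1 + 2 * b2) ** (-m / 2)

random.seed(4)
data = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(300)]
print(bhep_delta(data, 1.0) >= 0.0)
```

By construction Δ_n(β) is a weighted integrated squared distance, hence non-negative, and it is small for normal data.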
Before considering the problem of consistency of the tests we establish the
following auxiliary assertion.
The non-negative limits are zero if and only if H_0 is true. Therefore, both
tests are consistent against all alternatives with non-degenerate covariance
matrix (√n M_n(β) → ∞ and T_n(β) → ∞ as n → ∞).
The question arises whether or not this is true for other alternatives (having
components with infinite variances). The problem is important because
the set of remaining alternatives contains stable distributions. It is easy to
obtain the positive answer in the univariate case (m = 1, EX² = ∞).
If m ≥ 2, then the problem is non-trivial, but it turns out that the answer is
still positive.
√n M_n(β) → ∞, n → ∞,

and

T_n(β) → ∞, n → ∞,

for any weight function φ(t) which is positive in some neighborhood of the
origin.
The assertion of the theorem follows from the lemma below. Consider the
event (here, for the sake of brevity, we denote the determinant of a matrix by
| · | instead of det)

Ω_0 = ∩_{k=1}^∞ ∪_{n=k}^∞ { |S_n| > 0 },

consisting in that S_n is infinitely many times non-singular, and let Ω̄_0 be its
complement.
liminf_{n→∞} Δ_n(β) ≥ 4 I(Ω̄_0) + I(Ω_0) inf_{a ∈ R^m} inf_S ∫_{R^m} | e^{-i(S^{-1}t, a)} f(S^{-1}t) - e^{-||t||²/2} |² φ_β(t) dt,

where inf in S is taken over all positive definite symmetric m×m matrices S.
PROOF. Denote the (j, k)th element of the matrix S_n^{-1/2} by s_{jk}^{-1/2}(n). Let K > 0
be arbitrary. Introduce the symmetric truncated matrix A_n(K) with the (j, k)th
element

a_{jk}(n; K) = s_{jk}^{-1/2}(n) if |s_{jk}^{-1/2}(n)| ≤ K,
a_{jk}(n; K) = K sgn( s_{jk}^{-1/2}(n) ) if |s_{jk}^{-1/2}(n)| > K.
sup_{t ∈ R_n(K)} | f_n(t) - exp{ -||S_n^{1/2} t||²/2 } |

(we used the Minkowski triangle inequality for the sup norm). Since each
of the regions R_n(K) is a subset of the same m-dimensional ball centered at
the origin, the first supremum on the right-hand side converges to zero almost surely
(Theorem 3.2.1). On the event Ω_0, there exists a subsequence n_j depending
upon the elementary events in Ω_0 such that n_j → ∞ as j → ∞, and I{ |S_{n_j}| > 0 } = 1,
j = 1, 2, ... Hence by an obvious consideration we obtain
where the infimum is taken over all pairs of m×m symmetric matrices A and
S such that S is positive definite, A = S^{-1} whenever ||S^{-1}|| ≤ K, and ||A|| ≤ K.
Δ_n(β) ≥ 4 I{ |S_n| = 0 } + I{ |S_n| > 0 } ∫_{R_n(K)} | e^{-i(t, X̄)} f_n(t) - e^{-||S_n^{1/2} t||²/2} |² φ_β(S_n^{1/2} t) |S_n^{1/2}| dt.
Using now the Minkowski inequality for the L_2 norm, we see that the last
integral is no less than

( [ ∫_{R_n(K)} | f_n(t) - f(t) |² φ_β(S_n^{1/2} t) |S_n^{1/2}| dt ]^{1/2}
- [ ∫_{R_n(K)} | e^{-i(t, X̄)} f(t) - e^{-||S_n^{1/2} t||²/2} |² φ_β(S_n^{1/2} t) |S_n^{1/2}| dt ]^{1/2} )².
Again, the first square root in this lower bound converges to zero almost surely.
Hence, by the reasoning used above,

liminf_{n→∞} Δ_n(β) ≥ 4 I(Ω̄_0) + I(Ω_0) L(K, τ),

where

L(K, τ) = inf_a inf_{A,S} ∫_{{At': t' ∈ B(τ)}} | e^{-i(t, a)} f(t) - e^{-||S^{1/2} t||²/2} |² φ_β(S^{1/2} t) |S^{1/2}| dt.

Obviously,

lim_{K→∞} L(K, τ) = inf_a inf_S ∫_{B(τ)} | e^{-i(S^{-1}t, a)} f(S^{-1}t) - e^{-||t||²/2} |² φ_β(t) dt,

and letting finally τ → ∞, we obtain the second assertion of the lemma.
PROOF OF THEOREM 3.9.1. Let us begin with proving the first statement. Assume
that

√n M_n(β) ↛ ∞ as n → ∞.

Then, with some positive probability,

liminf_{n→∞} M_n(β) = 0,

hence

inf_S sup_{t ∈ D(β)} | |f(S^{-1}t)|² - e^{-||t||²} | = 0,

where the infimum is taken over all positive definite symmetric m×m matrices.
Since f(t) is uniformly continuous on the whole space R^m, there exists a matrix
S_0 from this class such that

|f(S_0^{-1}t)|² = e^{-||t||²}

for all t ∈ D(β). Therefore, by virtue of Lemma 3.9.1, f(S_0^{-1}t), and hence f(t),
are normal characteristic functions. This contradiction proves the first statement.
The second part of the theorem can be proved similarly.
There is also an upper bound for the tail of the limiting distribution of M_n(β)
(Csörgő, 1986).
A similar situation occurs for the test statistic T_n(β): there is a complete
qualitative description of the limiting distribution of T_n(β) under H_0. Namely,
if the weight function φ(t) is given by (3.9.1), then under H_0,

T_n(β) →_D Σ_{k=1}^∞ λ_k N_k² as n → ∞, (3.9.2)

where N_1, N_2, ... are independent standard normal random variables and
λ_1, λ_2, ... are the eigenvalues of the operator A_1,

A_1 q(x) = ∫_{R^m} h*(x, y) q(y) φ(y) dy
(q(y) ∈ L_2(R^m, ν), i.e., it is square integrable with respect to the m-dimensional
standard Gaussian measure), where
h*(x,y) = e-02Hx-yll2/2
||2
[l+y(2y 2y||x|| ||y|| +2
1 _
(l+2)3 2 ) m/2 2(^2)(|| + 5||2
"2/)
4 \1+22j ((||x||2-m)(||y||2-m)+2(^y)2-||x||-||y||+m))
γ = β² / (2(1 + β²)).

However, an explicit form of λ_1, λ_2, ... has not been found so far, neither in
the case of the operator A_1 nor in the case of A_2; therefore, nothing is known theoretically
about the quantitative behavior of the tail of the limiting distribution of
T_n(β) (except some moments).
Thus, for both tests M_n(β) and T_n(β), approximating computing formulas
and simulation have been used for the calculation of percentage points. Some
formulas and tables are contained in the works indicated above and also in
(Baringhaus & Henze, 1988; Baringhaus et al., 1989).

In conclusion of this section, we point out the close relationship between
the test statistic T_n(β) and some statistic involving a kernel density estimator.
This relationship sheds, in particular, more light on the effect produced by
varying the parameter β.
Taking the weight function φ(t) of the form (3.9.1) and setting h² = 1/(2β²),
we obtain a representation of Δ_n(β) as an integrated squared distance between
the Gaussian kernel density estimator with bandwidth h based on Y_1, ..., Y_n
and a normal density.
The function exp{ -((β² + 1)/(2β²)) ||t||² } is the characteristic function of the
normal distribution with zero mean and covariance matrix ((β² + 1)/β²) I_m, i.e., with
density

( 2π(β² + 1)/β² )^{-m/2} exp{ -β² ||x||² / (2(β² + 1)) }.
on the basis of X_1, ..., X_n. Denote the characteristic functions of F(x) and F_0(x)
by f(t) and f_0(t) respectively. Then the equivalent form for H_0 and H_1 is

H_0: f(t) ≡ f_0(t)

against

H_1: f(t) ≢ f_0(t).

Let, as usual, f_n(t) be the empirical characteristic function of the sample
X_1, ..., X_n. The tests we consider in this section are based on a quadratic
measure of the difference between f_n(t) and f_0(t) evaluated at r points. To
obtain their consistency, we let r = r(n) → ∞ as n → ∞.
Denote the real and imaginary parts of a characteristic function f(t) (with
or without an index) by u(t) and v(t) with the same index. Let t_1, ..., t_r ∈ R^m,
and let Z(t_1, ..., t_r) be the vector composed of the values of u(t) and v(t) at these
points, with Z_n(t_1, ..., t_r) its empirical counterpart. We have

Z_n(t_1, ..., t_r) - Z(t_1, ..., t_r) = (1/n) Σ_{j=1}^n Y_j(t_1, ..., t_r).

Let Q_0(t_1, ..., t_r) be the covariance matrix of Y_j(t_1, ..., t_r) under H_0. The matrix
contains the following elements (see Section 3.1):
where Q_0^{-1/2} is the symmetric positive definite square root of the inverse Q_0^{-1}
of Q_0, and W = W(t_1, ..., t_r) is a diagonal weight matrix with non-negative
diagonal elements (the role of the weight matrix W is to direct the power of
the test based on T_n towards different frequencies).
Consider one example of a special case of T_n. It is a one-dimensional test
for uniformity (to apply it to other distributions, one must transform the data
to uniform on [0, 1] random variables). Thus, let m = 1, and let F_0(x) = x for 0 ≤ x ≤ 1
be the uniform on the interval [0, 1] distribution function. Then

f_0(t) = (sin t)/t + i (1 - cos t)/t.
T_n = Σ_{j=1}^r w_j [√2 u_n(2πj)]² + Σ_{j=1}^r w_{r+j} [√2 v_n(2πj)]², (3.10.1)

where

a_jn = √2 u_n(2πj) = (√2/n) Σ_{k=1}^n cos(2πjX_k),

b_jn = √2 v_n(2πj) = (√2/n) Σ_{k=1}^n sin(2πjX_k).
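Statistic (3.10.1) can be sketched directly from these Fourier coefficients; unit weights w_j ≡ 1 and the simulated samples are our choices:

```python
import math
import random

def Tn_uniform(sample, r, w=None):
    """Statistic (3.10.1) for testing uniformity on [0, 1]:
    T_n = sum_j w_j a_jn^2 + sum_j w_{r+j} b_jn^2, where
    a_jn = (sqrt(2)/n) sum_k cos(2 pi j X_k),
    b_jn = (sqrt(2)/n) sum_k sin(2 pi j X_k)."""
    n = len(sample)
    if w is None:
        w = [1.0] * (2 * r)  # unit weights (our default)
    t = 0.0
    for j in range(1, r + 1):
        a = math.sqrt(2) / n * sum(math.cos(2 * math.pi * j * x) for x in sample)
        b = math.sqrt(2) / n * sum(math.sin(2 * math.pi * j * x) for x in sample)
        t += w[j - 1] * a * a + w[r + j - 1] * b * b
    return t

random.seed(5)
unif = [random.random() for _ in range(1000)]
beta = [random.betavariate(2, 2) for _ in range(1000)]
print(Tn_uniform(unif, 3) < Tn_uniform(beta, 3))
```

The points t_j = 2πj are natural here because f_0(2πj) = 0 for every integer j, so under H_0 every term of the sum is small.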
Assume that the number of points t_1, t_2, ... is infinite: r = ∞. Then it is easy
to see that nT_n is similar to the Cramér–von Mises statistic of the form

Σ_{j=1}^∞ (1/(π²j²)) g_jn²,

where

g_jn = (√2/√n) Σ_{k=1}^n cos(πjX_k).
THEOREM 3.10.1. Let r be a finite integer not depending on n. Then under H_0,

nT_n →_D Σ_{j=1}^{2r} w_j χ_j² as n → ∞,

where χ_1², ..., χ_{2r}² are independent chi-square random variables with one degree
of freedom.
If there is little knowledge about the alternatives, then consistent tests are
preferable to directional tests. Let r = r(n) → ∞ as n → ∞. For a matrix A,
denote the largest and the smallest eigenvalues of A by λ_max(A) and λ_min(A)
respectively.
THEOREM 3.10.2. Assume that there exists a constant c > 0 such that λ_min(W) ≥ c. If

r/n → 0 as n → ∞,

and

√(tr(W⁴)) / tr(W²) → 0 as n → ∞,

then under H_0,

( nT_n - tr(W) ) / √(2 tr(W²)) →_D N(0, 1) as n → ∞

(N(0, 1) is a random variable having the standard normal distribution).
√(tr(W⁴)) / tr(W²) → 0 as n → ∞,

and

tr(W) / ( n √(tr(W²)) ) → 0 as n → ∞,

and

tr[ (W_n - W)² ] → 0,

M_n = o( n / √(tr(W²)) ), n → ∞,

then

P( ( nT_n - tr(W) ) / √(2 tr(W²)) > M_n ) → 1 as n → ∞.

M_n = o(n), n → ∞,
3.11. Notes
Probably the empirical characteristic function first appeared in (Cramer, 1946).
Since then and until the middle of the 1970s, several works have been published
using empirical characteristic functions for the parameter estimation and testing
hypotheses: (Heathcote, 1972; Press, 1972a; Kent, 1975; Feigin & Heathcote,
1976; Blum & Susarla, 1977; Heathcote, 1977; Thornton & Paulson, 1977).
A systematic study of the empirical characteristic function was initiated by
(Feuerverger & Mureika, 1977). After that, empirical characteristic functions
have been studied and applied quite extensively.
The asymptotic behavior of the empirical characteristic function and some
related statistics was investigated by (Kent, 1975; Feuerverger & Mureika,
1977; Csörgő, 1980; Csörgő, 1981a; Csörgő, 1981c; Csörgő, 1981d; Feuerverger
& McDunnough, 1981a; Feuerverger & McDunnough, 1981b; Marcus,
1981; Csörgő & Totik, 1983; Keller, 1988; Kolchinskii, 1989; Feuerverger,
1990; Devroye, 1994).
Theorems 3.2.1 and 3.2.2, as well as Lemma 3.2.1, are due to (Csörgő &
Totik, 1983). Before that, in (Feuerverger & Mureika, 1977) for the case m = 1
and in (Csörgő, 1981c) for the case m > 1, a result similar to Theorem 3.2.1
but weaker (slower increase of T_n) was obtained. Theorem 3.2.3 is due to
(Feuerverger & Mureika, 1977). The weak convergence of the empirical characteristic
process in the space of continuous complex-valued functions on a
compact set was studied in (Kent, 1975; Feuerverger & Mureika, 1977; Marcus,
1981; Csörgő, 1981a; Csörgő, 1981d) in the univariate case and in (Csörgő,
1981c) in the multivariate case. Theorem 3.2.4 was obtained in (Marcus, 1981)
for the case m = 1 and in (Csörgő, 1981c) for the case m > 1. Theorems 3.2.5
and 3.2.6 are due to (Keller, 1988), Theorem 3.2.7 is due to (Devroye, 1994).
The problem of the first positive zero of the real part of a characteristic
function and of an empirical characteristic function was investigated in (Welsh,
1986; Heathcote & Hüsler, 1990; Bröker & Hüsler, 1991). Theorems 3.3.1
and 3.3.2 were obtained in (Welsh, 1986); Theorems 3.3.4–3.3.6 are due to
(Heathcote & Hüsler, 1990).
The problem of parameter estimation on the basis of the empirical characteristic
function was studied in (Press, 1972a; Press, 1972b; Paulson et
al., 1975; Heathcote, 1977; Thornton & Paulson, 1977; Koutrouvelis, 1980b;
Csörgő, 1981d; Koutrouvelis, 1981; Koutrouvelis, 1982; Welsh, 1985; Markatou
et al., 1995; Markatou & Horowitz, 1995). First applications of the empirical
characteristic function in statistical estimation were related to the estimation
of parameters of stable laws. In (Press, 1972a; Press, 1972b), several methods
of estimation were proposed which use the empirical characteristic function
and one of them, called the method of moments, was studied in detail, in the
case of stable characteristic functions. The integrated squared error estimator
was considered in a number of works. Consistency and asymptotic normality
of this estimator were investigated in (Thornton & Paulson, 1977) with the
test for univariate normality. The approach of Epps and Pulley was extended
to the multivariate case, developed and investigated in (Baringhaus & Henze,
1988; Csörgő, 1989; Henze & Zirkler, 1990; Henze, 1990; Henze & Wagner,
1997). In (Csörgő, 1986), the maximal deviation test statistic was proposed
and investigated as an extension of the test statistic of (Murota & Takeuchi,
1981). Theorem 3.9.1 is due to (Csörgő, 1989).
The empirical characteristic function approach to constructing goodness-of-
fit tests was used in (Heathcote, 1972; Feigin & Heathcote, 1976; Koutrouvelis,
1980a; Koutrouvelis & Kellermeier, 1981; Fan, 1997) (the references to works
concerning testing for normality were given above separately). The results
presented in Section 3.10 are due to (Fan, 1997).
Appendix A. Examples
hence Re f(t) ≡ Re g(t). At the same time, f(t) ≢ g(t) because g(t) is the characteristic
function of a distribution concentrated on the positive half-line and
not degenerate at the origin, therefore Im g(t) ≢ 0 while Im f(t) ≡ 0 due to the
symmetry.

The characteristic function f_0(t) ≡ 1 is uniquely determined by its real
part. Indeed, suppose that there exists a characteristic function g(t) such that

Re g(t) ≡ Re f_0(t) ≡ 1
and

Im g(t) ≢ Im f_0(t) ≡ 0.

Then there exists t_0 such that Im g(t_0) ≠ 0, and

|g(t_0)| = ( 1 + (Im g(t_0))² )^{1/2} > 1,

that is impossible.
An open question: is the same true for the imaginary part of the characteristic
function, i.e., is it true that the characteristic function is never determined
by its imaginary part?
(b) e^{it/2} J_0(t/2), the characteristic function of the arc sine distribution, i.e.,
the distribution with the density

p(x) = 1/( π √(x(1 - x)) ), 0 < x < 1,
p(x) = 0 otherwise.
ψ(t) / |φ(t)| → ∞, t → ∞.

First let us prove that there exists a piecewise linear, continuous, non-negative,
strictly decreasing for t > 0 function ψ(t) such that ψ(0) = 1, ψ(t) ≤ 1, ψ(t) → 0
as t → ∞, and ψ(t) ≥ |φ(t)| for all sufficiently large t. Denote
ψ(t'_n) = 1/n, n = 1, 2, ...,

where the points t'_1, t'_2, ... will be chosen later, and define ψ(t) to be linear on
each interval [t'_n, t'_{n+1}], n = 1, 2, ... We set t'_1 = 0 and suppose that the
first k points t'_1, ..., t'_k have been chosen. Then t'_{k+1} is chosen so large that
the absolute value of the derivative of ψ(t) on the interval (t'_k, t'_{k+1}) is less
than that on the interval (t'_{k-1}, t'_k), i.e., ψ(t) is convex for t > 0, and ψ(t) ≥ |φ(t)|
(t ≥ 0). For negative t, define ψ(t) by the equality ψ(t) = ψ(-t). By virtue of
ψ(t) / φ(t) → ∞, t → ∞.
n ( l - i - ) = 0 . (A.1)
fit) = (1 - 1 + -2_"
=l V an an
|T|->OO
f(t) = Π_{n=1}^∞ cos(t/n!)

(Lukacs, 1970).
or

Π_{k=1}^∞ cos(t/2^k)

(Lukacs, 1970). The product of these two characteristic functions is sin t / t, the
characteristic function of the uniform distribution on [-1, 1].
1, 0 < i < 1,
<pit) = < t > 1,
k0, < 0.
m = c- e~se-<s+t)ds
= e- < / _ 1 + i + M < 0 .
2e2 J
Another way to construct examples satisfying the required conditions is
based on the use of Theorem 1.3.14. By this theorem, the function
A2
)
412 ' 1
f(t) = - 1-
2 2
il + t ) 1 + t2 2 dt2 \l + t2J
fit) = (1 - -'
EXAMPLE 11. For any interval [a, b] there exist two characteristic functions f(t)
and g(t) such that f(t) = g(t) for t ∈ [a, b] but f(t) ≢ g(t).

This proposition is a special case of the following.

EXAMPLE 12. For any symmetric interval [-a, a] there exist two characteristic
functions f(t) and g(t) such that f(t) = g(t) for t ∈ [-a, a] and f(t) ≠ g(t) for
t ∈ R¹ \ [-a, a].
One can set, for example,
for some a, one of these identities holds; for instance, let it be the first one.
Then

φ(t)ψ(t) = φ(t)ψ(-t)e^{ita},

and, since φ(t) ≠ 0 for all t,

ψ(t) = ψ(-t)e^{ita},

or

ψ(t)e^{-ita/2} = ψ(-t)e^{ita/2},

i.e., ψ(t)e^{-ita/2} is symmetric, which contradicts our assumption.

It is clear that the functions φ(t) and ψ(t) can be chosen in such a way that
both f(t) and g(t) are analytic and do not have zeros.
The problem is closely related to the so-called phase problem in physics
(see, e.g. (Kuznetsov & Ushakov, 1986)).
EXAMPLE 14. There exist two different real characteristic functions whose absolute
values coincide.

The usually cited example is the following: two periodic functions f(t) and
g(t) with periods 2 and 4, respectively, such that f(t) = 1 - |t| for |t| ≤ 1, and
g(t) = 1 - |t| for |t| ≤ 2. The first function is a characteristic function by
Theorem 1.3.12. The function g(t) is obtained from f(t) as

g(t) = 2f(t/2) - 1,
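Both the relation g(t) = 2f(t/2) - 1 and the coincidence of the absolute values are easy to verify numerically; the grid of test points is our choice:

```python
def f(t):
    """Period-2 extension of 1 - |t| from [-1, 1] to the whole line."""
    u = ((t + 1.0) % 2.0) - 1.0   # reduce t to [-1, 1)
    return 1.0 - abs(u)

def g(t):
    """Period-4 extension of 1 - |t| from [-2, 2] to the whole line."""
    u = ((t + 2.0) % 4.0) - 2.0   # reduce t to [-2, 2)
    return 1.0 - abs(u)

ts = [k * 0.01 for k in range(-1000, 1001)]
ok1 = all(abs(g(t) - (2.0 * f(t / 2.0) - 1.0)) < 1e-12 for t in ts)
ok2 = all(abs(abs(f(t)) - abs(g(t))) < 1e-12 for t in ts)
print(ok1 and ok2)
```

On [0, 2] one has f(t/2) = 1 - t/2, so 2f(t/2) - 1 = 1 - t = g(t), and periodicity extends the identity to the whole line; |f| and |g| then agree everywhere although f ≥ 0 while g takes negative values.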
EXAMPLE 15. There exist characteristic functions f(t), g(t) and h(t) such that
f(t)h(t) ≡ g(t)h(t) but f(t) ≢ g(t).

The simplest example is the following: f(t) and g(t) are from Example 12,
and

h(t) = 1 - |t|/a for |t| ≤ a,
h(t) = 0 otherwise.

EXAMPLE 16. There exist characteristic functions f(t) and g(t) such that f²(t) ≡
g²(t) but f(t) ≢ g(t).

For this case, Example 14 is valid. An open question: do there exist two
different characteristic functions f(t) and g(t) such that f^n(t) ≡ g^n(t) for some
odd n?
EXAMPLE 17. For any symmetric interval (-a, a), there exists a characteristic
function which is infinitely differentiable and vanishes outside (-a, a).

Let X_1, X_2, ... be a sequence of independent and identically distributed
random variables with the uniform distribution on the interval [-1, 1], and
let a_1, a_2, ... be a sequence of positive real numbers, a_n > 0, n = 1, 2, ..., such
that

Σ_{n=1}^∞ a_n = 1.
Denote b_n = a_n/2. Let the random variable U be the sum of the absolutely
convergent random series Σ_{n=1}^∞ b_n (X_{2n-1} + X_{2n}). Then |U| ≤ 1 almost surely, and the
characteristic function f(t) of U is given by

f(t) = Π_{n=1}^∞ ( sin(b_n t) / (b_n t) )²

≤ 1/(b_1² t²).
f(t) is integrable over the real line; therefore U is absolutely continuous with
the density

p(x) = (1/(2π)) ∫_{-∞}^{∞} e^{-itx} f(t) dt.

Since f(t) is nonnegative and integrable, there exists a constant α > 0 such
that αf(t) is a probability density function; therefore, for some c > 0, ψ(x) = c p(x) is
a characteristic function. Denote the corresponding distribution function by
F(x).

Thus, the characteristic function of F(x), which is ψ(x), vanishes outside
the interval [-1, 1]. On the other hand, the density of F(x), which is αf(t),
satisfies the inequality

f(t) ≤ Π_{j=1}^n (b_j t)^{-2}

for any positive integer n, hence F(x) has moments of all orders, and p(x) has
derivatives of all orders (Theorem 1.5.2).
To obtain a characteristic function satisfying the conditions of the example,
one can take

φ(t) = ψ(t/a).
EXAMPLE 1 8 . There exist two different characteristic functions with equal ab-
solute values and such that the corresponding distributions have the same
moments of all orders.
ak < (A.2)
[0 otherwise.
= i ) / w
^ i x ) = ^ / ^ 1 - cos(4ax)). (A.4)
2(0)
The functions q_1(x) and q_2(x) are probability density functions, g_1(t) and g_2(t)
are the corresponding characteristic functions. The support of f_0(t) is contained
in the interval [-2a, 2a]; therefore |g_1(t)| ≡ |g_2(t)|. From (A.3) and (A.4), one
can easily see that both q_1(x) and q_2(x) possess moments of all orders. Since
|g_1(t)| ≡ |g_2(t)|, we see that g_1(t) = g_2(t) in some neighborhood of the origin,
which implies that all moments of q_1(x) and q_2(x) coincide.

Thus, g_1(t) and g_2(t) satisfy the conditions of the counterexample.
EXAMPLE 19. For any -1 < a < 1, there exists a characteristic function f(t)
which takes on the value a on some interval (of the real line) of positive length.

If 0 ≤ a < 1, then one can take

f(t) = a + (1 - a)g(t),

where g(t) is a characteristic function vanishing outside a finite interval.
Let -1 < a < 0. Consider the characteristic function g(t) from Example 14,
i.e., the periodic (with period 4) function such that g(t) = 1 - |t| for |t| ≤ 2. Let
numbers c and p satisfy the conditions 1/3 < c < 1 and 0 < p < 1, and consider
the corresponding characteristic function h_{c,p}(t). Set

p = 1/(1 + c);

then

h_{c,p}(t) = (1 - 3c)/(1 + c)

on the interval [2, 6], therefore it suffices to take

c = (1 - a)/(3 + a).
EXAMPLE 20. For any a, there exist two characteristic functions f(t) and g(t)
such that Im f(t) = Im g(t) for |t| ≤ a but Im f(t) ≢ Im g(t).

Consider, for example, two symmetric characteristic functions f_0(t) and
g_0(t) satisfying the conditions of Example 12, i.e., such that f_0(t) = g_0(t) for |t| ≤ a
and f_0(t) ≠ g_0(t) for |t| > a. Then the characteristic functions f(t) = f_0(t)e^{ibt} and
g(t) = g_0(t)e^{ibt} satisfy the conditions of this example for any b ≠ 0.

EXAMPLE 21. For any a, there exists a characteristic function f(t) such that
Im f(t) = 0 for |t| ≤ a but Im f(t) ≢ 0.

We take the characteristic functions f(t) and g(t) from Example 20 and set

φ(t) = (1/2)( f(t) + g(-t) ).

Then the characteristic function φ(t) satisfies the conditions of this example.
git) =
0 otherwise.
is integrable and
because

| (1 - cos x)/x² | ≤ 1

and

| sin(ax)/x | ≤ a.
Denote the characteristic function of p(x) by f(t) and the Fourier transform
of r(x) by h(t). Since r(x) is an odd function, h(t) is purely imaginary, hence

and the periodic function g(t), which has period 2 and coincides with f(t) on
the interval [-1, 1] (see Example 14), are valid.
Another, more interesting, example is derived from Example 17 and Theo-
rem 1.3.11. In this case, characteristic functions are infinitely differentiable.
EXAMPLE 25. There exists a characteristic function f(t) which is infinitely dif-
ferentiable at every point except the origin but no absolute moment of positive
order of the corresponding distribution exists.
Let a random variable X have the characteristic function
where c is the normalizing constant. For any fixed t different from the origin
there exists a neighborhood of t where the series on the right-hand side,
differentiated term by term, converges uniformly. Therefore, for this t, the
derivative of f(t) exists. The same is true for the derivative of any order. On
the other hand,
EXAMPLE 26. There exist characteristic functions which are not differentiable
at all points.
As an example, we take the Weierstrass function
^ J 1 - ^ W " n>
10 otherwise.
m-l110 otherwise,
EXAMPLE 29. The uniform (on the whole line) convergence of a sequence of characteristic
functions f_1(t), f_2(t), ... to a characteristic function f(t) does not imply
that the sequence of the corresponding distributions converges in variation.
The distance in variation between two distributions F and G is defined as

Var(F, G) = sup_A | F(A) - G(A) |,

where sup is taken over all Borel sets A of the real line.
It is not hard to show that the convergence in variation of a sequence of
univariate probability distributions implies the uniform (on the whole line)
convergence of the corresponding characteristic functions. The present example
demonstrates that the converse is not true.
In view of Example 7, there exists a singular distribution F such that its
characteristic function fit) satisfies the condition
lim_{|t|→∞} f(t) = 0.
At the same time, for any ε > 0, there exist a = a(ε) and n_0 = n_0(ε) such that
|f(t)| < ε/2 for |t| > a, and 1 - e^{-a²/(2n)} < ε for n > n_0. Thus, for n > n_0,

sup_t | f_n(t) - f(t) | ≤ max{ 1 - e^{-a²/(2n)}, sup_{|t|>a} |f_n(t)| + sup_{|t|>a} |f(t)| }

≤ max{ 1 - e^{-a²/(2n)}, 2 sup_{|t|>a} |f(t)| } < ε,

i.e., f_n(t) → f(t) as n → ∞ uniformly on the whole line.
p(x) = a, 0 < x < 1,
p(x) = b, 1 < x < 3,
p(x) = 0 otherwise,

where a and b are positive numbers such that a + 2b = 1 and a > 2b, say,
a = 2/3, b = 1/6. Then p(x) is unimodal. Denote the density of X_1 + X_2 by q(x).
Simple calculations show that q(1) = a², q(2) = 2ab, q(3) = 2ab + b², which
yields q(1) > q(2) < q(3), i.e., q(x) is not unimodal.
The unimodality of the distribution corresponding to |f(t)|² was proved in
(Hodges & Lehmann, 1954).
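The values q(1), q(2), q(3) can be checked numerically by discretizing the convolution q(y) = ∫ p(x) p(y - x) dx; the grid step is our choice:

```python
def p(x):
    """Unimodal two-level density: a on (0, 1), b on (1, 3), a = 2/3, b = 1/6."""
    a, b = 2.0 / 3.0, 1.0 / 6.0
    if 0 < x < 1:
        return a
    if 1 < x < 3:
        return b
    return 0.0

def q(y, step=0.001):
    """Density of X1 + X2: Riemann sum for q(y) = integral of p(x) p(y - x) dx."""
    n = round(3 / step)
    return sum(p(k * step) * p(y - k * step) for k in range(n + 1)) * step

q1, q2, q3 = q(1.0), q(2.0), q(3.0)
# expected: q(1) = a^2 = 4/9, q(2) = 2ab = 2/9, q(3) = 2ab + b^2 = 1/4
print(q1 > q2 < q3)
```

The local minimum at y = 2 between the larger values at y = 1 and y = 3 is exactly the failure of unimodality of the convolution.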
f_1(t) = (1/6) Σ_{k=0}^5 e^{itk},

which is the characteristic function of a discrete uniform distribution on the
set {0, 1, 2, 3, 4, 5}, and the functions
Then

f(t) = f_1(t) f_2(t) = g_1(t) g_2(t).

Each of the characteristic functions f_2(t) and g_2(t) corresponds to a two-point
distribution; hence they are indecomposable (see, e.g. (Lukacs, 1970)). Thus,
it only remains to show that f_1(t) and g_1(t) are also indecomposable. Suppose
that g_1(t) is decomposable: g_1(t) = g_11(t) g_12(t), where g_11(t) and g_12(t) are non-trivial
factors. It is obvious that g_1(t) corresponds to a distribution, say, G_1(x),
concentrated in three points, 0, 1, 2, each with probability 1/3. However,
the discontinuity points of G_1(x) are of the type x_j + y_k, where x_j and y_k are
discontinuity points of the distributions corresponding to the characteristic
functions g_11(t) and g_12(t) respectively (Lukacs, 1970).

Since G_1(x) has three discontinuity points and g_11(t), g_12(t) are non-trivial,
we conclude that
EXAMPLE 32. There exists a characteristic function f(t) which is not infinitely
divisible, whereas its absolute value |f(t)| is.

It is obvious that if f(t) is an infinitely divisible characteristic function,
then |f(t)| is also an infinitely divisible characteristic function. The present
counterexample, constructed in (Gnedenko & Kolmogorov, 1954), demonstrates
that the converse is false.
Let 0 < a < b < 1. Consider the function

f(t) = (1 - b)(1 + a e^{-it}) / ( (1 + a)(1 - b e^{it}) ).

It admits the representation

log f(t) = log( (1 - b)/(1 + a) ) + Σ_{k=1}^∞ (1/k)( b^k e^{ikt} - (-a)^k e^{-ikt} ).
We have
EXAMPLE 33. There exists a characteristic function f(t) which coincides with
the normal characteristic function e^{-t²/2} on some interval but f(t) ≢ e^{-t²/2}.
Consider the function
This function is continuous, decreasing, and convex for t > 0, hence, by Theo-
rem 1.3.9, it is a characteristic function.
Of course, an interval in examples of such kind cannot contain the origin:
if a characteristic function coincides with the normal characteristic function
276 Appendix A. Examples
where 0 < p < 1, 0 < q < 1, φ_d(t) and ψ_d(t) are characteristic functions of
discrete distributions, and φ_c(t) and ψ_c(t) are characteristic functions of
continuous distributions. Then, for all a > 0 (Kruglov, 1970), relations (A.7)
and (A.8) yield

    ∫_{-∞}^{∞} e^{εx^2} dG_d(x) < ∞,   ∫_{-∞}^{∞} e^{εx^2} dG_c(x) < ∞,
    ∫_{-∞}^{∞} e^{εx^2} dH_d(x) < ∞,   ∫_{-∞}^{∞} e^{εx^2} dH_c(x) < ∞,

for some ε > 0, while (A.6) and (A.9) imply that at least one of these relations
holds for all a > 0 in place of ε. This contradiction shows that (A.5) is
impossible.
EXAMPLE 35. Let F = F(a, c) be the set of characteristic functions whose dis-
tributions are absolutely continuous, concentrated on the same interval [-c, c],
and whose densities are bounded by the same constant a (a > 1/c). There does not
exist a function g(t) such that lim_{t→∞} g(t) = 0 and |f(t)| ≤ g(t) for all
f(t) ∈ F(a, c).

In other words, there exist ε > 0, a sequence of characteristic functions
f1(t), f2(t), ... from F(a, c), and a sequence t1, t2, ..., such that t_n → ∞ as
n → ∞, and

    lim inf_{n→∞} |f_n(t_n)| ≥ ε.
EXAMPLE 36. For any a > 0, ε > 0, and any real T there exists an absolute-
ly continuous distribution function F(x) with density p(x) and characteristic
function f(t) such that sup_x p(x) ≤ a and |f(T)| > 1 - ε.

Let arbitrary positive a, ε, and real T be fixed. Consider the set of the real
line

    A = {x : cos(Tx) > 1 - ε}.

The set A has infinite Lebesgue measure, so there exists a density p(x) which
is bounded by a and concentrated on A. For the corresponding characteristic
function,

    Re f(T) = ∫_{-∞}^{∞} cos(Tx) p(x) dx > 1 - ε,

hence |f(T)| > 1 - ε.
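The construction can be carried out concretely by spreading a low uniform density over windows where cos(Tx) > 1 - ε. The parameter values below are illustrative, not taken from the text.

```python
# Sketch of the construction in Example 36: a density bounded by a,
# concentrated where cos(T x) > 1 - eps, gives Re f(T) > 1 - eps.
import math

a, eps, T = 0.05, 0.1, 3.0
delta = math.acos(1 - eps)          # half-width (in T*x) of each window
N = math.ceil(T / (2 * a * delta))  # enough windows to keep the density <= a
height = T / (2 * N * delta)        # uniform density value on each window
assert height <= a + 1e-12

# Re f(T) = integral of cos(T x) p(x) dx; window k is centred at 2*pi*k/T.
# Evaluate by a midpoint Riemann sum over each window.
M = 2000
re_f = 0.0
for k in range(N):
    c = 2 * math.pi * k / T
    w = 2 * delta / T / M           # subinterval width
    for j in range(M):
        x = c - delta / T + (j + 0.5) * w
        re_f += math.cos(T * x) * height * w
assert re_f > 1 - eps
```

Since cos(Tx) > 1 - ε on the whole support and the total mass is 1, the integral automatically exceeds 1 - ε.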
EXAMPLE 37. The function f(t) = 1 - ||t|| for ||t|| ≤ 1 and f(t) = 0 for ||t|| > 1,
t ∈ R^m, is not a characteristic function if m > 1.

It is well known that in the one-dimensional case the function

    f(t) = 1 - |t|, |t| ≤ 1;   f(t) = 0, |t| > 1,

is a characteristic function (the Pólya criterion).
EXAMPLE 38. There exists a bivariate characteristic function f(s, t) which is the
product of univariate marginal characteristic functions in some neighbourhood
of the origin but not everywhere on the plane. In other words, there exists a
bivariate characteristic function f(s, t) such that f(s, t) = f1(s)f2(t), |s| < ε,
|t| < ε, for some ε > 0, where f1(s) and f2(t) are marginal characteristic
functions: f1(s) = f(s, 0), f2(t) = f(0, t), but f(s, t) ≢ f1(s)f2(t).

Consider the bivariate probability density

    p(x, y) = (2/π^2) ((1 - cos x)(1 - cos y)/(x^2 y^2)) cos^2(x + y).
EXAMPLE 39. There exists a symmetric (real) characteristic function f(t) which
has roots, but the real part of the corresponding empirical characteristic function
f_n(t) has no roots with some positive probability for any n = 1, 2, ... (the
probability bound does not depend on n).

It is obvious that the real part of an empirical characteristic function may
have roots even when the real part of the underlying characteristic function
does not (this is the case for the symmetric normal distribution or, more gen-
erally, for all symmetric stable laws). This example demonstrates that the
converse is also true.

Consider the distribution F concentrated in three points -1, 0, 1 with prob-
abilities 1/4, 1/2 and 1/4, respectively. The corresponding characteristic func-
tion is f(t) = (1/2)(1 + cos t) and has the roots t = π + 2πk, k = 0, ±1, ±2, ...
Let us demonstrate that

    P(u_n(t) > 0 for all t) ≥ 1/4

for any n = 1, 2, ..., where u_n(t) is the real part of the empirical characteristic
function f_n(t). Without loss of generality assume that n is even. Let X1, ..., Xn
be a random sample from the distribution F. Consider the event A_n that more
than half of the observations are equal to zero. On A_n, if k > n/2 denotes the
number of zero observations, then

    u_n(t) = k/n + ((n - k)/n) cos t > 0 for all t,

and P(A_n) ≥ 1/4, since the number of zero observations has the binomial
distribution with parameters n and 1/2.
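The binomial bound used above can be checked exactly. This is a verification sketch of the inequality P(A_n) ≥ 1/4 for even n; the helper name is illustrative.

```python
# Exact check of the bound in Example 39: the number k of zero observations
# is Bin(n, 1/2), and for even n we have P(k > n/2) >= 1/4.
from math import comb

def prob_more_than_half(n):
    """P(k > n/2) for k ~ Binomial(n, 1/2), computed exactly."""
    return sum(comb(n, k) for k in range(n // 2 + 1, n + 1)) / 2**n

for n in range(2, 41, 2):
    assert prob_more_than_half(n) >= 0.25
# On this event, u_n(t) = k/n + ((n-k)/n) cos t > 0 for every t.
```

The bound is attained at n = 2, where P(k = 2) = 1/4 exactly.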
Characteristic functions of some
distributions
In this appendix, we give formulas and graphs of some frequently used charac-
teristic functions and characteristic functions demonstrating interesting prop-
erties (as illustrations to the examples of Appendix A). Extensive tables of
characteristic functions can be found in (Oberhettinger, 1973). Almost all
distributions below are either absolutely continuous or integer-valued.
Distributions are given by the probability density function p(x) in the ab-
solutely continuous case and by the probabilities p(k) for discrete
distributions. The characteristic function is denoted by f(t). Parameters of a
distribution are indicated after the argument and separated from it by a
semicolon.

Some examples are included in some others as special cases (for instance,
the arcsine distribution is a special case of the beta distribution).
In the figures, the absolute value of a characteristic function and its real and
imaginary parts are represented, respectively, by the solid, dashed and dotted
curves. In the discrete case, the characteristic function is usually represented
on the interval [0, 2π]; in the absolutely continuous case, the length of the
interval where a characteristic function is represented is chosen so that all
essential information on the behavior of the characteristic function is available.
The letters j, k, l, m, n, M, and N (with or without indices) stand for integers.

The distributions are placed in alphabetical order.
282 Appendix B. Some characteristic functions
ARCSINE DISTRIBUTION

    p(x) = 1/(π √(x(1 - x))), 0 < x < 1; 0 otherwise.

    f(t) = e^{it/2} J0(t/2),

where J0 is the Bessel function of the first kind of order zero,

    J0(z) = Σ_{k=0}^{∞} ((-1)^k/(k!)^2) (z/2)^{2k}.
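The identity f(t) = e^{it/2} J0(t/2) can be verified by integrating the density directly, with the substitution x = sin^2 θ removing the endpoint singularities. A numerical sketch (the series truncation and grid size are illustrative):

```python
# Check the arcsine characteristic function f(t) = e^{it/2} J0(t/2)
# against (2/pi) * integral_0^{pi/2} exp(i t sin^2 theta) d theta.
import cmath, math

def J0(z, terms=40):
    """Bessel function of the first kind, order zero, via its power series."""
    return sum((-1)**k * (z / 2)**(2 * k) / math.factorial(k)**2
               for k in range(terms))

def f_integral(t, M=20000):
    """Midpoint Riemann sum for the substituted integral."""
    h = (math.pi / 2) / M
    s = sum(cmath.exp(1j * t * math.sin((j + 0.5) * h)**2) for j in range(M))
    return (2 / math.pi) * s * h

for t in [0.5, 1.0, 3.0]:
    assert abs(cmath.exp(1j * t / 2) * J0(t / 2) - f_integral(t)) < 1e-6
```

The check reduces, via sin^2 θ = (1 - cos 2θ)/2, to the classical integral representation of J0.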
BESSEL DISTRIBUTION

    p(x; p) = p e^{-x} I_p(x)/x, x > 0; 0 otherwise,

p > 0, where

    I_p(x) = Σ_{k=0}^{∞} (1/(k! Γ(k + p + 1))) (x/2)^{2k+p}

is the modified Bessel function;

    f(t) = [1 - it + √(-t^2 - 2it)]^{-p}.
BETA DISTRIBUTION

    p(x; p, q) = x^{p-1}(1 - x)^{q-1}/B(p, q), 0 < x < 1, p > 0, q > 0; 0 otherwise;

    f(t) = Σ_{k=0}^{∞} (Γ(p + k) Γ(p + q)/(Γ(p) Γ(p + q + k))) (it)^k/k!.
Figure B.3. The ch.f. of the Beta distribution with parameters p = 2, q = 1/2
Figure B.4. The ch.f. of the Beta distribution with parameters p = 1/2, q = 2
BINOMIAL DISTRIBUTION

    p(k; n, p) = C(n, k) p^k (1 - p)^{n-k}, k = 0, 1, ..., n, 0 < p < 1;

    f(t) = [1 + p(e^{it} - 1)]^n.
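The closed form [1 + p(e^{it} - 1)]^n is the binomial theorem applied to the defining sum, which a short check confirms (parameter values illustrative):

```python
# The binomial ch.f. [1 + p(e^{it} - 1)]^n equals
# sum_k C(n,k) p^k (1-p)^{n-k} e^{itk}.
import cmath
from math import comb

def f_closed(t, n=10, p=0.3):
    return (1 + p * (cmath.exp(1j * t) - 1))**n

def f_sum(t, n=10, p=0.3):
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) * cmath.exp(1j * t * k)
               for k in range(n + 1))

for t in [0.0, 1.3, -2.0]:
    assert abs(f_closed(t) - f_sum(t)) < 1e-12
```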
Figure B.6. The ch.f. of the binomial distribution with parameters p = 0.3, n = 10
CAUCHY DISTRIBUTION

    p(x; a, b) = a/(π(a^2 + (x - b)^2)), a > 0;

    f(t) = e^{ibt - a|t|}.

DISCRETE UNIFORM DISTRIBUTION

    p(k) = 1/(l - j + 1) for k = j, j + 1, ..., l; 0 for other k;

    f(t) = e^{ijt}(e^{i(l-j+1)t} - 1)/((l - j + 1)(e^{it} - 1)).

Figure B.9. The ch.f. of the discrete uniform distribution with j = 0, l = 5
EXPONENTIAL DISTRIBUTION

    p(x; λ) = λ e^{-λx}, x > 0, λ > 0; 0 otherwise;

    f(t) = λ/(λ - it).
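The formula f(t) = λ/(λ - it) can be checked against a truncated numerical integral of λe^{-λx}e^{itx} (the value λ = 2 and the grid are illustrative):

```python
# Verify the exponential ch.f. lambda/(lambda - it) by quadrature.
import cmath

def f_closed(t, lam=2.0):
    return lam / (lam - 1j * t)

def f_integral(t, lam=2.0, X=40.0, M=100000):
    """Midpoint Riemann sum of lam * exp((it - lam) x) over [0, X]."""
    h = X / M
    return sum(lam * cmath.exp((1j * t - lam) * ((j + 0.5) * h))
               for j in range(M)) * h

for t in [0.0, 1.0, 5.0]:
    assert abs(f_closed(t) - f_integral(t)) < 1e-4
```

The truncation at X = 40 is safe since the integrand decays like e^{-λx}.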
GAMMA DISTRIBUTION
GEOMETRICAL DISTRIBUTION

    p(k; p) = p(1 - p)^k, k = 0, 1, 2, ..., 0 < p < 1;

    f(t) = p/(1 - (1 - p)e^{it}).

Figure B.12. The ch.f. of the geometrical distribution with parameter p = 0.4
HYPERBOLIC SECANT DISTRIBUTION

    p(x) = 1/(π cosh x);

    f(t) = 1/cosh(πt/2).
HYPEREXPONENTIAL DISTRIBUTION

    p(x; α_1, ..., α_n, λ_1, ..., λ_n) = Σ_{k=1}^{n} α_k λ_k e^{-λ_k x}, x > 0; 0 otherwise,

where α_k ≥ 0, Σ_{k=1}^{n} α_k = 1, λ_k > 0;

    f(t) = Σ_{k=1}^{n} α_k λ_k/(λ_k - it).
LAPLACE DISTRIBUTION
Figure B.15. The ch.f. of the Laplace distribution with parameters a = 1, b = 0.5
LOGARITHMIC DISTRIBUTION

    p(k; p) = -(1 - p)^k/(k ln p), k = 1, 2, ..., 0 < p < 1;

    f(t) = ln(1 - (1 - p)e^{it})/ln p.
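The logarithmic characteristic function follows from the Mercator series ln(1 - z) = -Σ z^k/k; a quick check with the illustrative value p = 0.4:

```python
# The logarithmic ch.f. ln(1 - (1-p)e^{it})/ln p equals the defining sum
# over p(k) = -(1-p)^k / (k ln p), k = 1, 2, ...
import cmath, math

def f_closed(t, p=0.4):
    return cmath.log(1 - (1 - p) * cmath.exp(1j * t)) / math.log(p)

def f_sum(t, p=0.4, terms=400):
    q = 1 - p
    return sum(-q**k / (k * math.log(p)) * cmath.exp(1j * t * k)
               for k in range(1, terms))

for t in [0.0, 0.9, 2.2]:
    assert abs(f_closed(t) - f_sum(t)) < 1e-10
```

No branch-cut issue arises: 1 - (1-p)e^{it} stays in the right half-plane for 0 < p < 1.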
LOGISTIC DISTRIBUTION
NEGATIVE BINOMIAL DISTRIBUTION

    p(k; n, p) = C(n + k - 1, k) p^n (1 - p)^k, k = 0, 1, 2, ..., 0 < p < 1;

    f(t) = p^n [1 - (1 - p)e^{it}]^{-n}.

Figure B.18. The ch.f. of the negative binomial distribution with n = 5 and p = 0.7
NORMAL DISTRIBUTION

    p(x; μ, σ) = (1/(σ√(2π))) e^{-(x-μ)^2/(2σ^2)}, σ > 0;

    f(t) = exp{iμt - σ^2 t^2/2}.
PEARSON DISTRIBUTION OF THE 3RD TYPE

    p(x; a, p) = ((x - a)^{p-1} e^{-(x-a)})/Γ(p), x > a, p > 0; 0 otherwise;

    f(t) = e^{iat}(1 - it)^{-p}.

Figure B.20. The ch.f. of the Pearson distribution of the 3rd type with a = 3, p = 1
PEARSON DISTRIBUTION OF THE 7TH TYPE

    p(x; a) = (Γ(a)/(√π Γ(a - 1/2))) (1 + x^2)^{-a}, a > 1/2;

    f(t) = (2/Γ(a - 1/2)) (|t|/2)^{a-1/2} K_{a-1/2}(|t|),

where

    K_ν(z) = (π/(2 sin(πν))) [I_{-ν}(z) - I_ν(z)],

and

    I_ν(z) = Σ_{n=0}^{∞} (1/(n! Γ(ν + n + 1))) (z/2)^{ν+2n}

is the modified Bessel function.
Figure B.21. The ch.f. of the Pearson distribution of the 7th type with a = 1.5
POISSON DISTRIBUTION

    p(k; λ) = e^{-λ} λ^k/k!, k = 0, 1, 2, ..., λ > 0;

    f(t) = exp{λ(e^{it} - 1)}.

STABLE DISTRIBUTION

    f(t) = exp{iγt - λ|t|^α (1 + iβ sgn(t) ω(t, α))},

where

    ω(t, α) = tan(πα/2), α ≠ 1;   ω(t, α) = -(2/π) log|t|, α = 1.
STUDENT DISTRIBUTION

    p(x; λ) = (Γ((λ + 1)/2)/(√(λπ) Γ(λ/2))) (1 + x^2/λ)^{-(λ+1)/2}, λ > 0;

    f(t) = (1/(Γ(λ/2) 2^{λ/2-1})) (√λ |t|)^{λ/2} K_{λ/2}(√λ |t|).

TRIANGULAR DISTRIBUTION

    p(x; a, b) = (2/(b - a))(1 - |a + b - 2x|/(b - a)), a < x < b,
                 -∞ < a < ∞, b > a; 0 otherwise;

    f(t) = [2(e^{ibt/2} - e^{iat/2})/(it(b - a))]^2.
UNIFORM DISTRIBUTION

    p(x; a, b) = 1/(b - a), a < x < b, -∞ < a < ∞, b > a; 0 otherwise;

    f(t) = (e^{ibt} - e^{iat})/(it(b - a)).
    p(x; m1, m2) = (2 m1^{m1/2} m2^{m2/2}/B(m1/2, m2/2)) e^{m1 x} (m2 + m1 e^{2x})^{-(m1+m2)/2},
                   -∞ < x < ∞

(Fisher's z-distribution);

    f(t) = 1/(1 - 2it)^{m/2}

(the chi-square distribution with m degrees of freedom).
    p(x) = (x^2/√(2π)) e^{-x^2/2}, -∞ < x < ∞;

    f(t) = (1 - t^2) e^{-t^2/2}.
This characteristic function f(t) is negative for all |t| > 1 (see Example 10
of Appendix A). The next characteristic function possesses the same
property.
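The pair p(x) = (x^2/√(2π))e^{-x^2/2}, f(t) = (1 - t^2)e^{-t^2/2} can be verified by quadrature, and the negativity beyond |t| = 1 read off directly (grid parameters illustrative):

```python
# Check that the Fourier transform of (x^2/sqrt(2 pi)) exp(-x^2/2)
# is (1 - t^2) exp(-t^2/2), which is negative for |t| > 1.
import cmath, math

def f_closed(t):
    return (1 - t**2) * math.exp(-t**2 / 2)

def f_integral(t, X=10.0, M=20000):
    """Midpoint Riemann sum of x^2 e^{-x^2/2} e^{itx} / sqrt(2 pi)."""
    h = 2 * X / M
    s = 0j
    for j in range(M):
        x = -X + (j + 0.5) * h
        s += x**2 * math.exp(-x**2 / 2) * cmath.exp(1j * t * x)
    return s * h / math.sqrt(2 * math.pi)

for t in [0.0, 0.5, 2.0]:
    assert abs(f_closed(t) - f_integral(t)) < 1e-5
assert f_closed(2.0) < 0   # negative for |t| > 1
```

Equivalently, f(t) is the negated second derivative of the normal characteristic function e^{-t^2/2}, since the density is x^2 times the standard normal density.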
Figure B.31.
    p(x) = (2/π) x^2/(1 + x^2)^2;

    f(t) = (1 - |t|) e^{-|t|}.
Figure B.32.
    p(x) = 2/(π(1 + x^2)^2);

    f(t) = (1 + |t|) e^{-|t|}.
Figure B.33.
Figure B.34. a = 4
    p(x) = (1/4) e^{-|x|}(1 + |x|);

    f(t) = 1/(1 + t^2)^2.
Figure B.35.
Figure B.36.
Figure B.37. a = 1, b = 3
    p(x) = π/(4 cosh^2(πx/2));

    f(t) = t/sinh t.
Figure B.38.
    p(x) = x/(2 sinh(πx/2));

    f(t) = 1/cosh^2 t.
Figure B.39.
    f(t) = Γ(p + q) Γ(p - it)/(Γ(p) Γ(p + q - it)).
Figure B.42. = 1
    p(x; p, q) = (Γ(p + q)/(Γ(p)Γ(q))) e^{px}(1 + e^{x})^{-(p+q)},
                 -∞ < x < ∞, p > 0, q > 0;

    f(t) = Γ(p + it) Γ(q - it)/(Γ(p) Γ(q)).
Figure B.43. = 1, q = 5
Figure .44.
    p(k) = 1/2 for k = 0;   p(k) = (1 - cos(πk))/(π^2 k^2) for k = ±1, ±2, ...;

    f(t) = 1 - |t|/π, |t| ≤ π, and f(t + 2π) = f(t).
Figure B.45.
Figure B.46.
where the function is defined piecewise for |t| ≤ 1, for 1 < |t| ≤ 2, and for
|t| > 2.
Figure B.47.
    f(t) = p e^{-t^2/2} + (1 - p) cos t, 0 < p < 1.

For this characteristic function, the first positive zero of the real part
of the empirical characteristic function is not a consistent estimator of
the first positive zero of the characteristic function, although both of these
zeros exist and are isolated.
Figure B.48.
Bibliography
336 BIBLIOGRAPHY
Blum, J.R., and Susarla, V. (1977). A Fourier inversion method for the estima-
tion of a density and its derivatives. J. Austral. Math. Soc. A23,166-171.
Bräker, H.U., and Hüsler, J. (1991). On the first zero of an empirical charac-
teristic function. J. Appl. Prob. 28, 593-601.
Brown, B.M. (1972). Formulae for absolute moments. J. Australian Math. Soc.
13,104-106.
Chambers, R.L., and Heathcote, C.R. (1981). On the estimation of slope and
the identification of outliers in linear regression. Biometrika 68, 21-33.
Chiu, S.-T. (1991). Bandwidth selection for kernel density estimation. Ann.
Statist. 19, 1883-1905.
Chung, K.L. (1974). A Course in Probability Theory. Academic Press, London.
Csörgő, S. (1982). The empirical moment generating function. Coll. Math. Soc.
J. Bolyai 32: Nonparametric Statistical Inference (Gnedenko, B.V., Puri,
M.L., and Vincze, I., Eds). Elsevier, Amsterdam, pp. 139-150.
Csörgő, S. (1983). The theory of functional least squares. J. Austral. Math. Soc.
A34, 336-355.
Csörgő, S., and Heathcote, C.R. (1982). Some results concerning symmetric
distributions. Bull. Austral. Math. Soc. 25, 327-335.
Csörgő, S., and Heathcote, C.R. (1987). Testing for symmetry. Biometrika 74,
177-184.
Csörgő, S., and Teugels, J.L. (1990). Empirical Laplace transform and approx-
imation of compound distributions. J. Appl. Prob. 27, 88-101.
Csörgő, S., and Totik, V. (1983). On how long interval is the empirical charac-
teristic function uniformly consistent? Acta Sci. Math. 45, 141-149.
Daugavet, A.I., and Petrov, V.V. (1987). A generalization of the Esseen inequal-
ity for the concentration function. J. Soviet Math. 36, 473-476.
Davis, K.B. (1975). Mean square error properties of density estimates. Ann.
Statist. 3,1025-1030.
Davis, K.B. (1977). Mean integrated square error properties of density esti-
mates. Ann. Statist., 5, 530-535.
De Silva, B.M., and Griffiths, C.R. (1980). A test of independence for bivariate
symmetric stable distributions. Austral. J. Statist. 22, 172-177.
van Es, A.J. (1997). A note on the integrated squared error of a kernel density
estimator in non-smooth cases. Statist. Probab. Lett. 35, 241-250.
Epps, T.W., and Pulley, L.B. (1983). A test for normality based on the empirical
characteristic function. Biometrika 70, 723-726.
Fang, K.-T., Kotz, S., and Ng, K.W. (1989). Symmetric Multivariate and Related
Distributions. Chapman and Hall, London.
Feigin, P.D., and Heathcote, C.R. (1976). The empirical characteristic function
and the Cramér-von Mises statistic. Sankhyā A38, 309-325.
Feuerverger, A. (1987). On some ECF procedures for testing independence.
In: Time Series and Econometric Modeling (MacNeill, I.B., and Umphrey,
G.J., Eds). Reidel, New York, pp. 189-206.
Glad, I.K., Hjort, N.L., and Ushakov, N.G. (1999a). Upper Bounds for the MISE
of Kernel Density Estimators. Preprint Dept. Statistics, University of Oslo.
Glad, I.K., Hjort, N.L., and Ushakov, N.G. (1999b). Correction of Density Es-
timators which are not Densities. Preprint Dept. Statistics, University of
Oslo.
Glad, I.K., Hjort, N.L., and Ushakov, N.G. (1999c). Density Estimation Using
the Sinc Kernel. Preprint Dept. Statistics, University of Oslo.
Gnedenko, B.V., and Kolmogorov, A.N. (1954). Limit Distributions for Sums of
Independent Random Variables. Addison-Wesley, Reading, MA.
Götze, F., Prokhorov, Yu.V., and Ulyanov, V.V. (1996). Bounds for characteris-
tic functions of polynomials in asymptotically normal random variables.
Russian Math. Surveys 51,181-204.
Hall, P., and Murison, R.D. (1993). Correcting the negativity of high-order
kernel density estimators. J. Multivariate Anal. 47,103-122.
Hall, P., and Welsh, A.H. (1983). A test for normality based on the empirical
characteristic function. Biometrika 70, 485-489.
Heathcote, C.R. (1982). The theory of functional least squares. J. Appl. Probab.
A19, 225-239.
Heathcote, C.R., and Hsler, J. (1990). The first zero of an empirical charac-
teristic function. Stoch. Proc. Appl. 35, 347-360.
Henze, N., and Wagner, T. (1997). A new approach to the BHEP tests for
multivariate normality. J. Multivariate Anal. 62,1-23.
Henze, N., and Zirkler, B. (1990). A class of invariant and consistent tests for
multivariate normality. Commun. Statist. Theory Methods 19,3595-3617.
Hjort, N.L., and Glad, I.K. (1995). Nonparametric density estimation with a
parametric start. Ann. Statist. 23, 882-904.
Hodges, J.L., and Lehmann, E.L. (1954). Matching in paired comparisons. Ann.
Math. Statist. 25, 787-791.
Ibragimov, I.A., and Linnik, Yu.V. (1971). Independent and Stationary Se-
quences of Random Variables. Wolters-Noordhoff, Groningen.
Kagan, A.M., Linnik, Yu.V., and Rao, C.R. (1973). Characterization Problems
in Mathematical Statistics. Wiley, New York.
Kent, J.T. (1975). A weak convergence theorem for the empirical characteristic
function. J. Appl. Prob., 12, 515-523.
Kuznetsov, S.M., and Ushakov, N.G. (1986). The problem of the numerical
restoration of a wave front from the intensity distribution. U.S.S.R.
Comput. Maths. Math. Phys. 26, 100-103.
Lenth, R.V., Markatou, M., and Tsimikas, J. (1995). Robust tests based on the
sample characteristic function. Austral. J. Statist. 37, 45-60.
Linnik, Yu.V., and Ostrovskii, I.V. (1977). The Decomposition of Random Vari-
ables and Vectors. American Math. Soc., Providence.
Markatou, M., and Horowitz, J.L. (1995). Robust scale estimation in the error-
components model using the empirical characteristic function. Canad. J.
Statist. 23, 369-381.
Markatou, M., Horowitz, J.L., and Lenth, R.V. (1995). A robust scale estimator
based on the empirical characteristic function. Statist. Probab. Lett. 25,
185-192.
Meshalkin, L.D., and Rogozin, B.A. (1962). An estimate of the distance between
distribution functions in terms of the proximity of their characteristic
functions and its applications to the central limit theorem. In: Limit
Theorems of Probability Theory: Proc. All-Union Colloq., Fergana 1962.
Tashkent, 1963, pp. 49-55.
Paulson, A.S., Holcomb, E.W., and Leitch, R.A. (1975). The estimation of the
parameters of the stable laws. Biometrika 62,163-170.
Postnikova, L.P., and Yudin, A.A. (1977). On the concentration function. Theory
Probab. Appl. 22, 362-366.
Postnikova, L.P., and Yudin, A.A. (1980). An analytic method for estimates of
the concentration function. Proc. Steklov Inst. Math. 143,153-161.
Prokhorov, Yu.V. (1962). Extremal problems in limit theorems. In: Proc. Sixth
All-Union Conf. Theor. Probab. Math. Statist. (Vilnius, 1960). Gos. Izdat.
Politichesk. i Nauchn. Lit. Litovsk. SSR, Vilnius, pp. 77-84.
Prokhorov, Yu.V., and Rozanov, Yu.A. (1969). Probability Theory, Basic Con-
cepts, Limit Theorems, and Random Processes. Springer, Berlin.
Raikov, D.A. (1940). On positive definite functions. Soviet Math. Dokl. 26, 857-
862.
Ramachandran, B., and Rao, C.R. (1968). Some results on characteristic func-
tions and characterizations of the normal and generalized stable laws.
Sankhyā A30, 125-140.
Rao, C.R. (1965). Linear Statistical Inference and Its Applications. Wiley, New
York.
Sakovich, G.N. (1965). On the width of spectra. Dopovidi A.N. Ukr. SSR, 11,
1427-1430.
Silverman, B.W. (1986). Density Estimation for Statistics and Data Analysis.
Chapman & Hall, London.
Staudte, R.G., and Tata, M.N. (1970). Complex roots of real characteristic
functions. Proc. American Math. Soc. 25, 238-246.
Trigub, R.M. (1989). A criterion for the characteristic function and a test of the
Polya type for the radial functions of several variables. Theory Probab.
Appl. 34, 805-810.
Ushakov, N.G. (1982). On a problem of Renyi. Theory Probab. Appl. 27, 361-
362.
Ushakov, N.G. (1985). Upper estimates of maximum probability for sums of
independent random vectors. Theory Probab. Appl. 30, 38-49.
Ushakov, N.G. (1997). Lower and upper bounds for characteristic functions. J.
Math. Sci. 84, 1179-1189.
Ushakov, N.G., and Ushakov, V.G. (1999a). Some Inequalities for Characteristic
Functions of Densities with Bounded Variation. Preprint Dept. Statistics,
University of Oslo.
Ushakov, N.G., and Ushakov, V.G. (1999b). Some inequalities for multivariate
characteristic functions. J. Math. Sci. (to appear).
Ushakov, V.G., and Ushakov, N.G. (1984). On indecomposable laws with in-
finitely divisible projections. Theory Probab. Applic., 29, No. 3, 596-598.
Ushakov, V.G., and Ushakov, N.G. (1999). Several inequalities for characteris-
tic functions. Vestnik Mosk. Univ., Ser.15 (to appear).
Wand, M.P., and Jones, M.C. (1995). Kernel Smoothing. Chapman & Hall,
London.
Watson, G.S., and Leadbetter, M.R. (1963). On the estimation of the probability
density. Ann. Math. Statist. 34, 480-491.
Welsh, A.H. (1984). A note on scale estimates based on the empirical character-
istic function and their application to test for normality. Statist. Probab.
Lett., 2, 345-348.
Welsh, A.H. (1985). An angular approach for linear data. Biometrika 72, 441-
450.
Zolotarev, V.M., and Senatov, V.V. (1975). Two-sided estimates of Levy's metric.
Theory Probab. Appl. 20, 234-245.
Zolotarev, V.M., and Uchaikin, V.V. (1999). Chance and Stability. VSP, Utrecht.
352 SUBJECT INDEX
  multi-dimensional 55
convolution theorem 3
  multi-dimensional 55
covariance function 10
Cramér criterion 9
Cramér theorem 53
  multi-dimensional 56
derivatives of characteristic function 22, 39-40, 88
discrete unimodal distribution 46
  in multi-dimensional case 60
empirical characteristic function 160
  multi-dimensional 163
empirical characteristic process 174
empirical distribution function 160
expansion of characteristic function 40-41, 88
function of bounded variation 32, 79
integrated squared error estimator 195
inversion theorem 5, 35
  for density 6
  for lattice distributions 7
  multi-dimensional 56
    for density 56
kernel density estimator 199
Khintchine criterion 9
Lévy metrics 34
  λ-metrics 35
moments, absolute 39
nondegenerate distribution 114
nonnegative definite function 8
Parseval equality 7
Parseval-Plancherel identity 7, 8, 36
  multi-dimensional 57
Pólya criterion 19, 58, 63
principal value 6, 133
projection 57, 61
  of normal distribution 61
  of spherically symmetric distribution 62
projection estimator 194
sinc estimator 220-226
sinc kernel 201, 219-226
smoothing parameter 199
spherically symmetric distribution 61-64, 129
stable distribution 188
sufficient conditions for characteristic function 16-22
superkernel 219
symmetrization 17
Trigub criterion 11
truncation inequalities 30
unimodal distribution 43
  spherically symmetric 62
uniqueness theorem 2
354 AUTHOR INDEX
Young, D. 50
Yudin, A.A. 65
Zaitsev, A.Yu. 35
Zirkler, B. 248, 257, 258
Zolotarev, V.M. 9, 34, 42, 65, 152,
188
Zygmund, A. 271