
Chapter 5 Random Variables and Processes

Wireless Information Transmission System Lab., Institute of Communications Engineering, National Sun Yat-sen University

Table of Contents

5.1 Introduction
5.2 Probability
5.3 Random Variables
5.4 Statistical Averages
5.5 Random Processes
5.6 Mean, Correlation and Covariance Functions
5.7 Transmission of a Random Process through a Linear Filter
5.8 Power Spectral Density
5.9 Gaussian Process
5.10 Noise
5.11 Narrowband Noise

5.1 Introduction

The Fourier transform is a mathematical tool for the representation of deterministic signals. Deterministic signals: the class of signals that may be modeled as completely specified functions of time. A signal is random if it is not possible to predict its precise value in advance. A random process consists of an ensemble (family) of sample functions, each of which varies randomly with time. A random variable is obtained by observing a random process at a fixed instant of time.

5.2 Probability

Probability theory is rooted in phenomena that, explicitly or implicitly, can be modeled by an experiment with an outcome that is subject to chance. Example: the experiment may be the observation of the result of tossing a fair coin. In this experiment, the possible outcomes of a trial are heads or tails. If an experiment has K possible outcomes, then for the kth possible outcome we have a point called the sample point, which we denote by s_k. With this basic framework, we make the following definitions: The set of all possible outcomes of the experiment is called the sample space, which we denote by S. An event corresponds to either a single sample point or a set of sample points in the space S.

5.2 Probability
A single sample point is called an elementary event. The entire sample space S is called the sure event; the null set is called the null or impossible event. Two events are mutually exclusive if the occurrence of one event precludes the occurrence of the other event. A probability measure P is a function that assigns a non-negative number to an event A in the sample space S and satisfies the following three properties (axioms):

1. 0 ≤ P[A] ≤ 1  (5.1)
2. P[S] = 1  (5.2)
3. If A and B are two mutually exclusive events, then P[A ∪ B] = P[A] + P[B]  (5.3)

5.2 Probability

The following properties of the probability measure P may be derived from the above axioms:

1. P[Ā] = 1 − P[A]  (5.4)
2. When events A and B are not mutually exclusive: P[A ∪ B] = P[A] + P[B] − P[A ∩ B]  (5.5)
3. If A1, A2, ..., Am are mutually exclusive events that include all possible outcomes of the random experiment, then P[A1] + P[A2] + ... + P[Am] = 1  (5.6)

5.2 Probability

Let P[B|A] denote the probability of event B, given that event A has occurred. The probability P[B|A] is called the conditional probability of B given A. P[B|A] is defined by
P[B|A] = P[A ∩ B] / P[A]  (5.7)

Bayes' rule

We may write Eq. (5.7) as P[A ∩ B] = P[B|A]P[A]  (5.8). It is apparent that we may also write P[A ∩ B] = P[A|B]P[B]  (5.9). From Eqs. (5.8) and (5.9), provided P[A] ≠ 0, we may determine P[B|A] by using the relation

P[B|A] = P[A|B]P[B] / P[A]  (5.10)

5.2 Conditional Probability

Suppose that the conditional probability P[B|A] is simply equal to the elementary probability of occurrence of event B, that is
P[B|A] = P[B]

so that

P[A ∩ B] = P[B|A]P[A] = P[A]P[B]

and

P[A|B] = P[A ∩ B] / P[B] = P[A]P[B] / P[B] = P[A]  (5.13)

Events A and B that satisfy this condition are said to be statistically independent.

5.2 Conditional Probability

Example 5.1 Binary Symmetric Channel

This channel is said to be discrete in that it is designed to handle discrete messages. The channel is memoryless in the sense that the channel output at any time depends only on the channel input at that time. The channel is symmetric, which means that the probability of receiving symbol 1 when 0 is sent is the same as the probability of receiving symbol 0 when symbol 1 is sent.


5.2 Conditional Probability

Example 5.1 Binary Symmetric Channel (continued)

The a priori probabilities of sending binary symbols 0 and 1:


P[A0] = p0,  P[A1] = p1

The conditional probabilities of error:

P[B1|A0] = P[B0|A1] = p

The probability of receiving symbol 0 is given by:

P[B0] = P[B0|A0]P[A0] + P[B0|A1]P[A1] = (1 − p)p0 + p·p1

The probability of receiving symbol 1 is given by:


P[B1] = P[B1|A0]P[A0] + P[B1|A1]P[A1] = p·p0 + (1 − p)p1

5.2 Conditional Probability

Example 5.1 Binary Symmetric Channel (continued)

The a posteriori probabilities P[A0|B0] and P[A1|B1]:


P[A0|B0] = P[B0|A0]P[A0] / P[B0] = (1 − p)p0 / ((1 − p)p0 + p·p1)

P[A1|B1] = P[B1|A1]P[A1] / P[B1] = (1 − p)p1 / (p·p0 + (1 − p)p1)
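As a quick numerical illustration (my own sketch, not part of the original slides), the code below evaluates these expressions for assumed values p0 = 0.6, p1 = 0.4 and crossover probability p = 0.1; the numbers are hypothetical.

```python
# Hedged sketch: a posteriori probabilities of a binary symmetric channel
# for assumed (hypothetical) priors and crossover probability.
p0, p1 = 0.6, 0.4   # a priori probabilities of sending 0 and 1 (assumed values)
p = 0.1             # crossover probability P[B1|A0] = P[B0|A1] (assumed value)

P_B0 = (1 - p) * p0 + p * p1          # probability of receiving symbol 0
P_B1 = p * p0 + (1 - p) * p1          # probability of receiving symbol 1

P_A0_given_B0 = (1 - p) * p0 / P_B0   # a posteriori probability P[A0|B0]
P_A1_given_B1 = (1 - p) * p1 / P_B1   # a posteriori probability P[A1|B1]

print(P_B0, P_B1, P_A0_given_B0, P_A1_given_B1)
```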


5.3 Random Variables


We denote the random variable as X(s) or just X. Note that X is a function. A random variable may be discrete or continuous. Consider the random variable X and the probability of the event X ≤ x. We denote this probability by P[X ≤ x]. To simplify our notation, we write
F_X(x) = P[X ≤ x]  (5.15)

The function FX(x) is called the cumulative distribution function (cdf) or simply the distribution function of the random variable X. The distribution function FX(x) has the following properties:
1. 0 ≤ F_X(x) ≤ 1
2. F_X(x1) ≤ F_X(x2) if x1 < x2

5.3 Random Variables

There may be more than one random variable associated with the same random experiment.

5.3 Random Variables

If the distribution function is continuously differentiable, then


f_X(x) = dF_X(x)/dx  (5.17)

f_X(x) is called the probability density function (pdf) of the random variable X. The probability of the event x1 < X ≤ x2 equals
P[x1 < X ≤ x2] = P[X ≤ x2] − P[X ≤ x1] = F_X(x2) − F_X(x1) = ∫ from x1 to x2 of f_X(x) dx

F_X(x) = ∫ from −∞ to x of f_X(ξ) dξ  (5.19)

A probability density function must always be nonnegative, with a total area of one.

5.3 Random Variables

Example 5.2 Uniform Distribution


f_X(x) = 0 for x ≤ a;  1/(b − a) for a < x ≤ b;  0 for x > b

F_X(x) = 0 for x ≤ a;  (x − a)/(b − a) for a < x ≤ b;  1 for x > b

5.3 Random Variables

Several Random Variables

Consider two random variables X and Y. We define the joint distribution function F_{X,Y}(x,y) as the probability that the random variable X is less than or equal to a specified value x and that the random variable Y is less than or equal to a specified value y.

F_{X,Y}(x,y) = P[X ≤ x, Y ≤ y]  (5.23)

Suppose that the joint distribution function F_{X,Y}(x,y) is continuous everywhere, and that the partial derivative

f_{X,Y}(x,y) = ∂²F_{X,Y}(x,y) / ∂x∂y  (5.24)

exists and is continuous everywhere. We call the function fX,Y(x,y) the joint probability density function of the random variables X and Y.

5.3 Random Variables

Several Random Variables

The joint distribution function F_{X,Y}(x,y) is a monotone nondecreasing function of both x and y.

∫∫ f_{X,Y}(ξ, η) dξ dη = 1  (integrated over all ξ and η)

Marginal density fX(x)


F_X(x) = ∫ from −∞ to x ∫ from −∞ to ∞ of f_{X,Y}(ξ, η) dη dξ

f_X(x) = ∫ from −∞ to ∞ of f_{X,Y}(x, η) dη  (5.27)

Suppose that X and Y are two continuous random variables with joint probability density function fX,Y(x,y). The conditional probability density function of Y given that X = x is defined by
f_Y(y|x) = f_{X,Y}(x,y) / f_X(x)  (5.28)

5.3 Random Variables

Several Random Variables

If the random variables X and Y are statistically independent, then knowledge of the outcome of X can in no way affect the distribution of Y.
By (5.28), f_Y(y|x) = f_Y(y), so that

f_{X,Y}(x,y) = f_X(x) f_Y(y)  (5.32)

P[X ∈ A, Y ∈ B] = P[X ∈ A] P[Y ∈ B]  (5.33)


5.3 Random Variables

Example 5.3 Binomial Random Variable

Consider a sequence of coin-tossing experiments where the probability of a head is p, and let Xn be the Bernoulli random variable representing the outcome of the nth toss. Let Y be the number of heads that occur in N tosses of the coin:

Y = Σ from n=1 to N of X_n

P[Y = y] = C(N, y) p^y (1 − p)^(N−y),  where C(N, y) = N! / (y!(N − y)!)

5.4 Statistical Averages

The expected value or mean of a random variable X is defined by

μ_X = E[X] = ∫ from −∞ to ∞ of x f_X(x) dx  (5.36)

Function of a Random Variable

Let X denote a random variable, and let g(X) denote a real-valued function defined on the real line. We denote
Y = g(X)  (5.37)

We wish to find the expected value of the random variable Y.

E[Y] = ∫ from −∞ to ∞ of y f_Y(y) dy = E[g(X)] = ∫ from −∞ to ∞ of g(x) f_X(x) dx  (5.38)

5.4 Statistical Averages

Example 5.4 Cosinusoidal Random Variable


Let Y = g(X) = cos(X), where X is a random variable uniformly distributed in the interval (−π, π).

f_X(x) = 1/(2π) for −π < x < π;  0 otherwise

E[Y] = ∫ from −π to π of cos(x) · 1/(2π) dx = (1/2π)[sin x] evaluated from x = −π to π = 0
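A short Monte Carlo check of this result (my own sketch, not from the slides): drawing X uniformly on (−π, π) and averaging cos(X) should give a value close to zero.

```python
# Hedged sketch: Monte Carlo estimate of E[cos(X)] for X ~ Uniform(-pi, pi).
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-np.pi, np.pi, size=1_000_000)  # samples of X
print(np.mean(np.cos(x)))                        # should be close to 0
```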

5.4 Statistical Averages

Moments

For the special case of g(X) = Xⁿ, we obtain the nth moment of the probability distribution of the random variable X; that is
E[Xⁿ] = ∫ from −∞ to ∞ of xⁿ f_X(x) dx  (5.39)

Mean-square value of X:

E[X²] = ∫ from −∞ to ∞ of x² f_X(x) dx  (5.40)

The nth central moment is


E[(X − μ_X)ⁿ] = ∫ from −∞ to ∞ of (x − μ_X)ⁿ f_X(x) dx  (5.41)


5.4 Statistical Averages

For n = 2 the second central moment is referred to as the variance of the random variable X, written as
var[X] = E[(X − μ_X)²] = ∫ from −∞ to ∞ of (x − μ_X)² f_X(x) dx  (5.42)

The variance of a random variable X is commonly denoted as σ_X². The square root of the variance is called the standard deviation of the random variable X.
σ_X² = var[X] = E[(X − μ_X)²]
     = E[X² − 2μ_X X + μ_X²]
     = E[X²] − 2μ_X E[X] + μ_X²
     = E[X²] − μ_X²  (5.44)

5.4 Statistical Averages

Chebyshev inequality

Suppose X is an arbitrary random variable with finite mean m_x and finite variance σ_x². For any positive number δ:

P(|X − m_x| ≥ δ) ≤ σ_x² / δ²

Proof:

σ_x² = ∫ from −∞ to ∞ of (x − m_x)² p(x) dx
     ≥ ∫ over |x − m_x| ≥ δ of (x − m_x)² p(x) dx
     ≥ δ² ∫ over |x − m_x| ≥ δ of p(x) dx = δ² P(|X − m_x| ≥ δ)


5.4 Statistical Averages

Chebyshev inequality

Another way to view the Chebyshev bound is to work with the zero-mean random variable Y = X − m_x. Define a function g(Y) as:
g(Y) = 1 for |Y| ≥ δ;  0 for |Y| < δ

and E[g(Y)] = P(|Y| ≥ δ)

Upper-bound g(Y) by the quadratic (Y/δ)², i.e. g(Y) ≤ (Y/δ)².

The tail probability is then bounded by E[g(Y)] ≤ E[(Y/δ)²] = E[Y²]/δ² = σ_y²/δ² = σ_x²/δ²


5.4 Statistical Averages

Chebyshev inequality

A quadratic upper bound on g(Y) used in obtaining the tail probability (Chebyshev bound)

For many practical applications, the Chebyshev bound is extremely loose.
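To see how loose the bound can be, the sketch below (my own illustration, not from the slides) compares the Chebyshev bound σ²/δ² with the exact tail probability of a standard Gaussian random variable.

```python
# Hedged sketch: Chebyshev bound vs. the true tail probability
# P(|X - m| >= delta) for a standard Gaussian random variable.
import numpy as np
from math import erf

sigma = 1.0
for delta in (1.0, 2.0, 3.0):
    chebyshev = sigma**2 / delta**2               # Chebyshev upper bound
    exact = 1.0 - erf(delta / np.sqrt(2.0))       # true Gaussian tail P(|X| >= delta)
    print(delta, chebyshev, exact)
```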



5.4 Statistical Averages

The characteristic function φ_X(jv) is defined as the expectation of the complex exponential exp(jvX), as shown by

φ_X(jv) = E[exp(jvX)] = ∫ from −∞ to ∞ of f_X(x) exp(jvx) dx  (5.45)

In other words, the characteristic function φ_X(jv) is the Fourier transform of the probability density function f_X(x).

Analogously, by the inverse Fourier transform:


f_X(x) = (1/2π) ∫ from −∞ to ∞ of φ_X(jv) exp(−jvx) dv  (5.46)


5.4 Statistical Averages

Characteristic functions

First moment (mean) can be obtained by:


E(X) = m_x = −j · dφ(jv)/dv evaluated at v = 0

Since the differentiation process can be repeated, the nth moment can be calculated by:

E(Xⁿ) = (−j)ⁿ · dⁿφ(jv)/dvⁿ evaluated at v = 0


5.4 Statistical Averages

Characteristic functions

Determining the PDF of a sum of statistically independent random variables:


Y = Σ from i=1 to n of X_i

φ_Y(jv) = E(e^{jvY}) = E[exp(jv Σ from i=1 to n of X_i)] = E[Π from i=1 to n of e^{jvX_i}]
        = ∫...∫ (Π from i=1 to n of e^{jvx_i}) p(x_1, x_2, ..., x_n) dx_1 dx_2 ... dx_n

Since the random variables are statistically independent,

p(x_1, x_2, ..., x_n) = p(x_1) p(x_2) ... p(x_n), so φ_Y(jv) = Π from i=1 to n of φ_{X_i}(jv)

If the X_i are i.i.d. (independent and identically distributed), then

φ_Y(jv) = [φ_X(jv)]ⁿ

5.4 Statistical Averages

Characteristic functions

The PDF of Y is determined from the inverse Fourier transform of φ_Y(jv). Since the characteristic function of the sum of n statistically independent random variables is equal to the product of the characteristic functions of the individual random variables, it follows that the PDF of Y is the n-fold convolution of the PDFs of the X_i. Usually, the n-fold convolution is more difficult to perform than the characteristic-function method in determining the PDF of Y.
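As a numerical illustration (my own sketch, not in the original slides), the PDF of the sum of n i.i.d. Uniform(0,1) random variables can be obtained by convolving the uniform PDF with itself n − 1 times; it agrees with a Monte Carlo estimate of the density of the sum. The value n = 4 and the evaluation point are assumptions made for the example.

```python
# Hedged sketch: PDF of a sum of n i.i.d. Uniform(0,1) variables
# via (n-1)-fold convolution of the individual PDF, checked against simulation.
import numpy as np

n = 4                      # number of i.i.d. terms (assumed value)
dx = 0.001
x = np.arange(0.0, 1.0, dx)
pdf1 = np.ones_like(x)     # Uniform(0,1) PDF on its support

pdf_sum = pdf1.copy()
for _ in range(n - 1):     # n-fold convolution of the individual PDFs
    pdf_sum = np.convolve(pdf_sum, pdf1) * dx

rng = np.random.default_rng(1)
samples = rng.uniform(size=(1_000_000, n)).sum(axis=1)

# Compare the convolved PDF with an empirical density estimate at y = n/2.
grid = np.arange(len(pdf_sum)) * dx
idx = np.argmin(np.abs(grid - n / 2))
empirical = np.mean(np.abs(samples - n / 2) < 0.05) / 0.1
print(pdf_sum[idx], empirical)
```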


5.4 Statistical Averages

Example 5.5 Gaussian Random Variable

The probability density function of a Gaussian random variable is defined by:

f_X(x) = (1/(√(2π) σ_X)) exp(−(x − μ_X)²/(2σ_X²)),  −∞ < x < ∞

The characteristic function of a Gaussian random variable with mean m_x and variance σ² is (Problem 5.1):

φ(jv) = (1/(√(2π) σ)) ∫ from −∞ to ∞ of e^{jvx} e^{−(x − m_x)²/(2σ²)} dx = e^{jv m_x − (1/2)v²σ²}

It can be shown that the central moments of a Gaussian random variable are given by:

E[(X − m_x)^k] = μ_k = 1·3·5···(k − 1)·σ^k for even k;  0 for odd k


5.4 Statistical Averages

Example 5.5 Gaussian Random Variable (cont.)


The sum of n statistically independent Gaussian random variables is also a Gaussian random variable. Proof:

Y = Σ from i=1 to n of X_i

φ_Y(jv) = Π from i=1 to n of φ_{X_i}(jv) = Π from i=1 to n of e^{jv m_i − v² σ_i²/2} = e^{jv m_y − v² σ_y²/2}

where m_y = Σ from i=1 to n of m_i and σ_y² = Σ from i=1 to n of σ_i².

Therefore, Y is Gaussian-distributed with mean m_y and variance σ_y².
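A quick simulation check of this result (my own sketch): the sum of independent Gaussians with assumed means and variances has sample mean close to Σm_i and sample variance close to Σσ_i².

```python
# Hedged sketch: the sum of independent Gaussian random variables is Gaussian
# with mean sum(m_i) and variance sum(sigma_i^2).  Parameter values are assumed.
import numpy as np

means = [1.0, -2.0, 0.5]       # m_i (assumed)
sigmas = [1.0, 2.0, 0.5]       # sigma_i (assumed)

rng = np.random.default_rng(2)
y = sum(rng.normal(m, s, size=1_000_000) for m, s in zip(means, sigmas))

print(y.mean(), sum(means))                 # sample mean vs. sum of means
print(y.var(), sum(s**2 for s in sigmas))   # sample variance vs. sum of variances
```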

5.4 Statistical Averages

Joint Moments

Consider next a pair of random variables X and Y. A set of statistical averages of importance in this case are the joint moments, namely, the expected value of XⁱYᵏ, where i and k may assume any positive integer values. We may thus write
E[XⁱYᵏ] = ∫∫ xⁱ yᵏ f_{X,Y}(x,y) dx dy  (5.51)

A joint moment of particular importance is the correlation, defined by E[XY], which corresponds to i = k = 1. The covariance of X and Y is:

cov[XY] = E[(X − E[X])(Y − E[Y])] = E[XY] − μ_X μ_Y  (5.53)

5.4 Statistical Averages

Correlation coefficient of X and Y :

ρ = cov[XY] / (σ_X σ_Y)  (5.54)

where σ_X² and σ_Y² denote the variances of X and Y.

We say X and Y are uncorrelated if and only if cov[XY] = 0. Note that if X and Y are statistically independent, then they are uncorrelated. The converse of this statement is not necessarily true. We say X and Y are orthogonal if and only if E[XY] = 0.

5.4 Statistical Averages

Example 5.6 Moments of a Bernoulli Random Variable

Consider the coin-tossing experiment where the probability of a head is p. Let X be a random variable that takes the value 0 if the result is a tail and 1 if it is a head. We say that X is a Bernoulli random variable.
P(X = x) = 1 − p for x = 0;  p for x = 1;  0 otherwise

E[X] = Σ from k=0 to 1 of k·P(X = k) = 0·(1 − p) + 1·p = p

σ_X² = Σ from k=0 to 1 of (k − μ_X)² P[X = k] = (0 − p)²(1 − p) + (1 − p)²·p = p(1 − p)

For the outcomes of two tosses, X_j and X_k:

E[X_j X_k] = E[X_j]E[X_k] = p² for j ≠ k,  and E[X_j X_k] = E[X_j²] = p for j = k,

where E[X_j²] = Σ from k=0 to 1 of k² P[X = k].
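A small numerical check of these moments (my own sketch): simulate Bernoulli(p) outcomes and compare the sample mean and variance with p and p(1 − p); the value of p is assumed.

```python
# Hedged sketch: empirical mean and variance of a Bernoulli(p) random variable,
# compared with the analytical values p and p(1 - p).  p is an assumed value.
import numpy as np

p = 0.3
rng = np.random.default_rng(3)
x = (rng.uniform(size=1_000_000) < p).astype(float)  # Bernoulli samples

print(x.mean(), p)              # E[X] = p
print(x.var(), p * (1 - p))     # var[X] = p(1 - p)
```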

5.5 Random Processes


An ensemble of sample functions.

For a fixed time instant t_k, {x_1(t_k), x_2(t_k), ..., x_n(t_k)} = {X(t_k, s_1), X(t_k, s_2), ..., X(t_k, s_n)} constitutes a random variable.


5.5 Random Processes

At any given time instant, the value of a stochastic process is a random variable indexed by the parameter t. We denote such a process by X(t). In general, the parameter t is continuous, whereas X may be either continuous or discrete, depending on the characteristics of the source that generates the stochastic process. The noise voltage generated by a single resistor or a single information source represents a single realization of the stochastic process. It is called a sample function.

5.5 Random Processes

The set of all possible sample functions constitutes an ensemble of sample functions or, equivalently, the stochastic process X(t). In general, the number of sample functions in the ensemble is assumed to be extremely large; often it is infinite. Having defined a stochastic process X(t) as an ensemble of sample functions, we may consider the values of the process at any set of time instants t_1 > t_2 > t_3 > ... > t_n, where n is any positive integer.

In general, the random variables X_{t_i} ≡ X(t_i), i = 1, 2, ..., n, are characterized statistically by their joint PDF p(x_{t_1}, x_{t_2}, ..., x_{t_n}).

5.5 Random Processes

Stationary stochastic processes

Consider another set of n random variables X_{t_i + t} ≡ X(t_i + t), i = 1, 2, ..., n, where t is an arbitrary time shift. These random variables are characterized by the joint PDF p(x_{t_1 + t}, x_{t_2 + t}, ..., x_{t_n + t}).

The joint PDFs of the random variables X_{t_i} and X_{t_i + t}, i = 1, 2, ..., n, may or may not be identical. When they are identical, i.e., when

p(x_{t_1}, x_{t_2}, ..., x_{t_n}) = p(x_{t_1 + t}, x_{t_2 + t}, ..., x_{t_n + t})

for all t and all n, the process is said to be stationary in the strict sense (SSS). When the joint PDFs are different, the stochastic process is non-stationary.

5.5 Random Processes


Averages for a stochastic process are called ensemble averages. The nth moment of the random variable X_{t_i} is defined as:

E(X_{t_i}ⁿ) = ∫ from −∞ to ∞ of x_{t_i}ⁿ p(x_{t_i}) dx_{t_i}

In general, the value of the nth moment will depend on the time instant t_i if the PDF of X_{t_i} depends on t_i. When the process is stationary, p(x_{t_i + t}) = p(x_{t_i}) for all t. Therefore, the PDF is independent of time and, as a consequence, the nth moment is independent of time.


5.5 Random Processes

Two random variables: X_{t_i} ≡ X(t_i), i = 1, 2.

The correlation is measured by the joint moment:


E(X_{t_1} X_{t_2}) = ∫∫ x_{t_1} x_{t_2} p(x_{t_1}, x_{t_2}) dx_{t_1} dx_{t_2}

Since this joint moment depends on the time instants t_1 and t_2, it is denoted by R_X(t_1, t_2). R_X(t_1, t_2) is called the autocorrelation function of the stochastic process. For a stationary stochastic process, the joint moment is:

E(X_{t_1} X_{t_2}) = R_X(t_1, t_2) = R_X(t_1 − t_2) = R_X(τ)

R_X(−τ) = E(X_{t_1} X_{t_1 + τ}) = E(X_{t_1' − τ} X_{t_1'}) = R_X(τ),  with t_1' = t_1 + τ

so the autocorrelation function of a stationary process is an even function of τ.

Average power in the process X(t): R_X(0) = E(X_t²).



5.5 Random Processes

Wide-sense stationary (WSS)

A wide-sense stationary process has the property that the mean value of the process is independent of time (a constant) and that the autocorrelation function satisfies the condition R_X(t_1, t_2) = R_X(t_1 − t_2). Wide-sense stationarity is a less stringent condition than strict-sense stationarity.


5.5 Random Processes

Auto-covariance function

The auto-covariance function of a stochastic process is defined as:

μ(t_1, t_2) = E[(X_{t_1} − m(t_1))(X_{t_2} − m(t_2))] = R_X(t_1, t_2) − m(t_1)m(t_2)

When the process is stationary, the auto-covariance function simplifies to:

μ(t_1, t_2) = μ(t_1 − t_2) = μ(τ) = R_X(τ) − m²

For a Gaussian random process, higher-order moments can be expressed in terms of first and second moments. Consequently, a Gaussian random process is completely characterized by its first two moments.

5.6 Mean, Correlation and Covariance Functions

Consider a random process X(t). We define the mean of the process X(t) as the expectation of the random variable obtained by observing the process at some time t, as shown by

μ_X(t) = E[X(t)] = ∫ from −∞ to ∞ of x f_{X(t)}(x) dx  (5.57)

A random process is said to be stationary to first order if the distribution function (and therefore the density function) of X(t) does not vary with time.

f_{X(t_1)}(x) = f_{X(t_2)}(x) for all t_1 and t_2

μ_X(t) = μ_X for all t  (5.59)

The mean of the random process is a constant. The variance of such a process is also constant.


5.6 Mean, Correlation and Covariance Functions

We define the autocorrelation function of the process X(t) as the expectation of the product of two random variables X(t_1) and X(t_2).
R_X(t_1, t_2) = E[X(t_1)X(t_2)] = ∫∫ x_1 x_2 f_{X(t_1),X(t_2)}(x_1, x_2) dx_1 dx_2  (5.60)

We say a random process X(t) is stationary to second order if the joint distribution f_{X(t_1),X(t_2)}(x_1, x_2) depends only on the difference between the observation times t_1 and t_2.
R_X(t_1, t_2) = R_X(t_2 − t_1) for all t_1 and t_2  (5.61)

The autocovariance function of a stationary random process X(t) is written as


C_X(t_1, t_2) = E[(X(t_1) − μ_X)(X(t_2) − μ_X)] = R_X(t_2 − t_1) − μ_X²  (5.62)


5.6 Mean, Correlation and Covariance Functions

For convenience of notation, we redefine the autocorrelation function of a stationary process X(t) as

R_X(τ) = E[X(t + τ)X(t)] for all t  (5.63)

This autocorrelation function has several important properties:


1. R_X(0) = E[X²(t)]  (5.64)
2. R_X(τ) = R_X(−τ)  (5.65)
3. |R_X(τ)| ≤ R_X(0)  (5.67)

Proof of (5.64) can be obtained from (5.63) by putting τ = 0.


5.6 Mean, Correlation and Covariance Functions

Proof of (5.65):
R_X(τ) = E[X(t + τ)X(t)] = E[X(t)X(t + τ)] = R_X(−τ)

Proof of (5.67):
E[(X(t + τ) ± X(t))²] ≥ 0
E[X²(t + τ)] ± 2E[X(t + τ)X(t)] + E[X²(t)] ≥ 0
2R_X(0) ± 2R_X(τ) ≥ 0
−R_X(0) ≤ R_X(τ) ≤ R_X(0)


5.6 Mean, Correlation and Covariance Functions

The physical significance of the autocorrelation function R_X(τ) is that it provides a means of describing the interdependence of two random variables obtained by observing a random process X(t) at times τ seconds apart.


5.6 Mean, Correlation and Covariance Functions

Example 5.7 Sinusoidal Signal with Random Phase

Consider a sinusoidal signal with random phase:

X(t) = A cos(2πf_c t + Θ),  f_Θ(θ) = 1/(2π) for −π ≤ θ ≤ π;  0 elsewhere

R_X(τ) = E[X(t + τ)X(t)]
       = (A²/2) E[cos(2πf_c τ)] + (A²/2) E[cos(4πf_c t + 2πf_c τ + 2Θ)]
       = (A²/2) cos(2πf_c τ) + (A²/2) ∫ from −π to π of (1/2π) cos(4πf_c t + 2πf_c τ + 2θ) dθ
       = (A²/2) cos(2πf_c τ)
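The sketch below (my own illustration, with assumed A, f_c, t, and τ) estimates R_X(τ) by averaging over an ensemble of random phases and compares it with (A²/2)cos(2πf_cτ).

```python
# Hedged sketch: ensemble estimate of the autocorrelation of
# X(t) = A*cos(2*pi*fc*t + Theta), Theta ~ Uniform(-pi, pi).
import numpy as np

A, fc, t, tau = 2.0, 5.0, 0.13, 0.04   # assumed values for the illustration
rng = np.random.default_rng(4)
theta = rng.uniform(-np.pi, np.pi, size=1_000_000)

x_t = A * np.cos(2 * np.pi * fc * t + theta)
x_t_tau = A * np.cos(2 * np.pi * fc * (t + tau) + theta)

print(np.mean(x_t_tau * x_t))                     # ensemble estimate of R_X(tau)
print(0.5 * A**2 * np.cos(2 * np.pi * fc * tau))  # analytical value
```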


5.6 Mean, Correlation and Covariance Functions

Averages for joint stochastic processes

Let X(t) and Y(t) denote two stochastic processes and let X_{t_i} ≡ X(t_i), i = 1, 2, ..., n, and Y_{t_j'} ≡ Y(t_j'), j = 1, 2, ..., m, represent the random variables at times t_1 > t_2 > ... > t_n and t_1' > t_2' > ... > t_m', respectively. The two processes are characterized statistically by their joint PDF:

p(x_{t_1}, x_{t_2}, ..., x_{t_n}, y_{t_1'}, y_{t_2'}, ..., y_{t_m'})

The cross-correlation function of X(t) and Y(t), denoted by R_xy(t_1, t_2), is defined as the joint moment:

R_xy(t_1, t_2) = E(X_{t_1} Y_{t_2}) = ∫∫ x_{t_1} y_{t_2} p(x_{t_1}, y_{t_2}) dx_{t_1} dy_{t_2}

The cross-covariance is:

μ_xy(t_1, t_2) = R_xy(t_1, t_2) − m_x(t_1) m_y(t_2)



5.6 Mean, Correlation and Covariance Functions

Averages for joint stochastic processes

When the processes are jointly and individually stationary, we have R_xy(t_1, t_2) = R_xy(t_1 − t_2) and μ_xy(t_1, t_2) = μ_xy(t_1 − t_2):
R_xy(−τ) = E(X_{t_1} Y_{t_1 + τ}) = E(X_{t_1' − τ} Y_{t_1'}) = E(Y_{t_1'} X_{t_1' − τ}) = R_yx(τ),  with t_1' = t_1 + τ

The stochastic processes X(t) and Y(t) are said to be statistically independent if and only if:

p(x_{t_1}, x_{t_2}, ..., x_{t_n}, y_{t_1'}, y_{t_2'}, ..., y_{t_m'}) = p(x_{t_1}, x_{t_2}, ..., x_{t_n}) p(y_{t_1'}, y_{t_2'}, ..., y_{t_m'})

for all choices of t_i and t_j' and for all positive integers n and m. The processes are said to be uncorrelated if
R_xy(t_1, t_2) = E(X_{t_1}) E(Y_{t_2}),  i.e., μ_xy(t_1, t_2) = 0

5.6 Mean, Correlation and Covariance Functions

Example 5.9 Quadrature-Modulated Processes

Consider a pair of quadrature-modulated processes X1(t) and X2(t):

X_1(t) = X(t) cos(2πf_c t + Θ)
X_2(t) = X(t) sin(2πf_c t + Θ)

R_12(τ) = E[X_1(t) X_2(t − τ)]
        = E[X(t) X(t − τ) cos(2πf_c t + Θ) sin(2πf_c t − 2πf_c τ + Θ)]
        = E[X(t) X(t − τ)] E[cos(2πf_c t + Θ) sin(2πf_c t − 2πf_c τ + Θ)]
        = (1/2) R_X(τ) E[sin(4πf_c t − 2πf_c τ + 2Θ) − sin(2πf_c τ)]
        = −(1/2) R_X(τ) sin(2πf_c τ)

R_12(0) = E[X_1(t) X_2(t)] = 0



5.6 Mean, Correlation and Covariance Functions

Ergodic Processes
In many instances, it is difficult or impossible to observe all sample functions of a random process at a given time. It is often more convenient to observe a single sample function for a long period of time. For a sample function x(t), the time average of the mean value over an observation period 2T is

μ_{x,T} = (1/2T) ∫ from −T to T of x(t) dt  (5.84)

For many stochastic processes of interest in communications, the time averages and ensemble averages are equal, a property known as ergodicity. This property implies that whenever an ensemble average is required, we may estimate it by using a time average.

5.6 Mean, Correlation and Covariance Functions

Cyclostationary Processes (in the wide sense)

There is another important class of random processes commonly encountered in practice, the mean and autocorrelation function of which exhibit periodicity:

μ_X(t_1 + T) = μ_X(t_1)
R_X(t_1 + T, t_2 + T) = R_X(t_1, t_2)

for all t1 and t2.

Modeling the process X(t) as cyclostationary adds a new dimension, namely the period T, to the partial description of the process.


5.7 Transmission of a Random Process Through a Linear Filter

Suppose that a random process X(t) is applied as input to a linear time-invariant filter of impulse response h(t), producing a new random process Y(t) at the filter output.

Assume that X(t) is a wide-sense stationary random process. The mean of the output random process Y(t) is given by

μ_Y(t) = E[Y(t)] = E[∫ from −∞ to ∞ of h(τ_1) X(t − τ_1) dτ_1]
       = ∫ from −∞ to ∞ of h(τ_1) E[X(t − τ_1)] dτ_1
       = ∫ from −∞ to ∞ of h(τ_1) μ_X(t − τ_1) dτ_1  (5.86)


5.7 Transmission of a Random Process Through a Linear Filter

When the input random process X(t) is wide-sense stationary, the mean μ_X(t) is a constant μ_X, so the mean μ_Y(t) is also a constant μ_Y:

μ_Y = μ_X ∫ from −∞ to ∞ of h(τ_1) dτ_1 = μ_X H(0)  (5.87)

where H(0) is the zero-frequency (dc) response of the system.

The autocorrelation function of the output random process Y(t) is given by:
R_Y(t, u) = E[Y(t)Y(u)] = E[∫ h(τ_1) X(t − τ_1) dτ_1 ∫ h(τ_2) X(u − τ_2) dτ_2]
          = ∫ dτ_1 h(τ_1) ∫ dτ_2 h(τ_2) E[X(t − τ_1) X(u − τ_2)]
          = ∫ dτ_1 h(τ_1) ∫ dτ_2 h(τ_2) R_X(t − τ_1, u − τ_2)


5.7 Transmission of a Random Process Through a Linear Filter

When the input X(t) is a wide-sense stationary random process, the autocorrelation function of X(t) is only a function of the difference between the observation times:
R_Y(τ) = ∫∫ h(τ_1) h(τ_2) R_X(τ − τ_1 + τ_2) dτ_1 dτ_2  (5.90)

If the input to a stable linear time-invariant filter is a wide-sense stationary random process, then the output of the filter is also a wide-sense stationary random process.


5.8 Power Spectral Density

The Fourier transform of the autocorrelation function R_X(τ) is called the power spectral density S_X(f) of the random process X(t).

S_X(f) = ∫ from −∞ to ∞ of R_X(τ) exp(−j2πfτ) dτ  (5.91)

R_X(τ) = ∫ from −∞ to ∞ of S_X(f) exp(j2πfτ) df  (5.92)

Equations (5.91) and (5.92) are basic relations in the theory of spectral analysis of random processes; together they constitute what are usually called the Einstein-Wiener-Khintchine relations.
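As a numerical sketch of the relation (my own, using an assumed exponential autocorrelation R_X(τ) = e^(−|τ|)), the Fourier transform of R_X(τ) agrees with the known power spectral density 2/(1 + (2πf)²).

```python
# Hedged sketch: numerically Fourier-transform an assumed autocorrelation
# R_X(tau) = exp(-|tau|) and compare with its analytical PSD 2/(1+(2*pi*f)^2).
import numpy as np

dtau = 0.001
tau = np.arange(-50.0, 50.0, dtau)
R = np.exp(-np.abs(tau))

for f in (0.0, 0.1, 0.5):
    S_num = np.sum(R * np.exp(-2j * np.pi * f * tau)).real * dtau  # Eq. (5.91), discretized
    S_ana = 2.0 / (1.0 + (2 * np.pi * f) ** 2)
    print(f, S_num, S_ana)
```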


5.8 Power Spectral Density

Properties of the Power Spectral Density

Property 1: S_X(0) = ∫ from −∞ to ∞ of R_X(τ) dτ  (5.93)

Proof: Let f = 0 in Eq. (5.91).


Property 2: E[X²(t)] = ∫ from −∞ to ∞ of S_X(f) df  (5.94)

Proof: Let τ = 0 in Eq. (5.92) and note that R_X(0) = E[X²(t)].

Property 3: S_X(f) ≥ 0 for all f  (5.95)
Property 4: S_X(−f) = S_X(f)  (5.96)

Proof: From (5.91)

S_X(−f) = ∫ from −∞ to ∞ of R_X(τ) exp(j2πfτ) dτ

Substituting −τ for τ and using R_X(−τ) = R_X(τ):

S_X(−f) = ∫ from −∞ to ∞ of R_X(τ) exp(−j2πfτ) dτ = S_X(f)

Proof of Eq. (5.95)

It can be shown that (see Eq. (5.106)) S_Y(f) = S_X(f)|H(f)|².

R_Y(τ) = ∫ from −∞ to ∞ of S_Y(f) exp(j2πfτ) df = ∫ from −∞ to ∞ of S_X(f)|H(f)|² exp(j2πfτ) df

R_Y(0) = E[Y²(t)] = ∫ from −∞ to ∞ of S_X(f)|H(f)|² df ≥ 0 for any H(f)

Suppose we let |H(f)|² = 1 for an arbitrarily small interval f_1 ≤ f ≤ f_2, and H(f) = 0 outside this interval. Then we have:

∫ from f_1 to f_2 of S_X(f) df ≥ 0

This is possible if and only if S_X(f) ≥ 0 for all f.

Conclusion: S_X(f) ≥ 0 for all f.

5.8 Power Spectral Density

Example 5.10 Sinusoidal Signal with Random Phase

Consider the random process X(t) = A cos(2πf_c t + Θ), where Θ is a uniformly distributed random variable over the interval (−π, π). The autocorrelation function of this random process is given in Example 5.7:

R_X(τ) = (A²/2) cos(2πf_c τ)  (5.74)

Taking the Fourier transform of both sides of this relation:

S_X(f) = (A²/4)[δ(f − f_c) + δ(f + f_c)]  (5.97)


5.8 Power Spectral Density

Example 5.12 Mixing of a Random Process with a Sinusoidal Process

A situation that often arises in practice is the mixing (i.e., multiplication) of a WSS random process X(t) with a sinusoidal signal cos(2πf_c t + Θ), where the phase Θ is a random variable uniformly distributed over the interval (0, 2π). We wish to determine the power spectral density of the random process Y(t) defined by:

Y(t) = X(t) cos(2πf_c t + Θ)  (5.101)

We note that the random variable Θ is independent of X(t).


5.8 Power Spectral Density

Example 5.12 Mixing of a Random Process with a Sinusoidal Process (continued)

The autocorrelation function of Y(t) is given by:

R_Y(τ) = E[Y(t + τ)Y(t)]
       = E[X(t + τ) cos(2πf_c t + 2πf_c τ + Θ) X(t) cos(2πf_c t + Θ)]
       = E[X(t + τ)X(t)] E[cos(2πf_c t + 2πf_c τ + Θ) cos(2πf_c t + Θ)]
       = (1/2) R_X(τ) E[cos(2πf_c τ) + cos(4πf_c t + 2πf_c τ + 2Θ)]
       = (1/2) R_X(τ) cos(2πf_c τ)

Taking the Fourier transform:

S_Y(f) = (1/4)[S_X(f − f_c) + S_X(f + f_c)]  (5.103)

5.8 Power Spectral Density

Relation among the Power Spectral Densities of the Input and Output Random Processes

Let S_Y(f) denote the power spectral density of the output random process Y(t) obtained by passing the random process through a linear filter of transfer function H(f).

S_Y(f) = ∫ from −∞ to ∞ of R_Y(τ) e^{−j2πfτ} dτ
       = ∫∫∫ h(τ_1) h(τ_2) R_X(τ − τ_1 + τ_2) e^{−j2πfτ} dτ_1 dτ_2 dτ    (using (5.90))

Let τ_0 = τ − τ_1 + τ_2, so that τ = τ_0 + τ_1 − τ_2:

S_Y(f) = ∫∫∫ h(τ_1) h(τ_2) R_X(τ_0) e^{−j2πf(τ_0 + τ_1 − τ_2)} dτ_1 dτ_2 dτ_0
       = ∫ h(τ_1) e^{−j2πfτ_1} dτ_1 ∫ h(τ_2) e^{j2πfτ_2} dτ_2 ∫ R_X(τ_0) e^{−j2πfτ_0} dτ_0
       = H(f) H*(f) S_X(f) = |H(f)|² S_X(f)  (5.106)

5.8 Power Spectral Density

Example 5.13 Comb Filter

Consider the filter of Figure (a), consisting of a delay line and a summing device. We wish to evaluate the power spectral density of the filter output Y(t).


5.8 Power Spectral Density

Example 5.13 Comb Filter (continued)

The transfer function of this filter is

H(f) = 1 − exp(−j2πfT) = 1 − cos(2πfT) + j sin(2πfT)

|H(f)|² = [1 − cos(2πfT)]² + sin²(2πfT) = 2[1 − cos(2πfT)] = 4 sin²(πfT)

Because of the periodic form of this frequency response (Fig. (b)), the filter is sometimes referred to as a comb filter. The power spectral density of the filter output is:

S_Y(f) = |H(f)|² S_X(f) = 4 sin²(πfT) S_X(f)  (5.107)

If πfT is very small, S_Y(f) ≈ 4π²f²T² S_X(f).
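A small numerical check of the magnitude response (my own sketch, with assumed T and frequency points): compute |H(f)|² = |1 − e^(−j2πfT)|² directly and compare with 4 sin²(πfT).

```python
# Hedged sketch: verify |H(f)|^2 = |1 - exp(-j*2*pi*f*T)|^2 = 4*sin^2(pi*f*T)
# for the comb (delay-and-subtract) filter.  T and f values are assumed.
import numpy as np

T = 1e-3
f = np.linspace(0.0, 2.0 / T, 9)
H = 1.0 - np.exp(-2j * np.pi * f * T)

print(np.abs(H) ** 2)
print(4.0 * np.sin(np.pi * f * T) ** 2)
```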

5.9 Gaussian Process

A random variable Y is defined by:


Y = ∫ from 0 to T of g(t) X(t) dt

(compare the discrete form Y = Σ from i=1 to n of a_i X_i, where the a_i are constants and the X_i are random variables)

We refer to Y as a linear functional of X(t); in the discrete form, Y is a linear function of the X_i. If the weighting function g(t) is such that the mean-square value of the random variable Y is finite, and if the random variable Y is a Gaussian-distributed random variable for every g(t) in this class of functions, then the process X(t) is said to be a Gaussian process. In other words, the process X(t) is a Gaussian process if every linear functional of X(t) is a Gaussian random variable. The Gaussian process has many properties that make analytic results possible. The random processes produced by physical phenomena are often such that a Gaussian model is appropriate.

5.9 Gaussian Process

The random variable Y has a Gaussian distribution if its probability density function has the form

f_Y(y) = (1/(√(2π) σ_Y)) exp(−(y − μ_Y)²/(2σ_Y²))

where μ_Y is the mean and σ_Y² is the variance of the random variable Y.

If the Gaussian random variable Y is normalized to have a mean of zero and a variance of one, such a normalized Gaussian distribution is commonly written as N(0,1):

f_Y(y) = (1/√(2π)) exp(−y²/2)

5.9 Gaussian Process

Central Limit Theorem

Let X_i, i = 1, 2, ..., N, be a set of random variables that satisfies the following requirements:

The X_i are statistically independent. The X_i have the same probability distribution with mean μ_X and variance σ_X². The X_i so described are said to constitute a set of independent and identically distributed (i.i.d.) random variables.

Define:

Y_i = (X_i − μ_X)/σ_X,  i = 1, 2, ..., N,  so that E[Y_i] = 0 and var[Y_i] = 1

V_N = (1/√N) Σ from i=1 to N of Y_i

The central limit theorem states that the probability distribution of VN approaches a normalized Gaussian distribution N(0,1) in the limit as N approaches infinity.
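A short simulation (my own sketch, with an assumed N and i.i.d. uniform variables) illustrating the theorem: the normalized sum V_N has sample mean near 0, variance near 1, and tail probabilities close to those of N(0,1).

```python
# Hedged sketch: central limit theorem for i.i.d. Uniform(0,1) variables.
# V_N = (1/sqrt(N)) * sum of normalized Y_i should approach N(0,1).
import numpy as np

N = 30
rng = np.random.default_rng(5)
X = rng.uniform(size=(200_000, N))

mu_X, sigma_X = 0.5, np.sqrt(1.0 / 12.0)     # mean and std of Uniform(0,1)
Y = (X - mu_X) / sigma_X                     # normalized variables, zero mean, unit variance
V = Y.sum(axis=1) / np.sqrt(N)               # CLT statistic

print(V.mean(), V.var())                     # should be close to 0 and 1
print(np.mean(V <= 1.0))                     # compare with Phi(1) ~ 0.8413
```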

5.9 Gaussian Process

Property 1: If a Gaussian process X(t) is applied to a stable linear filter, then the output Y(t) is also Gaussian.

Property 2: Consider the set of random variables (samples) X(t_1), X(t_2), ..., X(t_n), obtained by observing a random process X(t) at times t_1, t_2, ..., t_n. If the process X(t) is Gaussian, then this set of random variables is jointly Gaussian for any n, with their n-fold joint probability density function being completely determined by specifying the set of means:

μ_{X(t_i)} = E[X(t_i)],  i = 1, 2, ..., n

and the set of autocovariance functions:

C_X(t_k, t_i) = E[(X(t_k) − μ_{X(t_k)})(X(t_i) − μ_{X(t_i)})],  k, i = 1, 2, ..., n

Consider the composite set of random variables X(t_1), X(t_2), ..., X(t_n), Y(u_1), Y(u_2), ..., Y(u_m). We say that the processes X(t) and Y(t) are jointly Gaussian if this composite set of random variables is jointly Gaussian for any n and m.

5.9 Gaussian Process

Property 3: If a Gaussian process is wide-sense stationary, then the process is also stationary in the strict sense.

Property 4: If the random variables X(t_1), X(t_2), ..., X(t_n) are uncorrelated, that is

E[(X(t_k) − μ_{X(t_k)})(X(t_i) − μ_{X(t_i)})] = 0,  i ≠ k

then these random variables are statistically independent.

The implication of this property is that the joint probability density function of the set of random variables X(t_1), X(t_2), ..., X(t_n) can be expressed as the product of the probability density functions of the individual random variables in the set.

5.10 Noise

The sources of noise may be external to the system (e.g., atmospheric noise, galactic noise, man-made noise), or internal to the system. The second category includes an important type of noise that arises from spontaneous fluctuations of current or voltage in electrical circuits. This type of noise represents a basic limitation on the transmission or detection of signals in communication systems involving the use of electronic devices. The two most common examples of spontaneous fluctuations in electrical circuits are shot noise and thermal noise.


5.10 Noise

Shot Noise

Shot noise arises in electronic devices such as diodes and transistors because of the discrete nature of current flow in these devices. For example, in a photodetector circuit a current pulse is generated every time an electron is emitted by the cathode due to incident light from a source of constant intensity. The electrons are naturally emitted at random times denoted by τ_k. If the random emissions of electrons have been going on for a long time, then the total current flowing through the photodetector may be modeled as an infinite sum of current pulses, as shown by

X(t) = Σ over k from −∞ to ∞ of h(t − τ_k)

where h(t − τ_k) is the current pulse generated at time τ_k. The process X(t) is a stationary process, called shot noise.

5.10 Noise

Shot Noise

The number of electrons, N(t), emitted in the time interval (0, t) constitutes a discrete stochastic process, the value of which increases by one each time an electron is emitted (Fig. 5.17). Let the mean value of the number of electrons, ν, emitted between times t and t + t_0 be

E[ν] = λt_0

where λ is a constant called the rate of the process.

The total number of electrons emitted in the interval (t, t + t_0), that is,

ν = N(t + t_0) − N(t)

follows a Poisson distribution with a mean value equal to λt_0. The probability that k electrons are emitted in the interval (t, t + t_0) is

P[ν = k] = (λt_0)^k e^{−λt_0} / k!,  k = 0, 1, ...

(Fig. 5.17: Sample function of a Poisson counting process.)

5.10 Noise

Thermal Noise

Thermal noise is the name given to the electrical noise arising from the random motion of electrons in a conductor. The mean-square value of the thermal noise voltage V_TN, appearing across the terminals of a resistor, measured in a bandwidth of Δf Hertz, is given by:

E[V_TN²] = 4kTRΔf volts²

k: Boltzmann's constant = 1.38 × 10⁻²³ joules per kelvin. T: absolute temperature in kelvin. R: the resistance in ohms.
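As a numerical illustration (my own sketch, with assumed R, T, and Δf), the RMS thermal noise voltage follows directly from E[V_TN²] = 4kTRΔf.

```python
# Hedged sketch: RMS thermal noise voltage of a resistor, E[V^2] = 4*k*T*R*df.
# Resistance, temperature, and bandwidth values are assumed for illustration.
k = 1.38e-23      # Boltzmann's constant, J/K
T = 290.0         # absolute temperature, K (assumed)
R = 1e3           # resistance, ohms (assumed)
df = 1e6          # measurement bandwidth, Hz (assumed)

mean_square = 4 * k * T * R * df
rms = mean_square ** 0.5
print(mean_square, rms)   # about 1.6e-11 V^2, i.e. roughly 4 microvolts RMS
```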


5.10 Noise

White Noise

Noise analysis is customarily based on an idealized form of noise called white noise, the power spectral density of which is independent of the operating frequency. "White" is used in the sense that white light contains equal amounts of all frequencies within the visible band of electromagnetic radiation. We express the power spectral density of white noise, with a sample function denoted by w(t), as

S_W(f) = N_0/2,  N_0 = kT_e

The dimensions of N_0 are watts per Hertz, k is Boltzmann's constant, and T_e is the equivalent noise temperature of the receiver.

5.10 Noise

White Noise

The equivalent noise temperature of a system is defined as the temperature at which a noisy resistor has to be maintained such that, by connecting the resistor to the input of a noiseless version of the system, it produces the same available noise power at the output of the system as that produced by all the sources of noise in the actual system. The autocorrelation function is the inverse Fourier transform of the power spectral density:

R_W(τ) = (N_0/2) δ(τ)

Any two different samples of white noise, no matter how closely together in time they are taken, are uncorrelated. If the white noise w(t) is also Gaussian, then the two samples are statistically independent.

5.10 Noise

Example 5.14 Ideal Low-Pass Filtered White Noise

Suppose that a white Gaussian noise w(t) of zero mean and power spectral density N_0/2 is applied to an ideal low-pass filter of bandwidth B and passband amplitude response of one.
N0 , SN ( f ) = 2 0, B < f < B f >B

The autocorrelation function of n(t) is


R_N(τ) = ∫ from −B to B of (N_0/2) exp(j2πfτ) df = N_0 B sinc(2Bτ)

5.11 Narrowband Noise


The receiver of a communication system usually includes some provision for preprocessing the received signal. The preprocessing may take the form of a narrowband filter whose bandwidth is just large enough to pass the modulated component of the received signal essentially undistorted but not so large as to admit excessive noise through the receiver. The noise process appearing at the output of such a filter is called narrowband noise.

Fig. 5.24 (a) Power spectral density of narrowband noise. (b) Sample function of narrowband noise, which appears somewhat similar to a sine wave of frequency f_c that undulates slowly in both amplitude and phase.

Representation of Narrowband Noise in Terms of In-phase and Quadrature Components

Consider a narrowband noise n(t) of bandwidth 2B centered on frequency f_c; it can be represented as

n(t) = n_I(t) cos(2πf_c t) − n_Q(t) sin(2πf_c t)

n_I(t): in-phase component of n(t); n_Q(t): quadrature component of n(t)

Both n_I(t) and n_Q(t) are low-pass signals.

Fig. 5.25 (a) Extraction of in-phase and quadrature components of a narrowband process. (b) Generation of a narrowband process from its in-phase and quadrature components.

Representation of Narrowband Noise in Terms of In-phase and Quadrature Components

The n_I(t) and n_Q(t) of a narrowband noise n(t) have some important properties:

1) The n_I(t) and n_Q(t) of n(t) have zero mean.
2) If n(t) is Gaussian, then n_I(t) and n_Q(t) are jointly Gaussian.
3) If n(t) is stationary, then n_I(t) and n_Q(t) are jointly stationary.
4) Both n_I(t) and n_Q(t) have the same power spectral density, which is related to the power spectral density S_N(f) of n(t) as
S_{N_I}(f) = S_{N_Q}(f) = S_N(f − f_c) + S_N(f + f_c) for −B ≤ f ≤ B;  0 otherwise
5) n_I(t) and n_Q(t) have the same variance as the narrowband noise n(t).
6) The cross-spectral density of n_I(t) and n_Q(t) is purely imaginary:
S_{N_I N_Q}(f) = −S_{N_Q N_I}(f) = j[S_N(f + f_c) − S_N(f − f_c)] for −B ≤ f ≤ B;  0 otherwise
7) If n(t) is Gaussian and its power spectral density S_N(f) is symmetric about the mid-band frequency f_c, then n_I(t) and n_Q(t) are statistically independent.

Representation of Narrowband Noise in Terms of In-phase and Quadrature Components

Example 5.17 Ideal Band-Pass Filtered White Noise

Consider a white Gaussian noise of zero mean and power spectral density N_0/2, which is passed through an ideal band-pass filter of passband magnitude response equal to one, mid-band frequency f_c, and bandwidth 2B. The power spectral density characteristic of the filtered noise n(t) is shown in Fig. (a). The power spectral density characteristics of n_I(t) and n_Q(t) are shown in Fig. (c).


Representation of Narrowband Noise in Terms of In-phase and Quadrature Components

Example 5.17 Ideal Band-Pass Filtered White Noise

The autocorrelation function of n(t) is the inverse Fourier transform of the power spectral density characteristic:
R_N(τ) = ∫ from −f_c−B to −f_c+B of (N_0/2) exp(j2πfτ) df + ∫ from f_c−B to f_c+B of (N_0/2) exp(j2πfτ) df
       = N_0 B sinc(2Bτ)[exp(−j2πf_c τ) + exp(j2πf_c τ)]
       = 2N_0 B sinc(2Bτ) cos(2πf_c τ)

The autocorrelation functions of n_I(t) and n_Q(t) are given by:

R_{N_I}(τ) = R_{N_Q}(τ) = 2N_0 B sinc(2Bτ)

Representation of Narrowband Noise in Terms of In-phase and Quadrature Components

The narrowband noise n(t) can be represented in terms of its envelope and phase components:

n(t) = r(t) cos[2πf_c t + ψ(t)]

r(t) = [n_I²(t) + n_Q²(t)]^(1/2)

ψ(t) = tan⁻¹[n_Q(t)/n_I(t)]

r(t): envelope of n(t);  ψ(t): phase of n(t)

Both r(t) and ψ(t) are sample functions of low-pass random processes. The probability distributions of r(t) and ψ(t) may be obtained from those of n_I(t) and n_Q(t).

Representation of Narrowband Noise in Terms of In-phase and Quadrature Components

Let N_I and N_Q denote the random variables obtained by observing the random processes represented by the sample functions n_I(t) and n_Q(t), respectively. N_I and N_Q are independent Gaussian random variables of zero mean and variance σ². Their joint probability density function is given by:

f_{N_I,N_Q}(n_I, n_Q) = (1/(2πσ²)) exp(−(n_I² + n_Q²)/(2σ²))

Define n_I = r cos ψ and n_Q = r sin ψ. We have dn_I dn_Q = r dr dψ. The joint probability density function of R and Ψ is:

f_{R,Ψ}(r, ψ) = (r/(2πσ²)) exp(−r²/(2σ²))

The Ψ is uniformly distributed inside the range 0 to 2π.

Representation of Narrowband Noise in Terms of In-phase and Quadrature Components

The probability density function of the random variable R is:


f_R(r) = (r/σ²) exp(−r²/(2σ²)) for r ≥ 0;  0 elsewhere  (5.150)

A random variable having the probability density function of (5.150) is said to be Rayleigh distributed. The Rayleigh distribution in the normalized form (with ν = r/σ) is

f_V(ν) = ν exp(−ν²/2) for ν ≥ 0;  0 elsewhere

Fig. 5.28 Normalized Rayleigh distribution
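A short simulation check (my own sketch, with an assumed σ): generate independent zero-mean Gaussian in-phase and quadrature components and confirm that their envelope follows the Rayleigh density of Eq. (5.150).

```python
# Hedged sketch: the envelope of two independent zero-mean Gaussian components
# (variance sigma^2) follows the Rayleigh density r/sigma^2 * exp(-r^2/(2 sigma^2)).
import numpy as np

sigma = 1.5
rng = np.random.default_rng(6)
nI = rng.normal(0.0, sigma, size=1_000_000)
nQ = rng.normal(0.0, sigma, size=1_000_000)
r = np.sqrt(nI**2 + nQ**2)           # envelope samples

r0 = sigma                           # check the density at one point (assumed r0)
empirical = np.mean(np.abs(r - r0) < 0.01) / 0.02
analytical = (r0 / sigma**2) * np.exp(-r0**2 / (2 * sigma**2))
print(empirical, analytical)
```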

Representation of Narrowband Noise in Terms of In-phase and Quadrature Components

Example 5.18 Sinusoidal Signal Plus Narrowband Noise

A sample function of the sinusoidal signal A cos(2πf_c t) plus narrowband noise n(t) is given by:

x(t) = A cos(2πf_c t) + n(t)

Representing n(t) in terms of its in-phase and quadrature components around the carrier frequency f_c:

x(t) = n_I'(t) cos(2πf_c t) − n_Q(t) sin(2πf_c t)

n_I'(t) = A + n_I(t)

Assume that n(t) is Gaussian with zero mean and variance σ². Both n_I'(t) and n_Q(t) are Gaussian and statistically independent. The mean of n_I'(t) is A and that of n_Q(t) is zero. The variance of both n_I'(t) and n_Q(t) is σ².

Representation of Narrowband Noise in Terms of In-phase and Quadrature Components

The joint probability density function of the random variables N_I' and N_Q, corresponding to n_I'(t) and n_Q(t), is

f_{N_I',N_Q}(n_I', n_Q) = (1/(2πσ²)) exp(−[(n_I' − A)² + n_Q²]/(2σ²))

Let r(t) denote the envelope of x(t) and ψ(t) denote its phase.

r(t) = [n_I'²(t) + n_Q²(t)]^(1/2)

ψ(t) = tan⁻¹[n_Q(t)/n_I'(t)]

The joint probability density function of the random variables R and Ψ is given by
f_{R,Ψ}(r, ψ) = (r/(2πσ²)) exp(−(r² + A² − 2Ar cos ψ)/(2σ²))

Representation of Narrowband Noise in Terms of In-phase and Quadrature Components

The function f_{R,Ψ}(r, ψ) cannot be expressed as a product f_R(r) f_Ψ(ψ). This is because we now have a term involving the values of both random variables multiplied together as r cos ψ. Rician distribution:

f_R(r) = ∫ from 0 to 2π of f_{R,Ψ}(r, ψ) dψ
       = (r/(2πσ²)) exp(−(r² + A²)/(2σ²)) ∫ from 0 to 2π of exp((Ar/σ²) cos ψ) dψ
       = (r/σ²) exp(−(r² + A²)/(2σ²)) I_0(Ar/σ²)

where I_0 is the modified Bessel function of the first kind of zeroth order.

The Rician distribution reduces to the Rayleigh distribution for small a (where a = A/σ), and reduces to an approximate Gaussian distribution when a is large.

Fig 5.29 Normalized Rician distribution

