Descriptive statistics:
Population mean, variance and coefficient of variation:
\[ \mu = \frac{1}{N}\sum_{i=1}^{N} x_i; \qquad \sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2; \qquad CV = \frac{\sigma}{\mu} \times 100\% \]
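The three population formulas above can be sketched in Python as follows. The function name `population_stats` and the sample data are illustrative assumptions, not part of the sheet.

```python
import math

def population_stats(values):
    """Return (mean, variance, CV in percent), treating `values` as the whole population."""
    n = len(values)
    mu = sum(values) / n
    # Population variance divides by N, not N - 1.
    var = sum((x - mu) ** 2 for x in values) / n
    cv = math.sqrt(var) / mu * 100
    return mu, var, cv

# Hypothetical data: mean 5, variance 4, so the CV is about 40%.
mu, var, cv = population_stats([2, 4, 4, 4, 5, 5, 7, 9])
print(mu, var, cv)
```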
Introduction to probability:
Permutation rule formula:
\[ P_x^n = n(n-1)(n-2)\cdots(n-x+1) = \frac{n!}{(n-x)!} \]
where \( x! = x(x-1)(x-2)\cdots(1) \).
Combination rule formula:
\[ C_x^n = \frac{n(n-1)(n-2)\cdots(n-x+1)}{x(x-1)\cdots(1)} = \frac{n!}{x!\,(n-x)!} = \frac{P_x^n}{x!} \]
where \( x! = x(x-1)(x-2)\cdots(1) \).
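A minimal sketch of the permutation and combination rules, cross-checked against Python's built-in `math.perm` and `math.comb` (Python 3.8 or later). The helper names are illustrative.

```python
import math

def permutations(n, x):
    """Ordered selections of x items from n: n! / (n - x)!."""
    return math.factorial(n) // math.factorial(n - x)

def combinations(n, x):
    """Unordered selections of x items from n: permutations divided by x!."""
    return permutations(n, x) // math.factorial(x)

# Choosing 2 out of 5: 20 ordered arrangements, 10 unordered subsets.
print(permutations(5, 2), combinations(5, 2))
```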
Addition rule:
𝑃(𝐴) + 𝑃(𝐵) = 𝑃(𝐴 ∪ 𝐵) + 𝑃(𝐴 ∩ 𝐵)
Conditional probability:
\[ P(A \mid B) = \frac{P(A \cap B)}{P(B)} \]
Multiplication rule:
𝑃(𝐴 ∩ 𝐵) = 𝑃(𝐴|𝐵)𝑃(𝐵) = 𝑃(𝐵|𝐴)𝑃(𝐴)
Law of total probability:
\[ P(A) = P(A \mid B_1)P(B_1) + P(A \mid B_2)P(B_2) + \dots + P(A \mid B_k)P(B_k) \]
where \( B_1, B_2, \dots, B_k \) are mutually exclusive and collectively exhaustive events.
Bayes’ theorem:
\[ P(A \mid B) = \frac{P(B \mid A)P(A)}{P(B)} \]
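The law of total probability usually supplies the denominator in Bayes' theorem. The sketch below computes the posterior of one partition event given an observed event, so the roles of A and B are swapped relative to the formula above. The function names and the screening-test numbers are hypothetical.

```python
def total_probability(likelihoods, priors):
    """P(A) = sum of P(A|B_i) P(B_i) over a partition B_1..B_k."""
    if abs(sum(priors) - 1.0) > 1e-9:
        raise ValueError("priors must sum to 1 (collectively exhaustive partition)")
    return sum(l * p for l, p in zip(likelihoods, priors))

def bayes_posterior(likelihoods, priors, i):
    """P(B_i | A) = P(A|B_i) P(B_i) / P(A)."""
    return likelihoods[i] * priors[i] / total_probability(likelihoods, priors)

# Hypothetical example: B_1 = condition present (1%), B_2 = absent (99%);
# A = positive test, with P(A|B_1) = 0.95 and P(A|B_2) = 0.05.
likelihoods = [0.95, 0.05]
priors = [0.01, 0.99]
p_positive = total_probability(likelihoods, priors)
p_present_given_positive = bayes_posterior(likelihoods, priors, 0)
print(p_positive, p_present_given_positive)
```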
Bernoulli distribution
\[ P(X = 1) = P; \qquad P(X = 0) = 1 - P \]
Mean: \( \mu = P \); variance: \( \sigma^2 = P(1 - P) \)
Binomial distribution
\[ P(x) = \frac{n!}{x!\,(n-x)!} P^x (1-P)^{n-x} \]
Mean: \( \mu = nP \); variance: \( \sigma^2 = nP(1 - P) \)
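As a quick numerical check, the binomial probabilities can be summed to recover the stated mean and variance. The values `n = 10` and `p = 0.3` are arbitrary illustrative choices.

```python
import math

def binomial_pmf(x, n, p):
    """P(x) = C(n, x) * p^x * (1 - p)^(n - x)."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 10, 0.3
pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]

# Mean and variance computed directly from the probabilities.
mean = sum(x * q for x, q in enumerate(pmf))
var = sum((x - mean) ** 2 * q for x, q in enumerate(pmf))
print(mean, n * p)            # both close to 3
print(var, n * p * (1 - p))   # both close to 2.1
```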
Poisson distribution
\[ P(x) = \frac{\lambda^x}{x!} e^{-\lambda} \quad \text{for } x = 0, 1, 2, \dots \]
Mean: \( \mu = \lambda \) and variance: \( \sigma^2 = \lambda \)
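The Poisson formula can be checked the same way. Because the support is infinite, this sketch truncates the sum at 100 terms, which is an assumption that is adequate for a small rate such as the illustrative `lam = 2.5`.

```python
import math

def poisson_pmf(x, lam):
    """P(x) = lam^x / x! * exp(-lam)."""
    return lam ** x / math.factorial(x) * math.exp(-lam)

lam = 2.5
support = range(100)  # truncated support; the tail beyond this is negligible for lam = 2.5
total = sum(poisson_pmf(x, lam) for x in support)
mean = sum(x * poisson_pmf(x, lam) for x in support)
var = sum((x - mean) ** 2 * poisson_pmf(x, lam) for x in support)
print(total, mean, var)  # close to 1, lam, lam
```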
Cumulative distribution function:
For a continuous random variable with density \( f \):
\[ F(x) = P(X \le x) = P(X < x) = \int_{x_{\min}}^{x} f(t)\,dt \]
Uniform distribution
\[ f(x) = \frac{1}{b-a}; \qquad F(x) = \frac{x-a}{b-a} \quad \text{for any } a \le x \le b \]
Mean: \( \mu = \frac{a+b}{2} \); variance: \( \sigma^2 = \frac{(b-a)^2}{12} \)
Normal distribution
For \( X \sim N(\mu, \sigma^2) \),
\[ f(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-(x-\mu)^2 / (2\sigma^2)} \]
Standard normal distribution:
For \( Z \sim N(0, 1) \),
\[ f(z) = \frac{1}{\sqrt{2\pi}}\, e^{-z^2 / 2} \]
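A minimal sketch of the normal density, cross-checked against `statistics.NormalDist` from the Python standard library. It also illustrates the standardization z = (x − μ)/σ, under which the density of X equals the standard normal density at z divided by σ. The parameter values are arbitrary.

```python
import math
from statistics import NormalDist

def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) evaluated at x."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / math.sqrt(2 * math.pi * sigma ** 2)

mu, sigma, x = 100.0, 15.0, 120.0
z = (x - mu) / sigma  # standardized value

direct = normal_pdf(x, mu, sigma)
via_standard = normal_pdf(z, 0.0, 1.0) / sigma
library = NormalDist(mu, sigma).pdf(x)
print(direct, via_standard, library)  # all three agree up to rounding
```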
Exponential distribution:
\[ f(t) = \lambda e^{-\lambda t}; \qquad F(t) = 1 - e^{-\lambda t} \quad \text{for } t > 0 \]
Mean: \( \mu = \frac{1}{\lambda} \) and variance: \( \sigma^2 = \frac{1}{\lambda^2} \)
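The exponential CDF gives waiting-time probabilities directly. In this sketch the rate `lam = 0.5` (for example, 0.5 events per minute) is a hypothetical value.

```python
import math

def exponential_pdf(t, lam):
    """f(t) = lam * exp(-lam * t) for t > 0."""
    return lam * math.exp(-lam * t)

def exponential_cdf(t, lam):
    """F(t) = 1 - exp(-lam * t) for t > 0."""
    return 1 - math.exp(-lam * t)

lam = 0.5
p_within_2 = exponential_cdf(2.0, lam)                              # P(T <= 2) = 1 - e^(-1)
p_between = exponential_cdf(4.0, lam) - exponential_cdf(2.0, lam)   # P(2 < T <= 4)
mean, variance = 1 / lam, 1 / lam ** 2
print(p_within_2, p_between, mean, variance)
```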
Joint Probability Distribution:
Joint probability distribution:
𝑃(𝑥, 𝑦) = 𝑃(𝑋 = 𝑥 ∩ 𝑌 = 𝑦)
Marginal probability distributions
Conditional variance
\[ \sigma_{Y|X}^2 = E\left[(Y - \mu_{Y|X})^2 \mid X\right] = \sum_{y} (y - \mu_{Y|X})^2\, P(y \mid X) \]
Covariance
\[ Cov(X, Y) = E[(X - \mu_X)(Y - \mu_Y)] = \sum_{y} \sum_{x} (x - \mu_X)(y - \mu_Y)\, P(x, y) \]
Correlation
\[ \rho = Corr(X, Y) = \frac{Cov(X, Y)}{\sigma_X \sigma_Y} \]
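Covariance and correlation can be computed directly from a joint probability table. The sketch below stores P(x, y) in a dictionary keyed by (x, y). The table itself is hypothetical, and the means and variances are obtained by summing over the joint distribution, which is the standard approach but an assumption as far as this sheet goes.

```python
import math

def covariance_and_correlation(joint):
    """Return (Cov(X, Y), Corr(X, Y)) for a joint pmf given as {(x, y): probability}."""
    mu_x = sum(x * p for (x, y), p in joint.items())
    mu_y = sum(y * p for (x, y), p in joint.items())
    cov = sum((x - mu_x) * (y - mu_y) * p for (x, y), p in joint.items())
    var_x = sum((x - mu_x) ** 2 * p for (x, y), p in joint.items())
    var_y = sum((y - mu_y) ** 2 * p for (x, y), p in joint.items())
    rho = cov / math.sqrt(var_x * var_y)
    return cov, rho

# Hypothetical joint distribution of two 0/1 variables.
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}
cov, rho = covariance_and_correlation(joint)
print(cov, rho)  # approximately 0.15 and 0.6
```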
General rules:
\[ E[X + Y] = \mu_X + \mu_Y \]
\[ E[X - Y] = \mu_X - \mu_Y \]
\[ Var(X + Y) = \sigma_X^2 + \sigma_Y^2 + 2Cov(X, Y) \]
\[ Var(X - Y) = \sigma_X^2 + \sigma_Y^2 - 2Cov(X, Y) \]
\[ Var(X + Y + Z) = \sigma_X^2 + \sigma_Y^2 + \sigma_Z^2 + 2Cov(X, Y) + 2Cov(X, Z) + 2Cov(Y, Z) \]
\[ Var(X - Y + Z) = \sigma_X^2 + \sigma_Y^2 + \sigma_Z^2 - 2Cov(X, Y) + 2Cov(X, Z) - 2Cov(Y, Z) \]
\[ Var(X - Y - Z) = \sigma_X^2 + \sigma_Y^2 + \sigma_Z^2 - 2Cov(X, Y) - 2Cov(X, Z) + 2Cov(Y, Z) \]
𝐶𝑜𝑣(𝑋, 𝑋) = 𝑉𝑎𝑟(𝑋)
𝐶𝑜𝑣(𝑋 + 𝑌, 𝑍) = 𝐶𝑜𝑣(𝑋, 𝑍) + 𝐶𝑜𝑣(𝑌, 𝑍)
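As a worked check, the rule for Var(X − Y) follows from the last two identities. This derivation also assumes symmetry, Cov(X, Y) = Cov(Y, X), and the scaling property Cov(aX, Y) = a Cov(X, Y), which are standard but not listed on this sheet.

\[
\begin{aligned}
Var(X - Y) &= Cov(X - Y,\, X - Y) \\
&= Cov(X, X) - Cov(X, Y) - Cov(Y, X) + Cov(Y, Y) \\
&= \sigma_X^2 + \sigma_Y^2 - 2Cov(X, Y)
\end{aligned}
\]

The three-variable rules follow the same pattern: each covariance term carries the product of the signs of its two variables.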
\[ \sigma_{\bar{X}}^2 = \frac{\sigma^2}{n} \cdot \frac{N - n}{N - 1} \]
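A small sketch of the variance of the sample mean with the finite population correction factor (N − n)/(N − 1). Omitting `N` to mean an effectively infinite population is an assumption of this sketch, and the numbers are hypothetical.

```python
def variance_of_sample_mean(sigma2, n, N=None):
    """Variance of the sample mean; applies the finite population correction when N is given."""
    base = sigma2 / n
    if N is None:
        # Assumed convention: no N means the population is treated as infinite.
        return base
    return base * (N - n) / (N - 1)

# Population variance 100, sample of 25 drawn from a population of 1000.
print(variance_of_sample_mean(100, 25))        # no correction
print(variance_of_sample_mean(100, 25, 1000))  # slightly smaller with the correction
```

Sampling the entire population (n = N) drives the variance to zero, as expected.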
\[ \sigma_{\hat{p}}^2 = \frac{P(1 - P)}{n} \]
If the population is normal or the sample size is large, and the population variance \( \sigma^2 \) is known, the
\( 100(1 - \alpha)\% \) confidence interval for the population mean is:
\[ \bar{X} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \]
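A minimal sketch of this interval, taking the critical value z_{α/2} from `statistics.NormalDist().inv_cdf`. The function name and the sample figures are illustrative assumptions.

```python
import math
from statistics import NormalDist

def z_confidence_interval(xbar, sigma, n, alpha=0.05):
    """Confidence interval for the mean when the population sigma is known."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # upper alpha/2 critical value
    half_width = z * sigma / math.sqrt(n)
    return xbar - half_width, xbar + half_width

# Hypothetical sample: mean 50, known sigma 10, n = 100, 95% confidence.
lower, upper = z_confidence_interval(50, 10, 100)
print(lower, upper)  # roughly 48.04 and 51.96
```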
If the population is normal or the sample size is large, and the population variance \( \sigma^2 \) is unknown, the
\( 100(1 - \alpha)\% \) confidence interval for the population mean is:
\[ \bar{X} \pm t_{n-1,\,\alpha/2} \frac{s}{\sqrt{n}} \]
If the population is normal or the sample size is large, the \( 100(1 - \alpha)\% \) confidence interval for the
population proportion is:
\[ \hat{p} \pm z_{\alpha/2} \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} \]
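The proportion interval can be sketched in the same way. The sample proportion 0.4 and sample size 600 are hypothetical.

```python
import math
from statistics import NormalDist

def proportion_confidence_interval(p_hat, n, alpha=0.05):
    """Large-sample confidence interval for a population proportion."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width

# Hypothetical survey: 240 successes out of 600, 95% confidence.
low, high = proportion_confidence_interval(240 / 600, 600)
print(low, high)  # roughly 0.361 and 0.439
```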
If the population is normal, the \( 100(1 - \alpha)\% \) confidence interval for the population variance is:
\[ \frac{(n-1)s^2}{\chi^2_{n-1,\,\alpha/2}} < \sigma^2 < \frac{(n-1)s^2}{\chi^2_{n-1,\,1-\alpha/2}} \]
If the population is infinitely large and the margin of error for the population mean is required to be \( ME \),
the sample size \( n \) is determined by
\[ n = \frac{z_{\alpha/2}^2\, \sigma^2}{ME^2} \]
If the population size is finite and the margin of error for the population mean is required to be \( ME \), the sample
size \( n \) is determined by
\[ n = \frac{n_0 N}{n_0 + N - 1} \]
where \( n_0 = \frac{z_{\alpha/2}^2\, \sigma^2}{ME^2} \).
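A sketch of both sample-size formulas. Rounding the result up to the next integer is an assumption of this sketch (a common convention, so the margin of error is not exceeded), and the inputs are hypothetical.

```python
import math
from statistics import NormalDist

def sample_size_for_mean(sigma, me, alpha=0.05, N=None):
    """Required sample size for margin of error `me`; applies the finite-population formula when N is given."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    n0 = z ** 2 * sigma ** 2 / me ** 2       # infinite-population sample size (unrounded)
    if N is None:
        return math.ceil(n0)
    return math.ceil(n0 * N / (n0 + N - 1))  # finite-population adjustment

# Hypothetical inputs: sigma = 15, ME = 2, 95% confidence.
n_infinite = sample_size_for_mean(15, 2)
n_finite = sample_size_for_mean(15, 2, N=1000)
print(n_infinite, n_finite)  # 217 178
```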
Two-sided
Intercept coefficient
\[ b_0 = \bar{y} - b_1 \bar{x} \]
Error/residual sum of squares, SSE, is:
𝑛 𝑛