Parameter Estimation

Once a distribution has been selected, the parameters of the distribution need to be estimated. Several parameter estimation methods are available. This section presents an overview of these methods, starting with the relatively simple method of probability plotting and continuing with the more sophisticated least squares and maximum likelihood methods.
This section includes the following subsections:
• Probability Plotting
• Least Squares Parameter Estimation (Regression Analysis)
• MLE (Maximum Likelihood) Parameter Estimation for Complete Data
Probability Plotting
The least mathematically intensive method for parameter estimation is the method of probability plotting. As the term implies, probability plotting involves a physical plot of the data on specially constructed probability plotting paper. This method is easily implemented by hand, given that one can obtain the appropriate probability plotting paper.
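The method of probability plotting takes the cdf of the distribution and attempts to linearize it by employing a specially constructed paper. For example, in the case of the two-parameter Weibull distribution, the cdf, which is also the unreliability Q(T), can be shown to be:

$$Q(T) = 1 - e^{-\left(\frac{T}{\eta}\right)^{\beta}}$$

This function can then be linearized (i.e. put in the common form of y = a + bx) as follows:

$$\ln\left(1 - Q(T)\right) = -\left(\frac{T}{\eta}\right)^{\beta}$$

$$\ln\left(-\ln\left(1 - Q(T)\right)\right) = \beta \ln\left(\frac{T}{\eta}\right)$$

$$\ln\left(\ln\frac{1}{1 - Q(T)}\right) = \beta \ln(T) - \beta \ln(\eta) \tag{15}$$

Then setting:

$$y = \ln\left(\ln\frac{1}{1 - Q(T)}\right)$$

and:

$$x = \ln(T)$$

the equation can be rewritten as,

$$y = \beta x - \beta \ln(\eta)$$

which is now a linear equation with a slope of β and an intercept of -β ln(η).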
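As a quick numerical check of this linearization, the following minimal Python sketch evaluates the two-parameter Weibull unreliability at a few times, applies the x = ln(T) and y = ln(ln(1/(1 - Q(T)))) transformations, and recovers the slope and intercept of the resulting line. The function names and the values β = 1.5 and η = 100 are illustrative assumptions, not values from this reference.

```python
import math

def weibull_unreliability(t, beta, eta):
    """Two-parameter Weibull cdf: Q(T) = 1 - exp(-(T/eta)**beta)."""
    return 1.0 - math.exp(-((t / eta) ** beta))

def linearize(t, q):
    """Map a (time, unreliability) pair to probability-paper coordinates."""
    x = math.log(t)
    y = math.log(-math.log(1.0 - q))  # identical to ln(ln(1/(1-Q)))
    return x, y

# Illustrative (assumed) parameter values.
beta, eta = 1.5, 100.0
points = [linearize(t, weibull_unreliability(t, beta, eta))
          for t in (10.0, 50.0, 200.0)]

# Slope and intercept of the line through the first and last transformed points.
(x0, y0), (x1, y1) = points[0], points[-1]
slope = (y1 - y0) / (x1 - x0)
intercept = y0 - slope * x0
```

Under these assumptions, `slope` reproduces β and `intercept` reproduces -β ln(η), up to floating-point error.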
The next task is to construct a paper with the appropriate y- and x-axes. The x-axis calculation is easy since it is simply logarithmic. The y-axis, however, has to represent,

$$y = \ln\left(\ln\frac{1}{1 - Q(T)}\right)$$

where Q(T) is the unreliability (or a double log reciprocal scale). Such papers have been created by different vendors and are called probability plotting papers. (Note: You can download different probability plotting papers from www.weibull.com.)
To illustrate, consider the following probability plot on a Weibull probability paper.

This paper is constructed based on the mentioned y- and x-transformations, where the y-axis represents unreliability and the x-axis represents time. Both of these values must be known for each time-to-failure point we want to plot.
Then, given the y and x value for each point, the points can easily be put on the plot. Once the points have been placed on the plot, the best possible straight line is drawn through these points. Once the line has been drawn, the slope of the line can be obtained (some probability papers include a slope indicator to simplify this calculation). This is the parameter β, which is the value of the slope.
To determine the scale parameter, η (also called the characteristic life by some authors), one must simply set T = η in the cdf equation. Note that from before:

$$Q(T) = 1 - e^{-\left(\frac{T}{\eta}\right)^{\beta}}$$

so at T = η:

$$Q(\eta) = 1 - e^{-1} \approx 0.632 = 63.2\%$$

Thus, if we enter the y axis at Q(T) = 63.2%, the corresponding value of T will be equal to η. Thus, using this simple but rather time-consuming methodology, the parameters of the Weibull distribution can be estimated.
Determining the X and Y Position of the Plot Points
The points on the plot represent our data or, more specifically, our times-to-failure data. If, for example, we tested four units that failed at 10, 20, 30 and 40 hours, we would use these times as our x values or time values. Determining what the appropriate y plotting positions, or the unreliability values, should be is a little more complex. To determine the y plotting positions, we must first determine a value indicating the corresponding unreliability for that failure. In other words, we need to obtain the cumulative percent failed for each time-to-failure. In this example, by 10 hours the cumulative percent failed is 25%, by 20 hours 50%, and so forth. This is a simple method illustrating the idea. The problem with this simple method is that the 100% point is not defined on most probability plots, so an alternative and more robust approach must be used. The most widely used method of determining this value is the method of obtaining the median rank for each failure. This method is discussed next.
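To make the problem with the simple cumulative-percent method concrete, here is a small Python sketch using the four failure times from the example. It computes the naive plotting positions j/N and then the Weibull-paper ordinate ln(ln(1/(1 - Q))) for each. The helper names are hypothetical, chosen only for this illustration.

```python
import math

def naive_plotting_positions(n_failures):
    """Cumulative fraction failed, j/N, for the j-th ordered failure."""
    return [j / n_failures for j in range(1, n_failures + 1)]

def weibull_y(q):
    """Weibull-paper ordinate ln(ln(1/(1-Q))); undefined unless 0 < Q < 1."""
    if not 0.0 < q < 1.0:
        return None  # the point cannot be placed on the paper
    return math.log(-math.log(1.0 - q))

times = [10, 20, 30, 40]  # failure times in hours, from the example
positions = naive_plotting_positions(len(times))
ordinates = [weibull_y(q) for q in positions]
```

The last position equals 1.0 (100%), so its ordinate is undefined and the final failure cannot be plotted, which is why a different estimate of the unreliability is needed.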
Median Ranks  
Median ranks are used to obtain an estimate of the unreliability, Q(Tj), for each failure. It is the value that the true probability of failure, Q(Tj), should have at the jth failure out of a sample of N units at a 50% confidence level. This essentially means that this is our best estimate for the unreliability. Half of the time the true value will be greater than the 50% confidence estimate, and the other half of the time the true value will be less than the estimate. This estimate is based on a solution of the binomial equation.
The rank can be found for any percentage point, P, greater than zero and less than one, by solving the cumulative binomial equation for Z. This represents the rank, or unreliability estimate, for the jth failure [15; 16] in the following equation for the cumulative binomial:

$$P = \sum_{k=j}^{N} \binom{N}{k} Z^{k} (1 - Z)^{N-k}$$

where N is the sample size and j the order number.
The median rank is obtained by solving this equation for Z at P = 0.50.

$$0.50 = \sum_{k=j}^{N} \binom{N}{k} Z^{k} (1 - Z)^{N-k} \tag{16}$$
For example, if N = 4 and we have four failures, we would solve the median rank equation, Eqn. (16), four times; once for each failure with j = 1, 2, 3 and 4, for the value of Z. This result can then be used as the unreliability estimate for each failure or the y plotting position. (The Weibull distribution chapter presents a step-by-step example for this method.) The solution of Eqn. (16) for Z requires the use of numerical methods.
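One possible numerical approach is sketched below in Python: the cumulative binomial sum over k = j, ..., N of C(N, k) Z^k (1 - Z)^(N-k) is set equal to P and solved for Z by bisection. Bisection is an assumption made here for simplicity; this reference does not say which numerical method is used, and the function names are hypothetical.

```python
import math

def cumulative_binomial(z, j, n):
    """Probability that at least j of n units fail when each fails with probability z."""
    return sum(math.comb(n, k) * z**k * (1.0 - z) ** (n - k)
               for k in range(j, n + 1))

def rank(j, n, p=0.50, tol=1e-12):
    """Solve cumulative_binomial(Z, j, n) = p for Z by bisection on [0, 1]."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        # The cumulative binomial increases with Z, so step toward the root.
        if cumulative_binomial(mid, j, n) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Median ranks (p = 0.50) for N = 4 and j = 1, 2, 3, 4.
median_ranks = [rank(j, 4) for j in range(1, 5)]
```

For N = 4 this gives median ranks of roughly 0.159, 0.386, 0.614 and 0.841. The first and last can be checked by hand, since for j = 1 the equation reduces to 1 - (1 - Z)^4 = 0.5.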
A more straightforward and easier method of estimating median ranks is by applying two transformations to Eqn. (16), first to the beta distribution and then to the F distribution, resulting in [12; 13],

$$MR = \frac{1}{1 + \frac{N - j + 1}{j} F_{0.50;m;n}}, \qquad m = 2(N - j + 1), \quad n = 2j$$

F0.50;m;n denotes the F distribution at the 0.50 point, with m and n degrees of freedom, for the jth failure out of N units. Weibull++ uses this formulation when determining the median ranks.
Another quick, and less accurate, approximation of the median ranks is also given by [15]:

$$MR \approx \frac{j - 0.3}{N + 0.4} \tag{17}$$

This approximation of the median ranks is also known as Benard's approximation.
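Benard's approximation, (j - 0.3) / (N + 0.4), is simple enough to evaluate directly. The short Python sketch below applies it to the N = 4 example; the function name is hypothetical.

```python
def benard_median_rank(j, n):
    """Benard's approximation to the median rank of the j-th failure out of n."""
    return (j - 0.3) / (n + 0.4)

# Approximate median ranks for N = 4 and j = 1, 2, 3, 4.
approx = [benard_median_rank(j, 4) for j in range(1, 5)]
```

For j = 1 and N = 4 the approximation gives 0.7 / 4.4, about 0.1591, which agrees with the exact median rank 1 - 0.5^(1/4) to about four decimal places.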
Kaplan-Meier
The Kaplan-Meier estimator is used as an alternative to the median ranks method for calculating the estimates of the unreliability for probability plotting purposes. The equation of the estimator is given by,

$$\hat{F}(t_i) = 1 - \prod_{j=1}^{i} \frac{n_j - r_j}{n_j}, \qquad i = 1, \ldots, m \tag{18}$$

where,
m = total number of data points
n = the total number of units
n_j = n minus the failures and surviving units counted in data groups 1 through j - 1, that is, the number of units still on test just before the jth data group
rj = number of failures in the jth data group, and
sj = number of surviving units in the jth data group
Weibull++ provides the option to select whether the median ranks or the Kaplan-Meier estimator is used for the unreliability estimates for probability plotting and regression. By default, the median ranks are used.
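The following Python sketch shows one way the Kaplan-Meier unreliability estimates could be computed. It assumes the estimator takes the usual product-limit form, 1 minus the running product of (n_j - r_j) / n_j, where n_j is the number of units still on test just before the jth data group. The function name and the example data are hypothetical and chosen only for illustration.

```python
def kaplan_meier_unreliability(groups):
    """Unreliability estimate after each data group.

    groups: list of (r_j, s_j) pairs ordered by time, where r_j is the number
    of failures and s_j the number of surviving (suspended) units in group j.
    Each group is assumed to contain at least one unit.
    """
    n_at_risk = sum(r + s for r, s in groups)  # n, the total number of units
    reliability = 1.0
    estimates = []
    for failures, suspensions in groups:
        reliability *= (n_at_risk - failures) / n_at_risk
        estimates.append(1.0 - reliability)
        # Units that failed or were suspended leave the at-risk set.
        n_at_risk -= failures + suspensions
    return estimates

# Illustrative data: four units, one suspension in the second group.
estimates = kaplan_meier_unreliability([(1, 0), (1, 1), (1, 0)])
```

With this illustrative data the estimates are 0.25, 0.5 and 1.0. Note that the last estimate reaches 100% when the final unit fails, so that point cannot be shown on a probability plot.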
Probability Plots for Other Distributions
This same methodology can be applied to other distributions which have cdf equations that can be linearized. Different probability papers exist for each distribution, since different distributions have different cdf equations. Weibull++ automatically creates these plots for you when choosing a probability plot for a particular distribution. Special scales on these plots allow the parameter estimates to be derived directly from the plots, similar to the way β and η were obtained from the Weibull probability plot. These will be discussed in subsequent chapters on the individual distributions.
 
Some Shortfalls of Manual Probability Plotting
Besides the most obvious drawback to probability plotting, which is the amount of effort required, manual probability plotting is not always consistent in the results. Two people plotting a straight line through a set of points will not always draw this line the same way, and thus will come up with slightly different results. This method was used primarily before the widespread use of computers that could easily perform the calculations for more complicated parameter estimation methods, such as the least squares and maximum likelihood methods.
Least Squares Parameter Estimation (Regression Analysis)
Using the idea of probability plotting, regression analysis mathematically fits the best straight line to a set of points, in an attempt to estimate the parameters. Essentially, this is a mathematically based version of the probability plotting method discussed previously.
Background Theory
The method of linear least squares is used for all regression analysis performed by Weibull++, except for the cases of the three-parameter Weibull, mixed Weibull, gamma and generalized gamma distributions where a non-linear regression technique is employed. The terms linear regression and least squares are used synonymously in this reference. The term rank regression is used instead of least squares, or linear regression, because the regression is performed on the rank values, more specifically, the median rank values (represented on the y-axis).
The method of least squares requires that a straight line be fitted to a set of data points, such that the sum of the squares of the distance of the points to the fitted line is minimized. This minimization can be performed in either the vertical or horizontal direction. If the regression is on X, then the line is fitted so that the horizontal deviations from the points to the line are minimized. If the regression is on Y, then the line is fitted so that the vertical deviations from the points to the line are minimized. This is illustrated in the following figure.
Rank Regression on Y 
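Assume that a set of data pairs (x1,y1), (x2,y2),..., (xN,yN) were obtained and plotted, and that the x-values are known exactly. Then, according to the least squares principle, which minimizes the vertical distance between the data points and the straight line fitted to the data, the best fitting straight line to these data is the straight line y = â + b̂x (where the recently introduced (^) symbol indicates that this value is an estimate) such that:

$$\sum_{i=1}^{N} \left(\hat{a} + \hat{b} x_i - y_i\right)^2 = \min_{a,b} \sum_{i=1}^{N} \left(a + b x_i - y_i\right)^2$$

and where â and b̂ are the least squares estimates of a and b, and N is the number of data points. These equations are minimized by estimates of â and b̂ such that:

$$\hat{a} = \frac{\sum_{i=1}^{N} y_i}{N} - \hat{b} \frac{\sum_{i=1}^{N} x_i}{N} = \bar{y} - \hat{b}\bar{x}$$

and:

$$\hat{b} = \frac{\sum_{i=1}^{N} x_i y_i - \frac{\sum_{i=1}^{N} x_i \sum_{i=1}^{N} y_i}{N}}{\sum_{i=1}^{N} x_i^2 - \frac{\left(\sum_{i=1}^{N} x_i\right)^2}{N}}$$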
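A minimal Python sketch of rank regression on Y is given below. It uses the standard closed-form least squares estimates for the line y = â + b̂x: b̂ = (Σxy - ΣxΣy/N) / (Σx² - (Σx)²/N) and â = ȳ - b̂x̄. The function name and the data values are made up for illustration only.

```python
def rank_regression_on_y(xs, ys):
    """Return (a_hat, b_hat) for y = a_hat + b_hat*x, minimizing vertical deviations."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    b_hat = (sum_xy - sum_x * sum_y / n) / (sum_xx - sum_x**2 / n)
    a_hat = sum_y / n - b_hat * sum_x / n
    return a_hat, b_hat

# Illustrative (made-up) data pairs.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]
a_hat, b_hat = rank_regression_on_y(xs, ys)
```

For these values the fitted line is approximately y = 0.15 + 1.94x. In a Weibull analysis, xs would hold ln(T) and ys the transformed median ranks, so that b̂ estimates β and â estimates -β ln(η).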
Rank Regression on X 
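Assume that a set of data pairs (x1,y1), (x2,y2),..., (xN,yN) were obtained and plotted, and that the y-values are known exactly. The same least squares principle is applied, this time minimizing the horizontal distance between the data points and the straight line fitted to the data. The best fitting straight line to these data is the straight line x = â + b̂y such that:

$$\sum_{i=1}^{N} \left(\hat{a} + \hat{b} y_i - x_i\right)^2 = \min_{a,b} \sum_{i=1}^{N} \left(a + b y_i - x_i\right)^2$$

Again, â and b̂ are the least squares estimates of a and b, and N is the number of data points. These equations are minimized by estimates of â and b̂ such that:

$$\hat{a} = \frac{\sum_{i=1}^{N} x_i}{N} - \hat{b} \frac{\sum_{i=1}^{N} y_i}{N} = \bar{x} - \hat{b}\bar{y} \tag{19}$$

and:

$$\hat{b} = \frac{\sum_{i=1}^{N} x_i y_i - \frac{\sum_{i=1}^{N} x_i \sum_{i=1}^{N} y_i}{N}}{\sum_{i=1}^{N} y_i^2 - \frac{\left(\sum_{i=1}^{N} y_i\right)^2}{N}} \tag{20}$$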
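Rank regression on X can be sketched in the same way, with the roles of x and y exchanged: the fitted line is x = â + b̂y, with b̂ = (Σxy - ΣxΣy/N) / (Σy² - (Σy)²/N) and â = x̄ - b̂ȳ. The function name and data below are illustrative assumptions.

```python
def rank_regression_on_x(xs, ys):
    """Return (a_hat, b_hat) for x = a_hat + b_hat*y, minimizing horizontal deviations."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_yy = sum(y * y for y in ys)
    b_hat = (sum_xy - sum_x * sum_y / n) / (sum_yy - sum_y**2 / n)
    a_hat = sum_x / n - b_hat * sum_y / n
    return a_hat, b_hat

# Illustrative (made-up) data pairs.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]
a_hat, b_hat = rank_regression_on_x(xs, ys)

# Slope of the same line when rewritten in the form y = ... + slope * x.
slope_in_y_form = 1.0 / b_hat
```

Unless the points lie exactly on a line, regression on X and regression on Y give slightly different lines for the same data, because they minimize deviations in different directions.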
The corresponding relations for determining the parameters for specific distributions (i.e. Weibull, exponential, etc.) are presented in the chapters covering that distribution.
The Correlation Coefficient
The correlation coefficient is a measure of how well the linear regression model fits the data and is usually denoted by ρ. In the case of life data analysis, it is a measure for the strength of the linear relation (correlation) between the median ranks and the data. The population correlation coefficient is defined as follows:

$$\rho = \frac{\sigma_{xy}}{\sigma_x \sigma_y}$$

where σxy = covariance of x and y, σx = standard deviation of x, and σy = standard deviation of y.
The estimator of ρ is the sample correlation coefficient, ρ̂, given by,

$$\hat{\rho} = \frac{\sum_{i=1}^{N} x_i y_i - \frac{\sum_{i=1}^{N} x_i \sum_{i=1}^{N} y_i}{N}}{\sqrt{\left(\sum_{i=1}^{N} x_i^2 - \frac{\left(\sum_{i=1}^{N} x_i\right)^2}{N}\right)\left(\sum_{i=1}^{N} y_i^2 - \frac{\left(\sum_{i=1}^{N} y_i\right)^2}{N}\right)}}$$

The range of ρ̂ is -1 ≤ ρ̂ ≤ 1.
The closer the value is to ±1, the better the linear fit. Note that +1 indicates a perfect fit (the paired values (xi, yi) lie on a straight line) with a positive slope, while -1 indicates a perfect fit with a negative slope. A correlation coefficient value of zero would indicate that the data are randomly scattered and have no pattern or correlation in relation to the regression line model.
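The sample correlation coefficient can be computed from the same sums used in the regression. The Python sketch below assumes the standard sample (Pearson) form, Sxy / sqrt(Sxx Syy); the function name and data are illustrative only.

```python
import math

def sample_correlation(xs, ys):
    """Sample correlation coefficient of the paired values (x_i, y_i)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    s_xy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    s_xx = sum(x * x for x in xs) - sum_x**2 / n
    s_yy = sum(y * y for y in ys) - sum_y**2 / n
    return s_xy / math.sqrt(s_xx * s_yy)

xs = [1.0, 2.0, 3.0, 4.0]
rho_noisy = sample_correlation(xs, [2.1, 3.9, 6.2, 7.8])  # nearly linear data
rho_up = sample_correlation(xs, [3.0, 5.0, 7.0, 9.0])     # exact line, positive slope
rho_down = sample_correlation(xs, [9.0, 7.0, 5.0, 3.0])   # exact line, negative slope
```

The two exact lines give +1 and -1 (up to rounding), while the nearly linear data give a value slightly below 1.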
Comments on the Least Squares Method
The least squares estimation method is quite good for functions that can be linearized. (Note: Most of the distributions used in life data analysis are capable of being linearized.) For these distributions, the calculations are relatively easy and straightforward, having closed-form solutions which can readily yield an answer without having to resort to numerical techniques or tables. Further, this technique provides a good measure of the goodness-of-fit of the chosen distribution in the correlation coefficient. Least squares is generally best used with data sets containing complete data, that is, data consisting only of single times-to-failure with no censored or interval data. The Data & Data Types chapter details the different data types, including complete, left censored, right censored (or suspended) and interval data.
MLE (Maximum Likelihood) Parameter Estimation for Complete Data
From a statistical point of view, the method of maximum likelihood estimation is, with some exceptions, considered to be the most robust of the parameter estimation techniques discussed here. This method is presented in this section for complete data, that is, data consisting only of single times-to-failure.
Background on Theory
The basic idea behind MLE is to obtain the most likely values of the parameters, for a given distribution, that will best describe the data.
As an example, consider the following data (-3, 0, 4) and assume that you are trying to estimate the mean of the data. Now, if you have to choose the most likely value for the mean from -5, 1 and 10, which one would you choose? In this case, the most likely value is 1 (given your limit on choices). Similarly, under MLE, one determines the most likely value(s) for the parameter(s) of the assumed distribution.
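The choice in this example can be checked numerically. The Python sketch below assumes, purely for illustration, that the data follow a normal distribution with a known standard deviation of 1 (the text does not specify a distribution), and compares the log-likelihood of the data at each candidate mean. The function name is hypothetical.

```python
import math

def normal_log_likelihood(data, mean, sigma=1.0):
    """Log-likelihood of the data under a normal model (sigma = 1 is an assumption)."""
    return sum(
        -0.5 * math.log(2.0 * math.pi * sigma**2) - (x - mean) ** 2 / (2.0 * sigma**2)
        for x in data
    )

data = [-3, 0, 4]
candidates = [-5, 1, 10]

# Pick the candidate mean under which the observed data are most likely.
best = max(candidates, key=lambda m: normal_log_likelihood(data, m))
```

Under this assumed model the candidate 1 has the highest likelihood, because it lies closest to the data in the squared-distance sense.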
It is mathematically formulated as follows:
If x is a continuous random variable with pdf:

$$f(x; \theta_1, \theta_2, \ldots, \theta_k)$$

where θ1, θ2,..., θk are k unknown parameters which need to be estimated, with R independent observations, x1, x2,..., xR, which correspond in the case of life data analysis to failure times. The likelihood function is given by:

$$L(x_1, x_2, \ldots, x_R \mid \theta_1, \theta_2, \ldots, \theta_k) = L = \prod_{i=1}^{R} f(x_i; \theta_1, \theta_2, \ldots, \theta_k), \qquad i = 1, 2, \ldots, R$$

The logarithmic likelihood function is given by:

$$\Lambda = \ln L = \sum_{i=1}^{R} \ln f(x_i; \theta_1, \theta_2, \ldots, \theta_k)$$

(Note: Weibull++ provides a three-dimensional plot of this log-likelihood function. See Appendix A for an example.)
The maximum likelihood estimators (or parameter values) of θ1, θ2,..., θk are obtained by maximizing L or Λ.
By maximizing Λ, which is much easier to work with than L, the maximum likelihood estimators (MLE) of θ1, θ2,..., θk are the simultaneous solutions of k equations such that:

$$\frac{\partial \Lambda}{\partial \theta_j} = 0, \qquad j = 1, 2, \ldots, k$$
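As an illustration of solving these equations, the Python sketch below computes the MLE for a two-parameter Weibull distribution with complete data. The Weibull case is chosen here only as an example. Setting the derivative of the log-likelihood with respect to η to zero gives η = (Σ t_i^β / R)^(1/β); substituting this into the derivative with respect to β leaves a single equation in β, which is solved by bisection. The bracketing interval, tolerance and function names are assumptions of this sketch, and it is not a description of how Weibull++ performs the calculation.

```python
import math

def weibull_mle(times, lo=0.01, hi=50.0, tol=1e-10):
    """MLE of (beta, eta) for complete failure times, by bisection on beta."""
    r = len(times)
    logs = [math.log(t) for t in times]
    mean_log = sum(logs) / r

    def score(beta):
        # Derivative of the log-likelihood in beta (divided by R), with eta
        # already replaced by its maximizing value for this beta.
        powers = [t**beta for t in times]
        weighted = sum(p * l for p, l in zip(powers, logs))
        return 1.0 / beta + mean_log - weighted / sum(powers)

    # score() decreases as beta grows: positive near lo, negative near hi.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if score(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    beta = 0.5 * (lo + hi)
    eta = (sum(t**beta for t in times) / r) ** (1.0 / beta)
    return beta, eta

def log_likelihood(times, beta, eta):
    """Weibull log-likelihood for complete data."""
    return sum(
        math.log(beta) - beta * math.log(eta)
        + (beta - 1.0) * math.log(t) - (t / eta) ** beta
        for t in times
    )

times = [10.0, 20.0, 30.0, 40.0]  # illustrative failure times in hours
beta_hat, eta_hat = weibull_mle(times)
```

A simple sanity check is that `log_likelihood(times, beta_hat, eta_hat)` is no smaller than the log-likelihood at nearby parameter values. Very large failure times or a wider bracket could overflow `t**beta`, so a production implementation would need to rescale the data.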
Even though it is common practice to plot the MLE solutions using median ranks (points are plotted according to median ranks and the line according to the MLE solutions), this is not completely representative. As can be seen from the equations above, the MLE method is independent of any kind of ranks. For this reason, the MLE solution often appears not to track the data on the probability plot. This is perfectly acceptable since the two methods are independent of each other, and in no way suggests that the solution is wrong.
Comments on the MLE Method
The MLE method has many large sample properties that make it attractive for use. It is asymptotically consistent, which means that as the sample size gets larger, the estimates converge to the right values. It is asymptotically efficient, which means that for large samples, it produces the most precise estimates. It is asymptotically unbiased, which means that for large samples one expects to get the right value on average. The distribution of the estimates themselves is normal, if the sample is large enough, and this is the basis for the usual Fisher Matrix confidence bounds discussed later. These are all excellent large sample properties.
Unfortunately, the size of the sample necessary to achieve these properties can be quite large: thirty to fifty to more than a hundred exact failure times, depending on the application. With fewer points, the methods can be badly biased. It is known, for example, that MLE estimates of the shape parameter for the Weibull distribution are badly biased for small sample sizes, and the effect can be increased depending on the amount of censoring. This bias can cause major discrepancies in analysis.
There are also pathological situations when the asymptotic properties of the MLE do not apply. One of these is estimating the location parameter for the three-parameter Weibull distribution when the shape parameter has a value close to 1. These problems, too, can cause major discrepancies.
However, MLE can handle suspensions and interval data better than rank regression, particularly when dealing with a heavily censored data set with few exact failure times or when the censoring times are unevenly distributed. It can also provide estimates with one or no observed failures, which rank regression cannot do.
As a rule of thumb, our recommendation is to use rank regression techniques when the sample sizes are small and without heavy censoring (censoring is discussed in the Data & Data Types chapter). When heavy or uneven censoring is present, when a high proportion of interval data is present and/or when the sample size is sufficient, MLE should be preferred.
