Session 1

Classical Methods of Experimental Design

Prof. Shiv G. Kapoor
Department of Mechanical Science and Engineering
University of Illinois at Urbana-Champaign

© 2006 Dr. Shiv G. Kapoor. All Rights Reserved.

OUTLINE

1. Introduction
Role of Experimental Design, Important Concepts in the DOE
2. Review of Basic Statistical Methods and Probability Concepts
Discrete and Continuous Probability Distribution Functions,
Normal and Sampling Distributions, Tests of Hypotheses
3. Comparative Experiments
Comparing Two Treatments
4. Design and Analysis of 2k Factorial Experiments
General Factorial Designs and Designs at Two Levels
Calculation and Interpretation of Main and Interaction Effects
5. Two-Level Fractional Factorial Designs
Rationale for, and Consequences of, Fractions of Two-Level
Factorials
Concept of Design Resolution

THE NEED FOR STATISTICAL METHODS
IN THE DESIGN OF EXPERIMENTS

1. The world around us is not deterministic.
– Variability is part of the natural order of things.
 Variation in data is neither totally chaotic nor small enough to be ignored.
 It is real, identifiable, and statistically predictable.


NATURE OF VARIABILITY IN DATA

What is Variability?
 Let us understand the proper interpretation of
variability in data via a dialog between the
professor and a young, naive student.
 Professor: “I am interested in the tool life of
this tool.”

NATURE OF VARIABILITY IN DATA

Conditions
Speed: 170 FPM
Feed: 0.017 IPR
Depth of Cut: 0.07 IN
Workpiece: 1018 steel, 6 in. diameter, 24 in. long


NATURE OF VARIABILITY IN DATA

 Student: “I will go down to the lab and run
a test to determine the required tool life.”

[Diagram: Speed, Feed, Depth of Cut, and Material as inputs to Tool Life]
NATURE OF VARIABILITY IN DATA

 Student: “I have found that the tool life
for the conditions you specified is 15
minutes.”
 Professor: “That’s fine, but why don’t you
go back and run another test? I want to be
sure that the tool life is 15 minutes.”
 Student: “I told you that the tool life is 15
minutes, but I will run another test if you
insist.”

NATURE OF VARIABILITY IN DATA

 After running another test, the student
came back to the professor and said,
 “Professor, this time I got a tool life of 16.2
minutes. However, I noticed some
fluctuations in the machine power. If we
install a power regulator, we can probably
eliminate this source of variation and get a
true tool life value.”

NATURE OF VARIABILITY IN DATA

 Professor: “Do whatever you need to do
and re-run the test.”

[Diagram: Power Regulator added; Speed, Feed, Depth of Cut, and Material as inputs to Tool Life]

NATURE OF VARIABILITY IN DATA

 Student: “After installing the power
regulator, I now get a tool life of 15.6
minutes; this is the true tool life.”
 Professor: “Let me ask you: do you think that
the machine was vibrating? Should you be
making the machine more rigid and then re-
running the test?”
 Student: “You are right; I had better make the
machine more rigid.”
NATURE OF VARIABILITY IN DATA

 Student: “Now the tool life is 16.2
minutes. I am sure that the true tool life is
16.2 minutes.”
 Professor: “How can you be so sure that
you have eliminated all variation? What about
differences in materials, tools, operator
inconsistencies, etc.?”
 Student: “Well, I suppose they could
cause some variation.”

NATURE OF VARIABILITY IN DATA

 Professor: “What I really want is an
average tool life for an average tool cutting an
average material under average environmental
conditions! There will always be some
variation in the process; we just do the best
we can.”
 Student: “OK, I’ll run more tests using
several tools, material pieces, etc., and find
out what the average tool life is.”
NATURE OF VARIABILITY IN DATA

 Student: “I have run tests over a random
sample of tools, workpieces, times of day, etc.
The results of the tests are:

Test  Tool Life (min)   Test  Tool Life (min)
 1        15.0            7       15.0
 2        15.6            8       16.7
 3        16.2            9       16.0
 4        16.5           10       16.0
 5        16.2           11       16.7
 6        16.5           12       16.8

I find the average tool life to be 16.1 min.”

NATURE OF VARIABILITY IN DATA

 Professor: “That’s fine, but how confident
are you that 16.1 min is the true tool life?”
 Student: “Pretty darn confident!”
 Professor: “That’s not good enough. I need
more information. You had better tell me
something like ‘I am x% confident that the
true tool life is within a certain range.’ Doesn’t
that sound logical?”
 Student: “It certainly is logical, but I don’t
know the answer.”
THE NEED FOR STATISTICAL METHODS IN
THE DESIGN OF EXPERIMENTS

2. We can easily define many factors of potential
significance, but only a few account for the vast
majority of the structure/variation in the data.

The problem at hand is to screen, from a large
group of potentially important factors, those few
that are worthy of continuing study.
3. The sequential nature of experimentation and the
iterative process of building up a knowledge base
is central to experimental work.

THE NEED FOR STATISTICAL METHODS IN
THE DESIGN OF EXPERIMENTS

4. All processes are subject to identifiable and
unidentifiable disturbances, which can totally
invalidate results.

5. In most physical processes, variables tend not to
influence the process independently of one
another.

6. The world around us is non-linear.

THE NEED FOR STATISTICAL METHODS IN
THE DESIGN OF EXPERIMENTS

7. Experimentation is a costly and time-consuming
business.

8. We generally respond to crises, not principles.
– There is never enough time to do the job right,
but always enough time to do the job over.

9. We know much less about what makes things
work than we think.


THE NEED FOR STATISTICAL METHODS IN
THE DESIGN OF EXPERIMENTS

10. Concepts drive people – people drive
techniques.
A myth we must dispel: “Everything he is saying
sounds logical and probably works fantastically for
some people, but my specific problem simply
doesn’t lend itself to this approach and/or simply
doesn’t need it anymore.”

This statement is made all too often.


THE SEQUENTIAL AND ITERATIVE
NATURE OF EXPERIMENTATION

To optimize and control a process, a logical series
of questions must be asked:

1. Which of a list of potentially important variables
are worthy of further study?
2. How, specifically, do the important variables tend
to influence the results?
3. What levels of the most important variables tend
to optimize the process?

@ 2006 Dr. Shiv G. Kapoor All Rights Reserved


IE 400 Lecture 1

THE SEQUENTIAL AND ITERATIVE
NATURE OF EXPERIMENTATION

4. In the context of the above, what characteristics
of the experimental environment may be
bothersome and, therefore, need to be
somehow neutralized?
5. What specific concepts/techniques need to be
invoked en route to answering the above?

THE SEQUENTIAL AND ITERATIVE
NATURE OF EXPERIMENTATION

The key to successfully and efficiently addressing
these questions is the iterative nature of the learning
process.
 Conjecture: a hypothesis about the situation at
hand.
 Design: creation of an exercise/experiment to
test the hypothesis, i.e., place it in jeopardy.


THE SEQUENTIAL AND ITERATIVE
NATURE OF EXPERIMENTATION

 Experiment: conduct the experiment and
obtain the data.
 Analysis: examination of the data to study its
plausibility in light of the hypothesis being tested.
This is then followed by modification of the
original hypothesis in accordance with the analysis
and deductions, and then by design of a second
experiment, and so on.
THE SEQUENTIAL AND ITERATIVE
NATURE OF EXPERIMENTATION

 In initiating an experimental study, we
should probably plan, in the first experiment,
to do no more than 25% of the total
experiments we have resources for.


THE SEQUENTIAL AND ITERATIVE
NATURE OF EXPERIMENTATION

 One big experiment is not only inefficient but
leaves us with no resources left if our initial
conjecture was in error.
 We should begin by looking at many factors in
a somewhat superficial fashion and move
toward the examination of only the few most
relevant factors in a more comprehensive
fashion.

DIFFICULTIES WITH EXPERIMENTAL
WORK WHICH REQUIRES STATISTICAL
METHODS

 Three key sources of difficulty confront
the experimenter:
Experimental error.
Confusion of correlation with causation.
Complexity of variable effects.


DIFFICULTIES WITH EXPERIMENTAL
WORK WHICH REQUIRES STATISTICAL
METHODS

Experimental error
1. Composed of many minute disturbances
which individually have little effect on the
outcome of the experiment.
2. Collectively these small chance occurrences
may increase the dispersion or spread of the
results to the point where real variable effects
are masked.
DIFFICULTIES WITH EXPERIMENTAL
WORK WHICH REQUIRES STATISTICAL
METHODS

Experimental error
3. Composed of more than errors of
measurement – not all instrumentation
oriented. A good measurement system
accounts for no more than 10-15% of the
total error.
4. Can be a function of both unknown and
known sources.

DIFFICULTIES WITH EXPERIMENTAL
WORK WHICH REQUIRES STATISTICAL
METHODS

Statistical analysis
1. Can the results be explained solely by
chance causes?
2. How much data is required to reveal the
existence of true effects in light of chance
error?

DIFFICULTIES WITH EXPERIMENTAL
WORK WHICH REQUIRES STATISTICAL
METHODS

Correlation Versus Causation

1. Two factors may be highly related only because they are
related to a third common (often unidentified)
factor. A deliberate change in one may not lead to
a change in the other – the concept of planned
versus passively observed data.
2. To really find out how changes in some factor
affect the output, you have to change that factor
deliberately and observe the change.

DIFFICULTIES WITH EXPERIMENTAL
WORK WHICH REQUIRES STATISTICAL
METHODS

Complexity of Variable Effects

1. A nice state of affairs: variable effects are linear
and additive! Generally this is not the case!
Example: the effects of aspirin and coffee on driving
reaction time: aspirin increases it by Δ; coffee
reduces it by 2Δ. Will one aspirin and one cup of
coffee reduce reaction time by Δ? (Additive?)
Will 10 aspirins and 5 cups of coffee keep
reaction time constant? (Linear?)
2. We need to plan experiments to reveal variable
interactions and nonlinear variable effects.
SOME IMPORTANT CONCEPTS RELATED
TO STATISTICAL DESIGN OF
EXPERIMENTS

External Versus Internal Experimental
Comparisons
 Comparing an experimental result with past
data is inviting, because the data may already be
there and the database may be big! This can be
dangerous because
• prevailing conditions may have changed from the time of
database development to the time of the experiment;
• the database may have been passively observed –
poor control, no active interference with the system.

SOME IMPORTANT CONCEPTS RELATED
TO STATISTICAL DESIGN OF
EXPERIMENTS

External Versus Internal Experimental
Comparisons

 The presence of extraneous variation factors
may yield poor sensitivity of the statistical test. It
is generally best to stay within the experiment
when making comparisons.

SOME IMPORTANT CONCEPTS RELATED
TO STATISTICAL DESIGN OF
EXPERIMENTS

Essential Nature of a Significance Test


 In the analysis of the results of comparative
experiments, it is necessary to assess the
magnitude of the test statistic of interest, say a
difference in the mean results of two methods, in
light of the natural variability inherent in the
experiment.
 The question we are really asking is: could this
experimental result – the difference – have arisen solely
due to chance causes, or is there a real difference
in the methods?

SOME IMPORTANT CONCEPTS RELATED
TO STATISTICAL DESIGN OF
EXPERIMENTS

Essential Nature of a Significance Test


 To answer this question we must measure the
level of chance variation, usually through genuine
test replication, and then determine the
probability that we could have observed the result
before us or one more extreme if only these
chance causes were at work.

This is the role of a statistical test of significance.


SOME IMPORTANT CONCEPTS RELATED
TO STATISTICAL DESIGN OF
EXPERIMENTS

Blocking to Avoid Nuisance Variation

 When known sources of extraneous/unwanted
variation can be identified, we can design the
experiment in such a way as to eliminate their
influence and provide a more sensitive test of
significance.


SOME IMPORTANT CONCEPTS RELATED
TO STATISTICAL DESIGN OF
EXPERIMENTS

Example of Blocking

 Problem: Two different tool materials are being
tested to determine whether or not there is a real
difference in their wear characteristics.

[Diagram: tools of material A and material B compared for wear]
IE 400 Lecture 1
SOME IMPORTANT CONCEPTS RELATED
TO STATISTICAL DESIGN OF
EXPERIMENTS

Example of Blocking
 First Experimental Design:
 Twenty tools are made, ten with material A and
ten with material B.
 Twenty machine operators are chosen at
random, given the tools, and told to use them in
machining as they normally do.

SOME IMPORTANT CONCEPTS RELATED
TO STATISTICAL DESIGN OF
EXPERIMENTS

Example of Blocking
 First Experimental Design:
 At the end of the experiments the mean
amount of wear is determined based on the ten
measurements for each material type.
 The mean difference is calculated and
examined by a statistical test of significance.
 No real difference is found.
SOME IMPORTANT CONCEPTS RELATED
TO STATISTICAL DESIGN OF
EXPERIMENTS

Problem with First Experiment


 The twenty operators are very different in terms of
their skills and experiences.
 Hence, a sizeable nuisance variation is introduced
within the measurements comprising the mean
wear for each material.
 This nuisance variation markedly increases the
level of “chance” variation and hence may be
“hiding” the presence of a real difference in
materials.


SOME IMPORTANT CONCEPTS RELATED
TO STATISTICAL DESIGN OF
EXPERIMENTS

Solution to the Problem

 Re-design the experiment:
Give each operator a pair of tools, one of
material A and one of material B.
At the end of the experiment, measure the wear on
each tool for each operator and calculate the
difference in wear within each operator.
Average these differences across all operators
and perform a test of significance on this
average difference.

SOME IMPORTANT CONCEPTS RELATED
TO STATISTICAL DESIGN OF
EXPERIMENTS

Solution to the Problem

 Operator-to-operator nuisance variation is in
this way blocked from consideration, since only
relative differences within each operator-block
are examined.

This is an example of the technique of
blocking – a very important experimental
design concept. A numerical sketch follows.
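As a numerical illustration of why blocking helps, here is a small hedged sketch with invented wear data (the values below are hypothetical, not from the experiment described):

```python
# Hypothetical wear data: five operators, one tool of each material per operator.
from statistics import mean, stdev

wear_A = [3.1, 4.0, 2.6, 3.8, 3.3]   # material A, by operator
wear_B = [2.8, 3.6, 2.1, 3.5, 3.0]   # material B, same operators

# Unblocked view: operator-to-operator differences inflate both spreads.
print(stdev(wear_A), stdev(wear_B))

# Blocked view: within-operator differences remove the operator effect.
diffs = [a - b for a, b in zip(wear_A, wear_B)]
print(mean(diffs), stdev(diffs))     # small, stable spread; the mean difference stands out
```

The test of significance is then applied to the differences, whose spread no longer contains the operator-to-operator variation.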

RANDOMIZATION TO GUARANTEE
VALIDITY OF TEST RESULTS

 Unknown sources of nuisance variation are
always present but hard to identify – e.g., trends from
day to day.
 Systematic conduct of tests may violate basic
statistical assumptions for significance tests,
i.e., normality/independence.
 Random sampling implies that the data are
independently distributed about their respective
means.

Block for what you can identify; randomize for
what you cannot.
INTERACTION AMONG VARIABLES

 Often in experimental work it is assumed that
factors influence the results independently of each
other – that the effects are additive.
 Unfortunately, this is often not the case.
 More likely, the way in which one factor influences
the results depends upon the level of one or more
other factors; that is,

the factors are interdependent, or interact,
with each other.

INTERACTION AMONG VARIABLES

 Example:
Two factors, temperature and pressure, are thought to
affect chemical reaction time.

[Figure: Mr. X’s experiment and Mr. Y’s experiment]
SUMMARY

 Experiments should be comparative.
 There should be genuine replication: replicates
provide an accurate measure of error.
 Whenever appropriate, blocking (pairing) should
be used to reduce error.
 Randomization is needed for homogeneity and
independence.
 Experiments should be designed in such a way as
to be able to determine the interaction effects of
factors.

OTHER ISSUES IN PLANNED
EXPERIMENTATION

 The purpose of most experimental work is to
discover the direction of changes that may lead to
improvements in both quality and productivity.
 A fundamental task in designing an experiment is
to select an appropriate arrangement of test points
within the space defined by the independent
variables and to develop a mathematical model.

OTHER ISSUES IN PLANNED
EXPERIMENTATION

 For example, if a quadratic relationship between
two variables is suspected, an experiment that
studies the process at only two levels of these
variables will be inadequate.
 Similarly, an experiment using four levels would be
unnecessary and inefficient if the true relationship
were linear.

Session 2

Review of Basic Statistical Methods

Prof. Shiv G. Kapoor
Department of Mechanical Science and Engineering
University of Illinois at Urbana-Champaign

© 2006 University of Illinois Board of Trustees. All Rights Reserved.

NATURE OF VARIABILITY IN DATA

 When we collect data, we are interested in
finding out
 how the process is behaving in terms of
an output quality characteristic (or characteristics),
perhaps in terms of the value of the
characteristic on the average, and
 the variation in individual measurements
of that characteristic.
NATURE OF VARIABILITY IN DATA

 In general, we have three choices:
1) Observe the process once and use that
observation as an absolute reflection of the
process behavior.
2) Observe all of the output of the process to get
a true reflection of its behavior.
3) Observe part of the output of the process and
use it to infer something about the true process
behavior.

NATURE OF VARIABILITY IN DATA

 If we do 1), we are committing a potentially
“fatal” error.
 If we do 2), we are said to be observing the
entire population or universe.
 The term population, meaning all possible
realizations of the process, suggests that the
terms in the population constitute a very large
number – perhaps, in a more abstract sense, an
infinite population size.

NATURE OF VARIABILITY IN DATA

 For any particular circumstance of practical
significance, however, a population is
generally of finite size, though that size could
be quite large.
 It is generally neither practical nor necessary
to observe the entire population.
 Rather, we observe a relatively small subset
of it (choice 3 above). When we do this, we
are said to be sampling the process
(population).
NATURE OF VARIABILITY IN DATA

 How much we sample, when we take the
sample, and how we specifically take the
sample are critical issues.
 But the fundamental issue is this: we will
draw the sample from the process and use
the information contained in the sample to
say something about the process.
 We need an efficient and adequate method to
collect and analyze data.

CHARACTERIZATION OF DATA

Characterization of Data
 Three Characteristics:
1) Central Tendency
2) Dispersion or Variability
3) Shape of Distribution of Frequencies

CHARACTERIZATION OF DATA

A Measure of Central Tendency

 Given a sample of n pieces of data (X1,
X2, …, Xn) taken from a given population of
size N, the arithmetic mean of the sample,
denoted by $\bar{X}$, is

$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$

 The true mean of the population of size N is
denoted by µ.
CHARACTERIZATION OF DATA

Measures of Dispersion or Variability

 The range, denoted by R, is the difference between
the largest and the smallest value of the data. In general,

$R = X_{largest} - X_{smallest}$

 The variance is another measure of the variability in
data. The sample variance, denoted by s², is given by

$s^2 = \frac{\sum_{i=1}^{n}(X_i - \bar{X})^2}{n - 1}$

 The variance of the population is referred to as σ².
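As a minimal sketch (not part of the original notes), these three measures can be computed with Python's standard library, which uses the same n − 1 divisor for the sample variance:

```python
from statistics import mean, variance

x = [70, 72, 71, 70, 69]      # an illustrative sample, n = 5
x_bar = mean(x)               # arithmetic mean of the sample
r = max(x) - min(x)           # range R = X_largest - X_smallest
s2 = variance(x)              # sample variance s^2 (divides by n - 1)
print(x_bar, r, s2)
```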

CHARACTERIZATION OF DATA-AN
EXAMPLE

 It has been shown that the average can be
used to describe the data's central tendency.
 However, it is possible for two sets of data to
have equal averages while their degrees of
scatter differ, and
 it has been a common mistake in many
applications to put all the emphasis on the
average while overlooking the scatter of the data.
CHARACTERIZATION OF DATA-AN
EXAMPLE

 Such a mistake usually leads to erroneous
conclusions which could have been
easily avoided if the scatter of the data had
been considered.
 Suppose two different brands, A and B, of
cutting tools are used to machine 20
workpieces, 10 with each brand.
 The surface finish readings taken on the 20
workpieces are shown in Table 1.

CHARACTERIZATION OF DATA-AN
EXAMPLE

Table 1 Surface Finish

X_Ai   X_Bi   $(X_{Ai}-\bar{X}_A)^2$   $(X_{Bi}-\bar{X}_B)^2$
 70     70          0                       0
 72     70          4                       0
 71     70          1                       0
 70     69          0                       1
 69     69          1                       1
 70     70          0                       0
 71     70          1                       0
 70     71          0                       1
 68     71          4                       1
 69     70          1                       0

CHARACTERIZATION OF DATA-AN
EXAMPLE

 The two averages $\bar{X}_A$ and $\bar{X}_B$ are calculated as:

$\bar{X}_A = \frac{70 + 72 + \cdots + 69}{10} = \frac{700}{10} = 70$

$\bar{X}_B = \frac{70 + 70 + \cdots + 70}{10} = \frac{700}{10} = 70$


CHARACTERIZATION OF DATA-AN
EXAMPLE

 Judging from these two averages alone, it seems
that there is no difference in the surface finish
readings, whichever brand of cutting tool is used.
 However, if we calculate the sample variances
for the two samples, we get:

$S_A^2 = \frac{\sum_{i=1}^{10}(X_{Ai} - \bar{X}_A)^2}{10 - 1} = \frac{12}{9} = 1.33$

$S_B^2 = \frac{\sum_{i=1}^{10}(X_{Bi} - \bar{X}_B)^2}{10 - 1} = \frac{4}{9} = 0.44$
CHARACTERIZATION OF DATA-AN
EXAMPLE

 Hence, the two sets of surface finish data are
characterized statistically as

$\bar{X}_A = 70$ microinches, $\bar{X}_B = 70$ microinches,
$S_A = 1.15$ microinches, $S_B = 0.66$ microinches.


CHARACTERIZATION OF DATA-AN
EXAMPLE

 It would appear that cutting tool B gives a more
consistent (less variable) surface finish.
 Whether the difference we see here in these
sample results is real/significant remains to be
determined by a test of statistical significance.

Probability Density Function

 For a continuous random variable X, the probability behavior is
described by a function called the probability density function
(pdf), f(x). A pdf must satisfy

$f(x) \ge 0$ for all x, and $\int_{-\infty}^{\infty} f(x)\,dx = 1$

over the interval for X. The corresponding cumulative
distribution function (cdf) of a continuous random variable with
pdf f(x) is given by

$F(x) = P(X \le x) = \int_{-\infty}^{x} f(t)\,dt$

Note that

$f(x) = \frac{d}{dx}F(x)$
Mathematical Expectation

 The expectation of a function g(X) of a random variable X is defined
as

$E(g(X)) = \int_{-\infty}^{\infty} g(x)\,f(x)\,dx$

 g(x) = x: the expectation of X, E(X), is the true arithmetic mean value
of X, usually denoted by µx.
 g(x) = (x − µx)²: the expectation of (X − µx)² is the true variance of X,
denoted by σ²x.
Some useful rules for expectation:
1. E(cX) = c E(X), if c is a constant
2. E(X+Y) = E(X) + E(Y) and E(X−Y) = E(X) − E(Y)
3. Var(cX) = c² Var(X), if c is a constant
4. Var(X+Y) = Var(X) + Var(Y), if X and Y are independent
5. Var(X−Y) = Var(X) + Var(Y), if X and Y are independent.



Uniform or Rectangular Distribution

 The pdf of a rectangular distribution is given by:

$f(x) = \frac{1}{b-a}$ for $a \le x \le b$; $f(x) = 0$ elsewhere.

 The mean and variance of this distribution are:

$\mu = E(X) = \int_{a}^{b} \frac{x}{b-a}\,dx = \frac{b+a}{2}$

$Var(X) = E[(X-\mu)^2] = \int_{-\infty}^{\infty} (x-\mu)^2 f(x)\,dx = E(X^2) - [E(X)]^2 = \frac{(b-a)^2}{12}$

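A quick numerical check of these two results, a sketch assuming scipy is available (the choice a = 11.5, b = 12.5 anticipates Example 1 below):

```python
from scipy.integrate import quad

a, b = 11.5, 12.5
f = lambda x: 1.0 / (b - a)                  # rectangular pdf on [a, b]

mu, _ = quad(lambda x: x * f(x), a, b)       # E(X)
ex2, _ = quad(lambda x: x**2 * f(x), a, b)   # E(X^2)

print(mu, (b + a) / 2)                       # both 12.0
print(ex2 - mu**2, (b - a)**2 / 12)          # both 1/12 = 0.0833...
```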

Example 1

 A wire cutting machine cuts wire to a specified length.
Due to certain inaccuracies of the cutting mechanism,
the length of the cut wire (in centimeters), say X, may
be considered a uniformly distributed random
variable over (11.5, 12.5). The specified length is 12
centimeters. If the length of the cut wire is between
11.7 and 12.2 cm, the wire can be sold for a profit of
$0.25. If the length of the cut wire is greater than or
equal to 12.2 centimeters, a profit of $0.10 is realized.
However, if the length of the cut wire is less than 11.7
cm, the wire is discarded with a loss of $0.02. What is
the expected profit per wire cut?



Solution

 Given:

$f(x) = \frac{1}{12.5 - 11.5} = 1.0$ for $11.5 \le x \le 12.5$; $f(x) = 0$ elsewhere.

Also, let profit P be a random variable. It takes the values
P = 0.25 for 11.7 ≤ x < 12.2
P = 0.10 for x ≥ 12.2
P = −0.02 for x < 11.7

Since f(x) is known, we can find $E(P) = \int_{-\infty}^{\infty} P(x)\,f(x)\,dx$:

$E(P) = \int_{11.7}^{12.2} (0.25)(1)\,dx + \int_{11.5}^{11.7} (-0.02)(1)\,dx + \int_{12.2}^{12.5} (0.10)(1)\,dx$
$= 0.125 + (-0.004) + 0.03 = \$0.151$, about 15 cents.
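The same expectation can be sketched in a few lines of Python (scipy assumed); the breakpoints passed to quad mark where the profit function jumps:

```python
from scipy.integrate import quad

def profit(x):
    if x < 11.7:
        return -0.02          # discarded
    elif x < 12.2:
        return 0.25           # sold at full profit
    return 0.10               # reduced profit

# f(x) = 1.0 on (11.5, 12.5), so E(P) is the integral of profit(x) * 1.0
ep, _ = quad(lambda x: profit(x) * 1.0, 11.5, 12.5, points=[11.7, 12.2])
print(ep)                     # 0.151, about 15 cents per wire cut
```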


THE NORMAL DISTRIBUTION

 Mathematically, the normal curve is defined
by the equation

$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma_x}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma_x}\right)^2}$

 where µ (the mean) and σx (the standard
deviation) are the two parameters of the
distribution, and f(x) is referred to as the
probability density function (p.d.f.) of x.
THE NORMAL DISTRIBUTION

 The distribution, called the normal distribution,
does an excellent job of approximating the
relative frequencies of many natural and man-
made phenomena, e.g., dimensions of
machined parts, strength of steel samples, etc.
 It is a bell-shaped curve, symmetric about the
mean, with frequencies that fall off quite rapidly beyond a
distance of about one standard deviation from
the mean.

THE NORMAL DISTRIBUTION

 The normal curve has the appearance shown in Figure 1.

[Fig. 1 Normal Curve]



THE NORMAL DISTRIBUTION

 Strictly speaking, the curve stretches from −∞ to ∞.
However, much of its density is distributed over a
relatively narrow range. In fact,
 68.26% of the observations fall between µ−σx and µ+σx
 95.44% of the observations fall between µ−2σx and
µ+2σx
 99.73% of the observations fall between µ−3σx and
µ+3σx.
 The function f(x) has been conveniently scaled so
that the total area under the curve over the full
range of x (−∞ to +∞) equals 1.0.

Example

The heat shield plates for the space shuttle must have a
closely controlled thickness in order to withstand the
rigors of heat during re-entry. After testing 400 of them,
an engineer found the thickness to be normally
distributed with a mean of 20 mm. It was also found
that 382 plates were within 20 ± 2.50 mm. If the
defective plates deviate more than 1.80 mm from the
mean, find the number of plates to be rejected during
the testing.



Solution
 Let X be a random variable that represents the
thickness of the heat shield plates.
 It is given that X follows a normal distribution with
mean µ and variance σ². It is also given that µ = 20
mm and 382/400 = 95.5% of the plates fall between
20 ± 2.50 mm.
 From the normal distribution, 95.44% of the area
corresponds to µ ± 2σ. Hence,
2σ = 2.50, or σ = 1.25.

Since the plates are rejected if the thickness is
greater than 21.8 mm or less than 18.2 mm, the
standard unit equivalents of X are:

Z1 = (x1 − µ)/σ = (18.2 − 20)/1.25 = −1.44
Z2 = (x2 − µ)/σ = (21.8 − 20)/1.25 = 1.44

Probability of rejection
= P(Z < −1.44) + P(Z > 1.44)
= 0.0749 + (1 − 0.9251) = 0.1498

Fraction of heat shields rejected = 14.98%, i.e., about 60 of the 400 plates.
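A sketch of the same calculation with scipy's normal cdf (scipy assumed available):

```python
from scipy.stats import norm

mu, sigma = 20.0, 1.25
z1 = (18.2 - mu) / sigma                  # -1.44
z2 = (21.8 - mu) / sigma                  # +1.44

p_reject = norm.cdf(z1) + (1 - norm.cdf(z2))
print(p_reject)                           # about 0.1498
print(round(400 * p_reject))              # about 60 of the 400 plates
```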



FREQUENCY DISTRIBUTION OF THE
SAMPLE MEAN

 We have seen that the sample mean provides
us with an estimate of the population mean µ.
 Furthermore, since the sample is obtained by
considering only a small subset of the total
population, it is an uncertain estimate of µ.
 That is, if we sample a population or process
several times, each time calculating the mean, we
would find the values to vary, simply due to
sampling variation.

FREQUENCY DISTRIBUTION OF THE
SAMPLE MEAN

 The amount of this sampling variation is a
reflection of just how good the sample mean is as
an estimate of µ, i.e., how close we can expect it
to be to µ.
 To answer this question we need to
understand how the sample means behave
with respect to their own mean, standard
deviation and, more importantly, their
frequency distribution.



DISTRIBUTION OF SAMPLE MEANS

1. The mean of the distribution of sample averages is the
mean of the population, µ; for k sample means,

$\bar{\bar{X}} = \frac{\bar{X}_1 + \bar{X}_2 + \cdots + \bar{X}_k}{k}$

2. The variance of a sample mean $\bar{X}$, given a
random sample of size n, is

$V(\bar{X}) = \sigma_{\bar{x}}^2 = \frac{\sigma_x^2}{n}$

DISTRIBUTION OF SAMPLE MEANS

3. Since the distribution of the sample means,
with mean µ and variance σx²/n, follows a
normal distribution, the relationship
between the distribution of sample means and
the z distribution is given by:

$z = \frac{\bar{X} - \mu}{\sigma_x / \sqrt{n}}$
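A small simulation sketch (the parameter values below are chosen arbitrarily) illustrates the two results above: the sample means center on µ, and their standard deviation is close to σx/√n:

```python
import math
import random

random.seed(1)
mu, sigma, n = 20.0, 1.25, 25            # arbitrary illustrative values

# Draw 10,000 samples of size n and record each sample mean.
means = [sum(random.gauss(mu, sigma) for _ in range(n)) / n
         for _ in range(10_000)]

m = sum(means) / len(means)
sd = math.sqrt(sum((v - m) ** 2 for v in means) / (len(means) - 1))

print(m)                                 # close to mu = 20
print(sd, sigma / math.sqrt(n))          # both close to 0.25
```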
DISTRIBUTION OF SAMPLE MEANS

 When the population standard deviation σx is
unknown and the sample size is small, we
must rely on the sample standard deviation, s.
 The quantity

$\frac{\bar{X} - \mu}{s / \sqrt{n}}$

also follows a bell-shaped curve, but one more
spread out than the standard normal distribution.


DISTRIBUTION OF SAMPLE MEANS

 This distribution is called a t-distribution with v = n − 1
degrees of freedom, as shown in Figure 2.

[Figure 2. t and z distributions]



DISTRIBUTION OF SAMPLE MEANS

 The increased spread reflects the added
uncertainty due to the fact that σx is unknown
and must be estimated by s, which itself is
subject to sampling error.


Hypothesis Testing of a Population Mean

 Often we are called on to make decisions or draw
conclusions about a new design or an improvement in
the performance of a given process based on sampled
data.
 In making a decision, we typically form a hypothesis
concerning what we believe is true and then collect
data to prove or disprove the hypothesis.



Hypothesis Testing of a Population Mean

 In statistical hypothesis testing, we generally formulate two
hypotheses. The null hypothesis, denoted by H0, will be rejected or
nullified if the sample data do not support it. Any hypothesis that is
different from H0 is called an alternative hypothesis, denoted by H1.
Whenever H0 is rejected, H1 is considered accepted.
 There are three ways to set up the alternative hypothesis:

Method 1: H0: µx = µ0,  H1: µ > µ0
Method 2: H0: µx = µ0,  H1: µ < µ0
Method 3: H0: µx = µ0,  H1: µ ≠ µ0


Stepwise Approach to Hypothesis Testing

A statistical hypothesis test consists of the following six steps:

1. State the null and alternative hypotheses. Define the test statistic
used to analyze the situation.
2. Determine the significance level, α, at which the test will be made.
3. Collect the data and calculate the test statistic result.
4. Define the reference distribution for the test statistic.
5. Compare the test statistic to its reference distribution under H0.
Carry out the necessary analysis of the data.
6. Assess the risk.



Example

To test the newspaper claim that the mean wage rate


of local foundry workers is $16 an hour, 25 foundry
workers were randomly surveyed. It was found that
the average wage rate for the sample of workers was
$14.50. Historical data suggest that the wage rates
follow the normal distribution and the standard
deviation of wage rates is $3. Can the Union claim
that the average wage is not $16 an hour? Assume α
=0.05.


Solution

Step 1: H0: µ = 16, H1: µ ≠ 16. The sample statistic is $\bar{X}$.
Step 2: Given that α = 0.05.
Step 3: $\bar{X} = 14.50$.
Step 4: The standard deviation of the normally distributed $\bar{X}$ is given by

$\sigma_{\bar{x}} = \frac{\sigma_x}{\sqrt{n}} = \frac{3}{\sqrt{25}} = 0.6$



Solution

 Step 5: The critical values for z which place α/2
(0.05/2 = 0.025) in each tail of the standard normal
distribution (we are concerned with both tails, since
this is a two-tailed test) are
z0.025 = −1.96 and z0.975 = 1.96.

The calculated z value corresponding to the observed sample
mean of 14.5 is then given by:

$z = \frac{\bar{X} - \mu}{\sigma_{\bar{x}}} = \frac{14.50 - 16}{0.60} = -2.50$

 Step 6: Since −2.50 falls below −1.96, H0 is rejected at the
5% level; the data support the Union's claim that the average
wage is not $16 an hour.
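The whole test fits in a few lines; a sketch assuming scipy is available:

```python
from math import sqrt
from scipy.stats import norm

x_bar, mu0, sigma, n, alpha = 14.50, 16.0, 3.0, 25, 0.05

z = (x_bar - mu0) / (sigma / sqrt(n))    # -2.50
z_crit = norm.ppf(1 - alpha / 2)         # 1.96

print(z, z_crit)
print(abs(z) > z_crit)                   # True: reject H0 at the 5% level
```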


INTRODUCTION TO THE CONFIDENCE
INTERVAL CONCEPT

 Given a random sample of n observations from
some process of interest and an estimate of
the process mean, it is of interest to make
some statement about the “goodness” of that
sample mean as an estimate of µ, i.e., the
degree of belief or confidence that can be
placed on it.
 One way of approaching this problem is
through the concept of the confidence
interval.
INTRODUCTION TO THE CONFIDENCE
INTERVAL CONCEPT

 Recall that for random samples of size n drawn
from a population, we expect 95% of all
sample means to lie within 1.96 standard
deviations of the distribution of the sample
mean, i.e., the interval

$\mu \pm 1.96\,\sigma_x/\sqrt{n}$

embraces 95% of all the sample means.


INTRODUCTION TO THE CONFIDENCE
INTERVAL CONCEPT

 In other words, $\bar{X} \pm 1.96\,\sigma_x/\sqrt{n}$ is called a 95%
confidence interval for the true mean µ.
Graphically, we can show it as in Fig. 2 and Fig. 3.
 A more general statement is that $\bar{X} \pm z_{1-\alpha/2}\,\sigma_x/\sqrt{n}$
is a 100(1−α)% confidence interval for µ.



INTRODUCTION TO THE CONFIDENCE
INTERVAL CONCEPT

[Fig. 2 Graphical Representation of the 95% Confidence Interval, with tails of area 0.025 at µ ± 1.96 σx/√n]


INTRODUCTION TO THE CONFIDENCE
INTERVAL CONCEPT

[Fig. 3 Graphical Representation of the 95% Confidence Interval]



INTRODUCTION TO THE CONFIDENCE
INTERVAL CONCEPT

When the sample size is small and σx is
unknown, the confidence interval is given by

$\bar{X} \pm t_{v,\,1-\alpha/2}\,\frac{s}{\sqrt{n}}$

where v is the number of degrees of freedom and equals n − 1.


Example

Given that 9 bearings made by a certain
process have an average diameter of 0.305
cm and a sample standard deviation of
0.003 cm, construct a 99% confidence
interval for the true mean diameter of
bearings made by the process. What is the
width of the confidence interval?



Solution

 Given that the sample size n is small, the sample statistic $\bar{X}$ follows a t-distribution.
 Hence a 99% confidence interval for the true mean diameter is given by

$\bar{X} - t_{\nu}\left(\frac{s}{\sqrt{n}}\right) \le \mu \le \bar{X} + t_{\nu}\left(\frac{s}{\sqrt{n}}\right)$

$0.305 - 3.355\,\frac{0.003}{\sqrt{9}} \le \mu \le 0.305 + 3.355\,\frac{0.003}{\sqrt{9}}$

$0.30165 \le \mu \le 0.30835$

 The width of the C.I. is $2\,t_{\nu}\,\frac{s}{\sqrt{n}} = 2(0.00335) = 0.0067$.
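A sketch of the same interval with scipy's t quantile (scipy assumed available):

```python
from math import sqrt
from scipy.stats import t

n, x_bar, s, alpha = 9, 0.305, 0.003, 0.01

t_crit = t.ppf(1 - alpha / 2, df=n - 1)        # t(8) upper 0.5% point = 3.355
half = t_crit * s / sqrt(n)                    # about 0.00335

print(x_bar - half, x_bar + half)              # about (0.30165, 0.30835)
print(2 * half)                                # width, about 0.0067
```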
Session 3

Comparative Experiments

Prof. Shiv G. Kapoor
Department of Mechanical Science and Engineering
University of Illinois at Urbana-Champaign

© 2006 Dr. Shiv G. Kapoor. All Rights Reserved.

COMPARISON OF TWO
TECHNIQUES/DESIGNS/PROCESSES

 Two Different Designs of Wood Cutting
Saws: An Example
 Owing to environmental awareness and the
enactment of the Occupational Safety and Health
Act (OSHA), industrial noise is considered a
significant factor in the design of machines and
cutting tools.
 Wudkut Co. manufactures woodcutting
tools. One of their products, the woodcutting
circular saw, has traditionally been a high-noise-
producing tool.
COMPARISON OF TWO
TECHNIQUES/DESIGNS/PROCESSES

 It is like a circular plate with cutting blades


equally spaced around the periphery.
 The Director of R & D has known for some
time that if the saw blades are perturbed in
some fashion, the noise produced by the saw
is considerably reduced.
 In order to confirm these results he developed
an optimal ‘perturbation’ design.


COMPARISON OF TWO
TECHNIQUES/DESIGNS/PROCESSES

Objective:
To compare the noise levels produced by two,
otherwise identical, circular saws, one
designed with equal blade spacing (the
‘conventional’ design) and another with uneven
spacing (the ‘modified’ design).

COMPARISON OF TWO
TECHNIQUES/DESIGNS/PROCESSES

 It was decided to manufacture eight identical
circular saws for each of the two designs and
send a pair to each of eight different customers
for testing and reporting.
 They are to record the noise levels in
decibels (db).
 Furthermore, the decision as to which of the
two circular saws should be run first is to be
made by each customer by flipping a coin
(randomization).


COMPARISON OF TWO
TECHNIQUES/DESIGNS/PROCESSES

 The results reported by the customers are
given in Table 1.
 The numbers in parentheses (1 or 2)
indicate the testing sequence.
 Two customers reported that their machines
broke down and hence they could not test the
conventionally designed saws.
 So, in all, there are 14 different results, as
reported in Table 1.

COMPARISON OF TWO
TECHNIQUES/DESIGNS/PROCESSES

Table 1 Noise Levels From the Two Saws

Customer   Noise from Conventionally   Noise from Perturbed
Number     Designed Saw, Nc (db)       Blade Saw, Nm (db)
   1               85 (2)                      86 (1)
   2               77 (2)                      74 (1)
   3               78 (1)                      76 (2)
   4               86 (2)                      87 (1)
   5               80 (1)                      77 (2)
   6               95 (2)                      88 (1)
   7                                           83
   8                                           78
Total             501                         649

$\bar{N}_c = \frac{501}{6} = 83.50$,  $\bar{N}_m = \frac{649}{8} = 81.13$

COMPARISON OF TWO
TECHNIQUES/DESIGNS/PROCESSES

 Strictly speaking, this is a comparison of the two
means µc and µm.
 But since µc, µm, and σ² are not available, we
have to base the comparison on the sample
statistics $\bar{N}_c$, $\bar{N}_m$, $S_c^2$, and $S_m^2$.
 $\bar{N}_c$ and $\bar{N}_m$ are the averages; $S_c^2$ and $S_m^2$ are
the sample variances.



COMPARISON OF TWO
TECHNIQUES/DESIGNS/PROCESSES

 In other words, we are faced with the task of
making statistical decisions about the
difference of the two means (µc − µm) on the
basis of the sample statistics.
 We will use the confidence interval approach
for the comparison of two means.


COMPARISON OF TWO
TECHNIQUES/DESIGNS/PROCESSES

An Independent t-test analysis

 With the 14 test data, a common practice would
be to calculate the averages $\bar{N}_c$ and $\bar{N}_m$ and
compare these two averages.
 However, in view of the variation in the test
results, the two means, µc and µm, cannot
justifiably be compared merely through the
difference $(\bar{N}_c - \bar{N}_m)$.



COMPARISON OF TWO
TECHNIQUES/DESIGNS/PROCESSES

An Independent t-test analysis

 Hence, for this purpose, we introduce the
independent t-test as a tool for the
comparison of the two designs.

 Recall that the general expression of a “t” is

$t = \frac{\text{statistic} - \text{mean of statistic}}{\sqrt{\text{Var(statistic)}}}$   (1)

COMPARISON OF TWO
TECHNIQUES/DESIGNS/PROCESSES

An Independent t-test analysis

 In this problem, the statistic of interest is the
observed difference of the two averages $\bar{N}_c$ and $\bar{N}_m$,
and the mean of this statistic is the difference
between the two means, (µc − µm).
 Statistic = $(\bar{N}_c - \bar{N}_m)$ = 83.50 − 81.13 = 2.37,
 Mean of statistic = µc − µm (which is unknown), and
the estimated variance of the statistic is
$V(\bar{N}_c - \bar{N}_m)$.
COMPARISON OF TWO
TECHNIQUES/DESIGNS/PROCESSES

 Thus equation (1) can be rewritten as

$t = \frac{(\bar{N}_c - \bar{N}_m) - (\mu_c - \mu_m)}{\sqrt{V(\bar{N}_c - \bar{N}_m)}}$   (2)

 The immediate information required for
equation (2) is the estimated variance
$V(\bar{N}_c - \bar{N}_m)$.


COMPARISON OF TWO
TECHNIQUES/DESIGNS/PROCESSES

Calculation of $V(\bar{N}_c - \bar{N}_m)$
 It can be shown theoretically that the variance
of $(\bar{N}_c - \bar{N}_m)$ is

$V(\bar{N}_c - \bar{N}_m) = V(\bar{N}_c) + V(\bar{N}_m) - 2\,Cov(\bar{N}_c, \bar{N}_m)$   (3)

where $Cov(\bar{N}_c, \bar{N}_m)$ is the covariance of $\bar{N}_c$ and $\bar{N}_m$.



COMPARISON OF TWO
TECHNIQUES/DESIGNS/PROCESSES

Calculation of $V(\bar{N}_c - \bar{N}_m)$

 Assuming that the observations Nc do not
affect the observations Nm, or vice versa – that
is, Nc and Nm are independent – equation (3)
becomes

$V(\bar{N}_c - \bar{N}_m) = V(\bar{N}_c) + V(\bar{N}_m)$   (4)


COMPARISON OF TWO
TECHNIQUES/DESIGNS/PROCESSES

Calculation of $V(\bar{N}_c - \bar{N}_m)$
 Estimating the variances $V(\bar{N}_c)$ and $V(\bar{N}_m)$ by
the sample variances $S_c^2$ and $S_m^2$, we have

$V(\bar{N}_c - \bar{N}_m) = V(\bar{N}_c) + V(\bar{N}_m) = \frac{S_c^2}{n_c} + \frac{S_m^2}{n_m}$   (5)

 Assumption: The two data sets come from
the same process (two normal distributions
with equal variances, i.e., σc² = σm² = σ²).



COMPARISON OF TWO
TECHNIQUES/DESIGNS/PROCESSES

Calculation of $V(\bar{N}_c - \bar{N}_m)$
 The common (pooled) variance is defined as:

$S_p^2 = \frac{(n_c - 1)S_c^2 + (n_m - 1)S_m^2}{(n_c - 1) + (n_m - 1)}$


COMPARISON OF TWO
TECHNIQUES/DESIGNS/PROCESSES

Example of Calculation of $V(\bar{N}_c - \bar{N}_m)$

 Referring to Table 1, the sample variances $S_c^2$ and $S_m^2$ are:

$S_c^2 = \frac{\sum_{i=1}^{6}(N_{ci} - \bar{N}_c)^2}{n_c - 1} = \frac{1}{6-1}(225.5) = 45.10$

$S_m^2 = \frac{\sum_{i=1}^{8}(N_{mi} - \bar{N}_m)^2}{n_m - 1} = \frac{1}{8-1}(212.88) = 30.41$
COMPARISON OF TWO
TECHNIQUES/DESIGNS/PROCESSES

Example of Calculation of $V(\bar{N}_c - \bar{N}_m)$

 Therefore,

$S_p^2 = \frac{(6-1)(45.1) + (8-1)(30.41)}{(6-1) + (8-1)} = \frac{225.50 + 212.88}{5 + 7} = 36.53$


COMPARISON OF TWO
TECHNIQUES/DESIGNS/PROCESSES

Example of Calculation of $V(\bar{N}_c - \bar{N}_m)$

 Substituting the pooled sample variance $S_p^2$ for
$S_c^2$ and $S_m^2$ in equation (5), we obtain:

$V(\bar{N}_c - \bar{N}_m) = \left(\frac{1}{n_c} + \frac{1}{n_m}\right) S_p^2 = \left(\frac{1}{6} + \frac{1}{8}\right)(36.53) = 10.65$

$\sqrt{V(\bar{N}_c - \bar{N}_m)} = 3.26$



COMPARISON OF TWO
TECHNIQUES/DESIGNS/PROCESSES

Calculation and Interpretation of the 95%
Confidence Interval
 After the estimated variance of $(\bar{N}_c - \bar{N}_m)$ has been
obtained, the 95% confidence interval for (µc − µm)
can be readily calculated as

$(\bar{N}_c - \bar{N}_m) \pm t\,\sqrt{V(\bar{N}_c - \bar{N}_m)}$   (6)


COMPARISON OF TWO
TECHNIQUES/DESIGNS/PROCESSES

Calculation and Interpretation of the 95%
Confidence Interval
 A total of 14 tests, hence 14 degrees of freedom, is
involved.
 So far we have used up 2 degrees of freedom in
calculating the averages $\bar{N}_c$ and $\bar{N}_m$ as estimates of
µc and µm.
 Therefore, 12 degrees of freedom are left.



COMPARISON OF TWO
TECHNIQUES/DESIGNS/PROCESSES

Calculation and Interpretation of the 95%
Confidence Interval
 The value of t associated with 12 degrees of
freedom is t12, 0.025 = 2.179.
 Substituting the statistic $(\bar{N}_c - \bar{N}_m)$ = 2.37, $\sqrt{V(\bar{N}_c - \bar{N}_m)}$ =
3.26, and t12, 0.025 = 2.179 into equation (6), the 95%
confidence interval for (µc − µm) can be obtained. It
is 2.37 ± (2.179)(3.26) = [−4.74, 9.48]. A sketch of this
computation follows.
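A sketch reproducing the pooled-variance interval from the summary statistics reported above (scipy assumed available):

```python
from math import sqrt
from scipy.stats import t

n_c, n_m = 6, 8
mean_c, mean_m = 83.50, 81.13
s2_c, s2_m = 45.10, 30.41

sp2 = ((n_c - 1) * s2_c + (n_m - 1) * s2_m) / (n_c + n_m - 2)  # 36.53
v = (1 / n_c + 1 / n_m) * sp2                                  # about 10.65

t_crit = t.ppf(0.975, df=n_c + n_m - 2)                        # 2.179
d = mean_c - mean_m                                            # 2.37
half = t_crit * sqrt(v)

print(d - half, d + half)        # about (-4.74, 9.48)
```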


COMPARISON OF TWO
TECHNIQUES/DESIGNS/PROCESSES

Calculation and Interpretation of the 95%
Confidence Interval
 This 95% confidence interval extends all the way
from −4.74 to 9.48, implying that the true difference
between the two designs may be as much as 9.48
db in favor of the perturbation design or 4.74 db in
favor of the conventional design.
 One might well conjecture that there is really no
difference between the performances of these two
techniques.
COMPARISON OF TWO
TECHNIQUES/DESIGNS/PROCESSES

Calculation and Interpretation of the 95%


Confidence Interval
 Putting this conjecture in the form of a question,
we might ask, “Are these data consistent with
the proposition that the true difference
between the two designs is zero?”
 One can get a good understanding of what he is
entitled to believe on the basis of the data by
seeing where this proposed zero value falls in the
95% confidence interval.

COMPARISON OF TWO
TECHNIQUES/DESIGNS/PROCESSES

Calculation and Interpretation of the 95%


Confidence Interval
 It is seen that the value zero falls well within the
confidence interval.
 Hence, this value seems to be consistent with the
data.
 That is, if someone claimed that there was no
difference between the two designs, he would not
be contradicted by the data.



COMPARISON OF TWO
TECHNIQUES/DESIGNS/PROCESSES

Calculation and Interpretation of the 95%
Confidence Interval
 However, if someone claimed that the perturbed
design produced noise 20 db less than that
produced by the conventional design, we could
say that his claim does not appear to be
supported by the data.
 This is because the value of 20 lies far outside the
confidence interval.


COMPARISON OF TWO
TECHNIQUES/DESIGNS/PROCESSES

Calculation and Interpretation of the 95%


Confidence Interval
 The confidence interval contains, roughly speaking,
all the plausible values of the true difference
between the two designs.
 The result of the independent t-test indicates that
there is no strong evidence either way in favor of
the two designs.

Test for equal variances

 The pooled analysis above assumed equal variances; the
following F-test checks that assumption.
 For independent random samples of size n1
and n2 from the two populations, the F-value
for testing σ1² = σ2² is the ratio

$F = S_1^2 / S_2^2$

where $S_1^2$ and $S_2^2$ are the variances computed from
the two samples. If the two populations are approximately
normally distributed and the null hypothesis is true, the ratio F
is a value of the F-distribution with ν1 = n1 − 1 and ν2 = n2 − 1
degrees of freedom.
 For the two-sided test, the critical region is
$F < F_{1-\alpha/2}(\nu_1, \nu_2)$ or $F > F_{\alpha/2}(\nu_1, \nu_2)$.


Test for equal variances

For our problem:

 H0: $\sigma_1^2 = \sigma_2^2$
 H1: $\sigma_1^2 \ne \sigma_2^2$
 F = 45.10/30.41 = 1.48
 α = 0.10
 Critical region: From the F-distribution table,
$F_{0.05}(5,7) = 3.97$ and $F_{0.95}(5,7) = 1/F_{0.05}(7,5) = 1/4.88 = 0.20$.
Since F is greater than 0.20 and less than 3.97, the null
hypothesis is not rejected; treating the two variances as equal
is therefore reasonable. A sketch follows.
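A sketch of the same two-sided F test (scipy assumed available):

```python
from scipy.stats import f

s2_c, s2_m = 45.10, 30.41
df_c, df_m = 6 - 1, 8 - 1

F = s2_c / s2_m                    # 1.48
lo = f.ppf(0.05, df_c, df_m)       # lower 5% point, about 0.20
hi = f.ppf(0.95, df_c, df_m)       # upper 5% point, about 3.97

print(lo < F < hi)                 # True: do not reject equal variances
```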
IE 400 Lecture 3 @2006 Dr. Shiv G. Kapoor All Rights Reserved

IE 400 Lecture 3 @2006 Dr. Shiv G. Kapoor All Rights Reserved


COMPARISON OF TWO
TECHNIQUES/DESIGNS/PROCESSES

Calculation of V(N c − N m )
 The common (Pooled) variance is defined as:

( n − 1 )S c + ( n m − 1 )S m
2 2
= c
2
Sp
( nc − 1 ) + ( nm − 1 )

IE 400 Lecture 3 @2006 Dr. Shiv G. Kapoor All Rights Reserved

COMPARISON OF TWO
TECHNIQUES/DESIGNS/PROCESSES

Example of Calculation of V(N c − N m )


 Therefore,

( 6 − 1)( 45 .1) + (8 − 1)( 30 .41)


Sp =
2

( 6 − 1) + (8 − 1)
225 .50 + 212 .88
= = 36 .53
5+7

IE 400 Lecture 3 @2006 Dr. Shiv G. Kapoor All Rights Reserved


COMPARISON OF TWO
TECHNIQUES/DESIGNS/PROCESSES

Example of Calculation of V(N c − N m )

 Substituting the pooled sample variance Sp2 for


Sc2 and Sm2 in equation (5), we obtain:
1 1
V ( Nc − Nm ) = (
2
+ ) Sp
nc n m
1 1
= ( + ) (36.53) = 10.96
6 8
V ( Nc − Nm ) = 3.31

IE 400 Lecture 3 @2006 Dr. Shiv G. Kapoor All Rights Reserved

COMPARISON OF TWO
TECHNIQUES/DESIGNS/PROCESSES

Calculation and Interpretation of the 95%


Confidence Interval

 After the estimated variance (Nc − Nm) has been


obtained, the 95% confidence interval for (µc - µm)
can be readily calculated as

(Nc −Nm)± t V(Nc −Nm) (6)

IE 400 Lecture 3 @2006 Dr. Shiv G. Kapoor All Rights Reserved


COMPARISON OF TWO
TECHNIQUES/DESIGNS/PROCESSES

Calculation and Interpretation of the 95%


Confidence Interval
 A total of 14 tests, hence 14 degrees of freedom, is
involved.
 So far we have used up 2 degrees of freedom in
calculating the averages N c and N m as estimates of
µc and µm.
 Therefore, 12 degrees of freedom are left.

IE 400 Lecture 3 @2006 Dr. Shiv G. Kapoor All Rights Reserved

COMPARISON OF TWO
TECHNIQUES/DESIGNS/PROCESSES

Calculation and Interpretation of the 95%


Confidence Interval
 The value of t associated with 12 degrees of freedom is t12,0.025 = 2.179.
 Substituting the statistics (N̄c − N̄m) = 2.37, √V(N̄c − N̄m) = 3.26, and t12,0.025 = 2.179 into equation (6), the 95% confidence interval for (µc − µm) can be obtained. It is [2.37 ± (2.179)(3.26)] = [−4.74, 9.48].

IE 400 Lecture 3 @2006 Dr. Shiv G. Kapoor All Rights Reserved
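
 The entire pooled-variance calculation above can be verified with a short sketch (Python with SciPy assumed; the statistics nc = 6, nm = 8, Sc² = 45.10, Sm² = 30.41, and N̄c − N̄m = 2.37 are taken from the text):

from math import sqrt
from scipy.stats import t  # Student's t distribution (assumes SciPy)

nc, nm = 6, 8
s2c, s2m = 45.10, 30.41
diff = 2.37                                          # Nbar_c - Nbar_m
sp2 = ((nc - 1)*s2c + (nm - 1)*s2m) / (nc + nm - 2)  # pooled variance, 36.53
se = sqrt((1/nc + 1/nm) * sp2)                       # about 3.26
tval = t.ppf(0.975, nc + nm - 2)                     # t_{12,0.025} = 2.179
print(diff - tval*se, diff + tval*se)                # roughly [-4.74, 9.48]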



COMPARISON OF TWO
TECHNIQUES/DESIGNS/PROCESSES

Calculation and Interpretation of the 95%


Confidence Interval
 This 95% confidence interval extends all the way from −4.74 to 9.48, implying that the true difference between the two designs may be as much as 9.48 db in favor of the perturbation design or 4.74 db in favor of the conventional design.
 One might well conjecture that there is really no
difference between the performances of these two
techniques.
IE 400 Lecture 3 @2006 Dr. Shiv G. Kapoor All Rights Reserved
COMPARISON OF TWO
TECHNIQUES/DESIGNS/PROCESSES

Calculation and Interpretation of the 95%


Confidence Interval
 Putting this conjecture in the form of a question,
we might ask, “Are these data consistent with
the proposition that the true difference
between the two designs is zero?”
 One can get a good understanding of what he is
entitled to believe on the basis of the data by
seeing where this proposed zero value falls in the
95% confidence interval.
IE 400 Lecture 3 @2006 Dr. Shiv G. Kapoor All Rights Reserved

COMPARISON OF TWO
TECHNIQUES/DESIGNS/PROCESSES

Calculation and Interpretation of the 95%


Confidence Interval
 It is seen that the value zero falls well within the
confidence interval.
 Hence, this value seems to be consistent with the
data.
 That is, if someone claimed that there was no
difference between the two designs, he would not
be contradicted by the data.

IE 400 Lecture 3 @2006 Dr. Shiv G. Kapoor All Rights Reserved


COMPARISON OF TWO
TECHNIQUES/DESIGNS/PROCESSES

Calculation and Interpretation of the 95%


Confidence Interval
 However, if someone claimed that the perturbed design produced noise 20 db less than what is produced by the conventional design, we could say that his claim does not appear to be supported by the data.
 This is because the value of 20 lies far outside the
confidence interval.

IE 400 Lecture 3 @2006 Dr. Shiv G. Kapoor All Rights Reserved

COMPARISON OF TWO
TECHNIQUES/DESIGNS/PROCESSES

Calculation and Interpretation of the 95%


Confidence Interval
 The confidence interval contains, roughly speaking,
all the plausible values of the true difference
between the two designs.
 The result of the independent t-test indicates that there is no strong evidence in favor of either design.

IE 400 Lecture 3 @2006 Dr. Shiv G. Kapoor All Rights Reserved


COMPARISON OF TWO
TECHNIQUES/DESIGNS/PROCESSES

Calculation and Interpretation of the 95%


Confidence Interval
 The Director of R&D, however, intuitively felt that the evidence in favor of the perturbation design was stronger than that implied by the 95% confidence interval.
 One possible source of difficulty, he pointed out,
was that the quality of wood and the types of
machine tools varied considerably from customer
to customer, and this variation was not taken into
account in the above analysis.
IE 400 Lecture 3 @2006 Dr. Shiv G. Kapoor All Rights Reserved

COMPARISON OF TWO
TECHNIQUES/DESIGNS/PROCESSES

Calculation and Interpretation of the 95%


Confidence Interval
 In addition, he also pointed out that the data of
perturbation design vary from a value of 74 db to 88
db, and the data of the conventional design vary
from a value of 77 db to 95 db (refer to Table 1).
 He suspected (correctly!) that the variation from
customer to customer was causing the sample
variances Sc2 and Sm2 to be inflated, thus resulting
in an inflated pooled sample variance Sp2.
IE 400 Lecture 3 @2006 Dr. Shiv G. Kapoor All Rights Reserved
COMPARISON OF TWO
TECHNIQUES/DESIGNS/PROCESSES

Calculation and Interpretation of the 95%


Confidence Interval
 From these observations, he conjectured that Sp² was not a good estimate of the true variance σ² because Sp² was estimating, in addition, the variance from customer to customer.

IE 400 Lecture 3 @2006 Dr. Shiv G. Kapoor All Rights Reserved

COMPARISON OF TWO
TECHNIQUES/DESIGNS/PROCESSES

 What is the Solution?


 A paired t-test analysis

Perform a paired comparison of the two


designs for every customer.
The difference “blocks” out the customer-to-
customer effect on variability.

IE 400 Lecture 3 @2006 Dr. Shiv G. Kapoor All Rights Reserved


COMPARISON OF TWO
TECHNIQUES/DESIGNS/PROCESSES

Paired Design of Comparative Test Analysis


 It was decided to run 20 additional tests.
 Twenty circular saws, one of each of the two designs for every customer, were sent to ten different customers for testing.

IE 400 Lecture 3 @2006 Dr. Shiv G. Kapoor All Rights Reserved

COMPARISON OF TWO
TECHNIQUES/DESIGNS/PROCESSES

Paired Design of Comparative Test Analysis


 The customers were requested to test the two saws:
on the same machine,
under exactly similar conditions of type and size of wood, cutting parameters, etc.,
running the tests in a random order, and
recording the noise levels in decibels (db).
IE 400 Lecture 3 @2006 Dr. Shiv G. Kapoor All Rights Reserved
COMPARISON OF TWO
TECHNIQUES/DESIGNS/PROCESSES

TABLE 2 NOISE LEVEL FROM 20 ADDITIONAL TESTS

Customer   Noise from Conventionally     Noise from Perturbation    d = µc − µm
Number     Designed Saw, µc (db)         Designed Saw, µm (db)
1          88 (2)                        84 (1)                     4
2          80 (1)                        76 (2)                     4
3          93 (1)                        90 (2)                     3
4          81 (2)                        78 (1)                     3
5          78 (2)                        79 (1)                     -1
6          83 (1)                        83 (2)                     0
7          91 (2)                        86 (1)                     5
8          84 (1)                        79 (2)                     5
9          90 (2)                        85 (1)                     5
10         80 (2)                        77 (1)                     3

(The number in parentheses is the order in which the customer ran the two tests.)

IE 400 Lecture 3 @2006 Dr. Shiv G. Kapoor All Rights Reserved

COMPARISON OF TWO
TECHNIQUES/DESIGNS/PROCESSES

Paired Design of Comparative Test Analysis

 The average difference, d̄, is therefore

d̄ = Σ(i=1..10) di / 10 = 31/10 = 3.1

 The 95% confidence interval for the difference is given as

d̄ ± t √V(d̄)
IE 400 Lecture 3 @2006 Dr. Shiv G. Kapoor All Rights Reserved
COMPARISON OF TWO
TECHNIQUES/DESIGNS/PROCESSES

Paired Design of Comparative Test Analysis

 The 95% confidence interval therefore depends upon three statistics:
 d̄
 V(d̄)
 the t value.
IE 400 Lecture 3 @2006 Dr. Shiv G. Kapoor All Rights Reserved

COMPARISON OF TWO
TECHNIQUES/DESIGNS/PROCESSES

Paired Design of Comparative Test Analysis

 d̄:
Table 2 gives the ten differences d1, d2, ..., d10.
The average difference is d̄ = 3.10.

IE 400 Lecture 3 @2006 Dr. Shiv G. Kapoor All Rights Reserved


COMPARISON OF TWO
TECHNIQUES/DESIGNS/PROCESSES

Paired Design of Comparative Test Analysis


V(d̄):
In order to find the sample variance of d̄, we first have to find the sample variance of a single difference, V(d). The sample variance of d is

V(d) = Σ(i=1..10) (di − d̄)² / (10 − 1) = 38.90/9 = 4.32
IE 400 Lecture 3 @2006 Dr. Shiv G. Kapoor All Rights Reserved

COMPARISON OF TWO
TECHNIQUES/DESIGNS/PROCESSES

Paired Design of Comparative Test Analysis


 The sample variance of d̄, V(d̄), is given by

V(d̄) = V(d)/n = 4.32/10 = 0.432

IE 400 Lecture 3 @2006 Dr. Shiv G. Kapoor All Rights Reserved


COMPARISON OF TWO
TECHNIQUES/DESIGNS/PROCESSES

 t value:
One degree of freedom has already been used up in calculating the mean, so the value of t associated with nine degrees of freedom and corresponding to a 95% confidence level is t9,0.025 = 2.262.
 The 95% confidence interval is, from equation (6),

3.1 ± 2.262 √0.432 = [1.6, 4.6]

 Here the confidence interval does not include the point zero.
IE 400 Lecture 3 @2006 Dr. Shiv G. Kapoor All Rights Reserved
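
 The paired analysis is equally easy to check; a sketch (Python with SciPy assumed) using the ten differences of Table 2:

from math import sqrt
from scipy.stats import t  # assumes SciPy

d = [4, 4, 3, 3, -1, 0, 5, 5, 5, 3]             # d_i from Table 2 (db)
n = len(d)
dbar = sum(d) / n                                # 3.1
var_d = sum((x - dbar)**2 for x in d) / (n - 1)  # sample variance of one d
se = sqrt(var_d / n)                             # square root of V(dbar)
tval = t.ppf(0.975, n - 1)                       # t_{9,0.025} = 2.262
print(dbar - tval*se, dbar + tval*se)            # roughly [1.6, 4.6]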

COMPARISON OF TWO
TECHNIQUES/DESIGNS/PROCESSES

 This test indicates that the true difference between the performances of the two different saw designs is probably somewhere between 1.6 and 4.6 db.
 The new design might be as much as 4.6 db or as little as 1.6 db better than the conventional design.
 The director may now claim, with 95% confidence, that the perturbed (modified) design is in fact better than the conventional design in the sense that the saws made according to the modified design produce less noise.
IE 400 Lecture 3 @2006 Dr. Shiv G. Kapoor All Rights Reserved
Blocking

 One important lesson to learn from this example is


the importance of setting up experiments so that
extraneous variation can be eliminated from the
results. The extraneous variation in the example
was the customer to customer fluctuation in the
wood and the types of machine tools.
 By running tests in pairs, this customer to customer
variation was eliminated because direct
comparison between the perturbation and
conventional design could be made for the same
customer. This is called blocking.

IE 400 Lecture 3 @2006 Dr. Shiv G. Kapoor All Rights Reserved

Randomization in Paired and Independent


Comparisons

 We have seen how on the assumption that a sample


of “n” observations may be treated as if they had been
independently drawn from a Normal population with
constant mean and constant variance, simple
inferential procedures can be developed.
 Unfortunately, in the real world, these assumptions
are usually not exactly justifiable. In fact the
observations could be correlated in the sequence, for
example, and unexpected changes in both mean and
variance of their distribution might occur.
IE 400 Lecture 3 @2006 Dr. Shiv G. Kapoor All Rights Reserved
Randomization in Paired and Independent
Comparisons

 It was pointed out by Sir Ronald Fisher that it is possible to take a precaution in the actual carrying out of an experiment which guarantees the approximate validity of comparative procedures. This precaution is randomization.
 In the wood cutting saw example, recall that a coin
was flipped by each customer to decide the order of
the usage of the saws.
 This represents a process of randomization and was
done for one important reason - to alleviate the
unavoidable influences of both the known and
unknown variables on the experimental results.
IE 400 Lecture 3 @2006 Dr. Shiv G. Kapoor All Rights Reserved
Session 4
Factorial Designs
Prof. Shiv G. Kapoor
FACTORIAL DESIGNS (FD)

Professor Shiv G. Kapoor


Department of Mechanical Science and Engineering
University of Illinois at Urbana-Champaign

IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved

Factorial Designs

 When we wish to compare two different


techniques,
we may either use the independent t-test, or
we may use a paired t-test if a nuisance
variable is to be blocked away.

IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved


Factorial Designs

 Suppose now that we want to compare


several techniques, the independent t-tests
would no longer be adequate and a
k-variable analysis should be used.
 There are indeed many experimental
designs available which may be chosen to
suit particular experimental situations.

IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved

Factorial Designs

 But these designs, e.g., k-variable analysis,


involve certain assumptions and restrictions.

 We will now introduce a general and effective


class of experimental design called the
Factorial Design which includes k-variable
analysis, etc. as a special case.

IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved


Factorial Designs

 To develop a general factorial design, one would


select a fixed number of “levels” for each of a
number of variables (factors) and then design
tests with all possible combinations.
 For example, if there are
L1 levels for variable 1 , L2 levels for variable 2
……….., and Lk levels for variable k, then
L1 x L2 x L3 x ……x Lk is called a factorial design
for k variables.

IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved

Factorial Designs

 Factorial designs are preferred over other


experimental designs because :
 They require relatively few runs per variable (factor) even
though they do not explore the entire region of interest
 They deal easily with variable interactions
 They can indicate major trends and help determine a possible
direction for further experimentation
 They can be augmented to form composite designs
 They form the basis for fractional factorial designs
 The data produced by the designs can be easily analyzed and interpreted.

IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved


Factorial Designs

 Variables vs Levels
A design with 3 variables at 2 levels would require 8 tests for a 2 x 2 x 2 = 2³ factorial design.
A design with 3 variables at 3 levels would need 27 tests for a 3 x 3 x 3 = 3³ factorial design.
A design with 5 variables at 2 levels would need 32 tests for a 2⁵ factorial design.

IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved

AN EXPERIMENTAL STUDY USING A 23


FACTORIAL DESIGN

Background of the Study--Welding of High


Strength Steel Bars
 High carbon steel, because of its high
strength and low cost, has been known to
have a potential for a "good market".
 However, because of its high carbon content,
it is not easy to weld.

IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved


AN EXPERIMENTAL STUDY USING A 23
FACTORIAL DESIGN

Background of the Study--Welding of High


Strength Steel Bars
 According to the code of the American Welding
Society (AWS), additional steps of pre-heating
and post-heating are required in order to have
good quality welds and high strength steel.
 A user of this steel was interested to study
whether or not these additional steps of
pre-heating and post-heating were really
needed.
IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved

AN EXPERIMENTAL STUDY USING A 23


FACTORIAL DESIGN

Background of the Study--Welding of High


Strength Steel Bars
 After a preliminary investigation by manual arc
welding tests, it appeared that there were three
variables significantly affecting the ultimate
tensile stress of a weld.
 These three variables were:
 ambient temperature,
 wind velocity
 bar size.
IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved
AN EXPERIMENTAL STUDY USING A 23
FACTORIAL DESIGN

Background of the Study--Welding of High


Strength Steel Bars
 The evidence, however, was not decisive,
and further experiments were therefore
planned.

How many tests should be conducted?

IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved

AN EXPERIMENTAL STUDY USING A 23


FACTORIAL DESIGN

 Based upon available funds and time limit, it


was decided that 244 tests be run.
 In the meantime, an engineering statistician, who was called upon for consultation, suggested that 16 tests be run according to his specified sets of test conditions.
 These 16 tests formulated a specific
statistical experimental design which we shall
further discuss.
IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved
AN EXPERIMENTAL STUDY USING A 23
FACTORIAL DESIGN

Experimental Design - 23 Factorial Design


 The statistical experimental design that was
formulated by the engineering statistician for
this study was a two-level three-variable
factorial design, simply designated as a 23
factorial design.
 The three selected variables were:
 ambient temperature, denoted by T
 wind velocity, denoted by V,
 bar size, denoted by B.
IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved

AN EXPERIMENTAL STUDY USING A 23


FACTORIAL DESIGN

Experimental Design - 23 Factorial Design


 Two levels were chosen for each variable
based upon desired field conditions to be
simulated.
 One level is called the high level and the
other the low level.
 The high and low levels of the three
variables are listed in Table 1.

IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved


AN EXPERIMENTAL STUDY USING A 23
FACTORIAL DESIGN

Table 1: The High and Low levels of the 23 Factorial Design

Variable Unit Low High


Ambient Temperature (T) °F 0 70
Wind Velocity (V) mph 0 20
Bar Size (B) 1/8 inch 4 11

IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved

AN EXPERIMENTAL STUDY USING A 23


FACTORIAL DESIGN

Experimental Design - 23 Factorial Design


 It is seen that the simulated field conditions would
consist of a low of zero degrees Fahrenheit ambient
temperature and a high of 70 degrees Fahrenheit.
 Similarly, the low wind velocity was zero miles per
hour and the high wind velocity was 20 mph.
 Two different bar sizes were used,
 the smaller one was 4/8 inch which is designated as
the low level
 the larger one was 11/8 inches, designated as the
high level.
IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved
AN EXPERIMENTAL STUDY USING A 23
FACTORIAL DESIGN

Transforming Equations
 In order to adopt a notation which will be the same for
all two-level factorial designs, we use transforming
equations to code the variables such that
 the high level will be denoted by +1,
 the low level will be denoted by -1.
 By so doing, regardless of the physical conditions
represented by the two levels, the basic design of any
two-level factorial design becomes a simple
arrangement of +1 and -1.

IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved

AN EXPERIMENTAL STUDY USING A 23


FACTORIAL DESIGN

Transforming Equations
For example:
 The transforming equation for ambient temperature (T) is

X1 = (T − 35)/35

IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved


AN EXPERIMENTAL STUDY USING A 23
FACTORIAL DESIGN

Transforming Equations
 In order to code the high and low levels of the ambient temperature into +1 and -1, we simply substitute the two levels, 0°F and 70°F, of the ambient temperature into the above transforming equation.

For the low level,  X1 = (0 − 35)/35 = −1
For the high level, X1 = (70 − 35)/35 = +1
IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved
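
 The same coding is often wrapped in a small helper; a sketch in plain Python (the function name is illustrative):

def code(value, low, high):
    # Map a physical level onto the -1/+1 coded scale:
    # subtract the midpoint, divide by the half-range.
    mid = (low + high) / 2
    half_range = (high - low) / 2
    return (value - mid) / half_range

print(code(0, 0, 70), code(70, 0, 70))   # -1.0 and +1.0 for 0°F and 70°F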

AN EXPERIMENTAL STUDY USING A 23


FACTORIAL DESIGN

Construction of the 23 Factorial Design


 A complete 2³ factorial design requires 2³ = 8 tests.
 One systematic way of writing down the eight test
conditions in their coded forms is to proceed as
follows:
 For X1, write down the values -1,+1,-1,+1,-1,+1,-1,+1
in a column. The signs alternate each time.
 For X2, write down the values -1,-1,+1,+1,-1,-1,+1,+1
in a column. The signs alternate in pairs.
 For X3, write down the values -1,-1,-1,-1,+1,+1,+1,+1
in a column. The signs alternate in groups of four.
IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved
AN EXPERIMENTAL STUDY USING A 23
FACTORIAL DESIGN

Construction of the 23 Factorial Design


 In general, there are 2^k sets of test conditions for a 2^k factorial design.
 We can obtain the 2^k sets of coded test conditions by writing down columns as follows:
 For X1, write down 2^k values -1, +1, -1, +1, .... The signs alternate each time (i.e., 2⁰ = 1, one alternation every time).
 For X2, write down 2^k values -1, -1, +1, +1, .... The signs alternate in pairs (i.e., 2¹ = 2, alternate in pairs).
IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved

AN EXPERIMENTAL STUDY USING A 23


FACTORIAL DESIGN

Construction of the 23 Factorial Design


 For X3, write down 2^k values -1, -1, -1, -1, +1, +1, +1, +1, .... The signs alternate in groups of four (i.e., 2² = 4, alternate in groups of four).
 For X4, the signs alternate in groups of eight (2³ = 8).
 Proceed in a similar way for X5, X6, ..., Xk. For Xk, write down 2^(k-1) values of -1, followed by 2^(k-1) values of +1.
 This method will yield all 2^k distinct sets of coded test conditions without repetition.
IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved
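
 This column-writing rule is easy to mechanize; a minimal sketch in plain Python (the function name is illustrative):

def two_level_design(k):
    # Generate the 2^k coded test conditions in the standard order:
    # column j (for X_{j+1}) alternates sign in groups of 2^j runs.
    runs = []
    for test in range(2**k):
        runs.append([+1 if (test // 2**j) % 2 else -1 for j in range(k)])
    return runs

for row in two_level_design(3):
    print(row)   # (-1,-1,-1), (+1,-1,-1), ..., (+1,+1,+1), as in Table 2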
AN EXPERIMENTAL STUDY USING A 23
FACTORIAL DESIGN

Table 2 : The 23 Factorial Design for the High Strength Steel Bar Problem
Coded Test Conditions Actual Test Conditions
Test # X1 X2 X3 °F (mph) (1/8 in)
1 -1 -1 -1 0 0 4
2 1 -1 -1 70 0 4
3 -1 1 -1 0 20 4
4 1 1 -1 70 20 4
5 -1 -1 1 0 0 11
6 1 -1 1 70 0 11
7 -1 1 1 0 20 11
8 1 1 1 70 20 11

IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved

AN EXPERIMENTAL STUDY USING A 23


FACTORIAL DESIGN

Table 3 Results of the Sixteen Welding Experiments


Test Y ai Test Y bi Average
Test # X1 X2 X3
Order (kpsi) Order (kpsi) (kpsi)
1 -1 -1 -1 6 84 3 91 87.5
2 1 -1 -1 8 90.6 7 84 87.3
3 -1 1 -1 1 69.6 5 86 77.8
4 1 1 -1 2 76 4 98 87
5 -1 -1 1 5 77.7 8 80.5 79.1
6 1 -1 1 3 99.7 1 95.5 97.6
7 -1 1 1 4 82.7 2 74.5 78.6
8 1 1 1 7 93.7 6 81.7 87.7

IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved


AN EXPERIMENTAL STUDY USING A 23
FACTORIAL DESIGN

Calculation of Average Effects of Ambient


Temperature
 Average Effect of Ambient Temperature, E1
 Observe that for Test Nos. 1 and 2 in Table 3, the
conditions of wind velocity (X2) and bar size (X3) are
the same but the temperature (X1) conditions are
different.
 A high temperature was used for Test No. 2 and a low
temperature was used for Test No. 1.
 Therefore, the difference in these two test results, apart
from the intrinsic variation that is present, can be
attributed solely to the effect of ambient temperature
alone.
IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved

AN EXPERIMENTAL STUDY USING A 23


FACTORIAL DESIGN

Calculation of Average Effects of Ambient


Temperature
 Similarly, for the pairs of Test # 3 and 4, 5 and
6, and 7 and 8 in Table 3; each pair involved
similar test conditions with respect to wind
velocity and bar size, but different test
conditions with respect to ambient
temperature.
 Thus, the differences in the results within each
of these four pairs reflect the effect of ambient
temperature alone.
IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved
AN EXPERIMENTAL STUDY USING A 23
FACTORIAL DESIGN

Calculation of Average Effects of Ambient


Temperature
 The differences in the test results are (in units of 1000 psi):

(Test Nos. 1 and 2)  ȳ2 − ȳ1 = 87.3 − 87.5 = −0.2
(Test Nos. 3 and 4)  ȳ4 − ȳ3 = 87.0 − 77.8 = 9.2
(Test Nos. 5 and 6)  ȳ6 − ȳ5 = 97.6 − 79.1 = 18.5
(Test Nos. 7 and 8)  ȳ8 − ȳ7 = 87.7 − 78.6 = 9.1

IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved

AN EXPERIMENTAL STUDY USING A 23


FACTORIAL DESIGN

Calculation of Average Effects of Ambient


Temperature
 The average effect of ambient temperature,
designated by E1 is by definition the average of
the above four differences and is given as :
E1 = (1/4)[(ȳ2 − ȳ1) + (ȳ4 − ȳ3) + (ȳ6 − ȳ5) + (ȳ8 − ȳ7)]
   = (1/4)[−0.2 + 9.2 + 18.5 + 9.1]
   = 9.15 units of 1000 psi
   = 9150 psi.
 Note that the average effect is commonly referred
to as main effect.
IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved
AN EXPERIMENTAL STUDY USING A 23
FACTORIAL DESIGN

Calculation of Average Effects of Ambient


Temperature
 Geometrically, the average effect of ambient
temperature, E1, is the difference between the
average result in Plane II (high level of ambient
temperature) and the average result on Plane I
(low level of ambient temperature) as shown in
Figure 1 .

IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved

AN EXPERIMENTAL STUDY USING A 23


FACTORIAL DESIGN

Figure 1. A Geometrical Interpretation of Average


Effect of Ambient Temperature

IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved


AN EXPERIMENTAL STUDY USING A 23
FACTORIAL DESIGN

Calculation of Average Effects of Ambient


Temperature
 This can be seen by rearranging the average effect equation for E1 as

E1 = (1/4)[(ȳ2 + ȳ4 + ȳ6 + ȳ8) − (ȳ1 + ȳ3 + ȳ5 + ȳ7)]

where the first group of results is at the high level of ambient temperature and the second group is at the low level.
IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved

AN EXPERIMENTAL STUDY USING A 23


FACTORIAL DESIGN

Calculation of Average Effects of Ambient


Temperature
 The average effect of ambient temperature tells
us that, on the average, over the ranges of the
variables in this investigation, the effect of
changing the ambient temperature from its low
level is an increase of ultimate tensile stress by
9150 psi.
 But notice that the individual differences (−200 psi, 9200 psi, 18,500 psi, and 9100 psi) are actually quite erratic.
IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved
AN EXPERIMENTAL STUDY USING A 23
FACTORIAL DESIGN

Calculation of Average Effects of Ambient


Temperature
 The average effect, therefore, must be
interpreted in conjunction with the intrinsic
variabilities that are present in the
experimental results.

IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved

AN EXPERIMENTAL STUDY USING A 23


FACTORIAL DESIGN

Calculation of Average Effects of Wind Velocity


 Consider Test No. 1 and 3 in Table 3. For these
two tests,
the ambient temperature (X1) and bar size (X3)
are constant,
but for Test #3 the wind velocity is at the high
level.
 The difference in the two test results, apart from
the intrinsic variation that is present, can be
attributed to the effect of wind velocity alone.

IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved


AN EXPERIMENTAL STUDY USING A 23
FACTORIAL DESIGN

Calculation of Average Effects of Wind Velocity


 Similarly, for Test No. 2 and 4, 5 and 7, and 6
and 8:
the ambient temperature and bar size are
constant within each pair
but the wind velocity is at different levels.

IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved

AN EXPERIMENTAL STUDY USING A 23


FACTORIAL DESIGN

Calculation of Average Effects of Wind Velocity


 Therefore, the average effect of wind velocity,
E2, can be obtained by taking the average of
the four individual differences, which are in
units of 1000 psi:
(Test Nos. 1 and 3)  ȳ3 − ȳ1 = 77.8 − 87.5 = −9.7
(Test Nos. 2 and 4)  ȳ4 − ȳ2 = 87.0 − 87.3 = −0.3
(Test Nos. 5 and 7)  ȳ7 − ȳ5 = 78.6 − 79.1 = −0.5
(Test Nos. 6 and 8)  ȳ8 − ȳ6 = 87.7 − 97.6 = −9.9
IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved
AN EXPERIMENTAL STUDY USING A 23
FACTORIAL DESIGN

Calculation of Average Effects of Wind Velocity


 And the average effect is

E2 = (1/4)[(ȳ3 − ȳ1) + (ȳ4 − ȳ2) + (ȳ7 − ȳ5) + (ȳ8 − ȳ6)]
   = (1/4)(−9.7 − 0.3 − 0.5 − 9.9)
   = −5.1 (units of 1000 psi)
   = −5100 psi.

IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved

AN EXPERIMENTAL STUDY USING A 23


FACTORIAL DESIGN

Calculation of Average Effects of Wind Velocity


 Geometrically, the average effect of wind
velocity is the difference between the average
result on Plane IV ( high level of wind velocity)
and the average result on Plane III (low level
of wind velocity) as shown in Figure 2.

IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved


AN EXPERIMENTAL STUDY USING A 23
FACTORIAL DESIGN

Figure 2. A Geometrical Interpretation of Average Effect


of Wind Velocity

IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved

AN EXPERIMENTAL STUDY USING A 23


FACTORIAL DESIGN

Calculation of Average Effects of Wind Velocity


 The average effect of wind velocity is –5100
psi which tells us that on the average, the
ultimate tensile stress decreased by 5100 psi
when the wind velocity was changed from 0
mph to 20 mph.

IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved


AN EXPERIMENTAL STUDY USING A 23
FACTORIAL DESIGN

Calculation of Average Effects of Bar Size


 Similarly, the average effect of bar size, E3, is the
corresponding comparison between Plane VI (
high level of bar size) and Plane V ( low level of
bar size) which are indicated in Figure 3.
 This effect is calculated as follows:

E3 = (1/4)[(ȳ5 − ȳ1) + (ȳ6 − ȳ2) + (ȳ7 − ȳ3) + (ȳ8 − ȳ4)]
   = (1/4)(−8.4 + 10.3 + 0.8 + 0.7)
   = 0.85 (units of 1000 psi)
   = 850 psi.
IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved

AN EXPERIMENTAL STUDY USING A 23


FACTORIAL DESIGN

Figure 3. A Geometrical Interpretation of Average Effect


of Bar Size

IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved


AN EXPERIMENTAL STUDY USING A 23
FACTORIAL DESIGN

Two-Variable Interactions
 The average effects E1, E2, and E3 represent the
individual effects of ambient temperature, wind
velocity, and bar size on the ultimate tensile
stress.
 What about the joint effect of two variables, say,
 ambient temperature and wind velocity on the
ultimate tensile stress?
 or wind velocity and bar size on the ultimate
tensile stress?
 These joint effects are indicated by the
two-variable interactions.
IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved

AN EXPERIMENTAL STUDY USING A 23


FACTORIAL DESIGN

Two-Variable Interactions
 Physically, what is a two-variable interaction?
Let us consider the hypothetical set of
data given in Figure 4.
The numbers located at each of the four
corners represent the hypothetical test
results which may be observed at the four
sets of test conditions given by the four
corners.

IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved


AN EXPERIMENTAL STUDY USING A 23
FACTORIAL DESIGN

Figure 4. Hypothetical Results Illustrating the Absence of a Two-Variable Interaction

                           Ambient Temperature
                           (low)     (high)
Wind Velocity (high)        110        120
Wind Velocity (low)          90        100
IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved

AN EXPERIMENTAL STUDY USING A 23


FACTORIAL DESIGN

Two-Variable Interactions
 The difference in the test results due to a
change of ambient temperature performed
at the low level of wind velocity is
100 - 90 =10.
 At the high level of wind velocity, the
difference is
120 - 110 = 10.

IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved


AN EXPERIMENTAL STUDY USING A 23
FACTORIAL DESIGN

Two-Variable Interactions
 These two differences, which are due to the
change in ambient temperature at different
levels of wind velocity, are identical.
 We say, in this case, that there is no
interaction between ambient temperature and
wind velocity.

IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved

AN EXPERIMENTAL STUDY USING A 23


FACTORIAL DESIGN

Two-Variable Interactions
 In other words if the effect of changing
ambient temperature is the same at both
levels of wind velocity (or, if the effect of
changing the wind velocity is the same at both
ambient temperature levels), there is no
interaction between ambient temperature and
wind velocity.
 In a sense, ambient temperature and wind
velocity act independently of one another.
IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved
AN EXPERIMENTAL STUDY USING A 23
FACTORIAL DESIGN

Two-Variable Interactions
 On the other hand, consider the results given in Figure 5.

Figure 5. Hypothetical Results Illustrating the Presence of a Two-Variable Interaction

                           Ambient Temperature
                           (low)     (high)
Wind Velocity (high)        110        140
Wind Velocity (low)          90        100
IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved

AN EXPERIMENTAL STUDY USING A 23


FACTORIAL DESIGN

Two-Variable Interactions
 Here, for the low level of wind velocity the difference in
the results due to change in level of ambient
temperature is
100-90=10.
 But at the high level of wind velocity, the difference in
the results due to change in level of ambient
temperature is
140-110=30.
 Thus, the effect of changing ambient temperature is not
the same at the high and low levels of wind velocity.

IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved


AN EXPERIMENTAL STUDY USING A 23
FACTORIAL DESIGN

Two-Variable Interactions
 Likewise, the difference in the results at the low level of ambient temperature due to the change in wind velocity level is 110 − 90 = 20, and the difference in the results at the high level of ambient temperature is 140 − 100 = 40.
 That is, the effect of changing the level of wind velocity is not the same at both levels of ambient temperature.
IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved
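
 For a 2x2 layout like this, the conditional effects and their comparison reduce to a few lines; a sketch in plain Python using the values of Figure 5 (taking half the difference of the conditional effects is one common convention for the interaction):

# Corner values from Figure 5.
y_ll, y_hl = 90, 100    # low/high temperature at low wind velocity
y_lh, y_hh = 110, 140   # low/high temperature at high wind velocity

effect_T_lowV = y_hl - y_ll                           # 10
effect_T_highV = y_hh - y_lh                          # 30
interaction = (effect_T_highV - effect_T_lowV) / 2    # 10: nonzero, so T and V interact
print(effect_T_lowV, effect_T_highV, interaction)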

AN EXPERIMENTAL STUDY USING A 23


FACTORIAL DESIGN

Two-Variable Interactions
 In other words, the effect of ambient temperature depends on the level of wind velocity; the two variables do not act independently of one another.

 We say, in this case, that an interaction


exists between ambient temperature and
wind velocity.
IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved
AN EXPERIMENTAL STUDY USING A 23
FACTORIAL DESIGN

Calculation of Two-Variable Interactions


 There are three two-variable interactions to be
calculated, namely:
 between ambient temperature and wind
velocity, denoted by E12,
 between ambient temperature and bar size,
denoted by, E13
 between wind velocity and bar size, denoted
by E23.

 The calculations of the three two-variable interactions


will be illustrated in the same order as is given above.
IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved

AN EXPERIMENTAL STUDY USING A 23


FACTORIAL DESIGN

Interaction between Ambient Temperature and


Wind Velocity, E12
 In order to calculate the interaction between
ambient temperature and wind velocity, let us go
back to Table 3.
 At the high level of wind velocity, the two differences of results, (ȳ4 − ȳ3) and (ȳ8 − ȳ7), both reflect the change in ultimate tensile stress that could occur due to a change of ambient temperature from the low level to the high level.
IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved
AN EXPERIMENTAL STUDY USING A 23
FACTORIAL DESIGN

Table 3 Results of the Sixteen Welding Experiments


Test Y ai Test Y bi Average
Test # X1 X2 X3
Order (kpsi) Order (kpsi) (kpsi)
1 -1 -1 -1 6 84 3 91 87.5
2 1 -1 -1 8 90.6 7 84 87.3
3 -1 1 -1 1 69.6 5 86 77.8
4 1 1 -1 2 76 4 98 87
5 -1 -1 1 5 77.7 8 80.5 79.1
6 1 -1 1 3 99.7 1 95.5 97.6
7 -1 1 1 4 82.7 2 74.5 78.6
8 1 1 1 7 93.7 6 81.7 87.7

IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved

AN EXPERIMENTAL STUDY USING A 23


FACTORIAL DESIGN

Interaction between Ambient Temperature and


Wind Velocity, E12
 The average of these two differences, at the high level of wind velocity, is

(1/2)[(ȳ4 − ȳ3) + (ȳ8 − ȳ7)]

 Similarly, at the low level of wind velocity, the average of the two differences in results due to the sole effect of a change in ambient temperature level is

(1/2)[(ȳ2 − ȳ1) + (ȳ6 − ȳ5)]
IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved
AN EXPERIMENTAL STUDY USING A 23
FACTORIAL DESIGN

Interaction between Ambient Temperature and


Wind Velocity, E12
 The interaction between ambient temperature and wind velocity is given by half the difference of these two averages, that is,

E12 = (1/2){(1/2)[(ȳ4 − ȳ3) + (ȳ8 − ȳ7)] − (1/2)[(ȳ2 − ȳ1) + (ȳ6 − ȳ5)]}
    = (1/4)[(ȳ1 + ȳ4 + ȳ5 + ȳ8) − (ȳ2 + ȳ3 + ȳ6 + ȳ7)]
IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved

AN EXPERIMENTAL STUDY USING A 23


FACTORIAL DESIGN

Interaction between Ambient Temperature and


Wind Velocity, E12
 Note, therefore, that the interaction between
ambient temperature and wind velocity tells us
the average change in ultimate tensile stress
that would occur due to a change from the low
level to the high level in both the ambient
temperature and wind velocity.

IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved


AN EXPERIMENTAL STUDY USING A 23
FACTORIAL DESIGN

Interaction between Ambient Temperature and


Wind Velocity, E12
 The two-variable interaction between ambient temperature and wind velocity is:

E12 = (1/4)[(87.5 + 79.1 + 87.0 + 87.7) − (87.3 + 97.6 + 77.8 + 78.6)] = 0.0 psi

 That is, no interaction exists between ambient temperature and wind velocity.
IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved

AN EXPERIMENTAL STUDY USING A 23


FACTORIAL DESIGN

Interaction Between Ambient Temperature and


Bar Size, E13
 Similarly, the interaction between ambient temperature and bar size is

E13 = (1/4)(ȳ1 + ȳ3 + ȳ6 + ȳ8) − (1/4)(ȳ2 + ȳ4 + ȳ5 + ȳ7)
    = 4650 psi

IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved


AN EXPERIMENTAL STUDY USING A 23
FACTORIAL DESIGN

Interaction Between Wind Velocity and Bar Size,


E23
 Likewise, the interaction between wind velocity and bar size is

E23 = (1/4)(ȳ1 + ȳ2 + ȳ7 + ȳ8) − (1/4)(ȳ3 + ȳ4 + ȳ5 + ȳ6)
    = (1/4)(87.5 + 87.3 + 78.6 + 87.7) − (1/4)(77.8 + 87.0 + 79.1 + 97.6)
    = −0.10 (units of 1000 psi)
    = −100 psi.

IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved

AN EXPERIMENTAL STUDY USING A 23


FACTORIAL DESIGN

Three-Variable-Interaction, E123
Just as a two-variable interaction is a measure of the joint effect of two variables on a response, a three-variable interaction is indicative of the joint effect of three variables on a response.
The procedure for estimating these effects is similar to estimating the second-order effects.
However, a simplified procedure that has been developed to estimate all these effects will now be discussed.

IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved


A SIMPLIFIED METHOD FOR MAIN AND
INTERACTION EFFECTS

 The purpose of going through the average effects and interactions in a rather detailed and descriptive manner, as above, using the geometrical representation of the design, is to provide a basic understanding and a better appreciation of the meaning of these effects.

IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved

A SIMPLIFIED METHOD FOR AVERAGE/MAIN


AND INTERACTION EFFECTS

 However, the usefulness of this approach


for calculation purposes is somewhat
limited since extension to more than three
variables is cumbersome.
 A simplified calculation procedure, which
is easily extended for analyzing two-level
factorial designs in any number of
variables, is therefore needed and is now
described.
IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved
A SIMPLIFIED METHOD FOR
AVERAGE/MAIN AND INTERACTION
EFFECTS

 Let us refer again to the design matrix for


the High Strength Steel Bar Example:
Table 1 Design Matrix
Test X1 X2 X3 y
1 -1 -1 -1 87.5
2 1 -1 -1 87.3
3 -1 1 -1 77.8
4 1 1 -1 87
5 -1 -1 1 79.1
6 1 -1 1 97.6
7 -1 1 1 78.6
8 1 1 1 87.7
IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved

A SIMPLIFIED METHOD FOR


AVERAGE/MAIN AND INTERACTION
EFFECTS

 We now notice that if we multiply the column of ±1 values associated with X1 by the data (y) (refer to Table 1), then sum the result and divide the sum by N/2 = 8/2 = 4, we obtain the average/main effect of temperature.

IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved


A SIMPLIFIED METHOD FOR
AVERAGE/MAIN AND INTERACTION
EFFECTS
Table 2

X1        y       X1·y
-1   ×   87.5    -87.5
 1   ×   87.3     87.3
-1   ×   77.8    -77.8
 1   ×   87.0     87.0
-1   ×   79.1    -79.1
 1   ×   97.6     97.6
-1   ×   78.6    -78.6
 1   ×   87.7     87.7
             SUM = 36.6

SUM/4 = 36.6/4 = 9.15 kpsi, so E_temp = 9.15 kpsi

 This result should not be too surprising.
 The above is simply a rearrangement of the individual results which form the four contrasts that were subsequently averaged.
IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved

A SIMPLIFIED METHOD FOR


AVERAGE/MAIN AND INTERACTION
EFFECTS
Table 3                              Table 4

X2        y       X2·y              X3        y       X3·y
-1   ×   87.5    -87.5              -1   ×   87.5    -87.5
-1   ×   87.3    -87.3              -1   ×   87.3    -87.3
 1   ×   77.8     77.8              -1   ×   77.8    -77.8
 1   ×   87.0     87.0              -1   ×   87.0    -87.0
-1   ×   79.1    -79.1               1   ×   79.1     79.1
-1   ×   97.6    -97.6               1   ×   97.6     97.6
 1   ×   78.6     78.6               1   ×   78.6     78.6
 1   ×   87.7     87.7               1   ×   87.7     87.7
          SUM = -20.4                         SUM = 3.40

E_Wind Velocity = -20.4/4 = -5.1 kpsi    E_Bar Size = 3.40/4 = 0.85 kpsi


IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved
A SIMPLIFIED METHOD FOR
AVERAGE/MAIN AND INTERACTION
EFFECTS

 Again these results (refer to Table 3 & 4) are


not a surprise to us since in the geometrical
representation the average effect was simply
an average of four contrasts

 Each contrast being the difference between a


test result at a high level (+1) for that factor
and one at a low level (-1) for that factor.

IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved

A SIMPLIFIED METHOD FOR


AVERAGE/MAIN AND INTERACTION
EFFECTS

 While the simplified and generalized method for


calculating the average/main effects via the
design matrix is somewhat obvious, it may not
be immediately obvious how to extend this to
the calculation of interaction effects.
 To see how we might handle the calculation of an interaction effect, we return to the mathematical model form associated with a 2² factorial,

Y = b0 + b1X1 + b2X2 + b12X1X2 + e
IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved
A SIMPLIFIED METHOD FOR
AVERAGE/MAIN AND INTERACTION
EFFECTS

 The coefficients b1 and b2 correspond to the


average/main effects of variables 1 and 2,
respectively, while the coefficient b12
corresponds to the interaction effect
between variables 1 and 2.

 Given Y, b1, b2, and particular values for X1


and X2, the key to obtaining a solution for b12
is to form the cross-product X1X2 (assume e
= 0).
IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved

A SIMPLIFIED METHOD FOR


AVERAGE/MAIN AND INTERACTION
EFFECTS

 To obtain estimates of the interaction effects E12, E13, E23, and E123, we may form the cross-product columns X1X2, X1X3, X2X3, and X1X2X3 as shown below.
 This complete seven-column matrix will be referred to as the calculation matrix.
 The cross-product columns of ± signs are simply the element-by-element products of the individual columns, e.g., X1X2 = (X1)(X2).

IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved


A SIMPLIFIED METHOD FOR
AVERAGE/MAIN AND INTERACTION
EFFECTS

Table 5 Calculation Matrix


Main Effects Interactions
Test X1 X2 X3 X1X2 X1X3 X2X3 X1X2X3 y
1 -1 -1 -1 1 1 1 -1 87.5
2 1 -1 -1 -1 -1 1 1 87.3
3 -1 1 -1 -1 1 -1 1 77.8
4 1 1 -1 1 -1 -1 -1 87
5 -1 -1 1 1 -1 -1 1 79.1
6 1 -1 1 -1 1 -1 -1 97.6
7 -1 1 1 -1 -1 1 -1 78.6
8 1 1 1 1 1 1 1 87.7

IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved

A SIMPLIFIED METHOD FOR


AVERAGE/MAIN AND INTERACTION
EFFECTS

Determining Variable Effects Via Calculation Matrix

 To calculate any one of the average effects or interactions:
merely multiply, element by element, the appropriate column by the column of average responses,
sum algebraically, and
divide the sum by four (i.e., 2³/2; in general, for a 2^k factorial design, the sum should be divided by 2^k/2).

IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved


A SIMPLIFIED METHOD FOR
AVERAGE/MAIN AND INTERACTION
EFFECTS

Determining Variable Effects Via Calculation Matrix


Table 6

X1X2       yi       X1X2·yi
 1    ×   87.5   =    87.5
-1    ×   87.3   =   -87.3
-1    ×   77.8   =   -77.8
 1    ×   87.0   =    87.0
 1    ×   79.1   =    79.1
-1    ×   97.6   =   -97.6
-1    ×   78.6   =   -78.6
 1    ×   87.7   =    87.7
                 SUM = 0.0

 Dividing the sum by four, we obtain the answer that was obtained previously (namely, zero) for this two-variable interaction.
IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved
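
 The whole calculation-matrix procedure fits in a few lines of code; a sketch in plain Python (column indices 0-2 stand for X1-X3; the averages y are those of Table 1):

from itertools import combinations

X = [(-1, -1, -1), (1, -1, -1), (-1, 1, -1), (1, 1, -1),
     (-1, -1, 1), (1, -1, 1), (-1, 1, 1), (1, 1, 1)]
y = [87.5, 87.3, 77.8, 87.0, 79.1, 97.6, 78.6, 87.7]

def effect(cols):
    # Multiply the chosen columns element by element, apply the signs
    # to y, sum, and divide by 2^k / 2 = 4.
    total = 0.0
    for row, yi in zip(X, y):
        sign = 1
        for c in cols:
            sign *= row[c]
        total += sign * yi
    return total / (len(X) / 2)

for r in (1, 2, 3):
    for cols in combinations(range(3), r):
        print(cols, round(effect(cols), 2))
# E1 = 9.15, E2 = -5.1, E3 = 0.85, E12 = 0.0, E13 = 4.65,
# E23 = -0.1, E123 = -4.7 (all in units of 1000 psi)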

STATISTICAL INFERENCE OF THE


AVERAGE AND INTERACTION EFFECTS

 We cannot form a proper attitude towards an


average effect or interaction unless something
is known about the intrinsic variability of the
testing procedure.
 Our attitude towards an average effect of say,
500, would not be the same if the 95%
confidence interval were 500±2 as it would be
if the interval were 500±2000.
IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved
STATISTICAL INFERENCE OF THE
AVERAGE AND INTERACTION EFFECTS

 In the former instance, we would feel that the


existence of an average effect has been rather
convincingly demonstrated and we could
assert with some confidence that its true
magnitude is probably fairly close to 500.
 In the latter instance, this is not the case at all, because considerable uncertainty is associated with the effect and its magnitude.

IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved

STATISTICAL INFERENCE OF THE


AVERAGE AND INTERACTION EFFECTS

 To obtain a quantitative measure of the


uncertainty in our calculated average effects
and interactions, we proceed as follows:
 Estimate the variance S2 of an individual
observation
 Estimate the variances associated with the
average effects and interactions
 Calculate the appropriate 95% confidence
intervals for the “true” average effects and
interactions.

IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved


STATISTICAL INFERENCE OF THE
AVERAGE AND INTERACTION EFFECTS

 From the 95% confidence intervals, we may


be able to interpret the significance of each
average effect and interaction, and draw
conclusions regarding the experimental study.

IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved

STATISTICAL INFERENCE OF THE


AVERAGE AND INTERACTION EFFECTS

Estimation of Variance σ2 of an Individual


Observation
 Recall that each of the eight tests was replicated once (refer to Table 3), so that there were actually sixteen individual observations on the ultimate tensile stress.
 It is the variance of each of these sixteen observations that we will now estimate.
 We shall assume that the true variance σ² is the same for all sixteen observations and that the observations are independent.
IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved
STATISTICAL INFERENCE OF THE
AVERAGE AND INTERACTION EFFECTS

Estimation of Variance σ2 of an Individual


Observation
 For Test No. 1, the two observations are 84.0 and 91.0. A sample variance for this test, designated S1², can be calculated from Table 3 as

S1² = [(ya1 − ȳ1)² + (yb1 − ȳ1)²] / (2 − 1)
    = (84.0 − 87.5)² + (91.0 − 87.5)²
    = 24.50
IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved

STATISTICAL INFERENCE OF THE


AVERAGE AND INTERACTION EFFECTS

Estimation of Variance σ2 of an Individual


Observation
 In this example, we can calculate eight sample
variances S12, S22, …….,S82, one for each
test. The eight sample variances are
calculated to be:
S12 = 24.5 S22=21.78
S32=134.48 S42=242.0
S52=3.92 S62=8.82
S72=33.62 S82=72.00
IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved
STATISTICAL INFERENCE OF THE
AVERAGE AND INTERACTION EFFECTS

Estimation of Variance σ2 of an Individual


Observation
 Since we are assuming that there is only one true variance σ² for all sixteen observations, an estimate for σ² is the pooled sample variance Sp² of the eight estimated variances S1², S2², ..., S8².
 In this case,

Sp² = [(ya1 − ȳ1)² + (yb1 − ȳ1)² + ... + (ya8 − ȳ8)² + (yb8 − ȳ8)²] / [(2 − 1) + ... + (2 − 1)]
    = (24.50 + 21.78 + ... + 72.00) / 8
    = 67.64
IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved

STATISTICAL INFERENCE OF THE


AVERAGE AND INTERACTION EFFECTS

Estimation of Variance σ2 of an Individual


Observation
 It should be pointed out that when the number of replications is not the same for all eight tests, the pooled sample variance Sp² has to be modified accordingly.
 For example, suppose Test No. 2 was replicated twice instead of once (three observations ya2, yb2, yc2); then

Sp² = {[(ya1 − ȳ1)² + (yb1 − ȳ1)²] + [(ya2 − ȳ2)² + (yb2 − ȳ2)² + (yc2 − ȳ2)²] + ... + [(ya8 − ȳ8)² + (yb8 − ȳ8)²]} / [(2 − 1) + (3 − 1) + (2 − 1) + ... + (2 − 1)]

IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved


Checking the Assumption of Common Variance - Bartlett Test

H0: σ1² = σ2² = ... = σm² = σ²
H1: at least one σi² ≠ σj², i ≠ j

χ²(ν = m − 1) = M/C

where, for m = 2^k tests with sample sizes n1, ..., nm and N = n1 + n2 + ... + nm:

M = (N − m) ln sp² − Σ(i=1..m) (ni − 1) ln si²

C = 1 + [1/(3(m − 1))] [Σ(i=1..m) 1/(ni − 1) − 1/(N − m)]
IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved

Bartlett test

 The value of M will be large if the sample variances si² differ greatly in magnitude, and will be zero if all the sample variances are exactly equal.
 We will reject H0 if χ²cal is too large, i.e., if

χ²cal > χ²(m−1, α)

where χ²(m−1, α) is the critical value cutting off an upper-tail area of α under the χ² distribution with m − 1 degrees of freedom.

IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved


Bartlett Test

sp² = 67.64, N = 16, m = 8

M = (16 − 8) ln 67.64 − [(2 − 1) ln 24.5 + (2 − 1) ln 21.78 + (2 − 1) ln 134.48 + (2 − 1) ln 242 + (2 − 1) ln 3.92 + (2 − 1) ln 8.82 + (2 − 1) ln 33.62 + (2 − 1) ln 72.0]
  = 5.713

C = 1 + [1/(3(8 − 1))] [Σ(i=1..8) 1/(2 − 1) − 1/(16 − 8)]
  = 1 + (1/21)(8 − 0.125) = 1.375
IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved

Bartlett Test

χ²cal = 5.713/1.375 = 4.15

χ²(7, α = 0.05) = 14.1

 Since χ²cal < χ²table, do not reject H0.
 The variances may be taken as equal.

IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved
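
 The Bartlett statistic itself takes only a few lines; a sketch in plain Python using the eight sample variances above:

from math import log

s2 = [24.50, 21.78, 134.48, 242.0, 3.92, 8.82, 33.62, 72.00]  # S_i^2 values
n = [2] * 8                      # two observations per test
N, m = sum(n), len(s2)           # N = 16, m = 8
sp2 = sum((ni - 1) * si for ni, si in zip(n, s2)) / (N - m)   # 67.64
M = (N - m) * log(sp2) - sum((ni - 1) * log(si) for ni, si in zip(n, s2))
C = 1 + (sum(1 / (ni - 1) for ni in n) - 1 / (N - m)) / (3 * (m - 1))
print(M, C, M / C)   # ~5.71, ~1.375, chi2_cal ~ 4.15 < 14.1: do not reject H0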


STATISTICAL INFERENCE OF THE
AVERAGE AND INTERACTION EFFECTS

Estimation of the Variances Associated with the Average Effects and Interactions
 The average effect of ambient temperature, E1, is

E1 = (1/4)(ȳ2 − ȳ1 + ȳ4 − ȳ3 + ȳ6 − ȳ5 + ȳ8 − ȳ7)

 Since each ȳi is an average of two observations yai and ybi, E1 can be written as

E1 = (1/4)[(ya2 + yb2)/2 − (ya1 + yb1)/2 + ... + (ya8 + yb8)/2 − (ya7 + yb7)/2]

or

E1 = (1/8)[ya2 + yb2 − ya1 − yb1 + ... + ya8 + yb8 − ya7 − yb7]
IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved

STATISTICAL INFERENCE OF THE


AVERAGE AND INTERACTION EFFECTS

Estimation of the Variances Associated with the Average Effects and Interactions
 Since E1 is (1/8) times a signed sum of 16 independent observations, each with variance σ², V(E1) = 16σ²/64 = σ²/4.
 It can be shown that the variances of all other average effects and interactions are equal to V(E1), that is,

V(E1) = V(E2) = V(E3) = V(E12) = V(E13) = V(E23) = V(E123) = σ²/4

 Substituting for σ² the pooled sample variance Sp², we obtain the sample variances as Sp²/4.

IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved


STATISTICAL INFERENCE OF THE
AVERAGE AND INTERACTION EFFECTS

Calculation of 95% Confidence Intervals

 Recall that a confidence interval for a certain parameter can be calculated on the basis of the sample statistic.
 The confidence interval for the average and interaction effects can be obtained as:

Ei ± t √(Sp²/4),   i = 1, 2, ...
IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved

STATISTICAL INFERENCE OF THE


AVERAGE AND INTERACTION EFFECTS

Calculation of 95% Confidence Intervals


 We have already determined the values of E1, E2, ..., E12, ..., and Sp²; what is left to be determined is the value of t.
 We have a total of sixteen tests, and we used up eight degrees of freedom in calculating the eight averages ȳ1, ..., ȳ8.
 Therefore, the appropriate t-value is the value associated with eight degrees of freedom and corresponding to a 95% confidence level, which is t8,0.025 = 2.306.
IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved
STATISTICAL INFERENCE OF THE
AVERAGE AND INTERACTION EFFECTS

Calculation of 95% Confidence Intervals


 The 95% confidence intervals are therefore

Ei ± 2.306 √(67.64/4) = Ei ± 9.48

 For example, the 95% confidence interval for the "true" average effect of ambient temperature (in units of 1000 psi) is

E1 ± 9.48 = [9.15 ± 9.48]

 Or, in terms of psi, [9150 ± 9480].
IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved
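
 The common half-width of these intervals can be checked directly; a sketch (Python with SciPy assumed):

from math import sqrt
from scipy.stats import t  # assumes SciPy

sp2 = 67.64                                  # pooled variance of one observation
half_width = t.ppf(0.975, 8) * sqrt(sp2 / 4)
print(half_width)                            # ~9.48 kpsi, i.e. +/- 9480 psi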

STATISTICAL INFERENCE OF THE


AVERAGE AND INTERACTION EFFECTS

Table 4. 95% Confidence Intervals for the True Average Effects and Interactions

Average Effects                                         95% Confidence Interval
Ambient temperature (E1)                                9150 ± 9480 psi
Wind velocity (E2)                                      -5100 ± 9480 psi
Bar size (E3)                                           850 ± 9480 psi

Two-Variable Interactions
Ambient temperature-Wind velocity (E12)                 0 ± 9480 psi
Ambient temperature-Bar size (E13)                      4650 ± 9480 psi
Wind velocity-Bar size (E23)                            -100 ± 9480 psi

Three-Variable Interaction
Ambient temperature-Wind velocity-Bar size (E123)       -4700 ± 9480 psi

IE 400 Lecture 10 2006 Dr. Shiv G. Kapoor All Rights Reserved


Session 5
Fractional Factorial Designs
Prof. Shiv G. Kapoor
Fractional Factorial Design

Professor Shiv G. Kapoor


Department of Mechanical Science and Engineering
University of Illinois at Urbana-Champaign

IE 400 Lecture 15 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved

Redundancy in Two-Level
Factorials

 In full factorial designs, a large amount of resources is expended in estimating interaction terms. That is, the ratio of the number of main effects to the total number of effects decreases rapidly as the number of variables k increases.
 For example, in a full 2⁶ experiment, only about 9.4% (6/64) of the effects calculated are main/average effects.
Nearly all of the remaining 90.6% relate to interaction effects.

IE 400 Lecture 15 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved


Example - 26 = 64 tests

Table 1

 1   Mean
 6   Main effects
15   Two-factor interactions
20   Three-factor interactions
15   Four-factor interactions
 6   Five-factor interactions
 1   Six-factor interaction
64   Tests
IE 400 Lecture 15 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved
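
 The counts in Table 1 are just binomial coefficients; a sketch in plain Python:

from math import comb

k = 6
for order in range(k + 1):
    print(order, comb(k, order))   # 1, 6, 15, 20, 15, 6, 1 -> total 2^6 = 64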

Redundancy in Two-Level Factorials

 Because of the negligible magnitude of the many higher-order interactions, together with the fact that only a few variables will have a significant influence on the response, a tremendous amount of redundancy exists in two-level factorial designs.
 To reduce the problem of estimating large numbers of possibly unimportant interaction effects, fractional factorial designs are created by replacing some of the higher-order interaction terms with additional experimental factors.

IE 400 Lecture 15 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved


Creating a Fraction of a 2k Design

 For example, suppose that you want to study four factors, X1, X2, X3 and X4, but that you want to use 8 test runs rather than the 16 runs required by a full 2⁴ design.
 To do this, first write down the extended design matrix for the full 2³ design (i.e., the two-level design with 8 runs).
IE 400 Lecture 15 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved

A 23 Factorial Design

Table 2

X1 X2 X3 X1X2 X1X3 X2X3 X1X2X3

-1 -1 -1 +1 +1 +1 -1
+1 -1 -1 -1 -1 +1 +1
-1 +1 -1 -1 +1 -1 +1
+1 +1 -1 +1 -1 -1 -1
-1 -1 +1 +1 -1 -1 +1
+1 -1 +1 -1 +1 -1 -1
-1 +1 +1 -1 -1 +1 -1
+1 +1 +1 +1 +1 +1 +1

IE 400 Lecture 15 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved


Creating a Fraction of a 2k Design

 Next, since the highest-order interaction is


least likely to be important, replace the X1X2X3
column by the letter X4. This is abbreviated by
writing X4 = X1X2X3
 Then erase all remaining interaction columns
to obtain the design matrix.
 In fact, the 8 test runs correspond to certain
rows in the full 24 design as shown in Table 3.

IE 400 Lecture 15 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved

Table 3

X1=1 X2=2 X3=3 X4=4


-1 -1 -1 -1
+1 -1 -1 +1
-1 +1 -1 +1
+1 +1 -1 -1
-1 -1 +1 +1
+1 -1 +1 -1
-1 +1 +1 -1
+1 +1 +1 +1

IE 400 Lecture 15 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved


Creating a Fraction of a 2k Design

 This four-column matrix is the design matrix


of a fractional factorial design based on four
factors. In fact, these 8 test runs correspond
to certain rows in the full 24 design, as shown
(shaded) here.

IE 400 Lecture 15 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved

Design Matrix – 2⁴ Design

Run   X1   X2   X3   X4   Half-fraction run
 1    -1   -1   -1   -1   1
 2    +1   -1   -1   -1
 3    -1   +1   -1   -1
 4    +1   +1   -1   -1   4
 5    -1   -1   +1   -1
 6    +1   -1   +1   -1   6
 7    -1   +1   +1   -1   7
 8    +1   +1   +1   -1
 9    -1   -1   -1   +1
10    +1   -1   -1   +1   2
11    -1   +1   -1   +1   3
12    +1   +1   -1   +1
13    -1   -1   +1   +1   5
14    +1   -1   +1   +1
15    -1   +1   +1   +1
16    +1   +1   +1   +1   8

(The numbers in the last column indicate which of the 8 fractional test runs each shaded row corresponds to.)

IE 400 Lecture 15 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved

Creating a Fraction of a 2k Design

 Because the 8 test runs comprise only a fraction of the 16 runs required in a full 2⁴ design, we say that the 8-run experiment is a fractional factorial experiment.
 Furthermore, since this design uses only half of the 16 runs, we say that it is a half fraction of the full factorial design based on four factors.

IE 400 Lecture 15 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved


Consequences of Fractionating a
Full Factorial

 Once the tests are conducted in accordance


with the test recipes defined by the design
matrix, the calculation matrix is determined to
provide for the estimation of the interaction
effects.
 Expanding the design matrix above, we
obtain the following calculation matrix by
forming all possible products of columns 1
through 4.

IE 400 Lecture 15 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved

Calculation Matrix

Test  I  1  2  3  4  12  13  14  23  24  34  123  124  134  234  1234
 1    +  −  −  −  −  +   +   +   +   +   +   −    −    −    −    +
 2    +  +  −  −  +  −   −   +   +   −   −   +    −    −    +    +
 3    +  −  +  −  +  −   +   −   −   +   −   +    −    +    −    +
 4    +  +  +  −  −  +   −   −   −   −   +   −    −    +    +    +
 5    +  −  −  +  +  +   −   −   −   −   +   +    +    −    −    +
 6    +  +  −  +  −  −   +   −   −   +   −   −    +    −    +    +
 7    +  −  +  +  −  −   −   +   +   −   −   −    +    +    −    +
 8    +  +  +  +  +  +   +   +   +   +   +   +    +    +    +    +
IE 400 Lecture 15 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved


Consequences of Fractionating a
Full Factorial

 Examination of the calculation matrix above


reveals that many of the columns are
identical. In particular, of the 16 columns, only
eight are unique; each unique column
appears twice.
 The following pairs of variable effects are
represented in the calculation matrix by the
same column of plus and minus signs:

IE 400 Lecture 15 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved

Consequences of Fractionating a
Full Factorial

 1 and 234 12 and 34


 2 and 134 13 and 24
 3 and 124 23 and 14
 4 and 123 average (I) and 1234.

IE 400 Lecture 15 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved
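
 These alias pairs can be found mechanically; a sketch in plain Python that builds the half fraction with X4 = X1X2X3 and groups the effect columns that coincide:

from itertools import combinations

runs = []
for test in range(8):
    x = [+1 if (test // 2**j) % 2 else -1 for j in range(3)]  # X1, X2, X3
    x.append(x[0] * x[1] * x[2])                              # X4 = X1X2X3
    runs.append(x)

def column(cols):
    # The +/- column obtained by multiplying the chosen factor columns.
    out = []
    for run in runs:
        sign = 1
        for c in cols:
            sign *= run[c]
        out.append(sign)
    return tuple(out)

aliases = {}
for r in range(1, 5):
    for cols in combinations(range(4), r):
        name = ''.join(str(c + 1) for c in cols)
        aliases.setdefault(column(cols), []).append(name)

for names in aliases.values():
    print(names)   # ['1', '234'], ['12', '34'], ..., and ['1234'] (aliased with I)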


Consequences of Fractionating a
Full Factorial

 What does all this mean? When you multiply,


for example, the 12 column by the data, sum,
and divide by 4, do you get an estimate of
the two-factor interaction 12? Or the two-
factor interaction 34? Or both?
 The interactions 12 and 34 are said to be
confounded or confused.

IE 400 Lecture 15 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved

Consequences of Fractionating a
Full Factorial

 The interactions 12 and 34 are said to be


aliases of the unique column of plus and
minus signs defined by (+−−++−−+). Use of
this column for effect estimation produces a
number (estimate) that is actually the sum of
the two-factor interaction effects 12 and 34.
 Similarly, 1 and 234 are confounded effects,
2 and 134 are confounded effects, and so on.

IE 400 Lecture 15 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved


Consequences of Fractionating a
Full Factorial

 It seems that the innocent act of using the


123 column to introduce a fourth variable into
a 23 full factorial scheme has created a lot of
confounding among the variable effects.
 The eight unique columns in the calculation
matrix are used to obtain the linear
combinations l0, l1, …, l123 of confounded
effects when their signs are applied to the
data, and the result is summed and then
divided by 4 (divide by 8 for l0).
IE 400 Lecture 15 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved

Consequences of Fractionating a
Full Factorial

 l0 estimates mean + (1/2)(1234)


 l1 estimates 1 + 234
 l2 estimates 2 + 134
 l3 estimates 3 + 124
 l12 estimates 12 + 34
 l13 estimates 13 + 24
 l23 estimates 23 + 14
 l123 estimates 4 + 123
IE 400 Lecture 15 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved
Consequences of Fractionating a
Full Factorial

 Some of this confounding can be eliminated


by invoking the assumption that third- and
higher-order effects are negligible, leading to
clear estimates of all main effects.
 But the six two-factor interactions are still
hopelessly confounded.

IE 400 Lecture 15 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved

Consequences of Fractionating a
Full Factorial
 If we assume three- and four-factor interactions can
be neglected, the experiment produces the following
linear combinations:
 l0 estimates mean
 l1 estimates 1
 l2 estimates 2
 l3 estimates 3
 l123 estimates 4
 l12 estimates 12 + 34
 l13 estimates 13 + 24
 l23 estimates 23 + 14.

IE 400 Lecture 15 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved


Consequences of Fractionating a
Full Factorial

 The four-variable, eight-test, two-level


experiment discussed thus far is referred to
as a two-level fractional factorial design since
it considers only a fraction of the tests defined
by the full factorial.
 In this case we have created a one-half
fraction design. It is commonly referred to as
a 24-1 fractional factorial design. It is a
member of the general class of 2k-p Fractional
Factorial Designs.
IE 400 Lecture 15 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved

Consequences of Fractionating a
Full Factorial

 For these designs


1. k variables are examined
2. in 2k-p tests
3. requiring that p of the variables be
introduced into the full factorial in k–p
variables
4. by assigning them to interaction effects
in the first k–p variables.

IE 400 Lecture 15 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved


Example

 24-1 fractional factorial


1. 4 variables are studied
2. in 24-1 = 8 tests
3. p = 1 of the variables is introduced into a 23
full factorial
4. by assigning it to the interaction 123 (i.e., let
4 = 123)

IE 400 Lecture 15 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved

Consequences of Fractionating a
Full Factorial

 Many other useful fractional factorials can be


developed, some dealing with rather large
numbers of variables in relatively few tests.
The 24-1 fractional factorial design just
examined is one of the simplest fractional
factorial designs; the confounding pattern can get much worse.
 Therefore, we need a system to set up such
designs easily and to determine quickly the
precise nature/pattern of the confounding of
the variable effects.

IE 400 Lecture 15 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved


System to Define the Confounding Pattern
of a Two-Level Fractional Factorial

 Suppose that an investigator wishes to study


the potential effects that five variables may
have on the output of a certain process using
some type of two-level factorial experiment.
 If all possible combinations of five variables at
two levels each are to be considered, then 25
= 32 tests must be conducted.

IE 400 Lecture 15 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved

System to Define the Confounding Pattern


of a Two-Level Fractional Factorial

 His boss informs him that due to time and


budget limitations he will only be able to run 8
tests, not 32.
 How might the investigator reconsider his
original test plan and gain some useful
information about the five variables?
 If only eight tests are to be considered using
a two-level scheme, only three variables can
be examined in a full two-level factorial test
plan as shown in Table 4.
IE 400 Lecture 15 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved
Table 4

Test I (Average) 1 2 3 12 13 23 123 y
1 + − − − + + + − y1
2 + + − − − − + + y2
3 + − + − − + − + y3
4 + + + − + − − − y4
5 + − − + + − − + y5
6 + + − + − + − − y6
7 + − + + − − + − y7
8 + + + + + + + + y8

IE 400 Lecture 15 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved

Design of a Fractional Factorial


Experiment

 As discussed earlier, in designing fractional


factorial experiments, we introduce additional
variables into the base design by borrowing
columns initially assigned to interaction
effects in the base design variables.
 The base design is the full factorial design
associated with the number of tests we wish
to run.

IE 400 Lecture 15 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved


Design of a Fractional Factorial
Experiment
 For the case under consideration:
1. Five variables will be studied using only eight tests.
Therefore, a 23 design is the base design.
2. Two additional variables must be introduced into the 23
base design. Columns 12, 13, 23 and 123 are
available to introduce these two additional variables.
3. The new test plan will be called a 25-2 fractional
factorial design.
 Two levels of each variable.
 Five variables under study.
 25-2 = 8 tests to be run.
 Two variables introduced into the 23 base design.

IE 400 Lecture 15 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved

Design of a Fractional Factorial


Experiment

 For the five-variable, eight-test fractional


factorial under study, let us introduce
variables 4 and 5 into the 23 base design by
assigning them to the 12 and 13 columns,
respectively, as in Table 5.
 In Table 5, the first five columns constitute the
design matrix and all seven columns refer to
the calculation matrix.

IE 400 Lecture 15 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved


Table 5

Test I 1 2 3 12(=4) 13(=5) 23 123 y
1 + − − − + + + − y1
2 + + − − − − + + y2
3 + − + − − + − + y3
4 + + + − + − − − y4
5 + − − + + − − + y5
6 + + − + − + − − y6
7 + − + + − − + − y7
8 + + + + + + + + y8
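 The same column-borrowing construction can be scripted for any choice of generators. A minimal Python sketch (mine, for illustration) that reproduces the five design-matrix columns of Table 5 with 4 = 12 and 5 = 13:

```python
from itertools import product
from math import prod

# 2^3 base design in variables 1, 2, 3 (variable 1 varying fastest).
base = [(x1, x2, x3) for x3, x2, x1 in product((-1, 1), repeat=3)]

# Assign each new variable to an interaction column of the base design.
generators = {4: (1, 2), 5: (1, 3)}   # 4 = 12, 5 = 13

design = []
for run in base:
    levels = {i + 1: x for i, x in enumerate(run)}
    for var, interaction in generators.items():
        levels[var] = prod(levels[i] for i in interaction)
    design.append(tuple(levels[v] for v in sorted(levels)))

for row in design:
    print(row)   # columns 1, 2, 3, 4(=12), 5(=13)
```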

IE 400 Lecture 15 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved

Design Generators and the Defining


Relation

 The question that remains is to determine


exactly which effects are confounded with
each other.
 From now on, when we refer to a column
heading (e.g., 1 or 23 or 123) we should
imagine a column of + and − signs directly
under it. Our 25-2 fractional factorial design
was generated by setting the 4-column equal
to the 12-column and the 5-column equal to
the 13-column.

IE 400 Lecture 15 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved


Design Generators and the Defining
Relation

 In the interest of convenience we will denote these as 4 = 12
and 5 = 13, where the = sign really implies an identity between
columns of + and − signs, for example:

4 = 12
+ = +
− = −
− = −
+ = +
+ = +
− = −
− = −
+ = +

IE 400 Lecture 15 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved

Design Generators and the Defining


Relation

 Now, if any column of + and − signs is multiplied by itself, a
column of all + signs is produced. We will denote such a column
by the heading I:

4 × 4 = I
+ × + = +
− × − = +
− × − = +
+ × + = +
+ × + = +
− × − = +
− × − = +
+ × + = +

IE 400 Lecture 15 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved


Design Generators and the Defining
Relation

 This simple operation will prove to be very


useful. Since the 25-2 fractional factorial
design was generated by setting
 4 = 12
 5 = 13
and given the definition of I above, if we
multiply both sides of the two “equations”
above by 4 and 5, respectively, we obtain

IE 400 Lecture 15 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved

Design Generators and the Defining


Relation

 4×4=12×4, 5×5=13×5
which reduces to
 I = 124, I = 135.
 These two identities are referred to as our design
generators. While both the left- and right-hand
sides of the equation above represent columns
of all + signs, the right-hand side retains the
individual column headings that produced the
column of all + signs by their product.
IE 400 Lecture 15 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved
Design Generators and the Defining
Relation

That is,
 I column = 1 column × 2 column × 4 column
 Now since both the 124 and 135 columns
equal I, their product must also equal I:
 (124) × (135) = I,
or, rearranging numbers (columns),
 (1)(1)2345 = I.

IE 400 Lecture 15 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved

Design Generators and the Defining


Relation

 But (1)(1) = I and any column multiplied by a


column of plus signs (I) remains unchanged.
Therefore, I is also equal to 2345:
 I = 2345.
 Hence we have
 I = 124 = 135 = 2345,
an identity comprised of the design
generators and their products in all possible
combinations (in this case only one product).

IE 400 Lecture 15 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved


Design Generators and the Defining
Relation

 The identity
 I = 124 = 135 = 2345
is referred to as the defining relation of this 25-2
fractional factorial design, and through it we can
reveal the complete aliasing/confounding
structure of this fractional factorial design.

IE 400 Lecture 15 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved

Revelation of the Complete


Confounding Pattern

 Returning to the original base design calculation


matrix, we recall that we have seven independent
columns of + and − signs, and an eighth column for I
(a column of all + signs) as shown in Table 6.
 By letting 4 = 12 and 5 = 13, we have created many
aliased effects. To find the aliases of the column
headings above (1, 2, 3, …, 123), we multiply each
by every term (including I) in the defining relation:
 Defining relation: I = 124 = 135 = 2345.

IE 400 Lecture 15 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved


Table 6

Test I 1 2 3 12 13 23 123
1 + − − − + + + −
2 + + − − − − + +
3 + − + − − + − +
4 + + + − + − − −
5 + − − + + − − +
6 + + − + − + − −
7 + − + + − − + −
8 + + + + + + + +
l0 l1 l2 l3 l12 l13 l23 l123

IE 400 Lecture 15 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved

Revelation of the Complete


Confounding Pattern
Given the Defining relation: I = 124 = 135 = 2345.
 For column heading 1:
 (1)I = (1)124 = (1)135 = (1)2345.
 Removing all I’s [recall that (1)(1) = I], we have
 1 = 24 = 35 = 12345.
 That is, the aliases of 1 are 24, 35 and 12345.
Therefore, when we multiply the 1 column by the y
column, sum, and divide by 4, we obtain an estimate of
the sum (linear combination) of 1, 24, 35 and 12345.
We conveniently denote this sum of confounded
variable effects as l1 (l for linear combination of the
effects).

IE 400 Lecture 15 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved


Linear Effects

 That is,
 l1 estimates 1 + 24 + 35 + 12345.
 Similarly, moving to column headings 2, 3
and so on, we find that
 2 = 14 = 1235 = 345
 3 = 1234 = 15 = 245
 12 = 4 = 235 = 1345
 13 = 234 = 5 = 1245
 23 = 134 = 125 = 45
 123 = 34 = 25 = 145.
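 Because column multiplication just cancels repeated numbers, the alias sets above can be generated with a symmetric-difference operation. A short Python sketch (added here as an illustration) for the defining relation I = 124 = 135 = 2345:

```python
def times(a, b):
    """Multiply two 'words': repeated factors cancel (symmetric difference)."""
    return frozenset(a) ^ frozenset(b)

def show(word):
    return "".join(sorted(str(f) for f in word)) or "I"

defining = [frozenset(), frozenset({1, 2, 4}),
            frozenset({1, 3, 5}), frozenset({2, 3, 4, 5})]

for head in ({1}, {2}, {3}, {1, 2}, {1, 3}, {2, 3}, {1, 2, 3}):
    print(" = ".join(show(times(head, w)) for w in defining))
# 1 = 24 = 35 = 12345
# 2 = 14 = 1235 = 345
# ... matching the list above
```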
IE 400 Lecture 15 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved

Linear Effects

 Hence
 l2 estimates 2 + 14 + 1235 + 345
 l3 estimates 3 + 1234 + 15 + 245
 ….
 l123 estimates 123 + 34 + 25 + 145.
 We have now defined the complete confounding
pattern of this 25-2 fractional factorial design and
we know precisely what effect combinations we
can obtain from the data.
IE 400 Lecture 15 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved
Summary

1. We want to study five variables in only eight


tests in a two-level factorial scheme.
2. The base design is therefore a 23 = 8 test
two-level full factorial in the base design
variables 1, 2 and 3.
3. To the base design we introduce variables 4
and 5 by assigning them to interaction
columns in the base design: for example,
4 = 12
5 = 13.

IE 400 Lecture 15 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved



Summary

4. The design generators are therefore


I = 124
I = 135,
and hence the design defining relation is
I = 124 = 135 = 2345.
5. The defining relation produces the confounding pattern
or alias structure:
I = 124 = 135 = 2345 23 = 134 = 125 = 45
1 = 24 = 35 = 12345 123 = 34 = 25 = 145
2 = 14 = 1235 = 345
3 = 1234 = 15 = 245
12 = 4 = 235 = 1345
13 = 234 = 5 = 1245

IE 400 Lecture 15 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved


Summary

 For the confounding structure shown above,


it should be noted that each column heading
(I, 1, 2, 3, …, 123) is confounded with three
effects (e.g., I is confounded with 124, 135,
and 2345).
 Thus, each of the eight rows in the
confounding structure contains four effects.
 For a full factorial in five variables (25 = 32
tests) there are 31 column headings plus I in
the calculation matrix.
IE 400 Lecture 15 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved

Effects for a 25 Factorial Design

 In particular, for a 25 full factorial calculation


matrix, there are columns for

1 Mean
5 Main effects
10 Two-factor interaction effects
10 Three-factor interaction effects
5 Four-factor interaction effects
1 Five-factor interaction effect
32 Total variable effects

IE 400 Lecture 15 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved


Summary

 Examining the aliasing structure for the 25-2


design, we observed that all 32 variable
effects (including the mean, I) are accounted
for (8 rows × 4 effects/row). It is a good idea
to verify that this is the case.

IE 400 Lecture 15 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved

Summary

6. The eight columns (including I) in the 25-2 fractional


factorial design produce the following linear
combinations of effects which can be estimated:
l0 estimates mean + (1/2)(124 + 135 + 2345)
l1 estimates 1 + 24 + 35 + 12345
l2 estimates 2 + 14 + 1235 + 345
l3 estimates 3 + 1234 + 15 + 245
l12 estimates 12 + 4 + 235 + 1345
l13 estimates 13 + 234 + 5 + 1245
l23 estimates 23 + 134 + 125 + 45
l123 estimates 123 + 34 + 25 + 145.
IE 400 Lecture 15 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved
Summary

7. If we assume that the majority of the variability in


the data can be explained by the presence of main
effects and two-factor interaction effects, the linear
combinations of effects are
l0 estimates mean
l1 estimates 1 + 24 + 35
l2 estimates 2 + 14
l3 estimates 3 + 15
l12 estimates 12 + 4
l13 estimates 13 + 5
l23 estimates 23 + 45
l123 estimates 34 + 25.

IE 400 Lecture 15 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved

Concept of Resolution of Two-Level


Fractional Factorial Designs

 We have previously seen that the introduction


of additional variables into two-level full
factorials gives rise to confounding or aliasing
of variable effects.
 It would be desirable to make this introduction
in such a way as to confound low order
effects (main effects and two-factor
interactions) not with each other but with
higher order interactions that are considered
unimportant.

IE 400 Lecture 15 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved


Selecting the Preferred Generators

 To illustrate, consider the study of five


variables in just sixteen tests (the full factorial
would require 25 = 32 tests). One additional
variable – the fifth variable – must be
introduced into a 24 = 16 run base design.
Any of the interactions in the first four
variables could be used for this purpose.
 12, 13, 14, 23, 24, 34
 123, 124, 134, 234
 1234.
IE 400 Lecture 15 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved

Design Generator and Defining


Relationship

 If any one of the two-factor interactions is


used, say, 5 = 12, then the design generator
becomes
 I = 125
which is also the defining relationship.
Therefore, at least some of the average/main
effects will be confounded with two-factor
interactions, viz.,
 1 = 25, 2 = 15, 5 = 12

IE 400 Lecture 15 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved


Design Generator and Defining
Relationship

 If any one of the three-factor interactions is


used to introduce the fifth variable, the
situation is greatly improved, at least for the
estimation of average/main effects. For
example, if we let 5 = 123, then
 I = 1235
is the generator and defining relationship.

IE 400 Lecture 15 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved

Alias structure

 So, some main effects are confounded with,


at worst, three-factor interactions, while two-
factor interactions are confounded with each
other, e.g.,
 1 = 235, 2 = 135, 3 = 125, 5 = 123,
 12 = 35, 13 = 25, 23 = 15

IE 400 Lecture 15 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved


Another option

 If the four-factor interaction is used to introduce the fifth


variable, i.e., 5 = 1234, an even more desirable result is
obtained (the best under these circumstances). The
generator and defining relationship for this situation is
 I = 12345
 Therefore,
 1 = 2345, 2 = 1345, 3 = 1245, 4 = 1235, 5 = 1234,
 12 = 345, 13 = 245, 14 = 235, 15 = 234, 23 = 145,
 24 = 135, 25 = 134, 34 = 125, 35 = 124, 45 = 123.

IE 400 Lecture 15 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved

Design Resolution

 In this last case,


1. All main effects are confounded with four-factor
interactions.
2. All two-factor interactions are confounded with
three-factor interactions.
The varying confounding structures produced by
using different orders of variable interactions to
introduce the fifth variable in the example above are
described by the concept of the resolution of fractional
factorial designs.
IE 400 Lecture 15 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved
Design Resolution

 “The resolution of a two-level fractional factorial


design is defined to be equal to the number of
letters (numbers) in the shortest length word
(term) in the defining relationship, excluding I.”
 If the defining relationship of a certain design is
 I = 124 = 135 = 2345
then the design is of resolution three, denoted as a
Resolution III, since the words “124” and “135” have
three letters each.

IE 400 Lecture 15 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved

Design Resolution

 If the defining relation of a certain design is:


 I = 1235 = 2346 = 1456
then the design is of Resolution IV (“1235”,
“2346” and “1456” each have four letters).
 Similarly, the design with defining relationship
(I = 12345), is a Resolution V design.

IE 400 Lecture 15 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved


Design Resolution

 If a design is of Resolution III, this means that at least


some main effects are confounded with two-factor
interactions.
 If a design is of Resolution IV, this means that at
least some main effects are confounded with three-
factor interactions while at least some two-factor
interactions are confounded with other two-factor
interactions.
 If a design is of Resolution V, this means that at least
some main effects are confounded with four-factor
interactions and some two-factor interactions are
confounded with three-factor interactions.

IE 400 Lecture 15 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved

Design Resolution

 It may be noted at this point, that the number


of words in the defining relationship for a 2k-p
fractional factorial design is equal to 2p. Thus,
for a 26-3 fractional factorial (k=6 and p=3),
there are 23 = 8 words in the defining
relationship.

IE 400 Lecture 15 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved


Example: Design Resolution/
Selection of Generators

 A 26-2 fractional factorial design is set up by


introducing variable 5 and 6 via
 5 = 123, 6 = 1234.
 What is the resolution of this design? The
design generators are:
 I = 1235, I = 12346.
 The defining relationship is:
 I = 1235 = 12346 = 456.
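 This bookkeeping is easy to automate: form all products of the generator words and take the length of the shortest word. A hedged Python sketch (mine, not part of the original notes) that reproduces both 26-2 examples in this section:

```python
from itertools import combinations

def defining_relation(generators):
    """All nonempty products of the generator words (repeated factors cancel)."""
    words = set()
    for r in range(1, len(generators) + 1):
        for combo in combinations(generators, r):
            w = frozenset()
            for g in combo:
                w ^= frozenset(g)
            words.add(w)
    return words

def resolution(generators):
    return min(len(w) for w in defining_relation(generators))

# 5 = 123, 6 = 1234  ->  I = 1235 = 12346 = 456  (Resolution III)
print(resolution([{1, 2, 3, 5}, {1, 2, 3, 4, 6}]))   # 3
# 5 = 123, 6 = 124   ->  I = 1235 = 1246 = 3456   (Resolution IV)
print(resolution([{1, 2, 3, 5}, {1, 2, 4, 6}]))      # 4
```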

IE 400 Lecture 15 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved

Example: Design Resolution/


Selection of Generators

 Therefore, the design is of Resolution III.


What would the resolution be if the
generators were
 5 = 123, 6 = 124?
 The defining relationship is:
 I = 1235 = 1246 = 3456.
 Now, the design is of Resolution IV. It is clear
that the selection of the proper design
generators is very important.
IE 400 Lecture 15 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved
Summary: The Concept of Design
Resolution

1. Higher resolution designs seem more desirable since


they provide the opportunity for low order effect
estimates to be determined in an unconfounded state,
assuming higher order interaction effects can be
neglected.
2. The more variables considered in a fixed number of
tests, the lower the resolution of the design becomes.
3. There is a limit to the number of variables that can be
considered in a fixed number of tests while maintaining
a pre-specified resolution requirement.

IE 400 Lecture 15 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved

Summary: The Concept of Design


Resolution

4. No more than (n-1) variables can be


examined in n tests (n is a power of 2, e.g.,
4, 8, 16, 32, …) to maintain a design
resolution of at least III. Such designs are
commonly referred to as saturated designs.
Examples are
 23-1, 27-4, 215-11, 231-26.
For saturated designs all interactions in the
base design variables are used to introduce
additional variables.
IE 400 Lecture 15 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved
Case Study: FD
Prof. Shiv G. Kapoor
Case Study - FD

Shiv G. Kapoor
Department of Mechanical Science and Engineering
University of Illinois at Urbana-Champaign

IE 400 Lecture 14 2006 Dr. Shiv G. Kapoor All Rights Reserved

ANOTHER CASE STUDY: SURFACE


FINISH OF A MACHINED PART

A manufacturer was concerned with the surface finish of


a cylindrical part produced through a single-point turning
machining operation, as shown in Figure 1.

Fig.1 Variables Considered in


the Surface Finish Study
IE 400 Lecture 14 2006 Dr. Shiv G. Kapoor All Rights Reserved
ANOTHER CASE STUDY: SURFACE
FINISH OF A MACHINED PART

 The specifications for the part called for a


nominal surface finish of 100 micro-inches.

 Although the process producing the parts had


been stabilized through the use of X and R
charts, the average surface finish of the parts
produced by the process was approximately
120 micro-inches.
IE 400 Lecture 14 2006 Dr. Shiv G. Kapoor All Rights Reserved

ANOTHER CASE STUDY: SURFACE


FINISH OF A MACHINED PART

 To identify how the process could be centered


at the target value of 100 micro-inches, a
study team was formed to examine the
problem.
 The study team identified three variables of
interest:
 spindle speed,
 feedrate,
 tool nose radius.

IE 400 Lecture 14 2006 Dr. Shiv G. Kapoor All Rights Reserved


ANOTHER CASE STUDY: SURFACE
FINISH OF A MACHINED PART

 The study team decided to perform a 23 factorial


design to study the effects of the three variables
on, and ultimately to develop a model for, the
machined surface finish. The study team
selected levels for the three variables spaced about
the current operating conditions for the process
 Spindle speed = 2500 revolutions per minute
 Feedrate = 0.008 inches per revolution
 Nose radius = 1/64 inch

IE 400 Lecture 14 2006 Dr. Shiv G. Kapoor All Rights Reserved

ANOTHER CASE STUDY: SURFACE


FINISH OF A MACHINED PART

 The high and low levels selected for the three


variables for the experiment are summarized
in Table 1.
Table 1 Variables Levels for the Surface Finish Experiment

Variable Low (-1) Level High (+1) Level


X1: Spindle speed (rev/min) 2400 2600
X2: Feedrate (in./rev) 0.005 0.010
X3: Nose radius (in.) 1/64 1/32

IE 400 Lecture 14 2006 Dr. Shiv G. Kapoor All Rights Reserved


ANOTHER CASE STUDY: SURFACE
FINISH OF A MACHINED PART

 The surface finish results of the 23 experiment


are given in Table 2.

Table 2. 23 Design and Results of the Surface Finish Experiment


Test # X1 X2 X3 Surface Finish (µin) Run Order
1 -1 -1 -1 52 6
2 1 -1 -1 63 8
3 -1 1 -1 213 1
4 1 1 -1 206 2
5 -1 -1 1 31 5
6 1 -1 1 28 3
7 -1 1 1 110 4
8 1 1 1 105 7

IE 400 Lecture 14 2006 Dr. Shiv G. Kapoor All Rights Reserved

CALCULATION OF EFFECT ESTIMATES


AND DETERMINATION OF SIGNIFICANT
EFFECTS

 Given the experimental results of Table 2,


the effect estimates were computed using
the calculation matrix and the algebraic
method of effect calculation.
 The calculation matrix along with the
response values is given in Table 3, and the
computed effects are displayed in Table 4.

IE 400 Lecture 14 2006 Dr. Shiv G. Kapoor All Rights Reserved


CALCULATION OF EFFECT ESTIMATES
AND DETERMINATION OF SIGNIFICANT
EFFECTS

Table 3 Calculation Matrix


Main Effects Interactions
Test X1 X2 X3 X1X2 X1X3 X2X3 X1X2X3 yi
1 -1 -1 -1 1 1 1 -1 52
2 1 -1 -1 -1 -1 1 1 63
3 -1 1 -1 -1 1 -1 1 213
4 1 1 -1 1 -1 -1 -1 206
5 -1 -1 1 1 -1 -1 1 31
6 1 -1 1 -1 1 -1 -1 28
7 -1 1 1 -1 -1 1 -1 110
8 1 1 1 1 1 1 1 105

IE 400 Lecture 14 2006 Dr. Shiv G. Kapoor All Rights Reserved



CALCULATION OF EFFECT ESTIMATES


AND DETERMINATION OF SIGNIFICANT
EFFECTS

Table 4 Effect Estimates for the Surface Finish


Experiment (micro-inches)
Average = 101   E12 = -5
E1 = -1         E13 = -3
E2 = 115        E23 = -37
E3 = -65        E123 = 4
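These numbers can be reproduced with the algebraic method: multiply each sign column of Table 3 by the responses, sum, and divide by 4 (by 8 for the average). A short Python sketch, added here for illustration:

```python
from math import prod

# Design levels (X1, X2, X3) and surface finish responses from Table 3.
runs = [(-1, -1, -1), (1, -1, -1), (-1, 1, -1), (1, 1, -1),
        (-1, -1, 1), (1, -1, 1), (-1, 1, 1), (1, 1, 1)]
y = [52, 63, 213, 206, 31, 28, 110, 105]

def effect(*vars_):
    """Effect estimate, e.g. effect(2) -> E2, effect(2, 3) -> E23."""
    signs = [prod(run[v - 1] for v in vars_) for run in runs]
    return sum(s * yi for s, yi in zip(signs, y)) / 4

print(sum(y) / 8)        # Average = 101
print(effect(2))         # E2  = 115
print(effect(3))         # E3  = -65
print(effect(2, 3))      # E23 = -37
```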

IE 400 Lecture 14 2006 Dr. Shiv G. Kapoor All Rights Reserved


CALCULATION OF EFFECT ESTIMATES
AND DETERMINATION OF SIGNIFICANT
EFFECTS

 To identify those effect estimates that are


distinguishable from the noise in the
experimental environment, a normal
probability plot may be constructed.
 The normal plot associated with the effects
of Table 4 is shown in Fig. 2.
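 A normal probability plot like Fig. 2 can be built by plotting the ordered effect estimates against normal quantiles at the plotting positions (i − 0.5)/n. The sketch below is an illustration added here (it assumes matplotlib is available); it is not the plotting procedure used in the original study.

```python
import matplotlib.pyplot as plt
from statistics import NormalDist

effects = {"1": -1, "2": 115, "3": -65, "12": -5,
           "13": -3, "23": -37, "123": 4}

items = sorted(effects.items(), key=lambda kv: kv[1])
n = len(items)
quantiles = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]

plt.scatter([v for _, v in items], quantiles)
for (name, v), q in zip(items, quantiles):
    plt.annotate(name, (v, q))
plt.xlabel("Effect estimate (micro-inches)")
plt.ylabel("Normal quantile")
plt.show()   # effects 2, 3 and 23 fall well off the straight line
```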

IE 400 Lecture 14 2006 Dr. Shiv G. Kapoor All Rights Reserved

CALCULATION OF EFFECT ESTIMATES


AND DETERMINATION OF SIGNIFICANT
EFFECTS

Fig. 2 Normal Probability Plot of the Effect Estimates
IE 400 Lecture 14 2006 Dr. Shiv G. Kapoor All Rights Reserved


CALCULATION OF EFFECT ESTIMATES
AND DETERMINATION OF SIGNIFICANT
EFFECTS

 In Figure 2, a straight line has been


drawn through four of the effect estimates
that have values close to zero.
 This line passes near the (0, 50%)
coordinate, and the scatter of the four
smaller effect estimates about this line
appears fairly random.

IE 400 Lecture 14 2006 Dr. Shiv G. Kapoor All Rights Reserved

CALCULATION OF EFFECT ESTIMATES


AND DETERMINATION OF SIGNIFICANT
EFFECTS

 It is seen that the other three effect estimates


fall well off the line.
 We therefore conclude that the main effects
2, 3, and the interaction effect 23, are
important.

IE 400 Lecture 14 2006 Dr. Shiv G. Kapoor All Rights Reserved


MODEL DEVELOPMENT AND CHECKING

 For a 23 factorial design the response is


assumed to be described by a model of the
following form:

y = b0 + b1X1 + b2X2 + b3X3 + b12X1X2 + b13X1X3 + b23X2X3 + b123X1X2X3 + ε        (1)

IE 400 Lecture 14 2006 Dr. Shiv G. Kapoor All Rights Reserved

MODEL DEVELOPMENT AND CHECKING

 Since only the significant effects (model


coefficients) need be included in the model,
the fitted model for surface finish is given by:

ŷ = b̂0 + b̂2X2 + b̂3X3 + b̂23X2X3        (2)

 Substituting the values b̂i = Ei/2 and b̂0 = average response, we obtain

ŷ = 101 + 57.5X2 − 32.5X3 − 18.5X2X3        (3)
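 Evaluating Eq. (3) at the eight design points reproduces the predicted responses and residuals listed in Table 5 below (a small sketch added for illustration):

```python
def y_hat(x2, x3):
    """Fitted surface finish model, Eq. (3)."""
    return 101 + 57.5 * x2 - 32.5 * x3 - 18.5 * x2 * x3

runs = [(-1, -1, -1), (1, -1, -1), (-1, 1, -1), (1, 1, -1),
        (-1, -1, 1), (1, -1, 1), (-1, 1, 1), (1, 1, 1)]
y = [52, 63, 213, 206, 31, 28, 110, 105]

for (x1, x2, x3), yi in zip(runs, y):
    pred = y_hat(x2, x3)
    print(x1, x2, x3, yi, pred, yi - pred)   # last column: residual y - y_hat
```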


IE 400 Lecture 14 2006 Dr. Shiv G. Kapoor All Rights Reserved
MODEL DEVELOPMENT AND CHECKING

Table 5. Predicted Response and Model Residuals for


the Surface Finish Experiment
Test # X1 X2 X3 yi ŷ i εi
1 -1 -1 -1 52 57.5 -5.5
2 1 -1 -1 63 57.5 5.5
3 -1 1 -1 213 209.5 3.5
4 1 1 -1 206 209.5 -3.5
5 -1 -1 1 31 29.5 1.5
6 1 -1 1 28 29.5 -1.5
7 -1 1 1 110 107.5 2.5
8 1 1 1 105 107.5 -2.5
IE 400 Lecture 14 2006 Dr. Shiv G. Kapoor All Rights Reserved

USING THE MODEL FOR QUALITY


IMPROVEMENT

 With a prediction model now developed and


checked for any potential inadequacies, the
study team next turned its attention to using
the model to find a solution to the problem at
hand.
 In short, it was desired to find values for
spindle speed, feedrate, and nose radius that
center the turning process at a surface finish
of 100 micro-inches, on the average.
IE 400 Lecture 14 2006 Dr. Shiv G. Kapoor All Rights Reserved
USING THE MODEL FOR QUALITY
IMPROVEMENT

 In terms of the fitted model, the team wanted


to know values of X1, X2, and X3 that produce
a predicted surface finish of 100 micro-
inches.
 Since the prediction model of Eq.(3) does
not depend on the level of the spindle speed
(X1), values for X2 and X3 are sought that
satisfy Eq.(3) when ŷ = 100.

IE 400 Lecture 14 2006 Dr. Shiv G. Kapoor All Rights Reserved

USING THE MODEL FOR QUALITY


IMPROVEMENT

 Solving Eq.(3) for X3 gives

X3 = (ŷ − 101 − 57.5X2) / (−32.5 − 18.5X2)        (4)
 Using Eq.(4), we find that contours of
constant predicted surface finish may be
constructed as a function of X2 and X3.

IE 400 Lecture 14 2006 Dr. Shiv G. Kapoor All Rights Reserved


USING THE MODEL FOR QUALITY
IMPROVEMENT

 Such a contour plot was developed by the


study team and is shown in Fig. 3.
 Table 6 shows some of the contour
construction calculations for a contour value
of 100 micro-inches.
 An examination of Fig. 3 shows that there are
a number of combinations of variables X2 and
X3 that produce a predicted surface finish of
100 micro-inches.
IE 400 Lecture 14 2006 Dr. Shiv G. Kapoor All Rights Reserved

USING THE MODEL FOR QUALITY


IMPROVEMENT

Fig. 3 Contours of Constant Predicted Surface Finish in Micro-inches

IE 400 Lecture 14 2006 Dr. Shiv G. Kapoor All Rights Reserved


USING THE MODEL FOR QUALITY
IMPROVEMENT

Table 6 Contour Calculations for ŷ =100 micro-inches


X2 -1.0 -0.8 -0.6 -0.4 -0.2 0
X3 -4.036 -2.542 -1.565 -0.876 -0.365 0.031
X2 0.2 0.4 0.6 0.8 1.0
X3 0.345 0.601 0.814 0.994 1.147
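 The contour values in Table 6 follow directly from Eq. (4); the short sketch below (an added illustration) regenerates them for ŷ = 100:

```python
def x3_contour(y_target, x2):
    """Eq. (4): the X3 value giving the target predicted surface finish."""
    return (y_target - 101 - 57.5 * x2) / (-32.5 - 18.5 * x2)

for x2 in (-1.0, -0.8, -0.6, -0.4, -0.2, 0.0, 0.2, 0.4, 0.6, 0.8, 1.0):
    print(x2, round(x3_contour(100, x2), 3))
# -1.0 -> -4.036, -0.2 -> -0.365, 0.0 -> 0.031, 1.0 -> 1.147, ...
```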

IE 400 Lecture 14 2006 Dr. Shiv G. Kapoor All Rights Reserved

USING THE MODEL FOR QUALITY


IMPROVEMENT

 Based on this contour plot, the study team


decided to select the largest value for X2
(feedrate), corresponding to an X3 (nose
radius) value of + 1.
 The reasoning behind the selection of this
particular X2, X3 combination was that in
addition to it producing the predicted surface
finish of 100 micro-inches, the larger value
for X2 would also lead to higher productivity.
IE 400 Lecture 14 2006 Dr. Shiv G. Kapoor All Rights Reserved
USING THE MODEL FOR QUALITY
IMPROVEMENT

 The same desire for higher productivity led to


selection of the level for variable X1 (i.e., X1 =
+1).
 In summary, therefore, the following levels
were selected for X1, X2, and X3:
 X1=+1 spindle speed = 2600 rev/min
 X2=+0.8 feedrate = 0.0095 in./rev
 X3=+1 nose radius =1/32 in.
IE 400 Lecture 14 2006 Dr. Shiv G. Kapoor All Rights Reserved

USING THE MODEL FOR QUALITY


IMPROVEMENT

 It may be noted that when these conditions


were applied to the turning process in
confirmatory tests, the process was indeed
recentered at approximately 100 micro-
inches.
 Furthermore, the selected conditions
provided for a very substantial increase in the
productivity from the original operating
conditions.
IE 400 Lecture 14 2006 Dr. Shiv G. Kapoor All Rights Reserved
Case Study: FFD
Prof. Shiv G. Kapoor
Case Study - FFD

Shiv G. Kapoor
Department of Mechanical Science and Engineering
University of Illinois at Urbana-Champaign

IE 400 Lecture 16 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved

Case Study Application: The


Sandmill Experiment

 In an investigation at an automotive plant that


was concerned with a process used to
manufacture vinyl film, it was suggested that
a new type of sandmill, a horizontal sandmill
could improve both productivity and quality.
 The plant currently employed vertical
sandmills to disperse pigments for the
production of colorants for the vinyl film.

IE 400 Lecture 16 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved


Case Study Application: The
Sandmill Experiment

 It was decided to perform some tests to assess the


superiority of the horizontal sandmill with respect to the
vertical sandmill, since the sandmill replacement cost
would be approximately $180,000. Arrangements were
made with the manufacturer for some trial runs on the
horizontal sandmill.
 Past experience with the vertical sandmill had shown
that grind fineness (measured using the Hegman scale
from 0/coarse to 10/fine) was affected by both the
temperature setting of the process and the flow rate.
IE 400 Lecture 16 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved

Case Study Application: The


Sandmill Experiment

 While targeted values had been established


some time in the past for these variables (as
a function of pigment type and pigment
concentration), it was not at all clear which
values to use for the horizontal machine.
 In addition, there was some concern about
the “goodness” of the target values actually
being used for the vertical machine.

IE 400 Lecture 16 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved


Case Study Application: The
Sandmill Experiment

 Therefore, it was decided to consider the two


process parameters (temperature and flow
rate) in addition to the machine type, pigment
type, and pigment concentration in an
experiment. The table below lists the variables and
variable levels that were chosen for the
experiments.

IE 400 Lecture 16 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved

Variables and Their Levels for


the Sandmill Experiment

Level

Variable Name Low (−) High (+)

1 Sandmill/ machine type Vertical Horizontal

2 Pigment type Blue Red

3 Pigment concentration (%) 10 15

4 Processing temperature (°F) 140 160

5 Flow rate (gal/hr) 20 30

IE 400 Lecture 16 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved


Design and Conduct of the
Experiment

 To consider this group of five factors in


accordance with a two-level full factorial design
scheme and run 32 tests without any replication
seemed prohibitive.
 Given the time constraints and the expense of
so many tests, it was decided to conduct a 25-1
fractional factorial design (i.e., a one-half
fraction design).

IE 400 Lecture 16 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved

A 25-1 Fractional Factorial Design

Test I 1 2 3 4 12 13 14 23 24 34 123 124 134 234 1234

1 + − − − − + + + + + + − − − − +

2 + + − − − − − − + + + + + + − −

3 + − + − − − + + − − + + + − + −

4 + + + − − + − − − − + − − + + +

5 + − − + − + − + − + − + − + + −

6 + + − + − − + − − + − − + − + +

7 + − + + − − − + + − − − + + − +

8 + + + + − + + − + − − + − − − −

IE 400 Lecture 16 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved


A 25-1 Fractional Factorial Design

9 + − − − + + + − + − − − + + + −

10 + + − − + − − + + − − + − − + +

11 + − + − + − + − − + − + − + − +

12 + + + − + + − + − + − − + − − −

13 + − − + + + − − − − + + + − − +

14 + + − + + − + + − − + − − + − −

15 + − + + + − − − + + + − − − + −

16 + + + + + + + + + + + + + + + +

IE 400 Lecture 16 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved

Design and Conduct of the


Experiment

 A fifth variable may be introduced into the structure by


assigning it to one of the interaction columns. To obtain
the highest design resolution, variable 5 was introduced
using the 1234 interaction column.
 If the +/− signs in the design matrix are replaced by the
actual levels for the variables, we obtain the recipes for
each test condition. The test conditions in the design
matrix were performed in a random order (using the
indicated run order) and the Hegman values displayed
in Table were obtained.

IE 400 Lecture 16 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved


Design Matrix

Test I 1 2 3 4 5
1 + − − − − +
2 + + − − − −
3 + − + − − −
4 + + + − − +
5 + − − + − −
6 + + − + − +
7 + − + + − +
8 + + + + − −
9 + − − − + −
10 + + − − + +
11 + − + − + +
12 + + + − + −
13 + − − + + +
14 + + − + + −
15 + − + + + −
16 + + + + + +

IE 400 Lecture 16 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved

Test Condition Recipes

Test 1: Machine Type 2: Pigment Type 3: Pigment Conc. (%) 4: Processing Temp. (°F) 5: Flow Rate (gal/hr) y: Hegman Value Run Order
1 Vertical Blue 10 140 30 6.25 5
2 Horizontal Blue 10 140 20 6.25 11
3 Vertical Red 10 140 20 7.75 8
4 Horizontal Red 10 140 30 6.75 2
5 Vertical Blue 15 140 20 6.25 13
6 Horizontal Blue 15 140 30 5.25 10
7 Vertical Red 15 140 30 6.75 4
8 Horizontal Red 15 140 20 6.75 14
9 Vertical Blue 10 160 20 7.00 6
10 Horizontal Blue 10 160 30 5.25 12
11 Vertical Red 10 160 30 8.00 16
12 Horizontal Red 10 160 20 7.00 1
13 Vertical Blue 15 160 30 5.50 15
14 Horizontal Blue 15 160 20 5.25 7
15 Vertical Red 15 160 20 7.75 3
16 Horizontal Red 15 160 30 6.50 9

IE 400 Lecture 16 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved


Confounding Structure and Linear
Combinations of Effects
 As has been noted, the 25-1 fractional factorial design described
above was obtained by introducing variable 5 through the 1234
interaction column. Hence, the generator and defining relation for
this design is
 I = 12345.
As can be seen, this design is of resolution V since the
length of the smallest word in the defining relation is five
(excluding I).
Therefore, at least some of the main effects (in this case all) will
be confounded with four-factor interactions, and at least some of
the two-factor interactions (in this case all) will be confounded with
three factor interactions.

IE 400 Lecture 16 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved

Linear Effects

 Given the defining relation, the patterns of


confounded effects may be obtained by
multiplying the column headings in the base
design calculation matrix (I, 1, 2, 3, 4, 12, 13,
…, 1234) by every term in the defining
relation.
 For example, for the column denoted 1, we
have
 (1)I = (1)12345 or 1 = 2345.

IE 400 Lecture 16 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved


Alias Structure

 As another example, for the column denoted


123, we have
 (123)I = (123)12345 or 123 = 45.
 The complete set of linear combinations of
confounded effects is then
 l0 estimates mean + (1/2)(12345)
 l1 estimates 1 + 2345
 l2 estimates 2 + 1345
 l3 estimates 3 + 1245
 l4 estimates 4 + 1235
IE 400 Lecture 16 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved

Complete Alias Structure

 l12 estimates 12 + 345


 l13 estimates 13 + 245
 l14 estimates 14 + 235
 l23 estimates 23 + 145
 l24 estimates 24 + 135
 l34 estimates 34 + 125
 l123 estimates 123 + 45
 l124 estimates 124 + 35
 l134 estimates 134 + 25
 l234 estimates 234 + 15
 l1234 estimates 1234 + 5.
IE 400 Lecture 16 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved
Alias Structure

 Under the assumption that three-factor


interactions and higher are negligible, this set
of linear combinations reduces to:
 l0 estimates mean l23 estimates 23
 l1 estimates 1 l24 estimates 24
 l2 estimates 2 l34 estimates 34
 l3 estimates 3 l123 estimates 45
 l4 estimates 4 l124 estimates 35
 l12 estimates 12 l134 estimates 25
 l13 estimates 13 l234 estimates 15
 l14 estimates 14 l1234 estimates 5
IE 400 Lecture 16 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved

Estimation of Effects

 The estimates (li’s) associated with the linear


combinations of confounded effects are
obtained by multiplying each column of plus
and minus signs in the base design
calculation matrix by the column of responses
(Hegman values), summing, and then
dividing by 8 (for the average, divide by 16).
As an example, consider the calculation of
the estimate l1:

IE 400 Lecture 16 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved


Estimation of Effects

Column 1   Column y
−          6.25
+          6.25
−          7.75
+          6.75
−          6.25
+          5.25
−          6.75
+          6.75
−          7.00
+          5.25
−          8.00
+          7.00
−          5.50
+          5.25
−          7.75
+          6.50

l1 = (−6.25 + 6.25 − 7.75 + 6.75 − 6.25 + 5.25 − 6.75 + 6.75 − 7.00
      + 5.25 − 8.00 + 7.00 − 5.50 + 5.25 − 7.75 + 6.50) / 8 = −0.78125
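 The same arithmetic in code form (an added sketch; the signs are the column-1 entries of the design matrix and the data are the Hegman values):

```python
signs_1 = [-1, 1, -1, 1, -1, 1, -1, 1, -1, 1, -1, 1, -1, 1, -1, 1]
hegman = [6.25, 6.25, 7.75, 6.75, 6.25, 5.25, 6.75, 6.75,
          7.00, 5.25, 8.00, 7.00, 5.50, 5.25, 7.75, 6.50]

l1 = sum(s * y for s, y in zip(signs_1, hegman)) / 8
print(l1)   # -0.78125
```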

IE 400 Lecture 16 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved

Estimation of Effects

 The estimates are summarized as follows:


 l0 = 6.51563  estimates mean
 l1 = -0.78125  estimates 1
 l2 = 1.28125  estimates 2
 l3 = -0.53125  estimates 3
 l4 = 0.03125  estimates 4
 l12 = -0.03125  estimates 12
 l13 = 0.15625  estimates 13
 l14 = -0.28125  estimates 14
 l23 = 0.09375  estimates 23

IE 400 Lecture 16 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved


Estimation of Effects

 l24 = 0.28125  estimates 24


 l34 = -0.03125  estimates 34
 l123 = 0.03125  estimates 45
 l124 = -0.03125  estimates 35
 l134 = 0.15625  estimates 25
 l234 = 0.09375  estimates 15
 l1234 = -0.46875  estimates 5

IE 400 Lecture 16 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved

Normal Probability Plot of Effect


Estimates

IE 400 Lecture 16 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved


Summary

 An examination of the normal probability plot


reveals that the following effects are
important (under the assumption that three-
factor and higher-order interactions are
negligible):
 1, 2, 3, 5, 14, 24.
 These main and two-factor interaction effects
may be interpreted as follows:

IE 400 Lecture 16 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved

Summary

1. Main Effect of Machine Type (Variable 1).


The vertical machine, on average, yields Hegman
values 0.78125 higher than the horizontal machine.
This effect was large relative to the other effects.
2. Main Effect of Pigment Type (Variable 2).
Blue pigment, on average, yields Hegman values
1.28125 lower than red pigment. Blue pigment is
known to be generally more difficult to grind.

IE 400 Lecture 16 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved


Summary

3. Main Effect of Pigment Concentration


(Variable 3).
A 10% pigment concentration, on average,
yields Hegman values 0.53125 higher than a
concentration of 15%.
4. Main Effect of Flow Rate (Variable 5).
A 20 gal/hr flow rate, on average, gives Hegman
values 0.46875 higher than a flow rate of 30
gal/hr.

IE 400 Lecture 16 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved

Summary

5. Two-Factor Interaction: Machine by


Temperature (14 Interaction).
This interaction indicates that the vertical
machine performs somewhat better at higher
temperature, while the horizontal machine
performs somewhat better at lower
temperatures. However, the low-temperature
horizontal mill results were poorer than either
high- or low-temperature vertical mill results.

IE 400 Lecture 16 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved


Machine by Temperature Interaction

Temperature 160°F: Vertical 7.06, Horizontal 6.00
Temperature 140°F: Vertical 6.75, Horizontal 6.25
(cell entries are mean Hegman values)
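 Each cell in this two-way table is the average of the four Hegman values observed at that machine × temperature combination, as the following sketch (added for illustration) shows:

```python
machine = [-1, 1, -1, 1, -1, 1, -1, 1, -1, 1, -1, 1, -1, 1, -1, 1]  # column 1
temp    = [-1] * 8 + [1] * 8                                        # column 4
hegman  = [6.25, 6.25, 7.75, 6.75, 6.25, 5.25, 6.75, 6.75,
           7.00, 5.25, 8.00, 7.00, 5.50, 5.25, 7.75, 6.50]

for t, t_name in ((1, "160F"), (-1, "140F")):
    for m, m_name in ((-1, "Vertical"), (1, "Horizontal")):
        cell = [y for y, mi, ti in zip(hegman, machine, temp)
                if mi == m and ti == t]
        print(t_name, m_name, round(sum(cell) / len(cell), 2))
# 160F Vertical 7.06, 160F Horizontal 6.0, 140F Vertical 6.75, 140F Horizontal 6.25
```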

IE 400 Lecture 16 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved

Summary

6. Two-Factor Interaction: Pigment Type by


Temperature (24 Interaction).
This interaction indicates that for the blue
pigment, better Hegman values are obtained at
the low level of temperature, while for the red
pigment, the high level of temperature is better.
This result is important in that it begins to
provide information of value with respect to the
optimization of the process.

IE 400 Lecture 16 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved


Pigment Type by Temperature
Interaction

Temperature 160°F: Blue 5.75, Red 7.31
Temperature 140°F: Blue 6.00, Red 7.00
(cell entries are mean Hegman values)

IE 400 Lecture 16 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved

Summary

 Based on the results of this experiment, there appears


to be no advantage in using the horizontal sandmill in
place of the vertical sandmill. In fact, the vertical sandmill
appears to be superior.
 Furthermore, it was learned that a lower pigment
concentration actually produces higher/better Hegman
values. This result is important because a reduction in
the pigment concentration lowers the raw material costs.
 Further study of these two processing variables is
needed from a process optimization standpoint.

IE 400 Lecture 16 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved


Another Case Study

Objective
To study the effect of groove
geometry on the twist drill
performance.

IE 400 Lecture 16 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved

CASE STUDY: TWO-LEVEL FRACTIONAL


FACTORIAL DESIGNS

 80 mm deep holes were drilled in the absence


of cutting fluids.

 5 variables were identified as being critical for


the performance of the drill.

 The variables along with their levels are


mentioned in the Table 1.

IE 400 Lecture 16 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved


CASE STUDY: TWO-LEVEL FRACTIONAL FACTORIAL DESIGNS

Table 1 Variables and Their Levels


Variable Low(-1) High(+1)
1. Speed (S in rpm) 800 1100
2. Feed (F in mm/rev) 0.14 0.18
3. Width of groove (Wg in mm) 1.3 1.7
4. Angle of groove (γ in deg) 20 30
5. Height of groove (Hg in mm) 0.7 0.9

IE 400 Lecture 16 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved

CASE STUDY: TWO-LEVEL FRACTIONAL FACTORIAL DESIGNS

Figure 1. Groove of the Drill  γ is the angle of the groove


leading edge from the cutting lip
of the drill;
 d1 & d2 are the perpendicular
distances of the groove leading
edge from the inner & outer
corner of the cutting lip;
 O is the point where the two
cutting lips meet when extended;
 AA’ is the groove cross section
 a & b are the radial distances of the
inner & outer corners of the cutting
lip from O.
IE 400 Lecture 16 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved
CASE STUDY: TWO-LEVEL FRACTIONAL FACTORIAL DESIGNS

Figure 2. Geometry of the  C & C’ are the groove leading &


trailing edges
Groove
 Wg is the width of the groove
 Hg is the depth of the groove
 Rg is the radius of the groove
 Bg is the back wall height of the
groove
 φ is the groove entry angle
 L is the land length

IE 400 Lecture 16 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved

CASE STUDY: TWO-LEVEL FRACTIONAL FACTORIAL DESIGNS

Figure 3. Critical Depth


 Critical depth is
defined as the
depth of hole at
which sudden
increases in
force and torque
occur while
drilling.

IE 400 Lecture 16 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved


25-1 FRACTIONAL FACTORIAL DESIGN

 A 25-1 fractional factorial design with design


generator 5 = 1234 was used

 Response Parameters: Critical Depth

 Experiments were conducted and the


response in terms of critical depth for each
experiment was collected.
Table 2 gives the design matrix and the response
for each experiment.
IE 400 Lecture 16 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved



25-1 FRACTIONAL FACTORIAL DESIGN

Table 2. Design Matrix


Test # X1: S (rpm) X2: F (mm/rev) X3: Wg (mm) X4: γ (°) X5: Hg (mm) Y: Dc (mm)
1 -1 -1 -1 -1 1 41
2 1 -1 -1 -1 -1 53
3 -1 1 -1 -1 -1 65
4 1 1 -1 -1 1 50
5 -1 -1 1 -1 -1 42
6 1 -1 1 -1 1 48
7 -1 1 1 -1 1 42
8 1 1 1 -1 -1 51
9 -1 -1 -1 1 -1 58
10 1 -1 -1 1 1 80
11 -1 1 -1 1 1 80
12 1 1 -1 1 -1 80
13 -1 -1 1 1 1 54
14 1 -1 1 1 -1 47
15 -1 1 1 1 -1 51
16 1 1 1 1 1 66

IE 400 Lecture 16 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved

25-1 FRACTIONAL FACTORIAL DESIGN

Effect Estimates

 Table 3 gives the effect estimates for the


response parameter

 Three-factor interactions are assumed negligible, so in Table 3
each three-factor column is read as an estimate of its two-factor
alias (e.g., L123 estimates 45).

IE 400 Lecture 16 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved


25-1 FRACTIONAL FACTORIAL DESIGN
Table 3. The effect estimates for the response parameter
Effects Response = Critical Depth
L1 = 1 5.25
L2 = 2 7.75
L3 = 3 -13.25
L4 = 4 15.5
L12 = 12 -3
L13 = 13 0.5
L14 = 14 2.25
L23 = 23 -3
L24 = 24 1.75
L34 = 34 -6.75
L123 = 45 9.25
L124 = 35 3
L134 = 25 -4
L234 = 15 1.5
L1234 = 5 1.75
IE 400 Lecture 16 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved

RESPONSE: CRITICAL DEPTH

Figure 4. Normal Probability Plot of Critical Depth

IE 400 Lecture 16 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved


RESPONSE: CRITICAL DEPTH

 As can be seen from Figure 4, with critical


depth as the response parameter, the effects
due to the width of the groove and angle of
the groove come out to be significant.

 While the width of the groove has a high


negative effect on the critical depth, the angle
of the groove has a positive effect.

IE 400 Lecture 16 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved

RESPONSE: CRITICAL DEPTH

Figure 5. Effect of Angle of Groove on Critical Depth


 It is clear that as the
angle of groove is
increased from low to
high, the critical depth
increases by about
35%.
 Higher angles of the
grooves will lead to
greater critical depths
thereby indicating less
chip clogging.

IE 400 Lecture 16 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved


RESPONSE: CRITICAL DEPTH

Figure 6. Effect of Width of Groove on Critical Depth


 With decrease in
width of the groove
from 1.7 mm to 1.3
mm, the critical depth
increases by about
30%.

IE 400 Lecture 16 @ 2006 Dr. Shiv G. Kapoor All Rights Reserved


Session 6
Philosophy of Taguchi & Taguchi Loss
Function
Prof. Suhas S. Joshi
Session #6: Philosophy of Taguchi and Taguchi Loss
Function

Dr. Suhas S. Joshi


Professor
Department of Mechanical Engineering
Indian Institute of Technology, Bombay,
Powai, MUMBAI – 400 076 (India)
Phone: 022 2576 7527 (O) / 2576 8527 (R)
Email: ssjoshi@iitb.ac.in

Taguchi (Robust Design) Philosophy


• It is known that the objective of any R&D activity is to generate drawings,
specifications, and other relevant information needed to manufacture
products that meet customer requirements.
• It uses scientific knowledge and the past engineering experience to
design a product that has low cost and high quality.
• There are many configurations,
processes and parameters
involved in the product or
process design.
• One of the ways of improving
the productivity of R&D so that
high quality products are
generated quickly and at low cost
is the use of the Taguchi or Robust
Design Method.

Typical R&D activities [1]


Taguchi (Robust Design) Philosophy
• Japan began the reconstruction of the country after World War II, when it
faced an acute shortage of good quality raw material, high-quality
manufacturing equipment and skilled manpower.
• Dr. Genichi Taguchi, who was a manager in the Nippon Telephone and
Telegraph Corporation, was given the task of developing certain
telecommunication products.
• Between 1950 and 1960 he developed the robust design methodology,
which was later called the ‘Taguchi Method’. He validated its basic
philosophy by applying it to many products. In recognition of his
contribution, he was awarded the Deming Award in 1962, one of the highest
recognitions in the field of quality.

Taguchi (Robust Design) Philosophy


• What are Taguchi (Robust Design) Methods?
• It is a technique which draws many ideas from statistical experimental
design to plan experiments to obtain reliable information about the
decision making variables.
• But it addresses two main issues [1]:
– How to reduce economically the variation of a product’s function in the
customer’s environment that is to achieve product’s function consistently on
target.
– How to ensure that decisions found to be optimum during laboratory
experiments will prove to be so in manufacturing and in customer
environments.
Taguchi (Robust Design) Philosophy
• Taguchi defines three costs that must be considered while designing
process or product; these are –
– Operating Cost: it is the cost of energy required to operate the
product, environmental control, maintenance, inventory of spare parts
and units, etc. If the product is sensitive to the environment, the operating
cost is high (e.g., paint of buildings in Mumbai). If the product fails often,
maintenance and inventory costs are high.
– Manufacturing Cost: It includes the cost of equipment, machinery, raw
materials, labor, scrap, rework, etc.
– R&D Cost: The time taken to develop a new product and the
amount of engineering and laboratory resources needed are the major
elements of R&D cost. The goal of R&D is to minimize the unit
manufacturing cost and the operating cost.
• Of the above costs, mfg and R&D costs are incurred by the producer and
are passed on to the customer through purchase price.
• But the operating cost is directly borne by the customer. From the customer’s
viewpoint, the operating cost together with the manufacturing and R&D costs determines the quality.

Taguchi (Robust Design) Philosophy


• According to Taguchi, the quality of a product is measured in terms of the
total loss to society due to functional variation and harmful side effects.
• If a car breaks down in the street then it causes inconvenience not only to
the customer but many on the road who were not concerned about it.
• Thus, a poor quality product causes many losses to society. So, the greater
the loss, the poorer the quality.
• A tile manufacturing company in Japan faced a problem of non-uniformity in
the dimensions of tiles baked in a kiln, since the tiles at the center are exposed
to a lower temperature than those at the periphery.
• Redesign cost of the kiln was enormous.
• Investigators started searching for low cost
solution.
• They found that increasing lime content of the clay
from 1 to 5 percent reduces the variation in tile
dimensions. Since lime was the least expensive
ingredient, the solution was the cheapest one. Tile manufacturing [1]
Taguchi (Robust Design) Philosophy
• Thus, philosophy of Taguchi Methods is to minimize the cause of the
variation (non-uniform temperature distribution) without controlling the
cause itself (kiln design).
• It is not always necessary to eliminate the cause of variation altogether,
but its effect on variation in the quality can be minimized.
• Therefore, the objective here is to improve the quality of a product or
process by optimizing it such that its performance is minimally sensitive to
the various causes of variation.

Tile manufacturing [1]

Taguchi (Robust Design) Philosophy


• Important tools used in Taguchi Methods are:
– Measurement of quality during design and development: It involves
identification of leading indicator of quality. This is achieved by
defining ‘signal-to-noise’ ratio.
– Efficient experimentation to find reliable information about the design
parameters. This is achieved by using ‘orthogonal arrays’ for
experimentation so that necessary information is obtained with
minimum time and resources.
Taguchi (Robust Design) Method
• Taguchi Methods involve a number of steps:
– Definition of quality (loss) function
– Identification of Noise variables
– Identification of Control variables

Taguchi Loss Function


• As mentioned earlier, Taguchi defined quality level in terms of total loss
incurred by the society due to failure of the product to deliver the target
(intended) performance.
• A common measure to quantify the loss is in terms of fraction of total
number of defectives, called fraction defective.
• This method says that the products just inside the specification limit are
good whereas those outside are bad.
• But products closer to the mean give the best performance, and
performance degrades as the product quality deviates from the mean.
• The product progressively becomes worse.
Taguchi Loss Function
• Example of quality is given by Sony TVs made in Japan and USA.
• Both companies manufacture products within the specification limits.
• The quality is measured in terms of color density.
• Sony-Japan manufactures more products closer to the mean than Sony-
USA.
• Sony-Japan manufactures
more A grade sets than B
or C. But Sony-USA
manufactures almost all
grades equally.
• Therefore, the average
grade of Sony-Japan is
better than Sony-USA.
• Thus, meeting
specifications is not
sufficient, but meeting
target is more important. Distribution of color density in TV sets [1]

Taguchi Loss Function


• Another example of target specific manufacturing is distribution of
resistance of telephone cables.
• The resistance varies between m±∆0.
• By improving the wire drawing process through new technology, the
manufacturer was able to reduce the process variance substantially (b).
• But the manufacturer has moved
closer to the upper specified limits.
• While the manufacturer saves a lot
of cost by reducing the number of
defectives (rejection), larger
resistance of the wires caused
more average electrical loss
causing complaints from the
customers.
• So, savings on Mfg. cost are not
sufficient, operating cost is
important too. Therefore, being on Distribution of wire resistance [1]
target (mean) is crucial.
Taguchi Loss Function
• The usual definition of the quality loss is by means of a step function,
where specifications are written as m±∆0.
• This means that all the products within m + ∆0 and m − ∆0 are equally good.
But as soon as the limit is exceeded, they are rejected. Therefore, the
loss function is defined as

L(y) = 0 if |y − m| ≤ ∆0;  L(y) = A0 otherwise

• If y is the quality characteristic with m as the target value for y, the
quadratic loss function is given by

L(y) = k(y − m)²

where k is called the quality loss coefficient.
• At y = m, the loss is 0. The loss increases as y deviates from the mean.

Quality loss function [1]

Taguchi Loss Function
• If the loss at y = m ± ∆0 is A0, then substituting into the quadratic loss
function we get

    k = A0 / ∆0²

• After substituting k, the quadratic loss function becomes

    L(y) = (A0 / ∆0²) (y − m)²

• Example: for a TV set, let the colour density specification be m ± 7 and the
repair cost of a set be Rs 98, i.e. A0 = 98. Substituting into the above
equation, the loss function becomes

    L(y) = (98 / 7²) (y − m)² = 2 (y − m)²

For y = m, the loss is 0.
For y = m + 7, L(y) = 98.
For y = m + 2, L(y) = 8.
For y = m − 3, L(y) = 18.
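
A minimal numerical sketch of this calculation (Python; m, ∆0 = 7 and A0 = 98 are
the values from the example above, and the function name is illustrative):

    def quadratic_loss(y, m, delta0=7.0, A0=98.0):
        # Taguchi quadratic loss L(y) = (A0 / delta0**2) * (y - m)**2
        k = A0 / delta0 ** 2            # quality loss coefficient; here 98/49 = 2
        return k * (y - m) ** 2

    m = 0.0                             # any reference target value
    for dev in (0, 7, 2, -3):           # deviations used in the slide example
        print(dev, quadratic_loss(m + dev, m))   # -> 0, 98, 8, 18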

Taguchi Loss Functions
• There are at least four variants of the quadratic loss function; see the
adjoining figure.
• Nominal-the-better type: here the ideal value is the target m.

    L(y) = k (y − m)²

• Smaller-the-better type: here the ideal value is zero, e.g. radiation leakage.

    L(y) = k y²

• Larger-the-better type: here the ideal value is as large as possible,
e.g. the strength of a weld.

    L(y) = k (1 / y²)

• Asymmetric loss function:

    L(y) = k1 (y − m)² for y > m, and L(y) = k2 (y − m)² for y ≤ m.

Quality loss functions [1]
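
A compact sketch of the four variants (Python; k, k1, k2 and m are illustrative
coefficients, not values from the slides):

    def loss(y, kind, k=1.0, m=0.0, k1=1.0, k2=1.0):
        # Quadratic quality-loss variants described above.
        if kind == "nominal":       # nominal-the-better, target m
            return k * (y - m) ** 2
        if kind == "smaller":       # smaller-the-better, target 0
            return k * y ** 2
        if kind == "larger":        # larger-the-better, target as large as possible
            return k / y ** 2
        if kind == "asymmetric":    # different penalties above and below m
            return (k1 if y > m else k2) * (y - m) ** 2
        raise ValueError(kind)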
References

1. Madhav S. Phadke, Quality Engineering Using Robust Design, P.T.R. Prentice Hall, Englewood Cliffs, New Jersey, 1989.
Session 7
Steps in Taguchi Methods
Prof. Suhas S. Joshi
Session #7: Steps in Taguchi Method

Dr. Suhas S. Joshi


Professor
Department of Mechanical Engineering
Indian Institute of Technology, Bombay,
Powai, MUMBAI – 400 076 (India)
Phone: 022 2576 7527 (O) / 2576 8527 ®
Email: ssjoshi@iitb.ac.in

Steps in Taguchi Method


• Robust design is a methodology for finding the optimum settings of the
control factors to make the product or process insensitive to noise
factors. It involves eight steps [1] –
– Identify main function, side effect and failure modes
– Identify noise factors and testing conditions for evaluating the quality
loss
– Identify quality characteristic to be observed and the objective
function to be optimized
– Identify control factors and their alternate levels
– Design the matrix experiment and define the data analysis procedure
– Conduct the experiment
– Analyze the data and determine optimum levels for control factors
and predict performance under these levels.
– Conduct the verification experiment and plan future actions.
Main Function Identification
• Designing a process or product is a complex activity. It involves three
essential elements –
– Design of system architecture
– Design of nominal values of all parameters of the system
– Design of tolerance or the allowable variation in each parameter
• Optimizing a product or process means determining the best
architecture, the best parameter values and the best tolerances.
• The optimization strategy could be:
– To minimize manufacturing cost while delivering the same quality;
here the supplier can increase his profit margin.
– To minimize quality loss while keeping the manufacturing cost the same;
here the supplier builds a reputation for quality.
– To minimize the sum of the quality loss and the manufacturing cost; this
is the strategy for the best utilization of the combined resources of the
supplier and the customer.

Main Function Identification


• But it is difficult to define a single objective function encompassing all
these costs. Therefore the following three-step strategy is followed:
• Concept Design
• Parameter Design
• Tolerance Design
• In the first step, a number of architectures and technologies for achieving
the desired function of the product are identified, e.g. selecting an
appropriate sequence of manufacturing steps, or selecting an appropriate
circuit diagram.
• In the second step, the best settings of the control factors are found
without affecting the manufacturing cost. Here we minimize the
sensitivity of the function to the noise and also bring the mean
function onto target. Initially, a low manufacturing cost, i.e. low-grade
material, is assumed; if at the end the quality loss is within the
specification limit, then we already have the solution.
Main Function Identification
• In the third step, the trade-off is between the reduction in quality loss due
to performance variation and the increase in manufacturing cost. The
tolerances are selectively tightened, higher-grade materials are selectively
specified, and the cost-effectiveness of the process is judged.

Classification of Parameters
• A number of parameters can influence the quality characteristic, or
response, of the product. These parameters are of three types [1]:
1. Signal factors
2. Noise factors
3. Control factors
• Signal factors (M): these are the parameters set by the user or operator
of the product to express the intended value of the product's response.
For example, the speed setting on a table fan is the signal factor that
specifies the amount of breeze. The signal factors are selected by the
design engineer based on engineering knowledge of the product being
developed.

P Diagram
Classification of Parameters
• Noise factors (x): these factors cannot be controlled by the designer.
Three classes of noise factors are defined a little later. Only
statistical characteristics of the noise factors, such as their mean and
variance, can be known or specified; their actual values are not known.
• The noise factors cause the response y to deviate from the target
specified by the signal factor M and thereby lead to quality loss.

• Control factors (z): these parameters can be specified by the designer,
and it is the designer's responsibility to determine their best values.
• Each control factor can take multiple values, called levels. When the
levels of the control factors are changed, the manufacturing cost does
not change; when the levels of other kinds of factors are changed, the
manufacturing cost changes as well.

Quality Characteristics & Objective Function
• It is often tempting to take the percentage of good units, i.e. units that
meet the specifications, as the objective function to be optimized.
• But this is not a good measure of quality loss. A function that captures
the variation of the quality characteristic from the target is a better
objective function.
• The life cycle of a product has four basic stages: 1. product design,
2. manufacturing process design, 3. manufacturing, and 4. customer usage.
• The quality control activities during process and product design are
called 'off-line' quality control, while those during manufacturing are
called 'on-line' quality control.
• Various quality control activities during product realization are indicated
in the table on the next page.
Quality control activities during product realization steps [1]

Noise Factors or Causes of Variation
• The performance of a product, measured in terms of its quality
characteristics, varies in the field due to a variety of causes. All these
causes are called noise factors; they are of three types [1]:
• External: the environment in which a product works and the load to which
it is subjected are the two main sources of this variation.
– Some of the environmental factors are temperature, humidity, dust, supply
voltage, electromagnetic interference, vibration, and human error in
operating the product.
– The number of tasks to which a product is subjected simultaneously and the
period of time for which it is exercised continuously are the load-related
noise factors.
• Unit-to-unit variation: this type of variation is inevitable in a
manufacturing process and eventually leads to variation in product
quality from unit to unit.
• Deterioration: at the beginning a product may function on target, but
as time passes some components may deteriorate.
Noise Factors or Causes of Variation
• Key noise factors in a car:
• External: dry, humid and wet weather conditions; cement, tar or soil
roads; the weight of passengers it carries, the distance it runs in a single
journey, the speed at which it runs.
• Unit-to-unit variation: clearance between pistons and cylinders, amount of
splash lubrication, variation in the friction coefficient between brake pads
and drums.
• Deterioration: wear of piston rings and liners, leakage of oil from gaskets,
wear of brake drums and pads.

• The noise factors for different quality characteristics could be different. In


such cases, different quality characteristics must be optimized
separately.

Noise Factors or Causes of Variation
• Average quality loss: the quality characteristic y of a product varies
from unit to unit and from time to time during the use of the product.
• Suppose the distribution of y resulting from the various noise sources is as
shown in the figure, and let y1, y2, ..., yn be representative measurements of
quality taken on a few representative units throughout the life of the product.
• Let y be a nominal-the-best type quality characteristic with target value m.
The average loss Q resulting from this product is

    Q = (1/n) [ L(y1) + L(y2) + ... + L(yn) ]
      = (k/n) [ (y1 − m)² + (y2 − m)² + ... + (yn − m)² ]
      = k [ (µ − m)² + ((n − 1)/n) σ² ]

where µ = (1/n) Σ yi and σ² = (1/(n − 1)) Σ (yi − µ)².
• For large n, this becomes

    Q = k [ (µ − m)² + σ² ]
Noise Factors or Causes of Variation
• For large n, we get

    Q = k [ (µ − m)² + σ² ]

• Thus the average quality loss has the following two components:
– k (µ − m)², resulting from the deviation of the average value of y from the
target, and
– k σ², resulting from the mean squared deviation of y around its own mean.
• Of the two, the first one is easier to control.
• The second one, decreasing the variance, is more difficult.

Loss function [1]
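
A small sketch of the large-n form of this computation (Python; the data values
are hypothetical):

    def average_quality_loss(ys, m, k=1.0):
        # Q = k[(mu - m)^2 + sigma^2], using the 1/n (large-n) variance
        n = len(ys)
        mu = sum(ys) / n
        var = sum((y - mu) ** 2 for y in ys) / n
        return k * ((mu - m) ** 2 + var)

    ys = [9.8, 10.4, 10.1, 9.7, 10.3]     # hypothetical measurements
    print(average_quality_loss(ys, m=10.0))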

Noise Factors or Causes of Variation
• Methods of reducing variance include:
– Screening out bad products by inspection
– Discovering the cause of malfunction and eliminating it
– Applying the Taguchi method, which involves making the product insensitive
to the noise factors.
Example of a Taguchi Experiment
• A Chemical Vapor Deposition (CVD) process has four important
parameters that control surface defects [1]. These are:
1. Temperature (A)  2. Pressure (B)
3. Settling time (C)  4. Cleaning method (D)
• The objective is to determine the best factor settings that minimize surface
defects.

Factor levels [1]

Example of a Taguchi Experiment


• The experiment was conducted as per the L9 orthogonal array as shown
in the table below [1]:
Example of a Taguchi Experiment
• The L9 array chosen for the experiment is a standard orthogonal array.
• In this array, the numbers 1, 2, 3 indicate the levels of the respective parameters.
• In the array, for any pair of columns, all combinations of factor levels
occur, and they occur an equal number of times.
• This is called the balancing property of the array, and such an array is
called orthogonal.
• The last column of the table is computed from the surface defect count per
unit area. It is a summary statistic ηi, defined for experiment i as

    ηi = −10 log10 (mean square of the defect count for experiment i)

where the mean square refers to the average of the nine observations in
experiment i.
• This summary statistic is called the 'signal-to-noise' (S/N) ratio.
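
A minimal sketch of the S/N computation (Python; the observation values are
hypothetical):

    import math

    def sn_ratio(observations):
        # eta = -10 log10(mean square of the observed defect counts)
        ms = sum(y ** 2 for y in observations) / len(observations)
        return -10.0 * math.log10(ms)

    print(sn_ratio([3, 1, 2]))    # hypothetical defect counts for one experiment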

Example of a Taguchi Experiment
• The effect of a factor level is defined as the deviation it causes from the
overall mean.
• The overall mean is given by

    m = (1/9) Σ ηi = (1/9) [η1 + η2 + ... + η9]

• Factor A is at level 3 in experiments 7, 8 and 9, so the average S/N ratio
for these experiments is

    mA3 = (1/3) [η7 + η8 + η9]

• The effect of temperature at level A3 is then (mA3 − m), and the effect of
temperature at level A2 is (mA2 − m), where

    mA2 = (1/3) [η4 + η5 + η6]
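
A sketch of these level means in code (Python; the nine η values below reproduce
the totals quoted later in these slides, m ≈ −41.67 dB and a total sum of squares
of 3800, but should otherwise be read as illustrative):

    eta = [-20, -10, -30, -25, -45, -65, -45, -65, -70]   # assumed run results (dB)

    m = sum(eta) / 9                                  # overall mean, about -41.67 dB
    m_A = [sum(eta[i:i + 3]) / 3 for i in (0, 3, 6)]  # A1, A2, A3 means (runs 1-3, 4-6, 7-9)
    effects_A = [mA - m for mA in m_A]                # deviations from the overall mean
    print(m, m_A, effects_A)                          # -> -41.67, [-20, -45, -60], ...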
Example of a Taguchi Experiment
• All the factor effects can be plotted in the form of a means plot, as below.
• The optimum level of each factor is the one whose deviation from the overall
mean improves η the most; see the adjoining table.

Means Plot or Analysis of Means [1]   Final results; optimum levels are indicated by * [1]

The optimum combination is A1B1C2D2 or A1B1C2D3.

Example of a Taguchi Experiment
• Additive model:
• The actual relationship between the process parameters and the response
variable η can be quite complex, but we assume that it can be approximated
by an additive model:

    η(Ai, Bj, Ck, Dl) = µ + ai + bj + ck + dl + e

• In this equation, µ is the overall mean and ai is the deviation from µ
caused by setting factor A at level Ai.
• Similarly, the deviation from µ caused by setting factor B at level Bj is bj.
• The term e stands for error: the error of the additive approximation plus
the error in the repeatability of measuring η for a given experiment.
• The additive model states that the total effect of several factors equals
the sum of the individual factor effects. The individual factor effects may
themselves be linear, quadratic or of higher order.
Example of a Taguchi Experiment
• By definition, a1, a2 and a3 are the deviations from µ caused by the three
levels of factor A. Thus

    a1 + a2 + a3 = 0

and similarly

    b1 + b2 + b3 = 0,  c1 + c2 + c3 = 0,  d1 + d2 + d3 = 0

• It can be seen that the averaging procedure for estimating factor effects is
equivalent to fitting the additive model by the least squares method.
• This is a consequence of using an orthogonal array to plan the matrix
experiment.

Example of a Taguchi Experiment
• Analysis of Variance:
• The various factors affect the surface defects to different degrees. The
relative magnitudes of their effects can be judged from the means plot.
• But a better feel for the relative effects of the different factors is
obtained by decomposing the variance; this method is called Analysis of
Variance (ANOVA).
• ANOVA is also needed for estimating the error variance of the factor
effects and the variance of the prediction error.
• It involves the computation of sums of squares.
• The grand sum of squares is given by

    Σ ηi² = (−20)² + (−10)² + ... + (−70)² = 19425 (dB)²
Example of a Taguchi Experiment
• This is decomposed into two parts, the sum of squares due to the mean and
the total sum of squares:
• Sum of squares due to mean = (number of experiments) × m²
  = 9 × (41.67)² = 15625 (dB)²
• The total sum of squares is

    Σ (ηi − m)² = (−20 + 41.67)² + (−10 + 41.67)² + ... + (−70 + 41.67)²
                = 3800 (dB)²

• Therefore, we get:
• Total sum of squares = (grand sum of squares) − (sum of squares due to mean):
  3800 = 19425 − 15625

Example of a Taguchi Experiment
• The sum of squares due to factor A is given by

    3 (mA1 − m)² + 3 (mA2 − m)² + 3 (mA3 − m)²
    = 3 (−20 + 41.67)² + 3 (−45 + 41.67)² + 3 (−60 + 41.67)²
    = 2450 (dB)²

• Similarly, the sums of squares due to the other factors can be estimated,
as shown in the table.
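
A sketch of the whole sum-of-squares decomposition (Python; the η values are the
same assumed run results used in the earlier sketch):

    eta = [-20, -10, -30, -25, -45, -65, -45, -65, -70]
    m = sum(eta) / 9

    grand_ss = sum(e ** 2 for e in eta)                 # 19425 (dB)^2
    ss_mean = 9 * m ** 2                                # 15625 (dB)^2
    total_ss = sum((e - m) ** 2 for e in eta)           # 3800  (dB)^2
    assert abs(total_ss - (grand_ss - ss_mean)) < 1e-6  # decomposition identity

    m_A = [sum(eta[i:i + 3]) / 3 for i in (0, 3, 6)]
    ss_A = sum(3 * (mA - m) ** 2 for mA in m_A)         # 2450 (dB)^2
    print(grand_ss, ss_mean, total_ss, ss_A)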
Example of a Taguchi Experiment
ANOVA table for η [1]

Factor               DOF   Sum of squares   Mean square   F ratio
A: Temperature        2        2450            1225        12.25
B: Pressure           2         950             475         4.75
C: Settling Time      2         350*            175
D: Cleaning Method    2          50*             25
Error                 0           0              --
Total                 8        3800
(Error)              (4)       (400)           (100)

* Indicates sums of squares added together to estimate the pooled error sum.
The F ratio is calculated using the pooled error mean square.

Example of a Taguchi Experiment
• Knowing the factor effects, i.e. the values of ai, bj, ck and dl, we can use
the additive model given earlier to calculate the error term ei for each
experiment i. The sum of squares due to error is the sum of the squares of
the error terms:

    Sum of squares due to error = Σ ei²

• In this study, the total number of model parameters (µ, a1, a2, a3, b1, b2,
etc.) is 13, and the number of constraints defined by the zero-sum equations
is 4. The number of model parameters minus the number of constraints equals
the number of experiments, i.e. 9. Hence the error term is zero in each of
the experiments, and therefore the sum of squares due to error is also zero.
This need not be the case in general.
Example of a Taguchi Experiment
• Inferences from ANOVA [1]:
• Factor A is responsible for (2450/3800) × 100 = 64.5 percent of the
variation of η.
• Factor B is responsible for (950/3800) × 100 = 25 percent of the variation of η.
• Factors C and D together are responsible for only about 10 percent of the
variation of η.

• Estimation of error variance [1]:
  (σe)² = (sum of squares due to error) / (degrees of freedom for error)
• Confidence intervals for factor effects are useful for judging the size of
the change caused by a factor level relative to the error standard deviation.
• The variance of a factor-level mean (an average of 3 observations) is
(1/3)(σe)² = (1/3)(100) = 33.3 (dB)².
• Thus the approximate 95% confidence interval is ±2 √33.3 = ±11.5 dB.
• In the means plot, these deviations are plotted about the mean levels of
each factor.
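
A one-screen check of this interval (Python; the pooled values are those in the
ANOVA table above):

    import math

    ss_error_pooled, dof_error = 400.0, 4       # pooled error from the ANOVA table
    var_e = ss_error_pooled / dof_error         # (sigma_e)^2 = 100 (dB)^2
    var_level_mean = var_e / 3                  # each level mean averages 3 runs
    half_width = 2 * math.sqrt(var_level_mean)  # approx. 95% interval: +/- 11.5 dB
    print(var_level_mean, half_width)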

Example of a Taguchi Experiment
• Prediction of η under optimum conditions: the objective of the Taguchi
method is to find the optimum conditions and predict the performance there.
• Earlier, the optimum conditions were identified as A1B1C2D2 or A1B1C2D3.
• Using the additive model, the value of η under the optimum conditions is

    ηopt = m + (mA1 − m) + (mB1 − m)
         = −41.67 + (−20 + 41.67) + (−30 + 41.67) = −8.33 dB

• This prediction does not include terms for factors C and D, since their
sums of squares are small and were pooled into the error.
• If we included them, the predicted improvement in η would exceed the
actually realized improvement; our prediction would be biased on the high
side. Excluding them avoids this bias.
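
The same prediction as a two-line computation (Python; the level means are the
values quoted above):

    m = -41.67                     # overall mean (dB)
    m_A1, m_B1 = -20.0, -30.0      # level means of the two dominant factors
    eta_opt = m + (m_A1 - m) + (m_B1 - m)
    print(eta_opt)                 # -> -8.33 dB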
Example of a Taguchi Experiment
• Validation experiment: after determining the optimum conditions and
predicting the response variable under these conditions, a verification
experiment is conducted.
• The results of the verification experiment are compared with the
predicted value of η.
• If the predicted and observed values are close to each other, we
conclude that the additive model is adequate for describing the
dependence of η on the various parameters.
• But if the observed value of η deviates drastically from the predicted
value, we conclude that the additive model is inadequate.
• Such a deviation is evidence of strong interaction among the parameters.

References

1. Madhav S. Phadke, Quality Engineering Using Robust Design, P.T.R. Prentice Hall, Englewood Cliffs, New Jersey, 1989.
Session 8
Matrix Experiment Design Using OAs
Prof. Suhas S. Joshi
Session #9: Matrix Experiment Design using Orthogonal
Arrays

Dr. Suhas S. Joshi


Professor
Department of Mechanical Engineering
Indian Institute of Technology, Bombay,
Powai, MUMBAI – 400 076 (India)
Phone: 022 2576 7527 (O) / 2576 8527 ®
Email: ssjoshi@iitb.ac.in

Matrix Experiment Design
• A matrix experiment consists of a set of experiments in which the settings
of various product or process parameters are changed from one experiment to
the next.
• After conducting a matrix experiment, the data from all the experiments are
analyzed to determine the effects of the various parameters.
• The matrix experiment is conducted using special matrices called
orthogonal arrays, which allow the effects of several parameters to be
determined efficiently.
• In statistical terminology, matrix experiments are called 'designed
experiments', the individual experimental conditions are called 'runs' or
'treatments', the settings of the parameters are called 'levels', and the
parameters themselves are called 'factors'.
Matrix Experiment Design
• Degrees of freedom:
• The first step in constructing an orthogonal array is to count the degrees
of freedom.
• This determines the minimum number of experiments that must be performed to
study all the chosen control factors.
• For a 3-level factor, the degrees of freedom are two, because two
comparisons are possible: factor A changing from A1 to A2, and from A1 to A3.
• In general, the degrees of freedom of a factor are one less than the number
of levels at which the factor is varied.
• For an interaction, the degrees of freedom are the product of the degrees of
freedom of the individual factors.
• The overall mean has one degree of freedom, regardless of the number of
control factors.

Matrix Experiment Design
• For example, suppose an experiment has five factors (A, B, C, D, E), each at
3 levels, one factor (F) at 2 levels, and one interaction (A×F). The total
degrees of freedom are:

Factor           DOF
Overall mean      1
A, B, C, D, E     5 × (3 − 1) = 10
F                 (2 − 1) = 1
A×F               (3 − 1)(2 − 1) = 2
---------------------------------------------------------------------
Total             14
---------------------------------------------------------------------
• Therefore, we must conduct at least 14 experiments to estimate the effect
of each factor and the desired interaction.
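
This counting rule is easy to mechanize; a minimal sketch (Python; the factor
names and levels are those of the example above):

    def total_dof(factor_levels, interactions=()):
        # Overall mean: 1 dof; each factor: (levels - 1);
        # each 2-factor interaction: product of the two factors' dofs.
        dof = 1 + sum(l - 1 for l in factor_levels.values())
        for f1, f2 in interactions:
            dof += (factor_levels[f1] - 1) * (factor_levels[f2] - 1)
        return dof

    levels = {"A": 3, "B": 3, "C": 3, "D": 3, "E": 3, "F": 2}
    print(total_dof(levels, interactions=[("A", "F")]))   # -> 14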
Orthogonal Array
• Selection of a standard orthogonal array:
• The orthogonal array should be selected so that it has at least as many
degrees of freedom as the chosen experiment requires.
• There are 18 standard orthogonal arrays.
• An array's name indicates the number of rows it has.
• The standard orthogonal arrays are indicated in the table.
• The number of rows in an array indicates the number of experiments that
need to be done.
• The number of columns of an array represents the maximum number of
factors that can be studied using that array.
• Also, to use a standard orthogonal array directly, we must be able to
match the number of levels of the factors with the number of levels of the
columns in the array.
• Since experimentation is usually expensive, we should try to use the
smallest possible array.

Orthogonal Array
• 2-level arrays: L4, L8, L12, L16, L32, L64.
• 3-level arrays: L9, L27, L81.
• Mixed 2- and 3-level arrays: L18, L36, L'36, L54.
• There are ways in which the standard orthogonal arrays can be modified to
suit our requirements.

L18 Orthogonal array

Orthogonal Array

Typical orthogonal arrays

L8 Orthogonal array

Orthogonal Array and Linear Graphs
• Usually, in the Taguchi Method, we do not consider situations where the
interactions between factors are significant.
• But the orthogonal arrays do permit evaluation of a few selected
interactions, by using linear graphs.
• Linear graphs represent the interaction information graphically and make
it easy to assign factors and interactions to the various columns of an
orthogonal array.
• A linear graph consists of dots and lines.
• A line connecting two dots indicates that the interaction of the two
columns represented by the dots is confounded with (contained in) the
column represented by the line joining them.
• The adjoining figure shows that the interaction between columns 1 and 2
is contained in column 3.
• In such a case, we do not assign any factor to column 3, since it would be
difficult to obtain its main effect.
Orthogonal Array and Linear Graphs

L8 orthogonal array and its two standard linear graphs (interacting columns indicated) [1]
Orthogonal Array and Linear Graphs
• Example: an experiment has four 2-level factors A, B, C, D, and three
2-factor interactions A×B, B×C, B×D are of interest. Estimate the total
degrees of freedom of the experiment, choose an appropriate orthogonal
array, and decide on the column assignment.

• Degrees of freedom estimation:
  Four 2-level factors:          4 × (2 − 1) = 4
  Three 2-factor interactions:   3 × (2 − 1)(2 − 1) = 3
  Overall mean:                  1
  Total degrees of freedom:      8
• Therefore, we can choose the L8 orthogonal array.
Orthogonal Array and Linear Graphs
• The assignment of the four 2-level factors A, B, C, D should be such that
their effects are not confounded with any of the desired interactions
A×B, B×C, B×D.
• Two candidate linear-graph assignments are shown in the figure.

[Two candidate linear-graph assignments for L8]

In the first assignment we get the main effects of the four factors, but we
obtain the interaction A×C instead of the desired B×D. In the second, we get
the main effects of the four factors and all three desired interactions
A×B, B×C and B×D.

Orthogonal Array and Linear Graphs
• Therefore, the assignment of the four 2-level factors A, B, C, D and the
three 2-factor interactions A×B, B×C, B×D to the columns of L8 is:

Column:     1    2    3     4    5     6     7
Assignment: B    A    A×B   C    B×C   B×D   D
Orthogonal Array and Linear Graphs
• With this assignment, the eight experiments are conducted at the following
factor-level combinations (columns 3, 5 and 6 carry the interactions; the
levels follow the standard L8 array):

Expt No.   B    A    C    D
1          B1   A1   C1   D1
2          B1   A1   C2   D2
3          B1   A2   C1   D2
4          B1   A2   C2   D1
5          B2   A1   C1   D2
6          B2   A1   C2   D1
7          B2   A2   C1   D1
8          B2   A2   C2   D2

Column assignment: 1 = B, 2 = A, 3 = A×B, 4 = C, 5 = B×C, 6 = B×D, 7 = D

L12 Orthogonal Array
• In the L12 array, the interaction between any two columns is partially
confounded with the remaining nine columns [1].
• Therefore, this array should be used only when there are no interactions
between any two control factors.
L16 Orthogonal Array
• The L16 array permits evaluation of a large number of factors and
interactions.
• A variety of linear graphs are possible with this array.

L16 Orthogonal Array Linear Graphs [1]
L18 Orthogonal Array
• In the L18 array, the interaction between columns 1 and 2 can be estimated
from a 2-way table of columns 1 and 2.
• Columns 1 and 2 can also be merged to form a 6-level column.

Linear graph for L18 orthogonal array [1]

L27 Orthogonal Array
• In a 3-level orthogonal array, each column has 2 degrees of freedom.
Therefore, the interaction between two columns has 4 degrees of freedom.
• Such an interaction therefore requires two columns of 2 DOF each.

Two different linear graphs for the L27 orthogonal array [1]
Experimentation Strategy
• Beginner strategy: initially, it is good to begin with the standard
orthogonal arrays and fit the problem to them.
• Beginners need not go beyond the L18 array at the start.
• A beginner should consider all the factors at 2 levels or all at 3 levels,
and need not consider interactions at the beginning.

• Intermediate strategy: with modest experience, the experimenter can make
small modifications to the orthogonal arrays.
• The experimenter can use combined-level factors.

• Advanced strategy: the experimenter can then modify the linear graphs and
the orthogonal arrays themselves.
• Consider more interactions, higher-level parameters, and larger arrays.

Experimentation Strategy
Intermediate Strategy for selecting an orthogonal array [1]
Experimentation Strategy
• The strategy of experimentation can also be based on the resolution of the
experiment.
• As the resolution of an experiment increases, the number of experiments
to be performed increases.
• Resolution V: all factor effects and all 2-factor interactions can be
estimated.
• Resolution IV: no 2-factor interactions are confounded with the main
effects.
• Resolution III: no two main effects are confounded with each other, but the
2-factor interactions are confounded with the main effects and with each
other.

Comparison of Resolution III and IV [1]

Experimentation Strategy
• Screening experiments: with a resolution III experiment, roughly twice as
many parameters can be studied as with a resolution IV experiment of the
same size.
• It is good to conduct a resolution III experiment first, to determine which
parameters out of a large number actually influence the response variable.
• Modeling experiments: resolution IV experiments are then conducted with the
parameters found to be important during the screening experiment; they are
used to build the mathematical model.
References

1. Madhav S. Phadke, Quality Engineering Using Robust Design, P.T.R. Prentice Hall, Englewood Cliffs, New Jersey, 1989.
Session 9
Conducting Matrix Experiment & Analysis
Prof. Suhas S. Joshi
Session #10: Conducting Matrix Experiment and
Analyzing Results

Dr. Suhas S. Joshi


Professor
Department of Mechanical Engineering
Indian Institute of Technology, Bombay,
Powai, MUMBAI – 400 076 (India)
Phone: 022 2576 7527 (O) / 2576 8527 ®
Email: ssjoshi@iitb.ac.in

Conducting Matrix Experiment
• Running the experiments in random order is emphasized in classical
statistical experimentation; randomization is intended to reduce the effect
of noise factors.
• In matrix experiments, it is instead recommended to follow an order that
minimizes changes in the levels of factors that are difficult to change.
• Due to the orthogonal nature of the array, even after randomization the
experiments will appear in some order.
• Therefore, randomization is advised to the extent permitted by the
convenience of the experimenter [1].
• As far as possible, all the experiments should be done in a single setting,
using the same equipment, and continuously.
• If the surroundings change and the results change with them, it is an
indication that the results are sensitive to some noise factors.
• In such conditions, separate experiments with every noise factor are
necessary.
Case Study #1: Wear of Rotating Tools in Machining of Composites
Process details: turning operation using rotary tools, as shown in the figures below [2].

Fig. 1 Schematic of cutting process [2]   Fig. 2 Rotary tool for turning [2]

Table 1: Experimental parameters [2]

Variable   Cutting Speed   Feed Rate   Inclination   Volume of Reinforcement
           (m/min)         (mm/rev)    Angle (deg)   in the Composite (% of SiCp)
Level 1    22              0.084       15            10
Level 2    88              0.17        45            30

Case Study #1: Wear of Rotating Tools in Machining of Composites

Fig. 3: Linear graph of the chosen array L8 [2]

Table 2: Design of experiment using the L8 orthogonal array [2]

Column:             1        2         3     4         5     6     7
Experimental Runs   A: Feed  B: Speed  A×B   C: Angle  A×C   B×C   D: Volume
1                   1        1               1                     1
2                   1        1               2                     2
3                   1        2               1                     2
4                   1        2               2                     1
5                   2        1               1                     2
6                   2        1               2                     1
7                   2        2               1                     1
8                   2        2               2                     2
Table 3: Analysis of means for flank wear (after 10 s of machining) [2]
(The full response table assigns each run's response to the corresponding level
column; the summary rows are reproduced here.)

Run responses (sum of 3 replicates, mm), runs 1-8:
0.168, 0.196, 0.432, 0.336, 0.192, 0.240, 0.348, 0.570
Grand total: 2.482 over 24 values; overall average: 0.103 mm.

Factor     A: Feed   B: Speed   A×B      C: Angle   A×C      B×C      D: Volume
Effect     -0.0185   -0.0769    -0.0065  -0.0165    -0.0285  -0.0045  -0.0248
% Effect   10.05%    43.64%     3.68%    9.36%      16.17%   2.55%    14.07%

Table 4: Analysis of means for flank wear (after 50 s of machining)

Run responses (sum of 3 replicates, mm), runs 1-8:
0.252, 0.372, 0.606, 0.610, 0.516, 0.400, 0.607, 0.780
Grand total: 4.143 over 24 values; overall average: 0.173 mm.
Case Study #1: Wear of Rotating Tools in Machining of Composites

Fig. 4 Means plot for flank wear (mm) after 10 seconds of machining [2]
Fig. 5 Means plot for flank wear (mm) after 50 seconds of machining [2]
(Factors plotted: feed 0.084/0.17 mm/rev, speed 22/88 m/min, angle 15/45 deg,
volume 10/30 % of SiCp.)
• The plots show that cutting speed is the most important parameter governing
the wear; it matters more than the volume of reinforcement in the composite.
• Therefore, it is the processing parameters, rather than the reinforcement
content of the composite material, that govern the tool wear.
• The relative importance of the parameters changes little between 10 s and
50 s of machining.

Case Study #1: Wear of Rotating Tools in Machining of Composites

Table 5: Analysis of variance for magnitude of flank wear (after 10 seconds of machining) [2]

Source of Variation    Sum of Squares   DOF   Mean Square   F-ratio   Significance level
Model                  0.04552917       7     0.00650417    158.48    0.0001
A: Feed rate           0.00198017       1     0.00198017    48.25     0.0001 *
B: Cutting speed       0.03300417       1     0.03300417    804.16    0.0001 *
C: Inclination angle   0.00170017       1     0.00170017    41.43     0.0001 *
D: SiCp volume         0.00370017       1     0.00370017    90.16     0.0001 *
A×B                    0.00028017       1     0.00028017    6.83      0.0188
A×C                    0.00476017       1     0.00476017    115.98    0.0001 *
B×C                    0.00010417       1     0.00010417    2.54      0.1307
Residual               0.00065667       16    0.00004104
Total (corrected)      0.04618583       23

Table 6: Analysis of variance for magnitude of flank wear (after 50 seconds of machining) [2]

Source of Variation    Sum of Squares   DOF   Mean Square   F-ratio   Significance level
Model                  0.0659678        7     0.0094240     163.186   0.0000
A: Feed rate           0.0087402        1     0.0087402     151.345   0.0000 *
B: Cutting speed       0.0466402        1     0.0466402     807.622   0.0000 *
C: Inclination angle   0.0006615        1     0.0006615     11.455    0.0038 *
D: SiCp volume         0.0014415        1     0.0014415     24.961    0.0001 *
A×B                    0.0001602        1     0.0001602     2.773     0.1153
A×C                    0.0013202        1     0.0013202     22.860    0.0002 *
B×C                    0.0070042        1     0.0070042     121.284   0.0000 *
Residual               9.24x10-4        16    5.77x10-5
Total (corrected)      0.0668918        23
Case Study #2: Weld Quality Improvement
• Example [3]: a quality improvement team at Canadian Farm Ltd. analyzed the
effect of three factors, weld time, pressure and moisture content, on the
strength of a weld on an automotive part.
• Each factor was run at three different levels, as per the details shown in
the table.
• For 3 factors, each at three levels, an L9 experimental design was chosen.

Table 1: Experimental parameters [3]

Variable   Weld time (s)   Pressure (psi)   Moisture (%)
Level 1    0.2             16               0.787
Level 2    0.4             20               1.150
Level 3    0.6             24               1.758

Case Study #2: Weld Quality Improvement
• The experiments were replicated six times, as shown in the table below;
a total of 9 × 6 = 54 experiments were performed.
• The results of this experiment are indicated in the table below.
• The average of the six replications is taken as one response variable;
log s (the log of the sample standard deviation) is taken as another
response variable.

Table 2: L9 orthogonal array and results of experimentation [3]
Table 3: Analysis of means for weld strength (average) [3]
Table 4: Analysis of means for weld strength (log s)
Case Study #2: Weld Quality Improvement
• Evaluation of the response shows the following ranges [3]:
• For weld time: 136.1 − 123.1 = 13.0
• For pressure: 130.5 − 121.7 = 8.8
• For moisture: 158.9 − 75.6 = 83.3
• For an unassigned factor: 135.4 − 117.8 = 17.6
• The moisture content appears to be the most important parameter governing
the process.
• The fourth, unassigned factor reflects random variability.
• The ranges of average weld strength as weld time or pressure change are
less than the range attributed to random variability.
• Therefore, we conclude that changes in weld time and weld pressure do not
have a significant effect on the average weld strength.
• Although moisture has a significant effect on the weld strength, the weld
strength is very low (75.6) when the moisture % is at its highest level
(1.758).
• But at the other two levels of moisture, the weld strength is close to the
maximum.

Case Study #2: Weld Quality Improvement
• Thus, the study recommends that the moisture be set at 1.150% to maximize
the weld strength.

Weld strength as a function of moisture [3]


References

1. Madhav S. Phadke, Quality Engineering Using Robust Design, P.T.R. Prentice Hall, Englewood Cliffs, New Jersey, 1989.

2. Suhas S. Joshi, N. Ramakrishnan, H. E. Nagarwalla, and P. Ramakrishnan, 'Wear of Rotary Carbide Tools in Machining of Al/SiCp Composites', Wear, volume 230, 1999, pp. 124-132.

3. Robert H. Lochner and Joseph E. Matar, 'Designing for Quality: An Introduction to the Best of Taguchi and Western Methods of Statistical Experimental Design', Quality Resources, A Division of the Kraus Organization Limited, New York, 1990.
Session 11
Additivity, Interactions & OA Modifications
Prof. Suhas S. Joshi
Session #11: Additivity, Interactions, Orthogonal Array
Modifications

Dr. Suhas S. Joshi


Professor
Department of Mechanical Engineering
Indian Institute of Technology, Bombay,
Powai, MUMBAI – 400 076 (India)
Phone: 022 2576 7527 (O) / 2576 8527 ®
Email: ssjoshi@iitb.ac.in

Additive Function
• The goal of a Taguchi design is to determine the best levels of each of the
control factors in the design of a product or a process.
• In many cases, the effects of the control factors on the response variables
are additive in nature. The response can then be predicted for any
combination of control-factor levels from the main effects alone.
• But if the effects are not additive, it is an indication that there are
strong interactions among the control factors.
• In such cases, experiments must be conducted under all combinations of the
control factors, i.e. full factorial experiments, which is highly expensive.
• When interactions are strong, the conditions under which the experiments are
conducted can themselves act as a control factor; laboratory experiments may
then fail to simulate the conditions at the customer site.
Additive Function
• The relative magnitude and importance of interactions can be reduced
greatly through the proper choice of quality characteristic, objective
function, and control factors.
• In general, the Taguchi method recommends that, as far as possible,
interactions between the control factors be avoided. This way, cross-product
terms do not enter the model and the function becomes additive.
• Some guidelines for achieving additivity are [1]:
– The quality characteristic should be directly related to the energy transfer
associated with the basic mechanism of the product or the process.
– As far as possible, choose continuous variables as quality characteristics.
– The quality characteristic should be monotonic: the effect of a control
factor on the response variable should be in a consistent direction.
– Use a quality characteristic that is easy to measure.
– When a product consists of a number of sub-systems with a feedback
mechanism, all the sub-systems should be optimized independently.

Quality Characteristic
• The selection of the quality characteristic is very important.
• Yield as a quality characteristic:
• Usually, everyone wants to maximize the yield of a process or product.
• But we may not realize that the yield of a process or product depends on a
large number of sub-processes or sub-products.
• If we choose the final yield as the objective, there will be a huge number
of control parameters and interactions.
• Therefore, it is necessary to identify something basic in the process, or a
key assembly of the product, and optimize that; this will indirectly improve
the yield of the final product.
• Examples of objective functions are [1]:
1. Spray painting process: the size of the droplet is the main factor that
influences the paint quality.
2. Chemical process: if a chemical is formed out of three constituents A, B,
C, a better objective function could be the concentration of each that
maximizes yield.
Quality Characteristic
3. Heat exchanger design: in a heat exchanger, the temperature (T2) of the
fluid to be cooled, measured after it comes out of the cooling chamber,
should be the criterion. One can then try to control the variation in this
temperature.

Schematic of heat exchanger system [1]

4. The paper handling system in a copying machine should provide a force that
is just sufficient to pick up one sheet and not two. By making F1 as small
as possible and F2 as large as possible, paper feeding defects can be
minimized.

Schematic of paper feeding [1]

Control Factor Interactions
• The orthogonal arrays permit evaluation of certain interactions. For
example, consider a Taguchi experiment using the L8 array.
• The interaction between factors A and B can be obtained from the
orthogonal experiments.

Factor levels [1]
Control Factor Interactions
• The interaction magnitude can be estimated by

    A×B interaction = (yA2B2 − yA1B2) − (yA2B1 − yA1B1)
                    = (yA2B2 + yA1B1) − (yA2B1 + yA1B2)

where yAiBj denotes the mean response with A at level i and B at level j.
• The combinations A1B1 and A2B2 of factors A and B occur in the experiments
where factor C is at level 1 (experiments 1, 2, 7 and 8).
• The combinations A1B2 and A2B1 occur where factor C is at level 2.
• Thus, it is not possible to distinguish the effect of factor C from the
A×B interaction. This is called 'confounding' of factor effects with
interactions.
• Therefore, to avoid confounding, we do not assign any factor to column 3.

Interaction plot [1]
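
A one-line check of this estimate (Python; the four cell means are hypothetical):

    def interaction_AB(y_11, y_12, y_21, y_22):
        # AxB interaction from the four cell means y_AiBj:
        # (y_A2B2 + y_A1B1) - (y_A2B1 + y_A1B2)
        return (y_22 + y_11) - (y_21 + y_12)

    print(interaction_AB(y_11=10.0, y_12=12.0, y_21=14.0, y_22=20.0))   # -> 4.0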

Control Factor Interactions

L8 orthogonal array and its interaction table [1]


Control Factor Interactions
• To estimate an interaction, a 2-way table is prepared form the
observed data.

2-way table for AxB interaction [1]

L8 experiment layout [1]

Control Factor Interactions
• The interactions between three-level factors can be of three different
types, as shown in the figures.
• The first figure shows a case with no interaction between the factors:
there is no change in the effect of factor A as the level of factor B
changes from B1 to B3.

3-level factor interaction [1]
Control Factor Interactions
• The second figure shows lines that are not parallel to each other: the
effect of factor A changes as we change the level of factor B.
• Here the additive model we assumed could be misleading. This type of
interaction is called a synergistic interaction.
• The interaction in the last figure shows a drastic change in the effect of
factor A as the level of factor B changes; this is called an antisynergistic
interaction. In this case, the linear additive model we assumed will not be
valid.

• In general, in Taguchi methods, it is necessary to anticipate some
interactions before beginning the experiment and to evaluate those, while at
the same time assigning the factors to columns in such a way that there is
no confounding of factor effects.

Control Factor Interactions
• It is possible to identify the cause behind the interactions and to choose
the right ones for your analysis. Interactions occur because the interacting
factors influence a common variable that has not been considered in the
experiment.
• For example, in the rotary tool experiment, all the chosen factors influence
the speed of rotation of the round insert.
• Either take that common variable as a control factor, if it can be measured
easily, or otherwise estimate the interactions and check whether they are
significant.

Fig. Logic for the selection of interactions [2]
Linear Graph Modifications
• The linear graphs can be modified so that different orthogonal arrays can
be generated from the standard orthogonal arrays. Rules for modification
of linear graphs are as follows [1]:
• Breaking a line:
– In a 2-level orthogonal array, a line connecting two dots, ‘a’ and ‘b’ can be
removed and replaced by a dot.
– In a 3-level orthogonal array, each column has two degrees of freedom,
therefore, the interaction between two columns is in two other columns.
Hence, by breaking a line, four dots are generated.
• Forming a line:
– A line can be added in a linear graph of a 2-level orthogonal array to connect
two dots ‘a’ and ‘b’ provided the dot associated with their interaction is
removed. Similarly, it can be done in case of a 3-level orthogonal array.
• Moving a line:
– It is a combination of preceding two rules. A line connecting two dots ‘a’ and
‘b’ can be removed and replaced by a line joining set of another two dots say
‘f’ and ‘g’.

Linear Graph Modifications

Rules for modification of linear graphs [1]


Linear Graph Modifications
• In the following example of linear graph modification, the line 1-7 is
broken and the two dots corresponding to 6 and 7 are added.
• Now the interaction between columns 1 and 7 can no longer be estimated;
only two interactions can be estimated.

Fig. Modification of linear graphs for the L8 orthogonal array [1]

Column Merging
• By this method, a 4-level column can be created from 2-level columns in a
2-level array, a 9-level column from 3-level columns in a 3-level array, and
a 6-level column in a mixed 2- and 3-level orthogonal array [1].
• To create a 4-level column in a standard orthogonal array, two columns and
their interaction column are merged.
• For example, in an L8 array, columns 1, 2 and 3 can be merged to form a
4-level column.
• Three columns of 1 degree of freedom each are merged to form one 4-level
column with 3 degrees of freedom. In the new column:
• combination (1,1) is designated as level 1
• combination (1,2) is designated as level 2
• combination (2,1) is designated as level 3
• combination (2,2) is designated as level 4
• Columns 1, 2 and 3 are removed from the original array and the new 4-level
column is introduced in their place.
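
A minimal sketch of this merge on the level codes (Python; the column indices
and example rows are illustrative, with columns 1, 2 and their interaction
column 3 merged as described above):

    merge_map = {(1, 1): 1, (1, 2): 2, (2, 1): 3, (2, 2): 4}

    def merge_columns(rows, c1=0, c2=1, c_int=2):
        # rows: list of array rows (tuples of levels). Columns c1, c2 and their
        # interaction column c_int are replaced by one 4-level column in front.
        merged = []
        for r in rows:
            keep = [v for i, v in enumerate(r) if i not in (c1, c2, c_int)]
            merged.append((merge_map[(r[c1], r[c2])], *keep))
        return merged

    print(merge_columns([(1, 1, 1, 1, 1, 1, 1), (2, 2, 1, 2, 1, 1, 2)]))
    # -> [(1, 1, 1, 1, 1), (4, 2, 1, 1, 2)]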
Column Merging

Fig. Column merging L8 orthogonal array [1]

References

1. Madhav S. Phadke, Quality Engineering Using Robust Design, P.T.R. Prentice Hall, Englewood Cliffs, New Jersey, 1989.

2. Suhas S. Joshi, N. Ramakrishnan, H. E. Nagarwalla, and P. Ramakrishnan, 'Wear of Rotary Carbide Tools in Machining of Al/SiCp Composites', Wear, volume 230, 1999, pp. 124-132.
Session 11
Case Study on Deburring Operations
Prof. R. Balasubramaniam
Design of Experiments:
Abrasive Jet Deburring

Dr. R. Balasubramaniam
Bhabha Atomic Research Centre
Trombay, Mumbai

Outline
• Burrs
• Deburring
• Design of Experiments
– External Burrs
– External Deburring
• Deburring Process
• Deburring Time
• Extent of Edge Radius Generated
– Internal Deburring
• Deburring Process
Burrs
• Burr, Edge and Surface Conditioning Technology of SME defines
– burrs as undesirable projections on materials, formed as the result of
plastic flow from cutting, forming, blanking or shearing operations, and
unavoidable in all kinds of machining operations.

Effects of Burrs
• Interfere with the assembly of components
• Act as a source of small particles inside the assembly
• Seriously affect the performance and reliability of the system
• Cause injury to operating personnel


Deburring
• Removal of Burr
• Generation of edge condition

• External Deburring
• Internal/Inaccessible area Deburring


Deburring Methods
• Abrasive processes
• Mechanical processes
• Thermal processes
• Chemical processes
• Electrochemical processes

More than 50 deburring processes are practiced in industries.


Experiments
• Generation of External Burrs

• Abrasive Jet External Deburring


• Deburring Process
• Deburring Time
• Extent of Edge Radius Generated

• Abrasive Jet Internal Deburring


• Deburring Process

Design of Experiments for the Generation of External Burrs
• Burr model parameters:
– Burr height (H)
– Burr root thickness (T2)
– Burr thickness at the top (T1)
– Root radius (r)


Design of Experiment – External Burr

Input parameters:
– Tool nose radius, TNR (mm)
– Depth of cut, DOC (mm)
– Feed rate (mm/tooth)

Response parameter:
– Burr root thickness (mm)

contd…
• 3 input parameters, 1 response parameter
• 2-level experiment:

Factor   Low           High
TNR      0 mm (A1)     1 mm (A2)
DOC      0.5 mm (B1)   2 mm (B2)
Feed     0.05 mm (C1)  0.15 mm (C2)

• Full factorial experiment: 2³ = 8 treatments
• 3 runs each, total 24 runs
ANOVA Results for Burr Root Thickness

Factors    Sum of squares   df   Mean Square   F-Ratio   P-Value
TNR (A)    2.7135           1    2.7135        131.07    0.0000*
DOC (B)    2.8359           1    2.8359        136.99    0.0000*
Feed (C)   0.4959           1    0.4959        23.96     0.0002*
AB         0.3384           1    0.3384        16.35     0.0011*
AC         0.3384           1    0.3384        16.35     0.0011*
BC         0.1890           1    0.1890        9.13      0.0086*
Error      0.3105           17   0.0207
Total      7.2216           23

* A P-value less than 0.05 is significant.
R-Squared = 95.7;  R-Squared (adjusted) = 94.18

Effect of Factors

Factors    % Effect
TNR (A)    37.5
DOC (B)    39.3
Feed (C)   6.8
AB         4.6
AC         4.6
BC         2.6
Error      4.6
Main Factor Effects Plot

Fig.: Main factor effects on burr root thickness (mm) for TNR (0/1 mm),
DOC (0.5/2.0 mm) and Feed (0.05/0.15 mm)

Interaction Effects Plot


1.80
+ +
+
1.50
Burr Root Thickness

+ - - -
1.20

+
0.90 +
in mm

-
0.60

-
-
0.30
0

1 2 1 2 1 2
AB AC BC
5/16/2009 Dr.R.Balasubramaniam, BARC
Regression Coefficients for Burr Root Thickness

Factors    Coefficient
Constant   -0.2975
A: TNR     1.5433
B: DOC     0.3800
C: Feed    2.2917
AB         -0.3167
AC         -4.7500
BC         2.3667

Regression Equation
• Burr root thickness
  = -0.2975 + 1.5433 TNR + 0.3800 DOC + 2.2917 Feed
    - 0.3167 × TNR × DOC - 4.75 × TNR × Feed - 2.3667 × DOC × Feed
  (all units in mm)
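
A minimal sketch evaluating this regression model (Python; the function name is
illustrative, inputs are in mm with feed in mm/tooth as given in the slides, and
the model is only valid inside the experimental ranges of Table 1):

    def burr_root_thickness(tnr, doc, feed):
        # Regression model from the slide above (all terms in mm).
        return (-0.2975 + 1.5433 * tnr + 0.3800 * doc + 2.2917 * feed
                - 0.3167 * tnr * doc - 4.75 * tnr * feed - 2.3667 * doc * feed)

    print(burr_root_thickness(tnr=1.0, doc=2.0, feed=0.15))   # -> about 0.29 mm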


Abrasive Jet External Deburring
• Cause-Effect Diagram

Input parameters:
– Orientation parameters: stand-off distance (SOD), impingement angle, jet height
– Nozzle parameters: diameter, length
– Abrasive parameters: type of abrasive, grain size, material
– Flow parameters: mixing ratio, velocity, nozzle pressure

Response parameters: burr removal, edge generation, deburring time
Response parameters

Input Parameters for Edge Conditioning
– Impingement angle
– Jet height

Criteria for edge conditioning:
• Impingement angle 0°, jet height = h/2: convex edge
• Impingement angle 0°, jet height < h/2: concave edge
• Impingement angle > 0°, jet height < h/2: taper edge


Response Table

Burr Removal   Edge Radius Generation   Deburred?
0              0                        0
0              1                        0
1              0                        0
1              1                        1

• Generation of a convex type edge radius is considered, with an impingement
angle of 0° and a jet height of h/2.

Input & Response Parameters –


Deburring Process
• Stand-off-distance (SOD) – A1 & A2
• Nozzle Pressure - B1 & B2
• Mixing Ratio - C1 & C2
• Abrasive Grit Size - D1 & D2
• JH and IANG to generate convex edge

• Burr removal and convex edge radius generation

• Full factorial – 16 runs × 3 repetitions


ANOVA Results of Deburring Process

Sources        Sum of Squares   df   Mean Square   % Effect
SOD            12               1    12            100%
NPR            0                1    0             0
MR             0                1    0             0
ASize          0                1    0             0
All 2-factor   0                6    0             0
Error          0                37   0             0
Total          12               47

* SOD is the significant factor for the deburring process.

Experiment for Deburring Time


• Effects of other three parameters on
Deburring Time

• Fixed SOD

• 3 Factor- Full Factorial – 8 * 3 Run



ANOVA Results of Deburring Time

Source      Sum of Sq.   df   Mean Sq.   % Effect
NPR (B)     96           1    96         84
MR (C)      1.5          1    1.5        1.3
ASize (D)   13.5         1    13.5       11.8
BC          1.5          1    1.5        1.3
BD          1.5          1    1.5        1.3
CD          0            1    0          0
Error       0            15   0          0
Total       114.0        23

Regression Co-efficient
Factor Co-efficient
Constant 7.05
NPR(B) -1.05
MR(C) -20.0
ASize(D) -0.28
BC 3.33
BD -0.33
CD 0
Experiment for Extent of Edge Radius Generated
• Convex edge radius

• SOD – input parameter (determined with an experiment on plaster-of-paris;
details not discussed here)

• Radius – output parameter

Regression Equation

Edge radius generated (mm) = 0.15 + 0.057 × SOD

valid when SOD is less than 8 mm; the radius value is limited to the burr root
thickness.
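
A minimal sketch of this model with its stated validity limit (Python; the
function name and the error handling are illustrative):

    def edge_radius(sod):
        # Edge radius (mm) = 0.15 + 0.057 * SOD, stated for SOD < 8 mm;
        # in practice the generated radius is capped at the burr root thickness.
        if sod >= 8.0:
            raise ValueError("model stated only for SOD < 8 mm")
        return 0.15 + 0.057 * sod

    print(edge_radius(5.0))   # -> 0.435 mm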


Abrasive Jet Internal Deburring
Process


Concept of Secondary Erosion

Fig.: Primary and secondary erosions – the jet causes primary erosion on the
first target and secondary erosion on the second target.


Internal Burr Specimen
• 90° cross-drilled holes

Fig.: Specimen with stopper, burr and nozzle.

Input and Response Parameters


• SOD
• MR
• ASize
• Burr Root To Stopper Distance (BSD)

• Removal of Burr
• Edge Radius



Burr Root to Stopper Distance (BSD)

Fig.: Burr root to stopper distance (BSD), with locations PL1 and PL2 indicated.

Cross Drilled Hole Burr Model

Fig.: Burr model around the hole periphery (0° to 270°): strong burr region,
weak burr region and no-burr region, with maximum thickness Tmax and maximum
height Hmax indicated.


Parameters and their Levels
• Input parameters (4 factors, 2 levels, 3 runs = 48 experiments):
– SOD
– MR
– ASize
– BSD
• Response parameter:
– Removal of burr
• Radius generation varied from section to section

ANOVA

Source      Sum of Sq.   df   Mean Sq.   F-Ratio   P-Value
ASize (A)   0.75         1    0.75       37        0*
MR (B)      0            1    0          0         1.0
SOD (C)     0.75         1    0.75       37        0*
BSD (D)     0.75         1    0.75       37        0*
AB          0            1    0          0         1
AC          0.75         1    0.75       37        0*
AD          0.75         1    0.75       37        0*
BC          0            1    0          0         1
BD          0            1    0          0         1
CD          0.75         1    0.75       37        0*
Error       0.75         37   0.02
Total       5.25         47

R-Squared = 85.71;  R-Squared (adjusted) = 81.85
Significant factors

• ASize
• SOD
• BSD
• Interactions of AC, AD and CD


Conclusions
• Design of experiments and analysis of variance helped in
– identifying the significant factors affecting the response factors, and
– developing regression models.
Appendix A

Dr. Suhas S. Joshi


Professor
Department of Mechanical Engineering
Indian Institute of Technology, Bombay,
Powai, MUMBAI – 400 076 (India)
Phone: 022 2576 7527 (O) / 2576 8527 ®
Email: ssjoshi@iitb.ac.in

Response Table for the L8 Orthogonal Array
(Blank worksheet: columns 1-7; rows for run, response, total, number of values,
average, effect and % effect.)

Response Table for the L9 Orthogonal Array

Response Table for the L18 Orthogonal Array

Response Table for the L27 Orthogonal Array (continued on a second sheet)

Normal Probability Paper
Appendix B
International Journal Publications Using
Design of Experiment Approach
