Statistics for sensory and consumer research

Statistics used in sensory
and consumer research

Chantal Gilbert
14th Nordic Workshop in Sensory Science
5-6 October 2011
C.C. Gilbert
Nordic Workshop, 6 Oct 2011
Opening talk on the topic of Statistics
Why sensory statistics?

Sensory data is unique
Uses human assessors to measure the perception of a wide
range of stimuli, as detected by the senses
Need to consider:
Physiology
Psychology
Motivation
Performance
Behaviour
Genetics...
Psychological errors:
Stimulus error
Expectation error
Central tendency
Contrast and convergence
Habituation
Halo effect
Logical error
The importance of Experimental Design: but not enough time!!!

Statistical methods and issues worthy of discussion...

Statistical methods and issues
I've chosen to discuss...
1) Statistics for discrimination testing and the issue of similarity
2) ANOVA as a method for analysing descriptive profile data
3) Multivariate methods, and interpretation pitfalls
... a drop in the ocean!

1) Statistics for discrimination testing and the issue of similarity
Something simple to start with:

Analysis of Triangle Tests
[In fact, it's not so simple, see for example:
O'Mahony M (1995) Who told you the triangle test was simple? FQP, Vol. 6, No. 4]
Triangle Test
Each assessor simultaneously receives 3 samples, two of which are identical.
Asked to identify the odd one.
Example sample codes: 951, 627, 398
Some Scenarios
24 screened assessors perform triangle test task.
If result
    # correct: 8     # incorrect: 16
then there is no difference (expect 1/3 to be correct by chance).
If result
    # correct: 20    # incorrect: 4
then it is clear that the result will be significant.
If result
    # correct: 13    # incorrect: 11
What do we conclude?
We need the help of Statistics
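The three scenarios can be checked with an exact one-sided binomial tail probability, assuming each assessor guesses correctly with probability 1/3 when there is no difference. This is only a minimal sketch using the Python standard library; the function name `tail_prob` is illustrative and not taken from the talk.

```python
from math import comb

def tail_prob(k, n, p=1/3):
    """P(X >= k) for X ~ Binomial(n, p): chance of k or more correct by guessing."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

n = 24  # number of assessors in the scenarios
for correct in (8, 13, 20):
    print(f"{correct}/{n} correct: one-sided p = {tail_prob(correct, n):.4f}")
```

Under these assumptions, 8 correct is what guessing alone would be expected to give, 20 correct sits far out in the tail, and 13 correct is the case where intuition is not enough and the tail probability has to be compared with the chosen alpha.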

Example - Triangle test

A company has made a small change of ingredient
hoping to lead to an improved product. Prior to
consumer testing, the sensory analyst runs a triangle
test to see if the difference between the products is
perceivable.
Objective: Difference testing
Analyst sets alpha risk at 0.05 (5% chance of saying
there's a difference when there's not).
18 assessors perform the test.
Hypothesis testing cont.

Triangle test example:
α = 0.05
H0: p = 1/3 (A = B, there is NO difference)
H1: p > 1/3 (A ≠ B, there is a difference)
How many assessors out of 18 are required to identify
the odd sample to be confident of a difference? - Critical value
Binomial tables
(p=1/3, one-tailed test)

Number of     Significance Level
Assessors     5%     1%     0.1%
12            8      9      10
13            8      9      11
14            9      10     11
15            9      10     12
16            9      11     12
17            10     11     13
18            10     12     13
19            11     12     14
20            11     13     14
21            12     13     15
22            12     14     15
23            12     14     16
24            13     15     16

For 18 assessors at 5%: if result ≥ 10, reject H0 & conclude sig. diff
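Critical values of this kind can be sketched as the smallest number of correct responses whose one-sided binomial tail probability does not exceed the significance level. The helper names below are illustrative assumptions, not part of the talk, and the sketch uses only the standard library.

```python
from math import comb

def tail_prob(k, n, p=1/3):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def critical_value(n, alpha, p=1/3):
    """Smallest number of correct responses k with P(X >= k) <= alpha."""
    for k in range(n + 2):          # k = n + 1 has tail probability 0, so the loop always returns
        if tail_prob(k, n, p) <= alpha:
            return k

for n in (12, 18, 24):
    print(n, [critical_value(n, a) for a in (0.05, 0.01, 0.001)])
```

For example, `critical_value(18, 0.05)` gives the minimum number of correct identifications out of 18 assessors needed to reject H0 at the 5% level.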
Triangle Example
Results: 11 assessors correctly identified the odd sample.
Critical value (α = 0.05) = 10 < Calculated value (test statistic) = 11
(the calculated value falls beyond the critical value, on the H1 side)
Analyst rejects H0 in favour of H1, and concludes there is
a perceivable difference between the samples (p<0.05).
Using discrimination tests to demonstrate similarity
Triangle tests for similarity...
A) How they used to do it (WRONG!)
Example 2 - Triangle test

Similarity testing situation: change of supplier
Triangle test: ultimately want to show no
perceivable difference...
Imagine - analyst proceeds as usual (ignoring the
similarity issue):
α=0.05; 18 assessors
critical value = minimum of 10 correct responses
Example 2 - Triangle test

Results: 8 assessors correctly identified the odd sample.
Calculated value (test statistic) = 8 < Critical value (α = 0.05) = 10
(the calculated value stays on the H0 side of the critical value)
Example 2 - conclusion
Analyst fails to reject H0 - not enough evidence to
suggest there's a difference between the two
suppliers.
Actually, thinking about it, the analyst would like to
demonstrate that the samples are the same.
The samples are not significantly different (p>0.05),
therefore they must be the same!
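Why this reasoning fails can be sketched by computing the beta risk of the 18-assessor test. The calculation assumes, purely for illustration, that 30% of the population are true discriminators (Pd = 0.30) and that everyone else guesses at 1/3; `binom_cdf` is a hypothetical helper name.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(0, k + 1))

n, critical = 18, 10          # Example 2 settings from the slide
pd = 0.30                     # assumed proportion of discriminators (illustrative)
pc = pd + (1 - pd) / 3        # discriminators are correct, the rest guess at 1/3

beta = binom_cdf(critical - 1, n, pc)   # chance of fewer than 10 correct responses
print(f"Pc = {pc:.3f}, beta = {beta:.2f}, power = {1 - beta:.2f}")
```

Under these assumptions the test misses a real difference a large share of the time, so a non-significant result on its own says little about the samples being the same.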
B) How we currently do it
See ISO reference: Sensory analysis - Methodology - Triangle test. BS ISO 4120:2004.
Similarity testing approach in BS ISO 4120 (2004)
Traditional hypothesis testing format is not
the ideal tool for demonstrating similarity
between products.
Approach: Use same sensory test methods,
but control for a different type of error when
conducting tests where research objective is
demonstrating similarity.
Essentially ignoring standard rules for
hypothesis testing!
Approach used
Focus on:
β (risk of missing a true difference); power = 1-β
Pd (measure of the size of the difference)
Be very certain (1-β) that
few people (Pd) can detect a difference
For this, need to use many more assessors
E.g. To be 95% certain that only 20% of population
can detect a difference, need 147 assessors to do
the test (at α=0.05)
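The need for many more assessors can be sketched by comparing the power of the triangle test at two panel sizes. The sketch assumes an exact binomial model with a guessing probability of 1/3; the function names are illustrative, and the results need not reproduce the ISO table entries exactly.

```python
from math import comb

def tail_prob(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def power(n, alpha, pd):
    """Power of a one-sided triangle test when a proportion pd can truly discriminate."""
    # Critical value: smallest k whose tail probability under guessing is <= alpha.
    crit = next(k for k in range(n + 2) if tail_prob(k, n, 1/3) <= alpha)
    pc = pd + (1 - pd) / 3        # probability of a correct response
    return tail_prob(crit, n, pc)

for n in (18, 147):
    print(f"n = {n}: power = {power(n, alpha=0.05, pd=0.20):.3f}")
```

With Pd = 20%, a panel of 18 leaves a high beta risk, while a panel of 147 brings the power close to the level quoted on the slide.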
Example - selecting sample size

Going back to Example 2
Triangle test between products made using the
ingredient from the two suppliers.
Because the company would like to show
that the new supplier's ingredient does not
change the overall perception, the analyst
knows the objective is to establish similarity.
Example - Selecting sample size
(compromising choice of α and β)

Number of assessors needed (rows: α; columns: β)

Pd     α        β=0.20   β=0.10   β=0.05   β=0.01   β=0.001
50%    0.20     7        12       16       25       36
       0.10     12       15       20       30       43
       0.05     16       20       23       35       48
       0.01     25       30       35       47       62
       0.001    36       43       48       62       81
40%    0.20     12       17       25       36       55
       0.10     17       25       30       46       67
       0.05     23       30       40       57       79
       0.01     35       47       56       76       102
       0.001    55       68       76       102      130
30%    0.20     20       28       39       64       97
       0.10     30       43       54       81       119
       0.05     40       53       66       98       136
       0.01     62       82       97       131      181
       0.001    93       120      138      181      233
20%    0.20     39       64       86       140      212
       0.10     62       89       119      178      260
       0.05     87       117      147      213      305
       0.01     136      176      211      292      397
       0.001    207      257      302      396      513

Decide want Pd = 30%; n=30 is max. due to budget
Example cont.
Analyst consults the sample size table. Decides to
use n=30 assessors, knowing risks are:
α=20%, β=10%, Pd=30%
Results show 10 of the 30 assessors identified the
odd sample.
Example - Interpreting results

Critical number of responses table (ISO)

n      β        Pd=10%   Pd=20%   Pd=30%   Pd=40%   Pd=50%
18     0.001    0        1        2        3        5
       0.01     2        3        4        5        6
       0.05     3        4        5        6        8
       0.10     4        5        6        7        8
       0.20     4        6        7        8        9
24     0.001    2        3        4        6        8
       0.01     3        5        6        8        9
       0.05     5        6        8        9        11
       0.10     6        7        9        10       12
       0.20     7        8        10       11       13
30     0.001    3        5        7        9        11
       0.01     5        7        9        11       13
       0.05     7        9        11       13       15
       0.10     8        10       11       14       16
       0.20     9        11       13       15       17
36     0.001    5        7        9        11       14
       0.01     7        9        11       14       16
       0.05     9        11       13       16       18
       0.10     10       12       14       17       19
       0.20     11       13       16       18       21
Example - conclusion
Table shows that the maximum number of
correct responses needed to conclude that
two samples are similar, based on a triangle
test, is 11.
Results show 10 correct responses.
Therefore, the analyst concludes the
samples are similar (that is, they are 90%
confident that no more than 30% of
discriminators can detect a difference).
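The conclusion can also be sketched as an upper confidence limit on the proportion of discriminators. The block below uses a normal approximation, which is an assumption of this sketch (the tabulated critical numbers may be derived differently), and the function name `pd_upper_limit` is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def pd_upper_limit(x, n, beta):
    """Approximate 100(1 - beta)% upper confidence limit on Pd for a triangle test."""
    pc_hat = x / n                           # observed proportion correct
    pd_hat = 1.5 * pc_hat - 0.5              # inverts Pc = Pd + (1 - Pd)/3
    se = 1.5 * sqrt(pc_hat * (1 - pc_hat) / n)
    z = NormalDist().inv_cdf(1 - beta)       # one-sided normal quantile
    return pd_hat + z * se

upper = pd_upper_limit(x=10, n=30, beta=0.10)
print(f"90% upper limit on Pd: {upper:.3f}")
```

If the upper limit falls below the chosen Pd (30% in the example), the data are consistent with the similarity conclusion at that confidence level.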
C) Some new approaches
Testing for similarity: what's new?

Interval hypothesis testing:
Specify an allowed or ignorable difference in terms of the
proportion of discriminators (Pd0)
Work out the probability of correct responses, Pc0,
corresponding to Pd0 (accounts for the chance of guessing).
H0: Pc ≥ Pc0 (i.e. there IS a difference)
HA: Pc < Pc0 (i.e. there is no difference, similar)
Follows standard rules of hypothesis testing: if p<α, reject H0
and conclude the samples are similar (where similarity is
defined by the interval).
See: Bi J (2006) Sensory Discrimination Tests and
Measurements. Blackwell Publishing Professional.
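The interval hypothesis test can be sketched with an exact lower-tail binomial probability. The values below reuse the earlier example (10 correct out of 30, Pd0 = 30%); the choice of α = 0.05 and the helper name `binom_cdf` are assumptions made for illustration only.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(0, k + 1))

pd0 = 0.30                    # allowed proportion of discriminators
pc0 = pd0 + (1 - pd0) / 3     # corresponding probability of a correct response
x, n, alpha = 10, 30, 0.05    # observed correct, panel size, assumed alpha

p_value = binom_cdf(x, n, pc0)   # chance of x or fewer correct if Pc = Pc0
print(f"Pc0 = {pc0:.3f}, p = {p_value:.4f}")
print("similar" if p_value < alpha else "similarity not demonstrated")
```

Here a small p-value counts as evidence for similarity, because the null hypothesis is that a difference at least as large as Pd0 exists.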
Other new approaches

Using more sensitive sensory methods
One example (among others...) - Tetrads
Present 4 samples
Group the stimuli into two groups of two (unspecified procedure)
6 possible outcomes: WWSS, WSWS, WSSW, SSWW, SWSW,
SWWS
Probability of guessing correctly is 1/3
Ennis JM (2010)
http://ifpressdelta.com/wpcontent/uploads/2011/03/ASTM_2010_Spring_John_Ennis_New_Methods.pdf
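The 1/3 guessing probability for tetrads can be checked by enumeration. The sketch below takes the six presentation orders of two W and two S samples listed above and counts how many of the three possible splits into two pairs are correct; the function name is illustrative.

```python
from fractions import Fraction
from itertools import permutations

def guess_probability(order):
    """Chance that a random split into two pairs matches the true grouping."""
    # A split into two pairs is fixed by choosing the partner of position 0.
    splits = [({0, partner}, {1, 2, 3} - {partner}) for partner in (1, 2, 3)]
    correct = sum(
        all(len({order[i] for i in group}) == 1 for group in split)
        for split in splits
    )
    return Fraction(correct, len(splits))

orders = sorted(set(permutations("WWSS")))   # the 6 distinct presentation orders
for order in orders:
    print("".join(order), guess_probability(order))   # prints 1/3 for every order
```

Whatever the presentation order, exactly one of the three possible groupings is correct, which gives the chance level of 1/3.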
2) ANOVA as a method for analysing descriptive profile data
Link between sample presentation design and analysis
Independent Samples Design
Assessors 1-8 each assess one product only: four assessors
receive Product 1 (X) and the other four receive Product 2 (X).
E.g. Independent samples t-test,
or one-way ANOVA for 3 or more samples
Related Samples Design

              Product 1   Product 2
Assessor 1    X           X
Assessor 2    X           X
Assessor 3    X           X
Assessor 4    X           X
Assessor 5    X           X
Assessor 6    X           X
Assessor 7    X           X
Assessor 8    X           X

E.g. Paired t-test, or two-way
ANOVA for 3 or more samples
Standard Analysis:
Two-Way ANOVA with Interaction
Tests of Between-Subjects Effects
Dependent Variable: Astringent

Source            Type III Sum of Squares   df    Mean Square   F          Sig.
Corrected Model   19328.625a                59    327.604       1.927      .006
Intercept         326041.875                1     326041.875    1918.175   .000
Sample            2111.975                  5     422.395       2.485      .041
Judge             7935.375                  9     881.708       5.187      .000
Sample * Judge    9281.275                  45    206.251       1.213      .240
Error             10198.500                 60    169.975
Total             355569.000                120
Corrected Total   29527.125                 119

a. R Squared = .655 (Adjusted R Squared = .315)
Assessor term: scale usage (level effect)
Interaction: measure of disagreement
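How the F column follows from the sums of squares can be sketched as follows, using the values in the table above. In this fixed-effects analysis every effect is tested against the Error mean square; the dictionary layout is simply an illustrative choice.

```python
# Sums of squares and degrees of freedom taken from the ANOVA table above.
ss = {"Sample": 2111.975, "Judge": 7935.375, "Sample*Judge": 9281.275, "Error": 10198.500}
df = {"Sample": 5, "Judge": 9, "Sample*Judge": 45, "Error": 60}

# Mean square = sum of squares / degrees of freedom.
ms = {term: ss[term] / df[term] for term in ss}

# Fixed-effects F ratio: each effect's mean square over the Error mean square.
f_ratio = {term: ms[term] / ms["Error"] for term in ss if term != "Error"}

for term, f in f_ratio.items():
    print(f"{term:13s} MS = {ms[term]:8.3f}  F = {f:.3f}")
```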
Example of Sample by Judge interaction

[Figure: Interaction plot - Mean data for Sweet. Y axis: Mean Sweet (20 to 80); X axis: Sample (Bardolino, Parador, Rhone, Rioja, Solana, Ventoux); one line per Judge (1 to 10).]
Recent adoptions - Mixed Model ANOVA
Mixed Model:
Samples - Fixed effect
Assessors - Random effect
Example: Wine - Astringent (mixed model)

Tests of Between-Subjects Effects
Dependent Variable: Astringent

Source                        Type III Sum of Squares   df    Mean Square   F         Sig.
Intercept        Hypothesis   326041.875                1     326041.875    369.784   .000
                 Error        7935.375                  9     881.708a
Sample           Hypothesis   2111.975                  5     422.395       2.048     .090
                 Error        9281.275                  45    206.251b
Judge            Hypothesis   7935.375                  9     881.708       4.275     .000
                 Error        9281.275                  45    206.251b
Sample * Judge   Hypothesis   9281.275                  45    206.251       1.213     .240
                 Error        10198.500                 60    169.975c

a. MS(Judge)
b. MS(Sample * Judge)
c. MS(Error)
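The change of denominator in the mixed model can be sketched with the same sums of squares. As the table footnotes indicate, Sample and Judge are tested against the Sample * Judge mean square, and only the interaction is tested against the Error mean square; the variable names are illustrative.

```python
# Sums of squares and degrees of freedom taken from the mixed model table above.
ss = {"Sample": 2111.975, "Judge": 7935.375, "Sample*Judge": 9281.275, "Error": 10198.500}
df = {"Sample": 5, "Judge": 9, "Sample*Judge": 45, "Error": 60}
ms = {term: ss[term] / df[term] for term in ss}

# Denominator (error term) for each effect, following the table footnotes.
denominator = {"Sample": "Sample*Judge", "Judge": "Sample*Judge", "Sample*Judge": "Error"}

f_mixed = {term: ms[term] / ms[denominator[term]] for term in denominator}
for term, f in f_mixed.items():
    print(f"{term:13s} F = {f:.3f} (tested against MS {denominator[term]})")
```

Because the interaction mean square is larger than the error mean square here, the F ratio for Sample is smaller than in the fixed-effects analysis, which is why the Sample effect is no longer significant at the 5% level.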
What's new...
Expanding the ANOVA model to account for other
scaling effects
Brockhoff (2003) Statistical testing of individual differences
in sensory profiling. FQP, 14, 425-434.
Romano et al. (2008) Correcting for different use of the
scale and the need for further analysis of individual
differences in sensory analysis. FQP, 19, 197-209.
Most recently, O5.6 at Pangborn 2011, where Per
Brockhoff introduced a new mixed model ANOVA
that also accounts for scaling differences between
assessors (i.e. the range of the scale used).
Decomposes the interaction into scaling differences +
disagreement, and uses the specific disagreement term in
the F-ratio denominator
3) Multivariate methods, and interpretation pitfalls
Multivariate data analysis

The methodology applied to data that include
simultaneous measurements on many variables is
called multivariate analysis.
Because they analyse all variables together, multivariate
methods are inherently more difficult to understand than
univariate methods (such as ANOVA).
Well known multivariate methods of analysis include:
Principal Components Analysis (PCA)
Generalised Procrustes Analysis (GPA)
Partial Least Squares Regression (PLS)
Multiple Factor Analysis (MFA)
Preference Mapping, etc.
Uses for multivariate analysis

The first important thing to stress is that generally
speaking, most multivariate methods are not used for
inference.
That is, most multivariate methods (e.g. PCA, GPA,
PLS, etc.) are not designed for estimating population
parameters or determining significant differences
between samples.
Multivariate methods, such as PCA, are exploratory in
nature, used for data reduction and data interpretation.
Abuse of PCA: avoid applying it to any and all data
sets!
Be aware of the objectives of the method.
Example of a biplot
[Figure: Sensory Biplot: PC1 vs PC2. X axis: PC1 (53.2%); Y axis: PC2 (28.7%); both axes from -1.0 to 1.0. Samples: Summer, Spiced, Blue+Rasp, Cran+Cherry, Ribena. Attributes: Acid.At, Acid.Bt, Strawberry.Od, Strawberry.Fl, Raspberry.Od, Cranberry.Fl, Sweet.At, Sweet.Bt, Thickness.Mf, Astringent.Mf, Bitter.Bt, Bitter.At, Cloves.Od, Cloves.Fl, Black.leaf.Fl, Black.leave.Od, Medicinal.Fl, Medicinal.Od, MixedBerries.Fl, MixedBerries.Od, Blackcurrant.Od, Blackcurrant.Fl.]
Biplots: interpretation challenges

Cannot interpret as you would a scatterplot or graph
of the means.
Can easily be misinterpreted if taken at face value.
Interpretation can be reasonably subjective:
requires experience
Looking at relative positions between objects
Beware that interpretation rules may be different
depending on the method.
Several interpretation pitfalls that users need to be
aware of:
Example: Pitfall # 1
[Figure: the sensory biplot again, PC1 (53.2%) vs PC2 (28.7%), with the same samples and attributes.]
[Figure: two panels on axes Dim 1 (39.41 %) and Dim 2 (17.39 %). Panel "Attribute correlations": attribute labels such as sweet_fla, runny_txt, straw_fla, acidic_fla, creamy_fla. Panel "Sample scores": numbered samples.]
Pitfall # 3
Samples described similarly by dimensions 1 and 2, but
can differ in dimension 3 or 4...
[Figure: Sample scores plot, Dim 1 (39.41 %) vs Dim 2 (17.39 %).]
Example cont: Pitfall # 4
Example Pitfall # 5: random data

[Figure: GPA Group Average: dimension 1 versus 2, both axes from -1.35 to 1.35, showing object 1 to object 6 and attr 1 to attr 10.]
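The random-data pitfall can be sketched numerically. The slide shows a GPA plot; the block below swaps in a plain PCA as a simpler stand-in, applied to pure noise with 6 objects and 10 attributes (the shape is an assumption read off the plot labels). It finds the first component by power iteration using only the standard library.

```python
import random

random.seed(1)
n_objects, n_attrs = 6, 10   # assumed shape: objects x attributes
data = [[random.gauss(0, 1) for _ in range(n_attrs)] for _ in range(n_objects)]

# Centre each attribute (column) at zero.
means = [sum(row[j] for row in data) / n_objects for j in range(n_attrs)]
centred = [[row[j] - means[j] for j in range(n_attrs)] for row in data]

# Gram matrix of the objects: its non-zero eigenvalues are the PCA variances (up to scale).
gram = [[sum(a * b for a, b in zip(r1, r2)) for r2 in centred] for r1 in centred]
total = sum(gram[i][i] for i in range(n_objects))   # total sum of squares

# Power iteration for the largest eigenvalue (variation along PC1).
v = [random.random() for _ in range(n_objects)]
for _ in range(1000):
    w = [sum(g * x for g, x in zip(row, v)) for row in gram]
    norm = sum(x * x for x in w) ** 0.5
    v = [x / norm for x in w]
top = sum(v[i] * sum(gram[i][j] * v[j] for j in range(n_objects)) for i in range(n_objects))

share = top / total
print(f"PC1 explains {share:.0%} of the variance in pure noise")
```

With only 6 objects the centred data have at most 5 non-zero dimensions, so the first component always carries at least a fifth of the variance, even when there is no structure to find.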
Example Pitfall # 5: PCA with non-significant attributes

[Figure: PCA biplot, PC1 (61.2%) vs PC2 (25.8%), both axes from -1.0 to 1.0. Attributes: Molass.F, Fruit.O, Sweet.F, Curry.O, Fruit.F, Bitter.AT.]
Other pitfalls...
What do the dimensions in the graph
represent?
E.g. Internal vs external preference map.
Underlying dimensions will be different: either
sensory or preference dimensions.
Difficult for clients to understand.
Filtering out the signal from the noise: too
much information presented on the graph?
Simplify the graph; present the most appropriate
solution.
Other approaches...
[Figure: MFA - Coordinates of the projected points (axes F1 and F2: 58.62 %). X axis: F1 (30.11 %); Y axis: F2 (28.50 %). Each product (e.g. Euthymol, Retardex, Sensodyne, Blanx) is plotted twice, as "Expectation" and "Home Use".]
Examples courtesy
of Anne Hasted,
Conclusions
Statistics is a necessary part of the field of sensory
and consumer sciences
Our science is improving
Increased knowledge and understanding of statistical
methods
Increased access to statistical software e.g. free R
programs
Better choices of sensory methods coupled with better
methods of statistical analysis
A basic understanding of statistics is beneficial

Statistics can be fun!
Thank you for your attention!

Questions?
Chantal Gilbert
c.gilbert@campden.co.uk
+44 (0)1386 842256
Chipping Campden
Gloucestershire
GL55 6LD
England
www.campden.co.uk