What is it?
When to use it?
Types of Variables
Designing an Experiment
Case Study
Analyzing the data
Types of evaluation
Users not involved
Supported by practice/theory
Large Sampling
Subjective/qualitative
External validity
degree to which research results apply to real situations
Disadvantage
Cannot know all the factors contributing to users'
performance
e.g., do they use menus more frequently than toolbar buttons
because the icons are not comprehensible, OR because the
buttons are too small, OR simply because they do not know
that they exist, OR ... [can go on]
Internal validity
confidence that we have in our explanation of experimental
results
Common measures
Task completion time
Error rate
Learning rate (novice -> expert transition)
Fatigue, comfort?
etc.
Advantages:
Way of finding out if IV is worth studying
Results easy to interpret and analyze
Some cases do not need more than two levels
investigating two interaction techniques
two educational methods
etc.
Disadvantages:
Sometimes does not say much about the relationship
between the IV and the DV
[Figure: plots of Reading Time against Print Size (12-point vs. 10-point)]
1/22/2013 Comp 4020 - HCI 2 (PPI) 30
Single Variable Experiment
Multilevel Experiments: single variable experiments where IV has > 2
levels
Advantages:
Give a better handle on the IV-DV relationship
The more levels are added, the less critical the range of the IV becomes
(balance between realistic and large enough)
Disadvantages:
Requires more time and effort than 2-level (within-subjects increases time
for each subject, between-subjects requires additional subjects)
Statistical tests more complex
Need to know when to limit the number of levels
Multiple Variable Experiment
The most frequent design combines several IVs so that each level of one
IV is paired with every level of the others
referred to as a factorial design
                    Font Size
             Small   Medium   Large
Caps   Yes
       No
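The pairing of levels in the table above can be sketched in Python. The level names come from the slide; the variable names are illustrative assumptions.

```python
from itertools import product

# Levels taken from the slide's example; variable names are illustrative.
font_sizes = ["Small", "Medium", "Large"]
caps = ["Yes", "No"]

# A full factorial design pairs every level of one IV with every level of the other.
conditions = list(product(caps, font_sizes))

for cap, size in conditions:
    print(f"Caps={cap}, Font Size={size}")

# 2 levels x 3 levels
print(len(conditions), "conditions")
```

Each added IV or level multiplies the number of cells, which is one reason factorial designs become time-consuming.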
Disadvantages
Time-consuming and costly
Analysis more complicated, typically need to do an ANOVA
Assumption that variability in data approximates a normal
distribution (don't know until the experiment is completed)
Interpretation of results is more complex
Problems
Individual differences:
best user 10x faster than slowest
best 25% of users ~2x faster than slowest 25%
Unreliable instruments
e.g., built in clock vs. stop watch
Partial Solution
Reasonable number and range of users tested
Correlate data from repeated measurements
Trial #                      1    2    3    4
Condition                    A    B    B    A
Linear confounding effect   10   20   30   40
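A small sketch of why the A B B A order helps, using the trial values from the slide. It assumes the confound (for example, practice) grows linearly with trial number, as the slide's row suggests.

```python
# Values from the slide: ABBA order and a confound that grows linearly per trial.
order = ["A", "B", "B", "A"]
confound = [10, 20, 30, 40]

# Sum the confounding effect that lands on each condition.
totals = {"A": 0, "B": 0}
for condition, effect in zip(order, confound):
    totals[condition] += effect

print(totals)  # {'A': 50, 'B': 50}
```

Both conditions absorb the same total confound, so a linear effect cancels out. This does not hold if the effect is nonlinear or if transfer is asymmetric.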
If asymmetric transfer,
i.e., A-B transfer > or < B-A transfer, then use a
between-subjects design
Range effects
People tend to perform best in middle of range of trials
does between-subjects design solve this?
Context effect: when one level of IV is used, subjects establish a
context
Split
Subjects?
Hypothesis?
IV?
DV?
Design?
Task(s)?
[Figure: two charts comparing TM and CM by Structure Size]
Statement TM CM
1. I was able to count the number of directories using toolname. 3.65 4.40
2. I was able to find the bitmap (.bmp) files using toolname. 3.70 4.60
3. I was able to detect the type of files using toolname. 3.95 4.55
5. I was able to find the files inside a sub-directory using toolname. 3.05 3.95
6. I was able to find the largest file using toolname. 3.50 3.95
7. I was able to compare the sizes of files using toolname. 3.30 3.90
8. I was able to find the largest directory using toolname. 3.70 4.40
9. After the training session I knew how to use toolname. 4.00 4.35
[Figure: hierarchy of nodes n0 to n10; labels High, Medium, Low; Sunburst]
Range of IVs
Frequency is the number of raw data points that fall into each
score category
 #  Score     #  Score
 1   62      11   55
 2   56      12   42
 3   67      13   61
 4   91      14   58
 5   53      15   70
 6   87      16   47
 7   51      17   62
 8   63      18   36
 9   46      19   74
10   71      20   51
[Figure: frequency histograms over score categories 0-9 through 90-99; one panel labeled Game Player]
By looking at distributions we can
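Counting frequencies can be sketched as follows. The 20 scores are read from the data on this slide; the bin width of 10 is an assumption taken from the score categories on the histogram axis.

```python
from collections import Counter

# Scores 1-20 as listed on the slide.
scores = [62, 56, 67, 91, 53, 87, 51, 63, 46, 71,
          55, 42, 61, 58, 70, 47, 62, 36, 74, 51]

BIN_WIDTH = 10  # assumed from the axis categories 0-9, 10-19, ...

# Map each score to the lower bound of its category and count occurrences.
frequency = Counter((s // BIN_WIDTH) * BIN_WIDTH for s in scores)

for low in range(0, 100, BIN_WIDTH):
    count = frequency.get(low, 0)
    print(f"{low:>2}-{low + BIN_WIDTH - 1:<2} {'#' * count} ({count})")
```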
The smaller the standard deviation, the more closely the scores cluster
around the mean, so the mean describes the data with fewer errors
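One way to see the link between standard deviation and error, sketched with the 20 scores from the earlier slide. Treating them as a whole population (`pstdev`) rather than a sample (`stdev`) is an assumption made here for illustration.

```python
import math
import statistics

# Scores 1-20 as listed on the earlier slide.
scores = [62, 56, 67, 91, 53, 87, 51, 63, 46, 71,
          55, 42, 61, 58, 70, 47, 62, 36, 74, 51]

mean = statistics.mean(scores)

# Error made when the mean stands in for each individual score.
errors = [s - mean for s in scores]
rms_error = math.sqrt(sum(e * e for e in errors) / len(scores))

# The population standard deviation is exactly this root-mean-square error.
print(f"mean = {mean:.2f}")
print(f"RMS error = {rms_error:.2f}")
print(f"population std dev = {statistics.pstdev(scores):.2f}")
print(f"sample std dev = {statistics.stdev(scores):.2f}")
```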
[Figure: two bar charts with vertical axes from 0 to 60; one with categories 1 to 5, one with categories P and NP]
If you plot the raw data in a scatter plot, you will very likely
find some variability or spread
[Figure: example scatter plots, labeled +.87 and -1.0]
How?
obtain the two sets of measurements
calculate correlation coefficient
+1: positively correlated
0: no correlation (no relation)
-1: negatively correlated
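The two steps above can be sketched as follows. The 14 pairs are the condition 1 / condition 2 measurements listed in these slides; the function name `pearson_r` is illustrative, and Pearson's r is assumed to be the coefficient intended.

```python
import math

# Paired measurements from the condition 1 / condition 2 table in the slides.
condition_1 = [5, 4, 6, 4, 5, 3, 5, 4, 5, 6, 6, 7, 6, 7]
condition_2 = [6, 5, 7, 4, 6, 5, 7, 4, 7, 7, 6, 7, 8, 9]


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


r = pearson_r(condition_1, condition_2)
print(f"r = {r:.3f}, r^2 = {r * r:.3f}")  # r = 0.817, r^2 = 0.668
```

The squared value agrees with the r2 = .668 shown on the scatter plot slide.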
[Figure: scatter plot of condition 1 against condition 2, r2 = .668]
[Figure: scatter plot with horizontal axis "Pickles eaten per month" (2.5 to 7.5)]
Which conclusion could be correct?
- Eating pickles causes your salary to increase
- Making more money causes you to eat more pickles
- Pickle consumption predicts higher salaries because
older people tend to like pickles better than younger
people, and older people tend to make more money than
younger people
Correlation
Dangers
attributing causality
a correlation does not imply cause and effect
cause may be due to a third hidden variable related to
both other variables
condition 1   condition 2
     5             6
     4             5
     6             7
     4             4
     5             6
     3             5
     5             7
     4             4
     5             7
     6             7
     6             6
     7             7
     6             8
     7             9
[Figure: scatter plot of Condition 2 (vertical axis, 3 to 9) against Condition 1 (horizontal axis, 3 to 7)]
Interpreting Results from Factorial Experiments
Example:
time it takes subjects to read paragraphs typed in 12-point or
10-point print
8-year-olds in one group, 12-year-olds in another group
Main Effects
To evaluate the main effect of an IV, we must average across the levels of
the other variable
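Averaging across the other variable can be sketched as follows. The cell means are hypothetical values chosen only for illustration, not data from the slides, and the helper name `marginal_mean` is likewise an assumption.

```python
# Hypothetical cell means (reading time); NOT data from the slides.
cell_means = {
    ("8 years", 10): 30, ("8 years", 12): 20,
    ("12 years", 10): 10, ("12 years", 12): 10,
}
ages = ["8 years", "12 years"]
print_sizes = [10, 12]


def marginal_mean(key_index, level):
    """Average all cells that share one level of the IV at key_index."""
    values = [v for key, v in cell_means.items() if key[key_index] == level]
    return sum(values) / len(values)


# Main effect of age: average over print sizes, then compare ages.
age_means = {age: marginal_mean(0, age) for age in ages}
# Main effect of print size: average over ages, then compare sizes.
size_means = {size: marginal_mean(1, size) for size in print_sizes}

print("Age means:", age_means)
print("Print size means:", size_means)
```

A difference between marginal means only suggests a main effect; whether it is reliable is what the ANOVA tests.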
[Figure: Time against Print Size (10, 12) for two age groups (8 years, 12 years). Main effect of print size? yes. Main effect of age? yes.]
Interpreting Results from Factorial Experiments
Interactions
To determine whether the IVs interact we must ask:
is the effect of print size different for each age? (or)
is the effect of age different for each print size?
1st question:
we see that going from 10-point to 12-point causes a decrease in
reading time for 8-year-olds but no difference for 12-year-olds
2nd question:
we see that the difference between reading times for the two
ages is larger for 10-point than for 12-point
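The two questions can be checked numerically. The cell means below are hypothetical values chosen to match the pattern described above; they are not data from the slides.

```python
# Hypothetical cell means (reading time); NOT data from the slides.
cell_means = {
    ("8 years", 10): 30, ("8 years", 12): 20,
    ("12 years", 10): 10, ("12 years", 12): 10,
}

# 1st question: effect of print size (12-point minus 10-point) within each age.
size_effect = {
    age: cell_means[(age, 12)] - cell_means[(age, 10)]
    for age in ("8 years", "12 years")
}

# 2nd question: effect of age (8 years minus 12 years) within each print size.
age_effect = {
    size: cell_means[("8 years", size)] - cell_means[("12 years", size)]
    for size in (10, 12)
}

# If the effect of one IV changes across levels of the other, the IVs interact.
interaction = size_effect["8 years"] != size_effect["12 years"]

print("Print size effect by age:", size_effect)
print("Age effect by print size:", age_effect)
print("Interaction present (descriptively):", interaction)
```

Both questions give the same answer, since they describe the same non-parallel lines from two directions. This check is descriptive only; a statistical test is still needed to decide whether the interaction is reliable.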
[Figure: Time against Print Size (10, 12) for two age groups (8 years, 12 years). Interaction? yes.]
[Figure: two further plots of Time against Print Size for the same age groups. First plot: Print size? No. Age? Yes. Interaction? No. Second plot: Print size? Yes. Age? Yes. Interaction? No.]
Independent Variable    Dependent Variable
Parametric
Non-parametric