Program & Bibliography

- 3(1,2): ~5 theory (301, B2) + 10 practice (Comp. Chem. Lab by group)

- Website: / (available from Sep 15)


Nguyễn Hoàng Dũng, PhD.
Trường Đại học Bách khoa Tp. HCM (Ho Chi Minh City University of Technology)

Foreign and Vietnamese Cheeses : Quality and Preference ?

Conduct a research

1. Sampling

2. Measurement

3. Collect data *

4. Analysis and present your results

* Sensory practices

1-1. Samples and Populations

A population consists of the set of all measurements in which the investigator is interested.
A sample is a subset of the measurements selected from the population.
A census is a complete enumeration of every item in a population.

Simple Random Sample

Sampling from the population is often done randomly, such that every possible sample of equal size (n) will have an equal chance of being selected.
A sample selected in this way is called a simple random sample or just a random sample.
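As an illustration of simple random sampling, here is a minimal Python sketch. The population of 100 labelled items, the sample size of 8 and the helper name `simple_random_sample` are assumptions made for the example, not part of the course material.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n distinct items so that every subset of size n is equally likely."""
    rng = random.Random(seed)          # a seed makes the draw repeatable
    return rng.sample(population, n)   # sampling without replacement


# Hypothetical population: 100 cheese items identified by a label.
population = [f"cheese_{i:03d}" for i in range(1, 101)]
sample = simple_random_sample(population, n=8, seed=42)
print(sample)
```

Fixing the seed is only for repeatability of the example; in a real study the draw would normally not be seeded by hand.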

Samples and Populations
Population (N) Sample (n)
Measurements

• The assigning of numbers to the values of a variable (S. S. Stevens, Science 1946;103:677-80)
• Rules specify procedures to assign numbers to values

The criteria of "science"

Science                            Pseudoscience
Logic, experimental evidence       Belief, loyalty
Results are repeatable             Results are not repeatable
Falsifiability*                    Not falsifiable
Peer-reviewed journals             Not in peer-reviewed journals
Evolution / learn from mistakes    Constant, unchanged belief

*capable of being tested (verified or falsified) by experiment or observation

Criteria of measurements

• Validity: the measurement measures what it purports to measure
• Accuracy: the degree of "truthfulness" of an attribute that is being measured
• Reliability (consistency and repeatability)
• Sensitivity to important variation

Accuracy vs reliability (precision)

Measurement error decreases the accuracy of measurement

Some important concepts: Data - Variables

Qualitative - Categorical - Frequency or Nominal. Examples are:
• Color
• Gender
• Nationality

Quantitative - Measurable or Countable. Examples are:
• Temperatures
• Humidity
• Gross compounds
• Preference points scored on a 100 point scale

1.1 Description of the respondent

1.1.1 Gender of the respondent? 1. Male 2. Female
Marital status: 1. Single 2. Married

1.1.2 Age of the respondent?
Under 25
25 – 30
31 – 54
>55

1.1.3 Please state your current occupation.
Pupil, student
Doctor/teacher
Worker / hired labourer / salesperson
Retired

1.1.4 Which of the following levels is your household income?
1. Low (≥ 2 million VND and < 5 million)
2. Medium (≥ 5 million and < 8 million)
3. High (≥ 8 million)
Data set

• 8 cheeses (EdamF, EdamH, GoudaH, m1, m2, m3, m4, m5)
• 11 assessors (experts)
• 3 replicates
• 15 descriptors: sour bitterness umami salty greasiness butter_odor milk_odor acrid rancid lactic cheese_flavor acetic full flavor yellow hard
• Unstructured line scale from 0-100 mm

Some important concepts: Data - Variables

Variable
• Discrete variables
• Continuous variables
• Independent variables
• Dependent variables

Measurement scales
• Nominal scales (Label)
• Ordinal scales (Ranks in Army)
• Interval scales (Celsius, …)
• Ratio scales (true zero point, ratio)

Types of measurement

Qualitative: Nominal, Ordinal
Quantitative: Interval, Ratio

Qualitative measurements

Nominal level
• Classification
• A set of objects can be classified into exhaustive, mutually exclusive categories, each given a unique symbol
• Ex: religion, sex, location, etc.

Ordinal level
• Classification + Ordering
• A set of numbers can be assigned rank values and nothing more.
• Ex: socio-economic status, education, levels of satisfaction, etc.

Quantitative measurements

Interval level
• Classification + Ordering + Standard distance
• A set of objects can be described by units that indicate how far one case is from another case
• Ex: temperature

Ratio level
• Classification + Ordering + Standard distance + Natural zero
• Quantitative variable with natural zero
• Ex: income, age, weight, bone mineral density

1.2.2. Which type of hard cheese do you usually use?
Cheddar
Gouda
Edam
Emmental
Other (please specify) ……………………..

1.2.4. Please indicate your overall liking of semi-hard cheese products
1 2 3 4 5 6 7 8 9

1.2.5. Please indicate how often you use semi-hard cheese products.
> 3 times/week
1 – 2 times/week
1-3 times/month

1.2.6. Please indicate the amount of semi-hard cheese you use in a week
< 100g
100 – 300g
> 300g

1.2.7. In your opinion, which products is hard cheese eaten with?
Bread
Sandwich
Salad
Biscuit
Wine
Other (please specify the name)

1.2.8. When choosing to buy hard cheese products, please indicate how much attention you pay to the following factors (1 = pay no attention at all, 2 = pay no attention, 3 = no opinion, 4 = pay attention, 5 = pay a lot of attention)
Price                                1 2 3 4 5
Sensory properties of the product    1 2 3 4 5
Familiarity                          1 2 3 4 5
Convenience of use                   1 2 3 4 5
Health benefits                      1 2 3 4 5
Product weight                       1 2 3 4 5


judge  session  product  sour  bitterness  umami  salty
S1     1        m1       50    18          0      40
S2     1        m1       100   65          40     100
S3     1        m1       32    11          35     4
S4     1        m1       30    10          25     1
S5     1        m1       60    23          30     29
S6     1        m1       30    35          25     50
S7     1        m1       50    32          45     64
S8     1        m1       32    23          40     40
S9     1        m1       78    27          45     21
S10    1        m1       55    30          34     18
S11    1        m1       62    21          43     32

Summary Measures (population parameters and sample statistics)

Measures of Central Tendency
• Median
• Mode
• Mean

Measures of Variability
• Range
• Variance
• Standard Deviation

Other summary measures
– Skewness
– Kurtosis
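The summary measures listed above can be sketched with Python's standard `statistics` module. The example uses the "sour" scores given by the 11 judges to product m1 in session 1, copied from the data table; the variable names are illustrative only.

```python
import statistics

# "sour" scores of judges S1..S11 for product m1, session 1 (from the table).
sour = [50, 100, 32, 30, 60, 30, 50, 32, 78, 55, 62]

# Measures of central tendency
mean_sour = statistics.mean(sour)
median_sour = statistics.median(sour)
modes_sour = statistics.multimode(sour)   # all most frequent values

# Measures of variability (sample versions, denominator n - 1)
range_sour = max(sour) - min(sour)
var_sour = statistics.variance(sour)
sd_sour = statistics.stdev(sour)

print(mean_sour, median_sour, modes_sour)
print(range_sour, var_sour, sd_sour)
```

Three values (30, 32 and 50) each occur twice here, which is why `multimode` is used rather than a function that returns a single mode.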
1-3. Measures of Central Tendency or Location

• Median: middle value when sorted in order of magnitude; 50th percentile
• Mode: most frequently-occurring value
• Mean: average

Arithmetic Mean or Average

The mean of a set of observations is their average - the sum of the observed values divided by the number of observations.

Population Mean: $\mu = \frac{\sum_{i=1}^{N} x_i}{N}$
Sample Mean: $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$

Arithmetic Mean or Average
Affected by outliers

Median
Robust parameter of central tendency
Not affected by outliers

[Figure: two dot plots. Without an outlier (axis 0 to 10): Median = 5, Mean = 5. With an outlier (axis 0 to 14): Median = 5, Mean = 6.]
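The effect of an outlier on the mean and the median can be checked numerically. The two small data sets below are made up for illustration (the original dot-plot data are not recoverable); they were chosen so that the results match the values quoted on the slide.

```python
import statistics

# Hypothetical data: the second list replaces the largest value by an outlier.
without_outlier = [3, 4, 5, 6, 7]
with_outlier = [3, 4, 5, 6, 12]

mean_a = statistics.mean(without_outlier)
median_a = statistics.median(without_outlier)
mean_b = statistics.mean(with_outlier)
median_b = statistics.median(with_outlier)

print("without outlier: mean =", mean_a, "median =", median_a)
print("with outlier:    mean =", mean_b, "median =", median_b)
```

The median stays at 5 in both cases, while the mean moves from 5 to 6.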

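Mode

[Figure: two dot plots. First data set (axis 0 to 14): Mode = 9. Second data set (axis 0 to 6): without mode.]

Measures of Central Tendency or Location

Mean: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i = \frac{x_1 + x_2 + \dots + x_n}{n}$

For data grouped into $k$ distinct values with counts $n_i$: $\bar{x} = \frac{1}{n}\sum_{i=1}^{k} n_i x_i = \frac{n_1 x_1 + n_2 x_2 + \dots + n_k x_k}{n}$, where $n$ is the sample size.

Median: $\mathrm{med}(x) = x_{(p+1)}$ if $n = 2p + 1$; $\mathrm{med}(x) = \frac{x_{(p)} + x_{(p+1)}}{2}$ if $n = 2p$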
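A short sketch of the median rule for odd and even n and of the grouped (weighted) mean. The function names `median` and `grouped_mean` are chosen for this example; the data are the colour scores and the ordered series used elsewhere in this lesson.

```python
def median(values):
    """Median by parity: x_(p+1) if n = 2p + 1, mean of x_(p), x_(p+1) if n = 2p."""
    xs = sorted(values)
    n = len(xs)
    p = n // 2
    if n % 2 == 1:
        return xs[p]                 # x_(p+1) in 1-based notation
    return (xs[p - 1] + xs[p]) / 2   # average of x_(p) and x_(p+1)


def grouped_mean(distinct_values, counts):
    """Mean from distinct values x_i and their counts n_i."""
    n = sum(counts)                  # sample size
    return sum(c * v for v, c in zip(distinct_values, counts)) / n


print(median([6, 7, 8, 4, 5, 6]))                        # even n
print(median([11, 12, 13, 16, 16, 17, 18, 21, 22]))      # odd n
print(grouped_mean([4, 5, 6, 7, 8], [1, 1, 2, 1, 1]))    # scores 6, 7, 8, 4, 5, 6
```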
Mean or Median ?

• Outliers: median
• Many ties ("ex aequo", discrete variable): mean

Quartiles

The value of the boundary at the 25th, 50th, or 75th percentiles of a frequency distribution divided into four parts, each containing a quarter of the population

25% | (Q1) | 25% | (Q2) | 25% | (Q3) | 25%

Position of the ith quartile: $\text{Position}(Q_i) = \frac{i(n+1)}{4}$

Data classified in increasing order: 11 12 13 16 16 17 18 21 22

Position of Q1 $= \frac{1(9+1)}{4} = 2.5$, so $Q_1 = \frac{12 + 13}{2} = 12.5$
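The position rule i(n + 1)/4 can be sketched as follows. Linear interpolation between the two neighbouring ordered values is assumed when the position is not an integer, which is what the Q1 example does; the function name `quartile` is made up for this example.

```python
def quartile(sorted_data, i):
    """i-th quartile (i = 1, 2, 3) using the position i * (n + 1) / 4."""
    n = len(sorted_data)
    pos = i * (n + 1) / 4            # 1-based position in the ordered data
    lower = int(pos)                 # ordered value just below the position
    frac = pos - lower               # fractional part of the position
    if lower < 1:
        return sorted_data[0]
    if lower >= n:
        return sorted_data[-1]
    below = sorted_data[lower - 1]
    above = sorted_data[lower]
    return below + frac * (above - below)


data = [11, 12, 13, 16, 16, 17, 18, 21, 22]   # already in increasing order
q1, q2, q3 = (quartile(data, i) for i in (1, 2, 3))
print(q1, q2, q3)
```

For this data set the first quartile comes out as 12.5, matching the hand calculation.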

1-4. Measures of Variability or Dispersion

Range
• Difference between maximum and minimum values
Variance
• Mean* squared deviation from the mean
Standard Deviation
• Square root of the variance

* Definitions of population variance and sample variance differ slightly.

Dispersion

Range: $\mathrm{Range}(x) = x_{(n)} - x_{(1)}$

[Figure: two dot plots on an axis from 7 to 12; both have Range = 12 - 7 = 5.]

Interquartile range: $q_{0.75} - q_{0.25}$
Mean (average)

Given a series of values $x_i$ (i = 1, …, n): $x_1, x_2, \dots, x_n$, the mean is:
$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$

Study 1: the color scores of 6 consumers are: 6, 7, 8, 4, 5, and 6. The mean is:
$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i = \frac{6 + 7 + 8 + 4 + 5 + 6}{6} = \frac{36}{6} = 6$

Study 2: the color scores of 4 consumers are: 10, 2, 3, and 9. The mean is:
$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i = \frac{10 + 2 + 3 + 9}{4} = \frac{24}{4} = 6$

Variation

The mean does not adequately describe the data. We need to know the variation in the data.

An obvious measure is the sum of differences from the mean.
For study 1, the scores 6, 7, 8, 4, 5, and 6, we have:
(6-6) + (7-6) + (8-6) + (4-6) + (5-6) + (6-6) = 0
The positive and negative differences cancel out, so this sum is always zero.
Sum of squares

We need to make the differences positive by squaring them. This is called "Sum of squares" (SS).

For study 1: 6, 7, 8, 4, 5, 6, we have:
SS = (6-6)² + (7-6)² + (8-6)² + (4-6)² + (5-6)² + (6-6)² = 10

For study 2: 10, 2, 3, 9, we have:
SS = (10-6)² + (2-6)² + (3-6)² + (9-6)² = 50

This is better!
But it does not take into account sample size n.

Variance

We have to divide the SS by sample size n. But in each square we use the mean to calculate the square, so we lose 1 degree of freedom. Therefore the correct denominator is n-1. This is called variance (denoted by $s^2$):

$s^2 = \frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \dots + (x_n - \bar{x})^2}{n - 1}$

Or, in the sum notation:

$s^2 = \frac{1}{n - 1}\sum_{i=1}^{n} (x_i - \bar{x})^2$

1-5. Variance and Standard Deviation

Population Variance

$\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N} = \frac{\sum_{i=1}^{N} x_i^2 - \frac{\left(\sum_{i=1}^{N} x_i\right)^2}{N}}{N}$, $\sigma = \sqrt{\sigma^2}$

Sample Variance

$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} = \frac{\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}}{n - 1}$, $s = \sqrt{s^2}$

Variance - example

For study 1: 6, 7, 8, 4, 5, and 6, the variance is:
$s^2 = \frac{(6-6)^2 + (7-6)^2 + (8-6)^2 + (4-6)^2 + (5-6)^2 + (6-6)^2}{6 - 1} = \frac{10}{5} = 2$

For study 2: 10, 2, 3, 9, the variance is:
$s^2 = \frac{(10-6)^2 + (2-6)^2 + (3-6)^2 + (9-6)^2}{4 - 1} = \frac{50}{3} = 16.7$

The scores in study 2 were much more variable than those in study 1.
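As a check on the two equivalent forms of the sample variance (the definition with squared deviations and the computational shortcut with the sum of squares), here is a small sketch on the two studies. The function names are illustrative, not taken from the course.

```python
def variance_definitional(xs):
    """s^2 = sum((x_i - mean)^2) / (n - 1)."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)


def variance_computational(xs):
    """s^2 = (sum(x_i^2) - (sum(x_i))^2 / n) / (n - 1)."""
    n = len(xs)
    return (sum(x * x for x in xs) - sum(xs) ** 2 / n) / (n - 1)


study_1 = [6, 7, 8, 4, 5, 6]
study_2 = [10, 2, 3, 9]

for name, scores in (("study 1", study_1), ("study 2", study_2)):
    print(name, variance_definitional(scores), variance_computational(scores))
```

Both forms give 2 for study 1 and 50/3 (about 16.7) for study 2.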

Standard deviation

The problem with variance is that it is expressed in squared units, whereas the mean is in the actual unit. We need a way to convert variance back to the actual unit of measurement.

We take the square root of variance: this is called "standard deviation" (denoted by s)

For study 1, s = sqrt(2) = 1.41
For study 2, s = sqrt(16.7) = 4.1

Standard Deviation

[Figure: three dot plots on an axis from 11 to 21, all with Mean = 15.5. Data A: s = 3.338. Data B: s = 0.9258. Data C: s = 4.57.]
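The standard deviations of the two studies can be reproduced with the standard library. This is only a sketch: `statistics.variance` and `statistics.stdev` use the n - 1 denominator, which matches the sample formulas used in this lesson.

```python
import math
import statistics

study_1 = [6, 7, 8, 4, 5, 6]
study_2 = [10, 2, 3, 9]

# Standard deviation as the square root of the sample variance
sd_1 = math.sqrt(statistics.variance(study_1))
sd_2 = math.sqrt(statistics.variance(study_2))

# statistics.stdev performs the same computation in one call
print(round(sd_1, 2), round(statistics.stdev(study_1), 2))
print(round(sd_2, 2), round(statistics.stdev(study_2), 2))
```

For study 2 the result is about 4.08, which the slide rounds to 4.1.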

1-6 Form indicators: Skewness & Kurtosis

Skewness
• Measure of asymmetry of a frequency distribution
• Skewed to left
• Symmetric or unskewed
• Skewed to right

Kurtosis
• Measure of flatness or peakedness of a frequency distribution
• Platykurtic (relatively flat)
• Mesokurtic (normal)
• Leptokurtic (relatively peaked)

Skewness

Skewed to left: Mean < median < mode

[Figure: histogram (y-axis: Frequency) of a distribution skewed to the left.]
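The slides do not give formulas for these indicators. A common choice, assumed here, is the moment-based definition: skewness = m3 / m2^1.5 and excess kurtosis = m4 / m2^2 - 3, where m_k is the k-th central moment. The sketch below uses made-up data, and the function names are illustrative.

```python
def central_moment(xs, k):
    """k-th central moment: mean of (x - mean)^k (denominator n)."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** k for x in xs) / n


def skewness(xs):
    """Negative: skewed to left. Zero: symmetric. Positive: skewed to right."""
    return central_moment(xs, 3) / central_moment(xs, 2) ** 1.5


def excess_kurtosis(xs):
    """Negative: platykurtic. About zero: mesokurtic. Positive: leptokurtic."""
    return central_moment(xs, 4) / central_moment(xs, 2) ** 2 - 3


symmetric = [1, 2, 3, 4, 5]       # hypothetical data
left_skewed = [1, 5, 5, 5, 5]     # long tail towards small values
right_skewed = [1, 1, 1, 1, 5]    # long tail towards large values

for data in (symmetric, left_skewed, right_skewed):
    print(data, skewness(data), excess_kurtosis(data))
```

Statistical packages often apply small-sample corrections to these quantities, so their output may differ slightly from this uncorrected version.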

Kurtosis

[Figure: two histograms (y-axis: Frequency). Left: Platykurtic - flat distribution. Right: Mesokurtic - not too flat and not too peaked.]


Diagram Quantitative variable

NHDzung – Lesson 1, slide 49 NHDzung – Lesson 1, slide 50

Quantitative variable

If we want to see in detail: the 21 freq. between 1.65 m & 1.70 m distribute in 8 in [1.65 ; 1.675] & 13 in [1.675 ; 1.70]

Quantitative variable: boxplot (box-and-whisker plot)

[Figure: boxplot of x. The box extends from q0.25 to q0.75. The upper whisker ends at the largest value below q0.75 + 1.5(q0.75 - q0.25). The lower whisker ends at the smallest value above q0.25 - 1.5(q0.75 - q0.25).]
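The whisker rule of the boxplot can be sketched in a few lines. The data are the ordered series from the quartile example plus one extra value (40) added here as a hypothetical outlier; `boxplot_summary` is a name made up for this example, and the quartiles come from `statistics.quantiles`, whose default method uses the i(n + 1)/4 positions.

```python
import statistics


def boxplot_summary(values):
    """Quartiles, whisker ends and outliers using the 1.5 * IQR rule."""
    xs = sorted(values)
    q1, q2, q3 = statistics.quantiles(xs, n=4)
    iqr = q3 - q1                         # interquartile range
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    inside = [x for x in xs if lower_fence < x < upper_fence]
    outliers = [x for x in xs if not (lower_fence < x < upper_fence)]
    return {
        "q1": q1, "median": q2, "q3": q3,
        "whiskers": (min(inside), max(inside)),
        "outliers": outliers,
    }


data = [11, 12, 13, 16, 16, 17, 18, 21, 22, 40]   # 40 is a made-up outlier
summary = boxplot_summary(data)
print(summary)
```

Values outside the fences are drawn as individual points rather than being covered by the whiskers.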

Form indicators

[Figure: three distributions with the quartiles Q1, Q2, Q3 marked. Left: asymmetry (γ1 < 0). Centre: symmetry. Right: asymmetry (γ1 > 0).]

Principles of a good "figure"

§ Present complex results clearly, accurately and efficiently
§ Present many ideas in the most efficient way
§ Do not lie!

A BAD figure

[Figure: "Fig. Digestion interactions of coral". Chart of Wins and Losses per coral taxon.]

A GOOD figure

[Figure: "Figure 3. Digestion interactions for coral taxa sampled at Pioneer Bay, Orpheus Island". Chart of Wins and Losses per coral taxon.]
