
Statistical Processing of Analytical Data

Assoc. Prof. Dr. Tạ Thị Thảo

Lecture summary

The concept of analytical chemistry


Analytical chemistry is the branch of chemistry that studies the composition (qualitative analysis) and the content of each component (quantitative analysis) of the samples under investigation.
Analytical chemistry plays an important role in science and technology and in scientific research; in baseline surveys for developing potential resources and exploiting mineral resources; and in assessing product quality.
Analytical chemistry is an applied science that draws together the achievements of other related sciences such as chemistry, physics, mathematics and informatics, biology and environmental science, space science, oceanography, geology, and geography. It is a highly integrated field spanning many natural sciences, and its ultimate purpose is to bring the greatest benefit to science, daily life, and human development.

WHY STUDY ANALYTICAL CHEMISTRY?

[Diagram: analytical chemistry at the center, linked to its fields of application]
Environment, agriculture, food, toxicology, medicine, criminology, pharmacy, chemical industry, energetics, space industry, geology, history, law, and others.

Qualitative analysis

Quantitative analysis

Chemical analysis

Instrumental analysis

Recent techniques of Analytical Chemistry

[Diagram: analytical chemistry at the center, drawing on the following groups of methods]
Chemical methods, biochemistry, chemometrics, mathematical methods, physico-chemical methods, biological methods, and physical methods.

The analytical process

1. Define the research problem
2. Select the analytical method
3. Sampling
4. Sample preparation
5. Separation and preconcentration of the analyte (if needed)
6. Analysis
7. Statistical treatment of the analytical data and reporting of results

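The analytical process (cont.)

1. Define the research problem

The following must be clearly understood:
Which quantity is to be determined?
Is the analysis qualitative or quantitative?
What will the result be used for? Who will use the information, and when is it needed?
What accuracy and precision are required?
What is the cost of the analysis?
The analyst should be encouraged to propose effective analytical methods and sampling procedures.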

The analytical process (cont.)

2. Select the analytical method, based on:

The type of sample
The number and mass of samples to be taken for analysis
Preconcentration (if the analyte content is too low)
Adequate selectivity (interfering species can be eliminated)
The required accuracy and precision
Available equipment and instruments
The technical staff and their experience
Cost
Analysis time
Is the analytical procedure automated or not?
Is the analytical method available in the reference literature?
Are standard methods available?

The analytical process (cont.)

3. Sampling
Determine:
The type of sample
Representative or random sample
Sample size (number and amount of sample)
Statistical sample or population
Sampling error?

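The analytical process (cont.)

4. Preparation of the analytical sample
Depends on:
Whether the sample is a solid, a liquid, or a gas
Whether the analyte is soluble or insoluble in water
Whether dry ashing or wet digestion is used
Whether interfering species should be separated chemically or masked, if needed
Whether the analyte needs to be preconcentrated
Whether the analyte must be converted into a form that can be determined
Whether the sample medium must be adjusted (pH, added reagents)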

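The analytical process (cont.)

5. Chemical separation, when necessary
Distillation
Precipitation
Solvent extraction
Solid-phase extraction
Chromatographic separation
Electrophoresis
(Chromatographic and electrophoretic separations can be carried out as a stage of the analysis itself.)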

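The analytical process (cont.)

6. Analysis

By chemical methods:
- Gravimetric method
- Volumetric (titrimetric) method (4 titration techniques)
By instrumental methods:
- Construct the calibration curve (find the relationship between the analytical signal and the analyte concentration)
- Analyze the unknown sample
- Analyze calibration check samples/control standards and blank samples
- Analyze replicate samples

7. Treatment of analytical data and reporting of results

Statistical analysis of the experimental data
Reporting the analytical results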

Types of instrumental analytical methods

Signal: Instrumental methods
Emission of radiation: Emission spectroscopy (X-ray, UV, visible, electron, Auger); fluorescence, phosphorescence, and luminescence (X-ray, UV, and visible)
Absorption of radiation: Spectrophotometry and photometry (X-ray, UV, visible, IR); photoacoustic spectroscopy; nuclear magnetic resonance and electron spin resonance spectroscopy
Scattering of radiation: Turbidimetry; nephelometry; Raman spectroscopy
Refraction of radiation: Refractometry; interferometry
Diffraction of radiation: X-ray and electron diffraction methods
Rotation of radiation: Polarimetry; optical rotatory dispersion; circular dichroism
Electrical potential: Potentiometry; chronopotentiometry
Electric charge: Coulometry
Electric current: Polarography; amperometry
Electrical resistance: Conductometry
Mass-to-charge ratio: Mass spectrometry
Rate of reaction: Kinetic methods
Thermal properties: Thermal conductivity and enthalpy methods
Radioactivity: Activation and isotope dilution methods

Comparison of analytical techniques

Method | Approx. range (mol/L) | Approx. precision (%) | Selectivity | Speed | Cost | Principal uses
Gravimetry | 10^-1 to 10^-2 | 0.1 | Poor-mod. | Slow | Low | Inorg.
Titrimetry | 10^-1 to 10^-4 | 0.1-1 | Poor-mod. | Mod. | Low | Inorg., org.
Potentiometry | 10^-1 to 10^-6 | (not given) | Good | Fast | Low | Inorg.
Electrogravimetry, coulometry | 10^-1 to 10^-4 | 0.01-2 | Moderate | Slow-mod. | Mod. | Inorg., org.
Voltammetry | 10^-3 to 10^-10 | 2-5 | Good | Moderate | Mod. | Inorg., org.
Spectrophotometry | 10^-3 to 10^-6 | (not given) | Good-mod. | Fast-mod. | Low-mod. | Inorg., org.
Fluorometry | 10^-6 to 10^-9 | 2-5 | Moderate | Moderate | Mod. | Org.
Atomic spectrometry | 10^-3 to 10^-9 | 2-10 | Good | Fast | Mod.-high | Inorg., multielement
Chromatography | 10^-3 to 10^-9 | 2-5 | Good | Fast-mod. | Mod.-high | Org., multicomponent
Kinetic methods | 10^-2 to 10^-10 | 2-10 | Good-mod. | Fast-mod. | Mod. | Inorg., org., enzyme

The laboratory notebook

This is the notebook used to record the analytical work done in the laboratory, the documents consulted, and any other work you carry out.
Main guidelines:
+ use a hard-cover notebook
+ number the pages consecutively
+ write in ink only
+ never tear out any page
+ date each page clearly, sign it, and have it countersigned by the person in charge
+ state clearly the name of the work carried out, why it was carried out, and the references used
+ record all the data you obtain

The laboratory notebook (cont.)

Example of a laboratory notebook entry:
+ Date of the experiment
+ Name of the experiment
+ Principle of the experiment
+ Reaction used for the quantitation
+ Preparation of the standard solutions and reagents, and the standard concentrations
+ Calculation of concentrations, treatment of the experimental data, calculation of the mean and standard deviation
+ Final result obtained and the data to be reported

Chapter 1: Types of error in analytical chemistry

1. Errors
2. Absolute error and relative error
3. Random error and systematic error
4. Gross error and accumulated error
5. Repeatability and reproducibility
6. Accuracy and precision
7. Significant figures

* Every measurement that is made is subject to a number of errors.

If you cannot measure it, you cannot know it.

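Absolute error and relative error

Absolute error $\varepsilon_X$ = measured value - true value:
$\varepsilon_X = x - \mu$

Relative error: $\varepsilon_x = \varepsilon_X / \mu$
Percent relative error (%): $\varepsilon_x \times 100$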
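As a minimal sketch of the two definitions above, the snippet below computes the absolute and relative error of a single result. The function names and the numbers (10.12 measured against a true value of 10.00) are hypothetical and chosen only for illustration.

```python
def absolute_error(measured: float, true_value: float) -> float:
    """Absolute error: measured value minus true value (the sign is kept)."""
    return measured - true_value


def relative_error(measured: float, true_value: float) -> float:
    """Relative error: absolute error divided by the true value."""
    return absolute_error(measured, true_value) / true_value


# Hypothetical result: 10.12 measured for a true value of 10.00
abs_err = absolute_error(10.12, 10.00)
rel_err_percent = relative_error(10.12, 10.00) * 100
print(f"absolute error = {abs_err:+.2f}, relative error = {rel_err_percent:+.1f} %")
```

Keeping the sign shows whether the result lies above or below the true value.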

Random error (indeterminate error)

This is error that cannot be determined (or cannot be controlled), caused by objective or subjective factors that are not known in advance.
The error can be negative or positive (the experimental value is smaller or larger than the true value).
Random error can be reduced by:
+ carrying out many replicate experiments
+ controlling the experimental conditions well (equipment, method, training of analytical skills)

- Random error follows the normal (Gaussian) distribution for large experimental data sets.
- Random error can be described by statistical parameters.

Systematic error (determinate error)
Causes:
- The person carrying out the experiment
- Uncalibrated instruments and measuring equipment, which bias the results in one direction

Characteristics:
- The error is always positive or always negative
- The result can be corrected
- The systematic error may be constant or variable
To detect systematic error:
- Use a certified reference material (CRM)
- Analyze a blank sample
- Use a different analytical method
- Take part in proficiency testing (several analysts from different laboratories analyze the same sample and compare results)

Types of error
Proportional error influences the slope.
Constant error influences the intercept.

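Accumulated error (propagation of error)
If the nature of the error (random or systematic) is not known, the following rules can be applied:

Computation | Error
Addition or subtraction, $Z = A + B - C$ | $\Delta Z = \Delta A + \Delta B + \Delta C$
Multiplication or division, $Z = AB/C$ | $\Delta Z / Z = \Delta A / A + \Delta B / B + \Delta C / C$
General function, $Z = f(A, B, C)$ | $\Delta Z = |\partial f/\partial A|\,\Delta A + |\partial f/\partial B|\,\Delta B + |\partial f/\partial C|\,\Delta C$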

Addition and subtraction

When measurements are added or subtracted, the absolute errors are added.

Multiplication and division

When measurements are multiplied or divided, the relative errors are added.
Consequently, the absolute errors of the measurements must first be converted to relative errors.
Example 1:
$A = (1.56 \pm 0.04)$ cm, so $\Delta A = 0.04$ cm
$B = (15.8 \pm 0.2)$ cm², so $\Delta B = 0.2$ cm²

$\Delta A / A = 0.04$ cm $/\ 1.56$ cm $= 0.0256$
$\Delta B / B = 0.2$ cm² $/\ 15.8$ cm² $= 0.0127$

Product of A and B: $AB = (1.56$ cm$)(15.8$ cm²$) = 24.648$ cm³ $= 24.6$ cm³ to 3 SF

Adding the relative errors: $\Delta(AB)/(AB) = \Delta A/A + \Delta B/B = 0.0256 + 0.0127 = 0.0383 \approx 0.04$
The % relative error in the product AB is therefore 4 %.
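The worked example can be checked with a few lines of code. This is a sketch only: the helper name `relative_error_of_product` is an assumption made for illustration, and it applies the simple rule stated above (relative errors add), not a quadrature formula.

```python
def relative_error_of_product(values_with_errors):
    """Sum of relative errors for a product or quotient.

    values_with_errors: iterable of (value, absolute_error) pairs.
    """
    return sum(abs(err / value) for value, err in values_with_errors)


A = (1.56, 0.04)   # cm
B = (15.8, 0.2)    # cm^2

rel = relative_error_of_product([A, B])
product = A[0] * B[0]
abs_err = product * rel   # absolute error of the product, in cm^3

print(f"AB = {product:.1f} cm^3, relative error = {rel:.4f}, absolute error = {abs_err:.1f} cm^3")
```

Multiplying the relative error by the product converts it back to an absolute error in the units of AB.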

Sources of error
Sampling
- Is the sample representative?
- Is the sample homogeneous or heterogeneous?

How about sampling a chocolate chip cookie?

Sample preparation
- Loss of sample
- Contamination (from outside)
Analysis
- Quantitation of the analyte
- Construction of the calibration curve
- Instrument error
- Error from the standard solutions

1. Constant error
2. Variable error
3. Error introduced during the analytical process
4. Instrument error
5. Analyst error
6. Method error
7. Combined error from several sources

Repeatability and reproducibility
- Both describe how close individual results are to one another when they are obtained by the same method on the same analytical sample.
- If the measurements are made under the same conditions (same analyst, same equipment, same laboratory, within a short period of time), the closeness is called repeatability.
- If they are made under different conditions (different analysts, equipment, laboratories, and times), it is called reproducibility.

Precision and accuracy
True value: the standard value, the reference value, or the accepted value.
Accuracy: how close a result is to the true value.
Precision: the reproducibility, or agreement, of the measured values in replicate measurements.

Accuracy versus precision

Accuracy
- Achieved only when the measured values are close to the true value
- Requires reducing both random error and systematic error
- Always use a certified reference material (CRM) to compare results

Precision
- Shows how scattered the individual results are around the mean
- Indicates the reproducibility of the measurement
- The precision of a method is improved by reducing random error

Chapter 2: Statistical quantities

How can the total error of the analytical process be assessed?
- Use a reference standard as a check sample.
- The reference standard must be taken through the same procedure as the analytical sample, so that the closeness of the experimental results to the stated composition of the standard can be examined.

Population and sample
(population vs. sample)
The population is the set of all samples of a given kind of object.
Example: the ascorbic acid content of 100 mg vitamin C tablets is to be determined. All 100 mg vitamin C tablets then make up the population.

A statistical sample is the part of the population that is taken for analysis.
If one bottle of 100 mg vitamin C tablets is taken for analysis, it is the statistical sample.
Among the statistical samples, the one actually carried through the analysis is called the analytical sample.
Generally only data for samples is available since it is generally
impossible to obtain data for the whole population

Statistical quantities

Mean: $\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i$

Median: if the N replicate values in the data set are arranged in increasing or decreasing order $x_1, x_2, \ldots, x_N$, the value in the middle of the data set is the median.
- If N is odd, the median is the middle value of the series.
- If N is even, the median is the arithmetic mean of the two middle values.
Mode: the value that occurs most frequently in the replicate data set.
Range:
$R = x_{max} - x_{min}$
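The four quantities above map directly onto Python's standard `statistics` module. The replicate values below are hypothetical and serve only to illustrate the calls.

```python
import statistics

# Hypothetical replicate results (for example, mg per tablet)
data = [10.1, 10.3, 10.2, 10.3, 10.6]

mean = statistics.mean(data)       # arithmetic mean
median = statistics.median(data)   # middle value of the sorted data
mode = statistics.mode(data)       # most frequent value
data_range = max(data) - min(data) # R = x_max - x_min

print(f"mean = {mean:.2f}, median = {median}, mode = {mode}, range = {data_range:.1f}")
```

For an even number of values, `statistics.median` returns the average of the two middle values, as described in the text.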

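Standard deviation

Population (not a representative sample): measures the variability when the whole population is analyzed
$\sigma = \sqrt{\frac{\sum_{i=1}^{N}(x_i - \mu)^2}{N}}$

Statistical sample: estimates the variability of the population from a part of it
$s = \sqrt{\frac{\sum_{i=1}^{N}(x_i - \bar{x})^2}{N-1}} = \sqrt{\frac{\sum_{i=1}^{N} x_i^2 - \left(\sum_{i=1}^{N} x_i\right)^2 / N}{N-1}}$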

Why divide by N-1 when calculating


s?
N-1 = degrees of freedom (Df) of sample
number of independent values on which a
result is based, or the number of values in
the final calculation of a statistic that are free
to vary
for a population Df = N
for a sample Df = N-1
one Df lost when calculating the Average of a
sample

More on Dfs
To calculate the std. dev. of a random sample, we must first calculate
the mean of that sample and then compute the sum of the several
squared deviations from that mean.
While there will be n such squared deviations only (n - 1) of them are,
in fact, free to assume any value whatsoever.
This is because the final squared deviation from the mean must include
the one value of X such that the sum of all the Xs divided by n will
equal the obtained mean of the sample.
All of the other (n - 1) squared deviations from the mean can,
theoretically, have any values whatsoever.
For these reasons, std. dev. of a sample is said to have only (n - 1)
degrees of freedom.

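Data set from the population
When the number of experiments is large:
$\bar{x} \to \mu$, the population mean
$s \to \sigma$, the population standard deviation

The smaller the standard deviation obtained in an experiment, the more precise the result.
However, good precision does not guarantee good accuracy.
Experimental results are usually expressed as:
mean ± standard deviation
$\bar{x} \pm s$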

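Standard error
(standard deviation of the mean)
When the standard deviation is calculated for a set of mean values, the scatter among the means is smaller than the scatter among individual values: it equals the standard deviation divided by the square root of the number of experiments.

s = standard deviation of the individual values
s_m = standard deviation of the means

$s_m = \frac{s}{\sqrt{N}}$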
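A short sketch of the difference between the sample standard deviation (N - 1 in the denominator), the population standard deviation (N in the denominator), and the standard error of the mean. The data are hypothetical.

```python
import math
import statistics

# Hypothetical replicate results
data = [10.1, 10.3, 10.2, 10.3, 10.6]

s = statistics.stdev(data)        # sample standard deviation, divides by N - 1
sigma = statistics.pstdev(data)   # population standard deviation, divides by N
s_m = s / math.sqrt(len(data))    # standard error of the mean, s / sqrt(N)

print(f"s = {s:.4f}, sigma = {sigma:.4f}, s_m = {s_m:.4f}")
```

For a small data set s is noticeably larger than the value obtained by dividing by N, which is why the N - 1 form is used for samples.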

Accuracy and Precision

[Figure: targets illustrating accuracy and precision. The center of the target is the true value.]

Case | Statistical meaning | Interpretation
Accurate and precise | Standard deviation (SD) or coefficient of variation (CV%) small; relative error small | The differences between measurements are very small and all measured values are close to the true value. A standard value and the true value are needed for comparison.
Precise only, not accurate | SD or CV% small; relative error large | The measured values agree with one another but deviate from the true value (the "target shooting" effect). The error comes from the calibration curve or the measurement, or from systematic error.
Neither precise nor accurate | SD or CV% large; relative error large | The experiment must be repeated, changing the method or having another person carry it out.

Expressing accuracy and precision
Accuracy:
- Mean (average)
- Relative error (percent error)
Precision:
- Range
- Deviation
- Standard deviation
- Coefficient of variation
(See also in chapter 3)

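Other ways of expressing the precision of a data set
Variance:
$s^2 = \frac{\sum_{i=1}^{N}(x_i - \bar{x})^2}{N-1} = \frac{\sum_{i=1}^{N} x_i^2 - \left(\sum_{i=1}^{N} x_i\right)^2 / N}{N-1}$

Standard deviation: $s = \sqrt{s^2}$

Relative standard deviation: $RSD = \frac{s}{\bar{x}}$

Coefficient of variation: $CV = \%RSD = \frac{s}{\bar{x}} \times 100$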

[Figure: box-and-whisker plot produced in Minitab 14, with labels for outliers, small deviation, large deviation, the median, and the range]

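Significant figures
These are the digits in an analytical result that reflect the accuracy of the measurement and the precision of the measured value.
A result is reported with the smallest number of significant figures among the data (in multiplication and division) or with the smallest number of decimal places (in addition and subtraction).
Conventions for significant figures:
- Nonzero digits such as 1, 6, 9
- Zeros between significant digits
- Zeros after a nonzero digit and to the right of the decimal point

Rounding
Digits must be rounded to the correct number of significant figures when reporting analytical results.
If the discarded digit is 6, 7, 8, or 9, increase the preceding digit by 1.
If the discarded digit is 1, 2, 3, or 4, leave the preceding digit unchanged.
If the discarded digit is 5, round the preceding digit to the nearest even digit.
Example: 2.25 rounds to 2.2;
2.35 rounds to 2.4.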
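The round-half-to-even rule can be reproduced with Python's `decimal` module. This is a sketch; the helper name `round_half_even` is hypothetical. Values are passed as strings so that binary floating-point representation does not disturb the trailing 5.

```python
from decimal import Decimal, ROUND_HALF_EVEN


def round_half_even(value: str, places: str = "0.1") -> Decimal:
    """Round a decimal string to the given precision, sending ties to the even digit."""
    return Decimal(value).quantize(Decimal(places), rounding=ROUND_HALF_EVEN)


for raw in ("2.25", "2.35", "2.26", "2.24"):
    print(raw, "->", round_half_even(raw))
```

Rounding ties to the even digit avoids the slight upward bias that always rounding 5 upward would introduce over many results.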

Expressing measurement uncertainty
There are 3 ways of recording measurement uncertainty:
1. Record the absolute value of the uncertainty
2. Record the relative uncertainty (%)
3. Use significant figures
+ Record all the digits that are known with certainty
+ Record one additional uncertain digit
Example: 12.234

Examples of expressing analytical data

Weighed mass: 9.82 ± 0.2385 g = 9.82 ± 0.24 g
Speed: 6051.78 ± 30 m/s = 6052 ± 30 m/s

Expressing measurement uncertainty
- Round the uncertainty to one significant figure, unless its leading digit is 1: if Δx = 0.14, report Δx = 0.14, not 0.1.
- In intermediate calculations, keep one extra significant figure.

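Chapter 3: Theoretical distributions

Presenting analytical results graphically
(using software such as Origin, Minitab, Excel)
- Line graphs (straight lines, curves)
- Frequency plots
- Bar charts
- Pie charts
- Three-dimensional (3D) plots

Distribution functions and distribution criteria

What is a distribution function? What is a distribution criterion?
A distribution function is a mathematical function that expresses the law governing how a data set is distributed; it is illustrated by a graph.
Applications:
- shows how the data are distributed
- predicts how likely given values are to occur
A distribution function can be continuous or discrete.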

Types of distribution function
Binomial distribution
Normal distribution
Poisson distribution
Exponential distribution
Logistic distribution
t-distribution
Chi-squared distribution
F-distribution
Gamma distribution
Hypergeometric distribution
Laplace distribution

Binomial Distribution Graphic

From http://mathworld.wolfram.com/BinomialDistribution.html

When the number of experiments in a data set is large, the results tend toward a distribution described by the curve known as the normal or Gaussian distribution curve.
Characteristics of the normal distribution:
The mean $\bar{x}$ gives the center of the distribution.
The standard deviation s gives the width of the distribution.

If the errors in a data set are random errors, the data set follows the normal distribution.

Gaussian Distribution of Random Errors (Population)

The Gaussian curve equation:
$y = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^2/(2\sigma^2)}$

$\frac{1}{\sigma\sqrt{2\pi}}$ = normalization factor. It guarantees that the area under the curve is unity.

Probability of measuring a value in a certain range = area below the graph over that range.

A Gaussian curve whose area is unity is called a normal error curve. The standard normal curve has $\mu = 0$ and $\sigma = 1$.

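Gaussian Distribution of Random Errors (cont.)

Another way to express the normal distribution is to transform the x axis into a new variable (the z axis):
$z = \frac{x - \mu}{\sigma} \approx \frac{x_i - \bar{x}}{s}$
where z is the deviation of a data point from the mean, expressed in units of the standard deviation.

Normal distribution
The standard deviation gives the width of the normal distribution (the larger σ is, the flatter the curve).

Range | Percentage of measurements (%)
μ ± 1σ | 68.3
μ ± 2σ | 95.5
μ ± 3σ | 99.7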
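The percentages quoted for the normal distribution follow from the area under the Gaussian curve. As a sketch, the fraction of measurements within k standard deviations of the mean equals erf(k/√2), which Python's `math.erf` evaluates directly; the function name `fraction_within` is hypothetical.

```python
import math


def fraction_within(k: float) -> float:
    """Fraction of a normal population lying within mean +/- k standard deviations."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(f"mean +/- {k} sigma: {100 * fraction_within(k):.1f} %")
```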

The more experiments are carried out, the higher the confidence, and the mean approaches the true value.
The measurement uncertainty is proportional to $1/\sqrt{n}$.

Data Transformation
What do you do if your data is not normally distributed?
- Use a non-parametric test
- Transform your data
Logarithmic transformation: Variable x → log(Variable x + 1)
Power transformation: e.g. Variable x → √(Variable x)
Angular transformation: e.g. Variable x → arcsine(√(Variable x))

Poisson Distribution
Typically used to model the number of
random occurrences of some
phenomenon in a specified unit of space
or time.
E.g. The number of birds seen in a 10 min
period

Can usually be approximated by a normal


distribution

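Exponential Distribution
Describes a continuous variable whose probability density decays exponentially, $f(x) = \lambda e^{-\lambda x}$ for $x \ge 0$.
Messy to work with, but the data can (sometimes) be transformed, or you can use a nonparametric test.

Logistic Distribution
A symmetric, bell-shaped distribution whose cumulative distribution function is the logistic function, $F(x) = 1/(1 + e^{-(x-\mu)/s})$.
Again, (sometimes) messy to work with, but the data can be transformed, or you can use a nonparametric test.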

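Student's t distribution

[Figure: t distribution compared with the normal distribution]
- The smaller N is, the flatter the t distribution.
- When N is very large, the t distribution approaches the normal distribution.

t-distribution (1-sided)

Fisher distribution (F-distribution)

It is the distribution used to examine whether two variables in a data set have the same variance.
It is the ratio of two squared statistics (variances):
$F = S_1^2 / S_2^2$
ANOVAs are based on F distributions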

Chi-squared Distribution
This is also based upon degrees of
freedom
Can be used to approximate many
different distributions
For example, may be used to
approximate the sampling distribution of
the likelihood ratio statistic (may cover
this later)

Chi-square Distribution
examples

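Estimating random error

The random error ($\varepsilon_x$) of a data set is calculated from:
$\varepsilon_x = \pm t_{p,f}\, s_m = \pm \frac{t_{p,f}\, s}{\sqrt{N}}$

where $t_{p,f}$ is the Student t value taken from the table at statistical confidence level P and degrees of freedom f = N - 1 (N is the number of replicate experiments).

Confidence interval
At a given statistical confidence level (usually 95%), the true value lies within $\pm\varepsilon_x$ of the mean:
$\bar{x} - \frac{t_{p,f}\, s}{\sqrt{N}} \le \mu \le \bar{x} + \frac{t_{p,f}\, s}{\sqrt{N}}$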
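A minimal sketch of the confidence-interval calculation, assuming a 95 % confidence level. The dictionary `T_95` holds standard two-sided Student t values for f = 1 to 10; the function name and the data are hypothetical.

```python
import math
import statistics

# Two-sided Student t values at 95 % confidence, indexed by f = N - 1
# (standard table values for f = 1..10).
T_95 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
        6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262, 10: 2.228}


def confidence_interval(data):
    """Return (mean, half_width) of the 95 % confidence interval of the mean."""
    n = len(data)
    mean = statistics.mean(data)
    s = statistics.stdev(data)
    half_width = T_95[n - 1] * s / math.sqrt(n)
    return mean, half_width


mean, half_width = confidence_interval([10.1, 10.3, 10.2, 10.3, 10.6])
print(f"mu = {mean:.2f} +/- {half_width:.2f} (95 %)")
```

The half-width shrinks with more replicates both because t decreases and because of the 1/√N factor.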

Chapter 4: Statistical tests

Statistical hypothesis:
- Answers the question of whether or not there is a difference between the results obtained.
- We need to test whether the hypothesis that the measured results belong to the same population is true or false.
- Procedure: state the statistical hypothesis, then analyze the data statistically to obtain the probability associated with that hypothesis.
- Usually the hypothesis is assumed to be true (the null hypothesis) and the probability that it is true is calculated.

Statistical conclusions
- The hypothesis under test (that the results are the same) is rejected if the type I error (rejecting a true hypothesis) occurs in fewer than 100α = 1% of all cases (that is, P value < 0.01); the difference is then statistically significant at the 99% confidence level.
- The hypothesis under test is accepted if the type I error exceeds 100α = 5% of all cases (P value > 0.05); the difference is then not significant, and the results are regarded as the same at the 95% confidence level.
- If the type I error lies between 1% and 5% (0.95 < P < 0.99, or 0.01 < P value < 0.05), the outcome is doubtful and more measurements must be made.
In many cases it is enough to use the 95% confidence level for all comparisons (that is, to check only whether P value > 0.05 or P value < 0.05).

Eliminating gross errors

Option 1: Examine the experiment objectively to find the cause of the gross error.
Option 2: Keep the result but minimize its influence by using the median.
Option 3: Use a statistical test to reject the outlier.
Test 1: the 2σ rule
Test 2: the Dixon test

The Dixon test (Q test) for rejecting outliers

- Arrange the data in increasing or decreasing order.
- Use the Q test to assess how far the suspect result lies from the rest of the data set. Calculate Q from:
$Q = \frac{|x_{suspect} - x_{nearest}|}{x_{max} - x_{min}}$
- Compare with the tabulated critical value Q(P = 0.90, N).
- If Q_calc > Q_table, the suspect value is a gross error.
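The Q-test steps can be sketched as follows. The critical values in `Q_CRIT_90` are the commonly tabulated 90 % values for N = 3 to 10; the function names and the data set are hypothetical.

```python
# Commonly tabulated Dixon Q critical values at 90 % confidence, N = 3..10
Q_CRIT_90 = {3: 0.941, 4: 0.765, 5: 0.642, 6: 0.560,
             7: 0.507, 8: 0.468, 9: 0.437, 10: 0.412}


def dixon_q(data):
    """Return (suspect_value, Q) for the more extreme end of the sorted data."""
    xs = sorted(data)
    spread = xs[-1] - xs[0]
    if spread == 0:
        raise ValueError("all values are identical; Q is undefined")
    q_low = (xs[1] - xs[0]) / spread      # gap between lowest value and its neighbor
    q_high = (xs[-1] - xs[-2]) / spread   # gap between highest value and its neighbor
    return (xs[-1], q_high) if q_high >= q_low else (xs[0], q_low)


def is_gross_error(data):
    """True if the suspect value fails the Q test at 90 % confidence."""
    _, q = dixon_q(data)
    return q > Q_CRIT_90[len(data)]


data = [0.403, 0.410, 0.401, 0.380, 0.400, 0.413, 0.408]  # hypothetical
suspect, q = dixon_q(data)
print(f"suspect = {suspect}, Q = {q:.3f}, reject = {is_gross_error(data)}")
```

The test is applied to one suspect value at a time; after a rejection, the mean and standard deviation are recalculated from the remaining data.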

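Calculating the number of replicate experiments required

$\mu = \bar{x} \pm \frac{t s}{\sqrt{n}}$, hence $n = \frac{t^2 s^2}{e^2}$

μ = true value or desired value
$\bar{x}$ = measured value
n = number of replicate samples required
s² = variance
e = difference between the measured value and the true value
t is first taken from the table for n = ∞.
The calculation is repeated until n no longer changes.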

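Comparing the population mean with the true value (one-sample z test)
Purpose: to test whether the population mean μ differs significantly from a given true value μ0.
The statistical hypothesis is H0: μ = μ0; if it is not satisfied, then μ > μ0 or μ < μ0 at the given statistical confidence level.
Choose the significance level α, the probability of rejecting H0 when it is true, according to the confidence level used; the test applies when the data are normally distributed.
Calculate z and compare it with the tabulated z value in Table 4.3.
- If z < -1.96 or z > 1.96, reject the null hypothesis (with α = 0.05).
- In computer software: the null hypothesis is accepted if P value > 0.05 (that is, the values are the same).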

Comparing the mean with the true value (one-sample t test)
Calculate t:
$\mu = \bar{x} \pm \frac{t s}{\sqrt{N}}$, so $t_{calc} = \frac{|\bar{x} - \mu|\sqrt{N}}{s}$

Compare t_calc with t_table (P = 0.95, f = N - 1).
If t_calc > t_table (or P value < 0.05), the two values differ significantly.
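A sketch of the one-sample t test. The replicate data and the reference value `mu0` are hypothetical; the critical value 2.776 is the standard two-sided t for P = 0.95 and f = 4.

```python
import math
import statistics


def one_sample_t(data, mu0):
    """t_calc = |mean - mu0| * sqrt(N) / s"""
    n = len(data)
    return abs(statistics.mean(data) - mu0) * math.sqrt(n) / statistics.stdev(data)


data = [10.1, 10.3, 10.2, 10.3, 10.6]   # hypothetical replicates
t_calc = one_sample_t(data, mu0=10.0)   # hypothetical reference value
t_table = 2.776                         # two-sided t, P = 0.95, f = N - 1 = 4
significant = t_calc > t_table

print(f"t_calc = {t_calc:.2f}, t_table = {t_table}, significant difference: {significant}")
```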

Comparing two data sets

Comparing means (t test, two samples)
Unpaired data
- The samples compared are drawn from the same population.
- Example: comparing the results of two different laboratories for Pb in Hanoi groundwater.

Paired data
- The samples do not come from the same population.
- Example: comparing the blood cholesterol content of different patients determined by two analytical methods.

Comparing the variances of two data sets (F test)
- Applies only to unpaired data.

Comparing two variances (F test)

- Calculate F, with the larger variance in the numerator:
$F_{calc} = \frac{S_1^2}{S_2^2}$
- Compare F_calc with F_table (P = 0.95, f1, f2).
- If F_calc > F_table (or P value < 0.05), S1 and S2 differ significantly.

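Comparing two means (unpaired data)
Procedure:
1. Compare the two variances
2. Compare the two means

If s1 and s2 do not differ significantly (the F test passes):
- Calculate t_calc and compare it with t_table (P = 0.95, f = n1 + n2 - 2).
- If t_calc > t_table (or P value < 0.05), the difference is statistically significant.

$s_{pooled} = \sqrt{\frac{s_1^2 (n_1 - 1) + s_2^2 (n_2 - 1)}{n_1 + n_2 - 2}}$

$t_{calc} = \frac{|\bar{x}_1 - \bar{x}_2|}{s_{pooled}} \sqrt{\frac{n_1 n_2}{n_1 + n_2}}$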
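The two-step comparison (F test first, then the pooled t test) can be sketched as below. The two data sets are hypothetical, the function names are assumptions, and the critical values must still be taken from F and t tables for the chosen confidence level.

```python
import math
import statistics


def f_ratio(a, b):
    """F = larger variance / smaller variance, so that F >= 1."""
    va, vb = statistics.variance(a), statistics.variance(b)
    return max(va, vb) / min(va, vb)


def pooled_t(a, b):
    """t_calc for two unpaired data sets whose variances do not differ significantly."""
    n1, n2 = len(a), len(b)
    s1_sq, s2_sq = statistics.variance(a), statistics.variance(b)
    s_pooled = math.sqrt((s1_sq * (n1 - 1) + s2_sq * (n2 - 1)) / (n1 + n2 - 2))
    diff = abs(statistics.mean(a) - statistics.mean(b))
    return diff / s_pooled * math.sqrt(n1 * n2 / (n1 + n2))


lab_a = [10.1, 10.3, 10.2, 10.3, 10.6]   # hypothetical results, laboratory A
lab_b = [10.5, 10.7, 10.6, 10.8, 10.9]   # hypothetical results, laboratory B

F = f_ratio(lab_a, lab_b)
t = pooled_t(lab_a, lab_b)
print(f"F = {F:.2f}, t = {t:.2f}, f = {len(lab_a) + len(lab_b) - 2}")
```

Here f = 8, for which the tabulated two-sided t at P = 0.95 is 2.306, so a t_calc above that value indicates a significant difference between the two means.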

Comparing two means (unpaired data) (cont.)
Procedure
If S1 and S2 differ significantly (the F test fails):
- Calculate t_calc from:
$t_{calc} = \frac{|\bar{x}_1 - \bar{x}_2|}{\sqrt{s_1^2/n_1 + s_2^2/n_2}}$
- Look up t_table (P = 0.95, f), with the degrees of freedom
$f = \frac{\left(s_1^2/n_1 + s_2^2/n_2\right)^2}{\frac{(s_1^2/n_1)^2}{n_1 - 1} + \frac{(s_2^2/n_2)^2}{n_2 - 1}}$
- If t_calc > t_table (or P value < 0.05), the two values differ significantly.

Comparing two means (unpaired data) (cont.)
Interval (range) method

- Calculate the confidence interval of each mean.
- Compare the confidence intervals.
- The results do not differ significantly if the confidence intervals overlap (each confidence interval contains the mean of the other data set).

Comparing two means (paired data)

- Calculate the difference for each pair of data: $d_i = x_{Ai} - x_{Bi}$
  The value may be + or -. Do not take the absolute value of the difference.
- Calculate the mean $\bar{d}$ and the standard deviation $s_d$ of the differences.
- Calculate t:
$t_{calc} = \frac{|\bar{d}|\sqrt{N}}{s_d}$
- If t_calc > t_table (or P value < 0.05), the two results differ significantly (f = N - 1).
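A sketch of the paired t test, assuming f = N - 1 degrees of freedom. The paired values below (two methods applied to the same six samples) are hypothetical, as is the function name.

```python
import math
import statistics


def paired_t(a, b):
    """t_calc = |mean(d)| * sqrt(N) / s_d, with signed differences d_i = a_i - b_i."""
    d = [x - y for x, y in zip(a, b)]
    return abs(statistics.mean(d)) * math.sqrt(len(d)) / statistics.stdev(d)


method_a = [1.46, 2.22, 2.84, 1.97, 1.13, 2.35]   # hypothetical, method A
method_b = [1.42, 2.38, 2.67, 1.80, 1.09, 2.25]   # hypothetical, method B

t_calc = paired_t(method_a, method_b)
t_table = 2.571   # two-sided t, P = 0.95, f = N - 1 = 5
print(f"t_calc = {t_calc:.2f}, significant difference: {t_calc > t_table}")
```

The differences keep their sign before averaging, as the text requires; only the final mean difference is taken as an absolute value.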

Which type of t test should be used?

- Do the analytical samples come from the same source?
- Do the sets being compared have the same number of experiments?
- Are the variances of the two sets the same?

Chapter 5: ANOVA (analysis of variance)

The t distribution tests the difference between two means.
The t test is not suitable in the following cases:
- It is based on only part of the data
- There are too many comparisons, only some of which are statistically significant
- It cannot compare many group means within the same population

The need for ANOVA

The statistical hypothesis tested in ANOVA: the sample means are different.
ANOVA must be used because we need to test whether the variance among the means differs significantly or not.
Another application: ANOVA is used to evaluate the main effects and interaction effects of one or more factors on the experimental result.

Algorithm for comparing means by ANOVA
Statistical hypotheses: H0: μ1 = μ2 = μ3 = ... = μk
Ha: at least 2 means are different
- Test method: use the Fisher (F) test (it cannot be applied if the group variances differ)
- Calculate the P value (two-sided) and report it as: F(df between groups, df within groups) = ..., p = ...
- If F_calc > F_table, the null hypothesis is rejected, that is, the means of the statistical samples differ significantly.

ANOVA for evaluating the effect of a factor on the experimental result
Conditions for using ANOVA:
- The samples are drawn at random from the population
- The groups of samples are analyzed independently of one another
- The replicate experiments on each sample are carried out independently of one another
- The population data must follow the normal distribution

Two Sources of Variability


In ANOVA, an estimate of variability
between groups is compared with
variability within groups.
Between-group variation is the variation among
the means of the different treatment conditions
due to chance (random sampling error) and
treatment effects, if any exist.
Within-group variation is the variation due to chance (random sampling error) among individuals given the same treatment.

Total variation among scores:
- Within-groups variation: variation due to chance.
- Between-groups variation: variation due to chance and treatment effect (if any exists).

Variability Between Groups

There is a lot of variability from one mean to the next.


Large differences between means probably are not due to
chance.
It is difficult to imagine that all six groups are random
samples taken from the same population.
The null hypothesis is rejected, indicating a treatment
effect in at least one of the groups.

One-way ANOVA
Quantities to calculate:
$Y_{i,j}$ = grand mean + group effect + $\varepsilon_{i,j}$
$Y_{i,j}$ = the i-th analytical result in the j-th group
Group effect = the difference between the mean of the j-th group and the grand mean
$\varepsilon_{i,j}$ is a random value from a normal distribution with mean 0

The F Ratio
$F = \frac{\text{between-group variability}}{\text{within-group variability}} = \frac{MS_{between}}{MS_{within}}$

[Diagram: the total variation among scores splits into within-groups variation (mean squares within) and between-groups variation (mean squares between)]

$MS_{within} = \frac{SS_{within}}{df_{within}}$

$MS_{between} = \frac{SS_{between}}{df_{between}}$

$SS_{total} = SS_{between} + SS_{within}$

The F Ratio: SS Between

$SS_{between} = \sum n\,(\bar{X}_{group} - \bar{X}_{grand})^2 = \sum \frac{T^2}{n} - \frac{G^2}{N}$

T²/n: find each group total T, square it, and divide by the number of subjects n in the group.
G²: the grand total G (add all of the scores together), squared.
N: total number of subjects.

The F Ratio: SS Within

$SS_{within} = \sum (X - \bar{X}_{group})^2 = \sum X^2 - \sum \frac{T^2}{n}$

ΣX²: square each individual score and then add up all of the squared scores.
T²: squared group total.
n: number of subjects in each group.

The F Ratio: SS Total

$SS_{total} = \sum (X - \bar{X}_{grand})^2 = \sum n\,(\bar{X}_{group} - \bar{X}_{grand})^2 + \sum (X - \bar{X}_{group})^2$

$SS_{total} = \sum X^2 - \frac{G^2}{N}$

Degrees of freedom:
Between: $df_{between}$ = number of groups - 1
Within: $df_{within} = (n_1 - 1) + (n_2 - 1) + (n_3 - 1) + \ldots$ = total number of subjects - total number of groups
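The computational formulas for the sums of squares translate into a few lines of code. This is a sketch with a hypothetical function name and made-up groups; the resulting F must still be compared with the tabulated F for (df between, df within).

```python
def one_way_anova(groups):
    """Return (SS_between, SS_within, df_between, df_within, F) for a list of groups."""
    scores = [x for g in groups for x in g]
    N = len(scores)                                          # total number of subjects
    k = len(groups)                                          # number of groups
    G = sum(scores)                                          # grand total
    sum_x_sq = sum(x * x for x in scores)                    # sum of squared scores
    group_term = sum(sum(g) ** 2 / len(g) for g in groups)   # sum of T^2 / n

    ss_between = group_term - G ** 2 / N
    ss_within = sum_x_sq - group_term
    df_between = k - 1
    df_within = N - k
    F = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, df_between, df_within, F


# Hypothetical data: three groups of three replicates each
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
ss_b, ss_w, df_b, df_w, F = one_way_anova(groups)
print(f"SS_between = {ss_b}, SS_within = {ss_w}, F({df_b}, {df_w}) = {F:.2f}")
```

SS_between + SS_within equals SS_total = ΣX² - G²/N, which gives a quick check on the arithmetic.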

Two-way ANOVA
Two-way ANOVA uses the same kinds of quantities as one-way ANOVA:
- The mean square within cells ($SS_{WC}/df_{WC}$)
- The between-cell sum of squares is partitioned into each main effect (rows and columns) and the interaction effect: $SS_R$, $SS_C$, $SS_{R \times C}$

Two-Way ANOVA


Latin square
Latin squares has counterbalancing built in
Nr of rows equals the nr of columns
The letter presenting treatments appears in
each column and row only once
Effects of treatment, order and sequence are
isolated systematic counterbalancing
Order 1 2 3
seq 1 A B C
seq 2 B C A
seq 3 C A B


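Chapter 6: Regression analysis and correlation analysis
6.1. Correlation between two variables:
- Used to measure the degree of linear relationship between two variables
- Determine the relationship between the two variables
- Calculate the Pearson correlation coefficient and the Spearman rank correlation.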

Assumptions
Subjects are representative of a larger population
Paired samples (must have 2 variables) are
Independent observations
X and Y values must be measured independently
X values are measured but not controlled
Normal distribution (if not, use Spearman's rank correlation)
All covariation must be linear

Note that outliers have a large influence in

Scatter Diagram
Designate one variable X and the other Y.
Although it does not matter which is which, in cases
where one variable is used to predict the other, X is
the predictor variable (the variable youre predicting
from).
Draw axes of equal length for your graph.
Determine the range of values for each variable. Place
the high values of X to the right on the horizontal axis
and the high values of Y toward the top of the vertical
axis. Label convenient points along each axis.
For each pair of scores, find the point of intersection for
the X and Y values and indicate it with a dot.

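Pearson correlation

- Calculate the correlation coefficient (r), which indicates the degree of linear relationship in the sample.
- Test the statistical significance of r to see whether there is a linear relationship between the variables in the data set.
- The Pearson correlation coefficient r assumes that a linear relationship exists between the variables; the test then confirms that relationship.

Correlation analysis

- Plot the relationship between the two variables on a graph.
- Comment on the direction and strength of the correlation between the two variables.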

Scatterplots help illustrate the relationships between variables

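Pearson correlation coefficient

Definitional formula:
$r = \frac{\text{degree to which X and Y vary together}}{\text{degree to which X and Y vary separately}}$

Computational formula:
$r = \frac{COV_{XY}}{s_x s_y}$, where $COV_{XY} = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{n - 1}$

$r = \frac{n\sum XY - \sum X \sum Y}{\sqrt{\left(n\sum X^2 - (\sum X)^2\right)\left(n\sum Y^2 - (\sum Y)^2\right)}}$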
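The computational formula for r can be evaluated directly from the running sums. The function name and the data pairs are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson r from the computational (sums-based) formula."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    sum_yy = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_xx - sum_x ** 2) * (n * sum_yy - sum_y ** 2))
    return numerator / denominator


xs = [1, 2, 3, 4, 5]   # hypothetical X values
ys = [2, 4, 5, 4, 5]   # hypothetical Y values
r = pearson_r(xs, ys)
print(f"r = {r:.4f}")
```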

Degree of correlation
How is the degree of correlation expressed?
Pearson's r: $-1 \le r \le 1$
The sign (- or +) gives the direction of the correlation.
The closer |r| is to 1, the stronger the correlation.

-1 (perfect relationship) ------------ 0 (no relationship) ------------ +1 (perfect relationship)

Correlation coefficient

Spearman Rank Correlation

The Spearman rank correlation is the best-known and easiest technique. $r_s$ is given by the equation:
$r_s = 1 - \frac{6\sum_{i=1}^{N} d_i^2}{N(N^2 - 1)}$

where d is the difference between the rankings in the two ranking methods.
When N ≥ 10, $r_s$ can be used to calculate a t-score with the equation $t = r_s\sqrt{\frac{N - 2}{1 - r_s^2}}$, and the resulting t-score is used in a two-tailed test of significance.

Kendall Rank Correlation Coefficient (τ)
More complicated than the Spearman rank
Should be used when three or more sets of
rankings are compared
Calculated by the proportion of concordant pairs
minus the proportion of discordant pairs
There exist two bivariate observations, (xi,yi) and (xj,yj)
Concordant pairs are when (xi-xj)*(yi-yj) are positive
Discordant pairs are when (xi-xj)*(yi-yj) is negative

Scores range from -1 to 1

Goodman and Kruskal's Lambda (λ)

λ is used when nominal scales are used.
Spearman rank and Kendall's τ won't work because the ordering element is missing with nominal scales.

λ can be calculated by statistical packages.

Partial Correlations (rP)


- Indicates the degree to which two variables are linearly related in a sample, partialling out the effects of one or more control variables.
- To interpret the partial correlation between two variables, we must first know the bivariate correlation between them.
- To conduct a partial correlation, there must be at least three variables.
Partial correlation can be used in the following ways:
Partial correlation between two variables
Partial correlation among multiple variables within
a set
Partial correlation between sets of variables

The least squares method

- The uncertainty of the y values is larger than the uncertainty of the x values.
- The fitted line must make the deviations in y (the residuals) as small as possible.
- The deviations between the points on the fitted line and the experimental points can be negative or positive, so the sum of the squared deviations is minimized.
- The linear relationship between the analytical signal and the analyte content is expressed by the equation
Y = a + bX
or signal = Sblank + b(Conc.)

[Figure: least-squares calibration line y = a + bx with slope b, that is, signal = Sblank + b(Conc.)]

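Linear regression
$y = a + bx$

$b = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2} = \frac{n\sum x_i y_i - \sum x_i \sum y_i}{n\sum x_i^2 - \left(\sum x_i\right)^2}$

$a = \bar{y} - b\bar{x} = \frac{\sum y_i \sum x_i^2 - \sum x_i \sum x_i y_i}{n\sum x_i^2 - \left(\sum x_i\right)^2}$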
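A minimal least-squares sketch for a calibration line y = a + bx, using the sums-based formulas. The function name and the calibration data (concentration versus signal) are hypothetical.

```python
def linear_fit(xs, ys):
    """Return (a, b): intercept and slope of the least-squares line y = a + b*x."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    denom = n * sum_xx - sum_x ** 2
    b = (n * sum_xy - sum_x * sum_y) / denom
    a = (sum_y * sum_xx - sum_x * sum_xy) / denom
    return a, b


conc = [0.0, 2.0, 4.0, 6.0, 8.0]           # hypothetical standard concentrations
signal = [0.02, 0.21, 0.40, 0.61, 0.80]    # hypothetical instrument readings

a, b = linear_fit(conc, signal)
unknown_conc = (0.50 - a) / b              # concentration for a hypothetical signal of 0.50
print(f"signal = {a:.3f} + {b:.3f} * conc; unknown = {unknown_conc:.2f}")
```

The intercept a plays the role of Sblank and the slope b is the calibration sensitivity; inverting the line gives the concentration of an unknown from its signal.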

Important parameters in instrumental analysis
1) Sensitivity
2) Detection limit
3) Linear (dynamic) range
4) Selectivity
5) Signal-to-noise ratio

Detection limit (LOD)

LOD: the lowest analyte concentration for which the instrument still records a signal significantly different from the background noise (i.e. analytical signal = 2 or 3 times the S.D. of the blank measurement, approximately equal to the peak noise).
Calculating the LOD
The minimal detectable analytical signal (Sm) is given by: Sm = Sbl + k·SDblank
To determine it experimentally:
- Perform 20-30 blank measurements over an extended period of time.
- Calculate Sbl (mean blank signal) and SDblank.
- The detection limit (Cm) is: Cm = (Sm - Sbl)/m, where m is the slope of the calibration curve.
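A sketch of the detection-limit calculation, assuming Cm = (Sm - Sbl)/m with k = 3. The function name, the blank readings, and the slope are hypothetical; only five blanks are used here to keep the example short, whereas the procedure above calls for 20-30.

```python
import statistics


def detection_limit(blank_signals, slope, k=3.0):
    """Return (S_m, C_m): minimum detectable signal and detection limit."""
    s_bl = statistics.mean(blank_signals)      # mean blank signal
    sd_bl = statistics.stdev(blank_signals)    # standard deviation of the blank
    s_m = s_bl + k * sd_bl                     # S_m = S_bl + k * SD_blank
    c_m = (s_m - s_bl) / slope                 # C_m = (S_m - S_bl) / m
    return s_m, c_m


blanks = [0.020, 0.022, 0.018, 0.021, 0.019]   # hypothetical blank readings
s_m, c_m = detection_limit(blanks, slope=0.098)
print(f"S_m = {s_m:.4f}, C_m = {c_m:.4f} (concentration units of the calibration)")
```

Because Sbl cancels, the detection limit reduces to k·SDblank/m, so it improves with a quieter blank or a steeper calibration curve.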

LOQ, LOL, Dynamic Range

LOQ (limit of quantitation): the lowest concentration at which quantitative measurements can reliably be made.
LOQ = 10 × standard deviation of the blank signal
LOL (limit of linearity): the point where the signal is no longer proportional to concentration.
Dynamic range: from LOQ to LOL.
Cm: detection limit

Sensitivity
Indicates the response of the instrument to changes in analyte concentration, or a measure of a method's ability to distinguish between small differences in concentration in different samples.
In other words, the change in analytical signal per unit change in [analyte].
Affected by the slope of the calibration curve and by precision:
- For two methods with equal precision, the one with the steeper calibration curve is more sensitive (calibration sensitivity).
- If two methods have calibration curves with equal slopes, the one with higher precision is more sensitive (analytical sensitivity).

Calibration Sensitivity
The slope of the calibration curve evaluated at the concentration of interest.

S = b(Conc.) + Sbl
(b = slope; Conc. = concentration; Sbl = signal of the blank)
Advantage: sensitivity is independent of [analyte].
Disadvantage: does not account for the precision of individual measurements.

Analytical Sensitivity
(Defined by Mandel and Stiehler to include precision in the sensitivity definition)
γ = m/Ss
(m = slope; Ss = standard deviation of the measurement)
- Advantage: insensitive to amplification factors, i.e. increasing the gain increases m, but Ss increases by the same factor, so γ stays constant.
- Disadvantage: concentration dependent, as Ss usually varies with [analyte].
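
The gain argument can be checked numerically. The sketch below computes the analytical sensitivity as slope divided by the standard deviation of replicate signals, then repeats the calculation with both multiplied by an amplifier gain. All values are hypothetical.

```python
import statistics

def analytical_sensitivity(slope, replicate_signals):
    """Analytical sensitivity: calibration slope / std. dev. of the signal."""
    return slope / statistics.stdev(replicate_signals)

slope = 2.0                                 # hypothetical calibration slope m
replicates = [10.0, 10.4, 9.6, 10.2, 9.8]   # hypothetical replicate signals
gain = 10.0                                 # hypothetical amplifier gain

gamma_before = analytical_sensitivity(slope, replicates)
# Amplification scales the slope and every replicate signal by the same factor.
gamma_after = analytical_sensitivity(gain * slope, [gain * s for s in replicates])

print("before gain:", gamma_before)
print("after gain: ", gamma_after)
```

The calibration sensitivity (the slope alone) grows tenfold with the gain, while the analytical sensitivity is unchanged, which is the advantage stated above.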

Selectivity
Degree to which a measurement is free from interferences by other species contained in the matrix.
The analytical signal detected is the sum of the analyte signal plus the interference signals:

S = maCa + mbCb + mcCc + Sblank

Selectivity is a measure of how easy it is to distinguish between the analyte signal and the interference signal.
The selectivity of an analytical method can be described using a figure of merit called the selectivity coefficient:

kb,a = mb/ma ; kc,a = mc/ma

S = ma(Ca + kb,aCb + kc,aCc) + Sblank
Selectivity coefficients range from 0 (no interference) to values much greater than 1. They can be negative if the interference reduces the observed signal.
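
A short numerical sketch of the selectivity coefficient follows, for an analyte a with two interferents b and c. The slopes, concentrations and blank signal are hypothetical. The interferent c is given a negative slope to show a negative coefficient.

```python
def selectivity_coefficient(m_interferent, m_analyte):
    """k = m_interferent / m_analyte."""
    return m_interferent / m_analyte

def total_signal(m_a, c_a, interferents, s_blank):
    """S = ma*(Ca + sum(k_i * C_i)) + Sblank; interferents is a list of (k, C)."""
    return m_a * (c_a + sum(k * c for k, c in interferents)) + s_blank

# Hypothetical calibration slopes (signal per unit concentration).
m_a, m_b, m_c = 2.0, 0.2, -0.4
k_ba = selectivity_coefficient(m_b, m_a)  # small positive interference
k_ca = selectivity_coefficient(m_c, m_a)  # negative: suppresses the signal

# Hypothetical concentrations and blank signal.
c_a, c_b, c_c, s_blank = 5.0, 10.0, 5.0, 1.0
s = total_signal(m_a, c_a, [(k_ba, c_b), (k_ca, c_c)], s_blank)

print("k_b,a =", k_ba, "k_c,a =", k_ca, "S =", s)
```

The result agrees with summing the individual contributions maCa + mbCb + mcCc + Sblank, since the coefficient form is only a rearrangement of that sum.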

Standard Addition Calibration


Most useful when analyzing complex samples in which significant matrix effects are possible.
The most common form is adding one or more increments of a standard solution to sample aliquots (spiking the sample).
If the sample is limited, the successive additions can be made to a single sample aliquot.

S = k.Vs.Cs/Vt + k.Vx.Cx/Vt

where k is a proportionality constant relating signal to concentration, Vs is the volume of standard added at a concentration of Cs, Vx is the volume of unknown (aliquot) added at a concentration Cx, and Vt is the total (final) volume.

The Standard Addition Method (Spiking)

Technique to be used when:
- Samples have substantial matrix effects.
- The assay requires instrumental conditions that are difficult to control.
Procedure
- A measurement is made on a portion of the sample.
- Varying but known amounts (called spikes) of the assayed substance are added to several equal portions of the sample (standard addition).
- Each solution is diluted to the same volume and measured.
- The assay measurement is then plotted as a function of the spike concentration.
- The resulting plot is extrapolated to the concentration axis (i.e. the x-axis).
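
The extrapolation step can be sketched as below. The signal is fitted against the added (spike) concentration in the final diluted solutions; the magnitude of the x-intercept is then the analyte concentration in the diluted sample, which is scaled by Vt/Vx to recover the concentration in the original sample. The function names, volumes and readings are hypothetical.

```python
def fit_line(x, y):
    """Return (a, b) of the least-squares line y = a + b*x."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_xx = sum(xi * xi for xi in x)
    denom = n * sum_xx - sum_x ** 2
    b = (n * sum_xy - sum_x * sum_y) / denom
    a = (sum_xx * sum_y - sum_x * sum_xy) / denom
    return a, b

def standard_addition(added_conc, signals, v_sample, v_total):
    """Analyte concentration in the original sample from a spiking series."""
    a, b = fit_line(added_conc, signals)
    x_intercept = -a / b          # where the fitted line crosses signal = 0
    diluted_conc = -x_intercept   # analyte concentration after dilution
    return diluted_conc * v_total / v_sample

# Hypothetical series: 10 mL sample aliquots, each diluted to 50 mL.
added = [0.0, 2.0, 4.0, 6.0]       # spike concentration in final solution
signals = [2.0, 6.0, 10.0, 14.0]   # measured signals

c_x = standard_addition(added, signals, v_sample=10.0, v_total=50.0)
print("Cx =", c_x)
```

Because standards and sample share the same matrix in every solution, the fitted slope already includes any matrix effect on the response.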

Internal Standard Method


An internal standard is a substance that is added in a
constant amount to all samples, blanks and calibration
standards in an analysis.
Procedure:
A carefully measured quantity of the internal standard is introduced into each standard and sample.
The solutions are diluted to the same volume and the analytical signal is measured.
Calibration curve: plot the ratio of the analyte signal to the internal standard signal vs. the analyte concentration of the standards.
The ratio for the samples is then used to obtain their analyte concentration from the calibration plot.
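
A minimal sketch of this ratio calibration follows. The internal standard signals are deliberately made to drift from run to run, to show that the ratio removes the drift. All names and readings are hypothetical.

```python
def fit_line(x, y):
    """Return (a, b) of the least-squares line y = a + b*x."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_xx = sum(xi * xi for xi in x)
    denom = n * sum_xx - sum_x ** 2
    b = (n * sum_xy - sum_x * sum_y) / denom
    a = (sum_xx * sum_y - sum_x * sum_xy) / denom
    return a, b

# Hypothetical standards: analyte concentration, analyte signal, IS signal.
std_conc = [1.0, 2.0, 3.0, 4.0]
std_analyte = [50.0, 90.0, 165.0, 190.0]
std_is = [100.0, 90.0, 110.0, 95.0]   # instrument response drifts between runs

ratios = [s / i for s, i in zip(std_analyte, std_is)]
a, b = fit_line(std_conc, ratios)      # calibration: ratio = a + b*conc

# Hypothetical sample measured with the same amount of internal standard.
sample_ratio = 150.0 / 120.0
sample_conc = (sample_ratio - a) / b
print("sample concentration =", sample_conc)
```

The raw analyte signals are not linear in concentration here, but the ratios are, because the same drift affects both the analyte and the internal standard.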

Internal Standard (IS)

Internal standards are essential if we have a time-varying instrumental response.
Internal standards are very useful if you have matrix effects.

Chapter 7:
Quality Assurance / Quality Control
QA: The planned measures that ensure a
service or product meets minimum
professional standards.
QC: The day-to-day activities that monitor
the quality of laboratory reagents, supplies
and equipment.
QA/QC: Proficiency Testing
Laboratory Accreditation
Validation

ISO 9000
An international set of standards for quality
management.
Applicable to a range of organisations from
manufacturing to service industries.
ISO 9001 applicable to organisations which
design, develop and maintain products.
ISO 9001 is a generic model of the quality
process that must be instantiated for each
organisation using the standard.

ISO 9001

ISO 9000 certification


Quality standards and procedures should be
documented in an organisational quality
manual.
An external body may certify that an
organisation's quality manual conforms to
ISO 9000 standards.
Some customers require suppliers to be ISO
9000 certified although the need for flexibility
here is increasingly recognised.

Documentation standards

Particularly important - documents are the tangible manifestation of the software.

Documentation process standards
Concerned with how documents should be developed, validated and maintained.

Document standards
Concerned with document contents, structure, and appearance.

Document interchange standards
Concerned with the compatibility of electronic documents.

Document standards
Document identification standards
How documents are uniquely identified.

Document structure standards


Standard structure for project documents.

Document presentation standards


Define fonts and styles, use of logos, etc.

Document update standards


Define how changes from previous versions
are reflected in a document.

Quality in Environmental Analysis

Value of Quality Control

General QC principles.
Sources of error.
Terminology and Definitions.
Quality Control vs. Quality Assurance.

QC Terminology and Definitions


Representativeness:
- A measure of the degree to which data
accurately and precisely represents a sampling
point or process condition.
- A measure of how closely a sample
is representative of a larger process.

Comparability:
- A qualitative term that expresses the confidence
that two data sets can contribute to a common
analysis.

QC Terminology and Definitions


Completeness:
- A measure of the amount of valid data
obtained from a measurement system,
expressed as a percentage of the valid
measurements that should have been
collected (i.e., measurements that were
planned to be collected).
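
Completeness as defined above is a simple percentage. A minimal sketch, with a hypothetical function name and hypothetical counts:

```python
def completeness_percent(valid_measurements, planned_measurements):
    """Valid measurements obtained as a percentage of those planned."""
    if planned_measurements <= 0:
        raise ValueError("planned_measurements must be positive")
    return 100.0 * valid_measurements / planned_measurements

# Hypothetical campaign: 50 measurements planned, 45 judged valid.
completeness = completeness_percent(45, 50)
print("completeness =", completeness, "%")
```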

Quality Control vs. Quality Assurance
- QC is a component of QA.
- QC measures and estimates errors in a
system.
- QA is the ability to prove that the data is
as reported.

Sources of Error
- Sample errors
- Reagent errors
- Reference material errors
- Method errors
- Calibration errors
- Equipment errors
- Signal registration and recording errors
- Calculation errors
- Errors in reporting results

Sources of Error
Sample Errors
- Sample container contaminated.
- Incorrect sample location.
- Non-representative sample.
- Incorrect sample container.
- Sample mix up.

Reagent Errors
- Impure reagents or solvents.
- Improper storage of reagents.
- Neglect of reagent expiration date.
- Evaporated reagents.
- Consideration of different purities or

Sources of Error
Reference Material Errors
- Impurity of reference materials.
- Errors from interfering substances.
- Changes due to improper storage.
- Errors in preparing reference material.
- Using expired reference material.

General Method Errors


- Deviating from the analysis procedure.
- Disregard for the limit of detection.
- Disregard for a blank correction.
- Calculation errors (dilutions, mixtures, additions).
- Not using the correct analytical procedure.

Sources of Error
Calibration Errors
- Volumetric measuring errors.
- Weighing errors.
- Inaccurate equipment adjustments.

Equipment Errors
- Equipment not cleaned.
- Maintenance neglected.
- Temperature, electrical, and magnetic effects.
- Errors in using auto-pipettes (not calibrated, pipette tip
not correctly attached, contamination).
- Errors in using glass pipettes (damaged, bad technique,
contamination).

Sources of Error
Equipment Errors (continued)
Cuvette errors (defects not considered,
unsuitable cuvette glass, not filled to
minimum, wet on the outside, air bubbles,
contamination).
Photometer errors (wrong wavelength,
insufficient lamp intensity, dirty optics, drift
effect ignored, incorrectly set zero, light
entering the sample chamber).

Sources of Error
Signal Registration and Recording Errors
- Incorrect range setting.
- Reading errors.
- Recording errors.
- Switching of data.
Calculation Errors
- Arithmetic errors, decimal point errors, incorrect units.
- Rounding errors.
- Not taking into account the reagent blank values.
- Error in dilution factor.
Errors in Reporting Results
- Omitting a sample error.
- No quality assurance implemented

Validation demonstrates that a procedure is robust, reliable and reproducible
A robust method is one which produces
successful results a high percentage of
the time.
A reliable method is one that produces
accurate results.
A reproducible method produces similar
results each time a sample is tested.

Selecting an Analytical Method


Defining the Problem
1. What accuracy is required?
2. How much sample is available?
3. What is the concentration range of the
analyte?
4. What components of the sample will cause
interference?
5. What are the physical and chemical
properties of the sample matrix?
6. How many samples are to be analyzed?

Numerical Criteria for Selecting Analytical Methods

Evaluating an analytical method

Check the accuracy by analysing standard samples:
+ Use check standards
+ Analyse spiked samples
+ Compare the sample results with the results obtained by a standard method
+ Analyse certified reference materials of known composition (CRM)
+ Analyse daily check standards
Method evaluation should follow the guidelines set out in good laboratory practice (GLP).
