
Bioinformatics Review

1. Needleman-Wunsch algorithm

+ In 1970, Needleman and Wunsch introduced the Needleman-Wunsch algorithm, a global pairwise sequence alignment algorithm that uses dynamic programming to score the alignment.
+ To score pairs of characters during alignment, the algorithm uses a substitution matrix; for protein sequences, PAM250 and BLOSUM62 are common choices.
+ Dynamic programming is mathematically guaranteed to find the optimal pairwise alignment under a given scoring scheme. The cost is a large number of computation steps, roughly the square of the sequence length (the product of the two lengths).

The general Needleman-Wunsch pairwise alignment algorithm has 3 steps:
+ Step 1: Initialize the scoring matrix from the two sequences
+ Step 2: Compute and fill in the values of the scoring matrix
+ Step 3: Use backtracking to recover the result

Problem: Suppose we have two sequences A and B. To find the alignment of A and B with the highest score (that is, the highest similarity), a two-dimensional matrix F is built. Each position in the matrix is denoted F(i,j). The alignment score is determined by a similarity matrix, where:
+ S(i,j) is the similarity score between the two characters compared at cell (i,j), that is, B[i] and A[j]
+ d is a linear gap penalty

The characters of sequence A (length x) are laid out along the horizontal axis, and the characters of sequence B (length y) along the vertical axis.

Steps:

Step 1: Initialize the matrix from the two sequences
The matrix F is initialized as follows:
F(0,0) = 0
F(0,j) = 0
F(i,0) = 0
These zero borders correspond to a gap penalty of d = 0, as in the worked example that follows. With a nonzero linear penalty, the usual initialization is F(0,j) = j*d and F(i,0) = i*d.

Step 2: Fill the matrix
The matrix F is filled from the top-left corner to the bottom-right corner. The value at F(i,j) is computed from the scores at F(i-1,j-1), F(i-1,j), and F(i,j-1):
F(i,j) = max[ F(i-1,j-1) + S(i,j), F(i,j-1) + d, F(i-1,j) + d ]

Step 3: Traceback
Use the recorded trail to walk the path backwards.
+ Initialization: start from cell (m,n), the bottom-right cell.
+ Iteration: from cell (i,j), consider cells (i-1,j-1), (i-1,j), and (i,j-1):
  If F(i,j) = F(i,j-1) + d, the path can step back from (i,j) to (i,j-1).
  If F(i,j) = F(i-1,j) + d, the path can step back from (i,j) to (i-1,j).
  If F(i,j) = F(i-1,j-1) + S(i,j) and A[j] equals B[i], the path can step back from (i,j) to (i-1,j-1).
  If F(i,j) = F(i-1,j-1) (a mismatch, which scores S(i,j) = 0 in the example here), the path can also step back from (i,j) to (i-1,j-1).
  More than one path may be available at each step.

Reading off the alignments:
+ If the path moves diagonally from (i-1,j-1) to (i,j):
  If A[j] equals B[i], A[j] and B[i] are aligned and linked (a match).
  If A[j] differs from B[i], A[j] is aligned with B[i] but not linked (this is called a substitution).
+ If the path moves from (i,j-1) to (i,j), insert a gap in B opposite A[j].
+ If the path moves from (i-1,j) to (i,j), insert a gap in A opposite B[i].

Worked example: given two sequences A and B:
Sequence A: GAATTCAGTTA (N = 11)
Sequence B: GGATCGA (M = 7)
where N is the length of sequence A and M is the length of sequence B.
Scoring:
S(i,j) = 0 if the two characters differ
S(i,j) = 1 if the two characters match
d = 0 for a gap

Step 1: Initialization
Create a matrix F with N+1 columns and M+1 rows. The first row and the first column of the matrix are initially set to 0.

Step 2: Fill the matrix
Each F(i,j) is defined as the maximum score attainable at position (i,j):
F(i,j) = max[ F(i-1,j-1) + S(i,j), F(i,j-1) + d, F(i-1,j) + d ]
Fill in the values F(i,j) one at a time until every cell of the matrix is filled.
Example: computing F(1,1):
We have S(1,1) = 1, d = 0, F(0,0) = 0, F(0,1) = 0, F(1,0) = 0, so
F(1,1) = max[ F(0,0) + S(1,1), F(1,0) + d, F(0,1) + d ] = max[1, 0, 0] = 1
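The fill step can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `nw_fill` and its parameter names are assumptions made for this example, and the default scores (match 1, mismatch 0, gap 0) are the ones given for the worked example. Rows are indexed by B and columns by A, following the text.

```python
def nw_fill(a, b, match=1, mismatch=0, gap=0):
    """Fill the Needleman-Wunsch matrix F for sequence a (columns) and b (rows)."""
    rows, cols = len(b) + 1, len(a) + 1
    F = [[0] * cols for _ in range(rows)]
    # Borders: i (or j) leading gaps; all zeros when gap == 0, as in the example.
    for i in range(1, rows):
        F[i][0] = i * gap
    for j in range(1, cols):
        F[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            s = match if a[j - 1] == b[i - 1] else mismatch
            F[i][j] = max(F[i - 1][j - 1] + s,  # diagonal: align B[i] with A[j]
                          F[i][j - 1] + gap,    # from the left: gap in B
                          F[i - 1][j] + gap)    # from above: gap in A
    return F


F = nw_fill("GAATTCAGTTA", "GGATCGA")
print(F[1][1])    # 1, the F(1,1) value computed by hand above
print(F[-1][-1])  # 6, the score of the best global alignment
```

With these scores the bottom-right value simply counts the largest possible number of matched columns.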

Step 3: Traceback (determine the path that leads to the final value)
Start from the last value of the matrix, the bottom-right cell.
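The traceback rules can be sketched as follows. This is an illustrative Python sketch under stated assumptions: `nw_align` is a hypothetical name, the gap symbol `_` is chosen to match the notation used in these notes, and ties are broken by preferring the diagonal move, then the horizontal one. Because several paths can be optimal, the alignment it returns may differ from the one shown in the notes while having the same score.

```python
def nw_align(a, b, match=1, mismatch=0, gap=0):
    """Return (score, aligned_a, aligned_b) for one optimal global alignment."""
    def s(i, j):
        return match if a[j - 1] == b[i - 1] else mismatch

    rows, cols = len(b) + 1, len(a) + 1
    F = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        F[i][0] = i * gap
    for j in range(1, cols):
        F[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            F[i][j] = max(F[i - 1][j - 1] + s(i, j),
                          F[i][j - 1] + gap,
                          F[i - 1][j] + gap)

    # Walk back from the bottom-right cell to (0, 0).
    i, j = len(b), len(a)
    out_a, out_b = [], []
    while i > 0 or j > 0:
        if i > 0 and j > 0 and F[i][j] == F[i - 1][j - 1] + s(i, j):
            out_a.append(a[j - 1]); out_b.append(b[i - 1])  # match or substitution
            i -= 1; j -= 1
        elif j > 0 and F[i][j] == F[i][j - 1] + gap:
            out_a.append(a[j - 1]); out_b.append("_")       # gap in B
            j -= 1
        else:
            out_a.append("_"); out_b.append(b[i - 1])       # gap in A
            i -= 1
    return F[len(b)][len(a)], "".join(reversed(out_a)), "".join(reversed(out_b))


score, row_a, row_b = nw_align("GAATTCAGTTA", "GGATCGA")
print(score)
print(row_a)
print(row_b)
```

Collecting every branch where more than one condition holds, instead of taking the first, would enumerate all optimal alignments.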

Result: one possible alignment, with a substitution (A replaced by G) in the second column:
GAATTCAGTTA
| | || |  |
GGA_TC_G__A

2. Smith-Waterman dynamic programming algorithm

- The Smith-Waterman algorithm is a dynamic programming algorithm used for database searching. It was developed by T. F. Smith and M. S. Waterman in 1981 and builds on the earlier model of Needleman and Wunsch.
- Characteristics: Smith-Waterman is a local pairwise sequence alignment algorithm that uses dynamic programming to score the alignment.
- Significance: the algorithm identifies the similar regions between two sequences, which gives the optimal local alignment.
- The algorithm is built on the idea of comparing the two sequences to find the segments or regions with the highest similarity, and from these assessing how similar the two sequences are.
- The alignment is carried out by aligning pairs of characters from the two sequences.
+ The score for each aligned pair of characters depends on whether the two characters are the same (match) or different (mismatch), and on the score for inserting or deleting a gap (gap penalty).
The result of a local alignment is the pair of segments of the two sequences with the highest similarity.

* Problem: Suppose we have two sequences S1 and S2. To find the alignment of S1 and S2 with the highest score (that is, the highest similarity), a two-dimensional matrix F is built. Each position in the matrix is denoted F(i,j). The alignment score is determined by the substitution matrix S, where:
- S(i,j) is the similarity score between the two characters compared at cell (i,j).
- d is a linear gap penalty.
In the matrix, the characters of S1 (length n) are laid out along the horizontal axis, and the characters of S2 (length m) along the vertical axis.

ALGORITHM
- Input:
+ Two sequences S1, S2 of lengths n and m respectively
+ Substitution matrix S
+ Gap penalty d
- Output: the two aligned sequences S1', S2'

Step 1: Initialization
+ F(0,0) = 0
+ F(i,0) = 0 for 0 <= i <= m
+ F(0,j) = 0 for 0 <= j <= n

Step 2: Fill in the matrix
+ Compute F(i,j) by the formula:
F(i,j) = max( 0, F(i-1,j-1) + S(i,j), F(i-1,j) + d, F(i,j-1) + d )   (1)
+ Each time F(i,j) is computed, record which term on the right-hand side of (1) gave the maximum.

Step 3: Find the cell (i_max, j_max) with the highest score (0 <= i <= m, 0 <= j <= n)

Step 4: Traceback
- Start from cell (i_max, j_max). Follow the indices recorded in Step 2 and stop on reaching a cell with F(i,j) = 0.
- At cell (i,j):
+ If the path is horizontal, from (i,j-1) to (i,j), append a gap "-" to S2' and the character S1(j) to S1'.
+ If the path is vertical, from (i-1,j) down to (i,j), append a gap "-" to S1' and the character S2(i) to S2'.
+ If the path is diagonal, from (i-1,j-1) to (i,j), append the character S1(j) to S1' and S2(i) to S2'.
- Reverse S1' and S2'.

Example:
S1: ATATGCTAAG
S2: ACTACTTAG
Step 1: Initialization

Step 2: Fill in the matrix using the formula
F(i,j) = max( 0, F(i-1,j-1) + S(i,j), F(i-1,j) + d, F(i,j-1) + d )
with match = 2, mismatch = -1, d = -1

Step 3: Find the cell (i_max, j_max) with the highest score

Step 4: Traceback
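The four steps can be put together in a short Python sketch. It is only an illustration under stated assumptions: the name `smith_waterman`, the pointer labels "diag", "up" and "left", and the use of a simple match/mismatch score in place of a full substitution matrix are choices made for this example. The default scores are the ones given for the example (match 2, mismatch -1, d = -1).

```python
def smith_waterman(s1, s2, match=2, mismatch=-1, gap=-1):
    """Return (best_score, aligned_s1, aligned_s2) for one best local alignment."""
    n, m = len(s1), len(s2)
    # Step 1: row 0 and column 0 stay at zero.
    F = [[0] * (n + 1) for _ in range(m + 1)]
    ptr = [[None] * (n + 1) for _ in range(m + 1)]
    best, best_pos = 0, (0, 0)
    # Step 2: fill the matrix and record which term gave the maximum.
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            s = match if s1[j - 1] == s2[i - 1] else mismatch
            candidates = [(0, None),
                          (F[i - 1][j - 1] + s, "diag"),
                          (F[i - 1][j] + gap, "up"),
                          (F[i][j - 1] + gap, "left")]
            F[i][j], ptr[i][j] = max(candidates, key=lambda c: c[0])
            # Step 3: keep track of the highest-scoring cell.
            if F[i][j] > best:
                best, best_pos = F[i][j], (i, j)
    # Step 4: trace back until a cell with score 0 is reached.
    i, j = best_pos
    a1, a2 = [], []
    while F[i][j] > 0:
        move = ptr[i][j]
        if move == "diag":
            a1.append(s1[j - 1]); a2.append(s2[i - 1]); i -= 1; j -= 1
        elif move == "left":   # horizontal: gap in S2
            a1.append(s1[j - 1]); a2.append("-"); j -= 1
        else:                  # "up", vertical: gap in S1
            a1.append("-"); a2.append(s2[i - 1]); i -= 1
    return best, "".join(reversed(a1)), "".join(reversed(a2))


print(smith_waterman("ATATGCTAAG", "ACTACTTAG"))
```

When several cells share the highest score, this sketch keeps the first one found; a fuller version could report each of them.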

Remarks: The Smith-Waterman dynamic programming algorithm solves the sequence comparison problem by choosing, at each cell, the best move relative to the neighboring cells. Unlike a heuristic local search, it is deterministic and returns the optimal local alignment for the given scoring scheme. It is used mainly for local rather than global comparison.
+ Advantage: only the highest-scoring region is traced back, so dissimilar flanking regions do not affect the result.
+ Disadvantage: as with Needleman-Wunsch, filling the matrix takes on the order of n x m steps, and the alignment found depends on the chosen input parameters (substitution scores and gap penalty).

3. ClustalW algorithm

a. Overview
+ ClustalW is an improved method for multiple sequence alignment. It is widely used for multiple alignment and for building phylogenetic trees because it keeps the computational cost manageable where other methods do not, while still producing a multiple alignment that can be used to build a phylogenetic tree and to assess the similarity between the sequences.
+ The method builds the multiple alignment starting from the pair of sequences that are most similar to each other.
Input: k biological sequences S1, S2, ..., Sk to compare
Output: the best multiple sequence alignment found for the k sequences

b. Steps of the ClustalW algorithm
The algorithm has 3 steps:

+ Step 1: Align every pair of sequences and determine the similarity of each pair. From these values, build the distance matrix between the sequences.
+ Step 2: From the distance matrix, build a guide tree that represents the similarity relationships between the sequences, using the Neighbor-Joining method.
+ Step 3: Build the multiple sequence alignment (MSA). Following the guide tree obtained in Step 2, identify the branches whose sequences are most similar and align those pairs first, then combine the pairwise alignments to obtain the multiple alignment.

Example: align 5 sequences:
S0: ABDFGI
S1: AKHGL
S2: ADFIKF
S3: ABFGLI
S4: AKDILM

Step 1: Align each pair of the 5 sequences S0, S1, S2, S3, S4. The resulting pairwise alignments are:

S0 ABDFGI
S1 A-KHGL

S0 ABDFGI--
S2 A-DF-IKF

S0 ABDFG-I
S3 A-BFGLI

S0 ABDFGI--
S4 AKD--ILM

S1 AKHG--L
S3 ABFGLI-

S2 ADFIKF
S3 ABFGLI-

S3 ABFGL-I-
S4 A---KDILM

S1 AK--HGL-
S2 ADFI--KF

S1 AKHGL---
S4 A---KDILM

S2 A-DFIKF-
S4 AKD-I-LM

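Computing the distance between pairs: for each pairwise alignment, look at the positions that contain no gap (non-gapped positions), count how many of those positions hold matching characters in the two sequences (m), then divide by the length of the alignment counted without the gapped positions (n): d = m / n. Under this definition a larger value means a more similar pair.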
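As a small illustration of this measure, the sketch below computes m / n for one aligned pair. The function name `pair_similarity` and the choice to accept both "-" and "_" as gap symbols are assumptions made for this example; the two alignments used are the S0/S3 and S0/S1 pairs from Step 1.

```python
def pair_similarity(row_a, row_b, gap_chars="-_"):
    """Matches among non-gapped columns divided by the number of non-gapped columns."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    # Keep only the columns where neither sequence has a gap.
    ungapped = [(x, y) for x, y in zip(row_a, row_b)
                if x not in gap_chars and y not in gap_chars]
    if not ungapped:
        return 0.0
    matches = sum(1 for x, y in ungapped if x == y)
    return matches / len(ungapped)


print(pair_similarity("ABDFG-I", "A-BFGLI"))  # S0 vs S3: 4 matches in 5 columns
print(pair_similarity("ABDFGI", "A-KHGL"))    # S0 vs S1: 2 matches in 5 columns
```

Applying the function to all ten pairs would give the entries of the matrix used in Step 2.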

This gives the distance matrix:

Step 2: d(S0,S2) = 1 is the maximum, so S0 and S2 are the most similar pair, and we build a tree from the two sequences S0 and S2:

Next, determine the similarity between the branch (S0,S2) and the remaining sequences by building a new distance matrix:

d((S0,S2), S3) = 0.775 is the maximum, so we obtain the following tree:

Build the new distance matrix:


d((S0,S2,S3), S4) = 0.72 is the maximum, and sequence S1 is joined to the tree last, so we obtain the following tree:

Step 3: Once the tree is built:
+ Start from the pairwise alignment of S0 and S2 obtained above:
S0 ABDFGI--
S2 A-DF-IKF
+ Add sequence S3 to this result, guided by the alignment of S3 and S0:
S0 ABDFG-I
S3 A-BFGLI
Aligning by this method gives the alignment of 3 sequences:
S0 ABDFG-I--
S2 A-DF--IKF
S3 A-BFGLI--
+ Add sequence S4 to the 3 aligned sequences, guided by the alignment of the pair S0, S4:
S0 ABDFGI--
S4 AKD--ILM
The 4 sequences after alignment are:
S0 ABDFG-I--
S2 A-DF--IKF
S3 A-BFGLI--
S4 AKD---ILM
+ Add sequence S1 to the 4 aligned sequences, guided by the alignment of the pair S0, S1:
S0 ABDFGI
S1 A-KHGL


The 5 sequences after alignment are:
S0 ABDFG-I--
S2 A-DF--IKF
S3 A-BFGLI--
S4 AKD---ILM
S1 A-KHG-L--

