
CONTENTS

Preface
PART 1. DISTRIBUTED DATABASES
CHAPTER 1. OVERVIEW OF DISTRIBUTED DATABASES
1.1. Distributed database systems
1.1.1. Definition of a distributed database
1.1.2. Main characteristics of distributed databases
1.1.3. Purposes of using distributed databases
1.1.4. Basic architecture of a distributed database
1.1.5. Distributed database management systems
1.2. Architecture of distributed database management systems
1.2.1. Client/server systems
1.2.2. Peer-to-peer distributed systems
CHAPTER 2. DATA DISTRIBUTION METHODS
2.1. Distributed database design
2.1.1. Design strategies
2.2. Design issues
2.2.1. Reasons for fragmentation
2.2.2. Types of fragmentation
2.2.3. Horizontal fragmentation
2.3. Vertical fragmentation
2.5. Hybrid fragmentation
2.6. Allocation
2.6.1. The allocation problem
2.6.2. Information requirements
2.6.3. Allocation model
CHAPTER 3. QUERY PROCESSING
3.1. The query processing problem
3.2. Query decomposition
3.3. Localization of distributed data
3.4. Distributed query optimization
3.4.1. Search space
3.4.2. Search strategy
3.4.3. Distributed cost model
3.4.4. Join ordering in fragment queries
CHAPTER 4. TRANSACTION MANAGEMENT
4.1. Concepts
4.2. Basic locking model
4.4. Timestamp-based concurrency control algorithms
PART 2. DEDUCTIVE DATABASES
2.1. General introduction
2.2. Deductive databases
2.2.1. The deductive database model
2.2.2. Model theory for relational databases
2.2.3. Views of deductive databases
2.2.4. Transactions on deductive databases
2.3. Logic-based databases
2.3.4. Structure of queries
2.3.5. Comparing DATALOG with relational algebra
2.3.6. Expert database systems
2.4. Other issues

Preface

The first database systems, built on the hierarchical and network models, appeared in the 1960s and are regarded as the first generation of database management systems (DBMSs). They were followed by the second generation, the relational DBMSs, built on the relational data model proposed by E.F. Codd in 1970. The goal of a DBMS is to organize data and to access and update large volumes of data conveniently, safely and efficiently. The first two generations of DBMSs met the needs of agencies, enterprises and businesses for collecting and organizing their data.

However, the rapid development of communication technology and the strong expansion of the Internet, together with the trend toward globalization in every field, especially commerce, gave rise to many new applications that must manage objects with complex structure (text, sound, images) and dynamic objects (programs, simulations). In the 1990s a third generation of DBMSs appeared: object-oriented systems, capable of supporting multimedia applications.

To meet the need for reference material and textbooks for information technology students, especially material on distributed databases, deductive databases and object-oriented databases, we present this textbook for the course Databases 2. Its purpose is to present the basic concepts and algorithms of databases, including: data models and the corresponding database systems, database languages, storage organization and search, query processing and optimization, transaction management and concurrency control, and database design. In compiling it, we relied on the syllabus of the course as currently taught at universities in the country, and also tried to reflect some recent achievements of database technology.

The textbook Databases 2 is divided into two parts:
Part 1: Distributed databases
Part 2: Deductive databases

Each chapter ends with a summary, review questions and exercises to help students master the main content of the chapter and test their own ability to solve problems. Despite our best efforts, the textbook certainly still has shortcomings. We look forward to readers' comments so that the next edition can be more complete.

Thai Nguyen, October 2009
The authors

PART 1. DISTRIBUTED DATABASES

CHAPTER 1. OVERVIEW OF DISTRIBUTED DATABASES

As companies and enterprises become ever more widely distributed, the data they work with is very large and cannot be centralized. First- and second-generation databases could not solve problems in this new environment, which is distributed rather than centralized and involves heterogeneous data and systems. The third generation of DBMSs, which includes distributed databases, emerged in the 1980s to meet these new needs.

1.1. Distributed database systems

1.1.1. Definition of a distributed database

A distributed database is a collection of multiple logically interrelated databases distributed over a computer network.
- Distribution: The data of a distributed database does not reside in one place but is spread over several sites of a computer network. This distinguishes a distributed database from a single centralized database.
- Logical correlation: The data of a distributed database has properties that tie it together. This distinguishes a distributed database from a set of local databases or files residing at different locations in a computer network.

[Figure 1.1. Distributed database system environment: sites 1 to 5 connected by a data communication network.]

A distributed database system consists of many sites, and each site can run transactions that access data at several other sites.

Example 1.1: Consider a bank with three branches at different locations. At each branch, a computer controls a number of teller terminals. Each computer, together with its local accounting database at the branch, constitutes one site of the distributed database. The computers are connected by a communication network.

1.1.2. Main characteristics of distributed databases

(1) Resource sharing
Resources in a distributed system are shared over the communication network. For sharing to be effective, each resource must be managed by a program with a communication interface, so that the resource can be accessed and updated reliably and consistently. Resource management here covers planning for redundancy, naming resource classes, allowing resources to be accessed from one place to another, mapping resource names to communication addresses, and so on.

(2) Openness
Openness of a computer system means that it is easy to extend its hardware (adding peripherals, memory, communication interfaces, ...) and its software (operating system models, communication protocols, resource-sharing services, ...). An open distributed system can be built from hardware and software of many kinds and from many vendors, provided that all components follow a common standard. Openness of a distributed system is judged by the degree to which new resource-sharing services can be added without breaking or duplicating existing services. It is achieved by clearly specifying the key interfaces of the system and making them available to software developers. Openness rests on providing an interprocess communication mechanism and on publishing the interfaces used to access shared resources.

(3) Concurrency
A distributed system runs on a communication network with many computers, each of which may have one or more CPUs. If N processes exist at the same moment, we say they execute concurrently. Processes are executed by time sharing (one CPU) or in parallel (several CPUs). Parallel work in a distributed system arises in two situations:
- Many users simultaneously issue commands to, or interact with, application programs.
- Many server processes run concurrently, each serving requests from different client processes.

(4) Scalability
A distributed system can operate well and efficiently at many different scales. The smallest distributed system needs only two workstations and a file server; larger systems reach thousands of computers. Scalability is characterized by the system software and application software remaining unchanged when the system is extended. Current distributed systems achieve this only to some degree. Extension concerns not only hardware and the network; it touches every aspect of distributed system design.

(5) Fault tolerance
Fault tolerance in computer systems is designed around two approaches:
- Using redundancy to ensure continuous and efficient operation.
- Using recovery programs when a failure occurs.
In the first approach, two computers are connected to run the same program, one of them in standby mode (idle or waiting). This solution is expensive because the system hardware must be duplicated. A cheaper variant is to host the critical applications on separate servers that can substitute for one another when a failure occurs. When there is no failure the servers operate normally; when one server fails, client applications redirect themselves to the remaining servers. In the second approach, recovery software is designed so that the current state of the data (the state before the failure) can be restored when an error is detected. Distributed systems provide high availability to cope with hardware faults.

(6) Transparency

Transparency in a distributed system means hiding the separate components of the system from users and application programmers.
- Location transparency: Users do not need to know the physical location of the data and can access the database wherever it resides. Operations that retrieve or update data at a remote site are carried out automatically by the system at the site where the request is issued; users need not be aware that the database is distributed over the network.
- Usage transparency: Moving part or all of the database because of organizational or administrative changes does not affect user operations.
- Fragmentation transparency: If data is partitioned because of increased load, users must not be affected.
- Replication transparency: If data is replicated to reduce communication cost or to improve reliability, users do not need to know about it.

(7) Reliability and consistency
The system requires high reliability: the confidentiality of data must be protected and failure recovery functions must be guaranteed. Consistency is equally important: the content of the data must contain no contradictions. Even when data attributes differ, operations must remain consistent.

1.1.3. Purposes of using distributed databases

Organizational and economic requirements: In practice many organizations are decentralized, their data keeps growing and serves many distributed users, so a distributed database fits the natural structure of such organizations. This is one of the important factors driving the development of distributed databases.

Interconnection of existing local databases: A distributed database is the natural solution when databases already exist and a global application must be built. In this case the distributed database is created bottom-up on top of the existing databases. The process requires restructuring the local databases to some extent, but these modifications are far smaller than building a completely new centralized database.

Reduced total search cost: Distributing data lets local work groups control all of their own data, while users can still access remote data when needed. At each local site, hardware can be chosen to suit the local data processing work.

Incremental growth: Organizations can grow by adding new units that are autonomous yet related to the other units. A distributed database supports such flexible growth with minimal impact on existing units.

Fast query response: Most data requests from users at any local site can be satisfied by data held at that site.

Higher reliability and availability: If one component of the system fails, the system can still continue to operate.

Fast recovery: Data access does not depend on a single machine or a single network link. If a fault occurs, the system can automatically reroute through other links.

1.1.4. Basic architecture of a distributed database

This is not an explicit architecture for every distributed database, but it captures the organization of any distributed database.
- Global schema: Defines all the data stored in the distributed database. In the relational model, the global schema consists of the definitions of the global relations.
- Fragmentation schema: Each global relation can be divided into several non-overlapping parts called fragments. There are many ways to perform this division. The (one-to-many) mapping between the global schema and the fragments is defined in the fragmentation schema.
- Allocation schema: Fragments are logical parts of a global relation that are physically located at one or more sites of the network. The allocation schema defines which fragment is located at which sites. Note that the kind of mapping defined in the allocation schema determines whether the distributed database is redundant.
- Local mapping schema: Maps the physical images to the objects stored at a site (all fragments of one global relation at the same site form a physical image).

[Figure 1.2. Basic architecture of a distributed database: global schema, fragmentation schema, allocation schema, then for each site (site 1, site 2, other sites) a local mapping schema, the site's DBMS and the local database at that site.]
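The fragmentation and allocation schemas described above are essentially two lookup tables. The following is a minimal sketch, not part of the original text: the class name DistributedCatalog, the relation R, its fragments R1 to R3 and the site names are all hypothetical. It shows how a global relation resolves to sites and how the allocation mapping alone decides whether the database is redundant.

```python
from dataclasses import dataclass, field


@dataclass
class DistributedCatalog:
    """Toy catalog holding two of the schema levels (hypothetical names)."""
    # Fragmentation schema: global relation -> its fragments (one-to-many).
    fragmentation: dict = field(default_factory=dict)
    # Allocation schema: fragment -> set of sites that store a copy.
    allocation: dict = field(default_factory=dict)

    def sites_for(self, relation):
        """Return {fragment: sorted list of sites} for a global relation."""
        return {f: sorted(self.allocation[f])
                for f in self.fragmentation[relation]}

    def is_redundant(self):
        """True if at least one fragment is stored at more than one site."""
        return any(len(sites) > 1 for sites in self.allocation.values())


catalog = DistributedCatalog(
    fragmentation={"R": ["R1", "R2", "R3"]},
    allocation={"R1": {"site1"}, "R2": {"site2"}, "R3": {"site2", "site3"}},
)
placement = catalog.sites_for("R")
redundant = catalog.is_redundant()  # R3 is stored twice in this toy setup
```

A query against R would consult `placement` to decide which sites to contact; the local mapping schema (not modeled here) would then translate each fragment into the objects stored by the local DBMS.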

1.1.5. Distributed database management systems

A distributed database management system (DDBMS) is defined as a software system that manages distributed databases (creating them and controlling access to them) and makes the distribution transparent to users. Transparency refers to the separation of the high-level semantics of a system from low-level implementation issues. Data distribution is hidden from users, so that they access the distributed database as if it were a centralized one. Changes in administration do not affect users.

A DDBMS consists of the following set of software components (programs):
- Programs that manage the distributed data.
- Programs that manage data communication.
- Programs that manage the local databases.
- Programs that manage the data dictionary.

To form a distributed database system (DDBS), the files must not only be logically interrelated; they must also be structured and accessed through a common interface.


A distributed database environment is one in which data is distributed over a number of sites.

1.2. Architecture of distributed database management systems

1.2.1. Client/server systems

Client/server DBMSs appeared in the early 1990s and have had a major influence on DBMS technology and on the way computing is done. The general idea is very simple: identify the functions that must be provided and divide them into two classes, server functions and client functions. This gives a two-level architecture that makes it easier to manage the complexity of modern DBMSs and the complexity of data distribution.

The server does most of the data management work. All query processing and optimization, transaction management and storage management are performed at the server. The client, in addition to the application and the user interface, has a client DBMS module responsible for managing the data sent to the client, and sometimes for managing transaction locks as well. The architecture shown in the figure below is very common in relational systems, where communication between client and server takes place at the level of SQL statements. In other words, the client passes SQL queries to the server without trying to understand or optimize them. The server does most of the work and returns the result relation to the client.

There are several kinds of client/server architecture. The simplest has one server accessed by many clients; we call this multiple client, single server. A more complex architecture has several servers in the system (multiple client, multiple server). In this case there are two management strategies: either each client manages its own connections to the servers, or each client knows only its home server and communicates with the other servers through it when needed. The first approach simplifies the server programs but loads the client machines with additional responsibilities, leading to what are called heavy-client systems. The second approach concentrates data management functionality at the servers, so transparency of data access is provided through the server interface.

From the point of view of the logical organization of data, a client/server DBMS presents the same view of data as the peer-to-peer systems discussed in the next section. That is, both show the user a single logical database, while at the physical level the data may be distributed. The main distinction between client/server and peer-to-peer systems therefore lies not in the level of transparency provided to users and applications, but in the architectural model used to achieve that transparency.

1.2.2. Peer-to-peer distributed systems


The client/server model distinguishes the client (which requests services) from the server (which serves the requests). In the peer-to-peer processing model, by contrast, all participating systems play the same role: each can both request services from another system and act as a service provider. Ideally, the peer-to-peer computing model supports cooperative processing among applications that may run on different hardware or operating systems. The purpose of a peer-to-peer processing environment is to support networked databases, so that DBMS users can access multiple heterogeneous databases.


CHAPTER 2. DATA DISTRIBUTION METHODS

2.1. Distributed database design

2.1.1. Design strategies

Top-down design process

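[Figure 2.1. Top-down design process. Requirements analysis yields the system requirements (objectives); conceptual design and view design, both with user input, yield the global conceptual schema, access information and external schema definitions; distribution design, with user input, yields the local conceptual schemas; physical design yields the physical schema; observation and monitoring provide feedback.]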


- Requirements analysis: defines the system environment and collects the data needs and processing needs of everyone who uses the database.
- View design: defines the interfaces for end users.
- Conceptual design: examines the enterprise as a whole to determine the entity types and the relationships among entities.
- Distribution design: divides relations into smaller relations called fragments and allocates them to sites.
- Physical design: maps the local conceptual schemas to the physical storage devices available at the corresponding sites.

Bottom-up design process

Top-down design suits databases that are designed from scratch. In practice, however, we often find that a number of databases already exist and the design task is to integrate them into one database. The bottom-up approach suits this situation. Its starting point is the set of local conceptual schemas, and the process consists of integrating the local schemas into a global conceptual schema.

2.2. Design issues

2.2.1. Reasons for fragmentation

Application views are usually only subsets of a relation. The unit of access is therefore not the whole relation but subsets of it, and it is natural to take subsets of relations as the unit of distribution. Decomposing a relation into fragments, each treated as a unit, allows many transactions to execute concurrently. Fragmentation also allows a single query to be executed in parallel by dividing it into a set of subqueries that operate on the fragments. Fragmentation thus increases the level of concurrency and hence the throughput of the system.

2.2.2. Types of fragmentation


Correctness rules of fragmentation

We follow three rules during fragmentation; together they ensure that the database undergoes no semantic change when it is fragmented.

a) Completeness.


If a relation instance $R$ is decomposed into fragments $R_1, R_2, \ldots, R_n$, then every data item found in $R$ can also be found in one or more of the fragments $R_i$. This property, which is identical to the lossless-join decomposition property of normalization, is also important in fragmentation because it ensures that the data of $R$ is mapped into fragments without loss. Note that in horizontal fragmentation the "data item" is a tuple, whereas in vertical fragmentation it is an attribute.

b) Reconstruction.
If a relation instance $R$ is decomposed into fragments $R_1, R_2, \ldots, R_n$, it must be possible to define a relational operator $\nabla$ such that

$R = \nabla R_i, \quad \forall R_i \in F_R$

The operator $\nabla$ differs for each kind of fragmentation; what matters is that it can be identified. The ability to reconstruct a relation from its fragments ensures that constraints defined on the data in the form of dependencies are preserved.

c) Disjointness.
If a relation $R$ is horizontally decomposed into fragments $R_1, R_2, \ldots, R_n$ and a data item $d_i$ is in fragment $R_j$, then it is not in any other fragment $R_k$ ($k \neq j$). This criterion ensures that horizontal fragments are disjoint. If a relation is vertically decomposed, the primary key attributes must be repeated in every fragment. In vertical fragmentation, therefore, disjointness is defined only on the non-primary-key attributes of a relation.
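For horizontal fragments, where the reconstruction operator is set union, the three rules above can be checked mechanically. The sketch below is an illustration under that assumption; the function name check_horizontal_fragmentation and the sample tuples are hypothetical, and relations are modeled as Python sets of tuples.

```python
def check_horizontal_fragmentation(relation, fragments):
    """Check completeness, reconstruction and disjointness.

    relation: set of tuples (the instance R)
    fragments: list of sets of tuples (R_1 .. R_n)
    Returns a (complete, reconstructible, disjoint) triple of booleans.
    """
    union = set().union(*fragments)
    # Completeness: every tuple of R appears in some fragment.
    complete = relation <= union
    # Reconstruction: the union of the fragments gives back exactly R.
    reconstructible = union == relation
    # Disjointness: no tuple is counted in two different fragments.
    disjoint = sum(len(f) for f in fragments) == len(union)
    return complete, reconstructible, disjoint


R = {(1, "a"), (2, "b"), (3, "c")}
good = check_horizontal_fragmentation(R, [{(1, "a")}, {(2, "b"), (3, "c")}])
overlapping = check_horizontal_fragmentation(
    R, [{(1, "a"), (2, "b")}, {(2, "b"), (3, "c")}])
lossy = check_horizontal_fragmentation(R, [{(1, "a")}])
```

For vertical fragments the same idea applies with a join on the primary key in place of union, which is why the key attributes are exempt from the disjointness rule.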

Information requirements

One point to note in distribution design is that too many factors affect an optimal design. The logical organization of the database, the location of the applications, the access characteristics of the applications, and the characteristics of the computer system at each site all influence distribution decisions. This makes the distribution problem very complex to formulate. The information needed for distribution design falls into four categories:
- Database information
- Application information
- Network information
- Computer system information


The last two categories are entirely quantitative in nature and are used in allocation models rather than in fragmentation algorithms.

2.2.3. Horizontal fragmentation


This section discusses the concepts related to horizontal fragmentation (horizontal distribution). Horizontal fragmentation partitions a relation r along its tuples, so each fragment is a subset of the tuples t of r. There are two basic strategies:
- Primary horizontal fragmentation of a relation is performed using predicates defined on that relation.
- Derived horizontal fragmentation partitions a relation using predicates defined on another relation.

Predicates therefore play a central role in horizontal fragmentation. This section examines the algorithms for both kinds of horizontal fragmentation. We first state the information needed to carry it out.

Information requirements of horizontal fragmentation

a) Database information
Database information concerns the global schema, the base relations and their subrelations. In this context we need to know how relations are connected to one another, by joins or by other operations. For derived fragmentation, where the predicates are defined on another relation, we usually use the entity-relationship model, because in this model relationships are represented by directed links (arcs) between relations that are related to each other through a join.


Example 1:

Relations:
CT (title, salary)
NV (MNV, name, title)
DA (MDA, name, budget, location)
PC (MNV, MDA, duty, duration)

Links (owner to member):
L1: CT to NV
L2: NV to PC
L3: DA to PC

Figure 2.2. Representing relationships among relations by links.

The figure above shows one way of representing links between relations. Note that the direction of a link indicates a one-to-many relationship. For example, each title is held by many employees, so we draw a link from relation CT (pay) toward NV (employee). Likewise, the many-to-many relationship between NV and DA (project) is represented by two links to relation PC (assignment). The relation at the tail of a link (the end without the arrowhead) is called the owner of the link, and the relation at the head (the arrowhead end) is called the member.

Example 2: For link L1 of Figure 2.2, the owner and member functions have the following values:
owner(L1) = CT
member(L1) = NV

The quantitative information required about the database is the cardinality of each relation R, that is, the number of tuples in R, denoted card(R).

b) Application information
For distribution, besides the quantitative information card(R), we also need basic qualitative information, namely the predicates used in user queries. The amount of this information depends on the specific problem.


If it is not possible to analyze all applications to determine these predicates, at least the most important applications should be studied. We now define simple predicates.

Given a relation $R(A_1, A_2, \ldots, A_n)$, where $A_i$ is an attribute defined over a domain $D(A_i)$, or $D_i$, a simple predicate $p$ defined on $R$ has the form

$p: A_i \ \theta \ \mathit{Value}$

where $\theta \in \{=, <, \neq, \leq, >, \geq\}$ and $\mathit{Value}$ is chosen from the domain of $A_i$ ($\mathit{Value} \in D_i$). Thus, given the schema $R$ and the domains $D_i$, we can determine the set $Pr$ of all simple predicates on $R$: $Pr = \{p : A_i \ \theta \ \mathit{Value}\}$. In practice, however, only proper subsets of $Pr$ are needed.

Example 3: For the project relation DA,
p1: name = "control equipment"
p2: budget ≤ 200000
are simple predicates.

We use $Pr_i$ to denote the set of all simple predicates defined on relation $R_i$. The members of $Pr_i$ are denoted $p_{ij}$.

Simple predicates are easy to handle, but user queries often contain more complex predicates that are combinations of simple ones. One combination deserves special attention: the minterm predicate, which is a conjunction of simple predicates. Since a Boolean expression can always be transformed into conjunctive normal form, using minterm predicates in a design algorithm does not lose generality.

Given a set $Pr_i = \{p_{i1}, p_{i2}, \ldots, p_{im}\}$ of simple predicates on relation $R_i$, the set of minterm predicates $M_i = \{m_{i1}, m_{i2}, \ldots, m_{iz}\}$ is defined as

$M_i = \{ m_{ij} \mid m_{ij} = \bigwedge_{p_{ik} \in Pr_i} p^{*}_{ik} \}, \quad 1 \leq k \leq m, \ 1 \leq j \leq z$

where $p^{*}_{ik} = p_{ik}$ or $p^{*}_{ik} = \neg p_{ik}$. Each simple predicate can therefore occur in a minterm predicate either in its natural form or in its negated form.
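Because each simple predicate appears in a minterm either plain or negated, m simple predicates yield 2^m candidate minterms. The following sketch enumerates them; it is only an illustration, and the helper names minterms and holds, the dictionary encoding of predicates and the sample row are assumptions of this example rather than anything defined in the text.

```python
from itertools import product


def minterms(simple_predicates):
    """Yield each minterm as a tuple of (predicate_name, is_positive) pairs."""
    names = list(simple_predicates)
    for signs in product([True, False], repeat=len(names)):
        yield tuple(zip(names, signs))


def holds(minterm, simple_predicates, row):
    """True if `row` satisfies every (possibly negated) literal of the minterm."""
    return all(simple_predicates[name](row) == positive
               for name, positive in minterm)


# The two simple predicates of Example 3, over rows of the project relation.
preds = {
    "p1": lambda r: r["name"] == "control equipment",
    "p2": lambda r: r["budget"] <= 200000,
}
all_minterms = list(minterms(preds))

row = {"name": "control equipment", "budget": 150000}
matching = [m for m in all_minterms if holds(m, preds, row)]
```

Every row satisfies exactly one minterm, which is what makes minterms usable as fragment definitions: the fragments they define cannot overlap.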


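Example 4: Consider relation CT:

title                 salary
Electrical Engineer   40000
Systems Analyst       34000
Mechanical Engineer   27000
Programmer            24000

The following are some of the simple predicates that can be defined on CT:
p1: title = "Electrical Engineer"
p2: title = "Systems Analyst"
p3: title = "Mechanical Engineer"
p4: title = "Programmer"
p5: salary ≤ 30000
p6: salary > 30000

The following are some of the minterm predicates that can be defined from these simple predicates:
m1: title = "Electrical Engineer" ∧ salary ≤ 30000
m2: title = "Electrical Engineer" ∧ salary > 30000
m3: ¬(title = "Electrical Engineer") ∧ salary ≤ 30000
m4: ¬(title = "Electrical Engineer") ∧ salary > 30000
m5: title = "Programmer" ∧ salary ≤ 30000
m6: title = "Programmer" ∧ salary > 30000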

Notes:
+ Negation is not always possible. For example, consider the two simple predicates lower_bound ≤ A and A ≤ upper_bound, which state that the domain of attribute A lies between a lower and an upper bound. Their complements, ¬(lower_bound ≤ A) and ¬(A ≤ upper_bound), cannot be defined, because the values of A they describe fall outside the domain of A. Equivalently, the two predicates can be written as lower_bound ≤ A ≤ upper_bound, whose complement ¬(lower_bound ≤ A ≤ upper_bound) is also undefined. For this reason, when studying these issues we consider only simple equality predicates. It follows that not every minterm predicate can be defined.
+ Some minterm predicates may be meaningless with respect to the semantics of the pay relation CT.

Note also that m3 can be rewritten as:
m3: title ≠ "Electrical Engineer" ∧ salary ≤ 30000

In terms of quantitative information about the applications, we need two sets of data.

- Minterm selectivity: the number of tuples of the relation that would be accessed by a query specified by a given minterm predicate. For example, the selectivity of m1 in Example 4 is zero because no tuple of CT satisfies the predicate, and the selectivity of m2 is 1. We denote the selectivity of a minterm $m_i$ by $sel(m_i)$.
- Access frequency: the frequency with which applications access data. If $Q = \{q_1, q_2, \ldots, q_q\}$ is the set of user queries, $acc(q_i)$ denotes the access frequency of $q_i$ in a given period. Note that each minterm is itself a query; we denote the access frequency of a minterm by $acc(m_i)$.

Primary horizontal fragmentation

Primary horizontal fragmentation is defined by a selection operation on the owner relations of a database schema. Given a relation $R$, its horizontal fragments are the $R_i$:

$R_i = \sigma_{F_i}(R), \quad 1 \leq i \leq z$

where $F_i$ is the selection formula used to obtain fragment $R_i$. Note that if $F_i$ is in conjunctive normal form, it is a minterm predicate ($m_i$).
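Since a fragment is the selection of R under a minterm, the selectivity of a minterm is simply the cardinality of the fragment it defines. The sketch below recomputes the selectivities quoted for Example 4; the helper name sel follows the notation of the text, while the list-of-dictionaries encoding of CT and the English attribute names are assumptions of this illustration.

```python
# Relation CT of Example 4, with attribute names rendered in English.
CT = [
    {"title": "Electrical Engineer", "salary": 40000},
    {"title": "Systems Analyst", "salary": 34000},
    {"title": "Mechanical Engineer", "salary": 27000},
    {"title": "Programmer", "salary": 24000},
]


def sel(minterm, relation):
    """Minterm selectivity: number of tuples that satisfy the minterm."""
    return sum(1 for row in relation if minterm(row))


def m1(r):
    return r["title"] == "Electrical Engineer" and r["salary"] <= 30000


def m2(r):
    return r["title"] == "Electrical Engineer" and r["salary"] > 30000


def m3(r):
    return r["title"] != "Electrical Engineer" and r["salary"] <= 30000


selectivities = {"m1": sel(m1, CT), "m2": sel(m2, CT), "m3": sel(m3, CT)}
```

Here m1 selects nothing and m2 selects the single Electrical Engineer tuple, matching the values stated above; m3 selects the two tuples with salary at most 30000.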


Example 5: Consider relation DA:

MDA   name               budget   location
P1    Instrumentation    150000   Montreal
P2    Data development   135000   New York
P3    CAD/CAM            250000   New York
P4    Maintenance        310000   Paris

We can define horizontal fragments based on project location:
DA1 = σ location="Montreal" (DA)
DA2 = σ location="New York" (DA)
DA3 = σ location="Paris" (DA)

The resulting fragments are:

DA1
MDA   name               budget   location
P1    Instrumentation    150000   Montreal

DA2
MDA   name               budget   location
P2    Data development   135000   New York
P3    CAD/CAM            250000   New York

DA3
MDA   name               budget   location
P4    Maintenance        310000   Paris
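The fragmentation of Example 5 is one selection per location value. A minimal sketch follows, assuming DA is held as a list of dictionaries with English attribute names; the helper name select is hypothetical and stands for the relational selection operator.

```python
DA = [
    {"MDA": "P1", "name": "Instrumentation", "budget": 150000, "location": "Montreal"},
    {"MDA": "P2", "name": "Data development", "budget": 135000, "location": "New York"},
    {"MDA": "P3", "name": "CAD/CAM", "budget": 250000, "location": "New York"},
    {"MDA": "P4", "name": "Maintenance", "budget": 310000, "location": "Paris"},
]


def select(relation, predicate):
    """Relational selection: keep the tuples that satisfy the predicate."""
    return [row for row in relation if predicate(row)]


# One fragment per location; `loc=loc` binds the current value in each lambda.
fragments = {
    loc: select(DA, lambda r, loc=loc: r["location"] == loc)
    for loc in ("Montreal", "New York", "Paris")
}
keys_by_location = {loc: [r["MDA"] for r in rows]
                    for loc, rows in fragments.items()}
```

Concatenating the three fragments gives back all four tuples of DA, and no tuple appears twice, so this fragmentation is complete, reconstructible by union, and disjoint.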

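We can now define a horizontal fragment more precisely: a horizontal fragment $R_i$ of relation $R$ consists of all the tuples of $R$ that satisfy a minterm predicate $m_i$.

Two important properties of a set of simple predicates are completeness and minimality.
- A set of simple predicates Pr is said to be complete if and only if any two tuples belonging to the same minterm fragment defined by Pr have the same probability of being accessed by every application.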


Example 6: Consider the fragmentation of relation DA given in Example 5. If the set of application predicates is Pr = {location = "Montreal", location = "New York", location = "Paris", budget ≤ 200000}, then Pr is not complete, because some tuples of DA are not accessed by the predicate budget ≤ 200000. To make the set complete, we must add the predicate budget > 200000 to Pr. Then Pr = {location = "Montreal", location = "New York", location = "Paris", budget ≤ 200000, budget > 200000} is complete, because every tuple is accessed by exactly two predicates of Pr. Of course, if any predicate is removed from Pr, the remaining set is no longer complete.

Completeness matters because the fragments obtained from a complete set of predicates are logically uniform, since they all satisfy a minterm predicate. They are also statistically homogeneous in the way applications access them. We therefore use a complete set of predicates as the basis of primary horizontal fragmentation.

- The second property of a set of predicates is minimality. This is an intuitive property: a simple predicate must be relevant in determining a fragment. A predicate that takes part in no fragmentation can be considered redundant. If all predicates of Pr are relevant, Pr is minimal.

Example 7: The set Pr defined in Example 6 is complete and minimal. If we add the predicate name = "Instrumentation" to Pr, however, the resulting set is no longer minimal, because the new predicate is not relevant with respect to Pr: it does not further divide any of the fragments already created.

Completeness is closely tied to the goal of the problem: the set of predicates must be complete with respect to the requirements before the problem can be solved. Minimality concerns optimizing storage and optimizing operations on the set of queries. Given a set of predicates Pr, we can therefore check minimality by discarding redundant predicates to obtain a minimal set Pr', which is of course still complete with respect to Pr.

Algorithm COM_MIN finds a complete and minimal set of predicates Pr' from Pr. We adopt the following convention:

Rule 1: The fundamental rule of completeness and minimality states that a relation or fragment is partitioned "into at least two parts which are accessed differently by at least one application".


Algorithm 1.1 COM_MIN
Input:   R: relation; Pr: set of simple predicates
Output:  Pr': minimal and complete set of predicates
Declare: F: set of minterm fragments
Begin
    Pr' := ∅; F := ∅;
    For each predicate p ∈ Pr
        if p partitions R according to Rule 1 then
        Begin
            Pr' := Pr' ∪ {p};
            Pr := Pr − {p};
            F := F ∪ {f};        {f is the minterm fragment defined by p}
        End;
    {The predicates that partition R have now been moved into Pr'}
    Repeat
        For each p ∈ Pr
            if p partitions some fragment fk of Pr' according to Rule 1 then
            Begin
                Pr' := Pr' ∪ {p};
                Pr := Pr − {p};
                F := F ∪ {f};
            End;
    Until Pr' is complete        {no remaining p partitions any fragment fk of Pr'}
    For each p ∈ Pr'
        if ∃ p' such that p ⇔ p' then
        Begin
            Pr' := Pr' − {p};
            F := F − {f};
        End;
End. {COM_MIN}


The algorithm starts by finding a predicate that is relevant and partitions the given relation. The Repeat-until loop adds to this set the predicates that partition the fragments, ensuring the completeness of Pr'. The final part checks the minimality of Pr'. At the end, Pr' is both minimal and complete.

The second step of primary horizontal fragmentation design is to derive the set of minterm predicates that can be defined on the predicates in Pr'. These minterm predicates determine the fragments that are "candidates" for the allocation step. Determining the minterm predicates is trivial; the main difficulty is that the set of minterm predicates may be very large (it is in fact exponential in the number of simple predicates). The next step looks for ways to reduce the number of minterm predicates that need to be considered in fragmentation.

The third step of the design process is to eliminate meaningless fragments. This is done by identifying the minterms that contradict a set of implications I. For example, if Pr' = {p1, p2}, where
p1: att = value_1
p2: att = value_2
and the domain of att is {value_1, value_2}, then I clearly contains the two implications:
i1: (att = value_1) ⇒ ¬(att = value_2)
i2: ¬(att = value_1) ⇒ (att = value_2)
The following four minterm predicates are defined according to Pr':
m1: (att = value_1) ∧ (att = value_2)
m2: (att = value_1) ∧ ¬(att = value_2)
m3: ¬(att = value_1) ∧ (att = value_2)
m4: ¬(att = value_1) ∧ ¬(att = value_2)
In this case the minterm predicates m1 and m4 contradict the implications I and are therefore removed from M.

The primary horizontal fragmentation algorithm is given as Algorithm 1.2.

Algorithm 1.2 PHORIZONTAL
Input:   R: relation; Pr: set of simple predicates
Output:  M: set of minterm predicates
Begin
    Pr' := COM_MIN(R, Pr);
    Determine the set M of minterm predicates;
    Determine the set I of implications among the pi ∈ Pr';
    For each mi ∈ M do
    Begin
        if mi contradicts I then
            M := M − {mi}
    End;
End. {PHORIZONTAL}

Example 8: Consider relation DA and assume there are two applications. The first is issued at three sites and finds the names and budgets of projects given their location. In SQL notation the query is:

    SELECT name, budget
    FROM DA
    WHERE location = value

For this application, the simple predicates that can be used are:
p1: location = "Montreal"
p2: location = "New York"
p3: location = "Paris"

The second application concerns project management: projects with a budget of up to 200,000 dollars are managed at one site, while projects with a larger budget are managed at a second site. The simple predicates needed to fragment according to the second application are:
p4: budget ≤ 200000
p5: budget > 200000

Checked with algorithm COM_MIN, the set Pr' = {p1, p2, p3, p4, p5} is clearly complete and minimal. Based on Pr' we can define the following six minterm predicates that form M:


M1: (a im=Montreal) (ngn sch200000) M2: (a im=Montreal) (ngn sch>200000) M3: (a im=New York) (ngn sch200000) M4: (a im=New York) (ngn sch>200000) M5: (a im=Paris) (ngn sch200000) M6: (a im=Paris) (ngn sch>200000) y khng phi l cc v t hi s cp duy nht c th c to ra. Chng hn vn c th nh ngha cc v t: p1 p2 p3 p4 p5 Tuy nhin cc php ko hin nhin l: I1: p1 p2 p3 I2: p2 p1 p3 I3: p3 p1 p2 I4: p4 p5 I5: p5 p4 I6: p4 p5 I7: p5 p4 Cho php loi b nhng v t hi s cp ny v chng ta cn li m1 n m6. Cn nh rng cc php ko theo phi c nh ngha theo ng ngha ca CSDL, khng phi theo cc gi tr hin ti. Mt s mnh c nh ngha theo M={m1,,m6} c th rng nhng chng vn l cc mnh. Kt qu phn mnh nguyn thu cho DA l to ra su mnh FDA={DA1, DA2, DA3, DA4, DA5, DA6}, y c hai mnh rng l {DA2, DA5 } DA1 MDA P1 TnDA Ngn sch a im Montreal

Thit b o c 150000

26

DA3 MDA P2 TnDA Pht trin d liu Ngn sch 135000 a im New York

DA4 MDA P3 TnDA CAD/CAM Ngn sch 250000 a im New York

DA 6 MDA P4 TnDA bo dng Ngn sch 310000 a im Paris
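The elimination step of Example 8 can be sketched in a few lines of Python. This is a minimal illustration, not the PHORIZONTAL algorithm itself: instead of an explicit implication set I, it tests each of the 2^5 sign combinations of p1..p5 against one representative tuple per region of the domain. The attribute names DiaDiem and NganSach follow the example; the assumption that DiaDiem takes only the three listed values, and the helper names, are this sketch's own.

```python
from itertools import product

# Simple predicates p1..p5 of Example 8, written over a tuple t of DA.
SIMPLE = [
    lambda t: t["DiaDiem"] == "Montreal",   # p1
    lambda t: t["DiaDiem"] == "New York",   # p2
    lambda t: t["DiaDiem"] == "Paris",      # p3
    lambda t: t["NganSach"] <= 200000,      # p4
    lambda t: t["NganSach"] > 200000,       # p5
]

# One representative tuple per region of the domain (assumption: DiaDiem
# can only be one of the three cities named in the example).
WITNESSES = [
    {"DiaDiem": loc, "NganSach": budget}
    for loc in ("Montreal", "New York", "Paris")
    for budget in (200000, 200001)
]

def satisfiable(signs):
    """True if some witness makes p_i true exactly where signs[i] is True."""
    return any(all(p(t) == s for p, s in zip(SIMPLE, signs)) for t in WITNESSES)

def describe(signs):
    """Render a sign vector as a readable conjunction."""
    return " AND ".join(("" if s else "NOT ") + f"p{i + 1}" for i, s in enumerate(signs))

# Keep only the minterms that do not contradict the domain semantics.
minterms = [s for s in product((True, False), repeat=len(SIMPLE)) if satisfiable(s)]

for m in minterms:
    print(describe(m))
```

Each surviving sign vector corresponds to one of m1..m6 once the implied negations are dropped; for instance p1 AND NOT p2 AND NOT p3 AND p4 AND NOT p5 is m1.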

Derived horizontal fragmentation

A derived horizontal fragmentation is defined on the member relation of a link according to a selection operation specified on the owner relation of that link. Given a link L with owner(L) = S and member(L) = R, the derived horizontal fragments of R are defined as

$$R_i = R \ltimes S_i, \qquad 1 \le i \le w$$

where w is the number of fragments defined on R, and $S_i = \sigma_{F_i}(S)$, with $F_i$ the formula that defines the primary horizontal fragment $S_i$.

Example 9: Consider the link L1 whose owner is CT(ChucVu, Luong) (title, salary) and whose member is NV(MNV, TenNV, ChucVu) (employee id, name, title).

NV
MNV   TenNV      ChucVu
E1    J.Doe      Electrical Engineer
E2    M.Smith    Systems Analyst
E3    A.Lee      Mechanical Engineer
E4    J.Miller   Programmer
E5    B.Casey    Systems Analyst
E6    L.Chu      Electrical Engineer
E7    R.David    Mechanical Engineer
E8    J.Jones    Systems Analyst

We can then group the employees into two groups according to salary: those earning at most 30,000 dollars and those earning more than 30,000 dollars. The two fragments NV1 and NV2 are defined as follows:

$$NV_1 = NV \ltimes CT_1, \qquad NV_2 = NV \ltimes CT_2$$

where

$$CT_1 = \sigma_{Luong \le 30000}(CT), \qquad CT_2 = \sigma_{Luong > 30000}(CT)$$

CT1
ChucVu                Luong
Mechanical Engineer   27000
Programmer            24000

CT2
ChucVu                Luong
Electrical Engineer   40000
Systems Analyst       34000
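The semijoin that defines NV1 and NV2 can be sketched as follows. This is a minimal illustration on the data of Example 9 (titles shown in English); the helper names `select` and `semijoin` and the positional tuple layout are assumptions of the sketch, not part of the text.

```python
# Member relation NV(MNV, TenNV, ChucVu) and owner relation CT(ChucVu, Luong).
NV = [
    ("E1", "J.Doe", "Electrical Engineer"),
    ("E2", "M.Smith", "Systems Analyst"),
    ("E3", "A.Lee", "Mechanical Engineer"),
    ("E4", "J.Miller", "Programmer"),
    ("E5", "B.Casey", "Systems Analyst"),
    ("E6", "L.Chu", "Electrical Engineer"),
    ("E7", "R.David", "Mechanical Engineer"),
    ("E8", "J.Jones", "Systems Analyst"),
]
CT = [
    ("Electrical Engineer", 40000),
    ("Systems Analyst", 34000),
    ("Mechanical Engineer", 27000),
    ("Programmer", 24000),
]

def select(relation, predicate):
    """Relational selection: keep the tuples that satisfy the predicate."""
    return [t for t in relation if predicate(t)]

def semijoin(member, owner, member_attr, owner_attr):
    """member ⋉ owner: member tuples whose join value appears in owner."""
    keys = {o[owner_attr] for o in owner}
    return [m for m in member if m[member_attr] in keys]

# Primary fragments of the owner relation.
CT1 = select(CT, lambda t: t[1] <= 30000)
CT2 = select(CT, lambda t: t[1] > 30000)

# Derived fragments of the member relation (join attribute: ChucVu).
NV1 = semijoin(NV, CT1, member_attr=2, owner_attr=0)
NV2 = semijoin(NV, CT2, member_attr=2, owner_attr=0)

print([t[0] for t in NV1])
print([t[0] for t in NV2])
```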

The result of the derived horizontal fragmentation of relation NV is as follows:

NV1
MNV   TenNV      ChucVu
E3    A.Lee      Mechanical Engineer
E4    J.Miller   Programmer
E7    R.David    Mechanical Engineer

NV2
MNV   TenNV      ChucVu
E1    J.Doe      Electrical Engineer
E2    M.Smith    Systems Analyst
E5    B.Casey    Systems Analyst
E6    L.Chu      Electrical Engineer
E8    J.Jones    Systems Analyst

Notes:

+ To carry out a derived horizontal fragmentation we need three inputs:
1. The set of partitions of the owner relation (for example CT1, CT2).
2. The member relation.
3. The set of semijoin predicates between the owner and the member (for example CT.Chucvu = NV.Chucvu).

+ A complication to watch for: in a database schema there are often several links into the same relation R, so there can be more than one candidate derived fragmentation of R. The choice is based on two criteria:
1. The fragmentation with better join characteristics.
2. The fragmentation used in more applications.
Applying these criteria, however, is not straightforward.

Example 10: We continue the distribution design of the database started in Example 9. Relation NV has been fragmented according to CT. Now consider PC. Assume the following two applications:
1. Application 1: find the names of the engineers who work at certain places. It runs at all three sites and accesses the engineers of local projects more often than those of projects at other locations.
2. Application 2: at each administrative site where employee records are managed, users want to access the projects these employees work on and to know how long they will work on each project.

Checking correctness

We now need to check the correctness of horizontal fragmentation.

a. Completeness

+ Primary horizontal fragmentation: as long as the selection predicates are complete, the resulting fragmentation is guaranteed to be complete as well. Because the fragmentation algorithm is based on a set Pr of predicates that is minimal and complete, completeness holds provided no mistake is made in defining Pr.

+ Derived horizontal fragmentation: this case is slightly different. The main difficulty is that the predicate defining the fragmentation involves two relations. We first state the completeness rule formally. Let R be the member relation of a link whose owner is relation S, and let A be the join attribute between R and S. Then for every tuple t of R there must be a tuple t' of S such that

t[A] = t'[A]

This rule is known as the referential integrity constraint. It ensures that every tuple in the fragments of the member relation also appears in the owner relation.

b. Reconstruction

A global relation is reconstructed from its fragments by the union operator, in both primary and derived horizontal fragmentation. Thus, for a relation R with fragmentation F_R = {R1, R2, ..., Rm} we have

$$R = \bigcup R_i, \qquad \forall R_i \in F_R$$

c. Disjointness

For primary fragmentation, disjointness is guaranteed as long as the minterm predicates that define the fragmentation are mutually exclusive. For derived fragmentation, disjointness can be guaranteed if the join graph is simple.
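The three correctness rules above translate directly into set operations. The sketch below is a hedged illustration: tuples are represented only by their MNV keys, and the function names are hypothetical.

```python
from itertools import combinations

def reconstruct(fragments):
    """Reconstruction of a horizontally fragmented relation: union of fragments."""
    return set().union(*fragments)

def is_complete(relation, fragments):
    """Completeness: every tuple of the relation appears in some fragment."""
    return set(relation) <= reconstruct(fragments)

def is_disjoint(fragments):
    """Disjointness: no tuple appears in two different fragments."""
    return all(set(a).isdisjoint(b) for a, b in combinations(fragments, 2))

# Tuples identified by MNV only, fragments as in the result of Example 9.
NV = {"E1", "E2", "E3", "E4", "E5", "E6", "E7", "E8"}
NV1 = {"E3", "E4", "E7"}
NV2 = {"E1", "E2", "E5", "E6", "E8"}

print(is_complete(NV, [NV1, NV2]), reconstruct([NV1, NV2]) == NV, is_disjoint([NV1, NV2]))
```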

2.3. Vertical fragmentation

A vertical fragmentation of a relation R produces fragments R1, R2, ..., Rr, each of which contains a subset of the attributes of R together with the key of R. The purpose of vertical fragmentation is to partition a relation into a set of smaller relations so that many applications need to run on only one fragment. An optimal fragmentation is one that produces a fragmentation scheme that minimizes the execution time of the applications running on these fragments.

Vertical fragmentation is inherently more complicated than horizontal fragmentation, because the total number of possible vertical partitions is very large. Finding optimal solutions to the vertical partitioning problem is therefore very hard, and one has to resort to heuristics. We present two kinds of heuristics for the vertical fragmentation of global relations.

- Grouping: start by assigning each attribute to its own fragment, and at each step join some of the fragments until some criterion is satisfied. This technique was first proposed for centralized databases and was later used for distributed databases.
- Splitting: start with a relation and decide on a beneficial partitioning based on the access behavior of the applications on the attributes.

Because vertical partitioning places in one fragment the attributes that are usually accessed together, we need a measure that defines the notion of "togetherness" more precisely. This measure is the affinity of attributes, which indicates how closely related the attributes are.

The main data requirement related to applications is their access frequencies. Let Q = {q1, q2, ..., qq} be the set of user queries (applications) that run on relation R(A1, A2, ..., An). Then for each query qi and each attribute Aj we associate an attribute usage value, denoted use(qi, Aj), defined as follows:

$$use(q_i, A_j) = \begin{cases} 1 & \text{if attribute } A_j \text{ is referenced by query } q_i \\ 0 & \text{otherwise} \end{cases}$$

The vectors use(qi, •) for each application are easy to define if the designer knows the applications that will run on the database.

Example 11: Consider relation DA and assume that the following applications run on it. In each case we also give the SQL specification.

q1: Find the budget of a project, given its project id.
SELECT NganSach
FROM   DA
WHERE  MDA = value

q2: Find the names and budgets of all projects.
SELECT TenDA, NganSach
FROM   DA

q3: Find the names of the projects located in a given city.
SELECT TenDA
FROM   DA
WHERE  DiaDiem = value

q4: Find the total project budget for each city.
SELECT SUM(NganSach)
FROM   DA
WHERE  DiaDiem = value

From these four applications we can define the attribute usage values. For notational convenience, let A1 = MDA, A2 = TenDA, A3 = NganSach, A4 = DiaDiem. The usage values are given as a matrix in which entry (i, j) denotes use(qi, Aj):

      A1   A2   A3   A4
q1     1    0    1    0
q2     0    1    1    0
q3     0    1    0    1
q4     0    0    1    1

Affinity of attributes

Attribute usage values are not general enough to serve as the basis for splitting and fragmentation, because they do not reflect the magnitude of the application frequencies. The affinity measure aff(Ai, Aj), which expresses the bond between two attributes of a relation according to how they are accessed by applications, is the quantity needed for the fragmentation problem.

To build the formula for the affinity of two attributes Ai and Aj, let k be the number of fragments of R, that is, R is split into R1, ..., Rk. Let Q = {q1, q2, ..., qm} be the set of queries (the applications running on R). Let Q(A, B) be the set of applications q in Q for which use(q, A) · use(q, B) = 1. In other words:

$$Q(A, B) = \{ q \in Q : use(q, A) = use(q, B) = 1 \}$$

For example, from the matrix above we get Q(A1, A1) = {q1}, Q(A2, A2) = {q2, q3}, Q(A3, A3) = {q1, q2, q4}, Q(A4, A4) = {q3, q4}, Q(A1, A2) = ∅, Q(A1, A3) = {q1}, Q(A2, A3) = {q2}, ...

The affinity between two attributes Ai and Aj is defined as:

$$aff(A_i, A_j) = \sum_{q_k \in Q(A_i, A_j)} \; \sum_{\forall R_l} ref_l(q_k)\, acc_l(q_k)$$

or, equivalently:

$$aff(A_i, A_j) = \sum_{k \,:\, use(q_k, A_i) = 1 \,\wedge\, use(q_k, A_j) = 1} \; \sum_{\forall R_l} ref_l(q_k)\, acc_l(q_k)$$

where $ref_l(q_k)$ is the number of accesses to the attributes (Ai, Aj) for each execution of application qk at site Rl, and $acc_l(q_k)$ is the access frequency of application qk at site l. Note that only the applications that use both Ai and Aj appear in the formula for aff(Ai, Aj).

The result of this computation is a symmetric n x n matrix, each element of which is one of the measures defined above. We call it the attribute affinity matrix (AA).

Example 12: We continue with Example 11. For simplicity, assume that $ref_l(q_k) = 1$ for all qk and Rl, and that the application frequencies are:

acc1(q1) = 15    acc2(q1) = 20    acc3(q1) = 10
acc1(q2) = 5     acc2(q2) = 0     acc3(q2) = 0
acc1(q3) = 25    acc2(q3) = 25    acc3(q3) = 25
acc1(q4) = 3     acc2(q4) = 0     acc3(q4) = 0

The affinity between attributes A1 and A3 is:

$$aff(A_1, A_3) = \sum_{k=1}^{1} \sum_{l=1}^{3} acc_l(q_k) = acc_1(q_1) + acc_2(q_1) + acc_3(q_1) = 45$$

because q1 is the only application that accesses both attributes. Computing the remaining pairs in the same way gives the following affinity matrix:

      A1   A2   A3   A4
A1    45    0   45    0
A2     0   80    5   75
A3    45    5   53    3
A4     0   75    3   78
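The matrix of Example 12 can be reproduced with a short script. This is a minimal sketch under the example's own assumption that every ref value equals 1; the function name and the list-of-lists layout are choices made here for illustration.

```python
# use(q_k, A_j) for q1..q4 over A1..A4 (Example 11).
USE = [
    [1, 0, 1, 0],
    [0, 1, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 1],
]
# ACC[k][l]: access frequency of query q_(k+1) at site l+1 (Example 12).
ACC = [
    [15, 20, 10],
    [5, 0, 0],
    [25, 25, 25],
    [3, 0, 0],
]

def affinity_matrix(use, acc, ref=1):
    """aff(Ai, Aj) = sum over queries using both Ai and Aj of ref * acc over all sites."""
    n = len(use[0])
    aa = [[0] * n for _ in range(n)]
    for k, row in enumerate(use):
        total = ref * sum(acc[k])          # contribution of q_k summed over sites
        for i in range(n):
            for j in range(n):
                if row[i] and row[j]:      # q_k references both Ai and Aj
                    aa[i][j] += total
    return aa

AA = affinity_matrix(USE, ACC)
for row in AA:
    print(row)
```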

The bond energy algorithm (BEA)

At this point we could already split R into fragments of attribute groups based on the affinity between attributes: for example, the affinity of A1 and A3 is 45 and that of A2 and A4 is 75, whereas the affinity of A1 and A2 is 0 and that of A3 and A4 is 3. However, linear methods that work directly on this matrix have attracted little interest and are rarely used. Instead we consider the bond energy algorithm (BEA) of Hoffer and Severance (1975) and Navathe et al. (1984), which has the following properties:

1. It is designed specifically to determine groups of similar items, as opposed to a linear ordering of the items.
2. The resulting groupings are insensitive to the order in which the items are presented to the algorithm.
3. Its computation time is acceptable: O(n^2), where n is the number of attributes.
4. Secondary interrelationships between the attribute groups can be identified.

The BEA takes as input the attribute affinity matrix (AA), permutes its rows and columns, and produces a clustered affinity matrix (CA). The permutation is done so as to maximize the global affinity measure AM:

$$AM = \sum_{i=1}^{n} \sum_{j=1}^{n} aff(A_i, A_j)\left[aff(A_i, A_{j-1}) + aff(A_i, A_{j+1}) + aff(A_{i-1}, A_j) + aff(A_{i+1}, A_j)\right]$$

where

$$aff(A_0, A_j) = aff(A_i, A_0) = aff(A_{n+1}, A_j) = aff(A_i, A_{n+1}) = 0 \quad \text{for all } i, j$$

The last set of conditions covers the cases in which an attribute is placed in CA to the left of the leftmost attribute or to the right of the rightmost attribute during column permutations, and above the topmost row or below the bottom row during row permutations. In these cases we take 0 as the affinity between the attribute under consideration and its left or right (top or bottom) neighbors, which do not yet exist in CA.

The maximization function considers only the nearest neighbors, so it groups large values with large values and small values with small values. Because the attribute affinity matrix AA is symmetric, the objective function reduces to:

$$AM = \sum_{i=1}^{n} \sum_{j=1}^{n} aff(A_i, A_j)\left[aff(A_i, A_{j-1}) + aff(A_i, A_{j+1})\right]$$

The clustered affinity matrix (CA) is generated in three steps:

Step 1: Initialization. Place and fix one of the columns of AA in CA. In this algorithm columns 1 and 2 are chosen.

Step 2: Iteration. Take each of the remaining n - i columns in turn (where i is the number of columns already placed in CA) and try to place it in each of the i + 1 possible positions in CA. Choose the position that makes the largest contribution to the global affinity measure AM. Repeat until no columns remain to be placed.

Step 3: Row ordering. Once the column ordering is determined, the rows are rearranged so that their relative positions match the relative positions of the columns.

Algorithm BEA
Input:  AA: attribute affinity matrix;
Output: CA: clustered affinity matrix, after the rows and columns have been rearranged;
Begin
  {initialization: remember that AA is an n x n matrix}
  CA(•, 1) ← AA(•, 1)
  CA(•, 2) ← AA(•, 2)
  index ← 3
  while index <= n do      {choose the best position for attribute A_index}
  begin
    for i := 1 to index - 1 by 1 do
      compute cont(A_(i-1), A_index, A_i);
    compute cont(A_(index-1), A_index, A_(index+1));      {boundary condition}
    loc ← position given by the largest cont value;
    for j := index downto loc do      {shift the two matrices}
      CA(•, j) ← CA(•, j-1);
    CA(•, loc) ← AA(•, index);
    index ← index + 1
  end-while
  order the rows according to the relative ordering of the columns
End. {BEA}
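The column-placement loop of BEA can be sketched in Python as follows. This is an illustrative sketch rather than a reference implementation: it relies on the net contribution cont(Ai, Ak, Aj) = 2 bond(Ai, Ak) + 2 bond(Ak, Aj) - 2 bond(Ai, Aj), with bond(Ax, Ay) the sum over z of aff(Az, Ax) * aff(Az, Ay) and a bond of 0 for a missing neighbor. The function names and the use of `None` for a missing neighbor are assumptions of the sketch.

```python
def bond(aa, x, y):
    """bond(Ax, Ay) = sum_z aff(Az, Ax) * aff(Az, Ay); 0 if a neighbor is missing."""
    if x is None or y is None:
        return 0
    return sum(row[x] * row[y] for row in aa)

def cont(aa, left, mid, right):
    """Net contribution to AM of placing attribute `mid` between `left` and `right`."""
    return 2 * bond(aa, left, mid) + 2 * bond(aa, mid, right) - 2 * bond(aa, left, right)

def bea_order(aa):
    """Return the column order of the clustered affinity matrix (0-based indices)."""
    order = [0, 1]                              # initialization: columns 1 and 2
    for k in range(2, len(aa)):
        best_loc, best_cont = 0, None
        for loc in range(len(order) + 1):       # every possible insertion point
            left = order[loc - 1] if loc > 0 else None
            right = order[loc] if loc < len(order) else None
            c = cont(aa, left, k, right)
            if best_cont is None or c > best_cont:
                best_loc, best_cont = loc, c
        order.insert(best_loc, k)
    return order

def clustered(aa, order):
    """Apply the same permutation to rows and columns."""
    return [[aa[r][c] for c in order] for r in order]

AA = [
    [45, 0, 45, 0],
    [0, 80, 5, 75],
    [45, 5, 53, 3],
    [0, 75, 3, 78],
]
order = bea_order(AA)
CA = clustered(AA, order)
print([f"A{i + 1}" for i in order])
for row in CA:
    print(row)
```

On the affinity matrix of Example 12 the sketch places A3 between A1 and A2 and then A4 to the right of A2.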

To understand the algorithm we need to know what cont(•, •, •) is. Recall that the global affinity measure AM was defined as:

$$AM = \sum_{i=1}^{n} \sum_{j=1}^{n} aff(A_i, A_j)\left[aff(A_i, A_{j-1}) + aff(A_i, A_{j+1})\right]$$

which can be rewritten as:

$$AM = \sum_{i=1}^{n} \sum_{j=1}^{n} \left[aff(A_i, A_j)\, aff(A_i, A_{j-1}) + aff(A_i, A_j)\, aff(A_i, A_{j+1})\right]$$

$$= \sum_{j=1}^{n} \left[\sum_{i=1}^{n} aff(A_i, A_j)\, aff(A_i, A_{j-1}) + \sum_{i=1}^{n} aff(A_i, A_j)\, aff(A_i, A_{j+1})\right]$$

We define the bond between two attributes Ax and Ay as:

$$bond(A_x, A_y) = \sum_{z=1}^{n} aff(A_z, A_x)\, aff(A_z, A_y)$$

Then AM can be written as:

$$AM = \sum_{j=1}^{n} \left[bond(A_j, A_{j-1}) + bond(A_j, A_{j+1})\right]$$

Now consider the following n attributes:

A1 A2 ... A(i-1) | Ai Aj | A(j+1) ... An

where A1, A2, ..., A(i-1) belong to the group AM' and A(j+1), ..., An belong to the group AM''. The global affinity measure for these attributes can then be written as:

$$AM_{old} = AM' + AM'' + bond(A_{i-1}, A_i) + bond(A_i, A_j) + bond(A_j, A_i) + bond(A_j, A_{j+1})$$

$$= AM' + AM'' + bond(A_{i-1}, A_i) + bond(A_j, A_{j+1}) + 2\,bond(A_i, A_j)$$

Now consider placing a new attribute Ak between attributes Ai and Aj in the clustered affinity matrix. The terms bond(A(i-1), Ai) and bond(Aj, A(j+1)) are not affected by the insertion, so the new global affinity measure can be written similarly as:

$$AM_{new} = AM' + AM'' + bond(A_{i-1}, A_i) + bond(A_j, A_{j+1}) + bond(A_i, A_k) + bond(A_k, A_i) + bond(A_k, A_j) + bond(A_j, A_k)$$

$$= AM' + AM'' + bond(A_{i-1}, A_i) + bond(A_j, A_{j+1}) + 2\,bond(A_i, A_k) + 2\,bond(A_k, A_j)$$

The net contribution to the global affinity measure of placing attribute Ak between Ai and Aj is therefore:

$$cont(A_i, A_k, A_j) = AM_{new} - AM_{old} = 2\,bond(A_i, A_k) + 2\,bond(A_k, A_j) - 2\,bond(A_i, A_j)$$

For the boundary cases: if Ak is placed to the left of the leftmost attribute, bond(A0, Ak) = 0. If Ak is placed to the right of the rightmost attribute, no attribute has yet been placed in column k+1 of CA, so bond(Ak, A(k+1)) = 0.

Example 13: Take the matrix of Example 12 and compute the contribution of moving attribute A4 between attributes A1 and A2, given by the formula:

cont(A1, A4, A2) = 2 bond(A1, A4) + 2 bond(A4, A2) - 2 bond(A1, A2)

Computing each term, we get:

$$bond(A_1, A_4) = \sum_{z=1}^{4} aff(A_z, A_1)\, aff(A_z, A_4)$$

= aff(A1, A1) aff(A1, A4) + aff(A2, A1) aff(A2, A4) + aff(A3, A1) aff(A3, A4) + aff(A4, A1) aff(A4, A4)
= 45*0 + 0*75 + 45*3 + 0*78 = 135

bond(A4, A2) = 11865
bond(A1, A2) = 225

Therefore cont(A1, A4, A2) = 2*135 + 2*11865 - 2*225 = 23550.

Example 14: Consider the clustering of the attributes of relation DA using the attribute affinity matrix AA. In the initialization step we copy columns 1 and 2 of AA into CA and start with column 3. There are three places where column 3 can be placed: (3-1-2), (1-3-2) and (1-2-3). Let us compute the contribution to the global affinity measure of each alternative.

Ordering (0-3-1):
cont(A0, A3, A1) = 2 bond(A0, A3) + 2 bond(A3, A1) - 2 bond(A0, A1)
bond(A0, A3) = bond(A0, A1) = 0
bond(A3, A1) = 45*45 + 5*0 + 53*45 + 3*0 = 4410
cont(A0, A3, A1) = 8820

Ordering (1-3-2):
cont(A1, A3, A2) = 10150

Ordering (2-3-4):
cont(A2, A3, A4) = 1780

Because the contribution of ordering (1-3-2) is the largest, we place A3 to the right of A1. A similar computation for A4 shows that it should be placed to the right of A2. Finally, the rows are arranged in the same order as the columns. The successive matrices are shown in the following figure:

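(a) Columns A1 and A2 placed in CA

      A1   A2
A1    45    0
A2     0   80
A3    45    5
A4     0   75

(b) A3 placed between A1 and A2

      A1   A3   A2
A1    45   45    0
A2     0    5   80
A3    45   53    5
A4     0    3   75

(c) A4 placed to the right of A2

      A1   A3   A2   A4
A1    45   45    0    0
A2     0    5   80   75
A3    45   53    5    3
A4     0    3   75   78

(d) Rows reordered to match the columns

      A1   A3   A2   A4
A1    45   45    0    0
A3    45   53    5    3
A2     0    5   80   75
A4     0    3   75   78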

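In the figure above we can see that two clusters are formed: one in the upper left corner, containing the smaller affinity values, and one in the lower right corner, containing the larger affinity values. This clustering indicates how the attributes of DA should be split. In general, however, the border between the parts is not completely clear-cut. When the CA matrix is large, usually more than two clusters are formed and there is more than one candidate partitioning. The problem therefore has to be approached more systematically.

Partitioning algorithm

The purpose of splitting is to find sets of attributes that are accessed solely, or for the most part, by distinct sets of applications. Consider the clustered attribute matrix:

[Figure: the clustered affinity matrix CA with rows and columns A1, A2, ..., Ai, A(i+1), ..., An. A point on the diagonal after Ai divides it into an upper left block TA and a lower right block BA.]

If a point on the diagonal is fixed, two sets of attributes are identified. One set, {A1, A2, ..., Ai}, lies in the upper left corner, and the second set, {A(i+1), A(i+2), ..., An}, lies to the right of and below this point. We call the two sets TA and BA respectively.

Given the set of applications Q = {q1, q2, ..., qq}, we define the sets of applications that access only TA, only BA, or both, as follows:

$$AQ(q_i) = \{A_j \mid use(q_i, A_j) = 1\}$$
$$TQ = \{q_i \mid AQ(q_i) \subseteq TA\}$$
$$BQ = \{q_i \mid AQ(q_i) \subseteq BA\}$$
$$OQ = Q - \{TQ \cup BQ\}$$

An optimization problem arises here. If a relation has n attributes, there are n - 1 possible positions for the split point along the main diagonal of its clustered attribute matrix. The best position is the one that produces sets TQ and BQ such that the total number of accesses to only one fragment is maximized and the total number of accesses to both fragments is minimized. We therefore define the following cost equations:

$$CQ = \sum_{q_i \in Q} \sum_{\forall S_j} ref_j(q_i)\, acc_j(q_i)$$
$$CTQ = \sum_{q_i \in TQ} \sum_{\forall S_j} ref_j(q_i)\, acc_j(q_i)$$
$$CBQ = \sum_{q_i \in BQ} \sum_{\forall S_j} ref_j(q_i)\, acc_j(q_i)$$
$$COQ = \sum_{q_i \in OQ} \sum_{\forall S_j} ref_j(q_i)\, acc_j(q_i)$$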

Each of the equations above counts the total number of accesses to attributes by the applications in the corresponding class. Based on these figures, the optimization problem is defined as finding the point x (1 ≤ x ≤ n) such that the expression

$$z = CTQ \cdot CBQ - COQ^2$$

is maximized. An important feature of this expression is that it defines two fragments such that the values of CTQ and CBQ are as nearly equal as possible. This makes it possible to balance the processing load when the fragments are distributed to different sites. The partitioning algorithm has linear complexity in the number of attributes of the relation, that is, O(n).

Algorithm PARTITION
Input:  CA: clustered affinity matrix; R: relation; ref: attribute usage matrix; acc: access frequency matrix;
Output: F: set of fragments;
Begin
  {determine the z value for the first column}
  {the subscripts in the cost equations indicate the split point}
  compute CTQ_(n-1)
  compute CBQ_(n-1)
  compute COQ_(n-1)
  best ← CTQ_(n-1) * CBQ_(n-1) - (COQ_(n-1))^2
  do      {determine the best partitioning}
  begin
    for i from n-2 to 1 by -1 do
    begin
      compute CTQ_i
      compute CBQ_i
      compute COQ_i
      z ← CTQ_i * CBQ_i - (COQ_i)^2
      if z > best then
      begin
        best ← z
        record the split point within the shift
      end-if
    end-for
    call SHIFT(CA)
  end-begin
  until no more SHIFT is possible
  reconstruct the matrix according to the shift position
  R1 ← Π_TA(R) ∪ K      {K is the set of primary key attributes of R}
  R2 ← Π_BA(R) ∪ K
  F ← {R1, R2}
End. {PARTITION}

Applying the algorithm to the CA matrix of relation DA yields the fragments F_DA = {DA1, DA2}, where DA1 = {A1, A3} and DA2 = {A1, A2, A4}. Thus

DA1 = {MDA, NganSach}
DA2 = {MDA, TenDA, DiaDiem}

(here MDA is the key attribute of DA).

Checking correctness:

Completeness: guaranteed by the PARTITION algorithm, because every attribute of the global relation is assigned to one of the fragments.

Reconstruction: for a relation R with vertical fragmentation F_R = {R1, R2, ..., Rr} and key attributes K,

$$R = \bowtie_K R_i, \qquad \forall R_i \in F_R$$

Hence, as long as every Ri is complete, the join operation reconstructs R correctly. An important point is that every fragment Ri must contain the key attributes of R.
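The search for the split point in PARTITION can be sketched as below. This is a simplified illustration: it evaluates z = CTQ * CBQ - COQ^2 for the n - 1 split points of a given column order and omits the SHIFT step of the full algorithm. The usage matrix and total access frequencies are those of Examples 11 and 12 (with every ref value equal to 1); the function name is hypothetical.

```python
# use(q_k, A_j) for q1..q4 over A1..A4, and total access frequency of each query
# summed over the three sites of Example 12.
USE = [
    [1, 0, 1, 0],
    [0, 1, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 1],
]
FREQ = [45, 5, 75, 3]

def best_split(order, use, freq):
    """Return (z, TA, BA) for the split point that maximizes z = CTQ*CBQ - COQ^2."""
    best = None
    for x in range(1, len(order)):             # n-1 candidate split points
        ta, ba = set(order[:x]), set(order[x:])
        ctq = cbq = coq = 0
        for row, f in zip(use, freq):
            aq = {j for j, used in enumerate(row) if used}   # AQ(q_i)
            if aq <= ta:
                ctq += f                       # query touches only TA
            elif aq <= ba:
                cbq += f                       # query touches only BA
            else:
                coq += f                       # query touches both fragments
        z = ctq * cbq - coq ** 2
        if best is None or z > best[0]:
            best = (z, order[:x], order[x:])
    return best

# Column order of the clustered matrix: A1, A3, A2, A4 (0-based indices).
z, ta, ba = best_split([0, 2, 1, 3], USE, FREQ)
print(z, [f"A{i + 1}" for i in ta], [f"A{i + 1}" for i in ba])
```

Adding the key A1 to the second set gives the fragments {A1, A3} and {A1, A2, A4} stated above.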

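2.5. Hybrid fragmentation

In most cases, a simple horizontal or vertical fragmentation of a database schema is not enough to satisfy the requirements of the applications. In such cases a vertical fragmentation may be applied after a horizontal one, or vice versa, producing a tree-structured partitioning. Because the two strategies are applied one after the other, this alternative is called hybrid fragmentation.

[Figure: hybrid fragmentation tree. R is split horizontally (H) into R1 and R2; R1 is split vertically (V) into R11 and R12; R2 is split vertically (V) into R21, R22 and R23.]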

2.6. Allocation

2.6.1 The allocation problem

Assume that there is a set of fragments F = {F1, F2, ..., Fn} and a network consisting of the sites S = {S1, S2, ..., Sm}, on which a set of applications Q = {q1, q2, ..., qq} is running. The allocation problem is to find an optimal distribution of F over S. Optimality can be defined with respect to two measures:

- Minimal cost: the cost function consists of the cost of storing each fragment Fi at site Sj, the cost of querying Fi at site Sj, the cost of updating Fi at all the sites where it is stored, and the cost of data communication. The allocation problem therefore tries to find an allocation scheme that minimizes the combined cost function.
- Performance: the allocation strategy is designed to maintain a performance metric, namely to minimize the response time and to maximize the system throughput at each site.

The general allocation problem is complex; it is NP-complete. Research has therefore been devoted to finding good heuristics that yield near-optimal solutions.

2.6.2 Information requirements

At the allocation stage we need quantitative information about the database, about the applications that run on it, about the network structure, and about the processing capability and storage limits of each site on the network.

Database information

The selectivity of a fragment Fj with respect to query qi is the number of tuples of Fj that need to be accessed in order to process qi. This value is denoted sel_i(Fj).

The size of a fragment Fj is given by

size(Fj) = card(Fj) * length(Fj)

where length(Fj) is the length (in bytes) of a tuple of fragment Fj.

Application information

Two important measures are the number of read accesses that query qi makes to fragment Fj during each execution (denoted RR_ij) and, correspondingly, the number of update accesses (UR_ij). These may, for example, count the number of accesses needed to carry out the query. We define two matrices UM and RM, with elements u_ij and r_ij respectively, specified as follows:

$$u_{ij} = \begin{cases} 1 & \text{if query } q_i \text{ updates fragment } F_j \\ 0 & \text{otherwise} \end{cases}$$

$$r_{ij} = \begin{cases} 1 & \text{if query } q_i \text{ reads fragment } F_j \\ 0 & \text{otherwise} \end{cases}$$

A vector O of values o(i) is also defined, where o(i) specifies the site at which query qi originates.

Site information

For each site we need to know its storage and processing capacity. These values can obviously be computed by elaborate functions or by simple estimates.

+ The unit cost of storing data at site Sk is denoted USC_k.
+ A cost measure LPC_k is specified as the cost of processing one unit of work at site Sk. The work unit must be the same as that of RR and UR.

Network information

We assume a simple network in which g_ij denotes the communication cost per frame between two sites Si and Sj. To be able to compute the number of messages, we use fsize as the size (in bytes) of one frame.

2.6.3. Allocation model

The allocation model aims to minimize the total cost of processing and storage while trying to meet the response time requirements. Our model has the following form:

min (Total Cost)

subject to the response-time constraint, the storage constraint and the processing constraint.

The decision variable x_ij is defined as

$$x_{ij} = \begin{cases} 1 & \text{if fragment } F_i \text{ is stored at site } S_j \\ 0 & \text{otherwise} \end{cases}$$

Total cost

The total cost function has two components: query processing and storage. It can therefore be expressed as:

$$TOC = \sum_{q_i \in Q} QPC_i + \sum_{S_k \in S} \sum_{F_j \in F} STC_{jk}$$

where QPC_i is the query processing cost of application qi, and STC_jk is the cost of storing fragment Fj at site Sk.

Consider the storage cost first. It is given by

$$STC_{jk} = USC_k \cdot size(F_j) \cdot x_{jk}$$

The query processing cost is harder to specify. Most models of the file allocation problem (FAP) separate it into two parts: the read-only processing cost and the update-only processing cost. Here we take a different approach in the model of the database allocation problem (DAP) and specify it as consisting of the processing cost PC and the transmission cost TC. The query processing cost QPC for application qi is therefore

$$QPC_i = PC_i + TC_i$$

The processing component PC consists of three cost factors: the access cost AC, the integrity enforcement cost IE and the concurrency control cost CC:

$$PC_i = AC_i + IE_i + CC_i$$

The detailed specification of each cost factor depends on the algorithms used to carry out these tasks. To illustrate, we specify AC in detail:

$$AC_i = \sum_{S_k \in S} \sum_{F_j \in F} \left(u_{ij} \cdot UR_{ij} + r_{ij} \cdot RR_{ij}\right) \cdot x_{jk} \cdot LPC_k$$

The first two terms in this formula compute the number of accesses of query qi to fragment Fj. Note that (UR_ij + RR_ij) is the total number of read and update accesses. We assume that the local costs of processing them are identical. The summation gives the total number of accesses for all the fragments referenced by qi. Multiplying by LPC_k gives the cost of this access at site Sk. We use x_jk to select only the cost values for the sites where the fragments are stored.

One very important issue should be mentioned here. The access cost function assumes that processing a query involves decomposing it into a set of subqueries, each of which works on a fragment stored at a site, followed by transmitting the results back to the site where the query originated.
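The storage and access cost formulas can be evaluated mechanically once the matrices are known. The sketch below is a minimal illustration on a hypothetical toy instance (two fragments, two sites, one query); all numeric values are invented for the example and the function names are not from the text.

```python
def storage_cost(usc, size, x):
    """Sum over sites k and fragments j of USC_k * size(F_j) * x_jk."""
    return sum(usc[k] * size[j] * x[j][k]
               for k in range(len(usc))
               for j in range(len(size)))

def access_cost(i, u, r, ur, rr, x, lpc):
    """AC_i = sum_k sum_j (u_ij*UR_ij + r_ij*RR_ij) * x_jk * LPC_k."""
    return sum((u[i][j] * ur[i][j] + r[i][j] * rr[i][j]) * x[j][k] * lpc[k]
               for k in range(len(lpc))
               for j in range(len(x)))

# Hypothetical toy instance (all values are assumptions for illustration).
size = [100, 200]          # size(F_1), size(F_2) in bytes
usc = [1, 2]               # unit storage cost at S_1, S_2
lpc = [2, 4]               # unit processing cost at S_1, S_2
x = [[1, 0],               # x[j][k] = 1 if F_(j+1) is stored at S_(k+1)
     [0, 1]]
u = [[0, 1]]               # q_1 updates F_2
r = [[1, 0]]               # q_1 reads F_1
ur = [[0, 3]]              # update accesses of q_1 per fragment
rr = [[5, 0]]              # read accesses of q_1 per fragment

total_storage = storage_cost(usc, size, x)     # 1*100 + 2*200
ac_q1 = access_cost(0, u, r, ur, rr, x, lpc)   # 5*2 + 3*4
print(total_storage, ac_q1)
```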

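The integrity enforcement cost factor can be specified much like the processing component, except that the unit local processing cost has to change to reflect the true cost of integrity enforcement.

The transmission cost function can be formulated along the same lines as the access cost function. However, the data transmission overheads for update requests and for read-only requests are quite different. In update queries, all the sites where replicas exist have to be informed, whereas in read-only queries it is enough to access just one of the copies. In addition, at the end of an update request no data is transmitted back to the originating site other than a confirmation message, whereas a read-only query may require a significant amount of data transmission.

The update component of the transmission function is:

$$TCU_i = \sum_{S_k \in S} \sum_{F_j \in F} u_{ij} \cdot x_{jk} \cdot g_{o(i),k} + \sum_{S_k \in S} \sum_{F_j \in F} u_{ij} \cdot x_{jk} \cdot g_{k,o(i)}$$

The first term is for sending the update message from the originating site o(i) of qi to all the replicas that have to be updated. The second term is for the confirmation.

The read-only cost component can be specified as:

$$TCR_i = \sum_{F_j \in F} \min_{S_k \in S} \left( r_{ij} \cdot x_{jk} \cdot g_{o(i),k} + r_{ij} \cdot x_{jk} \cdot \frac{sel_i(F_j) \cdot length(F_j)}{fsize} \cdot g_{k,o(i)} \right)$$

The first term in TCR is the cost of transmitting the read-only request to the sites that hold copies of the fragment to be accessed. The second term is for transmitting the results from these sites to the originating site. The equation states that, among the sites holding copies of the same fragment, only the site that yields the lowest total transmission cost is selected to carry out the operation.

The transmission cost function for query qi can now be specified as:

$$TC_i = TCU_i + TCR_i$$

Constraints

The response-time constraint is specified as:

execution time of qi ≤ maximum response time of qi, ∀ qi ∈ Q

It is preferable to specify the cost measure in terms of time, because this simplifies the specification of the execution-time constraint.

The storage constraint is:

$$\sum_{F_j \in F} STC_{jk} \le \text{storage capacity at site } S_k, \qquad \forall S_k \in S$$

and the processing constraint is:

$$\sum_{q_i \in Q} \text{processing load of } q_i \text{ at site } S_k \le \text{processing capacity of } S_k, \qquad \forall S_k \in S$$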
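The two transmission components can be sketched in Python as follows. This is an illustration under stated assumptions, not a definitive formulation: the minimum in TCR is taken only over the sites that actually hold a copy of the fragment (x_jk = 1), following the remark that only the cheapest site among those holding a copy is chosen, and every fragment that is read is assumed to be stored somewhere. The toy instance (three sites, one replicated fragment) is hypothetical.

```python
def tcu(i, u, x, g, origin):
    """Update traffic: message to every replica plus a confirmation back."""
    o = origin[i]
    return sum(u[i][j] * x[j][k] * (g[o][k] + g[k][o])
               for j in range(len(x))
               for k in range(len(g)))

def tcr(i, r, x, g, origin, sel, length, fsize):
    """Read-only traffic: request plus result transfer, from the cheapest replica."""
    o = origin[i]
    total = 0.0
    for j in range(len(x)):
        if not r[i][j]:
            continue                                   # q_i does not read F_j
        frames = sel[i][j] * length[j] / fsize         # result size in frames
        # Assumption: only sites holding a copy of F_j are candidates.
        candidates = [g[o][k] + frames * g[k][o]
                      for k in range(len(g)) if x[j][k]]
        total += min(candidates)
    return total

# Hypothetical toy instance.
g = [[0, 2, 5],            # g[a][b]: per-frame cost between sites a and b
     [2, 0, 1],
     [5, 1, 0]]
x = [[0, 1, 1]]            # F_1 is replicated at sites 2 and 3
u = [[1]]                  # q_1 updates F_1
r = [[1]]                  # q_1 also reads F_1
origin = [0]               # q_1 originates at site 1 (index 0)
sel = [[40]]               # tuples of F_1 accessed by q_1
length = [50]              # bytes per tuple of F_1
fsize = 1000               # bytes per frame

tcu_q1 = tcu(0, u, x, g, origin)                          # (2+2) + (5+5)
tcr_q1 = tcr(0, r, x, g, origin, sel, length, fsize)      # min(2 + 2*2, 5 + 2*5)
print(tcu_q1, tcr_q1, tcu_q1 + tcr_q1)
```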


CHAPTER 3. QUERY PROCESSING

The context chosen here is that of relational calculus and relational algebra. As we have seen, distributed relations are implemented by fragments. Database design plays a very important role in query processing, because the fragments are defined with the aim of increasing the locality of reference and, sometimes, the degree of parallel execution of the most important queries. The role of the distributed query processor is to map a high-level query on a distributed database into a sequence of relational algebra operations on fragments. Several important functions characterize this mapping. First, the query must be decomposed into a sequence of relational operations called an algebraic query. Second, the data to be accessed must be localized, so that the operations on relations are translated into operations on local data (fragments). Finally, the algebraic query on fragments must be extended with communication operations and optimized so that the cost function is minimized. The cost function refers to computing resources such as disk I/O, CPU and the communication network.

3.1. The query processing problem

Two basic optimization methods are used in query processors: the algebraic transformation method and the cost estimation strategy. The algebraic transformation method simplifies queries by means of algebraic transformations in order to lower the cost of answering the query, independently of the actual data and of the physical structure of the data.

The main task of a relational query processor is to transform a high-level query into an equivalent lower-level query expressed in relational algebra. The low-level query actually implements the execution strategy for the query. The transformation must achieve both correctness and efficiency. A transformation is correct if the low-level query has the same semantics as the original query, that is, if both produce the same result. A query may have many equivalent transformations into relational algebra. Because the equivalent execution strategies can use very different amounts of computer resources, the main difficulty is to select the execution strategy that minimizes resource consumption.

Example 3.1: Consider the following subset of the database schema:

NV(MNV, TenNV, ChucVu)
PC(MNV, MDA, NhiemVu, ThoiGian)

(NV: employees with id, name and title; PC: assignments with employee id, project id, responsibility and duration) and the following simple query: "Find the names of the employees who are managing a project." The query expressed in relational calculus using SQL syntax is:

SELECT TenNV
FROM   NV, PC
WHERE  NV.MNV = PC.MNV
AND    NhiemVu = 'Manager'

Two equivalent relational algebra expressions that are correct transformations of this query are:

$$\Pi_{TenNV}\left(\sigma_{NhiemVu = \text{'Manager'} \,\wedge\, NV.MNV = PC.MNV}(NV \times PC)\right)$$

and

$$\Pi_{TenNV}\left(NV \bowtie_{MNV} \left(\sigma_{NhiemVu = \text{'Manager'}}(PC)\right)\right)$$

Obviously the second query avoids the Cartesian product; it therefore consumes fewer computer resources than the first and should be retained.

In a centralized context, query execution strategies can be expressed precisely in an extension of relational algebra. The main task of a centralized query processor is to choose, for a given query, the best algebraic query among all the equivalent ones. Because this problem is computationally intractable when the number of relations is large, it is generally reduced to choosing a near-optimal solution.

In a distributed system, relational algebra is not enough to express execution strategies. It must be supplemented with operations for exchanging data between sites. Besides choosing the ordering of the relational algebra operations, the distributed query processor must also select the best sites at which to process the data, and possibly the way the data should be transformed. As a result, the solution space of execution strategies grows, which makes distributed query processing considerably more difficult.

Example 3.2: This example illustrates the importance of site selection and of the way data is transmitted for an algebraic query. Consider the query of the previous example:

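$$\Pi_{TenNV}\left(NV \bowtie_{MNV} \left(\sigma_{NhiemVu = \text{'Manager'}}(PC)\right)\right)$$

We assume that relations NV and PC are horizontally fragmented as follows:

$$NV_1 = \sigma_{MNV \le \text{'E3'}}(NV)$$
$$NV_2 = \sigma_{MNV > \text{'E3'}}(NV)$$
$$PC_1 = \sigma_{MNV \le \text{'E3'}}(PC)$$
$$PC_2 = \sigma_{MNV > \text{'E3'}}(PC)$$

The fragments PC1, PC2, NV1 and NV2 are stored at sites 1, 2, 3 and 4 respectively, and the result is stored at site 5.

Figure 4.1 a) Strategy A
Site 1: $PC_1' = \sigma_{NhiemVu = \text{'Manager'}}(PC_1)$, sent to site 3
Site 2: $PC_2' = \sigma_{NhiemVu = \text{'Manager'}}(PC_2)$, sent to site 4
Site 3: $NV_1' = NV_1 \bowtie_{MNV} PC_1'$, sent to site 5
Site 4: $NV_2' = NV_2 \bowtie_{MNV} PC_2'$, sent to site 5
Site 5: result $= NV_1' \cup NV_2'$

Figure 4.1 b) Strategy B
Site 1: PC1 is sent to site 5
Site 2: PC2 is sent to site 5
Site 3: NV1 is sent to site 5
Site 4: NV2 is sent to site 5
Site 5: result $= (NV_1 \cup NV_2) \bowtie_{MNV} \sigma_{NhiemVu = \text{'Manager'}}(PC_1 \cup PC_2)$

In the figure, an arrow from site i to site j labeled R indicates that relation R is transferred from site i to site j. Strategy A exploits the fact that relations NV and PC are fragmented in the same way in order to perform the selection and the join in parallel. Strategy B centralizes all the data at the result site before processing the query.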

To evaluate the resource consumption of these two strategies, we use the following simple cost model. We assume that a tuple access, denoted tupacc, costs 1 unit, and that a tuple transfer, denoted tuptrans, costs 10 units. We also assume that relations NV and PC have 400 and 1000 tuples respectively, and that there are 20 project managers, distributed uniformly among the sites. Finally, we assume that relations PC and NV are locally clustered on the attributes NhiemVu and MNV respectively, so that the tuples of PC can be accessed directly by the value of NhiemVu (and those of NV by the value of MNV).

* The total cost of strategy A can be computed as follows:
1. Produce PC' by selecting on PC: (10+10) * tupacc = 20
2. Transfer PC' to the sites of NV: (10+10) * tuptrans = 200
3. Produce NV' by joining PC' and NV: (10+10) * tupacc * 2 = 40
4. Transfer NV' to the result site: (10+10) * tuptrans = 200
Total cost: 460

* The total cost of strategy B can be computed as follows:
1. Transfer NV to site 5: 400 * tuptrans = 4,000
2. Transfer PC to site 5: 1000 * tuptrans = 10,000
3. Produce PC' by selecting on PC: 1000 * tupacc = 1,000
4. Join NV and PC': 400 * 20 * tupacc = 8,000
Total cost: 23,000
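The arithmetic of the two strategies can be written out as a small script, which makes it easy to vary the assumptions (for example a slower network). This is a minimal sketch of the cost model stated above; the variable names are chosen here for illustration.

```python
TUPACC = 1       # cost of one tuple access
TUPTRANS = 10    # cost of one tuple transfer

NV_SIZE, PC_SIZE = 400, 1000
MANAGERS = 20                        # manager tuples in PC, 10 per fragment
per_fragment = MANAGERS // 2

# Strategy A: select and join locally, ship only the small intermediate results.
cost_a = (
    (per_fragment + per_fragment) * TUPACC          # 1. select PC' on PC1, PC2
    + (per_fragment + per_fragment) * TUPTRANS      # 2. ship PC' to the NV sites
    + (per_fragment + per_fragment) * TUPACC * 2    # 3. join PC' with NV
    + (per_fragment + per_fragment) * TUPTRANS      # 4. ship NV' to the result site
)

# Strategy B: ship everything to site 5, then select and join there.
cost_b = (
    NV_SIZE * TUPTRANS                  # 1. ship NV
    + PC_SIZE * TUPTRANS                # 2. ship PC
    + PC_SIZE * TUPACC                  # 3. select PC' by scanning PC
    + NV_SIZE * MANAGERS * TUPACC       # 4. join NV with PC' without an index
)

print(cost_a, cost_b, cost_b / cost_a)
```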

In strategy B we assume that the access methods to relations NV and PC based on the attributes NhiemVu and MNV are lost because of the data transfer. This is a reasonable assumption in practice. Strategy A is better by a factor of 50, which is quite significant. Moreover, it distributes the work among the sites. The difference would be even larger if we assumed slower communication and/or a higher degree of fragmentation.

The measure of resource consumption is the total cost incurred in processing the query. The total cost is the sum of the times needed to process the query operations at the various sites and to transfer data between the sites. Another measure is the response time of the query, that is, the time needed to run the query. Because operations can be executed in parallel at different sites, the response time can be considerably smaller than the total cost.

In a distributed database environment, the total cost to be minimized consists of the CPU cost, the I/O cost and the communication cost. The CPU cost is incurred when performing operations on data in main memory. The I/O cost is the time needed for disk input/output operations. The communication cost is the time needed to exchange data between the sites that take part in the execution of the query. This cost is incurred in processing the messages (formatting and deformatting) and in transmitting the data over the network. The communication cost is probably the most important factor considered in distributed databases.

Complexity of relational operations:
- Selection and projection (without duplicate elimination): O(n).
- Projection (with duplicate elimination), duplicate elimination, join, semijoin and division: O(n log n).
- Cartesian product: O(n^2).
(n denotes the cardinality of the relation, assuming the resulting tuples are independent of each other.)

Observations:
+ The most selective operations, which reduce cardinalities, should be performed first.
+ Operations should be ordered so as to avoid Cartesian products or to perform them later.

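3.2. Query decomposition

Query decomposition is the first phase of query processing. It transforms a relational calculus query into a relational algebra query. Both the input and the output queries refer to global relations and do not use any information about data distribution. Query decomposition is therefore the same for centralized and distributed systems. The output query is semantically correct and good in the sense that redundant work is avoided. Query decomposition can be viewed as four successive steps: normalization, analysis, elimination of redundancy, and rewriting.

Normalization

The purpose of normalization is to transform the query into a normalized form to facilitate further processing. Normalizing a query generally involves placing the quantifiers and the query qualification by applying the priority of the logical operators. With relational languages such as SQL, the most important transformation is that of the query qualification (the WHERE clause), which may be an arbitrarily complex quantifier-free predicate preceded by all the necessary quantifiers (∀ or ∃). There are two possible normal forms for the predicate, one giving precedence to AND (∧) and the other to OR (∨). The conjunctive normal form is a conjunction (∧ predicate) of disjunctions (∨ predicates):

$$(p_{11} \vee p_{12} \vee \dots \vee p_{1n}) \wedge \dots \wedge (p_{m1} \vee p_{m2} \vee \dots \vee p_{mn})$$

where p_ij is a simple predicate. Conversely, a qualification in disjunctive normal form is:

$$(p_{11} \wedge p_{12} \wedge \dots \wedge p_{1n}) \vee \dots \vee (p_{m1} \wedge p_{m2} \wedge \dots \wedge p_{mn})$$

The transformation of quantifier-free predicates is straightforward using the following nine equivalence rules for the logical operators (∧, ∨, ¬):

1. $p_1 \wedge p_2 \Leftrightarrow p_2 \wedge p_1$
2. $p_1 \vee p_2 \Leftrightarrow p_2 \vee p_1$
3. $p_1 \wedge (p_2 \wedge p_3) \Leftrightarrow (p_1 \wedge p_2) \wedge p_3$
4. $p_1 \vee (p_2 \vee p_3) \Leftrightarrow (p_1 \vee p_2) \vee p_3$
5. $p_1 \wedge (p_2 \vee p_3) \Leftrightarrow (p_1 \wedge p_2) \vee (p_1 \wedge p_3)$
6. $p_1 \vee (p_2 \wedge p_3) \Leftrightarrow (p_1 \vee p_2) \wedge (p_1 \vee p_3)$
7. $\neg(p_1 \wedge p_2) \Leftrightarrow \neg p_1 \vee \neg p_2$
8. $\neg(p_1 \vee p_2) \Leftrightarrow \neg p_1 \wedge \neg p_2$
9. $\neg(\neg p) \Leftrightarrow p$

In disjunctive normal form, the query can be processed as independent conjunctive subqueries linked by unions (corresponding to the disjunctions).

Remark: the disjunctive normal form is rarely used because it leads to replicated join and selection predicates. The conjunctive normal form is the one commonly used in practice.

Example 3.3: Find the names of the employees who have been working on project P1 for 12 or 24 months. The query expressed in SQL is:

SELECT TenNV
FROM   NV, PC
WHERE  NV.MNV = PC.MNV
AND    PC.MDA = 'P1'
AND    (ThoiGian = 12 OR ThoiGian = 24)

The qualification in conjunctive normal form is:

$$NV.MNV = PC.MNV \,\wedge\, PC.MDA = \text{'P1'} \,\wedge\, (ThoiGian = 12 \vee ThoiGian = 24)$$

while the qualification in disjunctive normal form is:

$$(NV.MNV = PC.MNV \wedge PC.MDA = \text{'P1'} \wedge ThoiGian = 12) \,\vee\, (NV.MNV = PC.MNV \wedge PC.MDA = \text{'P1'} \wedge ThoiGian = 24)$$

In the latter form, processing the two conjunctions independently may lead to redundant work if the common subexpressions are not eliminated.

Analysis

Query analysis makes it possible to reject normalized queries for which further processing is either impossible or unnecessary. The main reasons for rejection are that the query is type incorrect or semantically incorrect.

- A query is type incorrect if any of its attribute or relation names is not defined in the global schema, or if operations are applied to attributes of the wrong type. For example:

SELECT MaDA FROM ... WHERE TenNV > 200

Here a name attribute is compared with a number, which is a type mismatch.

- A query is semantically incorrect if its components do not contribute in any way to the generation of the result. If queries do not contain disjunction and negation, we can use a query graph. This applies to queries involving selection, join and projection.

- Representation by a query graph:
+ One node represents the result relation.
+ The other nodes represent the operand relations.
+ An edge between two nodes that are not the result represents a join.
+ An edge whose destination node is the result represents a projection.
+ A node that is not the result may be labeled with a selection predicate or a self-join predicate (a join of the relation with itself).

- Join graph: an important subgraph of the query graph, in which only the joins are considered.

Example 3.4: Consider the schema

PC(MaNV, MaDA, NV, Tgian)
NV(MaNV, TenNV, CV)
DA(MaDA, TenDA, Kphi, Ddiem)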

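Find the names and tasks (NV) of the people with CV = 'TP' who have worked on the CAD/CAM project for more than 3 years.

SELECT TenNV, NV
FROM   PC, NV, DA
WHERE  PC.MaNV = NV.MaNV
AND    PC.MaDA = DA.MaDA
AND    TenDA = 'CAD/CAM'
AND    CV = 'TP'
AND    Tgian > 36

The simple predicates are:

p1: PC.MaNV = NV.MaNV
p2: PC.MaDA = DA.MaDA
p3: TenDA = 'CAD/CAM'
p4: CV = 'TP'
p5: Tgian > 36

[Figure: query graph. Nodes PC (labeled p5), NV (labeled p4), DA (labeled p3) and the result node KQ; edge PC-NV labeled p1, edge PC-DA labeled p2, and projection edges into KQ. Join graph: nodes PC, NV, DA with edges PC-NV (p1) and PC-DA (p2).]

Example 3.5:

SELECT TenNV
FROM   PC, NV, DA
WHERE  PC.MaNV = NV.MaNV
AND    TenDA = 'CAD/CAM'
AND    CV = 'TP'
AND    Tgian > 36

[Figure: query graph. Nodes PC (labeled p5), NV (labeled p4), DA (labeled p3) and the result node KQ; edge PC-NV labeled p1 and a projection edge into KQ, but no edge connecting DA to the rest of the graph.]

=> The graph is not connected.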
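The connectivity test illustrated by Examples 3.4 and 3.5 can be sketched with a simple graph traversal. This is a minimal illustration: the node KQ stands for the result relation, and the single projection edge NV-KQ is an assumption of the sketch (only the join edges are taken directly from the predicates p1 and p2).

```python
def is_connected(nodes, edges):
    """Return True if the undirected graph (nodes, edges) is connected."""
    adjacency = {n: set() for n in nodes}
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:                       # depth-first traversal
        current = stack.pop()
        for neighbor in adjacency[current] - seen:
            seen.add(neighbor)
            stack.append(neighbor)
    return seen == set(nodes)

NODES = ["PC", "NV", "DA", "KQ"]

# Example 3.4: joins p1 (PC-NV) and p2 (PC-DA), projection from NV to KQ.
graph_3_4 = [("PC", "NV"), ("PC", "DA"), ("NV", "KQ")]

# Example 3.5: the join predicate p2 is missing, so DA is isolated.
graph_3_5 = [("PC", "NV"), ("NV", "KQ")]

print(is_connected(NODES, graph_3_4))   # query is semantically correct
print(is_connected(NODES, graph_3_5))   # query is semantically incorrect
```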


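Remark: A query is semantically incorrect if its query graph is not connected, that is, if one or more subgraphs are disconnected from the subgraph that contains the result.

Elimination of redundancy

A user query is typically expressed on a view and may be enriched with several predicates to achieve view-relation correspondence and to ensure semantic integrity and security. The modified query qualification may then contain redundant predicates, which can cause some work to be repeated. A simple way to simplify a query is to eliminate redundant predicates by applying the following idempotency rules:

1. $p \land p \Leftrightarrow p$
2. $p \lor p \Leftrightarrow p$
3. $p \land \text{True} \Leftrightarrow p$
4. $p \lor \text{False} \Leftrightarrow p$
5. $p \land \text{False} \Leftrightarrow \text{False}$
6. $p \lor \text{True} \Leftrightarrow \text{True}$
7. $p \land \lnot p \Leftrightarrow \text{False}$
8. $p \lor \lnot p \Leftrightarrow \text{True}$
9. $p_1 \land (p_1 \lor p_2) \Leftrightarrow p_1$
10. $p_1 \lor (p_1 \land p_2) \Leftrightarrow p_1$

Example 3.6:

    SELECT CV
    FROM   NV
    WHERE  (NOT (CV = 'TP') AND (CV = 'TP' OR CV = 'PP') AND NOT (CV = 'PP'))
    OR     TenNV = 'Mai'

Let p1: CV = 'TP', p2: CV = 'PP', p3: TenNV = 'Mai'.

The qualification is: $(\lnot p_1 \land (p_1 \lor p_2) \land \lnot p_2) \lor p_3$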
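Before applying the rules by hand, the claim that this qualification reduces to p3 can be checked by brute force over all truth assignments. The sketch below is only a sanity check, not part of the simplification procedure; the function names are chosen here for illustration.

```python
from itertools import product

def qualification(p1, p2, p3):
    """Qualification of Example 3.6 as a Boolean function."""
    return ((not p1) and (p1 or p2) and (not p2)) or p3

def equivalent_to_p3():
    """Compare the qualification with p3 on all eight truth assignments."""
    return all(
        qualification(p1, p2, p3) == p3
        for p1, p2, p3 in product([False, True], repeat=3)
    )

print(equivalent_to_p3())
```

An exhaustive check like this is only practical for a handful of predicates, since the table doubles with every predicate added.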

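Distributing $\land$ over $\lor$: $(\lnot p_1 \land ((p_1 \land \lnot p_2) \lor (p_2 \land \lnot p_2))) \lor p_3$
Distributing again: $(\lnot p_1 \land p_1 \land \lnot p_2) \lor (\lnot p_1 \land p_2 \land \lnot p_2) \lor p_3$
Applying rule 7: $(\text{False} \land \lnot p_2) \lor (\lnot p_1 \land \text{False}) \lor p_3$
Applying rule 5: $\text{False} \lor \text{False} \lor p_3 = p_3$

The query is therefore rewritten as:

    SELECT CV
    FROM   NV
    WHERE  TenNV = 'Mai'

Rewriting the query

This step is divided into two substeps:
(1) transform the query from relational calculus into relational algebra;
(2) restructure the relational algebra query to improve performance.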

For clarity, we represent the relational algebra query graphically as an operator tree. An operator tree is a tree in which each leaf node represents a relation stored in the database and each non-leaf node represents an intermediate relation produced by a relational algebra operation. The sequence of operations runs from the leaves to the root, which represents the answer to the query.

A relational calculus query can be transformed into an operator tree as follows. First, a leaf is created for each relation in the FROM clause of the SQL query. Second, the root node is created as a projection on the result attributes, which are found in the SELECT clause. Third, the qualification (the SQL WHERE clause) is translated into the appropriate sequence of relational operations (selection, join, union, and so on) going from the leaves to the root. The sequence can be given directly by the order in which the predicates and operators appear.

Example 3.7: Find the names of the employees other than J.Doe who worked on the CAD/CAM project for one or two years. The SQL expression is:

    SELECT TenNV
    FROM   DA, PC, NV
    WHERE  PC.MaNV = NV.MaNV
    AND    PC.MaDA = DA.MaDA
    AND    TenNV <> 'J.Doe'
    AND    DA.TenDA = 'CAD/CAM'
    AND    (ThoiGian = 12 OR ThoiGian = 24)

This query is mapped to the operator tree in the figure below.

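[Figure: operator tree for Example 3.7. From the root down: projection $\Pi_{TenNV}$; selections $\sigma_{ThoiGian = 12 \lor ThoiGian = 24}$, $\sigma_{TenDA = \text{'CAD/CAM'}}$ and $\sigma_{TenNV \ne \text{'J.Doe'}}$; join on MaDA with DA; join on MaNV of PC and NV.]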

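By applying transformation rules, many different trees can be shown to be equivalent to the one produced by the method described above. The six most useful equivalence rules, which concern the basic relational algebra operators, are listed below. R, S and T are relations, where R is defined over the attributes $A = \{A_1, A_2, \dots, A_n\}$ and S over the attributes $B = \{B_1, B_2, \dots, B_n\}$.

1. Commutativity of binary operators

$R \times S \Leftrightarrow S \times R$
$R \bowtie S \Leftrightarrow S \bowtie R$

This rule also applies to union, but not to set difference or semijoin.

2. Associativity of binary operators

$(R \times S) \times T \Leftrightarrow R \times (S \times T)$
$(R \bowtie S) \bowtie T \Leftrightarrow R \bowtie (S \bowtie T)$

3. Idempotence of unary operators

If R is defined over the attribute set A, and $A' \subseteq A$, $A'' \subseteq A$ and $A' \subseteq A''$, then

$\Pi_{A'}(\Pi_{A''}(R)) \Leftrightarrow \Pi_{A'}(R)$
$\sigma_{p_1(A_1)}(\sigma_{p_2(A_2)}(R)) \Leftrightarrow \sigma_{p_1(A_1) \land p_2(A_2)}(R)$

where $p_i$ is a predicate applied to attribute $A_i$.

4. Commuting selection with projection

$\Pi_{A_1, \dots, A_n}(\sigma_{p(A_p)}(R)) \Leftrightarrow \Pi_{A_1, \dots, A_n}(\sigma_{p(A_p)}(\Pi_{A_1, \dots, A_n, A_p}(R)))$

Note that if $A_p$ is already a member of $\{A_1, A_2, \dots, A_n\}$, the last projection on $\{A_1, A_2, \dots, A_n\}$ on the right-hand side has no effect.

5. Commuting selection with binary operators

$\sigma_{p(A_i)}(R \times S) \Leftrightarrow (\sigma_{p(A_i)}(R)) \times S$
$\sigma_{p(A_i)}(R \bowtie_{p(A_j, B_k)} S) \Leftrightarrow (\sigma_{p(A_i)}(R)) \bowtie_{p(A_j, B_k)} S$
$\sigma_{p(A_i)}(R \cup T) \Leftrightarrow \sigma_{p(A_i)}(R) \cup \sigma_{p(A_i)}(T)$

6. Commuting projection with binary operators

If $C = A' \cup B'$, where $A' \subseteq A$, $B' \subseteq B$, and A and B are the attribute sets of relations R and S respectively, we have

$\Pi_C(R \times S) \Leftrightarrow \Pi_{A'}(R) \times \Pi_{B'}(S)$
$\Pi_C(R \bowtie_{p(A_i, B_j)} S) \Leftrightarrow \Pi_{A'}(R) \bowtie_{p(A_i, B_j)} \Pi_{B'}(S)$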

$\Pi_C(R \cup S) \Leftrightarrow \Pi_C(R) \cup \Pi_C(S)$

These rules can be used to restructure the tree systematically so that bad operator trees are eliminated. A simple restructuring algorithm uses a single heuristic: apply the unary operations (selection and projection) as early as possible in order to reduce the size of the intermediate relations. Restructuring the tree in the previous figure yields the tree in the following figure. The result is good in the sense that repeated access to the same relation is avoided and the most selective operations are performed first.

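[Figure: rewritten operator tree for Example 3.7. The selections on TenDA, ThoiGian and TenNV are pushed down to the leaves DA, PC and NV; projections on (MaDA), (MaDA, MaNV) and (MaNV, TenNV) sit above them; the joins on MaDA and MaNV follow; the final projection $\Pi_{TenNV}$ is at the root.]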

3.3. Localization of distributed data

The data localization layer is responsible for translating an algebraic query on global relations into an algebraic query on physical fragments. Localization uses the information stored in the fragmentation schema. This layer determines which fragments are involved in the query and transforms the distributed query into a query on fragments. Generating the fragment query takes two steps. First, the distributed query is mapped into a fragment query by substituting each distributed relation with its reconstruction program. Second, the fragment query is simplified and restructured to produce a good query. Simplification and restructuring may follow the same rules as in the decomposition layer. As in that layer, the final fragment query is generally far from optimal, because the information about the fragments has not yet been exploited.

In this section we present, for each type of fragmentation, reduction techniques that generate simpler and better optimized fragment queries. We use the transformation rules and heuristics, such as pushing unary operations as far down the tree as possible.

Reduction for primitive horizontal fragmentation

Horizontal fragmentation distributes a relation based on selection predicates.

Example 3.8: The relation NV (MaNV, TenNV, CV) is split into three horizontal fragments:

$NV_1 = \sigma_{MaNV \le \text{'E3'}}(NV)$
$NV_2 = \sigma_{\text{'E3'} < MaNV \le \text{'E6'}}(NV)$
$NV_3 = \sigma_{MaNV > \text{'E6'}}(NV)$

The localization program is $NV = NV_1 \cup NV_2 \cup NV_3$.

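Method:

+ After restructuring the subtrees, determine which of them produce empty relations and remove them.
+ Horizontal fragmentation can be exploited to simplify both selection and join operations.

Reduction with selection: a selection on a fragment whose qualification contradicts the qualification of the fragmentation rule produces an empty relation, which can be eliminated. For a relation R with horizontal fragments $R_j = \sigma_{p_j}(R)$:

$\sigma_{p_i}(R_j) = \emptyset$ if $\forall t \in R: \lnot(t(p_i) \land t(p_j))$

where $p_i$ and $p_j$ are selection predicates, t denotes a tuple, and $t(p)$ denotes that predicate p holds for t.

Example: with the fragments

$NV_1 = \sigma_{MaNV \le \text{'E3'}}(NV)$
$NV_2 = \sigma_{\text{'E3'} < MaNV \le \text{'E6'}}(NV)$
$NV_3 = \sigma_{MaNV > \text{'E6'}}(NV)$

consider the query

    SELECT MaNV
    FROM   NV
    WHERE  MaNV = 'E5'

By commuting the selection with the union, we detect that the selection predicate contradicts the predicates of NV1 and NV3. These two fragments produce empty relations and are eliminated.

[Figure: generic query $\Pi_{MaNV}(\sigma_{MaNV = \text{'E5'}}(NV_1 \cup NV_2 \cup NV_3))$ and reduced query $\Pi_{MaNV}(\sigma_{MaNV = \text{'E5'}}(NV_2))$.]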
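The pruning step for an equality selection can be sketched as follows. This is a minimal illustration under stated assumptions: each fragment predicate is encoded here as a pair (low, high) meaning low < MaNV <= high, employee numbers are compared as plain strings, and the names `FRAGMENTS` and `fragments_for_equality` are invented for the example.

```python
# Each fragment is described by (low, high), meaning low < MaNV <= high.
# None means the range is unbounded on that side.
FRAGMENTS = {
    "NV1": (None, "E3"),
    "NV2": ("E3", "E6"),
    "NV3": ("E6", None),
}

def fragments_for_equality(value, fragments=FRAGMENTS):
    """Return the fragments whose predicate does not contradict MaNV = value."""
    kept = []
    for name, (low, high) in fragments.items():
        above_low = low is None or value > low
        below_high = high is None or value <= high
        if above_low and below_high:
            kept.append(name)
    return kept

print(fragments_for_equality("E5"))
```

Only the fragments returned by the function need to appear in the reduced query; the others would yield empty relations.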

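Reduction with join

- Joins on horizontally fragmented relations can be simplified when the joined relations are fragmented according to the join attribute.
- The simplification consists of distributing joins over unions and then eliminating useless joins:

$(R_1 \cup R_2) \bowtie S = (R_1 \bowtie S) \cup (R_2 \bowtie S)$

where the $R_i$ are fragments of R and S is a relation. With this transformation, unions can be moved up in the operator tree so that all possible joins of fragments are exposed. Useless joins are eliminated when the qualifications of the joined fragments are contradictory.

Example 3.9: Consider two fragmented relations. NV (MaNV, TenNV, CV) is fragmented as

$NV_1 = \sigma_{MaNV \le \text{'E3'}}(NV)$
$NV_2 = \sigma_{\text{'E3'} < MaNV \le \text{'E6'}}(NV)$
$NV_3 = \sigma_{MaNV > \text{'E6'}}(NV)$

and PC (MaNV, MaDA, NV, Tg) is fragmented as

$PC_1 = \sigma_{MaNV \le \text{'E3'}}(PC)$
$PC_2 = \sigma_{MaNV > \text{'E3'}}(PC)$

Query:

    SELECT *
    FROM   NV, PC
    WHERE  NV.MaNV = PC.MaNV

[Figure: generic query $(NV_1 \cup NV_2 \cup NV_3) \bowtie_{MaNV} (PC_1 \cup PC_2)$ and reduced query $(NV_1 \bowtie PC_1) \cup (NV_2 \bowtie PC_2) \cup (NV_3 \bowtie PC_2)$.]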
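The elimination of useless joins in Example 3.9 amounts to keeping only the fragment pairs whose predicates on MaNV can hold at the same time. The sketch below assumes the same (low, high) encoding of ranges, low < MaNV <= high, and treats two ranges as compatible when their intersection is non-empty as an interval; the helper names are illustrative only.

```python
NV_FRAGMENTS = {"NV1": (None, "E3"), "NV2": ("E3", "E6"), "NV3": ("E6", None)}
PC_FRAGMENTS = {"PC1": (None, "E3"), "PC2": ("E3", None)}

def overlaps(a, b):
    """True if the ranges a and b, each read as low < x <= high, intersect."""
    lows = [bound for bound in (a[0], b[0]) if bound is not None]
    highs = [bound for bound in (a[1], b[1]) if bound is not None]
    if not lows or not highs:
        # The intersection is unbounded on one side, so it cannot be empty.
        return True
    return max(lows) < min(highs)

def useful_joins(left, right):
    """List the fragment pairs whose join is not provably empty."""
    return [
        (l_name, r_name)
        for l_name, l_range in left.items()
        for r_name, r_range in right.items()
        if overlaps(l_range, r_range)
    ]

print(useful_joins(NV_FRAGMENTS, PC_FRAGMENTS))
```

Of the six joins exposed by distributing the join over the unions, three survive, which matches the reduced query in the figure.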

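Reduction for vertical fragmentation

Vertical fragmentation distributes a relation based on projection attributes. The localization program for a vertically fragmented relation consists of the join of the fragments on their common attribute.

Example 3.10: NV (MaNV, TenNV, CV) is fragmented as

$NV_1 = \Pi_{MaNV, TenNV}(NV)$
$NV_2 = \Pi_{MaNV, CV}(NV)$

The localization program is $NV = NV_1 \bowtie_{MaNV} NV_2$.

Method: queries on vertical fragments can be reduced by determining the useless intermediate relations and removing the subtrees that produce them. Given a relation R defined over the attributes $A = \{A_1, A_2, \dots\}$ and vertically fragmented as $R_i = \Pi_{A'}(R)$ with $A' \subseteq A$, a projection of $R_i$ on the attribute set D is useless if D is not in $A'$.

Example 3.11: With the fragments

$NV_1 = \Pi_{MaNV, TenNV}(NV)$
$NV_2 = \Pi_{MaNV, CV}(NV)$

consider the query

    SELECT TenNV
    FROM   NV

[Figure: generic query $\Pi_{TenNV}(NV_1 \bowtie_{MaNV} NV_2)$ and reduced query $\Pi_{TenNV}(NV_1)$.]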
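The reduction of Example 3.11 can be sketched as a small attribute-set computation. The sketch assumes that every vertical fragment carries the key MaNV and that a fragment is needed only if it supplies a projected non-key attribute; the names `VERTICAL_FRAGMENTS` and `fragments_for_projection` are hypothetical.

```python
KEY = {"MaNV"}
VERTICAL_FRAGMENTS = {
    "NV1": {"MaNV", "TenNV"},
    "NV2": {"MaNV", "CV"},
}

def fragments_for_projection(projected, fragments=VERTICAL_FRAGMENTS, key=KEY):
    """Keep the fragments that supply at least one projected non-key attribute."""
    wanted = set(projected) - key
    kept = [name for name, attrs in fragments.items() if attrs & wanted]
    if not kept:
        # Only key attributes are requested, so any single fragment suffices.
        kept = [next(iter(fragments))]
    return kept

print(fragments_for_projection({"TenNV"}))
```

When a single fragment is returned, the join in the localization program disappears, as in the reduced query of the figure.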

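Reduction for derived horizontal fragmentation

- Derived horizontal fragmentation is a way of distributing two relations so that the joint processing of selection and join is improved.
- If relation R is subject to derived fragmentation according to relation S, the fragments of R and S that have the same join attribute values are located at the same site. In addition, S can be fragmented according to a selection predicate.
- Derived fragmentation should be used only for one-to-many (hierarchical) relationships from S to R, where one tuple of S can match many tuples of R.

Example 3.12: NV (MaNV, TenNV, CV) is fragmented as

$NV_1 = \sigma_{CV = \text{'TP'}}(NV)$
$NV_2 = \sigma_{CV \ne \text{'TP'}}(NV)$

and PC (MaNV, MaDA, NV, Tg) is fragmented by derivation as

$PC_1 = PC \ltimes_{MaNV} NV_1$
$PC_2 = PC \ltimes_{MaNV} NV_2$

Retrieve all attributes of NV and PC for the employees with CV = 'PP':

    SELECT *
    FROM   NV, PC
    WHERE  NV.MaNV = PC.MaNV
    AND    NV.CV = 'PP'

The generic query is defined over the fragments NV1, NV2, PC1 and PC2. Pushing the selection down to the fragments NV1 and NV2 reduces the query, because the selection predicate contradicts the fragmentation predicate of NV1, so NV1 is eliminated.

[Figure: successive reductions of the query: the generic query over $(NV_1 \cup NV_2)$ and $(PC_1 \cup PC_2)$; the query after pushing $\sigma_{CV = \text{'PP'}}$ down and eliminating NV1; the query after distributing the join over the union; and the reduced query $\sigma_{CV = \text{'PP'}}(NV_2) \bowtie_{MaNV} PC_2$.]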

Reduction for hybrid fragmentation

Goal: to support efficiently the queries that involve projection, selection and join. Queries on hybrid fragments can be reduced by combining the rules used for primitive horizontal, vertical and derived horizontal fragmentation:

1. Remove the empty relations generated by contradictory selections on horizontal fragments.
2. Remove the useless relations generated by projections on vertical fragments.
3. Distribute joins over unions in order to isolate and remove useless joins.

Example 3.13: NV (MaNV, TenNV, CV) is fragmented as

$NV_1 = \sigma_{MaNV \le \text{'E4'}}(\Pi_{MaNV, TenNV}(NV))$
$NV_2 = \sigma_{MaNV > \text{'E4'}}(\Pi_{MaNV, TenNV}(NV))$
$NV_3 = \Pi_{MaNV, CV}(NV)$

The localization program is $NV = (NV_1 \cup NV_2) \bowtie_{MaNV} NV_3$.

Query: the name of the employee with number E5.

[Figure: generic query $\Pi_{TenNV}(\sigma_{MaNV = \text{'E5'}}((NV_1 \cup NV_2) \bowtie_{MaNV} NV_3))$ and reduced query $\Pi_{TenNV}(\sigma_{MaNV = \text{'E5'}}(NV_2))$.]

3.4. Optimization of distributed queries

In this section we introduce the query optimization process in general, regardless of whether the environment is distributed or centralized. The input query is assumed to be expressed in relational algebra on database relations (which may be fragments) after query rewriting from the relational calculus expression.

Query optimization refers to the process of producing a query execution plan (QEP) that represents the execution strategy for the query. The chosen plan must minimize a cost function. A query optimizer, the software module that performs optimization, is usually seen as consisting of three components: a search space, a cost model and a search strategy (see Figure 9.1). The search space is the set of alternative execution plans that represent the input query. These plans are equivalent, in the sense that they yield the same result, but they differ in the execution order of the operations and in the way these operations are implemented, and therefore in performance. The search space is obtained by applying transformation rules, such as those for relational algebra described in the section on query rewriting. The cost model predicts the cost of a given execution plan. To be accurate, the cost model must have the necessary knowledge about the distributed execution environment. The search strategy explores the search space and selects the best plan using the cost model. It defines which plans are examined and in which order. The details of the environment (centralized or distributed) are captured by the search space and the cost model.

3.4.1. Search space

Query execution plans are typically abstracted by means of operator trees, which define the order in which the operations are executed. They are enriched with additional information, such as the best algorithm chosen for each operation. For a given query, the search space can thus be defined as the set of equivalent operator trees that can be produced using the transformation rules. To characterize query optimizers, it is useful to concentrate on join trees, which are operator trees whose operators are joins or Cartesian products. The reason is that permutations of the join order have the most important effect on the performance of relational queries.

[Figure 9.1. Query optimization process: the input query goes through search space generation, driven by the transformation rules, to yield equivalent QEPs; the search strategy, driven by the cost model, then selects the best QEP.]

Example 3.14: Consider the following query:

    SELECT ENAME
    FROM   EMP, ASG, PROJ
    WHERE  EMP.ENO = ASG.ENO
    AND    ASG.PNO = PROJ.PNO

The following figure shows three equivalent join trees for this query, obtained by exploiting the associativity of binary operators. Each of these join trees can be assigned a cost based on the cost of each operator. Join tree (c), which starts with a Cartesian product, may have a much higher cost than the other two.

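[Figure: three equivalent join trees over EMP, ASG and PROJ: (a) and (b) use only the joins on ENO and PNO, in different orders; (c) starts with the Cartesian product EMP × PROJ and then joins with ASG on ENO and PNO.]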

For a complex query (involving many relations and many operators), the number of equivalent operator trees can be very high. For instance, the number of alternative join trees that can be produced by applying the commutativity and associativity rules is O(N!) for N relations. Investigating a large search space may make optimization time prohibitive, sometimes much more expensive than the actual execution time. Therefore, query optimizers typically restrict the size of the search space they consider. The first restriction is to use heuristics. The most common heuristic is to perform selection and projection when accessing base relations. Another common heuristic is to avoid Cartesian products that are not required by the query. For instance, in the figure above, operator tree (c) would not be part of the search space considered by the optimizer.

[Figure: (a) linear join tree and (b) bushy join tree over the relations R1, R2, R3 and R4.]

Another important restriction concerns the shape of the join tree. Two kinds of join trees are usually distinguished: linear trees and bushy trees (see Figure 9.3). A linear tree is a tree in which at least one operand of each operator node is a base relation. A bushy tree is more general and may have operators with no base relations as operands (that is, both operands are intermediate relations). By considering only linear trees, the size of the search space is reduced to $O(2^N)$. However, in a distributed environment, bushy trees are useful for exhibiting parallelism.

3.4.2. Search strategy

The most popular search strategy used by query optimizers is dynamic programming, which is deterministic. Deterministic strategies proceed by building plans, starting from the base relations and joining one more relation at each step until complete plans are obtained, as in Figure 9.4. Dynamic programming builds all possible plans breadth-first before it chooses the best plan. To reduce the optimization cost, partial plans that are unlikely to lead to the optimal plan are pruned as soon as possible. By contrast, another deterministic strategy, the greedy algorithm, builds only one plan, depth-first.

[Figure 9.4: deterministic search strategy. Step 1 joins R1 and R2; step 2 joins the result with R3; step 3 joins the result with R4.]

Dynamic programming is almost exhaustive and guarantees that the best of all plans is found. It incurs an acceptable optimization cost (in time and space) when the number of relations in the query is small. However, this approach becomes too expensive when the number of relations is greater than 5 or 6. For this reason, recent interest has concentrated on randomized strategies, which reduce the optimization complexity but do not guarantee that the best plan is found. Unlike deterministic strategies, randomized strategies allow the optimizer to trade optimization time for execution time.

Randomized strategies concentrate on searching for the optimal solution around some particular points. They do not guarantee that the best solution is obtained, but they avoid the high cost of optimization in terms of memory and time consumption. First, one or more start plans are built by a greedy strategy. Then the algorithm tries to improve the start plan by visiting its neighbors. A neighbor is obtained by applying a random transformation to a plan. A typical transformation consists of exchanging two randomly chosen operand relations of the plan, as in the figure below. It has been shown experimentally that randomized strategies perform better than deterministic strategies when the query involves a fairly large number of relations.

[Figure: a randomized transformation that exchanges two operand relations in a join tree over R1, R2 and R3.]
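The deterministic strategy described above can be illustrated by a small dynamic programming sketch restricted to left-deep (linear) plans. It is not the algorithm of any particular system. The cost model is a deliberate simplification assumed here: the cost of a plan is the sum of the cardinalities of its intermediate results, a join cardinality is the product of the operand cardinalities and the selectivity factors of the applicable join predicates, and a missing predicate means a Cartesian product. The function name and the input format are hypothetical.

```python
from itertools import combinations

def best_left_deep_plan(cards, selectivity):
    """Dynamic programming over subsets of relations (left-deep plans only).

    cards:       {relation: cardinality}
    selectivity: {frozenset({r, s}): join selectivity factor}; pairs that are
                 absent are treated as Cartesian products (factor 1.0).
    Returns (cost, order), where cost sums the intermediate result sizes.
    """
    relations = sorted(cards)
    # best[subset] = (cost so far, result cardinality, join order)
    best = {frozenset([r]): (0.0, float(cards[r]), (r,)) for r in relations}
    for size in range(2, len(relations) + 1):
        for subset in map(frozenset, combinations(relations, size)):
            candidates = []
            for last in subset:
                cost, card, order = best[subset - {last}]
                factor = 1.0
                for r in order:
                    factor *= selectivity.get(frozenset({r, last}), 1.0)
                new_card = card * cards[last] * factor
                candidates.append((cost + new_card, new_card, order + (last,)))
            # Keep only the cheapest plan for this subset (pruning).
            best[subset] = min(candidates)
    cost, _, order = best[frozenset(relations)]
    return cost, order

# Hypothetical statistics for the query of Example 3.14.
cards = {"EMP": 400, "ASG": 1000, "PROJ": 100}
selectivity = {
    frozenset({"EMP", "ASG"}): 1 / 400,
    frozenset({"ASG", "PROJ"}): 1 / 100,
}
print(best_left_deep_plan(cards, selectivity))
```

Keeping a single plan per subset is valid in this toy model because the result cardinality of a subset does not depend on the join order; with the statistics above, the plans that start with the Cartesian product of EMP and PROJ are discarded.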
3.4.3. Distributed cost model

An optimizer's cost model includes cost functions to predict the cost of operators, statistics and base data, and formulas to estimate the sizes of intermediate results.

Cost functions

The cost of a distributed execution strategy can be expressed with respect to either the total time or the response time. The total time is the sum of all time (also called cost) components, while the response time is the elapsed time from the initiation to the completion of the query. A general formula for determining the total time is:

$Total\_time = T_{CPU} * \#insts + T_{I/O} * \#I/Os + T_{MSG} * \#msgs + T_{TR} * \#bytes$

The first two components measure the local processing time, where $T_{CPU}$ is the time of a CPU instruction and $T_{I/O}$ is the time of a disk I/O. The communication time is given by the last two components. $T_{MSG}$ is the fixed time of initiating and receiving a message, while $T_{TR}$ is the time it takes to transmit a data unit from one site to another. The data unit is given here in bytes (#bytes is the sum of the sizes of all messages), but could be in other units (for example, packets). A typical assumption is that $T_{TR}$ is constant. This might not be true in wide area networks, where some sites are farther away than others. However, this assumption greatly simplifies query optimization. Thus the communication time of transferring #bytes of data from one site to another is assumed to be a linear function of #bytes:

$CT(\#bytes) = T_{MSG} + T_{TR} * \#bytes$

Costs are generally expressed in time units, which in turn can be translated into other units (for example, dollars). The relative values of the cost coefficients characterize the distributed database environment. The network topology greatly influences the ratio between these components. In a wide area network such as the Internet, the communication time is generally the dominant factor. In local area networks, however, the components are more balanced. Early studies reported a ratio of communication time to I/O time for one page of about 20:1 for wide area networks, and about 1:1.6 for a typical Ethernet (10 Mbps). Thus most distributed DBMSs designed for wide area networks ignored the local processing cost and concentrated on minimizing the communication cost. Distributed DBMSs designed for local area networks, on the other hand, consider all three cost components. Faster networks, both wide area and local area, have improved the above ratios in favor of communication cost when all other things are equal. However, communication is still the dominant time factor in wide area networks such as the Internet, because data must travel over longer distances.

When the response time of the query is the objective function of the optimizer, parallel local processing and parallel communications must also be considered. A general formula for the response time is:

$Response\_time = T_{CPU} * seq\_\#insts + T_{I/O} * seq\_\#I/Os + T_{MSG} * seq\_\#msgs + T_{TR} * seq\_\#bytes$

where seq_#x, in which x can be instructions (insts), I/Os, messages (msgs) or bytes, is the maximum number of x that must be done sequentially to execute the query. Thus any processing and communication done in parallel is ignored.

Example 3.15: We illustrate the difference between total cost and response time using the example of Figure 6, in which the answer is computed at site 3 with data coming from sites 1 and 2. For simplicity, we assume that only the communication cost is considered.

[Figure: x data units are transferred from site 1 to site 3, and y data units from site 2 to site 3.]
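The two cost measures can be compared numerically for the scenario in the figure. The sketch below applies the linear communication cost $CT = T_{MSG} + T_{TR} * \#units$ to each transfer, assumes that the transfers run fully in parallel and that only communication cost matters; the function names and the numeric values are illustrative assumptions.

```python
def transfer_time(t_msg, t_tr, units):
    """Linear communication cost of one transfer: T_MSG + T_TR * units."""
    return t_msg + t_tr * units

def total_time(t_msg, t_tr, transfers):
    """Sum of the costs of all transfers."""
    return sum(transfer_time(t_msg, t_tr, units) for units in transfers)

def response_time(t_msg, t_tr, transfers):
    """Elapsed time when all transfers proceed in parallel."""
    return max(transfer_time(t_msg, t_tr, units) for units in transfers)

# Hypothetical values: T_MSG = 1, T_TR = 2, x = 10 units and y = 30 units.
print(total_time(1, 2, [10, 30]))
print(response_time(1, 2, [10, 30]))
```

The total time grows with every transfer added, whereas the response time is governed by the slowest transfer only.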

Assume that $T_{MSG}$ and $T_{TR}$ are expressed in time units. The total time of transferring x data units from site 1 to site 3 and y data units from site 2 to site 3 is

$Total\_time = 2\,T_{MSG} + T_{TR} * (x + y)$

The response time of the same query can be approximated as

$Response\_time = \max\{T_{MSG} + T_{TR} * x,\; T_{MSG} + T_{TR} * y\}$

since the transfers can be done in parallel.

Minimizing the response time is achieved by increasing the degree of parallel execution. This does not imply that the total time is also minimized. On the contrary, it can increase the total time, for example by having more parallel local processing and transmissions. Minimizing the total time implies that the utilization of the resources improves, thus increasing the system throughput. In practice, a compromise between the two is desired.

Database statistics

The main factor affecting the performance of an execution strategy is the size of the intermediate relations that are produced during execution. When a subsequent operation is located at a different site, the intermediate relation must be transmitted over the network. It is therefore of prime interest to estimate the size of the intermediate results of relational algebra operations in order to minimize the amount of data transferred. This estimation is based on statistical information about the base relations and on formulas that predict the cardinalities of the results. There is a direct trade-off between the precision of the statistics and the cost of managing them: the more precise the statistics, the more costly they are. For a relation R defined over the attributes $A = \{A_1, A_2, \dots, A_n\}$ and fragmented as $R_1, R_2, \dots, R_n$, the statistical data typically are the following:

1. For each attribute $A_i$, its length (in bytes), denoted $length(A_i)$, and for each attribute $A_i$ of each fragment $R_j$, the number of distinct values of $A_i$, that is, the cardinality of the projection of fragment $R_j$ on $A_i$, denoted $card(\Pi_{A_i}(R_j))$.
2. For the domain of each attribute $A_i$ defined on a set of values that can be ordered (for example, integers or reals), the maximum and minimum values, denoted $max(A_i)$ and $min(A_i)$.
3. For the domain of each attribute $A_i$, the cardinality of the domain, denoted $card(dom[A_i])$. This value gives the number of unique values in $dom[A_i]$.
4. The number of tuples in each fragment $R_j$, denoted $card(R_j)$.

Sometimes the statistical data also include the join selectivity factor for some pairs of relations, that is, the proportion of tuples participating in the join. The join selectivity factor of relations R and S, denoted $SF_J$, is a real value between 0 and 1:

$SF_J = \dfrac{card(R \bowtie S)}{card(R) * card(S)}$

For example, a join selectivity factor of 0.5 corresponds to a very large joined relation, while a factor of 0.001 corresponds to a fairly small one. We say that the join has poor selectivity in the former case and good selectivity in the latter.

These statistics are useful to predict the size of intermediate relations. The size of an intermediate relation R is

$size(R) = card(R) * length(R)$

where $length(R)$ is the length (in bytes) of a tuple of R, computed from the lengths of its attributes. Estimating $card(R)$, the number of tuples in R, requires the formulas given in the following section.

Cardinalities of intermediate results

Database statistics are useful in evaluating the cardinalities of the intermediate results. Two simplifying assumptions are commonly made about the database. The distribution of attribute values in a relation is supposed to be uniform, and all attributes are independent, meaning that the value of an attribute does not affect the value of any other attribute. These two assumptions are often wrong in practice, but they make the problem tractable. In what follows we give the formulas for estimating the cardinalities of the results of the basic relational algebra operations (selection, projection, Cartesian product, join, semijoin, union and difference). The operand relations are denoted R and S. The selectivity factor of an operation is denoted $SF_{OP}$, where OP denotes the operation.

Selection. The cardinality of selection is

$card(\sigma_F(R)) = SF_S(F) * card(R)$

where $SF_S(F)$ depends on the selection formula and can be computed as follows, with $p(A_i)$ and $p(A_j)$ denoting predicates over the attributes $A_i$ and $A_j$:

$SF_S(A = value) = \dfrac{1}{card(\Pi_A(R))}$

$SF_S(A > value) = \dfrac{max(A) - value}{max(A) - min(A)}$

$SF_S(A < value) = \dfrac{value - min(A)}{max(A) - min(A)}$

$SF_S(p(A_i) \land p(A_j)) = SF_S(p(A_i)) * SF_S(p(A_j))$

$SF_S(p(A_i) \lor p(A_j)) = SF_S(p(A_i)) + SF_S(p(A_j)) - (SF_S(p(A_i)) * SF_S(p(A_j)))$

$SF_S(A \in \{values\}) = SF_S(A = value) * card(\{values\})$
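The selectivity formulas translate directly into code. The following sketch mirrors them one by one under the uniformity and independence assumptions stated above; the function names and the sample statistics are illustrative, not taken from any system.

```python
def sf_eq(distinct_values):
    """SF_S(A = value) = 1 / card(projection of R on A)."""
    return 1.0 / distinct_values

def sf_gt(value, min_a, max_a):
    """SF_S(A > value) = (max(A) - value) / (max(A) - min(A))."""
    return (max_a - value) / (max_a - min_a)

def sf_lt(value, min_a, max_a):
    """SF_S(A < value) = (value - min(A)) / (max(A) - min(A))."""
    return (value - min_a) / (max_a - min_a)

def sf_and(sf_i, sf_j):
    """Conjunction of predicates on independent attributes."""
    return sf_i * sf_j

def sf_or(sf_i, sf_j):
    """Disjunction of predicates on independent attributes."""
    return sf_i + sf_j - sf_i * sf_j

def sf_in(distinct_values, n_values):
    """SF_S(A in {values}) = SF_S(A = value) * card({values})."""
    return sf_eq(distinct_values) * n_values

def card_selection(sf, card_r):
    """card(selection) = SF_S(F) * card(R)."""
    return sf * card_r

# Hypothetical statistics: 1000 tuples, Tgian ranging from 0 to 48 months.
print(card_selection(sf_gt(36, 0, 48), 1000))
```

With these made-up statistics, the predicate Tgian > 36 is estimated to keep a quarter of the tuples.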

Projection. Projection can be done with or without duplicate elimination. Here we consider projection with duplicate elimination. An arbitrary projection is difficult to estimate precisely, because the correlations between the projected attributes are usually unknown. There are, however, two particularly useful cases where the estimation is trivial. If the projection of relation R is based on a single attribute A, the cardinality is simply the number of tuples obtained when the projection is performed. If one of the projected attributes is a key of R, then

$card(\Pi_A(R)) = card(R)$

Cartesian product. The cardinality of the Cartesian product of relations R and S is

$card(R \times S) = card(R) * card(S)$

Join. There is no general way to estimate the cardinality of a join without additional information. The upper bound of the join cardinality is the cardinality of the Cartesian product. Some systems, such as Distributed INGRES, use this upper bound, which is a rather pessimistic estimate. R* uses the quotient of this upper bound by a constant, reflecting the fact that the join result is smaller than the Cartesian product. However, there is a case, which occurs quite frequently, where the estimation is simple. If relation R is equijoined with S over attribute A of R and attribute B of S, where A is a key of relation R and B is a foreign key of relation S, the cardinality of the result can be approximated as

$card(R \bowtie_{A=B} S) = card(S)$

because each tuple of S matches at most one tuple of R. Obviously, the same reasoning applies if B is a key of S and A is a foreign key of R, with card(R) as the estimate. However, this estimate is an upper bound, since it assumes that every tuple of S participates in the join. For other important joins, it is worthwhile to maintain the join selectivity factor $SF_J$ as part of the statistical information. In that case the cardinality of the result is

$card(R \bowtie S) = SF_J * card(R) * card(S)$
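The join estimates above, together with the size formula $size(R) = card(R) * length(R)$, can be sketched as small helper functions. The names and the sample numbers are assumptions made for illustration.

```python
def card_cartesian(card_r, card_s):
    """Upper bound of any join: card(R x S) = card(R) * card(S)."""
    return card_r * card_s

def card_key_fk_join(card_fk_side):
    """Key / foreign-key equijoin: at most one match per foreign-key tuple."""
    return card_fk_side

def card_join(card_r, card_s, sf_j):
    """General case: card(R join S) = SF_J * card(R) * card(S)."""
    return sf_j * card_r * card_s

def size(card, tuple_length):
    """size(R) = card(R) * length(R), in bytes."""
    return card * tuple_length

# Hypothetical statistics: card(R) = 10, card(S) = 20, SF_J = 0.5.
print(card_join(10, 20, 0.5), card_cartesian(10, 20))
```

The estimated join cardinality can then be multiplied by the tuple length to obtain the volume that would have to be shipped to another site.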

Semijoin. The selectivity factor of the semijoin of R by S is the fraction (percentage) of tuples of R that join with tuples of S. An approximation of the semijoin selectivity factor is

$SF_{SJ}(R \ltimes_A S) = \dfrac{card(\Pi_A(S))}{card(dom[A])}$

This formula depends only on attribute A of S. It is therefore often called the selectivity factor of attribute A of S, denoted $SF_{SJ}(S.A)$, and it is the selectivity factor of S.A on any other attribute that can be joined with it. The cardinality of the semijoin is thus given by

$card(R \ltimes_A S) = SF_{SJ}(S.A) * card(R)$

This approximation can be verified on a very frequent case, that of R.A being a foreign key of S (S.A is a primary key). In this case, the semijoin selectivity factor is 1, since $card(\Pi_A(S)) = card(dom[A])$, which yields card(R) as the cardinality of the semijoin.
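As a brief illustration, the semijoin estimate can be computed as below. The function names and the statistics are assumed for the example only.

```python
def sf_semijoin(distinct_a_in_s, domain_size):
    """SF_SJ(S.A) = card(projection of S on A) / card(dom[A])."""
    return distinct_a_in_s / domain_size

def card_semijoin(card_r, distinct_a_in_s, domain_size):
    """card(R semijoin S) = SF_SJ(S.A) * card(R)."""
    return sf_semijoin(distinct_a_in_s, domain_size) * card_r

# Hypothetical statistics: S holds 50 of the 200 possible values of A.
print(card_semijoin(1000, 50, 200))
# Foreign-key case: S.A covers the whole domain, so R is not reduced.
print(card_semijoin(1000, 200, 200))
```

The second call reproduces the foreign-key case discussed above, where the selectivity factor is 1.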

Union. It is quite difficult to estimate the cardinality of the union of R and S, because the duplicates between R and S are removed by the union. We give only the simple formulas for the upper and lower bounds, which are respectively

$card(R) + card(S)$
$\max\{card(R), card(S)\}$

Note that these formulas assume that R and S do not contain duplicate tuples.

Difference. As for union, we give only the upper and lower bounds. The upper bound of $card(R - S)$ is $card(R)$, whereas the lower bound is 0.

3.4.4. Join ordering in fragment queries

As we have seen, ordering joins is an important aspect of centralized query optimization. Join ordering in a distributed context is even more important, since joins between fragments may increase the communication time. There are two basic approaches to ordering joins in fragment queries. One optimizes the ordering of joins directly, whereas the other replaces joins by combinations of semijoins in order to minimize communication costs.

Join ordering. Some algorithms optimize the ordering of joins directly without using semijoins. The algorithms of Distributed INGRES and System R* are representative of this group. The purpose of this section is to stress the difficulty of join ordering and to motivate the next section, which uses semijoins to optimize join queries.

A number of assumptions are necessary to concentrate on the main issues. Since the query is localized and expressed on fragments, we do not need to distinguish between fragments of the same relation and fragments stored at a particular site. To concentrate on join ordering, we ignore local processing time, assuming that reducers (selection, projection) are executed locally either before or during the join (remember that doing selection first is not always efficient). Therefore, we consider only join queries whose operand relations are stored at different sites. We assume that relation transfers are done in a set-at-a-time mode rather than in a tuple-at-a-time mode. Finally, we ignore the transfer time for producing the data at the result site.

Let us first concentrate on the simpler problem of operand transfer in a single join. The query is $R \bowtie S$, where R and S are relations stored at different sites. The obvious choice of the relation to transfer is to send the smaller relation to the site of the larger one, which gives rise to the two possibilities shown in Figure 7. To make this choice we need to estimate the sizes of R and S.

We now consider the case where there are more than two relations to join. As in the case of a single join, the objective of the join-ordering algorithm is to transmit the smaller operands. The difficulty stems from the fact that the join operations may reduce or increase the size of the intermediate results. Thus, estimating the size of join results is mandatory, but also difficult. A solution is to estimate the communication costs of all alternative strategies and to choose the best one. However, the number of strategies grows rapidly with the number of relations. This approach, used by System R*, makes optimization costly, although this overhead is amortized rapidly if the query is executed frequently.

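[Figure: transfer of operands in the join of R and S. R is sent to the site of S if size(R) < size(S); S is sent to the site of R if size(R) > size(S).]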

Example 3.16: Consider the following query expressed in relational algebra:

$PROJ \bowtie_{PNO} ASG \bowtie_{ENO} EMP$

whose join graph is given in Figure 8. Note that we have made certain assumptions about the locations of the three relations. This query can be executed in at least five different ways. We describe these strategies by the following programs, where (R → site j) stands for "relation R is transferred to site j".

[Figure: join graph of the distributed query, with EMP at site 1, ASG at site 2 and PROJ at site 3; the edge EMP-ASG is labeled ENO and the edge ASG-PROJ is labeled PNO.]

1. EMP → site 2. Site 2 computes EMP' = EMP ⋈ ASG. EMP' → site 3. Site 3 computes EMP' ⋈ PROJ.
2. ASG → site 1. Site 1 computes EMP' = EMP ⋈ ASG. EMP' → site 3. Site 3 computes EMP' ⋈ PROJ.
3. ASG → site 3. Site 3 computes ASG' = ASG ⋈ PROJ. ASG' → site 1. Site 1 computes ASG' ⋈ EMP.
4. PROJ → site 2. Site 2 computes PROJ' = PROJ ⋈ ASG. PROJ' → site 1. Site 1 computes PROJ' ⋈ EMP.
5. EMP → site 2. PROJ → site 2. Site 2 computes EMP ⋈ PROJ ⋈ ASG.
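The five programs can be compared by the volume of data each one ships. The sketch below counts only the transferred sizes, ignoring the fixed message cost and local processing; the dictionary keys `EMP_ASG` and `ASG_PROJ` (sizes of the two intermediate joins) and all numeric values are hypothetical.

```python
def strategy_costs(size):
    """Data volume shipped by each of the five strategies.

    size maps "EMP", "ASG", "PROJ", "EMP_ASG" (EMP joined with ASG) and
    "ASG_PROJ" (ASG joined with PROJ) to their estimated sizes.
    """
    return {
        1: size["EMP"] + size["EMP_ASG"],
        2: size["ASG"] + size["EMP_ASG"],
        3: size["ASG"] + size["ASG_PROJ"],
        4: size["PROJ"] + size["ASG_PROJ"],
        5: size["EMP"] + size["PROJ"],
    }

def cheapest_strategy(size):
    """Return the number of the strategy that ships the least data."""
    costs = strategy_costs(size)
    return min(costs, key=costs.get)

# Hypothetical size estimates, for illustration only.
sizes = {"EMP": 400, "ASG": 1000, "PROJ": 100, "EMP_ASG": 1200, "ASG_PROJ": 1100}
print(strategy_costs(sizes))
print(cheapest_strategy(sizes))
```

With different size estimates another strategy may win, which is why the sizes of the intermediate joins have to be known or predicted.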

To select one of these programs, the following sizes must be known or predicted: size(EMP), size(ASG), size(PROJ), size(EMP ⋈ ASG) and size(ASG ⋈ PROJ). Furthermore, if it is the response time that is being considered, the optimization must take into account the fact that the transfers can be done in parallel with strategy 5. An alternative to enumerating all the solutions is to use heuristics that consider only the sizes of the operand relations by assuming, for example, that the cardinality of the resulting join is the product of the operand cardinalities. In this case, the relations are ordered by size and the order of execution is given by this ordering and the join graph. For instance, the order (EMP, ASG, PROJ) could use strategy 1, while the order (PROJ, ASG, EMP) could use strategy 4.

Semijoin-based algorithms

In this section we show how to use semijoins to decrease the total time of join queries. We make the same assumptions as in the previous section. The main shortcoming of the join approach described there is that entire operand relations must be transferred between sites. For a relation, the semijoin acts as a size reducer, much as a selection does.

The join of two relations R and S over attribute A, stored at sites 1 and 2 respectively, can be computed by replacing one or both operands by a semijoin with the other relation, using the following rules:

$R \bowtie_A S \Leftrightarrow (R \ltimes_A S) \bowtie_A S$
$R \bowtie_A S \Leftrightarrow R \bowtie_A (S \ltimes_A R)$
$R \bowtie_A S \Leftrightarrow (R \ltimes_A S) \bowtie_A (S \ltimes_A R)$

The choice among the three semijoin strategies requires estimating their respective costs. The use of the semijoin is beneficial if the cost of producing it and sending it to the other site is less than the cost of sending the whole operand relation and doing the actual join. To illustrate the potential benefit of the semijoin, let us compare the costs of the two alternatives $R \bowtie_A S$ and $(R \ltimes_A S) \bowtie_A S$, assuming that size(R) < size(S).

The following program, using the notation of the previous section, uses the semijoin:

1. $\Pi_A(S)$ → site 1
2. Site 1 computes $R' = R \ltimes_A S$
3. $R'$ → site 2
4. Site 2 computes $R' \bowtie_A S$
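The four steps above can be simulated on small in-memory relations to see what is actually shipped. In this sketch relations are lists of dictionaries, the two sites are simulated within one process, and the amounts shipped are counted in values and tuples rather than bytes; all names and data are illustrative.

```python
def semijoin_program(r, s, attr):
    """Join r (held at site 1) with s (held at site 2) using the semijoin program.

    Returns (join result, number of values shipped in step 1,
    number of tuples shipped in step 3).
    """
    # Step 1: project S on the join attribute and ship the values to site 1.
    s_values = {t[attr] for t in s}
    # Step 2: site 1 reduces R to the tuples that will find a match.
    r_reduced = [t for t in r if t[attr] in s_values]
    # Step 3: the reduced R is shipped to site 2.
    # Step 4: site 2 joins the reduced R with S.
    result = [{**tr, **ts} for tr in r_reduced for ts in s if tr[attr] == ts[attr]]
    return result, len(s_values), len(r_reduced)

# Hypothetical data: only one of the four tuples of R participates in the join.
r = [{"A": 1, "x": "a"}, {"A": 2, "x": "b"}, {"A": 3, "x": "c"}, {"A": 4, "x": "d"}]
s = [{"A": 2, "y": "p"}, {"A": 2, "y": "q"}, {"A": 9, "y": "r"}]
print(semijoin_program(r, s, "A"))
```

Here two attribute values and one tuple are shipped instead of the four tuples of R, which is the kind of saving the semijoin is meant to achieve when few tuples of R participate in the join.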

For simplicity, let us ignore the constant $T_{MSG}$ in the communication time, assuming that the term $T_{TR} * size(R)$ is much larger. We can then compare the two alternatives in terms of the amount of data transmitted. The cost of the join-based algorithm is that of transferring relation R to site 2. The cost of the semijoin-based algorithm is the cost of steps 1 and 3 above. Therefore, the semijoin approach is better if

$size(\Pi_A(S)) + size(R \ltimes_A S) < size(R)$

The semijoin approach is better if the semijoin acts as a sufficient reducer, that is, if only a few tuples of R participate in the join. The join approach is better if almost all tuples of R participate in the join, because the semijoin approach requires an additional transfer of a projection on the join attribute. The cost of the projection step can be minimized by encoding the result of the projection in bit arrays, thereby reducing the cost of transferring the joined attribute values. It is important to note that neither approach is systematically the best; they should be considered as complementary.

More generally, the semijoin can be useful in reducing the size of the operand relations involved in multiple-join queries. However, query optimization becomes more complex in these cases. Consider again the join graph of relations EMP, ASG and PROJ given in Figure 9. We can apply the previous join algorithm using semijoins to each individual join. Thus an example of a program to compute EMP ⋈ ASG ⋈ PROJ is EMP' ⋈ ASG' ⋈ PROJ, where EMP' = EMP ⋉ ASG and ASG' = ASG ⋉ PROJ.

However, we may further reduce the size of an operand relation by using more than one semijoin. For example, EMP' in the above program can be replaced by EMP'', computed as

EMP'' = EMP ⋉ (ASG ⋉ PROJ)

since if size(ASG ⋉ PROJ) ≤ size(ASG), we have size(EMP'') ≤ size(EMP'). In this way, EMP can be reduced by the sequence of semijoins EMP ⋉ (ASG ⋉ PROJ). Such a sequence of semijoins is called a semijoin program for EMP. Similarly, semijoin programs can be found for any relation in a query. For example, PROJ could be reduced by the semijoin program PROJ ⋉ (ASG ⋉ EMP). However, not all of the relations involved in a query need to be reduced; in particular, we can ignore those relations that are not involved in the final joins.

For a given relation, there exist several potential semijoin programs. The number of possibilities is in fact exponential in the number of relations. But there is one optimal semijoin program, called the full reducer, which for each relation R reduces R more than the others. The problem is to find the full reducer.
A simple method is to evaluate the size reduction of all possible semijoin programs and to select the best one. This enumerative method runs into two problems:

1. There is a class of queries, called cyclic queries, that have cycles in their join graph and for which no full reducer can be found.
2. For other queries, called tree queries, full reducers exist, but the number of candidate semijoin programs is exponential in the number of relations, which makes the enumerative approach NP-hard.

In what follows, we discuss solutions to these problems.

Example 3.17: Consider the following relations, where the attribute CITY has been added to relations EMP (renamed ET) and PROJ (renamed PT) of the database:

ET (ENO, ENAME, TITLE, CITY)
ASG (ENO, PNO, RESP, DUR)
PT (PNO, PNAME, BUDGET, CITY)

The following query retrieves the names of all employees living in the city in which their project is located:

    SELECT ET.ENAME
    FROM   ET, ASG, PT
    WHERE  ET.ENO = ASG.ENO
    AND    ASG.PNO = PT.PNO
    AND    ET.CITY = PT.CITY

No full reducer exists for the query in Example 3.17. In fact, it is possible to derive semijoin programs for reducing it, but the number of operations is multiplied by the number of tuples in each relation, making the approach inefficient. One solution consists of transforming the cyclic graph into a tree by removing one arc of the graph and adding appropriate predicates to the other arcs, such that the removed predicate is preserved by transitivity.

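[Figure: (a) cyclic query, with arcs ET-ASG (ET.ENO = ASG.ENO), ASG-PT (ASG.PNO = PT.PNO) and ET-PT (ET.CITY = PT.CITY); (b) equivalent acyclic query, with arcs ET-ASG (ET.ENO = ASG.ENO and ET.CITY = ASG.CITY) and ASG-PT (ASG.PNO = PT.PNO and ASG.CITY = PT.CITY).]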

In the example of figure (b), where the arc (ET, PT) is removed, the additional predicates ET.CITY = ASG.CITY and ASG.CITY = PT.CITY imply ET.CITY = PT.CITY by transitivity. Thus the acyclic query is equivalent to the cyclic query. Adding these predicates implies adding the attribute CITY to relation ASG. Hence the values of the attribute CITY must be sent to ASG from either ET or PT.

Although full reducers for tree queries exist, the problem of finding them is NP-hard. However, there is an important class of queries, called chained queries, for which a polynomial algorithm exists. A chained query has a join graph in which the relations can be ordered and each relation joins only with the next relation in the order. For example, the query in the figure above is a chained query. Because it is difficult to implement an algorithm with full reducers, most systems use single semijoins to reduce the size of a relation.

[Figure: two operator trees over DA, PC and NV with joins on MaDA and MaNV: (a) join approach; (b) semijoin approach.]

Join versus semijoin
Compared with the join, the semijoin requires more operations, but possibly on smaller operands. The figure above illustrates these differences with an equivalent pair of join and semijoin strategies for the query whose join graph is given in Figure 11. The join of the two relations EMP and ASG in the figure is performed by sending one relation, say ASG, to the site of the other, EMP, and completing the join locally at that site. When the semijoin is used, the transfer of the relation ASG is avoided. Instead, the join attribute values of EMP are sent to the site of ASG, the matching tuples of ASG are then sent to the site of EMP, and the join is completed there.
If the join attribute is shorter than a whole tuple and the semijoin has good selectivity, the semijoin approach can save a substantial amount of communication time. It may, however, increase local processing time, because one of the two joined relations must be accessed twice. For example, the relations EMP and PROJ are accessed twice. Furthermore, the join of two intermediate relations produced by semijoins cannot exploit the indexes available on the base relations. Using semijoins may therefore be a poor choice when communication time is not the dominant cost factor, as is the case with local area networks.
Semijoins can still be useful in fast networks if they have very good selectivity and are implemented with bit arrays. A bit array BA[1:n] is useful for encoding the join attribute values present in a relation. Consider the semijoin $R \ltimes S$. BA[i] is set to 1 if there exists a join attribute value A = val in relation S such that h(val) = i, where h is a hash function. Otherwise BA[i] is set to 0. Such a bit array is much smaller than a list of join attribute values, so sending the bit array instead of the join attribute values to the site of R saves communication time. The semijoin is then completed as follows: each tuple of R whose join attribute value is val belongs to the semijoin if BA[h(val)] = 1.
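The bit-array technique can be sketched in a few lines. This is an illustration under stated assumptions rather than a system implementation: the array is indexed from 0 instead of 1, the hash function `h` is a simple modulo on integer keys, and the tuples of R and the values of S are made up.

```python
def build_bit_array(join_values, n, h):
    """Set BA[i] = 1 iff some join-attribute value val has h(val) == i."""
    ba = [0] * n
    for val in join_values:
        ba[h(val)] = 1
    return ba


def semijoin_candidates(r_tuples, attr, ba, h):
    """Keep the tuples of R whose join-attribute value hashes to a set bit."""
    return [t for t in r_tuples if ba[h(t[attr])] == 1]


N = 8                          # assumed bit-array size


def h(val):                    # assumed hash function on integer keys
    return val % N


s_values = [1, 3, 3, 5]        # join-attribute values present in S (made up)
r_tuples = [{"ENO": 1}, {"ENO": 2}, {"ENO": 5}, {"ENO": 11}]   # relation R (made up)

ba = build_bit_array(s_values, N, h)                        # built at the site of S, then shipped
candidates = semijoin_candidates(r_tuples, "ENO", ba, h)    # evaluated at the site of R
exact = [t for t in r_tuples if t["ENO"] in set(s_values)]  # true semijoin, for comparison

print([t["ENO"] for t in candidates])   # [1, 5, 11]
print([t["ENO"] for t in exact])        # [1, 5]
```

The tuple with ENO = 11 passes the filter because 11 and 3 hash to the same position. With a hash-based bit array the test BA[h(val)] = 1 can admit such false positives, so the filtered result is a superset of the exact semijoin, and the final join removes the extra tuples.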


CHAPTER 4. TRANSACTION MANAGEMENT


4.1. Concepts
Transactions
There are many definitions of a transaction. Two widely used ones are due to Jeffrey D. Ullman and to M. Tamer Özsu and Patrick Valduriez.
Jeffrey D. Ullman [10] defines a transaction as one execution of a program. The program may be a query, or a program written in a host language in which a query language is embedded. A transaction reads data from the database into a private workspace, performs computations in that workspace, and writes data from the workspace back to the database. The computations performed by a transaction therefore do not change the database until the new values are written to it.
M. Tamer Özsu and Patrick Valduriez define a transaction as a unit of consistent and reliable computation. That is, a transaction performs an access to the database that causes a state change. If the database was consistent before the transaction ran, it is also consistent when the transaction ends, even if the transaction runs concurrently with other transactions or a failure occurs during its execution.
In general, a transaction is regarded as a sequence of read and write operations on the database together with the necessary computation steps. In this sense, a transaction can be viewed as a program with embedded database queries. A query by itself can also be viewed as a program and submitted as a transaction.
Transaction management: The transaction manager (TM) is responsible for coordinating the execution of the database operations issued by applications. It implements an interface for applications consisting of the commands begin_transaction, read, write, commit and abort. These commands are processed within a distributed DBMS.
Data items


Data items are the units of data in the database. Their nature and size are chosen by the designer, so that data access is efficient. In the relational data model, for example, we may choose large items such as relations, or small items such as tuples or components of tuples. The size of the data items used by a system is called the granularity of the system. A system is fine-grained if it uses small data items and coarse-grained if it uses large ones. Coarser granularity reduces the cost of managing access to items, whereas finer granularity allows more concurrent activity.
The most common way to control access to items is to use locks. The lock manager is the component of the DBMS responsible for tracking whether some transaction is currently reading or writing parts of an item I. If so, the lock manager prevents other transactions from accessing I whenever the access could cause a conflict, such as selling the same seat on a flight twice.
Schedulers and protocols
To prevent deadlock, one can use a scheduler and protocols. The scheduler is a component of the database system that arbitrates between conflicting requests and is responsible for ordering the operations of the transactions into a schedule. For example, we know that a first-come, first-served scheduler eliminates livelock. A scheduler can also deal with deadlock and non-serializability by:
- Forcing a transaction to wait until the lock it requests is released.
- Forcing a transaction to abort and restart.
A protocol, in the most general sense, is simply a restriction on the sequence of atomic steps that a transaction may perform, that is, a set of rules the transactions must follow. For example, the strategy of avoiding deadlock by requesting locks on items in some fixed order is a protocol.
Properties of transactions
1) Atomicity


Transaction management is an effort to make complex operations appear atomic. That is, an operation either happens in its entirety or does not happen at all. If it happens, no other event or transaction occurs during its lifetime. The usual way to guarantee the atomicity of transactions is serialization, which makes the transactions appear to execute one after another. A transaction fails to be atomic if:
- In a time-sharing system, the time slice of transaction T ends while T is still computing, and operations of another transaction are executed before T completes, or
- The transaction cannot complete. For example, it may have to stop midway because it performs an illegal computation or requests data it is not authorized to access. The database system itself may force the transaction to stop for several reasons, for example because the transaction is caught in a deadlock.
In practice, each transaction is a sequence of elementary steps, such as reading or writing a data item in the database and performing simple arithmetic in the workspace, or other primitive steps such as locking, unlocking and committing the transaction. We always assume that these primitive steps are atomic. Even the end of a time slice that occurs in the middle of a computation can be regarded as atomic, because it happens in the local workspace and nothing can affect the workspace until the transaction that was computing resumes.
2) Consistency
The consistency of a transaction is simply its correctness. In other words, a transaction is a correct program that maps the database from one consistent state to another. Verifying that transactions are consistent is the concern of semantic data control. The database may be temporarily inconsistent while a transaction is executing, but it must return to a consistent state when the transaction ends. Transaction consistency also concerns the behavior of concurrent transactions: we expect the database to remain consistent even when several user requests access it at the same time. Complications arise when replicated databases are considered.


If the database is replicated, all copies must be in the same state when the transaction ends. This is called one-copy equivalence, and the consistent state of the copies is called a mutually consistent state.
3) Isolation
Isolation is the property that requires every transaction to see a consistent database at all times. In other words, an executing transaction cannot reveal its results to other concurrent transactions before it commits. There are two reasons for insisting on isolation:
First, mutual consistency between transactions must be maintained. If two concurrent transactions access a data item that one of them is updating, there is no guarantee that the second transaction reads the correct value.
Second, isolation prevents cascading aborts. If a transaction allows others to read items it is modifying before it commits, and it then aborts, every transaction that read those values must abort as well. This imposes considerable overhead on the DBMS.
Isolation is directly related to database consistency and is therefore the subject of concurrency control.
4) Durability
Durability is the property that guarantees that once a transaction commits, its results are permanent and cannot be erased from the database. The DBMS therefore ensures that the results of a transaction survive system failures. This is why we insist that a transaction commit before it informs the user that it completed successfully. Durability raises the issue of database recovery, that is, how to restore the database to a consistent state in which all committed actions are reflected.
Serializability of schedules and its use
The purpose of a concurrency control protocol is to schedule the execution of transactions so that they do not interfere with one another. One simple solution is to allow only one transaction to execute at a time. The goal of a multi-user DBMS, however, is to maximize concurrency in the system while ensuring that concurrently executing transactions do not affect one another.


A schedule is an ordered sequence of the operations of a set of concurrent transactions in which the order of the operations within each transaction is preserved. This is a concurrency problem that concerns database designers rather than designers of general concurrent systems. Suppose we have a set of transactions S = {T1, T2, T3, ...}.
Serial schedule: If the transactions execute serially in some order, with the operations of each transaction executed consecutively and no operation of another transaction interleaved, conflicts certainly cannot occur and the database ends up with a well-defined result. We define a schedule for a set of transactions S as the order (possibly interleaved) in which the elementary steps of the transactions (lock, read, write, ...) are executed. The steps of a given transaction must appear in the schedule in the same order as they occur in that transaction.
Non-serial schedule: A schedule in which the operations of a set of concurrent transactions are interleaved.
Because a serial schedule always exists for a set of transactions S, we assume that the activity of concurrent transactions is correct if and only if its effect is the same as that of some serial schedule. A schedule is called serializable if its effect is the same as that of a serial schedule, and non-serializable if its effect differs from that of every serial schedule. The goal of the scheduler is to produce a serializable schedule for a given set of concurrent transactions.
In serialization, the order of read and write operations matters:
- If two operations only read a data item, they do not affect each other and their order is unimportant.
- If two operations read or write entirely different data items, they do not affect each other and their order is unimportant.
- If one operation writes a data item and another operation reads or writes the same item, their order is important.
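The three ordering rules above reduce to a single test on a pair of operations. The sketch below encodes that test. The `Op` record and the `conflicts` function are illustrative names, and restricting the test to operations of different transactions is an added assumption, since the order inside one transaction is fixed by the transaction itself.

```python
from typing import NamedTuple


class Op(NamedTuple):
    txn: str    # transaction name, e.g. "T1"
    kind: str   # "read" or "write"
    item: str   # data item name, e.g. "A"


def conflicts(a: Op, b: Op) -> bool:
    """Order matters only if both operations touch the same item and one writes it."""
    return (
        a.txn != b.txn                      # assumption: compare across transactions only
        and a.item == b.item                # different items never conflict
        and "write" in (a.kind, b.kind)     # two reads never conflict
    )


print(conflicts(Op("T1", "read", "A"), Op("T2", "read", "A")))    # False
print(conflicts(Op("T1", "write", "A"), Op("T2", "read", "B")))   # False
print(conflicts(Op("T1", "write", "B"), Op("T2", "read", "B")))   # True
```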


Consider the following schedules of two transactions T1 and T2.
(a) Serial schedule:
    T1: Read A; A := A - 10; Write A; Read B; B := B + 10; Write B
    T2: Read B; B := B - 20; Write B; Read C; C := C + 20; Write C
(b) Serializable schedule: an interleaving of the steps of T1 and T2.
(c) Non-serializable schedule: another interleaving of the same steps.
Figure 4.1. Some schedules
Figure 4.1 (a) is a serial schedule, Figure 4.1 (b) is a serializable schedule and Figure 4.1 (c) is a non-serializable schedule. In practice, because of simple algebraic properties, a schedule may be non-serializable and still produce the same result as a serial schedule.
Concurrency control techniques based on locking
Figure: Concurrency control algorithms. Pessimistic algorithms: locking, timestamp ordering, hybrid. Optimistic algorithms: locking, timestamp ordering.


Locks
A lock is a privilege that the lock manager grants to a transaction so that it can access a data item. Equivalently, a lock is a variable associated with a data item in the database that describes the state of the item with respect to the operations performed on it. The lock manager can also revoke a lock. At any time, a data item X is in one of three states:
- Read-locked (also called a shared lock): a transaction may read the item but may not update it.
- Write-locked (also called an exclusive lock): the transaction may both read and write the item.
- Unlocked.
Locks are used as follows:
+ Any transaction that needs to access a data item must first lock it. The transaction requests a read lock if it only needs to read the data, and a write lock if it needs to both read and write it.
+ If the item is not locked by any other transaction, the lock is granted as requested.
+ If the item is already locked, the DBMS determines whether the requested lock is compatible with the current lock. When a transaction requests a read lock on an item that already carries a read lock held by another transaction, the request is granted. If the requested lock is a write lock, the requesting transaction must wait until the current lock is released.
+ A transaction holds a lock until it releases it, either during its execution or when it commits or aborts. The result of a write becomes visible to other transactions only when the write lock is released.
Some systems also allow a transaction to take a read lock on an item and later upgrade it to a write lock. This lets a transaction examine the data first and then decide whether to update it. The lock manager stores the locks in a lock table.
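The granting rules above can be sketched as a small lock table. This is a simplified illustration, not the design of any particular DBMS: the class `LockTable`, its method names and the mode strings are hypothetical, and a request that cannot be granted simply returns False instead of being queued.

```python
class LockTable:
    """Minimal lock table: item -> (mode, set of holding transactions)."""

    def __init__(self):
        self.locks = {}

    def request(self, txn, item, mode):
        """Return True if the lock is granted, False if txn would have to wait."""
        if item not in self.locks:                    # unlocked: grant as requested
            self.locks[item] = (mode, {txn})
            return True
        held_mode, holders = self.locks[item]
        if mode == "read" and held_mode == "read":    # read locks are compatible
            holders.add(txn)
            return True
        if holders == {txn}:                          # sole holder: re-request or upgrade
            new_mode = "write" if "write" in (mode, held_mode) else "read"
            self.locks[item] = (new_mode, holders)
            return True
        return False                                  # incompatible: must wait

    def release(self, txn, item):
        """Release txn's lock on item and drop the entry when no holder remains."""
        _, holders = self.locks[item]
        holders.discard(txn)
        if not holders:
            del self.locks[item]


table = LockTable()
print(table.request("T1", "X", "read"))    # True
print(table.request("T2", "X", "read"))    # True, shared with T1
print(table.request("T3", "X", "write"))   # False, T3 must wait
table.release("T1", "X")
print(table.request("T2", "X", "write"))   # True, T2 upgrades its read lock
```

A real lock manager would also keep a queue of waiting requests per item, which is where the first-come, first-served policy against livelock would be applied.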


When concurrency is controlled with locks, the following situations can arise:
Livelock is the situation in which a transaction requests a lock on an item but never obtains it, because some other transaction always holds a lock on that item, even though there were several occasions on which the waiting transaction could have been granted the lock. Operating system designers have proposed many solutions to livelock. A simple first-come, first-served strategy is enough to eliminate it.
Deadlock is the situation in which every transaction in a set of two or more transactions is waiting for a lock on an item that is currently locked by another transaction in the same set.
Example: Suppose there are two concurrent transactions T1 and T2:
    T1: Lock A; Lock B; Unlock A; Unlock B;
    T2: Lock B; Lock A; Unlock B; Unlock A;

T1 and T2 may also perform some operations on A and B. Suppose T1 and T2 start at the same time. T1 requests and is granted a lock on A, while T2 requests and is granted a lock on B. When T1 then requests a lock on B, it must wait because T2 holds B. Likewise, when T2 requests a lock on A, it must wait because T1 holds A. As a result, neither transaction can proceed: each waits for the other to unlock an item, and neither ever obtains the lock it requested.
Deadlock can be handled in the following ways:
(i) Require each transaction to request all its locks at once. The lock manager grants all of them if possible, or grants none and makes the transaction wait if one or more of the requested locks are held by another transaction.
(ii) Assign a linear order to the items and require all transactions to request locks in that order.
(iii) Periodically examine the lock requests and detect whether a deadlock has occurred, using a wait-for graph whose nodes are transactions and whose arcs Ti → Tj indicate that Tj is waiting for a lock on an item held by Ti. If the graph contains a cycle, a deadlock has occurred. If it contains no cycle, there is no deadlock. When a deadlock is detected, the system forces one of the deadlocked transactions to restart, and the effects of that transaction on the database must be completely undone.
4.2. The basic lock model
A lock is an access privilege on a data item that the lock manager can grant to a transaction or revoke. In transaction models that use locks, there are not only read and write operations on items but also lock and unlock operations. Every locked item must subsequently be unlocked. For an item A, between a step Lock A and the next Unlock A of a transaction, the transaction is considered to hold a lock on A. The model rests on the following assumptions:
- A lock must be placed on an item before the item is read or written.
- Lock operations act as synchronization primitives: if a transaction tries to lock an item that is already locked by another transaction, it cannot operate on the item until the lock is released by an unlock command executed by the transaction holding it.
- Every transaction unlocks every item it has locked.
- A transaction does not request a lock on an item it already holds a lock on, and does not unlock an item it does not currently hold a lock on.
Schedules that obey these rules are called legal.
Example: Consider two concurrent transactions T1 and T2 that both access data item A. In this model they are:
    T1: Lock A; Read A; A := A + 1; Write A; Unlock A
    T2: Lock A; Read A; A := A + 1; Write A; Unlock A

If T1 starts before T2, it requests a lock on item A. Assuming no transaction holds a lock on A, the lock manager grants the lock, and from then on only T1 can access the item. If T2 starts before T1 finishes, then when T2 executes Lock A the system forces T2 to wait. Only when T1 executes Unlock A does the system allow T2 to proceed. As far as A is concerned, T1 therefore completes before T2 begins, and after both transactions the value of A has increased by 2 (for instance, to 32 if A was initially 30).


In this model, to test whether a schedule is serializable we examine the order in which the transactions lock a given item. This order must agree with the order in the equivalent serial schedule. In effect, the test amounts to checking whether a graph contains a cycle.
Algorithm 2.1: Testing the serializability of a schedule.
Input: A schedule S for a set of transactions T1, T2, ..., Tk.
Output: A determination of whether S is serializable and, if so, a serial schedule equivalent to S.
Method:
Step 1: Build a directed graph G (the serialization graph) whose nodes are the transactions. The arcs are determined as follows. Let S be a1, a2, ..., an, where each ai is an operation of the form Tj: Lock Am or Tj: Unlock Am, and Tj is the transaction that locks or unlocks item Am. If ai is Tj: Unlock Am, find the next action ap after ai that has the form Ts: Lock Am. If s ≠ j, draw an arc from Tj to Ts. This arc means that in the equivalent serial schedule Tj must precede Ts.
Step 2: If G has a cycle, S is not serializable. If G has no cycle, find a linear order of the transactions in which Ti precedes Tj whenever there is an arc Ti → Tj. This order is found by topological sorting: start from a node Ti with no incoming arcs (such a node always exists, otherwise G has a cycle), list Ti and remove it from G. Repeat until no nodes remain. The order in which the nodes are listed is a serial order of the transactions.
Example 2.3: Suppose we have the following schedule of three transactions T1, T2, T3:
    (1)  T1: Lock A
    (2)  T2: Lock B
    (3)  T2: Lock C
    (4)  T2: Unlock B
    (5)  T1: Lock B
    (6)  T1: Unlock A
    (7)  T2: Lock A
    (8)  T2: Unlock C
    (9)  T2: Unlock A
    (10) T3: Lock A
    (11) T3: Lock C
    (12) T1: Unlock B
    (13) T3: Unlock C
    (14) T3: Unlock A
Figure 2.4. Precedence graph of the transactions (nodes T1, T2, T3).

The graph has three nodes T1, T2 and T3. The arcs are constructed as follows:
At step (4) we have T2: Unlock B, and the next step that locks B is step (5), T1: Lock B. We therefore draw an arc T2 → T1.
At step (6) we have T1: Unlock A, and the next step that locks A is step (7), T2: Lock A. We therefore draw an arc T1 → T2.
At step (8) we have T2: Unlock C, and the next step that locks C is step (11), T3: Lock C. We therefore draw an arc T2 → T3.
At step (9) we have T2: Unlock A, and the next step that locks A is step (10), T3: Lock A. This again gives the arc T2 → T3.
The graph contains a cycle (between T1 and T2), so the given schedule is not serializable.
Example: A schedule of three transactions T1, T2, T3:
    (1) T2: Lock A
    (2) T2: Unlock A
    (3) T3: Lock A
    (4) T3: Unlock A
    (5) T1: Lock B
    (6) T1: Unlock B
    (7) T2: Lock B
    (8) T2: Unlock B
Figure 2.5. Serialization graph for the three transactions (nodes T1, T2, T3).
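Algorithm 2.1 can be sketched directly from its two steps and run on both example schedules. This is a minimal sketch assuming Python 3.9 or later, where the standard `graphlib` module supplies the topological sort. The function name `serial_order` and the tuple encoding of a schedule are illustrative choices.

```python
from graphlib import TopologicalSorter, CycleError


def serial_order(schedule):
    """schedule: list of (txn, action, item) with action in {"lock", "unlock"}.

    Returns an equivalent serial order of the transactions,
    or None if the schedule is not serializable.
    """
    preds = {txn: set() for txn, _, _ in schedule}    # node -> set of predecessors
    for i, (tj, action, item) in enumerate(schedule):
        if action != "unlock":
            continue
        # Step 1: find the next action that locks the same item.
        for ts, next_action, next_item in schedule[i + 1:]:
            if next_action == "lock" and next_item == item:
                if ts != tj:
                    preds[ts].add(tj)                 # arc Tj -> Ts
                break
    # Step 2: a cycle means non-serializable, otherwise sort topologically.
    try:
        return list(TopologicalSorter(preds).static_order())
    except CycleError:
        return None


example_2_3 = [
    ("T1", "lock", "A"), ("T2", "lock", "B"), ("T2", "lock", "C"),
    ("T2", "unlock", "B"), ("T1", "lock", "B"), ("T1", "unlock", "A"),
    ("T2", "lock", "A"), ("T2", "unlock", "C"), ("T2", "unlock", "A"),
    ("T3", "lock", "A"), ("T3", "lock", "C"), ("T1", "unlock", "B"),
    ("T3", "unlock", "C"), ("T3", "unlock", "A"),
]
second_example = [
    ("T2", "lock", "A"), ("T2", "unlock", "A"), ("T3", "lock", "A"),
    ("T3", "unlock", "A"), ("T1", "lock", "B"), ("T1", "unlock", "B"),
    ("T2", "lock", "B"), ("T2", "unlock", "B"),
]

print(serial_order(example_2_3))      # None: arcs T2 -> T1 and T1 -> T2 form a cycle
print(serial_order(second_example))   # ['T1', 'T2', 'T3']
```

For the second schedule the arcs are T2 → T3 (item A) and T1 → T2 (item B), so the only admissible serial order is T1, T2, T3.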

4.3. The read-lock and write-lock model
In the basic lock model we assumed that a transaction that locks an item may change it. In practice, a transaction often accesses an item only to read its value, without changing it. If we distinguish two kinds of access, read-only and read-write, we can allow some concurrent operations that the basic lock model forbids. We then distinguish two kinds of locks:
Read lock (or shared lock), denoted RLock: a transaction T that only wants to read an item A executes RLock A. This prevents any other transaction from writing a new value of A while T holds the lock, but other transactions may hold read locks on A at the same time as T.
Write lock, denoted WLock: this works as in the basic lock model. A transaction that wants to change the value of item A executes WLock A, after which no other transaction may obtain a read lock or a write lock on that item.
Both read locks and write locks are released with Unlock. In addition to the assumptions of the basic lock model, we assume that a transaction may request a write lock on an item on which it already holds a read lock.
Two schedules are equivalent if:
- They produce the same value for every item, and
- Each read lock applied by a transaction occurs in both schedules at times when the locked item has the same value.


Algorithm 2.2: Testing the serializability of schedules with read and write locks.
Input: A schedule S for a set of transactions T1, T2, ..., Tk.
Output: A determination of whether S is serializable and, if so, a serial schedule equivalent to S.
Method:
Step 1: Build a directed graph G (the serialization graph) whose nodes are the transactions. The arcs are determined by the following rules:
- Suppose that in S, Ti takes a read lock or a write lock on item A, Tj is the next transaction to take a write lock on A, and i ≠ j. Then place an arc Ti → Tj.
- Suppose that in S, Ti takes a write lock on A, and Tm takes a read lock on A after Ti unlocks A but before any other transaction takes a write lock on A, and i ≠ m. Then place an arc Ti → Tm.
Step 2: If G has a cycle, S is not serializable. If G has no cycle, any topological sort of G is a serial order of the transactions.
Example 2.5: A schedule of four transactions:
    (1)  T2: RLock A
    (2)  T3: RLock A
    (3)  T2: WLock B
    (4)  T2: Unlock A
    (5)  T3: WLock A
    (6)  T2: Unlock B
    (7)  T1: RLock B
    (8)  T3: Unlock A
    (9)  T4: RLock B
    (10) T1: RLock A
    (11) T4: Unlock B
    (12) T1: WLock C
    (13) T1: Unlock A
    (14) T4: WLock A
    (15) T4: Unlock A
    (16) T1: Unlock B
    (17) T1: Unlock C


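The serialization graph of this schedule is shown in Figure 2.6. Its nodes are the four transactions T1, T2, T3, T4. The arcs are determined as follows:
At step (4) T2 unlocks item A, and at step (5) T3 takes a write lock on A. T3 must follow T2, so there is an arc from T2 to T3.
At step (6) T2 unlocks item B. The following locks on B are read locks, taken by T1 at step (7) and by T4 at step (9). T1 and T4 must follow T2, so there are arcs from T2 to both of these nodes.
At step (8) T3 unlocks item A. T1 then takes a read lock on A at step (10) and T4 takes a write lock on A at step (14). T1 and T4 must follow T3, so there are arcs from T3 to both of these nodes.
At step (13) T1 unlocks item A, and at step (14) T4 takes a write lock on A. T4 must follow T1, so there is an arc from T1 to T4.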

Figure 2.6. Serialization graph of the schedule (nodes T1, T2, T3, T4).

A topological sort of the graph gives the transaction order T2, T3, T1, T4.
The two-phase protocol of the previous model also applies to this model: all read locks and write locks precede all unlock steps, which guarantees that the schedule is serializable. This model also yields the following rules for granting locks:
- A read lock on an item can be granted to a transaction if no other transaction holds a write lock on the item.
- A write lock on an item can be granted to a transaction only if no other transaction holds a read lock or a write lock on the item.
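The two arc rules of Algorithm 2.2 can be checked against Example 2.5 with a short program. This is a sketch under the same assumptions as before (Python 3.9 or later for `graphlib`, an illustrative tuple encoding of the schedule, and a hypothetical function name `rw_serial_order`).

```python
from graphlib import TopologicalSorter, CycleError


def rw_serial_order(schedule):
    """schedule: list of (txn, action, item), action in {"rlock", "wlock", "unlock"}.

    Returns an equivalent serial order, or None if the schedule is not serializable.
    """
    preds = {txn: set() for txn, _, _ in schedule}    # node -> set of predecessors
    for i, (ti, action, item) in enumerate(schedule):
        if action == "unlock":
            continue
        # Later lock steps on the same item, in schedule order.
        later = [(t, a) for t, a, it in schedule[i + 1:] if it == item and a != "unlock"]
        # Rule 1: Ti locks the item, Tj is the next to write-lock it.
        for tj, a in later:
            if a == "wlock":
                if tj != ti:
                    preds[tj].add(ti)                 # arc Ti -> Tj
                break
        # Rule 2: Ti write-locks the item, Tm read-locks it before
        # any other transaction write-locks it.
        if action == "wlock":
            for tm, a in later:
                if a == "wlock" and tm != ti:
                    break
                if a == "rlock" and tm != ti:
                    preds[tm].add(ti)                 # arc Ti -> Tm
    try:
        return list(TopologicalSorter(preds).static_order())
    except CycleError:
        return None


example_2_5 = [
    ("T2", "rlock", "A"), ("T3", "rlock", "A"), ("T2", "wlock", "B"),
    ("T2", "unlock", "A"), ("T3", "wlock", "A"), ("T2", "unlock", "B"),
    ("T1", "rlock", "B"), ("T3", "unlock", "A"), ("T4", "rlock", "B"),
    ("T1", "rlock", "A"), ("T4", "unlock", "B"), ("T1", "wlock", "C"),
    ("T1", "unlock", "A"), ("T4", "wlock", "A"), ("T4", "unlock", "A"),
    ("T1", "unlock", "B"), ("T1", "unlock", "C"),
]

print(rw_serial_order(example_2_5))   # ['T2', 'T3', 'T1', 'T4']
```

The arcs found are T2 → T3, T2 → T1, T2 → T4, T3 → T1, T3 → T4 and T1 → T4. T2 has no incoming arc and every other node has one from T2, so the serial order must begin with T2.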


4.4. Concurrency control algorithms based on timestamps
Besides the lock-based models described above, timestamps can be used to guarantee the serializability of schedules. The main idea is to assign each transaction a timestamp, which is the moment at which the transaction begins.
Establishing timestamps
If all transactions are assigned timestamps by the scheduler, the scheduler maintains a counter of the number of transactions scheduled so far. When a new transaction asks to be scheduled, the scheduler increments the counter by one and assigns its value to the requesting transaction. No two transactions can therefore have the same timestamp, and the relative order of the timestamps is the order in which the transactions were initiated.
Another way to assign timestamps is to use the value of the system clock at the moment the transaction begins.
When there are several schedulers, because the database system runs on a multiprocessor machine or is a distributed database system, a unique suffix must be appended to each timestamp. The suffix is the identifier of the corresponding processor. Synchronizing the counters or clocks used by the processors is then an important requirement for guaranteeing the serializability of schedules.
Guaranteeing serializability with timestamps
The rule for preserving the serial order of timestamps is as follows. Suppose a transaction with timestamp $t$ wants to perform an operation X on an item whose read time is $t_r$ and whose write time is $t_w$. Then:
a/ Perform the operation if X = Read and $t \ge t_w$, or if X = Write and $t \ge t_r$ and $t \ge t_w$. In the former case, set the read time to $t$ if $t > t_r$. In the latter case, set the write time to $t$ if $t > t_w$.
b/ Do nothing if X = Write and $t_r \le t < t_w$.
c/ Abort the transaction if X = Read and $t < t_w$, or if X = Write and $t < t_r$.


Example: Consider the schedule of Example 2.5, in which RLock is treated as Read, WLock is treated as Write, and the Unlock steps are dropped. The transactions T1, T2, T3, T4 have timestamps 100, 200, 300 and 400 respectively. RT and WT denote the read time and the write time of an item.
At step (1), T2 (timestamp $t = 200$) reads A (WT = 0), so $t \ge t_w$. The operation is allowed, and because $t > t_r$ (which is 0), RT of A is set to 200.
Similarly, at step (2), T3 (timestamp $t = 300$) reads A (WT = 0), so $t \ge t_w$. The operation is allowed, and because $t > t_r$ (which is 200), RT of A is set to 300.
At step (3), T2 (timestamp $t = 200$) writes B (RT = 0 and WT = 0), so $t \ge t_r$ and $t \ge t_w$. The operation is allowed, and because $t > t_w$ (which is 0), WT of B is set to 200.
At step (4), T3 (timestamp $t = 300$) writes A (RT = 300 and WT = 0), so $t \ge t_r$ and $t \ge t_w$. The operation is allowed, and because $t > t_w$ (which is 0), WT of A is set to 300.
At step (5), T1 (timestamp $t = 100$) reads B (WT = 200), so $t < t_w$. T1 is therefore aborted.

    Step     T1 (100)     T2 (200)   T3 (300)   T4 (400)   A            B            C
    initial                                                RT=0, WT=0   RT=0, WT=0   RT=0, WT=0
    (1)                   Read A                           RT=200
    (2)                              Read A                RT=300
    (3)                   Write B                                       WT=200
    (4)                              Write A               WT=300
    (5)      Read B
             T1 aborted
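Rules a/, b/ and c/ can be replayed on the five steps of the example. The sketch below is an illustration, not a full scheduler: the function `timestamp_schedule` and its log format are hypothetical, read and write times default to 0, and an aborted transaction is simply skipped rather than restarted.

```python
def timestamp_schedule(ops, ts):
    """ops: list of (txn, "read" | "write", item); ts: txn -> timestamp.

    Returns (log, read_times, write_times) after applying rules a/, b/, c/.
    """
    rt, wt = {}, {}                  # read time and write time per item (default 0)
    aborted, log = set(), []
    for txn, kind, item in ops:
        if txn in aborted:
            continue                 # an aborted transaction performs no further steps
        t, tr, tw = ts[txn], rt.get(item, 0), wt.get(item, 0)
        if kind == "read":
            if t >= tw:                              # rule a/, read case
                rt[item] = max(tr, t)
                log.append((txn, kind, item, "performed"))
            else:                                    # rule c/: t < tw
                aborted.add(txn)
                log.append((txn, kind, item, "aborted"))
        else:
            if t >= tr and t >= tw:                  # rule a/, write case
                wt[item] = t
                log.append((txn, kind, item, "performed"))
            elif tr <= t < tw:                       # rule b/: skip the write
                log.append((txn, kind, item, "ignored"))
            else:                                    # rule c/: t < tr
                aborted.add(txn)
                log.append((txn, kind, item, "aborted"))
    return log, rt, wt


ts = {"T1": 100, "T2": 200, "T3": 300, "T4": 400}
ops = [("T2", "read", "A"), ("T3", "read", "A"), ("T2", "write", "B"),
       ("T3", "write", "A"), ("T1", "read", "B")]

log, rt, wt = timestamp_schedule(ops, ts)
print(log[-1])   # ('T1', 'read', 'B', 'aborted')
print(rt, wt)    # {'A': 300} {'B': 200, 'A': 300}
```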

While a transaction is executing, the database may be temporarily inconsistent, but it must be consistent when the transaction ends. Reliability rests on two capabilities:
+ Resiliency of the system when various kinds of failures occur (when failures occur, the system tolerates them and continues to provide services).


+ Recovery: reaching a consistent state after a failure, either by returning to the previous consistent state or by moving forward to a new consistent state.
Transaction consistency concerns the execution of concurrent accesses. Transaction management deals with the problem of always keeping the database in a consistent state in the presence of concurrent accesses and failures.


PART 1. DEDUCTIVE DATABASES


2.1. General introduction
- The notion of a deductive database has been addressed by many researchers, building on the results that Green obtained in 1969 on question-answering systems.
- From a theoretical standpoint, deductive databases can be regarded as logic programs that generalize the notion of a relational database. This is the approach of Brodie and Manola in 1989, Codd in 1970, Date in 1986, Gardarin and Valduriez in 1989, and Ullman in 1984.
- Logic programming grew out of earlier work on mechanical theorem proving. In fact, theorem proving is the basis of most current logic programming systems. The basic idea of logic programming is to use mathematical logic as a programming language. This idea was put forward in Kowalski's work in 1970 and put into practice by Colmerauer in 1975 in the implementation of the first logic programming language, PROLOG (PROgramming in LOGic). In his formalization, Kowalski considered a subset of first-order logic called Horn clause logic. A clause in this logic may have many conditions but has at most one conclusion.
- For practical purposes, deductive databases handle clauses that are less complex than those of logic programming systems. The number of rules, that is, clauses with non-empty conditions, in a deductive database is smaller than the number of facts, that is, clauses with empty conditions.
- Another difference between deductive databases and logic programming is that logic programming systems emphasize functionality, whereas deductive databases emphasize efficiency. The inference mechanism used in deductive databases to compute answers is less general than the one used in logic programming.
- Besides using logic to express the clauses of the database, logic is also used to express queries and integrity constraints.
2.2. Deductive databases
2.2.1. The deductive database model
A data model consists of:


+ A mathematical notation for formally describing data and relationships, and
+ Techniques for manipulating data, such as answering queries and checking integrity constraints.
A first-order language is used as the mathematical notation for describing data in the deductive database model, and data is manipulated in such models by evaluating logical formulas. First-order logic is thus the theoretical foundation of deductive database systems. To express the concepts of deductive databases formally, one usually uses the predicate calculus, that is, first-order predicate logic. First-order predicate logic is a formal language for expressing relationships among objects and for deducing new relationships.
Definition 1: Every constant, every variable, and every function applied to terms is a term. For an n-ary function $f$, if $x_i$ ($i = 1, 2, \dots, n$) are terms, then $f(x_1, x_2, \dots, x_n)$ is a term.
Definition 2: An atomic formula (the smallest formula) is the result of applying a predicate to terms, in the form $P(t_1, t_2, \dots, t_n)$, where $P$ is an n-ary predicate and $t_i$ ($i = 1, 2, \dots, n$) are terms.
Definition 3: A literal is an atomic formula or the negation of an atomic formula. A sequence of atomic formulas or negated atomic formulas combined with the logical connectives ($\neg$, $\land$, $\lor$, $\rightarrow$, $\leftrightarrow$) is a well-formed formula:
(i) An atomic formula is a well-formed formula.
(ii) If $F$ and $G$ are well-formed formulas, then $F \land G$, $F \lor G$, $F \rightarrow G$, $F \leftrightarrow G$, $\neg F$ and $\neg G$ are also well-formed formulas.
(iii) If $F$ is a well-formed formula and $x$ is a free variable in $F$, then $(\forall x)F$ and $(\exists x)F$ are also well-formed formulas.
Example 1: A relation $R(A_1, A_2, \dots, A_n)$ of degree n (that is, with n attributes) is an n-ary predicate. If $r \in R$ ($r$ is a tuple of $R$), then assigning $(r.A_1, r.A_2, \dots, r.A_n)$ makes $R(A_1, A_2, \dots, A_n)$ true. If $r \notin R$, then assigning $(r.A_1, r.A_2, \dots, r.A_n)$ makes $R(A_1, A_2, \dots, A_n)$ false.
Definition 4: Clause


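A clause is a formula of the form
$$P_1 \land P_2 \land \dots \land P_n \rightarrow Q_1 \lor Q_2 \lor \dots \lor Q_m$$
where $P_i$ and $Q_j$ ($i = 1, 2, \dots, n$; $j = 1, 2, \dots, m$) are positive literals. In a logic system, a positive literal is an atom, as opposed to a negative literal, which is the negation of an atom.
Definition 5: A Horn clause is a clause of the form
$$P_1 \land P_2 \land \dots \land P_n \rightarrow Q_1$$
Definition 6: General deductive database
A general deductive database (also called a general database or simply a deductive database) is defined as a pair (D, L), where D is a finite set of database clauses and L is a first-order language. L is assumed to contain at least two symbols, one constant symbol and one predicate symbol.
+ A normal database is a deductive database (D, L) in which D contains only normal clauses, and a definite database is one in which D contains only definite clauses.
+ A relational database is a deductive database (D, L) in which D contains only definite facts.
A relational database is therefore a special case of a definite, normal or general database, and a definite database is a special case of a normal or general database.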

2.2.2. Model theory for relational databases


2.2.2.1. Databases from the viewpoint of logic
A database can be viewed from the standpoint of logic either as a first-order theory or as an interpretation of a first-order theory. Under the interpretation view, queries and integrity constraints are formulas to be evaluated using the semantic definition. Under the theory view, queries are regarded as theorems to be proved, or as formulas that follow from the theory. These two approaches are referred to as the model-theoretic view (or relational-structure view) and the proof-theoretic view. The two views have been formalized into the corresponding notions of conventional database and deductive database. The idea behind the proof-theoretic view of a database (D, L) is to:


(i) Construct a theory T, called the proof theory of (D, L), using the clauses of D and the language L, and
(ii) Answer the queries posed to the database.

2.2.2.2. Relational databases revisited
Here we consider the class of relational databases, that is, databases built from ground facts, with languages that contain no function symbols. The following assumptions are made about this class in order to evaluate queries:
1) Closed world assumption (CWA): information that is not asserted to be true in the database is taken to be false. That is, $\neg R(a_1, a_2, \dots, a_n)$ is considered true exactly when the fact $R(a_1, a_2, \dots, a_n)$ does not appear in the database.
Example: Consider the following database:
    Hoc_sinh(Xuân)
    Sinh_vien(Đông)
    Nghiên_cứu(Đông)
    Thich(Xuân, Toán)
Under the CWA, $\neg$Thich(Đông, Toán) is assumed to be true, that is, Đông does not like Toán (mathematics).
2) Unique name assumption (UNA): constants with different names are considered to be different. In the example above, the two constants Xuân and Đông name two different students.
3) Domain closure assumption (DCA): there are no constants other than the constants in the language of the database. In the example above, Triết is not a valid constant.
Let (D, L) be a relational database, with the restriction that L contains no function symbols. The database can then be regarded as an interpretation of a first-order theory that consists of the language L and the variables of L, which range over the domain of this interpretation. The evaluation of logical formulas in this interpretation is based on the following: $R(a_1, a_2, \dots, a_n)$ is true if and only if $R(a_1, a_2, \dots, a_n) \in D$.
The axioms of the theory T: the proof-theoretic view of a relational database is obtained by constructing a theory T in the language L.


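T1. Assertions: for each fact $R(a_1, a_2, \dots, a_n) \in D$, $R(a_1, a_2, \dots, a_n)$ is asserted.
T2. Completion axioms: for each relation symbol $R$, if $R(a_{11}, a_{21}, \dots, a_{n1})$, $R(a_{12}, a_{22}, \dots, a_{n2})$, ..., $R(a_{1m}, a_{2m}, \dots, a_{nm})$ denote the facts of $R$, then the completion axiom for $R$ is:
$$\forall x_1 \forall x_2 \dots \forall x_n \, \bigl( R(x_1, x_2, \dots, x_n) \rightarrow (x_1 = a_{11} \land x_2 = a_{21} \land \dots \land x_n = a_{n1}) \lor (x_1 = a_{12} \land x_2 = a_{22} \land \dots \land x_n = a_{n2}) \lor \dots \lor (x_1 = a_{1m} \land x_2 = a_{2m} \land \dots \land x_n = a_{nm}) \bigr)$$
T3. Unique name axioms: if $a_1, a_2, \dots, a_p$ are all the constant symbols of L, then
$$(a_1 \ne a_2), (a_1 \ne a_3), \dots, (a_1 \ne a_p), (a_2 \ne a_3), (a_2 \ne a_4), \dots, (a_{p-1} \ne a_p)$$
T4. Domain closure axiom: if $a_1, a_2, \dots, a_p$ are the constant symbols of L, then
$$\forall x \, \bigl( (x = a_1) \lor (x = a_2) \lor \dots \lor (x = a_p) \bigr)$$
T5. Equality axioms:
1. $\forall x \, (x = x)$
2. $\forall x \forall y \, ((x = y) \rightarrow (y = x))$
3. $\forall x \forall y \forall z \, ((x = y) \land (y = z) \rightarrow (x = z))$
4. $\forall x_1 \dots \forall x_n \forall y_1 \dots \forall y_n \, (P(x_1, x_2, \dots, x_n) \land (x_1 = y_1) \land (x_2 = y_2) \land \dots \land (x_n = y_n) \rightarrow P(y_1, y_2, \dots, y_n))$
2.2.3. Deductive databases from the viewpoint of logic
Here only the proof-theoretic view is applied to deductive databases. The language L of the database (D, L) is built only from the symbols that appear in D, and any procedural semantics from the context of logic programs can be used as a tool to find answers by deduction from the proof theory T. The theory T guarantees that the semantics of D agrees with the semantics of T. For deductive databases, Comp(D) is taken as the proof theory of the database (D, L), and SLDNF resolution is used to find the answers to queries.
Suppose (D, L) is a normal database. As in the case of relational databases, the proof-theoretic view of D is obtained by constructing a theory T in the language L. The axioms of T are as follows: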


1) Completion axioms: the axioms obtained by completing each predicate symbol of L with respect to the clauses in D.
2) Unique name and equality axioms: the axioms of the equality theory depend on the constant, function and predicate symbols of L.
3) Domain closure axiom: if $a_1, a_2, \dots, a_p$ are all the constants of L and $f_1, f_2, \dots, f_q$ are the function symbols of L, then the domain closure axiom, following Lloyd (1987) and Mancarella (1988), is:
$$\forall x \, \bigl( (x = a_1) \lor \dots \lor (x = a_p) \lor (\exists x_1 \exists x_2 \dots \exists x_m \, (x = f_1(x_1, x_2, \dots, x_m))) \lor \dots \lor (\exists y_1 \exists y_2 \dots \exists y_n \, (x = f_q(y_1, y_2, \dots, y_n))) \bigr)$$
2.2.4. Transactions on deductive databases
Definition 1: Transaction
A transaction in a deductive database is a finite string of operations, that is, of actions that add, delete or update clauses. Because a deductive database is regarded as a set of clauses, from the model-theoretic viewpoint no delete or update operation may be performed on facts. The facts are implicit in the database.
Definition 2: Commit
A transaction is said to commit successfully if the entire string of operations produces the intended result of the transaction. The main reasons why a transaction may fail to complete successfully are a violation of an integrity constraint while its operations are being executed, a system failure, or a non-terminating computation.
2.3. Logic-based databases
This section studies logic-based databases, specifically DATALOG programs. DATALOG is a nonprocedural language based on first-order predicate logic. It is used to describe the information required by means of logic, rather than by the usual procedural way of retrieving it.
2.3.1. Syntax
+ Symbols:
+ Comparison predicate: <attribute name> $\theta$ <value>


(a variable) compared with (a value), where θ ∈ {<, >, <=, >=, =, <>}
+ Representing rules (clauses): Q ← P1, P2, ..., Pn
  "," means AND (∧)
  ";" means OR (∨)
  "←" means implication
  Pi: the premises, hypotheses, subgoals or predicates
  Q: the conclusion, or a fact
+ If n = 0: Q ← . These are the facts stored in the database.
+ If Q itself occurs among P1, P2, ..., Pn, the rule is recursive (the same predicate appears in the body and in the head).

2.3.2. Semantics
The semantics is the set of all facts that can be derived from the DATALOG program.
Example (Cham: parent, B: father, M: mother, ngb: grandparent, TTin: ancestor):
(r1) Cham(x,y) ← B(x,y)
(r2) Cham(x,y) ← M(x,y)
(r3) ngb(x,y) ← Cham(x,z), Cham(z,y)
(r4) B(x,y) ←
(r5) M(x,y) ←
(r6) TTin(x,y) ← Cham(x,y)
(r7) TTin(x,y) ← Cham(x,z), TTin(z,y)

2.3.3. Basic structure
A DATALOG database consists of two kinds of relations:
+ Base relations, which are stored in the database as they stand. This part is called the extensional database, EDB.
+ Derived relations, which need not be stored in the database. They serve as temporary relations that hold intermediate results while a query is answered. These relations form the intensional database, IDB.
Each relation has a name and a number of columns. Unlike relational algebra, the attributes of a DATALOG relation have no explicit names. Instead, each attribute is identified by its position in the relation.


A DATALOG program is a finite set of rules over base relations and derived relations. Before giving the formal definition, consider the following example.
+ A rule about a bank:
Ca(Y, X) ← Guitien("Hà Nội", X, Y, Z), Z > 1200
The base relation in this rule is Guitien (deposits) and the derived relation is Ca. The rule extracts the pairs <customer name, account> of all customers who hold an account at the Hà Nội branch with a balance greater than 1200.
+ The same rule can be written as an equivalent domain calculus expression, whose result is added to the new derived relation Ca:
{ <Y, X> | ∃W, Z (<W, X, Y, Z> ∈ Guitien ∧ W = "Hà Nội" ∧ Z > 1200) }
This leads to the following definitions:
1) Rules are built from literals of the form P(A1, A2, ..., An), where:
+ P is the name of a base relation or a derived relation.
+ Each Ai (i = 1, 2, ..., n) is a constant or a variable name.
2) A DATALOG rule has the form:
P(X1, X2, ..., Xn) ← Q1(X11, X12, ..., X1,m1), Q2(X21, X22, ..., X2,m2), ..., Qr(Xr1, Xr2, ..., Xr,mr), e
where:
+ P is the name of a derived relation.
+ Each Qi is the name of a base relation or a derived relation.
+ e is an arithmetic predicate expression over the variables that occur in P and in the Qi (every variable that occurs in P also occurs in some Qi).
The literal P(X1, X2, ..., Xn) is called the head of the rule; the rest is called the body.
To understand precisely how a Datalog rule is interpreted, we define rule substitution and rule instantiation.
Definition 7: Rule substitution
A substitution applied to a rule replaces each variable in the rule by a variable or a constant.


That is, if a variable occurs several times in a rule, every occurrence must be replaced by the same variable or the same constant.
Example: a substitution for the rule in the example above, in which the variable Z is replaced by W and the other variables are replaced by constants:
Ca("M", 123) ← Guitien("Hà Nội", 123, "M", W), W > 1200
However, replacing X by the constant 123 in one place and by 333 in another is not allowed:
Ca("M", 123) ← Guitien("Hà Nội", 333, "M", W), W > 1200 => invalid
Definition 8: Rule instantiation
An instantiation of a rule is a valid substitution that replaces the variables by constants. Each valid substitution of this kind yields one instantiation of the rule.
Example: Ca("M", 123) ← Guitien("Hà Nội", 123, "M", 1500), 1500 > 1200
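The consistency requirement on substitutions can be illustrated with a small Python sketch. The encoding is an assumption made for illustration only and does not come from the original text: a literal is a pair (predicate name, argument tuple), variables are instances of a hypothetical Var class, and a substitution is a dictionary from variables to terms. Representing the substitution as a dictionary makes it impossible to replace one variable by two different values.

```python
from dataclasses import dataclass

HANOI = "Hà Nội"


@dataclass(frozen=True)
class Var:
    """A Datalog variable (hypothetical helper type, not from the text)."""
    name: str


def substitute(literal, mapping):
    """Apply a substitution (dict Var -> term) to a single literal."""
    pred, args = literal
    new_args = tuple(mapping.get(a, a) if isinstance(a, Var) else a for a in args)
    return (pred, new_args)


def substitute_rule(head, body, mapping):
    """Apply the same substitution to the head and to every body literal."""
    return substitute(head, mapping), [substitute(lit, mapping) for lit in body]


def is_instantiation(head, body):
    """A rule is instantiated when no variable is left anywhere in it."""
    return not any(isinstance(a, Var) for _, args in [head, *body] for a in args)


X, Y, Z = Var("X"), Var("Y"), Var("Z")
# Ca(Y, X) <- Guitien("Hà Nội", X, Y, Z), Z > 1200
head = ("Ca", (Y, X))
body = [("Guitien", (HANOI, X, Y, Z)), (">", (Z, 1200))]

inst_head, inst_body = substitute_rule(head, body, {X: 123, Y: "M", Z: 1500})
print(inst_head)                             # ('Ca', ('M', 123))
print(is_instantiation(inst_head, inst_body))  # True
```

Because X is mapped once, the account number 123 appears both in the head and in the Guitien literal, which is exactly what the invalid example above violates.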

A given rule may have many valid instantiations. To see how Datalog interprets a rule, consider an instantiation of the rule
P(X1, X2, ..., Xn) ← Q1(X11, X12, ..., X1,m1), Q2(X21, X22, ..., X2,m2), ..., Qr(Xr1, Xr2, ..., Xr,mr), e
in which each variable Xij has been replaced by a constant Cij. P is true if the expressions
Q1(C11, C12, ..., C1,m1)
Q2(C21, C22, ..., C2,m2)
...
Qr(Cr1, Cr2, ..., Cr,mr)
e
are all true. A literal Qi(Ci1, Ci2, ..., Ci,mi) is true if the tuple (Ci1, Ci2, ..., Ci,mi) is in the relation Qi.
Example: for the rule:
Ca(Y, X) ← Guitien("Hà Nội", X, Y, Z), Z > 1200

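Ca(Y, X), for instance with Y = "M" and X = 123, is true when there is a constant C1 that satisfies both conditions:
+ C1 > 1200
+ the tuple ("Hà Nội", 123, "M", C1) is in the relation Guitien.
Definition 9: Deductive DBMS
A deductive DBMS is a DBMS that allows the tuples of intensional predicates to be deduced by means of logical rules.
The functions of a deductive DBMS are described as follows:

[Figure: functions of a deductive DBMS. Labels: Query; Intensional predicates; Datalog rules; Update; Base predicates]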


A deductive database is built on base relations and derived relations. The DBMS is called deductive because it allows information to be inferred from the stored data by logical deduction. This information consists of the intensional predicates, and it is obtained when one queries the intensional predicates or updates the base predicates.
Definition 10: Datalog query
A query on a deductive database consists of:
+ A Datalog program, that is, a finite and possibly empty set of rules.
+ A single literal of the form P(x1, x2, ..., xn)? where each xi (i = 1, 2, ..., n) is a constant or a variable name.
To answer a query, the Datalog program, if there is one, is computed first. Then P(x1, x2, ..., xn) is evaluated. This step is similar to a selection on the relation P under the matching constraints.
Example 1: Find all tuples of the relation Vay (loans) at the Hà Nội branch:
Vay("Hà Nội", X, Y, Z)?
This query has no Datalog program.
Example 2: Compute the set of customers of the Hà Nội branch who have an account with a balance above 1200. The Datalog program consists of a single rule:
C(Y) ← Guitien("Hà Nội", X, Y, Z), Z > 1200
C(Y)?
The literal C(Y)? is redundant, because it only names the relation to be displayed. To avoid this redundancy a shorthand can be used, provided it causes no confusion. Given the literal P(x1, x2, ..., xn), the Datalog program is required to contain one distinguished rule:
Hi(x1, x2, ..., xn) ← ...
where each xi (i = 1, 2, ..., n) is a variable name. The query literal is then understood to be:
Hi(x1, x2, ..., xn)?
This literal is part of the query. The following query is therefore equivalent to the one above:
Hi(Y) ← Guitien("Hà Nội", X, Y, Z), Z > 1200
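As a rough illustration of how the single-rule query Hi(Y) is evaluated, the rule can be read as a filter over the tuples of the base relation. The sketch below is in Python; the contents of guitien and the function name query_hi are made up for the example and are not part of the original text.

```python
HANOI = "Hà Nội"

# Hypothetical contents of the base relation
# Guitien(branch, account, customer, balance).
guitien = {
    (HANOI, 123, "M", 1500),
    (HANOI, 124, "Hoa", 900),
    (HANOI, 125, "Mai", 1201),
    ("Hải Phòng", 200, "Lan", 5000),
}


def query_hi(deposits):
    """Hi(Y) <- Guitien("Hà Nội", X, Y, Z), Z > 1200"""
    # Each tuple that matches the constant "Hà Nội" and satisfies Z > 1200
    # is one valid instantiation of the rule; Y is kept, X and Z are dropped.
    return {y for (branch, x, y, z) in deposits if branch == HANOI and z > 1200}


print(sorted(query_hi(guitien)))  # ['M', 'Mai']
```

The result is a set, so a customer with several qualifying accounts would still appear only once.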

2.3.4. Query structure
A rule graph is used to present the structure of a query.
Definition 11: Rule graph
A rule graph for a query q is a directed graph in which:
+ The nodes correspond to the set of literal symbols that appear in the rules of q.
+ The arcs correspond to the precedence relation between a literal in the body of a rule and the literal in the head of that rule. The graph therefore has an arc ai → aj whenever a rule of the form aj ← ..., ai, ... appears in the query.
Note: This construction ignores the variables and constants that appear in the rules of the query. The only information used is the set of literal symbols and the relationships among them given by the rules.
Example 1: Consider the query:
p1(X, Y, Z) ← q1(X, Y), q2(X, Z), q3(Y, Z)
p2(A, B) ← p1(A, B), q4(B, A)
Hi(B) ← p2(A, B), p3(B, A)

The graph for this query is:

[Figure: rule graph with arcs q1 → p1, q2 → p1, q3 → p1, p1 → p2, q4 → p2, p2 → Hi, p3 → Hi]

The graph above has no cycle; a query with such a graph is called a non-recursive query.
Example 2: Consider the query:
p1(A, B, C) ← q1(A, B), p2(B, C)
p2(X, Y) ← q2(X), p1(X, Y, Z)
Hi(A, B) ← p1(A, B, C), p2(B, C)
The graph for this query is:

[Figure: rule graph with arcs q1 → p1, p2 → p1, q2 → p2, p1 → p2, p1 → Hi, p2 → Hi]

This graph contains a cycle (p1 → p2 → p1); a query with such a graph is called a recursive query.
Conclusions:
+ Building the structure of a query makes the query easier to evaluate.
+ Recursive and non-recursive queries differ in many respects as kinds of database query. In practice, evaluating a recursive query is more complex than evaluating an ordinary query.
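The test for recursion described above (build the rule graph, then look for a cycle) can be sketched in Python. This is an illustration under assumptions that are not in the original text: each rule is reduced to a pair (head symbol, list of body symbols), and the names rule_graph and has_cycle are hypothetical.

```python
def rule_graph(rules):
    """Build the rule graph: one arc body_symbol -> head_symbol per body literal.

    rules is a list of (head_symbol, [body_symbols]); variables and constants
    are ignored, as in the definition of the rule graph.
    """
    graph = {}
    for head, body in rules:
        graph.setdefault(head, set())
        for symbol in body:
            graph.setdefault(symbol, set()).add(head)
    return graph


def has_cycle(graph):
    """Depth-first search with three colours; a grey successor means a cycle."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {node: WHITE for node in graph}

    def visit(node):
        colour[node] = GREY
        for succ in graph[node]:
            if colour[succ] == GREY:
                return True
            if colour[succ] == WHITE and visit(succ):
                return True
        colour[node] = BLACK
        return False

    return any(colour[node] == WHITE and visit(node) for node in graph)


example1 = [("p1", ["q1", "q2", "q3"]), ("p2", ["p1", "q4"]), ("Hi", ["p2", "p3"])]
example2 = [("p1", ["q1", "p2"]), ("p2", ["q2", "p1"]), ("Hi", ["p1", "p2"])]

print(has_cycle(rule_graph(example1)))  # False: non-recursive query
print(has_cycle(rule_graph(example2)))  # True: p1 and p2 depend on each other
```

A rule whose head symbol also occurs in its own body produces a self-loop, which the same test reports as a cycle.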

2.3.5. Comparing DATALOG with relational algebra

Basically, Datalog restricted to non-recursive queries is regarded as equivalent to relational algebra in expressive power.
With recursive queries, Datalog is a more powerful tool than the relational languages: it can express queries that cannot be expressed in relational algebra.

(1) Union: a set of rules that have the same head.
Hi(X1, X2, ..., Xn) ← r1(X1, X2, ..., Xn)
Hi(Y1, Y2, ..., Yn) ← r2(Y1, Y2, ..., Yn)
Example 1:
(r1) Cham(x,y) ← B(x,y)
(r2) Cham(x,y) ← M(x,y)
Example 2: To find the names of the customers of the Hà Nội branch:
Hi(Y) ← Vay("Hà Nội", X, Y, Z)
Hi(B) ← Guitien("Hà Nội", A, B, C)
Note: the two rules that express a union are separate rules.

(2) Selection: corresponds to a rule whose body contains a comparison predicate, which plays the role of the selection condition.
A selection of tuples from a relation r is written as the query:
r(x1, x2, ..., xn)?
where each xi (i = 1, 2, ..., n) is a variable name or a constant.
Example 1: Cham(x,y) ← Cham(x,y), y = "Dng"
This corresponds to σ_{y = "Dng"}(Cham(x,y)), a selection with the condition y = "Dng".
Example 2: Select the names of the customers who have borrowed more than 1000:
Hi(Y) ← Vay("Hà Nội", X, Y, Z), Z > 1000

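(3) Projection: corresponds to a rule in which some variables of the body do not appear in the head.
Cha(x) = KQ(x) ← Cham(x,y), y = "Dng"
The variable y occurs in the body but not in the head, so the result keeps only the x column.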

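(4) Join: corresponds to a rule in which the predicates of the body share variables.
Combining two relations r1 and r2 is written in Datalog as:
Hi(X1, X2, ..., Xn, Y1, Y2, ..., Ym) ← r1(X1, X2, ..., Xn), r2(Y1, Y2, ..., Ym)
where Xi, Yj (i = 1, 2, ..., n and j = 1, 2, ..., m) are pairwise distinct variable names. With all variables distinct this rule produces the Cartesian product of r1 and r2; a join arises when the body literals share a variable, as in the example below.
Example 1: (r3) ngb(x,y) ← Cham(x,z), Cham(z,y)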
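The correspondence between the four non-recursive rule shapes above and the relational operators can be made concrete with Python sets. The sample tuples below are invented for the illustration, and the names b_rel, m_rel, selected, kq and ngb are only labels for this sketch.

```python
# Hypothetical sample data for the base relations B(x, y) and M(x, y).
b_rel = {("a", "b"), ("b", "Dng")}
m_rel = {("c", "Dng")}

# (1) Union: Cham(x,y) <- B(x,y) and Cham(x,y) <- M(x,y)
cham = b_rel | m_rel

# (2) Selection: the tuples of Cham that satisfy y = "Dng"
selected = {(x, y) for (x, y) in cham if y == "Dng"}

# (3) Projection: KQ(x) <- Cham(x,y), y = "Dng"  (y is dropped from the head)
kq = {x for (x, y) in cham if y == "Dng"}

# (4) Join on the shared variable z: ngb(x,y) <- Cham(x,z), Cham(z,y)
ngb = {(x, y) for (x, z1) in cham for (z2, y) in cham if z1 == z2}

print(sorted(cham))      # [('a', 'b'), ('b', 'Dng'), ('c', 'Dng')]
print(sorted(selected))  # [('b', 'Dng'), ('c', 'Dng')]
print(sorted(kq))        # ['b', 'c']
print(sorted(ngb))       # [('a', 'Dng')]
```

In the join, the shared variable z becomes the equality test z1 == z2; without that test the comprehension would compute the Cartesian product.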

(5) Recursion:

Example 1: as in rule (r7):
(r7) TTin(x,y) ← Cham(x,z), TTin(z,y)
Example 2: Suppose we have the relation schema Quanly(employee name, manager name), which records who manages whom. Suppose Quanly is the following relation over this schema:

Employee name    Manager name
M                M
Hoa              M
Mai              M
Lan              M
Chn              Hoa
Tch              Hoa

Requests:
1) Find the names of the people who work directly under Mr. M, that is, his level-1 subordinates:
Hi(X) ← Quanly(X, "M")
2) Find the names of the people who work directly under someone managed by Mr. M, that is, his level-2 subordinates:
Hi(X) ← Quanly(X, Y), Quanly(Y, "M")
Continuing in this way, relational algebra cannot express the request to find the subordinates of Mr. M at an arbitrary level n. In particular, the query that finds the names of all employees working under Mr. M, directly or indirectly, cannot be written in relational algebra or in Datalog with non-recursive queries. The reason is that the number of management levels below Mr. M is not known in advance. The query can, however, be written in Datalog as a recursive query:
e(X) ← Quanly(X, "M")
e(X) ← Quanly(X, Y), e(Y)
Hi(X) ← e(X)
Notes:
a) Method 1: A recursive query can also be turned into a non-recursive one by using a Pascal-like language with a finite number of iterations. The iteration is expressed by a Repeat statement. The condition in the Until clause is a test on sets, such as equality, containment or emptiness; in the Until clause the derived relations are treated as sets. The recursive query above can therefore be rewritten as:
e(X) ← Quanly(X, "M")
Repeat
  e'(X) ← e(X)
  e(X) ← Quanly(X, Y), e(Y)
Until e = e'
Description:

- The first rule finds the employees whom Mr. M manages directly. Once it has been evaluated, the rules inside the Repeat loop are evaluated.
- At each iteration, the next level of employees is found and added to the set e.
- The procedure stops when e = e', that is, when no new employee can be added to e. It is guaranteed to stop because the set of managers is finite.
How it runs: tracing the loop with the data in the table gives:
e = {Hoa, Lan, Mai}
e' = {Hoa, Lan, Mai}
e = {Hoa, Lan, Mai, Chn, Tch}
e' = {Hoa, Lan, Mai, Chn, Tch}
Method 2: Besides the method above, there is another way to obtain the same result:


m(X, Y) ← Quanly(X, Y)
m(X, Y) ← Quanly(X, Z), m(Z, Y)
Hi(X) ← m(X, "M")
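Both methods can be sketched as Python loops that repeat until nothing changes. This is a minimal illustration, not a definitive implementation: the relation quanly below contains only the five rows of the table that agree with the trace given for Method 1, and the function names method1 and method2 are hypothetical.

```python
# Assumed subset of Quanly(employee, manager), chosen to match the trace.
quanly = {
    ("Hoa", "M"), ("Mai", "M"), ("Lan", "M"),
    ("Chn", "Hoa"), ("Tch", "Hoa"),
}


def method1(manages, boss):
    """e(X) <- Quanly(X, boss);  e(X) <- Quanly(X, Y), e(Y)."""
    e = {x for (x, y) in manages if y == boss}
    while True:
        e_old = set(e)                                   # e'(X) <- e(X)
        e |= {x for (x, y) in manages if y in e_old}     # add the next level
        if e == e_old:                                   # Until e = e'
            return e


def method2(manages, boss):
    """m(X,Y) <- Quanly(X,Y);  m(X,Y) <- Quanly(X,Z), m(Z,Y);  Hi(X) <- m(X, boss)."""
    m = set(manages)
    while True:
        m_old = set(m)
        m |= {(x, y) for (x, z) in manages for (z2, y) in m_old if z == z2}
        if m == m_old:
            break
    return {x for (x, y) in m if y == boss}, m


answer1 = method1(quanly, "M")
answer2, closure = method2(quanly, "M")
print(sorted(answer1))     # ['Chn', 'Hoa', 'Lan', 'Mai', 'Tch']
print(answer1 == answer2)  # True
print(len(closure))        # 7
```

On this data Method 2 materialises seven employee-manager pairs before selecting those with manager M, whereas Method 1 only ever holds the five names in the answer.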

Comparing Method 1 and Method 2:
Method 1: finds only the employees of Mr. M, so it reaches the answer faster.
Method 2: computes the whole employee-manager relation and then selects the pairs whose manager is M.
b) Unlike non-recursive queries, recursive queries can be evaluated with several strategies, such as bottom-up evaluation. The way the recursive query e was evaluated above is called naive evaluation. It is simple, but it is not very efficient among the bottom-up strategies. The inefficiency arises because every application of the recursive rule uses the whole set e computed so far. For better efficiency, semi-naive evaluation is used: in the versions below, the rule only considers the employees that were added in the previous iteration.
Method 1':
i := 0
e_i(X) ← Quanly(X, "M")
Repeat
  e(X) ← e_i(X)
  e_{i+1}(X) ← Quanly(X, Y), e_i(Y)
  i := i + 1
Until e_i ⊆ e
Method 2':
i := 0
m_i(X, Y) ← Quanly(X, Y)
Repeat
  m(X, Y) ← m_i(X, Y)
  m_{i+1}(X, Y) ← Quanly(X, Z), m_i(Z, Y)
  i := i + 1
Until m_i ⊆ m
Hi(X) ← m(X, "M")
Note: Even with evaluation methods better than naive evaluation, the efficiency still does not match that of the earlier query that gives the same result. Many techniques exist for refining the semi-naive technique.
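The idea behind Method 1' can be sketched in Python as follows. This is a common formulation of semi-naive evaluation rather than a literal transcription of the pseudocode: the new level has the already known employees subtracted from it, and the loop stops when that difference is empty, which has the same effect as the test e_i ⊆ e. The relation quanly is the same assumed five-row subset of the table, and semi_naive is a hypothetical name.

```python
# Assumed subset of Quanly(employee, manager), chosen to match the trace.
quanly = {
    ("Hoa", "M"), ("Mai", "M"), ("Lan", "M"),
    ("Chn", "Hoa"), ("Tch", "Hoa"),
}


def semi_naive(manages, boss):
    """Return all direct and indirect subordinates of boss, level by level."""
    e = set()                                           # everything found so far
    delta = {x for (x, y) in manages if y == boss}      # e_0: direct subordinates
    levels = []
    while delta:
        e |= delta                                      # e(X) <- e_i(X)
        levels.append(set(delta))
        # e_{i+1}: join Quanly only with the employees found in the last round,
        # then discard the ones that are already known.
        delta = {x for (x, y) in manages if y in delta} - e
    return e, levels


e, levels = semi_naive(quanly, "M")
print(sorted(e))                      # ['Chn', 'Hoa', 'Lan', 'Mai', 'Tch']
print([sorted(l) for l in levels])    # [['Hoa', 'Lan', 'Mai'], ['Chn', 'Tch']]
```

Each round joins Quanly with the last level only, so an employee found in round i is never rederived from the same facts in later rounds, which is the saving over naive evaluation.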

2.3.6. Expert database systems

The preceding part shows that logic-based rules can be integrated into a relational database. Such rules start from the facts held in the tuples of the relational tables. Expert systems of this kind go further and carry out controlled actions.
Definition: Expert database system
An expert database system consists of rules of the form: if a certain set of tuples is present in the database, then a particular procedure is invoked. That procedure may update the database, so the IF part of other rules may become true and other procedures may then be executed. A database of this kind is called an active database.
The structure of an expert database system is similar to that of an expert system in artificial intelligence. The main difference between the two is whether they work on a database or on main memory or virtual memory. In the standard arrangement, an expert database system is made up of a standard database and a standard expert system. The expert system poses its queries in the language of the database, for example SQL, and waits for the answer from the database.

2.4. Some other issues
Besides the approach to deductive databases described above, the following issues are also of interest:

First: the characteristics of query processing. A more detailed description is needed of how to choose query evaluation strategies for definite databases and definite goals. Query processing in a parallel environment is also of interest.

Second: the systematic study of the various aspects of integrity constraints. A detailed classification is needed according to the nature of a constraint, the way the constraint is expressed as a logical formula, and the different views of constraint satisfaction and integrity checking in deductive databases. Methods for managing integrity constraints in deductive databases are also needed.

Third: prototypes of deductive database systems. These are a number of acceptable architectures for a deductive database system. Once an architecture has been adopted, a prototype deductive database is developed before a Prolog interpreter is used.

Fourth: parallel deductive databases. Presenting parallel architectures for deductive databases includes giving algorithms that describe query processing in detail. The queries are assumed to be definite, and the deductive database is assumed to be definite, specified separately and function-free. Parallel evaluation of integrity constraints is also important.

Fifth: the formalization of aggregate functions and of data integrity. In the preceding parts the integrity constraints were only static and non-aggregate, and applied to normal databases. As the database evolves, the integrity constraints must be adapted as well. Aggregate functions, integrity constraints and constraints on transactions are formalized.
