CS 5614: Basic Data Definition and Modification in SQL

67

Introduction to SQL

- Structured Query Language ("Sequel")
  - Serves as DDL as well as DML
- Declarative
  - Say what you want without specifying how to do it
  - One of the main reasons for the commercial success of DBMSs
- Many standards and implementations
  - ANSI SQL
  - SQL-92/SQL-2 (null operations, outer joins, etc.)
  - SQL3 (recursion, triggers, objects)
  - Vendor-specific implementations
- Bag semantics instead of set semantics
  - Used in commercial RDBMSs

68

Create a Relation/Table in SQL

- Example:

CREATE TABLE Students (sid CHAR(9), name VARCHAR(20), login CHAR(8), age INTEGER, gpa REAL);

- Support for Basic Data Types
  - CHAR(n)
  - VARCHAR(n)
  - BIT(n)
  - BIT VARYING(n)
  - INT/INTEGER
  - FLOAT
  - REAL, DOUBLE PRECISION
  - DECIMAL(p,d)
  - DATE, TIME etc.

69

More Examples

- And one for Courses

CREATE TABLE Courses (courseid CHAR(6), department CHAR(20));

- And one for their relationship!

CREATE TABLE takes (sid CHAR(9), courseid CHAR(6));

  - Why?

- Can also provide default values

CREATE TABLE Students (sid CHAR(9), .... age INTEGER DEFAULT 21, gpa REAL);

70

Examples Contd.

- DATE and TIME
  - Implementations vary widely
  - Typically treated as strings of a special form
  - Allows comparisons of an ordinal nature (<, > etc.)
- DATE Example
  - 1999-03-03 (No Y2K problems)
- TIME Examples
  - 15:30:29
  - 15:30:29.3875
- Deleting a Relation/Table in SQL

DROP TABLE Students;

71

Modifying Relation Schemas

- Drop an attribute (column)

ALTER TABLE Students DROP login;

- Add an attribute (column)

ALTER TABLE Students ADD phone CHAR(7);

  - What happens to the new entry for the old records?
  - Default is NULL, or say

ALTER TABLE Students ADD phone CHAR(7) DEFAULT 'unknown';

- Always begin with ALTER TABLE <TABLE_Name>
- Can use DEFAULT even with regular definition (as in Slide 69)

72

How do you enter/modify data?

- INSERT command

INSERT INTO Students VALUES ('53688', 'Mark', 'mark2345', 23, 3.9);

  - Cumbersome (use bulk loading; described later)

- DELETE command

DELETE FROM Students S WHERE S.name = 'Smith';

- UPDATE command

UPDATE Students S SET S.age = S.age + 1, S.gpa = S.gpa - 1 WHERE S.sid = '53688';
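The three commands above can be tried end to end. This is a minimal sketch, assuming Python's built-in sqlite3 module as the host (the slides do not name a DBMS); the function name and the second row for Smith are hypothetical, added only so that DELETE has something to remove. The table alias is dropped for portability.

```python
import sqlite3


def demo_modifications():
    """Run INSERT, DELETE and UPDATE against an in-memory Students table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Students (sid CHAR(9), name VARCHAR(20), "
        "login CHAR(8), age INTEGER, gpa REAL)"
    )
    # INSERT: string values are passed as parameters, so no manual quoting is needed.
    conn.executemany(
        "INSERT INTO Students VALUES (?, ?, ?, ?, ?)",
        [
            ("53688", "Mark", "mark2345", 23, 3.9),
            ("53650", "Smith", "smith123", 19, 3.2),  # hypothetical row
        ],
    )
    # DELETE: removes every tuple whose name is Smith.
    conn.execute("DELETE FROM Students WHERE name = 'Smith'")
    # UPDATE: both assignments are evaluated against the old tuple values.
    conn.execute(
        "UPDATE Students SET age = age + 1, gpa = gpa - 1 WHERE sid = '53688'"
    )
    rows = conn.execute("SELECT sid, name, age, gpa FROM Students").fetchall()
    conn.close()
    return rows
```

Calling `demo_modifications()` should leave a single tuple for Mark, one year older and with a gpa one point lower.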

73

Domains

- Domains: Similar to structs and other user-defined types

CREATE DOMAIN Email AS CHAR(8) DEFAULT 'unknown';
....
login Email    -- instead of login CHAR(8) DEFAULT 'unknown'

- Advantages: can be reused

junkaddress Email, fromaddress Email, toaddress Email, ....

- Can DROP DOMAINS too!

DROP DOMAIN Email;

  - Affects only future declarations

74

Keys

- To Specify Keys
  - Use PRIMARY KEY or UNIQUE
  - Declare alongside attribute
  - For multiattribute keys, declare as a separate line

CREATE TABLE takes ( sid CHAR(9), courseid CHAR(6), PRIMARY KEY (sid,courseid) );

- What's the difference between PRIMARY KEY and UNIQUE?
  - Typically only one PRIMARY KEY but any number of UNIQUE keys
  - Implementor allowed to attach special significance

75

Creating Indices/Indexes

- Why?
  - Speeds up query processing time
- For Students

CREATE INDEX indexone ON Students(sid);
CREATE INDEX indextwo ON Students(login);

- How to decide attributes to place indices on?
  - One is (typically) created by default on PRIMARY KEY
  - Creation of indices on UNIQUE attributes is implementation-dependent
  - In general, physical database design/tuning is very difficult!
  - Use Tools: Microsoft SQL Server has an index selection Wizard
- Why not place indices on all attributes?
  - Too cumbersome for insertions/deletions/updates
- Like all things in computer science, there is a tradeoff! :-)

76

Other Properties

- NOT NULL instead of DEFAULT

CREATE TABLE Students (sid CHAR(9), name VARCHAR(20), login CHAR(8), age INTEGER, gpa REAL);

- Can insert a tuple without a value for gpa
  - NULL will be inserted
- If we had specified

gpa REAL NOT NULL);

  - insert cannot be made!

[Slides 77-80: text unrecoverable; the source extraction is damaged by a font-encoding error.]

CS 5614: Misc. SQL Stuff, Safety in Queries

81

What is still to be covered (and will be)

- Declaring constraints
  - Domain Constraints
  - Referential Integrity (Foreign Keys)
- More SQL Stuff
  - Subqueries
  - Aggregation
- SQL Peculiarities
  - Strange Phenomena
  - More on Bag Semantics
  - Ifs and Buts
- Embedding SQL in a Programming Environment
  - Accessing DBs from within a PL
  - (will be covered in Module 3)

82

What will be mentioned (but not covered in detail)

- Triggers
  - Read Cow Book or Boat Book
- More SQL Gory Details
- Recursive Queries (SQL3)
  - Why do we need these?
- Security
- Authorization and Privacy
- Trends towards Object Oriented DBMSs

83

Tuple-Based Domain Constraints

- Already Seen
  - NOT NULL
  - UNIQUE, PRIMARY KEY etc.
- In General

CREATE TABLE Students (sid CHAR(9), name VARCHAR(20), login CHAR(8), age INTEGER, gpa REAL, CHECK (gpa >= 0.0) );

  - Note: Implementations vary, but this is the general idea

- Other Complicated Forms
  - Constraints on whole relations, Assertions

84

Referential Integrity Constraints

- Foreign Keys
  - An attribute a of R1 is a foreign key if it references the primary key (say b) of another relation R2
  - In addition, there is a referential integrity constraint from R1 to R2.
- Example
  - login is a FOREIGN KEY for Students

CREATE TABLE Students (sid CHAR(9) PRIMARY KEY, name VARCHAR(20), login CHAR(8) REFERENCES Accounts(acct), age INTEGER, gpa REAL );
CREATE TABLE Accounts ( acct CHAR(8) PRIMARY KEY );

85

Alternatively

- Can use FOREIGN KEY construct

CREATE TABLE Students (sid CHAR(9) PRIMARY KEY, name VARCHAR(20), login CHAR(8), age INTEGER, gpa REAL, FOREIGN KEY (login) REFERENCES Accounts(acct) );
CREATE TABLE Accounts ( acct CHAR(8) PRIMARY KEY );

- Note: acct should be declared as PRIMARY KEY for Accounts!
  - in both cases
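To see what the referential integrity constraint buys, the sketch below loads the FOREIGN KEY form of the schema into an in-memory SQLite database and attempts one insert. It assumes Python's sqlite3 module; the function name and sample values are hypothetical. SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`, which is a property of SQLite rather than of SQL in general.

```python
import sqlite3


def try_insert_student(login):
    """Return True if a student with this login can be inserted, else False."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific switch
    conn.execute("CREATE TABLE Accounts (acct CHAR(8) PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE Students (sid CHAR(9) PRIMARY KEY, name VARCHAR(20), "
        "login CHAR(8), age INTEGER, gpa REAL, "
        "FOREIGN KEY (login) REFERENCES Accounts(acct))"
    )
    conn.execute("INSERT INTO Accounts VALUES ('mark2345')")  # hypothetical account
    try:
        conn.execute(
            "INSERT INTO Students VALUES ('53688', 'Mark', ?, 23, 3.9)", (login,)
        )
        return True
    except sqlite3.IntegrityError:
        # Raised when login does not match any Accounts.acct value.
        return False
    finally:
        conn.close()
```

`try_insert_student("mark2345")` succeeds because the account exists, while `try_insert_student("nobody")` is rejected by the constraint.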

86

SQL Subqueries

- Given

Students(sid,name,login,age,gpa)
HasCar(sid,carname)

- Find
  - the car of the student with login='mark'
- Traditional Way

SELECT carname FROM Students, HasCar WHERE Students.login = 'mark' AND Students.sid = HasCar.sid;

- The Subway

SELECT carname FROM HasCar WHERE sid = (SELECT sid FROM Students WHERE login = 'mark');
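The join form and the subquery form should return the same answer. The following sketch checks that on a tiny data set, assuming Python's sqlite3 module; the function name, rows and car names are hypothetical.

```python
import sqlite3


def car_queries():
    """Run the join form and the subquery form and return both results."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE Students (sid CHAR(9), name VARCHAR(20), login CHAR(8),
                               age INTEGER, gpa REAL);
        CREATE TABLE HasCar (sid CHAR(9), carname VARCHAR(20));
        INSERT INTO Students VALUES ('53688', 'Mark', 'mark', 23, 3.9);
        INSERT INTO Students VALUES ('53650', 'Amy', 'amy', 21, 3.5);
        INSERT INTO HasCar VALUES ('53688', 'Civic');
        INSERT INTO HasCar VALUES ('53650', 'Beetle');
        """
    )
    join_rows = conn.execute(
        "SELECT carname FROM Students, HasCar "
        "WHERE Students.login = 'mark' AND Students.sid = HasCar.sid"
    ).fetchall()
    # The scalar subquery works here because login 'mark' matches exactly one sid.
    sub_rows = conn.execute(
        "SELECT carname FROM HasCar "
        "WHERE sid = (SELECT sid FROM Students WHERE login = 'mark')"
    ).fetchall()
    conn.close()
    return join_rows, sub_rows
```

Note that the `sid = (SELECT ...)` form relies on the subquery producing a single value; if several students could share a login, `IN` would be the safer comparison.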

87

Aggregation

- Given

Students(sid,name,login,age,gpa)

- Find
  - the average of the ages of all the students
- Solution

SELECT AVG(age) FROM Students;

- Other Operations
  - SUM (summation of all the values in a column)
  - MIN (least value)
  - MAX (highest value)
  - COUNT (the number of values), e.g.

SELECT COUNT(*) FROM Students;

  - COUNTs the number of Students!

88

Ordering

- Given

Students(sid,name,login,age,gpa)

- List
  - the students in (ascending) alphabetical order of name
- Solution

SELECT * FROM Students ORDER BY name;

- and that for DESCending ORDER is

SELECT * FROM Students ORDER BY name DESC;

- Default is ASC

89

Grouping

- Given

Students(sid,name,login,age,gpa)

- Find
  - the names of students with gpa=4.0 and
  - group people with like ages together
- Solution

SELECT name FROM Students WHERE gpa=4.0 GROUP BY name;

90

More on Grouping

- Given

Students(sid,name,login,age,gpa)

- Find
  - the names of students with gpa=4.0 and
  - group people with like ages together and
  - show only those groups that have more than 2 students in them
- Solution

SELECT name FROM Students WHERE gpa=4.0 GROUP BY name HAVING COUNT(*) > 2;
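The interplay of WHERE, GROUP BY and HAVING is easier to see on data. The sketch below is one possible reading of "group people with like ages together": it groups by age instead of name and reports the group size, which is an assumption about the intent and not the slide's own query. It uses Python's sqlite3 module with hypothetical rows.

```python
import sqlite3


def crowded_age_groups():
    """Ages shared by more than two students whose gpa is 4.0."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Students (sid CHAR(9), name VARCHAR(20), "
        "login CHAR(8), age INTEGER, gpa REAL)"
    )
    conn.executemany(
        "INSERT INTO Students VALUES (?, ?, ?, ?, ?)",
        [
            ("1", "Ann", "ann", 20, 4.0),
            ("2", "Bob", "bob", 20, 4.0),
            ("3", "Cy", "cy", 20, 4.0),
            ("4", "Dee", "dee", 21, 4.0),
            ("5", "Eve", "eve", 21, 4.0),
            ("6", "Fay", "fay", 20, 3.0),  # removed by WHERE before grouping
        ],
    )
    result = conn.execute(
        "SELECT age, COUNT(*) FROM Students "
        "WHERE gpa = 4.0 "        # filters individual tuples
        "GROUP BY age "           # one group per distinct age
        "HAVING COUNT(*) > 2 "    # filters whole groups
        "ORDER BY age"
    ).fetchall()
    conn.close()
    return result
```

Only the age-20 group survives: it has three students with a 4.0 gpa, while the age-21 group has just two.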

91

Summary of SQL Syntax

- General Form

SELECT <attribute(s)> FROM <relation(s)> WHERE <condition(s)> GROUP BY <attribute(s)> HAVING <grouping condition(s)>

- Order of Execution
  - FROM
  - WHERE
  - GROUP BY
  - HAVING
  - SELECT

92

Views

- Can be viewed as temporary relations
  - do not exist physically BUT
  - can be queried and modified (sometimes) just like normal relations
- Example:

CREATE VIEW GoodStudents(id,name) AS SELECT sid,name FROM Students WHERE gpa=4.0;
SELECT * FROM GoodStudents WHERE name='Mark';

  - Can we update the original relation using the GoodStudents VIEW?

93

Beginning of Weird Stuff

- SQL uses Bag Semantics
  - meaning: does not normally eliminate duplicates
  - e.g. the SELECT clause
- BUT (a big BUT) this doesn't apply to
  - UNION, INTERSECT and DIFFERENCE (EXCEPT in SQL)
- Either way, it provides facilities to do whatever we want
- If you want duplicates eliminated in SELECT clause
  - use SELECT DISTINCT .....
- If you want to prevent elimination of duplicates in UNION etc.
  - use (SELECT ...) UNION ALL (SELECT ...)
  - Likewise for INTERSECT and DIFFERENCE

94

... and that's just the tip of the iceberg

- What happens with the following code?

SELECT R.A FROM R,S,T WHERE R.A = S.A OR R.A = T.A

- when R(A) has {2,3}, S(A) has {3,4} and T(A) is {}
- Confusion Reigns!
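The puzzle can be settled by running it. The sketch below, assuming Python's sqlite3 module and a hypothetical function name, loads the three relations exactly as stated and executes the query. Because FROM R,S,T forms the Cartesian product first and T is empty, that product has no tuples for WHERE to test.

```python
import sqlite3


def tip_of_iceberg():
    """Evaluate the R, S, T query with R = {2,3}, S = {3,4}, T = {}."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE R (A INTEGER);
        CREATE TABLE S (A INTEGER);
        CREATE TABLE T (A INTEGER);
        INSERT INTO R VALUES (2);
        INSERT INTO R VALUES (3);
        INSERT INTO S VALUES (3);
        INSERT INTO S VALUES (4);
        """
    )
    # T stays empty, so the product R x S x T contains no tuples at all.
    rows = conn.execute(
        "SELECT R.A FROM R, S, T WHERE R.A = S.A OR R.A = T.A"
    ).fetchall()
    conn.close()
    return rows
```

The result is empty, even though 3 appears in both R and S and the condition is an OR.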

95

Safety in Queries

- Some queries are inherently unsafe
  - should not be permitted in DB access
- Example
  - Given only the following relation

Students(id)

  - Find all those who are not students

- Easy to distinguish unsafe queries via common sense
  - Final result is not closed
  - Is there an automatic way to determine safety?

96

Answer: Yes!

- Easiest to spot when written in Datalog

Answer(id) <- NOT Students(id).

- Golden Rule
  - Any variable that appears anywhere must also appear in a non-negated body part
  - In this case, id causes the query to be unsafe
- Example of a Safe Query

Answer(id) <- People(id), NOT Students(id).

  - This produces all those people who are NOT students
  - safe because the People relation provides a reference point
  - id which appears in a negated body part also appears non-negated
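The golden rule is mechanical enough to automate. The following is a minimal sketch in plain Python under an assumed, hypothetical rule encoding: the head is a list of variable names, and each body part is a pair of a kind ("pos" for non-negated, "neg" for negated) and its variables. It is not a Datalog parser, only the safety test itself.

```python
def unsafe_variables(head_vars, body):
    """Return the variables that violate the golden rule, sorted by name.

    head_vars: variables in the rule head, e.g. ["id"]
    body: list of (kind, variables) pairs, kind being "pos" or "neg"
    """
    seen = set(head_vars)   # every variable that appears anywhere in the rule
    bound = set()           # variables that appear in a non-negated body part
    for kind, variables in body:
        seen.update(variables)
        if kind == "pos":
            bound.update(variables)
    return sorted(seen - bound)


# Answer(id) <- NOT Students(id).
unsafe_rule = unsafe_variables(["id"], [("neg", ["id"])])

# Answer(id) <- People(id), NOT Students(id).
safe_rule = unsafe_variables(["id"], [("pos", ["id"]), ("neg", ["id"])])
```

A rule is safe exactly when the returned list is empty; for the first rule the list contains `id`, for the second it is empty.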

97

More Dangers

- Problem not restricted to negated body parts
  - occurs even with arithmetic body parts (why?)
- Given
  - only the following relation

Students(id,age)

  - Find all those numbers that are greater than the age of some student

Answer(x) <- Students(id,age), x > age.

- Extension to previous rule
  - Any variable that appears anywhere must also appear in a non-negated, non-arithmetic body part
  - In this case, x causes the query to be unsafe because it doesn't appear in a non-negated, non-arithmetic part

98

One More Example

- Given
  - a relation Composite(x)
  - which lists all the composite numbers
- Write a query to find
  - the prime numbers
- Wrong Way
  - Prime(x) <- NOT Composite(x).
- Right Way
  - Prime(x) <- Number(x), NOT Composite(x).
- Safety in Other Notations
  - Relational Algebra: via the subtraction operator
  - SQL: via the EXCEPT construct
- Notice how SQL and Relational Algebra do not allow unsafe queries
  - because there is no way to write such queries with the given constructs
  - how clever, eh? :-)
  - It is always amazing how languages force you to think in a certain manner
  - a problem long studied by philosophers

99

Recursion in Queries

- Used to specify an indefinite number of applications of a relation
- Example
  - Given only the following relation

Person(name,parent)

  - Find all the ancestors of Mark

- Easy to find an ancestor at a predefined level
  - parent: Use Person
  - grandparent: Join Person with Person
  - great-grandparent: Join Person with Person with Person
  - and so on.
- To find an ancestor at no predefined level
  - Need to join Person with Person an indefinite number of times
- SQL3 provides support for recursive definitions

100

Solution in Datalog

- First, the base case

Ancestor(x,y) <- Person(x,y).

- Then, the inductive step

Ancestor(x,y) <- Person(x,z), Ancestor(z,y).

- Can also write the previous rule as

Ancestor(x,y) <- Ancestor(x,z), Ancestor(z,y).

  - why?

101

Recursion in SQL3

- Use the WITH RECURSIVE ... SELECT construct
- Example

WITH RECURSIVE Ancestor(name, ans) AS (
    (SELECT * FROM Person)
    UNION
    (SELECT Person.name, Ancestor.ans FROM Person, Ancestor WHERE Person.parent = Ancestor.name)
)
SELECT * FROM Ancestor;

- Use with caution: Some kinds of recursive queries will not be allowed!
  - example: the following Datalog query might not be allowed in SQL3

Ancestor(x,y) <- Ancestor(x,z), Ancestor(z,y).

  - because the rule involves 2 applications of the recursively defined predicate
  - Linear recursion allows only one (as in the SQL code above)
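The linear-recursive definition can be exercised directly. The sketch below assumes Python's sqlite3 module, whose SQLite engine supports WITH RECURSIVE; the query is written without parentheses around each UNION branch, which is a form SQLite accepts. The function name and the family data are hypothetical.

```python
import sqlite3


def ancestors_of(person):
    """Return all ancestors of `person`, sorted, via a linear recursive query."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE Person (name TEXT, parent TEXT);
        INSERT INTO Person VALUES ('mark', 'amy');
        INSERT INTO Person VALUES ('amy', 'bob');
        INSERT INTO Person VALUES ('bob', 'cara');
        INSERT INTO Person VALUES ('zed', 'yan');
        """
    )
    rows = conn.execute(
        """
        WITH RECURSIVE Ancestor(name, ans) AS (
            SELECT name, parent FROM Person          -- base case
            UNION
            SELECT Person.name, Ancestor.ans         -- inductive step
            FROM Person, Ancestor
            WHERE Person.parent = Ancestor.name
        )
        SELECT ans FROM Ancestor WHERE name = ? ORDER BY ans
        """,
        (person,),
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]
```

With this data, `ancestors_of("mark")` walks mark to amy to bob to cara. UNION (as opposed to UNION ALL) discards duplicates, which is what lets the recursion stop once no new pairs appear.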

102

Final Example

- Be careful when combining negation, aggregation and recursion
  - perfect recipe for disaster!
- Mutual Recursion
  - Odd(x) <- Number(x), NOT Even(x).
  - Even(x) <- Number(x), NOT Odd(x).
- What are the problems?
  - Notice that the query appears safe (per Slide 96)
  - cycles indefinitely!; no proper base cases
- Illegal in SQL3
  - not because of mutual recursion
  - but due to the fact that there is no unique interpretation to the query
  - E.g.: 6 could be either in Odd or in Even; both are acceptable!
- Sometimes mutual recursion is good and fruitful, if written properly
  - with proper limiting constraints and base cases

103

Introduction to Deductive DBMSs

- Intersection of traditional RDBMSs and Logic Programming
- Example Systems
  - CORAL (Univ. Wisc.)
  - LDL++ (MCC)
  - XSB Systems (SUNY, Stony Brook)
- Can be viewed as
  - extending PROLOG-type systems with secondary storage
  - extending RDBMSs with deductive functionality
- Mappings: Commonalities between PROLOG and DBMSs
  - Predicate: Relation
  - Argument: Attribute
  - Ground Fact: Tuple
  - Extensional Definition: Table (defined by data)
  - Intensional Definition: Table (defined by a view)

104

PROLOG vs. RDBMSs

- Characteristics of PROLOG
  - Tuple-at-a-time
  - Backward Chaining
  - Top-Down
  - Goal-Oriented
  - Fixed-Evaluation Strategy (Depth-First)
- Characteristics of RDBMSs
  - Set-at-a-time (recall relational algebra)
  - Forward Chaining
  - Bottom-Up
  - Query Optimizer figures a good evaluation strategy
- Example
  - ancestor(X,X). parent(amy,bob).
  - ancestor(X,Y) <- parent(X,Z), ancestor(Z,Y).
- Query
  - Find the ancestors of bob: ancestor(X,bob)?

105

PROLOG Pitfalls

- Previous Example
  - Linear Recursion
  - Tail Recursion
- What if we reverse the order of clauses in
  - ancestor(X,Y) <- parent(X,Z), ancestor(Z,Y).
  - PROLOG goes into an infinite loop (why?)
- What if we make it
  - ancestor(X,Y) <- ancestor(X,Z), ancestor(Z,Y).
  - Not Linear Recursion
- Inference = Resolution + Unification
  - Entailment in First Order Logic is Semi-decidable

106

Example of Deductive Query Optimization

- Same-Generation: Hello World of DDBMSs
  - sg(X,Y) <- flat(X,Y).
  - sg(X,Y) <- up(X,U),sg(U,V),down(V,Y).
- Magic: A Rewriting Technique
  - Rewrite query such that the advantages of bottom-up evaluation and goal-oriented behavior are combined
- Example: For the query
  - sg(john,Z)?
- Magic produces
  - sg(X,Y) <- magic_sg(X),flat(X,Y).
  - sg(X,Y) <- magic_sg(X),up(X,U),sg(U,V),down(V,Y).
  - magic_sg(john).
- How do you know when to stop?
  - Iterative Fixpoint Evaluation (when the answer stops changing)
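Iterative fixpoint evaluation can be sketched in a few lines of plain Python. The code below evaluates the magic-rewritten same-generation program bottom-up and stops when a full pass adds nothing new. It assumes one extra rule that the slide does not list, `magic_sg(U) <- magic_sg(X), up(X,U)`, which is what standard magic-sets rewriting would add so that relevance propagates upward from john; the function name and the sample facts are hypothetical.

```python
def same_generation(start, flat, up, down):
    """Answer sg(start, Z)? by bottom-up fixpoint over the magic-rewritten rules.

    flat, up, down: sets of (x, y) pairs standing for the base relations.
    """
    magic = {start}  # magic_sg(john).
    sg = set()
    while True:
        # Assumed rule: magic_sg(U) <- magic_sg(X), up(X,U).
        new_magic = magic | {u for (x, u) in up if x in magic}
        new_sg = set(sg)
        # sg(X,Y) <- magic_sg(X), flat(X,Y).
        new_sg |= {(x, y) for (x, y) in flat if x in new_magic}
        # sg(X,Y) <- magic_sg(X), up(X,U), sg(U,V), down(V,Y).
        new_sg |= {
            (x, y)
            for (x, u) in up if x in new_magic
            for (u2, v) in sg if u2 == u
            for (v2, y) in down if v2 == v
        }
        if new_magic == magic and new_sg == sg:
            break  # fixpoint: the answer stopped changing
        magic, sg = new_magic, new_sg
    return sorted(y for (x, y) in sg if x == start)


answers = same_generation(
    "john",
    flat={("p1", "p2"), ("x", "y")},       # ("x", "y") is irrelevant to john
    up={("john", "p1"), ("mary", "p2")},
    down={("p2", "mary"), ("p2", "sue")},
)
```

The magic set keeps the evaluation goal-oriented: the fact flat(x,y) is never turned into an sg tuple because x is not reachable from john.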

107

SQL in a Programming Environment

- Incorporating SQL in a complete application
- Why?
  - There are some things we cannot do with SQL alone
  - e.g. preserving complex states, looping, branching etc.
  - Typically embed SQL in a host-language interface
- Problems: Impedance Mismatch
  - SQL operates on sets of tuples
  - Languages such as C, C++ operate on an individual basis
- Solution
  - easy when SELECT returns only one row
- When more than one row is returned
  - design an iterator to run over the results
  - called a cursor
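A cursor bridges the set-at-a-time and row-at-a-time worlds. The sketch below uses Python's sqlite3 module as the host-language interface (an assumption; the slides mention C and C++) and fetches one row per loop iteration, doing per-row work that would be awkward in SQL alone. The function name and rows are hypothetical.

```python
import sqlite3


def names_with_cursor(min_gpa):
    """Iterate over a multi-row result one tuple at a time using a cursor."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Students (sid CHAR(9), name VARCHAR(20), "
        "login CHAR(8), age INTEGER, gpa REAL)"
    )
    conn.executemany(
        "INSERT INTO Students VALUES (?, ?, ?, ?, ?)",
        [
            ("1", "Mark", "mark", 23, 3.9),
            ("2", "Amy", "amy", 21, 3.5),
            ("3", "Bob", "bob", 22, 2.8),
        ],
    )
    cursor = conn.cursor()
    cursor.execute(
        "SELECT name, gpa FROM Students WHERE gpa >= ? ORDER BY name", (min_gpa,)
    )
    collected = []
    while True:
        row = cursor.fetchone()      # advance the cursor by one tuple
        if row is None:              # no more rows in the result set
            break
        name, _gpa = row
        collected.append(name.upper())  # arbitrary host-language work per row
    conn.close()
    return collected
```

`names_with_cursor(3.0)` visits Amy and Mark in turn; Bob is filtered out by the query before the cursor ever sees him.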

108

How are these implemented?

- Vendor-Specific Implementations
  - ORACLE: PL/SQL (procedural extensions to SQL)
- Open Database Connectivity Standard
  - Provides a standard API for transparent database access
  - used when database independence is important
  - used when required to connect to diverse data sources

109

Tradeoffs

- ODBC
  - originated by Microsoft in 1991
  - adds one more abstraction layer
  - not as fast as a native API (does not exploit special features)
  - least-common denominator approach
  - constantly evolving
- PL/SQL etc.
  - tailored to the details of the underlying DBMS
  - might not extend to heterogeneous domains
  - modeled after a specific programming language (e.g. Ada for PL/SQL)

110

In Between: Stored Procedures

- Used for developing tightly-coupled applications
  - push computations selectively into the database system
  - avoid performance degradation
  - work in database address space instead of application address space
- Advantages
  - No sending SQL statements to and fro
  - eliminate pre-processing
  - speedup by an order of magnitude
- Example Applications
  - Database Administration
  - Integrity Maintenance and Checks
  - Database Mining
- Disadvantages
  - Non-standard implementation
  - Difficult to enforce transactional synchronization
  - Without traditional SQL optimization, can lead to performance degradation

111

Introduction to Query Optimization

- Helps attain declarativeness of RDBMSs
  - One of the main reasons for commercial success of DBMSs
- A motivating example
  - Find all students with 4.0 gpa enrolled in CS5614

SELECT name FROM Students, Classroll WHERE Students.name = Classroll.studentname AND Students.gpa = 4.0 AND Classroll.coursename = 'CS5614'

- Two Strategies
  - Do join and then filter out the ones with gpa <> 4.0 and course <> CS5614
  - Filter first the ones with gpa <> 4.0 and course <> CS5614 and then Join
- Which is Better?
  - Always good to push selections as far down into the query parse tree as possible

112

How does a Query Optimizer work?

- Three Requirements
  - A Search Space of Plans
  - A Cost Model (for Plan evaluation)
  - An Enumeration Algorithm
- Ideally
  - Search Space: contains both good and efficient plans
  - Cost Models: cheap to compute and accurate
  - Enumeration Algorithm: efficient (not a monkey-typewriter algorithm)
- Example of a Search Space
  - See Previous Slide
- Examples of Cost Models
  - #(tuples) evaluation
  - #(main memory locations) etc.
- Example of an enumeration algorithm
  - Sequential enumeration of a lattice of plans
  - Dynamic Programming vs. Greedy Approaches

113

A Simple Measure of Cost

- #(Tuples) in a query
- Easiest to compute for
  - Cartesian Product: #(R X S) = #(R)#(S)
  - Projection: #(Pi(R)) = #(R)
- A Notation for Other Operations
  - V(R,A) = Number of distinct values of attribute A in R
  - formulas assume that all values of A are equally likely in R
  - Holds in average case for most distributions (e.g. Zipf)
- Selectivity Factors for Selection Operations
  - Equality Tests: Use 1/V(R,A)
  - < or > Tests: Use 1/3
  - <> Test: Use (V(R,A)-1)/V(R,A)
  - AND Conditions: Multiply Selectivity Factors
  - OR Conditions: Three Choices
    - Sum of results from individual selectivity factors
    - Min(sum, total size of relation): why?
    - n(1-(1-m1/n)(1-m2/n)) formula: most accurate, where n = #(R) and m1, m2 are the sizes selected by each condition
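The selectivity factors turn into size estimates by simple arithmetic. The sketch below codes them as small Python functions; the function names and the sample statistics (#(R) = 1000, V(R,A) = 50, V(R,B) = 10) are hypothetical, and the OR estimate uses the n(1-(1-m1/n)(1-m2/n)) form.

```python
def eq_selectivity(distinct_values):
    """Equality test A = c: 1 / V(R,A)."""
    return 1.0 / distinct_values


def neq_selectivity(distinct_values):
    """Inequality test A <> c: (V(R,A) - 1) / V(R,A)."""
    return (distinct_values - 1.0) / distinct_values


RANGE_SELECTIVITY = 1.0 / 3.0  # rule of thumb for < or > tests


def and_size(n, *factors):
    """AND of conditions: multiply the relation size by each selectivity factor."""
    size = float(n)
    for factor in factors:
        size *= factor
    return size


def or_size(n, m1, m2):
    """OR of two conditions selecting m1 and m2 tuples out of n."""
    return n * (1.0 - (1.0 - m1 / n) * (1.0 - m2 / n))


n = 1000                               # hypothetical #(R)
m1 = and_size(n, eq_selectivity(50))   # A = c1 selects about 20 tuples
m2 = and_size(n, eq_selectivity(10))   # B = c2 selects about 100 tuples
both = and_size(n, eq_selectivity(50), eq_selectivity(10))
either = or_size(n, m1, m2)
```

Here the AND estimate is about 2 tuples and the OR estimate about 118, slightly below the plain sum of 120 because the formula discounts tuples that satisfy both conditions.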

114

Estimating the Size of a Join

- Assume: R(X,Y) Join S(Y,Z)
- Range of Values
  - Minimum: 0
  - In-between: #(R) (if Y is a foreign key for R and a key for S)
  - Maximum: #(R)#(S) (if Ys in R and S are all the same)
- Assumptions for Join Size Estimation
  - Containment of Value Sets
  - Preservation of Value Sets
- Containment of Value Sets
  - If V(R,Y) <= V(S,Y) then the Ys in R are a subset of the Ys in S
  - Satisfied when Y is a foreign key in R and a key in S
- Preservation of Value Sets
  - V(R Join S,X) = V(R,X)
  - V(R Join S,Z) = V(S,Z)
  - why is this reasonable?

115

The Actual Estimate

- Assume that V(R,Y) <= V(S,Y)
  - Every tuple in R has a chance of 1/V(S,Y) of joining with a given tuple of S
  - Every tuple in R is therefore expected to join with #(S)/V(S,Y) tuples of S
  - All tuples in R together yield an estimated #(R)#(S)/V(S,Y) joined tuples
- What if V(S,Y) <= V(R,Y)
  - Answer: #(R)#(S)/V(R,Y)
  - In general: #(R)#(S)/(max (V(S,Y),V(R,Y)))
- What if there are multiple join attributes
  - Have a max factor in the denominator for each such attribute!
- How to Estimate #(R Join S Join T)?
  - Does it matter which we do first?
- Surprise!
  - Estimation formula preserves associativity of Joins!
  - In other words, it takes care of itself!
- Thus, for a Join attribute appearing > 2 times
  - 3 times: Use two highest values
  - 4 times: Use three highest values etc.

116

More on Join Associativity and Commutativity

- Which is better: (R Join S) or (S Join R)
  - Good to put the smaller relation on the left
  - Why? Most Join algorithms are asymmetric
- Example:
  - Construct a good query tree for the following

SELECT movietitle FROM Actors,ActedIn WHERE Actors.name = ActedIn.actorname AND Actors.age = 23

- Number of Possible Trees of n Attributes
  - Arises from the shape of the trees: T(n)
  - Arises from permuting the leaves: n!
  - Total choices: n! T(n)

117

What is T(n)?

- Sample Values
  - 1: 1
  - 2: 1
  - 3: 2
  - 4: 5
  - 5: 14
- A formula
  - T(1) = 1 (Basis)
  - T(n) = T(1)T(n-1) + T(2)T(n-2) + ..... + T(n-1)T(1)
- Classifications
  - Left-Deep Trees: All right children are leaves
  - Right-Deep Trees: All left children are leaves
  - Bushy Trees: Neither Left-Deep nor Right-Deep
- Choosing a Join Order: Restricted to Left-Deep Trees
  - By Dynamic Programming: O(n!)
  - Greedy Approach: Make local selections
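The recurrence for T(n) is short enough to check against the sample values. This is a minimal sketch in plain Python with hypothetical function names; it fills a table bottom-up and also multiplies by n! to get the total number of join trees.

```python
from math import factorial


def tree_shapes(n):
    """T(n): number of binary tree shapes with n leaves, from the recurrence."""
    t = [0, 1]  # t[1] = 1 is the basis; t[0] is unused padding
    for k in range(2, n + 1):
        # T(k) = T(1)T(k-1) + T(2)T(k-2) + ... + T(k-1)T(1)
        t.append(sum(t[i] * t[k - i] for i in range(1, k)))
    return t[n]


def total_join_trees(n):
    """Total choices: n! leaf orderings times T(n) tree shapes."""
    return factorial(n) * tree_shapes(n)


sample_values = [tree_shapes(n) for n in range(1, 6)]
```

The computed values for n = 1..5 match the slide (1, 1, 2, 5, 14), and for four relations there are already 4! x 5 = 120 candidate trees, which is why restricting the search to left-deep trees is attractive.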

118

Example

- Consider
  - R(a,b): #(R) = 1000, V(R,a) = 100, V(R,b) = 200
  - S(b,c): #(S) = 1000, V(S,b) = 100, V(S,c) = 500
  - T(c,d): #(T) = 1000, V(T,c) = 20, V(T,d) = 50
- Possible Join Orders
  - (R Join S) Join T
  - (S Join R) Join T (same as above; why?)
  - (R Join T) Join S
  - (T Join R) Join S (same as above)
  - (S Join T) Join R
  - (T Join S) Join R (same as above)
- Cost Estimation = Sizes of Intermediate Relations
  - (R Join S) Join T: 5000
  - (R Join T) Join S: 1000000
  - (S Join T) Join R: 2000
- Best Plan = (S Join T) Join R
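The three intermediate sizes follow from the estimate #(R)#(S)/max(V(R,Y),V(S,Y)), with one max factor per shared attribute. The sketch below recomputes them in plain Python; the function name and the way shared attributes are passed (a list of pairs of distinct-value counts) are hypothetical conventions.

```python
def join_size(size_left, size_right, shared):
    """Estimated size of a natural join.

    shared: list of (V(left,Y), V(right,Y)) pairs, one per join attribute Y.
    An empty list means no common attribute, i.e. a Cartesian product.
    """
    estimate = float(size_left * size_right)
    for v_left, v_right in shared:
        estimate /= max(v_left, v_right)
    return estimate


# R(a,b), S(b,c), T(c,d), each with 1000 tuples.
r_join_s = join_size(1000, 1000, [(200, 100)])   # shared b: V(R,b)=200, V(S,b)=100
r_join_t = join_size(1000, 1000, [])             # no shared attribute
s_join_t = join_size(1000, 1000, [(500, 20)])    # shared c: V(S,c)=500, V(T,c)=20

best = min(
    [("R Join S", r_join_s), ("R Join T", r_join_t), ("S Join T", s_join_t)],
    key=lambda pair: pair[1],
)
```

R Join T is so expensive because R and T share no attribute, so the "join" degenerates to a Cartesian product; S Join T has the smallest intermediate result, which is why the best plan starts there.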

119

Database Tuning: Why?

- Two Families of Queries
  - OLTP (Access small number of records)
  - OLAP (Summarize from a large number of records)
- Sources of Poor Performance
  - Imprecise Data Searches
  - Random vs. Sequential Disk Accesses
  - Short Bursts of Database Interaction
  - Delays due to Multiple Transactions
- What can be done?
  - Tune Hardware Architecture
  - Tune OS
  - Tune Data Structures and Indices

120

Examples of Database Tuning

- To normalize or not to
  - Sacrificing Redundancy Elimination
  - Sacrificing Dependency Elimination
- Several Choices of Normalized Schemas
  - Vertical Partitioning Applications
- Recomputing Indices
  - Histograms etc. might be outdated
- Restricting Uses of Subqueries
  - Unnesting query blocks by Joins
- Declining the Use of Indices
  - Table Scans for Small Tables
  - Rule-based optimization: Rewrite A=6 as A+0=6
- Provide Redundant Tables
  - Decision-Support/Data Mining Queries
