Lecture - 1

Introduction to Database System



CS344
S. Ranbir Singh
Course Introduction: Theory
Grading
Quiz-1: 15 (between 18th - 20th Feb)
Surprise Test-1: 10
Mid: 20
Quiz-2: 15 (between 22nd - 24th April)
Surprise Test-2: 10
End: 30
Books
Database Management Systems, 3rd Edition, Raghu Ramakrishnan and Johannes Gehrke, McGraw-Hill, 2002
Course Introduction: Lab
Assignments: (Individual)
Number of assignments: 3 (10 each)
1st week lab tutorial and 2nd week evaluation
Project: (Group)
Allocation (11th Mar - 14th March)
Evaluation: 21st April - 25th April
30 (group) + 20 (individual, along with Quiz-2)
What is a Database?
A set of information held in a computer (Oxford English Dictionary)
One or more large structured sets of persistent data, usually associated with software to update and query the data (Free On-Line Dictionary of Computing)
A collection of data arranged for ease and speed of search and retrieval (Dictionary.com)
What is a Database?
A database is a collection of stored operational data used by the application systems of some particular enterprise (C.J. Date)
Paper Databases
Still contain a large portion of the world's knowledge
File-Based Data Processing Systems
Early batch processing of (primarily) business data
Database Management Systems (DBMS)
Why Database Management System?
Consider an example
Suppose we are building a system for a department
Student
Course
Professors
Tasks
Information management of the course allocation
Insert student, course and professor information
Update who takes what, who teaches what
Search
Prior to DBMS
Create relevant OS files such as
students.txt courses.txt professors.txt

Write application programs to access the files and update the files to perform specific tasks
Known as a File-Processing System
Example: Write an application for every task
Task: Enroll a student in the course
XYZ to CS344
Read students.txt
Read courses.txt
Find & update the record XYZ
Find & update the record CS344
Write students.txt
Write courses.txt
Write a dedicated application program to perform the above task
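The enrollment task can be sketched as a small dedicated program. This is only an illustration: the `key:item,item` line format, the `enroll` function and the student ID `XYZ` used as a record key are assumptions, not something the slides specify.

```python
import os
import tempfile

def enroll(data_dir, student_id, course_id):
    """Enroll one student in one course by rewriting both flat files."""
    students_path = os.path.join(data_dir, "students.txt")
    courses_path = os.path.join(data_dir, "courses.txt")

    # Read both files completely. The assumed "key:item,item" layout is
    # hard-coded here, so every program touching these files must repeat it.
    with open(students_path) as f:
        students = dict(line.rstrip("\n").split(":", 1) for line in f)
    with open(courses_path) as f:
        courses = dict(line.rstrip("\n").split(":", 1) for line in f)

    # Find and update the two records (the same fact is stored twice).
    students[student_id] = ",".join(filter(None, [students[student_id], course_id]))
    courses[course_id] = ",".join(filter(None, [courses[course_id], student_id]))

    # Write both files back.
    with open(students_path, "w") as f:
        f.writelines(f"{k}:{v}\n" for k, v in students.items())
    with open(courses_path, "w") as f:
        f.writelines(f"{k}:{v}\n" for k, v in courses.items())

# Usage with throwaway files in a temporary directory.
data_dir = tempfile.mkdtemp()
with open(os.path.join(data_dir, "students.txt"), "w") as f:
    f.write("XYZ:\n")
with open(os.path.join(data_dir, "courses.txt"), "w") as f:
    f.write("CS344:\n")
enroll(data_dir, "XYZ", "CS344")
```

Every new task (drop a course, list a professor's courses) would need another program like this one, each with its own copy of the file layout.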
What is the problem?
Program-Data Dependence
File descriptions are stored with each application program that accesses that particular file
Any change to that file description needs changes to all programs accessing that file
Application programs perform physical access to the files
What is the problem?
Data Redundancy
Repeated occurrences of the same data in different files
Waste of storage
Increase in management and processing cost
Inconsistency of data content
Lack of concordance of file content, caused by changes
Data anomaly
Insertion
Deletion
Update
What is the problem?
Isolation of data/limited sharing between systems
Lack of file sharing between different systems
Sharing between the course allocation department and the student registration department
Application programs need to be extensively modified for any new feature.
What is the problem?
Lack of data independence
A change in data characteristics may lead to a change in the file descriptor in every application program accessing it
Change the address size from 40 char to 30 char
Lack of structural independence
If you want to add a new field, you may need to change all application programs accessing it.
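A minimal sketch of why a field-size change breaks file-processing programs, assuming a hypothetical fixed-width record of a 10-byte name followed by the address field (the layout and sample values are made up for illustration):

```python
import struct

# Record layouts compiled into the application programs (hypothetical).
OLD_LAYOUT = struct.Struct("<10s40s")   # name: 10 bytes, address: 40 bytes
NEW_LAYOUT = struct.Struct("<10s30s")   # address shrunk to 30 bytes

# The file is now written with the new layout.
record = NEW_LAYOUT.pack(b"XYZ".ljust(10), b"Main Road".ljust(30))

# A program updated to the new layout reads the record correctly.
name, address = NEW_LAYOUT.unpack(record)

# A program still carrying the old layout cannot parse the record at all.
try:
    OLD_LAYOUT.unpack(record)
    old_program_ok = True
except struct.error:
    old_program_ok = False
```

Here the old program fails loudly. With a different change it could just as easily read shifted bytes without noticing.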
What is the problem?
Long development time
If you want to add new features
New query/task
Excessive program maintenance
Lots of programs
Weak recovery and security features
What if a program crashes?
How to protect a file from access by other users?
Concurrency and multi-user support
DBMS helps in addressing those problems
Components of a Database using DBMS
[Figure: users work through a user interface with applications and application programs, which use the DBMS to reach the database.]
DBMS design tools: table creation, form creation, query creation, report creation, procedural language compiler (4GL)
DBMS run time: form processor, query processor, report writer, language run time
Database contains: user data, metadata, indexes, application metadata
Databases
Web indexes
Library catalogues
Medical records
Bank accounts
Stock control
Personnel systems
Product catalogues
Telephone directories
Train timetables
Airline bookings
Credit card details
Student records
Customer histories
Stock market prices
Discussion boards
and so on.
Is a database different from information retrieval (Web search engine)?
Database Systems
A database system consists of
Data (the database)
Software
Hardware
Users
We focus mainly on the software
Database systems allow users to
Store
Update
Retrieve
Organise
Protect
their data.
In Traditional DBMS: Database Users
End users
Use the database system to achieve some goal
Application developers
Write software to allow end users to interface with the database system
Database Administrator (DBA)
Designs & manages the database system
Database systems programmer
Writes the database software itself
Database Management Systems
A database management system (DBMS) is the software that controls that information
Examples:
Oracle
DB2 (IBM)
MS SQL Server
MS Access
Ingres
PostgreSQL
MySQL
DBMS: The idea is to design an independent and logically separated multi-level architecture
An Example: ANSI/SPARC Architecture
ANSI - American National Standards Institute
SPARC - Standards Planning and Requirements Committee
1975 - proposed a framework for DBs
A three-level architecture
Internal level: For systems designers
Conceptual level: For database designers and administrators
External level: For database users
Internal Level
Deals with physical storage of data
Structure of records on disk - files, pages, blocks
Indexes and ordering of records
Used by database system programmers
Internal Schema
RECORD EMP
LENGTH=44
HEADER: BYTE(5)
OFFSET=0
NAME: BYTE(25)
OFFSET=5
SALARY: FULLWORD
OFFSET=30
DEPT: BYTE(10)
OFFSET=34
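The EMP record layout can be mirrored with Python's `struct` module. This is a sketch under assumptions: a FULLWORD is taken to be a 4-byte integer, little-endian byte order is chosen arbitrarily, and the sample values are made up.

```python
import struct

# Assumed byte-level reading of the EMP record: HEADER 5 bytes, NAME 25 bytes,
# SALARY one 4-byte fullword, DEPT 10 bytes; "<" disables alignment padding.
EMP = struct.Struct("<5s25si10s")

raw = EMP.pack(b"\x00" * 5, b"Alice".ljust(25), 52000, b"Physics".ljust(10))
header, name, salary, dept = EMP.unpack(raw)

# Each offset is the sum of the sizes of the fields before it.
offsets = {"HEADER": 0, "NAME": 5, "SALARY": 5 + 25, "DEPT": 5 + 25 + 4}

# Reading SALARY directly at its offset, as a storage-level program would.
salary_at_offset = struct.unpack_from("<i", raw, offsets["SALARY"])[0]
```

The total size and the offsets computed this way agree with LENGTH=44, OFFSET=30 and OFFSET=34 in the schema above.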


Conceptual/Logical Level
Deals with the organisation of the data as a whole
Abstractions are used to remove unnecessary details of the internal level
Used by DBAs and application programmers
Conceptual Schema
CREATE TABLE Employee (
Name VARCHAR(25),
Salary REAL,
Dept_Name VARCHAR(10))

External Level
Provides a view of the database tailored to a user
Parts of the data may be hidden
Data is presented in a useful form
Used by end users and application programmers
External Schemas
Payroll:
String Name
double Salary

Personnel:
char *Name
char *Department
Mappings
Mappings translate information from one level to the next
External/Conceptual
Conceptual/Internal
These mappings provide data independence
Physical data independence
Changes to the internal level shouldn't affect the conceptual level
Logical data independence
Conceptual level changes shouldn't affect external levels
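One way to picture the external/conceptual mapping is with SQL views, sketched here in SQLite through Python's `sqlite3` module. Treating the Payroll and Personnel external schemas as views is an assumption made for illustration, and the sample row and the `Phone` column are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employee (Name VARCHAR(25), Salary REAL, Dept_Name VARCHAR(10));
    INSERT INTO Employee VALUES ('Asha', 50000.0, 'CSE');
    -- External schemas as views: each one hides part of the data.
    CREATE VIEW Payroll AS SELECT Name, Salary FROM Employee;
    CREATE VIEW Personnel AS SELECT Name, Dept_Name AS Department FROM Employee;
""")
before = conn.execute("SELECT * FROM Payroll").fetchall()

# A conceptual-level change: add a column that the views do not mention.
conn.execute("ALTER TABLE Employee ADD COLUMN Phone VARCHAR(15)")
after = conn.execute("SELECT * FROM Payroll").fetchall()

personnel_columns = [d[0] for d in
                     conn.execute("SELECT * FROM Personnel").description]
```

The Payroll user sees the same rows before and after the change, which is the point of logical data independence.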
ANSI/SPARC Architecture
[Figure: User 1, User 2 and User 3 access External View 1 and External View 2 (external schemas); external/conceptual mappings connect these to the Conceptual View (conceptual/logical schema, maintained by the DBA); the conceptual/internal mapping connects it to the Stored Data (internal schema).]
The concept at the core
Data independence
Physical representation and location of data and the use of that data are separated
The application doesn't need to know how or where the database has stored the data, but just how to ask for it
Moving a database from one DBMS to another should not have a material effect on application programs
Recoding, adding fields, etc. in the database should not affect applications
Summary: File Processing to DBMS
Problems with file processing systems
Inconsistent data
Inflexibility
Limited data sharing
Poor enforcement of standards
Excessive program maintenance
Summary: File Processing to DBMS
DBMS Benefits
Minimal data redundancy
Consistency of data
Integration of data
Sharing of data
Ease of application development
Uniform security, privacy, and integrity controls
Data accessibility and responsiveness
Data independence
Reduced program maintenance
Lecture-2
S. Ranbir Singh
CS344, IITG
Types of Database Systems

Number of Users
Single-user
Desktop database
Multiuser
Workgroup database
Enterprise database
Scope
Desktop
Workgroup
Enterprise
Types of Database Systems
Location
Centralized
Distributed
Use
Transactional (Production)
Decision support
Data warehouse
Types of Database Systems
PC databases
Centralized database
Client/server databases
Distributed databases
PC Databases
E.g.:
Access
FoxPro
dBase
Etc.
Centralized Databases
[Figure: a central computer.]
Client Server Databases
[Figure: several clients connected over a network to a database server.]
Distributed Databases
[Figure: homogeneous databases, with computers at Location A, Location B and Location C.]
Distributed Databases
[Figure: heterogeneous or federated databases, with clients and a database server on a local network, connected through a comm server to remote computers.]
Schemas and Instances
Schema: the logical structure of the database
Physical schema: database design at the physical level
Logical schema: database design at the logical level
Instance: the actual content of the database at a particular point in time
Analogous to type and variable in programming languages
Data Models
A collection of tools for describing
Data
Data relationships
Data semantics
Data constraints
Relational model
Entity-Relationship data model (mainly for
database design)
Object-based data models (Object-oriented and Object-relational)
Semi-structured data model (XML)
Other older models:
Network model
Hierarchical model
Data Manipulation Language (DML)
Language for accessing and manipulating the
data organized by the appropriate data model
DML also known as query language
Two classes of languages
Procedural: user specifies what data is required and how to get those data
Declarative (nonprocedural): user specifies what data is required without specifying how to get those data
SQL is the most widely used query language
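The two classes of language can be contrasted on one small query. The sketch below uses SQLite through Python's `sqlite3` module; the sample rows and the salary threshold are made up for illustration.

```python
import sqlite3

rows = [("10101", "Srinivasan", "Comp. Sci.", 65000.0),
        ("12121", "Wu", "Finance", 90000.0),
        ("15151", "Mozart", "Music", 40000.0)]

# Procedural: the program spells out how to scan, filter and order.
procedural = []
for _id, name, dept_name, salary in rows:
    if salary > 60000:
        procedural.append(name)
procedural.sort()

# Declarative: the SQL query states only what is wanted.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE instructor (ID CHAR(5), name VARCHAR(20), "
             "dept_name VARCHAR(20), salary NUMERIC(8,2))")
conn.executemany("INSERT INTO instructor VALUES (?, ?, ?, ?)", rows)
declarative = [r[0] for r in conn.execute(
    "SELECT name FROM instructor WHERE salary > 60000 ORDER BY name")]
```

Both versions return the same names. In the declarative one, the choice of scan, index or sort strategy is left to the DBMS.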
Data Definition Language (DDL)
Define database schema
Example: create table instructor (
ID char(5), name varchar(20),
dept_name varchar(20), salary numeric(8,2))
DDL compiler generates a set of table templates stored in a data dictionary
Data dictionary contains metadata (data about data)
Database schema
Integrity constraints
Primary key (ID uniquely identifies instructors)
Authorization
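As one concrete example of a data dictionary, SQLite records every table definition in its `sqlite_master` catalog and exposes column metadata through `PRAGMA table_info`. The sketch below runs the instructor DDL and reads that metadata back. Other DBMSs store their dictionaries differently, so this shows only one system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE instructor (
        ID        CHAR(5) PRIMARY KEY,
        name      VARCHAR(20),
        dept_name VARCHAR(20),
        salary    NUMERIC(8,2))
""")

# The catalog: one row per schema object, including the original DDL text.
tables = [r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]

# PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk) per column.
columns = [(r[1], r[2], r[5])
           for r in conn.execute("PRAGMA table_info(instructor)")]
```

`columns` holds the column name, declared type and primary-key flag: metadata, not user data.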
History of Database Systems
1950s and early 1960s:
Data processing using magnetic tapes for storage
Tapes provided only sequential access
Punched cards for input
Late 1960s and 1970s:
Hard disks allowed direct access to data
Network and hierarchical data models in widespread use
Ted Codd defines the relational data model
Would win the ACM Turing Award for this work
IBM Research begins System R prototype
UC Berkeley begins Ingres prototype
High-performance (for the era) transaction processing

History (cont.)
1980s:
Research relational prototypes evolve into commercial
systems
SQL becomes industrial standard
Parallel and distributed database systems
Object-oriented database systems
1990s:
Large decision support and data-mining applications
Large multi-terabyte data warehouses
Emergence of Web commerce
Early 2000s:
XML and XQuery standards
Automated database administration
Later 2000s:
Giant data storage systems
Google BigTable, Yahoo PNuts, Amazon, ..
Introduction to Relational Model
Example of a Relation
[Figure: an example relation with its attributes (or columns) and tuples (or rows) labelled.]
Attribute Types
The set of allowed values for each
attribute is called the domain of the
attribute
Attribute values are (normally)
required to be atomic; that is,
indivisible
The special value null is a member
of every domain
The null value causes complications in the definition of many operations
Relation Schema and Instance
$A_1, A_2, \ldots, A_n$ are attributes
$R = (A_1, A_2, \ldots, A_n)$ is a relation schema
Example:
instructor = (ID, name, dept_name, salary)
Formally, given sets of domains $D_1, D_2, \ldots, D_n$, a relation $r$ is a subset of
$D_1 \times D_2 \times \cdots \times D_n$
Thus, a relation is a set of n-tuples $(a_1, a_2, \ldots, a_n)$ where each $a_i \in D_i$
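The definition of a relation as a subset of a Cartesian product can be checked directly on tiny domains. The two domains and the instance below are made up for illustration.

```python
from itertools import product

# Two tiny made-up domains.
D1 = {"10101", "12121"}            # possible ID values
D2 = {"Comp. Sci.", "Finance"}     # possible dept_name values

cartesian = set(product(D1, D2))   # D1 x D2: every possible 2-tuple

# One relation instance: some of those tuples, not necessarily all.
r = {("10101", "Comp. Sci."), ("12121", "Finance")}
is_relation = r <= cartesian

# Being a set, a relation has no duplicate tuples and no tuple order.
same_relation = r == {("12121", "Finance"), ("10101", "Comp. Sci.")}
```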

Relations are Unordered


Order of tuples is irrelevant (tuples may be stored in an arbitrary order)
Example: instructor relation with unordered tuples
Database
A database consists of multiple
relations
Example: instructor, student, advisor
Bad design: Put everything in one relation
univ (instructor -ID, name, dept_name, salary,
student_Id, ..)
results in
repetition of information (e.g., two students have the
same instructor)
the need for null values (e.g., to represent a student with no advisor)
Keys
Let $K \subseteq R$
K is a superkey of R if values for K are sufficient to identify a unique tuple of each possible relation r(R)
{ID} and {ID, name} are both superkeys of instructor.
Superkey K is a candidate key if K is minimal
{ID} is a candidate key for instructor
One of the candidate keys is selected to be the
primary key. Which one?
Foreign key constraint: Value in one relation must
appear in another
Referencing relation
Referenced relation
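The superkey and candidate-key definitions can be tried out on a single relation instance. The helper names `is_superkey` and `candidate_keys` and the sample rows are hypothetical. Because the definition quantifies over every possible relation r(R), a check on one instance can only rule keys out; it can never prove that a set of attributes is a key of the schema.

```python
from itertools import combinations

def is_superkey(rows, attrs):
    """True if no two rows of this instance agree on all of attrs."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return len(seen) == len(rows)

def candidate_keys(rows, all_attrs):
    """Minimal attribute sets that are superkeys of this instance."""
    keys = []
    for size in range(1, len(all_attrs) + 1):
        for attrs in combinations(all_attrs, size):
            # Skip any set that contains an already-found smaller key.
            minimal = not any(set(k) <= set(attrs) for k in keys)
            if minimal and is_superkey(rows, attrs):
                keys.append(attrs)
    return keys

instructor = [
    {"ID": "1", "name": "Kim", "dept_name": "Physics"},
    {"ID": "2", "name": "Kim", "dept_name": "Music"},
    {"ID": "3", "name": "Wu", "dept_name": "Music"},
]
found = candidate_keys(instructor, ["ID", "name", "dept_name"])
```

On this instance `found` contains ("ID",) and also ("name", "dept_name"). The second is an accident of the sample data, which is why keys must be declared from the meaning of the schema rather than inferred from its current contents.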
Lecture-3

Integrity Constraints
Key Constraints
Domain/Attribute Constraints
Referential Integrity/Foreign Key Constraints
Assertions
Triggers
Functional Dependencies
Integrity constraints guard against accidental damage to the database, by ensuring that authorized changes to the database do not result in a loss of data consistency.
Domain/Attribute Constraints
Associate a data type/domain with every attribute
Attach constraints to values of attributes

1. Data Type
e.g.: CREATE TABLE branch(
bname CHAR(15),
....)
2. NOT NULL
e.g.: CREATE TABLE branch(
bname CHAR(15) NOT NULL,
.... )
3. CHECK
e.g.: CREATE TABLE depositor(
....
balance int NOT NULL,
CHECK( balance >= 0),
....)
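A short sketch of NOT NULL and CHECK being enforced, run in SQLite through Python's `sqlite3` module. The `cname` column stands in for the columns elided by "...." in the slide, and the helper `try_insert` and the sample values are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE depositor (
        cname   CHAR(15) NOT NULL,
        balance INT NOT NULL,
        CHECK (balance >= 0))
""")

def try_insert(cname, balance):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO depositor VALUES (?, ?)", (cname, balance))
        return True
    except sqlite3.IntegrityError:
        return False

results = [try_insert("Jones", 100),   # satisfies every constraint
           try_insert(None, 100),      # violates NOT NULL on cname
           try_insert("Smith", -5)]    # violates CHECK (balance >= 0)
row_count = conn.execute("SELECT COUNT(*) FROM depositor").fetchone()[0]
```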
Key Constraints
It specifies that a relation is a set, not a bag

1. Primary Key:
CREATE TABLE branch(
bname CHAR(15) PRIMARY KEY,
bcity CHAR(20),
assets INT);
or
CREATE TABLE depositor(
cname CHAR(15),
acct_no CHAR(5),
PRIMARY KEY(cname, acct_no));
2. Candidate Keys:
CREATE TABLE customer (
ssn CHAR(9) PRIMARY KEY,
cname CHAR(15),
address CHAR(30),
city CHAR(10),
UNIQUE (cname, address, city));
Key Constraints
Effect of SQL key declarations
PRIMARY KEY (A1, A2, ..., An)
or
UNIQUE (A1, A2, ..., An)
Insertions: check if any tuple has the same values for A1, A2, ..., An as any inserted tuple. If found, reject insertion
Updates to any of A1, A2, ..., An: treat as insertion of entire tuple
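The effect of a key declaration on insertions and updates can be observed in SQLite through Python's `sqlite3` module. The account numbers are made-up sample values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE depositor (
        cname   CHAR(15),
        acct_no CHAR(5),
        PRIMARY KEY (cname, acct_no))
""")
conn.execute("INSERT INTO depositor VALUES ('Jones', 'A-101')")
conn.execute("INSERT INTO depositor VALUES ('Jones', 'A-215')")  # differs in acct_no

try:
    # Same (cname, acct_no) as an existing tuple: the insertion is rejected.
    conn.execute("INSERT INTO depositor VALUES ('Jones', 'A-101')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

try:
    # Updating a key attribute is checked like inserting the whole tuple.
    conn.execute("UPDATE depositor SET acct_no = 'A-101' WHERE acct_no = 'A-215'")
    update_rejected = False
except sqlite3.IntegrityError:
    update_rejected = True
```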
Primary vs Unique (candidate)
1. One primary key per table; several unique keys are allowed.
2. A foreign key normally references the primary key; standard SQL and DBMSs such as SQL Server also allow it to reference a UNIQUE (candidate) key
3. DBMS may treat the primary key differently
(e.g.: implicitly create an index on PK)
4. NULL values are permitted in UNIQUE keys but not in the PRIMARY KEY
5. Primary key values should not be modified (SQL Server allows it)
Referential Integrity: Foreign Key
Referential integrity is a constraint that enforces the relationship between the primary
key (some DBMSs also consider a candidate key) of one relation and the foreign key of
another. It ensures that each value of a foreign key attribute refers to an entity
that appears in the referenced table. As with other constraints, any attempt to modify
the database contents that would cause a foreign key constraint violation must be
disallowed.
Relational database systems provide enforcement of referential integrity
constraints. The constraint is specified in the database schema, and the
database system enforces it.
Referential Integrity: Foreign Key
Ensures that a value that appears in one relation for a given set of attributes also appears for a certain set of attributes in another relation.
If an account exists in the database with branch name Perryridge, then the branch Perryridge must actually exist in the database.
[Figure: branch_name and account_no are marked as the primary keys of their respective relations; branch_name in account is marked as the foreign key.]
branch (branch_name, branch_city, asset)
Perryridge Brooklyn 300,000
account (account_no, branch_name, balance)
A-123 Perryridge 3000
A set of attributes X in R is a foreign key if it is not a primary key of R but it is a primary key of some relation S.
Referential Integrity
Formal Definition
Let $r_1(R_1)$ and $r_2(R_2)$ be relations with primary keys $K_1$ and $K_2$ respectively.

The subset $\alpha$ of $R_2$ is a foreign key referencing $K_1$ in relation $r_1$ if, for every $t_2$ in $r_2$, there must be a tuple $t_1$ in $r_1$ such that $t_1[K_1] = t_2[\alpha]$.

Referential integrity constraint: $\Pi_{\alpha}(r_2) \subseteq \Pi_{K_1}(r_1)$
[Figure: a tuple $t_2$ of $R_2(K_2, \ldots, \alpha, \ldots)$ pointing to the tuple $t_1$ of $R_1(K_1, \ldots)$ that it references.]
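The subset condition between the two projections can be evaluated directly with Python sets. The `project` helper and the sample tuples are made up for illustration; `branch_name` plays the role of both the foreign key in r2 and the primary key of r1.

```python
def project(relation, attrs):
    """Relational projection: the set of attrs-values appearing in relation."""
    return {tuple(t[a] for a in attrs) for t in relation}

# r1 is the referenced relation, r2 the referencing relation.
r1 = [{"branch_name": "Perryridge"}, {"branch_name": "Brighton"}]
r2 = [{"account_no": "A-1", "branch_name": "Perryridge"}]

# The constraint holds when every foreign key value in r2 appears in r1.
holds = project(r2, ["branch_name"]) <= project(r1, ["branch_name"])

# A dangling reference breaks the constraint.
r2_bad = r2 + [{"account_no": "A-2", "branch_name": "Downtown"}]
violated = not (project(r2_bad, ["branch_name"]) <= project(r1, ["branch_name"]))
```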
Referential Integrity for Insertion and Deletion
The following tests must be made in order to preserve the following referential integrity constraint:
$\Pi_{\alpha}(r_2) \subseteq \Pi_{K}(r_1)$

Insert. If a tuple $t_2$ is inserted into $r_2$, the system must ensure that there is a tuple $t_1$ in $r_1$ such that $t_1[K] = t_2[\alpha]$. That is,
$t_2[\alpha] \in \Pi_{K}(r_1)$
Delete. If a tuple $t_1$ is deleted from $r_1$, the system must compute the set of tuples in $r_2$ that reference $t_1$:
$\sigma_{\alpha = t_1[K]}(r_2)$
If this set is not empty, either the delete command is rejected as an error, or the tuples that reference $t_1$ must themselves be deleted
(cascading deletions are possible)
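Both tests can be watched in SQLite through Python's `sqlite3` module. Two caveats: SQLite checks foreign keys only after `PRAGMA foreign_keys = ON`, and the tables here are cut down to the columns needed, with underscores in place of the hyphenated names. The `rejected` helper and the sample values are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE branch (branch_name CHAR(15) PRIMARY KEY);
    CREATE TABLE account (
        account_no  CHAR(10) PRIMARY KEY,
        branch_name CHAR(15) REFERENCES branch (branch_name));
    INSERT INTO branch VALUES ('Perryridge');
    INSERT INTO account VALUES ('A-1', 'Perryridge');
""")

def rejected(sql):
    """Return True if the statement violates an integrity constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

# Insert test: the new foreign key value has no matching branch.
insert_rejected = rejected("INSERT INTO account VALUES ('A-2', 'Downtown')")
# Delete test: an account still references the branch being deleted.
delete_rejected = rejected("DELETE FROM branch WHERE branch_name = 'Perryridge'")
```

Without a cascade clause, SQLite takes the "reject as an error" option for the delete.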
Referential Integrity for Update
If a tuple $t_2$ is updated in relation $r_2$ and the update modifies values for the foreign key $\alpha$, then a test similar to the insert case is made. Let $t_2'$ denote the new value of tuple $t_2$. The system must ensure that
$t_2'[\alpha] \in \Pi_{K}(r_1)$
(the new foreign key value must exist)

If a tuple $t_1$ is updated in $r_1$, and the update modifies values for the primary key ($K$), then a test similar to the delete case is made. The system must compute
$\sigma_{\alpha = t_1[K]}(r_2)$
using the old value of $t_1$ (the value before the update is applied). If this set is not empty, the update may be rejected as an error, or the update may be applied to the tuples in the set (cascade update), or the tuples in the set may be deleted.
(no foreign keys may contain the old primary key)
Referential Integrity in SQL - example
create table customer
(customer_name char(20) not null,
customer_street char(30),
customer_city char(30),
primary key (customer_name))

create table branch
(branch_name char(15) not null,
branch_city char(30),
assets integer,
primary key (branch_name))
Referential Integrity in SQL - example
create table account
(branch_name char(15),
account_number char(10) not null,
balance integer,
primary key (account_number),
foreign key (branch_name) references branch)

create table depositor
(customer_name char(20) not null,
account_number char(10) not null,
primary key (customer_name, account_number),
foreign key (account_number) references account,
foreign key (customer_name) references customer)
Cascading Actions in SQL
Due to the on delete cascade clauses, if a delete of a tuple in branch results in a referential-integrity constraint violation, the delete cascades to the account relation, deleting the tuple that refers to the branch that was deleted.
Cascading updates are similar.
create table account
...
foreign key (branch_name) references branch
on delete cascade
on update cascade,
...)
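A minimal sketch of both cascading actions in SQLite through Python's `sqlite3` module. The tables are reduced to the columns needed, underscores replace the hyphenated names, and the branch and account values are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE branch (branch_name CHAR(15) PRIMARY KEY);
    CREATE TABLE account (
        account_number CHAR(10) PRIMARY KEY,
        branch_name    CHAR(15),
        FOREIGN KEY (branch_name) REFERENCES branch
            ON DELETE CASCADE
            ON UPDATE CASCADE);
    INSERT INTO branch VALUES ('Perryridge'), ('Brighton');
    INSERT INTO account VALUES ('A-1', 'Perryridge'), ('A-2', 'Brighton');
""")

# Cascading update: renaming the branch rewrites the referencing account row.
conn.execute("UPDATE branch SET branch_name = 'Perryridge North' "
             "WHERE branch_name = 'Perryridge'")
after_update = conn.execute(
    "SELECT branch_name FROM account WHERE account_number = 'A-1'").fetchone()[0]

# Cascading delete: removing the branch removes the accounts that refer to it.
conn.execute("DELETE FROM branch WHERE branch_name = 'Brighton'")
remaining = [r[0] for r in conn.execute("SELECT account_number FROM account")]
```

A cascade deletes data the user did not name, so it suits only relationships where the referencing rows cannot meaningfully outlive the referenced one.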
