
3/6/2019

Principles of Database Management
Lecture 2: SQL - 1

Announcements
• If you are enrolled in the class but have not seen the information on Blackboard IU, please send me an email
• HW1 posted on Blackboard IU
– Due on 11/3 (Monday), 10:00 pm. Start early – no late days
– Both homework missed -> incomplete grade
• Lecture “notes” (incomplete) will be uploaded before the class, “slides” (completed) after the class
• TA: Dang Tam Nhan (dtnhan@hcmiu.edu.vn)

Recap: Lecture 1
• Why use a DBMS?
• Structured data model: Relational data model
– table, schema, instance, tuples, attributes
– bag and set semantics
• Data independence
– Physical independence: can change how data is stored on disk without affecting applications
– Logical independence: can change schema w/o affecting apps

Summary: Data Models
• Relational data model is the most standard for database management
– and is the main focus of this course
• Semi-structured model/XML is also used in practice – you will use them in hw/assignments
• Unstructured data (text/photo/video) is unavoidable, but won’t be covered in this class

Duke CS, Fall 2017 CompSci 516: Database Systems

Today’s topic
• SQL basic
– Reading material: [RG] Chapters 3 and 5
– Additional reading for practice: [GUW] Chapter 6

Acknowledgement:
The following slides have been created adapting the instructor material of the [RG] book provided by the authors Dr. Ramakrishnan and Dr. Gehrke.

What is SQL?
• SQL is Structured Query Language, a computer language for storing, manipulating and retrieving data stored in a relational database.
• SQL is the standard language for relational database systems.

A Brief History of SQL
1970 - Dr. Edgar F. "Ted" Codd of IBM is known as the father of relational databases. He described a relational model for databases.
1974 - Structured Query Language appeared.
1978 - IBM worked to develop Codd's ideas and released a product named System/R.
1986 - IBM developed the first prototype of a relational database, and SQL was standardized by ANSI. The first relational database was released by Relational Software, which later came to be known as Oracle.

• Standards:
– SQL-86
– SQL-89 (minor revision)
– SQL-92 (major revision)
– SQL-99 (major extensions, current standard)
– More: MS SQL Server history on the internet

Purposes of SQL
• Data Manipulation Language (DML)
– Querying: SELECT-FROM-WHERE
– Modifying: INSERT/DELETE/UPDATE
• Data Definition Language (DDL)
– CREATE/ALTER/DROP

DML - Data Manipulation Language

DDL - Data Definition Language

Relational Model
• column = attribute = field
• row = tuple = record

sid    name     login          age  gpa
53666  Jones    jones@cs       18   3.4
53688  Smith    smith@ee       18   3.2
53650  Smith    smith1@math    19   3.8
53831  Madayan  madayan@music  11   1.8
53832  Guldu    guldu@music    12   2.0

• mathematically, a relation is a set of tuples
– each tuple appears 0 or 1 times in the table
– order of the rows is unspecified

CSE 414 - Fall 2017
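The set-of-tuples view above can be sketched in a few lines of Python. This is only an illustration: the `students` variable is a hypothetical in-memory stand-in for the table, not something a DBMS exposes.

```python
# Hypothetical in-memory model of a relation: a Python set of tuples.
students = {
    (53666, "Jones", "jones@cs", 18, 3.4),
    (53688, "Smith", "smith@ee", 18, 3.2),
    (53650, "Smith", "smith1@math", 19, 3.8),
}

# Adding a tuple that is already present changes nothing:
# each tuple appears 0 or 1 times.
students.add((53666, "Jones", "jones@cs", 18, 3.4))
count_after_duplicate = len(students)

# Row order is unspecified, so the same rows listed in another
# order form the same relation.
same_relation = students == set(reversed(sorted(students)))
```

Note that SQL tables themselves follow bag semantics unless a key or DISTINCT is used, so this sketch models the mathematical definition rather than a stored table.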

SQL (“sequel”)
• Standard query language for relational data
– used for databases in many different contexts
– inspires query languages for non-relational (e.g. SQL++)
• Everything not in quotes ('…') is case insensitive
• Provides standard types. Examples:
– numbers: INT, FLOAT, DECIMAL(p,s)
• DECIMAL(p,s): Exact numerical, precision p, scale s. Example: decimal(5,2) is a number that has 3 digits before the decimal and 2 digits after the decimal
– strings: CHAR(n), VARCHAR(n)
• CHAR(n): Fixed-length n
• VARCHAR(n): Variable length. Maximum length n

SQL (“sequel”) – Cont.
– BOOLEAN
– DATE, TIME, TIMESTAMP
• DATE: Stores year, month, and day values
• TIME: Stores hour, minute, and second values
• TIMESTAMP: Stores year, month, day, hour, minute, and second values
• Additional types in here

Exact Numeric Data Types
Approximate Numeric Data Types
Date and Time Data Types
Character Strings Data Types
Unicode Character Strings Data Types
Binary Data Types
Misc Data Types

SQL statements


• Create database …
• create table …
• drop table ...
• alter table ... add/remove ...
• insert into ... values ...
• delete from ... where ...
• update ... set ... where ...


Demo on SQLite
• E.g., type sqlite3 in Cygwin
• .exit - exit from sqlite3

SQL statements
Syntax:

CREATE DATABASE DatabaseName;
    CREATE DATABASE testDB;

DROP DATABASE DatabaseName;
    DROP DATABASE testDB;

USE DatabaseName;
    USE testDB;

CREATING TABLE
The basic syntax of the CREATE TABLE statement is as follows:

CREATE TABLE table_name(
    column1 datatype,
    column2 datatype,
    column3 datatype,
    .....
    columnN datatype,
    PRIMARY KEY (one or more columns)
);

Creating Relation/Table
• Creates the “Students” relation
– the type (domain) of each field is specified
– enforced by the DBMS whenever tuples are added or modified

CREATE TABLE Students
    (sid CHAR(10),
     name CHAR(15),
     login CHAR(20),
     age INTEGER,
     gpa REAL/DECIMAL(2,1))

• As another example, the “Enrolled” table holds information about courses that students take

CREATE TABLE Enrolled
    (sid CHAR(20), ??
     cid CHAR(20),
     grade CHAR(2))

Students
sid    name   login       age  gpa
53666  Jones  jones@cs    18   3.4
53688  Smith  smith@eecs  18   3.2
53650  Smith  smith@math  19   3.8

Enrolled
sid    cid          grade
53831  Carnatic101  C
53831  Reggae203    B
53650  Topology112  A
53666  History105   B

Example: CREATE TABLE
Customers(ID: int, name: string(20), age: int, address: string(25), salary: decimal(18,2))

CREATE TABLE Customers(

);

Destroying Relation/table
Syntax: DROP TABLE table_name;

DROP TABLE Customers;
• Destroys the relation Customers
– The schema information and the tuples are deleted.

Altering Relation/Table
ALTER TABLE Students
    ADD COLUMN firstYear INTEGER;
• The schema of Students is altered by adding a new field.
• What’s the value in the new field? Every tuple in the current instance is extended with a null value in the new field.

Adding/ Insert into
Syntax:
INSERT INTO TABLE_NAME (column1, column2, column3, ..., columnN)
VALUES (value1, value2, value3, ..., valueN);

INSERT INTO TABLE_NAME
VALUES (value1, value2, value3, ..., valueN);
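To see the null-extension behaviour of ALTER TABLE concretely, here is a minimal sketch using Python's built-in sqlite3 module with an in-memory database. The function name `add_first_year_column` is made up for this illustration, and gpa is declared REAL for simplicity.

```python
import sqlite3

def add_first_year_column():
    """Add a column to a populated table and show what existing rows get."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Students (sid CHAR(10), name CHAR(15), "
        "login CHAR(20), age INTEGER, gpa REAL)"
    )
    conn.execute(
        "INSERT INTO Students VALUES ('53666', 'Jones', 'jones@cs', 18, 3.4)"
    )
    # Alter the schema after a tuple already exists.
    conn.execute("ALTER TABLE Students ADD COLUMN firstYear INTEGER")
    rows = conn.execute("SELECT sid, firstYear FROM Students").fetchall()
    conn.close()
    return rows  # the existing tuple carries NULL (None) in firstYear

rows_after_alter = add_first_year_column()
```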

Adding and Deleting Tuples
• Can insert a single tuple using:
    INSERT INTO Students (sid, name, login, age, gpa)
    VALUES (53688, 'Smith', 'smith@ee', 18, 3.2)

• Can delete all tuples satisfying some condition (e.g., name = Smith):
    DELETE
    FROM Students S
    WHERE S.name = 'Smith'

Example: Adding/ Insert into
INSERT INTO CUSTOMERS (ID, NAME, AGE, ADDRESS, SALARY)
VALUES (7, 'Muffy', 24, 'Indore', 10000.00);

INSERT INTO CUSTOMERS
VALUES (7, 'Muffy', 24, 'Indore', 10000.00);
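The insert and delete statements above can be tried out with Python's sqlite3 module. This is a sketch under assumptions: the table is created in memory, the helper name `insert_then_delete_smith` is hypothetical, and the table alias `S` from the slide is dropped to keep the statement portable.

```python
import sqlite3

def insert_then_delete_smith():
    """Insert two students, then delete every tuple whose name is Smith."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Students (sid CHAR(10), name CHAR(15), "
        "login CHAR(20), age INTEGER, gpa REAL)"
    )
    # Form 1: name the columns explicitly.
    conn.execute(
        "INSERT INTO Students (sid, name, login, age, gpa) "
        "VALUES ('53688', 'Smith', 'smith@ee', 18, 3.2)"
    )
    # Form 2: omit the column list; values follow the schema order.
    conn.execute(
        "INSERT INTO Students VALUES ('53666', 'Jones', 'jones@cs', 18, 3.4)"
    )
    deleted = conn.execute(
        "DELETE FROM Students WHERE name = 'Smith'"
    ).rowcount
    remaining = conn.execute("SELECT name FROM Students").fetchall()
    conn.close()
    return deleted, remaining

deleted_count, remaining_names = insert_then_delete_smith()
```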

update ... set ... where ...
UPDATE Students
SET age = age + 2
WHERE sid = '53680';

Integrity Constraints (ICs)
• IC: condition that must be true for any instance of the database
– e.g., domain constraints
– ICs are specified when schema is defined
– ICs are checked when relations are modified
• A legal instance of a relation is one that satisfies all specified ICs
– DBMS will not allow illegal instances
• If the DBMS checks ICs, stored data is more faithful to real-world meaning
– Avoids data entry errors, too!

Integrity Constraints
NOT NULL Constraint - Ensures that a column cannot have a NULL value.
DEFAULT Constraint - Provides a default value for a column when none is specified.
UNIQUE Constraint - Ensures that all values in a column are different.
PRIMARY Key - Uniquely identifies each row/record in a table.
FOREIGN Key - Refers to a row/record in another table by matching that table's key.
CHECK Constraint - Ensures that all the values in a column satisfy certain conditions.
INDEX - Used to create and retrieve data from the database very quickly.

Integrity Constraints - NOT NULL
CREATE TABLE CUSTOMERS(
    ID INT NOT NULL,
    NAME VARCHAR (20) NOT NULL,
    AGE INT NOT NULL,
    ADDRESS CHAR (25),
    SALARY DECIMAL (18, 2),
    PRIMARY KEY (ID)
);

If the CUSTOMERS table has already been created

ALTER TABLE CUSTOMERS
    ALTER COLUMN SALARY DECIMAL (18, 2) NOT NULL;
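A quick way to watch the NOT NULL constraint being enforced is to run the CUSTOMERS definition in an in-memory SQLite database from Python. The helper name `try_insert_without_age` is an assumption made for this sketch.

```python
import sqlite3

def try_insert_without_age():
    """Attempt an insert that leaves the NOT NULL column AGE unset."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE CUSTOMERS(
               ID INT NOT NULL,
               NAME VARCHAR (20) NOT NULL,
               AGE INT NOT NULL,
               ADDRESS CHAR (25),
               SALARY DECIMAL (18, 2),
               PRIMARY KEY (ID))"""
    )
    try:
        # AGE is omitted, so it would be NULL: the DBMS must reject this.
        conn.execute("INSERT INTO CUSTOMERS (ID, NAME) VALUES (1, 'Muffy')")
    except sqlite3.IntegrityError as exc:
        return str(exc)
    finally:
        conn.close()
    return None

not_null_error = try_insert_without_age()
```

ADDRESS and SALARY carry no NOT NULL constraint, so omitting them alone would have been accepted.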

Integrity Constraints - DEFAULT
CREATE TABLE CUSTOMERS(
    ID INT NOT NULL,
    NAME VARCHAR (20) NOT NULL,
    AGE INT NOT NULL,
    ADDRESS CHAR (25),
    SALARY DECIMAL (18, 2) DEFAULT 5000.00,
    PRIMARY KEY (ID)
);

If the CUSTOMERS table has already been created

ALTER TABLE CUSTOMERS
    DROP COLUMN SALARY;

ALTER TABLE CUSTOMERS
    ADD SALARY DECIMAL (18, 2) DEFAULT 5000.00;

Integrity Constraints - UNIQUE
CREATE TABLE CUSTOMERS(
    ID INT NOT NULL,
    NAME VARCHAR (20) NOT NULL UNIQUE,
    AGE INT NOT NULL,
    ADDRESS CHAR (25),
    SALARY DECIMAL (18, 2),
    PRIMARY KEY (ID)
);
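The DEFAULT and UNIQUE constraints can be exercised together in one small experiment. The sketch below combines both constraints in a single CUSTOMERS table (the slides show them separately) and runs it in an in-memory SQLite database; `default_and_unique_demo` is a hypothetical helper name.

```python
import sqlite3

def default_and_unique_demo():
    """Return the defaulted salary and whether a duplicate NAME was rejected."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE CUSTOMERS(
               ID INT NOT NULL,
               NAME VARCHAR (20) NOT NULL UNIQUE,
               AGE INT NOT NULL,
               ADDRESS CHAR (25),
               SALARY DECIMAL (18, 2) DEFAULT 5000.00,
               PRIMARY KEY (ID))"""
    )
    # SALARY is not supplied, so the DEFAULT value is stored.
    conn.execute("INSERT INTO CUSTOMERS (ID, NAME, AGE) VALUES (1, 'Muffy', 24)")
    salary = conn.execute(
        "SELECT SALARY FROM CUSTOMERS WHERE ID = 1"
    ).fetchone()[0]

    duplicate_rejected = False
    try:
        # Same NAME with a different ID violates UNIQUE(NAME).
        conn.execute(
            "INSERT INTO CUSTOMERS (ID, NAME, AGE) VALUES (2, 'Muffy', 30)"
        )
    except sqlite3.IntegrityError:
        duplicate_rejected = True
    conn.close()
    return salary, duplicate_rejected

default_salary, duplicate_rejected = default_and_unique_demo()
```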

Integrity Constraints - UNIQUE
If the CUSTOMERS table has already been created

ALTER TABLE CUSTOMERS
    ADD CONSTRAINT UniqueConstraint UNIQUE(NAME);

DROP a UNIQUE Constraint

ALTER TABLE CUSTOMERS
    DROP CONSTRAINT UniqueConstraint;

Integrity Constraints - CHECK
CREATE TABLE CUSTOMERS(
    ID INT NOT NULL,
    NAME VARCHAR (20) NOT NULL,
    AGE INT NOT NULL CHECK (AGE >= 18),
    ADDRESS CHAR (25),
    SALARY DECIMAL (18, 2),
    PRIMARY KEY (ID)
);
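The CHECK (AGE >= 18) constraint can be tested at its boundary. This is a minimal sketch with Python's sqlite3 module; the helper `ages_accepted` and the candidate ages are made up for the illustration.

```python
import sqlite3

def ages_accepted(candidate_ages):
    """Insert one customer per age and return the ages the DBMS accepted."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE CUSTOMERS(
               ID INT NOT NULL,
               NAME VARCHAR (20) NOT NULL,
               AGE INT NOT NULL CHECK (AGE >= 18),
               ADDRESS CHAR (25),
               SALARY DECIMAL (18, 2),
               PRIMARY KEY (ID))"""
    )
    accepted = []
    for customer_id, age in enumerate(candidate_ages, start=1):
        try:
            conn.execute(
                "INSERT INTO CUSTOMERS (ID, NAME, AGE) VALUES (?, ?, ?)",
                (customer_id, f"customer{customer_id}", age),
            )
            accepted.append(age)
        except sqlite3.IntegrityError:
            pass  # the CHECK constraint rejected this row
    conn.close()
    return accepted

accepted_ages = ages_accepted([17, 18, 24])
```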

Integrity Constraints - CHECK
If the CUSTOMERS table has already been created

ALTER TABLE CUSTOMERS
    ADD CONSTRAINT CheckConstraint CHECK(AGE >= 18);

DROP a CHECK Constraint

ALTER TABLE CUSTOMERS
    DROP CONSTRAINT CheckConstraint;

Integrity Constraints - INDEX
Syntax:
CREATE INDEX index_name
    ON table_name ( column1, column2.....);

To create an INDEX on the AGE column, to optimize the search on customers for a specific age

CREATE INDEX idx_age
    ON CUSTOMERS ( AGE );

DROP an INDEX Constraint

ALTER TABLE CUSTOMERS
    DROP INDEX idx_age;
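Creating and dropping the idx_age index can be checked in SQLite as sketched below. Two assumptions to flag: SQLite drops an index with a standalone `DROP INDEX` statement rather than the `ALTER TABLE ... DROP INDEX` form shown above, and `sqlite_master` is SQLite's own catalog table, so other systems will differ on both points. The helper name `index_lifecycle` is hypothetical.

```python
import sqlite3

def index_lifecycle():
    """Create idx_age, list it from the catalog, drop it, list again."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE CUSTOMERS (ID INT NOT NULL, NAME VARCHAR (20) NOT NULL, "
        "AGE INT NOT NULL, PRIMARY KEY (ID))"
    )
    conn.execute("CREATE INDEX idx_age ON CUSTOMERS ( AGE )")

    def index_names():
        # sqlite_master is SQLite's catalog of tables and indexes.
        rows = conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'index' AND name = 'idx_age'"
        ).fetchall()
        return [row[0] for row in rows]

    before = index_names()
    conn.execute("DROP INDEX idx_age")  # SQLite's syntax for dropping an index
    after = index_names()
    conn.close()
    return before, after

indexes_before, indexes_after = index_lifecycle()
```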

Keys in a Database
• Key / Candidate Key
• Primary Key
• Super Key
• Foreign Key
• Primary key attributes are underlined in a schema
– Person(pid, address, name)
– Person2(address, name, age, job)

Primary Key Constraints
• Key = subset of columns that uniquely identifies tuple
• Another constraint on the table
– no two tuples can have the same values for those columns
• Examples:
– Movie(title, year, length, genre): key is (title, year)
– what is a good key for Student?
  Students(sid: string, name: string, login: string, age: integer, gpa: real).
• Can have multiple keys for a table
• Only one of those keys may be “primary”
– DBMS often makes searches by primary key fastest
– other keys are called “secondary”

Primary and Candidate Keys in SQL
• Possibly many candidate keys
– specified using UNIQUE
– one of which is chosen as the primary key.
• Example: “For a given student and course, there is a single grade.”
• What is the primary key of this table?

CREATE TABLE Enrolled
    (sid CHAR(10),
     cid CHAR(20),
     grade CHAR(2),
     PRIMARY KEY (sid,cid) )

vs.

• “Students can take only one course, and receive a single grade for that course; further, no two students in a course receive the same grade.”

CREATE TABLE Enrolled
    (sid CHAR(10),
     cid CHAR(20),
     grade CHAR(2),
     PRIMARY KEY (sid),
     UNIQUE (cid, grade))
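The difference between the two Enrolled designs shows up clearly when the same rows are offered to both. The sketch below uses an in-memory SQLite database; the helper `accepted_rows` and the sample rows in `ROWS` are invented for this illustration.

```python
import sqlite3

# Design 1: one grade per (student, course) pair.
PK_SID_CID = """CREATE TABLE Enrolled
    (sid CHAR(10), cid CHAR(20), grade CHAR(2),
     PRIMARY KEY (sid, cid))"""

# Design 2: one course per student, and no repeated grade within a course.
PK_SID_UNIQUE_CID_GRADE = """CREATE TABLE Enrolled
    (sid CHAR(10), cid CHAR(20), grade CHAR(2),
     PRIMARY KEY (sid),
     UNIQUE (cid, grade))"""

# Hypothetical sample rows, offered in this order.
ROWS = [
    ("53666", "History105", "B"),
    ("53666", "Reggae203", "B"),   # same student, second course
    ("53650", "History105", "B"),  # other student, same course and grade
    ("53666", "History105", "A"),  # second grade for the same pair
]

def accepted_rows(create_sql, rows):
    """Return the rows the given schema accepts, in insertion order."""
    conn = sqlite3.connect(":memory:")
    conn.execute(create_sql)
    accepted = []
    for row in rows:
        try:
            conn.execute("INSERT INTO Enrolled VALUES (?, ?, ?)", row)
            accepted.append(row)
        except sqlite3.IntegrityError:
            pass  # a key constraint rejected this row
    conn.close()
    return accepted

design1 = accepted_rows(PK_SID_CID, ROWS)
design2 = accepted_rows(PK_SID_UNIQUE_CID_GRADE, ROWS)
```

Design 1 rejects only the last row, while design 2 keeps only the first: the stricter keys rule out enrollments that are perfectly ordinary in practice.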

Primary and Candidate Keys in SQL (cont.)
• Used carelessly, an IC can prevent the storage of database instances that arise in practice!

Foreign Keys, Referential Integrity
• Foreign key: Set of fields in one relation that is used to 'refer' to a tuple in another relation
– Must correspond to primary key of the second relation
– Like a 'logical pointer'
• E.g. sid is a foreign key referring to Students:
– Enrolled(sid: string, cid: string, grade: string)
– If all foreign key constraints are enforced, referential integrity is achieved
– i.e., no dangling references

Foreign Keys in SQL
• Only students listed in the Students relation should be allowed to enroll for courses

CREATE TABLE Enrolled
    (sid CHAR(10), cid CHAR(20), grade CHAR(2),
     PRIMARY KEY (sid,cid),
     FOREIGN KEY (sid) REFERENCES Students )

Enrolled
sid    cid          grade
53666  Carnatic101  C
53666  Reggae203    B
53650  Topology112  A
53666  History105   B

Students
sid    name   login       age  gpa
53666  Jones  jones@cs    18   3.4
53688  Smith  smith@eecs  18   3.2
53650  Smith  smith@math  19   3.8

Enforcing Foreign-Key Constraints
If there is a foreign-key constraint from relation R to relation S, two violations are possible:
1. An insert or update to R introduces values not found in S.
2. A deletion or update to S causes some tuples of R to “dangle.”
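Both kinds of foreign-key violation can be provoked in SQLite. Two assumptions in this sketch: SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`, and Students is given `sid` as its primary key so that `REFERENCES Students` has a key to point at. The helper name `foreign_key_violations` is hypothetical.

```python
import sqlite3

def foreign_key_violations():
    """Try one violating statement of each kind; return the rejected ones."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
    conn.execute("CREATE TABLE Students (sid CHAR(10) PRIMARY KEY, name CHAR(15))")
    conn.execute(
        """CREATE TABLE Enrolled
               (sid CHAR(10), cid CHAR(20), grade CHAR(2),
                PRIMARY KEY (sid, cid),
                FOREIGN KEY (sid) REFERENCES Students)"""
    )
    conn.execute("INSERT INTO Students VALUES ('53666', 'Jones')")
    conn.execute("INSERT INTO Enrolled VALUES ('53666', 'History105', 'B')")

    attempts = [
        # 1. Insert into R (Enrolled) a value not found in S (Students).
        ("insert into Enrolled",
         "INSERT INTO Enrolled VALUES ('99999', 'Reggae203', 'B')"),
        # 2. Delete from S a tuple that some tuple of R still refers to.
        ("delete from Students",
         "DELETE FROM Students WHERE sid = '53666'"),
    ]
    rejected = []
    for label, sql in attempts:
        try:
            conn.execute(sql)
        except sqlite3.IntegrityError:
            rejected.append(label)
    conn.close()
    return rejected

rejected_statements = foreign_key_violations()
```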



Action taken
Example: suppose R = Enrolled, S = Students
• An insert or update to Enrolled that introduces a nonexistent student must be rejected.
• A delete or update to Students that removes a student value found in some tuples of Enrolled can be handled in four ways:

1. No action (the default): Reject the modification (delete/update is rejected).
2. Cascade: Make the same changes in Enrolled.
   Deleted Students: delete Enrolled tuple.
   Updated Students: change value in Enrolled.
3. Set NULL: Change the sid in Enrolled to NULL.
4. Set default: Change the sid in Enrolled to a default sid.

Example: Cascade
Delete the 53666 tuple from Students:
    Then delete all tuples from Enrolled that have sid = '53666'.
Update the 53666 tuple by changing '53666' to '53686':
    Then change all Enrolled tuples with sid = '53666' to sid = '53686'.

Example: Set NULL
Delete the 53666 tuple from Students:
    Change all tuples of Enrolled that have sid = '53666' to have sid = NULL.
Update the 53666 tuple by changing '53666' to '53686':
    Same change as for deletion.
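The cascade and set-NULL examples can be replayed in SQLite as sketched below. Assumptions: foreign-key enforcement is switched on with `PRAGMA foreign_keys = ON`, Students declares `sid` as its primary key, and Enrolled has no primary key here so that its sid is free to become NULL. The helper `enrolled_after` is a made-up name.

```python
import sqlite3

def enrolled_after(action, statement):
    """Apply `statement` to Students and return the resulting Enrolled rows.

    `action` is the referential action (e.g. CASCADE or SET NULL) used for
    both ON DELETE and ON UPDATE.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE Students (sid CHAR(10) PRIMARY KEY, name CHAR(15))")
    conn.execute(
        f"""CREATE TABLE Enrolled
                (sid CHAR(10), cid CHAR(20), grade CHAR(2),
                 FOREIGN KEY (sid) REFERENCES Students
                     ON DELETE {action} ON UPDATE {action})"""
    )
    conn.execute("INSERT INTO Students VALUES ('53666', 'Jones')")
    conn.execute("INSERT INTO Enrolled VALUES ('53666', 'History105', 'B')")
    conn.execute(statement)
    rows = conn.execute("SELECT sid, cid FROM Enrolled").fetchall()
    conn.close()
    return rows

DELETE_53666 = "DELETE FROM Students WHERE sid = '53666'"
UPDATE_53666 = "UPDATE Students SET sid = '53686' WHERE sid = '53666'"

cascade_delete = enrolled_after("CASCADE", DELETE_53666)
cascade_update = enrolled_after("CASCADE", UPDATE_53666)
set_null_delete = enrolled_after("SET NULL", DELETE_53666)
set_null_update = enrolled_after("SET NULL", UPDATE_53666)
```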

Referential Integrity in SQL
• SQL/92 and SQL:1999 support all 4 options on deletes and updates

CREATE TABLE Enrolled
    (sid CHAR(10),
     cid CHAR(20),
     grade CHAR(2),
     PRIMARY KEY (sid,cid),
     FOREIGN KEY (sid)
         REFERENCES Students
         ON DELETE CASCADE
         ON UPDATE SET DEFAULT )

Where do ICs Come From?
• ICs are based upon the semantics of the real-world enterprise that is being described in the database relations
• Can we infer ICs from an instance?
– We can check a database instance to see if an IC is violated, but we can NEVER infer that an IC is true by looking at an instance.
– An IC is a statement about all possible instances!
– From example, we know name is not a key, but the assertion that sid is a key is given to us.
• Key and foreign key ICs are the most common; more general ICs supported too

Duke CS, Fall 2016 CompSci 516: Data Intensive Computing Systems

Example Instances
• We will use these instances of the Sailors and Reserves relations in our examples
• If the key for the Reserves relation contained only the attributes sid and bid, how would the semantics differ?

Sailors
sid  sname   rating  age
22   dustin  7       45
31   lubber  8       55
58   rusty   10      35

Reserves
sid  bid  day
22   101  10/10/96
58   103  11/12/96

Q/A
1. Is a NULL value the same as zero or a blank space? If not, what is the difference?

A NULL value is not the same as zero or a blank space. A NULL value is a value which is 'unavailable, unassigned, unknown or not applicable', whereas zero is a number and a blank space is a character.
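The answer to question 1 can be checked directly by asking a database to compare NULL with zero and with a blank. This sketch uses Python's sqlite3 module; `typeof` is a SQLite function, and the helper name `null_comparisons` is made up here.

```python
import sqlite3

def null_comparisons():
    """Compare NULL with 0 and with a blank, and report each value's type."""
    conn = sqlite3.connect(":memory:")
    row = conn.execute(
        "SELECT NULL = 0, NULL = ' ', NULL IS NULL, "
        "typeof(NULL), typeof(0), typeof(' ')"
    ).fetchone()
    conn.close()
    return row

(null_eq_zero, null_eq_blank, null_is_null,
 type_of_null, type_of_zero, type_of_blank) = null_comparisons()
```

Comparing NULL to anything with `=` yields NULL (None in Python) rather than true or false, which is why `IS NULL` is the test to use.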

Q/A
2. If a table contains duplicate rows, does a query result 
display the duplicate values by default? How can you 
eliminate duplicate rows from a query result?

A query result displays all rows including the duplicate 
rows. To eliminate duplicate rows in the result, the 
DISTINCT keyword is used in the SELECT clause.
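The effect of DISTINCT is easy to confirm on a tiny table with repeated rows. This is a sketch using an in-memory SQLite database; the `Names` table and the helper `with_and_without_distinct` are invented for the example.

```python
import sqlite3

def with_and_without_distinct():
    """Query a table holding duplicate rows, with and without DISTINCT."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Names (name CHAR(15))")
    conn.executemany(
        "INSERT INTO Names VALUES (?)",
        [("Smith",), ("Jones",), ("Smith",)],
    )
    all_rows = conn.execute("SELECT name FROM Names").fetchall()
    distinct_rows = conn.execute("SELECT DISTINCT name FROM Names").fetchall()
    conn.close()
    # Sort because SQL does not promise any row order without ORDER BY.
    return sorted(all_rows), sorted(distinct_rows)

all_names, distinct_names = with_and_without_distinct()
```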
