Professional Documents
Culture Documents
into smaller relations.Database Normalization is a technique of organizing the data in the database.
Normalization is a systematic approach of decomposing tables to eliminate data redundancy and undesirable
characteristics like Insertion, Update and Deletion Anamolies. It is a multi-step process that puts data into
tabular form by removing duplicated data from the relation tables.
BCNF
Student Table :
Student
Adam
Alex
Stuart
Age
15
14
17
Subject
Biology, Maths
Maths
Maths
In First Normal Form, any row must not have a column in which more than one value is saved, like separated
with commas. Rather than that, we must separate such data into multiple rows.
Age
15
15
14
17
Subject
Biology
Maths
Maths
Maths
Using the First Normal Form, data redundancy increases, as there will be many columns with same data in
multiple rows but each row as a whole will be unique.
Age
Adam
Alex
Stuart
15
14
17
In Student Table the candidate key will be Student column, because all other column i.e Age is dependent on it.
Subject
Adam
Adam
Alex
Stuart
Biology
Maths
Maths
Maths
In Subject Table the candidate key will be {Student, Subject} column. Now, both the above tables qualifies for
Second Normal Form and will never suffer from Update Anomalies. Although there are a few complex cases in
which table in Second Normal Form suffers Update Anomalies, and to handle those scenarios Third Normal
Form is there.
Student_Detail Table :
Student_id
Student_na
me
DOB
Street
city
State
Zip
In this table Student_id is Primary key, but street, city and state depends upon Zip. The dependency between zip
and other fields is called transitive dependency. Hence to apply 3NF, we need to move the street, city and state
to new table, with Zip as primary key.
Student_name
DOB
Zip
Street
city
state
Address Table :
Zip
Canonical cover:
A canonical cover Fc for F is a set of dependencies such that F logically implies all dependencies in Fc and Fc
logically implies all dependencies in F, and further
No functional dependency in Fc contains an extraneous attribute.
Each left side of a functional dependency in Fc is unique.
Minimal Covers useful in obtaining a lossless, dependency-preserving decomposition of a scheme R into 3NF
relation schemas.
Compute a canonical cover for F:
repeat
Use the union rule to replace any dependencies in F 1 1 and 1 2 with 1 1 2
Find a functional dependency with an extraneous attribute either in or in .
If an extraneous attribute is found, delete it from until F does not change.
Lossless-join Decomposition
If R is decomposed into R1 and R2, we require that for all possible relations r on schema R satisfies r = R1
(r )
R2 (r )
A decomposition of R into R1 and R2 is lossless join if and only if at least one of the following dependencies
is in F+
R1 R2 R1
R1 R2 R2
example:
R = (A, B, C), F = {A B, B C)
Can be decomposed in two different ways
R1 = (A, B), R2 = (B, C)
Lossless-join decomposition:
R1 R2 = {B} and B BC Dependency preserving
R1 = (A, B), R2 = (A, C)
Lossless-join decomposition:
R1 R2 = {A} and A AB Not dependency preserving (cannot check B C without
computing R1
R2)
Data dependence is a function of the data storage characteristics, rather than of the file structure. Even if a files
structure is not changed that is, no fields are added or deleted a fields underlying data characteristics may
change. For example, even if a numeric field remains numeric, changing the numeric data characteristics from
integer to decimal entails a significant difference in the way data are stored physically.
FUNCTIONAL DEPENDENCY:
Functional dependency is a relationship that exists when one attribute uniquely determines another attribute.
If R is a relation with attributes X and Y, a functional dependency between the attributes is represented as X->Y,
which specifies Y is functionally dependent on X. Here X is a determinant set and Y is a dependent attribute.
Each value of X is associated precisely with one Y value.
Functional dependency in a database serves as a constraint between two sets of attributes. Defining functional
dependency is an important part of relational database design and contributes to aspect normalization.
Armstrong's Axioms
If F is a set of functional dependencies then the closure of F, denoted as F +, is the set of all functional
dependencies logically implied by F. Armstrong's Axioms are a set of rules, that when applied repeatedly,
generates a closure of functional dependencies.
Reflexive rule If alpha is a set of attributes and beta is_subset_of alpha, then alpha holds beta.
Augmentation rule If a b holds and y is attribute set, then ay by also holds. That is adding
attributes in dependencies, does not change the basic dependencies.
Transitivity rule Same as transitive rule in algebra, if a b holds and b c holds, then a c
also holds. a b is called as a functionally that determines b.
Closure of a set of attributes X with respect to F is the set X+of all attributes that are functionally determined
by X
X+ can be calculated by repeatedly applying IR1, IR2, IR3 using the FDs in F
Relational Algebra
Relational algebra is a procedural query language, which takes instances of relations as input and yields
instances of relations as output. It uses operators to perform queries. An operator can be either unary or binary.
They accept relations as their input and yield relations as their output. Relational algebra is performed
recursively on a relation and intermediate results are also considered relations.
The fundamental operations of relational algebra are as follows
Select
Project
Union
Set different
Cartesian product
Rename
We will discuss all these operations in the following sections.
Select Operation ()
It selects tuples that satisfy the given predicate from a relation.
Notation p(r)
Where stands for selection predicate and r stands for relation. p is prepositional logic formula which may use
connectors like and, or, and not. These terms may use relational operators like =, , , < , >, .
For example
subject = "database"(Books)
Output Selects tuples from books where subject is 'database'.
subject = "database" and price = "450"(Books)
Output Selects tuples from books where subject is 'database' and 'price' is 450.
subject = "database" and price = "450" or year > "2010"(Books)
Output Selects tuples from books where subject is 'database' and 'price' is 450 or those books published
after 2010.
Project Operation ()
It projects column(s) that satisfy a given predicate.
Notation A1, A2, An (r)
Where A1, A2 , An are attribute names of relation r.
Duplicate rows are automatically eliminated, as relation is a set.
For example
Notion r U s
Where r and s are either database relations or relation result set (temporary relation).
For a union operation to be valid, the following conditions must hold
r, and s must have the same number of attributes.
Attribute domains must be compatible.
Duplicate tuples are automatically eliminated.
author (Books) author (Articles)
Output Projects the names of the authors who have either written a book or an article or both.
Set Difference ()
The result of set difference operation is tuples, which are present in one relation but are not in the second
relation.
Notation r s
Finds all the tuples that are present in r but not in s.
author (Books) author (Articles)
Output Provides the name of authors who have written books but not articles.
Cartesian Product ()
Combines information of two different relations into one.
Notation r s
Where r and s are relations and their output will be defined as
r s = { q t | q r and t s}
author = 'tutorialspoint'(Books Articles)
Output Yields a relation, which shows all the books and articles written by tutorialspoint.
Rename Operation ()
The results of relational algebra are also relations but without any name. The rename operation allows us to
rename the output relation. 'rename' operation is denoted with small Greek letter rho .
Notation x (E)
Where the result of expression E is saved with name of x.
Additional operations are
Set intersection
Assignment
Natural join
Relational Calculus
In contrast to Relational Algebra, Relational Calculus is a non-procedural query language, that is, it tells what to
do but never explains how to do it.
Relational calculus exists in two forms
Tuple Relational Calculus (TRC)
Notation {T | Condition}
Returns all tuples T that satisfies a condition.
For example
{ T.name | Author(T) AND T.article = 'database' }
Output Returns tuples with 'name' from Author who has written article on 'database'.
TRC can be quantified. We can use Existential () and Universal Quantifiers ().
For example
{ R| T Authors(T.article='database' AND R.name=T.name)}
Output The above query will yield the same result as the previous one.
Domain Relational Calculus (DRC)
In DRC, the filtering variable uses the domain of attributes instead of entire tuple values (as done in TRC,
mentioned above).
Notation
{ a1, a2, a3, ..., an | P (a1, a2, a3, ... ,an)}
Where a1, a2 are attributes and P stands for formulae built by inner attributes.
For example
{< article, page, subject > | TutorialsPoint subject = 'database'}
Output Yields Article, Page, and Subject from the relation TutorialsPoint, where subject is database.
Just like TRC, DRC can also be written using existential and universal quantifiers. DRC also involves relational
operators.
The expression power of Tuple Relation Calculus and Domain Relation Calculus is equivalent to Relational
Algebra.