
RELATIONAL DATABASE MANAGEMENT SYSTEM (17332)

Theory Paper : 100 marks
Ext. Oral : 25 marks
Term Work : 50 marks
Sessional : 10 marks
Total : 185 marks
By Imran Shaikh

REFERENCE BOOKS
Database System Concepts (4th Edition)
Author : Silberschatz, Korth, Sudarshan

Introduction to Database Management Systems
Author : ISRD Group

SQL, PL/SQL: The Programming Language of Oracle
Author : Ivan Bayross

Advanced Database Management System
Author : Chakrabarti, Dasgupta

Chapter No 1 : DATABASE SYSTEM CONCEPT (marks 16)


Data
Known facts that can be recorded and have an implicit meaning. For
example, consider the names, telephone numbers, and addresses of
the people you know.
Database
A database is a collection of related data. Database systems are
designed to manage large bodies of information. Management of data
involves both defining structures for storage of information and
providing mechanisms for the manipulation of information. The
database system must ensure the safety of the information stored,
despite system crashes or attempts at unauthorized access.

DBMS
A database management system (DBMS) is a collection of
related data and programs that enables users to create and
maintain a database. The DBMS is a general-purpose software
system that facilitates the processes of defining, constructing,
manipulating, and sharing databases among various users and
applications.


File Processing System


Disadvantages of file processing system


Data redundancy and inconsistency
Difficulty in accessing data
Data isolation
Concurrent-access anomalies
Integrity problems
Atomicity problems
Security problems


Data redundancy and inconsistency


For example, the address and telephone number of a particular customer may appear in a file
that consists of savings-account records and in a file that consists of checking-account records.
This redundancy leads to higher storage and access cost. In addition, it may lead to data
inconsistency.

Difficulty in accessing data


Suppose a bank manager needs to find out the names of all customers who live within a
particular postal-code area. There is no application program on hand to meet every such need,
so the manager faces difficulty in accessing the data.

Data isolation (Separation of Data in various files)


Because data are scattered in various files, and files may be in different formats, writing new
application programs to retrieve the appropriate data is difficult.


Problems with concurrent access


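Example:
Assume I'm paying for groceries with my MAC card at the same time my pay check is being
deposited (and my bank uses a file processing system):
Withdrawal program & Deposit program accessing database concurrently
1. Withdrawal program: Read balance from checking account file as $51
2. Deposit program: Read balance from checking account file as $51
3. Withdrawal program: Subtract $50 (for groceries)
4. Withdrawal program: Update checking account file (new balance: $1)
5. Deposit program: Add $100 (my salary)
6. Deposit program: Update checking account file (new balance: $151)
The final balance is $151 instead of the correct $101 ($51 - $50 + $100), because the deposit
program overwrote the result of the withdrawal.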
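
The interleaving in this example can be replayed in a few lines of Python. This is only a minimal sketch: the dictionary checking_file stands in for the checking-account file, and all variable names are illustrative assumptions, not part of any real banking program.

```python
# A dictionary stands in for the checking-account file (illustrative assumption).
checking_file = {"balance": 51}

# Both programs read the balance before either one writes it back.
withdrawal_copy = checking_file["balance"]   # withdrawal program reads $51
deposit_copy = checking_file["balance"]      # deposit program reads $51

withdrawal_copy -= 50                        # subtract $50 for groceries
checking_file["balance"] = withdrawal_copy   # file now holds $1

deposit_copy += 100                          # add $100 salary to the stale $51
checking_file["balance"] = deposit_copy      # file now holds $151

interleaved_balance = checking_file["balance"]
serial_balance = 51 - 50 + 100               # result if the programs ran one after the other

print(interleaved_balance, serial_balance)   # prints: 151 101
```

The withdrawal is lost because the deposit program wrote back a value computed from a stale read.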


Atomicity Problem
In many applications, it is crucial that, if a failure occurs, the data be restored to the consistent
state that existed prior to the failure. It is difficult to ensure atomicity in a conventional
file-processing system.
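
For contrast, a database system can undo a half-finished operation. The sketch below uses Python's built-in sqlite3 module; the account table, the names A and B, and the simulated crash are assumptions made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 0)])
conn.commit()

try:
    # "with conn" commits if the block succeeds and rolls back if an exception escapes.
    with conn:
        conn.execute("UPDATE account SET balance = balance - 50 WHERE name = 'A'")
        # Simulated failure before the matching credit to B is written.
        raise RuntimeError("simulated crash")
except RuntimeError:
    pass

# The debit from A was rolled back, so the data is in its earlier consistent state.
balances = dict(conn.execute("SELECT name, balance FROM account"))
print(balances)
```

Either both updates of a transfer take effect or neither does; here the failed transfer leaves A and B unchanged.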

Integrity problems
Data may need to satisfy certain conditions, called consistency constraints.
For example, account balances should never fall below $0.
It is difficult to enforce, add or change such consistency constraints in a file processing system.
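
In a database system such a constraint can be declared once and is then enforced on every change. A minimal sketch with Python's sqlite3 module follows; the table and column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The CHECK clause declares the consistency constraint as part of the table definition.
conn.execute(
    "CREATE TABLE account (acc_no INTEGER PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.execute("INSERT INTO account VALUES (1, 51)")
conn.commit()

try:
    # This update would make the balance negative, so the database rejects it.
    conn.execute("UPDATE account SET balance = balance - 100 WHERE acc_no = 1")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
    conn.rollback()

balance = conn.execute("SELECT balance FROM account WHERE acc_no = 1").fetchone()[0]
print(rejected, balance)
```

No application program has to repeat the check; the constraint lives with the data.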

Security Problems
Not every user should have permission to access every type of data, but enforcing such
restrictions in a file processing system (FPS) is difficult.


Application of database
1. Banking
2. Airlines
3. Universities
4. Credit card transactions
5. Telecommunication
6. Finance
7. Sales
8. Manufacturing


Introduction to RDBMS
A DBMS that is based on the relational model is called an RDBMS. The relational model was
proposed by E. F. Codd.
The relational model uses a collection of tables to represent both data and the relationships
among those data. Each table has multiple columns, and each column has a unique name.
A table is a two-dimensional array containing rows and columns. Each row contains data
related to an entity such as a student. Each column contains the data related to a single
attribute of the entity such as student name.
Basic concepts of RDBMS are as follows
Tuple: In the relational model, a row is called a tuple.
Attribute: A column header is called an attribute.
Degree: The degree of a relation is the number of attributes of the table.
Domain: The set of all permissible values of an attribute is called its domain.
Cardinality: The number of rows in the table is called its cardinality.
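
The terms above can be read directly off a small table. The following is a sketch using Python's sqlite3 module; the student table and its sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER, name TEXT, branch TEXT)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "Asha", "CO"), (2, "Ravi", "IF")],
)

cursor = conn.execute("SELECT * FROM student ORDER BY roll_no")
attributes = [column[0] for column in cursor.description]  # column headers
tuples = cursor.fetchall()                                  # rows of the table

degree = len(attributes)     # number of attributes
cardinality = len(tuples)    # number of rows

print(attributes, degree, cardinality)
```

Here the degree is 3 (roll_no, name, branch) and the cardinality is 2 (two rows).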


Figure 1 shows how data is represented in the relational model and the terms used to refer
to the various components of a table.


Difference between DBMS and RDBMS


Name of various DBMS and RDBMS software


DBMS Software
Dbase
FoxBASE
FoxPro

RDBMS Software
Oracle
MySQL
SQL Server


Data Abstraction
Data abstraction means hiding the complexities of the database design from users, many of
whom are not computer professionals.
Data abstraction gives users an easy way to retrieve data from the database.
There are three levels of abstraction.

Physical level :
The lowest level of abstraction describes how the data are actually stored and where they are
stored in the database. These storage details are hidden from users.
The physical level describes complex low-level data structures in detail.

Logical level :
The next level of abstraction describes what data are stored in the database and what
relationships exist among those data. Users of this level do not need to know the underlying
physical structures.
The logical level describes the database in terms of relatively simple data structures.

View level
The view level of abstraction exists to simplify users' interaction with the system, because
many users need to access only a part of the database. The system may provide many views
for the same database.
Views can also hide information (e.g., salary) for security purposes.
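
A view that hides the salary column can be sketched as follows with Python's sqlite3 module. The employee table and the view name employee_public are hypothetical; in a multi-user RDBMS the view would be combined with access permissions so that users can query only the view.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (emp_id INTEGER, name TEXT, salary INTEGER)")
conn.execute("INSERT INTO employee VALUES (1, 'Asha', 50000)")

# The view exposes only part of the table; salary is left out.
conn.execute("CREATE VIEW employee_public AS SELECT emp_id, name FROM employee")

cursor = conn.execute("SELECT * FROM employee_public")
view_columns = [column[0] for column in cursor.description]
view_rows = cursor.fetchall()

print(view_columns, view_rows)
```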

Fig. Three levels of Abstraction

Database Languages

DDL : Data Definition Language

DDL commands deal with the database structure.
DDL commands are used to create, modify and drop the database structure.

create , alter, rename , drop commands


DQL : Data Query Language
DQL commands are used to access database data.

select command
DML : Data Manipulation Language
DML commands deal with instances of the database.
DML commands are used to insert, update and delete the actual data of database.

insert , update , delete commands
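
The three groups of commands can be seen side by side in a short session. This sketch runs the statements through Python's sqlite3 module; the student table and its values are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: create and alter the structure.
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT)")
conn.execute("ALTER TABLE student ADD COLUMN branch TEXT")

# DML: insert, update and delete the actual data.
conn.execute("INSERT INTO student VALUES (1, 'Asha', 'CO')")
conn.execute("INSERT INTO student VALUES (2, 'Ravi', 'IF')")
conn.execute("UPDATE student SET branch = 'CM' WHERE roll_no = 2")
conn.execute("DELETE FROM student WHERE roll_no = 1")
conn.commit()

# DQL: query the data.
rows = conn.execute("SELECT roll_no, name, branch FROM student").fetchall()
print(rows)

# DDL: drop the structure.
conn.execute("DROP TABLE student")
```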



Instances and Schemas


Schema
The overall structure or overall design of the database is called the database schema.
The schema of the database does not change frequently; it is more stable than the data in the
database.
The design of the schema decides the fields and their types in the database.
We can create the schema or structure of the database by using the create command.
We can change the schema or structure of the database by using the alter command.

Instances
The information stored in the database at a particular point in time is known as an instance
of the database.
Instances are not stable in nature and can change frequently.
Instances denote the rows of tables in an RDBMS and represent the actual data of the database.

We can add a new instance to the database by using the insert command.
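
The difference between the two can be checked in a small sketch with Python's sqlite3 module: inserting a row changes the instance but leaves the schema as it was. The student table and the helper names schema() and instance() are illustrative assumptions; PRAGMA table_info is SQLite's own way of listing a table's columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER, name TEXT)")

def schema():
    # Each row of PRAGMA table_info is (cid, name, type, notnull, dflt_value, pk).
    return [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(student)")]

def instance():
    return conn.execute("SELECT * FROM student ORDER BY roll_no").fetchall()

schema_before, instance_before = schema(), instance()
conn.execute("INSERT INTO student VALUES (1, 'Asha')")
schema_after, instance_after = schema(), instance()

print(schema_before == schema_after)      # the schema is unchanged
print(instance_before, instance_after)    # the instance has changed
```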



Data Independence

Data independence is the property of a database which ensures that a change made to the
schema at one level requires minimal or no change to the schema at the level immediately
above it.
What does this mean? We know that in a building, each floor stands on the floor below it.
If we change the design of any one floor, e.g. extending the width of a room by
demolishing the western wall of that room, it is likely that the design in the above floors
will have to be changed also. As a result, one change needed in one particular floor would
mean continuing to change the design of each floor until we reach the top floor, with an
increase in the time, cost and labour. Would not life be easy if the change could be
contained in one floor only? Data independence is the answer for this. It removes the
need for additional amount of work needed in adopting the single change into all the
levels above.


Data independence can be classified into the following two types

1.Physical Data Independence:


This means that for any change made in the physical schema, the need to change the logical schema is minimal. This is
practically easier to achieve. Let us explain with an example. Say, you have bought an Audio CD of a recently released film
and one of your friends has bought an Audio Cassette of the same film. If we consider the physical schema, they are
entirely different. The first is digital recording on an optical media, where random access is possible. The second one is
magnetic recording on a magnetic media, strictly sequential access. However, how this change is reflected in the logical
schema is very interesting.
For music tracks, the logical schema for both the CD and the Cassette is the title card imprinted on their back. We have
information like Track no, Name of the Song, Name of the Artist and Duration of the Track, things which are identical for
both the CD and the Cassette. We can clearly say that we have achieved the physical data independence here.
2.Logical Data Independence:
This means that for any change made in the logical schema, the need to change the external schema is minimal. As we
shall see, this is a little difficult to achieve. Let us explain with an example. Suppose the CD you have bought contains 6
songs, and some of your friends are interested in copying some of those songs (which they like in the film) into their
favorite collection. One friend wants songs 1, 2, 4, 5, 6, another wants 1, 3, 4, 5 and another wants 1, 2, 3, 6. Each of these
collections can be compared to a view schema for that friend. Now by some mistake, a scratch has appeared on the CD
and you cannot extract song 3. Obviously, you will have to ask the friends who have song 3 in their proposed
collection to alter their view by deleting song 3 from their proposed collection as well.

Overall Structure of DBMS


There are four components
1 ) Disk Storage
2 ) Storage Manager
3 ) Query Processor
4 ) Database Users

Disk Storage
Data files, which store the database itself.
Data dictionary, which stores metadata about the structure of the database,
in particular the schema of the database.
Indices, which provide fast access to data items that hold particular values.


Storage Manager
Authorization and integrity manager
which tests for the satisfaction of integrity constraints and checks the authority of users to
access data.
Transaction manager,
which ensures that the database remains in a consistent (correct) state despite system failures,
and that concurrent transaction executions proceed without conflicting.
File manager
which manages the allocation of space on disk storage and the data structures used to
represent information stored on disk.
Buffer manager
which is responsible for fetching data from disk storage into main memory, and deciding what
data to cache in main memory.


The Query Processor


The query processor components include
DDL interpreter which interprets DDL statements and records the definitions in the data
dictionary.
DML compiler, which translates DML statements in a query language into an evaluation
plan consisting of low-level instructions that the query evaluation engine
understands. A query can usually be translated into any of a number of
alternative evaluation plans that all give the same result. The DML compiler also
performs query optimization, that is, it picks the lowest cost evaluation plan
from among the alternatives.
Query evaluation engine, which executes low-level instructions generated by the DML
compiler.
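
The idea of alternative evaluation plans can be observed in SQLite, which reports the plan it has chosen through EXPLAIN QUERY PLAN. The sketch below uses Python's sqlite3 module; the customer table, the index name and the helper plan() are assumptions for illustration, and the exact wording of the plan text depends on the SQLite version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?)",
    [("Asha", "Pune"), ("Ravi", "Mumbai"), ("Meena", "Pune")],
)

query = "SELECT name FROM customer WHERE city = 'Pune' ORDER BY name"

def plan(sql):
    # The last column of each EXPLAIN QUERY PLAN row is a human-readable step.
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

plan_without_index = plan(query)
result_without_index = conn.execute(query).fetchall()

# With an index available, the optimizer has a further plan to choose from.
conn.execute("CREATE INDEX idx_customer_city ON customer (city)")
plan_with_index = plan(query)
result_with_index = conn.execute(query).fetchall()

print(plan_without_index)
print(plan_with_index)
print(result_without_index == result_with_index)  # same result from either plan
```

Whichever plan is picked, the query returns the same rows; only the cost of evaluating it differs.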


Fig. Overall Structure of DBMS


Database Users
Naive users
invoke one of the permanent application programs that have been written
previously.
E.g. people accessing database over the web, bank tellers, clerical staff
Application programmers
interact with the system through DML calls.
They are responsible for writing application programs using DML commands.
Sophisticated users
form requests in a database query language.
They are responsible for searching the required data in the database (e.g. data mining).


Functions of Database Administrator


Defining the schema, including the architecture of the three levels of data abstraction
and data independence.
Modifying the defined schema as and when required.
Creating new user IDs, passwords etc., and defining the access permissions that each
user has or does not have. The DBA is responsible for creating user roles.
Defining the integrity constraints for the database to ensure that the data entered
conform to some rules, thereby increasing the reliability of data.
Creating a security mechanism to prevent unauthorized access and accidental or intentional
handling of data that can cause a security threat.
Creating a backup and recovery policy. This is essential because in case of a failure the
database must be able to restore itself to complete functionality with no loss of data,
as if the failure had never occurred.


Two / Three Tier architecture

Two-tier architecture: E.g. client programs using ODBC/JDBC to communicate with a database.
Three-tier architecture: E.g. web-based applications, and applications built using middleware.

Two-tier architecture
In a two-tier architecture, the application is partitioned into a component that
resides at the client machine, which invokes database system functionality at the
server machine through query language statements. Application program interface
standards like ODBC and JDBC are used for interaction between the client and the
server.
Three-tier architecture
In contrast, in a three-tier architecture, the client machine acts as merely a front
end and does not contain any direct database calls. Instead, the client end
communicates with an application server, usually through a forms interface. The
application server in turn communicates with a database system to access data.
The business logic of the application, which says what actions to carry out under
what conditions, is embedded in the application server, instead of being
distributed across multiple clients. Three-tier applications are more appropriate for
large applications, and for applications that run on the World Wide Web.
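
The separation of tiers can be sketched in a single Python file, with each tier reduced to a few lines. Everything here is a simplified assumption: in a real system the three tiers run on separate machines, and the function names withdraw and client_request are hypothetical.

```python
import sqlite3

# Data tier: the database system.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE account (acc_no INTEGER PRIMARY KEY, balance INTEGER)")
db.execute("INSERT INTO account VALUES (1, 51)")
db.commit()

# Application-server tier: holds the business logic and is the only tier
# that issues database calls.
def withdraw(acc_no, amount):
    row = db.execute(
        "SELECT balance FROM account WHERE acc_no = ?", (acc_no,)
    ).fetchone()
    if row is None:
        return "no such account"
    if amount > row[0]:
        return "insufficient funds"   # business rule enforced on the server
    with db:
        db.execute(
            "UPDATE account SET balance = balance - ? WHERE acc_no = ?",
            (amount, acc_no),
        )
    return "ok"

# Client tier: a front end that contains no direct database calls.
def client_request(acc_no, amount):
    return withdraw(acc_no, amount)

first = client_request(1, 50)    # balance goes from 51 to 1
second = client_request(1, 50)   # refused by the business rule
print(first, second)
```

In a two-tier design the SQL statements inside withdraw would sit in the client program itself, sent to the server through an interface such as ODBC or JDBC.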

E. F. Codd's laws for fully functional RDBMS

1. All data should be presented to the user in table form.
2. All data should be accessible without ambiguity.
3. A column should be allowed to remain empty.
4. The DBMS must provide access to its structure through the same tools that are used to access
the data.
5. The DBMS must support a clearly defined language that includes functionality for data
definition, data manipulation, data integrity, and database transaction control.
6. Data can be presented to the user in different logical combinations called views.
7. Set operations like Union, Intersection and Minus should be supported.
8. The user is isolated from the physical method of storing and retrieving information from the
database.
9. How a user views data should not change when the logical structure (table structure) of the
database changes.
10. Integrity constraints on user input must be stored in the data dictionary, to make the RDBMS
front-end independent.
11. A user should be totally unaware of whether or not the database is housed on one computer
or distributed across several computers.
12. Users should not be allowed to modify the database structure using any GUI based
applications. Database structure modification must be done only by direct SQL commands.
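
The set operations named in law 7 can be tried out with Python's sqlite3 module. This is a sketch with invented depositor and borrower tables. Note that SQLite spells the Minus operation as EXCEPT, whereas Oracle uses the keyword MINUS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE depositor (name TEXT)")
conn.execute("CREATE TABLE borrower (name TEXT)")
conn.executemany("INSERT INTO depositor VALUES (?)", [("Asha",), ("Ravi",)])
conn.executemany("INSERT INTO borrower VALUES (?)", [("Ravi",), ("Meena",)])

def names(sql):
    return [row[0] for row in conn.execute(sql)]

# Customers who have a deposit, a loan, or both.
union = names(
    "SELECT name FROM depositor UNION SELECT name FROM borrower ORDER BY name"
)
# Customers who have both a deposit and a loan.
intersection = names(
    "SELECT name FROM depositor INTERSECT SELECT name FROM borrower ORDER BY name"
)
# Customers who have a deposit but no loan (Minus).
minus = names(
    "SELECT name FROM depositor EXCEPT SELECT name FROM borrower ORDER BY name"
)

print(union, intersection, minus)
```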

Distributed Database
We can classify database into two categories as
follows

Centralized Database :
The data reside in one single location.
Distributed Database :
The data is distributed over multiple
locations.

Distributed Database
A distributed database is a collection of
partially independent databases that (ideally)
share a common schema, and coordinate
processing of transactions that access nonlocal
data. The processors communicate with one
another through a communication network
that handles routing and connection
strategies.

Fig. Distributed Database (a DDBMS linking the Admission Section, Account Section and Exam
Section; data shown: Student Personal Data, Student Fees Record, Student Result Analysis)

Types of Distributed DBMS


Homogeneous DDBMS
A homogeneous distributed database has identical software and hardware running all
database instances, and may appear through a single interface as if it were a single
database.
Heterogeneous DDBMS
A heterogeneous distributed database may have different hardware, operating systems,
database management systems, and even data models for different databases.


DATA WAREHOUSE
Definition
A data warehouse is an integrated, subject-oriented, time-variant, nonvolatile collection of
data in support of management's decision-making process.

Integrated
Data warehouses must put data from disparate sources into a consistent format. They
must resolve such problems as naming conflicts and inconsistencies among units of
measure. When they achieve this, they are said to be integrated.
Subject Oriented
Data warehouses are designed to help you analyze data. For example, to learn more
about your company's sales data, you can build a warehouse that concentrates on
sales. Using this warehouse, you can answer questions like "Who was our best
salesman for this item last year?" This ability to define a data warehouse by subject
matter, sales in this case, makes the data warehouse subject oriented.

Nonvolatile
Nonvolatile means that, once entered into the warehouse, data should not change. This
is logical because the purpose of a warehouse is to enable you to analyze what has
occurred previously.
Time Variant
In order to discover trends in business, analysts need large amounts of historical data.
So data must be loaded into the warehouse periodically, so that it builds up a history over time.

DATA MINING
Data mining (knowledge discovery in databases): Extraction of interesting (non-trivial,
implicit, previously unknown and potentially useful) information or patterns from data
in large databases.


Data Independence
Data independence can be defined as the capacity to change the schema at one level of a
database system without having to change the schema at the next higher level.
There are two types of data independence.
Logical data independence
Logical data independence is the capacity to change the conceptual schema without having
to change external schemas or application programs.
Example : change in constraint at logical level should not affect the application programs.
Physical data independence
Physical data independence is the capacity to change the internal schema without having
to change the conceptual schema. Hence, the external schemas need not be changed as
well.
Example: if physical files are reorganized , it should not affect logical organization of
database.
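
Physical data independence can be illustrated with Python's sqlite3 module: adding an index changes how the data is physically accessed, yet the table definition and the query stay exactly the same. The customer table, the index name and the helper logical_schema() are hypothetical choices for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (name TEXT, postal_code TEXT)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?)",
    [("Asha", "400001"), ("Ravi", "411001"), ("Meena", "400001")],
)

def logical_schema():
    # Column names and declared types of the table, as seen at the logical level.
    return [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(customer)")]

query = "SELECT name FROM customer WHERE postal_code = ? ORDER BY name"

schema_before = logical_schema()
rows_before = conn.execute(query, ("400001",)).fetchall()

# Physical-level change: a new access structure is added on disk.
conn.execute("CREATE INDEX idx_customer_postal ON customer (postal_code)")

schema_after = logical_schema()
rows_after = conn.execute(query, ("400001",)).fetchall()

print(schema_before == schema_after, rows_before == rows_after)
```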

Explain any four advantages of DBMS


Differentiate between Two tier and Three tier architecture.
Explain following terms
Storage Manager
Database Users

Write down any four advantages of DDBMS.


DCL : Data Control Language


DCL commands are used to implement the access control policies of the organization.
DCL commands are used to set or reset users' permissions on different objects (e.g.
tables, views etc.).

grant , revoke commands


TCL : Transaction Control Language
TCL commands are used to complete transactions without atomicity problems.
TCL commands are useful for keeping the database in a consistent state.
commit , rollback , savepoint commands
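
The three TCL commands can be tried with Python's sqlite3 module, as sketched below with an invented account table. Opening the connection with isolation_level=None is a choice made here so that BEGIN, COMMIT and ROLLBACK are issued explicitly. DCL is not shown because SQLite does not support grant and revoke.

```python
import sqlite3

# isolation_level=None: transactions are controlled only by the statements below.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")

conn.execute("BEGIN")
conn.execute("INSERT INTO account VALUES ('A', 100)")
conn.execute("SAVEPOINT before_b")                 # mark a point inside the transaction
conn.execute("INSERT INTO account VALUES ('B', 999)")
conn.execute("ROLLBACK TO SAVEPOINT before_b")     # undo only the insert of B
conn.execute("COMMIT")                             # make the insert of A permanent

conn.execute("BEGIN")
conn.execute("UPDATE account SET balance = 0 WHERE name = 'A'")
conn.execute("ROLLBACK")                           # undo the whole transaction

rows = conn.execute("SELECT name, balance FROM account").fetchall()
print(rows)
```

Only the committed work survives: account A with its original balance.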
