
Introduction to Database Management Systems


Contents

• 2.1 Overview
• 2.2 Driver Interfaces
• 2.3 SQL Engine
• 2.4 Transaction Engine
• 2.5 Relational Engine
• 2.6 Storage Engine
• 2.7 SELECT Execution Sequence
• 2.8 INSERT Execution Sequence

2.1 Overview


A database management system (DBMS) is the software that handles the storage and
retrieval of data. Most DBMSs in use today are relational, and this book
concentrates only on relational database management systems (RDBMS). An RDBMS has
five main components:

• Driver Interfaces
• SQL Engine
• Transaction Engine
• Relational Engine
• Storage Engine

Figure 1 shows the DBMS components, the memory layout, and the disk files associated
with a relational database management system. Since the early days of database systems,
disk has been treated as the backing store that makes data durable. The
architecture above applies to disk-resident database systems (DRDB). Two other
approaches are now in use besides DRDBs:

• Main memory databases (MMDB) – data is stored in main memory.
• Network databases – data is stored on another host and accessed over the network.

Most of the components in the DRDB system architecture above are also present in main
memory and network databases.

2.2 Driver Interfaces

A user or application program initiates either a schema modification or a content
modification. SQL broadly classifies these requests as Data Definition
Language (DDL), Data Manipulation Language (DML), and Data Control Language
(DCL). DDL deals with schema modifications, DML with content modifications, and
DCL with user access and privilege modifications. An application program
written in C/C++ typically uses an ODBC driver to connect to the DBMS, and one written
in Java uses a JDBC driver. Some vendors also provide language-specific
proprietary interfaces. For example, MySQL provides drivers for PHP, Python,
etc.

These drivers are built on top of SQL. They provide methods to prepare statements,
execute statements, fetch results, etc.
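
The prepare, execute, and fetch pattern looks much the same across drivers. The following is a minimal sketch using Python's built-in sqlite3 module, which follows the Python DB-API rather than ODBC or JDBC. The in-memory database, the employee table, and its columns are assumptions made only for this illustration. DCL statements are not shown because SQLite does not implement user privileges.

```python
import sqlite3

# Connect to a throwaway in-memory database (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# DDL: schema modification.
cur.execute("CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT)")

# DML: content modification through a parameterized statement.
cur.executemany(
    "INSERT INTO employee (id, name) VALUES (?, ?)",
    [(1, "Ann"), (2, "Bob")],
)
conn.commit()

# Execute a query and fetch its results.
cur.execute("SELECT name FROM employee WHERE id = ?", (2,))
rows = cur.fetchall()
conn.close()
```

The `?` placeholders let the driver bind values separately from the SQL text, which is the role prepared statements play in ODBC and JDBC as well.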

2.3 SQL Engine


This component is responsible for interpreting and executing the SQL query. It comprises
three major components:

Compiler – builds a data structure from the SQL statement and then performs semantic
checks on the query, such as whether the table and its fields exist.

Optimizer – transforms the initial query plan (the data structure created by the
compiler) into a sequence of operations, usually pipelined together, to achieve fast
execution. It consults the metadata (dictionary) and the statistical information stored
about the data to decide which sequence of operations is likely to be faster, and creates
the optimal query plan on that basis. Both cost-based and rule-based optimizers are used
in DRDBs.

Execution Engine – executes each step in the query plan chosen by the optimizer. It
interacts with the relational engine to retrieve and store records.
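
The three stages can be sketched as a toy pipeline. Everything here is hypothetical: the catalog layout, the function names, and the single optimizer rule are assumptions chosen to keep the example short, not a description of any real engine.

```python
# Toy catalog: table name -> known columns and indexed columns (hypothetical).
CATALOG = {"employee": {"columns": {"id", "name"}, "indexed": {"id"}}}
DATA = {"employee": [{"id": 1, "name": "Ann"}, {"id": 2, "name": "Bob"}]}


def compile_query(table, column, value):
    """Compiler: build a query structure and check it against the catalog."""
    if table not in CATALOG:
        raise ValueError(f"unknown table: {table}")
    if column not in CATALOG[table]["columns"]:
        raise ValueError(f"unknown field: {column}")
    return {"table": table, "column": column, "value": value}


def optimize(query):
    """Optimizer: one rule, prefer an index scan when the column is indexed."""
    indexed = query["column"] in CATALOG[query["table"]]["indexed"]
    return {**query, "access": "index_scan" if indexed else "full_scan"}


def execute(plan):
    """Execution engine: run the plan.

    Both access paths filter a list here, because the toy has no real index.
    """
    rows = DATA[plan["table"]]
    return [r for r in rows if r[plan["column"]] == plan["value"]]


plan = optimize(compile_query("employee", "id", 2))
result = execute(plan)
```

A cost-based optimizer would replace the single rule in `optimize` with an estimate derived from stored statistics.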

2.4 Transaction Engine


A transaction is a sequence of operations that read or write database elements and are
grouped together. A transaction should have the following ACID properties:

Atomicity: Either all or none of the effects of the transaction should appear in the
database after it completes.

Consistency: The transaction should leave the database in a consistent state, with all
constraints satisfied.

Isolation: The transaction should run as though no other transaction is running.

Durability: Once the transaction completes, its effect on the database
must never be lost.

All the above properties are explained in detail under the Transaction Chapter.
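
Atomicity and consistency can be observed with a small experiment. This sketch uses Python's built-in sqlite3 module. The account table, its CHECK constraint, and the transfer function are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    "id INTEGER PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 0)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move money between accounts; both updates apply or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


# The debit would make account 1 negative, so the CHECK constraint rejects it.
ok = transfer(conn, 1, 2, 500)
balances = conn.execute("SELECT id, balance FROM account ORDER BY id").fetchall()
```

The credit to account 2 succeeds first, but the failed debit rolls the whole transaction back, so neither balance changes.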

The transaction engine comprises three major components:

Concurrency Manager – responsible for concurrent, synchronized access to data. This is
usually implemented using latches and locks. Latches (or mutexes) are acquired and
released for short-duration synchronization, and locks are used for long-duration
synchronization.

Log Manager – responsible for the atomicity and durability properties of transactions.
Undo logs make sure that a transaction rollback returns the database to the consistent
state it was in when the transaction started. Redo logs make sure that all committed
transactions can be recovered after a crash.
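
The division of labour between undo and redo logs can be sketched with an in-memory key/value store. The class and method names are hypothetical, and a real log manager would write these records to disk rather than keep them in Python lists.

```python
_MISSING = object()  # marks "key did not exist before this write"


class ToyLogManager:
    """In-memory store with undo and redo logs (illustrative only)."""

    def __init__(self):
        self.data = {}
        self.undo_log = []  # (key, previous value) for the active transaction
        self.pending = []   # (key, new value) not yet committed
        self.redo_log = []  # (key, new value) for committed transactions

    def write(self, key, value):
        self.undo_log.append((key, self.data.get(key, _MISSING)))
        self.pending.append((key, value))
        self.data[key] = value

    def commit(self):
        self.redo_log.extend(self.pending)  # a real system forces this to disk
        self.undo_log.clear()
        self.pending.clear()

    def rollback(self):
        # Apply undo records newest-first to restore the starting state.
        for key, previous in reversed(self.undo_log):
            if previous is _MISSING:
                del self.data[key]
            else:
                self.data[key] = previous
        self.undo_log.clear()
        self.pending.clear()

    def recover(self):
        """Rebuild state after a crash by replaying committed redo records."""
        self.data = {}
        for key, value in self.redo_log:
            self.data[key] = value
        self.undo_log.clear()
        self.pending.clear()


db = ToyLogManager()
db.write("a", 1)
db.commit()
db.write("a", 2)
db.rollback()                   # undo log restores a = 1
after_rollback = dict(db.data)
db.write("b", 3)                # never committed
db.recover()                    # simulated crash: only committed writes survive
after_crash = dict(db.data)
```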

Recovery Manager – responsible for recovering the database from the disk image and
the redo log files. Some databases use a technique called 'shadow paging' to maintain
a consistent image of memory on disk.
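
The idea behind shadow paging can be sketched as follows, under the assumption that pages and page tables live in plain dictionaries. The class and its method names are hypothetical. Updates are written to fresh pages and recorded in a working page table, and commit replaces the committed (shadow) table in one step.

```python
class ShadowPagedStore:
    """Toy shadow paging: writes go to new pages; commit swaps the page table."""

    def __init__(self, pages):
        self.pages = dict(pages)                        # page id -> contents
        self.shadow_table = {n: n for n in self.pages}  # committed mapping
        self.current_table = None                       # working mapping
        self._next_id = max(self.pages, default=-1) + 1

    def begin(self):
        self.current_table = dict(self.shadow_table)    # copy the page table

    def write(self, logical, contents):
        new_id = self._next_id
        self._next_id += 1
        self.pages[new_id] = contents          # never overwrite a committed page
        self.current_table[logical] = new_id

    def commit(self):
        # The working table becomes the committed one in a single assignment.
        self.shadow_table = self.current_table
        self.current_table = None

    def abort(self):
        self.current_table = None              # committed table is untouched

    def read(self, logical):
        """Read the committed version of a logical page."""
        return self.pages[self.shadow_table[logical]]
```

Because committed pages are never modified in place, a crash before `commit` leaves the old page table and its pages intact.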

2.5 Relational Engine


Relational objects such as tables, indexes, and referential integrity constraints are
implemented in this component. Some of the main components are:

Field – abstracts column-level information, including type, length, etc.

Catalog – maintains metadata about relational database objects such as tables, indexes,
triggers, fields, etc.

Table – responsible for insert, update, delete, fetch, and execute operations. It
interacts with the allocator subsystem of the storage engine, which in turn talks to the
buffer manager to get the job done.

Index – responsible for insert, update, delete, and scan of index nodes for all index
types. Popular index types are hash and tree. A hash index speeds up point lookups
(predicates with equality on the primary key), and a tree index speeds up range
queries (predicates with a greater-than or less-than operator on the key).
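
The difference between the two index types can be illustrated with Python's standard containers. This is a sketch under stated assumptions: a dict stands in for the hash index, a sorted list searched with the bisect module stands in for the tree index, and the sample rows are hypothetical.

```python
import bisect

rows = [(5, "Eve"), (1, "Ann"), (9, "Ian"), (3, "Cal")]  # (key, name)

# Hash index: key -> name, suited to equality predicates (key = 3).
hash_index = {key: name for key, name in rows}

# Ordered index: sorted keys stand in for a tree, suited to range predicates.
sorted_keys = sorted(key for key, _ in rows)


def point_lookup(key):
    """Return the name stored under key, or None if the key is absent."""
    return hash_index.get(key)


def range_scan(low, high):
    """Return names with low <= key <= high, in key order."""
    start = bisect.bisect_left(sorted_keys, low)
    end = bisect.bisect_right(sorted_keys, high)
    return [hash_index[k] for k in sorted_keys[start:end]]
```

A hash index cannot answer `range_scan` without visiting every entry, because hashing discards key order. That is why the ordered structure is needed for range predicates.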

Expression Engine – represents the predicate (the WHERE clause of an SQL statement) of
a data retrieval operation and is responsible for evaluating its expressions, which can
include arithmetic, comparison, and logical expressions.
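
A predicate is commonly held as an expression tree and evaluated once per record. The sketch below assumes a nested-tuple representation. The node layout, the operator table, and the sample predicate are all hypothetical.

```python
import operator

OPS = {
    "+": operator.add, "-": operator.sub, "*": operator.mul,  # arithmetic
    "=": operator.eq, "<": operator.lt, ">": operator.gt,     # comparison
    "AND": lambda a, b: a and b, "OR": lambda a, b: a or b,   # logical
}


def evaluate(expr, record):
    """Evaluate a nested-tuple expression against one record (a dict).

    Leaves are ("col", name) or ("const", value); inner nodes are
    (operator, left, right).
    """
    kind = expr[0]
    if kind == "col":
        return record[expr[1]]
    if kind == "const":
        return expr[1]
    return OPS[kind](evaluate(expr[1], record), evaluate(expr[2], record))


# WHERE salary * 2 > 100 AND dept = 'eng'
predicate = (
    "AND",
    (">", ("*", ("col", "salary"), ("const", 2)), ("const", 100)),
    ("=", ("col", "dept"), ("const", "eng")),
)
```

Calling `evaluate(predicate, record)` for each candidate record yields a truth value that decides whether the record is part of the result.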

2.6 Storage Engine


This component is responsible for storing and retrieving data records. It also provides
a mechanism to store metadata and control information such as undo logs,
redo logs, lock tables, etc. The important storage engine components are:

Buffer Manager – responsible for loading pages from disk into memory and for managing
the buffer pool using a Least Recently Used (LRU) algorithm. It also has a
special-purpose allocator for storing control information, which is transient. The buffer
pool is the memory space the buffer manager uses to cache disk pages holding records,
index information, and metadata. Some database systems limit the buffer pool size at an
individual level and others at a global level.

File Manager – a database in a DRDB is nothing but a physical file on disk. The file
manager maps disk pages of the file to memory pages and performs the actual disk I/O
when the buffer manager module generates a major fault.
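
The interaction between the buffer manager and the file manager can be sketched as an LRU cache. The class name, the `load_page` callback standing in for the file manager, and the fault counter are assumptions of this illustration. Writing dirty pages back before eviction is left out for brevity.

```python
from collections import OrderedDict


class BufferPool:
    """Fixed-size page cache with least-recently-used eviction."""

    def __init__(self, capacity, load_page):
        self.capacity = capacity
        self.load_page = load_page  # stands in for the file manager's disk read
        self.pages = OrderedDict()  # page id -> contents, oldest first
        self.faults = 0

    def get(self, page_id):
        if page_id in self.pages:
            self.pages.move_to_end(page_id)    # mark as most recently used
            return self.pages[page_id]
        self.faults += 1                       # miss: ask the file manager
        if len(self.pages) >= self.capacity:
            self.pages.popitem(last=False)     # evict the least recently used
        self.pages[page_id] = self.load_page(page_id)
        return self.pages[page_id]


pool = BufferPool(2, lambda page_id: f"page-{page_id}")
for page_id in (1, 2, 1, 3, 2):
    pool.get(page_id)
```

With a capacity of two pages, the third access hits the cache, while loading page 3 evicts page 2 and loading page 2 again evicts page 1.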

Process Manager – responsible for registering and deregistering database application
processes and threads, and for accounting for all the resources (transactions, locks,
latches) they acquire.

2.7 SELECT Execution Sequence


This is what happens conceptually when a user issues a SELECT SQL statement:

Fig 2: SQL SELECT Execution Sequence

1. User issues transaction start request (startTrans())
2. DBMS reserves one free slot for the transaction started (allocSlot())
3. DBMS returns to the user.
4. User issues a SELECT SQL request (stmtExecute())
5. DBMS interprets the request and represents it in data structure (parse())
6. DBMS checks whether table and field names exist in database (check())
7. DBMS identifies the optimum way to execute the statement (optimize())
8. DBMS executes the statement by interacting with relational engine (execute())
9. DBMS checks with buffer manager whether the disk page where data is present is
already present in memory (isPageInMemory())
10. DBMS interacts with file manager to load the page into memory buffer if not
already loaded (loadPage())
11. DBMS evaluates records which satisfy the predicate (evaluate())
12. DBMS takes lock on the records based on the isolation level of the transaction
(lockRecord())
13. DBMS retrieves records and returns to the application (returnRecords())
14. User issues transaction commit (commit())
15. DBMS releases all the locks acquired during the transaction (releaseLocks())
16. DBMS releases the transaction slot allocated for this transaction (freeSlot())
17. DBMS returns to the application
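
The sequence above can be traced with a toy model. This is a sketch under several assumptions: each table occupies a single page, the only plan is a full scan, and the class and its Python method names are hypothetical stand-ins for the calls named in the steps.

```python
class ToyDBMS:
    """Walks the SELECT steps over in-memory data (illustrative only)."""

    def __init__(self, schema, disk):
        self.schema = schema    # table name -> set of field names
        self.disk = disk        # table name -> list of records (one page each)
        self.buffer = {}        # pages already loaded into memory
        self.locks = set()      # (table, row position) locks held
        self.in_transaction = False
        self.trace = []         # step names, in the order they ran

    def start_trans(self):
        self.trace.append("startTrans")
        self.in_transaction = True             # stands in for allocSlot()

    def select(self, table, field, value):
        self.trace.append("stmtExecute")
        # parse() is skipped: the request arrives already structured.
        self.trace.append("check")
        if table not in self.schema or field not in self.schema[table]:
            raise ValueError("unknown table or field")
        # optimize() is trivial here: the only plan is a full scan.
        self.trace.append("isPageInMemory")
        if table not in self.buffer:
            self.trace.append("loadPage")
            self.buffer[table] = self.disk[table]
        self.trace.append("evaluate")
        matches = []
        for pos, record in enumerate(self.buffer[table]):
            if record[field] == value:
                self.locks.add((table, pos))   # lockRecord()
                matches.append(record)
        return matches                         # returnRecords()

    def commit(self):
        self.trace.append("commit")
        self.locks.clear()                     # releaseLocks()
        self.in_transaction = False            # stands in for freeSlot()


db = ToyDBMS(
    {"employee": {"id", "name"}},
    {"employee": [{"id": 1, "name": "Ann"}, {"id": 2, "name": "Bob"}]},
)
db.start_trans()
first = db.select("employee", "id", 2)    # page is loaded from "disk"
second = db.select("employee", "id", 2)   # page is already in the buffer
locks_held = len(db.locks)
db.commit()
```

The second `select` skips `loadPage` because the buffer already holds the page, which mirrors the check made in steps 9 and 10.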

2.8 INSERT Execution Sequence


This is what happens conceptually when a user issues an INSERT SQL statement:

Fig 3: SQL INSERT Execution Sequence


1. User issues transaction start request (startTrans())
2. DBMS reserves one free slot for the transaction started (allocSlot())
3. DBMS returns to the user.
4. User issues an INSERT SQL request (stmtExecute())
5. DBMS interprets the request and represents it in data structure (parse())
6. DBMS checks whether table and field names exist in database (check())
7. DBMS identifies the optimum way to execute the statement (optimize())
8. DBMS executes the statement by interacting with relational engine (execute())
9. DBMS checks with buffer manager whether the disk page where record needs to
be allocated is already present in memory (isPageInMemory())
10. DBMS interacts with file manager to load the page into memory buffer if not
already loaded (loadPage()); not shown in the diagram above
11. DBMS creates undo log records for the newly inserted record (createUndoLog())
12. DBMS copies the values from the application buffer to allocated new record
(copyValues())
13. DBMS creates redo log records for the newly inserted record (createRedoLog())
14. DBMS takes lock on the allocated record based on the isolation level of the
transaction (lockRecord())
15. DBMS checks whether the table has indexes; if so, it inserts an index node into
every index on the table (insertIndexNode())
16. DBMS checks with buffer manager whether the index disk page where index
node needs to be allocated is already present in memory (isPageInMemory())
17. DBMS interacts with file manager to load the page into memory buffer if not
already loaded (loadPage()); not shown in the diagram above
18. DBMS takes lock on the allocated index node (lockIndexNode())
19. DBMS returns to the application with the number of rows affected
(returnRowsAffected())
20. User issues transaction commit (commit())
21. DBMS releases all the locks acquired during the transaction (releaseLocks())
22. DBMS releases the transaction slot allocated for this transaction (freeSlot())
23. DBMS returns to the application
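
The logging, locking, and index steps of the sequence above can be sketched for a single table. The class is hypothetical and makes simplifying assumptions: records are kept in a Python list, there is one hash index on a unique `id` field, and page loading is omitted.

```python
class ToyInsertEngine:
    """Follows the INSERT steps for one table with a hash index on "id"."""

    def __init__(self):
        self.rows = []           # table records, addressed by position
        self.index = {}          # id -> row position
        self.undo_log = []       # positions inserted by the active transaction
        self.redo_log = []       # (position, record) entries
        self.committed_redo = 0  # redo entries that belong to committed work
        self.locks = set()

    def insert(self, record):
        pos = len(self.rows)                        # slot for the new record
        self.undo_log.append(pos)                   # createUndoLog()
        self.rows.append(dict(record))              # copyValues()
        self.redo_log.append((pos, dict(record)))   # createRedoLog()
        self.locks.add(("row", pos))                # lockRecord()
        self.index[record["id"]] = pos              # insertIndexNode()
        self.locks.add(("index", record["id"]))     # lockIndexNode()
        return 1                                    # returnRowsAffected()

    def commit(self):
        self.committed_redo = len(self.redo_log)
        self.undo_log.clear()
        self.locks.clear()                          # releaseLocks()

    def rollback(self):
        # Undo newest-first, so each position is the current last row.
        for pos in reversed(self.undo_log):
            removed = self.rows.pop(pos)
            del self.index[removed["id"]]
        del self.redo_log[self.committed_redo:]
        self.undo_log.clear()
        self.locks.clear()
```

The undo record is written before the values are copied so that a failure midway can still be reversed. `rollback` is not part of the numbered sequence and is included only to show what the undo log is for.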
