
The buffer manager of a DBMS

A DBMS must manage a very large amount of data, and during processing the space
required for the blocks of data is often greater than the available main memory.
The system therefore needs a buffer memory into which blocks are loaded and from
which they are unloaded. The buffer manager is primarily responsible for the
operations that save and load these blocks. The operations that the buffer manager
provides are the following:

* FIX: this command asks the buffer manager to load a block from disk and return
a pointer to the memory area where it is loaded. If the block is already in memory,
the buffer manager only needs to return the pointer; otherwise it must read the
block from disk and bring it into memory. If the buffer is full, two situations
are possible:

  o It may be possible to release a portion of memory occupied by transactions
  that have already completed. In this case, before freeing the area, its content
  is written to disk if any block in that area has been changed.

  o The memory to be freed may be occupied by transactions that are still ongoing.
  In this case the buffer manager can work in two ways: in the first mode (STEAL),
  it frees memory occupied by a still-active transaction, saving its changes to
  disk if necessary; in the second mode (NO-STEAL), the transaction that requested
  the block is made to wait until memory becomes free.

* SET DIRTY: invoking this command marks a block in the buffer as modified.

Before introducing the last two commands, note that the DBMS can operate in two
modes: FORCE and NO-FORCE. In FORCE mode, writes to disk are performed
synchronously with the commit of a transaction. In NO-FORCE mode, writes are
carried out from time to time, asynchronously. Commercial databases typically
operate in NO-FORCE mode because this improves performance: a block may undergo
multiple changes in memory before being saved, and the writes can be scheduled
for when the system is lightly loaded.

* FORCE: this command causes the buffer manager to perform the write
synchronously with the completion (commit) of the transaction.
* FLUSH: this command causes the buffer manager to perform the write
asynchronously, when operating in NO-FORCE mode.
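
To make the four primitives concrete, the following is a minimal sketch in Python; the Frame and BufferManager classes, the dict-based disk, and the victim-selection rule are illustrative assumptions, not the design of any particular DBMS.

```python
# Minimal buffer-manager sketch: FIX, SET DIRTY, FORCE, FLUSH.

class Frame:
    def __init__(self, block_id, data):
        self.block_id = block_id
        self.data = data          # in-memory copy of the disk block
        self.dirty = False        # set by SET DIRTY
        self.owner_active = False # True while an owning transaction runs

class BufferManager:
    def __init__(self, disk, capacity, steal=True):
        self.disk = disk          # dict-like: block_id -> block contents
        self.capacity = capacity
        self.steal = steal        # STEAL vs NO-STEAL policy
        self.frames = {}          # block_id -> Frame

    def fix(self, block_id, txn_active=True):
        """Return a frame holding the block, loading it from disk if needed."""
        if block_id in self.frames:               # already buffered
            return self.frames[block_id]
        if len(self.frames) >= self.capacity:     # buffer full: find a victim
            victim = self._choose_victim()
            if victim is None:
                raise RuntimeError("NO-STEAL: caller must wait for a free frame")
            if victim.dirty:                      # save changes before reuse
                self.disk[victim.block_id] = victim.data
            del self.frames[victim.block_id]
        frame = Frame(block_id, self.disk[block_id])
        frame.owner_active = txn_active
        self.frames[block_id] = frame
        return frame

    def _choose_victim(self):
        # Prefer frames of completed transactions; steal a frame of an
        # active transaction only if the STEAL policy is enabled.
        for f in self.frames.values():
            if not f.owner_active:
                return f
        if self.steal:
            return next(iter(self.frames.values()))
        return None

    def set_dirty(self, block_id):
        self.frames[block_id].dirty = True        # SET DIRTY

    def force(self, block_id):
        """Synchronous write at commit (FORCE mode)."""
        f = self.frames[block_id]
        self.disk[f.block_id] = f.data
        f.dirty = False

    def flush(self):
        """Asynchronous background save (NO-FORCE mode)."""
        for f in self.frames.values():
            if f.dirty:
                self.disk[f.block_id] = f.data
                f.dirty = False
```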

Lecture 25
Recovery System
Chapter 15, Database System Concepts, by Silberschatz et al., 1997

Portions reproduced with permission

A computer system, like any other mechanical or electrical system, is subject to failure.
There are a variety of causes, including disk crash, power failure, software errors, a fire
in the machine room, or even sabotage. Whatever the cause, information may be lost. The
database system must take actions in advance to ensure that the atomicity and durability
properties of transactions are preserved. An integral part of a database system is a
recovery scheme that is responsible for restoring the database to the consistent
state that existed prior to the occurrence of the failure.

Failure Classification
The major types of failures involving data integrity (as opposed to data security) are:

• Transaction Failure:

o Logical error. The transaction cannot continue with its normal execution
because of such things as bad input, data not found, or resource limit
exceeded.
o System error. The system has entered an undesirable state (for example,
deadlock), as a result of which a transaction cannot continue with its
normal execution. The transaction, however, can be reexecuted at a later
time.
• System Crash. There is a hardware malfunction, or a bug in the database
software or the operating system, that causes the loss of the content of volatile
storage, and brings transaction processing to a halt. The content of the nonvolatile
storage remains intact, and is not corrupted.
• Disk Failure. A disk block loses its contents as a result of either a head crash or
failure during a data transfer. Copies of data on other disks, or archival backups
on tertiary media, such as tapes, are used to recover from the failure.

Storage Structure
• Volatile storage. Information here does not usually survive system crashes.
• Nonvolatile storage. This information normally does survive system crashes, but
can be lost (in a head crash, etc).
• Stable storage. Storage designed so that information residing in it is never lost.
Stable Storage Implementation

To implement stable storage, we need to replicate the needed information on several
nonvolatile media with independent failure modes and to update the information in a
controlled manner to ensure that failure during data transfer does not damage the needed
information.

RAID systems guarantee that the failure of a single disk will not result in the loss of data.
The simplest and fastest form of RAID is the mirrored disk, which keeps two copies of
each block on separate disks. RAID systems cannot, however, guard against the failure of
an entire site. The most secure systems keep a copy of each block of stable storage at a
remote site, writing it out over a computer network, in addition to storing it on a local
disk system.

Transfer of data between memory and disk storage can result in:

• Successful completion. The transferred information arrived safely at its
destination.
• Partial failure. A failure occurred in the midst of a transfer, and the destination
block has incorrect information.
• Total failure. The failure occurred sufficiently early during the transfer that the
destination block remains intact.

If a data transfer failure occurs, the system must detect it and invoke a recovery
procedure to restore the block to a consistent state. To do so, the system must maintain
two physical blocks for each logical database block.

A transfer of a block of data proceeds as follows:

1. Write the information onto the first physical block.
2. When the first write completes successfully, write the same information onto the
second physical block.

During recovery, each pair of physical blocks is examined. If both of them are the same
and no detectable error exists, then no further actions are necessary.

If one block contains a detectable error, then we replace its content with the contents of
the other block.

When the blocks contain different data without a detectable error in either, then the
contents of the second block are written to the first block.

This recovery procedure ensures that a write to stable storage either succeeds
completely or results in no change.
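
As an illustration, the dual-block write and recovery rules might be sketched as follows; the CRC check is an assumption standing in for whatever error detection the hardware provides.

```python
import zlib

# Stable-storage sketch: each logical block is kept as two physical copies.

def _pack(data: bytes):
    return (data, zlib.crc32(data))           # (contents, checksum)

def _ok(copy) -> bool:
    data, crc = copy
    return zlib.crc32(data) == crc            # detectable-error check

def stable_write(pair: list, data: bytes) -> None:
    pair[0] = _pack(data)                     # 1. write the first physical block
    pair[1] = _pack(data)                     # 2. only then write the second

def recover(pair: list) -> None:
    """Run at restart for every pair of physical blocks."""
    first_ok, second_ok = _ok(pair[0]), _ok(pair[1])
    if first_ok and second_ok and pair[0][0] == pair[1][0]:
        return                                # copies agree: nothing to do
    if not first_ok and second_ok:
        pair[0] = pair[1]                     # replace the damaged first copy
    elif not second_ok and first_ok:
        pair[1] = pair[0]                     # replace the damaged second copy
    elif first_ok and second_ok:
        pair[0] = pair[1]                     # different but intact: second wins
```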

Log-Based Recovery
The most widely used structure for recording database modifications is the log. The log is
a sequence of log records and maintains a history of all update activities in the database.
There are several types of log records.

An update log record describes a single database write:

• Transaction identifier.
• Data-item identifier.
• Old value.
• New value.

Whenever a transaction performs a write, it is essential that the log record for that write
be created before the database is modified. Once a log record exists, we can output the
modification to the database whenever that is desirable. We also have the ability to
undo a modification that has already been output to the database, by using the old-value
field in the log records.

For log records to be useful for recovery from system and disk failures, the log must
reside on stable storage. However, since the log contains a complete record of all
database activity, the volume of data stored in the log may become unreasonably large.
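
As a sketch, an update log record with the four fields above, together with the write-ahead rule, might look like this in Python (the dataclass and the list-based log are assumptions of the sketch):

```python
from dataclasses import dataclass

@dataclass
class UpdateLogRecord:
    txn_id: str        # transaction identifier
    item_id: str       # data-item identifier
    old_value: object  # value before the write (used for undo)
    new_value: object  # value after the write (used for redo)

log = []       # stands in for the log on stable storage
database = {}  # stands in for the database on disk

def write(txn_id, item_id, new_value):
    # Write-ahead rule: create the log record BEFORE modifying the database.
    log.append(UpdateLogRecord(txn_id, item_id, database.get(item_id), new_value))
    database[item_id] = new_value
```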

Deferred Database Modification

The deferred-modification technique ensures transaction atomicity by recording all
database modifications in the log, but deferring all write operations of a transaction until
the transaction partially commits (i.e., once the final action of the transaction has been
executed). Then the information in the logs is used to execute the deferred writes. If the
system crashes or if the transaction aborts, then the information in the logs is ignored.
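
Continuing the sketch above, deferred modification could buffer a transaction's writes and apply them only at partial commit; `pending` is an assumed per-transaction list:

```python
def deferred_write(txn_id, item_id, new_value, pending):
    # Record the intended write in the log only; the database is untouched.
    pending.append(UpdateLogRecord(txn_id, item_id, None, new_value))

def partial_commit(pending):
    # The final action has executed: replay the logged writes for real.
    for rec in pending:
        log.append(rec)
        database[rec.item_id] = rec.new_value
    # On a crash or abort before this point, `pending` is simply discarded.
```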

Immediate Database Modification

The immediate-update technique allows database modifications to be output to the
database while the transaction is still in the active state. These modifications are called
uncommitted modifications. In the event of a crash or transaction failure, the system must
use the old-value field of the log records to restore the modified data items.
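
In the same sketch, undoing a failed transaction under immediate update scans the log backwards and restores old values:

```python
def undo(txn_id):
    # Walk the log in reverse, restoring the old value of every write
    # made by the failed transaction.
    for rec in reversed(log):
        if rec.txn_id == txn_id:
            database[rec.item_id] = rec.old_value
```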

Checkpoints

When a system failure occurs, we must consult the log to determine those transactions
that need to be redone and those that need to be undone. Rather than reprocessing the
entire log, which is time-consuming and largely unnecessary, we can use checkpoints:

1. Output onto stable storage all the log records currently residing in main memory.
2. Output to the disk all modified buffer blocks.
3. Output onto stable storage a log record, <checkpoint>.
Recovery then needs to process only the log records written since the last checkpoint record.
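
The three checkpoint steps might be sketched as follows; the in-memory structures passed in are assumptions of the sketch:

```python
def checkpoint(log_buffer, stable_log, modified_blocks, disk):
    # 1. Output all log records currently in main memory to stable storage.
    stable_log.extend(log_buffer)
    log_buffer.clear()
    # 2. Output all modified buffer blocks to disk.
    for block_id, data in modified_blocks.items():
        disk[block_id] = data
    modified_blocks.clear()
    # 3. Output a <checkpoint> record to stable storage.
    stable_log.append("<checkpoint>")
```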

Shadow Paging
Shadow paging is an alternative to log-based recovery techniques that has both
advantages and disadvantages. It may require fewer disk accesses, but it is hard to extend
shadow paging to allow multiple concurrent transactions. The technique is very similar to
the paging schemes used by the operating system for memory management.

The idea is to maintain two page tables during the life of a transaction: the current page
table and the shadow page table. When the transaction starts, both tables are identical.
The shadow page table is never changed during the life of the transaction. The current
page table is updated with each write operation. Each table entry points to a page on the
disk. When the transaction commits, the shadow page table entry becomes a copy of the
current page table entry, and the disk block with the old data is released. If the shadow
page table is stored in nonvolatile storage and a system crash occurs, then the shadow
page table is copied to the current page table. This guarantees that the shadow page table
points to the database pages corresponding to the state of the database prior to any
transaction that was active at the time of the crash, making aborts automatic.
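
A minimal sketch of the two-table mechanism, with a naive free-page allocator as an assumption:

```python
# Shadow-paging sketch: the shadow table is frozen at transaction start;
# the current table is updated copy-on-write.

disk_pages = {0: "old A", 1: "old B"}    # disk page id -> contents
next_free = 2                            # naive free-page allocator

shadow_table = {"A": 0, "B": 1}          # logical page -> disk page
current_table = dict(shadow_table)       # identical when the txn starts

def txn_write(name, value):
    # Copy-on-write: place the new version on a fresh disk page and point
    # the CURRENT table at it; the SHADOW table is never changed.
    global next_free
    disk_pages[next_free] = value
    current_table[name] = next_free
    next_free += 1

def commit():
    # The current table becomes the new shadow table; pages holding the
    # old versions are now garbage awaiting collection.
    global shadow_table
    shadow_table = dict(current_table)

def crash_recover():
    # After a crash, readopt the shadow table: the in-flight transaction's
    # updates vanish, so aborts are automatic.
    global current_table
    current_table = dict(shadow_table)
```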

There are drawbacks to the shadow-page technique:

• Commit overhead. The commit of a single transaction using shadow paging
requires multiple blocks to be output -- the current page table, the actual data, and
the disk address of the current page table. Log-based schemes need to output only
the log records.
• Data fragmentation. Shadow paging causes database pages to change location
(and therefore to no longer be contiguous).
• Garbage collection. Each time that a transaction commits, the database pages
containing the old version of data changed by the transaction must become
inaccessible. Such pages are considered to be garbage since they are not part of
the free space and do not contain any usable information. Periodically it is
necessary to find all of the garbage pages and add them to the list of free pages.
This process is called garbage collection and imposes additional overhead and
complexity on the system.

Recovery with Concurrent Transactions


Regardless of the number of concurrent transactions, the system has a single disk
buffer and a single log, shared by all transactions. We allow immediate updates, and
permit a buffer block to have data items updated by one or more transactions.

Buffer Management
Log-Record Buffering

The cost of performing the output of a block to stable storage is sufficiently high that it is
desirable to output multiple log records at once, using a buffer. When the buffer is full, it
is output with as few output operations as possible. However, a log record may reside
only in main memory for a considerable time before it is actually written to stable storage.
Such log records are lost if the system crashes. It is necessary, therefore, to write all log
records of a transaction to stable storage when it commits. There is no problem in writing
the log records of other, uncommitted transactions at this time.
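
A sketch of that rule (the two lists below stand in for the volatile log buffer and stable storage):

```python
log_buffer = []   # log records still in main memory (volatile)
stable_log = []   # log records already on stable storage

def append_log(record):
    log_buffer.append(record)            # cheap: no output operation yet

def commit(txn_id):
    # Before the commit completes, force the whole buffer out. This writes
    # the committing transaction's records; also writing records of other,
    # uncommitted transactions at the same time is harmless.
    stable_log.extend(log_buffer)
    log_buffer.clear()
    stable_log.append(f"<commit {txn_id}>")
```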

Database Buffering

Database buffering uses the standard operating-system concept of virtual memory.
Whenever blocks of the database in memory must be replaced, all modified data blocks
and the log records associated with those blocks must be written to disk.

Operating System Role in Buffer Management

We can manage the database buffer using one of two approaches:

1. The database system reserves part of main memory to serve as a buffer that the
DBMS manages instead of the operating system. This means that the buffer must
be kept as small as possible (because of its impact on other processes active on
the CPU) and it adds to the complexity of the DBMS.
2. The DBMS implements its buffer within the virtual memory of the operating
system. The operating system would then have to coordinate the swapping of
pages to ensure that the appropriate buffers were also written to disk.
Unfortunately, almost all current-generation operating systems retain complete
control of virtual memory. The operating system reserves space on disk for
storing virtual memory pages that are not currently in main memory, called swap
space. This approach may result in extra output to the disk.

Failure with Loss of Nonvolatile Storage


The basic scheme is to dump the entire content of the database to stable storage
periodically. No transaction can be active during the dump procedure.

To recover from the loss of nonvolatile storage, we restore the database from the archive
and redo all the transactions that have been committed since the most recent dump.

This is also known as an archival dump. Dumps of the database and checkpointing are
very similar.
