
FILE STRUCTURES

MJ FOLK, B ZOELLICK, G RICCARDI


CHAPTER 3

Secondary Storage and System Software
Secondary Storage : Disks

Disks
Disks belong to a class of devices known as DASDs.
* DASD (Direct Access Storage Device) is a disk or other
secondary storage device that permits access to a
specific sector or block of data.
Types of Disks
* Hard disks offer high capacity and low cost per bit.
* Floppy disks are cheap but slow and hold little data.
* Optical disks are read-only but hold a lot of data and
can be reproduced cheaply.
Secondary Storage : Disks
Organization of Disks

Information stored on a disk is stored on the surface
of one or more platters.
* Platter is one disk in the stack of disks on a disk drive.
Information is stored in successive tracks on the
surface of a disk.
* Track is the set of bytes on a single surface of a disk that
can be accessed without seeking.
Each track is divided into a number of sectors.
* Sector is the smallest addressable unit on a disk.
Organization of Disks

Cylinder
* The set of tracks on a disk that are directly above and
below each other.
* Information on a single cylinder can be accessed without
moving the access arm (read/write arm), that is, without
the expense of seek time.
Estimating Capacities and Space Needs

A cylinder consists of a group of tracks, a track consists
of a group of sectors, and a sector consists of bytes, so:

* Track Capacity = # of sectors per track * bytes per sector


* Cylinder Capacity = # of tracks per cylinder * track capacity
* Drive Capacity = # of cylinders * cylinder capacity
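The three capacity formulas compose directly, which a short Python sketch can make concrete. The function names are hypothetical, and the drive parameters in the usage lines (512 bytes per sector, 63 sectors per track, 16 tracks per cylinder, 4092 cylinders) are illustrative values only.

```python
def track_capacity(sectors_per_track: int, bytes_per_sector: int) -> int:
    """Track capacity = # of sectors per track * bytes per sector."""
    return sectors_per_track * bytes_per_sector


def cylinder_capacity(tracks_per_cylinder: int, sectors_per_track: int,
                      bytes_per_sector: int) -> int:
    """Cylinder capacity = # of tracks per cylinder * track capacity."""
    return tracks_per_cylinder * track_capacity(sectors_per_track, bytes_per_sector)


def drive_capacity(cylinders: int, tracks_per_cylinder: int,
                   sectors_per_track: int, bytes_per_sector: int) -> int:
    """Drive capacity = # of cylinders * cylinder capacity."""
    return cylinders * cylinder_capacity(
        tracks_per_cylinder, sectors_per_track, bytes_per_sector)


# Illustrative drive geometry (assumed values, not a specific product).
print(track_capacity(63, 512))            # bytes per track
print(cylinder_capacity(16, 63, 512))     # bytes per cylinder
print(drive_capacity(4092, 16, 63, 512))  # bytes per drive
```

With these values the drive holds a little over 2.1 billion bytes.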
Estimating Capacities and Space Needs

Suppose we want to store a file with 50,000 fixed-length
data records on a 2.1 GB small computer disk with the
following characteristics:
* # of bytes per sector = 512
* # of sectors per track = 63
* # of tracks per cylinder = 16
* # of cylinders = 4092
How many cylinders does the file require if each data
record requires 256 bytes?
Estimating Capacities and Space Needs

Each sector can hold 2 records (512 / 256 = 2), so the file requires:
* 50,000 / 2 = 25,000 sectors
One cylinder can hold:
* 63 * 16 = 1008 sectors
The number of cylinders required is approximately:
* 25,000 / 1008 = 24.8 cylinders
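The same calculation can be sketched as a small Python function. The function name is hypothetical, and the sketch assumes that records never span sectors and that a record is no larger than a sector.

```python
import math


def cylinders_needed(num_records: int, record_size: int, bytes_per_sector: int,
                     sectors_per_track: int, tracks_per_cylinder: int) -> float:
    """Estimate how many cylinders a file of fixed-length records occupies.

    Assumes whole records per sector (no record spans two sectors).
    """
    records_per_sector = bytes_per_sector // record_size
    sectors = math.ceil(num_records / records_per_sector)
    sectors_per_cylinder = sectors_per_track * tracks_per_cylinder
    return sectors / sectors_per_cylinder


estimate = cylinders_needed(50_000, 256, 512, 63, 16)
print(round(estimate, 1))   # fractional estimate
print(math.ceil(estimate))  # whole cylinders the file touches
```

Rounding up gives the number of whole cylinders the file touches if it is stored contiguously starting at a cylinder boundary.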
Organizing Tracks by Sector

Physical Placement of Sectors

The most practical logical view of the sectors on a
track is that they are adjacent, fixed-size segments
of a track that happen to hold a file.
Interleaving sectors: leave an interval of several
physical sectors between logically adjacent sectors.
In the early 1990s, controller speeds improved so that
disks can now offer 1:1 interleaving, which makes it
possible to read an entire track in a single revolution
of the disk.
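To see what interleaving does to the physical layout, the following Python sketch places logical sectors around a track for a given interleave factor. It is a minimal illustration: the function name and the placement rule (step forward by the factor, then skip to the next free slot on a collision) are assumptions, not a description of any particular controller.

```python
def interleave_layout(num_sectors: int, factor: int) -> list:
    """Return the logical sector number stored in each physical slot.

    factor = 1 is 1:1 interleaving (logical order equals physical order);
    factor = 3 puts logically adjacent sectors 3 physical slots apart.
    """
    layout = [None] * num_sectors
    pos = 0
    for logical in range(num_sectors):
        # If the target slot is taken, slide forward to the next free one.
        while layout[pos] is not None:
            pos = (pos + 1) % num_sectors
        layout[pos] = logical
        pos = (pos + factor) % num_sectors
    return layout


print(interleave_layout(8, 1))  # [0, 1, 2, 3, 4, 5, 6, 7]
print(interleave_layout(8, 3))  # [0, 3, 6, 1, 4, 7, 2, 5]
```

With a factor of 3 on this 8-sector track, reading the sectors in logical order takes three revolutions, whereas 1:1 interleaving needs only one.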
Organizing Tracks by Sector

Clusters
Another view of sector organization is the view
maintained by the part of a computer's operating
system that we call the file manager.
* File manager is the part of an OS that is responsible for
managing files. It includes a collection of programs whose
responsibilities range from keeping track of files to
invoking I/O processes that transmit information between
primary and secondary storage.
* Cluster is a fixed number of contiguous sectors.
Organizing Tracks by Sector

Once a cluster has been found on a disk, all sectors in
that cluster can be accessed without requiring an
additional seek.
The file manager ties logical sectors to the physical
clusters they belong to by using a file allocation table
(FAT).
* File allocation table (FAT) is a table that contains
mappings to the physical locations of all the clusters in all
files on disk storage.
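The idea of mapping logical sectors to physical clusters can be sketched in Python. This is a toy model only: the dictionary layout, the file name, and the function name are hypothetical and do not reproduce the on-disk format of any real file system.

```python
def locate_sector(fat: dict, file_name: str, logical_sector: int,
                  sectors_per_cluster: int) -> tuple:
    """Map a file's logical sector to (physical cluster, sector offset in cluster).

    `fat` is a toy table: file name -> list of physical cluster numbers,
    listed in the file's logical order.
    """
    if logical_sector < 0:
        raise IndexError("logical sector must be non-negative")
    clusters = fat[file_name]
    cluster_index, offset = divmod(logical_sector, sectors_per_cluster)
    if cluster_index >= len(clusters):
        raise IndexError("logical sector is beyond the end of the file")
    return clusters[cluster_index], offset


# Hypothetical file occupying physical clusters 7, 8 and 20 (4 sectors each).
toy_fat = {"records.dat": [7, 8, 20]}
print(locate_sector(toy_fat, "records.dat", 5, 4))  # (8, 1)
print(locate_sector(toy_fat, "records.dat", 9, 4))  # (20, 1)
```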
Organizing Tracks by Sector

Extents
If there is a lot of free room on a disk, it may be
possible to make a file consist entirely of contiguous
clusters. In that case the file consists of one extent
and can be processed with a minimum of seek time.
* Extent is one or more adjacent clusters allocated as part
of a file. The number of extents in a file reflects how
dispersed the file is over the disk. The more dispersed the
file, the more seeking must be done.
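Counting extents amounts to counting runs of adjacent cluster numbers. The Python sketch below assumes a file's clusters are given as a list of physical cluster numbers in logical order; the function name and the sample cluster lists are hypothetical.

```python
def count_extents(clusters: list) -> int:
    """Count runs of physically adjacent clusters in a file's cluster list."""
    if not clusters:
        return 0
    extents = 1
    for previous, current in zip(clusters, clusters[1:]):
        # A new extent starts wherever the next cluster is not adjacent.
        if current != previous + 1:
            extents += 1
    return extents


print(count_extents([7, 8, 9, 10]))           # 1: fully contiguous file
print(count_extents([7, 8, 9, 20, 21, 40]))   # 3: dispersed file
```

Under this model, each additional extent is a place where reading the file in order may require another seek.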
Organizing Tracks by Sector

Fragmentation is the space that goes unused within a
cluster, block, track or other unit of physical storage.
* Example
Sector size is 512 bytes
Size of each record in the file is 300 bytes
Storing one record per sector leaves 512 - 300 = 212 bytes unused in every sector
Internal fragmentation is the loss of space within a
sector.
Organizing Tracks by Sector

There are 2 possible organizations for records (if the
records are smaller than the sector size):
1. Store one record per sector
Advantage: Each record can be retrieved from one sector
Disadvantage: Internal fragmentation
2. Store the records successively (i.e., one record may
span two sectors)
Advantage: No internal fragmentation
Disadvantage: Two sectors may need to be accessed to
retrieve a single record.
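The trade-off between the two organizations can be quantified with a short Python sketch. The function names are hypothetical, and the usage lines assume 300-byte records, 512-byte sectors, and a file of 100 records purely for illustration.

```python
import math


def one_record_per_sector(num_records: int, record_size: int, sector_size: int):
    """Organization 1: return (sectors used, bytes lost to internal fragmentation)."""
    return num_records, num_records * (sector_size - record_size)


def records_spanning_sectors(num_records: int, record_size: int, sector_size: int):
    """Organization 2: return (sectors used, unused bytes at the end of the last sector)."""
    total_bytes = num_records * record_size
    sectors = math.ceil(total_bytes / sector_size)
    return sectors, sectors * sector_size - total_bytes


def sectors_touched(record_index: int, record_size: int, sector_size: int) -> int:
    """Organization 2: number of sectors read to fetch one record (0-based index)."""
    first = (record_index * record_size) // sector_size
    last = ((record_index + 1) * record_size - 1) // sector_size
    return last - first + 1


print(one_record_per_sector(100, 300, 512))     # (100, 21200)
print(records_spanning_sectors(100, 300, 512))  # (59, 208)
print(sectors_touched(0, 300, 512))             # 1
print(sectors_touched(1, 300, 512))             # 2: this record spans two sectors
```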
Organizing Tracks by Block

Disk tracks may be divided into integral numbers of
user-defined blocks whose sizes can vary.
* Block is a unit of data organization corresponding to the
amount of data transferred in a single access.
When the data on a track is organized by block, the
amount of data transferred in a single I/O operation
can vary depending on the needs of the software designer.
Organizing Tracks by Block

Block Addressing Scheme

Blocking factor is the number of records stored in one
block.
Count subblock is a small block that precedes each
data block and contains information about the data
block such as its byte count and its address.
Key subblock is a subblock that contains the key of the
last record in the data block.
Nondata Overhead

Whether using a block or a sector organization, some
space on the disk is taken up by non-data overhead, i.e.,
information stored on the disk during pre-formatting.
On sector-addressable disks, pre-formatting involves
storing, at the beginning of each sector, the sector address,
track address and condition (usable or defective), plus gaps
and synchronization marks between fields of information to
help the read/write mechanism distinguish between them.
On block-organized disks, subblocks and interblock gaps have
to be provided with every block. The relative amount of
non-data space necessary for a block scheme is higher than
for a sector scheme.
Nondata Overhead

Suppose we have a block-addressable disk drive with
20,000 bytes per track and the amount of space taken
by the subblocks and interblock gaps is equal to 300
bytes per block. We want to store a file containing
100-byte records on the disk. How many records can
be stored per track if the blocking factor is 10? If the
blocking factor is 60?
Nondata Overhead

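Blocking factor = 10
* 100 * 10 = 1000 bytes of data per block
* 1000 + 300 = 1300 bytes per block including overhead
* 20,000 / 1300 = 15.38, or 15 blocks per track
* 15 * 10 = 150 records per track
Blocking factor = 60
* 100 * 60 = 6000 bytes of data per block
* 6000 + 300 = 6300 bytes per block including overhead
* 20,000 / 6300 = 3.17, or 3 blocks per track
* 3 * 60 = 180 records per track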
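The per-track calculation generalizes to any blocking factor. The Python sketch below uses a hypothetical function name and assumes that only whole blocks fit on a track and that the overhead per block is fixed.

```python
def records_per_track(track_bytes: int, overhead_per_block: int,
                      record_size: int, blocking_factor: int) -> int:
    """Number of records that fit on one track of a block-addressable disk."""
    block_bytes = record_size * blocking_factor + overhead_per_block
    blocks_per_track = track_bytes // block_bytes  # only whole blocks fit
    return blocks_per_track * blocking_factor


print(records_per_track(20_000, 300, 100, 10))  # 150
print(records_per_track(20_000, 300, 100, 60))  # 180
```

A larger blocking factor spreads the fixed overhead over more records, but it can also leave more of the track unused when the last whole block does not fit.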
Cost of a Disk Access

Three factors contribute to the amount of time needed
to access a file on a fixed disk:
Seek time is the time required to move the access
arm to the correct cylinder.
Rotational delay refers to the time it takes for the disk
to rotate so the sector we want is under the
read/write head.
Transfer time refers to the amount of time required
for the data to pass under the read/write head.
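A rough access-time estimate adds the three components together. The Python sketch below rests on two common simplifying assumptions: the average rotational delay is half a revolution, and the transfer time is the fraction of a track being read multiplied by the rotation time. The function names and the drive parameters in the usage lines are hypothetical.

```python
def rotation_time_ms(rpm: float) -> float:
    """Time for one full revolution, in milliseconds."""
    return 60_000 / rpm


def average_rotational_delay_ms(rpm: float) -> float:
    """Assumed average: half a revolution before the sector arrives."""
    return rotation_time_ms(rpm) / 2


def transfer_time_ms(bytes_transferred: int, bytes_per_track: int, rpm: float) -> float:
    """Assumed model: fraction of the track read, times the rotation time."""
    return bytes_transferred / bytes_per_track * rotation_time_ms(rpm)


def access_time_ms(average_seek_ms: float, rpm: float,
                   bytes_transferred: int, bytes_per_track: int) -> float:
    """Seek time + rotational delay + transfer time."""
    return (average_seek_ms
            + average_rotational_delay_ms(rpm)
            + transfer_time_ms(bytes_transferred, bytes_per_track, rpm))


# Hypothetical drive: 8 ms average seek, 6000 rpm, 32,256 bytes per track.
print(round(access_time_ms(8.0, 6000, 512, 32_256), 2))  # one 512-byte sector
```

In this example the seek and the rotational delay account for nearly all of the time, and the transfer itself is a small fraction of a millisecond.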
Disk as Bottleneck

Problem
* Processes are often disk-bound, i.e., the network and the
CPU have to wait inordinate lengths of time for the disk to
transmit data.
Solutions
Multiprogramming, in which the CPU works on other
jobs while waiting for the data to arrive.
Disk striping, which involves splitting the parts of a file
across several different drives, then letting the separate
drives deliver parts of the file to the network simultaneously.
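Disk striping can be illustrated with an in-memory Python sketch that deals fixed-size chunks of a file out to several drives in round-robin order. The function names, the round-robin policy, and the chunk size are assumptions for illustration; real striping is done by the controller or operating system, and the drives are modeled here as plain lists.

```python
def stripe(data: bytes, num_drives: int, stripe_size: int) -> list:
    """Split data into stripe_size chunks and deal them out round-robin."""
    drives = [[] for _ in range(num_drives)]
    for start in range(0, len(data), stripe_size):
        chunk_index = start // stripe_size
        drives[chunk_index % num_drives].append(data[start:start + stripe_size])
    return drives


def reassemble(drives: list) -> bytes:
    """Rebuild the original data by reading one chunk per drive per round."""
    out = bytearray()
    rounds = max((len(chunks) for chunks in drives), default=0)
    for r in range(rounds):
        for chunks in drives:
            if r < len(chunks):
                out += chunks[r]
    return bytes(out)


striped = stripe(b"ABCDEFGHIJ", num_drives=3, stripe_size=2)
print(striped)              # [[b'AB', b'GH'], [b'CD', b'IJ'], [b'EF']]
print(reassemble(striped))  # b'ABCDEFGHIJ'
```

Because consecutive chunks sit on different drives, the chunks in one round could in principle be read at the same time.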
Disk as Bottleneck

RAID (Redundant Array of Independent Disks) is an array
of disk drives that provides access to the disks in parallel.
Avoid accessing the disk at all: use a RAM disk or
disk cache instead of secondary storage.
RAM disk is a block of memory configured to simulate a
disk.
Disk cache is a segment of memory configured to contain
pages of data from a disk.
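A disk cache can be sketched in Python as a small in-memory table of pages that is consulted before going to the disk. The class name, the `read_page` callback, and the least-recently-used replacement policy are all assumptions made for illustration; the text above does not specify how a cache chooses which page to discard.

```python
from collections import OrderedDict


class DiskCache:
    """Toy page cache; evicts the least recently used page (assumed policy)."""

    def __init__(self, capacity: int, read_page):
        self.capacity = capacity
        self.read_page = read_page   # hypothetical function that reads from disk
        self.pages = OrderedDict()   # page number -> page data, oldest first
        self.hits = 0
        self.misses = 0

    def get(self, page_number: int):
        if page_number in self.pages:
            self.hits += 1
            self.pages.move_to_end(page_number)  # mark as most recently used
            return self.pages[page_number]
        self.misses += 1
        data = self.read_page(page_number)       # the slow path: a disk access
        self.pages[page_number] = data
        if len(self.pages) > self.capacity:
            self.pages.popitem(last=False)       # drop the least recently used page
        return data


def fake_disk_read(page_number: int) -> str:
    """Stand-in for a real disk read."""
    return f"page-{page_number}"


cache = DiskCache(capacity=2, read_page=fake_disk_read)
for page in (1, 2, 1, 3, 2):
    cache.get(page)
print(cache.hits, cache.misses)  # 1 4
```

Only the misses reach the disk, so the more often a program rereads pages that are still cached, the fewer disk accesses it pays for.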
