Serial file
A serial file is one in which records are stored, one after the other, in the order in which they are
added, not in the order of a key field.
This means that new records are stored at the end of the file.
The following shows a serial file that is used to store the number of entries for EdExcel GCSE
Mathematics. The entries were received in the order: Kettlewood, Queens Park, St Marys, Wilton
High, West Orling.
Centre Number   Centre Name    No of Candidates
27102           Kettlewood     85
38240           Queens Park    103
64715           St Marys       121
30446           Wilton High    156
12304           West Orling    105
Note that the key field in this file would be Centre Number (it uniquely identifies each school).
Both disks and tapes can be used to store a file serially.
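As a rough illustration of serial storage, the sketch below models the file as a Python list. The names serial_file, append_record and serial_search are made up for this example and are not part of any real file system. New records always go on the end, and a search has to read from the start of the file.

```python
# Each record is (centre_number, centre_name, candidates).
# The list stands in for the file; this is an assumption made for illustration.
serial_file = []

def append_record(file, record):
    """Add a record at the end of the file, whatever its key value is."""
    file.append(record)

# Records in the order in which the entries were received.
arrivals = [
    (27102, "Kettlewood", 85),
    (38240, "Queens Park", 103),
    (64715, "St Marys", 121),
    (30446, "Wilton High", 156),
    (12304, "West Orling", 105),
]
for record in arrivals:
    append_record(serial_file, record)

def serial_search(file, key):
    """Read from the beginning until the key is found.

    Returns the position of the record, or None after reading every
    record if the key is not present.
    """
    for position, record in enumerate(file):
        if record[0] == key:
            return position
    return None

print(serial_search(serial_file, 12304))  # 4 (the last record added)
```

Because the records are in arrival order, a search for a key that is not in the file cannot stop until the whole file has been read.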
Sequential file
A sequential file is one in which the records are stored, one after the other, in the order of the key
field.
The following shows a sequential file that is used to store the number of entries for EdExcel GCSE
Mathematics. The entries were added in the order: Kettlewood, Queens Park, St Marys, Wilton
High, West Orling, but they are stored in the order of the key field, Centre Number:
Centre Number   Centre Name    No of Candidates
12304           West Orling    105
27102           Kettlewood     85
30446           Wilton High    156
38240           Queens Park    103
64715           St Marys       121
As with a serial file, both tape and disks can be used to store a file sequentially and access to the
records must take place from the beginning of the file.
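The sketch below shows one way the key order could be maintained as records arrive. It again models the file as a Python list (an assumption for illustration) and uses the standard library's bisect.insort to place each record in its key position. The name sequential_file is hypothetical.

```python
import bisect

# Records as (centre_number, centre_name, candidates). The key is the first
# item, so tuples compare by Centre Number.
arrivals = [
    (27102, "Kettlewood", 85),
    (38240, "Queens Park", 103),
    (64715, "St Marys", 121),
    (30446, "Wilton High", 156),
    (12304, "West Orling", 105),
]

sequential_file = []
for record in arrivals:
    # insort finds the correct position and moves later records along,
    # so the list stays in key order after every insertion.
    bisect.insort(sequential_file, record)

keys_in_file_order = [record[0] for record in sequential_file]
print(keys_in_file_order)  # [12304, 27102, 30446, 38240, 64715]
```

West Orling was the last entry received but ends up first in the file because it has the lowest Centre Number.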
Benefits
Sequential files allow the records to be displayed in the order of the key field. Keeping the records
in key order makes the process of adding a record slower, but significantly speeds up searches.
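One reason a search can be quicker is that, with the records in key order, a search for a key that is not in the file can stop as soon as a larger key is read. The sketch below illustrates this. The function name sequential_search and the read counter are assumptions made for this example.

```python
# Records in key order, as (centre_number, centre_name, candidates).
records = [
    (12304, "West Orling", 105),
    (27102, "Kettlewood", 85),
    (30446, "Wilton High", 156),
    (38240, "Queens Park", 103),
    (64715, "St Marys", 121),
]

def sequential_search(file, key):
    """Search from the beginning of a file held in key order.

    Returns (record, reads), where record is None if the key is absent
    and reads is the number of records that had to be read.
    """
    reads = 0
    for record in file:
        reads += 1
        if record[0] == key:
            return record, reads
        if record[0] > key:
            # Every later key is larger still, so the key cannot be present.
            break
    return None, reads

print(sequential_search(records, 30446))  # ((30446, 'Wilton High', 156), 3)
print(sequential_search(records, 20000))  # (None, 2)
```

In a serial file the search for 20000 would have had to read all five records before giving up.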
It should be noted that the very simple method where [disk address] = [primary key] is very
inefficient in its use of disk space. For example:
If the lowest primary key is 1001, then all the disk space below block 1001 will be wasted.
If there are some values which the primary key never takes (for example, odd values), these
storage spaces will be wasted.
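To get a feel for the scale of the waste, the sketch below applies [disk address] = [primary key] to the five Centre Numbers used in the earlier tables. It assumes that blocks are numbered from 0 and that each record occupies one block. Both are assumptions made for illustration.

```python
# Centre Numbers from the example tables, used directly as disk addresses.
keys = [27102, 38240, 64715, 30446, 12304]

# Blocks 0 .. max(keys) must all be reserved so that every key has a block.
blocks_reserved = max(keys) + 1
blocks_used = len(keys)
blocks_wasted = blocks_reserved - blocks_used

# Blocks below the lowest key can never hold a record.
wasted_below_lowest = min(keys)

print(blocks_reserved)      # 64716
print(blocks_wasted)        # 64711
print(wasted_below_lowest)  # 12304
```

Under these assumptions, five records would tie up more than sixty thousand blocks.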
In order to be more efficient with the use of disk space, random access files calculate disk addresses
by using a hashing algorithm (also known as just hashing).
Hashing
Hashing is a calculation that is performed on a primary key in order to calculate the storage address
of a record.
A hashing algorithm will typically divide the primary key by the number of disk blocks that are
available for storage, work out the remainder and add the start address. The answer will be the
storage address of the record.
[disk address] = [primary key] MOD [number of blocks] + [start address]
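The formula can be written as a short function. This is a minimal sketch: the name hash_address and the default start address of 0 are assumptions made for this example.

```python
def hash_address(primary_key, number_of_blocks, start_address=0):
    """Return the disk address for a record.

    Implements [primary key] MOD [number of blocks] + [start address].
    """
    return primary_key % number_of_blocks + start_address

# 5000 blocks starting at block 0.
print(hash_address(27102, 5000))       # 2102

# The same 5000 blocks starting at block 100 (a hypothetical start address).
print(hash_address(27102, 5000, 100))  # 2202
```

Whatever the value of the primary key, the result always falls between the start address and the start address plus the number of blocks minus one.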
Example
If a file was to be stored on the first 5000 blocks of a disk then:
[disk address] = [primary key] MOD 5000
That is, the primary key of each of the records would be divided by 5000 and the remainder would
be the disk address for the record.
This means that a record with primary key of 27102 would be stored at the disk address calculated
as follows:
27102 / 5000 = 5 remainder 2102
This means that the disk address for this record will be 2102.
The table shows some other disk addresses calculated using the same hashing algorithm:
Centre Number   Centre Name    No of Candidates   Disk Address
27102           Kettlewood     85                 2102
38240           Queens Park    103                3240
64715           St Marys       121                4715
30446           Wilton High    156                446
12304           West Orling    105                2304
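The disk addresses in the table can be checked with a few lines of Python. The sketch below assumes the file occupies blocks 0 to 4999, so the start address is 0. The variable names are made up for this example.

```python
NUMBER_OF_BLOCKS = 5000  # the file is stored on the first 5000 blocks

centres = [
    (27102, "Kettlewood", 85),
    (38240, "Queens Park", 103),
    (64715, "St Marys", 121),
    (30446, "Wilton High", 156),
    (12304, "West Orling", 105),
]

disk_addresses = {}
for centre_number, name, candidates in centres:
    # divmod gives the whole-number quotient and the remainder together.
    quotient, remainder = divmod(centre_number, NUMBER_OF_BLOCKS)
    disk_addresses[centre_number] = remainder
    print(f"{centre_number} / {NUMBER_OF_BLOCKS} = {quotient} remainder {remainder}")
```

The first line printed is "27102 / 5000 = 5 remainder 2102", which matches the worked example.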