
Operating System - Overview

Definition:

An operating system is a program that acts as an interface between the user and the computer
hardware and controls the execution of all kinds of programs.

An Operating System (OS) is an interface between a computer user and computer hardware. An
operating system is software that performs all the basic tasks such as file management, memory
management, process management, handling input and output, and controlling peripheral devices
such as disk drives and printers.

Some popular Operating Systems include Linux, Windows, OS X, VMS, OS/400, AIX, z/OS, etc.

Following are some of the important functions of an operating system:

1. Memory Management
2. Processor Management
3. Device Management
4. File Management
5. Security
6. Control over system performance
7. Job accounting
8. Error detecting aids
9. Coordination between other software and users

Memory Management
Memory management refers to the management of primary memory, also called main memory. Main
memory is a large array of words or bytes in which each word or byte has its own address.

Main memory provides fast storage that the CPU can access directly. For a
program to be executed, it must be in main memory. An operating system performs the
following activities for memory management:

Keeps track of primary memory, i.e., which parts of it are in use and by whom, and which parts
are not in use.

In multiprogramming, decides which process gets memory, when, and how much.

Allocates memory when a process requests it.

De-allocates memory when a process no longer needs it or has terminated.
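The four activities above (tracking, deciding, allocating, de-allocating) can be sketched with a toy model. This is only an illustration under stated assumptions: memory is a range of numbered units, holes are chosen with a first-fit policy, and the class name FirstFitMemory and the process names are hypothetical. A real operating system uses far more elaborate schemes.

```python
class FirstFitMemory:
    """Toy model of main memory managed with a first-fit free list."""

    def __init__(self, size):
        self.free = [(0, size)]   # (start, length) holes, kept sorted by start
        self.owned = {}           # process name -> (start, length) it holds

    def allocate(self, pid, length):
        """Give `pid` the first hole that is large enough; return its start address."""
        for i, (start, hole) in enumerate(self.free):
            if hole >= length:
                if hole == length:
                    del self.free[i]                      # hole used up entirely
                else:
                    self.free[i] = (start + length, hole - length)
                self.owned[pid] = (start, length)
                return start
        return None                                       # no hole is big enough

    def release(self, pid):
        """Return the memory held by `pid` and merge neighbouring holes."""
        self.free.append(self.owned.pop(pid))
        self.free.sort()
        merged = [self.free[0]]
        for start, length in self.free[1:]:
            last_start, last_length = merged[-1]
            if last_start + last_length == start:         # adjacent: coalesce
                merged[-1] = (last_start, last_length + length)
            else:
                merged.append((start, length))
        self.free = merged


mem = FirstFitMemory(100)
a = mem.allocate("A", 30)   # A gets units 0..29
b = mem.allocate("B", 50)   # B gets units 30..79
mem.release("A")            # units 0..29 become a hole again
c = mem.allocate("C", 20)   # C reuses the start of that hole
```

The `owned` table is the "who uses what" bookkeeping, and the `free` list is the record of which parts are not in use.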

Processor Management
In a multiprogramming environment, the OS decides which process gets the processor, when,
and for how long. This function is called process scheduling. An operating system
performs the following activities for processor management:

Keeps track of the processor and the status of each process. The program responsible for this
task is known as the traffic controller.

Allocates the processor (CPU) to a process.

De-allocates the processor when a process no longer requires it.

Device Management
An operating system manages device communication through the devices' respective drivers. It
performs the following activities for device management:

Keeps track of all devices. The program responsible for this task is known as the I/O controller.

Decides which process gets the device, when, and for how long.

Allocates devices in an efficient way.

De-allocates devices.

File Management
A file system is normally organized into directories for easy navigation and usage. These
directories may contain files and other directories.

An operating system performs the following activities for file management:

Keeps track of information, location, uses, status, etc. The collective facilities are often
known as the file system.

Decides who gets the resources.

Allocates the resources.

De-allocates the resources.


Other Important Activities
Following are some of the other important activities that an operating system performs:

Security: By means of passwords and similar techniques, it prevents unauthorized
access to programs and data.

Control over system performance: Recording delays between a request for a service and
the response from the system.

Job accounting: Keeping track of the time and resources used by various jobs and users.

Error detecting aids: Production of dumps, traces, error messages, and other debugging
and error-detecting aids.

Coordination between other software and users: Coordination and assignment of
compilers, interpreters, assemblers and other software to the various users of the computer
system.

History of Operating Systems

The earliest computers were mainframes that lacked any form of operating system. Each user had
sole use of the machine for a scheduled period of time and would arrive at the computer with
program and data, often on punched paper cards and magnetic or paper tape. The program would
be loaded into the machine, and the machine would be set to work until the program completed or
crashed. Programs could generally be debugged via a control panel using dials, toggle switches
and panel lights. Symbolic languages, assemblers, and compilers were developed for
programmers to translate symbolic program-code into machine code that previously would have
been hand-encoded. Later machines came with libraries of support code on punched cards or
magnetic tape, which would be linked to the user's program to assist in operations such as input
and output. This was the genesis of the modern-day operating system; however, machines still ran
a single job at a time. At Cambridge University in England the job queue was at one time a
washing line from which tapes were hung with different colored clothes-pegs to indicate job-priority.

As machines became more powerful the time to run programs diminished, and the time to hand off
the equipment to the next user became large by comparison. Accounting for and paying for
machine usage moved on from checking the wall clock to automatic logging by the computer. Run
queues evolved from a literal queue of people at the door, to a heap of media on a jobs-waiting
table, or batches of punch-cards stacked one on top of the other in the reader, until the machine
itself was able to select and sequence which magnetic tape drives processed which tapes. The first
operating system used for real work was GM-NAA I/O, produced in 1956 by General Motors'
Research division[2] for its IBM 704.[3] Most other early operating systems for IBM mainframes
were also produced by customers. Early operating systems were very diverse, with each vendor or
customer producing one or more operating systems specific to their particular mainframe
computer. Every operating system, even from the same vendor, could have radically different
models of commands, operating procedures, and such facilities as debugging aids. Typically, each
time the manufacturer brought out a new machine, there would be a new operating system, and
most applications would have to be manually adjusted, recompiled, and retested.

Here are some notable operating systems in the evolution of computers.


1956, GM-NAA I/O: Developed by Robert L. Patrick of General Motors for use on their IBM 704
mainframe. This early OS was primarily designed to automatically switch to the next job once its
current job was completed. It was used on about forty IBM 704 mainframes.

1961, MCP (Master Control Program): Developed by Burroughs Corporation for their B5000
mainframe. MCP is still in use today on Unisys ClearPath/MCP machines.

1966, DOS/360: After years of being strictly in the hardware business, IBM ventured into
operating systems. IBM developed a few unsuccessful mainframe operating systems until it finally
released DOS/360 and its successors, which put IBM in the driver's seat for both the hardware and
OS industries.

1969, Unix: Developed by AT&T Bell Labs programmers Ken Thompson, Dennis Ritchie, Douglas
McIlroy, and Joe Ossanna. It gained widespread acceptance first within the large AT&T company,
and later by colleges and universities. It was rewritten in C in 1973, which allowed for easier
modification, acceptance, and portability.

1973, CP/M (Control Program/Monitor, later re-purposed as Control Program for
Microcomputers): Developed by Gary Kildall as a side project for his company Digital Research.
CP/M became a popular OS in the 1970s. It had many applications developed for it, including
WordStar and dBASE. It was ported to a variety of hardware environments. In fact, IBM originally
wanted CP/M for its new Personal Computers, but later selected MS-DOS when a deal could not
be reached.

1977, BSD (Berkeley Software Distribution): Developed by the University of California, Berkeley.
BSD is a Unix variant based on early versions of Unix from Bell Labs.

1981, MS-DOS: Developed by Microsoft for the IBM PC. It was the first widely available
operating system for home users. In 1985, Microsoft released Microsoft Windows, which
popularized the operating system even more. Microsoft Windows gave users a graphical user
interface (GUI), which rapidly spread Microsoft's product.

1982, SunOS: Developed by Sun Microsystems, SunOS was based on BSD. It was a very popular
Unix variant.

1984, Mac OS: Developed by Apple Computer, Inc for their new product, the Macintosh home PC.
The Macintosh was widely advertised, notably with the famous 1984 commercial. Mac OS
was the first OS with a GUI built in. This led to a very stable OS, as well as wide acceptance due
to its ease of use.

1987, OS/2: Developed by a joint venture of IBM and Microsoft. Though the OS was heavily
marketed, it did not pick up in popularity.

1991, Linux: Developed by Linus Torvalds as a free Unix variant. Linux today is an open source
project with a very large number of contributors, and it plays a very prominent role in today's
server industry.

1992, Sun Solaris: Developed by Sun Microsystems, Solaris is a widely used Unix variant,
partially based on Sun's SunOS.

1993, Windows NT: Developed by Microsoft as a high-end server operating system, the NT code
became the basis for Microsoft's operating systems to this day. NT was primarily used on computers
acting as servers, to counter the Unix dominance in that arena.

1995, Windows 95: Developed by Microsoft, it was the first Microsoft operating system to have a
graphical user interface built into it. It was tremendously (and successfully) marketed and quickly
swept across the country and the globe. One of Microsoft's popular commercials featured the
Rolling Stones' "Start Me Up", drawing attention to Microsoft's Start button, which to this day
is a dominant feature of their operating systems.

1997, JavaOS: Developed by Sun Microsystems, JavaOS was developed primarily using the Java
programming language. The OS was created to be installed on any device, including PCs.

1998, Windows 98: Developed by Microsoft, Windows 98 was the next iteration of the Microsoft
Windows 95 operating system.

1999, MacOS X Server 1.0: Developed by Apple Computer, Inc., MacOS X Server 1.0 was a
precursor to Apple's MacOS X desktop version, which replaced it in 2001. MacOS X Server 1.0
was developed for Apple's popular Macintosh PC.

2000, Windows 2000: Developed by Microsoft, Windows 2000 was a much improved operating
system over Windows 98. It was developed from a dramatically different code base. It was
targeted at business-oriented uses.

2000, Windows ME: Developed by Microsoft, Windows ME (also called Windows Millennium) was a
rather unsuccessful new version of Windows 98 and had a short shelf life. It was released just
seven months after Windows 2000 and just a year before Windows XP.

2001, MacOS X Version 10.0: Developed by Apple Computer, Inc., MacOS X Version 10.0
dramatically changed the user interface for Apple's Macintosh users.

2001, Windows XP: Developed by Microsoft, Windows XP was an enhanced version of Windows
2000 code base. XP became widely popular and is used extensively today, despite the release of
newer versions of Windows.

2003, Windows Server 2003: Developed by Microsoft as an improved version of their NT OS.

2007, Windows Vista: Developed by Microsoft, Windows Vista was slow to take off.

2008, Windows Server 2008: Developed by Microsoft as an upgrade to Windows Server 2003.

2009, Windows 7: Developed by Microsoft to replace Vista, Windows 7 is currently used by over 50%
of internet users.

2012, Windows 8: Developed by Microsoft to replace Windows 7, Windows 8 was released on October
26, 2012.

2015, Windows 10: Developed by Microsoft to replace Windows 8 and Windows 8.1.

Assembler
The input to an assembler is an assembly language program. The output is an object
program plus information that enables the loader to prepare the object program for
execution. At one time, the computer programmer had at his disposal only a basic
machine that interpreted, through hardware, certain fundamental instructions. He
would program this computer by writing a series of ones and zeros (machine
language) and placing them into the memory of the machine.
Compiler
High-level languages, such as FORTRAN, COBOL, ALGOL and PL/I,
are processed by compilers and interpreters. A compiler is a program that
accepts a source program in a high-level language and produces a corresponding
object program. An interpreter is a program that appears to execute a source
program as if it were machine language. The same name (FORTRAN, COBOL, etc.)
is often used to designate both a compiler and its associated language.
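The difference between the two can be illustrated with a very small sketch. It assumes a tiny source language of integer constants combined with + and *, and borrows Python's standard ast module only to parse the text. The stack-machine instruction names (PUSH, ADD, MUL) are invented for this example and do not correspond to any real machine.

```python
import ast

def interpret(node):
    """Interpreter: walk the parsed expression and compute its value directly."""
    if isinstance(node, ast.Expression):
        return interpret(node.body)
    if isinstance(node, ast.Constant):
        return node.value
    if isinstance(node, ast.BinOp):
        left, right = interpret(node.left), interpret(node.right)
        if isinstance(node.op, ast.Add):
            return left + right
        if isinstance(node.op, ast.Mult):
            return left * right
    raise ValueError("unsupported expression")

def compile_expr(node, code=None):
    """Compiler: translate the expression into a toy 'object program'."""
    code = [] if code is None else code
    if isinstance(node, ast.Expression):
        compile_expr(node.body, code)
    elif isinstance(node, ast.Constant):
        code.append(("PUSH", node.value))
    elif isinstance(node, ast.BinOp) and isinstance(node.op, (ast.Add, ast.Mult)):
        compile_expr(node.left, code)     # operands first, operator last
        compile_expr(node.right, code)
        code.append(("ADD",) if isinstance(node.op, ast.Add) else ("MUL",))
    else:
        raise ValueError("unsupported expression")
    return code

def run(code):
    """Execute the object program on a simple stack machine."""
    stack = []
    for instr in code:
        if instr[0] == "PUSH":
            stack.append(instr[1])
        else:
            right, left = stack.pop(), stack.pop()
            stack.append(left + right if instr[0] == "ADD" else left * right)
    return stack.pop()

tree = ast.parse("2 + 3 * 4", mode="eval")   # the source program
object_program = compile_expr(tree)          # translate once, run later
```

The compiler produces a separate artifact (`object_program`) that can be run later with `run`, whereas `interpret(tree)` yields the result without producing any translated program.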
Loader
A loader is a routine that loads an object program and prepares it for execution.
There are various loading schemes: absolute, relocating and direct-linking. In
general, the loader must load, relocate, and link the object program. In a
simple loading scheme, the assembler outputs the machine language translation of
a program on a secondary device and a loader is placed in core. The loader places
the machine language version of the user's program into memory and transfers
control to it. Since the loader program is much smaller than the assembler, this leaves more
space available to the user's program.
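A relocating loader can be sketched as follows. The object-program layout used here (a list of words, a relocation table listing which words hold addresses, and an entry offset) is a simplified assumption made for illustration and is not a real object-file format.

```python
def load(memory, object_program, base):
    """Copy an object program into `memory` at `base`, relocating address words."""
    relocate = set(object_program["relocation"])   # offsets of words that hold addresses
    for offset, word in enumerate(object_program["words"]):
        # Address words were assembled relative to 0, so shift them by the load address.
        memory[base + offset] = word + base if offset in relocate else word
    return base + object_program["entry"]          # address to transfer control to

# Hypothetical object program, assembled as if it started at address 0.
program = {
    "words": [7, 3, 42, 0],    # words 1 and 3 hold addresses inside the program
    "relocation": [1, 3],
    "entry": 0,
}
memory = [0] * 16
start = load(memory, program, base=8)
```

Loading the same program at a different `base` only changes the relocated words, which is what lets the loader place programs wherever memory is free.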

Linker
In computer science, a linker is a computer program that takes one or more object files generated
by a compiler and combines them into a single executable program.
Computer programs are usually made up of multiple modules that span separate object files, each
being a compiled part of the program. The program as a whole refers to these separately compiled
object files using symbols. The linker combines these separate files into a single, unified program,
resolving the symbolic references as it goes along.
Dynamic linking is a similar process, available on many operating systems, which postpones the
resolution of some symbols until the program is executed. When the program is run, the required
dynamic link libraries are loaded as well. These symbols are resolved at run time by a dynamic
linker (loader) rather than by the build-time linker.
The linker bundled with most Linux systems is called ld.
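The idea of combining modules and resolving symbols can be sketched with a toy two-pass linker. The module layout below (code words, a table of defined symbols, and a table of references to patch) is an assumption made for illustration. Real object formats and the ld program are far more involved.

```python
def link(modules):
    """Combine object modules into one image and resolve symbolic references."""
    symbols, bases, offset = {}, [], 0
    # Pass 1: lay the modules out one after another and record where each symbol lands.
    for mod in modules:
        bases.append(offset)
        for name, local in mod["defines"].items():
            if name in symbols:
                raise ValueError(f"duplicate symbol: {name}")
            symbols[name] = offset + local
        offset += len(mod["code"])
    # Pass 2: copy the code, patching every reference with the resolved address.
    image = []
    for mod in modules:
        code = list(mod["code"])
        for local, name in mod["refs"].items():
            if name not in symbols:
                raise NameError(f"undefined symbol: {name}")
            code[local] = symbols[name]
        image.extend(code)
    return image, symbols

# Two hypothetical object modules: main calls helper, which lives in another module.
main_o = {"code": ["CALL", None, "HALT"], "defines": {"main": 0}, "refs": {1: "helper"}}
util_o = {"code": ["NOP", "RET"], "defines": {"helper": 0}, "refs": {}}
image, symbols = link([main_o, util_o])
```

After linking, the placeholder in `main_o` has been replaced by the final address of `helper` in the combined image.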

I/O System Management
The module that keeps track of the status of devices is called the I/O traffic
controller. Each I/O device has a device handler that resides in a separate
process associated with that device.
The I/O subsystem consists of:
1. A memory management component that includes buffering, caching and spooling.
2. A general device driver interface.
3. Drivers for specific hardware devices.

Operating Systems Services

Following are the five services provided by an operating system for the convenience of its users.

Program Execution

The purpose of a computer system is to allow the user to execute programs, so the operating
system provides an environment in which the user can conveniently run them. The user does
not have to worry about memory allocation, multitasking, or similar details. These are taken
care of by the operating system.

Running a program involves allocating and deallocating memory and, when there are multiple
processes, CPU scheduling. These functions cannot be delegated to user-level programs, so
user-level programs cannot run other programs independently, without help from the operating system.

I/O Operations

Each program requires input and produces output, which involves the use of I/O. The operating
system hides the details of the underlying I/O hardware from the user. All the user sees is that the
I/O has been performed. By providing I/O services, the operating system makes it
convenient for users to run programs.

For efficiency and protection, users cannot control I/O directly, so this service cannot be
provided by user-level programs.

File System Manipulation

The output of a program may need to be written to new files, or its input taken from existing files.
The operating system provides this service. The user does not have to worry about secondary storage
management. The user gives a command to read or write a file and sees the task
accomplished. Thus the operating system makes it easier for user programs to accomplish their
tasks.

This service involves secondary storage management. The speed of I/O that depends on
secondary storage management is critical to the speed of many programs, so it is
best left to the operating system rather than giving individual users control of it. It
is not difficult for user-level programs to provide these services, but for the reasons mentioned
above it is best if this service is left with the operating system.

Communications

There are instances where processes need to communicate with each other to exchange
information. The processes may be running on the same computer or on
different computers. By providing this service, the operating system relieves the user of the worry
of passing messages between processes. Where messages need to be passed to
processes on other computers through a network, this can be done by user programs. The
user program may be customized to the specifics of the hardware through which the message
transits and provides the service interface to the operating system.

Error Detection

An error in one part of the system may cause the complete system to malfunction. To avoid
such a situation, the operating system constantly monitors the system to detect errors. This
relieves the user of the worry of errors propagating to various parts of the system and causing
malfunctions.

This service cannot be handled by user programs because it involves monitoring and, in some
cases, altering areas of memory, deallocating the memory of a faulty process, or
taking the CPU away from a process that goes into an infinite loop. These tasks are too critical
to be handed over to user programs. A user program given these privileges could interfere with the
correct (normal) operation of the operating system.

Operating System Components
Even though not all systems have the same structure, many modern operating systems share the
same goal of supporting the following types of system components.

Process Management

The operating system manages many kinds of activities, ranging from user programs to system
programs such as printer spoolers, name servers, and file servers. Each of these activities is
encapsulated in a process. A process includes the complete execution context (code, data, PC,
registers, OS resources in use, etc.).

It is important to note that a process is not a program. A process is only ONE instance of a program
in execution. Many processes can be running the same program. The five major
activities of an operating system in regard to process management are:

Creation and deletion of user and system processes.
Suspension and resumption of processes.
A mechanism for process synchronization.
A mechanism for process communication.
A mechanism for deadlock handling.

Main-Memory Management

Primary memory, or main memory, is a large array of words or bytes. Each word or byte has its own
address. Main memory provides storage that can be accessed directly by the CPU. That is to say, for
a program to be executed, it must be in main memory.

The major activities of an operating system in regard to memory management are:

Keep track of which parts of memory are currently being used and by whom.
Decide which processes are loaded into memory when memory space becomes available.
Allocate and deallocate memory space as needed.

File Management

A file is a collection of related information defined by its creator. A computer can store files on
disk (secondary storage), which provides long-term storage. Some examples of storage media are
magnetic tape, magnetic disk and optical disk. Each of these media has its own properties, such as
speed, capacity, data transfer rate and access methods.

A file system is normally organized into directories to ease its use. These directories may contain
files and other directories.

The five major activities of an operating system in regard to file management are:
The creation and deletion of files.
The creation and deletion of directories.
The support of primitives for manipulating files and directories.
The mapping of files onto secondary storage.
The backup of files on stable storage media.
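From a program's point of view, the first three activities appear as a handful of primitives. The sketch below exercises them through Python's standard os and tempfile modules inside a throwaway directory. The directory and file names ("reports", "notes.txt") are arbitrary examples.

```python
import os
import tempfile

def file_primitives_demo():
    """Create, write, read, list, and delete a file and a directory."""
    with tempfile.TemporaryDirectory() as root:
        subdir = os.path.join(root, "reports")
        os.mkdir(subdir)                                # create a directory
        path = os.path.join(subdir, "notes.txt")
        with open(path, "w", encoding="utf-8") as f:    # create a file and write to it
            f.write("hello, file system\n")
        with open(path, encoding="utf-8") as f:         # read the file back
            text = f.read()
        listing = os.listdir(subdir)                    # inspect the directory
        os.remove(path)                                 # delete the file
        os.rmdir(subdir)                                # delete the now-empty directory
        return text, listing, os.path.exists(subdir)

text, listing, still_exists = file_primitives_demo()
```

The program never deals with disk blocks or device details. Mapping the file onto secondary storage is left entirely to the operating system.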

I/O System Management

The I/O subsystem hides the peculiarities of specific hardware devices from the user. Only the
device driver knows the peculiarities of the specific device to which it is assigned.

Secondary-Storage Management

Generally speaking, systems have several levels of storage, including primary storage, secondary
storage and cache storage. Instructions and data must be placed in primary storage or cache to be
referenced by a running program. Because main memory is too small to accommodate all data and
programs, and its data are lost when power is lost, the computer system must provide secondary
storage to back up main memory. Secondary storage consists of tapes, disks, and other media
designed to hold information that will eventually be accessed in primary storage. Storage (primary,
secondary, cache) is ordinarily divided into bytes or words consisting of a fixed number of bytes.
Each location in storage has an address; the set of all addresses available to a program is called
an address space.

The three major activities of an operating system in regard to secondary storage
management are:

Managing the free space available on the secondary-storage device.
Allocation of storage space when new files have to be written.
Scheduling the requests for disk access.
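To see why the scheduling of storage requests matters, the sketch below compares two textbook disk-scheduling policies, first-come-first-served (FCFS) and shortest-seek-time-first (SSTF), by the total distance the disk head travels. These two policies are chosen here for illustration and are not named in the text above. The request queue and starting head position are made-up sample values.

```python
def fcfs(head, requests):
    """Total head movement when requests are served in arrival order."""
    total = 0
    for cylinder in requests:
        total += abs(cylinder - head)
        head = cylinder
    return total

def sstf(head, requests):
    """Total head movement when the closest pending request is always served next."""
    pending = list(requests)
    total = 0
    while pending:
        nearest = min(pending, key=lambda c: abs(c - head))
        total += abs(nearest - head)
        head = nearest
        pending.remove(nearest)
    return total

queue = [98, 183, 37, 122, 14, 124, 65, 67]   # sample cylinder numbers (assumed)
fcfs_total = fcfs(53, queue)                  # 640 cylinders of head movement
sstf_total = sstf(53, queue)                  # 236 cylinders of head movement
```

For this queue, serving the nearest request first cuts head movement to roughly a third, at the cost of possibly delaying requests that are far from the head.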

Networking

A distributed system is a collection of processors that do not share memory, peripheral devices,
or a clock. The processors communicate with one another through communication lines, collectively
called a network. The communication-network design must consider routing and connection
strategies, and the problems of contention and security.

Protection System

If a computer system has multiple users and allows the concurrent execution of multiple
processes, then the various processes must be protected from one another's activities. Protection
refers to a mechanism for controlling the access of programs, processes, or users to the resources
defined by a computer system.

Command Interpreter System

A command interpreter is the interface between the operating system and the user. The user gives
commands, which are executed by the operating system (usually by turning them into system calls).
The main function of a command interpreter is to get and execute the next user-specified command.
The command interpreter is usually not part of the kernel, since multiple command interpreters
(shells, in UNIX terminology) may be supported by an operating system, and they do not really need
to run in kernel mode. There are two main advantages to separating the command interpreter from the
kernel.

If we want to change the way the command interpreter looks, i.e., change its interface, we can do
so only if the command interpreter is separate from the kernel. Users normally
cannot change the code of the kernel, so they could not otherwise modify the interface.

If the command interpreter is part of the kernel, it is possible for a malicious process to gain
access to parts of the kernel that it should not have. To avoid this scenario, it is
advantageous to keep the command interpreter separate from the kernel.
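The "get and execute the next command" loop can be sketched as an ordinary user-level program, which is the point of keeping it outside the kernel. The three built-in commands and the in-memory "current directory" below are hypothetical stand-ins. A real shell would issue system calls to start programs.

```python
import shlex

def make_shell():
    """Build a tiny command interpreter with a table of built-in commands."""
    state = {"cwd": "/"}   # toy shell state, not the real working directory

    def cmd_echo(args):
        return " ".join(args)

    def cmd_cd(args):
        state["cwd"] = args[0] if args else "/"
        return ""

    def cmd_pwd(args):
        return state["cwd"]

    commands = {"echo": cmd_echo, "cd": cmd_cd, "pwd": cmd_pwd}

    def execute(line):
        """Parse one command line and dispatch it to its handler."""
        words = shlex.split(line)          # split like a shell, honouring quotes
        if not words:
            return ""
        name, args = words[0], words[1:]
        handler = commands.get(name)
        if handler is None:
            return f"{name}: command not found"
        return handler(args)

    return execute

shell = make_shell()
outputs = [shell(line) for line in ['echo hello "big world"', "cd /tmp", "pwd", "frobnicate"]]
```

Replacing the `commands` table or the parsing rules changes the interface without touching anything that would live in the kernel.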

Time Sharing Operating System


A time sharing system allows many users to share the computer resources simultaneously. In
other words, time sharing refers to the allocation of computer resources in time slots to several
programs simultaneously. An example is a mainframe computer that has many users logged on to it.
Each user uses the resources of the mainframe, i.e., memory, CPU, etc. Each user feels like
the exclusive user of the CPU, even though this is not possible with one CPU that is shared among
different users.

Time sharing systems were developed to provide interactive use of the computer system. A
time-shared system uses CPU scheduling and multiprogramming to provide each user with a small
portion of a time-shared computer. As the system switches rapidly from one user to the next, a
short time slot is given to each user for execution.

A time sharing system gives a large number of users direct access, with CPU time
divided among all the users on a scheduled basis. The OS allocates a slot of time to each user.
When this time expires, it passes control to the next user on the system. The time allowed is
extremely small, so the users are given the impression that they each have their own CPU and
are its sole owner. The short period of time during which a user gets the attention of
the CPU is known as a time slice or a quantum. The concept of a time sharing system is shown in
the figure.

In the above figure, user 5 is active, user 1, user 2, user 3, and user 4 are in the waiting state,
and user 6 is in the ready state.

As soon as the time slice of user 5 is completed, control moves on to the next ready user, i.e.,
user 6. In this state user 2, user 3, user 4, and user 5 are in the waiting state and user 1 is in
the ready state. The process continues in the same way.
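The rotation of the time slice among users can be sketched as a round-robin simulation. The user names, the amounts of CPU time each one needs, and the quantum of 2 time units are made-up values. The sketch also ignores I/O waits and the cost of switching between users.

```python
from collections import deque

def round_robin(burst_times, quantum):
    """Return the sequence of (user, time run) slices and each user's finish time."""
    ready = deque(burst_times.items())     # (user, remaining time), in arrival order
    clock, timeline, finished = 0, [], {}
    while ready:
        user, remaining = ready.popleft()
        run = min(quantum, remaining)      # run for one time slice at most
        clock += run
        timeline.append((user, run))
        if remaining > run:
            ready.append((user, remaining - run))   # back to the end of the queue
        else:
            finished[user] = clock
    return timeline, finished

timeline, finished = round_robin({"user1": 5, "user2": 3, "user3": 1}, quantum=2)
```

A smaller quantum makes the system feel more responsive to each user but means the CPU is handed over more often.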

Time-shared systems are more complex than multiprogramming systems. In time-shared
systems, multiple processes are managed simultaneously, which requires adequate
management of main memory so that processes can be swapped in or out within a
short time.

Note: The term 'Time Sharing' is no longer commonly used; it has been replaced by 'Multitasking
System'.

Multiprogramming Operating System


Multiprogramming was introduced to overcome the problem of underutilization of the CPU and main
memory. Multiprogramming is the interleaved execution of multiple jobs by the same computer.

In a multiprogramming system, when one program is waiting for an I/O transfer, another
program is ready to use the CPU, so several jobs can share the CPU's time.
It is important to note that multiprogramming does not mean executing jobs at the
same instant. Rather, it means that a number of jobs are available to the CPU
(placed in main memory), and a portion of one is executed, then a segment of another, and so on. A
simple multiprogramming scenario is shown in the figure.

As shown in the figure, in this particular situation job 'A' is not using CPU time because it is
busy with I/O operations. Hence the CPU is busy executing job 'B'. Another job, 'C', is waiting for
the CPU to get its execution time. So in this state the CPU is never idle and its time is used
to the maximum.

A program in execution is called a "process", "job" or "task". The concurrent execution of
programs improves the utilization of system resources and enhances system throughput
compared to batch and serial processing. In this system, when a process requests some I/O,
the CPU time is meanwhile assigned to another ready process. So when a process
is switched to an I/O operation, the CPU is not left idle.
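The gain in CPU utilization can be made concrete with a small tick-by-tick simulation. It rests on simplifying assumptions: each job is a made-up sequence of CPU and I/O bursts, I/O for different jobs can overlap freely (as if each job had its own device), and switching jobs costs nothing.

```python
def simulate(jobs, multiprogrammed):
    """Return (total_time, cpu_busy_time) for jobs given as lists of ('cpu'|'io', ticks)."""
    # Expand every burst into single ticks, e.g. ('cpu', 2) -> ['cpu', 'cpu'].
    pending = [[kind for kind, n in job for _ in range(n)] for job in jobs]
    clock = busy = 0
    while any(pending):
        active = [p for p in pending if p]
        if not multiprogrammed:
            active = active[:1]            # serial mode: only the first unfinished job runs
        cpu_taken = False
        for p in active:
            if p[0] == "io":
                p.pop(0)                   # I/O proceeds without the CPU
            elif not cpu_taken:
                p.pop(0)                   # at most one job uses the CPU per tick
                cpu_taken = True
        busy += 1 if cpu_taken else 0
        clock += 1
    return clock, busy

job_a = [("cpu", 2), ("io", 4), ("cpu", 2)]
job_b = [("cpu", 3), ("io", 2), ("cpu", 1)]
serial = simulate([job_a, job_b], multiprogrammed=False)
overlapped = simulate([job_a, job_b], multiprogrammed=True)
```

Both runs perform the same 8 ticks of CPU work, but the serial run takes 14 ticks in total while the multiprogrammed run takes 9, because job B computes while job A waits for I/O.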

Multiprogramming is a common approach to resource management. The essential components of
a single-user operating system include a command processor, an input/output control system, a
file system, and a transient area. A multiprogramming operating system builds on this base,
subdividing the transient area to hold several independent programs and adding resource
management routines to the operating system's basic functions.

Multitasking Operating System


Multitasking is a logical extension of multiprogramming that allows multiple programs to
run concurrently. In multitasking, more than one task is executed at the same time. The
multiple tasks, also known as processes, share common processing resources such
as a CPU. On a computer with a single CPU, only one job can be processed at a time.
Multitasking solves this problem by scheduling: deciding which task should be the running task
and when a waiting task should get its turn. This is done with the help of interrupts (signals).
On an interrupt, the CPU sets the current activity aside, saves its state, attends to the
more important job, and later returns to the task it was doing earlier. The act of re-assigning a
CPU from one task to another is known as a context switch.
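The save-and-resume nature of a context switch can be sketched with Python generators, where each suspended generator keeps its own local state. Note the assumption: this is cooperative switching, in which each task gives up the CPU voluntarily with yield, whereas a real operating system forces the switch through interrupts. The task names and step counts are arbitrary.

```python
from collections import deque

def task(name, steps):
    """A task that does `steps` units of work, giving up the CPU after each one."""
    for i in range(1, steps + 1):
        yield f"{name}:{i}"    # the suspended generator remembers `i`

def run_tasks(tasks):
    """Switch between tasks in turn, resuming each one where it left off."""
    ready = deque(tasks)
    trace = []
    while ready:
        current = ready.popleft()
        try:
            trace.append(next(current))   # restore the task's context and run one step
            ready.append(current)         # its context is kept; requeue it
        except StopIteration:
            pass                          # the task has finished and is dropped
    return trace

trace = run_tasks([task("A", 3), task("B", 1), task("C", 2)])
```

Each task continues from its own saved position even though the steps of the three tasks are interleaved.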

Multitasking systems were developed to provide interactive use of a computer system. Such a
system uses CPU scheduling and multiprogramming to provide each user with a small portion
of a time-shared computer. Thus multitasking makes the best possible use of the available hardware
at any given instant and improves the overall efficiency of the computer system. A multitasking
operating system is characterized by its capability to support the concurrent execution of more than
one task. This is achieved by managing several processes in main memory
at the same time and by sharing I/O resources among the active tasks. The multitasking OS
monitors the state of all the tasks and of the system resources.

Multitasking provides the fundamental mechanism for an application to control and react to
multiple, discrete real-world events and is therefore essential for many real-time applications. A
multitasking environment allows applications to be constructed as a set of independent tasks, each
with a separate thread of execution and its own set of system resources. The inter-task
communication facilities allow these tasks to synchronize and coordinate their activity. Multitasking
creates the appearance of many threads of execution running concurrently when, in fact, the kernel
interleaves their execution on the basis of a scheduling algorithm. This also leads to efficient
utilization of CPU time and is essential for many embedded applications where processors are
limited in computing speed due to cost, power, silicon area and other constraints. In a multitasking
operating system, it is assumed that the various tasks cooperate to serve the requirements
of the overall system. Cooperation requires that the tasks communicate with each other and
share common data in an orderly and disciplined manner, without creating contention and
deadlocks.

Multiprocessing Operating System


Multiprocessing refers to the use of two or more central processing units (CPUs)
within a single computer system. These multiple CPUs are in close communication, sharing the
computer bus, memory and other peripheral devices. Such systems are referred to as tightly
coupled systems.

These types of systems are used when very high speed is required to process a large volume of
data. They are generally used in environments such as satellite control and weather forecasting.
The basic organization of a multiprocessing system is shown in the figure.

Many multiprocessing systems are based on the symmetric multiprocessing model, in which each
processor runs an identical copy of the operating system and these copies communicate with each
other. In the alternative, asymmetric model, each processor is assigned a specific task and a master
processor controls the system. This scheme defines a master-slave relationship. Multiprocessing
systems can save money compared to multiple single-processor systems because the processors can
share peripherals, power supplies and other devices. The main advantage of a multiprocessor system
is that more work gets done in a shorter period of time. Moreover, multiprocessor systems prove
more reliable when one processor fails. In this situation, the failure will not halt the system; it
will only slow it down.

In order to employ a multiprocessing operating system effectively, the computer system must have
the following:

1. Motherboard support: A motherboard capable of handling multiple processors. This means
additional sockets or slots for the extra chips and a chipset capable of handling the
multiprocessing arrangement.

2. Processor support: Processors that are capable of being used in a multiprocessing system.

The whole task of multiprocessing is managed by the operating system, which allocates different
tasks to be performed by the various processors in the system.

Applications designed for use in multiprocessing are said to be threaded, which means that
they are broken into smaller routines that can be run independently. This allows the operating
system to run these threads on more than one processor simultaneously, which results in
improved performance.

A multiprocessor system allows processes to run in parallel. Parallel processing is the ability of
the system to work on several jobs at the same time by dividing them among its CPUs. Parallel
processing is generally used in fields like artificial intelligence and expert systems, image
processing, weather forecasting etc.

In a multiprocessor system, the dynamic sharing of resources among the various processors
can become a potential bottleneck. There are three main sources of contention that can
be found in a multiprocessor operating system:

Locking system: In order to provide safe access to the resources shared among multiple
processors, they need to be protected by a locking scheme. The purpose of a lock is to serialize
accesses to the protected resource by multiple processors. Undisciplined use of locking can
severely degrade the performance of the system. This form of contention can be reduced by using
fine-grained locking schemes, avoiding long critical sections, replacing locks with lock-free
algorithms, or, whenever possible, avoiding sharing altogether.
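
As a rough illustration of serializing access with a lock while keeping the critical section short, the following Python sketch lets several threads update one shared counter. The names counter, counter_lock, add_many and run_workers are made up for this example. Note also that threads in standard CPython do not run Python code on several processors at once, so this only shows the locking discipline, not multiprocessor performance.

```python
import threading

counter = 0                       # shared resource
counter_lock = threading.Lock()   # protects every access to counter

def add_many(n):
    """Increment the shared counter n times, holding the lock only briefly."""
    global counter
    for _ in range(n):
        with counter_lock:        # short critical section: one update
            counter += 1

def run_workers(num_threads=4, n=10_000):
    """Start num_threads workers and return the final counter value."""
    global counter
    counter = 0
    threads = [threading.Thread(target=add_many, args=(n,))
               for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter

if __name__ == "__main__":
    print(run_workers())
```

Because every update happens while the lock is held, no increment is lost, whatever order the threads run in.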

Shared data: Continuous accesses to shared data items by multiple processors (with one
or more of them writing the data) are serialized by the cache coherence protocol. Even in a
moderate-scale system, serialization delays can have a significant impact on system
performance. In addition, bursts of cache coherence traffic can saturate the memory bus or the
interconnection network, which also slows down the entire system. This form of contention can be
eliminated by either avoiding sharing or, when this is not possible, by using replication techniques
to reduce the rate of write accesses to the shared data.

False sharing: This form of contention arises when unrelated data items used by different
processors are located next to each other in memory and, therefore, share a single cache line.
The effect of false sharing is the same as that of regular sharing: bouncing of the cache line among
several processors. Fortunately, once it is identified, false sharing can be easily eliminated by
adjusting the memory layout of the non-shared data so that the items fall in different cache lines.

Apart from eliminating bottlenecks in the system, a multiprocessor operating system developer
should provide support for efficiently running user applications on the multiprocessor. Some of the
aspects of such support include mechanisms for task placement and migration across processors,
physical memory placement ensuring that most of the memory pages used by an application are located
in the local memory, and scalable multiprocessor synchronization primitives.

Multithreading
Up to now, we have talked about multiprogramming as a way to allow multiple programs to be
resident in main memory and (apparently) running at the same time. Then, multitasking refers to
multiple tasks running (apparently) simultaneously by sharing the CPU time. Finally,
multiprocessing describes systems having multiple CPUs. So, where does multithreading come in?
Multithreading is an execution model that allows a single process to have multiple code segments
(i.e., threads) run concurrently within the context of that process. You can think of threads as
child processes that share the parent process resources but execute independently. Multiple
threads of a single process can share the CPU in a single CPU system or truly run in parallel in
a multiprocessing system.
Why should we need to have multiple threads of execution within a single process context?
Well, consider for instance a GUI application where the user can issue a command that requires a
long time to finish (e.g., a complex mathematical computation). Unless you design this command to
be run in a separate execution thread, you will not be able to interact with the main application GUI
(e.g., to update a progress bar) because it is going to be unresponsive while the calculation is
taking place.
Of course, designing multithreaded/concurrent applications requires the programmer to handle
situations that simply don't occur when developing single-threaded, sequential applications. For
instance, when two or more threads try to access and modify a shared resource (a race condition),
the programmer must be sure this will not leave the system in an inconsistent or deadlocked state.
Typically, this thread synchronization is solved using OS primitives, such as mutexes and
semaphores.
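
The GUI scenario can be sketched in Python as follows. This is only an illustration under simple assumptions: there is no real GUI, the hypothetical function long_computation stands in for the slow command, and the dictionary named progress plays the role of the progress bar state that the main thread would read.

```python
import threading

def long_computation(n, progress):
    """Sum the squares 0..n-1 in a worker thread, publishing progress."""
    total = 0
    for i in range(n):
        total += i * i
        progress["done"] = i + 1      # what a progress bar would display
    progress["result"] = total

def run_in_background(n):
    """Run the computation in a separate thread while the caller stays free."""
    progress = {"done": 0, "result": None}
    worker = threading.Thread(target=long_computation, args=(n, progress))
    worker.start()
    # The main thread is not blocked by the computation: a GUI would handle
    # events and redraw here. This sketch just polls until the worker ends.
    while worker.is_alive():
        worker.join(timeout=0.01)
    return progress

if __name__ == "__main__":
    print(run_in_background(100_000))
```

Only the worker thread writes to progress and the main thread only reads it, which keeps this sketch free of the race conditions mentioned above.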
What is Spooling
Spooling stands for simultaneous peripheral operations on-line. It refers to the process of putting
jobs in a buffer (the spool), a special temporary storage area in memory or on a disk
where a device can access them when it is ready. Spooling is useful because devices access data
at different rates.

The buffer provides a waiting station where data can rest while the slower device catches
up. However, unlike a spool of thread, the first jobs sent to the spool are the first ones to be
processed (FIFO, not LIFO).

The most common spooling application is print spooling. When you choose to print a document,
the computer sends the document information to the printer very quickly, but the printer can't
accept it at the same rate. The printer can only handle a chunk of information at a time, and it
pauses to process and print that chunk before it's ready for more. Without a spooler, you have to wait until
the printer has accepted the whole document, piece by piece, before you can use your computer
again because the computer has to hang around and feed the information through. That's why you
need a print spooler: software that reduces the amount of time during which you can't work while
you wait for a job to print.

A spooler works by intercepting the information going to the printer, parking it temporarily on disk
or in memory. The computer can send the document information to the spooler at full speed, then
immediately return control of the screen to you. The spooler, meanwhile, hangs onto the
information and feeds it to the printer at the slow speed the printer needs to get it. So if your
computer can spool, you can work while a document is being printed.
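
A spooler of this kind can be sketched in Python with a FIFO queue and one thread that plays the slow printer. The function names printer and spool_jobs, the None end marker and the artificial delay are all assumptions made for the example; a real print spooler typically parks jobs on disk.

```python
import queue
import threading
import time

def printer(spool, printed, delay=0.001):
    """Slow device: takes one job at a time from the spool, in FIFO order."""
    while True:
        job = spool.get()
        if job is None:           # end marker: no more jobs will arrive
            break
        time.sleep(delay)         # simulate the printer's slow speed
        printed.append(job)

def spool_jobs(jobs):
    """Hand all jobs to the spool at full speed; return them as printed."""
    spool = queue.Queue()         # FIFO buffer between computer and printer
    printed = []
    worker = threading.Thread(target=printer, args=(spool, printed))
    worker.start()
    for job in jobs:              # returns immediately for each job
        spool.put(job)
    spool.put(None)
    worker.join()                 # wait only so the result can be inspected
    return printed

if __name__ == "__main__":
    print(spool_jobs(["report.txt", "photo.png", "letter.doc"]))
```

The jobs come out in the order in which they were submitted, which is the FIFO behaviour described earlier.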

You may notice during spooling, though, that your work gets slightly interrupted for a few seconds
here and there. On a single-processor computer this happens because the CPU cannot really do
more than one thing at a time, so it has to switch between running the spooler and responding to
you. So if your cursor doesn't move or the letters you type appear sporadically, don't worry. Total
control will be returned to you when the printer is done.

Sometimes spooling software is built into the system software, as in "Background printing" in
System 7 or the "Backgrounder" file on System 6 on the Mac, or the Print Manager in Windows.
Sometimes you buy extra software that allows you to spool.

Buffering
Spooling is great for some sorts of computer tasks, but it's not appropriate for others. Preloading
data into a reserved area of memory (the buffer) is called buffering. A buffer temporarily stores input or
output data in an attempt to better match the speeds of two devices such as a fast CPU and a slow
disk drive. A buffer may also be used when moving data between two processes within a
computer: data is stored in the buffer as it is retrieved from one process or just before it is sent to
another process. With spooling, the disk is used as a very large buffer, and usually complete jobs are
queued on disk to be completed later.
Buffering is mostly used for input, output, and sometimes temporary storage of data, either while
data is being transferred or when the data may be modified in a non-sequential manner. For example,
watching video on YouTube comes with the expectation that clicking "Play" will cause the video to
begin playing immediately. For this to work, the website sends small parts of the video when the page
loads, and then starts sending the next parts of the video when the "Play" button is clicked. These
subsequent parts are queued up in a buffer so that the video plays smoothly even though it's not fully
downloaded when you start it.
Caching
There's a third technique that is related to spooling and buffering: caching.
Caching transparently stores data in a component called a cache, so that future requests for that data
can be served faster. A cache is a special high-speed storage mechanism. It can be either a reserved section
of main memory or an independent high-speed storage device. The data that is stored within a
cache might be values that have been computed earlier or duplicates of original values that are
stored elsewhere. Examples include memory caching, disk caching, web caching (used in browsers) and
database caching. A cache's sole purpose is to reduce accesses to the underlying slower storage. When
a computer caches a file, it stores a local copy that can be accessed at the speed of the local hard
drive. Internet browsers will cache files that get accessed a lot, and any time you watch a video
and hit the "Replay" button, your browser is usually pulling up the copy of the file it downloaded
previously, rather than downloading it again. At the level of bits and bytes, your CPU and video
card cache data to boost performance.
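
The idea that a cache reduces accesses to slower storage can be sketched in a few lines of Python. Both classes below are hypothetical: SlowStore stands in for any slow backing storage and simply counts how often it is read, and Cache keeps copies of values it has already fetched. There is no eviction or invalidation here, which a real cache would need.

```python
class SlowStore:
    """Stand-in for slow underlying storage; counts every read."""
    def __init__(self, data):
        self.data = data
        self.reads = 0

    def read(self, key):
        self.reads += 1
        return self.data[key]

class Cache:
    """Keeps a copy of each value fetched, so repeats skip the store."""
    def __init__(self, store):
        self.store = store
        self.entries = {}

    def get(self, key):
        if key not in self.entries:          # miss: go to the slow storage
            self.entries[key] = self.store.read(key)
        return self.entries[key]             # hit: served from the copy

if __name__ == "__main__":
    store = SlowStore({"video": "bytes of the video"})
    cache = Cache(store)
    cache.get("video")      # first request reads the store
    cache.get("video")      # a "Replay" is served from the cache
    print(store.reads)
```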

Distributed Operating System

Distributed Operating System is a model where distributed applications are running on multiple
computers linked by communications. A distributed operating system is an extension of the
network operating system that supports higher levels of communication and integration of the
machines on the network.

This system looks to its users like an ordinary centralized operating system but runs on multiple,
independent central processing units (CPUs).

These systems are referred to as loosely coupled systems, where each processor has its own local
memory and processors communicate with one another through various communication lines, such
as high speed buses or telephone lines. By loosely coupled systems, we mean that such
computers possess no hardware connections at the CPU-memory bus level, but are connected
by external interfaces that run under the control of software.

A distributed OS involves a collection of autonomous computer systems, capable of
communicating and cooperating with each other through a LAN / WAN. A distributed OS provides
a virtual machine abstraction to its users and wide sharing of resources such as computational
capacity, I/O and files.

The structure shown in fig contains a set of individual computer systems and workstations
connected via communication systems, but this structure alone does not make it a distributed
system, because it is the software, not the hardware, that determines whether a system is
distributed or not.

The users of a true distributed system should not need to know on which machine their programs are
running and where their files are stored. LOCUS and MICROS are well-known examples of distributed
operating systems.

Using the LOCUS operating system it was possible to access local and distant files in a uniform manner.
This feature enabled a user to log on at any node of the network and to utilize the resources in the
network regardless of his/her location. MICROS provided sharing of resources in an
automatic manner. The jobs were assigned to different nodes of the whole system to balance the
load on different nodes.

Given below are some further examples of distributed operating systems:

1. IRIX operating system: the implementation of UNIX System V, Release 3 for Silicon Graphics
multiprocessor workstations.

2. DYNIX operating system running on Sequent Symmetry multiprocessor computers.

3. AIX operating system for IBM RS/6000 computers.

4. Solaris operating system for SUN multiprocessor workstations.

5. Mach/OS: a multithreading and multitasking UNIX compatible operating system.

6. OSF/1 operating system developed by the Open Software Foundation: UNIX compatible.

Distributed systems provide the following advantages:

1 Sharing of resources.

2 Reliability.

3 Communication.

4 Computation speedup.

Distributed systems are potentially more reliable than a central system. If a system has
only one instance of some critical component, such as a CPU, disk, or network interface, and that
component fails, the system will go down. When there are multiple instances, the system may be
able to continue in spite of occasional failures. In addition to hardware failures, one can also
consider software failures. Distributed systems allow both hardware and software errors to be dealt
with.

A distributed system is a set of computers that communicate and collaborate with each other using
software and hardware interconnecting components. Multiprocessors (MIMD computers using
shared memory architecture), multicomputers connected through static or dynamic interconnection
networks (MIMD computers using message passing architecture) and workstations connected
through a local area network are examples of such distributed systems.

A distributed system is managed by a distributed operating system. A distributed operating system
manages the system's shared resources used by multiple processes, the process scheduling activity
(how processes are allocated to the available processors), the communication and synchronization
between running processes and so on. The software for parallel computers can be either tightly
coupled or loosely coupled. Loosely coupled software allows the computers and users of a
distributed system to be independent of each other while having a limited possibility to cooperate. An
example of such a system is a group of computers connected through a local network. Every
computer has its own memory and hard disk. There are some shared resources such as files and
printers. If the interconnection network breaks down, individual computers can still be used, but without
some features like printing to a non-local printer.

Operating System - Processes


A process is basically a program in execution. The execution of a process must progress in a
sequential fashion.

A process is defined as an entity which represents the basic unit of work to be implemented in the
system.
To put it in simple terms, we write our computer programs in a text file and when we execute this
program, it becomes a process which performs all the tasks mentioned in the program.
When a program is loaded into memory and becomes a process, it can be divided into four
sections: stack, heap, text and data. The following image shows a simplified layout of a process
inside main memory.


Component & Description
1 Stack

The process Stack contains the temporary data such as method/function parameters,
return address and local variables.

2 Heap
This is dynamically allocated memory to a process during its run time.

3 Text
This section contains the compiled program code. The current activity of the process is represented
by the value of the Program Counter and the contents of the processor's registers.

4 Data
This section contains the global and static variables.

Program
A program is a piece of code which may be a single line or millions of lines. A computer program is
usually written by a computer programmer in a programming language, for example the C
programming language. A computer program is a collection of
instructions that performs a specific task when executed by a computer. When we compare a
program with a process, we can conclude that a process is a dynamic instance of a computer
program.

A part of a computer program that performs a well-defined task is known as an algorithm. A
collection of computer programs, libraries and related data is referred to as software.

Process Life Cycle


When a process executes, it passes through different states. These states may differ in different
operating systems, and the names of these states are also not standardized.

In general, a process can be in one of the following five states at a time.
State & Description

1 Start

This is the initial state when a process is first started/created.

2 Ready

The process is waiting to be assigned to a processor. Ready processes are waiting to have the
processor allocated to them by the operating system so that they can run. A process may come into
this state after the Start state, or while running if it is interrupted by the scheduler so that the
CPU can be assigned to some other process.

3 Running

Once the process has been assigned to a processor by the OS scheduler, the process state is set
to running and the processor executes its instructions.

4 Waiting

Process moves into the waiting state if it needs to wait for a resource, such as waiting for user
input, or waiting for a file to become available.

5 Terminated or Exit

Once the process finishes its execution, or it is terminated by the operating system, it is moved to
the terminated state where it waits to be removed from main memory.
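
The five states can be read as a small state machine. The Python sketch below records which moves are allowed; the table ALLOWED and the helper names are made up for this example. The move from waiting back to ready (once the awaited resource is available) is an assumption that follows the usual textbook diagram and is not spelled out in the list above.

```python
# Allowed moves between the five states described above.
ALLOWED = {
    "start": {"ready"},
    "ready": {"running"},
    "running": {"ready", "waiting", "terminated"},
    "waiting": {"ready"},          # assumed: resource became available
    "terminated": set(),
}

def transition(state, new_state):
    """Return new_state if the move is allowed, otherwise raise ValueError."""
    if new_state not in ALLOWED[state]:
        raise ValueError(f"illegal transition: {state} -> {new_state}")
    return new_state

def run_path(path):
    """Walk a process through a sequence of states and return the last one."""
    state = path[0]
    for nxt in path[1:]:
        state = transition(state, nxt)
    return state

if __name__ == "__main__":
    print(run_path(["start", "ready", "running", "waiting",
                    "ready", "running", "terminated"]))
```

A process that is waiting cannot jump straight to running in this model; it first has to return to the ready state and be chosen by the scheduler.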

Process Control Block (PCB)


A Process Control Block is a data structure maintained by the Operating System for every process.
The PCB is identified by an integer process ID (PID). A PCB keeps all the information needed to
keep track of a process as listed below in the table

Information & Description


1 Process State

The current state of the process, i.e., whether it is ready, running, waiting, and so on.
2 Process privileges

This is required to allow/disallow access to system resources.

3 Process ID

Unique identification for each process in the operating system.

4 Pointer

A pointer to parent process.

5 Program Counter

Program Counter is a pointer to the address of the next instruction to be executed for this process.

6 CPU registers

The contents of the various CPU registers, which must be saved here so the process can resume in the running state.

7 CPU Scheduling Information

Process priority and other scheduling information which is required to schedule the process.

8 Memory management information

This includes information such as the page table, memory limits and segment table, depending on the
memory system used by the operating system.

9 Accounting information

This includes the amount of CPU time used for process execution, time limits, execution ID etc.

10 IO status information

This includes a list of I/O devices allocated to the process.


The architecture of a PCB is completely dependent on the Operating System and may contain different
information in different operating systems. Here is a simplified diagram of a PCB.

The PCB is maintained for a process throughout its lifetime, and is deleted once the process terminates.
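
To make the table concrete, a PCB can be pictured as a record with one field per row. The Python sketch below is only an illustration: the field names, types and default values are assumptions made for this example, and a real operating system keeps this data in kernel structures whose layout differs from system to system.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

@dataclass
class PCB:
    """Simplified process control block; one field per table row."""
    pid: int                                        # process ID
    state: str = "start"                            # process state
    privileges: List[str] = field(default_factory=list)
    parent_pid: Optional[int] = None                # pointer to parent
    program_counter: int = 0                        # next instruction
    registers: Dict[str, int] = field(default_factory=dict)
    priority: int = 0                               # scheduling information
    memory_limits: Tuple[int, int] = (0, 0)         # memory management info
    cpu_time_used: float = 0.0                      # accounting information
    io_devices: List[str] = field(default_factory=list)

if __name__ == "__main__":
    init = PCB(pid=1)
    child = PCB(pid=2, parent_pid=init.pid, priority=5)
    child.state = "ready"
    child.io_devices.append("disk0")
    print(child)
```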
Operating System - Process Scheduling
Process scheduling is the activity of the process manager that handles the removal of the
running process from the CPU and the selection of another process on the basis of a particular
strategy.

Process scheduling is an essential part of multiprogramming operating systems. Such operating
systems allow more than one process to be loaded into the executable memory at a time, and the
loaded processes share the CPU using time multiplexing.

Process Scheduling Queues


The OS maintains all PCBs in Process Scheduling Queues. The OS maintains a separate queue
for each of the process states, and the PCBs of all processes in the same execution state are placed in
the same queue. When the state of a process is changed, its PCB is unlinked from its current
queue and moved to its new state queue.

The Operating System maintains the following important process scheduling queues:

Job queue: This queue keeps all the processes in the system.

Ready queue: This queue keeps a set of all processes residing in main memory, ready and
waiting to execute. A new process is always put in this queue.

Device queues: The processes which are blocked due to unavailability of an I/O device constitute
these queues.

The OS can use different policies to manage each queue (FIFO, Round Robin, Priority, etc.). The
OS scheduler determines how to move processes between the ready queue and the run queue. The run
queue can have only one entry per processor core on the system; in the above diagram, it has been
merged with the CPU.

Schedulers
Schedulers are special system software which handle process scheduling in various ways. Their
main task is to select the jobs to be submitted into the system and to decide which process to run.
Schedulers are of three types

Long-Term Scheduler
Short-Term Scheduler
Medium-Term Scheduler

Context Switch
A context switch is the mechanism to store and restore the state or context of a CPU in the Process
Control Block so that a process execution can be resumed from the same point at a later time.
Using this technique, a context switcher enables multiple processes to share a single CPU.
Context switching is an essential feature of a multitasking operating system.
When the scheduler switches the CPU from executing one process to executing another, the state
of the currently running process is stored into its process control block. After this, the state for the
process to run next is loaded from its own PCB and used to set the PC, registers, etc. At that point,
the second process can start executing.
Context switches are computationally intensive since
register and memory state must be saved and restored. To reduce context switching
time, some hardware systems employ two or more sets of processor registers. When the process
is switched, the following information is stored for later use:

Program Counter
Scheduling information
Base and limit register value
Currently used register
Changed State
I/O State information
Accounting information
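
The save and restore steps can be sketched in Python with plain dictionaries standing in for the CPU and for two PCBs. Everything here is a simplification made for the example: only the program counter and registers are saved, and the keys "pc", "registers" and "state" are invented names.

```python
def context_switch(cpu, current_pcb, next_pcb):
    """Save the CPU state into current_pcb, then load next_pcb's state."""
    # 1. Store the state of the currently running process in its PCB.
    current_pcb["pc"] = cpu["pc"]
    current_pcb["registers"] = dict(cpu["registers"])
    current_pcb["state"] = "ready"
    # 2. Load the state of the process to run next from its own PCB.
    cpu["pc"] = next_pcb["pc"]
    cpu["registers"] = dict(next_pcb["registers"])
    next_pcb["state"] = "running"

if __name__ == "__main__":
    cpu = {"pc": 104, "registers": {"r0": 7}}
    p1 = {"pc": 100, "registers": {"r0": 0}, "state": "running"}
    p2 = {"pc": 500, "registers": {"r0": 42}, "state": "ready"}
    context_switch(cpu, p1, p2)
    print(cpu, p1["pc"], p1["state"], p2["state"])
```

After the call, p1 records where it was interrupted (program counter 104), so a later switch back to p1 would resume it from that point.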
Operating System Scheduling algorithms

A Process Scheduler schedules different processes to be assigned to the CPU based on particular
scheduling algorithms. There are six popular process scheduling algorithms which we are going to
discuss in this chapter:

First-Come, First-Served (FCFS) Scheduling
Shortest-Job-Next (SJN) Scheduling
Priority Scheduling
Shortest Remaining Time
Round Robin (RR) Scheduling
Multiple-Level Queues Scheduling

These algorithms are either non-preemptive or preemptive. Non-preemptive algorithms are
designed so that once a process enters the running state, it cannot be preempted until it completes
its allotted time, whereas preemptive scheduling is based on priority: a scheduler may
preempt a low priority running process anytime a high priority process enters the ready
state.

First Come First Serve (FCFS)


1. Jobs are executed on first come, first serve basis.
2. It is a non-preemptive scheduling algorithm.
3. Easy to understand and implement.
4. Its implementation is based on FIFO queue.
5. Poor in performance as average wait time is high.
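
As a worked illustration of FCFS, the Python sketch below computes how long each process waits before it first gets the CPU. The function name fcfs and the four sample processes, given as (name, arrival time, CPU burst), are made up for this example.

```python
def fcfs(processes):
    """processes: list of (name, arrival, burst). Returns waiting times."""
    clock = 0
    waits = {}
    # Serve strictly in order of arrival (a FIFO queue).
    for name, arrival, burst in sorted(processes, key=lambda p: p[1]):
        start = max(clock, arrival)     # CPU may sit idle until arrival
        waits[name] = start - arrival
        clock = start + burst           # runs to completion, no preemption
    return waits

if __name__ == "__main__":
    jobs = [("P0", 0, 5), ("P1", 1, 3), ("P2", 2, 8), ("P3", 3, 6)]
    waits = fcfs(jobs)
    print(waits)                        # P0: 0, P1: 4, P2: 6, P3: 13
    print(sum(waits.values()) / len(waits))   # average wait: 5.75
```

The short job P1 has to sit behind P0, and P3 waits for everything ahead of it, which is why the average wait under FCFS tends to be high.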

Shortest Job Next (SJN)


1. This is also known as shortest job first, or SJF.
2. This is a non-preemptive scheduling algorithm.
3. Best approach to minimize waiting time.
4. Easy to implement in Batch systems where required CPU time is known in advance.
5. Impossible to implement in interactive systems where required CPU time is not known.
6. The processor should know in advance how much time a process will take.
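
The selection rule of SJN can be sketched in Python as below, assuming the burst times are known in advance as point 4 requires. The function name sjn and the sample processes, given as (name, arrival time, CPU burst), are invented for the example.

```python
def sjn(processes):
    """Non-preemptive shortest job next. Returns waiting time per process."""
    pending = sorted(processes, key=lambda p: p[1])   # by arrival time
    clock = 0
    waits = {}
    while pending:
        ready = [p for p in pending if p[1] <= clock]
        if not ready:                    # CPU idle until the next arrival
            clock = pending[0][1]
            continue
        job = min(ready, key=lambda p: p[2])   # shortest burst among ready
        pending.remove(job)
        name, arrival, burst = job
        waits[name] = clock - arrival
        clock += burst                   # runs to completion
    return waits

if __name__ == "__main__":
    jobs = [("P0", 0, 5), ("P1", 1, 3), ("P2", 2, 8), ("P3", 3, 6)]
    waits = sjn(jobs)
    print(waits)                         # P0: 0, P1: 4, P3: 5, P2: 12
    print(sum(waits.values()) / len(waits))   # average wait: 5.25
```

For this sample the average wait is 5.25 time units, compared with 5.75 if the same jobs were served strictly in arrival order.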

Priority Based Scheduling


1. Priority scheduling, described here in its non-preemptive form, is one of the most common
scheduling algorithms in batch systems.
2. Each process is assigned a priority. The process with the highest priority is executed first, and so
on.
3. Processes with same priority are executed on first come first served basis.
4. Priority can be decided based on memory requirements, time requirements or any other
resource requirement.
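
A non-preemptive priority scheduler can be sketched in Python as follows. The convention that a smaller number means a higher priority is an assumption of this example (systems differ on this), as are the function name and the sample processes, given as (name, arrival time, CPU burst, priority).

```python
def priority_schedule(processes):
    """Non-preemptive priority scheduling; smaller number = higher priority.

    Returns the order of execution and the waiting time per process.
    """
    pending = sorted(processes, key=lambda p: p[1])   # by arrival time
    clock = 0
    order, waits = [], {}
    while pending:
        ready = [p for p in pending if p[1] <= clock]
        if not ready:                    # CPU idle until the next arrival
            clock = pending[0][1]
            continue
        # Highest priority first; equal priorities fall back to FCFS.
        job = min(ready, key=lambda p: (p[3], p[1]))
        pending.remove(job)
        name, arrival, burst, _ = job
        order.append(name)
        waits[name] = clock - arrival
        clock += burst
    return order, waits

if __name__ == "__main__":
    jobs = [("P0", 0, 5, 1), ("P1", 1, 3, 2), ("P2", 2, 8, 1), ("P3", 3, 6, 3)]
    print(priority_schedule(jobs))
```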

Shortest Remaining Time


1. Shortest remaining time (SRT) is the preemptive version of the SJN algorithm.
2. The processor is allocated to the job closest to completion, but it can be preempted by a newer
ready job with a shorter time to completion.
3. Impossible to implement in interactive systems where required CPU time is not known.
4. It is often used in batch environments where short jobs need to be given preference.
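
Shortest remaining time can be sketched in Python by simulating one time unit at a time and re-checking, at every tick, which ready job is closest to completion. The function name srt, the unit-sized ticks and the sample processes, given as (name, arrival time, CPU burst of at least 1), are assumptions made for the example.

```python
def srt(processes):
    """Preemptive shortest remaining time. Returns waiting time per process."""
    arrival = {name: arr for name, arr, _ in processes}
    burst = {name: b for name, _, b in processes}
    remaining = dict(burst)
    clock = 0
    finish = {}
    while remaining:
        ready = [n for n in remaining if arrival[n] <= clock]
        if not ready:                    # CPU idle until the next arrival
            clock = min(arrival[n] for n in remaining)
            continue
        # Re-evaluated every tick, so a newly arrived shorter job preempts.
        current = min(ready, key=lambda n: (remaining[n], arrival[n]))
        remaining[current] -= 1
        clock += 1
        if remaining[current] == 0:
            del remaining[current]
            finish[current] = clock
    # Waiting time = turnaround time minus the time spent executing.
    return {n: finish[n] - arrival[n] - burst[n] for n in finish}

if __name__ == "__main__":
    jobs = [("P0", 0, 5), ("P1", 1, 3), ("P2", 2, 8), ("P3", 3, 6)]
    print(srt(jobs))                     # P1: 0, P0: 3, P3: 5, P2: 12
```

In this sample P1 arrives at time 1 with less work left than P0 and takes over the CPU immediately, so P1 never waits at all.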

Round Robin Scheduling


1. Round Robin is a preemptive process scheduling algorithm.
2. Each process is given a fixed time to execute, called a quantum.
3. Once a process has executed for the given time period, it is preempted and another process
executes for its time period.
4. Context switching is used to save states of preempted processes.
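
Round Robin can be sketched in Python with a ready queue that processes rejoin at the back when their quantum expires. The function name, the quantum of 3 and the sample processes, given as (name, arrival time, CPU burst of at least 1), are assumptions of this example. One further assumption is the tie rule: a process that arrives during a time slice is queued ahead of the process that was just preempted.

```python
from collections import deque

def round_robin(processes, quantum):
    """Round Robin scheduling. Returns waiting time per process."""
    arrival = {name: arr for name, arr, _ in processes}
    burst = {name: b for name, _, b in processes}
    remaining = dict(burst)
    pending = deque(sorted(processes, key=lambda p: p[1]))
    ready = deque()
    clock = 0
    finish = {}
    while pending or ready:
        if not ready:                        # CPU idle until next arrival
            clock = max(clock, pending[0][1])
        while pending and pending[0][1] <= clock:
            ready.append(pending.popleft()[0])
        name = ready.popleft()
        run = min(quantum, remaining[name])  # run for at most one quantum
        clock += run
        remaining[name] -= run
        # Admit processes that arrived during this time slice.
        while pending and pending[0][1] <= clock:
            ready.append(pending.popleft()[0])
        if remaining[name]:
            ready.append(name)               # preempted: back of the queue
        else:
            finish[name] = clock
    return {n: finish[n] - arrival[n] - burst[n] for n in finish}

if __name__ == "__main__":
    jobs = [("P0", 0, 5), ("P1", 1, 3), ("P2", 2, 8), ("P3", 3, 6)]
    print(round_robin(jobs, quantum=3))  # P1: 2, P0: 9, P3: 11, P2: 12
```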

Multiple-Level Queues Scheduling


Multiple-level queues are not an independent scheduling algorithm. They make use of other
existing algorithms to group and schedule jobs with common characteristics.

1. Multiple queues are maintained for processes with common characteristics.
2. Each queue can have its own scheduling algorithms.
3. Priorities are assigned to each queue.

For example, CPU-bound jobs can be scheduled in one queue and all I/O-bound jobs in another
queue. The Process Scheduler then alternately selects jobs from each queue and assigns them to
the CPU based on the algorithm assigned to the queue.
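
The alternating selection in this example can be sketched in Python as below. For simplicity both queues are served in FIFO order and the scheduler just takes turns between them; the function name and job names are invented, and queue priorities are left out of the sketch.

```python
from collections import deque

def multilevel(cpu_bound, io_bound):
    """Alternate between two FIFO queues; returns the dispatch order."""
    queues = [deque(cpu_bound), deque(io_bound)]
    order = []
    turn = 0
    while any(queues):                   # stop once both queues are empty
        q = queues[turn % 2]
        if q:                            # an empty queue gives up its turn
            order.append(q.popleft())
        turn += 1
    return order

if __name__ == "__main__":
    print(multilevel(["c1", "c2", "c3"], ["i1"]))
```

With three CPU-bound jobs and one I/O-bound job, the dispatch order is c1, i1, c2, c3.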

Operating System - Multi-Threading

A thread is a flow of execution through the process code, with its own program counter that keeps
track of which instruction to execute next, system registers which hold its current working variables,
and a stack which contains the execution history.

A thread shares with its peer threads some information, such as the code segment, data segment and open
files. When one thread alters a data segment memory item, all other threads see the change.
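
The difference between shared and per-thread data can be sketched in Python as follows. The names shared, worker and run are made up for the example: the module-level list is visible to every thread of the process, while each call to worker has its own local variable.

```python
import threading

shared = []            # process-wide data: every thread sees the same list

def worker(tag):
    local = tag * 2    # local variable: private to this thread's call
    shared.append(local)

def run(tags=(0, 1, 2)):
    """Start one thread per tag and return what they left in shared."""
    shared.clear()
    threads = [threading.Thread(target=worker, args=(t,)) for t in tags]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(shared)   # sorted because the thread order is not fixed

if __name__ == "__main__":
    print(run())
```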

A thread is also called a lightweight process. Threads provide a way to improve application
performance through parallelism. Threads represent a software approach to improving
the performance of an operating system by reducing overhead; in other respects a thread is
equivalent to a classical process.

Each thread belongs to exactly one process and no thread can exist outside a process. Each
thread represents a separate flow of control. Threads have been successfully used in
implementing network servers and web servers. They also provide a suitable foundation for parallel
execution of applications on shared memory multiprocessors. The following figure shows the
working of a single-threaded and a multithreaded process.

Difference between Process and Thread


1 Process is heavyweight or resource intensive. / Thread is lightweight, taking fewer
resources than a process.
2 Process switching needs interaction with the operating system. / Switching between user-level
threads does not need to interact with the operating system.
3 In multiple processing environments, each process executes the same code but has its own
memory and file resources. / All threads can share the same set of open files and child processes.
4 If a single-threaded process is blocked, it can do no further work until it is
unblocked. / While one thread is blocked and waiting, a second thread in the same task can run.
5 Multiple processes without using threads use more resources. / Multiple threaded
processes use fewer resources.
6 In multiple processes each process operates independently of the others. / One thread can
read, write or change another thread's data.

Advantages of Thread
Threads minimize the context switching time.
Use of threads provides concurrency within a process.
Efficient communication.
It is more economical to create and context switch threads.
Threads allow utilization of multiprocessor architectures to a greater scale and efficiency.

Types of Thread
Threads are implemented in the following two ways:

1. User Level Threads: User managed threads.
2. Kernel Level Threads: Operating System managed threads acting on the kernel, the operating
system core.

User Level Threads


In this case, thread management is done by the application and the kernel is not aware of the
existence of threads. The thread
library contains code for creating and destroying threads, for passing messages and data between
threads, for scheduling thread execution and for saving and restoring thread contexts. The
application starts with a single thread.

Advantages
1. Thread switching does not require Kernel mode privileges.
2. User level thread can run on any operating system.
3. Scheduling can be application specific in the user level thread.
4. User level threads are fast to create and manage.
Disadvantages
1. In a typical operating system, most system calls are blocking, so when one user-level thread
makes a blocking call the whole process is blocked.
2. A multithreaded application cannot take advantage of multiprocessing.
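
The idea of threads managed entirely in user space can be sketched with Python generators. Each generator below acts as a user-level thread that gives up control voluntarily with yield, and run_all is a tiny round-robin scheduler living in the application. This is an analogy built for this example, not how any particular thread library is implemented.

```python
from collections import deque

def task(name, steps, log):
    """A user-level 'thread': does one step, then yields control."""
    for i in range(steps):
        log.append((name, i))
        yield                     # hand control back to the scheduler

def run_all(tasks):
    """User-space round-robin scheduler; the kernel is never involved."""
    ready = deque(tasks)
    while ready:
        t = ready.popleft()
        try:
            next(t)               # resume the thread until its next yield
            ready.append(t)       # still has work: back of the ready queue
        except StopIteration:
            pass                  # thread finished; drop it

if __name__ == "__main__":
    log = []
    run_all([task("A", 2, log), task("B", 1, log)])
    print(log)
```

Both disadvantages are visible here: if one task made a blocking call inside a step, run_all and every other task would stall with it, and everything runs on a single flow of control, so a second processor would not help.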
Kernel Level Threads
In this case, thread management is done by the Kernel. There is no thread management code in
the application area. Kernel threads are supported directly by the operating system. Any
application can be programmed to be multithreaded. All of the threads within an application are
supported within a single process.

The Kernel maintains context information for the process as a whole and for individual threads
within the process. Scheduling by the Kernel is done on a thread basis. The Kernel performs
thread creation, scheduling and management in Kernel space. Kernel threads are generally slower
to create and manage than user threads.

Advantages
1. The Kernel can simultaneously schedule multiple threads from the same process on multiple
processors.
2. If one thread in a process is blocked, the Kernel can schedule another thread of the same
process.
3. Kernel routines themselves can be multithreaded.
Disadvantages
1. Kernel threads are generally slower to create and manage than the user threads.
2. Transfer of control from one thread to another within the same process requires a mode switch
to the Kernel.

Multithreading Models
Some operating systems provide a combined user level thread and Kernel level thread facility.
Solaris is a good example of this combined approach. In a combined system, multiple threads
within the same application can run in parallel on multiple processors and a blocking system call
need not block the entire process. Multithreading models are of three types:

Many to many relationship.
Many to one relationship.
One to one relationship.

Many to Many Model


The many-to-many model multiplexes any number of user threads onto an equal or smaller
number of kernel threads.

The following diagram shows the many-to-many threading model where 6 user level threads are
multiplexed onto 6 kernel level threads. In this model, developers can create as many user threads
as necessary and the corresponding Kernel threads can run in parallel on a multiprocessor
machine. This model provides the best level of concurrency, and when a thread performs a
blocking system call, the kernel can schedule another thread for execution.

Many to One Model


The many-to-one model maps many user level threads to one Kernel-level thread. Thread management
is done in user space by the thread library. When a thread makes a blocking system call, the entire
process will be blocked. Only one thread can access the Kernel at a time, so multiple threads are
unable to run in parallel on multiprocessors.

If the user-level thread libraries are implemented on an operating system whose kernel does not
support threads, then the threading uses the many-to-one model.

One to One Model


There is a one-to-one relationship between each user-level thread and a kernel-level thread. This model provides
more concurrency than the many-to-one model. It also allows another thread to run when a thread
makes a blocking system call. It allows multiple threads to execute in parallel on
multiprocessors.

The disadvantage of this model is that creating a user thread requires creating the corresponding Kernel thread.
OS/2, Windows NT and Windows 2000 use the one-to-one relationship model.

Difference between User-Level & Kernel-Level Thread


S.N. User-Level Threads / Kernel-Level Threads
1 User-level threads are faster to create and manage. / Kernel-level threads are slower to
create and manage.
2 Implementation is by a thread library at the user level. / Operating system supports creation
of Kernel threads.
3 User-level thread is generic and can run on any operating system. / Kernel-level thread is
specific to the operating system.
4 Multi-threaded applications cannot take advantage of multiprocessing. / Kernel routines
themselves can be multithreaded.

Concurrency in operating systems


In computer science, concurrency is the execution of several instruction sequences at the same
time. In an operating system, this happens when there are several process threads running in
parallel. These threads may communicate with each other through either shared memory or
message passing.
Distribution is a form of concurrency where all communication between simultaneous threads is
done exclusively via message passing. Distribution is useful because it employs a more lenient
scaling of resource consumption, which economizes these resources. Whereas shared memory
concurrency often requires a single processor per thread, distribution allows several threads to co-
exist and communicate between one another.
Concurrency is also a programming design philosophy. In concurrent programming, programmers
attempt to break down a complex problem into several simultaneously executing processes that can
be addressed individually. Although concurrent programming offers better program structure than
sequential programming, it is not always more practical. In a concurrent system, computations
being executed at the same time can diverge, giving indeterminate answers. The system may end
in a deadlock if well-defined maxima are not assigned for the resource consumption of each of the
executing threads. Thus, to design for robust concurrency in an operating system, a programmer
needs to both reduce a problem into individual, parallel tasks and coordinate the execution,
memory allocation and data exchange of those tasks.
Race Condition
In concurrent programming, a race condition occurs when a second thread modifies the state of
one or more objects, invalidating any assumptions or checks made by the first thread. This is
sometimes referred to as check-then-act. More generally, a race condition occurs when two or more threads can
access shared data and they try to change it at the same time. Because the thread scheduling
algorithm can swap between threads at any time, you don't know the order in which the threads will
attempt to access the shared data. Therefore, the result of the change in data is dependent on the
thread scheduling algorithm, i.e. both threads are "racing" to access/change the data. The term
race condition stems from the metaphor that the threads are racing through the critical section, and
that the result of that race impacts the result of executing the critical section.
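
The lost update described above can be replayed by hand. The following is a minimal sketch in Python, not a real multithreaded program: the two "threads" are simulated by ordinary statements, and the interleaving is one that a scheduler could choose. The function name lost_update_demo is an assumption made for this illustration.

```python
def lost_update_demo():
    """Replay one unlucky interleaving of two threads that each do counter += 1."""
    counter = 0              # shared data

    a_local = counter        # thread A reads 0
    b_local = counter        # thread B reads 0, because A has not written yet
    counter = a_local + 1    # thread A writes 1
    counter = b_local + 1    # thread B also writes 1, overwriting A's update

    return counter


print(lost_update_demo())    # 1, although two increments were performed
```

Each increment is a read followed by a write, and the switch between the read and the write is what makes the final value depend on scheduling.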

Mutual Exclusion

A mutual exclusion object (mutex) is a program object that prevents simultaneous access to a shared
resource. This concept is used in concurrent programming with a critical section, a piece of code in
which processes or threads access a shared resource. Only one thread owns the mutex at a time.
A mutex with a unique name is created when a program starts. When a thread holds a
resource, it has to lock the mutex against other threads to prevent concurrent access to the resource.
Upon releasing the resource, the thread unlocks the mutex.
A mutex comes into the picture when two threads work on the same data at the same time. It acts as
a lock and is the most basic synchronization tool. When a thread tries to acquire a mutex, it gains
the mutex if it is available; otherwise the thread is put to sleep. Putting waiting threads to sleep reduces
latency and busy-waiting by using queuing and context switches. Mutual exclusion can be enforced at both the
hardware and software levels.
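
As a minimal sketch of the lock/unlock pattern described above, the following Python fragment protects a shared counter with threading.Lock from the standard library. The names counter, counter_lock, add_many and run are assumptions made for this illustration.

```python
import threading

counter = 0
counter_lock = threading.Lock()      # the mutex guarding `counter`


def add_many(times):
    """Increment the shared counter `times` times, one locked step at a time."""
    global counter
    for _ in range(times):
        with counter_lock:           # lock on entry, unlock on exit
            counter += 1             # critical section


def run(num_threads=4, times=10_000):
    """Start the worker threads, wait for them, and return the final counter."""
    global counter
    counter = 0
    threads = [threading.Thread(target=add_many, args=(times,))
               for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter


print(run())                         # 40000: no update is lost
```

A thread that finds the lock taken is blocked by the library until the owner releases it, which matches the "set to sleep" behaviour in the text.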

Disabling interrupts for the smallest possible number of instructions is the best way to enforce mutual exclusion at the
kernel level and prevent the corruption of shared data structures. If multiple processors share the
same memory, a flag is set to enable and disable the resource acquisition based on availability.
The busy-wait mechanism enforces mutual exclusion in software. This is achieved with algorithms
such as Dekker's algorithm, the black-white bakery algorithm, Szymanski's algorithm, Peterson's
algorithm and Lamport's bakery algorithm.

Mutually exclusive readers and read/write mutex class codes can be defined for an efficient
implementation of mutex.

Mutually exclusive access to shared data is often required to avoid problematic race conditions.
What is meant by this? How can we make it happen?
Requirements:

- No more than one thread can be in its critical section at any one time.
- A thread which dies in its non-critical section will not affect the others' ability to continue.
- No deadlock: if a thread wants to enter its critical section then it will eventually be allowed to do
so.
- No starvation.
- Threads are not forced into lock-step execution of their critical sections.
Approaches:

- Bare Machine Techniques: techniques that work in the absence of help from a scheduler.
- Scheduler-Assisted Techniques: techniques that rely on a scheduler (usually part of the
operating system) to block threads until a resource is free.

Semaphores
A semaphore is a hardware or software tag variable whose value indicates the status of a common
resource. Its purpose is to lock the resource being used. A process which needs the resource
checks the semaphore to determine the status of the resource and then decides whether to
proceed. In multitasking operating systems, activities are synchronized by using
semaphore techniques.
A semaphore is a variable. There are 2 types of semaphores:
Binary semaphores

- A binary semaphore has 2 methods associated with it (up, down / lock, unlock).
- A binary semaphore can take only 2 values (0/1). It is used to acquire locks. When a
resource is available, the process in charge sets the semaphore to 1, otherwise to 0.

Counting semaphores

- A counting semaphore may have a value greater than one. It is typically used to allocate resources
from a pool of identical resources.
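
The following sketch shows a counting semaphore guarding a pool of identical resources, using threading.Semaphore from the Python standard library. The pool size, the worker count and the names POOL_SIZE, use_resource and run are assumptions made for this illustration.

```python
import threading
import time

POOL_SIZE = 2                            # hypothetical number of identical resources
pool = threading.Semaphore(POOL_SIZE)    # counting semaphore, starts at 2
state_lock = threading.Lock()            # protects the two statistics below
in_use = 0
max_in_use = 0


def use_resource():
    global in_use, max_in_use
    with pool:                           # down: blocks while the count is 0
        with state_lock:
            in_use += 1
            max_in_use = max(max_in_use, in_use)
        time.sleep(0.01)                 # pretend to use the resource
        with state_lock:
            in_use -= 1
    # leaving the with-block performs the up operation


def run(workers=6):
    """Run `workers` threads and return the largest number of holders seen."""
    global in_use, max_in_use
    in_use = max_in_use = 0
    threads = [threading.Thread(target=use_resource) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return max_in_use


print(run() <= POOL_SIZE)                # True: never more holders than resources
```

With an initial value of 1 the same object behaves as a binary semaphore.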
What is the difference between a binary semaphore and a mutex?

The differences between a binary semaphore and a mutex are:

- A mutex is used exclusively for mutual exclusion. A binary semaphore can be used for both
mutual exclusion and synchronization.
- Only the task that took a mutex can give it.
- A mutex cannot be given from an ISR.
- Recursive taking of mutual exclusion semaphores is possible. This means that a task that holds
the mutex can take it more than once before finally releasing it.
- A mutex provides options for making the task which takes it DELETE_SAFE, which
means that the task cannot be deleted while it holds the mutex.

Mutex
Mutex is the short form for Mutual Exclusion object. A mutex allows multiple threads to share
the same resource, but not simultaneously. The resource can be a file. A mutex with a unique name is created at the time of
starting a program. Any thread that needs the resource must lock the mutex against other threads
while it uses the resource. When the data is no longer used / needed, the mutex is unlocked.
A mutex and a binary semaphore are essentially the same. Both can take values: 0 or 1.
However, there is a significant difference between them that makes mutexes more efficient than
binary semaphores.
A mutex can be unlocked only by the thread that locked it. Thus a mutex has an owner concept.
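
The owner concept can be observed with the Python standard library. This is a sketch under one assumption: threading.RLock stands in for the owned, recursive mutex described here (Python's plain threading.Lock does not enforce ownership), and threading.Semaphore stands in for the binary semaphore. The helper try_release_from_other_thread is a name invented for this illustration.

```python
import threading


def try_release_from_other_thread(lock):
    """Return True if a thread that never acquired `lock` manages to release it."""
    result = {}

    def worker():
        try:
            lock.release()
            result["released"] = True
        except RuntimeError:             # raised by RLock for a non-owner
            result["released"] = False

    t = threading.Thread(target=worker)
    t.start()
    t.join()
    return result["released"]


mutex = threading.RLock()
mutex.acquire()
mutex.acquire()                          # recursive take by the owner is allowed
mutex_released_by_other = try_release_from_other_thread(mutex)
mutex.release()
mutex.release()                          # one release per acquire

sem = threading.Semaphore(1)             # binary semaphore, no owner
sem.acquire()
sem_released_by_other = try_release_from_other_thread(sem)

print(mutex_released_by_other, sem_released_by_other)   # False True
```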

Classical IPC Problems


The operating systems literature is full of interprocess communication problems that have been
widely discussed using a variety of synchronization methods. In the following sections we will
examine several of the better-known problems.
The Dining Philosophers Problem
In 1965, Dijkstra posed and solved a synchronization problem he called the dining philosophers
problem. Since that time, everyone inventing yet another synchronization primitive has felt
obligated to demonstrate how wonderful the new primitive is by showing how elegantly it solves the
dining philosophers problem. The problem can be stated quite simply as follows. Five philosophers
are seated around a circular table. Each philosopher has a plate of spaghetti. The spaghetti is so
slippery that a philosopher needs two forks to eat it. Between each pair of plates is one fork. The
layout of the table is illustrated in Fig.

The life of a philosopher consists of alternate periods of eating and thinking. (This is something of
an abstraction, even for philosophers, but the other activities are irrelevant here.) When a
philosopher gets hungry, she tries to acquire her left and right fork, one at a time, in either order. If
successful in acquiring two forks, she eats for a while, then puts down the forks and continues to
think. The key question is: can you write a program for each philosopher that does what it is
supposed to do and never gets stuck? (It has been pointed out that the two-fork requirement is
somewhat artificial; perhaps we should switch from Italian to Chinese food, substituting rice for
spaghetti and chopsticks for forks.)
The obvious solution is for each philosopher to take her left fork and then her right fork. The procedure take_fork waits until the specified fork is available and then
seizes it. Unfortunately, the obvious solution is wrong. Suppose that all five philosophers take their
left forks simultaneously. None will be able to take their right forks, and there will be a deadlock.

We could modify the program so that after taking the left fork, the program checks to see if the right
fork is available. If it is not, the philosopher puts down the left one, waits for some time, and then
repeats the whole process. This proposal too, fails, although for a different reason. With a little bit
of bad luck, all the philosophers could start the algorithm simultaneously, picking up their left forks,
seeing that their right forks were not available, putting down their left forks, waiting, picking up their
left forks again simultaneously, and so on, forever. A situation like this, in which all the programs
continue to run indefinitely but fail to make any progress is called starvation. (It is called starvation
even when the problem does not occur in an Italian or a Chinese restaurant.)
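
One well-known way to avoid both failures is to impose a global order on the forks, so that every philosopher picks up the lower-numbered fork first. This is not the solution the text above builds toward; it is a minimal sketch of the resource-ordering technique in Python, and the names forks, meals, philosopher and dine are assumptions made for this illustration.

```python
import threading

N = 5
forks = [threading.Lock() for _ in range(N)]    # one fork between each pair of plates
meals = [0] * N                                 # meals[i] is written only by philosopher i


def philosopher(i, rounds):
    left, right = i, (i + 1) % N
    # Always take the lower-numbered fork first. With a single global order
    # on the forks, a circular chain of waiting philosophers cannot form.
    first, second = min(left, right), max(left, right)
    for _ in range(rounds):
        with forks[first]:
            with forks[second]:
                meals[i] += 1                   # eat
        # think (nothing to do in this sketch)


def dine(rounds=100):
    """Let all philosophers eat `rounds` times and return the meal counts."""
    for i in range(N):
        meals[i] = 0
    threads = [threading.Thread(target=philosopher, args=(i, rounds))
               for i in range(N)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return list(meals)


print(dine())      # [100, 100, 100, 100, 100]
```

Only the last philosopher behaves differently: her lower-numbered fork is the one on her right, which is enough to break the symmetry that caused the deadlock.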

The Readers and Writers Problem


In computer science, the readers-writers problems are examples of a common computing problem
in concurrency. There are at least three variations of the problems, which deal with situations in
which many threads try to access the same shared resource at one time. Some threads may read
and some may write, with the constraint that no process may access the share for either reading or
writing, while another process is in the act of writing to it. (In particular, it is allowed for two or more
readers to access the share at the same time.) A readers-writer lock is a data structure that solves
one or more of the readers-writers problems.
The simplest solution to the readers-writers problem uses only two semaphores and does not need an array of
readers to read the data in the buffer.

With this solution the writer could starve if readers keep arriving.
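
A two-semaphore solution of this kind can be sketched as follows. This is an illustration in Python, not necessarily the code the source had in mind: mutex protects the reader count, db gives writers (and the group of active readers) exclusive access, and all names are assumptions.

```python
import threading

mutex = threading.Semaphore(1)     # protects reader_count
db = threading.Semaphore(1)        # held by one writer, or by the readers as a group
reader_count = 0
shared_value = 0


def reader(seen):
    global reader_count
    mutex.acquire()
    reader_count += 1
    if reader_count == 1:
        db.acquire()               # first reader locks writers out
    mutex.release()

    seen.append(shared_value)      # read; other readers may be here too

    mutex.acquire()
    reader_count -= 1
    if reader_count == 0:
        db.release()               # last reader lets writers in again
    mutex.release()


def writer(increments):
    global shared_value
    for _ in range(increments):
        db.acquire()               # exclusive access while writing
        shared_value += 1
        db.release()


def run(writers=3, increments=100, readers=5):
    global shared_value, reader_count
    shared_value = reader_count = 0
    seen = []
    threads = [threading.Thread(target=writer, args=(increments,))
               for _ in range(writers)]
    threads += [threading.Thread(target=reader, args=(seen,))
                for _ in range(readers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared_value, seen


print(run()[0])                    # 300
```

Because db stays taken for as long as at least one reader is active, a steady stream of readers can keep a writer waiting indefinitely, which is the starvation noted above.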

Producer-consumer problem

In computing, the producer-consumer problem (also known as the bounded-buffer problem) is a
classic example of a multi-process synchronization problem. The problem describes two
processes, the producer and the consumer, who share a common, fixed-size buffer used as a
queue. The producer's job is to generate data, put it into the buffer, and start again. At the same
time, the consumer is consuming the data (i.e., removing it from the buffer), one piece at a time.
The problem is to make sure that the producer won't try to add data into the buffer if it's full and
that the consumer won't try to remove data from an empty buffer.

The solution for the producer is to either go to sleep or discard data if the buffer is full. The next
time the consumer removes an item from the buffer, it notifies the producer, who starts to fill the
buffer again. In the same way, the consumer can go to sleep if it finds the buffer to be empty. The
next time the producer puts data into the buffer, it wakes up the sleeping consumer. The solution
can be reached by means of inter-process communication, typically using semaphores. An
inadequate solution could result in a deadlock where both processes are waiting to be awakened.
The problem can also be generalized to have multiple producers and consumers.
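
The semaphore-based solution can be sketched as follows, with threads standing in for the two processes. The buffer capacity and the names empty_slots, full_slots, producer, consumer and run are assumptions made for this illustration.

```python
import threading
from collections import deque

CAPACITY = 3                                    # hypothetical buffer size
buffer = deque()
buffer_lock = threading.Lock()                  # protects the buffer itself
empty_slots = threading.Semaphore(CAPACITY)     # counts free slots
full_slots = threading.Semaphore(0)             # counts filled slots


def producer(items):
    for item in items:
        empty_slots.acquire()                   # sleep while the buffer is full
        with buffer_lock:
            buffer.append(item)
        full_slots.release()                    # wake a sleeping consumer


def consumer(count, out):
    for _ in range(count):
        full_slots.acquire()                    # sleep while the buffer is empty
        with buffer_lock:
            item = buffer.popleft()
        empty_slots.release()                   # wake a sleeping producer
        out.append(item)


def run(n=20):
    """Move n items through the bounded buffer and return them in arrival order."""
    out = []
    p = threading.Thread(target=producer, args=(range(n),))
    c = threading.Thread(target=consumer, args=(n, out))
    p.start()
    c.start()
    p.join()
    c.join()
    return out


print(run(5))                                   # [0, 1, 2, 3, 4]
```

The two counting semaphores replace explicit sleep and wake-up calls, so a wake-up cannot be lost between checking the buffer and going to sleep.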

Sleeping barber problem

In computer science, the sleeping barber problem is a classic inter-process communication and
synchronization problem between multiple operating system processes. The problem is analogous
to that of keeping a barber working when there are customers, resting when there are none, and
doing so in an orderly manner.
The analogy is based upon a hypothetical barber shop with one barber. The barber has one barber
chair and a waiting room with a number of chairs in it. When the barber finishes cutting a
customer's hair, he dismisses the customer and then goes to the waiting room to see if there are
other customers waiting. If there are, he brings one of them back to the chair and cuts his hair. If
there are no other customers waiting, he returns to his chair and sleeps in it.

Each customer, when he arrives, looks to see what the barber is doing. If the barber is sleeping,
then the customer wakes him up and sits in the chair. If the barber is cutting hair, then the
customer goes to the waiting room. If there is a free chair in the waiting room, the customer sits in
it and waits his turn. If there is no free chair, then the customer leaves.

Based on a naive analysis, the above description should ensure that the shop functions correctly,
with the barber cutting the hair of anyone who arrives until there are no more customers, and then
sleeping until the next customer arrives. In practice, there are a number of problems that can occur
that are illustrative of general scheduling problems.

The problems are all related to the fact that the actions by both the barber and the customer
(checking the waiting room, entering the shop, taking a waiting room chair, etc.) all take an
unknown amount of time. For example, a customer may arrive and observe that the barber is
cutting hair, so he goes to the waiting room. While he is on his way, the barber finishes the haircut
he is doing and goes to check the waiting room. Since there is no one there (the customer not
having arrived yet), he goes back to his chair and sleeps. The barber is now waiting for a customer
and the customer is waiting for the barber. In another example, two customers may arrive at the
same time when there happens to be a single seat in the waiting room. They observe that the
barber is cutting hair, go to the waiting room, and both attempt to occupy the single chair.

The Sleeping Barber Problem is often attributed to Edsger Dijkstra (1965), one of the pioneers in
computer science.

Many possible solutions are available. The key element of each is a mutex, which ensures that
only one of the participants can change state at once. The barber must acquire this mutual
exclusion before checking for customers and release it when he begins either to sleep or cut hair.
A customer must acquire it before entering the shop and release it once he is sitting in either a
waiting room chair or the barber chair, and also when he leaves the shop because no seats were
available. This eliminates both of the problems mentioned in the previous section. A number of
semaphores is also required to indicate the state of the system. For example, one might store the
number of people in the waiting room.
Solution
If the barber is asleep then the customers must wake him up.
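
One common arrangement of the mutex and semaphores is sketched below in Python. It is an illustration under stated assumptions, not the only possible solution: the number of chairs, the closing flag used to let the barber thread stop, and all names are invented for this example.

```python
import threading

CHAIRS = 3                                # hypothetical number of waiting-room chairs
customers = threading.Semaphore(0)        # barber sleeps on this until someone arrives
barber_ready = threading.Semaphore(0)     # a waiting customer sleeps on this
mutex = threading.Lock()                  # protects the counters below
waiting = served = turned_away = 0
closing = False


def barber():
    global waiting
    while True:
        customers.acquire()               # sleep until woken by a customer
        with mutex:
            if closing and waiting == 0:
                return                    # shop is closed and nobody is waiting
            waiting -= 1
        barber_ready.release()            # call the next customer to the chair


def customer():
    global waiting, served, turned_away
    with mutex:
        if waiting >= CHAIRS:
            turned_away += 1              # no free chair: leave the shop
            return
        waiting += 1
    customers.release()                   # wake the barber if he is asleep
    barber_ready.acquire()                # wait for the barber
    with mutex:
        served += 1


def run(total_customers=10):
    global waiting, served, turned_away, closing
    waiting = served = turned_away = 0
    closing = False
    b = threading.Thread(target=barber)
    b.start()
    cs = [threading.Thread(target=customer) for _ in range(total_customers)]
    for c in cs:
        c.start()
    for c in cs:
        c.join()
    with mutex:
        closing = True
    customers.release()                   # wake the barber one last time to close up
    b.join()
    return served, turned_away
```

Checking and updating waiting under the mutex removes both races described earlier: a customer cannot be missed on the way to the waiting room, and two customers cannot claim the same chair.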

Deadlock
A deadlock is a situation in which two computer programs sharing the same resource are
effectively preventing each other from accessing the resource, resulting in both programs ceasing
to function.
The earliest computer operating systems ran only one program at a time. All of the resources of
the system were available to this one program. Later, operating systems ran multiple programs at
once, interleaving them. Programs were required to specify in advance what resources they
needed so that they could avoid conflicts with other programs running at the same time. Eventually
some operating systems offered dynamic allocation of resources. Programs could request further
allocations of resources after they had begun running. This led to the problem of the deadlock.

System Model

For the purposes of deadlock discussion, a system can be modeled as a collection of limited
resources, which can be partitioned into different categories, to be allocated to a number of
processes, each having different needs.
Resource categories may include memory, printers, CPUs, open files, tape drives, CD-ROMS,
etc.
By definition, all the resources within a category are equivalent, and a request of this category
can be equally satisfied by any one of the resources in that category. If this is not the case ( i.e.
if there is some difference between the resources within a category ), then that category needs
to be further divided into separate categories. For example, "printers" may need to be separated
into "laser printers" and "color inkjet printers".
Some categories may have a single resource.
In normal operation a process must request a resource before using it, and release it when it is
done, in the following sequence:
Request - If the request cannot be immediately granted, then the process must wait until the
resource(s) it needs become available. For example the system calls open( ), malloc( ), new( ),
and request( ).
Use - The process uses the resource, e.g. prints to the printer or reads from the file.
Release - The process relinquishes the resource, so that it becomes available for other
processes. For example, close( ), free( ), delete( ), and release( ).
For all kernel-managed resources, the kernel keeps track of what resources are free and which
are allocated, to which process they are allocated, and a queue of processes waiting for this
resource to become available. Application-managed resources can be controlled using mutexes
or wait( ) and signal( ) calls, ( i.e. binary or counting semaphores. )
A set of processes is deadlocked when every process in the set is waiting for a resource that is
currently allocated to another process in the set ( and which can only be released when that
other waiting process makes progress. )

Necessary Conditions

There are four conditions that are necessary to achieve deadlock:


1) Mutual Exclusion - At least one resource must be held in a non-sharable mode; if any other
process requests this resource, then that process must wait for the resource to be released.
2) Hold and Wait - A process must be simultaneously holding at least one resource and waiting for
at least one resource that is currently being held by some other process.
3) No preemption - Once a process is holding a resource ( i.e. once its request has been granted ),
then that resource cannot be taken away from that process until the process voluntarily releases it.
4) Circular Wait - A set of processes { P0, P1, P2, . . ., PN } must exist such that every P[ i ] is
waiting for P[ ( i + 1 ) % ( N + 1 ) ]. ( Note that this condition implies the hold-and-wait condition, but
it is easier to deal with the conditions if the four are considered separately. )

Resource-Allocation Graph

In some cases deadlocks can be understood more clearly through the use of Resource-
Allocation Graphs, having the following properties:
- A set of resource categories, { R1, R2, R3, . . ., RN }, which appear as square nodes on the
graph. Dots inside the resource nodes indicate specific instances of the resource. ( E.g. two dots
might represent two laser printers. )
- A set of processes, { P1, P2, P3, . . ., PN }
- Request Edges - A set of directed arcs from Pi to Rj, indicating that process Pi has requested Rj,
and is currently waiting for that resource to become available.
- Assignment Edges - A set of directed arcs from Rj to Pi indicating that resource Rj has been
allocated to process Pi, and that Pi is currently holding resource Rj.
- Note that a request edge can be converted into an assignment edge by reversing the direction of
the arc when the request is granted. ( However note also that request edges point to the
category box, whereas assignment edges emanate from a particular instance dot within the box.
)
- For example:
Resource allocation graph

If a resource-allocation graph contains no cycles, then the system is not deadlocked. ( When
looking for cycles, remember that these are directed graphs. ) See the example in Figure 7.2
above.
If a resource-allocation graph does contain cycles AND each resource category contains only a
single instance, then a deadlock exists.
If a resource category contains more than one instance, then the presence of a cycle in the
resource-allocation graph indicates the possibility of a deadlock, but does not guarantee one.
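
For the single-instance case, checking for deadlock reduces to finding a cycle in a directed graph. The following is a minimal sketch in Python using depth-first search; the dictionary representation and the two small example graphs are assumptions made for this illustration and are not the figures referred to in the text.

```python
def has_cycle(graph):
    """Return True if the directed graph {node: [successors]} contains a cycle."""
    WHITE, GREY, BLACK = 0, 1, 2          # unvisited, on the current path, finished
    nodes = set(graph) | {v for vs in graph.values() for v in vs}
    color = {n: WHITE for n in nodes}

    def visit(node):
        color[node] = GREY
        for nxt in graph.get(node, ()):
            if color[nxt] == GREY:        # edge back into the current path
                return True
            if color[nxt] == WHITE and visit(nxt):
                return True
        color[node] = BLACK
        return False

    return any(color[n] == WHITE and visit(n) for n in nodes)


# Request edges go from a process to a resource, assignment edges from a
# resource to a process. Every resource here has a single instance.
deadlocked = {"P1": ["R1"], "R1": ["P2"], "P2": ["R2"], "R2": ["P1"]}
no_deadlock = {"P1": ["R1"], "R1": ["P2"], "R2": ["P1"]}

print(has_cycle(deadlocked))    # True
print(has_cycle(no_deadlock))   # False
```

When a category has several instances, a True result only signals a possible deadlock, as stated above.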

Consider, for example, Figures below:


Resource allocation graph with a deadlock

Resource allocation graph with a cycle but no deadlock

Methods for Handling Deadlocks

Generally speaking there are three ways of handling deadlocks:


A. Deadlock prevention or avoidance - Do not allow the system to get into a deadlocked state.
B. Deadlock detection and recovery - Abort a process or preempt some resources when
deadlocks are detected.
C. Ignore the problem altogether - If deadlocks only occur once a year or so, it may be better
to simply let them happen and reboot as necessary than to incur the constant overhead and
system performance penalties associated with deadlock prevention or detection. This is the
approach that both Windows and UNIX take.
In order to avoid deadlocks, the system must have additional information about all processes. In
particular, the system must know what resources a process will or may request in the future. (
Ranging from a simple worst-case maximum to a complete resource request and release plan
for each process, depending on the particular algorithm. )
Deadlock detection is fairly straightforward, but deadlock recovery requires either aborting
processes or preempting resources, neither of which is an attractive alternative.
If deadlocks are neither prevented nor detected, then when a deadlock occurs the system will
gradually slow down, as more and more processes become stuck waiting for resources currently
held by the deadlocked processes and by other waiting processes. Unfortunately this slowdown can be
indistinguishable from a general system slowdown when a real-time process has heavy
computing needs.
Deadlock Prevention

Deadlocks can be prevented by preventing at least one of the four required conditions:
Mutual Exclusion
Shared resources such as read-only files do not lead to deadlocks.
Unfortunately some resources, such as printers and tape drives, require exclusive access by a
single process.
Hold and Wait

To prevent this condition processes must be prevented from holding one or more resources
while simultaneously waiting for one or more others. There are several possibilities for this:
Require that all processes request all resources at one time. This can be wasteful of system
resources if a process needs one resource early in its execution and doesn't need some other
resource until much later.
Require that processes holding resources must release them before requesting new resources,
and then re-acquire the released resources along with the new ones in a single new request.
This can be a problem if a process has partially completed an operation using a resource and
then fails to get it re-allocated after releasing it.
Either of the methods described above can lead to starvation if a process requires one or more
popular resources.
No Preemption

Preemption of process resource allocations can prevent this condition of deadlocks, when it is
possible.
One approach is that if a process is forced to wait when requesting a new resource, then all
other resources previously held by this process are implicitly released, ( preempted ), forcing this
process to re-acquire the old resources along with the new resources in a single request, similar
to the previous discussion.
Another approach is that when a resource is requested and not available, then the system looks
to see what other processes currently have those resources and are themselves blocked waiting
for some other resource. If such a process is found, then some of their resources may get
preempted and added to the list of resources for which the process is waiting.
Either of these approaches may be applicable for resources whose states are easily saved and
restored, such as registers and memory, but are generally not applicable to other devices such
as printers and tape drives.
Circular Wait

One way to avoid circular wait is to number all resources, and to require that processes request
resources only in strictly increasing ( or decreasing ) order.
In other words, in order to request resource Rj, a process must first release all Ri such that i >=
j.
One big challenge in this scheme is determining the relative ordering of the different resources.
Deadlock Avoidance

The general idea behind deadlock avoidance is to prevent deadlocks from ever happening, by
preventing at least one of the aforementioned conditions.
This requires more information about each process, AND tends to lead to low device utilization. (
I.e. it is a conservative approach. )
In some algorithms the scheduler only needs to know the maximum number of each resource
that a process might potentially use. In more complex algorithms the scheduler can also take
advantage of the schedule of exactly what resources may be needed in what order.
When a scheduler sees that starting a process or granting resource requests may lead to future
deadlocks, then that process is just not started or the request is not granted.
A resource allocation state is defined by the number of available and allocated resources, and
the maximum requirements of all processes in the system.
Safe State

A state is safe if the system can allocate all resources requested by all processes ( up to their
stated maximums ) without entering a deadlock state.
More formally, a state is safe if there exists a safe sequence of processes { P0, P1, P2, ..., PN }
such that all of the resource requests for Pi can be granted using the resources currently
available plus the resources held by all processes Pj where j < i. ( I.e. if all the processes prior to Pi finish and free
up their resources, then Pi will be able to finish also, using the resources that they have freed
up. )
If a safe sequence does not exist, then the system is in an unsafe state, which MAY lead to
deadlock. ( All safe states are deadlock free, but not all unsafe states lead to deadlocks. )

Safe, unsafe, and deadlocked state spaces.

For example, consider a system with 12 tape drives, allocated as follows. Is this a safe state?

What is the safe sequence?


What happens to the above table if process P2 requests and is granted one more tape
drive?
Key to the safe state approach is that when a request is made for resources, the request is
granted only if the resulting allocation state is a safe one.
Resource-Allocation Graph Algorithm
If resource categories have only single instances of their resources, then deadlock states can be
detected by cycles in the resource-allocation graphs.
In this case, unsafe states can be recognized and avoided by augmenting the resource-
allocation graph with claim edges, noted by dashed lines, which point from a process to a
resource that it may request in the future.
In order for this technique to work, all claim edges must be added to the graph for any particular
process before that process is allowed to request any resources. ( Alternatively, processes may
only make requests for resources for which they have already established claim edges, and
claim edges cannot be added to any process that is currently holding resources. )
When a process makes a request, the claim edge Pi->Rj is converted to a request edge.
Similarly when a resource is released, the assignment reverts back to a claim edge.

This approach works by denying requests that would produce cycles in the resource-allocation
graph, taking claim edges into account.
Consider for example what happens when process P2 requests resource R2:

Resource allocation graph for deadlock avoidance

The resulting resource-allocation graph would have a cycle in it, and so the request cannot be
granted.

An unsafe state in a resource allocation graph

Banker's Algorithm
For resource categories that contain more than one instance the resource-allocation graph
method does not work, and more complex ( and less efficient ) methods must be chosen.
The Banker's Algorithm gets its name because it is a method that bankers could use to assure
that when they lend out resources they will still be able to satisfy all their clients. ( A banker won't
loan out a little money to start building a house unless they are assured that they will later be
able to loan out the rest of the money to finish the house. )
When a process starts up, it must state in advance the maximum allocation of resources it may
request, up to the amount available on the system.
When a request is made, the scheduler determines whether granting the request would leave
the system in a safe state. If not, then the process must wait until the request can be granted
safely.
The banker's algorithm relies on several key data structures: ( where n is the number of
processes and m is the number of resource categories. )
I. Available[ m ] indicates how many resources are currently available of each type.
II. Max[ n ][ m ] indicates the maximum demand of each process of each resource.
III. Allocation[ n ][ m ] indicates the number of each resource category allocated to each process.
IV. Need[ n ][ m ] indicates the remaining resources needed of each type for each process. ( Note
that Need[ i ][ j ] = Max[ i ][ j ] - Allocation[ i ][ j ] for all i, j. )
For simplification of discussions, we make the following notations / observations:
- One row of the Need vector, Need[ i ], can be treated as a vector corresponding to the needs of
process i, and similarly for Allocation and Max.
- A vector X is considered to be <= a vector Y if X[ i ] <= Y[ i ] for all i.
Safety Algorithm

In order to apply the Banker's algorithm, we first need an algorithm for determining whether or
not a particular state is safe.
This algorithm determines if the current state of a system is safe, according to the following
steps:
1. Let Work and Finish be vectors of length m and n respectively.
Work is a working copy of the available resources, which will be modified during the analysis.
Finish is a vector of booleans indicating whether a particular process can finish. ( or has
finished so far in the analysis. )
Initialize Work to Available, and Finish to false for all elements.
2. Find an i such that both (A) Finish[ i ] == false, and (B) Need[ i ] <= Work. This process has
not finished, but could with the given available working set. If no such i exists, go to step 4.
3. Set Work = Work + Allocation[ i ], and set Finish[ i ] to true. This corresponds to process i
finishing up and releasing its resources back into the work pool. Then loop back to step 2.
4. If finish[ i ] == true for all i, then the state is a safe state, because a safe sequence has
been found.
( JTB's Modification:
In step 1. instead of making Finish an array of booleans initialized to false, make it an array of
ints initialized to 0. Also initialize an int s = 0 as a step counter.
In step 2, look for Finish[ i ] == 0.
In step 3, set Finish[ i ] to ++s. ( s counts the number of finished processes. )
For step 4, the test can be either Finish[ i ] > 0 for all i, or s >= n. The benefit of this method is
that if the state is safe, then Finish[ ] indicates one safe sequence ( of possibly many. ) )
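The four steps above can be sketched in Python as follows. This is a minimal illustration rather than a definitive implementation: the function name is_safe and the five-process, three-resource state in the example are assumptions made for this sketch, and the returned list plays the role of the ordering recorded in Finish[ ] under JTB's modification.

```python
def is_safe(available, allocation, need):
    """Return (safe, sequence) for a system state.

    available: list of length m; allocation and need: n lists of length m.
    sequence lists processes in the order they could finish.
    """
    n, m = len(allocation), len(available)
    work = list(available)      # step 1: Work starts as a copy of Available
    finish = [False] * n        # step 1: no process has finished yet
    sequence = []
    while True:
        # Step 2: first unfinished process whose Need[i] <= Work.
        i = next((p for p in range(n)
                  if not finish[p]
                  and all(need[p][j] <= work[j] for j in range(m))), None)
        if i is None:
            break               # no candidate left: go to step 4
        # Step 3: process i finishes and returns its allocation.
        for j in range(m):
            work[j] += allocation[i][j]
        finish[i] = True
        sequence.append(i)
    return all(finish), sequence  # step 4


# Hypothetical example state (an assumption, not taken from the text).
available = [3, 3, 2]
allocation = [[0, 1, 0], [2, 0, 0], [3, 0, 2], [2, 1, 1], [0, 0, 2]]
maximum = [[7, 5, 3], [3, 2, 2], [9, 0, 2], [2, 2, 2], [4, 3, 3]]
need = [[maximum[i][j] - allocation[i][j] for j in range(3)]
        for i in range(5)]

safe, order = is_safe(available, allocation, need)
print(safe, order)
```

Because the scan restarts from process 0 after every release, the sequence found depends on process numbering; other safe sequences may exist for the same state.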
Resource-Request Algorithm ( The Bankers Algorithm )
Now that we have a tool for determining if a particular state is safe or not, we are now ready to
look at the Banker's algorithm itself.
This algorithm determines if a new request is safe, and grants it only if it is safe to do so.
When a request is made ( that does not exceed currently available resources ), pretend it has
been granted, and then see if the resulting state is a safe one. If so, grant the request, and if not,
deny the request, as follows:
1. Let Request[ n ][ m ] indicate the number of resources of each type currently requested
by processes. If Request[ i ] > Need[ i ] for any process i, raise an error condition.
2. If Request[ i ] > Available for any process i, then that process must wait for resources to
become available. Otherwise the process can continue to step 3.
3. Check to see if the request can be granted safely, by pretending it has been granted
and then seeing if the resulting state is safe. If so, grant the request, and if not, then the
process must wait until its request can be granted safely. The procedure for granting a
request ( or pretending to for testing purposes ) is:
Available = Available - Request
Allocation = Allocation + Request
Need = Need - Request
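A hedged Python sketch of the resource-request steps follows. The names request_resources and is_safe, the return values "granted" and "wait", and the example state are all assumptions made for illustration. The pretend-grant is carried out on copies so that the real state changes only when the resulting state is safe.

```python
def is_safe(available, allocation, need):
    """Safety check: True if all processes can finish in some order."""
    work, pending = list(available), set(range(len(allocation)))
    progressed = True
    while pending and progressed:
        progressed = False
        for i in sorted(pending):
            if all(n <= w for n, w in zip(need[i], work)):
                work = [w + a for w, a in zip(work, allocation[i])]
                pending.remove(i)
                progressed = True
                break
    return not pending


def request_resources(i, request, available, allocation, need):
    """Handle a request vector from process i; state changes only on a grant."""
    # Step 1: a request beyond the declared need is an error.
    if any(r > n for r, n in zip(request, need[i])):
        raise ValueError("request exceeds remaining need")
    # Step 2: not enough resources right now, so the process waits.
    if any(r > a for r, a in zip(request, available)):
        return "wait"
    # Step 3: pretend to grant the request on copies of the state.
    new_available = [a - r for a, r in zip(available, request)]
    new_allocation = [row[:] for row in allocation]
    new_need = [row[:] for row in need]
    new_allocation[i] = [a + r for a, r in zip(allocation[i], request)]
    new_need[i] = [n - r for n, r in zip(need[i], request)]
    if not is_safe(new_available, new_allocation, new_need):
        return "wait"           # granting would leave an unsafe state
    available[:] = new_available
    allocation[i] = new_allocation[i]
    need[i] = new_need[i]
    return "granted"


# Hypothetical example state (an assumption, not taken from the text).
available = [3, 3, 2]
allocation = [[0, 1, 0], [2, 0, 0], [3, 0, 2], [2, 1, 1], [0, 0, 2]]
need = [[7, 4, 3], [1, 2, 2], [6, 0, 0], [0, 1, 1], [4, 3, 1]]

result = request_resources(1, [1, 0, 2], available, allocation, need)
print(result, available)
```

In this example state the first request is safe to grant. A later request of [ 0, 2, 0 ] by process 0 would fit within Available but would be refused, because no process could then be guaranteed to finish.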
Deadlock Detection

If deadlocks are not avoided, then another approach is to detect when they have occurred and
recover somehow.
In addition to the performance hit of constantly checking for deadlocks, a policy / algorithm must
be in place for recovering from deadlocks, and there is potential for lost work when processes
must be aborted or have their resources preempted.
Single Instance of Each Resource Type

If each resource category has a single instance, then we can use a variation of the resource-
allocation graph known as a wait-for graph.
A wait-for graph can be constructed from a resource-allocation graph by eliminating the
resources and collapsing the associated edges, as shown in the figure below.
An arc from Pi to Pj in a wait-for graph indicates that process Pi is waiting for a resource that
process Pj is currently holding.

(a) Resource allocation graph. (b) Corresponding wait-for graph

As before, cycles in the wait-for graph indicate deadlocks.


This algorithm must maintain the wait-for graph, and periodically search it for cycles.
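Searching the wait-for graph for cycles can be done with a depth-first search. The sketch below is one possible approach, assuming the graph is stored as a dictionary that maps each process name to the processes it is waiting on; the function name has_deadlock and the process names are hypothetical.

```python
def has_deadlock(wait_for):
    """Return True if the wait-for graph contains a cycle."""
    WHITE, GRAY, BLACK = 0, 1, 2    # unvisited, on current path, done
    color = {}

    def visit(p):
        color[p] = GRAY
        for q in wait_for.get(p, ()):
            state = color.get(q, WHITE)
            if state == GRAY:       # back edge: q is already on this path
                return True
            if state == WHITE and visit(q):
                return True
        color[p] = BLACK
        return False

    return any(color.get(p, WHITE) == WHITE and visit(p) for p in wait_for)


# P1 waits for P2, P2 for P3, and P3 for P1: a cycle, hence a deadlock.
cyclic = {"P1": ["P2"], "P2": ["P3"], "P3": ["P1"]}
# Same processes with P3 waiting on nothing: no cycle.
acyclic = {"P1": ["P2", "P3"], "P2": ["P3"], "P3": []}
print(has_deadlock(cyclic), has_deadlock(acyclic))
```

Each node and edge is visited at most once, so one search costs time proportional to the number of processes plus the number of wait-for edges.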
Detection-Algorithm Usage

When should the deadlock detection be done? Frequently, or infrequently?


The answer may depend on how frequently deadlocks are expected to occur, as well as the
possible consequences of not catching them immediately. ( If deadlocks are not removed
immediately when they occur, then more and more processes can "back up" behind the
deadlock, making the eventual task of unblocking the system more difficult and possibly
damaging to more processes. )
There are two obvious approaches, each with trade-offs:
1. Do deadlock detection after every resource allocation which cannot be immediately granted.
This has the advantage of detecting the deadlock right away, while the minimum number of
processes are involved in the deadlock. ( One might consider that the process whose request
triggered the deadlock condition is the "cause" of the deadlock, but realistically all of the
processes in the cycle are equally responsible for the resulting deadlock. ) The down side of
this approach is the extensive overhead and performance hit caused by checking for deadlocks
so frequently.
2. Do deadlock detection only when there is some clue that a deadlock may have occurred, such
as when CPU utilization reduces to 40% or some other magic number. The advantage is that
deadlock detection is done much less frequently, but the down side is that it becomes
impossible to detect the processes involved in the original deadlock, and so deadlock recovery
can be more complicated and damaging to more processes.
3. ( As I write this, a third alternative comes to mind: Keep a historical log of resource allocations,
since the last known time of no deadlocks. Do deadlock checks periodically ( once an hour or
when CPU usage is low?), and then use the historical log to trace through and determine when
the deadlock occurred and what processes caused the initial deadlock. Unfortunately I'm not
certain that breaking the original deadlock would then free up the resulting log jam. )
Recovery From Deadlock

There are three basic approaches to recovery from deadlock:


1. Inform the system operator, and allow the operator to intervene manually.
2. Terminate one or more processes involved in the deadlock
3. Preempt resources.
Process Termination

Two basic approaches, both of which recover resources allocated to terminated processes:
Terminate all processes involved in the deadlock. This definitely solves the deadlock, but at the
expense of terminating more processes than would be absolutely necessary.
Terminate processes one by one until the deadlock is broken. This is more conservative, but
requires doing deadlock detection after each step.
In the latter case there are many factors that can go into deciding which processes to terminate
next:
Process priorities.
How long the process has been running, and how close it is to finishing.
How many and what type of resources is the process holding. ( Are they easy to preempt and
restore? )
How many more resources does the process need to complete.
How many processes will need to be terminated
Whether the process is interactive or batch.
( Whether or not the process has made non-restorable changes to any resource. )
Resource Preemption
When preempting resources to relieve deadlock, there are three important issues to be addressed:
1. Selecting a victim - Deciding which resources to preempt from which processes involves many
of the same decision criteria outlined above.
2. Rollback - Ideally one would like to roll back a preempted process to a safe state prior to the
point at which that resource was originally allocated to the process. Unfortunately it can be
difficult or impossible to determine what such a safe state is, and so the only safe rollback is to
roll back all the way back to the beginning. ( I.e. abort the process and make it start over. )
3. Starvation - How do you guarantee that a process won't starve because its resources are
constantly being preempted? One option would be to use a priority system, and increase the
priority of a process every time its resources get preempted. Eventually it should get a high
enough priority that it won't get preempted any more.

Operating System - Memory Management

Memory Partitioning:
1. Fixed Partitioning:
Main memory is divided into a number of static partitions at system generation time. A process may be
loaded into a partition of equal or greater size.
The memory manager allocates to a process the region that best fits it.
Unused memory within an allocated partition is called internal fragmentation.
Advantages:
Simple to implement
Little OS overhead
Disadvantages:
Inefficient use of memory due to internal fragmentation. Any program, no matter how small,
occupies an entire partition. This phenomenon, in which space inside a partition is wasted
because the block of data loaded is smaller than the partition, is referred to as internal
fragmentation.
Two possibilities:
a). Equal size partitioning
b). Unequal size Partition
Not suitable for systems in which process memory requirements are not known ahead of time, i.e.
timesharing systems.
A disadvantage of keeping a separate queue for each partition becomes apparent when the queue
for a large partition is empty but the queue for a small partition is full, as is the case for
partitions 1 and 3. Here small jobs have to wait to get into memory, even though plenty of
memory is free.
An alternative organization is to maintain a single queue, as in the figure. Whenever a partition
becomes free, the job closest to the front of the queue that fits in it can be loaded into the empty
partition and run.
Dynamic/Variable Partitioning:
To overcome some of the difficulties with fixed partitioning, an approach known as dynamic
partitioning was developed. The partitions are of variable length and number. When a process is
brought into main memory, it is allocated exactly as much memory as it requires and no more. An
example, using 64 Mbytes of main memory, is shown in the figure.
Eventually this leads to a situation in which there are a lot of small holes in memory. As time
goes on, memory becomes more and more fragmented, and memory utilization declines. This phenomenon is
referred to as external fragmentation, indicating that the memory that is external to all partitions
becomes increasingly fragmented. One technique for overcoming external fragmentation is
compaction: From time to time, the operating system shifts the processes so that they are
contiguous and so that all of the free memory is together in one block. For example, in the figure,
compaction will result in a block of free memory of length 16M.

This may well be sufficient to load in an additional process. The difficulty with compaction is that it
is a time-consuming procedure and wasteful of processor time.
Swapping
Swapping is a mechanism in which a process can be temporarily swapped (moved) out of main memory
to secondary storage (disk), making that memory available to other processes. At some
later time, the system swaps the process back from secondary storage into main memory.

Although swapping usually costs performance, it helps in running multiple large processes
concurrently, and for that reason swapping is also known as a technique for memory
compaction.

The total time taken by swapping includes the time it takes to move the entire process to
secondary storage and then to copy the process back to memory, as well as the time the process
takes to regain main memory.

Let us assume that the user process is of size 2048 KB and that the hard disk where swapping
takes place has a data transfer rate of around 1 MB (1024 KB) per second. The actual transfer of
the 2048 KB process to or from memory will take

2048 KB / 1024 KB per second
= 2 seconds
= 2000 milliseconds

Now considering both swap-out and swap-in time, it will take a total of 4000 milliseconds plus
other overhead while the process competes to regain main memory.
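The same arithmetic can be written as a small Python helper. This is only a sketch: the function name swap_transfer_ms is hypothetical, and it deliberately ignores the extra overhead of competing for main memory.

```python
def swap_transfer_ms(process_kb, rate_kb_per_s):
    """One-way transfer time in milliseconds for a process of the given size."""
    return process_kb / rate_kb_per_s * 1000


one_way = swap_transfer_ms(2048, 1024)   # swap out OR swap in
round_trip = 2 * one_way                 # swap out AND swap in
print(one_way, round_trip)
```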

Fragmentation
As processes are loaded and removed from memory, the free memory space is broken into little
pieces. After some time, processes cannot be allocated to the free memory blocks because the
blocks are too small, and those blocks remain unused. This problem is known as
fragmentation.

Fragmentation is of two types:

1. External fragmentation
Total memory space is enough to satisfy a request or to hold a process, but it is not
contiguous, so it cannot be used.

2. Internal fragmentation
The memory block assigned to a process is bigger than the process needs. Some portion of
memory is left unused, as it cannot be used by another process.

The following diagram shows how fragmentation can cause waste of memory and how a compaction
technique can be used to create more free memory out of fragmented memory.

External fragmentation can be reduced by compaction, i.e. shuffling memory contents to place all
free memory together in one large block. To make compaction feasible, relocation should be dynamic.

Internal fragmentation can be reduced by assigning the smallest partition that is still large
enough for the process.

Paging
A computer can address more memory than the amount physically installed on the system. This
extra memory is called virtual memory, and it is a section of a hard disk that is set up to emulate
the computer's RAM. The paging technique plays an important role in implementing virtual memory.

Paging is a memory management technique in which process address space is broken into blocks
of the same size called pages (size is power of 2, between 512 bytes and 8192 bytes). The size of
the process is measured in the number of pages.

Similarly, main memory is divided into small fixed-sized blocks of (physical) memory called frames
and the size of a frame is kept the same as that of a page to have optimum utilization of the main
memory and to avoid external fragmentation.

Address Translation
Page address is called logical address and represented by page number and the offset.

Logical Address = Page number + page offset


Frame address is called physical address and represented by a frame number and the offset.

Physical Address = Frame number + page offset


A data structure called the page map table is used to keep track of the relation between a page of a
process and a frame in physical memory.
When the system allocates a frame to any page, it translates the logical address into a physical
address and creates an entry in the page table to be used throughout execution of the program.
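The translation can be sketched as follows. The "+" in the two formulas above is really a concatenation of bit fields, which is equivalent to frame number * page size + offset. The page size of 1024 bytes, the page table contents, and the names translate and PageFault are assumptions made for this sketch.

```python
PAGE_SIZE = 1024                 # assumed page size in bytes (a power of 2)

# Hypothetical page map table: page number -> frame number.
page_table = {0: 5, 1: 2, 2: 7}


class PageFault(Exception):
    """Raised when the referenced page has no frame in main memory."""


def translate(logical_address):
    """Map a logical address to a physical address through the page table."""
    page, offset = divmod(logical_address, PAGE_SIZE)
    if page not in page_table:
        raise PageFault(f"page {page} is not in main memory")
    frame = page_table[page]
    return frame * PAGE_SIZE + offset   # offset is unchanged by translation


print(translate(1030))   # page 1, offset 6 -> frame 2, offset 6
```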

When a process is to be executed, its corresponding pages are loaded into any available memory
frames. Suppose you have a program of 8 KB but your memory can accommodate only 5 KB at a
given point in time; this is where paging comes into the picture. When a computer runs out of
RAM, the operating system (OS) moves idle or unwanted pages of memory to secondary
memory to free up RAM for other processes, and brings them back when needed by the program.

This continues during the whole execution of the program: the OS keeps removing
idle pages from main memory, writing them to secondary memory, and bringing them back
when required by the program.

Advantages and Disadvantages of Paging


Here is a list of advantages and disadvantages of paging:

Paging reduces external fragmentation, but still suffers from internal fragmentation.
Paging is simple to implement and is regarded as an efficient memory management technique.
Because pages and frames are of equal size, swapping becomes very easy.
The page table requires extra memory space, so paging may not be good for a system with a small RAM.
Segmentation
Segmentation is a memory management technique in which each job is divided into several
segments of different sizes, one for each module that contains pieces that perform related
functions. Each segment is actually a different logical address space of the program.

When a process is to be executed, its corresponding segments are loaded into non-contiguous
memory, though every segment is loaded into a contiguous block of available memory.

Segmentation works very similarly to paging, but segments are of
variable length whereas pages are of fixed size.

A program segment contains the program's main function, utility functions, data structures, and so
on. The operating system maintains a segment map table for every process and a list of free
memory blocks along with segment numbers, their size and corresponding memory locations in
main memory. For each segment, the table stores the starting address of the segment and the
length of the segment. A reference to a memory location includes a value that identifies a segment
and an offset.
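A segment map table lookup can be sketched as below. The table contents and the names segment_table and translate are hypothetical. The bounds check against the segment length is what makes segments of different sizes safe to use.

```python
# Hypothetical segment map table: segment number -> (base address, length).
segment_table = {
    0: (1400, 1000),
    1: (6300, 400),
    2: (4300, 400),
}


def translate(segment, offset):
    """Map a (segment, offset) reference to a physical address."""
    base, length = segment_table[segment]
    if not 0 <= offset < length:
        # The offset lies outside the segment: an addressing error.
        raise ValueError(f"offset {offset} outside segment {segment}")
    return base + offset


print(translate(2, 53))   # base 4300 + offset 53
```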

Virtual Memory
A computer can address more memory than the amount physically installed on the system. This
extra memory is actually called virtual memory and it is a section of a hard disk that's set up to
emulate the computer's RAM.

The main visible advantage of this scheme is that programs can be larger than physical memory.
Virtual memory serves two purposes. First, it allows us to extend the use of physical memory by
using disk. Second, it allows us to have memory protection, because each virtual address is
translated to a physical address.

Following are situations in which the entire program does not need to be fully loaded in main
memory.

User-written error handling routines are used only when an error occurs in the data or
computation.
Certain options and features of a program may be used rarely.
Many tables are assigned a fixed amount of address space even though only a small amount of
the table is actually used.
The ability to execute a program that is only partially in memory would confer many benefits:
Fewer I/O operations would be needed to load or swap each user program into memory.
A program would no longer be constrained by the amount of physical memory that is available.
Each user program could take less physical memory, so more programs could be run at the same
time, with a corresponding increase in CPU utilization and throughput.

In modern microprocessors intended for general-purpose use, a memory management unit, or MMU,
is built into the hardware. The MMU's job is to translate virtual addresses into physical addresses.

A basic example is given below

Virtual memory is commonly implemented by demand paging. It can also be implemented in a
segmentation system. Demand segmentation can also be used to provide virtual memory.

Demand Paging
A demand paging system is quite similar to a paging system with swapping, where processes
reside in secondary memory and pages are loaded only on demand, not in advance. When a
context switch occurs, the operating system does not copy any of the old program's pages out to
the disk or any of the new program's pages into main memory. Instead, it just begins executing
the new program after loading the first page, and fetches that program's pages as they are
referenced.
While executing a program, if the program references a page which is not available in main
memory because it was swapped out a little while ago, the processor treats this invalid memory
reference as a page fault and transfers control from the program to the operating system to
demand the page back into memory.

Advantages
Following are the advantages of Demand Paging

Large virtual memory.
More efficient use of memory.
There is no limit on degree of multiprogramming.
Disadvantages
Number of tables and the amount of processor overhead for handling page interrupts are greater
than in the case of the simple paged management techniques.

Page Replacement Algorithm


Page replacement algorithms are the techniques by which an operating system decides which
memory pages to swap out (write to disk) when a page of memory needs to be allocated. Page
replacement happens whenever a page fault occurs and a free page cannot be used for the
allocation, either because no pages are available or because the number of free pages is lower
than the number of required pages.

When a page that was selected for replacement and paged out is referenced again, it has
to be read in from disk, and this requires waiting for I/O completion. This waiting determines the
quality of the page replacement algorithm: the less time spent waiting for page-ins, the better the algorithm.
A page replacement algorithm looks at the limited information about page accesses provided
by hardware, and tries to select which pages should be replaced to minimize the total number of
page misses, while balancing it with the costs of primary storage and processor time of the
algorithm itself. There are many different page replacement algorithms. We evaluate an algorithm
by running it on a particular string of memory references and computing the number of page faults.

Reference String
The string of memory references is called a reference string. Reference strings are generated
artificially or by tracing a given system and recording the address of each memory reference. The
latter choice produces a large amount of data, about which we note two things.

For a given page size, we need to consider only the page number, not the entire address.

If we have a reference to a page p, then any immediately following references to page p will
never cause a page fault. Page p will be in memory after the first reference; the immediately
following references will not fault.

For example, consider the following sequence of addresses 123,215,600,1234,76,96

If page size is 100, then the reference string is 1,2,6,12,0,0
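The conversion from addresses to a reference string is an integer division by the page size, as this short sketch shows ( the function name to_reference_string is hypothetical ):

```python
def to_reference_string(addresses, page_size):
    """Keep only the page number of each referenced address."""
    return [address // page_size for address in addresses]


addresses = [123, 215, 600, 1234, 76, 96]
reference_string = to_reference_string(addresses, 100)
print(reference_string)   # [1, 2, 6, 12, 0, 0]
```

The second reference to page 0 cannot fault, because it immediately follows another reference to the same page.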

First In First Out (FIFO) algorithm


Oldest page in main memory is the one which will be selected for replacement.

Easy to implement, keep a list, replace pages from the tail and add new pages at the head.
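A minimal FIFO simulation that counts page faults might look like this. The function name, the reference string, and the choice of three frames are assumptions made for illustration. A deque serves as the list described above, with new pages added at one end and the oldest page removed from the other.

```python
from collections import deque


def fifo_page_faults(reference_string, frame_count):
    """Count page faults under FIFO replacement."""
    frames = deque()            # oldest page on the left, newest on the right
    faults = 0
    for page in reference_string:
        if page in frames:
            continue            # hit: FIFO order does not change
        faults += 1
        if len(frames) == frame_count:
            frames.popleft()    # evict the page that was loaded first
        frames.append(page)
    return faults


# Hypothetical reference string, not taken from the text above.
refs = [7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1]
fifo_faults = fifo_page_faults(refs, 3)
print(fifo_faults)   # 15
```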
Optimal Page algorithm

An optimal page-replacement algorithm has the lowest page-fault rate of all algorithms. An
optimal page-replacement algorithm exists, and has been called OPT or MIN.
Replace the page that will not be used for the longest period of time. Use the time when a page
is to be used.
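OPT needs knowledge of future references, so it cannot be used online, but it is easy to simulate on a recorded reference string as a yardstick for other algorithms. The sketch below uses hypothetical names and an assumed reference string with three frames.

```python
def optimal_page_faults(reference_string, frame_count):
    """Count page faults when evicting the page whose next use is farthest away."""
    frames, faults = [], 0
    for position, page in enumerate(reference_string):
        if page in frames:
            continue
        faults += 1
        if len(frames) < frame_count:
            frames.append(page)
            continue
        future = reference_string[position + 1:]

        def next_use(resident):
            # Pages never referenced again are the best victims.
            return future.index(resident) if resident in future else float("inf")

        victim = max(frames, key=next_use)
        frames[frames.index(victim)] = page
    return faults


# Hypothetical reference string, not taken from the text above.
refs = [7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1]
opt_faults = optimal_page_faults(refs, 3)
print(opt_faults)   # 9
```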

Least Recently Used (LRU) algorithm


Page which has not been used for the longest time in main memory is the one which will be
selected for replacement.
Easy to implement, keep a list, replace pages by looking back into time.
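One way to simulate LRU is to keep the resident pages ordered by recency of use. This sketch assumes an OrderedDict as that ordered list; the function name and the reference string are hypothetical. Real systems usually approximate LRU with hardware reference bits, since updating an ordering on every memory reference is costly.

```python
from collections import OrderedDict


def lru_page_faults(reference_string, frame_count):
    """Count page faults under least-recently-used replacement."""
    frames = OrderedDict()      # least recently used page comes first
    faults = 0
    for page in reference_string:
        if page in frames:
            frames.move_to_end(page)        # hit: now the most recently used
            continue
        faults += 1
        if len(frames) == frame_count:
            frames.popitem(last=False)      # evict the least recently used
        frames[page] = True
    return faults


# Hypothetical reference string, not taken from the text above.
refs = [7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1]
lru_faults = lru_page_faults(refs, 3)
print(lru_faults)   # 12
```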
Page Buffering algorithm
To get a process started quickly, keep a pool of free frames.
On page fault, select a page to be replaced.
Write the new page into a frame from the free pool, mark the page table and restart the process.
Now write the dirty page out to disk and place the frame holding the replaced page in the free pool.
Least Frequently Used (LFU) algorithm
The page with the smallest count is the one which will be selected for replacement.
This algorithm suffers from the situation in which a page is used heavily during the initial phase
of a process, but then is never used again.

Most Frequently Used (MFU) algorithm


This algorithm is based on the argument that the page with the smallest count was probably just
brought in and has yet to be used.

Overview of Mass-Storage Structure


Magnetic Disks

Traditional magnetic disks have the following basic structure:


One or more platters in the form of disks covered with magnetic media. Hard disk platters are
made of rigid metal, while "floppy" disks are made of more flexible plastic.
Each platter has two working surfaces. Older hard disk drives would sometimes not use the very
top or bottom surface of a stack of platters, as these surfaces were more susceptible to potential
damage.
Each working surface is divided into a number of concentric rings called tracks. The collection of
all tracks that are the same distance from the edge of the platter, ( i.e. all tracks immediately
above one another in the following diagram ) is called a cylinder.
Each track is further divided into sectors, traditionally containing 512 bytes of data each,
although some modern disks occasionally use larger sector sizes. ( Sectors also include a
header and a trailer, including checksum information among other things. Larger sector sizes
reduce the fraction of the disk consumed by headers and trailers, but increase internal
fragmentation and the amount of disk that must be marked bad in the case of errors. )
The data on a hard drive is read by read-write heads. The standard configuration ( shown below
) uses one head per surface, each on a separate arm, and controlled by a common arm
assembly which moves all heads simultaneously from one cylinder to another. ( Other
configurations, including independent read-write heads, may speed up disk access, but involve
serious technical difficulties. )
The storage capacity of a traditional disk drive is equal to the number of heads ( i.e. the number
of working surfaces ), times the number of tracks per surface, times the number of sectors per
track, times the number of bytes per sector. A particular physical block of data is specified by
providing the head-sector-cylinder number at which it is located.
In operation the disk rotates at high speed, such as 7200 rpm ( 120 revolutions per second. )
The rate at which data can be transferred from the disk to the computer is composed of several
steps:
The positioning time, a.k.a. the seek time or random access time is the time required to
move the heads from one cylinder to another, and for the heads to settle down after the
move. This is typically the slowest step in the process and the predominant bottleneck to
overall transfer rates.
The rotational latency is the amount of time required for the desired sector to rotate around
and come under the read-write head. This can range anywhere from zero to one full
revolution, and on the average will equal one-half revolution. This is another physical step
and is usually the second slowest step behind seek time. ( For a disk rotating at 7200 rpm,
the average rotational latency would be 1/2 revolution / 120 revolutions per second, or just
over 4 milliseconds, a long time by computer standards. )
The transfer rate, which determines the time required to move the data electronically from the
disk to the computer. ( Some authors may also use the term transfer rate to refer to the overall
transfer rate, including seek time and rotational latency as well as the electronic data
transfer rate. )
Disk heads "fly" over the surface on a very thin cushion of air. If they should accidentally contact
the disk, then a head crash occurs, which may or may not permanently damage the disk or even
destroy it completely. For this reason it is normal to park the disk heads when turning a
computer off, which means to move the heads off the disk or to an area of the disk where there
is no data stored.
Floppy disks are normally removable. Hard drives can also be removable, and some are even
hot-swappable, meaning they can be removed while the computer is running, and a new hard
drive inserted in their place.
Disk drives are connected to the computer via a cable known as the I/O Bus. Some of the
common interface formats include Enhanced Integrated Drive Electronics, EIDE; Advanced
Technology Attachment, ATA; Serial ATA, SATA; Universal Serial Bus, USB; Fiber Channel, FC;
and Small Computer Systems Interface, SCSI.
The host controller is at the computer end of the I/O bus, and the disk controller is built into the
disk itself. The CPU issues commands to the host controller via I/O ports. Data is transferred
between the magnetic surface and onboard cache by the disk controller, and then the data is
transferred from that cache to the host controller and the motherboard memory at electronic
speeds.
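The capacity formula and the rotational-latency figure from the list above can be checked with a few lines of Python. The drive geometry used here ( 4 heads, 1000 tracks per surface, 63 sectors per track ) is a made-up example, and the function names are hypothetical.

```python
def disk_capacity_bytes(heads, tracks_per_surface, sectors_per_track,
                        bytes_per_sector=512):
    """Capacity = heads * tracks * sectors per track * bytes per sector."""
    return heads * tracks_per_surface * sectors_per_track * bytes_per_sector


def average_rotational_latency_ms(rpm):
    """Average latency is the time for half a revolution, in milliseconds."""
    revolutions_per_second = rpm / 60
    return 0.5 / revolutions_per_second * 1000


capacity = disk_capacity_bytes(heads=4, tracks_per_surface=1000,
                               sectors_per_track=63)
latency = average_rotational_latency_ms(7200)
print(capacity, round(latency, 2))
```

The capacity formula assumes the same number of sectors on every track, which holds for the traditional geometry described here but not for drives that place more sectors on outer tracks.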
Magnetic Tapes

Magnetic tapes were once used for common secondary storage before the days of hard disk
drives, but today are used primarily for backups.
Accessing a particular spot on a magnetic tape can be slow, but once reading or writing
commences, access speeds are comparable to disk drives.
Capacities of tape drives can range from 20 to 200 GB, and compression can double that
capacity.

Disk Structure

The traditional head-sector-cylinder, HSC numbers are mapped to linear block addresses by
numbering the first sector on the first head on the outermost track as sector 0. Numbering
proceeds with the rest of the sectors on that same track, and then the rest of the tracks on the
same cylinder before proceeding through the rest of the cylinders to the center of the disk. In
modern practice these linear block addresses are used in place of the HSC numbers for a
variety of reasons:
1. The linear length of tracks near the outer edge of the disk is much longer than for those
tracks located near the center, and therefore it is possible to squeeze many more sectors
onto outer tracks than onto inner ones.
2. All disks have some bad sectors, and therefore disks maintain a few spare sectors that can
be used in place of the bad ones. The mapping of spare sectors to bad sectors is managed
internally by the disk controller.
3. Modern hard drives can have thousands of cylinders, and hundreds of sectors per track on
their outermost tracks. These numbers exceed the range of HSC numbers for many ( older
) operating systems, and therefore disks can be configured for any convenient combination
of HSC values that falls within the total number of sectors physically on the drive.
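The numbering order described at the start of this section ( sectors along a track, then tracks within a cylinder, then cylinders ) corresponds to a simple formula. The sketch below assumes a uniform geometry with the same number of sectors on every track and numbers sectors from 0, as the text does; the function names and the geometry values are hypothetical.

```python
HEADS = 4                # assumed number of working surfaces
SECTORS_PER_TRACK = 63   # assumed to be uniform across all tracks


def to_lba(cylinder, head, sector):
    """Linear block address of a ( cylinder, head, sector ) triple, all 0-based."""
    return (cylinder * HEADS + head) * SECTORS_PER_TRACK + sector


def from_lba(lba):
    """Inverse mapping: recover ( cylinder, head, sector ) from a block address."""
    cylinder, remainder = divmod(lba, HEADS * SECTORS_PER_TRACK)
    head, sector = divmod(remainder, SECTORS_PER_TRACK)
    return cylinder, head, sector


print(to_lba(1, 0, 48), from_lba(300))
```

For the reasons listed above, modern drives do not have a uniform geometry, so a formula like this describes only the logical geometry a drive reports, not the physical layout.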
There is a limit to how closely individual bits can be packed on a physical medium, but that
limit keeps rising as technological advances are made.
Modern disks pack many more sectors into outer cylinders than inner ones, using one of two
approaches:
1. With Constant Linear Velocity, CLV, the density of bits is uniform from cylinder to
cylinder. Because there are more sectors in outer cylinders, the disk spins slower when
reading those cylinders, causing the rate of bits passing under the read-write head to
remain constant. This is the approach used by modern CDs and DVDs.
2. With Constant Angular Velocity, CAV, the disk rotates at a constant angular speed, with
the bit density decreasing on outer cylinders. ( These disks would have a constant
number of sectors per track on all cylinders. )

Disk Attachment

Disk drives can be attached either directly to a particular host ( a local disk ) or to a network.
Host-Attached Storage

Local disks are accessed through I/O Ports as described earlier.


The most common interfaces are IDE or ATA, each of which allows up to two drives per host
controller.
SATA is similar with simpler cabling.
High end workstations or other systems in need of larger number of disks typically use SCSI
disks:
The SCSI standard supports up to 16 targets on each SCSI bus, one of which is generally the
host adapter and the other 15 of which can be disk or tape drives.
A SCSI target is usually a single drive, but the standard also supports up to 8 units within each
target. These would generally be used for accessing individual disks within a RAID array. ( See
below. )
The SCSI standard also supports multiple host adapters in a single computer, i.e. multiple
SCSI busses.
Modern advancements in SCSI include "fast" and "wide" versions, as well as SCSI-2.
SCSI cables may be either 50 or 68 conductors. SCSI devices may be external as well as
internal.
FC is a high-speed serial architecture that can operate over optical fiber or four-conductor
copper wires, and has two variants:
A large switched fabric having a 24-bit address space. This variant allows for multiple
devices and multiple hosts to interconnect, forming the basis for the storage-area networks,
SANs, to be discussed in a future section.
The arbitrated loop, FC-AL, that can address up to 126 devices ( drives and controllers. )
Network-Attached Storage

Network attached storage connects storage devices to computers using a remote procedure
call, RPC, interface, typically with something like NFS filesystem mounts. This is convenient for
allowing several computers in a group common access and naming conventions for shared
storage.
NAS can be implemented using SCSI cabling, or with iSCSI, which uses Internet protocols and
standard network connections, allowing long-distance remote access to shared files.
NAS allows computers to easily share data storage, but tends to be less efficient than standard
host-attached storage.

Storage-Area Network

A Storage-Area Network, SAN, connects computers and storage devices in a network, using
storage protocols instead of network protocols.
One advantage of this is that storage access does not tie up regular networking bandwidth.
SAN is very flexible and dynamic, allowing hosts and devices to attach and detach on the fly.
SAN is also controllable, allowing restricted access to certain hosts and devices.
Disk Scheduling

As mentioned earlier, disk transfer speeds are limited primarily by seek times and rotational
latency. When multiple requests are to be processed there is also some inherent delay in waiting
for other requests to be processed.
Bandwidth is measured by the amount of data transferred divided by the total amount of time
from the first request being made to the last transfer being completed, ( for a series of disk
requests. )
Both bandwidth and access time can be improved by processing requests in a good order.
Disk requests include the disk address, memory address, number of sectors to transfer, and
whether the request is for reading or writing.

FCFS Scheduling

First-Come First-Serve is simple and intrinsically fair, but not very efficient. Consider in the
following sequence the wild swing from cylinder 122 to 14 and then back to 124:

SSTF Scheduling

Shortest Seek Time First scheduling is more efficient, but may lead to starvation if a constant
stream of requests arrives for the same general area of the disk.
SSTF reduces the total head movement to 236 cylinders, down from the 640 required for the same
set of requests under FCFS. Note, however, that the distance could be reduced still further to
208 by servicing 37 and then 14 first before processing the rest of the requests.
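
The totals quoted above can be checked with a short Python sketch. The request queue and the
starting head position are not given in the text; the values below ( queue 98, 183, 37, 122, 14,
124, 65, 67 with the head at cylinder 53 ) are an assumption, chosen because they are a common
textbook example and reproduce the 640, 236, and 208 figures.

```python
def total_movement(start, order):
    """Sum of cylinder distances when requests are served in the given order."""
    total, pos = 0, start
    for cyl in order:
        total += abs(cyl - pos)
        pos = cyl
    return total


def sstf_order(start, requests):
    """Repeatedly pick the pending request closest to the current head position."""
    pending, pos, order = list(requests), start, []
    while pending:
        nearest = min(pending, key=lambda c: abs(c - pos))
        pending.remove(nearest)
        order.append(nearest)
        pos = nearest
    return order


# Assumed example data (not stated in the text above).
HEAD = 53
QUEUE = [98, 183, 37, 122, 14, 124, 65, 67]

fcfs = total_movement(HEAD, QUEUE)                    # serve in arrival order
sstf = total_movement(HEAD, sstf_order(HEAD, QUEUE))  # serve nearest request first
# Serve 37 and 14 first, then everything above the head in ascending order.
alt = total_movement(HEAD, [37, 14] + sorted(c for c in QUEUE if c > HEAD))

print(fcfs, sstf, alt)
```

The third figure shows that SSTF is a greedy heuristic: always taking the nearest request does
not necessarily minimize the total head movement.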

SCAN Scheduling

The SCAN algorithm, a.k.a. the elevator algorithm, moves back and forth from one end of the
disk to the other, similarly to an elevator processing requests in a tall building.
Under the SCAN algorithm, if a request arrives just ahead of the moving head then it will be
processed right away, but if it arrives just after the head has passed, then it will have to wait for
the head to pass going the other way on the return trip. This leads to a fairly wide variation in
access times which can be improved upon.
Consider, for example, when the head reaches the high end of the disk: Requests with high
cylinder numbers just missed the passing head, which means they are all fairly recent requests,
whereas requests with low numbers may have been waiting for a much longer time. Making the
return scan from high to low then ends up accessing recent requests first and making older
requests wait that much longer.
C-SCAN Scheduling

The Circular-SCAN algorithm improves upon SCAN by treating all requests in a circular queue
fashion - Once the head reaches the end of the disk, it returns to the other end without processing
any requests, and then starts again from the beginning of the disk:
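
A minimal sketch of the service order under SCAN and C-SCAN follows. The head position, the
request queue, the disk size of 200 cylinders ( numbered 0 to 199 ), and the initial upward
direction are all assumptions made for illustration. The return seek in C-SCAN is counted as
head movement here, which is one possible convention.

```python
def split_requests(start, requests):
    """Split pending cylinders into those below and those at or above the head."""
    lower = sorted(c for c in requests if c < start)
    upper = sorted(c for c in requests if c >= start)
    return lower, upper


def scan(start, requests, max_cyl):
    """SCAN moving upward first: sweep to the last cylinder, then reverse.

    Returns (service order, total head movement in cylinders).
    """
    lower, upper = split_requests(start, requests)
    order = upper + lower[::-1]
    movement = max_cyl - start              # upward sweep to the end of the disk
    if lower:
        movement += max_cyl - lower[0]      # return sweep down to the lowest request
    return order, movement


def c_scan(start, requests, max_cyl):
    """C-SCAN: service requests on the upward sweep only.

    After reaching the last cylinder the head returns to cylinder 0 without
    servicing anything, then sweeps upward again.
    """
    lower, upper = split_requests(start, requests)
    order = upper + lower
    movement = max_cyl - start              # upward sweep to the end of the disk
    if lower:
        movement += max_cyl + lower[-1]     # seek back to 0, then up to the last request
    return order, movement


# Assumed example data.
HEAD, MAX_CYL = 53, 199
QUEUE = [98, 183, 37, 122, 14, 124, 65, 67]

scan_order, scan_moves = scan(HEAD, QUEUE, MAX_CYL)
cscan_order, cscan_moves = c_scan(HEAD, QUEUE, MAX_CYL)
print("SCAN  ", scan_order, scan_moves)
print("C-SCAN", cscan_order, cscan_moves)
```

Under C-SCAN the low-numbered requests are served in ascending order after the return seek, so
the requests that have waited longest are no longer served last.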

LOOK Scheduling

LOOK scheduling improves upon SCAN by looking ahead at the queue of pending requests, and
not moving the heads any farther towards the end of the disk than is necessary. The following
diagram illustrates the circular form of LOOK:
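
A minimal sketch of LOOK and its circular form, C-LOOK, is given here. As before, the head
position and the request queue are assumed example values, and the head is assumed to be moving
toward higher cylinder numbers.

```python
def total_movement(start, order):
    """Sum of cylinder distances when requests are served in the given order."""
    total, pos = 0, start
    for cyl in order:
        total += abs(cyl - pos)
        pos = cyl
    return total


def look(start, requests):
    """LOOK moving upward first: reverse at the last pending request, not the disk edge."""
    lower = sorted(c for c in requests if c < start)
    upper = sorted(c for c in requests if c >= start)
    return upper + lower[::-1]


def c_look(start, requests):
    """C-LOOK: service upward only, then jump back to the lowest pending request."""
    lower = sorted(c for c in requests if c < start)
    upper = sorted(c for c in requests if c >= start)
    return upper + lower


# Assumed example data.
HEAD = 53
QUEUE = [98, 183, 37, 122, 14, 124, 65, 67]

look_order = look(HEAD, QUEUE)
clook_order = c_look(HEAD, QUEUE)
look_moves = total_movement(HEAD, look_order)
clook_moves = total_movement(HEAD, clook_order)
print("LOOK  ", look_order, look_moves)
print("C-LOOK", clook_order, clook_moves)
```

Because the head never travels past the outermost pending request, both totals are smaller than
the corresponding SCAN and C-SCAN totals for the same queue on a 200-cylinder disk.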

Selection of a Disk-Scheduling Algorithm

With very low loads all algorithms are equal, since there will normally only be one request to
process at a time.
For slightly larger loads, SSTF offers better performance than FCFS, but may lead to starvation
when loads become heavy enough.
For busier systems, SCAN and LOOK algorithms eliminate starvation problems.
The actual optimal algorithm may be something even more complex than those discussed here,
but the incremental improvements are generally not worth the additional overhead.
Some improvement to overall filesystem access times can be made by intelligent placement of
directory and/or inode information. If those structures are placed in the middle of the disk instead
of at the beginning of the disk, then the maximum distance from those structures to data blocks
is reduced to only one-half of the disk size. If those structures can be further distributed and
furthermore have their data blocks stored as close as possible to the corresponding directory
structures, then that reduces still further the overall time to find the disk block numbers and then
access the corresponding data blocks.
On modern disks the rotational latency can be almost as significant as the seek time. However, it
is not within the OS's control to account for that, because modern disks do not reveal their
internal sector mapping schemes ( particularly when bad blocks have been remapped to spare
sectors. )
Some disk manufacturers provide for disk scheduling algorithms directly on their disk controllers,
( which do know the actual geometry of the disk as well as any remapping ), so that if a series of
requests are sent from the computer to the controller then those requests can be processed in
an optimal order.
Unfortunately there are some considerations that the OS must take into account that are beyond
the abilities of the on-board disk-scheduling algorithms, such as priorities of some requests over
others, or the need to process certain requests in a particular order. For this reason OSes may
elect to spoon-feed requests to the disk controller one at a time in certain situations.

Disk Management
Disk Formatting
Before a disk can be used, it has to be low-level formatted, which means laying down all of the
headers and trailers marking the beginning and end of each sector. Included in the header
and trailer are the linear sector numbers and error-correcting codes, ECC, which allow
damaged sectors not only to be detected, but in many cases for the damaged data to be
recovered ( depending on the extent of the damage. ) Sector sizes are traditionally 512 bytes,
but may be larger, particularly in larger drives.
ECC calculation is performed with every disk read or write, and if damage is detected but the
data is recoverable, then a soft error has occurred. Soft errors are generally handled by the on-
board disk controller, and never seen by the OS. ( See below. )
Once the disk is low-level formatted, the next step is to partition the drive into one or more
separate partitions. This step must be completed even if the disk is to be used as a single large
partition, so that the partition table can be written to the beginning of the disk.
After partitioning, the filesystems must be logically formatted, which involves laying down
the master directory information ( FAT or inode structure ), initializing free lists, and
creating at least the root directory of the filesystem. ( Disk partitions which are to be used as raw
devices are not logically formatted. This saves the overhead and disk space of the filesystem
structure, but requires that the application program manage its own disk storage requirements. )
Boot Block

Computer ROM contains a bootstrap program ( OS independent ) with just enough code to find
the first sector on the first hard drive on the first controller, load that sector into memory, and
transfer control over to it. ( The ROM bootstrap program may look in floppy and/or CD drives
before accessing the hard drive, and is smart enough to recognize whether it has found valid
boot code or not. )
The first sector on the hard drive is known as the Master Boot Record, MBR, and contains a
very small amount of code in addition to the partition table. The partition table documents how
the disk is partitioned into logical disks, and indicates specifically which partition is the active or
boot partition.
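
As an illustration of how a boot program might locate the active partition, the sketch below
parses a 512-byte MBR sector. The layout details are not given in the text above; the sketch
assumes the conventional PC MBR format ( 446 bytes of boot code, four 16-byte partition entries,
and the two-byte signature 0x55 0xAA at the end ). The sample sector is fabricated in memory
purely for the demonstration.

```python
import struct

SECTOR_SIZE = 512
TABLE_OFFSET = 446          # the partition table follows the boot code
ENTRY_SIZE = 16
SIGNATURE = b"\x55\xaa"     # marks the sector as containing valid boot data
ENTRY_FORMAT = "<B3xB3xII"  # status, (CHS skipped), type, (CHS skipped), first LBA, sector count


def parse_mbr(sector):
    """Return the non-empty partition entries found in an MBR sector."""
    if len(sector) != SECTOR_SIZE or sector[510:512] != SIGNATURE:
        raise ValueError("not a valid MBR sector")
    partitions = []
    for i in range(4):
        begin = TABLE_OFFSET + i * ENTRY_SIZE
        status, ptype, first_lba, count = struct.unpack(
            ENTRY_FORMAT, sector[begin:begin + ENTRY_SIZE])
        if ptype != 0:                      # type 0 means the slot is unused
            partitions.append({
                "index": i,
                "active": status == 0x80,   # 0x80 flags the boot partition
                "type": ptype,
                "first_lba": first_lba,
                "sectors": count,
            })
    return partitions


# Fabricated sample sector: one active partition starting at sector 2048.
sample = bytearray(SECTOR_SIZE)
sample[510:512] = SIGNATURE
sample[TABLE_OFFSET:TABLE_OFFSET + ENTRY_SIZE] = struct.pack(
    ENTRY_FORMAT, 0x80, 0x83, 2048, 409600)

parts = parse_mbr(bytes(sample))
active = [p for p in parts if p["active"]]
print(active)
```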
The boot program then looks to the active partition to find an operating system, possibly loading
up a slightly larger / more advanced boot program along the way.
In a dual-boot ( or larger multi-boot ) system, the user may be given a choice of which operating
system to boot, with a default action to be taken in the event of no response within some time
frame.
Once the kernel is found by the boot program, it is loaded into memory and then control is
transferred over to the OS. The kernel will normally continue the boot process by initializing all
important kernel data structures, launching important system services ( e.g. network daemons,
sched, init, etc. ), and finally providing one or more login prompts. Boot options at this stage may
include single-user a.k.a. maintenance or safe modes, in which very few system services are
started - These modes are designed for system administrators to repair problems or otherwise
maintain the system.
Bad Blocks

No disk can be manufactured to 100% perfection, and all physical objects wear out over time.
For these reasons all disks are shipped with a few bad blocks, and additional blocks can be
expected to go bad slowly over time. If a large number of blocks go bad then the entire disk will
need to be replaced, but a few here and there can be handled through other means.
In the old days, bad blocks had to be checked for manually. Formatting of the disk or running
certain disk-analysis tools would identify bad blocks, and attempt to read the data off of them
one last time through repeated tries. Then the bad blocks would be mapped out and taken out of
future service. Sometimes the data could be recovered, and sometimes it was lost forever. (
Disk analysis tools could be either destructive or non-destructive. )
Modern disk controllers make much better use of the error-correcting codes, so that bad blocks
can be detected earlier and the data usually recovered. ( Recall that blocks are tested with every
write as well as with every read, so often errors can be detected before the write operation is
complete, and the data simply written to a different sector instead. )
Note that re-mapping of sectors from their normal linear progression can throw off the disk
scheduling optimization of the OS, especially if the replacement sector is physically far away
from the sector it is replacing. For this reason most disks normally keep a few spare sectors on
each cylinder, as well as at least one spare cylinder. Whenever possible a bad sector will be
mapped to another sector on the same cylinder, or at least a cylinder as close as possible.
Sector slipping may also be performed, in which all sectors between the bad sector and the
replacement sector are moved down by one, so that the linear progression of sector numbers
can be maintained.
If the data on a bad block cannot be recovered, then a hard error has occurred, which requires
replacing the file(s) from backups, or rebuilding them from scratch.

Swap-Space Management

Modern systems typically swap out pages as needed, rather than swapping out entire
processes. Hence the swapping system is part of the virtual memory management system.
Managing swap space is obviously an important task for modern OSes.

Swap-Space Use

The amount of swap space needed by an OS varies greatly according to how it is used. Some
systems require an amount equal to physical RAM; some want a multiple of that; some want an
amount equal to the amount by which virtual memory exceeds physical RAM, and some
systems use little or none at all!
Some systems support multiple swap spaces on separate disks in order to speed up the virtual
memory system.

Swap-Space Location

Swap space can be physically located in one of two locations:

As a large file which is part of the regular filesystem. This is easy to implement, but inefficient.
Not only must the swap space be accessed through the directory system, the file is also subject
to fragmentation issues. Caching the block location helps in finding the physical blocks, but that
is not a complete fix.
As a raw partition, possibly on a separate or little-used disk. This allows the OS more control
over swap space management, which is usually faster and more efficient. Fragmentation of
swap space is generally not a big issue, as the space is re-initialized every time the system is
rebooted. The downside of keeping swap space on a raw partition is that it can only be grown by
repartitioning the hard drive.

Disk Reliability

Reliability is an important property of any kind of database system. One aspect of reliable
operation is that all data recorded by a committed transaction must be stored in a nonvolatile area
that is safe from power loss, operating system failure, and hardware failure. Successfully writing
the data to the computer's permanent storage ordinarily meets this requirement. Even if a computer
is fatally damaged, if the disk drives survive they can be moved to another computer with similar
hardware, and all committed transactions will remain intact.
More About Disk Reliability:
Forcing data to the disk platters periodically might seem like a simple operation, but it is not.
Disk drives are much slower than main memory and CPUs, so several layers of caching exist between
the computer's main memory and the disk platters. First, the operating system's buffer cache holds
frequently requested disk blocks and combines disk writes. Operating systems give applications a
way to force writes from the buffer cache to disk. Next, there may be a cache in the disk drive
controller; this is particularly common on RAID controller cards. Some of these caches are
write-back, meaning that data is sent to the drive at some later time. Such caches can be a
reliability hazard because the controller's cache memory is volatile and loses its contents in a
power failure. Better controller cards have battery-backup units, meaning the card has a battery
that maintains power to the cache if system power is lost. Once power is restored, the cached data
is written to the disk drives.
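
As a hedged illustration of forcing a write out of the operating system's buffer cache, the sketch
below uses Python's os.fsync on a temporary file. The file name and the record contents are made
up for the example.

```python
import os
import tempfile


def durable_write(path, data):
    """Write data and ask the OS to push it from its buffer cache to the device."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()               # drain the program's own buffer into the OS
        os.fsync(f.fileno())    # ask the OS to flush its cached blocks for this file


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "committed.log")       # hypothetical file name
    durable_write(path, b"transaction 1 committed\n")
    with open(path, "rb") as f:
        contents = f.read()

print(contents)
```

Note that this only covers the buffer-cache layer. A write-back cache in the disk controller may
still hold the data after the call returns, which is why battery backup on the controller matters.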

Stable-Storage Implementation

The concept of stable storage ( first presented in chapter 6 ) involves a storage medium in which
data is never lost, even in the face of equipment failure in the middle of a write operation.
To implement this requires two ( or more ) copies of the data, with separate failure modes.
An attempted disk write results in one of three possible outcomes:
A. The data is successfully and completely written.
B. The data is partially written, but not completely. The last block written may be garbled.
C. No writing takes place at all.
Whenever an equipment failure occurs during a write, the system must detect it, and return the
system back to a consistent state. To do this requires two physical blocks for every logical block,
and the following procedure:
1. Write the data to the first physical block.
2. After step 1 has completed, write the data to the second physical block.
3. Declare the operation complete only after both physical writes have completed
successfully.
During recovery the pair of blocks is examined.
If both blocks are identical and there is no sign of damage, then no further action is
necessary.
If one block contains a detectable error but the other does not, then the damaged block is
replaced with the good copy. ( This will either undo the operation or complete the operation,
depending on which block is damaged and which is undamaged. )
If neither block shows damage but the data in the blocks differ, then replace the data in the
first block with the data in the second block. ( Undo the operation. )
Because the sequence of operations described above is slow, stable storage usually includes
NVRAM as a cache, and declares a write operation complete once it has been written to the
NVRAM.
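
The write and recovery procedure above can be sketched as follows. This is a simulation under
stated assumptions, not a real driver: the two physical blocks are held in memory, damage is
detected with a CRC-32 checksum, and the fail_after parameter is a hypothetical hook for
simulating a crash part-way through a write.

```python
import zlib


class StableBlock:
    """One logical block stored as two physical copies, each with a checksum."""

    def __init__(self):
        self.copies = [self._pack(b""), self._pack(b"")]

    @staticmethod
    def _pack(data):
        return (data, zlib.crc32(data))

    @staticmethod
    def _ok(copy):
        data, crc = copy
        return zlib.crc32(data) == crc

    def write(self, data, fail_after=None):
        """Write copy 0, then copy 1.

        fail_after simulates a crash after that many copies have been written.
        """
        for i in range(2):
            if fail_after == i:
                return False                    # crashed before writing copy i
            self.copies[i] = self._pack(data)
        return True                             # complete only after both writes

    def recover(self):
        first, second = self.copies
        if self._ok(first) and not self._ok(second):
            self.copies[1] = first              # completes the interrupted write
        elif self._ok(second) and not self._ok(first):
            self.copies[0] = second             # undoes the interrupted write
        elif first != second:
            self.copies[0] = second             # both readable but different: undo
        return self.copies[0][0]


blk = StableBlock()
blk.write(b"old")
blk.write(b"new", fail_after=1)     # crash between the two physical writes
rolled_back = blk.recover()

blk2 = StableBlock()
blk2.write(b"old")
blk2.copies[0] = StableBlock._pack(b"new")
blk2.copies[1] = (b"ne", -1)        # garbled second copy: checksum cannot match
completed = blk2.recover()

print(rolled_back, completed)
```

Either way, after recovery both copies agree, so the logical block holds either the complete old
value or the complete new value, never a mixture.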

The hardware and software clocks


A personal computer has a battery driven hardware clock. The battery ensures that the clock will
work even if the rest of the computer is without electricity. The hardware clock can be set from the
BIOS setup screen or from whatever operating system is running.
The Linux kernel keeps track of time independently from the hardware clock. During the boot,
Linux sets its own clock to the same time as the hardware clock. After this, both clocks run
independently. Linux maintains its own clock because looking at the hardware is slow and
complicated.

The kernel clock always shows universal time. This way, the kernel does not need to know about
time zones at all. The simplicity results in higher reliability and makes it easier to update the time
zone information. Each process handles time zone conversions itself (using standard tools that are
part of the time zone package).

The hardware clock can be in local time or in universal time. It is usually better to have it in
universal time, because then you don't need to change the hardware clock when daylight saving
time begins or ends (UTC does not have DST). Unfortunately, some PC operating systems,
including MS-DOS, Windows, and OS/2, assume the hardware clock shows local time. Linux can
handle either, but if the hardware clock shows local time, then it must be modified when daylight
saving time begins or ends (otherwise it wouldn't show local time).
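
The split between a universal-time kernel clock and per-process time zone conversion can be
illustrated with a small sketch. The timestamp and the two fixed-offset zones below are assumed
values for the example; a real system would consult its time zone database, which also accounts
for daylight saving time.

```python
from datetime import datetime, timedelta, timezone

# A fixed instant as a kernel might track it: seconds since the epoch, in UTC.
kernel_seconds = 1_700_000_000

utc_time = datetime.fromtimestamp(kernel_seconds, tz=timezone.utc)

# Hypothetical zones with fixed offsets, for illustration only.
HELSINKI_WINTER = timezone(timedelta(hours=2), "EET")
NEW_YORK_WINTER = timezone(timedelta(hours=-5), "EST")

# Each "process" converts the same instant to its own local representation.
helsinki_view = utc_time.astimezone(HELSINKI_WINTER)
new_york_view = utc_time.astimezone(NEW_YORK_WINTER)

for view in (utc_time, helsinki_view, new_york_view):
    print(view.isoformat())
```

All three values describe the same instant; only the presentation differs, which is why the
kernel itself never needs to know about time zones.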

File System

File
A file is a named collection of related information that is recorded on secondary storage such as
magnetic disks, magnetic tapes and optical disks. In general, a file is a sequence of bits, bytes,
lines or records whose meaning is defined by the file's creator and user.

File Structure
A file structure must follow a format that the operating system can understand.

A file has a certain defined structure according to its type.


A text file is a sequence of characters organized into lines.
A source file is a sequence of procedures and functions.
An object file is a sequence of bytes organized into blocks that are understandable by the
machine.
When an operating system defines different file structures, it must also contain the code to
support them. UNIX and MS-DOS support a minimal number of file structures.
File Type
File type refers to the ability of the operating system to distinguish different types of files, such as
text files, source files, and binary files. Many operating systems support many types of files.
Operating systems like MS-DOS and UNIX have the following types of files:

Ordinary files
These are the files that contain user information.
These may contain text, databases, or executable programs.
The user can apply various operations on such files like add, modify, delete or even remove the
entire file.
Directory files
These files contain lists of file names and other information related to these files.
Special files
These files are also known as device files.
These files represent physical devices such as disks, terminals, printers, networks, and tape drives.
These files are of two types:

Character special files: data is handled character by character, as in the case of terminals or
printers.
Block special files: data is handled in blocks, as in the case of disks and tapes.

File Access Mechanisms


File access mechanism refers to the manner in which the records of a file may be accessed. There
are several ways to access files

Sequential access
Direct/Random access
Indexed sequential access
Sequential access
A sequential access is that in which the records are accessed in some sequence, i.e., the
information in the file is processed in order, one record after the other. This access method is the
most primitive one. Example: Compilers usually access files in this fashion.

Direct/Random access
Random access file organization provides direct access to the records.
Each record has its own address in the file, by which it can be directly accessed
for reading or writing.
The records need not be in any sequence within the file and they need not be in adjacent
locations on the storage medium.

Indexed sequential access


This mechanism is built on top of sequential access.
An index is created for each file, containing pointers to its various blocks.
The index is searched sequentially, and its pointer is used to access the file directly.
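
The three access mechanisms can be compared in a short sketch. It assumes fixed-length records
of 16 bytes held in an in-memory file, and record keys that end at the first colon; both are
illustrative choices, not part of the text above.

```python
import io

RECORD_SIZE = 16    # assumed fixed record length in bytes


def make_file(records):
    """Pack records into an in-memory file, padding each to RECORD_SIZE."""
    return io.BytesIO(b"".join(r.ljust(RECORD_SIZE, b" ") for r in records))


def read_sequential(f):
    """Sequential access: yield every record in order from the start."""
    f.seek(0)
    while True:
        rec = f.read(RECORD_SIZE)
        if not rec:
            break
        yield rec.rstrip()


def read_direct(f, n):
    """Direct access: compute the address of record n and jump straight to it."""
    f.seek(n * RECORD_SIZE)
    return f.read(RECORD_SIZE).rstrip()


def build_index(f):
    """Build a list of (key, record number) pairs; the key precedes the first ':'."""
    return [(rec.split(b":")[0], n) for n, rec in enumerate(read_sequential(f))]


def read_indexed(f, index, key):
    """Indexed sequential access: scan the index, then fetch the record directly."""
    for k, n in index:
        if k == key:
            return read_direct(f, n)
    raise KeyError(key)


RECORDS = [b"ann:1990", b"bob:1985", b"cat:2001"]
f = make_file(RECORDS)
index = build_index(f)

print(list(read_sequential(f)))
print(read_direct(f, 2))
print(read_indexed(f, index, b"bob"))
```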
Space Allocation
Files are allocated disk space by the operating system. Operating systems use the following three
main ways to allocate disk space to files.

Contiguous Allocation
Linked Allocation
Indexed Allocation
Contiguous Allocation
Each file occupies a contiguous address space on disk.
Assigned disk address is in linear order.
Easy to implement.
External fragmentation is a major issue with this type of allocation technique.
Linked Allocation
Each file carries a list of links to disk blocks.
Directory contains link / pointer to first block of a file.
No external fragmentation
Effectively used in sequential access file.
Inefficient in case of direct access file.
Indexed Allocation
Provides solutions to the problems of contiguous and linked allocation.
An index block is created that holds all the pointers to a file's blocks.
Each file has its own index block, which stores the addresses of the disk space occupied by the file.
The directory contains the addresses of the index blocks of files.
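
To make the trade-offs concrete, the sketch below finds the k-th block of a file under each
allocation method. The block numbers are made-up example values, and the disk is modelled with
plain Python containers rather than real device I/O.

```python
def contiguous_block(start, k):
    """Contiguous allocation: the k-th block is simply start + k."""
    return start + k


def linked_block(next_pointer, first, k):
    """Linked allocation: follow k pointers from the first block.

    Returns (block number, number of blocks read to get there).
    """
    block, reads = first, 0
    for _ in range(k):
        block = next_pointer[block]     # reading a block reveals its successor
        reads += 1
    return block, reads


def indexed_block(index_block, k):
    """Indexed allocation: one lookup in the file's index block."""
    return index_block[k]


# Hypothetical file of five blocks scattered over the disk.
NEXT = {9: 16, 16: 1, 1: 10, 10: 25, 25: None}    # block -> next block
INDEX_BLOCK = [9, 16, 1, 10, 25]                  # same file, indexed form

print(contiguous_block(14, 3))
print(linked_block(NEXT, 9, 3))
print(indexed_block(INDEX_BLOCK, 3))
```

The linked lookup has to read every block ahead of the one it wants, which is why linked
allocation suits sequential access but is inefficient for direct access.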

Directory Structure
Storage Structure

A disk can be used in its entirety for a file system.


Alternatively a physical disk can be broken up into multiple partitions, slices, or mini-disks, each
of which becomes a virtual disk and can have its own filesystem ( or be used for raw storage,
swap space, etc. )
Or, multiple physical disks can be combined into one volume, i.e. a larger virtual disk, with its
own filesystem spanning the physical disks.


Directory Overview

Directory operations to be supported include:


Search for a file
Create a file - add to the directory
Delete a file - erase from the directory
List a directory - possibly ordered in different ways.
Rename a file - may change sorting order
Traverse the file system.

Single-Level Directory

Simple to implement, but each file must have a unique name.


Two-Level Directory

Each user gets their own directory space.


File names only need to be unique within a given user's directory.
A master file directory is used to keep track of each user's directory, and must be maintained
when users are added to or removed from the system.
A separate directory is generally needed for system ( executable ) files.
Systems may or may not allow users to access other directories besides their own.
If access to other directories is allowed, then provision must be made to specify the directory
being accessed.
If access is denied, then special consideration must be made for users to run programs
located in system directories. A search path is the list of directories in which to search for
executable programs, and can be set uniquely for each user.

Tree-Structured Directories
An obvious extension to the two-tiered directory structure, and the one with which we are all
most familiar.
Each user / process has the concept of a current directory from which all ( relative ) searches
take place.
Files may be accessed using either absolute pathnames ( relative to the root of the tree ) or
relative pathnames ( relative to the current directory. )
Directories are stored the same as any other file in the system, except there is a bit that
identifies them as directories, and they have some special structure that the OS understands.
One question for consideration is whether or not to allow the removal of directories that are not
empty - Windows requires that directories be emptied first, and UNIX provides an option for
deleting entire sub-trees.

Acyclic-Graph Directories

When the same files need to be accessed in more than one place in the directory structure ( e.g.
because they are being shared by more than one user / process ), it can be useful to provide an
acyclic-graph structure. ( Note the directed arcs from parent to child. )
UNIX provides two types of links for implementing the acyclic-graph structure. ( See "man ln" for
more details. )
A hard link ( usually just called a link ) involves multiple directory entries that all refer to the
same file. Hard links are only valid for ordinary files in the same filesystem.
A symbolic link involves a special file containing information about where to find the linked
file. Symbolic links may be used to link directories and/or files in other filesystems, as well as
ordinary files in the current filesystem.
Windows shortcuts behave much like symbolic links. ( NTFS also supports hard links and true
symbolic links. )
Hard links require a reference count, or link count for each file, keeping track of how many
directory entries are currently referring to this file. Whenever one of the references is removed
the link count is reduced, and when it reaches zero, the disk space can be reclaimed.
For symbolic links there is some question as to what to do with the symbolic links when the
original file is moved or deleted:
One option is to find all the symbolic links and adjust them also.
Another is to leave the symbolic links dangling, and discover that they are no longer valid the
next time they are used.
What if the original file is removed, and replaced with another file having the same name before
the symbolic link is next used?
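
The link count and the dangling-link behaviour can be observed directly on a UNIX-like system.
The sketch below assumes a POSIX filesystem that supports both hard and symbolic links; the file
names are invented for the example and everything happens inside a temporary directory.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    original = os.path.join(tmp, "report.txt")
    hard = os.path.join(tmp, "report-hard.txt")
    soft = os.path.join(tmp, "report-soft.txt")

    with open(original, "w") as f:
        f.write("quarterly numbers\n")

    os.link(original, hard)         # second directory entry for the same file
    os.symlink(original, soft)      # special file that stores the path to the original

    links_before = os.stat(original).st_nlink

    os.remove(original)             # removes one directory entry, not the data

    links_after = os.stat(hard).st_nlink
    with open(hard) as f:
        still_readable = f.read()

    # The symbolic link still exists as a link, but its target is gone.
    dangling = os.path.islink(soft) and not os.path.exists(soft)

print(links_before, links_after, dangling)
```

The data stays reachable through the hard link until the link count drops to zero, while the
symbolic link is left dangling and is only discovered to be invalid when it is next used.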

General Graph Directory

If cycles are allowed in the graphs, then several problems can arise:
Search algorithms can go into infinite loops. One solution is to not follow links in search
algorithms. ( Or not to follow symbolic links, and to only allow symbolic links to refer to
directories. )
Sub-trees can become disconnected from the rest of the tree and still not have their reference
counts reduced to zero. Periodic garbage collection is required to detect and resolve this
problem. ( chkdsk in DOS and fsck in UNIX search for these problems, among others, even
though cycles are not supposed to be allowed in either system. Disconnected disk blocks that
are not marked as free are added back to the file systems with made-up file names, and can
usually be safely deleted. )
File-System Mounting

The basic idea behind mounting file systems is to combine multiple file systems into one large
tree structure.
The mount command is given a filesystem to mount and a mount point ( directory ) on which to
attach it.
Once a file system is mounted onto a mount point, any further references to that directory
actually refer to the root of the mounted file system.
Any files ( or sub-directories ) that had been stored in the mount point directory prior to mounting
the new filesystem are now hidden by the mounted filesystem, and are no longer available. For
this reason some systems only allow mounting onto empty directories.
Filesystems can only be mounted by root, unless root has previously configured certain
filesystems to be mountable onto certain pre-determined mount points. ( E.g. root may allow
users to mount floppy filesystems to /mnt or something like it. ) Anyone can run the mount
command to see what filesystems are currently mounted.
Filesystems may be mounted read-only, or have other restrictions imposed.

Protection

Files must be kept safe for reliability ( against accidental damage ), and protection ( against
deliberate malicious access. ) The former is usually managed with backup copies. This section
discusses the latter.
One simple protection scheme is to remove all access to a file. However this makes the file
unusable, so some sort of controlled access must be arranged.

Types of Access

The following low-level operations are often controlled:


Read - View the contents of the file
Write - Change the contents of the file.
Execute - Load the file onto the CPU and follow the instructions contained therein.
Append - Add to the end of an existing file.
Delete - Remove a file from the system.
List - View the name and other attributes of files on the system.
Higher-level operations, such as copy, can generally be performed through combinations of the
above.
Access Control

One approach is to have complicated Access Control Lists, ACL, which specify exactly what
access is allowed or denied for specific users or groups.
The AFS uses this system for distributed access.
Control is very finely adjustable, but may be complicated, particularly when the specific users
involved are unknown. ( AFS allows some wild cards, so for example all users on a certain
remote system may be trusted, or a given username may be trusted when accessing from any
remote system. )
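UNIX uses a set of 9 access control bits, in three groups of three. These correspond to R, W,
and X permissions for each of the Owner, Group, and Others. ( See "man chmod" for full details. )
The RWX bits control the following privileges for ordinary files and directories:
R - Files: read ( view ) the file contents. Directories: read the directory contents, i.e. list the
names of the files in it.
W - Files: write ( change ) the file contents. Directories: change the directory contents, i.e.
create, delete, or rename files in it.
X - Files: execute the file as a program. Directories: access files in the directory by name and
view their detailed information.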

In addition there are some special bits that can also be applied:
The set user ID ( SUID ) bit and/or the set group ID ( SGID ) bits applied to executable files
temporarily change the identity of whoever runs the program to match that of the owner / group
of the executable program. This allows users running specific programs to have access to files (
while running that program ) to which they would normally be unable to access. Setting of these
two bits is usually restricted to root, and must be done with caution, as it introduces a potential
security leak.
The sticky bit on a directory modifies write permission, allowing users to only delete files for
which they are the owner. This allows everyone to create files in /tmp, for example, but to only
delete files which they have created, and not anyone else's.
The SUID, SGID, and sticky bits are indicated with an S, S, and T in the positions for execute
permission for the user, group, and others, respectively. If the letter is lower case, ( s, s, t ), then
the corresponding execute permission IS also given. If it is upper case, ( S, S, T ), then the
corresponding execute permission is NOT given.
These bits can be set with the numeric form of chmod ( a leading fourth octal digit ) or with the
symbolic form ( u+s, g+s, +t ).
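
The way the special bits show up in a long listing can be checked with Python's stat.filemode,
which renders a numeric mode in the same style as "ls -l". The modes below are example values
chosen to cover each case; no files are created or modified.

```python
import stat

# Example modes: file type bits combined with octal permission bits.
EXAMPLES = {
    "ordinary 755 file": stat.S_IFREG | 0o755,
    "SUID, owner execute set": stat.S_IFREG | 0o4755,
    "SUID, owner execute not set": stat.S_IFREG | 0o4644,
    "SGID, group execute set": stat.S_IFREG | 0o2755,
    "sticky directory, others execute set": stat.S_IFDIR | 0o1777,
    "sticky directory, others execute not set": stat.S_IFDIR | 0o1776,
}

rendered = {label: stat.filemode(mode) for label, mode in EXAMPLES.items()}

for label, text in rendered.items():
    print(f"{text}  {label}")
```

A lower-case s or t appears where the matching execute bit is also set, and an upper-case S or T
appears where it is not.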

Free Space Management


Since there is only a limited amount of disk space, it is necessary to reuse the space from deleted
files for new files. To keep track of free disk space, the system maintains a free-space list. The
free-space list records all disk blocks that are free (i.e., are not allocated to some file). To create a
file, the system searches the free-space list for the required amount of space and allocates that
space to the new file. This space is then removed from the free-space list. When a file is deleted, its
disk space is added to the free-space list.

Bit-Vector

Frequently, the free-space list is implemented as a bit map or bit vector. Each block is represented
by one bit. If the block is free, the bit is 0; if the block is allocated, the bit is 1.

For example, consider a disk where blocks 2, 3, 4, 5, 8, 9, 10, 11, 12, 13, 17, 18, 25, 26, and 27
are free, and the rest of the blocks are allocated. The free-space bit map would be:

11000011000000111001111110001111

The main advantage of this approach is that it is relatively simple and efficient to find n consecutive
free blocks on the disk. Unfortunately, bit vectors are inefficient unless the entire vector is kept in
memory for most accesses. Keeping it in main memory is possible for smaller disks such as on
microcomputers, but not for larger ones.
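
A minimal sketch of searching the example bit map follows, using the convention stated above
( 0 = free, 1 = allocated ). The bit map is kept as a string for readability; a real
implementation would pack the bits into machine words.

```python
BITMAP = "11000011000000111001111110001111"   # 0 = free, 1 = allocated


def free_blocks(bitmap):
    """List the numbers of all free blocks."""
    return [i for i, bit in enumerate(bitmap) if bit == "0"]


def first_run(bitmap, n):
    """Return the first block of the first run of n consecutive free blocks, or -1."""
    return bitmap.find("0" * n)


def allocate(bitmap, start, n):
    """Return a new bit map with blocks start .. start+n-1 marked as allocated."""
    return bitmap[:start] + "1" * n + bitmap[start + n:]


free = free_blocks(BITMAP)
run_of_five = first_run(BITMAP, 5)
after = allocate(BITMAP, run_of_five, 5)

print(free)
print(run_of_five)
print(after)
```

As a rough size estimate, a disk of 2^40 bytes with 4 KB blocks has 2^28 blocks, so its bit map
needs 2^28 bits, i.e. 32 MB of memory.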

Linked List

Another approach is to link all the free disk blocks together, keeping a pointer to the first free block.
This block contains a pointer to the next free disk block, and so on. In the previous example, a
pointer could be kept to block 2, as the first free block. Block 2 would contain a pointer to block 3,
which would point to block 4, which would point to block 5, which would point to block 8, and so on.
This scheme is not efficient; to traverse the list, each block must be read, which requires
substantial I/O time.

Grouping

A modification of the free-list approach is to store the addresses of n free blocks in the first free
block. The first n-1 of these are actually free. The last one is the disk address of another block
containing addresses of another n free blocks. The importance of this implementation is that
addresses of a large number of free blocks can be found quickly.

Counting

Another approach is to take advantage of the fact that, generally, several contiguous blocks may
be allocated or freed simultaneously, particularly when contiguous allocation is used. Thus, rather
than keeping a list of free disk addresses, we keep the address of the first free block and the number
n of free contiguous blocks that follow it. Each entry in the free-space list then consists
of a disk address and a count. Although each entry requires more space than would a simple disk
address, the overall list will be shorter, as long as the count is generally greater than 1.
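
A minimal Python sketch of the counting idea, assuming the free blocks from the bit-vector example
and a hypothetical helper named to_counted_entries, collapses a sorted list of free block numbers
into (address, count) entries:

```python
FREE = [2, 3, 4, 5, 8, 9, 10, 11, 12, 13, 17, 18, 25, 26, 27]

def to_counted_entries(free):
    """Collapse a sorted list of free block numbers into (first_block, count) entries."""
    entries = []
    for block in free:
        if entries and block == entries[-1][0] + entries[-1][1]:
            first, count = entries[-1]
            entries[-1] = (first, count + 1)   # block extends the current run
        else:
            entries.append((block, 1))         # block starts a new run
    return entries

print(to_counted_entries(FREE))
```

Here 15 individual addresses shrink to 4 entries, which illustrates why the list is shorter
whenever the counts are generally greater than 1.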

Operating System - Security

Security refers to providing a protection system for computer system resources such as the CPU,
memory, disk, software programs and, most importantly, the data/information stored in the computer
system. If a computer program is run by an unauthorized user, he or she may cause severe
damage to the computer or the data stored in it. So a computer system must be protected against
unauthorized access, malicious access to system memory, viruses, worms, etc. The following
topics are discussed in this chapter.

Authentication
One Time passwords
Program Threats
System Threats
Computer Security Classifications

Authentication
Authentication refers to identifying each user of the system and associating the executing
programs with those users. It is the responsibility of the operating system to create a protection
system which ensures that a user who is running a particular program is authentic. Operating
systems generally identify/authenticate users in the following three ways

Username / Password - The user needs to enter a username and password registered with the
operating system to log in to the system.

User card/key - The user needs to insert a card into the card slot, or enter a key generated by a
key generator in the option provided by the operating system, to log in to the system.

User attribute (fingerprint / eye retina pattern / signature) - The user needs to present his/her
attribute via the designated input device used by the operating system to log in to the system.

One Time passwords


One-time passwords provide additional security along with normal authentication. In a one-time
password system, a unique password is required every time the user tries to log in to the system.
Once a one-time password has been used, it cannot be used again. One-time passwords are
implemented in various ways.

Random numbers - Users are provided with cards that have numbers printed alongside corresponding
letters. The system asks for the numbers corresponding to a few randomly chosen letters.

Secret key - Users are provided with a hardware device which can create a secret id mapped to the
user id. The system asks for this secret id, which must be generated afresh prior to every login.

Network password - Some commercial applications send a one-time password to the user's
registered mobile number or email address, which must be entered prior to login.
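
The rule that a used one-time password cannot be used again can be sketched in Python as follows.
This is only an illustration under stated assumptions: the class name OneTimePasswords, the
six-digit code format and the in-memory store are all invented here, and the delivery of the code
(card, hardware device, mobile or email) is not modelled.

```python
import secrets

class OneTimePasswords:
    """Issues single-use codes per user; a code is discarded on first successful use."""

    def __init__(self):
        self._pending = {}  # user -> code that has been issued but not yet used

    def issue(self, user):
        """Generate a fresh random six-digit code for this login attempt."""
        code = f"{secrets.randbelow(10**6):06d}"
        self._pending[user] = code
        return code

    def verify(self, user, code):
        """Accept the code once; any later attempt with the same code fails."""
        expected = self._pending.get(user)
        if expected is None or not secrets.compare_digest(expected, code):
            return False
        del self._pending[user]  # consume the code so it cannot be replayed
        return True

otp = OneTimePasswords()
code = otp.issue("alice")
print(otp.verify("alice", code))  # first use
print(otp.verify("alice", code))  # replay of the same code
```

The first verification succeeds and the replay fails, because the code is removed from the store
as soon as it has been accepted.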

Program Threats
An operating system's processes and kernel perform their designated tasks as instructed. If a user
program makes these processes perform malicious tasks, it is known as a program threat. A common
example of a program threat is a program installed on a computer that stores user credentials and
sends them over the network to an attacker. The following are some well-known program threats.

Trojan Horse - Such a program traps user login credentials and stores them to send to a malicious
user, who can later log in to the computer and access system resources.

Trap Door - If a program that is designed to work as required has a security hole in its code
and performs illegal actions without the knowledge of the user, it is said to have a trap door.

Logic Bomb - A logic bomb is a program that misbehaves only when certain conditions are met;
otherwise it works as a genuine program. This makes it harder to detect.

Virus - A virus, as the name suggests, can replicate itself on a computer system. Viruses are highly
dangerous and can modify or delete user files and crash systems. A virus is generally a small piece
of code embedded in a program. As the user accesses the program, the virus starts getting embedded
in other files and programs and can make the system unusable for the user.

System Threats
System threats refer to the misuse of system services and network connections to put the user in
trouble. System threats can be used to launch program threats on a complete network; this is called
a program attack. System threats create an environment in which operating system resources and user
files are misused. The following are some well-known system threats.

Worm - A worm is a process that can choke system performance by using system resources to
extreme levels. A worm process generates multiple copies of itself, where each copy uses
system resources and prevents all other processes from getting the resources they require. Worm
processes can even shut down an entire network.

Port Scanning - Port scanning is a mechanism or means by which a hacker can detect system
vulnerabilities in order to attack the system.

Denial of Service - Denial-of-service attacks normally prevent users from making legitimate use of
the system. For example, a user may not be able to use the internet if a denial-of-service attack
targets the browser's content settings.

Goals of Protection

To prevent malicious misuse of the system by users or programs. See chapter 15 for more
thorough coverage of this goal.
To ensure that each shared resource is used only in accordance with system policies, which
may be set either by system designers or by system administrators.
To ensure that errant programs cause the minimal amount of damage possible.
Note that protection systems only provide the mechanisms for enforcing policies and ensuring
reliable systems. It is up to administrators and users to implement those mechanisms effectively.
Principles of Protection

The principle of least privilege dictates that programs, users, and systems be given just enough
privileges to perform their tasks.
This ensures that failures do the least amount of harm and allow the least amount of harm to be done.
For example, if a program needs special privileges to perform a task, it is better to make it a
SGID program with group ownership of "network" or "backup" or some other pseudo group,
rather than SUID with root ownership. This limits the amount of damage that can occur if
something goes wrong.
Typically each user is given their own account, and has only enough privilege to modify their
own files.
The root account should not be used for normal day-to-day activities - the system administrator
should also have an ordinary account, and reserve use of the root account for only those tasks
that need root privileges.
Domain of Protection

A computer can be viewed as a collection of processes and objects (both hardware and software).


The need to know principle states that a process should only have access to those objects it
needs to accomplish its task, and furthermore only in the modes for which it needs access and
only during the time frame when it needs access.
The modes available for a particular object may depend upon its type.

Domain Structure

A protection domain specifies the resources that a process may access.


Each domain defines a set of objects and the types of operations that may be invoked on each
object.
An access right is the ability to execute an operation on an object.
A domain is defined as a set of < object, { access right set } > pairs, as shown below. Note that
some domains may be disjoint while others overlap.

The association between a process and a domain may be static or dynamic.


If the association is static, then the need-to-know principle requires a way of changing the
contents of the domain dynamically.
If the association is dynamic, then there needs to be a mechanism for domain switching.
Domains may be realized in different fashions - as users, or as processes, or as procedures.
E.g. if each user corresponds to a domain, then that domain defines the access of that user, and
changing domains involves changing user ID.

Access Matrix
The model of protection that we have been discussing can be viewed as an access matrix, in
which columns represent different system resources and rows represent different protection
domains. Entries within the matrix indicate what access that domain has to that resource.
- Access matrix.

Domain switching can be easily supported under this model, simply by providing "switch" access to
other domains:

Access matrix of previous Figure with domains as objects.
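
Since the figures are not reproduced here, the following Python sketch shows one possible access
matrix with domains added as objects. The entries, the domain and object names (D1 to D4, F1 to F3,
printer) and the helper names allowed and switch_domain are assumptions chosen for illustration,
not the contents of the original figure.

```python
# Rows are domains, columns are objects; a missing entry means no access.
# Domains also appear as columns so that "switch" rights can be recorded.
ACCESS_MATRIX = {
    "D1": {"F1": {"read"}, "F3": {"read"}, "D2": {"switch"}},
    "D2": {"printer": {"print"}, "D3": {"switch"}, "D4": {"switch"}},
    "D3": {"F2": {"read"}, "F3": {"execute"}},
    "D4": {"F1": {"read", "write"}, "F3": {"read", "write"}, "D1": {"switch"}},
}

def allowed(domain, obj, right):
    """Check whether the matrix entry for (domain, obj) contains the right."""
    return right in ACCESS_MATRIX.get(domain, {}).get(obj, set())

def switch_domain(current, target):
    """Return the new domain if current holds "switch" access to target."""
    if not allowed(current, target, "switch"):
        raise PermissionError(f"{current} may not switch to {target}")
    return target

print(allowed("D1", "F1", "read"))    # D1 may read F1
print(allowed("D1", "F1", "write"))   # but may not write it
print(switch_domain("D1", "D2"))      # D1 holds switch access to D2
```

Storing the matrix as a dictionary of dictionaries keeps only the non-empty entries, which is a
convenient choice for a sparse matrix but not the only possible one.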

The ability to copy rights is denoted by an asterisk, indicating that processes in that domain have
the right to copy that access within the same column, i.e. for the same object. There are two
important variations:
If the asterisk is removed from the original access right, then the right is transferred, rather than
being copied. This may be termed a transfer right as opposed to a copy right.
If only the right and not the asterisk is copied, then the access right is added to the new domain,
but it may not be propagated further. That is, the new domain does not also receive the right to
copy the access. This may be termed a limited copy right, as shown in Figure below:
- Access matrix with copy rights.
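
The three variants (copy, transfer and limited copy) can be sketched in Python as below. The
sketch marks a copyable right with a trailing asterisk, as in the text. The function name
propagate and the mode names are hypothetical, and transfer is interpreted here as removing the
right from the source domain, which is one reading of the description above.

```python
def propagate(matrix, src, dst, obj, right, mode="copy"):
    """Propagate `right` on `obj` from domain src to domain dst, within one column.

    mode "copy":     dst receives the right together with the asterisk.
    mode "transfer": as "copy", but src loses the right (assumed interpretation).
    mode "limited":  dst receives the right without the asterisk.
    """
    if mode not in ("copy", "transfer", "limited"):
        raise ValueError(mode)
    starred = right + "*"
    if starred not in matrix[src].get(obj, set()):
        raise PermissionError(f"{src} holds no copy right for {right} on {obj}")
    target = matrix[dst].setdefault(obj, set())
    if mode == "transfer":
        matrix[src][obj].discard(starred)
    target.add(right if mode == "limited" else starred)

matrix = {"D1": {"F1": {"read*"}}, "D2": {}, "D3": {}}
propagate(matrix, "D1", "D2", "F1", "read", mode="limited")
propagate(matrix, "D1", "D3", "F1", "read", mode="copy")
print(matrix)
```

After these two calls D2 holds a plain read right that it cannot pass on, while D3 holds read*
and may propagate it further.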

The owner right adds the privilege of adding new rights or removing existing ones:

- Access matrix with owner rights.

Copy and owner rights only allow the modification of rights within a column. The addition of
control rights, which only apply to domain objects, allows a process operating in one domain to
affect the rights available in other domains. For example in the table below, a process operating
in domain D2 has the right to control any of the rights in domain D4.

- Modified access matrix

Implementation of Access Matrix


Global Table

The simplest approach is one big global table with < domain, object, rights > entries.
Unfortunately this table is very large (even if sparse) and so cannot be kept in memory
(without invoking virtual memory techniques).
There is also no good way to specify groupings - if everyone has access to some resource, then
it still needs a separate entry for every domain.
Access Lists for Objects

Each column of the table can be kept as a list of the access rights for that particular object,
discarding blank entries.
For efficiency a separate list of default access rights can also be kept, and checked first.
Capability Lists for Domains

In a similar fashion, each row of the table can be kept as a list of the capabilities of that domain.
Capability lists are associated with each domain, but not directly accessible by the domain or
any user process.
Capability lists are themselves protected resources, distinguished from other data in one of two
ways:
A tag, possibly hardware implemented, distinguishing this special type of data (other types
may be floats, pointers, booleans, etc.).
The address space for a program may be split into multiple segments, at least one of which
is inaccessible by the program itself, and used by the operating system for maintaining the
process's access right capability list.
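
The relationship between the two representations can be sketched in Python: access lists are the
columns of the matrix and capability lists are its rows, with blank entries discarded in both
cases. The matrix contents and the function names access_lists and capability_lists are
illustrative assumptions, and the protection of the capability lists themselves is not modelled.

```python
# Illustrative matrix (assumed entries): domain -> {object: set of rights}.
MATRIX = {
    "D1": {"F1": {"read"}, "F3": {"read"}},
    "D2": {"printer": {"print"}},
    "D3": {"F2": {"read"}, "F3": {"execute"}},
}

def capability_lists(matrix):
    """One list per row: the non-empty (object, rights) entries of each domain."""
    return {
        domain: sorted((obj, sorted(rights)) for obj, rights in row.items())
        for domain, row in matrix.items()
    }

def access_lists(matrix):
    """One list per column: the non-empty (domain, rights) entries of each object."""
    lists = {}
    for domain, row in matrix.items():
        for obj, rights in row.items():
            lists.setdefault(obj, []).append((domain, sorted(rights)))
    return lists

print(capability_lists(MATRIX)["D1"])
print(access_lists(MATRIX)["F3"])
```

Both structures hold exactly the same information; they differ only in whether it is grouped by
domain or by object.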
A Lock-Key Mechanism

Each resource has a list of unique bit patterns, termed locks.


Each domain has its own list of unique bit patterns, termed keys.
Access is granted if one of the domain's keys fits one of the resource's locks.
Again, a process is not allowed to modify its own keys.
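
A minimal Python sketch of the lock-key check, assuming that a key "fits" a lock when the two bit
patterns are equal and using made-up bit patterns and names:

```python
def access_granted(domain_keys, resource_locks):
    """Grant access if at least one of the domain's keys equals one of the locks."""
    return not set(domain_keys).isdisjoint(resource_locks)

# Hypothetical bit patterns, written as binary integers.
printer_locks = {0b1011, 0b0110}
d1_keys = frozenset({0b0110, 0b1111})  # frozenset: the keys cannot be modified in place
d2_keys = frozenset({0b0001})

print(access_granted(d1_keys, printer_locks))  # D1 holds a matching key
print(access_granted(d2_keys, printer_locks))  # D2 does not
```
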
Comparison

Each of the methods here has certain advantages or disadvantages, depending on the particular
situation and task at hand.
Many systems employ some combination of the listed methods.
Access Control
Role-Based Access Control, RBAC, assigns privileges to users, programs, or roles as
appropriate, where "privileges" refer to the right to call certain system calls, or to use certain
parameters with those calls.
RBAC supports the principle of least privilege, and reduces the susceptibility to abuse as
opposed to SUID or SGID programs.

- Role-based access control in Solaris 10.


Revocation of Access Rights

The need to revoke access rights dynamically raises several questions:


Immediate versus delayed - If delayed, can we determine when the revocation will take
place?
Selective versus general - Does revocation of an access right to an object affect all users
who have that right, or only some users?
Partial versus total - Can a subset of rights for an object be revoked, or are all rights
revoked at once?
Temporary versus permanent - If rights are revoked, is there a mechanism for processes to
re-acquire some or all of the revoked rights?
With an access list scheme revocation is easy, immediate, and can be selective, general, partial,
total, temporary, or permanent, as desired.
With capability lists the problem is more complicated, because access rights are distributed
throughout the system. A few schemes that have been developed include:
Reacquisition - Capabilities are periodically revoked from each domain, which must then re-
acquire them.
Back-pointers - A list of pointers is maintained from each object to each capability which is
held for that object.
Indirection - Capabilities point to an entry in a global table rather than to the object. Access
rights can be revoked by changing or invalidating the table entry, which may affect multiple
processes, which must then re-acquire access rights to continue.
Keys - A unique bit pattern is associated with each capability when created, which can be
neither inspected nor modified by the process.
A master key is associated with each object.
When a capability is created, its key is set to the object's master key.
As long as the capability's key matches the object's key, the capability remains
valid.
The object master key can be changed with the set-key command, thereby invalidating
all current capabilities.
More flexibility can be added to this scheme by implementing a list of keys for each
object, possibly in a global table.
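
The key-based scheme can be sketched in Python as follows. The class names, the use of an integer
counter as the master key and the method name set_key (mirroring the set-key command mentioned
above) are assumptions made for this illustration; a real system would use an unforgeable bit
pattern that the process can neither inspect nor modify.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Capability:
    object_name: str
    key: int  # copied from the object's master key when the capability is created

class ProtectedObject:
    def __init__(self, name):
        self.name = name
        self._master_key = 0

    def create_capability(self):
        """A new capability's key is set to the object's current master key."""
        return Capability(self.name, self._master_key)

    def set_key(self):
        """Change the master key, invalidating all current capabilities."""
        self._master_key += 1

    def is_valid(self, cap):
        """A capability is valid while its key matches the object's master key."""
        return cap.object_name == self.name and cap.key == self._master_key

f1 = ProtectedObject("F1")
cap = f1.create_capability()
print(f1.is_valid(cap))   # valid while the keys match
f1.set_key()
print(f1.is_valid(cap))   # revoked by the key change
```

A capability created after the key change is valid again, which models re-acquisition of access.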
