
Introduction to Operating Systems

Anupam Basu

What does an Operating System do?


An OS is similar to a government
  Begs the question: does a government do anything useful by itself?
Coordinator and Traffic Cop:
  Manages all resources
  Settles conflicting requests for resources
  Prevents errors and improper use of the computer
Facilitator:
  Provides facilities that everyone needs
  Standard libraries, windowing systems
  Makes application programming easier, faster, less error-prone
Some features reflect both tasks:
  E.g. the file system is needed by everyone (Facilitator)
  But the file system must be protected (Traffic Cop)

Review: Virtual Machine Abstraction

[Figure: layered stack. Application / (Virtual Machine Interface) / Operating System / (Physical Machine Interface) / Hardware]

Software Engineering Problem: what programmers want/need
Optimize for convenience, utilization, security, reliability, etc.
For any OS area (e.g. file systems, virtual memory, networking, scheduling):
  What's the hardware interface? (physical reality)
  What's the application interface? (nicer abstraction)

Review: Example of Address Translation

[Figure: Prog 1 and Prog 2 each have their own virtual address space holding Code, Data, Heap, and Stack. Translation Map 1 and Translation Map 2 map these onto one physical address space, which interleaves Data 2, Stack 1, Heap 1, Code 1, Stack 2, Data 1, Heap 2, and Code 2 with OS code, OS data, and OS heap & stacks.]

Review: Dual Mode Operation


Hardware provides at least two modes:
  Kernel mode (or supervisor or protected)
  User mode: normal programs executed
Some instructions/ops prohibited in user mode:
  Example: cannot modify page tables in user mode
  Attempt to modify => exception generated
Transitions from user mode to kernel mode:
  System calls, interrupts, other exceptions

Moore's Law Change Drives OS Change

                   1981       2006       Factor
CPU MHz            10         3200x4     1,280
Cycles/inst        3-10       0.25-0.5   6-40
DRAM capacity      128KB      4GB        32,768
Disk capacity      10MB       1TB        100,000
Net bandwidth      9600 b/s   1 Gb/s     110,000
# addr bits        16         32         2
#users/machine     10s        ≤ 1        ≤ 0.1
Price              $25,000    $4,000     0.2

Typical academic computer 1981 vs 2006

Dawn of time ENIAC: (1945-1955)

The machine designed by Drs. Eckert and Mauchly was a monstrosity. When it was finished, the ENIAC filled an entire room, weighed thirty tons, and consumed two hundred kilowatts of power. http://ei.cs.vt.edu/~history/ENIAC.Richey.HTML

History Phase 1 (1948-1970) Hardware Expensive, Humans Cheap

When computers cost millions of $'s, optimize for more efficient use of the hardware!
  Lack of interaction between user and computer
User at console: one user at a time
Batch monitor: load program, run, print
Optimize to better use hardware
  When user thinking at console, computer idle: BAD!
  Feed computer batches and make users wait
  Autograder for this course is similar
No protection: what if batch program has bug?

History Phase 1 (late 60s/early 70s)


Data channels, interrupts: overlap I/O and compute
  DMA: Direct Memory Access for I/O devices
  I/O can be completed asynchronously
Multiprogramming: several programs run simultaneously
  Small jobs not delayed by large jobs
  More overlap between I/O and CPU
  Need memory protection between programs and/or OS
Complexity gets out of hand:
  Multics: announced in 1963, ran in 1969
    1777 people contributed to Multics (30-40 core dev)
    Turing award lecture from Fernando Corbató (key researcher): On building systems that will fail
  OS 360: released with 1000 known bugs (APARs)
    APAR: Authorized Program Analysis Report
OS finally becomes an important science:
  How to deal with complexity???
  UNIX based on Multics, but vastly simplified

A Multics System (Circa 1976)

The 6180 at MIT IPC, skin doors open, circa 1976: We usually ran the machine with doors open so the operators could see the AQ register display, which gave you an idea of the machine load, and for convenient access to the EXECUTE button, which the operator would push to enter BOS if the machine crashed. http://www.multicians.org/multics-stories.html

Early Disk History

1973: 1.7 Mbit/sq. in, 140 MBytes

1979: 7.7 Mbit/sq. in, 2,300 MBytes

source: New York Times, 2/23/98, page C3, Makers of disk drives crowd even more data into even smaller spaces

History Phase 2 (1970-1985) Hardware Cheaper, Humans Expensive

Computers available for tens of thousands of dollars instead of millions
OS technology maturing/stabilizing
Interactive timesharing:
  Use cheap terminals (~$1000) to let multiple users interact with the system at the same time
  Sacrifice CPU time to get better response time
  Users do debugging, editing, and email online
Problem: Thrashing
  Performance very non-linear response with load
  Thrashing caused by many factors including swapping, queueing

[Figure: response time vs. number of users]

History Phase 3 (1981- ) Hardware Very Cheap, Humans Very Expensive

Computer costs $1K, programmer costs $100K/year
  If you can make someone 1% more efficient by giving them a computer, it's worth it!
  Use computers to make people more efficient
Personal computing:
  Computers cheap, so give everyone a PC
Limited hardware resources initially:
  OS becomes a subroutine library
  One application at a time (MS-DOS, CP/M, ...)
Eventually PCs become powerful:
  OS regains all the complexity of a big OS
  Multiprogramming, memory protection, etc. (NT, OS/2)
Question: As hardware gets cheaper, does the need for an OS go away?

History Phase 4 (1989): Distributed Systems


Networking (Local Area Networking)
  Different machines share resources
  Printers, file servers, web servers
  Client-server model
Services
  Computing
  File storage

History Phase 5 (1995): Mobile Systems


Ubiquitous mobile devices
  Laptops, PDAs, phones
  Small, portable, and inexpensive
    Recently twice as many smart phones as PDAs
    Many computers/person!
  Limited capabilities (memory, CPU, power, etc.)
Wireless/wide area networking
  Leveraging the infrastructure
  Huge distributed pool of resources extends devices
  Traditional computers split into pieces: wireless keyboards/mice, CPU distributed, storage remote
Peer-to-peer systems
  Many devices with equal responsibilities work together
  Components of the operating system spread across the globe

Moore's Law Reprise: Modern Laptop

                   1981       2005       2006 Ultralight Laptop
CPU MHz            10         3200x4     1830
Cycles/inst        3-10       0.25-0.5   0.25-0.5
DRAM capacity      128KB      4GB        2GB
Disk capacity      10MB       1TB        100GB
Net bandwidth      9600 b/s   1 Gb/s     1 Gb/s (wired), 54 Mb/s (wireless), 2 Mb/s (wide-area)
# addr bits        16         32         32
#users/machine     10s        ≤ 1        ≤ ...
Price              $25,000    $4,000     $2500

Migration of Operating-System Concepts and Features

Compare: Performance Trends

[Figure: log of performance vs. year (1970-1995) for supercomputers, mainframes, minicomputers, and microprocessors]

History of OS: Summary


Change is continuous and OSs should adapt
  Not: look how stupid batch processing was
  But: made sense at the time
Situation today is much like the late 60s [poll]
  Small OS: 100K lines
  Large OS: 10M lines (5M for the browser!)
    100-1000 people-years
Complexity still reigns
  NT under development from early 90s to late 90s
    Never worked very well
  Jury still out on Windows 2000/XP
  Windows Vista (aka Longhorn) delayed many times
    Latest release date of 2005, 2006, 2007+
    Promised by removing some of the intended technology

Now for a quick tour of OS Structures

Operating Systems Components (What are the pieces of the OS)

  Process Management
  Main-Memory Management
  I/O System Management
  File Management
  Networking
  User Interfaces

Operating System Services (What things does the OS do?)

Services that (more-or-less) map onto components
  Program execution
    How do you execute concurrent sequences of instructions?
  I/O operations
    Standardized interfaces to extremely diverse devices
  File system manipulation
    How do you read/write/preserve files?
    Looming concern: How do you even find files???
  Communications
    Networking protocols/interface with cyberspace?
Cross-cutting capabilities
  Error detection & recovery
  Resource allocation
  Accounting
  Protection

System Calls (What is the API)

System Calls System calls provide the interface between a running program and the operating system.

Generally available as assembly-language instructions. Languages defined to replace assembly language for systems programming allow system calls to be made directly (e.g., C, C++)

System Calls are Used Frequently A single program may make numerous system calls. For example, a program to read from one file and write to another would need system calls for the following:

  Prompt the user to enter file names
  Read in filenames
  Open input file
  Read from input file
  Open/create output file
  Write output to file
  Close input and output files
The system must be able to signal and handle errors that occur.
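The file-copy steps above can be sketched with the POSIX calls open, read, write, and close. This is a minimal sketch, not the slides' own code: the helper name copy_file, the 4 KB buffer, and the 0644 permission mode are assumptions made for illustration. Every call is checked, which is how a program sees the errors the system signals.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: copy src to dst using only system calls.
 * Returns 0 on success, -1 if any system call reports an error. */
int copy_file(const char *src, const char *dst)
{
    char buf[4096];
    ssize_t n;

    int in = open(src, O_RDONLY);                 /* open input file */
    if (in < 0)
        return -1;

    int out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0644); /* open/create output */
    if (out < 0) {
        close(in);
        return -1;
    }

    while ((n = read(in, buf, sizeof buf)) > 0) { /* read from input file */
        ssize_t off = 0;
        while (off < n) {                         /* write may be partial */
            ssize_t w = write(out, buf + off, (size_t)(n - off));
            if (w < 0) {
                close(in);
                close(out);
                return -1;
            }
            off += w;
        }
    }

    close(in);                                    /* close both files */
    if (close(out) < 0 || n < 0)
        return -1;
    return 0;
}
```

A caller would prompt for the two file names, pass them to copy_file, and report failure when it returns -1.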

Passing Parameters

Three general methods are used to pass parameters between a running program and the operating system.

  Pass parameters in registers.
  Store the parameters in a table in memory, and pass the table address as a parameter in a register.
  Push (store) the parameters onto the stack by the program, and pop them off the stack by the operating system.

Passing of Parameters As A Table

Types of System Calls


Process control
  create (fork), execute (exec), wait, end (exit), abort (kill), etc.
File management
  creat, delete, open, close, read, write, cp, rm, mkdir, rmdir, ls, cat, more, grep, etc.
  get/set file attributes
Device management
  read, write, attach (mount), detach (unmount), get/set device attributes
Information maintenance
  get/set time or date, get/set file attributes (chmod, chown, chgrp), get/set process/device attributes (du, ps, etc.)
Communications
  send, receive, connect, accept, get/set status information, gethostid/sethostid, gethostbyname, etc.

Process Control
A process executing one program may want to load and execute another program (e.g. the shell loads and executes programs). The following are important considerations: Where does control return when the new process is finished?
If return control to existing program, must save memory image of existing program before loading new process. If both programs are to run concurrently, the new process is added to the multiprogramming set of processes.

Controlling execution of the process:

get/set process attributes, terminate process


Waiting for the process to finish

wait event, signal event


Terminating the process

Normal (exit) or abnormal (abort) termination.


There are multiple ways of implementing process control.

MS-DOS Execution
MS-DOS runs the command interpreter on startup. It loads a new program into memory, writing over part of the interpreter. When the program terminates, the part of the interpreter not overwritten begins execution. It loads the rest of the interpreter from disk.

At System Start-up

Running a Program

UNIX Running Multiple Programs


UNIX runs the shell on startup. To start a new process, it uses the fork command to create the process and exec to execute it. If the process is in the foreground, the shell waits for the process to finish. If the process is in the background, the user can continue to issue commands while the process is running. When the process is finished, it executes an exit system call.
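A rough sketch of how a shell might treat foreground and background commands with fork, exec, and wait. The helper name run_command and the return-value convention are assumptions for illustration, and exit code 127 for a failed exec follows common shell practice rather than anything stated in the slides.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] the way a shell would.
 * background == 0: wait for the child, return its exit status (or -1).
 * background != 0: return the child's pid at once; the caller reaps it later. */
int run_command(char *const argv[], int background)
{
    pid_t pid = fork();              /* create the new process */
    if (pid < 0)
        return -1;

    if (pid == 0) {                  /* child: become the requested program */
        execvp(argv[0], argv);
        _exit(127);                  /* reached only if exec failed */
    }

    if (background)                  /* parent: do not wait */
        return (int)pid;

    int status;                      /* parent: foreground job, so wait */
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With background set, the shell can go straight back to reading commands and call waitpid on the returned pid later.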

Message Passing
Processes use message passing to send messages to one another.
First, the connection must be opened.
  The name of the communicator must be known (hostname or host id, process name or process id). Use get process id or get host id.
  open connection, close connection
  The recipient uses accept connection.
  The initiator is the client. The recipient of the request is the server.
Exchange of information is made with write message system calls.
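As a small illustration of the client/server exchange, the sketch below uses a Unix socket pair between a parent (client) and a forked child (server). This stands in for the generic open/accept connection calls named above; the helper name request_upper, the upper-casing "service", and the 256-byte server buffer are assumptions, and it relies on short messages arriving in a single read.

```c
#include <ctype.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: the client sends msg to a forked server, which
 * replies with the upper-cased text. Returns the reply length or -1. */
ssize_t request_upper(const char *msg, size_t len, char *reply, size_t cap)
{
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) /* the "connection" */
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    if (pid == 0) {                      /* server: receive, process, reply */
        char buf[256];
        close(sv[0]);
        ssize_t n = read(sv[1], buf, sizeof buf);
        for (ssize_t i = 0; i < n; i++)
            buf[i] = (char)toupper((unsigned char)buf[i]);
        if (n > 0 && write(sv[1], buf, (size_t)n) < 0)
            _exit(1);
        _exit(0);
    }

    close(sv[1]);                        /* client: send request, read reply */
    ssize_t n = -1;
    if (write(sv[0], msg, len) == (ssize_t)len)
        n = read(sv[0], reply, cap);
    close(sv[0]);                        /* close connection */
    waitpid(pid, NULL, 0);
    return n;
}
```

The two processes never touch each other's memory; all data crosses the boundary through the kernel via read and write.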

Shared Memory
In memory sharing, processes communicate by writing and reading to the same memory addresses.
Processes use map memory system calls to access memory owned by other processes.
Both processes must agree to remove the O.S. memory restriction so that they can access the same region of memory.
The processes are responsible for the form and location of the data.
The processes are responsible for making sure that they are not writing to the same memory area simultaneously.
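A minimal sketch of shared memory between a parent and child, assuming a Linux-style mmap with MAP_SHARED | MAP_ANONYMOUS. The region is set up before fork rather than mapped from another process, which is a simplification; the helper name share_int_with_child is hypothetical. The only synchronization is that the parent reads after waiting for the child to exit.

```c
#define _DEFAULT_SOURCE          /* exposes MAP_ANONYMOUS on strict compilers */
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: the child stores value in a region shared with the
 * parent. Returns the value the parent then reads, or -1 on error. */
int share_int_with_child(int value)
{
    int *shared = mmap(NULL, sizeof *shared, PROT_READ | PROT_WRITE,
                       MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (shared == MAP_FAILED)
        return -1;
    *shared = 0;

    pid_t pid = fork();
    if (pid < 0) {
        munmap(shared, sizeof *shared);
        return -1;
    }
    if (pid == 0) {              /* child writes directly into the shared page */
        *shared = value;
        _exit(0);
    }

    waitpid(pid, NULL, 0);       /* parent reads only after the child is done */
    int seen = *shared;
    munmap(shared, sizeof *shared);
    return seen;
}
```

No read or write system call carries the data here; the two processes agree on the location and layout themselves, as the slide notes.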

Communication Models
Communication may take place using either message passing or shared memory.

Msg Passing

Shared Memory

System Programs

System programs provide a convenient environment for program development and execution. They can be divided into:
  File manipulation
  Status information
  File modification
  Programming language support
  Program loading and execution
  Communications
  Application programs
Most users' view of the operating system is defined by system programs, not the actual system calls.

Simple Structure
MS-DOS: written to provide the most functionality in the least space
  Not divided into modules
  Interfaces and levels of functionality not well separated

UNIX: Also Simple Structure

UNIX: limited by hardware functionality
Original UNIX operating system consists of two separable parts:
  Systems programs
  The kernel
    Consists of everything below the system-call interface and above the physical hardware
    Provides the file system, CPU scheduling, memory management, and other operating-system functions
    Many interacting functions for one level

UNIX System Structure

[Figure: user mode (applications, standard libs) above kernel mode, above hardware]

Layered Operating System

Modules-based Structure
Most modern operating systems implement modules
  Uses object-oriented approach
  Each core component is separate
  Each talks to the others over known interfaces
  Each is loadable as needed within the kernel
Overall, similar to layers but more flexible

Interrupts

Computer Hardware Review

[Figure: CPU, memory, video controller, keyboard controller, floppy disk controller, hard disk controller]

Hardware Components
  Processors
  Memory
  I/O devices
  Buses

I/O Devices
OS must manage I/O devices. User programs cannot access I/O devices directly.
An I/O device usually consists of two parts:
  A device controller
    A chip or set of chips that controls the device. It usually has a microcontroller in it that runs independently of the CPU and is programmed to control the device.
  The device itself
Examples:
  A graphics card and a monitor
  A hard disk controller and the hard disk (drive) itself
  A floppy disk controller and the floppy disk drive itself
  A printer controller and the printer itself
  A keyboard controller and the keyboard itself
  ...
A device controller is also called a card or an adapter.
Some controllers may not have an associated mechanical device (a network card, for example).

I/O Devices

[Figure: a device may use a controller from manufacturer A, B, or C (hardware); each OS (Linux, Windows XP, Solaris) needs its own driver for the controller (software).]

I/O types
Polling (busy waiting)
  Driver polls the controller until data is available or the operation is complete
  Wastes CPU on useless polling (busy waiting)
Interrupt driven
  Driver starts I/O by giving commands to the controller
  Process is blocked
  CPU is given to another process
  Controller does the job independently of the CPU and, when finished, interrupts the CPU
  Interrupt service routine of the driver takes the data from the controller and copies it to memory
  The process that was blocked can now continue to run

Interrupt Driven I/O

[Figure: hard disk and drive, disk controller, interrupt controller, and CPU connected by a bus.
  1- driver gives command to the controller to do some job
  2- controller sends interrupt using bus
  3- interrupt controller asserts CPU pin
  4- interrupt controller sends device no.]

[Figure: the interrupt arrives between the current instruction and the next instruction of the user program; the interrupt vector in the OS kernel selects the device driver (one of several) that handles it.]

Processes and Multiprogramming


A process can be viewed as a program in memory (executing or waiting to execute)
Multiprogramming: having multiple programs in memory to be executed
When you start running a program, a process is created for it and executed (e.g. Netscape, Notepad, Word)
Processes may create other processes
  CreateProcess (in Windows 98+)
  fork (in Unix and Linux)

Process Hierarchy
In Unix, processes created by other processes form a hierarchy (with parent-child relationships).
In Windows, there is no specific hierarchy.

The Birth of a Program

myprogram.c:

    int j;
    char *s = "hello\n";

    int p() {
        j = write(1, s, 6);
        return (j);
    }

compiler -> myprogram.s:

    ...
    p:
        store this
        store that
        push
        jsr _write
        ret
        etc.

assembler -> myprogram.o (object file: program, data)

linker (combines myprogram.o with libraries and other objects) -> myprogram (executable file: program, data)

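A Peek Inside a Running Program

[Figure: the CPU registers (R0 ... Rn, PC, SP) point into the memory of the running program. The address space (virtual or physical) runs from 0 to high and holds your program's code, the common runtime library code, your data, the heap, and the stack.]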

Process States

As a process executes, it changes state
  new: The process is being created
  running: Instructions are being executed
  waiting: The process is waiting for some event to occur
  ready: The process is waiting to be assigned to a processor
  terminated: The process has finished execution
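The five states can be written as an enum with a check for legal moves between them. The slide lists only the states; the transitions below follow the usual textbook five-state diagram and are an assumption here, as are the names proc_state and can_transition.

```c
#include <stdbool.h>

enum proc_state { P_NEW, P_READY, P_RUNNING, P_WAITING, P_TERMINATED };

/* Returns true if the assumed five-state model allows moving from -> to. */
bool can_transition(enum proc_state from, enum proc_state to)
{
    switch (from) {
    case P_NEW:
        return to == P_READY;            /* admitted by the OS */
    case P_READY:
        return to == P_RUNNING;          /* picked by the CPU scheduler */
    case P_RUNNING:
        return to == P_READY             /* interrupted or preempted */
            || to == P_WAITING           /* waits for I/O or an event */
            || to == P_TERMINATED;       /* exits */
    case P_WAITING:
        return to == P_READY;            /* I/O or event completed */
    default:
        return false;                    /* terminated is final */
    }
}
```

Note that a waiting process never goes straight back to running: it must first become ready and be scheduled again.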

Process Control Block (PCB)

Information associated with each process
  Process state
  Program counter
  CPU registers
  CPU scheduling information
  Memory-management information
  Accounting information
  I/O status information
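One way to picture a PCB is as a C struct with one field per item above, plus a toy context switch that saves CPU state into one PCB and loads it from another. All field names, the register count, and the modeling of the CPU as a plain array are illustrative assumptions; a real kernel's structure is much larger and the switch is done in assembly.

```c
#include <stdint.h>
#include <string.h>

#define NUM_REGS 16                  /* assumed register count */

struct pcb {
    int       pid;
    int       state;                 /* process state */
    uintptr_t program_counter;       /* program counter */
    uintptr_t registers[NUM_REGS];   /* CPU registers */
    int       priority;              /* CPU scheduling information */
    uintptr_t page_table_base;       /* memory-management information */
    uint64_t  cpu_time_used;         /* accounting information */
    int       open_files[8];         /* I/O status information */
};

/* Toy context switch: save the "CPU" into prev's PCB, load it from next's. */
void context_switch(struct pcb *prev, const struct pcb *next,
                    uintptr_t cpu_regs[NUM_REGS], uintptr_t *cpu_pc)
{
    memcpy(prev->registers, cpu_regs, sizeof prev->registers);
    prev->program_counter = *cpu_pc;

    memcpy(cpu_regs, next->registers, sizeof next->registers);
    *cpu_pc = next->program_counter;
}
```

Everything done inside context_switch is pure bookkeeping, which is why switch time counts as overhead.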

CPU Switch From Process to Process

Context-switch time is overhead; the system does no useful work while switching Time dependent on hardware support

Process Scheduling Queues


Job queue: set of all processes in the system
Ready queue: set of all processes residing in main memory, ready and waiting to execute
Device queues: set of processes waiting for an I/O device
Processes migrate among the various queues
I/O-bound process: spends more time doing I/O than computations; many short CPU bursts
CPU-bound process: spends more time doing computations; few very long CPU bursts
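Scheduling queues are commonly pictured as linked lists of PCBs. The sketch below uses a stripped-down process record and a FIFO queue to show a process migrating from the ready queue to a device queue; the names proc, queue, enqueue, and dequeue are hypothetical, and FIFO order is just the simplest policy to show.

```c
#include <stddef.h>

struct proc {
    int pid;
    struct proc *next;       /* link to the next process in the same queue */
};

struct queue {
    struct proc *head, *tail;
};

/* Append p at the tail of q. */
void enqueue(struct queue *q, struct proc *p)
{
    p->next = NULL;
    if (q->tail)
        q->tail->next = p;
    else
        q->head = p;
    q->tail = p;
}

/* Remove and return the process at the head of q, or NULL if q is empty. */
struct proc *dequeue(struct queue *q)
{
    struct proc *p = q->head;
    if (p) {
        q->head = p->next;
        if (!q->head)
            q->tail = NULL;
        p->next = NULL;
    }
    return p;
}
```

A process that issues an I/O request is taken off the ready queue with dequeue and placed on the device's queue with enqueue; when the I/O completes, it moves back the same way.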

Ready Queue And Various I/O Device Queues

Representation of Process Scheduling

Schedulers
Long-term scheduler (or job scheduler)
  Selects which processes should be brought into the ready queue
  Invoked infrequently (seconds, minutes), so it may be slow
Short-term scheduler (or CPU scheduler)
  Selects which process should be executed next and allocates the CPU
  Invoked frequently (milliseconds), so it must be fast
(Medium-term scheduler)

Process Creation
Parent processes create child processes, which form a tree of processes
Resource sharing (options)
  Parent and children share all resources
  Children share a subset of the parent's resources
  Parent and child share no resources
Execution (options)
  Parent and children execute concurrently
  Parent waits until children terminate
Address space (options)
  Child is a duplicate of the parent
  Child has a program loaded into it
UNIX examples
  fork system call creates a new process
  exec system call is used after a fork to replace the process's memory space with a new program

A tree of processes on Solaris

Example: Process Creation in Unix


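    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main()
    {
        pid_t pid;

        /* fork another process */
        pid = fork();
        if (pid < 0) { /* error occurred */
            fprintf(stderr, "Fork Failed\n");
            exit(-1);
        } else if (pid == 0) { /* child process */
            execlp("/bin/ls", "ls", (char *)NULL);
            /* reached only if execlp itself fails */
            perror("execlp");
            _exit(127);
        } else { /* parent process */
            /* parent will wait for the child to complete */
            wait(NULL);
            printf("Child Complete\n");
            exit(0);
        }
    }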

The fork syscall returns twice: it returns a zero to the child and the child process ID (pid) to the parent. Parent uses wait to sleep until the child exits; wait returns child pid and status. Wait variants allow wait on a specific child, or notification of stops and other signals.
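The status that wait hands back is an encoded value, decoded with the standard macros WIFEXITED, WEXITSTATUS, WIFSIGNALED, and WTERMSIG. In this sketch, waitpid is the wait variant that targets a specific child. The helper name reap_child and its return convention (exit code, or the negated signal number) are assumptions made for illustration.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that exits with `code`, or kills itself
 * with signal `sig` when sig != 0. Returns the child's exit code, -sig if it
 * was killed by a signal, or -1000 on error. */
int reap_child(int code, int sig)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1000;

    if (pid == 0) {                      /* fork returned 0: this is the child */
        if (sig != 0)
            raise(sig);
        _exit(code);
    }

    int status;                          /* fork returned the child's pid */
    if (waitpid(pid, &status, 0) != pid) /* wait for that specific child */
        return -1000;

    if (WIFEXITED(status))
        return WEXITSTATUS(status);      /* normal exit */
    if (WIFSIGNALED(status))
        return -WTERMSIG(status);        /* terminated by a signal */
    return -1000;
}
```

For example, reap_child(3, 0) yields 3, while reap_child(0, SIGKILL) yields -SIGKILL because the child never reaches its exit call.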

Unix Fork/Exec/Exit/Wait Example

int pid = fork();
  Create a new process that is a clone of its parent.
exec*(program [, argvp, envp]);
  Overlay the calling process's virtual memory with a new program, and transfer control to it.
exit(status);
  Exit with status, destroying the process.
int pid = wait*(&status);
  Wait for exit (or other status change) of a child.

[Figure: fork splits execution into the parent and the child; the child initializes its context, calls exec, and later exit; the parent calls wait.]

The Concept of Fork


The Unix system call for process creation is called fork().
The fork system call creates a child process that is a clone of the parent.
  Child has a (virtual) copy of the parent's virtual memory.
  Child is running the same program as the parent.
  Child inherits open file descriptors from the parent. (Parent and child file descriptors point to a common entry in the system open file table.)
  Child begins life with the same register values as the parent.
The child process may execute a different program in its context with a separate exec() system call.

What's So Cool About Fork

fork is a simple primitive that allows process creation without troubling with what program to run, args, etc.
  Serves some of the same purposes as threads.
fork gives the parent program an opportunity to initialize the child process, especially the open file descriptors.
  Unix syscalls for file descriptors operate on the current process.
  Parent program running in child process context may open/close I/O and IPC objects, and bind them to stdin, stdout, and stderr.
  Also may modify environment variables, arguments, etc.
Using the common fork/exec sequence, the parent (e.g., a command interpreter or shell) can transparently cause children to read/write from files, terminal windows, network connections, pipes, etc.
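The window between fork and exec is where a shell sets up redirection. The sketch below shows roughly what `cmd > outpath` might look like: the child, still running the parent's code, opens the file and binds it to stdout with dup2 before calling exec. The helper name run_redirected and the exit codes 126/127 are assumptions following common shell practice.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv with stdout redirected to outpath.
 * Returns the child's exit status, or -1 on error. */
int run_redirected(char *const argv[], const char *outpath)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Parent's program, running in the child's context. */
        int fd = open(outpath, O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0 || dup2(fd, STDOUT_FILENO) < 0)
            _exit(126);
        close(fd);                   /* stdout now refers to the file */
        execvp(argv[0], argv);       /* new program inherits the descriptor */
        _exit(127);                  /* reached only if exec failed */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The program being run needs no knowledge of the redirection; it simply writes to file descriptor 1 as usual.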

The Shell
The Unix command interpreters run as ordinary user processes with no special privilege.
  This was novel at the time Unix was created: other systems viewed the command interpreter as a trusted part of the OS.
Users may select from a range of interpreter programs available, or even write their own (to add to the confusion).
  csh, sh, ksh, tcsh, bash: choose your flavor
Shells use fork/exec/exit/wait to execute commands composed of program filenames, args, and I/O redirection symbols.
  Shells are general enough to run files of commands (scripts) for more complex tasks, e.g., by redirecting the shell's stdin.
  Shell behavior is guided by environment variables.

Process Termination
Process executes last statement and asks the operating system to delete it (exit)
  Output data from child to parent (via wait)
  Process's resources are deallocated by the operating system
Parent may terminate execution of child processes (abort)
  Child has exceeded allocated resources
  Task assigned to child is no longer required
  Parent is exiting
    Some operating systems do not allow a child to continue if its parent terminates: cascading termination (all children terminated)

Unix process creation


In Unix, the only way to create a new process is to call the fork() system call.
In Unix, each process has a process identifier (an integer value).
The fork() system call creates an exact duplicate of the calling (parent) process.
The child can then load a new program if it wants to execute a different program from the one the parent executes.

Can do this by use of execl system call execl replaces the memory image of a process with some new program.

Sample Code (Process creation in Unix)

    #include <unistd.h>

    int main()
    {
        int pid;
        pid = fork();
    }

fork will return a negative value if there is an error.

This code creates a child process that will run concurrently with the parent process. Both child process and parent process will execute the same code after the fork() call. The parent's memory will be copied to the child process, and it will not be shared. Child and parent process can decide which program fragment to run, as shown in the following slide.
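A quick way to see that the child works on a copy of the parent's memory: let the child change a global variable and report it through its exit status, then look at the parent's copy. This is a sketch under the assumption that the value fits in an exit status (0 to 255); the names counter and fork_copies_memory are hypothetical.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int counter = 10;     /* lives in the data segment, copied by fork */

/* Hypothetical helper: the child adds 5 to its copy of counter and exits
 * with that value. Fills in what each process saw; returns 0 or -1. */
int fork_copies_memory(int *child_saw, int *parent_saw)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        counter += 5;        /* changes only the child's copy */
        _exit(counter);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;

    *child_saw = WEXITSTATUS(status);
    *parent_saw = counter;   /* the parent's copy is untouched */
    return 0;
}
```

The child reports 15 while the parent still sees 10, which is the "copied, not shared" behavior described above.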

Sample Code for Unix


    #include <iostream>
    #include <signal.h>
    #include <sys/wait.h>
    #include <unistd.h>
    using namespace std;

    int main()
    {
        int pid = fork();
        if (pid < 0) {
            // fork failed, no child was created
            return 1;
        } else if (pid == 0) {
            // fork will return 0 to the child
            // so child will execute this block
            cout << "waiting for the parent" << endl;
            // a child cannot wait() on its parent, so it just sleeps
            // until the parent's signal arrives
            pause();
        } else {
            // fork will return the child's pid (a positive number) to the parent
            // therefore parent will execute this code block
            sleep(1);
            cout << "I will kill my child " << endl;
            sleep(1);
            kill(pid, SIGKILL);     // terminate the child process
            waitpid(pid, NULL, 0);  // reap the terminated child
        }
    }

Some info about the code in the next slide

With fork alone you cannot specify external code to run, such as an executable like netscape. In order to run an outside program inside a process, you need to use execlp. The process that calls execlp is replaced by the executable program specified as the parameter. For example: execlp("/bin/ls", "ls", NULL);

    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int i, j;
    char a[1024];
    int fd;

    int main(int argc, char *argv[])
    {
        int pid;

        fd = open("myfile.txt", O_RDONLY);
        pid = fork(); /* create a new process */
        if (pid < 0) { /* system call is not successful */
            fprintf(stderr, "fork() failed\n");
            exit(-1); /* termination with some error */
        } else if (pid == 0) { /* this is the child process */
            execlp("/bin/ls", "ls", (char *)NULL);
            exit(127); /* reached only if execlp fails */
        } else { /* this is the parent process */
            wait(NULL); /* wait for the child to terminate */
            printf("child completed\n");
            exit(0); /* normal termination */
        }
    }

Process creation
Next slide shows how memory of the parent process is copied to the child process.

[Figure: memory of the parent before calling fork(), and of the parent and child after calling fork().
  Before calling fork(): the parent (process id 2345) already exists. It has text (main() { ... pid = fork(); ... }), data (int i = 10; int j; char a[1024]; int fd = 5), and a stack where pid = ?. The PC register points at the call to fork().
  After calling fork(): the parent's memory is the same except that pid = 3210. The newly created child (process id 3210, parent id 2345) has its own copy of the text, data, and stack, with pid = 0.]

Process Creation in Windows 98 and higher versions

Syntax

    BOOL CreateProcess(
        LPCTSTR lpApplicationName,                  // pointer to name of executable module
        LPTSTR lpCommandLine,                       // pointer to command line string
        LPSECURITY_ATTRIBUTES lpProcessAttributes,
        LPSECURITY_ATTRIBUTES lpThreadAttributes,
        BOOL bInheritHandles,                       // handle inheritance flag
        DWORD dwCreationFlags,                      // creation flags
        LPVOID lpEnvironment,                       // pointer to new environment block
        LPCTSTR lpCurrentDirectory,                 // pointer to current directory name
        LPSTARTUPINFO lpStartupInfo,                // pointer to STARTUPINFO
        LPPROCESS_INFORMATION lpProcessInformation  // pointer to PROCESS_INFORMATION
    );

Sample code fragment for process creation in Windows

    STARTUPINFO si;
    PROCESS_INFORMATION pi;

    memset(&pi, 0, sizeof(pi));
    memset(&si, 0, sizeof(si));
    si.cb = sizeof(si);
    int res = CreateProcess(0, "c:\\windows\\notepad.exe", 0, 0, 0, 0, 0, 0, &si, &pi);

The above code fragment defines two structures related to process creation, of type STARTUPINFO and PROCESS_INFORMATION. They will be used by the CreateProcess call to initialize some parameters of the new process. The memset calls initialise the newly defined structures si and pi to 0. The second parameter of CreateProcess is the complete path of the executable. Don't worry about the 0 (NULL) parameters.

A COMPLETE CODE FOR PROCESS CREATION

    #include <windows.h>

    /* ErrorExit is assumed to be defined elsewhere: it reports the error and exits. */

    int main( VOID )
    {
        STARTUPINFO si;
        PROCESS_INFORMATION pi;

        ZeroMemory( &si, sizeof(si) );
        si.cb = sizeof(si);
        ZeroMemory( &pi, sizeof(pi) );

        // Start the child process.
        if( !CreateProcess( NULL, // No module name (use command line).
            "MyChildProcess",     // Command line.
            NULL,                 // Process handle not inheritable.
            NULL,                 // Thread handle not inheritable.
            FALSE,                // Set handle inheritance to FALSE.
            0,                    // No creation flags.
            NULL,                 // Use parent's environment block.
            NULL,                 // Use parent's starting directory.
            &si,                  // Pointer to STARTUPINFO structure.
            &pi )                 // Pointer to PROCESS_INFORMATION structure.
        )
        {
            ErrorExit( "CreateProcess failed." );
        }

        // Wait until child process exits.
        WaitForSingleObject( pi.hProcess, INFINITE );

        // Close process and thread handles.
        CloseHandle( pi.hProcess );
        CloseHandle( pi.hThread );
        return 0;
    }

CreateProcess


If CreateProcess succeeds, it fills in a PROCESS_INFORMATION structure, which contains
  handles for the new process and its primary thread
    valid until closed using the CloseHandle function, even after the process or thread they represent has been terminated
  identifiers that uniquely identify the new process and its primary thread throughout the system

Process Handles & Identifiers




A process can use the
  GetCurrentProcessId function to get its own process identifier
    DWORD GetCurrentProcessId(void)
  GetCurrentProcess function to retrieve a pseudo handle to its own process object
    HANDLE GetCurrentProcess(void)

Process Handles & Identifiers


    #include <stdio.h>
    #include <stdlib.h>
    #include <windows.h>

    int main( void )
    {
        HANDLE processHandle;
        DWORD processId;

        processId = GetCurrentProcessId();
        /* DWORD is an unsigned 32-bit value, so print it as unsigned long */
        printf("The process id is %lu\n", (unsigned long)processId);

        processHandle = GetCurrentProcess();
        printf("Terminating process\n");
        TerminateProcess(processHandle, 0);

        printf("Unreachable code\n");
        return 0;
    }

Terminating Processes


A process executes until
  Any thread of the process calls the ExitProcess / TerminateProcess functions
  The primary thread of the process returns
    The primary thread can avoid terminating other threads by explicitly calling ExitThread before it returns
  The last thread of the process terminates
  The user shuts down the system or logs off