
Advanced Operating Systems

Course Reading Material:


Lecture notes and class discussions will be the material for examination purposes. Reading the appropriate reference papers for the different algorithms is recommended. Other books discussing the algorithms will help too.

Advanced Operating Systems

Project: C, C++

Course Outline
Proposed outline; might be modified based on time availability:

Introduction to Operating Systems, Inter-Process Communication

Distributed Operating Systems:
Architecture
Clock Synchronization, Ordering
Distributed Mutual Exclusion
Distributed Deadlock Detection
Agreement Protocols
Distributed File Systems
Distributed Shared Memory
Distributed Scheduling
Distributed Resource Management


Course Outline ...

Recovery & Fault Tolerance
Concurrency Control / Security

(Depending on time availability)

Additional or modified topics might be introduced from other texts and/or papers. References to those materials will be given at the appropriate time.

Evaluation

1 Mid-term: in class, 75 minutes
1 Final exam: 90 minutes or 2 hours
Homeworks/assignments: at most 2-3
Programming projects:
1 preparatory project (to warm up on thread and socket programming), NOT graded
2 projects based on algorithms discussed in class

Grading

Mid-term + Final: 60% (equally distributed, 30% each)

Intermediate & Final Projects (combined): 40%

Schedule

Mid-term: October 27, 2011
Final Exam: as per HCMUS schedule

Programming Projects

Individual projects.
No copying/sharing of code/results will be tolerated. Any instance of cheating in projects/homeworks/exams is not acceptable.
No copying code from the Internet: 2 students independently copying the same code from the Internet is still considered copying in the project!
Deadlines will be strictly followed for project and homework submissions.
Project submissions are through Moodle. A demo will be required.

Two-track Course

Programming Project Discussions:
Announcements in class
Minimal discussions in class
Design discussions during lab hours, based on individual needs

Theory (Algorithms) Discussions:
Full discussion in class
Lab hours for clarification if needed

Moodle

Announcements will be posted on Moodle. Students are responsible for checking Moodle for news every day.


Cheating

Academic dishonesty will be taken seriously. Cheating students will be handed over to the Head/Dean for further action. Remember: homeworks/projects (exams too!) are to be done individually. Any kind of cheating in homeworks/projects/exams will result in a 0 grade for the entire course.


Projects

Involves exercises such as ordering, deadlock detection, load balancing, message passing, and implementing distributed algorithms (e.g., for scheduling).
Platform: Linux/Windows, C/C++/Java. Network programming will be needed. Multiple systems will be used.
Specific details and deadlines will be announced in class and on Moodle.
Suggestion: learn network socket programming and threads if you do not know them already. Try simple programs for file transfer, talk, etc.; a minimal warm-up sketch is given below.
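As a warm-up along the lines suggested above (not one of the graded projects), here is a minimal sketch of a TCP echo server that serves each client in its own POSIX thread. The port number 5000 and the buffer size are arbitrary choices for illustration.

/* Warm-up sketch: TCP echo server, one thread per client.
 * Build (Linux): gcc echo_server.c -o echo_server -lpthread
 * Port 5000 and the buffer size are arbitrary illustration choices. */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <pthread.h>
#include <arpa/inet.h>
#include <sys/socket.h>

static void *handle_client(void *arg)
{
    int client = *(int *)arg;
    free(arg);
    char buf[1024];
    ssize_t n;
    /* Echo whatever the client sends until it closes the connection. */
    while ((n = recv(client, buf, sizeof(buf), 0)) > 0)
        send(client, buf, (size_t)n, 0);
    close(client);
    return NULL;
}

int main(void)
{
    int listener = socket(AF_INET, SOCK_STREAM, 0);
    if (listener < 0) { perror("socket"); return 1; }

    struct sockaddr_in addr = {0};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(5000);          /* arbitrary port for the example */

    if (bind(listener, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        perror("bind"); return 1;
    }
    listen(listener, 8);

    for (;;) {
        int *client = malloc(sizeof(int));
        *client = accept(listener, NULL, NULL);
        if (*client < 0) { free(client); continue; }
        pthread_t tid;
        pthread_create(&tid, NULL, handle_client, client);
        pthread_detach(tid);              /* no join needed for this sketch */
    }
}

Testing it with a simple client (or telnet localhost 5000) exercises both the socket and the thread basics needed for the projects.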


Homeworks

At most 2-3 homeworks, announced in class and on Moodle.


Basic Computer Organization


Input Unit
Output Unit
CPU
Memory
ALU (Arithmetic & Logic Unit)
Secondary Storage

[Figure: block diagram — CPU with ALU, memory, disk (secondary storage), and I/O devices (keyboard, display)]

Simplified View of OS
[Figure: OS kernel and OS tools reside in physical memory; user processes and tools occupy user virtual memory space, each process (user i, process j) with its own code and data]

Distributed View of the System

[Figure: distributed view — multiple hardware nodes, each running processes]

Inter-Process Communication

Need for exchanging data/messages among processes belonging to the same or to different groups.

IPC mechanisms:

Shared Memory: designate and use some data/memory as shared; use the shared memory to exchange data. Requires facilities to control access to the shared data.

Message Passing: use higher-level primitives to send and receive data. Requires system support for sending and receiving messages. A minimal pipe-based sketch is given below.

Request-response action: similar to message passing with a mandatory response. Can be implemented using shared memory too.

Operation oriented language constructs
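As a concrete (if simplified) illustration of message passing between related processes, the sketch below uses a POSIX pipe: the child sends a message and the parent receives it. The message text is an arbitrary example.

/* Minimal message-passing sketch using a POSIX pipe between parent and child. */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/wait.h>

int main(void)
{
    int fd[2];                      /* fd[0]: read end, fd[1]: write end */
    if (pipe(fd) < 0) { perror("pipe"); return 1; }

    pid_t pid = fork();
    if (pid == 0) {                 /* child: the sender */
        close(fd[0]);
        const char *msg = "hello from child";   /* arbitrary example message */
        write(fd[1], msg, strlen(msg) + 1);
        close(fd[1]);
        return 0;
    }

    /* parent: the receiver */
    close(fd[1]);
    char buf[64];
    ssize_t n = read(fd[0], buf, sizeof(buf));  /* blocks until data arrives */
    if (n > 0)
        printf("received: %s\n", buf);
    close(fd[0]);
    wait(NULL);
    return 0;
}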



IPC Examples

Parallel/distributed computation such as sorting: shared memory is more apt.

Using message passing/RPC might need an array/data manager of some sort.

Client-server type: message passing or RPC may suit better.

Shared memory may be useful, but the program is clearer with the other types of IPC.

RPC vs. message passing: if a response is not a must, at least not immediately, simple message passing should suffice.


Shared Memory
Writers/Producers: only one process can write at any point in time; readers have no access during a write.
Readers/Consumers: multiple readers can access simultaneously; writers have no access while reads are in progress.

Shared Memory: Possibilities

Locks (and unlocks)
Semaphores
Monitors
Serializers
Path expressions

A minimal semaphore-based sketch is given below.
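As a hedged sketch of the semaphore option above, the fragment below lets two processes share a counter through POSIX shared memory, with a process-shared semaphore serializing the writers. The object name "/aos_demo" and the loop counts are illustrative choices, not part of the course material.

/* Sketch: a shared counter protected by a process-shared semaphore.
 * Build (Linux): gcc shm_demo.c -o shm_demo -lpthread -lrt
 * The shared-memory name and loop counts are arbitrary illustration choices. */
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <semaphore.h>

struct shared {
    sem_t lock;     /* controls access: only one writer at a time */
    int   counter;  /* the shared data being updated */
};

int main(void)
{
    /* Create and map the shared segment ("/aos_demo" is just an example name). */
    int fd = shm_open("/aos_demo", O_CREAT | O_RDWR, 0600);
    ftruncate(fd, sizeof(struct shared));
    struct shared *sh = mmap(NULL, sizeof(*sh), PROT_READ | PROT_WRITE,
                             MAP_SHARED, fd, 0);

    sem_init(&sh->lock, 1 /* shared between processes */, 1);
    sh->counter = 0;

    if (fork() == 0) {                    /* child: one of the writers */
        for (int i = 0; i < 1000; i++) {
            sem_wait(&sh->lock);          /* enter critical section */
            sh->counter++;
            sem_post(&sh->lock);          /* leave critical section */
        }
        return 0;
    }

    for (int i = 0; i < 1000; i++) {      /* parent: the other writer */
        sem_wait(&sh->lock);
        sh->counter++;
        sem_post(&sh->lock);
    }

    wait(NULL);
    printf("counter = %d (expected 2000)\n", sh->counter);
    shm_unlink("/aos_demo");
    return 0;
}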


Message Passing

Blocked Send/Receive: both the sending and the receiving process are blocked until the message is completely received. Synchronous.
Unblocked Send/Receive: neither sender nor receiver is blocked. Asynchronous.
Unblocked Send/Blocked Receive: the sender is not blocked; the receiver waits until the message is received.
Blocked Send/Unblocked Receive: useful?
Can be implemented using shared memory.
Message passing: a language paradigm for human ease.
A small sketch contrasting blocking and non-blocking receives is given below.
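To make the blocking distinction concrete, here is a minimal sketch using a POSIX message queue: the first receive is non-blocking (returns immediately on an empty queue), the second is blocking (waits for a message). The queue name "/aos_mq" and the sizes are arbitrary assumptions for the example.

/* Sketch: blocked vs. unblocked receive using a POSIX message queue.
 * Build (Linux): gcc mq_demo.c -o mq_demo -lrt
 * The queue name "/aos_mq" and message sizes are arbitrary illustration choices. */
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <fcntl.h>
#include <mqueue.h>

int main(void)
{
    struct mq_attr attr = { .mq_maxmsg = 8, .mq_msgsize = 64 };

    /* Non-blocking descriptor: receive returns immediately if the queue is empty. */
    mqd_t q = mq_open("/aos_mq", O_CREAT | O_RDWR | O_NONBLOCK, 0600, &attr);
    char buf[64];

    if (mq_receive(q, buf, sizeof(buf), NULL) < 0 && errno == EAGAIN)
        printf("unblocked receive: queue empty, returned immediately\n");

    /* Send a message, then switch to blocking mode for the receive. */
    mq_send(q, "hello", strlen("hello") + 1, 0);
    mq_close(q);
    q = mq_open("/aos_mq", O_RDWR);       /* default mode: receive blocks */

    if (mq_receive(q, buf, sizeof(buf), NULL) >= 0)
        printf("blocked receive: got \"%s\"\n", buf);

    mq_close(q);
    mq_unlink("/aos_mq");
    return 0;
}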


Un/blocked

Blocked message exchange:
Easy to understand, implement, and verify correctness.
Less powerful; may be inefficient, as sender/receiver might waste time waiting.

Unblocked message exchange:
More efficient; no time wasted waiting.
Needs queues, i.e., memory to store messages.
Difficult to verify the correctness of programs.

Message Passing: Possibilities


[Figure: one sender communicating with multiple receivers i, j, k]

Message Passing: Possibilities...


[Figure: multiple senders i, j, k communicating with one receiver]

Naming

Direct Naming: specify explicitly the receiver's process-id. Simple, but less powerful, as it needs the sender/receiver to know the actual process-id to/from which a message is to be sent/received. Not suitable for generic client-server models.
Port Naming: the receiver uses a single port for getting all messages; good for client-server. More complex in terms of language structure and verification.
Global Naming (mailbox): suitable for client-server; difficult to implement on a distributed network. Complex for language structure and verification.
Indirect Naming: use a naming server, e.g., as in RPCs.


Communicating Sequential Processes (CSP)


process reader-writer
  OKtoread, OKtowrite: integer (initially = value);
  busy: boolean (initially = 0);

  *[ busy = 0; writer?request() -> busy := 1; writer!OKtowrite
   [] busy = 0; reader?request() -> busy := 1; reader!OKtoread
   [] busy = 1; reader?readfin() -> busy := 0
   [] busy = 1; writer?writefn() -> busy := 0
   ]


CSP: Drawbacks

Requires explicit naming of processes in I/O commands. No message buffering: an input/output command gets blocked (or the guards become false), which can introduce delay and inefficiency.


Operation oriented constructs


Remote Procedure Call (RPC):

[Figure: Task A invokes the call abc(xyz, ijk) declared in the service declarations; xyz is sent to Task B, A waits for the result, and Task B executes the service and returns ijk]

Service declaration: describes the in and out parameters.
Can be implemented using message passing; a minimal callee sketch is given below.
Caller: gets blocked when the RPC is invoked.
Callee implementation possibilities:
Can loop accepting calls
Can get interrupted on getting a call
Can fork a process/thread for calls
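Since RPC can be built on message passing, here is a hedged sketch of the callee side only: a server that loops accepting calls, unpacks one integer argument, runs a trivial "procedure" (squaring the number), and sends the packed result back. The port 6000, the squaring service, and the wire format (one big-endian 32-bit integer each way) are invented for illustration.

/* Hypothetical callee: loops accepting calls, reads one 32-bit argument,
 * computes its square, and replies. Port 6000 and the wire format are
 * illustrative assumptions, not a prescribed project design. */
#include <stdint.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <sys/socket.h>

int main(void)
{
    int listener = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in addr = {0};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(6000);

    bind(listener, (struct sockaddr *)&addr, sizeof(addr));
    listen(listener, 4);

    for (;;) {                                    /* "can loop accepting calls" */
        int call = accept(listener, NULL, NULL);
        int32_t arg;
        if (recv(call, &arg, sizeof(arg), MSG_WAITALL) == sizeof(arg)) {
            int32_t x = ntohl(arg);               /* unpack the in parameter */
            int32_t reply = htonl(x * x);         /* the "procedure body" */
            send(call, &reply, sizeof(reply), 0); /* return the out parameter */
        }
        close(call);                              /* one request per connection */
    }
}

The corresponding caller-side stub would connect to this port, send the packed argument, and block on recv() until the reply arrives, which is exactly the "caller gets blocked" behaviour noted above.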


RPC: Issues

Passing pointers and global variables can be difficult. If the processes are on different machines, variations in data size (the number of bits for a data type) need to be addressed.

Abstract Data Types (ADTs) are generally used to take care of these variations. ADTs are language-like structures that specify how many bits are used for an integer, etc. What does this imply?

Can multiple processes provide the same service? Naming needs to be solved. Synchronous/blocked message passing is equivalent to RPC.


Ada
task [type] <name> is
   entry specifications
end;

task body <name> is
   declaration of local variables
begin
   list of statements
   ...
   accept <entry id> (<formal parameters>) do
      body of the accept statement
   end <entry id>;
exceptions
   exception handlers
end;

Example:

task proc-buffer is
   entry store(x : buffer);
   entry remove(y : buffer);
end;

task body proc-buffer is
   temp : buffer;
begin
   loop
      when flag
         accept store(x : buffer) do
            temp := x;
            flag := 0;
         end store;
      when !flag
         accept remove(y : buffer) do
            y := temp;
            flag := 1;
         end remove;
   end loop;
end proc-buffer;


Ada Message Passing


[Figure: Task B executes Store(xyz); xyz is sent from Task B to Task A, which executes accept Store(...); the result, if any, flows back to Task B]

Somewhat similar to executing a procedure call. The parameter value for the entry procedure is supplied by the calling task. The value of the result, if any, is returned to the caller.


RPC Design

Structure:
Caller: local call + stub
Callee: stub + actual procedure

Binding:
Where to execute? The name/address of the server that offers the service.
A name server, with inputs from the service specifications of a task.

Parameters & results:
Packing: convert to the remote machine's format
Unpacking: convert to the local machine's format
A small marshalling sketch is given below.
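As an illustration of the packing/unpacking step, the sketch below marshals a hypothetical request (an operation code and two integer arguments) into network byte order and unmarshals it again. The struct layout, field names, and helper functions are invented for the example; they are not a standard stub format.

/* Sketch: packing (marshalling) a hypothetical RPC request into a byte buffer
 * in network byte order, and unpacking it on the other side.
 * The request layout (op code + two int32 arguments) is an invented example. */
#include <stdio.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>

struct request {          /* hypothetical in-parameters of a remote call */
    uint32_t op;
    int32_t  a, b;
};

/* Pack: convert each field to network (big-endian) order, independent of
 * the local machine's integer representation. */
static size_t pack_request(const struct request *r, unsigned char *buf)
{
    uint32_t op = htonl(r->op);
    uint32_t a  = htonl((uint32_t)r->a);
    uint32_t b  = htonl((uint32_t)r->b);
    memcpy(buf + 0, &op, 4);
    memcpy(buf + 4, &a, 4);
    memcpy(buf + 8, &b, 4);
    return 12;            /* number of bytes on the wire */
}

/* Unpack: convert back to the local machine's format. */
static void unpack_request(const unsigned char *buf, struct request *r)
{
    uint32_t op, a, b;
    memcpy(&op, buf + 0, 4);
    memcpy(&a,  buf + 4, 4);
    memcpy(&b,  buf + 8, 4);
    r->op = ntohl(op);
    r->a  = (int32_t)ntohl(a);
    r->b  = (int32_t)ntohl(b);
}

int main(void)
{
    unsigned char wire[12];
    struct request in = { .op = 1, .a = 7, .b = -3 }, out;

    pack_request(&in, wire);      /* caller-side stub work */
    unpack_request(wire, &out);   /* callee-side stub work */
    printf("op=%u a=%d b=%d\n", out.op, out.a, out.b);
    return 0;
}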


RPC Execution
[Figure: RPC execution flow]

Caller: the local call enters the caller stub, which queries the binding server, packs the parameters, sends the request, and waits.
Binding server: servers register their services; the binding server receives the query and returns the server's address.
Callee: the callee stub unpacks the parameters and makes a local call to the remote procedure, which executes; the stub then packs the results and sends them back.
Caller: the caller stub unpacks the result and returns from the local call.

RPC Semantics

At least once
An RPC results in one or more invocations when the call completes. A partial, i.e., unsuccessful, call: zero, partial, or one or more executions.

Exactly once
Only one call at most. Unsuccessful?: zero, partial, or one execution.

At most once
Zero or one execution. No partial executions.
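One common way to approximate at-most-once semantics (an illustrative sketch, not the course's prescribed design) is for the server to remember request identifiers it has already executed and to return the cached reply instead of re-executing. The request-id scheme, table size, and handler below are invented for the example.

/* Sketch: server-side duplicate filtering for (approximate) at-most-once RPC.
 * The request-id scheme, table size, and handler are invented for illustration. */
#include <stdio.h>
#include <stdint.h>

#define TABLE_SIZE 128

struct seen_entry {
    uint64_t req_id;      /* identifier chosen by the client per call */
    int32_t  cached_reply;
    int      valid;
};

static struct seen_entry seen[TABLE_SIZE];

/* Execute the request at most once: a re-send of the same req_id gets the
 * cached reply instead of a second execution. */
static int32_t handle_request(uint64_t req_id, int32_t arg,
                              int32_t (*procedure)(int32_t))
{
    struct seen_entry *e = &seen[req_id % TABLE_SIZE];
    if (e->valid && e->req_id == req_id)
        return e->cached_reply;          /* duplicate: do NOT execute again */

    int32_t reply = procedure(arg);      /* first time: execute the procedure */
    e->req_id = req_id;
    e->cached_reply = reply;
    e->valid = 1;
    return reply;
}

static int32_t square(int32_t x) { return x * x; }

int main(void)
{
    /* The same request id arriving twice executes the procedure only once. */
    printf("%d\n", handle_request(42, 5, square));   /* executes: 25 */
    printf("%d\n", handle_request(42, 5, square));   /* cached:   25 */
    return 0;
}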


RPC Implementation

Sending/receiving parameters:

Use reliable communication, or datagrams (unreliable)? The choice implies the semantics: how many times an RPC may be invoked.


RPC Disadvantage

Incremental communication of results is not possible: e.g., a response from a database cannot return the first few matches immediately; the caller has to wait until all responses are determined.


Distributed Operating Systems


Issues :

Global Knowledge
Naming
Scalability
Compatibility
Process Synchronization
Resource Management
Security
Structuring
Client-Server Model

DOS: Issues ..

Global Knowledge

No global shared memory, no global clock, unpredictable message delays.
These lead to an unpredictable global state and make it difficult to order events (A sends to B, C sends to D: the events may be related).

Naming

Need for a name service: to identify objects (files, databases), users, services (RPCs).
Replicated directories? Updates may be a problem.
Need for name-to-(IP)-address resolution.
Distributed directory: algorithms for update, search, ...


DOS: Issues ..

Scalability

System requirements should (ideally) increase linearly with the number of computer systems.
Includes: overheads for message exchange in the algorithms used for file-system updates, directory management, ...

Compatibility

Binary level: processor instruction-level compatibility.
Execution level: the same source code can be compiled and executed.
Protocol level: mechanisms for exchanging messages and information (e.g., directories) are mutually understandable.


DOS: Issues ..

Process Synchronization

Distributed shared memory: difficult.

Resource Management

Data/object management: handling migration of files and memory values.
Goal: to achieve a transparent view of the distributed system.
Main issues: consistency, minimization of delays, ...

Security

Authentication and authorization.


DOS: Issues ..

Structuring

Monolithic kernel: not all of it is needed everywhere; e.g., file management is not fully needed on diskless workstations.

Collective kernel: functionality distributed over all systems.
Micro kernel + a set of OS processes.
Micro kernel: functionality for task, memory, and processor management; runs on all systems.
OS processes: a set of tools, executed as needed.

Object-oriented system: services as objects.
Object types: process, directory, file, ...
Operations on the objects: the encapsulated data can be manipulated.



DOS: Communication
[Figure: computers interconnected through switches]

ISO-OSI Reference Model


[Figure: two end systems, each with the layers Application, Presentation, Session, Transport, Network, Datalink, Physical; they communicate through a network whose intermediate nodes implement the Network, Datalink, and Physical layers]

Un/reliable Communication

Reliable communication

Virtual circuit: one path between sender and receiver; all packets are sent through that path. Data is received in the same order as it is sent. TCP (Transmission Control Protocol) provides reliable communication.

Unreliable communication

Datagrams: different packets may be sent through different paths. Data might be lost or arrive out of sequence. UDP (User Datagram Protocol) provides unreliable communication.

A small sketch contrasting the two socket types is given below.
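A hedged sketch of the two choices at the socket level: the same payload sent once over a UDP (datagram) socket and once over a TCP (virtual-circuit style) socket. The address 127.0.0.1 and ports 7001/7002 are arbitrary example values; no server is assumed to be listening, so the TCP connect will simply fail in that case.

/* Sketch: the API difference between reliable (TCP) and unreliable (UDP) sockets.
 * Address 127.0.0.1 and ports 7001/7002 are arbitrary example values. */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <sys/socket.h>

int main(void)
{
    const char *msg = "hello";
    struct sockaddr_in dst = {0};
    dst.sin_family = AF_INET;
    inet_pton(AF_INET, "127.0.0.1", &dst.sin_addr);

    /* UDP: connectionless datagrams; each sendto() is an independent packet
     * that may be lost, duplicated, or reordered. */
    int udp = socket(AF_INET, SOCK_DGRAM, 0);
    dst.sin_port = htons(7001);
    sendto(udp, msg, strlen(msg), 0, (struct sockaddr *)&dst, sizeof(dst));
    close(udp);

    /* TCP: a connection (virtual-circuit style) must be set up first; data
     * then arrives in order, retransmitted as needed. */
    int tcp = socket(AF_INET, SOCK_STREAM, 0);
    dst.sin_port = htons(7002);
    if (connect(tcp, (struct sockaddr *)&dst, sizeof(dst)) == 0)
        send(tcp, msg, strlen(msg), 0);
    else
        perror("tcp connect (no listener at the example address)");
    close(tcp);
    return 0;
}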

