Ekaterina Elts
Scientific adviser: Assoc. Prof. A.V. Komolkin
Introduction
Computational Grand Challenge problems
Parallel processing: the method of using many small tasks to solve one large problem
Two major trends:
- MPPs (massively parallel processors), but they are expensive
- distributed computing
Introduction
The hottest trend today is PC clusters running Linux
Many universities and companies can afford 16 to 100 nodes
PVM and MPI are the most widely used tools for parallel programming
Contents
Parallel Programming
A Parallel Machine Model
Cluster
A Parallel Programming Model
Message Passing Programming Paradigm
Conclusion
The von Neumann computer
A central processing unit (CPU) executes a program that performs a sequence of read and write operations on an attached memory
SISD: Single Instruction Stream, Single Data Stream
The cluster
A node can communicate with
other nodes by sending and
receiving messages over an
interconnection network
A Parallel Programming Model

Sequential (serial) algorithm:

  input a, b
  do i = 1, N
    s = s + a(i)*b(i)
  enddo
  print s

Parallel algorithm:

  input a, b
  do i = 1, N/2              (process 1)
    s1 = s1 + a(i)*b(i)
  enddo
  do i = N/2+1, N            (process 2)
    s2 = s2 + a(i)*b(i)
  enddo
  s = s1 + s2
  print s
A Parallel Programming Model
Message Passing
Diagram: four communicating tasks (1-4) exchanging messages, with a detailed picture of a single task
Messages
Messages are packets of data moving
between processes
The message passing system has to be
told the following information:
Sending process
Source location
Data type
Data length
Receiving process(es)
Destination location
Destination size
Message Passing
SPMD: Single Program Multiple Data
MPMD: Multiple Program Multiple Data
Each process performs a different function (input, problem setup, solution, output, display)
Simple Example
SPMD & Master/Slave

  for i from rank, step size, to N do
    s = s + a(i)*b(i)
  enddo

For rank 1: s1 = a(1)b(1) + a(1+size)b(1+size) + a(1+2*size)b(1+2*size) + ...

Slaves 1, 2 and 3 compute the partial sums s1, s2, s3 over a and b; the master combines them: S = s1 + s2 + s3
PVM
The development of PVM started in the summer of 1989 at Oak Ridge National Laboratory (ORNL).
PVM was the effort of a single research group, allowing it great flexibility in the design of the system.
Timeline (1989-2000): PVM-1 (1989), PVM-2, PVM-3, PVM-3.4 (1997); MPI-1 (1994)
Goals
PVM: a distributed operating system, heterogeneity, handling communication failures
MPI: portability
What is MPI?
MPI - Message Passing Interface
A fixed set of processes is created at program initialization; one process is created per processor
  mpirun -np 5 program
Each process knows its own number (rank)
Each process knows the total number of processes
Each process can communicate with the other processes
A process cannot create new processes (in MPI-1)
What is PVM?
PVM - Parallel Virtual Machine
PVM is a software package that allows a heterogeneous collection of workstations (the host pool) to function as a single high-performance parallel (virtual) machine
PVM, through its virtual machine, provides a simple yet useful distributed operating system
A daemon runs on every computer making up the virtual machine
Diagram: tasks executing the user computation, and the daemons (master and slaves) executing PVM system routines
Heterogeneity
Static: architecture, data format, computational speed
Dynamic: machine load, network load
Heterogeneity: MPI
Different datatypes can be encapsulated
in a single derived type, thereby allowing
communication of heterogeneous
messages. In addition, data can be sent
from one architecture to another with data
conversion in heterogeneous networks
(big-endian, little-endian).
Heterogeneity: PVM
The PVM system supports heterogeneity
in terms of machines, networks, and
applications. With regard to message
passing, PVM permits messages
containing more than one datatype to be
exchanged between machines having
different data representations.
Process control
- The ability to start and stop tasks, to find out which tasks are running, and possibly where they are running
PVM contains all of these capabilities: it can spawn/kill tasks dynamically
MPI-1 has no defined method to start a new task
MPI-2 contains functions to start a group of tasks and to send a kill signal to a group of tasks
Resource Control
PVM is inherently dynamic in nature, and it has a rich set of resource control functions. Hosts can be added or deleted at run time, which enables:
load balancing
task migration
fault tolerance
efficiency
Virtual topology
- MPI only
Convenient process naming
A naming scheme to fit the communication pattern
Simplifies the writing of code
Can allow MPI to optimize communications
Point-to-Point communications
A synchronous
communication does not
complete until the message
has been received.
An asynchronous
communication completes as
soon as the message is on
its way
Non-blocking operations
Collective communications
Broadcast
A broadcast sends a
message to a number of
recipients
Barrier
A barrier operation
synchronises a
number of
processors.
Reduction
operations
Reduction operations
reduce data from a
number of processors
to a single item.
Conclusion
Each API has its unique strengths

PVM
Virtual machine abstraction
Primarily concerned with resource and process control
Portability over performance
Robust fault tolerance

MPI
No such abstraction
Rich message support
Supports logical communication topologies
Primarily concerned with messaging
Performance over flexibility
Some realizations do not interoperate across architectural boundaries
More susceptible to faults
Acknowledgments
Scientific adviser
Assoc. Prof. A.V.Komolkin