
TDDD36: Peer-to-peer

Client-server architecture

server:
  always-on host
  permanent IP address
  server farms for scaling
clients:
  communicate with server
  may be intermittently connected
  may have dynamic IP addresses
  do not communicate directly with each other

Pure P2P architecture

no always-on server
arbitrary end systems directly communicate
peers are intermittently connected and change IP addresses
Three topics:
  File sharing
  File distribution
  Searching for information
  Case studies: BitTorrent and Skype

Hybrid of client-server and P2P

Skype
  voice-over-IP P2P application
  centralized server: finding address of remote party
  client-client connection: direct (not through server)
Instant messaging
  chatting between two users is P2P
  centralized service: client presence detection/location
    user registers its IP address with central server when it comes online
    user contacts central server to find IP addresses of buddies

P2P file sharing

Example
  Alice runs P2P client application on her notebook computer
  intermittently connects to Internet; gets new IP address for each connection
  asks for "Hey Jude"
  application displays other peers that have a copy of "Hey Jude"
  Alice chooses one of the peers, Bob
  file is copied from Bob's PC to Alice's notebook: HTTP
  while Alice downloads, other users are uploading from Alice
  Alice's peer is both a Web client and a transient Web server
All peers are servers = highly scalable!

P2P: centralized directory

original "Napster" design
1) when peer connects, it informs central server:
     IP address
     content
2) Alice queries for "Hey Jude"
3) Alice requests file from Bob
[Figure: Alice, Bob and other peers around a centralized directory server; step 1 is the registration with the server, the file transfer itself is peer-peer]

P2P: problems with centralized directory

single point of failure
performance bottleneck
copyright infringement: "target" of lawsuit is obvious
file transfer is decentralized, but locating content is highly centralized
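The three steps of the centralized-directory design can be illustrated with a small in-memory sketch. This is not Napster's actual protocol: the class name, method names and IP addresses are hypothetical and only mirror the register/query steps from the slide. The file transfer in step 3 happens directly between peers and is not modeled.

```python
from collections import defaultdict


class CentralDirectory:
    """Toy stand-in for a centralized directory server (hypothetical API)."""

    def __init__(self):
        # title -> set of peer addresses currently advertising that title
        self._peers_by_title = defaultdict(set)

    def register(self, peer_addr, titles):
        # Step 1: a connecting peer reports its IP address and content.
        for title in titles:
            self._peers_by_title[title].add(peer_addr)

    def unregister(self, peer_addr):
        # Peers are intermittently connected, so entries must be removable.
        for peers in self._peers_by_title.values():
            peers.discard(peer_addr)

    def query(self, title):
        # Step 2: return the peers that currently advertise the title.
        return sorted(self._peers_by_title.get(title, ()))


directory = CentralDirectory()
directory.register("10.0.0.2", ["Hey Jude"])           # Bob (made-up address)
directory.register("10.0.0.3", ["Hey Jude", "Help!"])  # another peer
candidates = directory.query("Hey Jude")               # Alice picks one of these
```

Every lookup goes through the single `CentralDirectory` object, which is exactly the single point of failure and performance bottleneck listed above.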

Query flooding: Gnutella

fully distributed
  no central server
public domain protocol
many Gnutella clients implementing protocol
overlay network: graph
  edge between peer X and Y if there's a TCP connection
  all active peers and edges form overlay net
  edge: virtual (not physical) link
  given peer typically connected with < 10 overlay neighbors

Gnutella: protocol

Query message sent over existing TCP connections
peers forward Query message
QueryHit sent over reverse path
File transfer: HTTP
Scalability: limited scope flooding
[Figure: Query messages flooding outward over the overlay, QueryHit messages returning along the reverse path]
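Limited-scope flooding can be sketched as a breadth-first search that stops after a fixed number of hops. The function below is a simplified model, not the Gnutella wire protocol: the overlay graph, the peer names and the `ttl` parameter are assumptions made for illustration. Each hit records the reverse path that a QueryHit would follow back to the querying peer.

```python
from collections import deque


def flood_query(overlay, holders, source, ttl):
    """Return {holder: reverse path to source} for holders within ttl hops."""
    parent = {source: None}          # first peer each Query was received from
    frontier = deque([(source, 0)])
    hits = {}
    while frontier:
        peer, hops = frontier.popleft()
        if peer in holders and peer != source:
            # Walk back through the parents: this is the QueryHit path.
            path, node = [], peer
            while node is not None:
                path.append(node)
                node = parent[node]
            hits[peer] = path
        if hops == ttl:
            continue                 # scope limit reached: do not forward
        for neighbor in overlay[peer]:
            if neighbor not in parent:   # ignore duplicate Query messages
                parent[neighbor] = peer
                frontier.append((neighbor, hops + 1))
    return hits


overlay = {
    "alice": ["bob", "carol"],
    "bob": ["alice", "dave"],
    "carol": ["alice", "dave"],
    "dave": ["bob", "carol", "erin"],
    "erin": ["dave"],
}
holders = {"dave", "erin"}
hits_ttl2 = flood_query(overlay, holders, "alice", ttl=2)
hits_ttl3 = flood_query(overlay, holders, "alice", ttl=3)
```

With `ttl=2` only the copy at "dave" is found; raising the scope to 3 also reaches "erin", at the cost of more forwarded messages.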

Gnutella: Peer joining

1. joining peer Alice must find another peer in Gnutella network: use list of candidate peers
2. Alice sequentially attempts TCP connections with candidate peers until connection setup with Bob
3. Flooding: Alice sends Ping message to Bob; Bob forwards Ping message to his overlay neighbors (who then forward to their neighbors ...)
   peers receiving Ping message respond to Alice with Pong message
4. Alice receives many Pong messages, and can then set up additional TCP connections

Hierarchical Overlay

between centralized index, query flooding approaches
each peer is either a group leader or assigned to a group leader.
  TCP connection between peer and its group leader.
  TCP connections between some pairs of group leaders.
group leader tracks content in its children
[Figure legend: ordinary peer; group-leader peer; neighboring relationships in overlay network]

Distributed Hash Table (DHT)

DHT = distributed P2P database
Database has (key, value) pairs;
  key: ss number; value: human name
  key: content type; value: IP address
Peers query DB with key
  DB returns values that match the key
Peers can also insert (key, value) pairs

DHT Identifiers

Assign integer identifier to each peer in range $[0, 2^n - 1]$.
  Each identifier can be represented by n bits.
Require each key to be an integer in same range.
To get integer keys, hash original key.
  e.g., key = h("Led Zeppelin IV")
  This is why they call it a distributed "hash" table
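As a minimal sketch of the hashing step, the function below maps an arbitrary string key into the n-bit identifier range. The slides do not name a hash function; SHA-1 and the function name `dht_key` are assumptions chosen only for illustration.

```python
import hashlib


def dht_key(original_key: str, n_bits: int) -> int:
    """Hash a string key to an integer in [0, 2**n_bits - 1]."""
    # SHA-1 is an assumed choice; any well-mixing hash would do here.
    digest = hashlib.sha1(original_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** n_bits)


key = dht_key("Led Zeppelin IV", n_bits=4)   # some integer between 0 and 15
```

The same original key always hashes to the same identifier, so any peer can compute where a (key, value) pair should live without asking a central index.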

How to assign keys to peers?

Central issue:
  Assigning (key, value) pairs to peers.
Rule: assign key to the peer that has the closest ID.
Convention in lecture: closest is the immediate successor of the key.
Ex: n=4; peers: 1,3,4,5,8,10,12,14;
  key = 13, then successor peer = 14
  key = 15, then successor peer = 1

Circular DHT (1)

Each peer only aware of immediate successor and predecessor.
"Overlay network"
[Figure: peers arranged on a ring; surviving labels include 1, 4, 5, 10, 12, 15]
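The successor convention can be checked with a few lines of code. This is a sketch of the assignment rule only (the function name is hypothetical); it uses a sorted list and wraps around the ring when the key is larger than every peer ID.

```python
from bisect import bisect_left


def successor_peer(key, peer_ids):
    """Return the first peer ID >= key, wrapping around the ring."""
    ring = sorted(peer_ids)
    index = bisect_left(ring, key)      # first position with ring[index] >= key
    return ring[index % len(ring)]      # past the end: wrap to the smallest ID


peers = [1, 3, 4, 5, 8, 10, 12, 14]    # the n = 4 example from the slide
```

For the slide's example, `successor_peer(13, peers)` gives 14 and `successor_peer(15, peers)` wraps around to 1.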

Circle DHT (2)

O(N) messages on avg to resolve query, when there are N peers
Define closest as closest successor
[Figure: ring of peers 0001, 0011, 0100, 0101, 1000, 1010, 1100, 1111; the query "Who's responsible for key 1110?" is forwarded from peer to peer until one answers "I am"]

Circular DHT with Shortcuts

Each peer keeps track of IP addresses of predecessor, successor, short cuts.
Reduced from 6 to 2 messages.
Possible to design shortcuts so O(log N) neighbors, O(log N) messages in query
[Figure: the same ring with shortcut links; query "Who's responsible for key 1110?"]
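The linear cost of a lookup on the plain ring (no shortcuts) can be reproduced with a short simulation. The sketch assumes the query for key 1110 starts at peer 0011, which the extracted text does not state explicitly; each peer checks whether the key falls between its predecessor and itself, and otherwise forwards to its successor.

```python
def lookup_hops(peer_ids, start, key):
    """Forward a query successor by successor; return (owner, messages sent)."""
    ring = sorted(peer_ids)
    position = ring.index(start)
    hops = 0
    while True:
        current = ring[position]
        predecessor = ring[position - 1]   # index -1 wraps to the largest ID
        if predecessor < current:
            responsible = predecessor < key <= current
        else:
            # Wrap-around case: current is the smallest ID on the ring.
            responsible = key > predecessor or key <= current
        if responsible:
            return current, hops
        position = (position + 1) % len(ring)
        hops += 1                          # one message to the successor


peers = [0b0001, 0b0011, 0b0100, 0b0101, 0b1000, 0b1010, 0b1100, 0b1111]
owner, messages = lookup_hops(peers, start=0b0011, key=0b1110)
```

Under that assumption the query reaches peer 1111 after 6 forwarded messages, the figure that shortcuts reduce to 2 on the slide.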

Peer Churn

To handle peer churn, require each peer to know the IP address of its two successors.
Each peer periodically pings its two successors to see if they are still alive.
Peer 5 abruptly leaves
Peer 4 detects; makes 8 its immediate successor; asks 8 who its immediate successor is; makes 8's immediate successor its second successor.
What if peer 13 wants to join?
[Figure: ring of peers; surviving labels include 4, 5, 10, 12, 15]

P2P Case study: Skype

inherently P2P: pairs of users communicate.
proprietary application-layer protocol (inferred via reverse engineering)
hierarchical overlay with Supernodes (SNs)
Index maps usernames to IP addresses; distributed over SNs
[Figure: Skype clients (SC), Supernodes (SN), Skype login server]
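Returning to the peer-churn example (peer 5 abruptly leaving the circular DHT), the repair performed by peer 4 can be sketched as follows. The ring membership 1, 3, 4, 5, 8, 10, 12, 15 and the dictionary-based "ask 8 for its successor" step are assumptions standing in for real ping and query messages.

```python
def repair_successors(successors, alive, peer):
    """Rebuild peer's two-entry successor list after its first successor fails."""
    first, second = successors[peer]
    if first in alive:
        return successors[peer]              # nothing to repair
    new_first = second                       # promote the second successor
    new_second = successors[new_first][0]    # "ask" it for its immediate successor
    successors[peer] = [new_first, new_second]
    return successors[peer]


# Each peer knows its two successors (assumed ring membership).
successors = {
    1: [3, 4], 3: [4, 5], 4: [5, 8], 5: [8, 10],
    8: [10, 12], 10: [12, 15], 12: [15, 1], 15: [1, 3],
}
alive = set(successors) - {5}                # peer 5 abruptly leaves
repaired = repair_successors(successors, alive, peer=4)
```

After the repair, peer 4 lists 8 and 10 as its successors. The sketch only handles a failed first successor; peer 3, whose second successor was 5, would need an analogous update.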

Peers as relays

Problem when both Alice and Bob are behind "NATs".
  NAT prevents an outside peer from initiating a call to insider peer
Solution:
  Using Alice's and Bob's SNs, Relay is chosen
  Each peer initiates session with relay.
  Peers can now communicate through NATs via relay

Scalable Content Delivery

Motivation

Use of Internet for content delivery is massive and becoming more so (e.g., recent projection that by 2013, 90% of all IP traffic will be video content)
How to make scalable and efficient?
  Here: Cost efficiency (e.g., energy efficiency) ...
Variety of approaches: broadcast/multicast, batching, replication/caching (e.g. CDNs), P2P, peer-assisted, ...
In these slides:
  BitTorrent (peer-to-peer)
  Peer-assisted streaming
  Server-based solutions (aggregation and replication)

File Distribution: Server-Client vs P2P

Question: How much time to distribute file from one server to N peers?
  $u_s$: server upload bandwidth
  $u_i$: peer i upload bandwidth
  $d_i$: peer i download bandwidth
[Figure: server holding a file of size F with upload rate $u_s$; peers 1..N with rates $u_1, d_1$, $u_2, d_2$, ..., $u_N, d_N$; all connected through a network (with abundant bandwidth)]

File distribution time: server-client

server sequentially sends N copies:
  $NF/u_s$ time
client i takes $F/d_i$ time to download
Time to distribute F to N clients using client/server approach:
  $d_{cs} = \max\{ NF/u_s,\ F/\min_i d_i \}$
increases linearly in N (for large N)

File distribution time: P2P

server must send one copy: $F/u_s$ time
client i takes $F/d_i$ time to download
NF bits must be downloaded (aggregate)
fastest possible upload rate: $u_s + \sum_i u_i$
  $d_{P2P} = \max\{ F/u_s,\ F/\min_i d_i,\ NF/(u_s + \sum_i u_i) \}$

Server-client vs. P2P: example

Client upload rate = u, F/u = 1 hour, $u_s = 10u$, $d_{min} \ge u_s$
[Plot: Minimum Distribution Time (axis 0 to 3.5) versus N (axis up to 35), one curve for P2P and one for Client-Server]
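The two bounds can be evaluated directly for the example parameters. The sketch below normalizes u = 1 and F = 1 so that times come out in hours, and assumes every download rate equals $u_s$, the smallest value allowed by $d_{min} \ge u_s$; the function names are illustrative.

```python
def client_server_time(F, u_s, d):
    """Lower bound d_cs = max(N*F/u_s, F/min(d_i)) for N = len(d) clients."""
    N = len(d)
    return max(N * F / u_s, F / min(d))


def p2p_time(F, u_s, d, u):
    """Lower bound d_P2P = max(F/u_s, F/min(d_i), N*F/(u_s + sum(u_i)))."""
    N = len(d)
    return max(F / u_s, F / min(d), N * F / (u_s + sum(u)))


F = 1.0            # file size, normalized so that F/u = 1 hour
u_peer = 1.0       # client upload rate u
u_s = 10 * u_peer  # server upload rate
results = {}
for N in (10, 20, 30):
    d = [u_s] * N                 # assumed: d_i = u_s for every client
    u = [u_peer] * N
    results[N] = (client_server_time(F, u_s, d), p2p_time(F, u_s, d, u))
```

For N = 30 the client-server bound is 3 hours while the P2P bound is 0.75 hours; the P2P value stays below 1 hour for any N because each added peer also adds upload capacity.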

File distribution: BitTorrent

P2P file distribution
tracker: tracks peers participating in torrent
torrent: group of peers exchanging chunks of a file
[Figure: a peer obtains the list of peers from the tracker, then trades chunks with those peers]

BitTorrent (1)

file divided into 256KB chunks.
peer joining torrent:
  has no chunks, but will accumulate them over time
  registers with tracker to get list of peers, connects to subset of peers ("neighbors")
while downloading, peer uploads chunks to other peers.
peers may come and go
once peer has entire file, it may (selfishly) leave or (altruistically) remain

BitTorrent-like systems

File split into many smaller pieces
Pieces are downloaded from both seeds and downloaders
Distribution paths are dynamically determined
  Based on data availability
[Figure: torrent with x downloaders and y seeds; arrivals join as downloaders, become seeds after the download time, and depart after the seed residence time]

Download using BitTorrent

Background: Incentive mechanism

Establish connections to large set of peers
  At each time, only upload to a small (changing) set of peers
Rate-based tit-for-tat policy
  Downloaders give upload preference to the downloaders that provide the highest download rates
[Figure: highest download rates: pick top four]
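A rough sketch of the rate-based selection: rank neighbors by the download rate they currently provide, keep the top four, and additionally pick one other neighbor at random (the step BitTorrent calls an optimistic unchoke). The function name, peer names and rates are made up for illustration, and real clients re-run this choice periodically rather than once.

```python
import random


def choose_unchoked(download_rates, top_n=4, rng=random):
    """Return (top_n fastest providers, one random other peer or None)."""
    ranked = sorted(download_rates, key=download_rates.get, reverse=True)
    top = ranked[:top_n]                  # tit-for-tat: reward best providers
    rest = ranked[top_n:]
    optimistic = rng.choice(rest) if rest else None   # give a newcomer a chance
    return top, optimistic


rates = {"a": 50, "b": 10, "c": 40, "d": 30, "e": 20, "f": 5}  # rate received from each peer
top_four, optimistic = choose_unchoked(rates, rng=random.Random(0))
```

Here the top four are "a", "c", "d" and "e"; the optimistic slot goes to either "b" or "f", which gives a peer with nothing to trade yet a way to get started.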

BitTorrent (2)

Sending Chunks: tit-for-tat
Alice sends chunks to four neighbors currently sending her chunks at the highest rate
  re-evaluate top 4 every 10 secs
every 30 secs: randomly select another peer, starts sending chunks
  newly chosen peer may join top 4
  "optimistically unchoke"
[Figure labels: Optimistic unchoke; Pick one at random]

Pulling Chunks
at any given time, different peers have different subsets of file chunks
periodically, a peer (Alice) asks each neighbor for list of chunks that they have.
Alice sends requests for her missing chunks
  rarest first

Download using BitTorrent

Background: Piece selection

Rarest first piece selection policy
  Achieves high piece diversity
Request pieces that
  the uploader has;
  the downloader is interested (wants); and
  is the rarest among this set of pieces
[Figure: Peer 1, Peer 2, ..., Peer N each hold a subset of pieces 1..K; the number of copies of each piece in the neighbor set is shown in parentheses, e.g. (1), (2), (3); a piece is sent "from" one peer "to" another]
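The three conditions of the rarest-first policy translate into a short selection function. This is a sketch rather than any client's actual implementation: ties are broken by lowest piece index (an assumption; real clients typically randomize), and the piece sets are invented.

```python
from collections import Counter


def rarest_first(my_pieces, uploader_pieces, neighbor_piece_sets):
    """Pick the piece the uploader has, we lack, and fewest neighbors hold."""
    counts = Counter()
    for pieces in neighbor_piece_sets:
        counts.update(pieces)             # copies of each piece in neighbor set
    candidates = set(uploader_pieces) - set(my_pieces)
    if not candidates:
        return None                       # uploader has nothing we want
    # Rarest first; the piece index is only an assumed tie-breaker.
    return min(candidates, key=lambda piece: (counts[piece], piece))


neighbors = [{1, 2, 3}, {2, 3}, {3, 4}]   # pieces held by each neighbor
choice = rarest_first(my_pieces={3}, uploader_pieces={1, 2, 3},
                      neighbor_piece_sets=neighbors)
```

Piece 1 has a single copy among the neighbors while piece 2 has two, so the downloader requests piece 1 from this uploader.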

BitTorrent: Tit-for-tat

(1) Alice "optimistically unchokes" Bob
(2) Alice becomes one of Bob's top-four providers; Bob reciprocates
(3) Bob becomes one of Alice's top-four providers
With higher upload rate, can find better trading partners & get file faster!

Live Streaming using BT-like systems

[Figure: pieces uploaded/downloaded over the Internet feed the media player queue/buffer; a buffer window marks the pieces being exchanged]
Live streaming (e.g., CoolStreaming)
  All peers at roughly the same play/download position
    High bandwidth peers can easily contribute more
  (relatively) Small buffer window
    Within which pieces are exchanged

Peer-assisted VoD streaming

Some research questions ...

Can BitTorrent-like protocols provide scalable on-demand streaming?
How sensitive is the performance to the application configuration parameters?
  Piece selection policy
  Peer selection policy
  Upload/download bandwidth
What is the user-perceived performance?
  Start-up delay
  Probability of disrupted playback

Streaming Analysis

Media Streaming Progress (MSP) depends on two different things:
  Download Progress (DP)
  Sequential Progress (SP)
$\mathrm{MSP} = \mathrm{DP} \times \mathrm{SP}$
  MSP: useful media pieces per unit time
  DP: pieces obtained per unit time
  SP: useful media pieces per pieces obtained
These two can (sometimes) be analyzed separately

ACM SIGMETRICS 2008; IFIP Networking 2007; IFIP Networking 2009

Baseline Protocol

Piece Selection (1)

Which piece to upload?
Basic tradeoff
  Piece diversity
    Tit-for-tat is most effective with rarest-first
  In-order requirements
    Streaming is most natural using in-order
Baseline policy (from '07 paper)
  Simple probabilistic policy
  Bias towards earlier pieces

Piece Selection Policy

Example Results

Policies compared:
  Inorder, Rarest
  Portion, x%
    x% inorder
    (100-x)% rarest
  Zipf()
    Random with bias towards earlier pieces
    Bias follows Zipf distribution
[Plot: Average Startup Delay (log axis, 0.0001 to 0.1) versus Client Bandwidth (4 to 16); curves: inorder; portion, 90%; portion, 50%; rarest; zipf(1.25)]
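One way to picture a Zipf-biased policy is to weight the candidate pieces by their rank in playback order. The weighting $1/\text{rank}^{\theta}$ and the parameter name `theta` are assumptions made for this sketch; the cited papers may define the bias differently (for example, relative to the first missing piece).

```python
import random


def zipf_probabilities(num_candidates, theta):
    """Selection probability of the 1st, 2nd, ... earliest candidate piece."""
    weights = [1.0 / (rank ** theta) for rank in range(1, num_candidates + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def zipf_piece_choice(candidates, theta, rng=random):
    """Pick one candidate piece at random, biased towards earlier pieces."""
    ordered = sorted(candidates)                   # playback order
    probs = zipf_probabilities(len(ordered), theta)
    return rng.choices(ordered, weights=probs, k=1)[0]


probs = zipf_probabilities(4, theta=1.25)
piece = zipf_piece_choice([7, 3, 5], theta=1.25, rng=random.Random(1))
```

A larger `theta` pushes the policy towards strict in-order selection, while `theta = 0` makes every candidate equally likely, which is one way to see the diversity versus in-order tradeoff named above.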

Start-up Rule

Intuition

In-order buffer
  Contains all pieces up to the first missing piece
The rate the size of the in-order buffer increases is expected to increase with time (as holes are filled)
[Figure: data versus time; curves: the total amount of data received; the amount of in-order data received (i.e., the size of the in-order buffer); the amount of data played out if playback starts at time T]

Start-up Rule

Intuition

Estimate the rate using a long-term average (LTA)
  Adjusts start-up delay based on network conditions
  Enough (in-order) pieces to get a reasonable rate estimate
[Figure: data versus time; curves: the total amount of data received; the amount of in-order data received; required amount of in-order data, if received at constant rate]
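The in-order buffer defined above is easy to compute from the set of received pieces, and a small example shows why its size can jump when a hole is filled. The sketch assumes pieces are numbered from 1; it illustrates only the buffer definition, not the start-up rule itself.

```python
def in_order_buffer_size(received, num_pieces):
    """Number of pieces received contiguously from piece 1 onwards."""
    size = 0
    while size < num_pieces and (size + 1) in received:
        size += 1
    return size


received = {1, 2, 3, 5, 6}                    # piece 4 is a hole
before = in_order_buffer_size(received, num_pieces=8)
received.add(4)                               # the hole is filled
after = in_order_buffer_size(received, num_pieces=8)
```

Receiving the single missing piece 4 grows the in-order buffer from 3 to 6 pieces, which is why its growth rate is expected to increase as holes are filled.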

The Case for Distributed Digital Fountains (example)

Efficiency? use of a relatively small number of distributed servers, using request aggregation, may be as, or more, energy friendly than peer-assisted designs
  Allow use of service aggregation
  Total work may become important (e.g., energy/carbon consumption)
  Energy costs
  Green data centers (close to renewable energy source, for example)
  Energy proportional data centers
Also, greater control of content
[Figure: a small number of servers (S) serving many clients (C)]

Request aggregation techniques

Potentially useful for hot files receiving multiple requests closely spaced in time
Idea is to serve multiple requests together (entirely or partially)
  Network transmissions may be multicast (in which case both server and network resources saved), or unicast (in which case only server resources, such as disk bandwidth, are saved)
Consider delivery of a hot file, and suppose want to minimize:
  server bandwidth usage for that file
  maximum client delay (download time)

Cyclic Multicast Protocols

Without Losses

Operation: The file is cyclically scheduled on a channel
  - Clients listen until fully served
[Figure: server cyclically transmits File A (pieces 1 2 3 4 5 6 7 8); the client collects pieces until it holds 1 2 3 4 5 6 7 8]

Cyclic Multicast Protocols

With Losses

Operation: The file is cyclically scheduled on a channel
  - Clients listen until fully served
[Figure: server cyclically transmits File A (pieces 1 2 3 4 5 6 7 8); with losses on the channel the client holds 1 2 X X X 6 7 8 and must keep listening for the missing pieces]
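The effect of losses on a cyclic schedule can be seen in a small slot-based simulation. The model is an assumption made for illustration: one piece is transmitted per time slot in the order 1..K repeated, and `lost_slots` lists the slots whose transmission the client misses.

```python
def cyclic_download(num_pieces, start_slot, lost_slots, max_slots=1000):
    """Return how many slots a client listens before holding every piece."""
    received = set()
    for slot in range(start_slot, start_slot + max_slots):
        piece = slot % num_pieces + 1       # cyclic schedule: 1, 2, ..., K, 1, ...
        if slot not in lost_slots:
            received.add(piece)
        if len(received) == num_pieces:
            return slot - start_slot + 1
    raise RuntimeError("file not completed within max_slots")


no_loss = cyclic_download(num_pieces=8, start_slot=0, lost_slots=set())
with_loss = cyclic_download(num_pieces=8, start_slot=0, lost_slots={2, 3, 4})
mid_cycle = cyclic_download(num_pieces=8, start_slot=5, lost_slots=set())
```

Without losses a client needs exactly one cycle (8 slots) regardless of where in the cycle it tunes in. Losing pieces 3, 4 and 5 in the first cycle forces it to wait until those pieces come around again, for 13 slots in total.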

A Digital Fountain Approach

Operation: The file data is erasure coded and clients listen to the channel until having retrieved "enough" pieces to re-create the original file
[Figure: File A (pieces 1 2 3 4 5 6 7 8) -> Enc. -> Server -> losses -> Client -> Dec. -> File A (pieces 1 2 3 4 5 6 7 8)]

Fountain service

Basic fountain protocol (with turn off when idle)
  Cyclic/l: (parameter: transmission rate) (e.g., digital fountain of Byers et al.)
[Figure: request arrivals at server S]
Optimal if clients are not subject to additional batch delays
Main drawback: Periods during which relatively few clients are being served

Locality vs. aggregation

But now have server selection problem
  network resource usage is minimized by selecting nearest server
  server resource usage is minimized by directing closely-spaced requests for the same file to the same server
[Figure: several servers (S) and clients (C)]

ACM TOMCCAP 2010; IEEE/ACM MASCOTS 2010

A final note

If you are interested in one (or more) topic(s) in this course, please do not hesitate to contact me ...
There are always lots of interesting problems to work on (for a thesis project (exjobb), for example)!!!
You can also find out more about my research on my Website:
www.cs.usask.ca/grads/nic169/research.html
