Jeffrey Semke
Jamshid Mahdavi
Matthew Mathis
PSC/CMU/NLANR http://www.psc.edu/networking/auto.html 1
Acknowledgments
● Work made possible by funding from the
National Science Foundation
● Much assistance from
– Greg Miller and the MCI vBNS Team
– Kevin Lahey (NASA Ames)
– kc claffy & Jambi Ganbar (SDSC)
High Speed File Transfers
[Figure: a server and a client connected across the vBNS]
● 10 MB file transfer
● Path properties:
– 155 Mb/s bandwidth
– 68 msec round trip time
Expected Performance
● Ideal transfer time = 10 MB / (155 Mb/s) = 80 Mb / (155 Mb/s) ≈ 0.5 sec
● Typical transfer time: 43 sec (!)
● Why? Default TCP tuning limits the file
transfer to a window of 16 kB, so at most
16 kB can be in flight per 68 msec round trip
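The gap between the two transfer times can be checked with a short calculation. The sketch below assumes decimal units (1 MB = 10^6 bytes, 1 kB = 10^3 bytes) and ignores slow start and protocol overhead; the function names are illustrative only.

```python
# A window-limited TCP connection delivers at most one window per round trip.
FILE_BYTES = 10 * 10**6      # 10 MB file (decimal megabytes assumed)
LINK_BPS = 155 * 10**6       # 155 Mb/s path bandwidth
RTT_SEC = 0.068              # 68 msec round trip time
WINDOW_BYTES = 16 * 10**3    # 16 kB default window (decimal kB assumed)


def ideal_transfer_time(file_bytes, link_bps):
    """Time to move the file if the link is kept completely full."""
    return file_bytes * 8 / link_bps


def window_limited_time(file_bytes, window_bytes, rtt_sec):
    """Time to move the file when only one window is sent per round trip."""
    throughput_bytes_per_sec = window_bytes / rtt_sec
    return file_bytes / throughput_bytes_per_sec


ideal = ideal_transfer_time(FILE_BYTES, LINK_BPS)
limited = window_limited_time(FILE_BYTES, WINDOW_BYTES, RTT_SEC)
print(f"ideal: {ideal:.2f} s, window-limited: {limited:.1f} s")
```

Under these assumptions the window-limited estimate is about 42.5 sec, in line with the 43 sec typical transfer time.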
Default Tuning:
Performance vs. Delay
TCP Assumptions
● Assume that the TCP/IP stack already includes
modifications for higher performance:
– RFC 1191 Path MTU Discovery
– RFC 1323 TCP Large Windows
– RFC 2018 Selective Acknowledgment (SACK)
Socket Buffers
[Figure: sender and receiver stacks. On the sender, the application writes data into the send socket buffer and TCP/IP transmits it across the network; on the receiver, TCP/IP places arriving data into the receive socket buffer, from which the application reads it.]
Under-buffered send socket buffer
[Figure: sender and receiver stacks with a send socket buffer that is too small. The application still has data to send (up to Data 100), but only a single data segment (Data 26) and a few acknowledgments (Ack 20, Ack 21, Ack 22) are in flight, so the path is mostly idle.]
Hand Tuning:
● Increase sender and receiver socket buffers
to enable high performance.
● Required socket buffer size is 1 to 2 times
BW * Delay
● In this example, 2 * 155 Mb/s * 68 msec =
2.6 MB
● Can be done system wide, or by each user.
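Per-connection hand tuning can be sketched with the standard socket options SO_SNDBUF and SO_RCVBUF. This is a minimal illustration rather than the procedure used in the talk: the helper name `socket_buffer_bytes` is hypothetical, and the operating system may clamp, adjust, or refuse the requested size.

```python
import socket


def socket_buffer_bytes(bandwidth_bps, rtt_sec, factor=2):
    """Buffer size as factor * bandwidth * delay, converted from bits to bytes."""
    return round(factor * bandwidth_bps * rtt_sec / 8)


# 2 * 155 Mb/s * 68 msec, about 2.6 MB
size = socket_buffer_bytes(155e6, 0.068)

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
    # Request the larger buffers before connecting.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, size)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, size)
except OSError as err:
    # Some systems reject requests above a configured maximum.
    print("buffer request refused:", err)

# Read back what the system actually granted for the send buffer.
granted = sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
print(f"requested {size} bytes, send buffer is now {granted} bytes")
sock.close()
```

Comparing the requested and granted sizes shows why this approach needs some networking knowledge from each user: the request alone does not guarantee the buffer.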
Problems with Hand Tuning
● Per User:
– Requires users to be network wizards.
● System Wide:
– All connections default to same buffer size
– Can result in overuse of system memory
(specifically MBUF Clusters) and in some
cases crashes.
Over-buffered send socket buffer
[Figure: sender and receiver stacks with an oversized send socket buffer. The buffer holds far more application data (up to Data 999,999) than the path can carry; only a few data segments (Data 26, Data 27, Data 28) and acknowledgments (Ack 20, Ack 21, Ack 22) are in flight.]
System Memory Usage
Goals of Sender Autotuning
Congestion Window
● The sender uses the congestion avoidance
algorithm to find the appropriate
window (cwnd)
– Factors that determine it:
● Distance between hosts (delay)
● Bottleneck link rate (bandwidth)
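As a rough illustration of how the window probes for the path's capacity, the sketch below applies additive increase and multiplicative decrease in units of segments. It is a simplification, not the stack's implementation: slow start is omitted, and `path_capacity` is a hypothetical stand-in for the bandwidth*delay product, with a loss assumed whenever the window exceeds it.

```python
def aimd(path_capacity, rounds, cwnd=1):
    """Return the window after each round trip for a toy AIMD sender."""
    trace = []
    for _ in range(rounds):
        if cwnd > path_capacity:
            # Loss detected: multiplicative decrease (halve the window).
            cwnd = max(1, cwnd // 2)
        else:
            # No loss: additive increase of one segment per round trip.
            cwnd += 1
        trace.append(cwnd)
    return trace


trace = aimd(path_capacity=8, rounds=12)
print(trace)
```

The window climbs until it overshoots the capacity, backs off, and climbs again, so its size tracks the delay and bottleneck rate of the path.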
Sender auto-tuning based on cwnd
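[Figure: sender and receiver stacks as in the over-buffered case, except that the send socket buffer holds data only up to 20 + S rather than all of the application's data (up to 999,999); Data 26, Data 27, Data 28 and Ack 20, Ack 21, Ack 22 are in flight.]
2*cwnd <= S < 4*cwnd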
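One way to keep S inside this band without resizing the buffer on every change of cwnd is to leave S alone while the inequality holds and recompute it only when it is violated. The sketch below assumes a reset value of 3*cwnd; that value and the function name `update_target` are illustrative assumptions, not taken from the slides.

```python
def update_target(target, cwnd):
    """Return a send-buffer target S satisfying 2*cwnd <= S < 4*cwnd."""
    if 2 * cwnd <= target < 4 * cwnd:
        # Inside the band: keep S unchanged (hysteresis).
        return target
    # Outside the band: move S back to the middle (assumed reset value).
    return 3 * cwnd


target = 0
history = []
for cwnd in [10, 12, 16, 20, 10, 4]:
    target = update_target(target, cwnd)
    history.append(target)
print(history)
```

In this run S changes only four times while cwnd changes at every step, which is the point of leaving a factor of two between the bounds.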
sb_net_target vs. cwnd over time
Fair Share Algorithm
● sb_net_target may not be attainable for all
connections
● Fair share is periodically calculated
● Small connections (sb_net_target < fair share)
donate unused memory to the pool
● Large connections (sb_net_target >= fair share) are
limited to the fair share.
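The slides do not give the exact calculation, so the following is only a sketch of one common way to obtain such a share (max-min, or "water-filling", allocation). The function name `fair_share`, the pool size, and the targets are hypothetical, and units are arbitrary.

```python
def fair_share(pool, targets):
    """Split a memory pool among connections with the given buffer targets.

    Returns (share, allocations); allocations follow the order of targets.
    """
    remaining = float(pool)
    pending = sorted(targets)
    while pending:
        share = remaining / len(pending)
        if pending[0] >= share:
            # Every connection still pending is "large": cap each at the share.
            break
        # "Small" connection: it keeps its target and donates the rest.
        remaining -= pending.pop(0)
    else:
        # The pool covers every target, so no connection is limited.
        share = float("inf")
    return share, [min(t, share) for t in targets]


share, allocations = fair_share(1000, [50, 100, 600, 900])
print(share, allocations)
```

Here the two small connections keep their targets of 50 and 100, and the memory they leave unused raises the share for the two large connections from 250 to 425 each.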
Illustration of fair share algorithm
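[Figure: memory pool for TCP sender socket buffers, divided among connections (recoverable labels: conn4, conn5, conn6)]
[Figure: test network with a NetBSD sender in Pittsburgh and a receiver in San Diego, connected through routers (R); recoverable link labels: FDDI rings, 100 Mbps, 10 Mbps Ethernet]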
Basic Functionality Test
● From 1 to 30 concurrent connections from
sender in Pittsburgh to receiver in San
Diego
● 40 Mbps bottleneck link, 68 ms delay
– 340 kB bandwidth delay product
Basic Functionality
System Memory Usage
Diverse concurrent connections
System Memory Usage
Impact
Conclusion
● Better performance
● More efficient use of memory
● More concurrent connections
● Great for servers that have many
connections over very diverse
bandwidth*delay paths