Notes on the Simulation of TCP Congestion Control Algorithms


Miriam Allalouf, Yuval Shavitt, and Eitan Steiner
School of Electrical Engineering, Tel Aviv University

Abstract—Simulations are used to evaluate various suggestions for improvements in the MAC, transport, and application layers. In many of these simulations, TCP is used as part of the system model. We compared the TCP micro-behavior model of the widely used NS-2 simulator with that of a newer network simulator, J-Sim. We found that the two simulators differ in their implementation of the congestion control mechanism: J-Sim is more compliant with the RFC than NS-2.

I. INTRODUCTION

TCP is the most common end-to-end transport protocol in use today, and a fundamental building block in networking technology. Discrete event network simulators are a significant tool in TCP-related research, used to validate suggested TCP improvements and, more often, to test higher layer protocols in various settings. In this paper, we study the characteristics of TCP congestion control algorithms as implemented by common network simulation tools. We observe and compare the deployment and execution of two network simulators: the popular and widely used NS-2 network simulator [NS2] and J-Sim, an interesting Java-based network simulator [JSI]. Specifically, looking at the TCP details, we focus on the micro-behavior of the TCP congestion control features with respect to packet loss events. We compare the behavior of two different TCP flavors, Tahoe and Reno, running TCP connections over a lossy bottleneck link. The results of both simulators are compared and the differences are analyzed.

TCP is a connection-oriented protocol which follows the go-back-n model using cumulative positive acknowledgements, and requires re-transmission of data segments lost during transport. The protocol time line consists of a few congestion control phases and mechanisms: Slow-Start, Congestion-Avoidance, and Fast-Retransmit [Jac88]. These phases were added by different TCP flavors in order to improve the goodput of the TCP connection. The slow-start and congestion-avoidance phases are used by a TCP sender to control the amount of outstanding data being injected into the network, and are implemented by maintaining a set of state variables per TCP connection. The congestion window (cwnd) is a sender-side limit on the amount of data the sender can transmit into the network before receiving an acknowledgement (ACK), while the receiver's advertised window (rwnd) is a receiver-side limit on the amount of outstanding data. The minimum of cwnd and rwnd governs data transmission. Another state variable, the slow-start threshold (ssthresh), indicates the transition point from slow-start to congestion avoidance. According to Internet RFC 2581 [APS99], the TCP congestion control algorithms are defined as follows:

Slow-Start Phase: TCP Tahoe established the slow-start phase, which is the initial congestion control phase. Beginning transmission into a network with unknown conditions requires TCP to slowly probe the network to determine the available capacity, in order to avoid congesting the network with an inappropriately large burst of data. The slow-start algorithm is used for this purpose at the beginning of a transfer, or after repairing a loss detected by the retransmission timer. During slow-start, TCP increments cwnd by at most MSS bytes for each ACK received that acknowledges new data. Slow-start ends when cwnd exceeds ssthresh or when congestion is detected.

Congestion-Avoidance Phase: When cwnd exceeds ssthresh, TCP operates in congestion avoidance, where cwnd is incremented by 1 MSS per round-trip time (RTT). Congestion avoidance continues until congestion is detected. Thus, the formula used to control cwnd is:
cwnd += MSS × MSS / cwnd    (1)
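For concreteness, the per-ACK window update implied by the two phases can be sketched in Python (an illustrative sketch, not NS-2 or J-Sim code; the 1460-byte MSS is an assumption):

```python
MSS = 1460  # assumed maximum segment size in bytes

def on_new_ack(cwnd, ssthresh):
    """Per-ACK congestion window update: slow-start vs. congestion avoidance."""
    if cwnd < ssthresh:
        # Slow-start: at most one MSS per ACK of new data,
        # i.e., roughly doubling cwnd every RTT.
        return cwnd + MSS
    # Congestion avoidance, equation (1): adds about one MSS per RTT in total.
    return cwnd + MSS * MSS // cwnd
```

Calling this once per new ACK reproduces the familiar exponential-then-linear growth of the congestion window.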

When a TCP sender detects segment loss using the retransmission timer, the value of ssthresh is updated as follows:
ssthresh = max{FlightSize/2, 2 × MSS}    (2)

where FlightSize is the amount of outstanding data in the network, namely data which has already been sent but not yet acknowledged. Both simulators use cwnd as an approximation for FlightSize.

Fast-Retransmit / Fast-Recovery Phase: With Fast-Retransmit, after receiving a small number (usually 3) of duplicate acknowledgements for the same TCP segment (dup ACKs), the transmitter infers that a packet has been lost and retransmits it without waiting for the retransmission timer to expire, resulting in higher channel utilization and connection throughput. The TCP Reno implementation [APS99] retained the enhancements incorporated into Tahoe, but modified the Fast-Retransmit operation to include Fast-Recovery [Jac90]. The Fast-Recovery algorithm prevents the communication path (pipe) from going empty after Fast-Retransmit, thereby avoiding the need to slow-start to refill the pipe after a single packet loss. Fast-Recovery operates by assuming that each dup ACK received represents a single packet having left the pipe at the receiving end. Thus, during Fast-Recovery, the TCP sender is able to make intelligent estimates of the amount of outstanding data. A TCP receiver should send an immediate duplicate ACK when an out-of-order segment arrives, in order to signal the sender that a segment is missing. A TCP receiver should also send an immediate ACK when the incoming segment fills in part of a gap in the sequence space. A TCP sender uses the Fast-Retransmit algorithm to detect and repair loss based on incoming duplicate ACKs. To avoid spurious segment retransmissions, e.g., due to packet reordering or packet duplication, the sender waits for three duplicate ACKs before inferring that a segment is lost; their arrival triggers retransmission of what appears to be the missing segment without waiting for the retransmission timer to expire. After the fast retransmit of the missing segment, the Fast-Recovery algorithm governs the transmission of new data until a non-duplicate ACK arrives.
The reason for avoiding slow-start is that receiving duplicate ACKs not only indicates that a segment has been lost, but also that segments are most likely leaving the network and being buffered at the receiver. The following steps outline the detailed behavior of the Fast-Retransmit and Fast-Recovery algorithms (which are usually implemented together), as presented in RFC 2581 [APS99]:

Step 1: When the 3rd duplicate ACK is received, set ssthresh to no more than the value given in equation (2) above.

Step 2: Retransmit the lost segment and set cwnd to ssthresh + 3 × MSS. This inflates the congestion window by the three segments which have left the network and which the receiver has buffered.

Step 3: For each additional duplicate ACK received, increase cwnd by MSS. This inflates the congestion window to reflect the additional segment that has left the network.

Step 4: Transmit a segment, if allowed by the minimum of the new value of cwnd and the receiver's advertised window.

Step 5: When the next ACK arrives that acknowledges new data, set cwnd to ssthresh (the value set in step 1). This is termed "deflating" the window. This ACK should be the acknowledgement elicited by the retransmission from step 2, one RTT after the retransmission. In addition, this ACK should acknowledge all intermediate segments sent between the lost segment and the 3rd duplicate ACK, if none of these were lost.

This algorithm is known to recover poorly from multiple losses in a single flight of packets [FF96]. New-Reno was proposed to address this issue [FHG04].
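The steps above can be collected into a small sender-side sketch (illustrative Python in units of segments, with cwnd standing in for FlightSize; this mirrors RFC 2581, not the code of either simulator):

```python
MSS = 1  # work in units of segments for clarity

class RenoRecovery:
    """Minimal Fast-Retransmit / Fast-Recovery bookkeeping."""

    def __init__(self, cwnd):
        self.cwnd = cwnd
        self.ssthresh = None
        self.in_recovery = False
        self.dupacks = 0

    def on_dupack(self):
        self.dupacks += 1
        if self.dupacks == 3 and not self.in_recovery:
            # Steps 1-2: halve ssthresh (equation (2)), retransmit the lost
            # segment, and inflate cwnd by the three buffered segments.
            self.ssthresh = max(self.cwnd // 2, 2 * MSS)
            self.cwnd = self.ssthresh + 3 * MSS
            self.in_recovery = True
        elif self.in_recovery:
            # Step 3: each further dup ACK means one more segment left the pipe.
            self.cwnd += MSS

    def on_new_ack(self):
        if self.in_recovery:
            # Step 5: deflate the window back to ssthresh.
            self.cwnd = self.ssthresh
            self.in_recovery = False
        self.dupacks = 0
```

For example, with cwnd = 10 segments, the third dup ACK sets ssthresh = 5 and cwnd = 8; two further dup ACKs inflate cwnd to 10, and the recovering ACK deflates it back to 5.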

II. SIMULATION ENVIRONMENT

Network simulations are usually done using discrete event simulators, which include a variety of libraries with network elements, traffic models, and protocols, thus allowing the prototyping and simulation of various networking scenarios. One of the most popular open-source simulators is the NS-2 project, currently hosted by ISI at USC [NS2]. The simulator is based on two parallel layers of class hierarchies: an object-oriented simulator in C++ and a corresponding interpreted OTcl layer, which allows simulation scenarios to be written as Tcl scripts while retaining the execution efficiency of compiled C++ modules. Another interesting network simulator is the J-Sim project led by J. Hou at UIUC [JSI], which is Java-based. J-Sim is designed as an autonomous component architecture derived from the IC design architecture. J-Sim is also a dual-language simulation environment: it uses Java for component implementation and supports Tcl scripting as a means to write simulation scenarios (although Java can also be used for this purpose). This study compares the TCP congestion control behavior of the NS-2 and J-Sim network simulators. We chose these simulators because both are open-source, readily available environments that support the TCP functionality.

Scenarios and Topologies. In our simulations we used simple topologies that create congestion on a single bottleneck link, as illustrated in Figs. 1 and 2. The

[Fig. 1. Topology for simulation scenario 1: six sources (S0-S5) and six sinks (D0-D5) connected through a shared bottleneck link BN1-BN2 with capacity C Mbps and buffer size N packets.]

[Fig. 2. Topology for simulation scenario 2: two source-sink pairs (S0-D0 with RTT 320 ms, S1-D1 with RTT 240 ms) sharing a bottleneck link BN1-BN2 with capacity C Mbps and buffer size N packets.]

bottleneck link is configured to have limited buffer space in the output queue of the bottleneck node, to force packet drops. A simulative FTP agent was used at the application layer as a bulk-data transmission driver for the TCP layer. In order to evaluate the TCP congestion avoidance behavior, we traced the simulator state variables: Congestion-Window (cwnd), Sender-Sequence-Number (seqno), and Receiver-ACK (ack). Due to differences in the simulators' software structure, we used different infrastructure mechanisms to log events in the system. In J-Sim, we used the event-driven mechanism to log the values of cwnd, seqno, and ack; this mechanism generates a log entry whenever a relevant event occurs. In NS-2, cwnd is not available in the event-driven logs (trace files), so we sampled the three state variables periodically at a sufficiently high rate (a 2-5 msec period, depending on the scenario). Equivalent scenarios were reproduced with NS-2 and J-Sim to compare and validate the results of the different simulator implementations. The simulation-based comparison consists of two basic scenarios: 1) multiple simultaneous connections using either TCP Tahoe or Reno; and 2) a two-connection scenario for TCP Reno, in which one connection acts as background traffic for the measured connection. All scenarios used a drop-tail queuing scheme. The first multiple-connection scenario had a 14-node

topology, which included six TCP sources and six TCP sinks, all passing through a shared (2-node) bottleneck link (see Fig. 1). Link delays were set to create pairs of short, medium, and long TCP connections (two of each). We used two sets of link delays. In the first, the short, medium, and long delays were set to 20 msec, 140 msec, and 340 msec, respectively, and the bottleneck link capacity was 4 Mbps. In the second set, the delays were 240 msec, 360 msec, and 500 msec, respectively, and the bottleneck link capacity was 2 Mbps. For the first set of parameters, all TCP connections started simultaneously. For the second set, the TCP connections were started at different times: the long, medium, and short connections were phased in order approximately 6 sec apart (with a 0.2 sec gap between connections of the same type). These multiple-connection scenarios were repeated for TCP Tahoe and Reno. Note that packet drops were not forced explicitly, but resulted from buffer overflow at the output port of the congested link.

III. SIMULATION RESULTS

In this section we present comparative simulation results for TCP Tahoe and TCP Reno, as generated by both the NS-2 and J-Sim environments. We compared the overall congestion-control behavior for multiple connections, and also focused on the detailed micro-behavior in response to a single loss event. The graphs show the combined outline of the Congestion-Window (cwnd), Sender-Sequence-Number (seqno), and Receiver-ACK (ack) state variables vs. time. The analysis of the results is derived from these graphs.

TCP Tahoe. For this flavor, both simulators produced matching results for all sets of simulated parameters. The observed behavior, illustrated in Fig. 3, matches the typical congestion control behavior, i.e., slow-start resumes after a packet loss (described in Section I). Fig. 4 shows the micro-behavior of a single loss event.
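The NS-2 side of this tracing (periodic sampling of cwnd, seqno, and ack) can be sketched as follows; the scheduler hook and agent attributes here are hypothetical names, not the actual NS-2 API:

```python
def schedule_sampling(schedule, agent, log, period=0.002, t_end=40.0):
    """Register periodic sampling events for cwnd/seqno/ack.

    `schedule(time, callback)` is a hypothetical simulator hook; `agent`
    is assumed to expose cwnd, seqno, and ack attributes.
    """
    t = 0.0
    while t <= t_end:
        # Each callback appends one (time, cwnd, seqno, ack) record.
        schedule(t, lambda t=t: log.append((t, agent.cwnd, agent.seqno, agent.ack)))
        t += period
```

With a 2 msec period this yields the same cwnd-vs-time outline that J-Sim's event-driven logging produces directly.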

Fig. 3. Macro behavior of TCP Tahoe using NS-2 (14-node scenario, parameter set 2).

Fig. 5. Macro behavior of TCP Reno using NS-2 (14-node scenario, parameter set 2).

Fig. 4. Micro behavior of TCP Tahoe using NS-2 (14-node scenario, parameter set 2).


TCP Reno. In this flavor, a loss event during the congestion avoidance stage (when the congestion window, cwnd, is above ssthresh), identified at the sender side by a triple duplicate ACK, causes cwnd to drop (deflate) by half. This feature was added to avoid reverting to slow-start from a minimal window size. Interestingly, the two simulators exhibit significantly different behavior in this case: in NS-2 a loss event causes cwnd to drop once, while in J-Sim cwnd drops twice: first when the loss is detected, and a second time when the loss is recovered (indicated by receiving the first non-duplicate ACK). Figures 5 and 6 show clearly how cwnd changes differently in the two simulators. In order to isolate the micro-behavior of TCP Reno recovering from a single loss event, we constructed a second simulation scenario (see Fig. 2): two TCP connections run over a shared bottleneck link. The primary TCP connection under test starts first, and after some time it is disturbed by the secondary connection, which acts as background noise. Congestion is caused by the limited buffer space in the bottleneck queue, which

Fig. 6. Macro behavior of TCP Reno using J-Sim (14-node scenario, parameter set 2).
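The qualitative difference between Figs. 5 and 6 can be captured by two sketched cwnd trajectories around one loss event (illustrative Python in segment units; the specific values are hypothetical, chosen to echo Table II):

```python
def reno_rfc_trace(cwnd, extra_dupacks):
    """J-Sim-style (RFC 2581) trajectory: inflate, then deflate a second time."""
    ssthresh = max(cwnd // 2, 2)
    trace = [ssthresh + 3]           # 3rd dup ACK: first shrink plus 3 segments
    for _ in range(extra_dupacks):
        trace.append(trace[-1] + 1)  # inflate per additional dup ACK
    trace.append(ssthresh)           # new-data ACK: deflate to ssthresh
    return trace

def reno_ns2_trace(cwnd, extra_dupacks):
    """NS-2-style trajectory as observed: one halving, then a static window."""
    ssthresh = max(cwnd // 2, 2)
    return [ssthresh] * (extra_dupacks + 2)
```

With cwnd = 33 segments and two extra dup ACKs, the RFC-style trace shows two visible drops (to 19 and finally to 16), while the NS-2-style trace stays flat at 16.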

results in packet loss for the TCP connections. The simulation parameters (namely the bottleneck buffer size and bandwidth) were fine-tuned such that the primary connection experiences only a single isolated packet loss in steady state (i.e., during the congestion-avoidance stage). Fig. 7 and Table I zoom in on a single loss event in NS-2. We see the congestion window drop to half its previous value upon the triple duplicate ACK; it remains static until new data is acknowledged, and then resumes its increase. The sender retransmits the last packet which was not acknowledged (this cannot be seen in the graph due to the sampling method, but appears in the NS-2 trace files), and the subsequently transmitted packets resume from the sequence number reached before the loss event. As expected, the observed time difference between the retransmit and recovery events matches approximately 1


Fig. 7. Micro behavior of TCP Reno using NS-2 (six-node scenario).

Fig. 8. Micro behavior of TCP Reno using J-Sim (six-node scenario).

RTT. Specifically, the end-to-end link delay was set to 160 msec, which results in an RTT of 320 msec. Fig. 8 and Table II zoom in on a single loss event in J-Sim. Unlike what we saw in NS-2, here the congestion window (cwnd) was decreased (deflated) twice: upon the packet retransmit and upon the new-data acknowledgement. As in NS-2, only the lost packet is retransmitted, and the subsequently transmitted packets resume from the sequence number reached before the loss event. Here, too, the time difference between the retransmit and recovery events matches approximately 1 RTT (320 msec). Looking at the TCP congestion control RFC [APS99], it seems that the J-Sim implementation fully complies with the Fast-Retransmit/Fast-Recovery algorithm described in the RFC, while NS-2 exhibits only partial compliance. Specifically, the NS-2 implementation does not fully comply with step 3 of the algorithm as described in the introduction, since the congestion window is not inflated during the Fast-Recovery state. Interestingly, both simulators exhibit behavior different from the description in the popular TCP reference book by Stevens [Ste94, p. 315, figure 21.10]. According to the text presented there, and an example taken from the BSD implementation, the TCP congestion window should not shrink after a Fast-Retransmit event but rather continue to inflate, and should shrink (deflate) only once, after the Fast-Recovery event. Additional differences which contribute to the discrepancy of the results are the unimplemented 3-way handshake during TCP connection establishment in J-Sim (as stated in the J-Sim documentation), as well as the node queuing mechanism. These result in different patterns of dropped packets in the NS-2 and J-Sim bottleneck queues

(for most simulations), even though the general queuing scheme and buffer size were configured equivalently and the overall TCP connection durations matched. We also examined the ease of use of the two simulators and found them to be comparable. One advantage J-Sim has over NS-2 concerns MS Windows users: the Java-based J-Sim can be installed natively on MS Windows, while the Linux-based NS-2 requires installation over Cygwin, which may add some overhead to the deployment process.

IV. CONCLUSIONS

There are a number of discrete event network simulators, both commercial (OPNET, Simulink-SimEvents) and open-source (NS-2, NCTUns, JiST, and J-Sim, to name a few). In this work we compared NS-2, which appears to be the most widely used open-source simulator, with J-Sim, which, although not yet commonly used, appears to be a well-architected platform. Most TCP protocol stacks in use today are based on either Reno or New-Reno; hence, simulation accuracy for these protocols is important for simulation-based research. Following the simulation results presented in this work, we conclude that the NS-2 and J-Sim implementations generally match with respect to the TCP Tahoe flavor. However, they differ with respect to the TCP Reno flavor, namely in the Fast-Recovery algorithm, where differences in micro-behavior were observed. These second-order differences may represent valid variations between implementations, but at the same time they exhibit partial non-compliance with the RFC.

Event                       Time [sec]        Comments
Packet Drop                 16.782 (16.731)   Drop event logged (seqno = 1575); actual transmission time in parentheses
3rd Duplicate Acknowledge   17.308            RENO FAST RETX event logged
Update ssthresh             17.308            [54/2] = 27
Packet Retransmit           17.308            Sequence number of dropped packet
cwnd Shrink No. 1           (17.310)          cwnd = ssthresh (= 27) (sampled time)
New Data Acknowledge        17.631 (17.635)   New Data (seqno = 1630); logged time (sampled time)
cwnd Shrink No. 2           (17.635)          Not implemented; cwnd starts to increase after a static period

TABLE I
NS-2: Reno, connection 0

Event                       Time [sec]   Comments
Packet Drop                 5.987        Garbage event logged (seqno = 365)
3rd Duplicate Acknowledge   6.297        DUPACK3 event logged
Update ssthresh             6.297        [33/2] = 16
Packet Retransmit           6.297        Sequence number of dropped packet
cwnd Shrink No. 1           6.297        cwnd = ssthresh + 3*MSS (16 + 3 = 19)
New Data Acknowledge        6.621        New Data (seqno = 389)
cwnd Shrink No. 2           6.621        cwnd = ssthresh (= 16)

TABLE II
J-Sim: Reno, connection 0
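As a sanity check, the retransmit-to-recovery gaps recorded in Tables I and II should both match the configured RTT (a small Python check; times in seconds, taken from the tables above):

```python
rtt = 2 * 0.160  # 160 ms end-to-end delay gives a 320 ms round trip

ns2_gap = 17.631 - 17.308   # Table I: retransmit to new-data ACK
jsim_gap = 6.621 - 6.297    # Table II: retransmit to new-data ACK

# Both gaps are within a few milliseconds of one RTT.
assert abs(ns2_gap - rtt) < 0.01
assert abs(jsim_gap - rtt) < 0.01
```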

REFERENCES

[APS99] M. Allman, V. Paxson, and W. R. Stevens. TCP congestion control. Internet RFC 2581, April 1999.
[FF96]  K. Fall and S. Floyd. Simulation-based comparisons of Tahoe, Reno and SACK TCP. Computer Communication Review, 26(3):5-21, July 1996.
[FHG04] S. Floyd, T. Henderson, and A. Gurtov. The NewReno modification to TCP's fast recovery algorithm. Internet RFC 3782, April 2004. Obsoletes RFC 2582 from April 1999.
[Jac88] V. Jacobson. Congestion avoidance and control. In ACM SIGCOMM, pages 314-329, August 1988.
[Jac90] V. Jacobson. Modified TCP congestion avoidance algorithm. end2end-interest mailing list, April 30, 1990.
[JSI]   J-Sim. http://www.j-sim.org.
[NS2]   The network simulator: ns-2. http://www.isi.edu/nsnam/ns.
[Ste94] W. R. Stevens. TCP/IP Illustrated, Volume 1. Addison-Wesley, November 1994.
