
Performance Characteristics of Convergence Layers in Delay Tolerant Networks

A thesis presented to the faculty of the Russ College of Engineering and Technology of Ohio University

In partial fulfillment of the requirements for the degree Master of Science

Mithun Roy Rajan
August 2011
© 2011 Mithun Roy Rajan. All Rights Reserved.

This thesis titled Performance Characteristics of Convergence Layers in Delay Tolerant Networks

by MITHUN ROY RAJAN

has been approved for the School of Electrical Engineering and Computer Science and the Russ College of Engineering and Technology by

Shawn D. Ostermann Associate Professor of Engineering and Technology

Dennis Irwin Dean, Russ College of Engineering and Technology

Abstract

RAJAN, MITHUN ROY, M.S., August 2011, Computer Science Engineering
Performance Characteristics of Convergence Layers in Delay Tolerant Networks (131 pp.)
Director of Thesis: Shawn D. Ostermann

Delay Tolerant Networks (DTNs) are designed to operate in environments with high delays, significant losses and intermittent connectivity. Internet protocols like TCP/IP and UDP/IP are not designed to perform effectively in challenging environments. DTN uses the Bundle Protocol, an overlay protocol that stores and forwards data units. The Bundle Protocol works in cooperation with convergence layers to function in extreme environments. The convergence layers augment the underlying communication layer to provide services like reliable delivery and message boundaries. This research focuses on the kind of performance that can be expected from two such convergence layers - the TCP Convergence Layer and the Licklider Transmission Protocol Convergence Layer - under various realistic conditions. Tests were conducted to calculate the throughput using these convergence layers under different losses and delays. The throughput obtained using the different convergence layers was compared, and the performance patterns were analyzed to determine which of these convergence layers has a higher performance under the various scenarios.

Approved: Shawn D. Ostermann, Associate Professor of Engineering and Technology

Acknowledgments

I would like to thank my advisor, Dr. Ostermann, for his endless supply of ideas and suggestions in helping me complete my thesis. I am grateful to him for giving me the opportunity to work in the IRG Lab. A very special thanks to Dr. Kruse for his guidance and for always having answers to all my questions. Thanks to Gilbert Clark, Josh Schendel, James Swaro, Kevin Janowiecki, Samuel Jero and David Young for all their help in the lab. It was always great to have someone to discuss research issues and world issues with. I appreciate all the love and support from my family. Thanks for being patient and having faith in me.

Table of Contents

Abstract
Acknowledgments
List of Tables
List of Figures
1 Introduction
2 DTN Architecture and Convergence Layers
  2.1 DTN Architecture
  2.2 Bundle Protocol
  2.3 Convergence Layers
    2.3.1 TCP Convergence Layer
    2.3.2 LTP Convergence Layer
3 Experiment Setup
  3.1 Hardware Configuration
  3.2 Software Configuration
4 Experiments, Result and Analysis
  4.1 TCP Convergence Layer Tests
    4.1.1 TCPCL Without Custody Transfer
    4.1.2 TCPCL With Custody Transfer
  4.2 LTP Convergence Layer Tests
    4.2.1 LTPCL Without custody transfer
    4.2.2 LTPCL With custody transfer
5 Conclusions
References
Appendix A: ION Configuration
Appendix B: Supporting programs

List of Tables

3.1 Testbed Configuration
3.2 Experimental Settings and Parameters
4.1 Number of sessions used for various RTTs

List of Figures

2.1 Bundle Protocol in Protocol Stack
2.2 Bundle Protocol Structure
2.3 Convergence Layer Protocol in Protocol Stack
2.4 TCPCL Connection Lifecycle
2.5 LTPCL Connection Lifecycle
2.6 LTPCL Error Connection Lifecycle
3.1 Physical Testbed Setup
3.2 Logical Testbed Setup
4.1 TCPCL Throughput vs RTT
4.2 Outstanding Window Graph for TCP Reno with 2 sec RTT
4.3 Outstanding Window Graph for TCP Cubic with 2 sec RTT
4.4 Time Sequence Graph of TCP Link with 20 sec RTT
4.5 TCPCL Throughput vs Loss
4.6 TCPCL (w/o custody) Throughput vs RTT & Loss for 200 Byte Bundles
4.7 TCPCL (w/o custody) Throughput vs RTT & Loss for 5000 Byte Bundles
4.8 TCPCL (w/o custody) Throughput vs RTT & Loss for 15000 Byte Bundles
4.9 TCPCL (w/ custody) Throughput vs RTT
4.10 Comparing TCPCL (w/ custody) and TCPCL (w/o custody) Throughput vs RTT
4.11 TCPCL (w/ custody) Throughput vs Loss
4.12 Comparing TCPCL (w/ custody) and TCPCL (w/o custody) Throughput vs Loss
4.13 TCPCL (w/ custody) Throughput vs RTT & Loss for 200 Byte Bundles
4.14 TCPCL (w/ custody) Throughput vs RTT & Loss for 5000 Byte Bundles
4.15 TCPCL (w/ custody) Throughput vs RTT & Loss for 15000 Byte Bundles
4.16 TCPCL (w/ custody) Throughput vs RTT & Loss for 40000 Byte Bundles
4.17 LTPCL Throughput vs RTT
4.18 Comparing LTPCL and TCPCL Throughput vs RTT
4.19 LTPCL Throughput vs Loss
4.20 Comparing LTPCL and TCPCL Throughput vs Loss
4.21 LTPCL Throughput vs RTT & Loss for 5000 Byte Bundles
4.22 LTPCL Throughput vs RTT & Loss for 15000 Byte Bundles
4.23 Comparing LTPCL (w/o custody) and LTPCL (w/ custody) Throughput vs Delay
4.24 Comparing LTPCL (w/o custody) and LTPCL (w/ custody) Throughput vs Loss
4.25 LTPCL (w/ custody) Throughput vs RTT & Loss for 5000 Byte Bundles
4.26 LTPCL (w/ custody) Throughput vs RTT & Loss for 15000 Byte Bundles

1 Introduction

Environments with limited network connectivity, high losses and delays require special networks for communication. Terrestrial networks, which generally use TCP/IP, do not perform well under such conditions. Deep space exploration, wireless and sensor networks, etc. operate in such challenging environments, so it was important to develop a networking solution suited to them. Terrestrial networks can be described as having the following characteristics:

- delay in the order of milliseconds
- low error rates
- continuous connectivity
- high bandwidth

The protocols that have been developed for terrestrial networks exploit the above mentioned benefits. For example, TCP performs a three-way handshake between two communicating entities to establish a connection between them. In a high bandwidth environment with negligible delay, the additional time required to establish a connection will be insignificant. Furthermore, during a TCP transmission with packet loss, TCP will block the delivery of subsequent packets until the lost packet is retransmitted, and this would lead to under-utilization of the bandwidth. Since terrestrial networks have continuous connectivity, under-utilization of the bandwidth will not affect data delivery. The sender will store the data until it receives an acknowledgement from the receiver because if any packets are lost during transmission, the lost packets can be retransmitted. Since the round trip time is in the order of milliseconds, the sender's retransmission buffer will only need to store the sent data for a few milliseconds or seconds.

On the other end of the spectrum, a challenged network can be defined by the following characteristics:

- delay in the order of minutes/hours
- high error rates depending on the environmental conditions
- disrupted connectivity
- limited bandwidth
- asymmetric bandwidth

Burleigh et al. [2003] explains why using an Internet protocol on a delayed/disrupted link is not advisable. When TCP/IP is used in disrupted networks, the connection establishment by the 3-way handshake will be an overhead. The high round-trip time for the 3-way handshake will delay the flow of application data. This is unfavorable in an environment where the connection can be frequently disrupted. Moreover, if there is limited opportunity to send data, using that limited connectivity to establish a connection will be a misuse of the available bandwidth. Similarly, waiting on lost packets will lead to under-utilization of the bandwidth. It is important to maximize the usage of bandwidth when there is connectivity instead of waiting for lost packets. End-to-end TCP retransmissions will cause the retransmission buffer at the sender's end to retain the data it sends for long periods of time. These periods of time can range in length from a few minutes to hours [Burleigh et al. 2003]. There are a few modified versions of TCP with extensions specifically for deep space communication, like the Space Communications Protocol Standards-Transport Protocol (SCPS-TP) [Durst et al. 1996] and TCP Peach [Akyildiz et al. 2001]. In addition, there are also some protocols/architectures which were developed for interplanetary communication, like the Satellite Transport Protocol (STP) [Henderson and Katz 1999] and the Performance Enhancing Transport Architecture (PETRA) [Marchese et al. 2004]. Most of these protocols and architectures solve the high delay, loss, and asymmetric bandwidth problems.

Delay/Disruption Tolerant Networking [Fall 2003; Burleigh et al. 2003] is one such architecture that has been designed to solve the problems posed by such environments. Delay tolerant networks are deployed for deep space communication [Wood et al. 2008, p.1], networks for mobile devices and military networks [Fall 2003]. Delay and Disruption Tolerant networks are generally deployed in environments where the source and destination nodes do not have end-to-end connectivity. In areas with limited networking infrastructure, or areas where communication networks are difficult to build, it is easier and cheaper to set up Delay Tolerant Networks. Farrell and Cahill [2006] explains a number of interesting applications of DTNs. Some of these applications are detailed in the following paragraphs.

Deep-space communication is an excellent example of an environment that deploys Delay/Disruption Tolerant networks. The round-trip times for space links are on the order of minutes (for Mars, anywhere between 9 and 50 minutes depending on the positions of Earth and Mars in their orbits [Akan et al. 2002]). The communication link between nodes/entities in space is intermittent because there may be points in time when the nodes are not in line of sight of each other. Losses are also common in deep space communication because of low signal strength over large distances. A good example of a DTN is the communication between Mars rovers and ground stations. A Mars rover could transmit data directly to earth stations, but doing so would be energy-intensive and could drain the power of the rover. Thus, it would be reasonable to use a relay to transmit the data, so the rover can conserve power. Rovers on Mars collect data and pictures which need to be transmitted to a ground station on earth. After the rover collects the data, it will wait until it can make contact with the Mars orbiter. Therefore, the rover stores the data until it can forward it on to the orbiter. The rover transfers the data to the orbiter when it can make contact. The Mars orbiter will wait until it is in line of sight of the Deep Space Network antenna on earth, so it can relay the data from the rover.

Juang et al. [2002] explains another interesting application of Delay Tolerant Networks called ZebraNet. ZebraNet tracks the migration of zebras in central Kenya. There are a number of options available to track zebras. One option is to have collars for the zebras that use sophisticated GPS to track their positions and upload the data directly to a satellite. Since these collars will be powered by solar cells and a satellite upload will drain the cell quickly and make the collar useless, this option is not viable. Another alternative is to build a network infrastructure, but building a network infrastructure in the wilderness would be expensive. The cost-effective way of doing this is to use Delay Tolerant Networks. In this case, the zebras are fitted with collars that have a GPS receiver, a short range radio and solar cells to power the collar. The GPS wakes up periodically to record the position of the zebra. When a collar is in close range of another collar, the radio will transmit all the data of one zebra to the other zebra and vice versa. When a node comes within close range of multiple nodes, it will aggregate data from multiple zebras. Finally, when a researcher drives by and comes within range of any of these collars, the collar will transfer the aggregated data to its final destination, which is the base station. The collars will delete the accumulated data once they have transferred the data to the base station [Juang et al. 2002].

The DTN architecture (explained in Chapter 2) is an overlay architecture which can function on top of multiple transport layers, thus making DTN deployable in many different environments that support various transport layer protocols. DTN uses a standard message format called the Bundle Protocol (BP) [Scott and Burleigh 2007] for transferring data. An extra layer of reliability can be added to BP by using custody transfer, which is explained in Chapter 2. Multiple transport layers can be used under BP by using the corresponding convergence layers above the transport layer. These convergence layers enhance the underlying transport layer protocol and use the transport protocol to send and receive bundles. Some of the convergence layer protocols that DTN supports are TCP [Demmer and Ott 2008], UDP [Kruse and Ostermann 2008], LTP [Burleigh et al. 2008] and Saratoga [Wood et al. 2008]. The TCP convergence layer (TCPCL) uses TCP, but the other three convergence layers use UDP to send and receive bundles. LTP is also designed to use the Consultative Committee for Space Data Systems (CCSDS) link layer protocol [CCSDS 2006] and is currently being developed to use DCCP [Kohler et al. 2006].

The Interplanetary Overlay Network (ION) implements the DTN architecture and was developed by the NASA Jet Propulsion Lab. ION was intended to be used on embedded systems such as space flight mission systems, but it can also be used to develop DTN protocols and test the performance of DTN protocols. DTN2 [DTNRG 2010], DTN1 [DTNRG 2010] and IBR-DTN [Doering et al. 2008] are some of the other implementations of the DTN architecture. There has been earlier work in Wang et al. [2010] that evaluates the performance of TCPCL implemented by DTN2, LTP implemented by ION, and a TCPCL-LTPCL hybrid for long-delay cislunar communication. This thesis, unlike previous works, evaluates the maximum performance that can be expected from these convergence layers under various conditions of delay and loss. The tests are conducted with a 4 node setup, with the capability of modifying the link characteristics of the middle link. The throughput of TCPCL and LTPCL is compared for different delays, losses, combinations of delays and losses, and bundle sizes. Performance is evaluated for BP with custody transfer and without custody transfer.

Some of the main contributions of this thesis are an LTP dissector for Wireshark and SBP Iperf. The LTP dissector decodes LTP data in Wireshark; this has proved to be a very useful tool to debug and analyze LTP behaviour. SBP Iperf is a network performance tool that is used with DTN protocols. All the performance tests in this thesis were performed using SBP Iperf.

Chapter 2 gives an insight into the DTN architecture, BP and the convergence layers that were evaluated. Chapter 3 discusses the experimental setup, test parameters, and testing tools. Chapter 4 presents the results of the performance tests and analyzes them. Chapter 5 concludes with some recommendations for future work.

2 DTN Architecture and Convergence Layers

2.1 DTN Architecture

Fall [2003] proposes an overlay architecture that will function above transport layer protocols like TCP, UDP, and sensor network transport protocols. The architecture suggests that networks be divided into regions, which can be connected by DTN gateways. The nodes within a region can send and receive data without using the DTN gateway. The DTN gateway will act as a bridge between two regions; therefore, the DTN gateway should understand the transport protocols of both regions. The naming convention consists of two parts: a region name and an entity name. The region name should be unique, but the entity name needs to be unique only within the region. Hence, while routing packets in delay tolerant networks, the DTN gateways use the region names for routing. Only the DTN gateway at the end of the destination region needs to resolve the entity name.

Selecting a path and scheduling a packet transfer in DTN is performed by the use of a list of contact information of the nodes. The contact information consists of the relative or absolute time when contact can be made with other nodes, the delay between nodes, and the data rate that the link will support. This method of routing is called contact graph routing. Other than contact graph routing, there are other routing techniques that can be used depending on the environment. The ZebraNet application explained in the previous chapter uses flood routing, where each node transfers all data to every node it comes in contact with.

The routing nodes can either have persistent or non-persistent storage. The persistent routing nodes will support custody transfer, which implies that the routing node will store a packet/bundle that it receives in persistent storage and send an acknowledgement to the sending node. The routing node will store the bundle until it has successfully transferred it to the next node that accepts custody. Custody transfer is an important concept of the overlay architecture because it gives reliability to the architecture amidst losses and delays. Some of the transport layer protocols like TCP, LTP, etc. also provide reliability; therefore, custody transfer will add a second layer of reliability. Since this overlay architecture functions over transport layer protocols, the architecture needs to augment whichever transport protocol is used by placing a convergence layer above it. If the transport layer protocol is a connection-oriented protocol, then it is the responsibility of the convergence layer to restart connectivity when the connection is lost. Convergence layers are explained more in depth later in this chapter [Fall 2003, p.30 - 33].

Figure 2.1: Bundle Protocol in Protocol Stack

2.2 Bundle Protocol

The BP is the overlay protocol that is used in the DTN architecture. This protocol gives the DTN architecture the capability to store a bundle and then forward it to the next node. A bundle is a collection of data blocks that is preceded by a common header. As shown in Figure 2.1, BP operates on top of the transport layer protocol, and it is designed to function over a wide range of networks including deep space networks, sensor networks, and ad-hoc networks. Wood et al. [2009] closely studies the BP and examines some of the BP's design decisions. Every delay tolerant network differs vastly in characteristics, and it is difficult to find a general networking solution to the problem. Therefore, the DTN architecture implements a generalized message format known as BP, which can be used over a variety of networks. Hence the protocol by itself is not the solution to delay tolerant/disrupted networks; it needs to work in cooperation with convergence layer protocols or transport layer protocols designed specially for delay tolerant networks. The protocol only provides a standard format for data; most of the support for a disrupted network is provided by the convergence layers implemented in the architecture [Wood et al. 2009].

Figure 2.2 shows the bundle structure as specified in Scott and Burleigh [2007]. A bundle consists of a primary block followed by one or more payload blocks. Most of the fields in the BP structure are Self-Delimiting Numeric Values (SDNVs) [Eddy and Davies 2011], which helps to reduce the size of the bundle. Using SDNVs also helps to make the design scalable for the various underlying network layers that the bundle protocol might use; a short sketch of the encoding is shown below.
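
The SDNV rule itself is simple: the integer is split into 7-bit groups, most significant group first, and the high bit of every byte except the last is set [Eddy and Davies 2011]. The following minimal Python sketch illustrates that rule only; it is not the encoder used by ION, and the helper names are invented for this example.

```python
def sdnv_encode(value):
    """Encode a non-negative integer as an SDNV: 7 bits per byte,
    most-significant group first, high bit set on all but the last byte."""
    if value < 0:
        raise ValueError("SDNVs encode non-negative integers only")
    groups = [value & 0x7F]
    value >>= 7
    while value:
        groups.append((value & 0x7F) | 0x80)
        value >>= 7
    return bytes(reversed(groups))

def sdnv_decode(data, offset=0):
    """Decode one SDNV starting at 'offset'; return (value, bytes consumed)."""
    value = 0
    consumed = 0
    for byte in data[offset:]:
        value = (value << 7) | (byte & 0x7F)
        consumed += 1
        if not byte & 0x80:          # a clear high bit marks the final byte
            return value, consumed
    raise ValueError("truncated SDNV")

# Small values stay small, large values simply grow by one byte per 7 bits:
assert sdnv_encode(127) == b"\x7f"
assert sdnv_encode(128) == b"\x81\x00"
assert sdnv_decode(sdnv_encode(40000))[0] == 40000
```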


Figure 2.2: Bundle Protocol Structure

The following are the fields in the bundle structure:

- The bundle processing flags are sub-divided into general flags, class of service and status report flags.
- The block length is an SDNV of the total remaining length of the block.
- The dictionary offsets represent the offsets of specified fields in the dictionary. They contain the scheme offset and scheme-specific part (SSP) offset of the destination, source, report-to, and custodian. Hence, the destination scheme offset is the position within the dictionary which contains the scheme name of the endpoint. Similarly, the destination SSP offset holds the offset within the dictionary which contains the scheme-specific part of the destination endpoint.
- The next fields in the primary bundle block are the creation timestamp and the creation timestamp sequence number, both of which are SDNVs. The creation timestamp is the UTC time when the bundle was created. If more than one bundle is created in the same second, then the bundles are differentiated by the incrementing bundle timestamp sequence number.
- The dictionary information contains the dictionary length, which is an SDNV of the size of the dictionary, and the dictionary itself. The dictionary accommodates the scheme and scheme-specific parts which are referenced by the offsets earlier in the block.
- The last two fields in the primary bundle block (fragment offset, payload size) are included in the block only if the bundle processing flags signal that the bundle is a fragment. The fragment offset is the offset from the beginning of the actual application data, and the payload size is an SDNV of the total application data of the bundle.

The primary bundle block is usually followed by one or more canonical bundle blocks. The general format of the bundle block is also represented in Figure 2.2. The bundle block has a 1 byte type field, and if the bundle block is a payload block then the type field will be 1. The type field is followed by bundle processing flags which can set certain special bundle features, like custody transfer. The length field, which is an SDNV, represents the size of the payload field, and the payload field contains the application data [Scott and Burleigh 2007].

In the DTN architecture, the application hands over data to a local bundle agent, and the bundle agent is responsible for getting the data over to the bundle agent at the destination. Bundling of the data is done by these bundle agents. The bundle agent applies routing algorithms to calculate the next hop to forward the bundle. The bundle agent then advances the bundle to the corresponding convergence layer adapter to forward the bundle to its next hop. In environments where DTNs are applied, there is a high probability that the forwarding will fail because of loss or lack of connectivity. If the convergence layer adapter returns a failure due to lack of connectivity, the bundle agent will wait until the connection is restored and then reforward the bundle. However, if a bundle is lost due to network losses, then it will depend on the reliability of the transport layer protocol and/or the reliability of the bundle protocol to ensure the bundle gets from one hop to the next. On reception of a bundle, the bundle agent needs to compute the next hop and continue the same steps mentioned above. However, if a bundle is destined for a local endpoint, the application data will be sent to the application after reassembly.

BP can add a layer of reliability by using an optional service called custody transfer. Custody transfer requests that a node with persistent storage store the bundle if it has storage capacity. The node that accepts custody of a bundle will transmit a custody signal to the node that previously had custody of the bundle. On reception of the custody signal, the previous custodian will delete the bundle from the custodian's storage and/or retransmission buffer. The previous custodian is not required to wait until the bundle reaches the destination to delete the bundle from its storage. The responsibility of reliably transferring the bundle to the destination lies with the node that accepted custody of the bundle. Therefore, if the bundle is lost or corrupted at some point in time, it is not the responsibility of the sender to resend the bundle, but the responsibility of the node that accepted custody (the custodian of that bundle) to resend it. Custody transfer not only ensures that the bundle reaches the destination from the source, but it also shifts the accountability for the bundle in the direction of the destination. Every time a node accepts custody of a bundle, it frees the previous custodian's resources. On the other hand, bundle transmission without custody transfer relies on the reliability of the transport protocol.

BP is a relatively new protocol, so there are still some undecided issues. Wood et al. [2008] and Wood et al. [2009] list the problems with the current design of the BP. The following are some of the significant concerns:

- Synchronizing time on all the nodes is important for BP because a bundle can be rejected if the timestamp check claims that the bundle has expired due to unsynchronized clocks on the sender and receiver.
- More than one naming scheme is currently used in different BP implementations, and each of the naming schemes has separate rules for creating endpoint identifiers (EIDs).
- There is no accepted routing protocol, essentially because there is more than one naming scheme. Dynamic routing protocols will help to improve the scalability of the network.

2.3 Convergence Layers

The convergence layer acts as an interface between the BP and the transport layer protocol. Figure 2.3 shows the position of the convergence layer protocol in the protocol stack. Since BP is an overlay protocol that can be used above different transport layer protocols, the corresponding convergence layer protocol needs to be used to allow data flow from BP to the transport protocol and vice versa. The main function of this layer is to aid the underlying transport layer. As mentioned before, TCPCL, UDPCL, LTP, and Saratoga are some of the DTN convergence layer protocols.

Figure 2.3: Convergence Layer Protocol in Protocol Stack

A DTN that uses TCP for communication will require a TCP convergence layer between the BP and TCP. The convergence layer enhances the underlying transport protocol by adding some additional functionality to the transport layer protocol which might make it suitable in extreme environments. For example, the convergence layer for a connection-oriented protocol will have to maintain the connection state. Thus, if the connection is lost for any reason, the convergence layer will try to re-establish the connection. Additionally, if there are transport protocols that do not provide congestion control, then the convergence layer adds this functionality to the stack.

2.3.1 TCP Convergence Layer

DTN uses TCPCL when the transport layer protocol being used is TCP. TCP is a reliable protocol, and it ensures that a packet gets from one node to another. Every packet that a node sends has to be acknowledged by the receiver. If a packet is lost in transmission and the sender does not receive an acknowledgement for that packet, then it will retransmit the packet. Timers on the sender side keep track of the time by which the acknowledgment is expected. When the timer runs out, the packet is retransmitted. Similarly, when an acknowledgment is lost, the sender will have to retransmit the packet. TCP regulates the amount of data that is in the network using a congestion window. TCP also uses certain congestion control algorithms when there is loss of packets; TCP is designed to conclude that packet losses are because of congestion. Slow start, congestion avoidance, fast retransmit, and fast recovery [Allman et al. 2009] are used for congestion control.

A TCP connection starts with slow start, and slow start is applied again when there is a retransmission timeout. The congestion window size starts at 1 segment size, and during slow start the congestion window is increased by 1 segment size for every acknowledgment received, thus increasing the congestion window size exponentially. Furthermore, TCP performs congestion avoidance when the congestion window size either reaches the slow start threshold or when there is data loss. The initial size of the slow start threshold is fixed high, usually at the receiver window size. In the congestion avoidance phase, the congestion window size increases by 1 segment size every round trip time. However, the congestion window cannot exceed the receiver window size. When a packet is lost, the receiver will send duplicate ACKs for the last packet in sequence that it received. On receiving 3 duplicate ACKs, the sender will retransmit the missing segments. This algorithm is called fast retransmit because the sender does not wait for the retransmission timer. During fast retransmit, the congestion window size is reduced to half its previous value, and so is the slow start threshold. The fast recovery phase follows fast retransmit. During fast recovery the lost packet is retransmitted and the congestion window size is advanced by 1 segment size each time a duplicate acknowledgement is received. When the receiver acknowledges all the missing data, the congestion window size drops to the value of the slow start threshold and the connection returns to the congestion avoidance phase. On the other hand, if the sender does not receive an acknowledgment, there will be a retransmission timeout. This timeout causes the congestion window to drop to 1 segment size and the connection returns to slow start.

The transport layer implements both congestion control and flow control, so TCPCL does not have to perform any congestion or flow control. TCP also ensures reliable data delivery; therefore, TCPCL can assign the responsibility of reliable data transfer to TCP. The convergence layer's responsibilities instead include re-initiating the connection and managing message boundaries for the TCP stream. Demmer and Ott [2008] proposes a TCP-based convergence layer protocol. Similar to TCP's three-way handshake, TCPCL establishes a connection by exchanging contact headers. This exchange sets up connection parameters, like the keepalive period, acknowledgements, etc. Figure 2.4 shows an example TCPCL message exchange. When NodeA needs to send a bundle to NodeB using TCPCL, NodeA first needs to establish a TCPCL connection with NodeB. To establish a TCPCL connection, NodeA sends a contact header to NodeB. NodeB responds with a contact header and the connection parameters are negotiated. Following connection establishment, both nodes can exchange data. TCPCL has an option to add an extra layer of reliability by using bundle acknowledgements. If the two communicating nodes decide during the contact header exchange that they will support acknowledgements, then each bundle that is transmitted has to be acknowledged by the receiver. The nodes also send keep-alives back and forth; the keep-alives are a method to ensure that the nodes are still connected.
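
The contact header is only a small preamble followed by the node's endpoint identifier. The sketch below shows roughly how such a header could be assembled in Python. The field layout (a "dtn!" magic string, a version byte, a flags byte, a 16-bit keepalive interval, and the local EID preceded by its length as an SDNV) follows my reading of Demmer and Ott [2008]; the exact field sizes, flag meanings and version number should be checked against that draft rather than taken from this sketch.

```python
import struct

def sdnv_encode(value):
    # Minimal SDNV encoder (same rule as the sketch in Section 2.2).
    groups = [value & 0x7F]
    value >>= 7
    while value:
        groups.append((value & 0x7F) | 0x80)
        value >>= 7
    return bytes(reversed(groups))

def tcpcl_contact_header(eid, keepalive=15, flags=0x01, version=3):
    """Assemble an illustrative TCPCL contact header:
    magic 'dtn!', version, flags (bit 0 here requests bundle
    acknowledgements), keepalive interval in seconds, then the
    local EID length as an SDNV followed by the EID itself."""
    eid_bytes = eid.encode("ascii")
    return (b"dtn!"
            + struct.pack("!BBH", version, flags, keepalive)
            + sdnv_encode(len(eid_bytes))
            + eid_bytes)

# Hypothetical EID for illustration only:
print(tcpcl_contact_header("dtn://node1.dtn").hex())
```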

Figure 2.4: TCPCL Connection Lifecycle

TCP/IP is already considered to be a chatty protocol due to the three-way handshake and acknowledgements. The exchange of contact headers and TCPCL acknowledgements adds an extra layer of chattiness. In environments with limited connectivity and bandwidth, the additional packets and bundles that are exchanged for connection setup can affect the performance of the convergence layer negatively. In theory, there is the possibility of three layers of reliability when using TCPCL over TCP: TCP reliability, TCPCL reliability using the optional TCPCL acknowledgements, and BP reliability using custody transfer. Akan et al. [2002] presents the performance of different versions of TCP on deep space links. The performance of TCPCL can be expected to be very similar to TCP. According to the test results in Akan et al. [2002], TCP recorded a throughput high of 120 KB/sec on a 1 Mbps link with no delay, and the throughput drops down to 10 bytes/sec when the round trip time (RTT) is 40 minutes.

2.3.2 LTP Convergence Layer

LTP was specially designed for links with high delays and intermittent connectivity. LTP is used as a convergence layer over space links or links with long round-trip times. Similar to TCP, LTP acknowledges data that it receives and retransmits any lost data. However, unlike TCP, LTP does not perform a 3-way handshake for connection setup. LTP is generally used above the data link layer on space links, but for testing purposes on terrestrial networks it is also used over UDP. Burleigh et al. [2008] explains the objective behind the design of this protocol and the reasoning behind the design decisions. Some of the important design decisions are listed below:

- To utilize the intermittent link's bandwidth efficiently, LTP uses the concept of sessions. Each LTP data block is sent using a session, and LTP will open as many sessions as the link permits. This allows multiple data blocks to be sent over the link in parallel.
- LTP retransmits unacknowledged data, which is what makes LTP a reliable protocol. The LTP receiver uses report segments to let the sender know all the data segments the receiver received successfully.
- Connection setup has been omitted in LTP because in an environment with long delay and limited communication opportunities, a protocol that needs connection setup will under-utilize the bandwidth.
- Unlike TCP, LTP connections are unidirectional; therefore, nodes talking to each other over LTP will have two unidirectional connections open.

Ramadas [2007] and Burleigh et al. [2008] describe the operation of the LTP protocol. A basic LTP operation without link errors or losses is depicted in Figure 2.5. The sender opens up a session to send an LTP data block. As mentioned earlier, a node can open as many sessions as the link will permit. This is also a flow control mechanism because bandwidth is limited by the number of sessions that can be opened simultaneously. Each time a node sends an LTP data block, it will use a different session number. The last data segment of a data block is a checkpoint/End of Block (EOB) segment. When the receiver receives a checkpoint, it will respond with a report segment. The report segment is an acknowledgement of all the data segments of the data block that it received. On receiving a report acknowledging all the segments of the block, the sender closes the session and sends a report acknowledgement. The receiver will close the import session on reception of the report acknowledgement.

Figure 2.5: LTPCL Connection Lifecycle

Figure 2.6 shows an LTP session on an error-prone network. The sender will send all the data segments of the block. If some of the data segments are lost during transmission, the receiver will report all the segments it received in its report segment. This lets the sender know which data segments it needs to retransmit. The sender responds with a report acknowledgement, and then it retransmits the lost segments. After the sender retransmits all the lost segments, it will send a checkpoint segment to the receiver. On reception of the checkpoint, the receiver will respond with another report segment listing all the segments it received. If the report claims to have received all the segments of the block, the sender will close the session and send a report acknowledgement. When the receiver receives the report acknowledgement, it will close the LTP session [Ramadas 2007; Burleigh et al. 2008].

Sessions are the flow control mechanism in LTP. LTP can be configured to open a certain number of sessions, so when all the sessions are in use, data blocks that need to be sent have to wait until a previous session closes. LTP can technically transmit at the link's bandwidth, limited only by the number of sessions that LTP can open simultaneously.
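
As a rough illustration of how the session limit caps throughput on a long-delay link, the back-of-envelope model below assumes each session delivers exactly one block per round trip and ignores segmentation, losses and retransmissions. It is only a sketch of the intuition, not LTP's actual behaviour or ION's implementation.

```python
def ltp_throughput_estimate(block_bytes, rtt_s, sessions, link_bps):
    """Rough upper bound: each session moves one block per round trip,
    so S concurrent sessions move at most S * block / RTT, capped by the link."""
    per_session_bps = 8.0 * block_bytes / rtt_s
    return min(link_bps, sessions * per_session_bps)

# Example: 15000-byte blocks, 10 s RTT, 100 Mbit/s link.
for sessions in (1, 10, 100):
    print(sessions, "sessions ->",
          ltp_throughput_estimate(15000, 10.0, sessions, 100e6), "bit/s")
```

Under these assumptions a single session on a 10 second RTT link is limited to a few kilobits per second regardless of link rate, which is why opening many sessions in parallel matters.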


Figure 2.6: LTPCL Error Connection Lifecycle

3 Experiment Setup

This chapter explains the hardware and software configuration of the test setup.

3.1 Hardware Configuration

Figure 3.1: Physical Testbed Setup

Figure 3.1 shows the physical setup of the testbeds. The 4 testbeds are connected to each other using a 100 Mbps Fast Ethernet switch. The logical testbed setup for the tests is shown in Figure 3.2. Node1 is assigned as the source node and Node4 is assigned as the destination node. All the nodes have host-specific routes. Node1 has to make 2 hops to get to Node4, through Node2 and Node3. Similarly, Node4 also needs to make 2 hops to reach Node1, through Node3 and Node2. The routing information is present in the ION configuration files presented in Appendix A.


Figure 3.2: Logical Testbed Setup

Table 3.1 provides details about the individual testbeds. The tests are done on Linux machines, running Linux kernel version 2.6.24. All the testbeds are 32-bit machines except for Node2. Node2 is a 64-bit machine, so that ION will allocate more small pool memory. This ensures that it doesn't run out of small pool memory during the tests.

Table 3.1: Testbed Configuration

Configuration          Node1              Node2              Node3              Node4
Operating System       Ubuntu 8.04        Ubuntu 8.04        Ubuntu 8.04        Ubuntu 8.04
Linux Kernel Version   2.6.24             2.6.24             2.6.24             2.6.24
CPU                    Intel Pentium 4    AMD Turion 64 X2   Intel Pentium 4    Intel Pentium 4
                       2.80GHz                               3.00GHz            2.80GHz
Memory                 1GB                2GB                1GB                1GB
Architecture           32-bit             64-bit             32-bit             32-bit

ION can allocate a maximum of 16MB of small pool memory on a 32-bit machine and a maximum of 256MB on a 64-bit machine. The small pool memory is used to store bundle-related information within ION. Since Node2 might have to retain bundles in the system when there are losses and delays, it is important for Node2 to have more small pool memory. Otherwise Node2 will run out of small pool memory and crash ION.

3.2 Software Configuration

Table 3.2 lists some of the important experimental factors used in the performance tests. All the testbeds run an open-source version of the Interplanetary Overlay Network, which is an implementation of the DTN architecture from NASA's Jet Propulsion Lab. The number of SYNs/SYN-ACKs that TCP will send is increased from the default value of 5 to 255 by setting the kernel parameters net.ipv4.tcp_syn_retries = 255 and net.ipv4.tcp_synack_retries = 255. This ensures that TCP does not give up on connection establishment over high delay links. The bundle lifetime within ION is set to 10 hours. This helps to ensure that the bundles do not expire during the duration of the test. The tests are performed over 2 sets of DTN protocol stacks - BP over TCPCL over TCP/IP and BP over LTP over UDP.
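
The thesis does not show how these kernel parameters were applied; one straightforward way, assuming root privileges, is to write the values through the /proc interface, which is equivalent to running sysctl -w for each parameter.

```python
# Raise the SYN/SYN-ACK retry counts so TCP keeps attempting connection
# establishment across very long RTTs (values taken from the test setup
# described above; writing /proc/sys requires root).
settings = {
    "/proc/sys/net/ipv4/tcp_syn_retries": "255",
    "/proc/sys/net/ipv4/tcp_synack_retries": "255",
}
for path, value in settings.items():
    with open(path, "w") as f:
        f.write(value)
```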

Table 3.2: Experimental Settings and Parameters

Parameter                 Value
DTN implementation        ION Opensource
DTN Convergence Layers    LTP (ION v2.4.0), TCP (ION v2.3.0)
Bundle Size (bytes)       200 - 40000
Channel Bandwidth         100 Mbits/sec
RTT (sec)                 0 - 20
Loss (%)                  0 - 10

The throughput of the 2 protocols (TCP, LTP) is measured for bundle sizes varying from 200 bytes to 40000 bytes. The behavior of these protocols in extreme environments is modeled by measuring the throughput for various values of round-trip time and packet loss. We also measure the throughput in environments with a combination of delay and loss. The delay or loss on the multi-hop link between Node1 and Node4 is simulated on Link2, between Node2 and Node3. Netem [Hemminger 2005] is used on Node2 and Node3 to emulate delay and loss. The netem rules have been set up such that the delay or loss is applied to incoming traffic on Node2 and Node3. The network emulation rules are applied to incoming traffic rather than outgoing traffic because if loss is applied to outgoing traffic, the loss is reported to the higher layers, and protocols like TCP will retransmit the lost packets, which is not the desired effect.

All the tests use a tool called SBP Iperf, which is a variation of the commonly used Iperf. SBP Iperf is similar to Iperf except that it runs over DTN protocols. The SBP Iperf client runs on Node1, and the SBP Iperf server runs on Node4. The SBP Iperf client transmits data, and the SBP Iperf server waits until it receives all the bundles and responds back to the client with a server report. The SBP Iperf client and server code is presented in Appendix B.3. The client uses an incrementing sequence number in each of the data segments it sends. The last data segment has a negative sequence number, which helps the server determine that the client has stopped sending data. The SBP Iperf server will wait for a few seconds before it sends the server report. This ensures that none of the out-of-order bundles are reported as lost bundles. This wait period can be adjusted by changing the timeout parameter on the server side. The server report contains the following information: throughput, loss, jitter, bundles transmitted and transmission time.
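
The actual SBP Iperf source is in Appendix B.3; the toy sketch below only illustrates the reporting convention described above (incrementing sequence numbers, a negative sequence number marking the end of the test, losses inferred from gaps). The function and field names are invented for this example and do not come from that code.

```python
def summarize(received_seqs, first_rx_time, last_rx_time, bundle_size):
    """Toy server-side accounting: count what arrived, infer losses from
    gaps in the sequence numbers (assumed to start at 0), and compute
    throughput over the receive interval."""
    data_seqs = [s for s in received_seqs if s >= 0]   # negative seq ends the test
    expected = max(data_seqs) + 1 if data_seqs else 0
    lost = expected - len(set(data_seqs))
    elapsed = max(last_rx_time - first_rx_time, 1e-9)
    return {
        "bundles_received": len(data_seqs),
        "bundles_lost": lost,
        "throughput_kbit_s": 8.0 * bundle_size * len(data_seqs) / elapsed / 1000.0,
        "transmission_time_s": elapsed,
    }

# Sequence 2 never arrived, so one bundle is reported lost:
print(summarize([0, 1, 3, 4, -1], 0.0, 2.0, 5000))
```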

4 Experiments, Result and Analysis

4.1 TCP Convergence Layer Tests

This section discusses the results of the performance tests with the TCPCL.

4.1.1 TCPCL Without Custody Transfer

The first set of tests uses the Bundle Protocol over the TCP Convergence Layer over TCP/IP. We measure the throughput for varying bundle sizes and round-trip times. In this test the SBP Iperf client transmits data for 30 seconds, and when the server receives all the data the server sends a report back to the client. As explained in the previous chapter, the round-trip time between Node2 and Node3 is varied using netem. This test does not use the custody transfer option in BP, and the loss on the links is set to 0. Figure 4.1 shows the TCPCL throughput in Kbits/sec for various round-trip times (seconds).


Figure 4.1: TCPCL Throughput vs RTT

As we can see in Figure 4.1, throughput decreases with increasing round-trip time. This is because, during slow start, the TCP congestion window grows more gradually when the RTT is high: the congestion window is doubled every round trip during slow start, but each of those round trips takes longer when the RTT is higher.

Figure 4.2: Outstanding Window Graph for TCP Reno with 2 sec RTT

The tests where the RTT is greater than 0.4 seconds and less than 10 seconds use TCP Cubic instead of TCP Reno, which is used for the other tests. TCP Cubic yields better throughput for this range of RTT values since TCP Cubic is able to increase the congestion window higher than TCP Reno. Figure 4.2 and Figure 4.3 show the difference in the outstanding window when using TCP Reno and TCP Cubic on a link with a 2 second RTT. The outstanding window in the figures is depicted by the red line.

Figure 4.3: Outstanding Window Graph for TCP Cubic with 2 sec RTT

Additionally, when the RTT is greater than 9 seconds (when the SYN is retransmitted for the third time), the sender will assume that the SYN packet is lost, since it has not received the SYN/ACK from the receiver within 9 seconds. This behavior will trigger a retransmission timeout. Under normal circumstances, the slow start threshold (ssthresh) is set to the size of the advertised window. Contrary to this, when TCP detects loss due to congestion, TCP calculates the ssthresh as [Allman et al. 2009]

ssthresh = max(FlightSize / 2, 2 * Maximum Segment Size)

where FlightSize is the amount of outstanding, unacknowledged data in the network. Since no data has been sent as part of this connection, the flight size will be zero, setting ssthresh to twice the maximum segment size (MSS). The ssthresh and the size of the congestion window (cwnd) determine whether a TCP connection is in slow start or in congestion avoidance. If ssthresh is greater than cwnd, TCP will be in the slow start phase, and in slow start the congestion window doubles every RTT. The TCP connection will go into congestion avoidance when the congestion window increases to a value greater than the ssthresh. The congestion window increases by 1 segment size every RTT during the congestion avoidance phase, as mentioned earlier in Chapter 2. Allman et al. [2009] also specifies that the initial window size is set to 1 maximum segment size when either a SYN or SYN/ACK is lost. Hence, in the above mentioned case where the RTT is greater than 9 seconds, the initial window size is set to 1 MSS and the ssthresh is set to twice the MSS. This low value of ssthresh causes TCP to go into congestion avoidance after one round-trip time. This behavior of TCP is shown in the TCP time sequence graph in Figure 4.4. Therefore, a connection with a high RTT will take longer to reach its maximum congestion window size. This drop in the ssthresh value due to the retransmission timeout event during the TCP handshake could be prevented if we could configure the initial timeout depending on the RTT of the link. But the TCP version in Linux 2.6.24 does not allow us to set this value without recompiling the kernel, hence causing any connection with an RTT greater than 3 seconds to start in the congestion avoidance phase rather than slow start.
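
To make the effect concrete, the sketch below applies the two RFC 5681 rules described above (ssthresh = max(FlightSize/2, 2*SMSS); exponential growth below ssthresh, linear growth above it) and counts how many round trips a connection that starts with ssthresh = 2*MSS needs to reach a 64 KB window. The 1460-byte MSS and the 64 KB target are assumptions made only for this illustration.

```python
MSS = 1460  # bytes; assumed segment size for illustration

def ssthresh_after_loss(flight_size):
    # RFC 5681: ssthresh = max(FlightSize / 2, 2 * SMSS)
    return max(flight_size // 2, 2 * MSS)

def rtts_to_reach(target_cwnd, cwnd=MSS, ssthresh=2 * MSS):
    """Count round trips needed for cwnd to reach 'target_cwnd' when the
    connection effectively starts in congestion avoidance (ssthresh = 2*MSS),
    as happens after the SYN retransmission timeout described above."""
    rtts = 0
    while cwnd < target_cwnd:
        if cwnd < ssthresh:
            cwnd *= 2            # slow start: double every RTT
        else:
            cwnd += MSS          # congestion avoidance: one MSS per RTT
        rtts += 1
    return rtts

# With a 64 KB window this takes on the order of 40 round trips; at a
# 20 second RTT that is roughly 15 minutes of ramp-up.
print(rtts_to_reach(64 * 1024))
```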


Figure 4.4: Time Sequence Graph of TCP Link with 20sec RTT

We also notice that the throughput varies with bundle size. When the bundle size is 200 bytes, the throughput is approximately 2500 Kbits/sec, and the throughput is 40 times higher when the bundle size is 15000 bytes. This is because ION makes approximately 64 semop calls for every bundle that is sent (semop is a system call that performs semaphore operations; semaphores are used for inter-process communication to regulate access to a common resource by multiple processes). We use a simple program that performs a million semop operations and calculate the time it takes to run. By running this program on one of the testbeds, we calculated that the time taken to perform one semop function call is 1.07 * 10^-6 seconds (sec).


1 Bundle => 64 semop
1 Semop => 1.07 * 10^-6 sec
Semop overhead / bundle => 64 * 1.07 * 10^-6 sec
Semop overhead / bundle => 68.87 * 10^-6 sec

Hence there is a semop overhead per bundle that is sent. Other than the semop overhead, there is also a semaphore contention overhead within ION. Semaphores are used within ION for inter-process communication. The ION application will add the bundle to the forward queue and, depending on the addressing scheme, it will release the scheme-specific semaphore. The scheme-specific forwarder will wait on the scheme-specific semaphore. The forwarder, on grabbing this semaphore, will enqueue the bundle into the outduct queue and release the outduct semaphore. The outduct will wait on the outduct semaphore, and on receiving the semaphore, the outduct process will dequeue the bundle and send it out. Additionally, ION also uses semaphores when it makes changes to shared memory. The combination of all these semaphores will cause a contention overhead. We use a program called SystemTap, which can be used to get information about the Linux system. We tap ipc_lock() to record the amount of time ION waits on this function to get an approximate contention overhead time. Refer to Appendix B.2 for the SystemTap script.

Semaphore contention overhead / bundle => 149.38 * 10^-6 sec

Adding the semop overhead and the contention overhead for each bundle, we get

Total overhead / bundle => 218.25 * 10^-6 sec
No of bundles transmitted per second => 1 / (218.25 * 10^-6 sec)
No of bundles transmitted per second => 4582 bundles

Hence for a bundle of size 200 bytes, the expected maximum throughput is

Throughput of 200 byte bundle => 4582 * 200 bytes/sec
Throughput of 200 byte bundle => 6.99 Mbps

The client reports the transmission rate at which the data left the client. The server, on receiving all the data, sends a server report back to the client reporting the rate at which it received the data. The client node reports a transmission rate of approximately 6.3 Mbps for 200 byte bundles, which reduces to 3 Mbps in the server report. Some factor of semop and contention overhead needs to be factored into the throughput over the 3 hops from client to server; that explains the difference in throughput between the client report and the server report. Since the number of bundles that can be transmitted is limited by semop and contention overhead, throughput will increase with increasing bundle size.

The next set of tests measures TCPCL throughput for different percentages of packet loss without custody transfer. In these tests, X% packet loss is defined as: X packets out of 100 are dropped from Node2 to Node3, and X packets out of 100 are dropped from Node3 to Node2. Like the previous set of tests, the SBP Iperf client in this case also sends data for 30 seconds and waits for the SBP Iperf server to respond back with the report. Figure 4.5 shows the throughput in Kbits/sec of TCPCL for different losses, along with the maximum theoretical throughput. We use the Mathis equation [Mathis et al. 1997] to model the maximum theoretical throughput for a TCP connection with packet loss. The Mathis equation is given by


Figure 4.5: TCPCL Throughput vs Loss

Bandwidth = (Maximum Segment Size * C) / (RTT * sqrt(p))        (4.1)

where C is the Mathis constant and p is the loss probability.
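
Equation 4.1 can be evaluated directly to reproduce the theoretical curves. The sketch below assumes a value of about 1.22 for the Mathis constant C, a commonly quoted approximation; that constant is an assumption of this sketch, not a value given in the thesis.

```python
from math import sqrt

def mathis_throughput_bps(mss_bytes, rtt_s, loss_rate, c=1.22):
    """Equation 4.1: bandwidth ~= MSS * C / (RTT * sqrt(p)), converted
    here from bytes per second to bits per second."""
    return 8.0 * mss_bytes * c / (rtt_s * sqrt(loss_rate))

# Example: 1460-byte segments, 100 ms RTT, 2% loss.
print(mathis_throughput_bps(1460, 0.1, 0.02) / 1e3, "kbit/s")
```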

As you can see in Figure 4.5, the theoretical throughput and the actual throughput are almost the same for 2% loss, and the lines start to diverge from each other after 3% loss. As the loss increases, the chances of retransmission timeouts also increase. Unfortunately, the Mathis equation does not consider retransmission timeouts in the model. That is why the maximum theoretical throughput diverges from the actual throughput.

Figure 4.6: TCPCL (w/o custody) Throughput vs RTT & Loss for 200 Byte Bundles

TCP goes into the congestion avoidance phase when there is packet loss. When the TCP sender gets 3 duplicate acknowledgements from the receiver, the TCP sender reduces the congestion window to half the previous size and retransmits the next unacknowledged segment. However, in cases when either one of the 3 duplicate acknowledgements is lost, or the retransmission gets lost, a timeout event occurs. When a timeout event occurs, TCP reduces the congestion window to 1 maximum segment size and goes back into slow start. For smaller loss percentages, losing one of the triple duplicate acknowledgements or losing a retransmission is rare. The occasional packet loss will halve the congestion window from time to time, but since the chances of timeout events are small, the congestion window size will gradually increase with time. On the other hand, a higher loss percentage increases the chance of timeout events due to loss of one of the triple duplicate acknowledgements or loss of the retransmitted packet, causing the connection to reduce its congestion window to 1 MSS from time to time. This explains the drop in throughput after 3% packet loss.

Figure 4.7: TCPCL (w/o custody) Throughput vs RTT & Loss for 5000 Byte Bundles

The last set of tests for TCPCL without custody transfer measures the throughput for combinations of loss and delay. The throughput is measured for 0, .05, 2 and 5% loss with delays ranging from 0 to 5 seconds. Unlike the previous 2 sets of tests, in this set we only transfer 25MB of data from the SBP Iperf client to the SBP Iperf server. This helps to reduce the run time of some of the tests; since the throughput of some of the tests is really low, the run time of some of the extreme test cases can be very long. Figure 4.6, Figure 4.7 and Figure 4.8 present the throughput of TCPCL without custody transfer for a combination of RTTs and losses for 200 byte, 5000 byte and 15000 byte bundles respectively. The combination of loss and delay has a very significant impact on throughput when compared to the impact on throughput due solely to delay or loss. The above figures also depict the maximum theoretical throughput obtained using Equation 4.1. The observed throughput values are very close to the maximum theoretical throughput according to the Mathis equation [Mathis et al. 1997].

Figure 4.8: TCPCL (w/o custody) Throughput vs RTT & Loss for 15000 Byte Bundles

4.1.2 TCPCL With Custody Transfer

This section presents the results from using the TCP Convergence Layer with custody transfer enabled in the BP layer. The same set of tests performed in Section 4.1.1 is repeated with custody transfer. We also compare the throughput of TCPCL with and without custody transfer. In all the TCPCL with custody transfer test cases, the SBP Iperf client sends data for 10 seconds. The reason behind reducing the amount of time the SBP Iperf client spends sending data when custody transfer is enabled is that the ION nodes have to store the bundles for a longer time, since the nodes have to wait for a custody signal from a neighboring node before destroying a bundle. Storing huge numbers of bundles can cause the ION node to run out of memory; hence, to ensure the tests do not fail we have to reduce the amount of data the client sends. In this test scenario, the SBP Iperf client sends approximately 52000 bundles in 10 seconds when the bundle size is 200 bytes.

The behavior of ION when custody transfer is enabled is explained here. When a bundle requesting custody transfer is received by a node, a custody signal will be sent to the previous custodian of the bundle. If the receiving node decides to accept custody of the bundle, the custodian is notified by a custody accepted signal. The custodian will then delete the bundle from its persistent storage. But if the receiver discards the bundle for reasons like a redundant bundle, no memory, an error block, etc., the receiver will send a custody error signal with a reason code to the custodian. The custodian will take appropriate action depending on the error code. If the receiver discards the bundle because it does not have memory, the custodian will reforward the bundle to another node which can transmit the bundle to the desired destination. We also explore the behavior of ION when a custody signal is lost, since these nodes operate on links with losses. Custody signals will not be lost when using TCP as the transport layer, but there is a possibility of losing the custody signal when running LTP over UDP.

Custody acceptance signal - When the custody acceptance signal is lost, the receiver (custodian2) will accept custody for the bundle, but the previous custodian (custodian1) will not release custody for the bundle. Technically, there will be two custodians for the bundle in the network. Custodian2 will continue to forward the bundle to its destination. Custodian1 will keep the bundle in persistent storage until its TTL expires. On expiration of the TTL, custodian1 will delete the bundle and send a status report to the source.

Custody error signal - If the custody error signal from the receiver is lost, the custodian retains the bundle in persistent storage until the TTL expires. When the TTL expires, the custodian will delete the bundle and send a status report to the source.
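The custody-signal handling described above can be summarized with a small sketch. This is only an illustrative model of the behavior explained in this section, not ION source code; the enum and function names are invented for the example.

#include <cstdio>

// Illustrative model of custody-signal handling at the previous custodian
// (not ION code; all names are invented for this sketch).
enum class CustodySignal { Accepted, RefusedNoMemory, RefusedOther, Lost };

// Returns true if the previous custodian may delete its stored copy now;
// false means the bundle stays in persistent storage until it is
// re-forwarded, a later signal arrives, or its TTL expires.
bool previousCustodianMayRelease(CustodySignal sig) {
    switch (sig) {
    case CustodySignal::Accepted:
        return true;                  // next hop holds custody; free our copy
    case CustodySignal::RefusedNoMemory:
        return false;                 // re-forward the bundle via another node
    case CustodySignal::RefusedOther:
        return false;                 // act on the reason code; keep the bundle
    case CustodySignal::Lost:
        return false;                 // keep the bundle until its TTL expires,
                                      // then delete it and report to the source
    }
    return false;
}

int main() {
    std::printf("release after accept: %s\n",
                previousCustodianMayRelease(CustodySignal::Accepted) ? "yes" : "no");
    std::printf("release after lost signal: %s\n",
                previousCustodianMayRelease(CustodySignal::Lost) ? "yes" : "no");
    return 0;
}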

The TCPCL performance for different RTTs is shown in Figure 4.9. Figure 4.10 compares the performance of TCPCL with and without custody transfer for varying RTT. The throughput of TCPCL with custody is lower than the throughput of TCPCL without custody. When custody transfer is enabled, a bundle is not destroyed until the node receives a custody signal from a node that has accepted custody for the bundle. On receiving a custody signal, the node must find the bundle corresponding to the custody signal and delete it from ION memory. This additional processing time reduces the number of bundles ION can transmit, hence affecting the performance of TCPCL with custody. The difference in throughput is more pronounced for smaller bundles because more bundles have to be transmitted when the bundle size is smaller. As mentioned earlier, when the bundle size is 200 bytes, the SBP Iperf client sends approximately 52000 bundles in 10 seconds. On the other hand, the SBP Iperf client sends approximately 8000 bundles in 10 seconds for a bundle size of 15000 bytes. Hence the number of custody signals is smaller when the bundle size is 15000 bytes as compared to a bundle size of 200 bytes. This

Figure 4.9: TCPCL(w/ custody) Throughput vs RTT
(Throughput in Kbits/sec vs RTT in seconds for 200, 5000, 15000, and 40000 byte bundles.)

would imply that ION spends less time processing custody signals when the bundle size is 15000 bytes. The performance of TCPCL with custody for different loss rates is presented in Figure 4.11. The maximum theoretical throughput from the Mathis equation [Mathis et al. 1997] is included in this figure. The performance comparison between TCPCL with custody and without custody for different loss rates is illustrated in Figure 4.12.

Figure 4.10: Comparing TCPCL(w/ custody) and TCPCL(w/o custody) Throughput vs RTT
(Throughput in Kbits/sec vs RTT in seconds for 200 and 15000 byte bundles, with and without custody.)

The performance of TCPCL with custody at higher loss rates is similar to that of TCPCL without custody. A combination of a small congestion window and retransmission timeouts throttles the number of TCP segments that can be transmitted. Hence the additional time the ION nodes spend processing custody signals does not have an impact on the throughput: even with the custody signal processing overhead, ION can release more bundles for transmission than TCP can actually send at higher loss rates.

Figure 4.11: TCPCL(w/ custody) Throughput vs Loss
(Throughput in Kbits/sec vs loss percentage for 200, 5000, 15000, and 40000 byte bundles, with the theoretical maximum.)

The last set of tests measures the performance of TCPCL with custody for combinations of different RTTs and loss rates. The throughput measured for 200, 5000, 15000 and 40000 byte bundles is shown in Figure 4.13, Figure 4.14, Figure 4.15 and Figure 4.16 respectively.

The performance of TCPCL with custody transfer for a combination of delay and loss is similar to the performance of TCPCL without custody transfer. Similar to TCPCL

Figure 4.12: Comparing TCPCL(w/ custody) and TCPCL(w/o custody) Throughput vs Loss
(Throughput in Kbits/sec vs loss percentage for 200 and 15000 byte bundles, with and without custody, and the theoretical maximum.)

without custody, the throughput curve for TCPCL with custody closely matches the maximum theoretical throughput curve given by the Mathis equation.

4.2 LTP Convergence Layer Tests

We discuss the results of performance tests using the LTP convergence layer over UDP/IP in this section. The first subsection presents the results for LTPCL without custody transfer, followed by the results of performance tests of LTPCL with custody.

Figure 4.13: TCPCL(w/ custody) Throughput vs RTT & Loss for 200 Byte Bundles
(Throughput in Kbits/sec vs RTT in seconds for 0%, .05%, 2%, and 5% loss, with the corresponding theoretical maximum curves.)

4.2.1 LTPCL Without custody transfer

The first set of tests measures the throughput of LTPCL without custody transfer for different delays between Node2 and Node3. The SBP Iperf client sends 25MB of data to the SBP Iperf server. Refer to Section A.2 for the LTPCL configuration files used in ION for the tests. The performance of LTPCL without custody transfer for various RTTs is shown in Figure 4.17. The LTP configuration file uses a command called span to control the transmission rate. The main parameters in span that regulate transmission are the aggregation size, the LTP segment size, and the number of export sessions. The span parameters between Node1-Node2 and Node3-Node4 always remain the same because delay and loss are

Figure 4.14: TCPCL(w/ custody) Throughput vs RTT & Loss for 5000 Byte Bundles
(Throughput in Kbits/sec vs RTT in seconds for 0%, .05%, 2%, and 5% loss, with the corresponding theoretical maximum curves.)

applied between Node2 and Node3. This means only the Node2-Node3 span has to be tweaked for different delays and losses. The aggregation size is the amount of data that ION will aggregate into an LTP block before using an LTP session to transmit the data. As soon as ION accumulates data equal to or greater than the aggregation size, ION breaks the LTP block into LTP segments and hands them down to the UDP layer for transmission. However, if the internal clock within ION triggers before the aggregation size is reached, ION segments the LTP block and sends it anyway. LTP divides the LTP block into LTP segments of the size specified by the LTP segment size parameter within the span. Export session configuration, as explained earlier, is a method of doing flow control in LTP.
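To make the interaction of these parameters concrete, the following sketch models the aggregation and segmentation step using the values from the undelayed spans in Appendix A.2 (aggregation size 10000 bytes, LTP segment size 5600 bytes). It is a simplified illustration, not ION's implementation, and the function names are invented.

#include <cstdio>
#include <vector>

// Simplified model of LTP block aggregation and segmentation (not ION code).
// Parameter values follow the undelayed-link spans in Appendix A.2.
constexpr int kAggregationSize = 10000;   // bytes per LTP block before release
constexpr int kSegmentSize     = 5600;    // bytes per LTP segment

// Number of LTP segments needed for one block (ceiling division).
int segmentsForBlock(int blockBytes) {
    return (blockBytes + kSegmentSize - 1) / kSegmentSize;
}

int main() {
    std::vector<int> bundleSizes = {5000, 5000, 5000};  // bundles queued by BP
    int accumulated = 0;
    for (int b : bundleSizes) {
        accumulated += b;
        if (accumulated >= kAggregationSize) {          // aggregation threshold reached
            std::printf("block of %d bytes -> %d LTP segments handed to UDP\n",
                        accumulated, segmentsForBlock(accumulated));
            accumulated = 0;                            // a new block starts
        }
    }
    if (accumulated > 0) {
        // In ION the aggregation timer would eventually flush a partial block.
        std::printf("timer flush: block of %d bytes -> %d LTP segments\n",
                    accumulated, segmentsForBlock(accumulated));
    }
    return 0;
}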

Figure 4.15: TCPCL(w/ custody) Throughput vs RTT & Loss for 15000 Byte Bundles
(Throughput in Kbits/sec vs RTT in seconds for 0%, .05%, 2%, and 5% loss, with the corresponding theoretical maximum curves.)

A link with no artificial delay uses a smaller number of export sessions (2 in this case) and a larger LTP segment size (5600 bytes). Increasing the number of export sessions increases the amount of data transmitted, but it also increases the chance of packets being dropped because of buffer overflow. The intermediate nodes do twice as much work as the end nodes, so increasing the number of export sessions would send more bundles to the intermediate node than it can process. The resulting packet loss would cause a drop in throughput because the lost packets would have to be retransmitted. Additionally, using a larger LTP segment size reduces the time ION spends segmenting an LTP block; in turn, the IP layer fragments the larger LTP segment. The advantage of this approach is that IP fragmentation takes less time than LTP segmentation, and hence throughput will be higher. On the other hand, losing one IP fragment of the larger LTP segment means LTP will

Figure 4.16: TCPCL(w/ custody) Throughput vs RTT & Loss for 40000 Byte Bundles
(Throughput in Kbits/sec vs RTT in seconds for 0%, .05%, 2%, and 5% loss, with the corresponding theoretical maximum curves.)

have to resend the whole LTP segment. Since a link with no delay uses a smaller number of export sessions, we can use IP fragmentation of large LTP segments to our advantage, because losses will be minimal.
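To put this tradeoff in rough numbers (an illustrative calculation assuming a standard 1500-byte Ethernet MTU, not a figure from the tests): a 5600-byte LTP segment is carried in about four IP fragments, and losing any one fragment discards the whole segment, so a per-packet loss rate p becomes an effective segment loss rate of roughly 1 - (1 - p)^4. At 2% packet loss nearly 8% of such segments would have to be retransmitted, whereas on a loss-free link the reduced segmentation work is pure gain.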

Table 4.1: Number of sessions used for various RTTs

RTT (sec)          0     .2    1     >1
No. of Sessions    2     50    200   400

Figure 4.17: LTPCL Throughput vs RTT
(Throughput in Kbits/sec vs RTT in seconds for 5000, 15000, and 40000 byte bundles.)

For a link with delay, we use a larger number of export sessions and a smaller LTP segment size (1400 bytes) on the nodes connected to the delayed link. Table 4.1 shows the number of export sessions that we configure LTP to use for various RTTs. The extra export sessions guarantee that there is a constant dialogue between the sender and the receiver. Consider a case where the number of export sessions is 10 and the one-way delay between the nodes is 5 seconds. If ION can fill 10 sessions' worth of data at a time, then once we send 10 sessions of data, ION must wait roughly 10 seconds to receive the corresponding reports from the receiver before it can clear out those sessions, and only after a session is cleared can it be re-used to send data. However, if we have 400 export sessions, we can use the other 390 sessions to transmit data while waiting for the report segments from the first 10 sessions.
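In other words, the number of sessions needed grows roughly with the product of the RTT and the rate at which ION can fill sessions: if the sender fills S sessions per second and the corresponding reports take one RTT to return, about S x RTT sessions must be available to keep the link busy. This is only a back-of-the-envelope sizing argument, not a formula taken from ION, and the values in Table 4.1 are also bounded by the memory constraints discussed below.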

Figure 4.18: Comparing LTPCL and TCPCL Throughput vs RTT
(Throughput in Kbits/sec vs RTT in seconds for 5000 and 15000 byte bundles with LTPCL and with TCPCL.)

Figure 4.18 shows the comparison in performance of LTPCL and TCPCL on a link with delay. TCP is optimized to perform well on links with minimal delay, but on links with higher delays, LTP can be fine-tuned to obtain higher performance than TCP. Unlike TCP, LTP is designed to use the available bandwidth at all times. TCP spends a considerable amount of time in slow start/congestion avoidance on high delay links, but LTP is always transmitting at the constant configured transmission rate. As mentioned earlier, the transmission rate is controlled by the number of export sessions, the aggregation size, and the LTP segment size. The heap memory reserved for LTP depends on the number of sessions.

LTP space reservation = (max export bundle size × no. of export sessions)
                      + (max import bundle size × no. of import sessions)
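As a worked example of this formula (using the 400-session span values from Appendix A.2 purely for illustration), a span configured with 400 export sessions and 400 import sessions, each with a maximum bundle size of 100000 bytes, calls for roughly 400 x 100000 + 400 x 100000 = 80,000,000 bytes of heap to be reserved for LTP on that span alone.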

If ION is configured such that the heap memory allocated for LTP is less than the actual LTP space reservation, then ION will store the extra LTP blocks in files instead of on the heap. However, writing to and reading from a file is more expensive than writing to and reading from the heap, so reserving the right amount of space for LTP blocks is important for the performance of LTP. The amount of heap reserved for LTP also must not exceed the total amount of heap memory allocated to ION; hence the number of sessions that can be configured for LTP indirectly depends on the memory. Therefore, the LTP test results are constrained by the amount of memory allocated to ION and also by the OS buffer size for UDP connections. The maximum number of sessions that we use in our tests is 400. Theoretically we could open more sessions on links with high delay to get better throughput, but practically we are constrained by the memory. The LTPCL throughput for various amounts of packet loss is shown in Figure 4.19. When an LTP segment is lost, the LTP sender and receiver keep the session for the block open until the LTP segment is retransmitted. The drop in throughput is mainly because of the retransmission of lost segments and underutilization of the session. A session which has a lost segment must complete retransmission of the LTP segment before the session can be re-used to send other LTP blocks. Not only does this prevent other LTP blocks from using this session, but it also underutilizes the session during retransmission. Take for example an LTP block of size 40000 bytes and an LTP segment size of 1500 bytes. This requires a session to send approximately 27 segments. If 2 segments in this session are lost, the retransmission will send only those 2 segments.

Figure 4.19: LTPCL Throughput vs Loss
(Throughput in Kbits/sec vs loss percentage for 5000, 15000, and 40000 byte bundles.)

Thus a session which would normally send 40000 bytes sends only 3000 bytes during the retransmission phase. The situation worsens with increasing loss because more sessions have to retransmit lost segments; hence the performance degrades with loss. The behavior of ION on losing different types of segments is explained below.

Data segment - When a data segment is lost, the loss is reported by the receiver when it sends a report segment. The sender, on receiving the report, acknowledges the report segment and retransmits the lost segment.

Checkpoint segment - After sending a checkpoint segment, the sender will start a checkpoint retransmission timer. If the report segment doesn't arrive before the timer expires, it will resend the checkpoint segment. Consequently, on losing a

checkpoint segment, the retransmission timer will expire, forcing the sender to retransmit the checkpoint segment.

Report Segment - Loss of a report segment will cause the checkpoint retransmission timer on the sender side and the report segment retransmission timer on the receiver side to expire. This results in both a checkpoint retransmission and a report retransmission.

Report Acknowledgement - The receiver sends a report segment when it receives a checkpoint segment from the sender, and the sender responds to the report segment with a report acknowledgement. If the receiver reports that all the segments were received successfully, the sender will send the report acknowledgement and close the export session. However, if the report acknowledgement is lost, the report retransmission timer on the receiver side will expire, and the receiver will retransmit the report segment. Since the sender has closed the session, it will have no record of the session the report segment refers to, so it will discard the report segment. The receiver will keep retransmitting the report segment until the number of retransmissions reaches a preset retransmission maximum, at which point the receiver will cancel the session. However, the sender will still have no record of the session that the receiver is trying to cancel, so it will discard the cancel segment, and the receiver will retransmit the cancel segment until it, too, reaches the retransmission maximum.

Figure 4.20 gives the performance comparison of LTPCL and TCPCL versus loss. As mentioned earlier, LTP underutilizes sessions during retransmission of lost LTP segments: the session transmits less data than it would send during regular transmission. TCP, on losing a packet, will halve its congestion window, hence transmitting half as much as it was previously transmitting. The congestion window size of the TCP sender

Figure 4.20: Comparing LTPCL and TCPCL Throughput vs Loss
(Throughput in Kbits/sec vs loss percentage for 5000 and 15000 byte bundles with LTPCL and with TCPCL.)

is then increased by one segment each round-trip time until it reaches the advertised receive window size or there is another packet loss. However, when the loss rate is high, the congestion window remains constantly low, and hence the throughput degrades steadily with loss. On the other hand, LTP's behavior remains approximately the same for low loss and high loss. Therefore, the performance of LTPCL is better than that of TCPCL at higher losses.
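To put this recovery cost in perspective with purely illustrative numbers (not values measured in these tests): if a single loss halves a congestion window of 100 segments, the sender needs about 50 round trips of additive increase to regain the previous window, which at a 10-second RTT is more than eight minutes below the pre-loss rate for one loss event. An LTP session, in contrast, only re-sends the specific segments reported missing.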

The performance of LTPCL without custody for combinations of delay and loss is shown in Figure 4.21 and Figure 4.22. The performance of LTPCL on a link with both delay and loss is similar to the performance of LTPCL with only loss. The number of export sessions that can be used to transmit LTP blocks helps keep a constant dialogue between the sender

Figure 4.21: LTPCL Throughput vs RTT & Loss for 5000 Byte Bundles
(Throughput in Kbits/sec vs RTT in seconds for 0%, .05%, 2%, and 5% loss.)

and the receiver even on links with some delay. Hence the performance of LTP on links with an RTT of less than 5 seconds is determined primarily by the loss on the link.

4.2.2 LTPCL With custody transfer

This section presents the results of LTPCL with custody transfer on a link with different delays and losses. Figure 4.23 displays the throughput of LTPCL with custody for various RTTs and compares the throughput of LTP with and without custody. As explained in Section 4.1.2, when custody transfer is enabled, the nodes need to spend extra time handling custody signals. When custody transfer is not enabled, the intermediate nodes in the setup have to handle incoming bundles, route bundles, and transmit bundles. When custody transfer is enabled, these nodes additionally need to send

Figure 4.22: LTPCL Throughput vs RTT & Loss for 15000 Byte Bundles
(Throughput in Kbits/sec vs RTT in seconds for 0%, .05%, 2%, and 5% loss.)

custody signals to the previous custodian and handle incoming custody signals from the next custodian. This additional processing reduces the transmission rate for LTPCL with custody transfer. As Figure 4.24 illustrates, the performance of LTPCL with custody is better than that of LTPCL without custody when losses are high. A link with lower loss behaves similarly to TCPCL, where the throughput of LTPCL without custody is higher than that of LTPCL with custody. The difference in throughput on low loss links can be attributed to the custody signal processing overhead. In contrast, links with high loss have a high probability of losing custody signals. Custody signals are transferred between administrative endpoints on the nodes and do not use the sessions that are used for transmitting data. Losing a data segment or a report segment requires the session to remain open until the segment is retransmitted.

Figure 4.23: Comparing LTPCL(w/o custody) and LTPCL(w/ custody) Throughput vs Delay
(Throughput in Kbits/sec vs RTT in seconds for 5000 and 15000 byte bundles, with and without custody.)

The session therefore needs to stay open for a longer time, which prevents it from being reused for other bundles. In contrast, losing a custody signal does not affect the session behavior and consequently does not affect the transmission rate. Losing custody signals has bigger consequences in the long run, as mentioned earlier, but in this test scenario it results in better measured performance.

Figure 4.24: Comparing LTPCL(w/o custody) and LTPCL(w/ custody) Throughput vs Loss
(Throughput in Kbits/sec vs loss percentage for 5000 and 15000 byte bundles, with and without custody.)

Figure 4.25 and Figure 4.26 illustrate the throughput of LTPCL with custody over links with both delay and loss.

Figure 4.25: LTPCL(w/ custody) Throughput vs RTT & Loss for 5000 Byte Bundles
(Throughput in Kbits/sec vs RTT in seconds for 0%, .05%, 2%, and 5% loss.)

Figure 4.26: LTPCL(w/ custody) Throughput vs RTT & Loss for 15000 Byte Bundles
(Throughput in Kbits/sec vs RTT in seconds for 0%, .05%, 2%, and 5% loss.)

5 Conclusions

This thesis evaluates the maximum performance that can be expected from TCPCL and LTPCL in extreme network conditions. The throughput data establishes a reference point for the performance of LTPCL and TCPCL in those test environments. The constraints on throughput imposed by some external factors are also explained in the document. Our study also focused on setting up TCPCL and LTPCL optimally, to elicit maximum performance while at the same time avoiding configuration and code bugs. As we would expect, TCP performs well under conditions with minimal delay and loss; TCP has been optimized to perform extremely well in such conditions. TCP automatically controls its transmission rate under extreme conditions, which helps to prevent further losses and delays. Although this works well in environments with end-to-end connectivity, it is not the best approach when bandwidth and connectivity are deficient. LTPCL, on the other hand, performs better than TCPCL in conditions with high delay and high loss. LTP, by design, takes advantage of the available bandwidth to transmit as much data as possible while the connection is open. In the case of the ION version of LTP, its performance is highly dependent on the shared memory that can be allocated to ION on the system. This is because the number of sessions depends on the amount of memory available, which in turn determines LTP's transmission rate. Hence, on a system that can allocate more memory to ION, we can expect the performance of LTP to be higher than what is shown in these tests. TCP dynamically adjusts its transmission rate depending on the operating environment; LTP, in contrast, has to be pre-configured for the environment. Since there are so many parameters that have to be configured to fine-tune LTP's performance, it would be useful to abstract away some of the parameters in the

code, so it would be easier to set up LTP in ION. More resources and tools to aid in setting up LTP would also be valuable. The performance of these convergence layers on a link with disconnection was not studied as a part of this thesis. Besides delay and loss, disconnection is another important factor in Delay Tolerant Networks, so it is important to map the performance of these convergence layers on a link with intermittent connectivity. Profiling the other commonly used convergence layers would also give us a better idea of which convergence layers best suit a specific setting. Finally, a formal study of how the various LTP parameters - maximum bundle size, number of sessions, aggregation size, aggregation time, and LTP segment size - affect the performance of LTP would be an interesting experiment for future work.

References

Akan, O. B., Fang, J., and Akyildiz, I. F. 2002. Performance of TCP Protocols in Deep Space Communication Networks. IEEE Communications Letters 6, 11, 478-480.

Akyildiz, I. F., Morabito, G., and Palazzo, S. 2001. TCP-Peach: A New Congestion Control Scheme for Satellite IP Networks. IEEE/ACM Transactions on Networking 9, 307-321.

Allman, M., Paxson, V., and Blanton, E. 2009. TCP Congestion Control. RFC 5681 (Draft Standard).

Burleigh, S., Hooke, A., Torgerson, L., Fall, K., Cerf, V., Durst, B., Scott, K., and Weiss, H. 2003. Delay-Tolerant Networking: An Approach to Interplanetary Internet. IEEE Communications Magazine 41, 6, 128-136.

Burleigh, S., Ramadas, M., and Farrell, S. 2008. Licklider Transmission Protocol - Motivation. RFC 5325 (Informational).

CCSDS. 2006. Proximity-1 Space Link Protocol - Data Link Layer. Technical Report Blue Book, CCSDS 211.0-B-4, The Consultative Committee for Space Data Systems, Washington, DC. July.

Demmer, M. and Ott, J. 2008. Delay Tolerant Networking TCP Convergence Layer Protocol. draft-irtf-dtnrg-tcp-clayer-02.txt.

Doering, M., Lahde, S., Morgenroth, J., and Wolf, L. 2008. IBR-DTN: An Efficient Implementation for Embedded Systems. In Proceedings of the Third ACM Workshop on Challenged Networks, CHANTS '08. ACM, New York, NY, USA, 117-120.

DTNRG. 2010. DTN reference implementation.

Durst, R. C., Miller, G. J., and Travis, E. J. 1996. TCP Extensions for Space Communications. In Proceedings of MOBICOM '96.

Eddy, W. and Davies, E. 2011. Using Self-Delimiting Numeric Values in Protocols. draft-irtf-dtnrg-sdnv-09.

Fall, K. 2003. A Delay-Tolerant Network Architecture for Challenged Internets. In SIGCOMM '03: Proceedings of the 2003 Conference on Applications, Technologies, Architectures, and Protocols for Computer Communications. ACM, 27-34.

Farrell, S. and Cahill, V. 2006. Delay- and Disruption-Tolerant Networking, First ed. Artech House.

Hemminger, S. 2005. Network Emulation with NetEm. In LCA 2005, Australia's 6th National Linux Conference (linux.conf.au), M. Pool, Ed. Linux Australia, Sydney, NSW, Australia.

Henderson, T. R. and Katz, R. H. 1999. Transport Protocols for Internet-Compatible Satellite Networks. IEEE Journal on Selected Areas in Communications 17, 326-344.

Juang, P., Oki, H., Wang, Y., Martonosi, M., Peh, L. S., and Rubenstein, D. 2002. Energy-Efficient Computing for Wildlife Tracking: Design Tradeoffs and Early Experiences with ZebraNet. In Proceedings of the 10th International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS-X. ACM, New York, NY, USA, 96-107.

Kohler, E., Handley, M., and Floyd, S. 2006. Datagram Congestion Control Protocol (DCCP). RFC 4340 (Proposed Standard). Updated by RFCs 5595, 5596.

Kruse, H. and Ostermann, S. 2008. UDP Convergence Layers for the DTN Bundle and LTP Protocols. draft-irtf-dtnrg-udp-clayer-00 (work in progress).

Marchese, M., Rossi, M., and Morabito, G. 2004. PETRA: Performance Enhancing Transport Architecture for Satellite Communications. IEEE Journal on Selected Areas in Communications 22, 2, 320-332.

Mathis, M., Semke, J., Mahdavi, J., and Ott, T. 1997. The Macroscopic Behavior of the TCP Congestion Avoidance Algorithm. Computer Communication Review 27.

Ramadas, M. 2007. Ph.D. thesis, Ohio University, Department of Electrical Engineering and Computer Science.

Scott, K. and Burleigh, S. 2007. Bundle Protocol Specification. RFC 5050 (Experimental).

Wang, R., Wu, X., Wang, T., Liu, X., and Zhou, L. 2010. TCP Convergence Layer-Based Operation of DTN for Long-Delay Cislunar Communications. IEEE Systems Journal 4, 3, 385-395.

Wood, L., Eddy, W., and Holliday, P. 2009. A Bundle of Problems. In Proceedings of the IEEE Aerospace Conference. IEEE, Big Sky, Montana.

Wood, L., Ivancic, W., Eddy, W. M., Stewart, D., Northam, J., Jackson, C., and Curiel, A. D. S. 2008. Use of the Delay-Tolerant Networking Bundle Protocol from Space. In Proceedings of the 59th International Astronautical Congress.

Appendix A: ION Configuration

A.1 TCPCL Configuration

A.1.1 Node1 configuration file

## begin ionadmin
# ionrc configuration file for 4node-tcp-test for Node1
#
# This uses tcp as the primary convergence layer.
# This command should be run FIRST.

# Initialization command (command 1). # # # Set this node to be node 1 (as in ipn:1). Use sdr configuration in "/home/testsuite/scripts/host1/ion.conf"

1 1 /home/testsuite/scripts/host1/ion.conf

# start ion node s ## end ionadmin

## begin ionsecadmin 1 e 1 ## end ionsecadmin

## begin bpadmin

71 # bprc configuration file for the 4node-tcp-test # # # This command should be run AFTER ionadmin and ltpadmin and BEFORE ipnadmin or dtnadmin.

# Initialization command (command 1). 1

# Add an EID scheme. # # # # # The schemes name is ipn. This schemes forwarding engine is handled by the program ipnfw. This schemes administration program (acting as the custodian daemon) is ipnadminep.

a scheme ipn ipnfw ipnadminep # Add endpoints. # # # # # # # Establish endpoints ipn:1.1 and ipn:1.2 on the local node. The behavior for receiving a bundle when there is no application currently accepting bundles, is to queue them q, as opposed to immediately and silently discarding them (use x instead of q to discard). Note that the custodian endpoint "ipn:1.0" is automatically generated.

a endpoint ipn:1.1 q a endpoint ipn:1.2 q

72 # Add a protocol. # # # # Add the protocol named tcp. Estimate transmission capacity assuming 1400 bytes of each frame (in this case, tcp on ethernet) for payload, and 100 bytes for overhead.

a protocol tcp 1400 100

# Add an induct. (listen) # # # # # Add an induct to accept bundles using the tcp protocol. The induct will listen at the loopback IP address. The induct will listen on port 4556, the IANA assigned default DTN TCP convergence layer port. The induct itself is implemented by the tcpcli command.

a induct tcp testbed1.cs.ohio.edu:4556 tcpcli

# Add an outduct. # # Add an outduct to send bundles using the tcp protocol. The outduct itself is implemented by the tcpclo command.

a outduct tcp testbed1.cs.ohio.edu:4556 tcpclo a outduct tcp testbed6.cs.ohio.edu:4556 tcpclo

s ## end bpadmin

## begin ipnadmin # ipnrc configuration file for the 4node-tcp-test

73 # # # Essentially, this is the IPN schemes routing table. This command should be run AFTER bpadmin (likely to be run last).

# Add an egress plan. # # # # # # # Bundles to be transmitted to element number 1 (that is, yourself). Transmission should use the tcp convergence layer tcp/testbed1.cs.ohio.edu:4556. All other transmission uses the tcp convergence layer tcp/testbed6.cs.ohio.edu:4556 See your bprc file or bpadmin for outducts/protocols you can use.

a plan 1 tcp/testbed1.cs.ohio.edu:4556
a plan 2 tcp/testbed6.cs.ohio.edu:4556
a plan 3 tcp/testbed6.cs.ohio.edu:4556
a plan 4 tcp/testbed6.cs.ohio.edu:4556
## end ipnadmin

A.1.2 Node2 configuration file

## begin ionadmin
1 2 /home/testsuite/scripts/host2/ion.conf

## end ionadmin

74 ## begin ionsecadmin 1 e 1 ## end ionsecadmin

## begin bpadmin 1

a scheme ipn ipnfw ipnadminep

a endpoint ipn:2.1 q a endpoint ipn:2.2 q

a protocol tcp 1400 100

a induct tcp testbed6.cs.ohio.edu:4556 tcpcli

a outduct tcp testbed1.cs.ohio.edu:4556 tcpclo a outduct tcp testbed6.cs.ohio.edu:4556 tcpclo a outduct tcp testbed7.cs.ohio.edu:4556 tcpclo s ## end bpadmin

## begin ipnadmin a plan 1 tcp/testbed1.cs.ohio.edu:4556 a plan 2 tcp/testbed6.cs.ohio.edu:4556

a plan 3 tcp/testbed7.cs.ohio.edu:4556
a plan 4 tcp/testbed7.cs.ohio.edu:4556
## end ipnadmin

A.1.3 Node3 configuration file

## begin ionadmin
1 3 /home/testsuite/scripts/host3/ion.conf

s ## end ionadmin

## begin ionsecadmin 1 e 1 ## end ionsecadmin

## begin bpadmin 1

a scheme ipn ipnfw ipnadminep

a endpoint ipn:3.1 q a endpoint ipn:3.2 q

a protocol tcp 1400 100

76 a induct tcp testbed7.cs.ohio.edu:4556 tcpcli

a outduct tcp testbed6.cs.ohio.edu:4556 tcpclo a outduct tcp testbed7.cs.ohio.edu:4556 tcpclo a outduct tcp testbed8.cs.ohio.edu:4556 tcpclo

s ## end bpadmin

## begin ipnadmin a plan 1 tcp/testbed6.cs.ohio.edu:4556 a plan 2 tcp/testbed6.cs.ohio.edu:4556 a plan 3 tcp/testbed7.cs.ohio.edu:4556 a plan 4 tcp/testbed8.cs.ohio.edu:4556 ## end ipnadmin A.1.4 Node4 conguration le ## begin ionadmin 1 4 /home/testsuite/scripts/host4/ion.conf

s ## end ionadmin

## begin ionsecadmin 1 e 1

77 ## end ionsecadmin

## begin bpadmin 1

a scheme ipn ipnfw ipnadminep

a endpoint ipn:4.1 q a endpoint ipn:4.2 q

a protocol tcp 1400 100

a induct tcp testbed8.cs.ohio.edu:4556 tcpcli

a outduct tcp testbed7.cs.ohio.edu:4556 tcpclo a outduct tcp testbed8.cs.ohio.edu:4556 tcpclo

s ## end bpadmin

## begin ipnadmin a plan 1 tcp/testbed7.cs.ohio.edu:4556 a plan 2 tcp/testbed7.cs.ohio.edu:4556 a plan 3 tcp/testbed7.cs.ohio.edu:4556 a plan 4 tcp/testbed8.cs.ohio.edu:4556 ## end ipnadmin

A.2 LTPCL Configuration

A.2.1 Node1 configuration file

# This is the config file for the 4node LTP test

## begin ionadmin # ionrc configuration file for the 4node LTP test # This uses ltp as the primary convergence layer.

# Initialization command (command 1). # # # Set this node to be node 1 (as in ipn:1). Use sdr configuration in /home/testsuite/scripts/host1/ion.conf

1 1 /home/testsuite/scripts/host1/ion.conf

# start ion node s

# Add a contact. # # # # It will start at +1 seconds from now, ending +36000 seconds from now. It will connect node 1 to itself It will transmit 100000000 bytes/second.

a contact +1 +36000 1 1 100000000

a contact +1 +36000 1 2 100000000

79 a contact +1 +36000 2 1 100000000 a contact +1 +36000 2 2 100000000 a contact +1 +36000 2 3 100000000 a contact +1 +36000 3 2 100000000 a contact +1 +36000 3 3 100000000 a contact +1 +36000 3 4 100000000 a contact +1 +36000 4 3 100000000 a contact +1 +36000 4 4 100000000

# Add a range. This is the physical distance between nodes. # # # # # It will start at +1 seconds from now, ending +36000 seconds from now. It will connect node 1 to itself. Data on the link is expected to take 1 second to reach the other end (One Way Light Time).

a range +1 +36000 1 1 1

a range +1 +36000 1 2 1 a range +1 +36000 1 3 1 a range +1 +36000 1 4 1 a range +1 +36000 2 3 1 a range +1 +36000 2 4 1 a range +1 +36000 3 4 1

# set this node to consume and produce a mean of # 1000000 bytes/second.

80 m production 100000000 m consumption 100000000 ## end ionadmin

## begin ionsecadmin 1 e 1 ## end ionsecadmin

## begin ltpadmin
# ltprc configuration file for the 4 node ltp test
# Initialization command (command 1)
# Set the total estimated number of export sessions to 8
# LTP space reservation is 30000000
# which should be >= (max export sessions * max export bundle size) +
#                    (max import sessions * max import bundle size) for all spans
#
1 8 30000000

a b

a span 1 2 100000 2 100000 5600 10000 1 udplso testbed1.cs.ohio.edu:1113
a span 2 2 100000 2 100000 5600 10000 1 udplso testbed6.cs.ohio.edu:1113
#
# a = remote LTP engine number

# b = max export sessions
# c = max export bundle size
# d = max import sessions
# e = max import bundle size
# f = LTP segment size
# g = aggregation size
# h = aggregation time

s udplsi testbed1.cs.ohio.edu:1113 ## end ltpadmin

## begin bpadmin # bprc configuration file for the 4 node ltp test # # # This command should be run AFTER ionadmin and ltpadmin and BEFORE ipnadmin or dtnadmin.

# Initialization command (command 1). 1

# Add an EID scheme. # # # # # The schemes name is ipn. This schemes forwarding engine is handled by the program ipnfw. This schemes administration program (acting as the custodian daemon) is ipnadminep.

82 a scheme ipn ipnfw ipnadminep # Add endpoints. # # # # # # # Establish endpoints ipn:1.1 and ipn:1.2 on the local node. The behavior for receiving a bundle when there is no application currently accepting bundles, is to queue them q, as opposed to immediately and silently discarding them (use x instead of q to discard). Note that the custodian endpoint "ipn:1.0" is automatically generated.

a endpoint ipn:1.1 q a endpoint ipn:1.2 q

# Add a protocol. # # # # Add the protocol named ltp. Estimate transmission capacity assuming 1400 bytes of each frame (in this case, tcp on ethernet) for payload, and 100 bytes for overhead.

a protocol ltp 1400 100

# Add an induct. (listen) # # Add an induct to accept bundles using the ltp protocol. The induct itself is implemented by the ltpcli command.

a induct ltp 1 ltpcli

# Add an outduct. (send to yourself) # Add an outduct to send bundles using the ltp protocol.

83 # The outduct itself is implemented by the ltpclo command.

a outduct ltp 1 ltpclo a outduct ltp 2 ltpclo s ## end bpadmin

## begin ipnadmin # ipnrc configuration file for the 4node ltp test # # # # Essentially, this is the IPN schemes routing table. This command should be run AFTER bpadmin (likely to be run last).

# Add an egress plan. # # # # # # # Bundles to be transmitted to element number 1 (that is, yourself). Transmission should use the ltp span represented by engine number 1 Bundles to be transmitted to element number 2. Transmission should use the ltp span represented by engine number 2

a plan 1 ltp/1 a plan 2 ltp/2 a plan 3 ltp/2 a plan 4 ltp/2

## end ipnadmin

A.2.2 Node2 configuration file

## begin ionadmin

1 2 /home/testsuite/scripts/host2/ion.conf

a contact +1 +36000 1 1 100000000 a contact +1 +36000 1 2 100000000 a contact +1 +36000 2 1 100000000 a contact +1 +36000 2 2 100000000 a contact +1 +36000 2 3 100000000 a contact +1 +36000 3 2 100000000 a contact +1 +36000 3 3 100000000 a contact +1 +36000 3 4 100000000 a contact +1 +36000 4 3 100000000 a contact +1 +36000 4 4 100000000

a range +1 +36000 1 1 1 a range +1 +36000 1 2 1 a range +1 +36000 1 3 1 a range +1 +36000 1 4 1 a range +1 +36000 2 3 1 a range +1 +36000 2 4 1

85 a range +1 +36000 3 4 1

m production 100000000 m consumption 100000000 ## end ionadmin

## begin ionsecadmin 1 e 1 ## end ionsecadmin

## begin ltpadmin 1 808 70000000

#configuration used for no delay no loss links #1 12 30000000

a span 1 2 100000 2 100000 5600 10000 1 udplso testbed1.cs.ohio.edu:1113 a span 2 2 100000 2 100000 5600 10000 1 udplso testbed6.cs.ohio.edu:1113

#configuration used for no delay no loss links #a span 3 2 100000 2 100000 5600 10000 1 udplso testbed7.cs.ohio.edu:1113 a span 3 400 100000 400 100000 1400 10000 1

86 udplso testbed7.cs.ohio.edu:1113

s udplsi testbed6.cs.ohio.edu:1113 ## end ltpadmin

## begin bpadmin 1

a scheme ipn ipnfw ipnadminep

a endpoint ipn:2.1 q a endpoint ipn:2.2 q

a protocol ltp 1400 100

a induct ltp 2 ltpcli

a outduct ltp 1 ltpclo a outduct ltp 2 ltpclo a outduct ltp 3 ltpclo s ## end bpadmin

## begin ipnadmin

a plan 1 ltp/1

87 a plan 2 ltp/2 a plan 3 ltp/3 a plan 4 ltp/3

## end ipnadmin

A.2.3 Node3 configuration file

## begin ionadmin

1 3 /home/testsuite/scripts/host3/ion.conf

a contact +1 +36000 1 1 100000000 a contact +1 +36000 1 2 100000000 a contact +1 +36000 2 1 100000000 a contact +1 +36000 2 2 100000000 a contact +1 +36000 2 3 100000000 a contact +1 +36000 3 2 100000000 a contact +1 +36000 3 3 100000000 a contact +1 +36000 3 4 100000000 a contact +1 +36000 4 3 100000000 a contact +1 +36000 4 4 100000000

a range +1 +36000 1 1 1 a range +1 +36000 1 2 1

88 a range +1 +36000 1 3 1 a range +1 +36000 1 4 1 a range +1 +36000 2 3 1 a range +1 +36000 2 4 1 a range +1 +36000 3 4 1

m production 100000000 m consumption 100000000 ## end ionadmin

## begin ionsecadmin 1 e 1 ## end ionsecadmin

## begin ltpadmin # configuration used for no delay no loss links #1 12 30000000

1 808 90000000

# configuration used for no delay no loss links #a span 2 2 100000 2 100000 5600 10000 1 udplso testbed6.cs.ohio.edu:1113

a span 2 400 100000 400 100000 1400 10000 1

89 udplso testbed6.cs.ohio.edu:1113 a span 3 2 100000 2 100000 5600 10000 1 udplso testbed7.cs.ohio.edu:1113 a span 4 2 100000 2 100000 5600 10000 1 udplso testbed8.cs.ohio.edu:1113

s udplsi testbed7.cs.ohio.edu:1113 ## end ltpadmin

## begin bpadmin

a scheme ipn ipnfw ipnadminep

a endpoint ipn:3.1 q a endpoint ipn:3.2 q

a protocol ltp 1400 100

a induct ltp 3 ltpcli

a outduct ltp 2 ltpclo a outduct ltp 3 ltpclo a outduct ltp 4 ltpclo s

90 ## end bpadmin

## begin ipnadmin

a plan 1 ltp/2 a plan 2 ltp/2 a plan 3 ltp/3 a plan 4 ltp/4

## end ipnadmin

A.2.4 Node4 configuration file

## begin ionadmin

1 4 /home/testsuite/scripts/host4/ion.conf

a contact +1 +36000 1 1 100000000 a contact +1 +36000 1 2 100000000 a contact +1 +36000 2 1 100000000 a contact +1 +36000 2 2 100000000 a contact +1 +36000 2 3 100000000 a contact +1 +36000 3 2 100000000 a contact +1 +36000 3 3 100000000 a contact +1 +36000 3 4 100000000

91 a contact +1 +36000 4 3 100000000 a contact +1 +36000 4 4 100000000

a range +1 +36000 1 1 1 a range +1 +36000 1 2 1 a range +1 +36000 1 3 1 a range +1 +36000 1 4 1 a range +1 +36000 2 3 1 a range +1 +36000 2 4 1 a range +1 +36000 3 4 1

m production 100000000 m consumption 100000000 ## end ionadmin

## begin ionsecadmin 1 e 1 ## end ionsecadmin

## begin ltpadmin 1 8 30000000

a span 3 2 100000 2 100000 5600 10000 1 udplso testbed7.cs.ohio.edu:1113 a span 4 2 100000 2 100000 5600 10000 1

92 udplso testbed8.cs.ohio.edu:1113

s udplsi testbed8.cs.ohio.edu:1113 ## end ltpadmin

## begin bpadmin

a scheme ipn ipnfw ipnadminep

a endpoint ipn:4.1 q a endpoint ipn:4.2 q

a protocol ltp 1400 100

a induct ltp 4 ltpcli

a outduct ltp 3 ltpclo a outduct ltp 4 ltpclo s ## end bpadmin

## begin ipnadmin

a plan 1 ltp/3

93 a plan 2 ltp/3 a plan 3 ltp/3 a plan 4 ltp/4

## end ipnadmin

Appendix B: Supporting programs

B.1 Netem Script

#!/bin/bash
# This is the delay / loss emulator script on Node2
# There is a similar script in Node3
# This script takes 3 arguments - interface, delay, loss
#
echo "Delay Script.."
if [ $# -ne 3 ]; then
    echo "Usage $0 <interface> <delay>ms/sec/usec <loss>" 1>&2
    exit 1
fi

interface=$1
delay=$2
loss=$3
loss_s="$3%"

echo "Interface : " $interface
echo "Delay : " $delay
echo "Loss : " $loss

sudo tc qdisc add dev $interface root handle 1: prio

if [ $loss -gt 0 ]; then

    sudo tc qdisc add dev $interface parent 1:3 handle 30: \
        netem delay $delay loss $loss_s limit 10000 1>&2
else
    sudo tc qdisc add dev $interface parent 1:3 handle 30: \
        netem delay $delay limit 10000 1>&2
fi

sudo tc filter add dev $interface protocol ip parent 1:0 prio 1 \
    u32 match ip dst 132.235.3.39 match ip dport 1113 0xffff flowid 1:3

sudo tc filter add dev $interface protocol ip parent 1:0 prio 1 \ u32 match ip dst 132.235.3.39 match ip sport 1113 0xffff flowid 1:3

sudo tc filter add dev $interface protocol ip parent 1:0 prio 1 \ u32 match ip dst 132.235.3.39 match ip dport 4556 0xffff flowid 1:3

sudo tc filter add dev $interface protocol ip parent 1:0 prio 1 \ u32 match ip dst 132.235.3.39 match ip sport 4556 0xffff flowid 1:3

sudo tc filter add dev $interface protocol ip parent 1:0 prio 1 \ u32 match ip src 132.235.3.39 match ip dport 1113 0xffff flowid 1:3

sudo tc filter add dev $interface protocol ip parent 1:0 prio 1 \ u32 match ip src 132.235.3.39 match ip sport 1113 0xffff flowid 1:3

sudo tc filter add dev $interface protocol ip parent 1:0 prio 1 \ u32 match ip src 132.235.3.39 match ip dport 4556 0xffff flowid 1:3

sudo tc filter add dev $interface protocol ip parent 1:0 prio 1 \ u32 match ip src 132.235.3.39 match ip sport 4556 0xffff flowid 1:3
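# Example invocation (illustrative only: the script file name and the
# interface below are assumptions, not values from the thesis):
#   sudo ./netem_delay_loss.sh eth1 2500ms 2
# This would add 2500 ms of one-way delay and 2% packet loss to the ION
# traffic matched by the filters above.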

exit 0

B.2 SystemTap Script

#! /usr/bin/env stap

# This script counts the number of times ION calls ipc_lock
# and the total time spent waiting in it

global total_count global wait_time

probe begin { printf("Log of ipc_lock wait time\n") }

probe kernel.function("ipc_lock@ipc/util.c").return { total_count <<< 1 time_elapsed = gettimeofday_us() @entry(gettimeofday_us()) wait_time <<< time_elapsed }

probe end { printf("Total sem_op calls : %d", @count(total_count))

printf("Total wait time in ipc_lock(us) : %d \n", @sum(wait_time) )
}

B.3 SBP Iperf

B.3.1 SBP Server

/*--------------------------------------------------------------* Copyright (c) 1999,2000,2001,2002,2003 * The Board of Trustees of the University of Illinois * All Rights Reserved. *--------------------------------------------------------------* Permission is hereby granted, free of charge, to any person * obtaining a copy of this software (Iperf) and associated * documentation files (the "Software"), to deal in the Software * without restriction, including without limitation the * rights to use, copy, modify, merge, publish, distribute, * sublicense, and/or sell copies of the Software, and to permit * persons to whom the Software is furnished to do * so, subject to the following conditions: * * * Redistributions of source code must retain the above * copyright notice, this list of conditions and * the following disclaimers. * *

98 * Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimers in the documentation and/or other materials * provided with the distribution. * * * Neither the names of the University of Illinois, NCSA, * nor the names of its contributors may be used to endorse * or promote products derived from this Software without * specific prior written permission. * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES * OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE CONTIBUTORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, * WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, * ARISING FROM, OUT OF OR IN CONNECTION WITH THE * SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * ________________________________________________________________ * National Laboratory for Applied Network Research * National Center for Supercomputing Applications * University of Illinois at Urbana-Champaign * http://www.ncsa.uiuc.edu * ________________________________________________________________ *

99 * Server.cpp * by Mark Gates <mgates@nlanr.net> * * * This has been modified to run over DTN protocols using SBP-API * by Mithun Roy <mr774007@ohio.edu> * ----------------------------------------------------------------* A server thread is initiated for each connection accept() returns. * Handles sending and receiving data, and then closes socket. * Changes to this version : The server can be run as a daemon * ---------------------------------------------------------------*/ Ajay Tirumala (tirumala@ncsa.uiuc.edu>.

#define HEADERS()

#include "headers.h" #include "Server.hpp" #include "List.h" #include "Extractor.h" #include "Reporter.h" #include "Locale.h"

/* ----------------------------------------------------------------* Stores connected socket and socket info. * ---------------------------------------------------------------*/

Server::Server( thread_Settings *inSettings ) {

mSettings = inSettings; mBuf = NULL;

// initialize buffer mBuf = new char[ mSettings->mBufLen ]; FAIL_errno( mBuf == NULL, "No memory for buffer\n", mSettings ); }

/* ----------------------------------------------------------------* Destructor close socket. * ---------------------------------------------------------------*/

Server::~Server() { if ( mSettings->mSock != INVALID_SOCKET ) { #ifdef ION sbp_close( mSettings->mSock ); #else int rc = close( mSettings->mSock ); WARN_errno( rc == SOCKET_ERROR, "close" ); #endif mSettings->mSock = INVALID_SOCKET; } DELETE_ARRAY( mBuf ); }

void Server::Sig_Int( int inSigno ) {

}

/* ----------------------------------------------------------------* Receieve data from the (connected) TCP/UDP socket. * Sends termination flag several times at the end. * Does not close the socket. * ---------------------------------------------------------------*/ void Server::Run( void ) { long currLen; max_size_t totLen = 0; struct UDP_datagram* mBuf_UDP = (struct UDP_datagram*) mBuf;

ReportStruct *reportstruct = NULL;

reportstruct = new ReportStruct; if ( reportstruct != NULL ) { reportstruct->packetID = 0; mSettings->reporthdr = InitReport( mSettings ); do { // perform read #ifdef ION currLen = sbp_recv( mSettings->mSock, mBuf, mSettings->mBufLen, 0 ); #else currLen = recv( mSettings->mSock, mBuf, mSettings->mBufLen, 0 );

102 #endif if ( isUDP( mSettings ) ) { // read the datagram ID and sentTime // out of the buffer reportstruct->packetID = ntohl( mBuf_UDP->id ); reportstruct->sentTime.tv_sec = ntohl( mBuf_UDP->tv_sec ); reportstruct->sentTime.tv_usec = ntohl( mBuf_UDP->tv_usec); reportstruct->packetLen = currLen; gettimeofday( &(reportstruct->packetTime), NULL ); } else { totLen += currLen; }

// terminate when datagram begins with negative index // the datagram ID should be correct, just negated if ( reportstruct->packetID < 0 ) { reportstruct->packetID = -reportstruct->packetID; currLen = -1; } if ( isUDP (mSettings)) ReportPacket( mSettings->reporthdr, reportstruct ); } while ( currLen > 0 );

// stop timing

103 gettimeofday( &(reportstruct->packetTime), NULL ); if ( !isUDP (mSettings)) { reportstruct->packetLen = totLen; ReportPacket( mSettings->reporthdr, reportstruct ); } CloseReport( mSettings->reporthdr, reportstruct );

// send a acknowledgement back only // if were NOT receiving multicast if ( isUDP( mSettings ) && !isMulticast( mSettings ) ) { // send back an acknowledgement of the // terminating datagram write_UDP_AckFIN( ); } } else { FAIL(1, "Out of memory! Closing server thread\n", mSettings); }

Mutex_Lock( &clients_mutex ); Iperf_delete( &(mSettings->peer), &clients ); Mutex_Unlock( &clients_mutex );

DELETE_PTR( reportstruct ); EndReport( mSettings->reporthdr ); }

// end Recv

/* ----------------------------------------------------------------* Send an AckFIN (a datagram acknowledging a FIN) on the socket, * then select on the socket for some time. If additional datagrams * come in, probably our AckFIN was lost and they are re-transmitted * termination datagrams, so re-transmit our AckFIN. * ---------------------------------------------------------------*/

void Server::write_UDP_AckFIN( ) {

int rc;

fd_set readSet; FD_ZERO( &readSet );

struct timeval timeout;

int count = 0; while ( count < 10 ) { count++;

UDP_datagram *UDP_Hdr; server_hdr *hdr;

UDP_Hdr = (UDP_datagram*) mBuf;


if ( mSettings->mBufLen > (int) ( sizeof( UDP_datagram ) + sizeof( server_hdr ))) { Transfer_Info *stats = GetReport(mSettings->reporthdr); hdr = (server_hdr*) (UDP_Hdr+1);

hdr->flags hdr->total_len1

= htonl( HEADER_VERSION1 ); = htonl( (long) (stats->TotalLen >> 32));

hdr->total_len2

htonl( (long) (stats->TotalLen & 0xFFFFFFFF)); hdr->stop_sec hdr->stop_usec = htonl( (long) stats->endTime ); =

htonl( (long)((stats->endTime - (long)stats->endTime) * rMillion)); hdr->error_cnt = htonl( stats->cntError );

hdr->outorder_cnt = htonl( stats->cntOutofOrder ); hdr->datagrams hdr->jitter1 hdr->jitter2 = htonl( stats->cntDatagrams ); = htonl( (long) stats->jitter ); =

htonl( (long) ((stats->jitter - (long)stats->jitter) * rMillion) );

// write data

#ifdef ION sbp_write( mSettings->mSock, mBuf, mSettings->mBufLen ); #else write( mSettings->mSock, mBuf, mSettings->mBufLen ); #endif

// wait until the socket is readable, or our timeout expires #ifndef ION // since SBP doesn't support select, I'm going to wait for the timeout to expire FD_SET( mSettings->mSock, &readSet ); #endif timeout.tv_sec = 1;

timeout.tv_usec = 0; #ifdef ION rc = 0; #else rc = select( mSettings->mSock+1, &readSet, NULL, NULL, &timeout ); #endif FAIL_errno( rc == SOCKET_ERROR, "select", mSettings );

if ( rc == 0 ) { // select timed out return; } else {

107 // socket ready to read #ifdef ION rc = sbp_read( mSettings->mSock, mBuf, mSettings->mBufLen ); #else rc = read( mSettings->mSock, mBuf, mSettings->mBufLen ); #endif WARN_errno( rc < 0, "read" ); if ( rc <= 0 ) { // Connection closed or errored // Stop using it. return; } } }

fprintf( stderr, warn_ack_failed, mSettings->mSock, count ); } // end write_UDP_AckFIN B.3.2 SBP Client

/*--------------------------------------------------------------* Copyright (c) 1999,2000,2001,2002,2003 * The Board of Trustees of the University of Illinois * All Rights Reserved.

108 *--------------------------------------------------------------* Permission is hereby granted, free of charge, to any person * obtaining a copy of this software (Iperf) and associated * documentation files (the "Software"), to deal in the Software * without restriction, including without limitation the * rights to use, copy, modify, merge, publish, distribute, * sublicense, and/or sell copies of the Software, and to permit * persons to whom the Software is furnished to do * so, subject to the following conditions: * * * Redistributions of source code must retain the above * copyright notice, this list of conditions and * the following disclaimers. * * * Redistributions in binary form must reproduce the above * copyright notice, this list of conditions and the following * disclaimers in the documentation and/or other materials * provided with the distribution. * * * Neither the names of the University of Illinois, NCSA, * nor the names of its contributors may be used to endorse * or promote products derived from this Software without * specific prior written permission.

109 * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES * OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND * NONINFRINGEMENT. IN NO EVENT SHALL THE CONTIBUTORS OR COPYRIGHT * HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, * WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, * ARISING FROM, OUT OF OR IN CONNECTION WITH THE * SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. * ________________________________________________________________ * National Laboratory for Applied Network Research * National Center for Supercomputing Applications * University of Illinois at Urbana-Champaign * http://www.ncsa.uiuc.edu * ________________________________________________________________ * * Client.cpp * by Mark Gates <mgates@nlanr.net> * ---------------------------------------------------------------* A client thread initiates a connect to the server and handles * sending and receiving data, then closes the socket. * --------------------------------------------------------------*/

#include "headers.h" #include "Client.hpp" #include "Thread.h"

110 #include "SocketAddr.h" #include "PerfSocket.hpp" #include "Extractor.h" #include "delay.hpp" #include "util.h" #include "Locale.h"

/* ---------------------------------------------------------------* Store server hostname, optionally local hostname, and socket * info. * ------------------------------------------------------------- */

Client::Client( thread_Settings *inSettings ) { mSettings = inSettings; mBuf = NULL;

// initialize buffer mBuf = new char[ mSettings->mBufLen ]; pattern( mBuf, mSettings->mBufLen ); if ( isFileInput( mSettings ) ) { if ( !isSTDIN( mSettings ) ) Extractor_Initialize( mSettings->mFileName, mSettings->mBufLen, mSettings ); else Extractor_InitializeFile( stdin, mSettings->mBufLen, mSettings );


if ( !Extractor_canRead( mSettings ) ) { unsetFileInput( mSettings ); } }

// connect Connect( );

if ( isReport( inSettings ) ) { ReportSettings( inSettings ); if ( mSettings->multihdr && isMultipleReport(inSettings)){ #ifndef ION mSettings->multihdr->report->connection.peer = mSettings->peer; mSettings->multihdr->report->connection.size_peer = mSettings->size_peer; mSettings->multihdr->report->connection.local = mSettings->local; SockAddr_setPortAny ( &mSettings->multihdr->report->connection.local ); mSettings->multihdr->report->connection.size_local = mSettings->size_local; #endif } }


} // end Client

/* ---------------------------------------------------------------* Delete memory (hostname strings). * ------------------------------------------------------------- */

Client::~Client() { if ( mSettings->mSock != INVALID_SOCKET ) { #ifdef ION sbp_close( mSettings->mSock ); #else int rc = close( mSettings->mSock ); WARN_errno( rc == SOCKET_ERROR, "close" ); #endif mSettings->mSock = INVALID_SOCKET; } DELETE_ARRAY( mBuf ); } // end ~Client

const double kSecs_to_usecs = 1e6; const int kBytes_to_Bits = 8;

void Client::RunTCP( void ) { long currLen = 0; struct itimerval it;

max_size_t totLen = 0;

int err;

char* readAt = mBuf;

// Indicates if the stream is readable bool canRead = true, mMode_Time = isModeTime( mSettings );

ReportStruct *reportstruct = NULL;

// InitReport handles Barrier for multiple Streams mSettings->reporthdr = InitReport( mSettings ); reportstruct = new ReportStruct; reportstruct->packetID = 0;

lastPacketTime.setnow(); if ( mMode_Time ) { memset (&it, 0, sizeof (it)); it.it_value.tv_sec = (int) (mSettings->mAmount / 100.0); it.it_value.tv_usec = (int) 10000 * (mSettings->mAmount it.it_value.tv_sec * 100.0); err = setitimer( ITIMER_REAL, &it, NULL ); if ( err != 0 ) { perror("setitimer"); exit(1);

114 } } do { // Read the next data block from // the file if its file input if ( isFileInput( mSettings ) ) { Extractor_getNextDataBlock( readAt, mSettings ); canRead = Extractor_canRead( mSettings ) != 0; } else canRead = true;

// perform write currLen = write( mSettings->mSock, mBuf, mSettings->mBufLen ); if ( currLen < 0 ) { WARN_errno( currLen < 0, "write2" ); break; } totLen += currLen;

if(mSettings->mInterval > 0) { gettimeofday( &(reportstruct->packetTime), NULL ); reportstruct->packetLen = currLen; ReportPacket( mSettings->reporthdr, reportstruct ); }

115 if ( !mMode_Time ) { mSettings->mAmount -= currLen; }

} while ( ! (sInterupted

|| && 0 >= mSettings->mAmount)) && canRead );

(!mMode_Time

// stop timing gettimeofday( &(reportstruct->packetTime), NULL );

// if were not doing interval reporting, report the // entire transfer as one big packet if(0.0 == mSettings->mInterval) { reportstruct->packetLen = totLen; ReportPacket( mSettings->reporthdr, reportstruct ); } CloseReport( mSettings->reporthdr, reportstruct );

DELETE_PTR( reportstruct ); EndReport( mSettings->reporthdr ); }

/* ---------------------------------------------------------------* Send data using the connected UDP/TCP socket, * until a termination flag is reached.

116 * Does not close the socket. * ------------------------------------------------------------- */

void Client::Run( void ) { struct UDP_datagram* mBuf_UDP = (struct UDP_datagram*) mBuf; long currLen = 0;

int delay_target = 0; int delay = 0; int adjust = 0;

char* readAt = mBuf;

#if HAVE_THREAD
    if ( !isUDP( mSettings ) ) {
        RunTCP();
        return;
    }
#endif

    // Indicates if the stream is readable
    bool canRead = true, mMode_Time = isModeTime( mSettings );

    // setup termination variables
    if ( mMode_Time ) {
        mEndTime.setnow();
        mEndTime.add( mSettings->mAmount / 100.0 );
    }

    if ( isUDP( mSettings ) ) {
        // Due to the UDP timestamps etc, included
        // reduce the read size by an amount
        // equal to the header size

        // compute delay for bandwidth restriction,
        // constrained to [0,1] seconds
        delay_target = (int) ( mSettings->mBufLen *
                               ((kSecs_to_usecs * kBytes_to_Bits) /
                                mSettings->mUDPRate) );

        if ( delay_target < 0  ||
             delay_target > (int) 1 * kSecs_to_usecs ) {
            fprintf( stderr, warn_delay_large,
                     delay_target / kSecs_to_usecs );
            delay_target = (int) kSecs_to_usecs * 1;
        }
        if ( isFileInput( mSettings ) ) {
            if ( isCompat( mSettings ) ) {
                Extractor_reduceReadSize( sizeof(struct UDP_datagram),
                                          mSettings );
                readAt += sizeof(struct UDP_datagram);
            } else {
                Extractor_reduceReadSize( sizeof(struct UDP_datagram) +
                                          sizeof(struct client_hdr),
                                          mSettings );
                readAt += sizeof(struct UDP_datagram) +
                          sizeof(struct client_hdr);
            }
        }
    }
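    // Illustrative example (values assumed, not taken from the test runs):
    // with a 1470-byte UDP payload and a requested rate of 1 Mbit/s,
    // delay_target = 1470 * (1e6 * 8 / 1e6) = 11760 microseconds, i.e.
    // roughly 11.8 ms between consecutive datagram writes.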

ReportStruct *reportstruct = NULL;

    // InitReport handles Barrier for multiple Streams
    mSettings->reporthdr = InitReport( mSettings );
    reportstruct = new ReportStruct;
    reportstruct->packetID = 0;

lastPacketTime.setnow();

    // Flush out the read queue; there might be some server reports
    // from previous runs
#ifdef ION
    while ( sbp_recv( mSettings->mSock, mBuf,
                      mSettings->mBufLen, MSG_DONTWAIT ) > 0 )
        ;
#endif
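    // Because bundles are stored and forwarded, a server report generated
    // during an earlier run can still be queued at the local bundle agent;
    // the non-blocking reads above drain any such leftovers so they are not
    // mistaken for this run's report.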

    do {
        // Test case: drop 17 packets and send 2 out-of-order:
        // sequence 51, 52, 70, 53, 54, 71, 72
        //switch( datagramID ) {
        //    case 53: datagramID = 70; break;
        //    case 71: datagramID = 53; break;
        //    case 55: datagramID = 71; break;
        //    default: break;
        //}
        gettimeofday( &(reportstruct->packetTime), NULL );
        // zero the UDP header area at the front of the buffer
        memset( mBuf_UDP, 0, sizeof(UDP_datagram) );

        if ( isUDP( mSettings ) ) {
            // store datagram ID into buffer
            mBuf_UDP->id      = htonl( (reportstruct->packetID)++ );
            mBuf_UDP->tv_sec  = htonl( reportstruct->packetTime.tv_sec );
            mBuf_UDP->tv_usec = htonl( reportstruct->packetTime.tv_usec );

            // delay between writes
            // make an adjustment for how long the last loop
            // iteration took
            // TODO this doesn't work well in certain cases,
            // like 2 parallel streams
            adjust = delay_target +
                     lastPacketTime.subUsec( reportstruct->packetTime );
            lastPacketTime.set( reportstruct->packetTime.tv_sec,
                                reportstruct->packetTime.tv_usec );

            if ( adjust > 0  ||  delay > 0 ) {
                delay += adjust;
            }
        }

        // Read the next data block from
        // the file if it's file input
        if ( isFileInput( mSettings ) ) {
            Extractor_getNextDataBlock( readAt, mSettings );
            canRead = Extractor_canRead( mSettings ) != 0;
        } else
            canRead = true;

        // perform write
#ifdef ION
        currLen = sbp_write( mSettings->mSock, mBuf, mSettings->mBufLen );
#else
        currLen = write( mSettings->mSock, mBuf, mSettings->mBufLen );
#endif
        if ( currLen < 0  &&  errno != ENOBUFS ) {
            WARN_errno( currLen < 0, "write2" );
            break;
        }

        // report packets
        reportstruct->packetLen = currLen;
        ReportPacket( mSettings->reporthdr, reportstruct );

        if ( delay > 0 ) {
            delay_loop( delay );
        }
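        // delay accumulates the gap between the target inter-datagram
        // spacing and the time the previous iteration actually took; while
        // it stays positive, delay_loop() pauses here so the stream tracks
        // the requested UDP bit rate.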

        if ( !mMode_Time ) {
            mSettings->mAmount -= currLen;
        }

    } while ( ! (sInterupted  ||
                 (mMode_Time   &&  mEndTime.before( reportstruct->packetTime ))  ||
                 (!mMode_Time  &&  0 >= (int) mSettings->mAmount))  &&  canRead );
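    // The loop above ends on an interrupt, when the configured test duration
    // has elapsed (time mode), when the requested byte count is exhausted
    // (byte-count mode), or when file input runs dry (canRead is false).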

    if ( isUDP( mSettings ) ) {
        // send a final terminating datagram
        // Don't count it in the mTotalLen. The server counts
        // this one, but didn't count our first datagram, so we're
        // even now. The negative datagram ID signifies
        // termination to the server.

        // store datagram ID into buffer
        //memset(mBuf, 0, mSettings->mBufLen);
        mBuf_UDP->id      = htonl( -(reportstruct->packetID) );
        mBuf_UDP->tv_sec  = htonl( reportstruct->packetTime.tv_sec );
        mBuf_UDP->tv_usec = htonl( reportstruct->packetTime.tv_usec );
    }

    // stop timing
    gettimeofday( &(reportstruct->packetTime), NULL );
    //Changes reportstruct->packetID!!
    CloseReport( mSettings->reporthdr, reportstruct );

    if ( isUDP( mSettings ) ) {
        if ( isMulticast( mSettings ) ) {
#ifdef ION
            sbp_write( mSettings->mSock, mBuf, mSettings->mBufLen );
#else
            write( mSettings->mSock, mBuf, mSettings->mBufLen );
#endif
        } else {
            write_UDP_FIN( );
        }
    }

    DELETE_PTR( reportstruct );
    EndReport( mSettings->reporthdr );
} // end Run

void Client::InitiateServer() {
    if ( !isCompat( mSettings ) ) {
        int currLen;
        client_hdr* temp_hdr;
        if ( isUDP( mSettings ) ) {
            UDP_datagram *UDPhdr = (UDP_datagram *) mBuf;
            temp_hdr = (client_hdr*)(UDPhdr + 1);
        } else {
            temp_hdr = (client_hdr*) mBuf;
        }
        Settings_GenerateClientHdr( mSettings, temp_hdr );
        if ( !isUDP( mSettings ) ) {
            currLen = send( mSettings->mSock, mBuf,
                            sizeof(client_hdr), 0 );
            if ( currLen < 0 ) {
                WARN_errno( currLen < 0, "write1" );
            }
        }
    }
}
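// InitiateServer() fills in the client_hdr carrying the client's test
// settings. For a TCP test the header is sent on its own with send(); for a
// UDP test it is only written into mBuf just past the UDP_datagram header,
// so it travels inside the data datagrams themselves.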

/* -------------------------------------------------------------------
 * Setup a socket connected to a server.
 * If inLocalhost is not null, bind to that address, specifying
 * which outgoing interface to use.
 * ------------------------------------------------------------------- */

void Client::Connect( ) {
    int rc;
    SockAddr_remoteAddr( mSettings );
    SBP_Init();

assert( mSettings->inHostname != NULL );

    // create an internet socket
#ifdef ION
    int type = SOCK_BUNDLE;
#else
    int type = ( isUDP( mSettings ) ? SOCK_DGRAM : SOCK_STREAM );
#endif

#ifdef ION
    int domain = AF_DTN;
#else
    int domain = (SockAddr_isIPv6( &mSettings->peer ) ?
#ifdef HAVE_IPV6
                  AF_INET6
#else
                  AF_INET
#endif
                  : AF_INET);
#endif

#ifdef ION
    mSettings->mSock = sbp_socket( domain, type, 0 );
#else
    mSettings->mSock = socket( domain, type, 0 );
#endif

WARN_errno( mSettings->mSock == INVALID_SOCKET, "socket" );

#ifdef ION
    int val = 1;
    int cust = 1;
    // wait 1 second before giving up
    sbp_setsockopt( mSettings->mSock, SOL_SOCKET,
                    DTNOPT_BLOCKING, &val, sizeof(val) );
    if ( isCustody( mSettings ) ) {
        sbp_setsockopt( mSettings->mSock, SOL_SOCKET,
                        DTNOPT_CUSTODY, &cust, sizeof(cust) );
    }
#else
    SetSocketOptions( mSettings );
#endif
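    // DTNOPT_BLOCKING appears to give the sbp socket a one-second blocking
    // receive timeout, and DTNOPT_CUSTODY appears to request Bundle Protocol
    // custody transfer for outgoing bundles when custody is enabled.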

    SockAddr_localAddr( mSettings );
#ifdef ION
    if ( mSettings->mLocalhost != NULL ) {
        rc = sbp_bind( mSettings->mSock, &mSettings->local,
                       mSettings->size_local );
        WARN_errno( rc == SOCKET_ERROR, "bind" );
    }

    rc = sbp_connect( mSettings->mSock, &mSettings->peer,
                      mSettings->size_peer );
    WARN_errno( rc == SOCKET_ERROR, "connect" );

    mSettings->size_peer = sizeof(sbp_sockaddr);
    strcpy( mSettings->peer.uri, mSettings->mHost );
#else

    if ( mSettings->mLocalhost != NULL ) {
        // bind socket to local address
        rc = bind( mSettings->mSock, (sockaddr*) &mSettings->local,
                   SockAddr_get_sizeof_sockaddr( &mSettings->local ) );
        WARN_errno( rc == SOCKET_ERROR, "bind" );
    }

    // connect socket
    rc = connect( mSettings->mSock, (sockaddr*) &mSettings->peer,
                  SockAddr_get_sizeof_sockaddr( &mSettings->peer ) );
    WARN_errno( rc == SOCKET_ERROR, "connect" );

    getsockname( mSettings->mSock, (sockaddr*) &mSettings->local,
                 &mSettings->size_local );
    getpeername( mSettings->mSock, (sockaddr*) &mSettings->peer,
                 &mSettings->size_peer );
#endif
} // end Connect
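// Under ION the endpoints are DTN endpoint identifiers rather than IP
// addresses: the destination URI supplied for the server is copied into
// peer.uri, and sbp_bind()/sbp_connect() replace the BSD bind()/connect()
// calls used in the non-ION build.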

/* -------------------------------------------------------------------
 * Send a datagram on the socket. The datagram's contents should
 * signify a FIN to the application. Keep re-transmitting until an
 * acknowledgement datagram is received.
 * ------------------------------------------------------------------- */

void Client::write_UDP_FIN( ) {
    int rc;
    fd_set readSet;
    struct timeval timeout;
    server_hdr *hdr;
    char *udp_fin;

int count = 0;

    // copy udp_fin from mBuf, before it gets overwritten
    udp_fin = (char*) malloc( mSettings->mBufLen );
    memcpy( udp_fin, mBuf, mSettings->mBufLen );
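    // The loop below retransmits the FIN datagram up to 10 times, waiting
    // after each send for the server's acknowledgement report; the copy in
    // udp_fin is needed because mBuf is reused to receive that report.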

    while ( count < 10 ) {
        count++;

        // write data
#ifdef ION
        // The first 5 FINs are sent 1 second apart,
        // but the last 5 are sent after waiting for
        // mTTL seconds (set by timeout)
        if ( count == 5 ) {
            // if mTTL is set by the user
            if ( mSettings->mTTL > 0 ) {
                sbp_setsockopt( mSettings->mSock, SOL_SOCKET,
                                DTNOPT_BLOCKING, &mSettings->mTTL,
                                sizeof(mSettings->mTTL) );
            } else {
                int default_wait = 2;
                sbp_setsockopt( mSettings->mSock, SOL_SOCKET,
                                DTNOPT_BLOCKING, &default_wait,
                                sizeof(default_wait) );
            }
        }
        sbp_write( mSettings->mSock, udp_fin, mSettings->mBufLen );
#else
        write( mSettings->mSock, mBuf, mSettings->mBufLen );
#endif

        // wait until the socket is readable,
        // or our timeout expires
        FD_ZERO( &readSet );
#ifndef ION
        // Since select is not implemented
        // in sbp, we will wait for the timeout
        FD_SET( mSettings->mSock, &readSet );
#endif
        timeout.tv_sec  = 0;
        timeout.tv_usec = 250000;    // quarter second, 250 ms

#ifdef ION
        rc = 1;
#else
        rc = select( mSettings->mSock+1, &readSet, NULL, NULL, &timeout );
#endif
        FAIL_errno( rc == SOCKET_ERROR, "select", mSettings );

        if ( rc == 0 ) {
            // select timed out
            continue;
        } else {
            // socket ready to read
#ifdef ION
            rc = sbp_read( mSettings->mSock, mBuf, mSettings->mBufLen );
#else
            rc = read( mSettings->mSock, mBuf, mSettings->mBufLen );
            WARN_errno( rc < 0, "read" );
#endif
            if ( rc < 0 ) {
#ifdef ION
                continue;
#else
                break;
#endif
            } else if ( rc >= (int) (sizeof(UDP_datagram) +
                                     sizeof(server_hdr)) ) {
#ifdef ION
                hdr = (server_hdr*) ((UDP_datagram*) mBuf + 1);

                if ( ntohl(hdr->packetID) ==
                     -(ntohl( ((UDP_datagram*) udp_fin)->id )) ) {
#endif
                    ReportServerUDP( mSettings,
                                     (server_hdr*) ((UDP_datagram*) mBuf + 1) );
#ifdef ION
                } else {
                    continue;
                }
#endif
            }
            free(udp_fin);
            return;
        }
    }

    free(udp_fin);
    fprintf( stderr, warn_no_ack, mSettings->mSock, count );
} // end write_UDP_FIN
