storage without boundaries

Whitepaper 05/08/2008

WDS (WAN-optimization Data Services) in StorTrends iTX: Improving the Efficiency of Remote Replication in Wide Area Data Transfers

Copyright 1998-2008 American Megatrends, Inc. All rights reserved. American Megatrends, Inc. 5555 Oakbrook Parkway, Suite 200 Norcross, GA 30093

TRADEMARK AND COPYRIGHT ACKNOWLEDGMENTS This publication contains proprietary information that is protected by copyright. No part of this publication can be reproduced, transcribed, stored in a retrieval system, translated into any language or computer language, or transmitted in any form whatsoever without the prior written consent of the publisher, American Megatrends, Inc. Trademarks and trade names may be used in this document to refer to either the entities claiming the marks and names or their products. American Megatrends, Inc. disclaims any proprietary interest in trademarks and trade names other than its own.

FOR ADDITIONAL INFORMATION Call American Megatrends at 1-800-246-8600 for additional information. You can also visit us online at ami.com.

LIMITATIONS OF LIABILITY In no event shall American Megatrends be held liable for any loss, expenses, or damages of any kind whatsoever, whether direct, indirect, incidental, or consequential, arising from the design or use of this product or the support materials provided with the product.

LIMITED WARRANTY No warranties are made, either express or implied, with regard to the contents of this work, its merchantability, or fitness for a particular use. American Megatrends assumes no responsibility for errors and omissions or for the uses made of the material contained herein or reader decisions based on such use.

DISCLAIMER: Although efforts have been made to assure the accuracy of the information contained here, American Megatrends expressly disclaims liability for any error in this information, and for damages, whether direct, indirect, special, exemplary, consequential or otherwise, that may result from such error, including but not limited to the loss of profits resulting from the use or misuse of the information contained herein (even if American Megatrends has been advised of the possibility of such damages). Any questions or comments regarding this document or its contents should be addressed to American Megatrends at the address shown on the back cover of this document. American Megatrends provides this publication as is without warranty of any kind, either expressed or implied, including, but not limited to, the implied warranties of merchantability or fitness for a specific purpose. Some states do not allow disclaimer of express or implied warranties or the limitation or exclusion of liability for indirect, special, exemplary, incidental or consequential damages in certain transactions; therefore, this statement may not apply to you. Also, you may have other rights that vary from jurisdiction to jurisdiction. This publication could include technical inaccuracies or typographical errors. Changes are periodically made to the information herein; these changes will be incorporated in new editions of the publication. American Megatrends may make improvements and/or revisions in the product(s) and/or the program(s) described in this publication at any time.

Original Release: 10/05/2007 First Revision: 11/19/2007 Second Revision: 05/08/2008

2008 American Megatrends Inc. - Product Specifications Subject to Change without Notice

Remote replication over Metropolitan Area Networks (MAN) and Wide Area Networks (WAN) has become increasingly widespread with the mass adoption and deployment of iSCSI-based storage appliances. Despite its popularity, it is not without challenges. One of the biggest challenges facing replication over significant distances is propagation delay. Even signals travelling at nearly the speed of light incur delays over long distances, resulting in latencies that, unlike those in LAN data transfers, can no longer be ignored. These round-trip delays range from a couple of milliseconds for inter-city connections to around 80-100 ms from coast to coast, and as much as 250-300 ms for submarine transmissions across the globe. When geostationary satellites are used, the distances covered are much greater, resulting in delays of about 700 ms.

This whitepaper will examine the impact of transport distance and latency on the efficiency of network bandwidth in WAN remote replication situations over TCP/IP. It will then demonstrate how StorTrends iTX and its powerful WAN acceleration (called WDS) and deduplication technologies improve bandwidth utilization, through sophisticated compression and data reduction techniques.

Since it is founded on iSCSI technology, StorTrends relies heavily on the TCP/IP protocol, optimized for LAN and SAN environments, for serving front-end I/Os and for synchronous replication operations. For WAN replication, StorTrends iTX harnesses standards-based transport protocols to minimize the inefficiencies of TCP/IP over high-latency links and to increase data transfer speed so that it approaches the theoretical maximum of the connection. StorTrends iTX also provides additional performance gains through data reduction technologies such as compression and data deduplication.

The Effects of Packet Loss and Latency on TCP/IP Efficiency

The core technology behind iSCSI and remote replication is TCP/IP. TCP is a connection-oriented protocol that employs various congestion control algorithms, with checks and balances, to ensure reliable delivery of packets. Because the protocol depends on round-trip acknowledgments and sliding windows, the round-trip time (RTT) plays a dominant role in its performance.

A second major factor is packet loss. Over such significant transport distances, packets can be dropped due to congestion or bit errors. While recovering from these losses, TCP falls back to slow start mode, where it takes more conservative corrective actions, restricting performance even further. In essence, the throughput achieved in long-distance replication depends on two basic parameters: the link bandwidth, and the transport delays and losses.
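The dependence on window size, RTT, and loss described above can be made concrete with a rough model. The sketch below is illustrative only and is not taken from StorTrends: it assumes a fixed TCP window, and it uses the well-known Mathis approximation for loss-limited throughput, which the whitepaper itself does not cite. The function names and default values are hypothetical.

```python
import math

def window_limited_bps(window_bytes: int, rtt_s: float) -> float:
    """At most one window of data can be in flight per round trip."""
    return window_bytes * 8 / rtt_s

def loss_limited_bps(mss_bytes: int, rtt_s: float, loss_rate: float) -> float:
    """Mathis approximation: rate ~ (MSS / RTT) * sqrt(3/2) / sqrt(p)."""
    return (mss_bytes * 8 / rtt_s) * math.sqrt(1.5) / math.sqrt(loss_rate)

def tcp_throughput_bps(link_bps: float, window_bytes: int, rtt_s: float,
                       mss_bytes: int = 1460, loss_rate: float = 0.0) -> float:
    """Upper bound on throughput: the tightest of the applicable limits."""
    limits = [link_bps, window_limited_bps(window_bytes, rtt_s)]
    if loss_rate > 0:
        limits.append(loss_limited_bps(mss_bytes, rtt_s, loss_rate))
    return min(limits)

if __name__ == "__main__":
    # Assumed scenario: 45 Mbps link, 64 KiB window, 250 ms RTT, 0.1% loss.
    rate = tcp_throughput_bps(45e6, 64 * 1024, 0.250, loss_rate=0.001)
    print(f"estimated ceiling: {rate / 1e6:.2f} Mbps")
```

Under these assumptions the ceiling falls well below the link rate, and on a LAN-like RTT of 1 ms the same function returns the full link bandwidth, which is the contrast the section draws.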

The following table shows some common latency values for networks and the corresponding file transfer times using TCP/IP (iSCSI).

TCP/IP Performance

Path                                 Latency (approx.) (ms)   RTT (ms)   Transfer Time for 100MB File
City-wide Terrestrial Path (MAN)                                         12s
Transcontinental Terrestrial Path    15                       30         3m 7s
Transocean Submarine Path            75                       150        15m
Transglobal Submarine Path           125                      250        27m
Geostationary Satellite              340                      680        1h 10m

Figure 1: Effect of TCP protocol on transfer bandwidth

The effect of bandwidth is simple to understand: it sets the theoretical maximum throughput of the link. For example, a single T1 connection tops out at 1.544 Mbps, a T3 at 45 Mbps, and an OC3 at 155 Mbps, while Gigabit Ethernet (GigE) links reach the order of gigabits per second.
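To see how link rate and round-trip time combine into a transfer time, the following sketch estimates how long a 100MB file takes when the sender is limited either by the link or by a fixed TCP window. The 16 KiB window is an assumption chosen for illustration, not a figure stated in this paper, and the helper names are hypothetical.

```python
def transfer_time_s(size_bytes: float, link_bps: float, rtt_s: float,
                    window_bytes: int = 16 * 1024) -> float:
    """Seconds to send size_bytes at min(link rate, window / RTT)."""
    rate_bps = min(link_bps, window_bytes * 8 / rtt_s)
    return size_bytes * 8 / rate_bps

def format_duration(seconds: float) -> str:
    """Render seconds as 'Xh Ym Zs' (hours omitted when zero)."""
    total = round(seconds)
    hours, rest = divmod(total, 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours}h {minutes}m {secs}s" if hours else f"{minutes}m {secs}s"

if __name__ == "__main__":
    size = 100e6   # 100MB file
    t3 = 45e6      # T3 link rate in bits per second
    for rtt_ms in (30, 150, 250, 680):
        t = transfer_time_s(size, t3, rtt_ms / 1000)
        print(f"RTT {rtt_ms:>3} ms -> {format_duration(t)}")
```

With these assumed parameters the estimates come out at roughly 3, 15, 25, and 69 minutes for round-trip times of 30, 150, 250, and 680 ms, the same order as the longer transfer times listed earlier, while the T3 link on its own could move the file in under 20 seconds.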

Other factors that affect throughput are the latency and the window sizes employed by TCP. Because TCP acknowledges data end to end, throughput is inversely related to latency. This dependency places an inverse-relationship envelope over the achievable throughput, as can be seen from the figure below.

Figure 2: StorTrends Transfer Profile vs. TCP Transfer Profile

Based on the data above, it is clear that for short distances, where latencies are in the sub-millisecond range (for example, in LAN environments), the throughput is dictated entirely by the link bandwidth. The situation reverses as latencies increase. Generally speaking, the higher the link bandwidth, the more pronounced the effect of latency becomes. For example, a 1GB data transfer via satellite over a 45 Mbps link could take more than 24 hours, as opposed to the few minutes it would take if latencies were minimal. Due to the nature of TCP, replication over long distances is essentially bottlenecked by latency, and higher-bandwidth transports do very little to help. The result is a gross under-utilization of the available bandwidth.
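The under-utilization described above can be quantified with the bandwidth-delay product: the amount of data that must be in flight to keep a link full. The sketch below is a simplified illustration; the 64 KiB window is an assumed value, and the function names are hypothetical.

```python
def bandwidth_delay_product_bytes(link_bps: float, rtt_s: float) -> float:
    """Bytes that must be unacknowledged in flight to fill the link."""
    return link_bps * rtt_s / 8

def link_utilization(link_bps: float, rtt_s: float, window_bytes: int) -> float:
    """Fraction of the link a single fixed-window sender can use (0..1)."""
    bdp = bandwidth_delay_product_bytes(link_bps, rtt_s)
    return min(1.0, window_bytes / bdp)

if __name__ == "__main__":
    link = 45e6            # 45 Mbps link
    rtt = 0.680            # satellite round trip in seconds
    window = 64 * 1024     # assumed fixed window
    bdp = bandwidth_delay_product_bytes(link, rtt)
    print(f"BDP: {bdp / 1e6:.2f} MB")
    print(f"utilization: {link_utilization(link, rtt, window):.1%}")
    print(f"1GB at line rate: {1e9 * 8 / link / 60:.1f} min")
```

In this assumed scenario the link needs close to 4 MB in flight to stay full, so a 64 KiB window uses less than 2% of it, whereas at line rate the 1GB transfer would take about three minutes.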

The StorTrends iTX WDS Implementation and Transfer Acceleration

Various remedies exist to counteract the inefficiency of TCP over long distances; specific tunings or accelerated protocols can sometimes be implemented to help alleviate the problem. Another fairly common, though expensive, solution is to place pairs of dedicated appliances along the transport path to boost its throughput. As pointed out earlier, although the performance of the TCP stack can be tuned, it is a fairly accurate generalization that TCP is better suited to LAN environments than to long-haul networks.

In iSCSI storage servers such as the StorTrends 1U and 3U storage appliances from American Megatrends, it is much more desirable to have the TCP stack optimized for the LAN environment, since the server serves I/Os over the iSCSI interconnect to the storage network (SAN), which is LAN-like in behavior. For replication over long distances, StorTrends iTX uses an intelligent combination of several standard IP transport protocols and a unique feature called WDS (WAN-optimization Data Services) to achieve close to the theoretical maximum throughput determined by the bandwidth, allowing for the few packets that are dropped. The figures below show the effect of packet loss and delays on long-distance replication using iSCSI, and the typical corresponding improvements achieved with WDS in the Asynchronous Replication module found in StorTrends iTX:

Figure 3: Effect of Latency and Losses in TCP

Figure 4: Effect of Latency and Losses in StorTrends Transfer Protocol

WDS in StorTrends iTX: Compression and Data Deduplication for WAN Asynchronous Replication

AMI offers these WDS features as an integral part of the StorTrends iTX stack; they can be enabled or disabled through feature licensing. WDS is designed to provide efficient asynchronous replication through data reduction (compression and data deduplication) and WAN acceleration features.

In order to minimize the amount of data transported over WAN links, StorTrends iTX offers two types of data reduction features: compression and data deduplication. Whereas compression is a pervasive and therefore well understood technology, deduplication is much more of an emerging concept, and is considered to be something of a Holy Grail for storage appliances. This section will explain the data deduplication implementation of StorTrends iTX for WAN replication.

Like many of its competitors, StorTrends employs delta-snapshot technology to transfer only the data that has changed between passes in order to reduce the amount of data transported. In addition to this, StorTrends also employs an intelligent data-dedupe algorithm to filter out some of the previously transmitted data from earlier passes. This technique provides significant benefits for database applications, in particular.

At its most basic level, StorTrends iTX uses delta-snapshot replication technology to implement asynchronous replication; the figure below illustrates the concept.

Figure 5: The Delta Snapshot Replication Technology behind Asynchronous Replication in StorTrends iTX

Once the original volume is replicated to the remote site, regular, closely spaced snapshots are scheduled on the primary production server; the replica server is then updated one snapshot at a time. There are two methods of implementing snapshots: one is the conventional Copy-On-Write (COW) and the other is a more advanced technique called Redirect-On-Write (ROW). AMI implements ROW in order to offer the best performance and space efficiency. Irrespective of the method in use, however, these implementations track and maintain the difference data in granular regions called chunks. Typically, these chunks are 64KB in size.
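The chunk-granular tracking described above can be pictured with a small sketch. It is a conceptual illustration only, not the StorTrends implementation: the `DeltaTracker` class and its methods are hypothetical, and it simply records which 64KB chunks were touched by writes since the previous snapshot.

```python
CHUNK_SIZE = 64 * 1024  # chunk granularity quoted in the text

class DeltaTracker:
    """Tracks which chunks were written since the last snapshot."""

    def __init__(self) -> None:
        self._dirty: set[int] = set()

    def record_write(self, offset: int, length: int) -> None:
        """Mark every chunk overlapped by a write as changed."""
        if length <= 0:
            return
        first = offset // CHUNK_SIZE
        last = (offset + length - 1) // CHUNK_SIZE
        self._dirty.update(range(first, last + 1))

    def take_snapshot(self) -> list[int]:
        """Return the delta chunk list and start tracking a new interval."""
        delta = sorted(self._dirty)
        self._dirty = set()
        return delta

if __name__ == "__main__":
    tracker = DeltaTracker()
    tracker.record_write(0, 8192)               # small write inside chunk 0
    tracker.record_write(CHUNK_SIZE - 1, 2)     # write straddling chunks 0 and 1
    print(tracker.take_snapshot())              # chunks to replicate this pass
```

Only the chunks in the returned list need to be considered for the next replication pass; everything else is unchanged since the previous snapshot.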

The storage stacks also maintain a list of delta chunks indicating the difference in contents between successive snapshots. This permits replication of only the delta chunks in successive snapshots. Normally, during write operations in a snapshot-enabled environment, sub-chunk-sized writes result in read-modify-write operations, meaning that any such small I/Os will generate a large number of sub-chunk granular duplicates. A close observation of the types of I/Os generated by various application servers reveals that these small, duplicate I/Os make up the majority. For example, SQL servers generate random I/Os of 8KB blocks, and Exchange servers generate random I/Os of 4KB blocks, so even when chunk-sized I/Os are replicated, they may carry multiple packets of duplicated data. To remedy this, the deduplication feature in StorTrends iTX uses a very I/O-efficient algorithm to detect duplicate data and eliminate it, further reducing the amount of data transported during asynchronous replication. The screenshot below demonstrates the amount of data reduction achieved in a typical real-world asynchronous replication scenario.

Figure 6: Deduplication and WAN Transfer Acceleration Management in StorTrends iTX

Naturally, this data reduction comes at the expense of additional read I/Os and subsequent byte-level comparisons. This extra work is performed in order to eliminate duplicate data that has already been transported in previous cycles. Note that data deduplication is employed here only to optimize data transfer over the WAN. In such replication environments it is very beneficial: it cuts replication time considerably, resulting in faster replications and thereby a better Recovery Point Objective (RPO).
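One way to picture the read-and-compare step is a sub-chunk comparison between the previously replicated version of a chunk and its current contents. The sketch below is a guess at the general idea under stated assumptions, not the algorithm StorTrends actually uses: the 4KB comparison block size and the `changed_blocks` helper are hypothetical.

```python
def changed_blocks(previous: bytes, current: bytes,
                   block_size: int = 4096) -> list[tuple[int, bytes]]:
    """Return (offset, data) for each sub-chunk block that differs.

    Blocks identical to the previously replicated version are skipped,
    so only genuinely new data needs to cross the WAN.
    """
    delta = []
    for offset in range(0, len(current), block_size):
        new_block = current[offset:offset + block_size]
        if previous[offset:offset + block_size] != new_block:
            delta.append((offset, new_block))
    return delta

if __name__ == "__main__":
    chunk_size = 64 * 1024
    old_chunk = bytes(chunk_size)               # version already on the replica
    new_chunk = bytearray(old_chunk)
    new_chunk[8192:8192 + 4096] = b"\x01" * 4096  # one 4KB write dirties the chunk
    delta = changed_blocks(old_chunk, bytes(new_chunk))
    sent = sum(len(data) for _, data in delta)
    print(f"sending {sent} of {chunk_size} bytes")
```

In this example a single 4KB write marks the whole 64KB chunk as changed, but the comparison reduces what has to be transmitted to the one block that actually differs, at the cost of reading and comparing the older copy.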

Additional WAN Assists in Asynchronous Replication with StorTrends

In addition to the optimizations described above, StorTrends supports compressing and encrypting data before it is released on the WAN; compression reduces the net size of the data transferred over these more expensive links. StorTrends supports various compression depths, including an adaptive compression mode in which the depth of compression is determined by an analysis of the current and statistical load on the system.
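As a rough illustration of adaptive compression, the sketch below picks a compression level from the current system load before compressing a payload. It uses Python's standard `zlib` module as a stand-in codec; the load thresholds, the `choose_level` policy, and the function names are assumptions made for this example and do not describe the StorTrends implementation.

```python
import zlib

def choose_level(cpu_load: float) -> int:
    """Map system load (0.0 to 1.0) to a zlib level; busier means lighter."""
    if cpu_load >= 0.9:
        return 0   # level 0 stores data without compressing it
    if cpu_load >= 0.6:
        return 1   # fastest compression
    if cpu_load >= 0.3:
        return 6   # balanced
    return 9       # best compression when the system is mostly idle

def compress_for_wan(payload: bytes, cpu_load: float) -> bytes:
    """Compress a replication payload at a depth suited to current load."""
    return zlib.compress(payload, choose_level(cpu_load))

if __name__ == "__main__":
    payload = b"replicated block data " * 4096
    for load in (0.1, 0.7, 0.95):
        out = compress_for_wan(payload, load)
        print(f"load {load:.2f}: level {choose_level(load)}, "
              f"{len(payload)} -> {len(out)} bytes")
```

The design trade-off is the one the text describes: deeper compression saves more WAN bandwidth but costs CPU time, so the depth is relaxed when the system is busy serving other work.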

Figure 7: ManageTrends Replication Management Screen

To summarize, StorTrends utilizes several sophisticated technologies and techniques to optimize the speed of long distance WAN connections, including:

- Frontline I/Os optimized for SAN
- Replication I/Os optimized for WAN
- One-to-One, One-to-many, Many-to-one, Round-Robin replication schemas
- Link-level I/O throttling
- Multiple levels of compression, including adaptive compression
- Encryption
- Delta transfers and data deduplication
- Automatic validation of replicated snapshots
- Extensive instrumentation and monitoring

In short, the innovative replication stack found in StorTrends iTX is designed for high-performance remote replication over wide-area networks. It uses an intelligent mix of standards-based transport protocols to overcome the inefficiencies and high latencies of TCP in WANs, providing excellent bandwidth utilization and data transfer speeds that approach the theoretical maximum of the connection. Additional performance gains are made through data reduction technologies such as compression and data deduplication. TCP, optimized for LAN and SAN environments, is used for serving front-end I/Os and for synchronous replication operations.

Why AMI?

AMI offers a wide array of disaster recovery and high availability solutions for your business needs. We provide services that range from storage needs analysis to the design and implementation of a custom disaster recovery solution. We can help your business plan for when things are at their worst while reducing the cost and complexity of your storage environment. For more information on AMI StorTrends solutions, visit www.StorTrends.com, email us at sales@amiindia.co.in, or call us at 1-800-U-Buy-AMI.

Whitepaper

American Megatrends India Pvt. Ltd


Kumaran Nagar, Semmenchery Off Rajiv Gandhi Salai (OMR) Chennai - 600 119 India

Sales & Product Information sales@amiindia.co.in | [91] 44-66540922


Technical Support

support@amiindia.co.in

www.amiindia.co.in
