
White Paper

EMC VPLEX: ELEMENTS OF PERFORMANCE


AND TESTING BEST PRACTICES DEFINED

Abstract
This white paper describes the performance characteristics, metrics, and
testing considerations for the EMC VPLEX family of products. Its intent is to
refine performance expectations, to review key planning considerations, and to
describe testing best practices for VPLEX Local, Metro, and Geo. This paper is
not intended for planning exceptional situations. When configuring for
performance, remember that every environment is unique and actual results
may vary.

Copyright 2012 EMC Corporation. All Rights Reserved.


EMC believes the information in this publication is accurate as of its publication
date. The information is subject to change without notice.
The information in this publication is provided "as is." EMC Corporation makes no
representations or warranties of any kind with respect to the information in this
publication, and specifically disclaims implied warranties of merchantability or
fitness for a particular purpose.
Use, copying, and distribution of any EMC software described in this publication
requires an applicable software license.
For the most up-to-date listing of EMC product names, see EMC Corporation
Trademarks on EMC.com.
All other trademarks used herein are the property of their respective owners.
Part Number h11299


Table of Contents
Executive summary
Audience
Introduction
Transaction-based workloads
Throughput-based workloads
The Role of Applications in Determining Acceptable Performance
Section 1: VPLEX Architecture
VPLEX hardware platform
VPLEX GeoSynchrony 5.1 System Configuration Limits
Read/Write IO Limits
Section 2: VPLEX Performance Highlights
Understanding VPLEX overhead
Native vs. VPLEX Local Performance
OLTP Workload Example
Native vs. VPLEX Metro Performance
Native vs. VPLEX Geo Performance
Section 3: Hosts and Front-end Connectivity
Host Environment
Host Paths
Host to director connectivity
Host Path Monitoring
Policy based path monitoring
VPLEX Real-time GUI Performance Monitoring Stats
Remote Monitoring and Scripting
Watch4Net
Perpetual Logs
Benchmarking Applications, Tools and Utilities
Section 4: Application Performance Considerations
High Transaction environments
High Throughput environments
VPLEX Device Geometry
Section 5: Back-end Performance Considerations
Storage Considerations
Storage Array Block Size
SAN Architecture for Storage Array Connectivity
Active/Active Arrays
Active/Passive Arrays
Additional Array Considerations
Automated Storage Tiering
Performance Metrics for Back-end IO
Back-end Connectivity Summary
Section 6: SAN and WAN Performance
SAN Redundancy
Redundancy through Cisco VSANs or Brocade Virtual Fabrics
Planning SAN Capacity
ISL Considerations
FC WAN Sizing
Brocade switches
IP WAN Settings VPLEX Metro-IP and VPLEX Geo
Areas to Check to Avoid SAN and WAN Performance Issues
Section 7: VPLEX Performance Checklist
Section 8: Benchmarking
Tips when running the benchmarks
Take a scientific approach when testing
Typical Benchmarking Mistakes
Real World Testing Mistake Example
Understand the Metamorphosis of an IO
VPLEX Performance Benchmarking Guidelines
IOMeter Example
Conclusion
References


Executive summary
For several years, businesses have relied on traditional physical storage to meet their
information needs. Developments such as server virtualization and the growth of
multiple sites throughout a business's network have placed new demands on how
storage is managed and how information is accessed.
To keep pace with these new requirements, storage must evolve to deliver new
methods of freeing data from a physical device. Storage must be able to connect
to virtual environments and still provide automation, integration with existing
infrastructure, consumption on demand, cost efficiency, availability, and security.
The EMC VPLEX family is the next generation solution for information mobility and
access within, across, and between data centers. It is the first platform in the world
that delivers both Local and Distributed Federation.

- Local Federation provides the transparent cooperation of physical elements
  within a site.
- Distributed Federation extends access between two locations across distance.

VPLEX is a solution for federating both EMC and non-EMC storage.

VPLEX completely changes the way IT is managed and delivered, particularly when
deployed with server virtualization. By enabling new models for operating and
managing IT, resources can be federated, pooled, and made to cooperate through
the stack, with the ability to dynamically move applications and data across
geographies and service providers. The VPLEX family breaks down technology silos
and enables IT to be delivered as a service.
VPLEX resides at the storage layer, where optimal performance is vital. This document
focuses on key considerations for VPLEX performance, performance metrics, and
testing best practices. The information provided is based on VPLEX Release 5.1. The
subject is advanced, and it is assumed the reader has a basic understanding of
VPLEX technology. For additional information on VPLEX best practices and detailed
technologies, see the References section for a list of relevant documents and
hyperlinks.

Audience
This white paper is intended for storage, network, and system administrators who
desire a deeper understanding of the performance aspects of EMC VPLEX, testing
best practices, and planning considerations for the future growth of their VPLEX
virtual storage environments. This document outlines how VPLEX technology
interacts with existing storage environments, how existing environments might
impact VPLEX technology, and how to apply best practices through basic guidelines
and troubleshooting techniques as uncovered by EMC VPLEX performance
engineering and EMC field experience.


Introduction
Before we begin, it is important to explain why we are providing guidance on
interpreting the performance data in this document. The business unit that
delivered VPLEX to the market has a guiding policy of being as open and
transparent as possible with EMC field resources, partners, and customers.
We believe that all modern storage products have limitations and constraints, and
therefore the most successful and satisfied customers are those who fully understand
the constraints and limitations of the technology they intend to implement.
This approach leads our customers to success because there are fewer surprises and
product expectations match reality. Our intent is to be as candid as possible.
We ask readers to use the information to understand the performance aspects of
VPLEX implementations and to make better-informed judgments about nominal
VPLEX capabilities, rather than use the document as the final word on all VPLEX
performance (as competitors may be tempted to do). If you have questions about
any of the content in this document, please contact your local EMC Sales or
Technical representatives.
When considering a given solution from any vendor, there will undoubtedly be
strengths and weaknesses to weigh. There will always be a specific, unique IO profile
that poses challenges in servicing the application load; the key is to understand the
overall IO mix and how it will impact real production workloads. It is misleading to
extrapolate a specific IO profile as representative of an entire environment unless
the environment homogeneously shares a single IO profile.
Let's begin our discussion of VPLEX performance by considering performance in
general terms. What is good performance anyway? Performance can be considered
a measure of the amount of work being accomplished in a specific time period.
Storage resource performance is frequently quoted in terms of IOPS (IOs per second)
and/or throughput (MB/s). While IOPS and throughput are both measures of
performance, they are not synonymous and are actually inversely related: if you
want high IOPS, you typically get low MB/s. This is driven in large part by the size of
the IO buffers used by each storage product and the time it takes to load and
unload each of them. This produces a relationship between IOPS and throughput as
shown in Figure 1 below.


Figure 1
For example, an application requests 1,000 IOPS at an 8KB IO size, which equals 8
MB/s of throughput (1,000 IOPS x 8KB = 8MB/s). On a 200MB/s Fibre Channel link,
8 MB/s doesn't intuitively appear to be good performance (8MB/s is only 4%
utilization of the Fibre Channel bus) if you're thinking of performance in terms of
MB/s. However, if the application is requesting 1,000 IOPS and the storage device is
supplying 1,000 IOPS without queuing (queue depth = 0), then the storage resource
is servicing the application's needs without delay, meaning the performance is
actually good.
Conversely, if a video streaming application is sequentially reading data with a 64MB
IO size and 3 concurrent streams, it would realize 192MB/s aggregate performance
across the same 200MB/s Fibre Channel connection (64MB x 3 streams = 192MB/s).
While there's no doubt that 192 MB/s is good performance (96% utilization of the
Fibre Channel bus), it's equally important to note we're only supporting 3 IOPS in this
application environment.
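The arithmetic in these two examples can be sketched as follows. The function name and figures are illustrative only; they simply restate the examples above and are not part of any VPLEX tool:

```python
def throughput_mb_s(iops, io_size_mb):
    # Throughput is simply the IO rate multiplied by the IO size.
    return iops * io_size_mb

# OLTP-style example: 1,000 IOPS at 8KB (0.008 MB) = 8 MB/s.
oltp_mb_s = throughput_mb_s(1000, 8 / 1000)

# Video-streaming example: 3 streams of 64MB IOs = 3 IOPS but 192 MB/s.
video_mb_s = throughput_mb_s(3, 64)

# Utilization of a 200MB/s Fibre Channel link in each case.
oltp_util = oltp_mb_s / 200    # about 4%
video_util = video_mb_s / 200  # about 96%
```

The same link can therefore be nearly idle in MB/s terms while fully satisfying an IOPS-bound application.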
These examples illustrate the context-dependent nature of performance; that is,
performance depends upon what you are trying to accomplish (MB/s or IOPS).
Knowing and understanding how your host servers and applications handle their IO
workload is the key to success with VPLEX performance optimization. In general,
there are two types of IO workloads:

- Transaction-based
- Throughput-based

As shown in Figure 1, these workloads are quite different in terms of their objectives
and must be planned for in specific ways. We can describe these two types of
workloads in the following ways:

- A workload characterized by a high number of IOs per second (IOPS) is called a
  transaction-based workload.
- A workload characterized by a large amount of data transferred, normally with
  large IO sizes, is called a throughput-based workload.

What should you expect to see from each type of workload?


Transaction-based workloads
High-performance transaction-based environments cannot typically be built using
low-cost, and consequently low-IOPS, back-end arrays. Transaction processing rates
are heavily dependent on the competency of the back-end array. Ultimately, the
number of back-end physical drives available within a storage system to process
host IO becomes the limiting factor. In general, transaction-based processing is
limited by the physical spindle count and individual disk IO capabilities of the array
rather than the size of the connectivity pipes, the transfer buffer sizes, or the internal
bandwidth of the array.
Another common characteristic of transaction-intense applications is that they use a
small, random data-block pattern to transfer data. With this type of pattern, having
more back-end drives enables more host IO to be processed simultaneously.
When transaction-based (or any) workloads are random and write-biased, the
efficacy of read cache is diminished, as misses must be retrieved from physical disk.
In many cases, slow transaction performance can be traced directly to hot files that
cause a bottleneck on a critical component (such as a single physical disk). This
situation can occur even when the overall storage subsystem sees a fairly light
workload. When bottlenecks occur, they can be extremely difficult and frustrating
to resolve.
Throughput-based workloads
Throughput-based workloads are seen with applications or processes that require
massive amounts of data to be transmitted in as few IOs as possible. Generally, these
workloads use large sequential blocks to reduce the impact of disk latency.
Applications such as satellite imagery, high-performance computing (HPC), video
streaming, seismic research, surveillance, and the like fit into this category.
Relatively speaking, fewer physical drives are needed to reach adequate IO
performance compared to transaction-based workloads. In a throughput-based
environment, read operations make use of the storage subsystem cache to
pre-fetch large chunks of data at a time to improve overall performance.
Throughput rates are heavily dependent on the connectivity pipe size, IO buffer size,
and the storage subsystem's internal bandwidth. Modern storage subsystems with
high-bandwidth internal busses are able to reach higher throughput numbers.
The Role of Applications in Determining Acceptable Performance
Regardless of the capability of a given storage frame, it cannot provide more IO
than the application requests. Ultimately, the application is the real performance
driver. For example, say an application generates requests for 2,500 IOPS from a
storage resource: is there any performance difference at the application level
between a storage frame capable of delivering 2,500 IOPS and another capable of
delivering 10,000 IOPS? Obviously, the answer is a resounding no. Either resource is
capable of servicing the 2,500 IOPS requirement. It's like traveling in a car at 65 mph
on the freeway: if everyone obeys the 65 mph speed limit, then any car that goes
the speed limit will get you there in the same amount of time, whether it's a Chevy
Lumina or a Ferrari Enzo.

The point is that performance is very much dependent on point of view. Ultimately,
performance can be considered good if the application is not waiting on the
storage frame. Understanding the application's performance requirements and
providing compatible storage resources ensures maximum performance and
application productivity. It goes without saying: always be cautious about
performance claims and spec-sheet speeds and feeds. If the environment that
generated the claims does not closely approximate your environment, you may
very well not see the same performance results.


Section 1: VPLEX Architecture

VPLEX hardware platform
A VPLEX system with GeoSynchrony 5.1 is composed of one or two VPLEX clusters:
one cluster for VPLEX Local systems and two clusters for VPLEX Metro and VPLEX Geo
systems. These clusters provide the VPLEX AccessAnywhere capabilities.
Each VPLEX cluster consists of:
- A VPLEX Management Console
- One, two, or four engines
- One standby power supply for each engine

In configurations with more than one engine, the cluster also contains:
- A pair of Fibre Channel switches
- An uninterruptible power supply for each Fibre Channel switch

As you add engines, you add cache, front-end, back-end, and WAN-COM
connectivity capacity, as indicated in Table 2 below.

VPLEX GeoSynchrony 5.1 System Configuration Limits

Capacity                             Local                Metro                Geo
Maximum virtualized capacity         No Known Limit       No Known Limit       No Known Limit
Maximum virtual volumes              8,000                16,000               16,000
Maximum storage elements             8,000                16,000               16,000
Minimum/maximum virtual volume size  100MB/32TB           100MB/32TB           100MB/32TB
Minimum/maximum storage volume size  No VPLEX Limit/32TB  No VPLEX Limit/32TB  No VPLEX Limit/32TB
Number of host initiators            1600                 1600                 800

Table 1


Engine Type  Model   Engines  Cache [GB]  FC Ports  Announced
VPLEX VS1    Single  1        64          32        10-May-10
VPLEX VS1    Dual    2        128         64        10-May-10
VPLEX VS1    Quad    4        256         128       10-May-10
VPLEX VS2    Single  1        72          16        23-May-11
VPLEX VS2    Dual    2        144         32        23-May-11
VPLEX VS2    Quad    4        288         64        23-May-11

Table 2
Table 1 and Table 2 show the current limits and hardware specifications for the VPLEX
VS1 and VS2 hardware versions. Although the VS2 engines have half as many ports
as VS1, actual system throughput is improved because each VS2 port can supply full
line rate (8 Gb/s) of throughput, whereas the VS1 ports are oversubscribed.
Several of the VPLEX maximums are determined by the limits of the externally
connected physical storage frames and are therefore unlimited in terms of VPLEX
itself. The latest configuration limits are published in the GeoSynchrony 5.1 Release
Notes, which are available on Powerlink.EMC.com.

Read/Write IO Limits

VPLEX with GeoSynchrony 5.1 can be configured with one to four engines per cluster.
For a fully configured four-engine VS2 VPLEX cluster, the maximums work out as
follows:
- IOPS: up to 3 million
- Throughput: up to 23.2 gigabytes per second (GB/s)
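As a rough planning sketch, one can prorate these published four-engine maximums by engine count. Linear scaling across engines is an assumption here, not a published per-engine specification:

```python
# Published maximums for a fully configured four-engine VS2 cluster.
QUAD_ENGINE_MAX_IOPS = 3_000_000
QUAD_ENGINE_MAX_GB_S = 23.2

def estimated_cluster_max(engines):
    # Assumption: IOPS and bandwidth scale roughly linearly with
    # engine count; only the four-engine figures are published.
    scale = engines / 4
    return QUAD_ENGINE_MAX_IOPS * scale, QUAD_ENGINE_MAX_GB_S * scale

dual_iops, dual_gb_s = estimated_cluster_max(2)  # dual-engine estimate
```

Treat such estimates as sizing starting points only; always validate against the Release Notes and your own testing.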


Section 2: VPLEX Performance Highlights


Understanding VPLEX overhead
Properly understanding VPLEX performance capabilities and dependencies will
greatly benefit many of the design decisions for your VPLEX environment. In general,
with VPLEX's large per-director cache, host reads are comparable to, and in some
cases better than, native array performance. Writes, on the other hand, follow
VPLEX's write-through caching model on VPLEX Local and Metro and will inevitably
have slightly higher latency than native.
There are many factors involved in determining if and when latency is added by
VPLEX. Factors such as host IO size, IO type, VPLEX internal queue congestion, SAN
congestion, and array congestion all play a role in whether or not latency is
introduced by VPLEX. In real-world production environments, however, what do all
of these factors add up to? Let's take a look at the average latency impact. We can
break these latencies into the following three categories based on the type of host
IO and whether or not the data resides in VPLEX cache:
For VPLEX read cache hits, the VPLEX read response time typically ranges from
85-150 microseconds, depending on the overall system load and IO size. In many
cases this is less than the latency of the native storage array and can actually be
considered a reduction in latency. For local devices, VPLEX adds a small amount of
latency to each VPLEX cache read miss and each write operation:

- Typical read miss: about 200-400 microseconds
- Typical write: about 200-600 microseconds

These latency values will vary slightly depending on the factors mentioned earlier;
for example, large-block IO requests may be broken up into smaller parts (based on
VPLEX or individual array capabilities) and then written serially in smaller pieces to
the storage array. Further, if you are comparing native array to VPLEX performance,
the result will be heavily dependent on the overall load on the array. If the array is
under cache pressure, adding VPLEX to the environment can actually improve read
performance: the additive cache from VPLEX may offload a portion of read IO from
the array, thereby reducing average IO latency. Additional discussion on this topic is
provided in the subsequent host and storage sections.
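To see how cache hits and miss overhead combine into what the host observes, here is a simple blended-latency sketch. The 30% hit ratio and 500-microsecond array read latency are hypothetical inputs; only the 120-microsecond hit figure and the roughly 300-microsecond miss overhead come from the ranges quoted above:

```python
def blended_read_latency_us(hit_ratio, hit_us, array_read_us, miss_overhead_us):
    # A read miss pays the array's latency plus the VPLEX miss overhead;
    # a hit is served from VPLEX cache alone.
    miss_us = array_read_us + miss_overhead_us
    return hit_ratio * hit_us + (1 - hit_ratio) * miss_us

# Hypothetical workload: 30% VPLEX cache hits against 500us array reads.
avg_us = blended_read_latency_us(0.30, 120, 500, 300)
```

The higher the hit ratio, the more the fast VPLEX cache hits offset the per-miss overhead, which is why read-heavy workloads can see a net latency improvement.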
Native vs. VPLEX Local Performance
Native performance tests use a direct connection between a host and storage
array. VPLEX Local testing inserts VPLEX in the path between the host and array.
4KB Random Read Hit
Random read hits are tested over a working set size that fits entirely into array or
VPLEX cache.


Figure 2 VPLEX Random 4KB Read Hit Native vs. VPLEX

This test reveals that VPLEX performs faster than the storage frame used in this test.
Here, the lower the latency the better. Though this may not be typical of every
situation, it does illustrate a case where VPLEX read hits can improve overall storage
environment performance.
4KB Random Read Miss
Random read misses bypass VPLEX cache (but can be a cache hit on the array)
because of their large working set size. Here we see the typical VPLEX read miss
overhead of about 300 microseconds.

Figure 3 Random Read Miss Native vs. VPLEX


Figure 4 Random 4KB Write Native vs. VPLEX

4KB IOs are used in our tests to illustrate a high number of IO operations. In Figure 4
we see a VPLEX write overhead of about 350 microseconds. This test reveals that
VPLEX adds a measurable but relatively small impact to each IO. Here, the lower the
latency the better.

Figure 5 128 KB Sequential Write

128 KB IOs are used in our tests to illustrate high-throughput (bandwidth) operations.
In Figure 5 we see an average VPLEX write overhead of about 500 microseconds.
This test reveals that VPLEX adds a measurable but relatively small impact to each
IO. Here, the lower the latency the better.


OLTP Workload Example

Our synthetic OLTP-heavy benchmark workload (called OLTP2HW below) is a mix of
8KB and 64KB IO request sizes, with a 1:1 ratio of reads and writes.

In this test, the application demonstrates slightly more host latency with VPLEX
compared to native. The additional latency overhead is about 600 microseconds.
Native vs. VPLEX Metro Performance
VPLEX Metro write performance is highly dependent upon the WAN round-trip-time
(RTT) latency. The general rule of thumb for Metro systems is that host write IO
latency will be approximately 1x-3x the WAN round-trip time. While some may view
this as an overly negative impact, we would caution against this view and highlight
the following points. First, VPLEX Metro uses a synchronous cache model and is
therefore subject to the laws of physics when it comes to data replication. In order
to provide a true active-active storage presentation, it is incumbent on VPLEX to
provide a consistent and up-to-date view of data at all times. Second, many
workloads have a considerable read component, so the net WAN latency impact
can be masked by the improvements in read latency provided by VPLEX read
cache. This is another reason we recommend a thorough understanding of the real
application workload, to ensure that any testing done is applicable to the workload
and environment you are attempting to validate.
In comparing VPLEX Metro to native array performance, it is important to ensure that
the native array testing is also synchronously replicating data across an equal
distance and WAN link as VPLEX. Comparing Metro write performance to a single
array that is not doing synchronous replication is an apples-to-bananas comparison.
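The 1x-3x rule of thumb above can be expressed as a quick estimator. This is a sketch of the guideline only; real write latency also depends on array and SAN behavior:

```python
def metro_write_latency_range_ms(wan_rtt_ms):
    # Rule of thumb: VPLEX Metro host write latency lands between
    # 1x and 3x the WAN round-trip time.
    return (1.0 * wan_rtt_ms, 3.0 * wan_rtt_ms)

# Example: a 5 ms RTT link suggests roughly 5-15 ms host write latency.
low_ms, high_ms = metro_write_latency_range_ms(5.0)
```

Running such numbers before deployment helps set realistic expectations for write-heavy applications on a given inter-site link.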


Figure 6 Metro WAN Latency Impact

Figure 6 illustrates the impact of WAN latency on VPLEX Metro. As WAN latency is
added, there is a corresponding impact on write IO. The OLTP (green) lines show a
simulated OLTP application (8KB and 64KB IOs with roughly equal read and write IO)
and the overall impact of WAN latency with VPLEX Metro.
Administrators of write throughput-intensive applications such as backups need to
be aware of the maximum available WAN bandwidth between VPLEX clusters. If the
write workload exceeds the WAN link bandwidth, response time will spike, and other
applications may also see severe performance degradation.
Native vs. VPLEX Geo Performance
Given the fundamental architectural differences of VPLEX Geo from Local and
Metro, namely its write-back caching model and asynchronous data replication, it is
even more difficult to accurately compare native array performance to VPLEX Geo
performance.
In short, VPLEX Geo performance will be limited by the available drain rate, which is
a function of the available WAN bandwidth and storage-array performance at
each cluster. If a VPLEX director's incoming host write rate exceeds what the
outgoing write rate can achieve, there will inevitably be push-back, or throttling, on
the host, which will cause host per-operation write latency to rise. Ensure the WAN
and arrays are properly configured and the various VPLEX Geo-related settings are
tuned properly.
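The drain-rate constraint described above can be sketched as a simple headroom check. The numbers and function name are illustrative only:

```python
def geo_write_headroom_mb_s(host_write_mb_s, wan_mb_s, remote_array_mb_s):
    # The sustainable drain rate is bounded by the slower of the WAN
    # link and the remote array; negative headroom means incoming host
    # writes outpace the drain and will be throttled, raising host
    # per-operation write latency.
    drain_rate = min(wan_mb_s, remote_array_mb_s)
    return drain_rate - host_write_mb_s

# Example: 300 MB/s of host writes against a 250 MB/s WAN and a
# 400 MB/s remote array -- the WAN is the bottleneck.
headroom = geo_write_headroom_mb_s(300, 250, 400)
```

A sustained negative headroom is the signal to either add WAN bandwidth, improve back-end array performance, or reduce the write workload.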


Section 3: Hosts and Front-end Connectivity

There are certain baseline configuration recommendations when using VPLEX to
provision virtual storage to hosts. These considerations include how many paths
through the fabric are allocated to the host, how many host ports to use, how to
spread the hosts across VPLEX directors, logical unit number (LUN) mapping, and the
correct size of virtual volumes to use. Maximizing connectivity and following best
practices for inline devices such as VPLEX will optimize performance for your virtual
storage environment.
Host Environment
When configuring a new host for VPLEX, the first step is to determine the EMC-
supported operating system, driver, firmware, and host bus adapters in order to
prevent unexpected problems due to untested configurations. Consult the VPLEX
Simple Support Matrix prior to bringing a new host into VPLEX for recommended
levels. The VPLEX support matrix is available at http://powerlink.emc.com or at
Support.EMC.com.
In addition, verify that all host path management software is enabled and operating
correctly.
Path management applications should be set as follows:

Operating System               Recommended Policy
Hewlett-Packard HP-UX          PVLinks set to Failover
VMware ESX                     NMP policy set to Fixed
IBM AIX                        Native MPIO set to Round Robin
All Linux                      MPIO set to Round Robin Load Balancing
All Veritas DMP                Balanced Policy with partitionsize set to 16MB
All platforms using PowerPath  EMC PowerPath policy set to Adaptive

Table 3
Note: The most current and detailed information for each host OS is provided in the
corresponding Host Connectivity Guides on Powerlink at: http://powerlink.emc.com
Host Paths
EMC recommends that you limit the total number of paths that the multipathing
software on each host manages to four, even though the maximum supported is
considerably more. Following this rule helps prevent many issues that might
otherwise occur and leads to improved performance.
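A quick way to sanity-check a design against the four-path recommendation is to count zoned initiator-target pairs. This is a hypothetical helper, not a VPLEX utility:

```python
def total_host_paths(hba_ports, fe_ports_per_hba_port):
    # Each initiator-target zoning pair is one path that the host's
    # multipathing software must manage.
    return hba_ports * fe_ports_per_hba_port

# Example: 2 HBA ports, each zoned to 2 VPLEX front-end ports.
paths = total_host_paths(2, 2)
within_recommendation = paths <= 4
```

A 2x2 layout hits the recommended four paths exactly while still spreading connectivity across fabrics and directors.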


The major reasons to limit the number of paths available to a host from VPLEX are
error recovery, path failover, and path failback. These are also important during the
VPLEX non-disruptive upgrade (NDU) process. The overall time for a host to handle
path loss is significantly reduced when you keep the total number of host paths to
the reasonable number required to provide the needed aggregate performance
and availability. Additionally, the consumption of resources within the host is
reduced each time you remove a path from the path management software.
During NDU, there are intervals where only half of the VPLEX directors and associated
front-end ports (on first and second upgraders, respectively) are available on the
front-end fabric. NDU front-end high availability checks ensure that the front-end
fabric is resilient against single points of failure during the NDU, even when either the
first or second upgrader front-end ports are offline.
From a host pathing perspective there are two types of configurations:
High availability configurations: VPLEX configurations that include sufficient
redundancy to avoid data unavailability during NDU, even in the event of
front-end fabric or port failures. The NDU high-availability pre-checks succeed
for these configurations.
Minimal configurations: VPLEX configurations that do not include sufficient
redundancy to avoid data unavailability in the event of front-end fabric or
port failures.
For minimal configurations, the NDU high-availability pre-checks fail. Instead, the
pre-checks for these configurations must be performed manually. This can take a
considerable amount of time in large environments, and in general EMC believes that
the benefit of lower port count requirements does not justify the increased
operational impact.
High availability configurations
VPLEX Non-Disruptive Upgrade (NDU) automated pre-checks verify that VPLEX is
resilient in the event of failures while the NDU is in progress.
In high availability configurations:
In dual- or quad-engine systems, each view has front-end target ports across
two or more engines in the first upgrader set (A directors), and two or more
engines in the second upgrader set (B directors).
In single-engine systems, each initiator port in a view has a path to at least one
front-end target port in the first upgrader (A director) and the second upgrader
(B director). (See Figure 7.)
There are two variants of front-end configurations to consider for high availability that
will pass the high-availability pre-checks:


An optimal configuration for a single-engine cluster is one in which there are
redundant paths (dotted and solid lines in Figure 7) between both front-end
fabrics and both directors. In addition to protecting against failures of an
initiator port, HBA, front-end switch, VPLEX front-end port, or director, these
redundant paths also protect against front-end port failures during NDU.
A high-availability configuration for a single-engine cluster is one in which
there is a single path between the front-end fabrics and the directors (solid
lines in Figure 7). Like the optimal configuration described above, a
high-availability configuration protects against failures of initiator ports,
HBAs, front-end switches, and directors during NDU. A high-availability
configuration provides protection against front-end port failures during NDU.

Figure 7 High Availability Front-end Configuration (single-engine)


Minimal configurations
A minimal configuration is not considered highly available, and the automated NDU
pre-check will not pass. For a single-engine cluster, a minimal configuration is one
in which each fabric has a single path to a single director. Minimal configurations
support failover, but have no redundancy during NDU.
Strict high-availability pre-checks for front-end and back-end connectivity have
been implemented in VPLEX 5.1 and higher. If the high-availability pre-check detects
one or more storage views do not conform to the front-end high availability
requirements of NDU, it will specify which storage views are in question. For example:
Error: Storage view /clusters/cluster-2/exports/storage-views/lsca3195_win2k3
does not have target ports from two or more directors in the second upgrader
set at cluster-2.


Update these views to satisfy the high availability requirement. Ensure the storage
view in question has front-end target ports across two or more engines in the first
upgrader set (A directors) and the second upgrader set (B directors).
Figure 8 illustrates a single-engine cluster with a minimal front-end configuration:

Figure 8 Minimal Front-end Configuration


For minimal configurations, the automated high-availability pre-checks fail, and the
checks must be performed manually. Refer to the VPLEX Procedure Generator
documentation on upgrading for the necessary manual pre-checks, commands, and
options for minimal configurations.
Host to director connectivity
VPLEX caching algorithms send cache coherency messages between directors via
the internal Fibre Channel networks or via the built-in CMI bus contained within
each engine chassis. The CMI bus is a low-latency, high-speed communication bus
that allows two directors within the same engine to communicate directly. When two
or more VPLEX engines are available, the recommendation is to connect each host to
two directors, ensuring that each host is connected to an A director and a B
director on different VPLEX engines. There are exceptions to the two-director
connectivity rule. For example, a server with a heavy OLTP workload and four or
more adapter ports may need connectivity to at least four, and possibly eight,
directors. The key takeaway is that VPLEX system performance under normal loads
will be virtually equivalent whether you use two directors in the same engine or two
directors in different engines. The benefits from the added availability tip the scale
in favor of connecting hosts to one director on each of two different engines. In
general, consuming just two directors per host provides the best overall scalability
and balance of resource consumption for your VPLEX system.

Figure 9 Current VPLEX NDU Enforced Single and Dual Engine Connectivity

Note: For code releases through VPLEX GeoSynchrony 5.1 Patch 3, the
non-disruptive upgrade pre-check strictly enforces connecting hosts across four
directors in dual- and quad-engine VPLEX systems. This restriction will likely be
relaxed in future releases to better align with the reasoning presented above.
When considering attaching a host to more than two directors in a dual-engine or
quad-engine VPLEX configuration, both the performance and the scalability of the
VPLEX complex should be considered. Though this may contradict what the
automated NDU pre-check will accept, this guidance is given for the following
reasons:
Utilizing more than two directors per host increases cache update traffic
among the directors.
Utilizing more than two directors per host decreases the probability of read
cache hits on the ingress director.
Based on the reliability and availability characteristics of VPLEX hardware,
attaching a host to just two directors provides a high availability configuration
without unnecessarily impacting the performance and scalability of the solution.


General best practice considerations for multipath software:
With EMC PowerPath, the pathing policy should be set to Adaptive mode.
Avoid connecting to multiple A directors and multiple B directors with a single
host or host cluster.
Avoid a round-robin policy that alternates on every single I/O. Alternating
each I/O across directors is inefficient for cache-coherency traffic and defeats
the VPLEX director's read-ahead cache pre-fetch. When using a round-robin
policy, set the burst or stream count to something greater than one so more
consecutive I/Os are sent to the same director before another director is
chosen.
For Veritas DMP, the balanced policy with a partitionsize value of 16MB is
optimal for VPLEX director cache-coherency.
Separate latency-sensitive applications from each other, preferably using
independent directors and independent front-end ports.
For VPLEX Metro-FC cross-connect solutions, be aware of which path(s) the
hosts are using, and configure the hosts to prefer the local paths over the
remote paths.
Host Path Monitoring
Host IO monitoring tools are available for virtually every open systems OS
supported by VPLEX. In particular, EMC PowerPath provides a consistent set of
commands and outputs across operating systems such as AIX, Linux, VMware, and
Windows.
Individual host path performance can be monitored using the powermt display
command:
Example 1 - Windows path monitoring with PowerPath for Windows
powermt display dev=all
Pseudo name=harddisk12
Invista ID=FNM00103600####
Logical device ID=6000144000000010A001ED129296E028
state=alive; policy=ADaptive; priority=0; queued-IOs=0
==============================================================================
---------------- Host ---------------   - Stor -   -- I/O Path -  -- Stats ---
###  HW Path                I/O Paths   Interf.    Mode    State  Q-IOs Errors
==============================================================================
   4 port4\path0\tgt0\lun10 c4t0d10     04         active  alive      8      0

Also, latency by path is available with the powermt display latency command:
powermt display latency
Invista logical device count=86
==============================================================================
----- Host Bus Adapters ---------   ------ Storage System ------  - Latency (us) -
###  HW Path        ID              Interface                     Current    Max
==============================================================================
   3 port3\path0    FNM0010360####  01                                  0      0
   4 port4\path0    FNM0010360####  04                                  0      0

Policy based path monitoring


There are many situations in which a host can lose one or more paths to storage. If
the problem is isolated to that one host, it might go unnoticed until an upgrade to
VPLEX or when a SAN event occurs that causes the remaining paths to go offline,
such as a switch failure, or routine switch maintenance. This can lead to poor
performance or, worse yet, a data unavailability event, which can seriously affect
your business. To prevent this loss-of-access event from happening, many users have
found it useful to implement automated path monitoring using path management
software like EMC Powerpath or Veritas DMP or to create custom scripts that issue
path status commands and then parse the output for specific key words that then
trigger further script action.
For EMC PowerPath you can turn on path latency monitoring and define a threshold
beyond which a specific path is no longer used.
Example 2 - Automated latency monitoring with PowerPath
powermt set path_latency_monitor=on|off
powermt set path_latency_threshold=<seconds>

It is also possible to set an autorestore policy with PowerPath so that any paths
that drop offline are brought back online if they are healthy.
Example 3 - Auto restore paths
powermt set periodic_autorestore=on|off

Each of these commands can provide hosts with self-monitoring and self-recovery
for the greatest resiliency and availability possible on each host. These commands
can be combined with a scheduler, such as cron, and a notification system, such as
e-mail, to notify SAN administrators and system administrators if the number of
paths to the system changes.
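As an illustrative sketch (not from the original paper), the parsing core of such a cron script could look like the helper below; the "alive" keyword match and the expected path count of four are assumptions to adapt to your own powermt output:

```shell
# Hypothetical parsing helper for a cron-driven path monitor.
# count_alive reads `powermt display dev=all` output on stdin and
# prints the number of lines reporting a path in the "alive" state.
count_alive() {
    grep -c "alive"
}

# Intended usage from cron (commented out; requires PowerPath installed):
#   ALIVE=$(powermt display dev=all | count_alive)
#   [ "$ALIVE" -ge 4 ] || echo "path count dropped to $ALIVE" \
#       | mail -s "VPLEX path alert" storage-admins@example.com
```

A real script would also exclude the per-device "state=alive" summary lines from the count; treat this purely as a starting point.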
For Veritas DMP there are recovery settings that control how often a path will be
retried after failure. If these are not the default settings on your hosts, you should set
the following on any hosts using DMP:


Example 4 - DMP Tuning Parameters


vxdmpadm setattr enclosure emc-vplex0 recoveryoption=throttle iotimeout=30
vxdmpadm setattr enclosure emc-vplex0 dmp_lun_retry_timeout=30

The values shown in Example 4 specify a 30-second retry period for handling
transient errors. When all paths to a disk fail (such as during a VPLEX NDU),
certain paths may have only a temporary failure and are likely to be restored soon.
If IOs are not retried for a non-zero period of time, the IO may be failed by the
application layer.
The DMP tunable dmp_lun_retry_timeout can be used for more robust handling of
such transient errors. If the tunable is set to a non-zero value, I/Os to a disk with all
failed paths will be retried until the specified dmp_lun_retry_timeout interval or
until the I/O succeeds on one of the paths, whichever happens first. The default
value of the tunable is 0, which means that the paths are probed only once.
VPLEX Real-time GUI Performance Monitoring Stats
The Unisphere for VPLEX UI contains several key performance statistics for host
performance and overall health. They can be found on the Performance Dashboard
tab and can be added to the default performance charts that are displayed. Using
the data provided, the VPLEX administrator can quickly determine the source of
performance problems within an environment. Figure 10, below, shows the
performance data included in the GeoSynchrony 5.1 version of VPLEX.

Figure 10 VPLEX Real-time Performance Data

Unisphere for VPLEX Performance Data Details


Back-end Latency - time in microseconds for IO to complete with the physical
storage frames.
CPU Utilization - percent busy of the VPLEX directors in each engine. 50% or
less is considered ideal.
Front-end Aborts - SCSI aborts received from hosts connected to VPLEX
front-end ports. 0 is ideal.
Front-end Bandwidth - total IO as measured in MB per second from hosts to
VPLEX.
Front-end Latency - time in microseconds for IO to complete between VPLEX
and hosts. Very dependent on back-end array latency.
Front-end Throughput - total IO as measured in IO per second.
Rebuild Status - completion status of local and remote device rebuild jobs.
Subpage Writes - number of writes smaller than 4KB. This statistic has greatly
diminished importance for VPLEX Local and Metro systems running
GeoSynchrony 5.0.1 and later code. For VPLEX Geo, this is still a very relevant
metric.
WAN Link Usage - IO between VPLEX clusters as measured in MB per second.
This chart can be further subdivided into system, rebuild, and distributed
volume write activity.
WAN Link Performance - IO between VPLEX clusters as measured in IO per
second.

Figure 11 Unisphere for VPLEX Performance Dashboard

Figure 11 shows the VPLEX Performance Dashboard, which provides continuous
real-time data for 10 key performance metrics over a continuously updated
5-minute window. Each of the charts can be added, moved, or removed from the
display to meet a wide variety of monitoring needs.


Remote Monitoring and Scripting


VPLEX has a RESTful API and supports SNMP monitoring via third-party SNMP
monitoring tools. The VPLEX MIB is available on the VPLEX Management Server in
the following directory:
/opt/emc/VPlex/mibs
Today there is a limited set of performance categories for SNMP. Using the REST
API to access VPLEX allows virtually any command that can be run locally on a
VPLEX to be run remotely. This enables integration with Microsoft PowerShell and
with VMware vCOPS. Refer to the VPLEX 5.1 Administrator's Guide, Performance
Monitoring chapter, for more details.
Watch4Net
Comprehensive historical and trending data, along with custom dashboard views,
will be provided for VPLEX by Watch4Net. Watch4Net support is expected shortly
after this document is published.
Perpetual Logs
VPLEX maintains perpetual logs of over 50 different performance statistics on the
VPLEX management server. There are 10 of these files for each VPLEX director,
and they roll at the 10 MB mark. The perpetual log files contain comma-separated
data that can easily be imported into MS Excel for aggregation, reporting, and
historical trending analysis. The files are located in the /var/log/VPlex/cli
directory as shown below:
service@RD-GEO-2-1:/var/log/VPlex/cli> ll | grep PERPETUAL
-rw-r--r-- 1 service users  3374442 2012-11-17 05:41 director-1-1-A_PERPETUAL_vplex_sys_perf_mon.log
-rw-r--r-- 1 service users 10485855 2012-11-14 18:38 director-1-1-A_PERPETUAL_vplex_sys_perf_mon.log.1
-rw-r--r-- 1 service users 10485864 2012-09-10 01:25 director-1-1-A_PERPETUAL_vplex_sys_perf_mon.log.10
-rw-r--r-- 1 service users 10486060 2012-11-07 03:33 director-1-1-A_PERPETUAL_vplex_sys_perf_mon.log.2
-rw-r--r-- 1 service users 10485825 2012-10-30 12:14 director-1-1-A_PERPETUAL_vplex_sys_perf_mon.log.3
-rw-r--r-- 1 service users 10485922 2012-10-22 21:38 director-1-1-A_PERPETUAL_vplex_sys_perf_mon.log.4
-rw-r--r-- 1 service users 10486009 2012-10-15 12:43 director-1-1-A_PERPETUAL_vplex_sys_perf_mon.log.5
-rw-r--r-- 1 service users 10486000 2012-10-08 05:20 director-1-1-A_PERPETUAL_vplex_sys_perf_mon.log.6
-rw-r--r-- 1 service users 10486298 2012-10-01 03:31 director-1-1-A_PERPETUAL_vplex_sys_perf_mon.log.7
-rw-r--r-- 1 service users 10486207 2012-09-24 02:24 director-1-1-A_PERPETUAL_vplex_sys_perf_mon.log.8
-rw-r--r-- 1 service users 10485969 2012-09-17 00:24 director-1-1-A_PERPETUAL_vplex_sys_perf_mon.log.9
-rw-r--r-- 1 service users  2467450 2012-11-17 05:41 director-1-1-B_PERPETUAL_vplex_sys_perf_mon.log
-rw-r--r-- 1 service users 10485770 2012-11-15 11:39 director-1-1-B_PERPETUAL_vplex_sys_perf_mon.log.1
-rw-r--r-- 1 service users 10486183 2012-09-07 09:02 director-1-1-B_PERPETUAL_vplex_sys_perf_mon.log.10
-rw-r--r-- 1 service users 10485816 2012-11-08 01:10 director-1-1-B_PERPETUAL_vplex_sys_perf_mon.log.2
-rw-r--r-- 1 service users 10485977 2012-10-31 13:49 director-1-1-B_PERPETUAL_vplex_sys_perf_mon.log.3
-rw-r--r-- 1 service users 10486275 2012-10-24 01:25 director-1-1-B_PERPETUAL_vplex_sys_perf_mon.log.4
-rw-r--r-- 1 service users 10485793 2012-10-16 07:27 director-1-1-B_PERPETUAL_vplex_sys_perf_mon.log.5
-rw-r--r-- 1 service users 10486230 2012-10-08 07:56 director-1-1-B_PERPETUAL_vplex_sys_perf_mon.log.6
-rw-r--r-- 1 service users 10485762 2012-09-30 05:37 director-1-1-B_PERPETUAL_vplex_sys_perf_mon.log.7
-rw-r--r-- 1 service users 10485807 2012-09-22 01:53 director-1-1-B_PERPETUAL_vplex_sys_perf_mon.log.8
-rw-r--r-- 1 service users 10486077 2012-09-14 14:31 director-1-1-B_PERPETUAL_vplex_sys_perf_mon.log.9
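Because the perpetual logs are plain comma-separated text, simple command-line tools can pre-filter them before a spreadsheet import. The sketch below is illustrative only; the field positions are assumptions, so check the header line of your own log for the columns you actually need:

```shell
# Hypothetical helper: keep only the first and fifth comma-separated
# fields of a perpetual log read on stdin (adjust -f to the columns
# identified in your log's header line).
trim_perpetual() {
    cut -d, -f1,5
}

# Intended usage on the management server (commented out):
#   trim_perpetual < director-1-1-A_PERPETUAL_vplex_sys_perf_mon.log > stats.csv
```

The trimmed file imports into Excel far faster than the full 50-plus-column log.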

Benchmarking Applications, Tools and Utilities


There are good benchmarking applications and not-so-good ones; testers use tools
they can trust. The following section reviews several benchmarking tools that are
useful (and not so useful) when testing VPLEX performance in your environment.
Good benchmarks
IOMeter
IOMeter is one of the most popular public domain benchmarking tools among
storage vendors, and is primarily a Windows-based tool. It is available from
http://www.iometer.org. In the Benchmarking section of this document we provide
some examples of IOMeter settings that are used to simulate specific workloads for
testing.
The popularity of IOMeter holds true at EMC. Many internal teams, including the
VPLEX Performance Engineering team, use IOMeter and are familiar with its
behavior, input parameters, and output. That being said, the IO patterns, queue
depths, and other tunables can be misused and distorted. It's important to maintain
healthy skepticism toward any benchmark numbers you see until you know the full
details of the settings and overall testing parameters.
Warning: It is not recommended to run the IO client (dynamo) on Linux. Dynamo
does not appear to function completely as expected there. It is best to use Windows
clients with Dynamo.
IOZone
IOZone has broad operating system support, but is primarily file system based. It is
available for free from http://www.iozone.org.


iorate
Initially implemented by EMC, iorate has been released to the public as open source.
Available for free from http://iorate.org/
fio
fio is an I/O tool meant to be used both for benchmarking and for stress/hardware
verification. It has support for 13 different types of I/O engines, I/O priorities (for
newer Linux kernels), rated I/O, forked or threaded jobs, and much more. It can
work on block devices as well as files. fio spawns a number of threads or processes
doing a particular type of I/O action as specified by the user. The typical use of fio
is to write a job file matching the I/O load one wants to simulate. Available for free
from http://freecode.com/projects/fio
Additional info: http://linux.die.net/man/1/fio
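As a sketch of what such a job file might look like for an OLTP-style load, the commands below write and reference a hypothetical job definition; the device path, block size, and queue depth are illustrative assumptions, not tuning recommendations from this paper:

```shell
# Write a hypothetical OLTP-like fio job file (all values are
# illustrative assumptions; point filename at the volume under test).
cat > /tmp/oltp.fio <<'EOF'
[global]
ioengine=libaio
direct=1
time_based
runtime=60

[oltp-randread]
rw=randread
bs=8k
iodepth=16
filename=/dev/mapper/example_vplex_lun
EOF

# Run it (requires fio installed):
#   fio /tmp/oltp.fio
```

Comparing the reported IOPS and latency for the same job file with and without VPLEX in the path is one simple way to isolate the virtualization layer's contribution.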
Poor benchmarks
In general, any single outstanding I/O or filesystem focused benchmarks are not
good choices.
Unix dd test
dd is completely single-threaded or single outstanding I/O.
The dreaded dd test:
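For reference, a typical invocation of this kind looks like the sketch below; the target path and sizes are placeholders, not values from the original paper:

```shell
# A classic single-stream dd "benchmark": one 1 MB write in flight at a
# time, flushed at the end. It exercises neither queue depth nor
# parallelism, which is exactly why its numbers are misleading.
dd if=/dev/zero of=/tmp/ddtest.bin bs=1M count=16 conv=fdatasync
```

Time it if you like, but a single host thread writing through a filesystem says almost nothing about array or VPLEX limits.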

Bonnie
Bonnie was designed to test UNIX file systems and is over 20 years old.
Bst5 or "Bart's stuff test"
Bst5 is single outstanding I/O. http://www.nu2.nu/bst/
File copy commands


These are single-threaded and single outstanding I/O. They use the host memory file
cache, so it is not known when or if a particular file IO hits storage. It is also not
clear what I/O size the filesystem will happen to choose, so it might be reading and
writing with inefficient IO sizes. In theory, a multiple file copy benchmark could be
constructed; however, it requires careful parallelism and multiple independent
source and target locations.
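If a copy-based test must be used anyway, the parallelism has to be explicit. A minimal sketch (the function name and directory layout are hypothetical) might look like:

```shell
# Hypothetical helper: copy every file in the source directory to the
# target directory with one cp per file running in the background,
# then wait for all copies to finish. For meaningful timing, source
# and target must sit on independent devices.
parallel_copy() {
    src=$1
    dst=$2
    for f in "$src"/*; do
        cp "$f" "$dst/" &
    done
    wait
}
```

Even then, the individual read and write latencies remain invisible, which is the core objection raised above.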
It is best to separate reads and writes in performance testing. For example, a slow
performing read source device could penalize a fast write target device. The entire
copy test would show up as slow. Without detailed metrics into the read and write
times (not always gathered in a simple "how long did it take" file copy test), the
wrong conclusions can easily be drawn about the storage solution.
Note: See Section 8: Benchmarking for specific testing recommendations and
example results.

Benchmarking Applications
The list of possible application level benchmarking programs is numerous. Some that
are fairly well known and understood are:
Microsoft Exchange - JetStress
Microsoft SQL Server - SQLIO
Oracle - SwingBench, DataPump, or export/import commands
VMware - VMbench, VMmark - virtual machine benchmarking tools
These particular benchmarking applications are potentially one step closer to a
production application environment; however, like all artificially crafted
benchmarks, they suffer from the fact that they are likely not representative of your
environment.
Engage EMC's application experts when you are interested in a specific application
benchmark. Note that these benchmarks exercise more of the application and host
IO stack, so they may not be representative of the underlying storage devices and
can be affected by many factors outside the storage layer.
Application Testing
Testing with the actual application is the best way to measure storage performance.
A production-like environment that can stress storage limits is desirable.
Measure the performance of different solutions:
Compare OLTP response times.
Compare batch run times.
Compare sustained streaming rates.
Operating system and application tools can help monitor storage performance.


Production Testing
Ultimately, there must be a level of trust in the solution and in deploying the
solution in your production environment. When you are considering moving an
application into production, there are risks and rewards.
Risk vs. Reward:
Risk: taking an unsupported, well-traveled evaluation unit and putting it in a
production environment could compromise application availability and
expose unexpected system problems.
Reward: sometimes this is the only way to know for certain that storage
performance is acceptable for an application.
To minimize the risk side of the equation, consider a staged approach whereby
non-business-critical applications are virtualized with VPLEX first. This is similar to
the approach recommended by VMware in the early stages of host virtualization.
Go for the low-hanging fruit first, and then closely monitor performance throughout
the process.


Section 4: Application Performance Considerations


When gathering data for planning from the application side, it is important to first
consider the workload type for the application. If multiple applications or workload
types will share the system, you need to know the workload type of each
application, and if the applications are mixed (transaction-based and
throughput-based), which workload will be the most critical. Many environments
have a mix of transaction-based and throughput-based workloads; generally,
transaction performance is considered the most critical. However, in some
environments, for example a backup media server environment, the streaming
high-throughput workload of the backup itself is the critical part of the operation.
The backup database, although a transaction-centered workload, is the less critical
workload.
High Transaction environments
So, what are the traits of transaction-based and high-throughput applications? The
following sections explain these traits in more detail.
Applications with high transaction workloads are better known as Online
Transaction Processing (OLTP) systems. Examples of these systems are database
servers and mail servers. If you have a database, you tune the server type
parameters, as well as the database's logical devices, to meet the needs of the
database application. If the host server has a secondary role of performing nightly
backups for the business, you may choose to use a different set of logical devices,
tuned for high throughput, for the best backup performance.
As mentioned in the introduction, you can expect to see a high number of
transactions and a fairly small IO size in OLTP environments. Different databases use
different IO sizes for their logs, and these logs vary from vendor to vendor. In all
cases, the logs are generally high write-oriented workloads. For table spaces, most
databases use between a 4 KB and a 16 KB IO size. In certain applications, larger
chunks (for example, 64 KB) will be moved to host application cache memory for
processing. VPLEX currently has a fixed 4KB page size and IO in this size range is not
appreciably impacted by the introduction of VPLEX.
Understanding how your application is going to handle its IO is critical to laying out
the data properly at the storage layer. In many cases, the table space is a large file
made up of small blocks of data records. The records are normally accessed using
small IOs of a random nature, which can result in a high cache miss ratio. It is
important to ensure the back-end storage array is able to keep up with the IOPS
requirement as well.
Another point to consider is whether the typical IO is a read or a write. In many OLTP
environments, there is generally a mix of about 70% reads and 30% writes. However,
the transaction logs of a database application have a much higher write ratio and,


therefore, perform better if they are isolated onto dedicated storage volumes.
VPLEX's large read cache benefits the read portion of such a workload, but the log
volumes will likely not benefit from cache and therefore need underlying storage
volumes that can keep pace with the write workload.
High Throughput environments
With high-throughput workloads, you have fewer transactions, but much larger IO
per transaction. IO sizes of 128 KB or greater are normal, and these IOs are
generally sequential in nature. Applications that typify this type of workload are
imaging, video servers, seismic processing, high performance computing (HPC),
and backup servers.
When running applications that use larger IO sizes, it is important to be aware of the
extra IO impact VPLEX adds by breaking up write IOs larger than 128KB. For
example, a single 1MB host write requires VPLEX to issue 8 x 128KB writes to the
back-end storage frame. When practical, the maximum host and application IO size
and allocation units for high-throughput systems should be set to 128KB or less. An
increase of the maximum back-end write size to 1MB is expected in the next major
VPLEX code release.
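The split factor is simple arithmetic, sketched here for the 1 MB example above:

```shell
# How many back-end writes does one large host write become when VPLEX
# splits writes at the 128 KB boundary?
HOST_WRITE=$((1024 * 1024))       # 1 MB host write, in bytes
BE_LIMIT=$((128 * 1024))          # 128 KB back-end write limit, in bytes
echo $((HOST_WRITE / BE_LIMIT))   # prints 8
```

The same arithmetic shows why aligning application IO sizes to 128KB or less avoids the splitting entirely.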
Best practice: Database table spaces, journals, and logs should not be placed on
virtual volumes that reside on extents from the same backend storage volume.

VPLEX Device Geometry


The typical consumption methodology for back-end storage for the majority of
applications is to create 1:1 mapped (physical:virtual) devices. If striped (raid-0) or
concatenated (raid-c) geometries are used, then VPLEX devices should be
constructed using storage volumes of similar raid protection and performance
characteristics. This general-purpose rule is applicable to most VPLEX back-end
storage configurations and simplifies the device geometry decision for storage
administrators. This type of physical storage consumption model enables the
continued use of array-based snap, clone, and remote replication technologies (for
example, MirrorView, SnapView, SRDF, and TimeFinder).
It is also important to consider where the failure domains are in the backend storage
frames and take them into consideration when creating complex device
geometries. In this context, we define a failure domain as the set of storage
elements that will be affected by the loss of a single storage component. We
strongly advise against the creation of VPLEX devices consisting of striped or
concatenated extents across different backend storage frames. Using different sets
of back-end storage frames makes the failure domain wider and makes it more
susceptible to being affected by a single failure. This can also unbalance the I/O,
and will limit the performance of those striped volumes to the slowest back-end


device. It is acceptable to use striped (raid-0) volumes with applications and storage
frames that do not already stripe their data across physical disks.


Section 5: Back-end Performance Considerations


Storage Considerations
It is of great importance that the selected storage subsystem model is able to
support the required IO workload. Besides availability concerns, adequate
performance must be ensured to meet the requirements of the applications, which
includes evaluating the physical drive type (EFD, FC, SATA) used and whether the
internal architecture of the storage subsystem is sufficient. For example, high-speed
15K rpm Fibre Channel drives or Enterprise Flash drives are typically selected for
use with transaction-based (OLTP) workloads. As for the subsystem architecture,
newer generations of storage subsystems have larger internal caches, higher
bandwidth busses, and more powerful storage controllers.
Storage Array Block Size
Today, VPLEX supports communicating with back-end storage arrays that advertise
a 512-byte block size. Within VPLEX, the block-size parameter that you see for a
storage volume is not the underlying storage array's supported block size, but
rather the VPLEX-associated 4KB block size. Each and every volume reported by
VPLEX today will show 4KB. This has the implications for host-to-VPLEX IO size
that were discussed in Section 4.

Note: VPLEX can and does read/write to back-end arrays at I/O sizes as
small as 512 bytes as of GeoSynchrony 5.0.

SAN Architecture for Storage Array Connectivity


For back-end (storage) connectivity, the recommended SAN topology consists of
redundant (A/B) fabrics. Though EMC does support direct storage-to-VPLEX
connectivity, this practice is extremely limited in terms of cost efficiency, flexibility,
and scalability. Direct connect is intended for proof of concept, test, development,
and/or specific sites that have only a single storage frame. Direct connect allows for
back-end connectivity while reducing the number of required switch ports, but as
mentioned earlier, the sacrifices in terms of scale and flexibility make this a fairly
uncommon connectivity scheme. Sites with multiple arrays, existing SAN fabrics, or
large implementations should plan to utilize dual redundant SAN connectivity, as it
provides the most robust overall solution.
Note: Direct connect applies only to backend connectivity. Front-end (direct host
to VPLEX) connect is not supported.


Active/Active Arrays
With active/active storage platforms such as EMC VMAX and Symmetrix, Hitachi VSP,
IBM XIV, and HP 3PAR, each director in a VPLEX cluster must have a minimum of two
paths to every local back-end storage array and to every storage volume presented
to VPLEX. Each VPLEX director requires physical connections to the back-end
storage across dual fabrics, with redundant paths to every back-end storage
array on both fabrics. Anything less creates a single point of failure at the
director level, a condition referred to as asymmetric back-end visibility, which
can lead to rebuilds that continuously start, restart, and never finish. This is
especially detrimental when VPLEX is mirroring across local devices (RAID-1) or
across Distributed Devices (Distributed RAID-1).
Each storage array should have redundant controllers connected to dual fabrics,
with each VPLEX director having a minimum of two ports connected to the
back-end storage arrays through the dual fabrics (required).
VPLEX allows a maximum of four back-end paths per director to a given LUN. Four
is also the optimum, because each director load balances across all four paths
to the storage volume. It is a maximum because additional paths to a given
storage volume would create excess Initiator-Target-LUN (ITL) nexuses per
storage volume, which can result in the inability to claim or work with the
device. Exceeding four paths per storage volume per director can also lead to
elongated back-end path failure resolution, NDU pre-check failures, and
decreased scalability.
High quantities of storage volumes (i.e., 1000+ storage volumes) or entire
arrays provisioned to VPLEX should be divided into appropriately sized groups
(i.e., masking views or storage groups) and presented from the array to VPLEX
via groups of four array ports per VPLEX director, so as not to exceed the
four-active-paths-per-director limitation. As an example, following the rule of
four active paths per storage volume per director (also referred to as ITLs), a
quad-engine VPLEX cluster could have each director connected to four array
ports dedicated to that director. In other words, a quad-engine VPLEX cluster
could connect to 32 ports on a single array for access to a single device
presented through all 32 ports and still meet the connectivity rule of four
ITLs per director. This can be accomplished using only two ports per back-end
I/O module, leaving the other two ports for access to another set of volumes
over the same or different array ports.
Appropriateness would be judged based on the planned total IO workload for the
group of LUNs and the limitations of the physical storage array. For example, storage
arrays often have limits around the number of LUNs per storage port, storage group,
or masking view they can have.
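The ITL arithmetic above can be checked with a quick sketch. This is an illustrative helper, not a VPLEX tool; the function and dictionary names are assumptions made for the example:

```python
# Hypothetical helper: validate a provisioning plan against the
# four-active-paths-per-storage-volume-per-director (ITL) guideline.

ENGINES_PER_CLUSTER = {"single": 1, "dual": 2, "quad": 4}

def itl_per_volume(cluster_size: str, paths_per_director: int) -> int:
    """Total ITL nexuses per storage volume across the whole cluster."""
    directors = ENGINES_PER_CLUSTER[cluster_size] * 2  # two directors per engine
    return directors * paths_per_director

# A quad-engine cluster at the four-path limit: 8 directors x 4 paths = 32 ITLs,
# matching the 32 array ports cited in the text.
assert itl_per_volume("quad", 4) == 32
# Exceeding four paths per director breaks the guideline:
assert itl_per_volume("quad", 8) > 32
```

The same arithmetic applies when sizing array port groups: each group of four array ports serves exactly one director's four paths.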
Maximum performance, environment wide, is achieved by balancing I/O workload
across the maximum number of ports on an array while staying within the ITL
limits. Performance is not based on a single host but on the overall impact of
all resources


being utilized. Proper balancing of all available resources provides the best overall
performance.
Storage Best Practices: Create separate port groups within the storage frame for
each of the logical path groups that have been established. Spread each group
of four ports across storage array engines for redundancy. Mask devices to allow
access to the appropriate VPLEX initiators for both port groups.
Figure 12 shows the physical connectivity from a quad-engine VPLEX cluster to a
hex-engine VMAX array.

Figure 12 Active/Active Storage to VPLEX Connectivity


Similar considerations apply to other active/active arrays as well as ALUA
arrays. Follow the array best practices for all arrays, including third-party
arrays.
Devices should be provisioned in digestible chunks and provisioned for access
through specific FA ports. The devices within a device grouping should restrict
access to four specific FA ports for a VPLEX A-director port group and a
different set of FA ports for a VPLEX B-director port group.
The VPLEX initiators (back-end ports) on a single director should be spread
across array engines to increase HA and redundancy. The array should be
configured into initiator groups such that each VPLEX director acts as a single
host with four paths.
This could mean four physical paths or four logical paths per VPLEX director,
depending on port availability and whether VPLEX is attached to dual fabrics or
to more than two fabrics.
For the example above, the basic limits on the VMAX are:
Initiator Groups (IG) (HBAs): maximum of 32 WWNs per IG; maximum of 8192 IGs on
a VMAX; port flags are set on the IG; an individual WWN can belong to only one
IG. Cascaded Initiator Groups have other IGs (rather than WWNs) as members.
Port Groups (PG) (FA ports): maximum of 512 PGs; the ACLX flag must be enabled
on the port; ports may belong to more than one PG.
Storage Groups (SG) (LUNs / Symm Devs): maximum of 4096 Symm Devs per SG; a
Symm Dev may belong to more than one SG; maximum of 8192 SGs on a VMAX.
A Masking View consists of an Initiator Group, a Port Group, and a Storage
Group.
We have divided the back-end ports of the VPLEX into two groups, allowing us to
create four masking views on the VMAX. Ports FC00 and FC01 of both directors are
zoned to two FAs each on the array. The WWNs of these ports are the members of
the first Initiator Group and are part of Masking View 1. The Initiator Group
created with this group of WWNs then becomes a member of a second Initiator
Group, which in turn becomes a member of a second Masking View. This is called
Cascaded Initiator Groups. The same is repeated for ports FC02 and FC03,
placing them in Masking Views 3 and 4. This is only one example of attaching to
the VMAX; other possibilities are allowed as long as the rules are followed.
VPLEX virtual volumes should be added to masking views containing initiators
from an A director and initiators from a B director. This translates to a
single host with two initiators connected to dual fabrics and having four paths
into two VPLEX directors. VPLEX would access that host's storage volumes via
eight FAs on the array through two VPLEX directors (an A director and a B
director). The A director and B director each see four different FAs across at
least two VMAX engines, if available. This is an optimal configuration that
spreads a single host's I/O over the maximum number of array ports. Additional
hosts will attach to different pairs of VPLEX directors in a dual-engine or
quad-engine VPLEX cluster. This helps spread the overall environment I/O
workload over more switch, VPLEX, and array resources. This


would allow for the greatest possible balancing of all resources resulting in the best
possible environment performance.

Figure 13 ITL per Storage Volume


Figure 13 shows the ITLs per Storage Volume. In this example the VPLEX Cluster is a
single engine and is connected to an active/active array with four paths per Storage
Volume per Director giving us a total of eight logical paths. The Show ITLs button
displays the ports on the VPLEX director from which the paths originate and which FA
they are connected to.
Active/Passive Arrays
When using a storage array operating in active/passive mode, each director needs
logical (zoning and masking) and physical connectivity to both the active and
the passive storage controller. This ensures that VPLEX does not lose access to
storage volumes if the active controller fails or is restarted.
Additionally, arrays like the CLARiiON have limitations on the size of initiator
or storage groups. It may be necessary to have multiple storage groups to
accommodate provisioning storage to VPLEX. Follow the logical and physical
connectivity guidelines described earlier in this section.


Figure 14 Active / Passive Storage to VPLEX Connectivity


Figure 14 (above) shows an example of EMC CLARiiON connectivity to VPLEX. Array
connections go to Fabric-A with SPA0 & SPB0 (even ports) and Fabric-B with SPA3 &
SPB3 (odd ports) for dual fabric redundancy.
For Active/Passive storage frames:
Each storage controller (processor) must be connected to each of the two
redundant fabrics
Each storage controller (processor) must have at least two connections to
every VPLEX director.
ALUA mode support is available on some active/passive storage arrays and allows
VPLEX connectivity as if they were active/active. Please refer to your
individual storage vendor's configuration literature to determine whether ALUA
mode is supported. See the active/active section (above) for more details on
how to connect to an ALUA array. The number of ITLs shown in Figure 14 for an
active/passive array is double the count of an active/active array, as it also
contains the logical paths for the passive SP on the array.
Note: For VNX and CX4 storage array connectivity to VPLEX, ensure that mode 4
(ALUA) or mode 1 (non-ALUA) is set during VPLEX initiator registration, prior to
device presentation. Don't try to change it after devices are already presented.


Additional Array Considerations


Some arrays such as EMC Symmetrix use in-band gatekeeper devices for array
management and require a direct path from hosts that are used to run the
management tools. A direct path to storage (outside of the VPLEX IO stack) should
be solely for the purposes of in-band management. Storage volumes provisioned to
hosts by VPLEX should never simultaneously be connected directly from the array to
the host; otherwise there is a high probability of data corruption.
Storage volumes provided by arrays must have a capacity that is a multiple of
4 KB. Any volume whose capacity is not evenly divisible by 4 KB will not show up
in the list of available volumes to be claimed. To consume such volumes with
VPLEX, they must first be migrated to volumes that are a multiple of 4 KB, or
grown to a size that is divisible by 4 KB and then presented to VPLEX. Another
alternative is to use a host-based copy utility to move the data to a new and
unused VPLEX device.
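A quick way to pre-check candidate volumes against the 4 KB rule is simple modular arithmetic. This is an illustrative sketch, not a VPLEX utility; the function names are assumptions:

```python
BLOCK = 4096  # VPLEX presents all volumes with a 4 KB block size

def claimable(capacity_bytes: int) -> bool:
    """A storage volume can be claimed only if its capacity is a 4 KB multiple."""
    return capacity_bytes % BLOCK == 0

def round_up_to_4k(capacity_bytes: int) -> int:
    """Smallest 4 KB-aligned size the volume could be grown to."""
    return -(-capacity_bytes // BLOCK) * BLOCK  # ceiling division

assert claimable(100 * 2**30)             # 100 GiB is 4 KB-aligned
assert not claimable(100 * 2**30 + 512)   # an odd 512-byte tail blocks claiming
assert round_up_to_4k(100 * 2**30 + 512) == 100 * 2**30 + 4096
```

Running such a check against an array's reported capacities before presentation avoids discovering unclaimable volumes after zoning is complete.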
Automated Storage Tiering
Arrays such as EMC VMAX and VNX offer policy-based automated movement of
data between different performance tiers of disks (EFD, FC, and SATA). VPLEX and its
read cache are complementary to these technologies. The back-end storage array
will see a less read intensive workload than it otherwise would, which will free the
storage platform to focus on servicing write IO and other tasks it may be responsible
for.
Performance Metrics for Back-end IO
The Unisphere for VPLEX UI contains several key performance statistics for array
performance and overall health. They can be found on the Performance Dashboard
tab and can be added to the default performance charts that are displayed.
Using the data provided, the VPLEX administrator can quickly determine the
source of performance problems within an environment. Figure 15, below, shows
the performance data included in the GeoSynchrony 5.1 version of VPLEX.

Figure 15 VPLEX Realtime Performance Statistics


Unisphere for VPLEX Performance Data Details
Back-end Latency: time in microseconds for I/O to complete with physical
storage frames.


CPU Utilization: % busy of the VPLEX directors in each engine. 50% or less is
considered ideal.
Front-end Aborts: SCSI aborts received from hosts connected to VPLEX front-end
ports. 0 is ideal.
Front-end Bandwidth: total I/O as measured in MB per second from hosts to
VPLEX.
Front-end Latency: time in microseconds for I/O to complete between VPLEX and
hosts. Very dependent on back-end array latency.
Front-end Throughput: total I/O as measured in I/O per second.
Rebuild Status: completion status of local and remote device rebuild jobs.
Subpage Writes: number of writes smaller than 4 KB. This statistic has greatly
diminished importance for VPLEX Local and Metro systems running GeoSynchrony
5.0.1 and later code. For VPLEX Geo, this is still a very relevant metric.
WAN Link Usage: I/O between VPLEX clusters as measured in MB per second. This
chart can be further subdivided into system, rebuild, and distributed volume
write activity.
WAN Link Performance: I/O between VPLEX clusters as measured in I/O per second.

The Back-end Latency statistic provides a quick way to narrow down the source of
performance issues: whether they are caused by the array, the hosts, or by
VPLEX. If back-end latency is high, then you should expect to see
correspondingly high front-end latency. This would rule out VPLEX as the source
of the latency.
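That triage logic can be expressed as a short sketch. The 5 ms threshold, the 0.8 ratio, and the function name are illustrative assumptions, not VPLEX-defined values:

```python
def latency_source(fe_us: float, be_us: float, threshold_us: float = 5000.0) -> str:
    """Rough first-pass triage comparing front-end and back-end latency."""
    if fe_us < threshold_us:
        return "healthy"
    if be_us >= fe_us * 0.8:
        # Front-end latency is dominated by back-end latency: suspect the array.
        return "array"
    # Front-end latency is high while back-end is low: look at VPLEX,
    # the fabric, or (for Metro/Geo) the WAN.
    return "vplex-or-fabric"

assert latency_source(fe_us=12000, be_us=11000) == "array"
assert latency_source(fe_us=12000, be_us=1000) == "vplex-or-fabric"
assert latency_source(fe_us=800, be_us=600) == "healthy"
```

In practice the thresholds should come from the baseline measurements recommended elsewhere in this paper, not from fixed constants.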
Back-end Connectivity Summary
For typical workloads, VPLEX will normally perform as well as the underlying
storage array.
Injected VPLEX Local write overhead is usually in the 300-600 microsecond range.
VPLEX's cache can benefit read-intensive applications, leading to reduced
latency (compared to baseline) when there are VPLEX read cache hits.
Baseline and document your storage and application environments pre-VPLEX.
Follow your storage vendor's best practices for performance regarding RAID
layout, disk types (SSD, FC, SAS/SATA), thin/thick provisioning, and automated
storage tiering.
Reference the EMC Support Matrix, Release Notes, and online documentation
available on http://Powerlink.EMC.com for specific array configuration
requirements.
Engage EMC and third-party storage vendors as needed.


Section 6: SAN and WAN Performance


VPLEX has unique SAN fabric requirements that may differ from what you are used
to in your storage infrastructure. A quality SAN configuration can help you
achieve a stable, reliable, and scalable VPLEX installation; conversely, a poor
SAN environment can make your VPLEX experience considerably less pleasant. This
section provides you with information to tackle this topic.
The topology requirements for VPLEX do not differ much from any other storage
device. What makes VPLEX unique is that it is typically configured with a large
number of hosts, which can cause interesting issues with SAN scalability. Also,
because VPLEX often serves many hosts, an issue caused by poor SAN design can
quickly cascade into a more serious problem. Another differentiator is that
VPLEX can work with multiple SANs (up to four) from different SAN vendors
running different (perhaps incompatible) switch firmware.
SAN Redundancy
One of the fundamental VPLEX SAN requirements is to create two (or more)
entirely redundant SANs. The easiest way to accomplish this is to construct two
SANs that are mirror images of each other. Technically, VPLEX will work using
just a single SAN (appropriately zoned). However, we do not recommend this
design in any production environment, nor in development environments: a stable
development platform is important to programmers, and an extended outage in the
development environment can cause an expensive business impact. For a dedicated
storage test platform, however, it might be acceptable.
Redundancy through Cisco VSANs or Brocade Virtual Fabrics
Even though VSANs and Virtual Fabrics can provide a logical separation within a
single appliance, they do not replace hardware redundancy. All SAN switches
have been known to suffer from hardware or fatal software failures.
Furthermore, redundant fabrics should be separated into different,
non-contiguous racks and fed from redundant power sources.
Note: Due to the design of Fibre Channel, it is extremely important to avoid
inter-switch link (ISL) congestion. While VPLEX can, under most circumstances,
handle a host or storage array that has become overloaded, the mechanisms in
Fibre Channel for dealing with congestion in the fabric itself are not
effective. The problems caused by fabric congestion can range anywhere from
dramatically slow response time all the way to loss of storage access. These
issues are common to all high-bandwidth SAN devices and are inherent to Fibre
Channel; they are not unique to VPLEX. Contrasting Fibre Channel with Ethernet:
when an Ethernet network becomes congested, the Ethernet switches simply
discard frames for which there is no room. When a Fibre Channel network becomes
congested, the Fibre Channel switches instead stop accepting additional frames


until the congestion clears, in addition to occasionally dropping frames. This
congestion quickly moves upstream in the fabric and prevents end devices (such
as VPLEX) from communicating anywhere. This behavior is referred to as
head-of-line blocking, and while modern SAN switches internally have a
non-blocking architecture, head-of-line blocking still exists as a SAN fabric
problem.
Planning SAN Capacity
If at all possible, plan your SAN for the maximum size configuration that you
ever expect your VPLEX installation to reach. The design of the SAN can change
radically for larger numbers of hosts and storage arrays. Modifying the SAN
later to accommodate a larger-than-expected number of hosts might produce a
poorly designed and poorly performing SAN. Moreover, it can be difficult,
expensive, and disruptive to your business. Planning for the maximum size does
not mean that you need to purchase all of the SAN hardware initially. It only
requires you to design the SAN with the maximum size in mind. Always deploy at
least one extra ISL per switch. Not doing so exposes you to consequences
ranging from complete path loss (this is bad) to fabric congestion (this is
even worse).
EMC does not permit more than three hops between VPLEX and the hosts. The same
three-hop maximum applies to back-end storage.
ISL Considerations
With modern 4 or 8 Gbps SAN switches, it is typical for anywhere from 7 to 15
hosts to share an ISL, and for this reason there are many possible traffic
behaviors that must be taken into account when setting up switch ISLs. It is
important to understand the I/O workload and bandwidth requirements of each
host sharing an ISL. Beyond host-per-ISL counts, it is also important to take
these other factors into account:
Take peak loads into consideration, not average loads. For instance, while a
database server might only use 20 MBps during regular production workloads, it
might perform a backup at far higher data rates.
Minimize switch congestion, as congestion at one switch in a large fabric can
cause performance issues throughout the entire fabric, including traffic
between VPLEX and storage subsystems, even if they are not directly attached to
the congested switch. The reasons for these issues are inherent to Fibre
Channel flow control mechanisms, which are simply not designed to handle fabric
congestion.
Implement a spare ISL or ISL trunk so that congestion can be avoided if an ISL
fails due to a SAN switch line card, port blade, or similar failure.
Consider the bandwidth consequences of a complete fabric outage. While a
complete fabric outage is a fairly rare event, insufficient bandwidth can turn
a single-SAN outage into a total access loss event.
Take the bandwidth of the links into account. It is common to have ISLs run
faster than host ports, which obviously reduces the number of required ISLs.
In a core/edge SAN, connect VPLEX to the core switch to minimize latency and
hops to hosts and to storage arrays.


FC WAN Sizing
Insufficient WAN bandwidth for the desired workload will guarantee performance
degradation. The application will see high response times because of queue
build-ups within VPLEX when the WAN pipe is saturated.
The minimum required inter-cluster bandwidth for VPLEX Geo is 1 Gbps. VPLEX
Metro-IP has a release-notes-stated minimum of 3 Gbps; however, solutions
running at 1 Gbps are considered via RPQ.
Ensure your WAN devices are properly configured for distance and have proper
licenses, and that QoS or bandwidth rate limits are not artificially capping
available inter-cluster bandwidth.
Ensure compliant WAN round-trip times
Unsupported inter-cluster WAN round-trip times can result in unexpected
performance results. VPLEX Metro with FC WAN can benefit from FC FastWrite
technology (available from vendors such as Brocade and Cisco), which minimizes
the total number of round trips incurred for writes between data centers.
Buffer to Buffer Credits
If FC switches are used over dark fibre or DWDM WAN equipment, ensure that the
WAN-facing FC ports have sufficient buffer credits allocated. A lack of buffer
credits will impose an undesired limit on the maximum throughput of the WAN
link.
Brocade switches:
For Brocade, an extended-fabric license is required for each edge switch, and
the WAN-facing ports must be set to LS or LD mode. See the command
portcfglongdistance. Monitor the port's counters for non-zero values of
tim_txcrd_z (time transmission credits are zero). A non-zero value means the FC
port wanted to transmit a frame but did not have sufficient buffer credits to
do so, and implies performance issues on the WAN link. If FCIP gateway devices
are used between VPLEX clusters, ensure that the FCIP tunnel is configured
properly.
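The number of credits a long link needs can be estimated with back-of-envelope arithmetic: credits cover frames in flight until the receiver's R_RDY returns. The constants and helper name below are illustrative assumptions; switch vendors publish their own sizing guidance that should take precedence:

```python
import math

FIBRE_KM_PER_S = 200_000  # light in glass travels at roughly 2/3 c

def bb_credits_needed(distance_km: float, speed_gbps: float,
                      frame_bytes: int = 2148) -> int:
    """Rough minimum buffer-to-buffer credits to keep a long FC link
    streaming full-size frames (2148 bytes including headers/CRC).

    A credit is held per frame in flight until the R_RDY returns, so the
    link must hold roughly a round trip's worth of frames.
    """
    frame_time_s = (frame_bytes * 8) / (speed_gbps * 1e9)
    frame_length_km = frame_time_s * FIBRE_KM_PER_S  # fibre one frame occupies
    return math.ceil((2 * distance_km) / frame_length_km)

# 100 km of dark fibre at 2 Gbps needs on the order of a hundred-plus
# credits, far above typical default per-port allocations.
assert bb_credits_needed(100, 2) == 117
```

Note that the required credit count scales linearly with both distance and link speed, which is why an 8 Gbps long-distance port needs roughly four times the credits of a 2 Gbps one.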
Brocade FCIP switches:
Check for bandwidth rate-limiting settings on the tunnel. See the command
portshow fciptunnel. Verify that the values for Min Comm Rt and Max Comm Rt
(Minimum/Maximum Communication Rate) are not causing a bottleneck. Also check
for improper QoS settings on the tunnel: from the portshow fciptunnel command
output, check the values for QoS Percentages. Note that QoS settings affect the
FCIP tunnel only if QoS has been set on the LAN-facing FC ports.


IP WAN Settings: VPLEX Metro-IP and VPLEX Geo


There are a couple of parameters which should be tuned according to the VPLEX
configuration when using an Ethernet WAN.
Jumbo frames
Ethernet carries user data in IP frames, which, for compatibility, default to a
1500-byte Maximum Transmission Unit (MTU). However, there is a large fixed
per-frame overhead, both to the host and to the Ethernet switches, that is
independent of the size of the data transmitted. Larger (jumbo) frames, e.g., a
9000-byte MTU, can therefore transmit data ~50% faster with ~50% less impact to
the host CPU, especially when using 10 GigE. Unfortunately, not all Ethernet
systems can use jumbo frames; e.g., intermediate switches may only be capable
of handling a 1500-byte MTU.
Note that each and every hop along the IP path must have jumbo frames enabled.
The VPLEX CLI command "director tracepath" will show the number of hops and
their supported MTU sizes. Note that in order to discover whether jumbo frames
are supported, the VPLEX IP ports must be set to the maximum possible MTU. It
is best to ask your WAN service provider what the network is capable of.
Caution: WAN providers are very hesitant to enable jumbo frames on publicly
shared WAN links because of quality-of-service concerns for those not using
jumbo frames. Don't be surprised by push back from the ISP.
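The fixed per-frame cost behind the jumbo-frame recommendation can be illustrated with a quick calculation. This is a simplified sketch (it ignores Ethernet headers and the inter-frame gap); the 10 GigE rate and MTU values are the ones mentioned above:

```python
def frames_per_second(link_gbps: float, mtu_bytes: int) -> float:
    """Frames the link must process per second at line rate,
    counting payload bytes only, as a simplification."""
    return (link_gbps * 1e9 / 8) / mtu_bytes

std = frames_per_second(10, 1500)    # standard frames at 10 GigE
jumbo = frames_per_second(10, 9000)  # jumbo frames at 10 GigE

# Jumbo frames cut the per-frame processing (and CPU interrupt load) 6x:
assert abs(std / jumbo - 6.0) < 1e-9
```

The throughput gain from larger payload-to-header ratios is modest; the bigger win at 10 GigE is this reduction in frames the host and switches must handle per second.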
Socket Buffer Size
In Ethernet, the maximum amount of data that will be transmitted in one go is
determined by the Socket Buffer (SB) size; it forms a flow-control mechanism.
If the SB is too small, VPLEX will be starved for packets and transmissions
will incur extra overhead. If the SB is too large, VPLEX will accumulate large
internal queues and I/Os will not be processed efficiently. Thus the goal is to
set the SB size to be optimal for your configuration.
Check the port-group's socket buffer size (socket-buf-size in
/clusters/cluster#/cluster-connectivity/option-sets/optionset-com-#/). The
default, as of GeoSynchrony 5.1, is 1 MB. The optimal value is the network's
delay-bandwidth product: the latency (delay) of the network multiplied by the
available bandwidth, which is the amount of data that must be outstanding to
fully utilize the network.
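The delay-bandwidth product can be computed directly. This is a sketch; the 1 Gbps / 10 ms figures are illustrative, not VPLEX requirements:

```python
def optimal_socket_buffer(bandwidth_gbps: float, rtt_ms: float) -> int:
    """Delay-bandwidth product in bytes: the amount of data that must be
    in flight to keep the WAN pipe full."""
    bytes_per_second = bandwidth_gbps * 1e9 / 8
    return int(bytes_per_second * rtt_ms / 1e3)

# A 1 Gbps link with a 10 ms round-trip time wants ~1.25 MB in flight,
# slightly above the 1 MB GeoSynchrony 5.1 default:
assert optimal_socket_buffer(1, 10) == 1_250_000
```

If the computed product is well above the configured socket-buf-size, the link will not be fully utilized regardless of available bandwidth.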
Areas to Check to Avoid SAN and WAN Performance Issues
Metro/Geo WAN Performance
Metro-FC
Lack of buffer credits yields low transfer rates
Provision enough inter-cluster bandwidth
Results in high host latency if bandwidth is saturated
Metro-IP / Geo


IP QoS settings
Sufficient bandwidth
MTU sizes: use jumbo frames whenever possible
Dirty/unhealthy FC fabric
Fabric health is CRITICAL to VPLEX performance
Watch for c3discards, CRC errors, internal link failures, slow-drain
devices, etc.
Brocade: on 8 Gbps fabrics, change the fillword setting per port
Avoid incorrect FC port speed between the fabric and VPLEX. Use the highest
possible bandwidth to match the VPLEX maximum port speed, and use dedicated
port speeds, i.e., do not use oversubscribed ports on SAN switches.
Each VPLEX director has the capability of connecting both FE and BE I/O modules
to both fabrics with multiple ports.
o The ports connected to on the SAN should be on different blades or switches
so a single blade or switch failure won't cause loss of access on that fabric
overall.
o A good design will group VPLEX BE ports with array ports that will be
provisioning groups of devices to those VPLEX BE ports in such a way as to
minimize traffic across blades.

Note: A more detailed treatment of VPLEX best practices can be found in the
VPLEX Implementation and Planning Best Practices Technote available on
http://Powerlink.EMC.com


Section 7: VPLEX Performance Checklist


This section summarizes the topics covered in the previous six sections and
provides a quick review of the overall performance considerations. Here is a
checklist of factors to consider when deploying VPLEX:
Ensure VPLEX Best Practices have been followed
The best way to ensure the optimal operation of your VPLEX is to adhere to the
VPLEX configuration best practices. These are documented in the EMC Technical
Note: Implementation and Planning Best Practices for VPLEX, available on
Powerlink.emc.com, which provides the key considerations, limitations, and
architecture details for VPLEX design. These topics are beyond the scope of
this white paper.
Run the latest GeoSynchrony code
VPLEX bug fixes and general performance enhancements are continually released.
Run the latest available stable version, and read the release notes for each
release so you know what's coming and what the known issues are.
Check ETA and Primus articles
Follow and understand all VPLEX-related EMC Technical Advisories (ETAs) and
performance-related Primus articles. An ETA identifies an issue that may cause
serious negative impact to a production environment. EMC's technical support
organization proactively publishes this information on Powerlink.emc.com.
Load balance across VPLEX directors and fibre channel ports
Avoid overloading any one VPLEX director or pair of directors (with dual and
quad systems). The same goes for VPLEX front-end ports: spread the I/O workload
around, and avoid creating hot spots which can cause artificial performance
bottlenecks.
Separate IO-sensitive workloads
When creating VPLEX front-end storage views, isolate applications where
possible onto different physical VPLEX resources. Be sure to spread the
workload across the available front-end FC ports on a VPLEX I/O module, up to
four per director on VS2 hardware. The same is true for back-end FC (storage
array) port consumption. When practical, use back-end ports in a rotating
fashion, so that all four BE ports are consumed by various storage arrays
before re-using the same BE port for another array or arrays.
Competing latency-sensitive applications may impact each other's performance if
they share the same FC port (FE or BE).


Check System Status


Be very aware of the general health of your VPLEX system, your storage arrays,
your storage fabrics and, for Metro/Geo, the health state of your WAN
infrastructure. With the Unisphere for VPLEX GUI, pay particular attention to
the System Status and Performance Dashboard tabs. Keep an eye out for component
errors, front-end aborts, back-end errors, WAN errors, dropped packets, and/or
packet re-transmissions, since these indicate that key portions of I/O
operations are failing and may have resulted in retried operations. In turn,
they affect the performance of your VPLEX system.
Configure Multi-pathing
Remember that there is no single host multi-pathing policy for every I/O
scenario. Generally PowerPath's Adaptive policy (the default for VPLEX devices)
is sufficient. Avoid excessive multi-director multi-pathing, and in a Metro
cross-connect environment, set HBAs to prefer the local paths. Depending on the
specifics of your environment, you may wish to try different policies to see
which best suits the workload.
Front-end/host initiator port connectivity summary:
The front-end dual fabrics should have a minimum of two physical connections to
each director (required).
Each host should have at least two paths to a cluster (required).
Each host should have at least one path to an A director and one path to a B
director on each fabric, for a total of four logical paths (required for NDU).
Hosts should be configured to a pair of directors to minimize cache transfers
and improve performance.
At the extreme, performance benefits can be maximized if both directors used by
a host are on the same engine, as cache transfers would happen via the internal
CMI bus within the engine chassis. In general, this is not a recommended best
practice when two or more engines are available.
Maximum availability for host connectivity is achieved by using hosts with
multiple host bus adapters and by zoning to all VPLEX directors. It's important
to note, however, that this would be analogous to zoning a single host to all
storage ports on an array. Though this sort of connectivity is technically
possible and offers the highest availability, from a cost-per-host,
administrative complexity, overall performance, and scalability perspective it
would not be a practical design for every host in the environment.
Each host should have redundant physical connections to the front-end dual
fabrics (required).
Each host should have fabric zoning that provides redundant access to each LUN
from a minimum of two directors on each fabric.
Four paths are required for NDU.

Note: The most comprehensive treatment of VPLEX best practices can be found in
the VPLEX Implementation and Planning Best Practices Technote which is located at
http://powerlink.EMC.com

Ensure your file-system is aligned


A properly aligned file system is a performance best practice for every storage
product from every vendor in the marketplace.
Windows Server 2008, VMware vSphere 5.0, and more recent Linux
environments automatically align their disk partitions.
When provisioning LUNs for older Windows and Linux operating systems that
use a 63-block (SCSI sector) partition offset, the host file system must be aligned
manually.
EMC recommends aligning the file system with a 1 MB offset.
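The alignment arithmetic can be sketched as follows. This is an illustrative check, not a provisioning tool: the legacy 63-sector offset and the 2,048-sector (1 MiB) offset used by newer operating systems are the only values assumed from the text.

```python
# Sketch: check whether a partition's starting offset falls on the recommended
# 1 MiB boundary. The 63-sector offset used by older Windows/Linux
# partitioning tools illustrates the misalignment problem.

SECTOR_SIZE = 512          # bytes per legacy SCSI block
ONE_MIB = 1024 * 1024      # recommended alignment boundary

def is_aligned(start_offset_bytes, boundary=ONE_MIB):
    """Return True if the partition start falls exactly on the boundary."""
    return start_offset_bytes % boundary == 0

legacy_start = 63 * SECTOR_SIZE      # 32,256 bytes: misaligned
modern_start = 2048 * SECTOR_SIZE    # 1 MiB: aligned (e.g. Windows 2008 default)

print(is_aligned(legacy_start))   # False
print(is_aligned(modern_start))   # True
```

A misaligned partition causes each file-system block to straddle two back-end stripe elements, doubling the back-end work for some IOs, which is why the 1 MB offset is recommended.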
Understand VPLEX transfer-size
During the initial synchronization / rebuild of mirrored devices (both local and
distributed) and during device mobility activities, VPLEX uses the transfer-size parameter
to determine how much of a source volume can be locked during the copy activity
from the source device to the target device (mirror legs). This value defaults to 128 KB,
which ensures the least host impact. It is important to realize that 128 KB is
extremely conservative; if the goal is to see how fast an initial sync, rebuild, or
mobility can be completed, the parameter can typically be increased to at
least 2 MB without a noticeable impact on host IO. As with any activity that
involves heavy IO to back-end storage, it is important to adjust this value gradually to
ensure the host, the array, and the infrastructure can tolerate the increased write
activity.
Transfer-size can be set to a maximum value of 32 MB for the fastest
sync, rebuild, or mobility activities.
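To get an intuition for the trade-off, the following sketch shows simple illustrative arithmetic (not VPLEX internals): since transfer-size sets how much of the source volume is locked per copy cycle, it also determines how many lock/copy cycles a full sync requires.

```python
# Sketch (illustrative arithmetic only): number of lock/copy cycles needed
# for a full mirror sync at different transfer-size settings.

KB = 1024
MB = 1024 * KB
GB = 1024 * MB

def copy_cycles(volume_size_bytes, transfer_size_bytes):
    """Ceiling of volume size divided by transfer size."""
    return -(-volume_size_bytes // transfer_size_bytes)

volume = 100 * GB
print(copy_cycles(volume, 128 * KB))  # 819200 cycles at the default
print(copy_cycles(volume, 2 * MB))    # 51200 cycles
print(copy_cycles(volume, 32 * MB))   # 3200 cycles at the maximum
```

Fewer, larger cycles finish the rebuild sooner but generate larger bursts of back-end write activity, which is why the document advises increasing the value gradually.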
Know your baseline performance
In performance troubleshooting, we often need to know what the native host-to-storage-array
performance is. It is very important to establish baseline performance
before adding VPLEX to an existing environment. There are circumstances in which
VPLEX may be a victim of unsatisfactory storage-array performance, as VPLEX
performance is heavily dependent on back-end array performance. Baseline data
makes it easier to determine whether a problem existed before or after VPLEX was added.
You can always check the observed VPLEX front-end and back-end latencies to
confirm the overall net latency impact. By following your storage-array vendor's
performance best practices, you will also maximize your observed VPLEX
performance.
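The front-end versus back-end comparison mentioned above can be reduced to a simple subtraction. The latency figures below are hypothetical milliseconds chosen purely for illustration.

```python
# Sketch: estimate the net latency the virtualization layer adds by comparing
# observed front-end (host-facing) latency against observed back-end
# (array-facing) latency. All numbers are hypothetical.

def vplex_added_latency(front_end_ms, back_end_ms):
    """Approximate latency contributed by the virtualization layer itself."""
    return front_end_ms - back_end_ms

baseline_native_ms = 1.9   # host <-> array, measured before VPLEX was inserted
front_end_ms = 2.1         # host <-> VPLEX, observed after insertion
back_end_ms = 1.9          # VPLEX <-> array, observed after insertion

print(vplex_added_latency(front_end_ms, back_end_ms))  # roughly 0.2 ms added
```

If the back-end latency itself is far above the pre-VPLEX baseline, the array (not VPLEX) is the first place to look.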
One size does not fit all
EMC Global Services Pre-sales personnel have tools specifically designed to help size
your EMC VPLEX environment(s).
These tools provide:
1. A GUI to enter environment and workload details and quickly size a local or
metro solution.
2. A way to check the proposed solution against the technical and performance
boundaries of VPLEX to help assess which VPLEX solution best meets the
environmental requirements.

3. An export of the "proposed" solutions to Excel to aid in comparing various
"what-if" scenarios.
4. Support for GeoSynchrony 5.0.1+ Local / Metro FC / Geo.

If you have questions or concerns about the appropriate number of engines for your
VPLEX system, please contact your EMC account team.


Section 8: Benchmarking
Tips when running the benchmarks
There are four important guidelines to running benchmarks properly:
1) Ensure that every benchmark run is well understood. Pay careful attention to the
benchmark parameters chosen and the underlying test system's configuration
and settings.
2) Each test should be run several times to ensure accuracy, and standard deviation
or confidence levels should be used to determine the appropriate number of
runs.
3) Tests should be run long enough that the system is in a steady state for the
majority of the run, most likely at least tens of minutes for a single test. A test
that runs for only 10 seconds or less is not sufficient.
4) The benchmarking process should be automated using scripts to avoid mistakes
associated with manual repetitive tasks. Proper benchmarking is an iterative
process. Inevitably you will run into unexpected, anomalous, or just interesting
results. To explain these results, you often need to change configuration
parameters or measure additional quantities - necessitating additional iterations
of your benchmark. It pays upfront to automate the process as best as possible
from start to finish.
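The guidelines above can be sketched as a minimal automated harness. This is a skeleton under stated assumptions: `run_once()` is a placeholder for the real workload generator (IOMeter, fio, or similar), not a real benchmark.

```python
# Sketch of an automated benchmark harness: repeat each test several times,
# discard the warm-up run, and report mean and standard deviation so
# run-to-run variation is visible.

import statistics
import time

def run_once():
    """Placeholder benchmark: returns a throughput-like figure."""
    start = time.perf_counter()
    sum(range(200_000))                     # stand-in for the real workload
    elapsed = time.perf_counter() - start
    return 1.0 / elapsed if elapsed else 0.0

def benchmark(runs=5):
    results = [run_once() for _ in range(runs + 1)]
    results = results[1:]                   # discard the first (cold) run
    return statistics.mean(results), statistics.stdev(results)

mean, stdev = benchmark()
print(f"mean={mean:.1f}  stdev={stdev:.1f}")
```

A large standard deviation relative to the mean is the signal that more runs, or a longer run, are needed before drawing conclusions.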
Take a scientific approach when testing
Before starting any systems performance testing or benchmarking, here are some
best practices:

• First things first, define your benchmark objectives. You need success metrics so
you know when you have succeeded. They can be response times, transaction
rates, or user counts, anything, as long as they are something measurable.
• Document your hardware/software architecture. Include device names and
specifications for systems, network, storage, and applications. It is good
scientific practice to provide enough information for others to validate your
results. This is an important requirement if you need to engage EMC Support on
benchmarking environments.
• When practical, change just one variable at a time.
• Keep a change log. What tests were run? What changes were made? What
were the results? What were your conclusions for that specific test?


• Map your tests to the performance reports you based your conclusions on.
Using codes or special syntax when naming your reports can help.
Typical Benchmarking Mistakes


Testing peak device performance with one outstanding IO
A storage system cannot be peak tested when it is not fully utilized. With only a
single outstanding IO, the storage device spends most of its time idle, waiting for
the next request.
Performance testing on shared infrastructure or a multi-user system
A shared resource cannot and should not be used for performance testing. Doing so
calls into question the results gathered, since it is anyone's guess who
happened to be doing what on the system at the same time as the test. Ensure that
the entire solution (host, storage appliance, network, and storage array) is
completely isolated from outside interference.
Do not conduct performance testing on a production system, since the benchmark-generated
IO workload could affect production users.
Comparing different storage devices consecutively without clearing host server
cache
Cached data from a previous performance run may force the host server cache
to flush out dirty data during the current run, skewing its results.
Better yet, run performance tests against storage devices on the host in raw or direct
mode, completely bypassing the host cache.
Testing where the data set is so small the benchmark rarely goes beyond cache
Be aware of the various levels of caching throughout the system stack: server,
storage appliance (VPLEX, IBM SVC, or other), and the storage array. Choose a
sufficient working set size and run the test long enough to negate
significant caching effects. It is also important not to use too large a working set,
since that could completely negate the benefits of storage-engine and array cache
and fail to represent real-world application performance.
It is not always clear whether benchmarks should be run with "warm" or "cold" caches. On
one hand, real systems do not generally run with a completely cold cache, and
many applications benefit greatly from caching algorithms, so a
benchmark that completely eliminates the cache would be unrealistic. On the other
hand, a benchmark that accesses too much cached data may be equally unrealistic,
since the I/O requests may never reach the component under test.
Inconsistent cache states between tests


Not bringing the various system caches back to a consistent state between runs can
cause timing inconsistencies. Clearing the caches between test runs helps create
identical runs, ensuring more stable results. If, however, warm-cache results are
desired, run the experiment n+1 times and discard the first run's result.
Testing storage performance with file copy commands
Simple file copy commands are typically single threaded and result in a single
outstanding I/O, which performs poorly and does not reflect normal application usage.
Testing peak bandwidth of storage with a bandwidth-limited host peripheral slot
If your server happens to be an older model, you could be limited by the host
motherboard's PCI bus. Ensure you have sufficient host hardware resources (CPU,
memory, bus, HBA or CNA cards, etc.). An older Fibre Channel network (2 Gbps, for
example) may limit the performance of newer servers.
Forgetting to monitor processor utilization during testing
Similar to peak bandwidth limitations on hosts, ensure that your host server's
processors are not completely consumed. If they are, your storage performance is
bound to be limited.
The same goes for the storage virtualization appliance and the storage array: if you
are maxing out the available CPU resources in the storage device, you will be
performance limited.
Not catching performance bottlenecks
Performance bottlenecks can occur at each and every layer of the
I/O stack between the application and the data resting on flash or spinning media.
Ultimately, the performance the application sees depends on all of the
sub-components between it and the storage, but it is critical to understand in
which layer of this cake the performance limitation exists. One misbehaving
layer can spoil everything.
Performance testing with artificial setups
Avoid "performance specials". Test with a system configuration that is similar to your
production target. For example, disabling storage-array cache mirroring may speed
up your test, but would you do that in production? Short-stroking the storage-array's
RAID configuration may boost performance, but in reality it is a very inefficient use of
disk space that you would not normally tolerate.
VMWare vSphere - Performance testing directly on the ESXi hypervisor console
Don't do it. Ever. ESXi explicitly throttles the performance of the console to prevent a
console app from killing VM performance. Also, doing I/O from the console directly
to a file on VMFS results in excessive metadata operations (SCSI reservations) that
otherwise would not be present when running a similar performance test from a VM.


Real World Testing Mistake Example


A benchmarking mistake previously seen in the field was comparing a different
vendor's storage array to a joint VPLEX-EMC storage solution. The customer was
disappointed with the VPLEX-storage performance and initially blamed VPLEX as
the culprit for the slowness. Further investigation revealed no significant
additional latency injected by VPLEX. Once a native host-to-EMC-storage
performance test was done, it showed that the EMC storage was not configured to
perform as fast as the third-party vendor's array, due to fewer drive spindles, a
different performance tier of drives, and different RAID protection on the EMC storage.
Once the EMC storage was properly reconfigured, the VPLEX solution performed as expected.

Understand the Metamorphosis of an IO


Here is why you typically do not want to rely on a high-level benchmarking program
for storage testing: what you think the storage device sees might be something
completely different. Each software layer within the host may transparently
segment, rearrange, and reassemble the initial I/O request from the
application.

Figure 16
VPLEX Performance Benchmarking Guidelines
There are a few themes to mention with regards to performance benchmarking with
VPLEX.
Test with multiple volumes


VPLEX performance benefits from I/O concurrency, which is most easily achieved by
running I/O to multiple virtual volumes. Testing with only one
volume does not fully exercise the performance capabilities of VPLEX or the storage array.
With regard to the previously mentioned single outstanding I/O issue, with enough
volumes active (such as a few hundred), a single outstanding I/O per volume
is acceptable: many active volumes create a decent level of concurrency. A
single volume with a single outstanding I/O most definitely does not.
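The multi-volume concurrency idea can be sketched in a few lines. This is a toy illustration, not a storage benchmark: ordinary temporary files stand in for raw volumes, and a thread pool provides the concurrent IO streams.

```python
# Sketch: drive I/O against multiple "volumes" concurrently so several I/Os
# are outstanding at once, instead of one volume at queue depth 1.

import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

BLOCK = 8 * 1024              # 8 KiB I/O size
VOLUME_SIZE = 1024 * 1024     # 1 MiB per stand-in volume

def make_volume():
    """Create a temp file acting as one volume."""
    f = tempfile.NamedTemporaryFile(delete=False)
    f.write(os.urandom(VOLUME_SIZE))
    f.close()
    return f.name

def read_volume(path):
    """Read one volume in BLOCK-sized requests; return bytes read."""
    total = 0
    with open(path, "rb") as fh:
        while chunk := fh.read(BLOCK):
            total += len(chunk)
    return total

volumes = [make_volume() for _ in range(8)]
with ThreadPoolExecutor(max_workers=8) as pool:   # 8 concurrent streams
    totals = list(pool.map(read_volume, volumes))
print(totals)    # each entry equals VOLUME_SIZE
for path in volumes:
    os.unlink(path)
```

In a real test the same shape applies: spread the workload generator's streams across many virtual volumes rather than hammering one.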
Storage Arrays
For VPLEX Metro configurations, ensure that each cluster's storage-arrays are of equal
class. Check the VPLEX back-end storage-volume read and write latency for
discrepancies. Perform a local-device benchmarking test at each cluster, if possible,
to eliminate the WAN and remote storage-array from the equation.
One Small Step
Walk before you run. It is typically quite exciting to test the full virtualization solution
end to end, soup to nuts. For VPLEX, this may not be the most scientific
approach when problems arise. By testing end to end immediately, the results
may disappoint (due to unrealistic expectations) and lead to false conclusions
about the overall solution without an understanding of the individual pieces of the puzzle.
Take a moderated, staged approach to system testing:
1) Start with your native performance test:
Host <-> storage-array

1a) If you have a two cluster deployment in mind, it is important to quantify the
performance of the storage-arrays at each cluster.
This will be your baseline to compare to VPLEX. For certain workloads, VPLEX can only
perform as well as the underlying storage-array.
2) Encapsulate the identical or similar-performing volumes into VPLEX, configuring them
as local-devices:


Host <-> VPLEX <-> storage-array

2a) Test both clusters' local-device performance. (Note: The second cluster's VPLEX
local-device performance test can be skipped if Step 1 showed satisfactory native
performance on the second cluster.)
3) Create a VPLEX distributed-device spanning both clusters' storage-arrays:
Host <-> VPLEX <-> cluster-1 storage-array and cluster-2 storage-array (distributed-device)

Tip: Any significant performance degradation at this step should focus
troubleshooting efforts on the WAN.
Be cognizant of the inherent write latency increase when writing to a distributed-device
with VPLEX Metro.
VPLEX Geo's write-back caching model will initially allow VPLEX to absorb host writes;
over time, however, performance will be limited by the system's sustained drain
rate (WAN performance and storage-array performance).
IOMeter Example


This section provides IOMeter settings examples in the form of screen captures from
actual test systems. They illustrate the settings that can be used to simulate various
workloads and create benchmarks.
Disk Targets Tab:

Access Specification Tab:

Application Simulation Testing with IOMeter


IOMeter can be used to synthesize a simplistic application I/O workload. Alignment
should be set to a page size of 4KB.
Single threaded I/O:


Multi-threaded I/O:

Simulating Database Environments:

• Use small transfer requests.
• Mostly random distribution.
• Match your existing database application's read/write mix.
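The database profile above can also be sketched in code. This is a minimal stand-in for an IOMeter access specification, not a real benchmark: a temporary file plays the role of a LUN, and the 70/30 read/write mix is an assumed example value that you would replace with your own application's mix.

```python
# Sketch: small random transfers with a 70/30 read/write mix, approximating
# a database-like I/O profile against a file standing in for a LUN.

import os
import random
import tempfile

BLOCK = 8 * 1024                 # small transfer size
FILE_SIZE = 4 * 1024 * 1024
READ_MIX = 0.70                  # assumed example mix; match your application

def db_like_workload(path, io_count=1000, seed=42):
    rng = random.Random(seed)
    reads = writes = 0
    with open(path, "r+b") as fh:
        for _ in range(io_count):
            offset = rng.randrange(0, FILE_SIZE // BLOCK) * BLOCK  # random
            fh.seek(offset)
            if rng.random() < READ_MIX:
                fh.read(BLOCK)
                reads += 1
            else:
                fh.write(b"\0" * BLOCK)
                writes += 1
    return reads, writes

tmp = tempfile.NamedTemporaryFile(delete=False)
tmp.write(b"\0" * FILE_SIZE)
tmp.close()
reads, writes = db_like_workload(tmp.name)
print(reads, writes)             # roughly a 70/30 split over 1000 I/Os
os.unlink(tmp.name)
```

For the streaming profile in the next section, the same skeleton applies with large, sequential transfers instead of small, random ones.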


Simulating Data Streaming Environments:

• Use large transfer requests to generate peak bandwidth.
• Mostly sequential distribution.
• Match your existing streaming application's read/write mix.


Conclusion
This paper has focused on VPLEX's role in providing a virtual storage layer between servers
and block storage frames. Because VPLEX lives at the very heart of the storage area
network, VPLEX's primary design principles are continuous availability and minimized IO
latency. VPLEX also provides non-disruptive data mobility within and across data centers
while simplifying the management of heterogeneous storage frames. When VPLEX and the
corresponding storage environment are properly sized and configured, IO latency can be
reduced for read-skewed workloads and kept nearly neutral for write-biased
workloads. Individual results will, of course, vary based on the application and IO
workload.
We have learned how inserting an inline virtualization engine like VPLEX has the potential to
increase I/O latency. In particular, we have seen how writes behave at Metro distances in a
synchronous caching model. The read/write mix, the I/O pattern, and the I/O stream
characteristics all affect the overall result. If benchmark or proof-of-concept testing is
being done, it is important to understand the factors that impact VPLEX performance and
make every effort to ensure the benchmark workload is as close to the real-world
workload as possible.
The role of SAN, server, and storage capabilities in terms of congestion, reads, and writes
was another important topic of discussion. These external components are extremely
relevant in determining overall VPLEX performance results. We have discussed how VPLEX's
read cache may increase performance compared to the baseline for a native
array, and how each host write must be acknowledged by the back-end storage frames.
Understanding the impact of VPLEX, and how an environment can be prepared for single-,
dual-, or quad-engine VPLEX clusters, will greatly increase your chances of success when
configuring virtualized storage environments for testing, benchmarks, and production.


References
The following reference documents are available at Powerlink.EMC.com:

• White Paper: Workload Resiliency with EMC VPLEX
• Techbook: EMC VPLEX Architecture and Deployment: Enabling the Journey to the Private Cloud
• VPLEX 5.1 Administrator's Guide
• VPLEX 5.1 Configuration Guide
• VPLEX Procedure Generator
• EMC VPLEX HA Techbook

External References

• IOMeter screen captures and Product Testing discussion:
http://www.snia.org/sites/default/education/tutorials/2007/spring/storage/Storage_Performance_Testing.pdf


Appendix A: Terminology

Term               Definition
Storage volume     LUN or unit of storage presented by the back-end arrays
Metadata volume    System volume that contains metadata about the devices, virtual volumes, and cluster configuration
Extent             All or part of a storage volume
Device             Protection scheme applied to an extent or group of extents
Virtual volume     Unit of storage presented by the VPLEX front-end ports to hosts
Front-end port     Director port connected to host initiators (acts as a target)
Back-end port      Director port connected to storage arrays (acts as an initiator)
Director           The central processing and intelligence of the VPLEX solution. There are redundant (A and B) directors in each VPLEX engine
Engine             Consists of two directors; the unit of scale for the VPLEX solution
VPLEX cluster      A collection of VPLEX engines in one rack, using redundant, private Fibre Channel connections as the cluster interconnect
VPLEX Metro        A cooperative set of two VPLEX clusters, each serving its own storage domain over synchronous distance
VPLEX Metro HA     As per VPLEX Metro, but configured with VPLEX Witness to provide fully automatic recovery from the loss of any failure domain
Access Anywhere    The term used to describe a distributed volume using VPLEX Metro
Federation         The cooperation of storage elements at a peer level
