Storage Architectures
www.infinio.com
contact@infinio.com
617-374-6500
Modern Complexities:
All-Flash Arrays
Hyper-Converged Infrastructures
Storage Acceleration
A decade ago, the data center was a vastly different world. Traditional
storage arrays, configured as SAN or NAS, sat as centralized storage units.
[Sidebar: Virtualization, consolidating more I/O onto the same storage resources; scale-out application architectures; flash, supporting orders of magnitude greater performance.]
Virtualization
Traditional SAN and NAS arrays were designed for a world without virtualization.
While not new, over the past few years virtualization has had a profound impact
on storage architectures and workloads.
Simply put, server virtualization means that a single physical server can be
shared by multiple, independent virtual machines. No longer are there dedicated
LUNs for each physical application; gone too are the optimizations built for that
architecture. It's no longer efficient to employ simplistic mechanisms to improve
the ability of drive heads to access data from a specific location.
Users have to contend with both the explosion of storage, driven by machine-generated data, and a new multiplier: copies used for data protection.
When you have a lot of compute in a scale-out architecture (like you do with VMware and
Oracle, for example), storage is under significantly more pressure. The same spindles must
handle increased workload compared to when they were sized 1:1 for individual scale-up
applications. The goal of these systems is to evenly and automatically distribute the data
across multiple systems. However, this efficiency comes with a price: the cluster as a whole is
processing significantly more data, putting more performance pressure on storage. Similarly,
scale-out applications in the datacenter that use replicas rather than RAID for data
protection drive enormous capacity requirements into storage.
Flash
Enter flash-based SSD devices. Flash is a core technology that can be deployed in several ways,
such as SSDs in hosts, PCIe cards, and SSDs in arrays. Other new architectures, like NV-based
DIMMs, are also emerging in this space. Because it is orders of magnitude faster (without being
orders of magnitude more expensive), flash is a huge disruption in the economics of
storage.
This puts pressure on existing storage systems in a few ways. First, flash's additional
performance capabilities drive significantly more IOPS through existing controllers than
legacy systems were designed to handle. Equally impactful to existing platforms is that flash's
architecture often needs special handling to address particular challenges.
Complexities to Consider
Like other technologies, individual flash devices (like SSDs) have some architectural
challenges that need to be addressed, either by storage controllers or by software.
For example:
Wear leveling
Over time, flash cells wear out. It's desirable to have them wear out at about the same time, so
some storage controllers include special algorithms to spread wear evenly across cells.
Write amplification
Flash only allows writes to an erased cell; if a cell has content, it must be erased before it can be
re-written. Further increasing overhead is that while writes occur at the page level, erasures
occur at the larger block level.
Garbage collection
Because cells must be empty to be written to, a clean-up cycle needs to occur in order to
dispose of outdated data. This background process can degrade both active read and write
performance while it occurs.
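These behaviors interact: because erasure happens a whole block at a time, rewriting one page can force extra physical writes. The toy Python model below illustrates the effect; it is a sketch only, with made-up block geometry, not any vendor's actual controller logic.

```python
# Toy model of NAND flash write amplification: pages can only be
# programmed when erased, and erasure happens a whole block at a time,
# so rewriting one page forces the controller to copy the block's other
# live pages back after the erase -- extra physical writes per logical write.

PAGES_PER_BLOCK = 4  # illustrative; real blocks hold far more pages

class FlashBlock:
    def __init__(self):
        self.pages = [None] * PAGES_PER_BLOCK   # None means erased
        self.physical_writes = 0

    def write_page(self, idx, data):
        if self.pages[idx] is not None:
            # Page already programmed: erase-before-write at block
            # granularity means preserving every other live page.
            live = [(i, d) for i, d in enumerate(self.pages)
                    if d is not None and i != idx]
            self.pages = [None] * PAGES_PER_BLOCK   # block erase
            for i, d in live:                       # copy live pages back
                self.pages[i] = d
                self.physical_writes += 1
        self.pages[idx] = data
        self.physical_writes += 1

blk = FlashBlock()
for i in range(PAGES_PER_BLOCK):
    blk.write_page(i, f"v0-{i}")   # 4 logical writes, 4 physical writes
blk.write_page(0, "v1-0")          # 1 more logical write...
print(blk.physical_writes)         # ...but 8 physical writes in total
```

Five logical writes cost eight physical writes here; that ratio is the write amplification a controller's garbage collection and wear-leveling logic must manage.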
[Illustration: the relative speed of different media in the data center, shown on a scaled-up timeline ranging from 1 minute to 10 years.]
But first, let's take a broad look at the ways in which users have
integrated flash into their systems.
In the early days before purpose-built hybrid arrays emerged, users
handled their need for speed and performance by adding SSDs to their
existing arrays, and developing tiering strategies (eventually these were
automated) to help them best use their mix based on the level/frequency
of data access.
Similarly, some users created all-flash arrays by purchasing legacy
architecture arrays filled with SSD drives. However, both of these
approaches fell short of delivering the promise of flash: they didn't
leverage flash in the most effective ways, and they fell victim to many of
the complexities of flash discussed earlier. The other critical thing about
both of these approaches is this: the architecture of the datacenter and of
the storage arrays stayed basically the same.
Hybrid Arrays
Hybrid arrays represent the next mainstream generation
of disk arrays, with controllers that are better suited to
utilize flash resources. The latest hybrid arrays use flash for
more specific tasks: as a read cache and a write log buffer,
for example.
These hybrid arrays access data from either flash or their
disk pool, but flash is not exposed directly to applications.
The goal of this storage architecture is to present an optimal
mix of flash, to optimize the array's ability to handle increased
IOPS, and spinning drives, to optimize capacity utilization.
Typically used in scenarios where there is a set of mixed
enterprise workloads, hybrid arrays offer mainstream users an
option for benefiting from flash when optimizing cost is more
important than occasional latency.
The hybrid landscape continues to rapidly expand and
develop. For now, the hybrid array approach of combining
SSDs and spinning drives seems to be the new status quo
for organizations purchasing new arrays.
All-Flash Arrays
The all-flash trend began with putting SSDs into existing arrays, but a range of
the complexities sparked the emergence of a new generation of storage. As we
discussed earlier, controller functions like wear leveling become imperative in
arrays leveraging flash to avoid wearing the flash cells unevenly. Similarly, space
management can be a challenge, too, since flash only writes to empty cells.
And, cleaning up storage may incur slower cycles as multiple reads and writes
execute while blocks are being erased. The newest all-flash arrays are built with
controller logic that handles these technical challenges.
Essentially, an all-flash array is like tacking a single, high-performance/high-cost
tier onto your existing data center. It's not just fast; it's a guarantee of
fast for everything connected to it, an SLA no other architecture has yet
promised. But while the approach is ideal for handling a high volume of mission-critical
applications, it may be too expensive for many organizations. The
deduplication that all-flash array vendors tout as the key to delivering spinning-drive
economics may represent a false promise: that same deduplication can be
applied to spinning drives as well, once again separating the costs of flash and
spinning drives.
Economics aside, the reality is that many organizations do not demand that
much speed for every application.
Hyper-Converged Infrastructure
Architecturally, a hyper-converged system binds hardware,
typically a server with direct-attached storage and flash
cards, with the software needed to run virtual workloads,
including a hypervisor, system management, configuration,
and virtual networking tools. These technologies do not
connect to traditional storage stacks and instead offer next-generation
storage services such as scale-out performance,
caching, encryption, and deduplication.
But hyper-converged infrastructure has its limitations, not
the least of which is the reality that users have to change
their whole data center to implement it. While all-flash
and hybrid arrays have unique attributes, users still buy
and manage networking and servers the same way as they
always have, having a lesser impact on the architecture of
the data center. By contrast, hyper-converged infrastructure
requires users to buy into a new building block for their
environments, changing their management tools and
processes. Thus, these are typically deployed in small and
medium businesses or remote office environments.
Other Considerations Include:
Storage Efficiency: The scale-out architecture of hyper-converged
infrastructure has driven a data protection scheme based on
replication, rather than traditional RAID. While RAID for data
protection might have a 10-20% overhead for capacity, the
replicas necessary to protect hyper-converged infrastructure
might see a capacity overhead closer to 300-400%.
Pre-defined Scaling: Compared to traditional IT, there is less
flexibility to throttle pieces of the infrastructure that you need
more or less of; expansion is done by buying another predefined
block of all the resources.
Multiple User Management Experience: In a hyper-converged
environment, silos of IT personnel may need to be reorganized to
streamline management. Familiar server and storage monitoring
tools are often replaced with different interfaces.
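The RAID-versus-replication overhead figures follow directly from how much raw capacity each scheme consumes per usable byte. A short illustrative calculation, assuming a 10+2 RAID-6 stripe and 4-way replication as representative (not prescribed) configurations:

```python
def overhead_pct(raw_bytes_per_usable_byte: float) -> float:
    """Extra raw capacity required beyond the usable data, as a percent."""
    return (raw_bytes_per_usable_byte - 1.0) * 100.0

# RAID-6 stripe of 10 data drives + 2 parity drives: 12 raw per 10 usable.
raid6 = overhead_pct(12 / 10)     # roughly 20% overhead

# 4-way replication: every usable byte is stored four times.
replica4 = overhead_pct(4.0)      # 300% overhead

print(raid6, replica4)
```

With 5 copies the replication overhead reaches 400%, which is where the 300-400% range in the text comes from.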
Why Hyper-Converged?
Enables simplified, unified design, management, deployment, and support by integrating the components of the IT infrastructure.
Provides predictable building blocks that can be aggregated together to meet growth needs.
Storage Acceleration
Many organizations who understand the benefits of
leveraging server-side resources to improve storage
performance seek a solution that enables this without
massive disruption to datacenter architecture.
Enter storage acceleration.
Storage acceleration enables IT managers to improve
storage performance by aggregating server-side
resources. This approach creates a low-latency
performance layer, enabling organizations to purchase
or use lower-cost storage for capacity purposes. Whether
organizations are looking to improve existing storage or
design a new datacenter, storage acceleration enables
them to manage the resources of performance
and capacity separately without changing the
architecture or the operations of the storage side.
This architecture can deliver the lowest cost/IOPS by
using less expensive commodity-based resources on the
server side, and the lowest cost/GB by focusing shared
arrays on being optimized for capacity.
From a technology standpoint, this approach
enables organizations to separate the acquisition and
management of performance resources from that of
capacity resources.
From a business standpoint, it provides a way to add
performance to an existing infrastructure without a
rip-and-replace and its inherent cost in hardware,
software, IT time investment, and downtime. These
resources can also be significantly less expensive than the
same hardware deployed within a proprietary package.
Why Storage Acceleration?
Provides low-latency server-side access to the fastest storage resources, at a low $/IOPS.
Enables organizations to maintain their existing infrastructure investment in shared storage platforms, even reducing $/GB on newer platforms.
Infinio's Storage
Acceleration Platform
Storage acceleration is still an emerging field,
but Infinio has been providing a solution in
this space since 2013. The Infinio solution is
highly efficient with resources, transparent to
existing storage operations, and non-disruptive.
All of these qualities enhance the benefits of
separating storage into its atomic qualities,
capacity and performance, including:
10x improvement in latency
SSD-class performance without
additional hardware
Reduced performance costs ($/IOPS)
Scale-out I/O with application growth
Reduced capacity costs for any array ($/GB)
Infinio
Storage acceleration for
the virtualized datacenter.
At the core of Infinio's solution to deliver storage performance separately from storage
capacity is an architecture built on the understanding that most datacenters contain
significant amounts of duplicate data, especially across VMware clusters. Infinio's content-based
architecture exploits this fact, tracking content (rather than location), which results in
a performance layer with inline deduplication. It is this deduplication that enables Infinio
to deliver high performance (10x improvement in latency) on just small amounts of RAM,
starting at just 8GB per host. When this deduplication is combined with Infinio's scale-out
global architecture, just 5 nodes of Infinio can have access to hundreds of GB of effective cache.
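The general idea behind content-based caching is to key cached blocks by a hash of their contents, so duplicate blocks collapse into a single entry. The toy Python class below sketches that idea only; it is not Infinio's actual implementation, and all names in it are invented.

```python
import hashlib

class ContentCache:
    """Toy content-addressed read cache: blocks are keyed by the hash of
    their contents, so identical blocks from different VMs (or different
    disk locations) occupy a single cache slot -- inline deduplication."""

    def __init__(self):
        self.blocks = {}   # content hash -> block bytes (stored once)
        self.index = {}    # (volume, offset) -> content hash

    def put(self, volume, offset, data: bytes):
        key = hashlib.sha256(data).hexdigest()
        self.blocks[key] = data            # duplicate content collapses here
        self.index[(volume, offset)] = key

    def get(self, volume, offset):
        key = self.index.get((volume, offset))
        return self.blocks.get(key) if key is not None else None

cache = ContentCache()
# Two VMs reading the same guest-OS block at different disk locations:
cache.put("vm1-disk", 0, b"guest OS boot block")
cache.put("vm2-disk", 4096, b"guest OS boot block")
assert len(cache.blocks) == 1   # one cached copy serves both locations
```

This is why duplicate-heavy VMware clusters stretch a small amount of RAM into a much larger effective cache: each unique block is cached once, however many virtual disks contain it.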
And it's not just the efficiency that makes Infinio different. Core to the design of the product
has always been a commitment to seamless integration into an existing environment. As such,
Infinio can be installed in under 30 minutes with no downtime, disruption, guest agents, or
changes to storage configuration. Turning acceleration on or off is a single click, as is removing
Infinio entirely at the end of an evaluation.
Once implemented, Infinio enables you to continue using your familiar storage tools, like
snapshots, replication, and thin provisioning, as well as customizations you've made in your
environment around backup system integration or reporting.
What you
can expect
Simple installation,
enabling you to evaluate
without downtime,
disruption, or changes
Investment preservation,
since it co-exists with
your existing storage
system tools and reports
Real-World Success
Attivio, one of Infinio's earliest customers, has seen significant storage performance
improvements over the long term, including:
Improved storage performance with no added hardware
Sustained read offload of 88%; 93% of bandwidth offloaded
Sustained 5x performance improvement over 16 weeks
Installed with no downtime or service interruption
Real-world success
When mobile workers complained about slow response and poor application
performance, Budd Van Lines deployed Infinio to get business moving, achieving:
Improved VDI performance (read response times decreased by 2.5x;
75% of requests offloaded)
Eased network bandwidth by offloading storage requests
Installed quickly without affecting production or users