The concept of scalability applies to technology as well as business settings. The base
concept is consistent: the ability of a business or technology to accept increased
volume without impacting the contribution margin (revenue minus variable costs). For
example, a given piece of equipment may have a capacity of 1 to 1,000 users, and
beyond 1,000 users additional equipment is needed, or performance will decline
(variable costs will increase and reduce the contribution margin).
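As a purely illustrative calculation (the figures below are hypothetical, not from the text), the following sketch shows how crossing the equipment's capacity threshold raises variable costs and shrinks the contribution margin:

```python
# Illustrative only: hypothetical figures showing how exceeding a capacity
# threshold adds variable cost and reduces the contribution margin.

def contribution_margin(revenue, variable_costs):
    return revenue - variable_costs

CAPACITY = 1000          # users one unit of equipment can serve
REVENUE_PER_USER = 10.0  # assumed revenue per user
COST_PER_UNIT = 2000.0   # assumed variable cost per equipment unit

def margin_for(users):
    units_needed = -(-users // CAPACITY)  # ceiling division
    revenue = users * REVENUE_PER_USER
    return contribution_margin(revenue, units_needed * COST_PER_UNIT)

print(margin_for(1000))  # 1 unit:  10000 - 2000 = 8000
print(margin_for(1001))  # 2 units: 10010 - 4000 = 6010
```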
Measures
Load scalability: The ability of a distributed system to easily expand and contract
its resource pool to accommodate heavier or lighter loads; alternatively, the ease
with which a system or component can be modified, added, or removed to
accommodate changing load (a minimal sketch follows this list).
Geographic scalability: The ability to maintain performance, usefulness, or
usability regardless of expansion from concentration in a local area to a more
distributed geographic pattern.
Administrative scalability: The ability of an increasing number of organizations to
easily share a single distributed system.
Functional scalability: The ability to enhance the system by adding new
functionality with minimal effort.
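As a minimal sketch of the load-scalability idea (not any particular product's autoscaling API; the thresholds and limits below are assumptions), a resource pool might be grown or shrunk like this:

```python
# A minimal sketch of load scalability: grow or shrink a resource pool as
# observed load changes. Thresholds and pool limits are illustrative.

def desired_instances(current, load_per_instance, target_load=0.7,
                      min_instances=1, max_instances=100):
    """Return how many instances the pool should have given the observed load."""
    if load_per_instance > target_load:
        current += 1          # scale out: add a node
    elif load_per_instance < target_load / 2 and current > min_instances:
        current -= 1          # scale in: remove a node
    return max(min_instances, min(max_instances, current))

print(desired_instances(current=4, load_per_instance=0.9))  # 5
print(desired_instances(current=4, load_per_instance=0.2))  # 3
```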
Examples
Methods of adding more resources for a particular application fall into two broad
categories:[3] scaling vertically ("scale up"), by adding resources to a single node, and
scaling horizontally ("scale out"), by adding more nodes to the system.
As computer prices drop and performance continues to increase, low cost "commodity"
systems can be used for high performance computing applications such as seismic
analysis and biotechnology workloads that could in the past only be handled
by supercomputers. Hundreds of small computers may be configured in a cluster to
obtain aggregate computing power that often exceeds that of a single traditional
RISC-processor-based scientific computer. This model has further been fueled by the
availability of high performance interconnects such
as Myrinet and InfiniBand technologies. It has also led to demand for features such as
remote maintenance and batch processing management previously not available for
"commodity" systems.
The scale-out model has created an increased demand for shared data storage with
very high I/O performance, especially where processing of large amounts of data is
required, such as in seismic analysis. This has fueled the development of new storage
technologies such as object storage devices.
Tradeoffs
There are tradeoffs between the two models. Larger numbers of computers mean
increased management complexity, as well as a more complex programming model and
issues such as throughput and latency between nodes; also, some applications do not
lend themselves to a distributed computing model. In the past, the price differential
between the two models has favored "scale out" computing for those applications that fit
its paradigm, but recent advances in virtualization technology have blurred that
advantage, since deploying a new virtual system over a hypervisor (where possible) is
almost always less expensive than actually buying and installing a real one.
Database scalability
A number of different approaches enable databases to grow to very large size while
supporting an ever-increasing rate of transactions per second. Not to be discounted, of
course, is the rapid pace of hardware advances in both the speed and capacity of mass
storage devices, as well as similar advances in CPU and networking speed. Beyond
that, a variety of architectures are employed in the implementation of very large-scale
databases.
One technique supported by most of the major DBMS products is the partitioning of
large tables, based on ranges of values in a key field. In this manner, the database can
be scaled out across a cluster of separate database servers. Also, with the advent of
64-bit microprocessors, multi-core CPUs, and large SMP multiprocessors, DBMS
vendors have been at the forefront of supporting multi-threaded implementations that
substantially scale up transaction processing capacity.
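A minimal sketch of range partitioning as described above; the key boundaries and server names are hypothetical, and real DBMS products declare partitions in their own DDL rather than in application code:

```python
# Minimal sketch of range partitioning: each row is routed to one of several
# database servers based on which range its key falls into.

import bisect

# Upper bounds (exclusive) of each partition's key range (hypothetical).
BOUNDARIES = [1_000_000, 2_000_000, 3_000_000]
SERVERS = ["db-shard-0", "db-shard-1", "db-shard-2", "db-shard-3"]

def server_for_key(key):
    """Return the database server responsible for this key."""
    return SERVERS[bisect.bisect_right(BOUNDARIES, key)]

print(server_for_key(42))         # db-shard-0
print(server_for_key(2_500_000))  # db-shard-2
print(server_for_key(9_999_999))  # db-shard-3
```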
Network-attached storage (NAS) and storage area networks (SANs) coupled with fast
local area networks and Fibre Channel technology enable still larger, more loosely
coupled configurations of databases and distributed computing power. The widely
supported X/Open XA standard employs a global transaction monitor to
coordinate distributed transactions among semi-autonomous XA-compliant database
resources. Oracle RAC uses a different model to achieve scalability, based on a
"shared-everything" architecture that relies upon high-speed connections between
servers.
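XA-style coordination is built on the two-phase commit protocol; the sketch below illustrates that protocol in outline only and is not the X/Open XA API itself:

```python
# Minimal sketch of the two-phase commit pattern that underlies XA-style
# global transaction coordination. Illustrative only, not the XA API.

class Resource:
    """Stand-in for an XA-compliant resource manager (e.g. a database)."""
    def __init__(self, name, will_succeed=True):
        self.name = name
        self.will_succeed = will_succeed

    def prepare(self):
        # Phase 1: vote on whether this resource can commit.
        return self.will_succeed

    def commit(self):
        print(f"{self.name}: committed")

    def rollback(self):
        print(f"{self.name}: rolled back")

def run_global_transaction(resources):
    # Phase 1: ask every resource to prepare.
    if all(r.prepare() for r in resources):
        # Phase 2a: all voted yes, commit everywhere.
        for r in resources:
            r.commit()
        return True
    # Phase 2b: at least one vote was no, roll back everywhere.
    for r in resources:
        r.rollback()
    return False

run_global_transaction([Resource("orders-db"), Resource("billing-db")])
run_global_transaction([Resource("orders-db"), Resource("billing-db", will_succeed=False)])
```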
While DBMS vendors debate the relative merits of their favored designs, some
companies and researchers question the inherent limitations of relational database
management systems. GigaSpaces, for example, contends that an entirely different
model of distributed data access and transaction processing, named space-based
architecture, is required to achieve the highest performance and scalability.[4] On the
other hand, Base One makes the case for extreme scalability without departing from
mainstream database technology.[5] In either case, there appears to be no end in sight
to the limits of database scalability.
Strong versus weak scaling
In the context of high performance computing there are two common notions of
scalability. The first is strong scaling, which is defined as how the solution time varies
with the number of processors for a fixed total problem size.[6] The second is weak
scaling, which is defined as how the solution time varies with the number of processors
for a fixed problem size per processor.
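From these definitions, parallel efficiency is commonly reported as T1/(N·TN) for strong scaling and T1/TN for weak scaling; the short illustration below uses hypothetical timings:

```python
# Efficiency figures derived from the strong- and weak-scaling definitions.
# The timings below are hypothetical.

def strong_scaling_efficiency(t1, tn, n):
    """Fixed total problem size: ideal speedup is n, so efficiency = t1 / (n * tn)."""
    return t1 / (n * tn)

def weak_scaling_efficiency(t1, tn):
    """Fixed problem size per processor: ideally tn stays equal to t1."""
    return t1 / tn

# Hypothetical timings for 16 processors.
print(strong_scaling_efficiency(t1=100.0, tn=8.0, n=16))  # 0.78 -> ~78% efficient
print(weak_scaling_efficiency(t1=100.0, tn=125.0))        # 0.8  -> 80% efficient
```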