
Oracle RAC

The Effectiveness
of Scaling Out

Erik Peterson
RAC Development
Server Technologies
Oracle

Agenda

 What is RAC?
 Why does it Scale?
 Why Scale Out?
 Scale Out Examples
 Scale Out or Scale Up?
 Improving Scalability

Oracle RAC
Architecture

[Architecture diagram] Application servers and users connect over the network; a centralized management console oversees the cluster. Clustered database instances share a cache across a low-latency, high-speed interconnect (VIA or proprietary, through a high-speed switch) and attach via a storage area network fabric (hub or switch) to a mirrored disk subsystem. Callouts: "No Single Point Of Failure" and "Drive and Exploit Industry Advances in Clustering".
Why Does RAC Scale to Many Nodes?
Messaging cost independent of cluster size
[Diagram] Updating block 10: the requester (Instance A) sends a message (1) to the block's GCS master (Instance B), which forwards the request (2) to the current holder (Instance C); the holder then ships the current block directly to the requester (3). At most three instances participate, no matter how many are in the cluster.
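A minimal sketch of this constant-cost lookup, in Python (illustrative only, not Oracle's code; the instance names, hash-based master assignment, and block values are all invented):

# Illustrative sketch of the 3-hop Global Cache Service request path.
class Cluster:
    def __init__(self, names):
        self.names = sorted(names)
        self.cache = {n: {} for n in names}   # per-instance buffer caches
        self.holder = {}                      # GCS directory: block -> holder

    def master_of(self, block):
        # Every block hashes to exactly one master, however large the cluster.
        return self.names[block % len(self.names)]

    def request(self, requester, block):
        master = self.master_of(block)        # hop 1: requester -> master
        hops = 1
        holder = self.holder.get(block)       # master consults its directory
        if holder and holder != requester:
            hops += 1                         # hop 2: master -> current holder
            self.cache[requester][block] = self.cache[holder][block]
            hops += 1                         # hop 3: holder ships the block
        else:
            self.cache[requester][block] = "read-from-disk"
        self.holder[block] = requester
        return hops                           # never more than 3

c = Cluster(["A", "B", "C"])
c.cache["C"][10] = "v200"                     # Instance C holds block 10
c.holder[10] = "C"
print(c.request("A", 10))                     # -> 3, even with 100 instances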
Why Scale Out?

Economies of Scale
[Chart] System cost and performance vs. number of CPUs in a single node, with a marked "sweet spot"; beyond it, cost grows faster than performance.

Higher Availability

[Diagram] A four-node cluster runs CRM, OE, Payroll, and Email; when a node fails, its workload shifts to the surviving nodes.

Node failure has less impact


Scale Out
or
Scale Up

Major Bank testing Siebel

[Chart] Response time vs. user load for three configurations: 10 CPU, 20 CPU, and 2 x 10 CPU, with the 2 x 10 CPU curve labeled "Response time with RAC".

Scale Out at a Fraction of the Cost
[Chart] Paired SMP and RAC results (scale 0 to 120,000) at 16, 48, 64, and 72 CPUs; each pairing is an audited or customer benchmark, and the largest RAC configurations consist of ten smaller nodes (x10).
HP RAC vs. SMP TPC-C
[Chart] RAC delivers 118% of the big SMP result on the same 1.5 GHz Itanium2 CPUs:
– SMP, 1 node x 64 CPUs: 1,008,144 tpmC at $8.33/tpmC
– RAC, 16 nodes x 4 CPUs: 1,184,893 tpmC at $5.52/tpmC

As of September 13, 2006: HP Integrity Superdome, 1,008,144.49 tpmC, $8.33/tpmC, available 4/14/04. HP Integrity rx5670, 1,184,893 tpmC, $5.52/tpmC, available 4/30/06. Source: Transaction Processing Performance Council (TPC), www.tpc.org
HP RAC vs. SMP TPC-C

 Details
– Same CPUs (Intel Itanium2 1.5 GHz)
– RAC had less total memory (768 GB vs. 1,024 GB)
– RAC / Linux vs. SMP / HP-UX
 List prices for processor hardware*
– SMP: $7,921,505
– Cluster: $2,620,866

* Includes processors, OS, memory, cluster interconnects, and support
IBM Oracle Applications Benchmark
Oracle Applications Standard Benchmark (OASB)

[Chart] RAC delivers 96% of the big SMP result on the same 1.7 GHz Power4+ CPUs:
– SMP, 1 node x 16 CPUs: 22,008 users*
– RAC, 4 nodes x 4 CPUs: 21,168 users*

*Audited
Source: http://www.oracle.com/apps_benchmark
IBM Oracle Applications Benchmark

 Details
– Same IBM CPUs (1.7 GHz)
– Same total memory (256 GB)
– Same operating system (AIX 5L)
 List prices for processor hardware*
– SMP $1,405,750 list
– Cluster $788,000 list **

* Includes processors, OS, memory, and cluster interconnects


** RAC software adds $320,000 list

Customer Loan Processing Benchmark
[Chart] RAC delivers 115% of the big SMP result on the same Sun 900 MHz CPUs:
– SMP, 1 node (1 x 48 CPUs): 27,500 loan applications processed per minute
– RAC, 2 nodes (2 x 24 CPUs): 31,578 loan applications processed per minute
Customer Loan Processing Benchmark

 Details
– Same CPUs (Sun 900 MHz)
– Same total memory (288 GB)
– Same Sun Solaris operating system
 Price comparison (N/A)
– Cluster was constructed by partitioning a 48-CPU Sun Fire 15K server into two 24-CPU domains

Customer Loan Processing Benchmark

 Customer is one of the world's largest financial services companies
 Goal: Determine which platforms can meet
peak processing load requirements
– Process 30 million loan applications in 15 hours
 2 million/hour or 33,333/minute sustained
– Mix of transactions, e.g.,
 T1 = 13 reads, 4 inserts, 2 updates
 T2 = 5 reads, 2 updates
 T3 = 1 read, 6 updates

Customer Telecom Benchmark
[Chart] RAC delivers 103% of the big SMP result on the same Sun 1.2 GHz CPUs:
– SMP, 1 node (1 x 72 CPUs): 423,420 transactions/hour
– RAC, 2 nodes (2 x 36 CPUs): 437,070 transactions/hour
Customer Telecom Benchmark

 Customer is one of the world's largest telecom companies
 Goal: Determine which platforms can meet peak processing load requirements
– 400,000 transactions per hour
– < 0.8 second response time
– Complex mix of transactions, e.g.,
 Complex Routing 1%
 Route Report 46%
 Status Change 16%
 Simple Enquiry 2%
 Node Query 12%
 Location Query 11%
 Termination Query 12%
Customer Telecom Benchmark

 Details
– Same CPUs (Sun 1.2 GHz)
– Same total memory (96 GB)
– Same Sun Solaris operating system
 Price comparison (N/A)
– Cluster was constructed by partitioning a Sun SMP server into two 36-CPU domains

Cost Savings
[Chart] CPU costs (list prices) to handle 550,000 transactions/hour:
– (1) 72-CPU UNIX SMP: $2,700,000
– (4) 4-CPU Dell Itanium 2-based 7250s: $160,000
– (10) 2-CPU Dell Xeon-based 1750s: $60,000

Joint tests done by Dell, EMC & Oracle, based on a telecom application workload.
Hitachi BladeSymphony Test Using
a Real Stock Exchange Workload
[Chart] Throughput (tps) by number of nodes (1, 2, 4, 6, 8) and CPUs per node, with per-step scalability between 64% and 88%:
– 2 CPUs: 640 → 1,200 → 2,208 tps
– 4 CPUs: 1,200 → 2,200 → 3,888 → 6,624 tps
– 8 CPUs: 2,000 → 3,600 → 6,480 tps
Workloads of 10 and 24 business units were used.
Why Does RAC Scale Well?

 Scalable interconnect vs. complex system bus
– Max 3-way protocol, regardless of cluster size
– 99% of customers stay within the capacity of a single GigE interconnect, and another can easily be added
– SMP requires synchronization for every load and store operation (millions+/sec); RAC generates 3 to 4 orders of magnitude fewer messages (see the back-of-envelope sketch below)
 Extends the limits of a single machine architecture
– No need for a complex bus
– Virtually unlimited HBAs, memory, and CPUs
– Avoids in-memory contention
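A back-of-envelope sketch of the bandwidth claim, with purely hypothetical rates (neither traffic figure below is a measurement):

# Back-of-envelope only: both rates are assumed, not measured.
BLOCK = 8 * 1024                  # bytes; a typical Oracle block size
SHIPPED_PER_SEC = 5_000           # hypothetical inter-instance block transfers
GIGE = 125_000_000                # 1 Gb/s in bytes/sec

print(f"interconnect use: {SHIPPED_PER_SEC * BLOCK / GIGE:.0%}")   # ~33% of one GigE

BUS_SYNC_PER_SEC = 100_000_000    # hypothetical SMP load/store coordination rate
print(f"ratio: ~{BUS_SYNC_PER_SEC // SHIPPED_PER_SEC:,}x")         # ~20,000x, i.e. 4 orders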

Scale Out
Examples

Increasing # of Wide Clusters

[Chart] Source: New Linux environments in the RAC Customer Tracking System
7 of the 8 Biggest Linux DWs Run on RAC

[Chart] Cluster sizes: 16, 4, 8, 8, 3, 2, and 8 nodes.

Source: Winter Corp 2005 Survey
Amazon RAC/Linux DW – Oracle10g

                 Query DW   ETL
#Nodes x #CPUs   16 x 4     8 x 4
Total DB         61 TB      10 TB
Data             51 TB      7 TB
Index            2 TB       1-2 TB
Disk             71 TB      36 TB

In Oct 2005 Amazon merged its 35 TB Clickstream DW's data into its 25 TB Query DW. It ran rock-solid through its end-of-year peak season. At the time, Clickstream and Query were already 2 of the Top 10 DWs in the world: www.wintercorp.com. The Query DW is now 61 TB+.
Amazon DW Modular Architecture
Oracle10g RAC
Amazon's RAC is so cost-effective they run 2 concurrently and still save money!

[Diagram] Pipeline: 1. Extract from source systems (Extract Servers) → 2. Integrate, transform, and denormalize (STG1, ETL/Staging, 8 nodes x 4 CPUs, driven by an ETL Manager) → 3. Query and analyze (ADS1, Atomic Data Store, feeding the Query DW, 16 nodes x 4 CPUs) → 4. Data access and publishing (users via a DSS UI client). A second, identical pair of RAC clusters (STG2/ADS2 and a second Query DW) means "no need for backup" for active online data.
Amazon.com – Oracle10g
 61 TB production Query DWs
– 100,000 queries/week (mostly complex ad hoc)
– Amazon runs 2 identical 61 TB+ query DWs, loaded concurrently. Config for each is:
 16-node RAC/Linux cluster
 Oracle10g R1 RAC using ASM on Red Hat Enterprise Linux 3
 16 HP DL580s, each with four 3-GHz CPUs
 71 HP StorageWorks MSA1000s
 8 32-port Brocade switches
 1 Gigabit interconnect
– DW metrics
 Each holds 51 TB of raw data (growing at 2x per year)
 Each is 61 TB total database size with only 2 TB of indexes
 71 TB total disk for each

Amazon.com DW Statistics
                   2000    2001    2002    2003    2004    2005    2006
DW Size            ~1 TB   3.5 TB  10 TB   15 TB   20 TB   25 TB   61 TB
DW Data            ~1 TB   2.3 TB  9 TB    13 TB   18 TB   23 TB   51 TB
Users              330     512     800     830     830     830     830
Queries/Day        630     1000    4200+   6,000   7,000   8,000   14,000
% < SLA            63%     77%     80%+    80%+    80%+    80%+    80%+
Direct SQL Access  No      Yes     Yes     Yes     Yes     Yes     Yes
User-Pub'd Repts   No      Yes     Yes     Yes     Yes     Yes     Yes
In just 6 years:
- 50x growth in data volume
- 16x growth in query volume
- ~3x growth in number of users
- additional lines of business / product lines supported
- huge standard reporting growth -> many more partners supported
…and still meeting SLAs – with ever-improving price/performance!
Mercado Libre
 eBay in Latin America
 Runs marketplace on RAC
 Scaled incrementally as marketplace grew
[Chart] Business volume (scale 0 to 1,600,000) and number of nodes, 2004 to 2006; nodes were added incrementally as business volume grew.
Mercado Libre
Performance Characteristics

 MercadoLibre's 13-node Linux Itanium cluster:
• 460 GB RAM clusterwide
• 286 GB SGA
• 14,500 URLs/second
• 47 GB of redo per day
 Uses at most 40% of the capacity of a single Gigabit Ethernet interconnect

J2 Global

[Diagram] A single 16-node Oracle 10g Sun Solaris cluster, split between Reporting and OLTP:
 12 nodes run a Data Guard copy of production for reporting & DR
 4 nodes run several OLTP databases

Dell IT – Tests of Oracle EBS
User Count Scalability

[Chart] Users (0 to 4,000) vs. nodes (2 to 8), plotting Actual against Ideal scalability.
128 Node Scalability Proof of Concept in Japan

System Configuration Overview

 128 “blade servers” for the RAC instances
 Two NFS servers for storage
 Two workload generator servers
 Two network segments
• #1 for CSS / RAC traffic
• #2 for NFS / Application traffic

How far can RAC scale with a
single interconnect?
[Chart] Internode parallel query test results. Axes: Scalability (Elapsed time / Elapsed Time @ 1 instance), 0 to 128, vs. Degree of Parallelism (#instances), 0 to 128 in steps of 8.
AC3 - Australia

 World's Largest RAC/Linux Cluster
• High Performance Technical Computing configuration
• 155-node cluster of 2-way IA32 Dell servers
• Total purchase price < $1M AUD
 Oracle10g RAC Proof of Concept
• Red Hat Linux 3.0
• Network Appliance storage
• 63-node database cluster built
• Linear scalability demonstrated

Gas Natural Grid Environment
Clusters in Production and in Process

• Corporate DW
• SAP BI
• Electricity Dispatching
• Siebel – Europe
• SAP ERP
• Siebel – Brazil

Callouts: wide Linux RAC environments are now standard deployment; order-of-magnitude cost savings; showing scalability of OLTP & DW environments.

When does RAC Scale?

If your application scales transparently on SMP, it is realistic to expect it to scale well on RAC without any changes to the application code.

Network Resources and
Scalability
 Verify interconnect resources
– private network
– ports set to maximum bit rate (e.g., 1 Gb/sec)
– full duplex
– network buffers (e.g., socket receive buffers, RX/TX descriptors)
 Monitor the interconnect network and IPC (see the sketch below)
– bandwidth used
– discarded and dropped packets
– buffer overflows
– reassembly failures
– "lost blocks": gc blocks lost
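A minimal monitoring sketch under stated assumptions: it uses the python-oracledb driver with placeholder credentials, and reads the standard global cache statistics from v$sysstat:

# Sketch: spot-check "lost blocks" from v$sysstat with python-oracledb.
import oracledb

NAMES = ("gc cr blocks received", "gc current blocks received", "gc blocks lost")

with oracledb.connect(user="system", password="...", dsn="dbhost/orcl") as conn:
    cur = conn.cursor()
    cur.execute(
        "SELECT name, value FROM v$sysstat WHERE name IN (:1, :2, :3)", NAMES
    )
    stats = dict(cur.fetchall())

received = sum(stats.get(n, 0) for n in NAMES[:2])
lost = stats.get("gc blocks lost", 0)
# A sustained non-zero loss ratio points at the network layer, not Oracle.
print(f"received={received} lost={lost} ({100 * lost / max(received, 1):.4f}%)")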

OS and Disk Resources and
Scalability
 Higher and fixed priority for block server processes
– scheduling/starvation affects message latency
 Fewer block server processes (LMS) are usually more efficient
 Determine the maximum possible read/write IO throughput
– important for loading and querying in parallel
 Establish a baseline for write IO latency (a rough probe is sketched below)
– "slow" log writes may affect block access time
 Many long-term and transient performance and scalability problems are caused by OS and disk resource/capacity problems
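A rough, POSIX-only probe for that write-latency baseline (the file path, block size, and iteration count are arbitrary assumptions; use a real IO benchmark against the actual log device for production numbers):

# Baseline synchronous write latency, a stand-in for redo-log write behavior.
import os, time, statistics

PATH, BLOCK, N = "/tmp/io_probe.bin", 4096, 200    # hypothetical test file
buf = os.urandom(BLOCK)
fd = os.open(PATH, os.O_WRONLY | os.O_CREAT | os.O_SYNC, 0o600)
lat = []
try:
    for _ in range(N):
        t0 = time.perf_counter()
        os.write(fd, buf)                  # O_SYNC: returns after the write is stable
        lat.append(time.perf_counter() - t0)
finally:
    os.close(fd)
    os.unlink(PATH)

print(f"median {statistics.median(lat) * 1e3:.2f} ms, max {max(lat) * 1e3:.2f} ms")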

Improving Scalability
 Tune serializing contention
– concurrent access to the same block does not scale anywhere
 Tune SQL execution
– the most efficient plan in a single instance is also the most efficient one in RAC
 Large cache for Oracle sequence numbers (see the DDL sketch below)
 "Sparse" (high PCTFREE) or small block sizes
– for small, in-memory tables with frequent concurrent access
 Parallel execution (inter- or intra-node)
– for large data scans, loads, and index rebuilds
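The sequence-cache and sparse-block recommendations, sketched as DDL issued through python-oracledb (all object names, cache sizes, and credentials are hypothetical):

# Sketch only; adapt names and sizes to your schema.
import oracledb

DDL = (
    # Big cache + NOORDER lets each instance take ranges of numbers locally
    # instead of coordinating every NEXTVAL across the cluster.
    "CREATE SEQUENCE order_id_seq CACHE 1000 NOORDER",
    # High PCTFREE spreads hot rows over more blocks, so fewer sessions
    # collide on the same block.
    "CREATE TABLE hot_lookup (id NUMBER PRIMARY KEY, val NUMBER) PCTFREE 90",
)

with oracledb.connect(user="scott", password="...", dsn="dbhost/orcl") as conn:
    cur = conn.cursor()
    for stmt in DDL:
        cur.execute(stmt)    # DDL auto-commits in Oracle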

Performance Diagnostics

 Use advisories provided by EM or ADDM
– interpreted findings and recommendations
– thresholds and alerts in infrastructure
 Use the same rationales as with one instance
– the interconnect network is an "IO" resource
 Save the AWR repository (a snapshot-listing sketch follows below)
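One possible first step, sketched with python-oracledb and placeholder credentials: list the snapshots the repository currently holds before exporting them or extending retention. DBA_HIST_SNAPSHOT is the standard AWR snapshot view:

# Sketch: enumerate AWR snapshots prior to preserving the repository.
import oracledb

with oracledb.connect(user="system", password="...", dsn="dbhost/orcl") as conn:
    cur = conn.cursor()
    cur.execute(
        "SELECT snap_id, begin_interval_time, end_interval_time "
        "FROM dba_hist_snapshot ORDER BY snap_id"
    )
    for snap_id, begin_t, end_t in cur:
        print(snap_id, begin_t, "->", end_t)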

RAC Scalability Best Practices

 Better HA with 4+ nodes
 2+ CPU nodes for OLTP, 4+ CPU nodes for DW
 Use the same scalability tuning mechanisms for RAC as you would for SMP

Q U E S T I O N S
A N S W E R S
