
Everything You Know About
Essbase BSO Optimization Is
Incomplete, Outdated, or Just Plain Wrong
Wayne D. Van Sluys
Senior Consultant
wvansluys@interrel.com
Joe Aultman
Director of Strategic Services
jaultman@interrel.com

Disclaimer
These slides represent the work and opinions of the
presenter and do not constitute official positions of
Oracle or any other organization.
This material has not been peer reviewed and is
presented here with the permission of the presenter.
This material should not be reproduced without the
written permission of interRel Consulting.

About interRel
Reigning Oracle Award winner
EPM & BI Solution of the year
Three Oracle ACE Directors
Authors of 10+ of the best-selling
books on Hyperion & Essbase
Oracle Platinum Partner
One of the 100 fastest growing tech
companies in the USA (CRN
Magazine)
One of the fastest growing companies
in USA (Inc. Magazine 2007-present)

Infrastructure

Press

Consulting

Strategy

Support

Training

Founded in 1997, we are the
longest-standing, Oracle EPM/BI-dedicated partner in the world

Oracle EPM 11.1.2 Books For Sale at the Registration Desk!

3 New Hyperion Planning Books!


Planning 11.1.2.2/11.1.2.3:
Creating Applications
Planning 11.1.2.2/11.1.2.3:
Advanced Planning
Planning 11.1.2.2/11.1.2.3:
An End User's Guide
Smart View 11.1.2.2: End User Guide
Essbase Studio 11.1.2.2
Essbase 11: Admin Guide
Visit interRel.com to purchase these
titles & more!

Old School Tuning

All rules declared as absolute rules
All designed for 32-bit servers
Most created on version 7 or earlier
General rule: optimize for bulk aggregations, not retrieval
time or anything else
Get the database into memory and minimize the disk I/O
Only one cube ever processes at a time
Doesn't take into account the two most common server
environments of today:
64-bit servers (including Exalytics)
Virtual Machines

Re-Benchmarking Everything

I tested every crazy thing I could think of
Multiple databases of differing sizes and dimensionality
Ran iterations with one cube at a time and with up to 5
running simultaneously to get averages
Two different servers:
64-bit Virtual Machine on Windows (in the interRel labs)
with 10 GB of RAM and 4 CPUs
64-bit Oracle Exalytics on Linux (WellPoint's data center)
with 1 TB of RAM and 40 cores (hyperthreaded to 80)
and 700 GB of RAM used for a RAM disk

Which of these are still true?

Hourglass-on-a-stick shape
Dense/sparse don't impact database size
Small block size (8 KB)
Bitmap compression
Hold the index in cache
Turn off hyperthreading
Data cache 1/8 of database size
Don't use Direct I/O
Dynamic calcs slow down retrievals
FIX on sparse, IF on dense
More threads, the better

Old School: Dimension Ordering

Largest Dense Dimensions
Smallest Dense Dimensions
Smallest Aggregating Sparse Dimensions
Largest Aggregating Sparse Dimensions
Non-aggregating Sparse Dimensions

Dimension Ordering

Doesn't matter if dense dimensions are first or
sparse dimensions are first
Size of dense dimensions doesn't matter
Order dense based on compression order
Except Essbase changes the order internally, so it won't
matter much anyway in most cases
Reporting was fastest sorting the dense and sparse
dimensions together alphabetically
But a close second was heavily queried sparse
dimensions first
Calcs best when sparse dimensions are in ratio order of
Leaves:StoredAncestors (smallest to largest)
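As a sketch of that ratio ordering (the member counts below are hypothetical, purely for illustration):

```
Entity:   2,000 level-0 members,  800 stored ancestors -> ratio 2.5:1
Product: 10,000 level-0 members,  500 stored ancestors -> ratio 20:1
Scenario: non-aggregating sparse                       -> last

Outline order for calcs: Entity, Product, Scenario
(smallest Leaves:StoredAncestors ratio first)
```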

Old School: Dense/Sparse Impact on Database Size

Density and sparsity have little effect on PAG file size
The database still has to store all the numbers
Small blocks can make for large index files

Dense/Sparse Impact on Database Size


Headers take 72 bytes in most compressions
Small block (112 bytes): 111 GB page, 22 GB index
Medium block (2.4 KB): 14 GB page, 2.4 GB index
Large block (2.5 MB): 3 GB page, 0.5 GB index
Huge block (7 MB): 22 GB page, 0.2 GB index
Database size was the same in all environments

Old School: Block Size

Generally, small block sizes are best
1-8 KB, though we do have this viewpoint from the
DBAG: "A data block size of 8 Kb to 100 Kb
provides optimal performance in most cases."
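For reference, BSO block size is the number of stored cells across the dense dimensions times 8 bytes per cell; a quick sketch with made-up stored member counts:

```
Stored dense members: Time = 12, Measures = 40, Scenario = 2

Block size = 12 * 40 * 2 cells * 8 bytes/cell
           = 960 cells * 8 bytes
           = 7,680 bytes (about 7.5 KB)
```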


Block Size

Can affect timings on reports from 161 seconds
to 19,227 seconds
Most optimal block sizes:
VM reporting: 4 KB
Exalytics reporting: 4 KB
VM calc: 4 KB
Exalytics calc: 2 MB
Your results may vary


Old School: Compression Type

Bitmap is the default
RLE can result in a smaller block size if Time is the
first dense dimension, but it will take longer to
process RLE

Compression Type

VM reporting:
RLE, then Bitmap, then zLib
20% total reporting time difference
Exalytics reporting:
Really has no impact (less than 1% from best to worst)
VM calculation:
RLE, then Bitmap, then zLib
20% total calc time difference
Exalytics calculation:
Bitmap, then RLE, then No Compression, then zLib
39% total calc time difference
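Compression type can be switched per database in MaxL to benchmark it yourself; a hedged sketch (Sample.Basic is a stand-in for your own app.db, and the new type applies to blocks written after the change):

```
/* Try RLE on the VM environment */
alter database Sample.Basic set compression rle;

/* Revert to the bitmap default (the Exalytics winner above) */
alter database Sample.Basic set compression bitmap;
```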

Old School: Index Cache

Default
Buffered I/O: 1024 KB (1048576 bytes)
Direct I/O: 10240 KB (10485760 bytes)
Guideline:
Combined size of all essn.ind files, if possible;
otherwise, as large as possible
Do not set this cache size higher than the total
index size, as no performance improvement
results

Index Cache

Inconsistent results depending on cube (you'll see
this in the real world)
Inconsistent on reporting
1 MB-100 MB index cache seems best, but the
difference is less than 15% anyway
Definitely not the whole index
VM calculation
100 MB better, but only by 14% over defaults or 200 MB
Exalytics calculation
Default is best (literally, 1 MB) by as much as 68%
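Testing an index cache size is a one-line MaxL change; a hedged sketch (database name is a placeholder, and the size is in bytes):

```
/* Try a 100 MB index cache on the VM (100 * 1048576 bytes) */
alter database Sample.Basic set index_cache_size 104857600;

/* Back to the buffered-I/O default of 1 MB for Exalytics */
alter database Sample.Basic set index_cache_size 1048576;
```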

Old School: Hyper Threading

Turning this on basically makes the OS think you
have more CPUs/cores than you do
This slows down calculations and should be turned
off at the BIOS level

Hyper Threading

Doesn't hurt performance on Exalytics when the
threads launched are fewer than the cores on the box
Helps when threads exceed the core count on the box

Old School: Data Cache


Default
3072 KB (3145728 bytes)
Guideline
0.125 * Combined size of all essn.pag files, if possible;
otherwise as large as possible
Increase value if any of these conditions exist:
Many concurrent users are accessing different data blocks
Calculation scripts contain functions on sparse ranges, and
the functions require all members of a range to be in memory
(for example, when using @RANK and @RANGE)
For data load, the number of threads specified by the
DLTHREADSWRITE setting is very high and the expanded
block size is large

Data Cache

Inconsistent on reporting
3 MB-300 MB data cache seems best, but the
difference is less than 15% anyway
Definitely not as much as 1/8 of page file size
(setting it that large definitely hurts performance in
all environments)
VM calculation
300 MB better, but only by 14% over defaults or 600 MB
Exalytics calculation
Default is best (literally, 3 MB) by as much as 68%
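The same kind of MaxL sketch works for the data cache (placeholder database name; sizes in bytes):

```
/* ~300 MB data cache for VM calculations (300 * 1048576 bytes) */
alter database Sample.Basic set data_cache_size 314572800;

/* Back to the 3 MB default for Exalytics */
alter database Sample.Basic set data_cache_size 3145728;
```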

Old School: Direct I/O

This was introduced to have Essbase take over the
disk cache from the OS
While in specific circumstances (particularly tuned
UNIX instances, for example) it can improve
performance, in general it should not be used
Use Buffered I/O instead (it's the default)

Direct I/O

Not available on Linux (i.e., Exalytics)
On VM, calculations get 50% slower
On VM, retrievals get 15 times faster
Or they get 24 times slower
(depending on the database)
In other words, decide on your own for VMs, but
when in doubt, don't use it
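If you do decide to test it on a VM, the I/O mode can be flipped in MaxL; a hedged sketch (placeholder database name, and the change takes effect only after the database restarts):

```
/* Buffered is the default; direct is the mode being tested */
alter database Sample.Basic set io_access_mode direct;

/* Revert if retrievals land on the "24 times slower" side */
alter database Sample.Basic set io_access_mode buffered;
```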

Old School: Dynamic Calculations

According to the Essbase Database Administrator's
Guide:
Turning on dynamic calculations speeds up
calculations
But they slow down retrievals

Dynamic Calculations

On dense dimensions, retrievals speed up when
members are dynamically calculated
Didn't have a single benchmark where dynamically
calculated dense members didn't speed up
performance on both retrievals and calculations
They do slow down retrievals on sparse dimensions
Much less impact on Exalytics
Calculation is still helped by dynamically
calculating the top (and maybe even other) levels
of some sparse dimensions

Old School: FIX on sparse, IF on dense

Generally, IF works best on dense dimension members
Particularly when ELSEIF or ELSE are included
Block is only brought into memory once and all
related conditions are processed
FIX works best on sparse dimension members

FIX on sparse, IF on dense

Incorrect anyway
FIXing on dense is fine if you're not doing something
different to different members
Testing this, you get 3x better performance FIXing on
dense than IFing when you're doing the same thing to
the same members
What if you're doing different things to different
dense members?
If the data is cached, you get 2x better performance
on FIX dense vs. IF dense
Not cached, FIX can be 2-12x slower than IF
IF on sparse is still bad, because IF is bad

FIX on sparse, IF on dense recommendation

Forget "FIX on sparse, IF on dense"
If you're doing the same thing to different members,
FIX on sparse and dense
If you're doing different things to different members,
FIX on sparse
FIX on dense if the data is likely already cached
and you only have a couple of FIXes (literally, like 2,
maybe 3)
In other cases, IF on dense
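A calc script sketch of that recommendation; the member names and layout are assumptions (Sample.Basic-style, with Year and Measures dense; Market, Product, and Scenario sparse), not a definitive implementation:

```
/* Same thing to the same members: FIX on both sparse AND dense */
FIX ("Budget", @LEVMBRS ("Market", 0), @LEVMBRS ("Product", 0))
    FIX ("Jan", "Feb", "Mar")
        "Sales" = "Sales"->"Actual" * 1.1;
    ENDFIX
ENDFIX

/* Different things to different dense members (and data not
   likely cached): FIX on sparse, IF on dense */
FIX ("Budget", @LEVMBRS ("Market", 0), @LEVMBRS ("Product", 0))
    "Sales" (
        IF (@ISMBR ("Jan"))
            "Sales" = "Sales"->"Actual" * 1.1;
        ELSE
            "Sales" = "Sales"->"Actual";
        ENDIF
    );
ENDFIX
```

The second pattern brings each block into memory once and handles all the dense-member conditions inside it, which is exactly why IF beats many dense FIXes when the data is not cached.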

Old School: Parallel Calculations

Set in Essbase.CFG or a calc script to launch
simultaneous threads
Can launch up to 128 threads for many activities
Use CALCPARALLEL set to max in most cases
Or, according to the DBAG, number of CPUs/cores
minus 1

Parallel Calculations

There are 40 cores (80 hyperthreaded) on Exalytics
Performance gets slower the more cores you add
beyond a certain point
Not a lot; it gets up to 27% worse
So tune your parallel count
[Chart: calculation time (sec), 50-190, vs. thread count, 0-100]
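Tuning the parallel count from a calc script might look like the sketch below; the thread count is an example to benchmark against your own box, not a recommendation (ESSBASE.CFG's CALCPARALLEL sets the server-wide default instead):

```
/* Benchmark rather than defaulting to the maximum:
   past a certain thread count, calc times got worse above */
SET CALCPARALLEL 40;
SET CALCTASKDIMS 2;  /* sparse dims used to build parallel task lists */

CALC ALL;
```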

That's helpful, I think,
but real world, what do I do when
optimizing for each situation?

Optimizations: VM Retrievals

Heavily queried sparse dimensions first
Rest of the dimension order doesn't matter
Small block size
RLE compression
Index cache has minimal impact (keep it relatively small)
Data cache has minimal impact (keep it relatively small)
Don't use Direct I/O
Dynamic calcs slow down retrievals on sparse
FIX on dense is often a good thing
Tune the threads

Optimizations: 64-Bit Retrievals

Heavily queried sparse dimensions first
Rest of the dimension order doesn't matter
Small block size
No compression impact
Index cache has minimal impact
Data cache has minimal impact
Don't use Direct I/O
Dynamic calcs slow down retrievals on sparse
FIX on dense is often a good thing
Tune the threads

Optimizations: VM Calculations

Sparse dimensions in ratio order
Rest of the dimension order doesn't matter
Small block size
RLE compression
Index cache has minimal impact
~300 MB data cache (still, minimal impact)
Don't use Direct I/O
Dynamic calcs slow down retrievals on sparse
FIX on dense is often a good thing
Tune the threads

Optimizations: 64-Bit Calculations

Sparse dimensions in ratio order
Rest of the dimension order doesn't matter
Large block size (2 MB)
Bitmap compression
Default for index cache
Default for data cache
Use hyperthreading
Don't use Direct I/O
Dynamic calcs slow down retrievals on sparse
FIX on dense is often a good thing
Tune the threads

In conclusion

Thanks to Jonathon Eastman from WellPoint and
Jason Novikoff & Robert Gideon from interRel
Don't trust everything you hear
Especially don't trust things you "know" are 100% true
Every situation is different, so...

Every few years,
throw out the playbook!

Everything You Know About
Essbase BSO Optimization Is STILL
Incomplete, Outdated, or Just Plain Wrong
Wayne D. Van Sluys
Senior Consultant
wvansluys@interrel.com
Joe Aultman
Director of Strategic Services
jaultman@interrel.com
