
Updated: Price Comparison for Big Data Appliance and Hadoop (The ... file:///C:/Users/lenovo/Desktop/Vigneswara-DrKVR/big data server/U...


Oracle Blog

The Data Warehouse Insider

Technical details, ideas and news on data warehousing and big data from the Oracle Team


Updated: Price Comparison for Big Data Appliance and Hadoop

By Jean-Pierre Dijcks on Apr 03, 2014

It was time to update this post a little. Big Data Appliance grew and got more features, and
prices as well as insights changed across the board. So, here is an update.

The post is still aimed at providing a simple apples-to-apples comparison and a clarification of
what is, and what is not included in the pricing and packaging of Oracle Big Data Appliance
when compared to "I'm doing this myself - DIY style".

Oracle Big Data Appliance Details

A few of the most overlooked items in pricing out a Hadoop cluster are the cost of software,
the cost of actual production-ready hardware and the required networking equipment. A
Hadoop cluster needs more than just CPUs and disks... For Oracle Big Data Appliance we
assume that you would want to run this system as a production system (with hot-pluggable
components and redundant components in your system). We also assume you want the leading
Hadoop distribution plus support for that software. You'd want to look at securing the cluster
and possibly encrypting data at rest and over the network. Speaking of network, InfiniBand
will eliminate network saturation issues - which is important for your Hadoop cluster.

With that in mind, Oracle Big Data Appliance is an engineered system built for production
clusters. It is pre-installed and pre-configured with Cloudera CDH and all (I emphasize all!)
options included and we (with the help of Cloudera of course) have done the tuning of the
system for you. On top of that, the price of the hardware (US$ 525,000 for a full rack system;
more configurations and smaller sizes are available) includes the cost of Cloudera CDH, its
options and Cloudera Manager (for the life of the machine - so not a subscription).

So, for US$ 525,000 you get the following:

Big Data Appliance Hardware (comes with Automatic Service Request upon component
failures)
Cloudera CDH and Cloudera Manager
All Cloudera options as well as Accumulo and Spark (CDH 5.0)
Oracle Linux and the Oracle JDK
Oracle Distribution of R
Oracle NoSQL Database Community Edition
Oracle Big Data Appliance Enterprise Manager Plug-In

The support cost for the above is a single line item. The list price for Premier Support for
Systems per the Oracle Price list (see source below) is US$ 63,000 per year.

To do a simple 3 year comparison with other systems, the following table shows the details
and the totals for Oracle Big Data Appliance. Note that the only additional item is the cost of
installation and configuration, which is done on-site by Oracle personnel or partners:

                                  Year 1      Year 2      Year 3      3 Year Total
BDA Cost                          $525,000
Annual Support Cost               $63,000     $63,000     $63,000
On-site Install (approximately)   $14,000
Total                             $602,000    $63,000     $63,000     $728,150

For this you will get a full rack BDA: 18 Sun X4-2L servers, 288 cores (two Intel Xeon
E5-2650 v2 CPUs per node), 864TB of disk (twelve 4TB disks per node), plus software, plus
support, plus on-site setup and configuration. Or in terms of cost per raw TB at purchase and
at list pricing: $697.
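
The arithmetic behind the table and the cost-per-TB number can be re-run in a few lines of Python. This is a minimal sketch that uses only the list prices quoted above; the variable names are illustrative, and dividing the year-1 total by raw capacity is my assumption about how the $697 was derived.

```python
# List prices quoted in the post (US$). Names are illustrative only.
BDA_HARDWARE = 525_000      # full rack, Cloudera software included
ANNUAL_SUPPORT = 63_000     # Premier Support for Systems, per year
ONSITE_INSTALL = 14_000     # approximate figure from the table

NODES = 18
DISKS_PER_NODE = 12
TB_PER_DISK = 4
CORES_PER_NODE = 2 * 8      # two 8-core Xeon E5-2650 v2 CPUs per node

year1 = BDA_HARDWARE + ANNUAL_SUPPORT + ONSITE_INSTALL
three_year = year1 + 2 * ANNUAL_SUPPORT

raw_tb = NODES * DISKS_PER_NODE * TB_PER_DISK
cores = NODES * CORES_PER_NODE

# Assumption: cost per raw TB is the year-1 total divided by raw capacity.
cost_per_raw_tb = year1 / raw_tb

print(f"Year 1 total:        ${year1:,}")
print(f"3-year sum of items: ${three_year:,}")
print(f"Raw capacity:        {raw_tb} TB across {cores} cores")
print(f"Cost per raw TB:     ${cost_per_raw_tb:,.0f}")
```

With the rounded $14,000 install figure the line items sum to $728,000 over three years, $150 below the $728,150 printed in the table. The gap may simply come from the install cost being approximate.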

HP DL-380 Comparative System (this is changed from the original post to the more
common DL-380's)

To build a comparative hardware solution to the Big Data Appliance we picked an HP DL380
configuration and built up the servers using the HP.com website for pricing. The following is
the price for a single server.

Model Number  Description                                         Quantity  Total Price

653200-B21    ProLiant DL380p Gen8 Rackmount Factory Integrated   1         $2,051
              8 SFF CTO Model (2U) with no processor, 24 DIMM
              with no memory, open bay (diskless) with 8 SFF
              drive cage, Smart Array P420i controller with
              Zero Memory, 3 x PCIe 3.0 slots, 1 FlexibleLOM
              connector, no power supply, 4 x redundant fans,
              Integrated HP iLO Management Engine

715218-L21    2.6GHz Xeon E5-2650 v2 processor (1 chip, 8 cores)  2         $3,118
              with 20MB L3 cache - Factory Integrated Only

684208-B21    HP 1GbE 4-port 331FLR Adapter - Factory Integrated  1         $25
              Only

503296-B21    460W Common Slot Gold Hot Plug Power Supply         1         $229

AF041A        HP Rack 10000 G2 Series - 10842 (42U) 800mm Wide    0         $0
              Cabinet - Pallet Universal Rack

731765-B21    8GB (1 x 8GB) Single Rank x8 PC3L-12800R            8         $1,600
              (DDR3-1600) Registered CAS-11 Low Voltage Memory
              Kit

631667-B21    HP Smart Array P222/512MB FBWC 6Gb 1-port           1         $599
              Int/1-port Ext SAS controller

695510-B21    4TB 6Gb SAS 7.2K LFF hot-plug SmartDrive SC         12        $12,588
              Midline disk drive (3.5") with 1-year warranty

Grand Total for a single server (list prices)                               $20,210
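
To double-check the grand total, the line items can be summed directly. The sketch below copies the model numbers, quantities and total prices from the table above; the list structure and names are mine, and the 18-server multiplier is the full rack size used throughout the post.

```python
# (model number, quantity, total price in US$) copied from the table.
server_parts = [
    ("653200-B21", 1, 2_051),    # DL380p Gen8 chassis
    ("715218-L21", 2, 3_118),    # Xeon E5-2650 v2 processors
    ("684208-B21", 1, 25),       # 1GbE 4-port adapter
    ("503296-B21", 1, 229),      # 460W power supply
    ("AF041A", 0, 0),            # rack, not counted per server
    ("731765-B21", 8, 1_600),    # 8GB memory kits
    ("631667-B21", 1, 599),      # Smart Array P222 controller
    ("695510-B21", 12, 12_588),  # 4TB SAS drives
]

# The price column already holds the total per line item (not unit price).
server_total = sum(price for _, _, price in server_parts)

SERVERS_PER_RACK = 18
rack_total = SERVERS_PER_RACK * server_total

print(f"Single server: ${server_total:,}")
print(f"18 servers:    ${rack_total:,}")
```

The drives are the dominant line item: $12,588 of the $20,210 per server.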

On top of this we need InfiniBand switches. Oracle Big Data Appliance comes with 3 IB
switches, allowing us to expand the cluster without suddenly requiring extra switches. And,
we do expect these machines to be part of much larger clusters. The IB switches are
somewhere in the neighborhood of US$ 6,000 per switch, so add $18,000 per rack and add a
management switch (BDA uses a Cisco switch) which seems to be around $15,000 list. The
total switching comes to roughly $33,000.

We will also need a Cloudera Enterprise subscription - and to compare apples to apples, we
will do it for all software. Some sources (see this document) peg CDH Core at $3,382 list per
node and per year (24*7 support). Since BDA has more software (all options) and that pricing
is not public, I am going to make an educated guess: double the price and round to the nearest
nice round number. That gets me to $7,000 per node, per year for 24*7 support.

BDA also comes with on-disk encryption, which is even harder to price out. My somewhat
educated guess is around $1,500 list or so per node and per year. Oh, and let's not forget the
Linux subscription, which lists at $1,299 per node per year. We also run a MySQL database
(enterprise edition with replication), which has a list subscription price of $5,000 per node.
We run it replicated over 2 nodes.

This all gets us to roughly $10,000 list price per node per year for all applicable software
subscriptions and support and an additional $10,000 for the two MySQL nodes.
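
The rounding steps in the last few paragraphs are easier to follow when written out. A minimal sketch, using only the figures quoted above; the variable names are mine, and the $7,000 and $1,500 values are the estimates from the text, not published prices.

```python
NODES = 18

CDH_CORE_LIST = 3_382     # per node per year, from the cited source
CDH_ALL_OPTIONS = 7_000   # the post's estimate: double CDH Core, then round
ENCRYPTION = 1_500        # the post's educated guess, per node per year
LINUX = 1_299             # list price, per node per year
MYSQL_PER_NODE = 5_000    # list subscription
MYSQL_NODES = 2

doubled_cdh = 2 * CDH_CORE_LIST                       # before rounding
per_node_exact = CDH_ALL_OPTIONS + ENCRYPTION + LINUX
PER_NODE_ROUNDED = 10_000                             # "roughly $10,000"

mysql_total = MYSQL_NODES * MYSQL_PER_NODE
annual_software = NODES * PER_NODE_ROUNDED + mysql_total

print(f"CDH Core doubled:      ${doubled_cdh:,}")
print(f"Per node, unrounded:   ${per_node_exact:,}")
print(f"Annual software total: ${annual_software:,}")
```

The unrounded per-node figure is $9,799, so rounding up to $10,000 adds about $3,600 per year across 18 nodes.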

HP + Cloudera Do-it-Yourself System

Let's go build our own system. The specs are like a BDA, so we will have 18 servers and all
other components included.

                                  Year 1      Year 2      Year 3      Total
Servers                           $363,780
Networking                        $33,000
SW Subscriptions and Support      $190,000    $190,000    $190,000
Installation and Configuration    $15,000
Total                             $601,780    $190,000    $190,000    $981,780

Some will argue that the installation and configuration is free (you already pay your data
center team), but I would argue that something that takes a short amount of time when done
by Oracle, is worth the equivalent if it takes you a lot longer to get all this installed,
optimized, and running. Nevertheless, here is some math on how to get to that cost anyways:
approximately 150 hours of labor per rack for the pure install work. That adds up to US
$15,000 if we assume a cost per hour of $100.
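
Putting the DIY pieces together, the table totals can be reproduced as follows. This is a sketch built from the numbers in this post only (server list price, switch estimates, software subscriptions and the 150-hour install estimate); the names are illustrative.

```python
SERVER_LIST_PRICE = 20_210
SERVERS = 18

IB_SWITCHES = 3
IB_SWITCH_PRICE = 6_000       # rough estimate per switch
MGMT_SWITCH_PRICE = 15_000    # rough list price

SOFTWARE_PER_YEAR = 190_000   # subscriptions and support

INSTALL_HOURS = 150           # per rack, pure install work
HOURLY_RATE = 100

servers = SERVERS * SERVER_LIST_PRICE
networking = IB_SWITCHES * IB_SWITCH_PRICE + MGMT_SWITCH_PRICE
install = INSTALL_HOURS * HOURLY_RATE

year1 = servers + networking + SOFTWARE_PER_YEAR + install
three_year = year1 + 2 * SOFTWARE_PER_YEAR

print(f"Servers:      ${servers:,}")
print(f"Networking:   ${networking:,}")
print(f"Install:      ${install:,}")
print(f"Year 1 total: ${year1:,}")
print(f"3-year total: ${three_year:,}")
```

Changing `HOURLY_RATE` or `INSTALL_HOURS` is a quick way to see how sensitive the year-1 number is to your own labor assumptions.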

Note: those $15,000 do NOT include optimizations and tuning to Hadoop, to the OS, to Java
and other interesting things like networking settings across all these areas. You will now need
to spend time to figure out the number of slots you allocate per node, the file system block
size (do you use Apache defaults, or Cloudera's or something else) and many more things at
system level. On top of that, we pre-configure for example Kerberos and Apache Sentry
giving you a secure authorization and authentication method, as well as have a one-click
on-disk and network encryption setting. Of course you can contact various other companies to
do this for you.

You can also argue that "you want the cheapest hardware possible", because Hadoop is built to
deal with failures, so it is OK for things to regularly fail. Yes, Hadoop does deal well with
hardware failures, but your data center is probably much less keen about this idea, because
someone is going to replace the disks (all the time). So make sure the disks are hot-swappable.
And oh, that someone swapping the disks does cost money... The other consideration is failures
in important components like power... redundant power in a rack is a good thing to have. All
of this is included (and thought about) in Oracle Big Data Appliance.

In other words, do you really want to spend weeks installing, configuring and learning, or
would you rather start building applications on top of the Hadoop cluster and thus provide
value to your organization?

The Differences

The main differences between Oracle Big Data Appliance and a DIY approach are:

1. A DIY system - at list price with basic installation but no optimization - is a staggering
$220 cheaper as an initial purchase.
2. A DIY system - at list price with basic installation but no optimization - is roughly
$250,000 more expensive over 3 years.
Note to purchasing: you can spend this on building or buying applications on your
cluster (or buy some really intriguing Oracle software).
3. The support for the DIY system involves five (5) vendors: your hardware support
vendor, the OS vendor, your Hadoop vendor, your encryption vendor and your
database vendor. Oracle Big Data Appliance is supported end-to-end by a single vendor:
Oracle.
4. Time to value. While we trust that your IT staff will get the DIY system up and running,
the Oracle system allows for a much faster "loading dock to loading data" time:
typically a few days instead of a few weeks (or even months).
5. Oracle Big Data Appliance is tuned and configured to take advantage of the software
stack, the CPUs and the InfiniBand network it runs on.
6. Any issue we, you or any other BDA customer finds in the system is fixed for all
customers. You do not have a unique configuration, with unique issues on top of the
generic issues.
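
The first two points come straight from the two cost tables. A short sketch of the subtraction, using the totals exactly as printed in those tables (the dictionary layout and names are mine):

```python
# Totals as printed in the two cost tables (US$).
bda = {"year1": 602_000, "three_year": 728_150}
diy = {"year1": 601_780, "three_year": 981_780}

# Point 1: how much cheaper DIY is as an initial purchase.
initial_gap = bda["year1"] - diy["year1"]

# Point 2: how much more expensive DIY is over three years.
tco_gap = diy["three_year"] - bda["three_year"]

# Recurring cost per year after the initial purchase.
bda_recurring = 63_000
diy_recurring = 190_000
recurring_gap = diy_recurring - bda_recurring

print(f"DIY cheaper at purchase by:  ${initial_gap:,}")
print(f"DIY more expensive over 3y:  ${tco_gap:,}")
print(f"Extra DIY cost per year:     ${recurring_gap:,}")
```

Nearly all of the three-year difference comes from the recurring software subscriptions, which cost $127,000 more per year on the DIY side.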

Conclusion

In an apples-to-apples comparison of a production Hadoop cluster, Oracle Big Data Appliance
starts off with the same acquisition price and comes out ahead in terms of TCO over 3 years.
It allows an organization to enter the Hadoop world with a production-grade system in a very
short time, reducing both risk and time to market.

As always, when in doubt, simply contact your friendly Oracle representative for questions,
support and detailed quotes.

Sources:

HP and related pricing: http://www.hp.com or http://www.ideasinternational.com/ (the latter is
a paid service - sorry!)
Oracle Pricing: http://www.oracle.com/us/corporate/pricing/exadata-pricelist-070598.pdf
MySQL Pricing: http://www.oracle.com/us/corporate/pricing/price-lists/mysql-pricelist-183985.pdf

Category: Big Data

Tags: bigdata hadoop



Comments:

Western digital SE 4TB drives are roughly $250/ea. That's $3,000 per node, or $9,588 less per
node than your HP comparison. Across 18 nodes that's a $172,584 difference right there.

And that's just the hard drives. The servers and everything else are a fraction of the prices
listed.

I'm sure there's wiggle room on the Oracle pricing as well... but I suspect not that much.


Posted by BD on April 04, 2014 at 09:25 AM PDT #

I'm sure anyone can argue any of the components, I'm merely trying to share a reasonable
overview of components and prices.

If you do a very quick "configure a server" on any of the HW sites (HP, Dell, Oracle etc.) you
will start to see that a drive sets you back far more than $250. Some examples, a 2.5" 1TB
7.2K drive on HP.com goes for $619.00 when I add it to my DL380 server. A similar drive on
Dell.com goes for about $400...

I fully understand that I can buy that WD 4TB drive from Amazon.com - btw it does come
with Prime, so I pay no shipping and I can watch lots of fun movies while assembling the disk
trays and servers - but I doubt it is realistic that we compare any raw disk unit price
somewhere on the internet with the price in a server as ordered from my server vendor and
installed in my server.

Posted by Jean-Pierre on April 04, 2014 at 10:29 AM PDT #

Right - but it appears that's exactly what most places are in fact doing. Including us most
likely.

We are looking to expand our infrastructure and have been discussing implementations with
several other companies. The model appears to be stripped down servers from SuperMicro,
Dell, or HP with self-sourced hard drives. To-date every single company we have talked to is
following that pattern from hundreds to thousands of nodes.

One of the key advantages of Hadoop/etc is that you don't need expensive support
contracts/etc.

Cut the server prices you listed nearly in half, then put in $250/hard drive and re-run the #s.
We've considered the Oracle solution as well; but when you really run the #s it doesn't pencil
out for us. We haven't been able to find anyone else that it's made sense for either.

If you wanted a 100% turn-key solution, with full Oracle support, etc - I can see the
advantage. But it's an extreme premium to pay for that. That's where we are currently
struggling to keep the Oracle solution as a contender at the moment.

Posted by guest on April 04, 2014 at 02:06 PM PDT #

I think we need to just agree to disagree...

because I don't think "most places" buy disks, chassis and components and then put them
together themselves. You either have the cloud (?) to specify exactly what kit you get in large
quantities, or you buy kit somewhat off the shelf (as in the post).

because I disagree that Hadoop does not need support contracts unless you are just playing
around with some software... Any production system that is somewhat critical to the
organization (and some Hadoop based system will be - soon!) will want to have issues fixed
when an issue arises. Let's say that (just an example!) a name node crash occurs and both NNs
are down, do you really want to hope that someone on a forum is going to debug this for you?

because you really do not want to run a "unique" system where you are in charge of figuring
out all the dependencies between even your hardware components. Rather than saving on
disk, I would either run this in the cloud, buy standard HW or buy an appliance.

And now I'm full circle, because I still argue that an appliance is far simpler for a better 3 year
TCO...

Posted by Jean-Pierre on April 21, 2014 at 04:35 PM PDT #

You forgot to quote Oracle Big Data connectors which is licensed per core at list price $ 2000
per core. This is a mandatory component which sums up to $ 288.000 !!! Quite expensive for
just moving data around :) Add 22 % Support for this and soon I get the impression, that you
do not compare apples with apples....

Found here: http://docs.oracle.com/cd/E27101_01/doc.10/e26746/bda.htm#CHDFFGHG

Oracle Big Data Connectors must be licensed for all processors of Oracle Big Data Appliance
or for all processors of the Hadoop cluster when not licensed on Oracle Big Data Appliance.

Posted by Peter on May 28, 2014 at 07:09 AM PDT #

Hi Peter,

Actually that is an incorrect interpretation of the documentation text, which I do however fully
understand. So as a first order of business we will make that text more crisp.

What the sentence actually is trying to say is the following:

If you are licensing Big Data Connectors (whether this is on BDA or on a non-BDA Hadoop
cluster), then you are required to license it for all processors of said cluster. The reason for
"all processors" is that things like OLH will use the entire cluster to prepare data.

Now, the sentence did start with IF. In other words, this is an optional suite of software. You
are free to purchase this on BDA, on a regular Cloudera cluster, Hortonworks or Apache.

Therefore, if you feel there is value in this software, you CAN license it. It is not a required
component of BDA, nor of your favorite homegrown cluster. So we either add it to both (if we
perceive value) or we do not add it to any.

Hope that explains why it is not added... and we will change the license text.

BTW, the connectors provide both data movement software, as well as analytics (R). BDC
also provides something genuinely cool in XQuery for Hadoop, which enables you to parse
and query XML, JSON and other nested types in massively parallel fashion.

Posted by Jean-Pierre on May 28, 2014 at 09:26 AM PDT #


Jean-Pierre,

nice article but the disadvantage of this comparison is you compare apples with apples on a
purely infrastructure level. You assume here a small workload and your exadata fits the
requirements. But if you compare with real live scenarios, there's a big chance that the number
of servers, or the amount of storage differs a lot from the exadata's config. What if my
workload needs 7 servers : BENG a full exadata at the full cost. I can just buy 7 HP servers
instead. Or what if don't need 48 TB of storage in each node, 10 TB is enough?

Or let's say I need to grow, can I buy 1 additional node in the exadata? And to make it
complete : I need 13 nodes : yep 2 exadata's.

So this is a nice comparison if your requirements exactly fits what the exadata offers, but if
you differ (and face it, chances are real) the build it yourself scenario comes way more
attractive.

Posted by Walter on July 18, 2014 at 01:25 AM PDT #

The views expressed on this blog are those of the author and do not necessarily reflect the
views of Oracle.