
IBM Information Management Forum, 11.

September 2013, Wien


PureData System for Hadoop
Big Data Appliance
Dipl.-Ing. Wolfgang Nimführ
Business Development Executive
IBM Software Group
Big Data
© 2013 International Business Machines Corporation
Every company has a Big Data & Analytics Opportunity
The power of Data coming together
With the power of Technology
To deliver Improved Outcomes
* Enrich your information base with Big Data Exploration
* Prevent crime with Security and Intelligence Extension
* Optimize operations with Operations Analysis
* Gain IT efficiency and scale with Data Warehouse Augmentation
* Improve customer interaction with Enhanced 360° View of the Customer
To generate Business Value
How do you harness the elephant in the room?
Challenges of Building a Hadoop Cluster
* Open source is not really free!
  Requires many resources to manage a Hadoop cluster.
* Apache Hadoop consists of multiple subprojects
  Ambiguity on which components to use in which situations.
* Lack of integrated cluster management
  No integrated tools to manage the hardware and Hadoop.
* Commodity hardware based Hadoop clusters pose significant challenges
  Several discrete components (open source, third-party tools) mean no guaranteed efficiencies and no pretesting done among the components.
Let's simplify Big Data ...
Hive, Pig, MapReduce, HDFS, HCatalog, Visualization, Development Tools
Designed to
* Simplify the building, deploying and management of a Hadoop cluster
* Speed the time-to-value for Hadoop and unstructured data
* Maximize the overall analytic ecosystem
* Provide enterprise security and platform management
From custom and complex ... To organized simplicity (1)
(1) Based on IBM internal testing and customer feedback. "Custom built clusters" refer to clusters that are not professionally pre-built, pre-tested and optimized. Individual results may vary.
Announcing the new PureData System for Hadoop
* Accelerate time to value
* Accelerate time to insight
* Simplify big data adoption and consumption
* Extend the value of the data warehouse
* Implement enterprise class big data
* Minimize system setup and administration
System for Hadoop
IBM PureData System for Hadoop
Accelerate Hadoop analytics with appliance simplicity
Speed
* Speed to insight with built-in analytics
* Speed to value with accelerated deployment
Simplicity
* Ready to load data in hours
* Integrated system management
* Appliance approach reduces complexity
* Single point of support
Smart
* Establish a cost efficient online data archive
* Easily leverage data across the big data platform
* Enterprise security, governance and high availability
System for Hadoop
Benefits of IBM PureData System for Hadoop
Accelerate Big Data Time to Value
* Deploy 8x faster than custom-built solutions (1)
* Built-in visualization to accelerate insight
* Built-in analytic accelerators (2), unlike big data appliances on the market
Simplify Big Data Adoption & Consumption
* Single system console for full system administration
* Rapid maintenance updates with automation
* No assembly required, data load ready in hours
Implement Enterprise Class Big Data
* Only integrated Hadoop system with built-in archiving tools (2)
* Delivered with more robust security than open source software
* Architected for high availability
(1) Based on IBM internal testing and customer feedback. "Custom built clusters" refer to clusters that are not professionally pre-built, pre-tested and optimized. Individual results may vary.
(2) Based on current commercially available Big Data appliance product data sheets from large vendors. US ONLY CLAIM.
InfoSphere BigInsights
Entry Point for Hadoop
BigInsights Value Above and Beyond Hadoop
Analytic Accelerators
* Social Media Accelerator
* Machine Data Accelerator
* BigSheets spreadsheet and visualization
* Advanced Text Analytics Accelerator
* JAQL query language
Performance and Optimization
* Adaptive MapReduce
* Advanced Scheduler
* BigIndex for large scale indexing
* Fast, splittable compression
Security
* Role based authorization
Optim Development Studio
* Eclipse based IDE for Java
Big Data Integration
* Information Server, InfoSphere Streams, Netezza, DB2
Enterprise Enablement
* Big SQL
* GPFS-FPO
"IBM's distribution is based on Apache Hadoop and utilizes many of the capabilities included in that distribution, but IBM is focused on making its distribution more of an enterprise class offering."
Key enterprise capabilities on top of an unmodified open source foundation
Open source components
* IBM-certified Apache Hadoop
Additional enterprise capabilities
* Administration & Security
* Workload Optimization
* Integrated Development Environment
* Connectors (Data, Analytics, Integration)
* Advanced Text Analytics Engine
* Visualization & Exploration
BigInsights 2.1 features a variety of enhancements that deliver key Enterprise Hadoop capabilities
High Availability
* Out of the box High Availability
* Seamless, automatic and transparent failover for HDFS NameNode
* Eliminates admin intervention
* Reduces downtime for recovery of the cluster
* Hardware fencing to guarantee data integrity
GPFS-FPO support
* No single point of failure
* Built-in High Availability
* POSIX compliance
* Enhanced Security with ACL support
* Support for Storage Pools
* SnapShot capability
Big SQL (Cognos v10, ...)
* Comprehensive Standard ANSI SQL support to access data stored in BigInsights
* Standards compliant JDBC / ODBC drivers
* Leverages MapReduce parallelism in complex data sets
* Direct access for low-latency in small queries, e.g. sub-second response to HBase queries
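The Big SQL points on this slide come down to one idea: data stored in the cluster can be queried with standard SQL through ordinary database drivers. The following is a minimal sketch of the kind of aggregate query such an interface is meant to accept. It uses Python's built-in sqlite3 module purely as a stand-in engine (it is not Big SQL and does not use a JDBC or ODBC driver), and the table name `weblogs`, its columns, and the sample rows are hypothetical.

```python
import sqlite3

# Hypothetical sample rows (host, HTTP status); not taken from the slides.
SAMPLE_ROWS = [
    ("web1", 200), ("web1", 500),
    ("web2", 503), ("web2", 500),
    ("web3", 404),
]

# A plain SQL aggregate: count server errors per host, most errors first.
ERRORS_PER_HOST = (
    "SELECT host, COUNT(*) AS errors "
    "FROM weblogs "
    "WHERE status >= 500 "
    "GROUP BY host "
    "ORDER BY errors DESC, host ASC"
)


def errors_per_host(rows):
    """Load rows into an in-memory table and return (host, errors) pairs."""
    conn = sqlite3.connect(":memory:")  # stand-in engine, not Big SQL
    try:
        conn.execute("CREATE TABLE weblogs (host TEXT, status INTEGER)")
        conn.executemany("INSERT INTO weblogs VALUES (?, ?)", rows)
        return conn.execute(ERRORS_PER_HOST).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for host, errors in errors_per_host(SAMPLE_ROWS):
        print(host, errors)
```

Because the query text is plain SQL, the same statement could in principle be sent through any standards-compliant driver; only the connection setup would differ.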
BigInsights Enterprise Edition 2.1
(Architecture diagram; the labels below are recovered from the figure. Legend: Open Source, IBM.)
* Connectivity and Integration: Streams, Netezza, DB2, JDBC, Flume
* Text processing engine and library
* Infrastructure: Jaql, Hive, Pig, HBase, MapReduce, HDFS, ZooKeeper, Indexing, Lucene, Adaptive MapReduce, Oozie, Text compression, Enhanced security, Flexible scheduler
* Optional IBM and partner offerings
* Analytics and discovery "Apps": BigSheets, Web Crawler, Distrib file copy, DB export, Boardreader, DB import, Ad hoc query, Machine learning, Data processing, ...
* Administrative and development tools
  * Web console
    * Monitor cluster health, jobs, etc.
    * Add / remove nodes
    * Start / stop services
    * Inspect job status
    * Inspect workflow status
    * Deploy applications
    * Launch apps / jobs
    * Work with distrib file system
    * Work with spreadsheet interface
    * Support REST-based API
    * ...
  * Eclipse tools
    * Text analytics
    * MapReduce programming
    * Jaql, Hive, Pig development
    * BigSheets plug-in development
    * Oozie workflow generation
* Integrated installer
* Further labels in the diagram: Cognos BI, Big SQL, Accelerator for machine data analysis, Accelerator for social data analysis, Guardium, DataStage, Data Explorer, Sqoop, HCatalog, GPFS-FPO
IBM Big Data & Analytics Reference Architecture
(Architecture diagram; the labels below are recovered from the figure.)
* All Data Sources
* Information Ingestion and Operational Information
* Stream Processing, Data Integration, Master Data, Streams
* Real-time Analytic Zone: Video/Audio, Network/Sensor, Entity Analytics, Predictive
* Landing Area, Analytics Zone and Archive: Raw Data, Structured Data, Text Analytics, Data Mining, Entity Analytics, Machine Learning
* Exploration, Integrated Warehouse, and Mart Zones: Discovery, Deep Reflection, Operational, Predictive
* Case Management, Analytics Applications, Alerts
* Information Governance, Security and Business Continuity
* Big Data Ecosystem, Analytic Applications
  * Cognitive: Learn Dynamically?
  * Prescriptive: Best Outcomes?
  * Predictive: What Could Happen?
  * Descriptive: What Has Happened?
  * Exploration and Discovery: What Do You Have?
* Cloud Services, IBM Watson
* PureData System for Hadoop: Active Archive, Big Data Exploration, Pre-Processing Hub
Use case: Big Data Exploration
Use Cases
* Explore new data and previously untapped sources
* Visualize and gain new insight with easy to use spreadsheet-style analysis
* Identify useful information that would add value when integrated
* Used for data profiling to understand data before moving to other systems
Use case: Big Data Exploration
Spreadsheet-style analysis process with BigSheets
* Ad-hoc analytics for data scientists
* Analyze a variety of data - unstructured and structured
* Browser-based
* Spreadsheet metaphor for exploring / visualizing data
Gather, Extract, Explore, Iterate
* Gather: Crawl - gather statistically; Adapter - gather dynamically
* Extract: Document-level info; Cleanse, normalize
* Explore: Analyze, annotate, filter; Visualize results
* Iterate: Iterate through any prior step
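The Gather, Extract, Explore, Iterate cycle on this slide can be illustrated with a small toy pipeline. This is only a sketch of the workflow shape in plain Python, not the BigSheets API: the function names, the sample documents, and the keyword filter are all hypothetical.

```python
from collections import Counter

# Hypothetical raw documents standing in for crawled or adapter-fed input.
RAW_DOCS = ["  Great PRODUCT, fast delivery ", "bad product", "", "Great service"]


def gather():
    """Gather: return raw documents (a real run would crawl or use an adapter)."""
    return list(RAW_DOCS)


def extract(docs):
    """Extract: cleanse and normalize (trim, lowercase, drop empty documents)."""
    return [d.strip().lower() for d in docs if d.strip()]


def explore(docs, keyword):
    """Explore: filter on a keyword and count words in the matching documents."""
    hits = [d for d in docs if keyword in d]
    words = Counter(w.strip(",.") for d in hits for w in d.split())
    return hits, words


if __name__ == "__main__":
    docs = extract(gather())
    # Iterate: rerun the explore step with a different filter each time.
    for keyword in ("great", "product"):
        hits, words = explore(docs, keyword)
        print(keyword, len(hits), words.most_common(2))
```

Each step takes the previous step's output as input, which is what makes it cheap to go back to any prior step and rerun from there.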
Use case: Active Archive
Use Cases
* Immediate storage alternative for cold data
* Cost savings for cold data
* Compliance requirements
* Simple analytics / exploration
PureData
System for Analytics
PureData
System for Hadoop
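The active archive idea, keeping frequently used data in the warehouse and moving cold data to the Hadoop system, can be sketched as a simple age-based split. The example below is an illustration only: the `last_access` field, the 365-day threshold, and the function name are assumptions, since the slides do not say how "cold" is defined.

```python
from datetime import date, timedelta


def split_hot_cold(records, today, max_age_days=365):
    """Split records into (hot, cold) by last access date.

    Records accessed within max_age_days stay "hot" (warehouse);
    older ones are "cold" (archive candidates). The threshold is an
    assumed policy, not something stated in the presentation.
    """
    cutoff = today - timedelta(days=max_age_days)
    hot, cold = [], []
    for record in records:
        target = hot if record["last_access"] >= cutoff else cold
        target.append(record)
    return hot, cold


if __name__ == "__main__":
    # Hypothetical records for illustration.
    records = [
        {"id": 1, "last_access": date(2013, 8, 1)},
        {"id": 2, "last_access": date(2011, 1, 15)},
        {"id": 3, "last_access": date(2012, 12, 1)},
    ]
    hot, cold = split_hot_cold(records, today=date(2013, 9, 11))
    print("warehouse:", [r["id"] for r in hot])
    print("archive:", [r["id"] for r in cold])
```

In practice the policy could equally be driven by compliance retention rules or query frequency rather than a single age threshold.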
Use case: Pre-Processing Hub
Use Cases
* Aggregation of data
* Pre-process cleansing
* Compliance requirements
* Simple analytics / exploration
PureData System for Hadoop
Bringing Big Data to the enterprise
* Simplify the delivery of unstructured data to the enterprise
* Integrate Hadoop with the data warehouse
* Leverage Hadoop for data archive
* Provide best in class security
* Provide data exploration across structured and unstructured data
* Accelerate insight with machine data
* Accelerate insight with social data
Simplify Big Data for the enterprise!
IBM PureData System Family
Meeting Big Data Challenges - Fast and Easy!
* System for Transactions
  For apps like E-commerce ...
  Database cluster services optimized for transactional throughput and scalability
* System for Analytics
  For apps like Customer Analysis ...
  Data warehouse services optimized for high-speed, peta-scale analytics and simplicity
* System for Operational Analytics
  For apps like Real-time Fraud Detection ...
  Operational data warehouse services optimized to balance high performance analytics and real-time operational throughput
* System for Hadoop
  For Exploratory Analysis & Queryable Archive
  Hadoop data services optimized for big data analytics and online archive with appliance simplicity
www.ibm.com/software/data/puredata/hadoop/
IBM Big Data & Analytics Reference Architecture
(Architecture diagram repeated from the earlier reference architecture slide.)
7D%ise
Data Explorer
Part of the IBM Big Data Platform
Workload Optimized Solutions for all your analytic needs
(Platform diagram; the labels below are recovered from the figure.)
* Analytics & Decision Management Solutions
* IBM Big Data Platform
  * Accelerators
  * Information Integration & Governance
  * Visualization & Discovery
  * Application Development
  * Systems Management
  * Stream Computing
  * Hadoop System
  * Data Warehouse
* Big Data Infrastructure
* PureData System for Analytics
* PureData System for Hadoop
