SEPTEMBER 2013
Business
Information
INSIGHT ON MANAGING AND USING DATA
SPECIAL ISSUE
Breaking Big
Besieged by endless
big data plugging and
knee-deep in Hadoop
hoopla, many businesses
are confused, and it's no
wonder. To make the right
technology decisions
and tap into real value,
a keen-eyed look
is needed.
HOME
EDITOR'S NOTE
WHEN TO USE
HADOOP, AND
WHEN NOT TO
IN-MEMORY FINDS
A PLACE IN BIG
DATA'S UNIVERSE
TRAINING, PLANNING
NEEDED TO PUT
HADOOP INTO PLAY
intelligence, analytics and data warehousing topics, conducted earlier this year by TechTarget, which publishes
Business Information magazine, interest levels in big data
analytics were relatively high, and high-minded. Forty-one percent of 540 respondents said they had active
programs or planned to add one in the next 12 months.
And the primary goals of those respondents
revolved around driving new business: A combined 66%
cited gaining competitive advantages, better understanding customers or increasing revenue. By comparison,
27% opted for improving organizational efficiency.
The three articles in this special edition of Business
Information offer insight and advice to help point the way
forward. First we look at the capabilities, and limitations,
of Hadoop. Next we report on the relationship between
big data and in-memory analytics tools, and issues to
consider before joining them at the hip. We close with
tips on making Hadoop work in corporate applications
from a panel of IT and BI professionals who spoke at the
Hadoop Summit 2013.
CRAIG STEDMAN is executive editor of TechTarget's SearchDataManagement.com and SearchBusinessAnalytics.com websites.
Email him at cstedman@techtarget.com.
STRATEGIES | ED BURNS
WHEN TO
USE HADOOP,
AND WHEN
NOT TO
Hadoop has become everyone's big data darling.
For now, at least, it can only do so much, and
savvy businesses shouldn't buy into the hype.
where time isn't a constraint. That includes running end-of-the-day reports to review daily transactions or scanning historical data dating back several months.
But when it comes to running the real-time analytics processes that are at the heart of what Metamarkets
offers to its clients, Hadoop isn't involved. Driscoll said
that's because it's optimized to run batch jobs that look
at every file in a database. It comes down to a tradeoff: In
order to make deep connections between data points, the
technology sacrifices speed. "Using Hadoop is like having
a pen pal," he said. "You write a letter and send it and get
a response back. But it's very different than [instant messaging] or email."
Because of the time factor, Hadoop has limited value
in online environments where fast performance is crucial, said Kelly Stirman, director of product marketing at
NoSQL database developer MongoDB Inc. For example,
analytics-fueled online applications, such as product recommendation engines, rely on processing small amounts
of information quickly. But Hadoop can't do that efficiently, Stirman said.
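The tradeoff described above can be illustrated with a toy contrast. This is a rough sketch, not a depiction of how Hadoop or MongoDB works internally. The sample data, function names and index structure are all hypothetical. The point is only that a batch-style job reads every record, while an online-style request fetches a small, pre-indexed slice.

```python
from collections import defaultdict

# Hypothetical purchase log: (customer_id, product) pairs.
PURCHASES = [
    ("alice", "tent"),
    ("bob", "stove"),
    ("alice", "lantern"),
    ("carol", "tent"),
]

def batch_count_by_product(records):
    """Batch-style job: reads every record to build a full summary."""
    counts = defaultdict(int)
    for _customer, product in records:
        counts[product] += 1
    return dict(counts)

def build_customer_index(records):
    """One-time preparation so later lookups avoid a full scan."""
    index = defaultdict(list)
    for customer, product in records:
        index[customer].append(product)
    return index

def lookup_purchases(index, customer):
    """Online-style request: fetches one customer's small slice of data."""
    return index.get(customer, [])
```

In this sketch, a recommendation engine would call something like `lookup_purchases` on each page view, while a summary like `batch_count_by_product` would suit an end-of-day report.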
No Replacement Plan
Some businesses might be tempted to try scrapping their
traditional data warehouses in favor of Hadoop clusters,
because technology costs are so much lower with the
open source technology. But Carl Olofson, an analyst at
market research company IDC, said that weighing the
two is an apples-and-oranges comparison.
Olofson said the relational databases that power most
data warehouses are used to accommodating trickles of
data that come in at a steady rate over a period of time,
such as transaction records from day-to-day business
processes. Conversely, he added, Hadoop is best suited to
processing vast stores of accumulated data.
And because Hadoop is typically used in large-scale
projects that require clusters of servers and employees
with specialized programming and data management
skills, implementations can become expensive, even
though the cost-per-unit of data may be lower than with
relational databases. "When you start adding up all the
costs involved, it's not as cheap as it seems," Olofson said.
Specialized development skills are needed because Hadoop uses the MapReduce software programming framework, which relatively few developers are familiar
with. That can make it difficult to access data in Hadoop
from SQL databases, according to Todd Goldman, vice
president of enterprise data integration at software vendor Informatica Corp.
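For readers unfamiliar with the MapReduce style, the following is a minimal single-process sketch of its three stages (map, shuffle, reduce), using a word count as the example. It is an illustration only and does not use Hadoop's actual API. All function names here are hypothetical, and a real job would distribute these stages across a cluster.

```python
from collections import defaultdict

def map_phase(line):
    """Map: emit a (word, 1) pair for each word in one input line."""
    return [(word.lower(), 1) for word in line.split()]

def shuffle_phase(pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Reduce: combine the grouped values for one key into a total."""
    return key, sum(values)

def word_count(lines):
    """Run the three stages in sequence over a list of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())
```

Even in this small form, the work has to be recast as key-value pairs rather than expressed as a declarative query. That may help explain why developers who work mainly in SQL find the model unfamiliar.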
Various vendors have developed connector software
that can help move data between Hadoop systems and
relational databases. But Goldman thinks that for many
IN-MEMORY
FINDS A PLACE
IN BIG DATA'S
UNIVERSE
Big data plus memory-based analytics software
can form a mutually beneficial relationship, if that's
the kind of power business users really need.
A Data Flood
At ContactLab, an email marketing services provider in
Milan, Italy, the need for in-memory analytics capabilities became apparent when its business model shifted
from broad-based marketing campaigns to a more individualized outreach approach, said Massimo Fubini,
the company's founder and director. ContactLab, which
manages an average of 60,000 to 70,000 email and outbound SMS messages daily, faced a big data challenge
as it tried to sort through hundreds of millions of data
points on click-throughs, website visits and other actions
to analyze customer behavior and serve up relevant marketing messages on the fly.
Conventional BI tools worked fine up to that point,
Fubini said. But the change in business strategy changed
the analytics game and opened the door to the deployment of a Hadoop system that captures the data and
feeds it into in-memory analytics software, in this case,
SAS Visual Analytics from SAS Institute Inc.
As part of the big data environment, ContactLab also
TRAINING,
PLANNING
NEEDED TO
PUT HADOOP
INTO PLAY
solve, he said. "You get all this data, then you go off and
try to solve everything that you can think of."
But Lavu's team learned early on that small projects
were good starting points with Hadoop. "It's a whole
new way of doing things," he said. "Start with something
small that you can actually manage. It's about learning."
Lavu also told would-be enterprise Hadoop users to be
careful not to solve problems that are already solved.
For example, existing reports that are being produced
and distributed effectively don't need to be redone in Hadoop just for the sake of changing platforms.
Hadoop first gained attention based on the efforts of
systems programmers at Internet companies such as
Yahoo, Google, Facebook and Twitter. But incorporating
the technology into mainstream business and analytics
applications takes different skills. Even Web stalwarts
such as Salesforce.com have learned lessons while moving Hadoop into a support role for business decision
makers.
"When Hadoop comes to mind, too often it's only
the data, how big it is. But as you add more and more
users, you have to think in terms of the compute [requirements] also. It's not just the storage," said Ramesh
Koteshwar, a business intelligence architect at Salesforce.
Koteshwar anticipates that a sizable part of the company's workforce will ask questions about data collected in
Hadoop. "We expect hundreds and thousands of users on
the Hadoop cluster," he said.
Developing robust security capabilities is another part
of the process of bringing Hadoop to wider use, he said.
Hadoop use at Salesforce is very much still at an exploratory stage, and end-user access and authentication are
barriers that must be hurdled on the track to broader
deployment. "When you really want to bring it into the
enterprise, you want to make sure there are security policies and processes in place in front of the Hadoop [cluster]," Koteshwar said.
JACK VAUGHAN is SearchDataManagement.com's news and site editor. He covers topics
such as data warehousing, big data management, databases, data integration and data
quality. Vaughan previously worked as an
editor for TechTarget's SearchSOA.com, SearchVB.com,
TheServerSide.net and SearchDomino.com websites.
Email him at jvaughan@techtarget.com.