
Big data analytics: The cloud-fueled shift now under way

Public clouds are the future of enterprise big data analytics, and their use is creating the unified platform needed to realize its full value

By James Kobielus

Contributor, InfoWorld | MAR 8, 2018


Today’s big data analytics market is quite different from the industry of even a few years ago. The coming decade will see change, innovation, and disruption ripple through every segment of this global industry.

In the recently published annual update to its market study, Wikibon, the analyst group of SiliconAngle Media, found that the worldwide big data analytics market grew 24.5 percent in 2017 over the prior year. (I work for Wikibon.) This was faster than forecast in the previous year’s report, owing largely to stronger-than-expected public cloud deployment and utilization, as well as accelerating convergence of platforms, tools, and other solutions. Also, enterprises are moving more rapidly out of the experimentation and proof-of-concept phases with big data analytics and are achieving higher levels of business value from their deployments.


Going forward, Wikibon forecasts that the overall big data analytics market will grow at an 11 percent annual rate through 2027, reaching $103 billion globally. Much of the market growth in later years will be sustained by adoption of big data analytics in internet of things (IoT), mobility, and other edge-computing use cases.

The key trends in big data analytics’ evolution over the next decade

The following key trends, substantiated in Wikibon’s research, will drive the big data analytics industry’s evolution over the coming decade:

Public cloud providers are expanding their sway. The big data industry is converging around three principal public cloud providers—Amazon Web Services, Microsoft Azure, and Google Cloud Platform—and most software vendors are building solutions that operate in all of them. These and other big data public cloud providers—including such established big data vendors as IBM and Oracle—offer managed IaaS and PaaS data lakes in which customers and partners are encouraged to develop new applications and into which they are migrating legacy applications. As a consequence, the pure data-platform/NoSQL vendors seem to be flat-lining, becoming marginalized in a big data space increasingly dominated by diversified public cloud providers.

Public cloud advantages over private clouds continue to widen. Public clouds are becoming the preferred big data analytics platform for every customer segment. That’s because public cloud solutions are maturing more rapidly than on-premises stacks, adding richer functionality at an increasingly competitive cost of ownership. Public clouds are also growing their application programming interface (API) ecosystems and enhancing their administrative tools faster than their on-premises counterparts.

Hybrid clouds are becoming an intermediate stop for enterprise big data on the way to more complete deployment in public clouds. Hybrid clouds figure into the big data plans of most large enterprises, but predominantly as a transitional strategy. That’s because the balance is tipping toward enterprises putting more of their big data assets in public clouds. Recognizing this trend, traditional big data vendors are optimizing their products for hybrid use cases. By the same token, premises-based big data platforms are being rearchitected for deployment in public clouds.

Cloud-based big data silo convergence is speeding enterprise time to value. Users are beginning to step up the pace of consolidating their siloed big data assets into public clouds. The growing dominance of public cloud providers is collapsing the cross-business silos that have heretofore afflicted enterprises’ private big data architectures. Just as important, big data solutions, both cloud-based and on-premises, are converging into integrated offerings designed to reduce complexity and accelerate time to value. More solution providers are offering standardized APIs that simplify access, accelerate development, and enable more comprehensive administration throughout their big data stacks.
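
To make the standardized-API point concrete, here is a minimal sketch of pulling raw data from a cloud data lake through Amazon S3’s object-storage API via AWS’s boto3 library; the bucket name and object key are hypothetical placeholders, not any particular vendor’s offering.

```python
# A minimal sketch: reading raw data from a cloud data lake through a
# standardized object-storage API (Amazon S3, via AWS's boto3 library).
# The bucket name and object key below are hypothetical placeholders.
import boto3

s3 = boto3.client("s3")  # credentials come from the environment or AWS config

# Fetch one raw object from the (hypothetical) data lake bucket.
response = s3.get_object(
    Bucket="example-data-lake",
    Key="events/2018/03/08/events.json",
)
raw_bytes = response["Body"].read()

print(f"Fetched {len(raw_bytes)} bytes from the data lake")
```

Because the same object-storage API is supported by many tools and vendors, code like this can be pointed at different lakes with little rework, which is the convergence benefit described above.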

Innovative big data startups are bringing increasingly sophisticated AI-infused applications to market. Application providers are starting to disrupt the big data competitive landscape with AI-based solutions. The threat from new market entrants is accelerating in every big data segment, with most of the innovations designed for public or hybrid cloud deployments. Many new database, stream-processing, and data science startups have entered the market in the past several years.

Disruptive big data approaches are becoming viable alternatives to established platforms. Before long, a new generation of “unicorn” big data platform providers will emerge on the strength of a next-generation approach that blends IoT, blockchain, and stream computing. More of these next-generation big data platforms will be optimized for managing the end-to-end devops pipeline for machine learning, deep learning, and AI. Also, big data platforms are being architected to deliver AI microservices to edge devices.

Hadoop is becoming just a piece in the big data puzzle. We’re seeing signs that the marketplace regards Hadoop as more of a legacy big data technology than as a strategic platform for disruptive business applications. Nevertheless, Hadoop is a mature technology that is widely adopted for key use cases—such as the unstructured information refinery—in many IT organizations, and it still has a long useful life ahead of it. With that long-term perspective in mind, vendors continue to enhance their offerings by engineering smoother interoperability among independently developed hardware and software components.
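
As one illustration of the refinery use case, the sketch below distills raw, unstructured text into structured term counts using PySpark, which commonly runs alongside Hadoop on YARN and HDFS; the input and output paths are hypothetical placeholders.

```python
# A minimal sketch of an "unstructured information refinery": distilling raw
# text into structured (term, count) records with PySpark, which typically
# runs on Hadoop/YARN clusters. The HDFS paths are hypothetical placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("refinery-sketch").getOrCreate()

# Read raw, unstructured text from distributed storage (e.g., HDFS).
lines = spark.sparkContext.textFile("hdfs:///landing/raw_logs/*.txt")

# Refine: tokenize each line, then aggregate into structured term counts.
counts = (lines.flatMap(lambda line: line.lower().split())
               .map(lambda word: (word, 1))
               .reduceByKey(lambda a, b: a + b))

# Persist the refined output for downstream analytics.
counts.saveAsTextFile("hdfs:///refined/term_counts")
spark.stop()
```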

Users are increasingly mixing and matching multivendor big data deployments in open ecosystems.
Fewer big-data vendors are delivering solutions that incorporate proprietary, nonstandard, or non-open-
source componentry. Customers are taking advantage of today’s highly competitive market to extract
continuing enhancements from big data analytics vendors. Vendors, in turn, are decoupling their tools
into modular architectures in which customers may swap out components at various functional levels.
This is the best approach for vendors who want to gain a sustainable share in a market in which full-
stack vendor lock-in is a thing of the past.

Databases are being deconstructed and reassembled in innovative approaches. From an architectural standpoint, the database as we used to know it is waning. We are moving into a future in which streaming, in-memory, and serverless big data analytics infrastructures will reign supreme. Vendors are exploring new ways to rearchitect core database capabilities to address emerging requirements, such as automated machine learning pipelines and edge-facing cognitive IoT analytics. In this evolution, analytic and application databases are converging as more high-performance transactional analytic capabilities are integrated into data platforms of all types. Also, the database storage engine is becoming a repository primarily for machine data that is addressable through alternate structures such as key-value indices and object schemas.

Data science tool chains are increasingly automating the end-to-end devops pipeline. Big-data-augmented programming will continue to grow in sophistication. Developers have access to a growing range of devops tools for automating various tasks in the development, deployment, and management of machine learning, deep learning, and other AI assets. Some of these solutions even apply specialized machine learning algorithms to drive development functions such as hyperparameter tuning.
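
As one hedged illustration of automated hyperparameter tuning, the sketch below uses scikit-learn’s randomized search, a simple stand-in for the learning-based optimizers some toolchains apply; the model, parameter ranges, and synthetic data are arbitrary choices for the example.

```python
# A minimal sketch of automated hyperparameter tuning, one of the machine
# learning development functions the toolchains above automate. Uses
# scikit-learn's randomized search; the model and ranges are arbitrary.
from scipy.stats import randint
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

# Synthetic stand-in data, purely for illustration.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

search = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    param_distributions={
        "n_estimators": randint(50, 300),
        "max_depth": randint(2, 12),
    },
    n_iter=20,   # number of sampled hyperparameter configurations
    cv=3,        # 3-fold cross-validation per configuration
    random_state=0,
)
search.fit(X, y)

print("Best hyperparameters found:", search.best_params_)
print("Best cross-validated score:", search.best_score_)
```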

Packaged big data analytics applications are becoming more widely available. Over the coming decade,
more users will acquire big data analytics solutions as prebuilt, pretrained, and templatized cloud
services. More of these services will automatically adapt and tune their embedded machine learning,
deep learning, and AI models to continuously deliver optimal business outcomes. And more of these
services will incorporate pretrained models that customers can tweak and extend to their own specific
needs.
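
To sketch what tweaking and extending a pretrained model can look like in practice, the hedged example below fine-tunes a stock Keras ImageNet model for a new task; the synthetic data and added layers are placeholders, not any vendor’s actual packaged service.

```python
# A minimal sketch of extending a pretrained model, in the spirit of the
# pretrained, templatized services described above. Uses a stock Keras
# ImageNet model; the synthetic data and head layers are placeholders.
import numpy as np
import tensorflow as tf

# Load a pretrained feature extractor and freeze its weights.
base = tf.keras.applications.MobileNetV2(
    weights="imagenet", include_top=False, input_shape=(160, 160, 3))
base.trainable = False

# Add a small task-specific head that the customer trains on their own data.
model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # hypothetical binary task
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])

# Synthetic stand-in images and labels, purely for illustration.
X = np.random.rand(32, 160, 160, 3).astype("float32")
y = np.random.randint(0, 2, size=(32, 1))
model.fit(X, y, epochs=1, batch_size=8)
```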

Barriers to big data analytics’ evolution and deployment

Although the forecast for big data analytics adoption looks rosy, there remain many persistent issues
that frustrate users’ attempts to maximize the value of their investments in these technologies. Chief
among these are:

Excessive complexity. Big data analytics environments and applications are still too complex. Vendors will need to keep simplifying the interfaces, architectures, features, and tools of these environments. Doing so will put sophisticated big data analytics capabilities within reach of mainstream users and developers, many of whom lack in-house IT staff with the requisite specialized skills.

Cumbersome overhead. Big data analytics administration and governance processes are still too siloed, costly, and inefficient for many IT professionals. Vendors will need to build prepackaged workflows that help large teams of specialized personnel administer the data, metadata, analytics, and service definitions more efficiently, rapidly, and accurately.

Protracted pipelines. Big data analytics application development and operationalization pipelines are still too time-consuming and manual. Vendors will need to step up their tools’ automation features to boost the productivity of users’ technical staff while ensuring consistent handling of complex tasks, even by low-skilled personnel.

Custom applications. Big data analytics professional services are still essential for developing, deploying, and managing many custom applications. This is especially true for data-driven applications that span hybrid clouds, involve disparate platforms and tools, and incorporate unfathomably complex data processes. Vendors need to beef up the prepackaged application content for common big data analytics applications while giving users self-service, visual tools for specifying complex business logic without external assistance.

For enterprise IT, Wikibon’s chief recommendation is to start migrating more of your big data analytics
development efforts to public cloud environments. This will accelerate your ability to take advantage of
rapidly maturing, low-cost offerings provided by Amazon Web Services, Microsoft, Google, IBM, and
other public cloud providers. You should consider building out your enterprise hybrid cloud to ensure a
smooth transition to the public cloud over the next several years.

James Kobielus is SiliconAngle Wikibon's lead analyst for AI, data science, and application development.
