
DATA SCIENCE AND BIG DATA ANALYTICS

DATA SHEET REVISED: 08.01.14


COURSE OVERVIEW
This course provides practical foundation level training that enables immediate and effective participation in big data and other analytics projects. It includes an introduction to big data and the Data Analytics Lifecycle to address business challenges that leverage big data. The course provides grounding in basic and advanced analytic methods and an introduction to big data analytics technology and tools, including MapReduce and Hadoop. The extensive labs throughout provide many opportunities for students to apply these methods and tools to real-world business challenges as a practicing Data Scientist. The course takes an “Open”, or technology-neutral approach, and includes a final lab in which students address a big data analytics challenge by applying the concepts taught in the course in the context of the Data Analytics Lifecycle. The course prepares the student for the Proven™ Professional Data Scientist Associate (EMCDSA) certification exam, and establishes a baseline of Data Science skills that can be enhanced with additional training and further real-world experience.

COURSE OBJECTIVES
Upon successful completion of this course, participants should be able to:
Immediately participate and contribute as a Data Science Team Member on big data and other analytics projects by:
• Deploying the Data Analytics Lifecycle to address big data analytics projects
• Reframing a business challenge as an analytics challenge
• Applying appropriate analytic techniques and tools to analyze big data, create statistical models, and identify insights that can lead to actionable results
• Selecting appropriate data visualizations to clearly communicate analytic insights to business sponsors and analytic audiences
• Using tools such as: R and RStudio, MapReduce/Hadoop, in-database analytics, Window and MADlib functions
Explain how advanced analytics can be leveraged to create competitive advantage and how the data scientist role and skills differ from those of a traditional business intelligence analyst

DELIVERY METHODS
• Instructor-led
• Live-online

COURSE DURATION
• Five days of instructor-led classroom training

TARGET AUDIENCE
• Managers of teams of business intelligence, analytics, and big data professionals
• Current Business and Data Analysts looking to add big data analytics to their skills
• Data and database professionals looking to exploit their analytic skills in a big data environment
• Recent college graduates and graduate students with academic experience in a related discipline looking to move into the world of Data Science and big data

PREREQUISITES
• A strong quantitative background with a solid understanding of basic statistics
• Experience with a scripting language, such as Java, Perl, or Python (or R)
• Experience with SQL

PRICING
Please visit our website at pivotal.io/academy

MORE INFORMATION
On-site training is also available for customers who prefer to bring a Pivotal Certified Instructor to their own facilities. For additional information about on-site classes, including facility requirements, contact education@pivotal.io


COURSE MODULES

1. INTRODUCTION AND COURSE AGENDA

2. INTRODUCTION TO BIG DATA ANALYTICS
• Big Data Overview
• State of the Practice in Analytics
• The Data Scientist
• Big Data Analytics in Industry Verticals

3. DATA ANALYTICS LIFECYCLE
• Discovery
• Data Preparation
• Model Planning
• Model Building
• Communicating Results
• Operationalizing

4. REVIEW OF BASIC DATA ANALYTIC METHODS USING R
• Using R to Look at Data – Introduction to R
• Analyzing and Exploring the Data
• Statistics for Model Building and Evaluation

5. ADVANCED ANALYTICS–THEORY AND METHODS
• K Means Clustering
• Association Rules
• Linear Regression
• Logistic Regression
• Naïve Bayesian Classifier
• Decision Trees
• Time Series Analysis
• Text Analysis

6. ADVANCED ANALYTICS–TECHNOLOGIES AND TOOLS
• Analytics for Unstructured Data–MapReduce and Hadoop
• The Hadoop Ecosystem
• In-database Analytics–SQL Essentials
• Advanced SQL and MADlib for In-database Analytics

7. THE ENDGAME, OR PUTTING IT ALL TOGETHER
• Operationalizing an Analytics Project
• Creating the Final Deliverables
• Data Visualization Techniques
• Final Lab Exercise on Big Data Analytics

At Pivotal our mission is to enable customers to build a new class of applications, leveraging big and fast data, and do all of this with the power of cloud independence. Uniting selected technology, people and programs from EMC and VMware, the following products and services are now part of Pivotal: Greenplum, Cloud Foundry, Spring, GemFire and other products from the VMware vFabric Suite, Cetas and Pivotal Labs.
By procuring these services, Customer agrees that the terms and conditions set forth here: http://pivotal.io/svcs-terms are incorporated by reference into this Data Sheet and shall govern the
provision of Pivotal’s Services herein.
Pivotal 3495 Deer Creek Road, Palo Alto, CA 94304 Pivotal.io
Pivotal is a registered trademark or trademark of Pivotal Software, Inc. in the United States and other countries. All other trademarks used herein are the property of their respective owners. © Copyright 2014 Pivotal Software, Inc. All
rights reserved. Published in the USA.
