QUEST FOR KNOWLEDGE

ETL ARCHITECTURE IN DEPTH

DATE: 10-13 June 2013
LOCATION: Amsterdam
INSTRUCTORS: Ralph Kimball and Bob Becker
INFORMATION AND REGISTRATION: www.Q4K.com

Kimball University
Kimball University (KU), operated by the Kimball Group, is the definitive source for dimensional data warehouse
education. KU provides the highest quality and most practical education, consistent with the KU instructors' books and
extensive experience in the dimensional approach. You'll learn from the best in the business. The KU instructors literally
wrote the books; why settle for anything less than the original inventors and authors of these concepts?
KU offers public classes in US and international venues. In addition, KU instructors teach classes on-site at client
locations. All class content is vendor neutral, with the exception of the Microsoft-centric courses.

RALPH KIMBALL
Ralph Kimball is known worldwide as an innovator,
writer, educator, speaker and consultant in the field
of data warehousing. He has remained steadfast in
his long-term conviction that data warehouses must be
designed to be understandable and fast. His books
on dimensional design techniques have become the
all-time best sellers in data warehousing. His books
include The Data Warehouse Toolkit (2nd Edition),
The Data Warehouse Lifecycle Toolkit (2nd Edition),
The Data Webhouse Toolkit, The Data Warehouse ETL
Toolkit, The Microsoft Data Warehouse Toolkit (2nd
Edition) and The Kimball Group Reader. To date Ralph
has written more than 100 articles and columns for
Intelligent Enterprise and its predecessors, winning the
Readers' Choice Award five years in a row.
After receiving a Ph.D. in 1972 from Stanford in
electrical engineering (specializing in man-machine
systems), Ralph joined the Xerox Palo Alto Research
Center (PARC). At PARC Ralph co-invented the Xerox
Star Workstation, the first commercial product to use
mice, icons and windows.
Ralph then became vice president of applications at
Metaphor Computer Systems, a pioneering decision
support software and services provider. As a hands-on
manager, he developed the Capsule Facility in 1982.
The Capsule was a graphical programming technique
that connected icons together in a logical flow,
allowing a very visual style of programming for
non-programmers. The Capsule was used to build reporting
and analysis applications at Metaphor.
Ralph founded Red Brick Systems in 1986, serving
as CEO until 1992. Red Brick Systems, now owned
by IBM, was known for its lightning-fast relational
database optimized for data warehousing. Ralph
Kimball Associates was incorporated in 1992 to provide
data warehouse consulting and education.

BOB BECKER
Bob Becker has worked with business managers and IT
professionals to prioritize, justify and implement
large-scale decision support and data warehousing systems
since 1990. Regardless of the industry, he is highly
skilled at identifying business requirements, facilitating
organizational consensus and designing dimensional
data models. Bob leverages these consulting
experiences when teaching the Kimball University on-site
courses. He co-authored The Data Warehouse Lifecycle
Toolkit (2nd Edition) and The Kimball Group Reader.
Before co-founding DecisionWorks, Bob worked
at Metaphor. He also held various sales and
management positions with Oracle, Tandem
Computers, IBM and Data General. He graduated
from the University of Minnesota School of Business
with a BSB in Marketing.

Course Outline
Note: Numbered items refer to the 34 ETL subsystems
Surrounding the Requirements
Note: Augmented by class input; students and instructor propose requirements for a comprehensive
ETL system design

Business needs
Compliance
Data profiling
Security
Integration needs
Latency (daily, hourly, seconds, instantaneous)
Archiving (recent history, very long term)
Lineage and impact
User profiles (developers, business users, analysts)
Existing IT skills (traditional EDW, new Big Data systems)
Existing technology licenses
Hand coding vs. ETL tool choice
Class roundtable exercise: Challenges in students' environments
Extract Steps: Bringing the Data to the Back Room
Data types used in ETL systems
(1) Data profiling
Source to target map
Access methods, source types (including new Big Data)
Software, techniques
(2) Change data capture
(3) Extract window
(3) Immediate transformations
(3) Extract staging table designs, table types, retention, backup
(3) Technical extraction tips
(3) Traditional mainframe sources
(3) XML sources and persistence of structures in back room
ERP system sources
Example vendors: Microsoft SSIS, Pentaho Kettle
Service oriented architectures, WSDL, and SOAP
Big data sources
(22) Job scheduler
(22) Exception handling architecture
(23) Back up
Short term and long term recovery, archiving, sunsetting
(24) Recovery, (24) Restart
Architecture of Data Cleansing
(4) Data quality architecture
(4) Data quality screens (column, structure, and business rule)
(4) Business rule screens from statistical forecasts
(4) Column and structure screens from data profiling
(5) Error event fact tables
Fact table surrogate keys
(6) Building the audit dimension and exposing in BI tool
(4, 5, 6) Implementing data quality architecture in agile environment
(7) De-duplication and survivorship
Real Time Data Warehousing
Hot partition
Streaming versus batch ETL
Streaming delivery, query, reporting, dashboards, notifications
Enterprise application integration (EAI) architecture
Micro-batch ETL (MBETL) architecture
Enterprise information integration (EII) architecture
Big Data Predictive Analytics
Big Data use cases
Four Vs: volume, variety, velocity, value
MapReduce, Hadoop, Pig, Hive, HBase
When to export to conventional RDBMS
Architecture of Data Integration
(8) Conforming dimensions, definition, impact on BI
(8) Centralized and distributed responsibilities using conformed dimensions
(8) Implementing conformed dimensions in agile environment
(8) Example vendors: Pentaho, Microsoft SSIS, Informatica, Zend Studio
(28) Sorting
(25) Version control
(26) System and version migration, testing and regression

(27) Workflow monitor
(27) Example vendors: Microsoft SSIS, IBM Tivoli, Informatica
(22) Job scheduler
(29) Lineage and dependency analyzer
(30) Problem escalation system

Delivering Dimension Tables


(9) Time variance designs for slowly changing dimensions
(10) Surrogate key generator
(15) Multi-valued dimensions, bridge tables
(11) Hierarchical dimensions
- Fixed
- Variable
- Ragged
- Bridge tables revisited
(12) Special dimensions
Date / Time dimensions
Junk dimensions
Mini-dimensions
Small dimensions
User maintained dimensions
Shrunken dimensions
Outrigger dimensions
Behavior tags
Step dimensions
Super type / Sub type dimensions
Study groups
Special cases: extreme dimensionality and dimension width, incompatible
members
Delivering Fact Tables
(13) Fact table builder
- Transaction
- Periodic snapshot
- Accumulating snapshot
- Consolidated
(14) Surrogate key pipeline
Referential integrity
Graceful extensibility
- Add attributes and facts
- Add dimensions to existing schemas
(16) Late arriving dimension and fact data
(17) Dimension manager
- Responsibilities and procedures
- Real time complexities
(18) Fact provider
- Responsibilities and procedures
- Real time complexities
(19) Aggregations
(20) Feeding OLAP cubes
(21) Data Integration manager
- Feeding data mining
- Presentation layer extracts
- 3rd party flat files
Development and Operations
(31) Parallel processing and pipelining
(32) Security
(33) Compliance
(34) Metadata
- Metadata context
- Process metadata
- Technical metadata
- Business metadata
- Metadata options
- Metadata strategy

Course description
This course helps you understand all the factors necessary for effectively designing the
back room ETL system of your DW/BI environment. It aims to ensure that critical
processes within the ETL system are not overlooked. Even if you don't have an immediate
qualified need for every ETL subsystem on our list, it is likely that you will over time. By the
end of this course, you will understand how your data warehouse ETL system can be built
to anticipate these potential requirements.
This is not a microscopic code-oriented implementation course; it is a vendor-neutral
architecture class for the designer who must keep a broad perspective. The course is
organized around the 34 necessary ETL subsystems, which are developed in detail as
the course progresses. During the course, each student builds (on paper) a comprehensive
ETL system based on a realistically complex example, starting with the first steps of extraction
through the final steps of data delivery to the presentation area for your BI tool.

Who should attend


This course is designed for those responsible for building the back room ETL system of a
data warehouse environment, including ETL architects, ETL designers and developers, and
data warehouse operational staff.
Since dimensional models are the ultimate ETL deliverables, some familiarity with the basic
principles of dimensional modeling is necessary. Students can gain this knowledge by
reading the following Data Management Review articles found at www.kimballgroup.com:
Resist the Urge to Start Coding (Nov 2007):
www.kimballgroup.com/2007/10/29/resist-the-urge-to-start-coding/
Set Your Boundaries (Dec 2007):
www.kimballgroup.com/2007/11/29/set-your-boundaries/
Data Wrangling (Jan 2008):
www.kimballgroup.com/2008/01/03/data-wrangling/
Dimensional Perspectives (Feb 2008):
www.kimballgroup.com/2008/01/17/dimensional-perspectives-myth-busters/

Registration fee
The fee for this 4-day course is EUR 2.695,00 per person. This includes four days of
instruction, lunch and morning/afternoon snacks, course materials and a KU Certificate
of Completion. Students receive a copy of The Data Warehouse ETL Toolkit.
We offer the following discounts. Discounts cannot be combined.
- 10% Early Bird discount for students registering before 19 April 2013. Payment must
be received before the cut-off date to receive the discount.
- 10% discount for groups of 3 or more students from the same company registering
at the same time.
- 20% discount for groups of 5 or more students from the same company registering
at the same time. Register 5 students, only pay for 4.
Note: Groups that register at a discounted rate must retain the minimum group size or the
discount will be revoked.
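
The discount rules above can be checked with a small calculation. The sketch below is illustrative only: the function names are hypothetical, amounts exclude VAT, and it assumes that when more than one discount applies, the single largest one is used (the brochure states only that discounts cannot be combined).

```python
from datetime import date
from decimal import Decimal

FEE_PER_STUDENT = Decimal("2695.00")   # EUR per person, excluding VAT
EARLY_BIRD_CUTOFF = date(2013, 4, 19)  # register and pay before this date


def discount_rate(group_size: int, registered: date, paid: date) -> Decimal:
    """Return one discount rate; discounts are never combined.

    Assumption: if several discounts apply, the largest one is used.
    """
    rates = [Decimal("0")]
    if registered < EARLY_BIRD_CUTOFF and paid < EARLY_BIRD_CUTOFF:
        rates.append(Decimal("0.10"))  # Early Bird
    if group_size >= 5:
        rates.append(Decimal("0.20"))  # group of 5 or more
    elif group_size >= 3:
        rates.append(Decimal("0.10"))  # group of 3 or more
    return max(rates)


def total_fee(group_size: int, registered: date, paid: date) -> Decimal:
    """Total fee in EUR (excluding VAT) for a group registering together."""
    gross = FEE_PER_STUDENT * group_size
    return gross * (1 - discount_rate(group_size, registered, paid))
```

For example, five students from the same company registering together after the Early Bird date pay EUR 10,780 in total, the same as four full fees, which matches "Register 5 students, only pay for 4".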


Venue
Steigenberger Airport Hotel Amsterdam is
located on the outskirts of the Amsterdam
Forest, only 7 minutes from Schiphol
airport and a few minutes away from the
highway. The hotel's underground car park
can accommodate up to 266 vehicles,
and the hotel offers a free shuttle bus
service to and from the airport.
Steigenberger Airport Hotel Amsterdam
Stationsplein ZW 951
1117 CE Amsterdam Schiphol-Oost
The Netherlands
T: +31 20 5400-777
F: +31 20 5400-700
E: airporthotel-amsterdam@steigenberger.nl
W: www.steigenberger.com

FAX REGISTRATION FORM: +31 76 572 21 96

Course Details
ETL Architecture in Depth
10-13 June 2013
Amsterdam
EUR 2.695 (ex. VAT)

Company Details
Company Name:
E-mail:
Contact Name:
Telephone:
Address:
Fax:
Postal Code:
Website:
City:
Invoice Address:
Country:
Postal Address:
VAT Number:
Purchase Order no.:

Student Details
First Name:
Last Name:
Gender: Male / Female
E-Mail:
Job Title:
Telephone:

Authorization
Name:
Job Title:
Date:
Signature:

Registration Information
Confirmation and Invoicing: Upon receipt of your registration, our
customer service department will send you a customer information
pack including details of payment and hotel information. Full payment
is due prior to the course start date.

Cancellations and Substitutions: Cancellations must be received
in writing 20 working days prior to the course start date and are
subject to a 20% administration fee. Otherwise the full registration
fee remains due. As an alternative to cancellation, you may transfer
your place on the course to a colleague at no extra cost, but
Quest For Knowledge must be informed of this transfer in
advance. Quest For Knowledge reserves the right to cancel any
course at any time without any liability whatsoever, save for the
refund of the registration fee.

Quest For Knowledge

The Netherlands
Hoge Schouw 1H | 4817 BZ Breda
T: +31 76 572 21 99 | F: +31 76 572 21 96
Belgium
Uitbreidingstraat 84-3 | 2600 Antwerp
T: +32 3 877 93 39 | F: +32 3 877 93 41
Online
www.Q4K.com | info@Q4K.com

Organized by: Quest For Knowledge
With the support of: KVL Inspiratie Technologie

Architecting an IT environment that stands the test of time begins
with a clear vision of the durability of all of its components.
Quest for Knowledge (Q4K) concentrates on education and
training on software and concepts that have a bright future
in one of these interrelated disciplines: Data Warehousing,
Business Intelligence and Customer Relationship Management.
The Q4K Data Warehouse and Business Intelligence curriculum
provides the most comprehensive education and training
available in the Benelux. With in-depth Data Warehouse
courses and a series of product-oriented training classes for
leading Business Intelligence solutions, Q4K training provides
you with the best knowledge transfer and a sound foundation
to make your projects successful. Visit our website www.Q4K.com
or request our training catalog for a complete overview.

KVL Inspiratie Technologie (Inspiration Technology) is a group
of companies that delivers expertise in the areas of
Business Intelligence, Data Warehousing, Data Integration
and Data Quality. With more than 15 years of
experience, KVL differentiates itself in the market through
a results-driven approach with high cost savings. KVL delivers
certified specialists with a broad knowledge base and many
years of experience. KVL works with market-leading technology
partners IBM, Informatica and Microsoft. For example, for the
Microsoft BI platform, KVL develops DDM Studio for Microsoft
SQL Server 2008, also compliant with Microsoft SQL Server
Fast Track. KVL for everyone: proven examples include cost savings
of 75% or more on development cost, a reduction of TCO of at
least 50%, and a reduction of elapsed project time of up to 75%,
with a guaranteed result within 3 months. KVL
Inspiration Technology is a dynamic, innovative organization;
excellent customer satisfaction characterizes KVL in the
Dutch market and explains its strong organizational growth.
KVL proves that professional services can always be improved:
with a strong vision, best practice is turned into next practice,
proven by strong results. It is no coincidence that the KVL offices
are located in Rotterdam: the no-nonsense approach and
results-driven mentality are in its genes. Within its professional
fields, KVL has acquired a unique position in the market without
losing its roots.

Kimball University
Kimball University (KU) is the definitive source for dimensional
data warehouse education. KU provides the highest quality
and most practical education, consistent with the KU instructors'
books and extensive experience in the dimensional
approach. You'll learn from the best in the business. Kimball
University offers public classes in venues around the US and
internationally. In addition, KU teaches classes on-site at client
locations. All class content is vendor neutral, with the exception
of the Microsoft-centric courses.
