You are on page 1of 30

Asuri Saranathan

BEST PRACTICES IN DATA WAREHOUSING AND BUSINESS INTELLIGENCE

Agenda
Introduction
Best Practices Over View Deep Dive Conclusion Q&A

Introduction

Speaker
Holds Bachelor degree in Physics and Electrical and Electronics

Engineering.
Over 26 years of Experience in Power System and Information

Technology field.
Has built several large scale applications including online CRM for

multinationals.
Has managed several Data Warehousing and BI projects for Direct

Marketing, Manufacturing and Auto finance verticals.


Functioned as Solution Architect for Data warehousing and

reporting projects.
ISO auditor Certified Bullet Proof Manager from CrestComm USA.

Best Practices- Overview

What are Best Practices?


Is it a technology?
Is it application of a set of best tools

available in the market? Is it a Framework?

Best Practices Definition


A framework of a set of processes or method that exhibits achievement of specific results in a specific manner over a sustained period

of time. The framework should have certain characteristics in that they should be repeatable.

Do Best Practices Evolve?


Yes they do.
Because of Innovation Changes in Technology

Changes in Law or Governance Structure.


Expectations, Values , Knowledge or other that

makes the practice outdated or inappropriate.

Where can it be Applied?


Practically in all fields.

How do we apply Best Practices to Data Warehousing and Business Intelligence?

Data Warehouse - Definition


In an elementary form , it is the collection of key information that can be used by the business users to become more profitable. But Is this definition sufficient ?
We need much more precise definition of

what a data ware house is .

What is a Data Warehouse?


A Data warehouse is the
Data ( Meta / Fact / Dimension/ Aggregation)

and The Process Managers ( Load / Warehouse / Query) That make information available , enabling the user to make informed decisions.

Deep Dive

What is the Challenge?


Business is never Static.
And so is Data Warehouse.

In order to respond to todays requirement for instant access to corporate information , the data warehouse should be designed to respond to this need in a optimal way. Business itself probably not aware of what information is required in the future. This requires a fundamentally different approach than the traditional waterfall method of software development for the Data warehouse.

Experience so far
Most Enterprise Data Warehousing projects tend to have

development cycle of between 18 24 months from start to finish. Justification of this investment is substantial. Businesses would prefer a better approach to justify the investment.

What should be done?


Focus on Business Requirements
A clear understanding of what is short term

and long term requirement of the data warehouse. An Architecture design that would evolve. Identification of quick win that delivers business benefit in the first build.

Scalability for Growth


Scalability means ability of the underlying Hardware and Software to support increasing demands over a period of time.

Horizontal Scalability
High Speed Network
CPU CPU CPU CPU CPU CPU CPU CPU

RAM

RAM

RAM

RAM

RAM

RAM

RAM

RAM

DB

DB

DB

DB

DB

DB

DB

DB

Multiple servers are connected thru a network and use the data partitioning feature of the Database to tie the CPUs together.

Data Warehouse Environment


Staging Area

Data warehouse (System of Record) Full History in 3rd Normal Form No User Access

Data Mart

Source Systems

Summary Area Full History User Access

Analytical Area User Access

Strategy
Metadata Mgmt Architecture Integration Control Delivery Data Quality Spec Analysis Measurement Improvement Data Architecture Entp. DM Value Chain Data Development Analysis Data Modeling DB Design Implementation

Document / Content Mgmt Acquisition & Storage Backup & Recovery Content Retrieval Retention

Data Governance

Data Operations Acquisition Recovery Tuning Purging

DWH / BI Architecture Implementation Training and Support Tuning

Reference and MDM External & Internal Code Customer Data Product data

Data Security Standards Classification Administration Authentication Auditing

Environment
Organization & Culture

Technology

Activities

Goals & Objectives

Practices & Techniques

Deliverables

Roles & Responsibilities

Architecture Requirements
Must recognize change as a constant
Take incremental development approach Existing applications must continue to work

Need to allow more data and new types of data to be

added

High Level
Remember the different worlds
On-line transaction processing (OLTP) Business intelligence systems (BIS)

Users are different


Data content is different Data structures are different Architecture & methodology must be

different

Best Practice #1
Use a Data model that is optimized for Information retrieval
dimensional model

denormalized
hybrid approach

DW Architecture Best Practices

September 8, 2013

24

Best Practice #2
Carefully design the data acquisition and cleansing processes for your DW
Ensure the data is processed efficiently and

accurately Consider acquiring ETL and Data Cleansing tools Use them well!

DW Architecture Best Practices

September 8, 2013

25

Best Practice #3
Design a metadata architecture that allows sharing of metadata between components of your DW
consider metadata standards such as OMGs

Common Warehouse Metamodel (CWM)

DW Architecture Best Practices

September 8, 2013

26

Best Practice #4
Take an approach that consolidates data into a single version of the truth
Data Warehouse Bus conformed dimensions & facts OR?

DW Architecture Best Practices

September 8, 2013

27

Best Practice #5
Consider implementing an ODS only when information retrieval requirements are near the bottom of the data abstraction pyramid and/or when there are multiple operational sources that need to be accessed
Must ensure that the data model is integrated, not just

consolidated May consider 3NF data model Avoid at all costs a data dumping ground

DW Architecture Best Practices

September 8, 2013

28

Pitfalls to be Avoided
Engagement of Non-BI Manger in a BI delivery Project.
Trying to please the client and the user community. Expecting the Service Provider to own the Project completely. Bringing the Solution Architect half way into the project.

Allowing the Business Users to drive the Data Model.


Not having the right people with right skills in tool selection process. Expecting the contractor to deliver all that they presented. Over dependency on the Service provider or contractor in execution.

Assuming that the Data quality will be handled somehow.


Assuming that the Data warehouse project is over once it is

deployed.

Data Warehouse Architecture Best Practices

Thank You

You might also like