
CHAPTER 13:

DATA WAREHOUSE
PROJECT DOCUMENTATION
PART II:
1. DOCUMENTATION OVERVIEW
2. TEMPLATES
WHAT IS DOCUMENTATION?

• Documentation is a set of documents provided
on paper, online, or on digital or analog media
such as audio tape or CDs.
• Documentation is distributed via websites,
software products, and other on-line
applications.
DOCUMENTATION ROADMAP
TEMPLATES
- THE TEMPLATES ARE DIVIDED INTO 11
CATEGORIES.
- WITHIN EACH CATEGORY, THE
DOCUMENTS ARE NUMBERED
SEQUENTIALLY.

NOTE: THE TEMPLATES IN CATEGORY 11 ARE JUST
GENERAL ONES THAT CAN BE USED AS REQUIRED.
1. CONCEPT

Definition: The business may have a concept and the
IT team will be able to describe the major components
and concepts of a data warehouse.
1.1 Business Concepts for the Data Warehouse
- It describes subject areas and their broad
relationships as well as key performance indicators
used by the business.
1.2 Overview Architecture for Enterprise Data
Warehouses – It is a design pattern for data
warehousing to describe the basic concepts of the
data warehouse.
2 REQUIREMENTS

Definition: The objective of these templates is
to give breadth and depth to the requirements.
Breadth is the ability to ensure that all truly
required information is covered, whilst
depth is the amount of detail that is specified in
the requirements to ensure that the developers
have sufficient, unambiguous detail with which
to develop.
2.1 Data Warehouse Business Requirements
(WBR) – It details the ‘soft’ requirements for
business information according to a number of
subject areas of interest to the business.
2.2 Data Warehouse Data Requirements (WDR)
- This is the refinement of the business
requirements in that the analysts can use the
business requirement to drive out the data
required to answer the questions.
2.3 Data Warehouse Query Requirements
(WQR) – It lists a number of potential queries to
which the solution should be able to provide
answers.
2.4 Data Warehouse Technical Requirements
(WTR) - Details the functional and non-functional
requirements that are expected of the solution.
2.5 Data Warehouse Interface Requirements
(WIR) - Details the requirements for interfaces
that feed from the data warehouse out to other
systems.
2.6 Business Definitions Dictionary (BDD) - It
is important that a common dictionary is
developed and kept so that there is a common
reference for words.
3 ARCHITECTURE
Definition: The architecture category contains a
number of documents that describe how the
system should be built.
3.1 Technical Architecture - Describes the
technical components that will be used to build
the system, including the hardware, software and
network configuration, along with specific
versions where appropriate and standard.
3.2 Security Model – It should describe all the
roles, groups, etc. that will be required for
each component of the system.
3.3 Resilience Plan - This should include the need for
redundant hardware and networks; incremental,
cumulative and full backups; restores of individual
components or entire systems; how to back out records
individually, as a group or as entire sets; and how disasters
such as the loss of a data centre are managed.
3.4 Data Quality Plan (DQP) – This will include the
principles of where data is cleansed (in the source, in the
staging, in the data warehouse itself, etc.), how it is
profiled, what type of cleansing is carried out (e.g. rule
based or heuristic), what metrics are
set and monitored for data improvement, etc.
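As an illustration of the rule-based cleansing and monitored metrics mentioned in 3.4, here is a minimal Python sketch. The field names, the rules themselves and the `changed_fields` metric are hypothetical and chosen only for the example; a real Data Quality Plan would define its own.

```python
from typing import Callable

# A rule is any function that takes a raw string and returns a cleansed string.
Rule = Callable[[str], str]

# Hypothetical rule set: field name -> ordered list of cleansing rules.
RULES: dict[str, list[Rule]] = {
    "country": [str.strip, str.upper],
    "email": [str.strip, str.lower],
}


def cleanse(record: dict[str, str]) -> dict[str, str]:
    """Apply every rule registered for a field, in order; other fields pass through."""
    cleaned = {}
    for field_name, value in record.items():
        for rule in RULES.get(field_name, []):
            value = rule(value)
        cleaned[field_name] = value
    return cleaned


def changed_fields(before: dict[str, str], after: dict[str, str]) -> int:
    """Example metric: how many fields the cleansing step had to alter."""
    return sum(1 for key in before if before[key] != after[key])


raw = {"country": " gb ", "email": " A@B.COM", "name": "Ann"}
clean = cleanse(raw)
print(clean, changed_fields(raw, clean))
```

Tracking a count such as `changed_fields` over time is one simple way to monitor whether data quality at the source is improving.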
4 DATA MODELS

Definition: Data models are (normally) graphical
representations of the data that is required.
4.1 Data Modelling Standards - Describes the
naming conventions of objects in the database,
as well as any particular modelling methods (e.g.,
a hierarchy must always be modelled in a
specific way and any exceptions noted along with
a justification for the difference).
4.2 Logical Model – Is a model that represents
the true structure of data used by the business,
independent of software or hardware
implementation constraints.
4.3 Repository Data Model - Is a physical data
model of the main storage area within a data
warehouse.
4.4 Data Mart Data Model(s) - Are the physical
models of the part of the system that the user will
query.
5 ANALYSIS

Definition: The goal of the analysis phase is to
identify the sources of the information required to
populate the physical data models.
5.1 Source Systems Analysis (SSA) - Is a high-
level analysis that gathers information about
available systems.

5.2 Data Profiling - Is a process whereby an
existing source system is examined in order to
collect information and statistics about the data
held.
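The kind of statistics that data profiling collects can be sketched in a few lines of Python. This is only an illustration: the function name `profile_column` and the particular statistics chosen are assumptions, not part of the template.

```python
from collections import Counter


def profile_column(values):
    """Collect basic statistics about one column of a source table.

    `values` is a list of column values, with None standing in for NULL.
    """
    non_null = [v for v in values if v is not None]
    counts = Counter(non_null)
    return {
        "rows": len(values),
        "nulls": len(values) - len(non_null),
        "distinct": len(counts),
        "min": min(non_null) if non_null else None,
        "max": max(non_null) if non_null else None,
        "most_common": counts.most_common(1)[0][0] if counts else None,
    }


print(profile_column([3, 1, None, 3]))
```

A high null count or an unexpectedly low distinct count is the sort of finding that helps decide whether a source is useful for the data warehouse.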
5.3 Source Entity Analysis (SEA) - Is the
detailed documentation of the sources selected
because data profiling has validated these
sources as being useful for the data warehouse.

5.4 Target Orientated Analysis (TOA) - Is used
to describe which sources will be used to
populate which target entities.
6 DESIGN

Definition: The design phase concentrates on taking
the analysis and creating a plan for the code build.
6.1 ETL Execution Plan - Is a document that
explains from the high level down to the low level
how the ETL code will be put together.
6.2 Initial Capacity Plan - Describes the sizes of the
databases and database objects required for the
initial build.
6.3 Coding Standards - Describes the naming
conventions for all objects that will be created,
including but not limited to: database objects such as
table and column names, ETL mapping names, script
names etc.
7 BUILD

Definition: Is a roadmap to the documentation
that should be produced during a data
warehouse project.
7.1 Code Repository - A lot of the code will
contain valuable documentation in the form of
comments. It is also vital that the history of
changes to code is recorded.
7.2 Data Cleansing Integration - Earlier work will have
generated a number of rules that have to be
implemented in order to maintain data quality.
8 TEST

Definition: Testing software is operating the
software under controlled conditions to:
1. Verify that it behaves “as specified”.
Verification is the checking or testing of items,
including software, for conformance and
consistency by evaluating the results against
pre-specified requirements.
2. Detect errors. Testing should intentionally
attempt to make things go wrong to determine
if things happen when they should not or
things do not happen when they should.
8.1 Unit Testing - designed to validate that an
individual unit of development work (normally an ETL
mapping, input screen or report) is functioning as
expected.
8.2 System Testing - designed to check that a suite
of newly developed or changed units work correctly
together in the expected manner.
8.3 Integration Testing - designed to ensure that the
suites of newly developed or changed units work with
other suites that are already deployed on the system
and do not damage the existing product environment.
8.4 Performance Testing - is designed to ensure the
performance of the system.
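A unit test for a single ETL mapping (8.1) might look like the following Python sketch. The mapping `map_customer` and its column names are hypothetical. The first test verifies behaviour "as specified"; the second deliberately tries to make things go wrong.

```python
import unittest


def map_customer(source_row: dict) -> dict:
    """Hypothetical ETL mapping: turn one source row into one target row."""
    return {
        "customer_id": int(source_row["CUST_NO"]),  # raises ValueError if not numeric
        "full_name": f'{source_row["FIRST"].strip()} {source_row["LAST"].strip()}',
    }


class MapCustomerTest(unittest.TestCase):
    def test_maps_and_trims(self):
        # Verification: the output matches the pre-specified target format.
        row = {"CUST_NO": "42", "FIRST": " Ada ", "LAST": "Lovelace "}
        self.assertEqual(
            map_customer(row),
            {"customer_id": 42, "full_name": "Ada Lovelace"},
        )

    def test_rejects_non_numeric_id(self):
        # Error detection: bad input must fail loudly rather than load silently.
        with self.assertRaises(ValueError):
            map_customer({"CUST_NO": "abc", "FIRST": "A", "LAST": "B"})
```

Saved in a file, these tests can be run with `python -m unittest`. System and integration tests follow the same pattern but exercise several units together.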
9 IMPLEMENTATION
Definition: After the development and testing are over
the system has to be deployed into production and left
operating.
9.1 Configuration Management Procedures - It
should cover all aspects of the changes to the
configuration from applying patches and new releases
through to system software upgrades.
9.2 Operations Guide – It is intended for those with
responsibility for looking after the system on a
day-to-day basis.
9.3 Capacity Plan - The Initial Capacity Plan will
already have been produced during design; this
document follows on from it.
9.4 Service Level Agreements (SLA) – It is a formal
negotiated agreement between two parties.
9.5 Helpdesk Scripts - The helpdesk needs to be able to
handle support calls.
9.6 Training Plan – The project will need to provide a
training plan. This is how users become competent enough
to use the system.
9.7 Operational Schedule – It is the list of tasks that
must be performed each hour, day, week and month, etc.
and any dependencies (e.g. must run after midnight, must
only run if a previous job is successful etc.).
9.8 System Monitoring Plan - The system monitoring
plan is the list of system components that are going to
be monitored, along with the thresholds at which warnings
and errors are signalled.
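The System Monitoring Plan in 9.8 is essentially a table of components with warning and error thresholds. A minimal Python sketch follows; the component names, metrics and threshold values are invented for illustration, and it assumes that higher values are worse.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    component: str
    metric: str
    warning: float  # signal a warning at or above this value
    error: float    # signal an error at or above this value


def classify(threshold: Threshold, value: float) -> str:
    """Return the status that a measured value should trigger."""
    if value >= threshold.error:
        return "ERROR"
    if value >= threshold.warning:
        return "WARNING"
    return "OK"


# Hypothetical monitoring plan entries.
PLAN = [
    Threshold("staging_db", "disk_used_pct", warning=80.0, error=95.0),
    Threshold("nightly_etl", "runtime_minutes", warning=120.0, error=180.0),
]

print(classify(PLAN[0], 87.5))
```

Keeping the thresholds as data rather than burying them in scripts makes the plan easy to review alongside the Operations Guide.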
10 PROJECT MANAGEMENT

Definition: The earlier categories describe documents
required for individual phases of the project; this
category covers documents for managing the project.
10.1 Documentation Roadmap – It is the document
that describes all the documents that should be
produced for each of the phases of a project.
10.2 Project Plan – It is the list of tasks and
activities with timescales, resources and
dependencies that must be performed to deliver the
solution.
10.3 ‘DRIVE’ Statements – It is a short, one-page
template that helps a project manager assess
whether a project or work package should be
undertaken.
10.4 ‘SWOT’ Analysis - is often used in data
warehouse projects as a way of comparing different
approaches to a problem.
10.5 ‘MoSCoW’ Analysis - is a method of prioritizing a
list of requirements or features of the system by
breaking the list down into Must have, Should have,
Could have and Won't have items.
10.6 Change Requests (CR) - is a critical component
of any project and is vital to data warehouse projects.
10.7 Risk Register - is a list of events that may happen.
If the event occurs then it will have some negative
impact on the project in terms of cost, resource or time.
10.8 Issue Log (BUG) - is the active management of
issues that have arisen.
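The MoSCoW breakdown in 10.5 can be sketched as a simple grouping step. The requirement names below are made up, and the function name `moscow` is an assumption for the example.

```python
# The four MoSCoW priority levels, highest first.
PRIORITIES = ("Must", "Should", "Could", "Won't")


def moscow(requirements):
    """Group (name, priority) pairs by priority, preserving input order."""
    groups = {priority: [] for priority in PRIORITIES}
    for name, priority in requirements:
        if priority not in groups:
            raise ValueError(f"Unknown priority {priority!r} for {name!r}")
        groups[priority].append(name)
    return groups


# Hypothetical requirements list.
backlog = [
    ("Daily sales load", "Must"),
    ("Self-service dashboards", "Should"),
    ("Mobile reports", "Could"),
    ("Real-time streaming", "Won't"),
    ("Customer dimension", "Must"),
]

for priority, names in moscow(backlog).items():
    print(priority, names)
```

Rejecting unknown priorities keeps the list honest: every requirement has to be placed in exactly one of the four groups.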
10.9 Key Design Decisions (KDD) - The key design
decision is a template to record significant design
decisions. It records the issue, the chosen option,
any rejected options and rationale behind the
decision.
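The four fields that a Key Design Decision records map naturally onto a small record type. The following Python sketch is one possible shape; the `ref` field, the `summary` format and the sample decision are assumptions for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class KeyDesignDecision:
    ref: str                 # hypothetical reference number, e.g. "KDD-001"
    issue: str               # the question that needed a decision
    chosen_option: str       # what was decided
    rationale: str           # why it was decided
    rejected_options: list[str] = field(default_factory=list)

    def summary(self) -> str:
        """One-line summary suitable for an index of decisions."""
        rejected = ", ".join(self.rejected_options) or "none"
        return (
            f"{self.ref}: {self.issue} -> {self.chosen_option} "
            f"(rejected: {rejected}); because {self.rationale}"
        )


kdd = KeyDesignDecision(
    ref="KDD-001",
    issue="Surrogate key type",
    chosen_option="integer sequence",
    rationale="compact joins",
    rejected_options=["UUID"],
)
print(kdd.summary())
```

Recording the rejected options alongside the chosen one is what makes the log useful later, when someone asks why an alternative was not taken.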
11 MISCELLANEOUS

Definition: The final category of this document
describes some general-purpose documents that a
project will find useful.
11.1 General Purpose Document - A standard look
and feel document with the required categories for
any project document required.
11.2 General Purpose Presentation – It is a
presentation with a standard look and feel.
11.3 Meeting Agenda - A standard agenda template
for meetings.
11.4 Memo – It provides a standard memo format for
anyone who is recording formal aspects of the
project outside the documentation roadmap.
SUMMARY

Data Management & Warehousing has identified
three aspects to essential documentation:
• A roadmap that describes what documentation
is required and how it fits together.
• Team members within the project who use the
templates, create quality documents and store
them in the project repositories.
• Easy access to the documentation for people
outside the project team, including publication
or notification of changes, updates and new
releases.
THANK YOU FOR
LISTENING
