Expert Cube Development with SSAS Multidimensional Models
By Marco Russo, Alberto Ferrari and Chris Webb
About this ebook
If you are an Analysis Services cube designer wishing to learn more advanced topics and best practices for cube design, this book is for you. You are expected to have some prior experience with Analysis Services cube development.
Table of Contents
Expert Cube Development with SSAS Multidimensional Models
Credits
About the Authors
About the Reviewers
www.PacktPub.com
Support files, eBooks, discount offers and more
Why subscribe?
Free access for Packt account holders
Instant updates on new Packt books
Preface
What this book covers
What you need for this book
Who this book is for
Conventions
Reader feedback
Customer support
Downloading the example code and database for the book
Downloading the color images of this book
Errata
Piracy
Questions
1. Designing the Data Warehouse for Analysis Services
The source database
The OLTP database
The data warehouse
The data mart
Data modeling for Analysis Services
Fact tables and dimension tables
Star schemas and snowflake schemas
Junk dimensions
Degenerate dimensions
Slowly Changing Dimensions
Bridge tables or factless fact tables
Snapshot and transaction fact tables
Updating fact and dimension tables
Natural and surrogate keys
Unknown Members, key errors, and NULLability
Physical database design for Analysis Services
Multiple data sources
Data types and Analysis Services
SQL queries generated during cube processing
Dimension processing
Dimensions with joined tables
Reference dimensions
Fact dimensions
Distinct Count measures
Indexes in the data mart
Dimension tables
Fact tables
Usage of schemas
Naming conventions
Views versus the Data Source View
Summary
2. Building Basic Dimensions and Cubes
Multidimensional and Tabular models
Choosing an edition of Analysis Services
Setting up a new Analysis Services project
Creating data sources
Creating Data Source Views
Designing simple dimensions
Using the New Dimension wizard
Using the Dimension Editor
Adding new attributes
Configuring a Time dimension
Creating user hierarchies
Configuring attribute relationships
Building a simple cube
Using the New Cube wizard
Project deployment
Database processing
Summary
3. Designing More Complex Dimensions
Grouping and banding
Grouping
Banding
Modeling Slowly Changing Dimensions
Type I SCDs
Type II SCDs
Modeling attribute relationships on a Type II SCD
Handling member status
Type III SCDs
Modeling junk dimensions
Modeling ragged hierarchies
Modeling parent/child hierarchies
Ragged hierarchies with HideMemberIf
Summary
4. Measures and Measure Groups
Measures and aggregation
Useful properties of measures
FormatString
DisplayFolders
Built-in measure aggregation types
Basic aggregation types
DistinctCount
None
Semi-additive aggregation types
ByAccount
Dimension calculations
Unary operators and weights
Custom Member Formulas
Non-aggregatable values
Measure groups
Creating multiple measure groups
Creating measure groups from dimension tables
MDX formulas versus pre-calculating values
Handling different dimensionality
Handling different granularities
Non-aggregatable measures – a different approach
Using linked dimensions and measure groups
Role-playing dimensions
Dimension/measure group relationships
Fact relationships
Referenced relationships
Data mining relationships
Summary
5. Handling Transactional-Level Data
Details about transactional data
Drillthrough
Actions
Drillthrough actions
Drillthrough columns order
Drillthrough and calculated members
Drillthrough modeling
Drillthrough using a transaction detail dimension
Drillthrough with ROLAP dimensions
Drillthrough on alternate fact table
Drillthrough recap
Many-to-many dimension relationships
Implementing a many-to-many dimension relationship
Advanced modeling with many-to-many relationships
Performance issues
Summary
6. Adding Calculations to the Cube
Different kinds of calculated members
Common calculations
Simple calculations
Referencing cell values
Aggregating members
Year-to-date calculations
Ratios over a hierarchy
Previous period growths
Same period previous year
Moving averages
Ranks
Formatting calculated measures
Calculation dimensions
Implementing a simple calculation dimension
The Time Intelligence wizard
Attribute overwrite
Limitations of calculated members
Calculation dimension best practices
Named sets
Regular named sets
Dynamic named sets
Summary
7. Adding Currency Conversion
Introduction to currency conversion
Data collected in a single currency
Data collected in multiple currencies
Where to perform currency conversion
The Add Business Intelligence wizard
Concepts and prerequisites
How to use the Add Business Intelligence wizard
Data collected in a single currency with reporting in multiple currencies
Data collected in multiple currencies with reporting in a single currency
Data stored in multiple currencies with reporting in multiple currencies
Measure expressions
DirectSlice property
Writeback
Summary
8. Query Performance Tuning
Understanding how Analysis Services processes queries
Performance tuning methodology
Designing for performance
Performance-specific design features
Partitions
Why partition?
Building partitions
Planning a partitioning strategy
Unexpected Partition scans
Aggregations
Creating an initial aggregation design
Usage-Based Optimization
Monitoring partition and aggregation usage
Building aggregations manually
Common aggregation design issues
MDX calculation performance
Diagnosing Formula Engine performance problems
Calculation performance tuning
Tuning algorithms used in MDX
Using Named Sets to avoid recalculating Set Expressions
Using calculated members to cache numeric values
Tuning the implementation of MDX
Caching
Formula cache scopes
Other scenarios that restrict caching
Cache warming
The CREATE CACHE statement
Running batches of queries
Scale-up and Scale-out
Summary
9. Securing the Cube
Sample security requirements
Analysis Services security features
Roles and role membership
Securable objects
Creating roles
Membership of multiple roles
Testing roles
Administrative security
Data security
Granting Read Access to Cubes
Cell security
Dimension security
Visual Totals
Restricting access to Dimension Members
Applying security to Measures
Dynamic security
Dynamic dimension security
Dynamic security with stored procedures
Dimension security and parent/child hierarchies
Dynamic cell security
Accessing Analysis Services from outside a domain
Managing security
Security and query performance
Cell security
Dimension security
Dynamic security
Summary
10. Going in Production
Making changes to a cube in production
Managing partitions
Relational versus Analysis Services partitioning
Building a template partition
Generating partitions in Integration Services
Managing processing
Dimension processing
Partition processing
Lazy Aggregations
Processing reference dimensions
Handling processing errors
Managing processing with Integration Services
Push-mode processing
Proactive caching
SSAS Data Directory maintenance
Performing database backup
Copying databases between servers
Summary
11. Monitoring Cube Performance and Usage
Analysis Services and the operating system
Resources shared by the operating system
CPU
Memory
I/O Operations
Tools to monitor resource consumption
Windows Task Manager
Performance counters
Resource Monitor
Analysis Services memory management
Memory differences between 32 bit and 64 bit
Controlling the Analysis Services Memory Manager
Out of memory conditions in Analysis Services
Sharing SQL Server and Analysis Services on the same machine
Monitoring processing performance
Monitoring processing with trace data
SQL Server Profiler
ASTrace
XMLA
Flight Recorder
Monitoring processing with Performance Monitor counters
Monitoring processing with Dynamic Management Views
Monitoring query performance
Monitoring queries with trace data
Monitoring queries with Performance Monitor counters
Monitoring queries with Dynamic Management Views
Monitoring usage
Monitoring usage with trace data
Monitoring usage with Performance Monitor counters
Monitoring usage with Dynamic Management Views
Activity Viewer
Building a complete monitoring solution
Summary
A. DAX Query Support
Implementation details
Mapping Multidimensional objects to Tabular concepts
Unsupported features
New functionality in Analysis Services
Connecting Power View to a Multidimensional model
Running DAX queries against a Multidimensional model
Executing DAX queries
DAX queries and attributes
Index
Expert Cube Development with SSAS Multidimensional Models
Copyright © 2014 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the authors, nor Packt Publishing, nor its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
First published: July 2009
Second Edition: February 2014
Production Reference: 1170214
Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham B3 2PB, UK.
ISBN 978-1-84968-990-8
www.packtpub.com
Cover Image by Faiz Fattohi (<faizfattohi@gmail.com>)
Credits
Authors
Chris Webb
Alberto Ferrari
Marco Russo
Reviewers
Sean O'Regan
James Serra
Dr. John Tunnicliffe
Jen Underwood
Roger D Williams Jr
Acquisition Editors
James Lumsden
Kartikey Pandey
Joanne Fitzpatrick
Content Development Editor
Chalini Snega Victor
Technical Editors
Manan Badani
Shashank Desai
Shali Sasidharan
Project Coordinator
Kranti Berde
Proofreaders
Simran Bhogal
Maria Gould
Ameesha Green
Paul Hindle
Indexers
Monica Ajmera Mehta
Rekha Nair
Tejal Soni
Graphics
Ronak Dhruv
Valentina Dsilva
Abhinash Sahu
Production Coordinator
Arvindkumar Gupta
Cover Work
Arvindkumar Gupta
About the Authors
Chris Webb (<chris@crossjoin.co.uk>) has been working with Microsoft Business Intelligence tools for 15 years in a variety of roles and industries. He is an independent consultant (www.crossjoin.co.uk) and trainer (www.technitrain.com) based in the UK, specializing in SQL Server Analysis Services, MDX, DAX, Power Pivot, and the whole Power BI stack. He is the co-author of MDX Solutions with Microsoft SQL Server Analysis Services 2005 and SQL Server Analysis Services 2012: The BISM Tabular Model. He is a regular speaker at user groups and conferences, and blogs about Microsoft BI at http://cwebbbi.wordpress.com/.
First and foremost, I'd like to thank my wife Helen and my two daughters Natasha and Mimi for their patience. I'd also like to thank everyone who helped me and answered my questions when I was writing both the first and second editions: Deepak Puri, Darren Gosbell, David Elliott, Mark Garner, Edward Melomed, Gary Floyd, Greg Galloway, Mosha Pasumansky, Ashvini Sharma, Akshai Mirchandani, Marius Dumitru, Sacha Tomey, Teo Lachev, Thomas Ivarsson, Vidas Matelis, Jen Underwood, Paul te Braak, Dan English, Paul Turley, Sean O'Regan, James Serra, Stacia Misner, and Kasper de Jonge.
Alberto Ferrari (<alberto.ferrari@sqlbi.com>) is a consultant and trainer who specializes in Business Intelligence with the Microsoft BI stack. He spends half of his time consulting for companies that need to develop complex data warehouses, and the other half in training, book writing, conferences, and meetings. He is a SQL Server MVP and an SSAS Maestro.
He is a founder, with Marco Russo, of www.sqlbi.com, where they publish whitepapers and articles about SQL Server Analysis Services technology. He co-authored several books on SSAS and PowerPivot.
My biggest thanks go to my family: Caterina, Lorenzo, and Arianna. They had the patience to support me during the hard times of book writing, and they give me the smiles I need to fill my life with happiness.
Marco Russo is a Business Intelligence consultant and mentor. His main activities are related to data warehouse relational and multidimensional design, but he is also involved in the complete development lifecycle of a BI solution. He has particular competence and experience in sectors such as financial services (including complex OLAP designs in the banking area), manufacturing, gambling, and commercial distribution. Marco is also a book author, and in addition to his BI-related publications, he has authored books about .NET programming. He is also a speaker at international conferences such as TechEd, PASS Summit, SQLRally, and SQLBits. He achieved the unique SSAS Maestro certification and is also a Microsoft Certified Trainer with several Microsoft Certified Professional certifications.
About the Reviewers
Sean O'Regan is a senior Business Intelligence consultant based in London, UK. He has been working with SQL Server since version 6.5, building a data warehouse for a large software reseller. In 1997, he co-founded a software house that grew from a two man outfit to a Microsoft Gold Partner. In 2009, he returned to his roots as an independent consultant working through his own company, BI5 Limited (www.bi5.co.uk). Since then, he has been privileged to work with some great people on several high profile BI projects for some of the largest organizations in the world.
When he is not working, playing tennis, or spending time with his family, you will often find him at SQL user group events in London or at SQLBits conferences in the UK.
James Serra is an independent consultant with the title of Data Warehouse/Business Intelligence Architect. He is a Microsoft SQL Server MVP with over 25 years of IT experience. He started his career as a software developer, then was a DBA for 12 years, and for the last seven years he has been working extensively with Business Intelligence using the SQL Server BI stack. He has been at times a permanent employee, consultant, contractor, and owner of his own business. All these experiences along with continuous learning have helped him to develop many successful data warehouse and BI projects. He is a noted blogger and speaker, having presented at the PASS Summit and the PASS Business Analytics Conference.
He has earned the MCSE: SQL Server 2012 Business Intelligence, MCSE: SQL Server 2012 Data Platform, MCITP: SQL Server 2008 Business Intelligence Developer, MCITP: SQL Server 2008 Database Administrator, and MCITP: SQL Server 2008 Database, and has a Bachelor of Science degree in Computer Engineering from UNLV.
James resides in Houston, Texas, with his wife Mary and three children: Lauren, RaeAnn, and James.
This book is dedicated to my wonderful wife Mary and my children Lauren, RaeAnn, and James, and my parents, Jim and Lorraine. Their love, understanding, and support is what made this book possible. Now if they only understood the content.
Dr. John Tunnicliffe is an independent consultant with extensive experience in implementing Business Intelligence solutions based on the Microsoft platform across a range of industries including investment banking, healthcare, legal, and professional services. With a particular focus on SSAS, John specializes in the design and development of Business Intelligence, data warehouse, and reporting systems based on OLAP technologies delivered to the end user via SharePoint.
With the ability to rapidly design Proof of Concept solutions, John has a proven track record of taking these prototypes and delivering them into a full production environment. Indeed, in the past, John has been the leader of two Innovation Award winning projects.
As a regular speaker at the SQLBits conference, John has covered topics such as building a dynamic OLAP environment and building an infrastructure to support real-time OLAP. Recordings of his presentations are available at http://www.sqlbits.com/Speakers/Dr_John_Tunnicliffe.
John is currently working at BNP Paribas, delivering a data warehouse to the Fixed Income business delivered to the end user via OLAP cubes and Tableau dashboards. In the past John has worked for Barclays Capital, IMS Health, Commerzbank, Bank of America, Capco, Catlin Underwriting, HBOS, Virgin Travel, and Linklaters.
Jen Underwood has almost 20 years of hands-on experience in the data warehousing, Business Intelligence, reporting, and predictive analytics industry. Prior to starting Impact Analytix, she held roles as a Microsoft Global Business Intelligence Technical Product Manager, Microsoft Enterprise Data Platform Specialist, Tableau Technology Evangelist, and also as a Business Intelligence Consultant for Big 4 Systems Integration firms. Throughout most of her career, she has been researching, designing, and implementing analytic solutions across a variety of open source, niche, and enterprise vendor landscapes including Microsoft, Oracle, IBM, and SAP.
Recently, Jen was honored with a Boulder BI Brain Trust membership, a BeyeNetwork Prescriptive Analytics Channel, and a 2013 Tableau Zen Master (MVP) award. She also writes Business Intelligence articles for the SQL Server Pro magazine.
Jen holds a Bachelor of Business Administration degree from the University of Wisconsin, Milwaukee and a postgraduate certificate in Computer Science, Data Mining from the University of California, San Diego.
Roger D Williams Jr has over 20 years of experience in Information Technology, specifically related to data warehousing, data architecture, data conversion/ETL, database administration, and client/server development.
His consulting work has spanned many industries, such as manufacturing (General Electric, Westvaco), financial services (Bank of America, Wachovia, and LendingTree), insurance (The Guardian), pharmaceutical/healthcare (Lilly and MedCath), and retail (Starbucks, Yum Foods, and Potbelly).
He is a Microsoft Certified Database Administrator (MCDBA) and Oracle Certified Professional (OCP). He also has extensive experience with a variety of database platforms from Access and FoxPro to enterprise RDBMS, such as SQL Server, Oracle, Sybase, Teradata, DB2, Informix, and MySQL.
www.PacktPub.com
Support files, eBooks, discount offers and more
You might want to visit www.PacktPub.com for support files and downloads related to your book.
Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at
At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters and receive exclusive discounts and offers on Packt books and eBooks.
http://PacktLib.PacktPub.com
Do you need instant solutions to your IT questions? PacktLib is Packt's online digital book library. Here, you can access, read and search across Packt's entire library of books.
Why subscribe?
Fully searchable across every book published by Packt
Copy and paste, print and bookmark content
On demand and accessible via web browser
Free access for Packt account holders
If you have an account with Packt at www.PacktPub.com, you can use this to access PacktLib today and view nine entirely free books. Simply use your login credentials for immediate access.
Instant updates on new Packt books
Get notified! Find out when new books are published by following @PacktEnterprise on Twitter, or the Packt Enterprise Facebook page.
Preface
Microsoft SQL Server Analysis Services (Analysis Services from here on) is now 15 years old, and a mature product proven in thousands of enterprise-level deployments around the world. Starting from a point where few people knew that Analysis Services existed and those who knew about it were often suspicious of it, it has grown to be the most widely deployed OLAP server and one of the keystones of Microsoft's Business Intelligence product strategy. Part of the reason for its success has been the easy availability of information about it. Apart from the documentation Microsoft provides, there are white papers, blogs, online forums, and books galore on the subject. So, why should we write yet another book on Analysis Services? The short answer is to bring together all of the practical, real-world knowledge about Analysis Services that's out there into one place.
We, the authors of this book, are consultants who have spent the last few years of our professional lives designing and building solutions based on the Microsoft Business Intelligence platform, and helping other people to do so. We've watched Analysis Services grow to maturity and seen more and more people move from being hesitant beginners on their first project to confident cube designers; yet we felt that there were no books on the market aimed at this emerging group of intermediate-to-experienced users. Similarly, all of the Analysis Services books we read concerned themselves with describing its functionality and what you could potentially do with it, but none of them addressed the practical problems that we encountered day-to-day in our work: how you should go about designing cubes, what the best practices for doing so are, which areas of functionality work well and which don't, and so on. We wanted to write this book to fill these two gaps, and to share our hard-won experience. Most technical books are published to coincide with the release of a new version of a product, and so are written using beta software, before the authors have had a chance to use the new version in a real project. This book, on the other hand, has been written with the benefit of having used Analysis Services for many years.
A very important point to make is that this book only covers Analysis Services 2012 Multidimensional Models. As you may know, as of SQL Server 2012, there are two versions of Analysis Services: Multidimensional and Tabular. Although both of them are called Analysis Services and can be used for much the same purposes, the development experience for the two is completely different. If you are working on a Microsoft Business Intelligence project, it is very likely that you will be using either one or the other, not both, so it makes no sense to try and cover both Multidimensional and Tabular in the same book.
The approach we've taken with this book is to follow the lifecycle of building an Analysis Services solution from start to finish. As we've said already, this does not take the form of a basic tutorial; it is more of a guided tour through the process with an informed commentary telling you what to do, what not to do, and what to look out for.
Another important point must be made before we continue: in this book we're going to be expressing some strong opinions. We're going to tell you how we like to design cubes based on what we've found to work for us over the years, and you may not agree with some of the things that we say. We're not going to pretend that all of the advice that differs from our own is necessarily wrong, though: best practices are often subjective, and one of the advantages of a book with multiple authors is that you not only get the benefit of more than one person's experience, but also that each author's opinions have already been moderated by his or her co-authors. Think of this book as a written version of the kind of discussion you might have with someone at a user group meeting or a conference, where you pick up hints and tips from your peers. Some of the information may not be relevant to what you do, some of it you may dismiss, but even if only 10 percent of what you learn is new, it might be the crucial piece of knowledge that makes the difference between the success and failure of your project.
Analysis Services is very easy to use—some would say too easy. It's possible to get something up and running very quickly, and as a result it's an all too common occurrence that a cube gets put into production and subsequently shows itself to have problems that can't be fixed without a complete redesign. We hope that this book helps you to avoid having an "If only I'd known about this earlier!" moment yourself, by passing on the knowledge that we've learned the hard way. We also hope that you will enjoy reading it and that you're successful in whatever you're trying to achieve with Analysis Services.
What this book covers
Chapter 1, Designing the Data Warehouse for Analysis Services, shows you how to design a relational data mart to act as a source for Analysis Services.
Chapter 2, Building Basic Dimensions and Cubes, covers setting up a new project in SQL Server Data Tools and building simple dimensions and cubes.
Chapter 3, Designing More Complex Dimensions, discusses more complex dimension design problems such as slowly changing dimensions and ragged hierarchies.
Chapter 4, Measures and Measure Groups, looks at measures and measure groups, how to control the way measures aggregate up, and how dimensions can be related to measure groups.
Chapter 5, Handling Transactional-Level Data, looks at issues such as drillthrough, fact dimensions, and many-to-many relationships.
Chapter 6, Adding Calculations to the Cube, shows how to add calculations to a cube, and gives some examples of how to implement common calculations in MDX.
Chapter 7, Adding Currency Conversion, deals with the various ways in which we can implement currency conversion in a cube.
Chapter 8, Query Performance Tuning, covers query performance tuning, including how to design aggregations and partitions and how to write efficient MDX.
Chapter 9, Securing the Cube, looks at the various ways in which we can implement security, including cell security and dimension security, as well as dynamic security.
Chapter 10, Going in Production, looks at some of the common issues that we'll face when a cube is in production, including how to deploy changes and how to automate partition management and processing.
Chapter 11, Monitoring Cube Performance and Usage, discusses how we can monitor query performance, processing performance, and usage once the cube has gone into production.
Appendix, DAX Query Support, discusses support for DAX queries and Power View against Multidimensional Models.
What you need for this book
To follow the examples used in this book, we recommend that you have a PC with the following software installed on it:
Microsoft Windows Vista SP2 or greater for desktops
Microsoft Windows Server 2008 SP2 or greater for servers
Microsoft SQL Server Analysis Services 2012 Multidimensional
Microsoft SQL Server 2012 (the relational engine)
Microsoft Visual Studio 2012 and SQL Server Data Tools
SQL Server Management Studio
Excel 2007, 2010, or 2013 (optional), as an alternative method of querying the cube
We recommend that you use SQL Server Developer Edition to follow the examples used in this book. We'll discuss the differences between Developer Edition, Standard Edition, BI Edition, and Enterprise Edition in Chapter 2, Building Basic Dimensions and Cubes; some of the functionalities that we'll cover are not available in the Standard Edition and we'll mention that fact whenever it's relevant.
Who this book is for
This book is aimed at Business Intelligence consultants and developers who work with Analysis Services Multidimensional Models on a daily basis, who already know the basics of building a cube, and who want to gain a deeper practical knowledge of the product and perhaps check that they aren't doing anything badly wrong at the moment.
It's not a book for absolute beginners and we're going to assume that you understand basic Analysis Services concepts, such as what cubes and dimensions are, and that you're not interested in reading yet another walkthrough of the various wizards in SQL Server Data Tools. Equally, it's not an advanced book and we're not going to try to dazzle you with our knowledge of obscure properties or complex data modeling scenarios that you're never likely to encounter. We're not going to cover all of the functionality available in Analysis Services either, and in the case of MDX, where a full treatment of the subject requires a book of its own, we're going to give some examples of code that you can copy and adapt yourselves, but not try to explain how the language works.
Conventions
In this book, you will find a number of styles of text that distinguish between different kinds of information. Here are some examples of these styles, and an explanation of their meaning.
Code words in text are shown as follows: DimGeography is not used to create a new dimension, but is being used to add geographic attributes to the Customer dimension.
A block of code will be set as follows:
CASE
    WHEN Weight IS NULL OR Weight < 0 THEN 'N/A'
    WHEN Weight < 10 THEN '0-10Kg'
    WHEN Weight < 20 THEN '10-20Kg'
    ELSE '20Kg or more'
END
New terms and important words are shown in bold. Words that you see on the screen, in menus or dialog boxes for example, appear in our text like this: You can set project properties by right-clicking on the Project node in the Solution Explorer pane and selecting Properties.
Note
Warnings or important notes appear in a box like this.
Tip
Tips and tricks appear like this.
Reader feedback
Feedback from our readers is always welcome. Let us know what you think about this book—what you liked or may have disliked. Reader feedback is important for us to develop titles that you really get the most out of.
To send us general feedback, simply drop an email to <feedback@packtpub.com>, and mention the book title in the subject of your message. If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, see our author guide on www.packtpub.com/authors.
Customer support
Now that you are the proud owner of a Packt book, we have a number of things to help you to get the most from your purchase.
Downloading the example code and database for the book
You can download the example code files for all Packt books you have purchased from your account at http://www.PacktPub.com. If you purchased this book elsewhere, you can visit http://www.PacktPub.com/support and register to have the files e-mailed directly to you.
The downloadable files contain instructions on how to use them.
All of the examples in this book use a sample database based on the Adventure Works sample that Microsoft provides, which can be downloaded from http://tinyurl.com/SQLServerSamples. We use the same relational source data to start with, but then make changes as and when required for building our cubes, and although the cube we build as the book progresses resembles the official Adventure Works cube, it differs in several important respects. So, we encourage you to download and install the sample database used in this book.
Downloading the color images of this book
We also provide you a PDF file that has color images of the screenshots/diagrams used in this book. The color images will help you better understand the changes in the output. You can download this file from: https://www.packtpub.com/sites/default/files/downloads/9908EN_ColoredImages.pdf
Errata
Although we have taken every care to ensure the accuracy of our contents, mistakes do happen. If you find a mistake in one of our books—maybe a mistake in text or code—we would be grateful if you would report this to us. By doing so, you can save other readers from frustration, and help us to improve subsequent versions of this book. If you find any errata, please report them by visiting http://www.packtpub.com/support, selecting your book, clicking on the let us know link, and entering the details of your errata. Once your errata are verified, your submission will be accepted and the errata added to any list of existing errata. Any existing errata can be viewed by selecting your title from http://www.packtpub.com/support.
Piracy
Piracy of copyright material on the Internet is an ongoing problem across all media. At Packt, we take the protection of our copyright and licenses very seriously. If you come across any illegal copies of our works in any form on the Internet, please provide us with the location address or website name immediately so that we can pursue a remedy.
Please contact us at <copyright@packtpub.com> with a link to the suspected pirated material.
We appreciate your help in protecting our authors, and our ability to bring you valuable content.
Questions
You can contact us at <questions@packtpub.com> if you are having a problem with any aspect of the book, and we will do our best to address it.
Chapter 1. Designing the Data Warehouse for Analysis Services
The focus of this chapter is how to design a data warehouse specifically for Analysis Services. There are numerous books available that explain the theory of dimensional modeling and data warehouses; our goal here is not to discuss generic data warehousing concepts, but to help you adapt the theory to the needs of Analysis Services.
In this chapter, we will touch on just about every aspect of data warehouse design, and mention several subjects that cannot be analyzed in depth in a single chapter. We will cover some of these subjects, such as Analysis Services cube and dimension design, in detail in later chapters. Other aspects, which are outside the scope of this book, will require further research on the part of the reader.
The source database
Analysis Services cubes are built on top of a database, but the real question is: what kind of database should this be?
We try to answer this question by analyzing the different kinds of databases we encounter in our search for the best source for our cube. In the process of doing so, we describe the basics of dimensional modeling, as well as some of the competing theories on how data warehouses should be designed.
The OLTP database
Typically, you create a BI solution when business users want to analyze, explore, and report on their data in an easy and convenient way. The data itself may be composed of thousands, millions, or even billions of rows, normally kept in a relational database built to serve a specific business purpose. We refer to this database as the Online Transaction Processing (OLTP) database.
The OLTP database can be a legacy mainframe system, a CRM system, an ERP system, a general ledger system, or any kind of database that a company uses in order to manage their business.
Sometimes, the source OLTP may consist of simple flat files generated by processes running on a host. In such a case, the OLTP is not a real database, but we can still turn it into one by importing the flat files into a SQL Server database, for example. Therefore, regardless of the specific media used to store the OLTP, we will refer to it as a database.
Some of the most important and common characteristics of an OLTP system are:
The OLTP system is normally a complex piece of software that handles information and transactions. From our point of view, though, we can think of it simply as a database.
We do not normally communicate in any way with the application that manages and populates the data in the OLTP. Our job is that of exporting data from the OLTP, cleaning it, integrating it with data from other sources, and loading it into the data warehouse.
We cannot make any assumptions about the OLTP database's structure, such as the schema remaining stable over time.
Somebody else has built the OLTP system and is probably currently maintaining it, so its structure may change over time. We do not usually have the option of changing anything in its structure anyway, so we have to take the OLTP system as is, even if we believe that it could be made better.
The OLTP may well contain data that does not conform to the general rules of relational data modeling, such as foreign keys and constraints.
The OLTP system normally contains historical data that is not correct. This is almost always the case: a system that has been running for years very often contains incorrect data that will never be corrected.
When building a BI solution, we have to clean and fix this data, but it would normally be too expensive and disruptive to do this for old data in the OLTP system itself.
In our experience, the OLTP system is very often poorly documented. Our first task is, therefore, that of creating good documentation for the system, validating data, and checking it for any inconsistencies.
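As a simple illustration of the kind of validation query we might run while checking OLTP data for inconsistencies, the following SQL finds "orphaned" rows that reference a missing related row; all table and column names here are hypothetical, since every OLTP system will differ:

```sql
-- Hypothetical OLTP tables: Orders references Customers(CustomerID),
-- but the OLTP database may not enforce this with a foreign key
-- constraint, so orphaned rows can exist.
SELECT o.OrderID, o.CustomerID
FROM Orders AS o
LEFT JOIN Customers AS c
    ON c.CustomerID = o.CustomerID
WHERE c.CustomerID IS NULL;
```

Queries like this, run as part of the documentation and validation effort, tell us which cleaning steps our ETL will need to perform before the data reaches the data warehouse.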
The OLTP database is not built to be easily queried for analytical workloads, and is certainly not going to be designed with Analysis Services cubes in mind. Nevertheless, a very common question is: do we really need to build a dimensionally modeled data mart as the source for an Analysis Services cube? The answer is a definite yes!
As we'll see, the structure of a data mart is very different from the structure of an OLTP database and Analysis Services is built to work on data marts, not on generic OLTP databases. The changes that need to be made when moving data from the OLTP database to the final data mart structure should be carried out by specialized ETL software, such as SQL Server Integration Services, and cannot simply be handled by Analysis Services in the Data Source View.
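To give a flavor of the kind of restructuring an ETL process performs, here is a sketch of one step: flattening a normalized customer/geography hierarchy from the OLTP database into a single denormalized dimension table suitable for Analysis Services. All object names are hypothetical, and a real ETL package would also handle cleaning, key management, and incremental loads:

```sql
-- Hypothetical ETL step: denormalize OLTP customer and geography
-- tables into one wide dimension table in the data mart.
INSERT INTO DimCustomer (CustomerKey, CustomerName, City, StateProvince, Country)
SELECT c.CustomerID, c.CustomerName, g.City, g.StateProvince, g.Country
FROM Oltp.Customers AS c
JOIN Oltp.Geography AS g
    ON g.GeographyID = c.GeographyID;
```

This is exactly the sort of transformation that belongs in SQL Server Integration Services rather than in the Data Source View.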
Moreover, the OLTP database needs to be efficient for OLTP queries. OLTP queries tend to be very fast on small chunks of data, in order to manage everyday work. If we run complex queries ranging over the whole OLTP database, as BI-style queries often do, we will create severe performance problems for the OLTP database. There are very rare situations in which