SQL Server Insider
TIPS FOR THE SQL SERVER PRO, VOL. 1: BUSINESS INTELLIGENCE
k EDITOR'S NOTE
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
ONE TOPIC THAT'S sure to be hot in 2010 is business intelligence. Microsoft SQL Server 2008 R2 has been described as a BI release by some, and it's easy to see why, as many of the system's key technological advances center around data reporting and analysis. One BI sensation that's getting a lot of attention is the concept of self-service business intelligence, which is really the crux of Microsoft's long-developed BI technology. In our feature article, "Is Self-Service BI the Answer?" SQL Server MVP Eric Johnson breaks down exactly what goes into self-service BI, what it promises and what it actually delivers.

Another BI improvement from Microsoft involves how organizations deal with data quality issues. Enter Master Data Services (MDS), a new addition to the company's fully packed BI feature set. While not a perfect product, the cost benefits of MDS could be a key motivator for larger companies interested in upgrading to SQL Server 2008 R2 in the coming year.

Of course, SQL Server Insider also cares about what database folks are doing right now. Keep reading to see what Denny Cherry believes are the top five SQL Server DBA time-wasting tasks. How does your organization plan to take advantage of business intelligence? Send me an email and let me know.

BRENDAN COURNOYER
k BUSINESS INTELLIGENCE
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Is Self-Service BI the Answer? BY ERIC JOHNSON
THE TERM business intelligence has become a part of the technology and business lexicon, but what does it really mean? Many people think that the term refers to data warehouses, and they would be correct. But there's so much more to BI. Business intelligence is the combination of technologies and processes used to gather, store, analyze and access data that supports decision making within a company. It includes decisions about data warehouses; reporting; data mining; extract, transform and load (ETL) processes; and forecasting. The data that's acquired can be invaluable in terms of what you can learn about your business and customers. Because BI encompasses such a broad spectrum of technologies, it requires highly skilled technologists to manage it. So how does self-service BI fit into that puzzle?
The goal of self-service BI is to empower end users so they can make decisions based on their own analyses, instead of forcing them to use only the data and reports available from a larger BI system. Self-service BI gives end users the tools they need to access and analyze data contained in a larger BI solution. That way, they can design and customize their own reports and create their own views of the data. In fast-paced businesses, the ability to gather and analyze data quickly can be an advantage over your competition. Additionally, being able to see that a product or service would prove to be a losing venture in the coming months allows you to shut down a product line to save millions of dollars. It stands to reason that the more people you have looking at
data, the faster you will be able to come to some useful conclusions. In theory, self-service BI is great, but is it practical? The concept of self-service BI has taken hold, and many IT managers and business analysts are jumping at the theoretical benefits. Before you jump in, though, you need to understand what self-service BI can actually buy you and what you should consider before implementing it.
The biggest benefit of self-service BI is that it empowers an individual end user to do his own analysis without needing the help of an IT staff member. Users won't be hindered by the IT department; the IT department won't have to spend a lot of time creating new reports or making changes to the existing BI system. The IT department may not fully understand how the data should look when an end user requests it or what that user is trying to get out of the data, which can lead to frustrated users and unhappy IT professionals. Self-service BI can, in theory, put all of the decisions into the hands of the users so nothing is lost in translation. In a perfect world, self-service BI would give users more control and make an IT pro's job easier. But there is no perfect world.
Self-service BI has a dark side. While it does enable users to get the data when they want it and in the form that they prefer, self-service BI also allows users to tax the system in ways that the system may not be ready for or capable of handling. For example, suppose that an organization has 20,000 employees and about 100 business analysts in different lines of business. Let's say that 100 people will create their own reports and do their own data analyses. What does that mean for the volume of access against your data store and the volume of reports in your environment? Here's one scenario: Each analyst will have his own set of reports, which could cause a lot of duplication of effort. Giving analysts access to the data store could quickly overload the database. What if an analyst leaves the company? Will all his work be lost? How much effort would it take to transfer his work to another analyst? When executives want to make an informed business decision, they want all of the data within their reach and they want it to be correct. If you distribute this data among hundreds of analysts, it's difficult to ensure that you have a complete picture. In addition to this potential chaos, self-service BI can be expensive and time consuming to organize and implement.
Several software companies offer self-service BI products. Business Objects' Crystal Reports claims some level of self-service BI, and Microsoft's Project Gemini is built around the self-service BI concept. Deciding which self-service BI product, if any, is right for your organization requires some serious consideration before taking the step toward implementation.

First, you have to remember that self-service BI is not a complete BI solution. You still need to put in the time and effort to develop data warehouses, data marts, ETL processes, cubes, dimensions and all of the other moving parts in a true BI solution. Self-service BI is really a combination of parts that get bolted onto an existing BI solution in order to make it more user-friendly. On the Business Application Research Center's website, www.bi-verdict.com, Nigel Pendse called Microsoft's Project Gemini "a Trojan horse for Analysis Services." Project Gemini is really an add-on to Excel that allows users to analyze and report on the data within Microsoft SQL Server Analysis Services, Microsoft's BI platform. When you set out to develop a platform, be sure that you have the resources necessary to build a very large and complex system. Lack of resources and capital are the two biggest killers of any BI project. If you already have a BI solution in place and you're simply looking to add self-service BI functionality, you are one step ahead of the game.

The last thing to consider when implementing self-service BI is to use caution when providing users access to your BI system. Most likely, the data in your warehouse is sensitive; giving too many people carte blanche access can lead to excessive load. If your database becomes overloaded, it won't be of much use to anyone. Because data warehouses contain millions of rows of data with a great deal of analytics, one overly complex query can be a killer in terms of performance.
The self-service BI concept is still in its infancy, so it's difficult to predict how it will come to fruition in the enterprise. Used correctly, a self-service BI platform could prove invaluable. However, if planned and implemented improperly, projects can quickly deteriorate, wasting valuable time and resources. When it comes to self-service BI, tread lightly and do your homework. Take the time to figure out exactly what you want from it and how much you are willing to put into making a solution work.
k MIGRATION
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
While SQL Server 2008 R2 is rich with business intelligence, that alone may not be reason enough to upgrade. One feature that could convince companies still on the fence about upgrading is Master Data Services. BY BRENDAN COURNOYER
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
IF THERE IS one area that Microsoft focused on when it developed SQL Server 2008 R2, it was business intelligence. Ready for general availability in the first half of 2010, the company's latest database management system (DBMS) is packed with BI improvements like PowerPivot for Excel, formerly code-named Gemini.
R2's debut comes less than two years after the release of SQL Server 2008, which included its own share of BI features and enhancements to SQL Server Analysis Services, Integration Services and Reporting Services. The following rundown shows what other features will be available through the release of SQL Server 2008 R2:

T PowerPivot (formerly code-named Gemini): Managed self-service BI is possible with PowerPivot add-ins for Excel and SharePoint.
T Parallel Data Warehouse (formerly code-named Madison): A new edition of SQL Server 2008 R2 designed to improve the scalability of data warehouses over multiple servers.
T Reporting Services improvements: Enhancements to SQL Server Reporting Services (SSRS) include report parts, shared datasets and SharePoint integration.

Despite this influx of new technology, most organizations are still running SQL Server 2005 in production and are seemingly content to continue doing so. So what will it take to convince companies to upgrade? Likely, SQL Server 2008 R2 will have to deliver substantial business benefits for many companies to invest precious time and resources into a migration, something that minor tweaks and improvements don't always provide. With SQL Server 2008 R2, however, Microsoft has added one new feature to its BI platform that could prove attractive to those pondering an upgrade: Master Data Services.

Although it's new to SQL Server, MDS is a seasoned technology derived from a product by Stratature, a data management company that Microsoft acquired in 2007. Stratature's technology was considered a strong acquisition at the time since it was already built on SQL Server. It also filled a perceived infrastructure void for Microsoft in comparison to other enterprise vendors like IBM, Oracle and SAP.
The main purpose of Master Data Services, and master data management in general, is consistency. This is a particular challenge for larger enterprises that host multiple databases, each with slight variations of the same attributes. "When you want to pull all your data together and report on it, [organizations often find] that they have real data quality problems," said Kevin Kline, a technical strategy manager for SQL Server solutions with Quest Software Inc. "This is when you want to implement a master data strategy; one version of the truth, or reference data, as we call it."

Having multiple versions of the truth is the cause of most data quality issues. A simple example of this could involve two databases, each housing customer data. While Database 1 might list John Smith as working at "Hewlett-Packard," Database 2 would have him working at "HP." Therefore, two different versions of the same company exist. When it comes to reporting, cleaning up such inconsistencies has cost database professionals considerable time and stress. Microsoft's goal with Master Data Services is to provide a simpler, more cost-effective way to keep data in sync across multiple systems. "[What MDS] does nicely is that it enables you to build your records data," Kline said. "It gives you one version of the truth, which is a powerful thing when you have lots of different databases." So you don't have a
customer with eight different company names, for example.

Herain Oberoi, group product manager with the Microsoft SQL Server business group, described MDS as an important addition to the company's BI platform. Oberoi said that when customers are building a BI solution, they don't want to have to guess where their master data reference is, adding that MDS is designed to solve this problem in multiple ways. "There's this notion of operational master data, which is keeping systems in sync," Oberoi said. "Then there's this notion of analytic master data, which is not just keeping my traditional line-of-business systems in sync, but also keeping both things in sync with my data warehouse."
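The "multiple versions of the truth" problem is easy to surface with a query. As a rough sketch (the database, table and column names here are invented for illustration and are not part of MDS itself), this finds customers whose employer is recorded differently across two source databases on the same instance:

```sql
-- Hypothetical schema: both databases have a dbo.customers table
-- with first_name, last_name and employer columns.
SELECT c.first_name,
       c.last_name,
       COUNT(DISTINCT c.employer) AS employer_spellings
FROM (
    SELECT first_name, last_name, employer FROM SalesDB.dbo.customers
    UNION ALL
    SELECT first_name, last_name, employer FROM SupportDB.dbo.customers
) AS c
GROUP BY c.first_name, c.last_name
HAVING COUNT(DISTINCT c.employer) > 1;  -- e.g., 'Hewlett-Packard' vs. 'HP'
```

Each row returned is a customer that master data, as a single version of the truth, would collapse to one employer value.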
Microsoft views Master Data Services as a natural addition to its current slate of BI features, and Kline agrees. "It aligns really well with the rest of Microsoft's BI platform," he said, but he noted that MDS is not ideal for organizations where business intelligence isn't already a chief priority. "It's the kind of thing that small companies might not find interesting, but once you have a BI strategy in place, you start to see the need for it all over the place. It makes sense that it would come out a few years after [SQL Server Analysis Services], which a lot of people are using as their first BI tool," Kline said. Larger companies with established BI strategies could have a much stronger interest in MDS and, in turn, SQL Server 2008 R2.

While master data management solutions are nothing new, implementing such a project is traditionally a costly and tedious endeavor. Kline said that having a service like MDS could prove to be extremely cost-effective for organizations running SQL Server. "Microsoft is making the same play with the other BI tools in that this was formerly a high-end, expensive market, and now [the company is] going to make this much easier and inexpensive," Kline said. He noted that another money-saving feature of MDS comes from its process automation capabilities. "Any time you have to pay people to get this stuff done, that really adds up in terms of dollars," he explained. "A big element of MDS is process improvement and process automation. It's much easier in that you don't have to hire a consultant anymore; you can just do it yourself."

Other MDS benefits include improved data-cleansing operations and process management capabilities. The latter is designed to help guide database professionals toward properly cleaning up their data.
Even though it was derived from seasoned technology, the original incarnation of Master Data Services with SQL Server 2008 R2 is just that: a first edition. Therefore, it's not surprising that the technology still has some holes to fill when compared with competing products. This will likely open the door for Microsoft partners to put their stamp on MDS. "The SQL Server product is a little more generic, so it won't give you the same built-in toolset right off the bat," said John Welch, a SQL Server MVP and BI architect with Varigence Inc. "You might see vendors start to offer extended plug-ins to fill these holes over the next one or two years."

One notable area of improvement could involve the logic that goes with identifying different versions of the same customer. Welch said that while much of the MDS functionality is geared toward cleaning up varying attributes, it might not be as mature when it comes to identifying similar customers that are actually different people. An example could involve differentiating between customers Jon Smith and John Smith, similar names that have different spellings. "Some products out there are more mature in doing those types of processing and are capable of finding matches and saying this one [is the same], this one isn't," Welch said. "What we'll see from MDS in the first round is not going to be as baked in and effective at doing that."

One vendor that is more advanced in that area is Zoomix, a data quality software company that Microsoft acquired in 2008. While Zoomix's data quality software will not be a part of Master Data Services with SQL Server 2008 R2, Welch said it will likely come into play in the next version of MDS. Still, the SQL Server 2008 R2 feature stands to make a big difference for organizations looking to expand their business intelligence strategies to ensure a higher level of data quality, particularly if they are already using SQL Server's BI tools. "Remember, this is the tool you'd want to use after your first BI project," said Kline. "It's for when you say, 'This worked great, but we realize our data is messed up, so how do we keep it clean?' It's actually a follow-up to the BI project you've already done."
k PERFORMANCE
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
There are five things that DBAs do on a daily basis that are a complete waste of time. If you're doing any of these tasks, stop now to save valuable hours and resources. BY DENNY CHERRY
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
MANY OF THE TASKS that DBAs perform on a regular basis not only have little or no benefit to SQL Server, but they can actually be detrimental to the health of their environments. Let's look at five of the biggest DBA time wasters. If you are doing any of these, I suggest that you stop doing them as soon as possible.
1. Shrinking databases daily

Daily database shrinking is bad for a few reasons. From the technical side, the biggest impact you'll see is greatly increased index fragmentation after each database shrink. In addition, shrinking database files increases both the physical file fragmentation on the disk subsystem and the I/O load on the server, which decreases performance for other functions while the shrink operation is running. The actual shrinking of the database doesn't cause fragmentation, but as the files regrow themselves and you continue to shrink them, the database will become more fragmented.

If you shrink the log file, there is also the bad side effect of having to regrow it. When the log file fills and autogrows, all operations in the database are paused as the transaction log grows. This could take a second or two on a very busy system, causing all sorts of locking and blocking as processes wait for the transaction log to grow. The other downside is that when the database maintenance begins to run again, the files will need to grow. This growth requires CPU and disk
resources to complete. All of this further slows the database maintenance process, especially on SQL Server 2000 and older, or on SQL Server 2005 and higher systems that do not have the instant file initialization setting enabled.

From the management side, shrinking the database can give you a false sense of security, since you don't know how much space your database actually needs to consume. In other words, if your database grows from 100 GB to 130 GB every time you run the database maintenance process on it, and then you shrink it back down to 100 GB again, you will have no idea how much space the database actually needs. Does the database need 100 GB or 130 GB? The answer is that it needs 130 GB of space so that it can perform the needed database maintenance. If you shrink it down and then put other data on the disk, you may not have enough space to perform your database maintenance and, ultimately, the job will fail.
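Instead of shrinking on a schedule, it's worth checking how much of each file is actually in use; the "free" space is usually headroom that index maintenance needs. A quick sketch in T-SQL (YourDatabase is a placeholder name):

```sql
-- Run in the database in question: how big is each file, and how much
-- of it is actually used? size and SpaceUsed are in 8 KB pages.
SELECT name,
       size * 8 / 1024                             AS size_mb,
       FILEPROPERTY(name, 'SpaceUsed') * 8 / 1024  AS used_mb
FROM sys.database_files;

-- While you're at it, make sure the database isn't quietly
-- shrinking itself between maintenance windows:
ALTER DATABASE YourDatabase SET AUTO_SHRINK OFF;
```

If the gap between size and used space stays roughly constant from week to week, that space is your maintenance headroom, not waste.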
2. Truncating the transaction log

One of the more common setups I see online is the following database maintenance schedule:

T Log backup
T Index rebuild
T Full backup
T Truncate log
T Log backups every 30 minutes

That schedule rebuilds indexes and performs full backups. So far, so good, right? The log is then truncated, which breaks the log chain. That makes all log backups taken after this happens useless until the next full backup is taken, because the truncate log step breaks the log sequence number (LSN) chain. Whenever a transaction occurs, an LSN is written to the transaction log. When a backup is taken, the first and last LSNs included in the backup are written to the header of the log backup. When the logs are restored, the LSNs from one log backup to the next must be contiguous. If they are not contiguous, then SQL Server knows that log records are missing and the log backups cannot be restored. In this scenario, the full backup can be restored to the database. Unfortunately, the log backups are useless, because the last LSN in the transaction log backup will be different from the LSN of the first transaction log backup taken after the log is truncated, since the truncate log command changes the LSN of the log.

Another common scenario is to truncate the log, then perform the full
backup. This is better, but not by much. Any transactions between the truncate statement and the next full backup can't be recovered if the full backup is corrupt. Why? Because you can't restore the full backup from two days prior and then roll all the logs forward. That's because the truncate log step will still be resetting the log sequence numbers. And, yes, switching the database into the simple recovery model does the exact same thing. If you are truncating your transaction log so that you can shrink it, then please scroll up and reread the section above. Now, if you don't need the transaction log to be intact but have the database in full recovery, then you should change the database to the simple recovery model. That way, the transaction log will not grow, since log entries will be overwritten instead of kept until the next log backup.
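If the goal of the truncate was simply to keep the log from growing, either of these is the safer fix (sketched with a placeholder database name and backup path):

```sql
-- Option 1: you don't need point-in-time recovery. Switch to the
-- simple recovery model and stop managing the log by hand; log
-- records are overwritten automatically once they are no longer needed.
ALTER DATABASE YourDatabase SET RECOVERY SIMPLE;

-- Option 2: you do need point-in-time recovery. Stay in full
-- recovery and back the log up on a schedule instead of truncating;
-- unlike a truncate, this keeps the LSN chain intact and restorable.
BACKUP LOG YourDatabase
    TO DISK = N'D:\Backups\YourDatabase_log.trn';
```

Either way, the log stops growing without any truncate step breaking the backup chain.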
3. Restoring full backups to log shipping targets

This is one task that I hope you aren't doing on a daily basis. The first sign that log shipping was set up by someone who doesn't fully understand how the transaction log works is that the log shipping configuration is set up to restore the full backup to the log shipping target server daily or weekly. This is a waste of time because the log shipping target already has all the transactions applied to it. Restoring the full backup is wasted time and bandwidth if your log shipping target is in another office or data center. When you back up the transaction log, everything that has happened to the database since the last log backup is included: new columns and tables, index rebuilds, etc. By restoring the full backup to catch up on anything that has been missed, you are simply dropping the destination database and restoring it to the exact same state, then applying all the logs forward that were backed up while the full backup was being restored. All this does is increase the chance that a log backup will be missed.
4. Reorganizing and rebuilding the same indexes

As you know, there are two ways to clean up your database indexes. You can defragment the index using the REORG parameter or you can do a full rebuild of the index. With the database maintenance plans in SQL Server 2005, however, it's very easy to do both. While doing this won't specifically
hurt the database, performing both operations against the same index is a major time waster. The end result from both operations is the same: an index that isn't fragmented and that has the proper fill factor set for all of the database pages. If you frequently perform an index reorganization followed by an index rebuild, then the CPU power and disk I/O that you spend doing the reorganization is wasted; the index rebuild command will completely rebuild the index anyway. You should do one or the other, not both. If you aren't sure which one to use, you can purchase plenty of products to handle this automatically (Quest Software Inc.'s Capacity Manager or Idera's SQL defrag manager, for example), or you can find some free scripts available online.
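If you'd rather script it yourself, a common starting point (the 5% and 30% thresholds below are conventional rules of thumb, not hard rules) is to let measured fragmentation decide which single operation each index gets:

```sql
-- Check fragmentation for every index in the current database and
-- suggest one operation per index: reorganize light fragmentation,
-- rebuild heavy fragmentation, and never do both.
SELECT OBJECT_NAME(ips.object_id)           AS table_name,
       i.name                               AS index_name,
       ips.avg_fragmentation_in_percent,
       CASE
            WHEN ips.avg_fragmentation_in_percent > 30 THEN N'ALTER INDEX ... REBUILD'
            WHEN ips.avg_fragmentation_in_percent > 5  THEN N'ALTER INDEX ... REORGANIZE'
            ELSE N'leave it alone'
       END AS suggested_action
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ips
JOIN sys.indexes AS i
  ON i.object_id = ips.object_id
 AND i.index_id  = ips.index_id
WHERE ips.index_id > 0;   -- skip heaps
```

The 'LIMITED' scan mode keeps the check itself cheap, which matters if you run it inside a maintenance window.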
5. Manually reading error logs

Many DBAs in smaller shops take the time to read through error logs daily to look for problems. When you only have one or two servers to deal with, this doesn't take very long. When you start adding in more SQL Servers, however, going through these log files manually can take a very long time. You'd benefit more from finding an automated way to read the log files and look for errors, which would save you a lot of time and money, especially as the log files grow.

If you have a monitoring solution in place, it probably has a way to read the application log. Any critical error in the ErrorLog file will also be written to the Windows application log. If you don't have any sort of monitoring application, or if it doesn't support reading the error log, you can load the ErrorLog file and/or the application log into a table and look for errors.

Remember, while there are a lot of daily tasks that can add value to your organization, some add no value to the business and/or SQL Server and may actually detract from the bottom line. It's a good idea to step back and look at each of these tasks to evaluate what they do, and see if they are giving your company an actual cost benefit or not.
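Loading the error log into a table, as described above, can be sketched like this (xp_readerrorlog and sp_readerrorlog are undocumented but long-standing procedures; the search string is only an example):

```sql
-- Search the current SQL Server error log directly. Arguments:
-- log number (0 = current), log type (1 = SQL Server error log),
-- and a string to search for.
EXEC master.dbo.xp_readerrorlog 0, 1, N'error';

-- Or load the whole log into a table you can query and archive.
-- INSERT ... EXEC matches columns by position, not name.
CREATE TABLE #errorlog (
    LogDate     datetime,
    ProcessInfo nvarchar(50),
    LogText     nvarchar(max)
);
INSERT INTO #errorlog
EXEC master.dbo.sp_readerrorlog;

SELECT * FROM #errorlog WHERE LogText LIKE N'%error%';
```

Wrapped in a scheduled job that emails its results, this turns a daily reading chore into something you only look at when there is a hit.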
k ABOUT THE AUTHORS
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Eric Johnson, SQL Server MVP, is the co-founder of Consortio Services and the primary Database Technologies consultant. His background in IT ranges from operating systems and hardware to specialized applications and development. He has more than 10 years of IT experience, much of which deals with SQL Server, and has managed and designed databases of all shapes and sizes. He is an active presenter on SQL Server topics for national technology conferences and has published various articles and books.
Cathy Gagne, Editorial Director, cgagne@techtarget.com
Brendan Cournoyer, Site Editor, bcournoyer@techtarget.com
Michelle Boisvert, Managing Editor, mboisvert@techtarget.com
Martha Moore, Copy Editor, mmoore@techtarget.com
Linda Koury, Art Director of Digital Content, lkoury@techtarget.com
Marc Laplante, Publisher, mlaplante@techtarget.com
Peter Larkin, Senior Director of Sales, plarkin@techtarget.com
k FROM OUR SPONSOR
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
q SQL Server 2008 Upgrade Series: Business Strategies
q SQL Server 2008 Upgrade Series: Performance
q SQL Server 2008 Upgrade Series: Virtualization

About Dell, Inc. and Intel: Dell and Intel are strategic partners in delivering innovative hardware solutions to solve your most challenging IT problems. Together we are delivering virtualization-optimized solutions to maximize the benefits of any virtualization project.