20464C
Developing Microsoft® SQL Server®
Databases
MCT USE ONLY. STUDENT USE PROHIBITED
Information in this document, including URL and other Internet Web site references, is subject to change
without notice. Unless otherwise noted, the example companies, organizations, products, domain names,
e-mail addresses, logos, people, places, and events depicted herein are fictitious, and no association with
any real company, organization, product, domain name, e-mail address, logo, person, place or event is
intended or should be inferred. Complying with all applicable copyright laws is the responsibility of the
user. Without limiting the rights under copyright, no part of this document may be reproduced, stored in
or introduced into a retrieval system, or transmitted in any form or by any means (electronic, mechanical,
photocopying, recording, or otherwise), or for any purpose, without the express written permission of
Microsoft Corporation.
Microsoft may have patents, patent applications, trademarks, copyrights, or other intellectual property
rights covering subject matter in this document. Except as expressly provided in any written license
agreement from Microsoft, the furnishing of this document does not give you any license to these
patents, trademarks, copyrights, or other intellectual property.
The names of manufacturers, products, or URLs are provided for informational purposes only and
Microsoft makes no representations or warranties, either expressed, implied, or statutory, regarding
these manufacturers or the use of the products with any Microsoft technologies. The inclusion of a
manufacturer or product does not imply endorsement by Microsoft of the manufacturer or product. Links
may be provided to third party sites. Such sites are not under the control of Microsoft and Microsoft is not
responsible for the contents of any linked site or any link contained in a linked site, or any changes or
updates to such sites. Microsoft is not responsible for webcasting or any other form of transmission
received from any linked site. Microsoft is providing these links to you only as a convenience, and the
inclusion of any link does not imply endorsement by Microsoft of the site or the products contained
therein.
© 2014 Microsoft Corporation. All rights reserved.
Released: 08/2014
MICROSOFT LICENSE TERMS
MICROSOFT INSTRUCTOR-LED COURSEWARE
These license terms are an agreement between Microsoft Corporation (or based on where you live, one of its
affiliates) and you. Please read them. They apply to your use of the content accompanying this agreement which
includes the media on which you received it, if any. These license terms also apply to Trainer Content and any
updates and supplements for the Licensed Content unless other terms accompany those items. If so, those terms
apply.
BY ACCESSING, DOWNLOADING OR USING THE LICENSED CONTENT, YOU ACCEPT THESE TERMS.
IF YOU DO NOT ACCEPT THEM, DO NOT ACCESS, DOWNLOAD OR USE THE LICENSED CONTENT.
If you comply with these license terms, you have the rights below for each license you acquire.
1. DEFINITIONS.
a. “Authorized Learning Center” means a Microsoft IT Academy Program Member, Microsoft Learning
Competency Member, or such other entity as Microsoft may designate from time to time.
b. “Authorized Training Session” means the instructor-led training class using Microsoft Instructor-Led
Courseware conducted by a Trainer at or through an Authorized Learning Center.
c. “Classroom Device” means one (1) dedicated, secure computer that an Authorized Learning Center owns
or controls that is located at an Authorized Learning Center’s training facilities that meets or exceeds the
hardware level specified for the particular Microsoft Instructor-Led Courseware.
d. “End User” means an individual who is (i) duly enrolled in and attending an Authorized Training Session
or Private Training Session, (ii) an employee of a MPN Member, or (iii) a Microsoft full-time employee.
e. “Licensed Content” means the content accompanying this agreement which may include the Microsoft
Instructor-Led Courseware or Trainer Content.
f. “Microsoft Certified Trainer” or “MCT” means an individual who is (i) engaged to teach a training session
to End Users on behalf of an Authorized Learning Center or MPN Member, and (ii) currently certified as a
Microsoft Certified Trainer under the Microsoft Certification Program.
g. “Microsoft Instructor-Led Courseware” means the Microsoft-branded instructor-led training course that
educates IT professionals and developers on Microsoft technologies. A Microsoft Instructor-Led
Courseware title may be branded as MOC, Microsoft Dynamics or Microsoft Business Group courseware.
h. “Microsoft IT Academy Program Member” means an active member of the Microsoft IT Academy
Program.
i. “Microsoft Learning Competency Member” means an active member of the Microsoft Partner Network
program in good standing that currently holds the Learning Competency status.
j. “MOC” means the “Official Microsoft Learning Product” instructor-led courseware known as Microsoft
Official Course that educates IT professionals and developers on Microsoft technologies.
k. “MPN Member” means an active Microsoft Partner Network program member in good standing.
l. “Personal Device” means one (1) personal computer, device, workstation or other digital electronic device
that you personally own or control that meets or exceeds the hardware level specified for the particular
Microsoft Instructor-Led Courseware.
m. “Private Training Session” means the instructor-led training classes provided by MPN Members for
corporate customers to teach a predefined learning objective using Microsoft Instructor-Led Courseware.
These classes are not advertised or promoted to the general public and class attendance is restricted to
individuals employed by or contracted by the corporate customer.
n. “Trainer” means (i) an academically accredited educator engaged by a Microsoft IT Academy Program
Member to teach an Authorized Training Session, and/or (ii) a MCT.
o. “Trainer Content” means the trainer version of the Microsoft Instructor-Led Courseware and additional
supplemental content designated solely for Trainers’ use to teach a training session using the Microsoft
Instructor-Led Courseware. Trainer Content may include Microsoft PowerPoint presentations, trainer
preparation guides, train-the-trainer materials, Microsoft OneNote packs, classroom setup guides, and
pre-release course feedback forms. To clarify, Trainer Content does not include any software, virtual hard
disks or virtual machines.
2. USE RIGHTS. The Licensed Content is licensed, not sold. The Licensed Content is licensed on a one copy
per user basis, such that you must acquire a license for each individual that accesses or uses the Licensed
Content.
2.1 Below are five separate sets of use rights. Only one set of rights applies to you.
2.2 Separation of Components. The Licensed Content is licensed as a single unit and you may not
separate its components and install them on different devices.
2.3 Redistribution of Licensed Content. Except as expressly provided in the use rights above, you may
not distribute any Licensed Content or any portion thereof (including any permitted modifications) to any
third parties without the express written permission of Microsoft.
2.4 Third Party Notices. The Licensed Content may include third party code that Microsoft, not the
third party, licenses to you under this agreement. Notices, if any, for the third party code are included
for your information only.
2.5 Additional Terms. Some Licensed Content may contain components with additional terms,
conditions, and licenses regarding its use. Any non-conflicting terms in those conditions and licenses also
apply to your use of that respective component and supplement the terms described in this agreement.
a. Pre-Release Licensed Content. This Licensed Content subject matter is based on the Pre-release version of
the Microsoft technology. The technology may not work the way a final version of the technology will
and we may change the technology for the final version. We also may not release a final version.
Licensed Content based on the final version of the technology may not contain the same information as
the Licensed Content based on the Pre-release version. Microsoft is under no obligation to provide you
with any further content, including any Licensed Content based on the final version of the technology.
b. Feedback. If you agree to give feedback about the Licensed Content to Microsoft, either directly or
through its third party designee, you give to Microsoft without charge, the right to use, share and
commercialize your feedback in any way and for any purpose. You also give to third parties, without
charge, any patent rights needed for their products, technologies and services to use or interface with
any specific parts of a Microsoft technology, Microsoft product, or service that includes the feedback.
You will not give feedback that is subject to a license that requires Microsoft to license its technology,
technologies, or products to third parties because we include your feedback in them. These rights
survive this agreement.
c. Pre-release Term. If you are a Microsoft IT Academy Program Member, Microsoft Learning
Competency Member, MPN Member or Trainer, you will cease using all copies of the Licensed Content on
the Pre-release technology upon (i) the date which Microsoft informs you is the end date for using the
Licensed Content on the Pre-release technology, or (ii) sixty (60) days after the commercial release of the
technology that is the subject of the Licensed Content, whichever is earlier (“Pre-release term”).
Upon expiration or termination of the Pre-release term, you will irretrievably delete and destroy all copies
of the Licensed Content in your possession or under your control.
4. SCOPE OF LICENSE. The Licensed Content is licensed, not sold. This agreement only gives you some
rights to use the Licensed Content. Microsoft reserves all other rights. Unless applicable law gives you more
rights despite this limitation, you may use the Licensed Content only as expressly permitted in this
agreement. In doing so, you must comply with any technical limitations in the Licensed Content that only
allow you to use it in certain ways. Except as expressly permitted in this agreement, you may not:
• access or allow any individual to access the Licensed Content if they have not acquired a valid license
for the Licensed Content,
• alter, remove or obscure any copyright or other protective notices (including watermarks), branding
or identifications contained in the Licensed Content,
• modify or create a derivative work of any Licensed Content,
• publicly display, or make the Licensed Content available for others to access or use,
• copy, print, install, sell, publish, transmit, lend, adapt, reuse, link to or post, make available or
distribute the Licensed Content to any third party,
• work around any technical limitations in the Licensed Content, or
• reverse engineer, decompile, remove or otherwise thwart any protections or disassemble the
Licensed Content except and only to the extent that applicable law expressly permits, despite this
limitation.
5. RESERVATION OF RIGHTS AND OWNERSHIP. Microsoft reserves all rights not expressly granted to
you in this agreement. The Licensed Content is protected by copyright and other intellectual property laws
and treaties. Microsoft or its suppliers own the title, copyright, and other intellectual property rights in the
Licensed Content.
6. EXPORT RESTRICTIONS. The Licensed Content is subject to United States export laws and regulations.
You must comply with all domestic and international export laws and regulations that apply to the Licensed
Content. These laws include restrictions on destinations, end users and end use. For additional information,
see www.microsoft.com/exporting.
7. SUPPORT SERVICES. Because the Licensed Content is “as is”, we may not provide support services for it.
8. TERMINATION. Without prejudice to any other rights, Microsoft may terminate this agreement if you fail
to comply with the terms and conditions of this agreement. Upon termination of this agreement for any
reason, you will immediately stop all use of and delete and destroy all copies of the Licensed Content in
your possession or under your control.
9. LINKS TO THIRD PARTY SITES. You may link to third party sites through the use of the Licensed
Content. The third party sites are not under the control of Microsoft, and Microsoft is not responsible for
the contents of any third party sites, any links contained in third party sites, or any changes or updates to
third party sites. Microsoft is not responsible for webcasting or any other form of transmission received
from any third party sites. Microsoft is providing these links to third party sites to you only as a
convenience, and the inclusion of any link does not imply an endorsement by Microsoft of the third party
site.
10. ENTIRE AGREEMENT. This agreement, and any additional terms for the Trainer Content, updates and
supplements are the entire agreement for the Licensed Content, updates and supplements.
12. LEGAL EFFECT. This agreement describes certain legal rights. You may have other rights under the laws
of your country. You may also have rights with respect to the party from whom you acquired the Licensed
Content. This agreement does not change your rights under the laws of your country if the laws of your
country do not permit it to do so.
13. DISCLAIMER OF WARRANTY. THE LICENSED CONTENT IS LICENSED "AS-IS" AND "AS
AVAILABLE." YOU BEAR THE RISK OF USING IT. MICROSOFT AND ITS RESPECTIVE
AFFILIATES GIVE NO EXPRESS WARRANTIES, GUARANTEES, OR CONDITIONS. YOU MAY
HAVE ADDITIONAL CONSUMER RIGHTS UNDER YOUR LOCAL LAWS WHICH THIS AGREEMENT
CANNOT CHANGE. TO THE EXTENT PERMITTED UNDER YOUR LOCAL LAWS, MICROSOFT AND
ITS RESPECTIVE AFFILIATES EXCLUDE ANY IMPLIED WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT.
14. LIMITATION ON AND EXCLUSION OF REMEDIES AND DAMAGES. YOU CAN RECOVER FROM
MICROSOFT, ITS RESPECTIVE AFFILIATES AND ITS SUPPLIERS ONLY DIRECT DAMAGES UP
TO US$5.00. YOU CANNOT RECOVER ANY OTHER DAMAGES, INCLUDING CONSEQUENTIAL,
LOST PROFITS, SPECIAL, INDIRECT OR INCIDENTAL DAMAGES.
This limitation also applies even if Microsoft knew or should have known about the possibility of the
damages. The above limitation or exclusion may not apply to you because your country may not allow
the exclusion or limitation of incidental, consequential or other damages.
Please note: As this Licensed Content is distributed in Quebec, Canada, some of the clauses in this
agreement are provided below in French.
Remarque : Le contenu sous licence étant distribué au Québec, Canada, certaines des clauses
dans ce contrat sont fournies ci-dessous en français.
EXONÉRATION DE GARANTIE. Le contenu sous licence visé par une licence est offert « tel quel ». Toute
utilisation de ce contenu sous licence est à vos seuls risques et périls. Microsoft n’accorde aucune autre garantie
expresse. Vous pouvez bénéficier de droits additionnels en vertu du droit local sur la protection des
consommateurs, que ce contrat ne peut modifier. Là où elles sont permises par le droit local, les garanties
implicites de qualité marchande, d’adéquation à un usage particulier et d’absence de contrefaçon sont exclues.
EFFET JURIDIQUE. Le présent contrat décrit certains droits juridiques. Vous pourriez avoir d’autres droits
prévus par les lois de votre pays. Le présent contrat ne modifie pas les droits que vous confèrent les lois de votre
pays si celles-ci ne le permettent pas.
Acknowledgments
Microsoft Learning would like to acknowledge and thank the following for their contribution towards
developing this title. Their effort at various stages in the development has ensured that you have a good
classroom experience.
Contents
Module 1: An Introduction to Database Development
Module Overview 1-1
Lesson 2: Storing XML Data and XML Schemas in SQL Server 13-9
Lesson 1: Considerations for Working with Data Files in SQL Server 2014 15-2
Lesson 2: Implementing FILESTREAM and FileTables 15-9
Module 13 Lab: Storing and Querying XML Data in SQL Server L13-1
Course Description
This five-day instructor-led course introduces SQL Server 2014 and describes logical table design, indexing,
and query plans. It also focuses on the creation of database objects, including views, stored procedures,
and functions, along with parameters. Other common aspects of procedure coding, such as transactions,
error handling, triggers, and SQL CLR, are also covered in this course. This course helps people prepare for
exam 70-461: Writing Queries Using Microsoft® SQL Server® 2014 Transact-SQL.
Audience
The primary audience for this course is IT professionals who want to become skilled in SQL Server 2014
product features and technologies for implementing a database.
Student Prerequisites
This course requires that you meet the following prerequisites:
In addition to their professional experience, students who attend this training should already have the
following technical knowledge:
Course Objectives
After completing this course, students will be able to:
Describe the concepts of database development.
Course Outline
This section provides an outline of the course:
Course Materials
The following materials are included with your kit:
Course Handbook: A succinct classroom learning guide that provides all the critical technical
information in a crisp, tightly focused format, which is just right for an effective in-class learning
experience.
Lessons: Guide you through the learning objectives and provide the key points that are critical to
the success of the in-class learning experience.
Labs: Provide a real-world, hands-on platform for you to apply the knowledge and skills learned
in the module.
Module Reviews and Takeaways: Provide improved on-the-job reference material to boost
knowledge and skills retention.
Lab Answer Keys: Provide step-by-step lab solution guidance at your fingertips when it’s
needed.
Modules: Include companion content, such as questions and answers, detailed demo steps and
additional reading links, for each lesson. Additionally, they include Lab Review questions and answers
and Module Reviews and Takeaways sections, which contain the review questions and answers, best
practices, common issues and troubleshooting tips with answers, and real-world issues and scenarios
with answers.
Resources: Include well-categorized additional resources that give you immediate access to the most
up-to-date premium content on TechNet, MSDN®, and Microsoft Press®.
Course evaluation: At the end of the course, you will have the opportunity to complete an online
evaluation to provide feedback on the course, training facility, and instructor.
The following table shows the role of each virtual machine used in this course:
Software Configuration
The following software is installed on each VM:
Course Files
There are files associated with the labs in this course. The lab files are located in the folder
D:\Labfiles\LabXX on the 20464C-MIA-SQL virtual machine.
Classroom Setup
Each classroom computer will have the same virtual machine configured in the same way.
Module 1
An Introduction to Database Development
Contents:
Module Overview 1-1
Module Overview
Before beginning to work with SQL Server in either a development or an administration role, it is
important to understand the overall SQL Server platform. In particular, it is useful to understand that SQL
Server is not just a database engine, but a complete platform for managing enterprise data.
Along with a strong platform, SQL Server provides a series of tools that make the product easy to manage
and a good target for application development.
Individual components of SQL Server can operate within separate security contexts. Correctly configuring
SQL Server services is important where enterprises operate with a policy of least privilege.
Objectives
After completing this module, you will be able to:
Lesson 1
Introduction to the SQL Server Platform
Microsoft® SQL Server® data management software is a platform for developing business applications
that are data focused. Rather than being a single, monolithic application, SQL Server is structured as a
series of components. It is important to understand the use of each component.
You can install more than one copy of SQL Server on a server. Each copy is called an instance and you can
separately configure and manage each one.
There are various editions of SQL Server, and each edition has a different set of capabilities. It is important
to understand the target business cases for each SQL Server edition and how SQL Server has evolved
through a series of improving versions over many years. It is a stable and robust platform.
Lesson Objectives
After completing this lesson, you will be able to:
Describe the overall SQL Server platform.
Explain the role of each of the components that make up the SQL Server platform.
Enterprise Ready
High Availability
Impressive performance is necessary, but not at the cost of availability. Organizations need constant
access to their data. Many enterprises are now finding it necessary to provide access to their data 24 hours
a day, seven days a week. The SQL Server platform was designed with the highest levels of availability in
mind. As each version of the product has been released, more capabilities have been added to minimize
any potential downtime.
Security
Uppermost in the minds of enterprise managers is the need to secure organizational data. It is not
possible to retrofit security after an application or product has been created. From the very beginning,
SQL Server has been built with the highest levels of security as a goal.
Scalability
Organizations need data management capabilities for systems of all sizes. SQL Server scales from the
smallest needs to the largest via a series of editions that have increasing capabilities.
Cost of Ownership
Many competing database management systems are expensive both to purchase and to maintain. SQL
Server offers very low total cost of ownership. SQL Server tooling (both management and development)
builds on existing Windows® knowledge. Most users tend to become familiar with the tools quite quickly.
The productivity that users achieve when they use the tools is enhanced by the high degree of integration
between the tools. For example, many of the SQL Server tools have links to launch and preconfigure other
SQL Server tools.
Component Purpose
Component Purpose
Multiple Instances
Applications that need an organization to support them may require server configurations that are
inconsistent or incompatible with the server requirements of other applications. Each instance of SQL
Server is separately configurable.
Application databases might need to be supported with different levels of service, particularly in
relation to availability. You can use SQL Server instances to separate workloads with differing service
level agreements (SLAs) that need to be met.
Applications might require different server-level collations. Although each database can have
different collations, an application might be dependent on the collation of the tempdb database
when the application is using temporary objects.
You can often install different versions of SQL Server side by side by using multiple instances. This can
assist when testing upgrade scenarios or performing upgrades.
Prior to SQL Server 2000, it was only possible to install a single copy of SQL Server on a server system. SQL
Server was addressed by the name of the server. To maintain backward compatibility, this mode of
connection is still supported and is known as a ‘‘default’’ instance.
Additional instances of SQL Server require an instance name in addition to the server name and are
known as ‘‘named’’ instances. You do not need to install a default instance before installing named
instances. Not all components of SQL Server can be installed in more than one instance. A
substantial change in SQL Server 2012 enabled multiple-instance support for SQL Server Integration
Services.
There is no need to install SQL Server tools more than once. A single installation of the tools can manage
and configure all instances.
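Because each instance is addressed either by the server name alone (the default instance) or by ServerName\InstanceName (a named instance), a quick way to confirm which instance a connection is using, and how it is configured, is to query the server properties. A minimal Transact-SQL sketch (the instance names in the comments are illustrative):

```sql
-- Report which instance the current connection is using.
SELECT @@SERVERNAME AS ServerAndInstance;               -- e.g. MIA-SQL, or MIA-SQL\SQL2 for a named instance
SELECT SERVERPROPERTY('InstanceName') AS InstanceName;  -- NULL when connected to the default instance
SELECT SERVERPROPERTY('Collation') AS ServerCollation;  -- server-level collation, which can differ per instance
```

Running the collation query against each instance is one way to verify the server-level (and therefore tempdb) collation requirements discussed above.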
Early Versions
Later Versions
Version 7.0 saw a significant rewrite of the product. Substantial advances were made in reducing the
administration workload for the product. OLAP Services (which later became Analysis Services) was
introduced.
SQL Server 2000 featured support for multiple instances and collations. It also introduced support for data
mining. SQL Server Reporting Services was introduced after the product release as an add-on
enhancement to the product, along with support for 64-bit processors.
SQL Server 2005 provided another significant rewrite of many aspects of the product:
It introduced support for nonrelational data that was stored and queried as XML.
SQL Server Management Studio was released to replace several previous administrative tools.
SQL Server Integration Services replaced a tool formerly known as Data Transformation Services (DTS).
Another key addition to the product was the introduction of support for objects that had been
created by using the common language runtime (CLR).
The Transact-SQL language was substantially enhanced, including structured exception handling.
Dynamic Management Views and Functions were introduced to enable detailed health monitoring,
performance tuning, and troubleshooting.
Substantial high-availability improvements were included in the product. Database mirroring was
introduced.
The SQL Server “AlwaysOn” technologies were introduced to reduce potential downtime.
Full-text indexing was integrated directly within the database engine. (Previously, full-text indexing
was based on interfaces to services at the operating system level.)
A policy-based management framework was introduced to assist with a move to more declarative-
based management practices, rather than reactive practices.
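Two of the Transact-SQL additions described above, structured exception handling and Dynamic Management Views, can be illustrated with a short sketch:

```sql
-- Structured exception handling (introduced in SQL Server 2005).
BEGIN TRY
    SELECT 1 / 0;   -- raises a divide-by-zero error
END TRY
BEGIN CATCH
    SELECT ERROR_NUMBER()  AS ErrorNumber,
           ERROR_MESSAGE() AS ErrorMessage;
END CATCH;

-- A Dynamic Management View: requests currently executing on the instance.
SELECT session_id, status, command
FROM sys.dm_exec_requests;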
The enhancements and additions to the product in SQL Server 2008 R2 included:
Support for managing reference data with the introduction of Master Data Services.
The introduction of StreamInsight, which enabled users to query data that was arriving at high speed,
before storing the data in a database.
The enhancements and additions to the product in SQL Server 2012 included:
The migration of Business Intelligence projects into Microsoft Visual Studio® 2010.
Data-tier applications, which assisted with packaging database applications as part of application
development projects.
Strong enhancements to the Transact-SQL language, such as the addition of sequences, new error-
handling capabilities, and new window functions.
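The SQL Server 2012 language additions mentioned above can be sketched as follows; the sequence name dbo.OrderNumbers is hypothetical, chosen only for illustration:

```sql
-- A sequence object (new in SQL Server 2012).
CREATE SEQUENCE dbo.OrderNumbers START WITH 1 INCREMENT BY 1;
SELECT NEXT VALUE FOR dbo.OrderNumbers AS NextOrderNumber;

-- A window function using the enhanced OVER clause support.
SELECT name, database_id,
       ROW_NUMBER() OVER (ORDER BY database_id) AS RowNum
FROM sys.databases;

-- THROW, one of the new error-handling capabilities
-- (commented out so the script runs without raising an error):
-- THROW 50001, 'Example error message.', 1;
```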
The enhancements and additions to the product in SQL Server 2014 include:
Substantial performance gains from the introduction of in-memory tables and native stored
procedures.
Enhanced security.
Improved scalability.
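The SQL Server 2014 in-memory features mentioned above combine memory-optimized tables with natively compiled stored procedures. The following is a sketch only, with hypothetical object names; it assumes a database that already has a MEMORY_OPTIMIZED_DATA filegroup:

```sql
-- A memory-optimized, durable table (assumes a MEMORY_OPTIMIZED_DATA filegroup exists).
CREATE TABLE dbo.SessionState
(
    SessionId INT NOT NULL
        PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 100000),
    Payload   NVARCHAR(200) NOT NULL
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);
GO

-- A natively compiled stored procedure that reads the table.
CREATE PROCEDURE dbo.GetSession @SessionId INT
WITH NATIVE_COMPILATION, SCHEMABINDING, EXECUTE AS OWNER
AS
BEGIN ATOMIC WITH (TRANSACTION ISOLATION LEVEL = SNAPSHOT, LANGUAGE = N'us_english')
    SELECT SessionId, Payload
    FROM dbo.SessionState
    WHERE SessionId = @SessionId;
END;
```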
Lesson 2
Working with SQL Server Tools
Working effectively with SQL Server requires familiarity with the tools that are used in conjunction with it.
Before any tool can connect to SQL Server, it needs to make a network connection to the server. In this
lesson, you will see how these connections are made, and then look at the tools that are most commonly
used when you are working with SQL Server.
Lesson Objectives
After completing this lesson, you will be able to:
TDS is a high-level protocol that is transported by lower-level protocols. It is most commonly transported
by the TCP/IP protocol or the Named Pipes protocol, or implemented over a shared memory connection.
Authentication
For most applications and organizations, data must be held securely and access to the data is based on
the identity of the user who is attempting to access the data. The process of verifying the identity of a
user (or more formally, of any principal) is known as authentication. SQL Server supports two forms of
authentication:
1. It can store the login details for users directly within its own system databases. These logins are
known as SQL Server logins.
2. It can be configured to trust a Windows authenticator (such as Active Directory®). In that case, a
Windows user can be granted access to the server, either directly or via his or her Windows group
memberships.
When a connection is made, the user is connected to a specific database, which is known as his or her
“default” database.
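Both forms of authentication, and the default database assigned at connection time, can be seen in the Transact-SQL used to create logins. The login name AppLogin below is hypothetical; the Windows account and password match those used elsewhere in this course:

```sql
-- A SQL Server login: credentials stored in SQL Server's own system databases.
CREATE LOGIN AppLogin
    WITH PASSWORD = 'Pa$$w0rd',
         DEFAULT_DATABASE = AdventureWorks;

-- A Windows login: SQL Server trusts the Windows authenticator for this user.
CREATE LOGIN [ADVENTUREWORKS\Student]
    FROM WINDOWS
    WITH DEFAULT_DATABASE = AdventureWorks;
```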
Client Libraries
OLEDB is a library that does not translate commands. OLEDB originally stood for Object Linking and
Embedding for Databases, but that meaning is no longer very relevant. When an application sends an SQL
command, OLEDB passes it to the database server without modification.
The SQL Server Native Client (SNAC) is a software layer that encapsulates commands that
libraries such as OLEDB, ODBC, and JDBC have issued into commands that SQL Server can understand. It
then encapsulates results that SQL Server returns ready for consumption by these libraries. This primarily
involves wrapping the commands and results in the TDS protocol.
Network Libraries
SQL Server exposes endpoints that client applications can connect to. The endpoint is used to pass
commands and data to and from the database engine.
SNAC connects to these endpoints via network libraries such as TCP/IP, or Named Pipes. For client
applications that are executing on the same computer as the SQL Server service, a special “shared
memory” network connection is also available.
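The endpoints that an instance exposes, and the protocols they use, can be inspected through a catalog view. For example:

```sql
-- List the endpoints exposed by this instance and their transport protocols
-- (TSQL endpoints for TCP, Named Pipes, and shared memory appear here).
SELECT name, protocol_desc, state_desc
FROM sys.endpoints;
```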
SQL Server receives commands via endpoints and sends results to clients via endpoints. Clients interact
with the Relational engine, which in turn utilizes the Storage engine to manage the storage of databases.
The SQL Server Operating System (SQLOS) is a software layer that provides a layer of abstraction between
the Relational engine and the available server resources.
All SQL Server relational database management tasks can be performed by using the Transact-SQL
language, but many users prefer graphical administration tools because they are typically easier to use
than the Transact-SQL commands. SQL Server Management Studio provides graphical interfaces for
configuring databases and servers.
SQL Server Management Studio can connect to a variety of SQL Server services including the Database
Engine, Analysis Services, Integration Services, Reporting Services, and SQL Server Compact edition.
Register servers.
Demonstration Steps
Use SSMS to connect to an on-premises instance of SQL Server 2014
1. Ensure that the 20464C-MIA-DC and 20464C-MIA-SQL virtual machines are running and then log on
to 20464C-MIA-SQL as AdventureWorks\Student with the password Pa$$w0rd.
4. In the Connect to Server window, ensure that Server type is set to Database Engine.
6. In the Authentication drop-down list, select Windows Authentication, and then click Connect.
Run a T-SQL script
1. If required, on the View menu, click Object Explorer.
2. In Object Explorer, expand Databases, expand AdventureWorks, and then expand Tables. Review
the database objects.
3. Right-click the AdventureWorks database, and then click New Query.
5. Note the use of IntelliSense while you are typing this query, and then on the toolbar, click Execute.
Note how the results can be returned.
6. On the File menu, click Save SQLQuery1.sql. Note that this saves the query to a file. In the Save File
As window, click Cancel.
7. On the Results tab, right-click the cell for ProductID 1 (first row and first cell), and then click Save
Results As. In the FileName text box, type Demonstration2AResults and then click Save. Note that
this saves the query results to a file.
8. On the Query menu, click Display Estimated Execution Plan. Note that SQL Server Management
Studio can do more than simply execute queries.
10. In the Options pane, expand Query Results, expand SQL Server, and then click General. Review the
available configuration options and then click Cancel.
11. On the File menu, click Close. In the Microsoft SQL Server Management Studio window, click No.
3. On the View menu, click Solution Explorer. Note the contents of Solution Explorer.
2. On the File menu, click New, and then click Database Engine Query to open a new connection.
3. In the Connect to Database Engine window, in the Server name box, type (local).
4. In the Authentication drop-down list, select Windows Authentication, and then click Connect.
5. In the Available Databases drop-down list on the toolbar, click tempdb. Note that this will change
the database against which the query is executed.
6. Right-click in the query window, click Connection, and then click Change Connection. This will
reconnect the query to another instance of SQL Server.
2. In the Registered Servers window, expand Database Engine, right-click Local Server Groups, and
then click New Server Group.
3. In the New Server Group Properties window, in the Group name box, type Dev Servers and then
click OK.
5. In the New Server Registration window, click the Server name drop-down list, select (local) and
then click Save.
8. In the Registered Servers window, right-click the Dev Servers group, and then click New Query.
9. Type the query as shown in the snippet below, and then click Execute.
SELECT @@version;
11. In the Microsoft SQL Server Management Studio window, click No.
Lesson 3
Configuring SQL Server Services
Users can configure each SQL Server service individually. The ability to provide individual configuration for
services assists organizations that aim to minimize the permissions assigned to service accounts as part of
a policy of least-privilege execution. SQL Server Configuration Manager is used to configure services,
including the accounts under which the services operate, and the network libraries that the SQL Server
services use.
SQL Server also ships with various tools. It is important to know what each of these tools is used for.
Lesson Objectives
After completing this lesson, you will be able to:
Managing client protocols. When client applications (such as SQL Server Management Studio) are
installed on a server, it is necessary to configure how connections from those tools are made to SQL
Server. Users can use SQL Server Configuration Manager to configure the protocols required and to
create aliases for the servers to simplify connectivity.
Each service has a start mode. This mode can be set to Automatic, Manual, or Disabled. Services that are
set to the Automatic start mode are automatically started when the operating system starts. Services that
are set to the Manual start mode can be manually started. Services that are set to the Disabled start mode
cannot be started.
Instances
Many SQL Server components are instance-aware and can be installed more than once on a single server.
When SQL Server Configuration Manager lists each service, it shows the associated instance of SQL Server
in parentheses after the name of the service.
Many protocols provide multiple levels of configuration. For example, the configuration for the TCP/IP
protocol makes it possible to have different settings on each configured IP address if required, or a
general set of configurations that is applied to all IP addresses.
Client Configurations
Every computer that has SNAC installed needs to be able to configure how that library will access SQL
Server services.
SNAC is installed on the server in addition to being installed on client systems. When SQL Server
Management Studio is installed on the server, it uses the SNAC library to make connections to the SQL
Server services that are on the same system. Users can use the client configuration nodes within SQL
Server Configuration Manager to configure how those connections are made. Note that two sets of client
configurations are provided and that they only apply to the computer where they are configured. One set
is used for 32-bit applications; the other set is used for 64-bit applications. SQL Server Management
Studio is a 32-bit application, even when SQL Server is installed as a 64-bit application.
Aliases
Hard-coding connection details for a specific server,
protocol, and port within an application is not
desirable because these might need to change over
time.
Each client system that utilizes SNAC (including the server itself) can have one or more aliases configured.
Aliases for 32-bit applications are configured independently of the aliases for 64-bit applications.
Tool                                         Purpose
Master Data Services Configuration Manager   Configure and manage SQL Server Master
                                             Data Services
SQL Server Error and Usage Reporting         Configure the level of automated reporting
                                             back to the SQL Server product team about
                                             errors that occur and on usage of different
                                             aspects of the product
SQL Server Management Objects (SMO)          Provide a detailed .NET-based library for
                                             working with management aspects of SQL
                                             Server directly from application code
Demonstration Steps
Start a SQL Server Profiler trace
1. Ensure that the 20464C-MIA-DC and 20464C-MIA-SQL virtual machines are running and then log on
to 20464C-MIA-SQL as ADVENTUREWORKS\Student with the password Pa$$w0rd.
4. In the Connect to Server window, ensure that Server type is set to Database Engine.
5. In the Server name text box, type (local).
6. In the Authentication drop-down list, select Windows Authentication, and then click Connect.
8. In the Connect to Server window, ensure that Server type is set to Database Engine.
10. In the Authentication drop-down list, select Windows Authentication, and then click Connect.
11. In the Trace Properties window, in Trace name, type Demonstration.
12. Click Run. Note that this will start a new trace with the default options.
View a SQL Server Profiler trace
1. Switch to SQL Server Management Studio, and then click New Query.
2. In the query window, type the query as shown below, and then click Execute.
USE AdventureWorks;
GO
SELECT * FROM Person.Person
ORDER BY FirstName;
GO
3. Switch to SQL Server Profiler. Note the statement trace occurring in SQL Server Profiler.
6. Close SQL Server Management Studio and SQL Server Profiler without saving any changes.
Objectives
After completing this lab, you will have:
Password: Pa$$w0rd
3. In File Explorer, navigate to the D:\Labfiles\Lab01\Starter folder, right-click the Setup.cmd file, and
then click Run as administrator.
4. In the User Account Control dialog box, click Yes, and then wait for the script to finish.
1. Check That the Database Engine and Reporting Services Have Been Installed
2. Ensure That All Required Services Including SQL Server Agent Are Started and Set To Autostart for Both
Instances
3. Configure the TCP Port for the SQL3 Database Engine Instance to 51550
Task 1: Check That the Database Engine and Reporting Services Have Been Installed
1. Open SQL Server Configuration Manager.
2. Check the installed list of services for the MSSQLSERVER instance and ensure that the database
engine and Reporting Services have been installed for the default instance.
Task 2: Ensure That All Required Services Including SQL Server Agent Are Started and
Set To Autostart for Both Instances
1. Ensure that all of the services for the default instance are set to autostart. (Ignore the SQL Full-text
Filter Daemon Launcher service at this time.)
Task 3: Configure the TCP Port for the SQL3 Database Engine Instance to 51550
1. Using the property page for the TCP/IP server protocol, configure the use of the fixed port 51550.
(Make sure that you clear the dynamic port.)
3. Ensure that the SQL3 database engine instance has been restarted successfully.
Question: How can you configure SQL Server to use a different IP port?
Question: What is the difference between a version of SQL Server and an edition of SQL
Server?
Module 2
Designing and Implementing Tables
Contents:
Module Overview
Module Overview
In relational database management systems (RDBMSs), user and system data is stored in tables. Each table
consists of a set of rows that describe entities and a set of columns that hold the attributes of an entity.
For example, a Customer table would have columns such as CustomerName and CreditLimit and a row
for each customer. In Microsoft® SQL Server® data management software, tables are contained within
schemas that are very similar in concept to folders that contain files in the operating system. Designing
tables is often one of the most important roles that a database developer undertakes because incorrect
table design leads to the inability to query the data efficiently. After an appropriate design has been
created, it is then important to know how to correctly implement the design.
Objectives
After completing this module, you will be able to:
Design tables.
Lesson 1
Using Data Types
The most basic types of data that get stored in database systems are numbers, dates, and strings. There is
a range of data types that can be used for each of these. In this lesson, you will see the Microsoft-supplied
data types that you can use for numeric and date-related data. You will also see what NULL means and
how to work with it. In the next lesson, you will see how to work with string data types.
Lesson Objectives
After completing this lesson, you will be able to:
Constraining Values
Data types are a form of constraint that is placed on
the values that can be stored in a location. For
example, if you choose a numeric data type, you
will not be able to store text in the location.
In addition to constraining the types of values that can be stored, data types also constrain the range of
values that can be stored. For example, if you choose a smallint data type, you can only store values
between –32,768 and +32,767.
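This range constraint can be demonstrated directly; the following sketch (variable name illustrative) shows a value that fits and one that does not:

```sql
DECLARE @SmallValue smallint;

SET @SmallValue = 32767;   -- succeeds: within the smallint range
SET @SmallValue = 32768;   -- fails with an arithmetic overflow error
```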
Query Optimization
When SQL Server identifies that the value in a column is an integer, it may be able to generate an entirely
different and more efficient query plan than one where it identifies that the location is holding text values.
The data type also determines which sorts of operations are permitted on that data and how those
operations work.
Self-Documenting Nature
Choosing an appropriate data type provides a level of self-documentation. If all values were stored in
string data types (which could potentially represent any type of value) or XML data types, you would
probably need to store separate documentation about what sort of values can be stored in those locations.
Data Types
There are three basic sets of data types:
System data types. SQL Server provides a large number of built-in (or intrinsic) data types. Examples of
these include integer, varchar, and date.
Alias data types. Users can also define data types that provide alternate names for the system data types
and potentially further constrain them. These are known as alias data types. For example, you could use
an alias data type to define the name PhoneNumber as being equivalent to nvarchar(16). Alias data
types can help to provide consistency of data type usage across applications and databases.
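A minimal sketch of the PhoneNumber example above (the schema, type, and variable names are illustrative):

```sql
-- Define an alias data type based on nvarchar(16)
CREATE TYPE dbo.PhoneNumber FROM nvarchar(16) NOT NULL;

-- The alias type can then be used wherever a data type is expected
CREATE TABLE dbo.Contacts
(
    ContactID int NOT NULL,
    HomePhone dbo.PhoneNumber
);
```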
User-defined data types. By using managed code via SQL Server integration with the common language
runtime (CLR), you can create entirely new data types. There are two categories of these CLR types. One
category is system CLR data types, such as the geometry and geography spatial data types. The other is
user-defined CLR data types, which enable users to create their own data types.
Question: Why would it be faster to compare two integer variables that are holding the
values 3,240 and 19,704 than two varchar(10) variables that are holding the values "3240"
and "19704"?
smallint is stored in 2 bytes (that is, 16 bits) and stores values from –32,768 to 32,767.
int is stored in 4 bytes (that is, 32 bits) and stores values from –2,147,483,648 to 2,147,483,647. It is a
very commonly used data type. SQL Server uses the full word “integer” as a synonym for “int.”
bigint is stored in 8 bytes (that is, 64 bits) and stores very large integer values. Although it is easy to
refer to a 64-bit value, it is hard to comprehend how large these values are. If you placed a value of
zero in a 64-bit integer location and executed a loop to simply add one to the value, on most
common servers currently available, you would not reach the maximum value for many months.
decimal is an ANSI-compatible data type that enables you to specify the number of digits of
precision and the number of decimal places (referred to as the scale). A decimal(12,5) location can
store up to 12 digits with up to five digits after the decimal point. decimal is the data type that you
should use for monetary or currency values in most systems and any exact fractional values such as
sales quantities (where part quantities can be sold) or weights.
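For example, a decimal(12,5) quantity and a decimal price multiply exactly, with no rounding drift (values are illustrative):

```sql
DECLARE @Quantity decimal(12,5) = 3.75000;  -- up to 12 digits, 5 after the decimal point
DECLARE @Price    decimal(18,2) = 19.99;    -- a common choice for currency values

SELECT @Quantity * @Price AS ExtendedPrice; -- 74.9625, exact
```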
money and smallmoney are data types that are specific to SQL Server and have been present since
the early days of the platform. They store currency values with a fixed precision of four decimal
places.
Note: Four is often the wrong number of decimal places for many monetary applications,
and the money and smallmoney data types are not standard data types. In general, use decimal
for monetary values.
Note that there is no literal string format for bit values in SQL Server. The string values TRUE and FALSE
can be converted to bit values, as can the integer values 1 and 0. TRUE is converted to 1 and FALSE is
converted to 0.
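These conversions can be seen directly:

```sql
SELECT CAST('TRUE' AS bit)  AS TrueValue,   -- returns 1
       CAST('FALSE' AS bit) AS FalseValue,  -- returns 0
       CAST(1 AS bit)       AS OneValue;    -- returns 1
```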
Higher-level programming languages differ about how they store true values in Boolean columns. Some
languages store true values as 1; others store true values as –1. In two's complement notation (which is
the encoding used to store smallint, int, and bigint), a 1-bit value would range from –1 to 0.
To avoid any chance of mismatch, in general, when working with bits in applications, test for false values
by using the following code.

IF (@InputValue = 0)   -- test for false
IF (@InputValue <> 0)  -- test for true (any nonzero value)

This is preferable to testing for a value being equal to 1 because it provides more reliable code across
languages.
bit, along with other data types, is also nullable, which can be a surprise to new users. That means that a
bit location can be in three states: NULL, 0, or 1. (Nullability is discussed in more detail later in this
module.)
Question: What would be a suitable data type for storing the value of a check box that can
be 0 for cleared, 1 for selected, or –1 for disabled?
The float data type is an approximate numeric type in SQL Server and occupies either 4 or 8 bytes,
enabling the storage of approximate values with a defined precision. The permitted values of n in float(n)
are from 1 to 53, and the default is 53. Even though a range of values is provided for in the syntax, the
current SQL Server implementation is that if n is from 1 to 24, it is implemented as 24; for any larger
value, 53 is used.
Common Errors
A very common error for new developers is to use approximate numeric data types to store values that
need to be stored exactly. This causes rounding and processing errors. A “code smell” for identifying
programs that new developers have written is a column of numbers that do not exactly add up to the
displayed totals. It is common for small rounding errors to creep into calculations, for example, a total that
is incorrect by 1 cent in dollar-based or euro-based currencies.
The inappropriate use of numeric data types can cause processing errors. Look at the following code and
decide how many times the PRINT statement would be executed.
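A sketch of such a loop, consistent with the output shown below (the variable name is illustrative):

```sql
DECLARE @Value float = 0;

WHILE (@Value <> 1.0)          -- compares an approximate value against an exact one
BEGIN
    SET @Value = @Value + 0.1; -- 0.1 cannot be stored exactly in a float
    PRINT @Value;
END;
```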
It might surprise you to learn that this query would never stop running and would need to be cancelled.
After cancelling the query, if you looked at the output, you would see the following code.
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
1.1
1.2
1.3
1.4
1.5
1.6
1.7
…
What has happened? The problem is that the value 0.1 cannot be stored exactly in a float or real data
type, so the termination value of the loop is never hit exactly. If a decimal value had been used instead,
the loop would have executed as expected.
Consider how you would write the answer to 1÷3 in decimal form. The answer isn't 0.3, it is 0.3333333
recurring. There is no way in decimal form to write 1÷3 as an exact decimal fraction. You have to
eventually settle for an approximate value.
The same problem occurs in binary fractions; it just occurs at different values. 0.1 ends up being stored as
the equivalent of 0.099999 recurring. 0.1 in decimal form is a nonterminating fraction in binary. Therefore,
when you put the system in a loop adding 0.1 each time, the value never exactly equals 1.0, which can be
stored precisely.
The time data type is aligned to the SQL standard form of hh:mm:ss with optional decimal places up to
hh:mm:ss.nnnnnnn. Note that when you are defining the data type, you need to specify the number of
decimal places, such as time(4), if you do not want to use the default value of seven decimal places, or if
you want to save some storage space. The format that SQL Server uses is similar to the ISO 8601 definition
for TIME.
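A short sketch of the precision specification (variable names are illustrative):

```sql
DECLARE @DefaultTime time    = '23:59:59.1234567';  -- time(7) by default
DECLARE @ShorterTime time(4) = '23:59:59.1234567';  -- rounded to 4 decimal places

SELECT @DefaultTime AS DefaultPrecision,
       @ShorterTime AS FourDecimalPlaces;
```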
The ISO 8601 standard makes it possible to use 24:00:00 to represent midnight and to have a leap second
over 59. These are not supported in the SQL Server implementation.
The datetime2 data type is a combination of a date data type and a time data type.
Another problem with the datetime data type is that the way it converts strings to dates is based on
language format settings. A value in the form “YYYYMMDD” will always be converted to the correct date,
but a value in the form “YYYY-MM-DD” might end up being interpreted as “YYYY-DD-MM,” depending
on the settings for the session.
It is important to understand that this behavior does not happen with the new date data type, so a string
that was in the form “YYYY-MM-DD” could be interpreted as two different dates by the date (and
datetime2) data type and the datetime data type. You should specifically check any of the formats that
you intend to use, or always use formats that cannot be misinterpreted. Another option that was
introduced in SQL Server 2012 can help. A series of functions that enable date and time values to be
created from component parts was introduced. For example, there is now a DATEFROMPARTS function
that enables you to create a date value from a year, a month, and a day.
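Both approaches can be sketched briefly (the dates are illustrative):

```sql
-- Unambiguous: 'YYYYMMDD' is always interpreted correctly for datetime
SELECT CAST('20140214' AS datetime) AS FromUnambiguousString;

-- Build a date from component parts (SQL Server 2012 and later)
SELECT DATEFROMPARTS(2014, 2, 14) AS FromParts;
```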
Time Zones
The datetimeoffset data type is a combination of a datetime2 data type and a time zone offset. Note
that the data type is not aware of the time zone; it can simply store and retrieve time zone values.
Note that the time zone offset values extend for more than a full day (a range of –14:00 to +14:00). A
range of system functions has been provided for working with time zone values, and for all of the data
types related to dates and times.
Question: Why is the specification of a date range from the year 0000 to the year 9999
based on the Gregorian calendar not entirely meaningful?
Unique Identifiers
Globally unique identifiers (GUIDs) have become
common in application development. They are used
to provide a mechanism where any process can
generate a number and know that it will not clash
with a number that any other process has
generated.
GUIDs
Numbering systems have traditionally depended on
a central source for the next value in a sequence to
make sure that no two processes use the same
value. GUIDs were introduced to avoid the need for
anyone to function as the “number allocator.” Any
process (on any system) can generate a value and know that it will not clash with a value generated by
any process across time and space and on any system to an extremely high degree of probability.
This is achieved by using extremely large values. When discussing the bigint data type earlier, you learned
that the 64-bit bigint values were really large. GUIDs are 128-bit values. The magnitude of a 128-bit value
is well beyond our capabilities of comprehension.
The IDENTITY property is used to automatically assign values to columns. (IDENTITY is discussed in
Module 3.) The IDENTITY property is not used with uniqueidentifier columns. New values are not
calculated by code in your process. They are calculated by calling system functions that generate a value
for you. In SQL Server, this function is the NEWID() function.
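For example (the table and column names are illustrative):

```sql
-- Generate a GUID on demand
SELECT NEWID() AS GeneratedGuid;

-- Use NEWID() as a column default so new rows receive a GUID automatically
CREATE TABLE dbo.AuditEntry
(
    AuditEntryID     uniqueidentifier NOT NULL DEFAULT NEWID(),
    EventDescription nvarchar(200)    NOT NULL
);
```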
The random nature of GUIDs has also caused significant problems in current storage subsystems. SQL
Server 2005 introduced the NEWSEQUENTIALID() function to try to circumvent the randomness of the
values that the NEWID() function generated. However, the function does so at the expense of some
guarantee of uniqueness.
The usefulness of the NEWSEQUENTIALID() function is also quite limited because the main reason for
using GUIDs is to enable other layers of code to generate the values and know that they can just insert
them into a database without clashes. If you need to request a value from the database via
NEWSEQUENTIALID(), it usually would have been better to use an IDENTITY column instead.
A very common development error is to store GUIDs in string values rather than in uniqueidentifier
columns.
Question: The slide mentions that a common error is to store GUIDs as strings. What would
be wrong with this?
NULL
NULL is a state of a column in a particular row,
rather than a type of value that is stored in a
column. You do not say that a value equals NULL;
you say that a value is NULL. This is why, in
Transact-SQL, you do not check whether a value is
NULL with the equality operator. For example, you
would not write the following code.
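The comparison to avoid, and its correct form, look like this (the table and column names are illustrative):

```sql
-- Incorrect: a predicate comparing to NULL with = never evaluates to TRUE
SELECT * FROM Sales.Orders WHERE ShippedDate = NULL;

-- Correct: test the state of the column with IS NULL
SELECT * FROM Sales.Orders WHERE ShippedDate IS NULL;
```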
Common Errors
New developers often confuse NULL values with zero, blank (or space), zero-length strings, and so on. The
misunderstanding is exacerbated by other database engines that treat NULL and zero-length strings or
zeroes as identical. NULL indicates the absence of a value.
Careful consideration must be given to the nullability of a column. In addition to specifying a data type
for a column, you specify whether a value needs to be present. (Often, this is referred to as whether a
column is mandatory.)
Look at the NULL and NOT NULL declarations on the slide and decide why each decision might have been
made.
Demonstration Steps
Work with NULL and insert GUIDs into a table
1. Ensure that the 20464C-MIA-DC and 20464C-MIA-SQL virtual machines are running and then log on
to 20464C-MIA-SQL as ADVENTUREWORKS\Student with the password Pa$$w0rd.
2. Run D:\Demofiles\Mod02\Setup.cmd as an administrator to revert any changes.
4. In the Connect to Server window, in Server name, type MIA-SQL and then click Connect.
6. If Solution Explorer is not visible, click the View menu and click Solution Explorer.
7. Expand the Queries folder and then double-click 11 - Demonstration 1A.sql.
8. Follow the instructions contained within the comments of the script file.
Lesson 2
Working with Character Data
In the last lesson, you saw that the most basic types of data stored in database systems today are
numbers, dates, and strings, and you looked at the range of data types that can be used for numeric and
date-related data. In this lesson, you will look at the other very common category of data: the
string-related data types.
Another common class of design and implementation errors relates to collations. Collations define how
string data is sorted. In this lesson, you will also see how collations are defined and used.
Lesson Objectives
After completing this lesson, you will be able to:
Unicode
Traditionally, most computer systems stored one
character per byte. This only allowed for 256
different character values, which is not enough to
store characters from many languages.
When a user types a phonetic value in an Input Method Editor (IME), a numbered list of candidate
characters is displayed; the user can then enter the number beside the character to select the intended
word. This might not seem important to an English-speaking person, but given that the first option means
“horse”, the second option is like a question mark, and the third option means “mother”, there is
definitely a need to select the correct option!
Character Groups
An alternate way to enter the characters is via radical groupings. Please note the third character in the
screenshot above. The left-hand part of that character, 女, means “woman”. Rather than entering English-
like characters (that could be quite unfamiliar to the writers), select a group of characters based on what is
known as a radical.
Please note that the character representing “mother” is the first character on the second line. For this sort
of keyboard entry to work, the characters must be in appropriate groups, not just stored as one large sea
of characters. An additional complexity is that the radicals themselves are also in groups. You can see in
the screenshot that the woman radical was part of the third group of radicals.
Unicode
In the 1980s, a variety of researchers worked to determine how many bytes would be required to hold all
characters from all languages while also storing them in their correct groupings. The answer from all
researchers was three bytes. Three was not an ideal number for computing, however, and at the time most
systems worked with 2-byte (that is, 16-bit) words.
Unicode introduced a two-byte character set that attempts to fit the values from the three bytes into two
bytes. Inevitably then, trade-offs had to occur.
Unicode allows any combination of characters that are drawn from any combination of languages to exist
in a single document. There are multiple encodings for Unicode, including UTF-7, UTF-8, UTF-16, and
UTF-32. (UTF stands for Unicode Transformation Format.) SQL Server currently implements double-byte
UTF-16 characters for its Unicode implementation.
For string literal values, an N prefix on a string allows the entry of double-byte characters into the string
rather than just single-byte characters. (N stands for “National” in “National Character Set”).
When working with character strings, the LEN function returns the number of characters (Unicode or not)
whereas DATALENGTH returns the number of bytes.
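For example:

```sql
SELECT LEN(N'Hello')        AS CharacterCount,  -- 5 characters
       DATALENGTH(N'Hello') AS ByteCount;       -- 10 bytes (2 bytes per Unicode character)
```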
Trailing Spaces
DECLARE @String1 char(10);
DECLARE @String2 char(10);
SET @String1 = 'Hello';
SET @String2 = 'There';
SELECT @String1 + @String2;  -- returns 'Hello     There     '

Note the trailing spaces: char values are always padded to their declared length. The char and nchar
data types are not very useful for data that varies in length, but they are ideal for short strings that are
always the same length, for example, state codes in the U.S.A.
The varchar and nvarchar data types are limited to 8000 and 4000 characters, respectively. This is
roughly what fits in a data page in a SQL Server database.
char is restricted to a particular code page, so it is likely that applications will not be able to store input
values that do not fit in that code page. This could be as simple as an accent in a user's name. These
problems also occur when exporting data. For example, you might send data to a vendor to produce a
report, a code page mismatch occurs, and the output appears as square boxes or question marks. The
nchar and nvarchar types support the main Unicode plane and avoid these encoding conversion
problems. This is particularly important for web applications, where browsers may be set to any number
of code pages.
Question: Why would you use the sysname data type rather than the nvarchar(128) data
type?
Understanding Collations
Collations in SQL Server are used to control the
code page that is used to store non-Unicode data
and the rules that govern how SQL Server sorts and
compares character values.
Code Pages
It was mentioned earlier that computer systems
traditionally stored one byte per character. This
allowed for 256 possible values, with a range from 0
to 255. The values from 0 to 31 were reserved for
“control characters” such as backspace (character 8)
and tab (character 9). Character 32 was allocated
for a space and so on, up to the Delete character
which was assigned the value 127.
For values above 127 though, standards were initially not very clear. It was common to store characters
such as line drawing characters or European characters with accents or umlauts in these codes.
In fact, a number of computer systems only used 7 bits to store characters instead of 8 bits. (As an
example, the DEC10 system from Digital Equipment Corporation stored 5 characters of 7 bits each per 36-
bit computer “word”. It used the final bit as a parity check bit).
Problems did arise when different vendors used the upper characters for different purposes. In the 1970s,
it was not uncommon to type a character on your screen and see a different character when that
document was printed, as the screen and the printer were using different characters in the values above
127.
A number of standard character sets that described what should be in the upper code values did appear.
The MS-DOS operating system categorized these as “code pages”. What a code page really defines is
which characters are used for the values from 128 to 255.
Both the operating systems and SQL Server support a range of code pages. A default code page is chosen
while installing SQL Server.
A collation name consists of several parts:

SortRules         A string identifying the alphabet or language whose rules are applied
                  when dictionary sorting is specified.
CodePage          One to four digits that define the code page used by the collation. For
                  curious historic reasons, CP1 specifies code page 1252, but for all
                  others the number indicates the code page; for example, CP850 specifies
                  code page 850.
ComparisonStyle   Either BIN for binary, or a combination of case and accent sensitivity:
                  CI is case-insensitive, CS is case-sensitive; AI is accent-insensitive,
                  AS is accent-sensitive.

Windows collation names use a similar format with fewer fields. For example, the Windows collation
Latin1_General_CI_AS refers to Latin1_General as the alphabet being used, and is case-insensitive and
accent-sensitive.
Collation Issues
The main issues with collations occur when you try to compare values that are stored with different
collations. It is possible to set default collations for servers, databases, and even columns.
When comparing values from different collations, you need to then specify which collation (which could
be yet another collation) will be used for the comparison.
Another use of this is as shown in the example in the slide. In this case, you are forcing the query to
perform a case-sensitive comparison between the string '%ball%' and the value in the column. If the
column contained 'Ball', it would not then match.
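Such a comparison can be forced with the COLLATE clause. The following query is a sketch of the technique; the table and column names are assumptions, not taken from the course files:

```sql
-- Force a case-sensitive comparison, overriding the column's default
-- collation. A value of 'Ball' would not match '%ball%' here.
SELECT ProductID, ProductName
FROM dbo.Product
WHERE ProductName COLLATE Latin1_General_CS_AS LIKE '%ball%';
```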
Question: What are the code page and sensitivity values for the collation
SQL_Scandinavian_Cp850_CI_AS?
SC Collations
SQL Server 2012 (code-named “Denali”) introduced support for collations with supplementary characters. Current Microsoft Windows® operating systems already support these SC collations. Supplementary characters are stored in four bytes per character. The two consecutive 16-bit words that are used to store these characters are known as surrogate pairs.
Unicode characters are defined in 17 planes. Planes are ranges of allowed values. The planes of particular interest are denoted in the standard as follows:
0x0000 to 0xFFFF is the Basic Multilingual Plane (BMP)
The supplementary multilingual plane mostly includes further Asian language elements and the other
planes include less common (but still useful) characters such as musical notes. SQL Server collations that
have an SC suffix (such as Japanese_Bushu_Kakusu_100_CI_AS_SC) permit the use of supplementary
characters.
Lesson 3
Designing Tables
The most important aspect of designing tables involves determining what data each column will hold. All
organizational data is held within database tables, so it is critical to store the data with an appropriate
structure.
The best practices for table and column design are often represented by a set of rules that are known as
“normalization” rules. In this lesson, you will learn the most important aspects of normalized table design
along with the appropriate use of primary and foreign keys. In addition, you will learn to work with the
system tables that are supplied when SQL Server is installed.
Lesson Objectives
After completing this lesson, you will be able to:
What Is a Table?
Relational databases store data about entities in
tables that are defined by columns and rows. Rows
represent entities and columns define the attributes
of the entities. The rows of a table have no
predefined order and can be used as a security
boundary.
Tables
Tables store data about entities such as customers, suppliers, orders, products, and sales. Each row of a
table represents the details of a single entity, such as a single customer, supplier, order, product, or sale.
Columns define the information that is being held about each entity. For example, a Product table might
have columns such as ProductID, Size, Name, and UnitWeight. Each of these columns is defined by
using a specific data type. For example, the UnitWeight column of a product might be allocated a
decimal(18,3) data type.
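A table such as the Product table described above might be sketched as follows; the exact column definitions are illustrative, not part of any shipped schema:

```sql
CREATE TABLE dbo.Product
(
    ProductID  int           NOT NULL,
    Name       nvarchar(50)  NOT NULL,
    Size       nvarchar(5)   NULL,
    UnitWeight decimal(18,3) NULL   -- weight held with three decimal places
);
```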
Naming Conventions
Strong disagreement exists in the industry over naming conventions for tables. The use of prefixes (such
as tblCustomer or tblProduct) is widely discouraged. Prefixes were widely used in higher-level
programming languages before the advent of strong typing (that is, the use of strict data types rather
than generic data types), but are now rare. The main reason for this is that names should represent the
entities, not how they are stored. For example, during a maintenance operation, it might become
necessary to replace a table with a view or vice versa. This could lead to views named tblProduct or
tblCustomer when trying to avoid breaking existing code.
Another area of strong disagreement relates to whether table names should be singular or plural. For
example, should a table that holds the details of a customer be called Customer or Customers?
Proponents of plural naming argue that the table holds the details of many customers, whereas
proponents of singular naming argue that it is common to expose these tables via object models in
higher-level languages and that the use of plural names complicates this process. Although we might
have a Customers table, in a high-level language, we are likely to have a Customer object. SQL Server
system tables (and views) have plural names.
The argument is not likely to be resolved either way and is not a problem that is specific to the SQL
language. For example, an array of customers in a higher-level language could sensibly be called
“Customers,” yet referring to a single customer via “Customers[49]” seems awkward. The most important
aspect of naming conventions is that you should adopt a naming convention that you can work with and
apply it consistently.
Security
It is possible to use tables as security boundaries because users can be assigned permissions at the table
level. However, note that SQL Server supports the assignment of permissions at the column level in
addition to at the table level. Row-level security is not available for tables, but can be implemented via a
combination of views, stored procedures, and/or triggers.
Row Order
Tables are containers for rows, but they do not define any order for the rows that they contain. When
users select rows from a table, they should only specify the order that the rows should be returned in if
the output order matters. SQL Server may have to expend additional sorting effort to return rows in a
given order and it is important that this effort is only expended when necessary.
Normalizing Data
Normalization is a systematic process that is used to
improve the design of databases.
Normalization
Codd introduced first normal form in 1970, followed by second normal form, and then third normal form
in 1971. Since that time, higher forms of normalization have been introduced by theorists, but most
database designs today are considered to be “normalized” if they are in third normal form.
Intentional Denormalization
Not all databases should be normalized. It is common to intentionally denormalize databases for
performance reasons or for ease of end-user analysis.
For example, dimensional models that are widely used in data warehouses (such as the data warehouses
that are commonly used with SQL Server Analysis Services) are intentionally designed not to be
normalized.
Tables might also be denormalized to avoid the need for time-consuming calculations or to minimize
physical database design constraints such as locking.
Although there is disagreement on the interpretation of these rules, general agreement exists on most
common symptoms of violating the rules.
First Normal Form
Eliminate repeating groups in individual tables. Create a separate table for each set of related data.
Identify each set of related data by using a primary key.
No repeating groups should exist. For example, a Product table should not include columns such as
Supplier1, Supplier2, and Supplier3. Column values should not include repeating groups. For example, a
column should not contain a comma-separated list of suppliers.
Duplicate rows should not exist in tables. You can use unique keys to avoid having duplicate rows. A
candidate key is a column or set of columns that you can use to uniquely identify a row in a table. An
alternate interpretation of first normal form rules would disallow the use of nullable columns.
Second Normal Form
Create separate tables for sets of values that apply to multiple records. Relate these tables by using a foreign key.
A common error with second normal form would be to hold the details of products that a supplier
provides in the same table as the details of the supplier's credit history. These values should be stored
separately.
Third Normal Form
Eliminate fields that do not depend on the key.
Imagine a Sales table that had OrderNumber, ProductID, ProductName, SalesAmount, and SalesDate
columns. This table would not be in third normal form. A candidate key for the table might be the
OrderNumber column. The ProductName column only depends on the ProductID column, and not on
the candidate key. The Sales table should be separated from a Product table and likely linked to it by
ProductID.
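The decomposition described above might be sketched as follows; all names and data types are assumptions:

```sql
-- ProductName depends only on ProductID, so it moves to its own table.
CREATE TABLE dbo.Product
(
    ProductID   int          NOT NULL PRIMARY KEY,
    ProductName nvarchar(50) NOT NULL
);

CREATE TABLE dbo.Sales
(
    OrderNumber int           NOT NULL PRIMARY KEY,
    ProductID   int           NOT NULL
        REFERENCES dbo.Product (ProductID),
    SalesAmount decimal(18,2) NOT NULL,
    SalesDate   date          NOT NULL
);
```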
Formal database terminology is precise, but can be hard to follow when it is first encountered. In the next
demonstration, you will see examples of common normalization errors.
Primary Keys
A primary key is a form of constraint that is applied
to a table. A candidate key is used to identify a
column or set of columns that can be used to
uniquely identify a row. A primary key is chosen
from any potential candidate keys.
Primary Key
A primary key must be unique and cannot be NULL.
Primary keys are a form of constraint. (Constraints
are discussed later in this course.)
Consider a table that holds an EmployeeID column
and a NationalIDNumber column, along with the
employee's name and personal details. The EmployeeID and NationalIDNumber columns are both likely
to be possible candidate keys. In this case, the EmployeeID column might be the primary key, but either
candidate key could be used. You will see later that some data types will lead to better performing
systems when they are used as primary keys, but logically any candidate key could be nominated to be
the primary key.
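A sketch of this scenario follows; the column definitions are assumptions. Note that the candidate key that is not chosen as the primary key can still be enforced with a UNIQUE constraint:

```sql
CREATE TABLE dbo.Employee
(
    EmployeeID       int           NOT NULL PRIMARY KEY,  -- chosen primary key
    NationalIDNumber nvarchar(15)  NOT NULL UNIQUE,       -- remaining candidate key
    FullName         nvarchar(100) NOT NULL
);
```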
It may be necessary to combine multiple columns into a key before the key can be used to uniquely
identify a row. In formal database terminology, no candidate key is more important than any other
candidate key. However, when tables are correctly normalized, they will usually have only a single
candidate key that could be used as a primary key. However, this is not always the case. Ideally, keys that
are used as primary keys should not change over time.
Natural vs. Surrogate Keys
A surrogate key is another form of key that is used as a unique identifier within a table, but it is not
derived from “real” data. Natural keys are formed from data within the table.
For example, a Customer table may have a CustomerID or CustomerCode column that contains numeric,
GUID, or alphanumeric codes. The surrogate key would not be related to the other attributes of a
customer.
The use of surrogate keys is another topic that can lead to strong debate between database professionals.
Foreign Keys
A foreign key is used to establish references or
relationships between tables.
Self-Referencing Tables
A table can hold a foreign key reference to itself. For example, an Employees table might contain a
ManagerID column. An employee's manager is also an employee. A foreign key reference can be made
from the ManagerID column of the Employees table to the EmployeeID column in the same table.
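A self-referencing foreign key of this kind might be declared as follows (a minimal sketch; the column definitions are assumptions):

```sql
CREATE TABLE dbo.Employees
(
    EmployeeID int           NOT NULL PRIMARY KEY,
    FullName   nvarchar(100) NOT NULL,
    ManagerID  int           NULL
        REFERENCES dbo.Employees (EmployeeID)  -- a manager is also an employee
);
```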
Reference Checking
It is not possible to update or delete referenced keys unless options that cascade the changes to related
tables are used. For example, you cannot change the ID for a customer when there are orders in a
CustomerOrders table that reference that customer's ID.
Tables might also include multiple foreign key references. For example, an Orders table might have
foreign keys that refer to a Customers table and a Products table.
Terminology
Foreign keys are referred to as being used to “enforce referential integrity.” Foreign keys are a form of
constraint and will be covered in more detail in a later module.
The ANSI SQL 2003 definition refers to self-referencing tables as having “recursive foreign keys.”
This approach allowed SQL Server to have improved designs for these tables while avoiding the chance of breaking existing applications. For example, when it was necessary to expand the syslogins table, a new sysxlogins table was added instead of changing the existing table.
In SQL Server 2005, these tables were hidden and replaced by a set of system views that show the
contents of the system tables. These views are permission-based and display data to a user only if the user
has appropriate permission to view the data.
msdb is the database that SQL Server Agent uses, primarily for organizing scheduled background tasks
that are known as “jobs.” A large number of system tables are still present in the msdb database. Again,
while it is acceptable to query these tables, they should not be directly modified. Unless the table is
documented, no dependency on its format should be taken when designing applications.
Lesson 4
Working with Schemas
SQL Server 2005 introduced a change to how schemas are used. Since that version, schemas are used as
containers for objects such as tables, views, and stored procedures. Schemas can be particularly helpful in
providing a level of organization and structure when large numbers of objects are present in a database.
It is also possible to assign security permissions at the schema level rather than individually on the objects
that are contained within the schemas. Doing this can greatly simplify the design of system security
requirements.
Lesson Objectives
After completing this lesson, you will be able to:
What Is a Schema?
Schemas are used to contain objects and to provide
a security boundary for the assignment of
permissions. In SQL Server, schemas are used as
containers for objects, rather like a folder is used to
hold files at the operating system level. Since their
introduction in SQL Server 2005, schemas can be
used to contain objects such as tables, stored
procedures, functions, types, and views. Schemas
form a part of the multipart naming convention for
objects. In SQL Server, an object is formally referred
to by a name of the form
Server.Database.Schema.Object.
Security Boundary
Schemas can be used to simplify the assignment of permissions. An example of applying permissions at
the schema level would be to assign the EXECUTE permission on a schema to a user. The user could then
execute all stored procedures within the schema. This simplifies the granting of permissions because there
is no need to set up individual permissions on each stored procedure.
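For example, a schema-level EXECUTE grant covers every stored procedure in the schema, including procedures created later. The schema and user names here are hypothetical:

```sql
-- One grant instead of one grant per stored procedure.
GRANT EXECUTE ON SCHEMA::Sales TO SalesAppUser;
```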
It is important to understand that schemas are not used to define physical storage locations for data, as
occurs in some other database engines.
If you are upgrading applications from SQL Server 2000 and earlier versions, it is important to understand
that the naming convention changed when schemas were introduced. Previously, names were of the form
Server.Database.Owner.Object.
Objects still have owners, but the owner's name does not form a part of the multipart naming convention
from SQL Server 2005 onward. When upgrading databases from earlier versions, SQL Server will
automatically create a schema that has the same name as existing object owners, so that applications that
use multipart names will continue to work.
When locating an object, SQL Server will first check the user's default schema. If the object is not found,
SQL Server will then check the dbo schema to try to locate the object.
It is important to include schema names when referring to objects, instead of depending upon schema name resolution.
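For example (using an illustrative table name):

```sql
-- Relies on name resolution: the user's default schema is checked first,
-- then the dbo schema.
SELECT * FROM Owner;

-- Preferred: schema-qualified, independent of default schema settings.
SELECT * FROM PetStore.Owner;
```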
Apart from rare situations, using multipart names leads to more reliable code that does not depend upon
default schema settings.
Creating Schemas
Schemas are created by using the CREATE
SCHEMA command. This command can also
include the definition of objects to be created
within the schema at the time the schema is
created.
CREATE SCHEMA
Schemas have both names and owners. In the first
example shown on the slide, a schema named
Reporting is being created. It is owned by the user,
Terry. Although both schemas and the objects
contained in the schemas have owners and the
owners do not have to be the same, having
different owners for schemas and the objects contained within them can lead to complex security issues.
Besides creating schemas, the CREATE SCHEMA statement can include options for object creation.
Although the second example on the slide might appear to be three statements (CREATE SCHEMA,
CREATE TABLE, and GRANT), it is in fact a single statement. Both CREATE TABLE and GRANT are
options that are being applied to the CREATE SCHEMA statement.
Within the newly created KnowledgeBase schema, the Article table is being created and the SELECT permission on the schema is being granted to Salespeople.
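The slide examples described above are not reproduced in this extract; they would read roughly as follows. The column definitions of the Article table are assumptions:

```sql
-- First example: a schema named Reporting, owned by the user Terry.
CREATE SCHEMA Reporting AUTHORIZATION Terry;
GO

-- Second example: a single statement that creates the schema, creates a
-- table within it, and grants a permission.
CREATE SCHEMA KnowledgeBase
    CREATE TABLE Article
    (
        ArticleID int           NOT NULL PRIMARY KEY,
        Title     nvarchar(100) NOT NULL
    )
    GRANT SELECT ON SCHEMA::KnowledgeBase TO Salespeople;
GO
```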
Statements such as the second CREATE SCHEMA example on the slide can lead to issues if the entire
statement is not executed together.
Create a schema, create a schema with an included object, and drop a schema.
Demonstration Steps
Create a schema, create a schema with an included object, and drop a schema
1. Ensure that the 20464C-MIA-DC and 20464C-MIA-SQL virtual machines are running and then log on
to 20464C-MIA-SQL as ADVENTUREWORKS\Student with the password Pa$$w0rd.
2. Ensure that you have completed the previous demonstrations in this module.
3. In the virtual machine, on the taskbar, click SQL Server 2014 Management Studio.
4. In the Connect to Server window, in Server name, type MIA-SQL and then click Connect.
5. On the File menu, click Open, click Project/Solution, navigate to
D:\Demofiles\Mod02\Demo02.ssmssln, and then click Open.
7. Follow the instructions contained within the comments of the script file.
Lesson 5
Creating and Altering Tables
Now that you understand the core concepts surrounding the design of tables, this lesson introduces you
to the Transact-SQL syntax that is used when defining, modifying, or dropping tables. Temporary tables
are a special form of table that can be used to hold temporary result sets. Computed columns are used to
create columns where the value held in the column is automatically calculated, either from expressions
involving other columns from the table or from the execution of functions.
Lesson Objectives
After completing this lesson, you will be able to:
Create tables.
Drop tables.
Alter tables.
Use temporary tables.
Creating Tables
Tables are created by using the CREATE TABLE
statement. This statement is also used to define the
columns that are associated with the table and
identify constraints such as primary and foreign keys.
CREATE TABLE
Nullability
You should specify NULL or NOT NULL for each column in the table. SQL Server has defaults for this that
you can change via the ANSI_NULL_DEFAULT setting. Scripts should always be designed to be as reliable
as possible and specifying nullability in data definition language (DDL) scripts helps to improve script
reliability.
Primary Key
You can specify a primary key constraint beside the name of a column if only a single column is included
in the key. It must be included after the list of columns when more than one column is included in the
key. See the following example, where the SalesID value is only unique for each SalesRegisterID value:
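A sketch of such a table, with names and data types assumed, might be:

```sql
CREATE TABLE dbo.SalesReceipt
(
    SalesRegisterID int           NOT NULL,
    SalesID         int           NOT NULL,  -- unique only per register
    SalesTotal      decimal(18,2) NOT NULL,
    PRIMARY KEY (SalesRegisterID, SalesID)   -- composite key follows the columns
);
```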
Primary keys are constraints and are more fully described along with other constraints in a later module.
Dropping Tables
The DROP TABLE statement is used to drop tables
from a database. If a table is referenced by a
foreign key constraint, it cannot be dropped.
Altering Tables
Altering a table is useful because permissions on
the table are retained along with the data in the
table. If you drop and re-create the table with a
new definition, both the permissions on the table
and the data in the table are lost. If the table is
referenced by a foreign key, it cannot be dropped.
However, it can be altered.
Note that the syntax for adding and dropping columns is inconsistent. The word COLUMN is required for DROP, but is not used for ADD; in fact, it is not even permitted as an optional keyword for ADD. If the word COLUMN is omitted in a DROP, SQL Server assumes that a constraint is being dropped.
In the slide example, the PreferredName column is being added to the PetStore.Owner table. Later, the
PreferredName column is being dropped from the PetStore.Owner table. Note the difference in syntax
regarding the word COLUMN.
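In Transact-SQL, the example described would read roughly as follows; the PreferredName data type is an assumption:

```sql
-- ADD does not use the word COLUMN.
ALTER TABLE PetStore.Owner
    ADD PreferredName nvarchar(30) NULL;

-- DROP requires the word COLUMN; without it, SQL Server assumes a
-- constraint is being dropped.
ALTER TABLE PetStore.Owner
    DROP COLUMN PreferredName;
```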
Demonstration Steps
Create tables and alter tables
1. Ensure that the 20464C-MIA-DC and 20464C-MIA-SQL virtual machines are running and then log on
to 20464C-MIA-SQL as ADVENTUREWORKS\Student with the password Pa$$w0rd.
2. Ensure that you have completed the previous demonstrations in this module.
3. In the virtual machine, on the taskbar, click SQL Server 2014 Management Studio.
4. In the Connect to Server window, in Server name, type MIA-SQL and then click Connect.
7. Follow the instructions contained within the comments of the script file.
Temporary Tables
Temporary tables are used to hold temporary result
sets within a user's session. They are created within
the tempdb database and deleted automatically
when they go out of scope. This typically occurs
when the code in which they were created
completes or aborts. Temporary tables are very
similar to other tables, except that they are only
visible to the creator and in the same scope (and
sub-scopes) within the session. They are
automatically deleted when a session ends or when
they go out of scope. Although temporary tables
are deleted when they go out of scope, you should
explicitly delete them when they are no longer required, to reduce overall resource requirements on the
server. Temporary tables are often created in code by using the SELECT INTO statement.
A table is created as a temporary table if its name has a number sign (#) prefix. A global temporary table
is created if the name has a double-number-sign (##) prefix. Global temporary tables are visible to all
users and are not commonly used.
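A minimal sketch of a local temporary table follows; all names are illustrative:

```sql
-- The # prefix makes this a local temporary table in tempdb, visible
-- only within the creating session and scope.
CREATE TABLE #TopCustomers
(
    CustomerID int           NOT NULL,
    TotalSales decimal(18,2) NOT NULL
);

-- ... work with the table ...

-- Explicit cleanup, rather than waiting for the table to go out of scope.
DROP TABLE #TopCustomers;
```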
Passing Temporary Tables
Temporary tables are also often used to pass rowsets between stored procedures. For example, a
temporary table that is created in a stored procedure is visible to other stored procedures that are
executed from within the first procedure. Although this use is possible, it is not considered good practice
in general. It breaks common rules of abstraction for coding and also makes it more difficult to debug or
troubleshoot the nested procedures. SQL Server 2008 introduced table-valued parameters (TVPs) that can
provide an alternate mechanism for passing tables to stored procedures or functions. (TVPs are discussed
later in this course.)
The overuse of temporary tables is a common Transact-SQL coding error that often leads to performance
and resource issues. Extensive use of temporary tables can be an indicator of poor coding techniques,
often due to a lack of set-based logic design.
2. Ensure that you have completed the previous demonstrations in this module.
3. In the virtual machine, on the taskbar, click SQL Server 2014 Management Studio.
4. In the Connect to Server window, in Server name, type MIA-SQL and then click Connect.
7. Follow the instructions contained within the comments of the script file.
Computed Columns
Computed columns are columns that are derived
from other columns or from the result of executing
functions.
A nonpersisted computed column is calculated every time a SELECT operation occurs on the column and
it does not consume space on disk. A persisted computed column is calculated when the data in the row
is inserted or updated and does consume space on the disk. The data in the column is then selected like
the data in any other column.
The core difference between persisted and nonpersisted computed columns relates to when the
computational performance impact is exerted. Nonpersisted computed columns work best for data that is
modified regularly, but selected rarely. Persisted computed columns work best for data that is modified
rarely, but selected regularly. In most business systems, data is read much more regularly than it is
updated. For this reason, most computed columns would perform best as persisted computed columns.
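The difference can be sketched as follows; the table and column names are assumptions:

```sql
CREATE TABLE dbo.OrderLine
(
    OrderID   int           NOT NULL,
    UnitPrice decimal(18,2) NOT NULL,
    Quantity  int           NOT NULL,
    -- Nonpersisted: recalculated on every read, no disk space consumed.
    LineTotal AS (UnitPrice * Quantity),
    -- Persisted: calculated on insert/update and stored on disk.
    LineTotalStored AS (UnitPrice * Quantity) PERSISTED
);
```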
Demonstration Steps
Work with computed columns
1. Ensure that the 20464C-MIA-DC and 20464C-MIA-SQL virtual machines are running and then log on
to 20464C-MIA-SQL as ADVENTUREWORKS\Student with the password Pa$$w0rd.
2. Ensure that you have completed the previous demonstrations in this module.
3. In the virtual machine, on the taskbar, click SQL Server 2014 Management Studio.
4. In the Connect to Server window, in Server name, type MIA-SQL and then click Connect.
The new tables need to be isolated in their own schema. You need to create the required schema called
DirectMarketing. The owner of the schema should be dbo.
When the schema has been created, if you have enough time, you need to create the tables that have
been designed.
Objectives
After completing this lab, you will be able to:
Create a schema.
Create tables.
Password: Pa$$w0rd
2. Run the Setup Windows Command Script file (Setup.cmd) in the D:\Labfiles\Lab02\Starter folder as
Administrator.
2. Review the supporting documentation for details of the PhoneCampaign, Opportunity, and
SpecialOrder tables and determine column names, data types, and nullability for each data item in
the design.
Created a schema.
Created tables.
Created a schema.
Created the tables that you designed in the first exercise of this lab.
Review Question(s)
Question: What is a primary key?
Module 3
Ensuring Data Integrity through Constraints
Contents:
Module Overview 3-1
Module Overview
The quality of data in your database largely determines the usefulness and effectiveness of applications
(and people) that rely on it, and it can play a major role in the success or failure of an organization or a
business venture. Ensuring data integrity is a critical step in maintaining high-quality data.
You should enforce data integrity at all levels of an application from first entry or collection through
storage. Microsoft® SQL Server® data management software provides various features that simplify the
enforcement of data integrity.
Objectives
After completing this module, you will be able to:
Explain the available options for enforcing data integrity and the levels at which they should be
applied.
Lesson 1
Enforcing Data Integrity
An important step in database planning is deciding the best way to enforce the integrity of the data. Data
integrity refers to the consistency and accuracy of data that is stored in a database.
Lesson Objectives
After completing this lesson, you will be able to:
Explain how data integrity checks need to apply across different layers of an application.
Application Levels
User-interface level
Data tier
User-Interface Level
There are several advantages of enforcing integrity at the user-interface level. The responsiveness to the
end user is usually higher because it is possible to trap minor errors before any calls are made to other
layers of code. Error messages are often clearer because the code is more aware of the action that the
user has taken that caused the error to occur.
The main disadvantage of enforcing integrity at the user-interface level is that more than a single
application might need to work with the same underlying data and each application might enforce the
rules differently.
Middle Tier
Many integrity issues are directly related to business logic requirements. The middle tier is often where
the bulk of those requirements exist in code. In addition, multiple user interfaces often reuse the middle
tier. Implementing integrity at this level helps to avoid different user interfaces applying different rules
and checks. At this level, the logic is still quite aware of the functions that cause errors, so the error
messages that are returned to the user can still be quite specific.
It is also easy for integrity checks that are only applied in the middle tier to be ineffective due to race
conditions. For example, it might seem easy to check that a customer exists and then enable an order to
be placed for the customer. Consider, though, the possibility that another user could remove the
customer between the time that you check for the customer's existence and the time that you record the
order.
Data Tier
The advantage of implementing integrity at the data tier is that upper layers cannot bypass it. In
particular, it is common for the same data to be accessed by multiple applications or even directly
through tools such as SQL Server Management Studio. If integrity is not maintained at the data-tier level,
all applications need to consistently apply all of the rules and checks.
The challenge of implementing some forms of integrity at the data tier (usually within the database) is
that the data tier is often unaware of the user actions that caused an error to occur, so the error messages
that are returned from this layer tend to be very precise in describing the issue, but quite cryptic for an
end user to understand. They typically need to be retranslated by upper layers of code before being
presented to end users.
Multiple Tiers
The correct solution in most situations involves applying rules and checks at multiple levels. However, the
challenge with this approach is in maintaining consistency between the rules and checks at different
application levels.
Entity (or table) integrity requires that all rows in a table have a way of being uniquely identified. This is
commonly called a primary key value. Whether the primary key value can be changed or whether the
whole row can be deleted depends on the level of integrity that is required between the primary key and
any other tables, based on referential integrity.
Referential integrity ensures that the relationships among the primary keys (in the referenced table) and
foreign keys (in the referencing tables) are always maintained. You are not permitted to insert a value in
the referencing column that doesn’t exist in the referenced column in the target table. A row in a
referenced table cannot be deleted nor can the primary key be changed if a foreign key refers to the row
unless a form of cascading action is permitted. You can define referential integrity relationships within the
same table or between separate tables.
As an example of referential integrity, you may need to ensure that an order cannot be placed for a
nonexistent customer.
Data Types
Nullability
The nullability of a column determines whether a value must be present in the column. This is often
referred to as whether a column is mandatory or not.
Default Values
If a column is not nullable, a value must be placed in it whenever a new row is inserted. A default value
enables users to insert a specific value into a column when no value is supplied in the statement that
inserted the row.
Constraints
Constraints are used to limit the permitted values in a column further than the limits that the data type
provides. For example, a tinyint column can have values from 0 to 255. You might decide to further
constrain the column so that only values between 1 and 9 are permitted in the column.
You can also apply constraints at the table level and enforce relationships between the columns of a table.
For example, you might have a column that holds an order number, but it is not mandatory. You might
then add a constraint that specifies that the column must have a value if the Salesperson column also has
a value.
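The two kinds of constraint described above might be sketched like this; all names are assumptions:

```sql
CREATE TABLE dbo.Sale
(
    SaleID      int          NOT NULL PRIMARY KEY,
    -- Column-level CHECK: narrows tinyint's 0-to-255 range to 1-to-9.
    Rating      tinyint      NOT NULL CHECK (Rating BETWEEN 1 AND 9),
    OrderNumber int          NULL,
    Salesperson nvarchar(50) NULL,
    -- Table-level CHECK: OrderNumber becomes mandatory whenever a
    -- Salesperson value is present.
    CHECK (Salesperson IS NULL OR OrderNumber IS NOT NULL)
);
```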
Triggers
Triggers are procedures (somewhat like stored procedures) that are executed whenever specific events
such as INSERT or UPDATE occur on a specific object such as a table. In the code for the trigger, you can
then enforce even more complex rules for integrity. Triggers are discussed in Module 10.
Objects from Earlier Versions
Early versions of SQL Server supported objects called rules and defaults. Note that defaults were a type of
object and not the same as DEFAULT constraints. Defaults were separate objects that were then bound to
columns. They were reused across multiple columns.
These objects have been deprecated because they were not compliant with Structured Query Language
(SQL) standards. Code that is based on these objects should be replaced. In general, you should replace
rules with CHECK constraints and defaults with DEFAULT constraints.
Lesson 2
Implementing Domain Integrity
Domain integrity limits the range and type of values that can be stored in a column. It is usually the most
important form of data integrity when first designing a database. If domain integrity is not enforced,
processing errors can occur when unexpected or out-of-range values are encountered.
Lesson Objectives
After completing this lesson, you will be able to:
Describe how you can use data types to enforce domain integrity.
Describe how you can use DEFAULT constraints to provide default values for columns.
Describe how you can use CHECK constraints to enforce domain integrity.
Data Types
Choosing an appropriate data type for each column
is one of the most important decisions that you
must make when you are designing a table as part of
a database. Data types were discussed in Module 2.
You can assign data types to a column by using one
of the following methods:
Using SQL Server system data types.
SQL Server supplies system data types and a large range of data types is available. Choosing a data type
determines both the types of data that can be stored and the range of values that is permitted.
It is common for consistency problems to occur when tables are designed. This is even more common
when more than one person has designed the tables. For example, you may have several tables that store
the weight of a product that was sold. One column might be defined as decimal(18,3), another column
might be defined as decimal(12,2), and another column might be defined as decimal(16,5). For
consistency, alias data types enable you to create a data type called ProductWeight, define it as
decimal(18,3), and then use it as the data type for all of these columns. This helps lead to more
consistent database designs.
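As a sketch (the ProductWeight name comes from the example above; the table is hypothetical), an alias data type is created once and then reused wherever the same kind of value is stored:

```sql
-- Define the alias data type once...
CREATE TYPE dbo.ProductWeight FROM decimal(18,3) NOT NULL;

-- ...then use it wherever a product weight is stored, guaranteeing a
-- consistent definition across tables.
CREATE TABLE dbo.ProductShipment
(
    ShipmentID int NOT NULL,
    ShippedWeight dbo.ProductWeight
);
```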
An additional advantage of alias data types is that code generation tools can create more consistent code
when the tools have the additional information about the data types that alias data types provide. For
example, you could decide to have a user-interface design program that always displayed and/or
prompted for product weights in a specific way.
The addition of managed code to SQL Server as part of SQL Server 2005 onward made it possible to
create entirely new data types. Although alias data types are user-defined, they are still effectively subsets
of the existing system data types. User-defined data types that are created in managed code enable the
design of not only the data that is stored in a data type, but also the behavior of the data type. For
example, you could design a jpeg data type. Besides designing how it would store images, you could
decide that it could be updated by calling a predesigned Resize method. Designing user-defined data
types is discussed in Module 12.
DEFAULT Constraints
A DEFAULT constraint provides a value for a column
when no value is specified in the statement that
inserted the row. You can view the existing
definition of DEFAULT constraints by querying the
sys.default_constraints view.
DEFAULT Constraint
DEFAULT constraints are associated with a table column. They are used to provide a default value for the
column when the user does not supply a value. The value is retrieved from the evaluation of an expression
and the data type that the expression returns must be compatible with the data type of the column.
However, note that if the statement that inserted the row explicitly inserted NULL, the default value would
not be used.
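A minimal sketch (the table and constraint names are hypothetical) showing a DEFAULT constraint whose expression is evaluated only when no value is supplied:

```sql
CREATE TABLE dbo.CustomerOrder
(
    OrderID int NOT NULL,
    OrderDate datetime2 NOT NULL
        CONSTRAINT DF_CustomerOrder_OrderDate DEFAULT (SYSDATETIME())
);

-- No OrderDate supplied: the default expression provides the value.
INSERT INTO dbo.CustomerOrder (OrderID) VALUES (1);

-- An explicitly supplied NULL would NOT be replaced by the default.
```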
Named Constraints
SQL Server does not require you to supply names for constraints that you create. If a name is not supplied,
SQL Server will automatically generate a name. However, the names that are generated are not very
intuitive. Therefore, it is generally considered a good idea to provide names for constraints as you create
them and to do so in a consistent naming pattern.
A good example of why naming constraints is important is that if a column needs to be deleted, you must
first remove any constraints that are associated with the column. Dropping a constraint requires you to
provide a name for the constraint that you are dropping. Having a consistent naming standard for
constraints helps you to know what that name is likely to be, rather than having to execute a query to find
the name. (Locating the name of a constraint would involve querying system views such as
sys.check_constraints, sys.default_constraints, or sys.key_constraints, or searching in Object Explorer.)
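With a consistent naming standard, the name needed to drop a constraint is predictable. A sketch (the table and constraint names are illustrative):

```sql
-- Because the constraint was created with a predictable name,
-- it can be dropped without first querying the system views.
ALTER TABLE dbo.CustomerOrder
    DROP CONSTRAINT DF_CustomerOrder_OrderDate;
```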
CHECK Constraints
CHECK constraints limit the values that a column
can accept by controlling the values that can be put
in the column.
Logical Expression
CHECK constraints work with any logical (Boolean) expression that can return TRUE, FALSE, or
UNKNOWN. Particular care must be given to any expression that could return NULL. CHECK
constraints reject only values for which the expression evaluates to FALSE; a value that causes the
expression to return UNKNOWN is not rejected.
Table-Level CHECK Constraints
Apart from checking the value in a particular column, you can apply CHECK constraints at the table level
to check the relationship between the values in more than a single column from the same table. For
example, you could decide that the FromDate column should not have a larger value than the ToDate
column in the same row.
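A sketch of such a table-level CHECK constraint (the table name is hypothetical; the FromDate and ToDate columns come from the example above):

```sql
CREATE TABLE dbo.Assignment
(
    AssignmentID int NOT NULL,
    FromDate date NOT NULL,
    ToDate date NULL,
    -- Table-level constraint comparing two columns in the same row.
    -- When ToDate is NULL the expression returns UNKNOWN, so the row
    -- is not rejected.
    CONSTRAINT CK_Assignment_DateRange CHECK (FromDate <= ToDate)
);
```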
Demonstration Steps
Enforce data and domain integrity
1. Ensure that the 20464C-MIA-DC and 20464C-MIA-SQL virtual machines are running and then log on
to 20464C-MIA-SQL as ADVENTUREWORKS\Student with the password Pa$$w0rd.
3. In the virtual machine, on the taskbar, click SQL Server 2014 Management Studio.
4. In the Connect to Server window, in Server name, type MIA-SQL and then click Connect.
6. If Solution Explorer is not visible, click the View menu and click Solution Explorer.
8. Follow the instructions contained within the comments of the script file.
Lesson 3
Implementing Entity and Referential Integrity
It is important to be able to uniquely identify rows within tables and to be able to establish relationships
across tables. For example, you will need to make sure that a customer can be identified and that the
customer exists before you allow an order to be placed for that customer. This can be enforced by using a
combination of entity and referential integrity.
Lesson Objectives
After completing this lesson, you will be able to:
Explain how PRIMARY KEY constraints are used to enforce entity integrity.
Explain how FOREIGN KEY constraints are used to enforce referential integrity.
Describe how table relationships can be maintained while deleting or updating data through
cascading relationships.
As with other types of constraints, even though a name is not required when defining a PRIMARY KEY
constraint, it is desirable to choose a name for the constraint rather than leaving SQL Server to do so.
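As a sketch (the table and column names are hypothetical), a PRIMARY KEY constraint with an explicit name:

```sql
CREATE TABLE dbo.Customer
(
    -- Naming the constraint avoids an auto-generated, unintuitive name.
    CustomerID int NOT NULL
        CONSTRAINT PK_Customer PRIMARY KEY,
    CustomerName nvarchar(100) NOT NULL
);
```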
UNIQUE Constraints
A UNIQUE constraint indicates that the values in a
column or combination of columns must be unique.
One row can hold NULL (if the column's nullability
permits it). SQL Server internally creates an index to
support the UNIQUE constraint.
If you were storing a tax identifier for employees in Spain, you would store one of these values, include a
CHECK constraint to make sure that the value was in one of the two valid formats, and have a UNIQUE
constraint on the column that stores these values. Note that this may be unrelated to the fact that the
table has another unique identifier such as EmployeeID that is used as a primary key for the table.
As with other types of constraints, even though a name is not required when defining a UNIQUE
constraint, it is desirable to choose a name for the constraint rather than leaving SQL Server to do so.
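The tax identifier scenario above could be sketched like this (the table and column names are hypothetical):

```sql
CREATE TABLE dbo.Employee
(
    -- EmployeeID is the primary key...
    EmployeeID int NOT NULL
        CONSTRAINT PK_Employee PRIMARY KEY,
    -- ...while the tax identifier is kept unique independently of it.
    TaxIdentifier nvarchar(20) NULL
        CONSTRAINT UQ_Employee_TaxIdentifier UNIQUE
);
```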
Note that you cannot change the length of a column when a FOREIGN KEY constraint is defined on it.
The target table can be the same table. For example, an Employee row might reference a manager who is
another row in the same Employee table.
As with other types of constraints, even though a name is not required when defining a FOREIGN KEY
constraint, it is desirable to choose a name for the constraint rather than leaving SQL Server to do so.
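A sketch of both forms of FOREIGN KEY constraint (the tables, columns, and constraint names are hypothetical, and the referenced key columns are assumed to exist already):

```sql
-- A foreign key referencing another table.
ALTER TABLE dbo.CustomerOrder
    ADD CONSTRAINT FK_CustomerOrder_Customer
        FOREIGN KEY (CustomerID) REFERENCES dbo.Customer (CustomerID);

-- A self-referencing foreign key: an employee row points at a manager
-- who is another row in the same table.
ALTER TABLE dbo.Employee
    ADD CONSTRAINT FK_Employee_Manager
        FOREIGN KEY (ManagerID) REFERENCES dbo.Employee (EmployeeID);
```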
When you add a FOREIGN KEY constraint to a column (or columns) in a table, SQL Server will check the
data that is already in the column to make sure that the reference to the target table is valid. However, if
you specify WITH NOCHECK, SQL Server does not apply the check to existing rows and will only check
the reference in future when rows are inserted or updated. The WITH NOCHECK option can be applied
to other types of constraints, too.
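A sketch of adding a constraint without validating the existing rows (the names here are hypothetical):

```sql
-- WITH NOCHECK: existing rows are not validated; only future inserts
-- and updates are checked against the constraint.
ALTER TABLE dbo.CustomerOrder WITH NOCHECK
    ADD CONSTRAINT FK_CustomerOrder_SalesPerson
        FOREIGN KEY (SalesPersonID) REFERENCES dbo.SalesPerson (SalesPersonID);
```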
REFERENCES Permission
Before you can place a FOREIGN KEY constraint on a table, you must at least have REFERENCES
permission on the target table. This avoids the situation where another user could place a reference to
one of your tables, leaving you unable to drop or substantially change your own table until the other user
removed that reference. However, in terms of security, keep in mind that providing REFERENCES
permission to a user on a table for which they do not have SELECT permission does not totally prevent
them from working out what the data in the table is by a brute force attempt that involves trying all
possible values.
1. NO ACTION is the default. For example, if you attempt to delete a customer and there are orders for
the customer, the deletion will fail.
2. CASCADE makes the required changes to the referencing tables. If the customer is being deleted, his
or her orders will be deleted, too. If the customer primary key is being updated (although note that
this is undesirable anyway), the customer key in the orders table will also be updated so that the
orders still refer to the correct customer.
3. SET NULL causes the values in the columns in the referencing table to be nullified. For the customer
and orders example, this means that the orders would still exist, but they would not refer to any
customer.
4. SET DEFAULT causes the values in the columns in the referencing table to be set to their default
values. This provides more control than the SET NULL option, which always sets the values to NULL.
Caution
Although cascading referential integrity is easy to set up, you should exercise extreme caution when using
it within database designs.
For example, if you used the CASCADE option in the example above, would it really be okay for the
orders for the customer to be removed when you decided to remove the customer? Most organizations
might not mind orders disappearing, but might be much less happy to see other objects such as invoices
disappearing. Also, keep in mind the cascading nature of this situation. When you remove the customer,
you remove the orders. However, there may be other tables that reference the orders table (such as order
details or even invoices), and these would be removed, too.
Naming
Changing Constraints
You can create, alter, or drop constraints without having to drop and re-create the underlying table.
You use the ALTER TABLE statement to add, alter, or drop constraints.
Error Checking in Applications
Even though you have specified constraints in your database layer, you may also want to check the same
logic in higher layers of code. Doing so will lead to more responsive systems because they will go through
fewer layers of code. It will also provide more meaningful errors to users because the code is closer to the
business-related logic that led to the errors. The challenge is in keeping the checks between different
layers consistent.
When you are performing bulk loading or updates of data, you can often achieve better performance by
disabling CHECK and FOREIGN KEY constraints while performing the bulk operations and then reenabling
them afterwards, rather than having them checked row by row during the bulk operation.
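The pattern could be sketched as follows (the table and constraint names are hypothetical):

```sql
-- Disable checking before the bulk operation.
ALTER TABLE dbo.CustomerOrder
    NOCHECK CONSTRAINT FK_CustomerOrder_Customer;

-- ... perform the bulk load or update here ...

-- Re-enable the constraint; WITH CHECK also validates the existing rows
-- so that the constraint is trusted by the optimizer again.
ALTER TABLE dbo.CustomerOrder
    WITH CHECK CHECK CONSTRAINT FK_CustomerOrder_Customer;
```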
Define entity integrity for a table, define referential integrity for tables, and define cascading referential
integrity constraints.
Demonstration Steps
Define entity integrity for a table, define referential integrity for tables, and define cascading referential
integrity constraints
1. Ensure that the 20464C-MIA-DC and 20464C-MIA-SQL virtual machines are running and then log on
to 20464C-MIA-SQL as ADVENTUREWORKS\Student with the password Pa$$w0rd.
3. In the virtual machine, on the taskbar, click SQL Server 2014 Management Studio.
4. In the Connect to Server window, in Server name, type MIA-SQL and then click Connect.
7. Follow the instructions contained within the comments of the script file.
IDENTITY Property
It is common to require a series of numbers to be
automatically provided for an integer column. The
IDENTITY property on a database column indicates
that an INSERT statement will not provide the value
for the column; instead, SQL Server will provide it
automatically.
When you specify the IDENTITY property, you specify a seed and an increment. The seed is the starting
value, and the increment is the amount by which the value increases for each new row. Both the seed and
the increment default to 1 if they are not specified.
Although explicit inserts are not normally allowed for columns that have an IDENTITY property, it is
possible to insert values explicitly. You can temporarily enable this by using the SET IDENTITY_INSERT
connection option, which allows the user to supply explicit values for the IDENTITY column instead of
having them auto-generated.
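A sketch of both the property and the connection option (the table and column names are hypothetical):

```sql
CREATE TABLE dbo.AuditLog
(
    LogID int IDENTITY(1,1) NOT NULL,   -- seed 1, increment 1 (the defaults)
    LogMessage nvarchar(200) NOT NULL
);

-- Normal insert: SQL Server supplies LogID automatically.
INSERT INTO dbo.AuditLog (LogMessage) VALUES (N'First entry');

-- Temporarily allow an explicit value in the IDENTITY column.
SET IDENTITY_INSERT dbo.AuditLog ON;
INSERT INTO dbo.AuditLog (LogID, LogMessage) VALUES (100, N'Explicit value');
SET IDENTITY_INSERT dbo.AuditLog OFF;
```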
Having the IDENTITY property on a column does not in itself ensure that the column is unique. Unless
there is also a UNIQUE constraint on the column, there is no guarantee that values in a column that has
the IDENTITY property will be unique.
After inserting a row into a table, you often need to know the value that was placed into the column that
has the IDENTITY property. The @@IDENTITY system function returns the last identity value that was
generated within the session, in any scope. This can be a problem when a trigger on the target table
performs an insert into another table that also has an IDENTITY column.
For example, if you insert a row into a customer table, the customer might be assigned a new identity
value. However, if a trigger on the customer table caused an entry to be written into an audit logging
table when inserts are performed, the @@IDENTITY variable would return the identity value from the
audit logging table, rather than the one from the customer table.
To deal effectively with this, the SCOPE_IDENTITY() function was introduced. It provides the last identity
value within the current scope only. In the previous example, it would return the identity value from the
customer table.
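The difference could be sketched like this (assuming a Customer table whose key is an IDENTITY column and that has the audit trigger described above):

```sql
INSERT INTO dbo.Customer (CustomerName) VALUES (N'New Customer');

-- Last identity value generated in the current scope only: returns the
-- new CustomerID even if a trigger inserted into an audit table.
SELECT SCOPE_IDENTITY() AS NewCustomerID;

-- Last identity value generated in the session, in ANY scope: would
-- return the audit table's identity value instead.
SELECT @@IDENTITY AS LastIdentityAnyScope;
```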
Another complexity relates to multi-row inserts, which were introduced in SQL Server 2008. In this
situation, you may want to retrieve the IDENTITY column value for more than one row at a time.
Typically, this would be implemented by the use of the OUTPUT clause on the INSERT statement.
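A sketch of using the OUTPUT clause to capture all of the generated identity values (the names are hypothetical):

```sql
DECLARE @NewIDs table (CustomerID int);

INSERT INTO dbo.Customer (CustomerName)
OUTPUT inserted.CustomerID INTO @NewIDs
VALUES (N'Customer A'), (N'Customer B'), (N'Customer C');

-- One identity value per inserted row.
SELECT CustomerID FROM @NewIDs;
```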
Sequences
You can use sequences in a similar way to
IDENTITY properties when a sequence of values is
required. However, unlike IDENTITY properties,
sequences are not tied to any specific table. This
means that you could use a single sequence to
provide key values for a group of tables.
Sequences can be cyclic. They can return to a low
value when a specified maximum value has been
exceeded.
In the example on the slide, a sequence called
BookingID is created in the Booking schema. The
sequence is defined as generating integer values. By
default, sequences generate bigint values.
Values from sequences are retrieved by using the NEXT VALUE FOR clause. In the example shown on the
slide, the sequence is being used to provide the default value for the FlightBookingID column in the
Booking.FlightBooking table.
Sequences are created by the CREATE SEQUENCE statement, modified by the ALTER SEQUENCE
statement, and deleted by the DROP SEQUENCE statement.
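The slide example could be sketched as follows (assuming the Booking schema already exists; the PassengerName column is illustrative):

```sql
-- A sequence generating int values (the default would be bigint).
CREATE SEQUENCE Booking.BookingID AS int
    START WITH 1
    INCREMENT BY 1;

-- The sequence provides the default value for the key column.
CREATE TABLE Booking.FlightBooking
(
    FlightBookingID int NOT NULL
        DEFAULT (NEXT VALUE FOR Booking.BookingID),
    PassengerName nvarchar(100) NOT NULL
);

-- The same sequence could also supply key values for other booking tables.
```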
Other database engines provide sequence values, so the addition of sequence support in SQL Server 2012
and SQL Server 2014 can assist with migrating code to SQL Server from other database engines.
Note that values that are retrieved from the sequence are never returned for reuse. This means that gaps
can occur in the set of sequence values. In addition, a range of sequence values can be retrieved in a
single call via the sp_sequence_get_range system stored procedure. Options also exist to cache sets of
sequence values to improve performance. When a server failure occurs, the entire cached set of values is
lost.
Work with identity constraints, create a sequence, and use a sequence to provide key values for two
tables.
Demonstration Steps
Work with identity constraints, create a sequence, and use a sequence to provide key values for two tables
1. Ensure that the 20464C-MIA-DC and 20464C-MIA-SQL virtual machines are running and then log on
to 20464C-MIA-SQL as ADVENTUREWORKS\Student with the password Pa$$w0rd.
3. In the virtual machine, on the taskbar, click SQL Server 2014 Management Studio.
4. In the Connect to Server window, in Server name, type MIA-SQL and then click Connect.
7. Follow the instructions contained within the comments of the script file.
8. Close SQL Server Management Studio and SQL Server Profiler without saving any changes.
The following table should be used when you are designing your constraints.
Objectives
In this lab, you will add constraints to tables.
Password: Pa$$w0rd
3. In File Explorer, navigate to the D:\Labfiles\Lab03\Starter folder, right-click the Setup.cmd file, and
then click Run as administrator.
4. In the User Account Control dialog box, click Yes, and then wait for the script to finish.
2. Work through the list of requirements and alter the table to make columns the primary key based on
the requirements.
3. Work through the list of requirements and alter the table to make columns foreign keys based on the
requirements.
4. Work through the list of requirements and alter the table to add DEFAULT constraints to columns
based on the requirements.
Results: Having completed this lab, you will have added constraints to the DirectMarketing.Opportunity
table.
Note: This query should fail due to the FOREIGN KEY constraint.
Results: After completing this exercise, you should have successfully tested your constraints.
Question: In SQL Server Management Studio, you have successfully run a script that created
a table, but you don’t see the table in Object Explorer. What do you need to do?
Question: What does the DEFAULT option do when you create a column?
Question: What requirement does a PRIMARY KEY constraint have that a UNIQUE constraint
does not?
Review Question(s)
Question: Why implement CHECK constraints if an application is already checking the input
data?
Question: What are some scenarios in which you may want to temporarily disable constraint
checking?
Module 4
Introduction to Indexes
Contents:
Module Overview 4-1
Module Overview
An index is a collection of pages associated with a table. Indexes are used to improve the performance of
queries or enforce uniqueness. Before learning to implement indexes, it is important to understand how
they work, how effective different data types are when used within indexes, and how indexes can be
constructed from multiple columns. This module discusses table structures without indexes and the
different index types available in SQL Server.
Objectives
After completing this module, you will be able to:
Lesson 1
Core Indexing Concepts
Although it is possible for Microsoft® SQL Server® data management software to read all of the pages in
a table when it is calculating the results of a query, doing so is often highly inefficient. Instead, you can
use indexes to point to the location of required data and to minimize the need for scanning entire tables.
In this lesson, you will learn how indexes are structured and learn the key measures that are associated
with the design of indexes. Finally, you will see how indexes can become fragmented over time.
Lesson Objectives
After completing this lesson, you will be able to:
Sometimes SQL Server creates its own temporary indexes to improve query performance. However, doing
so is up to the optimizer and beyond the control of the database administrator or programmer, so these
temporary indexes will not be discussed in this module. The temporary indexes are only used to improve a
query plan if no proper indexing already exists.
In this module, you will consider standard indexes that are created on tables. SQL Server also includes
other types of index:
Integrated full-text search is a special type of index that provides flexible searching of text.
Spatial indexes are used with the GEOMETRY and GEOGRAPHY data types.
Primary and secondary XML indexes assist when querying XML data.
Columnstore indexes are used to speed up operations for data that is not constantly changing, such
as data in data warehouses.
Each of these other index types is discussed in later modules in this course.
At this point, it is useful to consider an analogy that might be easier to relate to. Consider a physical
library. Most libraries store books in a given order, which is basically an alphabetical order within a set of
defined categories.
Note that even when you store the books in alphabetical order, there are various ways to do so. The order
of the books could be based on the name of the book or the name of the author. Whichever option is
chosen makes one form of access easy and other forms of access harder. For example, if books were
stored in book name order, how would you locate books that were written by a particular author? Indexes
assist with this type of problem.
Index Structures
Tree structures provide rapid search capabilities for
large numbers of entries in a list, while avoiding the
need for excessive depth within an index. Depth is defined as the number of levels from the top node
(called the root node) to the bottom nodes (called leaf nodes).
Locating the book in the bookcases based on the information in that next index entry.
You would need to keep repeating the same steps until you had found all of the books by that author.
Now imagine doing the same for a range of authors, such as one-third of all of the authors in the library.
You quickly reach a point where it would be quicker to just scan the whole library and ignore the author
index rather than running backward and forward between the index and the bookcases.
Density is a measure of the lack of uniqueness of the data in a table. A dense column is one that has a
high number of duplicates.
Index depth is a measure of the number of levels from the root node to the leaf nodes. Users often
imagine that SQL Server indexes are quite deep, but the reality is quite different. The large number of
children that each node in the index can have produces a very flat index structure. Indexes with only three
or four layers are very common.
Index Fragmentation
Index fragmentation is the inefficient use of pages
within an index. Fragmentation occurs over time as
data is modified.
For operations that read data, indexes perform best
when each page of the index is as full as possible.
Although indexes may initially start full (or relatively
full), modifications to the data in the indexes can
cause the need to split index pages.
Internal fragmentation is similar to what would occur if an existing bookcase was split into two bookcases.
Each bookcase would then be only half full.
External fragmentation relates to where the new bookcase would be physically located. It would probably
need to be placed at the end of the library, even though it would “logically” need to be in a different
order. That means that to read the bookcases in order, you could no longer just walk directly from
bookcase to bookcase. Instead, you would need to follow pointers around the library to follow a chain
from one bookcase to another.
Detecting Fragmentation
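One common way to detect both forms of fragmentation is to query the sys.dm_db_index_physical_stats dynamic management function. A sketch for the current database:

```sql
-- avg_page_space_used_in_percent reflects internal fragmentation
-- (how full each page is); avg_fragmentation_in_percent reflects
-- external (logical) fragmentation.
SELECT OBJECT_NAME(ips.object_id) AS table_name,
       i.name AS index_name,
       ips.avg_page_space_used_in_percent,
       ips.avg_fragmentation_in_percent
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'DETAILED') AS ips
JOIN sys.indexes AS i
    ON i.object_id = ips.object_id
   AND i.index_id = ips.index_id;
```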
Character data values tend to be larger than numeric values. For example, a character column might hold
a customer's name or address details. This means that far fewer entries can exist in a given number of
index pages, which makes character-based indexes slower to seek.
Character-based indexes also tend to cause fragmentation problems because new values are almost never
ascending or descending.
Date-related data types are only slightly less efficient than integer data types. Date-related data types are
relatively small and can be compared and sorted quickly.
Globally unique identifier (GUID) values are reasonably efficient within indexes. There is a common
misconception that they are large, but they are 16 bytes long and can be compared in a binary fashion.
This means that they pack quite tightly into indexes and can be compared and sorted quite quickly.
There is a very common misconception that bit columns are not useful in indexes. This stems from the
fact that there are only two values. However, the number of values is not the issue.
Selectivity of queries is the most important issue. For example, consider a transaction table that contains
100 million rows, where one of the columns (IsFinalized) indicates whether a transaction has been
completed. There might only be 500 transactions that are not completed. An index that uses the
IsFinalized column would be very useful for finding the unfinalized transactions. It would be highly
selective.
Demonstration Steps
Identify fragmented indexes
1. Ensure that the 20464C-MIA-DC and 20464C-MIA-SQL virtual machines are running and then log on
to 20464C-MIA-SQL as ADVENTUREWORKS\Student with the password Pa$$w0rd.
4. In the Connect to Server window, in Server name, type MIA-SQL and then click Connect.
9. Follow the instructions contained within the comments of the script file.
Lesson 2
Single-Column and Composite Indexes
The indexes that have been discussed so far have been based on data from single columns. Indexes can
also be based on data from multiple columns and constructed in ascending or descending order. This
lesson investigates these concepts and the effects that they have on index design along with details of
how SQL Server maintains statistics on the data that is contained within indexes.
Lesson Objectives
After completing this lesson, you will be able to:
Higher selectivity.
Similarly, an index by topic would be of limited value, too. After the correct topic had been located, it
would be necessary to search all of the books on that topic to determine if they were by the specified
author.
The best option would be an author index that also included details of each book's topic. In that case, a
scan of the index pages for the author would be all that was required to work out which books needed to
be accessed.
When you are constructing composite indexes, in the absence of any other design criteria, you should
typically index the most selective column first.
Composite indexes can benefit from each component having a different order. Often this is used to avoid
sorts. For example, you might need to output orders by date descending within customer ascending.
From our physical library analogy, imagine that an author index contains a list of books by release date
within the author index. Answering the query would be easier if the index was already structured this way.
Index Statistics
SQL Server keeps statistics on indexes to assist when
making decisions about how to access the data in a
table.
Earlier in the module, you saw that SQL Server
needs to make decisions about how to access the
data in a table. For each table that is referenced in a
query, SQL Server might decide to read the data
pages or it might decide to use an index.
When discussing the physical library analogy earlier, it was mentioned that if you were looking up the
books for an author, using an index that is ordered by author could be useful. However, if you were
locating books for a range of authors, there would be a point at which scanning the entire library would
be quicker than running backward and forward between the index and the shelves of books.
The key issue here is that, before executing the query, you need to know how selective (and therefore
useful) the indexes would be. The statistics that SQL Server holds on indexes provide this knowledge.
Lesson 3
Table Structures in SQL Server
Tables in SQL Server can be structured in two ways. Rows can be added in any order, or rows can be
ordered. In this lesson, you will investigate both options, and gain an understanding of how each option
affects common data modification operations. Finally, you will see how unique, clustered indexes are
structured differently to non-unique, clustered indexes.
Lesson Objectives
After completing this lesson, you will be able to:
Describe how unique clustered indexes are structured differently to non-unique, clustered indexes.
What Is a Heap?
A heap is a table that has no enforced order for
either the pages within the table or for the data
rows within each page.
In the physical library analogy, a heap would be represented by structuring your library so that every book
is just placed in any available space that is large enough. Without any other assistance, finding a book
would involve scanning one bookcase after another.
Operations on Heaps
The most common operations that are performed
on tables are INSERT, UPDATE, DELETE, and
SELECT operations. It is important to understand
how each of these operations is affected by
structuring a table as a heap.
A DELETE operation could be imagined as scanning the bookcases until the book is found, removing the
book, and throwing it away. More precisely, it would be like placing a tag on the book to say that it is to
be thrown out the next time the library is cleaned up or space on the bookcase is needed.
An UPDATE operation would be represented by replacing a book with a (potentially) different copy of the
same book. If the replacement book was the same (or smaller) size as the original book, it could be placed
directly back in the same location as the original book. However, if the replacement book was larger, the
original book would be removed and the replacement placed into another location. The new location for
the book could be in the same bookcase or in another bookcase.
There is a common misconception that adding additional indexes always reduces the performance of data
modification operations. However, it is clear that for the DELETE and UPDATE operations described
above, having another way to find these rows might well be useful. In Module 5, you will see how to
achieve this.
Forwarding Pointers
When other indexes point to rows in a heap, data
modification operations cause forwarding pointers
to be inserted into the heap. This can cause
performance issues over time.
In the heap version of the library analogy, there was no order to the books on the bookcases, so when an entry was found in the ISBN index, the
entry would refer to the physical location of the book. The entry would include an address like “Bookcase
12, Shelf 5, Book 3.” That is, there would need to be a specific address for a book.
An update to the book that meant that it needed to be moved to a different location would be
problematic. One option for resolving this would be to locate all index entries for the book and update
the new physical location.
An alternate option would be to leave a note in the location where the book used to be that points to
where the book has been moved to. This is what a forwarding pointer is in SQL Server. A forwarding
pointer enables rows to be updated and moved without the need to update other indexes that point to
them.
A further challenge arises if the book needed to be moved again. There are two ways in which this could
be handled. Either yet another note could be left pointing to the new location or the original note could
be modified to point to the new location. Either way, the original indexes would not need to be updated.
SQL Server deals with this by updating the original forwarding pointer. This way, performance does not
continue to degrade by having to follow a chain of forwarding pointers.
Forwarding pointers were a common performance problem with tables in SQL Server that were structured as heaps, because there was no straightforward way of “cleaning up” a heap to remove the forwarding pointers. Although options existed for removing them, each had significant disadvantages. SQL Server 2008 introduced a direct method for dealing with this problem: the ALTER TABLE REBUILD command.
Note that although options to rebuild indexes were available in prior versions, the option to rebuild a
table was not available. You can also use this command to change the compression settings for a table.
(Page and row compression are advanced topics that are beyond the scope of this course.)
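The rebuild command described above can be sketched as follows. The table name is hypothetical and used only for illustration:

```sql
-- Rebuild a heap to remove forwarding pointers (SQL Server 2008 and later).
-- dbo.LogData is a hypothetical table name.
ALTER TABLE dbo.LogData REBUILD;

-- The same statement can also change the compression settings for a table.
ALTER TABLE dbo.LogData REBUILD WITH (DATA_COMPRESSION = PAGE);
```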
There is a common misconception that pages in a clustered index are “physically stored in order.”
Although this is possible in rare situations, it is not commonly the case. If it were true, fragmentation of
clustered indexes would not exist. SQL Server tries to align physical and logical order while it creates an
index, but disorder can arise as data is modified.
Index and data pages are linked within a logical hierarchy and also double-linked across all pages at the
same level of the hierarchy to assist when scanning across an index.
In the library analogy, a clustered index is similar to storing all books in a specific order. An example of
this would be to store books in International Standard Book Number (ISBN) order. Clearly, the library can
only be in a single order.
An UPDATE operation in this ordered library would again involve replacing a book with a (potentially) different copy. If there was insufficient space in the bookcase to accommodate a larger replacement book, the bookcase would need to be split. If the ISBN of the replacement book was different from that of the original book, the original book would need to be removed and the replacement book treated like the insertion of a new book.
A DELETE operation would involve the book being removed from the bookcase. (Again, more formally, it
would be flagged as free in a free space map, but simply left in place for later removal.)
When a SELECT operation is performed, if the ISBN is known, the required book can be quickly located by
efficiently searching the library. If a range of ISBNs was requested, the books would be located by finding
the first book and continuing to collect books in order until a book was encountered that was out of
range or until the end of the library was reached.
A unique, clustered index is similar to having a library rule that says that no more than a single copy of any book can ever be stored. If someone tried to insert a new
book and another book was found to have the same ISBN (assuming that the ISBN was the clustering
key), the insertion of the new book would be refused.
It is important to understand that the comparison is made only on the clustering key. The book would be
rejected for having the same ISBN, even if other properties of the book were different.
A non-unique, clustered index is similar to having a rule that allows more than a single book that has the
same ISBN. The issue is that it is likely to be desirable to track each copy of the book separately. The
uniqueifier that SQL Server adds would be like a “Copy Number” being added to books that can be
duplicated. The uniqueifier is not visible to users.
Create a table as a heap, check the fragmentation and forwarding pointers for a heap, and rebuild a
heap.
Demonstration Steps
Create a table as a heap, check the fragmentation and forwarding pointers for a heap, and rebuild a heap
1. Ensure that the 20464C-MIA-DC and 20464C-MIA-SQL virtual machines are running and then log on
to 20464C-MIA-SQL as ADVENTUREWORKS\Student with the password Pa$$w0rd.
3. In the virtual machine, on the taskbar, click SQL Server 2014 Management Studio.
4. In the Connect to Server window, in Server name, type MIA-SQL and then click Connect.
8. Follow the instructions contained within the comments of the script file.
Lesson 4
Working with Clustered Indexes
If a decision has been made to structure a table by using a clustered index, it is important to be familiar
with how the indexes are created, dropped, or altered. In this lesson, you will see how to perform these
actions, understand how SQL Server performs them automatically in some situations, and see how to
incorporate free space within indexes to improve insert performance.
Lesson Objectives
After completing this lesson, you will be able to:
In the first example on the slide, the dbo.Article table was created. The ArticleID column has a PRIMARY
KEY constraint associated with it. There is no other clustered index on the table, so the index that is
created to support the PRIMARY KEY constraint will be created as a clustered primary key. ArticleID will
be both the clustering key and the primary key for the table.
In the second example on the slide, the dbo.LogData table is initially created as a heap. When the
PRIMARY KEY constraint is added to the table, no other clustered index is present on the table, so SQL
Server will create the index to support the PRIMARY KEY constraint as a clustered index.
If a table has been created as a heap, it can be converted to a clustered index structure by adding a
clustered index to the table. In the fourth command shown in the examples on the slide, a clustered index
named CL_LogTime is added to the dbo.LogTime table and the LogTimeID column is the clustering key.
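The slide examples described above are not reproduced in the text; the following is a hedged reconstruction based on the table and index names mentioned. Column names other than the keys are hypothetical:

```sql
-- Example 1: a PRIMARY KEY constraint with no other clustered index
-- present is supported by a clustered index by default.
CREATE TABLE dbo.Article
(
    ArticleID int NOT NULL PRIMARY KEY,  -- becomes the clustered primary key
    Title nvarchar(100) NOT NULL          -- illustrative column
);

-- Example 2: a table created as a heap, then given a clustered
-- primary key when the constraint is added.
CREATE TABLE dbo.LogData
(
    LogID int NOT NULL,                   -- illustrative column
    LogText nvarchar(max) NULL
);

ALTER TABLE dbo.LogData
ADD CONSTRAINT PK_LogData PRIMARY KEY (LogID);

-- Converting an existing heap to a clustered index structure directly.
CREATE CLUSTERED INDEX CL_LogTime ON dbo.LogTime (LogTimeID);
```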
This command not only creates an index over the data, it also causes the entire structure of the table to be reorganized.
Restructuring an index is not permitted within an ALTER INDEX statement. You cannot add or remove
columns that make up the clustering key by using this command and you cannot move the index to a
different filegroup.
WITH DROP_EXISTING
An option to change the structure of an index is provided while creating a replacement index. The
CREATE INDEX command includes a WITH DROP_EXISTING clause that can enable the statement to
replace an existing index. This operation is also typically much faster than dropping and re-creating the
index because SQL Server can build the index based on the old index structure.
Note that you cannot change an index from being clustered to nonclustered or back by using this
command. (Nonclustered indexes are covered in Module 5.)
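A minimal sketch of replacing an index in a single operation; the index, table, and column names are hypothetical:

```sql
-- Rebuild an existing index in place rather than dropping and
-- re-creating it. SQL Server can reuse the old index structure.
CREATE CLUSTERED INDEX CL_LogTime
ON dbo.LogTime (LogTimeID)
WITH (DROP_EXISTING = ON);
```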
Disabling Indexes
Although the ALTER INDEX statement includes a DISABLE option that can be applied to any index, this
option is of limited use with clustered indexes. After a clustered index is disabled, no access to the data in
the table is then permitted until it is rebuilt.
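The disable and rebuild cycle described above might look like this; the index and table names are hypothetical:

```sql
-- Disabling a clustered index makes the table's data inaccessible.
ALTER INDEX CL_LogTime ON dbo.LogTime DISABLE;

-- Access is restored only when the index is rebuilt.
ALTER INDEX CL_LogTime ON dbo.LogTime REBUILD;
```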
You can alleviate the performance impacts of page splits by leaving empty space on each page when you
are creating an index, including a clustered index. You can achieve this by specifying a FILLFACTOR value.
FILLFACTOR defaults to 0, which means “fill 100 percent.” Any other value (including 100) is taken as the
percentage of how full each page should be. For the example on the slide, this means 70 percent full and
30 percent free space on each page.
FILLFACTOR only applies to leaf-level pages in an index. PAD_INDEX is an option that, when it is
enabled, causes the same free space to be allocated in the nonleaf levels of the index.
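The FILLFACTOR and PAD_INDEX options described above can be sketched as follows, using hypothetical object names:

```sql
-- Fill leaf-level pages to 70 percent, leaving 30 percent free space
-- for future inserts. PAD_INDEX applies the same fill factor to the
-- nonleaf levels of the index as well.
CREATE CLUSTERED INDEX CL_LogTime
ON dbo.LogTime (LogTimeID)
WITH (FILLFACTOR = 70, PAD_INDEX = ON);
```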
Static. Clustering keys should be based on data values that do not change. This is one reason why primary keys are often used for this purpose. A change to the clustering key means that the row must be moved, and you have already seen that moving rows is generally not desirable.
Increasing. This assists with INSERT behavior. If the keys within the data are increasing as they are
inserted, the inserts happen directly at the logical end of the table. This minimizes fragmentation (the
need to split pages) and reduces the amount of memory that is needed for page buffers.
Unique. Unique clustering keys do not require SQL Server to add a uniqueifier column. It is important
to declare unique values as being unique. Otherwise, SQL Server will still add a uniqueifier column to
the key.
Although this list provides good general guidelines, you must evaluate typical query patterns when you
are designing clustering keys.
You can use character data types for clustering keys, but sorting and comparing character values is slower than for numeric values, and character values often change in typical business applications.
Date data is typically not unique, but provides excellent advantages in size and sorting performance. It
works well for date range queries that are common in typical business applications.
An indexed view is a view on which you create a unique, clustered index. The index stores the data set that is the result of the query that the view contains, so the data set is said to be persisted or materialized.
When the view is called, SQL Server can return the data set directly from the index, and does not need to
run the query. By avoiding the costs of processing of the query logic, including the joins, aggregations,
and filters that the query contains, SQL Server can significantly improve response times. Indexed views can
potentially provide additional performance benefits, because the query optimizer can choose to use an
index that is built on a view even if the view is not referenced in the FROM clause of the query. For
example, if a query has the same definition as the syntax of an indexed view, or it queries a subset of the
data that the indexed view contains, the optimizer can use the indexed view to answer the query.
When you are planning indexed views, consider the following points:
Indexed views provide the most significant performance benefits for queries that are commonly used
or high priority, and queries that include operations such as joins or aggregations. Creating indexes
for infrequently run, low-priority queries might deliver improved performance for those queries, but
the costs of index maintenance will probably outweigh the benefits.
Indexed views can cause performance degradation when data sets are frequently modified because
inserts, updates, and deletes all require the data to be changed in both the index and the supporting
tables. Furthermore, SQL Server might need to perform aggregations every time a row is modified in
the underlying table.
When you drop a view, all indexes on the view are also dropped. If you drop only the clustered index on a view, the data set is no longer persisted, and the query optimizer processes the view in the same way as a standard view.
o You must set the ANSI_NULLS and QUOTED_IDENTIFIER options to ON when you execute the
CREATE VIEW statement.
o You must set the ANSI_NULLS option to ON when you execute the CREATE TABLE statements
to create the tables that the view will reference. For this reason, you should ensure that you
consider early in the planning stage whether you might use indexed views.
o A view that has an index can only reference base tables in the same database as the view, and it
cannot reference other views.
o The definition of an indexed view must be deterministic. Deterministic expressions always return
the same result when you execute them with the same set of input values. Certain functions are
not deterministic, so you cannot use them in an indexed view. For example, the DATEADD
function is deterministic because it always returns the same result when it is used with a specific
set of parameter values. However, the GETDATE function is not deterministic because the value it
returns changes each time it is executed.
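The requirements above can be illustrated with a sketch; the schema, view, table, and column names are hypothetical:

```sql
-- Required SET options must be ON when the view is created.
SET ANSI_NULLS ON;
SET QUOTED_IDENTIFIER ON;
GO
-- The view must be schema bound and reference base tables in the
-- same database by two-part names.
CREATE VIEW Sales.vw_OrderTotals
WITH SCHEMABINDING
AS
SELECT s.ProductCode,
       COUNT_BIG(*) AS OrderCount,          -- COUNT_BIG(*) is required in
       SUM(s.SalesAmount) AS TotalRevenue   -- aggregate indexed views
FROM Sales.SalesOrder AS s
GROUP BY s.ProductCode;
GO
-- The first index on a view must be unique and clustered; creating it
-- materializes the view's result set.
CREATE UNIQUE CLUSTERED INDEX IX_vw_OrderTotals
ON Sales.vw_OrderTotals (ProductCode);
```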
Reference Links: For a full list of the requirements for creating indexed views, see the
Creating Indexed Views topic in SQL Server Books Online.
When the data in the base columns that the computed column references changes, the index is
correspondingly updated. If the data changes frequently, these index updates can impair
performance.
When you rebuild an index on a computed column, SQL Server recalculates the values in the column.
The amount of time that this takes will depend on the number of rows and the complexity of the
calculation, but if you rebuild indexes often, you should consider the impact that this can have.
You can only build indexes on computed columns that are deterministic.
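A sketch of an index on a deterministic computed column; the table and column names are hypothetical:

```sql
CREATE TABLE dbo.OrderLine
(
    OrderLineID int NOT NULL PRIMARY KEY,
    Quantity int NOT NULL,
    UnitPrice money NOT NULL,
    LineTotal AS (Quantity * UnitPrice)  -- deterministic, precise expression
);

-- The computed column can be indexed because its expression is
-- deterministic.
CREATE INDEX IX_OrderLine_LineTotal ON dbo.OrderLine (LineTotal);
```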
Reference Links: For information about the requirements for creating indexes on
computed columns, see the Indexes on Computed Columns topic in SQL Server Books Online.
Create a table that has a clustered index, detect fragmentation in a clustered index, and correct
fragmentation in a clustered index.
Demonstration Steps
Create a table that has a clustered index, detect fragmentation in a clustered index, and correct
fragmentation in a clustered index
1. Ensure that the 20464C-MIA-DC and 20464C-MIA-SQL virtual machines are running and then log on
to 20464C-MIA-SQL as ADVENTUREWORKS\Student with the password Pa$$w0rd.
3. In the virtual machine, on the taskbar, click SQL Server 2014 Management Studio.
4. In the Connect to Server window, in Server name, type MIA-SQL and then click Connect.
8. Follow the instructions contained within the comments of the script file.
Lesson 5
Working with Nonclustered Indexes
In this lesson, you will learn how SQL Server structures nonclustered indexes and how they can provide
performance improvements for your applications. You will also see how to create, alter, and drop
nonclustered indexes.
Lesson Objectives
After completing this lesson, you will be able to:
Nonclustered Indexes
Whenever you update key columns from the nonclustered index or update clustering keys on the base
table, the nonclustered indexes need to be updated, too. This affects the data modification performance
of the system. Each additional index that is added to a table increases the work that SQL Server might
need to perform when modifying the data rows in the table. You must take care to balance the number of
indexes that are created against the overhead that they introduce.
Ongoing Review
An application's data access patterns may change over time, particularly in enterprises where ongoing
development work is being performed on the applications. This means that nonclustered indexes that are
created at one point in time may need to be altered or even dropped at a later point in time, to continue
to achieve high performance levels.
Physical Analogy
Continuing our library analogy, nonclustered indexes are indexes that point back to the bookcases. They
provide alternate ways to look up the information in the library. For example, they might enable access by
author, by release date, or by publisher. They can also be composite indexes where you could find an
index by release date within the entries for each author.
You can create multiple nonclustered indexes on a table regardless of whether the table is structured as a
heap or has a clustered index.
Physical Analogy
Based on the library analogy, a nonclustered index over a heap is like an author index pointing to books
that have been stored in no particular order within the bookcases. When an author is found in the index,
the entry in the index for each book would have an address like “Bookcase 4, Shelf 3, Book 12.” Note that
it would be a pointer to the exact location of the book.
If the clustered index is not a unique, clustered index, the leaf level of the nonclustered index also needs
to hold the uniqueifier value for the data rows.
Physical Analogy
In the library analogy, a nonclustered index over a clustered index is like having an author index built over
a library where the books are all stored in ISBN order. When the required author is found in the author
index, the entry in the index provides details of the ISBNs for the required books. These ISBNs are then
searched for by using the second index to locate the books within the bookcases. If the bookcases need to
be rearranged (for example, due to other rows being modified), it is not necessary to make any changes
to the author index because it is only providing keys that are used for locating the books, rather than the
physical location of the books.
If an index is created only to enhance performance, rather than as part of the initial schema of an
application, one suggested standard is to include in the name of the index the date of creation and a
reference to documentation that describes why the index was created. Database administrators are often
hesitant to remove indexes when they do not know why those indexes were created. Keeping
documentation that explains why indexes were created avoids that confusion.
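The naming standard suggested above might look like the following sketch; the index name, date, documentation reference, and column are all hypothetical:

```sql
-- Performance-only index: the name embeds the creation date and a
-- reference to the documentation that explains why it was created.
CREATE NONCLUSTERED INDEX IX_SalesOrder_OrderDate_20140315_PERF0042
ON dbo.SalesOrder (OrderDate);
```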
A composite index specifies more than one column as the key value. Using composite indexes can
enhance query performance, especially when users regularly search for information in more than one way.
However, wide keys increase the storage requirements of an index.
Most useful nonclustered indexes in business applications are composite indexes. A common error is to
create single-column indexes on many columns of a table. These indexes are rarely useful.
In composite indexes, the ordering of key columns is important. In the absence of any other requirements,
you should specify the most selective column first. You can specify each column that makes up the key as
ASC (ascending) or DESC (descending). Ascending is the default order.
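A composite key sketch reflecting the guidance above, with hypothetical object names:

```sql
-- The most selective column is listed first; sort order can be set
-- per key column (ascending is the default).
CREATE NONCLUSTERED INDEX IX_SalesOrder_Customer_OrderDate
ON dbo.SalesOrder (CustomerID ASC, OrderDate DESC);
```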
INCLUDE Clause
In earlier versions of SQL Server (prior to 2005), it
was common for database administrators or
developers to create indexes that had a large
number of columns, to attempt to cover important
queries. Covering a query avoids the need for
lookup operations and can greatly increase the
performance of queries. The INCLUDE clause was
introduced to make the creation of covering
indexes easier.
SQL Server 2005 introduced the ability to include one or more columns (up to 1,023 columns) only at the
leaf level of the index. The index structure in other levels is unaffected by these included columns. They
are included only to help with covering queries. If more than one column is listed in an INCLUDE clause,
the order of the columns within the clause is not relevant.
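The INCLUDE clause described above can be sketched as follows; the object names are hypothetical:

```sql
-- Nonkey columns are stored only at the leaf level, so queries that
-- filter on OrderDate and return CustomerID and SalesAmount can be
-- covered without widening the index key.
CREATE NONCLUSTERED INDEX IX_SalesOrder_OrderDate_Covering
ON dbo.SalesOrder (OrderDate)
INCLUDE (CustomerID, SalesAmount);
```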
Performance Impacts
Indexes that provide all columns required for a query are considered to “cover” the query. Covering
indexes can have a very positive performance impact on the queries that they are designed to support.
However, although it would be possible to create an index to cover most queries, doing so could be
counterproductive. Each index that is added to a table can negatively impact the performance of data
modifications on the table. For this reason, it is important to decide which queries are most important and
to aim to cover only those queries.
Demonstration Steps
Create covering indexes
1. Ensure that the 20464C-MIA-DC and 20464C-MIA-SQL virtual machines are running and then log on
to 20464C-MIA-SQL as ADVENTUREWORKS\Student with the password Pa$$w0rd.
4. In the Connect to Server window, in Server name, type MIA-SQL and then click Connect.
5. On the File menu, click Open, click Project/Solution, navigate to
D:\Demofiles\Mod04\Demo04.ssmssln, and then click Open.
8. Follow the instructions contained within the comments of the script file.
9. Close SQL Server Management Studio and SQL Server Profiler without saving any changes.
Objectives
After completing this lab, you will have:
Password: Pa$$w0rd
3. In File Explorer, navigate to the D:\Labfiles\Lab04\Starter folder, right-click the Setup.cmd file, and
then click Run as administrator.
4. In the User Account Control dialog box, click Yes, and then wait for the script to finish.
Results: After completing this exercise, you will have created tables with clustered indexes.
Results: After completing this lab, you will have created a nonclustered index.
Question: Which table structure is automatically assigned when a table is assigned a primary
key during table creation and no structure is specified?
Review Question(s)
Question: What is the main problem with using unique identifiers as primary keys?
Question: Where are newly inserted rows placed when a table is structured as a heap?
Module 5
Advanced Indexing
Contents:
Module Overview 5-1
Module Overview
In earlier modules, you have seen that one of the most important decisions that Microsoft® SQL Server®
takes when executing a query is how to access the data in any of the tables involved in the query. SQL
Server can read the underlying table (which might be structured as a heap or with a clustered index), but
it might also choose to use another index. It is important to know how to determine the outcomes of the
decisions that SQL Server makes. Execution plans show how each step of a query was executed. In this
module, you will learn how to read and interpret execution plans and you will see how nonclustered
indexes have the potential to significantly enhance the performance of your applications. You will also
learn to use a tool that can help you design these indexes appropriately.
Objectives
After completing this module, you will be able to:
Lesson 1
Core Concepts of Execution Plans
The first steps in working with execution plans in Microsoft® SQL Server® data management software are
to understand why execution plans are so important and to understand the phases that SQL Server passes
through when it executes a query. When you have that information, you can learn what an execution plan
is, what the different types of execution plans are, and how execution plans relate to execution contexts. It
is possible to retrieve execution plans in a variety of formats. It is also important to understand the
differences between each of these formats and to know when to use each format.
Lesson Objectives
After completing this lesson, you will be able to:
Describe the phases that SQL Server passes through while executing a query.
Explain what execution plans are.
I created an index to make access to the table fast, but SQL Server is ignoring the index. Why won't it
use my index?
I have created an index on every column in the table, yet SQL Server still takes the same time to
execute my query. Why is it ignoring the indexes?
SQL Server provides tools to help answer these common questions. Execution plans show how SQL Server
intends to execute a query or how it executed a query. The ability to interpret these execution plans
enables you to answer the questions above.
Many users capture execution plans and then try to resolve the worst performing aspects of a query.
However, the best use of execution plans is in verifying that the plan that you expected to be used was, in
fact, used. This means that you already need to have an idea of how you expect SQL Server to execute
your queries.
Transact-SQL Parsing
In the second phase, SQL Server resolves the names of objects to their underlying object IDs. SQL Server
needs to know exactly which object is being referred to. For example, consider the statement in the
following code example.
At first glance, it might seem that mapping the Product table to its underlying object ID would be easy,
but remember that SQL Server supports more than a single object that has the same name in a database,
through the use of schemas. For example, note that each of the objects in the following code could be
completely different in structure and that the names relate to entirely different objects.
SQL Server needs to apply a set of rules to relate the table name “Product” to the intended object.
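The code examples referred to above are not reproduced in the text. A plausible illustration of the same point, with hypothetical schema names, is that each of the following statements can refer to an entirely different object:

```sql
-- Three distinct objects that share the name "Product" but live in
-- different schemas. Name resolution must pick the intended one.
SELECT ProductID FROM Production.Product;
SELECT ProductID FROM Sales.Product;
SELECT ProductID FROM dbo.Product;
```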
Query Optimization
After the object IDs have been resolved, SQL Server needs to decide how to execute the overall batch.
Based on the available statistics, SQL Server will make decisions about how to access the data that is
contained in each of the tables that are part of each query. This might involve creating new statistics or
updating existing statistics before executing the query.
SQL Server does not always find the best possible plan. It weighs up the cost of a plan, based on its
estimate of the cost of resources that are required to execute the plan. The cost is based on CPU
resources, memory, and I/O operations and is strongly influenced by the available statistics. The aim is to
find a satisfactory plan in a reasonable period of time. The more complex a Structured Query Language
(SQL) batch is, the longer it could take SQL Server to evaluate all of the possible plans that could be used
to execute the batch. Finding the best plan might take longer than executing a less optimal plan.
There is no need to consider alternate plans for data definition language (DDL) statements such as
CREATE, ALTER, or DROP. Many simple queries also have trivial plans that are quickly identified.
Query Plan Execution
After a plan is found, the execution engine and storage engine work together to execute the plan. Execution might still fail, because run-time errors can occur.
Plan Caching
If the plan is considered sufficiently useful, it may be stored in the Plan Cache. On later executions of the
batch, SQL Server will attempt to reuse execution plans from the Plan Cache. This is not always possible
and, for certain types of query, not always desirable.
SQL Server uses a cost-based optimizer: each element of the query plan is assigned a cost in relation to
the total cost of the batch. SQL Server Management Studio also calculates a relationship between the
costs of each statement, which is useful where a batch contains more than one statement.
The costs that are either estimated or calculated as part of the plan can only be interpreted within the
context of the plan. It is possible to compare the cost of individual elements across statements in a single
batch, but you should not make comparisons between the costs of elements in different batches. You can
only use costs to determine whether an operation is cheaper or more expensive than another operation.
You cannot use costs to estimate execution time.
Another option on the Query menu is Display Estimated Execution Plan. This asks SQL Server to
calculate an execution plan for a query (or batch) based on how it would attempt to execute the query.
This is calculated without actually executing the query. This type of plan is known as an “estimated”
execution plan. Estimated execution plans are very useful when you are designing queries or when you
are debugging queries that are suffering from performance problems.
Note that it is not always possible to retrieve an estimated execution plan. One common reason for this is
that the batch might include statements that create objects and then access them. The objects do not
exist yet, so SQL Server has no knowledge of them and cannot create a plan for processing them. You will
see an example of this in the next demonstration.
When SQL Server executes a plan, it may also make choices that differ from an estimated plan. This is
commonly related to the available resources (or more likely the lack of available resources) at the time
when the batch is executed.
Execution plans include row counts in each data path. For estimated execution plans, these are based on
estimates from the available statistics. For actual execution plans, both the estimated and actual row
counts are shown.
Execution contexts are cached for reuse in a very similar way to the caching that occurs with execution
plans. When a user executes a plan, SQL Server retrieves an execution context from the cache if there is
one available, even if it was generated for a different user.
To maximize performance and minimize memory requirements, execution contexts are not fully
completed when they are created. Branches of the code are “fleshed out” when the code needs to move
to the branch. This means that if a procedure includes a set of procedural logic statements (like the IF
statement), the execution context that is retrieved from the cache may have gone in a different logical
direction and not yet have all the details that are required, even if it was a different user who executed the
procedure.
For caching reuse, it is useful to avoid too much procedural logic in stored procedures. You should favor
set-based logic instead.
Plan Portability
SQL Server provided a graphical rendering of execution plans to make reading text-based plans easier.
One challenge with this, however, was that it was very difficult to send a copy of a plan to another user for
review. XML plans can be saved as an .sqlplan file type and are entirely portable between systems. You
can render graphical plans from XML plans, including plans that have been received from other users.
Note that graphical plans include only a subset of the information that is available in an XML plan.
Although it is not easy to read XML plans directly, you can obtain further information by reading the
contents of the XML plan.
XML plans are also ideal for programmatic access for users who are creating tools and utilities because
XML is relatively easy to consume programmatically in an application.
SET Statements
The Transact-SQL SET statements enable you to
view execution plan information in text format, or
to capture it in XML format so that you can use
other applications to view it or process it. The
output from these statements is displayed on the
Messages tab in the Results pane in SQL Server
Management Studio.
SET STATISTICS IO
The SET STATISTICS IO statement reports information about the disk activity that a statement generates, including the following values:
Scan count. The scan count value represents the number of seeks or scans started against a table or index. A value greater than 1 indicates that a table was accessed repeatedly while executing
the query, for example, when a nested loops join is used that requires an index read for each value that
it is attempting to match.
Physical reads. The physical reads value represents the number of pages that have been read from
the disk. If the required data is already in the cache, this will be 0. If the data is not in the cache, SQL
Server accesses the pages from the disk, places them in the data cache, and then reads them from
there.
Logical reads. The logical reads value represents the number of pages that have been read from the
cache. The fewer reads that a query requires, the faster it will execute.
Read-ahead reads. The read-ahead reads value represents the number of pages that SQL Server read
from the disk into the cache to execute the query. The read-ahead mechanism anticipates the data
pages and index pages that might be needed to execute the query, and accesses them before they are
required for processing, which improves performance.
Large object (LOB) logical reads, LOB physical reads, and LOB read-ahead reads. These values
indicate the number of logical reads, physical reads, and read-ahead reads that were performed to
access LOB data.
The code example below includes the SET STATISTICS IO ON option in query execution:
SET STATISTICS IO
SET STATISTICS IO ON;
SELECT MONTH(s.OrderDate) AS OrderMonth, p.ProductName, SUM(s.SalesAmount) AS Revenue
FROM SalesOrder AS s
JOIN Product AS p ON s.ProductCode = p.ProductCode
WHERE YEAR(s.OrderDate) = YEAR(getdate())
GROUP BY MONTH(s.OrderDate), p.ProductName
ORDER BY MONTH(s.OrderDate), p.ProductName
SET STATISTICS IO OFF;
SET SHOWPLAN_XML
The SET SHOWPLAN_XML command displays the execution plan in XML format, which enables you to
use the output in other applications. SET SHOWPLAN_XML does not execute the Transact-SQL
statement.
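A minimal usage sketch follows (the table and column names reuse the earlier illustrative SalesOrder example and are not from a real database). SET SHOWPLAN_XML must be the only statement in its batch, so each statement is separated by GO:

```sql
SET SHOWPLAN_XML ON;
GO
-- This statement is not executed; its estimated plan is returned as an XML document.
SELECT OrderDate, SalesAmount
FROM SalesOrder
WHERE ProductCode = 'BK-1001';
GO
SET SHOWPLAN_XML OFF;
GO
```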
Note: SET SHOWPLAN_TEXT, SET SHOWPLAN_ALL, and SET STATISTICS PROFILE will
be deprecated in a future version of SQL Server, so you should avoid using them. Instead of SET
SHOWPLAN_TEXT and SET SHOWPLAN_ALL, you should use SET SHOWPLAN_XML. Instead
of SET STATISTICS PROFILE, you should use SET STATISTICS XML.
Reference Links: For more information about the SET commands, see the Displaying
Execution Plans by Using the Showplan SET Options topic in the Microsoft Developer Network
(MSDN) library.
Demonstration Steps
Use execution plans
1. Ensure that the 20464C-MIA-DC and 20464C-MIA-SQL virtual machines are running and then log on
to 20464C-MIA-SQL as ADVENTUREWORKS\Student with the password Pa$$w0rd.
2. Run D:\Demofiles\Mod05\Setup.cmd as an administrator to revert any changes.
3. In the virtual machine, on the taskbar, click SQL Server 2014 Management Studio.
4. In the Connect to Server window, in Server name, type MIA-SQL and then click Connect.
9. Follow the instructions contained within the comments of the script file.
Lesson 2
Common Execution Plan Elements
Now that you have learned about the role and format of execution plans, it is important to learn to
interpret them. Execution plans can contain many different types of elements, but certain elements
appear regularly. In this lesson, you will learn to interpret execution plans and explore their most
common elements.
Lesson Objectives
After completing this lesson, you will be able to:
Describe table scans, clustered index scans, and clustered index seeks.
Describe aggregations.
Describe filter and sort operations.
If a query's logic is related to the clustering key for the table, SQL Server may be able to use the index that
supports it to quickly locate the row or rows required. For example, if a Customer table is clustered on a
CustomerID column, consider how the following query would be executed.
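The query that this passage refers to is not reproduced in this copy. The following sketch, which assumes a dbo.Customer table clustered on a CustomerID column, is representative of a query that would produce a clustered index seek:

```sql
-- The predicate on the clustering key lets SQL Server seek directly to the row.
SELECT CustomerID, CustomerName
FROM dbo.Customer
WHERE CustomerID = 12345;
```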
SQL Server does not need to read the entire table and can use the index to quickly locate the correct
customer. This is referred to as a clustered index seek. By comparison, if the WHERE clause had been on
another nonindexed column, a table scan would have occurred.
In some earlier documentation, a Key Lookup was also referred to as a Bookmark Lookup. The Key
Lookup operator was introduced in SQL Server 2005 Service Pack 2; in earlier builds of SQL Server 2005,
the lookup was shown as a Clustered Index Seek operator that had a LOOKUP keyword associated
with it.
In the physical library analogy, a lookup is similar to reading through an author index and for each book
that is found in the index, going to collect it from the bookcases.
Lookups are often expensive operations because they need to be executed once for every row of the
upper input source. Note that in the execution plan shown, more than half of the cost of the query is
accounted for by the Key Lookup operator. In the next module, you will see options for minimizing this
cost in some situations. The Nested Loops operator is the preferred choice whenever the number of rows
in the upper input source is small when compared with the number of rows in the lower input source.
Merge Joins
The answer depends upon the order of the sheets. If the customer sheets were in customer ID order and
the customer order sheets were also in customer ID order, merging the two piles would be easy. The
process involved is similar to what occurs when you use a Merge Join operator. You can only use this
operator when the inputs are already in the same order. One option to consider would be to presort the
two piles.
You can use the Merge Join operator to implement a variety of join types such as left outer joins, left
semi joins, left anti semi joins, right outer joins, right semi joins, right anti semi joins, and unions.
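As a sketch (the table names are assumptions), a join between two inputs that are both ordered on the join column can use a Merge Join; the OPTION (MERGE JOIN) hint forces the operator purely for illustration:

```sql
SELECT c.CustomerID, o.OrderDate
FROM dbo.Customer AS c
JOIN dbo.CustomerOrder AS o
    ON c.CustomerID = o.CustomerID  -- both inputs can be delivered in CustomerID order
OPTION (MERGE JOIN);                -- forces the Merge Join operator for demonstration
```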
Hash Matches
Now imagine how you would merge the piles of customers and customer orders if the customers were in
customer ID order, but the customer orders were ordered by customer order number. The same problem
would occur if the customer sheets were in postal code order. These situations are similar to the problem
that Hash Match operations encounter. There is no easy way to merge the piles. One option would be to
presort the data and then use a Merge Join operation, but a Hash Match operation is often more
efficient in this case.
Hash Match operations use a relatively “brute force” method of joining. One input is broken into a set of
“hash buckets” based on an algorithm. The other input is processed based on the same algorithm. In the
analogy with the piles of paper, the algorithm could be to obtain the first digit of the customer ID. Using
this algorithm, 10 buckets would be created. Now you can calculate the hash value for one row, and look
in the bucket that contains matching rows from the other table. The bucket will contain a relatively small
number of rows, and can be searched without having to do an entire table scan. If a match is found, the
rows are joined and returned. If no match is found, the input row is discarded and the next one is
examined.
Although it may not always be possible to avoid Hash Match operations in query plans, their presence is
often an indication of a lack of appropriate indexing on the underlying tables. In data warehouses, Hash
Match joins are often the most common form of join due to minimal indexing.
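The following sketch reuses the earlier illustrative SalesOrder and Product tables; the OPTION (HASH JOIN) hint forces the operator purely so that you can observe it in the plan:

```sql
SELECT p.ProductName, s.SalesAmount
FROM SalesOrder AS s
JOIN Product AS p
    ON s.ProductCode = p.ProductCode
OPTION (HASH JOIN);  -- forces the Hash Match join operator for demonstration
```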
Aggregations
There are two types of Aggregate operator: Stream
Aggregate and Hash Match Aggregate. Stream
Aggregate operations are very efficient.
One option would be to sort all of the customer orders by customer ID first, and then to count all of the
customer orders for each customer ID.
Another option is to process the input by using a hashing algorithm like the one that is used for Hash
Match operations. This is what SQL Server does when it uses a Hash Match Aggregate operation. The
presence of these operations in a query plan is often (but not always) an indication of a lack of
appropriate indexing on the underlying table.
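A minimal sketch of an aggregation that can take either form (table name reused from the earlier illustrative example):

```sql
-- If an index delivers rows already ordered by ProductCode, a Stream Aggregate
-- can count each group as the rows stream past; otherwise, a Hash Match
-- Aggregate buckets the rows first.
SELECT ProductCode, COUNT(*) AS OrderCount
FROM SalesOrder
GROUP BY ProductCode;
```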
Sort operations are often used to implement ORDER BY clauses in queries, but they have other uses. For
example, SQL Server can use a Sort operator to sort rows before they are passed to other operations,
such as Merge Join operations, or to perform DISTINCT or UNION operations.
Sorting data rows can be an expensive operation. You should avoid unnecessary ORDER BY operations.
Not all data needs to be put in a specific order. However, if a sorted result is required, you should always
use an ORDER BY clause. Do not depend upon a sorted outcome from an execution plan always staying in
that same order.
Demonstration Steps
Run queries that demonstrate the most common execution plan elements
1. Ensure that the 20464C-MIA-DC and 20464C-MIA-SQL virtual machines are running and then log on
to 20464C-MIA-SQL as ADVENTUREWORKS\Student with the password Pa$$w0rd.
2. If you have not completed the previous demonstrations in this module, run
D:\Demofiles\Mod05\Setup.cmd as an administrator to revert any changes.
3. In the virtual machine, on the taskbar, click SQL Server 2014 Management Studio.
4. In the Connect to Server window, in Server name, type MIA-SQL and then click Connect.
8. Follow the instructions contained within the comments of the script file.
Lesson 3
Working with Execution Plans
Now that you understand the importance of execution plans and are familiar with the common elements
that they contain, you need to consider the different ways in which plans can be captured. In this lesson,
you will see various ways to capture plans and explore the criteria by which SQL Server decides whether
to reuse them. SQL Server exposes several dynamic management views (DMVs) that you can use to
explore query plan reuse. You will also see how execution plans are used.
Lesson Objectives
After completing this lesson, you will be able to:
Explain how SQL Server decides whether to reuse existing plans when it reexecutes queries.
Use DMVs that are related to execution plans.
SQL Server Profiler has a Performance events > Showplan XML event that you can add to a trace; the
trace will then include the actual execution plans. You need to take care when you use this option
because, without appropriate filtering, you could quickly generate a huge trace output and degrade the
overall performance of the system.
SQL Server Profiler is still very commonly used, but over time, it will be replaced by the Extended Events
profiling sessions that are integrated into SQL Server Management Studio in SQL Server 2014. The
Extended Events profiling capability is more extensive than that provided by SQL Server Profiler. However,
you should continue to use SQL Server Profiler for capturing traces of SQL Server Analysis Services activity.
Dynamic management views provide information about recent expensive queries and missing indexes
that SQL Server detected when it created the plan. Activity Monitor in SQL Server can display the results of
querying these DMVs.
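As a sketch of this kind of DMV query (the DMVs and columns are documented; the TOP clause and ordering are arbitrary choices), the following returns the ten most CPU-expensive cached queries together with their plans:

```sql
SELECT TOP (10)
       qs.total_worker_time,          -- cumulative CPU time, in microseconds
       qs.execution_count,
       st.text AS query_text,
       qp.query_plan                  -- the cached plan as XML
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) AS qp
ORDER BY qs.total_worker_time DESC;
```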
The Data Collector in SQL Server collects information from the DMVs, uploads it to a central database,
and provides a series of reports based on the data. Unlike Activity Monitor, which shows recent expensive
queries, Data Collector can show historical entries. This can be very useful when a user asks about a
problem that occurred last Tuesday morning rather than at the time when the problem is occurring.
Reexecuting Queries
SQL Server attempts to reuse execution plans where
possible. Although this is often desirable, reusing
existing plans can be counterproductive to
performance.
Even for cached plans, SQL Server may eventually decide to evict them from the cache and recompile the
queries. The two main reasons for this are:
Correctness (for example, the schema of a referenced object has changed, so the existing plan may no
longer be valid).
Optimality (data has been sufficiently modified to require a new plan to be considered).
SQL Server assigns a cost to each plan that is cached, to estimate its “value.” The value is a measure of
how expensive the execution plan was to generate, and SQL Server gradually reduces it over time. When
memory resources become tight, SQL Server needs to decide which plans are the most useful to keep,
and the decision to evict a plan from memory is based on this reduced cost value.
Options are available to force compilation behavior of code, but they should be used sparingly and only
where necessary.
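For example, the OPTION (RECOMPILE) query hint discards the plan for a single statement after use, and the WITH RECOMPILE option on a stored procedure forces a fresh plan on every execution. A minimal sketch (the table name reuses the earlier illustrative example):

```sql
-- Compile a fresh plan for this statement every time it runs (use sparingly).
SELECT OrderDate, SalesAmount
FROM SalesOrder
WHERE ProductCode = 'BK-1001'
OPTION (RECOMPILE);
```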
Demonstration Steps
Viewing cached execution plans
1. Ensure that the 20464C-MIA-DC and 20464C-MIA-SQL virtual machines are running and then log on
to 20464C-MIA-SQL as ADVENTUREWORKS\Student with the password Pa$$w0rd.
3. In the virtual machine, on the taskbar, click SQL Server 2014 Management Studio.
4. In the Connect to Server window, in Server name, type MIA-SQL and then click Connect.
8. Follow the instructions contained within the comments of the script file.
Lesson 4
Designing Effective Nonclustered Indexes
Before you start to implement nonclustered indexes, you need to design them appropriately. In this
lesson, you will learn how to find information about the indexes that have been created and how to create
filtered indexes.
Lesson Objectives
After completing this lesson, you will be able to:
Each index has a property page that details the structure of the index and its operational, usage, and
physical-layout characteristics.
SQL Server Management Studio also includes a set of prebuilt reports that show the state of a database.
Many of these reports relate to index structure and usage.
The sp_helpindex system stored procedure returns details of the indexes that have been created on a
specified table.
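A usage sketch (the table name is illustrative):

```sql
EXEC sp_helpindex 'dbo.Customer';
```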
SQL Server provides a series of catalog views that provide information about indexes. Some of the more
useful views are shown in the following table.
sys.index_columns Column ID, position within the index, type (key or nonkey),
and sort order (ASC or DESC)
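A sketch of querying these catalog views for the indexes on one table (the table name is an assumption):

```sql
SELECT i.name AS index_name,
       ic.index_column_id,                                  -- position within the index
       COL_NAME(ic.object_id, ic.column_id) AS column_name,
       ic.is_descending_key,                                -- sort order
       ic.is_included_column                                -- key column or included column
FROM sys.indexes AS i
JOIN sys.index_columns AS ic
    ON i.object_id = ic.object_id
   AND i.index_id = ic.index_id
WHERE i.object_id = OBJECT_ID('dbo.Customer');
```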
SQL Server provides a series of dynamic management objects that contain useful information about the
structure and usage of indexes. Some of the most useful views and functions are shown in the following
table.
View Notes
System Functions
SQL Server provides a set of functions that provide information about the structure of indexes. Some of
the more useful functions are shown in the following table.
Function Notes
INDEXKEY_PROPERTY Index column position within the index and column sort
order (ASC or DESC)
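A usage sketch of INDEXKEY_PROPERTY (the table name and index ID are illustrative):

```sql
-- Properties of the first key column of index ID 1 on the table.
SELECT INDEXKEY_PROPERTY(OBJECT_ID('dbo.Customer'), 1, 1, 'ColumnId')     AS column_id,
       INDEXKEY_PROPERTY(OBJECT_ID('dbo.Customer'), 1, 1, 'IsDescending') AS is_descending;
```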
Filtered Indexes
Unless you specify otherwise, when you create a
nonclustered index on a table, the index will include
every row in the table. Although indexing all of the
rows in a table is frequently desirable, there are
scenarios when it might not be:
Tables that have many NULL values. When a column includes many NULL values, a nonclustered
index that is built on that column can be inefficient.
You can use filtered indexes to create smaller, more focused indexes that deliver greater efficiency and
better performance.
Filtered indexes are nonclustered indexes that you define by including a WHERE clause in the CREATE
INDEX statement. The WHERE clause filter limits the rows that the index will include, which has several
benefits, including:
The index is more efficient to manage, for example, rebuild and reindex operations will be faster.
The index will deliver faster response times because small indexes take less time to read than large
ones.
The size of the index statistics is correspondingly smaller, so updating statistics for a filtered index is
faster.
For example, most queries against the Employee table in the HumanResources database specify the data
value New York for the City column in the WHERE clause. By creating an index that includes only rows
that have New York in the City column, you can create a more efficient index that offers better
performance than an unfiltered index. When you are planning your indexing strategy, you should
consider the trade-off between indexes that have a broad coverage and indexes that are focused, but
might deliver better performance. Focused indexes are useful when you have a small number of high-
priority queries as the focus of your strategy. Broader indexes are useful when you have many queries of
equal priority.
The code example below creates a filtered index that includes a WHERE clause to limit the number of
rows that the index contains:
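The example itself is not reproduced in this copy; the following is a representative sketch that matches the Employee/City scenario described above (the object names are assumptions):

```sql
-- Only rows with City = 'New York' are included in the index.
CREATE NONCLUSTERED INDEX IX_Employee_City
ON dbo.Employee (City)
WHERE City = 'New York';
```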
You can use an indexed view to achieve a similar result to that achieved by using a filtered index; you just
need to specify a filter in the indexed view definition to exclude the unwanted rows. However, there are
some important differences between the two solutions. When you are deciding between using an indexed
view or a filtered index, consider the following points:
You can use indexed views to create indexes that are based on multiple tables, but you can only
create a filtered index on a single table.
Filtered indexes only support simple comparison operators in the WHERE clause of the index
definition; for example, you cannot use the LIKE operator to create a filtered index. If you need to
filter by using more complex logic, you can use an indexed view.
The query optimizer uses filtered indexes in more situations than indexed views, so by using a filtered
index, you are more likely to improve performance across more queries.
You can perform index rebuild operations while a filtered index is online, but indexed views do not
support online rebuilds.
Updates of filtered indexes generally require fewer CPU resources than updates to indexed views,
which helps to minimize maintenance costs.
Filtered indexes do not need to be unique indexes, but indexed views do, because the first index that is
built on a view must be a unique clustered index.
Demonstration Steps
Viewing index information
1. Ensure that the 20464C-MIA-DC and 20464C-MIA-SQL virtual machines are running and then log on
to 20464C-MIA-SQL as ADVENTUREWORKS\Student with the password Pa$$w0rd.
2. If you have not completed the previous demonstrations in this module, run
D:\Demofiles\Mod05\Setup.cmd as an administrator to revert any changes.
3. In the virtual machine, on the taskbar, click SQL Server 2014 Management Studio.
4. In the Connect to Server window, in Server name, type MIA-SQL and then click Connect.
5. On the File menu, click Open, click Project/Solution, navigate to
D:\Demofiles\Mod05\Demo05.ssmssln, and then click Open.
8. Follow the instructions contained within the comments of the script file.
Lesson 5
Performance Monitoring
Many factors can affect database performance. Correct indexing, hardware, network performance,
application design, logical and physical database design, data changes, and operating system
configuration are just a few of the things that could have major effects on database performance for the
user. This lesson describes the options for monitoring performance, and explains how you can create a
baseline to aid troubleshooting.
Lesson Objectives
After completing this lesson, you will be able to:
There can be no definitive rules about what a performance monitoring and tuning strategy should
include, and you should take each system on a case-by-case basis. Tune the system to meet your goals,
remove any bottlenecks that prevent you from meeting your goals, benchmark the system to provide a
performance baseline, and then monitor your system to ensure that you are meeting or exceeding your
baseline.
Note: SQL Server Profiler is being deprecated. It will be removed from a future version of
SQL Server and replaced by Extended Events Profiler and SQL Server Distributed Replay.
However, SQL Server Profiler is still the current recommended tool for capturing and replaying
traces.
Activity Monitor. Activity Monitor is available on the SQL Server Management Studio toolbar.
Activity Monitor enables you to identify expensive queries, and to view data file I/O, resource wait
times, processes, percentage of processor time, the number of waiting tasks, database I/O, and batch
requests per second. Activity Monitor is useful for identifying performance problems after you have
used Performance Monitor to determine that it is SQL Server that is causing the problem, not a
different component of the system such as Windows or Microsoft SharePoint® Server.
DMVs. The sys.dm_db_index_usage_stats DMV returns a large amount of information about index
operations, how many times they were performed, and when they were last performed.
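A sketch of querying this DMV for the current database (the join to sys.indexes retrieves index names; the ordering is an arbitrary choice):

```sql
SELECT OBJECT_NAME(ius.object_id) AS table_name,
       i.name AS index_name,
       ius.user_seeks,
       ius.user_scans,
       ius.user_lookups,
       ius.user_updates,
       ius.last_user_seek
FROM sys.dm_db_index_usage_stats AS ius
JOIN sys.indexes AS i
    ON ius.object_id = i.object_id
   AND ius.index_id = i.index_id
WHERE ius.database_id = DB_ID()
ORDER BY ius.user_seeks + ius.user_scans + ius.user_lookups DESC;
```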
A baseline provides a sound basis for hardware planning because it enables you to spot trends and
create projections for future hardware requirements. When hardware budgets are limited, this
approach can help to ensure that you spend the budget in the most cost-effective way.
A baseline enables you to assess the impact of changes in database design or hardware. After you
make the changes, you can use the baseline to verify that you have achieved the desired
improvements; if there is no improvement, you can roll back the changes, and if there is, you can keep
them. After you have made changes to a server, you should create a new baseline that reflects the new
configuration.
When you are planning a performance baseline, you should aim to create samples that monitor system
resource usage over an extended period of time, and to include periods of low, normal, and high usage.
This will help you to gain a true picture of system performance, rather than just a snapshot of
performance at a single point in time. The longer you monitor, the more reliable the statistics will be;
however, you will need to balance this against the impact of monitoring on system resources, including
storage space and CPU utilization, so that monitoring itself does not become a factor that negatively
affects performance. To minimize the impact of monitoring, you should monitor your servers from a
remote workstation, and connect to them by using Performance Monitor. You can specify the server that
you want to monitor in the Add Counters dialog box. You should avoid using remote desktop
connections to connect to a server and then running Performance Monitor on that server because this
uses server resources.
After you create a baseline, you should periodically compare current server performance with the baseline
figures. You should investigate any values that are significantly above or below baseline figures. You
should investigate unexpected improvement in addition to unexpected performance degradation. For
example, if no customers can access your website because of a denial-of-service attack, you might find
that the database server is running unusually quickly. This improvement is actually caused by a problem
elsewhere.
You can create a baseline by monitoring the following counters, and recording the information in a log:
o Memory:Available Mbytes. This counter captures the amount of available memory on the
server in megabytes. If there is not enough free memory, the operating system will use the
paging file, which impairs performance. There is no ideal figure for this counter that will suit all
servers, but you should ensure that there is enough free memory to handle not just the SQL
Server workloads, but any other workloads that run on the server, such as backup jobs and
administrative connections.
o Paging File:% Usage. This counter captures page file usage, and ideally should be a very low
value. A high value indicates that the server has insufficient memory. You can also use the
Memory:Pages/sec counter to verify this.
o SQL Server:Buffer Manager:Buffer cache hit ratio. This counter indicates the percentage of
pages that are read from the data cache without having to read from disk. Ideally, this figure
should be over 90 percent; if it is lower than this, the impact of disk I/O can become a problem.
o SQL Server:Buffer Manager:Page life expectancy. This counter indicates in seconds how long
pages that are read into memory will remain in the cache before being removed to enable the
caching of other pages. Higher values indicate that there is sufficient memory available; if the
value falls, this could be because the workload has increased and you need to add more memory.
Alternatively, it could indicate that poorly written queries are using table or index scans, which
bring the entire table or index into memory, forcing other items to be removed.
o SQL Server:Memory Manager:Memory Grants Pending. This counter indicates the number of
queries that are currently waiting to be allocated memory so that they can execute. The ideal
value for this counter is 0. A value higher than this is a strong indication that the server has
insufficient memory.
o PhysicalDisk: Avg. Disk Queue Length. A value greater than 2 for an individual disk often
indicates a potential bottleneck, particularly if you are also experiencing high disk latency.
o Processor:% Privileged Time. This counter indicates the percentage of total time that a CPU or
CPU core spends executing kernel commands, which includes SQL Server disk I/O requests. You
can use it to help identify inefficient and over-utilized disk subsystems.
o The counters that are described above measure all disk activity, regardless of its source. To
identify disk I/O that results specifically from SQL Server activity, you can use the following
counters:
SQL Server:Buffer Manager: Page reads/sec
SQL Server:Buffer Manager: Page writes/sec
SQL Server:Buffer Manager: Checkpoint pages/sec
SQL Server:Buffer Manager: Lazy writes/sec
Counters for assessing CPUs:
o Processor:% Processor Time. This counter indicates the percentage of time that a processor
spends processing workloads (sometimes referred to as executing non-idle threads). You can use
this counter to monitor individual CPUs and CPU cores or to monitor the total for all CPUs and
cores. If the value of this counter is consistently greater than 80 percent, it may indicate that the
CPU or CPUs represent a bottleneck in the system. On the other hand, a value of 20 percent or
less indicates spare capacity, which you could use to consolidate other databases or instances.
o System:Processor Queue Length. This counter indicates the number of threads that are waiting
for CPUs to become available so that they can be processed. On a single processor system, a
value that is consistently greater than five can indicate that the CPU or CPUs represent a
bottleneck in the system. On multiprocessor systems, you should divide the queue length by the
number of processors to obtain the relevant value.
Counters for assessing network performance:
o Network Interface:Bytes Total/sec. This counter captures the total number of bytes that are
sent and received over a network connection for each second.
o Network Interface:Current Bandwidth. This counter records the actual capacity (as opposed to
the rated capacity) of a network interface card.
You can calculate network utilization for a specific network adapter by multiplying Bytes Total/sec by 8
(to convert bytes to bits), dividing by Current Bandwidth, and multiplying the result by 100 to express it
as a percentage.
o IPv4:Datagrams/sec and IPv6:Datagrams/sec. You can use these counters to capture the
number of IP datagrams that are sent and received over a defined period of time, and use this as
a benchmark when you are testing network performance.
In addition to the counters that are described above, SQL Server includes a range of dedicated
performance objects and counters that you can use to create a baseline and to troubleshoot, including
the SQL Server:General Statistics and SQL Server:SQL Statistics performance objects. These objects
include a range of counters that you can use to create a baseline and to troubleshoot CPU-related
performance issues:
SQL Server:General Statistics:User Connections. You can use this counter to establish the number
of user connections to a server, and then monitor this over time. This can be used to corroborate the
data from other counters. For example, if you identify a CPU issue that is getting gradually worse, you
can check this against the number of user connections over the same time period to see if there is a
correlation.
o SQL Server:SQL Statistics:SQL Compilations/sec and SQL Server:SQL Statistics:SQL Re-
Compilations/sec. You can use these counters to track the number of times SQL Server compiles
and recompiles execution plans. Compiling an execution plan can be resource-intensive, so you
typically want to see a small number of compilations and recompilations. You can compare the
SQL Server:SQL Statistics:SQL Compilations/sec counter against the SQL Server:SQL
Statistics:Batch Requests/sec counter to see how many of the batches that are submitted to the
server require a compilation. The number of recompilations should be significantly lower than the
number of compilations, ideally about 10 percent. If this figure is significantly higher, you should
investigate the cause of the recompilations.
One of the company developers has provided you with a list of the most important queries that the new
marketing management system will execute. Depending upon how much time you have available, you
need to determine the best column orders for indexes to support each query.
Objectives
After completing this lab, you will have:
Password: Pa$$w0rd
2. View Statistics
3. Review the Results
4. Create Statistics
7. Answer Questions
2. Use a full scan of the data when you are creating the statistics.
Task 8: Execute an SQL Command and Check the Accuracy of Some Statistics
1. Execute the following command to check the accuracy of the statistics that have been generated.
Query 2
Results: After this exercise, you will have assessed selectivity on various queries.
Results: After completing this exercise, you will have created a covering index.
Question: Can two different queries end up with the same execution plan?
Review Question(s)
Question: What is the difference between a graphical execution plan and an XML execution
plan?
Question: Why might a Transact-SQL DELETE statement have a complex execution plan?
Module 6
In-Memory Database Capabilities
Contents:
Module Overview 6-1
Module Overview
The capacity of physical memory has grown substantially in recent years, while the cost of memory
modules has dropped. As a result, modern servers generally have much higher memory specifications
than servers in the past. Microsoft® SQL Server® 2014 data management software includes new and
enhanced features that take advantage of the increasing amount of memory in modern servers to
improve I/O performance. This module explores some of these features and explains how to use them to
maximize the performance and scalability of your database applications.
Objectives
After completing this module, you will be able to:
Use the buffer pool extension to improve performance for read-heavy online transaction processing
(OLTP) workloads.
Lesson 1
The Buffer Pool Extension
SQL Server uses a buffer pool of memory to cache data pages, reducing I/O demand and improving
overall performance. As database workloads intensify over time, you can add more memory to maintain
performance, but this solution is not always practical. Adding storage is often easier than adding memory,
and SQL Server 2014 introduces the buffer pool extension to enable you to use fast storage devices for
buffer pool pages.
Lesson Objectives
After completing this lesson, you will be able to:
Describe the key features and purpose of the buffer pool extension.
Identify scenarios where the buffer pool extension can improve performance.
Performance of OLTP applications that perform a large number of read operations can improve
significantly.
SSD devices are often less expensive per megabyte than physical memory, making this approach a
cost-effective way to improve performance in I/O-bound databases.
Enabling the buffer pool extension is straightforward, and doing so requires no changes to existing
applications.
Note: The buffer pool extension is only available in 64-bit installations of SQL Server 2014
Enterprise.
Scenarios where the buffer pool extension is unlikely to significantly improve performance include:
To disable the buffer pool extension, use the ALTER SERVER CONFIGURATION statement with the SET
BUFFER POOL EXTENSION OFF clause.
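The following statements illustrate enabling and then disabling the buffer pool extension. The file path and size shown are illustrative values for a hypothetical SSD volume, not requirements:

```sql
-- Enable the buffer pool extension (file path and size are illustrative).
ALTER SERVER CONFIGURATION
SET BUFFER POOL EXTENSION ON
    (FILENAME = 'E:\SSDCACHE\ExtensionFile.BPE', SIZE = 10 GB);

-- Disable the buffer pool extension.
ALTER SERVER CONFIGURATION
SET BUFFER POOL EXTENSION OFF;
```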
To resize or relocate the buffer pool extension file, you must disable the buffer pool extension and then
reenable it with the required configuration. When you disable the buffer pool extension, SQL Server will
have less buffer memory available, which may cause an immediate increase in memory pressure and I/O
and result in performance degradation. You should therefore plan reconfiguration of the buffer pool
extension carefully to minimize disruption to application users.
You can view the status of the buffer pool extension by querying the
sys.dm_os_buffer_pool_extension_configuration dynamic management view (DMV), and you can monitor
its usage by querying the sys.dm_os_buffer_descriptors DMV.
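The following queries sketch how you might use these DMVs to check the configuration and usage of the buffer pool extension:

```sql
-- Check whether the buffer pool extension is enabled, and where its file is located.
SELECT path, file_id, state_description, current_size_in_kb
FROM sys.dm_os_buffer_pool_extension_configuration;

-- Count the pages currently held in the buffer pool extension file.
SELECT COUNT(*) AS extension_pages
FROM sys.dm_os_buffer_descriptors
WHERE is_in_bpool_extension = 1;
```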
Demonstration Steps
Configure the buffer pool extension
1. Ensure that the 20464C-MIA-DC and 20464C-MIA-SQL virtual machines are running and then log on
to 20464C-MIA-SQL as ADVENTUREWORKS\Student with the password Pa$$w0rd.
3. In the virtual machine, on the taskbar, click SQL Server 2014 Management Studio.
4. In the Connect to Server window, in Server name, type MIA-SQL and then click Connect.
5. On the File menu, click Open, click Project/Solution, navigate to
D:\Demofiles\Mod06\Demo06.ssmssln, and then click Open.
9. Follow the instructions contained within the comments of the script file.
Lesson 2
Columnstore Indexes
SQL Server 2012 introduced significant new indexing functionality that can dramatically improve query
response times. This functionality, known as columnstore indexes, has been significantly enhanced
in SQL Server 2014.
Lesson Objectives
After completing this lesson, you will be able to:
Storage. Data is stored in a compressed columnar data format (stored by column) instead of rowstore
format (stored by row). Compression ratios of up to seven times those of row storage are achievable
with a columnstore index.
Batch mode execution. Data is processed in batches (of 1,000-row blocks) instead of row by row.
Depending on filtering and other factors, a query may also benefit from “segment elimination,” which
involves bypassing million-row chunks (segments) of data and further reducing I/O.
The size of the tables. Consider using columnstore indexes for very large fact or
dimension tables that have millions of rows. For smaller tables, columnstore indexes might not
provide a major performance benefit.
Data compression. Use columnstore indexes on tables that contain data that compresses well, such as
character or numeric columns with frequently repeated values.
The types of queries. Columnstore indexes deliver the best results with certain types of queries, such
as aggregate queries that join two tables and simple aggregate queries on a single table.
If you are unsure whether a columnstore index is suitable, you can create one and test the impact on your
query workload.
Nonclustered columnstore indexes are read-only; you cannot perform INSERT, UPDATE, DELETE, or
MERGE operations on a table that has a nonclustered columnstore index. To update the data in a
table with a nonclustered columnstore index, you can drop the index, update the data, and then re-
create the index or use partition switching to add new data. Alternatively, you can use a clustered
columnstore index, which you can update.
You cannot use columnstore indexes in conjunction with the following SQL Server features:
Change tracking
FILESTREAM columns
Replication
Sparse columns
You cannot store in-memory OLTP data as a SQL Server data file in Microsoft Azure™. This is because
it requires FILESTREAM data, which is not currently supported in Microsoft Azure. It is possible to use
in-memory functionality in a Microsoft Azure virtual machine.
Reference Links: For a full list of the limitations of using columnstore indexes, see the
Columnstore Indexes topic in SQL Server Books Online.
A clustered columnstore index does not store the columns in a sorted order; instead, it optimizes
storage for compression and performance.
Note: Clustered columnstore indexes are new in SQL Server 2014. In SQL Server 2012, you
can only create nonclustered columnstore indexes.
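The following statement shows how a table might be converted to a clustered columnstore index. The table and index names are illustrative:

```sql
-- Create a clustered columnstore index, converting the table's storage
-- to columnstore format (table and index names are illustrative).
CREATE CLUSTERED COLUMNSTORE INDEX CCI_FactSales
ON dbo.FactSales;
```

Note that a clustered columnstore index has no column list; it always includes all of the columns in the table.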
Clustered columnstore indexes store the data in compressed columnstore segments. However, some data
is stored in a rowstore table that is referred to as the “deltastore,” which is an intermediary storage
location for use until the data can be compressed and moved into a columnstore segment. The following
rules are used to manage data modifications:
When you use an INSERT statement to insert a new row, it is stored in the deltastore until there are
enough rows to meet the minimum size for a rowgroup. This rowgroup is then compressed and
moved into the columnstore segments.
When you execute a DELETE statement, affected rows that are stored in the deltastore are physically
deleted. Affected data in the columnstore segments is marked as deleted and the physical storage is
only reclaimed when the index is rebuilt.
When you execute an UPDATE statement, affected rows in the deltastore are updated. Affected rows
in the columnstore are marked as deleted and a new row is inserted into the deltastore.
You cannot update a table that contains a nonclustered columnstore index; it is read-only. You can
handle updates in the following ways:
Periodically drop the index, perform the updates to the table, and then re-create the index.
This approach is the simplest way of handling updates, and fits in with the way in which many
organizations already perform data loads into their data warehouses. The disadvantage of this
approach is that creating a columnstore index can be time-consuming when the base table is very
large, and this can be problematic when the window for performing a data load is relatively short.
Use table partitioning. When you create an index on a partitioned table, SQL Server automatically
aligns the index with the table, meaning that the index is divided up in the same way as the table.
When you switch a partition out of the table, the aligned index partition switches out of the table,
too. You can use partition-switching to perform inserts, updates, merges, and deletes:
o To perform a bulk insert, partition the table, load new data into a staging table, build a
columnstore index on the staging table, and then use partition-switching to load the data into
the partitioned data warehouse table.
o For other types of updates, you can switch a partition out of the data warehouse table into a
staging table, drop or disable the columnstore index on the staging table, perform the updates,
re-create or rebuild the columnstore index on the staging table, and then switch the staging
table back into the data warehouse table.
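The bulk insert pattern described above can be sketched as follows. The table names, column list, and partition number are illustrative assumptions:

```sql
-- 1. Load new data into an empty staging table that matches the
--    structure of the partitioned fact table (names are illustrative).
-- 2. Build a matching columnstore index on the staging table.
CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_FactSales_Staging
ON dbo.FactSales_Staging (OrderDateKey, ProductKey, SalesAmount);

-- 3. Switch the staging table into the empty target partition.
ALTER TABLE dbo.FactSales_Staging
SWITCH TO dbo.FactSales PARTITION 5;
```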
Trickle Updating
The techniques that are described above enable administrators to update nonclustered columnstore index
tables in a typical data warehouse scenario, where access to static data is adequate. However, it is
sometimes necessary to provide users with access to live data and recent updates between data loads.
Although you cannot update a table that has a nonclustered columnstore index directly, you can provide
access to changing data by using a delta table. A delta table is a table that has the same columns as the
table that has the columnstore index, and contains changed data such as new rows. You can write queries
that use the UNION operator to combine the changed data in the delta table with the static data in the
table that has the columnstore index. This approach is sometimes called trickle updating. During the
periodic data warehouse data load, you can remove the data from the delta table and load it into the
columnstore table. This helps to keep the delta table relatively small, which is necessary to ensure that you
maintain the performance benefit that the columnstore index provides.
For queries that involve aggregating data from the columnstore table and the delta table, you can use a
common table expression to perform local-global aggregation. Local-global aggregation involves
separately aggregating the required values from the delta table and the columnstore table, and then
combining and aggregating the two result sets.
The following code example uses a common table expression to combine and aggregate data from a
columnstore index and data from a delta table:
Combining Data from a Columnstore Index with Data from a Delta Table
WITH AggregateSOD (ProductKey, UnitPrice)
AS (SELECT ProductKey, SUM(UnitPrice) FROM SalesOrderDetail
GROUP BY ProductKey
UNION ALL
SELECT ProductKey, SUM(UnitPrice) FROM SOD_Delta
GROUP BY ProductKey)
SELECT ProductKey, SUM(UnitPrice) AS Total FROM AggregateSOD
GROUP BY ProductKey
ORDER BY Total DESC;
To create a nonclustered columnstore index, use the CREATE NONCLUSTERED COLUMNSTORE INDEX
statement as shown in the following code example:
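A representative example is shown below; the table and column names are illustrative:

```sql
-- Create a nonclustered columnstore index on a fact table
-- (table and column names are illustrative).
CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_FactSales
ON dbo.FactSales (OrderDateKey, ProductKey, SalesAmount, OrderQuantity);
```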
To create a columnstore index by using SQL Server Management Studio, in Object Explorer, expand the
relevant database, expand the Tables node, expand the table that you want to index, right-click the
Indexes node, click New Index, and then create the required kind of columnstore index.
Demonstration Steps
Create a columnstore index
1. Ensure that the 20464C-MIA-DC and 20464C-MIA-SQL virtual machines are running and then log on
to 20464C-MIA-SQL as AdventureWorks\Student with the password Pa$$w0rd.
3. In the virtual machine, on the taskbar, click SQL Server 2014 Management Studio.
4. In the Connect to Server window, in Server name, type MIA-SQL and then click Connect.
5. On the File menu, click Open, click Project/Solution, navigate to
D:\Demofiles\Mod06\Demo06.ssmssln, and then click Open.
8. Follow the instructions contained within the comments of the script file.
You are planning to optimize some database workloads by using the in-memory database capabilities of
SQL Server 2014. To test these capabilities, you will enable the buffer pool extension and create
columnstore indexes.
Objectives
After completing this lab, you will be able to:
Password: Pa$$w0rd
You want to extend the buffer pool onto the SSD device by using a 10-GB file named BufferCache.bpe.
o Size: 10 GB
Results: After completing this exercise, you should have enabled the buffer pool extension.
New data is loaded to the FactInternetSales table on a weekly basis by an ETL process that drops all
indexes, loads the new data, and re-creates all indexes. The FactProductInventory table is updated on an
ongoing basis.
You want to retain the existing indexes on the FactInternetSales table, but you do not need to retain any
existing indexes or keys on the FactProductInventory table.
The main tasks for this exercise are as follows:
2. Configure SQL Server Management Studio to include the actual execution plan, and then execute the
script in the AdventureWorksDW database. Review the execution plan, and note the indexes that
were used.
3. Based on the scenario for this exercise, decide whether a clustered or nonclustered index is
appropriate for the FactInternetSales table.
4. Create the required columnstore index, dropping existing indexes and keys if required and including
all columns in the FactInternetSales table. Then reexecute the query to verify that the new
columnstore index is used along with existing indexes.
2. Configure SQL Server Management Studio to include the actual execution plan, and then execute the
script in the AdventureWorksDW database. Review the execution plan, and note the indexes that
were used.
3. Based on the scenario for this exercise, decide whether a clustered or nonclustered index is
appropriate for the FactProductInventory table.
4. Create the required columnstore index, dropping existing indexes and keys if required. Then
reexecute the query to verify that the new columnstore index is used along with existing indexes.
Results: After completing this exercise, you should have created columnstore indexes.
Module 7
Designing and Implementing Views
Contents:
Module Overview 7-1
Module Overview
Views are a type of virtual table because the result set of a view is not usually saved in the database. Views
can simplify the design of database applications by abstracting the complexity of the underlying objects.
Views can also provide a layer of security. It is possible to give users permission to access a view without
permission to access the objects on which the view is constructed.
Objectives
After completing this module, you will be able to:
Lesson 1
Introduction to Views
In this lesson, you will gain an understanding of views and how they are used. You will also investigate the
system views that Microsoft® SQL Server® data management software supplies. A view is effectively a
named SELECT query. Unlike ordinary tables (base tables) in a relational database, a view is not part of the
physical schema; it is a dynamic, virtual table that is computed or collected from data in the database.
Effective use of views in database system design helps improve performance and manageability. In this
lesson, you will learn about views, the different types of views, and how to use them.
Lesson Objectives
After completing this lesson, you will be able to:
Describe views.
What Is a View?
You can think of a view as a named virtual table
that is defined through a SELECT statement. To an
application, a view behaves very similarly to a table.
Horizontal filtering is used to limit the rows that the view returns. For example, a Sales table might hold
details of the sales for the entire organization. Sales staff might only be permitted to view sales for their
own region or state. You could create a view that limits the rows that are returned to those for a particular
state or region.
Types of Views
There are four basic types of view: standard views,
system views (including dynamic management
views), indexed views, and partitioned views
(including distributed partitioned views).
Standard Views
Standard views combine data from one or more
base tables (or views) into a new virtual table. From
the base tables (or views), particular columns and
rows can be returned. Any computations, such as
joins or aggregations, are performed during query
execution for each query that references the view.
System Views
SQL Server provides system views, which show details of the system catalog or aspects of the state of SQL
Server. Dynamic management views (DMVs) were introduced in SQL Server 2005 and enhanced in every
version since then. DMVs provide dynamic information about the state of SQL Server, such as information
about the current sessions or the queries those sessions are executing.
Indexed Views
Indexed views materialize the view through the creation of a clustered index on the view. This is usually
done to improve query performance and will consume disk space. You can avoid complex joins or lengthy
aggregations at execution time by precalculating the results. Indexed views are discussed later in this
module.
Partitioned Views
Partitioned views unite data from multiple tables into a single view. One column in the view defines which
underlying table stores the data and CHECK constraints on the table enforce this. Distributed partitioned
views are formed when the tables that are being combined by a UNION operation are located on
separate instances of SQL Server.
Advantages of Views
Views are generally used to focus, simplify, and
customize the perception that each user has of the
tables in the database.
Many external applications cannot execute stored procedures or Transact-SQL code, but can select data
from tables or views. Creating a view enables you to isolate the data that is needed for these export
functions.
It is possible to use views to provide a backward-compatible interface to emulate a table that previously
existed, but whose schema has changed. For example, if a Customer table has been split into two tables,
CustomerGeneral and CustomerCredit, a Customer view could be created over the two new tables to
make it appear that the Customer table still exists. This would enable existing applications to query the
data without requiring the applications to be altered.
Reporting applications often need to execute complex queries to retrieve the report data. Rather than
embedding this logic in the reporting application, a view could be created to supply the data that the
reporting application requires in a much simpler format.
System Views
SQL Server provides information about its
configuration through a series of system views.
These views also provide metadata that describes
both the objects that you create in the database
and the objects that SQL Server provides. Catalog
views are primarily used to retrieve metadata about
tables and other objects in databases.
The International Organization for Standardization (ISO) has standards for Structured Query Language
(SQL). Each database engine vendor uses different methods of storing and accessing metadata, so a
standard mechanism was designed. This interface is provided by the views in the INFORMATION_SCHEMA
schema. The most commonly used INFORMATION_SCHEMA views are:
INFORMATION_SCHEMA.CHECK_CONSTRAINTS
INFORMATION_SCHEMA.COLUMNS
INFORMATION_SCHEMA.PARAMETERS
INFORMATION_SCHEMA.REFERENTIAL_CONSTRAINTS
INFORMATION_SCHEMA.ROUTINE_COLUMNS
INFORMATION_SCHEMA.ROUTINES
INFORMATION_SCHEMA.TABLE_CONSTRAINTS
INFORMATION_SCHEMA.TABLE_PRIVILEGES
INFORMATION_SCHEMA.TABLES
INFORMATION_SCHEMA.VIEW_COLUMN_USAGE
INFORMATION_SCHEMA.VIEW_TABLE_USAGE
INFORMATION_SCHEMA.VIEWS
You can see the list of current DMVs by browsing the System Views node in Object Explorer in SQL
Server Management Studio. Similarly, you can see the list of current dynamic management functions
(DMFs) by browsing the System Functions node in Object Explorer.
You can use dynamic management objects (DMOs), the collective term for DMVs and DMFs, to view and
monitor the internal health and performance of a server, along with aspects of its configuration. They also
play an important role in troubleshooting problems (such as blocking issues) and in performance tuning.
Demonstration Steps
Query system views and dynamic management views
1. Ensure that the 20464C-MIA-DC and 20464C-MIA-SQL virtual machines are running and then log on
to 20464C-MIA-SQL as AdventureWorks\Student with the password Pa$$w0rd.
3. In the virtual machine, on the taskbar, click SQL Server 2014 Management Studio.
4. In the Connect to Server window, in Server name, type MIA-SQL and then click Connect.
8. Follow the instructions contained within the comments of the script file.
Lesson 2
Creating and Managing Views
In the previous lesson, you learned about the role of views. In this lesson, you will learn how to create,
drop, and alter views. You will also learn how views and the objects on which they are based have owners
and how this can affect the use of views. You will see how to find information about existing views and
how to obfuscate the definitions of views.
Lesson Objectives
After completing this lesson, you will be able to:
Create views.
Drop views.
Alter views.
Creating Views
To create a view, the database owner must grant
you permission to do so. Creating a view involves
associating a name with a SELECT statement.
CREATE VIEW
Views can be based on other views instead of being
based on the underlying tables. Up to 32 levels of
nesting are permitted. You should take care when
nesting views deeply because it can become
difficult to understand the complexity of the
underlying code and to troubleshoot performance
problems that are related to the views.
Views have no natural output order. Queries that access the views should specify the order for the
returned rows. You can use the ORDER BY clause in a view, but only to satisfy the needs of a clause such
as the TOP clause.
If you specify the WITH SCHEMABINDING option, the underlying tables cannot be changed in a way
that would affect the view definition. If you later decide to index the view, you must use the WITH
SCHEMABINDING option.
Expressions that are returned as columns need to be aliased. You can define column aliases in the
SELECT statement within the view definition, or provide a column list after the name of the view.
CREATE VIEW
CREATE VIEW HumanResources.EmployeeList
(EmployeeID, FamilyName, GivenName)
AS
SELECT EmployeeID, LastName, FirstName
FROM HumanResources.Employee;
Dropping Views
Dropping a view removes the definition of the view
and all permissions that are associated with the
view.
DROP VIEW
Even if a view is re-created with exactly the same
name as a view that has been dropped, permissions
that were formerly associated with the view are
removed.
If a view was created by using the WITH SCHEMABINDING option, you must drop the view before you
can change the structure of the underlying tables.
The DROP VIEW statement supports the dropping of multiple views via a comma-delimited list, as shown
in the following code example:
DROP VIEW
DROP VIEW Sales.WASales, Sales.CTSales, Sales.CASales;
Altering Views
After a view is defined, you can modify its definition
without dropping and re-creating the view.
ALTER VIEW
The ALTER VIEW statement modifies a previously
created view. (This includes indexed views, which
are discussed in the next lesson.)
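For example, the HumanResources.EmployeeList view created earlier could be redefined to return an additional column. (The JobTitle column is assumed, for illustration, to exist in the underlying table.)

```sql
-- Redefine an existing view to add a column
-- (JobTitle is an assumed column, shown for illustration).
ALTER VIEW HumanResources.EmployeeList
(EmployeeID, FamilyName, GivenName, JobTitle)
AS
SELECT EmployeeID, LastName, FirstName, JobTitle
FROM HumanResources.Employee;
```

Unlike dropping and re-creating the view, ALTER VIEW retains the permissions that are associated with the view.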
Ownership Chaining
One of the key reasons for using views is to provide
a layer of security abstraction so that access is given
to views and not to the underlying table or tables.
For this mechanism to function correctly, an
unbroken ownership chain must exist.
For example, a user, John, has no access to a table that Nupur owns. If Nupur creates a view or stored
procedure that accesses the table and gives John permission to the view, John can then access the view
and through it, the data in the underlying table. However, if Nupur creates a view or stored procedure
that accesses a table that Tim owns and grants John access to the view or stored procedure, John would
not be able to use the view or stored procedure, even if Nupur has access to Tim's table, because of the
broken ownership chain. Two options are available to correct this situation:
In Transact-SQL, you can obtain the list of views in a database by querying the sys.views view.
In earlier versions of SQL Server, you could locate object definitions (including the definitions of
unencrypted views) by executing the sp_helptext system stored procedure.
The OBJECT_DEFINITION() function enables you to query the definition of an object in a relational
format. The output of the function is easier to consume in an application than the output of a system
stored procedure such as sp_helptext.
If you change the name of an object that a view references, you must modify the view so that its text
reflects the new name. Therefore, before renaming an object, display its dependencies to determine
whether the proposed change will affect any views.
You can find overall dependencies by querying the sys.sql_expression_dependencies view. You can find
column-level dependencies by querying the sys.dm_sql_referenced_entities view.
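The following queries sketch these techniques; the object names are illustrative:

```sql
-- Retrieve the definition of a view (the view name is illustrative).
SELECT OBJECT_DEFINITION(OBJECT_ID('HumanResources.EmployeeList')) AS ViewDefinition;

-- List all views in the current database.
SELECT name, create_date, modify_date
FROM sys.views;

-- Before renaming a table, find the objects that reference it.
SELECT OBJECT_SCHEMA_NAME(referencing_id) AS referencing_schema,
       OBJECT_NAME(referencing_id) AS referencing_object
FROM sys.sql_expression_dependencies
WHERE referenced_entity_name = 'Employee';
```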
Updatable Views
It is possible to update data in the base tables by
updating a view.
It is possible to modify a row in a view in such a way that the row no longer belongs to the view. For
example, a view might select rows where the State column contains the value WA. If you update a row
and set the State column to the value CA, the row seems to have vanished when the view is queried
again. To prevent this, you can specify the WITH CHECK OPTION clause when you define the view. This
option verifies during data modifications that any modified row will still be returned by the view.
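A minimal sketch of this behavior, assuming a hypothetical Sales.Customer table with a State column:

```sql
-- A view restricted to Washington customers. WITH CHECK OPTION prevents
-- modifications through the view that would move a row out of the view
-- (table and column names are illustrative).
CREATE VIEW Sales.WACustomers
AS
SELECT CustomerID, CustomerName, State
FROM Sales.Customer
WHERE State = 'WA'
WITH CHECK OPTION;

-- This update fails, because the modified row would no longer be
-- returned by the view:
-- UPDATE Sales.WACustomers SET State = 'CA' WHERE CustomerID = 1;
```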
Data that is modified in a base table via a view still needs to meet the restrictions on those columns (such
as nullability, constraints, and defaults) as if the base table was modified directly. This can be particularly
challenging if not all of the columns in the base table are present in the view. For example, an INSERT
operation on the view would fail if the base table upon which it was based required mandatory columns
that were not exposed in the view and did not have DEFAULT values.
WITH ENCRYPTION
The WITH ENCRYPTION clause provides limited
obfuscation of the definition of a view.
The encryption that is provided is not very strong. Many third-party tools can decrypt the source code,
so you should not depend on this option to protect critical intellectual property.
Demonstration Steps
Create, query, and drop views
1. Ensure that the 20464C-MIA-DC and 20464C-MIA-SQL virtual machines are running and then log on
to 20464C-MIA-SQL as ADVENTUREWORKS\Student with the password Pa$$w0rd.
3. In the virtual machine, on the taskbar, click SQL Server 2014 Management Studio.
4. In the Connect to Server window, in Server name, type MIA-SQL and then click Connect.
7. Follow the instructions contained within the comments of the script file.
Lesson 3
Performance Considerations for Views
Now that you understand why views are important and know how to create them, it is important to
understand the potential performance impacts of using views.
In this lesson, you will see how views are incorporated directly into the execution plans of queries in which
they are used. You will see the effect and potential disadvantages of nesting views and see how it is
possible to improve performance in some situations.
Finally, you will see how it is possible to combine the data from multiple tables into a single view, even if
those tables are on different servers.
Lesson Objectives
After completing this lesson, you will be able to:
Standard views do not appear in execution plans for queries because SQL Server expands the view
definition into the query. Instead, the underlying objects that the view references appear in the
execution plan.
You should avoid using SELECT * in a view definition. For example, if you add a new column to the base
table, the view will not reflect the column until the view has been refreshed. You can correct this
situation by executing an appropriate ALTER VIEW statement or by calling the sp_refreshview system
stored procedure.
Partitioned Views
Partitioned views enable you to split the data in a
large table into smaller member tables. The data is
partitioned between the member tables based on
ranges of data values in one of the columns.
Data ranges for each member table in a partitioned
view are defined in a CHECK constraint that is
specified on the partitioning column. A UNION ALL
statement is used to combine selects of all of the
member tables into a single result set.
In a local partitioned view, all participating tables and the view reside on the same instance of SQL Server.
In most cases, you should use table partitioning instead of local partitioned views.
In a distributed partitioned view, at least one of the participating tables resides on a different (remote)
server. You can use distributed partitioned views to implement a federation of database servers.
Good planning and testing are crucial because major performance problems can arise if the design of the
partitioned views is not appropriate.
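A minimal local partitioned view sketch, assuming yearly sales tables; the names, columns, and ranges are illustrative:

```sql
-- Each member table constrains the partitioning column with a CHECK
-- constraint; the partitioning column is part of the primary key so
-- that the view remains updatable (names are illustrative).
CREATE TABLE dbo.Sales2013
(SalesID int NOT NULL,
 SalesYear int NOT NULL CHECK (SalesYear = 2013),
 Amount money NOT NULL,
 CONSTRAINT PK_Sales2013 PRIMARY KEY (SalesID, SalesYear));

CREATE TABLE dbo.Sales2014
(SalesID int NOT NULL,
 SalesYear int NOT NULL CHECK (SalesYear = 2014),
 Amount money NOT NULL,
 CONSTRAINT PK_Sales2014 PRIMARY KEY (SalesID, SalesYear));
GO

-- The partitioned view combines the member tables with UNION ALL.
CREATE VIEW dbo.AllSales
AS
SELECT SalesID, SalesYear, Amount FROM dbo.Sales2013
UNION ALL
SELECT SalesID, SalesYear, Amount FROM dbo.Sales2014;
```

The CHECK constraints enable the query optimizer to eliminate member tables that cannot contain qualifying rows when a query filters on the partitioning column.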
Demonstration Steps
Investigate how views can affect query performance
1. Ensure that the 20464C-MIA-DC and 20464C-MIA-SQL virtual machines are running and then log on
to 20464C-MIA-SQL as ADVENTUREWORKS\Student with the password Pa$$w0rd.
3. In the virtual machine, on the taskbar, click SQL Server 2014 Management Studio.
4. In the Connect to Server window, in Server name, type MIA-SQL and then click Connect.
7. Follow the instructions contained within the comments of the script file.
8. Close SQL Server Management Studio and SQL Server Profiler without saving any changes.
You can only build indexes on views that are deterministic. That is, the views must always return the same
data unless the underlying table data is altered. For example, an indexed view could not contain a column
that returned the outcome of the SYSDATETIME() function.
A view must have been created with the WITH SCHEMABINDING option before you can create an index
on it. The WITH SCHEMABINDING option prevents changes to the schema of the underlying tables while
the view exists.
You can imagine an indexed view as a special type of table that has a clustered index. The differences are
that the schema of the table is not defined directly; it is defined by the SELECT statement in the view.
Also, you don't modify the table directly; you modify the data in the “real” tables that underpin the view.
When the data in the underlying tables is modified, SQL Server automatically updates the data in the
indexed view.
Indexed views have a negative impact on the performance of INSERT, DELETE, and UPDATE operations
on the underlying tables, but they can also have a positive impact on the performance of SELECT queries
on the view. They are most useful for data that is regularly selected, but much less frequently updated.
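A minimal indexed view sketch, assuming a hypothetical Sales.OrderDetail table with a non-nullable LineTotal column. Note that an indexed view that uses GROUP BY must include COUNT_BIG(*), and the first index on the view must be unique and clustered:

```sql
-- Create a schema-bound view over a base table
-- (table and column names are illustrative).
CREATE VIEW Sales.OrderTotals
WITH SCHEMABINDING
AS
SELECT ProductID,
       SUM(LineTotal) AS TotalSales,
       COUNT_BIG(*) AS OrderCount   -- required for indexed views with GROUP BY
FROM Sales.OrderDetail
GROUP BY ProductID;
GO

-- Materialize the view by creating a unique clustered index on it.
CREATE UNIQUE CLUSTERED INDEX IX_OrderTotals
ON Sales.OrderTotals (ProductID);
```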
Details of organizational contacts are contained in several tables. The relationship management system
that the account management team is using needs to be able to gain access to these contacts. However,
the team needs a single view that contains all contacts. You need to design, implement, and test the
required view.
Objectives
After completing this lab, you will be able to:
Password: Pa$$w0rd
Supporting Documentation
View1: OnlineProducts
ViewColumn SourceColumn
ProductID ProductID
ProductName ProductName
ProductNumber ProductNumber
Size Size
UnitOfMeasure SizeUnitMeasureCode
Price ListPrice
Weight Weight
This view is based on the Marketing.Product table. Rows should only appear if the product has begun to
be sold and is still being sold. (Derive this from SellStartDate and SellEndDate.)
View2: AvailableModels
ViewColumn SourceColumn
ProductID ProductID
ProductName ProductName
ProductModelID ProductModelID
ProductModel ProductModel
This view is based on the Marketing.Product and Marketing.ProductModel tables. Rows should only
appear if the product has at least one model, has begun to be sold, and is still being sold. (Derive this
from SellStartDate and SellEndDate.)
Supporting Documentation
View3: Contacts
Question: What considerations are there for views that involve multiple tables?
Question: What is required for columns in views that are created from expressions?
Review Question(s)
Question: How does SQL Server store the view in the database?
Module 8
Designing and Implementing Stored Procedures
Contents:
Module Overview 8-1
Module Overview
Stored procedures enable you to create Transact-SQL logic that will be stored and executed at the server.
This logic might enforce business rules or data consistency. Stored procedures are also used to return sets
of rows based upon input parameters. You will see the potential advantages of the use of stored
procedures in this module along with guidelines on creating them.
Objectives
After completing this module, you will be able to:
Describe the role of stored procedures and the potential benefits of using them.
Work with stored procedures.
Lesson 1
Introduction to Stored Procedures
Microsoft® SQL Server® data management software provides many system stored procedures, and users
can create their own stored procedures, too. In this lesson, you will see the role of stored procedures and the potential
benefits of using them. System stored procedures provide a large amount of prebuilt functionality that
you can take advantage of when you are building applications. When you are designing stored
procedures, it is also important to realize that not all Transact-SQL statements are permitted within stored
procedures.
Lesson Objectives
After completing this lesson, you will be able to:
Identify statements that are not permitted within the body of a stored procedure declaration.
Alternatively, a stored procedure could be created at the server level to encapsulate all of the Transact-
SQL statements that are required. Stored procedures are given names and are called by name. The
application can then simply ask to execute the stored procedure each time it needs to use that same
functionality, rather than sending all of the statements that would otherwise be required.
Stored Procedures
Stored procedures are similar to procedures, methods, and functions in high-level languages. They can
have input and output parameters and a return value.
As a side effect of executing the stored procedure, rows of data can also be returned from the stored
procedure. In fact, multiple rowsets can be returned from a single stored procedure.
Stored procedures can be created in either Transact-SQL code or managed .NET code and are run by the
EXECUTE Transact-SQL statement. The creation of stored procedures in managed code will be discussed
in Module 12, Implementing Managed Code in SQL Server.
Security Boundary
Stored procedures can be part of a scheme that
helps to increase application security. They can be
treated as a security boundary. Users can be given
permission to execute a stored procedure without
being given permission to access the objects that
the stored procedure accesses.
For example, you can give a user (or set of users via
a role) permission to execute a stored procedure that updates a table without granting the user any
permissions directly on the table.
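A minimal sketch of this pattern, assuming a hypothetical user Pat and procedure Sales.UpdateCustomerPhone that updates a Sales.Customer table:

```sql
-- Pat can execute the procedure, but has no permissions on the table itself.
GRANT EXECUTE ON OBJECT::Sales.UpdateCustomerPhone TO Pat;

EXECUTE AS USER = 'Pat';
-- Succeeds via ownership chaining, even though Pat cannot UPDATE the table directly.
EXECUTE Sales.UpdateCustomerPhone @CustomerID = 1, @Phone = N'555-0100';
REVERT;
```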
Modular Programming
Code reuse is important. Stored procedures help by enabling logic to be created once and then enabling
the logic to be called many times and from many applications. Maintenance is easier because if a change
is needed, you only need to change the procedure, without needing to change the application code at all
in many cases. Changing a stored procedure could avoid the need to change the data access logic in a
group of applications.
Delayed Binding
It is possible to create a stored procedure that accesses (or references) a database object that does not yet
exist. This can be helpful in simplifying the order in which database objects need to be created. This is
referred to as deferred name resolution.
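For example, creating this procedure succeeds even though the referenced table does not yet exist (the names are illustrative):

```sql
CREATE PROCEDURE dbo.GetFutureData
AS
BEGIN
    -- The table name is resolved when the procedure is executed,
    -- not when it is created.
    SELECT SomeColumn FROM dbo.TableThatDoesNotExistYet;
END;
GO
-- Executing the procedure before the table exists raises an error;
-- creating it does not.
```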
Performance
Sending the name of a stored procedure to be executed rather than hundreds or thousands of lines of
executable Transact-SQL code can offer a significant reduction in the level of network traffic.
Before Transact-SQL code is executed, it needs to be compiled. When a stored procedure is compiled, in
many cases, SQL Server will attempt to retain (and reuse) the query plan that it previously generated, to
avoid the cost of the compilation of the code.
Although it is possible to reuse execution plans for ad-hoc Transact-SQL code that applications have
issued, SQL Server favors the reuse of stored procedure execution plans. Query plans for ad-hoc
Transact-SQL statements are among the first items to be removed from memory when the server comes
under memory pressure.
The rules that govern the reuse of query plans for ad-hoc Transact-SQL code are largely based on
matching the text of the queries exactly. Any difference at all (for example, white space or casing) will
cause a different query plan to be used, unless the difference is only a value that SQL Server decides must
be the equivalent of a parameter.
Stored procedures have a much higher chance of achieving query plan reuse.
Originally, there was a basic distinction in the naming of these stored procedures, where system stored
procedures had an sp_ prefix and system extended stored procedures had an xp_ prefix. Over time, the
need to maintain backward compatibility has caused a mixture of these prefixes to appear in both types
of procedure. Now, most system stored procedures have an sp_ prefix and most system extended stored
procedures have an xp_ prefix.
You should now use managed-code stored procedures instead of user-defined extended stored
procedures. The use of managed code to create stored procedures will be described in Module 12,
Implementing Managed Code in SQL Server.
Demonstration Steps
Execute system stored procedures
1. Ensure that the 20464C-MIA-DC and 20464C-MIA-SQL virtual machines are running and then log on
to 20464C-MIA-SQL as ADVENTUREWORKS\Student with the password Pa$$w0rd.
3. In the virtual machine, on the taskbar, click SQL Server 2014 Management Studio.
4. In the Connect to Server window, in Server name, type MIA-SQL and then click Connect.
8. Follow the instructions contained within the comments of the script file.
Lesson 2
Working with Stored Procedures
Now that you understand why stored procedures are important, you need to understand the practicalities
that are involved in working with stored procedures.
Lesson Objectives
After completing this lesson, you will be able to:
Stored procedures are always created in the current database with the single exception of stored
procedures that are created with a number sign (#) prefix in their name. The # prefix on a name indicates
that it is a temporary object. As such, it would be created in the tempdb database and removed at the
end of the user's session.
Note: Although wrapping the body of a stored procedure with a BEGIN…END block is not
required, doing so is considered a good practice. Note also that you can terminate the execution
of a stored procedure by executing a RETURN statement within the stored procedure.
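A minimal sketch combining these points: a temporary procedure (# prefix, created in tempdb), a body wrapped in BEGIN…END, and an early exit via RETURN (the procedure name and logic are illustrative):

```sql
CREATE PROCEDURE #CheckValue
    @Value int
AS
BEGIN
    IF @Value IS NULL
        RETURN;                    -- terminates execution of the procedure
    SELECT @Value AS CheckedValue; -- reached only for non-NULL input
END;
```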
EXECUTE Statement
The EXECUTE statement is mostly used to execute
stored procedures, but can also be used to execute
other objects such as dynamic Structured Query
Language (SQL) statements.
As mentioned in the first lesson, you can execute
system stored procedures within the master
database without having to explicitly refer to that
database. That does not apply to other stored procedures.
If the stored procedure name starts with sp_ (not recommended for user stored procedures):
SQL Server first looks in the master database in the sys schema for the stored procedure.
SQL Server then looks in the default schema for the user who is executing the stored procedure.
SQL Server then looks in the dbo schema in the current database for the stored procedure.
Having SQL Server perform unnecessary steps to locate a stored procedure reduces performance for no
reason.
ALTER PROC
The main reason for using the ALTER PROC
statement is to retain any existing permissions on
the procedure while it is being changed. Users may
have been granted permission to execute the
procedure. If you drop the procedure and re-create
it, those permissions that had been granted to the
users would be removed when the procedure was
dropped.
Procedure Type
Note that the type of procedure cannot be changed. For example, a Transact-SQL procedure cannot be
changed to a managed-code procedure by using an ALTER PROCEDURE statement or vice versa.
Connection Settings
The connection settings, such as QUOTED_IDENTIFIER and ANSI_NULLS, that will be associated with the
modified stored procedure will be those taken from the session that makes the change, not from the
original stored procedure, so it is important to keep these consistent when you are making changes.
Complete Replacement
Note that when you alter a stored procedure, you need to resupply any options (such as the WITH
ENCRYPTION clause) that were supplied while creating the procedure. None of these options are retained
and they are replaced by whatever options are supplied in the ALTER PROC statement.
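For example, a procedure that was originally created WITH ENCRYPTION must restate that option when altered, or it silently becomes unencrypted (the procedure name and body are illustrative):

```sql
ALTER PROCEDURE dbo.GetProductNames
    @ProductIDLimit int
WITH ENCRYPTION  -- must be resupplied here, or the option is lost
AS
BEGIN
    SELECT ProductID, Name
    FROM Production.Product
    WHERE ProductID < @ProductIDLimit;
END;
```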
Permissions
Dropping a procedure requires either ALTER
permission on the schema that the procedure is
part of or CONTROL permission on the procedure itself.
sp_depends
Earlier versions of SQL Server used the sp_depends
system stored procedure to return details of
dependencies between objects. It was known to
have issues and to report incomplete information
due to issues with deferred name resolution.
sys.sql_expression_dependencies
Use of the sys.sql_expression_dependencies view
replaces the previous use of the sp_depends system stored procedure. The
sys.sql_expression_dependencies view provides a “one row per name” dependency on user-defined
entities in the current database. sys.dm_sql_referenced_entities and sys.dm_sql_referencing_entities
provide more targeted views over the data that the sys.sql_expression_dependencies view provides.
You will see an example of these dependency views being used in the next demonstration.
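As a sketch, a query such as the following lists the objects that a given procedure references (the procedure name is illustrative):

```sql
SELECT referenced_schema_name, referenced_entity_name
FROM sys.sql_expression_dependencies
WHERE referencing_id = OBJECT_ID(N'dbo.GetProductNames');
```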
There is no right or wrong way to do this in all situations, but you should decide on a method for naming
objects that your applications are to use and apply the method consistently. It is possible to enforce
naming conventions on most objects by using Policy-Based Management (first introduced in SQL Server
2008 and beyond the scope of this course) or DDL triggers (first introduced in SQL Server 2005 and also
beyond the scope of this course).
WITH ENCRYPTION
As mentioned in Module 7, it is important to
understand that although SQL Server provides the
WITH ENCRYPTION clause to obfuscate the
definition of your stored procedures, the encryption
is not particularly strong.
In fact, the encryption is known to be relatively easy to defeat because the encryption keys are stored in
known locations within the encrypted text. There are both direct methods and several third-party tools
that can reverse the encryption.
You need to keep original copies of the source code regardless of the fact that decryption might be
possible. Do not depend upon this.
Encrypted code is much harder to work with in terms of diagnosing and tuning performance issues.
Demonstration Steps
Create, execute, and alter a stored procedure
1. Ensure that the 20464C-MIA-DC and 20464C-MIA-SQL virtual machines are running and then log on
to 20464C-MIA-SQL as ADVENTUREWORKS\Student with the password Pa$$w0rd.
2. Ensure that you have run the previous demos in this module.
3. In the virtual machine, on the taskbar, click SQL Server 2014 Management Studio.
4. In the Connect to Server window, in Server name, type MIA-SQL and then click Connect.
5. On the File menu, click Open, click Project/Solution, navigate to
D:\Demofiles\Mod08\Demo08.ssmssln, and then click Open.
8. Follow the instructions contained within the comments of the script file.
Lesson 3
Implementing Parameterized Stored Procedures
The stored procedures that you have seen earlier in this module have not involved parameters. They have
produced their output without needing any input from the user and they have not returned any values
apart from the rows that they have returned. Stored procedures are more flexible when you include
parameters as part of the procedure definition because you can create more generic application logic.
Stored procedures can use both input and output parameters and return values.
Although the reuse of query execution plans is desirable in general, there are situations where this reuse is
detrimental. You will see situations where this can occur and consider options for workarounds to avoid
the detrimental outcomes.
Lesson Objectives
After completing this lesson, you will be able to:
Parameterize stored procedures.
Input Parameters
Parameters are used to exchange data between
stored procedures and the application or tool that
called the stored procedure. They enable the caller
to pass a data value to the stored procedure. To
define a stored procedure that accepts input
parameters, you declare one or more variables as
parameters in the CREATE PROCEDURE statement.
You will see an example of this in the next topic.
Output Parameters
Output parameters enable the stored procedure to pass a data value or a cursor variable back to the
caller. To use an output parameter within Transact-SQL, you must specify the OUTPUT keyword in both
the CREATE PROCEDURE statement and the EXECUTE statement.
Return Values
Every stored procedure returns an integer return code to the caller. If the stored procedure does not
explicitly set a value for the return code, the return code is 0 if no error occurs; otherwise a negative value
is returned.
Return values are commonly used to return a status result or an error code from a procedure and are sent
by the Transact-SQL RETURN statement.
Although it is possible to send a value that is related to business logic via a RETURN statement, in
general, you should use output parameters to return such values rather than the RETURN value.
Default Values
Provide default values for a parameter where appropriate. If a default is defined, a user can execute the
stored procedure without specifying a value for that parameter.
Default Values
CREATE PROCEDURE Sales.OrdersByDueDateAndStatus
@DueDate datetime, @Status tinyint = 5
AS
-- Body omitted in the original; a minimal illustrative body (table name assumed):
SELECT SalesOrderID FROM Sales.SalesOrderHeader
WHERE DueDate = @DueDate AND Status = @Status;
Two parameters have been defined (@DueDate and @Status). The @DueDate parameter has no default
value and must be supplied when the procedure is executed. The @Status parameter has a default value
of 5. If a value for the parameter is not supplied when the stored procedure is executed, a value of 5 will
be used.
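An execution that supplies both parameter values by position might look like this (the literal values are illustrative):

```sql
EXECUTE Sales.OrdersByDueDateAndStatus '2008-07-01', 3;
```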
This execution supplies a value for both @DueDate and @Status. Note that the names of the parameters
have not been mentioned. SQL Server knows which parameter is which by its position in the parameter
list.
This is an example of the previous stored procedure with one input parameter supplied and one
parameter using the default value:
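For example (the date value is illustrative):

```sql
EXECUTE Sales.OrdersByDueDateAndStatus '2008-07-01';
```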
In this case, a value for the @DueDate parameter has been supplied, but no value for the @Status
parameter has been supplied. In this case, the procedure will be executed with the @Status value set at a
default value of 5.
This is an example of a stored procedure being executed and both parameters are defined by name.
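For example (the values are illustrative):

```sql
EXECUTE Sales.OrdersByDueDateAndStatus @DueDate = '2008-07-01', @Status = 3;
```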
In this case, the stored procedure is being called by using both parameters, but they are being identified
by name.
In this example, the results will be the same, even though they are in a different order, because the
parameters are defined by name:
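For example (the values are illustrative):

```sql
EXECUTE Sales.OrdersByDueDateAndStatus @Status = 3, @DueDate = '2008-07-01';
```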
In this case, the @DueDate parameter is an input parameter and the @OrderCount parameter has been
specified as an output parameter. Note that, in SQL Server, there is no true equivalent of a .NET output
parameter. SQL Server output parameters are really input/output parameters.
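Such a procedure might be defined as follows (a sketch; the procedure name and underlying table are assumed):

```sql
CREATE PROCEDURE Sales.GetOrderCountByDueDate
    @DueDate datetime,
    @OrderCount int OUTPUT
AS
BEGIN
    SELECT @OrderCount = COUNT(*)
    FROM Sales.SalesOrderHeader
    WHERE DueDate = @DueDate;
END;
```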
First, variables to hold the parameter values have been declared. In this case, a variable to hold a due date
has been declared, along with another to hold the order count.
In the EXEC call, note that the @OrderCount parameter is followed by the OUTPUT keyword. If you do
not specify the output parameter in the EXEC statement, the stored procedure would still execute as
normal, including preparing a value to return in the output parameter. However, the output parameter
value would simply not be copied back into the @OrderCount variable. This is a common bug when
working with output parameters.
Finally, you would then use the returned value in the business logic that follows the EXEC call.
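The calling pattern described above can be sketched like this (against a hypothetical procedure with an @OrderCount output parameter):

```sql
DECLARE @DueDate datetime = '2008-07-01'; -- value to pass in
DECLARE @OrderCount int;                  -- receives the output value

-- Without the OUTPUT keyword here, @OrderCount would remain unset.
EXECUTE Sales.GetOrderCountByDueDate @DueDate, @OrderCount OUTPUT;

SELECT @OrderCount AS OrdersDue;          -- use the returned value
```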
SQL Server provides various ways to deal with this problem, which is often called a “parameter-sniffing”
problem. Note that parameter sniffing only applies to parameters, not to variables within the batch. The
code for these looks very similar, but variable values are not “sniffed” at all, which can also lead to poor
execution plans.
WITH RECOMPILE
You can add a WITH RECOMPILE option when you are declaring a stored procedure. This causes the
procedure to be recompiled every time it is executed.
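For example (the procedure name and body are illustrative):

```sql
CREATE PROCEDURE dbo.GetProductNamesRecompiled
    @ProductIDLimit int
WITH RECOMPILE  -- a fresh query plan is compiled on every execution
AS
SELECT ProductID, Name
FROM Production.Product
WHERE ProductID < @ProductIDLimit;
```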
OPTIMIZE FOR
There is an OPTION (OPTIMIZE FOR) query hint that enables you to specify the value of a parameter
that should be assumed when compiling the procedure, regardless of the actual value of the parameter.
OPTIMIZE FOR
CREATE PROCEDURE dbo.GetProductNames
    @ProductIDLimit int
AS
BEGIN
    SELECT ProductID, Name
    FROM Production.Product
    WHERE ProductID < @ProductIDLimit
    OPTION (OPTIMIZE FOR (@ProductIDLimit = 1000));
END;
Demonstration Steps
Pass parameters to stored procedures
1. Ensure that the 20464C-MIA-DC and 20464C-MIA-SQL virtual machines are running and then log on
to 20464C-MIA-SQL as ADVENTUREWORKS\Student with the password Pa$$w0rd.
2. Ensure that you have run the previous demos in this module
3. In the virtual machine, on the taskbar, click SQL Server 2014 Management Studio.
4. In the Connect to Server window, in Server name, type MIA-SQL and then click Connect.
8. Follow the instructions contained within the comments of the script file.
Lesson 4
Controlling Execution Context
Stored procedures normally execute in the security context of the user who is calling the procedure. As
long as a chain of ownership extends from the stored procedure to the objects that are referenced, the
user can execute the procedure without the need for permissions on the underlying objects. Ownership-
chaining issues with stored procedures are identical to those for views. Sometimes, however, more precise
control over the security context in which the procedure is executing is desired.
Lesson Objectives
After completing this lesson, you will be able to:
Execution Contexts
A login token and a user token represent an
execution context. The tokens identify the primary
and secondary principals against which permissions
are checked and the source that is used to
authenticate the token. A login that connects to an
instance of SQL Server has one login token and one
or more user tokens, depending on the number of databases to which the account has access.
Login token: A login token is valid across the instance of SQL Server. It contains the primary and
secondary identities against which server-level permissions and any database-level permissions that are
associated with these identities are checked. The primary identity is the login itself. The secondary identity
includes permissions that are inherited from roles and groups.
User token: A user token is valid only for a specific database. It contains the primary and secondary
identities against which database-level permissions are checked. The primary identity is the database user
itself. The secondary identity includes permissions that are inherited from database roles. User tokens do
not contain server-role memberships and do not honor the server-level permissions that are granted to
the identities in the token including those that are granted to the server-level public role.
For example, if you add a WITH EXECUTE AS 'Pat' clause to the definition of a stored procedure, it will
cause the procedure to be executed with 'Pat' as the security context rather than with the default security
context that is supplied by the caller of the stored procedure.
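As a sketch (the procedure name and table are assumptions; 'Pat' is the user from the example above):

```sql
CREATE PROCEDURE dbo.UpdateAuditLog
WITH EXECUTE AS 'Pat'  -- runs in Pat's security context, not the caller's
AS
BEGIN
    -- The INSERT is checked against Pat's permissions.
    INSERT INTO dbo.AuditLog (EventTime) VALUES (SYSDATETIME());
END;
```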
Explicit Impersonation
SQL Server supports the ability to impersonate
another principal either explicitly by using the
stand-alone EXECUTE AS statement, or implicitly
by using the EXECUTE AS clause on modules.
Implicit Impersonation
You can perform implicit impersonations by using the WITH EXECUTE AS clause on modules to
impersonate the specified user or login at the database or server level. This impersonation depends on
whether the module is a database-level module, such as a stored procedure or function, or a server-level
module, such as a server-level trigger.
When you impersonate a principal by using the EXECUTE AS LOGIN statement, or within a server-scoped
module by using the EXECUTE AS clause, the scope of the impersonation is server-wide. This means that,
after the context switch, it is possible to access any resource within the server on which the impersonated
login has permissions.
However, when you impersonate a principal by using the EXECUTE AS USER statement, or within a
database-scoped module by using the EXECUTE AS clause, the scope of impersonation is restricted to the
database by default. This means that references to objects that are outside the scope of the database will
return an error.
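A minimal sketch of explicit, database-scoped impersonation (assuming a database user named Pat exists):

```sql
EXECUTE AS USER = 'Pat';           -- switch to Pat's database-scoped context
SELECT USER_NAME() AS CurrentUser; -- reflects the impersonated context
REVERT;                            -- return to the original context
```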
Demonstration Steps
View and change the execution context
1. Ensure that the 20464C-MIA-DC and 20464C-MIA-SQL virtual machines are running and then log on
to 20464C-MIA-SQL as ADVENTUREWORKS\Student with the password Pa$$w0rd.
2. Ensure that you have run the previous demos in this module.
3. In the virtual machine, on the taskbar, click SQL Server 2014 Management Studio.
4. In the Connect to Server window, in Server name, type MIA-SQL and then click Connect.
8. Follow the instructions contained within the comments of the script file.
Objectives
After completing this lab, you will be able to:
Password: Pa$$w0rd
Supporting Documentation
Stored procedure Marketing.GetProductColors
2. In the D:\Labfiles\Lab08\Starter folder, right-click Setup.cmd, and then click Run as administrator.
3. When you are prompted, click Yes to confirm that you want to run the command file, and then wait
for the script to finish.
Supporting Documentation
Stored procedure Marketing.GetProductsByColor
Notes: Colors should not be returned more than once in the output.
NULL values should not be returned.
Note: Ensure that approximately 26 rows are returned for blue products. Ensure that approximately
248 rows are returned for products that have no color.
Question: When do you need the OUTPUT keyword for output parameters when you are
working with stored procedures?
Review Question(s)
Question: What happens to the WITH RECOMPILE option when you use it with a CREATE
PROC statement?
Question: What happens to the WITH RECOMPILE option when you use it with an
EXECUTE statement?
Module 9
Designing and Implementing User-Defined Functions
Contents:
Module Overview 9-1
Module Overview
Functions are routines that are used to encapsulate frequently performed logic. Rather than having to
repeat all of the function logic, any code that must perform the logic can call the function.
In this module, you will learn to design and implement user-defined functions (UDFs) that enforce
business rules or data consistency, and to modify and maintain existing functions that other developers
have written.
Objectives
After completing this module, you will be able to:
Lesson 1
Overview of Functions
Functions are routines that consist of one or more Transact-SQL statements that you can use to
encapsulate code for reuse. A function takes zero or more input parameters and returns either a scalar
value or a table. Functions do not support output parameters, but do return results, either as a single
value or a table.
Lesson Objectives
After completing this lesson, you will be able to:
Types of Functions
Most high-level programming languages offer
functions as blocks of code that are called by name
and can process input parameters. Microsoft® SQL
Server® data management software offers three
types of functions: scalar functions, table-valued
functions (TVFs), and system functions. You can
create two types of TVFs: inline TVFs and
multistatement TVFs.
Scalar Functions
Scalar functions return a single data value of the
type that is defined in a RETURNS clause. An
example of a scalar function would be a function
that extracts the protocol from a URL. From the string “http://www.microsoft.com”, the function would
return the string “http”.
For example, if a table holds details of sales for an entire country, you could create individual views to
return details of sales for particular states within the country. You could write an inline TVF that takes the
state code or ID as a parameter and returns all of the details of sales for the state that match the
parameter. In this way, you would only need a single function to provide details for all states, rather than
separate views for each state.
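Such an inline TVF might be sketched as follows (the sales table and column names are assumed):

```sql
CREATE FUNCTION Sales.GetSalesByState
( @StateCode nchar(2) )
RETURNS TABLE
AS
RETURN
    SELECT SalesOrderID, OrderDate, TotalDue
    FROM Sales.SalesOrderHeader
    WHERE StateCode = @StateCode;
```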
System Functions
System functions are built-in functions that SQL Server provides to help you perform a variety of
operations. You cannot modify them. System functions are described in the next topic.
System Functions
SQL Server has a wide variety of built-in functions
that you can use in queries to return data or to
perform operations on data.
Aggregates such as MIN, MAX, AVG, SUM, and COUNT perform calculations across groups of rows. Many
of these functions automatically ignore NULL rows.
Ranking functions such as ROW_NUMBER, RANK, DENSE_RANK, and NTILE perform windowing
operations on rows of data.
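For example, ROW_NUMBER can number rows within an ordered window (the table and columns are illustrative):

```sql
SELECT ProductID, Name, ListPrice,
       ROW_NUMBER() OVER (ORDER BY ListPrice DESC) AS PriceRank
FROM Production.Product;
```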
Lesson 2
Designing and Implementing Scalar Functions
You have seen that functions are routines that consist of one or more Transact-SQL statements that you
can use to encapsulate code for reuse, and that functions can take zero or more input parameters and
return either scalar values or tables.
This lesson provides an overview of scalar functions and explains why and how you use them, in addition
to explaining the syntax for creating them.
Lesson Objectives
After completing this lesson, you will be able to:
Scalar Functions
Unlike the definition of a stored procedure, where it
is optional to use a BEGIN…END block that wraps
the body of the stored procedure, the body of a
function must be defined in a BEGIN…END block.
The function body contains the series of Transact-
SQL statements that return the value.
For example, consider the function definition in the following code example.
CREATE FUNCTION
CREATE FUNCTION dbo.ExtractProtocolFromURL
( @URL nvarchar(1000))
RETURNS nvarchar(1000)
AS BEGIN
    RETURN CASE WHEN CHARINDEX(N':', @URL, 1) >= 1
                THEN SUBSTRING(@URL, 1, CHARINDEX(N':', @URL, 1) - 1)
           END;
END;
Note that the body of the function consists of a single RETURN statement that is wrapped in a
BEGIN…END block.
You can use the function in the following code example as an expression wherever a single value could be
used.
You can also implement scalar functions in managed code. Managed code will be discussed in Module
12, Implementing Managed Code in SQL Server. The allowable return values for scalar functions differ
between functions that are defined in Transact-SQL and functions that are defined by using managed
code.
Scalar UDFs
You use scalar functions to return information from
a database. A scalar function returns a single data
value of the type that is defined in a RETURNS
clause. The body of the function, which is defined in a BEGIN…END block, contains the series of Transact-
SQL statements that return the value.
Guidelines
Consider the following guidelines when you create scalar UDFs:
Make sure that you use two-part naming for the function and for all database objects that the
function references.
Avoid Transact-SQL errors that lead to a statement being canceled and the process continuing with
the next statement in the module (such as within triggers or stored procedures) because they are
treated differently inside a function. In functions, such errors cause the execution of the function to
stop.
Side-Effects
A function that modifies the underlying database is considered to have “side-effects.” In SQL Server,
functions are not permitted to have side-effects. You cannot change data in a database within a function,
you may not call a stored procedure, and you may not execute dynamic Structured Query Language (SQL)
code.
Deterministic Functions
A deterministic function is one that will always
return the same result when it is provided with the
same set of input values for the same database
state.
Deterministic Function
CREATE FUNCTION dbo.AddInteger
(@FirstValue int, @SecondValue int)
RETURNS int
AS BEGIN
RETURN @FirstValue + @SecondValue;
END;
GO
Every time the function is called with the same two integer values, it will return exactly the same result.
Nondeterministic Functions
A nondeterministic function is one that may return different results for the same set of input values each
time it is called, even if the database remains in the same state.
Nondeterministic Function
CREATE FUNCTION dbo.CurrentUTCTimeAsString()
RETURNS varchar(40)
AS BEGIN
RETURN CONVERT(varchar(40),SYSUTCDATETIME(),100);
END;
Each time the function is called, it will return a different value, even though no input parameters are
supplied.
Demonstration Steps
Work with scalar functions
1. Ensure that the 20464C-MIA-DC and 20464C-MIA-SQL virtual machines are running and then log on
to 20464C-MIA-SQL as ADVENTUREWORKS\Student with the password Pa$$w0rd.
3. In the virtual machine, on the taskbar, click SQL Server 2014 Management Studio.
4. In the Connect to Server window, in Server name, type MIA-SQL and then click Connect.
5. On the File menu, click Open, click Project/Solution, navigate to
D:\Demofiles\Mod09\Demo09.ssmssln, and then click Open.
8. Follow the instructions contained within the comments of the script file.
Lesson 3
Designing and Implementing Table-Valued Functions
In this lesson, you will learn how to work with functions that return tables instead of single values. There
are two types of table-valued functions (TVFs): inline and multistatement. Both types will be covered in this lesson.
The ability to return a table of data is important because it enables a function to be used as a source of
rows in place of a table in a Transact-SQL statement. In many cases, this can avoid the need to create
temporary tables.
Lesson Objectives
After completing this lesson, you will be able to:
Describe TVFs.
Table-Valued Functions
There are two ways to create TVFs. Inline TVFs
return an output table that is defined by a RETURN
statement that consists of a single SELECT
statement. If the logic of the function is too
complex to include in a single SELECT statement,
you need to implement the function as a
multistatement TVF.
Multistatement TVFs construct a table within the body of the function and then return the table. They also
need to define the schema of the table to be returned.
You can use both types of TVF as the equivalent of parameterized views.
For inline functions, the body of the function is not enclosed in a BEGIN…END block. A syntax error occurs
if you attempt to use this block. The CREATE FUNCTION statement still needs to be the only statement in
the batch.
In the same way that you use a view, you can use a TVF in the FROM clause of a Transact-SQL statement.
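As a sketch (the table and column names are illustrative), an inline TVF consists of a single SELECT statement in its RETURN clause, with no BEGIN…END block, and is then used in a FROM clause exactly as a view would be:

```sql
-- Inline TVF: the body is a single SELECT, with no BEGIN...END block.
CREATE FUNCTION dbo.GetOrdersForCustomer (@CustomerID int)
RETURNS TABLE
AS
RETURN
(
    SELECT o.OrderID, o.OrderDate, o.TotalDue
    FROM dbo.SalesOrder AS o
    WHERE o.CustomerID = @CustomerID
);
GO

-- Used like a parameterized view in the FROM clause:
SELECT OrderID, OrderDate
FROM dbo.GetOrdersForCustomer(17);
```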
Implement TVFs.
Demonstration Steps
1. In the virtual machine, on the taskbar, click SQL Server 2014 Management Studio.
2. In the Connect to Server window, in Server name, type MIA-SQL and then click Connect.
Lesson 4
Considerations for Implementing Functions
Although the ability to create functions in Transact-SQL is very important, you need to bear in mind some
key considerations when you are creating functions. In particular, it is important to avoid negative
performance impacts through inappropriate use of functions. Performance problems due to such
inappropriate usage are very common. This lesson provides guidelines for the implementation of
functions and describes how to control their security context.
Lesson Objectives
After completing this lesson, you will be able to:
You can use the CROSS APPLY operator to call a TVF for each row in the table on the left within the
query. Designs that require the calling of a TVF for every row in a table can lead to significant
performance overhead. You should examine the design to see if there is a way to avoid the need to call
the function for each row.
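The pattern being cautioned against can be sketched as follows (the table and function names are illustrative); CROSS APPLY invokes the TVF once for each row of the outer table:

```sql
-- The TVF is called once per row of dbo.Customer. For multistatement
-- TVFs on large tables this per-row invocation can be expensive;
-- consider whether a join against an inline TVF or view can replace it.
SELECT c.CustomerID, o.OrderID, o.OrderDate
FROM dbo.Customer AS c
CROSS APPLY dbo.GetOrdersForCustomer(c.CustomerID) AS o;
```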
SQL Server supports the ability to impersonate another principal either explicitly by using the stand-alone
EXECUTE AS statement, or implicitly by using the EXECUTE AS clause on modules. You can use the stand-
alone EXECUTE AS statement to impersonate server-level principals, or logins, by using the EXECUTE AS
LOGIN statement. You can also use the stand-alone EXECUTE AS statement to impersonate database-
level principals, or users, by using the EXECUTE AS USER statement.
Implicit impersonations that are performed through the EXECUTE AS clause on modules impersonate the
specified user or login at the database or server level. This impersonation depends on whether the module
is a database-level module, such as a stored procedure or function, or a server-level module, such as a
server-level trigger.
When you are impersonating a principal by using the EXECUTE AS LOGIN statement, or within a server-
scoped module by using the EXECUTE AS clause, the scope of the impersonation is server-wide. This
means that, after the context switch, it is possible to access any resource within the server on which the
impersonated login has permissions.
However, when you are impersonating a principal by using the EXECUTE AS USER statement, or within a
database-scoped module by using the EXECUTE AS clause, the scope of impersonation is restricted to the
database by default. This means that references to objects that are outside the scope of the database will
return an error.
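The two stand-alone forms can be sketched as follows (the login and user names are illustrative); the REVERT statement switches the execution context back to the caller:

```sql
-- Server-wide impersonation of a login:
EXECUTE AS LOGIN = 'ADVENTUREWORKS\PayrollService';
SELECT SUSER_SNAME() AS CurrentLogin;   -- reflects the impersonated login
REVERT;

-- Database-scoped impersonation of a user:
EXECUTE AS USER = 'PayrollUser';
SELECT USER_NAME() AS CurrentUser;      -- reflects the impersonated user
REVERT;
```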
Use two-part naming to qualify the name of any database objects that are referred to within the
function and also use two-part naming when you are choosing the name of the function.
Consider the impact of using functions in combination with indexes. In particular, note that a WHERE
clause predicate that wraps a column such as CustomerID in a function call is likely to remove the
usefulness of an index on CustomerID.
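The idea can be sketched as follows (dbo.FormatCustomerID is a hypothetical scalar function, not from the course files); applying a function to the indexed column makes the predicate non-sargable, while leaving the column bare preserves an index seek:

```sql
-- The function call on the indexed column prevents an index seek
-- on CustomerID; SQL Server must evaluate the function for every row:
SELECT CustomerID, FullName
FROM dbo.Customer
WHERE dbo.FormatCustomerID(CustomerID) = 'CUST-00042';

-- Rewriting the predicate so that the column stands alone
-- allows the index on CustomerID to be used:
SELECT CustomerID, FullName
FROM dbo.Customer
WHERE CustomerID = 42;
```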
Avoid statements that will raise Transact-SQL errors because exception handling is not permitted within
functions.
Demonstration Steps
Alter the execution context of a function
1. Ensure that the 20464C-MIA-DC and 20464C-MIA-SQL virtual machines are running and then log on
to 20464C-MIA-SQL as ADVENTUREWORKS\Student with the password Pa$$w0rd.
3. In the virtual machine, on the taskbar, click SQL Server 2014 Management Studio.
4. In the Connect to Server window, in Server name, type MIA-SQL and then click Connect.
5. On the File menu, click Open, click Project/Solution, navigate to
D:\Demofiles\Mod09\Demo09.ssmssln, and then click Open.
8. Follow the instructions contained within the comments of the script file.
Lesson 5
Alternatives to Functions
Functions are only one option for implementing code. This lesson explores situations where other
solutions may be appropriate and helps you to make decisions about which solution to use.
Lesson Objectives
After completing this lesson, you will be able to:
Stored procedures can execute dynamic SQL statements. Functions are not permitted to execute dynamic
SQL statements.
Stored procedures can include detailed exception handling. Functions cannot contain exception handling.
Stored procedures can return multiple result sets from a single call. A TVF can return only a single
rowset from a function call; there is no mechanism for returning multiple rowsets from a single
function call.
Objectives
After completing this lab, you will be able to:
Create a function.
Password: Pa$$w0rd
2. In the D:\Labfiles\Lab09\Starter folder, right-click Setup.cmd, and then click Run as administrator.
3. When you are prompted, click Yes to confirm that you want to run the command file, and then wait
for the script to finish.
2. Review the Function Specifications: Phone Number section in the supporting documentation.
Results: After this exercise, you should have created a new FormatPhoneNumber function within the
dbo schema.
4. Test the function by using an alternate delimiter such as the pipe character (|).
2. Review the requirement for the dbo.IntegerListToTable function in the supporting documentation.
Results: After this exercise, you should have created a new IntegerListToTable function within the dbo
schema.
Review Question(s)
Question: When you are using the EXECUTE AS clause, what privileges should you grant to
the login or user that is being impersonated?
Question: When you are using the EXECUTE AS clause, what privileges should you grant to
the login or user that is creating the code?
Module 10
Responding to Data Manipulation via Triggers
Contents:
Module Overview
Module Overview
Data manipulation language (DML) triggers are a powerful tool that enables you to enforce domain,
entity, and referential data integrity and business logic. The enforcement of integrity helps you to build
reliable applications. In this module, you will learn what DML triggers are and how they enforce data
integrity, the different types of trigger that are available to you, and how to define triggers in your
database.
Objectives
After completing this module, you will be able to:
Lesson 1
Designing DML Triggers
Before you begin to create DML triggers, you need to become familiar with how they should be designed,
so that you can avoid making common design errors. Several types of DML trigger are available. It is
important to know what they do, how they work, and how they differ from data definition language (DDL)
triggers. DML triggers need to be able to work with both the previous state of the database and its
changed state. You will see how the inserted and deleted virtual tables provide that capability. DML
triggers are often added after applications are built, so you need to make sure that adding a trigger does
not cause errors in the applications that were designed without them being in place. The SET NOCOUNT
ON command helps to avoid the side-effects of triggers.
Lesson Objectives
After completing this lesson, you will be able to:
Describe DML triggers.
Explain how AFTER triggers differ from INSTEAD OF triggers and where you should use each of them.
Access both the prior and final states of the database data by using the inserted and deleted virtual
tables.
Trigger Operation
The trigger and the statement that fires it are treated as a single operation, which you can roll back from
within the trigger. By rolling back an operation, you can undo the effect of a Transact-SQL statement if
the logic in your triggers decides that the statement should not have been executed. If the statement is
part of another transaction, that outer transaction is also rolled back.
Triggers can cascade changes through related tables in the database; however, in many cases, you can
execute these changes more efficiently by using cascading referential integrity constraints.
Unlike CHECK constraints, triggers can reference columns in other tables. For example, a trigger can use a
SELECT statement from another table to compare to the inserted or updated data and to perform
additional actions, such as modifying the data or displaying a user-defined error message.
Triggers can evaluate the state of a table before and after a data modification and take actions based on
that difference. For example, you may want to check that the balance of a customer’s account does not
change by more than a certain amount if the person processing the change is not a manager.
Triggers also enable the use of custom error messages for when constraint violations occur. This could
make the messages that are passed to end users more meaningful.
Multiple Triggers
Multiple triggers of the same type (INSERT, UPDATE, or DELETE) on a table enable multiple different
actions to occur in response to the same modification statement. You might create multiple triggers to
separate the logic that each performs, but note that you do not have complete control over the order in
which they fire. You can only specify which trigger should fire first and which should fire last.
AFTER Triggers
AFTER triggers fire after the data modifications that are part of the event to which they relate have
completed. This means that an INSERT, UPDATE, or DELETE statement executes and modifies the data in
the database; after that modification has completed, any AFTER triggers that are associated with that
event fire, but still within the same operation that triggered them.
In many cases, you can replace trigger-based code with other forms of code. For example, Microsoft®
SQL Server® data management software might provide auditing. Relationships between tables are more
typically implemented by using foreign key constraints. Default values and calculated values are typically
implemented by using DEFAULT constraints and persisted calculated columns. However, in some
situations, the complexity of the logic that is required will make triggers a good solution.
If the trigger executes a ROLLBACK statement, the data modification statement with which it is associated
will be rolled back. If that statement was part of a larger transaction, that outer transaction would be
rolled back, too.
INSTEAD OF Triggers
An INSTEAD OF trigger is a special type of trigger that executes alternate code instead of executing the
statement from which it was fired.
When you use an INSTEAD OF trigger, only the code in the trigger is executed. The original INSERT,
UPDATE, or DELETE operation that caused the trigger to fire does not occur.
A common use case for INSTEAD OF triggers is to enable views that are based on multiple base tables to
be updatable.
After an UPDATE operation, the inserted virtual table holds details of the modified versions of the rows.
The underlying table also contains those rows in the modified form.
After a DELETE operation, the deleted virtual table holds details of the rows that have just been deleted.
The underlying table no longer contains those rows.
After an UPDATE operation, the deleted virtual table holds details of the rows from before the
modification was made. The underlying table holds the modified versions.
INSTEAD OF Triggers
When you attempt an INSERT, UPDATE, or DELETE statement and an INSTEAD OF trigger is associated
with the event on the table, the inserted and deleted virtual tables hold details of the modifications that
need to be made, but have not happened yet.
SET NOCOUNT ON
When you are adding a trigger to a table, you need
to avoid breaking any existing applications that are
accessing the table unless the intended purpose of
the trigger is to prevent misbehaving applications
from making inappropriate data changes.
It is common for application programs to issue data
modification statements and to check the returned
count of the number of rows that are affected.
This process is often performed as part of an
optimistic concurrency check. For example, consider
the following code example:
UPDATE Statement
UPDATE Customer
SET Customer.FullName = @NewName,
Customer.Address = @NewAddress
WHERE Customer.CustomerID = @CustomerID
AND Customer.Concurrency = @Concurrency;
In this case, the Concurrency column is a rowversion data type column. The application was designed so
that the update occurs only if the Concurrency column has not been altered. With rowversion columns,
every modification to a row causes a change in that row’s rowversion value.
When the application intends to modify a single row, it issues an UPDATE statement for that row. The
application then checks the count of updated rows that SQL Server returns. When the application sees
that only a single row has been modified, the application knows that only the row that it intended to
change was affected. It also knows that no other user had modified the row since the application read the
data.
A common problem when you are adding triggers is that, if the trigger itself also causes row
modifications (for example, by writing an audit row into an audit table), the count of those
modifications is returned in addition to the expected count. You can avoid this situation by using the
SET NOCOUNT ON statement; most triggers should include it.
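A sketch of the pattern (the table and column names are illustrative): the trigger writes an audit row, and SET NOCOUNT ON prevents that extra row count from reaching the client, so the application still sees only the count of rows that its own statement modified:

```sql
CREATE TRIGGER TR_Customer_Update
ON dbo.Customer
AFTER UPDATE
AS
BEGIN
    SET NOCOUNT ON;  -- suppress the audit insert's row count so the client
                     -- still receives only the count of updated rows

    INSERT INTO dbo.CustomerAudit (CustomerID, AuditedAt)
    SELECT i.CustomerID, SYSDATETIME()
    FROM inserted AS i;
END;
```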
Returning Rowsets
Although it is possible to include a SELECT statement within a trigger and for it to return rows, the
creation of this type of side-effect is discouraged. The ability to do this is now deprecated and should not
be used in new development work. There is a configuration setting, ‘disallow results from triggers’, which,
when it is set to 1, disallows this capability.
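The setting is changed by using sp_configure; because it is an advanced option, ‘show advanced options’ must be enabled first:

```sql
-- Prevent triggers from returning result sets (recommended for new work).
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'disallow results from triggers', 1;
RECONFIGURE;
```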
Constraints are checked before any data modification is attempted, so they often provide much higher
performance than is possible with triggers, particularly in ROLLBACK situations. You can use constraints
when the checks that you need to perform are relatively simple. Triggers make it possible to check
complex logic.
Lesson 2
Implementing DML Triggers
The first lesson provided information about designing DML triggers. You now need to consider how to
implement the designs that have been created.
Lesson Objectives
After completing this lesson, you will be able to:
The trigger can examine the inserted virtual table to determine what to do in response to the
modification.
Multirow Inserts
In the code example on the slide, insertions for the Sales.Opportunity table are being audited to a table
called Sales.OpportunityAudit. Note that the trigger processes all inserted rows at the same time. A
common error when designing AFTER INSERT triggers is to write them with the assumption that only a
single row is being inserted.
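The slide code is not reproduced here; the pattern it describes can be sketched as follows (the audit column names are assumed for illustration). The key point is that the trigger selects from the inserted virtual table in a single set-based statement rather than assuming that exactly one row was inserted:

```sql
CREATE TRIGGER TR_Opportunity_Insert
ON Sales.Opportunity
AFTER INSERT
AS
BEGIN
    SET NOCOUNT ON;

    -- Audit every inserted row in one statement; never assume
    -- that only a single row was inserted.
    INSERT INTO Sales.OpportunityAudit (OpportunityID, AuditedAt)
    SELECT i.OpportunityID, SYSDATETIME()
    FROM inserted AS i;
END;
```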
Demonstration Steps
Create an AFTER INSERT trigger
1. Ensure that the 20464C-MIA-DC and 20464C-MIA-SQL virtual machines are running and then log on
to 20464C-MIA-SQL as ADVENTUREWORKS\Student with the password Pa$$w0rd.
4. In the Connect to Server window, in Server name, type MIA-SQL and then click Connect.
5. On the File menu, click Open, click Project/Solution, navigate to
D:\Demofiles\Mod10\Demo10.ssmssln, and then click Open.
8. Follow the instructions contained within the comments of the script file.
The trigger can examine the deleted virtual table to determine what to do in response to the
modification.
Multirow Deletes
In the code example on the slide, rows in the Product.Product table are being flagged as discontinued if
the product category row with which they are associated in the Product.Category table is deleted. Note
that the trigger processes all deleted rows at the same time. A common error when designing AFTER
DELETE triggers is to write them with the assumption that only a single row is being deleted.
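Again, the slide code is not reproduced here; a sketch of the pattern (the column names are assumed for illustration) joins to the deleted virtual table so that multirow deletes are handled correctly:

```sql
CREATE TRIGGER TR_Category_Delete
ON Product.Category
AFTER DELETE
AS
BEGIN
    SET NOCOUNT ON;

    -- Flag every product whose category row was deleted; joining to the
    -- deleted virtual table handles multirow DELETE statements correctly.
    UPDATE p
    SET p.Discontinued = 1
    FROM Product.Product AS p
    JOIN deleted AS d
        ON p.CategoryID = d.CategoryID;
END;
```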
TRUNCATE TABLE
When rows are deleted from a table by using a DELETE statement, any AFTER DELETE triggers are fired
when the deletion is completed. TRUNCATE TABLE is an administrative operation that removes all rows
from a table. It requires additional permissions beyond those needed to delete rows, and it does not fire
any AFTER DELETE triggers that are associated with the table.
Demonstration Steps
Create and test AFTER DELETE triggers
1. Ensure that the 20464C-MIA-DC and 20464C-MIA-SQL virtual machines are running and then log on
to 20464C-MIA-SQL as AdventureWorks\Student with the password Pa$$w0rd.
2. If you have not completed the previous demonstration in this module, then run
D:\Demofiles\Mod10\Setup.cmd as an administrator to revert any changes
4. In the Connect to Server window, in Server name, type MIA-SQL and then click Connect.
5. On the File menu, click Open, click Project/Solution, navigate to
D:\Demofiles\Mod10\Demo10.ssmssln, and then click Open.
8. Follow the instructions contained within the comments of the script file.
The trigger can examine both the inserted and deleted virtual tables to determine what to do in response
to the modification.
Multirow Updates
In the code example on the slide, the Product.ProductReview table contains a column called
ModifiedDate. The trigger is being used to ensure that when changes are made to the
Product.ProductReview table, the value in the ModifiedDate column always reflects when any changes
last happened. Note that the trigger processes all updated rows at the same time. A common error when
designing AFTER UPDATE triggers is to write them with the assumption that only a single row is being
updated.
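The slide code is not reproduced here; a sketch of the pattern (the key column name is assumed for illustration) stamps every updated row by joining back through the inserted virtual table:

```sql
CREATE TRIGGER TR_ProductReview_Update
ON Product.ProductReview
AFTER UPDATE
AS
BEGIN
    SET NOCOUNT ON;

    -- Stamp every updated row, not just one; joining to the inserted
    -- virtual table handles multirow UPDATE statements correctly.
    -- (The trigger's own UPDATE does not re-fire it, because the
    -- RECURSIVE_TRIGGERS database option is OFF by default.)
    UPDATE pr
    SET pr.ModifiedDate = SYSDATETIME()
    FROM Product.ProductReview AS pr
    JOIN inserted AS i
        ON pr.ProductReviewID = i.ProductReviewID;
END;
```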
Demonstration Steps
Create and test AFTER UPDATE triggers
1. Ensure that the 20464C-MIA-DC and 20464C-MIA-SQL virtual machines are running and then log on
to 20464C-MIA-SQL as AdventureWorks\Student with the password Pa$$w0rd.
2. If you have not completed the previous demonstrations in this module, then run
D:\Demofiles\Mod10\Setup.cmd as an administrator to revert any changes.
3. On the taskbar, click SQL Server 2014 Management Studio.
4. In the Connect to Server window, in Server name, type MIA-SQL and then click Connect.
8. Follow the instructions contained within the comments of the script file.
Lesson 3
Advanced Trigger Concepts
In the previous two lessons, you have learned to design and implement DML AFTER triggers. However, to
make effective use of these triggers, you need to understand some additional areas of complexity that are
related to them. You also need to understand where to use triggers and where to consider alternatives to
triggers.
Lesson Objectives
After completing this lesson, you will be able to:
Explain how nested triggers work and how configurations might affect their operation.
Use the UPDATE function to build logic based on the columns being updated.
Describe the limited control that you can exert over the order in which triggers fire when multiple
triggers are defined for the same event on the same object.
INSTEAD OF Triggers
INSTEAD OF triggers cause the execution of
alternate code instead of executing the statement
that caused them to fire.
Updatable Views
A very common use case for INSTEAD OF triggers is to enable views that are based on multiple base
tables to be updatable. You can define INSTEAD OF triggers on views that have one or more base tables,
where they can extend the types of updates that a view can support.
This trigger executes instead of the original triggering action. INSTEAD OF triggers increase the variety of
types of updates that you can perform against a view. Each table or view is limited to one INSTEAD OF
trigger for each triggering action (INSERT, UPDATE, or DELETE).
You can specify an INSTEAD OF trigger on both tables and views. You cannot create an INSTEAD OF
trigger on views that have the WITH CHECK OPTION clause defined. You can perform operations on the
base tables within the trigger. This avoids the trigger being called again. For example, you could perform
a set of checks before inserting data and then perform the insert on the base table.
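A highly simplified sketch of that approach (the view, tables, and columns are hypothetical): the trigger intercepts an INSERT against a two-table view and routes each column to its base table, so the original INSERT against the view never occurs:

```sql
CREATE TRIGGER TR_vCustomerOrders_Insert
ON dbo.vCustomerOrders        -- a view joining dbo.Customer and dbo.SalesOrder
INSTEAD OF INSERT
AS
BEGIN
    SET NOCOUNT ON;

    -- Add any customers that do not already exist in the base table.
    INSERT INTO dbo.Customer (CustomerID, FullName)
    SELECT i.CustomerID, i.FullName
    FROM inserted AS i
    WHERE NOT EXISTS (SELECT 1 FROM dbo.Customer AS c
                      WHERE c.CustomerID = i.CustomerID);

    -- Insert the order rows into the other base table.
    INSERT INTO dbo.SalesOrder (OrderID, CustomerID, OrderDate)
    SELECT i.OrderID, i.CustomerID, i.OrderDate
    FROM inserted AS i;
END;
```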
Demonstration Steps
Create and test an INSTEAD OF DELETE trigger
1. Ensure that the 20464C-MIA-DC and 20464C-MIA-SQL virtual machines are running and then log on
to 20464C-MIA-SQL as AdventureWorks\Student with the password Pa$$w0rd.
2. If you have not completed the previous demonstrations in this module, then run
D:\Demofiles\Mod10\Setup.cmd as an administrator to revert any changes.
3. On the taskbar, click SQL Server 2014 Management Studio.
4. In the Connect to Server window, in Server name, type MIA-SQL and then click Connect.
7. Follow the instructions contained within the comments of the script file.
A failure at any level of a set of nested triggers cancels the entire original statement, and all data
modifications are rolled back.
A nested trigger will not fire twice in the same trigger transaction; a trigger does not call itself in response
to a second update to the same table within the trigger.
Complexity of Debugging
It was mentioned in an earlier lesson that debugging triggers can be difficult. Nested triggers are
particularly difficult to debug. One common method that is used during debugging is to include PRINT
statements within the body of the trigger code so that you can determine where a failure occurred.
However, it is important that these statements are only used during debugging phases.
Direct Recursion
Direct recursion occurs when a trigger fires and
performs an action on the same table that causes
the same trigger to fire again. For example, an
application updates table T1, which causes trigger Trig1 to fire. Trigger Trig1 updates table T1 again,
which causes trigger Trig1 to fire again.
Indirect Recursion
Indirect recursion occurs when a trigger fires and performs an action that causes another trigger to fire on
a different table, which subsequently causes an update to occur on the original table, which then causes
the original trigger to fire again. For example, an application updates table T2, which causes trigger Trig2
to fire. Trig2 updates table T3, which causes trigger Trig3 to fire. Trigger Trig3 in turn updates table T2,
which causes trigger Trig2 to fire again.
To prevent indirect recursion of this sort, turn off the nested triggers option at the server instance level.
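The option is changed at the instance level by using sp_configure:

```sql
-- Turn off nested trigger firing for the server instance.
-- This stops indirect recursion, but it also stops all legitimate
-- trigger nesting, so apply it with care.
EXEC sp_configure 'nested triggers', 0;
RECONFIGURE;
```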
UPDATE Function
It is a common requirement to build logic that only
takes action if particular columns are being
updated.
Change of Value
Note that the UPDATE function does not indicate if the value is actually changing. It only indicates if the
column is part of the list of columns in the SET clause of the UPDATE statement. To detect if the value in
a column is actually being changed to a different value, you need to interrogate the inserted and deleted
virtual tables.
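Both points can be sketched together (the table and column names are illustrative): UPDATE() gates the logic on the column appearing in the SET clause, and the join between inserted and deleted finds the rows where the value actually changed:

```sql
CREATE TRIGGER TR_Customer_EmailChange
ON dbo.Customer
AFTER UPDATE
AS
BEGIN
    SET NOCOUNT ON;

    -- UPDATE() is true whenever the column appears in the SET clause,
    -- even if the new value equals the old one...
    IF UPDATE(EmailAddress)
    BEGIN
        -- ...so compare inserted and deleted to audit only real changes.
        INSERT INTO dbo.EmailChangeAudit (CustomerID, OldEmail, NewEmail)
        SELECT d.CustomerID, d.EmailAddress, i.EmailAddress
        FROM inserted AS i
        JOIN deleted AS d
            ON i.CustomerID = d.CustomerID
        WHERE ISNULL(i.EmailAddress, '') <> ISNULL(d.EmailAddress, '');
    END;
END;
```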
COLUMNS_UPDATED Function
SQL Server also provides a function called COLUMNS_UPDATED. This function returns a bitmap that
indicates which columns are being updated. The values in the bitmap depend upon the positional
information for the columns. Hard-coding that sort of information in the code within a trigger is generally
not considered good coding practice because it affects the readability (and hence the maintainability) of
your code. It also reduces the reliability of your code because schema changes to the table could break
the code.
sp_settriggerorder
Developers often seek to control the firing order of
multiple triggers that are defined for a single event
on a single object. For example, a developer might
create three AFTER INSERT triggers on the same
table, each implementing different business rules or
administrative tasks.
The possible values for the @order parameter are First, Last, or None; None is the default. An error
occurs if you attempt to designate the same trigger as both First and Last.
For DML triggers, the possible values for the @stmttype parameter are INSERT, UPDATE, or DELETE.
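For example (the trigger name is illustrative), the following call makes one trigger fire first among the AFTER INSERT triggers on its table; any remaining triggers fire in an undefined order:

```sql
EXEC sp_settriggerorder
    @triggername = 'Sales.TR_Opportunity_Validate',  -- hypothetical trigger
    @order = 'First',
    @stmttype = 'INSERT';
```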
Checking Values
You could use triggers to check that values in
columns are valid or within given ranges. In general,
you should use CHECK constraints instead of
triggers for this because CHECK constraints perform
the check before the data modification is
attempted.
If you are using triggers to check the correlation of values across multiple columns within a table, you
should usually create table-level CHECK constraints instead.
Defaults
You can use triggers to provide default values for columns when no values have been provided in INSERT
statements. However, you should generally use DEFAULT constraints for this instead.
Foreign Keys
You can use triggers to check the relationship between tables. However, you should generally use
FOREIGN KEY constraints for this.
Computed Columns
You can use triggers to maintain the value in one column based on the value in other columns. In general,
you should use computed columns or persisted computed columns for this.
Precalculating Aggregates
You can use triggers to maintain precalculated aggregates in one table, based on the values in rows in
another table. In general, you should use indexed views to provide this functionality.
As another example, a FOREIGN KEY constraint cannot be defined on a column that is also used for
other purposes. Consider a column that holds an employee number only if another column holds the
value ‘E’. This typically indicates a poor database design, but you can use triggers to enforce this sort of
relationship.
Demonstration Steps
Replace a trigger with a computed column
1. Ensure that the 20464C-MIA-DC and 20464C-MIA-SQL virtual machines are running and then log on
to 20464C-MIA-SQL as AdventureWorks\Student with the password Pa$$w0rd.
2. If you have not completed the previous demonstrations in this module, then run
D:\Demofiles\Mod10\Setup.cmd as an administrator to revert any changes.
8. Follow the instructions contained within the comments of the script file.
Supporting Documentation
The Production.ProductAudit table is used to hold changes to high-value products. When inserting rows
into this table, the data required in each column is shown in the following table.
Objectives
After completing this lab, you will be able to:
Create triggers.
Modify triggers.
Password: Pa$$w0rd
Note: Inserts or deletes on the table do not need to be audited. Details of the current user
can be taken from the ORIGINAL_LOGIN() function.
3. Design a Trigger
2. In the D:\Labfiles\Lab10\Starter folder, right-click Setup.cmd, and then click Run as administrator.
3. When you are prompted, click Yes to confirm that you want to run the command file, and then wait
for the script to finish.
3. Review the existing structure of the Production.ProductAudit table and the values required in each
column, based on the supporting documentation.
4. Review the existing structure of the Production.Product table.
Results: After this exercise, you should have created a new trigger. Tests should have shown that it is
working as expected.
2. Use an ALTER TRIGGER statement to change the existing trigger so that it will meet the updated
requirements.
Results: After this exercise, you should have altered the trigger. Tests should show that it is now working
as expected.
1. In many business scenarios, it makes sense to mark records as deleted with a status column and use a
trigger or stored procedure to update an audit trail table. The changes can then be audited, the data
is not lost, and the IT staff can perform purges or archival of the deleted records.
Review Question(s)
Question: How do constraints and triggers differ regarding timing of execution?
Module 11
Using In-Memory Tables
Contents:
Module Overview
Module Overview
Microsoft® SQL Server® 2014 data management software introduces in-memory OLTP functionality
to improve the performance of OLTP workloads. Memory-optimized tables are stored primarily in
memory, which improves performance by reducing hard disk access; natively compiled stored
procedures further improve performance over traditional interpreted Transact-SQL.
Objectives
After completing this module, you will be able to:
Use memory-optimized tables to improve performance for latch-bound workloads.
Lesson 1
Memory-Optimized Tables
SQL Server 2014 introduces memory-optimized tables as a way to improve the performance of latch-
bound OLTP workloads. Memory-optimized tables are stored in memory, and do not use locks to enforce
concurrency isolation. This dramatically improves performance for many OLTP workloads.
Lesson Objectives
After completing this lesson, you will be able to:
Memory-optimized tables:
Can persist their data to disk as FILESTREAM data, or they can be nondurable.
Can be queried by using Transact-SQL through interop services that the SQL Server query processor
provides.
Cannot include some data types, including text, image, and nvarchar(max).
A table contains “hot” pages. For example, a table that contains a clustered index on an incrementing
key value will inherently suffer from concurrency issues because all insert transactions occur in the last
page of the index.
Repeatable read validation failures. These occur when a row that the transaction has read has
changed since the transaction began.
Serializable validation failures. These occur when a new (or phantom) row is inserted into the range
of rows that the transaction accesses while it is still in progress.
Commit dependency failures. These occur when a transaction has a dependency on another
transaction that has failed to commit.
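Because these validation failures abort the transaction rather than block it, client code is expected to retry. The following is a minimal retry sketch; dbo.usp_UpdateCart and its parameters are hypothetical, and the error numbers shown (41301 commit dependency, 41305 repeatable read validation, 41325 serializable validation) are those documented for SQL Server 2014 in-memory OLTP and should be verified against your version:

```sql
-- Retry a natively compiled procedure call when a transient
-- in-memory OLTP validation failure aborts the transaction.
DECLARE @Retries int = 3;
WHILE @Retries > 0
BEGIN
    BEGIN TRY
        EXEC dbo.usp_UpdateCart @SessionID = 42, @Quantity = 2; -- hypothetical call
        BREAK; -- success: leave the retry loop
    END TRY
    BEGIN CATCH
        SET @Retries -= 1;
        IF ERROR_NUMBER() NOT IN (41301, 41305, 41325) OR @Retries = 0
        BEGIN
            THROW; -- not a transient validation failure, or retries exhausted
        END
    END CATCH;
END;
```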
Migration Validation
Migration validation reports on any features of your
disk-based tables that are not supported in
memory-optimized tables.
Migration Warnings
Migration warnings don’t prevent a disk-based table from being migrated to a memory-optimized table,
or stop the table from functioning once it’s converted, but the warnings will list any other associated
objects, such as stored procedures, that might not function correctly post-migration.
Migration Options
You can now specify options such as the filegroup, the new name for the original unmigrated disk-based
table, and whether to transfer the data from the original table to the new memory-optimized table.
Index Migration
Index migration gives you the same options as primary key migration for each of the indexes on the table.
Summary
The summary lists the options that you have specified in the previous stages and allows you to migrate
the table, or to create a script to migrate the table at a subsequent time.
To start Memory Optimization Advisor, in SQL Server Management Studio, right-click a table in Object
Explorer and select Memory Optimization Advisor.
You can add a filegroup for memory-optimized data to a database by using the ALTER DATABASE
statement, as the following example shows:
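The original slide example did not survive extraction. A minimal sketch, assuming a database named InternetSales and an illustrative filegroup name and path:

```sql
-- Add a filegroup for memory-optimized data, then add a container
-- (a directory path, not an 8-KB-page data file) to that filegroup.
ALTER DATABASE InternetSales
    ADD FILEGROUP imoltp_fg CONTAINS MEMORY_OPTIMIZED_DATA;

ALTER DATABASE InternetSales
    ADD FILE (NAME = 'imoltp_data', FILENAME = 'D:\Data\imoltp_data')
    TO FILEGROUP imoltp_fg;
```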
Note: When the durability option is set to SCHEMA_AND_DATA, the data is written to
disk as a stream, not in 8-KB pages as used by disk-based tables. The ability to set the durability
option to SCHEMA_ONLY is useful when the table is used for transient data, such as a session
state table in a web server farm.
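The CREATE TABLE example referenced here did not survive extraction. A minimal sketch of a memory-optimized table with an inline single-column primary key (table and column names are illustrative):

```sql
-- A durable memory-optimized table; the primary key is a hash index,
-- specified inline because it covers a single column.
CREATE TABLE dbo.SessionState
(
    SessionID int NOT NULL
        PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 100000),
    Payload varbinary(4000) NOT NULL
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);
```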
All tables that have a durability option of SCHEMA_AND_DATA must include a primary key. You can
specify this inline for single-column primary keys, as shown in the previous example, or you can specify it
after all of the column definitions.
To create a memory-optimized table that has a composite primary key, you must specify the PRIMARY
KEY constraint after the column definitions, as shown in the following example:
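The slide example did not survive extraction. A sketch with illustrative names, showing the PRIMARY KEY constraint placed after the column definitions:

```sql
-- A composite primary key hash index must be declared at table level.
CREATE TABLE dbo.ShoppingCart
(
    SessionID int NOT NULL,
    ProductKey int NOT NULL,
    Quantity int NOT NULL,
    PRIMARY KEY NONCLUSTERED HASH (SessionID, ProductKey)
        WITH (BUCKET_COUNT = 100000)
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);
```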
To create indexes in addition to the primary key, you must specify the indexes after the column
definitions:
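Again, the slide example is missing; a sketch with illustrative names, adding a nonclustered range index alongside the primary key:

```sql
-- Additional indexes on a memory-optimized table are declared inline,
-- after the column definitions.
CREATE TABLE dbo.CartAudit
(
    AuditID int NOT NULL PRIMARY KEY NONCLUSTERED,
    SessionID int NOT NULL,
    TimeAdded datetime2 NOT NULL,
    INDEX ix_SessionID NONCLUSTERED (SessionID) -- additional range index
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);
```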
Query Interop
You can use Transact-SQL statements to access
memory-optimized tables in the same way as
traditional disk-based tables. The SQL Server 2014
query engine provides an interop layer that does
the necessary interpretation to query the compiled
in-memory table. You can use this technique to
create queries that access both memory-optimized
tables and disk-based tables, for example, by using
a JOIN clause.
Native Compilation
You can increase the performance of workloads that use memory-optimized tables further by creating
natively compiled stored procedures. You can define these by using CREATE PROCEDURE statements
that the SQL Server 2014 query engine converts to native C code. The C version of the stored procedure is
compiled into a DLL, which is loaded into memory. You can only use natively compiled stored procedures
to access memory-optimized tables; they cannot reference disk-based tables.
Demonstration Steps
Using memory-optimized tables
1. Ensure that the 20464C-MIA-DC and 20464C-MIA-SQL virtual machines are running and then log on
to 20464C-MIA-SQL as ADVENTUREWORKS\Student with the password Pa$$w0rd.
4. In the Connect to Server window, in Server name, type MIA-SQL and then click Connect.
5. On the File menu, click Open, click Project/Solution, navigate to
D:\Demofiles\Mod11\Demo11.ssmssln, and then click Open.
9. Follow the instructions contained within the comments of the script file.
Lesson 2
Natively Compiled Stored Procedures
Natively compiled stored procedures are stored procedures that are compiled into native code. They are written in traditional Transact-SQL code, but are compiled when they are created rather than when they are executed, which improves performance.
Lesson Objectives
After completing this lesson, you will be able to:
For more information, see the Introduction to Natively Compiled Stored Procedures article on
MSDN.
http://go.microsoft.com/fwlink/?LinkID=394850&clcid=0x409
Natively compiled stored procedures must include the NATIVE_COMPILATION, SCHEMABINDING, and EXECUTE AS clauses. The procedure body must be a single ATOMIC block, which must specify one of the following transaction isolation levels:
SNAPSHOT. Using this isolation level, all data that the transaction reads is consistent with the version
that was stored at the start of the transaction. Data modifications that other, concurrent transactions
have made are not visible and attempts to modify rows that other transactions have modified result
in an error.
REPEATABLE READ. Using this isolation level, every read is repeatable until the end of the transaction.
If another, concurrent transaction has modified a row that the transaction had read, the transaction
will fail to commit due to a repeatable read validation error.
SERIALIZABLE. Using this isolation level, all data is consistent with the version that was stored at the
start of the transaction, and repeatable reads are validated. In addition, the insertion of “phantom”
rows by other, concurrent transactions will cause the transaction to fail.
The following code example shows a CREATE PROCEDURE statement that is used to create a natively
compiled stored procedure:
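The slide's code example did not survive extraction. A sketch borrowing the ShoppingCart names from the lab exercise, showing the required clauses and ATOMIC block options:

```sql
-- The ATOMIC block must specify a transaction isolation level and a language.
CREATE PROCEDURE dbo.DeleteItemFromCart
    @SessionID int, @ProductKey int
WITH NATIVE_COMPILATION, SCHEMABINDING, EXECUTE AS OWNER
AS
BEGIN ATOMIC WITH
    (TRANSACTION ISOLATION LEVEL = SNAPSHOT, LANGUAGE = N'us_english')
    DELETE FROM dbo.ShoppingCart
    WHERE SessionID = @SessionID AND ProductKey = @ProductKey;
END;
```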
Demonstration Steps
Create a natively compiled stored procedure
1. Ensure that the 20464C-MIA-DC and 20464C-MIA-SQL virtual machines are running and then log on
to 20464C-MIA-SQL as ADVENTUREWORKS\Student with the password Pa$$w0rd.
4. In the Connect to Server window, in Server name, type MIA-SQL and then click Connect.
5. On the File menu, click Open, click Project/Solution, navigate to
D:\Demofiles\Mod11\Demo11.ssmssln, and then click Open.
9. Follow the instructions contained within the comments of the script file.
10. Close SQL Server Management Studio without saving any changes.
You are planning to optimize some database workloads by using the in-memory database capabilities of
SQL Server 2014. You will create memory-optimized tables and natively compiled stored procedures to
optimize OLTP workloads.
Objectives
After completing this lab, you will be able to:
Password: Pa$$w0rd
2. In the D:\Labfiles\Lab11\Starter folder, right-click Setup.cmd, and then click Run as administrator.
3. When you are prompted, click Yes to confirm that you want to run the command file, and then wait
for the script to finish.
2. Add a file for memory-optimized data to the InternetSales database. You should store the file in the
filegroup that you created in the previous step.
o SessionID: integer
o TimeAdded: datetime
o CustomerKey: integer
o ProductKey: integer
o Quantity: integer
3. The table should include a composite primary key hash index on the SessionID and ProductKey
columns with 100000 buckets.
4. Test the table by inserting some rows and querying the table. You can use any valid values for this
test.
Results: After completing this exercise, you should have created a memory-optimized table and a natively
compiled stored procedure in a database with a filegroup for memory-optimized data.
2. Create a natively compiled stored procedure named DeleteItemFromCart. The stored procedure
should include SessionID and ProductKey parameters, and should delete matching rows from the
ShoppingCart table by using a SNAPSHOT isolation transaction.
3. Create a natively compiled stored procedure named EmptyCart. The stored procedure should accept a SessionID parameter, and should delete matching rows from the ShoppingCart table by using a SNAPSHOT isolation transaction.
4. Test each of the stored procedures by writing Transact-SQL statements to call them with appropriate
parameter values.
Results: After completing this exercise, you should have created a natively compiled stored procedure.
Module 12
Implementing Managed Code in SQL Server
Contents:
Module Overview 12-1
Module Overview
As a database professional, you are asked to create databases and related objects to meet business needs.
You can meet most requirements by using Transact-SQL. However, there are times when the requirements
go beyond the abilities of Transact-SQL. These requirements may include functionality such as:
Complex or compound data types, such as currency values that include culture information, complex
numbers, and dates that include a calendar system, or storing entire arrays of values in a single
column.
Accessing image files on the operating system and reading them or copying them into the database.
All of these are examples of requirements that you can meet by using common language runtime (CLR)
integration in Microsoft® SQL Server® data management software. You can use integrated code to
create user-defined functions, stored procedures, aggregates, types, and triggers. You can develop these
objects by using any .NET language and they can be highly specialized. In this module, you will learn
about using CLR integrated code to create user-defined database objects that the .NET Framework
manages.
Objectives
After completing this module, you will be able to:
Lesson 1
Introduction to CLR Integration in SQL Server
Among database professionals, there is a constant desire to extend the built-in functionality of SQL Server.
For example, you might want to add a new aggregate to the existing list of aggregates that SQL Server
supplies. There is no right or wrong method to extend the product. Particular methods are more or less
suited to particular needs and situations. CLR integration in SQL Server is one method for extending SQL
Server. It is important to understand CLR integration in SQL Server and its appropriate use cases.
Lesson Objectives
After completing this lesson, you will be able to:
Managed Code
Managed code is code that is written to operate within the .NET Framework. Some database administrators are concerned about running managed code within the Database Engine, but it is important to realize that even managed code cataloged with the UNSAFE permission set is safer than equivalent extended stored procedure code.
You can create many applications by using the “out-of-the-box” tools and functionality that SQL Server
provides. However, being able to reuse previously developed functionality helps to produce higher quality
outcomes. Therefore, it is desirable to package that reusable functionality as an extension of the SQL
Server product.
Many SQL Server components are extensible. As an example, SQL Server Reporting Services enables you
to create rendering extensions, security extensions, data processing extensions, delivery extensions,
custom code, and external assemblies.
.NET Framework
The .NET Framework is a layer of software that sits above the Win32 and Win64 APIs and abstracts the
underlying complexity. This framework is written in a consistent fashion to a tightly written set of design
guidelines. Many people describe it as appearing to have been “written by one brain.” It is not specific to
any one programming language and also contains many thousands of prebuilt and pretested objects.
These objects are collectively referred to as the .NET Framework class libraries.
These capabilities make the .NET Framework a good base for building code to extend SQL Server.
Security features to ensure that managed code will not compromise the server.
The ability to create new resources by using .NET languages such as Microsoft Visual C#® and
Microsoft Visual Basic® .NET.
Memory Management
A key problem that arose in development directly against the Win32 and Win64 APIs related to memory
management. In older Component Object Model (COM) programming that was used with these APIs,
releasing memory when it was no longer needed was based on reference counting. The idea was that the
following sequence of events would occur:
Object C might then acquire a reference to Object B, too. Object B then notes that it has two
references to itself.
Object C releases its reference. Object B then notes that it has only a single reference to itself.
Object A releases its reference, too. Object B then notes that it now has no references to itself, so it
proceeds to destroy itself.
The problem with this scheme is that it is easy to create situations where memory is lost. For a simple
example, consider circular references. If two objects have references to each other, but no other object
has any reference to either of them, they can both sit in memory forever as long as they have a reference
to each other. This causes a leak (or loss) of the memory that those objects consume. Over time, creation
of such situations could cause the loss of all available memory on the system.
This sort of memory management scheme would not be suitable within the Database Engine. The .NET
Framework includes a sophisticated memory management system that is known as garbage collection. It
is designed to avoid any chance of such memory leaks. Instead of objects needing to count references, the
CLR periodically checks which objects are “reachable” and disposes of the other objects.
Type Safety
Another common problem with Win32 and Win64 code relates to what is known as type safety. When a
function or procedure is called, all that is known to the caller is the address in memory of the function.
The caller assembles a list of any required parameters, places them in an area that is called the stack, and
jumps to the memory address of the function. Problems arise when the design of the function and/or its
parameters change and the calling code is not updated. The function can then end up referring to
memory locations that do not exist.
The .NET CLR is designed to avoid such problems. As an example, in addition to providing details of the
address of a function, it provides details of what is called the signature of a function. This specifies the
data types of each of the parameters and the order that they need to be in. The CLR will not enable a
function to be called with the wrong number or types of parameters. This is referred to as “type safety.”
CLS
The CLS is the common language specification. It specifies the rules that languages must conform to, so
that interoperability between languages is possible. For example, even though it is possible in C# to create
a method called SayHello and another method called Sayhello, these methods could not be called from
another language that was not case-sensitive. The CLS states that, to avoid interoperability problems, you
should not create these two methods.
Although there have been advances in error handling in Transact-SQL in recent years, the error handling
that the Transact-SQL language provides is still well short of the type of error handling that higher-level
languages typically provide. Writing managed code enables you to take advantage of these more
extensive error-handling capabilities.
Stored procedures
User-defined aggregates
Transact-SQL
Transact-SQL is the primary method for
manipulating data within databases. It is designed
for direct data access and offers high performance,
particularly when it is working against very large
sets of data. However, Transact-SQL is not a fully-
fledged high-level programming language.
Managed Code
Managed code provides full object-oriented capabilities, although this only applies within the managed
code itself. Transact-SQL code does not support object-oriented capabilities.
Managed code works well in situations that require intensive calculations (such as encryption) or string
handling.
General Rules
Two good general rules apply when you are making a choice between using Transact-SQL and managed
code:
The more data-oriented the need is, the more likely it is that Transact-SQL will be the better answer.
The more the need is focused on calculation, strings, or external access, the more likely it is that
managed code will be the better answer.
Scalar UDFs
It is well-known that scalar user-defined functions
(UDFs) that are written in Transact-SQL can cause
performance problems in SQL Server environments.
Managed code is often a good option for
implementing scalar UDFs as long as the function
does not depend on data access.
Table-Valued UDFs
The more data-related table-valued UDFs are, the more they are likely to be best implemented in
Transact-SQL. A common use case for managed code in table-valued UDFs is for functions that need to
access external resources such as the file system, environment variables, and the registry.
Stored Procedures
Stored procedures have traditionally been written in Transact-SQL. Most stored procedures should
continue to be written in Transact-SQL. There are very few good use cases for managed code in stored
procedures. The exceptions to this are stored procedures that need to access external resources or
perform complex calculations. However, you should consider whether code that performs these tasks
should be implemented within SQL Server at all.
DML Triggers
Almost all data manipulation language triggers are heavily oriented toward data access and are written in
Transact-SQL. There are very few valid use cases for implementing DML triggers in managed code.
DDL Triggers
Data definition language triggers are again often data-oriented. However, some DDL triggers need to do
extensive XML processing, particularly based on the XML EVENTDATA structure that SQL Server passes to
these triggers. The more that extensive XML processing is required, the more likely it is that the DDL
trigger would be best implemented in managed code. Managed code would also be a better option if the
DDL trigger needed to access external resources, but this is rarely a good idea within any form of trigger.
User-Defined Aggregates
Transact-SQL offers no concept of user-defined aggregates. You need to implement these in managed
code.
Lesson 2
Importing and Cataloging Assemblies
Assemblies are the unit of both deployment and security in the .NET Framework. Managed code in SQL
Server resides within assemblies. Before you can start to work with managed code in SQL Server, you need
to learn about assemblies and how you can import them into SQL Server and secure them.
Lesson Objectives
After completing this lesson, you will be able to:
Detail the permission sets that are available for securing assemblies.
Import an assembly.
What Is an Assembly?
Assemblies are the unit of both deployment and
security in the .NET Framework. They contain the
code that will be executed, are self-describing, and
may contain resources.
Structure of an Assembly
Prior to managed code, executable files (.exe files)
and dynamic-link libraries (.dll files) contained only
executable code. Compilers produce executable
code by converting instructions in higher-level
languages into the binary codes that the
computer’s processor requires for execution.
Managed code assemblies have a specific structure. In addition to executable code, they contain a
manifest. This manifest provides a list of the contents of the assembly and of the programming interfaces
that the assembly provides. This enables other code to interrogate an assembly to determine both what it
contains and what it can do. As an example, SQL Server can gain a great deal of understanding of an
assembly by reading this manifest when it is cataloging an assembly.
Assemblies can contain other resources such as icons. These are also listed in the manifest. You can
structure assemblies as either .exe files or .dll files. The only difference between the two is that .exe files also include an area that is called the portable executable (PE) header, which the operating system uses to find out where the executing code of an .exe file starts. SQL Server will only import .dll files and
will refuse to import .exe files.
Assemblies also form a boundary at which security is applied. In the next topic, you will see how this
security is configured.
SAFE
Administrators should regard SAFE as really meaning what the name says. It is a particularly limited
permission set, but it does allow access to the SQL Server database in which it is cataloged via a special
type of connection that is known as a context connection. Administrators should be comfortable with the
cataloging of SAFE assemblies. SAFE is the default permission set.
EXTERNAL_ACCESS
EXTERNAL_ACCESS is the permission set that is required before code in an assembly can access local and
network resources, environment variables, and the registry of the server. This permission set is still quite
safe and is typically used when any form of external access is required. Administrators should be fairly
comfortable with the cataloging of EXTERNAL_ACCESS assemblies, after a justification for the external
access requirements has been made.
UNSAFE
UNSAFE is the unrestricted permission set. It should be rarely used for general development. UNSAFE is
required for code that calls external unmanaged code or code that holds state across function calls, and
so on. Administrators should only allow the cataloging of UNSAFE assemblies in situations that have been
very carefully considered and justified.
You can flag the database as TRUSTWORTHY by using the ALTER DATABASE statement with the SET TRUSTWORTHY ON option. In general, this is not recommended unless you understand the changes that it makes to the database security environment.
An asymmetric key is created from the assembly file that is cataloged in the master database. Next, a
login mapping to that key is created. Finally, the login is granted the EXTERNAL ACCESS ASSEMBLY
permission on the assembly. This is the recommended method of granting permission to use
EXTERNAL_ACCESS or UNSAFE permission sets, but setting it up is an advanced topic that is beyond
the scope of this course.
Importing an Assembly
Before you can use the code in an assembly within
SQL Server, you must import and catalog the
assembly within a database.
CREATE ASSEMBLY
You can use the CREATE ASSEMBLY statement
both to import and catalog an assembly within the
current database. SQL Server assigns a permission
set to the assembly that is based on the WITH
PERMISSION_SET clause in the CREATE ASSEMBLY
statement. If no permission set is explicitly
requested, the assembly will be cataloged as a SAFE
assembly and the code within the assembly will only
be able to execute tasks that the SAFE permission set permits.
Before you can execute any code in a user-created assembly, you must set the ‘clr enabled’ option to 1
(enabled) at the instance level. It is still possible to catalog an assembly and the objects within it even if
this option is disabled. It only prevents code execution.
After the assembly is cataloged in the database, the contents of the assembly are contained within the
database and SQL Server no longer needs the file from which it was cataloged. After the assembly is
cataloged, it will be loaded from within the database when it is required, not from the file system.
Assembly Path
There are three locations from which an assembly can be imported:
1. A .dll file on a local drive. The drive cannot be a mapped drive.
2. A .dll file from a Universal Naming Convention (UNC) path. (A UNC path is of the form
\\SERVER\Share\PathToFile\File.dll.)
3. A binary string that contains the contents of the .dll file.
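The import steps described above can be sketched as follows; the assembly name and file path are illustrative, and SAFE is the default permission set:

```sql
-- Enable CLR code execution at the instance level
-- (cataloging works without it; execution does not).
EXEC sp_configure 'clr enabled', 1;
RECONFIGURE;

-- Import and catalog the assembly from a local .dll file.
CREATE ASSEMBLY SQLCLR_Demo
FROM 'D:\Demofiles\Mod12\SQLCLR_Demo.dll'
WITH PERMISSION_SET = SAFE;
```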
At first, it might seem odd to consider cataloging an assembly from a binary string, but this is how Visual
Studio catalogs assemblies if you deploy an assembly directly from Visual Studio. Visual Studio cannot
assume that you have access to the file system of the server. You might be using an instance of SQL Server
or using a database that a hosting company is hosting and have no access to the file system of the server
at all.
Cataloging an assembly from a binary string enables you to stream an assembly to the server within the
CREATE ASSEMBLY statement. It is worth noting that, if you later generate a script for the database, any
contained assemblies will also be scripted as binary strings.
Demonstration Steps
Import and catalog an assembly
1. Ensure that the 20464C-MIA-DC and 20464C-MIA-SQL virtual machines are running and then log on
to 20464C-MIA-SQL as ADVENTUREWORKS\Student with the password Pa$$w0rd.
4. In the Connect to Server window, in Server name, type MIA-SQL and then click Connect.
5. On the File menu, click Open, click Project/Solution, navigate to
D:\Demofiles\Mod12\Demo12.ssmssln, and then click Open.
8. Follow the instructions contained within the comments of the script file.
Lesson 3
Implementing CLR Integration in SQL Server
After an assembly has been cataloged, you also need to catalog any objects within it. This will make the
objects visible within SQL Server so that they can be called from within Transact-SQL code.
Lesson Objectives
After completing this lesson, you will be able to:
Explain how appropriate attribute usage is important when you are creating assemblies.
Implement stored procedures that have been written in managed code and that require access to
external resources.
Implement user-defined data types that have been written in managed code.
Take into account considerations for user-defined data types that have been written in managed
code.
Attribute Usage
Attributes are metadata that is included within code
and is used to describe that code. When you are
implementing managed code within SQL Server,
attributes are used for reasons of deployment,
performance, and correctness.
Attributes
If you have not written any managed code, the
concept of attributes may be unfamiliar to you.
Attributes are metadata (or data about data) that is
used to describe functions, methods, and classes.
Attributes do not form part of the logic of the
objects; instead, they describe aspects of them.
For example, consider an attribute that records the name of the author of a method. This does not change
how the method operates, but it could be useful information for anyone who uses the method. The .NET
Framework also has a special set of logic called Reflection that enables one set of managed code to
interrogate details of another set of managed code. Attributes are returned as part of this process. SQL
Server accesses the attributes that you associate with your code through reflection.
Deployment
The first reason why attributes are helpful relates to deployment. Adding a SqlFunction attribute to a
managed code method tells Visual Studio (or other code that is used for deployment) that the method
should be cataloged as a function within SQL Server. Adding an attribute to a method is also referred to
as “adorning” the method with the attribute.
If you do not add a SqlFunction attribute to a method, you can still manually catalog the method as a
function in SQL Server. The limitation is that automated deployment systems will not know to do so.
You might wonder why SQL Server does not just automatically catalog all methods as functions when it
catalogs an assembly. The reason is that you can use methods for more than just functions. Some
methods are only used within the assembly and are not intended to be used by code that utilizes the
functionality that the assembly provides.
Performance
The second reason why attributes are helpful relates to performance. Consider the DataAccess property
of the SqlFunction attribute that is shown on the slide. This property tells SQL Server that no data context
needs to be provided for this method. It does not access data from the database. This makes the function
quicker to execute and reduces its memory requirements.
As another example of how an attribute can help with performance, consider an attribute that tells SQL
Server that a method call always returns NULL if the parameter that is passed to the method is NULL. In
that case, SQL Server knows that it does not need to call the method at all if the value is NULL.
Correctness
The final reason why attributes are helpful relates to correctness. If a new Circle data type is created, it
might provide a method that is called Shrink. SQL Server needs to know that if this method is called, the
internal state of the user-defined data type will be changed when the method returns. This helps SQL
Server to know how the method can be used. For example, SQL Server would then know that the method
could be called in the SET clause of an UPDATE statement. It would also prevent SQL Server from
enabling the method to be called in a SELECT list or WHERE clause in a SELECT statement.
Scalar UDFs
Scalar user-defined functions are a common use
case for managed code and often offer a higher-
performing alternative to their equivalent Transact-
SQL functions.
CREATE FUNCTION
You can use the CREATE FUNCTION statement to
catalog a scalar user-defined function that has been
written in managed code. In the statement, you
need to provide the details of the returned data
type and a path to the method within the assembly.
Note that the name that a function is called within
SQL Server does not have to match the name that
the method is called within the assembly. However, it is considered good practice to have these matched
with each other to avoid confusion.
EXTERNAL NAME
When you are cataloging the function, the EXTERNAL NAME clause is used to point to where the method
exists within the assembly. This normally consists of a three-part name:
The first part of the name refers to the alias for the assembly that was used in the CREATE
ASSEMBLY statement.
The second part of the name must contain the namespace that contains the method. In the example
on the slide, UserDefinedFunctions is a class. However, the UserDefinedFunctions class itself could
be contained within another namespace. If that other namespace was called CompanyFunctions, the
second part of the name would need to be specified as
[CompanyFunctions.UserDefinedFunctions].
The third part of the name refers to the method within the class.
Note that even if the code has been built in a case-insensitive language such as Visual Basic, and the
database collation is set to case-insensitive, the assembly name that is provided in the EXTERNAL NAME
clause is case-sensitive.
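To make the three-part naming concrete, the following sketch catalogs a scalar function whose class sits inside a containing namespace. The assembly alias, function name, and method name here are illustrative, not taken from the course files.

```sql
-- Illustrative names: SQLCLR_Demo is the alias from CREATE ASSEMBLY,
-- [CompanyFunctions.UserDefinedFunctions] is the namespace-qualified class,
-- and IsWeekend is the static method within that class.
CREATE FUNCTION dbo.IsWeekend (@DateToCheck date)
RETURNS bit
AS EXTERNAL NAME
    SQLCLR_Demo.[CompanyFunctions.UserDefinedFunctions].IsWeekend;
```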
Table-Valued UDFs
Table-valued functions (TVFs) are cataloged in a
similar way to scalar functions, but they need to
include the definition of the returned table.
CREATE FUNCTION
You can also use the CREATE FUNCTION
statement to catalog TVFs that are written in
managed code. The return data type, however,
must be TABLE. After the data type, you need to
provide the definition of the schema of the table. In
the example shown on the slide, the table consists
of two columns, both of integer data type.
Deployment Attribute
The definition of TVFs provides an example of why the properties of an attribute are useful. First, the
SqlFunction attribute indicates that the method should be cataloged as a function. The properties of the
attribute indicate:
The name of the FillRow method. (Do not be concerned with the FillRowMethodName property at
this point. Although it must be present, it relates to the internal design of the function.)
The schema for the returned table. An automated deployment system (such as the one provided in
Visual Studio) needs to know the format of the returned table to be able to automatically catalog this
function in SQL Server.
Parameter Naming
The names that you choose for the parameter in Transact-SQL do not need to match the names that you
use in the managed code.
MCT USE ONLY. STUDENT USE PROHIBITED
12-16 Implementing Managed Code in SQL Server
For example, you could catalog the function in the example on the slide in the following way:
Parameter Naming
CREATE FUNCTION dbo.RangeOfIntegers
(@StartValue int, @EndValue int)
RETURNS TABLE (PositionInList int, IntegerValue int)
AS EXTERNAL NAME
SQLCLR_Demo2.UserDefinedFunctions.RangeOfIntegers
However, you should create Transact-SQL parameters that have the same name as the parameters in the
managed code unless there is a compelling reason to make them different. An example of this would be a
parameter name that was used in managed code that was not a valid parameter name in Transact-SQL.
Even in this situation, a better option would be to change the parameter names in the managed code
wherever possible.
CREATE PROCEDURE
You can use the CREATE PROCEDURE statement to
catalog a stored procedure that is written in
managed code. The relevant deployment attribute
is the SqlProcedure attribute. This attribute tells
Visual Studio (or any other deployment tool) that
the method should be cataloged as a stored procedure.
You should list parameters that need to be passed to the stored procedure in the same way that they are
listed for a Transact-SQL stored procedure definition.
SqlPipe
Stored procedures that are written in managed code support both input and output parameters, just like
their equivalent procedures in Transact-SQL.
Like stored procedures that are written in Transact-SQL, stored procedures that are written in managed
code need a way to return rows of data. You use the SqlPipe object within the stored procedure code to
achieve this; the SqlPipe object can send rows of data back to the caller.
If you call the Send method of the SqlPipe object and pass a string value to it, the outcome is the same
as if you had issued a PRINT statement in a Transact-SQL–based stored procedure. You will see the values
returned on the Messages tab in SQL Server Management Studio.
You can see the SqlPipe object used in the following code example:
SqlPipe
using System;
using System.Data.SqlClient;
using System.Data.SqlTypes;
using Microsoft.SqlServer.Server;

public partial class StoredProcedures
{
    [SqlProcedure]
    public static void ProductsByColor(SqlString Color)
    {
        // The context connection reuses the session in which the
        // procedure is executing, rather than opening a new connection.
        SqlConnection conn =
            new SqlConnection("context connection=true");
        SqlCommand command = conn.CreateCommand();
        SqlPipe outputPipe = SqlContext.Pipe;
        // Sending a string behaves like a Transact-SQL PRINT statement.
        outputPipe.Send("Hello. It's now " +
            DateTime.Now.ToLongTimeString() + " at the server.");
        if (Color.IsNull)
        {
            command.CommandText =
                "SELECT * FROM Production.Product "
                + "WHERE (Color IS NULL) ORDER BY ProductID";
        }
        else
        {
            command.CommandText =
                "SELECT * FROM Production.Product "
                + "WHERE (Color = @Color) ORDER BY ProductID";
            command.Parameters.Add(
                new SqlParameter("@Color", Color.Value));
        }
        conn.Open();
        // Sending a data reader returns its rows as a result set.
        outputPipe.Send(command.ExecuteReader());
        conn.Close();
    }
}
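Assuming the assembly was cataloged with an alias such as SQLCLR_Demo (an illustrative name), the ProductsByColor method could then be cataloged as a stored procedure in the following way:

```sql
-- The Transact-SQL parameter maps to the SqlString argument
-- of the managed method.
CREATE PROCEDURE dbo.ProductsByColor
    @Color nvarchar(15)
AS EXTERNAL NAME
    SQLCLR_Demo.StoredProcedures.ProductsByColor;
```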
Access to the file system requires EXTERNAL_ACCESS permission when the assembly that contains the
method is cataloged.
The xp_cmdshell extended stored procedure is disabled by default in SQL Server, yet many applications
require it to be enabled because it enables them to perform operations at the file-system level. Enabling
xp_cmdshell is undesirable from a security perspective, and managed code provides alternative ways to
implement this required functionality in a much safer form.
Triggers
You can implement both DML and DDL triggers
from within managed code.
CREATE TRIGGER
You can use the CREATE TRIGGER statement to
catalog methods in managed code assemblies as
triggers. The relevant deployment attribute is
SqlTrigger. The SqlTrigger attribute properties
that are most useful are:
Access to Modifications
Like triggers that are written in Transact-SQL, triggers that are written in managed code can access the
details of the changes being made or the commands that have been executed.
Within DML triggers, access is provided to the inserted and deleted virtual tables in exactly the same way
as in DML triggers that are written in Transact-SQL.
Similarly, within DDL triggers, access is provided to the XML EVENTDATA structure.
SqlTriggerContext
A DML trigger can be associated with multiple events on an object. Within the code of a DML trigger, you
may need to know which event has caused the trigger to fire. You can use the SqlTriggerContext class to
build logic based on the event that caused the trigger to fire.
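As a sketch, a managed DML trigger might be cataloged as follows; the assembly alias, class, and method names are illustrative:

```sql
-- Catalogs a managed method as an AFTER trigger for all three DML events.
CREATE TRIGGER dbo.AuditProductChanges
ON Production.Product
AFTER INSERT, UPDATE, DELETE
AS EXTERNAL NAME
    SQLCLR_Demo.Triggers.AuditProductChanges;
```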
User-Defined Aggregates
User-defined aggregates are an entirely new type of
object for SQL Server; you cannot create them in
Transact-SQL. The ability to create aggregates
enables you to provide additional aggregates that
the built-in set of aggregates does not provide. For
example, you might decide that you need a
MEDIAN aggregate, but SQL Server does not supply
one. Another good use case for creating aggregates
occurs when you are migrating code from another
database engine that offers aggregates that differ
from those that SQL Server provides. You could also
create aggregates to operate on data types that the
built-in aggregates do not support.
CREATE AGGREGATE
You can use the CREATE AGGREGATE statement to catalog user-defined aggregates that are written in
managed code. The relevant deployment attribute is SqlUserDefinedAggregate. Note that the path to a
struct or class will be a two-part name, as shown in the EXTERNAL NAME clause on the slide.
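For example, a hypothetical MEDIAN aggregate (all names here are illustrative) would be cataloged with a two-part path to the struct:

```sql
-- Two-part EXTERNAL NAME: the assembly alias, then the struct that
-- implements the aggregation contract (Init, Accumulate, Merge, Terminate).
CREATE AGGREGATE dbo.Median (@Value int)
RETURNS int
EXTERNAL NAME SQLCLR_Demo.[Median];
```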
Serialization
SQL Server needs to be able to store interim results while it calculates the value of an aggregate. In
managed code, the ability to save an object as a stream of data is called “serializing” the object. User-
defined aggregates need to be serializable. In managed code, you can implement them as either classes
or structs (data structures). Most user-defined aggregates would be implemented as structs rather than as
classes, because structs are easier to implement.
The Format.Native property that is shown in the example on the slide indicates that the struct will be
serialized by using the standard serialization mechanisms that are built in to the .NET Framework. You can
only use the built-in serialization with simple data types. For more complex data types, you need to add
user-defined serialization.
Attribute Properties
A few more useful attribute properties are shown in the example on the slide.
IsInvariantToDuplicates. This attribute property tells SQL Server that the result of the aggregate is
the same even if it does not see the values from every row. It only needs to see unique values. To
visualize this, consider which rows the built-in MAX or MIN aggregates need to process and how this
compares to which rows the built-in COUNT aggregate needs to see.
IsInvariantToNulls. This attribute property tells SQL Server that the result of the aggregate is
unaffected by seeing rows that do not have a value in the relevant column.
IsNullIfEmpty. This attribute property tells SQL Server that if no rows need to be processed, the
aggregate does not need to be called because the result will be NULL anyway.
Name. This attribute property tells Visual Studio (or any other deployment tool) what name the
aggregate should have when it is cataloged.
Note: This is not a complete list of all the possible properties, just the most useful ones.
CREATE TYPE
You can use the CREATE TYPE statement to catalog
user-defined data types. The data type will be
defined as a class in a managed code assembly.
Similar to user-defined aggregates, data types need
to be serializable because SQL Server needs to be able to store them. The deployment attribute is
SqlUserDefinedType.
The geometry, geography, and hierarchyid system data types are, in fact, system CLR data types. Their
operation is unrelated to the ‘clr enabled’ configuration setting at the SQL Server instance level. The ‘clr
enabled’ option only applies to user-created managed code.
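A minimal sketch of cataloging a user-defined data type follows; the assembly alias and class name are illustrative:

```sql
-- Catalogs a serializable managed class as a data type.
-- The path to the class is a two-part name: assembly alias, then class.
CREATE TYPE dbo.ComplexNumber
EXTERNAL NAME SQLCLR_Demo.[ComplexNumber];
```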
Accessing Properties
InstanceOfTheType.Property, for example, @Shape.STArea
The methods of an instance of a managed code data type are accessed by using the code in the following
example:
Accessing Methods
InstanceOfTheType.Method(), for example, @Shape.STDistance(@OtherShape)
Managed code data types might also include functionality that is useful without creating an object of the
data type first. This enables you to expose functions from within a data type somewhat like a code library.
The methods of the managed code data type itself are accessed by using the code in the following
example:
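A sketch of the static-method syntax, using the built-in geometry type as the example:

```sql
-- TypeName::StaticMethod(): no instance of the type is needed first.
DECLARE @Point geometry = geometry::STGeomFromText('POINT(1 1)', 0);
SELECT @Point.STX;
```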
Note: The one exception to this is that binary comparisons are permitted when the
IsByteOrdered property of the SqlUserDefinedType attribute is set to true. Even in this
situation, only a simple binary comparison is performed.
For example, you cannot compare two geometry data types by using the code that is shown in the
example below:
However, you can compare the properties of the two data types by using the code that is shown in the
example below:
For user-defined data types, there is no method for creating new types of index to support them. What
you can do is create a persisted calculated column in the same table and use it to “promote” the
properties of the user-defined data type into standard relational columns. You can then index these
columns.
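As a sketch, assuming a table with a geometry column (the table, column, and index names are illustrative), a property could be promoted and indexed like this:

```sql
-- Promote a value from the CLR type into a persisted computed column...
ALTER TABLE dbo.Shapes
ADD ShapeArea AS ShapeData.STArea() PERSISTED;

-- ...and then index the promoted relational column.
CREATE INDEX IX_Shapes_ShapeArea ON dbo.Shapes (ShapeArea);
```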
Operator Overloading
In object-oriented programming, it is possible to define or change the operators that operate on the
object. User-defined data types do not offer this capability. For example, you cannot define a customized
meaning for a > (greater than) operator.
Demonstration Steps
Create aggregates and user-defined data types
1. Ensure that the 20464C-MIA-DC and 20464C-MIA-SQL virtual machines are running and then log on
to 20464C-MIA-SQL as AdventureWorks\Student with the password Pa$$w0rd.
2. Run D:\Demofiles\Mod12\Setup.cmd as an administrator to revert any changes.
4. In the Connect to Server window, in Server name, type MIA-SQL and then click Connect.
6. If the previous demonstration was not performed, open the 21 - Demonstration 2A.sql script file
and execute steps 1 to 3.
10. Close SQL Server Management Studio without saving any changes
Supporting Documentation
The following list details the proposed functionality that is being considered for managed code.
Trigger that records balance movements that have a value of more than 1,000.
Stored procedure that writes an XML file for a given XML parameter.
Objectives
After completing this lab, you will be able to:
Password: Pa$$w0rd
2. For each object that is listed, determine whether it is appropriate to implement it in managed code.
Supporting Documentation
The following list details the proposed functionality being considered for managed code.
Trigger that records balance movements with a value of more than 1000.
Stored procedure that writes an XML file for a given XML parameter.
2. In the D:\Labfiles\Lab12\Starter folder, right-click Setup.cmd, and then click Run as administrator.
3. When you are prompted, click Yes to confirm that you want to run the command file, and then wait
for the script to finish.
Results: After this exercise, you should have created a list of which objects should be implemented in
managed code and the reasons for your decision.
2. Catalog the assembly and the functions that are contained within it.
2. Query the sys.assemblies and sys.assembly_files system views to confirm the details of how the
assembly has been cataloged.
3. Use the CREATE FUNCTION statement to catalog the dbo.IsValidEmailAddress function. It takes a
parameter named @email of type NVARCHAR(4000) and returns one bit. It is found in the assembly
at SQLCLRDemo.[SQLCLRDemo.CLRDemoClass].IsValidEmailAddress.
5. Use the CREATE FUNCTION statement to catalog the dbo.FolderList function. It takes two
parameters: @RequiredPath of type NVARCHAR(4000) and @FileMask of type NVARCHAR(4000).
It returns a table of file names, with one column called FileName of type NVARCHAR(4000). It is
found in the assembly at SQLCLRDemo.[SQLCLRDemo.CLRDemoClass].FolderList.
SELECT dbo.IsValidEmailAddress('test@somewhere.com');
GO
SELECT dbo.IsValidEmailAddress('test.somewhere.com');
GO
SELECT dbo.FormatAustralianPhoneNumber('0419201410');
GO
SELECT dbo.FormatAustralianPhoneNumber('9 87 2 41 23');
GO
SELECT dbo.FormatAustralianPhoneNumber('039 87 2 41 23');
GO
SELECT * FROM dbo.FolderList(
'D:\Labfiles\Lab12\Starter','*.txt');
GO
Results: After this exercise, you should have three functions working as expected.
Review Question(s)
Question: Which types of database objects can you implement by using managed code?
Module 13
Storing and Querying XML Data in SQL Server
Contents:
Module Overview 13-1
Lesson 2: Storing XML Data and XML Schemas in SQL Server 13-9
Module Overview
XML provides rules for encoding documents in a machine-readable form. It has become a widely adopted
standard for representing data structures rather than sending unstructured documents. Servers that are
running Microsoft® SQL Server® data management software often need to use XML to interchange data
with other systems and many SQL Server tools provide an XML-based interface.
SQL Server offers extensive handling of XML both for storage and for querying. This module introduces
XML, shows how it is possible to store XML data within SQL Server, and shows how to query the XML data.
The ability to query XML data directly avoids the need to shred it to a relational format before executing
Structured Query Language (SQL) queries. To effectively process XML, you need to be able to query XML
data in several ways: returning existing relational data as XML, querying data that is already XML, and
shredding XML data into a relational format.
Objectives
After completing this module, you will be able to:
Lesson 1
Introduction to XML and XML Schemas
Before you discover how to work with XML in SQL Server, it is important to understand XML itself and
how it is used outside SQL Server. You need to understand some core XML-related terminology, along
with how you can use schemas to validate and enforce the structure of XML. One common problem with
using XML in SQL Server is a tendency to overuse it. It is important to understand the appropriate uses for
XML when you are working with SQL Server.
Lesson Objectives
After completing this lesson, you will be able to:
Determine appropriate use cases for XML data storage in SQL Server.
Data Interchange
XML came to prominence as a format for
interchanging data between systems. It follows the
same basic structure rules as other markup
languages (such as HTML) and is used as a self-
describing language.
XML Document
<?xml version="1.0" encoding="iso-8859-1" ?>
<?xml-stylesheet href="orders.xsl"?>
<order id="ord123456">
<customer id="cust0921">
<first-name>Dare</first-name>
<last-name>Obasanjo</last-name>
<address>
<street>One Microsoft Way</street>
<city>Redmond</city>
<state>WA</state>
<zip>98052</zip>
</address>
</customer>
</order>
Even without any additional context or information, you can determine that this document holds the
details of an order, the customer who placed the order, and the customer’s name and address. This is
why XML is described as a self-describing language. In formal terminology, inferring the structure in this
way is described as “deriving a schema” from a document.
XML Specifics
The line in the example document that starts with “?xml” is referred to as a processing instruction. These
instructions are not a part of the data, but determine the details of encoding. The first instruction in the
example shows that version “1.0” of the XML specification is being used along with a specific encoding of
“iso-8859-1.” The second instruction indicates the use of the extensible style sheet “orders.xsl” to format
the document for display, if displaying the document is necessary.
The third line of the example is the “order” element. Note that the document data starts with an opening
order element and finishes with a closing order element shown as "</order>". The order element also has
an associated attribute named “id.”
It is important to realize that elements in XML (as in most other markup languages) are case-sensitive.
Element-Centric XML
<Customer>
<Name>Tailspin Toys</Name>
<Rating>12</Rating>
</Customer>
Attribute-Centric XML
<Customer Name="Tailspin Toys" Rating="12">
</Customer>
Note that if all data for an element is contained in attributes, a shortcut form of element is available.
Attribute-Centric Shortcut
<Customer Name="Tailspin Toys" Rating="12"></Customer>
<Customer Name="Tailspin Toys" Rating="12" />
XML Document
<order id="ord123456">
<customer id="cust0921" />
</order>
This code provides the details for a single order and would be considered to be an XML document.
XML Fragment
<order id="ord123456">
<customer id="cust0921" />
</order>
<order id="ord123457">
<customer id="cust0925" />
</order>
This text contains the details of multiple orders. Although it is perfectly reasonable XML, it is considered to
be a fragment of XML rather than a document.
To be called a document, the XML needs to have a single root element, as shown in the following
example.
Single Root
<orders>
<order id="ord123456">
<customer id="cust0921" />
</order>
<order id="ord123457">
<customer id="cust0925" />
</order>
</orders>
XML Namespaces
An XML namespace is a collection of names that
you can use as element or attribute names. It is
used to avoid conflicts with other names. Imagine
an XML instance that contains references to both a
product and an order. Both of these elements could
have a child element called id, so any reference to
the id element could easily be ambiguous.
Namespaces are used to remove that ambiguity.
XML Namespace
xmlns="http://schemas.microsoft.com/sqlserver/profiles/gml"
Note that specifying an address in a namespace does not necessarily mean that you could use the URI
that is provided to retrieve the details in any particular format. Many URIs that are used in namespaces
only link to an address where a human-readable description of the namespace is found. Many other URIs
do not lead to any real resources at all. The URI is simply used as a unique identifier for the namespace to
reduce the possibility of duplicate entries.
Prefixes
When you are declaring a namespace, an alias for the namespace is assigned. In XML terminology, this
alias is called a “prefix” because of the way it is used within the remainder of the XML.
XML Prefix
xmlns="urn:AW_NS" xmlns:o="urn:AW_OrderNS"
Two namespaces have been declared. The second namespace has been assigned the prefix o.
The prefix is then used later to identify which namespace any element name is part of, as shown below.
Using Prefixes
<o:Order SalesOrderID="43860" Status="5"
OrderDate="2001-08-01T00:00:00">
<o:OrderDetail ProductID="761" Quantity="2"/>
<o:OrderDetail ProductID="770" Quantity="1"/>
</o:Order>
In this snippet, the Order and OrderDetail elements are identified as being part of the urn:AW_OrderNS
namespace by being prefixed by o.
XML Schemas
XML schemas are used to provide rules that
determine the specific elements, attributes, and
layout that should be permitted within an XML
document.
XML schemas are often referred to as XML Schema Definitions (XSDs). XSD is also the default file
extension that most products use when they are storing XML schemas in operating system files.
There is no suggestion that this would make for a good database design, but note that you could use this
table design to store all objects from an application—customers, orders, payments, and so on—in a single
table. Compare this to how tables have been traditionally designed in relational databases.
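A minimal sketch of such a single-table design (illustrative only, and not a recommendation) might look like this:

```sql
-- One table holds every kind of application object as XML.
CREATE TABLE App.BusinessObjects
( ObjectID int IDENTITY(1,1) PRIMARY KEY,
  ObjectType sysname NOT NULL,   -- e.g. 'Customer', 'Order', 'Payment'
  ObjectData xml NOT NULL
);
```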
SQL Server gives the developer a wide range of choices, from a simple XML design at one end of the
spectrum to fully normalized relational tables at the other end. It is important to understand that there is
no generic right and wrong answer for where a table should be designed in this range of options.
You may be dealing with data that is already in XML, such as an order that you are receiving
electronically from a customer. You may want to share, query, and modify the XML data in an
efficient and transacted way.
You may need to achieve a level of interoperability between your relational and XML data. Imagine
the need to join a customer table with a list of customer IDs that are being sent to you as XML.
You may need to use XML formats to achieve cross-domain applications and need to have maximum
portability for your data. Other systems that you are communicating with may be based on entirely
different technologies and may not represent data in the same way as your server.
You may not know the structure of your data in advance. It is common to have a mixture of
structured and semistructured data. A table might hold some standard relational columns, but also
hold some less structured data in XML columns.
You may have very sparse data. Imagine a table that has thousands of columns where only a few
columns or rows ever tend to have any data in them. (Sparse column support in SQL Server provides
another mechanism for dealing with this situation, but it also uses XML in the form of XML column
sets. Sparse columns are an advanced topic that is beyond the scope of this course.)
You may need to have order within your data. For example, you might need to retain order detail
lines in a specific order. Relational tables and views have no implicit order. XML documents can
exhibit a predictable order.
You may want to have SQL Server validate that your XML data meets a particular XML schema before
processing it.
You may want to store transferred XML data for historical reasons.
You may want to create indexes on your XML data to make it faster to query.
Demonstration Steps
Structure XML and structure XML schemas
1. Ensure that the 20464C-MIA-DC and 20464C-MIA-SQL virtual machines are running and then log on
to 20464C-MIA-SQL as ADVENTUREWORKS\Student with the password Pa$$w0rd.
4. In the Connect to Server window, in Server name, type MIA-SQL and then click Connect.
5. On the File menu, click Open, click Project/Solution, navigate to
D:\Demofiles\Mod13\Demo13.ssmssln, and then click Open.
8. Follow the instructions contained within the comments of the script file.
Lesson 2
Storing XML Data and XML Schemas in SQL Server
Now that you have learned about XML, schemas, and the surrounding terminology, you can consider how
to store XML data and schemas within SQL Server. This is the first step in learning how to process XML
effectively within SQL Server.
You need to see how the XML data type is used, how to define schema collections that contain XML
schemas, how to declare both typed and untyped variables and database columns, and how to specify
how well-formed the XML data needs to be before it can be stored.
Lesson Objectives
After completing this lesson, you will be able to:
Choose whether XML fragments can be stored rather than entire XML documents.
XML Variable
DECLARE @Settings xml;
After you have declared a variable that has the xml data type, you can store any well-formed XML in it by
default.
Well-Formed XML
SET @Settings = '<Customer Name="Terry"></Customer>';
SET @Settings = '<Customer Name="Terry"><Customer>';
The first assignment would be successful and the second assignment would fail because the value that is
being assigned there is not well-formed XML.
Canonical Form
SQL Server stores XML data in an internal format that makes it easier for it to process the XML data when
required. It does not store the XML in the same format (including white space) as the data was received in.
Canonical Form
DECLARE @Settings xml;
SET @Settings = N'<Customer Name="Terry"></Customer>';
SELECT @Settings;
<Customer Name="Terry"/>
Note that the output that is returned is logically equivalent to the input, but the output is not in exactly
the same format as the input. It is referred to as having been returned in a “canonical” or logically
equivalent form.
XML Schemas
XML schemas are legible to humans at some level, but they are designed to be processed by computer
systems. Even simple schemas tend to have quite a high level of complexity. Fortunately, you do not need
to be able to read (or worse, write!) such schemas. Tools and utilities generally create XML schemas, and
SQL Server can create them, too. You will see an example of this in a later lesson.
XML Schema
<xsd:schema targetNamespace="urn:schemas-microsoft-com:sql:SqlRowSet1"
xmlns:schema="urn:schemas-microsoft-com:sql:SqlRowSet1"
xmlns:xsd="http://www.w3.org/2001/XMLSchema"
xmlns:sqltypes=
"http://schemas.microsoft.com/sqlserver/2004/sqltypes"
elementFormDefault="qualified">
<xsd:import namespace=
"http://schemas.microsoft.com/sqlserver/2004/sqltypes"
schemaLocation="http://schemas.microsoft.com/
sqlserver/2004/sqltypes/sqltypes.xsd" />
<xsd:element name="Production.Product">
<xsd:complexType>
<xsd:attribute name="ProductID" type="sqltypes:int"
use="required" />
<xsd:attribute name="Name" use="required">
<xsd:simpleType sqltypes:sqlTypeAlias=
"[AdventureWorks].[dbo].[Name]">
<xsd:restriction base="sqltypes:nvarchar"
sqltypes:localeId="1033" sqltypes:sqlCompareOptions=
"IgnoreCase IgnoreKanaType IgnoreWidth">
<xsd:maxLength value="50" />
</xsd:restriction>
</xsd:simpleType>
</xsd:attribute>
<xsd:attribute name="Size">
<xsd:simpleType>
<xsd:restriction base="sqltypes:nvarchar"
sqltypes:localeId="1033" sqltypes:sqlCompareOptions=
"IgnoreCase IgnoreKanaType IgnoreWidth">
<xsd:maxLength value="5" />
</xsd:restriction>
</xsd:simpleType>
</xsd:attribute>
<xsd:attribute name="Color">
<xsd:simpleType>
<xsd:restriction base="sqltypes:nvarchar"
sqltypes:localeId="1033" sqltypes:sqlCompareOptions=
"IgnoreCase IgnoreKanaType IgnoreWidth">
<xsd:maxLength value="15" />
</xsd:restriction>
</xsd:simpleType>
</xsd:attribute>
</xsd:complexType>
</xsd:element>
</xsd:schema>
You create an XML schema collection by using the CREATE XML SCHEMA COLLECTION syntax that is
shown in the following code snippet.
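A minimal example follows; the collection name matches the one used later in this lesson, and the schema content is illustrative:

```sql
CREATE XML SCHEMA COLLECTION SettingsSchemaCollection AS
N'<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
    <xsd:element name="Customer">
      <xsd:complexType>
        <xsd:attribute name="Name" type="xsd:string" use="required" />
      </xsd:complexType>
    </xsd:element>
  </xsd:schema>';
```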
System Views
You can see the details of the existing XML schema collections by querying the
sys.xml_schema_collections system view. You can see the details of the namespaces that are referenced by
XML schema collections by querying the sys.xml_schema_namespaces system view. Like XML, XML schema
collections are not stored in the format that you use to enter them. They are stripped into an internal
format.
You can get an idea of how XML schema collections are stored by querying the
sys.xml_schema_components system view, as shown in the following code example.
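For example, the following sketch joins the stored components to their owning collections:

```sql
SELECT xsc.name AS collection_name,
       comp.name AS component_name,
       comp.kind_desc
FROM sys.xml_schema_components AS comp
JOIN sys.xml_schema_collections AS xsc
    ON comp.xml_collection_id = xsc.xml_collection_id;
```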
Untyped XML
You may choose to store any well-formed XML.
One reason is that you might not have a schema for
the XML data. Another reason is that you might
want to avoid the processing overhead that is
involved in validating the XML against the XML
schema collection. For complex schemas, validating
the XML can involve substantial work.
The following example shows the creation of a table that has an untyped XML column.
Untyped XML
CREATE TABLE App.Settings
( SessionID int PRIMARY KEY,
WindowSettings xml
);
You can store any well-formed XML in the WindowSettings column, up to the maximum size of a SQL
Server XML object, which is currently 2 GB.
Typed XML
You may want to have SQL Server validate your data against a schema. You might want to take advantage
of storage and query optimizations based on the type information or want to take advantage of this type
information during the compilation of your queries.
The following example shows the same table being created, but this time, it has a typed XML column.
Typed XML
CREATE TABLE App.Settings
( SessionID int PRIMARY KEY,
WindowSettings xml (SettingsSchemaCollection)
);
In this case, a schema collection that is called SettingsSchemaCollection has been defined. SQL Server will
not enable data to be stored in the WindowSettings column if it does not meet the requirements of at
least one of the XML schemas in SettingsSchemaCollection.
This is equivalent to defining the table by using the following code because the CONTENT keyword is the
default value for typed XML declarations.
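Because CONTENT is the default for typed XML declarations, the equivalent explicit definition of the same table is:

```sql
CREATE TABLE App.Settings
( SessionID int PRIMARY KEY,
  WindowSettings xml (CONTENT SettingsSchemaCollection)
);
```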
Note the addition of the CONTENT keyword. When CONTENT is specified, you can store XML fragments
and entire well-formed XML documents in the typed XML location.
DOCUMENT Keyword
CREATE TABLE App.Settings
( SessionID int PRIMARY KEY,
WindowSettings xml (DOCUMENT SettingsSchemaCollection)
);
In this case, XML fragments would not be able to be stored in the WindowSettings column. Only well-
formed XML documents could be stored. For example, a column that is intended to store a customer
order can then be presumed to actually hold a customer order and not some other type of XML
document.
Demonstration Steps
Work with typed and untyped XML
1. Ensure that the 20464C-MIA-DC and 20464C-MIA-SQL virtual machines are running and then log on
to 20464C-MIA-SQL as AdventureWorks\Student with the password Pa$$w0rd.
4. In the Connect to Server window, in Server name, type MIA-SQL and then click Connect.
7. Follow the instructions contained within the comments of the script file.
Lesson 3
Implementing XML Indexes
Indexes on XML columns are critical for achieving the high performance of XML-based queries. There are
four types of XML index: a primary index and three types of secondary index. It is important to know how
you can use each of them to achieve the maximum performance gain for your queries.
Lesson Objectives
After completing this lesson, you will be able to:
It is important to note that XML indexes can be quite large compared to the underlying XML data.
Relational indexes are often much smaller than the tables on which they are built, but it is not uncommon
to see XML indexes that are larger than the underlying data.
You should also consider alternatives to XML indexes. Promoting a value that is stored within the XML to
a persisted calculated column would make it possible to use a standard relational index to quickly locate
the value.
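The promotion pattern can be sketched as follows. Because computed columns cannot call xml data type methods directly, the value is wrapped in a schema-bound scalar function; the /Settings/Theme path and all object names here are hypothetical:

```sql
-- Hypothetical wrapper function that extracts a value from the stored XML.
CREATE FUNCTION App.GetThemeName (@settings xml)
RETURNS nvarchar(50)
WITH SCHEMABINDING
AS
BEGIN
    RETURN @settings.value('(/Settings/Theme)[1]', 'nvarchar(50)');
END;
GO

-- Promote the value to a persisted computed column and index it relationally.
ALTER TABLE App.Settings
ADD ThemeName AS App.GetThemeName(WindowSettings) PERSISTED;

CREATE INDEX IX_Settings_ThemeName ON App.Settings (ThemeName);
```

Queries that filter on ThemeName can then use an ordinary relational index instead of an XML index.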
13-16 Storing and Querying XML Data in SQL Server
Based on the App.Settings table that was used as an example earlier, you could create a primary XML
index by executing the following code.
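A sketch of such a statement; the index name is illustrative:

```sql
CREATE PRIMARY XML INDEX PXML_Settings_WindowSettings
ON App.Settings (WindowSettings);
```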
A PATH index supports queries that check whether a particular path to an element or attribute exists. It is
typically used with the exist() XQuery method. (XQuery is discussed in a later lesson in this module.)
A PROPERTY index is used when retrieving multiple values through PATH expressions.
You can only create a secondary XML index after a primary XML index has been established.
When you are creating the secondary XML index, you need to reference the primary XML index.
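For example, a secondary PATH index might be declared as shown below; the index names are illustrative, and the USING XML INDEX clause supplies the required reference to the primary XML index:

```sql
CREATE XML INDEX IXML_Settings_WindowSettings_Path
ON App.Settings (WindowSettings)
USING XML INDEX PXML_Settings_WindowSettings
FOR PATH;
```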
Demonstration Steps
Implement XML indexes
1. Ensure that the 20464C-MIA-DC and 20464C-MIA-SQL virtual machines are running and then log on
to 20464C-MIA-SQL as AdventureWorks\Student with the password Pa$$w0rd.
4. In the Connect to Server window, in Server name, type MIA-SQL and then click Connect.
5. On the File menu, click Open, click Project/Solution, navigate to
D:\Demofiles\Mod13\Demo13.ssmssln, and then click Open.
7. Follow the instructions contained within the comments of the script file.
Lesson 4
Using the Transact-SQL FOR XML Statement
There is a common requirement to return data that is stored in relational database columns as XML
documents. Typically, this requirement relates to the need to exchange data with other systems, including
those from other organizations. When you add the FOR XML clause to a Transact-SQL SELECT statement,
it causes the output to be returned as XML instead of as a relational rowset. SQL Server provides several
modes for the FOR XML clause to enable the production of many styles of XML document.
Lesson Objectives
After completing this lesson, you will be able to:
3. EXPLICIT mode enables you to have more control over the shape of the XML. You can use it when
other modes do not provide enough flexibility, but this is at the cost of greater complexity. You can
mix attributes and elements as you like in deciding the shape of the XML.
4. PATH mode together with the nested FOR XML query capability provides much of the flexibility of
the EXPLICIT mode in a simpler manner.
1 A. Leonetti SC
2 A. Wright GC
3 A. Scott Wright EM
4 Aaron Adams IN
5 Aaron Alexander IN
6 Aaron Allen IN
Now look at the modified statement after adding the FOR XML clause.
Note that one XML element is returned for each row from the rowset, the element has a generic name of
row, and all columns are returned as attributes. The returned order is based on the ORDER BY clause.
In the example on the slide, you can see how to override the generic element name. In that example, the
elements have been named Order instead.
In addition, notice that the results have been returned as an XML fragment rather than as an XML
document. This is because there is no root element. Also, in the example on the slide, you can see how to
automatically add a root element called Orders.
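The slide query is not reproduced here, but a query of the following shape would produce Order elements wrapped in an Orders root element; the table and column list are assumptions:

```sql
SELECT SalesOrderID, OrderDate
FROM Sales.SalesOrderHeader
ORDER BY SalesOrderID
FOR XML RAW('Order'), ROOT('Orders');
```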
Element-Centric XML
You will notice that, in the previous examples, the columns from the rowset have been returned as
attributes. This is referred to as “attribute-centric” XML. You can modify this behavior to produce
“element-centric” XML by adding the ELEMENTS keyword to the FOR XML clause.
Element-Centric XML
SELECT FirstName, LastName, PersonType
FROM Person.Person
ORDER BY FirstName, LastName
FOR XML RAW, ELEMENTS;
<row>
<FirstName>A.</FirstName>
<LastName>Leonetti</LastName>
<PersonType>SC</PersonType>
</row>
<row>
<FirstName>A.</FirstName>
<LastName>Wright</LastName>
<PersonType>GC</PersonType>
</row>
Note that each column has been returned as a subelement of the row element.
Look at the following query (which is a modified version of the query that you saw in the last topic).
Each table in the FROM clause, from which at least one column is listed in the SELECT clause, is
represented as an XML element. The columns that are listed in the SELECT clause are mapped to
attributes, or to subelements if the optional ELEMENTS option is specified in the FOR XML clause. You can
see the output of this query below:
For this reason, it is common to provide an alias for the table, as shown in the following code.
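In AUTO mode, the element names come from the table name or its alias, so aliasing Production.Product as Product produces elements named Product rather than elements named by the full table name. A sketch:

```sql
SELECT Product.ProductID, Product.Name
FROM Production.Product AS Product
ORDER BY Product.ProductID
FOR XML AUTO;
```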
Note that in the example on the slide, the nesting of the resultant XML is based upon the ORDER BY
clause, and not on any form of grouping statement.
NULL Columns
Look at the following query.
NULL Results
SELECT ProductID, Name, Color
FROM Production.Product AS Product
ORDER BY ProductID
FOR XML AUTO;
Note that several products do not have any color. In the resultant XML, NULL values are not returned as
zero-length strings; they are omitted from the results by default. Although this is appropriate in general, it
can cause a specific problem when you are deriving an XML schema from an XML document. For
example, if someone sent you an XML document that contained product details and none of the products
happened to have a color, you would assume that there was no Color column.
XSINIL
To assist in situations where a schema needs to be derived from a document that contains nullable
columns, SQL Server provides an additional option called XSINIL. This option includes an element in the
output for each NULL value, marked with the xsi:nil="true" attribute to indicate that the element exists,
but that its value is currently NULL.
XSINIL
SELECT ProductID, Name, Color
FROM Production.Product AS Product
ORDER BY ProductID
FOR XML AUTO, ELEMENTS XSINIL;
<Product xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<ProductID>533</ProductID>
<Name>Seat Tube</Name>
<Color xsi:nil="true" />
</Product>
<Product xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<ProductID>534</ProductID>
<Name>Top Tube</Name>
<Color xsi:nil="true" />
</Product>
<Product xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<ProductID>535</ProductID>
<Name>Tension Pulley</Name>
<Color xsi:nil="true" />
</Product>
<Product xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<ProductID>679</ProductID>
<Color>Silver</Color>
</Product>
<Product xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<ProductID>680</ProductID>
<Color>Black</Color>
</Product>
<Product xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<ProductID>706</ProductID>
<Color>Red</Color>
</Product>
Note the difference between the rows that have no color and the rows that do have a color. You can also
use XSINIL in other modes such as PATH and RAW.
EXPLICIT mode gives you the power to mix attributes and elements at will, create wrappers and nested
complex properties, create space-separated values (for example, an OrderID attribute that holds a list of
order ID values), and create mixed content.
PATH mode, together with the nesting of FOR XML queries and the TYPE clause, gives enough power to
replace most of the EXPLICIT mode queries in a simpler, more maintainable way. EXPLICIT mode is rarely
needed now and is complicated to write queries for.
The slide provides an example of an XML PATH query. Note that the path to c.ContactID is shown as
@EmpID. Values that start with an at sign (@) in XPath refer to attributes. You can see in the output that
the c.ContactID value has been returned as the EmpID attribute of the row element.
The next two columns that are listed in the example on the slide detail the path to the values. For
example, the c.FirstName path is shown as EmpName/First. This indicates that the c.FirstName value
should be generated as an element named First that is a subelement of an element named EmpName,
which is itself returned as a subelement of the row element.
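The query being described can be sketched along these lines; the table name Person.Contact and the third column are assumptions based on the description above:

```sql
SELECT c.ContactID AS "@EmpID",
       c.FirstName AS "EmpName/First",
       c.LastName AS "EmpName/Last"
FROM Person.Contact AS c
FOR XML PATH;
```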
You can use FOR XML EXPLICIT mode queries to construct such XML from a rowset, but PATH mode
provides a simpler alternative to the potentially time-consuming EXPLICIT mode queries.
PATH mode, together with the ability to write nested FOR XML queries and the TYPE directive to return
xml data type instances, enables you to write less complex queries and gives enough power to replace
most of the EXPLICIT mode queries in a simpler, more maintainable way.
TYPE Keyword
In the previous topics in this lesson, you have seen
how FOR XML AUTO queries can return attribute-
centric or element-centric XML. If this data is
returned from a subquery, it needs to be returned
as a specific data type.
SQL Server 2005 introduced the xml data type, but for backward compatibility, the data type for return
values from FOR XML subqueries was not changed to xml. However, a new keyword, TYPE, was
introduced that changes the return data type of FOR XML subqueries to xml.
XML Subquery
SELECT Customer.CustomerID, Customer.TerritoryID,
(SELECT SalesOrderID, [Status]
FROM Sales.SalesOrderHeader AS soh
WHERE Customer.CustomerID = soh.CustomerID
FOR XML AUTO) as Orders
FROM Sales.Customer as Customer
WHERE EXISTS(SELECT 1 FROM Sales.SalesOrderHeader AS soh
WHERE soh.CustomerID = Customer.CustomerID)
ORDER BY Customer.CustomerID;
The previous query will return the Orders subquery as an nvarchar(max) column rather than as hyperlinked XML.
Now look at the following modified query.
TYPE Keyword
SELECT Customer.CustomerID, Customer.TerritoryID,
(SELECT SalesOrderID, [Status]
FROM Sales.SalesOrderHeader AS soh
WHERE Customer.CustomerID = soh.CustomerID
FOR XML AUTO, TYPE) as Orders
FROM Sales.Customer as Customer
WHERE EXISTS(SELECT 1 FROM Sales.SalesOrderHeader AS soh
WHERE soh.CustomerID = Customer.CustomerID)
ORDER BY Customer.CustomerID;
The addition of the TYPE keyword means that the second query will return the subquery data with an xml
data type that will be hyperlinked in SQL Server Management Studio.
Demonstration Steps
Use FOR XML queries
1. Ensure that the 20464C-MIA-DC and 20464C-MIA-SQL virtual machines are running and then log on
to 20464C-MIA-SQL as AdventureWorks\Student with the password Pa$$w0rd.
4. In the Connect to Server window, in Server name, type MIA-SQL and then click Connect.
5. On the File menu, click Open, click Project/Solution, navigate to
D:\Demofiles\Mod13\Demo13.ssmssln, and then click Open.
7. Follow the instructions contained within the comments of the script file.
Lesson 5
Getting Started with XQuery
In Lesson 4, you learned how to query relational data and return it as XML. Sometimes, however, the data
is already in XML and you may need to query it directly. You might want to extract part of the XML into
another XML document; you might want to retrieve the value of an element or attribute; you might want
to check whether an element or attribute exists; and finally, you might want to directly modify the XML.
XQuery methods make it possible to perform these tasks.
Lesson Objectives
After completing this lesson, you will be able to:
What Is XQuery?
XQuery is a query language that is designed to
query XML documents. It also includes elements of
other programming languages, such as looping
constructs.
XPath Expression
/InvoiceList/Invoice[@InvoiceNo=1000]
This XPath expression specifies a need to traverse the InvoiceList node (that is the root element because
the expression starts with a slash mark (/)), then traverse the Invoice subelements (note that there may be
more than one of these), and then to access the InvoiceNo attribute. All invoices that have invoice
number 1,000 are returned.
Although there is unlikely to be more than one invoice that has the number 1,000, nothing about XML
syntax (without a schema) enforces this. One thing that can be hard to get used to with the XPath syntax
is that you constantly need to specify that you want the first entry of a particular type, even though
logically you may think that it should be obvious that there would only be one. You indicate the first entry
in a list by the expression [1].
In XPath, you indicate attributes by using the at sign (@) prefix. The content of the element itself is
referred to by the token text().
FLWOR Expressions
In addition to basic path traversal, XQuery supports an iterative expression language that is known as
FLWOR and commonly pronounced “flower.” FLWOR stands for “for, let, where, order by, and return,” which
are the basic operations in a FLWOR query.
FLWOR Expression
SELECT @xmlDoc.query('<OrderedItems>
{
for $i in /InvoiceList/Invoice/Items/Item
return $i
}
</OrderedItems>');
This query wraps the results in an OrderedItems element. Within that element, it returns every item on
every invoice that is contained in the XML document as a subelement of the OrderedItems element. An
example of the output from this query is shown below.
<OrderedItems>
</OrderedItems>
Note that becoming proficient at XQuery is an advanced topic that is beyond the scope of this course. The
aim of this lesson is to make you aware of what is possible when you are using XQuery methods. The
available XQuery methods are shown in the following table.
Method Purpose
The nodes() method will be covered in the next lesson, which discusses shredding XML to relational data.
query() Method
You can use the query() method to extract XML
from an existing XML document. The XML that is
generated can be a subset of the original XML
document. Alternatively, it is possible to generate
entirely new XML based on the values that are
contained in the original XML document.
An XQuery expression in SQL Server consists of two sections: a prolog and a body. The prolog can contain
a namespace declaration. You will see how to do this later in this module. The body of an XQuery
expression contains query expressions that define the result of the query. Both the input and output of a
query() method are XML.
Note that if NULL is passed to a query() method, the result that the method returns is also NULL.
query() Method
SELECT XmlEvent.query(
'<EventSPIDs>
{
for $e in /EVENT_INSTANCE
return <SPID>
{number($e/SPID[1])}
</SPID>
}
</EventSPIDs>')
FROM dbo.DatabaseLog;
This query tells SQL Server to return one xml value for each row in the dbo.DatabaseLog table. The xml
value that is returned for each row will have a root element that is called EventSPIDs. For each
EVENT_INSTANCE node that is contained in the XmlEvent column within each row, a subelement
named SPID should be returned. The contents of that node will be the value of the first SPID subelement
of the EVENT_INSTANCE node returned as a number.
<EventSPIDs>
<SPID>69</SPID>
</EventSPIDs>
You will see how this works in the demonstration at the end of this lesson.
value() Method
The value() method is useful for extracting scalar
values from XML documents as a relational value.
This method takes an XQuery expression that
identifies a single node and the desired SQL type to
be returned. The value of the XML node is returned
cast to the specified SQL type.
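As a minimal sketch (reusing the dbo.DatabaseLog table from the query() example, and without the namespace handling used in the slide example), the following returns the first SPID of each event cast to int:

```sql
SELECT XmlEvent.value('(/EVENT_INSTANCE/SPID)[1]', 'int') AS EventSPID
FROM dbo.DatabaseLog;
```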
Do not be too concerned with the namespace declaration in the example shown on the slide. It is only
specified because the examples in the AdventureWorks database require this.
Example Output
You can see the output from this query in the following table.
Result
1 19
2 23
3 25
4 28
5 34
6 35
Note that, as with the query() method, if NULL is passed to the value() method, NULL will be returned.
exist() Method
Use the exist() method to check for the existence
of a specified value. The exist() method enables the
user to perform checks on XML documents to
determine whether the result of an XQuery
expression is empty or nonempty. The result of this
method is 1 if the expression returns a nonempty result, 0 if the result is empty, and NULL if the xml
instance against which the query runs is NULL. When you only need to check for existence, use the
exist() method on the xml data type instead of the value() method. The exist() method is most helpful
when it is used in a SQL WHERE clause and utilizes XML indexes more effectively than the value()
method.
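For example, the following sketch (reusing the dbo.DatabaseLog table and its DatabaseLogID key from earlier examples) returns only the rows whose XmlEvent column contains an EVENT_INSTANCE node with an SPID subelement:

```sql
SELECT DatabaseLogID
FROM dbo.DatabaseLog
WHERE XmlEvent.exist('/EVENT_INSTANCE/SPID') = 1;
```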
modify() Method
You can perform data manipulation operations on
an XML instance by using the modify() method.
The modify() method changes the contents of an
XML document.
Note that, unlike the previous methods, an error is returned if NULL is passed to the modify() method.
Examples
In the following example, a new Salesperson node with the text() value of Bill is inserted into the first position
of the first invoice in the list of invoices.
Insert
SET @xmlDoc.modify(
'insert element Salesperson {"Bill"}
as first
into (/InvoiceList/Invoice)[1]');
In the following example, the value of the Salesperson node is replaced by Ted.
Replace
SET @xmlDoc.modify(
'replace value of
(/InvoiceList/Invoice/Salesperson/text())[1]
with "Ted"');
In the following example, the Salesperson subelement would be removed from the InvoiceList/Invoice
path.
Delete
SET @xmlDoc.modify(
'delete
(/InvoiceList/Invoice/Salesperson)[1]');
Demonstration Steps
Use XQuery in DDL triggers
1. Ensure that the 20464C-MIA-DC and 20464C-MIA-SQL virtual machines are running and then log on
to 20464C-MIA-SQL as AdventureWorks\Student with the password Pa$$w0rd.
4. In the Connect to Server window, in Server name, type MIA-SQL and then click Connect.
5. On the File menu, click Open, click Project/Solution, navigate to
D:\Demofiles\Mod13\Demo13.ssmssln, and then click Open.
7. Follow the instructions contained within the comments of the script file.
Lesson 6
Shredding XML
Another common need that can arise when you are working with XML data in SQL Server is to be able to
extract relational data from within an XML document. For example, you might receive a purchase order
from a customer in XML format. You need to parse the XML to retrieve the details of the items that you
need to supply.
The extraction of relational data from within XML documents is referred to as “shredding” the XML
documents. There are two basic ways to do this. SQL Server 2000 supported the creation of an in-memory
tree that you could then query by using an OPENXML function. Although that is still supported, SQL
Server 2005 introduced the XQuery nodes() method, which in many cases will be an easier way to shred
XML data.
In addition to covering these areas in this module, you will see how Transact-SQL provides a way of
simplifying how namespaces are referred to in queries.
Lesson Objectives
After completing this lesson, you will be able to:
Describe how to shred XML data.
Use system stored procedures for creating and managing in-memory node trees that have been
extracted from XML documents.
Use the OPENXML function.
Shredding XML
The process for shredding XML is:
1. By calling sp_xml_preparedocument, an in-memory node tree is created, based on the input XML.
2. The OPENXML table-valued function is then used to query the in-memory node tree and extract
relational data.
3. The relational data that has been extracted is normally combined with other relational data as part of
standard Transact-SQL queries.
sp_xml_preparedocument
sp_xml_preparedocument is a system stored
procedure that takes XML either as the untyped
xml data type or as XML stored in the nvarchar
data type, creates an in-memory node tree from the
XML (to make it easier to navigate), and returns a
handle to that node tree.
sp_xml_preparedocument reads the XML text that was provided as input, parses the text by using the
Microsoft XML Core Services (MSXML) parser (Msxmlsql.dll), and provides the parsed document in a state
that is ready for consumption. This parsed document is a tree representation of the various nodes in the
XML document, such as elements, attributes, text, and comments.
Before you call sp_xml_preparedocument, you need to declare an integer variable to be passed as an
output parameter to the procedure call. When the call returns, the variable will then be holding a handle
to the node-tree.
It is important to realize that the node tree must stay available and unmoved in visible memory because
the handle is basically a pointer that needs to remain valid. This means that, on 32-bit systems, the node
tree will not be able to be stored in Address Windowing Extensions (AWE) memory.
sp_xml_removedocument
sp_xml_removedocument is a system stored procedure that frees the memory that a node tree occupies
and invalidates the handle.
In SQL Server 2000, sp_xml_preparedocument created a node tree that was session-scoped, that is, the
node tree remained in memory until the session ended or until sp_xml_removedocument was called. A
common coding error was to forget to call sp_xml_removedocument. Leaving too many node trees to
remain in memory was known to cause a severe lack of available low-address memory on 32-bit systems.
Therefore, a change was made in SQL Server 2005 that made the node trees created by
sp_xml_preparedocument become batch-scoped rather than session-scoped. Even though the tree will be
removed at the end of the batch, it is considered good practice to explicitly call sp_xml_removedocument
to minimize the use of low-address memory as much as possible.
Note that 64-bit systems generally do not have the same memory limitations.
OPENXML Function
The OPENXML function provides a rowset over in-
memory XML documents, which is similar to a table
or a view. OPENXML enables access to the XML
data as though it is a relational rowset. It does this
by providing a rowset view of the internal
representation of an XML document.
The parameters that are passed to OPENXML are the XML document handle; a rowpattern, which is an
XPath expression that maps the nodes of XML data to rows; and an indication of whether to use attributes
rather than elements by default. Associated with the OPENXML clause is a WITH clause that provides a
mapping between the rowset columns and the XML nodes.
The ColPattern that is shown is an optional, generic XPath pattern that describes how the XML nodes
should be mapped to the columns. If ColPattern is not specified, the default mapping (attribute-centric
or element-centric mapping as specified by flags) occurs.
In the example on the slide, note how the alias o has been assigned to the urn:AW_OrderNS XML
namespace. That alias is then used throughout the document when an element that is defined in that
namespace is used.
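Putting the pieces together, the full pattern can be sketched as follows; the invoice XML and the column list are hypothetical, and no namespace is used in this simplified version:

```sql
DECLARE @doc nvarchar(max) = N'<InvoiceList>
  <Invoice InvoiceNo="1000" CustomerID="17"/>
  <Invoice InvoiceNo="1001" CustomerID="24"/>
</InvoiceList>';
DECLARE @handle int;

-- Parse the XML and obtain a handle to the in-memory node tree.
EXEC sp_xml_preparedocument @handle OUTPUT, @doc;

-- Query the node tree as a rowset. The flags value 1 requests
-- attribute-centric mapping, so each WITH column maps to an attribute.
SELECT InvoiceNo, CustomerID
FROM OPENXML(@handle, '/InvoiceList/Invoice', 1)
WITH (InvoiceNo int, CustomerID int);

-- Free the node tree as soon as it is no longer needed.
EXEC sp_xml_removedocument @handle;
```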
nodes() Method
The nodes() method provides a much easier way to
shred XML into relational data than OPENXML and
its associated system stored procedures.
The result of the nodes() method is a rowset that contains logical copies of the original XML instances. In
these logical copies, the context node of every row instance is set to one of the nodes that is identified
with the query expression. This enables subsequent queries to navigate relative to these context nodes.
It is important to be careful about the query plans that are generated when you use the nodes() method.
In particular, no cardinality estimates are available when you use this method. This has the potential to
lead to poor query plans. In some cases, the cardinality is simply estimated to be a fixed value of 10,000
rows. This might cause an inappropriate query plan to be generated if your XML document contained
only a handful of nodes.
APPLY operations cause table-valued functions to be called for each row in the left table of the query.
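The query being described can be sketched as shown below; the four value() columns are assumptions based on the standard EVENT_INSTANCE schema produced by the EVENTDATA() function:

```sql
SELECT EventDetail.value('(EventType/text())[1]', 'nvarchar(100)') AS EventType,
       EventDetail.value('(PostTime/text())[1]', 'nvarchar(30)') AS PostTime,
       EventDetail.value('(SPID/text())[1]', 'int') AS SPID,
       EventDetail.value('(LoginName/text())[1]', 'nvarchar(128)') AS LoginName
FROM dbo.DatabaseLog
CROSS APPLY XmlEvent.nodes('/EVENT_INSTANCE') AS EventInfo(EventDetail);
```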
In this query, for every row in the dbo.DatabaseLog table, the nodes() method is called on the XmlEvent
column from the dbo.DatabaseLog table. When table-valued functions are used in queries like this, you
must provide an alias for both the derived table and the columns that it contains. In this case, the alias
that is provided to the derived table is EventInfo and the alias that is provided to the extracted column is
EventDetail.
One output row is being returned for each node at the level of the XPath expression /EVENT_INSTANCE.
From the returned XML column (EventDetail), a series of columns is generated by calling the value()
method. Note that it is called four times for each output row in this example. Also note that the path to
the value to be returned and the data type of that value are being specified along with output column
aliases.
Demonstration Steps
Shred XML data by using the nodes() method
1. Ensure that the 20464C-MIA-DC and 20464C-MIA-SQL virtual machines are running and then log on
to 20464C-MIA-SQL as AdventureWorks\Student with the password Pa$$w0rd.
4. In the Connect to Server window, in Server name, type MIA-SQL and then click Connect.
5. On the File menu, click Open, click Project/Solution, navigate to
D:\Demofiles\Mod13\Demo13.ssmssln, and then click Open.
7. Follow the instructions contained within the comments of the script file.
You also have an upcoming project that will require the use of XML data within SQL Server. No members
of your current team have experience working with XML data in SQL Server. You need to learn how to
process XML data within SQL Server and you have been provided with some sample queries to assist with
this learning.
Objectives
After completing this lab, you will be able to:
Use Cases
2. In the D:\Labfiles\Lab13\Starter folder, right-click Setup.cmd, and then click Run as administrator.
3. When you are prompted, click Yes to confirm that you want to run the command file, and then wait
for the script to finish.
Results: After this exercise, you will have seen how to analyze requirements and determine appropriate
use cases for XML storage.
Task 1: Review, Execute, and Review the Results of the XML Queries
1. On the taskbar, click SQL Server 2014 Management Studio. In the Connect to Server window,
ensure that Server name is MIA-SQL, and then click Connect.
2. On the File menu, click Open, click File, navigate to D:\Labfiles\Lab13\Starter, and then select
InvestigateStorage.sql and click Open.
3. Review the queries, execute the queries, and determine how the output results relate to the queries.
Results: After this exercise, you will have seen how XML data is stored in variables.
2. Review the queries, execute the queries, and note the output. Do this one query at a time for scripts
13.10 and 13.11.
Results: After this exercise, you will have seen how to create XML schema collections.
2. Review the query, execute the query, and review the results for scripts 13.21 to 13.29.
Results: After this exercise, you will have executed queries that return SQL Server relational data as XML.
Supporting Documentation
Output Order: Rows within the XML should be in order of SellStartDate ascending and
then ProductName ascending. That is, sort by SellStartDate first and then
ProductName within SellStartDate.
EXEC Production.GetAvailableModelsAsXML;
Results: After this exercise, you will have created and tested the required stored procedure that returns
XML.
Question: You could pass XML data to a stored procedure by using either the xml data type
or the nvarchar data type. What advantage does the xml data type provide over the
nvarchar data type for this purpose?
Question: Which XML query mode did you use for implementing the
WebStock.GetAvailableModelsAsXML stored procedure?
Review Question(s)
Question: What is XML?
Module 14
Working with Spatial Data in SQL Server
Contents:
Module Overview 14-1
Module Overview
Business applications routinely deal with addresses and locations, yet they rarely provide effective ways to
process distances and proximity. Spatial data in Microsoft® SQL Server® data management software
enables the effective storage and processing of locations, addresses, and shapes. This capability can help
business applications make better decisions and you can also use it to help visualize results, which often
makes results easier to interpret.
Objectives
After completing this module, you will be able to:
Describe the importance of spatial data and the industry standards that are related to it.
Explain how to store spatial data in SQL Server.
Lesson 1
Introduction to Spatial Data
Before starting to work with spatial data, it is important to understand where it is typically used in
applications and what types of spatial data there are. Most business applications need to work with
addresses or locations. SQL Server can process both planar and geodetic data. It is important to
understand the difference between these two types of data in addition to how the SQL Server data types
relate to the relevant industry standards and measurement systems.
Lesson Objectives
After completing this lesson, you will be able to:
Target Applications
There is a perception that spatial data is not useful
in mainstream applications. However, this
perception is invalid: almost every business
application can benefit from the use of spatial data.
Business Applications
Although mapping provides an interesting
visualization in some cases, business applications
can make good use of spatial data for much more
routine tasks. Almost all business applications
involve the storage of addresses or locations.
Customers or clients have street addresses, mailing
addresses, and delivery addresses. The same is true
for stores, offices, suppliers, and many other business-related entities.
It could be true that customers really do purchase from their local store. The owner may have just come
across a small sample of data and been misled by it. Alternatively, it could be true that customers do not
purchase from their local stores. If so, it might be interesting to know what they purchase when they
travel to another store. Perhaps the local store does not hold a wide enough variety of stock. Alternatively,
perhaps the customers purchase everything they need from a more remote store because they do not like
the staff at the local store. A situation might also exist where two stores are “cannibalizing” each other’s
business. The data might also be used to find new locations for stores.
It is important to realize that these sorts of questions are normal business questions, not specialized
mapping questions. This is the sort of problem that you can solve quite easily if you can process spatial
data in a database.
Key Points
In the spatial data community, several types of
spatial data are used. SQL Server works with vector-
based two-dimensional (2-D) data, but has some
storage options for three-dimensional (3-D) values.
Spatial data in SQL Server is currently based on 2-D technology. Some of the objects and properties that
SQL Server provides support the storage and retrieval of 3-D and 4-D values, but it is important to realize
that the third and fourth dimensions are ignored during calculations. This means
that if you calculate the distance between, say, a point and a building, the calculated distance is the same
regardless of which floor or level in the building the point is located.
Question: Which existing SQL Server data type could you use to store (but not directly
process) raster data?
Key Points
Planar systems represent the Earth as a flat surface.
Geodetic systems represent the Earth more like its
actual shape.
Planar Systems
Prior to the advent of computer systems, it was very
difficult to perform calculations on round models of
the Earth. For convenience, mapping tended to be
two-dimensional in nature. Most people are familiar
with traditional flat maps of the world.
However, as soon as larger distances are involved, flat maps provide a significant distortion, particularly as
you move from the center of the map. When most of the standard maps from atlases were first drawn,
they were oriented around where the people who were drawing the maps lived. That meant that the least
distortion occurred where the people who were using the maps were based.
As an example, in the flat map that is shown on the slide, it is not obvious how Africa's area (about 30
million square kilometers) compares to North America's area (about 24 million square kilometers). Also,
note how large Antarctica appears on the map, even though it is really only about 13 million square
kilometers in size.
Geodetic Systems
Geodetic systems represent the Earth as a round shape. Some systems use simple spheres, but it is
important to realize that the Earth is not actually spherical.
Spatial data in SQL Server offers several systems for representing the shape of the Earth. Most systems
model the Earth as an ellipsoid rather than as a sphere.
Key Points
The Open Geospatial Consortium (OGC) is the
industry body that provides specifications for how
processing of spatial data should occur in systems
that are based on Structured Query Language
(SQL).
SQL Specification
One of the two data types that SQL Server provides
is the geometry data type. It conforms to the OGC
Simple Features for SQL Specification version 1.1.0
and is used for planar spatial data. In addition to defining how to store the data, the specification details
common properties and methods to be applied to the data.
The OGC defines a series of data types that form an object tree. In the chart that is shown on the slide, the
objects that are supported and can be created in spatial data in SQL Server are shown in blue (or the
darker color). Other objects in the OGC Geometry hierarchy are shown in yellow (or the lighter color).
Curved arc support was added in SQL Server 2012.
Extensions
SQL Server also extends the standards in several ways. SQL Server provides a round-earth data type that is
called geography, along with several additional useful properties and methods.
Methods and properties that are related to the OGC standard have been defined by using an ST prefix
(such as STDistance). Those without an ST prefix are Microsoft extensions to the standard (such as
MakeValid).
Key Points
There have been many systems of measurement
over time. SQL Server supports many of these
measurement systems directly. When you specify a
spatial data type in SQL Server, you also specify the
measurement system to be used. You specify this by
associating a spatial reference ID with the data. A
spatial reference ID of zero indicates the lack of a
measurement system. This is commonly used where
there is no need for a specific measurement system.
SRID 4326
The World Geodetic System (WGS) is commonly used in cartography, geodetics, and navigation. The latest
standard is WGS 1984 (WGS 84) and is best known to most people through the Global Positioning System
(GPS). GPS is often used in navigation systems and uses WGS 84 as its coordinate system.
In spatial data in SQL Server, SRID 4326 provides support for WGS 84.
If you query the list of SRIDs in SQL Server, the entry for SRID 4326 has the name WGS 84 and the
following definition, which is formally called the Well-Known Text (WKT) that is associated with the ID:
GEOGCS["WGS 84", DATUM["World Geodetic System 1984", ELLIPSOID["WGS 84", 6378137,
298.257223563]], PRIMEM["Greenwich", 0], UNIT["Degree", 0.0174532925199433]]
This specifies how WGS 84 models the Earth as an ellipsoid (you can imagine it as a slightly squashed sphere),
with its major radius of 6,378,137 meters at the equator, a flattening of 1 / 298.257223563 (or about 21
kilometers) at the poles, a prime meridian (that is, a starting point for measurement) at Greenwich, and a
measurement that is based on degrees. The starting point at Greenwich is specifically based at the Royal
Observatory. The units are shown as degrees and the size of a degree is specified in the final value in the
definition. Most geographic data today would be represented by SRID 4326.
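You can retrieve this definition yourself from the sys.spatial_reference_systems catalog view. A minimal sketch:

```sql
-- List the WKT definition for SRID 4326 (WGS 84)
SELECT spatial_reference_id, authority_name, well_known_text
FROM sys.spatial_reference_systems
WHERE spatial_reference_id = 4326;
```

Removing the WHERE clause returns every spatial reference system that SQL Server supports.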
Demonstration Steps
View the available spatial reference systems
1. Ensure that the 20464C-MIA-DC and 20464C-MIA-SQL virtual machines are running and then log on
to 20464C-MIA-SQL as AdventureWorks\Student with the password Pa$$w0rd.
4. In the Connect to Server window, in Server name, type MIA-SQL and then click Connect.
5. On the File menu, click Open, click Project/Solution, navigate to
D:\Demofiles\Mod14\Demo14\Demo14.ssmssln, and then click Open.
8. Follow the instructions contained within the comments of the script file.
Question: Do you currently use GPS data in any existing applications within your
organization?
Lesson 2
Working with Spatial Data Types in SQL Server
SQL Server supports two spatial data types, geometry and geography, which have been created as
system common language runtime (CLR) data types. You need to know how to use each of these data
types and how to interchange data by using industry-standard formats.
Lesson Objectives
After completing this lesson, you will be able to:
Explain how system CLR types differ from user CLR types.
Use the geometry data type.
Use Microsoft extensions to the OGC standard when working with spatial data.
Key Points
SQL Server supplies rich support for spatial data. It
provides two data types: the geometry data type,
which is suited to flat-earth (planar) models, and
the geography data type, which is suited to round-
earth (geodetic) models.
Note that although “latitude and longitude” is a commonly used phrase in the general community, the
geographical community uses the terminology in the reverse order. When you are specifying inputs for
geographic data in SQL Server, the longitude value precedes the latitude value.
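For example, the following sketch (the coordinates are arbitrary sample values) creates a geography point; note that the WKT input lists longitude first, while the Lat and Long properties let you read the values back:

```sql
-- Longitude precedes latitude in geography WKT input
DECLARE @pt geography = geography::STGeomFromText('POINT (-122.349 47.651)', 4326);
SELECT @pt.Long AS Longitude, @pt.Lat AS Latitude;
```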
Additional Support
The Microsoft Bing® Maps software development kit (SDK) has been updated to work closely with spatial
data in SQL Server. SQL Server Reporting Services includes a map control that you can use to render
spatial data and a wizard to help to configure the map control. The map control is available for reports
that are built by using Business Intelligence Development Studio and for reports that are built by using
Report Builder.
An application that stores or retrieves spatial data from a database in SQL Server needs to be able to work
with that data as a spatial data type. To make this possible, a separate installer (MSI) file has been
provided as part of the SQL Server 2012 Feature Pack to enable client applications to use the spatial data
types in SQL Server. The installer is called “Microsoft System CLR Types for SQL Server 2012.” By installing
this file on client systems, an application on the client can “rehydrate” a geography object that has been
read from a SQL Server database into a SqlGeography object within .NET managed code.
ST Prefix
For the properties and methods that are implementations of the OGC standards, an ST prefix has been
added to the names of the properties and methods. For example, the X and Y coordinates of a geometry
object are provided by STX and STY properties and the Distance calculation is provided by the
STDistance method.
For Microsoft extensions to the OGC standards, no prefix has been added to the name of the methods or
properties, so, for example, there is a MakeValid method. You must also take care when you refer to
properties and methods because they are case-sensitive, even on servers that are configured for case-
insensitivity.
Question: You may have used a web service to calculate the coordinates of an address. What
is this process commonly called?
SQL Server 2008 introduced the concept of a system CLR data type, which was separate from the user-
defined data types, but also implemented in managed code. In addition, SQL Server 2008 replaced the 8-
KB limit on serialization with a 2-GB limit. This increased limit makes it possible to create complex data
types by using managed code.
In SQL Server, there are three system CLR data types that take advantage of this large data type support:
geometry, geography, and hierarchyid. Unlike user-defined CLR data types, these system data types
operate even when the ‘clr enabled’ setting for the server instance is disabled.
You can see the currently installed assemblies and whether they are user-defined by executing the
following query:
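The query box is not reproduced here, but a query of the following form against the sys.assemblies catalog view shows each installed assembly and whether it is user-defined:

```sql
-- System CLR assemblies have is_user_defined = 0
SELECT name, permission_set_desc, is_user_defined
FROM sys.assemblies;
```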
As an example of this, look at the following code that is accessing the STX property of a variable called
@Location:
Accessing Properties
SELECT @Location.STX;
You can access a method of an instance of a spatial data type by referring to it as Instance.Method().
As an example of this, look at the following code that is calling the MakeValid method of a variable called
@InputLocation:
Accessing Methods
SELECT @Location = @InputLocation.MakeValid();
It is also possible to call methods that are defined on the data types (geometry and geography) rather
than on instances (that is, columns or variables) of those types. This is an important distinction.
As an example of this, look at the following code that is calling the GeomFromText method of the
geometry data type:
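The code box is not reproduced here; the OGC-derived static method is exposed in SQL Server as STGeomFromText, so the call was probably of this form (the variable name is illustrative):

```sql
-- Calling a static method on the geometry type itself, not on an instance
DECLARE @shape geometry = geometry::STGeomFromText('POINT (12 15)', 0);
SELECT @shape.STAsText();
```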
Note that you are not calling the method on a column or variable of the geometry data type, but on the
geometry data type itself. In .NET terminology, this would be referred to as calling a public static method
on the geometry class. Note also that the methods and properties of the spatial data types are case-
sensitive, even on servers that are configured with case-insensitive default collations.
Key Points
The geometry data type is used for flat-earth (that
is, planar) data storage and calculations. It provides
comprehensive coverage of the OGC standard.
SQL Server enables you to store Z (elevation) and M (measure) values in the geometry data type, but it ignores these values when it performs calculations.
You can see the input and output of X, Y, Z, and M in the following code:
geometry Results
POINT (12 15)
POINT (12 15 2 9)
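The code that produced these results is not reproduced here, but a sketch that matches them follows; STAsText returns only the X and Y coordinates, whereas the AsTextZM extension also returns the Z and M values:

```sql
DECLARE @g geometry = geometry::STGeomFromText('POINT (12 15 2 9)', 0);
SELECT @g.STAsText();   -- POINT (12 15)
SELECT @g.AsTextZM();   -- POINT (12 15 2 9)
SELECT @g.STX AS X, @g.STY AS Y, @g.Z AS Z, @g.M AS M;
```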
The SQL Server geometry data type provides comprehensive coverage of the OGC Geometry data type.
The X and Y coordinates are represented by STX and STY properties.
Key Points
The geography data type is used for round-earth
values, typically involving actual positions or
locations on the Earth. It is an extension to the OGC
standard.
In versions of SQL Server earlier than SQL Server 2012, a single geography instance could not span more
than a single hemisphere. This restriction did not refer to predefined hemispheres such
as the northern or southern hemispheres, but just that no two points could be more than half the Earth
apart if they were contained in the same instance of the geography data type. This limitation was
removed in SQL Server 2012.
To enclose points, they are listed in counterclockwise order. As you draw a shape, all of the points to the
left of the line that you draw will be enclosed by the shape. The points on the line are also included.
If you draw a postal code region in a clockwise direction, you are defining all points outside the region. In
earlier versions of SQL Server, because results were not permitted to span more than a single hemisphere,
an error would have been returned. This restriction was removed in SQL Server 2012.
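As a sketch of ring orientation (the coordinates are arbitrary sample values), the same four points define either a small region or everything outside it, depending on the order in which they are listed:

```sql
-- Counterclockwise ring order: encloses the small square
DECLARE @inside geography = geography::STGeomFromText(
    'POLYGON ((0 0, 1 0, 1 1, 0 1, 0 0))', 4326);
-- Clockwise ring order: encloses everything outside the square
DECLARE @outside geography = geography::STGeomFromText(
    'POLYGON ((0 0, 0 1, 1 1, 1 0, 0 0))', 4326);
-- In SQL Server 2012 and later, both succeed; the areas differ enormously
SELECT @inside.STArea() AS SmallArea, @outside.STArea() AS HugeArea;
```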
The spatial results viewer in SQL Server Management Studio is limited to displaying the first 5,000 objects
from the result set. For geography data, the viewer is quite configurable. You can set which column to
display and the geographic projection to use for display (for example, Mercator, Bonne, and so on), and
you can choose to display another column as a label over the relevant displayed region.
Key Points
The internal binary format of any CLR data type is
not directly used for input and output of the data
type in most cases. You need to accommodate
string-based representations of the data.
CLR data types (including the geometry and
geography system CLR data types) are stored in a
binary format that the designer of the data type
determines. Although it is possible to both enter
values and generate output for instances of the
data type by using a binary string, this is not
typically very helpful because you would need to have a detailed understanding of the internal binary
format.
Well-Known Text (WKT). This is the most common string format and is quite human-readable.
Well-Known Binary (WKB). This is a more compact binary representation that is useful for
interchange between computers.
Geography Markup Language (GML). This is the XML-based representation for spatial data.
All CLR data types must implement two string-related methods. The Parse method is used to convert a
string to the data type and the ToString method is used to convert the data type back to a string. Both of
these methods are implemented in the spatial types and both assume a WKT format.
Several variations of these methods are used for input and output. For example, the STAsText method
provides a specific WKT format as output and the AsTextZM method is a Microsoft extension that
provides the Z and M values in addition to the two-dimensional coordinates.
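The following sketch emits one value in each of the three formats; STAsBinary returns the WKB representation and the AsGml extension returns GML:

```sql
DECLARE @g geometry = geometry::Parse('POINT (12 15)');  -- Parse assumes WKT input
SELECT @g.ToString()   AS WKT,
       @g.STAsBinary() AS WKB,
       @g.AsGml()      AS GML;
```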
Question: Why is there a need to represent spatial data types as strings within SQL Server?
Key Points
A wide variety of OGC methods and properties has
been provided in spatial data in SQL Server, along
with a number of OGC-defined collections. Several
of the common methods and properties are
described here, but many more exist.
Common Methods
Common OGC methods include:
The STDistance method, which returns the distance between two spatial objects. Note that this does
not only apply to points. It is also possible to calculate the distance between two polygons. The result
is returned as the minimum distance between any two points on the polygons.
The STIntersects method, which returns 1 when two objects intersect and returns 0 otherwise.
The STArea method, which returns the total surface area of a geometry instance.
The STLength method, which returns the total length of the objects in a geometry instance. For
example, for a polygon, STLength returns the total length of all line segments that make up the
polygon.
The STUnion method, which returns a new object that is formed by uniting all points from two
objects.
The STBuffer method, which returns an object whose points are within a certain distance of an
instance of a geometry object.
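A few of these methods are sketched below with simple planar values:

```sql
DECLARE @p1 geometry = geometry::STGeomFromText('POINT (0 0)', 0);
DECLARE @p2 geometry = geometry::STGeomFromText('POINT (3 4)', 0);
SELECT @p1.STDistance(@p2) AS Distance;                  -- 5
SELECT @p1.STBuffer(2).STArea() AS BufferArea;           -- approximately pi * 2^2
SELECT @p1.STBuffer(2).STIntersects(@p2) AS Intersects;  -- 0 (point is outside the buffer)
```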
Microsoft Extensions
Key Points
In addition to the OGC properties and methods,
Microsoft has provided several useful extensions to
the standards. Several of these extensions are
described in this topic, but many more exist.
Common Extensions
Although the coverage that the OGC specifications
provide is good, Microsoft has enhanced the data
types by adding properties and methods that
extend the standards. Note that the extended
methods and properties do not have the ST prefix.
The MakeValid method takes an arbitrary shape and returns another shape that is valid for storage in a
geometry data type. SQL Server produces only valid geometry instances, but enables you to store and
retrieve invalid instances. You can retrieve a valid instance that represents the same point set of any
invalid instance by using the MakeValid method.
You can use the Reduce method to reduce the complexity of an object while attempting to maintain the
overall shape of the object.
The IsNull method returns 1 if an instance of a spatial type is NULL; otherwise it returns 0.
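A common illustration of MakeValid is a "bowtie" ring that crosses itself, which is not a valid polygon; a sketch:

```sql
-- A self-crossing ring can be stored but is reported as invalid
DECLARE @bowtie geometry = geometry::STGeomFromText(
    'POLYGON ((0 0, 2 2, 2 0, 0 2, 0 0))', 0);
SELECT @bowtie.STIsValid();               -- 0
SELECT @bowtie.MakeValid().STIsValid();   -- 1
SELECT @bowtie.MakeValid().ToString();    -- typically a MULTIPOLYGON covering the same point set
```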
GML
<Point xmlns="http://www.opengis.net/gml">
<pos>12 15</pos>
</Point>
GML is excellent for information interchange, but you can see that the representation of objects in XML
can quickly become very large.
The BufferWithTolerance method returns a buffer around an object, but uses a tolerance value to allow
for minor rounding errors.
Demonstration Steps
Work with spatial data types in SQL Server
1. Ensure that the 20464C-MIA-DC and 20464C-MIA-SQL virtual machines are running and then log on
to 20464C-MIA-SQL as ADVENTUREWORKS\Student with the password Pa$$w0rd.
4. In the Connect to Server window, in Server name, type MIA-SQL and then click Connect.
8. Follow the instructions contained within the comments of the script file.
Lesson 3
Using Spatial Data in Applications
After you have learned how spatial data is stored and accessed in SQL Server, you need to understand the
implementation issues that need to be addressed when you are building applications that use spatial data
in SQL Server. In particular, you can create spatial indexes to improve performance, so you need to
understand how spatial indexes work and which methods can take advantage of them in order to build
applications that perform well.
Lesson Objectives
After completing this lesson, you will be able to:
Explain the basic tessellation process that is used within spatial indexes in SQL Server.
Key Points
Spatial queries can often involve a large number of data points. Executing methods such as
STIntersects against a large number of points is slow because complex geometric calculations are
involved. Spatial indexes help to avoid unnecessary calculations.
Tessellation Process
Key Points
Spatial indexes help to avoid unnecessary
calculations by breaking down larger problems into
problems that need to be solved and problems that
do not.
Tessellation
In the example from the discussion where you were
considering how to find streets that intersect your
suburb or region, the biggest problem is that
checking every street in the state or, worse, in the
country, would take a very long time. The irony of this is that almost all of these calculations would return
an outcome that showed no intersection.
To avoid making unnecessary calculations, SQL Server breaks the problem-space into relevant areas by
using a four-level grid. Each grid level consists of several cells.
The basic idea is that if your suburb is located within a region of cells, any streets that do not extend into
those cells do not need to be checked. You can use grid levels to quickly isolate large areas that do not
need to be checked.
For example, if you are checking for streets that are part of Vienna, you do not need to check for streets
that are contained entirely within Paris. Moreover, you do not need to check any street that is contained
entirely within France. You can quickly eliminate those as not being of interest to you.
Spatial Indexes
Key Points
Spatial indexes are unlike standard relational
indexes. Instead of locating specific rows to be
returned, queries that use spatial indexes operate in
a two-phase manner. In the first phase, possible
candidates are found. In the second phase, the
returned list of candidates is individually checked.
Spatial Indexes
When you traverse a clustered or nonclustered
index on a SQL Server table, you apply the
predicates in the WHERE clause to filter the specific rows. After you have applied the predicate, you are
left with precisely the rows that you require. Spatial indexes work in a different way. Instead of precisely
locating the specific rows, you can use spatial indexes to locate rows that could potentially be of interest.
You saw in the last topic how you can apply tessellation to minimize the number of calculations that need
to be performed. Spatial indexes use this tessellation process to quickly reduce the overall number of rows
to a list of candidate rows that might potentially be of interest. In the street-based example that was
mentioned previously, if Vienna was contained inside a grid cell and a street entered that cell, you still do
not know if the street actually intersects the boundaries of Vienna. However, you know that you need to
check whether it does, because it is possible that it might.
To enable a check on the effectiveness of the primary filter, SQL Server provides a Filter method that only
applies to the primary filter. You can then compare the number of rows that the Filter method returns to
the total number of rows to see how effective the spatial index has been. This will be shown in the next
demonstration.
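Assuming hypothetical dbo.Streets and dbo.Suburbs tables with a spatial index on the Shape column, the comparison might look like this:

```sql
DECLARE @suburb geometry = (SELECT Boundary FROM dbo.Suburbs WHERE Name = 'Vienna');

-- Primary filter only: fast, but may include false positives
SELECT COUNT(*) FROM dbo.Streets WHERE Shape.Filter(@suburb) = 1;

-- Primary filter plus the exact secondary check
SELECT COUNT(*) FROM dbo.Streets WHERE Shape.STIntersects(@suburb) = 1;
```

Comparing the two counts shows how many candidate rows the primary filter passed on for exact checking.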
Key Points
Spatial indexes are created by using the CREATE
SPATIAL INDEX statement. Indexes on the
geometry data type should specify a
BOUNDING_BOX setting.
Index Bounds
Unlike more traditional types of index, a spatial index is most useful when it knows the overall area that
the spatial data covers. Spatial indexes that are created on the geography data type do not need to
specify a bounding box because the data type is naturally limited by the Earth itself.
Spatial indexes on the geometry data type specify a BOUNDING_BOX setting. This provides the
coordinates of a rectangle that would contain all possible points or shapes of interest to the index. The
geometry data type has no natural boundaries, so specifying a bounding box enables SQL Server to
produce a more useful index. If values arise outside the bounding box coordinates, the primary filter
would need to return the rows in which they are contained.
Grid Density
SQL Server also enables you to specify grid densities when you are creating spatial indexes. You can
specify a value for the number of cells in each grid for each grid level in the index:
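A sketch of the statement, assuming a hypothetical dbo.Streets table with a planar Shape column:

```sql
CREATE SPATIAL INDEX SIX_Streets_Shape
ON dbo.Streets (Shape)
USING GEOMETRY_GRID
WITH (
    -- Rectangle containing all shapes of interest (geometry only)
    BOUNDING_BOX = (XMIN = 0, YMIN = 0, XMAX = 500, YMAX = 200),
    -- Density for each of the four grid levels
    GRIDS = (LEVEL_1 = MEDIUM, LEVEL_2 = MEDIUM,
             LEVEL_3 = HIGH, LEVEL_4 = HIGH),
    CELLS_PER_OBJECT = 16
);
```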
Spatial indexes are also different from other types of index because it might make sense to create multiple
spatial indexes on the same table and column. Indexes that have one set of grid densities might be more
useful than a similar index that has a different set of grid densities for locating data in a specific query.
To make spatial indexes easier to configure, SQL Server 2012 introduced automatic grid density and level
selections: GEOMETRY_AUTO_GRID and GEOGRAPHY_AUTO_GRID. The automated grid configuration
defaults to an eight-level grid.
Limitations
Spatial indexes do not support the use of ONLINE build operations, which are available for other types of
index in SQL Server Enterprise.
Key Points
Not all geometry methods and not all predicate
forms can benefit from the presence of spatial
indexes. The table on the slide shows the specific
predicates that can potentially make use of a spatial
index as a primary filter. If the predicate in your
query is not in one of these forms, spatial indexes
that you create will be ignored.
Key Points
In a similar way to the geometry data type, not all
geography methods and not all predicate forms can
benefit from the presence of spatial indexes. The
table on the slide shows the specific predicates that
can potentially make use of a spatial index as a
primary filter. Unless the predicate in your query is
in one of these forms, spatial indexes that you
create will be ignored.
Key Points
An active community that contributes user-created
extensions to spatial data in SQL Server exists on
the CodePlex site. The functions, types, and
aggregates that were present in the
sqlspatial.codeplex.com project at the time of
writing this module are listed on the slide. As the
project continues to evolve, the capabilities that are
provided in the project will change. You may be
able to use some of these extensions directly. They
may also be useful as starting points when you
create your own extensions to the spatial data in SQL Server. In addition, note that several additional
built-in aggregates were added to the spatial data in SQL Server in SQL Server 2012.
Demonstration Steps
Use spatial data in SQL Server to solve some business questions
1. Ensure that the 20464C-MIA-DC and 20464C-MIA-SQL virtual machines are running and then log on
to 20464C-MIA-SQL as ADVENTUREWORKS\Student with the password Pa$$w0rd.
4. In the Connect to Server window, in Server name, type MIA-SQL and then click Connect.
8. Follow the instructions contained within the comments of the script file.
Supporting Documentation
Stored Procedure Specifications
Objectives
After completing this lab, you will have:
Password: Pa$$w0rd
2. Review the query, execute the query, and then review the results for scripts 19.1 to 19.9.
Results: After this exercise, you should have seen how to work with the geometry data type.
Results: After this exercise, you should have replaced the existing Longitude and Latitude columns with
a new Location column.
Question: Where would you imagine you might use spatial data in your own business
applications?
Review Question(s)
Question: What is the main difference between the geometry and geography data types?
Question: When you are defining a polygon, why does it matter how you specify the order
of the points?
Module 15
Incorporating Data Files into Databases
Contents:
Module Overview 15-1
Lesson 1: Considerations for Working with Data Files in SQL Server 2014 15-2
Module Overview
Organizations store and manage data files in a wide range of formats. Very often, this data is stored on
the file system of the server operating system, but organizations are increasingly choosing to integrate
data files into their relational databases because of the benefits this can bring.
This module provides an overview of the options for storing data in a database in Microsoft® SQL
Server® 2014 data management software, and the benefits of each storage type. It also explains the key
factors to consider when you are planning to incorporate data files into your databases, and describes
how you can use full-text indexes and semantic search functionality to search documents in various ways.
Objectives
After completing this module, you will be able to:
Describe the options for storing data files in SQL Server 2014 and plan an appropriate storage
solution for a given scenario.
Explain how to implement FILESTREAM and FileTables.
Describe the benefits of full-text indexing and semantic search functionality, and explain how to use
these features to search data in data files.
Lesson 1
Considerations for Working with Data Files in SQL Server
2014
SQL Server 2014 provides various ways to incorporate data files into a database. This lesson describes the
benefits and challenges of storing data files in databases, reviews the options for storing data files in a
SQL Server 2014 database, explains the considerations for planning to store data files, and explains how to
choose the appropriate storage solution for a given scenario.
Lesson Objectives
After completing this lesson, you will be able to:
Describe the options for storing data files in a SQL Server 2014 database.
Choose an appropriate storage option for data files for a given scenario.
Maintaining referential and transactional integrity. Some data might relate directly to data that is
stored in a database, so the organization might want to make this relationship explicit. For example, a
company might hold a résumé of each employee as a data file and want to explicitly relate this file to
the rest of the information about the employee that is held in a relational database.
Enabling developers to create applications that can easily and efficiently make use of data files.
Centralized storage for relational and nonrelational data, which can streamline management and
maintenance tasks such as performing backups, and helps to reduce the effort that is required to
manage the data.
Access to both data files and relational data can be controlled through database security, which
simplifies security and reduces the possibility of errors in security configuration.
The ability to perform better and more efficient searches by using the built-in indexing features in the
database management system.
The relational database engine can maintain referential and transactional consistency between data
files and relational data.
Reduced complexity for developers who write applications that use data files.
It can offer better read performance than simply storing data in the database by using a data type
such as image or varbinary(max); however, this depends on the size of each file and to some extent,
on the type of file that you are storing. For example, for files that are larger than 1 MB in size, or for
streaming video files, the file system will typically deliver better read performance.
The file system will typically handle fragmentation better than SQL Server for files that are frequently
modified.
There is no single point of management for the data in the database and the data in the file system,
so you will need to plan separate maintenance and backup schedules.
Maintaining two separate stores adds an extra layer of complexity for developers who write
applications that use the data.
You cannot take advantage of integrated services such as Full-Text Search and Statistical Semantic
Search to query textual data.
Earlier versions of SQL Server included the image data type for storing BLOBs. This data type adds a
pointer to each row that points to the location of the data in the BLOB data files. The image data type is
still available in SQL Server 2014, but because it is now deprecated, you should not use it in any new
databases.
To store BLOB data in a SQL Server 2014 database, you can use the varbinary(max) large-value data
type. The varbinary(max) data type enables you to store binary data up to 2,147,483,647 bytes
(approximately 2 GB) in size. When you designate a column as varbinary(max), SQL Server stores data
that is up to 8,000 bytes in size in the same data page as the rest of the row. For data that is larger than
8,000 bytes in size, SQL Server stores the data in separate data pages and includes a pointer to the pages
in the row. This flexible approach to storage can help to improve read performance when BLOBs are
smaller than 8,000 bytes, because SQL Server can read the data directly from the table’s data pages, so it
does not have to incur extra page reads by accessing the dedicated BLOB data pages.
Note: You can use the large value types out of row table option of the sp_tableoption
stored procedure to change the way that SQL Server stores varbinary(max) data. A value of 0
forces SQL Server to store BLOBs that are smaller than 8,000 bytes in the appropriate data row
and to store BLOBs that are larger than 8,000 bytes in the BLOB storage pages. A value of 1
forces SQL Server to always store the data in varbinary(max) columns in the BLOB storage
pages, regardless of the size of the BLOB. By default, the large value types out of row table
option is set to 0.
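For example, a call of the following form (the table name dbo.Documents is hypothetical) forces SQL Server to always store varbinary(max) data for that table in the BLOB storage pages:

```sql
-- Store all varbinary(max) values for this table out of row,
-- regardless of size.
EXEC sp_tableoption 'dbo.Documents', 'large value types out of row', 1;
```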
The advantages of storing data files by using the varbinary(max) data type include:
SQL Server maintains transactional and relational integrity for BLOB data.
Reduced complexity for developers who write applications that use the data.
The ability to take advantage of integrated services such as Full-Text Search and Semantic Search to
query textual data.
The disadvantages of storing data files by using the varbinary(max) data type include:
You can only access BLOB data that is stored as varbinary(max) programmatically; you cannot access
the BLOB data directly through the file system. For example, if you store Microsoft Word documents
by using varbinary(max), you cannot open these documents directly by using Word.
Read performance is directly affected by the number of pages that are required to store each BLOB;
more page reads leads to slower response times, so performance can be degraded for large BLOBs
that require many data pages.
The advantages of storing data files by using FILESTREAM include:
Improved performance over storing BLOBs in SQL Server data pages when BLOBs are larger than
1 MB on average.
SQL Server maintains transactional and relational integrity for BLOB data.
Reduced complexity for developers who write applications that use the data.
The ability to take advantage of integrated services such as Full-Text Search and Semantic Search to
query textual data.
The disadvantages of storing data files by using FILESTREAM include:
You can only access BLOB data that is stored in FILESTREAM columns programmatically; you cannot
access the BLOB data directly through the file system.
You must store FILESTREAM BLOBs on the same server where the database data files are located or
on a storage area network (SAN). You cannot use a remote location such as a shared folder on the
network for BLOB storage.
A FileTable is a SQL Server table that has a predefined schema. The columns in a FileTable include a
varbinary(max) column that has the FILESTREAM attribute enabled, and a series of metadata columns
that store information including the file size, file creation time, and the last write time. FileTable files are
part of a hierarchy that includes a database-level directory and a separate directory for each FileTable in
the database. Each row in a FileTable represents either a file or a directory in the FileTable shared
directory. Each FileTable has two columns, which are called path_locator and parent_path_locator. These
columns use the hierarchyid data type to keep track of the place of each file and folder in the FileTable
folder hierarchy.
To use FileTables, you must create or alter a database to set the NON_TRANSACTED_ACCESS option
on a database that has a filegroup that is configured for FILESTREAM. Nontransactional access enables
access to BLOB files through the file system, but enabling it means that you cannot restore BLOB data to a
specific point in time. You can configure NON_TRANSACTED_ACCESS by using the following values:
FULL. When you set NON_TRANSACTED_ACCESS to FULL, you can read and write files and folders
by using the FILESTREAM shared directory. For example, you can drag a new file to the directory, and
this file is stored in the FileTable.
READONLY. When you set NON_TRANSACTED_ACCESS to READONLY, you can read files in the
FILESTREAM shared directory, but you cannot modify them or save new files to the directory.
OFF. When you set NON_TRANSACTED_ACCESS to OFF, you cannot access BLOB files in the
FILESTREAM shared directory. However, you can still access the files programmatically.
Reference Links: For more information about RBS, see the Remote Blob Store (RBS) (SQL
Server) topic in SQL Server Books Online.
Ease of manageability. For example, will BLOB data be included in database backups, or will you
have to schedule separate backups?
Ease of configuring and maintaining security. The simpler the security model, the less prone it will
be to configuration errors.
The need to maintain nontransactional file access through the file system. Do users need to
access the files that you want to store by using programs such as Word?
Ease of development effort for applications that use the data. It will be less complex for
developers if all of the data is in a single store.
The maximum size of the data files. When you use varbinary(max), the largest file size that you
can accommodate is 2 GB. Other storage mechanisms do not have this limitation.
The need to maintain transactional integrity for data files and relational data. Transactional
integrity is required for point-in-time restores, but it is not possible to provide transactional integrity
when you use a FileTable that has full nontransactional access configured.
The need to perform full-text and semantic searches on data files. To use full-text indexes and
semantic searches, the data needs to be stored in SQL Server.
The location of the directory that will host the files. Storage can be either local or on the network,
but not all of the available solutions will support network storage.
The following matrix compares the various storage options against the factors that are described in the list
above.
Performance: file system storage is good (uses the file system); varbinary(max) is good when files are
smaller than 1 MB on average; FILESTREAM is good (uses file system streaming); FileTable is good (uses
file system streaming).
Note: The matrix does not include RBS because factors such as manageability and
performance depend on the RBS solution that you use.
Lesson 2
Implementing FILESTREAM and FileTables
FILESTREAM and FileTables enable you to integrate the storage of data files on the file system with data
that is stored in your relational database, while maintaining good levels of performance and fast response
times. This lesson describes the benefits that these technologies offer, and explains the considerations for
implementing and working with FILESTREAM and FileTables. The lesson also demonstrates how to enable
FILESTREAM and implement a FileTable.
Lesson Objectives
After completing this lesson, you will be able to:
Describe the options for accessing FILESTREAM data and FileTable data.
You can set the filestream access level option to one of the following values:
o 0. This disables FILESTREAM support for the instance.
o 1. This enables FILESTREAM for Transact-SQL access only.
o 2. This enables FILESTREAM for Transact-SQL access and Win32 streaming access.
After you have enabled FILESTREAM, you should restart the SQL Server service.
The following code example shows how to use sp_configure to configure the FILESTREAM access level for
a SQL Server instance.
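A sketch of such a statement, here setting the access level to 2 (Transact-SQL and Win32 streaming access):

```sql
EXEC sp_configure 'filestream access level', 2;
RECONFIGURE;
GO
```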
The following code example creates a database that contains a dedicated FILESTREAM filegroup.
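A sketch of such a statement follows; the database name and file paths are illustrative. Note that for a FILESTREAM filegroup, FILENAME specifies a folder; the parent folder must exist, but the folder itself must not.

```sql
CREATE DATABASE Archive
ON PRIMARY
    (NAME = Archive_data, FILENAME = 'D:\Data\Archive_data.mdf'),
FILEGROUP FileStreamGroup CONTAINS FILESTREAM
    (NAME = Archive_fs, FILENAME = 'D:\Data\ArchiveFilestream')
LOG ON
    (NAME = Archive_log, FILENAME = 'D:\Data\Archive_log.ldf');
GO
```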
The following code example creates a table that has a FILESTREAM column.
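A sketch of such a statement follows; the table and column names are illustrative. A table that contains a FILESTREAM column must also include a uniqueidentifier ROWGUIDCOL column that has a unique constraint.

```sql
CREATE TABLE dbo.Documents
(
    DocumentID uniqueidentifier ROWGUIDCOL NOT NULL UNIQUE DEFAULT NEWID(),
    DocumentName nvarchar(100) NOT NULL,
    DocumentFile varbinary(max) FILESTREAM NULL
) FILESTREAM_ON FileStreamGroup;  -- the FILESTREAM filegroup name is illustrative
GO
```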
Note: You cannot enable the FILESTREAM attribute on an existing column. To convert an
existing varbinary(max) column to FILESTREAM, you should first create a new FILESTREAM-
enabled column, and then copy the data from the existing column into the new column.
You should place the filegroups that contain FILESTREAM data on separate volumes from the
operating system files, paging files, SQL Server database and log files, and SQL Server tempdb.
FILESTREAM data is included in database backups and restores, so you do not need to maintain a
separate backup for data files.
For BLOBs that are smaller than 1 MB in size, FILESTREAM may not perform as well as storing data as
varbinary(max) without the FILESTREAM attribute enabled.
When you enable transparent database encryption for a database, the data in the FILESTREAM
column is not encrypted.
Reference Links: For more information about the best practices for using FILESTREAM, see
the FILESTREAM Best Practices topic in SQL Server Books Online.
The following code example configures the HumanResources database for FileTables.
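A sketch of such a statement, with an illustrative directory name:

```sql
ALTER DATABASE HumanResources
SET FILESTREAM (NON_TRANSACTED_ACCESS = FULL,
                DIRECTORY_NAME = N'HumanResources_Files');
GO
```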
Configuring a FileTable
To create a FileTable, you use the AS FILETABLE option with a CREATE TABLE statement and specify a
name for the FileTable directory that will contain the data for the FileTable. This directory will be a
subdirectory of the folder that you created by setting the DIRECTORY_NAME option.
Creating a FileTable
USE Archive;
GO
CREATE TABLE Images AS FILETABLE
WITH
(FILETABLE_DIRECTORY = 'Images');
GO
When the NON_TRANSACTED_ACCESS option is set to FULL, you can add files to a FileTable by saving
them or dragging them into the FileTable shared folder. For example, the Images FileTable in the
preceding code example would have a Universal Naming Convention (UNC) path of
\\MIA-SQL\MSSQLSERVER\Archive_Files\Images, where MIA-SQL is the server name, MSSQLSERVER is the
instance-level FILESTREAM share name, and Archive_Files is the database-level FileTable directory. After
you create the FileTable, you can open and work with the files in the same way that you would if they
were not contained in a FileTable.
Note: You can access FileTable files by navigating to the shared folder by using the UNC or
by navigating to the folder on the local file system. If you use the latter method to open FileTable
files, you will not be able to open files that were created by using applications that use memory-
mapped files. Examples of applications that use memory-mapped files include Notepad and
Paint. However, you can open these files by using the UNC path.
FileTables do not support the following SQL Server features:
o Table partitioning.
o Database replication.
FileTables support the following SQL Server features with some limitations:
o Including a database with a FileTable in an AlwaysOn availability group changes the way that
failover works. After failover, you can still access FileTable data on the primary replica, but you
cannot access it on readable secondary replicas.
o FileTables support AFTER triggers for data manipulation language (DML) operations, but they do
not support INSTEAD OF triggers for DML operations. FileTables fully support data definition
language (DDL) triggers.
o You can create views on FileTables, but you cannot include FILESTREAM data in indexed views.
FileTable is an extension of FILESTREAM, so the same restriction applies to them.
Reference Links: For more information about compatibility with other SQL Server features,
see the FileTable Compatibility with Other SQL Server Features topic in SQL Server Books Online.
For more information about using FileTable with AlwaysOn Availability Groups, see the
FILESTREAM and FileTable with AlwaysOn Availability Groups topic in SQL Server Books Online.
FileTableRootPath
The FileTableRootPath function returns the root-level UNC path for FileTables for the current database or
for a specified FileTable. You can format the path into the format that your application requires by
setting the @option argument as follows:
A value of 0 returns the path with the server name converted to NetBIOS format. This is the default.
A value of 1 returns the path with the server name unconverted.
A value of 2 returns the path with the server name displayed as a fully qualified domain name
(FQDN).
The following code example returns the root-level UNC path for the current database.
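A sketch of such a query, assuming the current database is the Archive database used in the earlier examples:

```sql
USE Archive;
GO
SELECT FileTableRootPath() AS RootPath;
GO
```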
GetFileNamespacePath
The GetFileNamespacePath function returns the UNC path to a directory or file in the FileTable directory
hierarchy. You can use the is_full_path argument with a value of 1 to return the full UNC path or with a
value of 0 to return the relative path. 0 is the default value.
You can also set the @option argument to determine the formatting of the path in the same way that you
can for the FileTableRootPath function.
The following code example returns the relative paths of the files in the Images FileTable in the Archive
database.
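A sketch of such a query; calling GetFileNamespacePath with no arguments returns the relative path, because 0 is the default for is_full_path:

```sql
USE Archive;
GO
SELECT file_stream.GetFileNamespacePath() AS RelativePath
FROM dbo.Images;
GO
```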
You can use the FileTableRootPath and GetFileNamespacePath functions to avoid the use of hard-
coded file paths in applications. By using these functions to return the required paths, you can help to
ensure that applications can function regardless of changing environmental factors, such as databases
that are hosted on different servers.
Reference Links: For more information about using the FileTableRootPath and
GetFileNamespacePath functions, see the Work with Directories and Paths in FileTables article in
the MSDN library.
GetPathLocator
The GetPathLocator function returns the hierarchyid value for a FileTable file or directory. You must
supply the path name to the file or directory.
The following code example uses the GetPathLocator function to return the path locator hierarchyid
value for the Images FileTable directory.
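A sketch of such a query; the UNC path is the one used earlier in this lesson:

```sql
SELECT GetPathLocator('\\MIA-SQL\MSSQLSERVER\Archive_Files\Images') AS PathLocator;
```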
When you migrate files from the file system to a FileTable, you can use GetPathLocator to replace the
original UNC path for each file in the metadata with the FileTable UNC path, which helps to ensure that
the metadata for the files is correct.
Reference Links: For more information about using GetPathLocator when you are
migrating files from the file system to a FileTable, see the Load Files into FileTables article in the
MSDN library.
PathName
The PathName function returns the path of a BLOB in a FILESTREAM column. You can include the
@option argument to obtain the path in the correct format for your applications. The values for @option
are the same as the values that were described in the FileTableRootPath section at the beginning of this
topic.
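For example, a query of the following form (the dbo.Documents table and its DocumentFile FILESTREAM column are hypothetical) returns the path for each BLOB with the server name displayed as an FQDN:

```sql
SELECT DocumentFile.PathName(2) AS FilePath
FROM dbo.Documents;
```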
Demonstration Steps
View FILESTREAM configuration
1. Ensure that the 20464C-MIA-DC and 20464C-MIA-SQL virtual machines are running and then log on
to 20464C-MIA-SQL as AdventureWorks\Student with the password Pa$$w0rd.
2. Run D:\Demofiles\Mod15\Setup.cmd as an administrator to revert any changes.
9. In the SQL Server (MSSQLSERVER) Properties dialog box, click the FILESTREAM tab.
10. Note that FILESTREAM is enabled for Transact-SQL access, File I/O access, and remote access, and that
the FILESTREAM share is named MSSQLSERVER, and then click Cancel.
2. In the Connect to Server dialog box, in the Server Name field, type MIA-SQL and then click
Connect.
3. In SQL Server Management Studio, click File, point to Open, and then click File.
4. In the Open File dialog box, browse to D:\Demofiles\Mod15, and then double-click FilesDemo.sql.
5. In the query window, under the Enable filestream comment, highlight the Transact-SQL statement,
and then click Execute.
6. Under the Create filestream database comment, highlight the Transact-SQL statement, and then
click Execute.
7. Under the Create filetable comment, highlight the Transact-SQL statement, and then click Execute.
8. In File Explorer, in the D:\Demofiles\Mod15 folder, click Document1, press Shift, click Document3,
right-click Document3, and then click Copy.
9. On the taskbar, click Start, type Run and then press Enter.
10. In the Run dialog box, type \\localhost\MSSQLSERVER\FilestreamData\Documents and then click
OK.
12. In SQL Server Management Studio, in the query window, under the Query FileTable comment,
highlight the Transact-SQL statement, and then click Execute.
13. Leave SQL Server Management Studio open for the next demonstration.
Lesson 3
Searching Data Files
In SQL Server, you can create full-text indexes that enable you to perform fast and efficient searches on
data files. This lesson explains the benefits of using full-text indexes and the considerations for
implementing a full-text index. It also explains the options for performing searches against full-text
indexes, and explains how you can use Semantic Search to perform more sophisticated searches.
Lesson Objectives
After completing this lesson, you will be able to:
Generation term search. A generation term search locates all inflected forms of specified words. For
example, a generation term search on the word “walk” could locate the words “walk,” “walks,”
“walked,” and “walking.”
Proximity term search. A proximity term search locates specified words or phrases that are near to
other specified words or phrases.
Weighted term search. A weighted term search uses supplied weighting values that are associated
with the search terms to ensure that the query returns the most relevant rows first.
Thesaurus search. A thesaurus search identifies words that are synonyms of the search terms.
In addition to these search types, you can also use Semantic Search to perform searches to identify
documents that are stored in a varbinary(max) column and have similarities or are related in some way.
Performance
Using a full-text index delivers much better performance than simply using the LIKE predicate to identify
words or phrases in text. Furthermore, you cannot use the LIKE predicate to search formatted binary data.
Consequently, if you anticipate that users will need to perform frequent searches on data files, you should
consider creating a full-text index.
Property-Scoped Searches
SQL Server 2014 supports the searching of the properties of files in varbinary(max) columns. (This
functionality applies to varbinary(max) columns regardless of whether the FILESTREAM attribute is
enabled.) For example, you could use a property search to identify all Word documents that have a
particular author. Property-scoped searches are possible only for documents for which there is an
appropriate filter available that can recognize the properties for that document type.
Full-text search includes word breakers, stemmers, stoplists, and thesaurus files for each supported
language. These tools enable the identification of word boundaries, the rejection of words such as “the”
and “an” that have no inherent meaning beyond their syntactical function, and the recognition of the
conjugational forms of verbs.
Language support. Full-text indexes enable support for multiple languages, so you can use a single
index to service queries against data in different languages.
Filegroup placement. Creating and populating a full-text index can incur high I/O, so for optimal
performance, you should consider placing the full-text index on a filegroup that can provide good
I/O performance and that is separate from other data. Conversely, when manageability is a greater
concern than performance, you can place a table and its associated full-text index on the same
filegroup.
Managing updates. By default, SQL Server incrementally updates full-text indexes automatically as
changes occur in the underlying data. To minimize performance disruption at busy times, you can
configure SQL Server to update full-text indexes based on a schedule that you define, or you can
configure only manual updates. You can create a schedule for a full-text index by opening the
properties of the index, and on the Schedules page, clicking New, and specifying the schedule that
you require. Note that populating a full-text index uses a SQL Server Agent job, so you should ensure
that the SQL Server Agent service is running.
Note: In SQL Server 2005 and earlier, a full-text catalog was a physical structure that you
placed in a specified filegroup when you created the catalog. In SQL Server 2008 and SQL Server
2014, a full-text catalog is a logical structure that does not reside in a filegroup.
You can create a full-text catalog by using a CREATE FULLTEXT CATALOG Transact-SQL statement.
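For example, a statement of the following form creates a catalog named ArchiveCatalog and makes it the default catalog for the database:

```sql
CREATE FULLTEXT CATALOG ArchiveCatalog AS DEFAULT;
```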
You can use the CREATE FULLTEXT INDEX Transact-SQL statement to create a full-text index. You must
provide a name for the index, the name of the table and the names of the columns to include in the
index, the name of the column that will act as the key index column, and the name of the full-text catalog
for the index. You can also specify the language for each column.
If the index will include columns that use the varbinary(max) data type, you should also provide the
name of the column that identifies the type of document for each row in the table. This column is
specified as the TYPE COLUMN column in the CREATE FULLTEXT INDEX statement. SQL Server uses
filters to extract information from different types of documents, and can automatically select an
appropriate filter for each document type by using the information in the TYPE COLUMN column. When
you create a FileTable, the schema includes a column named file_type, which identifies the document
type for each row. You should specify this column as the TYPE COLUMN column when you are creating
full-text indexes on FileTables.
The following code example creates a full-text index in the ArchiveCatalog full-text catalog on a FileTable
named Archive.
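A sketch of such a statement follows. The key index name is illustrative; you would substitute the name of the unique key index that SQL Server generated for the FileTable. LANGUAGE 1033 specifies English.

```sql
CREATE FULLTEXT INDEX ON dbo.Archive
    (file_stream TYPE COLUMN file_type LANGUAGE 1033)
    KEY INDEX PK_Archive_ID  -- substitute the FileTable's generated key index name
    ON ArchiveCatalog;
GO
```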
To keep the index up to date with the data, you can configure SQL Server to use change tracking to
perform incremental updates to the index. You can enable change tracking by using the
CHANGE_TRACKING AUTO or CHANGE_TRACKING MANUAL options with either the CREATE
FULLTEXT INDEX or the ALTER FULLTEXT INDEX statements. The AUTO option automatically updates
the index when data changes in the source table. However, because this occurs as a background process,
changes may not immediately appear in the index. The MANUAL option tracks changes to the source
data in the same way, but it does not update the index until you issue a START UPDATE POPULATION
command in an ALTER FULLTEXT INDEX statement.
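For example, statements of the following form (dbo.Archive is the FileTable from the earlier example) enable manual change tracking and later apply the tracked changes:

```sql
ALTER FULLTEXT INDEX ON dbo.Archive SET CHANGE_TRACKING MANUAL;
GO
-- Later, at a quiet time, apply the tracked changes to the index:
ALTER FULLTEXT INDEX ON dbo.Archive START UPDATE POPULATION;
GO
```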
Reference Links: For more information about populating full-text indexes, see the
Populate Full-Text Indexes topic in SQL Server Books Online.
Schedule regular index defragmentation for the clustered and nonclustered indexes on the source
table.
Reorganize the full-text catalog by using the ALTER FULLTEXT CATALOG REORGANIZE statement.
Typically, SQL Server uses multiple internal tables, known as index fragments, to store data in a full-
text index. Indexes for which the source data changes frequently will often contain a large number of
fragments. Running the ALTER FULLTEXT CATALOG REORGANIZE statement merges these
fragments into a single larger fragment, which can dramatically improve performance.
Reference Links: For more information about improving performance for full-text indexes,
see the Improve the Performance of Full-Text Queries topic in SQL Server Books Online.
The following code example uses the CONTAINS predicate to find matches for the word “Mountain” in
the Production.Product table.
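A sketch of such a query, assuming that a full-text index exists on the Name column:

```sql
SELECT ProductID, Name
FROM Production.Product
WHERE CONTAINS(Name, 'Mountain');
```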
Typically, you use the FREETEXT predicate when you need to identify words and phrases that match the
meaning of the search term that you provide, even if the results do not match the actual words that are
used in the search term. FREETEXT uses the thesaurus to achieve this.
The following code example uses the FREETEXT predicate to find words and phrases that match the
meaning of the term “safety components” in the Production.Document table.
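A sketch of such a query, assuming that a full-text index exists on the Document column:

```sql
SELECT Title, DocumentSummary
FROM Production.Document
WHERE FREETEXT(Document, 'safety components');
```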
When you define search terms in a full-text query, you can use the AND, OR, and NOT Boolean operators
to combine conditions.
The following code example returns all products from the Production.Product table that contain the
words "frame", "wheel", or "tire" in the product name. The words are weighted, enabling ranking of the
results in terms of the closeness of the match. The rows that match best are returned first.
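A sketch of such a query; the weight values are illustrative. CONTAINSTABLE returns a RANK column that reflects the weighted match quality, so ordering by RANK descending returns the best matches first:

```sql
SELECT p.Name, r.RANK
FROM Production.Product AS p
INNER JOIN CONTAINSTABLE(Production.Product, Name,
    'ISABOUT (frame WEIGHT (0.8), wheel WEIGHT (0.5), tire WEIGHT (0.2))') AS r
    ON p.ProductID = r.[KEY]
ORDER BY r.RANK DESC;
```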
Semantic Search
Semantic Search extends the capabilities of full-text
search to enable you to identify documents that are
similar or related in some way. For example, you
could use Semantic Search to identify the résumés
in a FileTable that relate to a specific job role.
Although a standard full-text query will reveal
résumés that contain similar keywords or phrases,
these searches may miss relevant résumés where
the author has not used the specified keywords that
are contained in the search term. By identifying
deeper semantic patterns, Semantic Search can
provide a results set that more accurately matches
the search query.
Semantic Search uses a database named the Semantic Language Statistics database, which contains the
statistical models that are used to perform semantic searches. You must install this database from the SQL
Server installation media before you can use Semantic Search.
Reference Links: For more information about installing and configuring the Semantic
Language Statistics database, see the Install and Configure Semantic Search topic in SQL Server
Books Online.
Note: Semantic Search does not support as many languages as a full-text index. To view
the list of supported languages for Semantic Search, query the sys.fulltext_semantic_languages
catalog view.
The following code example adds Semantic Search to an existing full-text index on the Document table in
the AdventureWorks database.
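A sketch of such a statement:

```sql
ALTER FULLTEXT INDEX ON Production.Document
    ALTER COLUMN Document
    ADD STATISTICAL_SEMANTICS;
GO
```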
Finding key phrases in a document. You can use the SemanticKeyPhraseTable function to identify
key phrases. SemanticKeyPhraseTable returns a table that includes the following columns:
o Document_key. This column contains the key value of the document that contains the matched
term.
o Keyphrase. This column contains the matching term in the document.
o Score. This column contains a weighting value between 0 and 1 that evaluates the quality of the
match. The higher the value, the better the match.
Finding similar or related documents. You can use the SemanticSimilarityTable function to find
related documents. SemanticSimilarityTable returns a table that includes the following columns:
o Matched_document_key. This column contains the key value of the document that is identified
as having similarities with the source document.
o Score. This column contains a weighting value between 0 and 1 that evaluates the degree of
similarity. The higher the value, the greater the similarity.
Identifying the key phrases that make documents similar. You can use the
SemanticSimilarityDetailsTable function to identify the key phrases that make documents similar.
SemanticSimilarityDetailsTable returns a table that includes the following columns:
o Keyphrase. This column contains the phrases that are identified as making the documents
similar.
o Score. This column contains a weighting value between 0 and 1 that evaluates the key phrases
according to the degree of similarity that they indicate between the two documents. The higher the
value, the stronger the link between the phrases.
Demonstration Steps
Create and query a full-text index
1. Ensure that the 20464C-MIA-DC and 20464C-MIA-SQL virtual machines are running and then log on
to 20464C-MIA-SQL as ADVENTUREWORKS\Student with the password Pa$$w0rd.
2. Ensure that you have run the previous demonstrations in this module.
3. On the taskbar, click SQL Server 2014 Management Studio.
4. In SQL Server Management Studio, in the FilesDemo.sql query window, under the Create full-text
catalog comment, highlight the Transact-SQL statement, and then click Execute.
5. Under the Get index ID for FileTable PK comment, highlight the Transact-SQL statement, and then
click Execute.
6. In the Results pane, in the name column, right-click the value in row 1, which begins with
PK_FileStor_, and then click Copy.
7. In the FilesDemo.sql query window, under the Create full-text index comment, after KEY INDEX,
highlight PK_FileStor_REPLACE_WITH_INDEX_ID, right-click
PK_FileStor_REPLACE_WITH_INDEX_ID, and then click Paste.
8. Under the Create full-text index comment, highlight the Transact-SQL statement, and then click
Execute.
9. Under the Find documents containing "imperdiet" near "vivamus" (within 15 search terms)
comment, highlight the Transact-SQL statement, click Execute, and then review the results.
10. Close the FilesDemo.sql query window, and do not save any changes.
You have decided to use a FileTable to store the résumés. In this lab, you will implement and test the data
files storage solution.
Objectives
After completing this lab, you will have:
Created a FileTable.
Password: Pa$$w0rd
4. Create a FileTable
2. In the D:\Labfiles\Lab15\Starter folder, run the Setup Windows Command Script file (Setup.cmd) as
Administrator.
3. Add a file to the FileStreamGroup filegroup. The name of the file should be FileStreamData and the
file path should be 'D:\Labfiles\Lab15\Starter\Filestream'.
o NON_TRANSACTED_ACCESS = FULL
o DIRECTORY_NAME = N'HRFiles'
o FILETABLE_DIRECTORY = 'Resumes'
2. Copy Max Benson.doc, Shai Bassli.doc, and Stephen Jiang.doc from the D:\Labfiles\Lab15\Starter
folder to the \\MIA-SQL\MSSQLSERVER\HRFiles\Resumes FileTable share.
2. In SQL Server Management Studio, type and execute a SELECT statement that returns the following
metadata columns from the Resumes table:
o Name
o Cached_file_size
o Last_write_time
o A column named [Full Path] that uses the GetFileNamespacePath() FILESTREAM function to
return the full UNC path for each file.
Created a FileTable.
2. Query the sys.sysindexes view to obtain the full index name for the index name that begins
'PK_Resume'; in the Results pane, note the name of the primary key index.
3. Type and execute a Transact-SQL statement that creates a full-text index on the Resumes table, using
the columns and values in the following table.
4. Replace the value in the KEY INDEX clause with the value that you obtained in the previous step.
Place the index on hr_catalog.
2. Type and execute a Transact-SQL statement that returns all résumés that contain the word
“machinist” within 50 terms of the word “degree.”
2. In SQL Server Management Studio, type and execute a Transact-SQL statement to add the
Statistical_Semantics option to the full-text index on the Resumes table.
2. In SQL Server Management Studio, type and execute a Transact-SQL statement that uses the
SemanticKeyPhraseTable function to return the top 10 phrases in the Shai Bassli.doc file.
3. In SQL Server Management Studio, type and execute a Transact-SQL statement that uses the
SemanticKeyPhraseTable function to return the top two résumés that are about production.
Results: At the end of this exercise, you will have created a full-text catalog and a full-text index, and you
will have tested the index by running queries against it.
Question: If the lab scenario were modified as described in the list below, how might this
influence your choice of storage solution for the data files?
Administrators want to be able to perform point-in-time restores for all database data,
including data files.
Review Question(s)
Question: How have you enabled the storage of data files in your places of work? How
could you use the features of SQL Server 2014 to improve the storage of data files?
Course Evaluation
Your evaluation of this course will help Microsoft understand the quality of your learning experience.
Please work with your training provider to access the course evaluation form.
Microsoft will keep your answers to this survey private and confidential and will use your responses to
improve your future learning experience. Your open and honest feedback is valuable and appreciated.
3. In File Explorer, navigate to the D:\Labfiles\Lab01\Starter folder, right-click the Setup.cmd file, and
then click Run as administrator.
4. In the User Account Control dialog box, click Yes, and then wait for the script to finish.
2. In the Connect to Server window, in Server type, ensure that Database Engine is selected.
3. In Server name, ensure that MIA-SQL has been entered.
5. Click Connect.
6. Expand Databases.
7. Expand AdventureWorks.
3. In the right pane, ensure that the following services are listed:
Task 2: Ensure That All Required Services Including SQL Server Agent Are Started and
Set To Autostart for Both Instances
1. Double-click SQL Server (MSSQLSERVER).
2. Click Service.
4. Click OK.
5. Repeat steps 1-4 for each MSSQLSERVER service except the SQL Full-text Filter Daemon Launcher
service.
Task 3: Configure the TCP Port for the SQL3 Database Engine Instance to 51550
1. In the SQL Server Configuration Manager window, in the left pane, expand SQL Server Network
Configuration, and then click Protocols for SQL3.
9. In the SQL Server Configuration Manager window, in the left pane, click SQL Server Services.
10. On the toolbar, click the Refresh icon, and then make sure that the SQL Server (SQL3) service has
started.
3. In File Explorer, navigate to the D:\Labfiles\Lab02\Starter folder, right-click the Setup.cmd file, and
then click Run as administrator.
4. In the User Account Control dialog box, click Yes, and then wait for the script to finish.
2. Review the supporting documentation for details of the PhoneCampaign, Opportunity, and
SpecialOrder tables and determine column names, data types, and nullability for each data item in
the design.
Created a schema.
Created tables.
4. Click Execute.
6. Click Execute.
8. Click Execute.
Created the tables that you designed in the first exercise of this lab.
3. In File Explorer, navigate to the D:\Labfiles\Lab03\Starter folder, right-click the Setup.cmd file, and
then click Run as administrator.
4. In the User Account Control dialog box, click Yes, and then wait for the script to finish.
2. In Object Explorer, expand the MIA-SQL server, expand Databases, right-click the AdventureWorks
database, and then click New Query.
4. Click Execute.
7. Click Execute.
Results: Having completed this lab, you will have added constraints to the DirectMarketing.Opportunity
table.
Note: This query should fail due to the PRIMARY KEY constraint.
Note: This query should fail due to the FOREIGN KEY constraint.
Results: After completing this exercise, you should have successfully tested your constraints.
3. In File Explorer, navigate to the D:\Labfiles\Lab04\Starter folder, right-click the Setup.cmd file, and
then click Run as administrator.
4. In the User Account Control dialog box, click Yes, and then wait for the script to finish.
Results: After completing this exercise, you will have created tables with clustered indexes.
Results: After completing this lab, you will have created a nonclustered index.
3. In File Explorer, navigate to the D:\Labfiles\Lab05\Starter folder, right-click the Setup.cmd file, and
then click Run as administrator.
4. In the User Account Control dialog box, click Yes, and then wait for the script to finish.
USE AdventureWorks
GO
SELECT * FROM sys.stats WHERE object_id = OBJECT_ID('Production.Product');
GO
2. Check to see whether any autostats have been generated. If they have, they will appear in the results
with a _WA prefix.
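The check in step 2 can be narrowed to just the auto-created statistics. This sketch runs against the same Production.Product table and relies on the documented auto_created flag alongside the _WA_Sys naming convention:

```sql
USE AdventureWorks;
GO
-- Auto-created statistics carry auto_created = 1 and a _WA_Sys_ name prefix.
SELECT name, auto_created, user_created
FROM sys.stats
WHERE object_id = OBJECT_ID('Production.Product')
  AND auto_created = 1;
GO
```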
DBCC SHOW_STATISTICS('Production.Product',Product_Color_Stats);
GO
Note: The results returned can vary. Sample results are shown in the following table.
Question Answer
Task 8: Execute an SQL Command and Check the Accuracy of Some Statistics
1. In Object Explorer, expand MIA-SQL, and then expand Databases.
-- Query 1
SELECT ProspectID, FirstName, LastName FROM Marketing.Prospect WHERE FirstName LIKE 'A%';
GO
-- Query 2
SELECT ProspectID, FirstName, LastName FROM Marketing.Prospect WHERE FirstName LIKE 'Alejandro%';
GO
-- Query 3
SELECT ProspectID, FirstName, LastName FROM Marketing.Prospect WHERE FirstName LIKE 'Arif%';
GO
Results: After this exercise, you will have assessed selectivity on various queries.
2. Click Connect.
4. In Workload, ensure that File is selected, and then click Browse for a workload file.
5. Navigate to D:\Labfiles\Lab05\Starter.
6. Click PersonQuery.sql, and then click Open.
2. Navigate to D:\Labfiles\Lab05\Starter.
3. Click PersonIndex.sql, and then click Open.
4. Notice that Database Engine Tuning Advisor has created a script to create a nonclustered covering
index by using INCLUDE.
5. Click Execute.
Results: After completing this exercise, you will have created a covering index.
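The script that Database Engine Tuning Advisor generates follows this general shape. The index and column names below are illustrative, not the contents of PersonIndex.sql:

```sql
-- A covering index: the key column supports the search predicate, and the
-- INCLUDE columns let the query be answered from the index alone.
CREATE NONCLUSTERED INDEX IX_Person_LastName
ON Person.Person (LastName)
INCLUDE (FirstName, MiddleName);
GO
```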
2. In the D:\Labfiles\Lab06\Starter folder, right-click Setup.cmd, and then click Run as administrator.
3. When you are prompted, click Yes to confirm that you want to run the command file, and then wait
for the script to finish.
2. Click New Query, and then enter the following Transact-SQL code.
3. Click Execute.
2. Click Execute.
3. View the results and verify that the buffer pool cache is enabled and uses a 10-GB file named
S:\BufferCache.bpe.
4. Use File Explorer to view the contents of drive S, and verify that the BufferCache.bpe file exists.
5. Close File Explorer, but keep SQL Server Management Studio open for the next exercise.
Results: After completing this exercise, you should have enabled the buffer pool extension.
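Enabling and verifying the extension uses statements along these lines. The file name and size match the lab notes; run them only in the lab environment:

```sql
-- Enable a 10-GB buffer pool extension file on drive S.
ALTER SERVER CONFIGURATION
SET BUFFER POOL EXTENSION ON
    (FILENAME = 'S:\BufferCache.bpe', SIZE = 10 GB);
GO
-- Verify: state_description should report that the extension is enabled.
SELECT path, state_description, current_size_in_kb
FROM sys.dm_os_buffer_pool_extension_configuration;
GO
```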
5. On the Execution plan tab, view the execution plan that is used for the query. Examine the icons
from right to left, noting the indexes that were used. Note also that the query processor has identified
that adding a missing index could improve performance.
6. Click New Query, and then enter the following Transact-SQL code to create a nonclustered
columnstore index on the FactInternetSales table. Alternatively, in the D:\Labfiles\Lab06\Solution
folder, you can open the Create Columnstore Index on FactInternetSales.sql script file.
USE AdventureWorksDW
GO
CREATE NONCLUSTERED COLUMNSTORE INDEX [IX_NCS_FactInternetSales]
ON dbo.FactInternetSales
(
[ProductKey],
[OrderDateKey],
[DueDateKey],
[ShipDateKey],
[CustomerKey],
[PromotionKey],
[CurrencyKey],
[SalesTerritoryKey],
[SalesOrderNumber],
[SalesOrderLineNumber],
[RevisionNumber],
[OrderQuantity],
[UnitPrice],
[ExtendedAmount],
[UnitPriceDiscountPct],
[DiscountAmount],
[ProductStandardCost],
[TotalProductCost],
[SalesAmount],
[TaxAmt],
[Freight],
[CarrierTrackingNumber],
[CustomerPONumber],
[OrderDate],
[DueDate],
[ShipDate]
);
8. Switch back to the Query FactInternetSales.sql tab, and then click Execute to rerun the query.
9. On the Execution plan tab, view the execution plan that is used for the query. Examine the icons
from right to left, noting the indexes that were used. Note that the columnstore index is used, and
that the query processor does not identify any missing indexes.
USE [AdventureWorksDW]
GO
ALTER TABLE [dbo].[FactProductInventory]
DROP CONSTRAINT [PK_FactProductInventory];
GO
ALTER TABLE [dbo].[FactProductInventory]
DROP CONSTRAINT [FK_FactProductInventory_DimDate];
GO
ALTER TABLE [dbo].[FactProductInventory]
DROP CONSTRAINT [FK_FactProductInventory_DimProduct];
GO
CREATE CLUSTERED COLUMNSTORE INDEX [IX_CS_FactProductInventory]
ON dbo.FactProductInventory;
GO
8. Switch back to the Query FactProductInventory.sql tab, and then click Execute to rerun the query.
9. On the Execution plan tab, view the execution plan that is used for the query. Examine the icons
from right to left, noting the indexes that were used. Note that the columnstore index is used, and
that the query processor does not identify any missing indexes.
Results: After completing this exercise, you should have created columnstore indexes.
2. In the D:\Labfiles\Lab08\Starter folder, right-click Setup.cmd, and then click Run as administrator.
3. When you are prompted, click Yes to confirm that you want to run the command file, and then wait
for the script to finish.
Note: Ensure that approximately nine colors are returned and that no NULL row is returned.
Note: Ensure that approximately 26 rows are returned for blue products. Ensure that approximately
248 rows are returned for products that have no color.
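Checks of the kind those notes describe can be written directly against the base table. These statements are a sketch; the lab's actual queries go through the views it creates:

```sql
-- Distinct non-NULL colors: expect around nine rows, with no NULL row.
SELECT DISTINCT Color
FROM Production.Product
WHERE Color IS NOT NULL;

-- Row counts for blue products and for products that have no color.
SELECT COUNT(*) AS BlueProducts
FROM Production.Product
WHERE Color = 'Blue';

SELECT COUNT(*) AS NoColorProducts
FROM Production.Product
WHERE Color IS NULL;
```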
2. In the D:\Labfiles\Lab09\Starter folder, right-click Setup.cmd, and then click Run as administrator.
3. When you are prompted, click Yes to confirm that you want to run the command file, and then wait
for the script to finish.
2. Review the Function Specifications: Phone Number section in the supporting documentation.
Results: After this exercise, you should have created a new FormatPhoneNumber function within the
dbo schema.
2. Review the requirement for the dbo.IntegerListToTable function in the supporting documentation.
PositionInList IntegerValue
1 234
2 354253
3 3242
4 2
PositionInList IntegerValue
1 234
2 354253
3 3242
4 2
Results: After this exercise, you should have created a new IntegerListToTable function within a dbo
schema.
2. In the D:\Labfiles\Lab10\Starter folder, right-click Setup.cmd, and then click Run as administrator.
3. When you are prompted, click Yes to confirm that you want to run the command file, and then wait
for the script to finish.
3. Review the supplied table requirements in the supporting documentation for the
Production.ProductAudit table.
4. On the taskbar, click SQL Server 2014 Management Studio. In the Connect to Server window, ensure that Server name is MIA-SQL, and then click Connect.
5. In Object Explorer, expand MIA-SQL, and then expand Databases.
6. Expand AdventureWorks, expand Tables, expand Production.Product, and then expand Columns.
UPDATE Production.Product
SET ListPrice=3978.00
WHERE ProductID BETWEEN 749 and 753;
GO
SELECT * FROM Production.ProductAudit;
GO
Results: After this exercise, you should have created a new trigger. Tests should have shown that it is
working as expected.
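A trigger that passes the test above would take roughly this shape. The Production.ProductAudit column list here is an assumption, since the real definition comes from the supporting documentation:

```sql
-- AFTER UPDATE trigger that copies changed rows into the audit table.
-- The ProductAudit column names are illustrative.
CREATE TRIGGER Production.TR_Product_Update
ON Production.Product
AFTER UPDATE
AS
BEGIN
    SET NOCOUNT ON;
    INSERT INTO Production.ProductAudit (ProductID, ListPrice, AuditTime)
    SELECT i.ProductID, i.ListPrice, SYSDATETIME()
    FROM inserted AS i;
END;
GO
```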
3. Expand the MarketDev database, expand Tables, expand Marketing.CampaignBalance, and then
expand Triggers.
Results: After this exercise, you should have altered the trigger. Tests should show that it is now working
as expected.
2. In the D:\Labfiles\Lab11\Starter folder, right-click Setup.cmd, and then click Run as administrator.
3. When you are prompted, click Yes to confirm that you want to run the command file, and then wait
for the script to finish.
4. In the Database Properties - InternetSales dialog box, on the Filegroups page, in the MEMORY
OPTIMIZED DATA section, click Add Filegroup.
5. In the Name box, type MemFG, and then press Enter.
6. In the Database Properties - InternetSales dialog box, on the Files page, click Add.
USE InternetSales
GO
CREATE TABLE dbo.ShoppingCart
(SessionID INT NOT NULL,
TimeAdded DATETIME NOT NULL,
CustomerKey INT NOT NULL,
ProductKey INT NOT NULL,
Quantity INT NOT NULL,
PRIMARY KEY NONCLUSTERED HASH (SessionID, ProductKey) WITH (BUCKET_COUNT=100000))
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);
3. Click New Query, and then type the following Transact-SQL code. (Alternatively, in the
D:\Labfiles\Lab11\Solution folder, open the TestShoppingCart.sql script file.)
USE InternetSales
GO
INSERT INTO dbo.ShoppingCart (SessionID, TimeAdded, CustomerKey, ProductKey,
Quantity)
VALUES (1, GETDATE(), 2, 3, 1);
INSERT INTO dbo.ShoppingCart (SessionID, TimeAdded, CustomerKey, ProductKey,
Quantity)
VALUES (1, GETDATE(), 2, 4, 1);
SELECT * FROM dbo.ShoppingCart;
Results: After completing this exercise, you should have created a memory-optimized table and a natively
compiled stored procedure in a database with a filegroup for memory-optimized data.
USE InternetSales
GO
CREATE PROCEDURE dbo.AddItemToCart
@SessionID INT, @TimeAdded DATETIME, @CustomerKey INT, @ProductKey INT, @Quantity
INT
WITH NATIVE_COMPILATION, SCHEMABINDING, EXECUTE AS OWNER
AS
BEGIN ATOMIC WITH (TRANSACTION ISOLATION LEVEL = SNAPSHOT, LANGUAGE = 'us_english')
INSERT INTO dbo.ShoppingCart (SessionID, TimeAdded, CustomerKey, ProductKey,
Quantity)
VALUES (@SessionID, @TimeAdded, @CustomerKey, @ProductKey, @Quantity)
END
GO
3. In SQL Server Management Studio, click New Query, and then enter the following Transact-SQL
code. (Alternatively, in the D:\Labfiles\Lab11\Solution folder, open the Create
DeleteItemFromCart.sql script file.)
USE InternetSales
GO
CREATE PROCEDURE dbo.DeleteItemFromCart
@SessionID INT, @ProductKey INT
WITH NATIVE_COMPILATION, SCHEMABINDING, EXECUTE AS OWNER
AS
BEGIN ATOMIC WITH (TRANSACTION ISOLATION LEVEL = SNAPSHOT, LANGUAGE = 'us_english')
DELETE FROM dbo.ShoppingCart
WHERE SessionID = @SessionID
AND ProductKey = @ProductKey
END
GO
5. In SQL Server Management Studio, click New Query, and then enter the following Transact-SQL
code. (Alternatively, in the D:\Labfiles\Lab11\Solution folder, open the Create EmptyCart.sql script
file.)
USE InternetSales
GO
CREATE PROCEDURE dbo.EmptyCart
@SessionID INT
WITH NATIVE_COMPILATION, SCHEMABINDING, EXECUTE AS OWNER
AS
BEGIN ATOMIC WITH (TRANSACTION ISOLATION LEVEL = SNAPSHOT, LANGUAGE = 'us_english')
DELETE FROM dbo.ShoppingCart
WHERE SessionID = @SessionID
END
GO
7. Click New Query, and then enter the following Transact-SQL code. (Alternatively, in the
D:\Labfiles\Lab11\Solution folder, open the Test Procs.sql script file.)
USE InternetSales
GO
DECLARE @now DATETIME = GETDATE();
EXEC dbo.AddItemToCart @SessionID = 3,
@TimeAdded = @now,
@CustomerKey = 2,
@ProductKey = 3,
@Quantity = 1;
EXEC dbo.AddItemToCart @SessionID = 3,
@TimeAdded = @now,
@CustomerKey = 2,
@ProductKey = 4,
@Quantity = 1;
SELECT * FROM dbo.ShoppingCart;
EXEC dbo.DeleteItemFromCart @SessionID = 3, @ProductKey = 4;
SELECT * FROM dbo.ShoppingCart;
EXEC dbo.EmptyCart @SessionID = 3;
SELECT * FROM dbo.ShoppingCart;
Results: After completing this exercise, you should have created a natively compiled stored procedure.
2. In the D:\Labfiles\Lab12\Starter folder, right-click Setup.cmd, and then click Run as administrator.
3. When you are prompted, click Yes to confirm that you want to run the command file, and then wait
for the script to finish.
Table-valued function that returns a list of files in a particular folder: Yes, good use of external access.
Function that formats phone numbers as strings: Yes, good use of string handling.
Trigger that records balance movements that have a value of more than 1,000: No, only involves data access.
Stored procedure that writes an XML file for a given XML parameter: Yes, good use of external access.
Function that counts rows in a table: No, only involves data access.
Results: After this exercise, you should have created a list of which objects should be implemented in
managed code and the reasons for your decision.
3. Expand System Databases, right-click the master database, and then click New Query.
USE AdventureWorks
GO
CREATE ASSEMBLY SQLCLRDemo
FROM 'D:\Labfiles\Lab12\Starter\SQLCLRDemo.DLL'
WITH PERMISSION_SET = EXTERNAL_ACCESS;
GO
6. Highlight the query above and then, in the toolbar, click Execute.
8. Highlight the query above and then, in the toolbar, click Execute.
10. Highlight the query above and then, in the toolbar, click Execute.
12. Highlight the query above and then, in the toolbar, click Execute.
SELECT dbo.IsValidEmailAddress('test@somewhere.com');
GO
SELECT dbo.IsValidEmailAddress('test.somewhere.com');
GO
SELECT dbo.FormatAustralianPhoneNumber('0419201410');
GO
SELECT dbo.FormatAustralianPhoneNumber('9 87 2 41 23');
GO
SELECT dbo.FormatAustralianPhoneNumber('039 87 2 41 23');
GO
SELECT * FROM dbo.FolderList(
'D:\Labfiles\Lab12\Starter','*.txt');
GO
Results: After this exercise, you should have three functions working as expected.
2. In the D:\Labfiles\Lab13\Starter folder, right-click Setup.cmd, and then click Run as administrator.
3. When you are prompted, click Yes to confirm that you want to run the command file, and then wait
for the script to finish.
Results: After this exercise, you will have seen how to analyze requirements and determine appropriate
use cases for XML storage.
2. On the File menu, click Open, click File, navigate to D:\Labfiles\Lab13\Starter, and then select
InvestigateStorage.sql and click Open.
USE tempdb;
GO
4. Highlight and execute scripts 13.1 to 13.9 separately, comparing the results of each script with the
script comment.
Results: After this exercise, you will have seen how XML data is stored in variables.
USE tempdb;
GO
3. Highlight and execute scripts 13.10 to 13.11 separately, comparing the results of each script with the
script comment.
Results: After this exercise, you will have seen how to create XML schema collections.
2. In the query below, highlight and execute scripts 13.21 to 13.29 separately, comparing the results of
each script with the script comment.
13.21 FOR XML AUTO. One row per product. Note that the element name is based on the table name.
13.24 Note the effect of the column alias compared to 13.23. The element name is now Product, based on the alias name of the table.
13.26 Nested XML with TYPE. Note that rows with a value in the Description column show that value as XML.
Results: After this exercise, you will have executed queries that return SQL Server relational data as XML.
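The behaviors compared in the table above can be reproduced with minimal queries like these, run against AdventureWorks (TOP keeps the output short):

```sql
-- FOR XML AUTO: one element per row, named after the table (or its alias).
SELECT TOP (2) ProductID, Name
FROM Production.Product
FOR XML AUTO;

-- Aliasing the table changes the element name to Product.
SELECT TOP (2) ProductID, Name
FROM Production.Product AS Product
FOR XML AUTO;
```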
CREATE PROCEDURE Production.GetAvailableModelsAsXML
AS
BEGIN
SELECT p.ProductID,
p.name as ProductName,
p.ListPrice,
p.Color,
p.SellStartDate,
pm.ProductModelID,
pm.Name as ProductModel
FROM Production.Product AS p
INNER JOIN Production.ProductModel AS pm
ON p.ProductModelID = pm.ProductModelID
WHERE p.SellStartDate IS NOT NULL
AND p.SellEndDate IS NULL
ORDER BY p.SellStartDate, p.Name
FOR XML RAW('AvailableModel'), ROOT('AvailableModels');
END;
GO
EXEC Production.GetAvailableModelsAsXML;
GO
Results: After this exercise, you will have created and tested the required stored procedure that returns
XML.
2. In File Explorer, navigate to the D:\Labfiles\Lab14\Starter folder, right-click the Setup.cmd file, and
then click Run as administrator.
3. In the User Account Control dialog box, click Yes and then wait for the script to finish.
2. Click File, click Open, click File, navigate to D:\Labfiles\Lab14\Starter and click Lab Exercise 1.
3. Click Open.
4. In the query below, highlight and execute scripts 19.1 to 19.9 separately, comparing the results of
each script with the script comment.
5. Review the results from each script. Remember to click the Spatial results tab to see the output.
19.3 Draw a more complex shape. The shape is drawn. Note how a polygon is represented as text in the query.
Results: After this exercise, you should have seen how to work with the geometry data type.
UPDATE Marketing.ProspectLocation
SET Location = geography::STGeomFromText(
'POINT(' + CAST(Longitude AS varchar(20))
+ ' ' + CAST(Latitude AS varchar(20))
+ ')', 4326);
GO
Results: After this exercise, you should have replaced the existing Longitude and Latitude columns with
a new Location column.
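Once the Location column is populated, spatial methods become available on it. This proximity query is a sketch; the key column name and the reference point are assumptions:

```sql
-- Find the five prospects nearest a reference point (latitude, longitude, SRID 4326).
DECLARE @point geography = geography::Point(47.6062, -122.3321, 4326);

SELECT TOP (5) ProspectID,
       Location.STDistance(@point) AS DistanceInMeters
FROM Marketing.ProspectLocation
ORDER BY Location.STDistance(@point);
```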
3. In File Explorer, navigate to the D:\Labfiles\Lab15\Starter folder, right-click the Setup.cmd file, and
then click Run as administrator.
4. In the User Account Control dialog box, click Yes, and then wait for the script to finish.
2. In the Connect to Server dialog box, in the Server type list, click Database Engine, in the Server
name list, click MIA-SQL, in the Authentication list, click Windows Authentication, and then click
Connect.
3. In SQL Server Management Studio, on the File menu, point to Open, and then click File.
4. In the Open File dialog box, navigate to the D:\Labfiles\Lab15\Starter folder, click FileTable.sql,
and then click Open.
5. Under the Configure filestream access level comment, highlight the Transact-SQL statement, and
then click Execute.
2. Under the Add a file to the new filegroup comment, highlight the Transact-SQL statement, and
then click Execute.
3. Under the Set Filestream options comment, highlight the Transact-SQL statement, and then click
Execute.
3. Navigate to the D:\Labfiles\Lab15\Starter folder, click Max Benson.doc, press and hold down Ctrl,
click Shai Bassli.doc, click Stephen Jiang.doc, right-click Stephen Jiang.doc, and then click Copy.
4. In File Explorer, in the navigation bar, type \\MIA-SQL\MSSQLSERVER\HRFiles\Resumes and then
press Enter.
3. In SQL Server Management Studio, in the FileTable.sql query window, under the Query the
FileTable metadata comment, highlight the Transact-SQL statement, and then click Execute.
4. Review the results, noting that the GetFileNamespacePath function returned the full UNC path to
each file in the FileTable shared folder.
5. Close the FileTable.sql query window, and do not save any changes.
Created a FileTable.
2. In the Open File dialog box, navigate to the D:\Labfiles\Lab15\Starter folder, click
FullTextIndex.sql, and then click Open.
3. Under the Create a full-text catalog comment, highlight the Transact-SQL statement, and then click
Execute.
4. Under the Get index name for the FileTable primary key comment, select the Transact-SQL
statement, and then click Execute.
5. In the Results pane, note the name of the primary key index.
6. Under the Create full-text index comment, in the Transact-SQL statement, locate the KEY INDEX
clause.
7. In the KEY INDEX clause, replace the index name (which begins with PK_Resumes_ followed by a string of numbers and letters) with the index name that you noted in step 5.
8. Under the Create full-text index comment, select the Transact-SQL statement, and then click
Execute.
3. Under the Find resumes containing ‘machinist’ within 50 terms of ‘degree’ comment, highlight
the Transact-SQL statement, and then click Execute.
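The statement under that comment uses the custom proximity form of CONTAINS. This is a sketch of the pattern, assuming the FileTable is named dbo.Resumes:

```sql
-- NEAR((term1, term2), 50) matches documents in which the two terms
-- occur within 50 terms of each other.
SELECT name
FROM dbo.Resumes
WHERE CONTAINS(file_stream, 'NEAR((machinist, degree), 50)');
```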
5. Close the FullTextIndex.sql query window, and do not save any changes.
2. In the Open File dialog box, navigate to the D:\Labfiles\Lab15\Starter folder, click
SemanticSearch.sql, and then click Open.
3. Under the Register the semantic language database comment, highlight the Transact-SQL
statement, and then click Execute.
4. Under the Add semantic search index comment, highlight the Transact-SQL statement, and then click Execute.
2. Under the Get the top 10 phrases in Shai Bassli.doc comment, review the Transact-SQL SELECT
statement, select the Transact-SQL SELECT statement, click Execute, and then review the results.
3. Under the Find the top two resumes that are about 'Production' comment, review the Transact-
SQL SELECT statement, select the Transact-SQL SELECT statement, click Execute, and then review the
results.
4. Close the SemanticSearch.sql query window, and do not save the changes.
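The two SELECT statements reviewed above are built on the semantic search table-valued functions. These sketches assume the FileTable is named dbo.Resumes:

```sql
-- The top key phrases across the indexed documents.
SELECT TOP (10) keyphrase, score
FROM SEMANTICKEYPHRASETABLE(dbo.Resumes, file_stream)
ORDER BY score DESC;

-- The two resumes most strongly about 'production'.
SELECT TOP (2) document_key, score
FROM SEMANTICKEYPHRASETABLE(dbo.Resumes, file_stream)
WHERE keyphrase = 'production'
ORDER BY score DESC;
```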
Results: At the end of this exercise, you will have created a full-text catalog and a full-text index, and you
will have tested the index by running queries against it.